facta universitatis series: electronics and energetics vol. 28, no 2, june 2015, pp. 205-212
doi: 10.2298/fuee1502205b

infrared thermography applied to power electron devices investigation

giovanni breglio, andrea irace, luca maresca, michele riccio, gianpaolo romano, paolo spirito
university of naples federico ii, department of electrical engineering and information technologies, naples, italy

abstract. this paper presents the principal applications of infrared thermography to the analysis and testing of electron devices. although experimental characterization can be carried out on almost any electronic device or circuit, here ir thermography is applied to the investigation of power semiconductor devices. different examples of functional and failure analysis in both transient and lock-in modes are reported.

key words: infrared thermography, thermal measurement, power device, lock-in thermography

1. introduction

as is well known, infrared thermography systems are widely used in several engineering fields as a powerful investigation tool, thanks to their many attractive features. among all possible applications, the analysis of the operation of electronic devices and circuits is of particular interest. specifically, non-destructive contactless testing of electronic components, circuits, semiconductor devices, solar cells, ics, etc., is possible [1-3]. thanks to the commercial availability of fast, high-sensitivity 2d arrays of infrared sensors, which have completely replaced single-sensor systems, many different ir cameras can nowadays be used to monitor the temperature distribution across a semiconductor device. temperature gradients arise during device operation because the current flowing within the device generates heat by the joule effect. temperature maps can be used, for example, for complete thermal characterization and functional analysis of semiconductor components.
the application of ir techniques to such situations has been extensively reported in the literature (e.g. [4-8]), showing its flexibility and usefulness. in this paper, attention is focused on infrared thermography for the investigation of power semiconductor devices. electron devices for high power applications usually have to operate in harsh environments and conditions. for this reason, in recent years, the performance and reliability of power devices have become a major concern for industry and research, where development has focused on increasing ruggedness. in the next section, the basic concepts of ir thermography are briefly recalled for the reader's convenience. after that, fundamental applications of this technique to power devices are reported.

received february 3, 2015
corresponding author: michele riccio, university of naples federico ii, department of electrical engineering and information technologies, via claudio 21, 80125, naples, italy (e-mail: michele.riccio@unina.it)

2. infrared emission

any object at a temperature T above zero kelvin emits electromagnetic radiation. the emitted energy, called spectral radiance, is described by planck's law:

$$ L(\lambda, T) = \varepsilon\,\frac{2hc^{2}}{\lambda^{5}}\,\frac{1}{e^{hc/(\lambda k T)} - 1} \tag{1} $$

where h is the planck constant, k is the boltzmann constant, c is the speed of light, λ is the wavelength of the emitted radiation, and ε is the emissivity of the body, which depends on material and temperature (with 0<ε<1). the emissivity takes into account that any real object differs from an ideal black body (i.e. it emits less energy). the range of infrared wavelengths used in thermography is generally located between about 1 μm and 10 μm, usually indicated as thermal infrared, even though longer wavelengths might be employed.
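as a rough numerical illustration of planck's law, the following python sketch evaluates the spectral radiance of a grey body over the thermal infrared range mentioned above. the function name `spectral_radiance`, the constant names and the choice of a wavelength-independent emissivity are assumptions made only for this example; they do not come from the paper.

```python
import math

# Physical constants (SI units, exact values of the 2019 SI redefinition)
H = 6.62607015e-34    # Planck constant, J*s
C = 2.99792458e8      # speed of light in vacuum, m/s
K_B = 1.380649e-23    # Boltzmann constant, J/K


def spectral_radiance(wavelength_m, temperature_k, emissivity=1.0):
    """Spectral radiance in W / (m^2 * sr * m) from Planck's law.

    Hypothetical helper: the emissivity is assumed constant over wavelength
    (grey body); emissivity = 1.0 corresponds to an ideal black body.
    """
    if wavelength_m <= 0.0 or temperature_k <= 0.0:
        raise ValueError("wavelength and temperature must be positive")
    if not 0.0 < emissivity <= 1.0:
        raise ValueError("emissivity must lie in (0, 1]")
    # Dimensionless exponent h*c / (lambda * k * T)
    x = H * C / (wavelength_m * K_B * temperature_k)
    # expm1(x) = exp(x) - 1, numerically safer for small x
    return emissivity * (2.0 * H * C ** 2 / wavelength_m ** 5) / math.expm1(x)


if __name__ == "__main__":
    # Radiance of a 300 K surface at a few thermal-infrared wavelengths
    for wavelength_um in (1.0, 3.0, 5.0, 10.0):
        value = spectral_radiance(wavelength_um * 1e-6, 300.0, emissivity=0.9)
        print(f"{wavelength_um:5.1f} um -> {value:.3e} W/(m^2 sr m)")
```

for an object near room temperature the radiance at 10 μm is several orders of magnitude larger than at 1 μm, which is one reason why the longer-wavelength part of the thermal infrared is attractive for devices that heat up only moderately.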
the majority of ir sensors are designed to detect radiation in these ranges and convert it into a digital signal, making it possible to measure the surface temperature of any object. the choice of the ir camera is dictated by the trade-off between sensor sensitivity and speed: faster sensors are the only choice if rapid thermal transients have to be detected, whereas more accurate steady-state temperature values can be obtained with high-sensitivity sensors.

fig. 1 the electromagnetic spectrum.

3. ir analysis

this section reviews ir thermography applied to different analyses of power devices, with several examples. detection of the steady-state temperature distribution is one of the basic tests that can be done, but it is of lesser interest in the study of power devices, for which transient and lock-in inspection are preferred. as stated above, the ruggedness of power devices is widely investigated to gain a better understanding of the phenomena that limit their reliability [9-10]. to this purpose, temperature maps provide useful information on device behaviour, because temperature is directly related to the current distribution within the device itself. through ir thermography a complete thermal characterization can be obtained, and unusual working conditions, such as hot-spots or non-uniform current spreading, can be examined. one main issue in obtaining temperature-calibrated images, i.e. a quantitative evaluation of temperature, is the emissivity contrast between the materials on the top surface of electronic devices. moreover, the emissivity has a temperature dependence that cannot be neglected, especially when large thermal excursions are experienced.
several compensation techniques have been proposed, including black painting of the device top surface, but this is not a reliable solution because the additional layer modifies the thermal response of the device. the most accurate and robust approach is the off-line calibration of the thermal images by passive heating of the device and subsequent evaluation of the emissivity map. this emissivity map can be used to transform the ir radiation map into a correct temperature map of the surface under test.

3.1. transient thermography

power devices might experience large temperature variations due to the high power densities involved during their operation (e.g. if avalanche breakdown occurs). there is therefore the need to detect their heating/cooling transients to accurately monitor device behaviour. unfortunately, modern ir cameras usually have an acquisition rate that rarely exceeds a few hundred Hz at full frame, because the read-out time needed to process the signal from the sensor array is still high. this is a great limitation in the detection of thermal dynamics on the μs time scale. a higher acquisition frequency can be obtained if an equivalent time sampling method is used. with this approach the attainable maximum frame rate is theoretically limited only by the minimum integration time, which is usually 1 μs or smaller for state-of-the-art ir sensors. in this way an equivalent sampling frequency of 1 MHz can be easily achieved [11]. the only restriction on the use of this method is that the experiment to be observed must be periodically repeated, and a precise synchronization of all signals has to be ensured. fig. 2 reports the temperature distribution of an igbt during a transient avalanche condition, at successive time instants spaced 5 μs apart. the non-uniformity of the current distribution and the motion of the current flow across the entire device area are clearly visible.

(a) (b) (c) (d)
fig. 2 transient temperature maps during avalanche condition.

3.2 lock-in thermography

the lock-in technique [12-14] is generally used when the temperature variation, or in other terms the current flow, is so low that it is not possible to detect the signal with an acceptable signal-to-noise ratio. the device heating is synchronously detected with a heterodyne demodulation method, schematically reported in fig. 3.

fig. 3 schematic representation of lock-in technique [13].

the basic principle is to amplitude-modulate the signal to detect, in this context the temperature, at a specified frequency f_lock-in. the ir images returned by the camera are multiplied by two weighting sinusoidal functions in quadrature. amplitude and phase maps are subsequently obtained by low-pass filtering, so that the noise falling outside this frequency is averaged out. numerical filtering can be performed in real time with appropriate software, allowing a temperature resolution better than 100 μK for most state-of-the-art ir cameras. for a lock-in detection approach the temperature of the device under test (dut) is modulated by external biasing at the lock-in frequency f_lock-in:

$$ T(x,y,t) = T_0(x,y) + A(x,y)\,\sin\!\left[2\pi f_{\text{lock-in}}\,t + \varphi(x,y)\right] + n(x,y) \tag{2} $$

where T_0(x,y) is the initial temperature on chip, A(x,y) and φ(x,y) are the amplitude and phase of the temperature modulation, and n(x,y) takes into account all the noise components. after an integration of the sensor output for an acquisition time t_acq >> 1/f_lock-in (i.e. if a reasonable number of periods are averaged), the in-phase and quadrature components can be obtained:

$$ S^{0^\circ}(x,y) = A(x,y)\,\cos\!\left[\varphi(x,y)\right] \tag{3a} $$

$$ S^{90^\circ}(x,y) = A(x,y)\,\sin\!\left[\varphi(x,y)\right] \tag{3b} $$

from (3) it is easy to calculate the amplitude and phase of the lock-in signal as follows:

$$ A(x,y) = \sqrt{\left[S^{0^\circ}(x,y)\right]^{2} + \left[S^{90^\circ}(x,y)\right]^{2}} \tag{4a} $$

$$ \varphi(x,y) = \arctan\!\left(\frac{S^{90^\circ}(x,y)}{S^{0^\circ}(x,y)}\right) \tag{4b} $$

from which useful information on heat distribution and spreading can be obtained. it is important to note that emissivity contrast does not affect the phase signal.
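the demodulation described above can be sketched for a single pixel as follows. this is only a minimal illustration under stated assumptions: the function names `lock_in_demodulate` and `synthetic_pixel` are hypothetical, the modulation is assumed to be a sine at the lock-in frequency, the acquisition is assumed to span an integer number of periods, and the low-pass filter is reduced to a plain average over all samples. it is not the processing chain of any specific camera or of the set-up used by the authors.

```python
import math


def lock_in_demodulate(samples, f_lock_in, f_sample):
    """Return (amplitude, phase) of the component of `samples` at f_lock_in.

    Each sample is multiplied by two reference sinusoids in quadrature and
    the products are averaged, which acts as the low-pass filter.
    """
    n = len(samples)
    if n == 0:
        raise ValueError("at least one sample is required")
    s0 = 0.0    # in-phase component, weighted by sin
    s90 = 0.0   # quadrature component, weighted by cos
    for k, value in enumerate(samples):
        arg = 2.0 * math.pi * f_lock_in * k / f_sample
        s0 += value * math.sin(arg)
        s90 += value * math.cos(arg)
    # The factor 2/n normalises the averages so that s0 = A*cos(phi)
    # and s90 = A*sin(phi) for a signal A*sin(2*pi*f*t + phi).
    s0 *= 2.0 / n
    s90 *= 2.0 / n
    return math.hypot(s0, s90), math.atan2(s90, s0)


def synthetic_pixel(t0, amplitude, phase, f_lock_in, f_sample, n_periods):
    """Noise-free test signal: constant offset plus a sinusoidal modulation."""
    n = int(round(n_periods * f_sample / f_lock_in))
    return [
        t0 + amplitude * math.sin(2.0 * math.pi * f_lock_in * k / f_sample + phase)
        for k in range(n)
    ]


if __name__ == "__main__":
    # 10 mK modulation on a 300 K background, 25 Hz lock-in, 400 Hz frame rate
    signal = synthetic_pixel(300.0, 0.01, 0.5, 25.0, 400.0, 50)
    amp, phi = lock_in_demodulate(signal, 25.0, 400.0)
    print(f"amplitude = {amp:.6f} K, phase = {phi:.6f} rad")

    # An emissivity factor scales the detected signal but not its phase
    scaled = [0.3 * s for s in signal]
    amp_e, phi_e = lock_in_demodulate(scaled, 25.0, 400.0)
    print(f"amplitude = {amp_e:.6f}, phase = {phi_e:.6f} rad")
```

multiplying every sample by a constant factor, as an emissivity lower than one would do, scales both the in-phase and the quadrature component by the same amount, so the amplitude changes while the phase stays the same. this is consistent with the remark that the phase signal is not affected by emissivity contrast.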
in the characterization of power devices, lock-in thermography is usually used to locate the point where a device has broken after a failure event. this is done by biasing the tested device with a very small current, in the range of tens of mA, and detecting the resulting heat with the lock-in approach. as an example, fig. 4 shows the failure location of a power diode that failed in avalanche condition after an unclamped inductive switching (uis) test.

fig. 4 lock-in image superimposed on device geometry.

furthermore, this technique is necessary for the inspection of device failures when no evident damage can be spotted optically. this is depicted in fig. 5, where a punch-through igbt is analyzed by means of lock-in thermography after a failure in avalanche. a small current at low voltage flows only in the damaged region and leads to a small overheating in that area, thus indicating the failure location.

fig. 5 dut failure area.

lock-in thermography can be used for more advanced analyses if an appropriate ad-hoc experimental system is employed, as shown in the following examples. the ir set-up developed in [11] is able to synchronously repeat an experiment for a given number of periods while detecting the temperature of a dut by the lock-in method. with this procedure a 1200 V rated igbt was analyzed during a uis test. results are reported in fig. 6, where it is clear that the current is not distributed uniformly over the entire device area. current crowding is an undesirable situation because it can lead to premature device failure. in this case precise information on the effectively active area, where the current flows, can be obtained from the phase signal because, as said before, it is not affected by the emissivity problem.
another interesting analysis can be carried out using the system described in [15]. without going into details, a uis transient is repeated periodically, but the avalanche current is drained out of the dut by activating a parallel device (crow-bar) after a certain time (fig. 7). in this way, it is possible to observe just the first few μs of the transient. in that short time, the temperature is again detected by the lock-in technique. this kind of experiment is useful to study the possible activation of the termination area during avalanche, and therefore to prove the influence of the termination design on the device breakdown behavior and reliability. fig. 8a and fig. 8b show that during the initial instants of the avalanche condition the current flows in the termination area and, moreover, is confined to very narrow areas.

fig. 6 amplitude (right) and phase (left) images of an igbt during unclamped inductive switching test.

fig. 7 timing graph of modified uis test [14].

(a) (b)
fig. 8 current distribution during first few μs of avalanche breakdown.

4. conclusion

in this paper, the use of infrared thermography for the characterization and monitoring of the behaviour of power semiconductor devices has been shown, together with a description of its many attractive features. the transient and lock-in detection methods have been presented with the aid of some experimental results, since they are probably the two main analyses employed for the investigation of power devices.

references

[1] o. breitenstein, m. langenkamp, lock-in thermography-basics and use for functional diagnostics of electronic components, springer, berlin, 2003.
[2] o. breitenstein, m. langenkamp, "lock-in contact thermography investigation of lateral electronic inhomogeneities in semiconductor devices", sensors actuators a: phys, vol. 71, no. 1–2, pp. 46–50, 1998.
[3] x.
maldaque, theory and practice of infrared technology for nondestructive testing, wiley, new york, 2001. [4] j.p. rakotoniaina, o. breitenstein, m. lagenkamp, "localization of weak heat sources in electronic devices using highly sensitive lock-in thermography", materials science and engineering: b, vol. 91-92, pp. 481–485, 2002. [5] m. riccio, a. pantellini, a. irace, g. breglio, a. nanni, c. lanzieri, "electro-thermal characterization of algan/gan hemt on silicon microstrip technology", microelectron. reliab., vol. 51, no. 9, pp. 1725– 1729, 2011. [6] n. killat, m.kuball, t. chou, u. chowdhury, j. jimenez, "temperature assessment of algan/gan hemts: a comparative study by raman, electrical and ir thermography", in proceedings of the ieee international reliability physics symposium, irps 2010, 2-6 may 2010, pp. 528–531. [7] d. may, b. wunderle, m.a. ras, w. faust, a. gollhard, r. schacht, b. michel, "material characterization and non-destructive failure analysis by transient pulse generation and ir-thermography", 14th international workshop on thermal investigation of ics and systems, therminic 2008, 24-26 sept. 2008, pp. 47–51. [8] j. h. l. ling, a. a. o. tay, k. f. choo, w. chen, "thermal characterization and modelling of a gallium arsenide power amplifier mmic", in proceedings of the 13th ieee intersociety conference on thermal and thermomechanical phenomena in electronic systems, itherm 2012, may 30 2012-june 1 2012, pp. 440– 445. [9] m. riccio, l. maresca, a. irace, g. breglio, y. iwahashi, "impact of gate drive voltage on avalanche robustness of trench igbts", microelectron. reliab., vol. 54, issues 9–10, pp. 1828–1832, september– october 2014. [10] m. riccio, a. castellazzi, g. de falco, a. irace, "experimental analysis of electro-thermal instability in sic power mosfets", microelectron. reliab., vol. 53, issues 9–11, pp. 1739–1744, september– november 2013. [11] g. romano, m. riccio, g. de falco, l. maresca, a. irace, g. 
breglio, "an ultrafast ir thermography system for transient temperature detection on electronic devices", in proceedings of the 30th annual semiconductor thermal measurement and management symposium, semi-therm 2014, march 2014, pp. 80–84. [12] s. huth, o. breitenstein, a. huber, u. lambert, "localization of gate oxide integrity defects in silicon metal-oxide-semiconductor structures with lock-in ir thermography", journal of applied physics, vol. 88, no.7, pp. 4000–4003, oct 2000. [13] g. breglio, a. irace, e. napoli, m. riccio, p. spirito, k. hamada, t. nishijima, t. ueta, "detection of localized uis failure on igbts with the aid of lock-in thermography", microelectron. reliab., vol. 48, issues 8–9, pp. 1432-1434, august–september 2008. [14] m. riccio, a. irace, g. breglio, l. rossi, m. barra, f. v. di girolamo and a. cassinese, "current distribution effects in organic sexithiophene field effect transistors investigated by lock-in thermography: mobility evaluation issues", appl. phys. lett., vol. 93, p. 243504, 2008. [15] m. riccio, l. rossi, a. irace, e. napoli, g. breglio, p. spirito, r. tagami, y. mizuno, "analysis of large area trench-igbt current distribution under uis test with the aid of lock-in thermography", microelectron. reliab., vol. 50, issues 9–11, pp. 1725-1730, september–november 2010. facta universitatis series: electronics and energetics vol. 35, no 1, march 2022, pp. 137-143 https://doi.org/10.2298/fuee2201137o © 2022 by university of niš, serbia | creative commons license: cc by-nc-nd original scientific paper dielectric absorption in pedot:pss capacitors with stainless steel yarn electrodes in textile substrates sheilla atieno odhiambo1,2, gilbert de mey3, carla hertleer1, lieva van langenhove1 1department of textiles, ghent university, ghent, belgium 2department of textiles, moi university, eldoret, kenya 3department of electronics and information systems, ghent university, ghent, belgium abstract. 
capacitors have been made on textile substrates. stainless steel yarns were used as electrodes. the dielectric material was a mixture of pedot and pss. these capacitors are developed to be inserted in wearable textiles, a research field called smart textiles. after charging, a spontaneous discharge lasting several hours was observed. when a small resistance or even a short circuit was connected for a certain time and then removed, the voltage was observed to rise again. this phenomenon is known as dielectric absorption. for the pedot:pss cells the voltage recovery was found to be relatively high compared to other materials.

key words: dielectric absorption, capacitor, pedot:pss, textile.

received july 15, 2021; received in revised form september 15, 2021
corresponding author: gilbert de mey, department of electronics and information systems, ghent university, technologiepark 126, 9052 ghent, belgium, e-mail: gilbert.demey@ugent.be

1. introduction

dielectric absorption, also called dielectric relaxation or battery action, is the phenomenon in which a capacitor still contains an amount of electric charge even after its electrodes have been short circuited or connected to a resistor for a certain period of time. after removal of the short circuit, the voltage across the capacitor starts to increase. with an ideal capacitor such behaviour is not possible. it turns out that the dielectric layer is able to absorb electrostatic energy and to store it during a limited period of short circuiting. dielectric absorption has been observed in several dielectric materials such as polymers and insulating oxides. a physical explanation is that the dielectric contains polar molecules which are oriented along the externally applied electric field according to a debye relaxation mechanism [1].

consequently, these dielectrics are lossy, i.e. their equivalent electric network models always include both ideal capacitors and resistors [2][3][4]. for most materials the relaxation of the voltage after a short circuit is rather small, typically 0.1 % for a dielectric material like sio2 [2]. generally, the relaxation is in the range 0.01 % to 10 %. dielectric absorption is a serious problem for electronic circuit design. one has to take into account that some types of capacitors are able to produce an unexpected voltage which may inhibit the normal behaviour of the circuit [3][4].

our research fits into the field of smart textiles, which includes the integration of electronic components into wearable textiles for different applications like medical surveillance or safety [5]. conductors and resistors have been integrated in textiles for interconnection, heating elements and electrodes [6][7][8][9][10]. our research is focused on capacitors and batteries in textile structures [11][12][13][14]. the purpose is to make a device that stores a small amount of electric energy and can be fully integrated into a textile fabric, so that the device is fully wearable. more specifically, a water solution of pedot:pss has been deposited on a textile fabric made from cotton or polyester (pet). more details about the fabrication have been published elsewhere [15]. a photograph of a device is shown in fig. 1.

fig. 1 photograph of a sample. dimensions are about 5×5 cm2. the black area is the deposited pedot:pss. three stainless steel electrodes are sewn into the substrate.

electrically conducting yarns made from stainless steel fibers were used as electrodes, as shown in fig. 1 [15]. three electrodes were inserted. during the measurements either the first and second or the second and third electrodes were used. it was observed that these electrodes gave much better results than the ag electrodes used by other authors [16].
pedot:pss is a mixture of two polymers: pedot, poly(3,4-ethylenedioxythiophene), and pss, poly(styrenesulphonate). an electron can jump from a pedot to a pss molecule so that ionization occurs: pedot+ and pss-. these ions give rise to electric conduction in the solid. a discussion of the energy storage possibilities of these devices can be found in [17]. several experiments have also been carried out in order to understand the physical basics of the conduction mechanism [18][19][20][21].

an intensive search on the web of science using the topics "pedot" and "dielectric absorption" or "dielectric relaxation" did not reveal any papers. hence, to the best of our knowledge, no article related to dielectric absorption in pedot:pss layers has been published.

2. experimental measurements

for the experimental measurements the circuit shown in fig. 2 has been used.

fig. 2 schematic layout of the measuring circuit

the pedot:pss capacitor was charged at a constant voltage V_0 = 1.5 V for a sufficiently long time, around 2 hours, by closing the switch s1. the switch s1 was then opened and the voltage V(t) was recorded as a function of time with the voltage meter vm. the input resistance of vm was about 10 MΩ, so that its current consumption was always below 150 nA. fig. 3a shows the decaying voltage V(t) with the label "initial". one observes a steep decay in the beginning followed by a long period with a very slow voltage decay [15]. after several other measurements, exactly the same experiment was repeated and plotted in fig. 3a with the label "final". it turns out that this curve has a similar shape but the voltage values are higher. this result agrees with reliability experiments carried out on similar samples [22]. it was found that the pedot:pss capacitors improve during the first 5 to 7 charging/discharging cycles.
if more charging/discharging cycles are applied, the capacitors start to become less efficient [23]. the conclusion is that one cannot assign one single discharge curve to a given device.

fig. 3 voltage decay curves of pedot:pss cells, with and without resistive load. (a) general view (b) detailed view of the dielectric absorption.

a similar experiment was carried out with a load resistor R_L = 10 kΩ connected to the pedot:pss capacitor. the curve is shown in fig. 3a, along with a more detailed view in fig. 3b. as expected, the discharging voltage is much lower than for the "initial" or the "final" curve. for the load R_L = 5 kΩ, the discharging curve is slightly below the R_L = 10 kΩ curve. after 360 seconds of discharging, the R_L = 5 kΩ load was disconnected by opening the switch s2. as can be seen in fig. 3b, the output voltage started to increase considerably and reached almost 80 % of the values of the "initial" curve. note that a load resistor of only R_L = 5 kΩ is much smaller than the internal resistance of the pedot:pss cells, which has been measured to be around 300 kΩ. these results are quite different from those obtained with other dielectric materials, where the recovery is at most 15 % after a short circuit of only 10 seconds. here, even after a load period of 360 seconds, the voltage recovery is almost 80 % of the voltage which would have been obtained without any load resistor.

in a second series of experiments, the pedot:pss cell was short circuited (R_L = 0 Ω) for a certain period by closing the switch s2 (fig. 2). the results are shown in fig. 4.

fig. 4 transient voltages measured with short circuit periods of (a) 19 s, (b) 111 s, (c) 215 s and (d) 721 s

each time, the pedot:pss cells were charged to 1.5 V for two hours.
after a certain discharge time during which only the voltage meter was connected (fig. 2), the short circuit was applied intentionally. four tests are shown in fig. 4, done with the following short circuit periods: (a) 19 s, (b) 111 s, (c) 215 s and (d) 721 s. it is remarkable that even a long short circuit time of 721 s, which is about 12 minutes, was not enough to completely discharge the pedot:pss cell. moreover, these long short circuit times did not have any negative influence on the cells. the cells could be recharged and discharged again without any problem.

as already mentioned, the pedot:pss cells always show a voltage decay after charging, even when no load resistor is connected. experiments carried out with a load resistor in the range R_L > 100 kΩ did not provide clear results: the difference between the discharging curves with R_L > 100 kΩ and R_L = ∞ was hardly visible. therefore our experiments have been done with lower values: R_L = 10 kΩ, R_L = 5 kΩ and even R_L = 0.

the pedot:pss cells are not perfect capacitors. the self-discharge is clearly visible from the results shown in fig. 3. it is not obvious how to evaluate the voltage recovery, because the output voltage decays even without any short circuit. hence, the initial voltage (1.5 V in our experiments) is used as the reference. in fig. 4 some numerical values for the voltages are shown. if we consider the value of 0.2121 V obtained after a short circuit period of 721 s (fig. 4d), the voltage recovery is found to be 0.212/1.5 ≈ 14 %. for the shorter short circuit period of 19 s, one gets 0.804/1.5 = 53.6 % (fig. 4a). in the literature, voltage recoveries are in the range 0.1 % to 10 %. our experiments show that much higher values can be obtained using pedot:pss cells. to the best of our knowledge, no voltage recovery values higher than 10 % have been reported in the literature.

3. discussion

a frequently used model for dielectric relaxation is a parallel connection of several rc networks. when such a circuit is short circuited, the individual capacitors can only be discharged through their resistors. this explains why some charge still remains after a certain short circuit period. such a network can also be represented by the following complex dielectric constant, known as the cole-cole model [24]:

$$ \varepsilon = \varepsilon_h + \frac{\varepsilon_l - \varepsilon_h}{1 + (j\omega\tau_0)^{\alpha}} \tag{1} $$

where ε_h and ε_l are the high- and low-frequency limits of the dielectric constant, ω is the angular frequency and τ_0 is the central time constant. this expression represents a lossy dielectric material, because it has a non-zero imaginary part. it should be noted that this model was first used a long time ago to explain the dielectric behaviour of electrolytes [19]. later on, the model was described in other papers and textbooks for several dielectric materials [25][26][27]. this might be an argument that the electric properties of pedot:pss are also due to mobile ions. pedot:pss being a mixture of two polymers, pedot and pss, or pedot+ and pss- if charged, it is clear that ionic conduction takes place. the fact that we are dealing with polymers, i.e. long molecules, explains the large time constants observed in the experiments.

the parameter α in (1) is related to the variance of the time constant distribution [25]. if α = 1, only one time constant τ_0 occurs. for 0 < α < 1 a distribution of time constants centered around τ_0 is observed. the smaller the value of α, the wider the distribution.

the dielectric relaxation observed in the pedot:pss cells can be explained by the fact that even a short circuit of relatively long duration does not give all the pedot+ and pss- ions enough time to move back to their equilibrium positions, which correspond to charge neutrality and hence zero output voltage.

4. conclusion

electroconductive cells with stainless steel yarns as the electrodes and pedot:pss as the dielectric material have been made on textile substrates. very high values of the voltage recovery after a short circuit were observed. this phenomenon, known as dielectric absorption, has been detected in several materials, but the voltage recovery never exceeded 10 %. in pedot:pss much higher voltage recoveries have been measured.

acknowledgements. s. odhiambo, on leave from the moi university, eldoret, kenya, wants to thank the vlir (flemish interuniversity council) for the financial support for her stay at the university of gent.

references

[1] a. k. jonscher: "dielectric relaxation in solids", chelsea dielectrics press, london, 1983.
[2] s. r. ekanayake, m. b. cortie and m. j. ford, "design of nanocapacitors and associated materials challenges", current applied physics, vol. 4, pp. 250–254, 2004.
[3] c. iorga, "compartmental analysis of dielectric absorption in capacitors", ieee transactions on dielectrics and electrical insulation, vol. 7, pp. 187–192, 2000.
[4] s. westerlund and l. ekstam, "capacitor theory", ieee transactions on dielectrics and electrical insulation, vol. 1, pp. 826–839, 1994.
[5] l. van langenhove and c. hertleer, "smart clothing: a new life", international journal of clothing science and technology, vol. 16, 2004.
[6] m. irwin, d. roberson, r. olivas, r. wicker and e. macdonald, "conductive polymer-coated threads as electrical interconnects in e-textiles", fibers and polymers, vol. 12, pp. 904–910, 2011.
[7] o. kayacan, e. bulgun and o. sahin, "implementation of steel-based fabric panels in a heated garment design", textile research journal, vol. 79, pp. 1427–1437, 2009.
[8] l. rattfalt, m. linden, f. hult, l. berglin and p. ask, "electrical characteristics of conductive yarns and textile electrodes for medical applications", medical & biological engineering & computing, vol. 45, pp. 1251–1257, 2007.
[9] j.
lesnikowski and m. tokarska, "modeling of selected electric properties of textile signal lines using neural networks", textile res journal, vol. 84, pp. 290–302, 2014. [10] i. kazani, c. hertleer, g. de mey, a. schwarz, g. guxho and l. van langenhove, "electrical conductive textiles obtained by screen printing", fibres & textiles in eastern europe, vol. 20, pp. 57–63, 2012. [11] j.a. gu, s. gorgutsa and m. skorobogatiy: "soft capacitor fibers for electronic textiles", applied physics letters, no. 115006, 2010. [12] l. hu, "lithium-ion textile batteries with large areal mass loading", advanced energy material, online: wiley, 2011. [13] k. jost, c. perez, j.k. mcdonough, v. presser, m. heon, g. dion and y. gogotsi, "carbon coated textiles for flexible energy storage", energy & environmental science, vol. 4, pp. 5060–5067, 2011. [14] a. laforgue, "all-textile flexible supercapacitors using electrospun poly(3,4-ethylenedioxythiophene) nanofibers", journal of power sources, vol. 196, pp. 559–564, 2011. [15] s. 0dhiambo, g. de mey, c. hertleer, a. schwarz and l.van langenhove, "discharge characteristics of poly(3,4-ethylene dioxythiophene): poly(styrenesulfonate) (pedot:pss) textile batteries; comparison of silver coated yarn electrode devices and pure stainless steel filament yarn electrode devices", textile research journal, vol. 84, pp. 347–354, 2014. [16] r. bhattacharya, m. de kok and j. zhou, "rechargeable electronic textile battery", applied physics letters, vol. 95, no. 223305, 2009. [17] sa. odhiambo, p. fiszer, g. de mey, c. hertleer, i. nuramshani, l. van langenhove, a. napieralski, "the electric energy stored in a pedot:pss capacitors on textile substrate: limits and possibilities", international journal of clothing science and technology, vol. 30, pp. 808–816, 2018. [18] i. nuramdhani, at. gokceoren, sa. odhiambo, g. de mey, c. hertleer, l. 
van langenhove, "electrochemical impedance analysis of a pedot:pss based textile energy storage device", materials, vol. 11, no. 48, 2018. dielectric absorption in pedot:pss capacitors with stainless steel yarn electrodes in textile substrates 143 [19] i. nuramdhani, sa. odhiambo, c. hertleer, g. de mey, l. van langenhove, "electric field effect on charge-discharge characteristics of textile-based energy storage devices. in search of the underlying mechanism", tekstilec, vol. 59, pp.162–167, 2016. [20] i. nuramdhani, g. de mey, m. widodo, l. van langenhove, "ionic shot noise in an electrochemical capacitor system made of poly(3,4-ethylenedioxythiophene)-poly(styrenesulfonate) film and silver coated polybenxazolestainless steel electroces on textile fabrics", textile research journal, vol. 89, pp. 1276–1285, 2019. [21] i. nuramdhani, j. manoj, p. samyn, p. adriaensens, b. malengier, w. deferme, g. de mey, l. van langenhove, "charge discharge characteristics of textile energy storage devices having different pedot:pss ratios and conductive yarns configurations", polymers, vol. 11, no. 345, 2019. [22] s. odhiambo, g. de mey, c. hertleer and l. van langenhove, "reliability testing of pedot:pss capacitors integrated in textile fabrics", eksploatacja i niezawodnosc maintenance and reliability, vol. 16, pp. 440–444, 2014. [23] k. cole and r. cole, "dispersion and absorption in dielectrics: i alternating current characteristics", journal of chemical physics, vol. 9, pp. 341–351, 1949. [24] r. fuoss and j. kirkwood, "electrical properties of solids: dipole moments in polyvinyl chloride diphenyl systems", journal of the american chemical society, vol. 63, pp. 385–394, 1941. [25] a. dekker, "solid state physics", mc millan, london, 1969, pp. 150–154. [26] a. van der ziel, "solid state physical electronics", mc graw hill, 1975, pp. 488–490. [27] a. k. jonscher, "dielectric relaxation in solids", journal of physics d: applied physics, vol.32, pp. r57– r70, 1999. 
facta universitatis series: electronics and energetics vol. 34, no 3, september 2021, pp. 393-400 https://doi.org/10.2298/fuee2103393v © 2021 by university of niš, serbia | creative commons license: cc by-nc-nd original scientific paper
controlled electron leakage in electron blocking layer free ingan/gan nanowire light-emitting diodes
ravi teja velpula1, barsha jain1, trupti ranjan lenka2, hieu pham trung nguyen1
1new jersey institute of technology, university heights, newark, nj 07102, usa
2department of electronics & communication engineering, national institute of technology silchar, assam, india
abstract. in this study, we have proposed and investigated the effect of coupled quantum wells to reduce electron overflow in ingan/gan nanowire white color light-emitting diodes. the coupled quantum well before the active region could decrease the thermal velocity, which leads to a reduced electron mean free path. this improves the electron confinement in the active region and mitigates electron overflow in the devices. in addition, coupled quantum well after the active region utilizes the leaked electrons from the active region and contributes to the white light emission. therefore, the output power and external quantum efficiency of the proposed nanowire leds are improved. moreover, the efficiency droop was negligible up to 900 ma injection current.
key words: nanowires, light-emitting diodes, electron blocking layer, molecular beam epitaxy.
1. introduction
indium gallium nitride (ingan) based white color light-emitting diodes (wleds) have tremendous energy-saving potentials in solid-state technology [2, 3]. compared to conventional planar structures, iii-nitride nanowires can exhibit significant advantages, including greatly reduced dislocation densities and polarization fields, due to the effective lateral stress relaxation.
moreover, the nanowires show improved carrier confinement due to the incorporation of quantum dots/disks, which is promising for high-efficiency leds with tunable emission [4-6]. however, nanowire leds still pose several challenges for further improving the quantum efficiency and light output power, which may include non-uniform carrier distribution, electron overflow, and the presence of large densities of defects along the nanowire lateral surfaces [7, 8]. it is believed that one of the critical reasons for the efficiency droop is the electron overflow, which also influences the output characteristics of leds, especially at high injection levels [9]. in leds, electron overflow is caused mainly by the non-uniform carrier distribution in the active region. as the effective mass of holes is higher and their mobility is lower compared to those of electrons, hole injection in the ingan/gan structure is highly non-uniform. the holes reside close to the p-gan layer, while electrons have a relatively uniform distribution in the active region. the resulting non-uniform carrier distribution throughout the active region leads to increased electron overflow. moreover, non-radiative recombination increases in the p-gan region, where the leaked electrons recombine with the inefficiently injected holes. this further reduces the device performance.
received march 24, 2021; received in revised form may 09, 2021
corresponding author: hieu pham trung nguyen, new jersey institute of technology, university heights, newark, nj 07102, usa (e-mail: hieu.p.nguyen@njit.edu)
* an earlier version of this paper was presented at the international conference on micro/nano electronics devices, circuits and systems (mndcs-2021), 30-31 january, 2021, india [1].
one of the solutions to reduce the electron overflow is to increase the electron confinement in the active region by introducing a high al content p-algan electron blocking layer (ebl) between the last quantum barrier (qb) and the p-gan layer [9, 10]. however, if the ebl is not designed properly, it may further degrade the hole injection into the active region by forming positive sheet polarization charges at the last qb/ebl heterointerface [11, 12], thereby reducing the radiative recombination in the active region. in this context, designing an ebl-free led without compromising the optical performance is highly desired. to reduce the electron overflow without using an ebl, electrons should be slowed down before they are injected into the active region. inserting a layer with a lower indium (in) composition reduces the kinetic energy and velocity of the injected electrons. moreover, by inserting this layer before the active region, hot electrons are thermalized by interacting with longitudinal optical (lo) phonons [13]. in this context, we have performed an experimental study of electron overflow in ingan/gan nanowire wleds, wherein an n-in0.2ga0.8n well is incorporated between the n-gan and the device active region to effectively control electron overflow. furthermore, to utilize the electrons that escape from the active region, we have employed a p-ingan quantum well between the active region and p-gan, which reduces the electron loss to the p-gan and contributes blue light emission that helps control the white light emission from the led device [14, 15]. moreover, it is necessary to understand the fundamental mechanism behind the reduction of electron leakage due to the coupled quantum wells placed before and after the active region. in this study, we have investigated the performance of coupled quantum wells in ingan/gan self-organized nanowire wleds and presented the detailed carrier mechanism via a theoretical model.
2.
molecular beam epitaxial growth of ingan/gan nanowire heterostructures on si (111) and fabrication
vertically aligned ingan/gan nanowire heterostructures were grown on si (111) substrates by radio-frequency plasma-assisted molecular beam epitaxy (veeco gen ii mbe) under nitrogen-rich conditions. the substrate growth temperatures for the gan nanowires and ingan were 730 °c and 550-600 °c, respectively. during the nanowire growth, the nitrogen flow rate and forward plasma power were kept at 1 sccm and ~ 350 w, respectively. the device active region contains ten ingan/gan quantum dots with ~ 3 nm ingan quantum dot and ~ 3 nm gan quantum barrier layer. for led2, the device active region is sandwiched by two ingan/gan quantum wells, which were grown at 630 °c, to control electron overflow in the active region and utilize the electron leakage out of the active region for the blue light emission. such uniformly grown nanowire samples are suitable for our device fabrication. the device fabrication of wleds involves the following steps. to remove native oxides from the nanowire surface and the backside of the si substrates, we have cleaned the nanowire led samples initially with hcl and then with hf. next, the cleaned nanowire samples were spin-coated with polyimide resist to fully cover and planarize the nanowires, which also avoids a short circuit between the top and bottom electrodes. the top portion of the nanowires was exposed by etching the polyimide resist using the o2 dry etching method. deposition of the top metal contact (p-metal contact) includes three main steps. first, ni(5 nm)/au(5 nm) layers were deposited on the surface of the p-gan nanowires, followed by deposition of a 200 nm indium tin oxide (ito) layer on the device top surface, which serves as the current spreading layer and the transparent electrode.
to further improve the current spreading, ni/au metal patterns were deposited on top of the ito. the n-metal contact was deposited with ti(20 nm)/au(120 nm) layers on the backside of the si substrate. finally, the fabricated devices were annealed at ~500 °c for 1 minute in a nitrogen ambient to achieve low ohmic contact resistance. a photolithography process was used to define the device size and electrode position. the device area is ~300×300 µm². the developed phosphor-free ingan/gan nanowire wleds can be suitable candidates for smart display applications [14, 16, 17].
3. simulation setup and parameters
figure 1 shows the schematic of two ingan/gan nanowire led structures considered in this study. the first structure, led1, consists of a 200 nm n-gan nanowire template, 10 multiple quantum wells (mqws) of 3 nm ingan quantum well (qw)/ 3 nm gan qb in the active region, and a 100 nm p-gan. the proposed structure, denoted as led2, has the same structure as led1, but with an extra 30 nm thick n-doped in0.2ga0.8n layer introduced between the n-gan template and the active region. moreover, an extra 10 nm thick p-doped in0.2ga0.8n qw is also introduced between the last barrier and the p-gan layer. in this study, the band offset ratio for ingan was set as 0.7/0.3, and the induced polarization charges due to both spontaneous and piezoelectric polarization are assumed to be 10% of the theoretical values [18].
fig. 1. schematic diagram of (a) led1 and (b) led2
4. results and discussion
figure 2 shows the transmission electron microscope (tem) image of led2, where the 30 nm n-ingan qw, the 3 nm/3 nm ingan/gan qds in the active region, and the 10 nm p-ingan qw are clearly identified. moreover, crystal defects are not visible. the ingan/gan qds in the active region are positioned in the center of the nanowires due to strain-induced self-organization.
fig. 2.
tem image of led2, wherein n- and p-ingan qws and ingan/gan qds are identified.
figure 3(a) depicts the strong photoluminescence spectra of led1 and led2. the emission peak at ~ 550 nm corresponds to the emission from the active region, while the emission peak at ~ 430 nm in the case of led2 is due to the emission from the coupled quantum wells. further, normalized electroluminescence spectra of led2 at various injection currents are shown in fig. 3(b). the peak emission at ~ 550 nm originates from the active region, which agrees well with the photoluminescence results shown in fig. 3(a). it is seen that the emission at ~ 430 nm becomes progressively stronger as the injection current increases. this is because, at a higher injection current, more injected electrons can escape from the active region and have more chance to recombine with the holes in the p-ingan qw located between the active region and the p-gan layer. the electroluminescence spectra cover the whole visible range and show a balanced rgb distribution. the experimentally measured light output power (lop) and external quantum efficiency (eqe) of led1 and led2 are shown in figs. 3(c) and 3(d). it is seen that led2 demonstrates higher output power and eqe than the conventional led, i.e., led1. more importantly, no efficiency droop was observed in led2 up to an injection current of 900 ma. the improved performance of led2 is attributed to the ingan coupled quantum wells incorporated before and after the active region. the detailed mechanisms by which the ingan coupled quantum wells improve the led performance are theoretically investigated through the mean free path (lmfp) model as follows.
fig.
3 (a) normalized photoluminescence spectra of led1 and led2 measured at 300k, (b) electroluminescence spectra of led2 measured at different injection currents at 300k, (c) light output power-current characteristics for led1 and led2 measured at 300k and (d) relative eqes for led1 and led2 measured at 300k.
the schematics of the energy band diagrams of led1 and led2 are depicted in figs. 4(a) and 4(b), respectively, along with four electron transport processes in the active region. as illustrated in fig. 4, the incoming electrons are scattered and fall into the quantum wells (process 1). some of those captured electrons recombine with the holes radiatively as well as at crystal defects (process 2), while the remaining electrons escape from the qws and become free again (process 3). in addition, some electrons with longer lmfp travel to a remote position without being captured by the quantum wells (process 4). the lmfp of these electrons needs to be reduced so that the carrier concentration in the qws increases, which favors a higher radiative recombination rate in the active region by reducing the electron overflow. here, we have considered the total number of electrons injected into the n-gan region to be n0 for both led structures. for the simplicity of the model, electron loss through non-radiative recombination is neglected. also, the hole concentration in the n-ingan layer between the n-gan and the active region in the case of led2 is much lower than the electron concentration, so the electron loss through radiative recombination with holes is also negligible. it is assumed that, out of the n0 electrons, n2 electrons captured by the first extra qw undergo thermalization with lo phonon emission, while the remaining electrons, denoted as n1, travel directly over the extra qw layer without undergoing thermalization. the number of electrons captured in the quantum wells is correlated with the electron lmfp [19].
to increase the number of electrons captured by the quantum wells, the electron lmfp within the ingan/gan mqw region must be reduced. to understand how the extra qw before the active region in led2 reduces lmfp, the electron lmfp in both leds is calculated. it is a function of the thermal velocity (vth) and the scattering time (τsc), which is set to 0.0091 ps [20, 21], as shown in eq. 1. vth can be further expressed as shown in eq. 2. for led2, with an extra n-ingan qw before the active region, the expression for vth is given by eq. 3.
\[ l_{mfp} = v_{th} \times \tau_{sc} \tag{1} \]
\[ v_{th} = \sqrt{2E / m_e} \tag{2} \]
\[ v_{th\_eqw} = \sqrt{2\,[E + \Delta E_c + qV - \hbar\omega_{LO} - \Delta E_c] / m_e} = \sqrt{2\,[E + qV - \hbar\omega_{LO}] / m_e} \tag{3} \]
in eq. 1-3, $E$ is the electron energy before entering the qw, i.e., the electron energy in the n-gan layer, and $m_e$ is the effective mass of electrons. $-\hbar\omega_{LO}$ is the energy loss by phonon emission, and $qV$ is the work done on the electrons by the polarization-induced electric field in the extra qw. the first $\Delta E_c$ in eq. 3 represents the kinetic energy received by the electrons when jumping over the conduction band offset between n-gan and the n-in0.2ga0.8n extra qw, and $-\Delta E_c$ represents the energy lost by the electrons when climbing over the conduction band offset between n-in0.2ga0.8n and the gan layer. here it is assumed that the thermionic emission process dominates over intra-band tunneling during electron transport into the active region, thus $\Delta E_c$ can be eliminated, as shown in eq. 3. the energy loss through lo phonon emission, $\hbar\omega_{LO}$, is taken to be 92 mev [21], and $qV = \int_0^{t_{eQW}} q \times E(y)\,dy$, where $E(y)$ is the electric field, is calculated from the electric field shown in fig. 5(a). the value of $qV$ is found to be 47 mev. the effect of the extra qw on the electron mean free path follows from eq. 2 and 3, since $E + qV - \hbar\omega_{LO} < E$.
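as a rough numerical illustration of eq. 1-3, the sketch below evaluates the thermal velocity and the mean free path with and without the extra n-ingan qw. the scattering time (0.0091 ps), qv (47 mev) and the lo phonon energy (92 mev) are the values quoted in the text. the initial electron energy of 0.5 ev and the effective mass of 0.2 free-electron masses are illustrative assumptions that are not given in the paper, and all function and variable names are hypothetical.

```python
import math

Q_E = 1.602176634e-19      # elementary charge in C (1 eV = Q_E joules)
M_0 = 9.1093837015e-31     # free-electron rest mass in kg

TAU_SC = 0.0091e-12        # scattering time from the text, in s
QV_EV = 0.047              # work done by the polarization field, in eV (text)
HW_LO_EV = 0.092           # LO phonon energy, in eV (text)

M_EFF = 0.2 * M_0          # ASSUMPTION: electron effective mass, not stated in the paper
E_EV = 0.5                 # ASSUMPTION: electron energy in n-GaN, hypothetical value


def thermal_velocity(energy_ev, m_eff=M_EFF):
    """Eq. 2: v_th = sqrt(2 E / m_e), with E converted from eV to joules."""
    return math.sqrt(2.0 * energy_ev * Q_E / m_eff)


def mean_free_path(energy_ev, tau_sc=TAU_SC, m_eff=M_EFF):
    """Eq. 1: l_mfp = v_th * tau_sc."""
    return thermal_velocity(energy_ev, m_eff) * tau_sc


# LED1: electrons keep their energy E.
l_mfp_led1 = mean_free_path(E_EV)
# LED2 (Eq. 3): electrons gain qV and lose one LO phonon energy in the extra QW.
l_mfp_led2 = mean_free_path(E_EV + QV_EV - HW_LO_EV)

print(f"led1 mean free path: {l_mfp_led1 * 1e9:.2f} nm")
print(f"led2 mean free path: {l_mfp_led2 * 1e9:.2f} nm")
```

because qv is smaller than the lo phonon energy, the energy entering eq. 3 is lower than in eq. 2 for any positive starting energy, so the computed mean free path for led2 is shorter regardless of the assumed values.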
as $qV$ is 47 mev and $\hbar\omega_{LO}$ is 92 mev, overall $E + qV - \hbar\omega_{LO} < E$, so that vth_eqw < vth and lmfp_eqw < lmfp. this shows that the extra quantum well before the active region significantly reduces the electron lmfp in the active region, consequently increasing the capture efficiency of electrons in the quantum wells and reducing the possibility of electron leakage, as shown in fig. 5(b).
fig. 4. schematic energy band diagrams for (a) led1 and (b) led2
fig. 5 calculated electric field as a function of position within the n-ingan layer at 900 ma
further, a blue-emitting ingan qw is incorporated between the last qb and the p-gan region, wherein the electrons escaping from the active region can be captured and radiatively recombine with holes, since the hole concentration in this qw is high owing to its proximity to the p-gan region. this radiates blue light and contributes to the white light emission from the led device. the resulting device exhibits highly stable white-light emission characteristics.
5. conclusion
in conclusion, a highly efficient and truly white light-emitting ingan/gan nanowire led with coupled quantum wells is demonstrated. the coupled n-ingan qw incorporated between the n-gan and the active region improves the electron capture efficiency in the multiple quantum wells by reducing lmfp after electrons undergo thermalization by phonon emission in the n-ingan qw. further, the p-ingan qw after the active region captures the leaked electrons and contributes to the white light emission through radiative recombination with holes. the resulting ebl-free nanowire white leds show improved output power and no efficiency droop up to an injection current of 900 ma.
acknowledgments: this work is supported by the us national science foundation under grant number 2013783.
references
[1] r. t. velpula, b. jain, t.r. lenka, h. p. t.
nguyen, "ingan/gan nanowire white color light-emitting diodes without electron blocking layer", in proceedings of the international conference on micro/nano electronics devices, circuits and systems (mndcs-2021), 30-31 january, 2021, india.
[2] e. f. schubert and j. k. kim, "solid-state light sources getting smart", science, vol. 308, no. 5726, pp. 1274-1278, 2005.
[3] s. tan, x. sun, h. v. demir, and s. denbaars, "advances in the led materials and architectures for energy-saving solid-state lighting toward 'lighting revolution'", ieee photonics j., vol. 4, no. 2, pp. 613-619, 2012.
[4] h. p. t. nguyen, k. cui, s. zhang, s. fathololoumi, and z. mi, "full-color ingan/gan dot-in-a-wire light emitting diodes on silicon", nanotechnology, vol. 22, no. 44, pp. 445202-445206, 2011.
[5] w. guo, a. banerjee, p. bhattacharya, and b. s. ooi, "ingan/gan disk-in-nanowire white light emitting diodes on (001) silicon", appl. phys. lett., vol. 98, no. 19, pp. 193102-193104, 2011.
[6] d. t. tuyet, v. t. h. quan, b. bondzior, p. j. dereń, r. t. velpula, h. p. t. nguyen, l. a. tuyen, n. q. hung, and h.-d. nguyen, "deep red fluoride dots-in-nanoparticles for high color quality micro white light-emitting diodes", opt. express, vol. 28, no. 18, pp. 26189-26199, 2020.
[7] j. xie, x. ni, q. fan, r. shimada, ü. özgür, and h. morkoç, "on the efficiency droop in ingan multiple quantum well blue light emitting diodes and its reduction with p-doped quantum well barriers", appl. phys. lett., vol. 93, no. 12, pp. 121107-121109, 2008.
[8] c. g. van de walle and d. segev, "microscopic origins of surface states on nitride surfaces", j. appl. phys., vol. 101, no. 8, pp. 081704-081709, 2007.
[9] h. p. t. nguyen, k. cui, s. zhang, m. djavid, a. korinek, g. a. botton, and z. mi, "controlling electron overflow in phosphor-free ingan/gan nanowire white light-emitting diodes", nano lett., vol. 12, no. 3, pp. 1317-1323, 2012.
[10] s.-h. han, d.-y.
lee, s.-j. lee, c.-y. cho, m.-k. kwon, s. lee, d. noh, d.-j. kim, y. c. kim, and s.-j. park, "effect of electron blocking layer on efficiency droop in ingan/gan multiple quantum well light-emitting diodes", appl. phys. lett., vol. 94, no. 23, pp. 231123-231125, 2009.
[11] n. wang, y. a. yin, b. zhao, and t. mei, "performance analysis of gan-based light-emitting diodes with lattice-matched ingan/alinn/ingan quantum-well barriers", j. disp. technol., vol. 11, no. 12, pp. 1056-1060, 2015.
[12] r. t. velpula, b. jain, h. q. t. bui, f. m. shakiba, j. jude, m. tumuna, h.-d. nguyen, t. r. lenka, and h. p. t. nguyen, "improving carrier transport in algan deep-ultraviolet light-emitting diodes using a strip-in-a-barrier structure", appl. opt., vol. 59, no. 17, pp. 5276-5281, 2020.
[13] k.-t. tsen, r. joshi, d. ferry, a. botchkarev, b. sverdlov, a. salvador, and h. morkoç, "nonequilibrium electron distributions and phonon dynamics in wurtzite gan", appl. phys. lett., vol. 68, no. 21, pp. 2990-2992, 1996.
[14] b. jain, r. t. velpula, h. q. t. bui, h.-d. nguyen, t. r. lenka, t. k. nguyen, and h. p. t. nguyen, "high performance electron blocking layer-free ingan/gan nanowire white-light-emitting diodes", opt. express, vol. 28, no. 1, pp. 665-675, 2020.
[15] r.t. velpula, b. j., t.r. lenka, h.p.t. nguyen, "ingan/gan nanowire white color light-emitting diodes without electron blocking layer", 1st international conference on micro/nanoelectronics devices, circuits and systems (mndcs 2021), assam, india, 2021.
[16] h. q. t. bui, r. t. velpula, b. jain, o. h. aref, h.-d. nguyen, t. r. lenka, and h. p. t. nguyen, "full-color ingan/algan nanowire micro light-emitting diodes grown by molecular beam epitaxy: a promising candidate for next generation micro displays", micromachines, vol. 10, no. 8, pp. 492-500, 2019.
[17] m. rajan philip, d. d. choudhary, m. djavid, m. n. bhuyian, t. h. q. bui, d. misra, a. khreishah, j. piao, h. d. nguyen, and k. q.
le, "fabrication of phosphor-free iii-nitride nanowire light-emitting diodes on metal substrates for flexible photonics", acs omega, vol. 2, no. 9, pp. 5708-5714, 2017.
[18] h. p. t. nguyen, m. djavid, s. y. woo, x. liu, a. t. connie, s. sadaf, q. wang, g. a. botton, i. shih, and z. mi, "engineering the carrier dynamics of ingan nanowire white light-emitting diodes by distributed p-algan electron blocking layers", sci. rep., vol. 5, no. 1, pp. 1-7, 2015.
[19] z.-h. zhang, w. liu, s. t. tan, z. ju, y. ji, z. kyaw, x. zhang, n. hasanov, b. zhu, and s. lu, "on the mechanisms of ingan electron cooler in ingan/gan light-emitting diodes", opt. express, vol. 22, no. 103, pp. a779-a789, 2014.
[20] x. ni, x. li, j. lee, s. liu, v. avrutin, ü. özgür, h. morkoç, and a. matulionis, "hot electron effects on efficiency degradation in ingan light emitting diodes and designs to mitigate them", j. appl. phys., vol. 108, no. 3, pp. 033112-033124, 2010.
[21] x. ni, x. li, j. lee, s. liu, v. avrutin, ü. özgür, h. morkoç, a. matulionis, t. paskova, and g. mulholland, "ingan staircase electron injector for reduction of electron overflow in ingan light emitting diodes", appl. phys. lett., vol. 97, no. 3, pp. 031110-031112, 2010.
facta universitatis series: electronics and energetics vol. 29, no 2, june 2016, pp. 261-268 doi: 10.2298/fuee1602261s
lcr of sc receiver output signal over α-κ-µ multipath fading channels
suad suljović1, dejan milić1, stefan r. panić2
1faculty of electronic engineering, university of niš, serbia
2faculty of natural science and mathematics, university of priština, kosovska mitrovica, serbia
abstract. wireless mobile communication system with selection combining (sc) diversity receiver is investigated in this paper. received signal envelope experiences α-κ-µ short term fading resulting in system performance degradation.
level crossing rate (lcr) and average fade duration (afd) of the sc receiver output signal envelope are obtained as rapidly converging infinite series expressions. numerically evaluated results are presented graphically in order to discuss the effects of the transmission parameters (multipath fading severity, dominant component power and the nonlinearity propagation parameter) on the observed lcr performance of dual sc.
key words: wireless transmission, α-κ-µ fading, selection combining (sc), level crossing rate (lcr), average fade duration (afd)
received april 20, 2015; received in revised form august 13, 2015
corresponding author: suad suljović, faculty of electronic engineering, university of niš, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: suadsara@gmail.com)
1. introduction
short term fading heavily influences and often degrades the transmission quality of a wireless communication system and limits channel capacity. there are a few statistical models that can be used to describe signal envelope variation in a multipath fading channel, depending on the communication scenario and propagation environment. the α-κ-µ distribution has recently been reported in the technical literature to describe small scale signal envelope variation in fading channels [1]. the α-κ-µ fading model can describe small scale signal envelope variations in nonlinear line of sight multipath fading environments with two or more clusters, and is presented as a function of three parameters: 1) parameter κ, often called the rician factor, denoting the ratio of the dominant components power to the power of the scattered components; 2) parameter µ, related to the number of clusters in the propagation environment; and 3) parameter α, related to the non-linearity of the propagation environment. in the α-κ-µ fading model, fading is more severe when the rician factor κ is lower. the α-κ-µ multipath fading is also more severe for lower values of parameter µ, and when parameter µ tends to infinity, the α-κ-µ fading channel approaches a channel without fading effects. the α-κ-µ distribution is a general distribution, and the α-µ, weibull, nakagami-m, rician and rayleigh distributions can be derived from it as special cases. by setting κ=0, the α-κ-µ distribution reduces to the α-µ distribution, and for κ=0 and µ=1, the weibull distribution is obtained. by setting µ=1 and α=2, the α-κ-µ distribution reduces to the rician distribution, while for α=2 and κ=0 it reduces to the nakagami-m distribution, and by setting α=2, κ=0 and µ=1, the rayleigh distribution is derived. there are several space combining techniques (spatial diversity combining) which can be used to mitigate the influence of multipath fading on receiver performance, depending on implementation complexity and quality of service [2,3,4]. maximal ratio combining (mrc) provides the best diversity gain, while sc enables the lowest implementation complexity. in sc diversity, the receiver selects the input branch with the highest signal-to-noise ratio, or the highest envelope level, at the observed time instant. the established second order performance measures of a wireless mobile communication system are the average level crossing rate (lcr) and the average fade duration (afd) [5]. lcr can be calculated as the average value of the first time derivative of the random process, while afd is defined as the average time over which the signal envelope remains below a specified level after crossing that level in a downward direction. the system performs better when the values of the average level crossing rate are lower. a considerable number of research papers consider lcr and afd of wireless systems operating over multipath fading channels. in [6], a macro diversity sc receiver with two micro diversity mrc receivers operating over a gamma-shadowed nakagami-m multipath fading channel is considered.
closed form expressions for lcr and afd are evaluated for the proposed system. lcr and afd of the wireless system in the presence of long term gamma fading and rician short term fading are determined in [7]. in [8], the expressions for the lcr and afd of the sc receiver output signal are presented for rician, rayleigh and nakagami-m multipath fading. in [9], an approach for determining second order statistics over α-κ-µ fading channels was proposed. in this paper, we consider a wireless communication system with an sc diversity receiver operating over an α-κ-µ multipath fading channel. closed form expressions for lcr and afd of the combiner output signal have been efficiently evaluated.
2. system model
the α-κ-µ random process x can be obtained from the κ-µ random process y through the transformation:
\[ y = x^{\alpha/2} \tag{1} \]
where α is a positive parameter. the κ-µ random variable follows the probability density function (pdf):
\[ p_y(y) = \frac{2\mu(\kappa+1)^{\frac{\mu+1}{2}}}{\kappa^{\frac{\mu-1}{2}} e^{\kappa\mu}\,\Omega^{\frac{\mu+1}{2}}}\; y^{\mu}\, e^{-\frac{\mu(\kappa+1)}{\Omega}y^{2}}\, I_{\mu-1}\!\left(2\mu\sqrt{\frac{\kappa(\kappa+1)}{\Omega}}\,y\right), \quad y \ge 0 \tag{2} \]
where κ is the rician factor, µ is the fading parameter, Ω is the average power of y, and $I_n(x)$ represents the modified bessel function of n-th order. using the series expansion of the bessel function, the previous expression can be written in the following form:
\[ p_y(y) = \frac{2\mu(\kappa+1)^{\frac{\mu+1}{2}}}{\kappa^{\frac{\mu-1}{2}} e^{\kappa\mu}\,\Omega^{\frac{\mu+1}{2}}} \sum_{i_1=0}^{\infty} \left(\mu\sqrt{\frac{\kappa(\kappa+1)}{\Omega}}\right)^{2i_1+\mu-1} \frac{1}{i_1!\,\Gamma(i_1+\mu)}\; y^{2i_1+2\mu-1}\, e^{-\frac{\mu(\kappa+1)}{\Omega}y^{2}} \tag{3} \]
the probability density function (pdf) of the α-κ-µ random variable can now be obtained by using the relations:
\[ p_x(x) = \left|\frac{dy}{dx}\right| p_y\!\left(x^{\alpha/2}\right) \tag{4} \]
and:
\[ \frac{dy}{dx} = \frac{\alpha}{2}\,x^{\frac{\alpha}{2}-1} \tag{5} \]
after substituting (5) and (3) in (4), the expression for the pdf of an α-κ-µ random variable becomes, as in [9]:
\[ p_x(x) = \frac{\alpha\mu(\kappa+1)^{\frac{\mu+1}{2}}}{\kappa^{\frac{\mu-1}{2}} e^{\kappa\mu}\,\Omega^{\frac{\mu+1}{2}}} \sum_{i_1=0}^{\infty} \left(\mu\sqrt{\frac{\kappa(\kappa+1)}{\Omega}}\right)^{2i_1+\mu-1} \frac{1}{i_1!\,\Gamma(i_1+\mu)}\; x^{\alpha(i_1+\mu)-1}\, e^{-\frac{\mu(\kappa+1)}{\Omega}x^{\alpha}} \tag{6} \]
now, the cumulative distribution function (cdf) of the α-κ-µ random variable can be determined as:
\[ F_x(x) = \int_0^x p_x(t)\,dt = e^{-\kappa\mu} \sum_{i_1=0}^{\infty} \frac{(\kappa\mu)^{i_1}}{i_1!\,\Gamma(i_1+\mu)}\; \gamma\!\left(i_1+\mu,\; \frac{\mu(\kappa+1)}{\Omega}x^{\alpha}\right) \]
\[ \phantom{F_x(x)} = e^{-\kappa\mu} \sum_{i_1=0}^{\infty} \frac{(\kappa\mu)^{i_1}}{i_1!\,\Gamma(i_1+\mu+1)} \left(\frac{\mu(\kappa+1)}{\Omega}x^{\alpha}\right)^{i_1+\mu} e^{-\frac{\mu(\kappa+1)}{\Omega}x^{\alpha}} \sum_{j_1=0}^{\infty} \frac{1}{(i_1+\mu+1)_{j_1}} \left(\frac{\mu(\kappa+1)}{\Omega}x^{\alpha}\right)^{j_1} \tag{7} \]
where γ(a, x) is the incomplete gamma function, and $(a)_n$ is the pochhammer symbol [10]. the joint probability density function (jpdf) of the κ-µ random variable and its first time derivative is:
\[ p_{y\dot y}(y,\dot y) = p_y(y)\,p_{\dot y}(\dot y) = p_y(y)\,\frac{1}{\sqrt{2\pi\beta}}\,e^{-\frac{\dot y^{2}}{2\beta}} \]
\[ \phantom{p_{y\dot y}(y,\dot y)} = \frac{2\mu(\kappa+1)^{\frac{\mu+1}{2}}}{\kappa^{\frac{\mu-1}{2}} e^{\kappa\mu}\,\Omega^{\frac{\mu+1}{2}}} \sum_{i_1=0}^{\infty} \left(\mu\sqrt{\frac{\kappa(\kappa+1)}{\Omega}}\right)^{2i_1+\mu-1} \frac{y^{2i_1+2\mu-1}}{i_1!\,\Gamma(i_1+\mu)}\, e^{-\frac{\mu(\kappa+1)}{\Omega}y^{2}}\; \frac{1}{\sqrt{2\pi\beta}}\, e^{-\frac{\dot y^{2}}{2\beta}}, \quad \beta = \frac{\pi^{2} f_m^{2}\,\Omega}{\mu(\kappa+1)} \tag{8} \]
where β stands for the variance of the time derivative process and $f_m$ stands for the doppler frequency. the time derivative of the α-κ-µ random process can be determined by using:
\[ x = y^{2/\alpha}, \quad y = x^{\alpha/2}, \quad \dot x = \frac{2}{\alpha}\,y^{\frac{2}{\alpha}-1}\,\dot y, \quad \dot y = \frac{\alpha}{2}\,x^{\frac{\alpha}{2}-1}\,\dot x \tag{9} \]
now, the jpdf of the α-κ-µ random process and its first time derivative is:
\[ p_{x\dot x}(x,\dot x) = |J|\; p_{y\dot y}\!\left(x^{\alpha/2},\; \frac{\alpha}{2}\,x^{\frac{\alpha}{2}-1}\dot x\right) \tag{10} \]
where the jacobian of the transformation can be determined according to:
\[ J = \begin{vmatrix} \dfrac{\partial y}{\partial x} & \dfrac{\partial y}{\partial \dot x} \\[2mm] \dfrac{\partial \dot y}{\partial x} & \dfrac{\partial \dot y}{\partial \dot x} \end{vmatrix} = \begin{vmatrix} \dfrac{\alpha}{2}x^{\frac{\alpha}{2}-1} & 0 \\[2mm] \dfrac{\alpha}{2}\left(\dfrac{\alpha}{2}-1\right)x^{\frac{\alpha}{2}-2}\dot x & \dfrac{\alpha}{2}x^{\frac{\alpha}{2}-1} \end{vmatrix} = \frac{\alpha^{2}}{4}\,x^{\alpha-2} \tag{11} \]
after substituting (8) and (11) in (10), the expression for the jpdf of the α-κ-µ random variable and its first time derivative is:
\[ p_{x\dot x}(x,\dot x) = \frac{\alpha^{2}}{4}\,x^{\alpha-2}\, \frac{2\mu(\kappa+1)^{\frac{\mu+1}{2}}}{\kappa^{\frac{\mu-1}{2}} e^{\kappa\mu}\,\Omega^{\frac{\mu+1}{2}}} \sum_{i_1=0}^{\infty} \left(\mu\sqrt{\frac{\kappa(\kappa+1)}{\Omega}}\right)^{2i_1+\mu-1} \frac{x^{\frac{\alpha}{2}(2i_1+2\mu-1)}}{i_1!\,\Gamma(i_1+\mu)}\, e^{-\frac{\mu(\kappa+1)}{\Omega}x^{\alpha}}\; \frac{1}{\sqrt{2\pi\beta}}\, e^{-\frac{\alpha^{2}x^{\alpha-2}\dot x^{2}}{8\beta}} \tag{12} \]
the lcr of the α-κ-µ random process is equal to the mean value of the first time derivative of the α-κ-µ random process, namely:
\[ N_x = \int_0^{\infty} \dot x\, p_{x\dot x}(x,\dot x)\, d\dot x = \sqrt{\frac{\beta}{2\pi}}\; p_y\!\left(x^{\alpha/2}\right) \]
\[ \phantom{N_x} = \frac{\sqrt{2\pi}\, f_m\, \mu^{\frac{1}{2}} (\kappa+1)^{\frac{\mu}{2}}}{\kappa^{\frac{\mu-1}{2}} e^{\kappa\mu}\,\Omega^{\frac{\mu}{2}}} \sum_{i_1=0}^{\infty} \left(\mu\sqrt{\frac{\kappa(\kappa+1)}{\Omega}}\right)^{2i_1+\mu-1} \frac{x^{\frac{\alpha}{2}(2i_1+2\mu-1)}}{i_1!\,\Gamma(i_1+\mu)}\, e^{-\frac{\mu(\kappa+1)}{\Omega}x^{\alpha}} \tag{13} \]
the expression for the lcr of the α-κ-µ random process can be used to determine the afd of wireless communication systems operating over α-κ-µ multipath fading channels. namely, the afd is equal to the ratio of the cumulative distribution function and the lcr.
3.
performance analysis we further consider a wireless communication system with sc receiver operating over identically distributed independent α-κ-µ multipath fading channels. signal envelopes at inputs of sc receiver are denoted with x1 and x2, while the signal envelope at output of sc receiver is denoted by x. the sc receiver selects the branch with higher signal level, therefore pdf of sc receiver output signal envelope is: 1 2 2 1 1 2 1 1 1 1 1 1 2 2 2 2 2 2 3 4 ( 1)2 4 0 1 1 2 3 4 2 4 2 1 2 2 ( ) ( ) ( ) ( ) ( ) 2 ( ) ( ) ( 1) 2 ( ) ! 4 ( 1) 2 3 5 ( ) !(2 3 ) 4 x x x x x x x i ki i i x ik i i j i i j i j i j k p x p x f x p x f x p x f x k k x e e i i k k x i e i i i                                                               2 1 2 1 2 1 2 1 2 1 21 2 ( 1) 0 0 ( ) 3 2 2 2 ( 1)2 2 2 22 2 220 0 0 2 1 1 2 2 2 ( ) 8 ( 1) 2 3 5 ( ) ! ( ) !(2 3 ) 4 k x i j j i i j ki i j i i j i i x i i jki i j j e k k x e i e i i i i i                                                                       (14) now, cdf of sc receiver output signal envelope is obtained as:   1 2 1 1 1 1 1 11 2 2 3 4 ( 1)2 4 0 0 1 1 1 1 ( ) ( ) ( ) ( ( )) 4 ( 1) 2 3 5 ( ) !(2 3 ) 4 x x x x i j ki i j i j x i j ki j j f x f x f x f x k k x e i e i i i                                               (15) further, jpdf of sc receiver output signal and its first time derivative can be obtained as: 1 1 2 2 2 1 1 1 2 ( ) ( ) ( ) ( ) ( ) 2 ( ) ( ) xx x x x x x x x x x p xx p xx f x p xx f x p xx f x   (16) after substituting (7) and (13) in (17), the expression for lcr can be expressed as: 2 1 1 2 1 1 2 1 2 1 2 1 2 1 2 0 0 4 4 4 2 1 2 2 4 2 1 3 2 2 ( 1) 2 2 2 2 2 4 2 1 2 2 1 1 2 2 2 ( ) ( ) 2 ( ) ( ) 2 ( ) ( ) 16 ( 1) 2 2 ( ) ! 
( ) !(2 3 ) x xx x x x x x i i j i i j i i j k x i i m i i j k n x x p xx dx f x x p xx dx f x n x f k k x e e i i i i i 1 20 0 0 2 3 5 4 i i j j i (17)

where $N_{x_1}(x)$ is given by (13).

the afd of the sc receiver can now be determined as [2, eq. 4.14]:

$T_x(x) = \frac{F_x(x)}{N_x(x)} = \frac{(F_{x_1}(x))^2}{2 F_{x_1}(x) N_{x_1}(x)} = \frac{F_{x_1}(x)}{2 N_{x_1}(x)}$ (18)

with $F_{x_1}(x)$ given by (7) and $N_{x_1}(x)$ given by (13).

fig. 1 lcr for different system parameters (normalized lcr n_x(x)/f_m versus x [db])

4. numerical results

figure 1 shows the normalized lcr of the sc receiver output signal envelope versus the envelope level, for several values of the fading severity parameter µ and the nonlinearity parameter α. first, we consider level crossing for a fixed level x set below the average signal level. in this scenario the signal is expected to stay above level x most of the time, so the lcr is relatively low. as the level x increases and comes closer to the average signal level, the lcr also increases. the lcr values decrease, and in general the system performs better, when the parameter µ increases, which corresponds to reduced fading severity. the lcr values also decrease as the nonlinearity parameter α increases. when the crossing level is above the average signal level, the lcr starts to decrease with increasing level x.
again, this is an expected effect, as signal excursions above the average value quickly become less likely. the parameters α and µ generally have similar effects as in the previous scenario.

normalized afd values are presented for different system parameters in fig. 2. when the crossing threshold x is below the average signal level, the afd is low, and this is generally the regime in which the system normally operates. better performance is expected when the rician κ factor increases, resulting in a lower afd. the rician κ factor increases when the power of the dominant los (line-of-sight) component increases or the power of the scattering components decreases, thus making the fading less severe. performance improvement is expected in less severe environments.

fig. 2 afd for different system parameters (normalized afd t_x(x) f_m versus x [db])

5. conclusion

in this paper, a wireless communication system with a dual selection combining (sc) diversity receiver operating over an α-κ-µ multipath fading channel is considered. the main contribution is the generality of the analysis, since other models can be derived from the α-κ-µ distribution model as special cases. closed form expressions for the lcr and afd of the sc receiver output signal envelope are efficiently evaluated and discussed as functions of the system parameters. in order to point out the influence of propagation nonlinearity, fading severity and the rician κ factor on the observed performance, results are presented graphically.

acknowledgement: the paper is supported in part by the project iii44006 funded by the ministry of education, science and technological development of the republic of serbia.

references

[1] g. fraidenraich and m. d. yacoub, "the α-κ-µ and α-η-µ fading distributions," in proc.
ieee ninth international symposium on spread spectrum techniques and applications, aug. 2006, manaus-amazon, brazil, pp. 16-20.
[2] s. panic, j. anastasov, m. stefanovic, p. spalevic, fading and interference mitigation in wireless communications. crc press, usa, 2013.
[3] m. k. simon, m. s. alouini, digital communication over fading channels. john wiley & sons, new york, 2000.
[4] g. l. stüber, principles of mobile communications. kluwer academic publishers, massachusetts, 1996.
[5] w. c. y. lee, mobile communications engineering. mcgraw-hill, new york, 2003.
[6] d. stefanovic, s. panić, p. spalević, "second order statistics of sc macrodiversity system operating over gamma shadowed nakagami-m fading channels", international journal of electronics and communications (aeu), vol. 65, issue 5, pp. 413-418, may 2011.
[7] d. milic, d. djosic, c. stefanovic, s. panic, m. stefanovic, "second order statistics of the sc receiver over rician fading channels in the presence of multiple nakagami-m interferers", international journal of numerical modelling: electronic networks, devices and fields, accepted for publication.
[8] m. bandjur, n. sekulovic, m. stefanovic, a. golubovic, p. spalevic, d. milic, "second-order statistics of system with microdiversity and macrodiversity reception in gamma-shadowed rician fading channels", etri journal, vol. 35, no. 4, pp. 722-725, august 2013.
[9] a. k. papazafeiropoulos, s. a. kotsopoulos, "second-order statistics for the envelope of α-κ-µ fading channels", ieee communications letters, vol. 14, no. 4, pp. 291-293, april 2010.
[10] i. gradshteyn, i. ryzhik, tables of integrals, series, and products. academic press, new york, 1980.

facta universitatis series: electronics and energetics vol. 29, no. 1, march 2016, pp.
139-149, doi: 10.2298/fuee1601139s

the implementation of signal analysis in java to determine the sound of human voice and its graphical representation in standard music notation

patryk solecki, wojciech zabierowski
lodz university of technology, department of microelectronics and computer science, poland

abstract. the article presents the problems associated with signal processing in human voice analysis. based on a specific implementation of algorithms that determine the pitch of the human voice, the paper shows the result in the form of standard music notation with treble and bass clefs on the stave. particular attention is paid to the performance of the algorithms when implemented in java. basic analysis of human voice signals is not, in general, difficult, but implementing it on mobile devices such as smartphones, with their limited hardware resources, remains a challenge. the limitations of both the cpu and the memory affect the processing speed of the java virtual machine. one should also remember that the quality of the microphone used in this type of mobile device is low. from this point of view we present a new approach to the well-known problem of signal analysis, implemented in computer applications such as raven.

key words: signal processing, java programming, music notation, voice analysis, java

1. introduction

the pitch of the sound generated by a musical instrument was analyzed in ref. [1] using the example of a smartphone application that worked within the phone's limited resources. in addition to the limitations associated with hardware (cpu, memory, bandwidth of the microphone), the selection and use of the appropriate programming language had a significant impact on the application. the problem with the choice of language is that it is not just a matter of habit; very often it is determined by the selected hardware platform.
it seems obvious that c++ type languages are the most effective, especially where the sound analysis is done "online" on data from the microphone. analysis of the sound pitch is one of the easiest issues related to the processing of acoustic signals. therefore, implementing algorithms for the identification of the human voice or the sound of an instrument [1] is a frequently addressed problem.

received march 27, 2015; received in revised form october 21, 2015
corresponding author: wojciech zabierowski, lodz university of technology, department of microelectronics and computer science, poland (e-mail: wojciech.zabierowski@p.lodz.pl)

in the literature there are different approaches to signal analysis. various methods are used for speech recognition and for determining the pitch of sound or speech. one example is the use of adaptive algorithms in human speech segmentation [2]. various approaches to the problem can be found in [3], where different methods of segmentation in automatic speech recognition are shown. another common technique in speech recognition is the discrete wavelet transform, used to identify the pitch and to segment the signal [4]. an important aspect is taking into account the impact of the speakers' individual features and of the signal transmission conditions on automatic speech recognition [5]. another is cepstral analysis, which is needed to extend speech recognition beyond pitch to individual sounds [6].

the purpose of this publication is to show that for some simple problems of sound analysis, in particular pitch determination, it is possible to create an effective implementation in java, using the java virtual machine and the available input/output libraries.
the intention of the authors was not to propose a commercial application capable of creating complete music notation from any song. simple applications of this kind exist, e.g. for the iphone, as well as for desktops, like raven. some of them, especially those for smartphones, were not yet available at the time of this research. the aim of this research was to show that, using fairly simple, publicly available java mechanisms applicable to various systems, including e.g. symbian on mobile phones, it is possible to create this type of application despite the hardware limitations (among others, of the phone's microphone). the simplifications, and the consideration of only certain ranges in the field of signal analysis, are intentional; they aim at a clear presentation of an implementation of human speech-signal analysis on devices with limited hardware resources.

2. specification of the sound processing problem

assuming that the current version of java can deal with the issue of sound analysis, it was decided that the determined pitch, for a better visual effect and clarity, would be presented in the form of traditional musical notation on a stave. from the algorithmic point of view, dft (discrete fourier transform) analysis should be used for the bands of the spectrum of the analyzed signal [7]. the algorithm should recognize the band number of the fundamental tone and transform the calculated frequency into the corresponding note in standard musical notation. the implementation also considered the use of the fft (fast fourier transform) algorithm. its use saves a considerable amount of calculation in comparison to the direct implementation of the dft. the complexity of the model is described first for the case of the fft algorithm. the computational complexity of both algorithms is given by the equations:

$k_{dft} = O(n^2)$ (1)

$k_{fft} = O(n \log_2 n)$ (2)

where n is the number of input samples.
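to make the difference between equations (1) and (2) concrete, the following minimal java sketch tabulates both operation counts for a few power-of-two sample sizes. the class name `TransformCost`, its methods and the chosen sizes are illustrative assumptions, not part of the described application, and constant factors are ignored.

```java
/** Illustrative comparison of the growth rates in equations (1) and (2). */
final class TransformCost {

    /** Rough operation count of a direct DFT over n samples: n^2. */
    static long dftOps(long n) {
        return n * n;
    }

    /** Rough operation count of a radix-2 FFT over n samples: n * log2(n). */
    static long fftOps(long n) {
        // Integer base-2 logarithm; exact when n is a power of two.
        int log2 = 63 - Long.numberOfLeadingZeros(n);
        return n * log2;
    }

    public static void main(String[] args) {
        long[] sizes = {8192, 16384, 32768, 65536}; // hypothetical buffer sizes
        for (long n : sizes) {
            System.out.printf("n=%d  dft=%d  fft=%d  ratio=%.0f%n",
                    n, dftOps(n), fftOps(n), (double) dftOps(n) / fftOps(n));
        }
    }
}
```

the ratio grows with n, which is why a full-spectrum dft quickly becomes impractical, whereas a dft restricted to a narrow group of bands can remain affordable.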
with proper implementation, the memory load is not much larger than the amount of memory occupied by the input data. in the case of real samples, the fft can be further improved by using a modification of the algorithm, the 2n-point real fft, which further reduces the amount of needed resources. unfortunately, apart from its benefits, the applied algorithm also has a drawback: the number of samples increases to 2n, which, in turn, adversely affects the flexibility of the sound analysis. the final results are obtained only at the end of the execution of the algorithm. these advantages and disadvantages of the considered solutions have a significant influence on the implementation, in particular when considering operation of the system in the "online" version.

the basic dft algorithm has two main inherent advantages. in contrast to the fft, the results for the various bands are evenly spaced in time. the second advantage, not to be underestimated, is the flexible number of input samples. this means that by regulating the number of samples used in the algorithm one can control the resolution, denoted fr (equation 3).

$f_r = \frac{f_s}{n}$ (3)

where fs is the adopted sampling frequency. in the case of the fft, a large number of input samples should be used, which affects the application performance. during online analysis, the signal is analyzed constantly. in offline analysis only specific samples are analyzed. in "online" processing (samples are processed in real time and immediately presented by the application in standard music notation), a longer sampling period increases the inertia of the implemented application: there are delays in the graphical presentation of notes. assuming the standard fs = 44100 hz, we chose the following numbers of samples, all powers of two (equation 4).
$n_i = \{8192, 16384, 32768, 65536\}$ (4)

too small a number of samples, as in the case of n = 8192, results in too coarse a resolution. in this case fr = 5.38 hz, on the basis of equation 3. the modified 2n-point fft algorithm that is used requires at least n = 16384 samples, which can significantly affect the data acquisition time for the next step of the calculation.

from the above discussion, the following conclusions can be drawn about the applicability of the specific algorithms to the problem under consideration. the fft is a faster but more complex implementation, which requires more resources. the dft algorithm in its basic version, with a simple implementation, has lower hardware requirements, which in turn gives the programmer more room to adapt the implementation to limited resources. calculating the fft of the significant harmonics requires calculating the whole fft, in other words all the harmonics, which in the given example means processing at least 8820 samples every time (because fs = 44.1 khz). the experimentally used algorithm has the computational complexity stated below:

$O\left(n^2 \cdot \frac{900}{f_s}\right)$ (5)

complexity graphs of the fft, the dft and the experimental dft for the analyzed ranges of 500 hz and 900 hz are shown in figure 1.

fig. 1 graph of the calculation complexity: computing time relative to the number of samples.

determining the fundamental pitch is a useful tool in the analysis of musical sounds. it makes it possible to specify the frequency of the fundamental tone and, on this basis, to determine the name of the sound. it also helps to examine sound traits such as vibrato. the frequency of the fundamental tone lies in the intervals shown in table 1
[5]:

table 1 frequency of the fundamental tones depending on the voice type [10]

voice name       frequency [hz]
bass             80-320
baritone         100-400
tenor            120-480
alto             160-640
mezzo-soprano    200-800
soprano          240-960

in addition, it varies with individual characteristics and with the resonators that participate in creating the sound: laryngeal, sinus, mouth and pectoral (chest) [10]. that is why, among other things, we decided to limit the frequency range to 1 khz for the purpose of this research. this restriction was introduced because it was assumed that sounds would be read and written in musical notation only in this range of frequencies. the human speech spectrum includes frequencies from 100 hz to over 8 khz; the largest spectral density (energy) lies in the vicinity of 500 hz and gradually decreases with increasing frequency, which also supports our limitation of the analyzed interval. the human ear receives signals in a much wider frequency range, although the range depends on the individual. the typical range of signals registered by the human ear covers frequencies from 20 hz to 15 khz (sometimes 20 khz), and the highest sensitivity is from 1 khz to 3 khz [16].

3. solving the problem: the dft analysis

analysis of the human voice poses many problems. the human voice is very complex in terms of the number of parameters describing it [2,3,12,14]. this also results in a very complex set of harmonics visible in the spectrum of the signal. changes in the voice can occur dynamically during the analysis, because of conscious voice modulation by the subject, but also through the impact of external factors that can affect the spectrum of the voice of the tested person [10].
although every person's voice is determined by the individual sound produced by the vocal folds, it can vary considerably as a result of changes in the vocal tract. this means that the voice of each person will differ because of individual characteristics, although it may still have the same pitch or the same character. when using dft analysis based on the transformed expression (equation 6), one must be aware of certain characteristics of such signals, which facilitate further analysis and can prevent errors.

$X(m) = \sum_{n=0}^{N-1} x(n)\,[\cos(2\pi n m / N) - j \sin(2\pi n m / N)]$ (6)

attention should be drawn to the following points:
• the fundamental tone is not always the most powerful component of the sound.
• the quality of the equipment used is of fundamental importance and has an impact on the resolution and on possible disruption of the spectrum at low frequencies.
• if the dft is applied to input data without windowing, the fundamental tone may be disturbed by other bands, through the so-called "leaves" (side lobes).
• for a proper analysis of the signal, and of a human voice in particular, the signal strength must be high enough for the harmonics to be resolved properly.

fig. 2 spectrum of 'e' sound produced by a male voice [1].

fig. 3 spectrum of 'e' sound produced by a female voice [1].

as shown in figures 2 and 3, the frequency range of the human voice, in the sense of the fundamental tone, starts in this particular case around 60-70 hz and ends just over 1000 hz. since the lowest and the highest frequencies are achieved only by a small percentage of the population, the scope for the tests and the described implementation was narrowed, for simplicity, to the range f = 98.00 hz to 783.99 hz. this is due to the need to ensure the appropriate resolution and the frequency values assigned by the equally tempered scale.
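the mapping from a measured frequency to a note of the equally tempered scale can be sketched as follows. this is a minimal illustration, assuming the common reference a4 = 440 hz and midi-style note numbering; the class `EqualTemperament` and its method names are hypothetical and are not taken from the described application.

```java
/** Illustrative mapping between frequencies and equal-tempered notes. */
final class EqualTemperament {

    private static final String[] NAMES =
            {"C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"};

    /** Assumed reference pitch: A4 = 440 Hz, MIDI-style note number 69. */
    static final double A4_HZ = 440.0;

    /** Number of the equal-tempered note nearest to the given frequency. */
    static int nearestNote(double hz) {
        double semitonesFromA4 = 12.0 * Math.log(hz / A4_HZ) / Math.log(2.0);
        return (int) Math.round(69 + semitonesFromA4);
    }

    /** Nominal frequency of a note number in hertz. */
    static double noteFrequency(int note) {
        return A4_HZ * Math.pow(2.0, (note - 69) / 12.0);
    }

    /** Name with octave, e.g. 69 gives "A4". */
    static String noteName(int note) {
        return NAMES[note % 12] + (note / 12 - 1);
    }

    /** Deviation from the nearest note in cents; negative means flat. */
    static double cents(double hz) {
        double nominal = noteFrequency(nearestNote(hz));
        return 1200.0 * Math.log(hz / nominal) / Math.log(2.0);
    }

    public static void main(String[] args) {
        for (double hz : new double[] {98.00, 440.0, 783.99}) {
            System.out.println(hz + " Hz -> " + noteName(nearestNote(hz)));
        }
    }
}
```

under these assumptions the limits of the adopted range, 98.00 hz and 783.99 hz, map to g2 and g5, and the sign of `cents` corresponds to an indication of whether a sung pitch is below or above the nominal frequency.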
the smallest difference between adjacent tones in this range is df = 5.83 hz. this means that a resolution of fr = 5 hz is sufficient for most of the range. unfortunately, the mathematical properties of the fourier transform can introduce errors at low frequencies, so-called "leaks". the adopted resolution (see above) implies a certain behavior of the variants of the algorithm. at this resolution of the dft procedure, the energy of a component whose pitch lies close to, but not exactly on, an observed spectral band is transferred to adjacent spectral strips, which can result in two neighboring strips having similar values. the interpretation of such strips depends on the applied algorithm: one either takes the strip with the larger value as the basis for identifying the sound, or approximates between neighboring bands to locate the maximum and assigns the sound pitch to it. at the resolution fr = 5 hz it is possible to recognize extremely low pitches at a basic level. in this way we also limit the dft resolution, and according to formula (3), 8820 samples may be used for the calculations. with such a resolution it is possible to reduce the number of samples while maintaining a basic, satisfactory sensitivity of pitch identification. the algorithm adopted in the analysis was narrowed to five strips. this limitation was adopted on the basis of signal analysis and the observation that, for the purposes of pitch recognition, it gives such advantages in terms of the load on system resources that the potential inaccuracies in the designation of the pitch can be tolerated.

for the proper operation of the described algorithm, the following assumptions were made:
• the two adjacent bands on both sides must have a smaller value.
• the analyzed band must exceed a value established in advance.
note that these are simplifying assumptions.
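the two conditions above, applied to a dft restricted to the bands of interest, can be sketched in java as follows. this is only an illustration under stated assumptions: the five-strip rule is read as "the two bands on each side are smaller than the center band", and the class `NarrowedDft`, its method names and its parameters are hypothetical rather than the authors' code.

```java
/** Illustrative narrowed DFT with a five-strip peak test. */
final class NarrowedDft {

    /** Magnitude of DFT band m of the real signal x, following equation (6). */
    static double bandMagnitude(double[] x, int m) {
        int n = x.length;
        double re = 0.0;
        double im = 0.0;
        for (int k = 0; k < n; k++) {
            double phase = 2.0 * Math.PI * k * m / n;
            re += x[k] * Math.cos(phase);
            im -= x[k] * Math.sin(phase);
        }
        return Math.hypot(re, im);
    }

    /**
     * Index of the first band, scanning from low to high, that exceeds the
     * threshold and is larger than its two neighbors on each side; -1 if none.
     */
    static int firstPeak(double[] magnitude, double threshold) {
        for (int m = 2; m < magnitude.length - 2; m++) {
            double v = magnitude[m];
            if (v > threshold
                    && magnitude[m - 2] < v && magnitude[m - 1] < v
                    && magnitude[m + 1] < v && magnitude[m + 2] < v) {
                return m;
            }
        }
        return -1;
    }

    /** Detected frequency in Hz within [fMin, fMax], or NaN when no band qualifies. */
    static double detect(double[] x, double fs, double fMin, double fMax, double threshold) {
        double fr = fs / x.length;               // resolution, equation (3)
        int first = (int) Math.floor(fMin / fr); // lowest band computed
        int last = (int) Math.ceil(fMax / fr);   // highest band computed
        double[] magnitude = new double[last - first + 1];
        for (int m = first; m <= last; m++) {
            magnitude[m - first] = bandMagnitude(x, m);
        }
        int peak = firstPeak(magnitude, threshold);
        return peak < 0 ? Double.NaN : (first + peak) * fr;
    }
}
```

because the scan runs from low to high frequencies and stops at the first qualifying band, lower tones are reported sooner, which matches the single-pass behavior of the described algorithm.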
one can, of course, expand the described algorithm, but such changes will affect the computational complexity and, in turn, decrease the processing performance of the solution. the presented algorithm is based on a single pass, which implies that the time needed to find the correct tone varies between tones. the analysis proceeds from low to high frequencies; therefore a lower sound is detected earlier and marked sooner.

4. the application

a multi-threaded application was written, using [8], so that the described algorithms work efficiently. the project was split into groups of classes: one responsible for the analysis of samples, one for presentation, and one for the input/output operations that receive samples. classes of the control group provide communication between the groups and between the threads. the collection and processing of data can be controlled by the user. data are collected directly from the buffered sound card stream and are then normalized.

fig. 4 a simple line-in configuration [11].

java libraries provide a mechanism for reading data directly from the line-in of the operating system's standard audio mixer (figure 4). the analysis described in the previous section is carried out on the prepared data to search for the fundamental tone. the result of this thread is delivered to the thread responsible for presentation in music notation (figure 5). saving the marked tone should be taken into consideration, as well as special characters such as the flat and sharp symbols, which lower and raise the displayed note by one semitone. although the frequency range of the human voice is not large, both the treble and the bass clef should be used for a proper presentation of tones. the application has several features that enable various options for sound analysis.

fig. 5 fragment of the dialog window (simplified): presentation of the sound [11]

to generate the appropriate notes in musical notation, the java swing package was used. in addition, controls were introduced that allow the user to control and modify the algorithm (figure 6). the user can change parameters and select the method of analysis. it is possible to choose how data acquisition and data analysis should be performed. these additional features make it possible to show different aspects of how the application functions.

fig. 6 fragment of the dialog window: presentation of the sound [11].

before beginning the sound pitch analysis, the size of the incoming data buffer (samples) sent to the analyzing functions must be set in the application window. this setting determines the number (n) of samples analyzed in one pass. this number is shown in the lower-left corner of the application's dialog window. the number of the strip currently being identified is shown on the right, which allows a fast real-time check of the correctness of the results. there is also an intonation indicator in this corner. it shows whether the identified pitch is below or above the standard frequency of the identified sound. during the calculation process the type of calculation can be changed at any time by choosing between the fft and the narrowed dft. to ensure that the program identifies the tone correctly, the detection threshold must be calibrated. control over this parameter allows the program to be adjusted to the level and timbre of the analyzed sound. this means scaling the levels of the analyzed parameters in order to avoid false identification of tones. the use of the fft avoids the side lobes described in the dft section. when using the fft, the user can enable thorough peak checking, zero-padding to increase the resolution of the spectrum, and windowing to eliminate leakage (the so-called side lobes).
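the two preprocessing options mentioned above, windowing and filling with zeros, can be sketched as follows. the source does not state which window the desktop application uses, so a hann window is assumed here purely for illustration; the class `SpectrumPrep` and its methods are hypothetical names.

```java
import java.util.Arrays;

/** Illustrative preprocessing before a spectrum calculation. */
final class SpectrumPrep {

    /** Returns a copy of x multiplied by a Hann window (assumed window type). */
    static double[] hann(double[] x) {
        int n = x.length;
        if (n < 2) {
            return Arrays.copyOf(x, n); // too short for a meaningful window
        }
        double[] out = new double[n];
        for (int k = 0; k < n; k++) {
            double w = 0.5 - 0.5 * Math.cos(2.0 * Math.PI * k / (n - 1));
            out[k] = x[k] * w;
        }
        return out;
    }

    /** Smallest power of two that is greater than or equal to n. */
    static int nextPowerOfTwo(int n) {
        int p = 1;
        while (p < n) {
            p <<= 1;
        }
        return p;
    }

    /** Pads x with trailing zeros up to the requested length. */
    static double[] zeroPad(double[] x, int length) {
        // Arrays.copyOf fills the extra elements with 0.0 when length > x.length.
        return Arrays.copyOf(x, Math.max(length, x.length));
    }
}
```

a typical order would be to window the captured buffer first and then pad it, for example `zeroPad(hann(buffer), nextPowerOfTwo(buffer.length))`, so that the zeros do not distort the window shape.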
5. smartphone application

as already mentioned, the real challenge is to write the application for a mobile device such as a smartphone, which is now a very popular device accessible to everyone. having experience in setting up sound for recording guitar tablature and in guitar tuner implementations on a mobile phone [1], we decided to face a more complex challenge. on a smartphone we tested the results previously obtained on relatively powerful machines: a desktop computer and a laptop. to increase the challenge, the application was tested not on the latest smartphone models but on models 2-3 years old. the implementation of the user interface in particular required considerable changes in comparison to the desktop version. the touch screen, in place of a mouse and a keyboard, significantly expands the functionality and usability of the application. however, as with the guitar tablature creator [1], a serious problem was the mobile device's microphone. during the implementation of the application, the following additional assumptions were adopted. first, an fir (finite impulse response) filter was used, additionally serving as an interpolating filter to increase the digital sampling rate. the main operation of data filtering is convolution (equation 7).

$y(n) = \sum_{k=0}^{M} h(k)\,x(n-k)$ (7)

where $M$ is the number of input samples. thanks to this operation one can obtain better resolution of the guitar tuning. the dolph-chebyshev window, also applied to the finite impulse response filter, is very useful and corrects the characteristic of the window. furthermore, characteristics of the filter and the window (such as the gamma parameter) can be set by the user. in addition, the autocorrelation function was introduced.
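the convolution of equation (7) can be written directly as a double loop. the following java sketch is a plain, unoptimized illustration; the class name `FirFilter`, the handling of the first samples (missing history treated as zero) and the example coefficients are assumptions, not details given in the text.

```java
/** Illustrative direct-form FIR filter, following equation (7). */
final class FirFilter {

    /**
     * Convolves input x with coefficients h:
     * y(n) = sum over k of h(k) * x(n - k), with x treated as zero for n - k < 0.
     */
    static double[] filter(double[] h, double[] x) {
        double[] y = new double[x.length];
        for (int n = 0; n < x.length; n++) {
            double acc = 0.0;
            for (int k = 0; k < h.length && k <= n; k++) {
                acc += h[k] * x[n - k];
            }
            y[n] = acc;
        }
        return y;
    }

    public static void main(String[] args) {
        double[] h = {0.5, 0.5};          // hypothetical two-tap averaging filter
        double[] y = filter(h, new double[] {1, 3, 5});
        System.out.println(java.util.Arrays.toString(y));
    }
}
```

the cost is proportional to the product of the filter length and the number of samples, which is one reason a shorter (lower-order) filter matters on a device with limited resources.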
the analysis of the autocorrelation function was carried out as described below:

$r(n) = \sum_{m} x(m)\,x(m+n)$ (8)

where m is the index of a sample from the input range and n is the sample number.

fig. 7 analysis of the autocorrelation function. extraction of the fundamental harmonic [6].

this algorithm provides good resolution, but it does not eliminate noise. problems can also occur if the input signal does not include the fundamental harmonic:
• analysis of the spectral function. the main goal of the spectrum function analysis is to find the peak of the function and establish the current fundamental frequency of the input signal;
• adaptation of the dolph-chebyshev window (fir filter). this type of window is very useful when creating the fir filter;
• approximation of the modulus of a complex vector. a commonly used operation in arithmetic with complex numbers is:

$|V| = \sqrt{I^2 + Q^2}$ (9)

where $I$ is the real part and $Q$ the imaginary part of the complex number. it can be replaced with a simple, low-cost operation that gives a comparable result:

$|V| \approx \alpha\,\mathrm{max} + \beta\,\mathrm{min}$ (10)

where max is the larger and min the smaller of the two parts of the complex value. alpha and beta are parameters chosen from the appropriate table [7].

standard sound sampling for mobile devices amounts to 8 khz. this frequency is typical for voice calls and simple voice recording, but it can cause difficulties when one wants to analyze the digital signal (sometimes a digital increase of the sampling rate is required). for the purposes of this implementation, the sampling frequency of the input signal was increased from 8 khz to 80 khz by a digital interpolation process. the disadvantage is that the additional interpolation process must be executed with the smartphone's resources; on the other hand, we profit from the possibility of using an anti-alias filter of lower order, which needs less processing power.
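the low-cost magnitude estimate of equation (10) can be sketched in java as follows. the text does not say which pair of coefficients was taken from the table in [7]; the values below are one commonly quoted pair that keeps the error within about 4 percent and are used here only as an assumption. the class name `MagnitudeEstimate` is hypothetical.

```java
/** Illustrative alpha-max-plus-beta-min approximation of |I + jQ|. */
final class MagnitudeEstimate {

    // Assumed coefficients; other table entries trade accuracy for simpler arithmetic.
    static final double ALPHA = 0.96043387;
    static final double BETA = 0.39782473;

    /** Approximates sqrt(i*i + q*q) without a square root, following equation (10). */
    static double approx(double i, double q) {
        double a = Math.abs(i);
        double b = Math.abs(q);
        return ALPHA * Math.max(a, b) + BETA * Math.min(a, b);
    }

    /** Exact magnitude, following equation (9), for comparison. */
    static double exact(double i, double q) {
        return Math.sqrt(i * i + q * q);
    }

    public static void main(String[] args) {
        System.out.println(approx(3, 4) + " vs " + exact(3, 4));
    }
}
```

the estimate needs only two comparisons, two multiplications and one addition per band, which is the kind of saving that matters when many bands are evaluated on a phone.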
a further disadvantage is the possible appearance of noise and artifacts, because the interpolation process is never able to reproduce the original signal exactly. the main benefit is that it increases the signal-to-noise ratio. summing up, applying this procedure made it possible to obtain a signal of good quality for further analysis and, finally, satisfactory results.

6. summary

it has been shown that the java programming language and the java virtual machine, despite their limitations, are able to process signals "online" in a satisfactory manner. new versions of the java platform, as well as newer computers, significantly improve the comfort of the programmer, allowing more accurate analysis of increasingly complex computing problems. however, we must remember that today's challenge is not a desktop computer or even a laptop, but a mobile device [1]. therefore, the issue of signal analysis for a variety of platforms, including java, is still valid.

this study may also have a practical aspect. an application of this type can be used for educational purposes, e.g. for learning signal analysis and related issues. it can also serve as an interesting aid in learning to recognize sound pitch, which is a basic exercise for students learning to play an instrument or to sing.

the signal was collected online from a microphone and analyzed according to the presented algorithm. no tests were carried out on external databases; only a few people (musicians) evaluated the application by listening, in terms of the quality of the voice recognition. indeed, one could think of a more systematic way of checking the application, but it is worth noting that, apart from the checks by a few musicians, the application is used during lessons as a teaching aid at a music school. this is a good practical stress test.
a detailed comparison with commercial applications has not been made, because it was not the authors' purpose to compete with commercial desktop applications such as raven [17]. in these tests, the recognition results were satisfactory. the obtained results show that our work may be useful for people learning to play musical instruments, tuning instruments, etc. the program was used with good results as a teaching aid for children learning music at a music school.

references

[1] p. solecki, w. zabierowski, the signal analysis of sound based on the application of guitar tabulatures for mobile devices. przegląd elektrotechniczny, 2012, rocznik 88, nr 10b, p. 239-242.
[2] v. a. petrushin, adaptive algorithms for pitch-synchronous speech signal segmentation, specom'2004: 9th conference speech and computer, st. petersburg, russia, september 20-22, 2004.
[3] a. s. spanias, "speech coding: a tutorial review," proc. ieee, 82, 1541–1575, october 1994.
[4] ch. wendt, athina p. petropulu, pitch determination and speech segmentation using the discrete wavelet transform, electrical and computer engineering department, drexel university, philadelphia pa 19104.
[5] p. mrowka, algorytmy kompensacji warunków transmisyjnych i cech osobniczych mówcy w systemach automatycznego rozpoznawania mowy, politechnika wrocławska, instytut telekomunikacji, teleinformatyki i akustyki, raport nr i28/pre-001/07, phd dissertation, wrocław 2007.
[6] a.p. dobrowolski, e. majda, analiza cepstralna w systemach rozpoznania mówców, no 6/2012, instytut logistyki i magazynowania, 2012.
[7] r. g. lyons, understanding digital signal processing, pearson, 2010.
[8] b. eckel, thinking in java. wydawnictwo "helion", gliwice, 2003.
[9] k. demuynck, t. laureys, a comparison of different approaches to automatic speech segmentation, http://www.esat.kuleuven.ac.be/#spch 2013.
[10] w. p. morozow, isskustwo rezonansnawo pienija, iskusstwo i nauka.
instytut psychologii rosyjskiej akademii nauk, państwowe konserwatorium im. p.i. czajkowskiego w moskwie, moskwa 2002.
[11] m. dybowski, w. zabierowski, 2005. aplikacja rozpoznająca wysokość dźwięków głosu ludzkiego java – w mgnieniu oka, xiii konferencja sis sieci i systemy informatyczne – teoria, projekty, wdrożenia, aplikacje, łódź, p. 421-426, t. 2, piątek trzynastego wydawnictwo 2005, isbn 837415-069-6, 711 s., 2 t., 23,5 cm.
[12] a. gersho, "advances in speech and audio compression", proc. ieee, 82, june 1994.
[13] lawrence r. rabiner, ronald w. schafer, digital processing of speech signals, prentice-hall, inc., englewood cliffs, new jersey 07632, bell laboratories, 1978.
[14] t. robinson, speech analysis, lent term 1998.
[15] w. hess, pitch determination of speech signals. springer-verlag, 1983.
[16] d. gerhard, pitch extraction and fundamental frequency: history and current techniques. technical report tr-cs 2003-06, department of computer science, university of regina, regina, saskatchewan, canada s4s 0a2, november 2003.
[17] http://www.birds.cornell.edu/brp/raven/raventestimonials.html 2013

facta universitatis series: electronics and energetics vol. 35, no 3, september 2022, pp. 301-312 https://doi.org/10.2298/fuee2203301o © 2022 by university of niš, serbia | creative commons license: cc by-nc-nd

original scientific paper

adaptive control of dc motor without identification of parameters

fezazi omar1, hamdaoui habib1, ayad ahmed nour el islam2, ardjoun sid ahmed el mehdi3
1,2 iceps laboratory (intelligent control & electrical power systems), djillali liabes university, sidi bel-abbes, algeria
2 kasdi merbah university ouargla, algeria
3 irecom interaction réseaux-convertisseurs-machines, djillali liabes university, sidi bel-abbes, algeria

abstract. parameter identification is a major problem in industrial environments, where it might be difficult or even impossible in some situations.
moreover, non-measurable and unknown variations of system parameters can affect the performance of conventional proportional-integral (pi) controllers. the concept of developing a controller that does not depend on the system parameters therefore seems very interesting. this paper deals with the experimental implementation of model reference adaptive control of a dc motor without identifying its parameters. adaptive control is considered an online solution for controlling a system without knowing the system parameters, since it adjusts itself automatically to maintain favorable tracking performance. simulation and experimental results are presented to demonstrate the effectiveness of the proposed control method.

key words: uncertainty of the identification, adaptive control, reference model, dc motor

received september 5, 2021; revised february 26, 2022; accepted may 15, 2022
corresponding author: fezazi omar, iceps laboratory, djillali liabes university, sidi bel-abbes, algeria, e-mail: omar.fezazi@univ-sba.dz

1. introduction

due to their simple structure and low cost, dc motors have been widely used in electromechanical systems that require movement [1-18]. dc motors are used in vehicles [2], unmanned underwater vehicles [15-17], industrial tools [3], and robotic manipulators [4]. moreover, they are well suited to many applications requiring accurate operation and high precision [5], due to their inherent decoupling between torque and speed as well as their simple control [20].

nevertheless, the identification of the dc motor parameters is very important to achieve speed control [19-20]. however, the uncertainty of the measuring devices (such as the ammeter and voltmeter) and the improper use of identification techniques [10-11] pose a real problem when calculating a pi controller, which is based on the parameters of the dc motor.
to get rid of these identification problems, many researchers have proposed complex algorithms, such as genetic algorithms, to enhance the identification results [12-14]. moreover, the performance of conventional controllers can be affected by unknown load dynamics, external disturbances, and parameter variations [22-24].

therefore, the main contribution of this paper is to develop a control mode that is not based on the parameters of the system to be controlled, in order to avoid identification problems. moreover, this mode of control saves time by skipping parameter identification after maintenance or after replacing the dc motor in case of machine failure. this mode of control must also be robust against online parametric variations.

model reference adaptive control (mrac) has been utilized to deal with these problems [6]. this mode of control treats the system as a black box in order to achieve or maintain a certain level of performance when the parameters of the process are unknown and/or vary over time. adaptive control can automatically adjust the controllers during implementation (effect: reduction of adjustment time and improvement of performance) and can automatically determine the optimal parameters of the controllers at the various operating points of the process [7].

in this article, model reference adaptive control (mrac) is implemented on a dc motor (sonelecrme_24245), without identifying its parameters, using a dspace 1104 card. first, a theoretical study is made in which a control law is defined to calculate the adaptive controller parameters; then the adaptive control is simulated with matlab simulink to visualize the motor speed and the mrac parameters theta 1, theta 2, and theta 3; finally, the adaptive control is implemented.

2. functioning principle of the model reference adaptive control

adaptive control with a reference model is simple to implement and is still used in practice.
the adaptive control scheme with a reference model was originally proposed by whitaker [8] (1958); the first applications of this technique date back to the early 70s. mrac does not depend on the system parameters; instead, the system is forced to follow the desired reference model [7]. the diagram in fig. 1 represents the functioning principle of adaptive control with a reference model: the difference between the output of the system (dc motor) and the output of the reference model is used by the adaptation mechanism, which also receives other information, to automatically adjust the controller parameters [7-21]. the mit rule and lyapunov theory can both be used to develop the adaptation mechanism; according to the literature, the lyapunov method is more effective than the mit rule [25]. moreover, the configuration of mrac is less complex with the lyapunov rule than with the mit rule, so the physical realization for the system under consideration is comparatively more feasible with the lyapunov theory [26-27].

fig. 1 principle of functioning of model reference adaptive control

model reference adaptive control contains two loops: the internal loop is an ordinary feedback loop constituted by the process and the controller, while the external loop adjusts the controller parameters so that the tracking error e=y-ym is small.

3. adaptive controller design

the problem of stability led several researchers in the 1960s to consider the synthesis of adaptive controllers using stability theory, such as the second method of lyapunov. the lyapunov approach offers global stability properties without any restriction on either the initial conditions of the error or the inputs of the system. the lyapunov method is applied here for the synthesis of the adaptive control with a reference model [9].
in adaptive control, a lyapunov function is determined and an adaptation mechanism is built to ensure that the tracking error e=y-ym converges to zero. the objective is to design a control system that changes its parameters over time so that the tracking error converges to zero; this amounts to choosing a lyapunov function and then choosing a control law that makes the derivative of this function negative.

3.1. state-space representation

system state space:

$$\dot{x} = A_p x + B_p u = \begin{bmatrix} -(a+b) & -ab \\ 1 & 0 \end{bmatrix} x + \begin{bmatrix} 1 \\ 0 \end{bmatrix} u, \qquad y = C_p x = \begin{bmatrix} 1 & 0 \end{bmatrix} x \tag{1}$$

the reference model state space is often defined from a technical specification or from desired responses:

$$\dot{x}_m = A_m x_m + B_m u_c = \begin{bmatrix} -2\xi\omega & -\omega^2 \\ 1 & 0 \end{bmatrix} x_m + \begin{bmatrix} \omega^2 \\ 0 \end{bmatrix} u_c, \qquad y_m = C_m x_m = \begin{bmatrix} 1 & 0 \end{bmatrix} x_m \tag{2}$$

where $\xi$ and $\omega$ are the damping and the pulsation of the reference model.

3.2. adaptive control law

the adaptive control law by state feedback is defined as [7]:

$$u = l_r u_c - L x = \theta_3 u_c - \theta_1 x_1 - \theta_2 x_2, \qquad L = \begin{bmatrix} \theta_1 & \theta_2 \end{bmatrix} \ \text{and} \ l_r = \theta_3 \tag{3}$$

the closed-loop system is:

$$\dot{x} = (A_p - B_p L)\, x + B_p l_r u_c, \qquad \dot{x} = A x + B u_c \tag{4}$$

the matrices $A$, $B$, and $C$ as functions of $\theta$:

$$A = A_p - B_p L = \begin{bmatrix} -(a+b)-\theta_1 & -ab-\theta_2 \\ 1 & 0 \end{bmatrix}, \qquad B = B_p l_r = \begin{bmatrix} \theta_3 \\ 0 \end{bmatrix}, \qquad C = C_p \tag{5}$$

let us introduce the tracking error $e = x - x_m$, so:

$$\dot{e} = A x + B u_c - A_m x_m - B_m u_c = A_m e + (A - A_m)\, x + (B - B_m)\, u_c \tag{6}$$

this error tends to zero if $A_m$ is stable and

$$(A - A_m) = 0 \quad \text{and} \quad (B - B_m) = 0 \tag{7}$$

3.3.
unknown system parameters case

let us define the lyapunov function [7]:

$$V = e^T P e + \mathrm{tr}\left[(A - A_m)^T Q_a (A - A_m)\right] + \mathrm{tr}\left[(B - B_m)^T Q_b (B - B_m)\right] \tag{8}$$

so:

$$\frac{dV}{dt} = \dot{V} = \dot{e}^T P e + e^T P \dot{e} + \mathrm{tr}\left[\dot{A}^T Q_a (A - A_m) + (A - A_m)^T Q_a \dot{A}\right] + \mathrm{tr}\left[\dot{B}^T Q_b (B - B_m) + (B - B_m)^T Q_b \dot{B}\right] \tag{9}$$

from (6) we can write:

$$e^T P \dot{e} = e^T P \left[A_m e + (A - A_m)\, x + (B - B_m)\, u_c\right], \qquad \dot{e}^T P e = \left[A_m e + (A - A_m)\, x + (B - B_m)\, u_c\right]^T P e \tag{10}$$

let us put (10) into (9) and select the terms proportional to $(A - A_m)^T$ and $(B - B_m)^T$:

$$2\,\mathrm{tr}\left[(A - A_m)^T \left(Q_a \dot{A} + P e x^T\right)\right], \qquad 2\,\mathrm{tr}\left[(B - B_m)^T \left(Q_b \dot{B} + P e u_c^T\right)\right] \tag{11}$$

the result is then:

$$\frac{dV}{dt} = \dot{V} = e^T P A_m e + e^T A_m^T P e + 2\,\mathrm{tr}\left[(A - A_m)^T \left(Q_a \dot{A} + P e x^T\right)\right] + 2\,\mathrm{tr}\left[(B - B_m)^T \left(Q_b \dot{B} + P e u_c^T\right)\right] \tag{12}$$

parameters of the adaptation laws:

$$Q_a \dot{A} + P e x^T = 0, \qquad Q_b \dot{B} + P e u_c^T = 0 \tag{13}$$

while

$$\dot{V} = e^T \left(P A_m + A_m^T P\right) e \tag{14}$$

and:

$$\dot{V} = -e^T P e \le 0$$

let us put:

$$Q_a = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \quad \text{and} \quad Q_b = 1$$

then the parameters of the adaptation law result:

$$\begin{aligned} \dot{\theta}_1 &= (p_{11} e_1 + p_{12} e_2)\, x_1 \\ \dot{\theta}_2 &= (p_{11} e_1 + p_{12} e_2)\, x_2 \\ \dot{\theta}_3 &= -(p_{11} e_1 + p_{12} e_2)\, u_c \end{aligned} \tag{15}$$

4. simulation by matlab simulink

the mrac control of a dc machine must first be verified; matlab simulink is used for this purpose.

fig. 2 speed response (matlab simulink)

table 1 quantitative results of speed response (matlab simulink)

                         rise time [s]   peak time [s]   settling time [s]   overshoot [%]
reference model speed    0.671           0.892           1.621               7
dc motor speed           0.671           0.892           1.621               7

fig.
3 theta 1, 2 and 3 responses (matlab simulink)

table 2 quantitative results of theta 1, 2, and 3 response

          adaptation time [s]   peak time [s]   settling value [rad]   peak value [rad]   oscillation after settling time per second [rad/s]
theta 1   1 to 2                1.574           60.06                  115.8              1.62×10^-5
theta 2   1 to 2                1.293           73.47                  -29.26             1.72×10^-5
theta 3   1 to 2                1.302           75.82                  194.5              1.33×10^-1

fig. 2 and table 1 show that the motor follows the imposed reference speed perfectly because it is forced to follow the reference model. the system responds very quickly, with a response time of less than 1.621 s and an overshoot of less than 7%, with no steady-state error. the adaptation parameters of the state feedback control (theta 1, theta 2 and theta 3) are given in fig. 3, and in more detail in table 2. these parameters converge towards their nominal values. it can be noticed that the adaptation parameters change in the transient state from t=1 s to t=2 s and then stabilize at almost fixed values in the steady state. the value of theta 1 increases to 115.8 rad at 1.574 s and then drops to its nominal value of 60.06 rad; similarly, theta 3 reaches 194.5 rad at 1.302 s and then decreases to a stabilization value of 75.82 rad, whereas theta 2 decreases to -29.26 rad at 1.293 s and then increases to a steady value of 73.47 rad. a small oscillation remains after theta 1, 2, and 3 reach their steady values.

5. adaptive control implementation

to implement the adaptive control, matlab simulink (to build the mrac control) is combined with controldesk (to control the dspace 1104). the dspace 1104 board generates a pwm signal to control the four-quadrant chopper that drives the dc motor. the feedback is provided by the incremental encoder (gi355). fig. 4 represents the implementation diagram of the adaptive control on the dc motor:

fig. 4 implementation adaptive control diagram on dcm

5.1.
four quadrant chopper power circuit

the dc motor is excited through its stator at a fixed voltage of 125 v and supplied through its rotor with a variable voltage via the four-quadrant chopper. isolation is assured by the tech isolator, which also prevents t1 and t2, or t3 and t4, from being closed at the same time and thus protects the circuit against short circuits. fig. 5 illustrates the four-quadrant chopper dc motor drive used:

fig. 5 four quadrant chopper drive dc motor

to deduce the duty cycle $\alpha$ that controls the switches t1, t2, t3 and t4, it is necessary to calculate vab:

$$V_a = \alpha V_{dc} \quad \text{and} \quad V_b = (1 - \alpha)\, V_{dc} \tag{16}$$

as a result:

$$V_{ab} = (2\alpha - 1)\, V_{dc} \quad \text{so} \quad \alpha = \frac{1}{2}\left(1 + \frac{V_{ab}}{V_{dc}}\right) \tag{17}$$

6. implementation result

fig. 6 represents the actual motor speed response (rad/s) obtained with the dspace 1104 card (matlab from controldesk). fig. 6a shows that the motor follows the speed reference because it is forced to follow the reference model. at the beginning we drive the motor at a speed of 160 rad/s, then we brake the motor, then the motor turns in the opposite direction at a speed of -100 rad/s, and finally we return to the nominal speed in the positive direction of rotation. adaptive control changes the parameters of the adaptation mechanism each time the speed of rotation changes, without knowing the parameters of the machine. fig. 6b, a close-up view of the speed response, and table 3 show that the response time in real time is less than 1.272 s and the overshoot is less than 12.17%, with no steady-state error, while the response time of the reference model is less than 1.621 s and its overshoot is less than 7%. fig. 6c, a closer view of the response, shows a small measurement disturbance.
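the control law (3), the adaptation laws (15) and the duty-cycle relation (17) can be tried out numerically. the sketch below is a simplified python illustration, not the authors' simulink model: the plant poles a and b, the reference model values xi and w, the adaptation gain (the paper's choice of qa and qb corresponds to a gain of 1), the euler step and the choice of p as the solution of am^t p + p am = -i are all assumptions made for the example. the ideal gains appear only in the evaluation of the lyapunov function; the controller never uses them.

```python
def lyapunov_p(xi, w):
    """Closed-form solution of Am^T P + P Am = -I for
    Am = [[-2*xi*w, -w**2], [1, 0]]; returns (p11, p12, p22)."""
    c1, c0 = 2.0 * xi * w, w * w
    p12 = 1.0 / (2.0 * c0)
    p11 = (1.0 + c0) / (2.0 * c0 * c1)
    p22 = c1 * p12 + c0 * p11
    return p11, p12, p22


def duty_cycle(v_ab, v_dc):
    """Equation (17): duty cycle for a requested chopper output voltage."""
    return 0.5 * (1.0 + v_ab / v_dc)


def simulate_mrac(a=1.0, b=2.0, xi=0.7, w=2.0, gain=10.0,
                  u_c=1.0, t_end=20.0, dt=1e-3):
    """Euler simulation of the state-feedback MRAC for a step reference.

    Returns the Lyapunov function value at the start and at the end.
    """
    p11, p12, p22 = lyapunov_p(xi, w)
    # Gains that would make the closed loop equal to the reference model.
    ideal = (2.0 * xi * w - (a + b), w * w - a * b, w * w)
    x1 = x2 = xm1 = xm2 = 0.0
    th = [0.0, 0.0, 0.0]            # theta 1, theta 2, theta 3

    def lyapunov(e1, e2):
        quad = p11 * e1 * e1 + 2.0 * p12 * e1 * e2 + p22 * e2 * e2
        return quad + sum((t - s) ** 2 for t, s in zip(th, ideal)) / gain

    v_start = lyapunov(0.0, 0.0)
    for _ in range(int(t_end / dt)):
        e1, e2 = x1 - xm1, x2 - xm2
        u = th[2] * u_c - th[0] * x1 - th[1] * x2        # control law (3)
        s = p11 * e1 + p12 * e2                          # first row of P e
        dth = (gain * s * x1, gain * s * x2, -gain * s * u_c)   # laws (15)
        dx1 = -(a + b) * x1 - a * b * x2 + u             # plant (1)
        dx2 = x1
        dxm1 = -2.0 * xi * w * xm1 - w * w * xm2 + w * w * u_c  # model (2)
        dxm2 = xm1
        x1, x2 = x1 + dt * dx1, x2 + dt * dx2
        xm1, xm2 = xm1 + dt * dxm1, xm2 + dt * dxm2
        th = [t + dt * d for t, d in zip(th, dth)]
    return v_start, lyapunov(x1 - xm1, x2 - xm2)


v_start, v_end = simulate_mrac()
```

in this sketch the lyapunov function decreases over the run, which is the property the design of section 3.3 relies on. in a real drive the computed voltage command would then be converted into a pwm duty cycle with duty_cycle before being sent to the chopper.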
the comparison of the simulation and implementation results with other studies demonstrates that mrac control with the lyapunov method is more robust than mrac control with the mit rule: mrac control with the mit rule is very sensitive to changes in the amplitude of the reference signal and may become oscillatory for larger values of the reference input [28]. in contrast, the simulation and implementation results of our study demonstrate that mrac control with the lyapunov method remains robust even for larger values of the reference input (100, 160 rad/s). moreover, beyond a certain limit the performance of the system becomes very poor for mrac control with the mit rule and stability cannot be determined [29], whereas our study shows that mrac control with the lyapunov method is very stable when the reference input changes.

fig. 6 motor speed response (rad/s) (matlab from controldesk), panels a, b and c

table 3 experimental results of speed values

                         rise time [s]   peak time [s]   settling time [s]   overshoot [%]
dc motor speed           0.544           0.743           1.272               12.17
reference model speed    0.671           0.892           1.621               7

7. conclusion

in this study, adaptive control with a reference model has been validated via computer simulations and experimentally implemented in real time using the dspace 1104 controller board. model reference adaptive control shows that an electrical machine, in particular a dc motor, can be controlled without the need for parameter identification, which is a major problem in industrial environments where identification can be difficult or impossible. the results have shown good performance and an excellent dc motor speed response, and the implementation results showed an overshoot of 12.17% and no steady-state error. adaptive control changes the control parameters in real time to ensure reference speed tracking by following the reference model.
the next step is to test this control mode on several similar electric machines in industry without performing the parameter identification, which takes a lot of time and creates many uncertainty problems, or after maintenance of the machine without redoing the identification, and then to implement this mode of control on the most commonly used machine (the asynchronous squirrel-cage machine).

8. materials used list

the material used is available at the iceps (intelligent control and electrical power systems) laboratory http://www.univ-sba.dz/fge/index.php/fr/recherches
• dspace 1104 (code 382508)
• dc motor hampden dm-100a sonelecrme_24245, serial no 10545 (125 volt, 3.5 amp, 1800 rpm)
• incremental encoder gi355 (1024 pulses)
• measurement card for dspace utech analog/digital conversion interface for dspace
• two-level inverter semikron ref semiteach igbt skm50gb used as a chopper

9. nomenclature

$x$, $y$: system state and output vector
$u$: system input (or control) vector
$A_p$: system state (or system) matrix
$B_p$: system input matrix
$C_p$: system output matrix
$x_m$, $y_m$: reference model state and output vector
$u_c$: reference model input (or control) vector
$A_m$: reference model state (or system) matrix
$B_m$: reference model input matrix
$C_m$: reference model output matrix
$a$, $b$: system poles
$\xi$, $\omega$: damping and pulsation of the reference model
$\theta_1$, $\theta_2$, $\theta_3$: adaptive control parameters
$e$: tracking error
$V$: lyapunov function
$P$, $Q_a$: 2×2 matrix
$Q_b$: 1×1 matrix
tr: trace of a matrix
$V_a$, $V_b$, $V_{dc}$, $V_{ab}$: a, b, dc, ab voltage

references

[1] s. j. chapman, electric machinery fundamentals. fourth edition, 2005, bae systems australia.
[2] z. bitar, a. sandouk and s. jabi, "testing the performances of dc series motor used in the electric car", energy procedia, vol. 74, pp. 148–159, 2015.
[3] r. alejandro, v. miguel gabriel and a.
mario, "an adaptive control study for the dc motor using metaheuristic algorithms", soft comput., vol. 23, pp. 889–906, 2017.
[4] y. li, s. tong, and t. li, "adaptive fuzzy output feedback control for a single-link flexible robot manipulator driven dc motor via backstepping", nonlinear anal. real world appl., vol. 14, no. 1, pp. 483–494, 2013.
[5] s.-f. yang and j.-h. chou, "a mechatronic positioning system actuated using a micro dc-motor-driven propeller-thruster", mechatronics, vol. 19, no. 6, pp. 912–926, 2009.
[6] s. zaky, "adaptive and robust speed control of interior permanent magnet synchronous motor drives", electr. eng., vol. 94, pp. 49–58, 2012.
[7] i. d. landau and l. dugard, commande adaptative: aspects pratiques et théoriques (adaptive control: practical and theoretical aspects), masson, 1986.
[8] h. p. whitaker and a. kezer, design of model-reference adaptive control systems for aircraft, massachusetts institute of technology, instrumentation laboratory, jackson & moreland, 1958.
[9] s. sastry, "lyapunov stability theory", in nonlinear systems, springer, 1999.
[10] f. a. aliev, n. s. hajieva and n. a. safarova, "the identification problem for determining the parameters of a discrete dynamic system", journal of non-destructive testing and evaluation. international applied mechanics, vol. 55, no. 1, pp. 128–135, 2019.
[11] t. kara and i. eker, "nonlinear closed-loop direct identification of a dc motor with load for low speed two-directional operation", electr. eng., vol. 86, pp. 87–96, 2004.
[12] q. zhu, x. yuan and h. wang, "an improved chaos optimization algorithm-based parameter identification of synchronous generator", electr. eng., vol. 94, pp. 147–153, 2012.
[13] v. rashtchi, e. rahimpour and e. m. rezapour, "using a genetic algorithm for parameter identification of transformer r-l-c-m model", electr. eng., vol. 88, pp. 417–422, 2006.
[14] e. rahimpour, v. rashtchi and m.
pesaran, "parameter identification of deep-bar induction motors using genetic algorithm", electr. eng., vol. 89, pp. 547–552, 2007.
[15] t. sands, "control of dc motors to guide unmanned underwater vehicles", appl. sci., vol. 11, no. 5, p. 2144, 2021.
[16] r. shah and t. sands, "comparing methods of dc motor control for uuvs", appl. sci., vol. 11, no. 11, p. 4972, 2021.
[17] t. sands, "development of deterministic artificial intelligence for unmanned underwater vehicles (uuv)", j. mar. sci. eng., vol. 8, no. 8, p. 578, 2020.
[18] r. gerov and z. jovanović, "the usage of lambert w function for identification and speed control of a dc motor", fu elec. energ., vol. 32, no. 4, pp. 581–600, 2019.
[19] m. bozic, s. antic, v. vujicic, m. bjekic and g. đorđević, "electronic gearing of two dc motor shafts for wheg type mobile robot", fu elec. energ., vol. 31, no. 1, pp. 75–87, 2018.
[20] o. fezazi, a. haddj el mrabet, i. belkraouane and y. djeriri, "sliding mode control for a dc motor system with dead-zone", journal européen des systèmes automatisés, vol. 54, no. 6, pp. 897–902, 2021.
[21] m. stanković, m. naumović, s. manojlović and s. mitrović, "fuzzy model reference adaptive control of velocity servo system", fu elec. energ., vol. 27, no. 4, pp. 601–611, 2014.
[22] p. tomei and c. m. verrelli, "observer-based speed tracking control for sensorless permanent magnet synchronous motors with unknown load torque", in proceedings of the xix international conference on electrical machines – icem, 2010, pp. 1–6.
[23] k. in hyuk, i. s. yang, h. k. sang and l. seungchul, "robust position control of dc motor using a low-order disturbance observer against biased harmonic disturbances", in proceedings of the ieee international conference on industrial electronics for sustainable energy systems (ieses), 2018, pp. 484–489.
[24] s. miao and l.
min, "online identification technology based on variation mechanism of traction motor parameters", in proceedings of the 4th international conference on advanced electronic materials, computers and software engineering (aemcse), 2021, pp. 77–82.
[25] c.-c. hang and p. parks, "comparative studies of model reference adaptive control systems", ieee trans. automat. control, vol. 18, no. 5, pp. 419–428, 1973.
[26] p. swarnkar, j. s. kumar and r.k. nema, "comparative analysis of mit rule and lyapunov rule in model reference adaptive control scheme", innovative systems design and engineering, vol. 2, no. 4, pp. 154–162, 2011.
[27] n. tariba, a. bouknadel and a. haddou, "comparative study of adaptive controller using mit rules and lyapunov method for mppt standalone pv systems", in the proceedings of the aip conference 1801, p. 040008, 2017.
[28] m. swathi and p. ramesh, "modeling and analysis of model reference adaptive control by using mit and modified mit rule for speed control of dc motor", in proceedings of the ieee 7th international advance computing conference, 2017, pp. 482–486.
[29] s. mallick and u. mondal, "performance study of different model reference adaptive control techniques applied to a dc motor for speed control", in proceedings of the third international conference on trends in electronics and informatics (icoei), 2019, pp. 770–774.
https://ieeexplore.ieee.org/author/37300980100 https://ieeexplore.ieee.org/author/37300979100 https://ieeexplore.ieee.org/xpl/conhome/5600363/proceeding https://ieeexplore.ieee.org/xpl/conhome/5600363/proceeding https://ieeexplore.ieee.org/author/37276365500 https://ieeexplore.ieee.org/author/37086373126 https://ieeexplore.ieee.org/xpl/conhome/8341447/proceeding https://ieeexplore.ieee.org/xpl/conhome/8341447/proceeding https://ieeexplore.ieee.org/author/37088943101 https://ieeexplore.ieee.org/author/37088944918 https://ieeexplore.ieee.org/xpl/conhome/9512932/proceeding https://ieeexplore.ieee.org/xpl/conhome/9512932/proceeding https://ieeexplore.ieee.org/author/37087150315 https://ieeexplore.ieee.org/xpl/recentissue.jsp?punumber=9 https://ieeexplore.ieee.org/xpl/recentissue.jsp?punumber=9 641.indd facta universitatis series: electronics and energetics vol. 28, no 1, march 2015, pp. 77 84 doi: 10.2298/fuee1501077m facta universitatis series: electronics and energetics vol. 28, no 1, march 2015, pp. 101 125 the use of fractional calculus for the optimal placement of electronic components on a linear array gilbert de mey1, mariusz felczak2 and bogus�law wiȩcek2 1university of ghent, sint pietersnieuwstraat 41, 9000 ghent, belgium 2insitute of electronics, technical university of �lódź, ul. wolczańska 211-215, 90-924 �lódź, poland abstract: cooling of heat dissipating components has become an important topic in the last decades. sometimes a simple solution is possible, such as placing the critical component closer to the fan outlet. on the other hand this component will heat the air which has to cool the other components further away from the fan outlet. if a substrate bearing a one dimensional array of heat dissipating components, is cooled by forced convection only, an integral equation relating temperature and power is obtained. the forced convection will be modelled by a simple analytical wake function. 
it will be demonstrated that the integral equation can be solved analytically using fractional calculus. keywords: heat transfer, placement, convective cooling, thermal wake, integral equation, fractional calculus. 1 introduction heat transfer in electronics and microelectronics has become an important topic. the reason is quite simple: the heat dissipation in electronic components is increasing. integrated circuits dissipating 100 watts are no longer manuscript received november 24, 2014 corresponding author: gilbert de mey department of electronics, university of ghent, sint pietersnieuwstraat 41, 9000 ghent, belgium (e-mail: demey@elis.ugent.be) 1 received november 24, 2014 corresponding author: gilbert de mey department of electronics, university of ghent, sint pietersnieuwstraat 41, 9000 ghent, belgium (e-mail: demey@elis.ugent.be)) facta universitatis series: electronics and energetics vol. 28, no 1, march 2015, pp. 101 125 the use of fractional calculus for the optimal placement of electronic components on a linear array gilbert de mey1, mariusz felczak2 and bogus�law wiȩcek2 1university of ghent, sint pietersnieuwstraat 41, 9000 ghent, belgium 2insitute of electronics, technical university of �lódź, ul. wolczańska 211-215, 90-924 �lódź, poland abstract: cooling of heat dissipating components has become an important topic in the last decades. sometimes a simple solution is possible, such as placing the critical component closer to the fan outlet. on the other hand this component will heat the air which has to cool the other components further away from the fan outlet. if a substrate bearing a one dimensional array of heat dissipating components, is cooled by forced convection only, an integral equation relating temperature and power is obtained. the forced convection will be modelled by a simple analytical wake function. it will be demonstrated that the integral equation can be solved analytically using fractional calculus. 
keywords: heat transfer, placement, convective cooling, thermal wake, integral equation, fractional calculus. 1 introduction heat transfer in electronics and microelectronics has become an important topic. the reason is quite simple: the heat dissipation in electronic components is increasing. integrated circuits dissipating 100 watts are no longer manuscript received november 24, 2014 corresponding author: gilbert de mey department of electronics, university of ghent, sint pietersnieuwstraat 41, 9000 ghent, belgium (e-mail: demey@elis.ugent.be) 1 facta universitatis series: electronics and energetics vol. 28, no 1, march 2015, pp. 101 125 the use of fractional calculus for the optimal placement of electronic components on a linear array gilbert de mey1, mariusz felczak2 and bogus�law wiȩcek2 1university of ghent, sint pietersnieuwstraat 41, 9000 ghent, belgium 2insitute of electronics, technical university of �lódź, ul. wolczańska 211-215, 90-924 �lódź, poland abstract: cooling of heat dissipating components has become an important topic in the last decades. sometimes a simple solution is possible, such as placing the critical component closer to the fan outlet. on the other hand this component will heat the air which has to cool the other components further away from the fan outlet. if a substrate bearing a one dimensional array of heat dissipating components, is cooled by forced convection only, an integral equation relating temperature and power is obtained. the forced convection will be modelled by a simple analytical wake function. it will be demonstrated that the integral equation can be solved analytically using fractional calculus. keywords: heat transfer, placement, convective cooling, thermal wake, integral equation, fractional calculus. 1 introduction heat transfer in electronics and microelectronics has become an important topic. the reason is quite simple: the heat dissipation in electronic components is increasing. 
Integrated circuits dissipating 100 watts are no longer exceptional. If you buy a Pentium processor, you will receive the processor already mounted on a printed circuit board with the cooling fin and the fan. Otherwise the company cannot guarantee that the device will work at all. Some textbooks on heat transfer even include a chapter on "electronics cooling" [1].

The most obvious way to cool electronic components is to mount them on a cooling fin which is cooled either by natural convection or, if a fan is blowing, by forced convection. Normally, an electronic design is made first, and only once it has been finished are the cooling problems solved. For some years now, designers have been convinced that one should take the cooling problem into account from the beginning, i.e. during the electronic design phase.

Let us give a simple example: you are designing a printed circuit board (PCB) and one component on this board is dissipating a lot of heat. The cooling fan is blowing from the left. Do you put this component on the left, on the right, or somewhere in the middle? If you put this component on the left, the cooling will be quite efficient due to the close presence of the fan. But further on, the air behind this component, the so-called wake, will be warmed up. So the other components will be warmed up by the warm air blowing over them. This can give rise to malfunctioning of the circuit if temperature-sensitive components are involved. Alternatively, you may decide to put the heat dissipating component on the right side of the printed circuit board.
But the cooling air coming from the fan first has to cross the printed circuit board before reaching the heat source. This gives rise to friction and hence a reduction of the air speed in the surroundings of the PCB. So the component will not be cooled so effectively. If you put the heat source in the middle, you have a mix of the problems just mentioned. There is no simple answer to that simple question. The only solution is to make an electrothermal design from the very beginning. A network simulator like SPICE should calculate not only voltages and currents but also temperatures and power dissipations.

In this paper we will deal with a simple problem depicted in Fig. 1. It shows a printed circuit board with six integrated circuits in a linear array.

Fig. 1: Linear layout of integrated circuits (temperatures $T_1$ to $T_6$, dissipated powers $P_1$ to $P_6$).

The circuits dissipate powers $P_1$, $P_2$, ..., giving rise to a temperature distribution $T_1$, $T_2$, ... The fan is at the left side and provides a uniform flow over the circuit. If only the first circuit dissipates heat, not only will $T_1$ rise, but the downstream airflow (thermal wake) will also be heated, so that the other components warm up even when $P_2 = \dots = P_6 = 0$.
When only the rightmost component dissipates power ($P_6 \neq 0$), the five other components will not be warmed up, as they are in an upstream position. By using a mathematical approximation for the wake function, i.e. the temperature rise caused by one heat dissipating component in all the other components located downstream, an integral equation will be set up for the temperature distribution. This equation will be solved by fractional calculus, as will be outlined further on in this paper.

2 A Short Introduction to Fractional Calculus

Fractional calculus is not so well known at the moment. Therefore a very short introduction will be given here. What is a semiderivative of a function? In simple words, it is a mathematical operator which, when applied twice, yields the well-known classical derivative. In recent decades, it has been found that several physical phenomena can be described by differential equations involving fractional derivatives [2]. Thermal diffusion problems can also give rise to equations using fractional derivatives [3]. The most commonly used is the so-called semiderivative in the time domain, defined by:

$$ \left(\frac{d}{dt}\right)^{1/2} f(t) = {}_0D_t^{1/2} f(t) = \frac{1}{\sqrt{\pi}}\,\frac{d}{dt}\int_0^t \frac{f(t')\,dt'}{\sqrt{t-t'}} \tag{1} $$

Applying the semiderivative (1) twice is nothing else than the classical derivative d/dt. The most straightforward way to interpret (1) is to transform it into the Laplace domain. One gets:

$$ \mathcal{L}\left[{}_0D_t^{1/2} f(t)\right] = \sqrt{s}\,F(s) \tag{2} $$

where s is the Laplace variable and $F(s) = \mathcal{L}[f(t)]$. A semiderivative is just a multiplication by $\sqrt{s}$. Hence, two consecutive semiderivatives are represented by a multiplication by $\sqrt{s}\sqrt{s} = s$, which corresponds to a time derivative in the Laplace domain. A fractional derivative of order α (0 < α < 1) corresponds to a multiplication by $s^{\alpha}$ in the Laplace domain.
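As a quick illustration, which is not part of the original derivation, definition (1) can be applied to the test function f(t) = t. The choice of test function is ours; the integrals are standard beta-function integrals.

$$ {}_0D_t^{1/2}\,t = \frac{1}{\sqrt{\pi}}\,\frac{d}{dt}\int_0^t \frac{t'\,dt'}{\sqrt{t-t'}} = \frac{1}{\sqrt{\pi}}\,\frac{d}{dt}\left(\frac{4}{3}\,t^{3/2}\right) = 2\sqrt{\frac{t}{\pi}} $$

$$ {}_0D_t^{1/2}\left(2\sqrt{\frac{t}{\pi}}\right) = \frac{2}{\pi}\,\frac{d}{dt}\int_0^t \sqrt{\frac{t'}{t-t'}}\,dt' = \frac{2}{\pi}\,\frac{d}{dt}\left(\frac{\pi t}{2}\right) = 1 = \frac{d}{dt}\,t $$

The same result follows from (2): $\mathcal{L}[t] = 1/s^2$, so the semiderivative has the transform $s^{-3/2}$, whose inverse is $2\sqrt{t/\pi}$.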
The second notation used in (1) is preferred because the subscript "0" in front of the operator D indicates that the integration should start from t = 0, which is common for the analysis of time-dependent problems. Generally, a fractional derivative of order α is then defined by:

$$ \left(\frac{d}{dt}\right)^{\alpha} f(t) = {}_0D_t^{\alpha} f(t) = \frac{1}{\Gamma(1-\alpha)}\,\frac{d}{dt}\int_0^t \frac{f(t')\,dt'}{(t-t')^{\alpha}} \tag{3} $$

where Γ is the Euler gamma function. In the Laplace domain this corresponds to a multiplication by $s^{\alpha}$. Integrating (3) with respect to time gives:

$$ \left(\frac{d}{dt}\right)^{-1+\alpha} f(t) = {}_0D_t^{-1+\alpha} f(t) = \frac{1}{\Gamma(1-\alpha)}\int_0^t \frac{f(t')\,dt'}{(t-t')^{\alpha}} \tag{4} $$

For α < 1, (4) can be considered as a fractional integration of order 1 − α. In the Laplace domain this is equivalent to a multiplication by $1/s^{1-\alpha}$.

3 Integral Equation for the Thermal Wake Problem

In present-day electronic and microelectronic components, the power density is such that the temperatures can attain quite high values, seriously affecting the overall circuit reliability [4, 5]. Designing a circuit is no longer possible if the heat removal from the chip to the ambient is not taken into account. Not only the heat transfer by conduction from the semiconductor chip to the package is modelled, but also the convective heat transfer. The latter is done by solving the Navier-Stokes equations in order to model the flow around the packages and cooling fins [6–8].

On electronic substrates, components are usually placed in regular arrays. As a consequence, if the substrate is cooled by forced convection caused by a fan, a component located at x′ will heat the air used to cool all the remaining components located downstream, x > x′ (Fig. 2). If the components have different power dissipations, interchanging components can give rise to a more uniform temperature distribution without excessive hot spots. It should be mentioned here that a high operating temperature of just one single component will reduce the reliability of the whole circuit.
Such a problem can be attacked by computational fluid dynamics (CFD) modelling. This requires the numerical solution of the Navier-Stokes equations, which is a difficult task from a numerical point of view. Any time some components interchange their positions, a new CFD simulation has to be carried out. This has to be repeated until the optimum layout has been obtained. It is quite obvious that this method requires a huge amount of computing time, so that it is no longer of practical use during the design phase of a circuit.

Fig. 2: Temperature profile downstream of a heating component (thermal wake function).

In this contribution the thermal wake function approach will be presented. It gives rise to a one-dimensional integral equation which can be solved with a minimum of computational effort. Finding the optimal position of the individual components can then be performed quickly during the design phase of a circuit.

By definition, the thermal wake function g is the temperature distribution of the downstream components caused by a single component having a unit heat dissipation. None of the downstream components should produce any heat in that case. Several authors have studied the properties of the thermal wake function [9,10].
From experimental data and the authors' own measurements [10], it was found that the thermal wake function g can be very well approximated by (Fig. 2):

$$ g(x-x') = \frac{1}{(x-x')^{p}} \ \text{ if } x > x', \qquad g = 0 \ \text{ if } x < x' \tag{5} $$

If the heat production of a linear array of components can be described by a continuous function q(x), one gets the following equation for the temperature distribution:

$$ T(x) = \int_0^x q(x')\,g(x-x')\,dx' = \int_0^x \frac{q(x')\,dx'}{(x-x')^{p}} \tag{6} $$

Finding the optimal placement now means finding the optimal function q(x). Usually a uniform temperature $T(x) = T_0$ is considered optimal because all components will then have equal reliability, provided of course that they are identical.

4 Solution with Fractional Calculus

The integral equation (6) has been solved analytically for a uniform temperature distribution $T_0$. A rather artificial method was used, based on a particular property of the Euler beta function [10, 11]. For a non-constant temperature this method cannot be used. A uniform temperature distribution $T(x) = T_0$ is often considered the optimal situation. If the components do not all have the same degradation rate as a function of temperature, a non-uniform temperature distribution can be considered optimal instead. The problem is now to find the power q(x) for the given T(x). This problem will now be solved using fractional calculus. Since equations (3) and (4) are time dependent, the causality principle is automatically taken into account. In the same way, the fact that a heat source can only warm up the downstream part can be interpreted as the causality principle in the space domain.
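Before turning to the inverse problem, the forward relation (5)-(6) can be illustrated in discrete form for the six-component layout of Fig. 1. The following is only a minimal sketch under stated assumptions: the function name `wake_rise`, the component positions, the exponent p = 0.5 and the power values are all hypothetical, and self-heating is deliberately left out because the power-law wake (5) is singular at zero distance.

```python
def wake_rise(positions, powers, p=0.5):
    """Temperature rise at each component caused by upstream components only.

    Discrete analogue of Eq. (6):
        T_i = sum over j with x_j < x_i of P_j / (x_i - x_j)**p
    Self-heating (j == i) is excluded; p = 0.5 is an assumed exponent.
    """
    rises = []
    for xi in positions:
        total = 0.0
        for xj, pj in zip(positions, powers):
            if xi > xj:  # causality in space: only upstream sources contribute
                total += pj / (xi - xj) ** p
        rises.append(total)
    return rises


# Hypothetical layout: six components at unit spacing, fan on the left.
positions = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

# Only the leftmost component dissipates: every downstream component warms up.
hot_left = wake_rise(positions, [10.0, 0.0, 0.0, 0.0, 0.0, 0.0])

# Only the rightmost component dissipates: no other component is affected.
hot_right = wake_rise(positions, [0.0, 0.0, 0.0, 0.0, 0.0, 10.0])

print("hot component on the left :", hot_left)
print("hot component on the right:", hot_right)
```

The two calls reproduce the qualitative statement of the introduction: a source at the left warms every component downstream of it, whereas a source at the right leaves all upstream components unaffected.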
Comparing (4) with (5), with x in the role of t and p in the role of α, the integral equation (6) can be rewritten as:

$$ T(x) = \Gamma(1-p)\;{}_0D_x^{-1+p}\,q(x) \tag{7} $$

If T(x) is given, the solution q(x) is immediately found to be:

$$ q(x) = \frac{1}{\Gamma(1-p)}\;{}_0D_x^{1-p}\,T(x) = \frac{1}{\Gamma(1-p)\,\Gamma(p)}\,\frac{d}{dx}\int_0^x \frac{T(x')\,dx'}{(x-x')^{1-p}} \tag{8} $$

Taking into account that [10]:

$$ \Gamma(1-p)\,\Gamma(p) = \frac{\pi}{\sin \pi p} \tag{9} $$

one obtains the general solution:

$$ q(x) = \frac{\sin \pi p}{\pi}\,\frac{d}{dx}\int_0^x \frac{T(x')\,dx'}{(x-x')^{1-p}} \tag{10} $$

If one wants to obtain a uniform temperature $T(x) = T_0$, the heat production q(x) turns out to be:

$$ q(x) = T_0\,\frac{\sin \pi p}{\pi}\,\frac{d}{dx}\int_0^x \frac{dx'}{(x-x')^{1-p}} = T_0\,\frac{\sin \pi p}{\pi}\,\frac{1}{x^{1-p}} \tag{11} $$
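The closed form (11) can be checked numerically by substituting it back into (6) and confirming that the resulting temperature does not depend on x. The sketch below is an illustration only: the function names, the midpoint rule, the power substitutions used to remove the end-point singularities and the sample values of p and x are our assumptions, not part of the original paper.

```python
import math


def midpoint(f, a, b, n=4000):
    """Composite midpoint rule; adequate for the bounded integrands below."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def q_uniform(x, p, t0=1.0):
    """Power profile of Eq. (11) for a uniform target temperature t0."""
    return t0 * math.sin(math.pi * p) / math.pi * x ** (p - 1.0)


def temperature(x, p, t0=1.0):
    """Evaluate Eq. (6) at position x with q(x) taken from Eq. (11).

    The integrand is singular at x' = 0 and at x' = x, so the interval is
    split at x/2 and each half is regularised by a power substitution.
    """
    c = t0 * math.sin(math.pi * p) / math.pi
    # Lower half, x' = s**(1/p): q(x') dx' becomes (c/p) ds.
    lower = midpoint(
        lambda s: (c / p) * (x - s ** (1.0 / p)) ** (-p),
        0.0, (x / 2.0) ** p)
    # Upper half, x - x' = s**(1/(1-p)): dx'/(x-x')**p becomes ds/(1-p).
    upper = midpoint(
        lambda s: q_uniform(x - s ** (1.0 / (1.0 - p)), p, t0) / (1.0 - p),
        0.0, (x / 2.0) ** (1.0 - p))
    return lower + upper


for p in (0.3, 0.5, 0.7):
    values = [round(temperature(x, p), 4) for x in (0.5, 1.0, 2.0)]
    print("p =", p, "T(x) at x = 0.5, 1.0, 2.0:", values)
```

Within the accuracy of the quadrature, the computed temperature should equal the target value at every position, which is the content of (11).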
The result (11) is exactly the same solution as the one found by a rather artificial method [9]. The use of fractional calculus offers a general solution which can be used for any temperature function T(x).

5 Conclusion

It has been shown that fractional calculus can be successfully used for some particular problems in electronics. We treated the placement problem of components on a one-dimensional array, using the prescribed temperature distribution as the optimisation criterion.

Acknowledgments

M. Felczak and B. Więcek thank the Ministry of Science and Education of Poland for the financial support (project nr. 3T11B02429).

References

[1] Y. Cengel, Heat Transfer. McGraw-Hill, Boston, 2003, ch. 15.
[2] I. Sokolov, J. Klafter, and A. Blumen, "Fractional kinetics," Physics Today, vol. 55, no. 11, pp. 48–54, 2002.
[3] R. Magin, S. Boregowda, and C. Deodhar, "Modelling of pulsating peripheral bioheat transfer using fractional calculus and constructal theory," International Journal of Design and Nature, vol. 1, no. 1, pp. 18–33, 2007.
[4] A. M. Anderson and R. J. Moffat, "The adiabatic heat transfer coefficient and the superposition kernel function: Part I: Data for arrays of flatpacks for different flow conditions," Journal of Electronic Packaging, vol. 114, no. 1, pp. 14–21, 1992.
[5] A. M. Anderson and R. J. Moffat, "The adiabatic heat transfer coefficient and the superposition kernel function: Part II: Modeling flatpack data as a function of channel turbulence," Journal of Electronic Packaging, vol. 114, no. 1, pp. 22–28, 1992.
[6] O. Leon, G. De Mey, and E. Dick, "Study of the optimal layout of cooling fins in forced convection cooling," Microelectronics Reliability, vol. 42, pp. 1101–1111, 2002.
[7] R. A. Wirtz and P. Dykshoorn, "Heat transfer from arrays of flat packs in a channel flow," in 4th IEPS Conference, Baltimore, pp. 318–326.
[8] R. A. Wirtz, "Forced air cooling of low profile air cooling," in Air Cooling Technology for Electronic Equipment, S. J. Kim and S. W. Lee, Eds. New York: CRC Press, 1996.
[9] S. S. Kang, "The thermal wake function for rectangular electronic modules," Journal of Electronic Packaging, vol. 116, pp. 55–59, 1994.
[10] G. De Mey, M. Felczak, and B. Więcek, "Exact solution for optimal placement of electronic components on linear array using analytical thermal wake function," Electronics Letters, vol. 44, pp. 1216–1217, 2008.
[11] M. Abramowitz and I. Stegun, Handbook of Mathematical Functions. Dover, 1970.

Facta Universitatis
Series: Electronics and Energetics Vol. 34, No 4, December 2021, pp. 499-510
https://doi.org/10.2298/fuee2104499v
© 2021 by University of Niš, Serbia | Creative Commons License: CC BY-NC-ND
Original scientific paper

Characterization of PTC Effect in BaTiO3-Ceramics as a Special Phase Transition – Fractal Approach

Zoran B. Vosika1, Vojislav V. Mitić1,2, Vesna Paunović1, Jelena Manojlović3, Goran Lazović4

1University of Niš, Faculty of Electronic Engineering, Niš, Serbia
2Institute of Technical Sciences of SASA, Belgrade, Serbia
3University of Niš, Faculty of Mechanical Engineering, Niš, Serbia
4Belgrade University, Faculty of Mechanical Engineering, Belgrade, Serbia

Abstract. The applications of BaTiO3-ceramics are very important and constantly increasing nowadays. In that sense, we analyzed some phenomena related to intergranular effects. We used experimental data based on Murata powders and processing technology. Our original contribution to the Heywang-Jonker-Daniels intergranular capacity model is based on thermodynamic fractal analysis applied to the phase transition in ceramic structures. In this case, the PTCR effect has the character of a diffuse first-order phase transition in a modified Landau theory-fractal approach. Its basic properties are considered. This is an original contribution as a bridge between theoretical aspects of BaTiO3-ceramics and experimental results.

Key words: PTCR effect, BaTiO3-ceramics, Heywang-Jonker-Daniels model, fractal correction, phase transition

1. Introduction

The positive temperature coefficient of electrical resistivity (PTCR or PTC) effect is, in this case, a jump in resistivity of many orders of magnitude in a certain temperature range of polycrystalline semiconducting n-doped BaTiO3 across the Curie point ($T_0$ ~ 130 °C), that is, at the paraelectric-ferroelectric phase transition (tetragonal-cubic structural change) (see [1-4]). For NTC materials, the temperature coefficient of electrical resistivity has a negative value. The PTC phenomena are usually strongly associated with grain-boundary phenomena and, by themselves, represent an electric semiconductor-insulator transition, which will be preliminarily shown in this text. A typical material with PTCR properties is a BaTiO3-ceramic doped with 0.5 × 10^-2 mol of Nb3+ or Sn3+ [1].
Received February 21, 2021; received in revised form June 22, 2021
Corresponding author: Zoran B. Vosika, University of Niš, Faculty of Electronic Engineering, Aleksandra Medvedeva 14, 18000 Niš, Serbia
E-mail: setstory@gmail.com

In the PTC markets, there is an enormous number of devices based on the PTCR effect, for example chemical sensors, heaters, and current limiters. Sintering in an oxygen atmosphere affects not only the height of the two-layer electrical potential barrier, but also the resistance and capacitance of the grain boundaries. This effect appears due to variations in the gases adsorbed at the grain boundaries. Sintering under these atmospheric conditions increases the number of oxygen acceptors and the resistance in the grain boundary.

The PTC effect in BaTiO3, as a complex phenomenon, involves, according to the available literature, more general features (Nowotny and Rekas [5], Mitić [6]). In short, the most important causes that lead to the PTC effect are the donor concentration (the charge carrier concentration) $N_d$, the acceptor state density at the grain boundaries $N_s$, and the depth of the acceptor level (the energy gap between the energy levels of the acceptor states) $E_s$. In addition to these variables, there are less important, indirect causes such as porosity, mean grain size (~ 10 µm), abnormal grain growth, preparative conditions, and composition.

The most important model to be considered here is the Heywang model, based on statistical thermodynamic methods [7,8], as well as its modifications: the Jonker [9], Daniels [10], and Mitić-Kocić fractal approaches [11,12]. Nonetheless, other models are also known, for example [6] the Lewis and Catlow model for Ti acceptors and the Saburi models of variable valences of Ti4+ ions.
In this paper, based on the theory of phase transitions [13-16], especially Mott's electric insulator-conductor phase transitions [17], and on the basis of the above, supported by the experiment, a simple Landau-type Joule-heating model of this diffuse phase transition for the PTC effect can be formulated. The paper is organized in the following way: in Section 2, the basics of the experimental design and the theoretical models are described. Section 3 is the basic part of the work. This section discusses the experimental results and formulates a preliminary model. The paper is concluded with Section 4.

2. Experiment

2.1. Experimental details

The samples used for analysis in this paper were obtained using the solid-state sintering method [6]. As the starting material, high-purity commercial BaTiO3 Murata powder (99.9% purity, mean grain size < 2 μm) was used. The starting powder was mixed in a ball mill with isopropyl alcohol. Organic binders were added to the mixture, and homogenization was performed for 48 hours. This powder was dried and granulated on a standard vibrating sieve (Fritsch Pulverisette 5). Then, the powder was pressed into tablets at a pressure of 86 MPa. After pressing, the samples were sintered at 1190 °C for 1 and 2 h. Electrical characteristics as a function of temperature were measured with a Hewlett Packard 4276A LCZ meter. The electrical resistance was measured at a constant voltage U = 250 V. Subsequently, our method, with the use of mathematical-physical tools, was applied to pure BaTiO3.

2.2. Theoretical models of the PTC effect and electrical properties of BaTiO3-ceramics

Heywang [7,8] assumed a model with a two-dimensional layer at the grain boundaries containing active acceptor states.
These acceptors attract electrons from the bulk, resulting in an electron-depletion double-layer electric-potential barrier of Schottky type with thickness b (see Fig. 1):

$$ b = \frac{N_s}{2N_d} \tag{1} $$

The temperature dependence of the acceptor state density, $N_s = N_s(T)$ (T is the temperature), with $N_{s0}$ the total number of acceptor states, is described by:

$$ N_s = \frac{N_{s0}}{1 + \exp\left(\dfrac{E_F + e\varphi_0 - E_s}{kT}\right)} \tag{2} $$

where $E_s$ is the energy of the acceptor states, e is the elementary charge, k is the Boltzmann constant and, with $N_c$ the effective density of states in the conduction band, $E_F$ is the Fermi energy, which equals

$$ E_F = kT \ln\frac{N_c}{N_d} \tag{3} $$

The electric potential barrier $\varphi_0$ of the depletion layer at a grain boundary is:

$$ \varphi_0(T) = \frac{e\,N_s^2(T)}{8\,\varepsilon_0\,\varepsilon_r(T)\,N_d} \tag{4} $$

where $\varepsilon_0$ is the vacuum permittivity and $\varepsilon_r = \varepsilon_r(T)$ is the relative permittivity of the grain boundary region. The permittivity $\varepsilon_r$ follows the Curie-Weiss law:

$$ \varepsilon_r = \frac{C}{T - T_0} \tag{5} $$

where C is the Curie-Weiss constant.

Fig. 1 Double Schottky barrier at the grain boundary caused by a two-dimensional acceptor layer at the grain boundary. After Heywang 1961, 1964 [7,8].

At $T = T_0$, the acceptor levels lie well below the Fermi energy level $E_F$ and are filled, so that $N_s = N_{s0}$. Due to the decrease in permittivity with increasing temperature in the grain boundary region, caused by the phase transition of BaTiO3 ceramics from ferroelectric to paraelectric above the Curie temperature $T_0$, the potential $\varphi_0$ increases proportionally with temperature, as expressed in equations (4) and (5) (see Fig. 2).

Fig. 2 Typical sketch of the PTC characteristic. The characteristic temperature regions for PTC ceramics: I – semiconductor region; II – transition region; III – insulator region. The maximum resistivity is obtained at temperature $T_m$. In region II, the dependence R = R(T) is considered to be of the exponential type.
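The linear growth of the barrier above the Curie point can be made concrete by combining (4) and (5) for a fixed acceptor density $N_s = N_{s0}$. The sketch below is only an illustration under stated assumptions: the function names and all numerical parameter values (Curie-Weiss constant, $N_s$, $N_d$) are hypothetical placeholders rather than data from this paper, and the self-consistent temperature dependence of $N_s$ from (2) is ignored.

```python
E_CHARGE = 1.602176634e-19  # elementary charge e, in C
EPS0 = 8.8541878128e-12     # vacuum permittivity, in F/m


def relative_permittivity(temp_k, curie_const_k, t0_k):
    """Curie-Weiss law, Eq. (5); only meaningful above the Curie temperature."""
    if temp_k <= t0_k:
        raise ValueError("Curie-Weiss law (5) requires T > T0")
    return curie_const_k / (temp_k - t0_k)


def barrier_height(temp_k, n_s, n_d, curie_const_k, t0_k):
    """Barrier of Eq. (4) in volts, with N_s in m^-2 and N_d in m^-3.

    N_s is held constant here, which is a simplification of Eq. (2).
    """
    eps_r = relative_permittivity(temp_k, curie_const_k, t0_k)
    return E_CHARGE * n_s ** 2 / (8.0 * EPS0 * eps_r * n_d)


# Hypothetical parameter values, chosen only to exercise the formulas.
T0_K = 403.15           # Curie point, about 130 degrees C as quoted in the text
CURIE_CONST_K = 1.5e5   # assumed Curie-Weiss constant, in K
N_S = 1.0e18            # assumed acceptor state density, in m^-2
N_D = 1.0e24            # assumed donor concentration, in m^-3

for delta in (10.0, 50.0, 100.0):
    phi = barrier_height(T0_K + delta, N_S, N_D, CURIE_CONST_K, T0_K)
    print("T - T0 =", delta, "K, barrier =", phi, "V")
```

With $N_s$ fixed, doubling the distance from the Curie point doubles the barrier, which is the proportionality stated in the text; a fuller treatment would solve (2) and (4) together.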
The resistivity of the BaTiO3 samples, R, is related to the Schottky potential barrier by the following equation, where $R_0$ is a constant:

$$ R = R_0 \exp\left(\frac{e\varphi_0}{kT}\right) \tag{6} $$

Heywang's model was not able to explain the PTC behavior below $T_0$ accurately. The Jonker model [9] assumed that, as a result of strong electric fields around grain boundaries, the electric permittivity in the ferroelectric state is smaller than that assumed by Heywang. The lowering of the surface potential barrier is explained as the effect of compensation of the surface charges by the different directions of the ferroelectric domains on both sides. The total resistance of a material is described by the equation:

$$ R = R_0\left[1 + \frac{z\,b\,k\,T}{e\varphi_0}\exp\left(\frac{e\varphi_0}{kT}\right)\right] \tag{7} $$

where z is the number of grains per cm.

Based on Daniels-type models [10], the thickness of the insulating layers rich in cation vacancies (oxygen, titanium or, for the Daniels model, barium vacancies) in sintered donor-doped BaTiO3 grains can be up to 3 µm. In fine-grain BaTiO3 samples, cation vacancy layers act as the acceptor states dominating the electric structure in the grains, and these conditions result in a global insulating profile.

Fig. 3 Schematic picture of one ceramic grain in polycrystalline BaTiO3 above the Curie temperature $T_0$: a. tetragonal-type BaTiO3 crystal region; b. area with a gradient of tetragonality; c. cubic-type BaTiO3 crystal region. Roughly, region c contains the insulating grain boundary layer (the location of the nonzero Schottky potential barrier), while the bulk area a+b is characterized as an n-type semiconductor. After Nowotny and Rekas 1991 [5].

The dependence of the PTC effect on the grain size, above the Curie temperature, has been considered together with the influence of the structural gradient across the BaTiO3 grain [5]. According to Nowotny and Rekas, below $T_0$ the tetragonal structure is stable only in the crystalline bulk phase.
In the outer part of the grain, a low-dimensional layer of predominantly cubic structure forms. This circumstance results in the formation of a tetragonality gradient between the two structures (Fig. 3) [5]. Between the bulk and the insulating grain surface, a boundary layer develops that acts as an electrical potential barrier, resulting in a high electrical resistivity. Below $T_0$, the barrier decreases significantly as a result of the high permittivity [7,8], of the electrical polarization compensation effect [9], or of the presence of cation vacancies [10].

In further research into the PTC effect, new ideas emerged, for example considering the fractality of grain or pore shapes, that is, of electron trajectories in BaTiO3-ceramics. Benoit Mandelbrot [18] systematically introduced fractal geometry in the second half of the last century. In this context, for a practical approach to describing complex non-Euclidean curves and shapes, see [19]. Some examples of the application of fractal geometry are given by Gouyet [20]. Objects in fractal geometry are characterized, among other things, by a unique number, the Hausdorff fractal dimension $D_{HF}$. The basic relation for it is $D_{HF} > D_T$, where $D_T$ is the topological dimension.

The Mitić-Kocić hypothesis is that the working temperature of BaTiO3-ceramics must be influenced by three fractality factors. According to them, the correction of the "theoretical" temperature T is:

$$ T_f = \alpha_f\,T \tag{8} $$

with the fractal corrective factor $\alpha_f$. The three factors $\alpha_s$, $\alpha_p$, $\alpha_m$, which are, in that order, the surface, pore, and particle (electron) movement (Brownian motion) fractal parts, participate (see [11,12]) as follows:

$$ \alpha_f = \Phi(\alpha_s, \alpha_p, \alpha_m) \tag{9} $$

where Φ is some function. Mitić et al. [11] showed that the contours of many ceramic grains are fractal curves with $1.5 < D_{HF} < 2$. For a theoretical elaboration on $\alpha_m$ see [21].
in this context, uchino and nomura [22] proposed the following equation for the relative permittivity εr in a diffuse phase transformation around tmax:
\[ \frac{1}{\varepsilon_r} - \frac{1}{\varepsilon_{r,\max}} = \frac{(T - T_{\max})^{\gamma}}{C'} \tag{10} \]
where c' is a curie-like constant and γ is the critical exponent, γ ∈ [1,2]; γ = 1 and γ = 2 correspond, in that order, to a sharp phase transformation and to a strongly diffuse phase transition. for single crystal batio3, γ is 1.08. in the modified batio3-ceramics it can progressively increase up to 2 (see [12]). the connection of the mentioned phase transitions with the fractal structure of ceramics is known in the literature. therefore, it also affects the conductive properties of batio3 ceramics. the ptc effect could, for the above reasons, be considered as a kind of conditional, diffuse phase transition. however, the thermodynamic equation of state of the clean phase transition in this context is not determined. in this paper, based on the theory of phase transitions (see [13-16]), especially mott's electric insulator-conductor phase transitions [17], and on the basis of the above, supported by the experiment, a simple model of landau-type joule heating of this diffuse phase transition could be formulated. the reasons for using the word "diffuse" in relation to this phase transition lie in the mechanism of its manifestation, but also in the theoretical method by which it is examined. from the standard modern point of view, phase transitions and critical phenomena are essentially quantum-statistical phenomena, although they can very often be treated thermodynamically [4],[13-16]. phase transitions of the conductor (usually, metal)-insulator type in electrical conductivity can be considered, roughly, from several aspects [17], [20]: strongly correlated systems (e.g. the mott phase transition [17]), disorder, frustration, and fractals [20].
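to make relation (10) concrete, the sketch below shows one way the exponent γ could be extracted from permittivity data measured above tmax: taking logarithms of (10) gives a straight line whose slope is γ and whose intercept is −ln c'. this is only an illustrative fit on synthetic data, not a procedure taken from [22] or [12]; the function name and all numerical values are assumptions.

```python
import numpy as np

def fit_diffuseness_exponent(T, eps_r):
    """Fit gamma and C' in 1/eps_r - 1/eps_max = (T - T_max)**gamma / C'.

    T, eps_r : arrays of temperature and relative permittivity.
    Only points above the permittivity maximum are used.
    Returns (gamma, C_prime, T_max).
    """
    T = np.asarray(T, dtype=float)
    eps = np.asarray(eps_r, dtype=float)
    i_max = np.argmax(eps)
    T_max, eps_max = T[i_max], eps[i_max]
    above = T > T_max
    x = np.log(T[above] - T_max)
    y = np.log(1.0 / eps[above] - 1.0 / eps_max)
    # log form of (10): y = gamma * x - log(C')
    gamma, intercept = np.polyfit(x, y, 1)
    return gamma, np.exp(-intercept), T_max

if __name__ == "__main__":
    # Synthetic data generated from (10) with assumed parameters.
    T_max, eps_max, C_prime, gamma = 130.0, 5000.0, 1.0e5, 1.5
    T = np.linspace(T_max, 200.0, 71)
    eps = 1.0 / (1.0 / eps_max + (T - T_max) ** gamma / C_prime)
    print(fit_diffuseness_exponent(T, eps))
```

with measured data the fit would be restricted to a temperature window above tmax where the log-log plot is reasonably linear; the choice of that window is a judgment call that the sketch does not address.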
the mott phase transition, which has a behavior similar to the ntc effect (see [23]), in principle, cannot be applied to batio3-ceramics or analogous materials due to the absence of structural details. the basic relation for the classification of simple insulators and conductors in "weak" external fields is ohm's law between the current density jα and the applied electrical field eβ [17] (α, β = 1, . . ., d; d is the dimension of the system):
\[ j_{\alpha}(\mathbf{q},\omega) = \sum_{\beta} \sigma_{\alpha\beta}(\mathbf{q},\omega)\, E_{\beta}(\mathbf{q},\omega). \tag{11} \]
if linear response theory is applied, the conductivity tensor σαβ(q, ω) is derived from an equilibrium current-current correlation function. for an insulator, at zero temperature, the static electrical conductivity vanishes (re {. . .}: real part):
\[ \sigma^{dc}_{\alpha\beta}(T = 0) = \lim_{T \to 0}\ \lim_{\omega,\,|\mathbf{q}| \to 0} \operatorname{Re}\{\sigma_{\alpha\beta}(\mathbf{q},\omega)\} = 0. \tag{12} \]
for an ideal metal-conductor, the behavior of this tensor is (e is the elementary charge and n/m∗ is the ratio between the charge carrier concentration and the effective mass):
\[ \operatorname{Re}\{\sigma_{\alpha\beta}(\omega;\, T,\, \mathbf{q} \to 0)\} = \pi \frac{e^{2} n}{m^{*}}\, \delta_{\alpha\beta}\, \delta(\omega). \tag{13} \]
limelette et al. [24] reported conductivity measurements of cr-doped v2o3 for the mott metal-nonmetal (here, metal-dielectric) transition and determined the critical thermodynamic relations associated with the equation of state and the corresponding exponents. the universal scaling function associated with the equation of state, for conductivity σ, pressure p, and temperature t, of the type σ = σ(p, t), is investigated in detail. the universal properties of a liquid-gas transition are established. for batio3-ceramics and their dielectric properties, partly in contrast to the single-crystal theory (see wang et al. 2010 [25], and their dependence εr = εr(p, t)), which is a pure thermodynamic approach, the situation is more complicated.
within the linear heywang-jonker-daniels theory using the fractal approach, for constant state parameters, the relation εr(ω) − 1 = σ(ω)/(iωε0) holds. from the semiconductor-dielectric diffuse phase transition point of view, all of these can lead to a new, conductive phase transformation, which could be the best candidate for understanding and explaining the ptcr effect.
3. results and discussion
this section considers the thermodynamic foundation of the ptcr effect. it represents a continuation of the research on landau's thermodynamic theory presented by wang et al. 2010 [25]. heywang's model, etc., is microscopic and yet semi-qualitative, and, in some segments, explanatory. it does not foresee the possibility, either quantitatively or qualitatively, that this is, from a theoretical point of view, a phase transition. more precisely, therefore, it is a phase transition and not a ptcr effect. an effect, in this context, is physically characterized by a single phase (for example, the meissner effect exists in a superconducting phase). similar to the non-diffuse first-order phase transition in dielectrics under a hydrostatic pressure, within the landau-devonshire model, the correction of the gibbs free energy originating from the electric current density has the form:
\[ G(T, j) = G_0 + a_j (T - T_{p0})\, j^{2} + b_j\, j^{4} + d_j\, j^{6}, \tag{14} \]
where g0 is the landau-devonshire potential, bj < 0, dj > 0, and tp = tp0 + bj^2/(4ajdj) is the transition temperature (see [4], [15], [16], [25]). if the external electric field e is constant (σ = j/e), the temperature dependence of σ in a first-order phase transition is as given in fig. 4.
fig. 4 the temperature dependence of electrical conductivity σ in a first-order phase transition, as an explanation of the non-diffusion part of the ptc effect, similar to schwabl 2006 [15]
in [6], for undoped atmospherically reduced batio3-x, a significant ptc effect was manifested after the fluorination process. the experimental results and the graphs of the logarithm of the specific electrical resistivity r (Ω cm) as a function of temperature t, for sintering times of 1, 2, and 3 h, are given in that paper. for further analysis, in the case of atmospheric pressure, the conductivity dependence σ = σ(t) (σ = 1/r) is more suitable, as shown in fig. 5.
fig. 5 ptc effect for undoped batio3-x after fluorination process, semiconductor-insulator relative phase transition, specific electrical resistivity r, conductivity function σ = σ(t) = 1/r, t is temperature. from left to right, sintering time, in order: 1, 2, and 3 h; mitić, 2001 [6]
in the thermodynamic theory of phase transitions, various thermodynamic potentials are considered, such as the free energy as a function of characteristic parameters. batio3 grains are characterized by the dependence of the gibbs free energy on polarization and strain [25], [26]. as a result, in particular, a known dependence εr = εr(p, t) is obtained. for batio3 ceramics, this may have a very small effect, due to the described general conditions of the ptc effect. the idea of the paper is to start from macroscopic electro-thermal effects in the case of transport processes in a constant applied electric field e, at unchanging external atmospheric pressure and for a fixed measurement time interval t and specimen volume v. joule's law in differential form for metals is known:
\[ \frac{d^{2}Q}{dt\, dV} = \mathbf{j} \cdot \mathbf{E}. \tag{15} \]
q is the joule heat. for batio3 ceramics, the generalized ohm's law j = σα e^α (1 ≤ α < 15, [27]) is considered. here, α = 1 is taken.
then, it is valid that:
\[ \frac{d^{2}Q}{dt\, dV} = \sigma E^{2} \sim \sigma, \tag{16} \]
or:
\[ Q \propto \sigma V t \sim \sigma. \tag{17} \]
in this context, the following physical quantities are the most important (due to the constancy of the pressure, the label q = qp is introduced, see [13]):
\[ \frac{d\sigma}{dT} \sim \frac{dQ_p}{dT} = C_p, \tag{18} \]
where cp is the molar heat capacity, and, for the molar entropy se:
\[ \int \frac{d\sigma}{T} \sim \int \frac{dQ_p}{T} = S_e. \tag{19} \]
if the chemical potential µ = 0, for the gibbs potential g = g(t, p), the following thermodynamic relations are known:
\[ S_e = -\left(\frac{\partial G}{\partial T}\right)_{p} \tag{20} \]
and
\[ C_p = -T \left(\frac{\partial^{2} G}{\partial T^{2}}\right)_{p}. \tag{21} \]
by definition, a first-order phase transition has a finite discontinuity in the first derivative of the gibbs potential g(t, p) with respect to temperature and a divergence in the second derivative [13,15]. if the data σ = σ(t) in fig. 5 are numerically differentiated with respect to temperature, the dependency (dσ/dt)(t) is obtained, as shown in fig. 6.
fig. 6 numerically differentiated curves σ = σ(t) presented in fig. 5. the temperature corresponding to the minimum of the curve is tp
the temperatures of the relative or diffuse phase transitions are tp1 = 100.0 °c, tp2 = 102.8 °c, and tp3 = 102.8 °c. the sintering time does not significantly affect their value. the minima of the curves in fig. 6 are not divergences to infinity. equation (18) shows that this is a diffuse jump with a magnitude proportional to cp. in order to show that this is indeed a diffuse first-order phase transition, it is necessary to obtain the dependence of entropy on temperature. using numerical integration and relation (19), fig. 7 is obtained.
fig. 7 numerically integrated data from curves σ = σ(t) in fig. 5 using the equation (19). entropy se has a diffuse finite jump
one way of considering this "diffusivity" is based on power laws; at this point, it is relevant to consider the following relation:
\[ \sigma_{\alpha}(T) = \sigma(T_p) + |T - T_p|^{\alpha_{l,r}}. \tag{22} \]
fitting the experimental data gives, respectively (l: left, r: right), for αl – 0.3054, 0.0210 and 0.5861, and for αr 0.4314, 0.4567, and 0.4416. then, the first derivative of the variable σα(t) with respect to temperature determines the behavior around the phase transition point. these values suggest that there is scaling behavior in batio3 based on the relation of ceramic conductive properties. in our preliminary model, the exponents of the electrical conductivity around the temperature tp are αl = 0.5 and αr = 0. statistical aspects of (22), in general, are given in [28]. however, a discussion based on fractional (fractal) brownian motion [29] is also necessary. there are also opportunities for further improvements. the evident diffuseness of the transition can be interpreted by the geometric and velocity fractality, which is described by the equations (9) and (10). in particular, regarding the circumstances related to more general gibbs free energy and entropy concepts (related to fractal dimension and entropy [30]), in addition to results related to various experiments, more analysis is necessary, based on the works [24] and [25]. the question arises, however, of the relation to other thermodynamic phase transitions, such as the liquid-gas transition.
4. conclusion
in this paper, preliminarily, the modified landau theory of a diffuse first-order phase transition was considered for a description of the ptcr effect in the specific case of batio3 ceramics. as its basis, the fractal heywang-jonker-daniels model was considered. in addition to the basic thermodynamic relations, though slightly modified, the experimental results of the ptcr characteristic effect, for almost pure batio3-ceramics, were used.
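the numerical steps used in section 3 (differentiating σ(t) to locate tp as the minimum of dσ/dt, then integrating dσ/t in the sense of relation (19)) can be sketched as follows. this is a minimal illustration on a synthetic, smooth step-like σ(t) curve, not on the data of [6]; the function names, the central-difference and trapezoidal schemes, and the shape of the test curve are assumptions.

```python
import numpy as np

def transition_temperature(T, sigma):
    """Temperature at which d(sigma)/dT is most negative (estimate of T_p)."""
    dsigma_dT = np.gradient(sigma, T)  # central differences on the T grid
    return T[np.argmin(dsigma_dT)]

def entropy_like_integral(T_kelvin, sigma):
    """Cumulative trapezoidal integral of (d sigma / dT) / T, cf. relation (19).

    T_kelvin must be an absolute temperature; the result is only
    proportional to an entropy, up to the constants dropped in (16)-(19).
    """
    integrand = np.gradient(sigma, T_kelvin) / T_kelvin
    steps = 0.5 * (integrand[1:] + integrand[:-1]) * np.diff(T_kelvin)
    return np.concatenate(([0.0], np.cumsum(steps)))

if __name__ == "__main__":
    # Synthetic conductivity: a smooth drop centred at 100 degC (assumed shape).
    T_c = np.linspace(60.0, 140.0, 801)
    sigma = 0.5 * (1.0 - np.tanh((T_c - 100.0) / 5.0))
    print("T_p estimate [degC]:", transition_temperature(T_c, sigma))
    s_e = entropy_like_integral(T_c + 273.15, sigma)
    print("total change of the integral:", s_e[-1])
```

for a smeared step the derivative shows a finite minimum rather than a divergence, and the integral shows a finite, smoothed jump, which is the qualitative behavior described for figs. 6 and 7. with noisy measurements some smoothing before differentiation would probably be needed.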
acknowledgements: this work has been supported by the ministry of education, science and technological development of the republic of serbia (grant no. 451-03-9 / 2021-14 / 200102). references [1] k.c. kao, dielectric phenomena in solids: with emphasis on physical concepts of electronic processes. elsevier academic press, london, 2004. [2] y. l. chen and s. yang, "ptcr effect in donor doped barium titanate: review of compositions, microstructures, processing and properties", adv. appl. ceram., vol 110, no. 5, pp. 257-269, jan. 2011. [3] s. h. cho, "theoretical aspects of ptc thermistors", j. korean ceram. soc., vol. 43, no. 11, pp. 673-679, oct. 2006. [4] f. duan and j. guojun, introduction to condensed matter physics, volume 1. world scientific publishing co. pte. ltd. singapore, 2005. [5] j. nowotny and m. rekas, "positive temperature coefficient of resistivity for batio3-based materials", ceram. int., vol. 17, no. 4, pp. 227-241, 1991. [6] v. v. mitić, structure and electrical properties of batio3 – ceramics. belgrade, serbia, zadužbina endowment andrejević, 2001. [7] w. heywang, "bariumtitanat als sperrschichthalbleiter", solid-state electron., vol. 3, no. 1, pp. 51-58, july 1961. [8] h. heywang, "resistivity anomaly in doped barium titanate", j. am. ceram. soc., vol. 47, no. 10, pp. 484–490, oct. 1964. [9] g. h. jonker, "some aspects of semiconducting barium titanate", solid-state electron., vol. 7, no. 12, pp. 895–903, dec. 1964. [10] j. daniels, k.h. hardtl and r. wernicke, "ptc effect of barium titanate", philips tech. rev., vol. 38, no. 3, pp. 73-82, 1979. [11] v. v. mitić, v. paunović and lj. kocić, "fractal approach to batio3 –ceramics microimpedances", ceramic int., vol. 41, no. 5, pp. 6566-6574, 2005. [12] v.v. mitić, v. paunović, g. lazović, lj. kocić and b. vlahović, "clausius–mossotti relation fractal modification", ferroelectrics, vol. 536, no. 1, pp. 60-76, 2018. [13] h. e. stanley, introduction to phase transitions and critical phenomena. 
oxford university press, 1971. [14] s. r. a. salinas, introduction to statistical physics. springer-verlag, new york, 2010. [15] f. schwabl, statistical mechanics. second edition. springer-verlag berlin heidelberg, 2006. [16] l. e. reichl, a modern course in statistical physics. 4th revised and updated edition. wiley-vch verlag gmbh & co., weinheim, germany, 2016. [17] f. gebhard, the mott metal-insulator transition. springer, berlin heidelberg, 2010. [18] b. mandelbrot, the fractal geometry of nature. 3rd ed. w.h. freeman, san francisco, 1983. [19] m. barnsley, fractals everywhere. academic press, san diego, ca, 1988. [20] j. f. gouyet, physics and fractal structures. springer, berlin, 1996. [21] z. b. vosika, v. v. mitić, g. lazović, v. paunović and lj. kocić, "meso-kinetics of one-time relaxation electrical processes in batio3 ceramics—modified boltzmann-poisson model", ferroelectrics, vol. 531, no. 1, pp. 38–50, nov. 2018. [22] k. uchino and s. nomura, "critical exponents of the dielectric constants in diffuse-phase transition crystals", ferroelectrics letters, vol. 44, pp. 55–61, may 1982. [23] r. b. darling, s. iwanaga, "structure, properties, and mems and microelectronic applications of vanadium oxides", sadhana 34, vol. 531, oct. 2009. [24] p. limelette, a. georges, d. p. jérome, p. wzietek, p. metcalf and j. m. honig, "universality and critical behavior at the mott transition", science, vol. 302, no. 5642, pp. 89–92, oct. 2003. [25] j. wang, p.p. wu, x.q ma and l.q. chen, "temperature-pressure phase diagram and ferroelectric properties of batio3 single crystal based on a modified landau potential", j. appl. phys., vol. 108, p. 114105, dec. 2010. [26] l. tu and a.a. ilhan, "hierarchical structure−ferroelectricity relationships of barium titanate particle", crystal growth & design, vol. 1, no. 5, aug. 2001. [27] m. viviani, m.t. buscaglia, v. buscaglia, l. mitoseriu, a. testino, p. nanni and d.
vladikova, "analysis of conductivity and ptcr effect in er-doped batio3 ceramics", j. eur. ceram. soc., vol. 24, no. 6, 2004. [28] m. suzuki, "phase transitions and fractals", prog. theor. phys., vol. 69, no. 1, jan. 1983. [29] p. lévy, "random functions: general theory with special references to laplacian random functions", university of california publications in statistics, vol. 1, pp. 331–390, 1953. [30] v. v. mitić, g. m. lazović, j. ž. manojlović, w.-c. huang, m. m. stojiljković, h. fecht and b. vlahović, "entropy and fractal nature", thermal science, vol. 24, no. 3b, pp. 2203–2212, 2020.
facta universitatis series: electronics and energetics vol. 35, no 3, september 2022, pp. 455-468 https://doi.org/10.2298/fuee2203455n © 2022 by university of niš, serbia | creative commons license: cc by-nc-nd original scientific paper
experimental shielding effectiveness study of metal enclosure with electromagnetic absorber inside
nataša nešić1, nebojša dončov2, slavko rupčić3, vanja mandrić-radivojević3
1department of information and communication technologies, academy of applied technical and preschool studies, niš, serbia 2faculty of electronic engineering, university of niš, serbia 3faculty of electrical engineering, university of josip juraj strossmayer, osijek, croatia
abstract. in this paper, the impact of an electromagnetic absorber inside a protective metal enclosure is analyzed. the absorber is put inside the enclosure in order to improve its shielding effectiveness, especially at the first resonant frequency. different absorber sheet positions inside the enclosure are analyzed. the absorber sheet dimensions are fitted to the enclosure walls. the experimental procedure is conducted in a semi-anechoic room. numerical tlm simulations of the em field distribution inside the enclosure are conducted in order to assess the position of the absorber sheet on different walls.
key words: absorber, enclosure, emi absorber sheet, measurements, shielding effectiveness, tlm method.
received march 1, 2022; revised march 24, 2022; accepted may 15, 2022
corresponding author: nataša nešić, department of information and communication technologies, academy of applied technical and preschool studies, aleksandra medvedeva 20, 18000 niš, serbia (e-mail: natasa.nesic@akademijanis.edu.rs)
*an earlier version of this paper was presented at the 15th international conference on applied electromagnetics пес 2021, august 30th - september 1st, 2021, in niš, serbia [1].
1. introduction
an increasing number of modern electronic devices has resulted in a rise of electromagnetic (em) radiation. hence, it is of considerable importance to conduct electromagnetic compatibility (emc) analysis. the shielding properties of an enclosure can be quantified from the viewpoint of shielding effectiveness (se). commonly, the shielding characteristic of an enclosure is given as a ratio of the em fields with and without the enclosure at some probe point, over a wide frequency range. the se of an enclosure may be very low or may even have a negative value at the resonant frequencies in the observed frequency range. negative values of the enclosure se at the resonant frequencies can affect or even compromise the useful frequency range in which em shielding of a device is provided. a number of different methods, analytical, numerical and experimental, can be used to study the shielding characteristic of an enclosure. the analytical methods [2]–[4] are usually based on problem simplification; thus they can be very fast, but they have some inherent limitations. for an efficient computational modelling of protective enclosures, there are numerous numerical techniques.
one among many is the transmission-line matrix (tlm) method [5], which is employed in this paper. finally, in the experimental methods, an antenna is placed inside the enclosure in order to measure its se. furthermore, the physical dimensions of an in-house monopole receiving antenna, which is often used in the experimental set-up for measuring the em field level, can also affect the se of the enclosure. this was numerically demonstrated in [6] and experimentally confirmed in [7]. several techniques can be applied in order to improve the shielding properties of an enclosure over a frequency range. the se of an enclosure was increased by using absorbers [8] or conductive foam [9], [10]. as damping techniques, composite materials based on nanotechnology [11] and a metamaterial absorber structure [12] can be used. furthermore, a frequency-selective surface [14] and polymer composites filled with carbonaceous particles, which are suitable for microwave absorption [13], can be employed. the enclosure can be coated with composite foam material or can be made of that material [15]. in [16], it was shown that placing small antenna elements, a dipole or loop antenna structure with loaded resistance, on the enclosure wall opposite to the enclosure aperture can improve the enclosure se. the effective length of this small structure was chosen to match the first resonant frequency of the enclosure. in [17] and [18], the authors proposed suppressing the first resonance of a metal enclosure by means of small antenna elements with loaded resistance. it was shown that the em shielding could be improved by placing a small printed dipole antenna structure on an inner enclosure wall. the improvement was pronounced, especially at the first resonant frequency in the observed frequency range. in [8], the influence of an electromagnetic interference (emi) absorber inside the enclosure on the enclosure se was experimentally studied.
the absorbers were placed on the back wall of the enclosure, on two side walls, and on the back and two side walls at the same time. in [1], the study from [8] was expanded by placing and combining absorbers on other enclosure walls in order to see how these absorber positions affect the se of the enclosure, especially at resonant frequencies. in this paper, the experimental study of the impact of the absorber sheet position on the shielding effectiveness of the enclosure is systematized and supported by a numerical analysis of the em field distribution on the inner enclosure walls. for the numerical simulations, the tlm method is used, aiming to estimate which absorber position inside the enclosure has the greatest impact. in this way, the amount of absorber and its precise position can be determined in advance. in the experimental and numerical studies, the position of a thin emi absorber sheet on one or more inner enclosure walls is considered, focusing on the se behavior at the first resonance of the enclosure, although its placement might clearly affect the se at higher resonances as well. the paper is organized as follows. section 2 refers to the analytical calculation of enclosure modes. in section 3, the numerical tlm model of the enclosure is described. in section 4, the experimental set-up and measurement procedure are described. section 5 presents a physical model of the enclosure with the emi absorber material and with a receiving antenna inside. section 6 provides a discussion of the experimental results. finally, section 7 summarizes the work.
2. analytical calculation of enclosure modes
in this section, a rectangular metal enclosure with one aperture on the frontal enclosure wall is described. the enclosure has dimensions of (300 x 300 x 120) mm3. a rectangular slot aperture with dimensions of (100 x 5) mm2 is positioned symmetrically around the centre of the frontal enclosure wall.
the thickness of all enclosure walls is t = 1.5 mm. the enclosure is made of copper. to start with, the metal enclosure can be analysed as a resonator cavity. a waveguide is a type of transmission line; a resonator can be constructed from closed sections of waveguide [19]. when a waveguide is short-circuited at both ends, a closed metal box or cavity is obtained. inside the cavity, electric and magnetic energy can be stored and power can be dissipated in the metallic walls of the cavity [19]. usually, coupling to the resonator is obtained by small aperture(s), a small probe or a small loop. in this paper, an aperture on the frontal enclosure wall is used for coupling to the enclosure, while a small probe, a monopole antenna, is employed for measuring the distribution of the em field inside the enclosure. according to the analytical equation [19], the te and tm modes which occur in the considered enclosure are calculated and given in table 1.
table 1 all the resonant te and tm modes occurring in the considered enclosure, in the observed frequency range
resonant frequency mode f(m,n,l)    ghz
f110                                0.707
f101 = f011                         1.346
f111                                1.436
f201 = f021                         1.601
f120 = f210                         1.118
f211 = f121                         1.677
f220                                1.414
f221                                1.887
3. numerical model of enclosure
before the experimental procedure is conducted, a numerical model of the considered enclosure is designed by using the tlm method as a numerical modelling technique [5]. it is created to resemble the experimental procedure. the tlm compact wire model is very suitable for modelling an antenna inside an enclosure whose purpose is to measure the em field level and its distribution [6]. this wire model is based on a wire segment incorporated into the standard tlm symmetrical condensed node (scn). the impedances of the additional wire network link and short-circuit stub lines depend on the space and time-step discretization used, and also on the per-unit-length wire capacitance and inductance [20], [21].
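the entries of table 1 can be checked against the standard expression for the resonant frequencies of an ideal, closed rectangular cavity, f(m,n,l) = (c/2)·sqrt((m/a)² + (n/b)² + (l/d)²). the short sketch below assumes that this is the analytical equation referred to in [19], that a = b = 300 mm and d = 120 mm, and that the rounded value c = 3·10^8 m/s is used (with this value the numbers of table 1 are reproduced); the function name is hypothetical, and the aperture and wall thickness are ignored.

```python
import math

C0 = 3.0e8  # speed of light in m/s (rounded value, an assumption)

def cavity_resonance_hz(m, n, l, a, b, d, c=C0):
    """Resonant frequency of mode (m, n, l) in an ideal a x b x d cavity."""
    return 0.5 * c * math.sqrt((m / a) ** 2 + (n / b) ** 2 + (l / d) ** 2)

if __name__ == "__main__":
    a, b, d = 0.300, 0.300, 0.120  # enclosure dimensions in metres
    modes = [(1, 1, 0), (1, 0, 1), (1, 1, 1), (2, 0, 1),
             (1, 2, 0), (2, 1, 1), (2, 2, 0), (2, 2, 1)]
    for m, n, l in modes:
        f_ghz = cavity_resonance_hz(m, n, l, a, b, d) / 1e9
        print(f"f{m}{n}{l} = {f_ghz:.3f} ghz")
```

because a = b, modes with the first two indices swapped are degenerate, which is why table 1 lists pairs such as f101 = f011.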
in the numerical model of the enclosure, denoted d, a monopole antenna is placed inside. the tlm compact wire model is used to describe the monopole antenna as a wire conductor with a length of l = 60 mm and a radius of r = 0.1 mm, placed in the middle of the enclosure [6], as shown in fig. 1. the antenna is also connected to the ground via a resistor r. the slot aperture on the front wall of the enclosure is described by several nodes across each cross-section dimension. the external em field, represented as a vertically polarized incident plane wave, penetrates into the enclosure through the aperture and induces a current on the wire. further, a voltage is generated on the resistor r, which is loaded at the wire base. this allows the level of the em field inside the enclosure to be measured numerically [6], [18]. table 2 presents the first three resonant frequencies (the excited tm modes) obtained by analytical calculation for a rectangular resonator (enclosure without aperture/slot) and by numerical simulations. the first three modes for the enclosure with the slot aperture, both empty and with the monopole antenna, are obtained by tlm numerical calculations. the numerical se results obtained for the empty enclosure and the one with a monopole receiving antenna are presented in fig. 2. it can be observed that the first resonant frequency is shifted toward lower frequencies in the presence of the monopole antenna. the analysis with different antenna radii and different antenna lengths is given in [6]. the frequency shift can be explained by perturbation theory, according to which, when a volume ∆v is put inside the resonator, the total interior volume decreases by ∆v, which affects the position of the resonant frequency in that enclosure [19], [6].
fig. 1 enclosure d with one rectangular aperture on the front wall, excited by a normally incident, vertically polarized plane wave [6]
fig. 2 comparison of the first resonant frequency peaks of enclosure d with and without monopole antenna (tlm simulations) [6]
table 2 the first three resonant frequencies in enclosure d
tem mode    analytical calculation     empty enclosure [6]      enclosure with monopole r = 0.1 mm [6]
tm110       f110 = 707.107 [mhz]       fr1 = 703.059 [mhz]      fr1 = 688.496 [mhz]
tm120       f120 = 1118.03 [mhz]       fr2 = 1101 [mhz]         fr2 = 1099 [mhz]
tm130       f130 = 1581.138 [mhz]      fr3 = 1608 [mhz]         fr3 = 1444 [mhz]
4. experimental procedure and setup
the experimental procedure for the equipment under test (eut) is described in this section. the measurements are conducted in a semi-anechoic measuring site, lined with rf absorbers within an ordinary laboratory space, in the laboratory for hf measurements at the ferit faculty in osijek, croatia. in order to determine the se of the enclosure, the measuring procedure has to be performed twice, without and with the enclosure. the se of the considered enclosure is measured with a network analyzer, using the s21 parameters. the transmission parameters of the measurements without and with the enclosure are marked as s21n and s21e, respectively [8]. the se can be determined as follows: se [db] = s21n − s21e. (2) figure 3 illustrates the measuring configuration used in the semi-anechoic room. a broadband dipole antenna of the vivaldi type was used as the transmitting antenna in the experimental set-up. a vector network analyzer (vna), the keysight fieldfox rf analyzer n9914a 6.5 ghz, with a maximum power of 3 dbm and a resolution of 100 hz, was employed as the measuring device. the vivaldi antenna was connected to the vna via a coax cable. further, the vna was connected to the receiving antenna via a coax cable. an in-house monopole antenna was employed as the receiving antenna, and it is placed inside the tested enclosure d [17], [18].
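relation (2) and the search for the first resonance can be illustrated with a short sketch operating on two s21 traces in db. it is only an illustration on synthetic traces: the function names, the frequency grid, the search window and the shape of the synthetic resonance are assumptions, and no instrument-specific file format or vna api is implied.

```python
import numpy as np

def shielding_effectiveness_db(s21_without_db, s21_with_db):
    """SE [dB] = s21n - s21e, both traces given in dB on the same frequency grid."""
    return np.asarray(s21_without_db, dtype=float) - np.asarray(s21_with_db, dtype=float)

def first_resonance(freq_hz, se_db, f_lo_hz, f_hi_hz):
    """Frequency and SE value of the SE minimum inside [f_lo_hz, f_hi_hz]."""
    freq = np.asarray(freq_hz, dtype=float)
    se = np.asarray(se_db, dtype=float)
    window = np.flatnonzero((freq >= f_lo_hz) & (freq <= f_hi_hz))
    i = window[np.argmin(se[window])]
    return freq[i], se[i]

if __name__ == "__main__":
    # Synthetic traces: a flat reference and an enclosure trace with one
    # resonance peak near 686 MHz (values chosen only for illustration).
    freq = np.linspace(600e6, 2e9, 1401)
    s21n = np.full_like(freq, -40.0)
    s21e = -70.0 + 45.0 * np.exp(-((freq - 686e6) / 5e6) ** 2)
    se = shielding_effectiveness_db(s21n, s21e)
    print(first_resonance(freq, se, 600e6, 800e6))
```

applying the same two functions to the trace of a second configuration would give the frequency shift and the se difference at the first resonance as simple differences of the returned values.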
in the measurement process, a vertically polarized vivaldi antenna is used as the excitation source [21], as depicted in fig. 3. the vivaldi antenna has a frequency range of 600 mhz to 6 ghz, while the receiving antenna is a very thin in-house monopole. the monopole antenna is placed in the middle of the enclosure in order to measure the level of the em field inside. all measurements are performed in the frequency range from 600 mhz to 2 ghz, in the far field. the enclosure used in the experiments is made of copper, with internal dimensions of (300 x 300 x 120) mm3. the in-house monopole antenna is also made of copper, with a length of l = 60 mm and a radius of r = 0.15 mm. figure 4 presents a photograph of the measuring configuration used for obtaining the experimental results in the semi-anechoic room [21]. the se results of the enclosure with the monopole antenna obtained by the measurements and by the numerical simulations are compared and presented in fig. 5. an excellent match between the measured and the simulated curves can be observed.
fig. 3 the sketch of the measuring set-up: transmitting antenna, vna and eut (enclosure under test d) [21]
fig. 4 photograph of the measuring configuration used in a semi-anechoic room: transmitting antenna, vna and eut (enclosure d with a receiving antenna)
fig. 5 the comparison between measurements and numerical simulation results of the se of enclosure with monopole antenna
5. emi absorber
the 3m™ emi absorber ab7050 from the ab7000 series [1], [8], [22] is used in the measurements conducted in this paper. one side of the absorber sheet consists of a flexible polymer resin loaded with soft metal flakes, and the other side is covered by an acrylic pressure-sensitive adhesive, which allows for easy application [8], [22].
this absorber is typically used for applications in a wide frequency range, from 50 hz up to 10 ghz. it is a broadband emi absorber designed to work in near-field applications inside and around electronic devices and assemblies [22]. this absorber is as thin as a sheet of paper, with a backing thickness of 0.5 mm and an adhesive thickness of 0.05 mm, so it does not occupy significant space inside the enclosure [22]. the emi absorber used in the experimental analysis is cut to fit the inner sides of the enclosure. the experimental procedure is conducted for eight cases (configurations). firstly, the enclosure without the emi absorber is measured and its se characteristic is obtained. secondly, the emi absorber is placed on the lower wall of the enclosure; this case is denoted lw. its dimensions correspond to the inner dimensions of the lower enclosure wall. the third case refers to the absorbers on the two side walls of the enclosure (denoted 2sw). the absorber is cut to fit the side walls of the enclosure, which measure (297 x 120) mm2. in the fourth case, the absorber is placed on the wall opposite to the front wall with the aperture, the so-called back wall (denoted bcw). in the fifth case, the absorbers are placed at the same time on the lower wall and on the two side walls (denoted lw+2sw, as in fig. 6). in the sixth case, the absorbers are put on the lower and back enclosure walls (case lw+bcw). in the seventh case, the absorbers are put on the lower and upper enclosure walls (case lw+up). finally, the eighth case combines all the above-mentioned absorber positions. this case is called lw+2sw+bcw+up.
fig. 6 photograph of the physical model of metal enclosure d with the emi absorbers on the lower wall and both side walls of the enclosure (lw+2sw)
6. discussion of results
the experimental results of the shielding characteristics of the considered enclosure with the emi absorbers inside are presented in this section.
to start with, the se results are obtained from the transmission parameters measured without and with the enclosure. in fig. 7, and also in the subsequent figures, the se results for the enclosure without absorber (empty enclosure with only the receiving antenna inside) are given as a reference. in fig. 7 they are compared with the configuration with the absorber on the lower enclosure wall. it can be observed that the presence of the absorber inside the enclosure gives a significant improvement over the case without it, especially at the resonant frequencies. away from the resonant frequencies, both se curves are very similar in shape and se level. it can therefore be seen that, in the presence of the absorber, all the peaks at the resonant frequencies are damped. table 3 presents the se values at the first enclosure resonance for the different absorber configurations. the first resonant frequency of the empty enclosure occurs at 686 mhz, where the se value is equal to -15.45 db. a negative se value might compromise the shielding property of the enclosure. by putting the emi absorber on the lower wall of the enclosure, the first resonant frequency moves to 696 mhz and a positive se value of 9.87 db is obtained. comparing the enclosure with the lw absorber to the one without it, a frequency shift (δfr1) of 10 mhz is obtained, as shown in fig. 7. moreover, at the first resonant frequency the difference between the se levels (δse) is 25.32 db. therefore, the first resonance shifts toward higher frequencies in the presence of the emi absorber.

secondly, fig. 8 compares the measurement results of the enclosure with the emi absorbers placed on the two side walls to the empty enclosure case. although the se curves look similar in fig.
8 and do not differ much in shape over the whole frequency range, the se levels still differ at the resonant frequencies around 700 mhz, 1100 mhz and 1650 mhz. the absorber is very efficient at lower frequencies, while it is weaker at higher frequencies in the observed range. additional te and/or tm modes are not established inside the enclosure, since very thin emi absorbers were employed inside it. also, table 3 shows that the frequency shift of the first resonance, δfr1, is 8 mhz, while the difference between the se levels is 24.15 db.

for the fourth case (bcw), the absorber is placed on the back wall inside the enclosure. the results are depicted in fig. 9. it can be seen that the difference between the se levels (δse) is 20.9 db and the frequency shift (δfr1) of the first resonance without and with the absorber is 8 mhz, as depicted in fig. 9 and in table 3.

fig. 7 the se measurement results of the enclosure without absorber and with the absorber placed on the lower wall (case lw)

fig. 8 the measurement results for the se of the enclosure with the absorber placed on two side walls (case 2sw)

fig. 9 the measurement results for the se of the enclosure with the absorber placed on the wall opposite to the frontal enclosure wall (case bcw)

in order to consider how absorbing material placed in different positions inside the enclosure affects the shielding characteristic, tlm simulations are conducted. the em field distribution is shown on different inner wall surfaces of the enclosure without absorber. figure 10 presents the em field distribution on the lower wall inside the enclosure. further, figs. 11 and 12 present the em field distribution on the left-side wall and on the back wall inside the enclosure, respectively.
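the positions of the resonances mentioned above can be roughly cross-checked against the textbook formula for an ideal, closed rectangular cavity. the sketch below is only an estimate under stated assumptions: it ignores the aperture, the monopole and the absorber, and it assumes that the lowest modes have no field variation along the 120 mm dimension. the function name and the mode list are illustrative.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def cavity_resonance_hz(m: int, n: int, p: int, a: float, b: float, d: float) -> float:
    """Resonant frequency of mode (m, n, p) in an ideal closed a x b x d cavity."""
    return (C0 / 2.0) * math.sqrt((m / a) ** 2 + (n / b) ** 2 + (p / d) ** 2)


# Internal enclosure dimensions from the text, converted to metres.
A, B, D = 0.300, 0.300, 0.120

if __name__ == "__main__":
    # Assumed mode set: no variation along the 120 mm dimension (p = 0).
    for m, n in [(1, 1), (1, 2), (2, 2), (1, 3)]:
        f_hz = cavity_resonance_hz(m, n, 0, A, B, D)
        print(f"mode ({m},{n},0): {f_hz / 1e6:.0f} MHz")
```

under these assumptions the lowest mode comes out near 707 mhz, somewhat above the measured first resonance of 686 mhz listed in table 3, which is plausible for an enclosure loaded by an aperture and a receiving antenna.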
it can be observed that the em field distribution is not uniform and that placing the absorber on the lower wall might have the strongest influence on the se characteristic among the three considered positions. therefore, an absorber on the lower enclosure wall is included in all further considered cases.

fig. 10 the em field distribution on the lower wall inside the enclosure, obtained by the numerical model

fig. 11 the em field distribution on the left-side wall inside the enclosure, obtained by the numerical model

fig. 12 the em field distribution on the back wall inside the enclosure, obtained by the numerical model

in fig. 13, the se measurement results are compared for the first, the fifth, the sixth and the seventh configurations. for the fifth case (lw+2sw), the se curve does not differ much in shape from that of the empty enclosure (the first case) over the whole frequency range, but the se values differ at the resonant frequencies, especially above 1400 mhz. the absorber efficiency is very good at lower frequencies, while it is a bit weaker at higher frequencies in the observed range. for this configuration, the difference between the se values is 30.54 db, while the frequency shift δfr1 is 10 mhz. the fifth, the sixth and the seventh configurations have the same frequency shift, see table 3.

further, for the sixth configuration (lw+bcw), the difference between the se levels (δse) at the first resonance for this case and the case without absorbers is 29.2 db. at higher frequencies, above 1700 mhz, the presence of the absorbers in this case leads to a decrease of the se, as depicted in fig. 13.
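the δse values quoted in this section can be reproduced from the se levels at the first resonance listed in table 3. the short sketch below, with illustrative names, subtracts the empty-enclosure level from the level measured with the absorber in place.

```python
# SE at the first resonance, in dB, as listed in table 3.
SE_EMPTY_DB = -15.45
SE_WITH_ABSORBER_DB = {
    "lw": 9.87,
    "2sw": 8.70,
    "bcw": 5.45,
    "lw+2sw": 15.09,
    "lw+bcw": 13.75,
    "lw+up": 13.51,
    "lw+2sw+bcw+up": 19.30,
}


def delta_se_db(se_with_absorber_db: float, se_empty_db: float = SE_EMPTY_DB) -> float:
    """Improvement of SE relative to the empty enclosure, rounded to 0.01 dB."""
    return round(se_with_absorber_db - se_empty_db, 2)


if __name__ == "__main__":
    for case, se_db in SE_WITH_ABSORBER_DB.items():
        print(f"{case:>14}: delta SE = {delta_se_db(se_db):.2f} dB")
```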
for the seventh case (lw+up), the difference between the se levels (δse) for this case and the case without absorbers is 28.96 db at the first resonance. at higher frequencies, above 1700 mhz, the presence of the absorbers leads to an increase of the se for this configuration, as depicted in fig. 13. it can be observed that all compared cases have a similar shape of the characteristic. according to table 3, case lw+2sw has the highest se value at the first resonance among them, while case lw+up has significantly higher se levels at higher frequencies, above 1700 mhz.

fig. 13 the comparison of se measurement results for the enclosure without absorber, case lw+2sw, case lw+bcw and case lw+up

fig. 14 the comparison of the measured se of the enclosure around the first resonant frequency: without absorber (empty), case lw+2sw, and case lw+up+bcw+2sw

finally, fig. 14 compares the se characteristics of the enclosure without absorber, of the fifth case (lw+2sw) and of the configuration where all inner wall surfaces are coated with absorbers, the eighth case (lw+up+bcw+2sw). the eighth case clearly has a better se level at the first resonant frequency, but it requires a significant amount of absorber material. in practical applications, case lw+2sw is the most economical one, since significant absorber effects are achieved by using absorber material only on three walls (the lower wall and the two side walls).

table 3 the se values at the first enclosure resonance

emi absorber position   fr1_meas [mhz]   se_meas [db]   δfr_meas [mhz]   δse_meas [db]
empty                   686              -15.45         -                -
lw                      696              9.87           10               25.32
2sw                     694              8.70           8                24.15
bcw                     694              5.45           8                20.9
lw+2sw                  696              15.09          10               30.54
lw+bcw                  696              13.75          10               29.2
lw+up                   696              13.51          10               28.96
lw+2sw+bcw+up           696              19.30          10               34.75

7. conclusion

eight configurations of the enclosure, without absorber and with different positions of the absorbers inside, are considered in this experimental shielding effectiveness study, supported by numerical analysis. a significant se level improvement of 30.54 db at the first resonant frequency, compared to the empty enclosure case, is obtained for case lw+2sw. case lw+2sw+bcw+up gives a further improvement of 4.21 db with respect to case lw+2sw; however, it requires a significantly higher amount of absorber material. overall, the technique of using a thin absorber can improve the shielding properties of an enclosure, but a numerical study would be beneficial for estimating its effect in different positions. therefore, a numerical model of the emi absorber will be a focus of future research. in addition, the presence of the emi absorber may also influence the se peaks at higher frequencies, and this will be explored further.

acknowledgement: this work has been supported by the euroweb+ project, by the cost ic 1407 and by the ministry of education, science and technological development of republic of serbia (grant no. 451-03-68/2022-14/ 200102).

references

[1] n. nešić, s. rupčić, v. mandrić-radivojević and n. dončov, "experimental analysis of a metal enclosure shielding effectiveness improvement with emi absorber", in proceedings of the 15th international online conference on applied electromagnetics пес 2021, niš, 2021, pp. 98–101.
[2] c. christopoulos, principles and techniques of electromagnetic compatibility, 2nd ed. crc press, 2007.
[3] h. a. mendez, "shielding theory of enclosures with apertures", ieee trans. electromagn. compat., vol. 20, no. 2, pp. 296–305, may 1978.
[4] p. m. robinson, m. t. benson, c. christopoulos, f. j. dawson, d. m. ganley, c. a. marvin, j. s. porter and p. w.
thomas, "analytical formulation for the shielding effectiveness of enclosures with apertures", ieee trans. electromagn. compat., vol. 40, no. 3, pp. 240–248, august 1998.
[5] c. christopoulos, the transmission-line modelling (tlm) method. piscataway, new jersey: wiley-ieee press in association with oxford university press, may 1995.
[6] n. j. nešić and n. dončov, "shielding effectiveness estimation by using monopole-receiving antenna and comparison with dipole antenna", frequenz, vol. 70, no. 5-6, pp. 191–201, april 2016.
[7] n. j. nešić, numerical and experimental analysis of aperture arrays impact on the shielding effectiveness of metal enclosures in microwave frequency range, doctoral thesis, in serbian, singidunum university, belgrade, 2017.
[8] n. j. nešić, s. rupčić, v. mandrić radivojević and n. dončov, "experimental analysis of electromagnetic interferences absorber influence on metal enclosure immunity", in proceedings of the 8th international conference on electrical, electronic and computing engineering (icetran). bosnia and herzegovina, 2021, pp. 383–386.
[9] x. luo and d. d. l. chung, "electromagnetic interference shielding using continuous carbon-fiber carbon-matrix and polymer-matrix composites", elsevier science, compos. b eng., vol. 30, no. 3, pp. 227–231, april 1999.
[10] r. kumar, s. r. dhakate, p. saini and r. b. mathur, "improved electromagnetic interference shielding effectiveness of light weight carbon foam by ferrocene accumulation", the roy. soc. of chem. 2013: rsc advances, vol. 3, pp. 4145–4151, january 2013.
[11] t. k. gupta, b. p. singh, r. b. mathur and s. r. dhakate, "multi-walled carbon nanotube–graphene–polyaniline multiphase nanocomposite with superior electromagnetic shielding effectiveness", the royal society of chemistry 2014: nanoscale, vol. 6, pp. 842–851, 2014.
[12] f. costa, s. genovesi, a. monorchio and g. manara, "a circuit-based model for the interpretation of perfect metamaterial absorbers", ieee trans.
antennas and propag., vol. 61, no. 3, pp. 1201–1209, march 2013.
[13] b. a. munk, frequency selective surfaces theory and design, new york: john wiley and sons, inc., 2000.
[14] f. qin and c. brosseau, "a review and analysis of microwave absorption in polymer composites filled with carbonaceous particles", j. appl. phys., vol. 111, p. 061301, march 2012.
[15] a. ameli, p. u. jung and c. b. park, "electrical properties and electromagnetic interference shielding effectiveness of polypropylene/carbon fiber composite foams", elsevier: carbon, vol. 60, pp. 379–391, august 2013.
[16] j. paul, s. greedy, h. wakatsuchi and c. christopoulos, "measurements and simulations of enclosure damping using loaded antenna elements", in proceedings of the ieee 10th international symposium on electromagnetic compatibility, york, 2011, pp. 676–679.
[17] n. nešić, b. milovanović, n. dončov, v. mandrić-radivojević and s. rupčić, "improving shielding effectiveness of a rectangular metallic enclosure with aperture by using printed dog-bone dipole structure", in proceedings of 52nd international scientific conference on information, communication and energy systems and technologies (icest), niš, 2017, pp. 97–100.
[18] n. j. nešić, b. g. milovanović, n. s. dončov, s. m. rupčić and v. mandrić-radivojević, "improving shielding effectiveness of a metallic enclosure at resonant frequencies", in proceedings of the ieee 13th international conference on advanced technologies, systems and services in telecommunications (telsiks), niš, 2017, pp. 42–45.
[19] d. m. pozar, microwave engineering, 4th ed. wiley, 2012, chapters 2-3, pp. 48–162.
[20] v. trenkić, a. j. wlodarczyk and r. scaramuzza, "a modelling of coupling between transient electromagnetic field and complex wire structures", int. journal of num. modelling, vol. 12, no. 4, pp. 257–273, july/august 1999.
[21] n. j. nešić, slavko s. rupčić, nebojša s.
dončov, vanja mandrić-radivojević, "experimental shielding effectiveness analysis of metal plate influence inside an enclosure with aperture", in proceedings of the ieee 14th international conference on advanced technologies, systems and services in telecommunications (telsiks), niš, 2019, pp. 190–193.
[22] https://multimedia.3m.com/mws/media/960654o/3m-emi-absorber-ab7000hf-series-halogen-free.pdf

facta universitatis
series: electronics and energetics vol. 35, no 2, june 2022, pp. 217-228
https://doi.org/10.2298/fuee2202217k
© 2022 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper

implementation of voice call transfer service between smart phone and tablet through wi-fi

durga sowjanya kolluru¹, bhaskara reddy puchakayala²
¹department of electronics and communication engineering, mlr institute of technology, hyderabad, india
²holy mary institute of technology and science, hyderabad, india

abstract. communication through voice calls has led to significant growth in technology that connects two or more people at opposite ends of the world. this research describes a case study of a voice call transfer service. it aims at designing a system that allows android users to communicate over wi-fi. the design is able to transfer the voice of an incoming telephone caller over a wi-fi network in real time through udp. it uses a client/server architecture: the server receives the telephone call and transfers the voice (one user), and the client receives the incoming caller's voice and communicates with the server. the architecture can be used on android smart phones with telephony enabled and on tablets without telephony. the outcome of this research allows users to communicate in real time at no cost. the proposed design gives cost effective, reliable and real time voice communication over wi-fi.
it provides a good and comfortable experience to users in emergency situations where the user cannot afford the cost of a telephone call. the proposed design is useful for educational organizations, construction buildings, shopping malls and hospitals, which points to new possibilities for voice communication.

key words: voice call, voice call transfer service, real time, client/server architecture, incoming caller, telephony.

1. introduction

usage of wi-fi enabled mobile phones to access internet is increasing day by day. it would be a very valuable experience to provide communication in a secure manner with off-the-shelf devices without using internet [1]. the proposed research implements a design that requires no infrastructure: direct communication between devices, one of which is telephony enabled while the other is not. it offers low cost and fast installation for direct wireless voice communication.

received july 20, 2021; received in revised form november 4, 2021
corresponding author: durga sowjanya kolluru
department of electronics and communication engineering, mlr institute of technology, hyderabad, india
e-mail: k.durgasowjanya@gmail.com

in a business organization or university campus, employees need to communicate with each other very frequently, which consumes a large part of the budget on internet or telephony bills. reducing the cost of phone calls or internet data usage is therefore an important issue. when a large building is on fire and no network is available, the way in which stranded people can communicate with people outside becomes an urgent issue. this applies not only to fire accidents: all people stranded by a disaster, i.e. fire, earthquake, tsunami, volcanic eruption or debris flow, need to communicate with the outside urgently while regular communication facilities such as phone or internet facilities are destroyed. handling these emergency circumstances is very critical [2].
the concept of voice communication over wi-fi relies on wireless communication through the free 2.4 ghz channel [3]. the wireless communication infrastructure represents the core for information sharing between connected devices [4]. a router can be used to enable wi-fi on devices so that they can communicate within the network range. the proposed research uses a wlan for implementing the required design. wireless lan technology makes it possible to change the network infrastructure of an organization without expensive re-routing of cable or installation of new cable [5]. voice over internet protocol (voip) is a communication protocol that uses packet based ip communication to carry digitized voice [6]. ip addresses are therefore very important, because udp socket programming for voice communication works through ip addresses and ports. hence, an internet based server is not necessary in the proposed research [7].

1.1. client-server model

when several computers and resources are available to each other and connections are established between them, this is called a network. in a network, some devices receive information from others and some send information to others. a device which receives information is called a “client” and a device which sends information is called a “server”. the proposed research implements the client-server model [8] for transferring voice between the telephony caller and the wi-fi user, with the smart phone as server and the android tablet as client. it implements a real time voice call transfer service between smart phone and tablet through wi-fi at no cost from telephone and internet service providers. figure 1 shows the basic idea of the proposed implementation. as depicted in the figure, a telephony subscriber needs to establish a connection with another subscriber (smart phone user) through the telephone network.
connection establishment is shown by a text field displaying the caller's identity; the recipient number is entered and the call recipient is notified about an incoming call. after the call is received, a wireless connection is established, and the call receiving subscriber is registered to the wi-fi network and updated with the other registered users in network range. then the user without telephony (tablet user) gets connected with the smart phone user through their ip addresses. the captured voice of the incoming caller is then transferred from the smart phone to the tablet in real time, and the voice of the tablet user is sent to the smart phone [9]. hence, real time communication is established between the incoming caller and the tablet user.

fig. 1 idea of proposed application

1.2. voice codec

figure 2 describes the process of the voice codec for wireless lan. as depicted in the figure, the speech signal first has to be digitized at the sender before it is transmitted over packet switched networks. the reverse process is performed at the receiver. the digitization process consists of sampling, quantization and encoding. encoding techniques used in wireless networks include g.711, g.729 and g.723.1. the encoded speech is then packetized into packets of equal size. each packet consists of headers and a payload of a certain duration, which depends on the codec deployed in the application [10].

fig. 2 process of voice codec for wireless lan

1.2.1. g.711 codec

in wireless networks, g.711 encodes the telephone audio signal at 64 kbps, with a sample rate of 8 khz and 8 bits per sample. in an ip network, voice is converted into packets carrying 5, 10 or 20 ms of sampled voice, and these samples are encapsulated in the packet.

1.2.2. g.723.1 codec

the g.723.1 codec belongs to the algebraic code excited linear prediction (acelp) family and has two bit rates associated with it: 5.3 kbps and 6.3 kbps.
the encoder includes voice activity detection and comfort noise generation (vad/cng), and the decoder is capable of accepting silence frames. the encoder operates on speech frames of 30 ms, corresponding to 240 samples at a sampling rate of 8000 samples/s, and the total algorithmic delay is 37.5 ms. it offers good speech quality under network impairments such as frame loss and bit errors. it is suitable for voip applications.

1.2.3. g.729 codec

the g.729 codec belongs to the code excited linear prediction (celp) coding family and uses the conjugate structure-algebraic code excited linear prediction (cs-acelp) model. it was specially designed for wireless applications at a fixed 8 kbit/s output rate, but it does not include channel coding. it works on frames of 80 speech samples (10 ms) and has a lookahead delay of 5 ms. the total algorithmic delay is 15 ms.

figure 3 shows the architecture of the voice call transfer service through wi-fi. the proposed architecture is divided into two phases. the first phase covers detecting and receiving the incoming call and the registration of users. out of the registered users, one acts as server and the other acts as client. the connection between registered users is achieved by connecting them to the same wi-fi router. each user is added to the wi-fi network through specific ports assigned by the programmer. therefore, there is no need for any separate server. both devices are connected through their ip addresses, and users are added or removed through packets sent from a udp port. in the second phase, the voice of the incoming caller is captured and transferred through wi-fi from the smart phone to the tablet. udp based socket programming is used to implement this.

fig. 3 architecture of proposed application

figure 4 shows real time audio encoding and audio decoding. it enables voice communication at no cost. real time communication is established between the devices through specific ports assigned by the programmer.
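the codec figures quoted in sections 1.2.1 to 1.2.3 can be cross-checked with simple arithmetic. the sketch below is illustrative only: the helper names are not part of any codec library, and real packets additionally carry ip and udp headers that are ignored here.

```python
def samples_per_frame(sample_rate_hz: float, frame_ms: float) -> float:
    """Number of speech samples contained in one frame."""
    return sample_rate_hz * frame_ms / 1000.0


def payload_bytes(bitrate_bps: float, frame_ms: float) -> float:
    """Encoded payload size, in bytes, for one frame at a given bit rate."""
    return bitrate_bps * frame_ms / 1000.0 / 8.0


if __name__ == "__main__":
    g711_bitrate_bps = 8000 * 8  # 8 kHz sampling x 8 bits per sample
    print("g.711 bit rate [bps]:", g711_bitrate_bps)
    for frame_ms in (5, 10, 20):
        print(f"g.711 payload for {frame_ms} ms:", payload_bytes(g711_bitrate_bps, frame_ms), "bytes")
    print("g.723.1 samples per 30 ms frame:", samples_per_frame(8000, 30))
    print("g.729 samples per 10 ms frame:", samples_per_frame(8000, 10))
    print("g.729 payload per 10 ms frame:", payload_bytes(8000, 10), "bytes")
```

the same helpers show why lower bit rate codecs are attractive on a shared wireless link: a 10 ms g.729 frame occupies one eighth of the payload of 10 ms of g.711 audio.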
the steps in audio encoding are: recording the voice, encoding, storing in a buffer, packet framing and transfer through the port. the steps in audio decoding are: receiving through the port, storing in a buffer, reframing the packet, decoding and playing the voice.

fig. 4 process of real time audio encoding and audio decoding

2. research background

the latest smart phones feature wi-fi connectivity. the number of smart phone users of wi-fi services increases every year [11]. usage of wi-fi eliminates the cost charged by service providers for short-distance calls. wi-fi calls combine voice and data into a single signal by digitizing the raw voice signals. this combined signal is converted into ip packets and sent through wi-fi, which replaces the existing telephony network [12]. this entire process is facilitated by voip. nowadays, voice telephony over mobile is a growing technology because of the cost of conventional telephony service. wi-fi allows data and voice transmission within its coverage area. voice over internet protocol (voip) provides communication between users connected to wi-fi through internet. voip is a process of exchanging voice between caller and callee through wi-fi/internet. transforming voice in step with the evolution of telecommunication makes it more advantageous. the increasing demand for the service, together with its broadband infrastructure, has led to the development of economical ip-phone-like equipment [13]. the ip phone was used as an interface between the telephony network and the ip network. its only drawback is that ip phones are of fixed type. voice communications through voip are more delay sensitive than error sensitive [14]. to address this issue, udp is used for real time voice [15]. udp is an unreliable, connectionless communication protocol that is well suited to real time voice processing. it operates in the transport layer for ip based applications. it provides peer to peer packet based communication.
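to illustrate the connectionless, packet based exchange described above, the following minimal sketch sends a few frames over udp on the local loopback interface. it is not the authors' android implementation: the 2-byte sequence-number header, the function names and the use of python are assumptions made for illustration only.

```python
import socket
import struct

# Assumed header: one 16-bit sequence number in network byte order.
HEADER = struct.Struct("!H")


def make_packet(seq: int, frame: bytes) -> bytes:
    """Prefix an encoded voice frame with its sequence number."""
    return HEADER.pack(seq & 0xFFFF) + frame


def parse_packet(packet: bytes) -> tuple:
    """Split a received packet into (sequence number, frame)."""
    (seq,) = HEADER.unpack_from(packet)
    return seq, packet[HEADER.size:]


def loopback_demo(frames: list) -> list:
    """Send each frame over UDP to a local receiver and collect what arrives."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        receiver.settimeout(2.0)         # UDP gives no delivery guarantee
        address = receiver.getsockname()
        received = []
        for seq, frame in enumerate(frames):
            sender.sendto(make_packet(seq, frame), address)
            packet, _ = receiver.recvfrom(2048)
            received.append(parse_packet(packet))
        return received
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    for seq, frame in loopback_demo([b"\x01" * 160, b"\x02" * 160]):
        print(seq, len(frame))
```

a sequence number is one simple way for a receiver to detect lost or reordered datagrams, since udp itself does not report them.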
it converts voice data into packets, which are communicated to the router through the wi-fi channel and sent to a device within the wi-fi range. udp moves communication from a circuit switched network to an ip packet switched network. udp has network applications in the domain name system (dns), simple network management protocol (snmp), dynamic host configuration protocol (dhcp) and routing information protocol (rip) [16]. due to its high speed and the low impact of lost packets, udp is suitable for real time voice streaming applications. existing research covers real time audio and audio streaming applications from one peer to another [17]. udp is used in tunnels which create a virtual link for direct connection between two locations that are distant in the physical network topology. voice processing for real time communication is done by encryption at the sender and decryption at the receiver through digital signals [18].

previous research explains voice codecs for voice processing. the international telecommunication union (itu) specifies the voice codecs g.711 and g.7xx for audio compression and decompression. each codec has a different packet size and hence performs differently. the research in [19] describes features of the voice codec g.711. it possesses a high bit rate (64 kbps) as per the itu standard, which is preferable for digital telephony and ip networks. its voice quality is high with low processor time, but it has the drawback of higher bandwidth utilization. some studies suggest a graphical user interface (gui) mode for voice processing applications. in [20], the authors explained a dual channel wireless model. it provides high reliability with low delay. in this model, the same packet is broadcast by multiple transmitters. hence, packet loss is decreased compared with a single channel transmitter [21]. to date, existing technologies support chatting, video chatting and calling from one device to another over internet.
communication protocols like voip work with the help of ip based protocols, which makes communication effective with less jitter. with such a system, wi-fi subscribers can call each other at no cost over internet. the proposed paper implements an application, using free source facilities and standards, that provides free voice call transfer without internet over wi-fi within a lan, from smart phone to tablet. it not only saves money on calling but also provides an effective way of communication that utilizes resources effectively. the proposed paper uses wlan as the communication medium to provide a voice call transfer facility at no cost within the wi-fi covered area [22].

3. methodology

the wi-fi router acts like a switch to detect the devices active in the wi-fi lan range. the proposed application presents a detailed implementation of the voice call transfer service through wi-fi using socket programming. in this work, the android platform has been chosen for developing and implementing the mobile application, with java as the programming language. it uses the required android apis to implement the proposed application. an android mobile application named “dst_mlrit_voice_call” has been developed for the proposed research. different modules are created, developed and run for the smart phone and the tablet. the app contains three sections: java code, xml code and the android manifest. these three sections are interconnected to meet the user requirements in the android gui [23]. the software package is developed in android studio for receiving the call, automatically acquiring the ip address of the tablet user and transferring the voice from the smart phone to the tablet [24]. java is used as the backbone of the programming because it is platform independent. java incorporates a number of features of object-oriented languages. java includes extensive libraries for multimedia, networking, multithreading, graphics and database access. it has unique attributes embedded into its design that allow software developers to design most web applications.
the embedded design features include oop, platform independence, high performance, multithreading and dynamic linking, which make programming complex applications rather simple and straightforward [25-26]. the application starts with the program file MainActivity.java, which describes the main functions of the app. the java code of the app starts with the method onCreate(). it initially sets up all variables and elements for the application. it provides the button declarations and the onclick methods of the buttons. in the gui, the main user interface layout file activity_main.xml provides the start-up screen. then, MainActivity.java calls ContactManager.java for the connection handling process. in this way, all programming files link the elements of the front end with the back end [27-28].

the proposed research designs a client-server based model to implement the voice call transfer service. the proposed implementation is designed as a mobile application in android studio. figure 5 shows the flow chart of the voice call transfer service. the telephony api in android keeps the app continuously in a listening state for incoming phone calls. whenever an incoming call is detected, the smart phone registers itself with a name in the wi-fi network and sends a request to the other users connected to the same network. the datagram socket api makes user registration easy, with add name and remove name parts using udp packets.

fig. 5 design flow of implementation of voice call transfer service

the name entered by the user is identified with the ip address of the device through the wi-fi manager api. the router forwards this request in the form of packets through udp based socket programming. the update button on the screen updates all users registered within the network. the udp socket continuously updates add and remove requests within the wi-fi network.
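the add name and remove name handling can be sketched as a small registry that maps user names to ip addresses. the message format used here ("ADD:name" and "REMOVE:name") and all function names are hypothetical; the paper does not specify the actual packet layout.

```python
def apply_message(registry: dict, payload: bytes, sender_ip: str) -> dict:
    """Update the name -> IP registry from one received datagram payload.

    The 'ADD:<name>' / 'REMOVE:<name>' text format is an assumption made
    for this sketch, not the format used by the application in the paper.
    """
    text = payload.decode("utf-8", errors="replace").strip()
    command, _, name = text.partition(":")
    if not name:
        return registry  # malformed message: ignore it
    if command == "ADD":
        registry[name] = sender_ip
    elif command == "REMOVE":
        registry.pop(name, None)
    return registry


if __name__ == "__main__":
    users = {}
    # Hypothetical names and private addresses, for illustration only.
    apply_message(users, b"ADD:phone_user", "192.168.1.10")
    apply_message(users, b"ADD:tablet_user", "192.168.1.11")
    apply_message(users, b"REMOVE:phone_user", "192.168.1.10")
    print(users)
```

in such a scheme the ip address is taken from the sender of the datagram rather than from the message body, which matches the description that the entered name is identified with the device ip address.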
when the telephony user (smart phone user) wants to transfer the call voice to another user (tablet user) in the same wi-fi network, the application continues by choosing that user from a drop down list on the smart phone. both the smart phone and the tablet are android devices. the proposed application first detects the incoming call. the incoming voice is received by the smart phone with the support of the android telephony api. the voice from the microphone of the smart phone is taken for processing and sent to the destination. the sampling rate is kept at 8 khz. 16 bit pcm is used to sample the voice. a buffer is used to store the sampled voice from the pcm. the voice codec api compresses and encodes the voice data for transmission over the low bit rate ieee 802.15.4 standard. the voice encoder converts the voice signal into frames and the decoder converts frames into a voice signal. a single frame contains the raw voice data of a 20 ms voice signal. the frames are given to the encoder, which compresses them and returns 38 bytes of encoded voice per frame. udp converts these frames into packets. udp socket based communication is used by the smart phone to communicate with the tablet through wi-fi. the smart phone detects the destination device, which is identified by its ip address through an auto discovery method. figure 6 shows the process of voice transfer between two users. the audio manager api uses the read method to send encoded data over wi-fi. real time communication is achieved by a thread that runs at a predefined time interval and reads raw voice from the microphone. socket programming provides packet based communication between the devices through wi-fi. the udp datagram protocol provides packet based communication for the tablet and the smart phone. as soon as the android devices start transferring the voice call, the voice codec compresses and encodes the user's voice into frames, which are stored in a buffer. the data in the buffer is converted into packets by udp and sent over wi-fi to the other user [25].

fig.
6 design flow of voice transferring between two users at receiving end, tablet received voice call through wi-fi notified as incoming call received from smart phone. received voice packets are converted into frames and stored in receive buffer. decoder in codec decodes frames in buffer into voice. decoded voice is played on tablet gui at real time. in the same manner, voice from tablet transferred to smart phone at real time. 4. research results proposed research is designed as mobile application named “dst_mlrit_voice_ call”. it was installed on samsung galaxy smart phone and lenovo android tablet which are connected to same wi-fi. designed application installed successfully on both devices. voice of incoming caller is transferred between smart phone and tablet successfully via wi-fi without consuming any cost from telephone and internet service providers. it is verified on samsung galaxy phone and samsung tablet. experiment have been conducted using android studio tools like import and run on samsung galaxy smart phone as implementation of voice call transfer service between smart phone and tablet through wi-fi 225 server and samsung tablet as client and is quite easy and quick set up. figure 7 (a) shows deployment of mobile app in smart phone. app launching page is shown in figure 7 (b), incoming phone call received on smart phone shown in figure 7 (c). (a) (b) (c) fig. 7 pages of mobile application on smart phone (a) application icon (b) application launching window (c) receiving incoming call application has process permissions for wifi connectivity, phone and audio. as conversation started, screen appears is shown in figure 8 (a). after starting conversation from caller, with wifi permission,users need to be registered. user registration screen with registered user name “sai” is shown in figure 8 (b). after user submit name, app automatically get ip address of registered user. update button will update all registered users. 
after updating, the list of registered users is displayed. the screen showing all registered users is shown in figure 8(c). after user registration, the received incoming call is transferred to the registered user. socket based programming is used for transferring the call, and packets are sent and received through udp. transferring voice from the smart phone through wi-fi is shown in figure 8(d).

fig. 8 pages of mobile application on smart phone: (a) start of outgoing call, (b) user registration page, (c) updating users page, (d) transferring incoming caller voice page

when a voice call is being transferred from smart phone to tablet, the tablet user needs to be registered on the same network in the application. the user registration screen on the tablet is shown in figure 9(a). for the proposed application, a user with the name "aneesh" is registered. after entering the name, the user can click on submit name. the update button on the screen updates all registered users, and the list of registered users is shown below the update button. a screen shot of the updated users list is shown in figure 9(b).

fig. 9 pages of mobile application on tablet: (a) launching application and user registration, (b) updating users

after user registration, the tablet user gets the incoming voice call transferred from the smart phone user. the voice call transfer from smart phone to tablet is shown in figure 10(a). the accept button is used to accept the call; the resulting screen is shown in figure 10(b). after completing the conversation, the end call button is used to end the call.

fig. 10 pages of mobile application on tablet: (a) receiving incoming caller voice from smart phone, (b) after accepting conversation

5. conclusion

the main motive behind the proposed research is to enable a cost-free and server-less real time voice call transfer service from smart phone to tablet without using the internet. it was implemented as a mobile application in a client-server model.
using an android apk, the proposed implementation was installed on the devices as an app. android apis are used for the software model of this design. wi-fi service and udp are used to provide real time voice communication between the incoming caller and the tablet through the smart phone. the major challenges at the sending side are capturing the received voice from the microphone, storing it in a buffer and transferring the voice to the destination device in real time. at the receiving side, the tablet is programmed to receive and send voice through udp ports. as udp does not reserve extra bandwidth, it does not slow down the network. therefore, it reflects the quality of a system that can be designed for voice communication over wi-fi. nowadays, the trend moves towards telecommunication with virtual offices and class rooms, which creates legitimate business cases for voice communication. the proposed research designs, implements and tests on devices a cost effective, server-less and reliable voice communication over wi-fi. the advantages of the proposed model are server-less, internet-less and cost-free real time communication without changing the infrastructure of the organization and without expensive re-routing or installation of new cables. it is designed, implemented and tested for one-to-one device communication. in future, it creates an opportunity for other developers and researchers to extend this service from one device to more devices.

acknowledgment: i would like to thank the department of science and technology (dst), india, under the wos-a scheme, for financial support. i would like to express my very great appreciation to p. bhaskara reddy, my project mentor, for the guidance, enthusiastic encouragement and useful critiques of this research work.

references

[1] p. shrikondawar, h. bharmal and a. kommera, "chat and file transfer android application using wifidirect", int. j. mod. trends eng. res. (ijmter), vol. 5, no. 5, pp. 1–6, may 2018.
[2] i. adabara, e. edozie, g. otiang okoth, o. stephen and k. susan, "implementation and analysis of a free wireless intercom system", int. j. academic inf. syst. res., vol. 3, no. 7, pp. 23–28, july 2019. [3] d. sowjanya kolluru and p. bhaskara reddy, "review on communication technologies in telecommunications from conventional telephones to smart phones", in proceedings of international conference on advances in signal processing, vlsi, communications and embedded systems (icsvce2021), hyderabad, india, april 2021, p. 2407. [4] k. durga sowjanya, ch. srinu , "instant message transfer between two smart phones using wi-fi", int. j. adv. eng. manag. sci. (ijaems),vol. 2, no. 12, pp. 1949–1951, dec. 2016. [5] d. sowjanya kolluru and p. bhaskara reddy, "development of voice call transfer service between android smart phone and tablet", revista geintec – gestao, inovacao e tecnologias, vol. 11, no. 2, june 2021. [6] s.v.s. prasad, t.s. savithri and i.v.m. krishna, "a new technique for color based image segmentation using support vector machines", in proceedings of the international conference on medical imaging, m-health and emerging communication systems (medcom), ieee, greater noida, india, 2014, pp. 189–192. [7] c. brouzioutis, v. vitsas and p. chatzimisios, "studying the impact of data traffic on voice capacity in ieee 802.11 wlans", in proceedings of the ieee international conference on communications (icc), 2010, pp.1–6. [8] m.a.r. siddique and j. kamruzzaman, "increasing voice capacity over ieee 802.11 wlan using virtual access points", in proceedings of the ieee conference on global telecommunications, 2010, pp. 1–6. [9] a. malhotra, v. sharma, p. gandhi and p. purohit, "udp based chat application", in proceedings of the 2nd international conference on computer engineering and technology (iccet), 2010, pp. v6-374–v6-377. [10] a. ratnaningsih, a. ida wuryandari and y. 
priyana, "the analyze of android’s microphone audio streaming beatme", in proceedings of the international conference on system engineering and technology, september 11-12, bandung, indonesia, 2012, pp.1–6. [11] a. mohd and o. lee loon, "performance of voice over ip (voip) over a wireless lan (wlan) for different audio/voice codecs", jurnal teknologi, vol. 47, pp. 39–60, dec. 2007. [12] s.v.s. prasad, t. satya savithri and i. v. murali krishna, "performance evaluation of svm kernels on multispectral liss iii data for object classification", int. j. smart sensing intell. syst., vol. 10, no. 4, pp. 829–844, jan. 2017. [13] s. venkatraman, s. natarajan and t.v. padmavathi, "voice calls over wi-fi", in proceedings of the world congress on engineering and computer science, 2009, san francisco, usa, pp. 1–5. [14] s. a. ahson, and m. ilyas, voip handbook applications, technologies, reliability, and security, boca raton, fl, crc press, new york. 2009. [15] "voice over internet protocol (voip)", available at: www.android.org. [16] d. sowjanya kolluru, p. bhaskara reddy, "ip to ip calling through socket programming", in proceedings of the ieee sponsored asian conference on innovation in technology (asiancon), pune, maharashtra, india, 2021, pp. 1–7. [17] javvin technologies, network protocols handbook (2nd ed)", saratoga ca 95070 usa, 2005. [18] l. roychoudhuri, e. al-shaer, h. hamed and g.b. brewster. "audio transmission over the internet: experiments and observations", in proceedings of the ieee international conference on communications, 2003, vol.1, pp. 552–556. [19] k. durga sowjanya and v. gayathri devi, "voice call between android devices using wireless sensors", int. j. adv. res. eng. technol., vol. 4, no. 4, april 2016. [20] s.v.s. prasad, "double block zero padding acquisition algorithm for gps software receiver", j. autom. mobile robot. intell. syst., vol. 12, no. 4, pp. 58–63, dec. 2018. [21] a. ramo and h.
toukomaa, "voice quality evaluation of recent open source codecs", in proceedings of the 11th annual conference of the international speech communication association, interspeech 2010, makuhari, chiba, japan, pp. 2390–2393. [22] s. woo cho, "p2p-based mobile social networks", in proceedings of the 10th international conference on p2p, parallel, grid, cloud and internet computing, 2015, pp. 141–145. [23] a. biradar, r. c. thool and r. velur, "voice transmission over lan using bluetooth", in proceedings of the ieee region 10 conference tencon, 2009, pp. 1–6. [24] p. aishwarya, v. dinesh, c. ramya, m. ramya and s. sathish kumar, "design and implementation of wi-fi based intercom system using arm 11", in proceedings of the 6th national conference on frontiers in communication and signal processing systems (ncfcsps '18), int. j. innov. res. sci. eng. technol., vol. 7, no. 1, march 2018. [25] j. ahmad shaheen et al, "android os with its architecture and android application with dalvik virtual machine review", int. j. multimedia ubiquitous eng., vol. 12, no. 7, pp. 19–30, july 2017. [26] m. sabri altemem, "voice chat application using socket programming", eur. acad. res., vol. 2, no. 5, aug. 2014. [27] d. sowjanya kolluru and p. bhaskara reddy, "analysis of parameters for measuring performance of mobile applications", int. j. anal. appl., vol. 19, no. 4, pp. 587–603, 2021. [28] s. n. khan and i. firdous, "review on android app security", int. j. adv. res. comput. sci. software eng., vol. 7, no. 4, april 2017. [29] i. nachev and s. maleshkov, "android-based control interface solution for windows applications", in proceedings of the 1st virtual international conference on advanced research in scientific areas 2012, december 2012, pp. 2073–2077. [30] s. k. sohail, s. f. aalam and m. r. m. balal, y. kamble and d. m. rauf, "intercom system for short path communication", int. j. res. eng. appl. manag., vol. 3, no. 08, pp. 1–2, nov. 2017. [31] r. c. vaidya and p. s. s. 
kulkarni, "voice over ip mobile telephony using wifi", int. j. sci. eng. res., vol. 3, pp. 1–5, dec. 2012.

facta universitatis series: electronics and energetics vol. 35, no 1, march 2022, pp. 93-106 https://doi.org/10.2298/fuee2201093s © 2022 by university of niš, serbia | creative commons license: cc by-nc-nd

original scientific paper

a non-isolated high step-up converter with low ripple input current and reduced voltage stress

asghar salehi, mohammad hossein ershadi, mehdi baharizadeh
department of electrical engineering, khomeinishahr branch, islamic azad university, khomeinishahr/isfahan, iran

abstract. in this paper a new non-isolated high step-up interleaved cascade converter is presented. in comparison with the conventional cascade boost converter, the proposed converter has a higher voltage gain, lower input current ripple and reduced voltage stress for the switches and diodes. besides, unlike the conventional cascade boost converter, the input current of the proposed converter is shared between the inductors, so the converter can be implemented with lower current rated inductors. thus, the converter size and conduction losses are reduced and the efficiency is increased. the proposed converter is analyzed and experimental results of a 200 w laboratory prototype are presented.

key words: dc-dc converters, soft switching, high step up, voltage stress.

1. introduction

nowadays, dc microgrid systems are receiving attention due to their self-sustainability in small areas and are expected to be the next generation of power systems. renewable energy sources, such as wind and solar, are increasingly being integrated into the electric power grid, while the power system becomes more tightly intertwined with other systems, such as buildings, natural gas pipelines, and the transportation sector.
in microgrid systems, renewable energy sources including photovoltaic (pv), wind turbine, wave and geothermal sources are utilized for generating dc power, and batteries, ultracapacitors and fuel cells are adopted as backup power sources for the renewable energy sources [1]-[3]. however, since these power sources usually generate a low voltage, a high step-up dc–dc converter is required to supply loads with high operating voltages [4]. for step-up applications, a conventional boost converter can be applied due to its simple circuit and low cost. however, it is not suitable for high step-up applications because of the high duty cycle of the converter switch, high voltage stress of the power devices, reverse recovery problems, high conduction losses, stability problems in control and limited efficiency [5]-[8].

received may 20, 2021; received in revised form august 27, 2021
corresponding author: mohammad hossein ershadi, department of electrical engineering, khomeinishahr branch, islamic azad university, khomeinishahr/isfahan, iran
e-mail: ershadi@iaukhsh.ac.ir

to reach a high voltage gain, two or more boost converters can be connected in series. these converters are called cascade converters [9]. to reduce the number of required components, quadratic boost converters are proposed in [10]-[12]. in these converters, the two series boost converters are integrated and the converter needs only one switch. however, since the input voltage is low and all the input current flows through the first stage inductor, the size, volume and conduction losses of this inductor increase drastically as the output power increases. to solve this problem, the interleaved technique can be used to share current between modules [13]. however, the voltage gain of the two-stage cascade boost converter is still limited, and cascade boost converters with more than two stages suffer from low efficiency, complex circuit and control, and high cost [10].
in recent years, various interleaved high step-up converters have been presented in which the input current of the converter is shared between the interleaved phases. in [14] and [15], the switched capacitor technique is applied to the interleaved boost converter to increase the voltage gain. although the voltage gain of these converters is higher than that of the conventional boost converter, it is still limited and the voltage stress of the semiconductor devices is high. in [16]-[19], coupled inductors are used instead of the main inductors of the interleaved boost converter and the turns ratio of the coupled inductors is employed to adjust the voltage gain. however, because a high turns ratio is required to obtain a voltage gain higher than 15, the size of the coupled inductor, the conduction loss of the windings and the core loss increase. moreover, a snubber or clamping circuit is needed due to the leakage inductance of the coupled inductor [19]-[23]. besides, in [18] the input current of the converter is pulsating and its ripple is high. to decrease the input current ripple, three coupled inductors are adopted in [20] and [21], which increases the converter size and complexity.

to solve the problems of the conventional cascade boost converter and avoid using coupled inductors, a new non-isolated high step-up interleaved cascade converter is presented in this paper. the proposed converter has a higher voltage gain than the two-stage cascade boost converter and the interleaved boost converter. moreover, the voltage stress of the switches is reduced and the input current is shared between the converter inductors. besides, the input current ripple is lower than in the conventional cascade boost converter. hence, the converter can be implemented with lower current rated inductors, the converter size and conduction losses are reduced and the efficiency is increased. the rest of the paper is organized as follows.
the operation principles of the proposed converter are described in section 2. in sections 3 to 6, the converter analysis and design considerations are discussed in detail. experimental results of a prototype converter are presented in section 7 and conclusions are given in section 8.

2. proposed converter operation principle

fig. 1 shows the proposed interleaved high step-up converter. to describe the operation of the proposed converter, some assumptions are made:
1) all semiconductor components are ideal;
2) the output capacitor co and capacitors c1~c3 are large enough to be considered as voltage sources;
3) the inductors l1, l2 and l3 are large enough and the converter operates in continuous current mode (ccm).

fig. 1 proposed interleaved high step-up converter.

with respect to the above assumptions, each switching period can be divided into four modes. the key operating waveforms of the proposed converter and the equivalent circuits are shown in fig. 2 and fig. 3, respectively. as in two-phase interleaved converters, switches s2 and s3 are driven with a phase shift of 180º and equal duty cycles. the gate pulse of switch s1 is the same as that of s2, as shown in fig. 2.

interval i, [t0-t1]: fig. 3(a) shows the equivalent circuit of the converter in this interval. as shown in the figure, switches s1 and s2 are turned off and s3 is turned on. in this interval, inductors l1 and l2 are discharged through the paths vin-d1-c1-l1 and vin-l2-c2-d2-c3-s3, respectively. in addition, inductor l3 is charged through vin-d1-s3-l3.
the equations of the converter elements in this interval are as follows:

$$i_{L1}(t) = i_{L1}(t_0) - \frac{V_{C1} - V_{in}}{L_1}(t - t_0) \tag{1}$$

$$i_{L2}(t) = i_{D2}(t) = i_{L2}(t_0) - \frac{V_{C3} - V_{in} - V_{C2}}{L_2}(t - t_0) \tag{2}$$

$$i_{L3}(t) = \frac{V_{in}}{L_3}(t - t_0) + i_{L3}(t_0) \tag{3}$$

$$i_{D1}(t) = i_{L1}(t) + i_{L3}(t) \tag{4}$$

$$i_{S3}(t) = i_{L2}(t) + i_{L3}(t) \tag{5}$$

interval ii, [t1-t2]: this interval begins when switches s1 and s2 turn on. as shown in fig. 3(b), all of the switches are on in this interval and inductors l1, l2 and l3 are charged through vin-s1-l1, vin-l2-s2 and vin-s1-c1-l3-s3, respectively. the important equations of the converter elements are as follows:

$$i_{L1}(t) = i_{L1}(t_1) + \frac{V_{in}}{L_1}(t - t_1) \tag{6}$$

$$i_{L2}(t) = i_{S2}(t) = i_{L2}(t_1) + \frac{V_{in}}{L_2}(t - t_1) \tag{7}$$

$$i_{L3}(t) = i_{L3}(t_1) + \frac{V_{in} + V_{C1}}{L_3}(t - t_1) \tag{8}$$

$$i_{S1}(t) = i_{L1}(t) + i_{L3}(t) \tag{9}$$

fig. 2 typical key waveforms of the proposed converter (traces: vgs1 & vgs2, vgs3, is1, vs1, is2, vs2, is3, vs3, id1, id2, id3 & id4, iin; time marks t0 to t4).

interval iii, [t2-t3]: at t2, s3 turns off and this interval begins. when s3 turns off, the l3 current keeps flowing and turns d3 and d4 on. part of the l3 current flows through vin-s1-c1-l3-c3-d3-co and the other part flows through vin-s1-c1-l3-d4-c2-s2. hence, c1 and c3 are discharged and co and c2 are charged in this interval. l1 and l2 are charged as in the previous interval. the important equations of the converter elements are:

$$i_{L1}(t) = i_{L1}(t_2) + \frac{V_{in}}{L_1}(t - t_2) \tag{10}$$

$$i_{L2}(t) = i_{L2}(t_2) + \frac{V_{in}}{L_2}(t - t_2) \tag{11}$$

$$i_{L3}(t) = i_{L3}(t_2) - \frac{V_o - V_{in} - V_{C1} - V_{C3}}{L_3}(t - t_2) \tag{12}$$

$$i_{S1}(t) = i_{L1}(t) + i_{L3}(t) \tag{13}$$

interval iv, [t3-t4]: fig. 3(d) shows the equivalent circuit of the converter in this interval. the converter operation and its important equations are the same as in the second interval.

3. converter analysis and design considerations

3.1. voltage conversion ratio

the following equations are obtained from the volt-second balance of l1, l2 and l3, respectively:

$$V_{in} D T = (V_{C1} - V_{in})(1 - D)T \tag{14}$$

$$V_{in} D T = (V_{C3} - V_{C2} - V_{in})(1 - D)T \tag{15}$$

$$(V_{in} + V_{C1})(2D - 1)T + V_{in}(1 - D)T = (V_{C2} - V_{C1} - V_{in})(1 - D)T \tag{16}$$

from (14), (15) and (16), the following equations are obtained:

$$V_{in} = (1 - D)V_{C1} \tag{17}$$

$$V_{in} = (1 - D)(V_{C3} - V_{C2}) \tag{18}$$

$$V_{in} = (1 - D)V_{C2} - D\,V_{C1} \tag{19}$$

from (17), (18) and (19), vc1, vc2 and vc3 are obtained as:

$$V_{C1} = \frac{V_{in}}{1 - D} \tag{20}$$

$$V_{C2} = \frac{V_{in}}{(1 - D)^2} \tag{21}$$

$$V_{C3} = \frac{(2 - D)V_{in}}{(1 - D)^2} \tag{22}$$

fig. 3 equivalent circuits of the proposed converter: (a) interval i [t0-t1], (b) interval ii [t1-t2], (c) interval iii [t2-t3], (d) interval iv [t3-t4].

from interval iii,

$$V_{C2} = V_o - V_{C3} \tag{23}$$

by substituting vc2 and vc3 from (21) and (22) in (23), the voltage gain g of the proposed converter is obtained as follows:

$$G = \frac{V_o}{V_{in}} = \frac{3 - D}{(1 - D)^2} \tag{24}$$

relation (24) shows that the proposed converter has a high step-up voltage gain. fig. 4 compares the voltage gain of the proposed converter with those of the conventional cascade boost converter and the converters presented in [14] and [15]. as can be observed from the figure, the proposed converter has a higher voltage gain.

fig. 4 voltage gain comparison of the proposed converter with the conventional cascade boost converter and the converters presented in [14] and [15].

3.2. inductors average current

in the proposed converter, the input current is the sum of the l1, l2 and l3 currents.
hence, the average value of the input current is as follows:

$$I_{in\_avg} = I_{L1\_avg} + I_{L2\_avg} + I_{L3\_avg} \tag{25}$$

from the current-second balance of c3 and co, the following equations are obtained:

$$\frac{I_{L2\_avg}}{2}(1 - D)T = I_{L3\_avg}(1 - D)T \tag{26}$$

$$I_{L3\_avg}(1 - D)T = I_{o\_avg}T \tag{27}$$

from (26) and (27),

$$I_{L2\_avg} = 2 I_{L3\_avg} \tag{28}$$

$$I_{L3\_avg} = \frac{I_{o\_avg}}{1 - D} \tag{29}$$

assuming ideal (lossless) conditions, the following equation holds:

$$P_{in} = V_{in} I_{in\_avg} = V_{in}(I_{L1\_avg} + I_{L2\_avg} + I_{L3\_avg}) = P_o = V_o I_{o\_avg} \tag{30}$$

so,

$$I_{L1\_avg} + I_{L2\_avg} + I_{L3\_avg} = \frac{V_o}{V_{in}} I_{o\_avg} \tag{31}$$

by substituting $I_{L2\_avg}$, $I_{L3\_avg}$ and $V_o/V_{in}$ from (28), (29) and (24) into (31), the average inductor currents are:

$$I_{L1\_avg} = \frac{2D\,I_{o\_avg}}{(1 - D)^2} = \frac{2D}{3 - D} I_{in\_avg} \tag{32}$$

$$I_{L2\_avg} = \frac{2 I_{o\_avg}}{1 - D} = \frac{2(1 - D)}{3 - D} I_{in\_avg} \tag{33}$$

$$I_{L3\_avg} = \frac{I_{o\_avg}}{1 - D} = \frac{1 - D}{3 - D} I_{in\_avg} \tag{34}$$

the proposed converter is compared with the conventional cascade boost converter and the converters presented in [14] and [15] in table 1. the table shows that in the proposed converter, unlike the conventional cascade boost converter, the input current is shared between all the inductors.

4. semiconductor stress analysis

based on the converter operating intervals and equivalent circuits, the voltage stress of s1, s2 and d1 is:

$$V_{S1} = V_{S2} = V_{D1} = \frac{V_{in}}{1 - D} = \frac{1 - D}{3 - D} V_o \tag{35}$$

also, the voltage stresses of s3, d3, d2 and d4 are as follows:

$$V_{S3} = V_{D3} = \frac{V_{in}}{(1 - D)^2} = \frac{V_o}{3 - D} \tag{36}$$

$$V_{D2} = V_{D4} = \frac{(2 - D)V_{in}}{(1 - D)^2} = \frac{(2 - D)V_o}{3 - D} \tag{37}$$

the voltage stresses of the semiconductor components in the proposed converter are compared with those of some other transformer-less high step-up converters in table 1. as can be seen, the voltage stresses of the components are reduced in the proposed converter.
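the gain relation (24) and the current-sharing relations (32)-(34) can be evaluated numerically. the sketch below inverts (24) for the duty cycle and splits an input current between the three inductors; the function names and the operating point (40 v input, 400 v output, 200 w, lossless converter) are assumptions chosen for illustration.

```python
import math


def voltage_gain(d: float) -> float:
    """Eq. (24): G = Vo/Vin = (3 - D) / (1 - D)**2."""
    return (3.0 - d) / (1.0 - d) ** 2


def duty_for_gain(g: float) -> float:
    """Invert Eq. (24).

    G*(1 - D)**2 = 3 - D  ->  G*D**2 + (1 - 2*G)*D + (G - 3) = 0,
    whose root in 0 <= D < 1 is taken here.
    """
    if g < 3.0:
        raise ValueError("Eq. (24) gives G >= 3 for 0 <= D < 1")
    return ((2.0 * g - 1.0) - math.sqrt(1.0 + 8.0 * g)) / (2.0 * g)


def inductor_average_currents(d: float, i_in_avg: float):
    """Eqs. (32)-(34): average currents (I_L1, I_L2, I_L3) for a given input current."""
    k = 3.0 - d
    return (2.0 * d / k * i_in_avg,
            2.0 * (1.0 - d) / k * i_in_avg,
            (1.0 - d) / k * i_in_avg)


if __name__ == "__main__":
    v_in, v_o, p_o = 40.0, 400.0, 200.0      # assumed illustrative operating point
    d = duty_for_gain(v_o / v_in)
    i_in = p_o / v_in                         # lossless converter, as in Eq. (30)
    i_l1, i_l2, i_l3 = inductor_average_currents(d, i_in)
    print(f"D = {d:.3f}, I_L1 = {i_l1:.2f} A, I_L2 = {i_l2:.2f} A, I_L3 = {i_l3:.2f} A")
```

under these assumptions no single inductor carries the full input current, which is the sharing property stated above.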
when s1 is on, the currents of l1 and l2 flow through this switch, and when it turns off its current passes through d1. hence, the current stresses of s1 and d1 are obtained as:

$$I_{S1} = I_{D1} = I_{L1\_avg} + I_{L2\_avg} + \frac{1}{2}\left(\frac{(V_{in} - V_{C1})(1 - D)T}{L_1}\right) + \frac{1}{2}\left(\frac{(V_{in} + V_{C1})DT}{L_2}\right) \tag{38}$$

the current stresses of the other switches and diodes are as follows:

$$I_{S2} = I_{L3\_avg} + I_{L2\_avg} + \frac{1}{2}\left(\frac{(V_{in} + V_{C2} - V_{C3})(1 - D)T}{L_3}\right) + \frac{1}{2}\left(\frac{(V_{in} + V_{C1})DT}{L_2}\right) \tag{39}$$

$$I_{S3} = I_{L3\_avg} + \frac{1}{2}\left(\frac{V_{in}DT}{L_3}\right) + \frac{1}{2}\left[I_{L2\_avg} + \frac{1}{2}\left(\frac{(V_{in} + V_{C1} - V_{C2})(1 - D)T}{L_2}\right)\right] \tag{40}$$

$$I_{D2} = I_{L3\_avg} + \frac{1}{2}\left(\frac{V_{in}DT}{L_3}\right) \tag{41}$$

$$I_{D3} = I_{D4} = \frac{1}{2}\left[I_{L2\_avg} + \frac{1}{2}\left(\frac{(V_{in} + V_{C1})DT}{L_2}\right)\right] \tag{42}$$

table 1 comparison of the proposed converter with the conventional cascade boost converter and the converters presented in [14] and [15] (columns: proposed converter, converter presented in [15], converter presented in [14], conventional cascade boost converter; rows: voltage gain vo/vin, voltage stress of switches, voltage stress of diodes, inductors average current).

5. input current ripple and inductors

from equations (6), (7) and (8), the input current ripple is obtained as follows:

$$\Delta I_{in} = \frac{V_{in}DT}{2L_1} + \frac{V_{in}DT}{2L_2} + \frac{(V_{in} + V_{C1} - V_{C2})DT}{2L_3} \tag{43}$$

by replacing vc1 and vc2 from (20) and (21) in (43):

$$\Delta I_{in} = \frac{V_{in}DT}{2L_1} + \frac{V_{in}DT}{2L_2} + \left(1 + \frac{1}{1 - D} - \frac{1}{(1 - D)^2}\right)\frac{V_{in}DT}{2L_3} \tag{44}$$

by assuming l = l1 = l2,

$$\Delta I_{in} = \frac{V_{in}DT}{L} + \left(1 + \frac{1}{1 - D} - \frac{1}{(1 - D)^2}\right)\frac{V_{in}DT}{2L_3} = V_{in}DT\left(\frac{1}{L} + \frac{1 + \frac{1}{1 - D} - \frac{1}{(1 - D)^2}}{2L_3}\right) \tag{45}$$

from (45), if

$$L_3 = \frac{1}{2}\left(\frac{D}{(1 - D)^2} - 1\right)L \tag{46}$$

the input current ripple of the converter would be zero.
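the cancellation condition (46) can be checked against the ripple expression (45). the sketch below is illustrative only: the function names and the numerical values (40 v input, 10 µs switching period, l = 500 µh) are assumptions for the example.

```python
def l3_over_l_for_zero_ripple(d: float) -> float:
    """Eq. (46): ratio L3/L that cancels the input current ripple.

    The ratio is positive only when D/(1 - D)**2 > 1, i.e. for D above
    (3 - sqrt(5))/2, roughly 0.382; below that no physical L3 satisfies (46).
    """
    ratio = 0.5 * (d / (1.0 - d) ** 2 - 1.0)
    if ratio <= 0.0:
        raise ValueError("Eq. (46) yields no positive L3 at this duty cycle")
    return ratio


def input_current_ripple(v_in: float, d: float, t: float, l: float, l3: float) -> float:
    """Eq. (45) with L1 = L2 = L."""
    k = 1.0 + 1.0 / (1.0 - d) - 1.0 / (1.0 - d) ** 2
    return v_in * d * t / l + k * v_in * d * t / (2.0 * l3)


if __name__ == "__main__":
    v_in, t, l = 40.0, 10e-6, 500e-6          # assumed values for illustration
    for d in (0.5, 0.6):
        l3 = l3_over_l_for_zero_ripple(d) * l
        ripple = input_current_ripple(v_in, d, t, l, l3)
        print(f"D = {d}: L3 = {l3 * 1e6:.1f} uH, ripple from (45) = {ripple:.2e} A")
```

because the ratio in (46) depends on the duty cycle, exact cancellation holds only at the design point; away from it, (45) gives a nonzero residual.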
in the case that vin = 40 v and vo = 400 v, the converter operating duty cycle from (24) is 0.5 and, from (46), l3 is

$$L_3 = \frac{1}{2}L \tag{47}$$

by substituting l3 from (47) and vin from (24) in (45), the input current ripple of the proposed converter is obtained, in magnitude, as:

$$\Delta I_{in} = \frac{V_{in}DT}{L}\left(\frac{1}{(1 - D)^2} - \frac{1}{1 - D} - 2\right) \tag{48}$$

the input current ripple of the conventional cascade converter is:

$$\Delta I_{in} = \frac{V_{in}DT}{2L} \tag{49}$$

comparing (48) with (49) shows that the proposed converter has a lower input current ripple than the conventional cascade boost converter. the proposed converter operates in continuous current mode (ccm) and the design equations of l1, l2 and l3 can be obtained from (30) and (31) as:

$$L_{1\_min} = L_{2\_min} = \frac{(1 - D)^4 R_o T}{4(3 - D)} \tag{50}$$

$$L_3 = \frac{1}{2}\left(\frac{D}{(1 - D)^2} - 1\right)L_1 \tag{51}$$

6. capacitors

the values of the capacitors of the proposed converter can be obtained from the following equations:

$$C_1 = \frac{2DT\,I_{o\_avg}}{(1 - D)\Delta V_{C1}} \tag{52}$$

$$C_2 = \frac{2 I_{o\_avg} T}{\Delta V_{C2}} \tag{53}$$

$$C_3 = \frac{2 I_{o\_avg} T}{\Delta V_{C3}} \tag{54}$$

$$C_o = \frac{2 I_{o\_avg} T}{\Delta V_{Co}} \tag{55}$$

where $\Delta V_{C1}$, $\Delta V_{C2}$, $\Delta V_{C3}$ and $\Delta V_{Co}$ are the voltage ripples of c1, c2, c3 and co, respectively.

7. experimental results

in order to verify the performance of the proposed converter and the presented key waveforms, a 200 w laboratory prototype is implemented. the component specifications of the proposed converter are summarized in table 2. to show the ability to provide a high voltage gain, the input and output voltages are selected as 40 v and 400 v, respectively. the switching frequency and the duty ratio of the gate signals of all switches are 100 khz and approximately 0.5, respectively. the experimental waveforms of the converter are shown in fig. 5. in fig. 5(a), (b) and (c) the current and voltage waveforms of s1, s2 and s3 are represented, respectively.
as shown in the figures, the maximum voltages across s1, s2 and s3 for 40 v input and 400 v output are about 80 v, 80 v and 160 v, respectively. as a result, low voltage rated switches can be adopted to reduce the conduction loss and achieve high efficiency. in fig. 5(d), (e), (f) and (g) the current and voltage of d1, d2, d3 and d4 are illustrated, respectively. the voltage stresses of d1, d2, d3 and d4 for 400 v output voltage are about 80 v, 250 v, 160 v and 250 v, respectively. as can be seen, the voltage stresses of the diodes are considerably lower than the output voltage. fig. 6 shows the measured efficiency of the proposed converter compared with the conventional cascade converter. for a fair comparison, the conventional cascade boost converter is designed with the same switching frequency. the other important parameters of the conventional cascade boost converter are given in table 2. as can be observed from fig. 6, the proposed converter has a higher efficiency, and as the load increases, the efficiency drop of the conventional converter is larger than that of the proposed converter. fig. 7 shows the input and output voltages of the converter.

fig. 5 experimental voltage and current waveforms of the proposed converter semiconductor components: (a) s1, (b) s2, (c) s3, (d) d1, (e) d2, (f) d3 and (g) d4 (scope labels: vs1 50 v/div, is1 5 a/div, vs2 50 v/div, is2 1 a/div, vs3 100 v/div, is3 5 a/div, vd1 50 v/div, id1 2 a/div, vd2 100 v/div, id2 2 a/div, vd3 50 v/div, id3 2 a/div, vd4 50 v/div, id4 2 a/div; time scale 2.5 µs).

fig. 6 measured efficiency of the proposed converter.
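the measured stresses quoted in this section can be compared with the predictions of (35)-(37). the sketch below does so for the 40 v input with a duty cycle of 0.5; the function and dictionary names are assumptions made for the example, and the measured values are the approximate readings stated in the text.

```python
def predicted_voltage_stresses(v_in: float, d: float) -> dict:
    """Voltage stresses from Eqs. (35)-(37), in volts."""
    low = v_in / (1.0 - d)                       # Eq. (35): S1, S2, D1
    mid = v_in / (1.0 - d) ** 2                  # Eq. (36): S3, D3
    high = (2.0 - d) * v_in / (1.0 - d) ** 2     # Eq. (37): D2, D4
    return {"S1": low, "S2": low, "D1": low,
            "S3": mid, "D3": mid,
            "D2": high, "D4": high}


# Approximate measured maxima reported for 40 V input and 400 V output.
MEASURED = {"S1": 80.0, "S2": 80.0, "S3": 160.0,
            "D1": 80.0, "D2": 250.0, "D3": 160.0, "D4": 250.0}


def relative_deviation(predicted: dict, measured: dict) -> dict:
    """(measured - predicted) / predicted for every component."""
    return {name: (measured[name] - predicted[name]) / predicted[name]
            for name in measured}


if __name__ == "__main__":
    predicted = predicted_voltage_stresses(v_in=40.0, d=0.5)
    for name, dev in sorted(relative_deviation(predicted, MEASURED).items()):
        print(f"{name}: predicted {predicted[name]:.0f} V, "
              f"measured {MEASURED[name]:.0f} V, deviation {dev:+.1%}")
```

with these inputs the ideal-component model gives 80 v, 160 v and 240 v for the three stress levels, so the reported readings agree with (35)-(37) to within a few percent.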
table 2 components value and specification of the implemented converters

| parameter | proposed converter | conventional cascade converter |
|---|---|---|
| switching frequency | 100 khz | 100 khz |
| switches | s1~s3 (irf640) | s1 (irfp260), s2 (irfp460) |
| diodes | d1~d4 (mur460) | d1 (byv32-200), d2 (mur460) |
| inductors | l1 and l2 (500 µh), l3 (250 µh) | l1 (1 mh), l2 (500 µh) |
| capacitors | c1 (10 µf/100 v), c2 (10 µf/200 v), c3 (10 µf/450 v), co (47 µf/450 v) | c1 (22 µf/200 v), co (100 µf/450 v) |

fig. 7 input and output voltage waveforms of the sample converter (scope labels: vo 100 v, vin 50 v; time scale 2.5 µs).

8. conclusions

a new interleaved cascade boost dc–dc converter with an improved voltage gain is presented in this paper. in the proposed converter, the input current is continuous with low ripple and the converter does not need an additional filter at the input. besides, unlike the conventional cascade boost converter, the input current is shared between all the inductors, so the proposed converter can be implemented with lower current rated inductors. moreover, the voltage stress of the converter power devices is reduced in comparison with conventional cascade boost converters and the converters presented in [14] and [15]. thus, the efficiency of the converter is improved.

references

[1] t. dragicevic, j.c. vasquez and j.m. guerrero, "dc microgrids-part ii: a review of power architectures, application and standardization issues", ieee trans. on power elec., vol. 31, no. 5, pp. 3528–3549, 2015. [2] n. eghtedarpour and e. farjah, "distributed charge/discharge control of energy storages in a renewable-energy-based dc micro-grid", iet renewable power gener., vol. 8, no. 1, pp. 45–57, jan. 2014. [3] h. khamooshpoor, m. baharizadeh, m.h. ershadi, "comparison of two approaches of resolving power sharing error in droop based dc microgrids", majlesi journal of electrical engineering, vol. 14, no. 2, pp.
111–115, 2020. [4] s.r. addula and m. prabhakar, "coupled inductor based soft switched interleaved dc-dc converter for pv applications", international journal of renewable energy research, vol. 6, no. 2, pp. 361–374, 2016. [5] b. ahmed, g. yacine, d. rabah and h. mha, "design and electromagnetic modeling of integrated lc filter in a buck converter", facta universitatis, series: electronics and energetics, vol. 33, no. 2, pp. 289–302, 2020. [6] m. vafa, m.h. ershadi, h. khodadadi, and m. baharizadeh, "an interleaved high step-up dc–dc converter with low voltage stress", iranian journal of science and technology, transactions of electrical engineering, vol. 44, no. 1, pp. 1–12, 2020. [7] o. alonso, p. sanchis, e. gubia, and l. marroyo, "cascaded h-bridge multilevel converter for grid connected photovoltaic generators with independent maximum power point tracking of each solar array," in proceedings of the ieee conference pesc, 2003, pp. 731–735. [8] f. l. tofoli, d. de castro pereira, w. josias de paula, d. de sousa oliveira júnior, "survey on nonisolated high-voltage step-up dc–dc topologies based on the boost converter", iet power electron., vol. 8, no.10, pp. 2044-2057, oct. 2015. [9] l. huber, m.m. jovanovic, "a design approach for server power supplies for networking applications". in proceedings of the ieee applied power electronics conf. and exposition, 2000, pp. 1163–1169. [10] m. h. todorovic, l. palma, and p. n. enjeti, "design of a wide input range dc–dc converter with a robust power control scheme suitable for fuel cell power conversion, " ieee trans. ind. electron., vol. 55, no. 3, pp. 1247–1255, mar. 2008. [11] b.-r. lin, j.-j. chen, "analysis and implementation of a soft switching converter with high-voltage conversion ratio", iet power electron., vol. 1, no. 3, pp. 386–394, sep. 2008. [12] s.-m. chen, t.-j. liang, l.-s. yang, and j.-f. chen, "a cascaded high step-up dc–dc converter with single switch for microsource applications," ieee trans. 
power electron., vol. 26, no. 4, pp. 1146– 1153, apr. 2011. [13] n. molavi, m. esteki, e. adib and h. farzanehfard, "high step-up/down dc-dc bidirectional converter with low switch voltage stress," in proceedings of the 6th power electronics, drive systems & technologies conference (pedstc2015), tehran, 2015, pp. 162–167. [14] r. gules, l. l. pfitscher, and l. c. franco, "an interleaved boost dc–dc converter with large conversion ratio," in proceedings of the ieee conference isie, 2003, pp. 411–416. [15] l.-w. zhou, b.-x. zhu, q.-m. luo, s. chen, "interleaved non-isolated highstep-updc/dc converter based on the diode-capacitor multiplier", iet powerelectron., vol. 7, no. 2, pp. 390–397, 2014. [16] s. dwari and l. parsa, "an efficient high-step-up interleaved dc–dc converter with a common active clamp, " ieee trans. power electron., vol. 26, no. 1, pp. 66–78, jan. 2011. [17] k. c. tseng, c. c. huang, and w. y. shih, "a high step-up converter with a voltage multiplier module for a photovoltaic system, " ieee trans. power electron., vol. 28, no. 6, pp. 3047–3057, jun. 2013. [18] k. c. tseng, j. z. chen, j. t. lin, c.c. huang, and t. h. yen, "high step-up interleaved forward-flyback boost converter with three-winding coupled inductors, " ieee trans. power electron., vol. 30, no. 9, pp. 4696–4703, sep. 2015. [19] w. li, y. zhao, and x. he, "interleaved high step-up converter with winding-cross-coupled inductors and voltage multiplier cells," ieee trans. power electron., vol. 27, no. 1, pp. 133–143, jan. 2012. [20] t.-f. wu, y.-s. lai, j.-c. hung, and y.-m. chen, "boost converter with coupled inductors and buck– boost type of active clamp, " ieee trans. ind. electron., vol. 55, no. 1, pp. 154–162, jan. 2008. [21] t.-f. wu, y.-d. chang, c.-h. chang, h.-x. lee, k.-y. lee and j.-g. yang, "a 5 kw boost converter with various passive/active snubbers for reducing component stress and achieving high efficiency," in proceedings of the international conference power electron. 
drive syst., nov. 2009, pp. 187–192.
[22] x. liu, x. xhang, "interleaved high step-up converter with coupled inductor and voltage multiplier for renewable energy system", cpss transactions on power electronics and applications, vol. 4, no. 4, pp. 299–306, 2019.
[23] s. chen, s. yang, c. hwang, "interleaved high step-up dc-dc converter based on voltage multiplier cell and voltage-stacking techniques for renewable energy applications," energies 11, 1632, pp. 1–8, 2018.
facta universitatis series: electronics and energetics vol. 31, no 4, december 2018, pp. 501-518 https://doi.org/10.2298/fuee1804501j
methods of decreasing losses in optical metamaterials
zoran jakšić 1, marko obradov 1, olga jakšić 1, goran isić 2,3, slobodan vuković 1,3, dana vasiljević radović 1
1 center of microelectronic technologies, institute of chemistry, technology and metallurgy, university of belgrade, njegoševa 12, 11000 belgrade, serbia
2 institute of physics, university of belgrade, pregrevica 118, 11080 belgrade, serbia
3 science program, texas a&m university at qatar, p.o. box 23874 doha, qatar
abstract. in this work we review methods to decrease the optical absorption losses in metamaterials. the practical interest in metamaterials is huge, but the possible applications are severely limited by their high inherent optical absorption in the metal parts. we consider the possibilities to fabricate metamaterials with a decreased metal volume fraction, the application of alternative lower-loss plasmonic materials instead of the customarily used noble metals, and the use of all-dielectric, high refractive index contrast subwavelength nanocomposites.
finally, we dedicate our attention to various methods to optimize the frequency dispersion in metamaterials by changing their geometry and composition in order to reach lower absorption, which includes the use of the hypercrystals. the final goal is to widen the range of different metamaterialbased devices and structures, including those belonging to transformation optics. maybe the most important among them is the fabrication of a novel generation of alloptical or hybrid optical/electronic integrated circuits that would operate at optical frequencies and at the same time would offer a packaging density and complexity of the contemporary integrated circuits, owing to the strong localization of electromagnetic fields enabled by plasmonics. key words: metamaterials, transformation optics, plasmonics, low-loss metamaterials, hyperbolic metamaterials 1. introduction artificial structuring of optical materials at the subwavelength level ensures excellent control over spectral and spatial dispersion. it becomes possible to obtain very high, very low (near-zero) and negative effective values of refractive index. a path is thus opened to tailoring received august 22, 2018 corresponding author: zoran jakšić, institute of chemistry, technology and metallurgy, university of belgrade, njegoševa 12, 11000 belgrade, serbia (e-mail: jaksa@nanosys.ihtm.bg.ac.rs)  502 z. jakšić, m. obradov, o. jakšić, g. isić, s. vuković, d. vasiljević radović the optical space at will, ultimately leading to the new field of transformation optics, fig. 1 [1-3]. the materials structured in the quoted manner possess electromagnetic properties that surpass those normally met in nature and are thus denoted as metamaterials [4, 5]. in a general case, metamaterials represent 1d, 2d or 3d composites of constituent parts with different values of complex refractive index, fig. 2, and are the main building blocks for tailoring the optical space. 
they are typically structured at a subwavelength level, which in the case of visible radiation is of the order of nanometers. in most cases the optical metamaterials owe their operation to electromagnetic localization on an interface between the two constituent materials (typically metal and dielectric). in such a situation an evanescent wave is formed at the interface, called surface plasmon polariton. in its basic form such a surface wave is obtained by coupling free electron plasma in the metal part with the p-polarized electromagnetic wave at the interface between semi-infinite dielectric and semi-infinite metal. depending on geometry of the nanocomposite and its constituent materials, a host of different waves can appear at the surface and in the bulk [6-8]. the strong localization of the electromagnetic field at the surface lies in the root of many exotic optical phenomena connected with plasmonic metamaterials. many novel wave phenomena are met in such metastructures, for instance extreme light concentration [9, 10], near-perfect absorption [11-13], superlensing [14-16] and hyperlensing [17-19], optical cloaking (invisibility shields) [3, 20, 21], to name just a few. basically, using metamaterials anything could be done with propagation of electromagnetic waves, the practical limit being only the imagination. one of the most interesting practical goals of metamaterials and transformation optics is merging the packaging density of electronic devices with the speed of photonic ones by creating ultracompact, all-optical circuits as a new step in the continuation of the moore's law [22]. this holds the potential to revolutionize the electronics industry. fig. 1 modification of the optical space by transformation optics. black lines represent the optical space (the artificially made structure of the metamaterial) while red arrows show the propagation of electromagnetic beams. possible approaches to decreasing losses in optical metamaterials 503 fig. 
2 an example of 3d structuring of optical metamaterials (spheres with a given value of complex refractive index within a host with a different refractive index).
to obtain extreme field concentrations, the typical approach is to utilize electromagnetic resonances in metal-dielectric nanocomposites, thus ensuring field localization at the metal-dielectric interface (the already mentioned surface plasmon polaritons, spp) [23]. the use of metals means high absorption losses, thus short propagation paths and generally poor figures of merit. this is a major obstacle to the more widespread application of transformation optics. a host of practical applications would benefit from low-loss plasmonics and nanophotonics. in this work we consider strategies for nanostructuring of artificial optical composites to decrease or eliminate absorption. some approaches include:
• low metal volume fraction: utilize structures with a smaller relative amount of metal, a generalization of the old concept of artificial dielectrics.
• alternative plasmonic materials: use plasmonic materials with lower losses compared to pure metals.
• all-dielectric meta-optics: completely avoid the use of metals and limit the design to pure dielectrics and possibly low-loss semiconductors.
• optimizing frequency dispersion: design metal-dielectric or even metal-metal nanocomposites with a frequency dispersion specifically tailored to obtain lower losses.
in the next sections we consider each of the quoted strategies.
2. low metal volume fraction
an obvious approach to decrease the absorption losses in metal-dielectric nanocomposites is to decrease their metal to dielectric volume fraction. basically, this idea leans heavily on the concept of artificial dielectrics, used already in the 1950s in microwave technique [24, 25].
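the effect of lowering the metal fraction can be illustrated with a minimal sketch. it assumes a deeply subwavelength metal-dielectric multilayer described by the standard effective-medium mixing rule for the in-plane permittivity, which is not derived in this paper; the permittivity values and the function name `inplane_permittivity` are hypothetical and chosen only for illustration.

```python
def inplane_permittivity(eps_metal, eps_dielectric, f):
    """Effective in-plane permittivity of a thin-layer stack.

    f is the metal volume fraction (0..1). This is the usual
    volume-weighted average, valid only for layers much thinner
    than the wavelength (an assumption of this sketch).
    """
    if not 0.0 <= f <= 1.0:
        raise ValueError("metal volume fraction must lie in [0, 1]")
    return f * eps_metal + (1.0 - f) * eps_dielectric


# Hypothetical material values, for illustration only.
EPS_METAL = complex(-16.0, 1.0)      # lossy metal: Im > 0 means absorption
EPS_DIELECTRIC = complex(2.25, 0.0)  # lossless dielectric

if __name__ == "__main__":
    for f in (0.5, 0.2, 0.05):
        eps = inplane_permittivity(EPS_METAL, EPS_DIELECTRIC, f)
        print(f"f = {f:.2f}: Re = {eps.real:+.3f}, Im = {eps.imag:.3f}")
```

in this approximation the imaginary part of the in-plane permittivity scales linearly with the metal fraction, so less metal means less absorption. the real part moves toward the dielectric value at the same time, which hints at the weaker plasmonic response that accompanies the lower loss.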
here we have a simple rule of thumb, which also follows the common sense: as the metal volume fraction decreases, a smaller part of the wave propagates through the absorptive medium, thus losses become lower. on the other hand, field localization also tends to become weaker and the electromagnetic field spreads over a larger volume, see fig. 3. therefore, there is need for a trade-off between the two. fig. 3 electromagnetic field distribution (light color) around metal/plasmonic material (dark color). figure 4. shows some examples of low metal fraction metamaterials. the top left structure is a one-dimensional plasmonic crystal [26] (metal-dielectric multilayer) with metal sheets much thinner than their dielectric counterparts. the top right structure in fig. 4 is the simplest 1d plasmonic crystal, a freestanding metallic/plasmonic material membrane with nanometer thickness. the dielectric part of this plasmonic crystal is the surrounding ambient and it ensures a perfect electromagnetic symmetry of the structure – the dielectric above and below is identical. as mentioned in the description of fig. 4, the plasmonic nanomembrane [27, 28] is a typical example of 1d plasmonic structure with minuscule volume fraction of metal. it can be defined as a freestanding metallic (or generally material containing free electron plasma) structure with an extremely high aspect ratio (lateral dimensions being even several million times larger than the thickness which can be of the order of tens of nanometers, even less). the thinner nanomembranes are, the longer are the propagation paths of spp. this makes them an ideal platform for long-range surface plasmons polaritons [29, 30]. possible approaches to decreasing losses in optical metamaterials 505 fig. 4 examples of plasmonic crystal-based metamaterials with low volume fraction of metal constituent. 
top left: conventional metal-dielectric multilayer (1d plasmonic crystal – pc); top right: freestanding metal nanomembrane as the simplest 1d pc; bottom right: wire medium (metal wires within a dielectric host, 2d pc) and bottom left: metal/plasmonic particles within dielectric host (3d pc). propagation of spp on separate interfaces of a membrane is shown in fig. 5. when interfaces are close enough to each other (sufficiently thin membrane), separate spp modes couple across the plasmonic membrane. if membrane becomes thinner still, two spp modes merge into one. fig. 5 coupled spp on membranes in imi (insulator-metal-insulator) configuration. thin metal strata in dielectric host metal nanoparticles in dielectric host freestanding plasmonic nanomembrane wire medium 506 z. jakšić, m. obradov, o. jakšić, g. isić, s. vuković, d. vasiljević radović 3. alternative plasmonic materials plasmonic effects were usually obtained using good metals like gold and silver. at the same time, these materials have very strong absorption losses. in the recent years, however, the focus of attention has shifted to alternative plasmonic materials [31, 32] which also have free electron plasma, but they offer various advantages like tailorability of their electromagnetic response, lower absorption losses. probably the most interesting group of alternative plasmonic materials are optically transparent, electrically conductive oxides (tco) [33]. heavy doping ensures an increase of the electron concentration in these materials, thus contributing to an improvement of their plasmonic properties and at the same time ensuring tailoring of their spectral characteristics, i.e. shifting them to the near-infrared part of the spectrum. examples of tco include ito – indium-tin-oxide; azo – aluminum-zinc-oxide; gzo – gallium-zincoxide. 
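the link between doping level and the spectral position of the plasmonic response can be sketched with a lossless drude estimate. the carrier concentration, effective mass and background permittivity used below are illustrative assumptions for a heavily doped oxide, not data from this paper, and the helper names `plasma_frequency` and `crossover_wavelength` are hypothetical.

```python
import math

# Physical constants (SI units).
E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
M_E = 9.1093837015e-31       # electron rest mass, kg
C_LIGHT = 299792458.0        # speed of light, m/s


def plasma_frequency(n_m3, m_eff_ratio):
    """Unscreened Drude plasma frequency (rad/s) for carrier density n_m3 (1/m^3)."""
    return math.sqrt(n_m3 * E_CHARGE**2 / (EPS0 * m_eff_ratio * M_E))


def crossover_wavelength(n_m3, m_eff_ratio, eps_inf):
    """Wavelength (m) where Re(eps) = eps_inf - wp^2/w^2 crosses zero (losses ignored)."""
    omega_cross = plasma_frequency(n_m3, m_eff_ratio) / math.sqrt(eps_inf)
    return 2.0 * math.pi * C_LIGHT / omega_cross


if __name__ == "__main__":
    # Assumed values: effective mass 0.35 m_e, background permittivity 4.
    for n_cm3 in (2e20, 5e20, 1e21):
        wl = crossover_wavelength(n_cm3 * 1e6, 0.35, 4.0)
        print(f"n = {n_cm3:.0e} cm^-3 -> crossover near {wl * 1e9:.0f} nm")
```

under these assumptions the material behaves as a metal only at wavelengths longer than the crossover, and raising the carrier concentration moves that crossover toward shorter wavelengths.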
recently, however, it has been noted that the quoted properties come for a price and that the tco figure of merit (the ratio of the real by imaginary part of the refractive index) reaches rather poor values, even worse than in noble metals they are intended to replace [7]. other alternative plasmonic media include metallic alloys. they are tunable by design, by simply adjusting the alloy composition. such tailoring could shift the peak losses to another frequency, possibly outside the operating range. this group of materials includes noble-transition alloys (e.g. au-cd), alkali-noble inter-metallic compounds (e.g. li2agin, kau) and intermetallics (e.g. ag3sn, cu3sn). in spite of the tailorability of the composition of metal alloys and thus of their frequency dispersion, a problem of their excessively high absorption at optical wavelengths still remains, only mitigated to a minuscule degree. graphene represents weakly corrugated sub-nanometer honeycomb lattice of carbon. its relatively low optical absorption in visible and infrared (of the order of 3%), connected with the existence of quasi-2d free electron plasma makes it a convenient candidate for plasmonics [34]. its properties are easily tailored by doping and gating. a combination of graphene with noble metal nanoparticles has been proposed as a platform for tunable spp [35]. graphene plasmonics represents a field of its own and vastly surpasses the scope of this article. finally, an alternative plasmonic platform are highly doped semiconductors, e.g. gaas, gap, sic, gan, which also have free electron plasma. their use in active and ultrafast plasmonics has been considered [36]. however, absorption losses in the visible are again a hurdle towards more widespread use. 4. all-dielectric meta-optics the idea with all-dielectric metamaterials is to avoid absorption by completely removing lossy parts [37-39] . 
the price to pay is a lower degree of design freedom (it is far easier to localize em field using metal-dielectric interfaces). no metals or similar materials with free electron plasma are used: material can be pure dielectric or possibly semiconductor (simultaneously ensuring high refractive index and low losses). it is necessary to reach high refractive index contrast between scatterers and the embedding host. mie resonance theory [40] is applied (exact solution of the classical electromagnetic diffraction problem): both scatterer size and morphology/shape are important. extreme field concentration is obtained through creation of hotspots (nonlocalities) at deep subwavelength level due to edge effects at sharp angles. to achieve this the shapes of scatterers are modified. some 3d shapes of nanoparticles are presented in fig. 6; this is only a very small number of examples among a vast variety of the existing forms. deep subwavelength hotspots cause effective medium approximation (ema) to break down (conventional ema theory no longer remains valid).
fig. 6 illustration of nanoparticles with different shapes.
since resonances are shape-dependent, a wealth of new modes appears. morphology-dependent em behavior includes both electric and magnetic dipole resonances and higher order multipole resonances. in addition to that, relative positions of nanoparticles are important because magnetic or electric field tend to concentrate between them (nanoparticle dimers), again causing the appearance of magnetic or electric hotspots. as an illustration, the scattering properties of a single all-dielectric cylinder on a substrate are shown in figs. 7–8. the nanocylinders are on a low index substrate (n=2) surrounded by air and are built of a high index material (n=8). the cylinder radius is 50 nm and its height is 40 nm. the scattering properties of all-dielectric nano-cones are shown in figs. 9–12.
the cones are deposited on a low index substrate (n=1.5), are surrounded by air and consist of a high index material (n=4). the dielectric cone base radius is 75 nm and the height is 100 nm.
fig. 7 radiation pattern of far field scattered from a dielectric cylinder h=40 nm, r=50 nm, n=8 on a substrate n=2. the incident beam λ=560 nm arrives from above, along the cylinder axis.
fig. 8 spatial distributions of field intensity; electric field (top row) and magnetic field (bottom row) for a dielectric cylinder h=40 nm, r=50 nm, n=8 on a substrate n=2 at λ=560 nm.
fig. 9 radiation pattern of far field scattered from a dielectric cone h=100 nm, r=75 nm, n=4 on a substrate n=1.5. the incident beam λ=300 nm arrives from above, along the axis of the cone.
fig. 10 spatial distributions of field intensity; electric field (top row) and magnetic field (bottom row) for a dielectric cone h=100 nm, r=75 nm, n=4 on a substrate n=1.5 at λ=300 nm.
fig. 11 radiation pattern of far field scattered from a dielectric cone h=100 nm, r=75 nm, n=4 on a substrate n=1.5. the incident beam λ=380 nm arrives from above, along the axis of the cone.
fig. 12 spatial distributions of field intensity; electric field (top row) and magnetic field (bottom row) for a dielectric cone h=100 nm, r=75 nm, n=4 on a substrate n=1.5 at λ=380 nm.
when deposited on an interface between two materials with different refractive indices, a single dielectric particle exhibits high directivity in its radiation pattern in favor of the material with the higher refractive index, i.e. the substrate, same as with metallic particles [41, 42].
unlike metals, electric field localizations occur also within the particle but are still tied to edges of the particle with much lower efficiency in comparison to metals. however, unlike metals, dielectric particles exhibit high magnetic field localizations within the particles with spatial distributions almost complementary to those of electric fields. figure 13 shows the response of purely dielectric nanodimers (paired nanocylinders, top view) [38]. for the field directions as in fig. 13 the dimers exhibit field hotspots in the gap between the nanoparticles: for the electric field directed along the axis of the dimer a hotspot of the electric field appears, and for the magnetic field along the same axis a magnetic hotspot appears. in the case when the nanoparticles are metallic, an identical situation is encountered if the layout is as shown in fig. 13a. however, metallic dimers behave oppositely to the dielectric ones when the configuration shown in fig.13b is used, and contrary to the all-dielectric case no magnetic hotspot appears at all. possible approaches to decreasing losses in optical metamaterials 511 fig. 13 electric and magnetic hotspots in purely dielectric dimers. + and – signs describe polarization of molecules within dimers. the vector of electric polarization within separate nanoparticles has the same direction as the electric field, while magnetic polarization follows the magnetic field. a square array of cylindrical high refractive index dielectric resonators is shown in fig. 14. such structure is denoted as dielectric huygens metasurface [43]. a metasurface can be defined as quasi-2d structure with a subwavelength thickness containing metamaterial “atoms” in its plane, which are themselves with subwavelength dimensions. a huygens metasurface can behave as a reflectionless plane, i.e. an array of huygens sources which do not have a backward component of scattering. the metamaterial “atoms” in this case are high-index cylinders arranged in plane. 
fig. 14 square array of subwavelength high refractive index cylinders embedded in a low-index host. 512 z. jakšić, m. obradov, o. jakšić, g. isić, s. vuković, d. vasiljević radović 5. optimizing frequency dispersion metal-dielectric nanocomposites exhibit very complex photonic behavior even in the case of the simplest structures. an interplay between bragg and plasmon-polariton interface phenomena generates a plethora of various electromagnetic modes. as an illustration, fig. 15 shows a frequency dispersion of a simple one-dimensional metal-dielectric multilayer with only 3 metal-dielectric pairs. the structure is deposited on dielectric and surrounded by air or vacuum. even such a basic plasmonic structure shows a surprising wealth of modes. besides a bandgap one can observe different plasmonic modes (to the right of the light line), including those modes that exist within the bandgap and cross into the bands, as well as the negative group velocity modes. obviously, if we meet such a complex situation for a simple 1d plasmonic crystal, with increasing structural complexity (making nanoplasmonic structures in 2d and 3d), it should be possible to customize frequency dispersion of the obtained artificial materials to arrive at almost any desired group velocity in a given frequency/wave vector range. the idea is to adjust the parameters to minimize losses and maximize fom for a given frequency range. fig. 15 frequency dispersion of a simple three-layer pair metal-dielectric with vacuum (n=1) on top side and dielectric substrate n=2.89. possible approaches to decreasing losses in optical metamaterials 513 now we give here an example of plasmonic metamaterials that can have strongly decreased losses. it is an old/new paradigm –materials with hyperbolic dispersion [44] (hm, hyperbolic metamaterials). the effect was first demonstrated in 1969, but only relatively recently attracted attention within the field of metamaterials. 
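whether a given metal-dielectric multilayer falls in the hyperbolic regime can be checked with a short sketch. it assumes (none of this is taken from the paper) a lossy drude metal, the standard effective-medium formulas for a deeply subwavelength layered stack, and illustrative silver-like parameters; hyperbolic dispersion then corresponds to opposite signs of the real parts of the in-plane and normal effective permittivities. all function names are hypothetical.

```python
import math


def drude_permittivity(omega, omega_p, gamma, eps_inf=1.0):
    """Lossy Drude permittivity, exp(-i*omega*t) convention (Im > 0 is loss)."""
    return eps_inf - omega_p**2 / (omega * (omega + 1j * gamma))


def multilayer_permittivity(eps_m, eps_d, f):
    """Effective-medium permittivities of a thin-layer stack.

    Returns (eps_tau, eps_n): in-plane and normal components,
    with f the metal volume fraction.
    """
    eps_tau = f * eps_m + (1.0 - f) * eps_d
    eps_n = 1.0 / (f / eps_m + (1.0 - f) / eps_d)
    return eps_tau, eps_n


def is_hyperbolic(eps_tau, eps_n):
    """True when the real parts of the two components have opposite signs."""
    return eps_tau.real * eps_n.real < 0.0


# Assumed, illustrative parameters (not from the paper).
OMEGA_P = 1.37e16                      # plasma frequency, rad/s
GAMMA = 1.0e14                         # damping rate 1/tau, rad/s
OMEGA = 2.0 * math.pi * 2.998e8 / 600e-9  # angular frequency at 600 nm
EPS_D = complex(2.25, 0.0)             # dielectric layer

if __name__ == "__main__":
    eps_m = drude_permittivity(OMEGA, OMEGA_P, GAMMA)
    for f in (0.05, 0.3):
        eps_tau, eps_n = multilayer_permittivity(eps_m, EPS_D, f)
        print(f"f = {f:.2f}: hyperbolic = {is_hyperbolic(eps_tau, eps_n)}")
```

with these assumed numbers a stack with a small metal fraction stays in the ordinary (elliptic) regime, while a larger fraction makes the in-plane component negative and the dispersion hyperbolic. the effective-medium description is itself only an approximation and does not capture the individual modes of a finite stack.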
hm metallodielectric structures became a target of intensive research, among other reasons, because their absorption losses can be strongly reduced. hm is a metamaterial designed to exhibit extreme optical anisotropy, with opposite signs of dielectric permittivity in two orthogonal directions:
$$\varepsilon_n \cdot \varepsilon_\tau < 0 \tag{1}$$
$$\varepsilon_\tau = \varepsilon_x = \varepsilon_y, \quad \varepsilon_n = \varepsilon_z \tag{2}$$
hyperbolic dispersion for extraordinary waves:
$$\frac{k_\tau^2}{\varepsilon_n} + \frac{k_n^2}{\varepsilon_\tau} = k_0^2 \tag{3}$$
$$k_0^2 = \frac{\omega^2}{c^2}, \quad k_\tau^2 = k_x^2 + k_y^2, \quad k_n = k_z \tag{4}$$
hm are much easier to fabricate than the well-known double-negative media (artificial composites that simultaneously have their effective permeability and permittivity below zero, i.e. negative refractive index metamaterials). a visual presentation of the topological transformations in k-space which lead to hyperbolic dispersion is shown in fig. 16.
fig. 16 topological transitions in k-space: isofrequency surfaces for extraordinary waves in hyperbolic metamaterials.
various implementations of hyperbolic metamaterials are illustrated in fig. 17. the simplest one is obviously a metal-dielectric multilayer whose isofrequency surfaces are hyperbolic. other examples include multilayer fishnet metamaterials and their complementary structures, pillars made from alternating metal and dielectric layers. finally, the most complex design presented in fig. 17 is a sculpted metal-dielectric film, the superlens design.
fig. 17 examples of hyperbolic metamaterials
another type of structure that can be used to tailor optical absorption losses is the plasmonic hypercrystal. it can be defined as a periodic combination of a hyperbolic medium with another medium (metal, dielectric, metamaterial...) [45]. a general design of a hypercrystal is shown in fig. 18.
fig. 18 general design of a hypercrystal
dispersion of hyperbolic materials does not impose a diffraction limit for tm waves (system unlimited by frequency!). bragg reflection in a hyperbolic photonic crystal (d~λ0) leads to the appearance of optical tamm surface states [7]. in hypercrystals the formation of pbgs (photonic bandgaps) persists in the subwavelength mode (metamaterial regime, d<<λ0). optical tamm states in hypercrystals lead to high em confinement (larger wave numbers) and simultaneously to lower absorption losses compared to surface plasmon polaritons. such behavior does not occur either in conventional pbg structures or in metamaterials. figures 19 and 20 show the frequency dispersion of the extinction coefficient im(kn)d (absorption) in a hypercrystal in dependence on the normalized in-plane momentum (k∥/k0), for varying τ=1/γ in the lossy drude model (fig. 19) and for varying metal fraction in the hyperbolic part (fig. 20).
fig. 19 absorption in a hypercrystal for varying τ=1/γ in lossy drude model.
fig. 20 absorption in a hypercrystal for varying metal fraction in hyperbolic part.
6. conclusion
reaching low-loss or lossless nanophotonics and plasmonics is a holy grail of electromagnetics, photonics and transformation optics. strategies include the use of alternative plasmonic materials, all-dielectric nanocomposites and optimization of dispersion and structure toward lower losses. each of them holds its own promises and pitfalls. a winning combination is not (yet) known, but may include a combination of two or more of the above. a host of practical applications would benefit, among them transformation optics (including superlenses and hyperlenses, cloaking devices, superconcentrators, superabsorbers...). elimination of losses is a crucial step toward merging electronics and photonics into a super-compact, super-fast new generation of integrated circuitry.
acknowledgement: the paper is a part of the research funded by the serbian ministry of education and science within the projects tr32008, iii45016 and on171005, as well as by the qatar national research fund within the projects nprp 8-028-1-001 and nprp 7-665-1-125.

facta universitatis
series: electronics and energetics vol. 33, no. 4, december 2020, pp. 617-630
https://doi.org/10.2298/fuee2004617m
© 2020 by university of niš, serbia | creative commons license: cc by-nc-nd

distributed multi-hop routing algorithm for wireless sensor networks

morteza mohammadi zanjireh, jaafar gadban
department of computer engineering, imam khomeini international university, qazvin, iran

abstract. in a wireless sensor network (wsn), routing is the process of finding a cost-effective route in terms of power consumption.
as an evaluation criterion for wsn performance, network lifetime is directly affected by the routing method. a wide variety of wsns use different routing techniques, such as the shortest distance path. in this paper, we propose a novel algorithm, based on the shortest path algorithm, that optimizes power consumption in wsn nodes. in this approach, the energy level of the nodes and their geographical distance from each other contribute to the weight of the connecting path. the proposed algorithm is used as a data dissemination method in wsns with randomly scattered nodes. we also apply dijkstra’s shortest path algorithm to the same networks. the results show that the proposed algorithm increases the network lifetime by up to 30% by preventing nodes with low charge levels from early disconnection.

key words: wireless sensor network, routing, network lifetime, dijkstra’s algorithm.

1. introduction

a wireless sensor network (wsn) [1][2][3][4][5][6][7][8] is an ad hoc network consisting of a set of distributed, small, low-power, low-cost sensor nodes that communicate through a wireless link and have limited memory, computational, and communication resources. in addition, sensor nodes can carry special equipment, such as a global positioning system (gps) antenna, that helps them locate themselves (location-aware networks) [9]. each node continuously monitors the environmental conditions, collects detailed information about them, and then transmits the collected data to a special node called a sink or base station (bs) [10]. this station passes the received data to the server, from where the end user can access it. wsns do not rely on a pre-existing infrastructure, such as routers in wired networks or access points in managed wireless networks [11].
moreover, sensor nodes are positioned randomly and therefore all the protocols and algorithms have self-organizing capability [12]. in addition, wsns can be homogeneous or heterogeneous, depending on the storage and processing capacities, battery power, and sensing and communication capabilities of their nodes [13][14]. they can also be static, in which case the sensors are in fixed positions, or dynamic, in which case the nodes are moving objects [13]. additionally, wsns can be single-hop or multi-hop [4][5][6][8][15]. in the multi-hop case, a sensor node plays a dual role, working as both data originator and data router, whereas in the single-hop case the sensor sends the collected data directly to the bs [1][16][17].

received april 1, 2020; received in revised form june 14, 2020
corresponding author: morteza mohammadi zanjireh, department of computer engineering, imam khomeini international university, 34148-96818 qazvin, iran (e-mail: zanjirehm@eng.ikiu.ac.ir)

applications of wsns range from small-size healthcare surveillance systems to large-scale environmental monitoring. such networks can only attain their objectives as long as they are “alive”. the wsn lifetime is therefore an important metric that forms an upper bound on network availability. this metric, which depends on the lifetimes of all the sensor nodes, evaluates the performance, availability, and security of the network in an application-specific way [18].

one of the most important tasks in a wsn, and one that affects its lifetime, is communication. in such a network, routing [1] is the process of finding a cost-effective route in terms of power consumption. the simplest method of sending data packets to the bs is direct transmission, in which each sensor node communicates with the bs individually. a second approach is the flooding protocol [19][20][21], whereby each node must transmit the received packet to all of its neighbors. this process continues until the packet reaches its destination or the maximum number of hops.
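the flooding rule described above can be sketched in a few lines of python. this is only a minimal illustration, not the implementation used in any of the cited protocols: the adjacency dictionary, the function name `flood`, the example network, and the `seen` set that stops a node from forwarding the same packet twice are all assumptions made for this example.

```python
from collections import deque


def flood(adjacency, source, destination, max_hops):
    """Flood one packet from source; return (reached, transmissions).

    adjacency maps a node id to the list of its neighbors.
    Assumption of this sketch: a node forwards a given packet only once.
    """
    transmissions = 0
    reached = False
    queue = deque([(source, 0)])  # entries are (node, hops used so far)
    seen = {source}
    while queue:
        node, hops = queue.popleft()
        if node == destination:
            reached = True
            continue  # the destination does not forward the packet
        if hops == max_hops:
            continue  # hop limit reached, the copy is dropped here
        for neighbor in adjacency[node]:
            transmissions += 1  # one copy is sent to every neighbor
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, hops + 1))
    return reached, transmissions


# hypothetical example network: a reaches d through either b or c
example = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a", "d"], "d": ["b", "c"]}
print(flood(example, "a", "d", max_hops=3))  # prints (True, 6)
```

in this small example six transmissions are spent to deliver a single packet, because every node sends a copy to each of its neighbors and node d is offered the packet by both b and c.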
on the other hand, the gossiping algorithm is a version of the flooding protocol [22] in which a node sends the packet to a randomly selected neighbor. shortest distance or shortest hop path algorithms, such as dijkstra’s and bellman-ford [23], are also well-known, time- and energy-efficient solutions for determining the best path for data transmission [24].

routing techniques are classified based on the architecture, protocol operation, and topology (route selection strategy) of the network. each routing algorithm can belong to more than one category [25]. based on the network architecture, there are three subcategories [26]: flat, hierarchical, and location-based. as a source-initiated technique, flat or “data-centric” routing [1][27] uses a network setup message, including hop counts and the remaining energy level of neighbor nodes, to find the route. hierarchical routing [17][28][29][30][31] is an energy-efficient technique that determines the role of each node based on its energy level. in location-based routing techniques, each sensor node is aware of its location [27]. operation-based routing protocols aim to achieve optimal performance and save the scarce resources of the network. these techniques include query-based, negotiation-based, quality of service (qos)-based, path selection-based, and coherent-based routing [1][32][33]. the main idea behind topology-based routing protocols concerns how the source node computes and maintains the paths to the destination nodes [10][13][34][35][36]. this category is subdivided into proactive, reactive, and hybrid routing.

selecting an appropriate routing algorithm is a fundamental task in wsn applications and directly affects the network lifetime. an inefficient routing algorithm drains the energy of nodes faster and consequently lowers network availability. the performance of some of the previously mentioned methods is limited [37].
for example, since the bs is usually located far away from the sensor nodes, direct transmission not only results in high transmission costs but also drains the energy of nodes faster and reduces the system lifetime. in the flooding technique, nodes ignore the amount of their available energy (resource blindness) and duplicated data packets are sent to the same node (implosion). because it keeps only one copy of a message, the gossiping approach avoids implosion but has a longer propagation time. overlap is another problem of the flooding technique; it happens when two nodes share the same geographical region. though used in many wsn applications [24], shortest path algorithms have unbounded message complexity and significantly high overhead, which make them unsuitable for wsns because of the energy problem. shortest hop path algorithms are also used in many protocols for wsns. in this asynchronous approach, path construction is easy and straightforward, but the message complexity cannot be calculated and the initiator node cannot know whether the algorithm terminates.

although numerous methods have been proposed to increase wsn lifetimes, there is still much ongoing research on how to optimize energy consumption in them. in this research, we propose a distributed multi-hop algorithm. our goal is to balance the energy consumption between network nodes by creating energy-efficient paths from the bs to all nodes and changing them when the energy levels of the constituent nodes drop below a critical level. the proposed algorithm is based on the well-known dijkstra’s shortest path algorithm [23]. to investigate the performance of the proposed algorithm in wsns, we implemented both it and dijkstra’s algorithm. the obtained results show that our proposed algorithm increases the lifetime of the network by up to 30%.
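for reference, the dijkstra baseline mentioned above can be sketched as follows. this python sketch is not the authors' implementation: the graph representation (a dictionary of neighbor-to-weight dictionaries), the function names `dijkstra` and `path_to`, and the example weights are assumptions chosen for illustration.

```python
import heapq


def dijkstra(graph, source):
    """Return (dist, prev) for shortest paths from source.

    graph maps a node to a dict {neighbor: non-negative edge weight}.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, prev


def path_to(prev, source, target):
    """Rebuild the path source -> target (assumes target is reachable)."""
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


# hypothetical network: link weights stand for distances in meters
example = {
    "bs": {"a": 10, "b": 25},
    "a": {"bs": 10, "x": 20},
    "b": {"bs": 25, "x": 2},
    "x": {"a": 20, "b": 2},
}
dist, prev = dijkstra(example, "bs")
print(dist["x"], path_to(prev, "bs", "x"))  # prints 27.0 ['bs', 'b', 'x']
```

the priority queue always expands the closest unsettled node first, which is why the result is only valid for non-negative edge weights.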
the rest of this paper is organized as follows: “related work” presents the background of routing algorithms in wsns. the third section introduces the proposed routing algorithm. “results and discussion” presents the experiment details and experimental results. finally, the conclusion and a discussion of future work are given in the last section.

2. related work

sensor protocols for information via negotiation (spin) [38] is a data-centric routing protocol based on a negotiation model for propagating information in the wsn. spin overcomes the shortcomings of flooding through negotiation and resource adaptation. in this algorithm, nodes negotiate with each other about their data requirements. this process ensures that there is no redundant data transmission in the network. directed diffusion (dd) [39] is a data-centric, query-based routing protocol for data propagation. in this protocol, the bs sends an interest message to the network, and a sensor node sends gradients towards the bs if its data matches the interest. the path is formed while the source is sending gradients. rumor routing (rr) [40] is a variant of the dd algorithm and an example of hybrid routing. this protocol sends queries to the nodes that have observed an event instead of flooding a message into the whole network. low energy adaptive clustering hierarchy (leach) [41] is a hierarchical routing protocol that adopts an equal probability method to select cluster heads in a cyclic and random manner. it also distributes the energy load of the whole network evenly among the nodes. cougar [42] is another data-centric routing protocol that introduces a new query layer between the wsn and its applications. destination-sequenced distance vector (dsdv) [13] is a popular, highly proactive, loop-free routing protocol. it uses distance vectors to find the shortest path to the destination.
ad hoc on-demand distance vector (aodv) [36] is a scalable, loop-free, reactive routing protocol for mobile ad hoc networks that supports both unicast and multicast routing. greedy perimeter stateless routing (gpsr) [35] is another reactive geographical routing protocol for wsns. in this method, each node only needs its neighbors' positions, without other topological information. every node is assumed to have a mechanism, possibly a gps device [43], to identify its own location. distance routing effect algorithm for mobility (dream) [43] is also a geographical routing protocol, considered both proactive and reactive. the distributed bellman-ford (dbf) [43] is a routing algorithm based on shortest distance paths in wsns that is suitable for distributed systems. query-based protocol (peq) [24] and inter-cluster communication (ice) [44] are protocols based on shortest hop path algorithms. ice uses an acknowledgment-based approach to discover faulty paths.

2.1. dijkstra’s algorithm

dijkstra’s algorithm [23] is a well-known solution for finding the shortest path between two vertices in a weighted, directed graph, that is, the path for which the sum of the weights of its constituent edges is minimized. dijkstra’s algorithm assumes that the number of vertices is finite and that all edge costs are non-negative. because it calculates the shortest path between one node and every other node in the graph, dijkstra’s algorithm is appropriate for wsn applications. in figure 1, the green path represents the shortest path between node x and the base found by dijkstra’s algorithm.

fig. 1 shortest path between base and node x calculated by dijkstra’s algorithm

3. the proposed algorithm

in this section, we propose an algorithm that maximizes the lifetime of the network and balances power consumption between its nodes. most of the algorithms mentioned in the previous section have their own deficiencies.
for example, the flooding algorithm suffers from implosion, overlap, and resource blindness. gossiping has a high propagation time. spin does not guarantee the delivery of data and is not suitable for applications requiring reliable data. dd has high overhead at the sensor nodes and is not suitable for applications requiring a continuous flow of data. rr fails in large networks. cougar has extra overhead and memory usage and requires synchronization for successful in-network data computation. although the shortest path algorithm is one of the most widely used algorithms for data propagation in wsns [22], this method has serious problems. one problem is that the nodes located near the bs consume more energy than the others. moreover, the shortest distance approach drains the energy of important nodes that play a bridge role between two parts of the network.

thus, to balance energy consumption in the wsn, we propose a flat routing algorithm, called layered routing (lr), that deploys a semi-dynamic transmit range. in this multipath, reactive method, a lower transmission range is set for the nodes near the bs, whereas higher transmission ranges are set for the nodes that are far away from the bs, so that they can reach the bs directly without using the low energy nodes in the connection path. our proposed algorithm makes the following assumptions:

1. all nodes are stationary.
2. all nodes have the same capabilities (processing, memory, radio, and battery power).
3. each node has a distinct node id.
4. links between nodes are symmetric (i.e. if there is a link from a to b, there exists a reverse link from b to a).
5. nodes are not aware of their positions (they are not equipped with a gps receiver).

3.1. description of proposed algorithm

in order to apply the algorithm, the network has to be divided into layers according to the distance between the nodes and the bs.
for example, nodes within a 20 m radius of the bs are in the first layer, and nodes within a 40 m radius of the bs that are not in the first layer are in the second one. in addition, a unique id is assigned to each node initially. the routing procedure works as follows:

1. the bs sends a request message to the surrounding nodes within a specific range (e.g. 20 m) to form the first layer.
2. nodes receiving the request reply with an acknowledgment message through the previous layers (except in the first layer, where nodes contact the bs directly).
3. the bs dedicates an id to each of these nodes and broadcasts these ids (the layer is constructed).
4. the bs increases the transmitting range (e.g. to 40 m).
5. the first four steps are repeated (network layering) until no node replies to the bs with an acknowledgment message (i.e. the whole network is layered, figure 2).
6. to define the paths, a table (figure 3) is built that links nodes in different layers with each other and with the bs. the weight of the link between two nodes is determined from their distance, which the receiving node computes from the received signal strength, and from the sender's residual energy.
7. after the network starts running, the bs monitors the residual energy of the nodes. when the energy level of a certain fraction of the nodes is less than a predefined threshold (e.g. 10% of the initial energy), the bs increases the transmitting range to establish new connections that do not include the low energy nodes.

fig. 2 a layered wsn (layers are defined and each node has a unique id)

fig. 3 connections table produced by the lr algorithm

3.2. simulation setups

we examine the performance of the lr algorithm using matlab r2016b. the simulation results are compared with the performance of dijkstra’s shortest path algorithm. in the simulated wsn, there are two kinds of nodes: bs and sensor.
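for such a simulation, the layering and link weighting of section 3.1 could be sketched as below. this is a rough illustration under stated assumptions, not the authors' matlab code. the sketch takes the node-to-bs distances as known, which is a simulation shortcut (in the procedure itself the layers emerge from the growing transmitting range of the bs). the 20 m layer width is the example value from the text. the weight formula in `link_weight` is purely hypothetical, since the text only states that distance and the sender's residual energy both contribute; all function names are likewise invented for this example.

```python
import math


def assign_layers(distance_to_bs, layer_width=20.0):
    """Map each node id to a 1-based layer index from its distance to the bs.

    With layer_width=20, nodes within 20 m fall in layer 1, nodes within
    40 m that are not in layer 1 fall in layer 2, and so on.
    """
    return {
        node: max(1, math.ceil(distance / layer_width))
        for node, distance in distance_to_bs.items()
    }


def link_weight(distance, sender_residual_energy):
    """Hypothetical weight (assumption): grows with the sender's residual
    energy and shrinks with distance. The paper does not give the formula."""
    return sender_residual_energy / distance


def choose_next_hop(candidates):
    """Pick the candidate with the highest weight.

    candidates maps node id -> (distance in m, residual energy in j).
    """
    return max(candidates, key=lambda node: link_weight(*candidates[node]))


layers = assign_layers({"n1": 12.0, "n2": 20.0, "n3": 20.5, "n4": 55.0})
print(layers)  # prints {'n1': 1, 'n2': 1, 'n3': 2, 'n4': 3}
```

choosing the highest weight mirrors the simulation rule that a node forwards its packet to the next node with the highest weight; any other monotone combination of distance and residual energy would fit the description equally well.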
the efficient transmitting distance among nodes is 25 m for dijkstra’s algorithm and varies between 15 and 50 m for the lr algorithm. in this experiment, the wsn consists of 100 sensor nodes and one bs. the sensor nodes are scattered randomly in a 100 m by 100 m square and the bs is placed at its center (figure 4). moreover, all the sensor nodes are static and have no mobility. the initial energy of a sensor node is 0.5 j and the bs has no energy restriction. the energy consumption for transmitting and receiving one bit per meter is 5*10^(-8) j and 0, respectively. also, the idle listening energy cost is set to zero. we also assume that there is no data loss or data collision in the simulation. each sensor node generates random data (6400 bits per packet) and transmits it to the next node on its path with the highest weight.

table 1 simulation study parameters

parameter                                          value
area                                               100*100 m^2
base station position                              (50,50)
number of sensors                                  100
transmitting energy (per bit and per meter)        5*10^(-8) j
receiving energy (per bit and per meter)           0
initial energy                                     0.5 j
transmitting range of dijkstra’s algorithm         25 m
  (except for experiments in figure 9)
transmitting range (proposed algorithm)            15-50 m

the simulation continues until no sensor is connected to the bs. the simulation starts by defining the layers and giving ids to the sensors. each layer is defined based on its distance from the bs. the sensors’ transmitting signal range is dynamic; it starts at a low value and goes up when the energy of the sensors near the bs drops. table 2 illustrates the relationship between the transmitting signal range and the number of low energy sensors.

table 2 the relationship between the transmitting signal range and the number of low energy sensors.
number of low energy nodes                 transmitting signal range
low energy nodes < 10% of all nodes        15% of environment length
low energy nodes < 20% of all nodes        25% of environment length
low energy nodes < 30% of all nodes        30% of environment length
low energy nodes < 40% of all nodes        35% of environment length
low energy nodes < 50% of all nodes        40% of environment length
low energy nodes > 50% of all nodes        50% of environment length

fig. 4 an example of the random distribution of nodes in wsn

4. results and discussion

since the nodes are randomly distributed, wsns with the same number of nodes may perform differently. therefore, dijkstra’s and the lr algorithm are both applied to the same network each time. in each round of the simulation, every alive node with at least one connection sends a message to the bs. therefore, when the simulation reaches 100 rounds, approximately 10000 messages will have been received by the bs. the connections and the status of the nodes are different after each round of simulation. figure 5 presents the network status after 500 rounds of simulation with dijkstra’s algorithm as the routing method. in figure 5, node number 1 represents the bs, and a dead node does not have any connections. we can see that after 500 rounds the bs and the remaining alive nodes are disconnected and the network is considered dead.

fig. 5 the connection of nodes after 500 rounds of simulation with dijkstra’s algorithm as routing method

when we applied the lr algorithm to the same network, we obtained different results (figures 6 and 7). in these figures, the network holds its transmitting range as long as the number of low energy nodes is under 10; thereafter, the transmitting range is increased. therefore, more nodes can reach the bs directly, and the low energy nodes around it are no longer mediators for signal transmission. as the number of low energy nodes increases, the transmitting range increases as well.
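the range schedule of table 2 and the transmission cost of table 1 can be written down as two small python functions. this is a sketch under assumptions, not the authors' matlab code: the function and constant names are invented here, and since table 2 does not say what happens at exactly 50% of low energy nodes, the sketch assigns that case to the last row.

```python
# (upper bound on the share of low energy nodes in %, range in % of the
# environment length), transcribed from table 2
RANGE_STEPS = [(10, 15), (20, 25), (30, 30), (40, 35), (50, 40)]


def transmit_range(low_energy_percent, environment_length=100.0):
    """Return the transmitting range in meters for a given share of
    low energy nodes (in percent of all nodes)."""
    for upper_bound, range_percent in RANGE_STEPS:
        if low_energy_percent < upper_bound:
            return range_percent * environment_length / 100
    # assumption: exactly 50% is treated like the "> 50%" row of table 2
    return 50 * environment_length / 100


def transmit_energy(bits, distance_m, joule_per_bit_per_meter=5e-8):
    """Energy in joules to send `bits` over `distance_m` meters, using the
    per-bit, per-meter cost of table 1 (receiving costs nothing there)."""
    return bits * distance_m * joule_per_bit_per_meter


print(transmit_range(5), transmit_range(45), transmit_range(80))
# prints 15.0 40.0 50.0
```

under the parameters of table 1, sending one 6400-bit packet over 25 m would cost 6400 × 25 × 5*10^(-8) j = 0.008 j in this model.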
this allows more distant nodes to communicate directly with the bs; that is, low energy nodes are removed from the previous connecting paths and can consequently save more power and remain alive.

fig. 6 the connection of nodes after 1000 rounds of simulation with the lr algorithm as routing method (transmitting range 40 m)

fig. 7 the connection of nodes after 1200 rounds of simulation with the lr algorithm as routing method (transmitting range 50 m)

it can be seen that the lr algorithm used the residual energy of the nodes efficiently to keep the network alive and functional for as long as possible (1200 rounds of simulation with the lr algorithm versus 500 rounds with dijkstra’s), which can be considered a considerable improvement over dijkstra’s performance.

the results show that the lr algorithm avoided early splitting of the network. by taking into account both the nodes' distance and their residual energy in the weighting process, the lr algorithm increased the network lifetime. figure 8 shows that dijkstra’s algorithm split the network and consumed the energy of important nodes faster than the lr algorithm. in other words, with the lr algorithm, nodes can communicate for longer (more rounds of simulation) than with dijkstra’s.

fig. 8 a comparison of the performance of dijkstra’s and the lr algorithm

the results show that both algorithms tend to consume the energy of nodes near the bs faster than that of the others, although the lr algorithm tries to overcome this problem by increasing the transmitting range. the simulation was repeated using different transmitting ranges for dijkstra’s algorithm (25, 35, 45 and 55 m). the results are shown in figure 9. figure 10 also presents a comparison of the number of rounds achieved with the lr algorithm and with dijkstra’s as the routing protocols.
the results attest to the lr algorithm’s better performance in randomly produced networks compared to dijkstra’s. in another experiment, we changed the size of the wsn (200 m by 200 m) to check the lr performance. the results were promising; that is, the algorithm is robust to random node distribution.

fig. 9 number of rounds for each transmitting range in dijkstra’s algorithm

fig. 10 the lr algorithm vs. dijkstra’s algorithm

5. conclusion and future research

in this paper, we proposed a novel distributed multi-hop algorithm for wsns. we used the combination of the geographical distance between two nodes and their residual energy to define the weight of a path. this combination provided the flexibility to reassign the communication paths of the network. to investigate the effectiveness of the proposed algorithm, we simulated it on randomly produced wsns, and the results showed that it avoided an early split of the network into different parts and thus increased its lifetime. furthermore, using the residual energy to determine the transmitting range of the nodes not only prevented the network from early death but also gave every node the ability to communicate with the bs even if its neighbors are dead. the proposed algorithm was compared to dijkstra’s shortest path algorithm, and its results were up to 30% better than those of dijkstra’s.

in light of the promising results of the proposed algorithm, it could be useful to consider applying it along with other energy-efficient techniques in wsns. in this research, we split the network into layers based on their distance from the bs, and different transmission ranges are used for different layers. in future work, we will try to adjust the transmitting range at finer levels (the layer level or the single-sensor level), which could enhance the performance and the lifetime of the network.
this study did not consider data delay; future work should study data delay in the proposed algorithm. moreover, the bs location has an impact on wsn performance and lifetime; future work should take this into consideration, along with having more than one bs in the network.

list of abbreviations

ad hoc on-demand distance vector (aodv)
base station (bs)
destination-sequenced distance vector (dsdv)
directed diffusion (dd)
distance routing effect algorithm for mobility (dream)
distributed bellman-ford (dbf)
global positioning system (gps)
greedy perimeter stateless routing (gpsr)
inter-cluster communication (ice)
layered routing (lr)
low energy adaptive clustering hierarchy (leach)
quality of service (qos)
query-based protocol (peq)
rumor routing (rr)
sensor protocols for information via negotiation (spin)
wireless sensor network (wsn)

funding: this research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

references

[1] a. djedouboum, a. abba ari, a. gueroui, a. mohamadou, and z. aliouat, “big data collection in large-scale wireless sensor networks,” sensors, vol. 18, no. 12, p. 4474, 2018. [2] v. mythili, a. suresh, m. m. devasagayam, and r. dhanasekaran, “seat-dsr: spatial and energy aware trusted dynamic distance source routing algorithm for secure data communications in wireless sensor networks,” cogn. syst. res., 2019. [3] b. xi-rong, q. zhi-tao, z. xue-feng, and z. shi, “an efficient energy cluster-based routing protocol for wireless sensor networks,” in proceedings of the 2009 chinese control and decision conference, 2009, pp. 4716–4721. [4] m. m. zanjireh and h. larijani, “analytical modelling of the a-anch clustering algorithm for wsns,” in proceedings of the 2015 seventh international conference on ubiquitous and future networks, 2015, pp. 429–434. [5] m. m. zanjireh and h.
larijani, “a survey on centralised and distributed clustering routing algorithms for wsns,” in proceedings of the 2015 ieee 81st vehicular technology conference (vtc spring), 2015, pp. 1–6. [6] m. m. zanjireh, h. larijani, and w. o. popoola, “energy based analytical modelling of anch clustering algorithm for wireless sensor networks,” int. j. adv. networks serv., vol. 7, no. 3/4, 2014. [7] m. m. zanjireh, h. larijani, and w. o. popoola, “activity-aware clustering algorithm for wireless sensor networks,” in proceedings of the 2014 9th international symposium on communication systems, networks & digital sign (csndsp), 2014, pp. 122–127. [8] m. m. zanjireh, a. shahrabi, and h. larijani, “anch: a new clustering algorithm for wireless sensor networks,” in proceedings of the 2013 27th international conference on advanced information networking and applications workshops, 2013, pp. 450–455. [9] a. kumar, h. y. shwe, k. j. wong, and p. h. chong, “location-based routing protocols for wireless sensor networks: a survey,” wirel. sens. netw., vol. 9, no. 1, pp. 25–72, 2017. [10] s. basagni, i. chlamtac, v. r. syrotiuk, and b. a. woodward, “a distance routing effect algorithm for mobility (dream),” in proceedings of the 4th annual acm/ieee international conference on mobile computing and networking, 1998, pp. 76–84. [11] k. maraiya, k. kant, and n. gupta, “application based study on wireless sensor network,” int. j. comput. appl., vol. 21, no. 8, pp. 9–15, 2011. [12] m. r. ahmed, x. huang, d. sharma, and h. cui, “wireless sensor network: characteristics and architectures,” int. j. inf. commun. eng., vol. 6, no. 12, pp. 1398–1401, 2012. [13] c. e. perkins and p. bhagwat, “highly dynamic destination-sequenced distance-vector routing (dsdv) for mobile computers,” acm sigcomm comput. commun. rev., vol. 24, no. 4, pp. 234–244, 1994. [14] z. zhu and s. h. 
yang, “a possible hardware architecture of wireless sensor nodes,” in proceedings of the 2006 ieee international conference on systems, man and cybernetics, 2006, pp. 3377–3381. [15] m. singh and v. k. prasanna, “optimal energy-balanced algorithm for selection in a single hop sensor network,” in proceedings of the first ieee international workshop on sensor network protocols and applications, 2003, pp. 9–18. [16] m. ilyas and i. mahgoub, handbook of sensor networks: compact wireless and wired sensing systems. 2004. [17] s. saranya and m. princy, “routing techniques in sensor network–a survey,” procedia eng., vol. 38, pp. 2739–2747, 2012. [18] i. dietrich and f. dressler, “on the lifetime of wireless sensor networks,” acm trans. sens. networks, vol. 5, no. 1, 2009. [19] a. kumar and s. pahuja, “a comparative study of flooding protocol and gossiping protocol in wsn,” int. j. comput. technol. appl., vol. 5, no. 2, pp. 797–800, 2014. [20] p. kumarawadu, d. j. dechene, m. luccini, and a. sauer, “algorithms for node clustering in wireless sensor networks: a survey,” in proceedings of the 2008 4th international conference on information and automation for sustainability, 2008, pp. 295–300. [21] q. zhang, r. h. jacobsen, and t. s. toftegaard, “bio-inspired low-complexity clustering in large-scale dense wireless sensor networks,” in proceedings of the 2012 ieee global communications conference (globecom), 2012, pp. 658–663. [22] j. n. al-karaki and a. e. kamal, “routing techniques in wireless sensor networks: a survey,” ieee wirel. commun., vol. 11, no. 6, pp. 6–28, 2004. [23] e. w. dijkstra, “a note on two problems in connexion with graphs,” numer. math., vol. 1, no. 1, pp. 269–271, 1959. [24] o. yilmaz, s. demirci, y. kaymak, s. ergun, and a. yildirim, “shortest hop multipath algorithm for wireless sensor networks,” comput. math. with appl., vol. 61, no. 1, pp. 48–59, 2012. [25] o. younis, m. krunz, and s.
ramasubramanian, “location-unaware coverage in wireless sensor networks,” ad hoc networks, vol. 6, no. 7, pp. 1078–1097, 2008. [26] x. chen and p. yu, “research on hierarchical mobile wireless sensor network architecture with mobile sensor nodes,” in proceedings of the 2010 3rd international conference on biomedical engineering and informatics, 2010, pp. 2863–2867. [27] s. fedor and m. collier, “on the problem of energy efficiency of multi-hop vs one-hop routing in wireless sensor networks,” in proceedings of the 21st international conference on advanced information networking and applications workshops (ainaw’07), 2007, pp. 380–385. [28] r. aggarwal, a. mittal, and r. kaur, “hierarchical routing techniques for wireless sensor networks: a comprehensive survey,” int. j. futur. gener. commun. netw., vol. 9, no. 7, pp. 101–112, 2016. [29] d. bhattacharyya, t. h. kim, and s. pal, “a comparative study of wireless sensor networks and their routing protocols,” sensors, vol. 10, no. 12, pp. 10506–10523, 2010. [30] v. kumar, s. b. dhok, r. tripathi, and s. tiwari, “a review study of hierarchical clustering algorithms for wireless sensor networks,” int. j. comput. sci. issues, vol. 11, no. 3, p. 92, 2014. [31] s. p. singh and s. c. sharma, “a survey on cluster based routing protocols in wireless sensor networks,” procedia comput. sci., vol. 45, pp. 687–695, 2015. [32] m. abdullah and a. ehsan, “routing protocols for wireless sensor networks: classifications and challenges,” in journal of electronics and communication engineering research, dec. 2014, vol. 2, pp. 5–15. [33] m. ullah and w. ahmad, “evaluation of routing protocols in wireless sensor networks,” blekinge institute of technology. karlshamn, blekinge, sweden., 2009. [34] x. hou, d. tipper, and s. wu, “a gossip-based energy conservation protocol for wireless ad hoc and sensor networks,” j. netw. syst. manag., vol. 14, no. 3, pp. 381–414, 2006. [35] b. karp and h. t. 
kung, “gpsr: greedy perimeter stateless routing for wireless networks,” in in proceedings of the 6th annual international conference on mobile computing and networking, 2000, pp. 243–254. [36] c. e. perkins and e. m. royer, “ad-hoc on-demand distance vector routing,” in proceedings of the wmcsa’99. second ieee workshop on mobile computing systems and applications, 1999, pp. 90–100. [37] i. f. akyildiz, w. su, y. sankarasubramaniam, and e. cayirci, “wireless sensor networks: a survey,” comput. networks, vol. 38, no. 4, pp. 393–422, 2002. [38] j. kulik, w. heinzelman, and h. balakrishnan, “negotiation-based protocols for disseminating information in wireless sensor networks,” wirel. networks, vol. 8, no. 2/3, pp. 169–185, 2002. [39] c. intanagonwiwat, r. govindan, and d. estrin, “directed diffusion: a scalable and robust communication paradigm for sensor networks,” in proceedings of the 6th annual international conference on mobile computing and networking, 2000, pp. 56–67. [40] d. braginsky and d. estrin, “rumor routing algorthim for sensor networks,” in proceedings of the 1st acm international workshop on wireless sensor networks and applications, 2002, pp. 22–31. [41] l. xingguo, w. junfeng, and b. linlin, “leach protocol and its improved algorithm in wireless sensor network,” in 2016 international conference on cyber-enabled distributed computing and knowledge discovery (cyberc), 2016, pp. 418–422. [42] y. yao and j. gehrke, “the cougar approach to in-network query processing in sensor networks,” acm sigmod rec., vol. 31, no. 3, pp. 9–18, 2002. [43] k. m. chandy and j. misra, “distributed computation on graphs: shortest path algorithms (tr-lcs8203),” texas university at austin, 1982. [44] a. boukerche and a. martirosyan, “an energy-aware and fault tolerant inter-cluster communication based protocol for wireless sensor networks,” in proceedings of the ieee globecom 2007-ieee global telecommunications conference, 2007, pp. 1164–1168. 
facta universitatis, series: electronics and energetics, vol. 28, no. 3, september 2015, pp. 325-344
doi: 10.2298/fuee1503325m

thermal and electro-thermal modeling of components and systems: a review of the research at the university of parma

roberto menozzi, paolo cova, nicola delmonte, francesco giuliani, giovanna sozzi
department of information engineering, university of parma, italy

abstract. this paper reviews the activity carried out at the department of information engineering of the university of parma, italy, in the field of thermal and electro-thermal modeling of devices, device and package assemblies, circuits, and systems encompassing active boards and heat-sinking elements. this activity includes: (i) finite-element 3d simulation for the thermal analysis of a hierarchy of structures ranging from bare device dies to complex systems including active and passive devices, boards, metallizations, and air- and water-cooled heat-sinks, and (ii) lumped-element thermal or electro-thermal models of bare and packaged devices, ranging from purely empirical to strictly physics- and geometry-based.

key words: modeling, simulation, temperature, reliability

received february 27, 2015
corresponding author: roberto menozzi, department of information engineering, university of parma, parco area delle scienze 181a, 43124 parma, italy (e-mail: roberto.menozzi@unipr.it)

1. introduction

temperature is a key factor in the performance and reliability of electron devices, circuits and systems. from basic material properties such as electrical conductivity to device parameters and, as a consequence, circuit and system performance, the role of temperature is ubiquitous. reliability-wise, many wear-out mechanisms are exponentially accelerated by temperature, and thermal gradients in space and time are the source of many failures, often related to die-attach, solders, etc., which suffer from the differences in the thermal expansion coefficients of the various materials. for these reasons, thermal modeling is mandatory for optimum device and circuit design, analysis, reliability estimation, and failure analysis. however, there are intrinsic difficulties:

1. in an operating device/circuit/system, temperature may vary dramatically over space and time due to localized power dissipation and self-heating, which in general depend on local and instantaneous currents and voltages; in turn, temperature affects performance, hence current and voltage values: the electrical and thermal behavior are therefore tightly coupled, and the problem has to be solved self-consistently;
2. space-wide, thermal modeling is a multi-scale problem: while the volume element where power is dissipated in a semiconductor device may have nanometer-size scale, the boundary conditions that ultimately determine the whole device thermal behavior typically involve regions that are tens or hundreds of micrometers away from that volume; when circuits or systems are to be modeled, the scale explodes to millimeter- or centimeter-size;
3. thermal modeling may be a multi-scale problem time-wise, too: when spectrally rich signals are applied to the device/circuit/system under evaluation, the overall time dependence of temperature is affected by spectral components spanning several decades, with time constants ranging from nanoseconds for small semiconductor volumes to seconds or minutes for large assemblies.

this means that thermal modeling is a very rich field for research, the holy grail being the optimum trade-off between accuracy of the picture and modeling effort. this paper overviews the activity carried out in this field over several years by the authors and co-workers in the department of information engineering of the university of parma, italy. the next section is focused on finite-element (fe) numerical thermal modeling at the device level.
these fe physical models are often used to validate nimbler lumped-element (le) models, where the electrical behavior and thermal behavior can be self-consistently linked much more effectively: these models will be described in section 3. section 4 will review our activities in the field of thermal modeling of circuits, systems and assemblies, and will be followed by a brief summary.

2. device-level finite-element thermal modeling

the beauty of fe models lies in their ability to provide a completely physical, three-dimensional (when required), accurate description of the thermal behavior of complex structures encompassing one or more layers of semiconductor materials, metallizations, passivation layers, etc., with due account of non-linearities like the temperature dependence of thermal conductivities, and of sometimes complex boundary conditions: adiabatic, isothermal, and everything in between, air convection, and even forced cooling. the flip side is obvious: such model sophistication has a cost in terms of complexity of model development and computational burden. for this reason, these models are most frequently purely thermal models, although in principle fe tools allow self-consistent coupling of the thermal problem with the electrical problem, or the electro-magnetic problem. even for purely thermal models, numerical convergence may take an extremely long time to reach, especially in the simulation of time-dependent characteristics, if it can be reached at all: developing practically useful and efficient models is thus far from trivial, and requires skilled and experienced users.

some of our first works in this field describe the simulation of relatively simple structures, such as a 2d rendition of a chip/rig assembly for the analysis of press-pack igbt stress cycles [1] or even the basically 1d structure of a press-pack power p-i-n diode for welding applications [2]: these simulations were in general aimed at a better understanding of accelerated stress experiments.

more specific works were devoted to the modeling of packaged devices for power supplies. ref. [3], for instance, describes the complete workflow of the development of the fe thermal model of packaged power mosfets, including the measurements for parameter extraction and model validation. fig. 1 shows a schematic of the die and package 3d structure (left) and the actual test rig we built for parameter extraction and model tuning by comparison of measured and simulated temperatures (right).

fig. 1 3d die and package structure for fe thermal modeling of power mosfets (left), and the test rig (right) built for parameter extraction and model tuning [3].

in the same context, we also applied 3d fe thermal modeling to the study of passive components, such as the inductors shown in fig. 2 [4], [5]. the complexity and cost of 3d fe models obviously pay off most handsomely when applied in the design phase, when investing in extensive accurate simulations makes good economic sense if it avoids taking unsatisfactory solutions all the way to the prototyping phase. as an example, fig. 3 shows a comparison among different device/heat-sink assemblies [6].
in the field of power converter applications, as illustrated in the examples above, one is often interested in the determination of temperature profiles in assemblies made of die, package, and often heat-sink. the device-level thermal simulation of semiconductor devices for integrated circuits, in contrast, is typically focused on the semiconductor alone (and possibly such top-side elements as metal lines, contacts, and passivation), with the external world replaced by suitable boundary conditions. in this respect, the fact that individual devices are often close-packed in regular patterns in integrated circuits makes things a lot easier, since the planes separating adjacent devices can often be replaced by adiabatic boundary conditions thanks to symmetry.

fig. 2 3d simulation of the temperature distribution in wound (left) [4] and planar (right) [5] transformers for switching power supplies; right scales in °c.

fig. 3 3d fe simulations (bottom) of the temperature distribution in different device/heat-sink assemblies (top); right scales in °c [6].

from the point of view of the variety of materials and geometries, this is a comparatively simpler situation than the one we discussed before, where die/package/heat-sink assemblies are to be studied, and 2d analysis (as opposed to 3d) is often satisfactory. as such, it allows the thermal problem and the electrical problem to be solved self-consistently, in what may be called an electro-thermal (et) simulation, where classical semiconductor device equations (e.g., drift-diffusion equations plus electron and hole continuity equations plus poisson equation) are coupled with the heat transport equation.
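for reference, a common textbook form of the heat transport equation is given below. the source does not write the equation out, so the notation is an assumption: ρ is the mass density, c_p the specific heat, κ(T) the temperature-dependent thermal conductivity, and H the heat generation rate per unit volume (in the simplest approximation, the joule term J·E supplied by the electrical solution).

```latex
\rho \, c_p \, \frac{\partial T}{\partial t}
  = \nabla \cdot \bigl( \kappa(T) \, \nabla T \bigr) + H ,
\qquad H \approx \mathbf{J} \cdot \mathbf{E}
```

the temperature dependence of κ makes the equation non-linear, and H is the term through which the electrical problem feeds the thermal one.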
here the main problem is the dramatically different scale of the regions relevant for the electrical and the thermal problem: while the former is typically in the nanometer to micrometer range, the latter often measures hundreds of micrometers (think for instance of the distance between the channel of a fet and the back-side wafer contact from which most of the heat is dissipated). this is a significant computational challenge that can be overcome with suitable techniques: an introductory review dealing with these problems can be found in [7].

however, when the structure we want to simulate gets more complex and three-dimensional, and features like top-side metal lines and contacts, passivation layers, etc. cannot be neglected lest the thermal problem be significantly distorted, purely thermal simulations are again the weapon of choice. in these, the electrical problem is condensed into just one piece of information: the location and size of the volume where power is dissipated.

our group in particular has worked extensively on the 3d thermal simulation of gan-based fets. an example of device design guidelines provided by 3d fe simulations is shown in fig. 4 [8]. the importance of considering top-side heat spreading and heat removal due to metal lines and contacts is shown in fig. 5.

fig. 4 maximum power density that can be dissipated under a 150 k temperature increase constraint in a 6-finger gan hemt as a function of finger width, finger spacing, and substrate material (3d fe simulations) [8].

fig. 5 simulated gan hemt structure (top) and channel temperature increase in a gan hemt dissipating 3.5 w/mm (bottom); if the effect of top metal lines is neglected (red dashed line) the self-heating is grossly overestimated. rth l is the thermal resistance of the top contacts [8].

obviously enough, top-side boundary conditions are not the only relevant ones.
in the case of gan-based hemts, the thermal boundary resistance (tbr) between the gan buffer and the sic/si/sapphire substrate, due to phonon scattering at the hetero-interface, is particularly significant; the die-attach layer is also a source of additional temperature increase relative to the package back. fig. 6 illustrates the importance of these two factors [9]. in the dynamic simulations of fig. 7 the tbr layer and the die attach are clearly visible.

besides providing valuable guidelines in the design phase, fe thermal simulations are extremely useful in the analysis of reliability results. as an example, fig. 8 [10] shows the fe-simulated thermal map and von mises stress map of a surface-mounted power mosfet undergoing thermal cycling. here the thermal simulation is part of a self-consistent thermo-mechanical model supporting the interpretation of power cycle stress experiments. in another recent reliability study, we also applied fe thermal simulation to the study of the heavy ion irradiation damage in sic schottky diodes [11], [12], showing that the ion penetration raises the junction temperature above the sic melting point, as illustrated by fig. 9, with permanent device damage.

fig. 6 maximum temperature in a gan hemt as a function of the thermal conductance of the tbr layer (blue curve, left) and of the die-attach layer (red curve, right) (3d fe simulations) [9].

fig. 7 dynamic simulation of vertical temperature profiles following the application of a power step in a gan hemt (3d fe simulations). the effect of the tbr layer and of the die-attach can be seen in the temperature step at about 3 μm depth and in the steep temperature gradient at the back surface [9].

fig. 8 thermal map (left, scale in °c) and von mises stress map (right, scale in n/m²) for a surface-mounted power mosfet after 240 s at 0.5 w dissipation (3d fe simulations). the box and arrow indicate one of the critical points for mechanical stress [10].

fig. 9 thermal map of a sic schottky diode after heavy ion (⁷⁹br, 240 mev) penetration [11] (3d fe simulation).

3. lumped-element thermal and electro-thermal models

powerful as they are, fe simulations have some practical limitations, mostly lying in the computational complexity of multi-physics models, such as electro-thermal ones, and in the difficulty of integration in circuit simulation tools. lumped-element (le) thermal models, made of networks of thermal resistances and thermal capacitances, offer in this context a good compromise between accuracy and ease of implementation and integration in electrical simulation tools. here are some of the advantages:

1. thanks to the analogy between thermal resistance and electrical resistance, thermal capacitance and electrical capacitance, temperature and voltage, and dissipated power and current, thermal le networks can be seamlessly and self-consistently integrated in circuit simulation tools;
2. if desired, the model can retain a sound physical meaning, since thermal resistances and capacitances can be calculated based on device geometry and material properties; alternatively, more empirical models can be used, where parameter values are optimized to get the best fit with measurements;
3. including conductive or convective boundary conditions, heat-spreading and heat-sinking elements is relatively easy, and amounts to inserting additional thermal resistances and capacitances between the device and the ambient.

3.1. empirical le thermal models: foster and cauer networks

by far the most common le thermal and electro-thermal models use multi-stage foster or cauer networks such as those shown in fig. 10. these networks may collapse to a single stage in the simplest, and most widespread, models (see for instance [13] for the use of a single-stage model in the context of reliability predictions).

fig. 10 three-stage foster (top) and cauer (bottom) networks for le thermal simulation. the red arrow indicates the injection of dissipated power, the electrical equivalent of which is current. the ambient temperature is modeled by a constant voltage source between the device back-side node and ground: consequently, node voltages give a direct reading of node temperatures.

the foster network has the advantage that each resistance-capacitance stage introduces a specific time constant, given by the product of the two, in the thermal time response of the system. therefore, it is relatively easy to extract the network parameters from the experimental step response. an example is given in fig. 11 [14], where the measured collector current response (dots) to a base current step in an algaas/gaas hbt shows three clear plateaus in the self-heating dynamics: this suggested that a 3-stage (foster) thermal network might be good enough to model the heating dynamics, as demonstrated by the good match of the modeled curves (solid lines).

the drawback of the foster network is that its parameters, and particularly the thermal capacitance values, have little, if any, physical meaning. this lack of physical meaning can be easily grasped if one considers that capacitance discharge may reverse the direction of heat flow on a resistance, something that does not happen in reality, nor in the cauer model.
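since each foster stage contributes one time constant, the step response of a foster network is a sum of exponentials. a minimal python sketch follows; the function name and the three-stage parameter values are hypothetical, chosen only to produce three well-separated time constants, and are not taken from [14].

```python
import math


def foster_step_response(stages, power, t, t_ambient=300.0):
    """Temperature (K) at time t (s) after a power step (W) applied at t = 0.

    stages: list of (r_th [K/W], c_th [J/K]) pairs, one per parallel RC stage.
    Each stage adds the term r_th * (1 - exp(-t / tau)), with tau = r_th * c_th.
    """
    rise_per_watt = sum(
        r_th * (1.0 - math.exp(-t / (r_th * c_th))) for r_th, c_th in stages
    )
    return t_ambient + power * rise_per_watt


# Hypothetical 3-stage network: time constants of 2 us, 0.5 ms and 0.1 s.
stages = [(2.0, 1e-6), (5.0, 1e-4), (10.0, 1e-2)]
for t in (1e-6, 1e-4, 1e-2, 1.0):
    temperature = foster_step_response(stages, 0.5, t)
    print(f"t = {t:.0e} s -> T = {temperature:.3f} K")
```

with time constants this far apart, the response shows one plateau per stage, which is the feature that makes parameter extraction from a measured step response comparatively easy.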
cauer network parameters can be given a physical meaning, especially when each stage of the ladder is associated with a specific part or layer of the device or assembly. the price to pay is a more cumbersome procedure to extract them from measured data.

fig. 11 measured (dots) and modeled (lines) collector current dynamics following a base current step in an algaas/gaas hbt. three clear plateaus in the measured step response suggested the use of a 3-stage le network to model the dynamics of self-heating [14].

regardless of the network topology, the number of stages of the ladder is a key point. single-stage rc networks are most commonly used in compact electro-thermal models, but a single time constant is very unlikely to be able to describe the self-heating dynamics even for a bare die, let alone a packaged device. on the other hand, using networks with more stages than necessary will uselessly burden the model, make parameter extraction more cumbersome, and loosen the tie with device physics. in our experience with gan hemts, the vertical heat flow dynamics of unpackaged devices can be satisfactorily modeled with 3-stage networks. however, in wide-finger fets, one must be aware of the fact that assuming a constant channel temperature is a gross simplification, the semiconductor being significantly hotter at the gate finger center than at its periphery, as shown in fig. 12 [15].

a situation like that shown in fig. 12 requires that more than one rc network be included in the le thermal model. we chose to split the finger width in 5 parts, each one represented by the temperature marked by a letter in fig. 12. each of these sections was modeled with its individual 3-stage rc network. the resulting match between fe-simulated and le-simulated dynamic self-heating was excellent, as demonstrated by fig. 13.
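the step response of a single cauer ladder can be obtained by integrating its node equations in time. the sketch below uses plain explicit euler integration and hypothetical resistance and capacitance values; it is an illustration of the network topology only, not the authors' model or simulation tool.

```python
def simulate_cauer(r, c, power, t_end, dt, t_ambient=300.0):
    """Explicit-Euler step response of an n-stage Cauer ladder.

    r[i]: thermal resistance (K/W) between node i and node i + 1;
          the last resistance connects the last node to the ambient.
    c[i]: thermal capacitance (J/K) between node i and thermal ground.
    Power (W) is injected at node 0. Returns node temperatures (K) at t_end.
    dt must be small compared with the smallest r*c product, otherwise
    explicit Euler becomes unstable.
    """
    n = len(r)
    temps = [t_ambient] * n
    for _ in range(int(round(t_end / dt))):
        new_temps = temps[:]
        for i in range(n):
            # Heat entering node i: injected power, or flow from node i - 1.
            q_in = power if i == 0 else (temps[i - 1] - temps[i]) / r[i - 1]
            # Heat leaving node i towards the next node or the ambient.
            t_next = temps[i + 1] if i + 1 < n else t_ambient
            q_out = (temps[i] - t_next) / r[i]
            new_temps[i] = temps[i] + dt * (q_in - q_out) / c[i]
        temps = new_temps
    return temps


# Hypothetical 3-stage ladder, 0.5 W step, simulated for 10 s.
r = [1.0, 2.0, 3.0]
c = [1e-3, 1e-2, 1e-1]
temps = simulate_cauer(r, c, power=0.5, t_end=10.0, dt=1e-4)
print([round(t, 3) for t in temps])
```

unlike in the foster network, heat here always flows from the source node towards the ambient, so each node can be associated with a physical layer. modeling a wide finger with several sections would amount to running one such ladder per section.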
the next step is that of using the le thermal model to develop a self-consistent dynamic electro-thermal device model, as schematically shown in fig. 14 [16]. here each of the 5 sections (a-e in fig. 12) is modeled by self-consistently coupling a temperature-dependent large-signal model with a 3-stage le cauer thermal network, for a complete dynamic description of self-heating including temperature non-uniformities along the gate fingers and amenable to easy integration in circuit simulation tools.

fig. 12 fe-simulated temperature profile along a gate finger in a gan hemt. the dissipated power is 0.5 w/finger. the ambient temperature is 300 k. distance = 100 μm is the finger center. the arrows and letters mark five device sections with significantly different temperatures, which have been modeled individually in the le model [15].

fig. 13 a comparison between fe-simulated (lines) and le-simulated (dots) dynamic temperature profiles along a gate finger in a gan hemt following a power step of 0.5 w/finger. the ambient temperature is 300 k. distance = 100 μm is the finger center. from top to bottom, the curves are taken 1, 10, 100, and 1000 s after the application of the power step [15].

fig. 14 self-consistent electro-thermal model of a gan hemt. the device is divided into different sections to account for temperature non-uniformities along the gate fingers, as shown in fig. 12. each section self-consistently couples a temperature-dependent large-signal model with a 3-stage le cauer thermal network [16].

3.2. physical le thermal networks

we developed another successful approach to le thermal and electro-thermal modeling, whereby the thermal rc network is a physical representation of the 2d or 3d structure of the device under study. this concept was first applied to the 2d cross-section of unpackaged gan-hemts [17], [18].
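the self-consistent coupling that both the empirical and the physical le approaches rely on can be illustrated with a deliberately simplified static example. the sketch below assumes a toy electrical model in which the current decreases linearly with temperature, and a single thermal resistance; both the model and all parameter values are hypothetical and stand in for the large-signal models and multi-stage networks used in the cited works.

```python
def solve_self_heating(v_ds, i_ref, alpha, r_th, t_ambient=300.0,
                       tol=1e-9, max_iter=200):
    """Fixed-point solution of a toy static electro-thermal problem.

    Electrical side (assumed): I(T) = i_ref * (1 - alpha * (T - t_ambient)).
    Thermal side: T = t_ambient + r_th * P, with P = v_ds * I(T).
    Returns (temperature [K], current [A]) at the self-consistent point.
    """
    temp = t_ambient
    for _ in range(max_iter):
        current = i_ref * (1.0 - alpha * (temp - t_ambient))
        new_temp = t_ambient + r_th * v_ds * current
        if abs(new_temp - temp) < tol:
            return new_temp, current
        temp = new_temp
    raise RuntimeError("electro-thermal iteration did not converge")


# Hypothetical operating point: 10 V, 1 A at ambient, 10 K/W.
temperature, current = solve_self_heating(
    v_ds=10.0, i_ref=1.0, alpha=2e-3, r_th=10.0
)
print(f"T = {temperature:.2f} K, I = {current:.4f} A")
```

ignoring the feedback would predict a 100 k rise for these values; the self-consistent solution is lower because the current drops as the device heats up. a circuit simulator solves the same kind of loop at every time step when the thermal network is included as an equivalent electrical circuit.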
in that work, a physical le thermal network was self-consistently coupled with a temperature-dependent large-signal fet model, as shown in fig. 15, for a dynamic description of self-heating including the 2d temperature distribution over the whole structure. a 3d extension of this approach is exemplified in [9] and [19]. the modeled results compared well with fe simulations and with experimental data. the model was later enhanced to include the effect of trapping phenomena [20], a significant concern for gan fets. fig. 16 shows the excellent match between measured and modeled dc output characteristics at different temperatures, while fig. 17 is an illustration of the interplay between thermal and trapping dynamics in the pulsed response of these, and other, devices.

this physical le modeling approach has been applied with good results to power mosfet assemblies (see fig. 18) [21]-[23], as well as to nanometer-scale soi finfets [24].

4. circuit- and system-level fe thermal models

fe thermal simulations, so far considered at device or device-plus-package level, can be effectively used at higher hierarchical levels, to describe the thermal behavior of circuits and boards including several active and passive devices, metal lines, heat-sinks, etc. the building blocks are in this case the fe models of the individual components, such as those described in section 2. however, for practical reasons the whole circuit/system fe model can hardly be built by assembling detailed device-level models like those of figs. 1-3 and 8, due to the excessive number of degrees of freedom of the fe simulation, and the attendant overhead and convergence problems.
once detailed fe models of the individual components are available, the first step of the circuit/system modeling process is a simplification of the device models, aimed at obtaining nimbler models that can be integrated in the whole circuit/system without overburdening the simulation, while retaining the necessary information on their thermal behavior. an example of this simplification process is shown in fig. 19 [25]. in particular, we applied this technique to the thermal simulation of converter modules for dc power supplies in the context of the re-design of the electronics for one of the experiments of cern's atlas project [4], [25]-[28]. an example of circuit-level fe thermal simulation, and its experimental validation, is given by fig. 20. in this context, accurate description of thermal boundary conditions is key: this is often no trivial task, sometimes requiring thermal and fluid-dynamics simulation of water heat-sinks [29]-[31], as shown in fig. 21.

fig. 15 top: part of the le thermal network used in [17], [18]; thermal capacitances are connected between each node and thermal ground (only three shown for simplicity). bottom: self-consistent electro-thermal model: each of the 3 fingers of the device is individually modeled (hemt1-hemt3 large-signal models) and coupled with the physical le thermal network (plecs circuit block).

fig. 16 comparison between our electro-thermal gan hemt model (lines) and experimental data (dots) [20]. ambient temperatures: 200 k (top), 300 k (middle), and 400 k (bottom).

fig. 17 modeled gate-lag response of a gan hemt, in the case of a bulk donor trap (top) and a surface donor trap (bottom) [20]. the ambient temperature is 300 k.

fig. 18 le physical thermal model of a power mosfet die, package, flange, and heat-sink assembly [23].

fig. 19 an example of detailed fe thermal model for rectifier diodes in isotop package (left) and its simplified version (right) for circuit simulation [25].

fig. 20 an example of fe thermal model (top) and experimental ir thermal map (bottom) for a single-module dc/dc converter [25]. the output power is 1.2 kw, and forced-air cooling is in place. the maximum temperature error over the whole board is 8%.

fig. 21 fe simulation of a converter board on a water-cooled heat-sink [30]. top: water velocity in the heat sink; bottom: thermal simulation of the converter and heat-sink.

5. summary

in this paper we have reviewed the activity carried out over several years at the department of information engineering of the university of parma, italy, in the field of thermal and electro-thermal modeling of devices, device and package assemblies, circuits, and systems encompassing active boards and heat-sinking elements. we have shown examples of the use of finite-element (fe) 3d tools for the thermal analysis of a hierarchy of structures ranging from bare device dies to complex systems including active and passive devices, boards, metallizations, and air- and water-cooled heat-sinks. increasing the level of complexity requires developing smart solutions for the reduction of model complexity, lest numerical convergence be slowed down beyond acceptable limits, or made altogether impossible. a variety of lumped-element modeling examples has also been shown.
these models lose some of the physical detail of fe models, but are amenable to integration inside circuit simulation tools, thus allowing self-consistent electro-thermal simulation of the device or circuit under realistic operating conditions, something that is practically impossible with fe tools. these models can range from purely empirical to strictly physics- and geometry-based.

references

[1] p. cova, g. nicoletto, a. pirondi, m. portesine, m. pasqualetti, "power cycling on press-pack igbts: measurements and thermomechanical simulation", microel. reliab., vol. 39, pp. 1165–1170, 1999.
[2] p. cova, f. fasce, p. pampili, m. portesine, g. sozzi, p. e. zani, "high reliable high power diode for welding applications", microel. reliab., vol. 44, pp. 1437–1441, 2004.
[3] p. cova, n. delmonte, r. menozzi, "thermal characterization and modeling of power hybrid converters for distributed power systems", microel. reliab., vol. 46, pp. 1760–1765, 2006.
[4] p. cova, n. delmonte, r. menozzi, "thermal modeling of high-frequency dc/dc switching modules: electromagnetic and thermal simulation of magnetic components", microel. reliab., vol. 48, pp. 1468–1472, 2008.
[5] m. bernardoni, n. delmonte, p. cova, r. menozzi, "thermal modeling of planar transformer for switching power converters", microel. reliab., vol. 50, pp. 1778–1782, 2010.
[6] m. bernardoni, p. cova, n. delmonte, r. menozzi, "heat management for power converters in sealed enclosures: a numerical study", microel. reliab., vol. 49, pp. 1293–1298, 2009.
[7] g. sozzi, r. menozzi, "a review of the use of electro-thermal simulations for the analysis of heterostructure fets", microel. reliab., vol. 47, pp. 65–73, 2007.
[8] f. bertoluzza, n. delmonte, r. menozzi, "three-dimensional finite-element thermal simulation of gan-based hemts", microel. reliab., vol. 49, pp. 468–473, 2009.
[9] m. bernardoni, n. delmonte, r. menozzi, "modeling of the impact of boundary conditions on algan/gan hemt self heating", in proc. 2011 int. conf. compound semiconductor manufacturing technology (cs-mantech), pp. 229–232, 2011.
[10] n. delmonte, f. giuliani, p. cova, "finite element modeling and characterization of lead-free solder joints fatigue life during power cycling of surface mounting power devices", microel. reliab., vol. 53, pp. 1611–1616, 2013.
[11] c. abbate, g. busatto, p. cova, n. delmonte, f. giuliani, f. iannuzzo, a. sanseverino, f. velardi, "thermal damage in sic schottky diodes induced by se heavy ions", microel. reliab., vol. 54, pp. 2200–2206, 2014.
[12] c. abbate, g. busatto, p. cova, n. delmonte, f. giuliani, f. iannuzzo, a. sanseverino, f. velardi, "analysis of heavy ion irradiation induced thermal damage", ieee trans. nucl. sci., vol. 62, pp. 202–219, 2015.
[13] m. ciappa, f. carbognani, p. cova, w. fichtner, "a novel thermomechanics-based lifetime prediction model for cycle fatigue failure mechanisms in power semiconductors", microel. reliab., vol. 42, pp. 1653–1658, 2002.
[14] m. busani, r. menozzi, m. borgarino, f. fantini, "dynamic thermal characterization and modeling of packaged algaas/gaas hbts", ieee trans. components packaging technologies, vol. 23, pp. 352–359, 2000.
[15] m. bernardoni, n. delmonte, r. menozzi, "empirical and physical modeling of self-heating in power algan/gan hemt", in proc. 2012 int. conf. compound semiconductor manufacturing technology (cs-mantech), pp. 95–98, 2012.
[16] f. giuliani, n. delmonte, p. cova, r. menozzi, "gan hemts for power switching applications: from device to system-level electro-thermal modeling", in proc. 2013 int. conf. compound semiconductor manufacturing technology (cs-mantech), pp. 215–218, 2013.
[17] f. bertoluzza, g. sozzi, n. delmonte, r. menozzi, "lumped element thermal modeling of gan-based hemt", in proc. 2009 ieee mtt-s int. microwave symp. dig., pp. 973–976, 2009.
[18] f. bertoluzza, g. sozzi, n. delmonte, r. menozzi, "hybrid large-signal/lumped-element electro-thermal modeling of gan-hemts", ieee trans. microw. th. techn., vol. 57, pp. 3163–3170, 2009.
[19] m. bernardoni, n. delmonte, g. sozzi, r. menozzi, "large-signal gan hemt electro-thermal model with 3d dynamic description of self-heating", in proc. 41st european solid-state device research conf. (essderc), pp. 171–174, 2011.
[20] d. mari, m. bernardoni, g. sozzi, r. menozzi, g. a. umana-membreno, b. d. nener, "a physical large-signal model for gan hemts including self-heating and trap-related dispersion", microel. reliab., vol. 51, pp. 229–234, 2011.
[21] p. cova, m. bernardoni, "a matlab based approach for electro-thermal design of power converters", in proc. 6th int. conf. integrated power electronics systems (cips), pp. 407–411, 2010.
[22] m. bernardoni, n. delmonte, p. cova, r. menozzi, "self-consistent compact electrical and thermal modeling of power devices including package and heat-sink", in proc. ieee int. symp. power electronics systems, electrical drives, automation and motion (speedam), pp. 556–561, 2010.
[23] p. cova, m. bernardoni, n. delmonte, r. menozzi, "dynamic electro-thermal modeling for power device assemblies", microel. reliab., vol. 51, pp. 1948–1953, 2011.
[24] m. bernardoni, n. delmonte, r. menozzi, "lumped-element thermal modeling of soi finfets", in proc. apmc10/iconn2012/acmm22, paper 1096, 2012.
[25] m. alderighi, m. citterio, m. riva, s. latorre, a. costabeber, a. paccagnella, f. sichirollo, g. spiazzi, m. stellini, p. tenti, p. cova, n. delmonte, a. lanza, m. bernardoni, r. menozzi, s. baccaro, f. iannuzzo, a. sanseverino, g. busatto, v. de luca, f. velardi, "power converters for future lhc experiments", j. instrumentation (jinst), vol. 7, c03012, 2012.
[26] p. tenti, g. spiazzi, s. buso, m. riva, p. maranesi, f. belloni, p. cova, r. menozzi, n. delmonte, m. bernardoni, f. iannuzzo, g. busatto, a. porzio, f. velardi, a. lanza, m. citterio, c. meroni, "power supply distribution system for calorimeters at the lhc beyond the nominal luminosity", j. instrumentation (jinst), vol. 6, p06005, 2011.
[27] s. baccaro, g. busatto, m. citterio, p. cova, n. delmonte, f. iannuzzo, a. lanza, m. riva, a. sanseverino, g. spiazzi, "reliability oriented design of power supplies for high energy physics applications", microel. reliab., vol. 52, pp. 2465–2470, 2012.
[28] p. cova and n. delmonte, "thermal modeling and design of power converters with tight thermal constraints", microel. reliab., vol. 52, pp. 2391–2396, 2012.
[29] p. cova, n. delmonte, p. pampili, m. portesine, p. e. zani, "finite element design of water heat sinks for press-pack igbts", in proc. 4th int. conf. integrated power electronics systems (cips), pp. 353–357, 2006.
[30] p. cova, n. delmonte, f. giuliani, m. citterio, s. latorre, m. lazzaroni, a. lanza, "thermal optimization of water heat sink for power converters with tight thermal constraints", microel. reliab., vol. 53, pp. 1760–1765, 2013.
[31] m. lazzaroni, m. citterio, s. latorre, a. lanza, p. cova, n. delmonte, f. giuliani, "thermal modeling and characterization for designing reliable power converters for lhc power supplies", acta imeko j., vol. 14, pp. 17–25, 2014.

facta universitatis, series: electronics and energetics, vol. 35, no. 4, december 2022, pp. 469-482
https://doi.org/10.2298/fuee2204469g
© 2022 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper

design of a four stages vco using a novel delay circuit for operation in distributed band frequencies

mriganka gogoi 1,2, pranab kishore dutta 1
1 assam don bosco university, department of ece, india
2 north eastern regional institute of science and technology, department of ece, india

abstract. the manuscript proposes a novel architecture of a delay cell that is implemented in a 4-stage vco, which has the ability to operate in two distributed frequency bands.
the operating frequency is chosen based on the principle of carrier mobility and the transistor resistance. the vco uses dual delay input techniques to improve the frequency of operation. the design is implemented in cadence 90nm gpdk cmos technology and simulated results show that it is capable of operating in dual frequency bands of 55 mhz to 606 mhz and 857 mhz to 1049 mhz. at normal temperature (27 °c) the power consumption of the circuit is found to be 151 μw at 606 mhz and 157 μw at 1049 mhz, and the circuit occupies an area of 171.42 µm². the design shows a good tradeoff between operating frequency, phase noise and power consumption.

key words: ring oscillator, voltage controlled oscillator (vco), tuning range

received january 10, 2022; revised june 22, 2022 and november 15, 2022; accepted december 3, 2022
corresponding author: pranab kishore dutta, associate professor, department of ece, nerist, itanagar, arunachal pradesh, 791109, india (e-mail: pkdutta07@gmail.com)

1. introduction

the phase-locked loop (pll) is one of the key elements of contemporary wireless, digital signal processing and instrumentation systems, and the vco is crucial for improving its performance. vco parameters such as operating frequency range, power dissipation and phase noise contribute significantly to the performance of the pll. there are two widely used vco topologies: lc and ring vcos. the former has a high resolution and frequency, but its operating frequency range is limited and its chip area is large. the latter has many advantages, such as wide tuning range, easy integration, low chip area, multiphase clock and low power consumption; however, it has a low resolution and poor phase noise performance [1]. ring vcos are divided into two sorts based on their delay stages: 1) vco with a single-ended ring (sero) and 2) vco with a differential ring (dro).
seros consume less area than dros but have more noise and are hence less efficient [3-7]. dros are more resilient to common mode noise and have a lower swing. the delay cell is the basic element of differential configuration oscillators, and many such delay cells have been proposed. maneatis et al. proposed a delay cell based on a source coupled pair, which was used to design a ring oscillator with an operating frequency of 141 mhz [8]. yan et al. proposed a three-stage vco with a wide operating range of 1.3 to 1.8 ghz; however, the power consumption was comparatively high [9]. park et al. designed a four-stage ring oscillator with low phase noise that operates at 900 mhz; the phase noise was found to be -101 dbc/hz at 100 khz offset [10]. tu et al. proposed a novel delay circuit and used it to design a two-stage voltage controlled oscillator whose operating range was 2.5 ghz to 5.2 ghz for a supply voltage of 1.8 v. however, due to the smaller number of stages, the phase noise achieved was -90.1 dbc/hz at an offset frequency of 1 mhz [11]. sheu et al. proposed a new differential delay cell which was implemented in a three-stage vco; the tuning range was found to be 479 mhz to 4.09 ghz with a phase noise of -93.3 dbc/hz at an offset frequency of 1 mhz [12]. parvizi et al. proposed ring oscillator designs using two topologies, differential and single-ended. to reduce stage delay and boost tuning range, the vco used a feed-forward technique and an inductive load impedance [13]. suman et al. designed a pll around an improved-performance vco. the operating frequency varied from 2.26 ghz to 3.44 ghz as the control voltage changed from 1 v to 3 v, but phase noise was not addressed [14].
gao et al. proposed a delay cell for a dual-loop ring oscillator, in which a control voltage is also used to tune the frequency range. the design achieved a wideband tuning range while maintaining low phase noise [15]. to determine the optimal dimensions of the vco, gargouri et al. proposed a systematic and efficient optimization method and found an optimal trade-off between various specifications [16]. salem et al. proposed a fault-tolerant delay cell for ring oscillators that uses redundant transistors to improve reliability, power dissipation and phase noise [17]. kumar et al. presented a vco using a nor gate and an inversion-mode varactor tuning method [18]. the varactor width is changed to vary the operating frequency; however, there is still scope for improvement in the phase noise. a low-noise injection-locked vco was proposed by lee et al., in which a separate injection signal is employed and the oscillator output locks to the frequency of the injection signal [19]. the circuit showed a wide tuning range and low phase noise with low power consumption. ramazani et al. presented delay cells using basic inverters and current-starved inverters for designing vcos with better frequency stability [20].

the proposed circuit is oriented towards the design of a ring vco able to operate in a distributed tuning range while maintaining a decent tradeoff between phase noise and power consumption in differential configuration. thus the novelty of the work lies in allowing the same vco to work in the high as well as low frequency ranges without altering the physical design of the circuit. the rest of the paper is organized as follows: section 2 presents the proposed vco, section 3 analyses the delay circuit, section 4 covers the implementation and section 5 concludes the paper.
2. proposed vco

even and odd numbers of stages can be used in differential ring oscillators, but an odd number of stages cannot produce both in-phase and quadrature-phase outputs. the frequency of oscillation depends on factors like driving capability, load and number of stages. in addition, when the number of stages increases, the energy used, the area needed and the cost all increase. on the other hand, there will be greater phase noise with fewer stages. therefore, maintaining an adequate tradeoff between the various performance characteristics requires an optimal design. a two-stage dro has tight constraints on the conditions for oscillation, while three stages cannot provide both in-phase and quadrature-phase outputs; hence we choose to design a four-stage vco [21-23]. the designed vco is applicable in communication systems where multiphase signals are needed, like phase array transceivers, fractional frequency synthesizers and clock data recovery circuits. moreover, communication systems need wide range oscillators to cover a variety of standards across multiple frequency bands [24]. the proposed delay cell of the vco is designed using two control frequencies, hence this technique is also known as the dual frequency control technique [25-26]. it comprises inputs vin1+, vin2+, vin1- and vin2-, output voltages vout+ and vout-, and control voltages vcntr1 and vcntr2. considering tdelay as the delay time of the cell, the total delay time of the four-stage vco will be 4tdelay and hence the operating frequency will be 1/(4tdelay). the proposed delay cell and four-stage vco are depicted in fig. 1 and fig. 2. the time constants τc and τd estimated during charging and discharging provide a generic equation for finding the oscillation frequency.
the following formula is used to compute the time constant:

$\tau = RC$ (1)

where r is the resistance offered by the charging or discharging path and c is the lumped capacitance, that is, the combined parasitic capacitances. using τc and τd, the time intervals t1 and t2, which are the charging and discharging time intervals of the delay cell, are calculated. they are used for determining the oscillating frequency, which is given by:

$f_a = \frac{1}{T_1 + T_2}$ (2)

the resistance r in the time constant formula is the resistance of the pmos and nmos transistors respectively, and is given by:

$r_p = \frac{1}{\mu_p C_{ox} \frac{W}{L} \left(|V_{gs}| - |V_{tp}|\right)}$ (3)

$r_n = \frac{1}{\mu_n C_{ox} \frac{W}{L} \left(|V_{gs}| - |V_{tn}|\right)}$ (4)

where μp and μn are the mobilities of the pmos and nmos transistors, cox is the oxide capacitance, and w and l are the channel width and length of both transistors. vgs is the applied gate to source voltage, and vtp and vtn are the pmos and nmos threshold voltages. the transistors are assumed to be working in the triode region and the small drain to source voltage vds is neglected. thus, from the formulae it can be clearly seen that the higher the mobility, the lower the resistance; in other words, resistance is inversely proportional to mobility. moreover, resistance is directly proportional to the time constant, hence the oscillating frequency is inversely dependent on resistance and directly dependent on mobility. so, a transistor with higher mobility can play a crucial role in improving the oscillating frequency. since the mobility of electrons is larger than that of holes, this has an effect on the current flow and time constant, or delay time, which has an additional impact on oscillation frequency. lowering the resistance will result in a shorter delay time and a greater oscillation frequency. since the nmos time constant is lower than the pmos time constant, the oscillation frequency will be higher. this idea inspired us to suggest the delay circuit depicted in figure 1 and utilise it to create the vco depicted in figure 2.
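the relation between mobility, channel resistance and time constant in equations (1), (3) and (4) can be checked numerically. the following is a minimal sketch, not the authors' code: the function names are hypothetical, and the process gains and node capacitance are assumed illustrative values rather than data from the paper.

```python
def triode_resistance(mu_cox, w_over_l, v_gs, v_t):
    """Equivalent channel resistance from equations (3)/(4), V_ds neglected.

    mu_cox    : mobility times oxide capacitance per unit area, in A/V^2
    w_over_l  : channel aspect ratio W/L (dimensionless)
    v_gs, v_t : gate-source and threshold voltages in volts (magnitudes are used)
    """
    overdrive = abs(v_gs) - abs(v_t)
    if overdrive <= 0:
        raise ValueError("device is off: |V_gs| must exceed |V_t|")
    return 1.0 / (mu_cox * w_over_l * overdrive)


def time_constant(resistance, capacitance):
    """Equation (1): tau = R * C."""
    return resistance * capacitance


# Illustrative values only; none of these numbers are taken from the paper.
MU_N_COX = 200e-6   # A/V^2, assumed nmos process gain
MU_P_COX = 80e-6    # A/V^2, assumed pmos process gain
W_OVER_L = 1.2      # assumed aspect ratio
C_NODE = 5e-15      # F, assumed lumped node capacitance

r_n = triode_resistance(MU_N_COX, W_OVER_L, v_gs=1.0, v_t=0.3)
r_p = triode_resistance(MU_P_COX, W_OVER_L, v_gs=-1.0, v_t=-0.3)
tau_n = time_constant(r_n, C_NODE)
tau_p = time_constant(r_p, C_NODE)

print(f"r_n = {r_n:.0f} ohm, tau_n = {tau_n * 1e12:.2f} ps")
print(f"r_p = {r_p:.0f} ohm, tau_p = {tau_p * 1e12:.2f} ps")
```

with equal geometry and overdrive, the ratio r_p/r_n equals the mobility ratio, which is the property the text relies on when it argues that the nmos path gives the shorter time constant.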
fig. 1 delay cell

fig. 2 four stages vco

3. delay circuit analysis

the proposed delay cell is for a dual-loop ring based voltage controlled oscillator [27-29]. the primary loop’s inputs are m1 and m2, while the secondary loop’s inputs are m3 and m4. the dual-loop technique amplifies the oscillation. m13 controls the ring vco’s frequency. the latch’s feedback strength is set by m5, m6, m7, m8, m9, m10, m11 and m12. thus, it is simple to control the delay time of the latch by controlling vcntr2, while vcntr1 allows the vco to operate in both the low frequency and high frequency bands. considering the left part of the delay circuit, which deals with vout-, the charging and discharging times can be calculated as shown below.

let the initial condition be vout- = vl and vout+ = v0, where vl and v0 are the minimum and maximum output voltages. the charging time is controlled by vin2- and the transistor m3; based on the initial condition m5 will be off, and the charging time constant is given by

$\tau_1 = r_3 C_{Load}$ (5)

where r3 is the equivalent resistance of mosfet m3 and cload is the parasitic capacitive load associated with vout-:

$r_3 = \frac{1}{\mu_p C_{ox} \frac{W}{L} \left(|V_{gs}| - |V_{tp}|\right)}$ (6)

$C_{Load} = C_{db1} + C_{gd1} + C_{db3} + C_{gd3} + C_{ds3} + C_{db5} + C_{gd5} + C_{db7} + C_{gd7} + C_{db9} + C_{gd9} + C_{db11} + C_{gd11} + C_{in\_x}$ (7)

where cin_x is the next stage input capacitance, cdb is drain to body, cgd is gate to drain and cgs is gate to source capacitances of the transistor. the voltage across the load capacitance will be

$V_{CLoad} = V_0 - (V_0 - V_l)\exp\left(-\frac{T_1}{\tau_1}\right)$ (8)

suppose that in the time interval t1 the capacitor charges up to αv0; then

$\alpha V_0 = V_0 - (V_0 - V_l)\exp\left(-\frac{T_1}{\tau_1}\right)$ (9)

where α is a constant with a value ranging from 0 to 1, so that

$T_1 = \tau_1 \ln\left\{\frac{V_0 - V_l}{V_0 (1 - \alpha)}\right\}$ (10)

in the next state, vout- = v0 and vout+ = vl.
the discharging phenomenon comprises both charging and discharging time constants, and the effective discharging time constant τ2 is

$\tau_2 = \left\{\left(r_1 \parallel (r_7 + r_{13})\right) - \left(r_3 \parallel r_5\right)\right\} C'_{Load}$ (11)

where r1, r5, r7 and r13 represent the equivalent resistances of mosfets m1, m5, m7 and m13, and

$C'_{Load} = C_{db1} + C_{gd1} + C_{ds1} + C_{db3} + C_{gd3} + C_{db5} + C_{gd5} + C_{db7} + C_{gd7} + C_{gd9} + C_{db9} + C_{db11} + C_{gd11} + C_{in\_x}$ (12)

c’load is the node capacitance during time t2. the voltage across capacitor c’load is given by

$V_{C'L} = V_l - (V_l - \alpha V_0)\exp\left(-\frac{T_2}{\tau_2}\right)$ (13)

suppose the capacitor c’load discharges to βvl in the time interval t2, with β>1; then

$\beta V_l = V_l - (V_l - \alpha V_0)\exp\left(-\frac{T_2}{\tau_2}\right)$ (14)

$T_2 = \tau_2 \ln\left\{\frac{V_l - \alpha V_0}{V_l (1 - \beta)}\right\}$ (15)

$T = T_1 + T_2 = \tau_1 \ln\left\{\frac{V_0 - V_l}{V_0 (1 - \alpha)}\right\} + \tau_2 \ln\left\{\frac{V_l - \alpha V_0}{V_l (1 - \beta)}\right\}$ (16)

and finally $f_{osc} = 1/(4T)$ for the four-stage ring vco.

fig. 3 delay circuit for low voltage of vcntr1

case 1: when vcntr1 is low (0 to 0.3v), the transistors m9 and m10 become more dominant, as both are pmos transistors and they operate at lower gate voltage than the m11 and m12 transistors. hence the circuit is found to operate similarly to the circuit shown in fig. 3. the normal delay loop’s input pair is m1 and m2, while the skewed delay loop’s input pair is m7 and m8 in the circuit depicted in fig. 3. transistor m1 shuts off when the voltage connected to the gate terminal of m1, vin1+, is less than the threshold value. the source current of the secondary input transistor m3 is already flowing towards the capacitor associated with the output node, vout-, because the input voltage at vin2- arrives earlier than at vin1+. this reduces the output node’s rise time. in the delay cell, m5 and m6 combine to form a latch. m9 and m10 are cross-coupled transistors that control the load transistors' maximum gate voltages, as well as the latch strength and frequency of operation.
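returning to the delay analysis, equations (10), (15) and (16) together with f_osc = 1/(4T) can be evaluated directly once the two time constants and the voltage levels are known. the sketch below is only an illustration under assumed numbers: the function names are hypothetical, and the values of τ1, τ2, v0, vl, α and β are not taken from the paper.

```python
import math


def charging_time(tau1, v0, vl, alpha):
    """Equation (10): time for the node to rise from vl to alpha * v0."""
    return tau1 * math.log((v0 - vl) / (v0 * (1.0 - alpha)))


def discharging_time(tau2, v0, vl, alpha, beta):
    """Equation (15): time for the node to fall from alpha * v0 to beta * vl."""
    return tau2 * math.log((vl - alpha * v0) / (vl * (1.0 - beta)))


def oscillation_frequency(t1, t2):
    """f_osc = 1 / (4 T) with T = T1 + T2, as stated for the four-stage ring."""
    return 1.0 / (4.0 * (t1 + t2))


# Assumed illustrative values (not from the paper).
TAU1 = 100e-12   # s, charging time constant
TAU2 = 60e-12    # s, effective discharging time constant
V0 = 1.0         # V, maximum output voltage
VL = 0.1         # V, minimum output voltage
ALPHA = 0.9      # charging stops at ALPHA * V0, with 0 < ALPHA < 1
BETA = 2.0       # discharging stops at BETA * VL, with BETA > 1

t1 = charging_time(TAU1, V0, VL, ALPHA)
t2 = discharging_time(TAU2, V0, VL, ALPHA, BETA)
f_osc = oscillation_frequency(t1, t2)

print(f"T1 = {t1 * 1e12:.1f} ps, T2 = {t2 * 1e12:.1f} ps")
print(f"f_osc = {f_osc / 1e6:.1f} MHz")
```

substituting t1 back into equation (9), or t2 into equation (14), recovers αv0 and βvl respectively, which is a quick consistency check on the logarithmic forms.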
now varying the control voltage vcntr2 will vary the frequency of oscillation. the path delay increases due to the action of pmos transistors m9 and m10, which causes the vco to operate in the low frequency band.

fig. 4 delay cell due to high vcntr2

case 2: when vcntr1 is high (0.7v to 1v), the transistors m11 and m12 are more dominant, as both are nmos transistors and they operate at higher gate voltage than m9 and m10. the delay circuit is found to work as shown in fig. 4. in this case m11 and m12 are the cross-coupled transistors that govern the maximum gate voltages of the load transistors, and hence the latch strength and the frequency of operation.

phase noise: several noise elements influence the phase noise in a ring oscillator. the most prevalent types of noise are white noise and flicker noise. in contrast to inverter-based delay cells, differential delay cells operate in class a and consume a steady state current [30]. the main source of flicker noise is the fet that powers the common gate line for all the currents in the delay cells [31]. equations (17) and (18) show the ssb (single side band) phase noise due to white noise and flicker noise respectively in differential oscillators.

$L(f) = \frac{2kT}{I \ln 2}\left[\gamma\left(\frac{3}{4 V_{effd}} + \frac{1}{V_{efft}}\right) + \frac{1}{V_{op}}\right]\left(\frac{f_0}{f}\right)^{2}$ (17)

$L(f) = a\,\frac{K_f}{W L C'_{ox} f}\left(\frac{1}{V_{efft}^{2}}\right)^{2}\frac{f_0^{2}}{f^{3}}$ (18)

where γ is the noise factor of the fet, veffd and vefft are the effective gate voltages of the differential delay cell at balanced and unbalanced conditions, vop is the actual output voltage, i is the tail current, f0 is the oscillation frequency, w and l stand for the fet’s width and length, a is the ratio of the width of the fet to that of the tail fet and c’ox is the oxide capacitance of the nfet (tail transistor). in our design a is considered to be 1, as the transistors have the same w and l. the figure of merit used for characterizing vco performance can be obtained from equation (19) [32].
$FOM = L(f) - 20\log\left(\frac{f_0}{f}\right) + 10\log\left(\frac{P_{dc}}{1\,\mathrm{mW}}\right)$ (19)

where pdc is the dc power consumption. the dimensions of the nmos and pmos transistors, shown in table 1, are kept the same, as the goal is to obtain a functional circuit that confirms the topological idea.

table 1 device dimension

| device | transistors | aspect ratio |
| nmos | m1, m2, m7, m8, m11, m12, m13 | 120/100 |
| pmos | m3, m4, m5, m6, m9, m10 | 120/100 |

table 2 variation of the parameters at different temperatures when vcntr1=0.2v and vcntr2 is varied from 0 to 1v (pre layout)

| temp | tuning range | power consumption | phase noise @ 1 mhz | phase noise @ 10 mhz |
| 0 °c | 42 mhz-672 mhz (93.75%) | 160 μw | -93.10 dbc/hz | -112.36 dbc/hz |
| 10 °c | 47 mhz-647 mhz (92.73%) | 155 μw | -92.94 dbc/hz | -112.05 dbc/hz |
| 27 °c | 55 mhz-606 mhz (90.9%) | 151 μw | -92.07 dbc/hz | -111.25 dbc/hz |
| 70 °c | 63 mhz-487 mhz (87%) | 144 μw | -91.86 dbc/hz | -111.09 dbc/hz |

table 3 variation of the parameters at different temperatures when vcntr1=0.77 v and vcntr2 is varied from 0 to 1v (pre layout)

| temp | tuning range | power consumption | phase noise @ 1 mhz | phase noise @ 10 mhz |
| 0 °c | 1040 mhz-1230 mhz (15.44%) | 169 μw | -93.42 dbc/hz | -112.83 dbc/hz |
| 10 °c | 969 mhz-1161 mhz (16.5%) | 161 μw | -93.09 dbc/hz | -113.23 dbc/hz |
| 27 °c | 857 mhz-1049 mhz (18.3%) | 157 μw | -93.50 dbc/hz | -113.93 dbc/hz |
| 70 °c | 585 mhz-771 mhz (24.12%) | 152 μw | -92.88 dbc/hz | -112.34 dbc/hz |

fig. 5 tuning range of vco at different temperatures for vcntr1=0.2v and vcntr2 varies from 0v to 1v (pre layout simulation)

fig. 6 tuning range of vco at normal temperature for vcntr1=0.2v and vcntr2 varies from 0v to 1v (pre and post layout simulation at 27 °c)

table 4 corner analysis at vcntr1=0.2 v and vcntr2=0.1v

| process corner | pre layout @ 1 mhz: output noise (db) | pre layout @ 1 mhz: phase noise (dbc/hz) | post layout @ 1 mhz: output noise (db) | post layout @ 1 mhz: phase noise (dbc/hz) | pre layout @ 10 mhz: output noise (db) | pre layout @ 10 mhz: phase noise (dbc/hz) | post layout @ 10 mhz: output noise (db) | post layout @ 10 mhz: phase noise (dbc/hz) |
| nn | -93.10 | -92.07 | -93.23 | -92.61 | -112.26 | -111.25 | -113.00 | -112.16 |
| ff | -94.68 | -93.32 | -95.38 | -93.87 | -113.45 | -112.33 | -114.12 | -113.11 |
| fs | -96.12 | -95.54 | -96.62 | -95.93 | -116.10 | -114.56 | -116.96 | -115.22 |
| sf | -95.22 | -94.56 | -95.88 | -94.89 | -115.31 | -113.34 | -116.45 | -114.31 |
| ss | -94.20 | -93.40 | -94.66 | -93.84 | -112.87 | -111.86 | -113.10 | -112.53 |

4. implementation

the proposed four-stage vco design is implemented using cadence cmos 90nm technology. the device dimensions used in the circuit are given in table 1. the tuning ranges were analysed by varying vcntr2 from 0v to 1v for different values of vcntr1. vcntr1 was divided into two ranges, the lower one from 0 to 0.5v and the upper one from 0.5v to 1.0v. the optimum values for maximizing the tuning range in the two cases were found to be 0.2v (lower) and 0.77v (upper). the oscillating frequency ranges from 55 mhz to 606 mhz (91% approx.) for the lower band with the control voltage vcntr1=0.2v, as shown in fig. 5 and table 2. whenever the path delay is high the oscillating frequency is found to be low, and vice versa. fig. 7 and table 3 show the variation of the tuning range due to the change in vcntr2 while keeping vcntr1=0.77v. vcntr2 varies from 1v to 0v and the tuning range is found to be 857 mhz to 1049 mhz (18.30%) at normal temperature. this can be considered as operation in the higher frequency band. so, the benefit of the circuit is that the same circuit can be operated in two different frequency bands, thereby increasing the tuning range of the circuit.
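the tuning-range percentages quoted above and in tables 2 and 3 are consistent with taking the width of the band relative to its upper edge, (fmax - fmin)/fmax. this definition is inferred from the reported numbers and is not stated explicitly in the text, so the sketch below (with a hypothetical function name) should be read as a plausibility check only.

```python
def tuning_range_percent(f_min, f_max):
    """Band width relative to the upper band edge, in percent.

    Assumed definition: 100 * (f_max - f_min) / f_max.
    Both frequencies must use the same unit.
    """
    if not 0 < f_min <= f_max:
        raise ValueError("expected 0 < f_min <= f_max")
    return 100.0 * (f_max - f_min) / f_max


# Band edges at normal temperature, in MHz, as reported in tables 2 and 3.
LOWER_BAND_MHZ = (55.0, 606.0)
UPPER_BAND_MHZ = (857.0, 1049.0)

lower = tuning_range_percent(*LOWER_BAND_MHZ)
upper = tuning_range_percent(*UPPER_BAND_MHZ)

print(f"lower band tuning range: {lower:.2f} %")
print(f"upper band tuning range: {upper:.2f} %")
```

other common definitions normalise by the centre frequency instead, which would give different percentages for the same band edges.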
effect of temperature: transistors have the biggest impact on the frequency drift, due to the changes in transconductance gain (gm), threshold voltage (vth), electron and hole mobility (μn and μp) and parasitic capacitors, which are obtained as follows [33]:

$g_m = \mu_n C_{ox} \frac{W}{L} (V_{gs} - V_{th})$ (20)

$V_{th} = V_{th0} - \alpha\,(T - 27^{\circ})$ (21)

$\mu(T) = \mu(T = 27^{\circ})\left(\frac{T}{27^{\circ}}\right)^{-3/2}$ (22)

the operation of the circuit is tested by varying the temperature; it is found that the operating frequency is reduced with increase in temperature. analysis of the circuit is carried out by varying the temperature from 0 °c to 70 °c in both pre layout and post layout at vdd=1v, with vcntr1 equal to 0.77 v and 0.2 v respectively and vcntr2 varied from 0 to 1v. comparative analysis between the pre and post layout simulations with respect to tuning range is shown in fig. 6 and fig. 8; it is found that the changes in frequency tuning range due to the control voltage vcntr2 are close to each other in both cases. corner analysis of the circuit in terms of output and phase noise is given in tables 4 and 5 for all five process corners, namely nn, ff, fs, sf and ss, and the results are satisfactory. the delay circuit layout is 5.87µm x 6.86µm, while the four-stage vco layout is 24.98µm x 6.86µm, spanning an area of 171.42 µm². they are depicted in fig. 9 and fig. 10.

fig. 7 tuning range of vco at different temperatures for vcntr1=0.77 v and vcntr2 vary from 0v to 1v.

fig. 8 tuning range of vco at normal temperature for vcntr1=0.77v and vcntr2 varies from 0v to 1v (pre and post layout simulation at 27 °c)

table 5 corner analysis at vcntr1=0.77v and vcntr2=0.1v

| process corner | pre layout @ 1 mhz: output noise (db) | pre layout @ 1 mhz: phase noise (dbc/hz) | post layout @ 1 mhz: output noise (db) | post layout @ 1 mhz: phase noise (dbc/hz) | pre layout @ 10 mhz: output noise (db) | pre layout @ 10 mhz: phase noise (dbc/hz) | post layout @ 10 mhz: output noise (db) | post layout @ 10 mhz: phase noise (dbc/hz) |
| nn | -96.23 | -93.50 | -97.11 | -94.78 | -115.31 | -113.93 | -116.35 | -114.42 |
| ff | -95.68 | -94.33 | -96.28 | -94.96 | -116.20 | -114.42 | -116.85 | -115.36 |
| fs | -99.72 | -97.23 | -100.22 | -98.17 | -118.51 | -117.81 | -119.24 | -119.42 |
| sf | -98.22 | -96.48 | -98.89 | -97.33 | -117.54 | -116.71 | -118.21 | -117.36 |
| ss | -97.05 | -93.60 | -97.76 | -94.28 | -115.86 | -114.11 | -116.10 | -115.06 |

fig. 9 layout of the proposed delay cell

fig. 10 layout of the 4 stage vco

the layout design shown in fig. 9 and fig. 10 can be further optimized to reduce the area significantly and obtain results closer to those obtained at the schematic level. one of the main advantages of the proposed circuit is that the same circuit can work in the low frequency range as well as the high frequency range by varying the control voltages. however, the tuning range is comparatively low. additionally, there are many transistors in the delay circuit, which raises the transistor count of the oscillator even further. the realization column of table 6 indicates whether the compared parameters, namely oscillation frequency, power consumption and phase noise, are measured or simulated. in our case post-layout values are considered.
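when designs are compared on a figure-of-merit basis, equation (19) can be evaluated with a few lines of code. the sketch below uses a hypothetical function name and purely illustrative inputs; it does not attempt to reproduce any entry of the comparison table, since the oscillation frequency and offset used for those entries are not stated in the text.

```python
import math


def figure_of_merit(phase_noise_dbc_hz, f_osc_hz, f_offset_hz, p_dc_w):
    """Equation (19): FOM = L(f) - 20 log10(f0 / f) + 10 log10(P_dc / 1 mW).

    phase_noise_dbc_hz : phase noise L(f) at the offset frequency, in dBc/Hz
    f_osc_hz           : oscillation frequency f0, in Hz
    f_offset_hz        : offset frequency f, in Hz
    p_dc_w             : dc power consumption, in W
    """
    return (phase_noise_dbc_hz
            - 20.0 * math.log10(f_osc_hz / f_offset_hz)
            + 10.0 * math.log10(p_dc_w / 1e-3))


# Illustrative inputs only: -100 dBc/Hz at 1 MHz offset, 1 GHz carrier, 1 mW.
example_fom = figure_of_merit(-100.0, 1e9, 1e6, 1e-3)
print(f"FOM = {example_fom:.2f} dBc/Hz")
```

a more negative value is better: lower phase noise, a higher oscillation frequency and lower power consumption all reduce the result.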
table 6 comparison parameters

| references | technology (nm) | supply voltage (v) | number of stages (n) | oscillation frequency range (ghz) | power consumption (mw) | phase noise (dbc/hz) | fom (dbc/hz) | realization level |
| [11] | 180 | 1.8 | 2 | 2.5-5.2 (74%) | 17 | -90.1 @ 1mhz | --- | measured |
| [16] | 180 | 1 | 2 | 0.473-7.54 (93.72%) | 7.41 | -107.1 @ 10mhz | -150.44 | simulated |
| [31] | 180 | 1.8 | 4 | 0.455 to 0.505; 0.00139 to 0.00145 | 1.98 (lower band) and 9.7 (upper band) | --- | --- | simulated |
| [12] | 180 | 1 | 4 | 0.479-4.09 (88.28%) | 10 | -93.3 @ 1mhz | -154.4 | measured |
| [18] | 90 | 1 to 3 | 3 | 1.379-1.970 (30%); 0.650-2.584 (74.84%); 0.556-2.584 (78.48%) | 0.129 to 5685 | -89.779 @ 1mhz | -154.51 | simulated |
| [28] | 65 | 1.2 | 30 | 0.556 | 0.72 | -101.7 @ 1mhz | -158 | simulated |
| [15] | 65 | 1.8 | 3 | 0.470-0.964 (51.24%) | 4.1 | -116 @ 1mhz | -169 | measured |
| [29] | 90 | 1.2 | 4 | 9.21 | 2.092 | -137.9 @ 1mhz | --- | post layout |
| proposed work | 90 | 1 | 4 | 0.048 to 0.57 (91.57%) and 0.82 to 1.01 (18.8%) {distributed band} | 0.151 (lower band) and 0.157 (upper band) | | -152.40 and -160.64 | post layout |

5. conclusion

a four-stage vco is designed using a novel differential delay circuit. the vco operates in two distributed frequency bands, lower and upper, which is one of its main advantages. pre layout simulation results show that the lower operating band is 55 mhz to 606 mhz at normal temperature when the control voltage vcntr1 is maintained at 0.2v while the other control voltage vcntr2 is varied from 0v to 1v. the phase noise at 1 mhz and 10 mhz offset is found to be -92.07 dbc/hz and -111.25 dbc/hz in the lower frequency band. the vco operates from 857 mhz to 1049 mhz (upper band) when vcntr1 is 0.77v and vcntr2 is varied from 0v to 1v. in this band the phase noise at 1 mhz and 10 mhz offset is -93.50 dbc/hz and -113.93 dbc/hz. the pre layout power consumption of the vco at 27 °c is found to be 151 µw and 157 µw for the operating frequencies of 606 mhz and 1049 mhz respectively.

references

[1] t. miyazaki, m. hashimoto and h.
onodera, "a performance comparison of plls for clock generation using ring oscillator vco and lc oscillator in a digital cmos process", in proceedings of asia and south pacific design automation conference (aspdac), 2004, pp. 545-546.
[2] h. ghonoodi, h. miar-naimi and m. gholami, "analysis of frequency and amplitude in cmos differential ring oscillators", integration, vol. 52, pp. 253-259, january 2016.
[3] m. gogoi and p. k. dutta, "review and analysis of charge-pump phase-locked loop", in proceedings of 1st international conference on electronics systems and intelligent computing (esic), 2020, pp. 565-574.
[4] j. johnson, m. ponnambalam and p. v. chandramani, "comparison of tunability and phase noise associated with injection locked three staged single and differential ended vcos in 90nm cmos", in proceedings of 4th international conference on signal processing, communication and networking, 2017, pp. 1-4.
[5] w. t. lee, j. shim and j. jeong, "design of a three-stage ring-type voltage controlled oscillator with a wide tuning range by controlling the current level in an embedded delay cell", microelectronics j., vol. 44, pp. 1328-1335, dec. 2013.
[6] s. salem, m. tajabadi and m. saneei, "the design and analysis of dual control voltages delay cell for low power and wide tuning range ring oscillators in 65nm cmos technology for cdr applications", aeu international journal electronics communication, vol. 82, pp. 406-412, dec. 2017.
[7] v. muddi, k. d. shinde and b. k. shivaprasad, "design and implementation of 1ghz current starved voltage control oscillator (vco) for pll using 90nm cmos technology", in proceedings of international conference on control, instrumentation, communication and computational technologies (iccicct), 2015, pp. 335-339.
[8] j. g. maneatis and m. a. horowitz, "precise delay generation using coupled oscillators", ieee j. solid-state circuits, vol. 28, pp. 1273-1282, dec. 1993.
[9] s. t. yan and h. c. luong, "a 3-v 1.3-to-1.8-ghz cmos voltage-controlled oscillator with 0.3-ps jitter", ieee trans. circuits syst. ii: analog and digital signal proc., vol. 45, pp. 876-880, july 1998.
[10] c. h. park and b. kim, "a low-noise, 900-mhz vco in 0.6-μm cmos", ieee j. solid-state circuits, vol. 34, pp. 586-591, june 1998.
[11] w. h. tu, j. y. yeh, h. c. tsai and c. k. wang, "a 1.8 v 2.5-5.2 ghz cmos dual-input two-stage ring vco", in proceedings of asia pacific conference on advanced system integrated circuits, 2004, pp. 134-137.
[12] m. l. sheu, y. s. tiao and l. j. taso, "a 1-v 4-ghz wide tuning range voltage-controlled ring oscillator in 0.18 μm cmos", microelectronics j., vol. 42, pp. 897-902, april 2011.
[13] m. parvizi, a. khodabakhsh and a. nabavi, "low-power high-tuning range cmos ring oscillator vcos", in proceedings of the ieee international conference on semiconductor electronics, 2008, pp. 40-44.
[14] s. suman, k. g. sharma and p. k. ghosh, "design of pll using improved performance ring vco", in proceedings of the international conference on electrical, electronics and optimization techniques (iceeot), 2016, pp. 3479-3483.
[15] h. gao, r. xia, x. wang, t. zhou and m. zhou, "wideband ring oscillator with switched resistor array for low tuning sensitivity", analog integr. circuits and signal process., vol. 89, pp. 493-498, sept. 2016.
[16] n. gargouri, d. b. issa, z. sakka, a. kachouri and m. samet, "design and optimization of differential ring oscillator for ir-uwb applications in 0.18μm cmos technology", j. circuits syst. comput., vol. 26, pp. 1750080-1-1750080-15, dec. 2016.
[17] s. salem, h. zandevakili, a. mahani and m. saneei, "fault-tolerant delay cell for ring oscillator application in 65 nm cmos technology", iet circuits devices syst., vol. 12, pp. 233-241, nov. 2017.
[18] m. kumar and d. dwivedi, "a low power cmos-based vco design with i-mos varactor tuning control", j. circuits syst. comput., vol. 27, pp. 1850160-1-1850160-14, jan. 2018.
[19] s. y. lee, s. amakawa, n. ishihara and k. masu, "2.4-10 ghz low-noise injection-locked ring voltage controlled oscillator in 90nm complementary metal oxide semiconductor", jpn. j. appl. phys., vol. 50, pp. 04de03-1-04de03-5, april 2011.
[20] a. ramazani, s. biabani and g. hadidi, "cmos ring oscillator with combined delay stages", aeu - int. j. electron. commun., vol. 68, pp. 515-519, june 2014.
[21] m. karimi-ghartemani, h. karimi and m. r. iravani, "a magnitude phase-locked loop system based on estimation of frequency and in-phase/quadrature-phase amplitudes", ieee trans. ind. electron., vol. 51, pp. 511-517, april 2004.
[22] a. sharma, saurabh and s. biswas, "a low power cmos voltage controlled oscillator in 65nm technology", in proceedings of international conference on computer communication and informatics, 2014, pp. 1-5.
[23] s. kamran and n. ghaderi, "a novel high speed cmos pseudo-differential ring vco with wide tuning control voltage range", in proceedings of the iranian conference on electrical engineering (icee), 2017, pp. 201-204.
[24] z. chen and t. lee, "the study of a dual-mode ring oscillator", ieee trans. circuits syst. ii: express briefs, vol. 58, no. 4, pp. 210-214, april 2011.
[25] g. k. sharma, a. k. johar, t. b. kumar and d. boolchandani, "design and analysis of wide tuning range differential ring oscillator (wtr-dro)", analog integr. circuits signal process., vol. 103, pp. 17-29, april 2020.
[26] j. m. kim, s. kim, i. y. lee, s. k. han and s. g. lee, "a low noise four-stage voltage controlled ring oscillator in deep-submicrometer cmos technology", ieee trans. circuits syst. ii: express briefs, vol. 60, no. 2, pp. 71-75, feb. 2013.
[27] i. kovacs and m. neag, "new dual-loop topology for ring vcos based on latched delay cells", in proceedings of the ieee international symposium on circuits and systems (iscas), 2018, pp. 1-5.
[28] t. yoshio, t. kihara and t.
yoshimura, "a 0.55 v back-gate controlled ring vco for adcs in 65 nm sotb cmos", in proceedings of the ieee asia pacific microwave conference (apmc), 2017, pp. 946-948. [29] s. k. saw, s. k. yadav, m. maiti, a. j. mondal and a. majumder, "a design approach of higher oscillation vco made of cs amplifier with varying active load", microsyst. technol., vol. 26, pp. 1-10, feb. 2020. [30] a. a. abidi, "phase noise and jitter in cmos ring oscillators", ieee j. solid-state circuits, vol. 41, no. 8, pp. 1803-1816, aug. 2006. [31] s. pahlava and m. b. ghaznavi-ghoushchi, "1.45 ghz differential dual band ring based digitallycontrolled oscillator with a reconfigurable delay element in 0.18 μm cmos process", analog integr. circuits signal process., vol. 89, no. 2, pp. 461-467, nov. 2016. [32] m. katebi, a. nasri and s. toofan, "a wide tuning range and low phase noise vco using new capacitor bank structure", majlesi j. electr. eng., vol. 12, pp. 95-103, 2018. [33] m. katebi, a. nasri, s. toofan and h. zolfkhani, "a temperature compensation voltage controlled oscillator using a complementary to absolute temperature voltage reference", int. j. eng., vol. 32, no. 5, pp. 710-719, 2019. instruction facta universitatis series: electronics and energetics vol. 29, n o 1, march 2016, pp. 61 76 doi: 10.2298/fuee1601061m comparison of classical cic and a new class of stopband-improved cic filters formed by cascading non-identical comb sections  dejan n. milić, vlastimir d. pavlović university of niš, faculty of electronic engineering, niš, serbia abstract. in this paper we propose a new class of selective cic filters in recursive and nonrecursive form. the filters use a modification of cic concept, which is achieved by applying a set of non-identical comb sections in cascade. we illustrate examples of the proposed filter function and calculate integer coefficients of filter impulse response. 
detailed comparison between the proposed selective filter class and classical cic filters is given. the results show that, for the same filter complexity, the stopband selectivity can be improved significantly in comparison with classical cic filters.

key words: selective cic filter, comb filters, fir filters, recursive form, nonrecursive form, classical cic filter

received september 5, 2014; received in revised form july 24, 2015
corresponding author: dejan n. milić, faculty of electrical engineering, university of niš, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: dejan.milic@elfak.ni.ac.rs)

1. introduction

comb-based digital filters have become widely used in multirate systems in recent years, primarily because of their low complexity and power consumption [1]. classical comb filter functions $H_N(z)$ with finite impulse response and linear phase characteristics have all their zeroes on the unit circle, and the total number of zeroes is $N$. with cascade synthesis of identical comb filter functions, one can generate conventional cic filter functions whose attenuation characteristics are given by:

$$\left| H_{CIC}(N, K, \omega) \right| = \left| \frac{\sin(N\omega/2)}{N \sin(\omega/2)} \right|^{K} \qquad (1)$$

where $K$ denotes the number of cascaded sections. cic filters have a great importance in telecommunication techniques and especially in multirate processing and sigma-delta modulation [2, 3]. they have two very important characteristics:
1. linear phase response, and
2. multiplierless operation, since they require only delay, addition and subtraction.

however, classical cic filters also have the following important shortcomings:
1. very high value of the filter function normalization constant $N^K$,
2. high ratio of the maximal and minimal values of the integer coefficients in the impulse response:

$$\frac{h_{CIC\max}(N, K)}{h_{CIC\min}(N, K)} = \frac{\max_r \{ h_{CIC}(N, K, r) \}}{\min_r \{ h_{CIC}(N, K, r) \}}, \quad r = 0, 1, \ldots, K(N-1) \qquad (2)$$

3. comparatively low stopband attenuation.
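the three quantities named in the list above can be evaluated directly for a classical cic filter. the python sketch below is only an illustration and is not taken from the paper: the function names are hypothetical, the integer coefficients are obtained by repeated polynomial multiplication, and the highest sidelobe is located by a dense grid search, which is an assumed numerical approach.

```python
import math


def convolve(a, b):
    """Multiply two polynomials given as integer coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def classical_cic_coefficients(N, K):
    """Integer impulse response of K cascaded comb sections of length N."""
    h = [1]
    for _ in range(K):
        h = convolve(h, [1] * N)
    return h


def classical_cic_sidelobe_db(N, K, grid=200001):
    """Attenuation [dB] at the highest sidelobe (assumed grid search, N >= 3)."""
    first_null = 2 * math.pi / N
    peak = 0.0
    for i in range(grid):
        w = first_null + (math.pi - first_null) * i / (grid - 1)
        peak = max(peak, abs(math.sin(N * w / 2) / (N * math.sin(w / 2))))
    return -20 * K * math.log10(peak)


def classical_cic_metrics(N, K):
    """Normalization constant, max/min coefficient ratio, sidelobe attenuation."""
    h = classical_cic_coefficients(N, K)
    return {
        "normalization": N ** K,
        "max_min_ratio": max(h) // min(h),
        "stopband_db": classical_cic_sidelobe_db(N, K),
    }
```

for $N = 5$ and $K = 5$ this sketch gives a normalization constant of 3125, a coefficient ratio of 381 and a sidelobe attenuation of about 60.21 db.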
for example, by setting $N = 9$ and using $K = 7$ cascades, one can obtain stopband attenuation of $|H_{CIC}(9, 7, z)| = 90.27$ db. stopband attenuation of classical cic filters is equal to the depth of the first sidelobe, and therefore can be estimated (as a linear ratio) as:

$$A_{CIC,s}(N, K) \approx \left[ N \sin\left( \frac{3\pi}{2N} \right) \right]^{K} \qquad (3)$$

some of the attempts to sharpen the filter response and improve the stopband selectivity are described in the literature, for example [4-12]. table 1 summarizes the values of normalization constants, ratios of maximal and minimal impulse response coefficients, and stopband attenuation for different filter parameters $N$ and $K$.

table 1 characteristic values of classical cic filter functions

N   K    normalization constant   max/min coefficient ratio   stopband attenuation [db]
5   5    3125                     381                         60.21
5   8    390625                   38165                       96.33
5   11   48828125                 4091495                     132.45
6   5    7776                     780                         62.13
6   8    1679616                  135954                      99.40
6   11   362797056                25090131                    136.68
7   5    16807                    1451                        63.26
7   8    5764801                  398567                      101.22
7   11   1977326743               117224317                   139.17

in this paper, we propose a new class of fir filter functions with all zeroes on the unit circle, which improves on all three issues present in classical cic filters. normalization constants are lower, ratios of $h_{CIC\max}(N, K)$ and $h_{CIC\min}(N, K)$ are reduced, and stopband attenuation values are improved.

2. the proposed class of cic fir filter functions

a classical cic filter is described by the normalized transfer function, which can be condensed into the recursive form:

$$H_N(z) = \frac{1 - z^{-N}}{N (1 - z^{-1})} \qquad (4)$$

where $N$ is an integer parameter, and the filter order is equal to $N - 1$. when better stopband suppression is required, it is common to cascade multiple filter sections until the requirements are met. by cascading $K$ identical comb/integrator stages, the effective transfer function of the cascaded filter is of the form:

$$H_N(K, z) = \left( H_N(z) \right)^{K} \qquad (5)$$

by cascading non-identical sections of classical cic filters, it is possible to obtain filters with different characteristics. some particular cases of new filter classes based on this approach have been considered previously in [7, 8]. the choice of filter sections can in general be arbitrary, and not every combination would yield justifiable results. on the other hand, a classification of the general cascaded filter type has not been attempted in the literature, and we choose to present our own filter class, which showed good results towards better stopband performance. in this paper, we propose a filter with transfer function

$$H_N(L, z) = H_{N-1}(z)\, H_{N+1}(z) \left[ H_{N-2}(z)\, H_N(z)\, H_{N+2}(z) \right]^{L} \qquad (6)$$

which consists of two cic filters with transfer functions $H_{N-1}(z)$ and $H_{N+1}(z)$, and an $L$-fold cascade group of three filters with transfer functions $H_{N-2}(z)$, $H_N(z)$, and $H_{N+2}(z)$.

2.1 recursive form of the proposed filter class

following from (4), the proposed filter function has the recursive form:

$$H_N(L, z) = \frac{1 - z^{-(N-1)}}{(N-1)(1 - z^{-1})} \cdot \frac{1 - z^{-(N+1)}}{(N+1)(1 - z^{-1})} \left[ \frac{1 - z^{-(N-2)}}{(N-2)(1 - z^{-1})} \cdot \frac{1 - z^{-N}}{N (1 - z^{-1})} \cdot \frac{1 - z^{-(N+2)}}{(N+2)(1 - z^{-1})} \right]^{L} \qquad (7)$$

the frequency response characteristic is:

$$H_N(L, e^{j\omega}) = e^{-j(N-1)K\omega/2}\, \frac{\sin\frac{(N-1)\omega}{2} \sin\frac{(N+1)\omega}{2}}{(N^2-1) \sin^2(\omega/2)} \left[ \frac{\sin\frac{(N-2)\omega}{2} \sin\frac{N\omega}{2} \sin\frac{(N+2)\omega}{2}}{N (N^2-4) \sin^3(\omega/2)} \right]^{L} \qquad (8)$$

where we denote the total number of cic cascades as $K = 3L + 2$. the normalized amplitude response is:

$$A_N(L, \omega) = \frac{\sin((N-1)\omega/2)\, \sin((N+1)\omega/2)}{(N-1)(N+1) \sin^2(\omega/2)} \left[ \frac{\sin((N-2)\omega/2)\, \sin(N\omega/2)\, \sin((N+2)\omega/2)}{(N-2)\, N\, (N+2) \sin^3(\omega/2)} \right]^{L} \qquad (9)$$

and the magnitude response is obtained by taking the absolute value of the amplitude response. as is evident from (8), the phase response characteristic is linear and expressed as $\varphi(N, L) = -(N-1)K\omega/2 \pm 2k\pi$, where $k \in \{0, 1, 2, \ldots\}$, while the group delay of the proposed filter class is:

$$\tau(N, L) = (N-1)(3L/2 + 1) \qquad (10)$$

normalized response characteristics are shown in fig. 1, for $L = 2$ and $3 < N < 25$. the amplitude response shows that there are no visible variations in the passband. there is also a smooth transition towards the stopband, which is consistent with the general behavior of classical cic filters. as the filter order increases, the passband narrows, as expected. the magnitude response shows strong attenuation in the stopband, which is clearly the consequence of the filter zeroes. however, attenuation drops to much lower values between the zeroes, effectively defining the stopband attenuation limit. lines that define the locations of the filter zeroes are clearly visible for lower filter orders, and the overall effect of cascading non-identical filter sections is in fact a dispersion of the zeroes. this is also evident in fig. 2, which shows contour plots of the attenuation for different filter orders versus angular frequency. fig. 2 also compares the magnitude responses of the classical cic filters and the proposed filter class, so that the differences can be highlighted. while the classical cic filters have strong attenuation bands and comparatively low attenuation between them, the proposed filter functions have dispersed high attenuation bands and significantly better attenuation between those bands. as the filter order increases, after a certain point the characteristics begin to look similar, so we expect the most significant stopband attenuation improvement at lower filter orders. figs. 2b and 2d compare the magnitude responses over the lower frequencies close to the passband.
we can also see that the passband responses of the classical cic and the proposed filters are very similar, and therefore we can assume that the compensation techniques used for classical cic filters [14-16] can also be used here successfully.

[fig. 1: a) amplitude response; b) magnitude response]
fig. 1 normalized response characteristics of the proposed cic fir filter class for $N \in \{4, \ldots, 24\}$ and $L = 2$.

[fig. 2: contour plots of attenuation [db] (scale 0 to 200 db) versus angular frequency and $N$: a) classical cic filter with $K = 8$; b) same as previous, lower part of frequency range; c) proposed cic fir filter with $K = 8$; d) same as previous, lower part of frequency range]
fig. 2 contour plots of magnitude response characteristics of classical cic and the proposed cic fir filters for $N \in \{4, \ldots, 24\}$ and total number of cascaded sections $K = 8$.

we further investigate the passband and stopband cut-off values of the proposed filter class, and the obtained results are shown in fig. 3. the results are suitable for determining filter parameters $N$ and $L$ when the required values of passband and stopband cut-offs are given. we observe that the filter order strongly influences the values of the pass- and stopband cut-offs, while the number of cascades does less so. as a consequence, in many cases the requirements can be met using more than a single combination of parameters, which gives a certain degree of freedom in choosing an efficient filter function.

[fig. 3: angular cut-off frequency versus $N$, curves for $L = 1, 2, 3$: a) passband cut-off values for 0.28 db response variation; b) stopband cut-off values for 100 db attenuation]
fig. 3 passband and stopband cut-off values of the proposed filter class, for $N \in \{4, \ldots, 24\}$ and $L \in \{1, 2, 3\}$.

2.2 nonrecursive form of the proposed filter class

using the non-recursive form [17, 18] of the normalized classical cic filter impulse response:

$$H_N(z) = \frac{1}{N} \sum_{i=0}^{N-1} z^{-i} \qquad (11)$$

as a building block, we can directly write the non-recursive form of the proposed filter class impulse response:

$$H_N(L, z) = \frac{1}{(N^2-1)\, N^{L} (N^2-4)^{L}} \sum_{i=0}^{N-2} z^{-i} \sum_{j=0}^{N} z^{-j} \left( \sum_{k=0}^{N-3} z^{-k} \sum_{l=0}^{N-1} z^{-l} \sum_{m=0}^{N+1} z^{-m} \right)^{L} \qquad (12)$$

classical cic filters have all zeroes (the number of zeroes is equal to the filter order, $N - 1$) on the unit circle in the z-plane, and their multiplicity increases linearly with increasing number of cascades. therefore, increased multiplicity of the zeroes is a side-effect of cascading filter elements in order to improve the stopband attenuation. in the proposed filter class, there are also multiple filter sections, but in contrast with classical cic, the sections are not of the same order. this diversity allows a wide spread of zeroes, which are also distributed on the unit circle. by distributing zeroes more evenly for the proposed filter class, we hope to get significantly better stopband characteristics while retaining the other desirable characteristics of the cic filters. an illustrative example is given in fig. 4, where locations and multiplicities of zeroes are compared for the two types of filters with the same total number of cascades and the same group delay.

[fig. 4: zero locations on the unit circle: a) classical cic filter (multiplicity 8); b) proposed filter (multiplicities 1, 2, 3, 6)]
fig. 4 locations and multiplicities of filter function zeros in z-plane for $N = 8$ and $K = 8$ cascades.
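the zero dispersion described above can be checked by counting how often each zero angle occurs across the cascaded sections. the sketch below is a hedged illustration with hypothetical function names; it assumes only that a comb section of length $M$ has its zeroes at $e^{j 2\pi k / M}$, $k = 1, \ldots, M-1$, and it stores each angle as an exact fraction of a full turn so that coinciding zeroes of different sections are merged.

```python
from collections import Counter
from fractions import Fraction


def section_zero_angles(M):
    """Zero angles of a length-M comb section, as fractions of a full turn."""
    return [Fraction(k, M) for k in range(1, M)]


def classical_zero_multiplicities(N, K):
    """Map zero angle -> multiplicity for K identical sections of length N."""
    counts = Counter()
    for angle in section_zero_angles(N):
        counts[angle] += K
    return counts


def proposed_zero_multiplicities(N, L):
    """Map zero angle -> multiplicity for the cascade of non-identical sections."""
    counts = Counter()
    sections = ((N - 1, 1), (N + 1, 1), (N - 2, L), (N, L), (N + 2, L))
    for M, times in sections:
        for angle in section_zero_angles(M):
            counts[angle] += times
    return counts
```

for $N = 8$ with $K = 8$ ($L = 2$), the classical filter has every zero with multiplicity 8, while the proposed cascade yields multiplicities 1, 2, 3 and 6, with the sixfold zero at $z = -1$; both filters have 56 zeroes in total.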
when all products in (12) are taken into account, the impulse response is written simply as:

$$H_N(L, z) = C_{N,L}^{-1} \sum_{k=0}^{(3L+2)(N-1)} a_k z^{-k} \qquad (13)$$

where $C_{N,L} = (N^2-1)\, N^{L} (N^2-4)^{L}$ is the normalization constant. we can observe that the result can be interpreted as a scalar product of two vectors:

$$H_N(L, z) = C_{N,L}^{-1}\, \mathbf{a}_{N,L}\, \mathbf{z}_{N,L}^{T} \qquad (14)$$

where $\mathbf{a}_{N,L} = [a_0, a_1, \ldots, a_{(3L+2)(N-1)}]$, $\mathbf{z}_{N,L} = [1, z^{-1}, z^{-2}, \ldots, z^{-(3L+2)(N-1)}]$, and the symbol $T$ denotes the vector transpose. coefficients $a_k$ are computed easily for the given filter parameters $N$ and $L$, and although a general symbolic formula for $a_k$ may exist, we have not pursued its derivation. instead, we give the coefficient vectors for filters with $N \in \{7, 8\}$ and $L \in \{1, 2\}$.

for the sixth order filter ($N = 7$) with $L = 1$, we have the following 31 coefficients:
a7,1 = [1, 5, 15, 35, 70, 125, 204, 309, 439, 589, 750, 910, 1055, 1171, 1246, 1272, 1246, 1171, 1055, 910, 750, 589, 439, 309, 204, 125, 70, 35, 15, 5, 1],

and with $L = 2$, we have the following 49 coefficients:
a7,2 = [1, 8, 36, 120, 330, 790, 1699, 3350, 6142, 10578, 17243, 26758, 39710, 56562, 77553, 102602, 131233, 162538, 195191, 227520, 257635, 283600, 303628, 316274, 320598, 316274, 303628, 283600, 257635, 227520, 195191, 162538, 131233, 102602, 77553, 56562, 39710, 26758, 17243, 10578, 6142, 3350, 1699, 790, 330, 120, 36, 8, 1]

the 36 coefficients of the seventh order filter ($N = 8$) with $L = 1$ are:
a8,1 = [1, 5, 15, 35, 70, 126, 209, 324, 474, 659, 875, 1114, 1364, 1610, 1835, 2022, 2156, 2226, 2226, 2156, 2022, 1835, 1610, 1364, 1114, 875, 659, 474, 324, 209, 126, 70, 35, 15, 5, 1],

while the same order filter with $L = 2$ has the following 57 coefficients:
a8,2 = [1, 8, 36, 120, 330, 792, 1714, 3415, 6353, 11147, 18586, 29618, 45313, 66796, 95150, 131293, 175839, 228957, 290246, 358645, 432396, 509073, 585684, 658844, 725007, 780736, 822984, 849356, 858322, 849356, 822984, 780736, 725007, 658844, 585684, 509073, 432396, 358645, 290246, 228957, 175839, 131293, 95150, 66796, 45313, 29618, 18586, 11147, 6353, 3415, 1714, 792, 330, 120, 36, 8, 1]

in order to compare the filter responses, we observe that the classical cic filter with the same order and group delay has the following impulse response:

$$H_N(3L+2, z) = \frac{1}{N^{3L+2}} \left( \sum_{k=0}^{N-1} z^{-k} \right)^{3L+2} \qquad (15)$$

using the multinomial theorem, the previous equation can be written in its expanded form:

$$H_N(3L+2, z) = \frac{1}{N^{3L+2}} \sum_{k_0 + k_1 + \cdots + k_{N-1} = 3L+2} \frac{(3L+2)!}{k_0!\, k_1! \cdots k_{N-1}!} \prod_{t=0}^{N-1} z^{-t k_t} \qquad (16)$$

and finally in the form analogous to (13):

$$H_N(3L+2, z) = \frac{1}{N^{3L+2}} \sum_{k=0}^{(3L+2)(N-1)} b_k z^{-k} \qquad (17)$$

in order to further compare the coefficients of the classical cic and proposed filters, we have computed the coefficient vectors for both filters, using the same filter order and group delay. the relative difference of the coefficients is shown in fig. 5, with the classical cic filter taken as reference, i.e. we define the relative difference as $(a_k - b_k)/b_k$. since the relative difference is never positive, the coefficients of the proposed filter class are always less than or equal to the corresponding coefficients of the classical cic filters, and the reduction is largest for the largest coefficients.

[fig. 5: relative difference [%] (0 to -25) versus coefficient order $k$, curves for $L = 1, 2, 3$ and $N = 7, 8$]
fig. 5 relative difference of the impulse response coefficients of the proposed filter class compared to corresponding coefficients of the classical cic filters.
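the coefficient vectors listed above follow from multiplying out the comb sections in (12). the following python sketch reproduces them by integer polynomial multiplication; it is an illustration under the assumption that each section of length $M$ contributes the polynomial $1 + z^{-1} + \cdots + z^{-(M-1)}$, and the function names are hypothetical rather than taken from the paper.

```python
def convolve(a, b):
    """Multiply two polynomials given as integer coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def proposed_coefficients(N, L):
    """Integer coefficients a_k of the proposed filter for parameters N and L."""
    h = convolve([1] * (N - 1), [1] * (N + 1))
    for _ in range(L):
        for M in (N - 2, N, N + 2):
            h = convolve(h, [1] * M)
    return h


def classical_coefficients(N, K):
    """Integer coefficients b_k of a classical CIC filter with K sections."""
    h = [1]
    for _ in range(K):
        h = convolve(h, [1] * N)
    return h


def normalization_constant(N, L):
    """C_{N,L} = (N^2 - 1) * N^L * (N^2 - 4)^L, the sum of all a_k."""
    return (N * N - 1) * (N * (N * N - 4)) ** L
```

for example, `proposed_coefficients(7, 1)` returns the 31 values of a7,1, whose sum equals `normalization_constant(7, 1)`, i.e. 15120, while `classical_coefficients(7, 5)` has the same length and a largest coefficient of 1451.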
normalization constants and max/min coefficient ratios of the proposed filter class are compared to the corresponding values of classical cic filters, and the results are listed in table 2. the relevant values are about 10% to 45% lower, relative to classical cic.

table 2 characteristic values of the proposed filters impulse responses, and comparison to corresponding values for classical cic filters

N   L   K    normalization constant   relative to classical cic [%]   max/min coefficient ratio   relative to classical cic [%]
5   1   5    2520                     -19.35                          292                         -23.36
5   2   8    264600                   -32.26                          24544                       -35.69
5   3   11   27783000                 -43.10                          2209862                     -45.99
6   1   5    6720                     -13.58                          651                         -16.54
6   2   8    1290240                  -23.18                          100716                      -25.92
6   3   11   247726080                -31.72                          16524804                    -34.14
7   1   5    15120                    -10.04                          1272                        -12.34
7   2   8    4762800                  -17.38                          320598                      -19.56
7   3   11   1500282000               -24.13                          86589572                    -26.13

3. comparison of stopband characteristics

as mentioned previously, the most significant effect of zero dispersion in the proposed filter class is expected to be the stopband attenuation improvement. to study and illustrate the effect, we show a detailed analysis of numerical results obtained for even and odd filter orders, $N \in \{7, 8\}$, and different numbers of cascaded sections, corresponding to $L \in \{1, 2, 3\}$. in fig. 6 we show filter attenuation in db for the angular frequency span $0 \le \omega \le \pi$. it is immediately obvious that the proposed filter outperforms classical cic filters in the stopband. at the same time, passband characteristics are closely matched, potentially allowing the use of compensators designed for classical cic filters. as the number of cascades increases, so does the benefit of attenuation improvement. this is in agreement with our initial assumption that the zero multiplicity of classical cic filters can be traded for stopband performance. numerical values of stopband attenuation, as well as stopband cut-off values, are shown in fig. 7, which shows zoomed areas of interest from fig. 6. it is evident that the stopband improvement can be significant, ranging from about 19 db for $L = 1$ and 26 db for $L = 2$, up to 32 db for $L = 3$.

[fig. 6: attenuation [db] versus angular frequency, 0 to $\pi$: a) number of cascades is $K = 5$ (corresponding to $L = 1$ for the proposed filter); b) number of cascades is $K = 8$ (corresponding to $L = 2$); c) number of cascades is $K = 11$ (corresponding to $L = 3$)]
fig. 6 comparison of normalized magnitude response characteristics in db for classical cic filter with $N = 7$ (dashed lines), and the proposed cic fir filter functions with $N = 7$ (solid lines).

[fig. 7: attenuation [db] versus angular frequency, enlarged: a) $K = 5$ ($L = 1$): proposed 82.23 db with cut-off 0.76283, classical 63.26 db with cut-off 0.72214; b) $K = 8$ ($L = 2$): proposed 127.09 db with cut-off 0.76841, classical 101.22 db with cut-off 0.72214; c) $K = 11$ ($L = 3$): proposed 171.10 db with cut-off 0.77147, classical 139.17 db with cut-off 0.72214]
fig. 7 details of comparison shown in fig. 6, with enlarged sections of interest and specific values shown. characteristics of the classical cic filters are shown using dashed lines, and those of the proposed filter are in solid lines.

in fig. 8 we show an example of filter attenuation for odd filter order: $N - 1 = 7$ ($N = 8$).
the figure looks very similar to the previous example shown in fig. 6, but there are a few points worth noting. firstly, there are no fundamental differences visible between even and odd filter orders. secondly, attenuation improvement is not linear, but depends on a complex interplay of zero locations and multiplicities. in fact, in fig. 8a we have a slightly lower improvement than in fig. 6a.

[fig. 8: attenuation [db] versus angular frequency, 0 to $\pi$: a) number of cascades is $K = 5$ (corresponding to $L = 1$ for the proposed filter); b) number of cascades is $K = 8$ (corresponding to $L = 2$); c) number of cascades is $K = 11$ (corresponding to $L = 3$)]
fig. 8 comparison of normalized magnitude response characteristics in db for classical cic filter with $N = 8$ (dashed lines), and the proposed cic fir filter functions with $N = 8$ (solid lines).

[fig. 9: attenuation [db] versus angular frequency, enlarged: a) $K = 5$ ($L = 1$): proposed 82.59 db with cut-off 0.66501, classical 63.99 db with cut-off 0.63347; b) $K = 8$ ($L = 2$): proposed 134.66 db with cut-off 0.68294, classical 102.38 db with cut-off 0.63347; c) $K = 11$ ($L = 3$): proposed 182.95 db with cut-off 0.68706, classical 140.77 db with cut-off 0.63347]
fig. 9 details of comparison shown in fig. 8, with enlarged sections of interest and specific values shown.
characteristics of the classical cic filters are shown using dashed lines, and those of the proposed filter are in solid lines.

[fig. 10: stopband attenuation improvement [db] versus $N$, curves for $L = 1, 2, 3$]
fig. 10 stopband attenuation improvement versus parameter $N$.

for a higher number of cascades, the attenuation valleys are more uniformly distributed in terms of their attenuation values, which indicates that the filter has a finer balance and better stopband characteristics. numerical values of stopband attenuation, as well as stopband cut-off values, are shown in fig. 9, which shows zoomed areas of interest from fig. 8. the stopband improvement here ranges from about 19 db for $L = 1$, through 32 db for $L = 2$, up to 42 db for $L = 3$. as we have noticed, because of the complex interplay between the zeroes, it is hard to predict the exact values of stopband attenuation improvement, and these can be calculated and tabulated only after the actual characteristics have been compared. therefore, we have performed detailed calculations for different filter orders and numbers of cascades, and we summarize the results in fig. 10. the results indicate that the largest stopband attenuation improvement is obtained for $N = 8$ when $L = 2$ and $L = 3$. when $L = 1$, the largest improvement is obtained for $N = 7$. as the filter order increases beyond its optimal value, attenuation improvement becomes consistently lower. in order to compare the filter function to a similar one presented in [7], we have calculated the stopband attenuation improvement in db and normalized it by the total group delay (10), thereby showing how efficiently the filter function improves the stopband attenuation with an increasing number of delay elements. the results shown in fig. 11 indicate that the proposed filter function is more efficient in this regard than the one presented in [7].
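the comparisons in this section rest on the closed-form amplitude response (9), the classical response (1) and the group delay (10). the sketch below evaluates them at a chosen angular frequency and normalizes an attenuation improvement by the group delay. it is an illustrative sketch with hypothetical function names, valid for $0 < \omega \le \pi$; the stopband attenuation values passed to the last function are assumed to be read from the figures rather than computed here.

```python
import math


def section_amplitude(M, w):
    """Amplitude sin(M w / 2) / (M sin(w / 2)) of one comb section, 0 < w <= pi."""
    return math.sin(M * w / 2) / (M * math.sin(w / 2))


def proposed_amplitude(N, L, w):
    """Normalized amplitude response of the proposed filter, as in (9)."""
    group = (section_amplitude(N - 2, w)
             * section_amplitude(N, w)
             * section_amplitude(N + 2, w))
    return section_amplitude(N - 1, w) * section_amplitude(N + 1, w) * group ** L


def classical_amplitude(N, K, w):
    """Normalized amplitude response of K identical sections, as in (1)."""
    return section_amplitude(N, w) ** K


def attenuation_db(amplitude):
    """Attenuation in dB; infinite exactly at a zero of the response."""
    return math.inf if amplitude == 0 else -20 * math.log10(abs(amplitude))


def group_delay(N, L):
    """Group delay (10) in samples."""
    return (N - 1) * (3 * L / 2 + 1)


def improvement_per_delay(proposed_db, classical_db, N, L):
    """Stopband improvement [dB] per unit of group delay."""
    return (proposed_db - classical_db) / group_delay(N, L)
```

with the values quoted for $N = 7$ and $L = 1$ (82.23 db against 63.26 db) and a group delay of 15 samples, `improvement_per_delay(82.23, 63.26, 7, 1)` gives about 1.26 db per delay unit.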
we note that $L = 2$ from [7] corresponds to the same delays as $L = 4$ in this paper.

[fig. 11: stopband improvement per group delay unit [db] versus $N$, for the proposed filter function and ref. [7], curves for $L = 1, 2, 3$]
fig. 11 stopband attenuation improvement normalized by group delay.

4. conclusion

this paper describes a new class of selective cic filter functions in recursive and nonrecursive form. we have illustrated examples of the proposed filter function class, and shown details of the response characteristics for a wide range of filter orders. we have highlighted the common points and differences in relation to classical cic filters. results show that the normalization constant and the span of integer filter coefficients are lower than those of the corresponding classical cic filters, while the stopband characteristics are significantly improved. a detailed comparison of response characteristics with classical cic filters is given. the results indicate that the proposed class of cic filter functions can have significant stopband attenuation improvement for the same digital filter complexity. further research will be directed towards passband droop compensation while keeping the proposed technique for stopband improvement.

acknowledgement: this work is supported in part by the ministry of education, science, and technology development of the republic of serbia, under grants iii44006 and tr32023.

references
[1] e. hogenauer, "an economical class of digital filters for decimation and interpolation", ieee trans. acoustics, speech and signal processing, vol. 29, no. 2, pp. 155-162, april 1981.
[2] m. laddomada, "generalized comb decimation filters for ΣΔ a/d converters: analysis and design", ieee trans. on circuits and systems-i, vol. 54, no. 5, pp. 994-1005, may 2007.
[3] m. laddomada, "comb-based decimation filters for ΣΔ a/d converters: novel schemes and comparisons", ieee trans. on signal processing, vol. 55, no. 5, pp. 1769-1779, may 2007.
[4] g. jovanović doleček, s. k. mitra, "a new two-stage sharpened comb decimator", ieee trans. circuits syst. i: regular papers, vol. 52, no. 7, pp. 1414-1420, july 2007.
[5] m. nikolić, m. lutovac, "sharpening of the multistage modified comb filters", serbian journal of electrical engineering, vol. 8, no. 3, pp. 281-291, 2011.
[6] j. o. coleman, "chebyshev stopbands for cic decimation filters and cic-implemented array tapers in 1d and 2d", ieee trans. circuits syst. i, vol. 59, no. 12, pp. 2956-2968, december 2012.
[7] d. milić, v. pavlović, "a new class of low complexity low-pass multiplierless linear-phase special cic fir filters", ieee signal processing letters, vol. 21, no. 12, pp. 1511-1515, dec. 2014.
[8] v. pavlović, d. milić, b. stošić, "characteristics of novel designed class of cic fir filter functions over classical cic filters", icetran 2014, proceedings of the 1st international conference on electrical, electronic and computing engineering, vrnjačka banja, serbia, june 2-5, 2014.
[9] m. lutovac, v. pavlovic, m. lutovac, "efficient recursive implementation of multiplierless fir filters", in proceedings of the 2nd mediterranean conference on embedded computing (meco), 15-20 june 2013, pp. 128-131.
[10] v. pavlovic, m. lutovac, m. lutovac, "efficient implementation of multiplierless recursive lowpass fir filters using computer algebra system", in proceedings of the 11th international conference on telecommunication in modern satellite, cable and broadcasting services (telsiks), 16-19 oct. 2013, vol. 1, pp. 65-68.
[11] m. laddomada, d. e. troncoso, g. j. doleček, "design of multiplierless decimation filters using an extended search of cyclotomic polynomials", ieee trans. circuits syst. ii, vol. 58, no. 2, pp. 115-119, feb. 2011.
[12] w. a. abu-al-saud, g. l. stuber, "modified cic filter for sample rate conversion in software radio systems", ieee signal processing letters, vol. 10, no. 5, pp. 152-154, may 2003.
[13] g. j. doleček, f. harris, "design of wideband cic compensator filter for a digital if receiver", digital signal processing, vol. 19, pp. 827-837, september 2009.
[14] a. fernandez-vazquez, g. j. doleček, "maximally flat cic compensation filter: design and multiplierless implementation", ieee trans. circuits syst. ii, vol. 59, no. 2, pp. 113-117, feb. 2012.
[15] g. j. doleček, a. fernandez-vazquez, "trigonometrical approach to design a simple wideband comb compensator", int. j. electron. commun. (aeü), vol. 68, no. 5, pp. 437-441, may 2014.
[16] g. j. doleček, a. fernandez-vazquez, "novel droop-compensated comb decimation filter with improved alias rejections", int. j. electron. commun. (aeü), vol. 67, no. 5, pp. 387-396, may 2013.
[17] j. le bihan, "impulse response and generating functions of sinc^n fir filters", in proceedings of the sixth international conference on digital telecommunications icdt 2011, 2011, pp. 25-29.
[18] s. c. dutta roy, "impulse response of sinc^n fir filters", ieee trans. circuits syst. ii, vol. 53, no. 3, pp. 217-219, march 2006.
[19] b. stošić, d. milić, v. pavlović, "new cic filter architecture: design, parametric analysis and some comparisons", iete journal of research, vol. 61, no. 3, pp. 244-250, mar. 2015.

facta universitatis, series: electronics and energetics, vol. 28, no. 2, june 2015, pp. 193-203
doi: 10.2298/fuee1502193v

the current status of power semiconductors

jan vobecký
abb switzerland ltd. semiconductors, lenzburg, switzerland

abstract. trends in the design and technology of power semiconductor devices are discussed on the threshold of the year 2015. well established silicon technologies continue to occupy most applications thanks to the maturity of switches like mosfet, igbt, igct and pct.
silicon carbide (sic) and gallium nitride (gan) are striving to take over the position of silicon. the most relevant sic device is the mps (jbs) diode, followed by mosfet and jfet. gan devices are represented by the lateral hemt. while the long-term reliability of silicon devices is well trusted, sic mosfets and gan hemts are struggling to achieve similar confidence. the cost of sic devices, about two orders of magnitude higher for equivalent functional performance at device level, limits their application to specific cases, but the number of such cases is growing. the next five years will therefore see the coexistence of these technologies. silicon will continue to occupy most applications and dominate the high-power sector. the wide bandgap devices will expand mainly in the 600-1200 v range and dominate research regardless of the voltage class.

key words: power devices, silicon, silicon carbide, gallium nitride, thyristor, transistor.

1. introduction

there is no doubt that power semiconductor devices play an important role in the development of modern society. they are the key components of an ever increasing number of applications which process electrical power ranging from units of watts to gigawatts, with modulation frequencies from hertz to gigahertz. after sixty-five years of development we have reached the stage where different semiconductor materials compete for their use. silicon is defending its dominance by ongoing improvements in ratings (up to 10 ka), cost reduction (using wafers of up to 12") and circuit topologies. wide bandgap semiconductors like sic and gan strive to profit from their much better electrical strength and thermal conductivity in order to compensate for their much higher cost. the long-standing arguments in their favour are the benefits at system level due to a higher operation frequency and lower equipment volume. typical arguments against the wide bandgap devices are the missing experience with their gate dielectric reliability, a cost about two orders of magnitude higher, and low current ratings (≤ 50 a).
at the moment, the cost of ≈ 1–2 $/a of output current of a 1200 v/100 a sic inverter, against ≈ 0.02 $/a for that of si, can be compensated only in a few cases. this is when several superior features are fulfilled at the same time, like very high frequency (high power density), the highest possible conversion efficiency and the existence of a low-loss body diode (solar inverters, ups). the rest is in principle not acceptable for customers due to the high prices. these facts motivate researchers, designers and managers to change this situation. in the following paragraphs a short summary is given on the current status and possible future trends in power devices. for this purpose a classical sorting of components into groups according to electrical power is used, as shown in fig. 1.
received january 13, 2015. corresponding author: jan vobecký, abb switzerland, ltd. semiconductors, fabrikstrasse 3, ch-5600, lenzburg, switzerland (e-mail: jan.vobecky@ch.abb.com)
fig. 1 power semiconductor devices (axes: conversion frequency [hz], conversion power [w]; regions: pct, igct, igbt, mosfet). pct = phase controlled thyristor, igct = integrated gate commutated thyristor, igbt = insulated gate bipolar transistor, mosfet = metal oxide semiconductor field effect transistor
2. mosfet
for about forty-five years, the silicon mosfet has been the power device of choice for breakdown voltages below 1000 v. for discrete parts, this is thanks to (1) the invention of the vertical double-diffused mosfet structure (vdmos) in favor of simple processing [1], (2) the trench vdmos with higher cell density bringing lower conduction losses, and (3) the charge compensation principle (super-junction) for low drift layer resistance, which breaks the original silicon limit of specific on-state resistance vs. breakdown voltage [2].
fig. 2 vertical double-diffused mosfet design concepts (panels: vertical dmos, trench dmos, superjunction dmos)
the invention of the lateral dmos (ldmos) and reduced surface field (resurf) [3] concepts has allowed the integration with bipolar and cmos devices into the bipolar cmos dmos (bcd) [4], smart power [5] and other platforms. from the application viewpoint, the existence of the internal body diode makes the power mosfet concept very practical.
fig. 3 lateral double-diffused mosfet design concepts (panels: ldmos, resurf ldmos, superjunction ldmos)
since about 2010, the silicon mosfet has been facing competition from the sic mosfet. because of the low inversion channel mobility leading to a high channel resistance, the sic mosfets are typically available from a breakdown voltage of 900 v. their specific on-state resistance ron_sp is claimed to approach the theoretical value for the sic 1-d unipolar device starting from breakdown voltages of 3.3 kv [6]. for the 900 v voltage class they outperform the best in class silicon device, which is for example the latest generation of coolmos [7]. while ron_sp ≈ 10 mΩ·cm² for the 650 v superjunction mosfet, it is ron_sp ≈ 2.3 mΩ·cm² for the next generation 900 and 1200 v sic mosfets. a ron_sp of the sic switches that is lower by a factor of five is nowadays a typical benchmarking figure. in addition, the sic devices offer a significantly lower temperature dependence of ron_sp and a very low charge of the output capacitance qoss, while the silicon mosfets go in the opposite direction due to the reduction of the cell pitch.
in specific applications, this could be a valid argument for compensating a much higher price of the sic die by making a profit at the system level, especially in the range 900–1200 v, which is free of competition [7]. but the producers of sic devices like cree are aiming much higher. they offer their mosfets up to 15 kv and demonstrate bipolar devices up to 27 kv. they are seeking new applications to oust the producers of silicon devices from their traditional markets. typical problems, like high price, missing experience with reliability (e.g. gate oxide and vth stability), and low current ratings, are still to be overcome.
another competitor for the 650 v superjunction mosfet in the future is the gan hemt [7]. to be cost competitive, the mainstream hemt is produced by epitaxial growth on a silicon wafer. consequently, (1) it is a lateral device, (2) it has a high defect density in the epitaxial layer, which allows only a much lower dc link voltage to be applied compared to the breakdown voltage, and (3) it has a relatively low thermal conductivity and capacity. the attractiveness of this concept lies in very low switching and on-state losses. however, the latter grow much more with increasing temperature than those of the sic mosfet. to overcome the inherent limitations of the lateral gan devices, the arpa-e switches program was launched in 2014 [8]. the goal is to develop a 1200 v / 100 a single die vertical gan switch, which would have the potential to achieve the target cost of 0.1 $/a. this value is claimed to be at cost parity with the silicon switch. however, one should treat it with caution, because it is deduced from the distributor prices. the current cost of a typical silicon foundry is at least five times lower and it can drop further with increasing size of the starting silicon wafer towards 300 mm. nevertheless, such ambitious programs are needed to boost the gan technology.
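the specific on-state resistance figures quoted earlier in this section can be turned into a rough conduction-loss comparison. the following is a minimal sketch only: the die area and load current are assumed values chosen for illustration, not data from the text, and the helper names are hypothetical. it also ignores the temperature dependence of ron_sp and all switching losses.

```python
# illustrative sketch only: die area and load current are assumed values,
# not taken from the text; the ron_sp figures are the ones quoted above.


def on_resistance_ohm(ron_sp_mohm_cm2: float, area_cm2: float) -> float:
    """On-state resistance of a die from its specific on-state resistance."""
    return ron_sp_mohm_cm2 * 1e-3 / area_cm2


def conduction_loss_w(current_a: float, ron_ohm: float) -> float:
    """Ohmic conduction loss I^2 * R of a unipolar switch."""
    return current_a ** 2 * ron_ohm


AREA_CM2 = 0.25   # assumed active die area
CURRENT_A = 20.0  # assumed load current

for name, ron_sp in (("650 V superjunction Si", 10.0), ("900/1200 V SiC", 2.3)):
    r = on_resistance_ohm(ron_sp, AREA_CM2)
    p = conduction_loss_w(CURRENT_A, r)
    print(f"{name}: Ron = {r * 1e3:.1f} mOhm, conduction loss = {p:.2f} W")
```

for equal die area the loss ratio equals the ron_sp ratio, which is why ron_sp alone is a convenient benchmarking figure; the cost per die area is a separate matter that this sketch does not address.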
benefiting from cmos technology advancements, the unipolar transistor has become the most developed power device technology. there is no reason for this to change in the future. in addition to the classical planar and trench gate architectures mentioned above, nanoscale device concepts have already been demonstrated, e.g. the finfet 3d approach on a gan substrate [9], the gate all around (gaa) fet concept [10], etc. moreover, there is not only silicon, sic and gan. recently, diamond substrates have become available with p-type boron doped and n-type phosphorus doped epitaxial layers. the possibilities for attractive research are growing further.
3. igbt
at the beginning of the 1980s, the on-state resistance of silicon unipolar devices was found to be too high above a breakdown voltage of 600 v. this stimulated the invention of a mos-controlled power device with carrier injection from the side opposite to that of the mos control. the original designations, power mosfet with an anode region [11], insulated gate rectifier (igr) [12] and conductivity modulated fet (comfet) [13], were later unified under the term insulated gate bipolar transistor (igbt). nowadays, the igbt is the device of choice for the voltage classes ranging from 650 v to 6.5 kv. the main reasons are:
(1) the superior cost relative to other technologies,
(2) the process-adjustable position on the trade-off curve between the on-state (vce sat) and switching losses eoff,
(3) high current ratings of power modules in the range of switching frequencies relevant for most kw, mw, and up to 1 gw applications,
(4) the possibility to integrate the free-wheeling diode (fwd) on the same chip (rc-igbt, bigt, rcdc) or to optimize the fwd independently of the igbt,
(5) short circuit withstanding capability,
(6) low temperature dependence of the on-state losses.
(1) the production of igbt chips requires a technology similar to that of the mosfet. the one additional process step is the anode implantation at the backside.
for the voltage classes of ≥ 2.5 kv, the field-stop (soft punch-through) n-type buffer layer of the state-of-the-art igbts can be pre-diffused already in the starting silicon. only for the low voltage igbts (≤ 1700 v) is a more demanding thin wafer technology needed. in this case, the wafer is usually thinned by grinding and spin etching after completion of the mos part and then implanted for the n-type buffer and p-type anode. a record low wafer thickness of 100 µm for the 1200 v class has been achieved, for example, using a thin proton implanted buffer layer instead of the thicker pre-diffused one [14]. there are also further options for thin wafer processing.
(2) the possibility to adjust the buffer and anode doping profiles at the same time provides great flexibility in controlling the bipolar gain of the igbt. this is reflected in the position on the trade-off curve between vce sat and eoff, in the magnitude of the leakage current (tjmax), in the short circuit withstanding capability, etc. the devices with low eoff (higher vce sat) can be operated at higher frequencies and alternatively with the sic fwds. the devices with low vce sat (higher eoff) are preferred in low-frequency applications, for example in the multilevel converters for energy transmission and distribution. again, the adjustment of buffer and anode doping is more demanding for the low voltage igbts, because it can be efficiently processed only after the wafer thinning with a finished mos part, including the passivation of the junction termination. for fast igbts, the limited temperature for the activation of the p-type anode using a classical furnace is not a problem. for the igbts with low vce sat, the laser annealing technology is available, which exposes only a thin anode surface layer to the melting regime for a limited period of time, typically between 100 and 200 ns, while the cathode stays at a safely low temperature [15].
fig. 4 plasma enhancement at cathode side of the state-of-the-art igbt (axes: depth, log carrier concentration, log doping concentration; curves: enhanced, classical)
(3) the original invention of the igbt concept with the injection of holes from the anode side significantly lowered the vce sat and increased the power density. the enhancement of the electron-hole plasma at the cathode side for further reduction of the on-state losses of igbt modules is one of the achievements of the last decade. the planar and trench soft punch-through (field-stop) igbts have developed into the enhanced trench [16] and enhanced planar igbts [17] with sophisticated shaping of the on-state plasma (see fig. 4). the enhanced trench igbt represents the most important design concept for the two chip, i.e. igbt and diode, solution. for example, it recently moved the maximal ratings of the 6.5 kv igbt module to 1 ka [18]. in addition, the power density of the low voltage igbt technology is expected to grow. by further reduction of the cell pitch, the maximal operating temperature of 1700 v modules is expected to reach tjmax = 200 °c around the year 2017 [7].
fig. 5 state-of-the-art igbt design concepts (panels: enhanced planar igbt, enhanced trench igbt, igbt + rc-igbt → bigt)
(4) another way of increasing the power density is the reverse conducting rc-igbt concept, where both the igbt and fwd are integrated on the same chip. a consequence of the fact that the igbt and fwd share the same silicon is a bigger and more uniform current and thermal load of the chip as well.
consequently, the rc-igbt chip is exposed to much lower temperature swings (∆t) during continuous operation, which increases the load cycling capability. in addition, the anode shorts, which form the cathode of the fwd, provide a softer switching compared to the two chip solution. there are two difficulties with the rc-igbt, namely (a) the snapback of the forward i-v characteristics of the rc-igbt and (b) the difficulty of lowering the reverse recovery losses erec of the fwd without worsening the igbt parameters.
(a) to eliminate the snapback, a sophisticated integration of the n-type shorts into the p-type anode evolved into the bi-mode insulated gate transistor bigt [19, 20]. the pilot part of the anode, which is free of the n-type shorts, turns on like a normal igbt without the snapback. the rest of the device, which is the rc-igbt with combined mos-like and igbt-like structures, is active only after the pilot part is turned on. the rc part therefore has the n-type shorting pattern optimized for the fwd operation, the igbt turn-off and the required area ratio between the fwd and igbt.
(b) in the planar rc-igbt design, the erec of the fwd can be partly minimized by a local lifetime control and partly by a special gating concept. in the trench design, the special gating is claimed to be sufficient to control the diode anode injection efficiency in order to reduce the stored charge prior to turn-off [21]. one may decide on one's own whether the better cycling capability is worth the effort with the special gating in comparison with the easier two chip concept with less uniform temperature utilization.
(5) the short circuit withstanding capability is given by the shape of the i-v curves ic = f(vce), which saturate in the saturation regime of the mosfet. this means that the igbt, which inherited this feature from the mosfet, can limit a short circuit current as long as it survives thermally.
this is typically a period of 10 µs or less, during which the igbt has to be turned off. the short circuit turn-off capability is a clear advantage over the thyristor based devices discussed below, which lack this feature. as they are latched up during turn-on and in the on-state, protective fuses or di/dt chokes (see the choke "l" in fig. 6a) need to be added to the application circuits, which makes them bulkier and more complex.
fig. 6 typical inverters with igct (a) and igbt (b).
(6) the superior low temperature dependence of the on-state losses in the wide range of operating temperatures of the igbt is given by the unique combination of bipolar and mos structures. the strong degradation of the electron mobility with increasing phonon scattering (lattice temperature) µn = f(t) makes the silicon mosfet the most temperature sensitive device of all relevant power devices [7]. the increase of the intrinsic concentration with growing temperature ni = f(t) is responsible for the unwanted negative temperature coefficient of the on-state voltage drop in bipolar devices. in the igbt, the strong negative temperature coefficient of µn is partly compensated by the positive one of ni. this feature is given by coincidence rather than by design.
an experimental igbt processed on a sic substrate [6] has been demonstrated with an ambition to penetrate into the silicon domain. the trick to achieve this is based on offering igbts with a blocking voltage well above 15 kv (demonstrated vbr ≈ 24 kv), which silicon cannot achieve. in addition to other technical challenges, the low carrier lifetime in the sic substrate had to be increased from 2 to 10 µs using a 15-hour thermal oxidation at 1300 °c prior to device processing.
the question is whether the designers of the existing high voltage applications will appreciate such an effort. for example, the voltage source converters (vsc) for very high voltage systems can use the modular multilevel converter topology, where a higher number of devices is needed [22]. these devices need to be switched close to the fundamental output frequency of 50 hz and have to be optimized for low conduction losses. the device of choice is then the silicon igbt, with a blocking voltage (typically 4.5 kv) giving an optimal trade-off between electrical losses and system cost.
4. igct
the exceptional feature of the integrated gate commutated thyristor is that it conducts like a thyristor (i.e. with a low vt) and blocks both dynamically and statically like a bipolar transistor [23]. this concept was invented in the mid-1990s by integrating the fast (i.e. low inductance) gate unit into a structure similar to its forerunner, the gate turn-off thyristor (gto) [24]. the special gate unit provides very fast switching from the thyristor on-state regime to that of the transistor with safe turn-off, i.e. without a secondary breakdown. the snubber circuit, necessary for the safe gto turn-off, is hereby eliminated.
fig. 7 conventional (a) and corrugated p-base design of an igct (b).
the igct is nowadays the most powerful device with a turn-off capability. the reasons for its lower occurrence compared to that of the igbt are its bulkier gate unit, the lack of the short-circuit capability, and therefore the necessity of having an overvoltage clamped di/dt choke to set the switching speed. in addition, the increasing power density of igbts described above pushes the igct out of its original applications. however, the power handling capability of the igct is growing further. it has increased the turn-off power density to 700 w/cm² [25] thanks to the corrugated p-base design [26] implemented on a four inch wafer (see fig. 7).
moreover, the integration with the fwd on the 150 mm wafer moved the maximal current handling capability of the rc-igct towards 10 ka [27]. having very low on-state losses, this device is particularly well suited for the modular multilevel converter technology with very high rated power, which is needed for the future hvdc energy transmission and large capacity energy storage systems [28]. recently, a new concept of the rc-igct has been introduced, namely the bi-mode gate commutated thyristor (bgct) [34]. this device features an interdigitated integration of diode and gct segments, which brings several improvements in ratings like increased power density and homogeneous thermal loading analogous to the bigt above, lower leakage current, softer reverse recovery of the fwd, etc. [35]. we will therefore continue to see these devices in very high power applications.
5. pct
the three-terminal p-n-p-n device was introduced in 1956 [29], to be called later the silicon controlled rectifier (scr) and thyristor as well. to distinguish it from other existing thyristor types, we call it the phase controlled thyristor (pct). as one of the first power semiconductor devices ever, the pct played a unique role in the development of semiconductors from the very beginning, as described in ref. [30]. nowadays, the pct is no longer a hot topic of research as it was sixty years ago. however, it constantly maintains a significant market share thanks to its low processing cost, the lowest on-state losses and the simplest handling of all switches. the pct can be found in the voltage classes ranging from a few hundred volts up to 10 kv and from ten amperes up to several ka, when made from silicon. pcts are used in various industrial applications, like motor control, induction heating, power quality, power supplies, etc. they play a very important role in the high-voltage direct current transmission systems (hvdc).
as the hvdc represents the most advancing technology based on the pct, we discuss it briefly below. due to overall low system losses, large-area silicon pcts are used in the line commutated converters (lcc) for long-distance and multi-gigawatt power transmission. as the length of these transmission lines can reach over 2,000 km, extreme demands are placed on the energy efficiency. this dictates the usage of dc transmission line cables operating at ultra-high voltages (uhv). most of the uhvdc systems are installed and further planned in china. following the success of uhvdc lines operating with a rated dc voltage of 800 kv and rated power up to 7 gw, china is investigating the possibility of increasing the voltage rating up to 1,100 kv for power transmission breaking the 10 gw limit [31, 32]. to satisfy the demands on the converter valve for such high power, the current handling capability of pcts had to be increased. for this purpose, abb has developed the second generation of the six inch pct platform with voltage ratings of 6.7, 7.2 and 8.5 kv [33]. thanks to the lowered on-state voltage vt, the maximal rating current itmax is over 6 ka for the 6.7 kv and 7.2 kv pcts, and 5 ka for the 8.5 kv ones. the variety of voltage classes enables the optimization of the system cost vs. energy losses trade-off. there are also other relevant uhvdc projects, for which the pct with itmax < 2.5 ka provides a better trade-off between the system cost and energy losses. for this purpose, a pct with an even lower vt has been developed on four inch silicon [36]. this device has a typical on-state voltage drop vt = 1.55 v at it = 1.5 ka and t = 90 °c for a forward and reverse blocking capability of 8.5 kv. there is no other device with such low on-state losses for the 8.5 kv blocking.
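as a quick worked example, the operating point quoted above for the four inch pct (vt = 1.55 v at it = 1.5 ka) can be converted into an instantaneous on-state power. this is only a back-of-the-envelope sketch: the function name is hypothetical, and the calculation ignores the conduction duty cycle, switching losses and the current dependence of vt.

```python
def on_state_power_w(v_t: float, i_t: float) -> float:
    """Instantaneous on-state power dissipated in a conducting thyristor."""
    return v_t * i_t


V_T = 1.55    # on-state voltage drop in volts (value quoted in the text)
I_T = 1500.0  # on-state current in amperes (value quoted in the text)

p = on_state_power_w(V_T, I_T)
print(f"on-state power at {I_T / 1e3:.1f} kA: {p / 1e3:.3f} kW")
```

since a converter valve connects many such devices in series, even a small reduction of vt per device adds up at the system level.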
such a low on-state voltage implies a further loss reduction of the converter valve by more than 5%, hereby moving the csc concept to ever higher transmission efficiencies. this is important for the future of the uhvdc transmission systems based on the lcc with pcts (so-called hvdc classic). this is because they are being replaced by systems based on the self-commutated vsc with igbts (hvdc light), which have a lower efficiency and can transmit lower power, but offer other beneficial features like a smaller footprint, reactive power control, cold start, no need for a solid ac network at both sides of the transmission line, etc.
fig. 8 electron concentration in the pct at turn-on phase (a). measured vt vs. it for the new 6” pct platform (b).
6. conclusions
current trends in the design and technology of power semiconductor devices were discussed on the threshold of the year 2015. it has been stated that power electronics continues to utilize above all the silicon based devices, but the wide bandgap devices based on gan and especially on sic have assumed an importance as never before. the silicon mosfet dominates the low voltage sector, but in the future, the sic mosfet is expected to surpass it from above 650 v up to about 1200 v. the silicon igbt is available from 650 v to 6.5 kv. it dominates between 900 and 1700 v. it is also important for higher voltage classes due to the industrial drives, traction (1.7–6.5 kv) and energy transmission and distribution (4.5 kv). the modular multilevel converters with igbts are hindering the penetration of the sic solutions into the low loss applications. the area of very high power is occupied by the igct and pct. the main drivers are the energy savings and the world's hunger for electricity. the pct also occupies the lower power areas, where low cost and simplicity of handling are required.
the wide bandgap devices can be found in the applications with a lower power, where their very high conversion efficiency, very high operational frequency and high power density can compensate for their much higher price. the development trends of high-power technologies will continue to secure high power systems with exceptional performance, comfort, energy savings and the required environmental sustainability.
references
[1] h. sigg, g. vandelin, t. cauge and j. kocsis, "dmos transistor for microwave applications", ieee transactions on electron devices, vol. ed-19, pp. 45–53, 1972.
[2] g. deboy, p. märz, j.-p. stengl, h. strack, j. tihanyi and h. weber, "a new generation of high voltage mosfets breaks the limit line of silicon", in proceedings of the international electron device meeting, 1998, pp. 683–685.
[3] j. a. appels and h. m. j. vaes, "high voltage thin layer devices (resurf devices)", in proceedings of the international electron device meeting, 1979, pp. 238–241.
[4] a. r. alvarez, r. m. roop, k. j. ray, g. r. gettermeyer, "lateral dmos transistor optimized for high voltage bimos applications", in proceedings of the international electron device meeting, 1983, pp. 420–423.
[5] r. s. wrathall, d. tam, l. terry, s. p. robb, "integrated circuits for the control of high power", in proceedings of the international electron device meeting, 1983, pp. 408–411.
[6] j. w. palmour, "silicon carbide power development for industrial market", in proceedings of the international electron device meeting, 2014, pp. 1.1.1–1.1.8.
[7] r. rupp, t. laska, o. häberlen, m. treu, "application specific trade-offs for wbg, sic, gan and high end si power switch technologies", in proceedings of the international electron device meeting, 2014, pp. 2.3.1–2.3.4.
[8] t. d. heidel, p. gradzki, b. a. hamilton, "power devices on bulk gallium nitride substrates: an overview of arpa-e's switches program", in proceedings of the international electron device meeting, 2014, pp. 2.7.1–2.7.4.
[9] k.-s. im, y.-w. jo, k.-w. kim, d.-s. kim, h.-s. kang, c.-h. won, r.-h. kim, s.-m. jeon, d.-h. son, y.-m. kwon, j.-h. lee, s. cristoloveanu, j.-h. lee, "first demonstration of heterojunction-free gan nanochannel finfets", in proceedings of the international symposium on power semiconductor devices & ics, 2013, pp. 415–418.
[10] g. larrieu, x. l. han, "vertical nanowire array-based field effect transistors for ultimate scaling", nanoscale, vol. 5, pp. 2437–2441, 2013.
[11] h. w. becke and c. f. wheatley, "power mosfet with an anode region", u.s. patent 4,364,073, december 14, 1982.
[12] b. j. baliga, m. s. adler, p. v. gray, r. p. love, n. zommer, "the insulated gate rectifier (igr): a new power switching device", in proceedings of the international electron device meeting, 1982, pp. 264–267.
[13] a. m. goodman, j. p. russell, l. a. goodman, c. j. nuese, and j. m. neilson, "improved comfet with fast switching speed and high-current capability", in proceedings of the international electron device meeting, 1982, pp. 79–82.
[14] j. vobecky, m. rahimo, a. kopta, s. linder, "exploring the silicon design limits of thin wafer igbt technology: the controlled punch through (cpt) igbt", in proceedings of the international symposium on power semiconductor devices & ics, 2008, pp. 76–79.
[15] m. rahimo, c. corvasce, j. vobecky, y. otani, k. huet, "thin-wafer silicon igbt with advanced laser annealing and sintering process", ieee electron device letters, vol. 33, pp. 1601–1603, nov. 2012.
[16] h. takahashi, h. haruguchi, h. hagino, and t. yamada, "carrier stored trench-gate bipolar transistor (cstbt) – a novel power device for high voltage application", in proceedings of the international symposium on power semiconductor devices & ics, 1996, paper 15.2.
[17] m. rahimo, a. kopta, s.
linder, "novel enhanced-planar igbt technology rated up to 6.5 kv for lower losses and higher soa capability", in proceedings of the international symposium on power semiconductor devices & ics, 2006, pp. 33–36.
[18] m. masaomi, m. tabata, t. hieda, h. muraoka, "7th generation igbt module for industrial applications", in proceedings pcim europe, 2014, pp. 34–38.
[19] m. rahimo, a. kopta, u. schlapbach, j. vobecky, r. schnell, s. klaka, "the bi-mode insulated gate bipolar transistor (bigt) a potential technology for higher power applications", in proceedings of the international symposium on power semiconductor devices & ics, 2009, pp. 283–286.
[20] l. storasta, m. rahimo, c. corvasce, a. kopta, "resolving design trade-offs with the bigt concept", in proceedings pcim europe, 2014, pp. 354–361.
[21] d. werber, f. pfirsch, t. gutt, v. komarnitskyi, c. schaeffer, t. hunger, d. domes, "6.5 kv rcdc for increased power density in igbt modules", in proceedings of the international symposium on power semiconductor devices & ics, 2014, pp. 35–38.
[22] a. kopta, "high voltage silicon based devices for energy efficient power distribution and consumption", in proceedings of the international electron device meeting, 2014, pp. 2.4.1–2.4.4.
[23] s. klaka, m. frecker, h. grüning, "the integrated gate-commutated thyristor: a new high efficiency, high-power switch for series or snubberless operation", in proceedings pcim europe, 1997.
[24] r. h. van ligten, d. navon, "base turn-off of p-n-p-n switches", ire wescon convention record, part 3 on electron devices, pp. 49–52, august 1960.
[25] t. wikstrom, t. stiasny, m. rahimo, d. cottet and p. streit, "the corrugated p-base igct – a new benchmark for large area soa scaling", in proceedings of the international symposium on power semiconductor devices & ics, 2007, pp. 29–32.
[26] n. lophitis, m. antoniou, f. udrea, i. nistor, m. t. rahimo, m. arnold, t. wikstrom, and j.
vobecky, "gate commutated thyristor with voltage independent maximum controllable current", ieee electron device letters, vol. 34, pp. 954-956, 2013.
[27] t. wikstrom, m. arnold, t. stiasny, c. waltisberg, h. ravener, m. rahimo, "the 150 mm rc-igct: a device for highest power requirements", in proceedings of the international symposium on power semiconductor devices & ics, 2014, pp. 91-94.
[28] s. linder, "power electronics: the key enabler of a future with more than 20% wind and solar electricity", in proceedings of the international symposium on power semiconductor devices & ics, 2013, pp. 11-18.
[29] j. l. moll, m. tanenbaum, j. m. goldey, "p-n-p-n transistor switches", proceedings of the ire, vol. 44, pp. 1174–1182, 1956.
[30] n. holonyak, "the silicon p-n-p-n switch and controlled rectifier (thyristor)", ieee transactions on power electronics, vol. 16, pp. 8–16, 2001.
[31] r. montano, b. jacobson, d. wu, l. arevalo, "corridors of power", abb review, special report: 60 years of hvdc, 2014.
[32] j. cao, j. cai, "hvdc in china", 2013 hvdc and facts conference, palo alto, ca, usa, 2013.
[33] j. vobecky, t. stiasny, v. botan, k. stiegler, u. meier, m. bellini, "new thyristor platform for uhvdc (> 1mv) transmission", in proceedings pcim europe, 2014.
[34] u. verpulati, m. bellini, m. arnold, m. rahimo, t. stiasny, "the concept of bi-mode gate commutated thyristor, a new type of reverse conducting device", in proceedings of the international symposium on power semiconductor devices & ics, 2012, pp. 29–32.
[35] u. verpulati, m. arnold, m. rahimo, j. vobecky, t. stiasny, n. lophitis, f. udrea, "an experimental demonstration of a 4.5 kv bi-mode gate commutated thyristor (bgct)", in proceedings of the international symposium on power semiconductor devices & ics, 2015, accepted for publication.
[36] j. vobecky, v. botan, k. stiegler, m. bellini, u.
meier, "a novel ultra low loss four inch thyristor for uhvdc", in proceedings of the international symposium on power semiconductor devices & ics, 2015, accepted for publication.
facta universitatis series: electronics and energetics vol. 27, no. 2, june 2014, pp. 221-234 doi: 10.2298/fuee1402221d
causal models of electrically large and lossy dielectric bodies
antonije djordjević (1), dragan olćan (1), mirjana stojilović (2), miloš pavlović (3), branko kolundžija (1), dejan tošić (1)
(1) university of belgrade – school of electrical engineering, belgrade, serbia
(2) university of applied sciences western switzerland, yverdon-les-bains, switzerland
(3) wipl-d d.o.o., belgrade, serbia
abstract. this paper presents a novel formula for the complex permittivity of lossy dielectrics, which is valid in a broad frequency range and ensures a causal impulse response in the time domain. the application of this formula is demonstrated through the analysis of wet soil, where the coefficients of the formula are tuned to match the measured data from the literature. additionally, an analytical expression for the impulse response of the relative permittivity is derived. the influence of the frequency dependence of the complex permittivity on the causality of responses is illustrated through the analysis of 1-d, 2-d, and 3-d electromagnetic systems. being the most complex, the 3-d system is also used as a test bed for comparing the computational limitations of two commercially available solvers, cst and wipl-d.
key words: causal response, complex permittivity, wet soil, impulse response, cst, wipl-d.
1. introduction
contemporary software tools for electromagnetic (em) simulation can be efficiently used for modeling and analysis of various complex systems.
yet, when a system comprises electrically large but highly detailed objects filled with lossy dielectrics, as is usually the case for simulations at microwave and millimeter-wave frequencies, software limitations can easily be reached. an example of a large and complex system is a human body, which is often modeled when considering body-area networks. in order to find the response of such a system in the time domain, there are two general simulation strategies: to apply a time-domain solver, or to apply a frequency-domain solver and then use the inverse discrete fourier transform.

received january 23, 2014
corresponding author: antonije djordjević, university of belgrade – school of electrical engineering, bulevar kralja aleksandra 73, p.o. box 35-54, 11120 belgrade, serbia (e-mail: edjordja@etf.bg.ac.rs)

if a frequency-domain solver is used, the system needs to be analyzed at a large number of equispaced frequencies. at the lower end of the frequency range, the electrical dimensions of the body are small compared with the wavelength, so that integral-equation solvers are the preferred choice [1]. at the upper end of the frequency range, various asymptotic techniques may be used, provided that the shape of the body is simple (e.g., a sphere). if, however, the shape of the body is highly detailed, then an integral-equation solver (is) is preferred again. however, the computational resources required for is (the memory and processor requirements) increase quickly with the simulation frequency. that is why contemporary commercial integral-equation solvers often fail to analyze objects whose overall dimensions exceed several tens or hundreds of wavelengths. in this paper, we will challenge the limits of two available commercial solvers implemented in the wipl-d [2] and cst [3] electromagnetic simulation tools.
the time-domain response of any real (physical) system is causal: the response cannot start before the excitation. all time-domain solvers implement a time-stepping procedure and naturally incorporate the causality feature, as in [3]. however, this is not the case with frequency-domain solvers. to ensure causal response, one must not forget to properly model the dielectric relative permittivity. if the dielectric is lossless, the relative permittivity can be independent of frequency. yet, if the dielectric is lossy, the variations with frequency must be described by an appropriate function of frequency, as discussed in section 2. otherwise, a noncausal, and thus unrealistic, response will be obtained.

in order to illustrate the causality issues, we use wet soil as an example. the parameters of the soil are evaluated based on measured data available in the literature, presented in section 3.1. these data are fitted in a very broad frequency range, as described in section 3.2. the fitting function involves the broadband term from [4]. in the time-domain analysis, a dispersive relative permittivity is described by the corresponding impulse response. hence, for the broadband term from [4], we reveal in section 3.3 the corresponding impulse response, which is not available in the literature.

in section 4 we present three examples to demonstrate differences between a causal and a noncausal model of the soil. the examples are sorted by computational complexity. the first example is a one-dimensional (1-d) electromagnetic problem (plane-wave propagation). the second example is a two-dimensional (2-d) problem (a cylindrical dielectric scatterer). finally, the third example is the most complex one: a three-dimensional (3-d) problem consisting of a large dielectric cube and two dipole antennas.

2. causality issues with frequency-dependent permittivity

parameters of lossy media are frequency-dependent.
thus, a linear nonmagnetic medium is characterized by the complex relative permittivity $\varepsilon_r(j\omega) = \varepsilon_r'(\omega) - j\varepsilon_r''(\omega)$, where $\omega = 2\pi f$ is the angular frequency and $f$ is the frequency. here, the negative of the imaginary part, $\varepsilon_r''(\omega)$, takes into account both conductive and dielectric losses. a physical system is causal: the response cannot occur before the excitation. consequently, $\varepsilon_r'(\omega)$ and $\varepsilon_r''(\omega)$ are not mutually independent. under certain conditions [5], they are related by the hilbert transform, or, equivalently, the kramers-kronig relations (dispersion relations).

time-domain solvers that are capable of dealing with dispersive parameters inherently use causal models of media. usually, they fit the relative permittivity in the frequency domain using simple terms, most often debye terms [6], so that the impulse response is obtained analytically using the fourier transform. the impulse response is afterwards used in convolution integrals in the time domain.

frequency-domain solvers can deal with any kind of frequency dependence of $\varepsilon_r(j\omega)$, because they are not bound by causality issues. however, if we use the results of the frequency-domain analysis to compute the time-domain response, $\varepsilon_r(j\omega)$ should be such as to provide a causal response. otherwise, the response in the time domain would have a non-physical behavior. for example, it could start before the excitation, thus violating strict causality [7], or the speed of propagation of electromagnetic fields could exceed the speed of light in a vacuum, violating einstein's causality.

although it is often stated in the literature that $\varepsilon_r''(\omega)$ can be evaluated from $\varepsilon_r'(\omega)$ [8], and vice versa, the required numerical integration is not easy, and sometimes not even feasible. the reasons for that lie in singular, highly oscillatory, or even diverging integrands, and in infinite integration limits in the hilbert transform.
even analytically, the integrals cannot be evaluated in many important practical cases because the integrals are divergent or undefined [5]. as a simple example, let us consider a leaky dielectric characterized by $\varepsilon_r'(\omega) = \text{const}$ (equal to the electrostatic relative permittivity) and by a constant conductivity $\sigma(\omega) = \text{const}$, which is independent of $\varepsilon_r'$. the equivalent (complex) permittivity of the material is $\varepsilon_r(j\omega) = \varepsilon_r' - j\,\sigma/(\omega\varepsilon_0)$, so that $\varepsilon_r''(\omega) = \sigma/(\omega\varepsilon_0)$. clearly, if only $\varepsilon_r'$ is known, it is impossible to find $\varepsilon_r''(\omega)$ without an additional piece of information, and vice versa.

in consequence, the data for $\varepsilon_r(j\omega)$ that satisfy causality conditions can be most reliably and easily supplied in terms of an analytic function of the complex frequency $s$ ($s = j\omega$ on the imaginary axis). this function, $\varepsilon_r(s)$, cannot have poles in the right half-plane. it can have only simple poles on the imaginary axis, where it must possess conjugate symmetry: $\varepsilon_r(-j\omega) = \varepsilon_r^*(j\omega)$. hence, $\varepsilon_r(s^*) = \varepsilon_r^*(s)$. the function $\varepsilon_r(s)$ can be supplied directly by the user, in an analytic form. alternatively, the user tabulates the frequency-dependent data for $\varepsilon_r'(\omega)$ and $\varepsilon_r''(\omega)$, and the solver evaluates an appropriate interpolation formula as in [3].

for direct analysis in the time domain, the impulse response of $\varepsilon_r$ is needed. it is convolved with the vector $\varepsilon_0\mathbf{e}(t)$ to obtain $\mathbf{d}(t)$. this convolution is the time-domain counterpart of the relation $\mathbf{d} = \varepsilon_r\varepsilon_0\mathbf{e}$ in the frequency domain.

3. complex relative permittivity of wet soil

in order to clearly demonstrate the difference between causal and noncausal models, it is preferable to have a medium with relatively high losses. in that case, the causality issues can be noted even after an em wave propagates along a short distance. we have selected wet soil as an example of a dispersive medium throughout the remainder of this paper. we have characterized the soil based on experimental data presented in subsection 3.1.
the analytic approximation for $\varepsilon_r(s)$ is given in subsection 3.2.

3.1. experimental data on relative permittivity of water and soil

in this paper we use two sets of measured relative permittivity values: those of water and those of soil. we combine them to estimate the relative permittivity of wet soil. measurement results and an analytical model for the relative permittivity of water are given in [9]. the measured complex permittivity is fitted by a constant ($\varepsilon_{r\infty}$), two debye (relaxation) terms, and a frequency-independent conductivity ($\sigma$), as

$$\varepsilon_r(s) = \varepsilon_{r\infty} + \frac{\Delta\varepsilon_{r1}}{1 + s/\omega_{1w}} + \frac{\Delta\varepsilon_{r2}}{1 + s/\omega_{2w}} + \frac{\sigma}{s\varepsilon_0}. \tag{1}$$

in this model, conductive losses are attributed to $\sigma$ and polarization losses to the two debye terms. the measured relative permittivity of soil is given in [10], and comprises data for $\varepsilon_r'$, $\varepsilon_r''$, and $\sigma$. although it is not sufficiently clear from the report, $\varepsilon_r''$ and $\sigma$ are not independent; they are related by $\varepsilon_r'' = \sigma/(\omega\varepsilon_0)$, which can be verified from the numerical results presented in the report. in other words, two distinct descriptions of losses are used in reference [10]. in the first description, all losses (conductive and polarization) are attributed to $\varepsilon_r''$. in the second description, all losses are attributed to $\sigma$. in this paper we use results from the middle row of fig. 36 in [10].

3.2. broadband approximation of relative permittivity of wet soil

in [4], an approximation for frequency-dependent complex relative permittivity of lossy dielectrics is proposed, covering a very wide frequency range. it uses a logarithmic function, which provides practically constant $\varepsilon_r''(\omega)$, while $\varepsilon_r'(\omega)$ slowly decays with frequency. this approximation yields a causal response in the time domain. the complete expression for the relative permittivity is given by

$$\varepsilon_r(j\omega) = \varepsilon'_\infty + \frac{\Delta\varepsilon'}{m_2 - m_1}\,\frac{\ln\dfrac{\omega_2 + j\omega}{\omega_1 + j\omega}}{\ln 10} - j\,\frac{\sigma}{\omega\varepsilon_0}, \tag{2}$$

where $m_1 = \log_{10}\omega_1$ [rad/s] and $m_2 = \log_{10}\omega_2$ [rad/s].
the first term is the relative permittivity at very high frequencies, the second term is the broadband logarithmic term, and the third term comes from the conductivity, which is assumed to be independent of frequency. for the frequency range where $\omega_1 \ll \omega \ll \omega_2$, the real part of the logarithmic term is

$$\mathrm{Re}\left\{\frac{\Delta\varepsilon'}{m_2 - m_1}\,\frac{\ln\dfrac{\omega_2 + j\omega}{\omega_1 + j\omega}}{\ln 10}\right\} = \frac{\Delta\varepsilon'}{m_2 - m_1}\,\frac{\ln\left|\dfrac{\omega_2 + j\omega}{\omega_1 + j\omega}\right|}{\ln 10} \approx \frac{\Delta\varepsilon'}{m_2 - m_1}\,\frac{\ln(\omega_2/\omega)}{\ln 10}, \tag{3}$$

and it decays linearly with the logarithm of the frequency, by $\Delta\varepsilon'/(m_2 - m_1)$ per decade. in the same frequency band, the imaginary part of the logarithmic term,

$$\mathrm{Im}\left\{\frac{\Delta\varepsilon'}{m_2 - m_1}\,\frac{\ln\dfrac{\omega_2 + j\omega}{\omega_1 + j\omega}}{\ln 10}\right\} = \frac{\Delta\varepsilon'}{m_2 - m_1}\,\frac{\arg\dfrac{\omega_2 + j\omega}{\omega_1 + j\omega}}{\ln 10} \approx -\frac{\Delta\varepsilon'}{m_2 - m_1}\,\frac{\pi/2}{\ln 10}, \tag{4}$$

is practically constant. for angular frequencies below $\omega_1$ or above $\omega_2$, the imaginary part of the logarithmic term tends to zero, while the real part tends to be constant. this logarithmic function can replace several debye terms in a wide frequency range. the formula (2) is often quoted as the "djordjevic-sarkar" model, and it has been built into ansoft [11], agilent [12], simberian [13], and other software.

we make an approximation of the parameters of wet soil by combining the soil parameters from [10], the logarithmic term from [4], and the approximation for pure water at 25ºc, based on data from [9] and [14]. the measured data for the soil are in the frequency range from 0.1 ghz to 3 ghz, whereas the approximation is valid outside of this frequency range as well.
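a quick numerical check of the approximations (3) and (4) may help here. the following python sketch evaluates the logarithmic term of (2) and compares it with the two closed-form estimates. the function name `log_term` and the sample values (corner frequencies of 1 mhz and 100 thz and a total variation of 8, i.e. one unit per decade) are illustrative assumptions, not data taken from [4].

```python
import cmath
import math


def log_term(f, d_eps, f1, f2):
    """Broadband logarithmic term of equation (2) at frequency f (Hz)."""
    w, w1, w2 = (2 * math.pi * x for x in (f, f1, f2))
    m1, m2 = math.log10(w1), math.log10(w2)
    ratio = (w2 + 1j * w) / (w1 + 1j * w)
    # cmath.log returns ln|z| + j*arg(z), i.e. exactly the split used in (3), (4)
    return d_eps / (m2 - m1) * cmath.log(ratio) / math.log(10)


# Illustrative values (assumptions): corners at 1 MHz and 100 THz,
# total variation 8, which gives a slope of 1 per decade.
F1, F2, D_EPS = 1e6, 1e14, 8.0
slope = D_EPS / math.log10(F2 / F1)  # equals d_eps / (m2 - m1)

a = log_term(1e9, D_EPS, F1, F2)
b = log_term(1e10, D_EPS, F1, F2)

print("drop of the real part over one decade:", a.real - b.real)
print("slope predicted by (3):              ", slope)
print("imaginary part at 1 GHz:             ", a.imag)
print("constant predicted by (4):           ", -slope * math.pi / (2 * math.log(10)))
```

well inside the band the two printed pairs should agree closely; moving the test frequency towards either corner shows where the approximations start to degrade.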
our approximation for the permittivity of wet soil reads:

$$\varepsilon_r(s) = 1 + (1-p)\,\frac{\Delta\varepsilon_r}{m_2 - m_1}\,\frac{\ln\dfrac{\omega_2 + s}{\omega_1 + s}}{\ln 10} + p\left[\frac{(1-p_w)\,\Delta\varepsilon_{rw}}{1 + s/\omega_{1w}} + \frac{p_w\,\Delta\varepsilon_{rw}}{1 + s/\omega_{2w}}\right] + \frac{\sigma}{s\varepsilon_0}, \tag{5}$$

where
- $p = 0.11$ is the relative contribution of water,
- $\omega_1 = 2\pi f_1$, where $f_1 = 1$ mhz is the lower cutoff frequency of the broadband term,
- $\omega_2 = 2\pi f_2$, where $f_2 = 100$ thz is the upper cutoff frequency of the broadband term,
- $m_1 = \log_{10}\omega_1$ [rad/s],
- $m_2 = \log_{10}\omega_2$ [rad/s],
- $\Delta\varepsilon_r = \Delta\varepsilon_{rd}\,(m_2 - m_1)$ is the total variation of the real part of the broadband term, where $\Delta\varepsilon_{rd} = 1.61$ is the slope per decade,
- $\sigma = 0.025$ S/m is the constant conductivity,
- $\omega_{1w} = 2\pi f_{1w}$, where $f_{1w} = 25$ ghz is the location of the first debye term for water,
- $\omega_{2w} = 2\pi f_{2w}$, where $f_{2w} = 200$ ghz is the location of the second debye term for water,
- $\Delta\varepsilon_{rw} = 76.5$ is the total variation of the real part of the permittivity of water, and
- $p_w = 0.065$ is the relative contribution of the second debye term.

fig. 1 compares the measured data from [10] with the results obtained from our approximation formula in the frequency range from 0.1 ghz to 3 ghz. the measured data exhibit stochastic errors, but the achieved agreement can be considered to be good.

fig. 1 comparison of measured and analytically calculated parameters of wet soil: (a) relative permittivity and (b) conductivity. the results show good agreement.

3.3. impulse response

for time-domain solvers, we need the impulse response of $\varepsilon_r(s)$ from equation (5). this response can be evaluated by summing the responses for all terms. the first term is a constant (unity), the second term is a broadband term, two debye terms follow, and the last term has the form $1/s$. the impulse responses for all terms, except the broadband term, are elementary and can be found in standard tables of the inverse laplace transform.
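as an illustration, equation (5) can be evaluated directly on the imaginary axis, $s = j\omega$. the sketch below uses the parameter values listed under (5); the function name `eps_wet_soil` and the conversion of $\varepsilon_r''$ to an equivalent conductivity $\omega\varepsilon_0\varepsilon_r''$ (the relation quoted in subsection 3.1) are illustrative choices, and the value of $\varepsilon_0$ is the standard constant rather than a number from the paper.

```python
import cmath
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m

# Parameters listed under equation (5)
P, P_W = 0.11, 0.065          # water share, share of the second Debye term
F1, F2 = 1e6, 100e12          # cutoff frequencies of the broadband term, Hz
F1W, F2W = 25e9, 200e9        # Debye frequencies of water, Hz
D_EPS_RD = 1.61               # slope of the broadband term per decade
D_EPS_RW = 76.5               # total variation for water
SIGMA = 0.025                 # conductivity, S/m


def eps_wet_soil(f):
    """Complex relative permittivity (5) at s = j*2*pi*f, for f > 0 (Hz)."""
    s = 2j * math.pi * f
    w1, w2 = 2 * math.pi * F1, 2 * math.pi * F2
    w1w, w2w = 2 * math.pi * F1W, 2 * math.pi * F2W
    # delta_eps_r / (m2 - m1) is the slope per decade, D_EPS_RD
    broadband = (1 - P) * D_EPS_RD * cmath.log((w2 + s) / (w1 + s)) / math.log(10)
    water = P * D_EPS_RW * ((1 - P_W) / (1 + s / w1w) + P_W / (1 + s / w2w))
    return 1 + broadband + water + SIGMA / (s * EPS0)


for f in (0.1e9, 1e9, 3e9):
    eps = eps_wet_soil(f)
    sigma_eq = 2 * math.pi * f * EPS0 * (-eps.imag)  # equivalent conductivity, S/m
    print(f"f = {f / 1e9:4.1f} GHz: eps' = {eps.real:6.3f}, "
          f"eps'' = {-eps.imag:6.3f}, sigma_eq = {sigma_eq:6.4f} S/m")
```

the three sample frequencies span the range of the measured data used in fig. 1, so the printed values can be compared by eye with curves of that kind.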
the response for the broadband term is not available in the literature; we have therefore evaluated it analytically using the inverse laplace transform, so that the impulse response corresponding to (5) reads:

$$\varepsilon_r(t) = \delta(t) + (1-p)\,\frac{\Delta\varepsilon_r}{(m_2 - m_1)\ln 10}\,\frac{e^{-\omega_1 t} - e^{-\omega_2 t}}{t}\,h(t) + p\left((1-p_w)\,\Delta\varepsilon_{rw}\,\omega_{1w}\,e^{-\omega_{1w}t} + p_w\,\Delta\varepsilon_{rw}\,\omega_{2w}\,e^{-\omega_{2w}t}\right)h(t) + \frac{\sigma}{\varepsilon_0}\,h(t), \tag{6}$$

where $\delta(t)$ is the dirac (delta) function and $h(t)$ is the heaviside (step) function.

4. examples of (non)causal response

in this section, we present three examples to demonstrate differences in the time-domain response when using a causal and when using a noncausal model of wet soil. the examples are ordered according to the complexity and dimensionality of the analyzed electromagnetic problems, from the simplest to the most complex ones.

4.1. plane wave

we consider a uniform plane wave that propagates through a homogeneous nonmagnetic medium. this is a one-dimensional (1-d) electromagnetic problem. the excited wave is described by a delta function. the distance of wave propagation is $d = 0.5$ m. we analyze the propagation in the frequency domain at 4096 frequency points, starting from 0, with a step of 1 mhz. thereafter, we use the inverse discrete fourier transform to obtain the response in the time domain. we consider two models. in the first, the complex relative permittivity is given by (5). in the second, the complex relative permittivity is independent of frequency and equal to $\varepsilon_r = 16.7251 - j2.3640$ (which is an estimated mean value of the permittivity of wet soil in the first model). the results are shown in fig. 2. for the first model, a causal response is obtained. it has a crisp start at 6 ns. for the second model, a noncausal response is obtained. it is characterized by a premature and "lazy" leading edge of the pulse.

fig.
2 time-domain response when a plane wave is propagating through a causal and a noncausal medium. the causal response has a crisp start, while the noncausal response has an early start and slow leading edge.

4.2. dielectric cylinder

we consider an infinitely long cylinder of a square cross-section, whose side length is 0.5 m. the cross-section of the cylinder is shown as an inset in fig. 3. the axis of the cylinder coincides with the z-axis of the cartesian coordinate system. hence, we deal here with a two-dimensional (2-d) em system.

fig. 3 electric field z-component at the axis of a dielectric cylinder for the cases when the permittivity is frequency-constant (noncausal) and frequency-dependent (causal).

a uniform plane electromagnetic wave illuminates the cylinder. the wave propagates along the x-axis, in the negative x-direction. the electric-field vector of the wave is a gaussian pulse in the time domain, defined as

$$\mathbf{e}(t) = E_0\,e^{-\frac{(t - t_0)^2}{2\tau^2}}\,\mathbf{i}_z, \tag{7}$$

where $E_0 = 1$ V/m, $t_0 = 3$ ns, $\tau = 0.1$ ns, and $\mathbf{i}_z$ is the unit vector in the z-direction. these data are for the transversal plane at $x = 1$ m. since the vector of the electric field is parallel to the cylinder axis, the electromagnetic field in this system is a 2-d transverse magnetic field, usually termed the tm mode.

the numerical analysis is performed using the method of moments (mom) with the surface integral-equation formulation, piecewise constant approximation for electric and magnetic surface currents, pmchwt formulation, and point-matching testing procedure [1], [15]. as the response, we calculate the z-component of the electric field at the cylinder axis (i.e., at point o at the inset of fig. 3). the frequency-domain analysis is done from 9.99512 mhz to 10.235 ghz over 1024 equidistant frequency samples.
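both the plane-wave example of section 4.1 and the cylinder analysis here obtain the time response from equispaced frequency samples. a minimal python sketch of that step, for the simpler plane-wave case ($d = 0.5$ m, 4096 samples with a 1 mhz step), is given below. it is an illustration under stated assumptions, not the authors' code: the transfer function is taken as $e^{-jkd}$ with $k = (\omega/c)\sqrt{\varepsilon_r}$, the inverse dft sum is evaluated directly at a few instants instead of with an fft, the constant permittivity is the mean value quoted in section 4.1, and all function names are hypothetical.

```python
import cmath
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m
C0 = 299792458.0         # speed of light in a vacuum, m/s


def eps_causal(f):
    """Equation (5) at s = j*2*pi*f for f > 0, parameters as in section 3.2."""
    s = 2j * math.pi * f
    p, p_w, slope, d_eps_rw, sigma = 0.11, 0.065, 1.61, 76.5, 0.025
    w1, w2 = 2 * math.pi * 1e6, 2 * math.pi * 100e12
    w1w, w2w = 2 * math.pi * 25e9, 2 * math.pi * 200e9
    broadband = (1 - p) * slope * cmath.log((w2 + s) / (w1 + s)) / math.log(10)
    water = p * d_eps_rw * ((1 - p_w) / (1 + s / w1w) + p_w / (1 + s / w2w))
    return 1 + broadband + water + sigma / (s * EPS0)


def eps_constant(f):
    """Frequency-independent (noncausal) model, mean value from section 4.1."""
    return 16.7251 - 2.3640j


def transfer(eps_of_f, d=0.5, n_freq=4096, df=1e6):
    """Plane-wave transfer function exp(-j*k*d) at 0, df, 2*df, ..."""
    h = [1.0 + 0.0j]  # k tends to 0 at f = 0 for both models
    for n in range(1, n_freq):
        f = n * df
        # principal square root: Im(k) < 0, so the wave decays with distance
        k = 2 * math.pi * f / C0 * cmath.sqrt(eps_of_f(f))
        h.append(cmath.exp(-1j * k * d))
    return h


def response(h, t, df=1e6):
    """Inverse DFT sum of a real signal, evaluated at the instant t (s)."""
    acc = h[0].real
    for n in range(1, len(h)):
        acc += 2.0 * (h[n] * cmath.exp(2j * math.pi * n * df * t)).real
    return df * acc


def early_to_peak(eps_of_f, t_early=5e-9, t_end=12e-9, dt=0.25e-9):
    """Largest |response| up to t_early, relative to the peak up to t_end."""
    h = transfer(eps_of_f)
    n_end, n_early = int(round(t_end / dt)), int(round(t_early / dt))
    y = [abs(response(h, i * dt)) for i in range(n_end + 1)]
    return max(y[: n_early + 1]) / max(y)


ratio_noncausal = early_to_peak(eps_constant)
ratio_causal = early_to_peak(eps_causal)
print("early-to-peak ratio, constant permittivity:", ratio_noncausal)
print("early-to-peak ratio, permittivity from (5):", ratio_causal)
```

the printed ratios quantify the "premature leading edge": with the constant permittivity a noticeable part of the pulse is already present 5 ns after the excitation, whereas with (5) the early response stays near the numerical floor. the 5 ns threshold is an arbitrary choice made for this illustration.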
the total number of unknowns increases with the increase of the analysis frequency, but it does not exceed 1600. the total analysis time is 815 s on a desktop computer with intel i7 cpu and 32 gb of ddr3 ram. the time-domain response is calculated for the time interval from 0 to 100 ns over 2048 equidistant time samples. the results for constant permittivity in the whole frequency range, $\varepsilon_r = 16 - j3$, which is a noncausal model, and for $\varepsilon_r(s)$ given by (5), which is a causal model, are shown in fig. 3 for the first 15 ns. as in fig. 2, the response obtained by the first (noncausal) model shows a premature beginning of the leading edge. in contrast, the causal model has a clear start, which allows for much more precise timing when evaluating the beginning of the impulse response.

4.3. dielectric cube

as the most resource-demanding problem, we consider the three-dimensional (3-d) system shown in fig. 4. it consists of a cube, made of a lossy dielectric, and two symmetrical dipoles. the side of the cube is $2c$, where we take two values for $c$: $c = 100$ mm for a smaller cube and $c = 250$ mm for a larger cube. inside the cube, at its center (which coincides with the coordinate origin o), one symmetrical dipole (dipole #1) is located. the length of one arm of the dipole is 5 mm (the overall dipole length is 10 mm). the wire radius is 0.1 mm (the diameter is 0.2 mm). another dipole (dipole #2) is located outside the cube. the arm length of this dipole is 20 mm (40 mm overall) and the wire radius is 0.5 mm (the diameter is 1 mm). the dipoles are mutually parallel, and parallel to the height of the cube. the distance between the dipole centers (feeding points, ports) is $2c$.

fig. 4 two dipole antennas, one of which is inside a lossy dielectric cube. not drawn to scale.

both cubes are analyzed at 1000 frequencies: 10 mhz, 20 mhz, ..., 10000 mhz (10 ghz) using the program wipl-d [2].
for each frequency, the impedance and scattering parameters are computed. the nominal impedance for the scattering parameters is 50 Ω. two models of the dielectric are used: a noncausal model, for which the complex relative permittivity is frequency independent, $\varepsilon_r = 16 - j3$ (i.e., $\varepsilon_r' = 16$ and $\varepsilon_r'' = 3$), and a causal model, for which the complex relative permittivity is evaluated from equation (5). for both cubes and for both dielectric models, the impulse responses for s21 and z21 are evaluated. the results are shown in figs. 5–8.

wipl-d simulations are done using wipl-d pro version 11, on a desktop computer with intel cpu core i7 3820 @3.60 ghz, 64 gb of (ddr3) ram, nvidia geforce gtx590, and with microsoft windows 7 pro 64-bit operating system. the em system is modeled using three symmetry planes in order to maximally reduce the computer resources needed for the analysis. although such symmetry introduces a parasitic image of the second (larger) dipole, the effect of the parasitic dipole is negligible as it is located far away.

first, we analyze the smaller cube ($c = 100$ mm). in the case of the constant relative permittivity, $\varepsilon_r = 16 - j3$, the analysis takes 26,931 s (i.e., approximately 7.5 hrs), and the total number of unknowns increases with frequency from 1,202 at the lowest frequency (10 mhz) up to 4,943 at the highest frequency (10 ghz). in the case when the relative permittivity is given by equation (5), the simulation takes 26,172 s (i.e., approximately 7.3 hrs). the total number of unknowns increases with frequency from 1,202 at the lowest frequency up to 4,803 at the highest frequency.

thereafter, we analyze the larger cube ($c = 250$ mm). the analysis for the constant permittivity lasts 113,753 s (i.e., approximately 31.6 hrs). the number of unknowns is 1,202 at the lowest frequency and rises with frequency up to 15,233 for the highest frequency.
the analysis for the relative permittivity given by equation (5) lasts 105,484 s (i.e., 29.3 hrs), while the number of unknowns is in the range from 1,202 to 13,621. note that in both cases, the analysis of the cube with frequency-dependent permittivity lasts slightly less than the analysis with constant permittivity. this is due to the fact that wipl-d allocates resources by taking into account the electrical size of the structure at the operating frequency. at higher frequencies, the modulus of the frequency-dependent permittivity is smaller than the modulus of the constant permittivity, thus demanding fewer unknowns.

the results for $c = 100$ mm obtained by wipl-d are compared with the results obtained by program cst [3], which uses a time-domain solver (fig. 5). cst simulations were performed using microwave studio software from the cst studio 2013 package, on a windows 7 64-bit server equipped with two intel(r) xeon(r) cpus @2ghz and 192 gb ram. unfortunately, cst cannot analyze the case for $c = 250$ mm for the given hardware configuration.

the cst time-domain solver uses the causal model of the relative permittivity. the hexahedral mesh size was about 7 million cells. the cube interior was filled with a frequency dispersive material, defined using the appropriate permittivity values given by equation (5) for each frequency point. the background was filled with a vacuum. the model boundaries were set to "open". two symmetry planes (one magnetic-field and one electric-field symmetry plane) were defined to reduce the total computational load. the solver accuracy was set to 80 db. the excitation signal was of a 10 ghz cst-default gaussian type. the simulation was run so as to ultimately yield 1001 equidistant frequency points in the 0 to 10 ghz frequency range. the total simulation time is approximately 67.5 hours. the time-domain solver in cst evaluates only the scattering parameters.
thereby, only one port is excited, so that in one run of the program the parameters s11 and s21 are evaluated. in order to calculate the impedance parameter z21, the parameter s22 is needed as well. however, the computation of s22 requires another full run of cst, which was not performed in order to avoid the long run time. consequently, only the impulse response for s21 is shown (fig. 5).

the simulation for the noncausal response in cst can be performed using a frequency-domain solver only. the simulation parameters are as follows: integral solver (is), a mesh with 1313 surfaces, a vacuum background, open boundaries, two symmetry planes, solver accuracy 1e3, s-parameters normalized to 50 Ω, 3rd order solver, and the solver type is mom. the results are also shown in fig. 5. the total simulation time is approximately 52 hours. the agreement between the results evaluated by wipl-d and by cst is very good, both for the causal model and the noncausal model.

fig. 5 impulse response for $c = 100$ mm, for s21, and zoom-in (inset), as computed by cst time-domain solver for the causal model, by cst frequency-domain solver for the noncausal model, and by wipl-d for both the causal and noncausal models.

fig. 6 shows the impulse response for z21 for the smaller cube. figs. 7 and 8 show the impulse response for the larger cube, for s21 and z21, respectively. these results were computed only by wipl-d. the noncausal model of the dielectric yields a premature start of the response, which is more visible for z21 than for s21. the explanation is in the shape of the spectrum of these two parameters. the spectrum of the parameter z21 is wider than the spectrum of s21. hence, the inadequate variations of the permittivity of the noncausal model have influence in a wider frequency range for z21 than for s21.

fig.
6 impulse response for $c = 100$ mm, for z21, and zoom-in (inset), as computed by wipl-d for the noncausal and causal models.

fig. 7 impulse response for $c = 250$ mm, for s21, and zoom-in (inset), as computed by wipl-d for the noncausal and causal models.

fig. 8 impulse response for $c = 250$ mm, for z21, and zoom-in (inset), as computed by wipl-d for the noncausal and causal models.

5. conclusion

this paper presents an analytical expression of complex permittivity of wet soil, valid in a broad frequency range, which assures a causal response in the time domain. the parameters of the formula are tuned to fit the measured data for soil and water in a broad range of frequencies. the impulse response, needed for direct analysis in the time domain, is derived, too. the discrepancies between the causal and noncausal responses, and their relations with the complex permittivity of the material, are illustrated through several examples of different dimensionality and complexity. it is shown that in all cases the causal response has a crisp start, while the noncausal response has an early and slow leading edge. additionally, a model of a 3-d em system, being the most complex example, is used to test the present-day limits of some commercial em solvers.

acknowledgement: the paper is a part of the research done within the project tr32005 of the serbian ministry of education, science, and technological development.

references
[1] b. m. kolundžija and a. r. djordjević, electromagnetic modeling of composite metallic and dielectric structures, boston: artech house, 2002.
[2] wipl-d pro 3-d. available: http://www.wipl-d.com/
[3] cst, 3d electromagnetic simulation software. available: https://www.cst.com/
[4] a. r. djordjević, r. m. biljić, v. d. likar-smiljanić, and t. k.
sarkar, “wideband frequency-domain characterization of fr-4 and time-domain causality”, ieee trans. electromagn. compat., vol. 43, no. 4, pp. 662–667, november 2001.
[5] a. r. đorđević and d. v. tošić, “causality of circuit and electromagnetic-field models”, in proc. of 5th european conference on circuits and systems for communications (eccsc'10), belgrade, serbia, 2010, pp. 12–21.
[6] p. j. w. debye, polar molecules, new york: the chemical catalog company, 1929.
[7] c. f. bohren, “what did kramers and kronig do and how did they do it?”, eur. j. phys., vol. 31, pp. 573–577, 2010.
[8] f. m. tesche, “on the use of the hilbert transform for processing measured cw data”, ieee trans. electromagn. compat., vol. 34, no. 3, pp. 259–266, august 1992.
[9] t. meissner and f. j. wentz, “the complex dielectric constant of pure and sea water from microwave satellite observations”, ieee trans. geoscience remote sens., vol. 42, no. 9, pp. 1836–1849, september 2004.
[10] g. d. smith and b. j. stanton, soil parameters from fort a.p. hill soil permittivity and conductivity measurements for the wide area airborne minefield detection program, army research laboratory, adelphi, md, arl-tr-3049, sept. 2003. available: http://www.arl.army.mil/arlreports/2003/arl-tr3049.pdf.
[11] ansys. (2012, may). automating the si design flow for hfss. [online]. available: http://www.ansys.com/staticassets/ansys/conference/confidence/minneapolis/downloads/automating-si-design-flow-for-ansyshfss-1.pdf
[12] agilent technologies. (2009). about dielectric loss models. [online]. available: http://edocs.soco.agilent.com/display/ads2009/about+dielectric+loss+models
[13] simberian inc. (2008, sept.). modeling frequency-dependent dielectric loss and dispersion for multigigabit data channels. [online]. available: http://www.simberian.com/appnotes/modelingdielectrics_2008_06.pdf
[14] j. barthel, k. bachhuber, r. buchner, h. hetzenauer, and m. kleebauer, “a computer-controlled system of transmission lines for the determination of the complex permittivity of lossy liquids between 8.5 and 90 ghz”, ber. bunsenges. phys. chem., vol. 95, no. 8, pp. 853–859, 1991.
[15] wipl-d 2-d solver. available: http://www.wipl-d.com/

facta universitatis series: electronics and energetics vol. 35, no 3, september 2022, pp. 437-454 https://doi.org/10.2298/fuee2203437a
© 2022 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper

improving performance of transmission networks using facts through continuation power flow method

jamal alnasseir
electric power department, faculty of mechanical and electrical engineering, damascus university, syria

abstract. over the past 50 years, modern electrical systems have become more complex, as they extend beyond the geographical boundaries of neighboring countries. as a result, the power system faces many challenges, because it is exposed to difficult operating conditions. voltage instability is the most frequent phenomenon, and it can lead to the collapse of the power system. to avoid power outages in the system (especially blackouts), the power system must be analyzed in order to maintain voltage stability in the expected difficult operating conditions. the main objective is to determine the maximum load capacity of the system and the causes of voltage instability. the voltage instability problem is related to the nature of nonlinear loads, so different load characteristics must be taken into consideration when analyzing voltage stability. this study aims to find the maximum load capacity of the studied network by using the continuation power flow (cpf) method. then, the performance of this network with a flexible alternating current transmission system (facts) installed will be evaluated.
facts systems are a promising solution for improving voltage stability, because they improve the power transmission capacity and the controllability of the parameters of existing power networks. this study will be conducted on a reference network platform under normal working conditions; then one of the facts systems will be installed to show its effect on improving voltage stability. the continuation power flow method will be used to find p-v curves, which in turn will help to determine the conditions of maximum loading while maintaining stability, and to identify the bus bar with the smallest voltage, on which the facts device will be installed. the software environment matlab/psat will be used for modeling and simulation.

key words: voltage stability, continuation power flow (cpf), maximum load conditions, flexible alternating current transmission system (facts), thyristor-controlled series capacitor (tcsc)

received february 22, 2022; revised april 7, 2022; accepted may 5, 2022
corresponding author: jamal alnasseir, electric power department, faculty of mechanical and electrical engineering, damascus university, syria, e-mail: jamalnasseir@yahoo.de

1. introduction

power systems face many challenges, because the energy demand is increasing drastically nowadays. the generated power must therefore be increased, and there are various ways of doing so. the power must reach the end consumers through the existing transmission lines, and/or new transmission lines must be built. in any case, the loading of the existing transmission lines will increase. if these transmission lines are overloaded, the problem of voltage stability will appear [1]. the voltage profile of the system transmission lines will be affected, and as a result, the power losses will increase.
the use of flexible alternating current transmission system (facts) devices at specific locations of the system solves most of these problems in important power transmission lines [1]. facts devices improve the loading capacity of transmission lines and the voltage levels, and reduce power losses, both under normal operating conditions and during faults. facts devices rely on sophisticated power electronic elements, which help control the power flow in transmission lines. these systems can be connected in series or in parallel to important transmission lines, and through them the reactive and active power can be controlled [1].

the continuation power flow method is considered one of the best methods used for load flow analysis [1, 2, 3]. studies have confirmed that this method is effective for studying voltage stability in the transmission system. the cpf is characterized by reduced execution time and computation burden, in addition to accuracy and ease of implementation. in this work the method is applied to the ieee 11-bus system.

in this paper, the svc, tcsc and upfc systems are connected at a specific busbar of the studied transmission system in order to improve its voltage stability. the continuation power flow (cpf) method is used to study the impact of these systems on the transmission system. other studies did not use the cpf method to improve the stability of transmission networks in the presence of facts systems. the cpf method uses a step method with prediction and correction; therefore, the jacobian matrix does not become singular. the principle of this method is to locate the weakest transmission line, where one of the flexible alternating current systems is connected. then, an analytical study is carried out to compare the performance of the network before and after adding the aforementioned system [1, 3, 4, 5].

2.
voltage stability

voltage stability is defined as the ability of the power system to maintain the voltage of all buses within acceptable values during normal conditions or after the occurrence of a disturbance. the system is subjected to voltage instability when overloading occurs: the parameters of the system change and the voltage drops rapidly. in this case, the automatic control units are unable to follow the system changes and suppress them. it may take several seconds, or even 10 to 20 minutes, to suppress the changes in the voltage. if the disturbance persists, the voltage becomes unstable, which leads to a collapse of the voltage of generators and transmission lines. in other words, the main reason behind the instability of the power system is that the system is unable to meet the demand for reactive power [3, 4, 5, 12, 13].

3. p-v curves

the p-v curves express the changes in voltage when the active power of the load changes. these curves result from running load flows at different levels of uniformly distributed loads at constant power factor. when the number of system branches increases, the time required to find these curves also increases, because the time required to calculate each load flow increases. the p-v curves provide the voltage stability index of a network as well as the voltage collapse point. voltage stability analysis provides transmission limits through the study of the p-v curves. moreover, these curves give results for the entire system and identify the disturbances that may lead to a system blackout or to emergencies [3, 4, 5, 6, 7, 12, 13].

4.
continuation power flow (cpf) method

the cpf method consists of the following steps:

step 1: run load flow [3, 4, 5, 6, 7, 12, 13]. the principle of the continuation load flow is to trace the solutions of a nonlinear system through successive prediction and correction steps, starting from the nonlinear equations

$$ F(\delta, V, \lambda) = 0 \tag{1} $$

where $\lambda$ is the load factor, with $0 \le \lambda \le \lambda_{critical}$. the conventional load flow equations for a bus $i$ are

$$ P_{Gi} - P_{Li} - P_{Ti} = 0, \qquad Q_{Gi} - Q_{Li} - Q_{Ti} = 0 \tag{2} $$

where $P_{Gi}$, $Q_{Gi}$ are the active and reactive generated power, respectively, $P_{Li}$, $Q_{Li}$ are the active and reactive power of the loads, and $P_{Ti}$, $Q_{Ti}$ are the net powers injected into bus $i$. the net power equations are given as follows, with $\theta_{ij}$ denoting the angle of the admittance $y_{ij}$:

$$ P_{Ti} = \sum_{j=1}^{n} V_i V_j y_{ij} \cos(\delta_i - \delta_j - \theta_{ij}), \qquad Q_{Ti} = \sum_{j=1}^{n} V_i V_j y_{ij} \sin(\delta_i - \delta_j - \theta_{ij}) \tag{3} $$

step 2: prediction step. the step size depends on the direction of the tangent at the previous solution point. equation (4) gives the tangent [3, 4, 5, 6, 7]:

$$ d[F(\delta, V, \lambda)] = 0 \tag{4} $$

applying partial differentiation to equation (4) gives equation (5):

$$ \frac{\partial F}{\partial \delta}\, d\delta + \frac{\partial F}{\partial V}\, dV + \frac{\partial F}{\partial \lambda}\, d\lambda = 0 \tag{5} $$

in matrix form this is equation (6), where the column vector in equation (6) is the tangent vector $t$:

$$ \begin{bmatrix} \dfrac{\partial F}{\partial \delta} & \dfrac{\partial F}{\partial V} & \dfrac{\partial F}{\partial \lambda} \end{bmatrix} \begin{bmatrix} d\delta \\ dV \\ d\lambda \end{bmatrix} = 0 \tag{6} $$

[fig. 1 p-v characteristic [2, 7]: bus voltage versus active power loading, showing the stable region, the unstable region, the voltage stability limit and the critical point (maximum load point)]

$$ \begin{bmatrix} \dfrac{\partial F}{\partial \delta} & \dfrac{\partial F}{\partial V} & \dfrac{\partial F}{\partial \lambda} \\ & e_k & \end{bmatrix} [t] = \begin{bmatrix} 0 \\ \pm 1 \end{bmatrix} \tag{7} $$

the row vector $e_k$ is zero everywhere except for its $k$th element, which is equal to 1.

$$ [t] = \begin{bmatrix} d\delta \\ dV \\ d\lambda \end{bmatrix} \tag{8} $$

the tangent vector t is defined in equation (9).
                       = − 1 0))()(( ][ 1 kz f v ff t  (9) by solving the equations, we find:           +           =                    d dv d vv * * * (10) 3 − correction step: the correction step comes after choosing the size of the prediction step for the tangent vector [3, 4, 5, 6, 7]. sg nnn nn rxor rx d dv d −− ++             1 21 2 12    (11) improving performance of transmission networks using facts through continuation power flow method 441 where n1, n2 the number of buses in series and n is the total number of buses in the system. ng is the pv generated buses and ns is the number of infinite buses [3, 4, 5, 6, 7]. the extended equation is an equation from group of equations to determine the status of the variables. k x = (12) therefore, the equation resulting from a group of equations is given by (13) [3, 4, 5, 6, 7]. ]0[ )( =      − kx xf (13) figure 2 shows the scheme of the calculation method in the cpf algorithm [10]. start reading network parameters initializes variables (v, p, q newton-raphson power flow determine continuation parameters calculate tangent factor prediction soluation checking reactive power generation limits are violated checking reactive power generation limits are violated perform correction end pv pq or slack q conversion fig. 2 flow chart of calculation method used in the cpf algorithm [10] 442 j. alnasseir 5. type of flexible alternating current transmission systems (facts) [1] facts controllers are normally connected in series or parallel to the transmission lines. these controllers enhance the power transfer capability of the existing transmission lines. they also improve the voltage stability of the transmission system. when subjected to external disturbances, these controllers help the power system to regain its normal state. effective reactive power management is done using these controllers in transmission system [1, 11]. 
the series compensation results in the improvement of the maximum power transmission capacity of the line. the net effect is a lower load angle for a given power transmission level and, therefore, a higher stability margin. the reactive-power absorption of a line depends on the transmission current, so when series capacitors are employed, the resulting reactive-power compensation is automatically adjusted in proportion. also, because the series compensation effectively reduces the overall line reactance, the net line-voltage drop is expected to become less susceptible to the loading conditions [1, 11].

applying series capacitors in a long line amounts to placing a lumped impedance at a point. therefore, the following factors need careful evaluation:
▪ the voltage magnitude across the capacitor banks (insulation).
▪ the fault currents at the terminals of a capacitor bank.
▪ the placement of shunt reactors in relation to the series capacitors (resonant over-voltages).
▪ the number of capacitor banks and their location on a long line (voltage profile).

shunt devices, in contrast, may be connected permanently or through a switch. shunt reactors compensate for the line capacitance, and because they control over-voltages at no load and light load, they are often connected permanently to the line, not to the bus [1, 11].

shunt capacitors are used to increase the power-transfer capacity and to compensate for the reactive-voltage drop in the line. the application of shunt capacitors requires careful system design. the circuit breakers connecting shunt capacitors should withstand high charging in-rush currents and, upon disconnection, should withstand voltages of more than 2 pu, because the capacitors are then left charged for a significant period until they are discharged through a large time-constant discharge circuit.
also, the addition of shunt capacitors creates higher-frequency resonant circuits and can therefore lead to harmonic over-voltages on some system buses [1, 11].

facts systems can thus be classified into three main groups:
▪ series control systems (like tcsc),
▪ shunt control systems (like svc),
▪ shunt-series composite control systems (like upfc) [1, 11].

5.1. thyristor-controlled series capacitor (tcsc)

the thyristor-controlled series capacitor (tcsc) is one of the facts types. it consists of a capacitor connected in parallel with a reactor that is controlled by thyristors, as shown in figure 3. figure 3 also shows the metal-oxide arrester (varistor) installed to avoid over-voltage across the unit. several tcsc units are connected in series to meet the total required compensation, as shown in figure 4 [8, 9, 11, 12].

[fig. 3 power circuit of tcsc compensator: capacitor c with voltage vc, in parallel with a varistor and with a reactor l carrying current il(t) through two anti-parallel thyristors]

[fig. 4 series connected tcsc compensators: three tcsc units as in fig. 3, connected in series]

5.2. static var compensator

the svc is an advanced technology that is widely used in transmission applications for several purposes. the primary purpose is usually rapid control of voltage at weak points in the network. worldwide, there is a steady increase in the number of installations. the ieee definition of an svc is as follows: "static var compensator (svc): a shunt-connected static var generator or absorber whose output is adjusted to exchange capacitive or inductive current so as to maintain or control specific parameters of the electrical power system (typically bus voltage)" [8, 9, 11, 12, 13]. svc is an umbrella term for several devices.
the svc devices discussed in the following sections are the tcr (thyristor controlled reactor), fc (fixed capacitor) and tsc (thyristor switched capacitor). the components of an svc may include: transformers between the high voltage network bus and the medium voltage bus where the power electronic equipment is connected, a fixed (usually air-core) reactor of inductance l, and a bidirectional thyristor. the thyristors are fired symmetrically at an angle α in a controlled range of 90° to nearly 180°, with respect to the capacitor voltage. the tsc is often used in order to decrease standby losses. figure 5 shows a common structure of an svc [8, 9, 11, 12, 3].

[fig. 5 common structure of svc: capacitive branch xc carrying ic(t) and inductive branches xl carrying il(t)]

5.3. unified power flow controllers

the unified power flow controller (upfc) is a facts device that combines series and shunt facts. it consists of two voltage source converters (vscs), which are connected to a common dc capacitor bank. the first unit of the upfc is a static compensator (statcom), which is connected to its vsc via a parallel transformer and then to the dc bus. the second unit is a static synchronous series compensator (sssc), which is also connected to its vsc via a series transformer and then to the dc bus. the upfc provides power flow control capabilities and instantaneously satisfies the power flow regulation requirements (see figure 6). the major control techniques are as follows [8, 9, 11, 12, 18]:
▪ reactive shunt compensation or bus voltage regulation;
▪ reactive series compensation or line impedance compensation [8, 9, 12, 18].

[fig. 6 upfc block diagram: shunt transformer and shunt converter (statcom), series transformer and series converter (sssc), coupled through a dc link with capacitor c and connected to the transmission line]

between them, the upfc achieves shunt voltage regulation by injecting an in-phase or anti-phase voltage that varies within the maximum and minimum injection limits. these limits are set by the ratings of the shunt converter (see figure 7) [8, 9, 12, 18].
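section 5.2 states that the tcr thyristors are fired at an angle between 90° and nearly 180°. to illustrate what this range does, the python sketch below evaluates the standard textbook expression for the fundamental-frequency susceptance of a thyristor-controlled reactor, b(α) = (2π − 2α + sin 2α)/(π·xl), with α measured from the voltage zero crossing. this expression is not given in the paper, and the function names and per-unit reactances are hypothetical values chosen only for the example.

```python
import math

def tcr_susceptance(alpha_deg, x_l):
    """Fundamental-frequency TCR susceptance for a firing angle in degrees."""
    if not 90.0 <= alpha_deg <= 180.0:
        raise ValueError("firing angle must lie between 90 and 180 degrees")
    alpha = math.radians(alpha_deg)
    return (2.0 * (math.pi - alpha) + math.sin(2.0 * alpha)) / (math.pi * x_l)

def svc_susceptance(alpha_deg, x_l, x_c):
    """Net susceptance of a fixed capacitor in parallel with a TCR
    (positive = capacitive, negative = inductive)."""
    return 1.0 / x_c - tcr_susceptance(alpha_deg, x_l)

X_L, X_C = 0.5, 1.0   # assumed reactances in p.u., for illustration only
for angle in (90, 120, 150, 180):
    print(angle, round(tcr_susceptance(angle, X_L), 4),
          round(svc_susceptance(angle, X_L, X_C), 4))
```

at 90° the reactor conducts fully and the tcr susceptance equals 1/xl; as the angle approaches 180° the reactor current vanishes and only the fixed capacitor remains, so the svc output moves smoothly from inductive to capacitive.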
[fig. 7 upfc voltage injection: area 1 and area 2 exchanging power over a line of reactance jxl, with the injected voltage vinj added to the sending-end voltage vs]

6. practical study by using matlab-psat software

in this research, matlab-psat (power system analysis toolbox) is used to model the power system networks. psat works within the matlab environment and is a software package designed to perform static and dynamic analysis of electrical power systems. it can be used to do the following calculations:
▪ load flow,
▪ continuation power flow,
▪ optimal power flow,
▪ static and transient stability analysis of electrical networks,
▪ voltage stability analysis of electrical networks during static and transient conditions.

figure 8 shows the graphical user interface (gui) of matlab-psat, which is used to build the electrical power network. the user can enter the data of the network and build the single line diagram using the psat-simulink library, whereby the data is saved. after that, the data file is loaded into the gui, and the necessary studies of the network are conducted.

fig. 8 user interface of matlab-psat

7. results and discussion

7.1. application of the proposed method to a standard ieee 11-bus system

the ieee 11-bus system is a standard system often used by power system specialists for conducting research. figure 9 shows the diagram of the studied network, which consists of 11 buses and four cylindrical rotor synchronous generators. the voltage level of each generator is 20 kv with a capacity of 900 mva, and all generators have the same parameters. the system has eight transmission lines, two loads and two capacitors. the frequency is 60 hz, the transmission voltage level is 230 kv and the base power is 100 mva.

fig. 9 diagram of ieee 11-bus system

7.2.
voltage stability analysis of ieee 11-bus system utilizing (cpf) method

load flow analysis is carried out to calculate the bus voltages and to determine the weakest buses in the network during normal operating conditions. figure 10 shows the 11-bus system modelled in matlab-psat.

fig. 10 ieee 11-bus system modeled by matlab-psat

the continuation power flow analysis is carried out in matlab-psat. the results for the 11-bus system are presented in table 1, and figure 11 presents the voltages of all buses of the studied network calculated by the cpf method.

table 1 cpf results under normal operating conditions

bus nr.   v [p.u.]
1         1.029
2         1.0089
3         1.029
4         1.0086
5         0.997
6         0.95987
7         0.93523
8         0.90731
9         0.94874
10        0.96691
11        0.99876

table 1 and figure 11 show that the lowest voltages in the network occur at bus 6, bus 7, bus 8, bus 9 and bus 10. these buses are therefore the weakest buses in the network: they are the most sensitive to network changes, and their probability of voltage collapse is high compared with the other buses. the software also calculates the maximum loading factor of the network. as seen in figure 12 (taken from the matlab command window), after applying the continuation power flow to the studied network, the maximum loading factor is λ = 1.1481 [p.u.]. the network is stable up to this point and unstable beyond it, which is why it is called the maximum loading point.

fig. 11 voltage of all buses calculated by cpf before adding facts

fig. 12 value of maximum loading factor of the 11-bus system without the compensator

figure 13 illustrates the p-v curves calculated by the cpf method, which give the voltages of bus 5 to bus 11 as a function of the loading factor of the network.
moreover, figure 13 shows the buses which have the critical voltage in the studied network. the critical voltage curves are the lowest among these curves, and bus 8, which has the lowest curve, is the weakest.

fig. 13 p-v curves of the 11-bus system before adding facts

7.3. voltage stability analysis of ieee 11-bus system after adding the thyristor-controlled series capacitor (tcsc)

a tcsc compensator is connected in series with the transmission line (9-10), as shown in figure 14.

fig. 14 the studied network after adding the tcsc in series with transmission line (9-10)

the tcsc is connected in series with the transmission line (9-10) because bus 9 and bus 10 were found to be among the weakest buses in the network, as shown in table 1, based on the cpf results. additionally, this line is one of the network lines that carry the largest load and have large reactive losses (table 2). as observed in table 2, the active power injected into the line (9-10) is 1406.4035 mw, which is the largest value. furthermore, the active and reactive losses are 20.5187 mw and 203.5155 mvar respectively, so the power losses of this line are larger than those of the other transmission lines. the maximum loading factor reflects these results: after adding the tcsc in the line (9-10), its value is λ = 1.1754 [p.u.] (as indicated in fig. 15).

table 2 load flow results of the ieee 11-bus before adding facts

fig. 15 the value of maximum loading factor of the 11-bus system with tcsc connected in series with the transmission line (9-10)

fig. 16 presents the values of the maximum loading factor of the studied network with the tcsc installed at different locations. it was found that the best location to add the compensator tcsc is the line (9-10).
in this case, the value of the maximum loading factor λ is the highest. table 3 gives the voltage profile of the network obtained from the continuation power flow method after adding the tcsc in series with line (9-10).

[fig. 16 value of maximum loading factor with the tcsc connected in different locations: maximum loading factor (lambda), axis from 1.155 to 1.18, for lines (9-10), (8-9), (7-8) and (6-7)]

table 3 load flow results of the studied system after adding tcsc in series with the transmission line (9-10)

bus nr.   v [p.u.]
1         1.028813
2         1.00841
3         1.028533
4         1.007755
5         0.996724
6         0.96131
7         0.938492
8         0.919863
9         0.965957
10        0.967424
11        0.999152

fig. 17 provides the p-v curves after adding the tcsc in series with line (9-10); the graph shows an improvement of the critical voltage levels compared with the results before adding facts. these results are similar to those presented in reference [16], namely that using a tcsc compensator in the transmission network improves the power transfer capacity and the loading factor.

fig. 17 p-v curves of the 11-bus system after adding tcsc in series with line (9-10)

7.4. voltage stability analysis of ieee 11-bus system after adding the static var compensator (svc)

an svc compensator is connected in parallel at busbar 8, as shown in fig. 18.

fig. 18 the studied network, after adding the svc in parallel at busbar-8

the svc is connected in parallel at busbar 8 because this is the weakest of the network buses, as shown in table 1. after applying the continuation load flow to the studied network, table 4 presents the busbar voltages based on the cpf. it is clear that the voltage level of all displayed busbars improved when compared with the first case. it was also found that the maximum load capacity of the studied network improved after connecting the svc: λ = 1.1501 [p.u.], as shown in fig. 19.
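the loading factors reported in this section can be compared directly. the short python sketch below computes the relative gain in maximum loading factor for each compensator, using only values stated in this paper (1.1481 without compensation, 1.1754 with tcsc, 1.1501 with svc, and 1.1918 with upfc, the last being the value reported in section 7.5). the variable and function names are illustrative and are not part of psat.

```python
# Maximum loading factors reported in the paper, in p.u.
LAMBDA_BASE = 1.1481
LAMBDA_WITH_FACTS = {"tcsc": 1.1754, "svc": 1.1501, "upfc": 1.1918}

def margin_gain_percent(lam_new, lam_base=LAMBDA_BASE):
    """Relative increase of the maximum loading factor, in percent."""
    return 100.0 * (lam_new - lam_base) / lam_base

gains = {name: margin_gain_percent(lam)
         for name, lam in LAMBDA_WITH_FACTS.items()}

# List the devices from largest to smallest gain
for name, gain in sorted(gains.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: +{gain:.2f} %")
```

expressed this way, the gains are a few percent at most, which makes the ranking of the three devices easier to read than the raw λ values.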
table 4 load flow results of the studied system after adding svc

bus nr.   v [p.u.]
1         1.0288
2         1.0084
3         1.0291
4         1.0302
5         1.0002
6         0.96945
7         0.95258
8         0.96411
9         0.97936
10        0.9915
11        1.009

fig. 19 value of maximum loading factor after connecting svc

fig. 20 provides the p-v curves after adding the svc; the graph shows an improvement of the critical voltage level compared with the results before adding the facts system. this result is consistent with reference [17], since the svc compensator is able to improve the system's voltage profile and maintain it within an acceptable limit, and also reduces the power loss and thereby improves the power transfer capability of the system.

fig. 20 p-v curves of the 11-bus system after adding svc

7.5. voltage stability analysis of ieee 11-bus system after the addition of the upfc compensator

the upfc compensator was connected between the busbars (7-8), as they are the weakest busbars in the studied network. the rated power of the upfc was 100 [mvar]. figure 21 shows the model of the studied network in the presence of the upfc.

fig. 21 the studied network after the addition of the upfc between busbars (7-8)

after applying the continuation load flow to the studied network, table 5 presents the busbar voltages based on the cpf. it is clear that the voltage level of all busbars improved when compared with the first case.

table 5 load flow results of the studied system after adding upfc

bus nr.   v [p.u.]
1         1.028729
2         1.008499
3         1.028578
4         1.008196
5         1.002375
6         0.976393
7         0.966668
8         1.018
9         0.970404
10        0.978061
11        1.002686

table 5 shows that the voltage levels of the studied network obtained from the continuation load flow method in the presence of the upfc are better than in the previous cases given in tables (2), (3) and (4).
the upfc shows the best performance in improving the maximum load capacity of the studied network compared with the parallel compensator and the series compensator, as it controls the voltage and the reactive power between the two busbars (7, 8). the maximum load capacity becomes λ = 1.1918 [p.u.], as shown in figure 22. figure 23 presents the p-v curves of the ieee 11-bus system in the presence of the upfc. the voltage levels are better than in the other cases, i.e., the voltage stability margin of the studied network is larger. this means that the maximum loading point (the so-called voltage collapse point) is higher than in the other cases, so the system can withstand increased loading for longer without collapsing than in the previous compensation cases. these results agree with reference [18], which reports an improvement in the real and reactive powers through the transmission line when a upfc is introduced, and that the combined facts system (upfc) has advantages such as reduced maintenance and the ability to control real and reactive powers.

fig. 22 value of maximum loading factor after connecting upfc

fig. 23 p-v curves of the 11-bus system after adding the upfc system

8. conclusion

in this research the continuation power flow method has been used to analyze the possibility of increasing the loading capability of electrical power systems through the implementation of facts systems (tcsc, svc, upfc). the study was applied to the ieee 11-bus system. the research findings include:

1. the cpf has been found to be an effective method for determining the best location to connect the compensators, namely the weakest node.
2.
the study concluded that the implementation of reactive compensators (series, shunt, or composite) increases the loading capacity of electrical networks. the loading factor λ of the ieee 11-bus system increases from 1.1481 [p.u.] to 1.1754 [p.u.] in the presence of tcsc, from 1.1481 [p.u.] to 1.1501 [p.u.] in the presence of svc, and from 1.1481 [p.u.] to 1.1918 [p.u.] in the presence of upfc.
3. the tcsc compensator improves the performance of the network; the size of this improvement depends on the nature of the network, its loads and its line losses.
4. the upfc compensator shows the best performance among the compensators used in increasing the load capacity of electrical networks, regardless of the nature of the electrical network.
5. the use of different types of compensators (series or shunt), such as svc and upfc, is recommended so that their performance can be compared.

references
[1] n. karuppiah, s. muthubalaji, s. ravivarman, md. asif and a. mandal, "enhancing the performance of transmission lines by facts devices using gsa and bfoa algorithms", int. j. eng. techn., vol. 7, no. 4.6, pp. 203–208, 2018.
[2] a. jalali and m. aldeen, "novel continuation power-flow algorithm", in proceedings of the ieee international conference on power system technology (powercon), 2016, p. 16487968.
[3] m. gudavalli, h. emulapalli and k. cherukupall, "voltage stability analysis using continuation power flow under contigency", j. theor. appl. inf. technol., vol. 99, no. 10, pp. 2373–2383, may 2021.
[4] s. b. bhaladhare, "improving voltage stability by using facts devices", iaset: j. electr. electron. eng. (iaset: jeee), vol. 1, no. 1, pp. 1–10, 2016.
[5] s. d. patel, h. h. raval and a. g. patel, "voltage stability analysis of power system using continuation power flow method", int. j. technol. res. eng., vol. 1, no. 9, pp. 763–767, may 2014.
[6] n. fnaiech, a. jendoubi and f.
bacha, "voltage stability analysis in power system using continuation method and psat software", in proceedings of the 6th international renewable energy congress (irec), tunis, 2015, pp. 1–6.
[7] s. greene, i. dobson and f. l. alvarado, "sensitivity of the loading margin to voltage collapse with respect to arbitrary parameters", ieee trans. power syst., vol. 12, no. 1, pp. 262–272, feb. 1997.
[8] leonardo l. grigsby, power system stability and control, crc press, 3rd edition, 2012.
[9] v. chauhan, b. singh and j. bala, "enhancement of static voltage stability using tcsc and svc", int. j. sci. eng. res., vol. 8, no. 4, pp. 127–130, april 2017.
[10] j. m. teixeira da silva marques da cruz, "extension of continuation power flow to incorporate dispersed generation", doctoral thesis, lisbon technical university, 2016.
[11] r. m. mathur and r. k. varma, thyristor-based facts controllers for electrical transmission systems, john wiley & sons, 2002.
[12] i. g. adebayo, i. a. adejumobi and o. s. olajire, "power flow analysis and voltage stability enhancement using thyristor controlled series capacitor (tcsc) facts controller", int. j. eng. adv. technol. (ijeat), vol. 2, no. 3, pp. 100–104, feb. 2013.
[13] t. van cutsem, "a method to compute reactive power margins with respect to voltage collapse", ieee trans. power syst., vol. 6, no. 1, pp. 145–156, feb. 1991.
[14] c. sharma and m. g. ganness, "determination of the applicability of using modal analysis for the prediction of voltage stability", in proceedings of the ieee/pes transmission and distribution conference and exposition, 2008, pp. 1–7.
[15] y. zhang, s. rajagopalan and j. conto, "practical voltage stability analysis", in proceedings of the ieee pes power and energy society general meeting, 2010, pp. 1–7.
[16] o. i. adebisi, i. a. adejumobi, p. e. ogunbowale and o. o.
ade-ikuesan, "performance improvement of power system networks using flexible alternating current transmission systems devices: the nigerian 330 kv electricity grid as a case study", lautech j. eng. techno., vol. 12, no. 2, pp. 46–55, 2018.
[17] s. kumar, "implementation of tcsc on a transmission line model to analyze the variation in power transfer capability", int. j. res. (ijr), vol. 1, no. 8, pp. 1091–1098, sept. 2014.
[18] j. p. sai kumar reddy and p. janga, "power flow improvement in transmission line using upfc", int. j. electron. commun. technol. (iject), vol. 7, no. 4, pp. 9–12, 2016.

facta universitatis series: electronics and energetics vol. 28, no 3, september 2015, pp. 477-494 doi: 10.2298/fuee1503477s

development of low-voltage power mosfet based on application requirement analysis

ralf siemieniec, michael hutzler, oliver blank, david laforet, li juin yip, alan huang, ralf walter
infineon technologies austria ag, villach, austria

abstract. low-voltage power mosfets based on charge-compensation using a field-plate offer a significant reduction of the area-specific on-resistance. besides a further improvement of this key parameter, the new device generation takes an in-depth focus on the other device parameters which are essential to the targeted application fields. to allow a high efficiency also in light-load conditions, the power mosfet not only needs to meet general requirements like low on-resistance, low gate charge and good avalanche capability, but must also have a low output capacitance and low reverse-recovery charge. the paper discusses how the most important of these often conflicting requirements were identified. it is shown that besides the device technology the package contributes significantly to the overall device performance. a new package solution is introduced which is especially suited for high current applications linked to high reliability requirements such as industrial motor drives or servers.

key words.
power mosfet, charge compensation, synchronous rectification, motor drives, package, overall efficiency

1. introduction

several years ago the then upcoming 80plus® requirements for smps (switched-mode power supply) forced the designers of power supplies to rethink the concept of secondary side rectification [1]. at that time, conventional diodes with a forward voltage drop of roughly 0.5 v were used, leading to a poor efficiency level at high output power due to the large output currents. the change to sr (synchronous rectification) by using standard mosfets with low on-resistance rds(on) was the solution to increase the efficiency level above 80 %. further design steps like improved pcb layout, enhanced snubber networks for better spiking behavior of the mosfet, in addition to lower rds(on), increased the efficiency level to a peak of around 90 %.

received february 20, 2015; received in revised form march 24, 2015
corresponding author: ralf siemieniec, infineon technologies austria ag, siemensstrasse 2, a-9500 villach, austria (e-mail: ralf.siemieniec@infineon.com)

however, the following 80plus platinum certification required much more. the efficiency for single output psus (power supply units) with an ac input voltage of 230 v (e.g. server psu) has to be above 90 %, 94 % and 91 % at respectively 20 %, 50 % and 100 % of the output power [1]. an optimization at full load could be enabled by using the lowest available rds(on) for the sr mosfet, but this approach does not allow the highest performance to be reached at low output power. to reach or exceed such requirements in the coming years, it is essential to have mosfets offering a well-balanced ratio between switching losses and conduction losses, with absolute loss values being on an extremely low level. however, the device must also be rugged to withstand critical operation conditions often manifesting as avalanche events.
due to unavoidable parasitic elements in the circuitry, it is also likely that the devices may repetitively enter avalanche mode for very short times at low avalanche energies even under regular operating conditions. as a consequence, the device is expected to be robust against such short repetitive avalanche events. very similar requirements also exist for other application fields such as primary side switches or devices used in motor drives, although there might be additional parameters to consider. due to the large number of parameters which could be improved, it is important to identify those having the highest impact on the device parameters being essential for all application fields considered.

2. application requirements

2.1. methodology for analysis of customer needs

addressing a wide range of voltage classes from 60 v to 150 v is linked with targeting a significantly wider range of differing application fields. to offer a solution capable of delivering the best performance for as many of them as possible, one needs to identify the device properties which are beneficial for all applications, in order to focus on the optimization of the right device features. an in-depth analysis of the application requirements also allows a ranking of the different device properties in order of their importance. ideally this procedure also identifies the essential parameters which need to be optimized in opposing directions for different applications, thereby indicating opportunities or requirements to develop technology derivatives. established methodologies for such an analysis are offered in general by the quality function deployment [2],[3]. within our work, the house-of-quality matrix was employed as an aid in determining how products live up to customer needs [4]. fig.
1 illustrates the basic worksheet used in this process for analyzing the relationship between customer wishes and product capabilities and their interactions, identifying development priorities and including a benchmarking of the new concepts against predecessor products and the competition in the market. as the required inputs are delivered from different functional units such as marketing, engineering and manufacturing, the methodology also increases the cross-functional integration within the organization.

fig. 1 house-of-quality matrix used for concept evaluation [5]

2.2. general requirements

fig. 2 presents the summary of the house-of-quality investigation for initially three application fields. synchronous rectification and primary side switches turned out to have so many requirements in common that these two form one group, the second group being motor drive applications. in total, almost 30 parameters were evaluated against the identified application requirements and their interactions. fewer than half of those parameters proved to have an influence significant enough to take account of them in the optimization process. here, two main groups of parameters are identified: parameters related to the device package and parameters related to the electrical characteristics of the chip itself. it was found that most of the significant parameters, such as low on-resistance, reduced gate and output charge, small parameter tolerances or low thermal resistance, are of equal importance for both application groups [5]. one parameter causes a conflict, meaning that there are potentially opposing optimization targets. unsurprisingly this parameter, the gate-drain-charge qgd, is one of the main factors controlling the switching speed of the device. a small gate-drain-charge qgd is therefore advantageous for fast switching applications such as synchronous rectification.
the gate-drain-charge qgd is also related to the common need of paralleling devices in typical drive applications to meet the load current requirements. paralleling of devices calls for the ability to switch all of them at the same time, which is usually linked to longer switching times in order to balance inevitable device tolerances. on the other hand, switching frequencies are typically lower than in the case of synchronous rectifiers and primary-side switches. consequently, devices for drive applications may also have larger miller and input capacitances in addition to a low variation of the threshold voltage in order to ease the task of paralleling. it is important that the di/dt and dv/dt can be controlled over a wide range by choosing an adequate external gate resistance, as this could avoid the need for a derivative with increased capacitances.

it is further known that avalanche events due to unclamped inductive switching can affect the device in all addressed application fields. in case of single pulse events found under critical operation conditions (such as abrupt load changes, abrupt disconnection from the power grid, blocked motors etc.) the energy which needs to be dissipated by the device can be large. also, the peak current density may exceed the nominal current rating. here a good suppression of the unwanted turn-on of the parasitic bjt is required and given for most modern mosfets. however, the avalanche capability is limited by the intrinsic temperature of the device where the intrinsic carrier density equals the background doping, leading to thermal destruction of the device [6],[7],[8]. as active device areas become smaller due to a lower specific on-resistance, not only does the overall device volume for energy dissipation get smaller but the current densities also increase.
also the thermal resistance from junction to case rthjc increases at smaller chip sizes, which imposes another challenge in order to maintain the required device robustness.

fig. 2 requirements for synchronous rectification and primary-side switch applications (left) and motor drive applications (right) [5]

a first conclusion to draw out of this analysis is that just one technology needs to be developed since it is relatively easy to adapt it later to get a derivative for the other field of application in case it is really needed. the second conclusion underlines the need for further improved package technologies. here, the most important requirements are a further reduction of the package contribution to the overall on-resistance of the product and improved cooling capabilities (lower rthjc). however, it is not only the package contribution to the on-resistance of the device that matters, but also the parasitic inductance which it introduces. the inductance due to the package leads to additional switching losses, slower switching speed, or may even cause an unwanted turn-on of the device, all lowering the overall efficiency of the power-electronic device. this parasitic inductance might also trigger repetitive avalanche events. the number of repeated avalanche cycles, even when dissipating low energies in the range of 1 µj only, may affect the semiconductor in case of poor device designs.

2.3. specific synchronous rectification requirements

the power losses in the mosfet must be separated into load dependent conduction losses and constant switching losses [9]. conduction losses are determined by the rds(on) of the switch. they increase with increasing output load of the power supply. on the other hand the switching losses are constant over the whole output load, and are mainly determined by the gate charge qg and the output charge qoss.

fig.
3 simplified model of the synchronous rectification mosfet turn-off [9]

further considering the turn-off process, also the stored charge qrr of the body diode must be removed and the output capacitance coss, formed by the gate-drain-capacitance cgd and the drain-source-capacitance cds, has to be charged up to the input voltage of the sr stage as explained in fig. 3. this process results in a reverse current peak irrm which is linked to the overall inductance of the commutation loop. the energy stored in this inductance is transferred to the output capacitance as soon as the drain-source-voltage vds of the mosfet exceeds the input voltage vin with a voltage spike carrying this energy. the amount of energy is defined by the reverse-recovery charge stored in the body diode qrr and the charge stored in the output capacitance qoss and is lost in every switching cycle. a high qoss + qrr does not only generate power losses but also causes a large reverse current peak irrm as shown schematically in fig. 3. the higher the reverse current peak, the higher the rate of voltage rise dv/dt, and thus the greater the turn-off voltage spike, will be. this high dv/dt can also trigger a dynamic re-turn-on of the mosfet by raising the gate voltage above the threshold voltage due to the capacitive voltage divider cgd / cgs as depicted in fig. 4 [10]. to prevent this, a small output capacitance coss, a small stored charge qrr, a non-critical ratio cgd / cgs and a narrow tolerance of all mosfet capacitances are essential.

fig. 4 dynamic turn-on of a mosfet by large dv/dt [10]

2.4. how to gain highest efficiency

to optimize the power mosfet for highest efficiency, a well-balanced ratio between switching losses and conduction losses must be found. at low output loads the conduction losses only play a minor role while switching losses are dominant.
for higher loads the weighting of the losses is the other way around. to calculate the losses and to get an indication of how the technology will perform in the system, different figures-of-merit (fom) need to be considered [11],[12]. the fomg is the product of the rds(on) and the qg, while the fomoss is the product of rds(on) and qoss. as the capacitances of a mosfet are inversely proportional to the rds(on), this product is fixed over the whole rds(on) range of a given technology. fig. 5 illustrates the derived relation between on-resistance and overall power losses using the example of a synchronous rectifier. as illustrated there, the conduction losses increase linearly with higher rds(on). since switching losses increase at low rds(on) values, a local minimum is found considering the total power losses [13]. here the mosfet generates the lowest losses in a given system and therefore the highest efficiency is found. further optimization of a synchronous rectification system cannot be done within this given mosfet technology. consequently, the main goal of a new synchronous rectification mosfet is moving this point of minimum losses to the bottom left corner in fig. 5. this can only be achieved by a further massive reduction of switching losses and conduction losses at the same time. this will raise the whole system efficiency both at low output power and at high output power. an improvement of the fomoss will mainly affect the system efficiency at low output power while the rds(on) will primarily affect the efficiency at high currents. also the stored charge qrr negatively affects the system efficiency at medium and high output power and adequate measures might be required to reduce it. these considerations are also valid for motor-drives; however, the usually lower switching frequency shifts the optimum point to a significantly reduced on-resistance.

fig. 5 power losses per device vs.
on-resistance in synchronous rectification for a given 60 v mosfet technology (vin = 30 v, vgs = 10 v, i = 15 a, f = 125 khz) [13]

3. power semiconductor optimization

3.1. introduction of device concept

the device concept discussed is related to a field-plate trench mosfet as shown schematically in fig. 6. such devices entered the market more than 10 years ago and developed into a kind of standard technology for fast-switching devices. the basics and properties of these devices have been discussed in more detail in many publications over the years, e.g. [14]-[18]. the basic principle to realize an area-specific on-resistance well below the 1d silicon limit [19],[20] is similar to the charge-compensation principle in superjunction devices like the coolmos™, as schematically shown in fig. 7a. here the compensation of n-drift region donors is realized by acceptors located in p-columns. in field-plate type devices, an isolated field-plate provides the mobile charges required to compensate the drift region donors under blocking conditions as indicated in fig. 7b.

fig. 7 a) compensation by p- and n-columns; b) compensation using a field-plate

compared to a device using a simple planar pn-junction, the electric field now also has a component in the lateral direction. fig. 8 explains the basic differences in the electric field for a simple pn-junction and for the case where a field-plate compensates the donors in the drift region. the application of a field-plate leads to an almost constant field distribution in the vertical direction since the ionized dopants in the drift region are laterally compensated by mobile carriers in the field-plate, thereby reducing the necessary drift region length and increasing the allowed drift region doping for a given breakdown voltage. both contribute to the significantly reduced area-specific on-resistance.
since the field-plate electrode is connected to the source electrode of the mosfet and the gate is formed by a separate electrode, such a device offers an outstanding area-specific on-resistance and a low gate-charge at the same time.

fig. 6 schematic structure of a field-plate mosfet

fig. 8 a) electric field for a pn-junction; b) electric field for a field-plate structure

3.2 improvement of device properties

despite all the advantages, the introduction of charge-compensation is inevitably linked to an increase in the output capacitance coss and the output charge qoss due to the increased doping density compared to a standard mosfet. here it is useful to consider the previously defined fomoss since from an application point of view the output charge for a given on-resistance is of interest. a simple optimization towards the lowest possible area-specific on-resistance by using a smaller cell pitch will lead to a degradation of the fomoss. alternatively, a reduction of the qoss is obviously possible by a further reduction of the drift region length, a lower drift region doping, and a decrease in the cell density. unfortunately, these measures will degrade the area-specific on-resistance and/or affect the breakdown voltage. fig. 9 shows the dependence of the breakdown voltage on the trench depth and the linked drift region length at a given doping level. the target is to minimize the trench depth without any deterioration of the breakdown voltage. the width and the depth of the trench will vary over the manufacturing process within a specific range and as such the charge in the mesa region, which forms a major part of the output charge, will vary as well [13]. moreover, due to the process tolerances, the average trench depth must be deep enough to always ensure the required minimum blocking capability.
therefore a reduction of the trench depth variation by improved tools and better process control will allow for a simultaneous reduction of on-resistance and output-charge. also, the variation of the trench width for a constant pitch does limit the device performance since the charge along the lateral direction must be compensated by the field-plate without exceeding the critical strength of the electric field. again a better control of this parameter by improved tools and/or a more advanced lithography allows for a higher doping level linked to a better on-resistance and a narrower range of the output charge variation at the same time. of course there are many other process-related parameters where a better control directly leads to an improvement of the device parameters [13].

fig. 9 breakdown voltage dependence on trench depth and linked drift region length for a given doping [13]

independent of the exact device structure, these thoughts can be transferred to any similar device design. as an example, fig. 10 indicates the result for different ways of optimizing the fomoss vs. rds(on) x aactive. despite the clear improvement of both key parameters in the sweet spot, there are two particularly interesting facts to note. first, the strong reduction of the output-charge results in only a minor increase in the area-specific on-resistance compared to what would be achieved by a straightforward reduction of the on-resistance. second, the fomoss of such an optimized device is also competitive with devices with pure focus on output charge reduction.

fig. 10 comparison of device performance of 1st and 2nd generation field-plate trench mosfet to a standard trench mosfet in the 60 v class

fig.
11 comparison of the gate-drain charge distribution for the 60 v class

with respect to the dynamic behavior, the absolute value of the gate-drain-charge qgd and its variation over the manufacturing process should be low. the requirement for a low variation range of this parameter is especially important when devices need to be connected in parallel with each other, enabling a faster switching of the whole system. a small variation range of the qgd value also allows the minimization of safety margins. here the optimized technology also benefits from the previously discussed improvements to the manufacturing process and equipment. progress in the process details, a better process control and the tweaked device geometry result in a much smaller range of the qgd compared to the predecessor technology as indicated in the cumulative plot shown in fig. 11. as paralleling is especially important for high-current motor-drive applications, a smaller variation of the threshold voltage allows for a smaller current derating as imbalances in the current distribution of the paralleled devices become smaller. as depicted in fig. 12, the threshold voltage variation also benefits from the advances in the manufacturing process.

fig. 12 comparison of the threshold voltage distribution for the 100 v class

4. packaging issues

with silicon technology moving rapidly forward, the package becomes an increasingly important part for low-voltage mosfets. the on-resistance of the latest device technologies has become remarkably low; the package proportion of the overall on-resistance has changed from a negligible 1:10 to 1:1 or even worse. in the past this need for low-resistive packages to avoid a limitation of the device by the package characteristics drove the development of new packages, optimized for high currents and high switching frequencies.
this becomes clear when referring to the package contributions of the discussed low-voltage mosfet devices with maximum die-size for the given package in fig. 13. these advanced device technologies allow for mosfet dies in a still widely used to-220 with an on-resistance being equal to or lower than the package resistance. therefore, the package resistance clearly limits the minimum achievable on-resistance, as explained using the example of three generations of 100 v devices with maximum die size in the respective package shown in fig. 14. for a to-220 device, only 50 % of the gained on-resistance improvement on chip level is realized in the packaged device due to the significant package contribution. to follow the route towards denser and more efficient power converter designs, available surface-mounted package types, such as the new to-leadless (to-ll) [21], the superso8, the shrinked superso8 (s3o8) or the canpak™, are needed to replace the leaded smd or through-hole devices for low-voltage mosfets as they contribute significantly less to the overall on-resistance of the product.

fig. 13 package contribution to the overall on-resistance for devices of different voltage classes of the latest generation with maximum die-size in the respective package

of course it is not only the package contribution to the on-resistance of the device which matters, but also the parasitic inductance it introduces. at increasing switching frequencies and switching speeds, the package inductance can play a major part in loss generation for the overall device and application performance. for example, a buck-converter with an output current of 30 a, operating at 250 khz, generates 0.7 w of losses in a d²pak design due to the total package inductance of 6 nh. with a low-inductive package like the superso8, showing an inductance of less than 0.5 nh, these losses drop below 0.1 w.
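as a rough cross-check of the loss figures quoted above, the following sketch assumes (this model is not stated in the text) that the energy ½·l·i² stored in the package inductance is dissipated once per switching cycle. the function name and the choice of python are illustrative only.

```python
def package_inductance_loss(l_pkg_h: float, i_out_a: float, f_sw_hz: float) -> float:
    """first-order estimate of the loss caused by the package inductance.

    assumption (not taken from the paper): the energy 0.5 * L * I^2 stored in
    the package inductance is lost once in every switching cycle.
    """
    energy_per_cycle_j = 0.5 * l_pkg_h * i_out_a ** 2
    return energy_per_cycle_j * f_sw_hz


# operating point quoted in the text: 30 a output current, 250 khz
i_out = 30.0
f_sw = 250e3

loss_6nh = package_inductance_loss(6e-9, i_out, f_sw)      # 6 nh package
loss_0p5nh = package_inductance_loss(0.5e-9, i_out, f_sw)  # 0.5 nh package

print(f"6 nh package:   {loss_6nh:.3f} w")
print(f"0.5 nh package: {loss_0p5nh:.3f} w")
```

under this assumption the 6 nh case evaluates to about 0.68 w and the 0.5 nh case to about 0.06 w, which is consistent with the quoted 0.7 w and the sub-0.1 w values.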
however, most surface-mounted devices available so far were less suited for high-current applications due to their limited footprint area and the corresponding limited current-density due to the package itself. a recent solution addressing such applications is the already mentioned to-leadless (to-ll), which offers a lower parasitic resistance and inductance, a lower thermal resistance and a higher current capability [21]. this solution also extends the maximum allowed continuous current capability compared to the commonly available to-packages such as to-220 or d²pak 7-pin up to 300 a as it offers a 50 % bigger solder contact area. this reduces the current density through the solder contact areas and thus avoids electromigration issues at high current levels.

fig. 14 typical on-resistance reduction comparison of three 100 v mosfet generations on chip and package level

5. device performance

5.1. efficiency and voltage overshoot

improvements of the mosfet die itself are mainly based on a detailed understanding of the device physics and consequent improvement of manufacturing capabilities as discussed briefly in this work or in more detail in [13]. the improvements realized by the new 2nd generation 80 v and 100 v mosfets over the equivalent 1st generation products were investigated in a 400 w power supply (psu) based on a full-bridge converter with full wave synchronous rectification as schematically shown in fig. 15. for the efficiency evaluation, the synchronous rectifier stage was equipped with either 80 v or 100 v devices:
• for 80 v: one 1st generation / 4.7 mΩ device or one 2nd generation / 3 mΩ device.
• for 100 v: two paralleled 1st generation / 4.6 mΩ devices or two paralleled 2nd generation / 4 mΩ devices.

fig. 15 simplified schematic of a 400 w / 33 a psu using a full-bridge converter on the primary side and full wave synchronous rectification

fig.
16 compares the measured efficiency over the full load range of the psu equipped with 1st generation or 2nd generation devices in the synchronous rectifier stage. while the on-resistance of the 2nd generation device is much lower, resulting in the better high-load efficiency, the efficiency at low and medium load conditions is also maintained due to the improved fomoss. by choosing the right on-resistance of the device, efficiency can be easily improved over the full load range. fig. 17 indicates the voltage overshoot at the synchronous rectification switches for the example of the 100 v devices at low load. even at this condition the voltage spike is lowered despite the higher degree of charge-compensation responsible for the lower on-resistance. as designers need to ensure that the level of this peak does not exceed the maximum rating of the device, a snubber network is commonly used which is costly and typically decreases the performance of the power supply [22]. a snubber in its easiest version consists of a series-connected resistor and capacitor connected in parallel to the drain and source of the mosfet. any reduction of its capacitance improves the efficiency of the circuit and supports a lowered voltage spike of the power mosfet.

fig. 16 comparison of the overall efficiency at an input voltage of 48 v [5]

fig. 17 voltage overshoot at the secondary rectification stage of a psu equipped with 100 v devices of the 1st and the 2nd device generation at a load current of 8 a [5]

the lower overall losses translate into a lower device temperature as shown in fig. 18 [5]. as different color scales are used, the maximum temperature of the devices is indicated on the graphs.

fig.
18 maximum chip temperature for 80 v devices (top) and 100 v devices (bottom) in synchronous rectifier stage, each for 1st generation device (left) and 2nd generation device (right) [vin = 48 v, iout = 33 a, tamb = 300 k, no forced air cooling]

5.2 avalanche ruggedness

during development, the single-pulse avalanche destruction current was investigated following a mixed-mode 2d simulation approach using two slightly different mosfet cells as proposed in earlier work [23]. the good agreement of the simulated and measured destruction currents as shown in fig. 19 indicates a proper chip design since no serious degradation is introduced by the real, three-dimensional device structure.

fig. 19 measured and simulated single-pulse avalanche destruction current

to compare the avalanche capability of the 1st and 2nd generation, single-pulse avalanche measurements were done for different inductances and temperature values. fig. 20 presents the result of these measurements using the example of 100 v devices having an identical active area. to estimate the intrinsic temperature, extrapolation lines are fit to the average failure current points determined at the various temperatures. the intersection point with the zero-current line is found at the intrinsic temperature of the device. the thermal destruction is found at approximately the same intrinsic temperature for both device generations under identical conditions [7]. consequently, the improved device properties are not linked to an avalanche weakness.

fig. 20 measured avalanche capability vs. junction temperature for 100 v mosfet

5.3 performance of to-ll package

as previously discussed, advanced package concepts enable a significant reduction of the package contribution to the overall on-resistance of a device.
how big this difference can be is shown in a direct comparison between a d²pak 7-pin and a to-ll, both with identical chip size. a typical application for such low resistive mosfets in a high current package is the inverter for an electric 3-phase motor. starting with a few tenths of an ampere, the continuous current easily reaches several hundred amperes or more. a typical battery voltage is 24 v; therefore 60 v mosfets are an appropriate choice. the lowest available on-resistance of such a device in d²pak 7-pin is 1 mΩ. this is the upper guaranteed limit, including both silicon and package resistance. the package (= "copper") losses are already around 0.4 mΩ, representing nearly 50 % of the conduction losses. a better solution would be a package with an improved design for lower copper losses. in to-ll with its optimized electrical and mechanical design, the package resistance goes down to approx. 0.25 mΩ. this enables a 60 v mosfet with a maximum on-resistance of less than 0.75 mΩ. the reduction of package resistance results in dramatically lower losses, enabling, for example, the chip temperature to be kept lower. fig. 21 shows the chip temperature in a typical motor control application (3-phase 24 v motor system, irms = 100 a) for the two packages, d²pak 7-pin and to-leadless with identical chip size. after one hour the temperature difference is already around 10 k. as a consequence the temperature stress to the to-ll parts is less, leading to an increased reliability of the parts linked to fewer failures in the field and as such longer expected lifetimes [5].

fig. 21 evolution of chip temperature with time for 60 v mosfet in a typical drives application for devices with identical chip area in two different package types [5]

using intermittent operating lifetime tests (iol), also called "power cycling", the reliability of the devices in the new to-ll package was proven.
in this test, the device is heated up by a high current flow in each cycle until the defined temperature difference is reached. the relevant industry standard aec q101 requires the device to survive a minimum of 15,000 cycles at a temperature difference of 100 k. as a rule of thumb, each additional temperature increase by 10 k leads to a reduction of the expected lifetime by 50 % (= half the number of possible power cycles). in order to reduce the test time (15,000 cycles last approx. 1 month), the temperature difference was increased from 100 k to 150 k, applying a much higher stress to the mosfet. 60,000 applied cycles using this dramatically higher stress condition did not result in any device failure or in measurable parameter drifts. fig. 22 shows the diagram including also the calculated curves for different temperature rises [5].

fig. 22 power cycling of mosfet. red line: calculated number of cycles over temperature rise, derived from the measurement conditions (∆t = 150 k, red dot). blue line: requirements according to aec q101 standard, derived from cycles with ∆t = 100 k (blue dot) [5]

6. conclusion

this paper discusses the optimization of mosfet technologies based on a detailed requirements analysis methodology. to improve the overall efficiency it is not sufficient to focus only on low rds(on). as the efficiency targets also require high levels of low load performance, all switching losses need to be minimized at the same time. to fulfill these needs, the fomg and fomoss are both decreased simultaneously which is enabled by improved manufacturing setups. it is further shown that these improvements do not compromise the device ruggedness such as the single-pulse avalanche capability. the increasing importance of the package and its influence on the overall performance is discussed.
measurement results in the respective target applications indicate the achieved progress in the overall device performance, both from the device and the package point of view. such device properties enable an easier design-in process with less effort for the designers of power-electronic appliances.

references

[1] http://www.plugloadsolutions.com/80pluspowersupplies.aspx.
[2] y. akao, qfd quality function deployment, verlag moderne industrie, landsberg / lech, 1992.
[3] http://www.qfdonline.com/.
[4] http://www.webducate.net/qfd/qfd.html.
[5] r. siemieniec, m. hutzler, d. laforet, l.-j. yip, a. huang and r. walter, "application-tailored development of power mosfet", in proc. isps 2014, prague, 2014.
[6] s. k. ghandhi, semiconductor power devices: physics of operation and fabrication technology, john wiley and sons, new york, 1977.
[7] d. kinzer, "advances in power switch technology for 40v-300v applications", in proc. epe 2005, dresden, 2005.
[8] j. lutz, h. schlangenotto, u. scheuermann and r. de doncker, semiconductor power devices, springer, 2011, pp. 419-421.
[9] c. mößlacher and l. görgens, "improving efficiency of synchronous rectification by analysis of the mosfet power loss mechanism", in proc. pcim 2009, nürnberg, 2009.
[10] l. görgens and r. siemieniec, "niedriger widerstand und kurze schaltzeiten optimos-2 100vmosfets für leistungsanwendungen mit hohem wirkungsgrad", in elektronik, pp. 60-63, june, 2006.
[11] a. nakagawa, y. kawaguchi and k. nakamura, "silicon limit electrical characteristics of power devices and ics", in proc. isps 2008, prague, 2008, pp. 25-32.
[12] c. mößlacher and l. görgens, "simple design techniques for optimizing efficiency and overvoltage spike of synchronous rectification in dc to dc converters", in proc. pcim 2010, nürnberg, 2010.
[13] r. siemieniec, c. mößlacher, o. blank, m. rösch, m. frank and m.
hutzler, "a new power mosfet generation designed for synchronous rectification", in proc. epe 2011, birmingham, 2011. [14] a. schlögl, f. hirler, j. ropohl, u. hiller, m. rösch, n. soufi-amlashi and r. siemieniec, "a new robust power mosfet family in the voltage range 80 v-150 v with superior low rdson, excellent switching properties and improved body diode", in proc. epe 2005, dresden, 2005. [15] j. yedinak, d. probst, g. dolny, a. challa and j. andrews, "optimizing oxide charge balanced devices for unclamped inductive switching (uis) ", in proc. ispsd 2010, hiroshima, 2010. [16] f. tong, p.a. mawby, j.a. covington and a. pérez-tomás, "investigation on split-gate rso mosfet for 30v breakdown", in proc. isps 2008, prague, 2008. [17] d. pattanyak, "low voltage super junction technology", in proc. isps 2006, prague, 2006. [18] j. roig, d. lee, f. bauwens, b. burra, a. rinaldi, j. mcdonald and b. desoete, "suitable operation conditions for different 100v trench-based power mosfets in 48v-input synchronous buck converters", in proc. epe 2011, birmingham, 2011. [19] c. hu, "a parametric study of power mosfets", in proc. pesc 1979, san diego, 1979, pp. 385-395. [20] i. pawel, r. siemieniec and m. born, "theoretical evaluation of maximum doping concentration, breakdown voltage and on-state resistance of field-plate compensated devices", in proc. isps 2008, prague, 2008, pp. 55-61. [21] infineon technologies ag: optimos™ in to-leadless, product brief, http://www.infineon.com [22] r. severns, "design of snubbers for power circuits", http://www.cde.com/resources/technicalpapers/design.pdf, july 2009. [23] i. pawel and r. siemieniec, a new simulation approach to investigate avalanche behaviour, in k. elleithy (ed.): innovations and advanced techniques in systems, computing sciences and software engineering, springer, 2008, pp. 9-14. 
facta universitatis series: electronics and energetics vol. 35, no 1, march 2022, pp. 29-41 https://doi.org/10.2298/fuee2201029b © 2022 by university of niš, serbia | creative commons license: cc by-nc-nd original scientific paper

all-optical frequency encoded dibit-based parity generator using reflective semiconductor optical amplifier with simulative verification*

surajit bosu1, baibaswata bhattacharjee2 1department of physics, bankura sammilani college, bankura, india (w.b) 2department of physics, ramananda college, bishnupur, bankura, india (w.b)

abstract. high-speed signal computation and communication are an essential part of modern communication, which increases the need for optical techniques. therefore, researchers have developed different types of digital devices in the all-optical domain. owing to their versatile gain medium, reflective semiconductor optical amplifiers (rsoas) have various important applications in passive optical networks. in comparison with semiconductor optical amplifiers (soas), rsoas exhibit better gain performance because of their double-pass property, and therefore show better switching properties. in this communication, the co-propagation scheme of the rsoa is used to design and analyze a frequency encoded dibit-based parity generator. the proposed design takes advantage of rsoa properties such as high switching speed, low noise, high gain, and low power consumption. the design is simulated in matlab, and the simulated outputs accurately verify the truth table.

key words: optical communication, reflective semiconductor optical amplifier, frequency encoding, dibit-based logic system, parity generator.

1.
introduction

the photon is becoming increasingly popular for information transmission [2, 3]. photons can carry information at very high speed. therefore, researchers are very interested in designing photon-based devices [4, 5] instead of electron-based devices. data signals can be transmitted over long ranges using different types of encoding techniques [6-8]. the frequency encoding technique [9-13] is more reliable in long-range signal propagation. in optical communication, the adder [1, 14-16], subtractor [17], comparator [9, 18], and parity generator are basic components for arithmetic and decision-making circuits, logic units [19-25], and memory units [11]. these are the basic building blocks of optical data processors.

received august 9, 2021; accepted september 29, 2021 corresponding author: surajit bosu bankura sammilani college, faculty of physics, bankura, west bengal, india e-mail: surajitbosu7@gmail.com
* an earlier version of this paper was presented at the 4th international conference on 2021 devices for integrated circuit (devic 2021), may 19-20, 2021, in kalyani, west bengal, india [1].

in the frequency encoding concept [1], the digital logic states ‘0’ and ‘1’ are indicated by frequencies υ1 and υ2 respectively. in communication and data storage systems, the parity generator is an essential device. over the last decade, researchers have been working on parity generators. the literature survey shows that the even and odd parity generator units have not been designed in a single device without an extra control terminal, and that the dibit-based logic and frequency encoding scheme has not previously been implemented for this purpose. in this communication, a frequency encoded dibit-based even/odd parity generator in the all-optical domain is devised using an add/drop multiplexer (adm) and a reflective semiconductor optical amplifier (rsoa). from the previous version [1], we adopted the logic of sum from the design of a half adder using rsoa and adm.
in the half-adder design [1], two-input dibit-based logic is used, whereas the proposed design implements three-input dibit-based logic. the operation of the three-input dibit-based logic is therefore considerably more complicated than in the previous version [1]. the proposed design is a single device for the even/odd parity generator units and has no extra control terminal. as a result, it reduces the space occupied by the device and generates even and odd parity simultaneously. by introducing dibit-based logic in this design, a high degree of parallelism can be expected. the frequency encoding and dibit-based systems also reduce bit-error problems and enhance the speed of operation in long-range transmission. since the rsoa has an ultrafast switching property with low noise, the proposed design operates at ultrafast speed. in the results and discussion section, the proposed design is compared with other designs [26, 28, 33, 37]; the comparison is given in table 3. the remaining part is structured as follows: related works are described in section 2. the working principle of rsoa and adm is described in section 3. section 4 describes the operation scheme of the proposed parity generator. the simulation experiment of the proposed model is described in section 5. the results and discussion of the proposed system are presented in section 6. finally, the conclusions with potential future works are given in section 7.

2. related works

researchers have been working on parity generators for the past several years. some notable works are discussed here. chowdhury et al. [26] introduced a design of a 4-bit parity generator and checker using non-linear material-based switches. a parity generator and parity checker using a spatial light modulator and a savart plate has been reported by ghosh [27]. dimitriadou et al. [28] introduced a 4-bit parity generator and checker. they used a high-speed switch to design the parity generator.
this high-speed switch is based on a quantum-dot-soa-based mzi, and their design is verified through numerical simulation. this design is based on the modified trinary number system. a micro-ring-resonator (mrr)-based parity generator and checker has been reported by rakshit et al. [29]; it is also verified using numerical simulation. mehra et al. [30] introduced an soa-mzi-based 7-bit parity generator and checker circuit, simulated at a high speed of 120 ghz. bhattacharyya et al. [31] reported a 4-bit parity generator using an soa-assisted sagnac switch, and the design is verified through numerical simulation. kumar et al. [32] reported a parity checker using the electro-optic effect in an mzi; results are obtained using matlab, and optibpm software is used to verify the implementation of this design. wang et al. [33] reported a parity checker in the all-optical domain, implemented in a nanoscale integrated chip. a plasmonic metal-insulator-metal (mim)-based parity generator has been reported by singh et al. [34]; this design is simulated in matlab. kaur et al. [35] proposed an soa-mzi-based 3-bit parity generator and checker, for which a transfer matrix method (tmm)-based time-domain simulation is performed. nair et al. [36] introduced an soa-mzi-based 3-bit parity generator and checker in the all-optical domain, using the tree architecture concept in their design. maji et al. [37] proposed a design of a 4-bit parity generator and checker using a reflective semiconductor optical amplifier. they introduced a single device containing the even/odd parity generator units, but used an extra control terminal to switch between the even and odd parity units.

3. working principle of rsoa and adm

as mentioned in the introduction, the key components of the proposed design are the rsoa and the adm.
the working principle of these two components is explained in this section. in an rsoa, a strong pump signal and a weak probe signal are injected at the input, and a high-power probe beam is obtained at the output. this operation is based on cross-gain modulation (xgm) [1, 38]. a highly reflective (hr) coating and an anti-reflective (ar) coating are placed on the two facets of the rsoa [1, 22, 25, 38-41], which is why the device is called a reflective soa. it has a very versatile high-gain medium, so it has various important applications in passive optical networks (pon). here, the wavelengths of the probe signals lie in the c-band (1535-1570 nm). the saturation power of the rsoa may be taken within 5-20 dbm [1, 9, 43].

fig. 1 block diagram of rsoa

an add/drop multiplexer (adm) [41-43] is widely used as a frequency selector. if the frequencies υ2 and υ1 are injected into the bias and input ports of the adm respectively, then υ1 is obtained at the output port, whereas nothing is obtained at the drop port. if the same frequency, υ1 (or υ2), is injected into both the input and bias ports, then the adm reflects the input signal υ1 (or υ2) to the drop port through the circulator, whereas nothing is obtained at the output port. the schematic diagram of the adm is given in fig. 2. rsoas and adms are used to develop different devices such as multiplexers, adders, comparators, etc. [1, 9-10, 41].

fig. 2 block diagram of adm

4. proposed scheme of operation of the three-bit parity generator

in this section, the 3-bit parity generator and its operational scheme are proposed. here, a, b, and c are the frequency encoded dibit-based inputs whose parity is to be generated. the boolean expressions of the even and odd parity generators are

$$Y_{\mathrm{even}} = A \oplus B \oplus C \qquad (1)$$

$$Y_{\mathrm{odd}} = \overline{A \oplus B \oplus C} \qquad (2)$$

the schematic diagram of this design is given in fig. 3. in this proposed design, the frequency encoding technique and dibit-based logic are adopted.
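the port behaviour of the adm and the rsoa described in section 3 can be summarised as a small behavioural model. the python sketch below is only an illustration under stated assumptions: the function names `adm` and `rsoa`, the string labels "v1" and "v2", and the rule that an rsoa without a pump gives no output are hypothetical choices made here, not the authors' matlab implementation.

```python
from typing import Optional, Tuple

# Hypothetical labels standing in for the two optical frequencies.
V1, V2 = "v1", "v2"


def adm(input_freq: Optional[str], bias_freq: Optional[str]) -> Tuple[Optional[str], Optional[str]]:
    """Idealised add/drop multiplexer; returns (output_port, drop_port).

    Equal input and bias frequencies send the input to the drop port;
    different frequencies pass the input to the output port.
    """
    if input_freq is None:
        return (None, None)          # nothing injected, nothing comes out
    if input_freq == bias_freq:
        return (None, input_freq)    # reflected to the drop port
    return (input_freq, None)        # selected at the output port


def rsoa(pump_freq: Optional[str], probe_freq: str) -> Optional[str]:
    """Idealised XGM switch: the probe frequency appears at the output
    when a pump is present. No output without a pump is an assumption."""
    if pump_freq is None:
        return None
    return probe_freq


# Example: both ADM ports carry v1, so v1 leaves through the drop port
# and can pump a following RSOA whose probe is also v1.
output_port, drop_port = adm(V1, V1)
stage_output = rsoa(drop_port, V1)
```

keeping the two elements as pure functions makes it easy to chain them stage by stage, although the actual interconnection of the stages has to be taken from fig. 3.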
mukhopadhyay [44] first reported the dibit-based representation technique. according to this technique [1, 9-10, 42-44], two consecutive bit positions are used to represent a digit. here, the digital logic states ‘0’ and ‘1’ are represented by the dibits ‘01’ and ‘10’ respectively. since the proposed design is frequency encoded, two different frequencies υ1 and υ2 placed side by side as “υ2 υ1” indicate the logic state ‘1’, and “υ1 υ2” indicates the logic state ‘0’. here, we adopted the logic of sum from the design of the half adder [1] using rsoa and adm. in the half-adder design, two-input dibit-based logic is used, whereas the proposed design implements three-input dibit-based logic, so its operation is considerably more complicated than in the previous version [1]. the operation of the proposed design is based on eqs. 1-2. the operation scheme of the proposed frequency encoded dibit-based parity generator is described in the following cases.

4.1. case-i (when all the inputs are the same)

in this case, a/=υ1, a//=υ2, b/=υ1, b//=υ2, c/=υ1, and c//=υ2 are injected into the input terminals. therefore, frequency υ1 is obtained from rsoa-3, rsoa-2, and rsoa-1. the outputs of rsoa-2 and rsoa-1 are injected into a4 (adm-4) as the input and bias signals. since both frequencies at a4 are the same, υ1 is obtained at the drop port through the circulator. this υ1 acts as the pump signal of rsoa-5, whose probe signal frequency is υ1, so the output of rsoa-5 is υ1. this υ1 acts as the input signal of a5 (adm-5), whose biasing signal is υ1, the output frequency of rsoa-3. since both frequencies at a5 (adm-5) are the same, a5 reflects the input signal υ1 to the drop port. this υ1 works as the pump signal of rsoa-7. therefore, the output of rsoa-7 is frequency υ1.
one part of this frequency directly gives the dibit output y/even, and another part acts as the input of a6 (adm-6), which is biased by a signal of frequency υ2. therefore, a6 passes the input signal of frequency υ1 to the output, and this υ1 works as the pump signal of rsoa-8, which yields υ2 at the dibit output terminal y//even. finally, the dibit outputs υ1 and υ2 are obtained at the output terminals y/even and y//even respectively, which together indicate the digital logic state ‘0’, and the dibit outputs υ2 and υ1 are obtained at the output terminals y/odd and y//odd, which together indicate the digital logic state ‘1’. therefore, when the inputs are a=0, b=0, and c=0, the outputs show even parity yeven=0 and odd parity yodd=1. similarly, when a/=υ2, a//=υ1, b/=υ2, b//=υ1, c/=υ2, and c//=υ1 are taken as the input signals of the device, the dibit outputs υ2 and υ1 are obtained at the output terminals y/even and y//even respectively, which together indicate the digital logic state ‘1’, and the dibit outputs υ1 and υ2 are obtained at the output terminals y/odd and y//odd, which together indicate the digital logic state ‘0’. therefore, when the inputs are a=1, b=1, and c=1, the outputs show even parity yeven=1 and odd parity yodd=0, in agreement with table 1.

4.2. case-ii (when one input is different)

here, a/=υ2, a//=υ1, b/=υ1, b//=υ2, c/=υ2, and c//=υ1 are applied as the input signals of the device. therefore, frequencies υ2, υ1, and υ2 are obtained at the outputs of rsoa-3, rsoa-2, and rsoa-1 respectively. the outputs from rsoa-2 and rsoa-1 are injected into a4 (adm-4) as the input and biasing signals respectively. since the two frequencies at a4 are not the same, the input signal is selected by a4 at the output. this frequency υ1 acts as the pump signal of rsoa-4, whose probe signal is υ2, so the output of rsoa-5 is υ2.
this frequency υ2 acts as the input frequency of a5 (adm-5), and the output frequency υ2 of rsoa-3 works as the biasing frequency. since both frequencies at a5 (adm-5) are the same, a5 reflects the input signal of frequency υ2 to the drop port. this υ2 works as the pump signal of rsoa-7. therefore, the output of rsoa-7 is frequency υ1; one part of this frequency directly gives the dibit output y/even, and another part acts as the input of a6 (adm-6), which is biased with frequency υ2. therefore, a6 selects the frequency υ1 at the output, and this υ1 acts as the pump signal of rsoa-8, which gives υ2 at the dibit output terminal y//even. finally, the dibit outputs υ1 and υ2 are obtained at the output terminals y/even and y//even respectively, which together indicate the digital logic state ‘0’, and the dibit outputs υ2 and υ1 are obtained at the output terminals y/odd and y//odd, which together indicate the digital logic state ‘1’. therefore, when the inputs are a=1, b=0, and c=1, the outputs show even parity yeven=0 and odd parity yodd=1.

fig. 3 schematic diagram of proposed parity generator

similarly, when a/=υ1, a//=υ2, b/=υ2, b//=υ1, c/=υ2, and c//=υ1 are taken as the input signals of the device, the dibit outputs υ1 and υ2 are obtained at the output terminals y/even and y//even respectively, which together indicate the digital logic state ‘0’, and the dibit outputs υ2 and υ1 are obtained at the output terminals y/odd and y//odd, which together indicate the digital logic state ‘1’. therefore, when the inputs are a=0, b=1, and c=1, the outputs show even parity yeven=0 and odd parity yodd=1. in this way, the other outputs can be obtained; they are given in table 1.
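the dibit encoding and the parity logic can be cross-checked with a few lines of code. the python sketch below regenerates the rows of the frequency encoded truth table from the logic alone, assuming that yeven is the exclusive-or of the three inputs and that yodd is its complement. the helper names `encode`, `decode`, `parity_row`, and `truth_table`, and the labels "v1" and "v2", are hypothetical; the sketch models only the logic, not the rsoa/adm circuit of fig. 3.

```python
from itertools import product

# Hypothetical labels for the two frequencies used in the dibit code.
V1, V2 = "v1", "v2"


def encode(bit):
    """Logic 0 -> dibit (v1, v2); logic 1 -> dibit (v2, v1)."""
    return (V1, V2) if bit == 0 else (V2, V1)


def decode(dibit):
    """Inverse of encode; rejects the invalid dibits (v1, v1) and (v2, v2)."""
    if dibit == (V1, V2):
        return 0
    if dibit == (V2, V1):
        return 1
    raise ValueError(f"invalid dibit: {dibit!r}")


def parity_row(a, b, c):
    """One truth-table row: A, B, C, Yeven, Yodd, each written as a dibit."""
    y_even = a ^ b ^ c       # assumed form of the even-parity expression
    y_odd = 1 - y_even       # assumed complement for the odd-parity output
    return encode(a) + encode(b) + encode(c) + encode(y_even) + encode(y_odd)


def truth_table():
    """All eight input combinations in binary counting order."""
    return [parity_row(a, b, c) for a, b, c in product((0, 1), repeat=3)]


for row in truth_table():
    print(" ".join(row))
```

each printed row has ten entries in the order a/, a//, b/, b//, c/, c//, y/even, y//even, y/odd, y//odd, so it can be compared column by column with the tabulated values.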
table 1 frequency encoded truth table of proposed design

dibit input                          dibit output
input (a)   input (b)   input (c)    even parity (yeven)   odd parity (yodd)
a/   a//    b/   b//    c/   c//     y/even   y//even      y/odd   y//odd
υ1   υ2     υ1   υ2     υ1   υ2      υ1       υ2           υ2      υ1
υ1   υ2     υ1   υ2     υ2   υ1      υ2       υ1           υ1      υ2
υ1   υ2     υ2   υ1     υ1   υ2      υ2       υ1           υ1      υ2
υ1   υ2     υ2   υ1     υ2   υ1      υ1       υ2           υ2      υ1
υ2   υ1     υ1   υ2     υ1   υ2      υ2       υ1           υ1      υ2
υ2   υ1     υ1   υ2     υ2   υ1      υ1       υ2           υ2      υ1
υ2   υ1     υ2   υ1     υ1   υ2      υ1       υ2           υ2      υ1
υ2   υ1     υ2   υ1     υ2   υ1      υ2       υ1           υ1      υ2

5. simulation of the proposed parity generator

in the previous section, the operational scheme of the proposed design was explained theoretically. the simulation model of the proposed design is now discussed. the proposed parity generator is verified using matlab (r2018a) software. the rsoas and adms are programmed on the basis of their characteristics in the matlab language. if the frequencies υ1=193.5 thz (wavelength=1550 nm) and υ2=194.1 thz (wavelength=1545 nm) are taken as the probe and pump signals respectively, then 193.5 thz is obtained at the output port, whereas 194.1 thz is obtained at the output port when υ1=193.5 thz and υ2=194.1 thz are taken as the pump and probe signals respectively. if υ1=193.5 thz and υ2=194.1 thz are taken as the input and biasing signals, the adm selects the input signal (193.5 thz) at the output, whereas nothing is obtained at the drop port. if the same frequency is injected at the biasing and input ports, the adm reflects the input signal to the drop port, whereas the output gives nothing. the design is simulated using these considerations. in figs. 4 and 5, the dibits <193.5><194.1> and <194.1><193.5> indicate the digital logic states ‘0’ and ‘1’ respectively. from fig. 4, the dibits <193.5><194.1>, <194.1><193.5>, and <193.5><194.1> are injected into the dibit input terminals ‘a’, ‘b’, and ‘c’ of the device respectively.
as a result, <194.1><193.5> and <193.5><194.1> are obtained as the dibit even parity, yeven, and the dibit odd parity, yodd, at the output terminals respectively. when the dibits <194.1><193.5>, <194.1><193.5>, and <193.5><194.1> are injected into the dibit input terminals ‘a’, ‘b’, and ‘c’ respectively, <193.5><194.1> and <194.1><193.5> are obtained as yeven and yodd at the output terminals respectively. when <194.1><193.5>, <194.1><193.5>, and <194.1><193.5> are injected into the dibit input terminals ‘a’, ‘b’, and ‘c’ respectively, <194.1><193.5> and <193.5><194.1> are obtained as yeven and yodd at the output terminals respectively. in a similar way, the other outputs are obtained corresponding to the applied inputs. all the dibit output waveforms corresponding to the dibit input waveforms are given in fig. 5. the results of the simulation are discussed in the next section.

fig. 4 dibit input signal waveforms of parity generator

6. results and discussion

in this section, the theoretical interpretation and the simulation results are discussed. in this design, two signals with different frequencies, υ1=193.5 thz and υ2=194.1 thz, are injected at the input in 50 ps time intervals. the input and output signal waveforms are given in figs. 4 and 5. the simulation results (given in table 2) verify the operation of the parity generator.

fig. 5 dibit output signal waveforms of parity generator

in the first 50 ps, we applied “υ1 υ2” in a, “υ1 υ2” in b, and “υ1 υ2” in c, which indicates a=0, b=0, and c=0. after simulation with these data, we obtained y/even=υ1, y//even=υ2, y/odd=υ2, and y//odd=υ1, which indicates yeven=0 and yodd=1. in 50-100 ps, we applied “υ1 υ2” in a, “υ1 υ2” in b, and “υ2 υ1” in c, which indicates a=0, b=0, and c=1.
after simulation with these data, we obtained y/even=υ2, y//even=υ1, y/odd=υ1, and y//odd=υ2, which indicates yeven=1 and yodd=0. in 100-150 ps, we applied “υ1 υ2” in a, “υ2 υ1” in b, and “υ1 υ2” in c, which indicates a=0, b=1, and c=0. after simulation with these data, we obtained y/even=υ2, y//even=υ1, y/odd=υ1, and y//odd=υ2, which indicates yeven=1 and yodd=0. in 150-200 ps, we applied “υ1 υ2” in a, “υ2 υ1” in b, and “υ2 υ1” in c, which indicates a=0, b=1, and c=1. after simulation with these data, we obtained y/even=υ1, y//even=υ2, y/odd=υ2, and y//odd=υ1, which indicates yeven=0 and yodd=1. in 200-250 ps, we applied “υ2 υ1” in a, “υ1 υ2” in b, and “υ1 υ2” in c, which indicates a=1, b=0, and c=0. after simulation with these data, we obtained y/even=υ2, y//even=υ1, y/odd=υ1, and y//odd=υ2, which indicates yeven=1 and yodd=0. in 250-300 ps, we applied “υ2 υ1” in a, “υ1 υ2” in b, and “υ2 υ1” in c, which indicates a=1, b=0, and c=1. after simulation with these data, we obtained y/even=υ1, y//even=υ2, y/odd=υ2, and y//odd=υ1, which indicates yeven=0 and yodd=1. in 300-350 ps, we applied “υ2 υ1” in a, “υ2 υ1” in b, and “υ1 υ2” in c, which indicates a=1, b=1, and c=0. after simulation with these data, we obtained y/even=υ1, y//even=υ2, y/odd=υ2, and y//odd=υ1, which indicates yeven=0 and yodd=1. in 350-400 ps, we applied “υ2 υ1” in a, “υ2 υ1” in b, and “υ2 υ1” in c, which indicates a=1, b=1, and c=1. after simulation with these data, we obtained y/even=υ2, y//even=υ1, y/odd=υ1, and y//odd=υ2, which indicates yeven=1 and yodd=0. after comparing the simulation results (given in table 2) with the truth table (table 1), it is concluded that the proposed design works accurately. a comparative study with previous work is given in table 3.
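the slot-by-slot comparison described above can also be automated. the python sketch below transcribes the eight 50 ps slots exactly as they are stated in the text (frequencies in thz), decodes every dibit, and reports any slot whose outputs disagree with the parity of its inputs. the names `ZERO`, `ONE`, `SLOTS`, `decode`, and `mismatched_slots` are hypothetical, and the check assumes that yeven is the exclusive-or of the inputs and that yodd is its complement.

```python
F1, F2 = 193.5, 194.1    # THz, the two frequencies used in the simulation

ZERO = (F1, F2)          # dibit for logic state 0
ONE = (F2, F1)           # dibit for logic state 1


def decode(dibit):
    """Map a frequency dibit back to its logic state."""
    if dibit == ZERO:
        return 0
    if dibit == ONE:
        return 1
    raise ValueError(f"invalid dibit: {dibit!r}")


# (time window in ps, A, B, C, Yeven, Yodd), transcribed from the text.
SLOTS = [
    ((0, 50),    ZERO, ZERO, ZERO, ZERO, ONE),
    ((50, 100),  ZERO, ZERO, ONE,  ONE,  ZERO),
    ((100, 150), ZERO, ONE,  ZERO, ONE,  ZERO),
    ((150, 200), ZERO, ONE,  ONE,  ZERO, ONE),
    ((200, 250), ONE,  ZERO, ZERO, ONE,  ZERO),
    ((250, 300), ONE,  ZERO, ONE,  ZERO, ONE),
    ((300, 350), ONE,  ONE,  ZERO, ZERO, ONE),
    ((350, 400), ONE,  ONE,  ONE,  ONE,  ZERO),
]


def mismatched_slots(slots):
    """Return the time windows whose outputs do not match the input parity."""
    bad = []
    for window, a, b, c, y_even, y_odd in slots:
        expected_even = decode(a) ^ decode(b) ^ decode(c)
        if decode(y_even) != expected_even or decode(y_odd) != 1 - expected_even:
            bad.append(window)
    return bad


print(mismatched_slots(SLOTS))
```

an empty list means that every transcribed slot is consistent with the assumed parity logic; a non-empty list points to the time windows that need a second look.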
table 2 simulation results of proposed parity generator (all the frequencies are in thz range)

            dibit input                                  dibit output
time (ps)   input (a)       input (b)       input (c)       even parity (yeven)   odd parity (yodd)
            a/     a//      b/     b//      c/     c//      y/even   y//even      y/odd   y//odd
0-50        193.5  194.1    193.5  194.1    193.5  194.1    193.5    194.1        194.1   193.5
51-100      193.5  194.1    193.5  194.1    194.1  193.5    194.1    193.5        193.5   194.1
101-150     193.5  194.1    194.1  193.5    193.5  194.1    194.1    193.5        193.5   194.1
151-200     193.5  194.1    194.1  193.5    194.1  193.5    193.5    194.1        194.1   193.5
201-250     194.1  193.5    193.5  194.1    193.5  194.1    194.1    193.5        193.5   194.1
251-300     194.1  193.5    193.5  194.1    194.1  193.5    193.5    194.1        194.1   193.5
301-350     194.1  193.5    194.1  193.5    193.5  194.1    193.5    194.1        194.1   193.5
351-400     194.1  193.5    194.1  193.5    194.1  193.5    194.1    193.5        193.5   194.1

table 3 comparative study with previous work

work            simulation of bit    rsoa used   even and odd parity       without extra     dibit logic
                pattern given        or not      generator in one device   control signal    used or not
ref. [26]       no                   no          no                        no                no
ref. [28]       yes                  no          no                        no                no
ref. [33]       yes                  no          no                        no                no
ref. [37]       yes                  yes         yes                       no                no
proposed work   yes                  yes         yes                       yes               yes

7. conclusions

in this communication, rsoa and adm are utilized to design a frequency encoded dibit-based 3-bit parity generator. in the devised design, the input dibit control units pass error-free dibit logic, which decreases bit-error problems and enhances the operational speed, and thus promotes reliable and faithful operation. matlab software is used to simulate and verify the devised design. furthermore, a comparative study with parity generators designed using different nonlinear materials has been conducted. this design is a single device for the even/odd parity generator units and it has no extra control terminal. as a result, the devised design reduces the space occupied by the device and generates even and odd parity simultaneously.
by introducing the dibit-based logic in this design, a high degree of parallelism can be expected. the frequency encoding and dibit-based systems also reduce bit-error problems and enhance the speed of operation in long-range transmission. in the future, we intend to develop a higher-bit parity generator and checker. we also intend to use the proposed design in full adders, cryptographic systems, and the development of binary to gray code converter devices.

references

[1] s. bosu and b. bhattacharjee, "a design of frequency encoded dibit-based half adder using reflective semiconductor optical amplifier with simulative verification", in proceedings of the 4th international conference on devices for integrated circuit (devic 2021), ieee, kalyani, india, 19-20 may, 2021, pp. 175–179. [2] s. kumar, k. boone, j. tuszyński, p. barclay and c. simon, "possible existence of optical communication channels in the brain", sci. rep., vol. 6, no. 1, pp. 1–13, nov. 2016. [3] s. mukhopadhyay, s. dey and s. saha, "photonics: a dream of modern technology", in photonics and fiber optics: foundations and applications, p. 39, 2019. [4] s. ma, a. kotb, z. chen, and n. k. dutta, "all optical logic gates based on two-photon absorption", in photonics north, international society for optics and photonics, vol. 7750, p. 77501l, 2010. [5] a. kotb, "and gate based on two-photon absorption in semiconductor optical amplifier", optoelectron. lett., vol. 9, no. 3, pp. 181–184, may 2013. [6] a. chatterjee and s. mukhopadhyay, "use of optical kerr medium for parametric generation of very low frequency electrical signal", j. opt., vol. 48, no. 4, pp. 582–585, nov. 2019. [7] k. mukherjee and a. raja, "three input nand gate using semiconductor optical amplifier", in proceedings of the ieee vlsi device circuit and system (vlsi dcs), 2020, pp. 142–145. [8] n. k. dutta and z. xiang, optoelectronic devices. world scientific, 2018. [9] s. bosu and b.
bhattacharjee, "all-optical frequency encoded dibit-based comparator using reflective semiconductor optical amplifier with simulative verification", in proceedings of the 4th international ieee conference on devices for integrated circuit (devic 2021), kalyani, india, 2021, pp. 388–392. 40 s. bosu, b. bhattacharjee [10] s. bosu and b. bhattacharjee, "a novel design of frequency encoded multiplexer and demultiplexer systems using reflected semiconductor optical amplifier with simulative verification", j. opt., vol. 50, no. 3, pp. 361–370, june 2021. [11] s. saha and s mukhopadhyay, "all optical frequency encoded quaternary memory unit using symmetric configuration of mzi-soa", opt. laser technol., vol. 131, p. 106386, nov. 2020. [12] b. ghosh, s. hazra, n. haldar, d. roy, s. n. patra, j. swarnakar and s. mukhopadhyay, "a novel approach to realize of all optical frequency encoded dibit based xor and xnor logic gates using optical switches with simulated verification", opt. spectrosc., vol. 124, no. 3, pp. 337–342, march 2018. [13] k. maji, k. mukherjee and a. raja, "an alternative method for implementation of frequency-encoded logic gates using a terahertz optical asymmetric demultiplexer (toad)", j. comput. electron. vol. 18, no. 4, pp. 1423–1434, dec. 2019. [14] p. k. nahata, a. ahmed, s. yadav, n. nair and s. kaur, "all optical full-adder and full-subtractor using semiconductor optical amplifiers and all-optical logic gates", in proceedings of the 7th international conference on signal processing and integrated networks (spin), 2020, pp. 1044–1049. [15] n. nair, s. kaur and h. singh, "all-optical ripple carry adder based on soa-mzi configuration at 100 gb/s", optik, vol. 231, p. 166325, apr. 2021. [16] k. mukherjee, k. maji and m. k. mandal, "design and analysis of all-optical dual control dual soa tera hertz asymmetric demultiplexer based half adder", opt. quantum electron., vol. 52, no. 9, pp. 1–15, sept. 2020. [17] p. shanmugapriya, m. margarat and a. 
facta universitatis series: electronics and energetics vol. 34, no 3, september 2021, pp. i-ii © 2021 by university of niš, serbia | creative commons license: cc by-nc-nd guest editorial the field of microelectronics and nanoelectronics devices, circuits and systems including mems and nems plays a tremendous role in virtually every facet of today’s life. practically there are no persons whose quality of lives is not being influenced by it. consequently, there is an enormous increase of the number of novel solutions and applications and an explosive development of related sophisticated production and testing technologies, practical devices and equipment. with the idea to spread knowledge in this key area of engineering and technological knowledge, the department of electronics and communication engineering, national institute of technology silchar, assam, india in association with ieee electron devices (ed) nit silchar student branch chapter and ieee kolkata section nanotechnology council (ntc) chapter organized the international virtual conference on micro/ nanoelectronics devices, circuits and systems (mndcs-2021) from 29-31 jan 2021. the objective was to promote advanced research and developments in the areas of microelectronics, nanoelectronics, semiconductor devices, vlsi circuits and systems through keynotes, invited talks, and oral presentations in the relevant areas. here, we present a special section of the journal facta universitatis series: electronics and energetics dedicated to the papers based on the selected outstanding presentations from the virtual conference mndcs-2021. none of the papers was previously published anywhere, including any kind of conference proceedings, printed or electronic.
the articles published here were specifically written for this special section based on the corresponding oral presentations given at the mndcs-2021 and were all subjected to a rigorous peer reviewing procedure, typically by three reviewers from various countries each. six papers altogether were selected for this special section; however, the processing of one of them had to be postponed due to the authors’ problems with covid-19, and it is currently under consideration for one of the future issues of facta universitatis series: electronics and energetics. the remaining five chosen articles included in this issue are as follows:

1. rajan singh, trupti ranjan lenka and hieu pham trung nguyen, “analytical study of effect of energy band parameters and lattice temperature on conduction band offset in aln/ga2o3 hemt”
2. girolamo tagliapietra, jacopo iannacci, “a comprehensive overview of recent developments in rf-mems technology-based high-performance passive components for applications in the 5g and future telecommunications scenarios” (invited presentation)
3. ivana jokić, olga jakšić, miloš frantlović, zoran jakšić, koushik guha, “mems resonator mass loading noise model: the case of bimodal adsorbing surface and finite adsorbate amount”
4. lakshmi narayana thalluri, k v v kumar, konari raja sekhar, n bhushana babu d, s s kiran, koushik guha, “damping analysis to improve the performance of shunt capacitive rf mems switch”
5. ravi teja velpula, barsha jain, trupti ranjan lenka, hieu pham trung nguyen, “controlled electron leakage in electron blocking layer free ingan/gan nanowire light-emitting diodes.”

the guest editors hope that the selected papers will serve their purpose and inspire other authors with novel concepts in this rapidly developing and expanding field. their greatest pleasure would be the appearance of new publications inspired by this special section.
the guest editors would like to express their sincerest gratitude to all of the authors who ensured the existence of this special section through their quality contributions. the gratitude also extends to the organizers of the mndcs-2021, who assembled such a choice group of world-class researchers, as well as to fuee editor-in-chief prof. danijel danković and the late member of the serbian academy of sciences and arts prof. ninoslav stojadinović, who ensured the very existence of this special section.

guest editors:
dr. zoran jakšić, full research professor, institute of chemistry, technology and metallurgy, national institute of the republic of serbia – university of belgrade, serbia; associate editor, facta universitatis series: electronics and energetics
dr. trupti ranjan lenka, assistant professor – national institute of technology silchar, assam, india; organizing chair, mndcs-2021

facta universitatis series: electronics and energetics vol. 31, no 4, december 2018, pp. 585-598 https://doi.org/10.2298/fuee1804585t

an augmented reality system for improving health and safety in the electro-energetics industry

dušan tatić
faculty of electronic engineering, university of niš, niš, serbia

abstract. occupational safety has a crucial role in every technological process in industry environments. recently, smart mobile devices have become standard hardware that can help inform workers about their duties and procedures during work. in this paper, we present an augmented reality (ar) system for mobile devices as a tool for safeguarding health and safety, and the secure performance of tasks in a technological process by following virtual instructions in the workplace. in a case study, we explored the task procedures and defined the risk factors in the electro-energetics industry. based on that, we implemented a corresponding ar system that should be used to issue occupational safety and work instructions to workers during task execution.
with that aim in mind, we designed a client-server architecture to project the related instructions on the screen of the mobile device, to ensure that their implementation is confirmed, and to keep a record of all the steps the worker performed. as an illustrative example, we present the application of the designed ar system to particular tasks in electro-energetics industrial plants.

key words: augmented reality, occupational safety, mobile systems, industry.

received january 12, 2018; received in revised form june 18, 2018
corresponding author: dušan tatić, faculty of electronic engineering, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: dule_tatic@yahoo.com)

introduction

smart mobile devices, such as mobile phones and tablet pcs, have become important tools for resolving many tasks in the daily work routine in many areas. in industry especially, there are many technological processes whose complexity demands a high level of knowledge and expertise from workers and imposes considerable challenges in preserving occupational safety. the diversity of the devices and equipment that are parts of the technological process requires detailed knowledge of the specific elements involved in the process, and of the safety measures that have to be observed and performed strictly as defined by various regulations.

in such situations, smart devices offer a convenient way to quickly provide the relevant information necessary for resolving a particular task with the required occupational safety measures. the imperative is that the information should be presented in an intuitively clear way, to allow the simple implementation of related instructions. accordingly, the fast search for relevant information depends on the technology implemented in a particular smart device. also, the presentation of such information, in terms of task completion, should be simple and clear enough.
the interface between a user and the real world is provided by the relatively new and extremely fast-developing technology of augmented reality (ar). this technology gives the user augmented information by mixing, in real time, the real image captured by the camera of the mobile device with virtual content prepared to explain the context of the captured real space [1]. particular applications of ar have been developed for industry purposes, where this technology is primarily used to give assistive information to the worker [2, 3, 4, 5, 6] in fields such as maintenance and repair of various devices and systems [7, 8, 9], manufacturing and assembling [10, 11], collaboration, management and product design [12, 13, 14], and training procedures [15, 16]. considerably fewer research efforts are devoted to the usage of ar in improving occupational safety in industry [17, 18, 19]. even less research in this area addresses human resource management. this observation motivated the author's project, which is focused on applying ar to combined human resources management and occupational safety improvement in industry environments [20, 21].

in this paper, we extend our research in the same direction, that is, towards further applications. the goal is to show that the same principles and the corresponding system as in [21] can equally be applied in another industry environment, with different tasks that have to be performed under considerably stricter occupational safety demands. for this research, we chose the electro-energetics industry in order to test our system for providing safety and work instructions by taking into account both the complexity of the task that needs to be performed and the expertise of the worker to whom the task is assigned. compared to the previous work reported in [21], the main differences and new issues addressed in the present considerations can be summarized as follows.
since the work procedures in the electro-energetics industry are very strictly formulated, in the present research we focus on preventing various exceedance situations which occur due to requirements to work with high voltage and strong current installations and systems. another aspect we address in the present paper is an extension of the existing methods towards the usage of cross-platform tools like the easyar sdk [22], which is a tool for creating augmented reality mobile applications for both ios and android platforms. in the previous system, vuforia [23] was used instead of easyar for marker tracking. further, the symfony framework [24] is used on the server side instead of the native php used in the previous work. this framework enables the implementation of a more sophisticated web service that can handle data stored on the server side. also, it can serve applications based on different mobile platforms. regarding implementation issues, the difference with respect to the previous system [21] is that in the present case we include two image plates instead of a fiducial marker to test the proposed solution for the ar system. this way, the worker can easily find the corresponding object of recognition and be informed about tasks in a more intuitive way, without devoting much time to searching for it.

improvement of the previous system

previous work was presented in [21] as an original solution in the design of an augmented reality system for the implementation of occupational safety instructions in an industry environment. this solution elaborates the architecture of the system and its application at the thermal plant ugljevik. the main goal was to reduce risk factors and prevent injuries by putting ar markers on the appropriately selected parts of the machine.
in this way, by using the ar system, workers are led step by step through the safety and work procedures during task execution. related to that, in the present considerations, we improve the system by reducing the number of markers and by giving the information about the task execution before the job starts. we enable cross-platform solutions on the server side as well as on the client side. in this paper, the architecture of the ar system is discussed in detail, as well as the system realization specification. since only the basic implementation issues are presented in [21], here we also present the implementation details, because more complex and upgraded system components are used. in particular, on the server side we implemented a rest service with the symfony framework instead of using native php for the database connection. the service makes the server side more universal and independent of changes to the client application. this independence gives us the ability to use different mobile technologies, such as android or ios, on the client side. our goal was to use a cross-platform solution for the client part, such as the unity game engine, which provides good support for augmented reality technology. to show the independence and modularity of the system, we changed the ar module on the client side. this shows that the system architecture remains the same when system components and modules are changed. on the client side, instead of using vuforia for marker recognition, we used easyar. easyar provides us with an image tracking solution that is used for recognition of image plates in industrial systems. in the previous work, multiple markers were arranged over the different parts of the machine, but in the current project we reduce their number.
the markers provide the information at the exact place in the industry environment to which it is related, instead of through classical mobile application elements such as ordinary lists. instead of markers, in the present version of the system, we use two image plates for recognition of safety and work instructions, respectively. placing two image plates next to each other at a very visible place reduces the time for finding markers. further, if a plate is damaged, it is easier to replace that plate. the image plates are robust in the sense that they can still be recognized even when damaged to a reasonable extent. recall that fiducial markers are very sensitive to damage. by changing the type of industrial environment, we wanted to show that the concept proposed in [21] can be used in different fields of industry. in this paper, the case in point is the electro-energetics industry environment, where protecting electrical installations from stress is a different kind of occupational safety issue than the risk of injury when working on mechanical machines, which was considered in our previous work [21]. this change of industry environment is intended to show that the system can be easily modified and then used in different exploitation circumstances. we will elaborate the whole system and its application to the electro-energetics industry in detail. the next section will provide a discussion about the advancements in the usage of the ar system.

formulation of the problem

work in the electro-energetics industry requires specially trained professionals because of numerous hazardous situations that can happen during everyday work. as a result, before beginning any task, special attention is paid to the safety measures. these measures are aimed at reducing the risk of injury, and hence at reducing the time and costs caused by improper handling or behavior in the industry space.
based on these measures and the respective rules, special instructions, which also include a description of the specific equipment parts, are issued to workers working under high voltage conditions. although carefully formulated and strictly prescribed rules are issued before any task, hazardous situations can happen. in the case of well-trained and experienced workers, mistakes occur due to their overconfidence. because of the monotony of any daily job routine, they often skip prescribed safety rules issued by the occupational safety officer. this can cause unexpected consequences that can lead to serious injuries, heavy defects in the industry space, and a time delay in completing the task.

table 1 safety and work instructions for changing the circuit breaker in an electrical substation

step | safety instructions | work instructions
1 | check the protective equipment. | check the tool and its correctness.
2 | disconnect installation from voltage source. | installation of the insulating substrate. turn off the main switch.
3 | lock-off main switch and tag-out the locked installation. | checking the absence of voltage.
4 | ground and short circuit. | circuit breaker replacement. locate and repair the faulty circuit breaker.
5 | enclose from parts under voltage. | check the circuit breaker voltage at the output. after installing a new circuit breaker, turn on the system again.

sometimes the situation dictates that less trained and inexperienced workers have to undertake relatively complex tasks. in this case, they are exposed to risk if the proper information about the equipment and related safety instructions is not provided in a clear and easy to perceive manner. the current job routine follows a certain flow of specific procedures. first, the safety officer informs the workers about safety measures that have to be applied before their daily activities. after that, the technology officer assigns the tasks that have to be done.
every step of each task is defined in the work manual for a concrete job. work manuals are usually provided in the form of a sheet of paper with the text and drawings for each safety and work instruction. as an illustration, table 1 shows the instruction steps for the safety rules on low voltage installations and the work instructions for changing a circuit breaker in a typical electrical substation.

augmented reality in solving the problem

augmented reality is recognized as a technology which can improve the way of issuing work and safety instructions for a given task. this improvement includes direct interaction with the particular industry space through the camera of the mobile device, with additional virtual information about the necessary safety rules being projected. by recognizing specifically provided and conveniently positioned ar markers, the system projects safety and work instructions related to a concrete task and the equipment, as well as the tools to be used. as shown in table 1, the safety instructions are projected first, and after their implementation is confirmed, the work instructions are projected.
displaying ar instructions has multiple goals:
• keep the attention of the workers to ensure they apply all the necessary steps during their work, by confirming the implementation of each issued instruction,
• give more precise visual information directly at the work place, which is supposed to be considerably more detailed and appropriate than the information which can be read in a hard copy printed instruction manual,
• save the time of the safety officers in explaining the safety instructions, which is especially important when different tasks on different types of equipment should be assigned to a large group of workers,
• keep the history of all the examined and realized task instructions in the database on an external server for subsequent analysis and monitoring of the job done.

the organization of the ar content to be projected follows the working procedures and related safety rules. it is assumed that, as is customary practice, the data about each worker's education level, specific professional training, and work experience are provided as a part of his personal data record. the amount of information to be projected is determined for each worker individually according to his educational level, professional skills and expertise. the amount of information content is selected by the technology and occupational safety officers during the assignment of tasks. therefore, highly qualified workers will get a reduced number of instructions focused on specific details of the working procedure and related safety measures, while general information will be omitted, assuming that the worker is already well familiar with it. for less qualified workers, complete instructions will be issued in order to finish the job in a proper and safe manner.
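the selection rule described above, complete instructions for less qualified workers and a reduced set for highly qualified ones, can be illustrated with a short python sketch. this is only an illustration under stated assumptions, not the implementation used in the system: the `Instruction` class, its `general` flag and the `highly_qualified` argument are hypothetical names introduced here, and the sketch assumes that the officers mark each instruction either as general information or as a task-specific detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instruction:
    order: int      # execution order within the task
    kind: str       # "safety" or "work"
    title: str
    general: bool   # hypothetical flag: True for general information


def select_instructions(instructions, highly_qualified):
    """Return the instructions to project for one worker.

    Assumption: highly qualified workers skip general information,
    while all other workers receive the complete list.
    """
    ordered = sorted(instructions, key=lambda item: item.order)
    if highly_qualified:
        return [item for item in ordered if not item.general]
    return ordered
```

in such a scheme the officers would only need to maintain one complete instruction list per task; the reduced list is derived from it at assignment time.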
ar system for the electro-energetic industry

in this section we describe our ar system, which can be used to overcome the abovementioned problems regarding the proper implementation of work and safety instructions related to various tasks in the electro-energetic industry. it should be noted that the proposed system can easily be modified and made applicable to various other branches of industry.

organization of the system

the system is organized as a client-server architecture, as shown in fig. 1. the ar-based system presents safety and work instructions in an electro-energetics environment where working under high voltage and strong currents is typical practice. the main role of the system, together with this particular requirement, determines its structure.

the server side consists of a database, which stores data about users and tasks, a multimedia repository containing virtual instructions in various multimedia formats, and a rest service [25], which represents the software architecture that controls the data flow during the client-server communication using a predefined communication format. these data transfers take place between the tasks stored on the server and the workers using the client application.

the client side is realized as a software tool that consists of the communication (cm) module, which sends data to and receives data from the server; the task (tm) module, implemented as a data structure that stores information about the current task of the worker and its safety and work instructions; the user interface (ui) module, which allows the user to navigate; and the augmented reality (ar) module, which provides task instructions to the worker.

application of the ar system

fig. 1 illustrates the basic functionality and application of the ar system.

fig. 1 the block diagram of the system

the system starts when the worker logs in using his identification data and password.
these data are collected by the user interface module and forwarded to the communication module. the communication module prepares the login data in the proper format and sends them to the server via the rest service. based on these data, the rest service checks whether the user is defined in the list of workers. upon successful authentication, the rest service searches the database for the task that is assigned to the user. all of the tasks are integrated into the database and are accompanied by a list of related safety and work instructions. every instruction consists of descriptive elements and linked multimedia material that is stored in the multimedia repository. when the task data are found in the database, the rest service prepares the data in a format readable by the client application.

the communication module receives and processes the data from the server side. upon processing, the data are stored in the task module and are ready to be sent to the user interface and the ar module. the parameters of one instruction at a time are sent to these two modules. when the current instruction is completed, the next one is taken from the task module. the user interface module shows data that explain to the worker how to use the ar module. the explanation involves images and text in order to better guide the worker. at the same time, the ar module is activated with parameters about the object recognition and the multimedia material that is shown during ar tracking. upon tracking, an interactive checking system is projected in order to confirm that the worker has seen the instructions. confirmation allows the next instruction to be received from the task module. after confirmation of all the task instructions, the data are prepared by the communication module to be sent to the server. on the server side, the rest service receives the data and stores them inside the database.

fig. 2 architecture of the system

server

the server stores all the data necessary for the application of the ar system. the data are stored inside the database and the multimedia repository. the transfer of data between the database and the client application is realized through the rest service. in the following subsections, all of the necessary elements of the server side are described. these server elements are shown in fig. 2.

database

the database consists of the list of workers, list of tasks, list of instructions, list of realized tasks, and list of realized instructions. the list of workers contains data for authentication, such as the username and password. also, for each worker, information about his qualifications and level of expertise is stored. the list of tasks contains all the necessary data that describe possible tasks which could be assigned to the workers. this list is connected with the list of workers to determine a specific task for each worker. also, the list of tasks is linked with the list of instructions to regulate the safety and work instructions for each task assigned to a worker. the list of instructions describes two instruction types, safety and work instructions. safety instructions consist of data given by the safety expert, while work instructions are defined by the technology officer. each instruction has an execution order number, a title, a description and a defined target for recognition by the ar module. also, with the target information, a link is provided that points to the virtual material in the multimedia repository which is used by the ar module during tracking. the list of realized tasks stores the job data for each worker when he confirms completion of a task. parameters about the date, time, type of the task, and the person who realized it are saved in this list. the list is linked with the list of realized instructions for more details about the finished task.
the list of realized instructions contains data about the realization of each instruction. the stored data are the start time, the completion time, and the duration of the work.

rest service

the rest service is used to enable communication between the client application and the server side. authentication and authorization are done through the rest service, so that the worker can get all the data necessary for task realization. when the service accepts the worker's username and password, it compares them with the data in the list of workers. upon successful authentication, the service determines the task for the worker from the list of tasks. based on the assigned task, the corresponding safety and work instructions are read from the list of instructions. the data collected and prepared to be sent to the client application are:
• token for worker authorization,
• id of the worker,
• parameters of the task: id, description and title,
• parameters of each safety and work instruction: id, title, description, order number, ar marker, and link to the multimedia file.

the client side accepts and processes the data, after which they are ready to be used by the worker. when the worker completes all the safety and work steps, the data are sent to the rest service. the data sent back to the server to be stored in the list of realized tasks and the list of realized instructions are:
• id of the worker,
• parameters of the task: id, start date, finish date, and duration of the task,
• parameters of each instruction: id, start date, finish date, and duration of the work.

multimedia repository

the multimedia repository is used to store the virtual material related to the safety and work instructions. each task has a separate folder where virtual material is stored.
the virtual material is available in the form of a 3d model and a video file for instructions, or an image for the description of a part of the instruction. video files are streamed to the client application during marker recognition. they are recorded in the mp4 file format with the h.264 video codec. the video resolution is 720p with a bit rate of 3000 kbps. the sound is recorded in aac format with a sampling frequency of 44.1 khz and a bit rate of 64 kbps, because only the narrative voice is recorded. the 3d model file format depends on the number of polygons used. the 3ds format provides better file compression for 3d models with fewer than 65000 polygons. the fbx format is used for 3d models with a larger number of polygons. these models are included in a unity [26] asset bundle, from which they are downloaded and used on the client side after the service call is received. all multimedia materials are linked to the client application through the links located in the list of instructions in the database. if necessary, the application downloads the material from the specified link and connects it to the application modules that will use it. for example, the augmented reality module downloads or streams the video material from a particular link and displays it during ar marker recognition.

client side

the architecture of the client side is determined by the mobile application, which is defined by the four modules explained below (fig. 2).

communication module

this module allows the application to communicate with the server side in order to access data in the database. also, this module prepares the data in the predefined format for sending. for the worker to log in to the ar system, the username and password must be sent. upon successful login, the task data obtained from the rest service are parsed in this module and sent to the other modules for further execution.
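the parsing step performed by the communication module can be sketched as follows. the sketch is written in python for brevity, whereas the client described here is built with unity; the json encoding, the key names and the `TaskData` and `InstructionData` classes are assumptions made for illustration only, since the text specifies which fields are exchanged but not their format.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class InstructionData:
    id: int
    kind: str         # "safety" or "work"
    title: str
    description: str
    order: int        # order number of the instruction
    ar_marker: str    # identifier of the image target to recognize
    media_url: str    # link to the file in the multimedia repository


@dataclass
class TaskData:
    token: str        # token for worker authorization
    worker_id: int
    task_id: int
    title: str
    description: str
    instructions: List[InstructionData]


def parse_task_response(body: str) -> TaskData:
    """Parse a task response; all key names are assumed, not from the paper."""
    raw = json.loads(body)
    instructions = [
        InstructionData(
            id=item["id"],
            kind=item["type"],
            title=item["title"],
            description=item["description"],
            order=item["order"],
            ar_marker=item["ar_marker"],
            media_url=item["media_url"],
        )
        for item in raw["instructions"]
    ]
    # Keep the instructions in execution order for the task module.
    instructions.sort(key=lambda item: item.order)
    task = raw["task"]
    return TaskData(
        token=raw["token"],
        worker_id=raw["worker_id"],
        task_id=task["id"],
        title=task["title"],
        description=task["description"],
        instructions=instructions,
    )
```

keeping the parsed result in plain data classes mirrors the role of the task module as a data structure that the user interface and ar modules read from.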
after the task is completed and confirmed, the data are prepared by this module to be sent to the rest service, which transmits these data to the database.

task module

the goal of the task module is to store the data downloaded from the server and to forward them to the other modules for execution. data from the server are stored in the list of safety and work instructions. for each instruction, data such as the name, description, serial number, marker, link to the multimedia material, and employee qualifications are stored. for displaying instructions to the workers for task execution, data are transferred to the user interface module. the ar module is supplied with data concerning the tracking, ar marker, and multimedia material. when an instruction is confirmed, this structure records data such as the start time, finish time and the time spent executing the instruction. when the last instruction is implemented, the executed task data are sent to the communication module, which prepares them and sends them via the rest service to the server database.

user interface

the user interface module serves to define the visual elements of the application. this takes into account the application layout elements and their appearance. the login form serves to log a worker into the system. the login form is connected with the database through the communication module. user data are forwarded to this module when the worker fills in the form. after the login has been completed, the necessary data are transferred from the server into the mobile application, and the description part of the user interface module is projected. the description serves to show the information about the current instruction execution. in order to simplify the search for the ar marker in the environment and the workplace, a help function is implemented to display the image of the active ar marker and explanatory text on the mobile screen.
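the timing data that the task module records for each confirmed instruction, and the completion report built from them, can be sketched as below. this is a hedged python illustration: the `TaskLog` and `InstructionRecord` classes, the dictionary keys and the use of iso 8601 timestamps are assumptions introduced here, while the reported fields (worker id, task id, start, finish and duration per task and per instruction) follow the description of the data sent back to the rest service.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Optional


@dataclass
class InstructionRecord:
    instruction_id: int
    started: datetime
    finished: Optional[datetime] = None


class TaskLog:
    """Collects per-instruction timing for one worker and one task."""

    def __init__(self, worker_id: int, task_id: int) -> None:
        self.worker_id = worker_id
        self.task_id = task_id
        self.records: Dict[int, InstructionRecord] = {}

    def start(self, instruction_id: int, now: datetime) -> None:
        # Called when an instruction becomes the active one.
        self.records[instruction_id] = InstructionRecord(instruction_id, now)

    def confirm(self, instruction_id: int, now: datetime) -> None:
        # Called when the worker confirms the instruction.
        self.records[instruction_id].finished = now

    def report(self) -> dict:
        """Build the completion report; fails if a step is unconfirmed."""
        done = list(self.records.values())
        if not done or any(r.finished is None for r in done):
            raise ValueError("all instructions must be confirmed first")
        start = min(r.started for r in done)
        finish = max(r.finished for r in done)
        return {
            "worker_id": self.worker_id,
            "task": {
                "id": self.task_id,
                "start": start.isoformat(),
                "finish": finish.isoformat(),
                "duration_s": (finish - start).total_seconds(),
            },
            "instructions": [
                {
                    "id": r.instruction_id,
                    "start": r.started.isoformat(),
                    "finish": r.finished.isoformat(),
                    "duration_s": (r.finished - r.started).total_seconds(),
                }
                for r in done
            ],
        }
```

passing the current time in explicitly, rather than reading the clock inside the class, keeps the sketch easy to check with fixed timestamps.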
the checking system is used to control instruction delivery in the proper order. each safety instruction is followed by a work instruction. before a work instruction can be processed, the preceding safety instruction must be confirmed. the checking system is activated when the ar tracking is finished in order to confirm that the current step is completed. two steps of confirmation are implemented to prevent accidental validation of the given instruction. after a positive confirmation, the next instruction in the sequence is read. thus, after each confirmed safety instruction, the next work instruction is issued and the ar module is activated.

ar module

the ar module is the core component for user interaction with the real world. it is designed to project safety and work instructions over elements in the industry space. each instruction stored in the task module has a description, corresponding virtual material, and parameters for the ar marker. the ar module in this application is realized by using the easyar sdk. the main idea is that the worker is obliged to use ar technology to inform himself about safe and efficient job realization before entering the substation. the application of the augmented reality technology implies the recognition of two plates attached at the entrance of the electro-energetic substation. during the recognition process only one image marker is active at a time, and it corresponds to the active instruction. safety instructions are obtained by tracking the first plate image, while work instructions are obtained by tracking the second plate image (fig. 3).

fig. 3 augmented reality image markers. above is the safety sign (english translation of the text: caution! check the safety procedures.); below is the working sign (english translation of the text: caution! check the safety procedures!)
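the ordering rule of the checking system can be illustrated with a minimal sketch. it is not the author's implementation (the ar system itself is built with unity); the class name, the method names and the choice of a check box followed by a button as the two confirmation steps are assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CheckingSystem:
    """Hypothetical sketch: issues instructions strictly in order and
    advances only after a two-step confirmation."""
    instructions: List[str]      # alternating safety / work instructions
    index: int = 0               # position of the active instruction
    box_checked: bool = False    # state of the first confirmation step

    def current(self) -> Optional[str]:
        """Return the active instruction, or None when the task is finished."""
        if self.index < len(self.instructions):
            return self.instructions[self.index]
        return None

    def tick_checkbox(self) -> None:
        """First confirmation step."""
        self.box_checked = True

    def press_confirm(self) -> bool:
        """Second confirmation step; ignored unless the first step was done."""
        if not self.box_checked or self.current() is None:
            return False
        self.index += 1          # the next instruction (and its marker) becomes active
        self.box_checked = False
        return True


checker = CheckingSystem(["safety 1", "work 1", "safety 2", "work 2"])
skipped = checker.press_confirm()    # confirm pressed without ticking the box
checker.tick_checkbox()
advanced = checker.press_confirm()   # both steps done, sequence advances
```

requiring both steps before the index moves is what prevents an accidental tap from validating an instruction.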
as soon as the ar module starts, the user interface shows a description over the camera image of the real world. the worker is instructed to find the marker at the exact place in the industry environment. when the ar marker is recognized, this description vanishes from the screen, and the virtual instructions are shown at the place of the recognized marker. in this way the worker is informed about the current task by the ar system directly at his workplace. various multimedia formats can be projected over the ar marker: a 3d model, a video, or an image with text. only the ar marker related to the current instruction is active at a time, so the other markers cannot be recognized by this module. when ar tracking stops, the checking system is shown on the display. the worker then goes through the double check system to get the next ar marker for tracking and thus a new instruction. if he accidentally stops tracking and the checking system is displayed, the ar module is still active in the background. accordingly, the worker can point the device camera at the active marker again. upon recognition of the active ar marker, the checking system vanishes from the screen and the virtual instruction is projected during the tracking.

implementation

the ar system is implemented using the unity engine, while the server part is realized with the symfony framework. the validation and verification of the ar system were based on the safety rules for low voltage installations and the work instructions for changing the circuit breaker in an electrical substation, described in table 1. these instructions are intended for workers who are involved in the maintenance of medium and low voltage electrical substations. table 1 consists of five safety and work instructions that are needed to ensure proper and safe operation.
each instruction contains a single video file that is displayed through the augmented reality technology implemented in the ar system. in order to use the ar system, an internet connection is required. it is used to download a task from the server and to stream multimedia material from the multimedia repository. an internet connection is also required to send information about the completed task for entry into the database. if there is no internet connection at the workplace, it is necessary to download the content of the task before the job starts. the material can be downloaded at the office before the worker goes into the field. likewise, after completing his duties, the worker must access the internet, for example from the same office, to send information about the completed task. upon registration, the system requires the worker to recognize the first image marker to get the first safety instruction from table 1. fig. 4 shows the screen of the mobile device through which the worker is instructed where to find the first image marker and to point the camera towards it.

fig. 4 augmented reality searching instructions (english translation of the text above the image: find the plate from the image and point the camera towards it)

when the marker is recognized, the worker gets the first safety instruction in the form of a video instruction during tracking. fig. 5 shows the augmented reality projection of the video material for the first safety instruction, which concerns checking the protective equipment.

fig. 5 augmented reality instruction for checking the protective equipment

after the tracking, the worker performs verification through the checking system implemented in the ar system (fig. 6). the first step of verification is ticking the check box to verify that he has checked the equipment according to the safety instruction. the second step of verification is pressing the confirmation button.
then, the user is required to find the second image marker, representing the next work instruction. after this instruction is verified, the system directs the user to go back to the first marker to execute the next safety instruction. in this way the user is guided through all instructions related to the current task.

fig. 6 usage of the checking system for confirmation of an occupational safety instruction (english translation of the text, above: did you check protective equipment?, middle: equipment check, below: yes, check, no)

conclusion

as a continuation of our previous work, on which this paper relies, we present in detail the usage of the ar system in the case of the electro-energetics industry. following the previous methodology, we specified custom problems that concern the daily job routine. based on that, we determined that tasks can be provided through a sequence of virtual safety and work instructions. a client-server architecture for the implementation of such a system has been described. on the server side we used a rest service for issuing task elements to the client application. the client application is organized in several modules, where the central part is the augmented reality module. this module is used for worker interaction with the industry space in order to get virtual instructions at the workplace. the worker has to confirm every step after each instruction. at present it is not possible to test the system on a wide range of tasks in the environment accessible to the author. the proposed system, however, was experimentally tested by a few workers for the case of changing the circuit breaker in an electrical substation. the system was presented to experts working in the area of electro-energetics systems, and comments approving its usage were obtained.
it is expected that a system based on augmented reality technology will be an efficient aid for issuing safety and work instructions.

acknowledgement: the work presented in this paper was supported by the serbian ministry of education and science (project iii 044006). the author is grateful to anonymous reviewers whose constructive comments were useful in improving the presentation in this paper.

references
[1] r.t. azuma, “survey of augmented reality”, presence: teleoperators and virtual environments, vol. 6, no. 4, pp. 355-385, 1997.
[2] j. gimeno, p. morillo, j.m. orduña, and m. fernández, “a new ar authoring tool using depth maps for industrial procedures”, computers in industry, vol. 64, no. 9, pp. 1263-1271, 2013.
[3] g.m. re, j. oliver, and m. bordegoni, “impact of monitor-based augmented reality for on-site industrial manual operations”, cognition, technology & work, vol. 18, no. 2, pp. 379-392, 2016.
[4] m. fiorentino, a.e. uva, m. gattullo, s. debernardis and g. monno, “augmented reality on large screen for interactive maintenance instructions”, computers in industry, vol. 65, no. 2, pp. 270-278, 2016.
[5] x. wang, m. truijens, l. hou, y. wang, and y. zhou, “integrating augmented reality with building information modeling: onsite construction process controlling for liquefied natural gas industry”, automation in construction, vol. 40, pp. 96-105, 2014.
[6] s. benbelkacem, m. belhocine, a. bellarbi, n. zenati-henda, and m. tadjine, “augmented reality for photovoltaic pumping systems maintenance tasks”, renewable energy, vol. 55, pp. 428-437, 2013.
[7] j. zhu, s.k. ong, and a.y.c. nee, “a context-aware augmented reality system to assist the maintenance operators”, international journal on interactive design and manufacturing, vol. 8, no. 4, pp. 293-304, 2014.
[8] c. koch, m. neges, m. könig, and m. abramovici, “natural markers for augmented reality-based indoor navigation and facility maintenance”, automation in construction, vol. 48, pp.
18-30, 2014.
[9] s. henderson, and s. feiner, “exploring the benefits of augmented reality documentation for maintenance and repair”, ieee transactions on visualization and computer graphics, vol. 17, no. 10, pp. 1355-1368, 2011.
[10] a.w.w. yew, s.k. ong, and a.y.c. nee, “towards a griddable distributed manufacturing system with augmented reality interfaces”, robotics and computer-integrated manufacturing, vol. 39, pp. 43-55, 2016.
[11] s. webel, u. bockholt, t. engelke, n. gavish, m. olbrich, and c. preusche, “an augmented reality training platform for assembly and maintenance skills”, robotics and autonomous systems, vol. 61, no. 4, pp. 398-403, 2013.
[12] x. wang, p.e. love, m.j. kim, and w. wang, “mutual awareness in collaborative design: an augmented reality integrated telepresence system”, computers in industry, vol. 65, no. 2, pp. 314-324, 2014.
[13] d.b. espíndola, l. fumagalli, m. garetti, c.e. pereira, s.s. botelho, and r.v. henriques, “a model-based approach for data integration to improve maintenance management by mixed reality”, computers in industry, vol. 64, no. 4, pp. 376-391, 2013.
[14] h. park, and h.c. moon, “design evaluation of information appliances using augmented reality-based tangible interaction”, computers in industry, vol. 64, no. 7, pp. 854-868, 2013.
[15] f. de crescenzio, m. fantini, f. persiani, l. di stefano, p. azzari, and s. salti, “augmented reality for aircraft maintenance training and operations support”, ieee computer graphics and applications, vol. 31, no. 1, pp. 96-101, 2011.
[16] g. westerfield, a. mitrovic, and m. billinghurst, “intelligent augmented reality training for motherboard assembly”, international journal of artificial intelligence in education, vol. 25, no. 1, pp. 157-172, 2015.
[17] s. kim, m.a. nussbaum, and j.l. gabbard, “augmented reality ‘smart glasses’ in the workplace: industry perspectives and challenges for worker safety and health”, iie transactions on occupational ergonomics and human factors, vol. 4, no.
4, pp. 253-258, 2016.
[18] s.a. talmaki, s. dong, and v.r. kamat, “geospatial databases and augmented reality visualization for improving safety in urban excavation operations”, in proceedings of the construction research congress 2010: innovation for reshaping construction practice, 2010, pp. 91-101.
[19] w. kim, n. kerle, and m. gerke, “mobile augmented reality in support of building damage and safety assessment”, natural hazards and earth system sciences, vol. 16, no. 1, pp. 287-298, 2016.
[20] d. tatić, and b. tešić, “improvement of occupational safety systems by the application of augmented reality technologies”, in proceedings of the 23rd telecommunications forum, telfor, 2015, pp. 962-965.
[21] d. tatić, and b. tešić, “the application of augmented reality technologies for the improvement of occupational safety in an industrial environment”, computers in industry, vol. 85, pp. 1-10, 2017.
[22] easyar sdk, https://www.easyar.com/, accessed january 2018.
[23] vuforia sdk, https://www.vuforia.com/, accessed january 2018.
[24] symfony, https://symfony.com/, accessed january 2018.
[25] r.t. fielding, “architectural styles and the design of network-based software architectures”, doctoral dissertation: university of california, irvine, 2000.
[26] unity, https://unity3d.com/, accessed january 2018.

facta universitatis series: electronics and energetics vol. 34, no 3, september 2021, pp.
367-380 https://doi.org/10.2298/fuee2103367j © 2021 by university of niš, serbia | creative commons license: cc by-nc-nd original scientific paper mems resonator mass loading noise model: the case of bimodal adsorbing surface and finite adsorbate amount ivana jokić1, olga jakšić1, miloš frantlović1, zoran jakšić1, koushik guha2 1university of belgrade, institute of chemistry, technology and metallurgy – national institute of the republic of serbia, center of microelectronic technologies, belgrade, serbia 2national mems design centre, department of electronics and communication engineering, national institute of technology, silchar, assam – 788010, india abstract. modeling of adsorption and desorption in microelectromechanical systems (mems) generally is crucial for their optimization and control, whether it is necessary to decrease the adsorption-desorption influence (thus ensuring stable operation of ultra-precise micro and nanoresonators) or to increase it (and enhancing in this manner the sensitivity of chemical and biological resonant sensors). in this work we derive and use analytical mathematical expressions to model stochastic fluctuations of the mass adsorbed on the mems resonator (mass loading noise). we consider the case where the resonator surface incorporates two different types of binding sites and where non-negligible depletion of the adsorbate occurs in a closed resonator chamber. we arrive at a novel expression for the power spectral density of mass loading noise in resonators and prove the necessity of its application in cases when resonators are exposed to low adsorbate concentrations. we use the novel approach presented here to calculate the resonator performance. in this way we ensure optimization of these mems devices and consequentially abatement of adsorption-desorption noise-caused degradation of their operation, both in the case of micro/nanoresonators and resonant sensors. 
this work is intended for general use in the design, development and optimization of different mems systems based on mechanical resonators, ranging from rf components to chemical and biological sensors.

key words: adsorption, mass loading noise, langevin equation, power spectral density, resonator

received march 15, 2021; received in revised form july 07, 2021
corresponding author: ivana jokić university of belgrade, institute of chemistry, technology and metallurgy, national institute of the republic of serbia, center of microelectronic technologies, 11000 belgrade, serbia e-mail: ijokic@nanosys.ihtm.bg.ac.rs
* this paper is loosely based on the authors’ invited presentation from the 1st international conference on micro/nanoelectronics devices, circuits and systems (mndcs 2021), 29-31 january, 2021, in silchar, assam, india. [1]

1. introduction

compared to the conventional mechanical resonators, micromechanical resonant structures manufactured by mems (microelectromechanical system) or nems (nanoelectromechanical system) technologies have many favorable features. these include their extremely compact dimensions of the order of micrometers or nanometers, the technological compatibility with active electronics (and thus vastly facilitated integration), low power consumption, an extended range of available resonant frequencies (even reaching the gigahertz range), high reliability and batch production with a high yield and a very low cost per unit element. this makes them convenient for applications in electronics, as frequency defining units of miniature oscillators and frequency selective circuits [1-4]. additionally, the easy adjustability of their parameters makes them promising candidates for tunable passive components, which further enables reconfigurability, a higher integration degree and miniaturization of radio-frequency circuits [3].
mems/nems resonators are also being developed for various sensing applications and represent the basic components of highly sensitive resonant sensors of mass, force, acceleration, temperature, as well as the concentration of various chemical substances and biological agents [5-8], often being based on micro- and nanocantilevers. smaller device dimensions lead to an increase of the surface-to-volume ratio, which boosts their sensitivity to some physical processes whose effects are negligible in larger structures. among such processes are adsorption and desorption (ad) of particles from the surrounding medium, which occur at the surface of the structure and result in varying amounts of mass being added to the native resonator mass, thus changing the resonant frequency of the device. this process enables the operation of ad-based chemical and biological sensors; in other resonant devices, however, it is undesirable. the added mass randomly fluctuates due to the stochastic nature of the ad processes. in resonators, these fluctuations are known as mass loading noise or resonator ad noise [9]. they contribute to the total frequency noise of the resonator, together with other noise sources (including temperature fluctuations, outgassing, brownian motion, johnson (thermal) noise, drive power and self-heating, random vibration, flicker (1/f) noise, etc. [10-12]), and therefore degrade the performance of electronic devices and sensors of which the microresonator is an integral part. it is particularly important to analyze this kind of fundamental noise, since the analysis allows for the optimal resonator design and optimization of the operating conditions, which consequently ensures lower noise levels and, accordingly, minimizes signal degradation in electronic circuitry and improves detection limits in sensors.
based on these facts, there is a need to establish a mathematical model of the particle binding-unbinding process at the resonator surface that is as accurate as possible, enabling the analysis of mass loading noise. numerous models of mass loading noise in microresonators, nanoresonators and other micro- and nanostructures, applicable to different practical situations, can be found in the literature [9-10, 13-16]. a majority of them assume the existence of only one type of binding sites on the adsorbing surface, so that the adsorption and desorption of one species is characterized by one pair of adsorption and desorption rate constants, and most often it is described by the langmuir model. however, real surfaces are usually not uniform; each of them is characterized instead by different structural, morphological or chemical features, which results in a difference between its surface adsorption sites. if this is the case, the ad process is characterized by some spatial distribution of ad rate constants across the surface. the simplest type of such surfaces has two distinct kinds of adsorption binding sites, so that there is a bimodal surface affinity for the particles of a given species. in that case the kinetics of ad processes and resonator noise are described by models that imply a bi-langmuir ad process [17, 18]. in certain cases it is necessary that the models of the ad process and noise include some additional effects which may be of importance. one such effect is the depletion of adsorbate particles in the resonator chamber due to adsorption, which may be significant, especially at low adsorbate concentrations and for small dimensions of the resonator chamber, as described in [19]. the joint effect of the two phenomena on the ad process kinetics has been analyzed in [20], while a model of the dynamics of the equilibrium fluctuations, i.e. of ad noise in the frequency domain, has not been established yet for such a case.
here we present a new mathematical model of mass loading noise, which is more comprehensive than the mass loading noise models previously published in the literature, since it simultaneously takes into account the bimodal surface affinity and the decrease of the analyte concentration in the chamber containing the resonator, caused by the particle adsorption-desorption processes and the finite dimensions of the chamber. in this way, the model can closely approximate the conditions that exist in practice when resonators are exposed to low adsorbate concentrations.

2. mass loading noise modeling

a bimodal adsorbing surface (illustrated in fig. 1) implies the existence of two types of binding sites, which differ in their affinity for binding adsorbate particles from the ambient. the adsorption-desorption of adsorbate particles on such a surface is characterized by two pairs of adsorption and desorption rate constants, $(k_{fv1}, k_{r1})$ and $(k_{fv2}, k_{r2})$, corresponding to the sites of type 1 and 2, respectively. if $N_{a1}$ and $N_{a2}$ are the numbers of binding sites of the two types on the resonator surface, and $N_{0t}$ is the number of particles in the resonator chamber at the moment $t$, the time evolution of the numbers of particles adsorbed at the sites of the first type, $N_1$, and of the second type, $N_2$, is determined by the equations

$$\frac{dN_1}{dt} = k_{fv1} N_{0t} (N_{a1} - N_1) - k_{r1} N_1 \tag{1}$$

$$\frac{dN_2}{dt} = k_{fv2} N_{0t} (N_{a2} - N_2) - k_{r2} N_2 \tag{2}$$

it is assumed here that only one adsorbate particle can be bound to a single site of any kind, and that adsorbate particles do not interact among themselves. in a closed chamber, $N_{0t}$ changes over time due to the ad process, and it equals $N_{0t} = N_0 - N_1 - N_2$, where $N_0$ is the number of particles in the chamber at the moment $t = 0$. then, eqs. (1) and (2) constitute the system of nonlinear differential equations [20]

$$\frac{dN_1}{dt} = k_{fv1} (N_0 - N_1 - N_2)(N_{a1} - N_1) - k_{r1} N_1 = a_1 - d_1 \tag{3}$$

$$\frac{dN_2}{dt} = k_{fv2} (N_0 - N_1 - N_2)(N_{a2} - N_2) - k_{r2} N_2 = a_2 - d_2 \tag{4}$$

which is solved numerically for $N_1$ and $N_2$.

fig. 1 illustration of the adsorption-desorption process of adsorbate particles on a bimodal surface, characterized by two types of adsorption sites (here represented as surface patches with two different shades).

however, if the numbers of the adsorbed particles at each moment are much smaller than the total number of adsorbate particles in the chamber, $N_0$, then $N_{0t}$ can be considered constant over time, and eqs. (3) and (4) reduce to the equations of the bi-langmuir model, given by eqs. (1) and (2) with $N_{0t} = N_0$. the use of the model that takes into account the adsorbate depletion in the resonator chamber during adsorption becomes necessary as $N_0$ decreases. its importance for mems resonators is obvious, since their use in frequency reference and timing applications requires low operating gas pressures inside the chamber in order to ensure a higher q-factor by minimizing air damping, one of the major energy loss mechanisms [2,3]. apart from that, micromechanical resonant structures used in adsorption-based chemical and biological sensors also operate under conditions of a small number of adsorbate particles, since these sensors are intended for highly sensitive detection of ultralow analyte concentrations. in both of the given examples, the condition $N_0 \gg N_1 + N_2$ may not hold, so the finite amount of analyte and the depletion over time of the particles available for adsorption should be taken into account in the analysis, as predicted by eqs. (3) and (4). if $M$ is the mass of a single adsorbate particle, and $N$ is the total number of adsorbed particles, the total adsorbed mass on the resonator surface is

$$m = M N = M (N_1 + N_2) \tag{5}$$

the ad process on the resonator surfaces reaches the steady state after some time. then the numbers of the adsorbed particles reach the values $N_{1e}$ and $N_{2e}$, determined by the equations obtained from eqs.
(3) and (4) for $dN_1/dt = 0$ and $dN_2/dt = 0$

$$(K_1 - K_2) N_{1e}^3 - \left[(K_1 - K_2)(N_0 + K_1) + (K_1 - 2K_2) N_{a1} - K_1 N_{a2}\right] N_{1e}^2 + N_{a1}\left[(K_1 - 2K_2) N_0 - K_2 N_{a1} - K_1 N_{a2} - K_1 K_2\right] N_{1e} + K_2 N_{a1}^2 N_0 = 0 \tag{6}$$

$$N_{2e} = \frac{K_1 N_{a2} N_{1e}}{K_1 N_{1e} + K_2 (N_{a1} - N_{1e})} \tag{7}$$

here $K_1 = k_{r1}/k_{fv1}$ and $K_2 = k_{r2}/k_{fv2}$ are the equilibrium constants. however, even after reaching the steady state the numbers of adsorbed particles fluctuate due to the inherently random nature of the ad process. these fluctuations result in the fluctuations of the adsorbed mass, known as the mass loading noise, $\Delta m$. if $\Delta N_1$ and $\Delta N_2$ denote small fluctuations around the equilibrium values $N_{1e}$ and $N_{2e}$, the numbers of the adsorbed particles at each moment are $N_1 = N_{1e} + \Delta N_1$ and $N_2 = N_{2e} + \Delta N_2$, and the fluctuations of the adsorbed mass are

$$\Delta m = M \Delta N = M (\Delta N_1 + \Delta N_2) \tag{8}$$

the linear approximation of the functions $a_1$, $a_2$, $d_1$ and $d_2$ (defined in eqs. (3) and (4)) around the equilibrium values of the adsorbed particle numbers

$$a_{i,\mathrm{la}} = a_{ie} + \frac{\partial a_i}{\partial N_1} \Delta N_1 + \frac{\partial a_i}{\partial N_2} \Delta N_2, \qquad d_{i,\mathrm{la}} = d_{ie} + \frac{\partial d_i}{\partial N_i} \Delta N_i, \qquad i = 1 \text{ or } i = 2 \tag{9}$$

where all the functions and derivatives are calculated for $N_1 = N_{1e}$ and $N_2 = N_{2e}$, enables linearization of eqs. (3) and (4), which yields the system of langevin equations after adding an intrinsic source function ($\xi_1$ and $\xi_2$) to the right side of each of them

$$\frac{d(\Delta N_i)}{dt} = \left(\frac{\partial a_i}{\partial N_i} - \frac{\partial d_i}{\partial N_i}\right) \Delta N_i + \frac{\partial a_i}{\partial N_j} \Delta N_j + \xi_i, \qquad i, j \in \{1, 2\},\ i \neq j \tag{10}$$

(the equalities $a_{ie} = d_{ie}$ are applied, which stem from the steady-state conditions $dN_i/dt = 0$, for $i = 1$ or $i = 2$). the previous equations may be presented in the form

$$\frac{d(\Delta N_1)}{dt} = -M_{11} \Delta N_1 - M_{12} \Delta N_2 + \xi_1 \tag{11}$$

$$\frac{d(\Delta N_2)}{dt} = -M_{21} \Delta N_1 - M_{22} \Delta N_2 + \xi_2 \tag{12}$$

where

$$\begin{aligned} M_{11} &= k_{r1} + k_{fv1} (N_0 + N_{a1} - 2N_{1e} - N_{2e}), & M_{12} &= k_{fv1} (N_{a1} - N_{1e}), \\ M_{21} &= k_{fv2} (N_{a2} - N_{2e}), & M_{22} &= k_{r2} + k_{fv2} (N_0 + N_{a2} - N_{1e} - 2N_{2e}) \end{aligned} \tag{13}$$

eqs.
(11) and (12) have a form suitable for the application of the langevin procedure to obtain the power spectral density (psd) of steady-state fluctuations of mass loaded on the resonator, $S_{\Delta m}(f)$, for the case of ad processes of two adsorbates, in the manner presented in [21]. this procedure is performed by solving eqs. (11) and (12) in the frequency domain in order to obtain the psd of the fluctuations $\Delta N_1$ and $\Delta N_2$, denoted as $S_{\Delta N_1}(f)$ and $S_{\Delta N_2}(f)$, and also their cross-spectral density, $S_{\Delta N_1 \Delta N_2}(f)$. the psd of mass loading noise is then, based on eq. (8)

$$S_{\Delta m}(f) = M^2 S_{\Delta N}(f) = M^2 \left( S_{\Delta N_1}(f) + S_{\Delta N_2}(f) + 2 S_{\Delta N_1 \Delta N_2}(f) \right) \tag{14}$$

(as derived e.g. in supplementary data of [15] by using the definition of the spectral density and autocorrelation function of a random variable that equals the sum of two coupled random variables). after performing all the necessary calculations the result is obtained in the form

$$S_{\Delta m}(f) = S_{\Delta m,\mathrm{LF}} \, \frac{1 + (2\pi f)^2 \tau_3^2}{\left(1 + (2\pi f)^2 \tau_1^2\right)\left(1 + (2\pi f)^2 \tau_2^2\right)} \tag{15}$$

where

$$\tau_1 = 2 \left[ M_{11} + M_{22} + \sqrt{(M_{11} - M_{22})^2 + 4 M_{12} M_{21}} \right]^{-1} \tag{16}$$

$$\tau_2 = 2 \left[ M_{11} + M_{22} - \sqrt{(M_{11} - M_{22})^2 + 4 M_{12} M_{21}} \right]^{-1} \tag{17}$$

$$\tau_3 = \sqrt{ (k_{r1} N_{1e} + k_{r2} N_{2e}) \left[ (M_{22} - M_{21})^2 k_{r1} N_{1e} + (M_{11} - M_{12})^2 k_{r2} N_{2e} \right]^{-1} } \tag{18}$$

and the low-frequency magnitude is

$$S_{\Delta m,\mathrm{LF}} = 4 M^2 (k_{r1} N_{1e} + k_{r2} N_{2e}) \, \frac{\tau_1^2 \tau_2^2}{\tau_3^2} \tag{19}$$

the characteristic frequencies of the psd $S_{\Delta m}(f)$ are $f_i = 1/(2\pi\tau_i)$, where the index $i$ is 1, 2 or 3. the psd of resonator frequency noise caused by random mass loading is [10]

$$S_{\Delta\nu}(f) = \frac{\nu_0^2}{4 m_0^2} \, S_{\Delta m}(f) \tag{20}$$

where the resonator mass is $m_0$, and $\nu_0$ is its resonant frequency. if the analyte depletion due to the adsorption is neglected, eqs. (1) and (2) become independent and linear.
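the chain from the steady state to the psd of eq. (15) can be traced numerically with a minimal sketch. it is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the equilibrium is found by bisection on the number of free particles $N_0 - N_{1e} - N_{2e}$ (an equivalent route to solving eqs. (6) and (7)), the particle mass is set to 1 so that the result is the psd of the number of adsorbed particles, and the parameter values are the ones quoted in the results section for equal shares of the two site types.

```python
import math


def steady_state(kfv1, kr1, kfv2, kr2, na1, na2, n0):
    """Equilibrium (N1e, N2e) of eqs. (3)-(4), via bisection on free particles."""
    k1, k2 = kr1 / kfv1, kr2 / kfv2          # equilibrium constants K1, K2

    def balance(free):
        # particle conservation: N0 - free - N1e(free) - N2e(free) = 0
        return n0 - free - na1 * free / (k1 + free) - na2 * free / (k2 + free)

    lo, hi = 0.0, float(n0)                  # balance(lo) > 0 > balance(hi)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if balance(mid) > 0:
            lo = mid
        else:
            hi = mid
    free = 0.5 * (lo + hi)
    return na1 * free / (k1 + free), na2 * free / (k2 + free)


def psd_parameters(kfv1, kr1, kfv2, kr2, na1, na2, n0):
    """Time constants of eqs. (16)-(18) and low-frequency magnitude (19), M = 1."""
    n1e, n2e = steady_state(kfv1, kr1, kfv2, kr2, na1, na2, n0)
    m11 = kr1 + kfv1 * (n0 + na1 - 2 * n1e - n2e)      # eq. (13)
    m12 = kfv1 * (na1 - n1e)
    m21 = kfv2 * (na2 - n2e)
    m22 = kr2 + kfv2 * (n0 + na2 - n1e - 2 * n2e)
    root = math.sqrt((m11 - m22) ** 2 + 4 * m12 * m21)
    tau1 = 2 / (m11 + m22 + root)
    tau2 = 2 / (m11 + m22 - root)
    a = kr1 * n1e + kr2 * n2e
    b = (m22 - m21) ** 2 * kr1 * n1e + (m11 - m12) ** 2 * kr2 * n2e
    tau3 = math.sqrt(a / b)
    s_lf = 4 * a * tau1 ** 2 * tau2 ** 2 / tau3 ** 2
    return tau1, tau2, tau3, s_lf


def psd_number(f, tau1, tau2, tau3, s_lf):
    """Eq. (15) for the number of adsorbed particles (multiply by M**2 for mass)."""
    w2 = (2 * math.pi * f) ** 2
    return s_lf * (1 + w2 * tau3 ** 2) / ((1 + w2 * tau1 ** 2) * (1 + w2 * tau2 ** 2))


# parameter values quoted in the results section (equal shares of site types)
KFV1, KR1, KFV2, KR2 = 1.3e-11, 0.4, 1.3e-13, 0.02
NA1 = NA2 = 0.5e11
N0 = 2.5e12

n1e, n2e = steady_state(KFV1, KR1, KFV2, KR2, NA1, NA2, N0)
tau1, tau2, tau3, s_lf = psd_parameters(KFV1, KR1, KFV2, KR2, NA1, NA2, N0)
spectrum = [psd_number(f, tau1, tau2, tau3, s_lf) for f in (0.0, 0.01, 1.0, 100.0)]
```

multiplying the returned values by $M^2$ gives the mass psd, and a further factor $\nu_0^2/(4 m_0^2)$ gives the frequency noise of eq. (20).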
the overall psd of mass fluctuations is then calculated as the sum of the psds of mass fluctuations of the two parts of the adsorbed amount (the part adsorbed on the type 1 sites, and the part adsorbed on the other type of sites)

$$S_{\Delta m L}(f) = M^2 S_{\Delta N L}(f) = M^2 \left( S_{\Delta N L1}(f) + S_{\Delta N L2}(f) \right) = S_{\Delta m L1}(f) + S_{\Delta m L2}(f) \tag{21}$$

(this equation stems from eq. (14) when $\Delta N_1$ and $\Delta N_2$ are statistically independent random variables), and the components 1 and 2 are given by the expression

$$S_{\Delta m Li}(f) = M^2 S_{\Delta N Li}(f) = \frac{4 M^2 N_{ai} k_{fvi} N_0 k_{ri} / (k_{ri} + k_{fvi} N_0)^3}{1 + (2\pi f)^2 / (k_{ri} + k_{fvi} N_0)^2}, \qquad i = 1 \text{ or } i = 2 \tag{22}$$

with the characteristic frequency $f_{Li} = 1/(2\pi\tau_{Li})$, where $\tau_{Li} = 1/(k_{ri} + k_{fvi} N_0)$ ($i$ is 1 or 2).

3. results and discussions

we investigate the phenomenon of mass loading fluctuations in resonant structures with bimodal surface affinity, influenced by adsorbate depletion from the finite sample during adsorption. we also quantitatively compare the results obtained by using the model that takes into account the depletion and the one that neglects it, through an analysis which is as general as possible, i.e. not pertinent to a given micromechanical resonant structure or adsorbate. therefore, the results of mass loading noise analysis are presented in terms of psds of the fluctuations of the total number of adsorbed particles, $S_{\Delta N}(f)$. all the conclusions about $S_{\Delta N}(f)$ easily lead to conclusions about $S_{\Delta m}(f)$ and $S_{\Delta\nu}(f)$, having in mind the linear relation between their values (eqs. (14), (20) and (21)). the values of adsorption and desorption rate constants used in the analysis are $k_{fv1} = 1.3 \cdot 10^{-11}$ 1/s, $k_{r1} = 0.4$ 1/s, $k_{fv2} = 1.3 \cdot 10^{-13}$ 1/s and $k_{r2} = 0.02$ 1/s. they belong to the ranges corresponding to biomolecules and biosensors [22, 23], which does not affect the generality of the analysis. the presented results are pertinent to different amounts of adsorbate particles surrounding the resonator (i.e.
various adsorbate concentrations or pressures in the chamber of fixed volume). among a total of $N_a = 10^{11}$ adsorption sites on a bimodal affinity surface, different shares of the two types of sites are assumed. fig. 2 shows the power spectral density of fluctuations of the total number of adsorbed particles according to the model that takes into account adsorbate depletion, $S_{\Delta N}(f)$. it is introduced in eq. (14), and determined by eqs. (15)-(19). the same quantity obtained by using the model which assumes a constant adsorbate concentration in the resonator chamber, $S_{\Delta N L}(f)$, given in eq. (21), is also shown. the concentration of adsorbate in the chamber of volume $1 \cdot 10^{-7}$ m$^3$ is $2.5 \cdot 10^{19}$ 1/m$^3$ (which corresponds to an overall number of adsorbate molecules of $2.5 \cdot 10^{12}$). equal shares of the different types of adsorption sites are assumed (v=0.5). also presented on the diagram are the components of $S_{\Delta N L}(f)$, $S_{\Delta N L1}(f)$ and $S_{\Delta N L2}(f)$, which originate from independent fluctuations on the two types of adsorption sites. they are determined by eq. (22), according to the linear (i.e. bi-langmuir) model of adsorption on a bimodal adsorbing surface. a good match can be observed between the total psd values predicted by the two models. the bi-langmuir set of equations (1)-(2) models the two adsorbed fractions as independent. their contributions to the psd of fluctuations of the total number of adsorbed particles are shown by dashed and dotted lines in the diagram. however, the dynamics of fluctuations of the numbers of particles adsorbed on the two types of sites is not actually independent if the adsorbate quantity is finite: although the adsorbate particles independently occupy the two sets of adsorption sites, they deplete the same pool of free particles. in equilibrium, the surface coverage remains constant on average, but the distribution of the occupied sites continuously changes, with the dynamics determined by the rate constants.
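the two independent components of the linear (bi-langmuir) model can be evaluated with a short sketch of eq. (22), written for the number of adsorbed particles (particle mass set to 1). the function names are hypothetical and the code is only an illustration under the stated parameter values, with equal shares of the two site types.

```python
import math


def psd_langmuir_component(f, n_a, k_fv, k_r, n0):
    """Eq. (22) with M = 1: psd of the number of particles on one site type,
    for a constant number n0 of free particles (no depletion)."""
    rate = k_r + k_fv * n0                        # reciprocal of the time constant
    s_lf = 4 * n_a * k_fv * n0 * k_r / rate ** 3  # low-frequency plateau
    return s_lf / (1 + (2 * math.pi * f) ** 2 / rate ** 2)


def psd_bilangmuir(f, na1, na2, kfv1, kr1, kfv2, kr2, n0):
    """Eq. (21) with M = 1: sum of the two independent components."""
    return (psd_langmuir_component(f, na1, kfv1, kr1, n0)
            + psd_langmuir_component(f, na2, kfv2, kr2, n0))


# parameter values quoted in the text (equal shares of site types)
KFV1, KR1, KFV2, KR2 = 1.3e-11, 0.4, 1.3e-13, 0.02
NA1 = NA2 = 0.5e11
N0 = 2.5e12

low_f, high_f = 1e-4, 100.0
type1_low = psd_langmuir_component(low_f, NA1, KFV1, KR1, N0)
type2_low = psd_langmuir_component(low_f, NA2, KFV2, KR2, N0)
type1_high = psd_langmuir_component(high_f, NA1, KFV1, KR1, N0)
type2_high = psd_langmuir_component(high_f, NA2, KFV2, KR2, N0)
total_low = psd_bilangmuir(low_f, NA1, NA2, KFV1, KR1, KFV2, KR2, N0)
```

with these values the component belonging to the sites with the lower rate constants (type 2) is the larger one at the low frequency, and the component with the higher rate constants (type 1) is the larger one at the high frequency.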
The term 'favorable sites' in the analysis refers to the adsorption sites characterized by a greater affinity for the adsorbate (expressed as the ratio of the adsorption and desorption rate constants). A greater binding energy causes the adsorbed particles to reside longer on the surface, hence their desorption rate constant is lower and their fluctuations are slower, i.e. concentrated at lower frequencies. Indeed, in the diagram, at lower frequencies the fluctuations of the adsorbed fraction with the lower rate constants dominate over the fluctuations of the fraction with the greater rate constants (the dashed line is above the dotted line). Naturally, higher rate constants characterize faster dynamics of the process, and at higher frequencies the overall noise level is dominated by the fraction that exhibits more frequent binding (the dotted line is above the dashed one).

After specifying the low frequency noise magnitude (LFNM) and the frequencies at which the PSD curve changes its slope as the characteristic parameters of the PSD, it is of interest to analyze the discrepancy between the two models regarding these parameters over the range of values of concentrations and fractions of favorable sites.

Fig. 2 PSD of the fluctuations of the number of adsorbed particles on the resonator surface (expressed in 1/Hz) with bimodal affinity (for v = 0.5), presented both according to the model that takes into account adsorbate depletion in the resonator chamber, and according to the model that neglects it. The PSDs shown by dotted and dashed lines correspond to the fluctuations of the numbers of particles adsorbed on the two types of sites, N1 and N2, for negligible depletion. The overall number of adsorbate particles is 2.5·10^12.

The same quantities as in Fig. 2 are shown in Figs. 3a-c, but for lower adsorbate concentrations: 5 times (Fig. 3a), 25 times (Fig. 3b) and 50 times (Fig. 3c) lower.
The smaller the concentration, the stronger the depletion of the adsorbate in the gas phase and, consequently, the greater the discrepancies between the results obtained by the linear model and by the more accurate nonlinear one. In Fig. 3a, a small deviation can be observed between the PSDs of total fluctuations determined according to the two adsorption models. The model that includes adsorbate depletion predicts a slightly higher total noise, and a small difference can also be noticed between the characteristic frequencies of the two spectral densities. Compared to the case shown in Fig. 2, both models at a 5 times lower concentration predict an order of magnitude higher low-frequency noise magnitude, and lower characteristic frequencies. The characteristic frequencies are determined by the parameters τ1, τ2, τ3, τL1 and τL2, whose values are given in Table 1. A further decrease of the adsorbate concentration results in a significantly higher difference between the PSDs obtained according to the two models, as shown in Fig. 3b.

Fig. 3 PSD (expressed in 1/Hz) of the fluctuations of the number of adsorbed particles on a bimodal affinity surface (v = 0.5), according to the two adsorption models for: a) 5 times, b) 25 times, and c) 50 times lower adsorbate concentration than in Fig. 2.

The lower low-frequency noise magnitude is obtained according to the model that takes into account the adsorbate depletion. A certain difference in the characteristic frequencies of the spectra can also be observed. The largest difference between the magnitudes of mass loading noise calculated according to the two models can be observed at the lowest adsorbate concentration used in this analysis (Fig. 3c).
The difference exists at all frequencies, so the total noise according to the model that neglects the depletion is higher than that predicted by the more accurate (nonlinear) model. However, the difference between the characteristic frequencies of the two spectra is negligible.

Table 1 PSD parameter values of mass loading noise according to the two models for different values of N0, i.e. for the cases shown in Figs. 2 and 3 (v = 0.5).

Parameter      Fig. 2        Fig. 3a       Fig. 3b       Fig. 3c
N0             25·10^11      5·10^11       1·10^11       0.5·10^11
τ1 [s]         0.0316        0.1705        0.7422        0.9346
τ2 [s]         3.0042        13.1688       32.1861       37.6628
τ3 [s]         0.1481        0.8827        6.2486        11.4308
τL1 [s]        0.0304        0.1449        0.5882        0.9524
τL2 [s]        2.8986        11.7647       30.3030       37.7358
Difference between the noise magnitudes:
               negligible    modest        noticeable    significant

We have seen that the power spectral density is affected by the depletion of the adsorbate in the gas phase due to the adsorption-desorption process. We now investigate the influence of the shares of the different types of adsorption sites on the surface. Figs. 4 and 5 show the power spectral density for different percentages of the favorable adsorption sites on the surface. Figs. 4a-c are obtained for the same parameter values as Fig. 2, while Figs. 4d-f are obtained for the same parameter values as Fig. 3c. In both groups three cases are shown: when the favorable sites dominate (v = 0.8), when the unfavorable sites dominate (v = 0.2), and when the numbers of the two types of sites are the same (v = 0.5). Figs. 4a-c show a good match between the two models for all the values of v, at the same adsorbate concentration (2.5·10^19 1/m^3). Therefore, Figs. 4a-c correspond to systems for which the linear model is applicable. However, Figs. 4d-f, obtained for a 50 times lower concentration, show a significant difference between the results according to the two models, for every v.
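The link between a lower adsorbate amount and a stronger depletion can be made concrete with a small equilibrium calculation. The sketch below is not the paper's nonlinear noise model (Eqs. (15)-(19) are not reproduced here); it only assumes the usual Langmuir balance on each site type with one shared pool of free particles, and solves for the number of particles left in the chamber by bisection. The helper names are hypothetical.

```python
# Rate constants and site count quoted in the text.
K_FV = (1.3e-11, 1.3e-13)   # k_fv1, k_fv2 in 1/s
K_R = (0.4, 0.02)           # k_r1, k_r2 in 1/s
N_A = 1e11                  # total number of adsorption sites


def adsorbed(n_free, v):
    """Equilibrium numbers of adsorbed particles on both site types
    for a given number of free particles (Langmuir balance per type)."""
    shares = (v * N_A, (1.0 - v) * N_A)
    return [na * kf * n_free / (kr + kf * n_free)
            for na, kf, kr in zip(shares, K_FV, K_R)]


def free_particles(n_total, v=0.5, iterations=200):
    """Solve n_free + N1(n_free) + N2(n_free) = n_total by bisection."""
    lo, hi = 0.0, n_total
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid + sum(adsorbed(mid, v)) > n_total:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def depleted_fraction(n_total, v=0.5):
    """Fraction of the adsorbate removed from the chamber at equilibrium."""
    return 1.0 - free_particles(n_total, v) / n_total


if __name__ == "__main__":
    for n_total in (2.5e12, 5e11, 1e11, 0.5e11):   # cases of Figs. 2 and 3a-c
        print(f"{n_total:.1e}  {depleted_fraction(n_total):.3f}")
```

Under these assumptions only a few percent of the particles are adsorbed in the case of Fig. 2, whereas more than half of them leave the gas phase in the case of Fig. 3c, which is in line with the qualitative ranking given in the last row of Table 1.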
They demonstrate that for a certain subset of the parameter space the bi-Langmuir model of adsorption does not enable accurate quantification of the noise level, so the nonlinear model must be applied.

The previous analysis showed that in certain cases the linear model predicts the noise level inaccurately. The ratio of the LFNMs is shown in Fig. 5. It is obtained by dividing the LFNM calculated using the linear model by the LFNM calculated using the nonlinear model. It can be observed that the difference in LFNM is greater for lower adsorbate concentrations. When the difference is significant, it increases with a higher share of the high-affinity sites. Typical concentrations span a very wide range, and consequently the discrepancy due to the neglected analyte depletion may exceed an order of magnitude.

Fig. 4 PSD of the fluctuations of the number of particles adsorbed on a bimodal affinity surface, for different percentages of the favorable adsorption sites, v (v = 0.2, 0.5, 0.8), calculated using the model that neglects depletion and the model that takes it into account. The parameters are the same as in Fig. 2 for a-c, and the same as in Fig. 3c for d-f.

Fig. 5 The ratio of the low frequency noise magnitude (LFNM) calculated by the linear model and the LFNM obtained according to the model that takes into account adsorbate depletion, over the range of adsorbate concentrations and fractions of favorable adsorption sites on the surface.

Fig. 6 Time constants calculated by the linear and nonlinear model over a span of adsorbate concentrations and fractions of favorable adsorption sites on the surface. Wide ribbons correspond to τL1 (lower ribbon) and τL2 (upper ribbon) from the linear model, narrow ribbons correspond to τ1 (lower ribbon) and τ2 (upper ribbon) from the nonlinear model, while symbols correspond to τ3 from the nonlinear model.
The time constants are calculated for 20%, 40%, 60% and 80% fractions of the favorable sites.

Fig. 6 shows the time constants τ1, τ2, τ3, τL1 and τL2, which determine the characteristic frequencies of the power spectra according to the two models. The difference between the corresponding characteristic frequencies (i.e. between the time constants τ1 and τL1, and also between τ2 and τL2) increases as the concentration decreases, but the parameter v also influences the relation between them. The domination of adsorption sites where molecules bind with lower rate constants (the low percentage of the favorable sites in Fig. 6) does not ensure lower discrepancies between the time constants obtained by the linear and the nonlinear model at low concentrations. Depending on both the concentration and the value of v, the characteristic frequencies according to the nonlinear model can be higher or lower than the corresponding frequencies predicted by the linear model. However, in the whole examined parameter space they are of the same order of magnitude.

4. Conclusion

We presented the results of the stochastic analysis of adsorption-desorption (AD) processes on a micromechanical resonator surface with a bimodal affinity. It is assumed that the resonator operates in an ambient containing a finite and low amount of adsorbate, so there is a significant change of adsorbate concentration in the resonator chamber during adsorption. Such conditions are met e.g. in resonators used in frequency reference and clocking applications, as well as in resonant micromechanical sensors of various physical parameters, chemical substances or biological agents. The expressions for the power spectral density (PSD) of fluctuations of the number of adsorbed particles are derived by taking into account the adsorbate depletion in the resonator chamber, and also when the depletion is neglected.
They yielded the corresponding expressions for the PSDs of mass loading noise, and also for the PSDs of resonator frequency noise. The analysis was performed by computer simulations based on the two presented noise models. It revealed how the discrepancies between the two models in low-frequency noise magnitudes and characteristic frequencies of the noise spectra change as the adsorbate amount decreases, and also as the shares of the different types of adsorption sites on the resonator surface change.

The results and conclusions stemming from the analysis expand the knowledge about the mass loading noise of resonators operating in a closed chamber. Our results are useful for the estimation of the resonator mass loading noise and the corresponding frequency noise. Additionally, the development of noise models leads to a better understanding of the influence of the resonator parameters and operating conditions, enabling optimized design and application, and thus lower noise levels, minimization of signal degradation in electronic circuits, and improved detection limits in sensors.

The described approach is generally applicable to micro/nanoelectromechanical systems with bimodal affinity surfaces, since each real structure is exposed to some kind of ambient, and thus to adsorption and desorption of the species present therein. This causes mass loading effects, including stochastic frequency fluctuations in resonators.

The presented results are pertinent to surfaces with bimodal affinity towards adsorbate particles. However, the described procedure is applicable, with simple modifications, to the more general case of adsorption on surfaces with multimodal affinity.
Hence, our future research will include the stochastic analysis of AD processes on surfaces with multimodal affinity towards the adsorbate in trace amounts, with special attention to the interplay between the analyte concentration, the fraction of adsorption sites with different affinities, and the level of influence of the analyte depletion caused by adsorption.

Acknowledgement: This research was financially supported by the Ministry of Education, Science and Technological Development of the Republic of Serbia, grant number 451-03-9/2021-14/200026.

References

[1] I. Jokić, M. Frantlović, O. Jakšić, Z. Jakšić, K. Guha, “Mass loading noise in micromechanical resonators: a model considering bimodal surface affinity and adsorbate depletion in the resonator chamber”, 1st International Conference on Micro/Nanoelectronics Devices, Circuits and Systems (MNDCS 2021), 29–31 January, 2021, Silchar, Assam, India.
[2] G. Wu, J. Xu, E.J. Ng, W. Chen, “MEMS resonators for frequency reference and timing applications”, J. Microelectromech. Sys., vol. 29, pp. 1137–1166, September 2020.
[3] I. Jokić, M. Frantlović, Z. Djurić, M.L. Dukić, “RF MEMS/NEMS resonators for wireless communication systems and adsorption-desorption phase noise”, FU Elec. Energ., vol. 28, pp. 345–381, 2015.
[4] K. Guha, H. Dutta, J. Sateesh, S. Baishya, K. S. Rao, “Design and analysis of perforated MEMS resonator”, Micros. Technol., vol. 27, pp. 613–617, November 2021.
[5] T. Kose, K. Azgin, T. Akin, “Design and fabrication of a high performance resonant MEMS temperature sensor”, J. Micromech. Microeng., vol. 26, pp. 045012, March 2016.
[6] S. X. P. Su, H. S. Yang, A. M. Agogino, “A resonant accelerometer with two-stage microleverage mechanisms fabricated by SOI-MEMS technology”, IEEE Sensors J., vol. 5, pp. 1214–1223, November 2005.
[7] R. G. Azevedo, D. G. Jones, A. V. Jog, B. Jamshidi, D. R. Myers, L. Chen, X-A. Fu, M. Mehregany, M. B. J. Wijesundara, A. P.
Pisano, “A SiC MEMS resonant strain sensor for harsh environment applications”, IEEE Sensors J., vol. 7, pp. 568–576, March 2007.
[8] F. M. Battiston, J-P. Ramseyer, H. P. Lang, M. K. Baller, C. Gerber, J. K. Gimzewski, E. Meyer, H-J. Güntherodt, “A chemical sensor based on a microfabricated cantilever array with simultaneous resonance-frequency and bending readout”, Sens. Act. B, vol. 77, pp. 122–131, June 2001.
[9] Z. Djurić, O. Jakšić, D. Randjelović, “Adsorption-desorption noise in micromechanical resonant structures”, Sens. Act. A, vol. 96, pp. 244–251, February 2002.
[10] J.R. Vig, “Noise in microelectromechanical system resonators”, IEEE Trans. Ultrason., Ferroel. Freq. Contr., vol. 46, pp. 1558–1565, November 1999.
[11] Z. Djurić, “Mechanisms of noise sources in microelectromechanical systems”, Microel. Reliab., vol. 40, pp. 919–932, May 2000.
[12] Y. Nie, H. Zhan, Z. Zheng, A. Bo, E. Pickering, Y. Gu, “How gaseous environment influences a carbon nanotube-based mechanical resonator”, J. Phys. Chem. C, vol. 123, pp. 25925–25933, October 2019.
[13] C. Mathai, S.A. Bhave, S. Tallur, “Modeling the colors of phase noise in optomechanical oscillators”, OSA Continuum, vol. 2, pp. 2253–2259, July 2019.
[14] Z. Djurić, I. Jokić, M. Frantlović, O. Jakšić, “Fluctuations of the number of particles and mass adsorbed on the sensor surface surrounded by a mixture of an arbitrary number of gases”, Sens. Act. B, vol. 127, pp. 625–631, November 2007.
[15] M. Frantlović, I. Jokić, Z. Djurić, K. Radulović, “Analysis of the competitive adsorption and mass transfer influence on equilibrium mass fluctuations in affinity-based biosensors”, Sens. Act. B, vol. 189, pp. 71–79, December 2013.
[16] O. Jakšić, Z. Jakšić, Ž. Čupić, D. Randjelović, L. Kolar-Anić, “Fluctuations in transient response of adsorption-based plasmonic sensors”, Sens. Act. B, vol. 190, pp. 419–428, January 2014.
[17] I. Jokić, O. Jakšić, M. Frantlović, Z. Jakšić, K.
Guha, “Influence of sensing surface bimodal affinity on biosensor steady-state response”, in Proceedings of the 9th Internat. Conf. on Defensive Technologies OTEH, Belgrade, pp. 106.1–4, October 2020.
[18] T. Contaret, J.-L. Seguin, P. Menini, K. Aguir, “Physical-based characterization of noise responses in metal-oxide gas sensors”, IEEE Sensors J., vol. 13, pp. 980–986, Nov. 2013.
[19] I. Jokić, O. Jakšić, “A second-order nonlinear model of monolayer adsorption in refractometric chemical sensors and biosensors case of equilibrium fluctuations”, Opt. Quant. Electron., vol. 48, pp. 1–7, June 2016.
[20] I. Jokić, O. Jakšić, M. Frantlović, Z. Jakšić, K. Guha, K.S. Rao, “Temporal response of biochemical and biological sensors with bimodal surface adsorption from a finite sample”, Microsys. Technol., October 2020.
[21] Z. Djurić, I. Jokić, M. Frantlović, O. Jakšić, D. Vasiljević-Radović, “Adsorbed mass and resonant frequency fluctuations of a microcantilever caused by adsorption and desorption of particles of two gases”, in Proceedings of the 24th Internat. Conf. on Microel. MIEL, vol. 1, pp. 197–199, Niš, Serbia, May 2004.
[22] G. Canziani, W. Zhang, D. Cines, A. Rux, S. Willis, G. Cohen, R. Eisenberg, I. Chaiken, “Exploring biomolecular recognition using optical biosensors”, Methods, vol. 19, pp. 253–269, May 1999.
[23] D.G. Myszka, X. He, M. Dembo, T.A. Morton, B. Goldstein, “Extending the range of rate constants available from BIACORE: interpreting mass transport-influenced binding data”, Biophys. J., vol. 75, pp. 583–594, August 1998.

FACTA UNIVERSITATIS Series: Electronics and Energetics Vol. 34, No 1, March 2021, pp.
89-104 https://doi.org/10.2298/fuee2101089p
© 2021 by University of Niš, Serbia | Creative Commons License: CC BY-NC-ND

Original scientific paper

GPU-Supported Simulation for ABEP and QoS Analysis of a Combined Macro Diversity System in a Gamma-Shadowed k-µ Fading Channel

Nenad Petrović1, Selena Vasić2, Dejan Milić1, Suad Suljović1, Samir Koničanin1
1Faculty of Electronic Engineering, University of Niš, Serbia
2Faculty of Information Technologies, Metropolitan University, Belgrade, Serbia

Abstract. In this paper we analyze a macro-diversity (MD) system with one macro-diversity selection combining (SC) receiver and two micro-diversity maximal ratio combining (MRC) receivers over a correlated gamma-shadowed k-µ fading channel. The average bit error probability (ABEP) is calculated using the moment generating function (MGF) approach for BFSK and BDPSK modulations. Graphical representation of the results illustrates the effects of different system parameters on its performance, as well as the improvements due to the combined micro and macro diversity. The obtained analytical expressions are used in a GPU-enabled mobile network modeling, planning and simulation environment to determine the value of the quality of service (QoS) parameter. Finally, linear optimization is proposed as an approach to improve the QoS parameter of the fading-affected system observed in this paper.

Key words: gamma shadowing, k-µ fading, MGF, ABEP, GPGPU, SDR

1. Introduction

In cellular communications, due to the specific nature of transmission channels, the signal power level at the receiver is affected by macroscopic and microscopic degrading effects [1]. Macroscopic effects are usually shadowing effects from buildings, foliage, and other objects. Microscopic fading is a result of multipath propagation. In urban areas, a direct line-of-sight (LOS) path between the transmitter and the receiver is often unavailable.
The channel characteristics between the transmitter and the receiver change instantaneously as the mobile user moves through an environment with non-stationary surrounding objects that additionally affect the transmission. Therefore, multipath fading, shadowed fading, or a mixture of the two can characterize the transmission channel [2].

Received July 22, 2020; received in revised form November 4, 2020
Corresponding author: Nenad Petrović, Faculty of Electronic Engineering, University of Niš, Aleksandra Medvedeva 14, 18000 Niš, Serbia, E-mail: nenad.petrovic@elfak.ni.ac.rs

In overcrowded downtown environments where foliage and urban shadowing are present, both multipath and shadowed fading need to be considered when analyzing system performance. Large-scale fading is manifested in variations of the average signal strength at the receiver. Small-scale fading emerges in the form of rapid fluctuations of the amplitude and phase of a signal. In a homogeneous channel, short-term fading is successfully described by the Rayleigh, Nakagami-m, Rice or Hoyt distributions. In a real system, the surfaces are spatially correlated, so the channel is non-homogeneous [3]. Distributions that model non-homogeneous fading channels are the generalized N-distribution and the generalized Q-distribution. They are valid in both cases, when the LOS component is present or absent, and are more appropriate for modeling practical fading channels. Parameterized distributions such as the κ-μ and η-μ distributions are used to model this type of non-homogeneous channels; they show functional similarities with the generalized-N and generalized-Q distributions, respectively. The η-μ distribution is used to model non-homogeneous small-scale fading in the absence of a LOS component [4]. The κ-μ distribution is a generalized distribution used to model small-scale fading in a LOS non-homogeneous fading channel.
For μ=1, the κ-μ distribution becomes the Rice distribution; for κ→0, the Nakagami-m distribution; for μ=1 and κ→0, the Rayleigh distribution; and for μ=0.5 and κ→0, the one-sided Gaussian distribution [5].

Applying diversity techniques mitigates the negative effects of fading [6]. Diversity combining improves receiver performance by processing multiple statistically independent copies of the same information-carrying signal, received over multiple fading channels, and by efficiently combining two or more of these signals. The idea behind this concept is that it is highly unlikely that deep fading will occur simultaneously in all the diversity channels, which reduces the error probability. The minimum antenna spacing for a mobile unit is at least half a wavelength [7]. Macro-diversity (MD) reception reduces both long-term and short-term fading effects on the performance of cellular mobile radio systems [8]. Macro-diversity reduces the shadowing effect, while micro-diversity reduces rapid (short-term) fading. Maximal ratio combining (MRC) and equal gain combining (EGC), followed by selection combining (SC) and switch-and-stay combining (SSC), reduce the impact of fading and increase channel capacity [9]. MRC and EGC combiners are complex to implement in practice; hence, SC combining, especially in cases with minimal correlation between channels, has proved more efficient. The SC receiver chooses the branch with the highest signal-to-noise ratio (SNR). For MRC combiners, if the noise power is equal at all inputs, the square of the output signal is equal to the sum of the squares of the input signals [10].

In this paper, a combined MD system with a macro-diversity SC receiver and two micro-diversity MRC receivers in a gamma-shadowed k-µ multipath fading channel is analyzed. Using the calculated expression for the moment-generating function (MGF) of the signals at the outputs of the micro-diversity MRC receivers, the MGF of the signal at the output of the macro-diversity SC receiver is determined.
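The two combining rules mentioned above can be illustrated with a few lines of code. This is only a sketch with made-up numbers: the MRC stage sums the squared branch envelopes (equal noise power assumed), the SC stage picks the largest entry of a list, and the macro-level selection is taken to be based on the average signal power at each micro-diversity receiver, as the system is described later in the paper. The function names are hypothetical.

```python
def mrc_output_power(branch_envelopes):
    """MRC with equal noise power: output power is the sum of squared inputs."""
    return sum(r * r for r in branch_envelopes)


def sc_select(values):
    """SC: index and value of the branch with the highest value (e.g. SNR)."""
    best = max(range(len(values)), key=lambda i: values[i])
    return best, values[best]


def combined_receiver(micro_inputs, avg_powers):
    """Macro SC over micro MRC outputs.

    micro_inputs: one list of branch envelopes per micro-diversity receiver.
    avg_powers:   average (shadowed) signal power seen by each receiver.
    The macro SC stage picks the receiver with the largest average power
    and forwards that receiver's MRC output.
    """
    chosen, _ = sc_select(avg_powers)
    return chosen, mrc_output_power(micro_inputs[chosen])


if __name__ == "__main__":
    envelopes = [[1.0, 2.0], [0.5, 0.5]]   # two receivers, L = 2 branches each
    print(combined_receiver(envelopes, avg_powers=[3.0, 1.0]))
```

Here the first receiver is selected because it has the larger average power, and its MRC output power is the sum of its two squared envelopes.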
In the following, the obtained expression for the MGF can be used to evaluate first-order statistics such as the outage probability (OP) and the bit error probability (BEP) of the observed MD system. The average bit error probability (ABEP) for non-coherent binary frequency-shift keying (BFSK) and non-coherent binary differential phase-shift keying (BDPSK) is studied using the MGF-based approach. To the best of the authors' knowledge, deriving the ABEP by the MGF method for the combined MD system with a macro-diversity SC receiver and two micro-diversity MRC receivers under gamma large-scale fading and κ-μ small-scale fading in a simulated environment has not been analyzed in the open technical literature. Numerical results are obtained using the software packages Mathematica and Origin; they illustrate the proposed analytical expressions and the impact of the system parameters on the observed statistical characteristics. Moreover, the derived expressions were integrated in a GPU-enabled mobile network modeling, planning and simulation environment.

The paper is organized in the following way: in Section 2, the system ABEP is calculated for the BFSK and BDPSK modulation schemes using the MGF approach. In Section 3, numerical results and the corresponding analysis of the system performance are presented. The GPU-enabled modelling, simulation and network planning environment for the system operating under the conditions described in this paper is given in Section 4. The conclusion is presented in Section 5.

2. Average Bit Error Probability over Moment Generating Function

In this section, a macro-diversity (MD) system with a macro-diversity SC receiver and two micro-diversity MRC receivers in a channel affected by gamma long-term and k-µ short-term fading is investigated. The block diagram of the macro-diversity model under consideration is shown in Fig. 1.
The envelopes of the useful signals at the inputs of the first L-branch micro-diversity MRC receiver are denoted by r11, …, r1L, and those at the inputs of the second one by r21, …, r2L, while x1 and x2 denote the signals at the outputs of the two MRC receivers. The envelope of the useful signal at the output of the macro-diversity SC receiver is denoted by r.

Fig. 1 Block diagram of the combined MD system in mobile radio channels with LOS (line-of-sight) components

The envelopes of the useful signals rij, affected by small-scale fading, follow the κ-μ distribution [11]:

$$p_{r_{ij}}(r_{ij}) = \frac{2\mu (k+1)^{\frac{\mu+1}{2}}}{k^{\frac{\mu-1}{2}} e^{k\mu}\, \Omega_i^{\frac{\mu+1}{2}}}\, r_{ij}^{\mu}\, e^{-\frac{\mu(k+1)}{\Omega_i} r_{ij}^2}\, I_{\mu-1}\!\left( 2\mu \sqrt{\frac{k(k+1)}{\Omega_i}}\, r_{ij} \right) \qquad (1)$$

In the above expression, I_{μ-1}(·) denotes the modified Bessel function of the first kind and order μ-1, and Ω_i = E[r_ij^2] are the average powers of the useful signals. The parameter μ describes the number of multipath clusters; smaller values of μ increase the fading severity. The parameter k describes the ratio of the power of the dominant component to the power of the scattered components. In the considered environment with a dominant LOS component, after the transformation xij = rij^2, xij ≥ 0; i = 1, 2; j = 1, 2, …, L, the PDF of the k-µ random variable xi can be written as [12]:

$$p_{x_i}(x_i) = \frac{L\mu (k+1)^{\frac{L\mu+1}{2}}}{k^{\frac{L\mu-1}{2}} e^{L\mu k}\, (L\Omega_i)^{\frac{L\mu+1}{2}}}\, x_i^{\frac{L\mu-1}{2}}\, e^{-\frac{\mu(k+1)}{\Omega_i} x_i}\, I_{L\mu-1}\!\left( 2L\mu \sqrt{\frac{k(k+1)\, x_i}{L\Omega_i}} \right) \qquad (2)$$

Using the well-known series expansion [13; 17.7.1.1], the modified Bessel function of the first kind can be written as:

$$I_v(x) = \sum_{k=0}^{\infty} \frac{x^{2k+v}}{2^{2k+v}\, k!\, \Gamma(v+k+1)} \qquad (3)$$

where Γ(m) is the gamma function. Using expression (3), we can obtain the PDF of the SNR at the output of the micro-diversity MRC combiner:

$$p_{x_i}(x_i) = \frac{1}{e^{L\mu k}} \sum_{i_2=0}^{\infty} \frac{(L\mu k)^{i_2}}{i_2!\, \Gamma(i_2 + L\mu)} \left( \frac{\mu(k+1)}{\Omega_i} \right)^{i_2 + L\mu} x_i^{\,i_2 + L\mu - 1}\, e^{-\frac{\mu(k+1)}{\Omega_i} x_i} \qquad (4)$$

Parameter μ can take only integer values as it represents the argument in the gamma function. Moreover, the moment-generating function (MGF) of the random variable xi > 0 is given by [14]:

$$M_{x_i}(s) = E\!\left[ e^{-x_i s} \right] = \int_0^{\infty} e^{-x_i s}\, p_{x_i}(x_i)\, \mathrm{d}x_i = \frac{1}{e^{L\mu k}} \sum_{i_2=0}^{\infty} \frac{(L\mu k)^{i_2}}{i_2!} \left( \frac{\mu(k+1)}{\mu(k+1) + \Omega_i s} \right)^{i_2 + L\mu} \qquad (5)$$

In the observed system, there are i = 2 branches of the macro-diversity SC receiver, and the powers of the corresponding useful signals are represented by the random variables Ω1 and Ω2. In gamma-shadowed channels, these random variables follow the correlated gamma distribution [15], [16]:

$$p_{\Omega_1 \Omega_2}(\Omega_1, \Omega_2) = \frac{1}{\Gamma(c)} \sum_{i_1=0}^{\infty} \frac{\rho^{i_1}\, \Omega_1^{\,i_1+c-1}\, \Omega_2^{\,i_1+c-1}}{i_1!\, \Gamma(i_1+c)\, (1-\rho)^{2i_1+c}\, \Omega_0^{\,2i_1+2c}}\, e^{-\frac{\Omega_1+\Omega_2}{\Omega_0(1-\rho)}} \qquad (6)$$

In the last expression, c is the order of the gamma distribution, Ω0 denotes the average value of the signal powers Ω1 and Ω2, and ρ is the correlation coefficient. The macro-diversity SC receiver selects the signal from the micro-diversity MRC receiver that has the largest average power of the signal envelope. Hence, the moment generating function of the signal at the output of the macro-diversity SC receiver is given by [17], [18]:

$$M_x(s) = \int_0^{\infty} \mathrm{d}\Omega_1 \int_0^{\Omega_1} \mathrm{d}\Omega_2\, M_{x_1}(s)\, p_{\Omega_1\Omega_2}(\Omega_1,\Omega_2) + \int_0^{\infty} \mathrm{d}\Omega_2 \int_0^{\Omega_2} \mathrm{d}\Omega_1\, M_{x_2}(s)\, p_{\Omega_1\Omega_2}(\Omega_1,\Omega_2) = 2 \int_0^{\infty} \mathrm{d}\Omega_1 \int_0^{\Omega_1} \mathrm{d}\Omega_2\, M_{x_1}(s)\, p_{\Omega_1\Omega_2}(\Omega_1,\Omega_2) \qquad (7)$$

By substituting expressions (5) and (6) in expression (7), we obtain the MGF at the output of the macro-diversity SC receiver:

$$M_x(s) = \frac{2}{\Gamma(c)\, e^{L\mu k}} \sum_{i_1=0}^{\infty} \sum_{i_2=0}^{\infty} \frac{\rho^{i_1} (L\mu k)^{i_2} \left( \mu(k+1) \right)^{i_2+L\mu}}{i_1!\, i_2!\, \Gamma(i_1+c)\, (1-\rho)^{2i_1+c}\, \Omega_0^{\,2i_1+2c}} \int_0^{\infty} \mathrm{d}\Omega_1\, \frac{\Omega_1^{\,i_1+c-1}\, e^{-\frac{\Omega_1}{\Omega_0(1-\rho)}}}{\left( \mu(k+1) + \Omega_1 s \right)^{i_2+L\mu}} \int_0^{\Omega_1} \mathrm{d}\Omega_2\, \Omega_2^{\,i_1+c-1}\, e^{-\frac{\Omega_2}{\Omega_0(1-\rho)}} \qquad (8)$$

The inner integral I1 in the above expression is equal to:

$$I_1 = \int_0^{\Omega_1} \mathrm{d}\Omega_2\, \Omega_2^{\,i_1+c-1}\, e^{-\frac{\Omega_2}{\Omega_0(1-\rho)}} = \left( \Omega_0(1-\rho) \right)^{i_1+c}\, \gamma\!\left( i_1+c,\ \frac{\Omega_1}{\Omega_0(1-\rho)} \right) \qquad (9)$$

Here, γ(a, x) is the lower incomplete gamma function [19; 6.5.2], which can be expanded in the following series [20]:

$$\gamma(a, x) = \frac{1}{a}\, x^{a} e^{-x} \sum_{j=0}^{\infty} \frac{x^{j}}{(a+1)_j} \qquad (10)$$

where (a)_j denotes the Pochhammer symbol. Using (10), we can evaluate the integral I1 as follows:

$$I_1 = \sum_{i_3=0}^{\infty} \frac{\Omega_1^{\,i_1+i_3+c}\, e^{-\frac{\Omega_1}{\Omega_0(1-\rho)}}}{(i_1+c)\, (i_1+c+1)_{i_3} \left( \Omega_0(1-\rho) \right)^{i_3}} \qquad (11)$$

After substituting expression (11) back into expression (8), the MGF at the output of the macro-diversity SC receiver can be rewritten as:

$$M_x(s) = \frac{2}{\Gamma(c)\, e^{L\mu k}} \sum_{i_1=0}^{\infty} \sum_{i_2=0}^{\infty} \sum_{i_3=0}^{\infty} \frac{\rho^{i_1} (L\mu k)^{i_2}}{i_1!\, i_2!\, \Gamma(i_1+c)\, (i_1+c)\, (i_1+c+1)_{i_3}\, (1-\rho)^{2i_1+i_3+c}\, \Omega_0^{\,2i_1+i_3+2c}} \int_0^{\infty} \mathrm{d}\Omega_1\, \frac{\Omega_1^{\,2i_1+i_3+2c-1}\, e^{-\frac{2\Omega_1}{\Omega_0(1-\rho)}}}{\left( 1 + \frac{s\,\Omega_1}{\mu(k+1)} \right)^{i_2+L\mu}} \qquad (12)$$

Using the following identity [21; 3.383]:

$$\int_0^{\infty} x^{q-1} e^{-px} (1+ax)^{-v}\, \mathrm{d}x = a^{-q}\, \Gamma(q)\, \Psi\!\left( q,\ q+1-v,\ \frac{p}{a} \right) \qquad (13)$$

where Ψ(s, d, t) denotes the confluent hypergeometric function, the last part of expression (12), labeled as the integral I2, takes the form:

$$I_2 = \int_0^{\infty} \mathrm{d}\Omega_1\, \frac{\Omega_1^{\,2i_1+i_3+2c-1}\, e^{-\frac{2\Omega_1}{\Omega_0(1-\rho)}}}{\left( 1 + \frac{s\,\Omega_1}{\mu(k+1)} \right)^{i_2+L\mu}} = \left( \frac{\mu(k+1)}{s} \right)^{2i_1+i_3+2c} \Gamma(2i_1+i_3+2c)\, \Psi\!\left( 2i_1+i_3+2c,\ 2i_1+i_3+2c+1-i_2-L\mu,\ \frac{2\mu(k+1)}{\Omega_0(1-\rho)\, s} \right) \qquad (14)$$

Substituting the expression for the integral I2 back into expression (12), we obtain the MGF at the output of the macro-diversity SC receiver:

$$M_x(s) = \frac{2}{\Gamma(c)\, e^{L\mu k}} \sum_{i_1=0}^{\infty} \sum_{i_2=0}^{\infty} \sum_{i_3=0}^{\infty} \frac{\rho^{i_1} (L\mu k)^{i_2}\, \Gamma(2i_1+i_3+2c)}{i_1!\, i_2!\, \Gamma(i_1+c)\, (i_1+c)\, (i_1+c+1)_{i_3}\, (1-\rho)^{2i_1+i_3+c}\, \Omega_0^{\,2i_1+i_3+2c}} \left( \frac{\mu(k+1)}{s} \right)^{2i_1+i_3+2c} \Psi\!\left( 2i_1+i_3+2c,\ 2i_1+i_3+2c+1-i_2-L\mu,\ \frac{2\mu(k+1)}{\Omega_0(1-\rho)\, s} \right) \qquad (15)$$

One of the performance measures of a wireless communication system is the average symbol error probability (ASEP). For binary modulation, where each symbol carries one bit, this measure is equivalent to the average bit error probability (ABEP) [8]. In the following, using the obtained expression for the MGF, the ABEP of non-coherent BFSK and BDPSK modulated signals can be directly calculated as [6]:

$$P_{be}(\Omega_0) = 0.5\, M_x(0.5) \ \text{for BFSK}; \qquad P_{be}(\Omega_0) = 0.5\, M_x(1) \ \text{for BDPSK}. \qquad (16)$$

A graphical representation of the ABEP from expression (16) for a range of parameter values, for BFSK and BDPSK modulation, is given in Figs. 2, 3, 4 and 5.

3. Numerical and Graphical Results

Figures 2 and 3 show the ABEP at the output of the macro-diversity SC receiver as a function of the parameter Ω0 (the average value of the signal powers Ω1 and Ω2) for several values of the fading severity parameters, the correlation coefficient ρ and the number of branches L at the inputs of the micro-diversity MRC receivers, for BFSK modulation.

Fig. 2 ABEP in terms of Ω0 [dB] for BFSK modulation and different values of c and k (µ = 1, ρ = 0.2, L = 2; curves for c = 1, 1.5, 2, 2.5, 3 with k = 1, and for k = 1.5, 2, 2.5, 3 with c = 1)

Fig. 3 ABEP in terms of Ω0 [dB] for BFSK modulation and different values of µ, ρ and L (c = 1, k = 1; curves for µ = 1, 1.5, 2, 2.5, 3 with ρ = 0.2, L = 2; for ρ = 0.4, 0.6, 0.8 with µ = 1, L = 2; and for L = 3, 4 with µ = 1, ρ = 0.2)

Fig. 4 ABEP in terms of Ω0 [dB] for BDPSK modulation and different values of c and k (µ = 1, ρ = 0.2, L = 2; curves for c = 1, 1.5, 2, 2.5, 3 with k = 1, and for k = 1.5, 2, 2.5, 3 with c = 1)

Fig. 5 ABEP in terms of Ω0 [dB] for BDPSK modulation and different values of µ, ρ and L (c = 1, k = 1; curves for µ = 1, 1.5, 2, 2.5, 3 with ρ = 0.2, L = 2; for ρ = 0.4, 0.6, 0.8 with µ = 1, L = 2; and for L = 3, 4 with µ = 1, ρ = 0.2)

Figs. 2, 3, 4 and 5 show, for both BFSK and BDPSK modulation, that as the parameters µ, c, k and L increase, the ABEP decreases and the system has better performance and stability. An increase in the correlation coefficient ρ increases the ABEP and degrades the stability of the system. It can be observed that higher values of Ω0 lead to a lower error and better performance.
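The role of series truncation can be explored on the simplest building block, the MGF of a single micro-diversity MRC output in (5). The sketch below assumes the MGF is defined as E[exp(-s·x)] and uses the fact that the Poisson-weighted series in (5) sums to the closed form q^(Lμ)·exp(Lμk(q-1)) with q = μ(k+1)/(μ(k+1)+Ωs). It is conditional on a fixed average power Ω, so it does not include the shadowing average of (15) and is not meant to reproduce Figs. 2-5; all function names are illustrative.

```python
import math


def mgf_closed_form(s, omega, mu, k, L):
    """Closed form of the series in (5): q**(L*mu) * exp(L*mu*k*(q - 1))."""
    q = mu * (k + 1.0) / (mu * (k + 1.0) + omega * s)
    return q ** (L * mu) * math.exp(L * mu * k * (q - 1.0))


def mgf_series(s, omega, mu, k, L, terms):
    """Partial sum of the series in (5) with the given number of terms."""
    q = mu * (k + 1.0) / (mu * (k + 1.0) + omega * s)
    a = L * mu * k
    total, term = 0.0, 1.0          # term holds a**i / i!, starting at i = 0
    for i in range(terms):
        total += term * q ** (i + L * mu)
        term *= a / (i + 1)
    return math.exp(-a) * total


def terms_for_accuracy(s, omega, mu, k, L, tol=0.5e-5, max_terms=500):
    """Smallest number of terms whose partial sum is within tol of the closed form."""
    exact = mgf_closed_form(s, omega, mu, k, L)
    for n in range(1, max_terms + 1):
        if abs(mgf_series(s, omega, mu, k, L, n) - exact) < tol:
            return n
    raise ValueError("series did not converge within max_terms")


def conditional_abep(omega, mu, k, L, modulation):
    """Rule (16) applied to the conditional MGF: 0.5*M(0.5) or 0.5*M(1)."""
    g = {"BFSK": 0.5, "BDPSK": 1.0}[modulation]
    return 0.5 * mgf_closed_form(g, omega, mu, k, L)


if __name__ == "__main__":
    for k in (1.0, 2.0, 3.0):
        print(k, terms_for_accuracy(0.5, 10.0, 1.0, k, 2))
    print(conditional_abep(10.0, 1.0, 1.0, 2, "BFSK"),
          conditional_abep(10.0, 1.0, 1.0, 2, "BDPSK"))
```

In this simplified setting the number of terms required for a fixed absolute accuracy grows with k, and the BDPSK value is below the BFSK value because the MGF decreases with its argument.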
the abep is significantly reduced as the number of inputs l of the micro-diversity mrc receivers increases, which improves the system performance. conversely, a larger abep means degraded system performance. comparing the graphical results, we can conclude that better stability of the system and a smaller abep are achieved for bdpsk than for bfsk modulation.

to further analyze the abep performance, the error related to truncation of the infinite series is considered. tables 1 and 2 give the number of terms of the series that need to be summed to assure the accuracy of expression (16) to 5 decimal places, for the bfsk parameter values used in figures 2 and 3.

table 1 number of terms needed in the infinite series expansion in expression (16) for the abep for bfsk with accuracy to 5 decimal places: µ=1, l=2, ρ=0.2, while changing parameters c and k (fig. 2).

| parameters | ω0 = 0 db | ω0 = 5 db | ω0 = 10 db |
|---|---|---|---|
| c=1, k=1 | 11 | 10 | 7 |
| c=1.5, k=1 | 12 | 9 | 7 |
| c=2, k=1 | 13 | 11 | 7 |
| c=2.5, k=1 | 13 | 10 | 7 |
| c=3, k=1 | 14 | 11 | 8 |
| c=1, k=1.5 | 13 | 11 | 9 |
| c=1, k=2 | 13 | 12 | 11 |
| c=1, k=2.5 | 15 | 14 | 12 |
| c=1, k=3 | 18 | 15 | 13 |

for all three values of ω0 [db], larger values of the parameters c and k generally require more terms in the expansion, so convergence is slower.

table 2 number of terms needed in the infinite series expansion in expression (16) for the abep for bfsk with accuracy to 5 decimal places, where c=k=1 and parameters µ, ρ and l are varied (fig. 3).

| parameters | ω0 = 0 db | ω0 = 5 db | ω0 = 10 db |
|---|---|---|---|
| µ=1, ρ=0.2, l=2 | 11 | 10 | 7 |
| µ=1.5, ρ=0.2, l=2 | 12 | 11 | 10 |
| µ=2, ρ=0.2, l=2 | 14 | 13 | 10 |
| µ=2.5, ρ=0.2, l=2 | 15 | 14 | 13 |
| µ=3, ρ=0.2, l=2 | 18 | 15 | 13 |
| µ=1, ρ=0.4, l=2 | 13 | 10 | 8 |
| µ=1, ρ=0.6, l=2 | 15 | 13 | 10 |
| µ=1, ρ=0.8, l=2 | 11 | 20 | 16 |
| µ=1, ρ=0.2, l=3 | 11 | 10 | 8 |
| µ=1, ρ=0.2, l=4 | 13 | 11 | 10 |

as the values of the parameters µ, ρ and l increase, the number of terms needed for accuracy to 5 decimal places generally increases too, and convergence is slower.

tables 3 and 4 present the same analysis for the bdpsk modulation scheme. they give the number of terms in the series expansion in expression (16) needed to achieve accuracy to 5 decimal places for the parameter values used in figures 4 and 5.

table 3 number of terms in the infinite series expansion of expression (16) for the abep for bdpsk at the accuracy of 5 decimal places, where µ=1, l=2, ρ=0.2, and various values of parameters c and k (fig. 4).

| parameters | ω0 = 0 db | ω0 = 5 db | ω0 = 10 db |
|---|---|---|---|
| c=1, k=1 | 11 | 8 | 7 |
| c=1.5, k=1 | 11 | 8 | 5 |
| c=2, k=1 | 12 | 9 | 5 |
| c=2.5, k=1 | 12 | 9 | 5 |
| c=3, k=1 | 12 | 9 | 5 |
| c=1, k=1.5 | 12 | 9 | 8 |
| c=1, k=2 | 13 | 12 | 10 |
| c=1, k=2.5 | 14 | 13 | 12 |
| c=1, k=3 | 17 | 15 | 13 |

again, for all three values of the average power ω0 [db], the number of terms needed for convergence of the series generally increases as the parameters c and k increase, so convergence is slower.

table 4 number of terms in the infinite series expansion of expression (16) for the abep for bdpsk at the accuracy of 5 decimal places, where c=k=1 and parameters µ, ρ and l take various values (fig. 5).

| parameters | ω0 = 0 db | ω0 = 5 db | ω0 = 10 db |
|---|---|---|---|
| µ=1, ρ=0.2, l=2 | 11 | 8 | 7 |
| µ=1.5, ρ=0.2, l=2 | 12 | 9 | 9 |
| µ=2, ρ=0.2, l=2 | 13 | 11 | 10 |
| µ=2.5, ρ=0.2, l=2 | 14 | 13 | 12 |
| µ=3, ρ=0.2, l=2 | 17 | 15 | 12 |
| µ=1, ρ=0.4, l=2 | 12 | 9 | 7 |
| µ=1, ρ=0.6, l=2 | 13 | 11 | 7 |
| µ=1, ρ=0.8, l=2 | 23 | 17 | 12 |
| µ=1, ρ=0.2, l=3 | 10 | 9 | 7 |
| µ=1, ρ=0.2, l=4 | 11 | 10 | 9 |

table 4 shows that for higher values of the parameters µ, ρ and l, more terms generally have to be summed to achieve accuracy up to and including the fifth decimal place, so convergence becomes slower. in particular, higher values of the correlation coefficient require more terms.

4. gpu-enabled modelling, simulation and network planning environment

4.1. framework overview

although many network simulators model mobile propagation to some extent, they often involve time-consuming and inefficient computational schemes, which results in differences between simulated and realistic scenarios. to assure better simulation results and efficient computation, in this section we present a framework for simulated transmission in the gamma-shadowed κ-µ fading environment analyzed in the previous sections. the efficiency of the approach comes from the use of gpgpu computational units [22], [23]. the software framework has several phases. in the first phase, the user defines a mobile network within a graphical environment running in a web browser. various modelling aspects related to infrastructure, terrain, channel, and service consumers within smart cities have been considered earlier [24], [25]. the quality of service (qos) parameter values are calculated with reference to the abep values. these values and the user-defined model are taken as the input of a linear optimization mechanism.
the linear optimization is then executed, giving the optimal base station configuration and deployment as output, which is further translated to target software defined radio (sdr) commands. fig. 6 depicts the modeling and simulation environment used for network planning.

fig. 6 overview of gpu-enabled mobile network modeling, planning and simulation environment: 1 - creation of user-defined smart city mobile network model diagram, 2 - mobile network model, 3 - calculated qos values, 4 - sdr commands

4.2. gpu-enabled fading calculation

traditionally, graphics processing units (gpu) performed computations only for computer graphics. general-purpose gpu (gpgpu) programming uses them to accelerate calculations in applications that are not graphical. in [22] and [23], gpgpu has been considered as an effective approach to performance improvement in simulations of the fading effect. a pseudo-code depicting the structure of the cuda kernel is given in fig. 7.

    // pseudo-code: n (number of locations), qnf and the device function mgf() are defined elsewhere
    __global__ void calc_impact(location* l, float* qi)
    {
        // global index of this thread
        int k = threadIdx.x + blockIdx.x * blockDim.x;
        // each thread handles locations k, k + stride, k + 2*stride, ...
        while (k < n) {
            qi[k] = mgf(l->o[k]) / qnf;      // normalized qos impact of location k
            k += blockDim.x * gridDim.x;     // stride = total number of threads in the grid
        }
    }

fig. 7 gpu-powered cuda kernel pseudo-code for qos impact calculation

in this paper, we adopt nvidia cuda for gpu-enabled calculation of the abep, which is further used for qos estimation of the observed system. the pseudo-code shows how calculations are distributed among different gpu computation/processing subunits to achieve the speed-up. the input of this algorithm is a set of locations within the map (denoted as *l), while the output is a set of qos impact coefficients determined for each location (denoted as *qi). each location is assigned an ω0 value (denoted as o), which is used as the input of the mgf function (denoted as mgf) to calculate the value of the abep.
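the kernel in fig. 7 follows what is commonly called a grid-stride loop: each thread starts at its global index and advances by the total number of threads in the grid. the following pure-python sketch emulates that indexing sequentially on the cpu to show that every location is processed exactly once. it is not cuda code, and the stand-in mgf, the qnf value and the launch dimensions are assumptions chosen only for illustration.

```python
def grid_stride_indices(thread_idx, block_idx, block_dim, grid_dim, n):
    """Indices visited by one emulated thread, mirroring the while-loop of fig. 7."""
    k = thread_idx + block_idx * block_dim   # global thread index
    stride = block_dim * grid_dim            # total number of threads in the grid
    visited = []
    while k < n:
        visited.append(k)
        k += stride
    return visited


def calc_impact(omega0_values, mgf, qnf, block_dim=4, grid_dim=2):
    """Sequential emulation of the kernel: qi[k] = mgf(omega0[k]) / qnf."""
    n = len(omega0_values)
    qi = [None] * n
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            for k in grid_stride_indices(thread_idx, block_idx, block_dim, grid_dim, n):
                # No two emulated threads ever write the same location.
                assert qi[k] is None
                qi[k] = mgf(omega0_values[k]) / qnf
    return qi


def stand_in_mgf(omega0):
    """Hypothetical placeholder for the device function mgf() of fig. 7."""
    return 1.0 / (1.0 + omega0)


if __name__ == "__main__":
    omega0_values = [float(i) for i in range(19)]   # 19 locations, 8 emulated threads
    print(calc_impact(omega0_values, stand_in_mgf, qnf=2.0))
```

because the stride equals the number of launched threads, the same kernel works for any number of locations, including cases where there are more locations than threads.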
furthermore, qnf is used to normalize the abep to a real number in the range [0, 1], which expresses the weight of the fading effect, represented by the abep value, among the different parameters that might affect quality of service, as discussed in [25]. this value can be set by the user and is determined empirically. the obtained result is then used to calculate the next qos value at the time point t+1 at location l:

$$qos(t+1, l) = qos(t) \cdot qi(t+1, l) \tag{17}$$

the measurements were run on a machine with an intel i7 7700-hq quad-core cpu at 2.8 ghz, a gtx 1050 gpu with 2 gb vram, a 1 tb hdd and 16 gb ddr4 ram. according to the results, the gpu-enabled calculation of qos based on abep for this type of fading is around 42 times faster than the calculation done on the cpu using the tool from [24]. for comparison, the gpgpu approach was 39 times faster for the level-crossing rate (lcr) of the sc receiver output over α-κ-µ multipath fading channels [25], and 47 times faster for the performance evaluation of an md wireless communication system in a weibull multipath fading channel [26].

4.3. linear optimization and qos

one of the main objectives in designing and implementing wireless networks is to provide the best possible quality of service (qos). in this section, we give an overview of the linear optimization approach used to improve the qos in a real-life scenario of transmission over a gamma long-term and κ-µ short-term fading channel, employing a combined system with a macro-diversity sc receiver and two micro-diversity mrc receivers. in every fading-affected environment, qos requirements can be expressed in terms of data rates, average bit error rates, delay limits, outage probabilities and so forth. the linear optimization method presented in the following aims to assure the best possible data rates and lowest error rates, as well as the minimum possible outage probability of the system proposed in the previous sections.
the analysis is based on a mathematical model with the requirements expressed by linear relationships [27]. for each base station basestation_b we define its energy consumption ec_b and its estimated capacity c_b, that is, the highest number of connected users. for each location location_l we define the cost of energy distribution dc_l and the estimated demand d_l for service by the customers at the location covered by the allocated base station. we also consider a level of qos drop qd[b,l] for each pair (basestation_b, location_l). this parameter depends not only on the specific characteristics of the channel (in our scenario, the specific fading conditions) but also on the design and properties of the base station itself. the parameter is the ratio between the maximum possible and the estimated qos for the observed part of the channel:

$$qd[b,l] = \frac{qos_{max}[b,l]}{qos_{estimate}[b,l]} \tag{18}$$

the decision variable is denoted by x[b,l]. it can take only two possible values: 1 if basestation_b is at location_l, and 0 otherwise. linear optimization allocates the base stations to optimal locations. this has positive effects on the qos parameters, such as minimization of the qos drop, together with a reduction of the costs related to energy distribution and consumption:

$$\text{minimize} \sum_{b \in basestation, \; l \in location} qd[b,l] \; dc[b,l] \; ec[b] \; x[b,l] \tag{19}$$

there are several conditions that must be fulfilled to successfully complete the optimization.
first, exactly one base station has to be assigned to each location:

$$\sum_{b \in basestation} x[b,l] = 1, \quad \forall l \in location \tag{20}$$

moreover, each base station is assigned to at most one location at a time:

$$\sum_{l \in location} x[b,l] \le 1, \quad \forall b \in basestation \tag{21}$$

each base station assigned to a specific location has to provide the capacity needed to meet the service demand of that location:

$$\sum_{b \in basestation} c[b] \; x[b,l] \ge d[l], \quad \forall l \in location \tag{22}$$

to implement the linear optimization program we have used the ampl (footnote 1) optimization system, while the optimization itself has been carried out using the ibm cplex (footnote 2) optimizer with its simplex method-based solver. table 5 gives an overview of the results obtained in optimization for different model sizes. the first column is the number of different configurations of the involved base stations (nbs). the second column is the number of considered locations (nl). these two values determine the size of the model. the third column is the time needed for the optimization for the given model size. the last column gives the percentage of cost reduction obtained in each scenario. according to the results in table 5, the cost reduction depends on the specific model instance, while the time spent on optimization increases with the model size.

table 5 results of the optimization for different sizes of the network model

| number of base stations [nbs] | number of locations [nl] | optimization [s] | cost reduction [%] |
|---|---|---|---|
| 5 | 4 | 0.05 | 77 |
| 10 | 6 | 0.08 | 54 |
| 15 | 8 | 0.14 | 83 |

footnote 1: https://ampl.com/
footnote 2: https://www.ibm.com/analytics/cplex-optimizer

5. conclusion

we have studied the combined system with one macro-diversity sc receiver and two micro-diversity mrc receivers in a gamma long-term and κ-μ short-term fading channel. micro-diversity combines signal envelopes from l antennas at a base station. in this way the effects of κ-μ multipath fading are reduced. macro-diversity, on the other hand, combines signals from antennas at two or more base stations, which helps to mitigate the effects of long-term fading. we have derived closed-form expressions for the moment-generating function of the signal at the output of the system for a correlated composite non-homogeneous fading channel. using the obtained expressions, analytical expressions for the abep are evaluated for two modulation schemes: bfsk and bdpsk. the abep improves as the number of antennas l increases. an increase in the correlation coefficient ρ weakens the system performance. the correlation coefficient ρ has a higher influence on the abep for higher values of the gamma long-term fading severity parameters. when ρ equals one, the lowest signal values occur simultaneously at the base stations, so macro-diversity reception reduces to micro-diversity reception. the abep is smaller and the system performance is better when the number of micro-diversity branches l is greater than 2. in order to estimate the rate at which the infinite-series expressions for the abep converge, an analysis has been conducted in which the number of terms needed for a rounding accuracy of 5 decimal places is determined. it is observed that the series converges quickly and that, in general, 10-15 terms are needed to achieve the stated accuracy. increasing the values of the parameters µ, ρ and l slows the convergence, as more terms must be included to assure the stated accuracy.
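term counts such as those in tables 1-4 can be obtained with a simple stopping rule. the sketch below is an assumption about how such a count might be made, not the authors' procedure: terms are added until the partial sum, rounded to 5 decimal places, has stopped changing for a few consecutive terms. the series used here, the sum of x^n/n! (which converges to exp(x)), is only a stand-in for the expansion behind expression (16).

```python
import math


def terms_for_accuracy(term, decimals=5, patience=3, max_terms=10_000):
    """Add term(0), term(1), ... until the rounded partial sum is unchanged
    for `patience` consecutive terms.

    Returns (number_of_terms_needed, partial_sum). The stopping rule is a
    heuristic: it assumes the terms decay steadily, as in a fast-converging series.
    """
    total = 0.0
    last_rounded = None
    stable = 0
    for n in range(max_terms):
        total += term(n)
        rounded = round(total, decimals)
        if rounded == last_rounded:
            stable += 1
            if stable == patience:
                # The last `patience` terms did not change the rounded value.
                return n + 1 - patience, total
        else:
            stable = 0
            last_rounded = rounded
    raise RuntimeError("series did not stabilise within max_terms")


def exp_series_term(x):
    """Stand-in series: n-th term of sum x**n / n! (ASSUMPTION, not expression (16))."""
    return lambda n: x ** n / math.factorial(n)


if __name__ == "__main__":
    for x in (1.0, 2.0, 3.0):
        count, value = terms_for_accuracy(exp_series_term(x))
        print(x, count, value)
```

as in the tables, a series whose terms decay more slowly (larger x in the stand-in) needs more terms to reach the same number of stable decimal places.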
in the final section of the work, we have proposed an implementation of gpu-powered calculations that significantly speeds up the determination of qos parameters based on the abep function. this is of the utmost importance for the qos performance of real-time wireless systems dealing with data transmission such as real-time video. the usage of gpgpu in a simulated model of a gamma long-term and κ-μ short-term fading environment dramatically improves the response time. linear optimization is proposed for improving the qos parameter of the fading-affected system observed in this paper. the linear optimization model yields the optimal base station configuration and deployment scheme, which can be further translated to target software defined radio (sdr) commands. the implementation of command generation mechanisms for specific sdr hardware solutions is outside the scope of this paper and will be covered by our future research. moreover, we would like to include channel capacity calculation techniques [28] to dynamically determine the capacity parameter in the linear optimization model.

acknowledgement: this paper has been supported by the ministry of education, science and technological development of the republic of serbia.

references
[1] t. s. rappaport, wireless communications principles and practice. prentice hall, 2007.
[2] p. s. bithas, n. c. sagias, and p. t. mathiopoulos, "the bivariate generalized-k (kg) distribution and its application to diversity receivers", ieee trans. on commun., vol. 57, no. 9, pp. 2655–2662, sep. 2009.
[3] p. bithas, n. sagias, and t. tsiftsis, "performance analysis of dual-diversity receivers over correlated generalised gamma fading channels", iet commun., vol. 2, no. 1, pp. 174–178, 2008.
[4] r. subadar, t. s. b. reddy and p. r. sahu, "performance of an l-sc receiver over kappa-mu and eta-mu fading channels", in proceedings of the ieee international conference on communications, cape town, 2010, pp. 1-5.
[5] m. d. yacoub, "the κ−μ distribution: a general fading distribution", in proceedings of the ieee atlantic city fall vehicular technology conf., october 2001, pp. 1427–1432.
[6] m. k. simon and m. s. alouini, digital communication over fading channels. 2nd ed., new jersey: wiley-interscience, 2005.
[7] g. l. stuber, principles of mobile communications. 2nd ed. norwell ma: kluwer academic publishers, 2001.
[8] a. s. lioumpas, a. p. doukeli and g. k. karagiannidis, "another look at multibranch switched diversity systems", ieee commun. lett., vol. 11, no. 4, pp. 325-327, 2007.
[9] p. g. stavrianos, p. s. bithas and d. s. kalivas, "an analytical study for an efficient multi-branch switched diversity receiver", int. j. commun. syst., vol. 30, 2017.
[10] e. a. neasmith and n. c. beaulieu, "new results in selection diversity", ieee trans. commun., vol. 46, no. 5, pp. 695–704, 1998.
[11] m. d. yacoub, "the κ-µ distribution and the η-µ distribution", ieee antennas propag. mag., vol. 49, no. 1, pp. 68-81, 2007.
[12] s. r. panić, d. m. stefanović, i. m. petrović, m. č. stefanović, j. a. anastasov and d. s. krstić, "second-order statistics of selection macrodiversity system operating over gamma shadowed κ-μ fading channels", eurasip j. wirel. commun. netw., no. 151, 2011.
[13] w. pearson, s. olver and m. a. porter, numerical methods for the computation of the confluent and gauss hypergeometric functions. the numerical algorithms group (nag) and the engineering and physical sciences research council (epsrc), june 2016.
[14] n. sekulovic and m. stefanović, "performance analysis of system with micro and macro-diversity reception in correlated gamma shadowed rician fading channels", wirel. pers. commun., vol. 65, pp. 143–156, july 2012.
[15] p. m. shankar, "macrodiversity and microdiversity in correlated shadowed fading channels", ieee trans. veh. technol., vol. 58, no. 2, february 2009.
[16] s. suljović, d. milić, s. panić, č. stefanović and m. stefanović, "level crossing rate of macro diversity reception in composite nakagami-m and gamma fading environment with interference", vol. 102, p. 102758, may 2020.
[17] s. panic, j. anastasov, m. stefanovic, and p. spalevic, fading and interference mitigation in wireless communications. crc press: usa, 2013.
[18] d. krstic, s. vasić, s. koničanin, s. suljović and m. stefanović, "mgf based calculation of abep for macrodiversity receiver over gamma-shadowed fading environment with line-of-sight", in proceedings of the 5th international conference on smart and sustainable technologies (splitech), 23-26 september 2020, pp. 1-5.
[19] m. abramowitz and i. a. stegun, handbook of mathematical functions with formulas, graphs and mathematical tables. national bureau applied mathematics, series 55, december 1972.
[20] s. suljović, d. krstić, d. bandjur, s. veljković and m. stefanović, "level crossing rate of macrodiversity system in the presence of fading and co-channel interference", rev. roumaine des sciences techniques-série électrotechnique et énergétique, vol. 64, no. 1, pp. 63–68, 2019.
[21] i. gradshteyn, i. ryzhik, tables of integrals, series, and products. academic press, new york 1994.
[22] a. f. abdelrazek, m. kaschub, c. blankenhorn and m. c. necker, "a novel architecture using nvidia cuda to speed up simulation of multi-path fast fading channels", in proceedings of the vtc spring 2009 ieee 69th vehicular technology conference, barcelona, spain, 2009, pp. 1-5.
[23] r. carrasco-alvarez, j. v. castillo, a. c. atoche and j. o. aguilar, "a fading channel simulator implementation based on gpu computing techniques", math. probl. eng., vol. 2015, pp. 1-8, 2015.
[24] n. petrović, s. koničanin, d. milić, s. suljović and s. panić, "gpu-enabled framework for modelling, simulation and planning of mobile networks in smart cities", in proceedings of the zooming innovation in consumer technologies conference (zinc), novi sad, serbia, 2020, pp. 280-285.
[25] d. milić, s. suljović, n. petrović, s. koničanin and s. panić, "software environment for performance of relay signal by df technique influenced by κ-μ fading", in proceedings of the 19th international symposium infoteh-jahorina, east sarajevo, bosnia and herzegovina, 2020, pp. 1-4.
[26] s. suljović, d. milić and s. panić, "lcr of sc receiver output signal over α-κ-µ multipath fading channels", fu elec. energ., vol. 29, no. 2, pp. 261-268, 2016.
[27] s. suljović, d. milić, z. nikolić, s. panić, m. stefanović and đ. banđur, "performance of macro diversity wireless communication system operating in weibull multipath fading environment", fu elec. energ., vol. 30, no. 4, pp. 599-609, 2017.
[28] z. ji, c. dong, y. wang, and j. lu, "on the analysis of effective capacity over generalized fading channels", in proceedings of the ieee international conference on communications (icc 2014) communications theory, 2014, pp. 1977-1983.

facta universitatis series: electronics and energetics vol. 34, no 1, march 2021, pp. 71-88
https://doi.org/10.2298/fuee2101071d
© 2021 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper

wavelet-based audio features of dc motor sound*

đorđe damnjanović1, dejan ćirić2, zoran perić2
1university of kragujevac, faculty of technical sciences čačak, čačak, serbia
2university of niš, faculty of electronic engineering, niš, serbia

abstract. the usage of wavelets is widespread in many fields nowadays, especially in signal processing.
their nature provides some advantages in comparison to the fourier transform, and therefore many applications rely on wavelets rather than on other methods. the wavelet decomposition of a signal into detail and approximation coefficients is one of the methods to extract representative audio features. these features can be used in signal analysis and further classification. this paper investigates the usage of various wavelet families in the wavelet decomposition to extract audio features of direct current (dc) motor sounds recorded in the production environment. the purpose of feature representation and analysis is the detection of dc motor failures in motor production. the effects of applying different wavelet families and parameters in the decomposition process are studied using sounds of more than 60 motors. time and frequency analysis is also done for the tested dc motor sounds.

key words: wavelets, detail coefficients, approximation coefficients, audio features, dc motors

received june 24, 2020; received in revised form december 24, 2020
corresponding author: đorđe damnjanović, faculty of technical sciences čačak, svetog save st. 65, 32000 čačak, serbia (e-mail: djordje.damnjanovic@ftn.kg.ac.rs)
*an earlier version of this paper was presented at the 6th international conference on electrical, electronic and computing engineering "(ic)etran 2019", june 3-6, 2019, in silver lake, serbia [1]

1. introduction

wavelets can be used for different purposes, including de-noising, signal parameterization and analysis [1]. among other methods, wavelets are proposed to overcome certain limitations of the fourier transform, especially when the time domain resolution is in focus [2-4]. the roots of the wavelet method date back to 1909, when alfred haar proposed wavelets as an alternative to the fourier transform, although fourier himself had mentioned wavelets as a mathematical model in his earlier papers [3]. during the following century, particularly in the last few decades, scientists developed wavelet families for various applications. these families are often named after the scientists: gabor, morlet, daubechies, haar, meyer, etc. however, there are others named after their shape or mathematical model: mexican hat, biorthogonal, reverse biorthogonal, etc. [2,3].

many studies have shown that wavelets can be used in different areas. the most common applications of wavelets are in signal processing, usually for de-noising of signals such as audio signals, images, special acoustical signals (for example room impulse responses) and biomedical signals (electromyogram, electrocardiograph and electroencephalogram) [2-6]. the usage of wavelets is not limited to noise removal. remote sensing of very low-frequency signals and estimation of the truncation time of a room impulse response are also applications where wavelets have found their place [7,8].

recently, both academia and industry have shown increased interest in audio signal classification, audio event detection and auditory scene recognition. this kind of audio signal processing plays a vital role in biomedical engineering, mechanical engineering, telecommunications, acoustics, etc. in that regard, different features of audio signals are extracted and used for classification, detection and recognition purposes [9,10]. some of those features are based on wavelets [11,12]. although they are not frequently used, they show considerable promise, as they can provide good results in artificial intelligence based automated classification for particular applications such as industrial monitoring [10,13] or musical acoustics [12].
one of the significant problems always present in industry is how to assess the quality of a product (e.g., produced motors) or how to detect a faulty product (motor) and recognize the failure type. several different approaches have already been presented [13]. unfortunately, there is no established optimum approach. one approach attracting significant attention lately is based on the sound generated by the tested product. thanks to recent developments in audio signal processing and artificial intelligence, advanced machine/deep learning-based methods have also become available options in product quality assessment. since every product has its specific sound characteristics, a logical solution would be to apply a customized set of audio features and a classification algorithm appropriate for that particular use-case. the literature shows that signal decomposition into detail and approximation coefficients can be used for feature extraction, especially when audio signals and images are in focus [9,14].

this paper presents the results of a study that uses wavelet decomposition into detail and approximation coefficients as audio features for classification. sounds of more than 60 recorded dc motors, faulty and non-faulty ones, are analyzed by applying the wavelet decomposition. since wavelets have several parameters, the effects of changing these parameters on a set of audio features consisting of detail and approximation coefficients are investigated. in addition, the possibilities of using statistics of the extracted wavelet-based audio features, consisting of absolute and mean values as well as standard deviation, are examined. the goal is to define a procedure and relevant audio features capable of distinguishing between sounds of faulty and non-faulty motors. in that regard, the frequency analysis of motor sounds is also done for a better understanding of the signal nature and the results.
the processing is done in the matlab software package, and representative examples are presented here.

the paper is organized in five sections including introduction and conclusion. section 2 provides relevant background information. section 3 presents the methodology of this research step by step. section 4 gives the results of the analysis of wavelet-based features extracted from dc motor sounds. the paper is concluded in section 5 with suggestions for future work.

2. related work

techniques used for detection of failures (faults) in motors include vibration monitoring (vibration signature), motor current signature analysis (mcsa), electromagnetic field monitoring, chemical analysis, temperature measurement, infrared measurement, acoustic noise analysis (sound signature), and partial discharge measurement [15]. among these, current signature, vibration signature and sound signature analysis are the most commonly used [16]. mcsa is a rather popular and reliable technique providing good results, although in some cases it is not sensitive enough because of the low signal-to-noise ratio. it also suffers from spectral leakage and low frequency resolution, and its installation can be complicated [16]. vibration analysis requires appropriate sensors (accelerometers), which can be an additional expense, and sometimes it is not easy to place the sensor in the right position. the latter is especially true in an industrial environment, where dust, moisture and high temperature are often present [15,16]. sound analysis is a contactless, low-cost and easy-to-install approach, where the main problem typically comes from the noise of the industrial environment. mcsa is often used in combination with sound signature analysis [16,17]. these techniques, particularly sound signature analysis, are used in the analysis and detection of rotor faults, bearing faults and unbalanced faults in wings [15].
the development and popularity of machine/deep learning techniques have led to their usage in various areas, including fault detection in different types of motors [18]. here, they can be considered an upgrade of the traditional techniques, able to give better results and more advanced functionalities. machine/deep learning techniques include well-known algorithms like support vector machine, k-nearest neighbors (knn), neural networks, cross-validation, etc. [18-20]. different types of knn algorithms, like fine knn, weighted knn and subspace knn, can provide classification accuracy close to 100 % for all motor faults tested in [21]. it is worth mentioning that machine learning requires an adequate set of features to be provided at the input [19,20]. different methods can be applied for feature extraction, and different features can be used for audio signal parameterization [18,19,21]. features are typically divided into categories such as time domain features (e.g., zero-crossing rate), frequency domain features (e.g., fundamental frequency and spectral peaks) or perceptual domain features (e.g., loudness, sharpness and roughness) [10]. there are some recent studies where wavelet-based features are applied for motor fault detection and motor classification [11,12,17,22]. an example is [22], where experimental results show that wavelets can be used for this purpose as a simple and fast method.

audio signals can be decomposed by the wavelet transform into detail and approximation coefficients, which are considered wavelet-based features. typically, a pre-processing stage precedes the wavelet transform. in this stage, audio signals are divided into smaller segments (short-term frames) with a certain overlap, although in the literature the wavelet transform is also applied to longer (mid-term) frames or to whole signals [11,17,22].
in all three cases, the feature extraction results depend on many factors, including the type and size of the recorded signal. for longer signals, segmentation is necessary, while for shorter signals, segmentation can sometimes be skipped [22]. the frame size can vary, but it is important to emphasize that shorter frames can provide better classification results. the overlap of frames also matters: in most cases it is 50% of the frame size [23], although it can even be 75% [22], while some authors do segmentation without any overlap [11,17]. as audio signals usually contain noise, de-noising in pre-processing can often be beneficial. de-noising can be done by applying wavelets, notch filtering, moving average filtering, etc. [11,17].

one of the steps in feature processing can be calculating the statistical values of the obtained detail and approximation coefficients. since it has been shown that negative coefficient values can cause errors in classification [17], an alternative is to work with their absolute values. the coefficient statistics also include the mean of absolute values, the standard deviation and the ratio of absolute mean values, which provide additional information about the features [9,24,25].

regarding the application of wavelets for fault analysis in different types of motors, the most common wavelets are haar and daubechies, although coiflets, symlet and meyer wavelets are also proposed for that purpose [11,17]. the selection of an adequate wavelet type is not an easy task. for example, certain wavelet families, like daubechies, have a large number of different functions (there are up to 45 daubechies wavelet functions when matlab is used). in [12], the authors present their research on the choice of the right wavelets for the classification of percussive sounds. besides the selection of appropriate audio features, it is also important to choose the right classifier.
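the coefficient statistics mentioned above can be computed per sub-band as in the following sketch. it assumes that the detail and approximation coefficients are already available as plain lists; the function name, the dictionary keys and the choice of the population standard deviation are illustrative assumptions, not details taken from the cited works.

```python
import statistics


def subband_statistics(subbands):
    """Statistics of wavelet sub-bands.

    subbands: list of coefficient lists, e.g. [d1, d2, ..., dN, aN].
    Returns the mean of absolute values and the standard deviation of each
    sub-band, plus the ratio of absolute means of adjacent sub-bands.
    """
    mean_abs = [statistics.fmean([abs(c) for c in band]) for band in subbands]
    std = [statistics.pstdev(band) for band in subbands]
    ratios = []
    for current, following in zip(mean_abs, mean_abs[1:]):
        # Guard against an all-zero sub-band in the denominator.
        ratios.append(current / following if following != 0 else float("inf"))
    return {"mean_abs": mean_abs, "std": std, "ratio_mean_abs": ratios}


if __name__ == "__main__":
    d1 = [1.0, -1.0, 2.0, -2.0]   # made-up detail coefficients
    a1 = [3.0, -3.0]              # made-up approximation coefficients
    print(subband_statistics([d1, a1]))
```

working with absolute values, as suggested in the text, keeps positive and negative coefficients from cancelling each other in the mean.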
similar to the situation with features, different classifiers are proposed and used in studies [15-17]. for the majority of authors, the level of decomposition in the wavelet transform is one of the most important parameters whose effects need to be investigated. as the decomposition level increases, loss of resolution can be a major problem in the obtained coefficients [12,26]. for example, the first level coefficients extract the finest resolution. some authors use the decomposition up to level 8 or 9 [11-13], while others hold that suitable results can be obtained using much lower levels, for example, 3, 4 or 5 [15,26].
3. methods of analysis
more than 60 dc motors are included in the analysis. some of the important characteristics of these motors are: input voltage from 12 v to 13 v, no-load speed of about 80 rotations per minute, maximum output power from 18 w to 25 w, approximate dimensions 16 × 13.5 × 4.5 cm. the sound of each motor was recorded in an anechoic booth located in the production hall of the motor manufacturer. the anechoic booth has a “box in a box” construction. the outer box is a semi-anechoic chamber (only the floor is reflective) representing the working place of an operator performing the measurements. the inner box is a small sound-insulated booth with an approximate volume of 0.5 m³. the ambient noise outside the anechoic booth is generally rather high, since it is mainly generated by machinery located in the production hall. however, since the measurements were done during weekends when only a minority of machines was active, the ambient noise inside the inner booth was significant only at low frequencies, mainly below 300 hz. the motors were placed on a test bench provided by the motor manufacturer, see fig. 1. this test bench was able to drive the motors with adequate force and to apply an adequate load simulating real conditions.
the motors were driven in two directions of rotation, where operation in each direction lasted approximately 8 s. the measuring microphone was placed about 40 cm from the tested motor.
fig. 1 test bench driving the motors with adequate force and applying adequate load
pre-processing here is related to the extraction of relevant parts of the recorded audio signals and segmentation when it is required. in order not to use the transition regions at the beginning and at the end of the signals, their middle parts with a duration of 5 s for each direction of rotation are extracted and further processed. the wavelet-based features are extracted either from the whole signals or from signals divided into segments. according to the literature, the size of segments (frames) and the overlap between them can vary from one study to another. the signals are here segmented into short-term frames of 50 ms with an overlap of 50%, that is, 25 ms. the starting point in the analysis is to observe the dc motor sounds in time and frequency domain in order to notice specific sound properties, and, if possible, make a distinction between motors with certain faults (faulty motors) and motors with good characteristics (non-faulty motors). since these motors are brand new, it is expected that there will be only tiny differences among their sounds in most cases. the exceptions are expected to be seen only in rare cases of serious failures. separation between non-faulty and faulty motors is done by the experienced personnel of the motor manufacturer. the most common faults found in these motors are commutator faults, mechanical unbalance, bearing and gearbox defects. wavelet-based feature extraction starts with the decomposition of an audio signal. the decomposition is usually done using discrete wavelet transform (dwt) rather than continuous wavelet transform (cwt) because of its easier implementation in multilevel signal decomposition [3-5].
every signal is decomposed into detail and approximation coefficients (high and low-frequency components) at each level [3-6]. this is why the dwt is equivalent to low and high pass filtering [4]. fig. 2 presents the whole process of decomposition down to level 3, where lp is a low-pass filter, hp is a high-pass filter, ax stands for approximation coefficients at decomposition level x, dx stands for detail coefficients at decomposition level x, and 2↓ is down-sampling by 2.
fig. 2 block diagram of wavelet transform decomposition into detail and approximation coefficients
different wavelets are applied to the pre-processed signals to provide an adequate set of wavelet-based features able to distinguish between faulty and non-faulty motors. generally speaking, the most common wavelet used in audio signal processing is the haar wavelet. this wavelet has proved to be the most stable in signal de-noising, although daubechies wavelets provide somewhat better results than haar in this particular application [5]. other wavelet families used here are: coiflet, symlet, biorthogonal, reverse biorthogonal and discrete meyer. investigation of the effects of the wavelet decomposition level is an important part of the analysis, too, especially because many authors use different values for this parameter. here, the levels of decomposition from 1 to 8 are used. when the wavelet decomposition is applied to the whole pre-processed signals, absolute values of the obtained detail and approximation coefficients at each decomposition level are treated as wavelet-based features. on the other hand, when the wavelet decomposition is applied to the segmented signals, the mean and standard deviation of the coefficients as well as of absolute values of the coefficients obtained from the segments are considered to be the wavelet-based features.
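the segment-based feature extraction described above can be sketched in a few lines. this is only an illustrative sketch under stated assumptions, not the processing used in the paper: it is written in python with numpy instead of matlab, it uses the haar wavelet (the simplest filter pair) instead of daubechies 2, it pads odd-length arrays by repeating the last sample, and all function names (haar_dwt_details, frame_signal, wavelet_features) are hypothetical.

```python
import numpy as np

def haar_dwt_details(x, levels):
    """Return [d1, ..., dL], the Haar detail coefficients of x at each level."""
    approx = np.asarray(x, dtype=float)
    details = []
    for _ in range(levels):
        if approx.size % 2:
            # Assumption: odd-length arrays are padded by repeating the last sample.
            approx = np.append(approx, approx[-1])
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))  # high-pass branch, down-sampled by 2
        approx = (even + odd) / np.sqrt(2.0)         # low-pass branch, down-sampled by 2
    return details

def frame_signal(x, fs, frame_s=0.050, overlap=0.5):
    """Split x into frames of frame_s seconds with the given fractional overlap."""
    frame_len = int(round(frame_s * fs))
    hop = int(round(frame_len * (1.0 - overlap)))
    n_frames = 1 + (len(x) - frame_len) // hop
    return np.stack([x[i * hop:i * hop + frame_len] for i in range(n_frames)])

def wavelet_features(x, fs, levels=8):
    """Mean and standard deviation of |detail coefficients| per level and per frame."""
    frames = frame_signal(np.asarray(x, dtype=float), fs)
    mean_abs = np.empty((levels, frames.shape[0]))
    std_abs = np.empty_like(mean_abs)
    for k, frame in enumerate(frames):
        for j, d in enumerate(haar_dwt_details(frame, levels)):
            mean_abs[j, k] = np.mean(np.abs(d))
            std_abs[j, k] = np.std(np.abs(d))
    return mean_abs, std_abs
```

for a 5 s signal sampled at 16 khz, 50 ms frames with 50% overlap are 800 samples long with a hop of 400 samples, so each of the two returned arrays has one row per decomposition level and one column per frame, independently of the level.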
apart from the analysis of the recorded signals in time and frequency domain, their wavelet-based features are analyzed in detail, too. special attention is paid to differences between faulty and non-faulty motors in any of these domains, and to the correlation between the results in different domains (if any). in order to quantitatively evaluate the performance of the wavelet-based features in making a distinction between the motors, a measure named feature difference is calculated as the mean value of differences between the features for non-faulty and faulty motor from all segments normalized by the mean feature value. the processing described is done in the matlab software package. a fully automated software application was created. the selected wavelets and level of decomposition are applied to the defined signals using the command wavedec. the approximation and detail coefficients are generated utilizing two commands, appcoef and detcoef, respectively, according to the level of decomposition. other supporting functions used are standard matlab functions. a block diagram of the whole processing, including analysis in time and frequency domain as well as wavelet-based feature extraction and quantification, is shown in fig. 3.
fig. 3 block diagram of the processing applied in the time, frequency and feature domain
4. results
although more than 60 motors are used in the analysis, only the representative cases are included here, illustrating typical behavior of these motors regarding the analyzed issues. the investigation results obtained by applying the wavelet decomposition to the whole pre-processed signals are given first. they are followed by the results obtained from the segmented pre-processed signals. these two approaches in applying the wavelets for audio feature extraction are compared afterward.
4.1. analysis of full-length signals
the whole (full-length) pre-processed sounds with a duration of 5 s of both faulty and non-faulty dc motors in one direction of rotation in the time domain are given in fig. 4 (a). results show that there are certain fluctuations of amplitude (levels) in time. in some cases (signals), these amplitude fluctuations are more prominent. the fluctuations can have a shape similar to a low-frequency pattern, or some sudden onset of high amplitude can appear at particular time instants. however, it is rather difficult to distinguish between non-faulty and faulty motors considering only the signals in the time domain. by analyzing the spectra of the signals shown in fig. 4 (b), it can be seen that there are some prominent low-frequency components (below 100 hz) mainly caused by the ambient noise. in the remainder of the frequency range, some peaks and dips appear. one more observation (not shown here) is that the distinction between directions of rotation can be made more easily in the spectral domain than in the time domain. however, even in the spectral domain, it is not easy to differentiate the faulty motors from the non-faulty motors. in that regard, it would be very beneficial to find an alternative approach/domain able to make a clearer distinction between these motors.
fig. 4 pre-processed sound signals of non-faulty (blue) and faulty (red) dc motors for the direction of rotation 1: (a) time domain, (b) frequency domain
when the wavelets are applied to the full-length pre-processed signals, the results are the detail and approximation coefficients of these particular signals. the decomposition process using daubechies 2 wavelet (db2 in matlab) and decomposition levels from 1 to 8 is illustrated in fig. 5, where absolute values of the detail coefficients for the sound of a non-faulty and faulty dc motor are shown.
fig. 5 detail wavelet coefficients after applying daubechies 2 wavelet to full-length signals (with taking coefficient absolute value) for non-faulty (blue) and faulty motor (red) for the direction of rotation 1, and using the decomposition levels from 1 to 8
the detail coefficients for the majority of decomposition levels are rather similar for the non-faulty and faulty motor. this is valid for the levels 1 and 2 as well as for the levels from 5 to 8. however, the coefficients for the decomposition levels 3 and 4 show some differences, since the values of detail coefficients for the faulty motor are greater than those for the non-faulty motor. the noticed difference between non-faulty and faulty motor represents a promising result that will be explored in more detail later. from a general point of view, it is worth noting that the length of the detail coefficient array becomes shorter with the increase of the decomposition level, which is an inherent property of the wavelet decomposition. the results for detail coefficients obtained using different decomposition levels could have a certain correlation with the signal representation in the frequency domain, that is, the signal spectrum. all measured signals are sampled at 16 khz, so the maximum frequency of the signals is 8 khz. as described above (see fig. 2), the decomposition starts at level 1, where the signal is passed through a high pass and low pass filter yielding the detail and approximation coefficients at the decomposition level 1, respectively [27]. these coefficients are then down-sampled by 2, and the procedure is repeated until the final decomposition level is reached. in this manner, the signal frequency range is divided into two equal parts (related to detail and approximation coefficients) at every level of decomposition.
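as a rough illustration of this repeated halving, the nominal frequency band associated with each set of coefficients can be computed directly from the sampling frequency. the sketch below is in python, the function name dwt_bands is hypothetical, and the bands are idealized: real wavelet filters are not brick-wall filters, so neighboring bands overlap to some degree.

```python
def dwt_bands(fs, levels):
    """Nominal band (in Hz) of each detail level and of the final approximation."""
    nyquist = fs / 2.0
    # Detail coefficients at level j cover the upper half of the band left at level j - 1.
    detail = {j: (nyquist / 2 ** j, nyquist / 2 ** (j - 1)) for j in range(1, levels + 1)}
    # The approximation at the last level keeps the remaining low-frequency part.
    approx = (0.0, nyquist / 2 ** levels)
    return detail, approx

detail, approx = dwt_bands(16000, 8)
for j, (low, high) in detail.items():
    print(f"d{j}: {low:g} hz to {high:g} hz")
print(f"a8: {approx[0]:g} hz to {approx[1]:g} hz")
```

with a sampling frequency of 16 khz and 8 levels, d1 covers 4 khz to 8 khz, d3 covers 1 khz to 2 khz, d4 covers 500 hz to 1 khz, and the final approximation covers dc to 31.25 hz.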
following the described procedure, the frequency range up to 8 khz is first divided into the upper part (from 4 khz to 8 khz, related to the detail coefficients at the decomposition level 1) and the lower part that is further divided at the next decomposition level. in that respect, the frequency content from 4 khz to 8 khz is to some extent correlated with the detail coefficients at the decomposition level 1 (d1). what is worth emphasizing is that the detail coefficients represent the wavelet filtered data in time, while the spectrum is a spectral representation of the signal calculated from the whole time interval used for the analysis. as the decomposition goes through the next decomposition levels, up to level 8, the frequency range related to the detail coefficients at every next level is halved. thus, the frequency ranges after halving become: from 2 khz to 4 khz at the decomposition level 2, from 1 khz to 2 khz at the decomposition level 3, etc. the lower part of the frequency range at the last decomposition level is related to the approximation coefficients, and in the presented case it is from dc to 31.25 hz. this procedure is illustrated in fig. 4 (b) by vertical lines and symbols d1 to d8.
4.2. decomposition of segmented signals
most authors have reported that segmenting the signals into frames might improve the results in the feature extraction using wavelets [11,17,23]. when a single frame is observed, the decomposition using wavelets is done in the same way as in the case of full-length signals. every frame is decomposed using a particular wavelet (daubechies 2 in this case) up to level 8. however, in contrast to the case when the full-length signal is decomposed, here the feature vectors consisting of the mean values and standard deviations of detail coefficients represented by their absolute values have the same length independently of the decomposition level, see fig. 6. the same sounds are used for both figures 5 and 6. in a similar manner as in fig. 5, the most prominent differences between the non-faulty and faulty motors exist at the decomposition levels 3 and 4. as mentioned above, these results are consistent with those from the frequency domain, where the biggest differences are present in the range from 1 khz to 2 khz and from 500 hz to 1 khz, see fig. 4 (b), related to the decomposition levels 3 and 4. comparing the features obtained using mean values, fig. 6 (a), and standard deviation, fig. 6 (b), they both lead to certain differences between the non-faulty and faulty motors, although the mean values provide more stable results. in that regard, both procedures (mean value based and standard deviation based) can be used for feature extraction.
fig. 6 detail coefficients after applying daubechies 2 wavelet to segmented signals (with taking coefficient absolute value) of non-faulty (blue) and faulty motors (red) up to level 8 of decomposition: (a) mean values, (b) standard deviation
to show the extent to which the proposed features vary in the case of the non-faulty motors in reference to the faulty motors, the detail coefficients at the decomposition level 4 of eleven non-faulty motors and one faulty motor are presented in fig. 7. the detail coefficient mean values for the non-faulty motors are concentrated in a rather narrow region having smaller feature values than the faulty motor. thus, there is a prominent difference between the non-faulty and faulty motors.
fig. 7 mean values of detail coefficients after applying daubechies 2 wavelet to segmented signals (with taking coefficient absolute value) of eleven non-faulty motors (blue) and one faulty motor (red) at decomposition level 4
as in the previous research of the authors of this paper (see [1]), mean values and standard deviation of the detail coefficients without applying the absolute value are also used for feature extraction in some other papers [1,13,22].
the wavelet-based features (detail coefficients) obtained in this way using the same sounds from figures 5 and 6 are shown in fig. 8. however, these results are not as good as those presented in fig. 6. the procedure employing mean values (but without taking the absolute value of detail coefficients) provides almost no difference between non-faulty and faulty motors. on the other hand, the procedure employing standard deviation without absolute value can result in a certain, but rather small, difference between the compared motors, as shown in fig. 8 (b). the mentioned observations are logical since detail coefficients are bipolar, having a mean value close to zero, while standard deviation is calculated using the square function, eliminating the bipolarity. this trend of having worse results without absolute value than with absolute value appears in almost all analyzed dc motors. this is why the procedure without absolute value will no longer be considered in this research.
4.3. effects of changing the wavelet function
when the wavelet function is changed from daubechies 2 to other functions (haar, symlet, coiflet, biorthogonal, reverse biorthogonal, meyer), the patterns of detail and approximation coefficients are also changed to a certain extent. fig. 9 illustrates four cases of usage of different wavelet functions (haar, symlet 8, coiflet 5 and discrete meyer) applied to the segmented signals of non-faulty and faulty dc motors. only the wavelet-based features for the first four decomposition levels obtained using absolute and mean values of the detail coefficients are shown. haar wavelet proved to be the most stable one in the previous research [5,8], but differences between non-faulty and faulty motors obtained with haar wavelet, in this case, are smaller than those obtained using daubechies 2. the other three wavelets provide results rather similar to the ones presented in fig. 6.
the described observations are quantitatively supported by the measure feature difference calculated as explained in section 3. for the decomposition level 4, the wavelets whose results are shown in fig. 9 lead to the following values of the feature difference: daubechies 2 leads to the feature difference of 0.68, symlet 8 to 0.72, coiflet 5 to 0.69 and discrete meyer to 0.75, while haar wavelet leads to the difference of 0.37. these results for feature differences confirm that haar wavelet provides a smaller difference between non-faulty and faulty motors than daubechies 2 wavelet or any other wavelet used here.
fig. 8 detail coefficients after applying daubechies 2 wavelet to segmented signals (without taking coefficient absolute value) of non-faulty (blue) and faulty motors (red) up to level 8 of decomposition: (a) mean values, (b) standard deviation
4.4. differences between motors in detail coefficients and spectra
from the analysis of detail coefficients and time/frequency characteristics of sounds of all used dc motors, two specific cases of correlation of results from different domains stand out. the first case (case 1) is similar to the one given in figures 5, 6 and 9, where there are certain differences between non-faulty and faulty motors in both detail coefficients and spectra. here, the differences in detail coefficients at a particular decomposition level are related to the differences in spectra in a particular frequency range, as explained in section 4.1, see fig. 4 (b). the second case (case 2) is opposite to the first one. in this case, there are almost no differences in the detail coefficients and spectra between the non-faulty and faulty motors. again, the differences in the detail coefficients at a particular decomposition level are compared to the differences in spectra in a particular frequency range. these cases are presented in more detail below.
fig. 9 detail coefficients given as mean values from the frames after applying (a) haar, (b) symlet 8, (c) coiflet 5 and (d) discrete meyer wavelets to segmented signals of non-faulty (blue) and faulty motors (red) up to level 4
in order to shed light from one more perspective, another example of a certain correlation of differences between non-faulty and faulty motors in the detail coefficients and spectra is shown in fig. 10. the processing is identical to the previously described one. daubechies 2 wavelet up to the decomposition level 8 is applied to the pre-processed signals of non-faulty and faulty motors. wavelet-based features consist of mean value and standard deviation of the absolute value of the detail coefficients from every frame. differences between the non-faulty and faulty motors are the most prominent in the detail coefficients from the decomposition level 1 up to level 4. also, there are some smaller differences present at level 5. similar results are found in the spectra of the analyzed signals, see fig. 10 (c). an exception is found in the region/coefficients d6, where a bigger difference exists in the spectra than in the detail coefficients.
fig. 10 detail coefficients after applying daubechies 2 wavelet to segmented signals (with taking coefficient absolute value) of non-faulty (blue) and faulty motors (red) up to level 8 of decomposition: (a) mean values, (b) standard deviation, (c) spectrum of full-length signals (the case with certain differences between non-faulty and faulty motors)
it should be kept in mind that the motors used in the present research are new ones coming off the production line. in that regard, faulty motors make up the minority. moreover, only very few of those faulty motors have serious failures. the sound of such a motor is distinguishable from the non-faulty ones.
however, in many other cases where the fault is a minor one, the sounds of non-faulty and faulty motors are perceptually similar to each other, and their objective characteristics are also similar. this is why making a distinction between non-faulty and faulty motors having only minor failures is a difficult task. such a case is presented in fig. 11. regardless of whether the mean or standard deviation of the detail coefficients from the frames is applied to generate the wavelet-based feature, these features are similar for non-faulty and faulty motor, as shown in fig. 11 (a) and (b). this is valid for all used decomposition levels (up to level 8). spectra of these motors are also similar, although not entirely the same, see fig. 11 (c). as a consequence of the mentioned similarities, in cases like this one, the wavelet-based features will not provide a clear distinction between non-faulty and faulty motors. the measure feature difference is also calculated for the two specific cases investigated in this section, and the results are summarized in table 1. the feature difference is significantly larger for case 1 than for case 2, which is in line with the noticed behavior of the wavelet-based features for these cases. for case 1, the feature difference has larger values at the levels of decomposition from 1 to 4, confirming the above stated observation.
table 1 feature difference calculated for two cases analyzed in this section (case 1 is related to fig. 10, while case 2 is related to fig. 11)
decomposition level          1     2      3     4      5      6      7      8
feature difference (case 1)  0.39  0.34   0.42  0.83   0.26   0.18   0.23   0.24
feature difference (case 2)  0.12  0.042  0.1   0.027  0.068  0.051  0.025  0.001
5. conclusion
the sound generated by a motor having a certain fault can differ significantly from the sound generated by a non-faulty motor. the extent of difference depends on the fault, but also on the motor itself.
the sounds of non-faulty and faulty motors can be compared from the perceptual point of view, but their objective characteristics such as spectra can be compared, too. an approach that can provide additional information is based on the extraction of some features from the signal (sound) and using these features as attributes describing the motor as a sound source and its condition. wavelet technique is one of the options to extract useful features when the sound signature analysis is applied for motor quality estimation. the decomposition of an audio signal into detail and approximation coefficients is the main task in this type of wavelet analysis. by using adequate wavelet parameters, the differences in wavelet coefficients (representing the wavelet-based features) between non-faulty and faulty motors can become prominent. although differences between the motors might be seen in the spectra, too, the wavelet-based features provide a different insight into this topic. apart from the fact that wavelet filtering can emphasize these differences between motors, usage of the wavelet-based features is much more convenient for application of automated classification procedures based on machine/deep learning.
fig. 11 detail coefficients after applying daubechies 2 wavelet to segmented signals (with taking coefficient absolute value) of non-faulty (blue) and faulty motors (red) up to level 8 of decomposition: (a) mean values, (b) standard deviation, (c) spectrum of whole signals (the case without prominent differences between non-faulty and faulty motors)
regarding the wavelet parameters, it is interesting to note that several different wavelets provide similar results, although some cases differ slightly. for instance, the haar wavelet does not provide as good results as daubechies 2 wavelet for this particular application.
the present research results show that a few wavelets stand out, including daubechies, symlet and coiflet. the waveform of the symlet wavelet is similar to that of the daubechies wavelet, so it is logical that they have a similar impact on the signal decomposition. one of the main observations of this research is that there is no single decomposition level leading to the largest differences between non-faulty and faulty motors. this is use-case dependent. in most cases, there is a correlation between the differences of the motor sound spectra in particular frequency ranges and differences in the wavelet-based features (detail coefficients). thus, the decomposition level can be chosen according to the frequency range where the largest differences between the compared dc motors occur. in this research, the most prominent differences exist in the upper part of the frequency range, and in the detail coefficients at the levels of decomposition from 1 to 4. only in rare cases can certain differences between non-faulty and faulty motors be present in the detail coefficients at the decomposition levels higher than 4. the sounds of dc motors used in this research were recorded in the production hall of the motor manufacturer, as described above. future work will also include dc motor sounds recorded in an alternative environment (e.g., one of different size and ambient conditions). also, the analysis will be extended to some other cases of correlation of results from different domains in addition to the two extreme cases presented here, with and without differences between non-faulty and faulty motors in both detail coefficients and spectra.
acknowledgement: the work presented in this paper was supported by the ministry of education, science and technological development of the republic of serbia (the work of the author). this research was supported by the science fund of the republic of serbia, 6527104, ai-com-in-ai (the work of the co-authors).
references
[1] đ. damnjanović, d. ćirić, z. perić, "analysis of dc motor sounds using wavelet-based features", in proceedings of the 6th international conference on electrical, electronic and computing engineering "(ic)etran 2019", srebrno jezero, serbia, june 3-6, 2019, pp. 17-22.
[2] m. sifuzzaman, m.r. islam, m.z. ali, "application of wavelet transform and its advantages compared to fourier transform", journal of physical sciences, vol. 13, pp. 121-134, 2009.
[3] r.j.e. merry, "wavelet theory and applications: a literature study", technische universiteit eindhoven, eindhoven, june 7, 2005.
[4] b. ergen, "signal and image denoising using wavelet transform", chapter 21 in: advances in wavelet theory and their applications in engineering, physics and technology, in-tech (2012), pp. 495-515.
[5] đ. m. damnjanović, d. g. ćirić, b. b. predić, "de-noising of a room impulse response by applying wavelets", acta acustica united with acustica, journal of the european acoustics association (eaa) international journal on acoustics, vol. 104, no. 3, pp. 452-463, may/june 2018.
[6] g. kaushik, h.p. sinha, l. dewan, "biomedical signals analysis by dwt signal denoising with neural networks", journal of theoretical and applied information technology, vol. 62, no. 1, pp. 184-198, 10th april 2014.
[7] e. güzel, m. canyılmaz, m. türk, "application of wavelet based denoising techniques to remote sensing very low frequency signals", radio science, vol. 46, issue 2, pp. 1-9, april 2011.
[8] đ. damnjanović, d. ćirić, "usage of wavelet de-noising for estimation of room impulse response truncation time", in proceedings of 5th international conference on electrical, electronic and computing engineering "icetran 2018", pp. 565-570, palić, serbia, june 11-14, 2018.
[9] g. tzanetakis, g. essl, p. cook, "audio analysis using the discrete wavelet transform", in proceedings of the wses international conference acoustics and music: theory and applications (amta 2001), pp. 318-323, skiathos, greece, january 2001.
[10] t. zhang, c.-c. jay kuo, "content-based audio classification and retrieval for audiovisual data parsing", chapter 3 in: audio feature analysis, springer, boston, ma, 2001.
[11] a. glowacz, "diagnostics of direct current machine based on analysis of acoustic signals with the use of symlet wavelet transform and modified classifier based on words", eksploatacja i niezawodnosc – maintenance and reliability, vol. 16, no. 4, pp. 554-558, 2014.
[12] m. daniels, "classification of percussive sounds using wavelet-based features", ph.d. dissertation, ccrma, stanford university, 2010.
[13] c. da costa, m. kashiwagi, m. h. mathias, "rotor failure detection of induction motors by wavelet transform and fourier transform in non-stationary condition", case studies in mechanical systems and signal processing, vol. 1, pp. 15-26, elsevier, 2015.
[14] m. zhao, q. chai, s. zhang, "a method of image feature extraction using wavelet transforms", in proceedings of the 5th international conference on intelligent computing, icic 2009, pp. 187-192, ulsan, south korea, september 16-19, 2009.
[15] p. sharma, n. saraswat, "diagnosis of motor faults using sound signature analysis", international journal of innovative research in electrical, electronics, instrumentation and control engineering, vol. 3, issue 5, pp. 80-83, may 2015.
[16] p. a. delgado-arredondo, d. morinigo-sotelo, r. a. osornio-rios, j. g. avina-cervantes, h. rostro-gonzalez, r. de j. romero-troncoso, "methodology for fault detection in induction motors via sound and vibration signals", mechanical systems and signal processing, vol. 83, pp. 568-589, january 2017.
[17] a. glowacz, "dc motor fault analysis with the use of acoustic signals, coiflet wavelet transform, and k-nearest neighbor classifier", archives of acoustics, vol. 40, no. 3, pp. 321-327, 2015.
[18] a. dineva, a. mosavi, m. gyimesi, i. vajda, n. nabipour, t. rabczuk, "fault diagnosis of rotating electrical machines using multi-label classification", applied sciences, vol. 9, pp. 1-18, november 2019.
[19] a. a. silva, a. m. bazzi, s. gupta, "fault diagnosis in electric drives using machine learning approaches", in proceedings of the international electric machines & drives conference, pp. 722-726, chicago, il, usa, 12-15 may 2013.
[20] s.-y. shao, w.-j. sun, r.-q. yan, p. wang, r. x. gao, "a deep learning approach for fault diagnosis of induction motors in manufacturing", chinese journal of mechanical engineering, vol. 30, pp. 1347-1356, october 2017.
[21] m. z. ali, m. n. s. k. shabbir, x. liang, y. zhang, t. hu, "machine learning-based fault diagnosis for single- and multi-faults in induction motors using measured stator currents and vibration signals", ieee transactions on industry applications, vol. 55, issue 3, pp. 2378-2391, january 2019.
[22] r. s. s. kumari, d. sugumar, "wavelet based feature vector formation for audio signal classification", in proceedings of the icacc2007 int. conference, madurai, india, pp. 752-755, 9-10 feb, 2007.
[23] m. c. sezgin, b. gunsel, g. k. kurt, "perceptual audio features for emotion detection", eurasip journal on audio, speech, and music processing, no. 16, pp. 1-21, 2012.
[24] y. shi, g. wang, j. niu, q. zhang, m. cai, b. sun, d. wang, m. xue, x. d. zhang, "classification of sputum sounds using artificial neural network and wavelet transform", international journal of biological sciences, vol. 14, issue 8, pp. 938-945, 2018.
[25] a. hashemi, h. arabalibiek, k. agin, "classification of wheeze sounds using wavelets and neural networks", in proceedings of the international conference on biomedical engineering and technology, vol. 11, pp. 127-131, singapore, 2011.
[26] p. de chazal, b. g. celler, r. b. reilly, "using wavelet coefficients for the classification of the electrocardiogram", in proceedings of the 22nd annual international conference of the ieee engineering in medicine and biology society, pp. 64-67, chicago, il, usa, 23-28 july 2000.
[27] a. kandaswamy, c. sathish kumar, rm. pl. ramanathan, s. jayaraman, n. malmurugan, "neural classification of lung sounds using wavelet coefficients", computers in biology and medicine, vol. 34, issue 6, pp. 523-537, september 2004.

facta universitatis series: electronics and energetics vol. 34, no 4, december 2021, pp. 569-588
https://doi.org/10.2298/fuee2104569b
© 2021 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper
cuckoo search algorithm to solve the problem of economic emission dispatch with the incorporation of facts devices under the valve-point loading effect
larouci benyekhlef1*, sitayeb abdelkader2, boudjella houari1, ayad ahmed nour el islam1
1department of electrical engineering, faculty of applied sciences, university kasdi merbah ouargla, ouargla, algeria
2applied research unit on renewable energies “uraer ghardaia”, ghardaïa, algeria
abstract. the essential objective of optimal power flow is to find a stable operating point which minimizes the cost of the production generators and their losses, and keeps the power system acceptable in terms of limits on the active and reactive powers of the generators. in this paper, we propose the nature-inspired cuckoo search algorithm (csa) to solve economic/emission dispatch problems with the incorporation of facts devices under the valve-point loading effect (vpe). the proposed method is applied to different test system cases to minimize the fuel cost and total emissions and to see the influence of the integration of facts devices.
the obtained results confirm the efficiency and robustness of the cuckoo search algorithm compared with other optimization techniques recently published in the literature. in addition, the simulation results show the advantages of the proposed algorithm for optimizing the production fuel cost, total emissions and total losses in all transmission lines.

key words: combined economic emission dispatch, opf, cuckoo search algorithm, vpe, facts devices.

1. introduction

the production of electrical power is shaped by several objectives, such as limiting the environmental impact of energy generation and use, increasing the energy efficiency of systems, developing low-cost production, and minimizing toxic gas emissions into the atmosphere under the valve-point effect [1].

received april 3, 2021; received in revised form august 14, 2021
corresponding author: larouci benyekhlef, department of electrical engineering, faculty of applied sciences, university kasdi merbah ouargla, street ghardaia, 30000, ouargla, algeria, e-mail: larouci.benyekhlef@univ-ouargla.dz

the impact of power plants on the environment has changed the way power grids are managed [2]. recent awareness of the toxic effects of gases emitted by fossil fuel power plants, together with the new stringent environmental laws imposed on power producers, has led to the incorporation of environmental considerations into the methods that govern the production of electricity [3]. the emissions and the fuel cost must therefore be considered simultaneously to provide the true measure of optimum production [4]. currently, with the deregulation of the energy market [5], there is increased interest in facts (flexible alternating current transmission systems) for the operation and control of power systems [6], owing to the new load constraints and new contingencies.
the installation of facts has become essential to increase the transmission capacity of the power system, reduce losses, and improve the security and controllability of the electrical network [7]. these facts devices can change network parameters quickly and efficiently to achieve better system performance [8]. the mathematical formulations of all the above tasks to be performed by power producers are therefore becoming more and more complex [9]. this growing complexity has led many researchers to turn to nature-inspired algorithms to solve these problems [10]. these algorithms are developed by imitating natural phenomena and biological models [11-12], and they offer robust and competitive solutions.

the objective of this article is to propose a nature-inspired algorithm known as the cuckoo search algorithm (csa) [10] to solve optimal power flow problems. the proposed technique is applied to the ieee 30-bus system considering the valve-point effect (vpe), with and without two facts devices (a static var compensator (svc) and a statcom), in order to reach the lowest fuel cost and facts installation cost and to reduce toxic gas emissions. the results are compared with those of other algorithms from the recent literature.

the rest of this article is structured as follows: section 2 gives the definition and mathematical formulation of the opf problem, and section 3 briefly describes the csa. the simulation results are studied and analyzed in section 4. finally, the conclusion and suggestions for future work are given in section 5.

2. optimal power flow

the optimal power flow (opf) was conceived as an extension of the conventional economic and emission dispatch [13-14]. the opf problem is a large-scale, non-convex optimization problem, which may also have uncertain variables.
in general, the opf problem seeks to optimize the steady-state performance of a power system in terms of an objective function while satisfying several equality and inequality constraints [15]. in other words, opf optimizes an objective function by finding the optimal free variables while keeping the network constraints within their acceptable limits [16].

2.1. mathematical model of economic dispatch

the cost of each generating unit is typically represented by its fuel cost. generator cost curves are generally modelled as convex quadratic functions [17-18]. the fuel cost function of unit i is formulated as follows:

$$f_i(p_{gi}) = a_i p_{gi}^2 + b_i p_{gi} + c_i \qquad (1)$$

the coefficients ai, bi and ci are numerically known. the optimal operation of a set of thermal production units can be described by the model:

$$\text{minimize} \quad c(p_g) = \sum_{i=1}^{n_g} f_i(p_{gi}) \qquad (2)$$

where ng is the total number of generators in the system. the minimization is subject to the equality and inequality constraints below.

2.1.1. constraints

a. equality constraints

these constraints are represented by the nonlinear power flow equations [20]. the sum of the active and reactive powers generated in the network must equal the sum of the active and reactive powers consumed plus the transmission losses. this constraint is given by [21]:

$$p_{gi} - p_{di} - v_i \sum_{j=1}^{n_g} v_j y_{ij} \cos(\delta_i - \delta_j - \theta_{ij}) = 0 \qquad (3)$$

$$q_{gi} - q_{di} - v_i \sum_{j=1}^{n_g} v_j y_{ij} \sin(\delta_i - \delta_j - \theta_{ij}) = 0 \qquad (4)$$

b. inequality constraints

these constraints represent the operating limits of the power system (generator voltages, real and reactive power, transmission lines, transformers, facts, etc.) [22-23]:

$$p_{gi}^{\min} \le p_{gi} \le p_{gi}^{\max} \qquad (5)$$

$$q_{gi}^{\min} \le q_{gi} \le q_{gi}^{\max} \qquad (6)$$

$$v_{ki}^{\min} \le v_{ki} \le v_{ki}^{\max} \qquad (7)$$

$$s_{facts}^{\min} \le s_{facts} \le s_{facts}^{\max} \qquad (8)$$

$$q_{svc}^{\min} \le q_{svc} \le q_{svc}^{\max} \qquad (9)$$

2.2. mathematical model of economic dispatch with valve-point loading effect

in some large generators, the cost function is also non-linear because of the valve-point loading effect (vpe) [24]. this effect introduces several local minima into the cost function and makes the problem more difficult. the fuel cost function with the valve-point loading effect can be expressed as follows [25]:

$$f_i(p_{gi}) = a_i p_{gi}^2 + b_i p_{gi} + c_i + \left| e_i \sin\!\left( f_i \, (p_{gi}^{\min} - p_{gi}) \right) \right| \qquad (10)$$

where pgi is the real power generation of unit i (in mw), and ei and fi are the cost coefficients of the ith generator due to the vpe. chiang [26-28] presented a realistic economic dispatch problem that simultaneously considers the fuel cost and the valve-point loading effect to make the economic dispatch solution more precise.

2.3. mathematical model of environmental dispatch

the total emission can be expressed as [29]:

$$e_i(p_{gi}) = \alpha_i p_{gi}^2 + \beta_i p_{gi} + \gamma_i \qquad (11)$$

where the coefficients αi, βi and γi are the numerically known nox emission coefficients [30].

2.4. mathematical model of combined economic emission dispatch problem

the study of economic-environmental dispatch consists of the simultaneous minimization of the two functions given by equations (1) and (11). we therefore transform the bi-objective optimization problem into a single-objective optimization problem by introducing a price penalty factor [31]. this factor is defined as the ratio between the maximum cost and the maximum emissions of each generator [32]:

$$f_{p,i} = \frac{c(p_{gi}^{\max})}{e(p_{gi}^{\max})} \quad (\$/\text{kg}), \qquad i = 1, 2, \ldots, n_g \qquad (12)$$

the steps to determine the price penalty factor for a given load are:
- determine the ratio of the maximum cost to the maximum emissions for each generator.
- rank the values of these factors in ascending order.
- add the maximum powers of the generators one at a time, starting with the unit that has the lowest factor, until

$$\sum_{i=1}^{n_g} p_{gi}^{\max} \ge p_d$$

at this point, the factor fp associated with the last unit added is the price penalty factor for the given load. after determining this factor, the economic-environmental dispatch function can be represented by the following equation [33-34]:

$$\phi(p_{gi}) = \sum_{i=1}^{n_g} \left( a_i p_{gi}^2 + b_i p_{gi} + c_i \right) + f_p \sum_{i=1}^{n_g} \left( \alpha_i p_{gi}^2 + \beta_i p_{gi} + \gamma_i \right) \qquad (13)$$

equation (13) can be rewritten as follows [35]:

$$\phi(p_{gi}) = \sum_{i=1}^{n_g} \left( A_i p_{gi}^2 + B_i p_{gi} + C_i \right) \qquad (14)$$

with:

$$A_i = a_i + f_p \, \alpha_i, \qquad B_i = b_i + f_p \, \beta_i, \qquad C_i = c_i + f_p \, \gamma_i \qquad (15)$$

this function is minimized subject to the equality and inequality constraints.

2.5. opf with cost function model of svc devices

the cost model of the svc device was developed by the manufacturer siemens. the cost function of the svc in ($/kvar) is as follows [36]:

$$c_{svc} = 0.0003 \, s_{svc}^2 - 0.3051 \, s_{svc} + 127.38 \qquad (16)$$

where ssvc is the reactive power of the svc in mvar. the problem of the optimal choice of svc locations can be formulated as follows [37-38]:

$$c_{total} = \min \left[ c_1(f) + c_2(p_{gi}) \right] \qquad (17)$$

$$e_1(f, g) = 0 \qquad (18)$$

$$b_1(f) \le 0, \qquad b_2(g) \le 0 \qquad (19)$$

ctotal: the total objective function, comprising the svc investment cost and the cost of production.
c2(pgi): the cost of production, i.e. the sum of the generator cost functions fi(pgi) given by equation (1).
c1(f): the investment cost function of the svc given by equations (16) and (20).
e1: represents the power flow equations.
b1, b2: the inequality constraints of the svc and of the optimal power flow, respectively.
f, pgi: the variable parameters of the svc and the powers supplied by the alternators.

the fuel cost is expressed in ($/hour) while the investment costs of facts are expressed in ($). the latter must therefore be converted to $/hour.
normally, facts are designed to be in service for several years [39]. however, they are only used for a portion of their lifetime for power flow control. in this research, three years are used to estimate the average cost of facts, i.e. the depreciation (from a financial viewpoint) of facts is estimated at three years [40]:

$$c_1(f) = \frac{c(f)}{8760 \times 3} \quad (\$/\text{hour}) \qquad (20)$$

where c(f) is the investment cost of the svc.

the svc device is composed of a capacitor, which is the var generator, and a tcr (thyristor controlled reactor), which behaves as a variable var-absorbing load (depending on the firing angle of the thyristor valve) [41]. thus, the svc can inject a variable amount of reactive power into the power system or absorb it, adapting the compensation to the load conditions at each instant (see fig. 1) [42].

fig. 1 static var compensator configuration.

2.5.1. opf with model and mathematical analysis of statcom

a static synchronous compensator (statcom), also known as a static synchronous condenser [43], is a regulating device used on alternating current electricity transmission networks. it is based on a power electronics voltage-source converter and can act as either a source or a sink of reactive ac power to an electricity network. if connected to a source of power, it can also provide active ac power. it is inherently modular and electable.

the statcom is modelled as a controllable voltage source (ep) in series with an impedance [43]. the real part of this impedance represents the copper losses of the coupling transformer and converter, while the imaginary part represents the leakage reactance of the coupling transformer. the statcom absorbs the requisite amount of reactive power from the grid to keep the bus voltage within a reasonable range for all power system loading conditions. fig. 2 shows the circuit model of a statcom connected to the ith bus of a power system.

fig. 2 schematic static model of statcom

the injected active and reactive power flow equations of the ith bus are given below:

$$p_k = g_p |v_k|^2 - |v_k| |e_p| |y_p| \cos(\delta_k - \delta_p - \theta_p) + \sum_{j=1}^{n} |v_i| |v_j| |y_{ij}| \cos(\delta_i - \delta_j - \theta_{ij}) \qquad (21)$$

$$q_k = -b_p |v_k|^2 - |v_k| |e_p| |y_p| \sin(\delta_k - \delta_p - \theta_p) + \sum_{j=1}^{n} |v_i| |v_j| |y_{ij}| \sin(\delta_i - \delta_j - \theta_{ij}) \qquad (22)$$

the implementation of a statcom in the transmission system introduces two state variables (|ep| and δp); however, |vk| is known for the bus to which the statcom is connected.

3. cuckoo search algorithms

the cuckoo search algorithm (csa) is one of the newer nature-inspired metaheuristic algorithms, developed by xin-she yang and suash deb in 2009 [10], [44]. csa is a population-based search method used as an optimization tool for complex, nonlinear and non-convex problems. the csa uses three idealized rules [45]:
(a) each cuckoo lays one egg at a time and places it in a randomly chosen nest.
(b) the best nests with high-quality eggs are passed on to the next generation.
(c) the number of available host nests is fixed, and a host bird can discover an alien egg with a probability pa ∈ [0, 1]. in this case, the host bird can either drop the egg or abandon the nest and build a brand new nest in a new location.

the key steps of the cs method can be described as [46-47]:
1. select the values of the csa parameters: the number of nests (eggs) (n), the step size parameter (β), the probability of discovery (pa), and the maximum number of iterations that ends the cycle.
2. randomly generate an initial population of n host nests $x_i$ $(i = 1, 2, \ldots, n)$. each nest represents a possible solution to an optimization problem with objective function $f(x)$ and decision variables $x_i = [x_1, x_2, \ldots, x_m]^T$.
3. get a cuckoo randomly by lévy flights and evaluate its fitness $f_i$:

$$x_{i+1} = x_i + \beta \cdot \lambda \qquad (23)$$

where λ is a random walk based on lévy flights (1 < λ ≤ 3).
4. randomly choose a nest among the n (say j) and evaluate its fitness fj. if fj < fi, replace j with the new solution.
5. abandon a fraction of the worst nests and create new ones. this is done depending on the probability parameter (pa). first, check whether each nest maintains its current position (equation (24)). the matrix R stores the values 0 and 1, one for each component of the ith nest: 0 means that the current position is kept and 1 means that the current position is updated.

$$R_i \leftarrow \begin{cases} 1 & \text{if } rand < p_a \\ 0 & \text{if } rand \ge p_a \end{cases} \qquad (24)$$

the new nest is generated by equation (25):

$$x_i^{t+1} = x_i^{t} + r \, R_i \left( perm1_i - perm2_i \right) \qquad (25)$$

where r is a random number from 0 to 1, perm1 and perm2 are two row permutations of the corresponding nests, and R is the probability matrix defined above.
6. rank the solutions and find the current best one.
7. repeat steps 3-6 until the completion criterion is satisfied, which is usually the maximum number of iterations.

a simplified flowchart of the cs algorithm is shown in fig. 3 [48].

fig. 3 a simplified flowchart of the csa

4. numerical results and discussion

in this work, four cases of the opf problem are studied. the proposed algorithm is applied to the standard ieee 30-bus system considering the vpe in the presence of two facts devices, svc and statcom, in order to solve the optimal power flow and the combined economic emission dispatch problem. the single-line diagram of the system is shown in fig. 4.
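the csa steps listed in section 3 can be sketched in a few lines of python. this is a minimal illustration under stated assumptions, not the authors' matlab implementation: the lévy step is drawn with mantegna's algorithm, a candidate replaces a nest only if it lowers the objective (greedy acceptance), the objective is minimized, and the names `levy_step`, `cuckoo_search`, `step_scale` and `levy_beta` are hypothetical.

```python
import math
import random


def levy_step(levy_beta, rng):
    """Draw one heavy-tailed step with Mantegna's algorithm (an assumed choice)."""
    num = math.gamma(1 + levy_beta) * math.sin(math.pi * levy_beta / 2)
    den = math.gamma((1 + levy_beta) / 2) * levy_beta * 2 ** ((levy_beta - 1) / 2)
    sigma = (num / den) ** (1 / levy_beta)
    u = rng.gauss(0.0, sigma)
    v = rng.gauss(0.0, 1.0) or 1e-12  # guard against an exact zero
    return u / abs(v) ** (1 / levy_beta)


def cuckoo_search(f, lower, upper, n_nests=25, pa=0.25, levy_beta=1.5,
                  step_scale=0.01, max_iter=100, seed=0):
    """Minimize f over the box [lower, upper]; a lower objective is better."""
    rng = random.Random(seed)

    def clip(x):
        return [min(max(xi, lo), hi) for xi, lo, hi in zip(x, lower, upper)]

    # step 2: random initial population of host nests
    nests = [[rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
             for _ in range(n_nests)]
    fit = [f(x) for x in nests]
    history = []
    for _ in range(max_iter):
        for i in range(n_nests):
            # step 3: generate a cuckoo by a levy flight around nest i
            cuckoo = clip([xi + step_scale * (hi - lo) * levy_step(levy_beta, rng)
                           for xi, lo, hi in zip(nests[i], lower, upper)])
            fc = f(cuckoo)
            # step 4: compare with a randomly chosen nest j
            j = rng.randrange(n_nests)
            if fc < fit[j]:
                nests[j], fit[j] = cuckoo, fc
        # step 5: update each component with probability pa, as in eqs (24)-(25)
        perm1 = rng.sample(nests, n_nests)
        perm2 = rng.sample(nests, n_nests)
        for i in range(n_nests):
            r = rng.random()
            cand = clip([xi + (r * (p1 - p2) if rng.random() < pa else 0.0)
                         for xi, p1, p2 in zip(nests[i], perm1[i], perm2[i])])
            fc = f(cand)
            if fc < fit[i]:  # greedy acceptance (assumption)
                nests[i], fit[i] = cand, fc
        # step 6: record the current best objective value
        history.append(min(fit))
    best = min(range(n_nests), key=fit.__getitem__)
    return nests[best], fit[best], history
```

for example, `cuckoo_search(lambda v: sum(t * t for t in v), [-5, -5], [5, 5])` minimizes a two-dimensional sphere function. in the opf setting, `f` would be the fuel cost or the combined cost function evaluated after a power flow solution, with the constraints handled separately (for instance by penalties), which this sketch does not cover.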
all the case studies are executed in matlab 2017 under windows 8.1 on an intel core(tm) i5-3110 cpu at 2.40 ghz with 4 gb of ram. table 1 and table 2 list the coefficients of the cost and emission functions of the six generators, together with the power limits pmax and pmin. the cost functions of generators 1 and 2 are obtained from the ripple curve; this curve contains a higher order of non-linearity and discontinuity due to the valve-point effect. the cost coefficients of these units are given in table 1.

the parameters of the cuckoo search algorithm are:
▪ maximum number of iterations (kmax) is 100.
▪ the rate of discovery of eggs / solutions (pa) is 0.25.
▪ the number of nests is 70.

fig. 4 ieee 30 bus system structure

table 1 cost coefficients of generators for ieee 30-bus system

bus  pmin (mw)  pmax (mw)  ci ($/h)  bi ($/mwh)  ai ($/mw²h)  di ($/mwh)  ei ($/mw²h)
1    50         200        150       2           0.0016       50          0.0630
2    20         80         25        2.5         0.0100       40          0.0980
5    15         50         0         1.00        0.0625       /           /
8    10         35         0         3.25        0.00834      /           /
11   10         30         0         3.00        0.025        /           /
13   12         40         0         3.00        0.025        /           /

the coefficients of the gas emission function are shown in table 2.

table 2 emission coefficients of generators for ieee 30-bus system

node  γi (kg/h)  βi (kg/mwh)  αi (kg/mw²h)
1     22.983     -1.1000      0.0126
2     25.313     -0.1000      0.0200
5     25.505     -0.0100      0.0270
8     24.900     -0.0050      0.0291
11    24.700     -0.0040      0.0290
13    25.300     -0.0055      0.0271

4.1. case 1: optimal power flow (opf)

an optimal power flow program that includes the valve-point loading effect, based on the newton-raphson method, is used to determine the voltages at the different buses, the generated powers and the transmission losses. the results obtained for case 1 are shown in table 3.
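the fuel cost model of equation (10) can be evaluated directly from the coefficients of table 1. the sketch below makes two assumptions: the table columns labelled di and ei are taken to be the valve-point coefficients ei and fi of equation (10), and the sine term is rectified (absolute value). the names `UNITS`, `fuel_cost` and `total_fuel_cost` are illustrative only.

```python
import math

# (pmin, a, b, c, e, f) per generator bus, taken from table 1.
# e and f are the columns labelled di and ei there (assumed mapping).
UNITS = {
    1:  (50, 0.0016, 2.0, 150, 50, 0.0630),
    2:  (20, 0.0100, 2.5, 25, 40, 0.0980),
    5:  (15, 0.0625, 1.0, 0, 0, 0),
    8:  (10, 0.00834, 3.25, 0, 0, 0),
    11: (10, 0.025, 3.0, 0, 0, 0),
    13: (12, 0.025, 3.0, 0, 0, 0),
}


def fuel_cost(bus, p):
    """Fuel cost in $/h of one unit at output p (mw), valve-point term included."""
    pmin, a, b, c, e, f = UNITS[bus]
    return a * p * p + b * p + c + abs(e * math.sin(f * (pmin - p)))


def total_fuel_cost(dispatch):
    """Sum the unit costs for a dispatch given as {bus: mw}."""
    return sum(fuel_cost(bus, p) for bus, p in dispatch.items())


# generation of case 1 as listed in table 3
case1 = {1: 200.0, 2: 20.0, 5: 20.782, 8: 23.746, 11: 15.419, 13: 13.613}
print(round(total_fuel_cost(case1), 2))
```

under these assumptions the case 1 dispatch evaluates to roughly 921.6 $/h, close to the 921.88 $/h reported for cs opf; the small gap may come from rounding of the tabulated powers or from a different treatment of the valve-point term.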
table 3 optimal power flow results

bus no.  v (p.u.)  angle (deg)  injection (mw)  injection (mvar)  generation (mw)  generation (mvar)  load (mw)  load (mvar)
1        1.06      0            200.00          -7.157            200              -7.157             0          0
2        1.043     -4.141       -1.7            26.044            20               38.744             21.7       12.7
5        1.01      -10.6513     -73.418         7.276             20.782           26.276             94.2       19
8        1.01      -7.9854      -6.254          -15.163           23.746           14.837             30         30
11       1.082     -8.0311      15.419          15.324            15.419           15.324             0          0
13       1.071     -9.66        13.613          8.157             13.613           8.157              0          0
total                           10.266          -6.720            293.566          119.48             283.400    126.20

our results are compared with those obtained by other methods in table 4. the cs-opf algorithm gives a better result than the other methods reported in the literature. the total cost found by the csa is smaller than those found by the ga, mga, pso, fpso and ga-mga methods, which are 923.07 $/h, 928.56 $/h, 923.72 $/h, 923.54 $/h and 922.77 $/h respectively. the cost difference ranges from 0.1% to 0.71% relative to the values obtained by the cs opf algorithm.

table 4 optimal output power for ieee 30-bus system with different algorithms

variable           cs opf   ga [49]  mga [49]  pso [49]  fpso [49]  ga-mga [49]
pg1 (mw)           200.00   199.34   199.66    199.78    199.78     199.73
pg2 (mw)           20.00    20.03    20.14     20.24     20.00      20.00
pg3 (mw)           20.78    23.13    18.70     21.60     25.42      18.49
pg4 (mw)           23.74    22.78    17.18     19.91     22.43      24.29
pg5 (mw)           15.41    13.97    10.31     14.22     13.37      16.74
pg6 (mw)           13.61    14.56    27.77     18.13     12.94      14.57
pg total (mw)      293.56   293.81   293.76    293.88    293.94     293.82
fuel cost ($/hr)   921.88   923.07   928.56    923.72    923.54     922.77
losses (mw)        10.26    10.41    10.360    10.480    10.55      10.42

the active losses found by the csa are 10.266 mw, which is smaller than the values obtained by the ga, mga, pso, fpso and ga-mga techniques. figures 5, 6, 7 and 8 illustrate the convergence of the fuel cost, the optimal generated powers, the variation of the transmission losses and the nodal voltage values, respectively.
these graphs clearly indicate that csa converges rapidly to the optimal solution.

fig. 5 convergence of fuel cost
fig. 6 optimal values of the powers generated.
fig. 7 variation of active losses
fig. 8 nodal voltage values

4.2. case 2: economic / environmental dispatch (with variable losses)

to demonstrate the effectiveness of the proposed approach, the combined economic-environmental dispatch is solved within the optimal power flow by introducing the price penalty factor. the transmission losses vary with the generated power. the price penalty factors of the six generators are 2.000, 1.9888, 2.2296, 2.0534, 2.2198 and 2.3378 ($/kg) respectively. the optimal values of the generated powers, transmission losses, fuel cost and nox emissions, together with a comparison of our results with those obtained using the hsabc algorithm (harvest season artificial bee colony), are given in table 5.

for the economic dispatch (opf), the production cost is reduced to its minimum (921.88 $/h), which is lower than that of the economic / environmental dispatch (959.94 $/h) and of the environmental dispatch (1071.64 $/h). for the environmental dispatch, the total emission is the lowest (295.92 kg/h) compared with the combined economic environmental dispatch (336.98 kg/h) and the economic dispatch (457.43 kg/h).

according to table 5, the gas emissions found by our algorithm (295.92 kg/h) are lower than those found by the hsabc technique, which are estimated at 309.84 kg/h. the total emissions are thus reduced by 13.92 kg/h. the convergence characteristics of the fuel cost, nox emissions and total cost are depicted in figures 9, 10 and 11 respectively. the graphs clearly indicate that csa converges rapidly to the optimal solution.
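the price penalty factors quoted for case 2 can be checked against equation (12) and the coefficients of tables 1 and 2. the sketch below assumes that the maximum cost is evaluated with the quadratic model of equation (1), without the valve-point term; the names `UNITS`, `penalty_factor` and `select_penalty_factor` are illustrative, and the selection routine simply follows the three steps of section 2.4.

```python
# (pmax, a, b, c, alpha, beta, gamma) per generator bus, from tables 1 and 2
UNITS = {
    1:  (200, 0.0016, 2.0, 150, 0.0126, -1.1, 22.983),
    2:  (80, 0.0100, 2.5, 25, 0.0200, -0.1, 25.313),
    5:  (50, 0.0625, 1.0, 0, 0.0270, -0.01, 25.505),
    8:  (35, 0.00834, 3.25, 0, 0.0291, -0.005, 24.900),
    11: (30, 0.025, 3.0, 0, 0.0290, -0.004, 24.700),
    13: (40, 0.025, 3.0, 0, 0.0271, -0.0055, 25.300),
}


def penalty_factor(bus):
    """Ratio of fuel cost to emission at maximum output, as in equation (12)."""
    pmax, a, b, c, alpha, beta, gamma = UNITS[bus]
    cost = a * pmax ** 2 + b * pmax + c                 # quadratic cost, eq. (1)
    emission = alpha * pmax ** 2 + beta * pmax + gamma  # emission, eq. (11)
    return cost / emission


def select_penalty_factor(demand):
    """Rank units by factor, add pmax until the demand is covered (section 2.4)."""
    covered = 0.0
    for bus in sorted(UNITS, key=penalty_factor):
        covered += UNITS[bus][0]
        if covered >= demand:
            return bus, penalty_factor(bus)
    raise ValueError("demand exceeds the total generation capacity")


for bus in UNITS:
    print(bus, round(penalty_factor(bus), 4))
```

with these inputs the six ratios agree with the values listed above to within rounding, which suggests that the factors were computed from the quadratic cost and emission curves at pmax. for a 283.4 mw load the ranking procedure stops at the unit on bus 8; the paper does not state which factor was finally used, so that selection is only what the described steps yield here.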
table 5 economic-environmental dispatch with variable losses

variable            economic dispatch  combined economic emission  hsabc [50]  environmental dispatch
                    (cs opf)           dispatch (cs)                           (cs)
pg1 (mw)            200.00             149.46                      126.07      114.99
pg2 (mw)            20.00              51.41                       49.74       49.07
pg3 (mw)            20.78              18.37                       28.40       37.08
pg4 (mw)            23.75              31.29                       31.80       28.11
pg5 (mw)            15.42              25.33                       26.63       29.69
pg6 (mw)            13.61              14.99                       27.17       30.41
total pg (mw)       293.56             290.85                      289.81      289.36
cost ($/hr)         921.88             959.95                      1048.68     1071.64
emission (kg/hr)    457.43             336.98                      309.84      295.92
total cost ($/hr)   /                  1655.53                     /           /
losses (mw)         10.26              7.783                       6.41        5.40

fig. 9 production cost
fig. 10 nox emissions
fig. 11 total cost

4.3. case 3: opf with the presence of svc

to solve the opf problem with an svc device, the investment cost of the svc is taken into account. the load is increased from 283.40 mw to 383.40 mw by adding 100 mw at bus 20. the candidate bus for the svc is the bus where the voltage drop is largest, so bus 20 is chosen for the installation. the parameters of the svc are grouped in table 6.

table 6 svc parameters

qsvcmax (mvar)  qsvcmin (mvar)  c ($/kvar)  b ($/kvar²)  a ($/kvar³)
100             -100            188.22      -0.2691      0.0003

the power flows in the power system without and with the svc are reported in tables 7 and 8 respectively. from table 7, the voltage at bus 20 without svc is the lowest in the system (0.8939 p.u.); the corresponding voltage drop of 10.61% exceeds the maximum allowable value of 5%.
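the svc cost model and the voltage-drop check can be illustrated numerically. the sketch below uses the polynomial form of equation (16) with the coefficients of table 6 and the three-year depreciation of equation (20). converting the $/kvar figure into an investment in $ by multiplying by the svc rating is an assumption made here, since the text does not spell out that step, and the function names are hypothetical.

```python
HOURS_PER_YEAR = 8760
DEPRECIATION_YEARS = 3


def svc_cost_per_kvar(s_mvar, a=0.0003, b=-0.2691, c=188.22):
    """Polynomial cost in $/kvar for an svc of s_mvar, table 6 coefficients."""
    return a * s_mvar ** 2 + b * s_mvar + c


def svc_hourly_cost(s_mvar):
    """Average cost in $/hour over the depreciation period, as in eq. (20)."""
    # investment in $ = ($/kvar) x rating in kvar  (assumed conversion)
    investment = svc_cost_per_kvar(s_mvar) * abs(s_mvar) * 1000.0
    return investment / (HOURS_PER_YEAR * DEPRECIATION_YEARS)


def voltage_drop_percent(v_pu, nominal=1.0):
    """Voltage drop relative to the nominal value, in percent."""
    return (nominal - v_pu) / nominal * 100.0


print(round(svc_cost_per_kvar(2.696), 2))    # svc output listed in table 9
print(round(voltage_drop_percent(0.8939), 2))
```

at the svc output of 2.696 mvar listed in table 9, the polynomial gives about 187.50 $/kvar, in line with the 187.49 $/kvar reported there, and the bus 20 voltage of 0.8939 p.u. corresponds to the 10.61% drop quoted above.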
table 7 optimal power flow without svc (pload = 383.4 mw)

bus no.  v (p.u.)  angle (deg)  injection (mw)  injection (mvar)  generation (mw)  generation (mvar)  load (mw)  load (mvar)
1        1.06      0.00         199.78          2.31              199.98           2.31               0.00       0.00
2        1.043     -3.70        58.30           28.59             80.00            41.29              21.70      12.70
5        1.01      -10.89       -65.72          12.69             28.48            31.69              94.20      19.00
8        1         -9.46        4.38            1.23              34.38            31.23              30.00      30.00
11       1.062     -11.28       27.02           23.71             27.02            23.71              0.00       0.00
13       1.071     -12.52       39.10           23.96             39.10            23.96              0.00       0.00
20       0.8939    -28.42       -102.20         -0.70             0.00             0.00               102.20     0.70
total                           25.36           51.29             408.76           177.49             383.40     126.20

the results obtained with the csa when the svc device is considered are reported in table 8.

table 8 optimal power flow with svc.

bus no.  v (p.u.)  angle (deg)  injection (mw)  injection (mvar)  generation (mw)  generation (mvar)  load (mw)  load (mvar)
1        1.06      0            200.00          -0.884            200.00           -0.884             0          0
2        1.043     -3.692       58.251          19.767            79.951           32.467             21.7       12.7
5        1.01      -10.75       -64.28          7.724             29.916           26.724             94.2       19
8        1.01      -9.591       4.966           4.719             34.966           34.719             30         30
11       1.082     -11.07       30              24.123            30               24.123             0          0
13       1.071     -13.28       32.222          14.397            32.222           14.397             0          0
20       0.95      -28.34       -102.2          16.908            0                17.608             102.2      0.7
total                           23.692          46.254            407.09           172.45             383.4      126.2

table 8 shows that the svc is effective when installed at the bus with the greatest voltage drop, here bus 20. installing the svc reduces the fuel cost and the transmission losses and raises the voltage at bus 20 from 0.8939 to 0.95 p.u.

table 9 optimal results with and without svc.
variable           cs with svc  cs without svc
pg1 (mw)           200.00       199.98
pg2 (mw)           79.95        80
pg5 (mw)           29.91        28.48
pg8 (mw)           34.96        34.38
pg11 (mw)          30           27.02
pg13 (mw)          32.22        39.10
v1 (pu)            1.06         1.06
v2 (pu)            1.043        1.043
v5 (pu)            1.01         1.01
v8 (pu)            1.01         1
v11 (pu)           1.082        1.062
v13 (pu)           1.071        1.071
v20 (pu)           0.95         0.8939
total pg (mw)      407.09       408.96
losses (mw)        23.69        25.36
qsvc (mvar)        2.696        /
cost svc ($/kvar)  187.49       /
cost ($/hr)        1372.23      1375.49

from table 9, the total cost obtained by our algorithm with the svc located at bus 20 (1372.23 $/h) is lower than without svc (1375.49 $/h); the cost is reduced by 3.2581 $/h. the transmission losses in this case (23.691 mw) are lower than without the svc device (25.36 mw); they are reduced by 1.66 mw.

fig. 12 production cost
fig. 13 variation of active losses
fig. 14 optimal values of the generated powers
fig. 15 voltage profile

we also deduce that the cuckoo search algorithm quickly converges to the optimal solution.

4.4. case 4: opf with the presence of statcom

in the fourth application, we are interested in solving the optimal power flow with a statcom integrated into the power system. the load demand is increased from 283.40 mw to 383.40 mw. to maintain all the voltages at acceptable values, the candidate bus for the statcom is the bus where the voltage drop is largest; bus 20 is therefore chosen. the voltage source value is set to 1.00 p.u. an optimal power flow program based on the newton-raphson method [51, 52] determines the voltages (magnitude and angle) at the different buses, the generated powers and the transmission losses. the opf results obtained with the statcom installed are given in tables 10 and 11.

table 10 optimal power flow with statcom

bus no.  v (p.u.)  angle (deg)  injection (mw)  injection (mvar)  generation (mw)  generation (mvar)  load (mw)  load (mvar)
1        1.06      0            180.245         -5.581            199.868          -5.581             0          0
2        1.043     -3.703       58.3            8.114             80               20.81              21.7       12.7
5        1.01      -10.93       -65.75          -0.161            25.44            18.84              94.2       19
8        1.02      -9.682       4.916           0.653             34.91            30.65              30         30
11       1.082     -11.04       29.97           16.64             29.97            16.64              0          0
13       1.081     -12.57       36.29           9.226             36.29            9.226              0          0
20       1         -27.97       -102.3          34.4              0                35.1               102.2      0.7
total                           23.33           42.5              406.5            168.7              383.4      126.2

the simulation results in table 10 show that adding a statcom at bus 20 improves the voltage at that bus (from 0.8939 to 1.00 p.u.) as well as the voltage levels of the other buses.

table 11 simulation results of optimal values

variable        cs with svc (bus n°20)  cs with statcom  cs without svc
pg1 (mw)        200.00                  199.8688         199.98
pg2 (mw)        79.95                   80.0000          80
pg5 (mw)        29.91                   25.4363          28.48
pg8 (mw)        34.96                   34.9129          34.38
pg11 (mw)       30                      29.9736          27.02
pg13 (mw)       32.22                   36.2903          39.10
v1 (pu)         1.06                    1.06             1.06
v2 (pu)         1.043                   1.043            1.043
v5 (pu)         1.01                    1.01             1.01
v8 (pu)         1.01                    1.02             1
v11 (pu)        1.082                   1.082            1.062
v13 (pu)        1.071                   1.081            1.071
v20 (pu)        0.95                    1.000            0.8939
total pg (mw)   407.09                  406.4819         408.96
losses (mw)     23.69                   23.3280          25.36
cost ($/hr)     1372.23                 1363.83387       1375.49

table 12 statcom parameter result

bus  vsh of statcom (p.u.)  thst of statcom (deg)  qsh of statcom (p.u.)
20   1.00                   -28.1606               -0.3505

table 11 shows that the csa with statcom gives a lower fuel cost (1363.83387 $/h) than the case without statcom (1375.49069 $/h); the cost is reduced by 11.65 $/h. the power losses decrease from 25.3580 mw to 23.3280 mw, a reduction of 2.03 mw. the proposed algorithm therefore performs well on the opf problem with statcom and yields the best solution among the cases studied: the fuel cost and the transmission losses are reduced, and the voltage magnitudes are maintained at the specified values.
the variation of the fuel cost, the generated powers, the transmission losses and the nodal voltage values are illustrated in figures 16, 17, 18 and 19 respectively.

fig. 16 production cost
fig. 17 variation of the powers generated.
fig. 18 variation of active losses
fig. 19 voltage profile

we also deduce that the cuckoo search algorithm quickly converges to the optimal solution.

5. conclusions

the main difficulty of such an optimization problem lies in the conflict between the production cost function, the toxic gas emission function, the valve-point loading effect and the cost function of the facts devices. it requires the transformation of this multiobjective problem into a single-objective optimization problem. to do this, we converted the economic-environmental dispatch problem into a single-objective optimization problem by introducing a price penalty factor.

the csa was validated on the ieee 30-bus system. the simulation results show that the proposed technique is a competitive algorithm for solving the problems considered. a comparison of the obtained results with those recently published in the literature confirms the efficiency and robustness of the algorithm in finding precise solutions.

in this paper, we have shown the positive contribution of inserting facts devices into the power system: they improve the voltage profile, increase the power flow capability and reduce the active power losses, to the benefit of the optimal management of the electrical system.

we also conclude that, despite the added complexity that facts devices and the valve-point loading effect bring to power network problems by changing the network topology, the csa provides a better solution to the optimal power flow and economic-environmental dispatch problems.
to ensure good results, in the future, we will endeavor to find a parameter-free developed technique combined with the csa algorithm and introduce it to other kinds of optimization issues, such as multi-objective ed problems with many complex constraints, dynamic ed problems and large-scale eld problems integrated renewable energy sources.

facta universitatis series: electronics and energetics vol. 29, n o 2, june 2016, pp. 243 260 doi: 10.2298/fuee1602243m

parallel execution tracing: an alternative solution to exploit under-utilized resources in multi-core architectures for control-flow checking

mohammad maghsoudloo, hamid r. zarandi
amirkabir university of technology (tehran polytechnic), iran

abstract. in this paper, a software behavior-based technique is presented to detect control-flow errors in multi-core architectures. the analysis of a key point leads to introduction of the proposed technique: employing under-utilized cpu resources in multicore processors to check the execution flow of the programs concurrently and in parallel with the main executions. to evaluate the proposed technique, a quad-core processor system was used as the simulation environment, and the behavior of spec cpu2006 benchmarks were studied as the target to compare with conventional techniques.
the experimental results, with regard to both detection coverage and performance overhead, demonstrate that, on average, about 94% of the control-flow errors can be detected by the proposed technique, with less performance overhead than previous techniques.

key words: on-line error detection, control-flow error, error detection coverage, multi-core processor, cpu utilization.

received april 6, 2015; received in revised form december 27, 2015
corresponding author: mohammad maghsoudloo, amirkabir university of technology (tehran polytechnic), 424 hafez ave, iran (e-mail: m.maghsoudloo@aut.ac.ir)

1. introduction

advances in cmos technology have reduced transistor sizes and voltage levels [1]. as the number of available transistors continues to grow exponentially, adding and using more hardware resources has emerged as a convenient way to increase the performance of microprocessors. while these additional resources yield performance enhancements, they also increase power dissipation [2]. in order to improve the performance per cost ratio of processors, computer architects are forced to shift from single-core single-threaded processors to multi-core multi-threaded processors or chip multi-processors (cmps) [2], [3]. however, current trends in computer architecture show that forthcoming processors will face new challenges related to their increased vulnerability to transient hardware faults [4]. most soft errors are the result of energetic particle strikes, induced by high-energy neutrons from cosmic rays and by alpha particles from decaying radioactive impurities in packaging and interconnect materials [5]. soft errors, which occur in both the system memory and the combinational logic of a computer system, need to be addressed during the design phase, especially for safety-critical applications [2].
to counter errors at the different levels of the memory hierarchy, designers typically employ redundant information or hardware, such as error correcting codes (eccs) and bit interleaving, to protect memories. similarly, combinational logic within the processor should be protected. therefore, high-availability systems need much more hardware redundancy than that provided by ecc and parity bits. for example, ibm has historically added 20-30% of additional logic within its mainframe processors for fault tolerance [6]. several approaches add extra hardware redundancy and modifications for tolerating hardware faults or detecting soft errors in cmps. these hardware fault tolerant (hwft) techniques, such as core cannibalization architecture (cca) [7], core salvaging [8], and mixed-mode multi-core (mmm) [9], impose higher power consumption and complexity challenges [10]. the inclusion of redundant hardware design and modification may negatively impact the design cycles of systems and also the area- and power-efficiency of modern processors [11], [12], [13]. on the other hand, software fault tolerant (swft) techniques, such as mswat [1] and detouring [11], usually provide adequate levels of fault tolerance and hence system reliability [14]. despite the flexibility of swft techniques, which can moderate the negative impacts of hwft ones, they can lead to considerable performance degradation during execution, because their structures are based on execution replication [10]. in the other category, control-flow checking (cfc) methods, key methods used for monitoring the behavior of a program, have been shown to provide effective and cost-efficient error detection coverage [15], [16], [17], [18], [20]. unfortunately, due to the crucial drawbacks of the hardware-based methods, hardware-based cfc (hcfc) methods are not considered to be appropriate for current architectures [23].
moreover, the structure of conventional software-based cfc (scfc) techniques has not been adapted to the inherent features of cmps. although scfc techniques can reduce the huge performance overhead of the software-based methods, they can still impose high performance overhead on the system (because they insert instructions into the programs). this overhead would undermine the benefits of modern processors and waste the efforts made by designers to improve the execution time of programs and the utilization of processors. typically, the performance overhead of these techniques reaches 40% for detection alone and exceeds 100% for detection and correction [18], which is intolerable in current architectures and applications. so, one effective way to enhance the reliability of cmps is to modify the configuration of software-centric methods so that their structure adapts to the status and load of the cpu. this paper presents an adaptive and efficient scfc technique which takes advantage of the features of multi-core multi-threaded processors, such as parallel execution capabilities, to remove the main drawbacks of the conventional scfc techniques [21], [22]. shifting from single-core to multi-core processors means that programmers must write concurrent multi-threaded programs to make optimal use of the hardware capacity. unfortunately, challenges such as load balancing and sequential and synchronization dependencies mean that parallelism is growing slowly, and the resources in multi-core processors may not be employed properly [19], [28], [29]. so, there are some idle cycles and resources during the execution of each core, which become more noticeable as the number of cores increases.
therefore, the availability of idle hardware resources in current processors has motivated us to use under-utilized cpu capacity for a parallel and concurrent execution tracing technique that is compatible with the features of multi-core multi-threaded processors. this idea moderates the negative effects of the serial control-flow checking proposed by previous techniques, which causes undesirable performance and memory overheads. to evaluate the proposed technique, eleven well-known programs in the spec cpu2006 benchmark suite [30] were utilized. these benchmarks were run on a quad-core processor system, intel® core™ i7-740qm with 6.00 gb of ram, with a real operating system (red hat enterprise linux as release 10). the results of injecting about 12000 errors reveal that, on average, about 94% of the cfes were detected by the proposed technique. moreover, based on a comparison metric that considers the effects of the methods on both cfe detection coverage and performance overhead, the proposed technique is found to have higher coverage with lower overhead than the previous works. the structure of this paper is as follows: section 2 introduces terminology. section 3 describes the main problem and motivation. the structure of the proposed technique is explained in section 4. the simulation environment and the experimental results are shown in section 5. finally, section 6 concludes the paper.

2. background

hardware techniques have traditionally been used to meet reliability requirements. however, they are not feasible for systems where cost is a critical issue. the use of commercial off-the-shelf (cots) components for safety-critical applications has been suggested to accelerate the development cycle and produce cost-effective systems. cots components require specific approaches to take into account the effect of possible hardware faults [17]. therefore, swft techniques have recently been preferred.
re-execution of the program at different levels of the code (thread, process, or instruction) is the basis of most swft techniques for detecting faults. however, these methods incur around 100% performance overhead, which may not be suitable for most current applications and systems [11], [12], [13]. consequently, the need for state-of-the-art computing performance, coupled with cost containment, provides a strong motivation for investigating feasible alternatives to traditional solutions [17]. transient or intermittent faults induced in hardware affect the software running on it, causing either data errors or control-flow errors. control-flow errors occur when a processor jumps to an incorrect next instruction, and they have been demonstrated to account for more than 70% of all errors [17]. so, due to the incidence and importance of control-flow errors and the advantages of software-based techniques, using scfc techniques to detect control-flow errors has been shown to be an effective and low-cost alternative for enhancing the reliability of processors. moreover, previous studies on the effects of multiple bit upsets (mbus), a new type of error arising from the shrinking feature size of transistors, have demonstrated that the probability of occurrence of control-flow errors has recently increased [31]. table 1 summarizes the features of different methods for enhancing the reliability of cmps. as table 1 demonstrates, if the high performance degradation imposed by the scfc techniques can be moderated, they will be more compatible with the features of modern processors than the other types of proposed methods.
table 1 a general comparison among different methods for enhancing the reliability of cmps

categories | detecting hardware faults/errors in memory | detecting hardware faults/errors in combinational logic | need program modification | need hardware modification | performance degradation
hwft | most | almost all | no | yes | low
ecc | almost all | none | no | yes | low
hwft+ecc | almost all | almost all | no | yes | high
hcfc | most | most | no | yes | low
swft | most | most | yes | no | very high
scfc | most | most | yes | no | high

3. problem and motivation

in the general case, almost all of the previous scfc techniques are based on inserting some instructions into the program code. these instructions are responsible for monitoring the flow of the program execution and detecting any violation of run-time behavior. the fault model considered in this paper is transient faults that lead to an illegal sequence of basic block executions, i.e. control-flow errors. extracting the control flow graph (cfg) from the relations among the basic blocks of a program is a prerequisite step in both scfc and hcfc methods. any error or limitation in capturing the control dependencies among the nodes of the cfg means that the flow of a given program will not be followed precisely in the checking phase. as fig. 1(a) shows, after determining the control dependencies among the basic blocks of the program, each node of the cfg should be labeled with a unique signature, and a global variable or register should be reserved to store the values of the run-time signatures (for example, variable sig in fig. 1(a)). after assigning signatures to each basic block (bb), the types and locations of the checking and updating instructions should be specified. fig. 1(b) demonstrates the method used by the conventional techniques for locating and specifying the additional statements in the program code. most of the previous scfc techniques added these instructions at the beginning and/or at the end of the bbs in order to achieve high detection coverage.
the sequence of the signatures is checked at run-time by conditional branch instructions. these instructions compare the value of the run-time signature with the pre-defined value assigned to each block at design or compile time in order to detect any misbehavior. the run-time signatures should then be updated, so that the next checking instructions can confirm correct execution. the major differences among previous related methods are in the content of the updating statements. for example, the control-flow checking by software signatures (cfcss) technique [16] computes the signature of the destination block from the signature of the source block by applying the xor function to the signatures of the current node and the destination node. in addition, the yet another control-flow checking using assertions (yacca) technique [17] defines a set of instructions that updates the signature using an and operation and an xor function with two constant masks. finally, the control-flow error detection through assertions (ceda) technique [15] inserts fewer instructions (only an xor operation) than the previous works by calculating signatures differently. moreover, this technique can uniquely identify the node from which the illegal jump occurred. this matters because eventual cfe correction imposes stronger requirements, for which none of the other techniques mentioned above can be used as proposed. therefore, ceda has been shown to be more efficient and effective than the prior methods [15], [18]. fig. 1(b) shows three typical basic blocks of a program with the added instructions, specified according to ceda. due to these advantages, ceda is selected as the most efficient previous scfc technique for comparison with the proposed method in the rest of this paper.
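the check-then-update pattern described above can be sketched in a few lines. this is a minimal illustration, not the authors' implementation: it only models the signature bookkeeping, the names `BLOCKS`, `run_path` and `ControlFlowError` are hypothetical, and the signature constants are transcribed from the three blocks of the fig. 1(b) listing.

```python
# Signature constants transcribed from the three blocks of the Fig. 1(b) listing.
# Each entry: (expected at entry, entry XOR mask, expected at exit, exit XOR mask).
BLOCKS = {
    "L1": (0b0001, 0b0011, 0b0010, 0b0001),
    "L3": (0b0011, 0b0111, 0b0100, 0b0001),
    "L2": (0b0101, 0b0011, 0b0110, 0b0001),
}


class ControlFlowError(Exception):
    """Raised when the run-time signature does not match the expected one."""


def run_path(path, sig=0b0001):
    """Execute the checking/updating statements of each block in `path`.

    Returns the final run-time signature, or raises ControlFlowError as soon
    as a check fails. Block bodies are omitted (assumption: they do not
    touch the signature).
    """
    for name in path:
        entry_expected, entry_mask, exit_expected, exit_mask = BLOCKS[name]
        if sig != entry_expected:          # checking statement at block entry
            raise ControlFlowError(f"illegal entry into {name}: sig={sig:04b}")
        sig ^= entry_mask                  # updating statement (XOR)
        if sig != exit_expected:           # checking statement at block exit
            raise ControlFlowError(f"corrupted flow inside {name}: sig={sig:04b}")
        sig ^= exit_mask                   # updating statement before the jump
    return sig


if __name__ == "__main__":
    print(f"{run_path(['L1', 'L3', 'L2']):04b}")   # legal path
    try:
        run_path(["L1", "L2"])                     # illegal jump L1 -> L2
    except ControlFlowError as err:
        print("detected:", err)
```

in this sketch every block pays for two comparisons and two xor updates, which is the serial cost that the rest of the paper tries to move off the main thread.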
.l1:
    br    sig != 0001, err
    movl  sig, sig xor 0011
    pushl %ebp
    movl  %esp, %ebp
    pushl %edi
    pushl %esi
    subl  $44, %esp
    movl  $0, -28(%ebp)
    movl  $0, -36(%ebp)
    br    sig != 0010, err
    movl  sig, sig xor 0001
    jmp   .l3
.l2:
    br    sig != 0101, err
    movl  sig, sig xor 0011
    movl  $0, -32(%ebp)
    addl  %eax, %eax
    addl  %edx, %eax
    addl  %edi, %eax
    br    sig != 0110, err
    movl  sig, sig xor 0001
    jmp   .l4
.l3:
    br    sig != 0011, err
    movl  sig, sig xor 0111
    movl  $0, -28(%ebp)
    addl  %eax, %eax
    addl  %edx, %eax
    addl  %edi, %eax
    cmpl  $2, -22(%ebp)
    addl  $1, -30(%ebp)
    br    sig != 0100, err
    movl  sig, sig xor 0001
    jle   .l2

fig. 1 the conventional algorithm for (a): assigning signatures, (b): locating the added instructions, and specifying types of the added instructions (panel (a) shows a control flow graph with basic blocks bb1 to bb4 and signature values sig = 1 to sig = 9; panel (b) is the listing above)

3.1. the problem

unfortunately, because of the checking and updating statements added at the beginning and at the end of each bb, applying previous scfc techniques on current processors leads to high performance degradation. the conditional branch instructions, used as the usual checking statements by the previous scfc techniques, take two clock cycles, and the xor operation, used for updating the value of the run-time signature, takes one clock cycle. so, the negative impact of the checking statements on the execution time is greater than that of the updating ones. on average, about 75% of the performance degradation due to the previous scfc techniques is imposed just by executing the checking statements. this drawback of the scfc techniques can undermine the benefits of modern processors and waste the efforts made by designers to improve the execution time of programs and the utilization of processors.

3.2. the motivation

in order to reduce the negative impacts of the checking instructions, the features of hcfc techniques can be modeled.
the hcfc techniques check the behavior of the main processor in parallel with the execution of the main program [24], [25]. in the general case, an external checker or watchdog processor is responsible for checking the sequence of the signatures of the main program, or the accesses of the main processor to the memory, in order to detect cfes during execution. although hcfc techniques can moderate the main drawback (performance overhead) of scfc techniques, they require hardware modifications, which, for the reasons mentioned in the previous section, make them impossible to use in cots and current processors. one effective way to take advantage of parallel cfc in current architectures (without adding any hardware redundancy) is to use the under-utilized cpu capacity in multi-core processors. utilization is the percentage of time that a component is actually occupied, compared to the total time that the component is available for use [10], [19], [29]. unfortunately, developing parallel software for shared-memory multi-cores using today's programming languages can be challenging. therefore, due to the lack of high-level parallelization constructs, multi-core processors are not being fully leveraged [19], [29]. fig. 2 shows the average percentage of cpu usage of the system for three real applications from the spec cpu2006 benchmark suite when running on a real quad-core system (the execution times of the applications in this figure are not real). in the example, the cores are only executing the designated application. the results show that, since full parallelism is not possible, the cpu is not fully occupied during the execution of these applications (even when the number of running programs grows to 3), and this becomes worse as the number of cores in the system increases. therefore, there are enough resources in the system to run the cfc mechanism during the idle cycles of the cpus.
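the definition of utilization given above reduces to a simple ratio. the sketch below is only an illustration of that arithmetic: the function names are hypothetical and the busy times are invented numbers, not measurements from fig. 2.

```python
def utilization_percent(busy_seconds, available_seconds):
    """Share of the available time in which a component was actually occupied."""
    if available_seconds <= 0:
        raise ValueError("available time must be positive")
    return 100.0 * busy_seconds / available_seconds


def idle_headroom_percent(per_core_busy, window_seconds):
    """Average share of the window in which the cores were idle."""
    used = [utilization_percent(busy, window_seconds) for busy in per_core_busy]
    return 100.0 - sum(used) / len(used)


if __name__ == "__main__":
    # Hypothetical busy times (seconds) of four cores over a 40 s window.
    busy = [24.0, 10.0, 6.0, 0.0]
    print(utilization_percent(busy[0], 40.0))   # 60.0
    print(idle_headroom_percent(busy, 40.0))    # 75.0
```

the idle headroom computed this way is the budget that a parallel checking mechanism could draw on without slowing the main threads.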
fig. 2 percentage of cpu usage for three benchmarks: (a): bzip2, (b): bzip2*, (c): mcf (each panel plots % cpu usage, 0 to 70, against time in seconds, 1 to 40)

4. the proposed technique for parallel scfc

the idea for exploiting under-utilized resources in multi-cores is to develop a watchdog thread, which is responsible for checking the sequence of signatures in parallel with the updating phases of the signatures by the main threads. as fig. 3 shows, although redundancy at thread level allows the operating system to freely schedule the competing threads across all available resources [1], the structure of the watchdog thread should be organized with regard to some important facts:

1. too many new dependencies should not appear between the watchdog and the main threads. increasing the synchronization and communication dependencies among the threads of a process directly increases the number of interruptions and reduces the level of parallelism in multi-threaded programs. so, as far as possible, the watchdog thread should not interrupt the execution of the main ones.

2. the watchdog thread can check the value of the signatures only between two consecutive updating phases of each main thread. otherwise, the number of false positive cfes (wrongly detected cfes) will increase. suppose that the watchdog thread compares the value of the signature with the expected value after two updating phases of the main thread. then the mismatch is detected as a cfe, while in reality it is not.
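the false positive described in the second point can be reproduced with a toy model of a single signature register. this is a hedged sketch: `sample_register` and the signature values are hypothetical, and the model ignores real thread scheduling.

```python
def sample_register(updates, sampled_after):
    """Value held by a single signature register after `sampled_after` updates."""
    register = None
    for signature in updates[:sampled_after]:
        register = signature          # each update overwrites the previous value
    return register


# Hypothetical signatures written by a fault-free main thread.
legal_sequence = [0b0010, 0b0011, 0b0100]
expected_first = legal_sequence[0]

# The watchdog samples between the first and second update: no mismatch.
on_time = sample_register(legal_sequence, 1) != expected_first
# The watchdog samples after two updates: a mismatch on a legal run.
too_late = sample_register(legal_sequence, 2) != expected_first

if __name__ == "__main__":
    print(on_time, too_late)   # False True
```

the second comparison reports a cfe although the run is fault-free, because the first signature was overwritten before the watchdog could read it.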
these two facts conflict: how can the watchdog thread guarantee that the values of the signatures will be checked exactly between two consecutive updating phases of the main threads, while the execution of the main threads cannot be interrupted? this conflict is removed if the values of the signatures are not overwritten until a certain time, so that the watchdog has enough time to check the sequence of the signatures in each thread. this solution can be implemented by replacing the register with an array for storing the values of the signature. it should be noted that each virtual core in fig. 3 represents a physical core that is assigned to a virtual machine. by default, virtual machines are allocated one core each. if the physical host has multiple cores at its disposal, however, a core scheduler assigns execution contexts and the virtual core essentially becomes a series of time slots on logical processors. therefore, the virtual cores are software/hardware co-elements used by the system; they do not indicate any virtualization in the operation of the watchdog thread.

fig. 3 the presence of watchdog thread in the architectural model of a quad-core processor (labels: application, operating system, system software, system hardware; main thread, watchdog thread; virtual cores v1 to v4; physical cores c1 to c4)

fig. 4 illustrates this idea by showing the updating phases of a thread. the first bit of each element of the array is reserved as a tag bit to show the state of the element (u: un-checked, c: checked). after initialization of the elements (step 1 in fig. 4), the main thread updates the value of the signature stored in the first element, and the state of this element is changed to un-checked (step 2).
in the second updating phase, the second element of the array is updated, and the value of the signature in the first element is still maintained. this process continues until the last element of the array is updated. during the updating phase, starting from the first element, the watchdog can check the elements of the array that are tagged as un-checked. if any mismatch is observed, the occurrence of a cfe is reported; otherwise, the state of the element is changed to checked. when the last element of the array has also been updated (step 4), the main thread is suspended until the watchdog has checked all elements, after which the main thread is allowed to overwrite the elements of the array starting from the first one (steps 6, 7, and 8).

fig. 4 the steps of updating signatures stored in the signature array of each thread: (a): step 1, ..., (h): step 8 (c: checked, u: un-checked; each panel shows the elements s11, s12, ..., s1b of the array with their tag bits)

ignoring dependencies among running threads when designing the cfgs causes interactions among threads to be treated as cfes. using one shared global variable or register for storing the value of the signatures is the main reason for wrong cfe detection, since the running threads can access and alter the value of the signature register simultaneously and without any limitation. for instance, in a dual-threaded program, if one thread updates the value of the signature between two consecutive checking phases of the other one, the unexpected value of the signature is wrongly detected as a cfe by the second thread.
this weakness in detection is known as a false positive cfe: a technique flags some behavior in execution as a cfe, while in reality it is not. in other words, a positive inference about a cfe occurrence is actually false. however, reserving separate arrays for storing the run-time signatures of each thread is introduced as one of the prerequisite steps of the proposed technique for parallel cfc. threads can alter the signature values stored in their own arrays, and their accesses to the arrays of the other threads are limited to checking the values. so, the proposed technique for parallel cfc guarantees that the described scenario for checking and updating the signatures in multi-threaded programs will not happen at run-time.

1  i: a number assigned to each thread (id)
2  j: a number assigned to each element of arrays (index)
3  sij: the value of element j in the array of thread i as the signature
4  lij: the tag bit of element j in the array of thread i
5  eij: the expected value for element j in the array of thread i
6  ti: the thread, identified by the value of variable i as its id
7  b: the number of elements in an array
8  n: the number of threads in the program
9  begin
10   x = 0, i = 1
11   for (j = 1, j <= b, j++)
12   begin
13     while (lij == 1)
14     begin
15       i++
16       if (i > n) then
17         i = 1
18       end
19     end
20     if (sij == eij) then
21       x++, lij = 1
22     end
23     else
24       “cfe occurrence”
25     end
26     i++
27     if (i > n) then
28       if (x == n) then
29         i = 1
30       end
31       else
32         i = 1, goto 13
33       end
34     end
35     if (j == b)
36       continue (ti)
37     end
38   end
39   goto 10
40 end

fig. 5 the algorithm of watchdog thread used in the proposed technique

the configuration of the watchdog thread should be organized at design time, considering the information provided by the algorithm in fig. 5.
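as a minimal sketch of the scheme in fig. 4 and fig. 5, the following python model keeps one signature array per thread and lets a watchdog pass check only the un-checked elements. it is a sequential simulation under stated assumptions, not the authors' implementation: the names `SignatureArray` and `watchdog_pass`, the use of python lists instead of registers, and the choice to start all tag bits in the checked state are illustrative assumptions, and real threads, suspension, and the exact round-robin order of fig. 5 are omitted.

```python
UNCHECKED, CHECKED = 0, 1  # tag-bit encoding used in the text (0: un-checked, 1: checked)


class SignatureArray:
    """Per-thread signature array with one tag bit per element (hypothetical helper)."""

    def __init__(self, size):
        self.size = size
        self.sig = [0] * size
        self.tag = [CHECKED] * size  # assumption: elements start in the checked state
        self.written = 0             # elements filled since the last release

    def is_full(self):
        return self.written == self.size

    def update(self, value):
        """Main-thread side: store a run-time signature and tag it un-checked."""
        if self.is_full():
            # In the paper the main thread is suspended here; this sketch just raises.
            raise RuntimeError("array full: wait for the watchdog")
        self.sig[self.written] = value
        self.tag[self.written] = UNCHECKED
        self.written += 1


def watchdog_pass(arrays, expected):
    """Watchdog side: check un-checked elements; return (thread, index) mismatches."""
    errors = []
    for i, arr in enumerate(arrays):
        for j in range(arr.written):
            if arr.tag[j] == UNCHECKED:
                if arr.sig[j] == expected[i][j]:
                    arr.tag[j] = CHECKED   # signature matches: mark as checked
                else:
                    errors.append((i, j))  # mismatch: report a CFE, leave tag un-checked
        if arr.is_full() and all(t == CHECKED for t in arr.tag):
            arr.written = 0                # release: thread may overwrite from element 0
    return errors


if __name__ == "__main__":
    t1, t2 = SignatureArray(3), SignatureArray(3)
    for s in (1, 1, 1):
        t1.update(s)
    t2.update(1)
    t2.update(9)  # corrupted signature in thread 2
    print(watchdog_pass([t1, t2], [[1, 1, 1], [1, 1, 1]]))
```

because each thread writes only to its own array, an update by one thread cannot be mistaken for a cfe in another thread, which is the false-positive scenario described above.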
the following steps describe the structure of the watchdog thread:
• to select the execution flow of the first thread for checking the sequence of signatures, the value of i is initialized to 1 (line 10). to ensure that element j in the array of each thread has been checked (before increasing the value of j), variable x is reserved to count the number of threads checked during each round of the algorithm.
• to determine whether the selected element is checked or un-checked, the value of lij is tested (line 13): 0 represents the un-checked state, and 1 indicates the checked state. if this element of the selected array has been checked in the previous round, the value of i is increased by 1 (line 15), and the next thread in the list of threads is selected for checking the state of element j in its array.
• the value of the signature is compared with the expected value to detect possible cfes during execution (line 20).
• to count the number of threads that have been checked in this round, the value of variable x is increased by 1, and the state of element j is changed to checked (line 21).
• the value of i is increased by 1 to select the next thread from the list of threads (line 26).
• if the value of i is still less than the number of threads in the program, this round is repeated until element j of the last thread in the list of threads is also checked (line 27).
• one step is needed to ensure that element j of all threads has been checked (line 28). after the first round of each checking phase, if the value of x is less than the number of threads in the program, the tag bit of element j for all threads will be re-checked.
in cases where the main threads did not have enough time to update the signature stored in element j, the state of their elements remains un-checked, and the process of checking the signature should be performed for them in the next rounds (line 32).
• at line 38, the watchdog has already checked element j of each thread. so, the value of j is increased by 1 to check the next element in the arrays of the main threads (line 11).
• at line 35, if the value of j is equal to the number of elements in the array of each thread and all elements of each array have already been checked, then the watchdog lets the main threads continue their execution and overwrite the array from the first element (line 36), and the values of i and j are re-initialized (lines 10 and 11).

fig. 6 shows the structure of bbs in the proposed technique. suppose that the size of the arrays is five. then, after five updating phases, the main threads are suspended until the watchdog lets them continue their execution and overwrite the arrays. this interruption in the execution of the main threads is inevitable, because the main threads must be sure that overwriting the elements of the array will not be detected as a cfe in cases where the watchdog has not had enough time to check the previous value of those elements. the experimental results will show that these interruptions have negligible side effects on the execution time of the program. moreover, the code of the watchdog thread algorithm can also be divided into basic blocks, so the proposed technique can easily be applied to the code of the watchdog thread.

fig. 6 the structure of the bbs in the proposed technique (for each thread t1, ..., ti: lij = 0 and updating sij for j = 1, ..., 5, then suspending, then again from j = 1)

5. experimental results
in order to evaluate our design decisions, a quad-core processor system, intel® core™ i7-740qm with 6.00 gb of ram, is used as the simulation environment with a real operating system (red hat enterprise linux as release 10). the behaviors of eleven well-known programs of spec cpu2006 [30] have been studied in this environment. to emulate the real behavior of soft errors that lead to program sequence changes, a software function was written as a saboteur thread that runs simultaneously with the main program. this thread has access to the registers, such as the instruction pointer and stack pointer, and also to the memory spaces with which the program's threads interact. during simulation, the saboteur thread manipulates the content of registers and variables in a random fashion (like the effects of the bit-flip and stuck-at fault models). also, to observe the effects of the injected errors and to see what is going on inside a program after error injection, the gnu project debugger (gdb) [32] has been used. gdb can perform several operations that help catch misbehaviors during execution, such as examining what has happened when the program has stopped and making the program stop on specified conditions. in this work, gdb has been used to identify the destinations of the illegal branches that occurred due to the error injections. fig.
7 shows the performance overhead of previous scfc techniques due to checking and updating instructions. the figure confirms that the checking instructions have the greater impact on performance degradation: on average, about 26.48% out of the 35.31% performance overhead of the previous scfc techniques is imposed by the execution time of checking statements alone.

fig. 7 the percentage of performance overhead due to the previous techniques (y-axis: % performance overhead; x-axis: benchmarks; series: execution of updating statements, execution of checking statements)

fig. 8 compares the performance overhead of the related techniques. as illustrated by the figure, as the array size increases, the number of interruptions during execution, and consequently the performance overhead, is considerably reduced. for example, when the array size increases to 5, 10, and 15, the performance degradation is reduced to 24%, 20% and 19%, respectively. on the other hand, when the array size is equal to 1, as in previous scfc techniques, the checking phase must take place exactly before each updating phase. furthermore, in this case, adding a new phase to wait for the watchdog thread increases the performance overhead compared to the other cases. therefore, increasing the size of the arrays plays a significant role in moderating the negative effects of conventional scfc techniques on the execution time of the programs.

fig. 8 comparison of the performance overhead of the related techniques

fig. 9 represents the average percentage of cpu usage for each benchmark during execution in the presence of the watchdog thread, compared to the average cpu usage for the original programs. according to the figure, the average cpu usage for the original programs is less than 22%.
so, there are under-utilized cpu resources in multi-core processors during execution that can be exploited for parallel scfc. adding the watchdog thread to the set of running threads increases the cpu utilization by up to 19% in relative terms. as in some previous works, to give a general comparison among all the methods that takes both the error detection coverage and the performance overhead into account, a metric called method efficiency [15], [18], [26], [27] is defined to estimate the efficiency of the methods:

$$\text{method efficiency} = \frac{1}{\left[(1 - \text{error detection coverage}) \times \text{performance overhead}\right]}$$

where, consistent with the values in table 2, the coverage is expressed as a fraction and the overhead in percent.

(fig. 8 axes: % performance overhead vs. benchmarks; series: conventional techniques; proposed technique with array size = 1, 5, 10, 15)

fig. 9 average percentage of the cpu usage for the programs during their executions

table 2 compares the values of four important parameters obtained after applying the related techniques to the programs. conventional techniques with two sets of added instructions, at the end and at the beginning of each bb, are referred to as type1 conventional techniques, and conventional techniques with one set of added instructions, at the end and/or at the beginning of each bb, are referred to as type2 conventional techniques.
table 2 general comparison among related techniques in terms of four important factors

error detection coverage (%)
technique                        bzip2   mcf    sjeng   astar
conventional type1 (like [15])   93.0    98.9   89.9    95.7
conventional type2 (like [16])   82.1    86.5   80.4    84.1
proposed, l=1                    93.0    98.9   89.9    95.7
proposed, l=5                    93.0    98.9   89.9    95.7
proposed, l=10                   93.0    98.9   89.9    95.7
proposed, l=15                   93.0    98.9   89.9    95.7

performance overhead (%)
technique                        bzip2   mcf    sjeng   astar
conventional type1 (like [15])   34.7    39.9   46.4    38.6
conventional type2 (like [16])   17.9    21.3   25.1    20.2
proposed, l=1                    56.4    66.5   79.7    64.0
proposed, l=5                    24.2    28.5   34.2    27.4
proposed, l=10                   20.2    23.7   28.5    22.8
proposed, l=15                   18.8    22.2   26.6    21.3

method efficiency
technique                        bzip2   mcf    sjeng   astar
conventional type1 (like [15])   0.41    2.28   0.21    0.60
conventional type2 (like [16])   0.31    0.35   0.20    0.31
proposed, l=1                    0.25    1.37   0.12    0.36
proposed, l=5                    0.59    3.19   0.29    0.85
proposed, l=10                   0.70    3.84   0.35    1.01
proposed, l=15                   0.76    4.09   0.37    1.09

error detection latency (cycle)
technique                        bzip2   mcf    sjeng   astar
conventional type1 (like [15])   7.1     6.3    6.4     6.8
conventional type2 (like [16])   9.0     8.2    7.8     8.8
proposed, l=1                    11.9    10.3   9.4     11.2
proposed, l=5                    19.1    16.2   16.2    18.1
proposed, l=10                   35.2    33.0   32.8    34.9
proposed, l=15                   45.0    42.9   41.0    44.6

(fig. 9 axes: % average cpu usage vs. benchmarks; series: without watchdog thread, with watchdog thread)

according to the table, the efficiency of the proposed technique is higher than that of the conventional techniques when the length of the arrays grows to 5, 10 and 15, whereas with l=1 it is lower. this suggests that the proposed technique can effectively moderate the negative impact of the previous techniques on the execution time of the programs. the only drawback of the proposed technique is the increase in error detection latency, which can be controlled by adjusting the length of the array with respect to the type of applications and systems. on average, the error detection latency of the previous techniques is about 7 cycles, while this value increases to 18, 35, and 44 cycles for the proposed technique with array sizes of 5, 10 and 15, respectively. it is concluded that the size of the array should be selected with respect to the type of applications and systems.
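the trade-off in table 2 can be checked numerically. the sketch below assumes that method efficiency has the form 1 / ((1 - coverage) × overhead), with coverage as a fraction and overhead in percent; this form is inferred from the table entries (it reproduces them to within rounding) and is an assumption, as is the function name `method_efficiency`. the input values are the bzip2 entries of table 2.

```python
def method_efficiency(coverage_pct, overhead_pct):
    """Assumed form: 1 / ((1 - coverage) * overhead).

    coverage_pct: error detection coverage in percent (e.g. 93.0)
    overhead_pct: performance overhead in percent (e.g. 34.7)
    """
    return 1.0 / ((1.0 - coverage_pct / 100.0) * overhead_pct)


# bzip2 entries of table 2: (technique, coverage %, overhead %, latency in cycles)
BZIP2_ROWS = [
    ("type1", 93.0, 34.7, 7.1),
    ("type2", 82.1, 17.9, 9.0),
    ("l=1", 93.0, 56.4, 11.9),
    ("l=5", 93.0, 24.2, 19.1),
    ("l=10", 93.0, 20.2, 35.2),
    ("l=15", 93.0, 18.8, 45.0),
]

if __name__ == "__main__":
    for name, coverage, overhead, latency in BZIP2_ROWS:
        eff = method_efficiency(coverage, overhead)
        print(f"{name:>6}: efficiency = {eff:.2f}, latency = {latency} cycles")
```

with coverage fixed, the efficiency grows only as the overhead shrinks, so longer arrays score higher on this metric while their detection latency increases.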
for example, for real-time systems, where error detection latency is important, arrays with shorter lengths are preferable. for high performance computing, where performance is the main goal of the design process, arrays with longer lengths are preferable.

6. conclusions
in this paper, an efficient software behavior-based technique was presented to detect control-flow errors in multi-threaded architectures. the goal is to enhance the applicability of the related techniques so that they can be employed in multi-core processors. the key innovation is to change the structure of software-based control-flow checking techniques to exploit under-utilized resources in multi-core processors for parallel control-flow checking. error injection experiments have shown that the proposed technique, when applied to the programs, can detect over 94% of the cfes. the latency needed for detecting the cfes is considerably less than the related techniques which have been recently published. a metric for estimating and comparing the efficiency of the methods was defined, and it was shown that the proposed technique is more efficient than conventional methods for use in multi-core architectures.

references
[1] s. kumar, s. hari, m. li, p. ramachandran, b. choi and s. v. adve, “mswat: low-cost hardware fault detection and diagnosis for multicore systems,” in proc. of the 42nd annual international symposium on microarchitecture, 2009.
[2] g. a. reis, j. chang, n. vachharajani, r. rangan and d. i. august, “swift: software implemented fault tolerance,” in proc. of the 3rd international symposium on code generation and optimization, 2005, pp. 243-254.
[3] w. shi, h. hsin, s. l. laura falk and m.
ghosh, “an integrated framework for dependable and revivable architecture using multicore processors,” in proc. of the 33rd annual international symposium on computer architecture, 2006.
[4] n. aggarwal, p. ranganathan, n. jouppi and j. smith, “configurable isolation: building high availability systems with commodity multi-core processors,” in proc. of the 34th annual international symposium on computer architecture, 2007, pp. 340-347.
[5] m. manoochehri, m. annavaram and m. dubois, “cppc: correctable parity protected cache,” in proc. of the 44th annual international symposium on microarchitecture, 2011.
[6] t. slegel, r. averill iii, m. check, b. giamei, b. krumm, c. krygowski, w. li, j. liptay, j. macdougall, t. mcpherson, j. navarro, e. schwarz, k. shum, and c. webb, “ibm’s s/390 g5 microprocessor design,” ieee micro, vol. 19, pp. 12–23, 1999.
[7] b. f. romanescu and d. j. sorin, “core cannibalization architecture: improving lifetime chip performance for multicore processors in the presence of hard faults,” in proc. of the 17th international conference on parallel architecture and compilation techniques, 2008, pp. 43-50.
[8] m. d. powell, a. biswas, s. gupta and s. s. mukherjee, “architectural core salvaging in a multi-core processor for hard-error tolerance,” in proc. of the 36th annual international symposium on computer architecture, 2009, pp. 93-104.
[9] p. m. wells, k. chakraborty and g. s. sohi, “mixed-mode multicore reliability,” in proc. of the 14th international conference on architectural support for programming languages and operating systems, 2009, pp. 169-180.
[10] h. aliee, h. r. zarandi and a. tajary, “dynamically scheduled process-level redundancy to tolerate faults in multi-cores,” in proc. of the 9th ieee international conference on high performance computing & simulation, 2011.
[11] a. meixner and d. j.
sorin, “detouring: translating software to circumvent hard faults in simple cores,” in proc. of the 38th ieee/ifip international conference on dependable systems and networks, 2008, pp. 80-89.
[12] a. shye, t. moseley, v. j. reddi, j. blomstedt and d. a. connors, “using process-level redundancy to exploit multiple cores for transient fault tolerance,” in proc. of the 37th ieee/ifip international conference on dependable systems and networks, 2007.
[13] a. shye, j. blomstedt, t. moseley, v. j. reddi and d. a. connors, “plr: a software approach to transient fault tolerance for multi-core architectures,” ieee transactions on dependable and secure computing, 2008.
[14] m. fazeli, r. farivar and g. miremadi, “error detection enhancement in powerpc architecture-based embedded processors,” journal of electronic testing: theory and applications, vol. 24, pp. 21-33, 2008.
[15] r. vemu and j. a. abraham, “ceda: control-flow error detection through assertions,” in proc. of the 12th ieee international on-line testing symposium, 2006, pp. 151–158.
[16] n. oh, p. shirvani and e. mccluskey, “control-flow checking by software signatures,” ieee transactions on reliability, vol. 51, pp. 111–122, 2002.
[17] o. goloubeva, m. rebaudengo, m. sonza reorda and m. violante, “improved software-based processor control-flow errors detection technique,” in proc. of the 50th reliability and maintainability symposium, 2005, pp. 583-589.
[18] r. vemu, s. gurumurthy and j. a. abraham, “acce: automatic correction of control-flow errors,” in proc. of the ieee international test conference, 2007, pp. 1-10.
[19] g. blake, r. g. dreslinski, t. mudge, k. flautner, “evolution of thread-level parallelism in desktop applications,” in proc. of the 43rd annual international symposium on microarchitecture, isca, 2010.
[20] m. rimén, j. ohlsson and j. karlsson, “experimental evaluation of control flow errors,” in proc. of the pacific rim international symposium on fault tolerant systems, 1995, pp. 238–243.
[21] m. a. schuette, j. p. shen, “processor control flow monitoring using signatured instruction streams,” ieee transactions on computers, vol. 36, pp. 264-276, 1987.
[22] a. rajabzadeh and g. miremadi, “transient detection in cots processors using software approach,” journal of microelectronics reliability, vol. 46, pp. 124-133, 2006.
[23] a. rajabzadeh, g. miremadi and m. mohandespour, “error detection enhancement in cots superscalar processors with performance monitoring features,” journal of electronic testing: theory and applications, vol. 20, pp. 553-567, 2004.
[24] a. rajabzadeh and g. miremadi, “cfcet: a hardware-based control flow checking technique in cots processors using execution tracing,” elsevier journal of microelectronics reliability, vol. 46, pp. 959-972, 2006.
[25] u. schiffel, a. schmitt, m. süßkraut and c. fetzer, “anb- and anbdmem-encoding: detecting hardware errors in software,” in proc. of the 29th international conference on computer safety, reliability and security, 2010, pp. 169-182.
[26] h. r. zarandi, m. maghsoudloo and n. khoshavi, “two efficient software techniques to detect and correct control-flow errors,” in proc. of the 16th ieee pacific rim international symposium on dependable computing, 2010, pp. 141-148.
[27] m. maghsoudloo, h. r. zarandi, s. pour-mozaffari and n. khoshavi, “soft error detection technique in multi-threaded architectures using control-flow monitoring,” in proc. of the 14th euromicro conference on digital system design, architecture, methods and tools, 2011.
[28] e. d. berger, t. yang, t. liu and g. novark, “grace: safe multithreaded programming for c/c++,” in proc. of the 24th sigplan conference on object oriented programming systems languages and applications, 2009, pp. 81-96.
[29] d. patterson, “the trouble with multi-core,” journal of ieee spectrum, vol. 47, pp. 28-32, 2010.
[30] standard performance evaluation corporation. spec cpu2006 benchmarks, http://www.specbench.org/cpu2006, 2006.
[31] sh. parsaeeian and a. rajabzadeh, “multi-bit upset faults correction in embedded systems,” journal of iran computer society, 2011.
[32] gdb: the gnu project debugger, http://www.gnu.org/s/gdb.

facta universitatis series: electronics and energetics vol. 33, no 4, december 2020, pp. 583-603 https://doi.org/10.2298/fuee2004583b © 2020 by university of niš, serbia | creative commons license: cc by-nc-nd

risk management and participation of electric vehicle considering transmission line congestion in the smart grids for demand response

cyrous beyzaee, sara karimi marvi, mahdi zarif
department of electrical engineering, mashhad branch, islamic azad university, mashhad, iran

abstract. demand response (dr) could serve as an effective tool to further balance the electricity demand and supply in smart grids. it is defined as the changes in normal electricity usage by end-use customers in response to pricing and incentive payments. electric vehicles (evs) are potentially distributed energy sources, which support the grid-to-vehicle (g2v) and vehicle-to-grid (v2g) modes, and their participation in time-based (e.g., time of use) and incentive-based (e.g., regulation services) dr programs helps improve the stability of the grid and reduce the potential risks to it. moreover, the smart scheduling of ev charging and discharging activities supports the high penetration of renewable energies with volatile energy generation. this article focuses on dr in the presence of evs to assess the effects of transmission line congestion on a 33-bus grid.
a stochastic model from the standpoint of an independent system operator was used to manage the risk and participation of evs in the dr of smart grids. the main risk factors were those caused by the uncertainties in renewable energies (e.g., wind and solar), the imbalance between demand and renewable energy sources, and transmission line congestion. the effectiveness of the model in a 33-bus grid under various settings (e.g., penetration rate of evs and risk level) was evaluated based on the transmission line congestion and system operating costs. according to the results, the use of services such as time-based dr programs was effective in reducing the electricity costs for independent system operators and aggregators. in addition, the results demonstrated that the participation of evs in incentive-based dr programs with the park model was particularly effective in this regard.

key words: electric vehicles, smart grid, v2g, g2v, gams, loss function, demand response

received march 3, 2020; received in revised form august 21, 2020
corresponding author: cyrous beyzaee, department of electrical engineering, mashhad branch, islamic azad university, mashhad, iran (e-mail: mahaniranian1@gmail.com)

1. introduction
electric vehicle (ev) sales are growing rapidly worldwide [1,2], with the amount exceeding one million. several factors have been involved in this growing trend in the past few years, including the ability to replace fossil fuel vehicles with evs, which results in the preservation of natural reservoirs. however, the increased number of evs leads to increased grid demand. with the growth of domestic, industrial, and commercial demands, the power network must be capable of responding to all types of demands. the current power grids used in most countries are unable to fully respond to the large volume of evs.
in this regard, the simplest solution is to add transmission lines and power plants to supply the electricity required by the grid. nonetheless, this solution requires unjustifiably large operating and economic costs. as such, the proper management of various parameters such as evs, wind and solar power plants and new energies, programs to reduce consumption, increased grid sustainability and customer satisfaction, and operational costs of the system is of paramount importance. in this context, one of the important topics is transmission line congestion and the management of grid demand response (dr) using evs, since the lack of management of ev charging may lead to issues such as increased grid demand, power loss, and voltage fluctuations [3]. this article focuses on the management and participation of evs in smart grid dr, considering the impact of transmission line congestion, in the form of time-based and incentive-based programs. to this end, we first introduce evs, their types, and dr in this field.

2. overview of evs and renewable energy sources and dr
2.1. evs [4-9]
there are different types of evs, some of which use the electric grid to supply their required energy, which increases the energy drawn from the grid, thereby causing more problems for the grid. in general, evs are able to operate in frequency regulation, voltage regulation, spinning and non-spinning reserves, ancillary services, and demand profile adjustment [4]. compared to common vehicles (e.g., fossil fuel cars), evs have a different propulsion system. the electric power required for evs is provided by three main sources, including power plants, generators, and energy storage devices; however, most evs are of the third type. in recent years, special attention has been paid to plug-in evs (pevs; especially battery evs and hybrid pevs) in industry and academia.

2.1.1.
battery evs
battery evs comprise three parts: an electric motor, a battery, and a controller. the electric motor uses the battery as its power source. a two-input controller is only able to manage the power provided to the electric motor, which provides the driving force for the vehicle to move backward or forward, whereas a four-input controller supports braking as well. another important part of battery evs is the power inverter, which converts the electrical energy stored in the battery from dc to ac. this is mainly because most evs have an ac motor, which has a simple, low-cost structure.

2.1.2. hybrid pevs
hybrid evs are classified into three categories of parallel, series, and two-part hybrids based on their engine type. the first category is recognized as the most common. pevs often have both an electric motor and an internal combustion engine for propulsion, which enables the vehicle to move in the no-charge and full-charge modes. hybrid pevs supply their propulsion energy from batteries. when the battery level falls below a certain amount in the no-charge mode, the vehicle switches to the internal combustion engine for propulsion. in the full-charge mode, the vehicle uses a combination of the electric motor and the internal combustion engine for maximum propulsion efficiency. simultaneously, the controller monitors the battery charge level and maintains it at a certain level.

2.2. renewable energy sources
the increased awareness of environmental crises and the reduction of fossil fuel use are leading to new directions for energy production and consumption. one of these is renewable energy sources with eco-friendly features, including wind energy and solar energy.

2.3.
demand response
the need to define new electrical energy sources that can respond quickly in emergency situations of the power network is ever-increasing due to the growing load of power networks, especially the increase in loads that are sensitive to changes in the power supply parameters of the network. therefore, it is essential to address consumer management issues. structural changes in the electricity industry have led to the emergence of new paradigms alongside consumer management. dr is one of these paradigms; it encompasses the consumer management methods that lead to changes in the consumption level of customers caused by changes in electricity prices in the market. according to the united states department of energy, dr is defined as the empowerment of industrial, commercial, and residential users to improve electrical energy consumption, so that appropriate costs could be established and the network operating conditions could be improved [10]. in other words, dr could change the pattern of electrical energy consumption, so that the maximum system demand is reduced and consumption is shifted to off-peak hours. the us energy regulatory commission divides dr programs into two main groups of motivation-based and time-based dr [11]. in each classification, the dr programs are divided into several subcategories, which are listed below [12]:

incentive-based dr programs
1) direct load control programs
2) interruptible/curtailable load programs
3) demand bidding/buyback programs
4) emergency dr programs
5) capacity market programs
6) ancillary service market programs

time-based dr programs
1) time-of-use pricing plans
2) real-time pricing plans
3) critical peak pricing plans

3. problem statement and model presentation
with the increased prevalence of evs and their use worldwide, there has been growing demand for attention and planning to exploit these vehicles.
owing to the numerous benefits of evs, fossil fuel vehicles are being rapidly replaced by them. however, the increased number of evs has resulted in higher electricity demand. on the other hand, the electricity network must be able to respond to all types of demands with the ever-increasing growth of residential, industrial, and commercial demands. to this end, the simplest solution is to add transmission lines and power plants to supply the required electricity. nonetheless, this solution requires substantial operating and economic costs, which may not be economical. therefore, the proper management of various network parameters such as evs, wind and solar power plants and new energies, various programs to reduce consumption, increased network stability, customer satisfaction, and system operating costs is of paramount importance. one of the key topics in this regard is transmission line congestion, grid demand response, and their management using evs. as such, the present study aimed to evaluate the management and participation of evs in the demand response of smart grids, while considering transmission line congestion and its impact in time-based and motivation-based programs.

3.1. time-based programs
these programs involve the use of global networks by consumers and grid demands. by pricing electricity differently at different hours (peak load, mid load, and low load), peak consumption is shifted to off-peak hours and may be reduced. therefore, transmission line congestion is avoided, and the level of electricity purchases decreases significantly.

3.2. motivation-based programs
focusing on regulation services and supplying the reserve amount are essential to the g2v and v2g states, leading to demand-response balance and a reduction of global network costs. moreover, it results in increased profitability for customers and higher use of evs.
in this program, evs are connected to the grid in both the g2v and v2g states, so that smart discharging takes place in addition to smart charging [13-15].
3.3. studied system
the system assessed in the present study is illustrated in figure 1 [16]. the independent system operator plays a pivotal role in this system, managing the market by collecting and distributing information among the market members, such as power plants, demand centers, and ev aggregators. the independent system operator aims to reduce the operational costs of the system, while the balance between supply and demand must be maintained at all times. the power system encompasses the energy distribution of various generating units, such as conventional power generators and renewable energy systems (wind and solar power).
risk management and participation of electric vehicle considering transmission line congestion... 587
considering the limited capacity of ev batteries, the contribution of each individual battery to the grid is negligible. previous studies have indicated that planning for small-scale consumer participation in the wholesale electricity market is inefficient [11]. therefore, an aggregator is needed to control the charging and discharging of numerous evs, to participate in tenders on their behalf, and to coordinate their charging and discharging activities. notably, the aggregator units cover both the v2g and g2v modes. vehicle owners report their battery capacity and travel route to the aggregator units, allowing for additional time and distance and other relevant parameters, so that the evs can be planned properly and accurately. in turn, the aggregator units inform the independent system operator of the available and anticipated capacity, based on the state of charge (soc), in order to participate in demand supply and frequency regulation services.
fig. 1 an overview of studied system [16]
the aggregators support both the time-based and motivation-based states in demand response programs.
in this article, time of use was selected as the time-based program. the vehicles participating in the program were charged at different prices at different times (e.g., peak load or low load) [17-20]. the aggregators often participate in the motivation-based programs of the demand response to supply the v2g and g2v capacity required for regulation services. from the perspective of the independent system operator, these services involve two kinds of cost: the reserve capacity costs and the energy costs paid to the aggregators [21-22]. the reserve capacity costs correspond to the maximum capacity supplied by each aggregator during the contract. the energy costs are associated with the energy transferred in the v2g and g2v states. in addition, a specific number of evs is required to respond rapidly to demand, as well as to store excess energy or compensate for a shortage. the level of emergency reserve must also be set correctly. moreover, decisions in this regard mainly concern the charging and discharging of evs, and the plan for producing electricity from the various sources has to be precise. decisions should be made by considering various risk factors for the possible future scenarios. here, the risk is a possible imbalance between power supply and demand. in the current research, the model presented for managing the participation of evs in demand response programs was based on the dc opf model.
[equations (1)-(5): dc opf objective function (1), load balance per bus (2), power flow per line (3), line flow limit (4), and generator capacity limit (5)]
equations 1-5 show the main formulation of the opf, which minimizes the costs associated with the generating units and the unserved load subject to the technical limitations of the electricity grid. equations 2 and 3 express the load balance per bus and the power flow per line.
the thermal flow limit of the lines and the generator capacity limit are shown in equations 4 and 5. the opf model presented above was extended to incorporate dynamic aspects. the main goal was to manage the level of reserve required for the v2g and g2v states, as well as the anticipated future costs of the system. the modified model for managing the cooperation of evs is presented in equations 6-25. the first part of the objective function in equation 6 is the reserve capacity cost paid to the evs for concluding the v2g and g2v service contracts. the second part is the expected operating cost of the independent system operator for the actual energy payments made for regulation services. the next part comprises the production costs of conventional generators, the costs of unserved load, and the costs of curtailed renewable energy. the energy curtailment costs are considered in the model because, when renewable sources generate surplus energy, the iso has to allocate the additional energy to other parties [23]. equation 7 is similar to equation 2 in that it expresses the power balance per bus: the total energy flowing into the bus (electricity generated by conventional generators and renewable energy sources, plus energy discharged by the aggregators) equals the total energy flowing out of the bus (base demand, energy charged by the aggregators, and curtailed renewable energy).
[equations (6) and (7): objective function, with sums over aggregators a = 1, ..., a, periods t = 1, ..., t, and scenarios s = 1, ..., s, and power balance per bus]
the soc of each aggregator is given by equation 8 and changes according to the charge/discharge status and the battery efficiency. the aggregators must first supply the charge of vehicles that are about to leave and still need charging. the soc of arriving and departing vehicles also affects the overall charge of the aggregators. equation 9 gives the remaining battery capacity (rbc) of the aggregators. improving the charge/discharge pattern affects the rbc of an ev after it arrives at the parking lot. moreover, the rbc is affected by the soc of the arriving and departing vehicles.
[equations (8) and (9): aggregator soc and remaining battery capacity]
equations 10-12 are analogous to equations 3-5; however, the flow constraint is not imposed on lines with a fault. in addition, equation 13 guarantees that the generators operate within the allowed range. equation 14 sets the risk coefficient required by the operator: the probability of any mismatch between the power supply and demand must be less than or equal to the specified risk limit. in addition, equations 15 and 16 guarantee that an aggregator operates in only one of the v2g or g2v states at any moment.
[equations (10)-(16)]
moreover, equation 17 ensures that discharge is limited by the available energy, while equation 18 guarantees that the level of charge does not exceed the empty capacity available in the batteries.
[equations (17) and (18)]
equations 19-22 are boundary constraints. in equations 19 and 20, the unserved load and the curtailed energy are limited by the actual load and the available renewable energy, respectively.
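the operating limits described for equations 14-18 can be illustrated with a short python sketch. it is only a minimal sketch under stated assumptions, not the formulation used in the study: the names `AggregatorStep`, `step_is_feasible`, `mismatch_probability`, and `satisfies_risk_limit` are hypothetical, and a scenario is assumed to count as mismatched when supply and demand differ by more than a small tolerance.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class AggregatorStep:
    """one aggregator in one period (hypothetical container, not from the paper)."""
    soc_kwh: float        # energy currently stored in the aggregated batteries
    capacity_kwh: float   # total battery capacity managed by the aggregator
    charge_kwh: float     # energy taken from the grid in this period (g2v)
    discharge_kwh: float  # energy returned to the grid in this period (v2g)


def step_is_feasible(step: AggregatorStep, tol: float = 1e-9) -> bool:
    """check the limits described in the text for equations 15-18."""
    non_negative = step.charge_kwh >= -tol and step.discharge_kwh >= -tol
    # equations 15-16: the aggregator is in only one of the v2g or g2v states
    one_state = not (step.charge_kwh > tol and step.discharge_kwh > tol)
    # equation 17: discharge is limited by the available energy
    discharge_ok = step.discharge_kwh <= step.soc_kwh + tol
    # equation 18: charge cannot exceed the empty capacity
    charge_ok = step.charge_kwh <= step.capacity_kwh - step.soc_kwh + tol
    return non_negative and one_state and discharge_ok and charge_ok


Scenario = Tuple[float, float, float]  # (probability, supply_kw, demand_kw)


def mismatch_probability(scenarios: Iterable[Scenario], tol: float = 1e-6) -> float:
    """total probability of the scenarios in which supply and demand differ."""
    return sum(p for p, supply, demand in scenarios if abs(supply - demand) > tol)


def satisfies_risk_limit(scenarios: Iterable[Scenario], epsilon: float) -> bool:
    """equation 14 as described: mismatch probability must not exceed epsilon."""
    return mismatch_probability(scenarios) <= epsilon + 1e-12


# ten equally likely scenarios (made-up numbers), one with a 50 kw shortfall
example = [(0.1, 1000.0, 1000.0)] * 9 + [(0.1, 950.0, 1000.0)]
```

with the made-up scenario set `example`, the mismatch probability is 0.1, so `satisfies_risk_limit(example, 0.01)` is false while `satisfies_risk_limit(example, 0.1)` is true. in an optimization model the same test would be expressed with one binary indicator per scenario rather than evaluated after the fact.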
the required reserve was specified in the aggregators' contracts and is limited by their capacity to support the v2g and g2v services; constraint 21 expresses this limit for the g2v state. similarly, the energy discharged by the aggregators is limited by the maximum reserve defined for the v2g state in their contracts.
[equations (19)-(22): boundary constraints]
equation 23 defines the g2v reservation. the max term in this equation sets the level of g2v reservation required when the generated energy is higher than the system's demand. in such a case, evs are charged, and the g2v reservation level is determined by their participation in absorbing the surplus energy. equation 24 shows that the g2v service provided by each aggregator cannot exceed its charge amount, and the ranges of the variables are given in equation 25.
[equations (23)-(25): equation 23 sets the sum of b over the aggregators equal to the minimum of a charge term summed over the aggregators and max(0, ·) of a net surplus summed over the indices j, n, and k]
the aforementioned model is a nonlinear mixed-integer programming problem, which can be converted into a linear mixed-integer programming problem. to establish linearity, equation 14 is replaced by equations 26 and 27, in which a binary variable is equal to one if there is a mismatch between the energy sources and the existing demand and zero otherwise. to linearize equation 23, equations 28-31 were used to cover all the possible cases.
[equations (26)-(28)]
4. model implementation and simulation
to evaluate the proposed models, a one-day schedule was applied to the standard 33-bus grid as the case study; its characteristics are presented in table 1, along with the base load. the maximum generating capacity was 700 kw, and no minimum production value was set for the conventional generators.
the transmission capacity of each line was 2 mw, and the lines were assumed to have an equal susceptance of 10 p.u. the charging of evs imposes an additional load on the system, which is not included in the base load. the parking pattern in figure 2 was used to determine the number of evs arriving and departing each day. each parking region had a maximum capacity of 200 vehicles and was managed by an aggregator. it was assumed that each ev has a battery with a capacity of 24 kwh and a charge/discharge efficiency of 99%. in addition, 35% of the parked vehicles were assumed to be evs. in general, evs enter the parking lot with 30% charge and prefer to leave with 90% battery charge. figure 3 shows the energy generated by the wind and solar power plants, selected on the basis of california iso wind and solar data [24]. the costs of curtailed renewable energy and of unserved load were assumed to be 1.5 and 5 $/kw, respectively, as shown in equation 23. furthermore, the cost of generating emergency electricity with a conventional generator was set to 0.20 $/kwh. the aggregators are paid 0.02 $/kwh for the capacity they make available to provide the v2g and g2v services. the aggregators can benefit from a 100% discount if they charge when energy would otherwise have to be curtailed. in addition, the independent system operator pays the aggregators for regulation services and other services. the different services had different costs, which mostly depend on the electricity market price. for instance, 0.01 $/kwh was taken as the base cost of electricity.
[equations (29)-(31)]
table 1 characteristics of a standard 33 bus grid
br.no  rc.nd  sn.nd  r(ohm)  x(ohm)  pl(kw)
1      0      1      0.0922  0.47    100
2      1      2      0.493   0.2511  90
3      2      3      0.366   0.1864  120
4      3      4      0.3811  0.1941  60
5      4      5      0.819   0.707   60
6      5      6      0.1872  0.6188  200
7      6      7      0.7114  0.2351  200
8      7      8      1.03    0.74    60
9      8      9      1.044   0.74    60
10     9      10     0.1966  0.065   45
11     10     11     0.3744  0.1238  60
12     11     12     1.468   1.155   60
13     12     13     0.5416  0.7129  120
14     13     14     0.591   0.526   60
15     14     15     0.7463  0.545   60
16     15     16     1.289   1.721   60
17     16     17     0.732   0.574   90
18     1      18     0.164   0.1565  90
19     18     19     1.5042  1.3554  90
20     19     20     0.4095  0.4784  90
21     20     21     0.7089  0.9373  90
22     2      22     0.4512  0.3083  90
23     22     23     0.898   0.7091  420
24     23     24     0.896   0.7011  420
25     5      25     0.203   0.1034  60
26     25     26     0.2842  0.1447  60
27     26     27     1.059   0.9337  60
28     27     28     0.8042  0.7006  120
29     28     29     0.5075  0.2585  200
30     29     30     0.9744  0.963   150
31     30     31     0.3105  0.3619  210
32     31     32     0.341   0.5302  60
fig. 2 parking pattern
fig. 3 pattern of electricity generation by wind and solar sources
this model was developed in matlab and solved with the cplex solver in both deterministic and stochastic forms. the deterministic case was taken as the base case, and no risk range was considered for it. the model was solved after setting the random parameters to their expected values. the load levels obtained in the deterministic case are shown in figure 4: the participation of evs was effective in reshaping the load and using renewable energy, while shifting the charging load to off-peak periods. figure 5 illustrates the results for the g2v and v2g services in the deterministic case.
[fig. 2 axes: parking utilization (%) versus time (hour)]
[fig. 3 axes: power generation (kw) versus time (hour); series: wt, pv]
in this regard, evs provided the g2v reserve in the hours with high renewable generation and low base load. in addition, the evs were discharged to provide v2g services at the load peak. conventional generators are used to generate the necessary electricity in periods when the sum of the renewable energy and the energy discharged by evs is insufficient to cover the base load.
fig. 4 results of ev participation in base state [time axis: hours 1-24; series: hourly-load, ev-load, res]
fig. 5 results of ev participation to provide reservation services in base state [time axis: hours 1-24; series: rg2v, rv2g]
the capacity of the lines was also reduced to observe the effect of transmission line congestion on the cost function in the deterministic form. the maximum capacity of the transmission lines was 2 mw, and tightening this limit to 1 mw led to congestion of the lines. as a result, the cost of system operation increased. figure 6 illustrates the results of the reduced transmission line capacity and its effects on the v2g and g2v states. as can be seen, the reservation amount decreased in the g2v state under transmission line congestion. in contrast, the reservation amount increased in the v2g state.
fig. 6 effect of transmission line congestion on reservation plans in base state [reserving in g2v mode: 4917.322909 at f_max(l) = 2 mw, 4385.643281 at f_max(l) = 1 mw; reserving in v2g mode: 2713.14846 at f_max(l) = 2 mw, 3592.512 at f_max(l) = 1 mw]
overall, the combined reservation level of the g2v and v2g states increased, which in turn led to increased system operating costs.
4.1. charging method
this section illustrates the effects of ev charging on the operating costs of the power system. the deterministic base model was run with three different charging policies. in the first policy, it was assumed that the evs do not participate in demand response programs and are charged as soon as they arrive in the parking lot. in the second policy, the evs participated in the time-based program of the demand response, so that the aggregators scheduled ev charging to reduce the electricity costs and avoid the load peak. when the aggregators took part in the time-based programs of the demand response, the independent system operator only responded to the charging patterns by minimizing its operational costs. table 2 shows the hourly prices used to manage consumer charging. the total charging cost of the aggregators participating in the time-of-use program was calculated as a double sum of the charged energy d weighted by the hourly price.
table 2 hourly electricity cost
hour  price ($)  hour  price ($)
1     0.05       13    0.19
2     0.05       14    0.19
3     0.05       15    0.19
4     0.05       16    0.12
5     0.05       17    0.12
6     0.05       18    0.12
7     0.05       19    0.19
8     0.12       20    0.19
9     0.12       21    0.19
10    0.12       22    0.12
11    0.19       23    0.12
12    0.19       24    0.05
in the third policy, the evs were assumed to participate in the motivation-based program of the demand response, and the vehicles were motivated to participate in the g2v and v2g states.
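the time-of-use tariff of table 2 can be combined with the ev parameters stated in section 4 to estimate what one charging session costs at different hours. the python sketch below is illustrative only: the function names are hypothetical, the prices are assumed to be in $/kwh, and the grid energy is assumed to equal the stored energy divided by the 99% efficiency.

```python
from typing import Mapping

# ev parameters stated in section 4 of the text
BATTERY_KWH = 24.0
ARRIVAL_SOC = 0.30
DEPARTURE_SOC = 0.90
EFFICIENCY = 0.99


def tou_price(hour: int) -> float:
    """price for an hour 1..24, read from table 2 (unit assumed to be $/kwh)."""
    if not 1 <= hour <= 24:
        raise ValueError("hour must be between 1 and 24")
    if hour <= 7 or hour == 24:
        return 0.05
    if 11 <= hour <= 15 or 19 <= hour <= 21:
        return 0.19
    return 0.12  # hours 8-10, 16-18, and 22-23


def grid_energy_per_ev() -> float:
    """energy drawn from the grid to raise one ev from arrival to departure soc."""
    stored = (DEPARTURE_SOC - ARRIVAL_SOC) * BATTERY_KWH
    return stored / EFFICIENCY  # assumption: losses are applied on the grid side


def charging_cost(profile_kwh: Mapping[int, float]) -> float:
    """cost of a charging profile given as {hour: energy in kwh}."""
    return sum(tou_price(hour) * kwh for hour, kwh in profile_kwh.items())


need = grid_energy_per_ev()              # about 14.5 kwh per ev
cost_off_peak = charging_cost({3: need})  # all energy taken at hour 3
cost_peak = charging_cost({12: need})     # all energy taken at hour 12
```

under these assumptions, charging the whole requirement in a 0.19 hour costs 3.8 times as much as in a 0.05 hour, which is the price signal the aggregators respond to in the second policy.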
the overall energy cost of the aggregators in the third policy was estimated using equation 32:
[equation (32): aggregator energy cost, summed over aggregators a = 1, ..., a and periods t = 1, ..., t, combining charging (ch), regulation energy (eg), and discharging (dch) terms with the v2g and g2v capacity payments (agr)]
in this equation, negative values indicate that the aggregators not only paid no costs but also generated revenue in most cases. the results of the charging strategies are presented in table 3.
table 3 results of three charging policies of evs
dr charging policy  iso reserve cost ($)  iso operation cost ($)  aggregator's energy payment ($)  generation (kwh)
no participation    0                     44500.572               261.791                          15812
time-based          0                     37312.752               116.64                           14756
incentive-based     417.89                25804.761               -179.741                         13590
the conventional power generation and charging patterns of the three policies are shown in figures 7 and 8. participation in the demand response programs decreased the costs of the aggregators and the independent system operator. compared to the time-based program, the motivation-based program provided greater savings in the costs of the independent system operator, mainly because using evs to manage the v2g and g2v states reduced the costs related to unserved load and energy curtailment. participation in the motivation-based programs is often associated with positive income for the aggregators. as depicted in figure 7, unplanned charging forced the conventional power plants to generate more power during the peak times, when the system experienced higher load. motivation-based programs often cover the need for routine energy generation entirely with renewable sources. the main goal of demand response programs is to decrease the load peak. according to the obtained results, participation in the time-based and motivation-based programs led to 48% and 51% decrease in the load peak, respectively.
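the savings implied by table 3 can be read off as relative reductions with respect to the no-participation policy. the following python sketch uses only the values printed in table 3; the dictionary layout and the function name `relative_reduction` are illustrative choices, not part of the original model.

```python
# values copied from table 3
TABLE_3 = {
    "no participation": {"iso_operation_cost": 44500.572, "generation_kwh": 15812.0},
    "time-based": {"iso_operation_cost": 37312.752, "generation_kwh": 14756.0},
    "incentive-based": {"iso_operation_cost": 25804.761, "generation_kwh": 13590.0},
}


def relative_reduction(policy: str, column: str,
                       baseline: str = "no participation") -> float:
    """fractional reduction of a table 3 column relative to the baseline policy."""
    base = TABLE_3[baseline][column]
    return (base - TABLE_3[policy][column]) / base


for policy in ("time-based", "incentive-based"):
    cost_cut = relative_reduction(policy, "iso_operation_cost")
    gen_cut = relative_reduction(policy, "generation_kwh")
    print(f"{policy}: iso operation cost -{cost_cut:.1%}, generation -{gen_cut:.1%}")
```

on these figures, the iso operation cost falls by roughly 16% under the time-based policy and by roughly 42% under the incentive-based policy, while conventional generation falls by about 7% and 14%, respectively.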
as shown in figure 8, the participation of the aggregators in the demand response programs discouraged charging at the peak hours and encouraged charging at the off-peak hours.
fig. 7 conventional power plant production rates in three different charging policies [axes: power (kw) versus time (hour); series: no participation, time_based, incentive_based]
fig. 8 charging activity of evs in three different charging policies [time axis: hours 1-24; series: no participation (sen1), time-based (sen2), incentive-based (sen3)]
figure 9 shows the effect of transmission line congestion on the production of conventional power plants in the three charging policies. according to the results, the energy produced by conventional power plants decreased significantly when the transmission lines were congested. conventional power plants supply the part of the system load that renewable energies and electric vehicles cannot cover; under congestion, the network cannot supply that part of the system load.
fig. 9 effect of transmission line congestion on production of conventional power plants [panels: conventional generation in scenario 1, scenario 2, and scenario 3, each at f_max(l) = 2 mw and f_max(l) = 1 mw]
according to the simulation results, the costs of the independent system operator increased when the transmission line capacity was reduced. figure 10 depicts the results in the three charging policies.
fig. 10 effect of transmission line congestion on costs of independent system operator [cost of iso ($) at f_max(l) = 2 mw and f_max(l) = 1 mw: scenario 1, 44500.572 and 67941.623; scenario 2, 37312.752 and 60353.122; scenario 3, 25804.761 and 39323.319]
4.2. risk perspective and stochastic solutions
in this section, the model is solved in the stochastic form with a predefined risk level of 0.01, which means that the probability of a mismatch between the load and the source must be less than 1%. it was assumed that the load, the renewable energy production, the behavior of the ev owners, the soc of the vehicles entering and leaving the aggregators, and the line faults were uncertain. to reduce the computational time of the stochastic model, the scenario reduction method presented in [25-26] was used to construct a scenario tree with 10 scenarios. in the stochastic model, a higher reserve level was required than in the deterministic case because of the uncertainty and the risk level parameter. the stochastic model was also solved for various risk thresholds, including 0.01, 0.1, and 1. as can be seen in figure 11, a higher risk threshold tolerates a higher probability of mismatch between the source and the load, thereby requiring less reserve.
fig. 11 effect of imbalance between energy source and demand on reservation programs with various risk factors [panels: regulation down (kw) for stochastic33(0.01)-g2v, stochastic33(0.1)-g2v, and stochastic33(1.00)-g2v; regulation up (kw) for stochastic33(0.01)-v2g, stochastic33(0.1)-v2g, and stochastic33(1.00)-v2g; horizontal axis: risk tolerance]
similar to the deterministic form, the reservation level in the v2g and g2v states increased when line congestion was applied in the stochastic form, which led to an increased cost of system operation. however, the increase was smaller than in the deterministic form because of the risk coefficient and the permitted mismatch between the load and the energy source. the results for a 1% risk coefficient are shown in figure 12.
fig. 12 effect of transmission line congestion on reservation level of demand response programs in stochastic form [stochastic model, reserving in g2v mode (kw): 2490.466 at f_max(l) = 2 mw, 2503.466 at f_max(l) = 1.8 mw; reserving in v2g mode (kw): 2260.198 at f_max(l) = 2 mw, 2287.32 at f_max(l) = 1.8 mw]
5. conclusion
in the present study, we applied a new plan for ev participation in demand response programs and their scheduling in a smart grid. in addition, we evaluated the effects of transmission line congestion on the cost of system operation and the level of reservation in the deterministic and stochastic forms. the applied system was a standard 33-bus system exposed to the risk of varying load levels due to the uncertainty of evs, the production of renewable energies, transmission line congestion, and the collective behavior of the ev owners. we also assessed the participation of evs in time-based and motivation-based demand response programs and observed that this participation could be highly effective in responding to the load of the smart grid. it supplied a considerable part of the load and reduced the load peak, which reduced the operational costs of the system, the aggregators, and the ev owners, and generated revenue in some cases. the stochastic model enables users to determine the level of risk and costs and their profits considering the available factors.
the model evaluated in this study could be used to improve the reserve levels required by an independent system operator by considering the profits of the aggregators. the independent system operator could reduce operational costs by improving the schedule of conventional production and renewable energies, as well as the participation of evs. moreover, the aggregators attempt to reduce the electricity costs by optimizing the charge/discharge schedule of evs in order to receive the maximum discount and revenue from participation in the demand response. the deterministic and stochastic cases were assessed to demonstrate the effects of parameters such as charging policy, level of risk, penetration of renewable energies, and residential load pattern. according to the results, services such as time-based programs reduced the electricity costs for the independent system operator and the aggregators. in addition, the participation of evs in the motivation-based programs through the parking lot model had a significant impact in this regard.
references
[1] n. lutsey, “global milestone: the first million electric vehicles”, 2015.
[2] l. wilson, “the ev wedge: how electric vehicle fuel savings vary by country (and car)”, 2015.
[3] z. yang, k. li, a. foley, “computational scheduling methods for integrating plug-in electric vehicles with power systems: a review”, renew sustain energy rev, vol. 51, pp. 396–416, 2015.
[4] g. cardoso, m. stadler, m.c. bozchalui, r. sharma, c. marnay, a. barbosa-povoa, et al., “optimal investment and scheduling of distributed energy resources with uncertainty in electric vehicle driving schedules”, energy, vol. 64, pp. 17–30, 2014.
[5] k. mets, r. d'hulst, c. develder, “comparison of intelligent charging algorithms for electric vehicles to reduce peak load and demand variability in a distribution grid”, commun netw j, vol. 14, no. 6, pp. 672–81, 2012.
[6] m. musio, p. lombardi, a. damiano, “vehicles to grid (v2g) concept applied to a virtual power plant structure”, in proceedings of xix international conference on electrical machines icem 2010, rome, italy, 2010.
[7] c. white, k. zhang, “using vehicle-to-grid technology for frequency regulation and peak-load reduction”, j power sources, vol. 196, no. 8, pp. 392–398, 2011.
[8] m. honarmand, a. zakariazadeh, s. jadid, “optimal scheduling of electric vehicles in an intelligent parking lot considering vehicle-to-grid concept and battery condition”, energy, vol. 65, pp. 572–579, 2014.
[9] s. shao, m. pipattanasomporn, s. rahman, “grid integration of electric vehicles and demand response with customer choice”, ieee trans smart grid, vol. 3, no. 1, pp. 543–50, 2012.
[10] a. khazali, m. kalantar, “a stochastic-probabilistic energy and reserve market clearing scheme for smart power systems with plug-in electrical vehicles”, energy convers manag, vol. 105, pp. 1046–105, 2015.
[11] e. sortomme, m.a. el-sharkawi, “optimal combined bidding of vehicle-to-grid ancillary services”, ieee trans smart grid, vol. 3, no. 1, pp. 70–79, 2012.
[12] a. damiano, g. gatto, i. marongiu, m. porru, a. serpi, “vehicle-to-grid technology: state-of-the-art and future scenarios”, j energy power eng, vol. 8, no. 1, p. 152, 2014.
[13] r. sioshansi, “or forum-modeling the impacts of electricity tariffs on plug-in hybrid electric vehicle charging, costs, and emissions”, oper res, vol. 60, no. 3, pp. 506–516, 2012.
[14] s. habib, m. kamran, u. rashid, “impact analysis of vehicle-to-grid technology and charging strategies of electric vehicles on distribution networks - a review”, j power sources, vol. 277, pp. 205–214, 2015.
[15] m. yilmaz, p.t. krein, “review of the impact of vehicle-to-grid technologies on distribution systems and utility interfaces”, ieee trans power electron, vol. 28, no. 12, pp. 5673–5689, 2013.
[16] n. nezamoddini, y. wang, “risk management and participation planning of electric vehicles in smart grids for demand response”.
[17] l. zhang, z. zhao, j. chai, z. kan, “risk identification and analysis for ppp projects of electric vehicle charging infrastructure based on 2-tuple and the dematel model”, world electric vehicle journal, vol. 10, no. 4, 2019.
[18] “risk management and public participation (geog90020)”, handbook, the university of melbourne, 2019.
[19] y. wang, l. li, “time-of-use based electricity demand response for sustainable manufacturing systems”, energy, vol. 63, pp. 233–244, 2013.
[20] y. wang, l. li, “time-of-use electricity pricing for industrial customers: a survey of us utilities”, appl energy, vol. 149, pp. 89–103, 2015.
[21] a. khalid, n. javaid, a. mateen, m. ilahi, t. saba, a. rehman, “enhanced time-of-use electricity price rate using game theory”.
[22] w. kempton, j. tomic, “vehicle-to-grid power fundamentals: calculating capacity and net revenue”, j power sources, vol. 144, no. 1, pp. 268–279, 2005.
[23] d. howarth, b. monsen, “renewable energy faces daytime curtailment in california”, 2015.
[24] caiso renewable watch data.
[25] n. growe-kuska, h. heitsch, w. romisch, “scenario reduction and scenario tree construction for power management problems”, in proceedings of ieee power tech conference, bologna, italy, 2003.
[26] f. thangaiyan, “risk management in renewable energy and sustainability in india”, project, july 2019.
https://www.sciencedirect.com/science/article/abs/pii/s0360544216314232#!
https://handbook.unimelb.edu.au/ https://www.researchgate.net/profile/franklin_thangaiyan?_sg%5b0%5d=8pecsfmzbmwtaljm8odnseamvwuejpuj8zfkyn7jogwnjzivuzyp3nfiuamzrwz3vbagqyk.-kjcz-qbyzoahcjmrbordy_6xd1ptgup19ww_4pudzwzkfommrxxcu5ornzb4tqcjfhriz66kk7hio6bkzcbaq&_sg%5b1%5d=uk3qzvsxsolysy6us2hf88qpeuaj0p8cwh__zj1hw_wztzfivbcaffhvwnibqfhyf1qpyg0.7brco4g0nbapogie7zv1rjg8riclxw70ssxlsacx4uxhl77bp9zer-o4m8fpf7_622b_ovq7lfj5z-zuf2iyeq instruction facta universitatis series: electronics and energetics vol. 27, n o 2, june 2014, pp. 275 298 doi: 10.2298/fuee1402275s why and how photovoltaics will provide cheapest electricity in the 21 st century  rajendra singh, githin f. alapatt, guneet bedi holcombe department of electrical and computer engineering and center for silicon nanoelectronics clemson university, clemson, sc, usa abstract. with the advent of solar panels and windmills, and our ability to generate and use electrical energy locally without the need for long-range transmission, the world is about to witness transformational changes in energy infrastructure. the use of photovoltaics (pv) as source of direct current (dc) power reduces the cost and improves the reliability of pv system. dc microgrid and nanogrid based on pv and storage can provide sustainable electric power to all human beings in equitable fashion. bulk volume manufacturing of batteries will lead to cost reduction in a manner similar to the cost reduction experience of pv module manufacturing. future manufacturing innovations and r & d directions are discussed that can further reduce the cost of pv system. if the current trends of pv growth continue, we expect pv electricity cost with storage to reach $0.02 per kwh in the next 8-10 years. key words: photovoltaics, direct current, local electricity, nanogrid, energy policy, batteries 1. 
introduction

the world population is currently about 7 billion and by the end of the 21st century it is projected to reach nearly 11 billion [1]. providing green energy to all human beings in an equitable fashion is one of the biggest technical challenges. the costs of generating, transmitting and utilizing energy must be decreased to ensure sustainability. for any energy technology to be truly sustainable it must be environmentally friendly, conserve water, and be affordable [2]. as solid state devices, silicon based integrated circuits, popularly known as “computer chips”, have brought a revolution in information technology that started in the 20th century and continues to shape the world of tomorrow [3]. the use of solid-state devices for power generation, power delivery and power utilization is bringing a green energy revolution in a manner similar to the role played by solid-state devices in the field of global communication [4]. in particular, photovoltaics (pv) is playing the central role in the emerging green energy revolution [5], [6]. the objective of this paper is to demonstrate that the progress made in recent years in reducing the cost of the electricity generated by pv is phenomenal and that in the very near future pv will emerge as the lowest cost and sustainable electricity generation technology. for achieving the goal of lowest cost, the importance of direct current (dc) generated by pv systems will be highlighted. in addition, we will also discuss manufacturing innovations, research directions and energy policies that will continue to reduce the cost of electricity generated by pv systems.

received february 6, 2014. corresponding author: rajendra singh, holcombe department of electrical and computer engineering and center for silicon nanoelectronics, clemson university, clemson, sc, usa (e-mail: srajend@clemson.edu)

2.
energy sources

for an equitable sustainable energy scenario, we must consider environmental concerns and conservation of water for future generations of mankind. in earlier publications we have stated that nuclear energy is not economical for any country and that no new nuclear reactors should be constructed [4]-[7]. hydro-energy, biomass energy, and geothermal energy are partially renewable, but are not totally sustainable [2]. as of today, no cost effective technology exists for producing bio-fuels. a fundamental breakthrough is required to produce cost-effective bio-fuels [2]. consideration of renewable and nonrenewable energies (figure 1) shows that only solar and wind energies are truly renewable and can provide the ultimate in sustainable energy to meet the global energy needs of the 21st century [2].

3. why photovoltaics?

there is no direct competition between solar and wind energy, since without storage solar energy can be used during the daytime and wind energy mostly during the nighttime. however, other than the larger amount of available solar energy, there are fundamental differences between solar energy and wind energy. as shown in figures 2 and 3, solar energy is more uniformly distributed than wind energy. 98 % of the world population receives more than 3 kwh/m^2 of solar irradiance per day. the other difference relates to the cost and reliability of pv systems and wind energy systems. during the last several years the annual global installation of wind energy was much larger than that of pv systems. in 2010, one of us predicted that due to inherent advantages, pv will overtake wind and eventually we will have pv as the dominant electricity generation technology [11]. globally in 2013, 33.8 gw of new onshore wind farms plus 1.7 gw of offshore wind capacity will be installed [12]. the total wind capacity of 35.5 gw in 2013 is lower than the 36.7 gw of pv installed in 2013 [12].
since 2008, solar pv panel prices have fallen by well over 70 percent, while the cost of wind turbines decreased by 40 percent over the same period. similar to the experience of semiconductor products, the cost of pv systems will continue to decrease in coming years.

fig. 1 importance of solar and wind energy in global context [8]

fig. 2 global mean wind speed at 80 m [9]

fig. 3 global mean solar irradiance [10]

solar energy received on the earth's surface per year is about 89 pw (1 pw = 10^15 w). the 2009 global energy consumption is about 16 tw (1 tw = 10^12 w), that is, about 0.016 % of the solar energy received on the earth's surface. the challenge is to convert the enormous amount of solar energy into electricity at lower cost than any other technique of generating electricity. solar energy can be converted into electricity either by concentration solar power (csp) or by photovoltaics. due to a number of cost and reliability related factors, csp has been deployed far less than pv. as an example, in 2012 the installed capacity of csp is 2.8 gw [13] while the pv installed capacity is 28.4 gw [14]. in fact, between 2007 and 2012, only 7.44 gw of csp was installed. the primary interest in csp is due to the fact that, other than the optical system, the operation of csp is similar to coal, nuclear and gas generation of electricity. thus initially utilities were more interested in csp than pv; however, due to the intrinsic cost advantages of pv over csp, the situation has now changed.

4. important role of pv as dc source of power and the development of pv based dc nanogrid and dc microgrid

current global electric infrastructure is dominated by alternating current (ac).
due to the development of solid state dominated power electronics in the last 50 years, high voltage dc transmission has certain advantages [15] over high voltage ac long haul transmission, and currently about 2 % of installed global generating capacity is handled by high voltage dc transmission [16]. however, except for a few applications, all loads around us (smart phone, laptop, refrigerator, air conditioner, light source etc.) need a dc power source. due to local generation of power by pv and the availability of power electronics to step up or step down dc voltages, it is important to revisit thomas edison’s original concept of local dc power generation [17]. in the context of 21st century power generation and utilization, edison’s concept can be extended in two directions. ideally the distance between electricity generation sources and loads must be at a minimum; however, cost-effective solar and wind farms at a particular site also meet the requirements of local dc power. minimum conversion from dc to ac and/or ac to dc must take place to conserve energy. according to the us energy information administration (eia), local power generation is defined as electricity that is (i) self-generated, (ii) produced by either the same entity that consumes the power or an affiliate, and (iii) used in the direct support of a service or industrial process located within the same facility or group of facilities that house the generating equipment. because of the novelty of direct use of the electricity, local electricity generation is on the rise in the united states. this increase is partly due to the compatibility of local dc electricity infrastructures, which can co-exist with existing electrical infrastructures that are based upon alternating current (ac). regarding power storage, dc storage devices such as batteries, capacitors and fuel cells also meet the requirements of local dc electricity.
in essence, the self-sufficient power network of energy generation and energy storage sources, known as the microgrid, is basically a smaller version of the larger power grid. in the absence of external connectivity of the microgrid with the main grid, this self-sufficient pv based “nanogrid” can generate, store and distribute its own power. figure 4 shows the structure of the proposed pv based nanogrid. this concept is innovative in that it uses dc power generation sources, dc storage devices and minimum distance between power sources and the dc loads for the new electricity infrastructure of the 21st century. rooftop pv with storage is a typical example of a nanogrid. a pv based nanogrid is also ideally suited for rural electrification where there is no existing grid.

fig. 4 pv based nano grid for rural electrification

though there are many components driving the growth of local dc electricity, we list the key points below:

(i) the traditional model of large base-load ac centralized electrical power generation and long haul distribution via high-voltage transmission and low voltage lines causes huge losses of energy and high costs of operating such systems. as clearly shown in table 1, approximately 70% of electricity produced is lost in generation, transmission and distribution. assuming that the cost of electricity is $0.1/kwh, the annual energy loss amounts to about 40 trillion dollars.

(ii) direct current (dc) electricity locally generated by renewable energy sources such as solar panels and wind mills, and used with a minimum of conversion (dc to ac or ac to dc) and a minimum of transmission, can avoid as much as 30% or more of the energy that is typically lost in ac generation, transmission, and distribution.

(iii) unlike 20th century technologies, the cost of local power generated from solar pv and wind systems is decreasing daily, with the substitution of dc for ac power further reducing that cost.
the cost of centralized ac power generation, however, has either increased or remained unchanged during that time. wind and solar generated power is cheaper than power from coal-fired plants when factoring in the social costs of carbon. some utilities are now using more pv as it has become more cost-competitive with natural gas. us companies are increasingly turning to solar panels and dc microgrids to offset energy costs. in minnesota, for example, roof-top pv electricity costs 36-75% less than natural gas during peak delivery times. steam generation by pv for so-called enhanced oil reserve projects costs about $5 to $7 per million british thermal units of energy, half of the $12 to $18 price for liquefied natural gas [18].

(iv) dc based pv and wind power systems are more reliable than ac based systems. while the inverter cost is less than about 20% of pv system cost, any system malfunction can shut the system down, with a total loss in energy production [19]. wind turbines are more reliable in dc configurations, due to the greatly reduced complexity of the mechanical transmissions that are required for turbine ac generation [16].

(v) pv systems are extremely reliable. after almost 20 years of continuous outdoor exposure, the average performance decay of silicon pv modules is only 4.42% for the whole period [20]. reliability results guarantee safe investments, for the benefit of all pv users and stakeholders [20].

(vi) batteries, capacitors, and fuel cells are used to store dc electricity. the use of ac in place of dc increases the cost of the storage device; for batteries, ac based storage systems increase the cost by as much as 50 % [21].

(vii) an increase in energy efficiency translates to job creation and economic growth. according to the energy information administration, the electricity consumed in the us in 2011 was 3,839 billion kwh and is expected to increase by 0.91% annually until 2040.
assuming that by 2015 dc electricity is used for 10% of the generation and distribution of the electricity consumed in the us, more than 60,000 jobs will be created.

(viii) integrated circuits and other solid state devices have revolutionized virtually every facet of human life. except for very few cases (e.g. certain motor-based systems), all other loads require dc power. for example, unlike old cathode-ray tube televisions, solid-state tvs do not use ac current. similarly, though lighting consumes about 20% of the electricity produced worldwide, it too uses dc power. also, in contrast to a direct dc supply, typical ac based cell phone chargers waste approximately 20-35% of the energy used [16]. electrical vehicles do not require ac power for charging batteries. with the revolution in the it industry, more semiconductor-based electronics are being used, with a concurrent increase in dc loads and a decrease in ac loads.

(ix) battery-based hybrid and electrical vehicles and solid-state based led lighting are transforming the transportation and lighting industries, both of which are powered by direct current.

(x) energy-efficient appliances use adjustable speed motor drives in which a rectifier converts the ac from the grid into an internal dc bus voltage. though one option entails directly powering the appliances from a dc source, it is also possible to redesign appliances without these embedded rectifiers. although such redesigns may require a refit of manufacturing facilities, state and/or federal government subsidies and related financial incentives to the consumer can offset the costs. globally, 268.1 million major appliances were sold in 2012.
developing public policies to offset the cost of retrofitting manufacturing factories and exporting this new technology will create many new jobs.

(xi) a dc nanogrid is the key-enabler of the “zero energy building model.” with minimum wastage in transmission and conversion, the use of locally generated dc electricity can provide 100% of the energy needs of a building.

table 1 2008 world energy consumption by sector, in quadrillion btu [source: us energy information administration (eia)]

end-use sector    energy end-use (a)    electricity losses (b)    total energy use (c)    share of total energy use
commercial        28                    32                        60                      12%
industrial        191                   64                        255                     51%
residential       52                    37                        89                      18%
transportation    98                    2                         100                     20%

(a) includes end-use of electricity but excludes losses; (b) includes generation, transmission, and distribution losses; (c) includes electricity losses.
total end-use sectors: 369; electric power sector: 194 (39%); total electricity losses: 135; total energy use: 505.

5. global manufacturing advantages of pv generated dc electricity

cutting energy costs increases the competitiveness of manufacturing industry and saves jobs worldwide; in some cases the energy cost is as great as one-third of the operating cost of the manufacturing plant. this is typically true for aluminum plants and many other high energy consuming manufacturing industries. as shown in table 2 [22], aluminum plants lose 6.3% of the total energy due to conversion from ac to dc current, a process that cannot be avoided today. based on world aluminum data, 93,576 thousand metric tons of aluminum were produced in 2012. using the average data of table 2 and an electricity cost of $0.1/kwh, a net saving of $9.6 billion is possible through the use of dc instead of ac power. similarly, other high energy consuming industries (such as the pulp and paper industries) can also be retrofitted for dc current.
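the arithmetic behind the loss figures of table 1 and the per-ton aluminum figures of table 2 can be sketched in a few lines of python. this is a minimal sketch and not the authors' calculation: the function names are hypothetical, the btu-to-kwh factor is the standard physical conversion, and dividing the total electricity losses by the energy entering the electric power sector is an assumption about how the "approximately 70%" figure was obtained.

```python
# Values copied from table 1 (quadrillion btu, 2008, source: eia).
ELECTRIC_POWER_SECTOR_QBTU = 194.0  # energy entering the electric power sector
ELECTRICITY_LOSSES_QBTU = 135.0     # generation, transmission and distribution losses

KWH_PER_BTU = 1055.06 / 3.6e6       # 1 btu = 1055.06 J, 1 kWh = 3.6e6 J
PRICE_USD_PER_KWH = 0.10            # electricity price assumed in the text


def loss_fraction(losses_qbtu, input_qbtu):
    """Share of the electric power sector's energy input that is lost."""
    return losses_qbtu / input_qbtu


def loss_value_usd(losses_qbtu, price_usd_per_kwh):
    """Value of the lost energy if it were priced as delivered electricity."""
    lost_kwh = losses_qbtu * 1e15 * KWH_PER_BTU
    return lost_kwh * price_usd_per_kwh


def ac_to_dc_loss_percent(dc_kwh_per_ton, ac_kwh_per_ton):
    """Extra ac energy relative to the dc energy, in percent.

    Assumption: this is the convention behind the '% loss' column of table 2.
    """
    return 100.0 * (ac_kwh_per_ton - dc_kwh_per_ton) / dc_kwh_per_ton


def aluminum_saving_usd(dc_kwh_per_ton, ac_kwh_per_ton, tons, price_usd_per_kwh):
    """Value of the ac-to-dc conversion loss over a given aluminum output."""
    return (ac_kwh_per_ton - dc_kwh_per_ton) * tons * price_usd_per_kwh


if __name__ == "__main__":
    share = loss_fraction(ELECTRICITY_LOSSES_QBTU, ELECTRIC_POWER_SECTOR_QBTU)
    print(f"share of electricity sector energy lost: {share:.1%}")
    value = loss_value_usd(ELECTRICITY_LOSSES_QBTU, PRICE_USD_PER_KWH)
    print(f"value of losses at $0.10/kwh: ${value:.2e}")
    # World row of table 2 and the 2012 world aluminum output quoted in the text.
    print(f"world ac-to-dc loss: {ac_to_dc_loss_percent(13756, 14639):.2f} %")
    saving = aluminum_saving_usd(13756, 14639, 93_576_000, PRICE_USD_PER_KWH)
    print(f"aluminum saving (world row): ${saving:.2e}")
```

under these assumptions the loss share comes out close to 70%, and the percentage losses reproduce the last column of table 2. the monetary values are more sensitive to the inputs: this sketch gives roughly $4 trillion for the electricity losses and roughly $8.3 billion for aluminum, so the figures quoted in the text presumably rest on inputs or conventions that are not stated here.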
table 2 2012 data of energy consumed in producing one ton of aluminum [22]

nation or region    dc energy (kwh)    ac energy (kwh)    % loss in current plants
north america       14,540             15,458             6.31
world               13,756             14,639             6.42
china               13,014             13,844             6.38

as clearly indicated in table 3 [23], different ac standards of voltage and frequency are used in different countries. japan, however, is an exception in that two sets of frequency standards are used in that nation. the worldwide adoption of dc power can prevent such a redundancy of effort by providing uniform voltage standards worldwide, thus reducing the cost of related power electronics to yield an overall lower manufacturing cost of every dc based electrical system.

table 3 voltage and frequency standards of 16 developing/developed nations [23]

country               voltage (v)    frequency (hz)
australia             230            50
brazil                110 and 220    60
canada                120            60
china                 220            50
cyprus                240            50
egypt                 220            50
guyana                240            60
south korea           220            60
mexico                127            60
japan                 100            50 and 60
oman                  240            50
russian federation    220            50
spain                 230            50
taiwan                110            60
united kingdom        230            50
united states         120            60

6. current status of photovoltaics

as shown in fig. 5, by the end of year 2012 the cumulative installed solar pv electricity generation capacity has exceeded 100 gw and is expected to double from about 100 gw in 2012 to 200 gw in 2015 [24]. the installed capacity of pv is expected to reach 36 gw and 49 gw by the end of year 2013 and 2014 respectively [25]. the large-scale solar pv market, which comprises rooftop projects above 100 kilowatts (kw) in size and ground-mounted solar pv projects, is about 26 gw in 2013 [26]. based on manufacturing considerations discussed at length in earlier publications [5], [27]-[29], silicon based pv modules will continue to dominate the pv market.
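the doubling of cumulative capacity quoted above (about 100 gw in 2012 to 200 gw in 2015 [24]) implies a compound annual growth rate, which the following python sketch makes explicit. the function name is hypothetical, and the calculation assumes smooth exponential growth between the two end points, which the source does not state.

```python
def compound_annual_growth_rate(start_value, end_value, years):
    """Constant yearly growth rate that turns start_value into end_value."""
    if start_value <= 0 or end_value <= 0 or years <= 0:
        raise ValueError("values and number of years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


if __name__ == "__main__":
    # Cumulative installed pv capacity in gw, end of 2012 and end of 2015.
    rate = compound_annual_growth_rate(100.0, 200.0, 3)
    print(f"implied growth of cumulative capacity: {rate:.1%} per year")
```

a doubling in three years corresponds to roughly 26% growth per year under this assumption.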
fundamentally, there is nothing wrong in assuming that concentration photovoltaic (cpv) systems should provide lower cost compared to non-concentration solar cells. however, the engineering problems, which include thermal and optical challenges, have not permitted the large-scale commercialization of concentration solar cells (fig. 6). the average cost of installed pv systems for various segments of the us market is shown in fig. 7 [31]. the average solar pv system cost for various sizes and locations in australia is given in table 4 [32]. the cost of pv modules for various countries is shown in fig. 8 [33]. the data of fig. 7, fig. 8 and table 4 clearly indicate that we are approaching pv module and installed pv system costs of $0.50/wp and $1.00/wp respectively. as shown in fig. 9 and fig. 10, the european union (eu) has dominated the pv market in the past, and china, japan and the us are currently dominating the pv market. the pv module manufacturing share in 2013 is dominated by companies based in china and taiwan [34]. in 2013, only one us based pv company (first solar) is in the list of the top 10 pv manufacturers [34]. however, pv growth in the us is significant, since the water conservation advantages of pv are quite important [35].

table 4 installed cost of pv system in australia (1 australian $~$0.87 us $) [32]

fig. 5 global growth in pv electricity generation capacity worldwide [24]

fig. 6 past, current and projected market of cpv [30]

fig. 7 average installed prices of pv system in us for various market segments [31]

fig. 8 silicon pv module prices for various countries [33]

fig. 9 dominance of european union pv market between 2007-2011 [33]

fig. 10 dominance of china, japan and us pv market between 2012-2016 [33]

7.
manufacturing innovations leading to constant reduction of pv system cost

as we have stated before, more than 90% of the installed pv capacity employs bulk silicon solar cells. the rise in the pv market (fig. 5) and innovation in materials and processing lead to reduced cost of silicon solar cells (fig. 11). the highest reported am 1.5g efficiencies of silicon solar cells and silicon pv modules are 25 % and 21.5 % respectively [29]. except for some minor improvements, no major improvement is expected in the efficiency of silicon solar cells. further cost reduction will be achieved by using thinner wafers and by building new process equipment with higher throughput, lower defect density and reduced footprint. all the new tools must provide lower cost of ownership [29] than current manufacturing tools. in a recent publication [37] we have shown that, as compared to conventional furnace processing and rapid thermal processing, the use of ultra violet (uv) and/or vacuum ultra violet (vuv) photons enhances the diffusion coefficient of dopants by many orders of magnitude. as shown in figure 12, for wavelengths below about 0.3 micrometer, the diffusion coefficient is higher by two to four orders of magnitude. thus, in the case of rapid photothermal processing (rpp), the vuv photons are used as an additional source of optical energy besides thermal energy. the principal advantages of rpp over other thermal processing techniques are (i) lower density of defects, (ii) minimum process variation, (iii) higher throughput, and (iv) lower deposition temperature. based on a conservative estimate, we expect that the throughput of rpp based diffusion and annealing tools will be at least an order of magnitude higher than that of current thermal processing tools. similar to silicon integrated circuit (ic) manufacturing (fig.
13), pv manufacturers can use larger size substrates to further reduce the cost of silicon solar cells. gigawatt pv system manufacturing shown in fig. 14 will provide the ultimate lowest cost of pv systems.

fig. 11 innovations and supply-chain advantages leading to low-cost of silicon solar cells [36]

fig. 12 diffusion coefficient under vacuum ultra violet (vuv) photons [37]

fig. 13 use of larger wafer size by silicon ic manufacturers to reduce the cost of silicon based ics [38]

fig. 14 giga watt scale pv system manufacturing will lead to ultimate lowest cost of pv system

8. r & d directions that can lead to further reductions of cost of pv system and lead to new applications of pv

in a recent publication [29] we have shown that most of the research community is working in areas that either have fundamental flaws or do not meet fundamental manufacturing requirements. the interested reader should read reference [29]. since the publication of reference [29], the lead halide perovskite solar cell [39] has received a lot of publicity. the following are the main reasons that this type of solar cell will never be manufactured:

(i) as a single junction solar cell, silicon cannot be replaced by other solar cells unless a solar cell based on abundant materials achieves at least 30 % efficiency in large area cells.

(ii) the use of lead in the solar cell reported in reference [39] does not meet the manufacturing requirements.

(iii) the area of the high efficiency solar cell reported in ref. [39] is less than 0.1 cm^2. the authors of reference [40] reported that any silicon solar cell with an area of less than 0.25 cm^2 would not show the efficiency degradation due to series resistance effects. in other words, simply scaling a cell whose area is below this critical size to a larger size will lead to a significant reduction in efficiency.
using the am 1.5g spectrum and other data as used in the theoretical calculations in ref. [29], we have used the method of reference [40] to calculate the minimum solar cell area that should be used in reporting any new type of solar cell. these results are shown in fig. 15.

(iv) the band gap of lead halide perovskite of 1.5 ev is not optimum for multijunction multiterminal solar cells [29].

(v) the use of graphene in lead halide perovskite solar cells will lead to significant process variability, and the performance of the module will be worse than that of other thin film materials (cdte, cigs and a-si) used in manufacturing thin film pv modules [29]. thus modules based on lead halide perovskite solar cells will not be able to compete with existing thin film pv modules.

based on solid scientific and engineering principles, the following are productive r & d directions that can lead to the advancement of pv module manufacturing:

(a) multiterminal multijunction solar cells
the next major improvement in the performance and cost reduction of silicon solar cells can be achieved by taking advantage of both bulk silicon and thin film solar cells. in reference [29] we have introduced the concept of multiterminal multijunction solar cells. fig. 16 shows the concept for a two-junction four-terminal device. the optimal band gap of the top junction should be about 1.8 ev.

(b) thin film solar cells for building integrated photovoltaics (bipv)
an enormous opportunity exists to convert the glass used in the construction of every building for generating pv electricity. the low efficiency, leading to high cost, of existing commercial thin film modules for bipv applications is the major barrier. the reliability of bipv must be very high, since glass does not degrade under normal weather conditions.

(c) use of pv in transport sector
all kinds of vehicles used in the transport sector can use pv for power generation [7]. the limited surface area of the vehicle requires high efficiency and low-cost pv modules.
mounting of pv modules on the surface of the vehicle requires innovative design concepts.

(d) integration of pv and consumer products
with the rise in the number of consumer products used by modern man, there is a need to convert ambient light into electricity. this will require innovation in the integration of pv and consumer products. a recent patent filed by apple demonstrates the importance of this area [41].

(e) use of pv electricity for desalination
pv electricity can be used for desalination. however, to reduce the cost of drinkable water, the design of the pv system and the desalination plant must be reconsidered to conserve energy.

(f) pv as source of combined heat and power
the part of the spectrum unused by pv can be used to collect heat generated by the pv system. however, the cost effectiveness of such a chp system has not been proven.

(g) solid state capacitor as energy storage device
the fundamental problem of the current capacitor technology is the low value of capacitance density. based on ultra-high dielectric constant (k) materials (k > 10^6), solid-state capacitors have the potential of providing high energy and high power density. unfortunately, at present, due to defects in the material, the dielectric constant degrades with both electric field and temperature, and the leakage current is very high. finding a solution to all these problems can provide a cost effective large-scale solution for storing electric energy.

(h) use of nanostructures in the fabrication of solar cells
for almost two decades, the buzzword “nanotechnology” has been advocated by a large group of researchers to improve the efficiency of solar cells. however, to date there is not a single work where the efficiency of nanostructure based solar cells for terrestrial applications has exceeded the efficiency of bulk solar cells.
in previous publications we have critically examined the role of nanostructures for solar cell applications [29], [42]-[44], and the summary of our findings is reported here. figure 17 shows the quantization of properties as the dimension changes from “3-d” to “0-d”. the 2-d properties of quantum wells have been exploited very well in the fabrication of iii-v solar cells. the use of self-assembly for 2-d and 1-d nanostructures may provide the properties of isolated structures expected theoretically. however, when self-assembly is used in device fabrication, the process variation results in lower efficiency devices. further, when such devices are used in module manufacturing, the lowest performance device will dictate the efficiency of the modules. as shown in figure 18 [46], quantum dot or “0-d” devices show an increase in the value of the energy gap in the quantum confinement region. below 8 nm, the band gap of silicon quantum dots increases (figure 19). there is no experimental proof that at any given wavelength quantum dots can provide quantum efficiency greater than one. similarly, there is no experimental proof that hot electrons can provide higher efficiency than normal operating devices. as discussed at length in reference [29], the concept of the intermediate band gap is flawed and one cannot get higher conversion efficiency than that of the normal bulk material solar cell. there is a need to invent new control processes so that the unique properties of nanostructures can be obtained with tolerable process variation. in the absence of such processing tools, no practical devices can be made in which one can exploit the unique properties offered by nanostructures.

fig. 15 minimum solar cell area as a function of band gap to observe the degradation of efficiency of solar cell.

fig. 16 (a) schematic of the proposed two-junction four-terminal solar cell.
(b) external electric circuitry to combine the electricity generated separately by the two junctions.

fig. 17 quantization of properties with scaling of dimensions [45]

fig. 18 increase in energy gap with decrease in the size of quantum dot [46]

9. storage of electricity generated by pv

batteries [21], ultracapacitors [48] and fuel cells [49] are all useful for storing dc electricity. apart from safety issues, fuel cells are expensive. significant progress has been made in recent years in reducing the cost of batteries. also, the increasing use of evs [50] and large-scale grid storage [51]-[53] will increase the demand for batteries. similar to the experience of cost reduction of pv by volume manufacturing (fig. 20), the reduction of battery prices will continue. indeed, utility scale battery storage is now competitive with natural gas in the us; eos energy storage inc. has developed a battery system that costs approximately $160/kwh [21]. semiconductor manufacturing techniques can also further reduce the cost of batteries. similar to solar cells [29], a series and parallel combination of various cells in a battery yields the desired watt-hours of the battery. the equipment used in battery manufacturing is generally based on statistical process control, and the resultant process variations lead to variations in the output of the various cells of the battery. advanced process control can reduce this process variation, resulting in higher power output from the same battery. in addition, large scale manufacturing of batteries in a single location will provide tight control of the supply chain and further reduce the cost of batteries.

fig. 19 experimental results on the variation of optical bandgap of nanostructured silicon with diameter of silicon nanograins [47]

10.
misconception about subsidies

in the context of pv, wind energy and other renewables, there are many misconceptions regarding the concept of energy subsidies, a review of which is provided elsewhere [55]. globally, fossil fuel industries receive nearly $1 trillion a year in subsidies, approximately twelve times the amount allocated to the renewables industry [56]. most alarmingly, nearly 43% of subsidies to fossil fuel industries in the developing world end up in the pockets of the richest 20%; only 7% go to the bottom 20% of households [57]. eliminating subsidies for oil, gas, coal and other fossil fuels would make a significant dent in curbing global warming pollution [56]. in table 5 we provide a historical average of us federal energy subsidies. a similar situation exists in other parts of the world.

table 5 historical average of federal energy subsidies in us [55]

energy source    subsidies (2010 $ billion)
oil and gas      4.86
nuclear          3.50
biofuels         1.08
renewables       0.37

11. photovoltaics and the future of utilities

the traditional business model of the utility, “investing in equipment, turning meters, and earning steady profits”, is undergoing a transformational change with the emergence of rooftop pv leading to new business models [58]. the rooftop pv dominated concept of the dc microgrid poses no threat to a utility industry that is willing to adapt to the rapid technological changes in the power industry that pv, storage, power electronics and wind technologies will accelerate. if utilities fail to adapt to these all but certain developments, they will become as archaic as the sears catalog business of the 20th century.

13. energy policy issues

the worldwide adoption of pv generated dc power is a wise global public policy move in terms of sustainability and economic growth of developed, developing and underdeveloped economies. a pv trade war between any two countries is not in the best interest of any nation [59].
a new business model that capitalizes on the buying power of a nation or group of nations can further reduce the cost of implementing pv generated dc power. the real or virtual vertical business model [2] will lead to the lowest cost of pv electricity generated by either the dc microgrid or the dc nanogrid.
14. photovoltaics for underprivileged people (pup)
economic disparity is a serious issue, since the 85 richest people are as wealthy as the poorest half of the world [60]. globally, 2.5 billion people in the developing world rely on biomass (fuel from wood, charcoal and animal dung) to meet their energy needs for cooking and other daily necessities. the continuous decrease in the cost of pv generated electricity is now making it possible to provide electrical energy to those populations, who can be served entirely by pv generated dc electricity. similar to the explosive growth of cell phones (no need for land lines), pv combined with a dc energy distribution system will provide badly needed clean alternatives to dirty sources of fuel. unlike developed economies, in which replacing an aging electricity infrastructure is a challenge, implementing a new low-cost dc power system infrastructure in developing economies that have no such infrastructure is a much easier proposition. a pv based dc nanogrid (figure 4) is the most practical low-cost method of providing cost effective electricity to such developing societies worldwide. indeed, the market size is huge and the societal implications are monumental. the united nations, the world bank, and developed, developing and underdeveloped countries need to work together and invent a new real or virtual business model as shown in fig. 20.
since underdeveloped countries do not have the technology or capital to manufacture pv systems, a new business model must be developed in which the combined purchasing power of the underdeveloped countries is treated as a single entity, and the developed or developing economies gain a huge market share without investment in marketing. the power situation in emerging and underdeveloped countries is a serious issue. as an example, the shortage of electricity in india is a significant hindrance to economic growth [7]. although it is not an optimal low-cost engineering solution, india is, out of desperation, running a pilot project in which each customer will get uninterrupted 100 w dc power that will be obtained from each substation and run on a separate line and a separate meter [61]. a nanogrid for underprivileged people is also a solution for national security [62], [63].
fig. 20 doubling the volume manufacturing reduces the pv module prices by 20 % [54]
fig. 21 proposed virtual or real vertical integrated business model will provide lowest cost of solar electricity
15. concluding remarks
in this paper we have provided an in-depth review of photovoltaics for generating sustainable green electricity in the 21st century. the importance of pv is examined from a global perspective. without storage, the cost of a pv system is approaching about $1 per peak watt, which can provide a dc electricity generation cost of about $0.02-$0.03 per kwh for most of the world’s population. we have identified future manufacturing innovations and r & d directions that can further reduce the cost of pv systems. bulk volume manufacturing of batteries will lead to cost reduction in a manner similar to the cost reduction experience of pv module manufacturing.
for underprivileged people, the united nations and the world bank need to seriously consider pv and develop a new vertically integrated business model that can capitalize on the buying power of the underprivileged people living all over the world. current trends of pv market growth are such that the market size doubles in about two years. if this trend continues, we expect terawatt (1,000 gw) pv installations by the end of this decade. under this scenario we expect pv electricity cost with storage to reach $0.02 per kwh in the next 8-10 years.
references
[1] http://blogs.ei.columbia.edu/2013/07/15/world-population-projected-to-cross-11-billion-threshold-in-2100/
[2] r. singh and g. f. alapatt, “innovative paths for providing green energy for sustainable global economic growth”, proc. spie 8482, photonic innovations and solutions for complex environments and systems (pisces), 848205 (october 11, 2012); doi:10.1117/12.928058
[3] r. singh, l. colombo, k. schuegraf, r. doering, and a. diebold, “semiconductor manufacturing,” in guide to state-of-the-art electronic devices, j. n. burghartz, ed.; new york, ny, usa: wiley, ch. 10, pp. 121–132, 2013.
[4] r. singh, n. gupta, and k. f. poole, “global green energy conversion revolution in 21st century through solid state devices”, proc. 26th international conference on microelectronics, nis, serbia, may 11-14, 2008, vol. 1, pp. 45-54, ieee, new york, ny.
[5] r. singh, g. f. alapatt, and k. f. poole, “photovoltaics: emerging role as a dominant electricity generation technology in the 21st century”, proc. 28th international conference on microelectronics, ieee, new york, ny, pp. 53-63, 2012.
[6] r. singh, k. shenai, g. f. alapatt, and s. m. evon, "semiconductor manufacturing for clean energy economy", invited paper, proc.
ieee energy tech 2013 technology frontiers in sustainable power and energy, may 21-23, 2013, case western reserve university, published by ieee, publication year: 2013, pp: 1 – 7 doi: 10.1109/energytech.2013.6645351 [7] r. singh, g. f. alapatt, and m. abdelhamid , “green energy conversion & storage for solving india's energy problem through innovation in ultra large scale manufacturing and advanced research of solid state devices and systems”, 2012 international conference on emerging electronics, pp. 1-8, 2012, digital object identifier: 10.1109/icemelec.2012.6636220 [8] http://www.asrc.cestm.albany.edu/perez/2011/solval.pdf [9] http://www.3tier.com/static/ttcms/us/images/support/maps/3tier_5km_global_wind_speed.jpg [10] http://www.3tier.com/static/ttcms/us/images/support/maps/3tier_solar_irradiance.jpg [11] http://www.renewableenergyworld.com/rea/news/article/2010/10/champions-of_photovoltaics [12] http://www.sunwindenergy.com/news/more-photovoltaics-wind-power-installed-2013 [13] http://www.sbc.slb.com/sbcinstitute/publications/~/media/files/sbc%20energy%20institute/sbc%20e nergy%20institute_solar_factbook_jun%202013.ashx [14] http://cleantechnica.com/2013/04/11/total-global-solar-pv-capacity-approaching-100-gw/ [15] http://www.economist.com/blogs/babbage/2013/01/power-transmission [16] http://smartgrid.ieee.org/questions-and-answers/902-ieee-smart-grid-experts-roundup-ac-vs-dc-power [17] http://www.abb.com/cawp/seitp202/c646c16ae1512f8ec1257934004fa545.aspx [18] http://www.renewableenergyworld.com/rea/news/article/2014/01/solar-beats-natural-gas-to-unlockmiddle-easts-heavy-oil-says-glassdoor-solar [19] http://www.solarindustrymag.com/issues/si1401/feat_02_a-lifecycle-approach-to-invertermanagement.html [20] d. polverini, m. field, e. dunlop, and w. zaaiman, ”polycrystalline silicon pv modules performance and degradation over 20 years”, prog. photovolt: res. appl. vol. 21:pp. 1004–1015, 2013. 
[21] http://reneweconomy.com.au/2013/eos-utility-scale-battery-storage-competitive-with-gas-36444 [22] http://www.world-aluminium.org/statistics/ [23] m.h. el-sharkawi, ”electrical energy: an introduction”, taylor & francis group, third edition, chapter 2, 2003. [24] m. barker. (2013, mar. 15). reaching new heights: cumulative pv demand to double again by 2015 [online]. available: http://www.solarbuzz.com/resources/blog/2013/03/ reaching-new-heights-cumulativepv-demand-to-double-again-by-2015 [25] http://www.pv-magazine.com/news/details/beitrag/global-solar-pv-demand-to-reach-49-gw-in-2014--saynpd-solarbuzz_100013796/#axzz2sdoix3mn [26] http://www.businessspectator.com.au/news/2014/1/13/solar-energy/large-scale-pv-exceeded-26-gw-year [27] r. singh and j.d. leslie, "economic requirements for new materials for solar cells", solar energy, vol. 24, pp. 589 592, 1980. [28] r. singh, “why silicon is and will remain the dominant photovoltaic material”, journal of nanophotonics, vol. 3, 032503 (16 july 2009). [29] r. singh, g. f. alapatt, and a. lakhtakia,” making solar cells a reality in every home: opportunities and challenges for photovoltaic device design, ieee journal of the electron devices society, vol. 1, no. 6, pp. 129-144, june 2013. [30] http://www.renewableenergyworld.com/rea/news/article/2013/12/cpv-outlook-demand-doubling-costshalved-by-2017?cmpid=solarnl-tuesday-december17-2013 298 r. singh, g. f. alapatt, g. 
bedi [31] http://www.computerworld.com/s/article/9244836/solar_power_installation_costs_fall_through_the_floor [32] http://www.businessspectator.com.au/article/2014/1/20/solar-energy/solar-pv-price-check---january [33] http://www.greentechmedia.com/articles/read/regional-pv-module-pricing-dynamics-what-you-need-to-know [34] http://www.renewableenergyworld.com/rea/news/article/2014/01/top-ten-pv-manufacturers-from-2000-topresent-a-pictorial-retrospective?cmpid=solarnl-thursday-january23-2014 [35] http://campverdebugleonline.com/main.asp?sectionid=36&subsectionid=73&articleid=41284 [36] http://www.businessspectator.com.au/article/2013/11/25/solar-energy/solar-silicon-wafers-below-20cwatt [37] s. shishiyanu, r. singh, t. shishiyanu, s. asher and r. reedy, “the mechanism of enhanced diffusion of phosphorus in silico[46]n during rapid photothermal processing of solar cells”, ieee transaction of electron devices, vol. 58, pp. 776-781, 2011 [38] a simulation study of 450mm wafer fabrication costs, s. w. jones, ic knowledge llc [39] g. hodes, and d. cahen, “perovskite solar cell roll forward”, nature photonics, vol. 8, pp. 87-88, 2014. [40] k. rajkanan and j. shewchun, "a better approach to the evaluation of the series resistance of solar cells" solid-state electronics, vol. 22, no. 2, pp. 193-197, 1979. [41] http://au.ibtimes.com/articles/536794/20140201/apple-macbook-pro.htm [42] n. gupta, g. f. alapatt, r. podila, r. singh, and k. f. poole, “prospects of nanostructure-based solar cells for manufacturing future generation of photovoltaic modules,” int. j. photoenergy., vol. 2009, article no. 154059, 2009. [43] g. f. alapatt, r. singh, and k. f. poole, “fundamental issues in manufacturing photovoltaic modules beyond the current generation of materials”, advances in optoelectronics, article id 782150, 10 pages, doi:10.1155/2011/782150, 2011 [44] g. f. alapatt, r. singh, n. gupta and k. f. 
poole,, ”fundamental problems of self assembly for manufacturing semiconductor products”, emerging materials research, vol. 1, issue s1, pp. 1-5, 2012. [45] j. hank, introduction to the theory of nanostructures, international max planck research school of science and technology of nanostructures, 2006. [46] http://www.sigmaaldrich.com/materials-science/nanomaterials/quantum-dots.html [47] v. a. belyakov, v. a. burdov, r. lockwood, and a. meldrum, “silicon nanocrystals: fundamental theory and implications for stimulatedemission,” adv. opt. technol., vol. 2008, article no. 279502, 2008. [48] http://www.solarpowerworldonline.com/2014/01/ultracapacitors-grid-scale-solar-smoothing/ [49] http://www.njspotlight.com/stories/13/04/02/hydrogen-fuel-cells-could-add-year-round-reliability-torenewable-energy/ [50] http://www1.eere.energy.gov/vehiclesandfuels/pdfs/1_million_electric_vehicles_rpt.pdf [51] http://energy.gov/sites/prod/files/2013/12/f5/grid%20energy%20storage%20december%202013.pdf [52] http://www.pv-tech.org/news/saft_and_ingeteam_to_build_combined_pv_plant_and_battery_storage _project_on [53] http://www.renewableenergyworld.com/rea/news/article/2014/01/transmission-and-energy-storage-2014outlook-the-macro-and-micro-transformation-of-electric-grids?cmpid=wnl-friday-january24-2014 [54] http://www1.eere.energy.gov/solar/pdfs/47927_chapter4.pdf [55] http://i.bnet.com/blogs/dbl_energy_subsidies_paper.pdf. [56] http://www.nrdc.org/international/rio-2012/cleanenergy.asp [57] http://www.economist.com/news/finance-and-economics/21593484-economic-case-scrapping-fossil-fuelsubsidies-getting-stronger-fuelling [58] http://www.greentechmedia.com/articles/read/new-utility-business-models [59] r. singh, “can the us return to manufacturing glory?,” photovoltaics world, pp. 40-43, march/april 2012. 
[60] http://www.latimes.com/business/money/la-fi-mo-oxfam-world-economic-forum-income-inequality20140120,0,7080817.story#axzz2s8ppvaqu
[61] http://economictimes.indiatimes.com/news/news-by-industry/energy/power/iit-madras-project-to-supplylow-power-dc-may-end-outages/articleshow/29458820.cms
[62] http://online.wsj.com/news/articles/sb10001424052702304851104579359141941621778?mg=reno64wsj&url=http%3a%2f%2fonline.wsj.com%2farticle%2fsb10001424052702304851104579359141941 621778.html
[63] http://live.wsj.com/video/mystery-assault-on-power-grid-raises-alarms/9afcc446-5b2e-4749-a8ac6e4b0a8a7301.html#!9afcc446-5b2e-4749-a8ac-6e4b0a8a7301

facta universitatis series: electronics and energetics vol. 35, no 2, june 2022, pp. 283-300 https://doi.org/10.2298/fuee2202283s © 2022 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper
optimization of the 3p keys kernel parameters by minimizing the ripple of the spectral characteristic
nataša savić, zoran milivojević, zoran veličković
academy of applied technical and preschool studies, niš, serbia
abstract. the ideal interpolation kernel is described by the sinc function, and its spectral characteristic is the box function. because the ideal kernel has infinite length, it cannot be realized. therefore, convolutional interpolation kernels of finite length, which should approximate the ideal kernel as well as possible in a specified interval, are formed. the approximation function should have a small numerical complexity, so as to reduce the interpolation execution time. in the scientific literature, great attention is paid to the polynomial kernel of the third order. however, the time and spectral characteristics of third-order polynomial kernels differ significantly from the shape of the ideal kernel. therefore, the accuracy of cubic interpolation is lower. by optimizing the kernel parameters, it is possible to better approximate the ideal kernel.
this will increase the accuracy of the interpolation. the first part of the paper describes a three-parameter (3p) keys interpolation kernel, r. after that, the algorithm for optimizing the parameters of the 3p keys kernel is shown. first, the kernel is decomposed into components, and then the fourier transform is applied to each kernel component. in this way the spectral characteristic of the 3p keys kernel, h, was determined. then the spectral characteristic was expanded into a taylor series, ht. by requiring the elimination of the members of the taylor series that most strongly affect the ripple of the spectral characteristic, the optimal kernel parameters (αopt, βopt, γopt) were determined. the second part of the paper describes an experiment in which the interpolation accuracy of the 3p keys kernel was tested. parametric cubic convolution (pcc) interpolation, with the 3p kernel, was performed over the images from the test database. the test database was created from standard test images, which are intensively used in digital image processing. by analyzing the interpolation error, which is represented by the mean square error, mse, the accuracy of the interpolation was determined. the results (αopt, βopt, γopt, msemin) are presented in tables and graphs. a detailed comparative analysis showed higher interpolation accuracy with the proposed 3p keys interpolation kernel than with the 1p keys and 2p keys interpolation kernels. finally, the numerical values of the optimal kernel parameters, which are determined by the optimization algorithm proposed in this paper, were experimentally verified.
key words: convolution, interpolation, interpolation kernel, pcc interpolation, keys kernel
received november 23, 2021; received in revised form april 5, 2022
corresponding author: nataša savić, academy of applied technical and preschool studies, generala milojka lešjanina 39, 18000 niš, serbia, e-mail: natasa.savic@akademijanis.edu.rs
1. introduction
interpolation is the process of estimating intermediate values between discrete samples of a continuous signal. among other things, interpolation can be realized by applying a convolution between a discrete signal and a continuous interpolation kernel. the interpolation kernel significantly affects the accuracy and execution time of interpolation [1]. for interpolation of band-limited signals, the ideal interpolation kernel is of the form sin(x)/x (denoted sinc), where -∞ < x < +∞ [1, 2]. the spectral characteristic of the sinc interpolation kernel is a rectangular function, hsinc. the sinc kernel cannot be practically realized because it has infinite support. for this reason, the sinc interpolation kernel has to be truncated to a finite length. as a consequence of the truncation, the spectral characteristic deviates from the ideal, rectangular characteristic, which leads to: a) ripple in the passband and stopband, and b) finite slope in the transition band. the idea is to approximate the truncated sinc interpolation kernel with a low-degree polynomial function. in this way, the interpolation kernel has a small numerical complexity and thus allows a higher interpolation speed. these features of the kernel are especially important in real-time systems. signal interpolation using finite length interpolation kernels is realized by applying convolution. a polynomial zeroth-degree kernel allows interpolation by rounding to the nearest-neighbor [3, 4].
nearest-neighbor interpolation is the most efficient in terms of computational speed, but it generates the largest interpolation error. a linear, first-degree interpolation kernel is described in [5]. a quadratic, second-degree interpolation kernel is described in [3, 6]. a cubic, third-degree interpolation kernel, intended for parametric cubic convolution, pcc, is described in [1, 5]. using numerical examples, it has been shown that cubic convolution is more precise than nearest-neighbor and linear interpolation [7-9]. the parameterization of the cubic interpolation kernel, by introducing the kernel parameter α, is shown in [1]. the paper [1] is one of the basic papers in the field of interpolation in digital image processing. later, in the scientific literature, the parametric interpolation kernel from [1] was named, in honor of the author, the 1p keys interpolation kernel. by changing the value of the kernel parameter α, the characteristics of the kernel can be changed and, in this way, adjusted to the signal that is interpolated. the process of adjusting the kernel parameter in this way is called parameter optimization. in [1], the optimization of the parameter α was performed by minimizing the interpolation error, by expanding the error function into a taylor series at f = 0 (maclaurin series). in this way, it was shown that the optimal value of the parameter is αopt = -0.5. the ripple of the spectral characteristic is reduced by eliminating the members of the taylor series that predominantly influence the ripple. in [10], the ripple of the spectral characteristic was reduced by eliminating the members of the taylor series that affect the concavity of the spectral characteristic. in [11], the reduction of the ripple of the spectral characteristic was achieved with α = -0.5. the construction of a two-parameter interpolation kernel is shown in [12, 13]. this kernel is based on the extended parameterization of the 1p keys kernel [1].
in the scientific literature, this kernel is called the 2p keys kernel. the optimal values of the kernel parameters (αopt = 0.1, βopt = 0.2975) for estimating the fundamental frequency of the speech signal were determined in [14]. further expansion of the parameterization, in order to improve the characteristics of the kernel, led to the construction of the 3p keys kernel [15]. the optimal values of the kernel parameters for estimating the fundamental frequency of the speech signal are αopt = 1.7, βopt = -4.7, γopt = -3.8. a detailed analysis of the estimation error, expressed using mse, shows a higher accuracy of estimation using the 3p keys kernel compared to the 1p keys and 2p keys kernels [15]. the paper [16] presents results on the precision of the interpolation of audio signals realized using the 3p keys kernel [15]. audio test signals were acquired by recording g tones (g1-g7) on a steinway b concert piano. a detailed comparative analysis showed that the interpolation error with the 3p keys kernel was: a) 7.374 times smaller than with the 1p keys kernel, and b) 2.4166 times smaller than with the 2p keys kernel. encouraged by these results, which indicate that increasing the number of interpolation kernel parameters reduces the interpolation error, the authors of this paper optimized the 3p keys kernel parameters in order to increase the similarity with the ideal kernel, sinc. the optimized kernel created in this way is expected to further reduce the interpolation error. in this paper, the process of optimizing the parameters of the 3p keys kernel [15] in the spectral domain is presented. optimization of the kernel parameters was performed by minimizing the ripple of the spectral characteristic. the first part of the paper describes the algorithm for optimizing kernel parameters.
first, by applying the fourier transform to the 3p kernel, r, the analytical form of the spectral characteristic, h, was determined. after that, the spectral characteristic was approximated using the taylor series, ht. the ripple reduction was achieved by eliminating the members of the taylor series, ht, which have a dominant effect on the ripple. then, the degree of similarity of the spectral characteristics of the ideal sinc kernel, hsinc, and the optimized kernel, hopt, was determined by comparative analysis. mse was used as a measure of similarity [11]. finally, the optimal parameters, (αopt, βopt, γopt), were determined based on the minimum of the mse. the second part of the paper presents the results of an experiment in which the optimal parameters for the 1p keys, 2p keys and 3p keys kernels were determined. an algorithm for interpolating test images, estimating the interpolation error, and determining the experimental optimal parameters is described. for the purposes of the experiment, an image test base was formed. the image test base consists of: a) standard test images for digital signal processing (lena, barbara, cameraman, peppers, boats, tulips, and watch), and b) images from the bsds500 image base [17]. test images from the bsds500 base have numeric labels, so they will be named in the same way later in this paper. by applying the algorithm to each image, the optimal parameters and the corresponding estimation errors were determined. the results are presented in tables and graphs. next, a comparative analysis of the experimental results with the results obtained by optimizing the spectral characteristic was performed. the comparative analysis determines: a) the accuracy of interpolation using mse and b) the accuracy of estimating kernel parameters using absolute error. finally, in the last part of the paper, an analysis of the execution time of all analyzed kernels was performed.
testing of the execution time was performed on a desktop computer (s2ac43p) with an intel(r) pentium(r) g3220 cpu at 3 ghz, 8 gb of ram and the windows 10 operating system. the matlab r2017b program was used (the tic and toc functions were used to determine the execution time). it should be emphasized that the realized experiment, within which the algorithm for pcc interpolation is described, is intended exclusively for the comparative analysis of the interpolation accuracy of the 1p keys, 2p keys and 3p keys kernels. it was implemented in matlab. therefore, the interpolation execution time is, in this case, not of primary importance, because no real-time condition is set. the paper is organized as follows: section 2 describes the 3p keys kernel. section 3 describes the 3p keys kernel parameterization algorithm. experimental results and comparative analysis are presented in section 4. section 5 is the conclusion.
2. keys parametric interpolation kernels
in paper [1], a fundamental paper in the field of convolutional interpolation, the author defined a parametric interpolation kernel. the kernel was intended for image interpolation. later, in the scientific literature, the interpolation kernel from [1] was called the 1p keys kernel.
2.1. 1p keys kernel
the proposed 1p keys kernel [1] is defined as:
$$ r(x) = \begin{cases} (\alpha + 2)|x|^3 - (\alpha + 3)|x|^2 + 1, & |x| \le 1 \\ \alpha|x|^3 - 5\alpha|x|^2 + 8\alpha|x| - 4\alpha, & 1 < |x| \le 2 \\ 0, & |x| > 2 \end{cases} \tag{1} $$
where α is the parameter of the 1p keys kernel. the length of this kernel is l = 4.
2.2. 2p keys kernel
a modification of the 1p keys kernel, with the introduction of the second kernel parameter, with length l = 6, is shown in [13].
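before the two-parameter form is written out, eq. (1) can be made concrete with a short numerical sketch. the python code below is an illustration only, not the authors' matlab implementation: the function names keys_1p and interpolate, the 1d setting and the edge handling (repeating the nearest edge sample) are assumptions made for this example.

```python
import math


def keys_1p(x, alpha=-0.5):
    """1P Keys kernel of eq. (1); nonzero only for |x| < 2 (length L = 4)."""
    ax = abs(x)
    if ax <= 1.0:
        return (alpha + 2.0) * ax**3 - (alpha + 3.0) * ax**2 + 1.0
    if ax <= 2.0:
        return alpha * ax**3 - 5.0 * alpha * ax**2 + 8.0 * alpha * ax - 4.0 * alpha
    return 0.0


def interpolate(samples, t, alpha=-0.5):
    """Estimate a 1D signal at real-valued position t (in sample units).

    The value is the convolution sum of the four nearest samples with the
    kernel. Indices outside the array reuse the nearest edge sample, which
    is a boundary choice made for this sketch only.
    """
    base = math.floor(t)
    total = 0.0
    for k in range(base - 1, base + 3):
        idx = min(max(k, 0), len(samples) - 1)
        total += samples[idx] * keys_1p(t - k, alpha)
    return total


if __name__ == "__main__":
    ramp = [2.0 * k + 1.0 for k in range(8)]  # samples of 2t + 1
    print(interpolate(ramp, 2.5))             # halfway between samples 2 and 3
    print(interpolate(ramp, 3.0))             # exactly on sample 3
```

the kernel equals 1 at x = 0 and 0 at the other integers, so the original samples are returned unchanged at integer positions; with α = -0.5 the linear ramp used above is also reproduced between samples, away from the edges.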
the analytical form of the 2p keys kernel is: ( ) 3 2 3 2 3 2 ( 2) | | ( 3) | | 1, | | 1 | | (5 ) | | (8 3 ) | | (4 2 ), 1 | | 2 | | 8 | | 21 | | 18 , 2 | | 3 0, | | 3 x x x x x x x r x x x x x x                 − + − − + +   − − + − − −   =  − + −     , (2) where α and β are the parameters of the kernel. for β = 0 is obtained 1p keys kernel. in [12 14], it was shown that the precision of the pcc interpolation with the 2p keys kernel was increased compared to the interpolation of the pcc interpolation with the 1p keys kernel. 2.3. 3p keys kernel the results in [12 14] show that the precision of the pcc interpolation with 2p keys kernel, compared to interpolation with 1p keys kernel, is increased. with the idea of further increasing the interpolation accuracy, the parameterization of the 1p kernel, using three parameters, was performed [15]. the three-parameter kernel is called the 3p keys kernel. the analytical form of the keys 3p kernel is: 3 2 3 2 3 2 3 2 ( 2) | | ( 3) | | 1, | | 1 | | ( 5 ) | | (8 3 3 ) | | ( 4 2 2 ), 1 | | 2 ( ) | | ( 8 ) | | (21 5 ) | | ( 18 6 ), 2 | | 3 | | 11 | | 40 | | 48 , 3 | | 4 0, | | 4 x x x x x x x r x x x x x x x x x x                             − + + + − + − − +   + − − − + − + + − + −    = + − + + − + − +    − + −      ,(3) where α, β and  are the parameters of the 3p keys kernel. as an example, fig. 1.a shows the time characteristics of the ideal interpolation kernel, rsinc, and the 3p keys kernel, rαβ, for kernel parameters α = -1.2, β = -0.1 and  = -0.1. optimization of the 3p keys kernel parameters ... 287 a) b) fig. 1 characteristics of the ideal sinc and 3p keys kernels (α = -1.2, β = -0.1,  = -0.1): a) time characteristics (rsinc, rαβγ) and b) spectral characteristics (hsinc, hαβ) 3. optimization of the keys 3p kernel parameters the spectral characteristic, h, of the 3p keys kernel (eq. 
3) is different from the spectral characteristic, hsinc of the ideal interpolation kernel rsinc (fig. 1.b). the deviation of the spectral characteristic h from hsinc is described as the ripple of the spectral characteristic. the optimization process minimizes the difference between the spectral characteristics of h and hsinc. optimization involves selecting the kernel parameters α, β, and , so as to minimize the mean square error between h and hsinc. in this way, the optimal parameters of the 3p keys kernel αopt, βopt and opt are obtained. 3.1. algorithm for minimizing of the ripple of the spectral characteristic this part of the paper conducts the optimization of keys 3p kernel parameters by minimizing the ripple of the spectral characteristic. the algorithm for parameters optimization consists of the following steps: input: r 3p keys kernel output: αopt, βopt and opt kernel parameters. step 1: decomposition 3p keys kernel r to its components r0, r1, r2 and r3. step 2: determining the spectral characteristic h(f) by applying the fourier transform over the kernel components r0, r1, r2 and r3. step 3: the expansion of the spectral characteristic h( f ) into taylor series ht( f ). step 4: eliminating coefficients of the members of the spectral characteristic ht( f ) which dominantly affect on the ripple of the spectral characteristic. determining the optimal kernel parameters αopt, βopt and opt. a more detailed explanation of the algorithm steps (step 1 step 4) is shown below. 3.2. kernel components (step 1) the 3p keys kernel r (eq. (3)) can be represented as the sum of the kernel components: 288 n. savić, z. milivojević, z. veličković 0 1 2 3 ( ) ( ) ( ) ( ) ( )r x r x r x r x r x  = + + + , (4) where 3 2 0 2 | | 3 | | 1. 
| | 1 ( ) 0, | | 1 x x x r x x  − +  =   , (5) 3 2 3 2 1 | | | | , | | 1 ( ) | | 5 | | 8 | | 4, 1 | | 2 0, | | 2 x x x r x x x x x x  −   = − + −     , (6) 3 2 2 2 3 2 | | | | , | | 1 | | 3 | | 2, 1 | | 2 ( ) | | 8 | | 21 | | 18, 2 | | 3 0, | | 3 x x x x x x r x x x x x x  − +   − +   =  − + −     , (7) and 3 2 2 2 3 3 2 | | | | , | | 1 | | 3 | | 2, 1 | | 2 ( ) | | 5 | | 6, 2 | | 3 | | 11 | | 40 | | 48, 3 | | 4 0, | | 4 x x x x x x r x x x x x x x x x  −   − + −    = − +    − + −      , (8) are components of the 3p keys kernel. fig. 2 shows the components of 3p keys kernel r0, r1, r2 and r3. fig. 2 3p keys kernel components: r0, r1, r2 and r3 3.3. spectral characteristic of the 3p keys kernel (step 2) in order to optimize the parameters α, β, and , of the 3p keys kernel r in the spectral domain, by using the fourier transform (ft) the spectral characteristic of the kernel h was obtained: 0 1 2 3 0 1 2 3 ( ) ( ( )) ( ( ) ( ) ( ) ( )) ( ) ( ) ( ) ( ) h f ft r x ft r x r x r x r x h f h f h f h f       = = + + + = + + + (9) where h0, h1, h2 and h3 are spectral components of the 3p keys kernel: optimization of the 3p keys kernel parameters ... 289 2 0 ( ) ( ) xfi o h f r x e dx   − − =  , (10) 2 1 1 ( ) ( ) xfi h f r x e dx   − − =  , (11) 2 2 2 ( ) ( ) xfi h f r x e dx   − − =  , (12) and 2 3 3 ( ) ( ) xfi h f r x e dx   − − =  . (13) by substituting eq. (5) in eq. (10) is obtained: 0 1 3 2 2 3 2 2 0 1 0 ( ) ( 2 3 1) (2 3 1) xfi xfi h f x x e dx x x e dx  − − − = − − + + − +  , (14) by substituting eq. (6) in eq. (11) is obtained: -1 0 3 2 -2 3 2 -2 1 -2 -1 1 2 3 2 -2 3 2 -2 0 1 ( ) ( 5 8 4) ( ) ( ) ( 5 8 4) xfi xfi xfi xfi h f x x x e dx x x e dx x x e dx x x x e dx     = − − − − + − − + − + − + −     , (15) by substituting eq. (7) in eq. 
(12) is obtained:

$$ \begin{aligned} H_2(f) = {} & \int_{-3}^{-2} (-x^3 - 8x^2 - 21x - 18)\, e^{-2\pi x f i}\, dx + \int_{-2}^{-1} (x^2 + 3x + 2)\, e^{-2\pi x f i}\, dx + \int_{-1}^{0} (x^3 + x^2)\, e^{-2\pi x f i}\, dx \\ & + \int_{0}^{1} (-x^3 + x^2)\, e^{-2\pi x f i}\, dx + \int_{1}^{2} (x^2 - 3x + 2)\, e^{-2\pi x f i}\, dx + \int_{2}^{3} (x^3 - 8x^2 + 21x - 18)\, e^{-2\pi x f i}\, dx, \end{aligned} \tag{16} $$

by substituting eq. (8) in eq. (13), the following is obtained:

$$ \begin{aligned} H_3(f) = {} & \int_{-4}^{-3} (-x^3 - 11x^2 - 40x - 48)\, e^{-2\pi x f i}\, dx + \int_{-3}^{-2} (x^2 + 5x + 6)\, e^{-2\pi x f i}\, dx \\ & + \int_{-2}^{-1} (-x^2 - 3x - 2)\, e^{-2\pi x f i}\, dx + \int_{-1}^{0} (-x^3 - x^2)\, e^{-2\pi x f i}\, dx \\ & + \int_{0}^{1} (x^3 - x^2)\, e^{-2\pi x f i}\, dx + \int_{1}^{2} (-x^2 + 3x - 2)\, e^{-2\pi x f i}\, dx \\ & + \int_{2}^{3} (x^2 - 5x + 6)\, e^{-2\pi x f i}\, dx + \int_{3}^{4} (x^3 - 11x^2 + 40x - 48)\, e^{-2\pi x f i}\, dx, \end{aligned} \tag{17} $$

after applying euler's formula and partial integration, the spectral components of the kernel can be written in the following form:

$$ H_0(f) = \frac{6\sin^2(\pi f) - 3\pi f \sin(2\pi f)}{2\pi^4 f^4}, \tag{18} $$

$$ H_1(f) = \frac{3\sin^2(2\pi f) - 4\pi f \sin(2\pi f) - \pi f \sin(4\pi f)}{2\pi^4 f^4}, \tag{19} $$

$$ H_2(f) = \frac{-3\sin^2(\pi f) - 3\sin^2(2\pi f) + 3\sin^2(3\pi f)}{2\pi^4 f^4} + \frac{3\pi f \sin(2\pi f) - 3\pi f \sin(4\pi f) - \pi f \sin(6\pi f)}{2\pi^4 f^4}, \tag{20} $$

and

$$ H_3(f) = \frac{3\left(\sin^2(\pi f) - \sin^2(3\pi f) + \sin^2(4\pi f)\right)}{2\pi^4 f^4} - \frac{\pi f \left(3\sin(2\pi f) - 2\sin(4\pi f) + 3\sin(6\pi f) + \sin(8\pi f)\right)}{2\pi^4 f^4}. \tag{21} $$

spectral components H0 (eq. (18)), H1 (eq. (19)), H2 (eq. (20)) and H3 (eq. (21)) are shown in fig. 3.

fig. 3 spectral components H0, H1, H2 and H3 of the 3p keys kernel

3.4. optimal kernel parameters (step 3, step 4)

in order to determine the optimal parameters of the 3p keys kernel r in the spectral domain, the taylor expansion Ht of the spectral characteristic H (eq. (9)) was determined.

(step 3) by expansion into taylor series in the neighborhood of f = 0 (maclaurin series), the spectral components of the kernel were obtained:

$$ H_{t0}(f) = 1 - \frac{4}{15}(\pi f)^2 + \frac{1}{35}(\pi f)^4 - \frac{8}{4725}(\pi f)^6 + \frac{2}{31185}(\pi f)^8 + \dots, \tag{22} $$

$$ H_{t1}(f) = -\frac{8}{15}(\pi f)^2 + \frac{16}{35}(\pi f)^4 - \frac{232}{1575}(\pi f)^6 + \frac{4112}{155925}(\pi f)^8 + \dots, \tag{23} $$

$$ H_{t2}(f) = -\frac{8}{15}(\pi f)^2 + \frac{272}{105}(\pi f)^4 - \frac{4232}{1575}(\pi f)^6 + \frac{205808}{155925}(\pi f)^8 + \dots, \tag{24} $$

and

$$ H_{t3}(f) = -\frac{16}{15}(\pi f)^2 + \frac{256}{35}(\pi f)^4 - \frac{25904}{1575}(\pi f)^6 + \frac{2640832}{155925}(\pi f)^8 + \dots. \tag{25} $$

by substituting eq. (22)-(25) in eq. (9), the following is obtained:

$$ \begin{aligned} H_t(f) = {} & H_{t0}(f) + \alpha H_{t1}(f) + \beta H_{t2}(f) + \gamma H_{t3}(f) \\ = {} & 1 - \frac{4}{15}(1 + 2\alpha + 2\beta + 4\gamma)(\pi f)^2 + \frac{1}{105}(3 + 48\alpha + 272\beta + 768\gamma)(\pi f)^4 \\ & - \frac{8}{4725}(1 + 87\alpha + 1587\beta + 9714\gamma)(\pi f)^6 + O(f^8). \end{aligned} \tag{26} $$

(step 4) the ripple of the spectral characteristic (eq. (26)) is minimized by eliminating its dominant members:

$$ \begin{cases} 1 + 2\alpha + 2\beta + 4\gamma = 0 \\ 3 + 48\alpha + 272\beta + 768\gamma = 0 \\ 1 + 87\alpha + 1587\beta + 9714\gamma = 0 \end{cases} \tag{27} $$

solving the system of equations eq. (27) gives:

$$ \begin{cases} \alpha_{opt} = -\dfrac{4945}{8064} \approx -0.6132 \\ \beta_{opt} = \dfrac{409}{2688} \approx 0.1522 \\ \gamma_{opt} = -\dfrac{157}{8064} \approx -0.0195 \end{cases} \tag{28} $$

by substituting the optimal parameter αopt = -0.5 [11], the optimal interpolation 1p keys kernel, ropt_1p, was obtained:

$$ r_{opt\_1p}(x) = \begin{cases} 1.5|x|^3 - 2.5|x|^2 + 1, & |x| \le 1 \\ -0.5|x|^3 + 2.5|x|^2 - 4|x| + 2, & 1 < |x| \le 2 \\ 0, & |x| > 2 \end{cases} \tag{29} $$

the spectral characteristic of the 1p keys kernel, Hopt_1p, is shown in fig. 4.

by substituting the optimal parameters αopt = -0.5938, βopt = 0.0938 [12], the optimal interpolation 2p keys kernel, ropt_2p, was obtained:

$$ r_{opt\_2p}(x) = \begin{cases} 1.3124|x|^3 - 2.3124|x|^2 + 1, & |x| \le 1 \\ -0.5938|x|^3 + 3.0628|x|^2 - 5.0318|x| + 2.5628, & 1 < |x| \le 2 \\ 0.0938|x|^3 - 0.7504|x|^2 + 1.9698|x| - 1.6884, & 2 < |x| \le 3 \\ 0, & |x| > 3 \end{cases} \tag{30} $$

the spectral characteristic of the 2p keys kernel, Hopt_2p, is shown in fig. 4.
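as a cross-check of eq. (27) and eq. (28), the three linear conditions can be solved exactly with rational arithmetic. the following is a minimal python sketch, not the authors' implementation; the helper name solve3 and the variable names are assumptions made for illustration.

```python
from fractions import Fraction


def solve3(a, b):
    """Solve a 3x3 linear system a*x = b exactly by Gauss-Jordan elimination."""
    n = 3
    # Augmented matrix with exact rational entries.
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick a row with a non-zero pivot and move it into place.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        # Normalise the pivot row, then clear the column in all other rows.
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [m[r][n] for r in range(n)]


# Coefficients of alpha, beta, gamma in eq. (27); constants moved to the right side.
coeffs = [[2, 2, 4], [48, 272, 768], [87, 1587, 9714]]
rhs = [-1, -3, -1]

alpha_opt, beta_opt, gamma_opt = solve3(coeffs, rhs)
print(alpha_opt, beta_opt, gamma_opt)
print(round(float(alpha_opt), 4), round(float(beta_opt), 4), round(float(gamma_opt), 4))
```

exact fractions are used here so that the result can be compared directly with the rational values in eq. (28) rather than with rounded decimals.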
by substituting the optimal parameters αopt = -0.6132, βopt = 0.1522, γopt = -0.0195 (eq. (28)) in eq. (3), the optimal interpolation 3p keys kernel, ropt_3p, was obtained:

$$ r_{opt\_3p}(x) = \begin{cases} 1.2151|x|^3 - 2.2151|x|^2 + 1, & |x| \le 1 \\ -0.6132|x|^3 + 3.2377|x|^2 - 5.4207|x| + 2.7962, & 1 < |x| \le 2 \\ 0.1522|x|^3 - 1.2371|x|^2 + 3.2937|x| - 2.8566, & 2 < |x| \le 3 \\ -0.0195|x|^3 + 0.2145|x|^2 - 0.78|x| + 0.936, & 3 < |x| \le 4 \\ 0, & |x| > 4 \end{cases} \tag{31} $$

the spectral characteristic of the 3p keys kernel is shown in fig. 4. moreover, fig. 4 shows the spectral characteristic of the ideal sinc kernel rsinc (Hsinc). paper [11] presents the total mean square error, MSEt, which measures the difference between the spectral characteristic H and the ideal box characteristic Hsinc:

$$ \mathrm{MSE}_t = \frac{1}{K} \sum_{k=0}^{K-1} \left( H(f_k) - H_{sinc}(f_k) \right)^2. \tag{32} $$

fig. 4 spectral characteristic of the ideal interpolation kernel Hsinc and optimal spectral characteristics H of: a) 1p (αopt = -0.5), b) 2p (αopt = -0.5938, βopt = 0.0938) and c) 3p (αopt = -0.6132, βopt = 0.1522, γopt = -0.0195) keys kernel

4. experimental results and analysis

4.1. experiment

an experiment was carried out to determine: a) the interpolation accuracy with the 3p keys kernel, and b) the interpolation execution time, te, with the 3p keys kernel, both in relation to interpolation with the 1p and 2p keys kernels. interpolations were performed on test images, ti, from the image base. the image base was created from: a) standard test images used in digital image processing, and b) test images from the bsds500 base. some test images from the image base are in color (rgb) and some are black-and-white (y). in this experiment, interpolations were performed on black-and-white images. therefore, color images were transformed into black-and-white images in accordance with the colorimetric equation y = 0.3r + 0.59g + 0.11b. the experiment was performed as follows.
first, the experimental optimal values of the kernel parameters for: a) 1p keys (αopt), b) 2p keys (αopt, βopt) and c) 3p keys (αopt, βopt, γopt) were determined using the algorithm described below. after that, a comparative analysis was performed of the estimation error of the optimal kernel parameters obtained: a) by optimizing the ripple of the spectral characteristic and b) by experiments. finally, a comparative analysis of interpolation accuracy between the proposed 3p keys kernel and the 1p and 2p keys kernels was performed. for this purpose, the test image, ti, which is presented as a two-dimensional matrix with dimensions (L x K), was transformed into a one-dimensional matrix. the transformation was performed by connecting the rows of the test image matrix one after the other, and, in this way, a one-dimensional matrix, x, with dimension N = L x K, was obtained (these activities are realized by the algorithm described in section 4.3). the interpolation is organized as follows. the interpolation of the intensity of the pixel i, x(i), was performed by convolution between the interpolation kernel and the intensities of the pixels x(i - k), x(i - k + 1), ..., x(i + k), where k is the length of the interpolation kernel. the interpolated value of pixel i is $\hat{x}_i$. on the other hand, the intensity of the pixel i is known (x(i)), and, in the experiment, it is considered to be the true value of the pixel intensity. further analysis involved defining the interpolation error. the interpolation error was defined by mse (eq. (32)), which was calculated between the true intensity, x(i), and the interpolated intensity, $\hat{x}_i$, of the pixel i. mse was used in a comparative analysis of interpolation accuracy between the interpolation results obtained with the 1p, 2p and 3p keys kernels. the interpolation results (msemin) are presented using graphs and tables.
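the procedure described above can be sketched in a few lines of python. this is an illustrative sketch under stated assumptions, not the authors' matlab code: the function names keys3p, interpolate_centre and mse_for are hypothetical, the kernel components are taken from eq. (5)-(8), a frame of 15 samples is assumed (8 known samples at the odd positions, with the centre sample estimated), and the test signal is a synthetic sine rather than an image row.

```python
import math


def keys3p(x, alpha, beta, gamma):
    """3P Keys kernel r = r0 + alpha*r1 + beta*r2 + gamma*r3 (components as in eqs. (5)-(8))."""
    a = abs(x)
    if a <= 1:
        r0 = 2 * a**3 - 3 * a**2 + 1
        r1 = a**3 - a**2
        r2 = -a**3 + a**2
        r3 = a**3 - a**2
    elif a <= 2:
        r0 = 0.0
        r1 = a**3 - 5 * a**2 + 8 * a - 4
        r2 = a**2 - 3 * a + 2
        r3 = -a**2 + 3 * a - 2
    elif a <= 3:
        r0 = r1 = 0.0
        r2 = a**3 - 8 * a**2 + 21 * a - 18
        r3 = a**2 - 5 * a + 6
    elif a <= 4:
        r0 = r1 = r2 = 0.0
        r3 = a**3 - 11 * a**2 + 40 * a - 48
    else:
        return 0.0
    return r0 + alpha * r1 + beta * r2 + gamma * r3


def interpolate_centre(frame, alpha, beta, gamma):
    """Estimate the centre sample of a frame from every second sample of that frame."""
    known = frame[0::2]               # samples at positions 1, 3, 5, ... (1-based)
    half = (len(known) - 1) / 2       # centre position in units of the known-sample spacing
    return sum(s * keys3p(half - j, alpha, beta, gamma) for j, s in enumerate(known))


def mse_for(signal, alpha, beta, gamma, frame_len=15):
    """Mean square error between true centre samples and their interpolated estimates."""
    centre = frame_len // 2
    errors = []
    for i in range(len(signal) - frame_len + 1):
        frame = signal[i:i + frame_len]
        errors.append(frame[centre] - interpolate_centre(frame, alpha, beta, gamma))
    return sum(e * e for e in errors) / len(errors)


# Assumed synthetic test signal (not one of the paper's test images).
signal = [math.sin(0.1 * n) for n in range(200)]
print(mse_for(signal, -0.5, 0.0, 0.0))             # 1P Keys kernel (beta = gamma = 0)
print(mse_for(signal, -0.6132, 0.1522, -0.0195))   # 3P Keys kernel with the parameters of eq. (28)
```

setting β = γ = 0 reduces the same function to the 1p keys kernel, which is why one kernel routine can serve all three comparisons.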
by comparative analysis of msemin, the precision of interpolation with the 3p keys kernel, in relation to the precision of interpolation with the 1p and 2p keys kernels, was determined. in addition, the execution time of the pcc interpolation, te, was determined. the execution time was tested on a desktop computer s2ac43p (processor: intel(r) pentium(r) cpu g3220, 3 ghz; ram: 8 gb) running the windows 10 operating system. matlab r2017b was used (the execution time, te, was measured with the tic and toc functions). execution time was measured for: a) complete convolution with kernels (eq. (1), eq. (2) and eq. (3)), where the coefficients of the third-order polynomials are first calculated from the kernel parameters α, β and γ, and then the values of the polynomials are calculated, b) convolution with the optimized kernel parameters (eq. (29), eq. (30) and eq. (31)), where the coefficients of the polynomials were calculated in advance, so that only the values of the polynomials are calculated, and c) convolutional kernel execution time, without interpolation. all interpolation execution times were determined as the arithmetic mean of the results for 100000 interpolations.

4.2. image base

to test the accuracy of pcc interpolation on images, an image base was created. the image base consists of: a) standard test images for digital signal processing, and b) images from the bsds500 image base [17]. the standard test images are: lena (512 x 512, rgb) (fig. 5.a), barbara (225 x 675, rgb) (fig. 5.b), cameraman (225 x 675, y) (fig. 5.c), peppers (225 x 675, rgb) (fig. 5.d), boats (225 x 675, rgb) (fig. 5.e), tulips (512 x 768, rgb) (fig. 5.f), and watch (768 x 1024, rgb) (fig. 5.g). the test images from the bsds500 base have numeric labels: 3096 (321 x 481, rgb) (fig. 5.h), 14037 (321 x 481, rgb) (fig. 5.i), 295087 (321 x 481, rgb) (fig. 5.j), 126007 (321 x 481, rgb) (fig. 5.k), 260058 (321 x 481, rgb) (fig. 5.l), 160068 (321 x 481, rgb) (fig. 5.m), 241004 (321 x 481, rgb) (fig. 5.n), 197017 (321 x 481, rgb) (fig. 5.o), 143090 (321 x 481, rgb) (fig. 5.p).

fig. 5 test images for digital image processing: a) lena, b) barbara, c) cameraman, d) peppers, e) boats, f) tulips, g) watch. test images from the bsds500 database, with numeric labels: h) 3096, i) 14037, j) 295087, k) 126007, l) 260058, m) 160068, n) 241004, o) 197017, p) 143090

4.3. algorithm for interpolation error determining

the following algorithm performs interpolation of the test images, determines the interpolation error and determines the mse depending on the parameters α, β and γ. optimal parameters were determined by minimizing mse. the algorithm is realized in the following steps:

input: (r0, r1, r2, r3) – 3p keys kernel components, (αmin, Δα, αmax, βmin, Δβ, βmax, γmin, Δγ, γmax) – parameter boundaries and iteration steps, L – kernel length, TI (L x K) – test image.
output: αopt, βopt, γopt – optimal parameters; MSEα, MSEαβ, MSEαβγ.

step 1: converting a color image to a black-and-white image:
    if test image == color image
        $T_I = 0.3 \cdot R + 0.59 \cdot G + 0.11 \cdot B$
    end
step 2: transformation of the image TI (L x K) into a one-dimensional matrix x:
    for l = 1 : L
        for k = 1 : K
            $x((l - 1) \cdot K + k) = T_I(l, k)$
        end k
    end l
    the dimensions of the one-dimensional matrix x are (1, N), where N = L x K.
for γ = γmin : Δγ : γmax
  for β = βmin : Δβ : βmax
    for α = αmin : Δα : αmax
      step 3: construction of the kernel: $r = r_0 + \alpha r_1 + \beta r_2 + \gamma r_3$,
      step 4: the length of the interpolation frame is: $M = 2 \cdot L - 1$
      for i = 1 : N-M+1
        step 5: selecting the i-th frame: xi = x(i : i+M-1)
        step 6: estimation of $\hat{x}_i$ by applying pcc: $\hat{x}_i = x_i[1 : 2 : M] * r$, where the symbol $*$ stands for convolution.
        step 7: the estimation error is: $e(i) = x_i(L) - \hat{x}_i$
      end i
      step 8: mean square error of estimation of the 1p kernel: $\mathrm{MSE}_{\alpha}(\alpha) = \frac{1}{N - M + 1} \sum_{k=1}^{N-M+1} |e(k)|^2$,
    end α
    step 9: mean square error of estimation of the 2p kernel: $\mathrm{MSE}_{\alpha\beta}(\beta) = \mathrm{MSE}_{\alpha}$,
  end β
  step 10: mean square error of estimation of the 3p kernel: $\mathrm{MSE}_{\alpha\beta\gamma}(\gamma) = \mathrm{MSE}_{\alpha\beta}$,
end γ
step 11: optimal values of the 3p kernel parameters: $(\alpha_{opt}, \beta_{opt}, \gamma_{opt}) = \arg\min_{\alpha, \beta, \gamma}(\mathrm{MSE}_{\alpha\beta\gamma})$.

the purpose of the described algorithm was to test the interpolation error with the 3p keys kernel in relation to the interpolation error with the 1p and 2p keys kernels. the algorithm was implemented in matlab for testing only and is not intended for real-time systems. therefore, the execution time of the algorithm is not of dominant importance. however, in the experiment, the interpolation execution time, te, was determined for the 1p, 2p and 3p keys kernels using the matlab functions tic and toc. based on the execution time, a comparative analysis was performed.

4.4. experimental results

using the test algorithm described in section 4.3, interpolation of the test images was performed. interpolation was performed with the 1p, 2p and 3p keys kernels for all values of the parameters α, β and γ from the specified range. for each interpolation, the interpolation error, mse, was determined. based on the minimum interpolation error, msemin, the optimal interpolation kernel parameters were determined. figure 5.a shows the dependence of MSEα on the parameter α for the 1p keys kernel (test image boats). the optimal parameter, αopt, was determined as $\alpha_{opt} = \arg\min_{\alpha}(\mathrm{MSE}_{\alpha})$. figure 5.b shows the dependence of MSEαβ on the parameters α and β for the 2p keys kernel (test image boats).
the optimal parameters αopt and βopt were determined as $(\alpha_{opt}, \beta_{opt}) = \arg\min_{\alpha, \beta}(\mathrm{MSE}_{\alpha\beta})$. the minimum interpolation errors, msemin, and the corresponding optimal kernel parameters, when interpolating all test images from the image base, are shown in: a) table 1 (1p keys, αopt, mse1p), b) table 2 (2p keys, αopt, βopt, mse2p) and c) table 3 (3p keys, αopt, βopt, γopt, mse3p). table 4 shows the execution time of pcc convolution for: a) complete convolution with kernels (label in the table: int1) (eq. (1), eq. (2) and eq. (3)), b) convolution with the optimized kernel parameters (label in the table: int2) (eq. (29), eq. (30) and eq. (31)) and c) convolutional kernel execution time, without interpolation (label in the table: kert). all interpolation execution times were determined as the arithmetic mean of the results for 100000 interpolations.

fig. 5 dependence of mse on kernel parameters for the test image boats: a) 1p keys kernel and b) 2p keys kernel

table 1 optimal parameter α and minimum mse for 1p keys kernel.

| image base | image | αopt | mse1p |
|---|---|---|---|
| dsp test base | lena | -0.3000 | 11.3234 |
| dsp test base | barbara | -0.1000 | 247.0271 |
| dsp test base | cameraman | -0.5000 | 0.3133 |
| dsp test base | peppers | -0.6200 | 75.7521 |
| dsp test base | boats | -0.3000 | 263.2390 |
| dsp test base | tulips | -0.7000 | 14.5797 |
| dsp test base | watch | -0.4000 | 49.9283 |
| bsds500 base | 3096 | 0.200 | 0.7933 |
| bsds500 base | 14037 | -0.6000 | 10.0185 |
| bsds500 base | 295087 | 0.1000 | 2.3780 |
| bsds500 base | 126007 | -0.4000 | 19.1678 |
| bsds500 base | 260058 | -0.300 | 4.8327 |
| bsds500 base | 160068 | 0.6000 | 0.5835 |
| bsds500 base | 241004 | -0.300 | 6.4673 |
| bsds500 base | 197017 | 0.3000 | 6.4499 |
| bsds500 base | 143090 | -0.01 | 18.2042 |
| mean values | | -0.1706 | 45.6911 |

table 2 optimal parameters α and β, and minimum mse for 2p keys kernel

| image base | image | αopt | βopt | mse2p |
|---|---|---|---|---|
| dsp test base | lena | -0.3000 | -0.1000 | 11.3137 |
| dsp test base | barbara | -0.1000 | 0 | 247.0271 |
| dsp test base | cameraman | -0.3000 | -0.2000 | 0.3114 |
| dsp test base | peppers | -0.5400 | 0.1000 | 75.2829 |
| dsp test base | boats | -0.4000 | 0.1000 | 262.7854 |
| dsp test base | tulips | -0.6000 | 0.2000 | 14.1536 |
| dsp test base | watch | 0 | 0.3000 | 49.2893 |
| bsds500 base | 3096 | 0.2100 | 0.0030 | 0.6346 |
| bsds500 base | 14037 | -0.300 | 0.200 | 7.9427 |
| bsds500 base | 295087 | 0.0400 | -0.0100 | 1.9018 |
| bsds500 base | 126007 | -0.4000 | 0.0100 | 15.3342 |
| bsds500 base | 260058 | -0.300 | -0.010 | 3.8658 |
| bsds500 base | 160068 | 0.7000 | 0.1000 | 0.4663 |
| bsds500 base | 241004 | -0.3000 | -0.0300 | 5.1729 |
| bsds500 base | 197017 | 0.4000 | 0.0900 | 5.1557 |
| bsds500 base | 143090 | -0.0100 | 0.0040 | 14.5632 |
| mean values | | -0.1375 | 0.0473 | 44.7000 |

table 3 optimal parameters α, β and γ, and minimum mse for 3p keys kernel

| image base | image | αopt | βopt | γopt | mse3p |
|---|---|---|---|---|---|
| dsp test base | lena | -0.3000 | -0.1000 | -0.0500 | 11.3130 |
| dsp test base | barbara | -0.1000 | -0.3000 | -0.3000 | 242.1622 |
| dsp test base | cameraman | 0.3000 | -0.1000 | 0.1000 | 0.3113 |
| dsp test base | peppers | -0.5200 | 0.1000 | -0.0200 | 75.2664 |
| dsp test base | boats | 0.5000 | 0.2000 | 0.0500 | 262.7747 |
| dsp test base | tulips | -0.6000 | 0.2000 | -0.0500 | 14.1407 |
| dsp test base | watch | -0.1000 | 0.1000 | -0.1500 | 49.2107 |
| bsds500 base | 3096 | 0.2000 | -0.007 | -0.005 | 0.4760 |
| bsds500 base | 14037 | -0.300 | 0.2000 | -0.010 | 5.9566 |
| bsds500 base | 295087 | 0 | 0 | 0.0400 | 1.4259 |
| bsds500 base | 126007 | -0.400 | 0.1000 | 0.0800 | 11.4961 |
| bsds500 base | 260058 | -0.400 | -0.060 | 0.0001 | 2.8970 |
| bsds500 base | 160068 | 0.7000 | 0 | -0.110 | 0.3491 |
| bsds500 base | 241004 | -0.300 | 0 | 0.0400 | 3.8780 |
| bsds500 base | 197017 | 0.400 | 0.0800 | -0.014 | 3.8666 |
| bsds500 base | 143090 | -0.020 | 0.0900 | 0.1000 | 10.9091 |
| mean values | | -0.1587 | 0.0314 | -0.0187 | 43.5271 |

table 4 execution time for pcc interpolation.

| execution time te (s) | te_1p_keys | te_2p_keys | te_3p_keys |
|---|---|---|---|
| int1 | 1.4305·10⁻⁶ | 2.4876·10⁻⁶ | 2.5225·10⁻⁶ |
| int2 | 1.1903·10⁻⁶ | 2.0704·10⁻⁶ | 2.0990·10⁻⁶ |
| kert | 5.6499·10⁻⁷ | 5.6492·10⁻⁷ | 5.6489·10⁻⁷ |

4.5. comparative analysis

according to the mean values presented in table 1, table 2 and table 3: a) the mse when applying the 2p keys kernel compared to the 1p keys kernel is $\overline{\mathrm{MSE}}_{1p} / \overline{\mathrm{MSE}}_{2p}$ = 45.6911 / 44.7000 = 1.0222 times smaller, b) the mse when applying the 3p keys kernel compared to the 1p keys kernel is $\overline{\mathrm{MSE}}_{1p} / \overline{\mathrm{MSE}}_{3p}$ = 45.6911 / 43.5271 = 1.0497 times smaller, and c) the mse when applying the 3p keys kernel compared to the 2p keys kernel is $\overline{\mathrm{MSE}}_{2p} / \overline{\mathrm{MSE}}_{3p}$ = 44.7000 / 43.5271 = 1.0269 times smaller.

the optimal values of the kernel parameters, determined by minimizing the ripple of the spectral characteristic of the 3p keys kernel (eq. (28)), are: αopt = -0.6132, βopt = 0.1522 and γopt = -0.0195. the experimental results (table 3) show that the mean values of the optimal kernel parameters, determined for all test images, are: $\bar{\alpha}_{opt\_3p}$ = -0.1587, $\bar{\beta}_{opt\_3p}$ = 0.0314 and $\bar{\gamma}_{opt\_3p}$ = -0.0187. the absolute errors of the kernel parameters determined by minimizing the ripple of the spectral characteristic, in relation to the experimentally determined kernel parameters, are: a) $\Delta\alpha_{3p} = |\alpha_{opt} - \bar{\alpha}_{opt\_3p}| = |-0.6132 - (-0.1587)| = 0.4545$, b) $\Delta\beta_{3p} = |\beta_{opt} - \bar{\beta}_{opt\_3p}| = |0.1522 - 0.0314| = 0.1208$, c) $\Delta\gamma_{3p} = |\gamma_{opt} - \bar{\gamma}_{opt\_3p}| = |-0.0195 - (-0.0187)| = 0.0008$. the total absolute error of kernel parameter estimation for all test images is $E_t = \sqrt{\Delta\alpha_{3p}^2 + \Delta\beta_{3p}^2 + \Delta\gamma_{3p}^2} = 0.4703$.

in accordance with the results presented in table 4, for a complete convolution with nonoptimized kernels, (eq. (1), eq. (2) and eq.
(3)) (label in table 4: int1), it is concluded that the execution time, te, when applying: a) the 2p keys kernel compared to the 1p keys kernel is te_2p_keys / te_1p_keys = 2.4876·10⁻⁶ / 1.4305·10⁻⁶ = 1.7389 times longer, b) the 3p keys kernel compared to the 1p keys kernel is te_3p_keys / te_1p_keys = 2.5225·10⁻⁶ / 1.4305·10⁻⁶ = 1.7633 times longer, and c) the 3p keys kernel compared to the 2p keys kernel is te_3p_keys / te_2p_keys = 2.5225·10⁻⁶ / 2.4876·10⁻⁶ = 1.014 times longer.

in accordance with the results presented in table 4, for a complete convolution with optimized kernels (eq. (29), eq. (30) and eq. (31)) (label in table 4: int2), it is concluded that the execution time, te, when applying: a) the 2p keys kernel compared to the 1p keys kernel is te_2p_keys / te_1p_keys = 2.0704·10⁻⁶ / 1.1903·10⁻⁶ = 1.7393 times longer, b) the 3p keys kernel compared to the 1p keys kernel is te_3p_keys / te_1p_keys = 2.0990·10⁻⁶ / 1.1903·10⁻⁶ = 1.7634 times longer, and c) the 3p keys kernel compared to the 2p keys kernel is te_3p_keys / te_2p_keys = 2.0990·10⁻⁶ / 2.0704·10⁻⁶ = 1.013 times longer.

the convolutional kernel execution time, te, without interpolation (label in table 4: kert) is approximately 5.649·10⁻⁷ s for all keys kernels. the reason is that all kernels, after optimization, have the same numerical complexity: a third-order polynomial with constant coefficients. when a 3p keys interpolation kernel with optimized parameters is applied, the convolution execution time, compared to the non-optimized kernel, is te_3p_keys / te_3p_keys_opt = 2.5225·10⁻⁶ / 2.0990·10⁻⁶ = 1.2017 times shorter.

the results of the described experiment and the detailed comparative analysis of the interpolation error, expressed through mse, indicate that the accuracy of interpolation increased when the 3p keys kernel was applied, in relation to the 1p and 2p keys kernels. the testing algorithm is implemented in the matlab programming language.
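the ratios quoted in this comparative analysis follow directly from the mean mse values in tables 1-3 and the execution times in table 4. the short python sketch below recomputes them as a sanity check; the dictionary layout and the helper name ratio are assumptions made for illustration, and the printed values may differ from the quoted ones in the last digit because of rounding.

```python
# Mean minimum MSE per kernel (last rows of tables 1-3).
mse_mean = {"1p": 45.6911, "2p": 44.7000, "3p": 43.5271}

# Execution times in seconds (table 4).
te_int1 = {"1p": 1.4305e-6, "2p": 2.4876e-6, "3p": 2.5225e-6}  # kernels of eqs. (1)-(3)
te_int2 = {"1p": 1.1903e-6, "2p": 2.0704e-6, "3p": 2.0990e-6}  # kernels of eqs. (29)-(31)


def ratio(table, numerator, denominator):
    """Ratio of two entries of the same table."""
    return table[numerator] / table[denominator]


# How many times smaller the mean MSE becomes.
print("mse 1p/2p:", round(ratio(mse_mean, "1p", "2p"), 4))
print("mse 1p/3p:", round(ratio(mse_mean, "1p", "3p"), 4))
print("mse 2p/3p:", round(ratio(mse_mean, "2p", "3p"), 4))

# How many times longer the execution takes.
for label, table in (("int1", te_int1), ("int2", te_int2)):
    print(label, "2p/1p:", round(ratio(table, "2p", "1p"), 4))
    print(label, "3p/1p:", round(ratio(table, "3p", "1p"), 4))
    print(label, "3p/2p:", round(ratio(table, "3p", "2p"), 4))

# Gain from precomputed coefficients for the 3P kernel.
speedup_3p = te_int1["3p"] / te_int2["3p"]
print("3p int1/int2:", round(speedup_3p, 4))
```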
the interpolation execution times were measured using the matlab functions tic and toc. the experiment only compared the interpolation precision of the 3p keys kernel with the precision of the 1p keys and 2p keys kernels. in addition, the relative ratio of the interpolation execution times was determined. however, for real-time interpolation, the convolution algorithm must be written in a programming language (for example, the programming language c) where optimizations can be made in the compilation process to reduce the program execution time (eq. (31)). in this way, image processing can be provided in real-time mode and, among other things, on personal computers.

5. conclusion

the paper presents an algorithm for optimizing the parameters of the 3p keys interpolation kernel. parameter optimization was performed in the spectral domain by minimizing the ripple of the spectral characteristic. first, the spectral characteristic was expanded into a taylor series, and, after that, the members of the taylor series that have a great effect on increasing the ripple of the spectral characteristic were eliminated. from the conditions for eliminating the dominant members of the taylor series, the optimal values of the parameters of the 3p keys kernel (αopt = -0.6132, βopt = 0.1522, γopt = -0.0195) were determined. verification of the accuracy of the 3p keys kernel when interpolating images was performed experimentally. the interpolation accuracy is expressed through the mse interpolation error. detailed comparative analysis showed that the 3p keys kernel, with experimentally determined optimal parameters, has a higher interpolation accuracy compared to the 1p keys kernel 1.0497 times, and compared to the 2p keys kernel 1.0269 times. based on the presented results, it is concluded that the 3p keys kernel is superior to the 1p keys and 2p keys kernels and that its interpolation error is the smallest of the three (mean mse of 43.5271, compared to 45.6911 and 44.7000).
experimental results show that the 3p keys kernel, with the optimal parameters determined by the optimization algorithm presented in this paper, performed the interpolation of the test images with great precision. the 3p keys kernel with optimal parameters has a small numerical complexity compared to the ideal sinc kernel, and is therefore suitable for implementation in convolutional interpolations for operation in real-time systems.

references

[1] r. g. keys, "cubic convolution interpolation for digital image processing", ieee trans. acoust., speech, signal processing, vol. assp-29, pp. 1153–1160, dec. 1981.
[2] e. meijering, m. unser, "a note on cubic convolution interpolation", ieee transactions on image processing, vol. 12, no. 4, pp. 477–479, april 2003.
[3] n. dodgson, "quadratic interpolation for image resampling", ieee transactions on image processing, vol. 6, no. 9, pp. 1322–1326, sept. 1997.
[4] o. rukundo, b. maharaj, "optimization of image interpolation based on nearest neighbor algorithm", in proceedings of the international conference on computer vision theory and applications (visapp), 2014, vol. 1, pp. 641–647.
[5] s. s. rifman, "digital rectification of erts multispectral imagery", in proceedings of the symp. significant results obtained from the earth resources technology satellite-1, 1973, vol. 1, sec. b, pp. 1131–1142.
[6] t. b. deng, "frequency-domain weighted-least-squares design of signal-dependent quadratic interpolators", iet signal process., vol. 4, no. 1, pp. 102–111, feb. 2010.
[7] n. gajalakshmi, s. karunanithi, "cubic convolution and osculatory interpolation for image analysis", international journal of creative research thoughts (ijcrt), vol. 9, issue 12, pp. 836–841, december 2021.
[8] y. li, f. qi, y. wan, "improvements on bicubic image interpolation", in proceedings of the ieee 4th advanced information technology, electronic and automation control conference (iaeac), 2019, pp. 1316–1320.
[9] s.-h. hong, l. wang, t.-k. truong, "an improved approach to the cubic-spline interpolation", in proceedings of the 25th ieee international conference on image processing (icip), 2018, pp. 1468–1472.
[10] k. s. park, r. a. schowengerdt, "image reconstruction by parametric cubic convolution", computer vision, graphics & image processing, vol. 23, pp. 258–272, 1982.
[11] e. meijering, k. zuiderveld, m. viergever, "image reconstruction by convolution with symmetrical piecewise nth-order polynomial kernels", ieee transactions on image processing, vol. 8, no. 2, pp. 192–201, feb. 1999.
[12] z. milivojević, n. savić, d. brodić, p. rajković, "optimization parameters of two parameter keys kernel in the spectral domain", in proceedings of the xv international scientific-professional symposium infoteh-jahorina, bosnia, 2016, pp. 392–397.
[13] r. hanssen, r. bamler, "evaluation of interpolation kernels for sar interferometry", ieee transactions on geoscience and remote sensing, vol. 37, no. 1, pp. 318–321, jan. 1999.
[14] z. milivojević, d. brodić, "estimation of the fundamental frequency of the real speech signal compressed by mp3 algorithm", archives of acoustics, vol. 38, no. 3, pp. 363–373, 2013.
[15] z. milivojević, n. savić, d. brodić, "three-parametric cubic convolution kernel for estimating the fundamental frequency of the speech signal", computing and informatics, vol. 36, pp. 449–469, 2017.
[16] n. savić, z. milivojević, "optimization of the 3p keys kernel parameters for interpolacion of audio signals", in proceedings of the international scientific conference unitech'20, gabrovo, bulgaria, 2020, pp. 200–205.
[17] https://www2.eecs.berkeley.edu/research/projects/cs/vision/bsds/

facta universitatis series: electronics and energetics vol. 28, no 4, december 2015, pp. 507 - 525 doi: 10.2298/fuee1504507s

horizontal current bipolar transistor (hcbt) – a low-cost, high-performance flexible bicmos technology for rf communication applications

tomislav suligoj¹, marko koričić¹, josip žilak¹, hidenori mochizuki², so-ichi morita², katsumi shinomura², hisaya imai²
¹ university of zagreb, faculty of electrical engineering and computing, department of electronics, micro- and nano-electronics laboratory, croatia
² asahi kasei microdevices co. 5-4960, nobeoka, miyazaki, 882-0031, japan

abstract. in an overview of horizontal current bipolar transistor (hcbt) technology, the state-of-the-art integrated silicon bipolar transistors are described which exhibit ft and fmax of 51 ghz and 61 ghz and an ft·bvceo product of 173 ghz·v, placing them among the highest-performance implanted-base silicon bipolar transistors. hcbt is integrated with cmos in a considerably lower-cost fabrication sequence as compared to standard vertical-current bipolar transistors, with only 2 or 3 additional masks and fewer process steps. due to its specific structure, the charge sharing effect can be employed to increase bvceo without sacrificing ft and fmax. moreover, the electric field can be engineered just by manipulating the lithography masks, achieving high-voltage hcbts with breakdowns up to 36 v integrated in the same process flow with high-speed devices, i.e. at zero additional cost. a double-balanced active mixer circuit is designed and fabricated in hcbt technology. the maximum iip3 of 17.7 dbm at a mixer current of 9.2 ma and a conversion gain of -5 db are achieved.

key words: bicmos technology, bipolar transistors, horizontal current bipolar transistor, radio frequency integrated circuits, mixer, high-voltage bipolar transistors.

1. introduction

in the highly competitive wireless communication markets, the rf circuits and systems are fabricated in the technologies that are very cost-sensitive.
in order to minimize the fabrication costs, the sub-10 ghz applications can be processed by using the high-volume silicon technologies. it has been identified that the optimum solution might be to use a coarser-lithography bicmos technology [1, 2], instead of an advanced-lithography pure cmos technology [3]. moreover, the bipolar part should be integrated with cmos with a minimum addition to process complexity, a requirement that can make the high-performance si/sige bicmos technologies [4] prohibitively expensive. on the other hand, horizontal current bipolar transistor (hcbt) [5, 6] is a very compact structure, outperforming all the existing lateral bipolar transistors (lbts) [7, 8]. hcbt is fabricated in a simple technology without the need for the steps that are standard in the vertical-current bipolar structures, i.e. without n+ buried layer, epitaxial growth, base polysilicon layer, emitter-base spacers, collector plug implantation, deep trench isolation etc., which makes it attractive for the very low-cost, high-performance bicmos technology.

received march 15, 2015
corresponding author: tomislav suligoj, university of zagreb, faculty of electrical engineering and computing, department of electronics, micro- and nano-electronics laboratory, croatia (e-mail: tom@zemris.fer.hr)

hcbt was invented at the faculty of electrical engineering and computing, university of zagreb, croatia [9, 10], and its characteristics have been improved over 3 generations of transistors. at first, the technology concept was demonstrated by using coarse contact lithography, yielding transistors with a cutoff frequency (ft) of 4.4 ghz and a collector-emitter breakdown voltage (bvceo) of 15.8 v [11]. in the second generation of hcbt, 0.5 μm stepper lithography was used, reaching ft = 30.4 ghz and bvceo = 4.2 v, which made it the fastest lateral bipolar transistor [12, 13].
finally, hcbt has been integrated with cmos and further optimized to reach ft = 51 ghz and bvceo = 3.4 v [5, 14], which is among the fastest pure-silicon bipolar transistors reported [15]. in this paper, an overview of the most advanced hcbt technology is given, showing all the innovative technology steps and specific device effects that have enabled the record-breaking electrical characteristics. furthermore, the mixer is demonstrated as an rf circuit fabricated in hcbt technology [16], together with high-voltage hcbt structures [17-19], which broaden the application spectrum of the hcbt bicmos technology platform.

2. hcbt fabrication

the hcbt structure with a single polysilicon region is fabricated by using a commercial 180 nm cmos process, which features dual gate oxide thicknesses of 3 nm and 7 nm for 1.8 v and 3.3 v supply voltages, respectively. both nmos and pmos transistors are made with 2 versions of threshold voltages (vth), optimized for high-speed and low stand-by power consumption at 1.8 v supply voltage. the cmos process features 6 aluminum layers and poly-poly and metal-metal capacitor modules. the hcbt fabrication sequence is depicted in fig. 1. the active transistor region is processed in the silicon sidewall defined by the shallow trench isolation (sti), which is 350 nm deep with the sidewall at approximately 80° angle relative to the surface. the active sidewalls of hcbt are aligned to the (100) crystal direction. after the implantation of the cmos n- and p-wells, the 1st hcbt mask is used for the implantation of the n-hill collector region, as shown in fig. 1.a. the n-hill is implanted by phosphorus in 3 steps with the energies of 340 kev, 220 kev and 110 kev. alternatively, the cmos n-well can be used for the collector n-region, in which case the 1st hcbt mask is not needed, resulting in an even lower-cost process. the cmos gate polysilicon layer is left at the emitter side of the n-hill at the distance of 500 nm (fig. 1.b), in order to obtain the desirable final shape of the emitter n+ polysilicon region. after the gate polysilicon etching, re-oxidation and source/drain extension implantation for mos transistors, the extrinsic base is implanted by using the 2nd hcbt mask (or the 1st, if the n-well collector is used), as shown in fig. 1.b. the edge of the mask across the n-hill determines the extrinsic base width (wbext) and the distance between the extrinsic base and the n+ collector region. the extrinsic base is annealed together with the source/drain extensions, which is a cmos baseline process step. the 3rd hcbt mask (or the 2nd in the case of the n-well collector) is used for sti oxide etching after the source/drain annealing. the sti oxide is timed etched, as shown in fig. 1.c, defining the trench for the emitter polysilicon region. the thickness of the remaining oxide at the n-hill sidewall is around 100 nm. next, 10 nm of teos oxide is deposited and the 2nd hcbt mask (or the 1st in the case of the n-well collector) is used again for the intrinsic base implantation, which is performed at a tilt angle of 30° using bf2, as shown in fig. 1.d. the rta process at 800°c is carried out, followed by the deposition of a 450 nm thick in situ doped amorphous silicon (α-si) layer, as shown in fig. 1.e. the n+ α-si layer fills the emitter trench near the active sidewall and under the cmos gate. the α-si is then timed etched by tetramethyl ammonium hydroxide (tmah) and is removed across the wafer except in the emitter trench (fig. 1.f). since the tmah etchant is very selective to the oxide, the n-hill is protected from etching by a thin layer of oxide grown during the predeposition rta step, as shown in fig. 1.e. in this way, the emitter n+ region is formed, while the base and the n-hill are protected by the thin oxide layer.
the cmos gates are protected from tmah etching by the oxide encapsulation grown during the gate reoxidation process. the cmos gate at the emitter side of the n-hill makes it possible to obtain the shape of the emitter n+ α-si layer with the minimum thickness very close to the active sidewall, as can be seen in the tem cross-sections in figs. 2.a and 2.b. if the cmos gate is not used (fig. 3), the emitter α-si is the thinnest in the middle of the trench, which limits its thickness at the active sidewall. fig. 3.a depicts the marginal case of hcbt without cmos gate, where the emitter contact barely sits on polysilicon, but its thickness at the active sidewall is 125 nm. by using the cmos gate, the emitter polysilicon thickness at the active sidewall is 85 nm (fig. 2.b) and it increases toward the contact. additionally, the use of the cmos gate requires the deposition of a thinner polysilicon layer to fill the emitter trench, which improves the controllability of the final polysilicon thickness.

fig. 1 fabrication sequence of hcbt with a single polysilicon region.

the cmos spacers are formed at the n-hill sidewalls above the n+ α-si layer and serve to isolate the emitter and base silicides from each other, as shown in fig. 1.f. next, the source/drain implantation mask of the nmos transistor is also opened above the n-hill and the collector n+ region is obtained (fig. 1.f).
The source/drain junction depth is around 200 nm, reaching deeper than the extrinsic base junction. The emitter drive-in diffusion is performed during source/drain annealing, and the α-Si layer crystallizes, forming the emitter n+ polysilicon region. The silicide-blocking oxide layer, also used in standard CMOS contact processing, has to be left between the extrinsic base and the implanted n+ collector in order to prevent collector-base shorts (Fig. 1.g). The final HCBT structure with a single polysilicon layer is shown in Fig. 1.h.

Fig. 2 TEM cross-section of the processed HCBT structures with a single polysilicon region: (a) the whole transistor structure with CMOS gate, (b) close-up of the active sidewall. The emitter contact is out of the image plane and is hand-sketched.

Fig. 3 TEM cross-section of the processed HCBT structures without CMOS gate: (a) excessive n+ amorphous silicon etching and removed n+ polysilicon under the emitter contact, (b) exact n+ amorphous silicon etching, but too thick n+ polysilicon (154 nm) at the n-hill sidewall.

3. HCBT electrical characteristics

The electrical characteristics of the HCBT with the optimized collector fabricated by a separate implantation are compared with the lower-cost HCBT with the CMOS n-well region used as collector. The collector profile of the optimized HCBT is designed to obtain a uniform electric field in the collector-base depletion region, resulting in an optimum trade-off between fT, fmax and the collector-emitter breakdown voltage (BVCEO). This effect is specific to the HCBT structure and can be used as an additional technological step to optimize transistor characteristics, which will be analyzed further in Section 4. The Gummel plots and output characteristics of the optimized and n-well HCBTs are shown in Fig. 4 and the electrical parameters are summarized in Table 1.
Both transistors are optimized for maximum fT and fmax and have a modest current gain (β) of around 70. The n-well HCBT has a higher extrinsic base doping level, reducing the electron component of the base current. The n-well HCBT has BVCEO = 2.8 V, whereas the optimized HCBT has BVCEO = 3.4 V, which makes it more suitable for use in circuit applications with a supply voltage of 3.3 V.

Fig. 4 Measured DC characteristics of HCBT with a single polysilicon region with emitter area 0.1×1.8 μm² with the optimized collector and n-well collector: (a) Gummel plots, i.e. IB and IC vs. VBE (VCE = 1 V), and (b) output characteristics, i.e. IC vs. VCE.

Table 1 Electrical parameters of HCBT with the optimized collector and n-well collector

| Parameter | Optimized | n-well |
|---|---|---|
| Emitter area (μm²) | 0.1 × 1.8 | 0.1 × 1.8 |
| Peak β | 72 | 76 |
| BVCBO (V) | 9.5 | 8.3 |
| BVCEO (V) | 3.4 | 2.8 |
| VA (V), VBE = 0.85 V | 16 | 15 |
| VA (V), IB = 5 μA | 10 | 11 |
| CBC (fF) @ VCB = 1 V | 1.1 | 1.6 |
| RB (Ω), circle imp. | 480 | 430 |
| fT (GHz) @ VCE = 2 V | 51 | 43 |
| fmax (GHz) @ VCE = 2 V | 61 | 56 |
| fT·BVCEO (GHzV) | 173 | 120 |

The high-frequency characteristics of the optimized and n-well HCBTs are shown in Fig. 5. The optimized HCBT has fT and fmax of 51 GHz and 61 GHz, respectively, and the fT·BVCEO product equals 173 GHzV, which is among the highest reported for implanted-base Si BJTs and very close to the theoretical Johnson's limit [20]. The n-well HCBT has fT and fmax of 43 GHz and 56 GHz, respectively. The fT and fmax of the n-well HCBT fall off at higher currents due to the increased collector concentration.
However, the peak values are lower for the n-well HCBT due to the increased neutral base width and due to the effect of charge sharing between the extrinsic and intrinsic base regions, which will be explained in more detail in Section 4. The peak fT and fmax of the n-well HCBT are still high enough for wireless applications and it can be used as a low-cost technology. Both HCBTs have a small collector-base capacitance (CBC) per emitter length of less than 0.8 fF/μm, which makes them attractive for low-power circuit applications. The Early voltages (VA) of the optimized HCBT are equal to 16 V and 10 V for constant VBE and for constant IB, respectively, and 15 V and 11 V for the n-well HCBT. Since both transistors are optimized for maximum speed, VA for constant IB is relatively low, but it can be improved by reducing the collector doping level, at the expense of fT.

Fig. 5 Cutoff frequency (fT) and maximum frequency of oscillations (fmax) vs. collector current (IC) of HCBT with emitter area 0.1×1.8 μm² with the optimized collector and the n-well collector, at VCE = 2 V.

4. Collector doping profile effect on electrical characteristics

In standard vertical-current bipolar transistors, the intrinsic and extrinsic base regions are formed at the wafer surface next to each other, resulting in a classical planar collector-base pn-junction. On the other hand, in the HCBT structure, the extrinsic base p+ region and the intrinsic base p region form an angle of approximately 100°, because the extrinsic base is implanted at the wafer surface, whereas the intrinsic base is implanted at the n-hill sidewall. Hence, the ionized donor charge on the n-collector side of the collector-base pn-junction is shared between the intrinsic and the extrinsic base acceptors, since the collector is surrounded by the extrinsic and intrinsic base regions. Therefore, the depletion region has to extend on the collector side and shrink on the base side to reach charge balance [21], as shown in Fig. 6, reducing the electric field as a result. As a consequence, the intrinsic base is locally wider at the top of the emitter, reducing IC, β and fT. Hence, the collector doping must be increased just under the extrinsic base to suppress the charge sharing effect, i.e. to reduce the neutral base widening and the extension of the depletion region.

Fig. 6 Simulations of HCBT cross-section showing the potential distribution in the collector-base depletion region.

In order to examine the effect of collector design and to optimize the HCBT characteristics, two structures with different collector doping profiles, as shown in Fig. 7, are compared [22]. The HCBT with collector 1 has a steeper doping profile than the HCBT with collector 2, i.e. a higher doping concentration at the top of the intrinsic base, just under the extrinsic base, where the charge sharing effect is most pronounced. The distribution of impact ionization rates is simulated and shown in Fig. 8. Non-local impact ionization based on the lucky electron model with hard threshold energy is used. The peak impact ionization rates are 1.4·10²⁴ cm⁻³s⁻¹ and 7.9·10²⁴ cm⁻³s⁻¹ for collector 1 (steep n-hill) and collector 2 (uniform n-hill), respectively. The HCBT with collector 2 (uniform n-hill) has a higher impact ionization rate, which occurs at the bottom of the base, because the current density is highest in this region due to the narrowest neutral base there. Moreover, the electric field is reduced at the top of the base due to the charge sharing effect, reducing the impact ionization rate there. Additionally, the rounded shape of the collector-base depletion region at the bottom of the base causes the reverse charge sharing effect, increasing the local electric field there. The HCBT with collector 1 (steep n-hill) has a smaller impact ionization rate, since the doping profile reduces the charge sharing effect and also decreases the electric field at the bottom of the base. Therefore, the impact ionization rate does not have a peak as sharp as in the uniform collector, but is more uniformly distributed along the intrinsic transistor.

Fig. 7 Measured SIMS (lines) and simulated (symbols) doping profiles of the collector region along the cross-section AA', after all of the CMOS annealing steps.

The output characteristics depicted in Fig. 9.a show a lower BVCEO for the HCBT with collector 2 (uniform n-hill), corresponding to the higher peak impact ionization shown in Fig. 8, and a higher BVCEO for the HCBT with the steep collector profile due to the more uniform electric field and current flow distributions in the collector-base depletion region and the reduced impact ionization rate. As shown in Fig. 9.b, fT and fmax are basically equal for the two collector doping profiles. Therefore, due to the higher BVCEO and equal fT, the HCBT with collector 1 (steep n-hill) has a higher fT·BVCEO product and represents an optimum HCBT design. The measured characteristics of the HCBT with two different collectors are summarized in Table 2.
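As a quick cross-check of this comparison, the fT·BVCEO products can be recomputed from the measured fT and BVCEO values reported in Table 2. The short Python sketch below is only illustrative; the function and variable names are hypothetical and not part of the original work.

```python
# Illustrative check of the fT*BVCEO figure of merit.
# Input values are the measured fT and BVCEO reported in Table 2;
# the function and dictionary names are hypothetical.

def ft_bvceo_product(ft_ghz: float, bvceo_v: float) -> float:
    """Return the fT*BVCEO product in GHz*V."""
    return ft_ghz * bvceo_v

collectors = {
    "collector 1 (steep n-hill)": (34.0, 3.6),
    "collector 2 (uniform n-hill)": (35.0, 3.1),
}

for name, (ft_ghz, bvceo_v) in collectors.items():
    print(f"{name}: {ft_bvceo_product(ft_ghz, bvceo_v):.1f} GHzV")
```

The products come out near 122 GHzV and 109 GHzV, in agreement with the tabulated values within the rounding of the inputs.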
Both transistors are designed to have a higher β compared to the transistors described in Section 3 [5], by reducing the doping levels in the intrinsic base and collector, consequently resulting in a lower fT. The HCBT with collector 1 (steep n-hill) has a higher collector resistance (RC) due to the lower average collector doping level, but it still has a rather small effect on fT and fmax as compared to the neutral base and collector-base depletion region time constants.

Fig. 8 Cross-sections of the simulated impact ionization rate distribution of HCBT structures with: (a) collector 1 (steep n-hill), (b) collector 2 (uniform n-hill), at VBE = 0.7 V and VCE = 2 V.

Fig. 9 Measured (a) output and (b) high-frequency characteristics of HCBT with a single polysilicon region with emitter area 0.1×1.8 μm² with collector 1 (steep n-hill) and collector 2 (uniform n-hill).

Table 2 Measured electrical parameters of HCBT with collector 1 (steep n-hill) and collector 2 (uniform n-hill).
| Parameter | Collector 1 | Collector 2 |
|---|---|---|
| Emitter area (μm²) | 0.1 × 1.8 | 0.1 × 1.8 |
| Peak β | 118 | 126 |
| fT (GHz) | 34 | 35 |
| fmax (GHz) | 57 | 56 |
| BVCEO (V) | 3.6 | 3.1 |
| fT·BVCEO (GHzV) | 122 | 109 |
| CBC (fF), VCB = 1 V | 1.8 | 1.8 |
| RC (Ω), sat. | 590 | 320 |

5. HCBT circuit design

Besides the characterization of transistor-level electrical characteristics, the HCBTs' performance is examined by using them in circuits. For this purpose, a down-converting mixer is designed and measured as the first RF circuit fabricated in HCBT technology [13]. Mixers are RF building blocks widely used in heterodyne transceivers [23]. Since most communication protocols involve an increasing number of users, the frequency spectrum is shared by multiple channels. In order to minimize the intermodulation (IM) distortion, linearity is a critical parameter of wireless transceivers. Moreover, the linearity of radio receivers (also including bandpass filters and the low-noise amplifier) is typically limited by the IM distortion of the first down-converting mixer [24]. Hence, mixer linearity must be as high as possible at a given power consumption, since many applications include portable battery-supplied devices. A double-balanced active mixer based on a Gilbert cell, shown in Fig. 10.a, is designed in three different HCBT technologies by using different collector doping profiles: HCBT 1 (steep n-hill), HCBT 2 (uniform n-hill) and HCBT 3 (CMOS n-well). The Gilbert cell mixer consists of a differential input amplifier (Q1, Q2) cascoded by a commutating circuit (quad) made of 4 transistors (Q3-Q6). Since the IM distortion in such a mixer is mainly caused by the input differential pair, degeneration resistances (RE) are used to improve the linearity. The local oscillator (LO) buffer is used to convert the single-ended input to the differential signal for the Gilbert cell and to provide a voltage swing high enough to switch the quad transistors on and off.
All subcircuits (Gilbert cell, LO buffer, current source) are made with the same HCBTs in 3 different technology versions with different collector doping profiles. The power supply voltage is 5 V. All passive components are kept constant in all versions of the circuits, such that the difference in the circuit performance can be attributed to the difference between the transistors used.

Fig. 10 Double-balanced active mixer based on a Gilbert cell designed and fabricated in HCBT technology: (a) mixer schematic, (b) chip layout and test setup.

Mixers are measured on-wafer by using multi-contact probes with the setup shown in Fig. 10.b. The RF and LO ports are driven by a single-ended RF signal generator without any matching networks. The input impedances are designed to be 50 Ω, but the exact values are measured separately by using a vector network analyzer (VNA) and the input losses due to the impedance mismatch are taken into account. However, they are below 1 dB due to the small reflection coefficient at both inputs. The output power is measured by a spectrum analyzer connected asymmetrically to one output (collectors of Q3 and Q5), whereas the other output port is terminated by 50 Ω. The output impedance is also measured by VNA, and the impedance mismatch loss together with the loss due to the single-ended output is added to the measured output power. The 3rd-order input intercept point (IIP3) and conversion gain of mixers with 3 different HCBTs are measured at 1 GHz RF frequency and -10 dBm input power. The LO buffer is driven by an RF generator with 0 dBm output power. The output frequency is 10 MHz and the two-tone spacing used in the IIP3 measurement is 10 kHz. The measured IIP3 and conversion gain dependence on the mixer current (Imix) (without the LO buffer current) are shown in Fig. 11.
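The mismatch correction mentioned above can be illustrated with the standard relation between the reflection-coefficient magnitude |Γ| and the mismatch loss, -10·log10(1 - |Γ|²). The sketch below is a generic Python illustration, not the authors' measurement script; the example |Γ| values are assumptions, since the measured reflection coefficients are not given in the text.

```python
import math

def mismatch_loss_db(gamma_mag: float) -> float:
    """Mismatch loss in dB for a reflection-coefficient magnitude 0 <= |Γ| < 1."""
    if not 0.0 <= gamma_mag < 1.0:
        raise ValueError("|Γ| must be in [0, 1)")
    return -10.0 * math.log10(1.0 - gamma_mag ** 2)

def dbm_to_mw(p_dbm: float) -> float:
    """Convert a power level from dBm to milliwatts."""
    return 10.0 ** (p_dbm / 10.0)

# Hypothetical |Γ| values, chosen only to show the trend.
for gamma in (0.1, 0.2, 0.3, 0.45):
    print(f"|Γ| = {gamma:.2f}: mismatch loss = {mismatch_loss_db(gamma):.2f} dB")

# Drive levels quoted in the measurement setup (-10 dBm RF, 0 dBm LO).
print(f"RF input: {dbm_to_mw(-10.0):.2f} mW, LO drive: {dbm_to_mw(0.0):.2f} mW")
```

Under this relation, a mismatch loss below 1 dB corresponds to |Γ| smaller than roughly 0.45.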
The maximum IIP3 of 17.7 dBm is achieved by the mixer with HCBT 2 at Imix = 9.2 mA, which is a small current for a given IIP3 as compared to the available mixers, e.g. [25]. The peak IIP3 of HCBT 1 and HCBT 3 are 10.9 dBm and 14.7 dBm at currents of 6.7 mA and 9.5 mA, respectively. If the power consumption of the mixer is critical, an IIP3 above 10 dBm can be obtained at a current consumption between 5 mA and 6 mA by all three mixer designs, resulting in a power consumption between 25 and 30 mW at the 5 V supply. The conversion gains are rather constant with current (above 4 mA), with maximum values of -4.2 dB, -5 dB and -5.5 dB for HCBT 1, HCBT 2 and HCBT 3, respectively. The maximum conversion gains are obtained at approximately the same current as the maximum IIP3. Such conversion gains are expected; they result from the use of emitter degeneration and are traded for high IIP3.

All three mixers have approximately the same linearity at low currents (below 6 mA), whereas the difference appears at higher currents, where the quad transistors (Q3-Q6) operate near the high-current drop-off region, i.e. at or above the currents of peak fT. The linearity of transistors in the high-current regime is affected by the slope of the fT vs. IC characteristics at high currents, influenced by the charge sharing effect discussed in Section 4. It can be explained by the rate of base charge (QB) increase with IC, which is the smallest for HCBT 2 with the uniform n-hill collector profile. A more detailed explanation is provided in [16]. High-current linearity can be improved for all collector doping profiles by increasing the size of the quad transistors, resulting in operation at a lower current density and avoiding the high-current drop-off region. However, transistor operation below the current densities around peak fT implies an increase of layout area.

Fig. 11 Measured 3rd-order input intercept point (IIP3) and conversion gain vs. mixer current (Imix) of mixers in three different technologies: HCBT 1 (steep n-hill), HCBT 2 (uniform n-hill) and HCBT 3 (CMOS n-well). Measurement setup: PRF = -10 dBm, PLO = 0 dBm, fRF = 1 GHz, fIF = 10 MHz, two-tone Δf = 10 kHz, VCC = 5 V.

6. High-voltage HCBT devices

6.1. Double-emitter (DE) HCBT

The HCBT structures described so far are optimized for high-frequency characteristics targeting RF communication circuit applications. In order to broaden the application spectrum of HCBT BiCMOS technology, e.g. to automotive, instrumentation and biomedical electronics, transistors with higher breakdown voltages are highly desirable. In standard vertical-current bipolar transistor structures based on the super-self-aligned transistor (SST), devices with different breakdown voltages are typically obtained by varying the parameters of the selectively implanted collector (SIC) [26], which usually requires additional lithography masks and increases the fabrication costs. A high-breakdown-voltage HCBT can be fabricated by placing two active transistor regions at the silicon sidewalls opposite to each other, such that their collector-base depletion regions merge, resulting in a reduced electric field. Such a structure has two emitters opposite to each other and two collector contacts in the plane perpendicular to the direction that connects the emitters, as shown in Fig. 12. The structure is named double-emitter (DE) HCBT [17, 18]. Since the two emitters of the DE HCBT are placed at opposite sidewalls of the silicon n-hill, the extrinsic bases overlap on the top and the intrinsic collector between the two intrinsic bases is shared, as can be seen in Fig. 13.
The extrinsic collector is fabricated laterally in front of and behind the intrinsic transistor. In this way, the intrinsic collector is surrounded by the p+ extrinsic base from the top, by two intrinsic bases from the left and right, and by the p-substrate from the bottom (Fig. 12). Since the collector charge is shared between the surrounding acceptors, the collector is fully depleted by the reverse collector-base voltage if the transistor operates in the forward active region. Once the collector is fully depleted by the collector-base reverse voltage (VCB), the potential is pinned in the middle of the n-hill between the two intrinsic bases, as shown in Fig. 14. A further increase in VCB causes a potential drop laterally across the drift regions, which are formed toward the extrinsic collector, whereas the potential drop across the intrinsic base-collector junction remains roughly constant. Additional shielding of the intrinsic bases from the collector voltage is obtained by the extension of the extrinsic base on the top of the drift region, which is wider than the intrinsic base (Fig. 12), as well as by the substrate, which is connected to the ground potential in order to isolate the device. Current leakage into the substrate might occur at very high current densities, but this is beyond the usable bias conditions.

Fig. 12 3D schematic of the double-emitter (DE) HCBT structure formed by merging two HCBTs in opposite directions, resulting in the reduced electric field.

Fig. 13 TEM cross-section along the emitters of the fabricated double-emitter (DE) HCBT structure. Extrinsic collectors are in the front and the back.
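To get a feeling for why a narrow n-hill is fully depleted at a modest reverse voltage, the following Python sketch uses a deliberately crude one-dimensional model that is not taken from the paper: a uniformly doped n-region of width W depleted symmetrically from two one-sided abrupt junctions, so that full depletion is reached when each depletion width equals W/2. The doping level (1e17 cm⁻³) and the built-in potential (0.9 V) are assumed illustrative values, and the model ignores the depletion from the extrinsic base on the top and from the substrate at the bottom, so the numbers indicate a trend only.

```python
Q = 1.602176634e-19               # elementary charge (C)
EPS_SI = 11.7 * 8.8541878128e-14  # permittivity of silicon (F/cm)

def full_depletion_vcb(w_cm: float, nd_cm3: float, vbi: float = 0.9) -> float:
    """Reverse voltage (V) at which two opposite depletion regions meet at W/2.

    Assumes one-sided abrupt junctions and uniform donor doping (illustrative only).
    """
    return Q * nd_cm3 * w_cm ** 2 / (8.0 * EPS_SI) - vbi

ND_ASSUMED = 1e17  # cm^-3, assumed; the actual n-hill profile is non-uniform
for w_um in (0.6, 0.5, 0.36):
    v = full_depletion_vcb(w_um * 1e-4, ND_ASSUMED)
    print(f"W = {w_um} um: full depletion at roughly {v:.1f} V")
```

The quadratic dependence on W is the point of the sketch: in this simplified model, narrowing the n-hill from 0.6 µm to 0.36 µm lowers the voltage needed for full depletion by a factor of (0.6/0.36)² ≈ 2.8 before the built-in potential is subtracted.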
The double-emitter HCBT is fabricated in the same fabrication flow as the standard single-polysilicon-region HCBT with the steep collector profile [22], described in Sections 4 and 5. The only additional process step that may be needed is the ion implantation of the intrinsic base at the opposite side of the n-hill. No additional lithography masks are needed to integrate the DE HCBT with standard HCBT BiCMOS. The measured DC characteristics of the DE HCBT are presented in Fig. 15. The Gummel characteristics (Fig. 15.a) show satisfactory quality of the fabricated junctions. The output characteristics (Fig. 15.b) for different n-hill widths (Whill) show that the DE HCBT has a higher BVCEO and Early voltage (VA) compared to the standard single-poly HCBT. The measured electrical parameters of two DE HCBTs and the single-poly HCBT are summarized in Table 3. In order to take full advantage of the BVCEO improvement and to maximize VA for a given collector profile, transistors should be fabricated with a narrow n-hill, i.e. Whill should be 0.5 µm or smaller.

A hard breakdown cannot be observed in Fig. 15.b for VCE lower than 10 V for any of the DE HCBT structures. In the case of the transistor with Whill = 0.6 µm, the change in the slope indicates the start of the avalanche process, which is then limited by the base shielding effect at higher VCE. The BVCEO is measured in the forced-VBE configuration, where VBE is set to 0.7 V and VCB is swept. BVCEO is determined as the VCB at which the base current (IB) turns from positive to negative, increased by VBE = 0.7 V. For the transistor with Whill = 0.6 µm, substantial avalanche current is generated for VCB > 2.5 V, reducing IB and eventually reversing its direction. However, the slope of the IB characteristics becomes smaller for VCB > 4 V, indicating that the collector is fully depleted and that the electric field across the intrinsic base-collector junction as well as the avalanche multiplication are limited. Even though IB turns negative, hard breakdown does not occur and the output characteristics in Fig. 15.b become flat. In the case of the transistor with Whill = 0.5 µm, the characteristics in Fig. 15.b show similar behavior. However, since Whill is decreased, a smaller VCB is needed to fully deplete the collector and the base shielding effect is more efficient. Therefore, the electric field across the intrinsic base-collector junction is limited to a lower value compared to the transistor with Whill = 0.6 µm. In the case of the transistor with Whill = 0.36 µm, base shielding is the most efficient. The output characteristics in Fig. 15.b are flat, indicating that the potential drop over the intrinsic base-collector junction does not increase substantially with VCB, meaning that base width modulation is suppressed. Indeed, the Early voltage extrapolated from the output characteristics between VCE = 5 V and VCE = 8 V for IB = 0.5 µA equals VA = 301 V. Given that the current gain at VCE = 5 V is β = 95.4, this gives a β·VA product as high as 28700 V.

Fig. 14 (a) Schematic cross-section at the middle of the intrinsic transistor parallel with the wafer surface (top view). (b) Potential and electric field at the symmetry line along the middle of the emitters (AA' line). In the case of a fully depleted collector, the maximum potential and electric field are limited due to the limited amount of collector fixed charges. The rest of the voltage is dropped laterally across the drift region.
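The Early-voltage extraction described above amounts to extending the straight line through two points of the output characteristic back to IC = 0; the intercept lies at VCE = -VA. The Python sketch below illustrates the arithmetic. The two collector-current values are hypothetical, back-calculated here from β = 95.4, IB = 0.5 µA and VA = 301 V rather than taken from measured data, and the function name is an assumption.

```python
def early_voltage(v1: float, i1: float, v2: float, i2: float) -> float:
    """Early voltage from a linear extrapolation through (v1, i1) and (v2, i2)."""
    slope = (i2 - i1) / (v2 - v1)   # output conductance (A/V)
    return i1 / slope - v1          # the line crosses IC = 0 at VCE = -VA

# Hypothetical points on the IB = 0.5 uA curve (not measured data).
ic_5v = 95.4 * 0.5e-6   # IC at VCE = 5 V, from beta = 95.4
ic_8v = 48.1676e-6      # IC at VCE = 8 V, chosen to be consistent with VA = 301 V

va = early_voltage(5.0, ic_5v, 8.0, ic_8v)
beta_va = 95.4 * va
print(f"VA = {va:.0f} V, beta*VA = {beta_va:.0f} V")
```

With β = 95.4 and VA = 301 V the product is 95.4 × 301 ≈ 28700 V, in line with the value quoted in the text.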
Table 3 Measured electrical parameters of single-poly HCBT and double-emitter (DE) HCBTs with different widths of the n-hill

| Parameter | Single-poly | DE, Whill = 0.5 µm | DE, Whill = 0.36 µm |
|---|---|---|---|
| Emitter area (μm²) | 0.1 × 1.8 | 2 × (0.1 × 1.3) | 2 × (0.1 × 1.3) |
| βmax (VCE = 2 V) | 124 | 104 | 94 |
| VA (V) | 9.5 | 75 | 301 |
| BVCBO (V) | 8.3 | 11.2 | 12.9 |
| BVCEO (V) | 3.6 | 11.6 | 12.6 |
| VCB @ BVCEO (V) | 2.9 | 10.9 | 11.9 |
| fT (GHz) | 37.6 | 13.6 | 12.7 |
| fmax (GHz) | 67 | 29.5 | 28 |
| IC @ fTmax (µA) | 220 | 100 | 77 |
| fT·BVCEO (GHzV) | 135 | 158 | 160 |
| β·VA (V) | 1178 (VCE = 2 V) | 7800 (VCE = 5 V) | 28700 (VCE = 5 V) |

It can be seen in Table 3 that the DE HCBT with narrower n-hill has a reduced fT of 13.6 GHz and 12.7 GHz for transistors with Whill of 0.5 µm and 0.36 µm, respectively. The dominant cause of the lower fT is the increase in the base-collector depletion region transit time, because electrons flow through the depleted n-hill region, which is approximately 1 µm long. Moreover, since Whill is smaller than the emitter width (WE), the current is crowded near the middle of the n-hill, increasing the local current density and causing the Kirk effect to occur at lower values of IC. Therefore, fT peaks at lower IC in the DE HCBT. For transistors with smaller Whill, BVCBO is increased, meaning that the electric field is reduced at the peripheral part of the extrinsic base toward the extrinsic collector. Interestingly, the measured BVCEO and BVCBO given in Table 3 are almost equal, but the VCB at which BVCEO occurs (i.e. IB changes sign) is slightly smaller than BVCBO.

6.2. Double-emitter (DE) reduced-surface-field (RESURF) HCBT

In DE HCBTs the breakdown voltage can be increased above 12 V by merging the n-collector regions of two transistors, due to the fact that the n-collectors are at the active region surface in the compact HCBT structure and not at the bottom of the intrinsic device as in conventional vertical-current transistors.
The breakdown voltage can be increased further, up to 36 V, by shielding the electric field in the drift region, resulting in the reduced-surface-field (RESURF) DE HCBT [19]. This is done by using a CMOS p-well implant and by the design of lithography masks (i.e. without any additional costs). Having high-speed as well as 12 V and 36 V high-voltage bipolar transistors along with CMOS further increases the flexibility and application spectrum of HCBT BiCMOS technology, making it attractive both for RF and other analog applications. Since the high-voltage bipolar transistors are integrated at zero cost, the technology is suitable for integration of low-cost smarter systems including higher-power and human-interface sensor circuits, which makes it a contender for future Internet of Everything (IoE) applications.

Cross-sections at the symmetry lines of the DE HCBT with RESURF region are shown in Figs. 16.a and 16.b. In the double-emitter configuration, a CEBEC layout is used with extrinsic collectors folded to the front and back of the intrinsic transistor. Compared to the standard DE HCBT, this one has an extended extrinsic collector with the CMOS p-well implanted underneath to obtain a local substrate with increased concentration. The basic idea is that the n-hill above the p-well region is fully depleted if the collector voltage is increased and that the second RESURF drift region is formed.

Fig. 16 Schematic cross-sections of the fabricated DE RESURF HCBT structure having CEBEC layout. (a) EBE cross-section along the emitters. (b) CBC cross-section along the middle of the n-hill. Due to the symmetry, only one half is shown. Compared to the standard DE HCBT structure, the CMOS p-well is implanted in the n-hill between the collector contact region and the intrinsic transistor. In the forward active region, the portion of the n-hill above the p-well is fully depleted and the 2nd drift region (DR 2) is formed.

The change of the electric field with increasing collector-emitter voltage (VCE) is shown in Fig. 17. For small VCE, the peak electric field at the intrinsic base-collector junction increases with VCE, as shown in Fig. 17.a, and the depletion regions spread into the intrinsic collector. After the intrinsic collector is fully depleted, there is no available donor charge in this cross-section (Fig. 16.a) and the maximum electric field at the junctions remains unchanged. The voltage is dropped in the perpendicular cross-section across the 1st drift region (DR 1 in Fig. 16.b) toward the extrinsic collector. The electric field along the current path in the middle of the n-hill is shown in Fig. 17.b. As VCE is increased, the 2nd peak of the electric field appears at the end of DR 1, whereas the 1st peak at the intrinsic base-collector junction remains the same, because the collector voltage is blocked by the extrinsic base extensions above DR 1. A further increase in VCE increases the 2nd peak up to the voltage where the extrinsic collector above the p-well region becomes fully depleted and the 2nd drift region (DR 2) is formed. An additional increase in VCE causes a voltage drop across DR 2. The collector voltage is partially blocked by the p-well region, reducing its impact on the value of the 2nd electric field peak. Since there is enough available charge in the extrinsic collector, the 3rd peak of the electric field appears at the end of DR 2. The ability of the p-well region to block the collector voltage determines whether the critical field is first reached in the 2nd or the 3rd peak of the electric field. This can be controlled by the length of DR 2.

DE RESURF HCBTs are fabricated on the same dies as high-speed HCBTs and DE HCBTs with BVCEO = 12 V. The steep n-collector doping profile described in Section 4 is used. The measured common-emitter output characteristics of fabricated transistors with different Lpw are shown in Fig. 18.a.
Breakdown occurs around 26 V for the transistor with Lpw = 0.5 µm and around 36 V for the transistor with Lpw = 3 µm. A summary of the electrical characteristics is given in Table 4.

Fig. 17 Electric field with increasing VCE: (a) along the middle of the emitters (EBE cross-section of Fig. 16.a), (b) along the current path in the CBC cross-section from Fig. 16.b.

Table 4 Measured electrical parameters of double-emitter (DE) HCBT with n-hill width of 0.36 μm and different length Lpw.

| Parameter | Lpw = 0.5 µm | Lpw = 3 µm |
|---|---|---|
| Emitter area (μm²) | 2 × (0.1 × 1) | 2 × (0.1 × 1) |
| βmax | 123 | 129 |
| VA (V), (IB = 15 nA, VCE = 6~7 V) | 1928 | 2233 |
| BVCEO (V), output char. | 26 | 36 (= BVCS) |
| BVCS (V) | 33 | 36 |
| fT (GHz) | 5.3 | 2.7 |
| fmax (GHz) | 10.6 | 4.6 |
| fT·BVCEO (GHzV) | 137 | 97 |
| β·VA (kV) | 237 | 288 |

In the case of the transistor with Lpw = 0.5 µm, the classical common-emitter breakdown mechanism occurs, meaning that the critical field appears along the current path and that a positive feedback loop due to the transistor current gain is closed. For the transistor with Lpw = 3 µm, breakdown occurs between the local p-well substrate and the n-hill. This means that neither the 2nd nor the 3rd peak from Fig. 17.b generates holes that can close the positive feedback loop. The 2nd peak is limited below the critical value for avalanche, whereas the holes generated at the 3rd peak are collected by the substrate instead of the extrinsic base. For the transistor with Lpw = 3 µm, this collection by the substrate is more effective than for the transistor with Lpw = 0.5 µm, because holes have to travel a longer distance to reach the extrinsic base in the presence of a strong vertical electric field component in DR 2. This is confirmed by the measurements of the collector-substrate breakdown voltage (BVCS), which equals the BVCEO measured in the output characteristics for the structure with Lpw = 3 µm.
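Two analog figures of merit follow directly from the Table 4 entries: the β·VA product and the intrinsic voltage gain VA/VT expressed in decibels. The Python sketch below recomputes them; the thermal voltage is evaluated at an assumed room temperature of 300 K, and the helper names are hypothetical.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant (J/K)
Q = 1.602176634e-19   # elementary charge (C)

def thermal_voltage(temp_k: float = 300.0) -> float:
    """Thermal voltage kT/q in volts (300 K is an assumed room temperature)."""
    return K_B * temp_k / Q

def beta_va_kv(beta: float, va_v: float) -> float:
    """beta*VA product in kilovolts."""
    return beta * va_v / 1e3

def intrinsic_gain_db(va_v: float, temp_k: float = 300.0) -> float:
    """Intrinsic voltage gain VA/VT in dB (20*log10 of a voltage ratio)."""
    return 20.0 * math.log10(va_v / thermal_voltage(temp_k))

# (beta_max, VA in volts) taken from Table 4.
devices = {"Lpw = 0.5 um": (123, 1928.0), "Lpw = 3 um": (129, 2233.0)}
for name, (beta, va) in devices.items():
    print(f"{name}: beta*VA = {beta_va_kv(beta, va):.0f} kV, "
          f"VA/VT = {intrinsic_gain_db(va):.1f} dB")
```

The products evaluate to about 237 kV and 288 kV, as listed in Table 4, and the intrinsic gain comes out at roughly 97 dB to 99 dB under the 300 K assumption.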
avalanche current generated at breakdown flows between the collector and the substrate, whereas the base and the emitter currents are not changed, which is not the case in standard bipolar transistors. as a result we have bvceo=bvcbo=bvcs. due to the e-field shielding of the intrinsic base-collector junction, the base-width modulation is suppressed, resulting in a very high early voltage of around 1.93 kv and 2.23 kv for the transistors with lpw=0.5 µm and lpw=3 µm, respectively. this corresponds to almost 100 db of intrinsic gain (va/vt) at room temperature for both devices. since the current gain β is high, considering that the transistor has an implanted base, the β·va product is remarkable, indicating good analog performance. the dependence of the cut-off frequency (ft) and the maximum oscillation frequency (fmax) on the collector current is shown in fig. 18.b. in this structure, the high-frequency performance is traded for higher bv, and ft and fmax are reduced accordingly. nevertheless, the ft·bvceo products are very close to johnson’s limit [20].

7. conclusion

the hcbt is based on a new concept of bipolar transistor technology resulting in a low-cost fabrication, but with many innovative steps. the optimized-collector hcbt is fabricated with 3 additional masks to the cmos process, resulting in an optimum trade-off between ft, fmax and bvceo. the hcbt with the n-well collector requires 2 additional masks to the cmos process and has lower ft, fmax and bvceo, but still high enough for wireless communications circuits in the frequency range between 0.9 and 5 ghz. the optimized-collector hcbt targets applications with supply voltages of 3.3 v, whereas the hcbt with the n-well collector has bvceo below 3 v, which has to be taken into account in circuit design. since ft and fmax peak at low currents, i.e.
at 200-300 μa in hcbt with optimized collector, hcbt is very attractive for low-power battery-supplied wireless communications circuit blocks. furthermore, such small currents allow for an increase of emitter length in order to reduce rb for low-noise applications, while maintaining a reasonably low ic. the demonstrated double-balanced active mixers based on a gilbert cell show that the high-current linearity of hcbts is affected by the n-collector doping profile and is optimized such that transistors can operate in the high-current regime, saving layout area. the n-collector doping profile also impacts the degree of charge sharing between the extrinsic and intrinsic bases, which determines the value and distribution of the electric field defining the transistor breakdown voltage. therefore, the breakdown voltage can be increased without affecting the high-frequency characteristics. by using the charge sharing effect and the hcbt geometry, where all intrinsic transistor regions (emitter, base and collector) are along the horizontal line of current flow, it is possible to merge 2 devices and fully deplete the n-collector. in this way, the electric field can be shielded and the breakdown voltage can be engineered. by adding the p-well region underneath the n-collector, the electric field shielding effect is extended further and the breakdown voltage can be increased to 36 v. the breakdown voltage can be adjusted just by changing the lithography masks. hence, hcbt makes it possible to have a flexible bicmos technology platform with high-speed devices for rf circuits and high-voltage devices for very diverse system-on-a-chip applications.

acknowledgement: this work has been supported in part by asahi kasei microdevices co., by the croatian science foundation under the project no. 9006, and by the ministry of science, education and sports of the republic of croatia, under contracts no. 036-0361566-1567 and no. 036-0982904-1642.

references
[1] w. m. huang, h. s. bennet, j. costa, p.
cottrell, a.a. immorlica, jr., j-e mueller, m. racanelli, h. shichijo, c. e. weitzel, and b. zhao, "rf, analog and mixed signal technologies for communication ics – an itrs perspective", in proc. bipolar/bicmos circuits technol. meeting, october 2006, pp. 1–8. [2] h. s. bennet, r. brederlow, j. c. costa, p. e. cottrell, w. m. huang, a. a. immorlica, jr., j-e mueller, m. racanelli, h. shichijo, c. e. weitzel, and b. zhao, "device and technology evolution for si-based rf integrated circuits", ieee trans. electron devices, vol. 52, no. 7, pp. 1235-1258, july 2005. [3] s. lee, b. jagannathan, s. narashima, a. chou, n. zamdmer, j. johnson, r. williams, l. wagner, j. kim, j.-o. plouchart, j. pekarik, s. springer, and g. freeman, "record rf performance of 45-nm soi cmos technology", in iedm tech. dig., 2007, pp. 255-258. [4] a. fox, b. heinemann, r. barth, d. bolze, j. drews, u. haak, d. knoll, b. kuck, r. kurps, s. marschmeyer, h.h. richter, h. rücker, p. schley, d. schmidt, b. tillack, g. weidner, c. wipf, d. wolansky, and y. yamamoto, "sige hbt module with 2.5 ps gate delay", in iedm tech. dig., 2008. [5] t. suligoj, m. koriĉić, h. mochizuki, s. morita, k. shinomura, and h. imai, "horizontal current bipolar transistor (hcbt) with a single polysilicon region for improved high-frequency performance of bicmos ics", ieee electron device lett., vol. 31, no. 6, pp. 534-536, june 2010. [6] t. suligoj, m. koriĉić, h. mochizuki, s. morita, k. shinomura, and h. imai, "examination of horizontal current bipolar transistor (hcbt) with double and single polysilicon region", in proc. bipolar/bicmos circuits technol. meeting, september 2012, pp. 5-8. [7] h. nii, t. yamada, k. inoh, t. shino, s. kawanaka, m. yoshimi, and y. katsumata, "a novel lateral bipolar transistor with 67 ghz fmax on thin-film soi for rf analog applications", ieee trans. electron devices, vol. 47, no. 7, pp. 1536-1541, july 2000. [8] i.-s. m. sun, w. t. ng, k.
kanekiyo, t. kobayashi, h. mochizuki, m. toita, h. imai, a. ishikawa, s. tamura, k. takasuka, "lateral high-speed bipolar transistors on soi for rf soc applications", ieee trans. electron devices, vol. 52, no. 7, pp. 1376-1383, july 2005. [9] p. biljanović, t. suligoj, "horizontal current bipolar transistor (hcbt): a new concept of silicon bipolar transistor technology", ieee trans. electron devices, vol. 48, pp. 2551-2554, november 2001. [10] t. suligoj, p. biljanović, k.l. wang, "horizontal current bipolar transistor and fabrication method", us patent no.7,038,249, may 2006. [11] t. suligoj, m. koriĉić, p. biljanović, k.l. wang, "fabrication of horizontal current bipolar transistor (hcbt)", ieee trans. electron devices, vol. 50, no. 7, pp. 1645-1651, july 2003. [12] t. suligoj, p. biljanović, j.k.o. sin, and k.l. wang, "a new hcbt with a partially etched collector", ieee electron device lett., vol. 26, no. 3, pp. 200-202, march 2005. [13] t. suligoj, j.k.o. sin, and k.l. wang, "horizontal current bipolar transistor (hcbt) process variations for future rf bicmos applications", ieee trans. electron devices, vol. 52, no. 7, pp. 1392-1398, july 2005. [14] t. suligoj, m. koriĉić, h. mochizuki, s. morita, "hybrid-integrated lateral bipolar transistor and cmos transistor and method for manufacturing the same", u.s. patent 8,569,866, october 2013. [15] j. böck, h. knapp, k. aufinger, t. f. meister, m. wurzer, s. boguth, and l. treitinger, "highperformance implanted base silicon bipolar technology for rf applications", ieee trans. electron devices, vol. 48, no. 11, pp. 2514-2519, november 2001. [16] t. suligoj, m. koriĉić, j. žilak, h. mochizuki, s. morita, k. shinomura, and h. imai, "optimization of horizontal current bipolar transistor (hcbt) technology parameters for linearity in rf mixer", in proc. bipolar/bicmos circuits technol. meeting, october 2013, pp. 13-16. [17] m. koriĉić, t. suligoj, h. mochizuki, s. morita, k. shinomura, and h. 
imai, "examination of novel high-voltage double-emitter horizontal current bipolar transistor (hcbt)", in proc. bipolar/bicmos circuits technol. meeting, october 2011, pp. 5–8. [18] m. koriĉić, t. suligoj, h. mochizuki, s. morita, k. shinomura, and h. imai, "double-emitter hcbt structure—a high-voltage bipolar transistor for bicmos integration", ieee trans. electron devices, vol. 59 , no. 12, pp. 3647 – 3650, december 2012. [19] m. koriĉić, j. žilak, t. suligoj, "double-emitter reduced-surface-field horizontal current bipolar transistor with 36 v breakdown integrated in bicmos at zero-cost", ieee electron device lett., vol. 36, no. 2, pp. 90 – 92, february 2015. [20] e. o. johnson, "physical limitations on frequency and power parameters of transistors", rca rev., vol. 26, pp. 163-177, 1965. [21] m. koriĉić, t. suligoj, h. mochizuki, s. morita, k. shinomura, and h. imai, "design considerations for integration of horizontal current bipolar transistor (hcbt) with 0.18 μm bulk cmos technology", solid-state electronics, vol. 54, no. 10, pp. 1166-1172, 2010. [22] t. suligoj, m. koriĉić, h. mochizuki, s. morita, k. shinomura, and h. imai, "collector region design and optimization in horizontal current bipolar transistor (hcbt)", in proc. bipolar/bicmos circuits technol. meeting, october 2010, pp. 212-215. [23] j. rogers and c. plett, radio frequency integrated circuit design. artech house inc., boston, 2003. [24] s.-t. lim, and j. r. long, "a low-voltage broadband feedforward-linearized bjt mixer", ieee j. solid state cir., vol. 41, no. 9, pp. 2177-2187, september 2006. [25] high linearity, low power downconverting mixer, linear technology, lt5526 [online]. available: http://www.linear.com/product/lt5526 [26] j. s. dunn, d. c. ahlgren, d. d. coolbaugh, n. b. feilchenfeld, g. freeman, d. r. greenberg, r. a. groves, f. j. guarín, y. hammad, a. j. joseph, l. d. lanzerotti, s. a. st. onge, b. a. orner, j.-s. rieh, k. j. stein, s. h. voldman, p.-c. wang, m. j. zierak, s. 
subbanna, d. l. harame, d. a. herman, jr., and b. s. meyerson, "foundation of rf cmos and sige bicmos technologies", ibm j. res. develop., vol. 47, no. 2/3, pp. 101–138, march 2003.

facta universitatis series: electronics and energetics vol. 28, no. 2, june 2015, pp. 213-221 doi: 10.2298/fuee1502213f

new generation of 3.3 kv igbts with monolitically integrated voltage and current sensors

david flores 1 , salvador hidalgo 1 , jesús urresti 2
1 instituto de microelectrónica de barcelona, imb-cnm (csic), campus uab, 08193, cerdanyola del vallès, barcelona, spain
2 school of electrical and electronic engineering, newcastle university, merz court, newcastle upon tyne, ne1 7ru, uk

abstract. although igbt modules are widely used as power semiconductor switches in many high power applications, there are still reliability problems related to the current unbalance between paralleled igbts that may destroy the whole module and, eventually, the power system. indeed, short-circuit and overvoltage events can also destroy some of the igbts of the power module. in this sense, the instantaneous monitoring of the anode current and voltage values and the use of a more intelligent gate driver able to work with the signals of each particular igbt of the module would enhance its operating lifetime. accordingly, the paper describes the design, optimization, fabrication and basic performance of 3.3 kv – 50 a punch-through igbts for traction and tap changer applications where anode current and voltage sensors are monolithically integrated within the igbt core.

key words: igbt, voltage sensor, current sensor, overvoltage, overcurrent

1. introduction

igbts [1,2] are the most widely used power semiconductor devices in low, medium and high voltage applications. igbts with voltage capability ranging from 600 to 6500 v delivering up to 2500 a are commercially available as discrete devices or as power modules.
low voltage discrete igbts are mainly addressed to automotive applications, where low losses and high reliability are the most critical challenges [3], while medium voltage igbt modules are basically designed for traction applications and wind generation with short-circuit capability [4]. finally, high voltage igbt modules for applications operating at an output power in excess of 100 kva are addressed to high speed trains, high power industrial drives, var compensation and flexible ac transmission [5].

received february 4, 2015 corresponding author: david flores, instituto de microelectrónica de barcelona, imb-cnm (csic), campus uab, 08193, cerdanyola del vallès, barcelona, spain (e-mail: david.flores@imb-cnm.csic.es)

although the igbt technology is mature, the reliable operation of a module is still under optimization since the safe operation of the power module is one of the most critical issues of a power system. short-circuit events and overcurrent transient peaks inherent to inductive load switching may destroy one or several igbt chips packaged into the power module, with the eventual subsequent destruction of the whole power system due to explosion or overcurrent burn-out. therefore, the direct measurement of the instantaneous anode voltage and current levels during an undesired destructive event would improve the lifetime of each discrete power semiconductor device included in the power module. in this sense, the current distribution between the different igbts may become unbalanced as a consequence of non-uniform thermal distribution or a local thermal resistance increase derived from delamination problems [6-8]. hence, the current increase in one of the igbts may drive it out of its safe operating area [9]. the implementation of integrated current sensors in the igbt core, which is not commonly done in the igbts of commercial power modules for traction applications, would significantly increase their safe operation.
average current sensing is the most common technique using current mirrors with shunt resistors and a reference voltage [10]. current sensors are usually implemented in large area discrete igbts, where a certain number of cathode cells are connected to an auxiliary cathode electrode, leading to a low current value proportional to the cathode current value. overvoltage or short-circuit events can also be destructive since the igbt may be driven to avalanche or to high power dissipation, with local temperature increase and thermal destruction. if an anode voltage sensor is implemented, the undesired anode voltage increase can be quickly detected and the gate drive can safely turn-off the entire igbt before destruction. however, the anode electrode is placed at the backside of the die and the anode voltage value is too high to be used in a logic circuit. therefore, the anode voltage sensor has to be placed on top of the die with a voltage level compatible with the gate drive electronics. the first anode voltage sensors were successfully developed for 600 v applications based on the voltage mirror concept [11]. this paper describes the design and fabrication of smart igbts for 3.3 kv applications where current and anode voltage sensors are monolithically integrated within the core region. two target applications are envisaged: traction modules with a large number of paralleled igbts delivering high anode currents and on-load tap changers for smart grid distribution transformers, where the transformer ratio has to be changed as a function of the active and passive connected loads if a stable and safe ac voltage waveform has to be delivered to end users. commercial tap changers are based on the mechanical adding or subtraction of small inductances connected in series with the primary inductor [12]. 
however, the increasing demand of electrical energy with large fluctuations of the connected loads requires remote operation of tap changers, and this can only be achieved by substituting the mechanical switches by their solid-state counterparts. the implementation of anode voltage sensors is crucial to avoid the destruction of the solid-state switch due to eventual short circuits at both the high and low sides of the transformer.

2. anode voltage sensor

the design and optimization of the anode sensor structure is based on a 3.3 kv igbt punch-through technology with terraced gate design [13] to prevent a premature breakdown between adjacent core cells at the curvature of the p-body diffusion. a cross-section of the last igbt core cell and the sensor structure is plotted in fig. 1, where the main design parameter, the distance between adjacent deep p+ sinkers (l), is highlighted. the layout of the core region cells is striped and the corresponding metal gate runners are also included to minimize the gate resistance and prevent possible delays in the turn-on process of the farthest cells with the subsequent current focalization. the process technology is based on the standard 8 mask igbt technology available at the imb-cnm clean room, including a back-side deep n-type diffusion to create the n-buffer layer inherent to punch-through structures. an optimized multiple floating guard ring edge termination with 22 rings has been used to provide the required voltage capability for 3.3 kv applications, using the same deep p+ sinker diffusion as the core cells. each ring is covered by a metal line to ensure a uniform bias distribution along the ring once it becomes biased due to the depletion region extension. rings have to be wide enough to avoid a metal overlay that would lead to a field plate effect, degrading the effectiveness of the edge termination.

fig. 1 cross-section of the last igbt core cell and the sensor region

the concept and the operation mode of the anode voltage sensor structure were initially demonstrated on a 600 v igbt technology by integrating striped and cellular sensor structures with a suitable edge termination for the selected igbt structure [11]. the sensor consists of two deep p+ diffusions connected to the grounded cathode electrode and separated by a certain distance, with an additional shallow n+ diffusion in between to provide an ohmic contact to the n substrate for the additional sensor electrode. when a positive bias is applied to the anode electrode, the two p+n junctions of the sensor become reverse biased, leading to a self-shielding effect. hence, the bias at the sensor electrode (vsense) proportionally increases with the applied anode voltage in a range compatible with the gate driver electronics. in case of short-circuit or overvoltage, the vsense value will exceed a pre-defined threshold value and the gate driver will directly turn off the entire igbt or reduce the anode current to a safe value. the vsense value obtained in the striped sensor design is higher than that of the cellular counterpart, but the most critical parameter is the resistive load connected to the sensor electrode.

fig. 2 detailed view of the striped anode voltage sensor and its monolithic integration within the core igbt

fig. 3 top view of a fabricated 3.3 kv – 50 a igbt packaged for anode voltage sensor test

the final goal is to get 10 v at the sensor electrode at an anode voltage of 50 v for a 3.3 kv punch-through igbt technology. in this sense, a striped anode voltage sensor structure (see fig. 2) has been placed within the core of the igbt (see fig. 3), consuming 3% of the total active area.
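the protection principle described above, where the gate driver reacts once vsense exceeds a pre-defined threshold, can be sketched in a few lines of python. this is only an illustration under stated assumptions: the function name, the sampled-waveform representation and the 10 v threshold used in the example are hypothetical, and the sketch models only the turn-off decision, not the alternative of reducing the anode current.

```python
def first_trip_index(vsense_samples, threshold_v):
    """Return the index of the first sample above the threshold, or None.

    threshold_v is a hypothetical, driver-specific setting; the text only
    states that a pre-defined threshold exists.
    """
    if threshold_v <= 0:
        raise ValueError("threshold_v must be positive")
    for index, vsense_v in enumerate(vsense_samples):
        if vsense_v > threshold_v:
            return index  # the gate driver would turn the igbt off here
    return None  # threshold never exceeded, igbt stays on


# made-up sensor samples in volts, for illustration only
samples = [1.0, 4.0, 9.5, 10.5, 12.0]
print(first_trip_index(samples, threshold_v=10.0))
```

a real gate driver would implement this comparison in analog or mixed-signal hardware; the sketch only makes the decision rule explicit.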
[fig. 3 labels: anode voltage sensor, current sensor, cathode wires, gate pad]

the anode voltage sensor electrode is placed at the edge of the igbt core cells, with the stripes of the sensor orthogonal to those of the igbt. the current sensor is placed at the opposite side of the chip. a gate runner is placed between the igbt cells and the anode voltage sensor to minimize their mutual interaction. in this sense, the current flowlines of the last igbt core cell together with the first anode voltage sensor stripe, simulated with sentaurus tcad [14], are plotted in fig. 4. a small fraction of the current is collected through the sensor p+ diffusions, with an inherent slight increase of the on-state resistance. moreover, these p+ diffusions help in collecting holes during the igbt turn-off process in a similar way to the peripheral p+ diffusion connected to the cathode potential.

fig. 4 current flowlines at a gate bias of 15 v in the last igbt core cell and the first anode voltage sensor strip

3. fabrication of intelligent 3.3 kv igbts

3.3 kv igbts have been fabricated with a total chip area of 1.3×1.3 cm² for a nominal current of 50 a. the real current capability is shown in fig. 5, where more than 100 a are reached at a gate voltage of 15 v, with the gate threshold voltage in the range of 5 v. the feasibility of the anode voltage sensor has been experimentally verified, with vsense = 8 v measured at an anode voltage of 50 v, as inferred from fig. 6. in order to check the compatibility of the designed anode voltage sensor structure with a wide range of gate driver architectures, simulations of the vsense evolution when different resistive loads (100, 200, 500 and 1000 Ω) are connected to the sensor electrode are also included in fig. 6. although the vsense value at an anode voltage of 50 v slightly decreases when the load resistance increases, it can still be used as an input signal for the gate control driver.
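the figures quoted for the fabricated devices can be put in proportion with a short back-of-the-envelope calculation. the python sketch below computes the nominal current density referred to the total chip area (not the active area, which the text does not give) and compares the measured sensor level with the 10 v design goal; all names are illustrative assumptions.

```python
CHIP_SIDE_CM = 1.3          # total chip area is 1.3 x 1.3 cm^2
NOMINAL_CURRENT_A = 50.0    # nominal anode current
ANODE_VOLTAGE_V = 50.0      # anode voltage at which vsense is specified
VSENSE_TARGET_V = 10.0      # design goal at the sensor electrode
VSENSE_MEASURED_V = 8.0     # experimental value inferred from fig. 6


def chip_current_density_a_per_cm2(current_a, side_cm):
    """nominal current divided by the total (square) chip area."""
    return current_a / (side_cm * side_cm)


def sense_ratio(vsense_v, vanode_v):
    """sensor voltage as a fraction of the anode voltage."""
    return vsense_v / vanode_v


print(round(chip_current_density_a_per_cm2(NOMINAL_CURRENT_A, CHIP_SIDE_CM), 1), "A/cm^2")
print(sense_ratio(VSENSE_MEASURED_V, ANODE_VOLTAGE_V), "measured ratio")
print(sense_ratio(VSENSE_TARGET_V, ANODE_VOLTAGE_V), "target ratio")
print(VSENSE_MEASURED_V / VSENSE_TARGET_V, "fraction of the design goal")
```

under these numbers the nominal current density is just under 30 a/cm² over the whole chip, and the measured sensor level reaches 80% of the 10 v goal.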
[fig. 4 labels: igbt cathode, gate runner, sensor, sensor cathode]

fig. 5 experimental i(v) curves of the 3.3 kv igbt

fig. 6 simulated and experimental evolution of the voltage sensor value as a function of the resistive load of the control electronics

4. transient analysis of the anode voltage sensor

up to now, the vsense evolution has been simulated or measured by increasing the anode voltage from 0 to the desired value with the gate electrode grounded. hence, the current flowing through the entire igbt is in the leakage range, with no heating effects. in contrast, igbt modules operating in traction or on-load tap changer applications are far from the described ideal conditions [15]. pulse-width-modulation schemes are typical in traction applications to supply the sinusoidal current for the train speed control. therefore, 2d tcad simulations have been carried out to determine the turn-off performance of the igbt with an anode voltage sensor, based on the typical turn-off test circuit plotted in fig. 7 (left). igbt1 accounts for the designed 3.3 kv igbt with anode voltage sensor, while igbt2 is a conventional 3.3 kv igbt with the corresponding area factor to deliver a nominal current of 50 a. igbt1 is dimensioned to take into account that the total anode voltage sensor length corresponds to one of the laterals of the active igbt area and also has to deliver a nominal current of 50 a.

fig. 7 test circuit for the inverter switching simulation (left) and simulated turn-off waveforms of the different considered structures

three different cases have been considered by properly selecting the way the two igbts are connected, assuming that they share the anode electrode. the first case is igbt1 connected and igbt2 not connected. hence, information about the interaction between sensor and core cells can be directly derived.
the second case is igbt1 not connected and igbt2 connected, leading to the transient simulation of a real 3.3 kv – 50 a pt-igbt without anode voltage sensor. finally, the third case corresponds to both devices connected in a mixed-mode way, accounting for the fabricated igbt with anode voltage sensor. the current at the inductor increases with a certain di/dt when a constant vds voltage is applied to the circuit up to the desired level. when the gate voltage is ramped to 0 v, the igbts are turned off and the inductor current is forced to flow through the freewheeling diode (fwd). the inverter test circuit has been designed in accordance with the standard 2.5 kv dc line, where 3.3 kv igbts are used. in this sense, vds, l, lo and rg are set at 2500 v, 2.5 mh, 285 nh and 3.7 Ω, respectively. as a consequence, the maximum achievable current is in the 50 a range, which corresponds to the limit of the reverse bias safe operating area (rbsoa). the turn-off waveforms plotted in fig. 7 reveal that a significant interaction between the sensor and the core cells can be expected (dash-dot line), with the subsequent turn-off delay and enhanced heat generation in the vicinity of the anode voltage sensor cells due to the increase of the parasitic capacitances. nevertheless, when realistic area factors are used (solid line) to emulate the fabricated igbt with anode voltage sensor, the turn-off performance approaches that of the conventional pt-igbt counterpart (dashed line). in conclusion, the monolithic integration of the anode voltage sensor within the core area of the 3.3 kv igbt is feasible, since the interaction between the sensor and the adjacent core cells does not have a strong impact on the transient performance.

fig. 8 steady and transient simulation of the anode voltage sensor level as a function of the applied anode voltage (curves: steady state at vg = 0 v, steady state at vg = 15 v, turn-off, turn-on)

the last step in demonstrating the suitable operation of the fabricated intelligent igbt with integrated sensors (igbt1 + igbt2) is the transient simulation of the anode voltage sensor during the turn-on and turn-off processes and its comparison with the steady-state case. the turn-on and turn-off curves are extracted from transient simulations where the anode voltage is ramped down from 2500 v to 0 v (turn-on) and vice-versa (turn-off). the steady state simulations with the gate biased at 0 and 15 v exhibit a mismatch at high anode voltage values, far from real operation. nevertheless, the simulation at high anode voltage with the gate at 15 v is included since these conditions will happen at the beginning of the turn-off process, with a high current density flowing through the igbt core cells. as a consequence, the carrier concentrations and the electric field distribution will be modified in the anode voltage sensor structure. assuming that the anode voltage sensor structure is included in the active igbt area mainly to protect it from an unexpected fast increase of the anode voltage as a consequence of a short-circuit event or when the energy of an inductor is dumped into the semiconductor, the fast increase of the anode voltage sensor value at turn-on ensures the protection capability, since the threshold voltage level at which the gate driver will turn off the intelligent igbt will be reached even earlier than in the ideal steady state case.

5. conclusions

the basic design aspects and the expected evolution of an intelligent 3.3 kv – 50 a igbt with integrated anode voltage and current sensors are reported in this paper.
the operation of the new anode voltage sensor structure is analyzed with the aid of tcad simulations, including its interaction with the adjacent igbt core cells. the experimental evolution of the anode voltage sensor level as a function of the applied anode voltage has corroborated the feasibility of the fabricated devices. finally, transient simulations have been carried out to demonstrate the protection capability of the anode voltage sensor.

acknowledgement. the paper has been partially supported by the spanish ministry of science and innovation under grant tec2008-03304 and under cascada innpacto project (ipt-120000-2010-19).

references
[1] b. j. baliga, m. s. adler, p. v. gray, r. p. love, n. zommer, "the insulated gate rectifier (igr): a new power switching device", in proceedings of the 1982 international electron devices meeting (iedm), san francisco, usa, december 1982, pp. 264–267.
[2] a. nakagawa, y. yamaguchi, k. watanabe, h. ohashi, "safe operating area for 1200-v nonlatchup bipolar mode mosfet’s", ieee trans. on electron devices, vol. 34, pp. 351–355, february 1987.
[3] k. hamada, t. kushida, a. kawahashi, m. ishiko, "a 600 v 200 a low loss high current density trench igbt for hybrid vehicles", in proceedings of the 13th international symposium on power semiconductor devices and ics (ispsd), osaka, japan, june 2001, pp. 449–452.
[4] m. ciappa, a. castellazzi, "reliability of high-power igbt modules for traction applications", in proceedings of the 45th annual ieee international reliability physics symposium (irps), phoenix, usa, april 2007, pp. 480–485.
[5] e. r. motto, m. yamamoto, "new high power semiconductors: high voltage igbts and gcts", powerex inc., youngwood, pennsylvania, usa; mitsubishi electric, power division, fukuoka, japan.
[6] x. perpina, o. garonne, j.p. rochet, p. jalby, m. mermet-guyennet, j.
rebollo, "experimental analysis of temperature distribution within traction igbt modules", in proceedings of the 2007 european conference on power electronics (epe), aalborg, denmark, september 2007, pp. 1–10.
[7] m. ciappa, "selected failure mechanisms of modern power modules", microelectronics reliability, vol. 42, pp. 653–667, april-may 2002.
[8] x. perpina, a. castellazzi, m. piton, m. mermet-guyennet, j. millán, "robustness tests and failure analysis of igbt modules during turn-off", microelectronics reliability, vol. 48, pp. 1427–1431, september-november 2007.
[9] x. perpina, j.f. serviere, x. jordà, a. fauquet, s. hidalgo, j. urresti-ibáñez, j. rebollo, m. mermet-guyennet, "igbt module failure analysis in railway applications", microelectronics reliability, vol. 48, pp. 1427–1431, august-september 2008.
[10] e.r. motto, j.f. donlon, "igbt module with user accessible on-chip current and temperature sensors", in proceedings of the 27th annual ieee applied power electronics conference and exposition (apec), orlando, usa, february 2012, pp. 176–181.
[11] c. caramel, "nouvelles fonctions interrupteurs integrées pour la conversion d’énergie", phd thesis, université paul sabatier, toulouse, france, 2007.
[12] a. gómez expósito, d. monroy berjillos, "solid-state tap changers: new configurations and applications", ieee trans. power delivery, vol. 18, pp. 2228–2235, april 2007.
[13] j. bencuya, s.h. kwan, s.p. saap, "self aligned method of fabricating terrace gate dmos transistor", us patent, us879994a, 1999.
[14] 2010 sentaurus tcad tools suite, synopsys.
[15] d. y. chen, f.c. lee, g. carpenter, "nondestructive rbsoa characterization of igbts and mcts", ieee trans. on power electronics, vol. 10, pp. 368–372, may 1995.

facta universitatis series: electronics and energetics vol. 31, no. 4, december 2018, pp.
529-545 https://doi.org/10.2298/fuee1804529s

demands for spin-based nonvolatility in emerging digital logic and memory devices for low power computing

viktor sverdlov 1,2 , siegfried selberherr 2
1 christian doppler laboratory for nonvolatile magnetoresistive memory and logic at the institute for microelectronics, tu wien
2 institute for microelectronics, tu wien, gußhausstraße 27–29, a-1040 wien, austria

abstract. miniaturization of semiconductor devices is the main driving force to achieve an outstanding performance of modern integrated circuits. as the industry is focusing on the development of the 3nm technology node, it is apparent that transistor scaling shows signs of saturation. at the same time, the critically high power consumption becomes incompatible with the global demands of sustaining and accelerating the vital industrial growth, prompting an introduction of new solutions for energy efficient computations. probably the only radically new option to reduce power consumption in novel integrated circuits is to introduce nonvolatility. data retention without power sources eliminates the leakages and refresh cycles. as there is no need to waste time on initializing the data in temporarily unused parts of the circuit, nonvolatility also supports an instant-on computing paradigm. the electron spin adds functionality to digital switches based on field effect transistors. spinfets and spinmosfets are promising devices, with the nonvolatility introduced through the relative magnetization orientation between the ferromagnetic source and drain. a successful demonstration of such devices requires resolving several fundamental problems including spin injection from metal ferromagnets to a semiconductor, spin propagation and relaxation, as well as spin manipulation by the gate voltage.
however, increasing the spin injection efficiency to boost the magnetoresistance ratio as well as an efficient spin control represent the challenges to be resolved before these devices appear on the market. magnetic tunnel junctions with large magnetoresistance ratio are perfectly suited as key elements of nonvolatile cmos-compatible magnetoresistive embedded memory. purely electrically manipulated spin-transfer torque and spin-orbit torque magnetoresistive memories are superior compared to flash and will potentially compete with dram and sram. all major foundries announced a near-future production of such memories. two-terminal magnetic tunnel junctions possess a simple structure, long retention time, high endurance, fast operation speed, and they yield a high integration density. combining nonvolatile elements with cmos devices allows for efficient power gating. shifting data processing capabilities into the nonvolatile segment paves the way for a new low power and high-performance computing paradigm based on an in-memory computing architecture, where the same nonvolatile elements are used to store and to process the information.

key words: digital spintronics, spinfet, spinmosfet, spin-transfer torque, stt, spin-orbit torque, sot, mram, in-memory computing

received september 10, 2018
corresponding author: viktor sverdlov, christian doppler laboratory for nonvolatile magnetoresistive memory and logic at the institute for microelectronics, tu wien, gußhausstraße 27–29, a-1040 wien, austria (e-mail: sverdlov@iue.tuwien.ac.at)

1. introduction

continuous miniaturization of complementary metal-oxide semiconductor (cmos) devices is enabling the unprecedented increase of speed and performance of modern integrated circuits. numerous outstanding technological challenges have been resolved on this exciting path.
among the most crucial technological achievements implemented by the semiconductor industry within the last 15 years to boost cmos performance while maintaining the gate control over the semiconductor channel are the introduction of strain [1], high-k gate dielectrics and metal gates [2], and three-dimensional (3d) tri-gate transistor architecture [3],[4], [5]. while chips with 5nm technology based on nanosheets are already nearing production [6], the semiconductor industry is now focusing on a 3nm technology node. although setting limits for scaling has proven to be a mere meaningless task in the past, it is obvious that the conventional transistor scaling is showing signs of saturation. to sustain the growing demand for high performance small area central processing units (cpus) and high-capacity memory needed to handle an increasing information flow, the introduction of a disruptive technology employing new computing principles is anticipated. most importantly, any emerging technology must be energy efficient. indeed, a harmful active power penalty already prevents the clock frequency from increasing in cmos circuits and is saturated at approximately 3.7 ghz with the possibility to be boosted for a short time up to 4.2 ghz under heavy load in high-end consumer-level workstation cpus. although the transistor size has been scaled down, the load capacitance value per unit area remained approximately unchanged, which keeps the on-current approximately constant for maintaining appropriate high speed operation due to the unavoidable charging of this capacitance. in addition, small transistor dimensions lead to rapidly increasing leakages. a rapid increase of the stand-by power due to transistor leakages at small transistor dimensions, as well as of the dynamic power and the need to refresh the data in dynamic random access memory (dram), is becoming a pressing issue. 
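The power argument above can be summarized with the standard first-order CMOS power model. The symbols below (activity factor a, switched load capacitance C_L, supply voltage V_DD, clock frequency f, leakage current I_leak) are textbook notation assumed here for illustration; they are not taken from the original text.

```latex
P_{\mathrm{total}} \approx
\underbrace{a \, C_L \, V_{DD}^{2} \, f}_{\text{dynamic (active) power}}
+ \underbrace{I_{\mathrm{leak}} \, V_{DD}}_{\text{stand-by power}}
```

In this simplified picture the first term is what limits the clock frequency, while the second term grows with leakage at small transistor dimensions. Nonvolatility mainly targets the second term, since circuit blocks that retain their data without a supply can be switched off when idle.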
the microelectronics industry is facing major challenges related to power dissipation and energy consumption, and the scaling of silicon semiconductor devices will soon hit a power wall. an attractive path to mitigate the unfavorable trend of increasing power at stand-by is to introduce nonvolatility in the circuits. the development of an electrically addressable nonvolatile element, which combines fast operation, simple structure, and high endurance, is essential to mitigate the increase of the stand-by power and the power needed for data refresh. nonvolatile elements also enable instant-on architectures without the need of data initialization when going from a stand-by to an operation regime. recently, a fruitful cooperation between intel and micron resulted in bringing a nonvolatile memory to the market, which is based on a three-dimensional x-point cross-bar architectural solution [7]. although the physical principle of operation has not been officially released, there is a consensus within the community that, as a phase-change memory, it is based on a resistance change due to the phase transition. although being nonvolatile, the phase-change memory requires a high power to write information as compared to other nonvolatile memories, for example, resistive ram. oxide-based resistive ram (rram) exploits filamentary switching between the on/off states and is thus intrinsically prone to significant resistance fluctuations in both states. in addition, the endurance reported is only slightly better than that of flash memory. although rram possesses a simple structure and a large on/off current ratio, it is premature to consider rram at its current stage of development for digital applications.
since continuous conductance modulation is suitable for implementing analog synaptic weights, both filamentary and non-filamentary switching rram types are currently intensively investigated, particularly for neuromorphic applications [8]. to be competitive with the traditional volatile technologies and also with nonvolatile flash, emerging nonvolatile devices must offer a fast switching time and high integration density supported by good scalability. in addition, emerging nonvolatile memory must possess a long retention time, a high endurance, and a low write power. at the same time, it must exhibit a simple structure to reduce fabrication costs and must be compatible with cmos to benefit from advantages provided by outstandingly well-developed cmos fabrication technology.

traditional cmos technology is based entirely on the electron charge. another intrinsic characteristic of electrons, the electron spin, has remained relatively unexploited for digital applications until recently. the electron spin is characterized by two states with distinct projections on a quantization axis and may require negligible energy for spin reversal, properties that are attractive for digital electronics; however, the spin interacts effectively with a magnetic, not an electric, field. as the magnetic field is usually generated by a current, this approach is not suitable for downscaling, because reducing the cross-section of the current-carrying wire increases the current density, which results in reliability issues due to electromigration. it is therefore necessary to proceed to the quantum mechanical level to make the coupling of the electron spin to the electric field efficient. we briefly review below the current status of spin-based digital switches including the recently demonstrated spin field effect transistor (spinfet) and the spin mosfet, and we outline the exciting challenges still preventing the spin-based switches from entering mass production.
thereafter we document the current status in spin-based purely electrically addressable nonvolatile magnetoresistive memories, which are proven to be competitive with flash memory and possess a considerable potential to enter the dynamic and static ram markets. we conclude the review with an outlook focusing on nonvolatile logic architectures.

2. spin-based switches for digital applications

discovery of the gate-voltage dependent rashba spin-orbit field [9] acting on the electron spin in the transistor channel opened a possibility for a purely electrical way to manipulate the spin of a propagating electron, which resulted in the proposal of a spin-based transistor, in which the charge functionality was complemented and enhanced by the electron spin: the spin field-effect transistor, or spinfet [10]. the spinfet (fig.1) is a promising future semiconductor device with a performance potentially superior to that achieved in the present transistor technology [11], [12], [13]. an additional functionality is added by replacing the non-magnetic source and drain electrodes in a fet by ferromagnetic counterparts. the two ferromagnetic contacts (source and drain) are connected by a nonmagnetic semiconductor channel region. metallic ferromagnetic contacts serve not only as an injector/detector of the electron charge current in the channel: because of their magnetization, the source and drain electrodes also inject/detect electron spins [9], [10]. the electron current is enhanced in the case of parallel alignment between the source/drain electrodes as electrons injected with spins parallel to the drain magnetization can easily escape from the channel to the drain. alternatively, the current is suppressed for antiparallel magnetization alignment [9], [10]. the device with the on-current depending on the relative alignment between source and drain is termed spin metal-oxide-semiconductor field-effect transistor (spinmosfet).
as the magnetization of the source/drain can be manipulated by means of the external magnetic field and/or current (by means of the spin-transfer torque), the two on-current states for parallel/antiparallel magnetization alignment potentially enable the design of reprogrammable logic [10]. importantly, the relative magnetization orientation between the source and drain is preserved without external power, which makes reprogrammable logic partly nonvolatile.

fig. 1 illustration of spinfet functionality [10]. spin-polarized electrons are injected from a ferromagnetic source and absorbed by a ferromagnetic drain. the electron spins in the channel can be additionally manipulated by means of the gate voltage dependent spin-orbit interaction.

the total current through the device depends on the relative orientation between the magnetization direction of the drain and the electron spin polarization at the end of the semiconductor channel. the electron spin polarization close to the drain interface is determined by the source magnetization and can be additionally modulated by the effective gate voltage dependent spin-orbit interaction in the channel. however, in contrast to the electron charge, the spin injected into the channel is a non-equilibrium quantity and is not conserved. the injected spin relaxes to its equilibrium zero value while propagating through the channel. spin relaxation is an important detrimental factor affecting the spinfet functionality as it reduces the current modulation due to spin functionality. the spin relaxation is governed by the spin-orbit interaction (soi) in combination with scattering, so even a spin-independent scattering potential will result in a spin decay. the spin relaxation manifests itself differently in semiconductors of the group iv including silicon and in iii-v semiconductors.
in crystals obeying the inversion symmetry (silicon, germanium) the spin relaxation is governed by the elliott-yafet mechanism [14], [15]. because the wave function with fixed spin projection is not an eigenstate of the hamiltonian with the spin-orbit interaction included, the wave function possesses a small but finite contribution with an opposite spin projection. therefore, a small but finite probability to flip the electron spin appears at every spin-independent scattering event (the elliott process [14]). this is complemented by the yafet spin-flip events due to the spin-dependent contribution to electron-phonon scattering. in silicon the electron spin relaxation is determined by the inter-valley transitions [14] and can be efficiently controlled by stress [15]. in silicon channels, uniaxial stress generating shear strain is particularly efficient to suppress the spin relaxation [16] as it lifts the degeneracy between the two unprimed subband ladders [17]. in addition, choosing the spin injection direction also boosts the spin lifetime by a factor of two [18]. recently, the first successful demonstration of a silicon spinmosfet at room temperature [19] was presented. a large absolute current modulation in the spinmosfet was achieved by altering the relative magnetization between the source and the drain from parallel to antiparallel. however, the relative ratio of the on-currents, a characteristic similar to the tunnel magnetoresistance (tmr) ratio, is still several orders of magnitude lower [19] than the tmr in magnetic tunnel junctions. in the spinmosfet studied, a magnetoresistance (mr) of less than 1% was experimentally observed at room temperature [19]. a possible option to boost the modulation is to employ an electric field at the contact interface between the ferromagnet and the two-dimensional electron gas to increase spin-to-charge conversion [20].
although a large tmr of 80% was reported on a iii-v surface layer at low temperature, at about 1k, which is needed to avoid spin relaxation, the technique of boosting spin-to-charge conversion possesses potential for being employed in silicon at room temperature. in iii-v materials the crystal lattice does not have any inversion symmetry, and the degeneracy between the up and down spin states with the same electron momentum is lifted. the spin relaxation is governed by the dyakonov-perel mechanism [12], [13] and becomes stronger for larger spin-orbit interaction. at the same time, the design of a spinfet faces a tough trade-off. on the one hand, one needs a stronger spin-orbit interaction for more efficient spin manipulation. on the other hand, strong spin-orbit interaction results in a short spin-diffusion length characterizing the distance over which nonequilibrium spins can propagate along the channel before relaxing to the zero equilibrium value. therefore, the first convincing experimental demonstration of a spinfet [21] was performed at a very low temperature to effectively suppress spin relaxation. spinfets and spinmosfets can properly function only if the electron spins are efficiently injected/extracted in/from the channel. to achieve spin injection/detection from a metal ferromagnet in the semiconductor channel and vice versa, a properly engineered tunneling barrier must be placed between the electrodes and the channel [22] to mitigate the spin impedance mismatch. however, the signal attributed to the spin injection [23] appears to be much weaker as compared to the large effect [24] currently attributed to the spin-dependent resonant tunneling [25], [26], [27]. peculiarities of spin-dependent trap-assisted tunneling in a multi-terminal configuration may result in switches driven by a single spin on the trap [28] and with characteristics similar to those of a single-electron transistor.
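The trade-off between spin manipulation and spin relaxation described above can be made explicit with two textbook expressions. This is a sketch in assumed notation (Rashba parameter α_R, effective mass m*, channel length L, injected polarization P_0, spin-diffusion length λ_s, diffusion coefficient D, spin lifetime τ_s); none of these symbols appear in the original text, and the exponential decay is the simplest diffusive approximation.

```latex
\Delta\theta = \frac{2\, m^{*} \alpha_R\, L}{\hbar^{2}},
\qquad
P(L) \approx P_0 \exp\!\left(-\frac{L}{\lambda_s}\right),
\qquad
\lambda_s = \sqrt{D\, \tau_s}
```

A larger α_R gives a larger gate-controlled precession angle Δθ for a given channel length, but in the Dyakonov-Perel regime it also shortens τ_s and hence λ_s, so less polarization survives to the drain.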
the lack of an efficient way to electrically inject spins from a ferromagnetic metal in a semiconductor was one of the reasons why it took more than 25 years from the vision of a spinfet in 1990 [10] to the first reliable experimental spinfet demonstration [21], where a clever idea to employ the voltage-dependent spin-orbit interaction in iii-v materials was used for an efficient spin injection from point contacts into the channel. additional gates were employed to create the point contacts to the two-dimensional electron gas by confining it to a one-dimensional channel. applying different voltages to these gates produces the spin-orbit rashba field perpendicular to the point contact. then all electrons entering the two-dimensional channel through the one-dimensional point-like contacts are spin-polarized. this way an efficient and purely electrical spin injection is achieved, which allowed the first ever reliable demonstration [21] of a working spinfet. new 2d materials (graphene, transition metal dichalcogenides) are attractive for emerging microelectronics applications as they allow developing new concepts, in particular for spin switches [29]. spin-polarized electrons injected into a graphene sheet propagate to the drain and remain spin-polarized as the spin relaxation in graphene is weak. however, if a mos2 single layer is put on top, a parallel path for electrons through the material with strong spin relaxation is open, and the current reaching the drain is not spin-polarized. as the chemical potential of the mos2 layer can be tuned from the conduction band to the gap, this opens the opportunity to have a switch between spin-polarized/spin-unpolarized drain currents depending on the gate voltage, which operates at 200k.
although many fundamental challenges have been resolved and a spinfet and a spinmosfet have been successfully demonstrated, both devices still rely on the charge current to transfer the spin, which may set some limitations for the applicability of such devices in main-stream microelectronics. in addition, the absence of an efficient and purely electrical spin injection scheme results in a low mr inferior to that in magnetic tunnel junctions. nonvolatile devices based on magnetic tunnel junctions possess a tmr suitable for practical applications, in particular in memory devices discussed below.

3. nonvolatile magnetoresistive memories

an efficient coupling between the electrical and the magnetic degree of freedom at the quantum mechanical level, called the giant magneto-resistance effect, was discovered in 1986. the discovery offered a purely electrical way of reading the stored magnetization information, which revolutionized hard drive storage devices and was honored with the nobel prize in 2007 [30]. it then was discovered that the tunneling current through a magnetic tunnel junction (mtj) consisting of two ferromagnets separated by a thin tunnel barrier structure strongly depends on the relative polarization of the ferromagnetic contacts. the difference in the mtj resistivity between the parallel and the antiparallel configuration reaches several hundred percent at room temperature [31], which represents an efficient way of converting the spin (magnetization) degree of freedom into a charge (current) employed by cmos devices. thanks to this technology a new generation of hard drives with even higher storage densities has been developed.

however, in order to introduce competitive nonvolatile memory based on mtjs, an efficient way of converting charge information into magnetic moment (spin) orientation by purely electrical means is required.
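For reference, the magnetoresistance ratio quoted in this section is commonly defined from the junction resistances in the two configurations. The notation below (R_P and R_AP for the parallel and antiparallel resistances, P_1 and P_2 for the spin polarizations of the two electrodes) is assumed here and is not from the original text; the second expression is Jullière's simple estimate, which ignores the details of the barrier.

```latex
\mathrm{TMR} = \frac{R_{AP} - R_{P}}{R_{P}},
\qquad
\mathrm{TMR}_{\text{Julli\`ere}} \approx \frac{2 P_1 P_2}{1 - P_1 P_2}
```

With this definition, a TMR of 200% corresponds to R_AP = 3 R_P, which is the kind of read margin that makes the two magnetization states easy to distinguish electrically.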
the spin-transfer torque effect (stt) [32], [33] has been proven to be perfectly suited for purely electrical data writing by passing a current through the mtj (fig.2a). when electrons pass through a fixed ferromagnetic layer, their spins become aligned with the magnetization. when these spin-polarized electrons enter the free magnetic layer, they become aligned with the magnetization of the free layer within a transition layer of a few angstroms. due to the conservation of the total angular momentum the change of the electron spin is compensated by the modification of the free layer magnetization. therefore, the spin-polarized current exerts a torque on the magnetization of the free layer. if the current is sufficiently strong to overcome the damping, this torque causes magnetization switching. if the current tends to switch the magnetization from the parallel to the antiparallel configuration, reversing the current direction will switch the magnetization from the antiparallel to the parallel state.

3.1. spin-transfer torque mram

the exciting journey of bringing stt-mram to the market was ignited by the observation of spin torque induced switching in mgo-based [34] stt-mram cells. depending on the orientation of the layer magnetizations, the magnetic pillars can be divided into in-plane, with the magnetization lying in the plane of the magnetic layer (fig.2), and out-of-plane, with the magnetization direction to be discussed later.

fig. 2 (a) magnetic tunnel junction with in-plane magnetization orientation. the magnetization of the reference layer (rl) is fixed, while the magnetization of the free layer (fl) can be flipped between the two preferred magnetization orientations by means of the spin transfer torque generated by the current passing through the structure. the two ferromagnetic layers are separated by a thin mgo tunnel barrier. (b) the preferred magnetization direction in in-plane magnetized free layers is defined by their shape.
for elliptic structures these directions are along the long ellipse axes.

for in-plane mtjs, faster switching is achieved, when the free layer is made of two half-ellipses separated by a narrow gap [35], [36] schematically shown in fig.3. the switching of the half-elliptic parts appears in-plane in opposite senses (fig.3b). this way a large demagnetization penalty of the magnetization getting out of plane at switching of a monolithic structure (fig.2b) is avoided, and the switching barrier becomes equal to the thermal barrier. because the thermal barrier depends on the free layer volume, the required large thermal stability factors of ~80kt are easily achieved in this structure.

fig. 3 (a) equilibrium magnetization in an in-plane composite layer made of two half-elliptic parts separated by a narrow gap. (b) at spin transfer switching, the magnetization of either half-elliptic part remains mainly in-plane. the magnetizations move in opposite senses and pass through the configuration defining the thermal barrier between the two equilibrium states [35].

in perpendicular mtjs (p-mtjs) shown in fig.4a the thermal barrier separating the two states is equal to the switching barrier, which reduces the switching current. in addition, p-mtjs are better suited for high-density integration [37].

fig. 4 (a) illustration of a perpendicularly magnetized mtj. the perpendicular uniaxial anisotropy is interface-induced [38]. (b) the thermal stability is enhanced by adding the second cofeb/mgo interface with the interface induced anisotropy [42]. the gilbert damping is also reduced for the composite free layer made of two thin ferromagnetic cofeb layers separated by a ta metallic spacer [42].

however, it is difficult to find a material with so strong uniaxial anisotropy that it can overcome the demagnetization field of a layered structure bringing the magnetization in-plane.
a critical technological step allowing to solve this problem was the discovery of an interface-induced perpendicular anisotropy at the cofeb/mgo interface [38], which makes the very thin cofeb layer perpendicularly magnetized. to scale the diameter of the mtj beyond 10nm, the use of shape anisotropy was suggested [39]. it has been shown that the thermal stability can be boosted for small diameters without sacrificing on tmr and without any need of new materials, as feb for the ferromagnetic free layer and mgo for the tunnel barrier were used. any nonvolatile memory including mram must be characterized by the ability to write the data with low energy without damaging the device, long data retention, and the ability to read the data without destroying it. improving one or two aspects of its functionality usually leads to a degradation of the remaining functionality [40]. therefore, a careful parameter optimization specific to a particular technology must be properly addressed. an innovative design has already yielded a successful implementation of 8mb 1t-1mtj stt-mram embedded in a 28nm cmos logic platform [41]. another issue with stt-mram is the relatively high current required for fast stt-induced writing. this fact has several implications. firstly, due to the relatively high energy required for writing, stt-mram cannot be used in high-level processor caches due to the high activity factor. the necessity to switch memory frequently negates the benefits of nonvolatility provided by mram. secondly, large switching currents are supplied via an access transistor. finally, a large switching current density can result in serious reliability issues like mtj’s resistance drift and eventually its dielectric breakdown. the critical current density depends on the switching pulse duration, with a substantial current increase for sub-10ns switching.
a plausible way to reduce the switching current density is to work with p-mtjs. however, even in this case the switching current competes with the gilbert damping and must be sufficiently strong to overcome the potential barrier separating the two states, the thermal stability barrier. the height of the thermal stability barrier determines the data retention. for 10 year data retention the thermal stability barrier must be at least 80kt for gigabit mram arrays. as the barrier cannot be decreased without violating the retention, in order to reduce the switching current density and preserve the large thermal barrier at the same time, one has to reduce the gilbert damping and increase the spin current polarization. a solution which helps boosting the thermal stability barrier is to employ a free layer with two cofeb/mgo interfaces [42] with the interface induced perpendicular anisotropy shown in fig.4b. it turns out that the use of the p-mtj structure with two cofeb/mgo interfaces and a composite free layer cofeb/ta/cofeb [42] also reduces the gilbert damping by half, thus allowing to simultaneously boost the thermal barriers and to reduce the switching current. in order to integrate stt-mram with the cmos fabrication process, mtjs must withstand temperatures of at least 400 °c, typical for the back-end-of-line process. recently, imec reported a process allowing to preserve the high tmr and thermal stability of mtjs [43]. the idea is to invert the free and the fixed layer by putting the fixed layer on top of the mtj. an additional synthetic ferromagnet instead of an antiferromagnet is employed to pin the magnetization of the fixed layer. the design requires a compensating magnet to be in addition integrated in the structure. optimization of this magnet can be used to achieve a symmetric switching between the parallel and the antiparallel configuration and back. advanced stt-mram is characterized by a high-speed access time of 10ns. it is thus suitable for last level caches.
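The link between the thermal stability barrier, data retention, and switching current discussed above can be illustrated with a macrospin estimate. The sketch below is not the authors' model: it assumes the Néel-Arrhenius law with an attempt time of 1 ns for the retention of a single bit, and the standard macrospin expression I_c0 = (4 e k_B T / ħ)(α/η)Δ for the intrinsic switching current of a perpendicular free layer. The function names and the values α = 0.01 and η = 0.6 are hypothetical and chosen only for illustration.

```python
import math

K_B = 1.380649e-23          # Boltzmann constant, J/K
E_CHARGE = 1.602176634e-19  # elementary charge, C
HBAR = 1.054571817e-34      # reduced Planck constant, J*s


def retention_time(delta, tau0=1e-9):
    """Mean time (s) before a thermally activated flip of a single bit.

    Neel-Arrhenius law: tau = tau0 * exp(delta), where delta = E_b / (k_B T)
    is the thermal stability factor. tau0 = 1 ns is an assumed attempt time.
    """
    return tau0 * math.exp(delta)


def critical_current(delta, alpha, eta, temperature=300.0):
    """Macrospin estimate (A) of the intrinsic STT switching current of a
    perpendicular free layer: I_c0 = (4 e k_B T / hbar) * (alpha / eta) * delta.

    alpha: Gilbert damping; eta: spin-transfer efficiency (both dimensionless).
    """
    prefactor = 4.0 * E_CHARGE * K_B * temperature / HBAR
    return prefactor * (alpha / eta) * delta


if __name__ == "__main__":
    delta = 80.0  # thermal stability factor quoted in the text
    # Illustrative (assumed) damping values: halving alpha halves I_c0.
    for alpha in (0.010, 0.005):
        i_c0 = critical_current(delta, alpha, eta=0.6)
        print(f"alpha = {alpha:.3f}: I_c0 = {i_c0 * 1e6:.1f} uA")

    seconds_per_year = 365.25 * 24 * 3600
    years = retention_time(40.0) / seconds_per_year
    print(f"delta = 40: single-bit retention of about {years:.1f} years")
```

Because I_c0 is proportional to αΔ/η in this model, the barrier cannot be lowered to save current without losing retention, which is why reducing the damping and increasing the spin polarization are the remaining levers. The sketch treats one isolated bit and does not attempt to reproduce the 80kt requirement for gigabit arrays.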
4gbit stt-mram arrays with p-mtjs and compact memory cells were recently reported [37]. currently, 256mb stt-mram from everspin technologies is already available [44]. all major foundries including samsung [45] and globalfoundries [46] announced the beginning of production of embedded stt-mram based on the 28nm [41]/22nm [46] fully-depleted silicon-on-insulator technology. we are therefore witnessing the beginning of nonvolatile stt-mram entering the embedded memory market and competing with dram and potentially sram, traditionally dominated by cmos-based volatile devices. if successful, this will result in an exponential growth of the stt-mram market with a momentous impact on information storage and processing in the near future.

3.2. spin-orbit torque mram

spin-transfer torque magnetic ram (stt-mram) is fast (10ns), possesses high endurance (10^12), and has a simple structure. it is compatible with cmos and can be straightforwardly embedded in circuits. it is particularly promising for introducing nonvolatility in internet of things (iot) and automotive applications, as well as for replacing conventional volatile cmos-based dram and nonvolatile flash memory. although the use of stt-mram in last-level caches is conceivable [47], the switching current for operating faster than 10ns is quite high. the need of even higher switching currents for faster operation in higher-level caches potentially prevents stt-mram from entering l2 and l1 caches currently dominated by static ram (sram). in addition, rapidly increasing critical currents required for operating stt-mram at 5ns result in large current densities running through magnetic tunnel junctions. this leads to oxide reliability issues, which in turn reduces the mram endurance to that of the flash memory, thus negating one of the important mram advantages over flash.

fig. 5 (a) spin-orbit torque memory cell with an in-plane magnetized free layer (fl) switched by current write pulses (red and blue) applied through heavy metallic wires nm1 and nm2 with a high spin hall angle. fl is grown on nm1. read current (green) is applied through an mtj. (b) the switching scheme [36] based on two short consecutive orthogonal current pulses provides fast, deterministic, and magnetic field free magnetization reversal of a perpendicularly magnetized fl [68].

the engineering of an electrically addressable nonvolatile memory combining high speed (sub-ns operation) and high endurance suitable for replacing sram in higher-level caches of hierarchical multi-level processor memory structures cannot be based on stt, and the use of new physical principles is required. among the newly discovered physical phenomena suitable for next-generation mram is the spin-orbit torque (sot) assisted switching at room temperature in heavy metal/ferromagnetic [48], [49], [50], [51], [52], [53], [54], [55] or topological insulator/ferromagnetic [56], [57], [58], [59] bilayers (fig.5). in this memory cell, the magnetic tunnel junction’s free layer is grown on a material (nm1 in fig.5) with a large spin hall angle. the sot acting on the adjacent magnetic layer is generated by passing the current through this material, schematically shown by the red arrow. the large switching current is injected in-plane along the heavy metal/ferromagnetic bilayer and does not flow through the mtj, the state of which is read by passing a small current through the mtj shown by the green arrow. this results in a three-terminal configuration where the read and write current paths are decoupled. since the large write current does not flow through the oxide in the mtj, the tunnel barrier is protected from damage.
therefore, three-terminal mram cells are promising candidates for future generations of nonvolatile memory for fast sub-ns switching [54]. sot-mram is an electrically addressable nonvolatile memory combining high speed and high endurance and is thus suitable for applications in caches [54]. although the high switching current is not flowing through a magnetic tunnel junction but rather through a heavy metal wire under it, the current is still high, and its reduction is the pressing issue in the field of sot-mram development. topological insulators (tis) are promising materials for reducing the switching current as they are characterized by a high spin hall angle and efficiency of charge to spin conversion due to peculiar perpendicular spin-momentum locking in the interface states. in addition, the strong spin-orbit interfacial rashba field helps generating spin density in tis boosting the charge to spin conversion efficiency above 100%. however, although high charge to spin conversion efficiency in tis has been reported, the electrical conductivity of tis required to build a high-density, ultra-low power, and ultra-fast nonvolatile memory was not sufficiently high because of their insulating bulk. recent developments introduce bise [57] and bisb [58] based tis as suitable candidates for emerging sot-mram as they possess a charge to spin conversion efficiency of 18.8 and 52 times, respectively. this allows reducing the switching current by two orders of magnitude as compared to tungsten-based sot-mram. in addition, bisb samples [58] exhibit very high electrical conductivity making thin bisb films leading candidates for emerging fast and low-power sot-mram, and the process integration of bisb into a realistic mtj stack is currently under scrutiny. despite an undisputable progress in developing sot-mram, one important shortcoming has not been convincingly resolved so far. 
namely, a static magnetic field is still required to guarantee deterministic switching [60] of a perpendicularly magnetized free layer. several paths to achieve deterministic switching without magnetic fields were suggested. they require unusual solutions to break the mirror symmetry either by means of the shape of the dielectric [61] or the free layer [62], or by controlling the crystal symmetry of the metal line at the microscopic level [53]. biasing the free layer by employing an exchange coupling to an antiferromagnet [63], [64], [65], [66] as well as the use of a peculiar sample shape, which controls the switching [67], were recently reported. however, even if field-free switching was reported in most of these studies, these methods either require a local intrusion into the fabrication process, or are based on solutions whose scalability is questionable (antiferromagnets, shapes), which makes further large-scale integration of the fabricated memory cells problematic.

fig. 6 switching time of a perpendicular fl as a function of the width of the heavy metal line nm2, for several durations of the second write current pulse in the two-pulse switching scheme [36]. a nm2 width of 12.5nm is optimal as it guarantees fast, robust, and deterministic switching which is insensitive to small variations of the pulse duration or the nm2 width. the applied write currents are equal to 100μa. the first write current pulse of 100ps is applied after initial thermalization at 300k. the saturation magnetization is ms = 4×10^5 a/m, the gilbert damping is α = 0.05, and the fl dimensions are 52.5nm × 12.5nm × 2nm.

the sot switching scheme based on the use of two consecutive orthogonal sub-nanosecond current pulses shown in fig. 5 can switch in-plane structures efficiently [36], [68]. the implementation is suitable for integration in a cross-bar architecture [36].
the two-pulse switching scheme is also applicable for switching perpendicularly magnetized free magnetic layers of rectangular shape shown in fig. 5b [69]. the second consecutive pulse is applied through the wire nm2 with a large spin hall angle. sub-300ps, 100% reliable, and magnetic field-free switching is achieved at around 30% overlap of the second pulse wire nm2 with the free layer as shown in fig. 6. in addition, at these overlaps the switching is not sensitive to small variations of the nm2 width and the second pulse duration. a proper integration of sot-mram represents a significant challenge as the memory cells must withstand at least the back-end-of-line thermal budget. since imec recently presented a technology to integrate a perpendicular beta-phase w/cofeb/mgo/cofeb/synthetic antiferromagnetic stack based sot-mram on a 300mm cmos wafer with fully cmos compatible processes [70], there is cautious confidence that fast low-power nonvolatile magnetoresistive memory suitable for processor caches will be developed soon.

4. nonvolatile computing

mram is cmos compatible and is embedded directly on top of a cmos logic chip. this layout is practically relevant as it facilitates 3d integration of circuits in which high-performance cmos layers are separated by spacers containing low-energy-consuming nonvolatile mram. fast nonvolatile memory combined with nonvolatile processing elements is a fertile ground for realizing microprocessors with reduced power consumption working on an entirely new principle. in addition, placing mram arrays directly on top of cmos circuits allows reducing the length of interconnects and the corresponding delay time. the computer architecture, where nonvolatile elements are located on a chip with cmos devices, is traditionally called logic-in-memory, although as of yet no information is processed in nonvolatile elements.
power-efficient mram-based logic-in-memory concepts have already been demonstrated [71]. they include field-programmable gate arrays and ternary content addressable memory, as well as other variants. these cmos/spintronic hybrid solutions are already competitive in comparison to the conventional cmos technology with respect to power consumption and speed. the introduction of nonvolatility in circuits helps cut the power consumption by 50%, with an outstanding 90% reduction in specific circuits [71]. placing the actual computation into the magnetic domain reduces the need to convert magnetically stored information into currents and voltages for processing and helps to simplify the circuit layout and to increase the integration density. the idea is to use mtjs as elementary blocks for non-conventional in-memory computing architectures. for example, any two of the 1t-1mtj cells in an mram array can not only serve as nonvolatile memory units, but can also be employed for implementing a conditional switching of a target mtj depending on the state of the source mtj [72]. this results in a logical operation known as the material implication operation (imp), which in combination with the false operation can cover the whole space of all boolean operations. a compact implication-based single-bit full adder involving only six 1t-1mtj cells and 27 subsequent false and imp operations can be realized [73]. recently, a massively parallel not operation based on an imp implementation using the voltage asymmetry of the voltage controlled magnetic anisotropy effect and the precessional dynamics of the switching process was proposed [74]. the logic architecture based on devices acting simultaneously as a memory element and a compute unit is termed stateful.
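the statement that imp together with false covers all boolean operations can be checked directly. the python sketch below is only an illustration under stated assumptions, not the circuit of [72] or [73]: cell states are modelled as plain bits, imp overwrites the target cell with (not source) or target while leaving the source untouched, and false resets a cell to 0. the names false_op, imp_op, not_gate, and nand_gate are hypothetical. since nand alone is functionally complete, building it from one false and two imp operations is enough to show that the pair {imp, false} reaches every boolean function.

```python
from itertools import product


def false_op(cells, target):
    """FALSE: unconditionally reset the target cell to logic 0."""
    cells[target] = 0


def imp_op(cells, source, target):
    """IMP: target <- (NOT source) OR target; the source cell keeps its state."""
    cells[target] = int((not cells[source]) or cells[target])


def not_gate(p):
    """NOT p using one work cell: one FALSE followed by one IMP."""
    cells = {"p": p, "w": 1}      # the work cell may start in any state
    false_op(cells, "w")          # w = 0
    imp_op(cells, "p", "w")       # w = (NOT p) OR 0 = NOT p
    return cells["w"]


def nand_gate(p, q):
    """p NAND q using one work cell: one FALSE followed by two IMPs."""
    cells = {"p": p, "q": q, "w": 1}
    false_op(cells, "w")          # w = 0
    imp_op(cells, "p", "w")       # w = NOT p
    imp_op(cells, "q", "w")       # w = (NOT q) OR (NOT p) = p NAND q
    return cells["w"]


if __name__ == "__main__":
    # Print the NAND truth table obtained from FALSE and IMP only.
    for p, q in product((0, 1), repeat=2):
        print(p, q, "->", nand_gate(p, q))
```

the result is written into a memory cell rather than onto a separate output wire, which is the sense in which such logic is called stateful; the input cells p and q keep their stored values throughout.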
an alternative option is to follow a conventional path with memory and computing units separated, in which, like in an all-spin logic concept [75], both elements are nonvolatile and implemented in the magnetic domain. placing the actual computation into the magnetic domain eliminates the need to convert magnetically stored information into currents and voltages for processing. the idea of combining mtjs with a common free layer enables the realization of a nonvolatile magnetic flip-flop [76]. the processing unit consists of an stt based nonvolatile majority gate and nonvolatile magnetic flip-flops used as memory registers in a nonvolatile processing environment [77]. the availability of high-capacity nonvolatile memory enables new logic-in-memory and computing-in-memory architectures for future artificial intelligence and cognitive computing [78]. nonvolatile mtjs are suitable for neural network realizations as they can be considered as a current-driven programmable resistor, a memristor [79]. mtj based neural networks featuring nonvolatile synapses [80] allow for high-speed pattern recognition with about a 70% reduction in gate count and a 99% improvement in speed as compared to their cmos counterparts. neuromorphic computing is becoming a reality, with the first self-learning chips already revealed [81].

5. conclusions

although spin switches based on spin-enhanced transistors have been successfully demonstrated, the increase of the on-current ratio between the parallel and antiparallel source/drain magnetization configurations at room temperature and the efficient gate voltage induced spin control remain the main challenges preventing these devices from entering the market so far.
in addition, as all spin switches proposed to date require the charge current to transfer the spin, this limits the applicability of such devices in mainstream microelectronics, which is already suffering from high power demand, and new ideas to realize spin-based switches for digital applications are urgently needed. nonvolatile memories based on magnetic tunnel junctions are about to hit the market as all leading manufacturers announced embedded stt-mram production in 2018. stt-mram is positioned as a successor not only for flash, but also for cmos-based main computer memory. however, rapidly growing switching currents and power consumption might prevent stt-mram from entering cache memory currently dominated by sram. because of the large switching currents and insufficient speed, stt-mram is unlikely to replace sram in high-level core caches. innovative spin-orbit torque nonvolatile devices with improved switching characteristics and enhanced charge to spin conversion efficiency demonstrate a potential for processor-embedded memories. the successful adoption of nonvolatility in microelectronic systems by developing various logic-in-memory architectures and in-memory processing will inevitably result in increasing dissemination of this technology for other applications such as ultralow-power electronics, high-performance computing, big data analysis, automotive electronics, and the internet of things.

acknowledgement: the financial support by the austrian federal ministry for digital and economic affairs and the national foundation for research, technology and development is gratefully acknowledged.

references

[1] s.-e. thompson, m. armstrong, c. auth et al., "a 90-nm logic technology featuring strained-silicon", ieee trans. electron devices, vol. 51, 1790, 2004.
[2] k. mistry, c. allen, c.
auth et al., "a 45nm logic technology with high-k+metal gate transistors, strained silicon, 9 cu interconnect layers, 193nm dry patterning, and 100% pb-free packaging", in iedm techn. digest, 2007, pp. 247-250.
[3] s. natarajan, m. armstrong, m. bost et al., "a 32nm logic technology featuring 2nd-generation high-k + metal-gate transistors, enhanced channel strain and 0.171μm^2 sram cell size in a 291mb array", in iedm techn. digest, 2008, pp. 941-943.
[4] r. xie, p. montanini, k. akarvardar et al., "a 7nm finfet technology featuring euv patterning and dual strained high mobility channels", in iedm techn. digest, 2016, pp. 47-50.
[5] s.-y. wu, c.y. lin, m.c. chiang et al., "7nm cmos platform technology featuring 4th generation finfet transistors with a 0.027µm^2 high density 6-t sram cell for mobile soc applications", in iedm techn. digest, 2016, pp. 43-46.
[6] n. loubet, t. hook, p. montanini et al., "stacked nanosheet gate-all-around transistor to enable scaling beyond finfet", in proceedings of the symp. vlsi technology and circuits, 2017, t230.
[7] https://www.intel.com/content/www/us/en/architecture-and-technology/intel-optane-technology.html
[8] g.w. burr, r.m. shelby, a. sebastian et al., "neuromorphic computing using non-volatile memory", advances in physics x, vol. 2, 89, 2017.
[9] y. bychkov and e. rashba, "properties of a 2d electron gas with lifted spectral degeneracy", jetp lett., vol. 39, 78, 1984.
[10] s. datta and b. das, "electronic analog of the electro-optic modulator", applied physics letters, vol. 56, 665, 1990.
[11] s. sugahara and j. nitta, "spin-transistor electronics: an overview and outlook", in proceedings of the ieee, 2010, vol. 98, 2124.
[12] i. zutic, j. fabian, and s. das sarma, "spintronics: fundamentals and applications", rev. mod. phys., vol. 76, 323, 2004.
[13] j. fabian, a. matos-abiague, c. ertler, et al., "semiconductor spintronics", acta phys.
slovaca, vol. 5, 565, 2007.
[14] p. li and h. dery, "spin-orbit symmetries of conduction electrons in silicon", phys. rev. lett., vol. 107, 107203, 2011.
[15] o. chalaev, y. song, and h. dery, "suppressing the spin relaxation of electrons in silicon", phys. rev. b, vol. 95, 035204, 2017.
[16] v. sverdlov and s. selberherr, "silicon spintronics: progress and challenges", physics reports, vol. 585, 1, 2015.
[17] v. sverdlov, "strain-induced effects in advanced mosfets", springer, 2011.
[18] v. sverdlov, j. ghosh, and s. selberherr, "universal dependence of the spin lifetime in silicon films on the spin injection direction", in proceedings of the workshop on innovative devices and systems (winds), 2016, p. 7.
[19] t. tahara, h. koike, m. kameno, et al., "room-temperature operation of si spin mosfet with high on/off spin signal ratio", appl. phys. express, vol. 8, 11304, 2015.
[20] m. oltscher, f. eberle, t. kuczmik et al., "gate-tunable large magnetoresistance in an all-semiconductor spin valve device", nature communications, vol. 8, 1897, 2017.
[21] p. chuang, s.-c. ho, l.w. smith et al., "all-electric all-semiconductor spin field-effect transistors", nature nanotechnology, vol. 10, 35, 2015.
[22] e.i. rashba, "theory of electrical spin injection: tunnel contacts as a solution of the conductivity mismatch problem", phys. rev. b, vol. 62, r16267, 2000.
[23] t. tahara, y. ando, m. kameno et al., "observation of large spin accumulation voltages in nondegenerate si spin devices due to spin drift effect: experiments and theory", phys. rev. b, vol. 93, 214406, 2016.
[24] r. jansen, "silicon spintronics", nature materials, vol. 11, 400, 2012.
[25] y. song and h. dery, "magnetic-field-modulated resonant tunneling in ferromagnetic-insulator-nonmagnetic junctions", phys. rev. lett., vol. 113, 047205, 2014.
[26] z. yue, m.c. prestgard, a. tiwari, and m.e.
raikh, "resonant magnetotunneling between normal and ferromagnetic electrodes in relation to the three-terminal spin transport", phys. rev. b, vol. 91, 195316, 2015.
[27] v. sverdlov and s. selberherr, "current and shot noise at spin-dependent hopping through junctions with ferromagnetic contacts", solid-state electronics, submitted, 2018.
[28] v. sverdlov and s. selberherr, "spin correlations at hopping in magnetic structures: from tunneling magnetoresistance to single-spin transistor", in proceedings of the spie conference nanoscience+engineering, 2018.
[29] w. yan, o. txoperena, r. llopis et al., "a two-dimensional spin field-effect switch", nature communications, vol. 7, 13372, 2016.
[30] a. fert, "nobel lecture: origin, development, and future of spintronics", rev. modern phys., vol. 80, 1517, 2008; p. a. grunberg, "nobel lecture: from spin waves to giant magnetoresistance and beyond", rev. modern phys., vol. 80, 1531, 2008.
[31] s. ikeda, j. hayakawa, y. ashizawa et al., "tunnel magnetoresistance of 604% at 300 k by suppression of ta diffusion in cofeb/mgo/cofeb pseudo-spin-valves annealed at high temperature", appl. phys. lett., vol. 93, 082508, 2008.
[32] j. slonczewski, "current-driven excitation of magnetic multilayers", j. magnetism and magnetic materials, vol. 159, l1, 1996.
[33] l. berger, "emission of spin waves by a magnetic multilayer traversed by a current", phys. rev. b, vol. 54, 9353, 1996.
[34] z. diao, d. apalkov, m. pakala et al., "spin transfer switching and spin polarization in magnetic tunnel junctions with mgo and alox barriers", appl. phys. lett., vol. 87, 232502, 2005.
[35] a. makarov, v. sverdlov, d. osintsev, and s. selberherr, "reduction of switching time in pentalayer magnetic tunnel junctions with a composite-free layer", phys. stat. solidi (rrl – rapid research letters), vol. 5, pp. 420-422, 2011.
[36] a.
makarov, t. windbacher, v. sverdlov, and s. selberherr, "cmos-compatible spintronic devices: a review", semiconductor science and technology, vol. 31, 113006, 2016.
[37] s.-w. chung, t. kishi, j.w. park et al., "4gbit density stt-mram using perpendicular mtj realized with compact cell structure", in iedm techn. digest, 2016, pp. 659-662.
[38] s. ikeda, k. miura, h. yamamoto et al., "a perpendicular-anisotropy cofeb–mgo magnetic tunnel junction", nature materials, vol. 9, 721, 2010.
[39] k. watanabe, b. jinnai, s. fukami et al., "shape anisotropy revisited in single-digit nanometer magnetic tunnel junctions", nature communications, vol. 9, 663, 2018.
[40] d. apalkov, b. dieny, and j.m. slaughter, "magnetoresistive random access memory", proceedings of the ieee, vol. 104, 1796, 2016.
[41] y.j. song, j.h. lee, h.c. shin et al., "highly functional and reliable 8mb stt-mram embedded in 28nm logic", in iedm techn. digest, 2016, pp. 663-666.
[42] h. sato, m. yamanouchi, s. ikeda et al., "mgo/cofeb/ta/cofeb/mgo recording structure in magnetic tunnel junctions with perpendicular easy axis", ieee trans. magnetics, vol. 49, 4437, 2013.
[43] j. swerts, e. liu, s. couet et al., "solving the beol compatibility challenge of top-pinned magnetic tunnel junction stacks", in iedm techn. digest, 2017, pp. 866-859.
[44] https://www.everspin.com/stt-mram-products
[45] https://www.mram-info.com/tags/companies/samsung
[46] https://www.globalfoundries.com/news-events/press-releases/globalfoundries-launches-embedded-mram-22fdxr-platform
[47] g. jan, l. thomas, s. le et al., "achieving sub-ns switching of stt-mram for future embedded llc applications through improvement of nucleation and propagation switching mechanisms", in proceedings of the symp. vlsi technology and circuits, 2016, p. 18.
[48] i.m. miron, k. garello, g. gaudin et al., "perpendicular switching of a single ferromagnetic layer induced by in-plane current injection", nature, vol. 476, 189, 2011.
[49] l. liu, j. lee, t.j.
gudmundsen et al., "current-induced switching of perpendicularly magnetized magnetic layers using spin torque from the spin hall effect", phys. rev. lett., vol. 109, 096602, 2012.
[50] l. liu, c.-f. pai, y. li et al., "spin-torque switching with the giant spin hall effect of tantalum", science, vol. 336, 555, 2012.
[51] a. brataas and k.m.d. hals, "spin–orbit torques in action", nature nanotechnology, vol. 9, 86, 2014.
[52] t. taniguchi, j. grollier, and m.d. stiles, "spin-transfer torques generated by the anomalous hall effect and anisotropic magnetoresistance", phys. rev. appl., vol. 3, 044001, 2015.
[53] d. macneil, g.m. stiehl, m.h.d. guimaraes et al., "control of spin–orbit torques through crystal symmetry in wte2/ferromagnet bilayers", nature physics, vol. 13, 300, 2017.
[54] s.-w. lee and k.-j. lee, "emerging three-terminal magnetic memory devices", in proceedings of the ieee, vol. 104, 1831, 2016.
[55] k.u. demasius, t. phung, w. zhang et al., "enhanced spin–orbit torques by oxygen incorporation in tungsten films", nature communications, vol. 7, 10644, 2016.
[56] j. han, a. richardella, s.a. siddiqui et al., "room-temperature spin-orbit torque switching induced by a topological insulator", phys. rev. lett., vol. 119, 077702, 2017.
[57] y. wang, d. zhu, y. wu et al., "room temperature magnetization switching in topological insulator-ferromagnet heterostructures by spin-orbit torques", nature communications, vol. 8, 1364, 2018.
[58] d.c. mahendra, r. grassi, j.-y. chen et al., "room-temperature high spin–orbit torque due to quantum confinement in sputtered bixse(1–x) films", nature materials, vol. 17, 800, 2018.
[59] n. huynh, d. khang, y. ueda, and p.n. hai, "a conductive topological insulator with large spin hall effect for ultralow power spin–orbit torque switching", nature materials, vol. 17, 808, 2018.
[60] s. fukami, t. anekawa, c. zhang, and h. ohno, "a spin–orbit torque switching scheme with collinear magnetic easy axis and current configuration", nature nanotechnology, vol. 11, 621, 2016.
[61] g. yu, p. upadhyaya, y. fan et al., "switching of perpendicular magnetization by spin-orbit torques in the absence of external magnetic fields", nature nanotechnology, vol. 9, 548, 2014.
[62] g. yu, l.-t. chang, m. akyol et al., "current-driven perpendicular magnetization switching in ta/cofeb/[taox or mgo/taox] films with lateral structural asymmetry", appl. phys. lett., vol. 105, 102411, 2014.
[63] s. fukami, c. zhang, s. duttagupta et al., "magnetization switching by spin–orbit torque in an antiferromagnet–ferromagnet bilayer system", nature materials, vol. 15, 535, 2016.
[64] a. van den brink, g. vermijs, a. solignac et al., "field-free magnetization reversal by spin-hall effect and exchange bias", nature communications, vol. 7, 10854, 2016.
[65] y.-c. lau, d. betto, k. rode et al., "spin–orbit torque switching without an external field using interlayer exchange coupling", nature nanotechnology, vol. 11, 758, 2016.
[66] y.-w. oh, s.-h.c. baek, y.m. kim et al., "field-free switching of perpendicular magnetization through spin–orbit torque in antiferromagnet/ferromagnet/oxide structures", nature nanotechnology, vol. 11, 878, 2016.
[67] c.k. safeer, e. jué, a.
lopez et al., "spin-orbit torque magnetization switching controlled by geometry", nature nanotechnology, vol. 11, 143, 2016.
[68] a. makarov, t. windbacher, v. sverdlov, and s. selberherr, "concept of a sot-mram based on 1transistor-1mtj-cell structure", in proceedings of the conference solid state devices and materials (ssdm), 2015, pp. 140-141.
[69] v. sverdlov, a. makarov, and s. selberherr, "two-pulse sub-ns switching scheme for advanced spin-orbit torque mram", solid-state electronics, submitted, 2018.
[70] k. garello, f. yasin, s. couet et al., "sot-mram 300mm integration for low power and ultrafast embedded memories", in proceedings of the symp. vlsi technology and circuits, 2018, p. c8-2.
[71] t. hanyu, t. endoh, d. suzuki et al., "standby-power-free integrated circuits using mtj-based vlsi computing", in proceedings of the ieee, 2016, vol. 104, 1844.
[72] h. mahmoudi, t. windbacher, v. sverdlov, and s. selberherr, "rram implication logic gates", patent: international, no. wo 2014/079747 a1; patent priority number ep 12193826.0; submitted: 2013-11-13.
[73] h. mahmoudi, v. sverdlov, and s. selberherr, "mtj-based implication logic gates and circuit architecture for large-scale spintronic stateful logic systems", in proceedings of the european solid-state device research conference (essderc), 2012, pp. 254-257.
[74] a. jaiswal, a. agrawal, and k. roy, "in-situ, in-memory stateful vector logic operations based on voltage controlled magnetic anisotropy", scientific reports, vol. 8, 5738, 2018.
[75] b. behin-aein, d. datta, s. salahuddin, and s. datta, "proposal for an all-spin logic device with built-in memory", nature nanotechnology, vol. 5, 266, 2010.
[76] t. windbacher, h. mahmoudi, v. sverdlov, and s. selberherr, "spin torque magnetic integrated circuit", patent: international, no. wo 2014/154497 a1; patent priority number ep 13161375.4; submitted: 2014-03-13, granted: 2014-10-02.
[77] t. windbacher, a. makarov, v. sverdlov, and s.
selberherr, "a universal nonvolatile processing environment", in "future trends in microelectronics journey into the unknown", s. luryi, j. xu, a. zaslavsky (ed); j. wiley&sons, 2016.
[78] d. ielmini and h.-s.p. wong, "in-memory computing with resistive switching devices", nature electronics, vol. 1, 333, 2018.
[79] l.o. chua, "memristor-the missing circuit element", ieee trans. circuit theory, vol. ct-18, 507, 1971.
[80] y. ma and t. endoh, "a novel neuron circuit with nonvolatile synapses based on magnetic-tunnel-junction for high-speed pattern learning and recognition", in proceedings of the asia-pacific workshop fundam. appl. adv. semicond. devices, vol. 4b-1, 2015, pp. 273-278.
[81] https://newsroom.intel.com/editorials/intels-new-self-learning-chip-promises-accelerate-artificial-intelligence/, https://venturebeat.com/2017/05/10/nvidia-unveils-massive-ai-processing-chip-tesla-v100/

facta universitatis series: electronics and energetics vol. 29, no 4, december 2016, pp. 613-620 doi: 10.2298/fuee1604613b

advanced sample ionization method in ion mobility spectrometer

vladimir vasilievich belyakov, maksim aleksandrovich matusko, anatoly vladimirovich golovin, evgeniy anatolovich gromov
microelectronic department, national research nuclear university mephi (moscow engineering physics institute), moscow, russian federation

abstract. a new method for corona discharge ignition in an ion mobility spectrometer has been developed. it substantially improves stability and increases device resolution because controlled sample ionization causes a stable flow of ion-exchange processes.
the implemented circuit simplifies the design of the proposed ion source and its electronic control circuit. this control circuit allows a corona discharge to be formed without using any additional ignition electrodes.

key words: ion mobility spectrometer, corona discharge, ionization source

1. introduction

ion mobility spectrometry is a method for detecting and identifying vapors of chemical compounds based on separating ions by their mobility in a weak electric field in a gaseous medium at atmospheric pressure [1-5]. the conventional layout of the device (figure 1) includes the following components: an ionization chamber, where the sample is ionized; a gate, for the formation of ion clusters; a drift chamber, where ions are separated by mobility while moving in a constant electric field; a detecting node, where the ion current is measured; and a data storage and processing system. an ion cluster is formed by means of the gate from the ions in the ionization chamber. the cluster is injected into the drift chamber and moves toward the collector under the influence of the electrostatic field. ions with different mobilities reach the collector at different times, so the time structure of the collector current corresponds to the ion propagation speeds in the drift region. the resulting ion mobility spectrum makes it possible to detect and identify chemical substances contained in the gas sample. radioactive radiation, corona discharge, laser radiation, and ultraviolet or x-ray radiation [6] are used to ionize air samples in ion mobility spectrometry.

received march 29, 2015; received in revised form may 26, 2016
corresponding author: maksim aleksandrovich matusko, microelectronic department, national research nuclear university mephi (moscow engineering physics institute), moscow, russian federation (email: maksim-starcity@ya.ru)

the ionization source is an important part of the system, responsible for the stability, resolution, and sensitivity of the ion mobility spectrometer. the ionization source can operate in continuous or pulse mode. when the ionization source operates in pulse mode (pulse corona discharge, pulse laser), ion clusters are formed directly during ionization, which in some cases makes an ion gate unnecessary.

fig. 1 the spectrometric cell of ion mobility spectrometer with ionization source and electrostatic gates.

the choice of a corona discharge for air sample ionization is related to the following advantages:
• lack of radioactive materials;
• possibility of generating both positive and negative ions;
• simplicity and low cost of manufacturing;
• low power consumption.

the following physical processes occur when a corona discharge is ignited. in the air of the ion source there is always a certain number of "background" ions and electrons caused by natural radiation. when voltage is applied between the tips of the discharge electrodes with a small curvature radius, the electric field may exceed the air gap breakdown level. the presence of primary "background" charge carriers contributes to the rapid process of secondary avalanche ionization. the formation of ions of a detectable substance is a multi-stage process. initially, the so-called reactant ions are formed due to the ionization of air molecules. next, through a series of ion-molecule interactions (chemical ionization at atmospheric pressure), reactant ions transfer their charge to impurity molecules and, in particular, to the molecules of detectable substances.

2. corona discharge ionization source

the ionization source based on corona discharge consists of a 1 mm thick conductive substrate, called the ejecting electrode, and thin sharp electrodes placed over the substrate.
corona discharge plasma is formed between these electrodes at high voltage (fig. 2). a limiting resistance rlim is used to smooth the corona current.

fig. 2 corona discharge ion source schematic.

the ionization source used in this paper operates in pulse mode. as a result, short clusters of ions are formed. typically, the ionization chamber design is such that the ignition electrodes are located in high electric field areas to entrain the resulting ions from the ionization region to the gate. at the same time, the natural “background” ions (arising due to the action of cosmic radiation, fluorescence, local fluctuations and temperature) that initiate the discharge are constantly carried out of the area between the tips of the corona source, which makes the ignition of the corona difficult. this shows up as instability of the discharge and the need to increase the duration and amplitude of the ignition voltage pulse. discharge instability causes dispersion of the ion charge in the cluster and instability of the spectrum, which lead to an increased probability of malfunctioning. stabilizing the results requires averaging the spectrum over long time intervals, which increases the time needed to obtain reliable results and affects sensitivity. therefore, an important task is to create an environment in which air samples are ionized under controlled conditions so that the ignition process is stabilized. as a result, this could increase the sensitivity and reduce the malfunctioning time of the device. one way to solve this problem, proposed in [2], is to use a corona ionization source with additional ignition electrodes (fig. 3).

fig. 3 ionization source with two igniters.

the additional ignition electrodes (fig. 3) are driven by a voltage pulse generator.
the discharge is formed between them before the main ignition pulse, creating additional “catalytic ions”. these ions contribute to a stable discharge ignition between the main electrodes controlled by a high-voltage pulse generator. the use of additional electrodes thus promotes controlled and stable discharge ignition between the main electrodes in the presence of an electric field in the ionization region. a disadvantage of this ignition method is the structural complication of the ionization chamber, which increases its size and complexity and makes it harder to manufacture. the problem of unstable discharge firing between the extra igniters also remains, as the process occurs in an electrostatic field and the resulting primary carriers in the electrode gaps are continuously carried out of the discharge region. this paper considers a corona ignition system in which the ignition process is divided into two phases: preparatory and main. during the preparatory phase of ignition, the electric field in the ion source is reduced compared to the nominal level, or set to zero, which ensures the establishment of a fixed background concentration of charge carriers near the discharge gap. during this initial phase of the corona discharge, a single pulse or a series of voltage pulses is formed at the ignition electrodes, leading to an avalanche discharge between the electrodes. the ions formed remain near the ignition electrodes after the discharge terminates, because there is no electric field in the ion source. by the beginning of the main phase of the ionization, the field in the ion source is set to the nominal level, and the “catalytic ions” do not have time to leave the ignition area because of their low mobility. as in the preparatory phase, corona ignition in the main phase is carried out by a single pulse or a series of pulses.
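the remark that the “catalytic ions” do not have time to leave the ignition area can be made concrete with the standard drift relation v = K·E, where K is the ion mobility and E the electric field. the python sketch below is a rough estimate only. the mobility (2 cm^2/(V·s), a typical order of magnitude for small ions in air at atmospheric pressure), the nominal field (200 V/cm), and the delay between restoring the field and the main ignition pulse (10 μs) are illustrative assumptions that are not taken from this paper, and drift_distance_um is a hypothetical helper. diffusion is ignored.

```python
def drift_distance_um(mobility_cm2_per_vs, field_v_per_cm, time_us):
    """Distance (micrometres) an ion drifts in a uniform field.

    Uses v = K * E (drift velocity) and d = v * t.
    All arguments are illustrative inputs, not values from the paper.
    """
    velocity_cm_per_s = mobility_cm2_per_vs * field_v_per_cm
    distance_cm = velocity_cm_per_s * time_us * 1e-6   # microseconds -> seconds
    return distance_cm * 1e4                           # cm -> micrometres


if __name__ == "__main__":
    K = 2.0          # assumed ion mobility, cm^2/(V*s)
    E_NOMINAL = 200  # assumed nominal field in the ion source, V/cm
    DELAY_US = 10    # assumed delay before the main ignition pulse, microseconds

    # Preparatory phase: field set to zero, so there is no drift at all.
    print("preparatory phase:", drift_distance_um(K, 0.0, DELAY_US), "um")
    # Main phase: field restored to the nominal level shortly before ignition.
    print("main phase:", drift_distance_um(K, E_NOMINAL, DELAY_US), "um")
```

with these assumed values the ions move about 40 μm after the field is restored and not at all while the field is zero. whether that displacement is small compared with the discharge gap depends on the actual electrode geometry and timing, which would have to be substituted for the placeholder numbers.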
this ensures the stability of the corona discharge ignition due to the presence of ions left in the discharge gap since the first phase of ignition. in addition to increasing the stability, the proposed system preserves manufacturability and retains the dimensions of the existing ionization source, since no additional ignition electrodes are used. ignition pulse generation and modulation of the electric field do not require any modifications to the ionization chamber and are realized by synchronization from the ion mobility spectrometer control system.

3. corona discharge ion source control

to form the corona, a special control system generates a high voltage that is applied to the sharpened electrodes of the corona source. the second electrode in this system is a conductive substrate, the ejecting electrode. the timing diagram of the corona source is presented in fig. 4. the control system generates a sequence of several igniting pulses for each measurement cycle (fig. 4a). the duration tpulse for which high voltage is supplied to the electrodes of the ionization source, the pause between pulses tpause, and the number of pulses per measurement cycle can all be adjusted (fig. 4b). changing these timing intervals makes it possible to achieve stable corona discharge burning for the required working time and to change the number of ions generated. complete or partial filling of the ionization chamber by ions formed during one measurement cycle can be chosen.

fig. 4 timing diagram of the ion mobility spectrometer (a); sequence of k ignition pulses in one measurement cycle (b).

4. electrostatic gates

the presence of two electrostatic gates allows the field e1 in the ionization region (between the ejecting electrode and gate grid 1) and the field e2 in the gate region (between gate grids 1 and 2) to be changed.
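the ignition timing described in section 3 (k pulses of duration tpulse separated by pauses tpause within one measurement cycle) can be illustrated with a minimal sketch. the function names and the numerical values below are hypothetical and are not taken from the paper; times are assumed to be in microseconds.

```python
def ignition_pulse_edges(k, t_pulse, t_pause, t_start=0):
    """Return (on, off) times of k ignition pulses in one measurement cycle.

    k, t_pulse and t_pause mirror the adjustable parameters named in the text;
    t_start is a hypothetical offset of the first pulse within the cycle.
    """
    if k < 1:
        raise ValueError("at least one ignition pulse is required")
    edges = []
    t = t_start
    for _ in range(k):
        edges.append((t, t + t_pulse))   # high voltage applied during this window
        t += t_pulse + t_pause           # wait for the pause before the next pulse
    return edges


def burst_duration(k, t_pulse, t_pause):
    """Total length of the pulse train: k pulses and (k - 1) pauses between them."""
    return k * t_pulse + (k - 1) * t_pause


# assumed example values (microseconds), chosen only for illustration
edges = ignition_pulse_edges(k=3, t_pulse=10, t_pause=40)
print(edges)
print(burst_duration(3, 10, 40))
```

increasing k or tpulse in such a model lengthens the time during which the discharge can burn, which corresponds to the adjustment of the number of generated ions mentioned above.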
changing the field e1 allows control of the duration of the ion-molecule reactions between the reactant ions and the molecules of the detectable substance after ionization. next, ions of the analyzed substances are injected into the gate region. injection takes place because the potential applied to the ejecting electrode becomes greater than the potential applied to the first gate grid, so an electric field arises in the ionization region and ejects ions into the gate region. the efficiency of the electrostatic gates depends on the mode provided by the electronic control system. when the device is switched on, the initial electrostatic potentials at the gates are established. let us denote the first grid potential as uc1 and the ejecting electrode potential as ub. the operation of the gates during one measurement cycle is divided into seven successive phases (fig. 5):

phase 1 – the field e1 is turned on in the ionization region, which corresponds to the beginning of the measurement cycle preparation.
phase 2 – consists of two parts. during the first part, pre-ionization occurs at the reduced field e1. during the second part, corona ionization takes place; at the same time the direction or magnitude of the fields in the ionization and gate regions is restored. the ions formed during ionization are moved by the field e1 from the ejecting electrode to the gate region.
phase 3 – the potential of the first gate grid becomes equal to the ejecting electrode potential (uc1 = ub). the field in the ionization region becomes zero and the ions are stopped. ion-molecule reactions take place in the stopped cluster, between the molecules of the examined substance and the reactant ions generated during ionization.
phase 4 – the cluster is moved to the gate region.
phase 5 – ions are injected into the drift region. the field e1 in the ionization chamber and the field e2 in the gate region inject the ions into the drift chamber. at the same time, a thin ion bunch can be “cut off” by the second electrostatic gate. this significantly increases the resolution of the spectrometer.
phase 6 – the direction of the field e2 is reversed, so the electrostatic gate is closed. ions left in the ionization chamber are moved by the field e1 to gate grid 1 and are captured by it.
phase 7 – the gates are restored to their original state. there is no field in the ionization region, and there is a stopping field e2 in the gate region. the system is prepared for the next measurement cycle.

fig. 5 ionization and gate region field distribution during the electrostatic gates functioning.

5. practical experiment

an experiment was performed with (fig. 6a) and without (fig. 6b) the advanced ignition method. the spectrum height represents the total ion charge (normally, about 400 nc) in the drift tube. the charge deviation between two consecutive spectra is not more than 3% with the advanced method (fig. 6a) and up to 31% without it (fig. 6b).

fig. 6 experimental results with (a) and without (b) using an advanced ignition method.

the presented electrostatic gate control circuit makes it possible to increase or decrease the duration of the ion-molecule reactions and to vary the number of ions in the drift chamber and, ultimately, at the collector. this opens up the possibility of improving spectrometer performance, sensitivity and resolution. as the duration of phase 3 increases, charge transfer from the reactant ions to the molecules of the detected substances that have a greater proton affinity (in the positive mode) or electron affinity (in the negative mode) becomes more complete. this happens because more time is available for ion-molecule reactions.

references
[1] y. p.
raizer, j. e. allen, v. i. kisin, gas discharge physics, springer, 1991.
[2] discharge ionization source, patent no. us 6407382, date of patent: 06/18/2002.
[3] t. w. carr, plasma chromatography, plenum press, new york, 1984.
[4] g. a. eiceman, z. karpas, ion mobility spectrometry, crc press, boca raton, 1993.
[5] m. tabrizchi, a. abedi, “a novel electron source for negative ion mobility spectrometry”, international journal of mass spectrometry, vol. 218, pp. 75-85, 2002.
[6] v. s. pershenkov, a. d. tremasov, v. v. belyakov, a. u. razvalyaev, v. s. mochkin, “x-ray ion mobility spectrometer”, microelectronics reliability, vol. 46, issue 2-4, pp. 641-644, february 2006.

facta universitatis series: electronics and energetics vol. 35, no 1, march 2022, pp. 61-70
https://doi.org/10.2298/fuee2201061g
© 2022 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper

realization of a variable resolution modified semiflash adc based on bit segmentation scheme

pranati ghoshal1, chanchal dey2, sunit kumar sen2
1 dept. of applied electronics and instrumentation engineering, techno main salt lake, kolkata, india
2 instrumentation engineering, dept. of applied physics, university college of technology, 92, apc road, kolkata, india

abstract. a modified variable resolution semiflash adc, based on a ‘bit segmentation scheme’, is presented. its speed is identical to that of a normal flash adc, while its comparator count is much lower. an 8-bit adc has 256 different bit combinations. for sixteen consecutive bit combinations, beginning with the first one, the four msb bits remain unaltered. this continues in the same way until the last group of sixteen combinations. in the designed circuit, the four msb and four lsb bits are determined in the first and second halves of the clock cycle respectively. following the same logic, the bits of a 16-bit adc can be found in only two clock cycles by employing only fifteen comparators.
this implies that a higher resolution adc can easily be realized with low power and small die area. the design is tested in psim professional 9 for an 8-bit adc, and error curves are drawn to establish the validity of the proposal.

key words: bit segmentation scheme (bss), bit swap logic (bsl), least significant bit (lsb), semiflash adc, half flash adc, modified full flash adc (mffadc)

received september 7, 2021; received november 8, 2021
corresponding author: pranati ghoshal, dept. of applied electronics and instrumentation engineering, techno main salt lake, kolkata 700091 (india), e-mail: pranati991@gmail.com
* an earlier version of this paper was presented at the 4th international conference on 2021 devices for integrated circuit (devic 2021), may 19-20, 2021, in kalyani, west bengal, india [1].

1. introduction

different types of adc architectures are used to meet various application needs such as resolution, accuracy and speed. a flash or parallel adc is the fastest among the various types of adcs available in the market. the comparator count of a flash adc increases exponentially with resolution, which limits the use of flash adcs to about 8-bit resolution. over the years, researchers have designed various versions of flash adcs such as half flash, semi flash, simplified half flash, multi-step flash etc. these types have their own advantages and disadvantages compared to a flash or full flash adc. a modified 8-bit semiflash adc, using only fifteen comparators, is reported in [1], whose speed is the same as that of a full flash adc. since only fifteen comparators are required, its power and die area requirements are significantly reduced. in [2], [3] hybrid flash-sar adc architectures were used.
in the former, a sampling switch was used to reduce the settling time of the dac and parallel capacitors were used to reduce high frequency noise jitter, while in the latter a segmented split capacitor charge redistribution dac was employed to achieve less area and power and higher speed. a two-step 8-bit flash adc and an 8-bit semiflash adc, both of which used 15 comparators and a charge redistribution technique, were reported in [4], [5]. a bit swap logic (bsl) based bubble error correction (bec) technique was applied in [6], while in [7] flash adc performance was evaluated in the presence of offset using a hot code generator and bit swap logic (bsl). in [8], an encoder with reduced power was used for a threshold inverter quantization based flash adc. a statistically-driven two-step flash sub-adc was constructed in [9], with applications in high-speed time-interleaved adcs for wireline communications. in [10], 8-bit and 10-bit adcs were designed using fewer comparators and resistors, which resulted in less area and power consumption. in [11], [12] a low power 4-bit flash adc and a 9-bit two-step flash adc were respectively realized using standard cells. in [13], a flash adc was presented which increased the input dynamic range of the adc by using 5-input logic gates. a simplified half flash adc, having the same speed as that of a flash adc but with reduced comparator count and less die area, was designed in [14]. a two-step flash adc which can be used in communication fields was reported in [15]. ota based comparators were used to realize a 3-bit high speed flash adc [16] with applications in wireless lan. active data and clock distribution trees [17] were used to realize a 4-bit flash adc. bubble errors were taken care of in [18] by using a low power wallace tree encoder for flash adcs. a low power fault resistant flash adc which finds applications in instrumentation fields was reported in [19].
a 6-bit low power flash adc which used an online offset cancellation technique was reported in [20].

2. previous architectures

the relationship between the resolution $N$ and the number of comparators $N_c$ for a flash adc [10], [14] is given by

$$N_c(N) = 2^{N} - 1 \qquad (1)$$

from (1), it is evident that the number of comparators increases exponentially [10] with resolution. a two-step or half flash adc [4,9] resolves the bits in two clock cycles. for a half flash adc, comparator count and resolution are related by

$$N_c(N) = 2 \cdot 2^{N/2} - 2 \qquad (2)$$

a semiflash adc [5] is very simple in nature and consumes less power and die area. an 8-bit semiflash adc uses only 15 comparators for both fine and coarse conversions, leading to a drastic reduction in the number of comparators used. the number of comparators for a semiflash adc follows the relationship [5]

$$N_c(N) = 2^{N/2} - 1 \qquad (3)$$

a simplified half flash adc [14] employs a voltage estimator (ve) and a modified full flash adc (mffadc). its power and die area requirements are very small and its speed is almost thrice that of a normal half flash adc. the number of comparators required for a simplified half flash adc is given by [14]

$$N_c(N) = 2^{(N/2 - 2)} + 2 \qquad (4)$$

the speed of a multistep 10-bit adc [10] is identical to that of a conventional half flash adc. its die area and power needs are low due to the small number of comparators required. the number of comparators needed in the case of [10] is

$$N_c(N) = 2^{(N/2 - 1)} \qquad (5)$$

3. the bit segmentation scheme (bss)

a normal flash or full flash adc, as it is called, has an exponential relationship between the number of comparators used and the resolution. thus, with increasing resolution, the number of comparators needed for such an adc becomes unmanageable. for instance, for a 16-bit flash adc, the number of comparators needed is $2^{16} - 1 = 65535$.
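as a quick numerical check of relationships (1)-(5), the comparator counts can be tabulated for 8-bit and 16-bit resolutions. the sketch below is only an illustration: the function names are hypothetical, and an even resolution n is assumed so that n/2 is an integer.

```python
def flash(n):
    """Eq. (1): full flash ADC."""
    return 2**n - 1


def half_flash(n):
    """Eq. (2): two-step / half flash ADC."""
    return 2 * 2**(n // 2) - 2


def semiflash(n):
    """Eq. (3): semiflash ADC."""
    return 2**(n // 2) - 1


def simplified_half_flash(n):
    """Eq. (4): simplified half flash ADC."""
    return 2**(n // 2 - 2) + 2


def multistep(n):
    """Eq. (5): multistep ADC."""
    return 2**(n // 2 - 1)


architectures = {
    "flash": flash,
    "half flash": half_flash,
    "semiflash": semiflash,
    "simplified half flash": simplified_half_flash,
    "multistep": multistep,
}

for n in (8, 16):  # assumes even resolutions
    for name, count in architectures.items():
        print(f"N={n:2d}  {name:22s} {count(n)}")
```

for n = 8 these expressions evaluate to 255, 30, 15, 6 and 8 comparators respectively, and eq. (1) gives 65535 for n = 16, in line with the figure quoted in the text.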
in the present case, a modified 8-bit semiflash adc based on bss is presented. the designed modified 8-bit semiflash adc determines the 8 bits in a single clock cycle, which implies that its speed is the same as that of a normal flash adc. an 8-bit flash adc has 256 bit combinations. the combinations are segregated into sixteen fields as shown in fig. 1. it is observed from the figure that the four msb bits in each field are the same. as an example, in field 2 of the figure, the four msb bits 0001 remain unchanged. the designed circuit identifies these four msb bits in the first half of the clock cycle, i.e., during this time the particular field to which the unknown analog signal belongs is identified. the remaining four bits (lsb bits) in the field are determined in the second half of the clock cycle. thus, for the designed circuit, all 8 bits are determined in a single clock cycle, implying that its speed is identical to that of a normal flash adc. also, as will be seen, only fifteen comparators are required to evaluate the 8 bits. by the same logic, a 16-bit adc of identical architecture would require only two clock cycles, while the number of comparators needed for evaluation of the 16 bits would still remain at 15. thus, the number of comparators would be reduced by a factor of 65535/15 = 4369 compared to a normal flash adc, and both power and die area would be drastically reduced.

fig. 1 the sixteen fields for an 8-bit adc

4. realizing the modified 8-bit semiflash adc based on bss

fig. 2a) below shows the details of the designed modified 8-bit semiflash adc while fig. 2b) shows its timing diagram. the explanation of the designed circuit is given below. the bottom left part of fig. 2a) shows the manner of generation of pulses c1, c2 and c11. c1 and c2 remain active during the first and second half of the clock respectively, while c11 is a delayed version of c1.
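the bit segmentation idea (find the field, i.e. the four msb bits, first, then resolve the four lsb bits inside that field with the same fifteen comparators) can be sketched as an idealized behavioural model. this is not the authors' circuit: the function name is hypothetical, the comparators are assumed ideal, and a uniform quantization into 256 steps of vref/256 is assumed, which may differ from the scaling used in the simulated design.

```python
def bss_convert(vin, vref, comparators=15):
    """Idealized two-phase conversion reusing one bank of comparators.

    Phase 1 (first half of the clock): the ladder spans 0..vref and the
    comparator bank identifies the field, giving the 4 MSB bits.
    Phase 2 (second half of the clock): the ladder is re-referenced to the
    detected field and the same bank resolves the 4 LSB bits.
    """
    levels = comparators + 1                      # 16 fields, 16 steps per field

    # phase 1: coarse thresholds at k * vref / 16, k = 1..15
    coarse_step = vref / levels
    msb = sum(vin >= k * coarse_step for k in range(1, levels))

    # phase 2: fine thresholds inside the detected field
    field_bottom = msb * coarse_step              # voltage of the 4 MSB bits
    fine_step = coarse_step / levels
    lsb = sum(vin >= field_bottom + k * fine_step for k in range(1, levels))

    return msb * levels + lsb                     # 8-bit code


# assumed example: 2.8 V input with a 4 V reference
code = bss_convert(2.8, 4.0)
print(format(code, "08b"))
```

in this model each phase counts how many of the fifteen thresholds lie at or below the input, which plays the role of the priority encoder output; cascading a further phase would add four more bits without adding comparators.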
vr and vin are respectively the reference voltage and the unknown analog input voltage applied during the first half of the clock, while during the second half of the clock they are replaced by ‘new vr’ and ‘new vin’. also, during the second half of the clock, the bottom of the ladder is fed with the voltage available at point ‘p’ shown in fig. 2a). this voltage represents the voltage value of the four msb bits corresponding to the particular field detected during the first half of the clock.

fig. 2 a) realization of a modified 8-bit semiflash adc b) timing diagram for realization of different pulses

during the first half of the clock, pulse c1 makes sure that the switches sw1, sw3 and sw5 remain closed. the positive inputs of the fifteen comparators are all connected to vin while their negative inputs are connected to voltages from the ladder as shown in the figure. the fifteen outputs o15-o1 from the comparators, along with o0 (which is always 1), are fed to a 16-to-4 priority encoder. its four outputs are connected to two sets of and gates. one set of four and gates is controlled by c1 and the other set by c2. during the first half of the clock, since c1 is active, the encoder outputs are latched by latch 1. these four bits are stored in the 1-byte register as bits d7-d4. they are also anded with the c’1 pulse. the outputs of the anding operation act as control inputs to switches sw7-sw10. the inputs to sw7-sw10 are respectively connected to bits d7-d4, represented by va, vb, vc and vd as shown in the figure. the outputs of the switches act as inputs to the summer s1(1). the output of summer s1(1) represents the sum of the weights of the four bits from the msb side of the input analog signal. for example, if during c1 the output from the priority encoder is 1011, the output of summer s1(1) will represent a value equal to 1011.
summer s1(2) sums the output of s1(1) with 0000 1111. thus, the output of summer s1(2) corresponds to the highest value within the field to which the analog signal belongs. this acts as the ‘new vr’, which becomes effective from the beginning of c2. s2 is a subtractor that subtracts s1(1) from the original vin. the result acts as the ‘new vin’ value. at the beginning of c2, switches sw2, sw4 and sw6 become active. thus, during this time, ‘new vr’ and ‘new vin’ come into the circuit, replacing vr and vin respectively. the top of the ladder is fed with a voltage which represents the highest value of the analog signal corresponding to the field in which the unknown analog signal belongs, while the lower end of the ladder is supplied with a voltage that corresponds to the analog voltage of the four msb bits to which the unknown signal belongs. the circuit behaves in the same manner as during c1, with the priority encoder outputting a new set of four bits. these four bits are latched by latch 2, since it is active during c2. the outputs of latch 2 are then stored in bits d3-d0 of the 1-byte register. the circuit is then reset so that it can accept the next analog signal. psim professional 9 is used to simulate the circuit of fig. 2a).

fig. 3 simulation of the circuit shown in fig. 2a by psim p9

5. experimental results

tables 1-5 show the input voltages and their corresponding outputs for reference voltages from 5v down to 1v, in steps of 1v. an interval of 0.25v is maintained between successive readings in each table. each output voltage shown in a table is the average of the output voltages obtained for increasing and decreasing input voltages. error curves have been drawn for all reference voltage levels.

table 1 i/p vs. o/p for ref. voltage = 5v
sr. no.  i/p (v)  bit pattern  o/p (v)  % error
1        0.0      0000 0000    0.0      0.0
2        0.25     0000 1101    0.2549   1.96
3        0.5      0001 1001    0.4901   1.98
4        0.75     0010 0110    0.7451   0.65
5        1.0      0011 0100    1.0196   -1.96
6        1.25     0011 1111    1.2353   1.18
7        1.5      0100 1101    1.5097   -0.65
8        1.75     0101 1011    1.7647   -0.84
9        2.0      0110 0110    1.9804   0.98
10       2.25     0111 0011    2.2549   -0.22
11       2.5      0111 1111    2.4902   0.39
12       2.75     1000 1111    2.7647   -0.53
13       3.0      1001 1101    3.0784   -2.6
14       3.25     1010 1011    3.2745   -0.75
15       3.5      1011 0001    3.4706   0.84
16       3.75     1011 1110    3.7255   0.65
17       4.0      1100 1111    3.9804   0.49
18       4.25     1101 1010    4.2745   -0.58
19       4.5      1110 0111    4.5294   -0.65
20       4.75     1111 0110    4.7451   0.10
21       5.0      1111 1111    5.0      0.0

table 2 i/p vs. o/p for ref. voltage = 4v
sr. no.  i/p (v)  bit pattern  o/p (v)  % error
1        0.0      0000 0000    0.0      0.0
2        0.25     0000 1111    0.2510   -0.4
3        0.5      0010 0001    0.5020   -0.4
4        0.75     0011 0001    0.7529   -0.39
5        1.0      0100 0001    1.0196   -1.96
6        1.25     0101 0010    1.2549   -0.39
7        1.5      0110 0000    1.506    -0.4
8        1.75     0111 0000    1.7569   -0.39
9        2.0      0111 1111    1.9922   0.39
10       2.25     1000 1111    2.2431   0.31
11       2.5      1001 1111    2.4941   0.24
12       2.75     1010 1111    2.7451   0.18
13       3.0      1011 1111    2.9961   0.13
14       3.25     1100 1111    3.2471   0.09
15       3.5      1101 1111    3.4980   0.06
16       3.75     1110 1111    3.7490   0.03
17       4        1111 1111    4.0      0.0

table 3 i/p vs. o/p for ref. voltage = 3v
sr. no.  i/p (v)  bit pattern  o/p (v)  % error
1        0.0      0000 0000    0.0      0.0
2        0.25     0001 0011    0.2471   1.16
3        0.5      0010 1011    0.5059   -1.18
4        0.75     0011 1111    0.7412   1.18
5        1.0      0101 0110    1.0118   -1.18
6        1.25     0110 1011    1.2588   -0.70
7        1.5      0111 1111    1.4941   0.39
8        1.75     1001 0111    1.7765   -1.5
9        2.0      1010 1011    2.0118   -0.59
10       2.25     1011 1111    2.2471   0.13
11       2.5      1101 0111    2.5294   -1.18
12       2.75     1110 1111    2.7647   -0.53
13       3.0      1111 1111    3.0      0.0

table 4 i/p vs. o/p for ref. voltage = 2v
sr. no.  i/p (v)  bit pattern  o/p (v)  % error
1        0.0      0000 0000    0.0      0.0
2        0.25     0001 1111    0.2432   2.72
3        0.5      0011 1111    0.4941   1.18
4        0.75     0101 1111    0.7451   0.65
5        1.0      0111 1111    0.9961   0.39
6        1.25     1001 1111    1.2471   0.23
7        1.5      1011 1111    1.4980   0.13
8        1.75     1101 1111    1.7490   0.06
9        2.0      1111 1111    2.0      0.0

table 5 i/p vs. o/p for ref. voltage = 1v
sr. no.  i/p (v)  bit pattern  o/p (v)  % error
1        0.0      0000 0000    0.0      0.0
2        0.25     0011 1111    0.245    2.0
3        0.5      0111 1111    0.495    1.0
4        0.75     1011 1101    0.74     1.33
5        1.0      1111 1111    1.0      0.0

6. error curves

fig. 4 shows five error curves for reference voltages from 5v down to 1v. it is observed that, for any curve, the error percentage is higher for low values of the input analog signal, which is only to be expected. series 1 corresponds to a reference voltage of 5v while series 5 corresponds to 1v.

fig. 4 error curves for different input reference voltages

7. comparison between different architectures

the number of comparators and clock cycles needed by different architectures, including the proposed one, is shown in fig. 5a) for 8 and 16-bit resolutions, while the plot of comparator count versus resolution is depicted in fig. 5b). the architectures compared in both figures are flash or full flash, half flash, semi flash, simplified half flash, multi-step and the proposed one. for 8-bit resolution, the number of comparators in the proposed scheme is slightly higher than in the simplified half flash and multi-step types. for 16-bit resolution, however, the proposed scheme requires fewer comparators than the other architectures. in terms of the number of clock cycles required for an 8-bit adc, the proposed architecture requires the same number of clock cycles as a flash adc and fewer than the other architectures.
for a 16-bit adc, the proposed scheme is either better (than simplified half flash) or the same as the other architectures, barring the full flash adc, which requires only one clock cycle.

fig. 5 a) number of comparators and clock cycles needed for different architectures b) a plot of comparator count vs. adc resolution for different architectures

8. conclusion

a variable resolution modified semiflash adc design is presented in this paper. it achieves variable resolution merely by changing the required number of clock cycles. the design of a modified 8-bit semiflash adc, based on bss, is presented in hardware and simulated in psim p9. the idea of the presented technique is in a way based on flash adc design. higher order adcs can be realized based on the presented technique. as an example, a 16-bit adc can be designed and realized in only two clock cycles using only fifteen comparators. extending the concept, a 24-bit adc would require only three clock cycles. thus, high resolution adcs with very high speeds can be designed. since the designed circuit requires a drastically reduced number of comparators, it would require very low power and die area. the proposed design thus eliminates the exponential increase in the number of comparators with increasing resolution that occurs in a flash adc. in the designed circuit, both the reference voltage to the ladder and the input signal are changed in the latter half of the clock cycle. it differs from pipeline or two-step adcs, where only one or two bits are analyzed in a single clock cycle, in that the presented method is based on bss. in bss, the whole bit pattern is divided into fields, as explained in the text.

references
[1] p. ghoshal, c. dey and s. k.
sen, "design of a combinational modified 8-bit semiflash analog to digital converter", in proceedings of the 4th international conference on devices for integrated circuits (devic), 2021, kalyani, india, pp. 1–5.
[2] a. razzaq and s. m. chaudhry, "a 15-bit 85 ms/s hybrid flash-sar adc in 90-nm cmos", circuits, syst. signal process., vol. 37, pp. 1452–1478, aug. 2017. https://link.springer.com/article/10.1007/s00034-017-0629-z
[3] b. d. kumar, s. k. pandey, n. gupta and h. shrimali, "design of hybrid flash-sar adc using an inverter based comparator in 28 nm cmos", microelectron. j., vol. 95, p. 104666, jan. 2020.
[4] a. cremonesi, f. maloberti, g. torelli and c. vacchi, "an 8-bit two step flash a/d converter for video application", in proceedings of the ieee custom integrated circuits conference (cicc), pp. 6.3/1–6.3/4, 1989.
[5] d. p. dimitrov and t. k. vasileva, "eight-bit semi flash a/d converter", hindawi publishing corporation, vlsi design, vol. 2007, pp. 1–7, 2007.
[6] p. ghoshal and s. k. sen, "a bit swap logic (bsl) based bubble error correction (bec) method for flash adcs", in proceedings of the international conference on control, instrumentation, energy and communication, ciec, 2016, pp. 111–115.
[7] p. ghoshal and s. k. sen, "performance evaluation of flash adcs in presence of offsets using hot code generator and bit swap logic (bsl)", in proceedings of the international conference on industry interactive innovations in science, engineering and technology, i3set-2016, springer-lnns, published in industry interactive innovations in science, engineering and technology, 2016, pp. 435–445.
[8] m. gurjar and s. akashe, "design low power encoder for threshold inverter quantization based flash adc converter", int. j. vlsi des. commun. syst., vol. 4, no. 2, pp. 83–90, apr. 2013.
[9] d. liu, l. he, f. lin, t. li and y.-k. chou, "a time interleaved statistically-driven two step flash adc for high-speed wire line applications", j. circ. syst. comput., vol. 26, no. 7, p. 1750118, july 2017.
[10] m. k. mayes and s. w. chin, "a multi step a/d converter family with efficient architecture", ieee j. solid state circ., vol. 24, no. 6, pp. 1492–1497, dec. 1989.
[11] m. s. njinowa, h. t. bui and f-r. boyer, "design of low power 4-bit flash adc based on standard cells", in proceedings of ieee 11th international new circuits and systems conference, (newcas), 2013, pp. 1–4.
[12] e. rahul, r. k. siddharth, v. sharma, m. h. vasantha and y. b. nitin kumar, "two-step flash adc using standard cell based flash adcs", in proceedings of the ieee international symposium on smart electronic systems, 2019, pp. 292–295.
[13] r. k. siddharth, k. y. b. nithin and m. h vasantha, "design of low power 5-bit hybrid flash adc", in proceedings of the ieee computer society annual symposium on vlsi (isvlsi), 2016, pp. 585–588.
[14] p. b. y. tan, a. v. kordesch and o. sidek, "simplified half flash cmos analog to digital converter", in proceedings of the nsti-nanotech 2004, pp. 191–194.
[15] d. liu, l. he, f. lin, t. li and y-k. chou, "a time interleaved statistically driven two step flash adc for high speed wireless applications", j. circ. syst. comput., vol. 26, no. 7, p. 1750118, july 2017.
[16] m. n. a. bajg and r. ranjan, "design and implementation of 3-bit high speed flash adc for wireless lan applications", int. j. adv. res. comput. commun. eng., vol. 6, no. 3, pp. 428–433, march 2017.
[17] s. shahramian, s. p. voinigescu and a. c. carusone, "a 35-gs/s, 4-bit flash adc with active data and clock distribution trees", ieee j. solid-state circ., vol. 44, no. 6, pp. 1709–1720, june 2009.
[18] m. p. ajanya and g. t. varghese, "low power wallace tree encoder for flash adc", iop conference series: materials science and engineering, vol. 396, 2018, p. 012042.
[19] g. prativa and m. santhi, "design of low power fault tolerant flash adc for instrumentation applications", microelectron. j., vol. 98, p. 104739, apr. 2020.
[20] a. amini, a. baradararanrezaeii and m. hassanzadazar, "a novel online offset cancellation mechanism in a low power 6-bit 2gs/s flash adc", analog integr. circuits signal process., vol. 99, no. 2, pp. 219–229, may 2019.

facta universitatis series: electronics and energetics vol. 35, no 1, march 2022, pp. 107-120
https://doi.org/10.2298/fuee2201107j
© 2022 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper

active inductor based low phase noise voltage controlled oscillator

shailesh jakodiya, ram chandra gurjar, radhe shyam gamad
department of electronics and instrumentation engineering, shri g. s. institute of technology and science indore, india

abstract. this paper proposes a fully mos-based voltage-controlled oscillator (vco) with an acceptable tuning range and low phase noise, in which the commonly used inductor-capacitor tank of the nmos cross-coupled topology is replaced with a high-q active inductor. the study focuses on vco design using a mos-based active inductor, implemented and verified in umc 180nm cmos technology. the proposed vco is resistorless and consists of an active inductor, two mos capacitors, and buffer circuits. the fundamental principle of this mos-based vco is to replace the passive inductor with a mos-based active inductor, which occupies less area and uses less power. at 1 mhz frequency offset, the phase noise achieved by the proposed configuration is -102.78dbc/hz. the frequency tuning range of the proposed vco architecture is 0.5ghz to 1.7ghz. this tuning range is obtained by varying the control voltage from 0.7v to 1.8v. the proposed vco design has a power consumption of 9mw with a 1.8v supply voltage.
the suggested vco has been shown to be a good fit for low-power rf circuit applications while preserving acceptable performance metrics.

key words: active inductor, vco, frequency tuning range, phase noise, power consumption

received june 05, 2021; received in revised form january 23, 2022
corresponding author: shailesh jakodiya, department of electronics and instrumentation engineering, shri g. s. institute of technology and science indore, india, e-mail: ershadi@iaukhsh.ac.ir

1. introduction

wireless transceivers are commonly used in several domains such as healthcare, military, telecommunications, radar equipment, etc. [1]. in the exponential growth of transmission technologies such as wireless and their diverse uses, integrated circuits based on cmos technology play a very significant role. the vco has its benefits in analog signal applications [2]. low power, high speed, and minimal area are the reasons for selecting cmos technology. the lc-vco with a wide frequency tuning range is the most critical rf building block in modern communication systems. in wireless communication systems, voltage-controlled oscillators determine critical performance [3-5]. the reduction of phase noise is addressed at different levels: technology, construction, and overall visualization [5]. inductive characteristics are needed for high-speed applications. in building high-speed transceivers, the efficiency of the inductor plays an essential part, and it relies on several specifications and relevant parameters. a ring oscillator, an oscillator built around an lc component, or a mos-based vco can be used as a vco. with positive feedback, the ring oscillator offers a large frequency tuning range, but its disadvantage is poor phase noise performance; hence it is not suitable for many communication fields [2,6].
the topology of gyrator-c is easy and convenient in achieving the mos-based inductor configuration. two back-to-back transconductors, one of which is connected to a capacitor, make up the primary active inductor [7]. a high-quality (q) feature is present in the above structure of the active inductor [2]. in comparison to a passive inductor, using an active inductor throughout a vco increases noise in the entire circuit. when compared to a passive inductor, this active inductance created greater phase noise. the composite trans-conductor vco is employed to make the most out of this devastating challenge because it exhibits severe phase noise, low power consumption, and chip area reduction. for convenience, the present bias is considered to be independent of vdd [8]. to boost the output amplitude of the oscillation, an external buffer circuit is involved on both sides of the vco output node. a new circuit architecture for a vco using an active inductor is suggested in this study. a unique active inductor is presented in this work to enhance the overall parameter of the lc vco, such as tunning range, power consumption, and phase noise, at the same time. the recommended research's primary focus is on making the active inductor adjustable by varying the inductor's inductance parameters without causing any physical changes, which is not feasible with spiral inductors [8]—also, making mos varactor capacitance adjustable by varying the control voltage. there are separate sections in this proposed paper of vco. the entire system architecture is discussed in section ii, meaning the function of the inductor topology and the design of the vco and the principle, and section iii show the circuit design based on the active inductor. section iv describes the simulation results of the proposed vco using the cadence environment. 2. architecture for active inductor 2.1 gyrator topology gyrator-c topology is employed in the creation of an active inductor. 
the functional topology of the gyrator-c based inductor is shown in figure 1. forward and feedback transconductors are the basic building blocks of an active inductor. if a gyrator port is attached to a capacitor, the whole circuit is called the gyrator-c topology, and connecting the capacitor makes the entire network behave inductively. the resulting inductance is inversely related to the transconductances of the two transconductors [2,9,10,11]. the gyrator may be used as a dual-mode system. the same source has a trans-conductance that is negative. the great benefit of a differential topology is that it helps suppress circuit noise and common-mode interference. similarly, it means that the current flows out of the transconductor while the voltage is positive. by using the feedback loop, the simple inductance equations are as follows: (1) (2) (3)
fig. 1 gyrator-c topology
fig. 2 single-ended type active inductor
the gyrator converts the total capacitance into an inductance at the input terminal vin. figure 2 shows the most often used inductor [1], which is composed of two transistors designed to act as an impedance-inverting network [12,13]. with vt the transistor threshold voltage, vsat the saturation voltage, and vod the transistor overdrive voltage, the voltage swing on the input side is limited to the range between vt+vod and vdd-vsat-vod-vt. this characteristic is used to calculate the theoretical low power consumption of the active inductor-based voltage-controlled oscillator. on the other hand, because the current flows in the non-inductor part of the circuit within a limited range, the oscillator's rf output power can be calculated accurately. cascode topology is used to increase the quality factor and, as expected, to minimize the overall series resistance; however, a limited voltage swing can impair the energy flow. that is why we move toward a differential active inductor topology. it has two significant benefits over similar circuits [13]: 1) the circuit can reject common-mode interference and suppress even-order harmonics, like any other differential topology; 2) the voltage swing of the differential active inductor is twice that of the single-ended active inductor. figures 3 and 4 show a differential active inductor and an equivalent active inductor circuit, respectively [13].
fig. 3 differential active inductor
fig. 4 equivalent circuit for active inductor
3. vco circuit design
3.1 conventional lc-vco
the primary circuit of the traditional vco is shown in figure-5. nmos transistors are used in the cross-coupled pair [1]. the oscillation frequency is set by the lc tank, while the cross-coupled pair provides the negative resistance needed to start and sustain the oscillation [1,14-19]. in a passive lc-vco, the passive inductor alone occupies 75-80 percent of the entire vco chip area.
fig. 5 conventional lc-vco
the gyrator-c topology is used to create the active inductor in the suggested circuit design. the formula below may be used to calculate the absolute inductance in the gyrator-c structure: l = (4)
fig. 6 vco design using active inductor
fig. 7 equivalent circuit representation of proposed vco
the equivalent equations are shown below [1] (5) (6) (7)
the proposed active inductor based vco design is a fully mos-based structure. the lc tank circuit is loaded at the drain terminal, and the differential arrangement is formed by a pair of lc tanks [7].
these nmoss, which give 1800 phase shift as they are in typical source configuration, also present another criterion relating the overall phase shift, which is that it should be 00 or 3600 [7]. the capacitor is often replaced by a variable capacitor based on mos. a schematic diagram of the proposed vco architecture using an inductor based on mos is shown in figure-6, and its equivalent circuit representation is shown in figure 7 [20]. a mos-dependent inductor substitutes the passive inductor in a typical lc-vco, and then the passive capacitor is also replaced by a supplied mos-based capacitor that is a mos varactor. here, additional buffer circuits are attached to increase the amplitude of oscillation. n1 and n2 mos transistors are used as voltage buffers in figure-6 and are both related in the configuration of the common mode. the equivalent circuit of the proposed vco is shown in figure 7. as a capacitor with voltage control, mos varactor are employed. when the drain, source, and bulk of mos varactor are linked, they behave as a device like a capacitor having c capacitance [20]. the values of the various components utilized to construct a mos-based vco are shown in table 1. table 1 circuit parameters of the designed vco components device size/ values m1, m2, m3, m4 50µm/0.18µm n1, n2, n9, n10, n11, n12 100µm/0.18µm n3, n4, n5 50µm/0.18µm n6, n7, n13, n14 25µm/0.18µm c1, c2 0.1×10-15 f active inductor based low phase noise voltage controlled oscillator 113 for reuse, two blocks of gm are related backward. m2 and m3 mos serve as existing sources of mos transistors to limit the inductor inductance quantity. the voltage vc should be adjusted to the minimum point to get the lowest inductance at the maximum frequency. transistors n3-n4 should also be biased at the maximum overdrive voltage to produce significant transconductance with the smallest gate capacitors (vgs-vt). we need to give start up condition every time for getting sustained oscillation. 
for low phase noise of vco in oscillation, n5 and n8 are used in the active inductor, and transistors m1 and m4 minimize the differences in the same signal. n9 and n10 transistors are used to provide a negative resistance for oscillations. the inductance property is controlled by an external voltage source vc for the circuit presented. the lc tank comprises the inductor having its parasitic capacitance and the mos varactor, which is utilized to change the oscillation of frequency. this supply voltage helps direct control of the value of active inductor inductance. with the m1 and m4 mos transistors, this can be done. on m1 and m4 mos transistors, we need to increase the control voltage to compensate the low gate voltage. there is another vcrtl voltage to control the capacitance value fluctuation, which provides more significant phase noise and a higher frequency range compared to others. the capacitors c1 and c2 behave like dc blockers [2]. these capacitors are used to block the actual circuit and the carrier circuit from interfering. it blocks the circuit's dc signals, thus allowing the high radio frequency signal to pass through. the design of the above vco begins with the most straightforward strategy for creating the active inductor, the gyrator-c topology. using a mos-based inductor, the voltage-controlled oscillator is built such that the whole system becomes an entirely mos-based vco. 4. post-layout simulation results the fully integrated mos-based vco circuit uses an active inductor built into the umc 180nm cmos technology in the cadence virtuoso tool. analysis and simulation were performed using the cadence software. to achieve sinusoidal oscillation, different characteristics of the suggested design like periodic steady-state (pss), phase noise analysis, and transient response must be assessed. fig. 8 block diagram representation of proposed vco figure 8 represents the block level representation of the proposed vco design for post-layout simulation. 
figure. 9 shows the layout design of the proposed vco using an active inductor. the parasitics have a significant impact on the vco's performance. for 114 s. jakodiya, r. c. gurjar, r. s. gamad the detection of inductance, parasitic resistance, and parasitic capacitance, rlc extraction (av extraction) with coupled mechanism has been used. this proposed arrangement has 1293 pcapacitance, 450 resistance, and 357 pinductors in its circuitry inventory. the following parts go through the post-layout extracted and simulated active inductor-oriented vco. the performance characteristics as a function of tuning voltage, ranging from 0.2 to 2.0 v, are presented in table 2. fig. 9 layout design of vco using active inductor table 2 performance summary and comparison of proposed vco design source [26] [10] [11] [4] [18] [15] [25] this work technology 0.18μm cmos 0.13μm cmos 0.18μm cmos 0.18μm cmos tsmc 0.18μm 0.35μm 0.18μm cmos 0.18μm cmos technique active inductor active inductor active inductor active inductor active inductor active inductor active inductor active inductor frequency tuning range 0.55ghz 3.8ghz 1.22ghz– 2.6ghz 0.5ghz– 2ghz 1.3ghz– 3.8ghz 0.5ghz 2ghz 680mhz1.450ghz 5.5ghz 0.5ghz1.7ghz phase noise (dbc/hz) -89 to -78 82 to -87 -78 to -90 -81 to -94 -90.33 -87 -80.314 at 1mhz offset frequency -102.78 at 1mhz offset frequency figure of merit (fom) -133.5 ~ -138.84 -151 -140.43 -168.54 voltage supply (v) 1.8 1.1 1.8 1.8 3.3 1.8 1.8 power consumption 11.9mw 3.6mw4.3mw 13.8mw 10.5mw 29.38mw 9mw active inductor based low phase noise voltage controlled oscillator 115 the simulated time-domain output oscillation vs time plot waveforms derived by transient analysis is shown in figures 10 (a & b). this vco has a concise start-up time of 1-2 ns at 1.0 v supply voltage to achieve a steady state. this is a quick technique to check the vco's operation and determine its settling time and output frequency. fig. 10 (a) single-ended output waveform fig. 
10 (b) differential output waveform
figures 11 and 12 show the periodic steady-state (pss) analysis plot and the phase noise plot for the proposed vco design after post-layout simulation. the results show that the vco covers a broad frequency range from 512mhz to 1.7ghz, obtained by changing the voltage vc from 0.7 to 1.7v. the percentage tuning range follows from equation 8.
fig. 11 periodic steady-state analysis of proposed vco design
fig. 12 phase noise vs relative frequency
fig. 13 figure of merit vs control voltage
the formula for the percentage tuning range is $\%\ \text{tuning range} = \frac{f_{maximum} - f_{minimum}}{f_c} \times 100$ (8) here fc denotes the average frequency, and fmaximum and fminimum are the maximum and minimum frequencies of the first harmonic. the phase noise plot for the proposed design is depicted in figure 12. the phase noise achieved by this vco configuration is -102.78dbc/hz at a 1 mhz frequency offset. theoretically, phase noise is also calculated by using leeson's formula, given in equation 9. (9) the figure of merit (fom) of the proposed mos based vco has been calculated. the fom values can be obtained from eq. (10) [17,21]. (10) where f0 is the oscillation frequency, pdc is the power consumption in mw, and l{f} is the phase noise calculated at a frequency offset {f} from the carrier at f0 [17, 21]. the fom results of the proposed vco design are shown in figure-13. it is evident from eq. (10) that, for a given frequency, the oscillator's fom depends only on the phase noise and the power consumption. as demonstrated in table 2, better phase noise at reduced power consumption leads to a better fom of the oscillator.
5. 
stability, temperature effect & monte-carlo analysis of proposed vco design slight changes in process factors like doping levels, oxide thickness of the mos, junction depths, and so on can impact overall circuit performance. analog circuits are vulnerable to temperature change when the voltage level is decreased. thus, a temperature sweep study was performed for various distinct temperatures ranging from 50°c to 50°c. figures-14 and 15 demonstrate the influence of temperature on the frequency tuning range and phase noise of the vco, respectively. monte carlo simulations have been performed on the suggested active inductor based vco design. the monte carlo analysis can be used to analyse a circuit's process variance and interface mismatch and its influence on the system. monte carlo simulations have been verified with spectre-rf simulator of cadence environment. the monte carlo analysis with a 300 sample size was conducted using gaussian distribution with sigma variations, and the results are depicted in figure16 and 17. for various intervals of the oscillation frequency fvco, they are represented as a frequency of occurrence histogram. to obtain probability statistics, histograms are matched with the gaussian function. in the condition of vctrl = 1.2, the oscillation frequency is the necessary range in 95 percent of the occurrences, with a mean value of 1.42 ghz and a σ of 0.018 in figure. 17. figure-16 shows the average power dissipation of the overall proposed circuit. figure-17 represents the frequency tuning range of the proposed vco design. 118 s. jakodiya, r. c. gurjar, r. s. gamad fig. 14 temperature effect on vco frequency tuning range fig. 15 temperature effect on vco phase noise fig. 16 average power dissipation (mw) active inductor based low phase noise voltage controlled oscillator 119 fig. 17 frequency tuning range plot (ghz) 6. 
conclusion
a novel mos-based vco design is proposed in this work, in which an active inductor topology is used to enhance the frequency of oscillation and reduce the phase noise. the active inductor based vco has been simulated in umc 180nm technology using the cadence virtuoso tool. two nmos transistors are connected in a cross-coupled topology. these integrated mos transistor pairs, connected to the high-q active inductor, provide a low noise contribution; such configurations play an important role in noise reduction in the circuit. the vco tuning range is 512mhz to 1.7ghz, achieved by varying vc from 0.7 to 1.7v. with the topology used, the phase noise of the vco is reduced to -102.78dbc/hz. these results indicate that the proposed method is feasible and valuable for designing next-generation chip-transceiver systems in conventional silicon technology, perhaps up to the millimetre-wave frequency.
acknowledgement: simulations and analysis of the proposed design were carried out in the laboratory of the department of electronics & instrumentation engineering, sgsits, indore (india). the authors are thankful to smdp-c2sd (a project funded by deity, ministry of communication and it, govt of india) for providing the necessary electronic design automation (eda) tools in the laboratory.
references
[1] j.-c. huang, n.-s. yang, s. wang, "an ultra-compact 0.5~3.6-ghz cmos vco with high-q active inductor," in proceedings of the 1st european microwave conference in central europe, may 2019.
[2] r. mehra, r.c. gurjar, "lc vco using active inductor for low phase noise and wide tuning range," international journal of innovative research in science engineering and technology, vol. 5, no. 7, pp. 13345–13350, july 2016.
[3] s. singh and r. c. 
gurjar, "a low power, low phase noise vco using cascoded active inductor," in proceedings of the international conference on information, communication, instrumentation and control (icicic), 2017. [4] j. laskar, r. mukhopadhyay, and c.-h. lee, "active inductor-based oscillator: a promising candidate for low-cost low-power multi-standard signal generation," in proceedings of the ieee radio wireless symp., jan. 2007, pp. 31–34. 120 s. jakodiya, r. c. gurjar, r. s. gamad [5] d. zito, d. pepe, a. fonte, "13 ghz cmos active inductor lc vco," ieee microwave and wireless components letters, vol. 22, no. 3, pp. 138–140, 2012. [6] y. zhang, w. zhang, p. shen, h. xie, d. jin, s. xu, x. yang, z. zhang, "an lc vco with high output power and low phase noise using differential active inductor," in proceedings of the 3rd ieee international conference on integrated circuits and microsystems (icicm), 2018. [7] o. faruqe, a. rumana, and a. tawfiq, "a low power wideband varactorless vco using tunable active inductor. " telkomnika, vol. 18, no. 1, pp. 264–271, 2020. [8] b. tang, x. gui, l. geng, "design of a low-supply sensitivity lc vco with complementary varactors", in proceedings of the ieee 61st international midwest symposium on circuits and systems (mwscas), 2018. [9] m. ebrahimzadeh, f. rezaei, s. eezaei, "a new active inductor and its application to wide tuning range lc oscillator," international journal of soft computing and engineering (ijsce), vol. 1, no. 5, pp. 111-114, november 2011. [10] f. haddad, i. ghorbel, w. rahajandraibe, "design of reconfigurable inductorless rf vco in 130 nm cmos," bionanoscience, vol. 9, no. 2, pp. 285–295, 2019. [11] r. mukhopadhyay, y. park, p. sen, n. srirattana, j. lee, c.-h. lee, s. nuttinck, a. joseph, j. d. cressler, and j. laskar, "reconfigurable rfics in si-based technologies for a compact intelligent rf front-end, " ieee trans. microw. theory tech., vol. 53, no. 1, pp. 81–93, jan. 2005. [12] o. faruqe, and a. 
md tawfiq, "active inductor with feedback resistor-based voltage controlled oscillator design for wireless applications." international journal of electronics and telecommunications, vol. 65, no. 1, pp. 57–64, 2019. [13] j. xu, c. e. saavedra, g. chen, "an active inductor-based vco with wide tuning range and high dc-to-rf power efficiency," ieee transactions on circuits and systems ii: express briefs, vol. 58, no. 8, pp. 462–466, 2011. [14] r. mukhopadhyay et al., "reconfigurable rfics in si-based technologies for a compact intelligent rf frontend," ieee transactions on microwave theory and techniques, vol. 53, no. 1, pp. 81–93, jan. 2005. [15] m. fillaud, h. barthelemy, "design of a wide tuning range vco using an active inductor," in proceedings of the joint 6th international ieee northeast workshop on circuits and systems and taisa conference, 2008. [16] t. kanar, g. m. rebeiz, "a 2-15 ghz vco with harmonic cancellation for wide-band systems" ieee microwave and wireless components letters, vol. 26, no. 11, pp. 933–935, 2016. [17] d. sachan, h. kumar, m. goswami, p. k. misra, "a 2.4 ghz low power low phase-noise enhanced fom vco for rf applications using 180 nm cmos technology," wireless personal communications, vol. 101, no. 1, pp. 391–403, 2018. [18] o. faruqe, a. k. bulbul, m. m. saikat, t. amin, "a high output power active inductor based voltage controlled oscillator for bluetooth applications in 90nm process". 2018 4th international conference on electrical engineering and information & communication technology (iceeict), 2018. [19] s. xu, w. zhang, p. shen, h. xie, d. jin, y. zhang, z. zhang, "a wide tuning range low kvco and low phase noise vco," in proceedings of the 3rd ieee international conference on integrated circuits and microsystems (icicm), 2018. [20] n. cheraghi shirazi, and r. 
hamzehyan, "a 5.5 ghz voltage control oscillator (vco) with a differential tunable active and passive inductor," international journal of machine learning and computing, vol. 3, no. 1, p. 13, 2013. [21] b. razavi, "a circuit for all seasons, the cross coupled pair – part i," ieee solid state circuit magazine, pp. 7–10, 2014. [22] s. y. lee, j. y. hsieh, "analysis and implementation of a 0.9-v voltage-controlled oscillator with low phase noise and low power dissipation". ieee transactions on circuits and systems ii: express briefs, vol. 55, no. 7, pp. 624–627, 2008. [23] b. tang, x. gui, l. geng, "design of a low-supply sensitivity lc vco with complementary varactors, " in proceedings of the 61st ieee international midwest symposium on circuits and systems (mwscas), 2018. [24] b. razavi, "design of analog cmos integrated circuits". [25] n. c. shirazi, e. abiri, and r. hamzehyan, "a 5.5 ghz voltage control oscillator (vco) with a differential tunable active and passive inductor," international journal of information and electronics engineering, vol. 3, no. 5, pp. 493–497, september 2013. [26] a. saberkari and s. seifollahi, "wide tuning range cmos colpitts vco based on tunable active inductor, " malesi journal of telecommunication devices, vol. 1, no. 1, pp. 11–15, march 2012. https://www.amazon.in/behzad-razavi/e/b000apu0hi/ref=dp_byline_cont_book_1 instruction facta universitatis series: electronics and energetics vol. 28, n o 4, december 2015, pp. 597 609 doi: 10.2298/fuee1504597r efficient calculation of the autocorrelation of boolean functions with a large number of variables  miloš radmanović 1 , radomir stanković 1 , claudio moraga 2 1 faculty of electrical engineering, university of niš, niš, serbia 2 european centre for soft computing, mieres, spain, and department of computer science, technical university of dortmund, dortmund, germany abstract. the autocorrelation of a boolean function is an important mathematical concept with various applications. 
it is a kernel of many algorithms with essential applications whose efficiency is directly limited by the time and space complexity of methods for computing the autocorrelation. in this paper, these limitations are overcome by computing the autocorrelation using a shared multi-terminal binary decision diagram (smtbdd), which is a data structure allowing compact representations of large boolean functions. the computation is performed in the spectral domain by exploiting the wiener-khinchin theorem and the fast calculation algorithms through smtbdds. it is necessary to develop a specialized decision diagram package with all the standard bdd operations that supports fast calculation algorithms through decision diagrams with dynamically resizable terminal nodes. this allows dealing with the large integers that appear in computing the autocorrelation coefficients. an experimental evaluation over benchmarks confirms the efficiency of the proposed data structure and related algorithm.
key words: boolean functions, autocorrelation, wiener-khinchin theorem, fast walsh transform, bdd package, dynamically resizable terminal nodes.
received november 26, 2014; received in revised form march 20, 2015 corresponding author: miloš radmanović faculty of electrical engineering, university of niš, alaksandra medvedeva 14, 18000 niš, serbia (e-mail: milos.radmanovic@elfak.ni.ac.rs)
1. introduction
the autocorrelation function has numerous applications in computing, telecommunications, data encoding and transmission, cryptography, etc. in particular, in computer-aided design, the autocorrelation is used in the optimization and synthesis of combinational logic [1-5], variable ordering for binary decision diagrams [6-9], and estimation of the function complexity [10]. the related algorithms are deterministic and, for the classes of boolean functions where they can be efficiently applied (depending on the properties of autocorrelation coefficients), the produced solutions are optimal. this can be considered an advantage compared to various heuristic approaches that have been proposed for the same applications, see, e.g., [11] and references therein, since heuristic algorithms give no guarantee on the quality of the obtained solution. the efficiency of algorithms based on autocorrelation is directly determined by the runtime of the methods used for computing the autocorrelation as well as the complexity of the underlying data structures used to represent the input and output data. in vector notation, the autocorrelation of a function of n variables is a vector of length $2^n$. therefore, methods for computing the autocorrelation coefficients have an exponential complexity in the number of variables. there are various methods for an efficient computation of individual autocorrelation coefficients of a given function, depending on the data structure used to specify the boolean function and its autocorrelation [12, 13]. the autocorrelation coefficients may also be computed from the spectral coefficients of the function by exploiting the wiener-khinchin theorem and the fast calculation algorithms through multi-terminal binary decision diagrams (mtbdds) [14]. the method may be extended to the computation of the autocorrelation for multiple-output functions. since this method produces large integers up to the value $2^{2n}$, where n is the number of variables in the function, currently available bdd based techniques are limited to functions of fewer than 32 variables [15]. in this paper, we present a method for the computation of the autocorrelation spectra through smtbdds for multiple-output boolean functions of more than 32 variables. the computation is performed in the spectral domain by exploiting the wiener-khinchin theorem and the fast calculation algorithm through smtbdds [16]. this computation, for boolean functions of many variables, requires calculations with large integers.
for this reason, the standard bdd packages [17-22] with integer data type terminal nodes cannot be used. experiments with a package with integer data type terminal nodes show that it is necessary to develop a package that will preserve all the standard bdd operations necessary to perform the corresponding fast calculation algorithms through decision diagrams, however, with dynamically resizable terminal nodes. being developed by appreciating and incorporating all the standard techniques in programming decision diagrams, the specialized decision diagram package presented in this paper can be viewed as an extension of the classic bdd packages. however, it allows dealing with large integer terminal nodes. this feature was achieved by incorporating in the decision diagram package the template class “bitvector” used to define the dynamically resizable terminal nodes. the size of the node can be specified by the user as a template parameter. to estimate features of the package, we show, by experiments on benchmarks, that the proposed implementation allows us the computation of the autocorrelation of multiple-output boolean functions with a large number of variables (over 32 variables), while the application of the decision diagram packages with integer data type terminal nodes restricts the computation of autocorrelations to boolean functions with 32 or less variables. the paper is organized as follows: the second section reviews basic properties of the autocorrelation of boolean functions. the third section describes smtbdds and a fast calculation algorithm through smtbdds. the fourth section discusses computation of the autocorrelation coefficients through smtbdds. section five briefly describes how the classic bdd packages can be extended into a specialized decision diagram package with dynamically resizable terminal nodes. 
section six illustrates, based on benchmarks, that the proposed extension of the classical bdd packages allows the computation of the autocorrelation coefficients for boolean functions of many variables. furthermore, some peculiar properties for the discussed computations are pointed out and illustrated. the paper concludes with a discussion of possible directions for future research.
2. autocorrelation function
the following notation is used throughout the paper. a binary n-tuple $x_1 x_2 \ldots x_n$, $x_i \in \{0,1\}$, is denoted by x and the equivalent integer value is assigned to it as: $x = \sum_{i=1}^{n} x_i 2^{n-i}$. (1) with this notation, f(x) is an n-variable boolean function, i.e., $f(x) = f(x_1 x_2 \ldots x_n)$, $x_i \in \{0,1\}$, $x \in \{0\ldots00, 0\ldots01, \ldots, 1\ldots11\}$. an m-output boolean function is defined as the function $f(x) = (f_0, f_1, \ldots, f_{m-1}): \{0,1\}^n \to \{0,1\}^m$. the autocorrelation function is defined as [1]: $B(u) = \sum_{v=0}^{2^n-1} f(v) f(v \oplus u)$, $u \in \{0,1,\ldots,2^n-1\}$, (2) where $\oplus$ is the addition modulo 2, exor. the autocorrelation function computes a measure of similarity between a function f and the same function under displacement. the autocorrelation function (or transform) of a boolean function is an integer valued function. we assume that the boolean functions are represented by bdds, while the autocorrelation functions, being integer valued, are represented by mtbdds. for multi-output functions the shared bdds (sbdds) and smtbdds are used. it should be noticed that the maximal value of an autocorrelation coefficient can be $2^n$. this can be a source of difficulties when computing the autocorrelation of boolean functions with a large number of variables and representing the autocorrelation function by decision diagrams since representing large integers in terminal nodes is required.
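the definition in (2) can be evaluated directly on a truth vector. the following is a minimal sketch, not the decision-diagram method of the paper: the function name `autocorrelation_naive` and the 3-variable example vector are assumptions introduced only for illustration.

```python
def autocorrelation_naive(truth):
    """Return [B(0), ..., B(2^n - 1)] for a 0/1 truth vector of length 2^n.

    Direct evaluation of B(u) = sum_v f(v) * f(v XOR u); this needs
    2^n operations for each of the 2^n coefficients.
    """
    size = len(truth)
    if size == 0 or size & (size - 1):
        raise ValueError("truth vector length must be a power of two")
    return [
        sum(truth[v] * truth[v ^ u] for v in range(size))
        for u in range(size)
    ]


# Hypothetical 3-variable function given by its truth vector (illustration only).
example_truth = [0, 0, 0, 1, 1, 1, 1, 1]
example_b = autocorrelation_naive(example_truth)

# A constant-one function reaches the maximal coefficient value 2^n everywhere.
constant_one_b = autocorrelation_naive([1] * 8)
```

this brute-force form is only practical for small n, but it is useful as a reference when checking faster methods.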
for example, a boolean function of 65 variables might have autocorrelation coefficients whose value could be $2^{65} \approx 3.68 \cdot 10^{19}$. this problem is addressed in the paper and a solution is proposed through decision diagrams with dynamically resizable terminal nodes by using a particular technique of object oriented programming languages. for multiple-output functions $f = (f_0, f_1, \ldots, f_{m-1})$ the autocorrelation functions of the individual outputs are combined into the total autocorrelation function [1]: $B(u) = \sum_{i=0}^{m-1} B_i(u) = \sum_{i=0}^{m-1} \sum_{v=0}^{2^n-1} f_i(v) f_i(v \oplus u)$. (3) as evident from the previous equations, computing the autocorrelation coefficients requires $2^n$ operations to compute each of the $2^n$ coefficients. therefore, the run-time and computational resources are exponential in n. for a function f defined by the truth-vector $F = [f(0), f(1), \ldots, f(2^n-1)]^T$, the walsh spectrum $S_f = [S_f(0), S_f(1), \ldots, S_f(2^n-1)]^T$ is defined as [22]: $S_f = W(n) F$, (4) where $W(n) = \bigotimes_{i=1}^{n} W(1)$, (5) $\otimes$ denotes the kronecker product, and $W(1) = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$ (6) is the basic walsh matrix. the walsh transform is a self-inverse transform up to the constant $2^{-n}$ that is used as the normalization factor when defining the walsh transform and its inverse. in the matrix notation, if a function f and its autocorrelation function $B(u)$ are represented by vectors $F = [f(0), f(1), \ldots, f(2^n-1)]^T$ and $B_f = [B(0), B(1), \ldots, B(2^n-1)]^T$, respectively, the wiener-khinchin theorem is defined as [1]: $B_f = 2^{-n} W(n) (W(n) F)^2$, (7) where the square is taken elementwise. the main advantage of this theorem comes from the existence of the fast walsh transform (fwt), an algorithm that computes the walsh spectrum of a vector of length $N = 2^n$ in $O(N \log N)$ operations instead of the $O(N^2)$ required by direct matrix multiplication. since the fwt can be performed also over decision diagrams [16], the wiener-khinchin theorem can be used in computing the autocorrelation function of boolean functions with a large number of variables.
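the theorem in (7) can be checked numerically on plain vectors. the sketch below is an illustration under stated assumptions, not the smtbdd implementation of the paper: the names `fwt` and `autocorrelation_wk` and the example truth vector are hypothetical, and python's arbitrary-precision integers stand in for the dynamically resizable terminal nodes.

```python
def fwt(vec):
    """Unnormalized fast Walsh transform in Hadamard order; returns a new list."""
    out = list(vec)
    size = len(out)
    if size == 0 or size & (size - 1):
        raise ValueError("vector length must be a power of two")
    half = 1
    while half < size:
        # One butterfly stage: combine entries that are `half` apart.
        for start in range(0, size, 2 * half):
            for i in range(start, start + half):
                a, b = out[i], out[i + half]
                out[i], out[i + half] = a + b, a - b
        half *= 2
    return out


def autocorrelation_wk(truth):
    """Autocorrelation via B_f = 2^-n * W(n) * (W(n) F)^2, square taken elementwise."""
    size = len(truth)
    spectrum = fwt(truth)                 # S_f = W(n) F
    squared = [s * s for s in spectrum]   # values can grow up to 2^(2n)
    return [t // size for t in fwt(squared)]  # the division by 2^n is exact


# Hypothetical 3-variable function given by its truth vector (illustration only).
example_truth = [0, 0, 0, 1, 1, 1, 1, 1]
example_spectrum = fwt(example_truth)
example_b = autocorrelation_wk(example_truth)
```

the intermediate vector `squared` is where the large values appear, which is why fixed-width integer terminals become the limiting factor in a decision-diagram setting.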
it should be noticed that the maximal value of an autocorrelation coefficient, when using the wiener-khinchin theorem, before multiplication with $2^{-n}$ could be $2^{2n}$. for example, a function of 65 variables might produce the value $2^{130} \approx 1.36 \times 10^{39}$. again we see the problem of large integers that should be represented in terminal nodes of decision diagrams.

3. smtbdds and a fast calculation algorithm through smtbdd

a bdd is a data structure convenient for representing boolean functions of many variables. due to that, bdds have become widely used for a variety of cad applications, for example in [23-24], including symbolic simulation, verification, and reliability analysis of combinational and sequential circuits. an mtbdd is a generalization of a bdd, derived by allowing terminal nodes that show integer values [25]. a comprehensive set of arithmetic operations can be realized efficiently on mtbdds, such as addition, subtraction, and multiplication, as well as logic operations. they are implemented by recursive algorithms and executed in time almost linear in the graph size [16]. multiple-output integer-valued functions are represented by smtbdds, having a separate root node for each output [22]. the walsh spectrum of a boolean function (if the scaling factor $2^{-n}$ is assigned to the inverse transform) is an integer-valued function and can be represented by an mtbdd.

example 1: fig. 1 shows the smtbdd for the walsh spectra of the functions $f_1(x_1, x_2, x_3) = x_1 \vee x_2 x_3$ and $f_2(x_1, x_2, x_3) = x_1 x_2 \vee x_2 x_3$. the walsh spectrum of the function f1 is $s_{f_1} = [5, -1, -1, 1, -3, -1, -1, 1]^t$ and that of the function f2 is $s_{f_2} = [3, -1, -3, 1, -1, -1, 1, 1]^t$. it should be observed that the mtbdd for $s_{f_1}$ is compact since there are two constant subvectors of two consecutive values -1 in the walsh spectrum of f1.
this smtbdd has fewer nodes than two separate mtbdds for the functions f1 and f2, since the two spectra share values of the walsh coefficients.

fig. 1 smtbdd for the walsh spectra for the functions in example 1

the algorithm that we refer to as the fast calculation algorithm for the walsh transform [1] through an smtbdd is based on the fast algorithms for spectral transforms through bdds. several variants of bdd based calculation algorithms for the walsh transform, as well as extensions of the bdd calculation methods to other transforms for boolean functions, are considered in [19], [22], [25], [26], and elsewhere. the algorithm is based upon the factorization of transform matrices as used in the development of the fast fourier transform (fft). butterfly operations are implemented in terms of graph addition and subtraction operations, resulting in a technique that is implemented through the use of graph manipulations only. this method takes advantage of the compactness inherent in mtbdds and can be more effective for boolean function transformations than traditional approaches.

example 2: the fast calculation algorithm for the walsh spectrum through an mtbdd of the function $f(x_1, x_2, x_3) = x_1 \vee x_2 x_3$ with the walsh spectrum $s_f = [5, -1, -1, 1, -3, -1, -1, 1]^t$ is shown in fig. 2. an mtbdt (multi-terminal binary decision tree) can be reduced into an mtbdd. the subtrees may be shared and the redundant information (nodes) deleted from the mtbdt. the impact of the deleted nodes can be represented by the cross-points, defined as points where an edge crosses a level in the mtbdd [26]. in this introductory example, all cross-points or "hidden" nodes must be considered as explicitly present to apply the local walsh transform. the walsh transform algorithm works bottom-up.
for each node and cross-point at the level corresponding to the variable x3, the transform replaces the subtree attached by the 0-labeled edge with the sum of the values of both subtrees of that node (which in this case are leaves), and the subtree attached by the 1-labeled edge with their difference. the same procedure is applied step by step to nodes and cross-points in higher levels of the mtbdd. the nodes and cross-points processed during this bottom-up traversal are shown in bold.

fig. 2 fast calculation algorithm for the walsh spectrum of the function f in example 2 through an mtbdd

this technique can be used for any transformation that has a kronecker product based transformation matrix and can be extended to smtbdds [26].

4. autocorrelation through an smtbdd by using the wiener-khinchin theorem

the computation of the walsh spectrum can be performed through the flow-graph describing the fast walsh transform (fwt).

example 3: computing the autocorrelation coefficients using the wiener-khinchin theorem as specified in eq. (7) through the fast walsh transform for the function $f(x_1, x_2, x_3) = x_1 \vee x_2 x_3$ is shown in fig. 3. step 1 calculates the walsh spectrum $s_f = [5, -1, -1, 1, -3, -1, -1, 1]^t$ from the truth-vector of the function f. in the final step, the autocorrelation spectrum is multiplied by the normalization factor 1/8. notice that normalizing after the computation of the walsh transform allows the use of integer arithmetic in the whole process, even though at the price of accepting $2^{2n}$ as the upper bound. applying the normalizing factor earlier would allow an upper bound of $2^n$, but would require working with rational numbers and a complex floating point implementation. this is why the integer arithmetic version was adopted for the present work.
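the integer-only scheme can be traced by hand for n = 3. the values below are a worked sketch under one assumption: the truth-vector is taken as $f = [0,0,0,1,1,1,1,1]^t$ (that is, $x_1 \vee x_2 x_3$ with $x_1$ as the most significant bit), a choice that reproduces the coefficient magnitudes of the spectrum quoted in example 3.

```latex
\begin{aligned}
f &= [0, 0, 0, 1, 1, 1, 1, 1]^t, \\
s_f = w(3)\, f &= [5, -1, -1, 1, -3, -1, -1, 1]^t, \\
s_f^2 &= [25, 1, 1, 1, 9, 1, 1, 1]^t, \\
w(3)\, s_f^2 &= [40, 32, 32, 32, 16, 16, 16, 16]^t, \\
b_f = 2^{-3}\, w(3)\, s_f^2 &= [5, 4, 4, 4, 2, 2, 2, 2]^t.
\end{aligned}
```

every intermediate value is an integer and none exceeds $2^{2n} = 64$, while the final coefficients are bounded by $2^n = 8$. the coefficient $b_f(0) = 5$ equals the number of ones in f, as eq. (2) requires for u = 0.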
the wiener-khinchin theorem and the walsh spectrum computation through an mtbdd [13] lead to the following algorithm for the computation of the autocorrelation over smtbdds.

fig. 3 computing the autocorrelation for the function in example 3

algorithm 1. autocorrelation spectrum through an smtbdd
let smtbdd(f) be the representation of a multi-output function f. suppose that the function can be efficiently represented by an smtbdd.
1) update the size of the smtbdd terminal nodes according to the maximal value of an autocorrelation coefficient.
2) convert smtbdd(f) into smtbdd($s_f$), where $s_f$ denotes the walsh spectra of the function f.
3) multiply smtbdd($s_f$) by itself using the standard procedure for multiplication of functions represented by bdds (see, e.g., [2]).
4) convert the resulting smtbdd($s_f^2$) into a new smtbdd($2^n b_f$).
5) normalize with $2^{-n}$, since the walsh matrix is self-inverse up to the constant $2^{-n}$, where n is the number of variables in the function f.

example 4: computing the autocorrelation coefficients using the wiener-khinchin theorem and the fwt through an smtbdd for the functions $f_1(x_1, x_2, x_3) = x_1 \vee x_2 x_3$ and $f_2(x_1, x_2, x_3) = x_1 x_2 \vee x_2 x_3$ is shown in fig. 4. the number of "butterflies" and multiplication operations in this smtbdd is smaller than the number of operations in two separate mtbdds for the functions f1 and f2, since there are shared subtrees in the mtbdds for f1 and f2.

5. the bdd package with dynamically resizable terminal nodes

5.1. motivation

decision diagrams are a standard part of many cad-cam systems since they permit compact representations of large boolean functions and efficient manipulations and calculations with them.
there are several code packages and development environments using bdds and their various generalizations and extensions as the main data structure. these decision diagram packages are built in various programming languages, especially in c, c++, and java. basic principles in programming decision diagrams are set in [17] and then further elaborated by many authors, e.g., [11, 18, 21, 22, 27, 28], and references therein. most packages follow these fundamental principles and share common features; however, a specification and suitable modification of the basic decision diagram packages is usually required to meet the demands of particular concrete applications. the same is true when decision diagrams are used to compute the autocorrelation of boolean functions. in this case, a particular problem is the requirement to deal with large integers. it should be noticed that the maximal value of a terminal node, when applying the wiener-khinchin theorem and the fwt through an smtbdd, before multiplication with $2^{-n}$ could be $2^{2n}$, where n is the number of variables in the function. therefore, if the computation of the autocorrelation coefficients is performed through the smtbdd, the usability of classical bdd packages which have 32-bit or 64-bit integer terminal nodes is necessarily limited to boolean functions of relatively moderate size (with 16 or 32 variables, respectively). with this motivation, in this paper, we propose an extension of the classical bdd packages, with dynamically resizable terminal nodes, that allows computation of the autocorrelation for large boolean functions. the idea comes from the consideration in [29] where decision diagrams with terminal nodes replaced by vectors are discussed. these vectors can be viewed as binary representations of large integers.
in computing the autocorrelation, it is convenient to have flexibility in determining the size of the binary vectors corresponding to the binary representations of integers. this consideration leads to decision diagrams with dynamically resizable terminal nodes. we have, however, followed and preserved all other basic recommendations in programming decision diagrams, such as the implementation of the unique table, compute table, garbage collection, etc., and implemented them as in many other decision diagram packages. therefore, a description of these features is omitted. instead we refer to the basic principles in programming decision diagrams, and focus on the modifications that we made in the implementation of terminal nodes. some other implementation details are briefly presented in sect. 5.3 discussing computation of the autocorrelation functions.

5.2. implementation of dynamically resizable terminal nodes

the class diagram for the static structure of the node implementation for the bdd package with dynamically resizable nodes is shown in fig. 5.

fig. 4 computing the autocorrelation for the multiple-output function in example 4

fig. 5 bdd node implementation for the bdd package with dynamically resizable nodes

the class "node" is used as the basic class to represent a non-terminal or a terminal node. it is an object-oriented class that contains the attributes "level", "next", and "reference counter". the "level" denotes the variable that labels the node and it uses the integer data type. the "next" pointer links together nodes that belong to the same level. the visited flag for a bdd traversal can be stored as the least significant bit of the "next" pointer. the "reference counter" is implemented for garbage collection of nodes and it uses the integer data type [17].
a nonterminal node is a class, derived from the class "node", that contains the attributes "then" and "else", which are pointers to the children. a terminal node is also a class, derived from the class "node", and contains the attribute "value", which stores the constant value of the terminal node. object oriented languages allow the definition of a template class to represent a class member of any possible datatype (including a user-defined datatype). in order to implement dynamically resizable terminal nodes, the "terminalnode" class is implemented as a template class. the attribute "value" uses a template data type and can be of any possible datatype. the main program declares the type used for the attribute "value". this implementation of the attribute "value" allows a user-defined implementation of a class that supports work with large integers, where the length of the large integers can be preset with a parameter.

5.3. computation of the autocorrelation through smtbdds with dynamically resizable terminal nodes

to compute the autocorrelation coefficients of a large boolean function in the case of restricted memory resources, we developed a bdd package with dynamically resizable nodes. the computation procedure is implemented in c++. the unique table and the operation table are implemented as hash tables with collision chains [17]. the hash key is composed of the memory position of the node and its successors. since this implementation uses the walsh transformation over the bdd package, currently available techniques require the usage of the field "level" in the node implementation. for this reason, the unique table is not divided into subtables as proposed in [20]. since this implementation uses bdd operations that require operations for user-defined types, the bdd operations are implemented as templates of overloaded operator functions.
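the node hierarchy described above can be sketched in a few lines of c++. this is a minimal illustration under stated assumptions, not the package code: the member names (ref_count, then_child, else_child) are chosen for readability, and WideUInt is a hypothetical stand-in for a user-defined large-integer class whose width is preset by a template parameter; it supports only addition and equality here.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical fixed-width unsigned integer; the width is preset at compile time.
template <std::size_t Bits>
struct WideUInt {
    static_assert(Bits > 0 && Bits % 32 == 0, "Bits must be a positive multiple of 32");
    std::array<std::uint32_t, Bits / 32> limb{};  // limb[0] is the least significant word

    WideUInt() = default;
    explicit WideUInt(std::uint64_t x) {
        limb[0] = static_cast<std::uint32_t>(x);
        if (limb.size() > 1) limb[1] = static_cast<std::uint32_t>(x >> 32);
    }

    // Word-by-word addition with carry; overflow beyond Bits wraps around.
    WideUInt& operator+=(const WideUInt& rhs) {
        std::uint64_t carry = 0;
        for (std::size_t i = 0; i < limb.size(); ++i) {
            const std::uint64_t sum = std::uint64_t{limb[i]} + rhs.limb[i] + carry;
            limb[i] = static_cast<std::uint32_t>(sum);
            carry = sum >> 32;
        }
        return *this;
    }
    friend WideUInt operator+(WideUInt a, const WideUInt& b) { return a += b; }
    friend bool operator==(const WideUInt& a, const WideUInt& b) { return a.limb == b.limb; }
};

// Base class shared by non-terminal and terminal nodes.
struct Node {
    int level;             // variable that labels the node
    Node* next = nullptr;  // links nodes that belong to the same level
    int ref_count = 0;     // reference counter used for garbage collection
    explicit Node(int lvl) : level(lvl) {}
    virtual ~Node() = default;
};

struct NonTerminalNode : Node {
    Node* then_child;
    Node* else_child;
    NonTerminalNode(int lvl, Node* t, Node* e) : Node(lvl), then_child(t), else_child(e) {}
};

// The value type is a template parameter, so the main program decides
// how wide the terminal values are.
template <typename Value>
struct TerminalNode : Node {
    Value value;
    TerminalNode(int lvl, Value v) : Node(lvl), value(v) {}
};
```

with this layout, switching from TerminalNode<int> to TerminalNode<WideUInt<128>> changes only the declaration in the main program, as long as the value type provides the operators that the bdd operations use.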
for a user-defined implementation of the class that supports work with large integers, we developed the template class "bitvector", where the size of the binary vector (large integer) can be preset with the template parameter "size". since this implementation uses bdd operations that require operations for the binary vectors, the class "bitvector" must support overloaded operator functions for assignment, relational, and arithmetic operators. the implementation of the class "bitvector" uses standard programming techniques for data structures.

example 5: the update of the size of the smtbdd terminal nodes, where the maximal value of a terminal node can be $2^{128}$, is expressed in c++ by the following two lines of code:

bdd_manager> manager;
smtbdd> bdd(manager);

in terms of bdd package implementation, it is common to use the class "bdd_manager" for initialization of the bdd package. after this initialization a "bdd" object must be defined that handles the smtbdd.

6. experimental results

the autocorrelation computation was tested on a set of large benchmarks [30] on a pc pentium iv running at 2.66 ghz with 4 gb of ram. the size of the unique table and the operation table was limited to 262139 entries. the garbage collection was activated when the available memory ran low. the computation run-time statistics of the bdd-based method include the creation of the smtbdds. all benchmarks were used in the espresso-mv or pla format [31]. table 1 gives the experimental results of the computation run-time and the terminal node size of the autocorrelation computation through an smtbdd with dynamically resizable terminal nodes. all times are given in seconds. the fifth column shows the number of bits required to represent the values of terminal nodes. it can be seen as an experimental justification of the necessity for the extension of the bdd package that is presented in this paper.
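the terminal sizes listed in table 1 are consistent with a simple rule: the value $2^{2n}$ needs 2n + 1 bits, rounded up to a whole number of 32-bit words. this rule is inferred from the table entries and is not stated by the authors; the function name terminal_bits and the default word size are assumptions made for this sketch.

```cpp
#include <cstddef>

// Smallest multiple of `word` bits that can hold the value 2^(2n),
// which needs 2n + 1 bits. Inferred rule, not taken from the package.
constexpr std::size_t terminal_bits(std::size_t n, std::size_t word = 32) {
    return ((2 * n + 1 + word - 1) / word) * word;
}
```

for instance, the rule gives 96 bits for the 33 inputs of b4 and 288 bits for the 128 inputs of ex4p, matching the fifth column of table 1.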
table 1 statistics of the computation run-time and the terminal node size of the autocorrelation computation through an smtbdd with dynamically resizable terminal nodes

benchmark  inputs  outputs  cubes  terminal size [bits]  computation time [s]
b4         33      23       54     96                    1.28
in3        35      29       75     96                    1718.38
jbp        36      57       166    96                    0.88
signet     39      8        124    96                    -
apex2      39      3        1035   96                    -
seq        41      35       336    96                    -
apex1      45      45       206    96                    -
ti         47      72       271    96                    33.18
ibm        48      17       173    128                   113.27
apex3      54      50       280    128                   -
misg       56      23       75     128                   0.28
e64        65      65       65     160                   0.49
x7dn       66      15       622    160                   16035.81
x2dn       82      56       112    192                   0.59
soar       83      94       529    192                   7.52
mish       94      43       91     192                   0.65
apex5      117     88       1227   256                   32.96
ex4p       128     28       620    288                   360.26
o64        130     1        65     288                   -

when the autocorrelation coefficients for benchmarks with 30 or more variables were computed using the wiener-khinchin theorem as specified in eq. (7) through the fast walsh transform, the computation failed due to the memory limit of 4 gb for storing the walsh and the autocorrelation spectra. moreover, currently available bdd based techniques are limited to benchmarks of fewer than 32 variables. therefore, the experimental results are not compared with results obtained using other approaches. table 2 gives the experimental results of space statistics. all results are given in number of nodes. table entries with dashes indicate that the method failed to complete for that particular benchmark because of running out of memory.
table 2 space statistics of the autocorrelation computation through an smtbdd with dynamically resizable terminal nodes [non-terminal nodes / terminal nodes]

benchmark  smtbdd(f) size  smtbdd(sf) size  smtbdd(sf^2) size  smtbdd(bf) size
b4         512 / 2         5923 / 277       3831 / 156         1842 / 158
in3        377 / 2         13874 / 538      9563 / 335         7496 / 1618
jbp        550 / 2         6813 / 260       4290 / 157         1501 / 211
signet     2956 / 2        -                -                  -
apex2      7102 / 2        -                -                  -
seq        142321 / 2      44743 / 3993     24093 / 2013       -
apex1      28414 / 2       77535 / 3191     55024 / 2193       -
ti         6187 / 2        16437 / 950      8782 / 493         49947 / 2091
ibm        835 / 2         40264 / 517      23680 / 311        5085 / 1078
apex3      -               -                -                  -
misg       107 / 2         2994 / 120       1748 / 72          377 / 73
e64        1446 / 2        3039 / 99        1610 / 50          1686 / 39
x7dn       863 / 2         73602 / 1046     53813 / 761        32217 / 6280
x2dn       223 / 2         5541 / 116       3283 / 68          657 / 107
soar       995 / 2         35346 / 465      16116 / 254        2584 / 453
mish       131 / 2         5595 / 99        3832 / 64          142 / 65
apex5      2705 / 2        73230 / 238      29499 / 122        4645 / 158
ex4p       1301 / 2        133953 / 1095    56278 / 621        4621 / 1149
o64        -               -                -                  -

7. conclusions and future work

the complexity of methods for computing the autocorrelation is exponential in the number of variables of the function. the method presented in this paper is based on smtbdd representations of the functions. besides allowing the processing of multi-output boolean functions of a large number of variables with a restricted memory, the smtbdd offers considerable flexibility in calculations of user defined subsets of particular autocorrelation coefficients. the computation is performed in the spectral domain by exploiting the wiener-khinchin theorem and the fast calculation algorithm through an smtbdd. this computation, for large boolean functions, requires calculations with large integers. for this reason, the usability of classical bdd packages is necessarily limited to boolean functions of relatively moderate size. with this motivation, we propose an extension of the classical bdd packages with dynamically resizable terminal nodes that allows the computation of the autocorrelation for large boolean functions. an experimental verification confirms that the proposed implementation allows the computation of the autocorrelation of large boolean functions. in a few cases the computation failed, due to the memory limitations caused by the size of the smtbdd needed to represent either the function, its walsh spectrum, or the autocorrelation spectrum. thus, the main goal of the paper is achieved. however, additional work towards further optimization of the implementation in terms of memory and time requirements is advisable. the implementation can be easily modified for the computation of the convolution, the correlation, and related mathematical operators. this concept can also be applied to the computation of other spectral transformations for large boolean functions. the proposed bdd package can be used in other applications where dealing with functions having large integer values is required.

acknowledgement: the authors are very grateful to the reviewers for their constructive comments which significantly improved the contents and the presentation of the paper.

references
[1] m. g. karpovsky, finite orthogonal series in the design of digital devices, new york: wiley, 1976.
[2] m. g. karpovsky, r. s. stanković, and j. t. astola, "spectral techniques for design and testing of computer hardware", in proceedings of the 1st int. workshop on spectral techniques and logical design for future digital systems, pp. 9-43, 2000.
[3] j. e. rice, and j. c. muzio, "on the use of autocorrelation coefficients in the identification of three-level decompositions", in proceedings of the ieee/acm int. workshop on logic synthesis, pp. 187-191, 2003.
[4] o. keren, i. levin, and r. s. stanković, "linearization of logical functions defined by a set of orthogonal terms theoretical aspects", automation and remote control, vol. 72, no. 3, pp. 615-625, 2011.
[5] o. keren, and i. levin, "linearization of multi-output logic functions by ordering of the autocorrelation values", facta universitatis series: electronics and energetics, vol. 20, no. 3, pp. 479-498, dec. 2007.
[6] j. e. rice, m. serra, and j. c. muzio, "the use of autocorrelation coefficient for variable ordering for robdds", in proceedings of the int. workshop on applications of reed-muller expansion in circuit design, pp. 185-196, 1999.
[7] m. g. karpovsky, r. s. stanković, and j. t. astola, "reduction of sizes of decision diagrams by autocorrelation functions", ieee trans. on computers, vol. 52, no. 5, pp. 592-606, 2003.
[8] o. keren, "reduction of average path length in binary decision diagrams by spectral methods", ieee trans. on computers, vol. 57, no. 4, pp. 520-531, 2008.
[9] o. keren, i. levin, and r. s. stanković, "determining the number of paths in decision diagrams by using autocorrelation coefficients", ieee trans. on cad of integrated circuits and systems, vol. 30, no. 1, pp. 31-44, 2011.
[10] m. g. karpovsky, and e. s. moskalev, "utilization of autocorrelation functions for realization of systems of logical functions", automation and remote control, vol. 31, no. 2, pp. 243-250, 1970.
[11] r. ebendt, g. fey, and r. drechsler, advanced bdd optimization, netherlands: springer, 2005.
[12] j. e. rice, and j. c. muzio, "methods for calculating autocorrelation coefficients", in proceedings of the 4th workshop on boolean problems, pp. 69-76, 2000.
[13] m. radmanović, r. stanković, and c. moraga, "analysis of decision diagram based methods for the calculation of the dyadic autocorrelation", int. journal of systemics, cybernetics and informatics, pp. 11-19, july 2007.
[14] r. s. stanković, m. bhattacharaya, and j. t. astola, "calculation of dyadic autocorrelation through decision diagrams", in proceedings of the european conf. circuit theory and design (ecctd'01), pp. 28-31, 2001.
[15] g. janssen, "a consumer report on bdd packages", in proceedings of the 16th symposium on integrated circuits and systems design, pp. 217-223, 2003.
[16] e. m. clarke, k. l. mcmillan, x. zhao, and m. fujita, "spectral transforms for extremely large boolean functions", in proceedings of the ifip wg 10.5 workshop on applications of the reed-muller expression in circuit design, pp. 86-90, 1993.
[17] k. s. brace, r. l. rudell, and r. e. bryant, "efficient implementation of a bdd package", in proceedings of the 27th design automation conf., pp. 40-45, 1990.
[18] g. janssen, "design of a pointerless bdd package", in proceedings of the 10th int. workshop on logic and synthesis, pp. 310-315, 2001.
[19] m. thornton, and r. drechsler, "spectral decision diagrams using graph transformations", in proceedings of the design, automation and test in europe conf. and exhibition, pp. 713-717, 2001.
[20] f. somenzi, "efficient manipulation of decision diagram", int. journal on software tools for technology transfer, vol. 3, no. 2, pp. 171-181, 2001.
[21] s. n. yanushkevich, d. m. miller, v. p. shmerko, and r. s. stanković, decision diagram techniques for micro- and nanoelectronic design handbook, crc press, 2006.
[22] t. sasao, and m. fujita, representations of discrete functions, boston: kluwer academic publishers, 1996.
[23] p. dziurzanskii, v. p. shmerko, and s. n. yanushkevich, "representation of logical circuits by linear decision diagrams with extension to nanostructures", automation and remote control, vol. 65, no. 6, pp. 920-937, 2004.
[24] d. grobe and r. drechsler, "bdd-based verification of scalable designs", facta universitatis series: electronics and energetics, vol. 20, no. 3, pp. 367-379, dec. 2007.
[25] e. m. clarke, m. fujita, p. c. mcgeer, k. mcmillan, j. c. yang, and x. zhao, "multi-terminal binary decision diagrams: an efficient data structure for matrix representation", in proceedings of the int. workshop on logic synthesis, vol. 6a, pp. 1-15, 1993.
[26] r. s. stanković, and b. falkowski, "spectral transform calculation through decision diagrams", vlsi design, vol. c-14, no. 1, pp. 5-12, 2002.
[27] g. d. hachtel, and f. somenzi, logic synthesis and verification algorithms, norwell: kluwer academic publishers, 1996.
[28] j. v. sanghavi, r. k. ranjan, r. k. brayton, and a. sangiovanni-vincentelli, "high performance bdd package by exploiting memory hierarchy", in proceedings of the 33rd ieee/acm design automation conference (dac'96), pp. 635-640, 1996.
[29] h. hasan babu, and t. sasao, "shared multi-terminal binary decision diagrams for multiple-output functions", ieice trans. on fundamentals, vol. e81-a, no. 12, pp. 2545-2553, 1998.
[30] f. brglez, "the benchmark archives at cbl acm/sigda benchmarks", north carolina state university, 2011, http://www.cbl.ncsu.edu/benchmarks.
[31] r. rudell, espresso misc. reference manual pages, berkeley university of california, 1993.

facta universitatis series: electronics and energetics vol. 28, no 3, september 2015, pp. 465-476 doi: 10.2298/fuee1503465j

mtj-based hybrid storage cells for "normally-off and instant-on" computing

bojan jovanović 1, raphael m. brum 2, lionel torres 2
1 university of niš, faculty of electronic engineering, niš, serbia
2 lirmm laboratory, university of montpellier 2, montpellier, france

abstract. besides increasing computing throughput, multi-core processor architectures bring increased capacity of sram-based cache memory. as a result, cache memory now occupies a large proportion of recent processor chips, becoming a major source of leakage power consumption. the power gating technique applied to an sram cache is not efficient since it comes at the price of data loss.
in this paper, we present two hybrid memory cells that combine a conventional volatile cmos part with magnetic tunnel junctions (mtjs) able to store a data bit in a non-volatile way. being inherently non-volatile, these hybrid cells enable instantaneous power off and thus complete elimination of the leakage power. moreover, given that the data bit can be stored in local mtjs and not in distant storage memories, these cells also offer instantaneous and efficient data retrieval. to demonstrate their functionality, the cells are designed using 28 nm fd-soi technology for the cmos part and 45 nm round spin transfer torque mtjs (stt-mtjs) with perpendicular magnetization anisotropy. we report the measured performances of the cells in terms of required silicon area, robustness, read/write speed and energy consumption.

key words: hybrid mtj/cmos cells, magnetic tunnel junction (mtj), spin transfer torque (stt), normally-off instant-on computing

1. introduction

conventional von-neumann computing architectures consist of a pure computational part (central processing unit, cpu) and a memory part in which the computing recipes (programs) and the input/output data of the calculations are stored [1]. such complex systems have a memory hierarchy comprising different semiconductor memory types, as illustrated in fig. 1. dense, slow and non-volatile storage memory with limited endurance is combined with fast, volatile, power and area consuming sram/dram working memory (located close to the cpu) in order to ensure both rapid accessibility and data non-volatility. however, this sort of design hierarchy requires complex control. start-up (booting) and shutdown procedures usually take a long time and waste a significant amount of power since they imply extensive data traffic (from storage memories to working memories and vice versa).

received january 20, 2015; received in revised form march 16, 2015
corresponding author: bojan jovanović, university of niš, faculty of electronic engineering, niš, serbia (e-mail: bojan@elfak.ni.ac.rs)

in recent years, both the limited clock frequency of the processor and the emergence of multi-core architectures have led to a significant increase in working memory capacity. as a result, the performance and the power of the computing system have become determined by the sram-based working memory. it occupies most of the chip area, consumes most of the static power and is prone to soft errors caused by radiation [2]. replacing conventional six-transistor (6t) sram cells with four-transistor (4t) counterparts did not solve all these issues. although they occupy slightly less silicon area, 4t-sram memory cells consume more leakage power and exhibit poor data stability. furthermore, 4t-sram cells still limit system performance as they require complex control and communication with the non-volatile storage elements [3].

fig. 1 typical structure of a computer memory hierarchy.

to circumvent these limitations, non-volatility needs to be brought directly to the working memory cell. this would pave the way for a new green computing paradigm based on "normally-off and instant-on" operation. computing equipment could be quickly turned off when not in use, keeping the off state with zero stand-by power as long as possible. on a computing request, the equipment could be turned on instantly, with full performance capabilities. such a computing approach may be far more energy efficient than the current "normally-on" computing systems [4]. among the non-volatile devices that are prospective candidates for co-integration with cmos, spin-based magnetic tunnel junctions (mtjs) are the most promising [5]. unlike the other candidates, in which the position of atoms (e.g. ferroelectric ram, feram [6]) or the whole structure (e.g. phase change memory, pcm [7]) has to be changed to define a non-volatile state, spin-based mtjs are controlled only by electron spin [8].
in addition to energy efficiency (little energy is needed to change the electron spin), mtjs provide radiation immunity, high speed data switching, higher density, infinite endurance as well as the ability to continue shrinking in size [9]. moreover, they can be very easily co-integrated with cmos without imposing an area overhead, as illustrated in fig. 2a). in this paper, we present two hybrid cells that combine cmos transistors with perpendicular spin-transfer torque mtjs (stt-mtjs) as non-volatile storage elements. the cells can be considered as hybrid alternatives to the mainstream 4t- and 6t-sram cells. they can store a data bit in both volatile and non-volatile contexts. furthermore, the cells are able to quickly and efficiently transfer a data bit from one context to another, thus supporting the "normally-off and instant-on" computing concept. the remainder of the paper is organized as follows: in section 2, we analyze the evolution of the mtj writing mechanisms. in section 3, we introduce our hybrid cells that contain four transistors and two mtjs (4t-2m) and six transistors and four mtjs (6t-4m), explaining their structure and functionality. in section 4, we report the measured performance of the cells in terms of required silicon area, robustness, leakage, read/write speeds and energy consumption. finally, section 5 is reserved for our conclusions.

2. evolution of mtj writing mechanisms

an mtj is a nanopillar composed of an ultra thin layer of insulator (oxide barrier) sandwiched between two ferromagnetic (fm) metals (fig. 2a). the insulating layer is so thin that electrons can tunnel through the barrier if a bias voltage is applied between the two fm electrodes. the resistance of the mtj depends on the relative orientation of the magnetization in the two fm layers.
in standard applications, the magnetization of one fm layer (the reference layer) is commonly pinned, whereas the other (storage) layer is free to take a parallel (p) or an anti-parallel (ap) orientation, thus determining parallel (rp) or anti-parallel (rap) mtj resistance and storing a binary state. the relative difference between these two resistances defines the tunnel magneto-resistance (tmr) ratio, ∆r/r=(rap-rp)/rp. in recent decades, much research effort has been invested in improving the tmr ratio of mtjs to make them more attractive for integration with cmos. today, commercial mtjs that use mgo oxide barriers have a tmr of about 200% [10], whereas some laboratory prototypes can have a tmr of up to 1000% [11]. the mechanism for switching between two mtj states (i.e. writing non-volatile data) is also an important research field that influences the area, speed and power performances of hybrid mtj/cmos circuits. early field-induced magnetic switching (fims) required writing currents in the order of a few milliamperes and thus very large driving transistors and write lines that penalized the die area of hybrid circuits [12]. thermally assisted switching (tas) has undergone improvement in terms of bit selectivity and writing efficiency. prior to switching, mtj stack is heated above the blocking temperature of the free layer. afterward, the state of the mtj is completely controlled by the external magnetic field [13]. however, due to the required heating and cooling latencies, tas-mtjs exhibit low switching speeds (about 20ns [14]), meaning they are not efficient enough for use in "normally-off and instant-on" computing systems. recent current induced magnetic switching (cims) methods use the spin-transfer torque (stt) effect proposed by berger [15] and slonczewski [16]. this enables magnetization of the free layer to be switched with only one, low, spin-polarized bidirectional current passing through the mtj stack, as illustrated in fig. 2b) and 2c). 
if the density of the spin-polarized writing current is greater than the critical current density (jco), the mtj resistance is determined only by the direction of the current.

fig. 2 a) cmos-mtj co-integration; b) in-plane stt mtj writing; c) perpendicular stt mtj writing.

mature and commercialized stt-mtjs with in-plane magnetization have very fast mtj switching speeds (up to 100 ps, according to [17]). however, with writing currents of hundreds of microamperes, this switching approach is still not efficient since it consumes a lot of energy and requires large driving transistors. furthermore, it suffers from reliability issues including data thermal stability, erroneous writes caused by the read current and short retention times [18]. the high error rate of the reading circuits is an additional obstacle. emerging perpendicular stt-mtj structures, in which the magnetization direction is perpendicular to the film plane, have proved to be the breakthrough technology that enables a significant reduction in the required switching current (several tens of microamperes) as well as improvements in data thermal stability. perpendicular stt-mtjs are slightly slower than their in-plane counterparts. however, both their energy efficiency and their reported switching speeds of a few ns [19], which are comparable with the write speeds of advanced sram cells, make them appropriate for use in "normally-off and instant-on" computing systems [17, 19]. in the following section, we present two hybrid cells that combine perpendicular stt-mtjs as non-volatile storage elements with cmos transistors used to store a volatile data bit.

3. hybrid (mtj/cmos) memory cells

the memory cells described here are based on hybrid (volatile/non-volatile) cross-coupled inverters. they have perpendicular stt-mtjs "embedded" within a cmos part, which makes them suitable to replace sram-based volatile memory cells or flip-flops located near the processor's arithmetic logic unit (alu).
the unique feature of these cells is that while the cpu is in the active state, they behave as conventional cmos-based flip-flops or sram memory cells with a very high speed of operation (> 2 ghz). while the cpu is in the stand-by state, data are stored in the mtjs and zero stand-by power is achieved by power gating. after the power supply returns, the cell itself operates as a sense amplifier, automatically restoring the data saved in the mtjs into the sram or flip-flop. this enables the processor core to quickly become ready to start arithmetic operation. furthermore, such cells allow run-time saving of the processor's context (non-volatile check-pointing), thus significantly improving the reliability of data processing.

3.1. 6t-4m hybrid cell with double non-volatile context

the first hybrid cell we propose is shown in fig. 3. it has a structure similar to that of a conventional 6t-sram cell. a volatile (sram) data context consists of the cross-coupled inverters (cmos latch) used to store one data bit in its electrical, complementary form (q, !q). in addition to the cmos latch, the cell has two non-volatile (mram) contexts located in the pull-up and pull-down networks of the latch structure. each mram context contains two perpendicular stt-mtjs that, for the correct operation of the cell, must be in mutually complementary states (rp/rap or vice versa).

fig. 3 6t-4m hybrid memory cell.

the procedure of writing a volatile data bit is exactly the same as in the conventional sram memory cell. the volatile data bit to be written and its complementary value are connected to the bl and blb lines, respectively. after activation of the access transistors (mn3 and mn4) with the wl signal pulse, the volatile data bit is stored in the cmos latch. reading the non-volatile data bit (i.e.
restoring the mram context to sram) consists of converting the physical value (resistance) stored in the mtjs into its electrical equivalent, which will be stored in the cmos latch. fig. 4 illustrates the reading phase of the mram_2 context (mtjs in the pull-down network). to read this mram context, the bl and blb lines need to be pre-charged to vdd. the reading phase begins with activation of the wl signal (wl=vdd). consequently, the pull-down transistors (mn1/mn2) of the cmos latch are switched on, whereas the pull-up ones (mp1/mp2) are blocked (off). in both pull-down branches of the hybrid cell, there is a current flowing from the bl/blb lines through the access transistors and nmos pull-down transistors to the ground (gnd). provided that the cell is fully symmetrical (the transistors in both branches have equal on resistances since they have the same dimensions), the voltage drops on the q and !q nodes depend entirely on the resistances of the mtjs in the mram_2 context that are in the path of the current. furthermore, if both the transistors and the mtjs are carefully sized, the voltages on the latch nodes q and !q can be adjusted to be one below and the other above the meta-stable voltage (vmeta), depending on the non-volatile data bit stored in the mram_2 context. as illustrated on the transfer curve in fig. 4a), non-volatile data bit '1' stored in the mram_2 context (rap/rp configuration) will cause the voltage on the q node to be greater than the meta-stable voltage (vq > vmeta). the opposite will occur if the mram_2 context stores non-volatile data bit '0' (rp/rap configuration, fig. 4b)): the q and !q voltages will be below and above the meta-stable voltage, respectively (vq < vmeta; v!q > vmeta).

fig. 4 the phase of reading the mram_2 context that stores: a) non-volatile data bit '1' (rap/rp); b) non-volatile data bit '0' (rp/rap).
in both scenarios, at the end of the mram reading phase, when the wl signal is deactivated and the access transistors are turned off, the cmos latch converges from an unbalanced state to one of its stable states, which is strictly determined by the state (resistance) of the mtjs in the mram_2 context. the procedure of reading the mram_1 context is the same. the only difference is that, in this case, the bl and blb lines need to be pre-charged to gnd. consequently, the pull-up network is activated and the current flows in both branches from the power supply (vdd) to the bl/blb nodes (which are now at ground potential), putting the latch in a meta-stable state. finally, when the access transistors are deactivated, the latch converges from an unbalanced state to a stable one determined by the non-volatile data bit stored in the mram_1 context (mtj2 and mtj3). the rp/rap configuration for mtj2/mtj3 stores non-volatile data bit '1' whereas the rap/rp combination is used to store a non-volatile '0' bit.

3.2. 4t-2m hybrid cell with single non-volatile context

in order to further decrease the required implementation area, we propose another hybrid cell with a structure similar to that of a 4t-sram loadless volatile memory cell. as shown in fig. 5, it contains two pmos access transistors (mp1 and mp2) with low threshold voltage (vth) and two cross-coupled nmos transistors (mn1 and mn2) used to store one volatile data bit. in addition, the cell has one non-volatile (mram) context located in the pull-down network. it contains two perpendicular stt-mtjs that, for the correct operation of the cell, must be in mutually complementary states (rp/rap or vice versa).

fig. 5 a) the 4t-2m hybrid memory cell; b) the same cell with the stt writing interface and current generator (cg) design.
the low threshold voltage of the pmos access transistors implies increased subthreshold leakage current compared to the leakage of the pull-down nmos transistors (ioffp>ioffn). this, in turn, ensures volatile data retention when the cell is on stand-by (bl,blb,wl = vdd). the procedure of writing a volatile data bit is exactly the same as in conventional 4t-sram loadless memory cells, whereas the restoring phase is similar to that of the previously described 6t-4m hybrid cell. fig. 5b) shows the stt writing interface. in addition to the current generator (cg) that supplies the bi-directional, spin-polarized current needed to write a non-volatile data bit (d), it contains the footer transistor mn5 as well as the pass transistors mn3 and mn4. in normal cell operation, these three transistors are always switched on (wr='0'). conversely, during the phase of writing a non-volatile data bit (wr='1'), they cut the mtjs off from the ground rails and the cross-coupled nmos transistors, ensuring that the spin-polarized cg current passes through both mtjs in mutually opposite directions. the direction of the cg current is strictly determined by the non-volatile data bit to be written (d). given that in the idle state the cg inverters have active pull-down networks (logic zero at the inverters' outputs), the volatile data bit (electrical charge) stored in the nmos cross-coupled transistors could discharge through the cg. to prevent this from happening, a power-gating transistor mng is used to cut off the cg from the ground rails during its idle state.

4. evaluation of hybrid cells

before measuring the performance of the cells, we implemented them in cadence spectre using stmicroelectronics 28 nm fully depleted silicon on insulator (fd-soi) technology for the cmos part [20] and 45 nm wide, round, perpendicular stt-mtjs for the non-volatile part. however, it should be said that using soi is not essential for the proper operation of the hybrid cells presented here.
they could be implemented in any standard cmos technology node. thanks to the presence of buried oxide in the transistor structure, fd-soi technology has proved to be very reliable in providing high speed at low voltage [21]. for our measurements, we used a power supply of vdd=1.1 v. furthermore, the buried oxide significantly reduces standby power consumption by reducing both gate-induced drain leakage and junction leakage currents. in addition, the wide-range back-gate controllability of the fd-soi structure enables optimization of both performance and power after fabrication. perpendicular stt-mtjs were co-integrated with cmos using the open source spinlib physical model [22]. the model gives the resistance of an mtj depending on its magnetic configuration (p or ap) and its bias voltage. it also defines the current thresholds required to switch between the two configurations. finally, the model takes the switching delays, including stochastic fluctuations, into account. to achieve high simulation accuracy, the model was calibrated with respect to the experimental data provided by toshiba and ibm. table 1 summarizes some of the mtj parameters that are important for co-integration with cmos. as can be seen, the required switching currents are a few dozen microamperes, whereas the switching current pulses are in the order of a few nanoseconds. however, it is worth mentioning that the stt writing mechanism makes it possible to trade the amount of switching current against the duration of the switching pulse: increasing the former entails decreasing the latter. thus, it would be possible to speed up non-volatile writing by increasing the amount of writing current, or to make it more energy efficient by increasing the duration of the writing current pulse. with a breakdown voltage of nearly 1 v and a supply voltage of 1.1 v, the stt-mtjs are in a safe area of operation (we measured 484 mv of voltage across the mtj during the switching phase).
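as a quick consistency check on the mtj parameters of table 1, the tmr definition from section 2, (rap-rp)/rp, can be evaluated directly. the sketch below does this and also forms a rough order-of-magnitude switching energy as voltage x current x pulse length. the energy figure is an assumption made here for illustration only: it supposes that the measured 484 mv applies to both switching directions and that the current stays at its threshold value for the whole pulse; it is not a number reported in the paper.

```python
# values copied from table 1
r_p_kohm = 3.14    # parallel resistance
r_ap_kohm = 9.4    # anti-parallel resistance

# tmr ratio as defined in section 2: (rap - rp) / rp
tmr = (r_ap_kohm - r_p_kohm) / r_p_kohm
print(f"tmr = {100 * tmr:.1f} %")

# rough switching-energy estimate (illustrative assumption, see lead-in)
v_mtj = 0.484  # volts, measured across the mtj during switching
switching = {"p->ap": (60.0, 4.27), "ap->p": (50.0, 4.71)}  # (uA, ns)

energy_fj = {}
for direction, (i_sw_ua, t_sw_ns) in switching.items():
    # V * uA * ns = 1e-15 J, i.e. the product is already in femtojoules
    energy_fj[direction] = v_mtj * i_sw_ua * t_sw_ns
    print(f"{direction}: about {energy_fj[direction]:.0f} fj")
```

the computed ratio comes out just under 200 %, which agrees with the rounded tmr value given in the table.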
table 1 main parameters of perpendicular stt-mtjs

parameter      description               value
rp/rap [kΩ]    p/ap mtj resistance       3.14/9.4
isw [µa]       switching currents        p → ap: ~60; ap → p: ~50
tsw [ns]       switching speed           p → ap: 4.27; ap → p: 4.71
vbd [v]        breakdown voltage         ~1
area           mtj area                  45 nm x 45 nm
ra [Ω·µm²]     resistance-area product   5
tmr [%]        tmr ratio                 200

to ensure the area efficiency of any target application of our hybrid memory cells, they have to be as small as possible, since they may be instantiated many times. that is why our first evaluation step was to find the smallest possible hybrid cell design, i.e., the smallest possible transistor sizes with which the cell was still operational. to this end, we used monte carlo (mc) analysis. the length of all the transistors in both cells was the smallest allowed by the technology (l=lmin=30 nm). we continued to vary the width (w) of the transistors as long as we obtained 0% conversion (non-volatile reading) errors in 5000 mc runs with std = 10% variations in the length and width of all the transistors. using the minimally sized hybrid cells, we went on to measure their other performance figures: static power consumption, the robustness of volatile data, the speed of writing the volatile data bit, the speed of restoring the non-volatile data bit as well as the dynamic energy required to restore it. the results are summarized in table 2. some measured parameters are also compared with the performance of conventional 4t- and 6t-sram cells implemented in pure 28 nm fd-soi cmos technology. the total transistor area (w x l) of the hybrid cells is 2-3 times bigger than that of conventional, pure cmos memory cells. this increase in area is mostly due to the presence of the stt writing interface. however, given that the hybrid cells can store 2-3 data bits, this difference in required silicon area can be considered as expected and acceptable.
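the area argument can be made concrete by dividing the transistor area of each cell by the number of bits it holds. the short sketch below uses the w x l values of table 2 and assumes, as the text suggests, one volatile bit per cell plus one bit per non-volatile context (three bits for 6t-4m, two for 4t-2m). only transistor area is counted; the dictionary names are illustrative.

```python
# transistor area (w x l) in square micrometres, copied from table 2
area_um2 = {"6t-sram": 0.024, "4t-sram": 0.0144,
            "6t-4m": 0.0624, "4t-2m": 0.0516}

# bits per cell: one volatile bit plus one bit per non-volatile context
bits = {"6t-sram": 1, "4t-sram": 1, "6t-4m": 3, "4t-2m": 2}

area_per_bit = {cell: area_um2[cell] / bits[cell] for cell in area_um2}

for cell, value in area_per_bit.items():
    print(f"{cell}: {value:.4f} um^2 per stored bit")
```

under this counting the 6t-4m cell needs about 0.0208 µm² per bit, slightly less than the 6t-sram cell, while the 4t-2m cell needs about 0.0258 µm² per bit, which remains above the 4t-sram figure.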
leakage power is calculated with the help of the cadence measurement description language (mdl) using the following formula:

$$p_s = \frac{v_{dd}}{\Delta t}\int_{t_0}^{t_1} i_{vdd}(t)\,dt \qquad (1)$$

where ivdd is the power supply current during the idle time interval Δt = t1-t0. given that 4t-sram cells preserve the volatile data bit with increased leakage currents coming from the lvt pmos transistors, their leakage power is significantly higher than the 6t-sram leakage. the low leakage power consumed by the 4t-2m hybrid cell is due to the resistive mtjs that are positioned in the path of the leakage currents (pull-down network of the cross-coupled transistors). the 6t-4m hybrid cell consumes more static power simply because it contains more transistors. however, unlike conventional sram cells, our hybrid cells can store a volatile data bit into a non-volatile context, meaning the power supply can be turned off. this, in turn, completely eliminates leakage power.

table 2 evaluated performance of hybrid cells

                        6t-sram   4t-sram   6t-4m        4t-2m
w x l [µm²]             0.024     0.0144    0.0624       0.0516
leakage [nw]            0.93      5.67      3.15         1.9
mtj reading [ps] §                          ctx1 33.4    92.7
                                            ctx2 60.8
erd [fj = µw/ghz] ¥                         3.94         3.17
vol. writing [ps] º     6.8       4.9       10           9.8
snm [mv] *              395       154       318          98

§ the speed of reading (restoring) a non-volatile data bit stored in mtjs
¥ dynamic energy consumed during the phase of reading a non-volatile data bit
º the speed of writing a volatile data bit
* the higher the snm, the better the robustness

to determine the speed of restoring a non-volatile data bit to the volatile context, we kept increasing the width of the reading wl pulse (using the binary search method) until the first correct reading operation was detected. the measured minimum reading pulse determines the maximum possible reading speed. as can be seen from table 2, the non-volatile data bit can be read in a gigahertz regime.
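the two measurement steps described in this part of the evaluation, the time-averaged leakage power and the binary search for the minimum reading pulse, can be sketched in a few lines of python. this is only an illustration under stated assumptions: the current waveform is taken to be available as sampled lists, and `read_ok` is a hypothetical stand-in for one circuit simulation that reports whether a pulse of the given width restores the bit correctly. neither function reproduces the authors' mdl scripts.

```python
def leakage_power(v_dd, t, i_vdd):
    """average supply power over [t[0], t[-1]] using the trapezoidal rule.

    t and i_vdd are equally long lists of sample times (s) and
    supply currents (A); the result is in watts.
    """
    charge = sum(0.5 * (i_vdd[k] + i_vdd[k + 1]) * (t[k + 1] - t[k])
                 for k in range(len(t) - 1))
    return v_dd * charge / (t[-1] - t[0])


def min_pulse_width(read_ok, lo_ps, hi_ps, tol_ps=0.1):
    """binary search for the shortest pulse width accepted by read_ok.

    assumes read_ok is monotonic: every width above the threshold works.
    hi_ps must already work; lo_ps is a width known (or assumed) to fail.
    """
    if not read_ok(hi_ps):
        raise ValueError("read fails even at the upper bound")
    while hi_ps - lo_ps > tol_ps:
        mid = 0.5 * (lo_ps + hi_ps)
        if read_ok(mid):
            hi_ps = mid      # mid works: the threshold is at or below mid
        else:
            lo_ps = mid      # mid fails: the threshold is above mid
    return hi_ps


# toy usage: constant 1 nA leakage at 1.1 V, and a made-up pass/fail
# threshold placed at 33.4 ps purely to exercise the search
p_s = leakage_power(1.1, [0.0, 1e-9, 2e-9], [1e-9, 1e-9, 1e-9])
width = min_pulse_width(lambda w: w >= 33.4, 0.0, 200.0)
print(p_s, width)
```

the search keeps an interval whose upper end always works, so the returned width is a safe value within the chosen tolerance of the true minimum.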
in the 6t-4m hybrid cell, the non-volatile mram_1 context is faster than its mram_2 counterpart because the pull-up network in our cell is less resistive than the pull-down one. the 4t-2m hybrid cell is slightly slower because of the position of the mtjs as well as the sub-threshold working regime, which slows down the transition to the unbalanced state. the minimum dynamic energy consumed by the hybrid cell during the phase of non-volatile reading is listed in the middle of table 2. it was calculated by:

$$e_{rd} = v_{dd}\int_{t_0}^{t_1} i_{vdd}(t)\,dt - p_s\,\Delta t \qquad (2)$$

where ivdd is the power supply current during the restoration phase, Δt = t1-t0 is the previously determined minimum duration of the reading pulse, and ps is the leakage power consumption of the cell. both the 4t-2m and 6t-4m hybrid cells exhibit similar performance in terms of required restoration energy. to measure the speed of writing the volatile data bit, we used the binary search method to determine the minimum width of the wl pulse needed to write the volatile data bit set on the bl/blb lines. the measured values are listed near the bottom of table 2. it can be seen that the volatile writing speeds of all the cells are in the range of 10 ps or less. the presence of the stt writing interface and resistive mtjs slightly reduces the volatile writing speeds of our hybrid cells compared to both the 4t- and 6t-sram cells. the hybrid cells we present here use cross-coupled inverters (6t-4m) or cross-coupled nmos transistors (4t-2m) to store the volatile data bit. the stability of this kind of structure is typically expressed in terms of its static noise margin (snm). informally, the static noise margin can be understood as the minimum voltage disturbance that could flip the volatile data stored in the memory cell. fig. 6 shows the conceptual measurement setup we used to measure snm. dc noise sources with the value vn were introduced between the gates of the nmos transistors and the output q, !q nodes.
using spectre mdl, we increased the noise voltage vn until we detected flipping of the volatile data. we repeated the same procedure for both possible values of the volatile data bit (q=1 and q=0) stored in the cross-coupled nmos transistors. the measured snms (worst case) of the hybrid cells are listed at the bottom of table 2. it can be seen that the 4t-2m hybrid cell is the most sensitive to voltage noise. to benefit from the dual storage facility, a significant property of the hybrid cell would be its ability to write a non-volatile data bit without disturbing the volatile data (q, !q). in this way, a device based on hybrid cells may profit from the run-time (on-the-fly) reconfiguration ability. during the processing of volatile data bits, some background operation may write non-volatile ones in parallel. to investigate this ability for the 6t-4m hybrid cell, we monitored the disturbance of volatile data (logic level degradation) during the non-volatile writing phase. we report logic level degradations of 92 mv and 116 mv for the mram_1 and mram_2 contexts, respectively. given that the logic level degradations are less than the snm value, we conclude that both non-volatile mram contexts of our cell can be dynamically reconfigured.

fig. 6 a) static noise margin (snm) measurement setup for the 6t-4m hybrid cell; b) snm measurement setup for the 4t-2m hybrid cell.

finally, it is worth mentioning that the performance analysis presented above is not completely exhaustive. we did not take into account the influence of the mtj process variations that become more and more critical, particularly in terms of resistance variations. moreover, we did not consider the sensitive aspects of integrating mtj electric signals into cmos electronics (a reliability of nearly zero run-time errors is required by logic applications) [23, 24].
this, together with the influence of voltage and temperature variations, will be included in our future work.

5. conclusion

this paper presents two hybrid cells that are able to store and process one data bit both electrically and magnetically. the cells are based on 4t- and 6t-sram architectures and use recently emerging perpendicular stt-mtj nanopillars as non-volatile storage elements. the measured performance of both hybrid cells implemented in 28 nm fd-soi technology combined with 45 nm round stt-mtjs showed that the cells are ready to be used in "normally-off and instant-on" computing systems. the cells need less than 100 ps to restore a non-volatile data bit, spending not more than 4 fj for the operation. the volatile data bit can be written in a time of about 10 ps or less. moreover, the 6t-4m hybrid cell presented here has a few clear advantages compared to existing hybrid cells: two non-volatile data contexts and the ability to write a volatile data bit. the cells presented here have the potential to completely eliminate the idle power consumption of battery-powered systems-on-chip. they are also suitable for non-volatile reconfigurable logic applications (non-volatile registers, processor cache, magnetic fpgas, etc).

acknowledgement: this research was sponsored in part by the french national agency for scientific research (anr), through the projects dipmem and mars, as well as by the serbian ministry of science and technological development, through the project iii-44004.

references
[1] j. rabaey, low power design essentials. new york: springer-verlag, 2009.
[2] p. rech, j.-m. galliere, p. girard, f. wrobel, f. saigne, and l. dilillo, "impact of resistive-open defects on sram error rate induced by alpha particles and neutrons", ieee transactions on nuclear science, vol. 58, pp. 855-861, 2011.
[3] r. sandeep, n.t. deshpande, and a.r.
aswatha, "design and analysis of a new loadless 4t sram cell in deep submicron cmos technologies", in proceedings of the 2nd international conference on emerging trends in engineering and technology, nagpur, 16-18 dec. 2009, pp. 155-161.
[4] k. abe, s. fujita, and h. lee, "novel nonvolatile logic circuits with three-dimensionally stacked nanoscale memory device", in proceedings of nanotechnology conference, anaheim, california, 8-12 may 2005, pp. 203-206.
[5] semiconductor industry association (sia). (2011) international technology roadmap for semiconductors. san jose, ca: semiconductor industry association (sia), http://www.itrs.net/. accessed march 13, 2015.
[6] s. james, p. arujo, and a. carlos, "ferroelectric memories", science, vol. 246, pp. 1400-1405, 1989.
[7] h. wong, s. raoux, s. kim et al., "phase change memory", invited paper, in proceedings of the ieee, 2010, vol. 98, pp. 2201-2227.
[8] c. chappert, a. fert, and v. dau, "the emergence of spin electronics in data storage", nature materials, vol. 6, pp. 813-823, 2007.
[9] w. zhao, e. belhaire, c. chappert, and p. mazoyer, "spintronic device based non-volatile low standby power sram", in proceedings of ieee annual symposium on vlsi, montpellier, 7-9 apr. 2008, pp. 40-45.
[10] s. ikeda, h. sato, m. yamanouchi, et al., "recent progress of perpendicular anisotropy magnetic tunnel junctions for non-volatile vlsi", journal of spin, vol. 2, pp. 1240003-1 124003-12, 2012.
[11] t. kawahara, k. ito, r. takemara, and h. ohno, "spin-transfer torque ram technology: review and prospect", microelectronics reliability, vol. 52, pp. 613-627, 2012.
[12] w. zhao, e. belhaire, c. chappert, and p. mazoyer, "power and area optimization for run-time reconfiguration sopc based on mram", ieee transactions on magnetics, vol. 45, pp. 776-780, 2009.
[13] l. torres, y. guillemenet, and s.
ahmed, "a dynamic reconfigurable mram-based fpga", in proceedings of international conference on engineering of reconfigurable systems and algorithms, las vegas, nevada, 12-15 jul. 2010, pp. 31-40.
[14] d. suzuki, m. natsui, s. ikeda, et al., "fabrication of a nonvolatile lookup-table circuit chip using magneto/semiconductor hybrid structure for an immediate-power-up field programmable gate array", in proceedings of ieee symposium on vlsi circuits, kyoto, 16-14 jun. 2009, pp. 80-81.
[15] l. berger, "emission of spin waves by a magnetic multilayer traversed by a current", physical review b, vol. 54, pp. 9353–9358, 1996.
[16] j. c. slonczewski, "current-driven excitation of magnetic multilayers", journal of magnetism and magnetic materials, vol. 159, pp. l1–l7, 1996.
[17] h. yoda, s. fujita, n. shimomura, et al., "progress of stt-mram technology and the effect on normally-off computing systems", in proceedings of ieee international electron devices meeting, san francisco, california, 10-13 dec. 2012, pp. 11.3.1 11.3.4.
[18] r. takemura, t. kawahara, k. ono, k. miura, h. matsuoka, and h. ohno, "highly-scalable disruptive reading scheme for gb-scale sram and beyond", in proceedings of ieee international memory workshop, seoul, 16-19 may 2010, pp. 1-2.
[19] e. kiagawa, s. fujita, k. nomura, et al., "impact of ultra low power and fast write operation of advanced perpendicular mtj on power reduction for high-performance mobile cpu", in proceedings of ieee international electron devices meeting, san francisco, california, 10-13 dec. 2012, pp. 29.4.1 29.4.4.
[20] n. planes, o. weber, v. barral, et al., "28nm fdsoi technology platform for high-speed low-voltage digital applications", in proceedings of the symposium on vlsi technology, honolulu, hawaii, 12-14 jun. 2012, pp. 133-134.
[21] t. ishikagi, r. tsuchiya, y. morita, et al.
"silicon on thin box (sotb) cmos for ultralow standby power with forward-biasing performance booster", in proceedings of the european solid-state device research conference, edinburgh, 15-19 sep. 2008, pp. 198-201.
[22] y. zhang, w. zhao, y. lakys, "compact modeling of perpendicular-anisotropy cofeb/mgo magnetic tunnel junctions", ieee transactions on electron devices, vol. 59, pp. 819-826, 2012.
[23] w. kang, w. zhao, e. deng et al., "a radiation hardened hybrid spintronic/cmos non-volatile unit using magnetic tunnel junctions", journal of physics d: applied physics, vol. 47, p. 405003, 2014.
[24] w. kang, e. deng, j. o. klein et al., "separated pre-charge sensing amplifier for deep submicron mtj/cmos hybrid logic circuits", ieee transactions on magnetics, vol. 50, pp. 3400305-5, 2014.

facta universitatis
series: electronics and energetics vol. 29, no 2, june 2016, pp. 269-283
doi: 10.2298/fuee1602269m

design of iir digital filters with critical monotonic passband amplitude characteristic: a case study

dejan mirković, miona andrejević stošović, predrag petković, vančo litovski
university of niš, faculty of electronic engineering, serbia

abstract. a case study is reported related to the design of iir digital filters exhibiting a critical monotonic amplitude characteristic (cmac) in the pass band. this kind of amplitude characteristic offers several advantages as compared to its non-monotonic counterparts, although it has not been studied thoroughly so far, if at all. after giving a short overview of how cmacs are generated, arguments will be listed in favor of the iir version of the digital filter function realization. next, the iir implementation of the digital filters will be considered in short. the main part of the paper will be devoted to the design sequence of this kind of filter, which will be illustrated on the example of a band-pass filter obtained by a set of transformations from an all-pole low-pass analogue prototype.
this will be the first time a cmac band-pass iir digital filter is reported.

key words: digital filters, iir, monotone amplitude characteristic, all-pole filters

received april 24, 2015; received in revised form august 3, 2015
corresponding author: dejan mirković, faculty of electronic engineering, university of niš, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: dejan.mirkovic@elfak.ni.ac.rs)

1. introduction

the critical monotonic amplitude characteristic (cmac) filters represent an extension of the broad family of filtering functions having all transmission zeroes at infinity [1]. they exhibit distinctive properties such as monotonic amplitude response in the pass band, reduced group delay distortions, higher symmetry of the pulse response, improved mapping of tolerances, improved sensitivity, and high selectivity. the interest in a digital realization of this kind of filtering function stems from several reasons. first of all, only one sub-class of these functions has already been published in its digital form, the butterworth filters [2]. as shown in [1] and elsewhere, however, practically all sub-classes of cmac functions outperform the butterworth solution in almost every aspect of implementation with the exception of the function's simplicity. this study is a part of our effort to make cmacs more popular and to help bridge the gap between designers and cmac, which has deepened over time [3]. second, due to their monotonic behavior, their sensitivity in the passband is reduced and, accordingly, they offer a good alternative to their non-monotonic counterparts (e.g. chebyshev and least pth [4]). at the same time, this means an improvement in the mapping of the tolerances of the circuit parameters into the tolerances of the attenuation characteristic [5].
finally, they exhibit smaller distortions of the passband group delay, which reduces the complexity of the potential phase corrector to be used to flatten the group delay characteristic [6]. this also means that cmac filters have smaller asymmetry of the response to a dirac pulse in the time domain, which may be of crucial importance for some applications in telecommunication and signal processing. it is our opinion that the advantages of cmac filtering functions have not been completely understood in the research and design community. that holds especially for the iir implementation, where no instances of cmac implementation may be found. the reason for that, in our opinion, is inertia and the need for some additional (mathematical) knowledge for generation of the cmac transfer functions as compared with the chebyshev and butterworth filters. here we try to reopen the subject of cmac design by reporting the results related to the design of a band-pass digital iir filter, which is the first implementation of a band-pass cmac of all. a designer first has to decide either to go for fir filters and start the synthesis of transfer functions for each type of cmac from scratch, or to go for iir filters and transform the existing analog data into the digital domain. in the text below, a short paragraph is devoted to helping with this decision. as a conclusion, the designer will be advised to go for an iir filter with parallel implementation as the most economical solution in almost every respect. next, one has to create the cmac transfer function and to choose among the sub-classes. again, a short paragraph will be devoted to this issue. four main sub-classes of cmac will be described from the implementation point of view. the corresponding transfer function generation will be discussed briefly. based on these, a design sequence will be advised for finding the coefficients of the transfer function of iir filters in the z-domain.
note that parallel implementation will be recommended and all the calculations will be performed under that presumption. the transformed function will be studied from both the stability and the accuracy point of view.

the procedure will be exemplified on the case of a band-pass iir filter. to get it, a low-pass to band-pass transformation was performed in the analog domain. the analog prototype so obtained was then transformed into the z-domain by the bilinear transformation. the implementation obtained in this way was evaluated by simulation of a filter excited by a complex signal in the time domain. various possible computing technologies were taken into account by changing the number of significant figures for the computations in order to establish the most economical implementation satisfying the design requirements.

the paper is organized as follows. in the second section arguments will be given for adopting iir digital filters. in the third section the cmac function will be introduced. then, in the fourth section, the application of the bilinear transform to a parallelized analog transfer function will be given. the case study describing the design (and its verification) of a band-pass cmac filter will be given in the fifth section.

design of iir digital filters with critical monotonic passband amplitude characteristic

2. properties of the iir digital filters

in digital filter design, one is to decide first on the choice between fir and iir filter functions and then to proceed to the approximation problem. then, one is to choose among different structures exhibiting the same transfer function. in the case of digital filters, the choice is to be made between the canonical (or state variable) and the parallel form. these two are illustrated in fig. 1 for an iir digital filter. it should be noted that if the order of the filter, n, is even, the first order cell at the bottom of fig. 1b is omitted, leaving only second order cells in the filter realization.
when deciding between fir and iir filters one has to keep in mind several criteria such as complexity of the solution, stability of the system, and processing time. the first criterion may be split into several having the same origin. namely, the complexity of the solution will influence the power consumption, the silicon area and the design effort, especially when special techniques are to be implemented for reduction of the power consumption [7].

fig. 1 realization of an nth order iir filter, a) canonical b) parallel (for n odd)

note that not all of the criteria are of equal weight in design. for some applications the latency, i.e. the computational time, may be of prime importance since it determines the speed. in others, reduction of heating or silicon area may prevail as the main criterion. putting it all together, the choice is to be made by taking into account several, if not all, criteria. in our detailed study [3] we came to the following.

the use of iir filters has the following advantages:
1. lower complexity (in some cases, e.g. [8], incomparably lower);
2. lower dissipation;
3. lower silicon area;
4. available analog prototypes to transform.

the use of fir filters has the following advantages:
1. lower latency;
2. easier synthesis of linear phase filters;
3. better stability.

the use of the parallel architecture for the iir filters, as shown in fig. 1b, however, mitigates all disadvantages (stability, latency) of the iir filters, while there are no methods to do the same for the fir counterparts. it should be noted here that getting a linear phase with fir filters doubles the complexity of the solution, while using a phase corrector for the iir solution contributes marginally to its complexity [8]. that was the reason why we adopted the parallel architecture and the iir filter structure for the implementation in the cmac design.

3. cmac filters in the s-domain

polynomial (or all-pole) filters with critical monotonic amplitude characteristics (cmac) in the passband have been available for several decades now [1]. the main property of cmac is related to the critical monotonicity of the amplitude response in the passband, which will first be described here in short. the squared amplitude characteristic may be expressed as

$$|H(j\omega)|^2 = \frac{1}{1 + K(\omega^2)}, \qquad (1)$$

where $K(\omega^2)$ is the characteristic function. in the simplest form (as proposed in [9]), for n even, one has:

$$K(\omega^2) = \varepsilon^2 L_n(\omega^2) = \varepsilon^2 \, \frac{\prod_{i=1}^{\lfloor n/2 \rfloor} (\omega^2 - \omega_i^2)^2}{\prod_{i=1}^{\lfloor n/2 \rfloor} (1 - \omega_i^2)^2}, \qquad (2)$$

where $\omega$ is the normalized angular frequency, n is the order of the filter, $\varepsilon$ defines the insertion loss at the passband edge, i.e. $\varepsilon^2 = 10^{a_{\max}/10} - 1$, $0 < \omega_i < 1$, $i = 1, 2, \ldots, \lfloor n/2 \rfloor$ are the abscissae of the inflection points, $a_{\max}$ is the maximum allowable attenuation (in db) in the passband, and $\lfloor\ \rfloor$ denotes the floor function. $L_n(\omega^2)$ is a polynomial with n second order real zeroes located in the interval (−1, 1). since the characteristic function has a maximum number of inflection points in the passband, so do the amplitude characteristic and the attenuation, the last one being defined as

$$a(\omega^2) = 10 \log\left(1 / |H(j\omega)|^2\right) \ \text{[db]}. \qquad (3)$$

the main property of cmac leads to a good mapping of the element tolerances into the tolerances of the attenuation. namely, as shown in [4], the tolerance of the attenuation may be expressed in the following form:

$$x_i \frac{\partial a}{\partial x_i} = \omega \frac{\partial a}{\partial \omega}, \qquad (4)$$

where $x_i$ is the ith parameter of the analog circuit. having a maximal number of inflection points (where both the first and the second derivatives are equal to zero) of the amplitude characteristic in the passband, the cmac forces the left-hand side of (4) to go through zero a maximal number of times.
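the monotonic behavior of the attenuation can be illustrated numerically. the following is a minimal sketch, assuming the simplest cmac sub-class, the butterworth filter, whose characteristic function is $K(\omega^2) = \varepsilon^2 \omega^{2n}$; the function names and the choice $a_{\max}$ = 3 db are assumptions made only for this example and are not taken from the paper.

```python
import math


def epsilon_squared(a_max_db):
    """eps^2 = 10^(a_max/10) - 1, with a_max the passband-edge attenuation in dB."""
    return 10.0 ** (a_max_db / 10.0) - 1.0


def butterworth_attenuation_db(omega, n, a_max_db):
    """Attenuation (3) for the Butterworth case K(w^2) = eps^2 * w^(2n)."""
    k = epsilon_squared(a_max_db) * omega ** (2 * n)
    squared_amplitude = 1.0 / (1.0 + k)                 # equation (1)
    return 10.0 * math.log10(1.0 / squared_amplitude)   # equation (3)


if __name__ == "__main__":
    n, a_max = 7, 3.0   # illustrative order and passband-edge attenuation
    for w in (0.0, 0.25, 0.5, 0.75, 1.0):
        a = butterworth_attenuation_db(w, n, a_max)
        print(f"omega = {w:4.2f}   a = {a:.6f} dB")
```

the attenuation grows monotonically from zero at the origin to $a_{\max}$ at the normalized passband edge $\omega = 1$; the other cmac sub-classes differ only in the polynomial used for $K(\omega^2)$.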
note that the derivative of a in (4) does not change its sign if cmac is used, since it is monotonic, which is different from the non-monotonic functions, e.g. c and ls. filters exhibiting the cmac characteristic are also known to have lower group delay distortions in the passband than their c and ls counterparts [10]. altogether, the existence of cmac gives the filter designer an additional freedom in the choice of the best solution for a filter design problem.

there are four main classes of cmac as discussed in [1] and [10]. they originate from the design criteria implemented for synthesis of the transfer function. these criteria are:
1. maximally flat in the origin. the class of filters thus obtained is called butterworth’s after the author [11]. these will be here referred to as b-filters.
2. maximum slope of the characteristic function at the edge of the passband [12], [13], [14]. the name l-filters comes from the fact that legendre polynomials were used for the original derivation.
3. maximum asymptotic attenuation [15]. here, these will be referred to as h-filters.
4. least-squares-monotonic. in this case, the reflected power in the passband was minimized under the critical monotonicity criterion [16]; the resulting filters are named lsm filters.

a catalog of the coefficients of the transfer functions of all four classes of cmac for n up to 10, obtained by these criteria, was published in [1], where a comparative study was also given. to illustrate this here, fig. 2 depicts the passband attenuation characteristics of the above four classes for n=7.

before proceeding to cmac digital filter design, recall that the arguments for using iir filters were discussed in section 2, based on the comparison of the properties of fir and iir filters.

fig. 2 the four main cmac approximants for n=7

4. the bilinear transform and cmac in the z-domain

there are several transformations claiming to preserve some of the original properties of the analog filter function when producing a digital domain counterpart. as listed in [17], these methods may be categorized in two groups. the first group contains the ones which implement a specific criterion of approximation, such as: the impulse response invariant method; the modified impulse response invariant method; the step response invariant method (or zero order hold); the magnitude-invariance method; and the phase-invariance method. there are, however, transformations based on substitution of the complex frequency in the s-domain by an expression being a function of z. in that way, one has the matched-z transform method, and three methods obtained by approximation of the analog integrator by a digital one. these are known as the backward euler (backward difference) method; the trapezoidal method or the bilinear transform method, discussed in [18]; and the second order formula introduced in [17].

the most popular among all of these is the bilinear transform. its main properties are simplicity of implementation and good preservation of the properties of the amplitude characteristic of the analog filter. it preserves the stability of the analog prototype. it introduces distortions (reduced by increasing the sampling rate) into the phase (group delay) characteristic which, however, are of no importance in many applications. it is implemented via the following substitution into the analog transfer function:

$$s = \frac{2}{T} \, \frac{z - 1}{z + 1}, \qquad (5)$$

where z is the complex frequency variable of the digital domain, and T is the sampling period, $T = 1/f_s$, $f_s$ being the sampling frequency. in that way

$$H_a(s) = H_a\!\left(\frac{2}{T} \, \frac{z - 1}{z + 1}\right) = H_d(z) \qquad (6)$$

is obtained, where $H_a$ stands for the transfer function of the analog filter, while $H_d$ stands for that of the digital filter.
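the pole mapping implied by (5) can be sketched as follows. this is only an illustration, not the authors’ code: it takes band-pass pole no. 1 of table 1 together with f0 = 3 khz and fs = 50 khz from section 5, and it assumes that each second order cell has the denominator $1 + d_1 z^{-1} + d_2 z^{-2}$; the function names are hypothetical.

```python
import math


def bilinear_pole(s_pole, fs):
    """Map an s-plane pole to the z-plane using s = (2/T)(z - 1)/(z + 1), T = 1/fs."""
    a = s_pole / (2.0 * fs)            # a = s*T/2
    return (1.0 + a) / (1.0 - a)       # z = (1 + s*T/2) / (1 - s*T/2)


def biquad_denominator(z_pole):
    """d1, d2 of 1 + d1*z^-1 + d2*z^-2 for a complex-conjugate pole pair (assumed form)."""
    return -2.0 * z_pole.real, abs(z_pole) ** 2


if __name__ == "__main__":
    fs = 50e3                                            # sampling frequency (section 5)
    f0 = 3e3                                             # central frequency (section 5)
    p_norm = complex(-0.08266346190, 0.99657751935)      # band-pass pole no. 1, table 1
    s_pole = 2.0 * math.pi * f0 * p_norm                 # denormalized analog pole
    z_pole = bilinear_pole(s_pole, fs)
    d1, d2 = biquad_denominator(z_pole)
    print("inside unit circle:", abs(z_pole) < 1.0)
    print("d1 =", d1, " d2 =", d2)
```

under these assumptions the computed pair agrees, to the digits checked, with the full-precision d1 and d2 of cell iv in table 2, and the left-half-plane pole lands inside the unit circle, as expected from the stability-preserving property of the transform.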
the procedure of applying the bilinear transform to a parallelized analog transfer function, together with the stability analysis and numerical considerations, was discussed in [3] and will not be repeated here. instead, in the sequel, we will go for the design of a band-pass filter obtained by low-pass-to-band-pass transformation in the s-domain and then transposed into the z-domain. our goal with that design is to study all steps that remain to be performed in order to get an implementable design and to analyze the implementation problems related to the limited number of binary digits that arise in real life situations.

5. design, vhdl modeling, and simulation of cmac iir filters

the following steps are to be performed in order to get an implementable design of the filter: creation of the band-pass filter in the s-domain; performing the s-to-z transform; conversion of the decimal coefficient values into binary; scaling the coefficients to become implementable in fixed point arithmetic; and verification of the design by simulation of the filter hardware. concurrently, based on transfer function evaluation performed after taking into account the finite number of digits used for the representation of the coefficients (after quantization), a final decision can be made about the acceptability of the given approximation, i.e. of the selected number of binary digits. that and scaling are steps of crucial importance for defining the quality of the final solution.

the example filter will be created based on the following requirements:
a) band-pass (bp);
b) central frequency: f0 = 3 khz;
c) bandwidth: fbw = 900 hz;
d) sampling frequency: fs = 50 khz;
e) order of the prototype low-pass filter: n=7;
f) pass-band amplitude approximation: lsm;
g) s-to-z transform used: bilinear;
h) architecture: parallel combination of transpose direct form ii (tdf ii) filter sections.
the well known [19] low-pass to band-pass transform was used:

$$\Omega = \frac{1}{BW_r} \, \frac{\omega^2 - \omega_0^2}{\omega_0 \, \omega}, \qquad (7)$$

where $\Omega$ is the angular frequency of the prototype filter, while $\omega$ is the angular frequency of the band-pass filter. $\omega_0$ is the central angular frequency, while $BW_r = BW/\omega_0$. $BW$ is the bandwidth of the filter expressed as angular frequency. after the substitution of $s_{LP} = j\Omega$ and $s_{BP} = j\omega$, (7) becomes a second order algebraic equation with complex coefficients which is usually solved by the geffe algorithm [20]. the new function has fourteen poles obtained by solving (7), as listed in table 1 (together with the poles of the prototype lp lsm filter), and seven zeroes in the origin. in this case $BW_r = f_{bw}/f_0 = 0.3$ and $\omega_0$ = 1 rad/s were used.

table 1 pole locations of the bp and lp lsm filter in the s-domain

         band-pass                          low-pass
no.      real part        imaginary part    real part        imaginary part
1/2      -0.08266346190   ±0.99657751935    -0.1179475625    ±0.9751626241
3/4      -0.02025317565   ±1.15676424786    -0.3342221750    ±0.7735798237
5/6      -0.01513109310   ±0.86421546064    -0.4935853895    ±0.4252967357
7/8      -0.05591895577   ±1.12151432405    -0.5510897460    0.0
9/10     -0.04434769671   ±0.88944037693
11/12    -0.07876429908   ±1.06309950994
13/14    -0.06931131778   ±0.93551048922

next, the transfer function of the band-pass filter was expressed as a sum of partial fractions to enable parallel realization. before the bilinear s-to-z transformation was applied, the poles of the band-pass filter were denormalized: every pole coordinate was multiplied by 2π∙f0. based on this, the coefficients of the biquads were calculated and the s-to-z-domain mapping was enabled. the resulting coefficients of the biquads in the z-domain are given in the first row (entitled full precision) of table 2. this concludes the synthesis procedure.

we proceed now with the realization. as the first step, we encounter the necessity to express the coefficient values with a finite number of digits, as physical implementation is expected.
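the pole mapping behind table 1 can be reproduced with a short sketch. it solves the quadratic $s_{BP}^2 - BW_r\,\omega_0\,s_{LP}\,s_{BP} + \omega_0^2 = 0$, which follows from (7) after the substitution $s_{LP} = j\Omega$, $s_{BP} = j\omega$, directly with the quadratic formula; this is a plain closed-form solution used for illustration, not the geffe algorithm [20], and the function name is hypothetical.

```python
import cmath


def lp_to_bp_poles(s_lp, bw_r, omega0=1.0):
    """Return the two band-pass poles generated by one low-pass prototype pole.

    Solves s_bp^2 - bw_r*omega0*s_lp*s_bp + omega0^2 = 0 (normalized frequencies).
    """
    b = bw_r * omega0 * s_lp
    root = cmath.sqrt(b * b - 4.0 * omega0 ** 2)
    return (b + root) / 2.0, (b - root) / 2.0


if __name__ == "__main__":
    bw_r = 0.3   # relative bandwidth f_bw / f_0 = 900 / 3000
    lp_poles = [  # upper-half-plane and real poles of the lp lsm prototype (table 1)
        complex(-0.1179475625, 0.9751626241),
        complex(-0.3342221750, 0.7735798237),
        complex(-0.4935853895, 0.4252967357),
        complex(-0.5510897460, 0.0),
    ]
    for p in lp_poles:
        for q in lp_to_bp_poles(p, bw_r):
            print(f"lp pole {p:.10f} -> bp pole {q:.11f}")
```

each complex low-pass pole yields one pole of two different band-pass pairs (the conjugate low-pass pole supplies the conjugates), while the real low-pass pole yields the complex pair listed as no. 1/2 in table 1.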
this process is usually referred to as quantization. only the fixed point, two’s complement representation of the biquad coefficients is considered. fig. 3 shows the transfer function’s pole locations in the z-plane for various binary word lengths used to represent the coefficient values. note that in all cases the poles are confined within the unit circle, which confirms our claim that parallel realization will mitigate stability problems in iir realizations.

fig. 3 z-plane pole location of the bp lsm filter, a) unit circle, b) zoomed poles location.

the following notation was used for the quantized version of the filter coefficients: q[n f]. n stands for the number of bits of the whole digital word and f for the number of bits allocated for the digits after the decimal point. accordingly, q[n f] will populate the range (rng) in increments (inc) as follows:

$$m = \lceil \log_2 |c_{\max}| \rceil + 1; \quad f = n - m, \qquad (8a)$$

$$rng = \left[ -\frac{2^{n-1}}{2^{f}}, \ \frac{2^{n-1} - 1}{2^{f}} \right]; \quad inc = \frac{1}{2^{f}}, \qquad (8b)$$

where $c_{\max}$ is the coefficient with maximal absolute value, m is the number of bits allocated for the integer part plus the sign, and f is the number of bits allocated for the fractional part. the symbol $\lceil\ \rceil$ denotes the ceiling function. two operations are performed over the coefficients: first, scaling is done with the help of the results of (8a) and the appropriate number of bits is determined for the integer and fractional parts; second, the coefficients are quantized, i.e. mapped to the appropriate values in the range given by (8b) using the round to nearest method. decimal and hexadecimal representations of the coefficients quantized with q[16 14] are given in the second and third row of table 2, respectively. observing table 2, one can see that the coefficient with maximal absolute value is the d1 coefficient of the first section, therefore m = 2, f = 14 are required for the 16-bit representation.
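the quantization step can be sketched in a few lines. the sketch below assumes that (8a) reads m = ⌈log2|c_max|⌉ + 1 and that round to nearest is implemented as floor(x·2^f + 0.5); the function names are hypothetical and the two sample values are the full-precision c0 and d1 of cell i in table 2.

```python
import math


def integer_bits(c_max):
    """m from (8a), as read here: bits for the integer part plus the sign."""
    return math.ceil(math.log2(abs(c_max))) + 1


def quantize(value, n=16, f=14):
    """Round-to-nearest Q[n f] quantization.

    Returns (signed integer code, quantized decimal value, two's complement hex word).
    """
    code = math.floor(value * (1 << f) + 0.5)
    lowest, highest = -(1 << (n - 1)), (1 << (n - 1)) - 1
    if not lowest <= code <= highest:
        raise OverflowError("value outside the Q[n f] range")
    word = code & ((1 << n) - 1)                  # two's complement bit pattern
    hex_digits = (n + 3) // 4
    return code, code / (1 << f), format(word, "0{}x".format(hex_digits))


if __name__ == "__main__":
    d1_cell_i = -1.886085860514888        # largest coefficient in magnitude
    c0_cell_i = 0.0033086494281706663
    print("m =", integer_bits(d1_cell_i))
    print("rng = [", -(1 << 15) / (1 << 14), ",", ((1 << 15) - 1) / (1 << 14), "]")
    print(quantize(c0_cell_i))
    print(quantize(d1_cell_i))
```

for these two samples the sketch returns the hexadecimal words 0036 and 874a and the decimal values listed for q[16 14] in table 2.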
for m = 2 and f = 14, the range rng = [−2, 1.99993896484375] is covered in inc = 0.00006103515625 increments.

assuming the absence of any other source of computational error or noise, we calculated the attenuation characteristic of the filter for different quantization formats of the coefficients as discussed above. the results for the example bp lsm filter are depicted in fig. 4. observing fig. 4, one can conclude that variants with 16-bit word length and higher produce amplitude characteristics that start to agree with the one obtained with full precision. therefore, the 16-bit representation can be used if attenuation larger than 50 db is not required (observing the lower stop-band in fig. 5). of course, one can use q[24 22] or q[32 30] if a more accurate design is required.

table 2 original and quantized filter coefficients

numerator, full precision
cell   c0                        c1                        c2
i      +0.0033086494281706663    -0.0018617614854014842    -0.0051704109135721505
ii     +0.0067062285319410561    +0.0085906290469084413    +0.0018844005149673854
iii    -0.044039453119340377     -0.0120564809558326       +0.031982972163507775
iv     +0.06277875243767074      0                         -0.062778752453249653
v      -0.029334088252293972     +0.015384970101239978     +0.044719058353533951
vi     -0.006864819539353288     -0.013389525398100878     -0.00652470585874759
vii    +0.0074447307119548771    +0.0032630379244648162    -0.0041816927874900617

numerator, q[16 14], decimal
cell   c0                  c1                  c2
i      +0.00329589843750   -0.00189208984375   -0.00518798828125
ii     +0.00671386718750   +0.00860595703125   +0.00189208984375
iii    -0.04406738281250   -0.01208496093750   +0.03198242187500
iv     +0.06280517578125   +0.00000000000000   -0.06280517578125
v      -0.02935791015625   +0.01538085937500   +0.04473876953125
vi     -0.00683593750000   -0.01336669921875   -0.00653076171875
vii    +0.00744628906250   +0.00323486328125   -0.00421142578125

numerator, q[16 14], hexadecimal
cell   c0     c1     c2
i      0036   ffe1   ffab
ii     006e   008d   001f
iii    fd2e   ff3a   020c
iv     0405   0000   fbfb
v      fe1f   00fc   02dd
vi     ff90   ff25   ff95
vii    007a   0035   ffbb

denominator, full precision
cell   d1                     d2
i      -1.886085860514888     +0.98894784642499967
ii     -1.8601293281281488    +0.96799935587139696
iii    -1.8323004517946606    +0.95057717438073264
iv     -1.8083338880121447    +0.94157013740084117
v      -1.7935719318070145    +0.94450186279953363
vi     -1.7923157567661698    +0.96044412883386077
vii    -1.8052459566148433    +0.98552821289667847

denominator, q[16 14], decimal
cell   d1                  d2
i      -1.88610839843750   +0.98895263671875
ii     -1.86010742187500   +0.96801757812500
iii    -1.83227539062500   +0.95056152343750
iv     -1.80834960937500   +0.94158935546875
v      -1.79357910156250   +0.94451904296875
vi     -1.79229736328125   +0.96044921875000
vii    -1.80523681640625   +0.98553466796875

denominator, q[16 14], hexadecimal
cell   d1     d2
i      874a   3f4b
ii     88f4   3df4
iii    8abc   3cd6
iv     8c44   3c43
v      8d36   3c73
vi     8d4b   3d78
vii    8c77   3f13

after the 16-bit representation is adopted, we perform an additional verification, but now in the time domain. fig. 5 depicts the results of the time domain simulation using coefficients quantized with q[16 14]. both signals and the appropriate spectra are presented. the spectra shown in fig. 5b and 5d are obtained with an nfft = 65536 point fft. the input test signal is

$$s_{in} = \sin(2\pi f_0 t) + \sin(2\pi f_1 t) + \sin(2\pi f_2 t) + \sin(2\pi f_3 t), \qquad (9)$$

with: f0=3 khz, f1=374.60 hz, f2=749.97 hz, and f3=5999.76 hz. the pass-band is limited by [fl, fu] = [2583.56, 3483.56] hz. the values of the test frequencies are picked to match integer multiples of the fft resolution bin (fs/nfft) in order to minimize spectral leakage in the resulting fft image of the spectrum.

fig. 4 attenuation of the 14th order lsm band-pass filter: a) pass-band, b) stop-band

observing the spectra in fig. 5b and 5d, one can see that after filtering there is only one dominant bin, at frequency f0, while the others are filtered out.

fig. 5 time domain simulation of the bp filter mathematical model: a) input waveform, b) input spectrum, c) output waveform and d) output spectrum

for hardware implementation a versatile vhdl code was written. it combines the second and/or first order cells presented in fig. 1b. illustrative schematics of the described second order tdf ii and top-level filter cells are shown in fig. 6a and 6b, respectively. the appropriate number/position of bits at each signal path is labeled as well.

fig. 6 schematic representation of a) second order tdf ii and b) top level filter cell

each delay block ($z^{-1}$) is realized as a register. parallel multipliers and ripple carry adders (with add and subtract functions) are designed for the multiplication and summing operations, respectively. according to fig. 6a it can be concluded that a second order cell requires five multipliers and four adders. on the other hand, assuming zero values for the $c_{2,i}$ and $d_{2,i}$ coefficients, a first order cell is derived from the second order one. therefore, a first order cell will require three multipliers and only two adders.

to ensure successful synthesis, the whole filter is described structurally. each individual block, starting from the basic ones, i.e. multipliers, adders and registers, up to the top level entity, is described. therefore, no predesigned structures are assumed, making the code as portable as possible. tdf_ii_i represents the i-th first or second order tdf ii cell. din and dout are the input and output digital words. the index bounds and constants in fig. 6b are defined as follows:

$$i = 0, \ldots, \lfloor r/2 \rfloor + \mathrm{mod}(r, 2) + p - 1, \qquad (10a)$$

$$j = \begin{cases} \max(i), & \text{for } k = 0 \\ \max(i) + 1, & \text{for } k \neq 0, \end{cases} \qquad (10b)$$

where i is the index of the filter’s section and j is the number of adders used to sum the outputs of the sections. the order of the resulting transfer function is marked with r.
it should be emphasized that the order of the resulting band-pass/stop transfer function is doubled compared to the low-pass prototype function. the symbols ⌊ ⌋, max(x) and mod(x,y) denote the floor function, the maximal value, and the modulo operator (remainder after division of x by y), respectively. parameter p is the flag that detects the existence (1 - exist, 0 - do not exist) of two real poles/residues in the resulting band-pass/stop transfer function. if this is the case, two first order sections are generated. finally, k represents the direct term of the partial fraction expansion of the resulting transfer function. this term is always zero if the order of the numerator polynomial, m, is less than the order of the denominator polynomial, n. otherwise, this term is of the order m - n. in filter transfer functions the only such case is m = n, which gives k as a simple constant factor. therefore, a branch with the k factor is nothing but a simple buffer stage. possible values for the parameters k and p are given in table 3.

table 3 possible values for p and k parameters

               even order    odd order
filter type    p    k        p    k
high-pass      0    1        0    1
band-pass      0    0        1    0
band-stop      0    1        1    1

accordingly, the vhdl entity accepts the generics and has the interface ports shown in table 4. vhdl code sample is given below.

next, the vhdl description was verified by logic simulation with the excitation described in the previous section. it is important to mention that when dealing with hardware implementation two important effects must be examined, namely: saturation and round-off noise. saturation is strongly dependent on the input signal waveform, the filter’s architecture and the coefficient values. moreover, internal states usually saturate at different speeds, making the tracking of the saturation process a non-trivial task.
table 4 generics and ports of vhdl entity

           symbol   description
generics   n        word length
           f        number of bits after decimal point
           r        order of resulting transfer function
           p        flag for detecting two real poles/residues
           k        direct term of partial fraction expansion
           cfs      coefficients of the filter
ports      clk      clock signal at sample rate frequency
           rst      reset signal active at negative level
           x        input n bit signal
           y        output n bit signal

round-off noise is the direct consequence of a fixed-point representation. simply, the product of two n-bit fixed-point numbers is a 2n-bit number. this product must eventually be quantized to n bits by rounding or truncation, which results in round-off noise. a number of techniques can be used to mitigate this problem [21], [22].

the most commonly used technique to prevent saturation and round-off noise is the dynamic range scaling (or simply scaling) of the input signal prior to the filtering action. namely, each input signal value should be scaled down into a specific range which, ideally, ensures no saturation in any of the internal and external nodes of the filter. luckily, since two’s complement representation is exploited, saturation of the internal nodes can be allowed since it will be interpreted as an overflow (wrap-around) effect. e.g. an overflow occurs when the sum of two positive numbers yields a negative result and vice versa; otherwise the result is correct. a similar occurrence happens when the internal state values reach the boundaries of the dynamic range, i.e. the first value larger/smaller than the maximal/minimal one is interpreted as the minimal/maximal value of the range. this can be tolerated as long as the final values of the output signal are valid, i.e. wrap-around does not coincide with the moment of the output signal acquisition. this is where the parallel realization, adopted in this work, again outperforms the cascade one.
saturation conditions are drastically relaxed when using parallel realization, especially as the order of the filter increases. the occurrence of wrap-around is reduced as well. this is simply because, no matter how large the filter order is, all second and/or first order cells process the input signal independently of each other. therefore, scaling of the input signal applies to all cells at once. the bottleneck is, of course, the output summing node. nevertheless, net sensitivity to saturation is reduced when the constraints regarding saturation are relaxed at each individual cell. one may also choose to omit scaling and rely completely on two’s complement representation, but this technique requires sound knowledge about the input signal waveform and an algorithm for tracking and handling the wrap-around effect. also, all possible cases have to be predicted, therefore extensive simulations are required. this is usually too expensive in real-world applications, therefore some form of scaling is always applied. moreover, the scaling technique is quite easy to implement in the digital domain, knowing that each scaling down/up by two is nothing but one simple shift right/left operation. accordingly, the scaling operation can be implemented as tunable (programmable). even non-linear scaling can be implemented if high accuracy is required. unfortunately, there is no scaling technique which provides a closed-form, general solution, and in the end it all depends on the concrete application. this implies that some exploration of the time domain simulation results is inevitable in the design process to determine the appropriate scaling factor. finally, combining several techniques to cope with saturation and round-off noise may result in a more efficient solution, but scaling with a fixed coefficient, because of its simplicity, is still considered suitable for a variety of applications and is therefore utilized in this design as well.
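the two fixed-point effects discussed above, wrap-around in two’s complement arithmetic and scaling by powers of two through shifts, can be illustrated with a minimal sketch. the helper names and the sample values are assumptions made for this example only; they are not taken from the vhdl model.

```python
def wrap(value, n=16):
    """Interpret an integer as an n-bit two's complement word (wrap-around on overflow)."""
    half = 1 << (n - 1)
    return (value + half) % (1 << n) - half


def scale_down(value, shifts):
    """Divide by 2**shifts with an arithmetic shift right (rounds toward minus infinity)."""
    return value >> shifts


if __name__ == "__main__":
    a, b, c = 30000, 10000, -20000
    partial = wrap(a + b)            # intermediate sum overflows the 16-bit range
    total = wrap(partial + c)        # the final sum fits, so it is recovered correctly
    print("partial sum:", partial)
    print("final sum:  ", total, "(exact:", a + b + c, ")")
    print("scaled by four:", scale_down(total, 2))
```

the intermediate overflow is harmless here because the final result lies inside the 16-bit range, which is the situation described above as tolerable; scaling down by four corresponds to two shift right operations.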
since our test signal is known in advance, a fixed scaling coefficient is to be determined. finding the maximum and minimum of the function given in (9), one can determine the range of the input signal, i.e. rngin = [−3.73, 3.73]. using time domain simulation, a scaling factor of four turned out to be suitable. this can also be intuitively concluded when looking at the range of the filter’s coefficients. namely, dividing rngin by four gives rngnew = [−0.93, 0.93], which is smaller than half of the filter’s coefficient range rng = [−2, 1.99993896484375], leaving enough headroom for the values of the internal states to spread without reaching saturation. after filtering, the output is scaled up and the obtained results are presented in fig. 7.

fig. 7 time domain simulation of the bp filter vhdl model: a) output waveform, b) output spectrum

a sound representation of a signal’s spectrum using fft usually requires a large number of samples. this inevitably leads to a longer time domain simulation. to minimize the duration of the time domain logic simulation, a smaller number of fft points is desired than in the case of the purely mathematical model, which simulates faster (fig. 5b, 5d). therefore, nfft = 16384 is chosen for representing the output signal spectrum in fig. 7b. it turns out that this number gives a satisfactory compromise between simulation time and fft accuracy. finally, comparing fig. 7a with fig. 5c and fig. 7b with fig. 5d, one can see that the time and frequency domain representations of the output signal obtained by simulating the mathematical and hardware models of the filter match. this shows that the hardware representation successfully implements the desired behavior of the designed filter.

6. conclusion

with this case study we intended to fulfill two main goals. first, we wanted to raise the awareness of the salient advantages of the cmac filtering functions as compared with their non-monotonic counterparts.
to achieve this, we gave a short overview of the properties of cmac amplitude characteristics. the second goal was to give, for the first time, design results characterizing the amplitude characteristic of band-pass iir digital filters. accordingly, we went through several steps. first, we gave arguments on the choice of iir filters. then, we gave arguments for the parallel implementation of digital filters that was used throughout the design process. next, we described and exemplified the complete design procedure, including the verification steps needed to support the design decisions taken on the way. all that was performed on the example of a band-pass cmac iir digital filter, a solution that was here reported for the very first time.

acknowledgement: this research was funded by the ministry of education, science and technological development of republic of serbia under contract no. tr32004.

references

[1] d. topisirović, v. litovski, and m. andrejević stošović, “unified theory and state-variable implementation of critical-monotonic all-pole filters,” international journal of circuit theory and applications, vol. 43, no. 4, pp. 502–515, 2015. [2] jf. kaiser, digital filters, in: ff. kuo, jf. kaiser (eds.) system analysis by digital computer. wiley: new york, 1996, chapter 7, p. 245. [3] d. mirković, m. andrejević stošović, p. petković and v. litovski, “iir digital filters with critical monotonic pass-band amplitude characteristic,” aeu international journal of electronics and communications, vol. 69, no. 10, pp. 1495-1505, oct. 2015, issn 1434-8411. [4] ds. humpherys, the analysis, design, and synthesis of electrical filters, prentice-hall, 1970. [5] k. geher, theory of network tolerances, akademiai kiadó: budapest, hungary, 1971. [6] v.
litovski, “synthesis of monotonic passband sharp cutoff filters with constant group delay response,” circuits and systems, ieee transactions on, vol. 26, no. 8, pp. 597–602, aug 1979. [7] rabey j. low power design essentials. springer science + business media, llc: new york, 2009. [8] mf. quélhas, a. petraglia, mr. petraglia, “efficient group delay equalization of discrete-time iir filters,” in proceedings of the xii european signal processing conference, eusipco-2004, vol. 1, vienna, austria, 2004, pp. 125-128. [9] rabrenović d, jovanović v. low-pass filters with critical monotonic magnitude. publications of faculty of electrical engineering, eta series: belgrade, 1973, pp. 59–68. [10] b. d. rakovich, “designing monotonic low-pass filters – comparison of some methods and criteria,” international journal of circuit theory and applications, vol. 2, no. 3, pp. 215–221, 1974. [11] s. butterworth, “on the theory of filter amplifiers,” experimental wireless and the wireless engineer, vol. 7, 1930, pp. 536-541. [12] a. papoulis, “optimum filters with monotonic response,” in proceedings of the ire, vol. 46, no. 3, pp. 606–609, 1958. [13] a. papoulis, “on monotonic response filters,” proceedings of the ire, vol. 47, pp. 332–333, 1959. [14] m. fukada, “optimum filters of even orders with monotonic response,” circuit theory, ire transactions on, vol. 6, no. 3, pp. 277–281, 1959. [15] p. halpern, “optimum monotonic low-pass filters,” circuit theory, ieee transactions on, vol. 16, no. 2, pp. 240–242, may 1969. [16] b. rakovich and v. litovski, “least-squares monotonic lowpass filters with sharp cutoff,” electronics letters, vol. 9, no. 4, pp. 75–76, february 1973. [17] d. mirković, p. petković, and v. litovski, “a second order s-to-z transform and its implementation to iir filter design,” compel the international journal for computation and mathematics in electrical and electronic engineering, vol. 33, no. 5, pp. 1831–1843, 2014. [18] w. park, k.-s. park, and h.-m. 
koh, “active control of large structures using a bilinear pole-shifting transform with h∞ control method,” engineering structures, vol. 30, no. 11, pp. 3336–3344, 2008. [19] h. orchard and g. c. temes, “filter design using transformed variables,” circuit theory, ieee transactions on, vol. 15, no. 4, pp. 385–408, 1968. [20] p. geffe, “designers guide to active bandpass filters,” part iii’, edn, vol. 19, no. 7, 1974. [21] k. k. parhi, scaling and round-off noise in: vlsi digital signal processing systems: design and implementation. john wiley & sons, 2007, chapter 11. [22] k. prasad and p. sathyanarayana, “signal scaling in cascade digital filters,” circuits, systems and signal processing, vol. 8, no. 4, pp. 421–426, 1989.

facta universitatis series: electronics and energetics vol. 34, no 2, june 2021, pp. 157-172 https://doi.org/10.2298/fuee2102157a © 2021 by university of niš, serbia | creative commons license: cc by-nc-nd

review paper

triboelectric nanogenerators (teng): factors affecting its efficiency and applications

deepak anand, ashish singh sambyal, rakesh vaid
department of electronics, university of jammu, jammu-180006, india

abstract. the demand for energy is increasing rapidly with the modernization of technology and requires new sources of renewable energy. triboelectric nanogenerators (teng) harvest ambient energy and convert it into electricity through triboelectrification and electrostatic induction. a teng can convert mechanical energy available in the form of vibrations, rotation, wind, human motion, etc. into electrical energy, thereby offering great scope for large-scale energy scavenging. in this review paper, we discuss the various modes of operation of teng along with the factors contributing to its efficiency and its applications in wearable electronics.
key words: teng (triboelectric nanogenerator), ptfe (poly tetra fluoro ethylene), tet (triboelectric textile), stet (single layer triboelectric textile), pdms (polydimethyl siloxane), pmma (polymethyl methacrylate)

received february 24, 2021. corresponding author: rakesh vaid, department of electronics, university of jammu, jammu 180006, (j&k), india (e-mail: rakeshvaid@ieee.org)

1. introduction

with the increase in energy requirements, non-renewable energy resources are being depleted day by day, causing serious environmental problems. solar and wind energy are the targeted renewable sources for providing power at the gigawatt scale. high power density, high efficiency and low cost are the main requirements for harvesting these energy sources. for the welfare of society, it is necessary to find new, highly efficient energy technologies that can easily harvest the energy available in the environment [1-4]. such power sources should be easily available, sustainable, maintenance-free and pollution-free. most present-day electronic devices use batteries as external power sources, which have a short lifetime. until now, electromagnetic induction and the piezoelectric and electrostatic effects have been the main mechanisms behind the energy harvesting techniques developed during the last few decades [5-11]. more recently, a new technology for harvesting environmental energy, known as the triboelectric nanogenerator (teng), has been invented; it converts ambient mechanical energy into electrical energy [12-16]. teng works on the principle of triboelectrification in conjunction with electrostatic induction. the concept of teng was demonstrated by wang et al. in 2012, and since then it has attracted the attention of the energy industry as a way to meet large-scale energy demand.
various device structures based on the triboelectric effect and electrostatic induction have been reported, utilizing mechanical energy from vibrations [17-20], human motion [21-22], rotation [23-24], wind [25-26], and walking [28]. in this review paper, we give an overview of the progress in teng-based devices. we also discuss the various modes of operation and energy harvesting sources, along with the parameters affecting efficiency and applications.

2. fundamental modes of teng

the triboelectric effect is the generation of charge between two materials with different electron affinities when they are brought into contact with each other and then separated. separating the materials produces a potential on their surfaces. electrostatic induction, on the other hand, is the flow of electrons from one electrode to the other through an external load so as to balance this potential difference. in teng, both the triboelectric effect and electrostatic induction are used to convert mechanical energy into electrical energy. figure 1 shows the fundamental modes of teng: the vertical contact-separation mode [37-40], the sliding mode [41-42], the single-electrode mode [43-46] and the freestanding triboelectric-layer mode [47-52].

fig. 1 fundamental modes of teng: a) the vertical contact-separation mode, b) the sliding mode, c) the single-electrode mode, d) the freestanding mode

2.1. vertical contact-separation mode

the process of energy conversion by triboelectrification was first demonstrated by zhu et al. in january 2012 [13]. the operation of teng can be explained on the basis of coupling
between electrostatic induction and contact electrification. figure 2(a-b) illustrates the generation of electricity in the contact-separation mode. the materials used for the vertical contact-separation mode include pmma (poly methyl methacrylate) and kapton. both open-circuit voltage and short-circuit current have been demonstrated in this mode of teng. in the open-circuit condition, when no force is applied between the two materials, no electric potential difference is produced, as shown in figure 2(a). when an external force is applied, charge is transferred from one surface to the other as soon as the two materials come into contact. because of the triboelectric effect, electrons are transferred from pmma to the kapton surface, making pmma the positive electrode and kapton the negative electrode (see figure 2(a)). when the two materials are then separated by releasing the force, a potential difference is created between the two electrodes. the open-circuit voltage ($V_{oc}$) so produced can be expressed as:

$$V_{oc} = \frac{\sigma d}{\epsilon_0} \qquad (1)$$

where $\sigma$ is the triboelectric charge density, $\epsilon_0$ is the permittivity of free space and $d$ is the distance between the two surfaces. $V_{oc}$ reaches its maximum value when the force is fully released. when the force is applied again, the potential difference decreases and reaches its minimum value when the two materials come into contact with, or close to, each other. this completes the cycle of generating electricity in the vertical contact-separation mode. under the short-circuit condition, electrons flow from the top electrode to the bottom electrode so as to balance the electric potential difference, resulting in an instantaneous current during release. thus, positive charge accumulates on the top electrode and negative charge on the bottom electrode.
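as a rough numerical illustration of equation (1), the following python sketch evaluates the open-circuit voltage for a given surface charge density and gap. the function name and the example values (a charge density of 10 µC/m² and gaps up to 1 mm) are illustrative assumptions, not figures taken from the cited works, and the simple parallel-plate form ignores edge effects and the dielectric layers.

```python
# Permittivity of free space in F/m.
EPSILON_0 = 8.8541878128e-12


def open_circuit_voltage(sigma, d):
    """Evaluate equation (1): V_oc = sigma * d / epsilon_0.

    sigma -- triboelectric surface charge density in C/m^2
    d     -- separation between the two surfaces in m
    Returns the open-circuit voltage in volts.
    """
    if sigma < 0 or d < 0:
        raise ValueError("sigma and d must be non-negative")
    return sigma * d / EPSILON_0


if __name__ == "__main__":
    sigma = 10e-6  # assumed example value: 10 microcoulombs per square metre
    for gap_mm in (0.0, 0.25, 0.5, 1.0):  # assumed example gaps
        v_oc = open_circuit_voltage(sigma, gap_mm * 1e-3)
        print(f"gap = {gap_mm:4.2f} mm -> V_oc = {v_oc:7.1f} V")
```

with these assumed values the voltage grows linearly from zero at contact to roughly 1.1 kv at a 1 mm gap, which matches the statement that the open-circuit voltage peaks when the force is fully released.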
the induced charge density when fully released can be expressed as:

$$\sigma' = \frac{\sigma\, d'\, \epsilon_{rk}\, \epsilon_{rp}}{d_1 \epsilon_{rp} + d' \epsilon_{rk} \epsilon_{rp} + d_2 \epsilon_{rk}} \qquad (2)$$

where $\epsilon_{rp}$ is the relative permittivity of pmma, $\epsilon_{rk}$ is the relative permittivity of kapton, $d_1$ is the thickness of the kapton layer and $d_2$ is the thickness of the pmma layer. when the force is applied again, electrons move from the bottom electrode to the top electrode, reducing the induced charge, so that a negative instantaneous current appears. the induced charge is completely neutralized when the layers come into contact with each other.

2.2. sliding mode

the sliding mode of operation was demonstrated by wang et al. in 2013 [42]; in it, two surfaces slide over one another in the lateral direction. the mechanism of electricity generation is shown in figure 3(i-iv). in this case one layer is ptfe (poly tetra fluoro ethylene) and the other is a nylon plate. in the initial position, when the two plates lie over one another in full contact, no transfer of electrons takes place from nylon to ptfe, so no potential difference is generated between the two electrodes, as shown in figure 3(i). when the positively charged top surface starts sliding outward, a relative displacement in the lateral direction takes place. the ptfe electrode then has a higher potential than the nylon electrode, so electrons move from the ptfe film towards the nylon film through the external load. at full mismatch, as shown in figure 3(ii-iii), the potential difference and the charge transferred reach their maximum values. when the nylon plate is then moved inward, the whole process is reversed: electrons move from the nylon film to the ptfe film through the external load, producing a negative current. when equilibrium is reached, no further charge transfer takes place and the two plates have returned to their original position.
the sliding mode has been observed to have several advantages over the vertical contact-separation mode, such as higher energy conversion efficiency and higher output power.

fig. 2 (a-b) process of generation of electricity using the contact-separation mode of teng

fig. 3 (i-iv) the basic mechanism of generation of electricity

2.3. single-electrode mode

figure 4(a) shows the single-electrode mode of operation [45]. it consists of a pdms layer with micro-pyramids on its surface, which provide friction; the other contact surface is human skin. the pdms layer is deposited on an ito-coated pet substrate. as the distance between the two surfaces changes, charge is transferred between the ito and ground, and hence electrons flow.

fig. 4 (a) schematic illustration of the single-electrode mode teng [45], (b) the electricity generation cycle

figure 4(b) shows the mechanism of electricity generation in the single-electrode mode. when a finger is brought to the pdms surface, a negative charge appears on that surface, because pdms is more triboelectrically negative than human skin and electrons are therefore transferred from the skin to the pdms. this negative charge is preserved on the pdms surface owing to its insulating nature. when the finger is separated from the pdms surface, a potential difference is generated between the ito and the reference electrode. this causes free electrons to flow from the ito electrode to the ground/reference electrode to restore equilibrium, as shown in figure 4(b). when the finger approaches the pdms again, free electrons move from the reference electrode to the ito, producing a negative current/voltage.
this completes the cycle of single-electrode mode operation.

2.4. freestanding triboelectric layer mode

the freestanding triboelectric layer mode has distinct advantages over the other modes in versatility and applicability, since it can harvest energy from a moving object or from human walking without an attached electrode. this mode also has very high energy conversion efficiency and high robustness. in this mode, the generation of electricity depends on the change in position of the tribo-charged surface between two electrodes, which changes the induced potential difference, as depicted in figure 5(a). the main structure consists of two metal films and a freestanding dielectric layer. when the fep (fluorinated ethylene propylene) layer is aligned with the left aluminum (al) electrode, a negative charge develops on the inner surface of the fep layer and a positive charge on the surface of the left electrode, as shown in figure 5(b).

fig. 5 (a) two electrodes resulting in change of induced potential difference in the freestanding triboelectric layer mode

when the fep layer slides towards the right electrode, the potential difference between the left and right electrodes is reduced, causing current to flow from the left electrode towards the right electrode, as shown in figure 5(b). when the fep layer sits fully on top of the right electrode, no electric potential difference appears and hence no current flows. finally, when the fep layer slides back towards the left electrode, an electric potential difference again appears between the two electrodes, causing a current to flow between them and completing the cycle of generating electricity in the freestanding triboelectric layer mode.

fig. 5 (b) working principle of the freestanding triboelectric layer mode

3. energy harvesting sources using teng

3.1. energy harvesting through waste water flow

the energy of waste water flow can be harvested using a rotatory teng, as shown in figure 6. it uses ptfe (poly tetra fluoro ethylene) and nylon as the triboelectric materials. using the triboelectric effect and electrostatic induction, energy is harvested through the contact and sliding modes of teng operation. the devices demonstrated so far can light up 50 leds connected in series [46]. when water flows through the tube, the fan connected to the shaft starts rotating. as shown in figure 6, different triboelectric materials are placed on eight different poles. as the shaft rotates, the triboelectric materials come into contact with each other, causing a current to flow [46]. energy from water waves can also be harvested, as demonstrated by jiang et al. [47], who designed a spring-based teng to store the potential energy present in water waves. the energy is produced by using a spring to translate the low-frequency wave motion of the water into high-frequency kinetic energy. to achieve higher efficiency, parameters such as spring rigidity and spring length must be taken into account. a water-driven teng based on water electrification has been demonstrated by kim et al. [48]; it is capable of producing energy even under adverse environmental conditions and is rarely affected by humidity and friction.

fig. 6 schematic diagram of a rotatory teng [46]

3.2. energy harvesting from triboelectric textile

human motion is a distinctive source for energy harvesting using tet (triboelectric textile). because of the triboelectric effect, charge is transferred between the skin and the triboelectric textile.
a voltage of ~500 v and a short-circuit current of 600 ma have been obtained using silicon and ni-coated polyester as the triboelectric materials in a single layer triboelectric textile (stet). a voltage of ~540 v and a short-circuit current of 140 ma were obtained from a 5 × 5 cm double-layer triboelectric textile, which is capable of illuminating 100 leds connected in series [49], under stretching, rubbing and pressing of the folded tet. on stretching, the layers of material come into contact with each other, and they regain their original shape when the external force is removed. when silk and si-rubber come into contact on stretching, electricity is generated by the transfer of charge between the two layers, as depicted in figure 7. this type of tet can produce enough electricity to light 54 led bulbs [50].

fig. 7 working principle of tet

3.3. energy harvesting from human walking

energy harvesting from a footfall was analyzed and demonstrated experimentally by te-chien hou and others [51] in 2013. shoe soles were fabricated from triboelectric materials, with elastic sponge used as a spacer. varying the size and thickness of the spacer changed the output. the electrical output generated from human walking is capable of illuminating 30 leds connected in series. it has also been observed that increasing the number of spacers reduces the output voltage because the effective contact area decreases.

3.4. magnetic force and finger tip pressure driven teng

the teng driven by magnetic force and fingertip pressure was designed by taghavi et al. [52], as shown in figure 8.
when pressure is applied to the upper part, the upper pair of materials comes into contact, whereas when the pressure is removed the lower part is pushed upward by the magnetic force, bringing the lower pair of materials into contact. this contact and separation transfers charge between the materials, resulting in a flow of electric current.

fig. 8 mechanism of contact keys driven by finger tips and then by magnetic force [52]

3.5. pendulum and comb shaped electrodes based teng

another triboelectric nanogenerator based on contact electrification and electrostatic induction uses a comb-shaped electrode for harvesting energy. the more arms the comb electrode has, the more energy is produced. a rougher surface also gives a higher output than a flat surface [53]. the operation of this teng is based on the oscillations of a pendulum. applying a force to the pendulum sets up a to-and-fro motion, which produces multiple outputs for a single input. many setups were created based on surface roughness and nanowires, showing maximum efficiency. the efficiency of teng increases with surface roughness because roughness increases the area of contact [54]. as shown in figure 9, one material is placed on top of the pendulum and the other on the frame; once oscillation starts, contact and separation take place between the two materials, unbalancing the charge and producing a flow of electric current [54].

fig. 9 teng consisting of two parts i and ii (i is movable and ii is fixed)

4. effect of various factors on the efficiency of teng

4.1. effect of humidity

charge generation is greatly influenced by humidity as well as temperature. it has been observed that the charge generated between various triboelectric materials increases by up to 20% as the relative humidity decreases, whereas an increase in humidity has an adverse effect on the triboelectric effect and on the efficiency of triboelectric materials [55]. a triboelectric nanogenerator can also be fabricated that works over a wide range of humidity without a change in its electrical output. such a teng consists of water-repellent triboelectric materials and can therefore be used in both low and high humidity conditions [56].

4.2. effect of temperature

temperature also has an impact on the output of a triboelectric nanogenerator, as observed by various researchers. with an increase in temperature, the ductility of the triboelectric material increases while its stiffness decreases; the reverse is observed when the temperature decreases. the graph in figure 10 shows that the output voltage decreases beyond a temperature of 300 k and that the output varies over a wide range of temperature. u+ denotes the average positive peak voltage and u− the average negative peak voltage [57].

fig. 10 variations of peak voltage with temperature [57]

4.3. effect of surface structure patterning

triboelectric materials such as pdms (polydimethyl siloxane) and pmma (polymethyl methacrylate) can be used to fabricate teng with nanopatterns formed on their surface by photolithography. different types of patterns, such as hexagonal, pillar and line, can be printed, and hexagonal patterns have been observed to give the maximum output voltage. triboelectric materials with narrower pillars show a higher output than those with wider pillars [58].
seol et al. [59] demonstrated that pressure on the surface of triboelectric materials causes deformation, which affects the output of teng devices. high applied pressure increases the output because the contact surface increases, which raises the maximum charge density.

5. applications of teng

5.1. teng as a micro-scale power source

the main purpose of developing teng is to provide a power source for small-scale electronic devices and sensors. energy harvesting using its various modes of operation has been demonstrated for body motion [60], vibrations produced by human walking [61], pressing of the hand [62-63], shoe insoles [64-65], and sound waves in air [66] and in water [67]. in the sliding mode of operation, a conversion efficiency of approximately 50% has been observed [68], whereas it is about 24% for rotation-based teng [69]. the output power density has been shown to reach a maximum of 1200 w/m², which is sufficient for powering small devices in wearable electronics. energy harvesting has also been demonstrated from flowing river water [70] and rain drops [71], using contact electrification between a solid surface and a liquid, as applicable in parallel teng [72]. energy can also be harvested from fluctuations of the water surface [73], water waves and water streams [74], without constructing huge dams. it has been predicted that, in the near future, 1 MW of power could be generated from 1 km² of ocean surface by a 3-d network of teng if each unit produces 1 mW on average [75-76]. this would be a big source of blue energy for meeting the world's large-scale energy needs.

5.2. teng as self-powered sensor

triboelectric nanogenerators can also be used as self-powered sensors that sense dynamic mechanical action without any external power source. sensing applications include finger touch [77-79], vibration detection [80], and rotation and chemical sensors [81-82].

6. conclusion

in this review paper, triboelectric nanogenerators (teng) have been studied in terms of their fundamental modes of operation, energy harvesting from various sources, the factors affecting their efficiency, and their real-world applications. the simple working mechanism, compact size, light weight and innovative design make this device applicable to both small- and large-scale power generation. the output of the teng depends on factors such as the effective area of contact, the amount of force/pressure applied, the morphology of the surfaces in contact, temperature and humidity. triboelectric nanogenerators are capable of working over a wide range of temperatures and variable humidity conditions. energy that would otherwise go to waste in the environment can be utilized by such devices. for achieving sustainable and self-powered systems, teng devices will soon be available in various products for wearable electronics, mobile and healthcare monitoring systems, along with many other relevant applications.

acknowledgement: the authors deepak anand and ashish sambyal organized the concept of this review paper and would like to thank prof. rakesh vaid for supervising the project. all the authors read and approved the final manuscript.

references
[1] s. p. beeby, m. j. tudor, and n. m. white, "energy harvesting vibration sources for microsystems applications", meas. sci. technol., vol. 17, no. 12, pp. r175–r195, october 2006.
[2] j. w. matiko, n. j. grabham, s. p. beeby, and m. j.
tudor, "review of the application of energy harvesting in buildings", meas. sci. technol., vol. 25, no. 1, article id 012002, november 2013. [3] e. arroyo and a. badel, "electromagnetic vibration energy harvesting device optimization by synchronous energy extraction", sens. actuators, a, vol. 171, no. 2, pp. 266–273, november 2011. [4] j. chen, d. chen, t. yuan, and x. chen, "a multi-frequency sandwich type electromagnetic vibration energy harvester", appl. phys. lett., vol. 100, no. 21, article id 213509, 2012. [5] j. yang, y. wen, p. li, x. bai, and m. li, "improved piezoelectric multifrequency energy harvesting by magnetic coupling", in proceedings of the 10th ieee sensors conference 2011 (sensors’11), limerick, ireland, 2011, pp. 28–31. [6] j. yang, y. wen, p. li, x. yue, and q. yu, "energy harvesting from ambient vibrations with arbitrary inplane motion directions using a magnetostrictive/piezoelectric laminate composite transducer", j. electron. mater., vol. 43, no. 7, pp. 2559–2565, may 2014. [7] q. yu, j. yang, x. yue, a. yang, j. zhao, n. zhao, y. wen and p. li, "3d, wideband vibro-impacting based piezoelectric energy harvester", aip adv., vol. 5, no. 4, article id 047144, april 2015. [8] p. d. mitcheson, p. miao, b. h. stark, e. m. yeatman, a. s. holmes, and t. c. green, "mems electrostatic micropower generator for low frequency operation", sens. actuators, a, vol. 115, no. 2-3, pp. 523–529, september 2004. [9] l. g. w. tvedt, d. s. nguyen, and e. halvorsen, "nonlinear behavior of an electrostatic energy harvester under wide-and narrowband exitation", j. microelectromech. syst., vol. 19, no. 2, pp. 305–316, may 2010. [10] j. yang, y. wen, p. li, x. yue, q. yu, and x. bai, "a twodimensional broadband vibration energy harvester using magnetoelectric transducer", appl. phys. lett., vol. 103, no. 24, article id 243903, december 2013. [11] j. yang, q. yu, j. zhao, n. zhao, y. wen, p. li and j. 
qiu, "design and optimization of a bi-axial vibration-driven electromagnetic generator", j. appl. phys., vol. 116, no. 11, article id 114506, september 2014. [12] k. y. lee, j. chun, j.-h. lee, k. n. kim, n.-r. kang, j.-y. kim, m. h. kim, k-s. shin, m. k. gupta, j. m. baik, s.-w. kim, "hydrophobic sponge structure-based triboelectric nanogenerator", adv. mater., vol. 26, no. 29, pp. 5037–5042, may 2014. [13] g. zhu, c. pan, w. guo, c.-y. chen, y. zhuo, r. yu and z. l. wang, "triboelectric-generator-driven pulse electrodeposition for micropatterning", nano lett., vol. 12, no. 9, pp. 4960–4965, august 2012. [14] g. zhu, z.-h. lin, q. jing, p. bai, c. pan, y. yang, y. zhou and z. l. wang, "toward large-scale energy harvesting by a nanoparticle-enhanced triboelectric nanogenerator", nano lett., vol. 13, no. 2, pp. 847–853, january 2013. [15] j. yang, j. chen, y. yang, h. zhang, w. yang, p. bai, y. su and z. l. wang, "broadband vibrational energy harvesting based on a triboelectric nanogenerator", adv. energy mater., vol. 4, no. 6, article id 1301322, november 2013. [16] s. kim, m. k. gupta, k. y. lee, a. sohn, t. y. kim, k.-s. shin, d. kim, s. k. kim, k. h. lee, h.-j. shin, d.-w. kim and s.-w. kim, "transparent flexible graphene triboelectric nanogenerators", adv. mater., vol. 26, no. 23, pp. 3918–3925, 2014. [17] w. yang, j. chen, g. zhu, x. wen, p. bai, y. su, y. lin and z. wang, "harvesting vibration energy by a triple-cantilever based triboelectric nanogenerator", nano res., vol. 6, no. 12, pp. 880–886, september 2013. [18] h. zhang, y. yang, y. su, j. chen, k. adams, s. lee, c. hu and z. l. wang, "triboelectric nanogenerator for harvesting vibration energy in full space and as self-powered acceleration sensor", adv. funct. mater., vol. 24, no. 10, pp. 1401–1407, october 2014. [19] b. k. yun, j. w. kim, h. s. kim et al., "base-treated polydimethylsiloxane surfaces as enhanced triboelectric nanogenerators", nano energy, vol.
facta universitatis series: electronics and energetics vol.
28, no 4, december 2015, pp. 507–525 doi: 10.2298/fuee1504507s

horizontal current bipolar transistor (hcbt) – a low-cost, high-performance flexible bicmos technology for rf communication applications

tomislav suligoj1, marko koričić1, josip žilak1, hidenori mochizuki2, so-ichi morita2, katsumi shinomura2, hisaya imai2
1university of zagreb, faculty of electrical engineering and computing, department of electronics, micro and nano-electronics laboratory, croatia
2asahi kasei microdevices co. 5-4960, nobeoka, miyazaki, 882-0031, japan

received march 9, 2015
corresponding author: tomislav suligoj, university of zagreb, faculty of electrical engineering and computing, department of electronics, micro and nano-electronics laboratory, croatia (e-mail: tom@zemris.fer.hr)

abstract. in an overview of horizontal current bipolar transistor (hcbt) technology, state-of-the-art integrated silicon bipolar transistors are described which exhibit ft and fmax of 51 ghz and 61 ghz, respectively, and an ft·bvceo product of 173 ghz·v, placing them among the highest-performance implanted-base silicon bipolar transistors. hcbt is integrated with cmos in a considerably lower-cost fabrication sequence than standard vertical-current bipolar transistors, with only 2 or 3 additional masks and fewer process steps. due to its specific structure, the charge sharing effect can be employed to increase bvceo without sacrificing ft and fmax. moreover, the electric field can be engineered just by manipulating the lithography masks, achieving high-voltage hcbts with breakdowns up to 36 v integrated in the same process flow with high-speed devices, i.e. at zero additional cost. a double-balanced active mixer circuit is designed and fabricated in hcbt technology. the maximum iip3 of 17.7 dbm at a mixer current of 9.2 ma and a conversion gain of -5 db are achieved.

key words: bicmos technology, bipolar transistors, horizontal current bipolar transistor, radio frequency integrated circuits, mixer, high-voltage bipolar transistors.

1. introduction

in the highly competitive wireless communication markets, rf circuits and systems are fabricated in technologies that are very cost-sensitive. in order to minimize the fabrication costs, sub-10 ghz applications can be processed by using high-volume silicon technologies. it has been identified that the optimum solution might

facta universitatis series: electronics and energetics vol. 34, no 4, december 2021, pp. 525–546 https://doi.org/10.2298/fuee2104525g
original scientific paper

autoscalable distributed anti-spam smtp system based on kubernetes

nadja gavrilović, vladimir ćirić
university of niš, faculty of electronic engineering, niš, serbia

received march 25, 2021; received in revised form september 22, 2021
corresponding author: nadja gavrilović, faculty of electronic engineering, computer science department, aleksandra medvedeva 14, 18000 niš, serbia, e-mail: nadja.gavrilovic@elfak.ni.ac.rs
© 2021 by university of niš, serbia | creative commons license: cc by-nc-nd

abstract. due to the increasing amount of spam email traffic, email users are in increasing danger, while email server resources are becoming overloaded. therefore, it is necessary to protect email users, but also to prevent smtp system overload during spam attacks. the aim of this paper is to design and implement an autoscalable distributed anti-spam smtp system based on a proof of work concept. the proposed solution extends the smtp protocol in order to enable the evaluation of the client’s credibility using the proof of work algorithm. in order to prevent resource overload during spam attacks, the anti-spam smtp system is implemented in a distributed environment, as a group of multiple anti-spam smtp server instances. kubernetes architecture is used for system distribution, configured with the possibility of autoscaling the number of anti-spam smtp server instances depending on the system load. the implemented system is evaluated during a distributed spam attempt, simulated by a custom-made traffic generator tool. various performance tests are given: (1) the proposed system’s impact on the client’s behaviour and the overall amount of spam messages, (2) the performance of the undistributed anti-spam smtp server during a spam attack, in terms of resource load analysis, (3) autoscaling demonstration and evaluation of the proposed distributed system’s performance during a spam attack. it is shown that the proposed solution has the possibility of reducing the amount of spam traffic, while processing tens of thousands of simultaneous smtp client requests in a distributed environment.

key words: anti-spam, spam, email, smtp, kubernetes, proof of work

1 introduction

according to recent reports [1], approximately 50% of the overall email traffic consists of spam. spam represents the abuse of electronic systems and the smtp protocol for the purpose of sending mass unwanted messages. spam can also lead to serious attempts at a data breach or identity theft of email users. furthermore, spam traffic wastes valuable network and device resources [2]. many anti-spam solutions and techniques have recently been developed and applied in different environments. in recent years, many of them relate to the problem of email spam [2, 3], spam in social networking [4–6], etc. one of the main approaches in filtering spam messages is based on the analysis of their contents. in the beginning, these solutions were based on a set of user-defined rules. in recent years, many classification techniques based on machine and deep learning have been proposed [7,8]. these solutions can be time-consuming, but also can encounter a false alarm problem.
furthermore, spam attackers find ways to manipulate the contents and structure of emails, so they can avoid content-based spam filters. another type of anti-spam solution is the reputation-based system, which analyses the identity of the message sender. there are many different approaches. for example, [9] verifies the client’s identity by using public key cryptography. instead, this paper proposes a solution which enables the evaluation of an individual client’s credibility using the proof of work (pow) algorithm. numerous papers have studied the implementation of this algorithm in various systems [10–12]. in general, the pow algorithm can be applied to any automated malicious online interaction problem (e.g. spam comments, spam text messages), but it is important to take into account the specific protocol, implementation and backward compatibility. additional communication is not always possible, or can be difficult to implement. to the best of our knowledge, only in [10] did the authors suggest the use of the pow concept as an anti-spam solution. they implemented an anti-spam system on a peer to peer (p2p) network. the system from [10] differs from the system proposed in this paper in terms of how the pow algorithm is implemented. also, the anti-spam solution proposed in this paper is based on client/server communication in a distributed environment, instead of the p2p system.

the goal of this paper is to design and implement a distributed anti-spam smtp system based on a proof of work concept. the proposed system does not address the problem of receiving spam messages on the client, but affects the first and biggest step in the existence of spam traffic: sending a large number of spam messages in a short period of time. in order to enable pow evaluation of the smtp client, an extension of the smtp protocol is designed. a common problem during spam attacks is server resource overload, which can cause denial of service. in order to prevent resource overload of the proposed solution, the anti-spam system is implemented in a distributed environment. kubernetes architecture is used for distribution of multiple anti-spam smtp server instances, which make up the proposed anti-spam smtp system. kubernetes is configured with the possibility of autoscaling the number of smtp server replicas depending on the system load. the implemented system is evaluated during a distributed spam attempt, simulated by a custom-made traffic generator tool. various performance tests are given: (1) the client’s perspective on the proposed system and the impact on the overall amount of spam messages, (2) the undistributed anti-spam server’s performance during the spam attempt in terms of resource load analysis, (3) autoscaling demonstration and the distributed system’s performance evaluation during a spam attack.
it is shown that the proposed solution significantly reduces the amount of spam traffic, while processing tens of thousands of simultaneous smtp client requests in a distributed environment.

the paper is organized as follows. section 2 gives a brief introduction to smtp and the proof of work algorithm. section 3 is devoted to the process of containerization and kubernetes orchestration. section 4 is the main section and presents the design and architecture of the proposed anti-spam smtp system. section 5 is devoted to the system evaluation during a simulated spam attempt, while the concluding remarks are given in section 6.

2 background on smtp and proof of work

the smtp protocol defines rules for sending and reliable transfer of email messages through the network. the devices taking part in the transfer itself are referred to as agents and their communication is defined by the smtp protocol. in the communication between any two agents during email transmission, the device sending the email has the role of smtp client, while the device which receives the email has the role of smtp server. an email transaction begins with the command mail, which the client uses to define the email address of the sender. the second step is the definition of the email address of the recipient using the command rcpt, etc. the responses from the server ensure the synchronization of the activities in the smtp client-server communication and provide the information on the success of the client’s actions.
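how a client can read the success of its actions from a server response is easy to illustrate. the sketch below assumes only the reply-code classes of the smtp specification, where the first digit of the three-digit code signals success (2), an intermediate positive reply (3), a transient failure (4) or a permanent failure (5). the function name `classify_reply` is hypothetical and does not come from the paper.

```python
def classify_reply(line: str) -> str:
    """Classify an SMTP reply line by the first digit of its 3-digit code."""
    code = int(line[:3])  # e.g. "250 sender ok" -> 250
    classes = {
        2: "success",
        3: "intermediate",
        4: "transient failure",
        5: "permanent failure",
    }
    return classes.get(code // 100, "unknown")


if __name__ == "__main__":
    for reply in ("250 sender ok", "354 enter mail", "550 mailbox unavailable"):
        print(reply[:3], classify_reply(reply))
```

a client built this way would continue with the next command only after a reply in the success or intermediate class.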
the basic communication between the smtp client (c) and server (s) during the successful transfer of a message is as follows [13]:

s: 220 smtpserver.example.com simple mail transfer service ready
c: helo smtpclient.example.com
s: 250 smtpclient.example.com, pleased to meet you
c: mail from:
s: 250 sender ok
c: rcpt to:
s: 250 recipient ok
c: data
s: 354 enter mail, end with '.' on a line by itself
c: from: "from example"
c: to: recipient example
c: date: tue, 15 may 2020 16:02:43
c: subject: hello
c: hello world!
c: .
s: 250 queued mail for delivery
c: quit
s: 221 service closing transmission channel

proof of work systems emerged with the aim of verifying a device’s credibility and preventing abuse of a computer’s processing power, for example during denial of service (dos) attacks. the basic idea behind this concept is to request a relatively small amount of processing time from a device which tries to access a protected resource, with the aim of fulfilling the criterion issued by the server. this prevents the underlying characteristic of the aforementioned attacks, namely a large number of access attempts at a resource over a short period of time, without significantly affecting normal use of the system [14].

the principle behind how a pow system works is based on a typical cryptographic scenario, during which the client who is requesting a service or resource attempts to prove its credibility to the server [14]. most often, execution of mathematical or cryptographic functions represents a way of investing sufficient processing time as proof of credibility. the functions are not overly demanding, but are complex enough to ensure, in the case where their multiple execution is required, significant processing time on the client’s side. the task of the client is to repeat a time-intensive calculation involving certain data many times over, until the obtained solution satisfies the requirement issued by the server.
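a minimal hashcash-style sketch of this repeated search is given here, together with the single-hash check a server could run on the result. the text does not specify the puzzle used by the proposed system, so the hash function (sha-256), the requirement (a number of leading zero bits), the nonce encoding and the names `solve` and `verify` are all assumptions made for illustration.

```python
import hashlib
import os


def solve(challenge: bytes, difficulty_bits: int) -> int:
    """Client side: try nonces until the hash has `difficulty_bits` leading zero bits."""
    target = 1 << (256 - difficulty_bits)  # hashes below this value qualify
    nonce = 0
    while True:
        digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce
        nonce += 1


def verify(challenge: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Server side: one hash is enough to check the client's solution."""
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits))


if __name__ == "__main__":
    challenge = os.urandom(16)       # hypothetical server-issued random data
    nonce = solve(challenge, 16)     # about 2**16 hash evaluations on average
    print(nonce, verify(challenge, nonce, 16))
```

each additional difficulty bit doubles the expected client work, while the server cost stays at one hash evaluation.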
after the client sends that solution, the task of the server is to check its validity and identify the client as reliable or unreliable. a very important criterion of pow implementation is that it has to be difficult and time-consuming to calculate the solution on the client’s side, while the solution validation on the server side does not require a large amount of time [11].

3 containerization and kubernetes background

when developing software solutions, one of the common requirements is portability between different platforms and environments. also, the ability to distribute and scale the developed application is an important prerequisite for its efficient and reliable execution. recently, these requirements are commonly being met by introducing containerization into the software development process [15].

3.1 docker containerization

today’s application containerization technologies are based on research and solutions designed years ago in the field of virtualization. virtualization enables isolation of applications, but at the cost of significant impact on host resources. containers were introduced in order to overcome these disadvantages. they provide functionality similar to that provided by virtual machines, while preserving host resources [16]. a container is a standard unit of software that packages up code and all its dependencies so the application runs quickly and reliably from one computing environment to another. all containers running on a single device share its operating system, which frees up significant amounts of device resources in comparison to virtual machines. also, containers are easily portable between different devices and cloud platforms [15,17].
the most commonly used container technology is docker. docker represents a set of software tools used for the development and distribution of software packages, i.e. containers. docker technology is used for packaging the application and its runtime environment into a container, which can then be executed on many platforms [18].

3.2 kubernetes orchestration

when developing software solutions that require a reliable and scalable system that will balance the application load, the use of distributed container systems is becoming more common. in such situations, it is necessary to ensure automatic container communication and management using orchestration tools. since its introduction in 2014, the kubernetes orchestrator has grown to be one of the largest and most popular open source projects in the world. it has become the standard api for building distributed cloud-native applications, especially in big enterprises. kubernetes enables the distributed execution of software applications on a large number of separate nodes that make up the kubernetes cluster. the kubernetes orchestration enables the automation of all aspects of container coordination and management and involves the process of placing, managing, scaling and connecting containers.

the kubernetes cluster consists of master and worker nodes. the master node (also called the control plane) manages the entire kubernetes system, while the worker nodes are responsible for the execution of containerized applications. worker nodes are linux hosts which constantly listen for new tasks, execute assigned ones, and report state and changes to the master node [19, 20]. in order to define how the containerized applications are executed on the kubernetes cluster, kubernetes objects are created at the request of the user.
the main kubernetes object and the basic building block of the kubernetes system is a pod. a pod can be defined as a shared environment for the execution of one or more containers. it can be seen as a wrapper around the container, which is necessary for the container execution on the kubernetes cluster. each pod is executed exclusively on one node [20]. the deployment kubernetes object provides the ability to easily create and manage multiple pods. it provides additional features which pods do not originally have: self-healing, scalability, automatic rolling updates and a rollback mechanism.

in order to have a constant access point to the containerized application that executes as a group of pods, the kubernetes service object can be used. the service is associated with the pods and fronts them with a stable ip, dns, and port. it also load-balances requests across the pods. the client does not have to know the exact location and configuration of the pods, which can be unstable and can change their location and network settings. by using the service object, the pods can be scaled, new pods can be started and previous versions of the pods can be updated and shut down. the service object in front of the pods observes the changes, and maintains a list of active pods ready to accept client connections. if external access to the service is needed, the nodeport service type should be used, because it exposes the service on each node’s ip address and a static port, and makes it possible to contact the service from outside the cluster.
the client does not have to know the exact location and configuration of the pods, which can be unstable, change their location and network settings. by using service object, the pods can be scaled, new pods can be started and previous versions of the pods can be updated and shut down. the service object in front of the pods observes the changes, and maintains a list of active pods ready to accept client connections. if external access to service is needed, the nodeport service type should be used, because it exposes the service on each node’s ip address and a static port, and makes it possible to contact the service from outside the cluster. autoscalabile distributed anti-spam smtp system based on kubernetes 7 4 design of autoscalable distributed anti-spam smtp system the proposed anti-spam smtp server, which uses pow algorithm for smtp client verification, is implemented as a .net core application. the implemented application was containerized using docker technology in order to enable horizontal scaling of lightweight, portable containers, which can run virtually anywhere. the containerized anti-spam smtp server is distributed on the kubernetes cluster in the cloud. also, kubernetes is configured with the possibility of autoscaling the number of required server replicas depending on the system load. the role of the proposed distributed anti-spam smtp system is shown in fig. 1. the basic behaviour of the proposed solution is defined by the smtp protocol. its basic network functionalities, as a device that forwards an email to its destination (fig. 1), have been upgraded with the implementation of a distributed anti-spam system that has the possibility of verifying smtp clients, by asking for proof of their credibility. the proposed anti-spam system has the role of a proxy, which checks smtp clients before forwarding their messages through the network, and marks the messages as spam if needed (fig. 1). 
it uses the pow algorithm to check the credibility of email clients, by issuing a challenge that an smtp client has to solve using its own processing power. marking is done by adding the field x-spam-category: spam to the header of the email message. in this way, the proposed anti-spam smtp system alerts the next smtp server to which it forwards the message about potential spam traffic, while tying up the client’s processing power which could otherwise be used for spamming. the role of the proposed solution is to prevent further passage of spam emails through the network, but also to slow down spam attacks, as will be described below.

a potential problem could occur when the proposed anti-spam system is simultaneously abused by multiple clients, which can deplete its resources through a large number of parallel connections (a problem close to dos attacks). taking that into consideration, the proposed anti-spam system is designed to consist of many anti-spam smtp server replicas (fig. 1, replica 1 to replica n) distributed on the cluster, which can balance the system load if necessary. the proposed system is also configured to autoscale the number of smtp server replicas depending on the system load.

fig. 1: the role of the proposed anti-spam system

4.1 the proposed kubernetes architecture

the proposed solution has been distributed on kubernetes, using docker as the containerization technology. the goal of distributed execution of the proposed anti-spam system is a reliable, fault-tolerant smtp system, which is resistant to spam attacks and has higher total available resources. the proposed kubernetes architecture in which the anti-spam system executes is shown in fig. 2. in order to simplify the architecture shown, only two worker nodes are given.
the figure shows a nodeport service named smtp-service, created to allow external clients to access the proposed anti-spam system. as can be seen, the proposed system is exposed on any of the worker node ip addresses, in combination with the nodeport value of 30001. by default, nodeport values in kubernetes range from 30000 to 32767. in an smtp production environment, the destination port of the incoming traffic could easily be changed on the router/firewall, from the standard tcp port 25 to the nodeport value of 30001. the client’s request is forwarded from the node to the smtp-service object. then, after analyzing the cluster and available running pods, the smtp-service can route the request to one of the smtp-pod objects, taking into account the current load of active pods. requests are forwarded to the configured internal port 5000. the number of currently active pods depends on the autoscaling component, which estimates the current system load and creates additional pods or shuts some of them down if needed. autoscaling enables automatic regulation of the number of anti-spam smtp server instances, depending on the system’s load, i.e. the number and frequency of smtp client requests.

the proposed kubernetes system is configured so that initially there is one anti-spam smtp server instance, which is not replicated until the system’s load indicates the need for a larger number of smtp pods to serve all of the client requests. the moment the system load reaches a configured threshold, the system scales by adding new replicas. the proposed system is autoscaled according to two criteria: cpu and memory usage. in particular, autoscaling is performed when the average pod memory usage exceeds the configured value. the number of replicas is also increased when the average pod cpu utilization reaches 50%.
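the scaling criteria above can be illustrated with a minimal python sketch of the replica calculation documented for the kubernetes horizontal pod autoscaler: for each metric, the proposed replica count is ceil(current replicas × current value / target value), and the largest proposal across all metrics is used. this is a simplified illustration, not the authors’ configuration. the function names, the replica bounds and the 100 mb memory target are assumptions made for the example (only the 50% cpu target is stated above), and the real autoscaler also applies a tolerance band and stabilization rules that are omitted here.

```python
import math


def desired_replicas(current_replicas: int, current_value: float, target_value: float) -> int:
    """Replica count proposed by a single metric: ceil(replicas * value / target)."""
    return math.ceil(current_replicas * current_value / target_value)


def scale(current_replicas: int, metrics: dict, min_replicas: int = 1, max_replicas: int = 10) -> int:
    """Combine several metrics by taking the largest proposal.

    `metrics` maps a metric name to (average value per pod, target value).
    The bounds min_replicas/max_replicas are hypothetical example values.
    """
    proposals = [
        desired_replicas(current_replicas, value, target)
        for value, target in metrics.values()
    ]
    return max(min_replicas, min(max_replicas, max(proposals)))


if __name__ == "__main__":
    # One pod using 120 MB on average against an assumed 100 MB target,
    # with CPU well below the 50% target: memory alone triggers scaling.
    print(scale(1, {"memory_mb": (120.0, 100.0), "cpu_percent": (20.0, 50.0)}))
```

because the largest proposal wins, exceeding either target is enough to add replicas, while the pod count only shrinks when every metric is comfortably below its target.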
when multiple metrics are specified, kubernetes monitors both parameters and adds new replicas when either of them reaches its configured maximum.

fig. 2: the proposed kubernetes architecture

4.2 the role of the proposed system

the proposed solution filters traffic and marks messages as spam if necessary. the smtp client’s credibility check is performed immediately after the proposed system receives the email. the system has to decide whether the client is a potential spammer, and if so, how it will be challenged using the pow algorithm. the operation of the proposed system when processing a client’s email request is presented by the algorithm in fig. 3.

fig. 3: the proposed anti-spam system algorithm, after receiving email request

the initial client evaluation can be performed using a reputation system. if the client is estimated as reliable, smtp communication continues without any changes, and the email message is sent through the network to its destination. an alternative flow of the communication occurs when, during the communication, the client is estimated as a potential spammer. in that case, the client is validated by issuing a challenge that it has to solve. if the client proves its validity, its email message is forwarded further through the network without any marking. however, if the client does not prove its credibility, it is evaluated as a spammer, its email message is marked as spam traffic, and its spam attempt is slowed down. the precise flow of communication between the smtp client and the proposed anti-spam system during the pow algorithm is given in the next subsection.
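the decision flow of fig. 3 can be sketched as follows. this is a minimal python illustration, not the authors’ .net core implementation: the function process_email and the two callbacks is_reliable and passes_challenge are hypothetical placeholders for the reputation system and the pow exchange, respectively. the header field x-spam-category: spam is the marking described in the text.

```python
from email.message import EmailMessage
from typing import Callable


def process_email(
    message: EmailMessage,
    is_reliable: Callable[[EmailMessage], bool],
    passes_challenge: Callable[[EmailMessage], bool],
) -> EmailMessage:
    """Return the message to forward, marked as spam when the client fails the challenge."""
    # Reliable clients follow the unmodified SMTP flow.
    if is_reliable(message):
        return message
    # Potential spammers must solve the PoW challenge first.
    if passes_challenge(message):
        return message
    # Challenge failed: mark the message before forwarding it.
    del message["X-Spam-Category"]  # no-op if the header is absent
    message["X-Spam-Category"] = "spam"
    return message


if __name__ == "__main__":
    msg = EmailMessage()
    msg["From"] = "sender@example.com"
    msg["To"] = "recipient@example.com"
    msg.set_content("hello")
    forwarded = process_email(msg, lambda m: False, lambda m: False)
    print(forwarded["X-Spam-Category"])
```

in every branch the message is still forwarded; only the marking differs, which matches the proxy role of the system.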
4.3 pow algorithm implementation

the use of the pow algorithm makes it possible to ask a potential spammer for proof of credibility by requiring a certain amount of processing time. the implementation of such a system requires certain changes in the implementation of the smtp protocol. at the beginning of the communication, the smtp client provides the server with information on the sender, the recipient and the overall message being sent, as defined in the basic smtp protocol. after it has received all the email data, the server can transfer the email message to its destination, the same way it does in protocol-defined communication, in the case of a valid client.
however, if the reputation system determines, based on the evaluation of the sender, that the client might be unreliable, in this paper we propose that the smtp protocol require proof of credibility prior to the transfer of the message. with that aim in mind, changes were made to the basic smtp communication. the extended smtp protocol is shown in fig. 4.

fig. 4: extended smtp protocol

the changes enable the server to send an integer value, which represents the weight (line 8 in fig. 4). its value determines the criterion which the client has to meet. along with the weight, the server sends the supported hash algorithms (line 8 in fig. 4). the client selects one of the acceptable hash functions and computes it over the entire mail message, including the header and the time stamps. by computing a hash function on the entire email message, the client is required to generate a sequence that has as many zeroes at the beginning as defined by the previously received weight parameter (line 8 in fig. 4). the execution of the same hash function on the same data sequence always results in the same output. that is why the client has to append a nonce value to the data before calculating the hash value of the message. the only way to determine a nonce value that satisfies the requirement of the server is brute force. the hash function is executed repeatedly, with the nonce value changing on each execution, until the generated output meets the server requirement. a single execution of a hash algorithm on the data does not require significant processing time. however, a satisfactory output sequence is sufficiently rare that finding one takes significant cpu time from the client.
once the nonce value that satisfies the issued requirement is found, the client forwards it to the server, along with the hash function which has been used (line 9 in fig. 4). after receiving the nonce value from the client, the server checks whether the obtained value meets the set requirement. the check consists of a single execution of the hash function on the previously received email combined with the received nonce value. it can be noticed that significantly more processing time is required to solve the given problem on the client than to verify the solution on the server. if the server determines that the nonce value meets the previously set requirement, the email is successfully transferred.

a valid client, unlike a spammer, rarely sends a large number of email requests over a short period of time. thus, even if a valid email client has been incorrectly evaluated by the reputation system as potentially dangerous, the processing time needed for sending a small number of emails will not make the use of the email service more difficult. the valid user’s use of the mail server therefore does not change noticeably. on the other hand, a spammer who is trying to send a large number of emails will have to prove his credibility by executing the pow algorithm for each email message, which consumes significant cpu time. it is important to note that the hash function is applied over the entire email (body and header), including the recipient and the time stamp. in this way, due to the time stamp in the header, it is assured that the hash needs to be computed by the spammer every time, even if the same message is sent over and over again. the example given in fig. 4 shows the successful pow validation of the email client.
the server requires cpu time from the client by sending the challenge number with the value of 3, which denotes that the client must generate an output sequence which has precisely 3 zeroes at the beginning (line 8 in fig. 4). in the given example, the value of the nonce parameter is 79745 (line 9 in fig. 4). it denotes that the value 79745, appended to the data which make up the content of the email message, is the piece of data that met the given requirement of the server.

let us note that the proposed smtp extension is backward compatible with the original smtp specification. a client who does not support pow will interpret the message containing the weight as a confirmation of successful sending of the email, due to code 250, and close the connection. in the currently implemented system, such an email will not be sent.
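the challenge and verification steps described in this subsection can be sketched in python. this is an illustration under stated assumptions rather than the implemented protocol: the text does not specify how the nonce is encoded or in which representation the leading zeroes are counted, so the sketch appends the decimal nonce to the message bytes, counts leading zero characters of the hexadecimal digest, and accepts any digest with at least the required number of zeroes. sha-256 and all function names are likewise choices made for the example.

```python
import hashlib
from typing import Tuple


def pow_digest(message: bytes, nonce: int, algorithm: str = "sha256") -> str:
    """Hash the whole message with the nonce appended (assumed encoding: decimal ASCII)."""
    hasher = hashlib.new(algorithm)
    hasher.update(message + str(nonce).encode("ascii"))
    return hasher.hexdigest()


def solve(message: bytes, weight: int, algorithm: str = "sha256") -> Tuple[int, int]:
    """Client side: brute-force the first nonce whose digest has `weight` leading zeroes.

    Returns (nonce, number of hash executions).
    """
    target = "0" * weight
    nonce = 0
    while not pow_digest(message, nonce, algorithm).startswith(target):
        nonce += 1
    return nonce, nonce + 1


def verify(message: bytes, nonce: int, weight: int, algorithm: str = "sha256") -> bool:
    """Server side: a single hash execution is enough to check the client's nonce."""
    return pow_digest(message, nonce, algorithm).startswith("0" * weight)


if __name__ == "__main__":
    email = (
        b"From: sender@example.com\r\nTo: recipient@example.com\r\n"
        b"Date: Mon, 01 Jan 2024 00:00:00 +0000\r\n\r\nhello"
    )
    nonce, attempts = solve(email, weight=2)
    print(nonce, attempts, verify(email, nonce, weight=2))
```

under the hexadecimal assumption, each additional unit of weight multiplies the expected client work by 16, while the server still verifies with one hash execution.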
one of the ideas for future work is to implement marking of this type of client, so that such clients would be allowed to send a certain number of emails without using the pow algorithm, as long as their activity does not indicate the possible presence of spam.

5 implementation results

the proposed system is implemented and evaluated on a cloud which consists of six servers, with a total ram capacity of approximately 400 gb. five servers are ibm system x3550 models, with an intel xeon e5603 cpu @ 1.60 ghz and 8 logical processors. one server is an hp proliant dl380p gen8 model, with an intel xeon e5-2620 v2 cpu @ 2.10 ghz and 12 logical processors. the kubernetes system is implemented on linux ubuntu 16.04 virtual machines. each machine has 2 logical cores and 8 gb of ram. the kubernetes cluster consists of five virtual machines of equal resources, one of which is the master node, while the other four are worker nodes.

the experiment is set up as follows: 1) an evaluation of the client’s effort and the impact on the overall amount of sent spam messages, 2) an evaluation of the spam attack impact on the undistributed anti-spam smtp solution, 3) the system performance during the spam attempt when the smtp solution is distributed on the kubernetes architecture. automatic autoscaling of the proposed solution, depending on the system load, will also be demonstrated. the evaluated client/server communication and the hash function used are as given in fig. 4. in order to simulate a distributed spam attack and maximize the system load, a custom traffic generator tool with pow support was made.

5.1 the client’s perspective

the evaluation of the client’s work for different weight values is discussed in [21]. the results obtained in [21] are given in table 1.

table 1: an evaluation of the client work for various weights

weight   avg. time   standard deviation   avg. # of hash functions   # of completed tests
1        1.14 ms     0.41 ms              41                         20
2        28.7 ms     25.09 ms             1728                       20
3        4.35 s      2.88 s               223620                     20
4        2.71 min    1.81 min             10996000                   20

furthermore, this paper gives the evaluation of the client’s behaviour and the pow impact on the amount of spam traffic. it is determined that with the pow algorithm, the number of emails sent from the client per second does not depend on the number of requests sent from the client per second. it depends only on the value of the weight given by the server and the client’s processing power, as in:

$$n = \frac{h_p}{h(t)} \tag{1}$$

where n is the number of outbound messages from the client per second, h_p is the hash power of the client, t is the weight parameter, and h(t) is the average number of required executions of the hash algorithm for the weight t. the reason is that with the increase in the number of simultaneously initiated client smtp requests, the number of executed hash functions per connection per second decreases due to the increase in cpu load. a direct consequence of reducing the number of hash function executions per connection is an increase in the duration of an individual client/server connection. we can conclude that a client with the intention of abusing the email server is limited to a constant number of sent emails per second, which slows down his spam attacks [21]. in addition, in this paper we design and implement an anti-spam smtp system based on the discussed proof of work concept, which is distributed using the kubernetes architecture, with the possibility of autoscaling the number of smtp server instances depending on the system load.
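equation (1) can be evaluated with the measurements in table 1. the python sketch below is a worked example only: the client’s hash power is not stated in the text, so it is estimated here from the weight 3 row (average number of hash executions divided by average time), and the names used are illustrative.

```python
# weight -> (average time in seconds, average number of hash executions), from Table 1
TABLE_1 = {
    1: (1.14e-3, 41),
    2: (28.7e-3, 1728),
    3: (4.35, 223620),
    4: (2.71 * 60, 10996000),
}


def estimated_hash_power(weight: int) -> float:
    """Assumed estimate of h_p in hashes per second, derived from one row of Table 1."""
    avg_time, avg_hashes = TABLE_1[weight]
    return avg_hashes / avg_time


def emails_per_second(hash_power: float, avg_hashes: float) -> float:
    """Equation (1): n = h_p / h(t)."""
    return hash_power / avg_hashes


if __name__ == "__main__":
    h_p = estimated_hash_power(3)
    for weight, (_, avg_hashes) in TABLE_1.items():
        print(weight, emails_per_second(h_p, avg_hashes))
```

with the weight 3 estimate of roughly 51,400 hashes per second, equation (1) gives about 0.23 emails per second for weight 3 and about one email every 214 seconds for weight 4. the measured average for weight 4 in table 1 is 2.71 min, so such estimates are only approximate.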
5.2 the performance evaluation of the undistributed anti-spam smtp server

in order to evaluate the system under increased load, the aforementioned distributed spam attack is simulated using 10 virtual machines.
a custom generator of a large number of smtp requests for sending email, which supports the pow algorithm, is implemented. in order to load the system with as many parallel connections as possible, the spam generator tries to send 10,000 emails in a row. fig. 5 shows the client processor load before and during the attack. it can be noticed that from the moment when the client starts sending smtp requests, its cpu usage is 100%.

fig. 5: individual client cpu load before and during the spam attack

fig. 6 shows the amount of network traffic generated by the individual client who performs the spam attack. at the beginning of the attack, the client initiates a large number of connections at the same time, creating a thread for each email. it can be seen that at the beginning of the attack (t0), the client generates traffic of approximately 5mbps. within each connection, the proposed pow system requests cpu time from the client, which results in the client cpu load (fig. 5). relatively quickly after the start of the attack, the client opens the maximum number of connections that its resources can support (because each connection implies one thread that significantly loads the cpu by sequentially executing hash functions). when the client’s resources become significantly loaded, it is no longer able to send the initial amount of data, which can be seen in the graph after the moment t0. nevertheless, the client continues to send a relatively small amount of data to the server, as it solves the challenges one by one (fig. 6, between moments t0 and t1).

fig. 6: individual client network flow before and during the spam attack

fig. 7 shows the dependence of the proposed anti-spam smtp server cpu usage on the number of clients simultaneously performing the spam attack. it can be noticed that the cpu of the server is not significantly loaded at any time of the attack, which leads to the conclusion that the implementation of the pow system on the smtp server does not significantly affect the cpu usage, even during a distributed spam attack.

fig. 7: undistributed smtp server cpu usage depending on the number of clients simultaneously performing spam attack

unlike the cpu load, a spam attack has a significant impact on the undistributed anti-spam smtp server’s memory load. a potential problem for the proposed anti-spam smtp solution could occur when it is attacked by multiple clients, which can deplete its memory resources with a large number of parallel open connections (a problem close to dos attacks). an example of one such attack is given in fig. 8, which shows server memory usage as the number of requests from clients attacking a single instance of the system increases over time. it can be seen that, after approximately 60,000 emails were sent, the server memory usage reached the upper limit. the memory usage shown refers to the load of the complete physical host executing the smtp server. it can be seen that the anti-spam smtp solution significantly loads the host’s resources.
at that point, the displayed server load may result in a significant slowdown in request processing and an inability to respond to new client requests, whose connections are rejected.

fig. 8: undistributed smtp server memory usage during the spam attack

the previous analysis of the smtp server memory load during a spam attack indicates a potential risk of congestion of the proposed server’s resources when a large number of parallel connections is opened. in order to address this problem, the server has been distributed and autoscaling on the kubernetes architecture has been implemented. the next section provides the analysis of the system memory usage during a spam attack, in the case where the anti-spam smtp server is distributed on the kubernetes architecture.

5.3 autoscaling

in the configuration of the kubernetes system, initially there is one instance of the anti-spam smtp server, which is not replicated until the system load indicates the need for a larger number of instances to serve all of the client requests. the previously presented performance measurements of the undistributed anti-spam smtp server led to the conclusion that the system should be scaled primarily according to memory usage, because the smtp server cpu usage during a spam attack is not significant. the memory load in fig. 8 is measured on the entire machine executing the undistributed smtp server (including the os and additional processes). let us note that, since kubernetes autoscaling is performed depending on the resource load of individual pods, and since the execution of pods, as a consequence of containerization, does not significantly affect the physical hosts’ load, only the pod’s memory load is observed.
table 2 presents the performance analysis of the implemented system during the spam attack and demonstrates the autoscaling of the number of anti-spam smtp server instances. the spam attack was simulated using the 10 previously described virtual machines, each of which tries to send 10,000 email requests. the number of smtp clients participating in the attack was increased gradually. in order to analyze the impact of each added attacker, one new smtp client was included in the attack every 120 seconds, as can be seen from the first two columns of table 2. the third column gives the total memory usage of the pods during the attack. at time t0, just before the start of the attack, the total pod background memory usage is 12.9 mb. as the number of clients increases, the total memory usage increases approximately linearly. the fourth column gives the number of pods (anti-spam smtp server instances) that the kubernetes system automatically starts and maintains, depending on the average pod memory usage given in the fifth column. the scaling of the number of pods managed by the kubernetes system is performed in the manner described previously. the upper limit of the average memory usage of the pods running the anti-spam system is set to 100 mb, in order to demonstrate the autoscaling process in the given environment. based on table 2, it can be concluded that the system behaves according to the defined autoscaling configuration. every time the average memory usage per pod exceeds 100 mb, the kubernetes system creates another pod (fourth and fifth columns of table 2), protecting the anti-spam system from resource overload. at any time, the pods currently running the anti-spam smtp system share the overall workload. in this way, the average memory
usage per pod is regulated to a value below 100 mb. the data in the table are illustrated in fig. 9. this configuration of the system autoscaling prevents congestion during a spam attack, because the implementation allows the number of anti-spam smtp instances to be raised automatically during the attack, which enables processing all of the client requests without overloading resources.

table 2: autoscaling anti-spam smtp system instances during spam attack

time (min) | # of clients / # of email requests | total memory usage of the pods (mb) | # of pods | average pod memory usage (mb)
t0         | 0 / 0       | 12.9  | 1 | 12.9
t0 + 2     | 1 / 10000   | 36.6  | 1 | 36.6
t0 + 4     | 2 / 20000   | 52.03 | 1 | 52.03
t0 + 6     | 3 / 30000   | 78.2  | 1 | 78.2
t0 + 8     | 4 / 40000   | 111.9 | 1 | 111.9
t0 + 10    | 5 / 50000   | 142.4 | 2 | 71.2
t0 + 12    | 6 / 60000   | 170.8 | 2 | 85.4
t0 + 14    | 7 / 70000   | 195.4 | 2 | 97.7
t0 + 16    | 8 / 80000   | 218.4 | 2 | 109.2
t0 + 18    | 9 / 90000   | 237.9 | 3 | 79.3
t0 + 20    | 10 / 100000 | 259.5 | 3 | 86.5
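the scaling behaviour reported in table 2 can be replayed with a short python sketch. it assumes a deliberately simplified rule taken from the description in this section: one pod is added whenever the average pod memory usage exceeds the 100 mb limit, and the new pod is visible at the next measurement. the function name `replay_autoscaling` is hypothetical, and the sketch is not the actual kubernetes autoscaler algorithm.

```python
# total pod memory usage (mb) from table 2, one sample every 2 minutes
TOTAL_MEMORY_MB = [12.9, 36.6, 52.03, 78.2, 111.9, 142.4,
                   170.8, 195.4, 218.4, 237.9, 259.5]
LIMIT_MB = 100.0  # upper limit of the average pod memory usage


def replay_autoscaling(totals, limit, pods=1):
    """Return a (pods, average usage) pair per sample under the simplified rule."""
    history = []
    for total in totals:
        average = total / pods
        history.append((pods, round(average, 2)))
        if average > limit:
            pods += 1  # assumption: one extra pod, active at the next sample
    return history


history = replay_autoscaling(TOTAL_MEMORY_MB, LIMIT_MB)
for minute, (pods, average) in zip(range(0, 22, 2), history):
    print(f"t0 + {minute:2d} min: {pods} pod(s), average {average} mb")
```

under this assumption the replay gives the same pod counts and per-pod averages as the fourth and fifth columns of table 2.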
the proposed system provides a reliable, efficient and scalable anti-spam solution. system containerization enables its flexibility and portability, while execution in a distributed kubernetes environment provides easy scalability, reliability and fault tolerance, but also greater memory and processing power. autoscaling of the system maximizes the use of available resources by maintaining the optimal number of anti-spam smtp replicas, proportional to the current load. additionally, not only will the operation of the distributed anti-spam smtp system not be compromised by spam attacks, but the proposed anti-spam system will also slow down spam attacks, which will result in a significant reduction in the amount of spam traffic on the network. in future work it would be useful to test how such a system affects legitimate mass mailing systems, which send a large number of emails that do not have spam content.

fig. 9: average pod memory usage depending on the number of clients simultaneously performing spam attack

6 conclusion

in this paper the design and implementation of a distributed anti-spam smtp system is proposed. the usage of the pow algorithm for verifying an email client’s credibility is analysed, and an extension of the smtp protocol, which enables requiring a certain amount of work from a sender, is proposed. a distributed kubernetes architecture was used in order to prevent resource overload of the implemented system. the great benefit of the proposed system is the possibility of autoscaling the number of smtp server replicas depending on the server load, which maximizes the use of available resources and makes the system resistant to spam attacks. the implemented system was evaluated during a distributed spam attempt, simulated by the custom-made traffic generator tool.
various performance tests have been given: (1) the proposed system’s impact on the client’s behaviour and the overall amount of spam traffic, (2) the undistributed anti-spam smtp server’s performance during the spam attempt, followed by a discussion of the reason for system distribution. furthermore, (3) an autoscaling demonstration and a performance evaluation of the distributed environment during a spam attack were given.
we showed that the proposed solution can significantly reduce the amount of spam traffic, while processing tens of thousands of simultaneous smtp client requests in a distributed environment and adjusting the proposed system’s robustness.

facta universitatis series: electronics and energetics vol. 35, no 4, december 2022, pp. 495-512
https://doi.org/10.2298/fuee2204495p
© 2022 by university of niš, serbia | creative commons license: cc by-nc-n
original scientific paper

fuzzy-based real-coded genetic algorithm for optimizing non-convex environmental economic loss dispatch

shradha singh parihar1, nitin malik2
1gautam buddha university, greater noida, india
2the northcap university, gurugram, india

abstract. a non-convex environmental economic loss dispatch (nceeld) is a constrained multi-objective optimization problem that has been solved for assigning generation cost to all the generators of the power network subject to equality and inequality constraints. the objectives considered for simultaneous optimization are emission, economic load and network loss dispatch. the valve-point loading, prohibiting operating zones and ramp rate limit issues have also been taken into consideration in the generator fuel cost. the tri-objective problem is transformed into a single objective function via the price penalty factor. the nceeld problem is simultaneously optimized using a fuzzy-based real-coded genetic algorithm (ga). the proposed technique determines the best solution from a pareto optimal solution set based on the highest rank. the efficacy of the proposed method has been demonstrated on the ieee 30-bus network with three and six generating units. the attained results are compared to existing results and found superior in terms of finding the best-compromise solution over other existing methods such as ga, particle swarm optimization, flower pollination algorithm, biogeography-based optimization and differential evolution. a statistical analysis has also been carried out for the convex multi-objective problem.
key words: multi-objective optimization, non-convex environmental economic loss dispatch, price penalty factor, pareto optimality, real-coded genetic algorithm, valve-point loading, prohibiting operating zones, ramp rate limit

received march 2, 2022; revised june 22, 2022; accepted july 6, 2022
corresponding author: nitin malik, the northcap university, sector 23a, gurugram, india (e-mail: nitinmalik77@gmail.com)

list of abbreviations:
ceed: combined emission and economic dispatch
ed: emission dispatch
eld: economic load dispatch
fpa: flower pollination algorithm
frcga: fuzzy-based real-coded genetic algorithm
ga: genetic algorithm
n/w: network
nceeld: non-convex environmental economic loss dispatch
nsga: non-dominated sorting genetic algorithm
pozs: prohibiting operating zones
ppf: price penalty factor
pso: particle swarm optimization
rcga: real-coded genetic algorithm
rrl: ramp rate limit
vpl: valve point loading

1. introduction

1.1. motivation

electrical power networks were traditionally operated to minimize the total generation fuel cost, with little concern for the harmful emissions generated in the network [1-3]. after the us clean air act of 1990 (amended in 2010) and similar legislation in several other countries, public concern about pollutants such as cox, so2 and nox produced by thermal power plants has grown. this, in turn, forces the utilities to deliver power to the consumers while simultaneously minimizing the total generator fuel cost and the total emission level [4-22]. the cost curve of a modern generator is highly non-linear and complex because of the valve point loading (vpl) effect and other effects; the resulting approximate solutions lead to a large revenue loss over time, which is also affected by the network losses.
to overcome this, the optimal power outputs of the thermal units are determined by minimizing emission, loss and cost simultaneously while satisfying all practical constraints, which yields a large-scale, highly constrained, non-linear multi-objective optimization problem.

1.2. literature survey

the economic load dispatch (eld) [1-3] is a real-world problem that originally considered only the minimization of the generator fuel cost. emission dispatch (ed) was considered for the first time in [4]. hence, both the generator fuel cost and the harmful environmental emissions should be treated as competing objectives. the combined emission and economic dispatch (ceed) minimizes harmful emissions and generating unit cost simultaneously to obtain the optimal generation for each network (n/w) unit while satisfying various practical constraints. in [5-9], the authors presented weighted-sum or price penalty factor (ppf) based methods in which all the considered objectives are combined into a single function. a conventional genetic algorithm (ga) and differential evolution have been presented in [10] and [11], respectively, to demonstrate the effect of vpl on the generator cost function, but ga requires a large cpu time for the optimization. a fast initialization approach has been presented in [12] to solve the non-convex economic dispatch problem, but it usually gets stuck in local minima. a new whale optimization approach with high computational efficiency has been presented in [13]. a flower pollination algorithm (fpa) is demonstrated in [14] for solving the eld and ceed problems in larger n/w. many evolutionary algorithms such as the non-dominated sorting genetic algorithm (nsga) [15], squirrel search algorithm [16], evolutionary programming [17] and nsga-ii [18] have been proposed for solving the bi-objective problem.
evolutionary programming has a slow convergence rate for large problems. a mine-blast algorithm has been developed in [19] to incorporate the valve point loading effect when solving the environmental economic load dispatch problem. a new global particle swarm optimization (pso) is developed in [20] to solve the bi-objective problem without and with transmission losses. a fuzzified pso technique [21], harmony search [22] and cuckoo search [23] are applied to optimize the solution of the ceed problem. the pso approach deals with the problem of partial optimism.

1.3. paper contributions

a) as most of the research has been carried out considering only two objectives (fuel cost and emissions), the authors have incorporated an additional objective (network loss) to make the problem formulation more comprehensive, and have merged two soft-computing techniques (rcga and fuzzy) to find the best-compromise solution among the obtained pareto solutions. moreover, the exhaustive literature review shows that the non-convex multi-objective problem formulation with simultaneous minimization of three objective functions (emission, fuel cost and network loss) at different load demands has not been explored before.
b) different non-linearities such as valve-point loading, prohibiting operating zones (pozs) and ramp rate limit (rrl) are considered in this article for three conflicting objectives.
c) as all the considered objectives are competitive, the method generates multiple non-dominated pareto optimal solutions rather than a single best solution, from which the best-compromise solution is selected based on the highest fuzzy membership function value.
d) to validate the proposed methodology, three test cases have been considered at different load demands, and the results are compared with already published methods based on ga [25], pso [25, 26], fpa [27], biogeography-based optimization [28] and differential evolution [29].

2. mathematical modeling

the practical non-convex eeld problem has three conflicting objectives, which aim to minimize the generating cost, the amount of harmful emissions and the losses of the complex and non-linear network. the objectives and operating constraints used to formulate the non-convex eeld problem are given below.

2.1. non-convex economic load dispatch

for fossil fuel-based generators it is more practical to include the steam valve-point loading effect of the turbine by adding a rectified sinusoidal term to the quadratic cost equation, which leads to a non-smooth and non-convex function with multiple minima [10]. the total generator fuel cost based on the active power output can be represented as [14]

$$\text{minimize } f_1 = F_T = \sum_{i=1}^{N} \left[ a_i P_i^2 + b_i P_i + c_i + \left| e_i \sin\left( f_i \left( P_i^{\min} - P_i \right) \right) \right| \right] \tag{1}$$

where $P_i$ represents the output power generation of the i-th unit, and $a_i$, $b_i$, $c_i$, $e_i$ and $f_i$ are the generator fuel cost coefficients.

2.2. emission dispatch (ed)

the goal of ed is to minimize the total environmental degradation due to fossil fuel burning to produce power. the total pollution level of the environment that needs to be minimized is given as [14]:

$$\text{minimize } f_2 = E_T = \sum_{i=1}^{N} \left[ 10^{-2} \left( \alpha_i + \beta_i P_i + \gamma_i P_i^2 \right) + \xi_i \exp\left( \lambda_i P_i \right) \right] \tag{2}$$

where $\alpha_i$, $\beta_i$, $\gamma_i$, $\xi_i$ and $\lambda_i$ represent the pollution coefficients of the i-th generating unit.

2.3. loss dispatch

the loss dispatch aims to minimize the power loss without considering the generator cost and the harmful emissions of the network. the loss to be minimized is [14]

$$\text{minimize } f_3 = P_L = \sum_{i=1}^{N} \sum_{j=1}^{N} P_i B_{ij} P_j + \sum_{i=1}^{N} B_{io} P_i + B_{oo} \tag{3}$$

where $B_{ij}$, $B_{io}$ and $B_{oo}$ represent the line loss coefficients.

2.4. non-convex environmental economic loss dispatch (nceeld)

the nceeld problem is formulated with the economy, the harmful emissions and the losses of the network as competing objectives. the proposed complex problem can be written as

$$\text{minimize } C = f_1 + p_{fe} \, f_2 + p_{fl} \, f_3 \tag{4}$$

where $p_{fe}$ and $p_{fl}$ are the ppf for emission and loss, respectively.
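as a minimal sketch, the three objectives (1)-(3) and the combined objective (4) can be evaluated for a candidate dispatch as follows. the `Unit` container, the function names and all coefficient values are hypothetical and chosen only for illustration; they are not the test data of this paper.

```python
import math
from dataclasses import dataclass


@dataclass
class Unit:
    a: float      # fuel cost coefficient of P^2
    b: float      # fuel cost coefficient of P
    c: float      # constant fuel cost term
    e: float      # valve-point amplitude
    f: float      # valve-point frequency
    alpha: float  # emission coefficients of (2)
    beta: float
    gamma: float
    xi: float
    lam: float
    p_min: float  # minimum output of the unit


def fuel_cost(units, p):
    """Total fuel cost f1 of (1), including the valve-point term."""
    return sum(u.a * pi ** 2 + u.b * pi + u.c
               + abs(u.e * math.sin(u.f * (u.p_min - pi)))
               for u, pi in zip(units, p))


def emission(units, p):
    """Total emission f2 of (2)."""
    return sum(1e-2 * (u.alpha + u.beta * pi + u.gamma * pi ** 2)
               + u.xi * math.exp(u.lam * pi)
               for u, pi in zip(units, p))


def network_loss(p, b, b0, b00):
    """Network loss f3 of (3) from the line loss coefficients."""
    n = len(p)
    quadratic = sum(p[i] * b[i][j] * p[j] for i in range(n) for j in range(n))
    linear = sum(b0[i] * p[i] for i in range(n))
    return quadratic + linear + b00


def combined_objective(units, p, b, b0, b00, pfe, pfl):
    """Single objective C of (4), built with the two price penalty factors."""
    return (fuel_cost(units, p) + pfe * emission(units, p)
            + pfl * network_loss(p, b, b0, b00))


# hypothetical two-unit example (illustrative values only)
units = [Unit(0.01, 2.0, 10.0, 5.0, 0.05, 4.0, -0.05, 0.006, 2e-4, 0.02, 10.0),
         Unit(0.02, 1.5, 20.0, 4.0, 0.06, 3.0, -0.04, 0.005, 5e-4, 0.03, 10.0)]
p = [60.0, 40.0]
b = [[1e-4, 2e-5], [2e-5, 1.5e-4]]
print(combined_objective(units, p, b, [0.0, 0.0], 0.0, pfe=40.0, pfl=100.0))
```

keeping the three objectives as separate functions makes it possible to optimize each one independently as well as through the weighted form (4).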
in (4), $f_1$ represents the total generator fuel cost, $f_2$ the total emission and $f_3$ the total n/w loss. the ratio of the max value of $f_1$ to the max value of $f_2$ gives the ppf for emission, whereas the ratio of the max value of $f_1$ to the max value of $f_3$ of the corresponding generator gives the ppf for loss. the procedure for finding the ppf for emission and loss is as follows:

(a) the generator fuel cost ($/hr) is calculated at its maximum output using (1) for the convex and non-convex problems.
(b) the emission release of every generator (lb/hr or kg/hr) is calculated at its maximum output using (2).
(c) the losses of each generator are calculated at its maximum output using (3).
(d) $p_{fe}[i]$ and $p_{fl}[i]$ ($i = 1, 2, \ldots, n$) for each generator are determined as in (5) and (6):

$$p_{fe}[i] = \frac{\sum_{i=1}^{N} \left[ a_i \left(P_i^{\max}\right)^2 + b_i P_i^{\max} + c_i + \left| e_i \sin\left\{ f_i \left( P_i^{\min} - P_i^{\max} \right) \right\} \right| \right]}{\sum_{i=1}^{N} \left[ 10^{-2} \left( \alpha_i + \beta_i P_i^{\max} + \gamma_i \left(P_i^{\max}\right)^2 \right) + \xi_i \exp\left( \lambda_i P_i^{\max} \right) \right]} \quad (\$/\text{lb}) \tag{5}$$

$$p_{fl}[i] = \frac{\sum_{i=1}^{N} \left[ a_i \left(P_i^{\max}\right)^2 + b_i P_i^{\max} + c_i + \left| e_i \sin\left\{ f_i \left( P_i^{\min} - P_i^{\max} \right) \right\} \right| \right]}{\sum_{i=1}^{N} \sum_{j=1}^{N} P_i^{\max} B_{ij} P_j^{\max} + \sum_{i=1}^{N} B_{io} P_i^{\max} + B_{oo}} \quad (\$/\text{pu}) \tag{6}$$

where $P_i^{\max}$ is the maximum capacity of the unit.
(e) $p_{fe}[i]$ and $p_{fl}[i]$ ($i = 1, 2, \ldots, n$) are sorted in ascending order.
(f) $P_i^{\max}$ is added, starting from the generator unit with the smallest $p_{fe}[i]$ for harmful emissions and from the generator unit with the smallest $p_{fl}[i]$ for the loss, until $\sum P_i^{\max} \ge P_D$.
(g) the $p_{fe}[i]$ and $p_{fl}[i]$ linked with the last generator unit are the ppf for emission and loss, respectively, for a given load $P_D$.
(h) the $p_{fe}[i]$ and $p_{fl}[i]$ for the particular load are determined.

eq. (4) is optimized subject to constraints in the case of the tri-objective minimization problem. for the convex eed problem, the $p_{fe}$ selected is 43.55981 $/kg and 44.07915 $/kg [27] for the three-generator network at 400 mw and 500 mw, respectively.
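steps (a)-(h) can be sketched as below. the sketch assumes the per-generator reading of steps (a)-(d), that is, each unit's fuel cost at maximum output divided by its own emission (or loss) at maximum output; the function name and all numerical values are hypothetical.

```python
def price_penalty_factor(cost_at_max, objective_at_max, p_max, demand):
    """Select the price penalty factor for a given load demand.

    cost_at_max[i]      : fuel cost of unit i at its maximum output
    objective_at_max[i] : emission (or loss) of unit i at its maximum output
    p_max[i]            : maximum output of unit i
    """
    # step (d): one ratio per generating unit (assumed per-unit reading)
    ratios = [c / o for c, o in zip(cost_at_max, objective_at_max)]
    # step (e): unit indices in ascending order of their ratio
    order = sorted(range(len(ratios)), key=lambda i: ratios[i])
    # steps (f)-(g): add capacities until the demand is covered and
    # return the ratio of the last unit that was added
    covered = 0.0
    for i in order:
        covered += p_max[i]
        if covered >= demand:
            return ratios[i]
    raise ValueError("total capacity is below the load demand")


# hypothetical three-unit data (illustrative values only)
cost_at_max = [1000.0, 1500.0, 1200.0]   # $/hr
emission_at_max = [50.0, 50.0, 30.0]     # kg/hr
p_max = [100.0, 150.0, 120.0]            # mw

for demand in (50.0, 200.0, 300.0):
    pfe = price_penalty_factor(cost_at_max, emission_at_max, p_max, demand)
    print(f"demand {demand} mw -> ppf {pfe} $/kg")
```

because the ratios are sorted before the capacities are accumulated, a larger load demand can only select an equal or larger penalty factor.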
for the non-convex problem considering the standard ieee 30-bus network, the $p_{fe}$ and $p_{fl}$ calculated for a load $P_D$ of 2.834 p.u. are 5932.9377 $/lb and 10445.0680 $/p.u., and for a load $P_D$ = 4.32 p.u. they are 10949.4251 $/lb and 19612.6323 $/p.u., respectively, using the method given in reference [8]. the optimization process is subjected to the following constraints:

a) the active power output of a generating unit is constrained by its bounds for a stable operation and is given as:

$$P_i^{\min} \le P_i \le P_i^{\max}, \quad i = 1, 2, \ldots, N \tag{7}$$

b) the total generated power balances the sum of the active power loss ($P_L$) and the total load demand ($P_D$). therefore,

$$\sum_{i=1}^{N} P_i - \left( P_D + P_L \right) = 0 \tag{8}$$

where $P_L$ is expressed through the b-coefficients. the error in the loss coefficients is considered to be constant as in ref [14].

c) generator ramp rate limits: the inclusion of ramp rate limits changes the operating limits of the generator as [24]

$$\max\left( P_i^{\min}, P_i^{o} - DR_i \right) \le P_i \le \min\left( P_i^{\max}, P_i^{o} + UR_i \right) \tag{9}$$

where $P_i^{o}$ is the previous operating point of the i-th generator, and $DR_i$ and $UR_i$ are the down and up ramp rate limits, respectively.

d) prohibited operating zones: if a power plant works in these zones, faults might occur in the machines or in accessories such as pumps or boilers. therefore, to prevent these faults, the power generation limits must be changed so that they satisfy the poz constraint. this feature can be included in the non-convex multi-objective problem formulation as [24]

$$\begin{cases} P_i^{\min} \le P_i \le P_{i,1}^{l} \\ P_{i,k-1}^{u} \le P_i \le P_{i,k}^{l} \\ P_{i,z_i}^{u} \le P_i \le P_i^{\max} \end{cases} \tag{10}$$

here $z_i$ is the number of prohibited zones in the i-th generator curve, k is the index of the prohibited zone of the i-th generator, $P_{i,k}^{l}$ is the lower limit of the k-th prohibited zone, and $P_{i,k-1}^{u}$ is the upper limit of the (k-1)-th prohibited zone of the i-th generator.

3. solution methodology

the paper implements frcga on three- and six-generator networks to identify the best-compromise solution amongst the available set of pareto optimal solutions.
the techniques used in the algorithm are as follows:

3.1. pareto optimality

pareto optimality is defined as the degree of efficacy in multi-objective and multi-criteria solutions. it represents a condition where economic resources and their output have been assigned in such a manner that no objective can be made better without losing the well-being of another. there is no way to improve one part of a pareto optimal solution set without making another part worse. a state u dominates a state v if u is superior to v in at least one objective function and not worse with regard to the other objective functions. a decision vector ‘u’ will dominate another vector ‘v’ (as m < n) if

$$f_j(u) \le f_j(v) \quad \forall \, j = 1, 2, 3, \ldots, i \tag{11}$$

and

$$f_j(u) < f_j(v) \quad \text{for at least one } j \tag{12}$$

where j indexes the objectives and i is the total number of objectives considered for simultaneous optimization. a reduction in the fuel cost of the generators increases the environmental emissions and vice versa. as the considered objectives are conflicting in nature, a set of non-dominated (pareto-optimal) solutions is obtained instead of a single optimal solution; hence, the pareto-optimal solution has been considered.

3.2. real-coded genetic algorithm

in a real-coded genetic algorithm (rcga) for optimization, the output of each generator in the system is represented as a floating-point number rather than a binary number, resulting in a high-precision solution [30]. for discontinuous, non-differentiable and discrete objective functions the algorithm has proved to be effective and superior to the binary-coded genetic algorithm. the outputs of all the generating units form a solution string known as a chromosome. the initial population is randomly generated in a given search space. the rcga loop comprises pre-processing, three genetic operations and post-processing. it performs a global optimization to identify the best solution to the formulated problem and iterates until the convergence criterion is met.
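a real-coded chromosome is simply a list of generator outputs. the following sketch draws such a chromosome at random. it assumes that the effective limits of each unit are those of (9) and that any value falling inside a prohibited zone of (10) is redrawn; the helper names and the sample data are hypothetical, and the authors' matlab implementation may differ.

```python
import random


def effective_limits(p_min, p_max, p_prev, down_ramp, up_ramp):
    """Operating limits of one unit after applying the ramp rate limits (9)."""
    return max(p_min, p_prev - down_ramp), min(p_max, p_prev + up_ramp)


def in_prohibited_zone(p, zones):
    """True if p lies strictly inside any (lower, upper) prohibited zone."""
    return any(lower < p < upper for lower, upper in zones)


def random_chromosome(units, rng, max_tries=1000):
    """Draw one floating-point chromosome: one output per generating unit."""
    chromosome = []
    for unit in units:
        low, high = effective_limits(unit["p_min"], unit["p_max"],
                                     unit["p_prev"], unit["dr"], unit["ur"])
        for _ in range(max_tries):
            p = rng.uniform(low, high)
            if not in_prohibited_zone(p, unit["zones"]):
                chromosome.append(p)
                break
        else:
            raise ValueError("no feasible output found for a unit")
    return chromosome


# hypothetical two-unit data (illustrative values only, in mw)
units = [
    {"p_min": 50.0, "p_max": 200.0, "p_prev": 120.0,
     "dr": 40.0, "ur": 30.0, "zones": [(90.0, 110.0)]},
    {"p_min": 20.0, "p_max": 80.0, "p_prev": 50.0,
     "dr": 15.0, "ur": 15.0, "zones": []},
]
rng = random.Random(0)  # fixed seed so that the sketch is repeatable
population = [random_chromosome(units, rng) for _ in range(5)]
print(population[0])
```

redrawing is only one way to handle the prohibited zones; moving a violating value to the nearest zone boundary would be an equally plausible choice.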
the fitness value of each individual is estimated by optimizing the nceeld problem of (4) for a given load while satisfying the limits shown in (7) and (8):

$$\min C = \left( f_1 + \alpha \left[ \sum_{i=1}^{N} P_i - \left( P_D + P_L \right) \right]^2 \right) + p_{fe} \left( f_2 + \alpha \left[ \sum_{i=1}^{N} P_i - \left( P_D + P_L \right) \right]^2 \right) + p_{fl} \left( f_3 + \alpha \left[ \sum_{i=1}^{N} P_i - \left( P_D + P_L \right) \right]^2 \right) \tag{13}$$

where α represents the penalty parameter that is applied if the n/w load demand is not satisfied. this guarantees that a feasible solution gets a higher fitness than an infeasible solution.

3.3. fuzzy approach based on min-max proposition

optimizing three conflicting objectives (fuel cost, emission and n/w loss) simultaneously is a tedious task, as there is no single criterion to finalize the merit of the available non-dominated solutions. due to the conflicting nature of the objectives, it is hard to find the best solution. every objective is assigned a degree of satisfaction based on the membership functions provided by the fuzzy method. the membership functions represent the degree of membership in fuzzy sets in the range [0,1]. $\mu(f_i)$ is a monotonically decreasing function given as [9]:

$$\mu(f_i) = \begin{cases} 1, & f_i \le f_i^{\min} \\ \dfrac{f_i^{\max} - f_i}{f_i^{\max} - f_i^{\min}}, & f_i^{\min} < f_i < f_i^{\max} \\ 0, & f_i \ge f_i^{\max} \end{cases} \tag{14}$$

where $f_i^{\min}$ represents the expected minimum value and $f_i^{\max}$ represents the expected maximum value of objective function i. the membership function value signifies how much a solution satisfies $f_i$ on a scale of 0 to 1. the fuzzy min-max proposition to nominate the best solution amongst many solutions can be given as [9]

$$\mu_{\text{best solution}} = \max\left\{ \min\left[ \mu(F_j) \right]^{k} \right\} \tag{15}$$

where k is the number of pareto-optimal solutions. each objective is expected to attain a higher satisfaction for each solution. the best-compromise solution is identified based on the highest rank among the k solutions.
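the selection rule of (11)-(12), (14) and (15) can be sketched in a few lines: dominated solutions are discarded, each remaining objective value is mapped to a membership value, and the solution whose smallest membership is largest is returned. the objective vectors used here are made-up numbers, the function names are hypothetical, and taking the minimum and maximum of each objective from the candidate set itself is an assumption of the sketch.

```python
def dominates(u, v):
    """(11)-(12): u is no worse in every objective and better in at least one."""
    return (all(a <= b for a, b in zip(u, v))
            and any(a < b for a, b in zip(u, v)))


def pareto_front(solutions):
    """Keep only the solutions that no other solution dominates."""
    return [s for s in solutions
            if not any(dominates(t, s) for t in solutions)]


def membership(value, f_min, f_max):
    """(14): 1 at or below f_min, 0 at or above f_max, linear in between."""
    if value <= f_min:
        return 1.0
    if value >= f_max:
        return 0.0
    return (f_max - value) / (f_max - f_min)


def best_compromise(solutions):
    """(15): the non-dominated solution with the largest minimum membership."""
    front = pareto_front(solutions)
    n_obj = len(front[0])
    f_min = [min(s[j] for s in front) for j in range(n_obj)]
    f_max = [max(s[j] for s in front) for j in range(n_obj)]

    def satisfaction(s):
        return min(membership(s[j], f_min[j], f_max[j]) for j in range(n_obj))

    return max(front, key=satisfaction)


# hypothetical (fuel cost, emission, loss) vectors, illustrative values only
candidates = [(100.0, 10.0, 5.0), (110.0, 8.0, 4.5),
              (120.0, 6.0, 4.0), (130.0, 9.0, 6.0)]
print(pareto_front(candidates))
print(best_compromise(candidates))
```

in this example the last candidate is dominated by the second one and is removed before the membership values are computed.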
the pseudo-code to solve the nceeld problem is shown below:

step i: initialise the cost coefficients, generator limits, load demand and the min-max values of each objective.
step ii: create a random population that defines the outputs of the generators within the specified limits.
step iii: evaluate the fitness of the constrained tri-objective problem of the network with prohibiting operating zones and ramp rate limits.
step iv: use single-point crossover for pairing and mating of the selected chromosomes.
step v: create mutants on a random basis.
step vi: create new chromosomes and offspring for the convergence check.
step vii: select the fittest individual for the next generation.
step viii: check the convergence criteria. if the maximum counter is reached, jump to step ix; otherwise, return to step iv.
step ix: calculate the membership value of the pareto optimal solutions using (14). the fmin and fmax values of each objective are determined by optimizing all the objectives independently to determine the endpoints of the obtained pareto front.
step x: use the degree of satisfaction attained for each objective to find the best-compromise solution based on the min-max proposition given in (15).

4. results and discussion

to validate the performance, frcga has been employed to solve the nceeld problem on two networks having 3 and 6 generators, satisfying all the operational network constraints at various power demands. the network data for 3 and 6 generating units is given in the appendix (table 13, table 14, table 15 and table 16). a program to simulate the results for both test n/w was written in matlab 7.10. the standard ieee 30-bus network with six generator units is presented in fig. 1.

fig. 1 one-line diagram of 30-bus network

to demonstrate the superiority of frcga, three different test cases of different network complexity have been identified. the convergence test was carried out employing the same evaluation function for the same number
of iterations for convex case. the results for one trial of 250 iterations are shown in fig. 2, fig. 3 and fig. 4 for optimized cost, emission and loss function respectively. it can be seen that frcga converges faster for the population size of 500. fuzzy-based real-code genetic algo for optimizing non-convex environment economic loss dispatch 503 fig. 2 convergence characteristic for best fuel cost solution for different pop sizes fig. 3 convergence characteristic for best emission solution for different pop sizes fig. 4 convergence characteristic for best n/w loss solution for different pop sizes 0 50 100 150 200 250 606 608 610 612 614 616 618 620 622 no. of iteration fu e l c o s t popsize=200 popsize=300 popsize=500 popsize=400 0 50 100 150 200 250 0.18 0.2 0.22 0.24 0.26 0.28 0.3 0.32 no. of iteration e m is s io n popsize=200 popsize=300 popsize=400 popsize=500 0 50 100 150 200 250 0 0.02 0.04 0.06 0.08 0.1 no. of iteration s y s te m l o s s popsize=200 popsize=300 popsize=400 popsize=500 504 s. s. parihar, n. malik hence, the optimal settings for both cases are the same, with the exception of population size and are mentioned in table 1 table 1 frcga parameters for different case studies parameters selected value population size 200 (case 1) 500 (case 2 & 3) selection rate 0.3 mutation rate 0.2 trials 60 iterations 250 4.1. environmental economic dispatch three and six generator networks have been tested without considering the effect of vpl in the network. table 2 illustrates the best cost and emission linked with the network at two different power demands of 400 mw and 500 mw. when cost minimization is performed, the generating fuel cost and n/w emissions are 20792.88 $ and 206.3426 kg, respectively, but the cost of the generator increases to 20846.60 $, and the network harmful emission reduces to 200.1578 kg in ed case at power demand of 400 mw. 
for 500 mw, the generator cost and n/w emissions are 25453.26 $ and 319.5089 kg when cost minimization is performed, but the cost rises to 25500.40 $ and the emission reduces to 311.0776 kg in the ed case. using the min and max values of each objective function, the membership value of the non-dominated solutions is determined.

table 2 best solution for eld and ed of 3-unit n/w at pd = 400 mw and 500 mw
load demand      400 mw                  500 mw
                 eld         ed          eld         ed
p1 (mw)          81.4957     106.4685    103.5167    130.8372
p2 (mw)          175.8190    151.1246    217.1612    190.1187
p3 (mw)          149.8137    149.7724    190.9736    190.7181
fuel cost ($)    20792.88    20846.60    25453.26    25500.40
emission (kg)    206.3426    200.1578    319.5089    311.0776
loss (mw)        7.5560      7.3865      11.9239     11.6800

the simultaneous optimization of the environmental emission and the generator fuel cost is carried out to determine a best-compromise solution. in table 3 and table 4, five intermediate pareto solutions from the attained pareto solution set are listed with their membership values. solution 5 is selected as the best solution, having the highest rank of 0.1584 and 0.1110 at 400 mw and 500 mw, respectively.
table 3 pareto optimal solutions for the convex-eed problem at pd = 400 mw (3-unit n/w)
solution number    cost ($)    emission (kg)    µ1        µ2        µmin
1                  20845.74    203.7849         0.0160    0.4135    0.0160
2                  20843.59    200.6626         0.0560    0.9184    0.0560
3                  20812.80    205.3911         0.6293    0.1539    0.1539
4                  20838.31    200.3850         0.1544    0.9633    0.1544
5                  20838.09    200.2123         0.1584    0.9912    0.1584

table 4 pareto optimal solutions for the convex-eed problem at pd = 500 mw (3-unit n/w)
solution number    cost ($)    emission (kg)    µ1        µ2        µmin
1                  25497.79    312.3221         0.0553    0.8524    0.0553
2                  25497.63    311.0877         0.0586    0.9988    0.0586
3                  25497.56    312.2660         0.0602    0.8590    0.0602
4                  25496.93    311.1103         0.0737    0.9961    0.0737
5                  25495.17    311.1194         0.1110    0.9950    0.1110

the best-compromise solution for the three generating unit network is summarized in table 5 and compared with other methods such as ga [25], pso [25] and fpa [27].

table 5 best solution for the convex-eed problem at pd = 400 mw and 500 mw (3-unit n/w)
best-compromised solution, 400 mw
                  frcga       ga [25]     pso [25]    fpa [27]
p1 (mw)           102.8514    102.617     102.612     102.4468
p2 (mw)           154.0217    153.825     153.809     153.8341
p3 (mw)           150.5278    151.011     150.991     151.1321
fuel cost ($)     20838.09    20840.10    20838.30    20838.10
emission (kg)     200.2123    200.256     200.221     200.2238
loss (mw)         7.4090      7.41324     7.41173     7.4126
total cost ($)    29559.59    29563.20    29559.90    29559.81
best-compromised solution, 500 mw
                  frcga       ga [25]     pso [25]    fpa [27]
p1 (mw)           129.3252    128.997     128.984     128.8074
p2 (mw)           192.4745    192.683     192.645     192.5906
p3 (mw)           189.8764    190.11      190.063     190.2958
fuel cost ($)     25495.17    25499.40    25495.00    25494.70
emission (kg)     311.1194    311.273     311.15      311.155
loss (mw)         11.6882     -           -           -
total cost ($)    39209.7     39220.10    39210.20    39210.15

the comparison shows that the total generation cost incurred in solving the eed problem with the frcga approach is lower than that incurred using the other optimization approaches in both test cases.
thus, frcga succeeds in obtaining the global minimum solution and outperforms these algorithms with respect to all parameters. the total network losses for the best-compromise solution are 7.4090 mw and 11.6882 mw for power demands of 400 mw and 500 mw, respectively. for the 30-bus n/w, the best-compromise solution attained at a load demand of 2.834 p.u has a harmful environmental emission of 0.1999 lb/hr and a cost of 619.90 $/hr, in close agreement with the 0.1969 lb/hr and 623.87 $/hr reported in [20]. fig. 5 shows the pareto front between the fuel cost and the emission, which reveals an inverse relationship between the two objectives.

fig. 5 pareto front between generator fuel cost ($/hr) and emission (lb/hr) for convex eed

4.2. environmental economic loss dispatch with valve-point loading

the performance of the frcga on the nceeld problem is examined for the first time on the ieee 30-bus network at two different loading conditions. three objectives (fuel cost, environmental emission and losses) are simultaneously considered and optimized to obtain the minimum network generation cost. the total generation cost comes out to be 1810.10 $/hr at a load demand of 2.834 p.u, which is found to be superior to published results at 2.834 p.u. the minimum-maximum limits for fuel cost with vpl effect, harmful environmental emissions and losses for load demands of 2.834 p.u and 4.32 p.u are given in table 6. for the load of 2.834 p.u, the values attained for cost and emission are 608.02 $/hr and 0.1938 lb/hr, which are lower than 626.96 $/hr & 0.2110 lb/hr [26], 613.342 $/hr & 0.2028 lb/hr [28] and 613.338 $/hr & 0.1953 lb/hr [29], respectively. the membership values of all the pareto optimal solutions for the nceeld problem are obtained. five intermediate solutions are tabulated in table 7 and table 8 for pd = 2.834 p.u and pd = 4.32 p.u, respectively.
table 6 min-max limit for fuel cost with vpl effect, emission and loss at 2.834 p.u and 4.32 p.u
load (p.u)                     2.834     4.32
cost ($/hr)         minimum    608.02    965.93
                    maximum    646.19    980.67
emission (lb/hr)    minimum    0.1938    0.2263
                    maximum    0.2211    0.2422
loss (p.u)          minimum    0.0209    0.0514
                    maximum    0.0379    0.0612

table 7 pareto optimal set of nceeld problem with vpl effect for load pd = 2.834 p.u
solution number    cost ($/hr)    emission (lb/hr)    loss (p.u)    µ1        µ2        µ3        µmin
1                  622.74         0.1973              0.0262        0.6144    0.8704    0.6894    0.6144
2                  622.62         0.2001              0.0228        0.6174    0.7697    0.8862    0.6174
3                  621.53         0.2022              0.0228        0.6461    0.6911    0.8863    0.6461
4                  619.84         0.2021              0.0268        0.6905    0.6961    0.6558    0.6558
5                  614.99         0.2027              0.0255        0.8174    0.6742    0.7284    0.6742

table 8 pareto optimal set of nceeld for load pd = 4.32 p.u
solution number    cost ($/hr)    emission (lb/hr)    loss (p.u)    total cost ($/hr)    µ1        µ2        µ3        µmin
1                  973.38         0.2326              0.0555        4490.2               0.4944    0.6013    0.5774    0.4944
2                  972.85         0.2329              0.0563        4515.4               0.5301    0.5831    0.4959    0.4959
3                  972.96         0.2335              0.0541        4489.12              0.5228    0.5451    0.7296    0.5228
4                  972.76         0.2332              0.0545        4475.29              0.5365    0.5666    0.6788    0.5365
5                  972.22         0.2335              0.0543        4497.8               0.5728    0.5480    0.7045    0.5480

the results reveal that the best-compromise solution is 2099.20 $/hr for a load demand of 2.834 p.u and 4497.82 $/hr for pd = 4.32 p.u, with the highest ranks of 67.42% and 54.80%, respectively, as determined by the membership values of each objective. fig. 6 depicts the convergence characteristic of the 30-bus network for two different loads, which reveals that convergence for pd = 2.834 p.u and pd = 4.32 p.u is attained quickly even for the complex multi-objective minimization problem.

fig. 6 convergence characteristic for total generation cost for different load conditions (fitness function vs. iteration; load = 2.834 pu and 4.32 pu)

4.3.
environmental economic loss dispatch with valve-point loading, pozs and rrl

for this test case, all the mentioned practical constraints and the non-linear characteristics of the non-convex multi-objective problem are considered, which makes it more complex than the test cases considered above. data for the ramp rate limits and pozs have been taken from the appendix (table 15 and table 17). the generator ramp rate limit needs to be satisfied because a generator cannot change (increase or decrease) its output arbitrarily to any value; the change has to be within the up/down ramp rate limits. the inclusion of ramp rate limits changes the operating limits of the generator. the minimum-maximum limits of fuel cost, emission and loss evaluated for the six-unit system with pozs and rrl at a load demand of 2.834 p.u are given in table 9. table 10 lists the intermediate solutions obtained using rcga. the best solution is ranked on the basis of its performance for all the objectives considered; therefore, the overall rank for extreme points is zero. the rank of the best solution is found to be 0.6685, which indicates that all three objectives are satisfied to at least 66.85 % for a load of 2.834 p.u.

table 9 min-max limit for fuel cost with vpl effect, emission and loss with pozs and rrl at 2.834 p.u
                    minimum     maximum
cost ($/h)          611.2998    645.3562
emission (lb/h)     0.1942      0.2073
loss (pu)           0.0256      0.0358

table 10 pareto optimal set of nceeld with pozs and rrl for load pd = 2.834 p.u
         cost ($/h)    emission (lb/h)    loss (pu)    µ1        µ2        µ3        µmin
sol.1    624.7335      0.1975             0.0257       0.6055    0.7473    0.9901    0.6055
sol.2    623.7816      0.1989             0.0242       0.6335    0.6421    1.0000    0.6335
sol.3    623.3781      0.1979             0.0291       0.6453    0.7208    0.6589    0.6453
sol.4    621.4747      0.1987             0.0283       0.7012    0.6598    0.7379    0.6598
sol.5    620.3646      0.1985             0.0256       0.7338    0.6685    1.0000    0.6685

the results clearly show that all the constraints, such as the vpl effect, pozs, rrl, generation limits and power balance constraints, were fully satisfied for all considered test cases of the tri-objective optimization problem. due to the non-convexity constraints introduced in the test system, the cost increases from 608.0296 $/hr to 611.2998 $/hr, the emission increases from 0.1938 lb/hr to 0.1942 lb/hr and the system loss from 0.0209 p.u to 0.0256 p.u.

4.4. statistical analysis

table 11 compares different approaches for cost and emission minimization in terms of their minimum, maximum, mean and median values for the ieee 30-bus n/w. the cost minimum (cmin), cost mean (cmean), cost median (cmedian), emission minimum (emin), emission mean (emean) and emission median (emedian) values obtained for the eld and ed problems, respectively, are found to be the lowest as compared to other published work. the statistical comparison for the ceed problem is shown in table 12 in terms of mean and standard deviation. the values of cmean and emean obtained from solving the convex ceed problem also demonstrate the superiority of the method.
the values of the cost standard deviation (cstd) and emission standard deviation (estd) attained with the proposed frcga approach are 7.127 and 0.0057, respectively, which are less than those obtained with the other approaches. this clearly shows that the obtained results lie closer to their mean value than those of the other published methods.

table 11 statistical comparison of eld and ed minimization for ieee 30-bus n/w at load pd = 2.834 p.u
(1) fuel cost minimization
                     cmin      cmax      cmean     cmedian
proposed approach    601.31    610.07    603.20    602.23
gqpso [31]           606.38    611.86    609.49    609.66
saiwpso [32]         605.99    606.00    605.99    605.99
ngpso [20]           605.99    605.99    605.99    605.99
(2) emission minimization
                     emin      emax      emean     emedian
proposed approach    0.1938    0.2295    0.1941    0.1940
gqpso [31]           0.1942    0.1946    0.1944    0.1944
saiwpso [32]         0.1941    0.1941    0.1941    0.1941
ngpso [20]           0.1941    0.1941    0.1941    0.1941

table 12 statistical comparison of ceed minimization for ieee 30-bus n/w at load pd = 2.834 p.u
                     cmean     cstd     emean     estd
proposed approach    622.62    7.127    0.2012    0.0057
gqpso [31]           644.09    12.2     0.2109    0.0095
saiwpso [32]         623.76    -        0.1970    -
ngpso [20]           623.86    -        0.1969    -

5. conclusion

the fuzzy-based rcga is demonstrated to solve the multi-objective environmental economic loss dispatch problem considering a non-convex and non-smooth fuel cost function. the multi-objective minimization problem is transformed into a constrained single-objective problem by the use of price penalty factors, which blend all competing objectives (generator cost, environmental emission and system losses). because the objectives are inversely related, a set of pareto optimal solutions is attained rather than a single optimal solution for a given objective. furthermore, a fuzzy approach is exploited to extract the best-compromise solution as the one with the highest rank based on the membership values.
the convergence of the nceeld problem at different load demands is also analyzed considering the different practical operating limits (pozs, rrl and vpl) of the network. the total generation cost of the network attained with the proposed method for different test cases has been compared to other techniques, which validates the solution to the nceeld problem for small and large networks. the statistical analysis also validates the frcga approach. the percentage reductions in the cstd and estd values are 41.5% and 40% as compared to ref. [31]. the proposed work can further be extended to the study of integration of renewable energy sources and to practical transmission networks considering the dynamic non-convex ceeld problem.

appendix

table 13 generator cost, emission coefficients & generation constraints for three generating unit network
                         g1           g2           g3
cost coefficients
ai                       0.03546      0.02111      0.01799
bi                       38.30553     36.32782     38.27041
ci                       1243.5311    1658.5696    1356.6592
emission coefficients
αi                       0.00683      0.00461      0.00461
βi                       -0.54551     -0.5116      -0.5116
γi                       40.2669      42.89553     42.89553
unit limits
pmin (p.u)               35           130          125
pmax (p.u)               210          325          315

table 14 b-coefficients for three generating unit network
bij * 0.0001
0.71     0.3     0.25
0.3      0.69    0.32
0.255    0.32    0.8

table 15 generator fuel cost, emission coefficients and n/w generation constraints for 30-bus n/w
                              g1        g2        g3         g4        g5          g6
cost coefficients
ai                            100       120       40         60        40          100
bi                            200       150       180        100       180         150
ci                            10        10        20         10        20          10
ei                            200       200       200        200       200         200
fi                            0.0050    0.0060    0.0010     0.0009    0.0009      0.0015
emission coefficients
αi                            4.091     2.543     4.258      5.326     4.258       6.131
βi                            -5.554    -6.047    -5.094     -3.550    -5.094      -5.555
γi                            6.490     5.638     4.586      3.380     4.586       5.151
ζi                            0.0002    0.0005    0.00001    0.002     0.000001    0.00001
λi                            2.857     3.333     8.000      2.000     8.000       6.667
generator unit constraints
pmin (p.u)                    0.05      0.05      0.05       0.05      0.05        0.05
pmax (p.u)                    0.5       0.6       1.0        1.2       1.00        0.60
ramp rate limits
dri(up)/h                     0.08      0.11      0.15       0.18      0.15        0.18
dri(dn)/h                     0.08      0.11      0.15       0.18      0.15        0.18

table 16 b-coefficients for six generating unit network
bij
0.1382     -0.0299    0.0044     -0.0022    -0.0010    -0.0008
-0.0299    0.0487     -0.0025    0.0004     0.0016     0.0041
0.0044     -0.0025    0.0182     -0.0070    -0.0066    -0.0066
-0.0022    0.0004     -0.0070    0.0137     0.0050     0.0033
-0.0010    0.0016     -0.0066    0.0050     0.0109     0.0005
-0.0008    0.0041     0.0066     0.0033     0.0005     0.0244
bo
-0.0107    0.0060     -0.0017    0.0009     0.0002     0.0030
boo
0.00098573

table 17 pozs of units for ieee-30 bus n/w
unit    1              2              5
poz     [0.10 0.15]    [0.25 0.30]    [0.50 0.55]

references
[1] j. c. dodu, p. martin, a. merlin and j. pouget, "an optimal formulation and solution of short-range operating problems for a power system with flow constraints", proc. ieee, vol. 60, no. 1, pp. 54-63, 1972.
[2] m. modiri-delshad, s. h. a. kaboli, e. taslimi-renani and n. a. rahim, "backtracking search algorithm for solving economic dispatch problems with valve-point effects and multiple fuel options", energy, vol. 116, pp. 637-649, 2016.
[3] m. pradhan, p. k. roy and t. pal, "grey wolf optimization applied to economic load dispatch problems", int. j. electr. power energy syst., vol. 83, pp. 325-334, 2016.
[4] m. r. gent and w. l.
john, "minimum-emission dispatch", ieee trans. power syst., vol. 90, pp. 2650-2660, 1971.
[5] k. t. chaturvedi, m. pandit and l. srivastava, "modified neo-fuzzy neuron-based approach for economic & environmental optimal dispatch", appl. soft comput., vol. 8, no. 4, pp. 1428-1438, 2008.
[6] s. zaoui and a. belmadani, "solution of combined economic and emission dispatch problems of power systems without penalty", appl. artif. intell., p. 1976092, 2021.
[7] a. chatterjee, s. p. ghoshal and v. mukherjee, "solution of combined economic and emission dispatch problems of power system by an opposition-based harmony search algorithm", int. j. electr. power energy syst., vol. 39, no. 1, pp. 9-20, 2012.
[8] c. palanichamy and k. srikrishna, "economic thermal power dispatch with emission constraint", j. institution of eng., vol. 72, pp. 11-18, 1991.
[9] s. s. parihar and n. malik, "multi-objective optimization with non-convex cost functions using fuzzy mechanism based continuous genetic algorithm", in proceedings of the ieee 4th international conference on electrical, computer and electronics, 2017, pp. 457-462.
[10] d. c. walters and g. b. sheble, "genetic algorithm solution of economic dispatch with valve point loading", ieee trans. power syst., vol. 8, no. 3, pp. 1325-1332, 1993.
[11] d. zou, s. li, g. g. wang, z. li and h. ouyang, "an improved differential evolution algorithm for the economic load dispatch problems with or without valve-point effects", appl. energy, vol. 181, pp. 375-390, 2016.
[12] w. t. el-sayed, e. f. el-saadany, h. h. zeineldin and a. s. al-sumaiti, "fast initialization methods for the nonconvex economic dispatch problem", energy, vol. 201, p. 117635, june 2020.
[13] s. m. abd elazim and e. s. ali, "optimal network restructure via improved whale optimization approach", int. j. commun., vol. 34, no. 1, e. 4617, 2021.
[14] a. y. abdelaziz, e. s. ali and s. m. abd elazim, "flower pollination algorithm to solve combined economic and emission dispatch problems", eng. sci. technol. int. j., vol. 19, no. 2, pp. 980-990, 2016.
[15] m. a. abido, "a novel multi-objective evolutionary algorithm for environmental/economic power dispatch", int. j. electr. power system res., vol. 65, no. 1, pp. 71-91, 2003.
[16] v. p. sakthivel, m. suman and p. d. sathya, "combined economic and emission power dispatch problems through multi-objective squirrel search algorithm", appl. soft comput., vol. 100, p. 106950, march 2021.
[17] n. sinha, r. chakrabarti and p. k. chattopadhyay, "evolutionary programming techniques for economic load dispatch", ieee trans. evol. comput., vol. 7, no. 1, pp. 83-94, 2003.
[18] m. basu, "dynamic economic emission dispatch using nondominated sorting genetic algorithm-ii", int. j. electr. power energy syst., vol. 30, no. 2, pp. 140-149, 2008.
[19] e. s. ali and s. m. abd elazim, "mine blast algorithm for environmental economic load dispatch with valve loading effect", neural comput. appl., vol. 30, pp. 261-270, 2018.
[20] d. zou, s. li, z. li and x. kong, "a new global particle swarm optimization for the economic emission dispatch with or without transmission losses", energy convers. manag., vol. 139, pp. 45-70, 2017.
[21] l. wang and c. singh, "environmental/economic power dispatch using fuzzified multi-objective particle swarm optimization algorithm", int. j. electr. power syst. res., vol. 77, no. 12, pp. 1654-1664, 2007.
[22] s. sivasubramani and k. s. swarup, "environmental/economic dispatch using multi-objective harmony search algorithm", electr. power syst. res., vol. 81, no. 9, pp. 1778-1785, 2011.
[23] l. benyekhlef, s. abdelkader, b. houari and a. a. n. el-islam, "cuckoo search algorithm to solve the problem of economic emission dispatch with the incorporation of facts devices under the valve-point loading effect", fu: elec. energ., vol. 34, no. 10, pp. 569-588, 2021.
[24] q. quande, c. shi, c. xianghua, l. xiujuan and s. yuhui, "solving non-convex/non-smooth economic load dispatch problems via an enhanced particle swarm optimization", appl. soft comput., vol. 59, pp. 1-24, 2017.
[25] a. l. devi and o. v. krishna, "combined economic and emission dispatch using evolutionary algorithms: a case study", arpn j. eng. appl. sci., vol. 3, no. 6, pp. 28-35, 2008.
[26] s. hemamalini and s. p. simon, "emission constrained economic dispatch with valve point effect using particle swarm optimization", in proceedings of the ieee region 10 conference (tencon), 2008, vol. 1, pp. 1-6.
[27] a. y. abdelaziz, e. s. ali and s. m. abd elazim, "combined economic and emission dispatch solution using flower pollination algorithm", int. j. electr. power energy syst., vol. 80, pp. 264-274, 2016.
[28] a. bhattacharya and p. k. chattopadhyay, "application of biogeography-based optimization for solving multi-objective economic emission load dispatch problem", electr. power compon. syst., vol. 38, no. 3, pp. 826-850, 2010.
[29] a. bhattacharya and p. k. chattopadhyay, "solving economic emission load dispatch problems using hybrid differential evolution", appl. soft comput., vol. 11, no. 2, pp. 2526-2537, 2011.
[30] r. l. haupt and s. e. haupt, practical genetic algorithm, 2004. (book)
[31] s. agrawal, b. k. panigrahi and m. k. tiwari, "multiobjective particle swarm algorithm with fuzzy clustering for electrical power dispatch", ieee trans. evol. comput., vol. 12, no. 5, pp. 529-541, 2008.
[32] m. a. c. silva, c. e. klein, v. c. mariani and l. s. coelho, "multiobjective scatter search approach with new combination scheme applied to solve environmental/economic dispatch problem", energy, vol. 53, no. 5, pp. 14-21, 2013.

instruction facta universitatis series: electronics and energetics vol. 29, n o 1, march 2016, pp.
151-158 doi: 10.2298/fuee1601151g

application of infrared thermography to non-contact testing of ac/dc power supply

stanisław galla, alicja konczakowska
gdansk university of technology, gdansk, poland

abstract. testing of ac/dc power supplies using thermography was carried out in order to assess the correctness of their assembly and operation before launching them on the market. the investigation was carried out for 17 ac/dc power supplies which had passed the standard tests (measurements of their basic parameters and characteristics). the investigation consisted of two steps. in the first step, the dispersion of temperature on the power supply boards was measured after 20 minutes of operation in nominal conditions. three regions were defined as potentially revealing a failure. in the second step, the acceptable temperature increments on the boards of the tested power supplies were evaluated. it was proposed to assess the properties of power supplies on the basis of temperature increments on their boards, registered by an infrared camera either for 12 minutes or for up to 20 minutes.

key words: thermography, testing, power supply

1. introduction

diagnostics of electronic systems is a current problem in their manufacturing. particularly significant are testing and fault identification methods that enable diagnostics without interfering with the tested system or components. examples of such solutions (non-destructive testing) are: quality (reliability) evaluation on the basis of inherent noise [1-7], the resonant ultrasound spectroscopy technique [8] and infrared thermography inspections [9-16]. thermography offers very advantageous conditions for assessing the properties of single components, parts of systems and whole systems, as well as for detecting faults and defects on electronics boards [9-16]. an infrared camera allows non-contact inspection of a tested object within the infrared radiation range.
emission of the infrared radiation can be recorded without any interference with a diagnosed single component, whole system or its part. all components of the power supply mounted on the board are sources of radiation. each of these components has its own specified emissivity coefficient, which depends on its structure. it was assumed that properly constructed components have similar emissivity coefficients and that they will affect the temperature dispersion across the board only within a specific small range. a defective component or a defective assembly of a component will cause anomalous temperatures. the mounted components are mainly smds. in this paper, the application of infrared thermography to the quality diagnosis of an ac/dc power supply for a fire station (u = 18 v, i = 3 a) is proposed. the diagnosis consists in the identification of a faulty component or a faulty part of the system. in the paper, an ac/dc power supply is abbreviated to 'a power supply'.

received april 8, 2015; received in revised form september 2, 2015
corresponding author: alicja konczakowska, gdansk university of technology, g. narutowicza 11/12, 80-233 gdansk, poland (e-mail: alkon@eti.pg.gda.pl)

2. thermography inspection of ac/dc power supply

a procedure for power supply inspection applying the well-known thermographic technique is proposed in order to determine the quality of every manufactured power supply (the correctness of its assembly and operation) before launching it on the market. the thermographic method enables non-contact measurement of the temperature of the inspected surface, in this case the surface of a power supply board. we assume that an incorrect assembly or an improper operation of a tested power supply will be indicated by an increase of its temperature. at the beginning of the thermographic investigations, in the first step, the typical temperature dispersion for a few high quality power supply boards was evaluated.
the analysis of the measured temperature dispersion (thermograms) enables recognition of the regions of a power supply board with the highest local temperatures. these regions have to be inspected if the thermograms reveal any irregularities; in that case, the investigated power supply may not be classified as operating properly. as a result, the research determines the duration of the thermographic test. in the second step, the temperature increment measurement technique is used during power supply operation, i.e. temperature increments are compared at the starting moment of the test, t = t0, and at successive moments t = ti, where i = 1, 2, …, n, and n is the number of observations (measurements) made with an infrared camera until the end of testing, i.e. until t = tn. it was assumed that at the moment t = t0 the temperature on the power supply board is uniform, i.e. Tmax0 – Tmin0 = ∆T0 = 0. it was also assumed that Tmaxi and Tmini are, respectively, the maximum and minimum temperature values appearing on the board at the moments ti, where i = 0, 1, 2, …, n. after the inspection procedure starts, the temperature on the board begins to change, and at successive moments t = ti temperature increments occur on the board; they are defined as Tmaxi – Tmini = ∆Ti, where i = 1, 2, …, n. the temperature increment values can easily be determined from the thermograms and enable comparison of the properties of regions during testing independently of their individual emission coefficients. the aim of the investigation was to determine the moment ti after turning on the power supply at which an improperly operating power supply can be detected effectively. the thermograms were taken with a vigo system s.a. vigocam v50 infrared camera equipped with a 35 mm lens, and the tested power supply boards were placed at a distance of 0.97 m. the dimensions of the observed surfaces of the tested boards were 73.5 mm x 105 mm.
the relevant parameters of the infrared camera are summarized in table 1 [17].

table 1 relevant parameters of vigocam v50 camera [17]
parameter             value/function description
detector type         non-cooled bolometric matrix (fpa)
spectrum range        8-14 μm
thermal resolution    ≤ 0.065 °c (for temperature 30 °c)

to determine the properties of the power supplies (ac/dc power supplies for a fire station: u = 18 v, i = 3 a), infrared thermography was used after a preliminary standard test of the power supplies had been performed. the tested power supplies are assumed to be operating properly during the standard tests (measurements of basic parameters and characteristics of the power supplies).

3. results of investigation

the investigation was carried out for 17 power supplies which had successfully passed the standard tests consisting of measurements of their basic parameters and characteristics. the investigation consists of:
in the first step:
- the thermography inspection of the tested power supplies (the dispersion of temperature) after 20 minutes of operation in nominal conditions,
- determining the regions with the maximum temperature values,
- evaluating the regions with the maximum local temperature values,
in the second step:
- the thermography inspection of the tested power supplies during operation (the temperature increment measurement),
- the elaboration of rules for classifying power supplies according to their quality.

for the investigated power supply, the standard test (measurements of basic parameters and characteristics) lasts 20 minutes, which is typical for the examination of these power supplies. the temperature dispersion on the tested power supply boards was checked after 20, 40 and 180 minutes of operation in normal conditions. the thermograms revealed that the temperature dispersion on the board is stable after 20 minutes. fig. 1a presents the thermogram of power supply no. 3 taken after 20 minutes of operation in nominal conditions; the maximum temperature is equal to 75 °c. three regions with the highest local temperatures were recognized and are marked on the power supply board, as presented in fig. 1b. components operating in these regions are the most likely cause of a possible failure. of course, an increase of the board temperature may also result from a failure of any component located in other regions of the board.

fig. 1 the inspected power supply no. 3: a) thermogram taken after 20 minutes of operation in nominal conditions, b) power supply board with the highest temperature regions marked: 1: thermistor, 2: main transformer, 3: resistor.

to evaluate the local maximum temperature values in the selected regions, the temperatures were measured during 20 minutes of power supply operation following an earlier 20 minutes of operation (stable state). the maximum temperatures for the investigated power supply were as follows: region 1: 62.5 °c, region 2: 56 °c, region 3: 65 °c, and the dispersion of these local maximum temperatures was estimated as 5 °c for regions 1 and 3, and about 7 °c for region 2. the thermal data of the thermistors, transformers and resistors applied in this type of power supply are collected in table 2. the maximum component temperatures taken from the technical data are higher than those measured in the investigated power supplies. it was found that the local maximum temperatures are similar for all of the high quality power supplies. an inspection period of 20 minutes of power supply operation was chosen for the second step.
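as a rough illustration of the quantity used in the second step, the temperature increment ∆Ti = Tmaxi – Tmini defined in section 2 can be computed from a sequence of thermograms as sketched below in python. the representation of a thermogram as a nested list of temperatures in °c, the function name and the sample values are assumptions made for this example; the export format of the camera is not described in the paper.

```python
def temperature_increments(frames):
    """Return Tmax_i - Tmin_i for every thermogram in `frames`.

    frames: list of thermograms, each a 2-D list of temperatures in deg C
            (hypothetical layout, one inner list per image row).
    """
    increments = []
    for frame in frames:
        values = [value for row in frame for value in row]
        increments.append(max(values) - min(values))
    return increments


# hypothetical 2x2 thermograms: uniform board at t0, heated board at t1
frames = [
    [[25.0, 25.0], [25.0, 25.0]],
    [[30.0, 41.5], [28.0, 35.0]],
]
increments = temperature_increments(frames)
print(increments)
```

because only the difference between the hottest and coldest pixel is used, the result does not depend on which component is the hottest, in line with the region-independent comparison described in section 2.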
table 2 thermal data of power supply components
component      temperature [°c]       remarks
               minimum    maximum
thermistor     -55        +200
transformer    -40        +125        made to order
resistor       -55        +155

the second step of examining the 17 power supplies concerned the temperature increment measurements. thermograms of the power supply boards were taken by an infrared camera during 20 minutes of operation after turning on a power supply. the number of measurement points was n = 10; the measurements were taken every 2 minutes. the results of the temperature increment measurements on the surface of the tested boards are presented in fig. 2.

fig. 2 the temperature increments ∆t for: a) 14 power supplies with similar temperature conditions, $T$ – the mean heating characteristic, b) the power supply no. 16 – curve 1, no. 17 – curve 2, no. 8 – curve 3, $T_{th}$ – the threshold heating characteristic – curve 4.

in fig. 2a the temperature increments for 14 power supplies are presented. it is easy to recognize that the values of the temperature increments at all measurement points are similar. the mean value of these measurement results is also presented in fig. 2a. formally, it is the mean heating characteristic $T$ of the investigated power supplies. the standard uncertainty u of the $T$ values at the measurement points (fig. 2a) is greater for the starting point i = 1 (i.e. for t1 = 2 min) and smaller for the final points i = 8, 9, 10 (i.e. for t8 = 16 min, t9 = 18 min, t10 = 20 min), and is equal to u1 = ± 4.6°c, u8 = ± 2.5°c, u9 = ± 2.4°c, and u10 = ± 2.3°c, respectively. the mean heating characteristic $T$ was approximated on the basis of the measurement results by the relation (with t in minutes):

$$T(t) = 22 + 22\left[1 - \exp\left(-\frac{t}{12}\right)\right] \tag{3.1}$$

this relation was estimated for time t ≥ 2 min. to the relation (3.1) at every measurement point the calculated standard uncertainty ui (i = 1, 2, …, 10) was added.
this characteristic was approximated by the relation below, called the threshold heating characteristic $T_{th}$:

$$T_{th}(t) = 24 + 22\left[1 - \exp\left(-\frac{t}{12}\right)\right] \tag{3.2}$$

the relation (3.2) is presented in fig. 2b as curve 4. the value of $T_{th}$ evaluated for t = 12 minutes or t = 20 minutes enables the quality of an investigated power supply to be evaluated according to the following classification rules:

∆t evaluated for t = 12 min:
    ∆t ≤ 38°c    a high quality power supply
    ∆t > 38°c    a poor quality power supply
or                                                        (3.3)
∆t evaluated for t = 20 min:
    ∆t ≤ 42°c    a high quality power supply
    ∆t > 42°c    a poor quality power supply

where 38°c and 42°c are the threshold heating values of the temperature increments for the above test durations, respectively (see fig. 2b), and ∆t is the value of the temperature increment evaluated for the investigated power supply after its operation for 12 and 20 minutes, respectively.

if the temperature increment at t6 = 12 minutes is higher than 38°c, the investigated power supply has to be additionally examined, especially its components in the three defined regions. in such a case we propose to perform a quality procedure by the infrared thermography inspection, which takes only 12 minutes of operation of the investigated power supply. a different value of $T_{th}$ can also be applied in the classification rules (3.3), provided that it corresponds to a suitable time of power supply operation, for example $T_{th}$ equal to 41°c at t = 16 minutes.

it was surprising that for 3 power supplies (no. 8, no. 16 and no. 17) the results of the temperature increment measurements differed completely from those obtained for the remaining 14 power supplies. all power supplies are assumed to operate properly during the standard tests. the results of the temperature increment measurements for power supplies no. 16, no. 17 and no. 8 are presented in fig. 2b as curves 1, 2 and 3, respectively. especially surprising is the result of the temperature increment measurements for the power supply no. 16.
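the heating characteristics and the classification rules (3.3) lend themselves to a short numerical check. the sketch below is a minimal illustration under stated assumptions: the relations (3.1) and (3.2) are taken to have the forms 22 + 22[1 − exp(−t/12)] and 24 + 22[1 − exp(−t/12)] (t in minutes), a reading that reproduces the 38°c and 42°c thresholds quoted in the rules; the function names are hypothetical.

```python
import math


def mean_heating(t_min):
    """Mean heating characteristic [deg C]; assumed form of relation (3.1)."""
    return 22.0 + 22.0 * (1.0 - math.exp(-t_min / 12.0))


def threshold_heating(t_min):
    """Threshold heating characteristic [deg C]; assumed form of relation (3.2)."""
    return 24.0 + 22.0 * (1.0 - math.exp(-t_min / 12.0))


# Threshold values quoted in rules (3.3), keyed by test duration in minutes.
RULE_THRESHOLDS_C = {12: 38.0, 20: 42.0}


def classify(delta_t_c, t_min):
    """Apply rules (3.3): high quality if the increment does not exceed the threshold."""
    limit = RULE_THRESHOLDS_C[t_min]
    return "high quality" if delta_t_c <= limit else "poor quality"


for t in (12, 20):
    # Compare the assumed relation (3.2) with the threshold quoted in the rules.
    print(t, round(threshold_heating(t), 1), RULE_THRESHOLDS_C[t])

# An increment of 77 deg C after 20 minutes lies far above the 42 deg C threshold.
print(classify(77.0, 20))
```

under these assumptions the threshold curve lies a constant 2°c above the mean curve, which is of the same order as the standard uncertainties reported for the later measurement points.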
for power supply no. 16, the temperature increment after 20 minutes of operation was equal to 77°c.

for the power supplies no. 8, no. 16 and no. 17, very detailed measurements of their parameters and characteristics were carried out, supplemented by a mechanical inspection of the transformers. it was found that the problem lay in the construction of the transformers (an unglued transformer core – no. 8 and no. 17, and an asymmetrical winding on the transformer core – no. 16). as can be seen (fig. 2b, curves 1, 2, 3), the temperature increment levels for the power supplies no. 16, no. 17 and no. 8 are, from the beginning of the measurements, significantly higher than the threshold heating characteristic $T_{th}$ (fig. 2b, curve 4). if the classification rules (3.3) are applied, these power supplies will be classified as poor quality power supplies.

another case of failure is described below. a thermistor failure (catastrophic failure, short circuit) was triggered after 8 minutes of power supply operation. the results of the temperature increment measurements for this power supply are presented in fig. 3.

fig. 3 temperature increments ∆t for the power supply in which the thermistor failure was triggered after 8 minutes of operation.

the temperature increment rapidly increased at t4 = 8 minutes and then rapidly decreased; this incident brought about the total failure of the investigated power supply. the course of ∆t as a function of time, from the measurement starting point t = 2 minutes to the point t = 8 minutes after turning on the investigated power supply, is consistent with the heating characteristics of the power supplies (fig. 2a). the experiment showed that thermistor damage results in a temporary increase of the temperature of the investigated power supply board. the presumption is that such damage may occur at any time during the power supply operation.
therefore, it is difficult to detect such damage during the first minutes of operation (although, of course, it is possible). in this case the temperature increment ∆t measured on the power supply board at time t = 12 minutes is smaller than the threshold value $T_{th}$ of the heating characteristic. this is an indication that some component of the power supply was destroyed.

4. conclusion

the investigations carried out for ac/dc power supplies revealed a necessity of evaluating their quality. to sum up, checking the quality of power supplies within the period from 12 minutes to 20 minutes consists in determining whether the temperature increment ∆t on the board is the correct one or whether it exceeds the threshold value. if ∆t is greater than the threshold value, detailed tests must be carried out in order to find what has been damaged. if the temperature increments are of the order of 24°c after 12 minutes of power supply operation, it means that the classification rules (3.3) are not satisfied. in this case some catastrophic failure of components can be expected. the described procedure is used for testing ac/dc power supplies before launching them on the market. the proposed scenario of the thermographic investigation can be applied to other systems. all steps of the investigation should be carried out.

references

[1] l. hasse, s. babicz, l. kaczmarek, et al. ”quality assessment of zno-based varistors by 1/f noise”, microelectronics reliability, vol. 54, pp. 192-199, issue 1, january 2014. [2] jae-hyung jang, hyuk-min kwon, ho-young kwak, et al. ”effect of fluorine implantation on 1/f noise, hot carrier and nbti reliability of mosfets”. ieice transactions on electronics, vol. e96.c, pp.624-629, no. 5, 2013. [3] zhuang yiqi and bao junlin. ”1/f noise and g-r noise related to reliability in optoelectronic coupled devices”, in proceedings of the 22 nd international conference on noise and fluctuations. montpellier, france, jun, 2013, pp. 24-28.
[4] h. k. chan, r. c. stevens, j. p. goss, et al. ”reliability evaluation of 4h-sic jfets using i-v characteristics and low frequency noise”. in proceedings of the 9 th european conference on silicon carbide and related materials. st. petersburg, russia, sep 02-06, 2012 and silicon carbide and related materials 2012, book series: materials science forum, vol. 740-742, 2013, pp. 934-937. [5] b. k. jones, ”electrical noise as a reliability indicator in in electronic devices and components”. iee proc. circuits devices syst., vol. 149, pp. 13-22, no. 1, february 2002. [6] a. konczakowska, ”methodology of semiconductor devices classification into groups of differentiated quality”, microelectronics reliability, vol. 48, pp. 87-97, issue 1, january 2001. [7] c. ciofi and b. neri, ”low-frequency noise measurements as a characterization tool for degradation phenomena in solid-state devices”, journal physics d: applied physics, vol. 33, pp. 199-216, 2000. [8] l. hasse, a. konczakowska and j. smulko, ”classification of high-voltage varistors into groups of differentiated quality”. microelectronics reliability, vol. 49, pp. 1483-1490, issue 12, december 2009. [9] r. lethiniemi, ”bibliography of the application of infrared thermography to electronics”. thermosense xxi, in proceedings of the society of photo-optical instrumentation engineers (spie), vol. 3700, 1999, pp. 202-208. [10] st. galla and a. konczakowska, ”application of infrared thermography to the non-contact testing of varistors”, metrology and measurement systems, vol. 20, pp. 677-688, issue 4, 2013. [11] s. j. hsieh, ”survey of thermography in electronic inspection”. thermosense: thermal infrared applications xxxvi, in proceedings of spie. vol. 9105, 2014. [12] b. giron-palomares et al., ”evaluation of nonintrusive active infrared thermography technique to detect hidden solder ball defects on plastic ball grid array components”, journal of electronic packaging, vol. 136, pp. 31008-31016, issue 3, 2014. [13] w. 
minkina and s. dudzik, infrared thermography – errors and uncertainties. john wiley & sons ltd, chichester, 2009. [14] h. kaplan, practical applications of infrared thermal sensing and imaging equipment. 3rd ed., spie, 2007. [15] m. vollmer and k. p. möllmann, infrared thermal imaging: fundamentals, research and applications. john wiley & sons. wiley-vch verlag gmbh & co. kgaa, 2011. [16] b. więcek and g. de mey, infrared thermovision; foundations and applications. pak warszawa, 2011. [17] www.vigo.com.pl instruction facta universitatis series: electronics and energetics vol. 28, n o 3, september 2015, pp. 383 391 doi: 10.2298/fuee1503383r optimization and advantages of the bimode insulated gate transistor  munaf rahimo, liutauras storasta abb switzerland ltd., semiconductors abstract. the bi-mode insulated gate transistor bigt is a single chip reverse conducting igbt concept, which is foreseen to replace the standard igbt / diode two chip approach in many high power semiconductor applications. therefore, it is important to understand in detail the design challenges and performance trade-offs faced when optimizing the bigt for different application requirements. in this paper, we present the main conflicting design trade-offs for achieving the overall electrical and thermal performance targets. we will demonstrate experimentally how on one hand, the bigt provides improved design features which overcome the restrictions of the current state of the art igbt/diode concepts, while on the other hand, a new set of tailoring parameters arise for an optimum bigt behavior. key words: power semiconductors, igbt, diode, bigt 1. introduction in modern power electronics applications employing igbt modules, the diode presents a major restriction with regard to losses reductions and maximum surge current capability. both issues are a result of the typically limited diode area available in a given package footprint design. 
in particular, these limits were further restricted after the introduction of modern low-loss igbt designs. therefore, the simple approach of increasing the diode area is not a preferred solution and in any case remains constrained by the package standard footprint designs. nevertheless, the clear demand for increased power densities of igbt and diode components has led to the focus on an igbt and diode integration solution, normally referred to as the reverse conducting rc-igbt. the key rc-igbt feature has been the introduction of the anode shorts for the diode integration [1][2]. however, a number of process and design constraints related to the integrated diode structure have hindered the development of rc-igbts for hard switching applications, and recent development efforts were aimed at tackling these issues. these efforts resulted in an advanced rc-igbt concept referred to as the bi-mode insulated gate transistor (bigt) [3].

received march 2, 2015
corresponding author: munaf rahimo
abb switzerland ltd, semiconductors, fabrikstrasse 3 5600 switzerland (e-mail: munaf.rahimo@ch.abb.com)

in addition to the modern miniaturized igbt mos cell designs, the second main design enabler for the bigt realization is the soft-punch-through (spt) buffer concept [4]. almost all of today's igbt structures are based on the soft-punch-through or field stop lowly doped buffer concept combined with low injection efficiency p-type anodes, providing very low on-state and switching losses when compared to previous generations. however, this design approach has some limits for delivering optimum overall performance due to the difficulty of controlling the bipolar gain of the igbt. the bipolar gain has a critical dependency on the finely controlled design parameters of the buffer and anode, especially when compared to typical non-punch-through npt devices.
it was also clear that these design restrictions become even more challenging for igbts with higher voltage ratings [5]. the main requirements affected by the soft-punch-through (spt) igbt structure are illustrated in fig. 1 and listed below:
1. reverse blocking leakage current, which is critical for device stability during high temperature operation
2. short circuit withstand capability at low temperatures and high gate emitter voltages
3. turn-off softness under high inductance, low currents and temperatures
4. safe operating area under dynamic avalanche and switching-self-clamping-mode sscm
5. static and dynamic losses trade-off point selection due to (1-4) restrictions

fig. 1 the trade-offs for spt design in igbts (adjusting the pnp bipolar transistor gain for optimum performance: rbsoa, short circuit soa, leakage current, losses, softness).

as mentioned previously, when compared to igbts the rc-igbt in principle also benefits greatly from the spt design, and more importantly, it has been shown that the above performance and associated design trade-offs are strongly minimized by the introduction of the anode shorts. thus, the anode shorts not only enable diode conduction but are also fundamental for the functionality of the whole device concept [6].
however, conventionally for rc-igbts, the igbt/diode integration and the introduction of the anode shorts for high voltage and hard switching applications have also resulted in a number of performance drawbacks and a new set of trade-offs, as summarized below:
• snap-back in the igbt on-state i-v characteristics (the shorting effect)
• igbt on-state versus diode recovery losses trade-off (the plasma shaping effect)
• safe operating area soa (the charge uniformity effect)
• igbt versus diode softness trade-off (the silicon design effect)

in this paper, we will discuss the above mentioned topics and provide an overview of the required bigt optimum design features to obtain good overall electrical performance while targeting main stream hard switching power electronics applications.

2. the bigt device concept

the development of the bigt was aimed at solving the above issues by following two integration steps. the first integration follows the standard approach for an rc-igbt. a cross section is shown in fig. 2, combining both an igbt and a diode in a single structure. at the collector side, alternating n+ doped areas are introduced into an igbt p+ anode layer, which then act as a cathode contact for the internal diode mode of operation. the area ratio between the igbt anode (p+ regions) and the diode cathode (n+ regions) determines which part of the collector area is available in igbt or diode modes, respectively. during the rc-igbt conduction in diode mode, the p+ regions are inactive and do not directly influence the diode conduction performance. on the other hand, the n+ regions act as anode shorts in the igbt mode of operation, strongly influencing the igbt conduction mode.

fig. 2 first integration step: the reverse conducting rc igbt.
one of the implications of anode shorting is the voltage snap-back, as referred to previously, which is observed as a negative resistance region in the device igbt mode i-v characteristics. this effect will have a negative impact when devices are paralleled, especially at low temperature conditions. to resolve this issue, a second integration step was required. it has been shown that the initial snap-back can be controlled and eliminated by introducing wide p+ collector/anode regions into the device, also referred to as a pilot-igbt. this approach resulted in the bigt concept, which is in principle a hybrid structure consisting of an rc-igbt and a standard igbt in a single chip, as shown in fig. 3.

fig. 3 second integration step: the bimode insulated gate transistor bigt.

3. the bigt design trade-off challenges

3.1. the bigt snapback effect

despite the fact that the n-type shorting regions have contributed to many advantages as explained earlier, the major drawback is related to the snap-back effect in the forward i-v characteristics. nevertheless, this effect has been strongly minimized in the bigt hybrid design with the introduction of the p+ collector/anode pilot region. the main target of this combination is to eliminate snap-back behavior at low temperatures in the bigt transistor on-state mode by ensuring that hole injection occurs at low voltages and currents from the p+ pilot region in the igbt section of the bigt. nevertheless, further optimization was still required in relation to the shorting layout design. a radial shorting layout in relation to the pilot region [7] has shown optimum on-state curves with a smoother increase in the current when compared to a stripe shorting design in parallel to the p+ pilot region, as shown in fig. 4 for a 4.5kv/50a bigt device.
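since snap-back appears as a negative resistance region of the on-state curve, it can be located in sampled i-v data by looking for intervals where the voltage falls while the current rises. the sketch below is only an illustration of that criterion: the function name `find_snapback` is hypothetical and the two curves are invented numbers, not measured bigt or rc-igbt data.

```python
def find_snapback(points):
    """Return (start, end) index pairs where voltage falls while current rises.

    points: list of (voltage_v, current_a) tuples ordered by increasing current.
    """
    regions = []
    start = None
    for i in range(1, len(points)):
        dv = points[i][0] - points[i - 1][0]
        di = points[i][1] - points[i - 1][1]
        negative_resistance = di > 0 and dv < 0
        if negative_resistance and start is None:
            start = i - 1                      # region opens at the previous sample
        elif not negative_resistance and start is not None:
            regions.append((start, i - 1))     # region closed at the previous sample
            start = None
    if start is not None:
        regions.append((start, len(points) - 1))
    return regions


# Invented (voltage, current) samples for illustration only.
with_snapback = [(0.0, 0.0), (1.0, 2.0), (2.0, 5.0), (1.6, 8.0), (1.9, 15.0), (2.5, 30.0)]
smooth = [(0.0, 0.0), (0.8, 2.0), (1.2, 8.0), (1.8, 20.0), (2.5, 40.0)]

print(find_snapback(with_snapback))
print(find_snapback(smooth))
```

a curve with a smooth transition returns an empty list, while each snap-back (initial or secondary) shows up as one index interval.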
fig. 4 reducing the snap-back with the bigt hybrid design and radial shorting layout (collector current (a) versus on-state voltage (v); curve annotations: pilot igbt operation, secondary snap-backs, smooth transition).

3.2. igbt mode versus diode mode losses

for the bigt losses optimization, the main challenge was to enable low diode mode recovery losses while not having a considerable effect on the transistor mode on-state losses. a three step approach is utilized to achieve this target. the first step is the fine control of the doping profiles of the emitter p-well cells and collector/anode short regions. as shown in fig. 2, the enhanced planar cell technology exhibits low injection levels and a compensation effect due to the enhancement n-layer. these two features provide the bigt with a fine pattern p-well profile for obtaining low injection efficiency for a better diode performance. the second optimization step employs a local p-well lifetime (lpl) control technique utilizing a self-aligned and well-defined particle implantation which further reduces the diode recovery without degrading the transistor losses trade-off curve and blocking characteristics. the final adjustment of the reverse recovery losses is achieved with a uniform local lifetime control employing proton irradiation. further reductions in diode recovery losses can be obtained by applying a mos gate control during diode-mode conduction and switching [6].

3.3. the bigt charge uniformity

the structured collector/anode of a bigt with p+ and n+ areas introduces non-uniformities in the lateral charge distribution which have been studied with the aid of device simulation. while the fine patterning of the rc-igbt region (see fig.
2) does not change the overall charge distribution much, the pilot-igbt region significantly modifies the bigt charge and current distribution compared to an igbt. during the igbt conduction mode, the p+ pilot acts as a large non-shorted region having very strong anode injection; therefore the electron-hole plasma and the current density are highest in the pilot region, as shown in fig. 5 [8]. the junction temperature distribution is also affected by this and is highest in the region of the pilot-igbt, which is placed in the middle of the device for this reason. conversely, the pilot region has the lowest carrier plasma density during diode conduction. during the igbt mode turn-off, the high plasma concentration in the pilot region triggers early dynamic avalanche at the mos cells located above the same region. the described charge inhomogeneity is mainly pronounced at lower temperatures and low device currents. at the critical soa conditions, the difference between the rc-igbt and pilot-igbt regions is reduced, which results in a dynamic avalanche behavior similar to that of the corresponding igbt.

fig. 5 hole density during igbt on-state conduction and turn-off of a reverse-conducting igbt compared to the bigt, showing the effect of carrier plasma non-uniformity and the occurrence of dynamic avalanche (panels: on-state and turn-off, for the rc-igbt and the bigt; the pilot-igbt region is marked).

3.4. the bigt diode mode softness

the diode softness challenge is mainly due to the fact that the diode silicon generally does not match the igbt silicon for obtaining soft recovery performance. such conflicting requirements could result in snappy diode mode behavior in an integrated structure. to resolve this critical issue, the anode shorts inherently have a switching behavioral feature which provides the bigt with very soft turn-off characteristics, as described in the following section.

4. the bigt trade-off advantages

4.1.
reverse bias and leakage current

the presence of the n-type shorts in the bigt has a large impact on lowering the leakage current. the n-type areas provide a direct path for electrons during reverse blocking conditions, therefore no or very little hole injection occurs. fig. 6 shows thermal stability comparisons at different temperatures for 6.5kv rated igbts and bigts with two anode designs. the bigt clearly demonstrates improved thermal stability at higher temperatures when compared to the igbt, even with very high anode injection efficiencies [9]. the anode shorts remove the influence of the bipolar gain on the leakage current to a large extent. as a result, the leakage current is suppressed and its increase with temperature is reduced. in addition, the anode strength does not influence the leakage current in the bigt, in contrast to an igbt structure. however, holes can still be injected due to the lateral voltage drop when a high leakage current is flowing over large/wide p-doped anode areas, but this was not observed in practical designs even with a pilot igbt region occupying around 20% of the collector area.

fig. 6 6.5kv bigt and igbt thermal stability curves (leakage current (ma), logarithmic scale from 0.01 to 10, versus temperature (ºc) from 75 to 175; curves: igbt, igbt 2x anode, bigt, bigt 2x anode).

4.2. igbt and diode mode turn-off softness

as mentioned in the previous section, with regard to the bigt softness in diode as well as igbt turn-off modes, an inherent effect in the bigt similar to the field charge extraction fce diode [10] has ensured soft performance under all operating conditions. due to the presence of anode shorts in the bigt, the lateral current flowing above the large pilot-igbt area forward biases the p-n junction, and additional hole injection produces a small tail current providing the required softness and causing only a minimal increase of the switching losses.
as a typical example, for the igbt mode of operation, this approach means that stronger anode injection is no longer required merely to provide softer performance at the expense of higher losses, higher leakage currents and strong dynamic avalanche conditions, as is the case with igbts. fig. 7 shows the igbt mode turn-off for 6.5kv devices. the bigt clearly exhibits a soft tail with no abrupt drop in the current during the later stages of the turn-off event, unlike the igbt. the same effect is also present in the diode mode. here it is of more importance due to the fact that the n-base region design of the bigt is similar to that of the igbt and is not optimized for diode operation. a standard diode using an igbt n-base region design with a low punch through voltage would be very susceptible to snappiness even under nominal conditions. because of the fce effect induced by the anode shorts, the diode turns off softly and without any visible snap-off. fig. 8 shows diode turn-off waveforms at the most critical conditions of low current and low temperature (-40ºc).

fig. 7 6.5kv/600a igbt (left) and bigt (right) module turn-off waveforms under nominal conditions (voltage [v], current [a] and vge [v] x 10 versus time [us]; ic=600a, vc=3600v, tj=125c, ls=300nh).

fig. 8 6.5kv/600a bigt module diode-mode turn-off waveforms at critical low current conditions at -40ºc (voltage [v] and current [a] versus time [us]; diode turn-off, 4500v, 50a-600a).

4.3.
the bigt short circuit

in addition, the high local anode injection levels needed in the presence of n-type shorts have brought about improvements in the short circuit soa capability of the bigt. the bigt will normally require higher anode p-region doping concentrations compared to an igbt anode to obtain the same overall injection efficiency and hence the same on-state voltage drop and turn-off losses. in spt designs, an important current dependent failure mode occurs during the short circuit current pulse; it is mainly dependent on the charge compensation effect near the buffer region of the igbt, which in turn depends on the anode and buffer design [11]. under high vge and/or lower operating temperatures, the resulting higher short circuit current will limit the short circuit soa (scsoa) capability. in a bigt, the higher anode p-region doping provides improved charge compensation and hence higher scsoa. fig. 9 shows the short circuit test of a 3300v/50a bigt chip and a reference igbt chip at 25°c and a gate voltage of 18v.

fig. 9 3.3kv igbt and bigt single chip short circuit type 1 waveforms at room temperature.

in both devices, the mos cell, anode and buffer are designed to give similar short circuit current and turn-off losses. while the bigt has a faster turn-on behavior resulting in a higher overshoot current, it is still capable of withstanding this test at a dc-link voltage of 1800v, compared to the igbt which already fails at 900v. the bigt chip is also capable of passing the test for higher gate voltages up to 19.5v. in addition to the advantages discussed previously, this feature in the bigt design provides further flexibility for the design trade-offs required to tailor the bigt for improved overall performance.

5. conclusions

the bigt concept is foreseen to play an important role in many future power electronics applications.
hence, it is important to understand the design trade-off improvements and challenges presented by the bigt device concept when compared to state of the art igbts and diodes. this paper has presented a comprehensive review of these design aspects based on published and newly obtained results for a high voltage bigt. references [1] h. takahashi, a. yamamoto, s. anon, t. minato, "1200v reverse conducting igbt", in proc. int. sym. on power semiconductor devices & ic's ispsd`04, kitakyushu, japan, p. 133. [2] s. voss, f-j. niedernostheide, h-j. schulze, "anode design variations in 1200v trench field-stop rc igbts", in proc. int. sym. on power semiconductor devices & ic's ispsd`08, orlando, usa, 2008, pp. 169-172. [3] m. rahimo, a. kopta, u. schlapbach, j. vobecky, r. schnell, s. klaka, "the bi-mode insulated gate transistor (bigt) a potential technology for higher power applications", in proc. int. sym. on power semiconductor devices & ic's ispsd`09, barcelona, spain, 2009, pp. 283-286. [4] s. dewar, s. linder, c. von arx, a. mukhitinov, g. debled, "soft punch through (spt), setting new standards in 1200v igbt", in proc. pcim`00 conference, nurnberg, germany, 2000. [5] j. vobecky, m. rahimo, a. kopta, s. linder, "exploring the silicon design limits of thin wafer igbt technology: the controlled punch through (cpt) igbt", in proc. int. sym. on power semiconductor devices & ic's ispsd`08, orlando, usa, 2008, pp. 76-79. [6] m. rahimo, u. schlapbach, a. kopta, j. vobecky, d. schneider, a. baschnagel, "a high current 3300v module employing rcigbts setting a new benchmark in output power capability", in proc. int. sym. on power semiconductor devices & ic's ispsd`08, orlando, usa, 2008, pp. 68-71. [7] l. storasta, a. kopta, m. bellini, m.t. rahimo, u. vemulapati, n. kaninsky, "the radial layout design concept for the bi-mode insulated gate transistor", in proc. int. sym. on power semiconductor devices & ic's ispsd`11, san diego, usa, 2011. [8] d. wigger, d. weiss, h-g. 
eckel, "impact of inhomogeneous current distribution on the turn-off behaviour of bigts", in proc. pcim 2013, pp. 860-867. [9] l. storasta, s. matthias, m.t. rahimo, a. kopta, "bipolar transistor gain influence on the high temperature thermal stability of hv-bigts", in proc. int. sym. on power semiconductor devices & ic's ispsd`11, bruges, belgium june 2012, pp. 157-160. [10] a. kopta, m. rahimo, "the field charge extraction (fce) diode, a novel technology for soft recovery high voltage diodes", in proc. int. sym. on power semiconductor devices & ic's ispsd`05, santa barbara, usa, 2005, pp. 83-86. [11] a. kopta, m. rahimo, u. schlapbach, n. kaminski, d. silber, "limitation of the short-circuit ruggedness of high-voltage igbts", in proc. int. sym. on power semiconductor devices & ic's ispsd`09, barcelona, spain, 2009, pp. 33-36. instruction facta universitatis series: electronics and energetics vol. 29, n o 4, december 2016, pp. 621 629 doi: 10.2298/fuee1604621s electromagnetic pulse effects and damage mechanism on the semiconductor electronics  vladimir vasilevich shurenkov, vyacheslav sergeevich pershenkov microelectronic department, national research nuclear university mephi (moscow engineering physics institute), moscow, russian federation abstract. in recent years, growing attention has been paid to the threat posed by highpower microwave electromagnetic interference, which can couple into semiconductor electronic devices intentionally from microwave sources or unintentionally due to the proximity to general environmental hf signals. this paper examines physical mechanism of malfunction and destruction of electronic devices by high power microwaves electromagnetic pulse. key words: electromagnetic pulse, emp sources, electronic component, damage, coupling mechanisms, susceptibility 1. introduction in recent years, the use of electronics, its components and assemblies has constantly increased. 
evidently, there is a risk that disturbing signals are generated inside an electronic system by the injection of external electromagnetic fields. growing attention must also be paid to the threat posed by electromagnetic pulse radiation, which can couple into electronic devices intentionally or unintentionally due to the proximity of general environmental high-frequency signals, usually in the microwave range. a breakdown or destruction of the electronic systems would be inconceivable and devastating. electromagnetic radiation may cause permanent damage to semiconductor devices [1]-[8]. this paper discusses the various electromagnetic pulse (emp) effects and damage mechanisms affecting the performance of semiconductor electronics. the authors draw on the results of their own investigations and on data from the literature. the paper is organized as follows (fig. 1). section 2 presents the types and the main parameters of emp sources. the coupling mechanisms between the emp device output and the target system are defined in section 3. the induced effects on the target are discussed in section 4. the main physical mechanisms of the emp effects are discussed in section 5.

received march 31, 2015; received in revised form january 29, 2016 corresponding author: vladimir vasilevich shurenkov microelectronic department, national research nuclear university mephi (moscow engineering physics institute), moscow, 115409, russian federation (email: vvshurenkov@mephi.ru)

fig. 1 paper′s structure.

2. emp sources

it is well established now that sufficiently intense emp in the frequency range of 200 mhz to 5 ghz can cause upset or damage in electronic systems [1], [4], [5], [8]. this range is of particular interest because it is extensively populated by radars, television broadcasting, mobile communications, high power microwave (hpm) sources, etc.
this induced effect in an electronic system is commonly referred to as intentional or unintentional electromagnetic irradiation (emi). such emi can be radiated or conducted [4]. an electromagnetic pulse, also sometimes called a transient electromagnetic disturbance, is a short burst of electromagnetic energy. the waveform of an em pulse describes how its instantaneous amplitude (of the field strength or current) changes over time. real pulses tend to be quite complicated, so simplified models are often used in theoretical and experimental studies. usually such models assume a rectangular or "square" pulse. pulses are typically characterized by:
• the type of energy (radiated, electric, magnetic or conducted);
• the range or spectrum of frequencies present;
• the pulse waveform: shape, duration and amplitude.
one way of classifying emi is based on the frequency content of its spectral density, as “narrowband” or “wideband”. the frequency spectrum and the pulse waveform are interrelated via the fourier transform. an emp typically contains energy at many frequencies from dc (zero hz) to some upper limit depending on the source. the shortness of the pulse means that it will always be spread over a range of frequencies. types of emp divide broadly into natural, man-made and weapons (fig. 2) [1], [4]. nowadays, it is practicable to generate transient wideband (wb) pulses with high amplitude and very short rise times. as a radiated electromagnetic field, these pulses affect the function of modern electronics. the microwave interference is often considered to have a pulse width ranging from several to several hundred nanoseconds.

fig. 2 emp characteristics.

3. the coupling mechanisms

the damage to electronic devices is determined by the amount of energy that is transferred while the electronic devices are coupled with the electromagnetic environment.
the electromagnetic coupling is a mechanism by which the microwave energy is delivered to equipment through a circuit line [1], [3], [7]. all electronic equipment is susceptible to malfunctions and permanent damage under electromagnetic radiation of sufficient intensity. the intensity level for system vulnerability depends on the coupling from the external fields to the electrical circuits and on their corresponding sensitivity characteristics. a temporary malfunction (or upset) can occur when an electromagnetic field induces currents and voltages in the electronic circuits of the system in operation at levels that are comparable to the normal operating signals. no matter what kind of hf source is used or which power, frequency or mode is applied, two principal coupling modes are recognized in the literature for assessing how much power is coupled into target systems:
• front door coupling (fdc)
• back door coupling (bdc)

front door coupling

the fdc is typically observed when the power radiated from the hf source is directly coupled into the electronic systems [6] (fig. 3). but more often the antenna subsystem is designed to couple hf power in and out of the equipment, and thus provides an efficient path for the power flow from the electromagnetic source to enter the equipment and cause damage.

fig. 3 scheme of the experimental research on the influence of microwave radiation on diodes.

back door coupling

the bdc occurs when the electromagnetic field from the hf source produces large transient currents (termed spikes, when produced by a transient source) or electrical standing waves through cracks, small apertures and via the fixed electrical wiring and cables interconnecting the equipment, or providing connections to the power mains or the telephone network [1], [8] (fig. 4).
the bdc can generally be described as wideband, but it may have narrow-band characteristics because of resonance effects (the coupling to cables, for example).

fig. 4 the coupling mechanism.

the bdc creates voltages on the traces and wires that superimpose on the normal signals and enter the device terminals. although the circuit traces and wires are not designed specifically to transmit and receive signals, they introduce parasitic resonances in the systems that reduce the level of hf power required to stimulate emp effects. microwave radiation from emp devices creates high voltage standing waves on the fixed wiring infrastructure. the equipment connected to exposed cables or wiring will experience either high voltage transient spikes or standing waves, which can damage the semiconductor devices (table 1).

table 1 spikes damage semiconductor devices
type of semiconductor device | breakdown voltage range
silicon high frequency bipolar transistors | 15 v - 65 v
gallium arsenide field effect transistors | 10 v
high density dynamic random access memories (dram) | 7 v
generic cmos logic | 7 v - 15 v
microprocessors running off 3.3 v or 5 v power supplies | 3.3 v - 5 v

since the impinging emp field has a broad frequency spectrum and a high field strength, the antenna response must be considered both in and out of band. the inadvertent, unintended or parasitic antennas are electrically penetrating conducting structures, power lines, communication cables and ac pipes that collect emp energy and allow its entry into the enclosure. the cavity fields are another important aspect of emp effects. emp radiation will penetrate enclosures such as computer cases and excite field distributions according to the resonant modes of the structure.
predicting these field distributions deterministically is difficult due to the complexity of the em boundary conditions that are typical of even the most basic electronic enclosure. the dimensions of the enclosures and the corresponding em boundaries are often many times greater than the wavelength of the emp radiation. thus, the structures support numerous modes that are typically closely spaced in frequency. with emp sources, the standing waves on the wiring which enters the equipment cavities may contribute to exciting spatial resonances within the equipment cavity itself. further complicating the analysis of em fields is the fact that the em boundaries are rarely static. small changes due to motion, vibration, or temperature may substantially alter the field distribution. the form of the electromagnetic pulse in the cavity is usually a damped sinewave. a damped sinewave couples energy over a relatively narrow frequency band. openings such as doors, windows, utility lines and holes, improperly terminated cable shields and poorly grounded cables can couple emp energy directly into the shielded enclosure. the leakage through an aperture depends on its size, the type of the structure housing it and its location. the aperture responds to both the electric and magnetic fields. the microwave radiation from emp devices has the ability to couple directly into the shielded equipment cavities through ventilation holes and poorly sealed panels. gaps or holes can behave as slot radiators provided that they are comparable in size to the wavelength of the radiation. panels which are not conductively sealed around their edges may also resonate when excited by the microwave radiation and directly couple the energy into the cavity. the spatial standing wave pattern will exhibit potentially large field strengths at its antinodes, and semiconductor components located there are exposed to such fields.
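the statement that an electrically large enclosure supports many closely spaced modes can be illustrated with a short python sketch. it is not taken from the paper: it assumes an ideal, empty rectangular cavity with perfectly conducting walls, for which the resonant frequencies are $f_{mnp} = \frac{c}{2}\sqrt{(m/a)^2 + (n/b)^2 + (p/d)^2}$ with at least two non-zero indices. the function name and the enclosure dimensions are hypothetical, and a real equipment case with boards, cables and apertures will deviate from this idealisation.

```python
import math
from itertools import product

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def cavity_mode_frequencies(a, b, d, f_max):
    """Resonant frequencies (Hz) up to f_max of an ideal a x b x d (m) cavity.

    Assumption: empty rectangular cavity with perfectly conducting walls.
    Each (m, n, p) index triple is listed once (TE/TM degeneracy ignored).
    """
    # Largest index that can still give a frequency below f_max.
    n_max = int(2.0 * max(a, b, d) * f_max / C0) + 1
    freqs = []
    for m, n, p in product(range(n_max + 1), repeat=3):
        # A resonant mode needs at least two non-zero indices.
        if (m > 0) + (n > 0) + (p > 0) < 2:
            continue
        f = 0.5 * C0 * math.sqrt((m / a) ** 2 + (n / b) ** 2 + (p / d) ** 2)
        if f <= f_max:
            freqs.append(f)
    return sorted(freqs)


# Hypothetical computer-case-sized enclosure, evaluated up to 5 GHz.
modes = cavity_mode_frequencies(0.45, 0.20, 0.40, 5.0e9)
print("index triples below 5 GHz:", len(modes))
print("lowest resonance [MHz]:", modes[0] / 1e6)
```

because each index triple is counted once, the result is a lower bound on the number of modes; it is only meant to show the order of magnitude of the mode density in the 200 mhz to 5 ghz range discussed in section 2.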
in order to analyze the coupling process of electromagnetic pulses into electronic systems, transverse electromagnetic (tem) waveguides are used. the test area inside the tem waveguide is large enough to position a test setup of the electronic circuits. combined with different pulse generators, the tem waveguide allows the generation of reproducible transient electromagnetic pulses with a defined field strength and pulse form.

4. the effects on targets

the hf source interactions with the system electronics can be categorized into four levels of destructive effect: upset, lock-up, latch-up, and burnout [1,5,8,9]. these four potential effects of the hf source on targets form a hierarchy of damage, each level of which requires increasing microwave emission on the target.

4.1. soft-kill

a soft kill is produced when the effects of the hf source cause the operation of the target equipment or system to be temporarily disrupted. the soft kill can occur in two forms:
a. upset: the upset is a temporary alteration of the electrical state of one or more nodes, in which the nodes no longer function normally. the upset continues until the impressed radiation is terminated. once the signal is removed, the affected system can be easily restored to its previous condition.
b. lock-up: the lock-up is similar to the upset in that the electrical states of the affected nodes are temporarily altered, but the functionality of these nodes remains altered after the radiation is removed. an electrical reset or a shut-off and restart is necessary to regain full functionality.

4.2. hard-kill

a hard kill is produced when the effects of the hf source cause permanent electrical damage to the target equipment or system, necessitating either the repair or the replacement of the equipment or system in question.
the hard kill can be seen in two forms:
a. latch-up: the latch-up is an extreme form of lock-up in which parasitic elements are excited and conduct current in relatively large amounts until either the node is permanently destroyed or the electrical power to the node is switched off.
b. damage/burnout: the damage/burnout is an electrical destruction of a node by some mechanism like latch-up, metallization burnout, or junction burnout. the damage/burnout occurs when the high-power microwave energy causes melting in capacitors, resistors or conductors. the burnout mostly occurs in the junction region where multiple wires or the base, collector or emitter of a transistor come together, and it often involves electrical arcing. consequently, the heating is localized to the junction region. permanent damage can occur when these induced stresses are at levels that produce joule heating to the extent that thermal damage occurs (usually between 600 and 800 kelvin).

5. the physical mechanisms of damage

in our opinion there are four main physical factors of damage to semiconductor structures and their parameters under emp radiation.

5.1. the induced voltages and currents

the semiconductor devices can experience serious failures and malfunctions caused by overcurrent and overvoltage when a reverse voltage is applied to the pn junction region (fig. 5), or by thermal secondary breakdown caused by the high power microwaves, as the devices mostly consist of integrated circuits and microelectronics, which are sensitive to microwaves [5]. another emp thermal problem is a major factor affecting the reliability of the electronic system: the high voltage transients, large current flows, etc., can exceed the designed levels of thermal dissipation in the device and cause thermal runaway [2].

fig.
5 the large loop area enclosed by faulty pcb layout scheme.

5.2. the arcing (the air and dielectric breakdown)

arcing is very common in the high-level pulsing of circuits. it can occur wherever two conductors are close together and there is a high voltage between them (such as that produced by a high level pulse entering the system) (fig. 6). the arcing can occur near the port entry point, or deeper within the system. generally, it is better if the arcing is not deep within the system [7].

fig. 6 arcing occurring within the system.

the air breakdown occurs at a level of about 1 kv for a 1 millimeter air gap at normal air pressure. the breakdown level may be affected by water vapor, and by dust or debris that may have accumulated.

5.3. a latch-up

latch-up is the inadvertent creation of a low-impedance path between the power supply rails of an electronic component, triggering a parasitic structure, which then acts as a short circuit, disrupting the proper functioning of the part and possibly even leading to its destruction due to overcurrent. all ics are made by combining adjacent p-type and n-type regions into transistors. paths other than those chosen to form the desired transistor can sometimes result in so-called parasitic transistors, which, under normal conditions, cannot be activated. the parasitic structure is usually equivalent to a thyristor (or silicon controlled rectifier, scr), a pnpn structure which acts as a pnp and an npn transistor stacked next to each other (fig. 7) [4], [6].

fig. 7 the parasitic pnpn bipolar component in cmos gate.

during a latch-up, when one of the transistors is conducting, the other one begins conducting too. they both keep each other in saturation for as long as the structure is forward-biased and some current flows through it, which usually means until a power-down.
the scr parasitic structure is formed as a part of the totem-pole pmos and nmos transistor pair on the output drivers of the gates. latch-up is a phenomenon in which the impedance between the power supply and ground becomes low when a vertical pnp (or npn) transistor and a lateral npn (or pnp) transistor, which together create a parasitic pnpn bipolar structure, operate simultaneously. when latch-up occurs, a large current suddenly flows between vdd and vss.

5.4. the induced emi recombination currents

in our earlier publications [6,11] we discussed the increase of the recombination current induced by a strong electric field in the p-n junction (fig. 8). the same increase of the recombination current has been observed in a heterojunction bipolar transistor [10]. but it is necessary to note that this effect takes place only at low emp power levels.

fig. 8 the i-v characteristics of the p-n junction under microwave radiation.

6. conclusion

the main physical factors of the damage to the semiconductor structure and its parameters under emp radiation are:
• the overcurrent and overvoltage in loop areas of the layout schemes;
• the air and dielectric breakdown between closely spaced conductors in the schemes;
• a latch-up of the parasitic thyristor structure;
• the induced emi recombination currents.
the specific mechanism depends on the distance from the emp source to the target, the operating frequency, the burst rate and pulse duration, the bandwidth, the vulnerability of the target, the emp power, the coupled power level, and the coupling mode or entry points. a detailed study of these induced effects of microwaves on information electronic instruments is therefore needed. protective measures are also needed to exclude emp effects on semiconductor electronics.
of the classical protective measures (grounding, shielding and filtering), only shielding and grounding may be used, because they reduce any coupling.

references
[1] c. kopp and r. pose, “the impact of electromagnetic radiation considerations on computer system architecture”, dept. of computer science, monash university, clayton, victoria, australia.
[2] v. lakshminarayanan, “basic steps to successful emc design”, r.f. design; sep 1999; 22, 9.
[3] hypothetical electromagnetic bomb http://www.edi-info.ir/files/hypothetical-electromagnetic-bomb_hob6ta2e.pdf.
[4] high power microwave technology and effects, university of maryland short course presented to msic redstone arsenal, alabama august 8-12, 2005.
[5] s. m. hwang, j. i. hong, and c. s. huh, “characterization of the susceptibility of integrated circuits with induction caused by high power microwaves”, progress in electromagnetics research, pier, vol. 81, pp. 61-72, 2008.
[6] v.v. shurenkov, “on the physical mechanism of interaction of the microwave radiation with the semiconductor diodes”, advanced materials research, vol. 1016, pp. 521-525, 2014.
[7] j. hong, s.-m. hwang, k.-y. kim, c.-s. huh, u.-y. huh, and j.-s. choi, “susceptibility of ttl logic devices to narrow-band high power”, piers online, vol. 5, no. 8, 2009.
[8] x. fei, c. bing and l. chenglong, “damage efficiency research of pcb components under strong electromagnetic pulse”, applied mechanics and materials, vol. 130-134, pp. 1383-1386, 2012.
[9] m. rohe, s. korte, m. koch, “simulation of the destruction effects in cmos-devices caused by impact of fast transient electromagnetic pulses”, in proc. of the comsol conference 2008, hannover.
[10] a. aladdin, m. kadi, k. daoud, h. m. p. eudeline, “study of electromagnetic field stress impact on sige heterojunction bipolar transistor performance”, international journal of microwave and wireless technologies, vol. 1, no. 6, pp. 475-482, 2009.
[11] a.n.
didenko, v.v. shurenkov, “modeling of interaction of the vhf radiations with diode structures”, engineering physics n4, pp. 16-19, 2001.

facta universitatis series: electronics and energetics vol. 34, no 2, june 2021, pp. 173-185 https://doi.org/10.2298/fuee2102173r © 2021 by university of niš, serbia | creative commons license: cc by-nc-nd

review paper

on the efficiency of energy storage systems – the influence of the exchanged power and the penalty of the auxiliaries

alfred rufer
epfl, ecole polytechnique fédérale de lausanne, lausanne, switzerland

abstract. storage is an important domain of the energy sector, with its traditional, classical solutions for smaller and larger amounts of energy. energy storage has gained importance with the development of alternative energy sources, leading to the development of new technologies. the energy efficiency of the storage means is an important parameter that is often not considered in the conception and design of applications. for the evaluation of the energetic performance of a storage device, a well-adapted tool has been proposed, namely “the theory of ragone plots”. this tool shows how the effectively recoverable energy of a device depends on the power level of the charge/discharge process. further, taking into account the power needed for the auxiliary equipment of a storage system, such as the circulation pumps of a flow battery, the vacuum pumps of a flywheel or the forced cooling of a battery, can lead to a globally negative value of the efficiency.

key words: energy storage, efficiency.

1. introduction

high performance solutions for the accumulation of energy to cover the needs of applications have existed for a long time. mechanical solutions invented by watchmakers or by manufacturers of film cameras are good examples of a pragmatic way to elaborate a solution to a specific problem.
higher amounts of energy have also been stored in the form of water pumped from a lower to a higher reservoir, as is widespread in alpine regions. mobile as well as stationary applications have been the context for the development of a long list of different electrochemical accumulators, from the classical lead-acid batteries to today's high-performance lithium-based metallic associations. all these solutions have mainly suffered from limited life cycle or other ageing phenomena such as the loss of energy capacity or power availability.

received may 4, 2021 corresponding author: alfred rufer epfl, ecole polytechnique fédérale de lausanne, ch 1015 lausanne, switzerland e-mail: alfred.rufer@epfl.ch

other solutions have appeared over the years, such as flywheels, compressed-air systems and superconductive magnets, but have never reached a breakthrough point, due to limited performance, high costs, or the lack of adapted materials or infrastructure. limited investment from the industrial world is another reason for the stagnating evolution of storage alternatives, in the context of the largely available and cheap energy resources of the 20th century. environmental concerns and limited fossil resources have triggered the development of renewable sources, where the stochastic character of solutions such as wind and solar generators has been a new motivation for the development of new and better-performing energy storage solutions. today, modern li-ion accumulators can be seen as the most promising storage solutions for limited amounts of energy at the level of several mwh, while other better adapted solutions such as supercapacitors can solve the problem of the instantaneous power demand with lower internal losses.
life cycle issues will remain a strong motivation for the development of evolved solutions beyond the current electrochemical battery, and totally different approaches such as the chemical transformation into hydrogen or methane will in the future be the real alternative to pumped hydro power, allowing longer bridging through the seasons due to their much higher energy content [1], [2].

fig. 1 examples of storage over a wide range of power and energy amount (panels a, b, c)

figure 1a illustrates the evolution seen in the domain of wrist watches, with at the right side a classical self-winding movement with an autonomy of 8 days. this former world record has recently been raised to 50 days [3]. the second object is an electronic watch with analog and digital display and many new functions such as altimeter, compass, etc. the autonomy of the battery powered watch is given as 24 months. the last watch is a connected device with a greatly expanded number of new functions. its autonomy has fallen to around ten to twenty hours. the second line (fig. 1b) shows one of the most eccentric applications of storage for electric mobility, the first airplane able to fly overnight with energy collected from pv panels during the day (solarimpulse, [4]). the third line (fig. 1c) represents a pumped storage plant in switzerland (nant de drance, [5]). its energy capacity is 18’000 mwh, corresponding to an autonomy of 20 hours at the rated power of 900 mw. the evolution of public grids from the conventional concept of centralized generation towards decentralized generation, with the integration of renewable sources as well as decentralized storage facilities, has recently accelerated. the concept of adding storage systems in order to achieve the so-called day-to-night shift or to replace diesel generators has been called “the hybrid power plant” [6], (fig. 2).
fig. 2 the hybrid power plant

as already mentioned, the different storage systems cover a very large range of power and allow energy to be stored or released over up to 10^3 hours, as represented in fig. 3. the diagram covers a very wide range of power, over six decades, from 1 kw to 1 gw.

fig. 3 ratings of different storage systems (axes: power from 1 kw to 1 gw and time from 0.1 h to 10^3 h; regions: bridging power, energy management, seasonal storage; technologies shown: pumped hydro, caes, smes, flywheels, li-ion, ni-cd, lead-acid batteries, nas batteries, metal-air, flow batteries, h2, power-to-gas (ch4))

2. four main parameters for the characterization of a storage device

for the characterization of a storage device, and especially for mobile systems, not only the amount of stored energy is relevant; the power level of the charge and discharge processes must also be defined. thus, the energy density and the power density must be specified. more precisely, the volume and weight densities are generally specified as represented in table 1.

table 1 four main parameters of energy storage devices
parameter | symbol | unit
the volume energy density | ev | [wh/dm3]
the weight energy density | em | [wh/kg]
the volume power density | pv | [w/dm3]
the power-to-weight ratio | pm | [w/kg]

as an alternative to specifying the amount of energy to be stored or recovered together with the power level of the energy exchange, one can define the power and the time of needed power delivery. the final result is equivalent. on this basis, engineers and manufacturers have proposed a simultaneous representation of the energy density and of the power density in the same diagram. such a diagram is called the “ragone chart”. fig. 4 represents in the same ragone chart the main parameters of different storage solutions [7].
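the equivalence between specifying (energy, power) and specifying (power, delivery time) can be made concrete with a small python sketch. it is only an illustration: the function names and the numerical values are hypothetical and are not taken from the ragone chart of the paper. the characteristic time of a point in the chart is simply the ratio of its energy density to its power density.

```python
def discharge_time_h(energy_density_wh_per_kg, power_density_w_per_kg):
    """Characteristic discharge time (h) of a point in a Ragone chart."""
    return energy_density_wh_per_kg / power_density_w_per_kg


def densities_from_power_and_time(power_w, time_h, mass_kg):
    """Convert a (power, delivery time) specification into chart coordinates.

    Returns (energy density in Wh/kg, power density in W/kg).
    """
    energy_wh = power_w * time_h
    return energy_wh / mass_kg, power_w / mass_kg


# Hypothetical device: 5 Wh/kg delivered at 5000 W/kg lasts a few seconds.
t_h = discharge_time_h(5.0, 5000.0)
print("discharge time [s]:", t_h * 3600.0)

# Hypothetical specification: 1 kW for 2 h from a 10 kg device.
print(densities_from_power_and_time(1000.0, 2.0, 10.0))
```

in a chart with logarithmic axes, points with the same characteristic time lie on a straight diagonal line, which is why both ways of specifying a device lead to the same position in the diagram.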
fig. 4 the ragone chart (axes: power density [w/kg], 1e00 to 1e07, and energy density [wh/kg], 0.01 to 1000; technologies shown: fuel cells, batteries, flywheels, supercapacitors, capacitors, smes)

3. electrochemical solutions versus systems from the classical physics

another aspect of energy storage devices or systems is their life duration. through specific parameters such as life cycle or lifetime, the value of a given technology can be evaluated, also in terms of the global costs of a given application. fig. 5 shows the characteristics of several common technologies, together with the related energy efficiency. from this figure, one can see that classical as well as modern batteries show good to very good energy efficiencies. but they suffer from limited life cycles. the main reason is the ageing phenomena related to electrochemical processes with ion migrations and transformations of the molecular structures. the storage technologies based on electrochemical transformations lie in the range of only several hundred to several thousand cycles, which is a limiting factor in the domain of applications to renewable energy sources. on the right side of figure 5, beyond the order of magnitude of ten thousand cycles, technologies with higher numbers of possible cycles are represented. they generally belong to the category of solutions based on reversible physics. in this category, the term macroscopic energy of a system is used, and the amount of accumulated energy in the system is related to its movement and to external effects such as gravity, magnetism or electricity. more explicitly, the category includes classical hydraulic pumped storage, flywheels, superconductive magnetic energy storage, supercapacitors and compressed air energy storage.
even if for these systems the number of possible cycles is several orders of magnitude higher than for electrochemical technologies, the system components are subject to ageing phenomena such as metal wear, friction and ageing of bearings or alteration of insulations. generally, the indicated lifetime of such systems includes revisions or replacement of sensitive sub-components.

fig. 5 efficiency and lifetime of energy storage solutions (axes: lifetime [cycles], 100 to 100'000, and efficiency [%], 40 to 100; technologies shown: metal-air, nicd, lead-acid, flow batteries, nas, li-ion, e.c. capacitors, flywheels, pumped hydro, caes, smes)

recently, the category of storage systems based on physics has been extended by an original proposal to realize “dry gravitational storage systems”. in such installations, the energy is stored in a stack of concrete blocks. the blocks are moved up and down with a multiple-arm crane system [8], [9]. figure 6 shows the so-called energy vault demonstration system currently under construction.

fig. 6 the energy vault system, a dry gravitational storage system

4. a general model for the efficiency

the quality of a storage system is quantified through its energy efficiency, taking into account the internal losses during charging and discharging and the self-discharge effect during the time the energy is held in the storage device, even if the exchanged power is set to zero. in some specific cases, the energy needed for the auxiliaries must be considered. these auxiliaries are, for example, the circulating pumps of a vanadium redox flow battery (vrb), the vacuum pumps of a flywheel for the reduction of the aerodynamic drag, or the cryogenic system of a superconductive magnetic energy storage device (smes). figure 7 shows the energy flow of a storage system where all the listed effects are represented.
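the contributions listed above (internal losses, self-discharge and auxiliaries) can be combined in a simple energy balance. the python sketch below is one possible bookkeeping and not the model of the paper: it assumes constant mean self-discharge and auxiliary powers, and it counts the auxiliary energy against the recovered energy. the class, the field names and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StorageCycle:
    e0_wh: float              # internally stored energy
    charge_loss_wh: float     # internal losses while charging
    discharge_loss_wh: float  # internal losses while discharging
    self_discharge_w: float   # mean self-discharge power while idle
    aux_w: float              # mean power drawn by the auxiliaries
    idle_h: float             # time the energy is held in the store
    active_h: float           # total charging plus discharging time


def round_trip_efficiency(c: StorageCycle) -> float:
    """Net recovered energy divided by the energy supplied for charging."""
    energy_to_be_stored = c.e0_wh + c.charge_loss_wh
    recovered = c.e0_wh - c.discharge_loss_wh - c.self_discharge_w * c.idle_h
    # Assumption: auxiliaries run during the whole cycle and are charged
    # against the output of the storage system.
    aux_energy = c.aux_w * (c.idle_h + c.active_h)
    return (recovered - aux_energy) / energy_to_be_stored


short_hold = StorageCycle(1000.0, 50.0, 50.0, 1.0, 20.0, idle_h=10.0, active_h=2.0)
long_hold = StorageCycle(1000.0, 50.0, 50.0, 1.0, 20.0, idle_h=100.0, active_h=2.0)
print(round_trip_efficiency(short_hold))
print(round_trip_efficiency(long_hold))
```

with these hypothetical numbers the second case returns a negative value: over a long holding time the auxiliaries consume more energy than the storage can return, which is the situation described in the abstract as a globally negative efficiency.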
for an internally stored amount of energy, the primary needed amount (energy to be stored) can be significantly higher. a typical example of this mechanism can be seen in the sector of electrical vehicles when so-called ultra-fast chargers are used [10]. similarly, at the output of the storage system, the recovered energy can be strongly reduced in comparison with the initially accumulated energy. in order to evaluate the different penalties, the next section briefly introduces the “theory of ragone plots”.

fig. 7 energy flow to and from a storage system (labels: energy to be stored, internally stored energy $E_0$, internal losses $E_{ch/disch}$, self discharge $E_{sd}$, auxiliaries $E_{aux}$, recovered energy)

5. the theory of ragone plots

as already explained in the previous section, the energy efficiency of a storage device is related to the different losses. charging and discharging losses as well as self-discharge losses directly influence the round-trip efficiency. as a consequence, the amount of energy which can really be recovered from a fully charged storage device has to be defined as a function of the instantaneous power of the energy transfer. this principle of interdependency between the energy density and the power density has been described under the name of «the theory of ragone plots» [11]. in this reference, a general circuit is associated with ragone plots (fig. 8). the energy storage device (esd) feeds a load with constant power $P$. the esd contains elements for energy storage. due to the constant power, energy supply occurs only for a finite time $t_{inf}(P)$. the energy $E$ available for the load as a function of the power $P$ defines a ragone plot.

fig. 8 general circuit associated with ragone plots (adapted from [11]): energy storage device feeding a constant power load

consider the general circuit of fig. 8. for example, the esd may consist of a voltage source $V(Q)$, depending on the stored charge $Q$, an internal resistor $R$, and an internal inductance $L$.
note that this esd can describe many kinds of electric power sources. the esd is connected to a load which draws a constant power p ≥ 0. such a load can be realized with an electronically controlled power converter feeding an external user. the current i and voltage u at the load are then related nonlinearly by u = p/i. provided reasonable initial conditions $q(0) = q_0$ and $\dot{q}(0) = \dot{q}_0$ are given, the electrical dynamics is governed by the following ordinary differential equation:

$$ l\ddot{q} + r\dot{q} + v(q) = -\frac{p}{\dot{q}} \qquad (1) $$

where the dot indicates differentiation with respect to time. this equation applies not only to electrical esd but covers many kinds of physical systems (mechanical, hydraulic, etc.). without making reference to a specific physical interpretation of rel. (1), the ragone curve can be defined as follows. at time t = 0, the device contains the stored energy

$$ e_0 = \frac{l\dot{q}_0^2}{2} + w(q_0) \qquad (2) $$

for t > 0, the load draws a constant power p such that q(t) satisfies the relation (1). it is clear that for finite e0 and p, the esd is able to supply this power only for a finite time, say tinf(p). this time is reached either when the storage device is empty or when the esd is no longer able to deliver the required amount of power. since the power is time independent, the available energy is

$$ e(p) = p\,t_{inf}(p) \qquad (3) $$

the curve e(p) versus p corresponds to the ragone plot.

5.1. the ragone plot of a battery

in this section, the particular case of an ideal battery is studied. first, and regarding the model leading to the rel. (1), we assume the condition l = 0. then, the ideal battery with a capacity of q0 is characterized by a constant cell voltage v = u0 if q0 ≥ q > 0 and v = 0 if q = 0. in a first step, the leakage resistor rl is neglected. rel. (1) reads: $p = u\,i = (u_0 - r\,i)\,i$, where u is the terminal voltage and $i = \dot{q}$ is the current.
the solutions of the quadratic equation are

$$ i_\pm = \frac{u_0}{2r} \pm \sqrt{\frac{u_0^2}{4r^2} - \frac{p}{r}} \qquad (4) $$

at the limit p → 0, the two branches correspond to a discharge current $i_+ \to u_0/r$ and $i_- \to 0$. for the ideal battery, the constant power sink can also be parametrized by a constant load resistance rload. the two limits then belong to rload → 0 (short circuit) and rload → ∞ (open circuit), respectively. clearly, in the context of the ragone plot, we are interested in the latter limit, such that we have to take the branch with the minus sign, $i = i_-$ in eq. (4). now the battery is empty at time tinf = q0 / i, where the initial charge q0 is related to the initial energy e0 = q0u0. it is now easy to include the presence of an ohmic leakage current into the discussion. the leakage resistance rl increases the discharge current i by u0/rl. the energy available for the load becomes:

$$ e_b(p) = p\,t_{inf} = \frac{2 r q_0\, p}{u_0 - \sqrt{u_0^2 - 4rp} + 2u_0 r/r_l} \qquad (5) $$

equation (5) corresponds to the ragone curve of the ideal battery. in the presence of leakage, eb(0) = 0. for the extracted energy, there exists a maximum at $p = u_0^2/\sqrt{2 r r_l}$. without leakage, r / rl → 0, the maximum energy is available for vanishingly low power, eb(p → 0) = e0. from eq. (5), one concludes that there is a maximum power, $p_{max} = u_0^2/(4r)$, associated with an energy e0/2 (a small correction due to leakage is neglected). this point is the endpoint of the ragone curve of the ideal battery, where only half of the energy is available while the other half is lost at the internal resistance. finally, the expression of the ragone plot is given in dimensionless units, using the normalized energy $e_b/(q_0 u_0)$ and the normalized power $4rp/u_0^2$, which are again written as $e_b$ and $p$:

$$ e_b(p) = \frac{1}{2}\,\frac{p}{1 - \sqrt{1-p} + 2r/r_l} \qquad (6) $$

ragone curves according to eq. (6) with and without leakage are shown in fig. 9 for the ideal battery. the branch belonging to i+ is plotted by the dashed curve.
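the ragone curve of the ideal battery can also be evaluated numerically. the following is a minimal sketch of eqs. (4) and (5), not code from the paper: the function name `ragone_battery`, the argument names and the numerical values are hypothetical, and si units are assumed.

```python
import math


def ragone_battery(p, u0, r, q0, rl=math.inf):
    """Energy available to a constant-power load p, following eq. (5).

    u0: cell voltage, r: internal resistance, q0: initial charge,
    rl: leakage resistance (math.inf means no leakage).
    All names are illustrative assumptions; SI units are assumed.
    """
    p_max = u0**2 / (4.0 * r)  # endpoint of the Ragone curve
    if not 0.0 < p <= p_max:
        raise ValueError("p must satisfy 0 < p <= u0**2 / (4 r)")
    # eq. (4), branch with the minus sign; max() guards against rounding
    disc = max(0.0, u0**2 - 4.0 * r * p)
    i_minus = (u0 - math.sqrt(disc)) / (2.0 * r)
    i_leak = u0 / rl  # leakage adds u0/rl to the discharge current
    t_inf = q0 / (i_minus + i_leak)  # time until the battery is empty
    return p * t_inf


if __name__ == "__main__":
    u0, r, q0 = 4.0, 0.5, 100.0  # hypothetical values, e0 = q0 * u0
    for p in (0.01, 0.1, 1.0, 8.0):
        print(p, ragone_battery(p, u0, r, q0), ragone_battery(p, u0, r, q0, rl=1000 * r))
```

at the maximum power the function returns half of e0 when leakage is absent, and with a finite rl the available energy drops towards zero for very small p, in line with the discussion of eq. (5).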
fig. 9 ragone curve of the ideal battery (adapted from [11]); axes: p [4rp/u0^2] and e [e/q0u0], logarithmic scales; curves for rl = ∞ and rl = 10^3 r

5.2. the case of superconductive magnetic energy storage smes

the ragone curves of superconductive magnetic energy storage systems are also described in detail in [5]. figure 10 gives the normalized curves for inductive energy storage devices with coulomb (c), stokes (s), and newton (n) friction. the dashed double-dotted curve corresponds to a smes with an ohmic bypass (4r/rb = 0.001). this resistance rb is used for the modelling of the losses of all freewheeling paths, with a dominant contribution of the freewheeling elements of the power electronic converter.

fig. 10 normalized ragone curves for the inductive esd (adapted from [11]); axes: p [p/ri0^2] and e [2e/li0^2], logarithmic scales; curves c, s, n for rb = ∞ and for 4r/rb = 0.001

6. the mrr (modified ragone representation)

a slightly different model for the representation of the relationship between the really recoverable energy amount of a storage device and the power level of the exchange has been proposed in [12]. this mrr (modified ragone representation) is based on a simple equivalent circuit which can be used, for example, for a battery. this equivalent circuit includes a series resistor modelling the charging and discharging losses, and a parallel resistor modelling the self-discharge (fig. 11).

fig. 11 equivalent scheme for the mrr

the energy that can be recovered from the storage device is represented as a function of the transfer power p (logarithmic scale, fig. 12). a transfer power that is too small, in the range of the self-discharge losses, results in nearly zero recoverable energy (left ends of the curves in fig. 12).
at the right end of the mrr curves, a transfer power that is too high leads to a similar situation of zero recovery due to the high internal losses.

fig. 12 the mrr (modified ragone representation); two panels, energy [kj] versus power [w] (logarithmic power axis), with curve families labelled rp → 0, rp → ∞ and rs → 0, rs → ∞

7. from a positive to a negative efficiency

in section 3, the energy needed for the system auxiliaries has been mentioned. such auxiliaries correspond, for example, to the circulation pumps of a flow battery or to the vacuum pumps for the evacuation of the envelope of a high-speed flywheel. superconductive magnetic energy storage must be assisted by cryogenic equipment ensuring the superconducting conditions (fig. 13 left). the power consumed by all such auxiliaries should not be higher than the power available for the storage if the storage efficiency has to be kept positive. for example, in the case of a vrb battery, the mechanical power for the electrolyte pumps has to be subtracted from the stack power (the battery itself is powering its auxiliaries) during discharge, and it must be added to the stack power during charge (the external source is powering the auxiliaries). if the charging time is identical to the discharging one, the round-trip efficiency becomes

$$ \eta_{roundtrip} = \frac{p_{stackdisch} - p_{mech}}{p_{stackch} + p_{mech}} \qquad (7) $$

fig. 13 effect of the auxiliaries on the efficiency

from relation (7), it becomes evident that the round-trip energy efficiency can become negative (for example pmech > pstackdisch). the variation of the round-trip efficiency as a function of the charging/discharging power related to the nominal power of the battery pn and of the related mechanical power pmech/pn is represented in fig. 13 (right side). the operating range of the battery power is comprised between zero and 1.2 [p.
u.], and the range of the needed related auxiliary power is comprised between zero and 0.2 [p. u.]. in fig. 13, the negative values of the efficiency are only represented at the value of pmech/pn = 0.2 in order to illustrate this property. a real operation of the storage system under such conditions would make no sense.

8. conclusions

energy storage has been and will be in the future a component with growing importance in the wide field of powered systems. the storage components cover a broad range of power and energy capacity. a general model for the efficiency and new tools have been described with which the real amount of recoverable energy can be calculated as a function of the power level of the exchange. in the context of alternative and renewable energy supplies, new storage technologies are proposed, such as flow batteries, flywheels or even superconducting magnetic components. as for all storage devices, the energy efficiency must be quantified and the operation boundaries for a reasonable global performance must be defined.

references
[1] a. rufer, energy storage – systems and components, crc press, taylor & francis group, november 2017.
[2] k. o. papailiou (ed.), springer handbook of power systems, a. rufer, chapter 16, energy storage, springer science and business media llc, 2021.
[3] hublot mp-05 “la ferrari”: 50-day power reserve: a world record power reserve for a hand-wound tourbillon wristwatch. https://www.gphg.org/horlogerie/en/watches/mp-05-laferrari-50-days-power-reserve
[4] www.solarimpulse.com
[5] www.nant-de-drance.ch
[6] storaxe, industrial and infrastructure scalable large-scale storage solutions, ads-tec, gmbh, d-72622 nürtingen, www.ads-tec.de
[7] s. c. lee, w. y. jung, "analogical understanding of the ragone plot and a new categorization of energy devices", energy procedia, vol. 88, pp. 526-530, june 2016.
[8] s. k.
moore, "gravity energy storage will show its potential in 2021, gravitricity and energy vault are pioneering a radical new alternative to batteries for grid storage", ieee spectrum, january 2021.
[9] a. rufer, "design and control of a ke (kinetic energy) compensated gravitational energy storage system", in proceedings of the 2020 22nd european conference on power electronics and applications (epe'20 ecce europe), 2020, pp. 1-11.
[10] h. höimoja, m. vasiladiotis, s. grioni, m. capezzali, a. rufer, h. b. püttgen, "toward ultrafast charging solutions of electric vehicles", in proceedings of the cigre 2012, paris, pp. 1-8.
[11] t. christen, m. w. carlen, "theory of ragone plots", j. power sources, vol. 91, no. 2, pp. 210-216, december 2000.
[12] s. delalay, "etude systémique pour l’alimentation hybride – application aux systèmes intermittents", phd thesis nr 5768, epfl, lausanne, switzerland, 2013.

facta universitatis series: electronics and energetics vol. 27, no 4, december 2014, pp. 589-600 doi: 10.2298/fuee1404589p

covered microstrip line with ground planes of finite width

mirjana t. perić 1, saša s. ilić 1, slavoljub r. aleksić 1, nebojša b. raičević 1, mirza i. bichurin 2, alexander s. tatarenko 2, roman v. petrov 2
1 university of niš, faculty of electronic engineering of niš, serbia
2 novgorod state university, veliky novgorod, russian federation

abstract. characteristic parameters of a covered microstrip line with ground planes of finite width are determined using the hybrid boundary element method (hbem). this method, developed at the faculty of electronic engineering of niš, is based on the combination of the equivalent electrodes method (eem) and the boundary element method (bem).
results for the characteristic impedance of the observed microstrip line are compared with the corresponding ones obtained by the finite element method.

key words: characteristic impedance, finite element method (fem), hybrid boundary element method (hbem), microstrip line, perfect electric conductor (pec).

1. introduction

over the years, many authors have analyzed microstrip lines with finite width dielectric substrate using numerical and analytical methods [1]-[14]. the variational method [5, 7], the boundary element method/method of moments (bem/mom) [1], [9]-[11], the conformal mapping and the moving perfect electric wall methods [12]-[13], etc. are some of the commonly used procedures for microstrip line analysis. on the other hand, the problem of the finite width microstrip ground plane has been studied less often, although these forms of microstrips are typical in practice. in [4] and [14]-[15] the microstrip line with finite-width dielectric and ground plane was analyzed. a moving perfect electric wall method (mpew) was applied in [12]. this method is used in combination with the conformal mapping method (cmm). the author obtained simple analytical relations for quasi tem parameters of microstrip lines. the calculation was performed with the assumption that the conductor thickness is zero.

received january 21, 2014; received in revised form september 5, 2014
corresponding author: mirjana t. perić, faculty of electronic engineering, a. medvedeva 14, 18000 niš, republic of serbia (e-mail: mirjana.peric@elfak.ni.ac.rs)

in [6] the authors present an efficient numerical technique for characteristic parameters determination of multiconductor transmission lines with homogeneous dielectrics. the influence of a finite width ground plane was also investigated. the system of integral equations resulting from the method is solved using galerkin’s method with a pulse approximation.
the technique applied in this paper is an improvement of the procedure presented in [8], in the sense of better efficiency and accuracy of the obtained results. analysis of structures with ground planes of finite width as well as finite conductor thickness is also possible using the hybrid boundary element method (hbem) [15]. this method is applied for the microstrip characteristic parameters determination in [15] and [16]. in [17] and [18] the symmetrically coupled microstrip lines with finite and infinite width ground plane are analyzed using the hbem. both modes (even and odd) are considered. covered coupled microstrip lines parameters are calculated in [19]. the structure that has not been analyzed using hbem until now is a covered single microstrip line with ground planes of finite width and finite conductor thickness. the analysis of such structure will be presented in this paper. results obtained for the characteristic impedance will be shown in tables and graphically, as equipotential contours. the main assumption in this analysis involves quasi tem propagation in the microstrip line. in order to validate the hbem values obtained for the characteristic impedance, in terms of accuracy, they have been compared with the corresponding ones obtained by the finite element method (fem). that method is very useful for application in software for electromagnetic problems solving, including the microwave analysis. some of this type of software is femm [20] or comsol [21]. the first one will be applied in this paper for results comparison. 2. theoretical background the hbem has been applied, until now, for electromagnetic field determination in the vicinity of cable terminations [22], calculation of magnetic force between permanent magnets as well as for microstrip lines parameters determination [23]. a generalization of the hbem, which is applied in this paper for microstrip lines analysis, was described in detail in [15] and [16]. 
this method presents a combination of the bem/mom, the equivalent electrodes method (eem) [24] and the point-matching method (pmm). the main idea of the hbem is in discretizing each arbitrarily shaped surface of the perfect electric conductor (pec) electrode as well as each arbitrarily shaped boundary surface between any two dielectric layers. the boundary surfaces are divided into a large number of segments. each of those segments on a pec electrode is replaced by an equivalent electrode (ee) placed at its center. the potential of the equivalent electrodes obtained in this manner is the same as the potential of the pecs themselves. the segments at any boundary surface between two layers are replaced by discrete equivalent total charges. those charges are placed in the air [15, 16]. the equivalent electrodes are line charges whose radius is determined in [24]. the green’s function for the electric scalar potential of the charges is used.

applying the point-matching method (pmm) for the potential of the perfect electric conductor (pec) electrodes and for the normal component of the electric field at the boundary surface between any two dielectric layers, the system of linear equations is formed. as the number of ees increases, the distances between them become smaller. in order to keep the formed system of equations stable, the distances between ees must remain larger than their radius. the formed square system of linear equations is well-conditioned. the system matrix always has its largest values on the main diagonal. after solving the system of equations, according to [15], it is possible to calculate the capacitance per unit length of the microstrip line, as well as the characteristic impedance and effective relative permittivity.
this method will be described in detail in the following section for characteristic parameters determination of a covered microstrip line with ground planes of finite width and finite conductor thickness.

3. hbem application

the geometry of the covered microstrip line, with a finite width dielectric substrate placed between two ground planes of finite width, is shown in fig. 1.

fig. 1 problem geometry

the hbem, based on discretization of boundary surfaces between any two dielectric layers and replacement of those segments with total charges per unit length, is applied. it should be mentioned that free surface charges do not exist on the boundary surfaces layer 1 – layer 2, so the total surface charges placed between dielectric layers are only surface polarization charges. the equivalent hbem model is shown in fig. 2.

• indices “d”, “a” and “t” denote the charges per unit length placed in dielectric (“d”) and air (“a”) as well as total (“t”) charges per unit length, respectively.
• $M_i$ (i = 1, 2) is the number of ees on pecs, with line charges $q'_{d\,im}$ (m = 1,...,$M_i$), placed in the layer 2;
• $M_j$ (j = 3,...,5) is the number of ees on pecs, with line charges $q'_{a\,jm}$ (m = 1,...,$M_j$), placed in the layer 1;
• $N_i$ (i = 1,...,4) is the number of ees on boundary surfaces layer 1 – layer 2, with line charges $q'_{t\,in}$, placed in the air (n = 1,...,$N_i$);
• $(x_{d\,im}, y_{d\,im})$, $(x_{a\,im}, y_{a\,im})$, $(x_{t\,in}, y_{t\,in})$ are the positions of the ees.

fig. 2 hbem model

the electric scalar potential of the system from fig. 2 is given in eq. (1):

$$ \varphi = \varphi_0 - \sum_{i=1}^{2}\sum_{m=1}^{M_i} \frac{q'_{d\,im}}{2\pi\varepsilon_2} \ln\sqrt{(x - x_{d\,im})^2 + (y - y_{d\,im})^2} - \sum_{i=3}^{5}\sum_{m=1}^{M_i} \frac{q'_{a\,im}}{2\pi\varepsilon_1} \ln\sqrt{(x - x_{a\,im})^2 + (y - y_{a\,im})^2} - \sum_{i=1}^{4}\sum_{n=1}^{N_i} \frac{q'_{t\,in}}{2\pi\varepsilon_0} \ln\sqrt{(x - x_{t\,in})^2 + (y - y_{t\,in})^2}, \qquad (1) $$

where $\varphi_0$ is an unknown additive constant, which depends on the chosen reference point for the electric scalar potential.
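the structure of eq. (1) is a superposition of logarithmic potentials of line charges. a minimal sketch of this superposition follows; it is not the authors' mathematica code, and the function name `potential`, the tuple layout of `charges` and the example values are assumptions made for illustration.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m


def potential(x, y, charges, phi0=0.0):
    """Scalar potential of a set of line charges, in the form of eq. (1).

    charges: iterable of (q_line, eps, xc, yc), where q_line is the charge
    per unit length, eps the permittivity used for that charge and
    (xc, yc) its position. phi0 is the additive constant.
    The field point must not coincide with a charge position.
    """
    phi = phi0
    for q_line, eps, xc, yc in charges:
        distance = math.hypot(x - xc, y - yc)
        phi -= q_line / (2.0 * math.pi * eps) * math.log(distance)
    return phi


if __name__ == "__main__":
    # two hypothetical line charges of opposite sign in free space
    charges = [(1e-9, EPS0, 0.0, 0.0), (-1e-9, EPS0, 1.0, 0.0)]
    print(potential(0.5, 0.5, charges))
```

in the hbem the charges are the unknowns; matching this potential at points on the pec surfaces gives part of the rows of the linear system.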
the procedure for determining the number of unknowns is the following: in order to avoid placing an arbitrary number of unknowns on each boundary surface, an initial parameter np (written $N_p$ in the formulas) is introduced. the number of unknowns is determined as

$$ M_1 = N_p\frac{w_1}{s+h}, \quad M_2 = N_p\frac{s}{s+h}, \quad M_3 = N_p\frac{w_1 + 2t_1}{s+h}, $$

$$ M_4 = N_p\frac{w_2 + 2t_2 + 2\Delta y}{s+h}, \ \text{where } \Delta y = \frac{w_2 - s}{2}, \quad M_5 = N_p\frac{2w_2 + 2t_2}{s+h}, $$

$$ N_1 = N_4 = N_p\frac{h}{s+h}, \quad N_2 = N_3 = N_p\frac{\Delta x}{s+h}, \ \text{where } \Delta x = \frac{s - w_1}{2}. $$

the total number of unknowns, $N_{tot}$, is:

$$ N_{tot} = \sum_{i=1}^{5} M_i + \sum_{i=1}^{4} N_i + 1. $$

the electric field is obtained using $\mathbf{E} = -\mathrm{grad}(\varphi)$. a relation between the normal component of the electric field and the total surface charges is given with eq. (2):

$$ \hat{n}_i \cdot \mathbf{E}^{(0+)}_{in} = -\frac{\varepsilon_2}{\varepsilon_0(\varepsilon_1 - \varepsilon_2)}\,\eta_{t\,in}, \quad \eta_{t\,in} = \frac{q'_{t\,in}}{\Delta l_{in}}, \quad n = 1,\dots,N_i, \quad i = 1, 2, 3, 4 \qquad (2) $$

where $\hat{n}_i$ ($\hat{n}_1 = -\hat{x}$, $\hat{n}_2 = \hat{y}$, $\hat{n}_3 = \hat{y}$, $\hat{n}_4 = \hat{x}$) are unit normal vectors oriented from the layer 2 into layer 1. applying the procedure described in the previous section, the system of linear equations is formed using the pmm for the potential of the perfect electric conductor given in (1) and the pmm for the normal component of the electric field (2). the unknown free charges per unit length on conductors and total charges per unit length on the boundary surfaces between two dielectric layers are determined after solving the system of equations. in order to satisfy the necessary condition of electrical neutrality of the whole covered microstrip line, equation (3) is added:

$$ \sum_{i=1}^{2}\sum_{m=1}^{M_i} q'_{d\,im} + \sum_{i=3}^{5}\sum_{m=1}^{M_i} q'_{a\,im} = 0 \qquad (3) $$

in that way, a square system of linear equations is formed. the unknown values are free charges of pecs, total charges per unit length at boundary surfaces between dielectric layers, and the unknown additive constant $\varphi_0$.
the capacitance per unit length of the observed microstrip line is:

$$ C' = \frac{1}{U}\left(\sum_{k=1}^{M_1} q'_{d\,1k} + \sum_{k=1}^{M_3} q'_{a\,3k}\right) \qquad (4) $$

the characteristic impedance is given in (5)

$$ Z_c = \frac{Z_{c0}}{\sqrt{\varepsilon_r^{eff}}}, \qquad (5) $$

where $\varepsilon_r^{eff} = C'/C'_0$ is the effective relative permittivity of the microstrip line, and $Z_{c0}$ is the characteristic impedance of the microstrip line placed in the air. $C'_0$ denotes the capacitance per unit length of the microstrip line without dielectrics (free space). in order to validate and compare the obtained results for the characteristic impedance, the software femm [20] is used.

4. numerical results

a computer code based on the procedure described in the previous section is written in mathematica [25]. all calculations were performed on a computer with a dual core intel processor at 2.8 ghz and 4 gb of ram. the convergence of the results and the computation time are shown in table 1. the values of the effective relative permittivity and the characteristic impedance are determined for: εr1 = 1, εr2 = 3, w1/d = 1.0, t1/w1 = 0.05, w2/d = 3.0, t2/w2 = 0.1, h/d = 0.5 and s/d = 2.0.

table 1 convergence of the results and computation time

np    ntot   εr eff   zc [Ω]   t (s)
5     66     1.7008   44.665   0.3
10    98     1.8648   42.234   0.4
15    134    1.7107   44.228   0.7
20    166    1.7825   43.544   1.0
50    376    1.8559   43.328   4.5
75    550    1.8707   43.343   9.6
85    618    1.8744   43.346   12.1
100   722    1.8786   43.349   16.5
125   894    1.8836   43.350   25.2
135   964    1.8846   43.356   29.3
150   1068   1.8866   43.355   36.0
160   1136   1.8877   43.355   41.3
170   1242   1.8887   43.356   49.3
200   1414   1.8908   43.356   64.5
250   1760   1.8935   43.356   100.5
300   2106   1.8953   43.356   143.2
325   2278   1.8963   43.356   171.0

the “computation time” is the time spent on determining the number of unknowns, positioning them, forming the matrix elements, solving the system of equations and calculating the characteristic parameters. most of the calculation time is spent on matrix fill.
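the convergence of the characteristic impedance in table 1 can be checked with a few lines of code. the sketch below is an illustration only: the pairs are the ntot and zc values copied from table 1, while the function name `first_stable` and the criterion (zc agrees with the last table entry to the third decimal place from that row onwards) are assumptions.

```python
# (ntot, zc in ohms) pairs copied from table 1
TABLE_1 = [
    (66, 44.665), (98, 42.234), (134, 44.228), (166, 43.544),
    (376, 43.328), (550, 43.343), (618, 43.346), (722, 43.349),
    (894, 43.350), (964, 43.356), (1068, 43.355), (1136, 43.355),
    (1242, 43.356), (1414, 43.356), (1760, 43.356), (2106, 43.356),
    (2278, 43.356),
]


def first_stable(rows, tol=5e-4):
    """Smallest ntot from which zc stays within tol of the last entry."""
    final = rows[-1][1]
    stable_from = rows[-1][0]
    # walk backwards until a row deviates from the final value
    for ntot, zc in reversed(rows):
        if abs(zc - final) > tol:
            break
        stable_from = ntot
    return stable_from


if __name__ == "__main__":
    print("zc is stable to three decimals from ntot =", first_stable(TABLE_1))
```

a looser tolerance moves the threshold to a smaller number of unknowns, so the choice of criterion matters when judging how many ees are needed.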
when ntot = 1068, for example, the time for determining the number of unknowns and their positioning is 0.2 s, the matrix fill takes 32 s, solving the system of linear equations takes 3.3 s, and the calculation of the capacitance, characteristic impedance and effective dielectric permittivity takes 0.5 s. from table 1 it is evident that a good convergence of the results is achieved in a short computation time. sufficient accuracy is obtained for 1242 unknowns, so there is no need to increase the number of ees further.

equipotential contours are shown in fig. 3, for: np = 150, εr1 = 1, εr2 = 3, w1/d = 1.0, t1/w1 = 0.05, w2/d = 3.0, t2/w2 = 0.1, h/d = 0.5 and s/d = 2.0.

fig. 3 equipotential contours

in order to verify the obtained hbem values, a comparison of hbem and femm results for the effective dielectric permittivity and the characteristic impedance versus h/d is given in table 2. for the characteristic impedance, the discrepancy between these results is less than 0.6 %, while the effective permittivity values differ by up to about 2.8 % (at h/d = 0.8). it should be mentioned that a classical comparison of results does not make sense here, because the two methods (hbem and fem) are applied under different conditions. the number of unknowns in the hbem application was about 1100. on the other hand, the corresponding femm model was created with a few thousand finite elements. as the number of finite elements increases, the accuracy of femm increases too, so it is possible to “compare” and verify the hbem results.

table 2 verification of results for effective dielectric permittivity and characteristic impedance of microstrip line versus h/d for parameters: np = 150, εr1 = 1, εr2 = 3, w1/d = 1.0, t1/w1 = 0.05, w2/d = 3.0, t2/w2 = 0.1 and s/d = 2.0

        hbem                fem
h/d     εr eff   zc [Ω]     εr eff   zc [Ω]
0.2     2.3745   27.897     2.3740   28.056
0.3     2.2007   35.847     2.2089   35.895
0.4     2.0424   40.946     2.0568   40.883
0.5     1.8866   43.355     1.9068   43.208
0.6     1.7286   42.904     1.7530   42.702
0.7     1.5588   39.030     1.5898   38.812
0.8     1.3685   30.462     1.4072   30.320
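the agreement between the two methods in table 2 can be quantified directly from the tabulated values. the following sketch is an illustration: the numbers are copied from table 2, while the function name `max_discrepancy_percent` and the choice of the fem value as the reference in the relative difference are assumptions.

```python
# rows copied from table 2: h/d, eps_eff (hbem), zc (hbem), eps_eff (fem), zc (fem)
TABLE_2 = [
    (0.2, 2.3745, 27.897, 2.3740, 28.056),
    (0.3, 2.2007, 35.847, 2.2089, 35.895),
    (0.4, 2.0424, 40.946, 2.0568, 40.883),
    (0.5, 1.8866, 43.355, 1.9068, 43.208),
    (0.6, 1.7286, 42.904, 1.7530, 42.702),
    (0.7, 1.5588, 39.030, 1.5898, 38.812),
    (0.8, 1.3685, 30.462, 1.4072, 30.320),
]


def max_discrepancy_percent(rows, hbem_col, fem_col):
    """Largest relative difference in percent, with fem as the reference."""
    return max(
        abs(row[hbem_col] - row[fem_col]) / row[fem_col] * 100.0
        for row in rows
    )


if __name__ == "__main__":
    print("max zc discrepancy [%]:", max_discrepancy_percent(TABLE_2, 2, 4))
    print("max eps_eff discrepancy [%]:", max_discrepancy_percent(TABLE_2, 1, 3))
```

with these values the characteristic impedance stays below the 0.6 % bound, whereas the effective permittivity deviates more strongly as h/d grows.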
distributions of characteristic impedance versus different parameters are shown in the following figures. fig. 4 shows the influence of ground plane thickness on the characteristic impedance of the microstrip line. the input data are: np = 150, εr1 = 1, εr2 = 3, w1/d = 1.0, t1/w1 = 0.05, w2/d = 3.0 and s/d = 2.0. from this figure it is evident that, for the given input data, the characteristic impedance does not depend on the ground plane thickness. the characteristic impedance depends on the conductor’s distance from the planes (parameter h/d). as this parameter increases, the characteristic impedance first increases and then decreases. the maximum value occurs when the conductor is equidistant from the ground planes.

fig. 4 distribution of characteristic impedance versus t2/w2 for different values of parameter h/d

the distribution of characteristic impedance versus w1/d and s/d is shown in fig. 5. the values for the characteristic impedance of a microstrip line with parallel ground planes of infinite width [7] are also given. the influence of the dielectric substrate width as well as the ground plane width is given in fig. 6. as the substrate width increases, the characteristic impedance decreases. the ground plane width has an influence on the characteristic impedance, but it can be neglected. as the substrate height increases, the characteristic impedance first increases and then decreases as the conductor approaches the upper plane (fig. 7). the dielectric permittivity of the substrate also influences the characteristic impedance value: as the substrate permittivity increases, the characteristic impedance decreases.

fig. 5 distribution of characteristic impedance versus s/d for different values of parameter w1/d

fig. 6 distribution of characteristic impedance versus s/d for different values of parameter w2/d

fig.
7 distribution of characteristic impedance versus h/d for different values of parameter r2 distribution of polarization charges per unit length along boundary surface is shown in fig. 8. fig. 8 distribution of polarization charges per unit length along boundary surface 5. conclusion the aim of this paper is to apply a very efficient hbem, based on a combination of eem and bem, for determining the characteristic impedance of the covered microstrip line with ground planes of finite width. that configuration has not been analyzed so far using hbem. the quasi tem analysis is applied. the main advantage of this method is the possibility to solve arbitrarily shaped, multilayered configuration of microstrip lines, with finite dimension of ground planes and conductor thickness, without any numerical integration. of course, there are other methods that can analyze this structure, but the hbem is simple and accurate procedure. the convergence of the results is good and the computation time is very short. covered microstrip line with ground planes of finite width 599 the analysis of this microstrip was performed for different values of microstrip parameters. the influence of permittivity of layer 2 on the characteristic impedance is evident. also, the results show that for w2/w1 > 2.5 the influence of finite width of ground planes on the characteristic impedance values can be neglected. acknowledgement: this research was partially supported by funding from the serbian ministry of education and science in the frame of the project iii 44004. references [1] k. li, y. fujii, “indirect boundary element method applied to generalized microstrip line analysis with applications to side-proximity effect in mmics”, ieee trans. microwave theory tech., vol. 40, pp. 237–244, 1992, doi: 10.1109/22.120095. [2] chang t. and c. tan, ”analysis of a shielded microstrip line with finite metallization thickness by the boundary element method”, ieee transactions on microwave theory tech., vol. 
facta universitatis series: electronics and energetics vol. 28, no. 2, june 2015, pp. 165-175 doi: 10.2298/fuee1502165n

image and video processing with fpga support used for biometric as well as other applications

andrzej napieralski, jakub cłapa, kamil grabowski, małgorzata napieralska, wojciech sankowski, przemysław sękalski, mariusz zubert
lodz university of technology, department of microelectronics and computer science, lodz, poland

abstract. paper presents the recent research in dmcs. the image processing and biometric research projects are presented. one of the key elements is an image acquisition and processing. the most recent biometric research projects are in the area of authentication in uncooperative scenarios and utilizing many different biometric traits (multimodal biometric systems). also, the recent research on the removal of geometric distortion from live video streams using fpga and gpu hardware is presented together with preliminary performance results.
key words: biometrics, fpga, video correction algorithms

1. researches on biometrics
1.1. introduction
biometrics has a great potential and many fascinating applications.
one can imagine airports with automatic passenger identification, where a passenger does not need to show any id document but can conveniently pass through a biometric identification gate. banking systems are another large area for biometric system implementations, where cash machines would not require bank customers to present a credit card or enter a pin number. biometric technologies have been developed for decades. the first successful biometric system implementation was published in 1991 for face recognition and in 1993 for iris recognition. fingerprint and hand recognition have been developed for even longer. a technical note on the first operational fingerprint matching algorithm used at the fbi (federal bureau of investigation) for narrowing the human search was published in 1972, while the first commercial hand geometry recognition systems became available in 1974. nowadays, after such a huge development effort, one can ask why even the most modern societies have not widely adopted biometric solutions in everyday life. to be widely adopted by citizens, a biometric system has to be easy to use, fast, inexpensive and, most importantly, reliable. currently none of the existing biometric technologies satisfactorily fulfils all of these conditions. even the most advanced solutions today do not provide 100% reliability, and error rates, although extremely low, are inevitable in a practical system. this is especially visible in large-scale implementations, where the number of registered users can be counted in millions.

received december 2, 2014
corresponding author: andrzej napieralski, lodz university of technology, department of microelectronics and computer science, ul. wolczanska 221/223, 90-924 lodz (e-mail: napier@dmcs.pl)

1.2. dmcs biometric projects overview
current practical implementations and the current state of the art indicate that there is still considerable room for improvement in the field of biometrics.
the department of microelectronics and computer science (dmcs) has been conducting research on biometrics since 2000. the first publications [1,2] concern automatic people identification on the basis of iris pattern. in 2004 the first biometric project was granted to the department by the polish state committee for scientific research. the project was entitled “persons recognition and identification based on eye biometric parameters” and focused on iris recognition technology development. it resulted in the construction of the iris station device and image processing algorithms (see fig. 1) that form a fully functional prototype system for iris recognition. the experience of the team led to two further projects that continued the research on iris recognition technology, entitled “hardware acceleration of computations for biometric applications” and “iris positive authentication system”. the three projects resulted in a complete iris recognition solution developed in the dmcs.

fig. 1. authentication process based on iris pattern

current biometric research in the dmcs focuses on multimodal solutions, where data coming from more than one biometric feature are fused to enhance system reliability. two projects on multimodal biometrics are currently ongoing. they are entitled “multimodal biometric system for contactless person identification” and “non-cooperative biometric system for positive authentication”. the first one concentrates on cooperative scenarios, while the second one focuses on non-cooperative scenarios. the list of all projects on biometrics in dmcs (completed and ongoing) is presented in table 1, while the details are given in the next sections.
table 1 biometric projects in dmcs
title | acronym | signature | time frame | status
persons recognition and identification based on eye biometric parameters | iris station | 1374/t11/2004/27 | 17.11.2004 - 16.05.2007 | completed
hardware acceleration of computations for biometric applications | biosys | k25/b.w.2/2009 | 1.01.2009 - 31.12.2009 | completed
iris positive authentication system | ipass | k-25/2011/bw/3 | 1.11.2011 - 05.2013 | completed
multimodal biometric system for contactless persons identification | mbs | 2011/01/d/st6/06269 | 08.12.2011 - 07.12.2014 | in progress
non-cooperative biometric system for positive authentication | compact | lider/027/591/l4/12/ncbr/2013 | 01.11.2013 - 10.2016 | in progress

1.3. iris station project
the result of this completed project is the iris station laboratory stand developed in the dmcs biometric laboratory (see fig. 2). it is a high-resolution iris image acquisition system dedicated to biometric applications. this prototype system allows real-time eye tracking and registration/identification of the observed person.

fig. 2 iris station laboratory stand

in the presented solution, image acquisition is performed by a digital camera controlled and positioned in real time by a personal computer. moreover, the system includes a specialized lighting system, a precise positioning system for the camera with lens and lighting, and an optical path with an image acquisition system. to minimize the impact of accelerations on the precise shutter mechanism of the camera and photo cameras, variable-frequency control of the stepper motors was applied, with the ability to independently control the motors’ movement while tracking the observed object. multi-channel control of the lighting block allows a variety of lighting conditions, including the ability to acquire iris images under different lighting conditions during a single session. the system contains a dedicated support allowing proper and comfortable placement of the head in front of the acquisition system.
the system’s operation may be summarized in the following steps: a) software installed on the working station acquires an image from the camera; b) it localizes the iris in the image and verifies if the iris region is focused; c) the focus is corrected if necessary; d) the acquired high resolution iris image is further processed to recognize the person’s identity. experimental results obtained in the project are presented in [3-5, 9-11]. 1.4. biosys project a biometric security system needs to process computationally intensive tasks of authentication flow, such as quality assessment, segmentation and analysis, protocols, and database scanning. this flow contains two broad areas of computing: mathematical calculations (typical for dsp systems) and data manipulation and testing (typical for standard processor architectures). even though such systems require a fair amount of signal processing, typical personal computers (pcs) are still widely used for this purpose with software that is responsible for data processing. moreover, although current commercially available systems use images that originate from relatively simple vision systems, these systems will have to handle more information and increased processing in the near future because recently a significant effort has been focused on authenticating objects at-a-distance and on-the-move using the iris trait. fig. 3 hardware implementation architecture of biometric algorithms the main motivation for this work was to develop a hardware system for iris identification as a positive biometric system that is able to implement contextual and non-contextual filtering, image segmentation, pattern calculation, and testing within a template’s repository. an additional requirement for the device was to impose certain processing time limits, which is important in high-throughput biometric authentication or when preprocessing. 
biometric sample quality assessment has to be conducted using a video signal with a certain frame rate. thus, the described solution introduces a specialized hardware-based architecture (see fig. 3) that can take advantage of the inherent parallelism of fpgas and their embedded processors as well as the contextual filtering of dsps. additionally, dsp reconfiguration and multicore processing techniques used for even more efficient data processing have been tested [6]. the developed system, called bioserver, is presented in fig. 4.

fig. 4 the bioserver device

1.5. ipass project
the main aim of the project was to design a simple, low-cost device able to perform iris positive authentication of dmcs employees that can be practically deployed in the b18 building on the tul campus. the scientific goal of the project is to design a less-cooperative iris-based biometric authentication system as well as methods and devices for acquisition of images with sufficient content of distinctive features. one of the key problems was to enable the automatic identification of people previously registered in the system on the basis of observation carried out by a low-cost vision system. although solutions for positive biometric authentication (identification or verification) are available on the commercial market, there is still room for innovation in automated biometric recognition using iris pattern, especially when cooperation between a subject and the system is taken into consideration. such systems need to deal with highly unconstrained imaging conditions, such as exposure, wavelength of illumination, ambient light reflections from the eye’s surface, perspective fluctuations, as well as cheating attempts.

1.6. mbs project
the scientific goal of the project is to develop a multimodal biometric system for contactless persons identification based on the following physical and behavioral features (biometrics):
1. iris pattern
2. face geometry
3. hand geometry
in the constructed system, iris pattern features unique to each individual are derived from the analysis of the iris structure image taken under nir (near infrared) light. unique face features are computed by analyzing two data sources acquired simultaneously: a face image taken under visible light and face geometry acquired using a 3d scanner. unique hand features are extracted by analyzing the 3d scan of the hand. the multimodal biometric database is collected using the iris station laboratory stand (iris) and the mbs laboratory stand (face and hand). the mbs laboratory stand is presented in fig. 5. it is built on two structured-light 3d scanning devices manufactured by the smarttech company. it allows two modes of operation: fast and precise. the precise mode is intended for scanner calibration (setting up a common coordinate system based on a template of known geometry presented to both devices), while the fast mode is dedicated to face and hand sample acquisition. the preliminary experimental results obtained in the project are presented in [7, 8].

fig. 5 mbs laboratory stand (left) and the structured light scanning device (right)

1.7. compact project
automatic identification of people at-a-distance and on-the-move is one of the most explored areas of biometrics nowadays. this problem is addressed by the compact project, where the recognition process is done using specialized vision systems under unconstrained imaging conditions, at a distance and on the move. one of the key assumptions is to realize a system able to acquire biometric samples with a level of cooperation lower than in the case of the iris-on-the-move technology.
it is assumed that the authorized subject should be inside an area around the system rather than on a straight path as in the iris-on-the-move (iom) technology. the other system feature is high throughput, understood as the number of persons that can be successfully authorized per time interval: not less than in the iom technology, i.e., 30 persons per minute. using modern fpga and/or multicore dsp technology, it may be possible to recognize several subjects at the same time, significantly increasing the throughput of the system. the scientific goal of the project is to design a non-cooperative, high-throughput biometric positive authentication system based on feature fusion of iris pattern, periocular and face images, as well as methods and devices for acquisition of such images with sufficient content of distinctive features. the authentication process (see fig. 1) is planned to be based mainly on iris pattern and to be supported by a face trait, since iris pattern achieves the best biometric efficiency (lowest error rates). however, the key problem in iris-based biometric system design is image acquisition. acquiring a good-quality image of an object of 1.2 cm^2 area from a subject at-a-distance and on-the-move is the most challenging task in the whole authorization process. thus, multibiometrics is the tool that allows for fusion of other features that could have been captured during the acquisition process. in the case of iris-based systems this can be the periocular region: eyebrows and skin texture and color.

2. implementation and development of fisheye distortion correction algorithms
2.1. background
nowadays, the majority of camera systems have a field of view (fov) below about 60 degrees (fov < 60 deg). however, there is a wide market for cameras based on fisheye and wide-angle lenses, e.g. the gopro hero.
these cameras provide larger fovs thanks to specially designed wide-angle or fisheye lenses that project a much bigger part of the scene onto the image sensor. such cameras are sometimes treated as toys, but the increased field of view is extremely useful in a number of applications ranging from laparoscopic surgery, through rear-view cameras in cars, to closed circuit television (cctv) systems [12], [13]. the increase in the fov brings geometric distortions, which make straight lines in the scene appear as curves in the image plane. this ‘artistic’ transformation is acceptable when a sports camera is used to record our achievements, but in many applications it is not desired and causes problems for proper interpretation of the scene. our team focuses on research and implementation of novel, on-the-fly algorithms for correcting geometrical distortions in video streams on different systems, from regular pcs, through mobile devices, to dsps and fpgas.

2.2. the algorithm
the equations for geometrical distortion are well documented and are as old as cartography. the concept of the algorithm is to project a sphere onto a planar surface. this can only be done using trigonometric functions, as presented below [14].

fig. 6 set of equations to project the sphere on the planar surface [16]

hence, the algorithm is well known, but there are always problems with hardware implementations of the above set of equations. for a single image there is plenty of time to make all the calculations and present the final corrected image. the problem arises when dealing with high-resolution video at a frame rate of, e.g., 30 frames per second. then the computing power needed for such a calculation rises proportionally to the number of frames. one of the biggest problems is the algorithm itself and the fact that it uses trigonometric functions.
a brief calculation for hd video shows that approximately 1920 px * 1080 px * 30 frames ≈ 62 million trigonometric calculations per second are needed, since in principle the following calculation has to be done for each pixel.

fig. 7 block diagram of geometrical correction system

as can be seen in the block diagram in fig. 7, single pixels have to be read from and written to memory. with conventional techniques, external memory has to be accessed index by index, in the order given by the result of the trigonometric calculation. thus, the access time to a memory cell plays the major role. faster memories like ddr5 or even on-chip sram (cache) can be used, but this is an expensive solution.

2.3 sample implementation
in the first attempt, we built a system based on a xilinx spartan vi fpga chip and the aptina mt9p031 image sensor equipped with a fisheye lens. a photo of the system is presented below.

fig. 8 photo of the real system

the internal structure of the image sensor is presented below. the resolution of the camera is 5 megapixels, thus even higher than the hd image described above.

fig. 9 image sensor mt9p031 internal architecture [from aptina documentation]

we designed and then implemented the correction algorithm [15], [16]. the internal block architecture is presented below. the system contains four main parts:
- the front-end frame grabber, which acquires the image data from the sensor and forms information about the pixels for further processing,
- the demosaicer, which recalculates the bayer subpixels and forms rgb pixels,
- the datamover state machine, which is responsible for running the correction algorithm and transferring pixels from the input image to the output buffer,
- the axi datamover, which is used to store pixel data in ram through the axi bus.

fig. 10 correction system internal architecture [7]

the main performance limit of the above structure is the memory access time. current research is focused on this issue.

2.4 results
as described in [15], we are currently able to process 3-4 frames per second at hd resolution (720p). current research aims to increase the number of frames which can be processed simultaneously. in the meantime, we developed opengl shader programs, which allowed us to build a web-based tool for image and video correction which can be accessed online. using powerful machines equipped with high-performance graphics cards, we can deal with medium-resolution video sequences. however, we are looking for more efficient algorithms which can be used with cheaper devices. moreover, we designed an implementation of the geometrical distortion correction algorithm for android os. the preliminary results show that we can process 2-3 frames per second at hd resolution using one core of a mobile arm cpu. when the resolution is decreased, a higher frame rate can be handled.

fig. 11 snapshot of web-based implementation of fisheye lens geometrical correction algorithm

a sample screenshot from the web-based application is presented in fig. 11. in the left panel the fisheye image is presented. the yellow dots show the image being processed. the right panel presents the image corrected using our implementation of the algorithm.

2.5. summary
the correction of geometrical distortion is a very old problem, which has not been solved efficiently until now. the algorithms are based on trigonometric functions, so the computation consumes substantial time and energy. during image/video processing, random indices in memory have to be accessed, so the burst mode of modern memory systems cannot be used to accelerate pixel reads and/or writes.
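the per-pixel cost and the scattered memory access described above can be illustrated with a minimal sketch. it is not the authors' implementation (which is not disclosed in the paper). it assumes the common equidistant fisheye model, r_fisheye = f * theta, and the function names build_fisheye_lut, remap and trig_ops_per_second are hypothetical. for a fixed lens, the trigonometric part can in principle be computed once into a lookup table, but the reads from the source image still follow the scattered indices stored in that table.

```python
import math

def trig_ops_per_second(width, height, fps):
    """One trigonometric evaluation per output pixel per frame."""
    return width * height * fps

def build_fisheye_lut(width, height, focal_px):
    """For each pixel of the corrected image, find the source pixel in the
    fisheye image, ASSUMING the equidistant model r_fisheye = f * theta."""
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    lut = []
    for y in range(height):
        row = []
        for x in range(width):
            dx, dy = x - cx, y - cy
            r_u = math.hypot(dx, dy)           # radius in the corrected image
            if r_u == 0.0:
                row.append((x, y))             # optical centre maps to itself
                continue
            theta = math.atan(r_u / focal_px)  # angle of the incoming ray
            r_d = focal_px * theta             # radius in the fisheye image
            scale = r_d / r_u                  # always <= 1, so indices stay in range
            row.append((int(round(cx + dx * scale)),
                        int(round(cy + dy * scale))))
        lut.append(row)
    return lut

def remap(src, lut):
    """Per-frame work: one scattered read per output pixel, no trigonometry."""
    return [[src[sy][sx] for (sx, sy) in row] for row in lut]

if __name__ == "__main__":
    print(trig_ops_per_second(1920, 1080, 30))
    lut = build_fisheye_lut(5, 5, focal_px=4.0)
    frame = [[10 * y + x for x in range(5)] for y in range(5)]
    for row in remap(frame, lut):
        print(row)
```

in this sketch neighbouring output pixels do not necessarily read neighbouring source addresses, which is the access pattern that prevents the use of burst transfers.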
the current research of our team is focused on better understanding and finding the regularity in image pixel indexing. intellectual property rights protect the main parts of the presented solution, therefore they are not presented in this paper in detail.

acknowledgement: the research presented in the paper was supported by:
- funds from the state committee for scientific research granted on the basis of the decision number 1374/t11/2004/27,
- funds from the dmcs department granted on the basis of the decision numbers k25/b.w.2/2009 and k-25/2011/bw/3,
- funds from the national science centre granted on the basis of the decision number dec2011/01/d/st6/06269,
- funds from the national centre for research and development granted on the basis of the decision number lider/027/591/l-4/12/ncbr/2013,
- funds from the national centre for research and development (grant number lider/30/110/l3/11/ncbr/2012).

references
[1] p. jabłoński, r. szewczyk, z. kulesza, a. napieralski, m. moreno, j. cabestany, "automatic people identification on the basis of iris pattern – image processing and preliminary analysis", in proceedings of the 23rd international conference on microelectronics, niš, yugoslavia, 12-15 may 2002, vol. 2, pp. 687–690.
[2] r. szewczyk, p. jabłoński, z. kulesza, a. napieralski, m. moreno, j. cabestany, "automatic people identification on the basis of iris pattern – extraction features and classification", in proceedings of the 23rd international conference on microelectronics, niš, yugoslavia, 12-15 may 2002, vol. 2, pp. 691–694.
[3] k. grabowski, w. sankowski, m. napieralska, m. zubert, a. napieralski, "iris recognition algorithm optimized for hardware implementation," 2006 ieee symposium on computational intelligence and bioinformatics and computational biology, cibcb '06, 28-29 sept. 2006, pp. 1–5.
[4] k. grabowski, w. sankowski, m. zubert, m.
napieralska, "reliable iris localization method with application to iris recognition in near infrared light", in proceedings of the international conference mixed design of integrated circuits and system, mixdes 2006, 22-24 june 2006, pp. 684–687. [5] k. grabowski, m. zubert, m. napieralska, a. napieralski, "uncertainty in iris recognition based on texture analysis," in proceedings of the 17th international conference mixed design of integrated circuits and systems,mixdes 2010, 24-26 june 2010, pp.587–592. [6] k. grabowski, a. napieralski, "hardware architecture optimized for iris recognition”, ieee transactions on circuits and systems for video technology, vol. 21, issue 9, pp. 1293–1303, september 2011. [7] j.a napieralski, m.m. pastuszka, w. sankowski, "3d face geometry analysis for biometric identification," in proceedings of the 21st international conference mixed design of integrated circuits & systems, mixdes 2014, 19-21 june 2014, pp. 519–522. [8] p. nowak, "a comparative study on biometric hand identification," in proceedings of the 21st international conference mixed design of integrated circuits & systems, mixdes 2014, 19-21 june 2014, pp. 411–414. [9] w. sankowski, k. grabowski, m. napieralska, m. zubert, "eyelids localization method designed for iris recognition system," in proceedings of the 14th international conference on mixed design of integrated circuits and systems, mixdes 2007, 21-23 june 2007, pp.622–627. [10] w. sankowski, k. grabowski, j. pietek, m. napieralska, m. zubert, "optimization of iris image segmentation algorithm for real time applications," in proceedings of the 16th international conference mixed design of integrated circuits & systems, mixdes 2009, 25-27 june 2009, pp. 671–674. [11] w. sankowski, k. grabowski, m. napieralska, m. zubert, a. napieralski, "reliable algorithm for iris segmentation in eye image". image and vision computing, vol. 28, no. 2, pp. 231–237, 2010. [12] c. hughes, m. glavin, e. jones, p. 
denny, "wide-angle camera technology for automotive applications: a review", iet intelligent transport systems, vol. 3, no. 1, pp. 19–31, 2009.
[13] l. meinel, m. findeisen, m. hes, a. apitzsch, and g. hirtz, "automated real-time surveillance for ambient assisted living using an omnidirectional camera", in proceedings of the ieee international conference on consumer electronics, icce 2014, pp. 396–399.
[14] r. andraka, a survey of cordic algorithms for fpga based computers, fpga'98, monterey, california, usa, association for computing machinery, 1998.
[15] m. michalak, p. sekalski, k. grabowski, s. izydorczyk, "fast fpga-based frame grabber for digital progressive scan image sensors", in proceedings of the 21st international conference mixed design of integrated circuits & systems, mixdes 2014, 19-21 june 2014.
[16] j. cłapa, h. błasinski, k. grabowski, p. sekalski, "a fisheye distortion correction algorithm optimized for hardware implementations", in proceedings of the 21st international conference mixed design of integrated circuits & systems, mixdes 2014, 19-21 june 2014.

facta universitatis series: electronics and energetics vol. 29, no. 2, june 2016, pp. 219-231 doi: 10.2298/fuee1602219s

wave digital models of ideal and real transformers

biljana p. stošić
university of niš, faculty of electronic engineering, niš, serbia

abstract. in this paper, the wave digital filter (wdf) theory is applied to the development of wave digital models of ideal and real transformers, which can be used for modeling more complex structures. the transformer wave digital networks are described and developed here based on scattering variables and two-port and three-port networks of parallel and series adaptors. the wdf-based model of a real transformer includes parasitic resistors and inductors, which are usual in the low-frequency transformer equivalent circuit.
key words: wave digital approach, ideal transformer, real transformer, series/parallel adaptors, network synthesis

1. introduction
the basic theory of wave digital filters (wdf) was developed by alfred fettweis [1-3] in the early 70’s. it was used for digitizing lumped electrical circuits composed of inductors, capacitors, resistors, and other elements of classical network theory. wave digital filters offer computational efficiency and stability under finite-arithmetic conditions, and facilitate interfacing with wave variables, making them a worthwhile subject of study. the wave digital structures are specially tailored with respect to hardware implementation. in the past, the wave digital filter concept has been widely used in time-discrete wave-based modeling and analysis of different physical systems. a detailed review of the application of wdf structures to electromagnetic (em) field simulation is given in [4-6]. the original intent of the author is to develop a self-written tool in matlab based on wdf theory for microstrip circuit simulation in the digital domain. the only practical constraint is that the complex structure under study needs to have an analogous equivalent circuit representation. as stated in [7, 8], a mixed wave-digital/full-wave em method combining the accuracy of em modeling with the convenience of the wave digital approach can be used very efficiently to develop wdf-based network models and to get the scattering parameters of different two-port microstrip structures. the effects of discontinuities are more accurately modeled and taken into account with a full-wave em simulation.

received february 23, 2015; received in revised form june 30, 2015
corresponding author: biljana p. stošić, university of niš, faculty of electronic engineering, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: biljana.stosic@elfak.ni.ac.rs)

compact network models can facilitate the solution of em field problems such as microwave structures.
recently, a systematic method for the automated extraction of lumped-element equivalent circuits for multiport microwave circuits has been presented [9]. a circuit model is extracted directly from the em simulation or measurements. the proposed multiport-foster synthesis method generates a network topology with lumped elements and connection subnetworks based on ideal transformers. network-oriented modeling can also be applied to complex em radiating structures using the segmentation technique and dividing the structure into subregions. the lumped-element circuit models can be established by representing subdomains by foster equivalent circuits, their connection by interconnection elements containing only ideal transformers, and the radiation modes by cauer equivalent circuits. some examples are shown in [10, 11]. this type of connection subnetwork based on ideal transformers can also be used for hybridization of the tlm method with other numerical methods [10, 11]. as mentioned earlier, the author intends to develop a software tool that can be applied to simulate microwave structures of different geometries by using their wave digital network models. so, in order to digitize, by using wdf theory, the equivalent network circuits with incorporated ideal transformers presented in [9-11], wdf-based models of an ideal transformer are developed in the first part of this paper. in wave digital structures, adaptors are used as the connection elements. a. fettweis introduced elementary two-port parallel and series adaptors representing parallel and series connections of two-ports [1-3, 12]. the ideal transformer is an important component in the building of complex microwave systems, and its proper modeling using wdf methods is very important.
in this paper, the abcd and scattering matrix parameters of the ideal transformer are first introduced and interconnected, and then the results are applied to develop wdf-based models, using only two-port series or parallel wave digital adaptors and some additional multipliers. the correctness of these models is verified through a simple proof. contrary to an ideal transformer, a real transformer has winding resistance, flux leakage, finite permeability and core losses. a wdf-based model of the transformer t-equivalent circuit [13, 14], which has been successfully used in steady-state studies and some low-frequency transient studies, is developed in the second part of the paper.

2. the scattering variable formalism
the introduction of the wave variables is the key element of the wave digital approach. the correspondence between a wave digital circuit and the related analog circuit is based on scattering variables, not on voltages and currents. the relevant signal quantities used in the wdf representation are the so-called incident a and reflected b waves, which are defined with respect to each port of the reference network as shown in fig. 1. synthesis of the wave digital network of an ideal transformer can be done by starting from its abcd parameters and transforming them into scattering parameters. for a two-port network characterized by a current and a voltage at each port (fig. 1), the abcd matrix formulation is

$$\begin{bmatrix} u_1 \\ i_1 \end{bmatrix} = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} u_2 \\ i_2 \end{bmatrix}, \quad (1)$$

and the scattering matrix formulation is

$$\begin{bmatrix} b_1 \\ b_2 \end{bmatrix} = \begin{bmatrix} s_{11} & s_{12} \\ s_{21} & s_{22} \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = \mathbf{S} \begin{bmatrix} a_1 \\ a_2 \end{bmatrix}, \quad (2)$$

where $a_1$ and $b_1$ are the incident and reflected wave at port 1 with port resistance $R_1$, and $a_2$ and $b_2$ are the incident and reflected wave at port 2 having port resistance $R_2$ [12, 15].
The classic ABCD-to-S matrix transformation formulas for the two-port case give [12, 15]

$$s_{11} = \frac{A + B G_2 - C R_1 - D R_1 G_2}{A + B G_2 + C R_1 + D R_1 G_2}, \tag{3}$$

$$s_{12} = \frac{2 R_1 G_2 (A D - B C)}{A + B G_2 + C R_1 + D R_1 G_2}, \tag{4}$$

$$s_{21} = \frac{2}{A + B G_2 + C R_1 + D R_1 G_2}, \tag{5}$$

$$s_{22} = \frac{-A + B G_2 - C R_1 + D R_1 G_2}{A + B G_2 + C R_1 + D R_1 G_2}, \tag{6}$$

where $G_2 = 1/R_2$ (Fig. 1).

Fig. 1 Two-port network

3. Two- and three-port adaptors

Adaptors are memoryless digital elements whose task is to perform transformations between pairs of wave variables that are referred to different port resistances [1-4, 12]. They have low sensitivity to coefficient quantization. In this section, the two- and three-port adaptors required subsequently are presented.

3.1. Two-port adaptors

The equations for the two-port parallel adaptor are

$$b_1 = a_2 + \alpha (a_1 - a_2), \tag{7}$$

$$b_2 = a_1 + \alpha (a_1 - a_2), \tag{8}$$

where the adaptor coefficient $\alpha = (R_2 - R_1)/(R_2 + R_1)$ is usually written on the side corresponding to port 2. The equations for the two-port series adaptor are

$$b_1 = -a_2 + \alpha (a_1 + a_2), \tag{9}$$

$$b_2 = -a_1 - \alpha (a_1 + a_2). \tag{10}$$

The adaptor coefficient can also be defined as $\beta = -\alpha = (R_1 - R_2)/(R_2 + R_1)$ and written on the side corresponding to port 1. In this case, Eqs. (7)-(8) and Eqs. (9)-(10) can be written in an alternative form.

3.2. Three-port adaptors

A network of the three-port series adaptor with port 2 being reflection-free is depicted in Fig. 2 [1-4, 12]. The adaptor coefficient $\alpha = R_1/R_2$, with $R_2 = R_1 + R_3$, is shown explicitly next to port 1, Fig. 2a. The set of equations for the three-port series adaptor with reflection-free port 2 is

$$a_0 = a_1 + a_2 + a_3, \tag{11}$$

$$b_1 = a_1 - \alpha a_0, \tag{12}$$

$$b_2 = -(a_1 + a_3), \tag{13}$$

$$b_3 = -(b_1 + a_2). \tag{14}$$

Fig.
2 Three-port series adaptor with reflection-free port 2: a) symbol and b) wave digital network model

A network of the three-port parallel adaptor with port 2 being reflection-free is depicted in Fig. 3 [1-4]. The adaptor coefficient $\alpha = G_1/G_2$, with $G_2 = G_1 + G_3$, is shown explicitly next to port 1, Fig. 3a. The set of equations for the three-port parallel adaptor with reflection-free port 2 is

$$a_0 = \alpha a_1 + a_2 + (1 - \alpha) a_3, \tag{15}$$

$$b_1 = b_3 + a_3 - a_1, \tag{16}$$

$$b_2 = a_3 + \alpha (a_1 - a_3), \tag{17}$$

$$b_3 = a_2 + \alpha (a_1 - a_3). \tag{18}$$

Fig. 3 Three-port parallel adaptor with reflection-free port 2: a) symbol and b) wave digital network model

4. Definition of an ideal transformer and its matrices

A transformer usually consists of two coupled windings (primary and secondary) on a magnetic iron core. An ideal transformer is an imaginary transformer without any losses, that is, with no core losses, copper losses or any other losses. The efficiency of this transformer is considered to be 100 %. In general, there are different cases of an ideal transformer depending on the polarity of the voltages and the current ratios. Also, the same turns ratio can be given in different ways. In general, the equation set of an ideal transformer is written according to the established reference polarities for currents (both currents entering, or one entering and one leaving, the dotted terminals) and voltages (both voltages positive/negative, or one positive and one negative, at the dotted terminals). In this section, the ABCD and scattering matrix parameters for one case of an ideal transformer are first introduced and related to each other, and then the results are applied to develop WDF-based models, using only two-port series or parallel wave digital adaptors with the defined adaptor coefficient $\alpha$.
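The three-port adaptor equations of Section 3.2 translate almost line by line into code. The following sketch implements Eqs. (11)-(14) and Eqs. (16)-(18); the function names and the choice of passing the port resistances or conductances (rather than the coefficient itself) are illustrative assumptions.

```python
def series_adaptor_3port(a1, a2, a3, r1, r3):
    """Three-port series adaptor with reflection-free port 2 (r2 = r1 + r3).

    Implements Eqs. (11)-(14) with adaptor coefficient alpha = r1 / r2.
    """
    alpha = r1 / (r1 + r3)
    a0 = a1 + a2 + a3
    b1 = a1 - alpha * a0
    b2 = -(a1 + a3)          # does not depend on a2: port 2 is reflection-free
    b3 = -(b1 + a2)
    return b1, b2, b3


def parallel_adaptor_3port(a1, a2, a3, g1, g3):
    """Three-port parallel adaptor with reflection-free port 2 (g2 = g1 + g3).

    Implements Eqs. (16)-(18) with adaptor coefficient alpha = g1 / g2.
    """
    alpha = g1 / (g1 + g3)
    b2 = a3 + alpha * (a1 - a3)   # does not depend on a2
    b3 = a2 + alpha * (a1 - a3)
    b1 = b3 + a3 - a1
    return b1, b2, b3
```

A simple sanity check, assuming voltage waves, is that the series adaptor yields the same current $(a_k - b_k)/(2R_k)$ at all three ports, while the parallel adaptor yields the same voltage $(a_k + b_k)/2$ at all three ports.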
A schematic symbol of an ideal transformer with turns ratio $n$, where both currents enter the dot-marked terminals and both voltages are positive at the dotted terminals, is presented in Fig. 4. The ABCD parameters of the ideal transformer given in Fig. 4 can be easily obtained by considering its equation system

$$\frac{v_1}{v_2} = \frac{1}{n}, \qquad \frac{i_1}{i_2} = -n, \tag{19}$$

and they are given in matrix form as

$$\begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} 1/n & 0 \\ 0 & n \end{bmatrix}. \tag{20}$$

The S-parameters of an ideal transformer can be found starting with its ABCD matrix given in Eq. (20) and the previously given classic formulas, Eqs. (3)-(6), as follows

$$s_{11} = \frac{1 - n^2 R_1 G_2}{1 + n^2 R_1 G_2}, \qquad s_{12} = \frac{2 n R_1 G_2}{1 + n^2 R_1 G_2}, \tag{21}$$

$$s_{21} = \frac{2 n}{1 + n^2 R_1 G_2}, \qquad s_{22} = -\frac{1 - n^2 R_1 G_2}{1 + n^2 R_1 G_2}. \tag{22}$$

Fig. 4 A schematic symbol of an ideal transformer

5. The wave digital models of an ideal transformer

An ideal transformer is transformed to the wave digital domain by expressing its S-parameter equations in the form corresponding to the equations of the two-port parallel or series adaptor networks and by choosing the value of the adaptor coefficient $\alpha$. In order to represent an ideal transformer with a network based on the two-port parallel adaptor, the adaptor coefficient is chosen to be

$$\alpha = \frac{R_2 - n^2 R_1}{R_2 + n^2 R_1}. \tag{23}$$

For this case, the S-parameters of the ideal transformer given in Eqs. (21)-(22) can be written in the form

$$\mathbf{S} = \begin{bmatrix} \alpha & \dfrac{1}{n}(1 - \alpha) \\ n (1 + \alpha) & -\alpha \end{bmatrix}. \tag{24}$$

For these scattering parameters, according to the formulation given in Eq. (2), the set of equations for the ideal transformer is

$$b_1 = s_{11} a_1 + s_{12} a_2 = \alpha a_1 + \frac{1}{n}(1 - \alpha) a_2, \tag{25}$$

$$b_2 = s_{21} a_1 + s_{22} a_2 = n (1 + \alpha) a_1 - \alpha a_2. \tag{26}$$

5.1. The ideal transformer model based on two-port parallel adaptor network

To synthesize the wave digital network of the ideal transformer based on the parallel adaptor network, Eqs.
(25)-(26) are written in the form of Eqs. (7)-(8) as follows

$$b_1 = \frac{1}{n}\left[a_2 + \alpha (n a_1 - a_2)\right], \tag{27}$$

$$b_2 = n a_1 + \alpha (n a_1 - a_2). \tag{28}$$

According to the last equations, the wave digital network model of an ideal transformer based on the two-port parallel adaptor network is depicted in Fig. 5. In addition to the wave digital network of the two-port parallel adaptor [1-4], this transformer model has two more multipliers. The WDF-based model can be found in [3], but a detailed description is not given there. In Fig. 6, the wave digital network models for different ideal transformer cases are presented symbolically. By observing the ABCD parameters and the models given in Figs. 5 and 6, a generalization of the multiplier values can be made. The multiplier in branch $b_1$ always takes the value of parameter $A$. The multiplier in branch $a_1$ takes the value of parameter $D$ if parameters $A$ and $D$ have the same sign (both positive or both negative), and the value $-D$ in the other cases (one positive and one negative parameter).

Fig. 5 Ideal transformer: its realization for arbitrary choices of $R_1$, $R_2$ and $n$, based on the two-port parallel adaptor network, with $\begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} 1/n & 0 \\ 0 & n \end{bmatrix}$: (a) wave digital network model and (b) its symbolic representation

(a) An ideal transformer where one current enters and one leaves the dotted terminals, and one voltage is positive and one negative at the dotted terminals: $v_1/v_2 = -1/n$, $i_1/i_2 = n$, $\begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} -1/n & 0 \\ 0 & -n \end{bmatrix}$

(b) An ideal transformer where both currents enter the dotted terminals, and one voltage is positive and one negative at the dotted terminals: $v_1/v_2 = -1/n$, $i_1/i_2 = -n$, $\begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} -1/n & 0 \\ 0 & n \end{bmatrix}$

(c) An ideal transformer where one current enters and one leaves the dotted terminals, and both voltages are positive at the dotted terminals: $v_1/v_2 = 1/n$, $i_1/i_2 = n$, $\begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} 1/n & 0 \\ 0 & -n \end{bmatrix}$

Fig.
6 Ideal transformers: equation systems and symbolic representations of their wave digital network models for arbitrary choices of $R_1$, $R_2$ and $n$, based on the two-port parallel adaptor network with $\alpha = (R_2 - n^2 R_1)/(R_2 + n^2 R_1)$

5.2. The ideal transformer model based on two-port series adaptor network

In order to form the wave digital model of an ideal transformer based on the series adaptor network, the previously given Eqs. (25)-(26) have to be written in the form of Eqs. (9)-(10) as shown

$$b_1 = \frac{1}{n} a_2 + \alpha \left(a_1 - \frac{1}{n} a_2\right), \tag{29}$$

$$b_2 = n \left[a_1 + \alpha \left(a_1 - \frac{1}{n} a_2\right)\right]. \tag{30}$$

Starting with Eqs. (29) and (30), the wave digital network model of an ideal transformer based on the two-port series adaptor network can be obtained, and its representation in the MATLAB Simulink toolbox can be drawn. It will contain three multipliers with constant coefficients and three adders. Its network model is a modification of the one shown in Fig. 5b: multiplier $n$ is moved from the branch of $a_1$ to the branch corresponding to $b_2$, and multiplier $1/n$ from the branch of $b_1$ to the branch of $a_2$.

5.3. WDF-based model possibilities

A detailed description of the synthesis process of the WDF-based model for the ideal transformer case given in Fig. 4 was presented above. One can follow the described procedure and develop wave digital network models for any possible case of an ideal transformer and for any way of specifying the turns ratio. The choice of two-port series or parallel adaptors with the defined coefficient $\alpha$ or $\beta$ can be made depending on the problem at hand. The solution which gives the simplest overall expressions for the particular problem is preferable.
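The two adaptor-based forms can be cross-checked numerically against the scattering matrix obtained from the ABCD parameters. The sketch below is an illustrative check only: it implements Eqs. (3)-(6), the coefficient of Eq. (23), and the update equations (27)-(28) and (29)-(30). The function names and the numerical values of $n$, $R_1$ and $R_2$ are assumptions made for the example.

```python
def abcd_to_s(A, B, C, D, r1, r2):
    """Eqs. (3)-(6): ABCD parameters -> S-parameters for port resistances r1, r2."""
    g2 = 1.0 / r2
    den = A + B * g2 + C * r1 + D * r1 * g2
    s11 = (A + B * g2 - C * r1 - D * r1 * g2) / den
    s12 = 2.0 * r1 * g2 * (A * D - B * C) / den
    s21 = 2.0 / den
    s22 = (-A + B * g2 - C * r1 + D * r1 * g2) / den
    return s11, s12, s21, s22


def adaptor_coefficient(n, r1, r2):
    """Eq. (23)."""
    return (r2 - n * n * r1) / (r2 + n * n * r1)


def transformer_parallel_form(a1, a2, n, r1, r2):
    """Eqs. (27)-(28): model based on the two-port parallel adaptor."""
    d = adaptor_coefficient(n, r1, r2) * (n * a1 - a2)
    return (a2 + d) / n, n * a1 + d


def transformer_series_form(a1, a2, n, r1, r2):
    """Eqs. (29)-(30): model based on the two-port series adaptor."""
    d = adaptor_coefficient(n, r1, r2) * (a1 - a2 / n)
    return a2 / n + d, n * (a1 + d)


def transformer_from_s(a1, a2, n, r1, r2):
    """Reference: Eq. (2) with the S-matrix of the ABCD matrix in Eq. (20)."""
    s11, s12, s21, s22 = abcd_to_s(1.0 / n, 0.0, 0.0, n, r1, r2)
    return s11 * a1 + s12 * a2, s21 * a1 + s22 * a2


# Hypothetical values chosen only for illustration.
n, r1, r2 = 2.0, 50.0, 75.0
reference = transformer_from_s(0.7, -0.3, n, r1, r2)
parallel = transformer_parallel_form(0.7, -0.3, n, r1, r2)
series = transformer_series_form(0.7, -0.3, n, r1, r2)
```

For any incident pair the three results should agree up to rounding, since Eqs. (27)-(28) and (29)-(30) are only rearrangements of Eqs. (25)-(26).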
The Simulink models of the parallel and series adaptors, as well as of the different ideal transformer cases, can be added to the Simulink library browser among the commonly used blocks, and those blocks can then be used in generating wave digital network models of complex structures.

5.4. Proof of the ideal transformer feature

Let us first consider the ideal transformer shown in Fig. 4 terminated by a resistor $R_2 = R_p$. Its input impedance is

$$Z_{in} = \left.\frac{v_1}{i_1}\right|_{R_p} = \frac{A R_p + B}{C R_p + D} = \frac{R_p}{n^2}.$$

If $R_1$ and $R_2$ are port resistances, the reflection coefficient is

$$\Gamma(z) = \frac{b_1}{a_1} = \frac{Z_{in} - R_1}{Z_{in} + R_1} = \frac{R_p - n^2 R_1}{R_p + n^2 R_1}. \tag{31}$$

The wave digital network of an ideal transformer based on the two-port parallel adaptor shown in Fig. 5 will be used in order to prove the main transformer feature. At the far end (port 2), it is terminated by a wave digital element corresponding to a resistor (multiplier $\rho$). By substituting the relation

$$\frac{a_2}{b_2} = \rho, \tag{32}$$

where $\rho = (R_p - R_2)/(R_p + R_2)$, into the set of Eqs. (25)-(26), the reflection coefficient is found to be

$$\Gamma(z) = \frac{b_1}{a_1} = \alpha + \frac{(1 - \alpha)(1 + \alpha)\,\rho}{1 + \alpha \rho}. \tag{33}$$

In the case when the resistance of the resistor at the far end is chosen to be equal to its port resistance, $R_p = R_2$, the coefficient $\rho$ takes the value zero, and finally the reflection coefficient is

$$\Gamma(z) = \alpha = \frac{R_2 - n^2 R_1}{R_2 + n^2 R_1}. \tag{34}$$

It is evident that the reflection coefficients given by Eqs. (31) and (34) are the same. In this way, it is shown that this wave digital network operates exactly like an ideal transformer: it transforms the load impedance.

6. Definition of a real transformer

Although transformers are used for different purposes, the fundamental theory and concepts of all transformers are the same. A real transformer is a non-ideal, or practical, iron-core transformer. Contrary to an ideal transformer, a practical transformer has winding resistance, flux leakage, finite permeability and core losses.
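Returning briefly to the proof of Section 5.4, the statement that the wave digital model transforms the load impedance can also be checked numerically for a load that is not equal to the port resistance. The sketch below is only an illustration under stated assumptions: the load multiplier is taken as $(R_p - R_2)/(R_p + R_2)$, the delay-free loop between the adaptor and the load is resolved algebraically rather than by an actual WDF realization, and the function names are hypothetical.

```python
def terminated_reflection(n, r1, r2, rp):
    """b1/a1 of the wave digital ideal transformer, Eqs. (25)-(26),
    with port 2 terminated by a resistor rp (a2 = rho * b2)."""
    alpha = (r2 - n * n * r1) / (r2 + n * n * r1)      # Eq. (23)
    rho = (rp - r2) / (rp + r2)                        # assumed load multiplier
    a1 = 1.0
    # Solve b2 = n*(1 + alpha)*a1 - alpha*rho*b2 for b2 (delay-free loop).
    b2 = n * (1.0 + alpha) * a1 / (1.0 + alpha * rho)
    a2 = rho * b2
    b1 = alpha * a1 + (1.0 - alpha) * a2 / n           # Eq. (25)
    return b1 / a1


def expected_reflection(n, r1, rp):
    """Eq. (31): reflection coefficient of the transformed load rp / n**2."""
    z_in = rp / (n * n)
    return (z_in - r1) / (z_in + r1)
```

With these definitions the two functions should return the same value for any positive $R_1$, $R_2$ and $R_p$; the special case $R_p = R_2$ reduces to Eq. (34).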
All these effects have to be considered in order to derive the equivalent circuit of a real transformer. The behaviour of a real transformer may be represented by an equivalent circuit model which contains an ideal transformer. The remaining elements are those that contribute to the non-ideal characteristics of the device. The linear model of a transformer, with core losses and load losses taken into account, is considered here. The ideal transformer can be shifted to either side of the circuit, and the considered equivalent circuit, with the secondary parameters referred to the primary side, is shown in Fig. 7. This model is known as the transformer T-equivalent circuit (T-model) [13, 14] and has been successfully used for many years in steady-state studies and some low-frequency transient studies. In the equivalent circuit, $R_p$ and $R_s$ are the primary and secondary winding resistances (series resistances including the conductor losses of each winding), and $X_p = \omega L_p$ and $X_s = \omega L_s$ are the primary and secondary leakage reactances. The parallel branch (the so-called magnetizing branch of the model) represents the magnetic core of the transformer, where $R_c$ represents the core losses and $X_m = \omega L_m$ is the magnetizing reactance. All parameter values in the transformer model shown in Fig. 7 can be determined experimentally. The equivalent model that includes the capacitances between the windings and between each winding and ground, as well as the self-capacitance of each winding, can be found in [13]. The wave digital model of such an equivalent circuit with included parasitic capacitances will be derived in the author's future work.

Fig. 7 Equivalent circuit of a practical transformer referred to the primary side

7. The wave digital model of a real transformer

The analysis presented for an ideal transformer is helpful when digitizing the real transformer. The WDF-based model of the real transformer is developed following the rules of interconnection of wave elements.
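Before moving to the wave digital domain, it can be useful to have a frequency-domain reference for the T-equivalent circuit of Fig. 7. The sketch below computes the impedance seen at the primary terminals for a resistive load. It is an illustrative assumption-laden example: the topology follows the textual description of the T-model (series primary branch, shunt magnetizing branch $R_c \parallel jX_m$, series secondary branch), secondary quantities are referred to the primary side by division by $n^2$ in line with $Z_{in} = R_p/n^2$ of Section 5.4, and all names are hypothetical.

```python
import math


def parallel(z1, z2):
    """Impedance of two branches connected in parallel."""
    return z1 * z2 / (z1 + z2)


def t_model_input_impedance(f, rp, lp, rc, lm, rs, ls, n, r_out):
    """Input impedance of the T-equivalent circuit at frequency f (f > 0).

    rp, lp : primary winding resistance and leakage inductance
    rc, lm : core-loss resistance and magnetizing inductance (shunt branch)
    rs, ls : secondary winding resistance and leakage inductance
    n      : turns-ratio parameter of the ideal transformer (v1/v2 = 1/n)
    r_out  : load resistance on the secondary side
    """
    w = 2.0 * math.pi * f
    z_primary = complex(rp, w * lp)
    z_magnetizing = parallel(complex(rc, 0.0), complex(0.0, w * lm))
    # Secondary branch and load referred to the primary side (assumed 1/n**2).
    z_secondary = (complex(rs, w * ls) + r_out) / (n * n)
    return z_primary + parallel(z_magnetizing, z_secondary)
```

When the magnetizing branch is made very large, the result approaches the simple series connection of the primary branch with the referred secondary branch and load, which gives a quick plausibility check for any wave digital implementation of the same circuit.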
To form the WDF-based model from the transformer T-equivalent circuit of Fig. 7, it is advisable to divide it into several parts representing a real voltage source, three-port series/parallel adaptors with series/parallel element branches, an ideal transformer and a load resistance. In Fig. 7, the secondary quantities referred to the primary side are $X_s' = X_s/n^2$ and $R_s' = R_s/n^2$. The block connection in digital form, i.e. the symbolic representation of the wave digital model of the real transformer, is shown in Fig. 8. In that way, the circuit elements are separated from their connection network. Then, the series and parallel element branches with resistor and inductor are represented by the corresponding models. Wave digital one-ports corresponding to the classical one-ports such as the resistor, the inductor and the real voltage source can be found in [1-4, 12]. The one-ports are transformed into the wave digital domain by expressing their underlying physical law in terms of wave variables, applying the bilinear transform, and choosing the port resistances. In the literature [14], one can also find a further transformer equivalent-circuit model. It can also be transformed to the wave digital domain by following the interconnection rules and by using the wave digital one-port models marked in Fig. 8.

The wave digital network model parameters (port resistances and adaptor coefficients) are defined as follows, where $T$ is the sampling period and each adaptor coefficient is given as the indicated ratio:

$$R_1 = R_p, \quad R_2 = \frac{2 L_p}{T}, \quad R_{11} = R_1 + R_2, \quad \text{adaptor coefficient } \frac{R_1}{R_{11}}, \tag{35}$$

$$R_3 = R_c, \quad R_4 = \frac{2 L_m}{T}, \quad G_{22} = G_3 + G_4, \quad \text{adaptor coefficient } \frac{G_3}{G_{22}}, \tag{36}$$

$$R_5 = R_s', \quad R_6 = \frac{2 L_s'}{T}, \quad R_{33} = R_5 + R_6, \quad \text{adaptor coefficient } \frac{R_5}{R_{33}}, \tag{37}$$

$$R_{12} = R_{in} + R_{11}, \quad \text{adaptor coefficient } \frac{R_{in}}{R_{12}}, \tag{38}$$

$$G_{23} = G_{12} + G_{22}, \quad \text{adaptor coefficient } \frac{G_{12}}{G_{23}}, \tag{39}$$

$$R_{34} = R_{23} + R_{33}, \quad \text{adaptor coefficient } \frac{R_{23}}{R_{34}}, \quad \text{and} \quad \alpha = \frac{R_{out} - n^2 R_{34}}{R_{out} + n^2 R_{34}}. \tag{40}$$

The response of the formed network model can be easily found by use of a block-diagram network drawn in the Simulink toolbox and some basic MATLAB functions, allowing for accurate and fast modeling and analysis of circuits.

Fig.
8 Real transformer: symbolic representation of the WDF-based network model

8. Conclusion

The application of the wave digital approach to the modeling and analysis of microstrip structures at high operating frequencies is considered in [7, 8]. The two-port microstrip structures under investigation are represented by their equivalent circuits. Generation of the equivalent circuit models is also presented in those papers. Furthermore, in the case of multi-port microwave circuits, one of the techniques to construct an equivalent-circuit model is based on numerical data obtained from either full-wave analysis or measurement [9] in a wide frequency range. For this purpose, system identification is performed on the impedance/admittance function. The generated equivalent circuit consists of a connection network and lumped elements. The connection network does not store energy and can be represented by a canonical form which only makes use of ideal transformers [16]. The element values and the turns ratios of the transformers are determined automatically [9]. The equivalent circuit contains lossy elements, i.e. losses in the microwave structure are included through resistances and conductances.

The main goal here is to digitize these devices (ideal and real transformers) based on their known equivalent circuits. Because of that, one possible equivalent circuit of a real transformer and its corresponding wave digital network model are shown. A schematic representation of the major low-frequency parasitic elements in a generalized transformer is shown in this paper. In most textbooks and literature, this model is used to describe the transformer behaviour. In general, this equivalent circuit is more accurate than that of an ideal transformer, but less accurate than the exact one. It is also possible to form a wave digital network model of a real transformer that includes all high-frequency parasitic effects.
Such an equivalent circuit is capable of representing a practical design with considerable accuracy, but actual calculations would be very difficult in some cases. The wave digital model of such an equivalent circuit will be derived in the author's future work.

In general, this paper shows how wave digital networks representing digital models of ideal and real transformers are synthesized based on scattering variables. The wave digital model of the ideal transformer is developed based on two-port parallel/series adaptor networks. The WDF-based network model of the ideal transformer is modified to include the effects of winding resistance, flux leakage, finite permeability and core losses. In that way, the real transformer is digitized and its WDF-based model is formed.

The author's intention is to develop a software tool embedded in MATLAB for the simulation of a wide variety of microwave structures based on their equivalent circuits and the wave digital approach. A transformer is a very common magnetic structure which can be found in many applications. In complex EM structures, such as multiport microwave or radiating structures, the connection subnetworks in their equivalent circuits can be represented based on ideal transformers [6, 9-11]. The developed wave digital models of ideal and real transformers can be added to the Simulink library browser among the commonly used blocks, and those blocks can then be used in generating wave digital network models of the mentioned complex structures containing transformers as interconnecting elements. As a result, the number of types of structures that can be efficiently modeled by using the wave digital approach is increased.
Compact wave digital network models, embedded into MATLAB, can provide considerably lower computational effort and run time in comparison with full-wave EM analysis. Because of the complexity of modern microwave structures and systems, a full-wave EM analysis of these structures/systems is often prohibitive owing to limitations in processing and memory capabilities.

Acknowledgement: This paper has been supported by the Ministry for Education, Science and Technological Development of Serbia under Grant No. TR32052.

References

[1] A. Fettweis, "Digital filter structures related to classical filter networks", Archiv für Elektronik und Übertragungstechnik, vol. 25, no. 2, pp. 79-89, 1971.
[2] A. Fettweis, "Digital circuits and systems", IEEE Transactions on Circuits and Systems, vol. CAS-31, no. 1, pp. 31-48, 1984.
[3] A. Fettweis, "Wave digital filters: theory and practice", Proc. IEEE, vol. 74, no. 2, pp. 270-327, 1986.
[4] S. Bilbao, Wave and Scattering Methods for Numerical Simulation, Hoboken, New Jersey: Wiley, 2004.
[5] J.A. Russer, Y. Kuznetsov, P. Russer, "Discrete-time network and state equation methods applied to computational electromagnetics", Microwave Review, vol. 16, no. 1, pp. 2-14, 2010.
[6] L.B. Felsen, M. Mongiardo, P. Russer, Electromagnetic Field Computation by Network Methods, Springer-Verlag, 2009.
[7] B.P. Stošić, N.S. Dončov, A.S. Atanasković, "Response calculation of parallel-coupled resonator filters by use of synthesized wave digital network", in Proc. of the 11th International Conference on Telecommunications in Modern Cable, Satellite and Broadcasting Services (TELSIKS), Serbia, Niš, October 16-19, 2013, vol. 1, pp. 253-256.
[8] B.P. Stošić, N. Dončov, J. Russer, B. Milovanović, "A combined wave digital/full-wave electromagnetic approach used for response calculation in equivalent networks of microwave circuits", in Proc.
of the International Conference on Electromagnetics in Advanced Applications (ICEAA 2013), Italy, Torino, September 9-13, 2013, pp. 569-572.
[9] J. Russer, F. Mukhtar, B.P. Stošić, T. Asenov, A. Atanasković, N.S. Dončov, B. Milovanović, P. Russer, "Systematic network model generation for linear reciprocal microwave multiports", in Proc. of the 7th European Microwave Integrated Circuits Conference EuMIC 2012, The Netherlands, Amsterdam, 29-30 October 2012, pp. 40-43.
[10] P. Russer, "Network-oriented modeling of radiating electromagnetic structures", Turkish Journal of Electrical Engineering, vol. 10, no. 2, pp. 147-162, 2002.
[11] P. Lorenz, P. Russer, "Connection subnetworks for the transmission line matrix (TLM) method", in Time-Domain Methods in Modern Engineering Electromagnetics, ser. Springer Proceedings in Physics, P. Russer and U. Siart, Eds. Berlin, Springer, vol. 121, pp. 263-281, 2008.
[12] M.V. Gmitrović, Microwave and Wave Digital Filters, Faculty of Electronic Engineering, Niš, Serbia, 2007 (in Serbian).
[13] B. Whitlock, "Audio transformers", in Handbook for Sound Engineers, G.M. Ballou, Jensen Transformers Inc., Chatsworth, CA, 2006.
[14] K. Shaarbafi, Transformer Modelling Guide, Teshmont Consultants LP, 2014.
[15] B.P. Stošić, N.S. Dončov, "Synthesis and use of wave digital networks of admittance inverters", Microwave Review, vol. 19, no. 2, pp. 89-95, 2013.
[16] P. Russer, M. Mongiardo, L.B. Felsen, "Electromagnetic field representations and computations in complex structures III: network representations of the connection and subdomain circuits", International Journal of Numerical Modelling: Electronic Networks, Devices and Fields, vol. 15, no. 1, pp. 127-145, 2002.

Facta Universitatis, Series: Electronics and Energetics, Vol. 28, No 4, December 2015, pp.
611-623, DOI: 10.2298/FUEE1504611R

Analysis of half-band approximately linear phase IIR filter realization structure in MATLAB

Aleksandar D. Radonjić 1, Jelena D. Ćertić 2
1 Crnogorski Telekom A.D., Montenegro
2 School of Electrical Engineering, University of Belgrade, Belgrade, Serbia

Abstract. In this paper a detailed analysis of an atypical filter structure in the MATLAB Filter Design and Analysis (FDA) tool is presented. As an example of an atypical filter structure, the IIR half-band filter with approximately linear phase, realized as a parallel connection of two all-pass branches, was examined. We compare two types of such filters obtained by two different design algorithms. The FDA tool was used for the experiment because different effects of the fixed-point implementation can be simulated easily. One of the goals of this paper was to compare the results obtained by the two design algorithms. In addition, different realizations of the filter structure based on the parallel connection of two all-pass branches were examined.

Key words: approximately linear phase IIR filters, FDA tool, half-band IIR filters

1. Introduction

The digital filter design process consists of several steps. After the design itself, a very important step is the analysis of different aspects of the filter implementation. If the filter is to be implemented in fixed-point arithmetic, the quantization effects should be carefully examined [1]. This can be done by theoretical investigation, for example, by sensitivity analysis [2] and a detailed round-off noise study. It is not always possible to calculate closed-form expressions for all the transfer functions that are needed for the exact derivation of the sensitivity functions. For digital filters, it is common practice to use numerical simulation of the quantization effects [3]. For that purpose, a simulation model of the specific target platform can be developed or, alternatively, commercially available tools can be used.
The first solution is time-consuming and requires good knowledge of fixed-point arithmetic and of all the parameters of the target platform. For example, if the target platform is a DSP processor, it is not enough to take care of the word-length of the processor. Usually, it is necessary to fully understand the structure of the integrated multiplier. In the second approach, when a commercially available tool is used, the analysis time can be decreased. Analysis tools contain sets of typical values for the relevant parameters of the proposed design. The drawback of this method is that commercially available analysis tools do not have procedures for all possible cases. This means that in the case of an atypical filter design, an analysis tool would probably be of no help.

Received December 3, 2014; received in revised form June 15, 2015
Corresponding author: Jelena D. Ćertić, School of Electrical Engineering, University of Belgrade, Bulevar kralja Aleksandra 73, 11020 Belgrade, Serbia (e-mail: certic@etf.bg.ac.rs)
An earlier version of this manuscript received the Best Section Paper Award (Electric Circuits and Systems and Signal Processing Section) at the 58th ETRAN Conference, Vrnjačka Banja, 2-5 June, 2014 [5].

In this paper we analyze an IIR half-band approximately linear phase filter by means of a commercial analysis tool. We use the MATLAB Filter Design and Analysis (FDA) tool [4] because it simulates quantization effects in a way that is suitable for fixed-point implementation. We compare the results obtained for filters designed by two different algorithms. The filter is realized as a parallel connection of two all-pass branches [1, 2]. Although the parallel connection of two all-pass branches is a common choice for the implementation of low-pass/high-pass odd-order IIR filters [1, 2], it is not fully supported in the MATLAB Filter Design and Analysis tool [4, 5].
We define a procedure that can be used for the analysis, by the MATLAB FDA tool, of a specific filter structure: the IIR half-band filter with approximately linear phase. This paper is organized as follows: in Section 2 the capabilities of the MATLAB FDA tool relevant for fixed-point implementation are presented; in Section 3 the IIR half-band approximately linear phase filters are discussed; in Section 4 possible realization structures are defined; in Section 5 the results of the analysis are presented; and Section 6 concludes the paper.

2. MATLAB FDA tool

In recent years, new versions of MATLAB have been released twice a year [6]. Typically, each new version has some new features regarding filter design and analysis. The Filter Design and Analysis (FDA) tool is part of the Signal Processing Toolbox [4]. By using the FDA tool, different filter structures can be designed and analyzed rapidly, because the FDA tool itself contains algorithms for the design of different filter types and a large set of analysis procedures. However, it sometimes seems that new features are not introduced into this tool fast enough. For the scope of our project, the part of the FDA tool related to the simulation of quantization effects is important. It should be noted that the simulation of quantization requires an additional (Fixed-Point) toolbox.

Fig. 1 FDA tool: setting simulation parameters of the multiplier

For the supported filter types, the FDA simulation of quantization is a powerful tool that allows the user to verify the robustness of the filter structure to different effects of the quantization process. The user can define the word-length parameter for the input signal, the output signal and the filter coefficients. In addition, the number of bits associated with the fractional part of the data (input signal, output signal and filter coefficients) can be set.
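To make the roles of the word-length and of the number of fractional bits concrete, the following sketch quantizes a coefficient to a signed fixed-point format. It is a generic illustration and is not meant to reproduce the internal behaviour of the FDA tool: two's-complement representation, round-to-nearest and saturation on overflow are assumptions, and the function name is hypothetical.

```python
import math


def quantize(x, word_length, frac_length):
    """Quantize x to a signed fixed-point value.

    word_length : total number of bits, including the sign bit
    frac_length : number of bits assigned to the fractional part
    Assumes two's complement, round-to-nearest and saturation.
    """
    scale = 2 ** frac_length
    q = math.floor(x * scale + 0.5)                 # nearest integer code
    q_min = -(2 ** (word_length - 1))
    q_max = 2 ** (word_length - 1) - 1
    q = max(q_min, min(q_max, q))                   # saturate on overflow
    return q / scale


# An 8-bit coefficient with 7 fractional bits: resolution 2**-7.
c8 = quantize(0.3, 8, 7)
# The same coefficient with 16 bits and 15 fractional bits.
c16 = quantize(0.3, 16, 15)
```

Comparing a frequency response computed with the quantized coefficients against the one computed with the original coefficients is the kind of experiment that the quantization simulation described in this section automates.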
The multiplier/accumulator structure can be simulated by defining values for the relevant parameters, Fig. 1. The user can enter data through the GUI or choose a set of predefined values. The predefined values usually correspond to a "best possible" scenario that cannot always be obtained in "real world" situations, but they can be useful for users inexperienced in fixed-point applications.

3. Half-band approximately linear phase IIR filters

An odd-order IIR filter (or filter pair) can be implemented as a parallel connection of two all-pass branches $A_0(z)$ and $A_1(z)$, Fig. 2. The transfer functions of the low-pass filter, $H_{lp}(z)$, and of the high-pass filter, $H_{hp}(z)$, are obtained as:

$$H_{lp}(z) = \frac{A_0(z) + A_1(z)}{2}, \tag{1a}$$

$$H_{hp}(z) = \frac{A_0(z) - A_1(z)}{2}. \tag{1b}$$

Usually, the all-pass branches are implemented as cascaded connections of one first-order section and second-order sections:

$$A_0(z) = \prod_{l=2,4,\dots}^{(N+1)/2} \frac{a_{l2} + a_{l1} z^{-1} + z^{-2}}{1 + a_{l1} z^{-1} + a_{l2} z^{-2}}, \tag{2a}$$

$$A_1(z) = \frac{a_{11} + z^{-1}}{1 + a_{11} z^{-1}} \prod_{l=3,5,\dots}^{(N+1)/2} \frac{a_{l2} + a_{l1} z^{-1} + z^{-2}}{1 + a_{l1} z^{-1} + a_{l2} z^{-2}}, \tag{2b}$$

where $N$ is the filter order, an odd number, and the constants $a_{li}$, $l = 1, 2, 3, \dots, (N+1)/2$, $i = 1, 2$, are the coefficients of the first- and second-order sections [2]. It should be noted that for the overall filter $H_{lp}(z)$ of order $N$ (an odd number), the order of the all-pass branch $A_0(z)$ is an even number $N_0$ and the order of the all-pass branch $A_1(z)$ is an odd number $N_1$. The frequency response of the low-pass filter realized by the parallel connection is:

$$H_{lp}(e^{j\omega}) = \frac{e^{j\varphi_0(\omega)} + e^{j\varphi_1(\omega)}}{2} = e^{j\frac{\varphi_0(\omega) + \varphi_1(\omega)}{2}} \cdot \frac{e^{j\frac{\varphi_0(\omega) - \varphi_1(\omega)}{2}} + e^{-j\frac{\varphi_0(\omega) - \varphi_1(\omega)}{2}}}{2} = e^{j\frac{\varphi_0(\omega) + \varphi_1(\omega)}{2}} \cos\!\left(\frac{\varphi_0(\omega) - \varphi_1(\omega)}{2}\right), \tag{3}$$

where $\varphi_0(\omega)$ and $\varphi_1(\omega)$ are the phase responses of the functions $A_0(z)$ and $A_1(z)$.

Fig.
2 iir odd order filter realization as a parallel connection of two all-pass filters

from (3) it can be concluded that the overall magnitude response depends on the difference of the phase responses of the all-pass functions. the overall phase response of the filter hlp(z) is the mean value of the phase responses of the all-pass branches. compared to the classical implementation structures of iir filters, which are based on cascaded or parallel connections of first and second order sections, the realization based on the parallel connection of two all-pass branches has reduced sensitivity in the pass-band [2, 7]. for that reason, it is usually the preferred implementation structure for fixed-point implementation [2]. on the other hand, the filter structure based on the parallel connection of two all-pass branches suffers from high stop-band sensitivity [2, 7]. when high stop-band attenuation is required, quantization effects can degrade the filter frequency response [2].

in the special case of the half-band filter with approximately linear phase, the all-pass branch a1(z) is a pure delay z^{-n1}, and the all-pass branch a0(z) is an all-pass function with approximately linear phase. in that special case, the order of the all-pass branch a0(z) is an even number n0 = n1 + 1. in addition, every second coefficient of the function a0(z) is zero-valued:

$$a_0(z) = \frac{a_{n_0} + a_{n_0-2} z^{-2} + \dots + a_{n_0-2k} z^{-2k} + \dots + a_2 z^{-(n_0-2)} + z^{-n_0}}{1 + a_2 z^{-2} + a_4 z^{-4} + \dots + a_{2k} z^{-2k} + \dots + a_{n_0} z^{-n_0}}. \qquad (4)$$

the half-band filter with approximately linear phase is a special case of the iir filter realization based on the parallel connection of two all-pass branches. for that reason, the sensitivity of the filter is low in the pass-band and high in the stop-band. the design of the half-band iir filter with approximately linear phase reduces to the design of the all-pass branch a0(z), an approximately linear phase all-pass function.

in this paper, we use filter transfer functions obtained by two different algorithms, one based on the optimization method [8] and the other based on the direct positioning in the z domain of the stop-band zeros of the low-pass filter transfer function [9, 10]. the first solution, originally presented in [8], designs the approximately linear phase all-pass transfer function a0(z) by an optimization procedure. as an outcome, the overall magnitude response of the half-band iir filter hlp(z) is equiripple. results obtained by the design [8] for the filter of order n = 23 (n0 = 12, n1 = 11) are presented. the filter gain is shown in fig. 3 and the group delay of the filter in fig. 4. it should be noted that the passband group delay is approximately n1 samples.

fig. 3 gain response of the filter designed by optimization algorithm [8]

fig. 4 group delay of the filter designed by optimization algorithm [8]

the second approach, presented in [9] and [10], directly controls the positions of the stop-band zeros of the overall half-band filter. in low-pass filter design, it is sometimes important to provide additional signal attenuation at certain frequencies in the stop-band. this can be achieved by exact control of the stop-band zero positions. by placing stop-band zeros exactly on the unit circle, large attenuation in the corresponding frequency range can be achieved. in the design approach presented in [9, 10] it is possible to choose the stop-band frequencies at which an infinitely large attenuation is needed.
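the relations (1a), (1b) and (3) for the half-band case can be checked numerically on a small example. the following python sketch is only an illustration under stated assumptions: it uses a hypothetical third order half-band filter with a single all-pass coefficient a2 = 0.4 (n0 = 2, n1 = 1). this value and all function names are chosen for illustration and are not taken from the designs in [8] or [9, 10].

```python
import cmath
import math

A2 = 0.4  # hypothetical all-pass coefficient, chosen only for illustration
N1 = 1    # order of the pure-delay branch, so n0 = n1 + 1 = 2


def a0(w):
    """All-pass branch a0(z) = (a2 + z^-2) / (1 + a2 z^-2) at z = exp(jw)."""
    z2 = cmath.exp(-2j * w)
    return (A2 + z2) / (1 + A2 * z2)


def a1(w):
    """Pure-delay branch a1(z) = z^-n1 at z = exp(jw)."""
    return cmath.exp(-1j * N1 * w)


def h_lp(w):
    """Low-pass output, equation (1a)."""
    return 0.5 * (a0(w) + a1(w))


def h_hp(w):
    """High-pass output, equation (1b)."""
    return 0.5 * (a0(w) - a1(w))


def gain_from_phases(w):
    """Magnitude predicted by (3): |cos((phi0 - phi1) / 2)|."""
    phase_diff = cmath.phase(a0(w)) - cmath.phase(a1(w))
    return abs(math.cos(phase_diff / 2))


if __name__ == "__main__":
    for w in (0.0, 0.25 * math.pi, 0.5 * math.pi, 0.75 * math.pi, math.pi):
        print(f"w = {w / math.pi:.2f} pi  |h_lp| = {abs(h_lp(w)):.6f}  "
              f"|cos| = {gain_from_phases(w):.6f}")
```

for this example the gain is 1 at ω = 0, 1/√2 at ω = π/2 and 0 at ω = π, and the two outputs are power complementary, as expected for a sum and a difference of two all-pass functions.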
it was shown in [9, 10] that the stop-band zeros of the low-pass half-band iir filter are roots of the polynomial function

$$p(w) = 1 + a_2 u_2(w) + a_4 u_6(w) + \dots + a_{2k} u_{4k-2}(w) + \dots + a_{(n+1)/2} u_{n-1}(w), \qquad (5)$$

where n is the overall filter order, a2k are the coefficients of the non-trivial all-pass branch (the order of the non-trivial all-pass branch is n0 = (n + 1)/2), w = sin(ω) and u4k-2(w) is the chebyshev polynomial of the second kind. there are (n + 1)/4 low-pass half-band iir filter stop-band zeros lying on the unit circle. if the stop-band zeros are defined according to the filter specifications and the all-pass filter coefficients are unknown, then (5) can be transformed into a system of linear equations (one equation for each zero). the values of the (n + 1)/4 all-pass branch coefficients are calculated by solving this system of linear equations.

results obtained by the second design approach are presented for the same filter order and for overall characteristics similar to those obtained by the first approach. the filter gain is shown in fig. 5 and the group delay of the filter in fig. 6.

fig. 5 gain response of the filter designed by zero positioning algorithm [9, 10]

fig. 6 group delay of the filter designed by zero positioning algorithm [9, 10]

it should be noted that both filters share the same realization structure, fig. 1. therefore, for the analysis of both structures it is essential to analyze the nontrivial all-pass branch of the filter, a0(z).

4. implementation of the half-band approximately linear phase iir filters

the goal was to develop a procedure for the detailed analysis of the filter structure presented in fig. 1 in the case of the fixed-point realization. the objective was to compare half-band iir filters with approximately linear phase obtained by two different algorithms and to select, for each of the two filter types, the filter realization that is most suitable for a fixed-point implementation platform.
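the linear system mentioned in section 3 for the zero positioning design can be sketched in python as follows. this is a minimal illustration, not the procedure of [9, 10]: instead of the chebyshev form (5), it uses the condition a0(e^{jω}) = −e^{−jωn1} written directly for the all-pass function (4) with n0 = n1 + 1, which reduces to cos(ω/2) + Σ a2k cos((2k − 1/2)ω) = 0 for each prescribed zero frequency ω. that the two formulations agree is an assumption here, the function names are hypothetical, and the stability of the resulting all-pass branch is not checked.

```python
import cmath
import math


def solve_linear(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def allpass_coeffs_from_zeros(zero_freqs):
    """Return [a2, a4, ...] so that h_lp vanishes at each frequency in zero_freqs."""
    count = len(zero_freqs)
    matrix = [[math.cos((2 * k - 0.5) * w) for k in range(1, count + 1)]
              for w in zero_freqs]
    rhs = [-math.cos(w / 2) for w in zero_freqs]
    return solve_linear(matrix, rhs)


def h_lp(a_even, w):
    """Low-pass response (a0 + z^-n1) / 2 at z = exp(jw), with a0 as in (4)."""
    n0 = 2 * len(a_even)
    z1 = cmath.exp(-1j * w)
    den = 1 + sum(a * z1 ** (2 * k) for k, a in enumerate(a_even, start=1))
    num = z1 ** n0 * den.conjugate()  # mirrored numerator of the all-pass
    return 0.5 * (num / den + z1 ** (n0 - 1))


if __name__ == "__main__":
    zeros = [0.7 * math.pi, 0.9 * math.pi]  # hypothetical stop-band zeros
    coeffs = allpass_coeffs_from_zeros(zeros)
    print("coefficients:", coeffs)
    for w in zeros:
        print(f"|h_lp| at {w / math.pi:.2f} pi = {abs(h_lp(coeffs, w)):.2e}")
```

with a single prescribed zero at 0.75π the system has one equation and gives a2 = tan(π/8) = √2 − 1.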
three different implementations of the all-pass branch were analyzed: direct realization, cascaded connection of the second order sections and cascaded connection of the fourth order sections.

since the filter hlp(z) is a half-band filter, poles of the transfer function a0(z) are symmetric about the imaginary axis. poles and zeros of a0(z) occur in conjugate reciprocal pairs. the all-pass filter a0(z) is of order n0 = 4l + 2 or n0 = 4l + 4. in the 4l + 2 case, the all-pass filter a0(z) has two poles on the imaginary axis and l quadruplets of poles, fig. 7a. in the 4l + 4 case, there is an additional pair of poles placed on the real axis, fig. 7b. the all-pass branch a0(z) can be implemented as a direct structure of order n0, or as a cascaded connection of lower order sections. however, because hlp(z) is a half-band filter, symmetric poles and the corresponding zeros can be grouped into fourth order sections. as a result, the transfer function a0(z) can be implemented as a cascaded connection of one (for n0 = 4l + 2) or two (for n0 = 4l + 4) second order section(s) and l fourth order sections.

fig. 7 poles (x) and zeros (o) of the all-pass transfer function a0(z), a) filter order is n0 = 4l + 2, b) filter order is n0 = 4l + 4

each quadruplet of poles with the corresponding zeros forms a single fourth order all-pass section. since hlp(z) is a half-band filter, the fourth order section is of the form:

$$a_m(z) = \frac{a_{m4} + a_{m2} z^{-2} + z^{-4}}{1 + a_{m2} z^{-2} + a_{m4} z^{-4}}, \quad m = 0, 1, \dots, l. \qquad (6)$$

the fourth order section am(z) can be realized with only two multiplications [1, 4]. if the filter is realized as a connection of second order sections, the structure of each section (apart from the sections that correspond to the real axis and imaginary axis poles) is:

$$a_m(z) = \frac{a_{m2} + a_{m1} z^{-1} + z^{-2}}{1 + a_{m1} z^{-1} + a_{m2} z^{-2}}, \quad m = 0, 1, \dots, 2l. \qquad (7)$$

it should be noted that $a_{m1} \neq 0$, thus the minimum number of multiplications is two [1, 4]. the pair of imaginary axis poles (and the pair of real poles for n0 = 4l + 4) form second order section(s):

$$a_{i,r}(z) = \frac{a_{i,r2} + z^{-2}}{1 + a_{i,r2} z^{-2}}. \qquad (8)$$

both sections can be implemented with a single multiplication. each section (second, fourth or n0-th order) can be realized as a direct form (direct form i), direct canonic form (direct form ii), transposed direct form (transposed direct form i) or transposed direct canonic form (transposed direct form ii). alternatively, the scheme with a reduced number of multiplications [1, 4], fig. 8, can be used. for an all-pass filter of order n0, the minimum number of multipliers is n0. since the filter hlp(z) is a half-band filter, every second coefficient of the all-pass branch is zero. therefore, the number of multipliers is reduced to n0/2. the fourth order section (6) can be realized with only two multipliers, am4 = a4 and am2 = a2. the second order section given by (8) can be implemented with only one multiplier, ai,r2 = a1.

fig. 8 all-pass filter structure with minimum number of multiplications (n0=4)

5. analysis of the half-band approximately linear phase iir filters

the structure presented in fig. 1 (parallel connection of two all-pass branches) should be considered a “classic” structure (along with cascaded and parallel realizations), but the matlab fda tool does not have direct support for this type of design. this means that the fda tool cannot be used for the design of the filter. instead, the filter should be designed in matlab and imported into the fda tool. this can be done if the filter is constructed as an object, because the matlab fda tool can import a filter object from the currently active workspace.

for that reason, the filters were designed in the conventional way, which yields the coefficients of the denominator of the non-trivial all-pass branch a0(z). for both algorithms, three filter objects were constructed: one for the direct implementation, one for the cascaded connection of the second order sections and one for the realization with second and fourth order sections. unfortunately, it is not possible to perform quantization analysis in the fda tool for cascaded or parallel structures that were not designed in the fda tool. this means that the user has to set the filter object properties in matlab.

the example filter is a parallel connection of an all-pass branch of order n0 = 12 and a pure delay of n1 = 11 samples. the filter a0(z) can be implemented as a direct structure, as a cascaded connection of six second order sections or as a cascaded connection of two second order sections and two fourth order sections. the filter a0(z) was first defined as an all-pass filter, assuming the realization based on fig. 8 with the minimum number of multipliers [1, 4].

there is another benefit of the all-pass filter implementation with a reduced number of multiplications. when the all-pass filter is implemented as in fig. 8, the last coefficient of the numerator polynomial remains exactly one. for all other implementation variants, this coefficient is usually rounded to the nearest value allowed by the chosen quantization parameters. for example, if the coefficients are coded as two’s complement numbers with 15 fractional bits, 1 will be rounded to the value 1 − 2^{-15} = 0.999969482421875. however, for the all-pass filter type, the arithmetic property of the filter object cannot be set to “fixed”. for the analysis of the quantization effects, it is not essential to implement the filter with as few multipliers as possible.
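the rounding of the coefficient 1 described above can be reproduced with a short python sketch. it is only an illustration of two’s complement coefficient quantization under stated assumptions: rounding to the nearest code, and saturation for values that fall outside the representable range. the function name and the default parameters are illustrative and are not part of matlab or of the fda tool.

```python
import math


def quantize(value, word_length=16, frac_length=15):
    """Round to the nearest two's complement fixed-point value and saturate.

    Saturation on overflow is an assumption of this sketch.
    """
    scale = 1 << frac_length
    lowest = -(1 << (word_length - 1))       # most negative integer code
    highest = (1 << (word_length - 1)) - 1   # most positive integer code
    code = math.floor(value * scale + 0.5)   # round to nearest, ties upward
    code = max(lowest, min(highest, code))
    return code / scale


if __name__ == "__main__":
    print(quantize(1.0))   # largest representable value, 1 - 2**-15
    print(quantize(-1.0))  # -1 is representable exactly
    print(quantize(0.5))   # exactly representable values are unchanged
```

with 16-bit words and 15 fractional bits the representable values lie in [-1, 1), so 1.0 maps to 32767/32768 = 0.999969482421875, the value quoted in the text.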
therefore, we changed our design to direct form i. we defined the filter object properties related to the fixed-point arithmetic, fig. 9. at the end, we made a parallel connection of a0(z) and a pure delay, and added the scaling factor 0.5.

    % a: denominator coefficients of a0(z); the numerator is the mirrored vector
    h=dfilt.df1(fliplr(a),a);
    h.arithmetic='fixed';
    % 16-bit words with 15 fractional bits for output and coefficients
    set(h,'outputwordlength',16,'outputfraclength',15);
    set(h,'coeffwordlength',16,'coeffautoscale',0);
    set(h,'numfraclength',15,'denfraclength',15);
    % products are kept with 30 fractional bits
    set(h,'productmode','specifyprecision');
    set(h,'numprodfraclength',30);
    set(h,'denprodfraclength',30,'castbeforesum',cbs);
    % pure delay branch a1(z) = z^-11
    h2=dfilt.delay(11);
    % parallel connection of both branches, scaled by 0.5
    huk=cascade(dfilt.scalar(0.5),parallel(h,h2));

fig. 9 creating filter object, all-pass branch is realized as a direct structure, a is vector of denominator coefficients of the transfer function a0(z)

    % section for the imaginary axis poles
    hi=dfilt.df1(fliplr(ci),ci);
    ...
    h2ord=dfilt.cascade(hi);
    % section for the real axis poles
    hr=dfilt.df1(fliplr(cr),cr);
    ...
    addstage(h2ord,hr);
    h2ord2=copy(h2ord);
    % one second order section per remaining pole pair
    for br=1:length(nule_rest)/2
        hc2=dfilt.df1(fliplr(cc2(br,:)),cc2(br,:));
        ...
        addstage(h2ord2,hc2);
    end;

fig. 10 creating filter object, all-pass branch is realized as a cascaded connection of second order sections; ci, cr, and cc2 are the denominator coefficients corresponding to imaginary axis poles, real axis poles and “rest” poles respectively

for the cascaded implementations, the arithmetic properties of all sections should be set independently. this means that a filter object was defined for each section. the all-pass branch a0(z) was then defined as a new object, a cascaded connection of the objects corresponding to all low order sections. in fig. 10 the code for obtaining a connection of the second order sections is presented. fixed-point arithmetic properties are set in the same way as for the direct realization of a0(z).

5.
analysis results

analysis is performed in the fda tool for approximately linear phase half-band iir filters obtained by the optimization algorithm [8], and for filters obtained by the low-pass filter stop-band zero positioning method. design parameters for the optimization algorithm [8] are: the filter order, n = 23, and the pass-band edge frequency, ωg = 0.45π. design parameters for the stop-band zero positioning method are: the filter order, n = 23, and the first stop-band zero frequency, ω0 = 0.61π.

both designs share the same realization structure, fig. 1. therefore, for the same filter order, the number of multipliers and the number of states are the same for both structures. table 1 presents the number of multiplications (m) and the number of states (s) for the filter order n = 23 (n0 = 12, n1 = 11), assuming direct form i for all sections. it should be noted that direct form i is not optimal. it requires twice as many multiplications as the all-pass structure. in addition, the number of states is reduced in the case of canonic structures (direct form ii and transposed direct form ii).

table 1 implementation costs, m – number of multipliers, s – number of states, direct form i

structure | m | s
sos1: second order section, real (or imaginary) axis poles | 2 | 4
sos2: second order section (other poles) | 4 | 4
fos: fourth order section | 4 | 8
d: delay (11 taps) | 0 | 11
0.5: constant | 1 | 0
direct implementation of a0(z) | 12 | 24
hlp(z) – direct impl. of a0(z) | 13 | 35
hlp(z) – 2 sos1, 4 sos2 | 21 | 35
hlp(z) – 2 sos1, 2 fos | 13 | 35

it was shown in [5] that, for the implementation consisting of second order sections only, the degradation of the frequency response is larger than for the other two alternatives. in fig. 11 the gains of the implementation based on second order sections are presented for quantized and non-quantized filter coefficients for both algorithms. the quantization parameters are set to: coefficient word-length – 16 bits, number of fractional bits – 15.
assuming two’s complement signed numbers, the values that can be represented correctly are in the range [-1, 1). in fig. 11 it can be seen that the filter obtained by the optimization method [8], h12(z), has an equiripple response. for the given filter order, the stop-band attenuation is less than 40 db. therefore, the quantization error is small (for the specified word-length). the characteristic of the filter obtained by the zero positioning procedure [9, 10], h22(z), has increased stop-band attenuation. for attenuation values larger than 80 db the degradation of the response is noticeable. in fig. 12 the gains of the low-pass filter hlp(z) are presented for three different implementations of the all-pass branch a0(z) in the case of the design [9, 10]. for all three simulated structures, the degradation for attenuations larger than 80 db is similar. for the defined word-length of 16 bits, this effect is expected.

fig. 11 the gains of the analyzed filters, h12(z) – design method based on the optimization [8], h22(z) – design method based on the low-pass stop-band zeros positioning [9, 10]

fig. 12 the gains of the analyzed filters based on the low-pass stop-band zeros positioning [9, 10], for three different implementations of the all-pass branch a0(z), h2(z) – direct implementation, h22(z) – cascaded connection of the second order sections, h24(z) – cascaded connection of the second and fourth order sections

the analyzed structures are approximately linear phase iir half-band filters. in fig. 13 group delays are presented for low-pass filters obtained by the optimization method [8], h1(z), and by the low-pass filter stop-band zero positioning method [9, 10], h2(z). in both cases, the group delay is approximately 11 samples. in fig. 14 group delays are presented for the filter obtained by the optimization method [8], for the all-pass branch h1ap(z) and for the low-pass filter h1(z). it should be noted that the delay of the low-pass filter has smaller variations than the delay of the all-pass branch.

fig. 13 the group delay of the analyzed filters, h12(z) – design method based on the optimization [8], h22(z) – design method based on the low-pass stop-band zeros positioning [9, 10]

fig. 14 the group delay of the all-pass branch h1ap(z) and of the low-pass filter h1(z) designed by the optimization method [8]

6. conclusion

in this paper, a possible solution for the analysis of quantization effects and implementation cost using a well known, commercially available fda tool was presented. it was shown that it is possible to use the fda tool even in cases where the filter structure that is analyzed is not directly supported. our approach was confirmed by simulation of quantization effects in the case of the half-band iir filter with approximately linear phase. two different algorithms were used for the design of the filter: one, well known [8], based on the optimization method, and the other, recently published [9, 10], based on the direct positioning of the low-pass filter stop-band zeros. the implementation structures are the same for both filters, and consist of a parallel connection of the approximately linear phase all-pass branch and a pure delay.
for the situation presented in this paper, when the structure is not fully supported in the fda tool, the user has to set additional parameters manually (by writing the appropriate code). this requires advanced knowledge of different implementation structures, of the principles of the simulation of quantization effects and of number representations in fixed-point arithmetic systems. it can be concluded that it is possible to use the fda tool for the analysis of filters that are not supported, but the process is not as simple as in the case of the supported filters.

acknowledgement: this work was partially supported by the ministry of education and science of serbia under grant tr-32023.

references

[1] m. d. lutovac, d. v. tošić, b. l. evans, filter design for signal processing using matlab and mathematica, prentice-hall, new york, 2001.
[2] j. d. ćertić and l. d. milić, "investigation of computationally efficient complementary iir filter pairs with tunable crossover frequency", int. j. of electron. and commun. (aeü), vol. 65, pp. 419-428, 2011.
[3] j. d. ćertić and l. d. milić, "on the sensitivity of two-channel iir filter banks with variable crossover frequency", in proceedings of the 5th international symposium on image and signal processing and analysis, ispa, istanbul, turkey, 2007, pp. 86-91.
[4] mathworks, fdatool documentation [online], the mathworks, united states, available: http://www.mathworks.com/help/dsp/ref/fdatool.html [accessed 10 may 2015].
[5] a. d. radonjić and j. d. ćertić, "analysis of atypical filter structures in matlab", in proceedings of the icetran, vrnjačka banja, srbija, 2014, eki1.1.
[6] mathworks, matlab documentation [online], the mathworks, united states, available: http://www.mathworks.com/help/matlab/index [accessed 10 may 2015].
[7] l. d. milić and m. d. lutovac, "design of multiplierless elliptic iir filters with a small quantization error", ieee trans on sig. proc., vol. 47, pp. 469-479, february 1999.
[8] h. w. schüssler and p. steffen, "recursive half-band filters", int. j. of electron. and commun. (aeü), vol. 55, pp. 377-388, june 2001.
[9] m. d. lutovac and a. radonjić, "digital halfband iir filters with approximately linear phase" (in serbian), in proceedings of informacione tehnologije, žabljak, montenegro, 2010, pp. 174-177.
[10] a. d. radonjić, "analysis and design of digital filter using algebra software systems" (in serbian), master degree thesis, school of electrical engineering, university of belgrade, 2014.

facta universitatis series: electronics and energetics vol. 35, no 3, september 2022, pp. 393-403
https://doi.org/10.2298/fuee2203393p
© 2022 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper

solar energy potential in freiburg, graz, maribor, banja luka, niš, and athens

milica preradović
university of banjaluka, faculty of mechanical engineering, banjaluka, republic of srpska, bosnia and herzegovina

abstract. this paper presents a comparative analysis of solar energy potential for six cities in six european countries: freiburg (germany), graz (austria), maribor (slovenia), banja luka (bosnia and herzegovina), niš (serbia), and athens (greece). the data processed in this work were obtained from the photovoltaic geographical information system (pvgis). the photovoltaic technology considered is crystalline silicon, and the installed peak photovoltaic power is 5 kwp. the aim of the work is to find out whether there are statistically significant differences among the cities in monthly energy production for different types of photovoltaic system (fixed – free standing, fixed – building integrated, inclined, and two axis solar power plants). the work is based on four hypotheses. the estimation of solar energy production in different regions is very important for identifying regions suitable for the generation of renewable and sustainable energy.
key words: solar panels, photovoltaic technology, crystalline silicon, pvgis

1. introduction

different factors affect the amount of solar radiation reaching the earth. the most important factors are: geographical latitude, time of the year and day, atmospheric conditions, cloud cover, surface disposition, and orientation. this information is important for the planning and installation of photovoltaic systems [1].

in this paper, solar energy potential for six different locations in europe (freiburg, graz, maribor, banja luka, niš, and athens) has been compared. those six cities were selected in order to see the differences in the amount of electricity produced by photovoltaic systems. cities like freiburg, graz, and maribor have developed pv systems for electricity generation, while banja luka, niš, and athens are on an ascending path with regard to the application and use of solar energy. different types of photovoltaic systems were used for this comparison: fixed – free standing, fixed – building integrated, inclined, and two-axis solar power plants.

received january 10, 2022; revised march 19, 2022; accepted march 23, 2022
corresponding author: milica preradović, university of banjaluka, faculty of mechanical engineering, 71 vojvode stepe stepanovića, 78000 banjaluka, republic of srpska, bosnia and herzegovina
e-mail: milica.preradovic@student.mf.unibl.org

freiburg and graz have been green model cities since the late 1980s. both cities are midsized, with fewer than 500 000 inhabitants, and both are administrative centers of their regions. freiburg was named ‘germany’s environmental capital’ in 1992 for its ecological accomplishments. in 2010, freiburg received another award, ‘federal capital of climate protection’, and in 2012, ‘most sustainable large city of germany’.
graz has been awarded many times for its achievements in the field of ecology and sustainability (‘greenpeace climate protection award’ in 1993 and the ‘sustainable energy europe award’ in 2008). in 1996, graz was the first city in europe to receive the ‘international sustainable city’ award from the european union [2].

freiburg is also called ‘europe’s solar city’. vauban, a neighborhood in freiburg, is one of the most sustainable city neighborhoods worldwide. in this city district, the majority of houses generate solar energy on-site (mostly from rooftop pv panels). the surplus electricity is sold to the municipal grid [3]. the international leadership of freiburg in urban sustainability began in the 1970s, after successful anti-nuclear protests in the city [4]. the federal state government intended to build a nuclear power plant in the rural area north of the city. because of the strong resistance of the city’s citizens, the government plans were not realized and therefore freiburg is called the ‘birthplace’ of the environmental movement [5]. freiburg is also one of the sunniest locations in germany. the city has brought together many sectors (community, business, energy, the scientific community, education, construction, tourism) and civil society, with help from local and national levels, to become a world leader in solar energy [4].

in graz, in the first half of the 1990s, many environmental proposals and projects were arranged (‘ecocity 2000’, ‘municipal energy and climate concept’, ‘eco-profit’, and ‘eco-drive’). at the same time, graz became the first austrian representative of the ‘climate alliance of european cities’, with the aim of reducing greenhouse gas emissions by 50 per cent by 2010 (with 1987 as the baseline). graz also embraced energy constricting plans for the renovation of buildings and the transition to district heating or renewable fuels.
the city has also set in motion a ‘solar initiative’ that supports the feeding-in of solar thermal energy into the district heating system during summer [2]. in graz, the first smart city community is being developed. in this district, new energy technologies for energy self-sufficient cities are established. the smart city graz project is examining innovations like solar modules, solar cooling systems, solar power generation in urban areas, mini-chp-facilities (combined heating and power), integrated façade technologies and smart heat grids, with their application in demonstration buildings [6].

in maribor, the faculty of energy of the university of maribor is an important institution in the disciplines of thermo-energetics, hydropower, nuclear power, and renewable and alternative energy sources. the emphasis of the research is on pv systems. the institute of energy technology possesses a park of renewable energy sources, which comprises nine tracking pv systems. this renewable resources park is used to study various networking systems and to examine new elements that are components of a smart grid. the pv systems in the park are coupled to the distribution grid [7]. another paper, by seme et al. [8], presented an overview of a performance study of pv systems in slovenia. a total of 91% of the pv systems in slovenia have a peak power of 50 kwp or less. this is because of the energy law that prevents installations of higher power [8]. however, in recent years in slovenia, the feed-in tariff has influenced the growth of the pv market, which triggered lower prices of pv technologies [9]. dravske elektrane maribor is the major renewable electricity producer in slovenia. it obtained a permit for segment five of the zlatoličje solar power plant. this segment of the solar power plant will be installed on the left bank of the outflow canal of the biggest slovenian hydro power plant, zlatoličje.
the planned yearly production of 5 820 pv modules with a power of 2.7 mwp will be 3 gwh [10].

the republic of srpska holds a huge potential for electricity production utilizing pv systems. the promotion of renewable energy is secured by renewable energy in may 2013 together with the decision of the regulatory commission for energy of the republic of srpska on the charge level and premium prices. the republic of srpska gives priority to grid connection for renewable energy source operators and proposes incentives for external investors. the solar energy laboratory of the academy of sciences and arts of the republic of srpska was established in 2012, as an outcome of scientific research projects on renewable energy sources, particularly solar energy. in october 2012, a fixed on-grid solar power plant (power 2.08 kwp, monocrystalline silicon solar cells) was installed on one rooftop. the solar power plant is equipped with accompanying tools for supervision, data acquisition, and measurement. with the help of this pv power plant, the effects of solar radiation strength, air temperature, wind speed, and air humidity on the energy efficiency of the pv solar power plant in the banja luka region can be constantly observed. two years later, in 2014, another solar system was added to the solar energy laboratory: the solar box, which comprises a metallic base with five pv solar modules made of polycrystalline silicon, each with a power of 50 w. three solar modules are placed vertically and oriented to the east, south, and west, respectively. the fourth solar module is placed horizontally, and the fifth is at an angle of 33° to the south. additionally, in october 2017, a two-axis tracking pv system was installed on the roof of the academy of sciences and arts of the republic of srpska. this system contains electronic, mechanical, and measuring subsystems.
in 2020, in the republic of srpska, 42 electricity producers used pv systems of up to 250 kw [11]. the papers [12,13,14,15,16] contain a great amount of material on the solar potential for generating electricity from pv solar plants in the republic of srpska.

serbia’s solar centers are located in niš, zrenjanin, and novi sad. the faculty of sciences and mathematics (fsm) in niš has a solar energy laboratory that studies the physical features of flat-plate thermal and hybrid solar radiation collectors, solar cells and pv solar power plants. also in niš, the faculty of electronic engineering possesses a contemporary laboratory for the electronic investigation of rotational pv systems for optimum solar radiation incidence. the faculty of technical sciences in novi sad has a renewable and distributed energy sources laboratory devoted to research in the field of renewable energy, mostly in wind and solar energy conversion and energy storage. in zrenjanin, the faculty of technical sciences m. pupin has a solar energy laboratory that focuses on flat-plate thermal and pv modules [17]. studies [17 – 21] contain relevant information on solar energy in serbia.

greece is considered to be a very attractive country in terms of investing in solar photovoltaics [22]. the solar thermal market in greece is well explained in [23]. starting in 2011, there were many policy attempts to promote solar investment. those efforts brought greece to the leading position in global rankings for solar power share in electricity production in just three years. however, the domestic pv market shrank in the period from 2014 to 2017 to 1% of its 2013 size. this widespread contraction of the solar sector was directly related to the regulatory response to the economic effects of the policy agenda: very generous twenty-year feed-in tariffs were provided for large-scale developments and remained at high levels even though costs had dropped. policy makers were forced to apply retroactive tariff cuts.
however, it could be fairly related to the energy-linked effects of political and economic insecurities, like the construction of new traditional power plants, and constant economic stagnation. another barrier to the advanced development of solar power in greece can be the current immaturity of the economy, in terms of strategy and trade models, to motivate consumers to generate and accumulate clean energy locally [22]. currently, greece exploits solar irradiation mainly with flat plate collectors for low-temperature heating applications and with pv [24].

1.1. general information on selected cities

geographical information on freiburg, graz, maribor, banja luka, niš, and athens is given in table 1. athens is both the southernmost and the easternmost of the selected cities, while freiburg is both the northernmost and the westernmost.

table 1 information on selected cities [29]

parameter | freiburg | graz | maribor | banja luka | niš | athens
geographical latitude (˚) | 48.0005 | 47.071 | 46.5621 | 44.772 | 43.3187 | 37.982
geographical longitude (˚) | 7.832 | 15.438 | 15.65 | 17.188 | 21.893 | 23.727
optimal angle for fixed solar power plants (˚) | 36 | 37 | 36 | 34 | fs: 34*, bi: 33* | fs: 32*, bi: 31*
optimal angle for inclined axis (˚) | 38 | 39 | 38 | 36 | 36 | 34
elevation (m) | 263 | 364 | 275 | 167 | 198 | 84

* fs – freestanding solar power plants, bi – building integrated solar power plants. only niš and athens have different values of the optimal angle for fixed fs and bi solar power plants; all the other cities have the same optimal angles for fs and bi solar power plants. the given elevation is taken from pvgis and relates to free-standing solar power plants.

solar energy capacity and production of selected countries are presented in the following table (tab. 2).
table 2 solar energy capacities and solar energy production in germany, austria, slovenia, bosnia and herzegovina, serbia, and greece in 2019 [25]
country | solar energy capacity (mw) | solar energy production (gwh)
germany | 49 047 | 46 392
austria | 1 702 | 1 702
slovenia | 264 | 303
bosnia and herzegovina | 22 | 30
serbia | 23 | 14
greece | 2 834 | 4 429

as can be seen from table 2, germany has the greatest solar energy capacity and the greatest solar energy production, whereas serbia and bosnia and herzegovina have the lowest. in general, a greater installed solar energy capacity goes together with a larger solar energy production.

2. goals, materials and methods

the goal of this work is to analyze differences in the projected solar energy production (kwh) between six cities. in addition, the payback time for the installation of a 5 kw photovoltaic system is calculated for all six cities. pavlović et al. [1] studied the solar radiation atlas for banja luka and concluded that there are no significant deviations in the energy of global and direct solar radiation falling on the horizontal and on the optimally positioned surface. in this work, differences in solar energy potential were statistically analyzed between the following cities: freiburg, graz, maribor, banja luka, niš, and athens.

pvgis was established at the joint research centre (jrc) of the european commission, within its renewable energies unit, as a geographical information system (gis) tool for evaluating the performance of solar pv systems in different geographical regions. it supplies data for technical, environmental, and socio-economic analysis of solar pv electricity generation [26,27].
the pvgis database [28] consists of data from four different meteorological sources: the photovoltaic geographical information system on climate monitoring satellite application facility (pvgis-cmsaf), the surface solar radiation data set heliosat (pvgis-sarah), data produced by the european centre for medium-range weather forecasts (pvgis-era5), and the consortium for small-scale modelling (pvgis-cosmo). the cmsaf data are used in this work. the cmsaf solar surface irradiance retrieval is built on radiative transfer calculations, where satellite-derived parameters are used as input. it is part of the ground segment of the european organization for the exploitation of meteorological satellites (eumetsat) and of the eumetsat network of satellite application facilities. pvgis-cmsaf aims to generate climate data records, i.e. time series of sufficient length, stability and quality to detect climate variability and change. the available data cover the period from 2007 to 2016 [29].

2.1. pvgis method – explanation

as described by the european commission's science and knowledge service, the first stage in calculating solar radiation from satellite data is the analysis of satellite images to determine the effect of clouds on solar radiation: clouds reflect the incoming sunlight and thereby reduce the radiation that reaches the earth's surface. cloud reflectivity can be estimated by observing the same satellite image pixel at the same time every day of a month. the darkest value of the pixel during the month corresponds to the clearest sky, i.e. to a day without clouds. the cloud reflectivity on the other days is estimated relative to this clear-sky day. the same is done for every hour of the day. in this way, the effective cloud albedo can be estimated [30].
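the darkest-pixel procedure described above can be sketched in python as follows. this is only a minimal illustration and not the pvgis implementation: the function name `effective_cloud_albedo`, the normalization against a fully overcast reflectance, and the default value `overcast_reflectance=0.9` are assumptions introduced here. the text itself only states that cloud reflectivity is estimated relative to the clear-sky (darkest) observation.

```python
def effective_cloud_albedo(reflectances, overcast_reflectance=0.9):
    """Estimate an effective cloud albedo for each day of one month.

    `reflectances` holds the reflectivity of ONE satellite pixel observed at
    the SAME hour on every day of the month. The darkest value is taken as
    the clear-sky reference, as described in the text.

    ASSUMPTION (not from the source): the albedo is normalized linearly
    between the clear-sky value and `overcast_reflectance`, a hypothetical
    reflectance of a fully overcast sky, and clipped to the range [0, 1].
    """
    if not reflectances:
        raise ValueError("at least one observation is required")
    clear_sky = min(reflectances)  # darkest pixel = clearest sky of the month
    span = overcast_reflectance - clear_sky
    if span <= 0:
        raise ValueError("overcast reflectance must exceed the clear-sky value")
    return [min(1.0, max(0.0, (r - clear_sky) / span)) for r in reflectances]


if __name__ == "__main__":
    # Hypothetical reflectances of one pixel at 12:00 on five days of a month.
    month = [0.10, 0.30, 0.50, 0.12, 0.85]
    for day, albedo in enumerate(effective_cloud_albedo(month), start=1):
        print(f"day {day}: effective cloud albedo = {albedo:.2f}")
```

repeating the call for every hour of the day gives the hourly cloud albedo that the second step of the method combines with the clear-sky irradiance.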
the second step is the calculation of the solar radiation under clear-sky conditions, using the theory of radiative transfer in the atmosphere together with information on the amount of atmospheric aerosols, water vapor and ozone, because water vapor and ozone absorb radiation at certain wavelengths. the overall solar radiation is then estimated from the cloud albedo and the clear-sky irradiance. this method achieves good results but may fail in some cases, e.g. when snow covers the ground: snow can be mistaken for clouds, in which case the method yields very low irradiance. the aerosol data used in the method are averages over a longer period of time, so sudden changes in aerosols (due to volcanic eruptions or dust storms) are not taken into account [30].

the method described above computes global and beam irradiance on a horizontal plane. however, pv modules and systems are placed at an inclined angle with respect to the horizontal plane, or on tracking systems, in order to maximize the incoming in-plane irradiance. in this case, the satellite-based values are not representative of the solar radiation received at the module surface, and it is crucial to evaluate the in-plane irradiance. to estimate the beam and diffuse components on sloped planes, the horizontal-plane values of global and diffuse and/or beam irradiance are needed. the sum of the two components on the sloped plane gives the in-plane global irradiance. the beam irradiance originates directly from the solar disc, and its value on a sloped surface can be obtained from the value on the horizontal plane when the position of the sun in the sky and the orientation of the inclined surface are known. the diffuse irradiance on sloped surfaces, however, cannot be calculated as easily, because it has been scattered by the atmosphere.
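the retrieval of the beam irradiance on a sloped surface from its horizontal value can be sketched as follows. the sketch assumes the standard geometric relation, i.e. the horizontal beam value multiplied by the ratio of the cosine of the incidence angle to the cosine of the solar zenith angle. the function and parameter names are hypothetical, and this is not claimed to be the exact procedure used in pvgis.

```python
import math


def cos_incidence(zenith_deg, sun_azimuth_deg, tilt_deg, surface_azimuth_deg):
    """Cosine of the angle between the sun direction and the surface normal.

    Standard solar-geometry relation; only the azimuth DIFFERENCE matters,
    so any azimuth convention works as long as both azimuths share it.
    """
    zenith = math.radians(zenith_deg)
    tilt = math.radians(tilt_deg)
    delta = math.radians(sun_azimuth_deg - surface_azimuth_deg)
    return (math.cos(zenith) * math.cos(tilt)
            + math.sin(zenith) * math.sin(tilt) * math.cos(delta))


def beam_on_slope(beam_horizontal, zenith_deg, sun_azimuth_deg,
                  tilt_deg, surface_azimuth_deg):
    """Beam irradiance on a sloped plane from its horizontal-plane value."""
    cos_zenith = math.cos(math.radians(zenith_deg))
    if cos_zenith <= 1e-6:
        return 0.0  # sun at or below the horizon: no beam irradiance
    cos_inc = cos_incidence(zenith_deg, sun_azimuth_deg,
                            tilt_deg, surface_azimuth_deg)
    # A negative cosine means the sun is behind the surface.
    return beam_horizontal * max(0.0, cos_inc) / cos_zenith


if __name__ == "__main__":
    # Hypothetical case: sun 60 degrees from the zenith, module tilted
    # 30 degrees and facing the sun, 100 W/m2 of beam irradiance on the
    # horizontal plane.
    print(beam_on_slope(100.0, 60.0, 180.0, 30.0, 180.0))
```

for a horizontal surface (tilt of zero) the function returns the horizontal value unchanged, which is a quick sanity check of the geometry.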
models for estimating the diffuse component fall into two categories, isotropic and anisotropic. isotropic models assume that the diffuse irradiance is distributed uniformly over the sky. the diffuse irradiance on a sloped surface is then the value on the horizontal plane scaled by a factor that depends only on the inclination of the surface and represents the portion of the sky that can be seen from the surface. however, the diffuse irradiance is almost never isotropic. the estimation model used in pvgis is an anisotropic model with two components, which distinguishes between clear and cloud-covered sky conditions and between sunlit and shaded surfaces [30,31].

2.2. statistical tests used for the calculations

data and results are shown in tables and graphs. the statistical package spss, version 24, was used to process the data. the kruskal-wallis test was applied to determine whether three or more samples originate from the same population. pairwise statistically significant differences were then identified with the mann-whitney test, which determines whether two samples originate from the same population [32]. the wilcoxon rank-sum test, also known as the mann-whitney u test, analyses differences in population means when the populations are not normally distributed. it rests on two assumptions: the populations must be continuous, and their probability density functions must have the same shape and size [33]. the mann-whitney u test calculates the statistic u for each group.
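the two rank-based statistics used in this section can be sketched in python as follows. the sketch applies the rank-sum formulas given in eqs. (1) to (3) of this section; the helper names are hypothetical, no tie correction is applied to the kruskal-wallis statistic, and p values are not computed (in this work they were obtained with spss).

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """U statistics of both groups, following eqs. (1) and (2)."""
    nx, ny = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))  # rank the pooled observations
    rx, ry = sum(ranks[:nx]), sum(ranks[nx:])
    ux = nx * ny + nx * (nx + 1) / 2 - rx
    uy = nx * ny + ny * (ny + 1) / 2 - ry
    return ux, uy


def kruskal_wallis_statistic(groups):
    """Kruskal-Wallis test statistic, following eq. (3) (no tie correction)."""
    pooled = [value for group in groups for value in group]
    ranks = average_ranks(pooled)
    n = len(pooled)
    total, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)


if __name__ == "__main__":
    # Hypothetical monthly values for three locations (not data of this work).
    a, b, c = [310, 420, 515], [330, 450, 560], [480, 610, 790]
    print(mann_whitney_u(a, c))
    print(kruskal_wallis_statistic([a, b, c]))
```

a useful check is that the two u values of a pair of groups always add up to the product of the two group sizes.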
mathematically, the mann-whitney u statistic for each group is given by the following equations [34]:

$$U_x = n_x n_y + \frac{n_x (n_x + 1)}{2} - R_x \qquad (1)$$

$$U_y = n_x n_y + \frac{n_y (n_y + 1)}{2} - R_y \qquad (2)$$

where $n_x$ is the number of observations (participants) in the first group, $n_y$ is the number of observations (participants) in the second group, $R_x$ is the sum of the ranks of the first group, and $R_y$ is the sum of the ranks of the second group. equations (1) and (2) can be read as the number of times an observation in one sample precedes or follows an observation in the other sample, after all the scores have been placed in ascending order. once the u value has been calculated and the statistical threshold ($\alpha$) chosen, the null hypothesis can be either rejected or accepted [34].

the kruskal-wallis test is a nonparametric statistical test that assesses differences among three or more independent groups on a single, non-normally distributed variable [35]. the starting assumption is that there are $k$ independent samples of sizes $n_1, n_2, \ldots, n_k$, so that $n_1 + n_2 + \ldots + n_k = n$. after the samples have been ranked, the rank sums ($R_1, R_2, \ldots, R_k$) are obtained. the test statistic is given by eq. (3) [36]:

$$R = \frac{12}{n(n+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(n + 1) \qquad (3)$$

this work is built on the following four null hypotheses:
h01: there is no statistically significant difference in monthly solar energy production of fixed solar panels (free-standing and building integrated) between the cities;
h02: there is no statistically significant difference in monthly solar energy production of inclined photovoltaic systems between the cities;
h03: there is no statistically significant difference in monthly solar energy production of two-axis solar power plants between the cities; and
h04: there is no statistically significant difference in monthly solar energy production when all types of solar power plants are compared with each other among the cities.

the aim of each test is to decide between the null hypothesis h0 and its alternative. the p value expresses the probability that a difference between the groups at least as large as the observed one arises by chance alone, i.e. when h0 is true; it lies between 0 and 1 [37]. a small p value provides stronger evidence against h0, and we are more certain that h0 is not true. when the p value is large, h0 becomes more plausible, but we cannot be confident that h0 is true. h0 is rejected when p ≤ 0.05 [33].

3. results

before the results of the statistical analysis are presented, table 3 gives the yearly solar energy production (kwh) in the six selected cities. athens has the greatest yearly solar energy production among the selected cities, and freiburg has the lowest.
table 3 yearly solar energy production (kwh)
type of the pv technology | freiburg (fr) | graz (gr) | maribor (mb) | banja luka (bl) | nis (ni) | athens (at)
fixed free standing | 5316.05 | 5722.54 | 5851.62 | 5575.21 | 6302.62 | 8282.53
fixed building integrated | 5128.36 | 5514.94 | 5640.08 | 5366.03 | 6051.62 | 7952.83
inclined | 6661.49 | 7246.43 | 7541.29 | 7240.20 | 8216.33 | 11224.55
two-axis | 6813.87 | 7421.63 | 7725.29 | 7417.43 | 8415.07 | 11550.93

in testing the first hypothesis (h01), statistically significant differences between the cities were found both for fixed free-standing photovoltaic systems (p = .044) and for fixed building-integrated photovoltaic systems (p = .043). for both types of fixed photovoltaic systems, highly statistically significant differences in monthly solar energy production were obtained between freiburg and athens (p = .009), between banja luka and athens (p = .009), between maribor and athens (p = .021), and between graz and athens (p = .018). a highly statistically significant difference was also obtained between niš and athens (p = .0496 for the fixed free-standing solar power plant, p = .043 for the fixed building-integrated solar power plant).

for the inclined photovoltaic systems (h02), statistically significant differences were obtained between maribor and athens (p = .028), between freiburg and athens (p = .011), between graz and athens (p = .021), and between banja luka and athens (p = .018).

in testing the third hypothesis (h03), a highly statistically significant difference in monthly solar energy production was found between freiburg and athens (p = .009). statistically significant differences were obtained between maribor and athens (p = .028), graz and athens (p = .021), and banja luka and athens (p = .015). the results of testing h03 are presented in table 4.
table 4 results of testing of third hypothesis, monthly solar energy production by two-axis solar power plant between the cities
compared cities | fixed – free standing | fixed – building integrated | inclined | two-axis
all | .044† | .043† | .064† | .062†
mb & fr | .273‡ | .273‡ | .299‡ | .299‡
mb & gr | .644‡ | .644‡ | .644‡ | .644‡
mb & bl | .773‡ | .773‡ | .817‡ | .817‡
mb & ni | .564‡ | .603‡ | .644‡ | .603‡
mb & at | .021‡ | .021‡ | .028‡ | .028‡
fr & gr | .326‡ | .326‡ | .419‡ | .419‡
fr & bl | .686‡ | .686‡ | .525‡ | .564‡
fr & ni | .248‡ | .248‡ | .225‡ | .273‡
fr & at | .009‡ | .009‡ | .011‡ | .009‡
gr & bl | .954‡ | .954‡ | 1.000‡ | .954‡
gr & ni | .488‡ | .488‡ | .525‡ | .488‡
gr & at | .018‡ | .018‡ | .021‡ | .021‡
bl & ni | .386‡ | .386‡ | .419‡ | .453‡
bl & at | .009‡ | .009‡ | .018‡ | .015‡
ni & at | .0496‡ | .043‡ | .065‡ | .065‡
† kruskal wallis test, ‡ mann-whitney test

finally, for the fourth hypothesis (h04), a highly statistically significant difference (p < .001, reported as .000) was obtained when fixed building-integrated, inclined, and two-axis solar power plants were compared with each other. only in athens is there a statistically significant difference (p = .029) in monthly solar energy production between fixed building-integrated, inclined, and two-axis solar power plants. in all the other cities, there is no statistically significant difference when those three systems are compared with each other. the results of testing the fourth hypothesis are presented in table 5.

table 5 monthly energy production comparison between all types of installed solar power plants
location | fixed – free standing & fixed – building integrated | inclined & two-axis | fixed – building integrated & inclined & two-axis
all | .415‡ | .655‡ | .000†
fr | .488‡ | .603‡ | .140†
gr | .419‡ | .686‡ | .135†
mb | .525‡ | .644‡ | .150†
bl | .644‡ | .686‡ | .143†
ni | .644‡ | .729‡ | .166†
at | .564‡ | .686‡ | .029†
† kruskal wallis test, ‡ mann-whitney test

in the following paragraphs, the payback time for an installed fixed building-integrated photovoltaic system (5 kwp) is calculated.
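the payback calculation used in this section, i.e. the investment cost divided by the electricity price that one household pays in one year, can be sketched in python as follows. the installation price (1 000 €/kwp), system power (5 kw), yearly demand (6 000 kwh) and average electricity prices are the values stated in this section; the function and variable names are hypothetical. because of rounding, the last digit may differ slightly from the values listed in table 6.

```python
INSTALLATION_PRICE_EUR_PER_KWP = 1000.0  # 'key in hand' price stated in the text
SYSTEM_POWER_KWP = 5.0                   # installed photovoltaic power
YEARLY_DEMAND_KWH = 6000.0               # typical household, four members

# Country average electricity prices for March 2021 (EUR/kWh), as quoted in
# the text, assigned to the selected city of each country.
ELECTRICITY_PRICE_EUR_PER_KWH = {
    "freiburg": 0.32,
    "graz": 0.21,
    "maribor": 0.18,
    "banja luka": 0.092,
    "nis": 0.080,
    "athens": 0.190,
}


def payback_years(price_eur_per_kwh,
                  power_kwp=SYSTEM_POWER_KWP,
                  install_price=INSTALLATION_PRICE_EUR_PER_KWP,
                  demand_kwh=YEARLY_DEMAND_KWH):
    """Investment cost divided by the yearly household electricity bill."""
    investment = power_kwp * install_price      # EUR
    yearly_bill = demand_kwh * price_eur_per_kwh  # EUR per year
    return investment / yearly_bill


if __name__ == "__main__":
    for city, price in ELECTRICITY_PRICE_EUR_PER_KWH.items():
        print(f"{city}: {payback_years(price):.2f} years")
```

note that this definition treats the whole yearly electricity bill as the yearly saving; the specific electricity production of each location does not enter the calculation.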
information about the annual incident solar energy (at the optimal angle), the specific yearly electricity production, the price of the photovoltaic installation, and the electricity price for a typical household (four members, yearly electricity demand of 6 000 kwh) is shown in table 6.

table 6 calculation of payback time for installed photovoltaic system, 5 kw, for one typical household with annual electricity demand of 6 000 kwh
location | yearly incident solar energy under optimal angle (kwh/m2) [27] | specific yearly electricity production (kwh/kwp) | electricity price that one household pays in one year (4 members, demand 6 000 kwh), country's average for march 2021* (€) | payback time for installed photovoltaic system with power 5 kw (years)
freiburg | 1331.72 | 992 | 1 920 | 2.60
graz | 1442.69 | 1145 | 1 260 | 3.97
maribor | 1472.76 | 1167 | 1 080 | 4.63
banja luka | 1433.46 | 1096 | 552 | 9.06
nis | 1662.62 | 1239 | 480 | 10.41
athens | 2108.31 | 1557 | 1 140 | 4.38
installation prices for a 'key in hand' photovoltaic system are approximately the same for the selected cities (1 000 €/kwp), because of the bounded components. this applies to systems with power of up to 10 kw, which are mostly used in households for own energy consumption.
* country's average electricity price as of march 2021, according to [38]: germany 0.32 €/kwh, austria 0.21 €/kwh, slovenia 0.18 €/kwh, bosnia and herzegovina 0.092 €/kwh, serbia 0.080 €/kwh, and greece 0.190 €/kwh.

the investment payback time is the shortest in the countries where the electricity price is the highest. the payback time is calculated by dividing the investment cost by the electricity price that one household pays in one year.

4. conclusion

based on the presented research, the following conclusions can be made:
i. germany has the largest solar energy capacity and solar energy production;
ii. there is a highly statistically significant difference between freiburg and athens, banja luka and athens, maribor and athens, graz and athens, and niš and athens when the energy production of fixed free-standing and fixed building-integrated photovoltaic systems is tested;
iii. statistically significant differences were obtained for inclined photovoltaic systems between the following cities: maribor and athens, freiburg and athens, graz and athens, and banja luka and athens;
iv. for the energy produced by two-axis solar power plants, the following results were obtained: a highly statistically significant difference between freiburg and athens, and statistically significant differences between maribor and athens, graz and athens, and banja luka and athens;
v. in athens, there is a statistically significant difference in monthly solar energy production between the three types of solar power plants (fixed building-integrated, inclined, and two-axis solar power plants); and
vi. germany has the highest electricity price and serbia the lowest. accordingly, the payback time for an installed photovoltaic system of 5 kw is the shortest in germany and the longest in serbia.

references

[1] t. m. pavlović, d. lj. mirjanić, i. s. radonjić, l. s. pantić and g. i. sazhko, "solar radiation atlas in banja luka in the republic of srpska", contemporary materials, vol. 12, no. 1, pp. 39-49, 2021. [2] h. rohracher and p. späth, "the interplay of urban energy policy and socio-technical transitions: the eco-cities of graz and freiburg in retrospect", urban studies, vol. 51, no. 7, pp. 1415–1431, 2014. [3] green city: freiburg, germany. (n.d.). https://www.greencitytimes.com/freiburg/, visited on february, 19. 2022. [4] a. thomas, freiburg solar region. https://wwf.panda.org/wwf_news/?204419/freiburg-green-city, visited on february 19. 2022. [5] s. fastenrath and b.
braun, "sustainability transition pathways in the building sector: energy-efficient building in freiburg (germany)", applied geography, vol. 90, no. 1, pp. 339–349, 2018. [6] j. fälchle and photolia de. n.d. ‘energy innovation austria 4/2016’16. [7] s. seme, k. sredensek and z. praunseis, "smart grids and net metering for photovoltaic systems". in proceedings of the ieee international conference on modern electrical and energy systems (mees). kremenchuk, 2017, pp. 188–191. [8] s. seme, k. sredenšek, b. štumberger and m. hadžiselimović, "analysis of the performance of photovoltaic systems in slovenia", solar energy, vol. 180, pp. 550–558, 2019. [9] p. virtič and r. kovačič lukman, "a photovoltaic net metering system and its environmental performance: a case study from slovenia", j. clean. prod., vol. 212, pp. 334–342, 2019. [10] "dravske elektrane maribor obtains building permit for first part of solar park on canals of the zlatoličje and formin hydro power plants", hse. retrieved 19 february 2022 (https://www.hse.si/en/dravskeelektrarne-maribor-obtains-building-permit-for-first-part-of-solar-park-on-canals-of-the-zlatolicje-andformin-hydro-power-plants/). [11] t. pavlović and d. lj. mirjanić, solar energy and lighting in the republic of srpska. in the sun and photovoltaic technologies (pp. 383–411). springer international publishing. [12] t. m. pavlović, d. d. milosavljević, d. mirjanić, l. s. pantić, i. s. radonjić and d. pirsl, "assessments and perspectives of pv solar power engineering in the republic of srpska (bosnia and herzegovina)", renew. sust. energy rev., vol. 18, pp. 119–133, 2013. 
[13] energy strategy of republic of srpska up to 2030, banja luka, https://www.vladars.net/eng/vlada/ministries/miem/documents/energy%20strategy%20of%20the%20republic%20of%20srpska%20up%20to%202030_459254634.pdf, visited on february 9. 2022. [14] t. pavlović, i. radonjić, d. milosavljević, l. pantić and d. pirsl, "assessment and potential use of concentrating solar power plants in serbia and republic of srpska", thermal sci., vol. 16, no. 3, pp. 931–945, 2012. [15] t. pavlović, d. milosavljević, d. mirjanić, l. pantić and d. pirsl, "assessment of the possibilities of building integrated pv systems of 1 kw electricity generation in banja luka", contemporary materials, vol. 2, no. 3, pp. 167–176, 2013. [16] d. d. milosavljević, t. m. pavlović, d. lj. mirjanić and d. divnić, "photovoltaic solar plants in the republic of srpska current state and perspectives", renew. sust. energy rev., vol. 62, pp. 546–560, 2016. [17] t. m. pavlović, y. tripanagnostopoulos, d. lj. mirjanić and d. d. milosavljević, "solar energy in serbia, greece and the republic of srpska", academy of sciences and arts of the republic of srpska, 2015. [18] m. golusin, z. tesić, and a. ostojić, "the analysis of the renewable energy production sector in serbia", renew. sust. energy rev., vol. 14, no. 5, pp. 1477–1483, 2010. [19] l. pantić, t. pavlović and d. milosavljević, "a practical field study of performances of solar modules at various positions in serbia", thermal sci., vol. 19, pp. 511–523, 2015. [20] t. pavlović, d. milosavljević, m. lambić, v. stefanović, d. mančić and d. piršl, "solar energy in serbia", contemporary materials, vol. 2, no. 2, pp. 204–20, 2011. [21] s. prvulović, d. tolmac, m. matić, lj. radovanović, and m. lambić, "some aspects of the use of solar energy in serbia", energy sources, part b: econ.
plan. policy, vol. 13, no. 4, pp. 237–245. [22] a. nikas, v. stavrakas, a. arsenopoulos, h. doukas, m. antosiewicz, j. witajewski-baltvilks and a. flamos, "barriers to and consequences of a solar-based energy transition in greece", environ. innov. soc. transit., vol. 35, pp. 383–399, 2020. [23] a. a. argiriou and s. mirasgedis, "the solar thermal market in greece—review and perspectives", renew. sust. energy rev., vol. 7, no. 5, pp. 397–418, 2003. [24] e. bellos and c. tzivanidis, "solar concentrating systems and applications in greece – a critical review", j. clean. prod., vol. 272, p. 122855, 2020. [25] irena, renewable energy statistics, the international renewable energy agency, abu dhabi, (2021) 43. [26] l. pantić, t. pavlović, d. milosavljević, d. mirjanić, i. radonjić and m. radovic, "electrical energy generation with differently oriented photovoltaic modules as façade elements", thermal sci., vol. 20, no. 4, pp. 1377–1386, 2016. [27] t. pavlović, d. milosavljević and d. pirsl, "simulation of photovoltaic systems electricity generation using homer software in specific locations in serbia", thermal sci., vol. 17, no. 2, pp. 333–347, 2013. [28] photovoltaic geographical information system, https://re.jrc.ec.europa.eu/pvg_tools/en/tools.html, visited on december, 10. 2021. [29] k. cieslak and p. dragan, "comparison of the existing photovoltaic power plant performance simulation in terms of different sources of meteorological data", edited by l. lichołai, b. dębska, p. miąsik, j. szyszka, j. krasoń, and a. szalacha. e3s web of conferences, 2018, vol. 49, 00015. [30] european commission, eu science hub pvgis data sources and calculation methods, https://jointresearch-centre.ec.europa.eu/pvgis-photovoltaic-geographical-information-system/getting-startedpvgis/pvgis-data-sources-calculation-methods_en visited on march, 1. 2022. [31] t. muneer, "solar radiation model for europe", build. serv. eng. res. technol., vol. 11, no. 4, pp. 153–163, 1990. [32] s. 
jakšić and s. maksimović. 2, verovatnoća i statistika: teorijske osnove i rešeni primeri, arhitektonsko-građevinsko-geodetski fakultet, banja luka, 2020. [33] w. navidi, statistics for engineers and scientists. new york: mcgraw-hill, 2011. [34] n. nachar, "the mann-whitney u: a test for assessing whether two independent samples come from the same distribution", tutor. quant. methods psychol., vol. 4, no. 1, pp. 13–20, 2008. [35] p. e. mckight and j. najab, "kruskal-wallis test" in the corsini encyclopedia of psychology, edited by i. b. weiner and w. e. craighead. hoboken, nj, usa: john wiley & sons, inc. [36] m. lovrić, j. komić and s. stević, statistička analiza: metodi i primjena, 2. izmijenjeno i dopunjeno izdanje. narodna i univerzitetska biblioteka republike srpske, banja luka, 2017. [37] t. dahiru, "p-value, a true test of statistical significance? a cautionary note", annals of ibadan postgraduate medicine, vol. 6, no. 1, pp. 21–26, 2011. [38] global petrol prices, https://www.globalpetrolprices.com/electricity_prices/, visited on december, 20. 2021.

facta universitatis series: electronics and energetics vol. 29, no 2, june 2016, pp. 297–308 doi: 10.2298/fuee1602297m

centralized detection of pre-alarm state in telephone network of electric power utility

dragan mitić, vladimir matić, aleksandar lebl, mihailo stanić, žarko markov
iritel a.d., belgrade, serbia

abstract. in this paper we consider the mixed telephone network of an electric power utility, consisting of ip, isdn and power line carrier links. a very important requirement in this network is high availability. a central detector of ip and isdn link failure (pre-alarm) is presented. the detector function is based on the prolonged response time of the network in the case of ip and isdn link failure.
we define the undesirable events in the detector operation, false pre-alarm and miss detection, and we derive the expressions for calculating their probabilities. the centralization of this detector is an advantage, because it makes it possible to test the whole network from one location.

key words: centralized detector, electric power utility, mixed telephone network, pre-alarm state

1. introduction

the main requirement for the telephone network of an electric power utility (epu) is very high availability, and all possible resources are used to achieve it. to meet this requirement, different technologies (optical cables, metal cables, radio) and a non-hierarchical network architecture (alternate routing) are used in the epu telephone network. the use of different technologies increases the availability [1], [2], but it raises the problem of conversion between different signalling systems (cas, isdn, ip) and speech signal forms (analog, digital, packet) in signalling and media gateways. in this paper we show how the mixed network of an epu, which combines new and old techniques, can (apart from the problem of interworking) exploit its different signalling systems, i.e. the different durations of post-dialling delay, to monitor the proper operation of the parts of the mixed network.

different methods can be used to detect a faulty link in the telephone network of an epu. a few approaches based on telephone traffic characteristics are presented in [3-8].

received may 21, 2015; received in revised form august 31, 2015
corresponding author: dragan mitić, iritel a.d., 11080 belgrade, batajnički put 23, serbia (e-mail: mita@iritel.com)

this paper deals with a novel method of finding a faulty isdn or ip link (link of the first choice) by measuring the post-dialling delay until the beginning of the ring-back tone.
the method relies on the fact that there are one or more links of the second choice (power line carrier (plc) links) with considerably slower dialling than the links of the first choice. if there is a fault in some part of the network, the slower link is activated in that part of the network and the dialling speed decreases. by a proper choice of dialled numbers, it is possible to detect the network section with faulty links of the first choice. the main advantage of the method is that the whole epu telephone network can be tested from one central place. the testing can be carried out manually, without any equipment, only by suitably choosing the subscriber numbers to dial, or by using relatively simple equipment to generate the dialling. the contribution of the paper is that it develops the testing method and calculates the main characteristics of the system: the miss probability and the probability of false pre-alarm.

2. model, designations and assumptions

the mixed network of an epu consists of telephone exchanges (te) and transmission systems, which can be ip, isdn and plc systems. the old network was based on plcs. (plc is a technique for creating telephone channels over high-voltage power lines; this transmission is sometimes called voice over high-voltage power line. plcs are kept in the new mixed network in order to increase availability. in [2] plcs are referred to as e&m analog lines.) the main characteristics of plcs in the epu telephone network are the use of slow e&m signalling with pulse digit transfer [9] and a lower quality of speech signal transfer.

let us consider a connection through the mixed epu telephone network (see fig. 1a), and from this connection only two nodes on the connection route (see fig. 1b and fig. 1c). the traffic offered to the group of links is denoted by a. the number of channels on the isdn link, or the greatest number of connections using the ip link, is n.
telephone exchanges te_k and te_{k+1} are connected by an isdn or ip link and, from the earlier network, are still connected by plc. the connections between exchanges te_k and te_{k+1} are established according to a selection rule (sr) such that isdn channels (see fig. 1b) or the ip link (see fig. 1c) are selected first, and if they are not available, plc is selected. this sr follows from the faster connection establishment and the better speech signal quality when digital connections are used than when plcs are used. (it is clear that selections in different directions on isdn links will be made in such a way that the collision probability is minimized.)

normal operation (state) is the state in which all links between the exchanges are faultless. the alarm state is the state in which it is not possible to establish a connection between exchanges te_k and te_{k+1} because all links between the exchanges are faulty. the pre-alarm state is defined as the state in which it is not possible to establish a connection by the route of first choice, i.e. by the isdn or ip link, because these links are faulty, but the connection can still be established using plc. it is important that some connections, for example dispatcher connections, can be established in this state.

post-dialling delay (pdd, or post-selection delay) is defined as the time interval from the last dialled digit until the start of the called side answer, i.e. until the beginning of the ringing (busy) tone. let us suppose that a 5-digit numbering plan is used in the network and that all digits are equally probable (uniform distribution).

the aim of this paper is to present the operation of the pre-alarm state detector (pre-alarm being the state when isdn or ip links are faulty). the operation of this detector is based on the difference between the pdd values obtained with digital links and with plcs.

fig. 1 model of connection through mixed epu network

the main components of pdd are the time intervals used for processing and sending the information about the dialled number between adjacent network nodes. it is therefore necessary to know the characteristics of the transfer time between nodes for digital links and for plcs.

3. time delay of successful transfer of address information (dialled number) between exchanges

the time of successful transfer of signalling information about the dialled number between network nodes is the most important component of pdd. this time depends on the signalling type, the transmission method, and the traffic load of the links and nodes, and is therefore a random variable. to satisfy the main requirement that pdd be sufficiently short, recommendations have been introduced on the allowed duration of the transfer of signalling information (concerning the dialled number) between network nodes. these recommendations differ for different techniques.

3.1. isdn technique

recommendations for the greatest allowed exchange operation times are presented in [10], sections 2.3 (delay probability – non-isdn or mixed (isdn – non-isdn) environment) and 2.4 (delay probability – isdn environment). among all the recommended values, we select the most stringent ones (the longest time intervals), which refer to the isdn technique, the message carrying address information, and en-bloc signalling (en-bloc signalling means that signalling transmission on one link starts when the complete address information from the previous link has been collected in the node preceding the considered link).
these greatest allowed time intervals are defined in the following sections of [10]: 2.3.2.3 local exchange call request delay, 2.3.3.2.3 exchange call set-up delay for originating outgoing traffic connections, 2.4.3.1 call set up delay, 2.4.5 incoming call indication sending delay which recommend that the longest allowed mean time for the activity of one route section and one network node is 600ms (load a) and 800ms (load b). the longest recommended time for the activity in the case of 95% connections is 800ms (load a) and 1200ms (load b). the reason for taking the longest time intervals from [10] is that in that case is most probable to make an error (i.e. to replace dialling using isdn or ip link by dialling using plc link or vice versa), thus making one of two possible false detections in decision algorithm (false pre-alarm or miss detection). in [11] for cross-office transfer time for signalling ccs no 7 messages in the most difficult conditions (complex message content – processing intensive and increased load 30%), the longest mean time (450ms) and the longest time for forwarding at least 95% messages (900ms) is recommended. the probability distribution of time necessary for forwarding the address information is exponential, and its main component is the waiting time on the (signalling) processor service, [12]. in [12] it is indicated that the time of processor service can be constant or exponentially distributed. here we suppose that signalling processor service time is distributed according to exponential distribution. there are two reasons for this: the first one is that the service time of signalling processor for different messages is different, and the second one is that the results for exponential distribution are more reliable (conservative, on the safe side). the probability density function of the time duration needed for the address signalling message transfer across isdn link is presented by the function f(t) (see fig. 2). 
it is clear that in this case t is a continuous random variable. the mean value of this time is denoted tmisdn.

3.2. ip technique

the parts of the telephone network that are realized using ip techniques use sip for connection setup [13]. the address information for connection setup is carried in the invite message (method), and after the invite message is sent, an acknowledgement in the form of a provisional or final response from the 1xx or 2xx groups is expected. the invite message can also be transmitted over an unreliable protocol (udp), in which case preventive retransmission must be used. in [13] it is stated that the first retransmission is sent after 500ms. let us suppose that in a private network, such as the epu network, a time interval of 500ms is enough to receive the response to 95% of invite requests. we can suppose that, for address transmission between two network nodes using ip techniques, the same recommendations are valid as for the transmission time across an isdn link. the only (positive) difference is that in this case the time intervals are shorter. in conclusion, the longest allowed time for address information transfer between two network nodes over digital links is the one valid for 95% of all connections with traffic load b, i.e. 1200ms.

3.3. plc

in this technique the dialled digits are forwarded in pulse form without acknowledgement. we shall therefore consider that address information transfer between exchanges is finished once the selected number has been completely transmitted. the time of address information transfer between exchanges in ip or isdn technique depends on processor load and link load (i.e. signalling equipment load), and does not depend on signalling message duration. by contrast, in the case of plc the time for address information transfer between exchanges depends on the signalling information transferred, i.e.
on the number of dial pulses. the time for address information transfer over a plc link is a random variable that takes discrete values.

example 1: if we use a 5-digit numbering plan, i.e. there are more than 10000 users in the network, then the time for address information transfer can be calculated as tplc = 4·tp + 100·n (ms), where tp is the interdigit pause (350ms) and n is the total number of dial pulses, n = 5, 6, ..., 49, 50. the probability distribution of this time is presented symbolically and denoted p(t) (see fig. 2). the distribution p(t) for five-digit numbering and a plc link takes discrete values (see fig. 3) (every fifth value is shown in bold). this is evidently a discrete random variable, and the probability is non-zero only for the values t = 1400 + 100·n (ms), where n is an integer with 5 ≤ n ≤ 50, i.e. only for t = 1900, 2000, 2100, ..., 6300, 6400 (ms).

fig. 2 probability density function (full line) in the case of isdn link and probability distribution (dashed line) of address information transfer time in the case of plc link

fig. 3 probability distribution of the time for digit transfer on plc link for 5-digit numbering plan

the value tmplc is the mean time needed for address information transfer over a plc link (see fig. 2). the main conclusion of this section is that the time of address information transfer between two adjacent nodes of the epu network differs by several seconds between an isdn or ip link (tisdn) and a plc link (tplc). in the case of a 5-digit numbering plan, the mean value of this difference δtm is about 4s (see fig. 2).

4. basic idea for pre-alarm state detector

the main idea of the detector is that it generates test telephone calls in the network, compares the measured pdd with the usual values and, in the case of a great difference, declares the pre-alarm state.
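The discrete distribution of the PLC transfer time from example 1 can be reproduced with a short calculation. The sketch below is only an illustration, not the authors' procedure: it assumes that every digit sends between 1 and 10 pulses (digit 0 sending 10 pulses) with equal probability, which is consistent with the stated range n = 5, ..., 50 and with the uniform-digit assumption. The function and constant names are hypothetical.

```python
from collections import Counter

T_PAUSE_MS = 350   # interdigit pause t_p from example 1
PULSE_MS = 100     # time per dial pulse from example 1
DIGITS = 5         # 5-digit numbering plan


def plc_transfer_time_distribution(digits=DIGITS):
    """Return {t_plc in ms: probability} for t_plc = (digits - 1) * t_p + 100 * n."""
    # Assumption: each digit contributes 1..10 pulses, all equally likely.
    counts = Counter({0: 1})
    for _ in range(digits):
        nxt = Counter()
        for n, ways in counts.items():
            for pulses in range(1, 11):
                nxt[n + pulses] += ways
        counts = nxt
    total = 10 ** digits
    offset = (digits - 1) * T_PAUSE_MS
    return {offset + PULSE_MS * n: ways / total for n, ways in sorted(counts.items())}


dist = plc_transfer_time_distribution()
t_min, t_max = min(dist), max(dist)
t_mean = sum(t * p for t, p in dist.items())
print(t_min, t_max, round(t_mean))
```

Under these assumptions the support runs from 1900 ms to 6400 ms in 100 ms steps, and the mean comes out at about 4.15 s, in line with the difference of about 4 s quoted in the text.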
the difference in the time delay of address information transfer on the link that is in pre-alarm state (δt) carries over to the total pdd. the main idea of the detector is illustrated in fig. 4 and fig. 5. in the normal state, i.e. when all isdn or ip links are correct, these links are used for the whole connection setup (see fig. 4.a)). when there is a faulty isdn (ip) link on one section of the call route, it is replaced by plc (see fig. 4.b)). (the established connection is presented by a bold line.) the pdd values differ between the cases of a correct and a faulty section on the call route (see fig. 5). the moment of test signal sending is denoted td (see fig. 5). the time interval from signal sending until the answer from the called side is received, when all isdn and ip links on the route towards the called user are correct (see fig. 4.a)), is denoted pdd1 (see fig. 5). the response time of the receiving side when some isdn or ip link is faulty (see fig. 4.b)) is denoted pdd2 (see fig. 5). the time interval pddt is chosen in advance as the threshold time value. if pdd > pddt, the pre-alarm state is declared. the guard interval is pddt – pdd1.

fig. 4 basics of pre-alarm state detector

fig. 5 the pdd values in the case of correct and faulty section on the call route

since the detector function is based on the analysis of random variables, two undesired outcomes are also possible: the false pre-alarm and the miss detection. the false pre-alarm is the event in which all links are correct but the detector declares the pre-alarm state. the detector miss is the reverse situation: a failure on an isdn or ip link exists, but the detector does not detect it. the false pre-alarm is possible in the case of increased traffic load, when all links are correct but the connection is realized by plc.
the miss in pre-alarm state detection is possible if the value of pddt is chosen too high.

5. calculation of probability for false pre-alarm and for miss detection

let us consider two network nodes in the epu network (see fig. 1). these two nodes belong to one connection (see fig. 4). the central detector of pre-alarm state is turned on, and for this case the threshold value for the answer time delay of the called side (pddt) is defined (see fig. 5). the pre-alarm state is declared if pdd > pddt. the false pre-alarm can occur in two cases:

• if the telephone traffic is high and the isdn or ip link is faultless but busy with previous calls, so that the next call is served by plc;
• if the signalling traffic between the network nodes that form the connection is heavy, so that the time for address information sending becomes too long and the total time until the answer from the called side becomes pdd > pddt.

the probability of false pre-alarm caused by heavy traffic, i.e. the probability of false pre-alarm of the first kind, is obviously:

$$p_{fpa1} = B = E(A, N) \qquad (1)$$

where $E(A, N)$ is the well-known erlang loss formula for a group of $N$ channels with the offered traffic $A$ [14]. the probability of the false pre-alarm caused by excessive signalling traffic (probability of false pre-alarm of the second kind) can be calculated in the following way. let us consider the distribution of the time for address information sending between network nodes on an isdn or ip link. in subsection 3.1 it was pointed out that this distribution is negative exponential (see fig. 6).

fig. 6 distribution of the time for address information sending between network nodes

the probability density function of the exponential distribution is

$$f(x) = \lambda e^{-\lambda x}, \quad x \ge 0 \qquad (2)$$

while the cumulative distribution function, i.e.
the probability that t ≤ x (in other words $P(t \le x) = F(x)$), is:

$$F(x) = 1 - e^{-\lambda x} \qquad (3)$$

the probability of false pre-alarm of the second kind (see fig. 6) can be expressed as:

$$p_{fpa2} = P(t > pdd_t) = 1 - F(pdd_t) \qquad (4)$$

the total probability of false pre-alarm is:

$$p_{fpa} = 1 - (1 - p_{fpa1})(1 - p_{fpa2}) \approx p_{fpa1} + p_{fpa2} \qquad (5)$$

because $p_{fpa1} \ll 1$ and $p_{fpa2} \ll 1$.

example 2: let us consider the primary group of isdn channels ($N = 30$) with an offered load of 20 E (erlangs); then pfpa1 = 0.00846. using the most stringent requirement from subsection 3.1, that the waiting time for 95% of calls must be less than 1200ms, we find the value of λ:

$$F(1200\,\mathrm{ms}) = 1 - e^{-\lambda \cdot 1.2\,\mathrm{s}} = 0.95 \;\Rightarrow\; \lambda \approx 2.5\,\mathrm{s}^{-1} \qquad (6)$$

taking the value pddt = 1.5s, we have pfpa2 = 0.0235. the total probability of false pre-alarm in this example is pfpa = 0.032.

there is no possibility of miss detection (pmiss = 0) if the time threshold (for address information sending when isdn (ip) links are faultless) can be set to a value smaller than the minimum time of address information sending over plc. in that case the possible time intervals do not overlap: the time of address information sending when all isdn (ip) links are faultless is surely shorter than the time of address information sending over plc. the situation changes when these time intervals overlap. the probability of miss detection exists if the time threshold (pddt) is greater than the lower limit of the transmission time of address information over plc (tplcmin), i.e. pddt > tplcmin (see fig. 3). in this case, if the value of pdd is tplcmin < pdd < pddt, [...] pddt > tplcmin, we can come to the situation when pmiss > 0.

7. how central detector functions in epu network

the central detector contains the numbering plan of the whole epu network.
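Returning briefly to example 2, the quoted false pre-alarm probabilities can be checked numerically. The sketch below is an illustration under stated assumptions rather than the authors' calculation: the Erlang loss formula is evaluated with the standard recursion, and the rate λ is obtained from the 95% requirement at 1.2 s instead of being rounded to 2.5 s⁻¹. The function names are hypothetical.

```python
import math


def erlang_b(offered_traffic, channels):
    """Erlang loss formula E(A, N), evaluated with the usual stable recursion."""
    blocking = 1.0
    for n in range(1, channels + 1):
        blocking = offered_traffic * blocking / (n + offered_traffic * blocking)
    return blocking


def exponential_rate(t_seconds, quantile):
    """Rate lambda such that 1 - exp(-lambda * t_seconds) equals the quantile."""
    return -math.log(1.0 - quantile) / t_seconds


p_fpa1 = erlang_b(20.0, 30)            # N = 30 channels, 20 E offered
lam = exponential_rate(1.2, 0.95)      # 95% of transfers within 1200 ms
pdd_t = 1.5                            # threshold in seconds, as in example 2
p_fpa2 = math.exp(-lam * pdd_t)        # 1 - F(pdd_t) for the exponential law
p_fpa = 1.0 - (1.0 - p_fpa1) * (1.0 - p_fpa2)
print(p_fpa1, lam, p_fpa2, p_fpa)
```

Because λ is not rounded here, the second-kind probability differs slightly in the last digit from the value quoted in the example; the total remains close to 0.032.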
when all links are correct, it generates test calls (directed towards test ports) and determines the standard value of pdd (pdd1) for each node in the network (see fig. 5). these data are stored in the detector for comparison with later measured values of pdd. in addition, the threshold pddt is determined for each network node according to the dialled number and the standard value of pdd. the testing is performed in such a way that the ports of the farthest nodes are called first. if pdd < pddt [...] pdd > pddt for a distant node, it is necessary to determine on which route section the pre-alarm state exists (see fig. 7). let us suppose that we dial the subscriber number of tsd in the exchange ted (far network node) from telephone tsa in the exchange tea, where the centralized detector is situated, and that the response time differs from the standard value by more than the threshold allows. it means that the pre-alarm state has appeared on one of the route sections tea – teb, teb – tec, tec – ted. standard values of pdd exist for the connections tsa – tsc and tsa – tsb. the route section on which the pre-alarm state appeared can be detected by successively dialling the telephone numbers tsc and tsb. if the fault exists on the link between teb and tec, the pdd value when dialling tsc will be greater than the pre-defined threshold and the pdd value when dialling tsb will be smaller than the pre-defined threshold. the flow-chart of the detector algorithm is presented in fig. 8. in this flow-chart m is the number of directions to be tested, i is the direction currently being tested, ni is the number of nodes in direction i, and j is the node currently being tested in direction i. dni,j is the test dial number for the node determined by i and j. as already pointed out, testing of direction i starts from the last node in the direction (j = ni).
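The localization step described above (dial the farthest node first, then successively nearer ones) can be sketched as follows. This is a minimal illustration and not the flow-chart of fig. 8: it assumes a single faulty section per direction, and the helper `simulated_pdd`, the per-section delay of 0.3 s, the 4 s PLC penalty and the 1 s guard interval are hypothetical values chosen only to make the example run.

```python
def locate_faulty_section(origin, route, measure_pdd, thresholds):
    """Return the (near, far) exchanges bounding the section in pre-alarm, or None."""
    farthest = route[-1]
    if measure_pdd(farthest) <= thresholds[farthest]:
        return None  # farthest node answers in time: the direction is fault-free
    # Dial successively nearer nodes until one answers within its threshold.
    for idx in range(len(route) - 2, -1, -1):
        node = route[idx]
        if measure_pdd(node) <= thresholds[node]:
            return (node, route[idx + 1])
    return (origin, route[0])  # even the nearest node is slow


def simulated_pdd(origin, route, faulty=None, per_section_s=0.3, plc_extra_s=4.0):
    """Hypothetical PDD model: fixed delay per section, extra delay on the PLC section."""
    sections = list(zip([origin] + route[:-1], route))

    def measure(node):
        total = 0.0
        for section in sections:
            total += per_section_s + (plc_extra_s if section == faulty else 0.0)
            if section[1] == node:
                return total
        raise ValueError("unknown node: " + node)

    return measure


route = ["teb", "tec", "ted"]
standard = simulated_pdd("tea", route)                        # all links correct
thresholds = {node: standard(node) + 1.0 for node in route}   # assumed 1 s guard
faulty_net = simulated_pdd("tea", route, faulty=("teb", "tec"))
result = locate_faulty_section("tea", route, faulty_net, thresholds)
print(result)
```

In this simulated case the call to tec exceeds its threshold while the call to teb does not, so the section teb – tec is reported, matching the behaviour described in the text.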
test number is dialled (dnlast i,j) and if pdd is less than pddt for node i,j (pddti,j), testing is finished for direction i. if it is not the last direction to be tested (i < m), testing continues with the next direction (i = i + 1). when all directions have been tested, testing starts again from the first direction. in the case that test pdd [...]

study on the spatial generation of breakdown spots in mim capacitors with different aspect ratios

[...] (>$10^{-4}$ cm²) capacitors with different aspect ratios were subjected to severe stress conditions ($E_{ox}$ > 4-5 MV/cm) with the aim of generating a large density of breakdown spots (from $10^{5}$ to $10^{6}$ spots/cm²) in the same device. the resulting mark pattern on the top metal electrode associated with the failure events was analyzed first using conventional functional estimators for two-dimensional spatial statistics. second, as a double check, attention was focused on the same breakdown spot patterns but in relation to the probe point location. in this latter case, the objective was to rule out any stochastic dependence of the breakdown spot distribution on the position of the source of degradation and therefore to confirm whether or not the spots follow a complete spatial randomness (csr) process. in order to simplify the mathematical treatment of the point-to-event distributions, the voltage probe was assumed to be located at one corner of the observation window, which significantly reduces the number of cases to analyze. infrared images revealed that the generation of the spots is associated with micro-explosions within the insulating material (hfo2) and with the local volatilization of the top metal electrode (pt).

key words: oxide breakdown, oxide reliability, infrared thermography, hfo2

1.
introduction

received january 5, 2015. corresponding author: enrique miranda, departament d’enginyeria electrònica, universitat autònoma de barcelona, 08193 cerdanyola del vallés, barcelona, spain (e-mail: enrique.miranda@uab.cat). authors: x. saura, m. riccio, g. de falco, j. suñé, a. irace, e. miranda.

failure analysis of thin dielectric films in metal-insulator-metal (mim) and metal-insulator-semiconductor (mis) devices consists in the application of controlled electrical stress and the determination of lifetime estimates using accelerated degradation tests [1,2]. in order to achieve good estimates, a large number of devices need to be subjected to identical stressing conditions until a failure event is detected according to a pre-established criterion (current or voltage jump, progressive leakage current increase, anomalous noise increment in the measurements, etc.). in this way, the time-to-breakdown (bd) statistics using multiple units is constructed. however, the fact that each device provides a single data point to the sampling distributions is a limiting factor for data collection and makes the process time-consuming, so that it generally requires careful planning and execution. fortunately, it has been demonstrated that in certain cases more than one failure event per device can be generated and detected by means of electrical methods, mainly using constant voltage stress (cvs) tests. the successive bd statistics has been used not only to estimate the lifetime of devices but also the temporal and spatial correlations among the failure events [3]. in [4], the successive failure event statistics was obtained assuming poisson area scaling for large virtual area devices and the weakest link property of dielectric bd. although this procedure allows examining large data sets (with a single bd event per device) and assessing their order statistics, the method cannot be considered a direct proof of uncorrelated events.
on the other hand, in this work, the attention is particularly focused on the spatial correlation of multiple failure events occurring in the same device. the temporal correlation of the bd events was investigated in [5] and it was shown to be consistent with a homogeneous poisson process. from the microscopic viewpoint, it is well known that during electrical stress, defects or traps are generated within the dielectric film which ultimately leads to the creation of a filamentary pathway spanning the oxide layer [6]. the formation of this path and the consequent leakage current increase is identified as a bd event. if the occurrence of this event is sufficiently energetic, the damage becomes visible on the top electrode as a mark associated with the local volatilization of the metal layer. since during the application of cvs the degradation of the device does not stop after the occurrence of the first bd event, it is possible therefore to generate more than a single mark per device. at the outset, the bd spot generation rate is governed by the magnitude of the stress voltage but as the degradation proceeds, it becomes limited by the interplay of the current that flows through the parallel bd paths and the voltage drop across the series resistance associated with the device itself or with the measurement setup. when the oxide voltage reduces, the generation of spots ends. the final result is a visible point pattern on the top electrode material that can be analyzed using the methods of two-dimensional (2-d) spatial statistics [7,8]. functional estimators and point-to-event distributions are used then to assess the statistical properties of these failure events. in addition, in order to illustrate the origin of the bd spot pattern, we used the transient infrared (ir) thermography characterization technique, which is based on a real-time mapping of the thermal activity of the device during the degradation stage. 
interestingly, the methods described here are particularly relevant for devices exhibiting the resistive switching (rs) effect, which basically consists in the formation and rupture of a single or multiple conduction channels spanning the dielectric layer in a mim structure [9]. in this regard, several papers have reported the appearance of degradation patterns on the top metal electrode of the structures during the switching process, arising from the outdiffusion of oxygen ions from the oxide layer [9,10]. although this subject is beyond the scope of this paper, it is worth mentioning that multifilamentary rs is currently being considered for multilevel memory devices and could represent a breakthrough in the field of large capacity information storage systems [11].

this paper is organized as follows: in section ii, the samples under investigation and the experimental setup used to characterize the ir emission during the multiple dielectric breakdown process are described. in section iii, the generation of the bd spot pattern and the associated thermal effects are discussed. in section iv, the spatial distribution of the failure events is investigated using conventional functional estimators for 2-d spatial point processes. in all cases, the estimated curves are accompanied by the corresponding 95% confidence bands. in section v, the same bd spot patterns are examined using distance and angular point-to-event probability distributions for rectangular area observation windows. in this case, the probe point location is taken as the reference point from which the distances and angles to the bd events are computed. for the sake of mathematical simplicity, the probe point is considered to be located at one of the vertices of the observation window. finally, in section vi, the conclusions of this work are presented.

2.
samples and experimental setup

the devices investigated were mim capacitors fabricated according to the following steps: a 200 nm-thick thermal sio2 layer was grown on an n-type si (100) substrate, after which a pt (200 nm-thick) layer was deposited by electron-beam (e-beam) evaporation. the samples were then placed in a cambridge nanotech fiji atomic layer deposition (ald) system where hfo2 (30 nm-thick) was deposited using temahf precursor and h2o. the samples were then returned to the e-beam evaporator and a pt layer (200 nm-thick) was deposited on top of the hfo2 film. lithography and a lift-off process were used to form arrays of rectangular capacitors of different sizes (400 x 400 µm², 400 x 100 µm², 1600 x 100 µm²). access to the bottom pt metal was enabled via a dry etching technique using a mask/resist process that removed the hfo2 down to the bottom pt metal while at the same time protecting the top pt metal of the patterned devices. the oxide extends 25 µm beyond the perimeter edge of the top metal electrode. the devices were stressed using a hp4155 semiconductor parameter analyzer with the bottom electrode grounded.

a schematic of the experimental setup used to carry out the ir mappings discussed in section iii is illustrated in fig. 1. the core of the system is the merlin-mid ir camera together with an external synchronization logic block implemented on a fpga digital circuit. the ir camera has a focal plane array (fpa) of 320×256 pixels with an insb sensor (30×30 µm² pixel area) that guarantees a minimum temperature resolution of 25 mK. the observation field is about 30 µm with a 1:1 optical magnification lens. the system is capable of detecting temperature distributions both in steady-state and transient conditions. in transient operation it is possible to use real-time and equivalent-time measurement modes. in the former case, the maximum sampling rate is limited by the thermo-camera frame rate to 50 Hz.
in the latter case, if the experiment can be repeated periodically, the system can reach an equivalent full-frame bandwidth of 100 kHz, limited only by the minimum integration time of the fpa sensors (10 µs). in this work, a direct method for transient characterization of the device temperature was used. the information provided by the system is subsequently edited using image processing software. more details about the ir mapping setup can be found in [12]. to conclude this section, it is worth pointing out that the statistical study reported in sections iv and v was carried out using the spatstat package for the r language [13]. this package supports creation, manipulation and plotting of point patterns, exploratory data analysis, simulation of point process models, parametric model fitting, as well as many other statistical tools (including over 1500 user-level functions). spatstat can be downloaded for free from the r website [14].

fig. 1 schematic of the experimental ir setup with detail of connections between different system parts.

3. generation of the bd spots and thermal effects

the images shown in fig. 2 illustrate the local temperature increase caused by the occurrence of single and multiple bd events in the mim devices described in section ii. the failure events were generated using cvs with stress voltages of about 8-9 v applied for periods not longer than 1 minute. in many cases the shorts between the top and bottom electrodes are permanent, as revealed by the ir images. in other cases just a flash is observed, which indicates that after the metal evaporation the electrical contact between the electrodes is locally lost. the voltage probe is clearly visible in the ir images too, which indicates a temperature increase at the contact between the tip of the probe and the top electrode (see fig. 2).
a difference of up to eight degrees celsius is detected between the metal electrode and the center of the leakage paths. notice that this temperature can be radically different from the actual temperature reached inside the dielectric film during the micro-explosions. the ir images also show that the leakage current can be localized in a few sites (see figs. 2.a and 2.b) or, if the damage is extremely severe, it can flow almost uniformly distributed over the whole device area (see fig. 2.c). a thorough analysis of these experiments reveals that although the density of marks associated with the generation of bd spots on the top electrode can be quite large, the actual operating leakage current pathways spanning the dielectric film can be considerably less numerous. a second interesting question that arises is to what extent the local temperature increase associated with the formation of the current pathways affects the spatial generation of subsequent failure events. this issue will be analyzed in sections iv and v in terms of the statistical distribution of the events. the results obtained by other authors using electrical methods indicate the absence of bd spot spatial correlation in mis structures [15]. however, unlike ours, that kind of study was limited to just a few bd events per device. in the next section, the bd spot distribution in a 2-d space will be analyzed using a variety of functional estimators.

fig. 2 thermal images of different mim structures in which a) a single and b), c) multiple bd spots are generated during electrical stress.

4. analysis of the bd spot distribution using 2-d functional estimators

in order to characterize the bd spot distribution in a 2-d region, it is necessary first to provide a brief introduction to the different functional estimators that are considered in this study.
notice that the standard reference model of a point process in the plane is the homogeneous poisson point process, also referred to as csr (complete spatial randomness), which is commonly regarded as the null hypothesis model in spatial statistics [7]. the nearest neighbor distance distribution $G$ is the cumulative distribution of the distance from a typical random point to the nearest other point (see fig. 3). for a csr process, $G$ is given by the expression:

$$G_{csr}(r) = 1 - \exp(-\lambda \pi r^{2}) \qquad (1)$$

where λ is the average intensity of the process, i.e. the number of events registered divided by the area of the observation window. while $G > G_{csr}$ suggests that nearest neighbor distances are shorter than for a csr process (clustered pattern), $G < G_{csr}$ suggests a regular pattern. the empty space function $F$ is the cumulative distribution of the distance from a fixed location in the window to the nearest point of the pattern. for a csr process:

$$F_{csr}(r) = 1 - \exp(-\lambda \pi r^{2}) \qquad (2)$$

while $F > F_{csr}$ suggests that empty space distances are shorter than for a csr process (regularly spaced pattern), $F < F_{csr}$ suggests a clustered pattern. ripley's $K$ function is defined so that $\lambda K(r)$ is the expected number of other points within a distance $r$ of a typical point. for a csr process:

$$K_{csr}(r) = \pi r^{2} \qquad (3)$$

$K > K_{csr}$ suggests clustering, while $K < K_{csr}$ suggests a regular pattern. finally, the pair correlation function $g$ is obtained from $K$ as:

$$g(r) = \frac{K'(r)}{2 \pi r} \qquad (4)$$

$g > 1$ suggests clustering or attraction, while $g < 1$ suggests inhibition or regularity. for a csr process $g_{csr}(r) = 1$. the estimator fails for r values close to 0 [7].

fig. 3 distribution of the points and circular region of generic radius r used to calculate the functional estimators.

in figs. 4 to 6, three particular cases of mim devices with bd spot patterns are shown. the images were obtained after the application of constant voltage stress. in the first two cases (figs. 4 and 5), the voltage probe tip is clearly visible close to one corner of the device. the position of the probe tip defines the considered observation window (red solid line) within which the point pattern is analyzed. however, notice that for the analysis that follows, the specific location of the voltage probe is irrelevant, since r refers to a generic distance within the spatial point process. in the case of fig.
4.a, the area of the device is 400 x 400 µm², the area of the observation window is 370 x 370 µm² and the number of spots registered is 321. this corresponds to a density of $2.35 \times 10^{-3}$ points/µm². the dashed lines in figs. 4.b to 4.e were calculated using expressions (1)-(4) for a csr process. notice that the estimated curves (black solid lines) are all confined within the 95% confidence bands (shaded area) except for some deviation detected in $G$ for the short distance range. this is merely a consequence of considering the bd spots as mathematical points (recall that typical lateral sizes are about 2 to 3 µm) [16]. in addition, notice that $g$ fluctuates around the unity mark, which is indicative of a csr process at different scales. in summary, the process illustrated in fig. 4 can be considered a csr process. no interaction is detected among the bd spots, i.e. they are spatially uncorrelated. something similar occurs for the device shown in fig. 5.a. in this case the area of the device is 400 x 100 µm², the area of the observation window is 392.5 x 94.5 µm² and the number of spots is 168. this corresponds to a density of $4.53 \times 10^{-3}$ points/µm², which almost doubles the intensity of the previous example. no interpoint distance below 2 µm is registered, but this is again a matter related to the spot sizes. no sign of deviation from a csr process is detected. the case illustrated in fig. 6.a is rather different since one of the sides of the device is significantly larger than the other. the device under consideration has lateral sizes of 1600 x 100 µm² and the region of interest (observation window) coincides with the device area. in this case, 132 points were registered with an average density of $8.25 \times 10^{-4}$ points/µm². notice the important deviations that take place with respect to the corresponding csr process (see for example the pair correlation function $g$ in fig. 6.e).
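The quoted intensities follow directly from the spot counts and window sizes, and the CSR reference curve for the nearest neighbor distribution is simple to evaluate. The sketch below is an illustration only and is not the spatstat workflow used by the authors: it assumes the standard CSR form G(r) = 1 − exp(−λπr²), and `empirical_g` is a naive estimator without edge correction, written here only to show what the estimator measures. All function names are hypothetical.

```python
import math


def intensity(n_points, width_um, height_um):
    """Average intensity: number of events divided by the window area (points/um^2)."""
    return n_points / (width_um * height_um)


def g_csr(r_um, lam):
    """Nearest neighbor distance distribution of a CSR process with intensity lam."""
    return 1.0 - math.exp(-lam * math.pi * r_um ** 2)


def empirical_g(points, r_um):
    """Naive estimate of G(r): share of points whose nearest neighbor lies within r."""
    hits = 0
    for i, (xi, yi) in enumerate(points):
        nearest = min(math.hypot(xi - xj, yi - yj)
                      for j, (xj, yj) in enumerate(points) if j != i)
        if nearest <= r_um:
            hits += 1
    return hits / len(points)


# (number of spots, window width, window height) as reported for figs. 4 to 6
windows = {"fig4": (321, 370.0, 370.0),
           "fig5": (168, 392.5, 94.5),
           "fig6": (132, 1600.0, 100.0)}
intensities = {name: intensity(*w) for name, w in windows.items()}
for name, lam in intensities.items():
    print(name, lam, g_csr(10.0, lam))
```

A real analysis would also need edge correction and simulation envelopes for the confidence bands, both of which this sketch omits.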
particularly important is the absence of points in the short distance range (<5 µm), but this can be related to the low density of points (~$10^{-4}$) investigated. for the example illustrated in fig. 6.a, the information about the location of the voltage probe is unavailable. the estimators seem to indicate that the data points follow a csr process for distances larger than 5-10 µm. the deviation in $G$ for the long distance range (>30 µm) can be attributed to edge effects associated with the low aspect ratio (100/1600) of the investigated device (see fig. 6.b). in section v, the same bd spot patterns explored in this section are assessed again but in connection with the probe point location. since no spatial correlation effects were detected using 2-d functional estimators, these results are expected to be confirmed using alternative methods such as the point-to-event distance and angular probability distributions. this analysis is carried out in the next section.

fig. 4 a) distribution of the bd spots in a mim device with area 400 × 400 µm². 2-d functional estimators: b) $G$, c) $F$, d) $K$ and e) $g$.

fig. 5 a) distribution of the bd spots in a mim device with area 400 × 100 µm². 2-d functional estimators: b) $G$, c) $F$, d) $K$ and e) $g$.

fig. 6 a) distribution of the bd spots in a mim device with area 1600 × 100 µm². 2-d functional estimators: b) $G$, c) $F$, d) $K$ and e) $g$.

5.
analysis of the bd spot distribution using point-to-event distributions in order to perform the study on the spatial distribution of the bd spots with respect to a particular point in the plane, it is necessary to characterize the locations of the failure events by two random variables (see fig. 7): i) the distance 0x(a 2 +b 2 ) 1/2 between the reference point p and the event e and, ii) the angle 0α/2 subtended by the line connecting p with e measured with respect to the horizontal side of the rectangle a x b. in particular, for our analysis, we take p as the location of the voltage probe, which is chosen to coincide with one vertex of the observation window. this selection remarkably simplifies the mathematical treatment. the variable x defines the point-to-event distance probability distribution function (pdf) f(x), whereas α defines the point-to-event angular probability distribution function f(α). f(x) and f(α) denote the corresponding cumulative distribution functions (cdf). for the sake of simplicity, we use the same notation f and f for both variables though their explicit mathematical expressions are different. these distributions can be easily found by combining the areas of geometrical figures (see ref.[17]). the pdf and cdf for the distance x are given by the expressions: ( ) { ( ) * ( ) ( )+ √ (5) and ( ) { [ √ ( )] [ √ ( ) √ ( )] √ (6) respectively. the pdf and cdf for the angle α are given by the expressions: ( ) { ( ) ) ( ) ( ) ( ) (7) and ( ) { ( ) ( ) ( ) ( ) (8) respectively. 188 x. saura, m. riccio, g. de falco, j. suñé, a. irace, e. miranda fig. 7 definition of the variables used to localize the breakdown spots (e) with respect to the voltage probe position (p). x is the distance from p to e and  is the corresponding angle. figures 8 to 10 show the three cases discussed in the previous section but now emphasizing the distribution of the points with respect to the bottom-left vertex of the observation window. 
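as a numerical illustration of the point-to-event approach, the following python sketch draws a csr pattern in an a × b rectangle, measures the distance and angle of each point from the bottom-left vertex, and compares the empirical cdfs with closed-form expressions. the closed forms coded here are derived from the area of the intersection between a quarter disc (or an angular sector) and the rectangle, assuming a ≥ b with a the horizontal side; they are an assumption of this sketch and not a transcription of the authors' code. the function names, the sample size and the seed are hypothetical.

```python
import math
import random


def distance_cdf(x, a, b):
    """CDF of the distance from a vertex to a uniform point in an a x b rectangle."""
    if a < b:
        raise ValueError("this sketch assumes a >= b")
    if x <= 0.0:
        return 0.0
    if x >= math.hypot(a, b):
        return 1.0
    if x <= b:
        # the quarter disc of radius x lies fully inside the rectangle
        area = math.pi * x * x / 4.0
    else:
        # the quarter disc is cut by the side at height b
        area = 0.5 * (b * math.sqrt(x * x - b * b) + x * x * math.asin(b / x))
        if x > a:
            # it is also cut by the side at abscissa a
            area += 0.5 * (a * math.sqrt(x * x - a * a) - x * x * math.acos(a / x))
    return area / (a * b)


def angle_cdf(alpha, a, b):
    """CDF of the angle, measured from the side of length a, seen from the vertex."""
    if alpha <= 0.0:
        return 0.0
    if alpha >= math.pi / 2.0:
        return 1.0
    if alpha <= math.atan2(b, a):
        return a * math.tan(alpha) / (2.0 * b)
    return 1.0 - b / (2.0 * a * math.tan(alpha))


def simulate_csr(n, a, b, seed=0):
    """Return distances and angles of n uniform points, seen from the vertex (0, 0)."""
    rng = random.Random(seed)
    points = [(rng.uniform(0.0, a), rng.uniform(0.0, b)) for _ in range(n)]
    distances = [math.hypot(u, v) for u, v in points]
    angles = [math.atan2(v, u) for u, v in points]
    return distances, angles


def empirical_cdf(samples, value):
    """Fraction of samples that do not exceed value."""
    return sum(s <= value for s in samples) / len(samples)


if __name__ == "__main__":
    a, b = 400.0, 100.0  # rectangle sides in micrometres, as in the 400 x 100 case
    distances, angles = simulate_csr(20000, a, b, seed=1)
    for x in (50.0, 200.0, 405.0):
        print("x =", x, empirical_cdf(distances, x), distance_cdf(x, a, b))
    for alpha in (0.1, 0.3, 1.0):
        print("alpha =", alpha, empirical_cdf(angles, alpha), angle_cdf(alpha, a, b))
```

with measured bd spot coordinates in place of the simulated points, the same two empirical cdfs could be compared against the closed forms, which is the kind of comparison reported in the figures that follow.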
as expected, the experimental results (histograms for the pdfs and black solid lines for the cdfs) agree well with the theoretical results (red solid lines) calculated using expressions (5)-(8). notice how the mode of the distributions shifts to the left (shorter distances and smaller angles) as the length of the rectangle increases. in all the analyzed cases, it can be concluded that the data points are csr distributed, in total consistency with the observations reported in section iv. notice that a fundamental difference between the method presented here and that of the functional estimators is that the former makes explicit reference to the shape of the observation window while the latter does not (no edge correction has been considered). on the other hand, methods for calculating the confidence bands are readily available for the functional estimators, whereas such methods have not yet been developed for the point-to-event distributions.

fig. 8 a) map of the bd spots in a mim device with area 400 x 100 μm². the lines indicate the distances from the event to the bottom-left vertex of the observation window. pdfs and cdfs for the distances x and the angles α: b) f(x), c) F(x), d) f(α) and e) F(α).

fig. 9 a) map of the bd spots in a mim device with area 400 x 100 μm². the lines indicate the distances from the event to the bottom-left vertex of the observation window. pdfs and cdfs for the distances x and the angles α: b) f(x), c) F(x), d) f(α) and e) F(α).

fig. 10 a) map of the bd spots in a mim device with area 1600 × 100 μm². the lines indicate the distances from the event to the bottom-left vertex of the observation window.
pdfs and cdfs for the distances x and the angles α: b) f(x), c) F(x), d) f(α) and e) F(α).

6. conclusions

the spatial distribution of breakdown spots in hfo2-based large area mim capacitors generated by severe electrical stress was investigated. the analysis was performed on devices with different aspect ratios. using real-time ir mapping it was demonstrated that the occurrence of these shorts between the top and bottom metal electrodes is associated with a notable local temperature increase. however, not all the generated spots remain conducting after their creation because of the volatilization of the top metal electrode. from the statistical analysis carried out in this work, it can be concluded that the breakdown spots in mim devices with high-k dielectric are spatially uncorrelated. this is consistent with previous observations carried out on mis devices.

acknowledgement: this work is funded in part by the spanish ministry of science and technology under contract tec2012-32305 and the dursi of the generalitat de catalunya under contract 2009sgr783. e. miranda thanks the funding from the visiting professor program of the university of naples federico ii. the authors also acknowledge i. m. povey, e. o’connor, k. cherkaoui and p.k. hurley (tyndall national institute, cork, ireland) for device provision and assistance in the electrical characterization of the samples. we give our special thanks to prof. ninoslav stojadinović and prof. danijel danković from university of niš, republic of serbia for their invitation to write this report.

references [1] a. oates, “reliability issues for high-k gate dielectrics”, iedm tech. dig., 2003, pp. 923-926. [2] e. wu, j. stathis, and l. han, “ultra-thin oxide reliability for ulsi applications”, semicond. sci. technol. 15, 425 (2000). [3] m. alam, r. smith, b. weir, and p.
silverman, “thin dielectric films: uncorrelated breakdown of silicon integrated circuits”, nature 420, 378 (2002). [4] j. suñé and e. wu, “statistics of successive breakdown events in gate oxides”, ieee elect. dev. lett. 24, 272 (2003). [5] x. saura, d. moix, j. suñé, p.k. hurley, and e. miranda, “direct observation of the generation of breakdown spots in mim structures under constant voltage stress”, mic. rel. 53, 1257 (2013). [6] e. miranda and j. suñé, “electron transport through broken down ultra-thin sio2 layers in mos devices”, mic. rel. 44, 1 (2004). [7] j. illian, a. penttinen, h. stoyan, and d. stoyan, in statistical analysis and modelling of spatial point patterns, wiley, 2008. [8] p. diggle, in statistical analysis of spatial point patterns, arnold, 2003. [9] r. waser, r. dittmann, g. staikov, and k. szot, “redox-based resistive switching memories: nanoionic mechanisms, prospects, and challenges”, adv. mat. 21, 2632 (2009). [10] r. waser and m. aono, “nanoionics-based resistive switching memories”, nature materials 6, 833 (2007). [11] s. lombardo, j. stathis, b. linder, kin leon pey, f. palumbo, and chih hang tung, “dielectric breakdown mechanisms in gate oxides”, j. appl. phys. 98, 121301 (2005). [12] m. riccio, g. breglio, a. irace, and p. spirito, “an equivalent time temperature mapping system with a 320 x 256 pixels full-frame 100 khz sampling rate”, rev. sci. instrum. 78, 106106 (2007). [13] a. baddeley, r. turner, “spatstat: an r package for analyzing spatial point patterns”, j. stat. software 12, 1 (2005). [14] www.r-project.org [15] m. alam, d. varghese, and b. kaczer, “theory of breakdown position determination by voltage- and current-ratio methods”, ieee trans. elect. dev. 55, 3150 (2008). [16] x. saura, j. suñé, s. monaghan, p.k. hurley, and e. miranda, “analysis of the breakdown spot spatial distribution in pt/hfo2/pt capacitors using nearest neighbor statistics”, j. appl. phys. 114, 154112 (2013). [17] s. chiu and r.
larson, “bertrand’s paradox revisited: more lessons about that ambiguous word: random”, j. ind. sys. eng. 3, 1 (2009).

facta universitatis series: electronics and energetics vol. 35, no 2, june 2022, pp. 229-242 https://doi.org/10.2298/fuee2202229b © 2022 by university of niš, serbia | creative commons license: cc by-nc-nd

original scientific paper

optimal location and sizing of multiple distributed generators in radial distribution network using metaheuristic optimization algorithms

nasreddine belbachir¹, mohamed zellagui²,³, benaissa bekkouche¹
¹department of electrical engineering, university of mostaganem, mostaganem, algeria
²department of electrical engineering, école de technologie supérieure, québec, canada
³department of electrical engineering, university of batna 2, batna, algeria

abstract. the satisfaction of electricity customers and the environmental constraints imposed have made the trend towards renewable energies more essential, given advantages such as reduced power losses and enhanced voltage profiles. this study addresses the optimal sizing and siting of photovoltaic distributed generators (pvdg) connected to a radial distribution network (rdn) using various novel optimization algorithms. these algorithms are implemented to minimize a multi-objective function (mof) devoted to optimizing the total active power loss (tapl), the total voltage deviation (tvd), and the total operation time (tot) of the overcurrent protection relays (ocrs). the effectiveness of the proposed algorithms is validated on the standard ieee 33-bus rdn test system.
this paper presents a recent meta-heuristic optimization algorithm, the slime mould algorithm (sma). the results reveal its effectiveness and robustness among all the applied optimization algorithms in identifying the optimal allocation (location and size) of the pvdg units in the rdn for mitigating the power losses, enhancing the rdn system's voltage profiles and improving the overcurrent protection system. accordingly, the sma approach can be a very favorable algorithm to cope with the optimal pvdg allocation problem.

key words: multi-objective function, photovoltaic distributed generation, radial distribution network, optimal integration, metaheuristic optimization algorithms.

received august 21, 2021; received in revised form november 7, 2021
corresponding author: nasreddine belbachir, department of electrical engineering, university of mostaganem, algeria, e-mail: nasreddine.belbachir.etu@univ-mosta.dz

1. introduction

following the rapid rise in electricity demand, achieving the balance between demand and electricity production has become an essential challenge for researchers. the conventional solution consists of creating new power stations, but this solution requires significant investments and costs. photovoltaic distributed generation (pvdg) is therefore an alternative for solving this problem, because of its benefits and advantages [1]. the optimal allocation of pvdg units in the radial distribution network (rdn) plays an important role in maximizing these benefits, such as reducing the network losses and enhancing the voltage profiles [2]. to achieve these benefits, the allocation of pvdg units should be carried out carefully, considering the objective function, the constraints, and the best choice of optimization algorithm [3].
research findings explicitly confirm that the location, size and type of pvdg units in rdns significantly affect their technical, economic, and environmental parameters [4]. the impact of the pvdg sources can be either beneficial or disadvantageous to the rdn, depending on the allocation of the pvdg units. therefore, the correct placement of pvdg units in the rdn remains a barrier to reaching their full possible benefits [5]. recently, several techniques and algorithms have been implemented to solve the problem of the optimal integration of pvdg units in the rdn considering different objective functions, such as: the biogeography-based optimization (bbo) algorithm in [6], the moth flame optimization (mfo) algorithm in [7], the adaptive pso (apso) algorithm in [8], the sinusoidal modulated pso (sm-pso) algorithm in [9], the spider monkey optimization (smo) algorithm in [10], the artificial bee colony (abc) algorithm in [11], the improved artificial bee colony (iabc) algorithm in [12], the grasshopper optimization algorithm (goa) in [13], the gravitational search algorithm (gsa) in [14], the coyote optimization algorithm (coa) in [15], and the krill herd algorithm (kha) in [16]. more recent applications include the sine–cosine algorithm (csa) in [17], the salp swarm algorithm (ssa) in [18], the harris hawk optimizer (hho) algorithm in [19], the improved equilibrium optimization algorithm (ieoa) in [20], the chaotic grey wolf optimization (cgwo) algorithm in [21], the bat algorithm (ba) in [22], and the ant lion optimizer (alo) algorithm in [23].
this paper proposes a new approach to identify the optimal allocation of multiple pvdg units in the rdn. the approach relies on a newly proposed multi-objective function (mof) that minimizes the total active power loss (tapl), the total voltage deviation (tvd) and the total operation time (tot) of the overcurrent relays (ocrs). several recent optimization algorithms are applied: the whale optimization algorithm (woa) [24], ant lion optimization (alo) [25], the grasshopper optimization algorithm (goa) [26], the salp swarm algorithm (ssa) [27] and the slime mould algorithm (sma) [28]. their effectiveness is validated on the standard ieee 33-bus rdn test system. the main contributions and novelty of this paper are:
▪ proposing a new multi-objective function comprising three technical parameters to be minimized simultaneously.
▪ including the operation time of the overcurrent relays in the multi-objective function. in addition to the simultaneous improvement of the voltage profiles and reduction of the active power losses, minimizing the operation time of the ocrs improves the protection system, raises reliability and brings various economic and technical benefits, such as extending the lifetime of the equipment. it also guarantees the system's normal operation and the continuity of service through the quick removal of the part of the system where a fault current occurs.
▪ applying the optimization to the standard ieee 33-bus rdn.
▪ studying the impact of the optimal integration of different cases of pvdg units on various technical parameters of the distribution network.
the rest of the paper comprises four main sections followed by a list of references. section 2 presents the proposed multi-objective function, section 3 describes the applied sma, section 4 presents and discusses the optimal results, and section 5 contains the conclusions, the achievements and the future perspectives.

2. mathematical problem formulation

2.1. multi-objective functions

the aim of this paper is to find the optimal allocation of all pvdg cases in the rdn so as to optimize simultaneously the technical parameters tapl, tvd and tot. this is done by minimizing the developed mof, defined as the sum of three technical indices: the total active power loss index (tapli), the total voltage deviation index (tvdi) and the total operation time index (toti) of the overcurrent relays. indices are used because they are relative to unity. the mof is formulated as:

$$\mathrm{MOF} = \mathrm{Minimize} \sum_{i=1}^{N_{bus}} \sum_{j=2}^{N_{bus}} \sum_{i=1}^{N_{R}} \left[ \mathrm{TAPLI}_{i,j} + \mathrm{TVDI}_{j} + \mathrm{TOTI}_{i} \right] \tag{1}$$

the first index, the tapli, is expressed by [7, 8]:

$$\mathrm{TAPLI} = \frac{\mathrm{TAPL}_{After\ PVDG}}{\mathrm{TAPL}_{Before\ PVDG}} \tag{2}$$

$$\mathrm{TAPL}_{i,j} = \sum_{i=1}^{N_{bus}} \sum_{j=2}^{N_{bus}} \mathrm{APL}_{i,j} \tag{3}$$

$$\mathrm{APL}_{i,j} = \alpha_{ij} (P_i P_j + Q_i Q_j) + \beta_{ij} (Q_i P_j + P_i Q_j) \tag{4}$$

$$\alpha_{ij} = \frac{R_{ij}}{V_i V_j} \cos(\delta_i - \delta_j) \quad \text{and} \quad \beta_{ij} = \frac{R_{ij}}{V_i V_j} \sin(\delta_i + \delta_j) \tag{5}$$

where $R_{ij}$ represents the resistance of the line, $N_{bus}$ is the number of buses, $(\delta_i, \delta_j)$ and $(V_i, V_j)$ denote the angles and voltages, respectively, and $(P_i, P_j)$ and $(Q_i, Q_j)$ are the active and reactive powers, respectively. $V_1$ is the voltage at the sub-station and is equal to 12.66 kv.

the second index, the tvdi, is defined as [9, 13]:

$$\mathrm{TVDI} = \frac{\mathrm{TVD}_{After\ PVDG}}{\mathrm{TVD}_{Before\ PVDG}} \tag{6}$$

$$\mathrm{TVD}_{j} = \sum_{j=2}^{N_{bus}} \left| V_1 - V_j \right| \tag{7}$$

the third index is the toti of the overcurrent relays, which can be represented as [21]:

$$\mathrm{TOTI} = \frac{\mathrm{TOT}_{After\ PVDG}}{\mathrm{TOT}_{Before\ PVDG}} \tag{8}$$

$$\mathrm{TOT}_{i} = \sum_{i=1}^{N_{R}} T_i \tag{9}$$

$$T_i = TDS_i \left( \frac{A}{M_i^{B} - 1} \right) \tag{10}$$

$$M_i = \frac{I_F}{I_P} \quad \text{and} \quad I_F = \frac{V_F}{Z_{ij}} \tag{11}$$

where $T_i$ is the operation time of the ocr, $TDS$ is the time dial setting and $M$ is the multiple of pickup current. $I_F$ and $I_P$ represent the fault and the pickup currents, respectively. $A$ and $B$ are constants of the relay, set to 0.14 and 0.02, respectively. $N_R$ is the number of ocrs, $V_F$ is the measured phase fault voltage magnitude and $Z_{ij}$ is the line impedance.

2.2. equality constraints

the equality constraints of the power balance equations can be expressed as follows:

$$P_G + P_{PVDG} = P_D + APL \tag{12}$$

$$Q_G = Q_D + RPL \tag{13}$$

where $Q_G$ and $P_G$ are the total reactive and active powers from the generator, $Q_D$ and $P_D$ are the load's total reactive and active powers, $APL$ and $RPL$ denote the active and reactive power losses, respectively, and $P_{PVDG}$ is the pvdg's output power.

2.3. distribution line constraints

$$V_{min} \le \left| V_i \right| \le V_{max} \tag{14}$$

$$\left| V_1 - V_j \right| \le \Delta V_{max} \tag{15}$$

$$\left| S_{ij} \right| \le S_{max} \tag{16}$$

where $V_{min}$ and $V_{max}$ denote the minimum and maximum voltage limits, $\Delta V_{max}$ is the maximum voltage drop limit, and $S_{max}$ and $S_{ij}$ are the maximum and actual apparent power in the distribution line.

2.4. pvdg units constraints

$$P_{PVDG}^{min} \le P_{PVDG} \le P_{PVDG}^{max} \tag{17}$$

$$\sum_{i=1}^{N_{PVDG}} P_{PVDG}(i) \le \sum_{i=1}^{N_{bus}} P_D(i) \tag{18}$$

$$2 \le PVDG_{position} \le N_{bus} \tag{19}$$

$$N_{PVDG} \le N_{PVDG.max} \tag{20}$$

$$n_{PVDG,i} / Location \le 1 \tag{21}$$

where $P_{PVDG}^{min}$ and $P_{PVDG}^{max}$ are the minimum and the maximum pvdg output power, $N_{PVDG}$ is the number of pvdg units and $n_{PVDG,i}$ is the location of the pvdg units at bus i.

3. the slime mould algorithm

the sma is a recent metaheuristic technique proposed in [28], based on the natural behavior of the slime mould, which searches for food, surrounds it, and secretes enzymes to digest it.
the behavior of the slime mould may be described by three principal steps, namely approach food, wrap food, and oscillating; the following subsections present their mathematical forms.

3.1. approach food

the approach behavior of the slime mould is modeled mathematically so as to imitate the contraction mode. the following rule is proposed:

$$X(t+1) = \begin{cases} X_b(t) + vb \cdot \left( W \cdot X_A(t) - X_B(t) \right), & r < p \\ vc \cdot X(t), & r \ge p \end{cases} \tag{22}$$

where $vb$ is a parameter in the range $[-a, a]$ and $vc$ decreases linearly from 1 to 0. $t$ is the current iteration, $X_b$ indicates the individual location with the highest odor concentration found so far, $X$ is the slime mould's location, $X_A$ and $X_B$ are two individuals selected randomly from the swarm, and $W$ is the slime mould's weight. the parameter $p$ is formulated as follows:

$$p = \tanh \left| S(i) - DF \right| \tag{23}$$

where $i \in 1, 2, \ldots, n$, $S(i)$ is the fitness of $X$, and $DF$ is the best fitness obtained over all iterations. the parameter $vb$ is formulated as follows:

$$vb = [-a, a] \tag{24}$$

$$a = \operatorname{arctanh} \left( - \left( \frac{t}{max\_t} \right) + 1 \right) \tag{25}$$

the weight $W$ is formulated as follows:

$$W(SmellIndex(i)) = \begin{cases} 1 + r \cdot \log \left( \dfrac{bF - S(i)}{bF - wF} + 1 \right), & condition \\[2mm] 1 - r \cdot \log \left( \dfrac{bF - S(i)}{bF - wF} + 1 \right), & others \end{cases} \tag{26}$$

$$SmellIndex = sort(S) \tag{27}$$

where $condition$ indicates that $S(i)$ ranks in the first half of the population, $r$ is a random value in $[0, 1]$, $bF$ represents the best fitness obtained in the current iteration, $wF$ represents the worst fitness value obtained in the current iteration, and $SmellIndex$ is the sequence of sorted fitness values.

3.2. wrap food

the location update of the slime mould is formulated as follows:

$$X^{*} = \begin{cases} rand \cdot (UB - LB) + LB, & rand < z \\ X_b(t) + vb \cdot \left( W \cdot X_A(t) - X_B(t) \right), & r < p \\ vc \cdot X(t), & r \ge p \end{cases} \tag{28}$$

where $LB$ and $UB$ are the lower and upper limits of the search range, and $rand$ and $r$ are random values in $[0, 1]$.

3.3. oscillating

the value of $vb$ oscillates randomly in the range $[-a, a]$ and gradually approaches 0 as the iterations increase. the value of $vc$ oscillates in the range $[-1, 1]$ and eventually tends to 0.

algorithm 1 pseudo-code of sma
    initialize the parameters pop-size, max_iteration;
    initialize the slime mould's positions xi (i = 1, 2, …, n);
    while (t ≤ max_iteration)
        calculate the slime mould's fitness;
        update bestfitness, xb;
        calculate the w by eq. (26);
        for each search portion
            update p, vb, vc;
            update positions by eq. (28);
        end for
        t = t + 1;
    end while
    return bestfitness, xb;

4. test system, optimal results and analysis

the selected algorithms were implemented in matlab (version 2017.b) on a pc with an intel core i5 processor at 3.4 ghz and 8 gb of ram. figure 1 represents the single-line diagram of the standard ieee 33-bus rdn [29, 30], which operates with a base voltage of 12.66 kv, an active load demand of 3715.00 kw and a reactive load demand of 2300.00 kvar. it comprises 32 branches and 33 buses. each of these buses is protected by a primary overcurrent relay, followed by another overcurrent relay acting as backup, with a coordination time interval (cti) between them set above 0.2 seconds. in total, the whole system includes 31 ocrs with 31 ctis between them.

fig. 1 single-line diagram of test system ieee 33-bus rdn.

figure 2 shows the convergence curves obtained after applying the various metaheuristic algorithms for all the studied cases of optimal integration into the rdn test system.
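the pseudo-code of algorithm 1, combined with eqs. (22)-(28), can be sketched in python as follows. this is a minimal illustration and not the authors' matlab implementation: the function name `sma_minimize`, the restart probability `z = 0.03` (a value commonly used with the sma, not stated in this paper), the base-10 logarithm in eq. (26), the small constant guarding the division, and the sphere test objective are all assumptions of the sketch. in the study itself the objective would be the mof of section 2 evaluated through a load flow, which is not reproduced here.

```python
import math
import random


def sma_minimize(objective, lower, upper, n_agents=10, max_iter=150, z=0.03, seed=0):
    """Minimal slime mould algorithm sketch for minimization (hypothetical helper)."""
    rng = random.Random(seed)
    dim = len(lower)
    pos = [[rng.uniform(lower[j], upper[j]) for j in range(dim)]
           for _ in range(n_agents)]
    best_pos, best_fit = None, math.inf
    history = []  # best fitness found so far, one entry per iteration

    for t in range(1, max_iter + 1):
        fit = [objective(x) for x in pos]
        order = sorted(range(n_agents), key=lambda i: fit[i])  # SmellIndex, eq. (27)
        b_f, w_f = fit[order[0]], fit[order[-1]]
        if b_f < best_fit:
            best_fit, best_pos = b_f, pos[order[0]][:]
        history.append(best_fit)

        # weights W, eq. (26); the ratio lies in [0, 1] for a minimization problem
        weight = [[1.0] * dim for _ in range(n_agents)]
        for rank, i in enumerate(order):
            ratio = (fit[i] - b_f) / (w_f - b_f + 1e-300)
            for j in range(dim):
                term = rng.random() * math.log10(ratio + 1.0)
                weight[i][j] = 1.0 + term if rank < n_agents / 2 else 1.0 - term

        a = math.atanh(1.0 - t / max_iter)  # eq. (25), bounds of vb
        b = 1.0 - t / max_iter              # bounds of vc, decreasing to 0

        new_pos = []
        for i in range(n_agents):
            if rng.random() < z:
                # first branch of eq. (28): random re-initialization
                new_pos.append([rng.uniform(lower[j], upper[j]) for j in range(dim)])
                continue
            p = math.tanh(abs(fit[i] - best_fit))  # eq. (23)
            x_new = []
            for j in range(dim):
                vb = rng.uniform(-a, a)
                vc = rng.uniform(-b, b)
                if rng.random() < p:
                    k_a = rng.randrange(n_agents)
                    k_b = rng.randrange(n_agents)
                    value = best_pos[j] + vb * (weight[i][j] * pos[k_a][j] - pos[k_b][j])
                else:
                    value = vc * pos[i][j]
                # keep the candidate inside the search range
                x_new.append(min(max(value, lower[j]), upper[j]))
            new_pos.append(x_new)
        pos = new_pos

    return best_pos, best_fit, history


if __name__ == "__main__":
    def sphere(x):
        return sum(v * v for v in x)

    best_x, best_f, hist = sma_minimize(sphere, [-10.0, -10.0], [10.0, 10.0], seed=1)
    print(best_x, best_f)
```

the default values of 10 search agents and 150 iterations mirror the settings reported in this section; the recorded `history` list corresponds to the kind of convergence curve plotted in figure 2.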
to support the comparison between the applied metaheuristic optimization algorithms, their convergence curves are shown in figure 2. the optimization for all algorithms was carried out with a maximum number of iterations equal to 150 and a search agent parameter set to 10, with 2 dimensions for the formulated problem (location and sizing). among all the applied optimization algorithms, the sma was clearly the superior one. it showed good efficiency and performance in reaching the best mof minimization, down to 2.088, 1.971 and 1.894 for the optimal allocation in the rdn of one, two and three pvdg units, respectively, with a quick convergence characteristic in all cases of pvdg integration: the sma converged after 75 iterations.

tables 1 and 2 give the allocation for all the studied cases of optimal integration and the results obtained after that integration into the rdn for all the applied algorithms. from these tables, it is clear that the sma provided the best simultaneous minimization of the three technical indices, represented by the mof, with values of 2.088, 1.971 and 1.894 for the three cases of one, two and three pvdg units integrated into the rdn, respectively. the rest of the applied algorithms also delivered good results for individual parameters. for example, the woa provided the minimum tapl for the case of three pvdg units, 80.391 kw, and the goa provided the minimum tot for the case of two pvdg units, 20.286 seconds.

fig. 2 convergence curves of applied optimization algorithms: a) one pvdg, b) two pvdg, c) three pvdg.
table 1 optimal allocation of the studied cases of pvdg units.

algorithm   pvdg number   pvdg buses      ppvdg (mw)
woa         1 pvdg        6               2.890
            2 pvdg        12 – 28         1.004 – 1.510
            3 pvdg        12 – 24 – 28    1.069 – 1.035 – 1.349
alo         1 pvdg        26              2.865
            2 pvdg        12 – 28         1.071 – 1.590
            3 pvdg        5 – 11 – 29     1.500 – 0.937 – 1.062
goa         1 pvdg        26              2.866
            2 pvdg        12 – 28         0.991 – 1.7597
            3 pvdg        5 – 10 – 29     1.5000 – 0.9337 – 1.083
ssa         1 pvdg        27              2.870
            2 pvdg        12 – 27         1.011 – 1.800
            3 pvdg        5 – 12 – 28     1.204 – 0.923 – 1.287
sma         1 pvdg        27              2.877
            2 pvdg        12 – 27         0.932 – 2.000
            3 pvdg        5 – 13 – 30     1.557 – 0.787 – 1.077

table 2 optimized parameters with pvdg integration.

algorithm    pvdg number   tapl (kw)   tvd (kv)   tot (sec)   mof
basic case   --            210.987     22.939     20.574      --
woa          1 pvdg        112.251     13.077     20.363      2.109
             2 pvdg        91.622      12.356     20.303      1.988
             3 pvdg        80.391      12.077     20.281      1.920
alo          1 pvdg        115.490     13.039     20.359      2.104
             2 pvdg        94.170      12.444     20.288      1.974
             3 pvdg        87.094      11.887     20.279      1.963
goa          1 pvdg        115.491     13.027     20.359      2.103
             2 pvdg        95.870      12.394     20.286      1.994
             3 pvdg        87.644      11.925     20.281      1.919
ssa          1 pvdg        121.030     12.963     20.354      2.108
             2 pvdg        95.660      12.267     20.292      1.979
             3 pvdg        89.775      11.887     20.278      1.928
sma          1 pvdg        121.181     11.887     20.353      2.088
             2 pvdg        96.865      12.178     20.288      1.971
             3 pvdg        83.780      11.761     20.279      1.894

figure 3 illustrates the bus voltage profiles obtained after applying the sma for all the studied cases of pvdg integration into the rdn. the curves in figure 3 confirm the efficiency and reliability of the sma in providing the best results, which improved the voltage profiles after integrating all cases of pvdg units into the rdn. this enhancement is associated with the minimization of the voltage deviation, which indicates how far the rdn's voltage is from the nominal voltage value of 12.66 kv.
besides, much better results are obtained in the case of three pvdg units, due to the optimal allocation of the multiple pvdg units with sizes of 1.5570 mw, 0.7878 mw and 1.0778 mw at buses 5, 13 and 30 of the rdn, respectively.

fig. 3 bus voltage profiles for all cases studied.

figure 4 shows the bus voltage deviation after applying the sma for all the studied cases of pvdg integration into the rdn test system.

fig. 4 bus voltage deviation for all cases studied.

from figure 4, applying the sma to the rdn test system reduces the total bus voltage deviation from 22.939 kv to 11.887 kv, 12.178 kv and 11.761 kv for the integration of one, two and three pvdg units in the rdn, respectively. the best results are obtained for the case of three pvdg units, taking into consideration a voltage deviation limit of 5 % (0.633 kv). the minimization of the bus voltage deviation consequently led to the improvement of the voltage profiles at all buses of the test system, as shown previously in figure 3, given that the voltage deviation is defined as the nominal voltage value of 12.66 kv minus the bus voltage value.

figure 5 represents the branch power loss obtained by applying the sma for all the studied cases.

fig. 5 branch power loss for all cases studied.

the second parameter of the mof minimized using the sma is the active power loss, which is clearly reduced in all the system branches after the optimal integration of all the studied cases of pvdg units. besides, the tapl was clearly reduced from the total value at the basic case of 210.980 kw to 121.181 kw, 96.856 kw and 83.780 kw for the studied cases of one, two, and three pvdg units, respectively, with the best results obtained for the case of three pvdg units integrated into the rdn.
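the relay operation time entering the tot is governed by eqs. (10) and (11), with a = 0.14 and b = 0.02. the short python sketch below evaluates this characteristic to show how the operation time shortens when the fault current rises. the function name and the numerical settings (time dial setting, pickup current and fault currents) are hypothetical values chosen for illustration only; they are not taken from the ieee 33-bus study.

```python
def relay_operating_time(tds, fault_current, pickup_current, a=0.14, b=0.02):
    """Operation time T = TDS * A / (M**B - 1), with M = I_F / I_P, eqs. (10)-(11)."""
    m = fault_current / pickup_current  # multiple of pickup current
    if m <= 1.0:
        raise ValueError("the relay does not operate for M <= 1")
    return tds * a / (m ** b - 1.0)


if __name__ == "__main__":
    tds = 0.1         # hypothetical time dial setting
    pickup = 100.0    # hypothetical pickup current in amperes
    for fault in (1000.0, 1200.0, 1500.0):  # hypothetical fault currents in amperes
        print(fault, round(relay_operating_time(tds, fault, pickup), 4))
```

for a fault current ten times the pickup current and a time dial setting of 0.1, the characteristic gives an operation time of about 0.297 s, and the time decreases monotonically for larger multiples of the pickup current.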
figure 6 shows the ocr's operation time after applying the sma for all cases.

fig. 6 overcurrent relay operation time for all cases studied.

the last mof parameter minimized using the sma, based on the optimal allocation of all the studied cases, is the ocr's operation time. it is clearly reduced in almost all the primary ocrs, and the total value drops from 20.570 seconds at the basic case to 20.353 seconds, 20.288 seconds, and 20.279 seconds for one, two and three pvdg units, respectively, with the best results and the clearest impact after the optimal integration of three pvdg units. because the operation time of the ocrs decreases as the level of fault current in the system's lines increases, as expressed by equations (10) and (11), this minimization is a result of the improvement of the voltage profiles, which raised the level of the fault current. the minimization of the tot provides wide economic and technical benefits to the studied system.

figure 7 gives a graphical comparison of the power losses (active and reactive), together with the minimum bus voltage value, after pvdg integration into the rdn test system.

fig. 7 comparison of the power losses and minimum bus voltage.

the analysis of figure 7 reveals that the minimum voltage value of the rdn kept increasing while the active and reactive power losses were being minimized after the optimal integration of all the studied cases of pvdg units into the rdn. it is also noticed that the best results in terms of raising vmin and minimizing the total active and total reactive power losses were achieved when optimally integrating three pvdg units into the rdn.
injecting three sizes of active power at three different locations of the rdn test system is the reason why the best results were achieved, with the tapl and trpl reduced to 83.780 kw and 57.570 kvar, respectively, and a better minimum voltage value of 12.402 kv.

5. conclusion

in this paper, a comparative study was carried out between recent optimization algorithms to solve the problem of identifying the optimal allocation of all the pvdg units' cases in the radial distribution network by simultaneously minimizing the technical parameters tvd, tapl and tot of the overcurrent relays. the results proved the robustness and efficiency of the sma approach, which delivered the best minimization of the mof with quick convergence characteristics. the rest of the algorithms also showed good effectiveness and provided suitable results, but only in terms of each parameter on its own. in addition, when comparing the results provided by the sma, the case of three pvdg units was the best choice, which led to simultaneously reducing the active power losses, improving the system voltage profiles and enhancing the protection system against fault current. based on the previous discussion, future work will focus on testing other recent optimization algorithms to solve a more complex mof that gathers various technical and economic indices.

references [1] r. o. bawazir and n. s. cetin, "comprehensive overview of optimizing pv-dg allocation in power system and solar energy resource potential assessments", energy rep., vol. 6, pp. 173–208, nov. 2020. [2] p. niveditha and m. s. sujatha, "optimal allocation, and sizing of dg in radial distribution system a review", int. j. grid and distrib. comput., vol. 11, pp. 49–58, may 2018. [3] h. a. pesaran, m. p. d. huy and v. k.
ramachandaramurthy, "a review of the optimal allocation of distributed generation: objectives, constraints, methods, and algorithms", renew. sust. energ. rev., vol. 75, pp. 293–312, aug. 2017. [4] a. r. jordehi, "allocation of distributed generation units in electric power systems: a review", renew. sust. energ. rev., vol. 56, pp. 893–905, april 2016. [5] m. h. ali, m. mehanna and e. othman, "optimal planning of rdgs in electrical distribution networks using hybrid sapso algorithm", int. j. electr. comput. eng., vol. 10, no. 6, pp. 6153–6163, june 2020. [6] s. ravindran and t. a. a. victoire, "a bio-geography-based algorithm for optimal siting and sizing of distributed generators with an effective power factor model", comput. electr. eng., vol. 72, pp. 482–501, nov. 2018. [7] s. settoul, r. chenni, h. a. hassan, m. zellagui and m. n. kraimia, "mfo algorithm for optimal location and sizing of multiple photovoltaic distributed generations units for loss reduction in distribution systems", in proceedings of the 7th international renewable and sustainable energy conference (irsec), agadir, morocco, 27-30 november 2019, pp. 1–6. [8] a. lasmari, m. zellagui, r. chenni, s. semaoui, c. z. el-bayeh and h. a. hassan, "optimal energy management system for distribution systems using simultaneous integration of pv-based dg and dstatcom units", energetika., vol. 66, no. 1, pp. 1–14, aug. 2020. [9] n. belbachir, m. zellagui, a. lasmari, c.z. el-bayeh and b. bekkouche, "optimal pv sources integration in distribution system and its impacts on overcurrent relay-based time-current-voltage tripping characteristics", in proceedings of the 12th international symposium on advanced topics in electrical engineering (atee), bucharest, romania, 25-27 march 2021, pp. 1–7. [10] g. deb, k. chakraborty and s. deb, "modified spider monkey optimization-based optimal placement of distributed generators in radial distribution system for voltage security improvement", electr. power compon. 
syst., vol. 48, no. 10, pp. 1006–1020, oct. 2020. [11] d. manna and s. k. goswami, "optimum placement of distributed generation considering economics as well as operational issues", int. trans. electr. energy syst., vol. 30, no. 3, e12246, jan. 2020. [12] p. khetrapal, j. pathan, and s. shrivastava, "power loss minimization in radial distribution systems with simultaneous placement and sizing of different types of distribution generation units using improved artificial bee colony algorithm", int. j. electr. eng. inform., vol. 12, no. 3, pp. 686–707, sept. 2020 [13] n. belbachir, m. zellagui, s. settoul and c. z. el-bayeh, "multi-objective optimal renewable distributed generator integration in distribution systems using grasshopper optimization algorithm considering overcurrent relay indices", in proceedings of the 9th international conference on modern power systems (mps), clujnapoca, romania, 16-17 june 2021, pp. 1–6. [14] v. s. n. murty and a. kumar, "optimal dg integration and network reconfiguration in microgrid system with realistic time varying load model using hybrid optimization", iet smart grid., vol. 2, no. 2, pp. 192–202, june 2019. [15] g. w. chang and n. c. chinh, "coyote optimization algorithm-based approach for strategic planning of photovoltaic distributed generation", ieee access, vol. 8, pp. 36180–36190, feb. 2020. 242 n. belbachir, m. zellagui, b. bekkouche [16] s. sultana and p. k. roy, "krill herd algorithm for optimal location of distributed generator in radial distribution system", appl. soft comput., vol. 40, pp. 391–404, march 2016. [17] u. raut and s. mishra, "an improved sine–cosine algorithm for simultaneous network reconfiguration and dg allocation in power distribution systems", appl. soft comput., vol. 92, p. 106293, july 2020. [18] s. settoul, m. zellagui and r. chenni, "a new optimization algorithm for optimal wind turbine location problem in constantine city electric distribution network based active power loss reduction", j. 
optim. ind. eng., vol. 14, no. 2, pp. 13–22, june 2021. [19] m. rizwan, l. hong, w. muhammad, s. w. azeem and y. li, "hybrid harris hawks optimizer for integration of renewable energy sources considering stochastic behavior of energy sources", int. trans. electr. energy syst., vol. 31, no. 2, e12694, jan. 2021. [20] a. m. shaheen, a. m. elsayed, r. a. el-sehiemy and a. y. abdelaziz, "equilibrium optimization algorithm for network reconfiguration and distributed generation allocation in power systems", appl. soft comput., vol. 98, p. 106867, jan. 2021. [21] n. belbachir, m. zellagui, s. settoul, c. z. el-bayeh and b. bekkouche, "simultaneous optimal integration of photovoltaic distributed generation and battery energy storage system in active distribution network using chaotic grey wolf optimization", electr. eng. electromechan., vol. 2021, no. 3, pp. 52–61, july 2021. [22] t. yuvaraj, k. r. devabalaji, n. prabaharan, h. h. alhelou, a. manju, p. pal and p. siano, "optimal integration of capacitor and distributed generation in distribution system considering load variation using bat optimization algorithm", energies., vol. 14, no. 12, p. 3548, june 2021. [23] r. palanisamy and s. k. muthusamy, "optimal siting, and sizing of multiple distributed generation units in radial distribution system using ant lion optimization algorithm", j. electr. eng. technol., vol. 16, no. 1, pp. 79–89, oct. 2020. [24] s. mirjalili and a. lewis, "the whale optimization algorithm", adv. eng. softw., vol. 95, pp. 51–67, may 2016. [25] s. mirjalili, "the ant lion optimizer", adv. eng. softw., vol. 83, pp. 80–98, may 2015. [26] s. saremi, s. mirjalili and a. lewis, "grasshopper optimization algorithm: theory and application", adv. eng. softw., vol. 105, pp. 30–47, march 2017. [27] s. mirjalili, a. h. gandomi, s. z. mirjalili, s. saremi, s. faris and s. m. mirjalili, "salp swarm algorithm: a bio-inspired optimizer for engineering design problems", adv. eng. softw., vol. 114, pp. 163–191, dec. 
2017. [28] s. li, h. chen, m. wang, a. a. heidari and s. mirjalili, "slime mould algorithm: a new method for stochastic optimization", future gener. comput. syst., vol. 11, pp. 300–323, oct. 2020. [29] n. belbachir, m. zellagui, a. lasmari, c. z. el-bayeh and b. bekkouche, "optimal integration of photovoltaic distributed generation in electrical distribution network using hybrid modified pso algorithms", indones. j. electr. eng. comput. sci., vol. 24, no. 1, pp. 50-60, oct. 2021. [30] m. zellagui, n. belbachir, and c. z. el-bayeh, "optimal allocation of rdg in distribution system considering the seasonal uncertainties of load demand and solar-wind generations systems", in proceedings of the 19th ieee international conference on smart technologies (eurocon), lviv, ukraine, 6-8 july 2021, pp. 471-477. instruction facta universitatis series: electronics and energetics vol. 28, n o 3, september 2015, pp. 309 323 doi: 10.2298/fuee1503309k two control-flow error recovery methods for multithreaded programs running on multi-core processors  navid khoshavi, hamid r. zarandi, mohammad maghsoudloo amirkabir university of technology (tehran polytechnic) abstract. this paper presents two control-flow error recovery techniques, cfe recovery using data-flow graph consideration and cfe recovery using macro block-level check pointing. these techniques are proposed with regards to thread interactions in the programs. these techniques try to moderate the high memory and performance overheads of conventional control-flow checking techniques. the proposed recovery techniques are composed of two phases of control-flow error detection and recovery. these phases are designed by means of inserting additional instructions into program at compile time considering dependency graph, extracted from control-flow and data-flow dependencies among basic blocks and thread interactions in the programs. 
in order to evaluate the proposed techniques, five multithreaded benchmarks are run on a multi-core processor. moreover, a total of 10000 transient faults have been injected into several executable points of each program. fault injection experiments show that the proposed techniques recover the detected errors in at least 91% of the cases.

key words: control-flow checking, control-flow error recovery, multi-threaded programs, multi-core processors.

received february 9, 2015
corresponding author: mohammad maghsoudloo, computer engineering and it department, amirkabir university of technology (tehran polytechnic), no. 424, hafez st., tehran, iran (e-mail: m.maghsoudloo@aut.ac.ir)

1. introduction

recently, multi-core processors have been introduced as a viable way to keep performance improvement rates within a given power budget [11]. multithreaded programming boosts the performance of multi-core processors by extracting thread-level parallelism from the sequential program flow. when a sequential program is parallelized conventionally, the programmer or compiler needs to ensure that threads are free of data dependences. if data dependences do exist, threads must be carefully synchronized to ensure that no violations occur. additionally, advances in cmos technology have reduced transistor sizes and voltage levels, which in turn has increased the sensitivity of microprocessors to transient faults. one of the major threats in modern microprocessors is transient faults induced by energetic particle strikes, such as high-energy neutrons from cosmic rays and alpha particles from decaying radioactive impurities in packaging and interconnect materials [13].
it has been shown that a considerable fraction of transient faults, between 33% and 77%, result in control-flow errors, such as errors in the program counter (pc), address circuits, and steering and control logic [12]. a control-flow error (cfe) is said to have occurred if the processor executes an incorrect sequence of instructions [1].

numerous software-based cfe detection techniques have been devised to detect processor errors [2], [3], [5], [6], [7], [8], [9], [14]. in these approaches, the program code is first partitioned into basic blocks, and extra instructions are then added to each basic block in order to verify the flow of code execution. a basic block is a maximal sequence of ordered non-branching instructions (except for the last instruction) [2]. a unique signature is assigned to each basic block at design time. signatures are also calculated at run-time and then compared with the original ones. if any mismatch is observed (by the added instructions), an error is detected and reported.

unfortunately, only a few published works have concentrated on cfe correction [4], [10]. after a cfe is detected, control should be transferred back to the block in which the illegal branch occurred. however, correcting the cfe alone is not sufficient, and the program may still fail because of data errors generated by the cfe [4]. therefore, any data errors caused by the cfe should be corrected after or during correction of the cfe as well.

error recovery techniques are classified into two broad categories: forward error recovery (fer) and backward error recovery (ber). fer techniques detect and correct errors without requiring roll-back to a previous correct state. the primary cost of fer schemes is the redundant hardware. ber techniques periodically save (checkpoint) the system state and roll back to the latest validated checkpoint when a fault is detected.
in multi-core systems, since all processors share a single view of data and of the communication between processors, a method that corrects cfes and data errors should take into account the synchronization and communication dependencies between the threads of a multithreaded program. furthermore, the high memory and performance overheads of these techniques can be problematic for real-time embedded systems, which have tight memory and performance budgets. therefore, given the importance of handling cfes, the unsuitability of conventional techniques for modern processors, and the high memory/performance overheads of previous cfe recovery techniques, a ber cfe recovery technique is proposed in this paper. while previous techniques use two sets of instructions, at the beginning and end of each basic block, the proposed cfe detection method uses only one set of checking instructions at the end of each basic block, and therefore has fewer checking instructions than the mentioned techniques. to correct cfes and data errors, our approach also uses a checkpoint-based method like the mcp technique [ref], but checkpoint instructions are added only to particular basic blocks, chosen according to the location of the basic blocks in the dependency graph and the acceptable latency for cfe recovery.

simulation-based fault injection is used to evaluate the recovery capability of the proposed technique. five modified multithreaded benchmarks are used, and the gnu debugger, gdb [15], is used to inject faults into the programs. the results show that the approaches presented in this paper can recover more than 91% of the detected errors, with about 67% performance overhead and 89% memory overhead.

the structure of this paper is as follows: section 2 introduces the dependency graph of a multithreaded program. section 3 introduces the control-flow error detection technique.
section 4 describes the proposed recovery techniques and the check-pointing they use. the simulation environment and experimental results are presented in section 5. finally, section 6 concludes the paper.

2. dependency graph in multithreaded program

a multithreaded program running on a multi-core system has a number of threads, each with its own control flow and data flow. these flows are not independent, since inter-thread synchronizations and communications may exist in the program. to represent a multithreaded program, we use a dependency graph. this graph is built by connecting the graphs of all single threads in the program, using dependency arcs between different threads.

2.1. single-threaded dependency graph

the single-threaded dependency graph consists of a number of control-flow graphs (cfgs) and data-flow graphs (dfgs). a cfg is a graph composed of a set of nodes v and a set of edges e, cfg={v,e}, where v={n1, n2, …, ni, …, nn} and e={e1, e2, …, ei, …, en}. each node ni represents a basic block and each edge ei represents a branch bri,j from ni to nj. as shown in fig. 1, cfgs and dfgs are extracted at compile time and represent the control conditions and data dependencies between basic blocks.

fig. 1 single thread dependency graph.

2.2. multi-threaded dependency graph

extracting the cfg from the relations among the basic blocks of a program is a prerequisite step in both software- and hardware-based cfc methods. any incorrectness or limitation in capturing the control dependencies among the nodes of the cfg means that the flow of a given program will not be followed precisely in the checking phase. the multithreaded program dependency graph consists of a collection of single-thread dependency graphs, each representing a single thread, and some special kinds of dependency arcs that model thread interactions.
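as an illustration, the graph structure described in this section could be represented as follows. this is a minimal python sketch, not the authors' implementation: the class name, the method names and the encoding of a node as a (thread, block) pair are assumptions made for the example, and the inter-thread arcs stand for the synchronization and communication dependencies between threads.

```python
from dataclasses import dataclass, field


@dataclass
class DependencyGraph:
    """Hypothetical container for a multithreaded dependency graph.

    A node is a (thread_id, block_id) pair. Control and data edges stay
    inside one thread; inter-thread arcs connect two different threads.
    """
    control: set = field(default_factory=set)  # CFG edges (branches)
    data: set = field(default_factory=set)     # DFG edges (data dependencies)
    inter: set = field(default_factory=set)    # inter-thread dependency arcs

    def add_control(self, thread, src, dst):
        self.control.add(((thread, src), (thread, dst)))

    def add_data(self, thread, src, dst):
        self.data.add(((thread, src), (thread, dst)))

    def add_inter(self, src, dst):
        if src[0] == dst[0]:
            raise ValueError("inter-thread arcs must connect two different threads")
        self.inter.add((src, dst))

    def is_legal_branch(self, src, dst):
        """True if the CFG contains a branch from src to dst."""
        return (src, dst) in self.control


# Two small threads with made-up block numbers.
g = DependencyGraph()
g.add_control(1, 1, 2)       # thread 1: n1 -> n2
g.add_control(1, 2, 3)       # thread 1: n2 -> n3
g.add_data(1, 1, 3)          # thread 1: n3 uses a value produced in n1
g.add_control(2, 1, 2)       # thread 2: n1 -> n2
g.add_inter((1, 2), (2, 1))  # e.g. thread 1 creates thread 2 in its block n2
```

keeping the three edge sets separate makes it easy to ask different questions of the same graph, for example whether a branch is legal (control edges only) or which threads interact (inter-thread arcs only).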
the inter-thread dependency arcs are based on: 1) synchronization between thread synchronization statements and 2) communication through shared variables of the program threads.

2.2.1. synchronization dependencies

multithreaded programs must be written carefully to ensure that threads do not interfere with each other. a section of code that modifies data structures shared by multiple threads is called a critical section. it is important that a critical section be accessed exclusively by one thread at a time. synchronized access ensures that only one thread can execute in a critical section at a time. synchronization dependencies among different threads may arise in two ways: create/join relations and lock/unlock relations. fig. 2 shows additional synchronization arcs that model synchronization between threads.

fig. 2 multithreaded program dependency graph.

2.2.2. communication dependencies

a communication dependency captures a dependency relation between different threads that results from inter-thread communication. if the value of a variable computed at node ni of one thread directly influences the value of a variable computed at node nj of another thread through inter-thread communication, there is a communication dependency between these threads. shared memory is often used to support communication among threads.

to construct the dependency graph of a multithreaded program, the single-thread dependency graphs are extracted first, and then the synchronization and communication dependencies between the different threads are added, as shown in fig. 2. in this figure, bold dotted arcs and bidirectional dashed arcs represent synchronization and communication dependencies, respectively.

3. control-flow error detection scheme

the cfe detection methods used in crdc (cfe recovery using data-flow graph consideration) and crmc (cfe recovery using macro block-level checkpointing) are quite similar; the differences between them, which appear in fig. 3, arise only from the different types of recovery applied. in multithreaded programs, cfes can be divided into three types: intra-node, inter-node/intra-thread and inter-thread. an intra-node cfe is an illegal movement within a basic block (cfe2 in fig. 4), and an inter-node/intra-thread cfe is an illegal movement between two blocks of a thread (cfe1 in fig. 4). an inter-thread cfe is an illegal jump from a basic block of one thread to a basic block of another thread on the same processor (cfe3 in fig. 4). our cfe detection approach is designed to detect inter-node/intra-thread and inter-thread cfes, but it is not able to detect intra-node cfes.

fig. 3 illustration of added instructions for methods: (a) crdc, (b) crmc.

fig. 4 illustration of cfe types in cfg scheme.

3.1. intra-thread/inter-node cfe detection

after the control dependencies among the basic blocks of the program are determined, each node of the dependency graph is labeled with a unique signature. the sequence of these signatures is checked at run-time by the instructions added at the end of each basic block. the checking instructions compare the value of the run-time signature with the pre-defined value assigned to each block at compile time. the run-time signature is updated after the checking instructions confirm correct execution. fig. 3 shows the instructions added to the basic blocks by each method. if an illegal jump occurs before the added instructions at the end of a basic block, so that control is transferred to the block illegally, the cfe can be detected by comparing the value stored in ssj (the signature of the node) with the one calculated at compile time. if they are not equal, the cfe is detected and the recovery function is called.
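the end-of-block check described above can be sketched in python as follows. this is a simplified model rather than the instruction sequence of fig. 3: the signature values, the block names and the table of legal predecessors are hypothetical, and the rule "the run-time signature must equal the signature of a legal predecessor" is an assumption chosen to make the example concrete.

```python
class ControlFlowError(Exception):
    """Raised when the signature check at the end of a basic block fails."""

    def __init__(self, source_sig, dest_sig):
        super().__init__(f"CFE: source={source_sig:#x}, destination={dest_sig:#x}")
        self.source_sig = source_sig  # last block that passed its check
        self.dest_sig = dest_sig      # block that was reached illegally


# Hypothetical compile-time data for one thread with three basic blocks.
SIGNATURES = {"entry": 0x01, "B1": 0x11, "B2": 0x22, "B3": 0x33}
LEGAL_PREDECESSORS = {"B1": {"entry"}, "B2": {"B1"}, "B3": {"B2"}}


class ThreadMonitor:
    """Models the run-time signature variable of one thread."""

    def __init__(self):
        self.ss = SIGNATURES["entry"]

    def end_of_block(self, block):
        """Check first, then update the run-time signature."""
        expected = {SIGNATURES[p] for p in LEGAL_PREDECESSORS[block]}
        if self.ss not in expected:
            raise ControlFlowError(self.ss, SIGNATURES[block])
        self.ss = SIGNATURES[block]


# Legal execution: B1 -> B2 -> B3 passes every check.
monitor = ThreadMonitor()
for block in ("B1", "B2", "B3"):
    monitor.end_of_block(block)

# Illegal jump from B1 directly to B3 is caught at the end of B3.
faulty = ThreadMonitor()
faulty.end_of_block("B1")
try:
    faulty.end_of_block("B3")
    detected = None
except ControlFlowError as err:
    detected = (err.source_sig, err.dest_sig)
```

in this model the exception plays the role of the call to the recovery function, and its two fields correspond to the source and destination signatures that the handler receives.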
the source signature of thread j (ssj) is a shared variable of thread j that is continuously updated in the executed nodes (j denotes the thread number in the multithreaded program). when a cfe occurs, ssj holds the signature of the basic block in which the cfe occurred. the destination signature of thread j (dsj) is a shared variable of thread j that is also continuously updated; when a cfe occurs, it holds the signature of the basic block to which control was incorrectly transferred. shadow-variable update instructions are placed in some basic blocks according to an algorithm explained in the section on the proposed crmc technique. additionally, depending on the program, some basic blocks contain interaction instructions such as pthread_create/pthread_join, which legally direct the flow of the program to another thread. consequently, if an illegal branch jumps to a block that includes an interaction instruction, the error cannot be detected before the thread interaction takes place. these instructions are therefore placed after the dsj update and checking instructions, to prevent thread interaction before cfe detection. both the source and destination signatures are used by the cfe_handler function of both proposed techniques to recover from cfes and data errors.

3.2. inter-thread cfe detection

each thread of a multithreaded program has its own signature identifier, to avoid interference between threads in the updating and checking phases. the signature of thread j may be updated only in thread j, and any illegal signature update in thread j is considered a cfe. as illustrated in fig. 5, an inter-thread cfe occurs from n2 of thread 1 to n2 of thread 2 before the signature of thread 2 has been updated at the end of n1. this cfe can be detected by comparing the last updated ss of the destination thread with the expected value at the end of n2 in thread 2.

4. automatic recovery phase

in the previous sections, some problems of prior recovery methods were described.
moreover, in critical applications, recovery methods that concentrate only on cfes are not applicable, so data errors should also be considered and recovered. techniques for recovering data errors by duplicating instructions are presented in [2], [10], [11], [12]. however, this type of data error recovery has high overhead because of the duplication and comparison. in the rest of this section, the proposed recovery techniques are explained.

4.1. the proposed crdc technique

when a cfe is detected by the added instructions, control is transferred to the cfe_handler function. this function is implemented at design time by considering the dfg and cfg of the program. the signatures of the source and destination basic blocks are given to the cfe_handler function as inputs. the function relocates control to the nearest block from which re-executing the program corrects the cfe, so that all of the affected variables between source and destination are re-initialized. fig. 6 (a) shows three basic blocks from a thread of a program, together with the dfg extracted from the data dependencies among the variables in these basic blocks. fig. 6 (b) illustrates the correction process used by the proposed technique. if cfe1 occurs in basic block 2 and control is transferred from basic block 2 to basic block 3 (step 1 in fig. 6 (b)), the values stored in variables x and z are no longer reliable, because of the problems explained previously. in this example, the source basic block is basic block 2, the destination is basic block 3, and the variables modified by the cfe (x and z) are initialized in basic block 1 and basic block 2. to recover from the cfe and the data errors, control should be transferred to basic block 1 (step 3 in fig. 6 (b)).
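the choice of the rollback block in this example can be written as a small calculation. the python sketch below is illustrative only: it assumes that basic blocks are numbered in program order, so that the smallest number is the earliest block, and the function name and the table name are hypothetical. the table records in which block each variable is first initialized (z in basic block 1, x in basic block 2).

```python
# Block in which each variable is first initialized, for the cfe1 example.
FIRST_INIT_BLOCK = {"x": 2, "z": 1}


def rollback_block(affected_vars, first_init=FIRST_INIT_BLOCK):
    """Return the earliest block that initializes any affected variable.

    Re-executing from this block re-initializes every affected variable,
    assuming blocks are numbered in program order.
    """
    return min(first_init[v] for v in affected_vars)


target = rollback_block({"x", "z"})  # cfe1 corrupts x and z -> block 1
```

if only x had been affected, the same rule would return basic block 2, which avoids re-executing basic block 1 unnecessarily.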
therefore, the modified variables are re-initialized and their corresponding computations are re-executed after this transfer. by re-executing the code from basic block 1, the initial value stored in variable z is re-loaded. then, after basic block 1 completes, the initial value of x is re-loaded in basic block 2.

fig. 5 inter-thread cfe detection.

fig. 6 (a): cfg and dfg generated from program code, (b): scheme of crdc methods.

another example is cfe2, for which the source basic block is basic block 3 and the destination is basic block 1. the variables affected by this cfe are x, y, and z. x is initialized in basic block 2, and y and z are initialized in basic block 1. hence, returning to basic block 1 re-loads the initial values of the variables and re-executes the computations in which they are used.

in multithreaded programs, since threads act on each other, recovering one thread after a cfe does not mean that the whole program is recovered. in many cases, several threads must roll back to specific locations to provide consistency and correct execution during re-execution. the threads created in our benchmarks are entirely independent functions, so there is no need to roll back several threads to a previous state except when an inter-thread cfe occurs. in that case, the corrupted threads are identified and the rollback is performed based on the relations among the slave threads and the main thread.

furthermore, to detect and correct illegal jumps to unused space (the partition block), this technique fills the partition block with branch instructions to the cfe_handler function. zero (null) is reserved as the destination signature value for the partition block, to distinguish it from the other blocks in the program code.
if an illegal jump to the partition block occurs, the cfe_handler function ignores the destination, because the partition block contains no computation related to the program.

4.1.1. the proposed crdc cfe_handler function

fig. 7 (a) shows the scheme of the cfe_handler function defined for a program with three threads in the crdc technique. determining the type of cfe (intra-thread or inter-thread) is the first step after control is transferred to the cfe_handler function. as shown in the condition 1 code, if the ssj and dsj of two different threads are not equal, the cfe is of the inter-thread type. in this case, it must also be determined whether only slave threads have been corrupted or the main thread has been corrupted as well. if the main thread has been corrupted, the program is restarted; if only slave threads have been corrupted, the program can resume from the thread creation instructions in the main thread. as shown in fig. 7 (b), to recover from intra-thread cfes, the crdc function first determines the corrupted thread by comparing the ssj and dsj of each thread. next, it identifies the source basic block by comparing the value stored in ssj with the signatures assigned to each basic block at design time. control is then transferred to a subsection that is defined separately for each source basic block. in these subsections, the destination basic block is determined in the same way as the source, and finally control is transferred to the basic block in which the affected register is first initialized. this transition can be performed by conditional branches to the first instruction of the basic blocks. when an illegal jump into the statements of the cfe_handler function itself occurs, the function can give control back to the source basic block by executing the first subsection.
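the intra-thread part of this dispatch can be pictured as a two-level lookup: first on the source signature, then on the destination signature. the python sketch below is a loose illustration under stated assumptions, not the handler of fig. 7: the signature values and table names are hypothetical, only the two (source, destination) pairs from the cfe1 and cfe2 examples are filled in, and the inter-thread case and the partition block are left out.

```python
# Hypothetical signatures assigned to basic blocks 1..3 at design time.
SIGNATURE_TO_BLOCK = {0x11: 1, 0x22: 2, 0x33: 3}

# One "subsection" per source block; each maps a destination block to the
# block from which execution should resume. The two entries correspond to
# the cfe1 (2 -> 3) and cfe2 (3 -> 1) examples, which both resume at block 1.
RECOVERY_SUBSECTIONS = {
    2: {3: 1},
    3: {1: 1},
}


def cfe_handler(ss, ds):
    """Return the block to branch to for an intra-thread CFE."""
    source = SIGNATURE_TO_BLOCK[ss]            # identify the source block from ss
    subsection = RECOVERY_SUBSECTIONS[source]  # enter that block's subsection
    destination = SIGNATURE_TO_BLOCK[ds]       # identify the destination from ds
    return subsection[destination]             # target of the final branch


resume_cfe1 = cfe_handler(0x22, 0x33)
resume_cfe2 = cfe_handler(0x33, 0x11)
```

a real handler would perform these lookups with chains of compare-and-branch instructions rather than dictionaries; the nested table only makes the order of the decisions explicit.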
the last lines of the subsections (jump instructions to the first line of the function) are defined to correct illegal jumps into the cfe_handler function.

fig. 7 crdc scheme ((a): sample flowchart, (b): cfe-handler function).

4.1.2. optimization of the crdc cfe_handler function

in some applications, low memory overhead may be more important than other concerns because of the area-efficiency of the design. therefore, the structure of this function should be optimized under the memory constraint. as shown in fig. 8, the targets of the final branches, used in the last phase of the recovery, are specified by taking into account the pair of source and destination basic blocks. to reduce the number of instructions in the function, the phase that determines the destination basic block can be omitted. for example, in fig. 8, if the source basic block of an illegal jump is basic block 1, then the target of the final branch is also basic block 1, independent of the destination basic block. in the other cases (for example, the fourth line of the matrix in fig. 8), choosing the topmost basic block (the one nearest to the beginning of the program) from the set of targets for a given source basic block allows one phase of the recovery process (checking the value of dsj to determine the destination) to be omitted. these optimizations in the structure of the function lead to effective reductions in the memory overhead of the proposed technique.

fig. 8 schematic of an algorithm for reducing the memory overhead of the cfe-handler function.

4.2. the proposed crmc technique

in the first section, some problems of checkpoint-based methods (the potential for high overhead and high latency) were explained. in the crmc technique, the program takes checkpoints at certain points during code execution.
at these points, the values stored in variables (such as registers and memory blocks) are saved in shadow variables. then, to recover from cfes and data errors, the program re-executes from the last trusted checkpoint, after loading the correct values stored in the shadow variables into the original ones. in crmc, the shadow variables always contain the correct values of the original ones. if the shadow variables were updated at the end of every basic block in which the corresponding original variables are modified, noticeable performance and memory overheads would be imposed on the system. on the other hand, since thread interaction instructions such as synchronization or communication change variables in different threads, these modifications must be considered in the proposed recovery technique. therefore, the shadow variables are divided into two kinds:

• local shadows: local shadows are used to accelerate the recovery process while the source and destination basic blocks of the cfe are from two different threads. the contents of the local shadow are chosen by the application programmer according to the information provided by the dfg of the program; there is no need to save all system variables or any other information related to the hardware or the operating system. to reduce the overheads imposed by shadow variables, we define, for each thread, a macro basic block as a run of consecutively executed basic blocks that are free of thread interaction instructions. the local shadow variables are updated at the end of each macro basic block; then a snapshot of the thread state is taken. if the macro basic block is placed within a loop, a variable is used to track the iteration number and to trigger the shadow variable update at acceptable iterations.
with regard to the points mentioned above, the pace of executing thread interaction instructions can be introduced as the macro basic block size:

macro basic block size = (1)

assume a program consists of 24 basic blocks, 3 of which include thread interaction instructions. the macro basic block size will then be approximately equal to 3 according to equation 1. fig. 9 shows the scheme of this macro basic block. the shadow variables are updated at the end of the third basic block, as illustrated in fig. 9. this optimization directly leads to a noticeable reduction in the overheads of our method compared to checkpoint-based methods.

fig. 9 illustration of the shadow variables updating location.

• global shadows: the places where the global shadows are updated should correspond to a consistent state of the application. we consider the synchronization/communication points of the application, such as the beginning and end of create/join and lock/unlock relations, to be natural consistent global states. a miniaturized snapshot of the entire system is saved in the global shadows and is used when global consistency is needed.

4.2.1. the proposed crmc cfe_handler function

fig. 10 (a) illustrates when the local and global shadows are updated from the original variables. as shown in fig. 10 (b), suppose that variables y and z are initialized in basic block 1 and variable x is initialized in basic block 2, and that the values of y, z and x are changed in different basic blocks of the macro basic block. the local shadows of all modified variables are then updated only at the end of the macro basic block (instead of at the end of every basic block). after the detection phase, control is transferred to the cfe_handler function (step 2 in fig. 10 (c)). at this time, the signatures of the source and destination basic blocks are already available in ssj and dsj, respectively.
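the local shadow mechanism described in this section (update the shadows only at the end of a macro basic block, and reload them before re-execution) can be sketched as follows. this python model is a simplification under stated assumptions: the class and method names are hypothetical, a macro basic block is modeled as a fixed number of consecutive blocks, and thread interaction instructions and global shadows are not modeled.

```python
import copy


class MacroBlockCheckpointer:
    """Keeps a local shadow copy of a thread's variables."""

    def __init__(self, variables, macro_block_size):
        self.variables = variables              # original (live) variables
        self.size = macro_block_size            # basic blocks per macro block
        self.shadow = copy.deepcopy(variables)  # local shadow variables
        self.resume_block = 0                   # first block after last checkpoint

    def end_of_block(self, block_index):
        """Update the shadows only when a macro basic block ends."""
        if (block_index + 1) % self.size == 0:
            self.shadow = copy.deepcopy(self.variables)
            self.resume_block = block_index + 1

    def recover(self):
        """Reload originals from the shadows; return the block to resume at."""
        self.variables.clear()
        self.variables.update(copy.deepcopy(self.shadow))
        return self.resume_block


# Blocks 0..2 form one macro basic block of size 3.
state = {"x": 0}
checkpointer = MacroBlockCheckpointer(state, macro_block_size=3)
for block in range(3):
    state["x"] += 1                   # normal computation in each block
    checkpointer.end_of_block(block)  # shadow updated after block 2 only

state["x"] = 99                       # corruption caused by a CFE in block 3
resume_at = checkpointer.recover()    # restores x and resumes at block 3
```

with a macro block size of 3, only one shadow update is performed for three basic blocks, which is the source of the overhead reduction compared with updating the shadows at the end of every block.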
the ssj and dsj values are given to the cfe_handler function as inputs. as shown in the condition code in fig. 11 (a), if the ssj and dsj of two different threads are not equal, the cfe is of the inter-thread type. in this case, the program state is restored from the global shadows and execution resumes from that point. otherwise, the cfe is of the intra-thread type, and the function restores the affected original variables in the source and destination from their shadow copies, as demonstrated in fig. 11 (b). finally, control is transferred to the address of the basic block that follows the basic block containing the local shadow variable update (step 3 in fig. 10 (c)), and the code is re-executed from this point. consequently, both the cfe and the resulting data errors can be corrected.

fig. 10 cfg and dfg generated from program code ((a): local and global shadows places, (b): local shadow updating, (c): scheme of crmc).

fig. 11 crmc scheme ((a): sample flowchart, (b): cfe-handler function).

5. experimental results

in order to evaluate the proposed techniques, five multithreaded benchmarks, quick sort (qs), matrix multiplication (mm), bubble sort (bs), linked list (ll) and fast fourier transform (ff), are run on a multi-core processor, and a total of 5000 transient faults are injected into several executable points of each program. branch deletion, branch insertion and branch target modification are used as the fault models. table 1 compares the associated overheads and the error recovery coverage of the different methods. the memory and performance overheads of the proposed techniques are lower than those of previous works ([10], [4]). the memory/performance overhead of acced is comparatively higher than that of the proposed techniques because it adds duplicated instructions and executes a set of instructions that compare the results to obtain the correct output.
moreover, the memory and performance overheads of the proposed techniques increase slightly as the number of running threads of the programs increases. this is due to the use of different checkpoint levels and the macro block concept in crmc, and the use of fewer checking instructions in the cfe detection phase in crdc.

table 1 comparison of the memory and performance overheads and error recovery coverage (m.o., p.o. and e.c. in %)

| category | benchmark | acced [10] m.o. a | acced [10] p.o. b | acced [10] e.c. c | cdcc [4] m.o. a | cdcc [4] p.o. b | cdcc [4] e.c. c | mcp [4] m.o. a | mcp [4] p.o. b | mcp [4] e.c. c | crdc m.o. a | crdc p.o. b | crdc e.c. c | crmc m.b.s. d | crmc m.o. a | crmc p.o. b | crmc e.c. c |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| dual threaded programs | qs | 222.6 | 112.3 | 86.5 | 86.5 | 71.4 | 84.3 | 178.3 | 92.4 | 81.1 | 72.6 | 51.2 | 94.0 | 4 | 84.5 | 62.3 | 93.2 |
| dual threaded programs | mm | 219.2 | 101.0 | 88.3 | 75.2 | 57.7 | 89.8 | 144.5 | 70.2 | 85.8 | 55.9 | 48.0 | 94.3 | 5 | 67.8 | 59.1 | 94.0 |
| dual threaded programs | bs | 226.5 | 108.4 | 84.3 | 88.6 | 70.0 | 84.0 | 182.5 | 88.4 | 83.4 | 74.3 | 50.2 | 91.9 | 4 | 86.4 | 61.3 | 91.1 |
| dual threaded programs | ll | 228.0 | 104.4 | 81.9 | 91.4 | 69.1 | 83.5 | 184.6 | 83.1 | 80.9 | 75.5 | 49.9 | 92.2 | 4 | 87.6 | 60.7 | 92.8 |
| dual threaded programs | ff | 195.3 | 97.8 | 88.0 | 71.3 | 55.2 | 88.5 | 135.7 | 68.0 | 87.9 | 54.3 | 44.3 | 92.9 | 4 | 66.1 | 55.7 | 92.0 |
| dual threaded programs | avg. | 218.3 | 104.7 | 85.8 | 82.6 | 64.6 | 86.0 | 165.1 | 80.4 | 83.8 | 66.5 | 48.7 | 93.0 | 4 | 78.4 | 59.8 | 92.6 |
| quad threaded programs | qs | 232.0 | 119.2 | 85.3 | 104.1 | 88.6 | 83.1 | 189.0 | 108.6 | 80.8 | 88.3 | 60.6 | 93.1 | 3 | 99.1 | 71.9 | 92.6 |
| quad threaded programs | mm | 231.7 | 117.5 | 87.6 | 91.5 | 63.3 | 88.3 | 162.0 | 86.8 | 83.3 | 67.2 | 51.5 | 93.2 | 4 | 78.7 | 62.5 | 92.6 |
| quad threaded programs | bs | 238.1 | 115.9 | 82.0 | 107.8 | 79.7 | 83.4 | 193.6 | 102.3 | 81.0 | 85.1 | 59.1 | 90.1 | 3 | 96.4 | 70.7 | 89.3 |
| quad threaded programs | ll | 242.6 | 113.8 | 80.1 | 110.7 | 76.5 | 81.7 | 196.2 | 96.4 | 79.0 | 83.7 | 58.7 | 91.3 | 3 | 94.2 | 69.1 | 91.9 |
| quad threaded programs | ff | 217.5 | 102.5 | 86.7 | 89.0 | 61.4 | 86.3 | 157.9 | 82.7 | 84.2 | 64.6 | 47.2 | 92.1 | 3 | 75.4 | 58.8 | 90.0 |
| quad threaded programs | avg. | 232.3 | 113.7 | 84.3 | 100.6 | 73.9 | 84.5 | 179.7 | 95.3 | 81.6 | 77.7 | 55.4 | 91.9 | 3 | 88.7 | 66.6 | 91.2 |

a. memory overhead b. performance overhead c. error recovery coverage d. macro block size

6. conclusions

in this paper, two software techniques to detect and correct cfes in multithreaded programs are proposed.
these techniques are implemented by considering control and data dependencies in the dependency graph, in addition to synchronization and communication dependencies, at compile time. the proposed techniques also correct data errors generated by cfes, which can cause considerable corruption in the system. fault injection experiments showed that the proposed techniques, when applied to the programs, produce correct results in over 91.2% of the cases. the latency and the additional memory required for correcting the cfes and the data errors are considerably lower than those of recently published duplication-based and checkpoint-based methods.

references

[1] m. fazeli, r. farivar and s. g. miremadi, "error detection enhancement in powerpc architecture-based embedded processors", journal of electronic testing: theory and applications, vol. 24, pp. 21-33, 2008.
[2] n. oh, p. shirvani and e. j. mccluskey, "control-flow checking by software signatures", ieee transactions on reliability, vol. 51, no. 2, pp. 111-122, 2002.
[3] o. goloubeva, m. rebaudengo, m. r. sonza and m. violante, "soft-error detection using control flow assertion", in proceedings of the 18th ieee international symposium on defect and fault tolerance in vlsi systems, 2003, pp. 57-62.
[4] h. r. zarandi, m. maghsoudloo and n. khoshavi, "two efficient software techniques to detect and correct control-flow errors", in proceedings of the 16th ieee pacific rim international symposium on dependable computing, 2010, pp. 141-148.
[5] r. venkatasubramanian, j. p. hayes and b. t. murray, "low-cost on-line fault detection using control flow assertions", in proceedings of the 9th ieee international on-line testing symposium, 2003, pp. 137-143.
[6] r. vemu and j. a. abraham, "ceda: control-flow error detection through assertions", in proceedings of the 12th ieee international on-line testing symposium, july 2006, pp. 151-158.
[7] a. rajabzadeh and s. g. miremadi, "cfcet: a hardware-based control flow checking technique in cots processors using execution tracing", elsevier journal of microelectronics and reliability, vol. 46, pp. 959-972, 2006.
[8] y. sedaghat, s. g. miremadi and m. fazeli, "a software-based error detection technique using encoded signature", in proceedings of the 21st ieee international symposium on defect and fault tolerance in vlsi systems, 2006, pp. 389-400.
[9] p. bernardi, l. v. bolzani, m. rebaudengo, m. s. reorda, f. vargas and m. violante, "online detection of control-flow errors in socs by means of an infrastructure ip core", in proceedings of the 35th international conference on dependable systems and networks, 2005, pp. 50-58.
[10] r. vemu, s. gurumurthy and j. a. abraham, "acce: automatic correction of control-flow errors", in proceedings of the ieee international test conference, 2007, pp. 1-10.
[11] d. gizopoulos, m. psarakis, s. v. adve, p. ramachandran, s. k. hari, d. sorin, a. meixner, a. biswas and x. vera, "architectures for online error detection and recovery in multicore processors", in proceedings of design, automation and test in europe, 2011.
[12] j. ohlsson, m. rimen and u. gunneflo, "a study of the effects of transient fault injection into a 32-bit risc with built-in watchdog", in proceedings of the 22nd international symposium on fault tolerant computing, 1992, pp. 316-325.
[13] c. bolchini, a. miele, m. rebaudengo, f. salice, d. sciuto, l. sterpone and m. violante, "software and hardware techniques for seu detection in ip processors", journal of electronic testing theory and application, vol. 24, no. 1-3, pp. 35-44, 2008.
[14] r. vemu and j. a. abraham, "budget-dependent control-flow error detection", in proceedings of the 14th ieee international on-line testing symposium, 2008, pp. 73-78.
[15] gnu debugger. http://www.gnu.org/software/gdb/.
facta universitatis series: electronics and energetics vol. 35, no 3, september 2022, pp. 333-348
https://doi.org/10.2298/fuee2203333а
© 2022 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper

lighting – the way to reducing electrical energy demand in university buildings in bangladesh

md. yousuf ali, imran khan, mehedi hassan
department of electrical and electronic engineering, jashore university of science and technology, jashore-7408, bangladesh

abstract. lighting is one of the dominant electricity demand factors in the building energy sector and has huge potential for demand reduction. however, concerning the efficacy of energy consumption, this potential energy-saving option requires further investigation, particularly for developing countries. this study addresses the issues of efficient lighting system design for educational institutions, with particular attention to classroom and laboratory lighting systems for a university in bangladesh as a case study. measurements show that during the daytime, under clear and average sky conditions, both rooms received sufficient natural light (>300 lx) for educational activities, whereas under an overcast sky only 50% of the space received sufficient natural light. at night, the illuminance level of the installed fluorescent tube lights was found to be insufficient (<300 lx) for educational activities. the inefficient lighting system design was found to be the main reason for this illuminance level. simulation results reveal that light emitting diode (led) tube lights with a maintenance factor of 0.8 could save 10,080-15,120 kwh of energy, 91,929-137,894 bdt (1 usd = 84 bdt) in cost, and 6,753-10,130 kgco2-eq of greenhouse gas emissions per year for the classrooms.

key words: lighting efficiency, electricity demand, energy saving, lighting system design, lighting energy demand.

received october 6, 2021; revised december 14, 2022; accepted december 22, 2022
corresponding author: imran khan, department of electrical and electronic engineering, jashore university of science and technology, jashore-7408, bangladesh. e-mail: ikr_ece@yahoo.com, i.khan@just.edu.bd

1. introduction

lighting is responsible for system peak demand in both developed and developing countries. for instance, it was found that about 12% of demand was attributed to evening peak lighting in winter in new zealand [1]. thus, reducing energy demand from lighting could be beneficial for electricity authorities. lighting demand could be managed in three different ways: (i) use of ‘new more efficient equipment’, (ii) ‘utilization of improved lighting design practices’, and (iii) ‘improvements in lighting control systems to avoid energy waste for unoccupied and daylight hours’, as identified in [2]. the focus of this study is the second option, that is, to explore the potential of efficient lighting system design for university buildings, focusing on bangladesh as a case study.

lighting systems in educational institutions have significant impacts on learners’ cognitive performance [3] and emotional behavior [4]. at universities, about 42% of electricity is consumed for lighting purposes alone [5]. one study showed that significant amounts of energy, cost, and indirect greenhouse gas (ghg) emissions could be saved if the existing fluorescent lamps were replaced with more efficient ones at the university of malaya [5]. martirano (2011) proposed two different smart controls, namely switching and dimming, for the lighting systems in two classrooms at the university of rome, sapienza, to save energy and cost and to increase the efficiency of the overall system [2].
in sweden, the effectiveness of the lighting control system in an educational building in lund was investigated, and it was found that about 30% of the total lighting energy consumption was attributable to standby energy use; in extreme cases, this could be as high as 55% [6]. a recent study in greece found that through direct current light emitting diode (led) and daylight harvesting systems, annual lighting energy consumption could be reduced from 90.5 kwhp/m2 to 0.55 kwhp/m2 for a typical classroom in a public school [7]. an energy audit for two brac university buildings in dhaka, bangladesh, found that a 28% to 45% energy reduction is possible if the existing lighting systems are replaced by more efficient ones [8].

the government of bangladesh provides subsidies in the electricity sector because the cost of electricity increases significantly during system peak hours [9]. studies show that many factors are responsible for this peak demand, such as the number of occupants and the use of rice cookers and air conditioners [10]. one of the major contributors to the evening peak demand in bangladesh is lighting [11]. the lighting load is one of the potential demand driving factors in educational buildings [5], and reducing this demand would help the grid and reduce the ghg emissions due to electricity generation [12]. the energy efficiency and conservation master plan (eecmp) of bangladesh estimated a potential energy saving of 1,862 gwh per year from the lighting load in the country [11]. the plan indicates that the use of more efficient lights such as led would help to achieve this energy saving. the present lighting energy consumption of about 15% would be reduced to 7.5% [11]. however, how much energy could be saved in educational institutions has not been identified, and this is essential in order to identify the potential energy saving from this sector.
thus, the main goal of this study is to reveal the limitations, scope, and potential of energy saving options from efficient lighting system design in an institutional building in one of the least developed countries. this study takes the jashore university of science and technology (just) campus in bangladesh as a case study. more specifically, it considers efficient lighting system design for the classrooms and laboratories of just academic buildings, taking into account natural and artificial lighting along with many other parameters, such as reflectance factors.

this study is novel for a number of reasons. first, to the best of the authors' knowledge, this is the first study that has explored the potential of efficient lighting system design for educational institutions in bangladesh, a least developed country. although the study particularly focused on bangladesh as a case study, the findings could be applicable to other developing countries. second, this study reveals the limitations of educational lighting system design in least developed countries.

2. data and method

for this analysis, the actual lux in the rooms was measured using a uni-t ut383 luxmeter [illuminance measurement: 0-199999 lux ± (4%+8); resolution: 1 lux; sampling rate: 2/s] (footnote 1). at the same time, the simulation was conducted using dialux evo 8.2 software. although many software packages are available for lighting simulation (e.g., relux, btwin), dialux evo 8.2 was used as it was found to be an effective tool for designing an efficient lighting system with the help of a complete database [13]. this software is able to take into account both natural and artificial lighting for simulation purposes. many previous studies have also used this lighting simulation software, such as [14], [15].
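The quoted luxmeter accuracy of ± (4%+8) matters when a reading lies close to the 300 lx level used in this study. The sketch below assumes the usual reading of such a specification, namely ±(4% of the reading + 8 counts), with one count equal to the 1 lux resolution. That interpretation and the helper name `reading_bounds` are assumptions made for illustration, not statements taken from the instrument documentation.

```python
def reading_bounds(reading_lux, pct=0.04, counts=8, resolution_lux=1.0):
    """Return (low, high) bounds for a displayed illuminance reading,
    assuming an accuracy of +/-(pct * reading + counts * resolution)."""
    margin = pct * reading_lux + counts * resolution_lux
    return reading_lux - margin, reading_lux + margin


# Bounds for a reading exactly at the 300 lx level.
low, high = reading_bounds(300.0)
print(low, high)
```

Under this assumption a displayed 300 lx corresponds to roughly 280 to 320 lx, so grid points that read within about 20 lx of the threshold cannot be classified as sufficient or insufficient with confidence.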
the methodology used for this study is illustrated by the following sequence of events:
(i) the test room dimensions are used to develop the grid for measuring the lux of the rooms.
(ii) the lux of the test rooms is measured experimentally at the pre-defined grid positions with the luxmeter.
(iii) with the same room dimensions, the software tool dialux evo 8.2 is used to simulate the lux of the test rooms.
(iv) steps (ii) and (iii) are repeated for different lighting conditions (i.e., clear sky, overcast sky, average sky, and at night).
(v) finally, the simulation results and the measurement samples are computed, compared, and analyzed to design an effective lighting system.

the classroom (room#836) and laboratory (room#837) in the academic building are measured, and the positions of the doors and windows are properly identified. the grid dimension of the measurement samples is 1.5 m × 1.5 m with a measuring-plane height of 0.8 m, satisfying the en 12464-1 standard [16]. each individual reading is almost instantaneous, and the complete set of measurements takes approximately 30 minutes. hence, lighting conditions are nearly stable for the sample values. for the simulation, this measurement data is utilized. using the luxmeter, lux is measured at different positions in these rooms under different natural lighting conditions, such as clear sky and overcast sky [17]. according to [18] these lighting conditions are:
▪ clear sky: ‘clear sky varies according to the altitude and azimuth of the sun, is brighter and closer to the sun and attenuates when moving away from it. the brightness of the horizon is between these two extremes.’ that is, a cloudless sky. the measurement was taken on 14th september 2019 at 1 pm local time.
▪ overcast sky: ‘this type of sky is completely covered by clouds and the view of the sun is completely impeded. under a very overcast condition, there is little to no direct lighting and the values of global and diffuse illuminance are very close’. for this weather condition, we considered a cloudy day, i.e., 24th september 2019 at 1:44 pm (local time).
▪ average/intermediate sky: ‘this is a type of sky found between the clear and the overcast skies.’ that is, average weather conditions. the lux was measured on 17th september 2019 at 1 pm (local time).
▪ at night: no natural light is present in this condition. this measurement was taken on 17th september 2019 at 7:10 pm local time.

footnote 1: https://www.uni-t.cz/en/p/luxmeter-uni-t-ut383 (accessed on 12-jun-2020)

finally, all these measured and simulated results are compared, and the factors that have an impact on illumination are identified. note that the specific dates and times mentioned above were chosen merely to capture suitable measurement samples under different lighting conditions with diverse weather. at the same time, the computations are also performed with the relevant samples measured at level 7 of the academic building.

the geometric properties of the test rooms (classroom and laboratory), along with the technical specifications of the luminaires, are given in table 1. for an efficient lighting system design simulation, we used an led tube light (philips ll512x 1xled50s/835 nb) and compared it with a fluorescent tube light (philips tcs460 1xtl5-32w hfp d8-vh). for both cases, the power of the luminaires is found to be 6.5 w/m2. this comparison is made because led is becoming popular nowadays due to its low energy consumption, although most of the existing lights in university campuses in bangladesh are fluorescent tube lights. the detailed technical specifications for these two lights are also provided in table 1. for all the artificial lighting simulations, we used a maintenance factor (mf) of 0.8.
the mf is a product of different parameters, as shown in eq. (1). for further details of each of the parameters, see [19].

mf = lld × ldd × aft × of × svv × bf × fsd (1)

where mf = maintenance factor, lld = lamp lumen depreciation, ldd = luminaire dirt depreciation, aft = ambient fixture temperature, of = optical factor, svv = supply voltage variation, bf = ballast factor, fsd = fixture surface depreciation.

another crucial parameter in lighting system design is the reflectance factor. it is defined as ‘the ratio of the flux actually reflected by a sample surface to that which would be reflected into the same reflected-beam geometry by an ideal (glossless), perfectly diffuse (lambertian), completely reflecting standard surface irradiated in exactly the same way as the sample’ (footnote 2). in general terms, it is a measure of the usable visible light that is reflected from different surfaces in a room when illuminated by a light source. for this simulation, the reflectance factors for the laboratory were taken to be ceiling 50%, walls 75%, and floor 50%, and for the classroom ceiling 50%, walls 65%, and floor 50% (based on the color of the surfaces) [20]. note that the reflectance factors of the surfaces should be chosen carefully, as the simulation result changes with variations in this factor.

footnote 2: https://www.ies.org/definitions/reflectance-factor-r/ (accessed on 18-jun-2020)

table 1 test-room properties and technical specifications of light used for the simulations.
test-room properties

| type | classroom | laboratory |
|---|---|---|
| level | l-7 (room#836) | l-7 (room#837) |
| geometry | floor area: 88 m2; floor to ceiling height: 3.5 m | floor area: 44 m2; floor to ceiling height: 3.5 m |
| glazing | area: 28 m2; orientation: south west | area: 14 m2; orientation: south west |
| fabric | floor: concrete; wall: plaster wall with paint; ceiling: concrete with paint | floor: concrete with plastic floor mat; wall: plaster wall with paint; ceiling: concrete with paint |

technical specifications of light used for the simulations

| lamp parameters | led tube light | fluorescent tube light |
|---|---|---|
| lamp model | philips ll512x 1xled50s/835 nb | philips tcs460 1xtl5-32w hfp d8-vh |
| lamp flux (lm) | 4700 | 3250 |
| total flux (lm) | 4693 | 2920 |
| luminous efficacy (lm/w) | 130 | 81 |
| correlated color temp. [cct] (k) | 3000 | 3500 |
| color rendering index [cri] | 99 | 80 |
| light output ratio [lor] (%) | 100 | 90 |
| total power (w) | 36 | 36 |
| lamp type | batten type | batten type |

3. result and analysis

according to the bangladesh national building code (bnbc), the illuminance level in classrooms or laboratories should be at least 300 lx (footnote 3). the measured illuminance levels at different positions in the classroom for different natural lighting conditions are shown in fig. 1. there are 16 luminaires in the classroom and 8 luminaires in the laboratory. they are uniformly placed at a longitudinal distance of 3 feet and a breadthwise distance of 8 feet from each other [21]. under the clear sky and average sky conditions, natural light is more than sufficient for the classroom. on the other hand, under the overcast sky condition, half of the classroom receives sufficient natural light and the other half requires artificial lighting [see fig. 1 (c)]. notably, at night, the existing lighting system is unable to provide sufficient illuminance for reading or writing activities [see fig. 1 (d)].

fig. 1 measured illuminance level in the classroom at different positions for: (a) clear sky, (b) average sky, (c) overcast sky, and (d) at night (fluorescent tube light was used).

although the simulation results for the clear sky and average sky lighting conditions show sufficient light in the classroom, the simulated illuminance levels differ from the measured values, as is evident from figs. 1 and 2. one of the reasons might be the maintenance factor (mf), which is a combination of many different parameters, as shown in eq. (1).

footnote 3: http://www.dpp.gov.bd/upload_file/gazettes/39201_96302.pdf (accessed on 14-nov-2021)

fig. 2 simulated illuminance level for the classroom at different positions for: (a) clear sky, (b) average sky, (c) overcast sky, and (d) at night [philips tcs460 1xtl5-32w hfp d8-vh].

for the overcast sky condition, the simulation results are partially in line with the measured values, particularly for the window side. the reason might be the position of the classroom. in particular, the window side is completely open and receives sufficient natural light, but the opposite side does not, because a corridor and other rooms are situated on that side of the room. thus, the measured light (through the luxmeter) shows the actual scenario, whereas the simulation result shows the theoretical value. in the night lighting simulation, the result shows sufficient light (≥300 lux), but the actual measurement differs significantly. this variation is predominantly due to the use of poor-quality tube lights in the classroom, whereas in our simulation we used the philips tcs460 1xtl5-32w hfp d8-vh light, which is suitable for first class lighting with a clean, distinctive design.
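The maintenance factor was named above as one possible reason for the gap between simulated and measured levels. Because eq. (1) is a product, several modest losses compound quickly. The sketch below multiplies the seven factors of eq. (1); every numeric value in it is a hypothetical placeholder chosen for illustration, not a measured or manufacturer value, and the function name `maintenance_factor` is likewise an assumption.

```python
from math import prod


def maintenance_factor(lld, ldd, aft, of, svv, bf, fsd):
    """Maintenance factor as the product of the seven terms of eq. (1)."""
    return prod((lld, ldd, aft, of, svv, bf, fsd))


# Hypothetical factor values that give a design MF close to the 0.8 used
# in the simulations.
design_mf = maintenance_factor(
    lld=0.95, ldd=0.95, aft=1.0, of=1.0, svv=1.0, bf=0.95, fsd=0.93
)

# Hypothetical values for an aged, dusty installation on a weak supply.
degraded_mf = maintenance_factor(
    lld=0.90, ldd=0.85, aft=0.98, of=0.98, svv=0.95, bf=0.90, fsd=0.90
)

# Ratio by which the delivered illuminance would fall relative to the design.
ratio = degraded_mf / design_mf
print(round(design_mf, 3), round(degraded_mf, 3), round(ratio, 2))
```

With these placeholder values the design factor is about 0.80 and the degraded one about 0.57, so a room simulated at 300 lx would deliver only around 70% of that. This is an illustration of the mechanism only; the actual factors for the installed lamps were not reported.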
a similar result was also obtained for the laboratory, as depicted in figs. 3 and 4.

fig. 3 measured illuminance level in the laboratory room at different positions for: (a) clear sky, (b) average sky, (c) overcast sky, and (d) at night.

under the overcast sky and at night, the classrooms and laboratories require artificial lighting. at the same time, the measured illuminance levels in both rooms indicate that the installed lights, which are typical tube lights with ballasts, are not efficient. a more efficient lighting system could be designed with led tube lights.

fig. 4 simulated illuminance level for the laboratory room at different positions for: (a) clear sky, (b) average sky, (c) overcast sky, and (d) at night [considering philips tcs460 1xtl5-32w hfp d8-vh].

fig. 5 simulated illuminance level for the classroom with: (a) 16 typical tube lights, (b) 12 led tube lights, and laboratory room with (c) 8 typical tube lights, and (d) 8 led tube lights. here, typical tube light: philips tcs460 1xtl5-32w hfp d8-vh and led tube lights: philips ll512x 1xled50s/835 nb.

evidently, if 12 led tube lights were used instead of 16 regular tube lights, the illuminance level obtained in the classroom would be better with the former than with the latter [see figs. 5 (a) and (b)]. although the number of led tube lights could not be reduced for the laboratory, the illuminance level improved significantly compared with that of typical tube lights [see figs. 5 (c) and (d)]. it can be seen from table 2 that the cost of led lights is higher than that of typical tube lights. however, about 25% energy and cost saving can be achieved with this led lighting system compared with typical tube lighting for the classroom. the average illuminance level was increased by 115 lx.
in contrast, no energy or cost saving was observed for the laboratory. nonetheless, with an extra expenditure of 1,840 bdt, an additional 239 lx was achieved.

there is a total of 41 rooms on each floor (approximately 2,600 square meters) of the nine-storied academic building, and they are located face-to-face; of these, 15 are classrooms and laboratories and 26 are offices. of the 15 classrooms and laboratories, eight are similar to the room shown in fig. 1, and the other seven are as depicted in fig. 3. the academic building is west-facing, with an auditorium on the ground floor and a large exam hall on the top floor.

table 2 cost-benefit analysis

for classroom

| sl. no. | type | led tube light | typical tube light |
|---|---|---|---|
| 1. | price (bdt*) | 350 (average unit price, only for light); 350×12 = 4,200 | 120 (average unit price, only for light); 120×16 = 1,920 |
| 2. | average illuminance level (lx) | 535 | 420 |
| 3. | energy consumption (kwh/year) | 520-830 | 700-1,100 |
| 4. | energy cost (bdt/kwh), flat rate for commercial and office consumers | 9.12 (unit price) | 9.12 (unit price) |
| 5. | total cost (bdt/year) | 520×9.12 = 4,742.4 (min); 830×9.12 = 7,569.6 (max) | 700×9.12 = 6,384 (min); 1100×9.12 = 10,032 (max) |
| 6. | benefits: energy and cost saving | 180 – 270 kwh/year; 1,641.6 – 2,462.4 bdt/year | -- |

for laboratory

| sl. no. | type | led tube light | typical tube light |
|---|---|---|---|
| 1. | price (bdt) | 350 (average unit price, only for light); 350×8 = 2,800 | 120 (average unit price, only for light); 120×8 = 960 |
| 2. | average illuminance level (lx) | 564 | 325 |
| 3. | energy consumption (kwh/year) | 350-550 | 350-550 |
| 4. | energy cost (bdt/kwh), flat rate for commercial and office consumers | 9.12 (unit price) | 9.12 (unit price) |
| 5. | total cost (bdt/year) | 350×9.12 = 3,192 (min); 550×9.12 = 5,016 (max) | 350×9.12 = 3,192 (min); 550×9.12 = 5,016 (max) |
| 6. | benefits: quality of light | increased illuminance level (564 – 325 = 239 lx) | -- |

* bangladeshi currency, 1 usd = 84 bdt
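The classroom entries of table 2 can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers that appear in the table (energy ranges, the 9.12 bdt/kwh flat rate, and lamp prices); the variable names are assumptions chosen for illustration.

```python
TARIFF_BDT_PER_KWH = 9.12

# (min, max) annual classroom energy use in kWh/year, from table 2.
led_kwh = (520, 830)
tube_kwh = (700, 1100)

# Energy saved per classroom and its value at the flat tariff.
saved_kwh = tuple(t - l for t, l in zip(tube_kwh, led_kwh))
saved_bdt = tuple(round(k * TARIFF_BDT_PER_KWH, 1) for k in saved_kwh)

# Saving relative to the typical tube light system, in percent.
saved_pct = tuple(round(100 * s / t, 1) for s, t in zip(saved_kwh, tube_kwh))

# Extra purchase cost of 12 LED lamps over 16 typical tube lights, in BDT.
extra_lamp_cost_bdt = 350 * 12 - 120 * 16

print(saved_kwh, saved_bdt, saved_pct, extra_lamp_cost_bdt)
```

The computed savings of 180 to 270 kwh/year and 1,641.6 to 2,462.4 bdt/year match row 6 of the table, and the relative saving of roughly 24.5% to 25.7% is consistent with the "about 25%" quoted in the text.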
although there are a few classrooms and laboratories on the ground and top floors, these two floors were excluded in estimating the energy and cost saving. on each of the remaining seven floors, there are eight rooms in which energy and cost can be saved through this led lighting system. the total number of potential classrooms for this purpose is therefore 56 (8×7), and the total energy saving per year would be between 10,080 kwh and 15,120 kwh. this type of led lighting system could save from bdt 91,929.6 to bdt 137,894.4 per year. with respect to the ghg emission reduction due to this energy saving, it was estimated that 6,753 to 10,130 kgco2-eq could be saved per year by avoiding fossil-fueled electricity generation. for this estimation, an average yearly carbon intensity of 670 gco2-eq/kwh was considered for bangladesh [12], [22].

4. discussion

personal communication with non-academic and academic staff members of many different public universities in bangladesh reveals that almost every university in the country uses typical fluorescent tube lights in its classrooms and offices. clearly, the use of led lights in these educational institutions is capable of saving electrical energy. the use of led not only saves energy but also offers economic and environmental benefits. university students found led light to be more attractive, efficient, stimulating, and comfortable than fluorescent light, and regarded it as a cutting-edge technology [4]. after the end of their life, normal fluorescent tubes can be harmful to the environment and human health, as they contain phosphor and mercury. due to the lack of proper waste management systems in least developed countries, expired fluorescent tube lights are a major threat to the environment, as light manufacturing materials such as mercury could mix with soil and water. by contrast, led tubes do not contain these chemicals.
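The energy, economic, and environmental benefits referred to above rest on the building-level estimate given at the end of Section 3. As a cross-check, the sketch below scales the per-classroom saving of 180 to 270 kwh/year to the 56 classrooms, using the 9.12 bdt/kwh tariff and the 670 gco2-eq/kwh carbon intensity quoted in the text. The constant and variable names are assumptions chosen for illustration.

```python
CLASSROOMS = 8 * 7                    # eight classrooms on each of seven floors
SAVING_PER_ROOM_KWH = (180, 270)      # (min, max) per classroom per year
TARIFF_BDT_PER_KWH = 9.12
CARBON_INTENSITY_KG_PER_KWH = 0.670   # 670 gCO2-eq/kWh

# Building-level (min, max) savings per year.
energy_kwh = tuple(CLASSROOMS * s for s in SAVING_PER_ROOM_KWH)
cost_bdt = tuple(round(e * TARIFF_BDT_PER_KWH, 1) for e in energy_kwh)
ghg_kg = tuple(round(e * CARBON_INTENSITY_KG_PER_KWH, 1) for e in energy_kwh)

print(energy_kwh, cost_bdt, ghg_kg)
```

The results agree with the reported ranges: 10,080 to 15,120 kwh, about 91,930 to 137,894 bdt, and about 6,754 to 10,130 kgco2-eq per year. The estimate scales linearly with the number of classrooms, so it assumes that all 56 rooms behave like the simulated one.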
a ballast is also required for the operation of fluorescent tubes, which not only adds to the cost of the lamp but is also responsible for the typical buzzing noise. fluorescent tube lights often become dull and flicker frequently, whereas led tube lights do not have these problems. although led offers many advantages over typical fluorescent tube lights, the cost of the former is about three times higher than that of the latter.

the overall efficiency of led lighting systems depends on many parameters, such as reflectance factors. we varied the reflectance factors of the ceiling, walls, and floor with an mf of 0.8, and the results are presented in table 3. notably, an effective optimization of the ceiling, wall, and floor colors is required to gain the maximum lux output from a lighting system in a room, because different colors have different reflectance factors. although reflectance factors for these three surfaces are recommended in the developed world [23] for efficient lighting system development, they are rarely applied in the least developed and developing world.

table 3 illuminance level variation due to different reflectance factors of the surfaces for the classroom and laboratory room.

| sl. no. | ceiling rf (%) | walls rf (%) | floor rf (%) | classroom average (lx) | laboratory average (lx) |
|---|---|---|---|---|---|
| 1. | 75 | 75 | 75 | 604 | 453 |
| 2. | 75 | 75 | 50 | 564 | 437 |
| 3. | 50 | 75 | 50 | 535 | 424 |
| 4. | 50 | 75 | 16.3 | 503 | 405 |
| 5. | 50 | 75 | 25 | 501 | 409 |
| 6. | 50 | 50 | 50 | 491 | 399 |
| 7. | 50 | 75 | 12.5 | 491 | 404 |
| 8. | 25 | 75 | 50 | 489 | 402 |
| 9. | 12.5 | 75 | 50 | 474 | 395 |
| 10. | 50 | 50 | 16.3 | 470 | 389 |
| 11. | 50 | 25 | 50 | 467 | 383 |

although led lighting systems offer several benefits over fluorescent tube lights in educational institutions, the implementation of this efficient design faces several barriers. first, the lack of information.
the initial cost of led is indeed higher than that of fluorescent tube lights, and the lifecycle saving from the led lighting system is frequently not taken into account by the responsible authority due to the absence of available information. second, the lack of environmental awareness. in developing and least developed countries, one of the primary uses of electricity is lighting, and consumers are not aware of how electricity is generated. the negative impact of fossil-fueled electricity generation and its consequences for the environment and human health thus receive less attention. third, rigidity to change. the government and the authorities often emphasize procuring electricity from known suppliers at a cheap rate, but these suppliers are most often unable to supply energy-efficient goods due to their comparatively high initial price. moreover, the staff involved in this procurement process are not well informed about the advantages of energy-efficient options. fourth, the lack of energy management (i.e., demand side management) strategies. in developed countries, most educational institutions, predominantly universities, have their own demand side management strategies to reduce energy consumption towards sustainable development. this type of strategy is completely absent in educational institutions in least developed countries, mainly due to the lack of research in this field [24]. finally, the lack of technical expertise in lighting system design. during the planning, construction, and interior design of any building, priority is most often given to civil engineers and architects. lighting system design is usually completed by a local electrical technician who has no knowledge of lighting efficiency factors. to overcome these barriers, policymaking needs to be revised or developed.
some recommendations for these changes include:
▪ awareness and development of procurement staff through training programs, so that they can make optimal decisions regarding efficiency and costs while procuring new lighting systems. these awareness programs must include environmental and sustainable development issues.
▪ to obtain proper information regarding energy-efficient lighting and its benefits, consultation with experts in this field would be an effective approach. an energy audit by a professional could also be helpful for this.
▪ for educational institutional lighting system design, a lighting system expert or lighting engineer should be employed during the planning and construction phases of the building. this is crucial as there are many parameters that need to be considered for a lighting system design [25].
▪ every educational institution should launch demand side management schemes for effective utilization of its energy resources, including lighting systems.
▪ the government of the country should develop regulations regarding more efficient lighting system design and usage at educational institutions.
▪ at the initial stage, each institution should run a pilot project for a more efficient lighting system design, and consider the project’s outcome.
5. conclusion
in this study, a simulation exercise and practical measurements of lighting levels were carried out inside an educational building located in bangladesh, with the aim of understanding how a careful design of the lighting system may help reduce electricity needs and guarantee visual comfort. to overcome the deficiencies of the existing system, an led-based efficient lighting system was proposed. the results show that 25% of energy and cost per year could be saved from this type of lighting system.
although the proposed led-based lighting system has higher initial costs than the typical fluorescent tube system, it offers long-term economic and environmental benefits. the lifetime of an led tube light is almost twice that of a typical fluorescent tube light. furthermore, the former does not contain any hazardous metals such as mercury, whereas the latter does. for energy-efficient lighting system design, the quality of light is as important as the quantity of light; that is, increasing the number of lights is not necessarily a better option. maximizing the use of daylight in conjunction with artificial lighting is another potential way of reducing energy demand in buildings, and it must be taken into account during any lighting system design for educational institutions.
acknowledgement: this work was supported by the jashore university of science and technology under the research project grant.
references
[1] c. dortans, m. w. jack, b. anderson and j. stephenson, "lightening the load: quantifying the potential for energy-efficient lighting to reduce peaks in electricity demand", energy effic., vol. 13, pp. 1105–1118, 2020.
[2] l. martirano, "lighting systems to save energy in educational classrooms", in proceedings of the 10th international conference on environment and electrical engineering. rome, 2011, pp. 1–5.
[3] o. keis, h. helbig, j. streb and k. hille, "influence of blue-enriched classroom lighting on students’ cognitive performance", trends neurosci. educ., vol. 3, pp. 86–92, 2014.
[4] n. castilla, c. llinares, f. bisegna and v. blanca-giménez, "emotional evaluation of lighting in university classrooms: a preliminary study", front. archit. res., vol. 7, pp. 600–609, 2018.
[5] t. m. i. mahlia, h. a. razak and m. a. nursahida, "life cycle cost analysis and payback period of lighting retrofit at the university of malaya", renew. sustain. energy rev., vol. 15, pp. 1125–1132, 2011.
[6] n. gentile and m. c.
dubois, "field data and simulations to estimate the role of standby energy use of lighting control systems in individual offices", energy build., vol. 155, pp. 390–403, 2017. [7] l. t. doulos, a. kontadakis, e. n. madias, m. sinou and a. tsangrassoulis, "minimizing energy consumption for artificial lighting in a typical classroom of a hellenic public school aiming for near zero energy building using led dc luminaires and daylight harvesting systems", energy build., vol. 194, pp. 201–217, 2019. [8] r. rayhana, m. a. u. khan, t. hassan, r. datta and a. h. chowdhury, "electric and lighting energy audit: a case study of selective commercial buildings in dhaka", in proceedings of the ieee international wie conference on electrical and computer engineering, dhaka, 2015, pp. 1–4. [9] i. khan, "household factors and electrical peak demand: a review for further assessment", adv. build. energy res., vol. 15, no. 4, pp. 409–441, 2021. [10] i. khan, "a temporal approach to characterizing electrical peak demand: assessment of ghg emissions at the supply side and identification of dominant household factors at the demand side", phd thesis, university of otago, 2019. [11] eecmp, "energy efficiency and conservation master plan up to 2030", government of bangladesh, dhaka, 2015. [12] i. khan, "importance of ghg emissions assessment in the electricity grid expansion towards a lowcarbon future: a time-varying carbon intensity approach", j. clean. prod., vol. 196, pp. 1587–1599, 2018. [13] a. castillo-martinez, j. a. medina-merodio, j. m. gutierrez-martinez, j. aguado-delgado, c. depablos-heredero and s. otón, "evaluation and improvement of lighting efficiency in working spaces", sustainability, vol. 10, no. 1110, pp. 1–16, 2018. [14] s. bunjongjit and a. ngaopitakkul, "feasibility study and impact of daylight on illumination control for energy-saving lighting systems", sustainability, vol. 10, no. 4075, pp. 1–22, 2018. [15] f. 
salata et al., "maintenance and energy optimization of lighting systems for the improvement of historic buildings: a case study", sustainability, vol. 7, pp. 10770–10788, 2015. [16] en 12464-1. lighting of work places part 1: indoor work places. european committee for standardization, 2011. [17] a. meresi, "evaluating daylight performance of light shelves combined with external blinds in southfacing classrooms in athens, greece", energy build., vol. 116, pp. 190–205, 2016. [18] m. b. piderit, c. cauwerts, and m. diaz, "definition of the cie standard skies and application of high dynamic range imaging technique to characterize the spatial distribution of daylight in chile", rev. la construcción., vol. 13, no. 2, pp. 22–30, 2014. [19] lightsearch, "light loss factors", lightsearch.com, 2020. [20] approximate reflectance values of typical building finishes, decrolux.com.au, 2018. [21] n. makaremi, s. schiavoni, a.l. pisello, f. asdrubali and f. cotana, "quantifying the effects of interior surface reflectance on indoor lighting", energy procedia., vol. 134, pp. 306–316, 2017. [22] i. khan, "temporal carbon intensity analysis: renewable versus fossil fuel dominated electricity systems", energy sources, part a recover. util. environ. eff., vol. 41, no. 3, pp. 309–323, 2019. [23] n. makaremi, s. schiavoni, a. l. pisello, f. asdrubali, and f. cotana, "quantifying the effects of interior surface reflectance on indoor lighting", procedia eng., vol. 134, pp. 306–316, 2017. [24] i. khan, "energy-saving behaviour as a demand-side management strategy in the developing world: the case of bangladesh", int. j. of energy and env. engin., vol. 10, no. 4. pp. 493-510, 2019. [25] i. khan, "a survey-based electricity demand profiling method for developing countries: the case of urban households in bangladesh" journal of building engineering., vol. 42, no. 102507, pp. 1-9, 2021. facta universitatis series: electronics and energetics vol. 34, no 1, march 2021, pp. 
21-35 https://doi.org/10.2298/fuee2101021a
© 2021 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper
design of an effective control for grid-connected pv system based on fs-mpc
nadjah attik1, abd essalam badoud1, farid merahi1, abdelbaset laib2, yahya ayat1
1automatic laboratory of setif, electrical engineering department, university of ferhat abess, setif-1, algeria
2lepci laboratory, electronics department, university of ferhat abess, setif-1, algeria
abstract. this paper is part of ongoing research on modern control methods based on power electronics. harmonic cancellation in distribution networks is currently a serious problem, especially in heavy electrical industry, and the main sources of harmonic currents injected into the network require attention in order to reduce current harmonic levels. power quality is a fairly broad concept that covers both the quality of the power supply (voltage waveform) and that of the currents injected into the electrical grid. in this context, a modern preventive solution is needed to limit the harmonic disturbance caused by the different power electronic systems connected to the grid. it is also necessary to improve the quality and stability of the grid and to develop curative devices, such as converters equipped with a control scheme that makes the current exchanged with the network as sinusoidal as possible. this paper proposes a control scheme for a two-stage grid-tied pv system based on finite set model predictive control (fs-mpc). the fs-mpc design follows from the structure and operating principle of the three-phase inverter tied to the grid. we also employ a perturb and observe (p&o) mppt controller and a pi controller to regulate the dc-bus voltage. to validate the proposed control scheme, numerical simulations are carried out using matlab/simulink 2013b.
the obtained results demonstrate that the proposed control scheme ensures mpp tracking and the injection of the extracted pv power into the grid with high current quality under irradiation changes.
key words: photovoltaic system, two-level inverter, finite set model predictive control fs-mpc, thd, grid-connected
received march 12, 2020; received in revised form october 12, 2020
corresponding author: yahya ayat, automatic laboratory of setif, electrical engineering department, university of ferhat abess, setif 1, algeria, e-mail: yahya.ayat@yahoo.fr
1. introduction
the use of renewable energies is growing significantly worldwide in response to the rising demand for electrical energy, particularly in remote areas lacking reliable electricity. among the sources of renewable energy, photovoltaic energy is rapidly becoming competitive with conventional sources and has become a real alternative for boosting renewable energy penetration into the energy mix [1], [2]. electric generators, control theory and power electronic converters are the most important elements for safe, reliable, and high-performance operation. photovoltaic energy is among the most widely used renewable energy sources in the world owing to its many advantages [3], [4], [5], which is why the solar energy grid integration systems (segis) concept will be the key to achieving high penetration of photovoltaic (pv) systems into the utility grid [6]. there are various topologies of pv installations connected to a power grid. nevertheless, all these topologies are based on a photovoltaic generator connected to the grid by means of inverters which transfer and shape solar electric energy. the progress made in recent years in the development of inverters dedicated to photovoltaics has made it possible to greatly improve these management systems [7].
inverters are no longer limited to converting the dc power produced by the solar panels into ac power in the form of a sinusoidal voltage of the desired frequency; they also exploit the power delivered by the pv generator by forcing it to operate at its maximum power point. in addition, they provide reliable network monitoring to protect the network against outages and interrupt the power supply in the event of problems arising either in the network or in the installation [2], [7]. for a medium voltage network, it is difficult to directly connect a single power semiconductor switch. as a result, multilevel inverters have been introduced as an alternative in high power and medium voltage applications because they offer several advantages. increasing the number of levels improves the waveforms at the output of the converter, in particular in terms of harmonic content, but it requires much more complex control and a larger number of semiconductors. the present challenge is to extract the maximum power from the photovoltaic system and deliver it to the power system with high current quality. for this reason, many researchers take advantage of technological advances in digital signal processors. owing to their greater reliability and improved performance, which lead to increased production rates, power converters with high-performance adjustable variable speed drives are increasingly present in a wide range of applications. following this trend, power converters have become an emerging paradigm with many applications in a wide range of systems [8]. today, in view of the new requirements for quality and energy efficiency and the increasing demand for energy, the control and management of power generation systems is a very attractive research area. in recent years, new control schemes, novel topologies and new semiconductor devices have been developed in order to meet these requirements.
in the literature, several inverter control techniques have been suggested. some of these are well established, while others are more recent; among these newer methods, predictive control is a very attractive option for the control of power converters. predictive control covers a very large family of controllers with various approaches [9], [10]. the principle of predictive control is to use a model of the controlled system within the controller, in real time, to predict the future behavior of the controlled variables. this information is used by the controller to obtain the optimal control action according to a predefined optimization criterion. predictive control has a number of advantages over other methods, including:
▪ its principle is intuitive and easy to understand.
▪ the corrector obtained is a linear control law that is easy to implement and requires little computation time.
▪ it allows constraints on the controlled and manipulated variables to be respected.
▪ it allows automatic adaptation of the system in the event of measurable disturbances.
▪ it is intrinsically capable of compensating for delays or dead time.
▪ it is very useful when the references to be followed are known in advance [8].
published research in the field of static converters and power electronics applications in general shows that this kind of technique, especially model-based predictive control (mpc), is often used in current control applications for inverters. the model predictive control (mpc) approach has emerged in power electronics as a simple and promising method of digital control. predictive control of electrical drives and power converters is a serious move towards a new approach that will improve the reliability of alternative energy control and management systems [10], [11].
the key characteristics of modern power electronic converters (fast operation, high power density, high performance, reduced weight and size) are obtained through switching mode operation, in which power semiconductor devices are operated in on/off mode [3], [6], [13]. the pv system is connected to the grid on the basis of an agreement between the consumer and the utility company that specifies the safety standards to be followed during the connection. this paper proposes a control scheme for a two-stage grid-tied pv system based on finite set model predictive control (fs-mpc). the goal of the fs-mpc technique is to ensure that the pv power is injected into the grid with high grid current quality. in addition, a p&o mppt controller and a pi controller are used to track the mpp under changes of irradiation and to regulate the dc-bus voltage, respectively. the rest of the paper is organized as follows. first, a general block diagram of the model simulating the fs-mpc strategy of the two-level pv inverter connected to the grid in the matlab/simulink environment is presented. second, the proposed predictive control of the three-phase two-level inverter is studied. finally, the simulation results are discussed.
2. global system configuration
this paper studies the control strategy of a three-phase two-level pv inverter tied to the grid, as illustrated in fig. 1. the system studied is composed of a pv generator, a dc-dc boost converter, a dc bus and a three-phase two-level inverter. the mppt control extracts the maximum output power from the pv generator, and the aim of the dc voltage regulation loop is to maintain the dc-bus voltage at its reference value. the regulation of the dc voltage is effected by adjusting the amplitude of the current references through a pi regulator.
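The two outer loops described above can be illustrated with a minimal sketch. It assumes the textbook form of perturb and observe (the text does not detail the authors' variant) and a discrete pi regulator whose output is taken as the amplitude of the current references. The helper names, the step size, the gains and the sign convention of the error are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


def perturb_and_observe(v, p, v_prev, p_prev, v_ref, step=0.5):
    """One textbook P&O update of the pv voltage reference (hypothetical helper).

    v, p           : present pv voltage and power
    v_prev, p_prev : values from the previous mppt sample
    step           : perturbation size in volts (assumed value)
    """
    dp = p - p_prev
    dv = v - v_prev
    if dp == 0:
        return v_ref  # power unchanged: hold the reference
    # If power and voltage moved in the same direction, raise the reference;
    # otherwise lower it (i.e. reverse the last perturbation when power fell).
    same_direction = (dp > 0) == (dv > 0)
    return v_ref + step if same_direction else v_ref - step


@dataclass
class DcBusPI:
    """Discrete pi regulator for the dc-bus voltage (gains are assumptions)."""
    kp: float
    ki: float
    ts: float
    integral: float = 0.0

    def update(self, v_dc, v_dc_ref):
        # Assumed sign convention: a dc-bus voltage above its reference
        # asks for a larger injected-current amplitude.
        error = v_dc - v_dc_ref
        self.integral += self.ki * error * self.ts
        return self.kp * error + self.integral  # amplitude of current references


if __name__ == "__main__":
    pi = DcBusPI(kp=0.1, ki=10.0, ts=1e-3)
    print(pi.update(v_dc=230.0, v_dc_ref=220.0))
    print(perturb_and_observe(v=17.5, p=80.0, v_prev=17.0, p_prev=78.0, v_ref=17.5))
```

In a two-stage system of this kind, the p&o output would typically drive the boost converter while the pi output scales the sinusoidal current references; how the two are tuned relative to each other is not specified here.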
a phase locked loop (pll) outputs a unitary signal synchronized in phase and frequency with the input signal, and an rl filter connects the two-level inverter to the network. we explore an intelligent stochastic fs-mpc for an optimal utilization of solar energy.
3. global system control
the control of the system is organized in the following steps. the boost converter is used to implement the p&o mppt so that the pv system recovers the maximum energy [14], [15]. for optimal operation, the installation needs a constant voltage across the dc-link capacitor. the regulation of the dc voltage vdc is implemented by supplying active power to the network. the correction of this voltage must be done by adding the active fundamental current to the reference current. a proportional-integral (pi) regulator is implemented in the dc voltage regulation loop in order to reduce the fluctuations across the dc capacitor and maintain the voltage at its desired value vdc*. the pll (phase locked loop) ensures that the phase error between input and output is kept at zero and that the input and output frequencies are the same.
fig. 1 complete control strategy used in the proposed system
3.1. dc-ac converter
the power circuit of the three-phase two-level inverter is illustrated in fig. 1. it uses six bidirectional switches connected directly to the three phases. each bidirectional switch consists of an igbt with an antiparallel diode, as shown in fig. 1. the two switches of each inverter leg must operate in a complementary mode to avoid short-circuiting the dc link. it is assumed that the switches and diodes are ideal devices.
the inverter output voltages can be expressed in terms of the dc-link voltage and the switching states as follows [16], [17]. in this modeling, we assume that the components of the inverter are perfect switches driven by the logic control signals $s_i$ ($i = a, b, c$) such that:
▪ if $s_i = 1$, the top switch is closed and the bottom one is open.
▪ if $s_i = 0$, the top switch is open and the bottom one is closed.
in these terms, the three-phase output voltages of the inverter ($v_{an}$, $v_{bn}$, $v_{cn}$) are given by equation system (1):
$$v_{an} = s_a v_{dc}, \qquad v_{bn} = s_b v_{dc}, \qquad v_{cn} = s_c v_{dc} \qquad (1)$$
where $v_{an}$, $v_{bn}$, $v_{cn}$ are the phase-to-neutral (n) voltages of the inverter, $s_a$, $s_b$, $s_c$ are the switching signals of the inverter, and $v_{dc}$ is the inverter input voltage (v).
for an inverter with six switches in which the switches of each arm are controlled in a complementary manner, there are eight ($2^3$) possible combinations of the switch states ($s_a$, $s_b$, $s_c$), corresponding to eight voltage states [3], as shown in fig. 2. on the basis of the notion of the rotating vector [4], [6], we can associate with each of these combinations the instantaneous space vector defined by (2):
$$v = \frac{2}{3}\left(v_{an} + a\,v_{bn} + a^{2}\,v_{cn}\right) \qquad (2)$$
with $a = e^{j2\pi/3} = -\frac{1}{2} + j\frac{\sqrt{3}}{2}$.
consequently, eight voltage vectors are obtained for the inverter. a space vector diagram that contains these eight combinations is shown in fig. 2.
fig. 2 voltage vectors in the complex plane
with:
$$v_1 = 0;\quad v_2 = \tfrac{2}{3}v_{dc};\quad v_3 = \tfrac{1}{3}v_{dc} + j\tfrac{\sqrt{3}}{3}v_{dc};\quad v_4 = -\tfrac{1}{3}v_{dc} + j\tfrac{\sqrt{3}}{3}v_{dc};$$
$$v_5 = -\tfrac{2}{3}v_{dc};\quad v_6 = -\tfrac{1}{3}v_{dc} - j\tfrac{\sqrt{3}}{3}v_{dc};\quad v_7 = \tfrac{1}{3}v_{dc} - j\tfrac{\sqrt{3}}{3}v_{dc};\quad v_8 = 0.$$
3.2. philosophy of predictive control
the synthesis of predictive control is based essentially on two stages: predicting the future behavior of the system and quadratic optimization.
fig. 3 principle of fs-mpc
prediction of future system behavior
the per-phase model of the injected current is given by equation (3):
$$r_f\,i + l_f\frac{di}{dt} + e = v \quad\Longrightarrow\quad \frac{di}{dt} = -\frac{r_f}{l_f}\,i + \frac{1}{l_f}\,(v - e) \qquad (3)$$
where $e$ is the grid voltage, $v$ is the inverter output voltage, and $i$ is the injected current.
in order to predict the behavior of the variables evaluated by the cost function, a discrete-time model of the system is necessary. the forward euler approximation is used to discretize the system model because of its simplicity. it also provides acceptable precision, which is essential for good effectiveness. according to this technique, the derivative is approximated as in (4) [16], [18]:
$$\frac{dx}{dt} \approx \frac{x(k+1) - x(k)}{t_s} \qquad (4)$$
where $t_s$ is the sampling time, and $x(k)$ and $x(k+1)$ are the values of the state variable at the current and next sampling instants, respectively.
applying the euler approximation (4) to the current model (3) gives an expression that predicts the future current at instant (k+1) for each of the eight possible switching states applied to the inverter. this expression is written in the following form (5):
$$i(k+1) = \left(1 - \frac{r_f\,t_s}{l_f}\right) i(k) + \frac{t_s}{l_f}\bigl(v(k) - e(k)\bigr) \qquad (5)$$
quadratic optimization
as a final step, the cost function is defined. it is expressed in orthogonal coordinates, measures the error between the reference and predicted currents, and is given by (6) [4]:
$$g = \left|i_\alpha^{*}(k+1) - i_{\alpha,p}(k+1)\right| + \left|i_\beta^{*}(k+1) - i_{\beta,p}(k+1)\right| \qquad (6)$$
where $i_{\alpha,p}(k+1)$ and $i_{\beta,p}(k+1)$ are the real and imaginary parts of the predicted grid current, and $i_\alpha^{*}(k+1)$ and $i_\beta^{*}(k+1)$ are the real and imaginary parts of the reference grid current.
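The prediction and cost evaluation described above can be sketched in a few lines. This is a minimal illustration, not the authors' simulink implementation: currents and voltages are represented as complex numbers (α as real part, β as imaginary part), the cost is assumed to be the sum of absolute α and β errors (a squared error could be substituted), and the function names are hypothetical. The numerical values in the usage example are taken from table 1 and the 220 v dc-bus reference mentioned in the simulation section.

```python
import math

# Switching states (s_a, s_b, s_c), ordered to match vectors v1..v8 in the text.
SWITCH_STATES = [
    (0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0),
    (0, 1, 1), (0, 0, 1), (1, 0, 1), (1, 1, 1),
]

A = complex(-0.5, math.sqrt(3) / 2)  # a = exp(j*2*pi/3)


def voltage_vector(state, v_dc):
    """Space vector v = (2/3)(v_an + a*v_bn + a^2*v_cn) with v_xn = s_x * v_dc."""
    s_a, s_b, s_c = state
    return (2.0 / 3.0) * v_dc * (s_a + A * s_b + A * A * s_c)


def predict_current(i_k, v_k, e_k, r_f, l_f, t_s):
    """Forward-euler prediction of the injected current at instant k+1."""
    return (1.0 - r_f * t_s / l_f) * i_k + (t_s / l_f) * (v_k - e_k)


def cost(i_ref, i_pred):
    """Assumed cost: absolute alpha error plus absolute beta error."""
    err = i_ref - i_pred
    return abs(err.real) + abs(err.imag)


def fs_mpc_step(i_k, e_k, i_ref, v_dc, r_f, l_f, t_s):
    """Evaluate all eight vectors and return (best_state, best_cost)."""
    best_state, best_cost = None, float("inf")
    for state in SWITCH_STATES:
        v_k = voltage_vector(state, v_dc)
        g = cost(i_ref, predict_current(i_k, v_k, e_k, r_f, l_f, t_s))
        if g < best_cost:
            best_state, best_cost = state, g
    return best_state, best_cost


if __name__ == "__main__":
    state, g = fs_mpc_step(i_k=0j, e_k=0j, i_ref=1 + 0j,
                           v_dc=220.0, r_f=0.1, l_f=10e-3, t_s=1e-5)
    print(state, g)
```

Because the two zero vectors give the same prediction, the loop keeps the first one it meets; a practical controller might instead pick whichever requires fewer commutations, which is a design choice outside the scope of this sketch.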
the goal of optimizing the cost function is to bring the cost value g as close to zero as possible. the switching state which minimizes the cost function is chosen and then applied to the converter at the next sampling instant.
fs-mpc algorithm
the control strategy can be summarized by the following steps, illustrated in fig. 4:
1. build a model of the static converter and its possible switching states. the injected currents are measured and then undergo a transformation according to the d-q coordinates. the values of the reference currents are subsequently obtained from the output quantity of the dc bus regulation loop.
2. build a model of the injected currents for the prediction. the system model is used to predict the injected current value at the sampling instant (k+1) for each of the eight voltage vectors.
3. define the cost function. the cost function (g) measures the error between the reference and predicted currents.
4. select the voltage vector which minimizes the current error and apply the signals corresponding to its switching state.
estimation of the reference currents
the references of the two controlled currents, $i_\alpha$ and $i_\beta$, can be estimated using the outputs of a three-phase pll. for this method, the references of the currents are given by equation (7), in which the three unit sinusoidal signals sin(ωt), sin(ωt-2π/3) and sin(ωt-4π/3) are obtained through a three-phase pll [17]:
$$i_a^{*}(t) = i_{max}\sin(\omega t), \qquad i_b^{*}(t) = i_{max}\sin\!\left(\omega t - \tfrac{2\pi}{3}\right), \qquad i_c^{*}(t) = i_{max}\sin\!\left(\omega t - \tfrac{4\pi}{3}\right) \qquad (7)$$
by applying the abc/αβ transformation, the references of the currents in the stationary frame αβ are defined by the expressions in (8):
$$i_\alpha^{*}(t) = \tfrac{3}{2}\,i_{max}\sin(\omega t), \qquad i_\beta^{*}(t) = \tfrac{3}{2}\,i_{max}\cos(\omega t) \qquad (8)$$
fig. 4 predictive control algorithm
4. simulation results
the main parameters of the three-phase two-level converter are given in table 1. simulation is a very important step in the study of photovoltaic systems: it makes it possible to modify system parameters such as solar irradiation and to test the performance of control methods under different conditions. to perform a simulation, the behavior of the system components must be represented in the form of mathematical equations understandable by the simulation software. the simulation study of this first approach to predictive control of the three-phase two-level inverter, based on the selection of the optimal control vector, is carried out with the matlab/simulink tool and the simpowersystems library. the results are obtained in steady state and for a purely sinusoidal power supply. in this work, we present the models used for the photovoltaic panel and the other parts of the grid-connected photovoltaic system, and then integrate the proposed control scheme. to validate the proposed scheme, a test was conducted in which the solar irradiation profile shown in figure (a) is applied to the photovoltaic panel while the temperature is held at 25 °c. the aim of the first study is to demonstrate the performance achieved by the proposed mpc-based control method combined with the conventional p&o method in terms of mppt, dc-link voltage control, α-β current control and grid current quality. in the second study, the development platform is tested to evaluate the grid current thd%. figures (b) and (c) show the evolution of the pv current and voltage; the current is proportional to the solar irradiation.
figure (d) shows the output power of the photovoltaic panel obtained with the p&o algorithm for a sudden increase in solar irradiation from 600 to 1000 w/m2 at 0.5 s and a large sudden decrease from 1000 to 400 w/m2 at 1.5 s. the mppt reaches the maximum power point (mpp) rapidly. moreover, figure (e) shows the evolution of the dc bus voltage obtained for a reference voltage of 220 v; the proposed control scheme takes only milliseconds to track the reference voltage. figures (f), (g) and (h) present the grid current, and figures (i), (j) and (k) illustrate the line and phase voltages with predictive control. the grid currents increase or decrease rapidly following the increase or decrease of solar irradiation, and keep a sinusoidal form with an amplitude proportional to the irradiation. figures (l) and (m) illustrate the active and reactive power; the developed control technique clearly provides good active and reactive power performance under the irradiation changes. the harmonic content of the grid current, obtained by fft analysis with the model predictive control, is shown in figure (n) together with the thd value. finally, to demonstrate the contribution of the proposed control technique, the criteria taken into account in evaluating its efficiency are the total harmonic distortion (thd) of the network currents under different irradiation levels, as presented in figure (o), and the ripple of the active and reactive powers.
fig. 6 simulation results (all panels plotted against time in seconds unless noted): (a) irradiance (w/m2); (b) photovoltaic current (a); (c) photovoltaic voltage (v); (d) pv power (w); (e) dc voltage (v); (f) grid current (a); (g) zoom of grid current (a); (h) zoom of grid voltage (v) and current (a); (i) zoom of phase voltage (v); (j) zoom of three line voltage (v); (k) zoom of line voltage (v); (l) active power (w); (m) reactive power (var); (n) total harmonic distortion; (o) total harmonic distortion (thd %) versus irradiance (w/m2).
we can note that the power of the photovoltaic panel faithfully follows the change in irradiation, in a fast and stable manner, with small oscillations around the optimal power points. the pi regulator proposed for the conventional control of the dc voltage has proven to be effective whatever the operating conditions. it tracks its reference despite the change in irradiation with good accuracy and stability, which demonstrates the efficiency of the proposed pi. we can also note that the amplitude of the injected current is proportional to the irradiation, and that the current is almost sinusoidal and in phase with the line voltage, which means that the power factor is very close to unity. at the grid level, the predictive control is used to regulate the current reference in order to inject the maximum active power into the electrical network, as shown in figures (l) and (m).
regarding the performance of the predictive method, the simulation results show that the proposed predictive control ensures continuous current injection. we note that each current injected into the network with the predictive algorithm closely follows its reference. the predictive control for the inverter current was synthesized on the basis of an optimization principle. the results obtained with these control laws show good dynamic performance, a strong capacity for tracking references and high robustness against variations in meteorological conditions. furthermore, figure (n) shows the harmonic spectrum of one phase of the grid current, analyzed by fast fourier transform (fft) relative to the fundamental frequency. as shown in this figure, the total harmonic distortion remains below 3% even in the most distorted region of the current, which occurs at an irradiance of 400 w/m2. moreover, we note that the current distortion rate obtained by the predictive algorithm at different instants is acceptable and improves as the irradiation increases.
5. conclusion
this work presents an fs-mpc technique for current control in a three-phase inverter, intended to overcome the disadvantages of traditional techniques. the active and reactive currents are modeled as the reference control variables. in order to meet these control goals, the cost function is evaluated during each sampling interval. the control targets are accomplished on the basis of cost-function minimization. the dc-bus voltage is also regulated, while mppt is achieved by the p&o algorithm. the proposed pv system has been tested under various irradiation profiles. the results obtained indicate that the proposed system has a fast dynamic response, high reference tracking performance with low oscillation, and small steady-state error. the pv system transfers power to the utility grid with good efficiency when connected to the grid using an fs-mpc controller.
an optimal power factor and a very low total harmonic distortion (thd) were also achieved.
table 1 system parameters
group | parameter | value
pv module | short circuit current isc | 5 a
pv module | open circuit voltage | 21.6 v
grid | grid frequency | 50 hz
grid | grid voltage | 50 v
simulation parameters | mppt sampling time tm | 1e-3 s
simulation parameters | predictive sampling time ts | 1e-5 s
filter | inductor | 10 mh
filter | resistor | 0.1 ω
boost chopper | input capacitor | 330 µf
boost chopper | dc link capacitor | 330 mh
boost chopper | inductor | 330 µf
references
[1] k. benamrane, t. abdelkrim, a. borni, t. benslimane and o. abdelkhalek, "stability study of output voltages of stand alone single stage npc seven levels inverter for pv system in south algeria", in proceedings of the 2016 8th international conference on modelling, identification and control (icmic), algiers, algeria, 2016, pp. 654-659.
[2] s. kouro, p. cortés, r. vargas, u. ammann and j. rodríguez, "model predictive control - a simple and powerful method to control power converters", ieee trans. ind. electronics, vol. 56, no. 6, pp. 1826-1838, june 2009.
[3] a. bouafia, "techniques de commande prédictive et floue pour les systèmes d’électronique de puissance: application aux redresseurs à mli", phd thesis, university of setif, 2014.
[4] w. alhosaini, y. wu and y. zhao, "an enhanced model predictive control using virtual space vectors for grid-connected three-level neutral-point clamped inverters", ieee trans. energy convers., vol. 34, no. 4, pp. 1963-1972, december 2019.
[5] m. o. benaissa, s. hadjeri, s. a. zidi and y. i. d. kobibi, "photovoltaic solar farm with high dynamic performance artificial intelligence based on maximum power point tracking working as statcom", revue roumaine des sciences techniques. série électrotechnique et énergétique, vol. 63, no. 2, pp. 156-161, 2018.
[6] s. aurtenechea larrinaga, m. a. rodriguez vidal, e. oyarbide and j. r.
torrealday apraiz, "predictive control strategy for dc/ac converters based on direct power control", ieee trans. ind. electronics, vol. 54, no. 3, pp. 1261–1271, june 2007. [7] t. geyer, and d. e. quevedo, "performance of multistep finite control set model predictive control for power electronics", ieee trans. power electron., vol. 30, no. 3, pp. 1633–1644, march 2015. [8] c. bordons, f. garcia-torres and m. a. ridao, model predictive control of microgrids. springer, 2020. [9] p. cortés, m. p. kazmierkowski, r. m. kennel, d. e. quevedo and j. rodríguez, "predictive control in power electronics and drives", ieee trans. ind. electronics, vol. 55, no. 12, pp. 4312–4324, december 2008. [10] t. geyer, g. papafotiou and m. morari, "model predictive control in power electronics: a hybrid systems approach", in proceedings of the 44th ieee conference on decision and control, seville, spain, 2005, pp. 5606–5611. [11] a. laib, f. krim, b. talbi, h. feroura and a. kihal, "decoupled active and reactive power control strategy of grid-connected six-level diode-clamped inverters based on finite set model predictive control for photovoltaic application", revue roumaine des sciences techniques-serie electrotechnique et energetique, vol. 64, no. 3, pp. 51-56, 2019. [12] a. kihal, f. krim, b. talbi, a. laib and a. sahli, "a robust control of two-stage grid-tied pv systems employing integral sliding mode theory", energies, vol. 11, no. 10, p. 2791, 2018. [13] a. k. podder, m. tariquzzaman and m. habibullah, "comprehensive performance analysis of model predictive current control based on-grid photovoltaic inverters", journal of physics: conference series, vol. 1432, p. 012051, 2020. [14] j. rodriguez, m. p. kazmierkowski, j. r. espinoza, p. zanchetta, h. abu-rub, h. a. young and c. a. rojas, "state of the art of finite control set model predictive control in power electronics", ieee trans. industr. inform., vol. 9, no. 2, pp. 1003–1016, may 2013. [15] s. vazquez, j. i. leon, l. g. 
franquelo, j. rodriguez, h. a. young, a. marquez and p. zanchetta, "model predictive control: a review of its applications in power electronics", ieee ind. electron. mag., vol. 8, no. 1, pp. 16-31, march 2014.
[16] t. geyer and d. e. quevedo, "multistep finite control set model predictive control for power electronics", ieee trans. power electron., vol. 29, no. 12, pp. 6836-6846, december 2014.
[17] x. chen, w. wu, n. gao, h. s. h. chung, m. liserre and f. blaabjerg, "finite control set model predictive control for lcl-filtered grid-tied inverter with minimum sensors", ieee trans. ind. electronics, vol. 67, no. 12, pp. 9980-9990, december 2020.
[18] j. rodriguez and p. cortes, predictive control of power converters and electrical drives, vol. 40. john wiley & sons, 2012.

facta universitatis series: electronics and energetics vol. 35, no 3, september 2022, pp. 405-420 https://doi.org/10.2298/fuee2203405r © 2022 by university of niš, serbia | creative commons license: cc by-nc-nd original scientific paper
comb jamming as a strategy for rcied activation prevention
jovan radivojević, mladen mileusnić, aleksandar lebl, verica marinković-nedelicki
iritel a.d., belgrade, batajnički put 23, serbia
abstract. the main objective of this paper is the analysis of comb jamming as a technique for rcied activation prevention. a comprehensive survey of various jamming techniques in the introduction is followed by the presentation of three strategies for comb signal generation. the paper makes two original contributions. the first one is a quantitative comparison of the emission power of the three signal generation techniques with that of barrage jamming under the condition of equal ber value. the second contribution is the determination of the exact ber value as a function of emission power in the case of barrage jamming. until now our analyses and comparisons started from an estimated emission power.
the analysis procedure is performed for a qpsk modulated rcied activation signal. power saving is evident for all three methods of jamming signal generation. it is proved that an additional 2.5db of power saving is achieved by equalizing the level of the frequency components in the comb signal. the analysis in this paper shows that comb jamming achieves the same effects as barrage jamming, but with lower emission power.
key words: remote controlled improvised explosive devices jamming, comb jamming, emission power, qpsk modulation, bit error rate
received february 12, 2022; revised march 25, 2022; accepted april 7, 2022. corresponding author: aleksandar lebl, iritel a.d., 11080 belgrade, batajnički put 23, serbia (e-mail: lebl@iritel.com)
1. introduction
procedures for fighting remote controlled improvised explosive devices (rcied) are becoming more and more important. remote activation gives an attacker a significant degree of comfort: he can realize his intentions from a safe distance, where his activities are difficult to detect. besides, there are a few other reasons why remote control is very attractive to the attacker for activating explosive devices: bombing is more effective and precise, the absence of wires gives the attacker autonomy, and the possibility that the attacker is arrested or killed is decreased [1]. different wireless communication techniques are available to the attacker. these techniques are not implemented only in highly specialized, hardly available equipment, but may also be found in low-cost, commercial devices, such as long range cordless telephones, cell phones, satellite phones, radio controlled toys, car alarms, keyless automobile door openers, wireless doorbell buzzers, and so on [1]. it is this variety of attacking techniques that sets high requirements for the development of a jammer of rcied activation.
it is necessary to implement a wide variety of jamming strategies, to generate a significant number of jamming signal types, and to change signal parameters within wide limits for each signal type. not only are various signal types necessary, but it is also important to develop a new jamming technology or signal type in a very short time interval, measured in weeks, not in months or years. that is why it is important to have well organized development and production of rcied jamming equipment, such as the one presented in [2]. a very important element in the organization of such development and production is consolidating data about performed rcied attacks in a database. event logs, implemented in the systems of one of the suppliers and presented in [3], may be used for this purpose. after this introduction, a survey of applied rcied jamming systems is given in section 2. section 3 presents the three most important techniques for comb signal generation. section 4 deals with the frequency spectrum characteristics of these techniques. the exact bit error rate (ber) characteristics of sweep and barrage jamming are compared in section 5. the procedure for defining the parameters of comb jamming is described in section 6. the emission power relation between comb and barrage jamming is investigated in section 7. the conclusions are given in section 8.
2. a survey of applied rcied jamming systems
the frequencies implemented in commercial devices used for rcied activation are known a priori, and these frequencies should be predominantly jammed to achieve successful jamming. a survey of the frequencies of commercial devices usually used for rcied activation may be found in [4]. these frequencies include those used in mobile communication systems (gsm, umts), dect telephones, remote control toys, wireless doorbells and gate drivers, car alarms, and so on.
a survey of frequencies shows the part of wireless device spectra which may be adapted for rcied activation. the applied signal power in these devices varies in the range from several tens of milliwatts to several watts [1]. a very detailed presentation of jamming techniques with mathematical analysis may be found in [5]. the main jamming techniques analyzed or explained in [5] are noise jamming (broadband, partial-band or narrowband, depending on the number of jammed channels), tone jamming (single tone and multiple tones), sweep jamming and pulse jamming (in fact comb jamming according to this paper). contribution [6] emphasizes two specific jamming techniques: following (or follower) jamming and smart jamming. following jamming is applied against frequency hopping: here a jammer follows the carrier frequency changes of the transmitted signal and then performs jamming on each hopped frequency. the jamming probability when follower jamming is applied is calculated in [7]. it is proved in [7] that the channel scanning speed increases linearly as a function of the hopping rate for the lower values of jamming probability, but this dependence is hyperbolic for the higher jamming probability values. in smart jamming the knowledge of the transmission protocol is the key issue, because jamming is based on attacking the places of protocol vulnerability, such as the error correction checksum, acknowledgement messages, transmission overloading (false messages), and so on. a special threat to successful jamming in the group of smart jamming strategies is the case when timing channels normally intended for the regular function of the protected device are maliciously used as covert channels to send the activation signal [8]. contributions [9] and [10] present the idea that there is a specific, optimum technique for jamming each kind of modulation.
in these contributions the jamming of digital amplitude-phase modulated signals is analyzed, and it is proved that using the same kind of modulation for the jamming signal as for the activation signal is not always the optimum choice. such an analysis is important only if the type of modulation implemented in activation message coding is known a priori, but this condition is very rarely fulfilled. the reactive (responsive) jamming technique has lately been implemented more and more [4], [11]-[16]. this technique may, in fact, be treated as a kind of smart jamming, because jamming is based on successful detection of the frequency band used for rcied activation signal transmission. the existing solutions usually implement the fast fourier transform (fft) as a fast and reliable detection algorithm [4], [11]. in [17] it is proved that rcied activation signal detection on the basis of fft analysis may be faster, and in this way more reliable, than a frequency sweep in an active jammer. in [18] this analysis is further expanded to other reactive detector types. a survey of problems arising in the realization of reactive jammers is presented in [11]. among them, the greatest attention in [11] is devoted to time synchronization in the case of simultaneous operation of multiple jammers. in [12], [13] the characteristics of some other detector types (energy detector, matched filter detector, feature detector and detector based on the calculation of eigenvalues of the covariance matrix) are theoretically compared with one another. contribution [14] is devoted to jamming activation signals in one specific network (ieee 802.15.4), where the message packet duration is very short (only about 350μs), which makes a very short detection time necessary. in the case of universal jamming (not for a specific activation signal type), the achieved detection time is less than 1ms in [15], and even about 200μs for the frequency range of 6ghz in [16].
a survey of implemented techniques for remote activation of improvised explosive devices, and of the frequency band intended for each technique, may be found in [19]. besides these techniques, sending an sms message is a very attractive and, in some world regions, dominant technique of rcied activation, because of its simplicity [20], [21]. rcied activation by sms messages may be prevented or delayed by various detection algorithms implemented in base stations [21]. modern rcied activation signal jammers should follow the development of communication procedures and techniques. one such direction, which aims at reliable and hardly detectable communication, is the implementation of frequency hopping signals. today the hopping speed in realized systems may be significantly higher than the one presented in [7]. responsive jammers based on rcied activation signal detection are in some cases able to follow the frequency changes of a frequency hopped signal [22], [23]. according to the achieved detection rate, the solution [22] may block a signal with 300hops/s, while the solution [23] is effective even when the hop rate is 10000hops/s. the systems [22], [23] are available now and may be purchased on the market. our idea is to implement active jamming in a broad frequency range with not too high jamming power, and thus to avoid the risk of unsuccessful rcied activation signal detection. one possible solution with these desired characteristics is comb jamming according to the principles presented in this paper. there are two mutually different approaches to jamming signal generation. the first one is to generate the desired shape of the jamming signal in a low (or lower) frequency band and then to shift it by a modulator to the necessary frequency band [24].
it is easier to model the signal at lower frequencies, but modulation is an additional complication in the practical implementation of the solution. the other possibility is to generate the signal directly in the jammed frequency band. our intention is to consider the first possibility, since we want to cover a broad frequency range at one moment, and the generated signal may be shifted by several modulators adjusted to different frequency bands at the same time. a completely new approach to jamming signal generation is presented in [25], [26]. there is no need to take care of the shape of the jamming signal, or even to have such a generator. these solutions belong to the group of responsive jammers. when this approach is applied, the detected signal which has to be jammed is first delayed by optical lines with adjustable, precise delay and then transmitted as the jamming signal. the selected value of the delay determines the level of rcied activation signal attenuation. instead of this approach, we apply specific jamming signal generation, again to avoid the possibility that the rcied activation signal is not detected. the complexity of the fight against rcied activation and the development perspectives of remote control of these devices were already noticed in [27]. the bit error rate in transmission was measured there for several types of jamming procedures, thus presenting the possibilities for the fight against the devices existing at that time, but also against devices which would appear in the future. the obtained measurement results led to the development of practical devices for the fight against rcied activation [28]-[30].
in these devices very heterogeneous jamming signal types are generated: continuous wave (cw), amplitude shift keying (ask), phase shift keying (psk), frequency shift keying (fsk), comb signal (barrage jamming), sweep signal (with different sweep strategies, for example single sweep, multiple sweep, sweep with a frequency gap in which there is no sweep signal and in which jamming device management may be realized, etc.), white gaussian noise (wgn), and so on. among all these techniques, jamming by sweep signal and jamming by wgn are most often applied. the characteristics of sweep jamming are analyzed in detail in [31]-[33]. in [34] the power necessary to jam an mpsk (m-ary psk) modulated rcied activation message by a sweep signal is compared with the power necessary when wgn is used, but without considering the simultaneous influence of the sweep signal and the noise which is normally present in the system environment (environmental noise). sweep signal and wgn are in some cases combined in one unique signal, as demonstrated in [35]. a method for wgn signal generation is analyzed in [36]. the results presented in these last six papers are based on iritel's great experience in developing jamming devices for various applications: against rcied activation [37], for jamming mobile telephony systems [38] and for radio surveillance and jamming [39]. comb jamming is a special technique for generating a signal for rcied activation prevention, similar to, but more energy efficient than, barrage jamming. iritel is one of the pioneers of such jamming implementation [40], [41]. more recently, the main characteristics of comb jamming are presented in [42].
3. techniques for comb jamming realization
the main purpose of defining a comb signal for jamming is to achieve implementation characteristics similar to those of barrage jamming, but with reduced emission power.
comb signal consists of a number of discrete, usually equidistant components in the frequency spectrum. in this way a continuous part of the frequency spectrum is replaced by only one frequency component, but with the same jamming effect. there are three main methods for comb signal generation [43]: rectangular pulse train, filtered pulse train and pseudorandom sequence. the signal with the desired frequency characteristics (number of discrete frequency components, distance between components in the frequency domain) is usually first generated in a low frequency band. after that, such a signal modulates a carrier in order to be shifted to the pre-defined frequency band.
[fig. 1 principle block-scheme of rectangular pulse train jamming signal generation; blocks: rwg (pulse duration τ, period t, amplitude a), lpf, mod, posc, tx]
figure 1 presents the principle block-scheme for generating the rectangular pulse train signal. the generation process is initiated in the rectangular waveform generator (rwg), where pulses of duration τ and period t are formed. the amplitude of the pulses is a. the frequency spectrum of the generated pulses is band-limited in the low-pass filter (lpf). the frequency characteristic of this lpf is flat in the pass-band, meaning that only the undesired frequency components are truncated. the amplitudes of the frequency components in the pass-band are not changed and they remain as generated. the band-limited pulses have their frequency spectrum in a low frequency band, and this spectrum is shifted to the required higher frequency band in the modulator (mod). here the generated pulse train signal is multiplied by the signal from the programmable oscillator (posc). by changing the frequency of posc it is possible to vary the signal frequency band, i.e. to additionally sweep the generated comb signal when it is necessary to jam a wider frequency band (one such application example, for jamming mobile communication in gsm systems, may be found in [44]).
at the end the generated jamming signal is transmitted by a transmit antenna (tx). the generation of the filtered pulse train signal is a slight modification of the previous method. its principle block-scheme is equal to the one presented in figure 1. the difference is in the function of the lpf. besides limiting the pass-band width, this filter also modifies the amplitudes of the generated comb frequency components with the aim of achieving an approximately flat frequency characteristic in the pass-band. in the lpf the higher frequency components are amplified more (or, in other words, attenuated less) than the lower frequency components. modifications are also noticeable in the shape of the pulse train signal in the time domain [43]. figure 2 presents the principle block-scheme for comb signal generation according to the third method, based on a pseudorandom sequence. the initial signal is generated in the linear feedback shift register (lfsr). the period of a sequence is t and it consists of n pulses whose duration is τ (i.e. t=n·τ). the amplitude of each pulse is +a or -a. the remaining phases of the algorithm are the same as for the previous algorithms: the spectrum of the generated comb signal is filtered in the lpf and transferred to higher frequencies by modulation (blocks mod and posc).
[fig. 2 principle block-scheme of pseudorandom sequence based jamming signal generation; blocks: lfsr (pulse duration τ, period t, amplitudes +a and -a), lpf, mod, posc, tx]
4. frequency spectrum characteristics of three methods for comb signal generation
the frequency spectrum of rectangular pulse train signals is well-studied and presented in many references [43].
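as a brief aside on the pseudorandom sequence generator of figure 2, the following python sketch shows how one period of such a sequence could be produced. it assumes a 4-bit fibonacci lfsr with the feedback polynomial x⁴+x³+1 and an all-ones initial state; the register length, the taps and the function name lfsr_sequence are illustrative assumptions and are not taken from the paper.

```python
def lfsr_sequence(taps, nbits, amplitude=1.0):
    """Return one period (2**nbits - 1 pulses) of a +/-A sequence.

    taps  -- 1-based register positions XOR-ed to form the feedback bit
             (assumed to describe a primitive polynomial, e.g. (4, 3))
    nbits -- register length (hypothetical parameter, not from the paper)
    """
    state = [1] * nbits                 # any non-zero start state works
    period = 2 ** nbits - 1
    pulses = []
    for _ in range(period):
        # the last register cell drives the output: 1 -> +A, 0 -> -A
        pulses.append(amplitude if state[-1] else -amplitude)
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]
        # shift right by one cell and insert the feedback bit at the front
        state = [feedback] + state[:-1]
    return pulses


if __name__ == "__main__":
    seq = lfsr_sequence((4, 3), 4)
    print(len(seq), sum(seq))
```

for this register n=15, and the period contains eight pulses of +a and seven of -a, so the mean value of the sequence is a/n. longer registers follow the same pattern, provided that the chosen taps correspond to a primitive polynomial.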
this spectrum is discrete with equidistant components and may be expressed by the equation
$$P(f)=\frac{A^{2}\tau^{2}}{T^{2}}\sum_{k=-\infty}^{\infty}\frac{\sin^{2}\left(\pi k\tau/T\right)}{\left(\pi k\tau/T\right)^{2}}\,\delta\!\left(f-\frac{k}{T}\right) \qquad (1)$$
where $P(f)$ is the signal power spectral density, $\delta(f-k/T)$ marks the positions of the discrete frequency components, and the remaining part of the equation is the power envelope of the frequency components. the meaning of the variables $A$, $\tau$ and $T$ (pulse amplitude a, pulse duration τ and period t) is already illustrated in fig. 1. the frequency spectrum of the signal shaped as the rectangular pulse train is presented in fig. 3. such a signal is obtained from the comb signal generator in fig. 1. the envelope of the spectral components is a function of the form (sin(x)/x)², and the number of frequency components in the main lobe is selected by the ratio $K=T/\tau$. the function of the lpf is to pass a certain number of components from the main lobe, leaving their amplitudes unchanged. in the example in fig. 3, $K=6$ and the lpf passes a total of 2·i+1 frequency components, where i=3 is the number of non-attenuated frequency components on each side of the central component.
[fig. 3 frequency spectrum of rectangular pulse train signal; p(f) versus f, components k=-6 to k=6, $K=T/\tau=6$, i=3]
the frequency spectrum of the signal shaped as the filtered rectangular pulse train is presented in fig. 4. its initial shape is equal to the one presented in fig. 3, with the addition that the lpf characteristic (the curve designated by lpf in figure 4) has to approximate the reciprocal of the function (sin(x)/x)² in the filter pass-band. in this way the 2·i+1 transferred frequency components at the generator output have approximately the same level.
[fig. 4 frequency spectrum of filtered rectangular pulse train signal; p(f) versus f, components k=-6 to k=6, $K=T/\tau=6$, i=3, with the lpf characteristic]
similar to the case of the rectangular pulse train, the frequency spectrum of the pseudorandom sequence signal may be presented by the equation
$$P(f)=\sum_{k=-\infty}^{\infty}P_{k}\,\delta\!\left(f-\frac{k}{N\tau}\right) \qquad (2)$$
where the coefficients $P_{k}$, which model the frequency spectrum envelope, are
$$P_{k}=\frac{1}{N^{2}} \quad \text{for } k=0 \qquad (3)$$
$$P_{k}=\frac{N+1}{N^{2}}\,\frac{\sin^{2}\left(\pi k/N\right)}{\left(\pi k/N\right)^{2}} \quad \text{for } k\neq 0 \qquad (4)$$
the variables a, n and τ are already defined in fig. 2 and in the explanation of that figure.
[fig. 5 frequency spectrum of the pseudorandom sequence signal; p(f) versus f, components k=-6 to k=6, $N=T/\tau=6$, i=3]
fig. 5 presents the frequency spectrum of the generated pseudorandom sequence signal [43], [45]. compared with the frequency spectrum of the rectangular pulse train (fig. 3), the difference exists at the component k=0. this component has a very low level (it is nearly eliminated) compared with the other components in the main lobe, because the typical values of n are more than 10. as the frequency spectrum of the pseudorandom sequence signal is otherwise similar to the spectrum of the rectangular pulse train, all analyses in the continuation of the paper are performed only for the rectangular pulse train. now that we have explained the main characteristics of the comb signal in the time and frequency domains, the logical question is: what are the possibilities for generating this signal and implementing it in practice? if we want to have a wide main frequency lobe, the rectangular pulse duration τ should be very short. in a hardware sense it is difficult to generate such an impulse with a significant amplitude level. on the other hand, if we adopt a longer τ, there are fewer frequency components in the main lobe and more additional hardware processing is needed to expand the frequency spectrum. this means that we need more modulators and programmable oscillators connected as in fig. 1 or fig.
2 to realize complete solution. generally, the process of shifting and shaping the frequency spectrum of a comb signal which is generated in a lower frequency band is also challenging. these are the reasons why comb jamming is not often applied in practice.
5. performances comparison of sweep and barrage jamming
the main purpose of comb jamming implementation is to achieve the same benefits as with barrage jamming, but with lower emission power. our first step in such an analysis is to compare the performances of pure sweep and pure barrage jamming. such an analysis has already been performed approximately in [34] for mpsk modulated signals. the deviation from the accurate result is mainly caused by the assumption that only one bit error per symbol is possible regardless of the jamming signal level. in other words, the situation when both bits in a qpsk symbol are faulty is counted as only one faulty bit.
[fig. 6 ber (pb) as a function of the ratio s/n in the case of barrage jamming and as a function of s/i in the case of sweep jamming; pb from 0.01 to 1 versus s/n (db), s/i (db) from -20 to 8; curves: s/n for s/i=60db, s/i for s/n=60db]
in this paper we implemented a more accurate comparison for the case of qpsk signal jamming. the exact number of faulty bits in a symbol is taken into account in the estimation process. the estimation is based on our originally developed simulation program, which is already presented in [35]. the purpose of the simulation program is to determine the bit error rate (ber) when an mpsk modulated signal is jammed by the simultaneous influence of sweep and barrage jamming.
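to make the idea of counting the exact number of faulty bits concrete, a minimal monte carlo sketch in python is given below. it is not the simulation program from [35]; it only illustrates the principle under simplifying assumptions: gray-mapped qpsk with unit signal power, complex gaussian noise whose total power is set by s/n, and interference modelled as a constant-amplitude tone with a uniformly random phase in each symbol (a strong simplification of a sweep signal). the function name qpsk_ber and its parameters are hypothetical.

```python
import math
import random


def qpsk_ber(sn_db, si_db, n_symbols=20000, seed=1):
    """Estimate the bit error rate of Gray-mapped QPSK by Monte Carlo.

    sn_db -- signal-to-noise ratio in dB (assumed: total complex noise power)
    si_db -- signal-to-interference ratio in dB (assumed: single tone)
    """
    rng = random.Random(seed)
    sigma = math.sqrt(10 ** (-sn_db / 10) / 2)   # noise std per I/Q branch
    tone = math.sqrt(10 ** (-si_db / 10))        # interferer amplitude
    bit_errors = 0
    for _ in range(n_symbols):
        b0, b1 = rng.getrandbits(1), rng.getrandbits(1)
        # unit-power QPSK symbol: bit 0 -> positive branch, bit 1 -> negative
        sx = (1 - 2 * b0) / math.sqrt(2)
        sy = (1 - 2 * b1) / math.sqrt(2)
        phi = rng.uniform(0.0, 2.0 * math.pi)    # random interferer phase
        rx = sx + tone * math.cos(phi) + rng.gauss(0.0, sigma)
        ry = sy + tone * math.sin(phi) + rng.gauss(0.0, sigma)
        # each bit is checked separately, so a symbol with two wrong bits
        # contributes two errors instead of one
        bit_errors += (rx < 0) != (b0 == 1)
        bit_errors += (ry < 0) != (b1 == 1)
    return bit_errors / (2 * n_symbols)


if __name__ == "__main__":
    for sn in (-10, -5, 0, 5):
        print(sn, qpsk_ber(sn, 60.0))
```

setting one of the two ratios to 60db makes the corresponding disturbance negligible, so the same function can be used to study either disturbance on its own.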
for the implementation in this paper we select one of the two jamming signals to have a very low level. in order to simulate barrage jamming, we have defined the sweep signal level by s/i=60db, and in order to simulate sweep jamming we have defined the noise level by s/n=60db, where s is the level of the qpsk modulated rcied activation signal and i and n are the levels of the sinusoidal interference and of the noise signal, respectively. fig. 6 presents the ber values as a function of the ratio s/n when barrage jamming is implemented and as a function of s/i when sweep jamming is implemented. for ber values greater than 0.1 (which are of interest in jamming applications) it is necessary to apply a higher interference signal level in the case of barrage jamming to achieve the same ber as with sweep jamming. the difference in interference level is about 3db for ber=0.2, 4db for ber=0.3 and 4.8db for ber=0.4.
6. comb jammer parameters definition
the jammed bandwidth is usually the initial condition which has to be defined in each jammer realization. this bandwidth is then translated into the bandwidth relevant for comb jammer design. let us suppose that 2·i+1 is the number of discrete frequency components which is expected to cause jamming effectively. this number of frequency components is odd, but the generality of the results is not lost, because we may always select one component more than is necessary. the second parameter which has to be specified at the beginning is the desired ber value. the first problem in jammer design is to determine the optimum number of frequency components in the main lobe of the comb signal before the lpf when the number of generated jamming frequencies is known. the optimum number of frequency components is selected so that the jamming signal emission power is minimized for the pre-defined ber.
when a rectangular pulse train or a filtered rectangular pulse train is designed, the problem reduces to the selection of the ratio τ/t. the comb jamming signal is presented as the sum of a number of frequency components. according to the shape of the frequency spectrum in figure 3 for the rectangular pulse train, the highest frequency component in the main lobe which is passed through the lpf (i.e. the component of order i) has the minimum level. the comb jammer has to be designed so that this component satisfies the desired ber value. as a consequence, all other frequency components after the lpf have a higher level than the component of order i and thus cause a higher ber value. the fact that the comb signal has the minimum power means that its amplitude a is minimum. two opposite effects influence the value of a. first, if we select a lower value of the ratio τ/t, there will be more frequency components in the main lobe and the frequency components after the lpf will tend to be equal. the effect of this modification is a lower value of a. but, according to equation (1), a lower value of τ/t means that the multiplication factor in front of the part of the form (sin(x)/x)² is decreased, and it is necessary to compensate this effect by a higher value of a. that is why there is a ratio τ/t for which the signal amplitude a is minimum. this is illustrated in figure 7. three characteristics are presented. each of them is for the same width of the filter pass-band, i.e. an equal signal period t, but for a different pulse width τ.
[fig. 7 frequency spectrum of rectangular pulse for i=3 and three different values of τ; p(f) versus f, components k=-6 to k=6, curves 1, 2 and 3]
the curve 1 in figure 7 corresponds to the case when the number of frequencies in the main lobe is significantly higher than the number of frequencies which have to cause jamming.
signal energy in the main lobe is distributed over a relatively high number of frequency components, each of which has a relatively low level. the curve 2 is the opposite case, when a low number of frequencies are in the pass-band. these frequencies have a higher level than in the previous case. the curve 3 is in the middle when considering the signal level at f=0, but its level at the frequency f=i is maximal. the problem of determining the optimum ratio τ/t is now solved by finding the first and the second derivative of the expression (1) at the point i, because it is necessary to find when the power in this point is maximal. the first derivative, when considering only the spectrum envelope in (1), is expressed as

\[ \frac{dP}{dx} = A^2 \sum_{k=-i}^{i} \frac{\sin(2\pi k x)}{\pi k} \tag{5} \]

while the second derivative is

\[ \frac{d^2P}{dx^2} = 2A^2 \sum_{k=-i}^{i} \cos(2\pi k x) \tag{6} \]

where x=τ/t and the components between k=-i and k=i are passed through the lpf. according to the real conditions from figure 3, it must be i<(1/x). in the point i the expressions (5) and (6) become

\[ \left(\frac{dP}{dx}\right)_i = A^2 \, \frac{\sin(2\pi i x)}{\pi i} \tag{7} \]

and

\[ \left(\frac{d^2P}{dx^2}\right)_i = 2A^2 \cos(2\pi i x) \tag{8} \]

the expression (7) is equal to 0 if the condition

\[ x = \frac{1}{2i} \tag{9} \]

is satisfied, meaning that the function has an extreme at this point. for this x the value of the second derivative according to (8) is less than 0, which proves that the emission power in the point defined by (9) is really the maximum. it further means that the system gain needs only its minimum value to reach the desired power level, so the emission power is minimal in that case.

7. emission power relation of comb and barrage jamming

we have already emphasized that the intention of comb jamming implementation is to produce the same effect as with barrage jamming, but with reduced jammer emission power. that is why we now compare the necessary jamming power for these two jamming strategies. let us suppose that we wish to cause jamming in a total of 2·i+1 channels.
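returning briefly to the optimum in (9): it can be checked numerically with a short sketch. the code below assumes that the power of the component of order i follows the usual rectangular pulse train envelope, a^2·x^2·(sin(πix)/(πix))^2 with x=τ/t; equation (1) is not reproduced in this part of the text, so this form, the function names and the grid resolution are illustrative assumptions rather than the authors' program.

```python
import math


def component_power(i, x, amplitude=1.0):
    """Power of the spectral line of order i for duty ratio x = tau/T.

    Assumed envelope: A^2 * x^2 * (sin(pi*i*x) / (pi*i*x))^2,
    which simplifies to A^2 * sin^2(pi*i*x) / (pi*i)^2 for i != 0.
    """
    if i == 0:
        return (amplitude * x) ** 2
    return (amplitude * math.sin(math.pi * i * x) / (math.pi * i)) ** 2


def best_duty_ratio(i, steps=100_000):
    """Grid search for the x in (0, 1/i) that maximizes the i-th line power.

    The upper bound 1/i reflects the main-lobe condition i < 1/x.
    """
    upper = 1.0 / i
    candidates = (upper * n / steps for n in range(1, steps))
    return max(candidates, key=lambda x: component_power(i, x))


if __name__ == "__main__":
    for order in (1, 2, 3, 5):
        print(order, round(best_duty_ratio(order), 4), round(1 / (2 * order), 4))
```

for i=3 the search settles at x≈0.167, which is the τ/t value used in table 1.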
the classical solution is to implement a noise signal for jamming which continuously covers the frequency band of these channels. the improved possibility is to implement only one jamming frequency in each channel. the characteristics presented in figure 6 correspond to each one of the 2·i+1 considered channels, i.e. frequency components. as a consequence, the benefits of the filtered rectangular pulse train are directly obvious from figure 6. namely, the power of each frequency component in the filtered pulse train signal is equal, and each is lower by the same amount than the uniform noise jamming power needed to cause the same ber. in this way the total effect of jamming in all channels is also equal to the one presented in figure 6. the decrease in the necessary emission power when comb jamming is implemented is δp1fp=3db for ber=0.2, δp2fp=4db for ber=0.3 and δp3fp=4.8db for ber=0.4. the benefits are smaller when a rectangular pulse train or a pseudorandom sequence signal is implemented. to determine the improvement in emission power in this case, we start from the calculation of the total emission power relative to the case of uniform spectrum emission power. our estimation is illustrated by the example i=3, meaning that a total of 7 frequency components are passed through the lpf. according to the problem defined earlier, the components at i=3 need to have equal power. table 1 illustrates the procedure to determine the ratio of comb signal power to barrage signal power when the sinusoidal component level at i=3 is equal to the power at the same frequency in the case of the filtered pulse train, and therefore also to the level of the barrage (noise) signal. the column with the designation prel presents the ratio of the power of the considered sinusoidal component of order k to the unity power.
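the procedure behind table 1 can be sketched in a few lines. the code assumes that prel follows the rectangular pulse train envelope sin^2(πkx)/(πk)^2 (x^2 for k=0) with x=τ/t=0.167, and that the "total" row is the mean of the seven per-component ratios; both are assumptions inferred from the tabulated numbers, and the function names are hypothetical. recomputing from these assumptions reproduces the listed prel values and the total of about 1.768 (about 2.5db), and gives a ratio of about 2.477 for k=0.

```python
import math


def relative_power(k, x):
    """Assumed relative power of the component of order k for duty ratio x."""
    if k == 0:
        return x * x
    return (math.sin(math.pi * k * x) / (math.pi * k)) ** 2


def comb_to_barrage(i, x):
    """Return per-component powers, their ratios to the edge component,
    the mean ratio over all 2*i+1 components, and that mean in dB."""
    p_rel = {k: relative_power(k, x) for k in range(-i, i + 1)}
    edge = p_rel[i]  # the weakest passed component sets the barrage reference
    ratios = {k: p / edge for k, p in p_rel.items()}
    mean_ratio = sum(ratios.values()) / len(ratios)
    return p_rel, ratios, mean_ratio, 10 * math.log10(mean_ratio)


if __name__ == "__main__":
    p_rel, ratios, mean_ratio, mean_db = comb_to_barrage(3, 0.167)
    for k in sorted(p_rel):
        print(f"{k:>3} {p_rel[k]:.6f} {ratios[k]:.3f}")
    print(f"total {mean_ratio:.3f}  total (db) {mean_db:.2f}")
```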
the last two rows in the table present the power ratio of the total 7 frequency components after the lpf to the uniform power in the same frequency band. the data in the last column of table 1 are graphically presented in fig. 8. it illustrates the power level ratio of the frequency components of the rectangular pulse train signal to the barrage signal, where the rectangular pulse train signal has (at least) the same jamming effect as the barrage signal. the calculated power difference of 2.5db has to be subtracted from the power saving obtained with the filtered rectangular pulse train signal to obtain the equivalent power saving for the rectangular pulse train. therefore, in the case of rectangular pulse train implementation, the power saving is δp1p=0.5db for ber=0.2, δp2p=1.5db for ber=0.3 and δp3p=2.3db for ber=0.4. these values are significantly lower than the values for the filtered pulse train, thus confirming the benefits of power spectrum equalization.

table 1 power ratio of comb signal for rectangular pulse train to barrage jamming

k            τ/t      prel        pcomb/pbarrage
-3           0.167    0.011258    1
-2           0.167    0.019044    1.692
-1           0.167    0.025422    2.258
 0           0.167    0.027889    2.398
 1           0.167    0.025422    2.258
 2           0.167    0.019044    1.692
 3           0.167    0.011258    1
total                             1.768
total (db)                        ≈2.5

fig. 8 power spectrum ratio graphical presentation for rectangular pulse train to barrage signal (axes: pcomb/pbarrage versus k, k=-3 to k=3; barrage level shown for reference)

8. conclusions

this paper starts with a comprehensive presentation of iritel contributions in the area of rcied activation jamming. after that, the analysis is directed towards comb jamming. comb jamming is a wide-band jamming strategy. it efficiently replaces the more often implemented barrage jamming strategy. the available literature only emphasizes the fact that comb jamming signal power is lower than barrage jamming power, without any attempt to quantitatively support this statement [5], [46].
the main contribution of the paper is a quantitative estimation of the emission power difference between comb and barrage jamming under the criterion of the same achieved ber value in both cases. the analysis in the paper considers the three most often implemented strategies for comb jamming signal generation: rectangular pulse train, filtered pulse train and pseudorandom sequence. it is shown that power equalization of all generated frequency components, as obtained with the filtered pulse train signal, achieves an additional 2.5db improvement in power saving. in this way the power saving is more than doubled compared to the pulse train signal. the second contribution of the paper is the determination of the exact ber value when barrage jamming of the rcied activation message is applied. in our previous contributions we used only an approximate calculation of this value [34]. the exact value is obtained using our original simulation program. our other direction of jammer development is related to the prevention of malicious drone missions. modern drone communication channels are often realized using broadband techniques [47]: frequency hopping spread spectrum (fhss) [48] or direct sequence spread spectrum (dsss) [49]. comb jamming is highly suitable for jamming these two signal types due to its ability to cover a large bandwidth with moderate emission power. the solutions presented in this paper are the first step in the future development of broadband jamming of drone communication signals.

references

[1] g. kumaraswamy rao and k. v. ranga rao, "intelligent jamming solution to defeat the growing menance of remotely controlled improvised devices (rcieds) using electronic counter measures", int. j. electron. commun. comput. eng., vol. 4, no. 5, pp. 1479–1488, 2013. [2] m. e.
pesci, "systems engineering in counter radio-controlled improvised explosive device electronic warfare", john hopkinsapl technical digest, vol. 31, no. 1, pp. 58–65, 2012. [3] j. haystead, "defeat ied mission expands to defensive electronic attack (dea)", the j. electron. defense, pp. 28–40, 2015. [4] k. wilgucki, r. urban, g. baranowski, p. grądzki and p. skarźyński, automated protection system against rcied, military communications and information technology. chapter 7: cognitive radio and spectrum management techniques, 2012, pp. 593–601. [5] r. poisel, modern communications jamming principles and techniques. boston/london, second edition, artech house, 2011. [6] k. wilgucki, r. urban, g. baranowski, p. grądzki and p. skarźyński, "selected aspects of effective rcied jamming", in proceedings of the military communications and information systems conference, warsaw, 2012, pp. 1–5. [7] k. burda, "the performance of follower jammer with a wideband scanning receiver", j. electr. eng., vol. 55, no. 1–2, pp. 36–38, 2004. [8] s. d’oro, l. gallucio, g. morabito and s. palazzo, "efficiency analysis of jamming-based countermeasures against malicious timing channel in tactical communications", in proceedings of the ieee international conference on communications icc, budapest, 2013, pp. 4020–4024. [9] s. amuru and r. m. buehrer, "optimal jamming strategies in digital communications / impact of modulation", in proceedings of the ieee global communications conference (globecom), 2014, pp. 1619–1624. [10] s. amuru and r. m. buehrer, "optimal jamming against digital modulation", ieee trans. inf. forensics secur., vol. 10, no. 10, pp. 2212–2224, 2015. [11] j. mietzner, p. nickel, a. meusling, p. loos and g. bauch, "responsive communications jamming against radio-controlled improvised explosive devices", ieee commun. mag., vol. 50, no. 10, pp. 38–46, 2012. [12] m. tanatwy, "responsive communication jamming detector with noise power fluctuation using cognitive radio", int. j. 
innovative res. comput. commun. eng., vol. 2, no. 10, pp. 5967–5973, 2014. [13] t. trump and i. müürsepp, "detection speed of responsive communication jamming detectors, recent advances in telecommunications and circuits", in proceedings of the 2nd international conference on circuits, systems, communications, computers and applications, dubrovnik, 2013, pp. 149–154. [14] m. wilhelm, i. martinović, j. schmitt and v. lenders, "reactive jamming in wireless networks: how realistic is the threat?", in proceedings of the 4th acm conference on wireless network security (wisec '11), acm, hamburg, 2011, pp. 47–52. [15] g. evans, "a new weapon in the fight against rcieds, army technology", august 2015, https://www.army-technology.com/features/featurea-new-weapon-in-the-fight-against-rcieds-4647155/. [16] selena electronics, rss intelligent reactive stationary jammer and rsv vehicle reactive jammer. in electronics warfare systems: jamming solution, 2015. [17] m. mileusnić, p. petrović, a. lebl and b. pavić, "comparison of rcied activation responsive and active jamming reliability", in proceedings of the 6th international conference icetran 2019. srebrno jezero, 2019, pp. 988–993, awarded as the best paper in the section of telecommunications. [18] m. mileusnić, p. petrović, v. kosjer, a. lebl and b. pavić, "reliability analysis of different rcied activation signal responsive jamming techniques and their comparison to active jamming", fu electr. energ., vol. 33, no. 3, pp. 459–476, 2020. [19] a. gulyás, "the radio controlled improvised explosive device (rcied) threat in afghanistan", aarms, vol. 12, no. 1, pp. 1–11, 2013. [20] oss net, survey of rcieds southeast asia – feb 2003-oct 2005. oss southeast asia division, 2005. [21] f. e. idachaba, "algorithm for source mobile identification and deactivation in sms triggered improvised explosive devices", procedia eng., vol. 78, pp. 96-101, 2014. [22] stratign, "radio jammers", https://www.stratign.com/radio-jammers/. 
[23] security & counterintelligence group llc, "lightning: rcied jamming system – vehicle installed", https://scgroup-ltd.com/lightning/. [24] j. magiera, "wideband signal generation for jamming radio-controlled improvised explosive devices", in proceedings of the 41st international conference on telecommunications and signal processing (tsp). athens, 2018, pp. 1–4. [25] m. e. belkin, a. alyoshin, d. fofanov and a. s. sigov, "studying microwave-photonics design principle of a responsive jammer for radio-controlled explosive devices", tech. phys. lett., vol. 46, no. 11, pp. 1132–1135, 2020. [26] m. e. belkin, l. zhukov and n. smirnov, "devising an optimal time-delay circuit configuration for a microwave-photonics-based radio communication jammer", in proceedings of the 29th telecommunications forum (telfor), belgrade, 2021, pp. 440–443. [27] p. petrović and m. šunjevarić, "radio surveillance and jamming systems and techniques", trends in telecommunications, pp. 17.1–17.22, belgrade, november 1988 (in serbian: p. petrović, m. šunjevarić, "savremeni sistemi i tehnike za radio-izviđanje i ometanje", pravci razvoja telekomunikacija, str. 17.1–17.22, beograd, novembar 1988). [28] iritel high frequency (hf) radio surveillance and jamming system, chapter in the book m. streetly, jane’s radar and electronic warfare systems. ihs global limited, 2011. [29] iritel very/ultra high frequency (v/uhf) radio surveillance and jamming system, chapter in the book m. streetly, jane’s radar and electronic warfare systems. ihs global limited, 2011. [30] m. mileusnić, p. petrović, b. pavić, v. marinković-nedelicki, j. glišović, a. lebl and i.
marjanović, "the radio jammer against remote controlled improvised explosive devices", in proceedings of the 25th telecommunications forum (telfor), belgrade, 2017, pp. 151–154. [31] m. mileusnić, b. pavić, v. marinković-nedelicki, p. petrović, d. mitić and a. lebl, "analysis of jamming successfulness against rcied activation", in proceedings of the 5th international conference icetran 2018. palić, 2018, pp. 1206–1211, paper awarded as the best one in the section of telecommunications. [32] m. mileusnić, b. pavić, v. marinković-nedelicki, p. petrović, d. mitić and a. lebl, "analysis of jamming successfulness against rcied activation with the emphasis on sweep jamming", fu electron. energ., vol. 32, no. 2, pp. 211–229, 2019. [33] v. marinković-nedelicki, a. lebl, m. mileusnić, p. petrović and b. pavić, "ber calculation for sweep jamming of mpsk modulated rcied activation message signals", in proceedings of the 18th international symposium "infoteh jahorina 2019". jahorina, 2019, pp. 1–6. [34] m. mileusnić, p. petrović, b. pavić, v. marinković-nedelicki, v. matić and a. lebl, "jamming of mpsk modulated messages for rcied activation", in proceedings of the 8th international scientific conference on defensive technologies oteh, belgrade, 2018. [35] v. marinković-nedelicki, a. lebl, m. mileusnić and p. petrović, "combined jamming in rcied activation prevention", in proceedings of the 19th international symposium “infoteh jahorina 2020”. jahorina, 2020, pp. 1–6. [36] a. lebl, m. mileusnić, b. pavić, v. marinković-nedelicki and p. petrović, "programmable generator of pseudo-white noise for jamming applications", in proceedings of the 27th telecommunications forum (telfor). belgrade, 2019, pp. 1–4. [37] p. petrović, n. remenski, p. jovanović, v. tadić, b. pavić, m. mileusnić and b. mišković, wrj 2004 wideband radio jammer against rcieds. 
technical solution – new product within technological development project tr32051 "development and realization of the next generation of systems, devices and software based on software radio for radio and radar networks" (in serbian), http://www.iritel.com/images/pdf/wrj2004-e.pdf, 2011. [38] n. remenski, b. pavić, p. petrović, m. mileusnić and v. marinković-nedelicki, integrated radio equipment for protecting an area from mobile communications (third generation of radio equipment) (in serbian). technical solution – new product designated cj-1p within technological development project tr-11030 "development and realization of a new generation of software, hardware and services based on software radio for dedicated applications", http://www.iritel.com/images/pdf/cj-1p-e.pdf, 2010 (also published in the book m. streetly, jane’s radar and electronic warfare systems. ihs global limited, 2011). the first generation of the radio equipment, designated cj-1, was realized within technological development project tr6149b, 2006. [39] p. petrović, m. mileusnić, b. pavić, v. tadić and v. marinković-nedelicki, development of a new generation of systems for radio surveillance and jamming in the hf and vhf/uhf bands (in serbian). technical solution within project 10 m 06, ministry of science and technology of serbia, fund for scientific development, 1997-2000. [40] p. petrović, generator of jamming signals gemos. technical solution, 1990. [41] p. petrović, development of new generation of gemos devices and signal classifier based on dsp technology. technical solution, 1999. [42] a. lebl, m. mileusnić and j. radivojević, "combined and comb rcied activation messages jamming – two different strategies with similar names", sci. tech. rev., vol. 70, no. 1, pp. 21–28, 2020. [43] b. a. black, on the generation of waveforms having comb-shaped spectra. nrl memorandum report 619, naval research laboratory, may 1988. [44] r. e. stoddard, multi-band jammer. patent no. us7697885 b2, 2010, pp. 1–7. [45] x. song, x. wang, z. dong, x. zhao and x. feng, "pseudo-random sequence correlation identification parameters and anti-noise performance", energies, vol. 2018, no. 11, pp. 1–18, 2018. [46] m. r. frater and m. ryan, electronic warfare for the digitized battlefield. artech house inc., 2001. [47] v. chamola, p. kotesh, a. agarwal, naren, n. gupta and m. guizani, "a comprehensive review of unmanned aerial vehicle attacks and neutralization techniques", ad hoc networks, vol. 111, p. 102324, 2021. [48] h.-b. kil, j.-s. lee and e.-r. jeong, "analysis of frequency hopping signals in commercial drones", int. j. pure appl. math., vol. 118, no. 19, pp. 2015–2024, 2018. [49] b. m. todorović and v. d. orlić, "direct sequence spread spectrum scheme for an unmanned aerial vehicle ppm control signal protection", ieee commun. lett., vol. 13, no. 10, pp. 727–729, 2009.

facta universitatis series: electronics and energetics

using homing, synchronizing and distinguishing input sequences for the analysis of reversible finite state machines

martin lukac1, michitaka kameyama2, marek perkowski3 and pawel kerntopf4
1nazarbayev university, astana, kazakhstan
2ishinomaki senshu university, ishinomaki, japan
3portland state university, portland, oregon, usa
4university of lodz, lodz, poland

abstract: a digital device is called reversible if it realizes a reversible mapping, i.e., one for which there exists a unique inverse. the field of reversible computing is devoted to studying all aspects of using and designing reversible devices. during the last 15 years this field has been developing very intensively due to its applications in quantum computing, nanotechnology and reducing the power consumption of digital devices. we present an analysis of reversible finite state machines (rfsm) with respect to three well known sequences used in the testability analysis of classical finite state machines (fsm).
the homing, distinguishing and synchronizing sequences are applied to two types of reversible fsms, the converging fsm (crfsm) and the non-converging fsm (ncrfsm), and the effect is studied and analyzed. we show that while only certain classical fsms possess all three sequences, crfsms and ncrfsms have properties that allow one to directly determine what type of sequences these machines possess.

keywords: reversible logic, finite state machines, testing

facta universitatis, series: electronics and energetics, vol. 32, no 3, september 2019, pp. 417-438, https://doi.org/10.2298/fuee1903417l
received august 28, 2018; received in revised form april 23, 2019
corresponding author: martin lukac, 53 kabanbay batyr ave, block 7, office 7237, nur-sultan city, republic of kazakhstan, 010000 (e-mail: martin.lukac@nu.edu.kz)

1 introduction

one of the problems when designing sequential logic is the ability to efficiently generate tests, apply them to the circuit under test and design an easily testable circuit (design for test, dft). in classical finite state machines (fsms) this issue is well studied and various techniques exist to determine an fsm’s testability [1–11]. among the desired characteristics of a testable fsm are the identification of an unknown current state and the ability to bring the fsm to a known state. these properties are verified by the homing, synchronizing and distinguishing input sequences.
using these sequences it is possible to establish whether a classical fsm
• possesses a transition path from the initial to the final state,
• possesses a transition path from any state to any other state and
• allows, by only observing the machine’s output sequence, to reach an arbitrary state from a different arbitrary state.

the motivation for the construction of automata possessing a particular set of sequences can be appreciated by observing the advantages of each of the sequences separately. the homing sequences have been successfully used in hardware fault detection [12] and in machine learning [13, 14]. the synchronizing sequence has been successfully used in various designs where its existence simplifies the circuit implementation and testing. finally, the distinguishing sequence is used to build a checking sequence used to verify whether an implementation of an fsm is consistent with its specification [2, 10]. consequently, an fsm that possesses the desired sequences would be highly testable and thus both practical in industrial applications and useful for theoretical research.

the general area of sequential reversible circuits and automata has been explored using several traditional approaches. in [15, 16], optimized d and jk latches have been proposed. in [17] it was proposed to build reversible components for sequential circuits based on toffoli reversible gates, and in [18] the elements of sequential reversible circuits were implemented using conservative fredkin logic gates. in [19] the reversible t-flip-flop was proposed to be built using custom gates. in [20] the authors proposed to build a memory cell from two toffoli gates in standard cmos. in [21] a reversible double-edge triggered flip-flop was built on an fpga. testable sequential devices based on quantum cellular automata have been explored in [22, 23]. reversible sequential circuits have also been designed in multiple-valued logic such as in [24].
in the construction of reversible fsms (rfsms), the reversible computation imposes the reversibility constraints and thus limits the construction [25, 26]. this makes the rfsms harder to construct and more expensive because ancilla bits and additional logic are required to preserve or achieve the reversibility [27–30]. consequently, applying methods based on the established classical fsms to rfsms can determine if the established classical fsm methods are sufficient, need to be improved or if unique new methods must be designed when dealing with the reversible computational paradigm.

rfsms have been studied for their properties of universality [25] and furthermore extensive studies have been conducted in the area of reversible cellular automata (rca) [31–35]. an rca can be seen as an rfsm with spatial constraints. an rca maps input states to next states using local rules. a local rule is a function of k spatially closest inputs that is repeatedly applied to all n inputs.
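as an illustration of a local rule applied to all n inputs, the sketch below uses elementary cellular automata (k = 3, binary cells on a ring) and checks by brute force whether the resulting global map is one-to-one. the wolfram rule numbering, the ring boundary and the two example rules (170, a shift, and 90, an xor of the two neighbours) are assumptions chosen for illustration and are not taken from the paper.

```python
from itertools import product


def step(cells, rule):
    """Apply an elementary (k = 3) local rule to every cell of a ring.

    `rule` is a Wolfram rule number: bit (4*left + 2*centre + right)
    of the number gives the next value of the centre cell.
    """
    n = len(cells)
    nxt = []
    for j in range(n):
        left, centre, right = cells[j - 1], cells[j], cells[(j + 1) % n]
        index = 4 * left + 2 * centre + right
        nxt.append((rule >> index) & 1)
    return tuple(nxt)


def is_reversible_ca(rule, n):
    """True if the global map on all 2^n ring configurations is a bijection."""
    images = {step(state, rule) for state in product((0, 1), repeat=n)}
    return len(images) == 2 ** n


if __name__ == "__main__":
    print(is_reversible_ca(170, 5))  # shift rule: every configuration has one predecessor
    print(is_reversible_ca(90, 5))   # xor rule: all-zeros and all-ones collide
```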
while rca have been studied for their effective implementation and powerful computation abilities [36] no study on the testability of rca has been done. similarly, up to now there have been no serious efforts to explore the general testability of rfsms from the point of view of the well established testing techniques such as those used in testing of classical irreversible fsms. specifically, the impact of the three above introduced sequences has not been studied at all for rfsms and for the fsms embedded in rfsms. some studies into testability have been performed from the classical point of view such as in [37]. in this paper the constraints of the reversible-permutative and unitary matrices used to specify fsms are studied with respect to the three above introduced sequences. we assume that an rfsm is specified by a permutation matrix and our analysis is limited to such permutative and discrete rfsms. we apply the three types of sequences to rfsms and determine their power when used on reversible sequential devices. the main contributions of this paper are: 1. the analysis of the homing, synchronizing and distinguishing sequences for rfsms, 2. criteria for rfsms in order to have the homing, distinguishing, or synchronizing sequences. the paper is organized as follows. first, background on the classical and reversible fsms is given in section 2. section 3 describes and defines the terms and concepts necessary for the understanding of the three considered sequences and section 4 shows the application and analysis of the sequences related to both the converging and non-converging finite state machines crfsms/ncrfsms. section 5 concludes the paper by summarizing our results. 2 background let a and b be finite non-empty sets. definition 1 (balanced logic function). a function f : a → b is balanced if 418 m. lukac, m. kameyama, m. perkowski, p. kerntopf using homing, synchronizing and distinguishing input sequences for the analysis of reversible finite state machines... 
for every value $b \in B$ there is the same number of combinations of input variable values in its domain $A$.

Definition 2 (Reversible logic function). A function $f : A \to B$ is reversible if it is one-to-one and onto. In other words, $f(x_a) = f(x_b) \implies x_a = x_b$, and for every $y \in B$ there is an $x \in A$ such that $f(x) = y$.

Table 1: Example of (a) an irreversible logic function $f$, (b) a reversible logic function $f_r$.

(a) f              (b) f_r
ab   a'b'          ab   a'b'
00   01            00   00
01   11            01   11
10   00            10   01
11   11            11   10

Table 1a shows an example of an irreversible logic function and Table 1b an example of a reversible function.

Definition 3 (Permutative matrix). A permutative matrix for $n$ input variables is a sparse matrix with binary coefficients performing a reversible function $f : B^n \to B^n$.

Reversible functions in this paper will be represented by truth tables and permutative matrices. The matrices corresponding to the functions of Tables 1a and 1b are shown in Eqs. (1) and (2), respectively.

$$\begin{pmatrix} 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 \end{pmatrix} \tag{1}$$

$$\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \end{pmatrix} \tag{2}$$

Definition 4 (Indexing variable). The indexing variable of a block diagonal matrix is the input variable that allows the block diagonal matrix to be separated into independent reversible matrices.

Definition 5 (Reversible logic gate). A reversible logic gate (or circuit) on $n$ variables realizes an $n \times n$ reversible function $f : I \to O$.

Definition 6 (Controlled reversible logic gate). A $k$-controlled reversible logic gate on $k + 1$ variables performs a logic function on the target variable $i_{k+1}$ while leaving all control variables $i_1, \dots, i_k$ unchanged, such that

$$f_{k+1}(i_1, \dots, i_k, i_{k+1}) = \begin{cases} 1 \oplus i_{k+1} & \text{if } i_j = 1 \text{ for all } j = 1, \dots, k, \\ i_{k+1} & \text{otherwise.} \end{cases} \tag{3}$$

That is, it realizes a one-variable balanced function on the target variable if all $k$ control variables have value 1. A $k$-controlled reversible logic gate on $k + 1$ variables uses $k$ control variables and one target variable, such that the $k$ control variables remain unchanged after the gate is applied to the target variable $i_{k+1}$.

Definition 7 (Positive and negative control). A positive (negative) control is a variable $i$ that must be 1 (0) in order to activate the function on the target variable.

Fig. 1: Example of realization of a reversible function (gates 1 and 2 on lines $a$ and $b$, with outputs $f_1 = a \oplus b$ and $f_0 = b$).

Reversible circuits are built from reversible gates.
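As a brief illustration of Definitions 2, 3 and 6, the following Python sketch builds the 0/1 matrix of a two-variable function from its truth table and tests reversibility by checking that the function is a bijection. The layout (one column per input, one row per output) is read off Eqs. (1) and (2). The names `matrix_of`, `is_reversible`, `cnot` and `fig1_circuit` are illustrative and not taken from the paper, and the sketch assumes that gate 1 of Fig. 1 acts before gate 2.

```python
from itertools import product

# Truth tables of Table 1: input bits (a, b) -> output bits (a', b').
F_IRREVERSIBLE = {(0, 0): (0, 1), (0, 1): (1, 1), (1, 0): (0, 0), (1, 1): (1, 1)}
F_REVERSIBLE = {(0, 0): (0, 0), (0, 1): (1, 1), (1, 0): (0, 1), (1, 1): (1, 0)}


def index(bits):
    """Interpret a bit tuple as a binary number, most significant bit first."""
    value = 0
    for bit in bits:
        value = 2 * value + bit
    return value


def matrix_of(table):
    """Build the 0/1 matrix with one column per input and one row per output."""
    size = len(table)
    matrix = [[0] * size for _ in range(size)]
    for inputs, outputs in table.items():
        matrix[index(outputs)][index(inputs)] = 1
    return matrix


def is_reversible(table):
    """A function on a finite set is reversible iff no output value repeats."""
    return len(set(table.values())) == len(table)


def cnot(control, target):
    """Positive-controlled NOT: Definition 6 with k = 1 (assumed reading)."""
    return control, target ^ control


def fig1_circuit(a, b):
    """Gate 1 (control a, target b) followed by gate 2 (control b, target a)."""
    a, b = cnot(a, b)  # line b now carries a XOR b
    b, a = cnot(b, a)  # line a now carries a XOR (a XOR b) = b
    return a, b        # (f0, f1)


# Truth table of the two-gate circuit, used to test its reversibility.
CIRCUIT_TABLE = {(a, b): fig1_circuit(a, b) for a, b in product((0, 1), repeat=2)}
```

Under these assumptions, `matrix_of(F_IRREVERSIBLE)` has an all-zero row and a row with two ones, which is exactly why that function has no inverse, while the matrix of `F_REVERSIBLE` has a single 1 in every row and column.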
For instance, consider the realization of a reversible function shown in Figure 1. Gates 1 and 2 are both two-variable gates, called CNOT. A CNOT gate is a single-variable positively controlled NOT gate implementing the function $f = a \oplus b$. Observe the gate labelled 1 in the circuit of Figure 1: the control variable $a$ controls the NOT operation on variable $b$. The CNOT gate labelled 2 applies the NOT operation to variable $a$ and is controlled by $b$. Therefore CNOT gate 1 in Figure 1 implements the function $f_1 = a \oplus b$ and CNOT gate 2 implements $f_0 = a \oplus b \oplus a = b$.

Definition 8 (Finite state machine). A finite state machine (FSM) is a sequential device defined by a septuple $M = (I, O, S, S_i, S_f, f, g)$, where $I$ is the input alphabet (a finite, non-empty ordered set of input symbols), $O$ is the output alphabet (a finite, non-empty ordered set of output symbols), $S$ is a finite, non-empty ordered set of states, $S_i \subset S$ is the set of initial states, $S_f \subset S$ is the set of final states, $f$ is the next-state function given by the mapping $f : I \times S \to S'$, and $g$ is the output function given by the mapping $g : I \times S \to O$.

Table 2: Example of an irreversible FSM

S    i = 0   i = 1
a    b/0     a/1
b    a/0     d/0
c    a/1     d/1
d    c/0     b/1

Table 2 shows an example of an FSM. The first column of Table 2 lists, one per row, the possible current states of the FSM. The second and third columns show the next state and output assignments for the input values $i = 0$ and $i = 1$, respectively.

Definition 9 (Reversible finite state machine (RFSM)). An RFSM is a state machine $M = (I, O, S, S_i, S_f, f, g)$ whose state transition function $f : I \times S \to S'$ and output function $g : I \times S \to O$ are balanced logic functions and for which the combined mapping $I \times S \to O \times S'$ is a reversible function.
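Definition 9 can be checked mechanically on a specification table. The sketch below stores a table as a dictionary keyed by (input, state), tests whether a function is balanced in the sense of Definition 1 and whether the combined mapping to (next state, output) is one-to-one, and applies both tests to the FSM of Table 2. The helper names are illustrative, and the output alphabet is assumed to be the set of output values that actually occur in the table.

```python
from collections import Counter

# Table 2: (input, state) -> (next state, output)
TABLE_2 = {
    (0, "a"): ("b", 0), (1, "a"): ("a", 1),
    (0, "b"): ("a", 0), (1, "b"): ("d", 0),
    (0, "c"): ("a", 1), (1, "c"): ("d", 1),
    (0, "d"): ("c", 0), (1, "d"): ("b", 1),
}


def is_balanced(values, codomain):
    """Definition 1: every codomain value has the same number of preimages."""
    counts = Counter(values)
    return len({counts[v] for v in codomain}) == 1


def is_rfsm(table):
    """Definition 9: f and g balanced, and (i, s) -> (s', o) one-to-one."""
    states = {state for _, state in table}
    outputs = {out for _, out in table.values()}  # assumed output alphabet
    next_states = [nxt for nxt, _ in table.values()]
    out_values = [out for _, out in table.values()]
    one_to_one = len(set(table.values())) == len(table)
    return (
        is_balanced(next_states, states)
        and is_balanced(out_values, outputs)
        and one_to_one
    )
```

For Table 2 the test fails for two reasons: state a is the next state of three entries while c is the next state of only one, and the pair a/1 occurs twice.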
In this paper we distinguish two types of RFSMs: the convergent RFSM (CRFSM) and its special case, the non-convergent RFSM (NCRFSM). The difference is that for any NCRFSM and any given input value, every next-state assignment is unique. This difference is illustrated in Table 3: Table 3(a) shows an NCRFSM and Table 3(b) a CRFSM, each in the so-called reversible specification table.

Table 3: Example of (a) non-converging $\lambda_n$ and (b) converging $\lambda_c$ specifications.

(a)                        (b)
S    i = 0   i = 1         S    i = 0   i = 1
a    d/1     a/1           a    d/1     c/0
b    a/0     c/1           b    a/0     c/1
c    c/0     d/0           c    a/1     d/0
d    b/1     b/0           d    b/1     b/0

Definition 10 (NCRFSM evolution). The input-state-output mapping is $\lambda_n(f \times g) : I \times S \to S' \times O$. The evolution function $\lambda_n$ is a bijection with the constraint that, for each fixed input value, each next state occurs at most once: $\forall i \in I,\ \forall s, s' \in S,\ s \neq s' \implies f(i, s) \neq f(i, s')$. An example of an NCRFSM is shown in Table 3(a).

Definition 11 (CRFSM evolution). The input-state-output mapping is $\lambda_c(f \times g) : I \times S \to S' \times O$. The evolution function $\lambda_c$ is a bijection with the only constraint being that each pair $\{s', o\}$ occurs only once in $\lambda_c$. An example of a CRFSM is shown in Table 3(b).

In this paper the term RFSM refers to either a CRFSM or an NCRFSM unless specifically indicated. Note three major differences between the definitions of the RFSM and the FSM:
1. an RFSM must preserve reversibility,
2. it has to be specified for all combinations of the input-state values $\{i, s\}$,
3. every state-output combination $\{s', o\}$ must occur only once in the reversible specification table.

Definition 12 (State-output set $\Theta$). Given an RFSM $M_r$, the state-output set $\Theta$ is the set of all combinations of $s \in S$ and $o \in O$. The size of $\Theta$ is $|S| \cdot |O|$, with element indices $i = 1, \dots, |S| \cdot |O|$. Let $k = 0, \dots, |S| - 1$ and $j = 0, \dots, |O| - 1$; then the indices of the elements of $\Theta$ are calculated as $i = 1 + k + |S| \cdot j$.

Definition 13 (Don't care). The don't care symbol $*$ is used to represent unknown or unconsidered values of states, outputs or inputs.

In the analysis of an RFSM, the input sequence uncertainty is used to describe the knowledge about the current state of the RFSM.
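Definitions 10 and 11 can be made concrete with a short check on the two specifications of Table 3. The following sketch is only an illustration with hypothetical helper names: it treats a table as a CRFSM when all (next state, output) pairs are distinct, and as an NCRFSM when, in addition, the next states are distinct within each input column.

```python
# Table 3(a) and 3(b): (input, state) -> (next state, output)
TABLE_3A = {
    (0, "a"): ("d", 1), (1, "a"): ("a", 1),
    (0, "b"): ("a", 0), (1, "b"): ("c", 1),
    (0, "c"): ("c", 0), (1, "c"): ("d", 0),
    (0, "d"): ("b", 1), (1, "d"): ("b", 0),
}
TABLE_3B = {
    (0, "a"): ("d", 1), (1, "a"): ("c", 0),
    (0, "b"): ("a", 0), (1, "b"): ("c", 1),
    (0, "c"): ("a", 1), (1, "c"): ("d", 0),
    (0, "d"): ("b", 1), (1, "d"): ("b", 0),
}


def is_crfsm(table):
    """Definition 11: every pair (s', o) occurs only once."""
    return len(set(table.values())) == len(table)


def is_ncrfsm(table):
    """Definition 10: a CRFSM whose next states are unique per input value."""
    if not is_crfsm(table):
        return False
    for x in {i for i, _ in table}:
        next_states = [nxt for (i, _), (nxt, _) in table.items() if i == x]
        if len(set(next_states)) != len(next_states):
            return False
    return True
```

With these definitions, Table 3(a) passes both tests, whereas Table 3(b) passes only `is_crfsm`, because state a is reached twice under input 0 and state c twice under input 1.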
To understand the concept of input sequence uncertainty, the concepts of partition and cover first need to be defined and explained. The following concepts are adapted to CRFSMs and NCRFSMs from the original ones defined in [11].

Definition 14 (Ordered state partition). Given a CRFSM/NCRFSM $M$, the ordered state partition $\pi$ of the states $S$ of $M$ is a collection of disjoint subsets of states whose set union is $S$, together with an ordering $\prec_{i-1,i}$. The order $\prec_i$ of the states $s \in \pi_i$ is given by the order $\prec_{i-1}$ of their direct predecessor states $s \in \pi_{i-1}$.

Take the NCRFSM from Table 3(a), $\pi_1 = \{a/*, b/*, c/*, d/*\}$ and $i = 0$; then the ordered partition is $\pi_2 = \{d/1, a/0, c/0, b/1\}$. Using this notation it can easily be determined that the predecessor of $b/1$ was the state $d$. Note that the original definition of state partition [38] breaks the states into groups such that each group contains states with the same output. In Definition 14 these groups are not shown visually, but they still exist when the states are grouped by output value. Thus the corresponding unordered partition is $\pi_2 = (ac)(bd)$.

Definition 15 (Ordered state cover). An ordered cover is a collection $\phi$ of subsets of states $S$ (with their associated outputs $O$) whose union is $S$, such that no subset is contained in another subset in the collection [11]. Additionally, the order of states in the cover is given by $\prec_{i-1,i}$.

Take the CRFSM from Table 3(b), $\pi = \{a/*, b/*, c/*, d/*\}$ and $i = 0$; then the ordered cover is $\phi = \{d/1, a/0, a/1, b/1\}$. Using this notation it can easily be determined that the predecessor of $b/1$ was the state $d/*$.

Definition 16 (Ordered input sequence uncertainty). The uncertainty $\Delta(X)$ of an input sequence $X = x_1, \dots, x_k$ of an RFSM is a cover $\phi$ of $S$ where two distinct states $f(x_t, s_h) = s_i$ and $f(x_t, s_j) = s_k$ are ordered according to $\prec_{i-1,i}$.
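The two worked examples of Definitions 14 and 15 amount to applying the specification table entry by entry while keeping positions fixed. A minimal Python sketch of this reading follows; the list representation of an ordered cover and the helper names `step`, `predecessor` and `group_by_output` are assumptions made for illustration.

```python
# Table 3(a) and 3(b): (input, state) -> (next state, output)
TABLE_3A = {
    (0, "a"): ("d", 1), (1, "a"): ("a", 1),
    (0, "b"): ("a", 0), (1, "b"): ("c", 1),
    (0, "c"): ("c", 0), (1, "c"): ("d", 0),
    (0, "d"): ("b", 1), (1, "d"): ("b", 0),
}
TABLE_3B = {
    (0, "a"): ("d", 1), (1, "a"): ("c", 0),
    (0, "b"): ("a", 0), (1, "b"): ("c", 1),
    (0, "c"): ("a", 1), (1, "c"): ("d", 0),
    (0, "d"): ("b", 1), (1, "d"): ("b", 0),
}

# Initial ordered cover {a/*, b/*, c/*, d/*}; "*" is the don't care output.
START = [("a", "*"), ("b", "*"), ("c", "*"), ("d", "*")]


def step(table, cover, x):
    """Apply input x to every entry of an ordered cover, keeping positions."""
    return [table[(x, state)] for state, _ in cover]


def predecessor(previous_cover, cover, entry):
    """The predecessor of an entry sits at the same position one level up."""
    return previous_cover[cover.index(entry)][0]


def group_by_output(cover):
    """Recover the classical (unordered) partition by grouping on outputs."""
    groups = {}
    for state, out in cover:
        groups.setdefault(out, []).append(state)
    return groups
```

For example, `step(TABLE_3A, START, 0)` gives the ordered partition d/1, a/0, c/0, b/1, and `predecessor` applied to the entry b/1 returns d, in agreement with the text.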
The input sequence uncertainty is highest when the machine is in an unknown state given by $\pi = \{a/*, b/*, c/*, d/*\}$; all states are indistinguishable by their outputs. The lowest uncertainty occurs either when the cover of machine states is a singleton state or when each group of states is a singleton state, such as $\pi = \{a/0, c/1\}$. The uncertainty of an input sequence can be complemented by the amount of information computed from the various states present in the block.

Definition 17 (Information content of uncertainty). The information content of uncertainty is given by $E(\phi) = -\sum_i p_i \log(p_i)$, where the coefficient $p_i$ represents the relative multiplicity of the state $s_i$ in the cover $\phi$ of the states resulting from an input value, and the sum runs over the entries of the cover.

For instance, $E(\{a/*, b/*, c/*, d/*\}) = -\sum^{4} 0.25 \log(0.25) = 1.3863$ and $E(\{d/*, a/*, a/*, a/*\}) = -(0.25 \log(0.25) + \sum^{3} 0.75 \log(0.75)) = 0.9939$, where $\log$ is the natural logarithm.

The difference between the information content and the uncertainty is the following: the uncertainty $\Delta(X)$ of an input sequence $X$ represents the partition/cover of states given their observable outputs, whereas the information content $E(\phi)$ is given by the mixture of individual unique states (or state/output combinations) present in the partition or cover. For instance, the lowest uncertainty $\Delta(X) = 0$ represents the fact that each output represents one unique state, independently of how many such states are in the block. The lowest possible information content $E(\phi) = 0$ is attained by an ordered partition that contains exactly one state (e.g., $\{a/*, a/*, a/*, a/*\}$).

Let $M_r$ be an RFSM (Definition 9) and let $X = x_1, \dots, x_j$, $S = s_1, \dots, s_j$ and $O = o_1, \dots, o_j$ be input, state and output sequences, respectively, each of length $j$.
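The worked values given for Definition 17 can be reproduced numerically. The sketch below is one reading of the definition that matches those values: the sum runs over the entries of the cover (not over the distinct states), $p_i$ is the fraction of entries holding the same state as entry $i$, and the logarithm is natural. These three choices are inferred from the printed numbers rather than stated explicitly in the text, and the function name is illustrative.

```python
import math
from collections import Counter


def information_content(cover):
    """E(phi) = -sum over cover entries of p_i * ln(p_i).

    `cover` is a sequence of state names, one per entry of the cover;
    p_i is the relative multiplicity of the state found in entry i.
    """
    counts = Counter(cover)
    size = len(cover)
    return -sum(
        (counts[state] / size) * math.log(counts[state] / size)
        for state in cover
    )


# Covers from the text, written as strings of state names.
E_UNKNOWN = information_content("abcd")    # {a/*, b/*, c/*, d/*}
E_PARTIAL = information_content("daaa")    # {d/*, a/*, a/*, a/*}
E_SINGLE = information_content("aaaa")     # {a/*, a/*, a/*, a/*}
```

Rounded to four decimals, `E_UNKNOWN` and `E_PARTIAL` agree with the values 1.3863 and 0.9939 quoted in the text, and `E_SINGLE` is zero.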
Definition 18 (Transition cycle). A transition cycle $\tau_j = SO|X$ is the sequence of length $j$ of state-output combinations $so \in \Theta$ obtained as a result of applying the sequence $X$ to a machine $M_r$ in some initial state $\theta_1 = s_1$, such that $f_r(x_1, s_1) = f_r(x_j, s_j)$. The transition cycle defines the shortest sequence of input values $X$ that starts from an arbitrary initial state $s_i$ and ends in the same state $s_i$.

Definition 19 (Maximal transition cycle). If the transition cycle $\tau_i$ reaches all available states in $S$, then the transition cycle is called maximal and is denoted $\tau_{max}$.

For instance, the machine from Table 3(b) starting from the input-state pair $is = 0/a$ has $\tau_4 = d/1\ b/1\ c/1\ a/1\ |\ 0010$. Note that $\tau_4$ is also a maximal transition cycle and thus, for $M_r$, $\tau_4 = \tau_{max}$.

Theorem 1. Any RFSM that does not possess $\tau_{max}$ must have $p$ transition cycles such that $\bigcup_{l=1}^{p} \tau_l = \Theta$, $\sum_{l=1}^{p} |\tau_l| = |S|$ and $\bigcap_{l=1}^{p} \tau_l = \emptyset$.

Proof. Every RFSM is specified by a set of unique mappings $\{i, s\} \to \{s', o\}$. Additionally, the evolution operator $\lambda_c$ is a bijection, and thus every single $\{s', o\} \in \Theta$ can be reached from exactly one distinct $\{i, s\}$. This implies that each state can be reached from at most $|I|$ different cycles. Therefore, any existing $\{s', o\}$ is accessed by at most one existing cycle in $\lambda_c$. Consequently, there are at most $|O \times S \times I|/2$ independent cycles. The content of each transition cycle implies that $\sum_{l=1}^{p} |\tau_l| = |S|$ and $\bigcap_{l=1}^{p} \tau_l = \emptyset$.

Definition 20 (Reversible successor tree (RST)).
A reversible successor tree is a rooted DAG $RST = \{V, E\}$ where: (1) each edge $e$ corresponds to a transition between two states given an input value $i$, and (2) each node $v$ represents a block of states with the associated outputs [11].

The RST of the NCRFSM shown in Table 3(a) is shown in Figure 2. Notice that, unlike in the classical successor tree [11], the RST preserves the order of the states between two successor nodes, as shown in Figure 2. For instance, at node 0 the state cover is $\phi = \{a/*, b/*, c/*, d/*\}$ because it represents the current state: no output has yet been generated and therefore the machine could be in any possible state. For the input value 0, node 1 contains the transformed ordered cover $\phi = \{d/1, a/0, c/0, b/1\}$. Thus a state with output at the $j$th position in a parent node generates the state with output at the $j$th position in every child node.

Definition 21 (Impulse response (IR)). The impulse response of a machine $M$ is the vector of output values $V_i = \{o^i_{s_1}, \dots, o^i_{s_n}\}$ generated in response to an input value $x_i$ for all possible states $s_j \in S$.

Fig. 2: Partially expanded successor tree of the NCRFSM defined in Table 3(a). Root (node 0): {a/*, b/*, c/*, d/*}; ordered covers at the remaining nodes: {a/1, c/1, d/0, b/0}, {a/1, d/0, b/0, c/1}, {d/1, c/0, b/1, a/0}, {d/1, a/0, c/0, b/1}, {b/1, d/1, c/0, a/0}, {b/0, a/1, d/0, c/1}, {a/0, d/1, b/1, c/0}, {c/1, a/1, b/0, d/0}.

Definition 22 (Response sequence (RS)). For a CRFSM/NCRFSM $M$ starting from a state $s_i$ and an input sequence $X = x_1, \dots, x_n$, the sequence of IRs $V^X_i = \{o^{x_1}_{s_i}, \dots, o^{x_n}_{s_i}\}$ of obtained output values is called the response sequence (RS).

Definition 23 (Machine signature). Combining the IR and RS of a machine $M$ into a matrix whose columns are labeled by output values given a state, $o_{s/o}$, and whose rows are labeled by input values from the sequence $X = x_1, \dots, x_n$ results in the so-called machine signature MS.

An example of the machine signature MS of the CRFSM from Table 3(b), obtained as a result of the input sequence $X = 00101$, is shown in Table 4. The columns of Table 4 give, respectively, the index of the input variable, the input variable, the next state and output for each initial state, and the impulse response. For instance, in the second row (the first data row), $k = 0$ indicates that the input variable is $x_0 = 0$. The third column indicates that, given the state and output $a/*$, the next state and output are $d/1$. The outputs generated for input $x_0$ are all gathered in $V_0 = 1011$.

Table 4: Response matrix of the CRFSM from Table 3(b)

k   x          o_{a/*}   o_{b/*}   o_{c/*}   o_{d/*}   MS
0   x0 = 0     d/1       a/0       a/1       b/1       V0 = 1011
1   x1 = 0     b/1       d/1       d/1       a/0       V1 = 1110
2   x2 = 1     c/1       b/0       b/0       c/0       V2 = 1000
3   x3 = 0     a/1       a/0       a/0       a/1       V3 = 1001
4   x4 = 1     c/0       c/0       c/0       c/0       V4 = 0000
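The machine signature of Table 4 can be regenerated from Table 3(b) by tracking, for each initial state, the current state and the output produced at each step. The following Python sketch does this under the assumption that each row of the signature is the ordered cover reached after the corresponding input; `machine_signature` is an illustrative name, not one used in the paper.

```python
# Table 3(b): (input, state) -> (next state, output)
TABLE_3B = {
    (0, "a"): ("d", 1), (1, "a"): ("c", 0),
    (0, "b"): ("a", 0), (1, "b"): ("c", 1),
    (0, "c"): ("a", 1), (1, "c"): ("d", 0),
    (0, "d"): ("b", 1), (1, "d"): ("b", 0),
}


def machine_signature(table, states, inputs):
    """Return one (entries, impulse response) pair per input value.

    `entries` lists the next state/output reached from each initial state,
    in the fixed order of `states`; the impulse response is the string of
    outputs read across that row.
    """
    current = list(states)
    rows = []
    for x in inputs:
        entries = [table[(x, state)] for state in current]
        response = "".join(str(out) for _, out in entries)
        rows.append((entries, response))
        current = [nxt for nxt, _ in entries]
    return rows


SIGNATURE = machine_signature(TABLE_3B, "abcd", [0, 0, 1, 0, 1])
RESPONSES = [response for _, response in SIGNATURE]
```

Here `RESPONSES` reproduces the last column of Table 4, and the final row of `SIGNATURE` shows all four initial states arriving at c/0.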
3 Sequences for the analysis of FSMs

In this section we adapt the definitions of testing sequences originally introduced for irreversible FSMs [38] to the NCRFSM and CRFSM models introduced in this paper.

Fig. 3: Partially expanded successor tree of the CRFSM defined in Table 3(b). Root (node 0): {a/*, b/*, c/*, d/*}; ordered covers at the remaining nodes: {d/1, a/0, a/1, b/1}, {c/0, c/1, d/0, b/0}, {b/1, d/1, d/1, a/0}, {b/0, c/0, c/0, c/1}, {a/1, a/1, b/1, a/0}, {d/0, d/0, b/0, c/1}, {a/0, a/1, a/1, a/1}, {c/1, d/0, d/0, d/0}.

3.1 Homing sequence

Definition 24 (Homing sequence). A homing sequence is a sequence of inputs $X$ that, independently of the initial state $s_i$ and by observing the output sequence $O$, allows the machine to be brought to a distinct final state $s_f$.

We use Figure 2 to look for a homing sequence. Starting from the initial cover $\phi$ (node 0), the input sequence 01 leads to the states b, a, d and c for $O = 10$, $O = 01$, $O = 00$ and $O = 11$, respectively. To obtain a homing sequence, start from the top node with index 0 and expand the node into successor nodes by analyzing the state change from each state in the current node for the input values 1 and 0. The expansion stops when each state is a singleton or when all states in a node are the same.

3.2 Distinguishing sequence

Definition 25 (Distinguishing sequence).
A distinguishing sequence is a sequence $X$ of inputs that creates a unique sequence of outputs $O$ for each unknown initial state $s_j$ of the state machine. Such an output sequence makes it possible to determine the unknown initial state of the machine.

Let us again use the successor tree from Fig. 2. The sequence $X = 01$ shows that for each possible current state a unique output sequence is generated. Starting from node 0, the output sequences are $O = 10$, $O = 01$, $O = 00$ and $O = 11$ for the initial states a, b, c and d, respectively.

3.3 Synchronizing sequence

Definition 26 (Synchronizing sequence). A synchronizing sequence is a sequence $X$ of inputs that creates a path from any initial state $s_i$ to the same specific final state $s_f$, independently of the output sequence $O$.

The synchronizing sequence is in fact a more powerful type of homing sequence, and thus if a sequence is synchronizing it is also a homing sequence. Consider the successor tree from Figure 3. As with the homing sequence, starting from the root node (with index 0) and sequentially feeding the state machine a sequence of input values, the machine will end up in the same state. For instance, the input sequence 010 leads the machine through the nodes 1, 4 and 5, and in node 5 every entry has state a. Thus, independently of the output and of the initial state, the machine will end up in the state a.

4 Analyzing reversible finite state machines

The CRFSMs and NCRFSMs used here are all reduced and thus do not contain any compatible states. Consequently, most trivial models of state devices are not discussed.

4.1 NCRFSM

Fig. 4: Circuit realization of the NCRFSM defined in Table 3(a) (lines $i$, $s_0$, $s_1$; outputs $o$, $s'_1$, $s'_0$).

For the analysis of the NCRFSM we use the NCRFSM from Table 3(a), which has the RST shown in Figure 2. Figure 4 shows one possible realization of the NCRFSM defined by Table 3(a) using the encoding of the states $a = (s_0 = 0, s_1 = 0)$, $b = (s_0 = 0, s_1 = 1)$, $c = (s_0 = 1, s_1 = 0)$ and $d = (s_0 = 1, s_1 = 1)$.
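Given this state encoding, the bit-level form of the specification follows mechanically from Table 3(a). The sketch below performs that substitution and checks that the resulting three-bit mapping is a bijection, as required for a reversible realization. The dictionary layout and the names `encode` and `is_bijection` are illustrative choices, not the authors' procedure.

```python
# State encoding from the text: state -> (s0, s1)
ENCODING = {"a": (0, 0), "b": (0, 1), "c": (1, 0), "d": (1, 1)}

# Table 3(a): (input, state) -> (next state, output)
TABLE_3A = {
    (0, "a"): ("d", 1), (1, "a"): ("a", 1),
    (0, "b"): ("a", 0), (1, "b"): ("c", 1),
    (0, "c"): ("c", 0), (1, "c"): ("d", 0),
    (0, "d"): ("b", 1), (1, "d"): ("b", 0),
}


def encode(table, encoding):
    """Map (i, s0, s1) to (s0', s1', o) by substituting the state codes."""
    bits = {}
    for (i, state), (nxt, out) in table.items():
        bits[(i, *encoding[state])] = (*encoding[nxt], out)
    return bits


def is_bijection(bits):
    """All 2^3 input triples map to distinct output triples."""
    return len(bits) == 8 and len(set(bits.values())) == 8


BITS_3A = encode(TABLE_3A, ENCODING)
```

For instance, `BITS_3A[(0, 0, 0)]` is `(1, 1, 1)`, which corresponds to the entry d/1 reached from state a under input 0.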
table 5: encoding of the ncrfsm from table 3(a) s i 0 1 s0 s1 s0 s1 o s0 s1 o 0 0 1 1 1 0 0 1 0 1 0 0 0 1 0 1 1 0 1 0 0 1 1 0 1 1 0 1 1 0 1 0 table 5 shows the individual bits for nest state and output assignments. table 5 allows us to generate a set of equations describing the individual variable assignment: s′0 = s̄1 ⊕ is̄0 = s̄1 ⊕ is̄ ′ 1 (4) s′1 = s0 ⊕ īs̄1 (5) o = s̄0 ⊕ īs1 = i ⊕ s ′ 1 (6) however because we are dealing with reversible circuits, we cannot simply assign, but rather we have to use the equations 4∼6 to change one of the state, input or ancilla bit variables. here we decide to use the following variable mapping: i → o, s0 → s ′ 1 and s1 → s ′ 0 and we use one ancilla bit. the permutative matrix representing λn of the ncrfsm from table 3(a) is shown in eq. (7). the variables indicated at the top of the matrix are the input i, the state s and the output o variable respectively. λn =             0/a 0/b 0/c 0/d 1/a 1/b 1/c 1/d a/0 0 1 0 0 0 0 0 0 b/0 0 0 0 0 0 0 0 1 c/0 0 0 1 0 0 0 0 0 d/0 0 0 0 0 0 0 1 0 a/1 0 0 0 0 1 0 0 0 b/1 0 0 0 1 0 0 0 0 c/1 0 0 0 0 0 1 0 0 d/1 1 0 0 0 0 0 0 0             (7) lemma 1. an function λn does not changes the information content of the input uncertainty in an ncrfsm. 428 m. lukac, m. kameyama, m. perkowski, p. kerntopf using homing, synchronizing and distinguishing input sequences for the analysis of reversible finite state machines... 429 14 proof. the information content of the input uncertainty ∆(x) is changed only when for a given input value xj the number of distinct successor states is increased or decreased with respect to the number of distinct states in the predecessor state partition. the evolution function λn ≡ f × g (definition 10) preserves for each input value i the number of unique states. therefore, starting from an initial partition π = {s1/i1, . . . , s1/ik, . . . , sn/i1, . . . 
, sn/ik} each vertex of the successor tree will lead to a partition that contains exactly the same states permuted according to function λn . consequently, there is no input sequence that would modify the information content e(φ) in state partitions of the successor tree of an ncrfsm. for instance, observe that every node in the rst shown in figure 2 contains all states in every node of the tree. theorem 2. an ncrfsm always possesses a homing sequence. proof. the assignment of states and output values for each input using λn means that all states of the ncrfsm appear at every node of the successor tree (lemma 1). additionally, λn for an altering sequence of input values such as 0, 1, 0, etc. at every node all available states will be having output 1 or 0. finally, ncrfsm always contains a τmax: there exists at least one sequence of inputs that will traverse the τmax and thus generating a unique sequence for each initial state and consequently identifying the final state distinctively. theorem 3. an ncrfsm always possesses a distinguishing sequence. proof. this is a direct consequence of theorem 2. ncrfsm can always identify a final state by a unique output sequence. λn is specified by a reversible matrix and λn is a bijection. starting from an arbitrary final state with an associated sequence of outputs will lead backward to a unique and distinctive initial state. the successor tree shown in figure 2 shows that one of the available homing sequences for this machine is 10, with output sequences o = 11, o = 10, o = 01 and o = 00, the resulting states are d, c, b and a respectively. the distinguishing sequence can be directly seen in the tree from figure 2 because the input sequence 11 generates unique output sequences and thus confirms theorems 2 and 3 (for the outputs refer to the state-output mapping shown in table 3(b)). theorem 4. an ncrfsm cannot possess a synchronizing sequence proof. 
this is a natural consequence of lemma 1: if an ncrfsm generates a balanced distribution of states and outputs and does not modify the information content of the input sequence uncertainty, it cannot converge to a single unique state.

4.2 crfsm

now that we have shown the properties of the special case of the reversible ncrfsm, we extend these results to the general model of the crfsm. the crfsm is a relaxed type of ncrfsm and can be obtained from an ncrfsm by simply changing output states between different columns of the state transition function specified in a state transition table. for instance, table 3(b) shows a state transition function of a crfsm that is obtained by changing the state assignment from table 3(a). the matrix λc corresponding to table 3(b) is shown in eq. (8).

\[
\lambda_c =
\begin{array}{c|cccccccc}
    & 0/a & 0/b & 0/c & 0/d & 1/a & 1/b & 1/c & 1/d \\ \hline
a/0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
b/0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
c/0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
d/0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
a/1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
b/1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
c/1 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
d/1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{array}
\tag{8}
\]

for illustration, the circuit realizing the crfsm from table 3(b) is shown in figure 5. the encoding used for this realization is the same as in the case of the ncrfsm shown in figure 4, which is a = 0 and b = 1.
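To make the "relaxed" nature of the crfsm concrete, the matrix of eq. (8) can be read as a lookup table. The sketch below assumes that each column of eq. (8) is an input/current-state pair and each row a next-state/output pair; this reading agrees with the bit encoding of table 6 but is an interpretation, not something the text states explicitly. The names `LAMBDA_C`, `is_reversible` and `merges_states` are hypothetical helpers introduced only for illustration.

```python
# Transition function read column by column from the matrix in eq. (8).
# Assumed reading: key = (input, current state), value = (next state, output).
LAMBDA_C = {
    (0, "a"): ("d", 1), (0, "b"): ("a", 0), (0, "c"): ("a", 1), (0, "d"): ("b", 1),
    (1, "a"): ("c", 0), (1, "b"): ("c", 1), (1, "c"): ("d", 0), (1, "d"): ("b", 0),
}


def is_reversible(table):
    """True if no two (input, state) pairs share the same (next state, output) pair."""
    images = list(table.values())
    return len(set(images)) == len(images)


def merges_states(table):
    """True if, for some input value, two current states share the same next state."""
    for value in {inp for inp, _ in table}:
        successors = [nxt for (inp, _), (nxt, _) in table.items() if inp == value]
        if len(set(successors)) < len(successors):
            return True
    return False


print("reversible:", is_reversible(LAMBDA_C))
print("merges states:", merges_states(LAMBDA_C))
```

Under this reading the machine is a bijection on (state, output) pairs, yet two states may share a successor for the same input; they are then told apart only by the output value.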
[fig. 5: compact circuit realization of the crfsm defined in table 3(b); signals i, s0, s1, s′0, s′1, o]

for the analysis of the crfsm we use the crfsm from table 3(a) that has the rst shown in figure 2. figure 4 shows one possible realization of the crfsm defined by table 3(a) using the encoding of the states a = (s0 = 0, s1 = 0), b = (s0 = 0, s1 = 1), c = (s0 = 1, s1 = 0) and d = (s0 = 1, s1 = 1). again, table 6 shows the encoding of the individual bits.

table 6: encoding of the crfsm from table 3(b)

  s          i = 0          i = 1
  s0 s1      s0 s1 o        s0 s1 o
  0  0       1  1  1        1  0  0
  0  1       0  0  0        1  0  1
  1  0       0  0  1        1  1  0
  1  1       0  1  1        0  1  0

from table 6 we can generate a set of equations describing the individual variable assignment:

\[ s'_0 = i\bar{s}_0 \oplus i\bar{s}_1 \oplus \bar{s}_0\bar{s}_1 = s_1 \oplus s'_1 \oplus \bar{o}\,\bar{s}'_1 \tag{9} \]
\[ s'_1 = s_0 \oplus \bar{i}\,\bar{s}_1 \tag{10} \]
\[ o = \bar{i} \oplus s_0\bar{s}_1 = \bar{i} \oplus s_1\bar{s}_2 \tag{11} \]

lemma 2. let λc be defined by a reversible matrix (definition 2), then λc reduces the information content of input uncertainty in a crfsm.

proof. a crfsm is defined by a reversible λc with the only restriction that, for at least one combination $i_j s_k$ of the input and state values, $\lambda_c(i_j s_k)$ results in $s_n o_p$ such that $s_k = s_n$. this implies that at each step resulting from the application of λc, at least one state will be assigned twice. as a consequence, the information content e(φ) is reduced (definition 17).

theorem 5. a crfsm always possesses a homing sequence.

proof. the crfsm specified by λc is not guaranteed to possess τmax and multiple cycles τj may exist. because λc is reversible and λc reduces the information content, there is at least one input sequence x that leads to either a single final state (due to information reduction in the input uncertainty) or to a partition φ with distinct output sequences (due to reversibility of λc).
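Theorem 5 can be checked by brute force on a small machine. The following is a minimal sketch, assuming the transition function of the crfsm from table 3(b) as encoded in table 6 (key = input and current state, value = next state and output) and assuming the usual meaning of a homing sequence: every observed output string is compatible with exactly one final state. The helper names `run`, `is_homing` and `shortest_homing` are hypothetical.

```python
from itertools import product

# crfsm of table 3(b), as encoded in table 6 (assumed reading):
# (input, current state) -> (next state, output)
LAMBDA_C = {
    (0, "a"): ("d", 1), (0, "b"): ("a", 0), (0, "c"): ("a", 1), (0, "d"): ("b", 1),
    (1, "a"): ("c", 0), (1, "b"): ("c", 1), (1, "c"): ("d", 0), (1, "d"): ("b", 0),
}
STATES = "abcd"


def run(state, inputs):
    """Apply the inputs from the given state; return (output string, final state)."""
    outputs = []
    for value in inputs:
        state, out = LAMBDA_C[(value, state)]
        outputs.append(str(out))
    return "".join(outputs), state


def is_homing(inputs):
    """True if every observed output string determines a single final state."""
    final_by_output = {}
    for start in STATES:
        out, final = run(start, inputs)
        if final_by_output.setdefault(out, final) != final:
            return False
    return True


def shortest_homing(max_len=6):
    """Exhaustively search input sequences by increasing length."""
    for length in range(1, max_len + 1):
        for seq in product((0, 1), repeat=length):
            if is_homing(seq):
                return seq
    return None


print(shortest_homing())
```

For this particular machine the search finds no homing sequence of length one or two and stops at length three. The exhaustive search grows exponentially with the sequence length, so it is meant only as a check on small examples.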
before proceeding to the next step we introduce two sub-categories of crfsms: the non-restricting crfsm (nrcrfsm) and the restricting crfsm (rcrfsm). consider the two crfsms shown in table 7. the nrcrfsm from table 7(a) is an example of a machine that reduces the information content of the state partition, that is, for a particular input value e(πj) ≤ e(πj+1). the rcrfsm from table 7(b) is also a crfsm because it reduces the information content of the initial unknown partition only once and then e(πj) = e(πj+1). note that the rcrfsm from table 7(b) is halfway between a crfsm and an ncrfsm: the ncrfsm does not reduce e(πi), while the rcrfsm reduces only e(π0) and then behaves as an ncrfsm.

table 7: example of (a) the nrcrfsm and (b) the rcrfsm

  (a)                        (b)
  s    i = 0   i = 1         s    i = 0   i = 1
  a    a/1     d/0           a    a/1     d/0
  b    c/0     b/0           b    a/0     b/0
  c    c/1     a/0           c    c/0     b/1
  d    b/1     d/1           d    c/1     d/1

lemma 3. a crfsm always possesses a distinguishing sequence.

proof. the proof is separated into three cases:

1. if the crfsm does not reduce the input sequence uncertainty information content then it is an ncrfsm.

2. if the crfsm is an rcrfsm, then once it has reduced e(π0) it behaves like an ncrfsm and therefore will also possess the distinguishing sequence.

3. if the crfsm is an nrcrfsm, it reduces the input sequence uncertainty information. in order not to have a distinguishing sequence, for at least two initial states a and b any response sequences $v_a^x$ and $v_b^x$ must be exactly the same given any input sequence x.
this is only possible (a) if two states lead to the same successor state, in which case it is not an rfsm, or (b) if $v_a^x = v_b^x$ for different state sequences sa and sb and for any input sequence x, in which case the crfsm is not properly reduced, as it will contain redundant states.

table 8: example of the distinguishing sequence of the crfsm from table 3(b)

  x    a/∗    b/∗    c/∗    d/∗    vk
  1    c/0    c/1    d/0    b/0    vk = 0100
  0    a/1    a/1    b/1    a/0    vk = 1110
  0    d/1    d/1    a/0    d/1    vk = 1101
  1    b/0    b/0    c/0    b/0    vk = 0000

lemma 4. for a function λc defined by a permutative matrix (definition 2) that reduces the information content of input uncertainty for τj (nrcrfsm), there is at least one specific input sequence xred that leads to a single final state sf in τj.

proof. in order to prove lemma 4 it is enough to show that, starting from an arbitrary cover φ, there is a sequence xred that reduces the information content of the input sequence. in order to prove this, we assume that all states sj ∈ π can be reached within τj and that λc must either reduce or preserve the information content of π. the proof then proceeds in the following steps:

1. let π = {τ0, . . . , τj} be the initial ordered partition π and the initial input sequence uncertainty.

2. let $i_j \in i$ be such that λc reduces the information content of the input sequence uncertainty by performing the mapping $m_r : \pi \xrightarrow{\,i_j\,} \varphi$. this means $\exists\, s_o, s_p \in \varphi \mid s_o = s_p$ (two states are equal in φ after applying λc with $i_j$ to π).

3. for any two states $s_j \neq s_k \in \varphi$ there must be a sequence of inputs $x_{red,j,k} = \{i_0, \ldots, i_r\}$ such that $\lambda_c(i_r, s_j, \ast) = \lambda_c(i_j, s_k, \ast)$.

theorem 6. an rfsm possesses a synchronizing sequence if it reduces the information content of uncertainty and possesses a maximal sequence τmax.

proof.
if an rfsm with evolution function λc reduces the information of the input uncertainty, it can reach a common final state with a finite sequence (lemma 4). additionally, if the machine mr contains τmax, this common final state is reachable from an arbitrary initial state: if a crfsm does not have τmax then there are at least two smaller τj cycles (theorem 1) that under any input sequence will end up in at least two different states. if a crfsm is an rcrfsm it can still have τmax but it will never converge to a single state.

tables 9(a) and 9(b) show an example of two rfsms without and with synchronizing sequences respectively. note that one can observe the existence of the synchronizing sequence very quickly. the crfsm from table 9(a) is an rcrfsm and thus from an initial partition π1 = {a, b, c, d, e, f} it reduces the partition information content to $-\sum_{i=1}^{6} \frac{1}{3}\log\left(\frac{1}{3}\right)$. any subsequent step then leaves the information content unchanged. therefore, the rcrfsm cannot have a synchronizing sequence.

table 9: example of (a) rcrfsm without synchronizing and (b) nrcrfsm with a synchronizing sequence 001011 leading to a single state b.

  (a)                        (b)
  s    i = 0   i = 1         s    i = 0   i = 1
  a    a/0     e/1           a    a/0     f/1
  b    f/0     b/1           b    f/0     c/0
  c    c/1     d/1           c    d/1     b/0
  d    a/1     d/0           d    e/0     c/1
  e    c/0     e/0           e    d/0     e/1
  f    f/1     b/0           f    a/1     b/1

the crfsm from table 9(b) has a duplicate state in both columns; for i = 0, {a}, {f} and {c}, {e} lead to the same states a and d respectively.
for i = 1, {b}, {d} and {c}, {f} provide the information content reduction necessary for possessing a synchronizing sequence (theorem 6).

table 10: example of a synchronizing input sequence in the crfsm from table 3(b). notice that every column ends with the same state and thus shows the existence of the synchronizing sequence 1101.

  x    a/∗   b/∗   c/∗   d/∗
  1    c     c     d     b
  1    d     d     b     c
  0    b     b     a     a
  1    c     c     c     c

conjecture 1. if a crfsm possesses a distinguishing (synchronizing) sequence that contains all possible input values, it cannot at the same time possess the synchronizing (distinguishing) sequence as well.

proof. if a crfsm possesses a distinguishing sequence, it means that in each column of the rst there exist only unique combinations of s × o. if that were not the case, some of the columns in the rst of the rfsm would be reducing the input uncertainty. this also means that there exist at least two output sequences that have identical outputs and thus a distinguishing sequence cannot exist. however, if the distinguishing sequence exists, it means that each column does not reduce the state uncertainty and each element of s × o is present only once in each column. in such a case the rfsm cannot have the synchronizing sequence.

5 conclusions

in this paper we presented an analysis of reversible finite state machines and proved a set of exact results concerning rfsms being tested using the three testing sequences. we proved that, because of the reversibility requirement, ncrfsms cannot have a synchronizing sequence but do have homing and distinguishing sequences. we also proved that the crfsm can have all three sequences, but at the cost of longer input sequences, and we formulated precise conditions under which a crfsm can have all three studied sequences.
the obtained results show that there is a hierarchy of testability between the ncrfsm and crfsm and the used methods can be used to study other fsm. in the future works, we extend this work into a complete method for transforming irreversible fsm to rfsm and give exact cost of ancilla bits required to have rfsm with all sequences.

facta universitatis series: electronics and energetics vol. 35, no 2, june 2022, pp.
253-268
https://doi.org/10.2298/fuee2202253m
© 2022 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper

performance evaluation of a multicarrier mimo system based on dft-precoding and subcarrier mapping

fahad bin muslim¹, muntazir hussain², usman hashmi¹, aneesullah³, muhammad aamir⁴, ali zahir⁵

¹ department of computing and technology, iqra university, islamabad, pakistan
² department of electrical and computer engineering, air university, islamabad, pakistan
³ department of electronics engineering, university of engineering and technology peshawar, pakistan
⁴ department of electrical and computer engineering, pak austria fachhochschule institute of applied science and technology, haripur, pakistan
⁵ department of electrical and computer engineering, comsats university islamabad, abbottabad, pakistan

abstract. ever-increasing end user demands are driving the development of innovative methods that target not only higher data rates but also better service quality in each subsequent wireless communication standard. this quest for higher data rates has compelled next generation communication technologies to use multicarrier systems, e.g. orthogonal frequency division multiplexing (ofdm), while also relying on multiple-input multiple-output (mimo) technology. this paper is focused on implementing a mimo-ofdm system and on using various techniques to optimize it in terms of the bit-error rate performance. the test case considered is a system implementation constituting the enabling technologies for 4g and beyond communication systems. the bit-error rate optimizations considered are based on preceding the ofdm modulation step by a discrete fourier transform (dft) while also considering various subcarrier mapping schemes. matlab-based simulation of a 2 × 2 mimo-ofdm system exhibits a maximum of 2 to 5 orders of magnitude reduction in bit-error rate due to dft-precoding and subcarrier mapping respectively at high signal-to-noise ratio values in various environments. a 2-3 db reduction in peak-to-average power ratio due to dft-precoding in different environments is also exhibited in the various simulations.

key words: mimo-ofdm, bit-error rate, multiple access, diversity, amplifier nonlinearities.

received september 9, 2021; received in revised form december 23, 2021
corresponding author: fahad bin muslim, department of computing and technology, iqra university, islamabad, pakistan
e-mail: fahad@iqraisb.edu.pk

1. introduction

the concept of multicarrier modulation has profoundly changed modern communication technologies. while orthogonal frequency division multiplexing (ofdm) has been around for a long time and has been part of the radio access technologies up to 4g, other modified techniques such as non-orthogonal multiple access (noma) and filtered ofdm (f-ofdm) are being seen as the waveforms of choice for 5g and beyond [1-3]. ofdm greatly enhances spectral efficiency and also boosts performance in highly frequency-selective environments. moreover, the equalization process is greatly simplified by ofdm because each subcarrier experiences flat fading. this essentially implies the application of a single tap equalizer which represents a simple delay operation wherein the equalization is accomplished by using a previously detected symbol [4]. furthermore, the key to delivering on the performance promises made by modern communication standards is the use of multi-antenna arrays, i.e. multiple-input multiple-output (mimo) systems.
these networks utilize mimo to create several physical routes from the transmitter to the receiver in order to deliver greater streams of data within the same time and frequency blocks. the key features of mimo systems responsible for delivering the enhanced capacity and spectral efficiency expected from 5g networks are spatial diversity and spatial multiplexing [5]. in the former, the same signal is transmitted over several different paths, each experiencing an independent channel realization, while the latter consists of different signals sent over different paths. these features of mimo lead to an improved data rate as well as an increased signal-to-noise ratio (snr) owing to the so-called diversity combining [6]. the two features are depicted in fig. 1, where the blue dotted lines depict spatial diversity while spatial multiplexing is shown by the red and green dotted lines. mimo is already being used in 4g, while its scaled version consisting of a considerably larger array of antennas (in the hundreds), the so-called massive mimo, is a strong contender to be used in 5g [7]. owing to the requirement of reliable high-speed wireless communication, ofdm in combination with mimo technology is an attractive choice for upcoming generations of wireless communications. this ensures the best of both worlds, i.e. the energy and spectral efficiency of mimo combined with the spectral efficiency and the ease of equalization offered by ofdm [5, 8]. the main theme of this paper is hence the implementation and analysis of a mimo-ofdm system.

fig. 1 spatial diversity and multiplexing in mimo

the main contribution of this work is to implement a mimo-ofdm system and its multiaccess version considering both the linear environment and the nonlinear environment.
the system is then analyzed in terms of the bit-error rate (ber) performance considering discrete fourier transform-precoding (dft-precoding). this is followed by using various subcarrier mapping schemes and analyzing their impact on the ber. the underlying peak-to-average power ratio (papr) variation plays an important role in the ber performance variation, especially in nonlinear environments. hence, the papr analysis of the various test cases is also performed. the two techniques are not primarily used for ber improvement, and a similar analysis on a mimo-ofdm system has not been found in the literature (see section 3) to the best of the authors' knowledge. this motivated us to align our work in this direction.

the rest of the paper is structured as follows. the background on dft-precoding and subcarrier mapping is provided in section 2. some relevant work found in the literature is presented in section 3. section 4 is dedicated to a discussion of the system model. the simulation results, in terms of the ber performance and the papr, are presented in section 5. finally, the work is concluded in section 6.

2. dft-precoding and subcarrier mapping

the issue with systems employing ofdm is their poor power efficiency owing to excessive papr. this high papr results in inefficiency when using a high-power amplifier (hpa) on the transmitter side of international mobile telecommunication-advanced (imt-advanced) [9], since the hpa enters the nonlinear region of operation. the operating efficiency of the hpa can be maximized if it is operated in the saturation region. low papr ensures that the hpa operates near the linear region. besides, low papr is critical especially in the uplink since the user equipment (ue) has a limited source of energy, i.e. its battery [10].

2.1. dft-precoding

various techniques can be used to keep the ofdm peak-to-average power ratio within reasonable limits. one such technique is dft-precoding, which is employed e.g. in the uplink of several wireless communication standards including long term evolution (lte) and its successor, lte-advanced. among several alternatives, dft-precoding presents a favorable choice as it can be implemented by merely using the fast fourier transform (fft) prior to ofdm modulation and it does not require any extra signal overhead. ofdm preceded by dft-precoding causes a reduction in envelope variations in comparison to the conventional ofdm system. this reduction is because of the sharing of the subcarrier bandwidth among the whole of the subcarriers in a dft-precoded ofdm. the conventional ofdm, in comparison, encompasses the superposition of all the subcarriers over the signal bandwidth. owing to the spreading effect experienced by every data symbol over several subcarriers in an ofdm system precoded by dft, this technique is also sometimes termed single-carrier frequency division multiplexing (sc-fdm). this results in spreading gain in an sc-fdm system [11]. because of the spreading gain inherent to dft-precoded ofdm, it is also sometimes termed dft-spread ofdm.

2.2. subcarrier mapping

subcarrier (sc) mapping following the dft-precoding converts an ofdm system into its multiaccess counterpart, i.e. orthogonal frequency division multiple access (ofdma) [12]. this mapping results in any n dft-precoded symbols being spread among a combination of any m subcarriers such that m is greater than n. the subcarriers may be mapped in a localized fashion, termed localized fdma (lfdma), or in a distributed fashion, termed distributed fdma (dfdma). the former consists of assigning consecutive subcarriers to the users, while in the latter an offset is introduced between the subcarriers assigned to each user. of the two schemes, dfdma offers more diversity than lfdma owing to the greater spreading of the subcarriers.
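The precoding and mapping steps described above can be sketched in a few lines. This is a minimal single-antenna illustration, not the authors' matlab simulation: the block sizes n = 16 and m = 64, the qpsk alphabet, the absence of pulse shaping and oversampling, and the helper names `dft`, `papr` and `map_subcarriers` are all assumptions made for the example. Only the standard library is used, so the dft is the naive quadratic-time form.

```python
import cmath
import math
import random


def dft(x, inverse=False):
    """Naive O(n^2) DFT; the inverse includes the 1/n scaling."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[k] * cmath.exp(sign * 2j * math.pi * k * m / n) for k in range(n))
           for m in range(n)]
    return [v / n for v in out] if inverse else out


def papr(signal):
    """Peak-to-average power ratio (linear scale) of one complex baseband block."""
    power = [abs(v) ** 2 for v in signal]
    return max(power) / (sum(power) / len(power))


def map_subcarriers(symbols, m, scheme):
    """Place len(symbols) values on m subcarriers; unused subcarriers stay zero."""
    n = len(symbols)
    grid = [0j] * m
    step = m // n if scheme == "dfdma" else 1  # equidistant vs. consecutive
    for k, value in enumerate(symbols):
        grid[k * step] = value
    return grid


random.seed(0)
N, M = 16, 64  # assumed block sizes with M > N
qpsk = [complex(random.choice((-1, 1)), random.choice((-1, 1))) / math.sqrt(2)
        for _ in range(N)]

ofdm = dft(map_subcarriers(qpsk, M, "lfdma"), inverse=True)        # no precoding
lfdma = dft(map_subcarriers(dft(qpsk), M, "lfdma"), inverse=True)  # dft-precoded
dfdma = dft(map_subcarriers(dft(qpsk), M, "dfdma"), inverse=True)  # dft-precoded

for name, sig in (("ofdm", ofdm), ("lfdma", lfdma), ("dfdma", dfdma)):
    print(name, round(10 * math.log10(papr(sig)), 2), "dB")
```

With equidistant mapping and no pulse shaping, the time-domain block is a scaled repetition of the original constant-modulus symbols, which is why the distributed case reaches the lowest possible papr in this idealized setting. The figures printed for a single random block should not be compared with the averaged results reported in the paper.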
the trade-off, on the other hand, is the increased pilot overhead needed in dfdma for channel estimation as compared to lfdma [11, 13]. another scheme that is a compromise between the two is the block-interleaved fdma (bifdma). this scheme consists of equidistant groups (chunks) of more than one subcarrier being assigned to each user. grouping of subcarriers (instead of using individual subcarriers as in dfdma) gives this scheme a flavor of lfdma, while interleaving between the groups gives it a flavor of dfdma [14]. the three subcarrier mapping schemes are illustrated in fig. 2.

fig. 2 comparison of ifdma, lfdma and bifdma

3. related work

the concept of mimo and its associated advantages has long been under active research. the added advantage that ofdm can offer in a mimo communication system was likewise acknowledged long ago in the research community. various aspects of mimo-ofdm have since remained areas of active research, including the ber performance analysis of such systems. in this section, we present some relevant work done in this regard. in this context, several papers, e.g. [15], compared the ber performance of a simple ofdm system with a mimo-ofdm system. the analysis in [15] has been made under awgn, rayleigh and rician fading environments by considering convolutional coding. the system implementation in that work has been carried out on a software-defined radio (sdr) platform like the one considered in [5]. the paper, however, only considers the differences in performance between simple ofdm and mimo-ofdm systems communicating over various channels, including the impact of repetition coding.
a research similar to [15] has been done in [16] to compare the ber performance of various mimo antenna configurations while considering the same channels as considered in [15] and various digital modulation formats i.e. bpsk, qpsk, 16qam and 64qam. the work also includes implementing several error-correction codes. the authors have considered binary and image data and have tried to select the best options among the considered configurations in terms of ber performance. a hardware field programmable gate array (fpga) bases implementation of a 2 × 2 mimo-ofdm system considering rateless space-time block code (rstbc) has been performed in [17] and the results compared with simulations to verify the correctness of the hardware implementation. all the works considered so far i.e. [15-17] however, do not consider the aspects of papr reduction, nonlinearities and multiaccess that have been discussed in this work. a dft spread mimo-ofdm system is considered in [18]. the author has proposed a frequency domain representation of tomlinson-harashima precoding (thp) for mimoofdm in this work and performed ber analysis of his proposed algorithms with generic minimum mean square error (mmse) decoder. similarly, dft-precoded mimo-ofdm systems have been considered in [19] and [20] for underwater acoustic communication. the main consideration in both these papers, however, has been to research the use of this technique widely used in wireless communication, in underwater communications. this makes the focus in those works considerably different than what we are trying to achieve here. additionally, a mimo-ofdma system is considered in [21] wherein, the authors propose a quarter ici self-cancellation (ici-sc) subcarrier scheme to overcome the intercarrier interference caused by frequency offset inherent to ofdma systems. the authors, however, do not consider the dft-precoding and similarly the sc mapping schemes considered in this work are also non-existent. 
a papr reduction technique in a mimo-ofdm system has recently been proposed in [22], wherein a hybrid approach combining turbo coding and an enhanced switching differential algorithm-based partial transmit sequence is considered to accomplish papr as well as ber reduction. the authors of [22] also propose another approach in [23] for massive mimo-ofdm systems by incorporating a distributive population-based switching differential evolution strategy in the selected mapping (slm) technique that also achieves significant papr reduction. in both cases, however, the specific methodology used to reduce papr (and ber) is considerably different from what we are accomplishing in this study. thus, papr reduction in systems based on mimo-ofdm and the resulting ber improvement is actively being pursued in the research community and is therefore the main theme of this research study as well.

4. system model

in this section, we explain the basic system model that has been implemented. a 2 × 2 mimo channel is considered with two transmit and two receive antennas. a simple zero forcing (zf) equalization is considered in this work which reduces the channel experienced by every symbol being transmitted from each antenna to a simple 1 × 1 rayleigh fading channel [24]. the implemented system model, based on [25], is depicted as a block diagram in fig. 3. some assumptions are made while implementing the system model. it is assumed that the various transmitted signal streams encounter independent channel realizations. moreover, it is also assumed that each signal travelling from a transmit antenna to a receive antenna is multiplied by a random rayleigh channel coefficient and is corrupted by gaussian distributed noise with zero mean. the channel being considered is a flat fading channel with rayleigh distribution. finally, it is assumed that the channel is known at the receiver.
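the assumptions listed above (2 × 2 flat rayleigh fading, independent coefficients, zero-mean gaussian noise, channel known at the receiver, zf equalization) can be sketched in a few lines of python. this is only an illustration and not the authors' simulation code: the qpsk symbols, the noise level and the helper names `rayleigh_channel` and `zf_equalize` are assumptions introduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def rayleigh_channel(n_rx, n_tx, rng):
    """Flat-fading matrix with i.i.d. complex Gaussian entries (Rayleigh magnitudes)."""
    real = rng.standard_normal((n_rx, n_tx))
    imag = rng.standard_normal((n_rx, n_tx))
    return (real + 1j * imag) / np.sqrt(2)

def zf_equalize(H, r):
    """Zero forcing: apply the pseudo-inverse of the (known) channel matrix."""
    return np.linalg.pinv(H) @ r

# One QPSK symbol per transmit antenna (QPSK is an assumption made for brevity).
s = (rng.choice([-1.0, 1.0], 2) + 1j * rng.choice([-1.0, 1.0], 2)) / np.sqrt(2)

H = rayleigh_channel(2, 2, rng)          # independent coefficient per antenna pair
noise_std = 0.01                         # hypothetical noise level
w = noise_std * (rng.standard_normal(2) + 1j * rng.standard_normal(2)) / np.sqrt(2)

r = H @ s + w                            # received vector
s_hat = zf_equalize(H, r)                # noisy estimate of the transmitted symbols
s_noiseless = zf_equalize(H, H @ s)      # without noise, ZF recovers s exactly
```

without noise the zf output equals the transmitted vector; with noise, the estimate degrades whenever the channel matrix is poorly conditioned, which is the usual price of zero forcing.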
a simplified flow of signals from each transmitter end to the receiver end is presented here. the presentation also includes the dimensions of the various blocks that have been used to condition the signal as it travels from one of the transmit antennas to one of the receive antennas.

fig. 3 illustrative model of the implemented system

4.1. signal definition

we consider the discrete-time equivalents of all the signals in their complex baseband representations. upper-case letters denote matrices, while vectors are denoted by lower-case letters. the number of users and of subcarriers corresponding to every signal are indicated by superscripts and subscripts, respectively. the transpose, hermitian and pseudo-inverse operators are represented by $(\cdot)^T$, $(\cdot)^H$ and $(\cdot)^{\dagger}$, respectively.

4.2. transmitter structure

throughout the mathematical analysis, a system with $Q$ symbols is assumed, with the symbol index $q$ spanning from 0 to $Q-1$. the raw data corresponding to every symbol is denoted by $d^{(q)}$. this data is first modulated onto the eight symbols of the 8-qam modulation scheme. for each symbol $q$, the modulated symbol is represented by $s_K^{(q)}$, where $K$ is the number of subcarriers assigned to a specific symbol. the modulated symbols are then translated into the frequency domain by taking their $K$-point fft in a dft-precoded system. the dft-precoding process is represented by a square matrix of order $K \times K$, denoted $F_K$. the subcarrier mapping block then maps the $K$ dft-precoded symbols to $K$ among $M$ subcarriers, where $M > K$. this is done by the $M \times K$ subcarrier mapping matrix $T^{(q)}$, which can take any one of the forms given by (1), (2) and (3), depending on the subcarrier mapping technique. these two blocks are represented by dotted lines in fig.
3 to indicate that these operations have been modified to obtain the various test case results.

$$T_{\mathrm{IFDMA}}^{(q)}(m,k) = \begin{cases} 1, & m = kQ + q \\ 0, & \text{otherwise} \end{cases} \tag{1}$$

$$T_{\mathrm{LFDMA}}^{(q)}(m,k) = \begin{cases} 1, & m = qK + k \\ 0, & \text{otherwise} \end{cases} \tag{2}$$

$$T_{\mathrm{B\text{-}IFDMA}}^{(q)}(m,k) = \begin{cases} 1, & m = p\,\frac{M}{P} + l + qL \\ 0, & \text{otherwise} \end{cases} \tag{3}$$

ofdm is then realized by taking an $M$-point ifft of the signal, followed by adding a cyclic prefix (cp) of an appropriate length. it is worth mentioning that the cp addition as well as the clipping operation (used to induce nonlinearities) are not included in the mathematical representations. the reason is that the cyclic prefix is merely the addition of a few samples (and their removal on the receiver side) and has no profound impact on the system's mathematical model as long as its length is at least equal to the channel delay spread. the transmit signal in a system employing dft-precoding is hence mathematically represented by (4).

$$x_M^{(q)} = F_M^{H} \, T^{(q)} \, F_K \, s_K^{(q)} \tag{4}$$

the transmit signal in a system with no dft-precoding is the same as (4) without the dft-precoding matrix $F_K$.

4.3. receiver structure

this signal then passes through the rayleigh channel, whose channel coefficient matrix corresponding to the $q$th symbol is given as $H^{(q)}$. the transmitted signal is additionally corrupted by additive white gaussian noise (awgn), represented by $W_M$. the received signal is mathematically represented by (5).

$$r_M^{(q)} = H^{(q)} \, x_M^{(q)} + W_M \tag{5}$$

once the signal is received at the receiver, the cp is first removed. this is followed by an $M$-point fft to accomplish ofdm demodulation. subcarrier demapping is then performed by using the $K \times M$ sc demapping matrix $T^{(q)\dagger}$, which is the pseudo-inverse of the sc mapping matrix $T^{(q)}$. a zf equalizer is then used to correct the channel imperfections.
the matrix of the equalizer coefficients in our mathematical modelling is denoted by $C^{(q)}$. dft-predecoding is then employed in the cases that include dft-precoding on the transmitter side. the signal is thereafter demodulated to obtain the estimate of the transmitted signal. the estimate of the source signal recovered on the receiver side is mathematically given by (6).

$$\hat{s}_K^{(q)} = F_K^{H} \, C^{(q)} \, T^{(q)\dagger} \, F_M \, r_M^{(q)} \tag{6}$$

4.4. nonlinearities modelling

in order to analyze performance in nonlinear environments, we have introduced a simple threshold clipper model at the transmitter side. this is a simple clipper that ignores the impact of the phase distortion and considers the amplitude distortion alone, but the same analysis can be applied to more complex models among the ones found in [26] as well. the threshold clipper with an input signal $x(t)$, an output signal $x_c(t)$ and a threshold $T$ is given by (7).

$$x_c(t) = \begin{cases} x(t), & |x(t)| \le T \\ T e^{\,i \arg\{x(t)\}}, & |x(t)| > T \end{cases} \tag{7}$$

nonlinearities in this work have been introduced by clipping off 30% of the signal input to the threshold clipper.

5. results

this section presents the results we obtained by performing various simulations. the parameters used to perform the simulations are presented in table 1. it should be noted that we have not considered the ifdma scheme while analyzing the results in the nonlinear environments. this is because, without pulse shaping, the threshold clipping model does not clip the ifdma signal in the time domain in the intended way. the reason is that ifdma can be visualized as a symbol being upsampled in the frequency domain. on taking its ifft, we merely obtain replicas of the initial time domain symbols. thus, instead of clipping the whole ifdma symbol by some amount, the threshold clipper will rather distort each repetition of the symbol, which is not the intended behavior.
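the replication argument above can be checked numerically. the sketch below builds the mapping matrices of (1) and (2), forms the transmit signal of (4) without cyclic prefix, and compares the ifdma output with repeated copies of the time-domain symbols. it is a minimal illustration under assumptions: $M = KQ$, small example sizes, numpy's fft scaling (unnormalized forward transform, $1/M$ on the inverse) instead of unitary dft matrices, and hypothetical helper names `mapping_matrix` and `transmit`.

```python
import numpy as np

def mapping_matrix(scheme, q, K, Q):
    """M x K subcarrier mapping matrix for index q, assuming M = K * Q.

    'ifdma' follows (1): row m = k*Q + q.
    'lfdma' follows (2): row m = q*K + k.
    """
    k = np.arange(K)
    if scheme == "ifdma":
        rows = k * Q + q
    elif scheme == "lfdma":
        rows = q * K + k
    else:
        raise ValueError("scheme must be 'ifdma' or 'lfdma'")
    T = np.zeros((K * Q, K))
    T[rows, k] = 1.0
    return T

def transmit(s, T):
    """Transmit signal of (4): K-point DFT, subcarrier mapping, M-point inverse DFT."""
    return np.fft.ifft(T @ np.fft.fft(s))

K, Q = 8, 4
M = K * Q
rng = np.random.default_rng(1)
s = rng.choice([-1.0, 1.0], K) + 1j * rng.choice([-1.0, 1.0], K)

x_ifdma = transmit(s, mapping_matrix("ifdma", 0, K, Q))
x_lfdma = transmit(s, mapping_matrix("lfdma", 0, K, Q))

# IFDMA with q = 0: the time-domain block is Q scaled copies of s.
repeats = np.allclose(x_ifdma, np.tile(s, Q) / Q)

# For q > 0 the copies are additionally rotated by a phase ramp.
q = 1
ramp = np.exp(2j * np.pi * q * np.arange(M) / M)
x_ifdma_q = transmit(s, mapping_matrix("ifdma", q, K, Q))
repeats_q = np.allclose(x_ifdma_q, ramp * np.tile(s, Q) / Q)
```

since each mapping matrix has exactly one unit entry per column, its transpose is also its pseudo-inverse, which is what the demapping step at the receiver relies on. with localized mapping the original symbols only reappear at every $Q$th output sample, and the samples in between are interpolated values.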
we would need to introduce pulse shaping to be able to analyze the performance of ifdma, which has not been considered in this work.

table 1 simulation parameters

parameters                          values
total subcarriers (M)               1024
subcarriers/user (K)                256
subcarriers/block (bifdma only)     8
modulation                          8-qam
equalization                        zf
channel estimation                  perfect
channel type                        flat fading channel
fading channel distribution         rayleigh distribution
number of transmit antennas         2
number of receiver antennas         2

5.1. performance evaluation of dft-precoding

this section presents the results indicating the effect of dft-precoding on a mimo-ofdm system in linear and nonlinear environments. moreover, both the single access and multiaccess systems are considered.

5.1.1. single access communication

linear environment: fig. 4 indicates the impact of dft-precoding on a mimo-ofdm system in a linear environment. as is clear from the figure, the system with dft-precoding shows the best ber performance throughout. the ber for a dft-precoded system is low because of the spreading gain caused by dft-precoding in a mimo-ofdm system.

nonlinear environment: the impact of dft-precoding on a mimo-ofdm system in a nonlinear environment is presented in fig. 5. it is to be noted that 30% clipping indicates that around 77 out of the 256 samples of the mimo-ofdm signal are being distorted by the nonlinear amplifier. the results are still consistent with those found in fig. 4, with the only difference that the ber at higher snr is not as low as in fig. 4, especially for the non dft-precoded system. for example, for the non dft-precoded system in fig. 4, a ber of $10^{-1}$ is achieved at 17 db, while in fig. 5 the same ber target is achieved at around 20 db. the difference between the two figures is less pronounced for the dft-precoded system, possibly due to the lower papr achieved as a result of dft-precoding.
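the link between dft-precoding, papr and clipping mentioned above can be illustrated with a small monte carlo sketch. several choices here are assumptions and are not taken from the paper: qpsk is used instead of 8-qam, the $K = 256$ symbols are mapped in a localized fashion onto $M = 1024$ subcarriers, and "clipping off 30%" is read as choosing the threshold $T$ of (7) so that 30% of the samples exceed it. the helper names are hypothetical.

```python
import numpy as np

def papr_db(x):
    """Peak-to-average power ratio of one block of samples, in dB."""
    power = np.abs(x) ** 2
    return 10.0 * np.log10(power.max() / power.mean())

def threshold_clip(x, T):
    """Clipper of (7): samples above T keep their phase, magnitude is set to T."""
    return np.where(np.abs(x) > T, T * np.exp(1j * np.angle(x)), x)

def clip_fraction(x, fraction):
    """Assumed reading of 'clipping off 30%': T is the (1 - fraction) quantile of |x|."""
    T = np.quantile(np.abs(x), 1.0 - fraction)
    return threshold_clip(x, T), T

rng = np.random.default_rng(2)
K, M, n_blocks = 256, 1024, 200
papr_plain, papr_dft = [], []
for _ in range(n_blocks):
    s = (rng.choice([-1.0, 1.0], K) + 1j * rng.choice([-1.0, 1.0], K)) / np.sqrt(2)
    grid_plain = np.zeros(M, dtype=complex)
    grid_dft = np.zeros(M, dtype=complex)
    grid_plain[:K] = s                   # symbols placed directly on subcarriers
    grid_dft[:K] = np.fft.fft(s)         # DFT-precoded symbols, localized mapping
    papr_plain.append(papr_db(np.fft.ifft(grid_plain)))
    papr_dft.append(papr_db(np.fft.ifft(grid_dft)))

mean_plain, mean_dft = np.mean(papr_plain), np.mean(papr_dft)

# Clip the last non-precoded block so that about 30% of its samples are affected.
x = np.fft.ifft(grid_plain)
x_clipped, T = clip_fraction(x, 0.30)
n_clipped = int(np.sum(np.abs(x) > T))
```

under these assumptions the average papr of the dft-precoded blocks comes out lower than that of the plain ofdm blocks, and roughly 30% of the 1024 samples exceed the threshold and are limited in magnitude while keeping their phase.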
there is, however, a levelling effect in ber at higher snr values in the linear environment with dft-precoding which is not present in the dft-precoded case with nonlinear environment. on the other hand, when the dft-precoded system is compared with the system with no dft-precoding, the non dft-precoded system shows higher envelope fluctuations, which ultimately result in larger out-of-band interference in nonlinear environments. the situation is further exacerbated in non dft-precoded systems due to the lack of the spreading gain that is inherent to dft-precoded systems, as mentioned before.

fig. 4 impact of dft-precoding on mimo-ofdm under linear conditions

fig. 5 impact of dft-precoding on mimo-ofdm under nonlinear conditions

5.1.2. multiaccess communication

this section presents the impact of dft-precoding on a multiaccess mimo scheme in linear and nonlinear environments. lfdma is the multiaccess scheme that has been considered in this simulation. as expected, dft-precoding provides a spreading gain in the multiaccess schemes as well, in both linear and nonlinear environments, as is evident from the dft-precoded schemes offering better ber performance than the non dft-precoded schemes. additionally, the performance in nonlinear environments is further degraded by the out-of-band interference. this impact is worse in non dft-precoded systems, while in dft-precoded systems the linear and nonlinear environments offer almost identical performance at snrs beyond 16 db, probably due to the ability of dft-precoding to reduce the out-of-band interference caused by nonlinearities. these results are depicted graphically in fig. 6.

fig. 6 dft-precoding impact on mimo-lfdma under linear and nonlinear environments

5.2. performance evaluation of subcarrier mapping schemes

this section presents the details of the analysis of a multiaccess mimo-ofdm system employing the various subcarrier mapping schemes in linear and nonlinear environments. as discussed before, the simulations with nonlinearities do not include ifdma.

5.2.1. linear environment

fig. 7 depicts the ber performance of the various mimo-ofdma subcarrier mapping schemes operated in a linear environment. while the performance at lower snrs (under 10 db) is hard to separate, the interleaved subcarrier mapping schemes (ifdma and bifdma) clearly outperform lfdma beyond 10 db. the situation at higher snrs in mimo-ofdma seems to be dominated by the spatial diversity gains offered by mimo, and hence the performance with ifdma matches that of bifdma. the lfdma scheme inherently offers less diversity in comparison, and this effect becomes more pronounced beyond 10 db of snr.

fig. 7 ber performance analysis of various subcarrier mapping schemes under linear conditions

5.2.2. nonlinear environment

the ber performance comparison of bifdma and lfdma mimo systems under nonlinear environments is presented here and depicted in fig. 8. it must be noted that the sc mapping in this work maps the 256 subcarriers to 256 out of 1024 subcarriers at the output of the subcarrier mapping block. while the mapping is localized in lfdma, it is interleaved with a block size of 8 in bifdma. clipping off 30% of the subcarrier mapped signal hence means that around 307 out of the 1024 samples would be clipped off. bifdma clearly beats lfdma at snr beyond 15 db in terms of ber performance. the reason for this behavior is based on the envelope of the signal input to the high-power amplifier (modelled by the threshold clipper).
the out-of-band leakage caused by the nonlinear distortion of the amplifier depends on the input signal's envelope [27], and bifdma exhibits considerably smaller envelope fluctuations than lfdma, especially when no signal oversampling and windowing is considered [14]. thus, the lower envelope variations of bifdma seem to be responsible for its superior ber performance in comparison to its lfdma counterpart as far as nonlinear environments are concerned.

5.3. peak-to-average power ratio (papr) analysis

papr reduction techniques play an important role in allowing wireless communication technologies to abide by the rigorous standards necessary for modern telecommunications, including a reasonable value of the ber [28]. this section presents the papr analysis of the multiaccess test cases where the impact of dft-precoding has been observed. the detailed explanation of the test cases has already been presented in section 5.1.2 of the paper. it is a well-known fact that dft-precoding has a significant effect on the papr of the system, which in turn has a profound impact on the ber performance, especially when amplifier nonlinearities are considered. the papr of the various test cases in db is given in fig. 9. the difference in papr with and without dft-precoding is clearly evident in all the test cases. this reduced papr plays a major role in the ber performance, particularly in nonlinear environments, where it results in a reduction of the out-of-band interference that is induced by the high-power amplifier.

fig. 8 lfdma vs bifdma mimo ber performance in nonlinear environments

fig. 9 papr analysis of multiaccess communication in various environments

6. conclusions

this section presents the conclusions based on the simulations performed in this research work. it was found that dft-precoding results in reduced ber over a wide range of snr values, mainly due to its spreading gain under linear environments. this trend is complemented further by its reduced papr, which reduces the impact of hpa nonlinearities in nonlinear environments. comparing dft-precoded systems in linear and nonlinear environments resulted in almost identical performance at higher snrs. this was probably because dft-precoding removes most of the envelope fluctuations while the spreading gain is the same in both cases. for the subcarrier mapping schemes, it was observed that the interleaved mapping schemes outperformed the non-interleaved schemes comprehensively under both linear and nonlinear environments. this was due to their greater spreading gain, complemented by their reduced envelope fluctuations under nonlinear environments. the ber results were further validated by simulating and examining the papr reduction in schemes with dft-precoding under various environments. pulse shaping was not considered in this work, which limited our tests in nonlinear multiaccess environments to bifdma and lfdma only. future work may include pulse shaping to make the results more complete. furthermore, a wider range of channel environments among the ones found in [29] can be considered to perform a more realistic analysis of the considered test cases. finally, the results can be refined further by considering other nonlinearity models among the ones found in [30] that consider amplitude as well as phase distortion.

references

[1] l. zhang, a. ijaz, p. xiao, m. m. molu and r. tafazolli, "filtered ofdm systems, algorithms, and performance analysis for 5g and beyond", ieee trans. commun., vol. 66, no.
3, pp. 1205–1218, march 2018. [2] r. c. kizilirmak and h. k. bizaki, "non-orthogonal multiple access (noma) for 5g networks", in towards 5g wireless networks-a physical layer perspective, intech open, 2016, pp. 83–98. [3] s. dhua, r. arjun, k. appaiah and v. m. gadre, "low complexity fbmc with filtered ofdm for 5g wireless systems", in proceedings of the ieee international conference on signal processing and communications (spcom), 2020, pp. 1–5. [4] c. he, l. zhang, j. mao, a. cao, p. xiao and m. a. imran, "performance analysis and optimization of dct-based multicarrier system on frequency-selective fading channels", ieee access, vol. 6, pp. 13075– 13089, 2018. [5] r. qomarrullah, i. w. mustika and s. dharmanto, "performance comparison of siso and mimo-ofdm based on sdr platform", in proceedings of the 3rd international ieee conference on science and technology-computer (icst), 2017, pp. 142–146. [6] a. ahrens, and c. benavente-peces, "modulation-mode and power assignment in broadband mimo systems", facta univ.: electron. energ., vol. 22, no. 3, pp. 313–327, dec. 2009. [7] e. björnson, j. hoydis and l. sanguinetti, "massive mimo has unlimited capacity", ieee trans. wirel. commun., vol. 17, no. 1, pp. 574–590, jan. 2018. [8] m. n. seyman and n. taşpinar, "symbol detection using the differential evolution algorithm in mimoofdm systems", turk. j. electr. eng. comput. sci., vol. 21, no. 2, pp.373–380, march 2013. [9] y. medjahdi, s. traverso, r. gerzaguet, h. shaiek, r. zayani, d. demmer, r. zakaria, j.-b. doré, m. b. mabrouk and d. le ruyet, "on the road to 5g: comparative study of physical layer in mtc context", ieee access, vol. 5, pp. 26556–26581, 2017. [10] k. s. ali, h. elsawy and m.-s. alouini, "on mode selection and power control for uplink d2d communication in cellular networks", in proceedings of the ieee international conference on communication workshop (iccw), 2015, pp. 620–626. [11] j. zhang, l.-l. yang, l. hanzo and h. 
gharavi, "advances in cooperative single-carrier fdma communications: beyond lte-advanced", ieee commun. surv. tutor., vol. 17, no. 2, pp. 730–756, 2015. [12] n. taşpinar and ş. şimşir, "an efficient technique based on firefly algorithm for pilot design process in ofdm-idma systems", turk. j. electr. eng. comput. sci., vol. 26, no. 2, pp. 817–829, march 2018. [13] y. zhang, a. ghazal, c.-x. wang, h. zhou, w. duan and e.-h. m. aggoune, "accuracy-complexity tradeoff analysis and complexity reduction methods for non-stationary imt-a mimo channel models", ieee access, vol. 7, pp. 178047–178062, 2019. [14] h. ochiai, "statistical distributions of instantaneous power and peak-to-average power ratio for singlecarrier fdma systems", phys. commun., vol. 8, pp. 47–55, sept. 2013. [15] a. elsanousi and s. oztürk, "performance analysis of ofdm and ofdm-mimo systems under fading channels", eng. technol. appl. sci. res., vol. 8, no. 4, pp. 3249–3254, aug. 2018. [16] a. agarwal and s. n. mehta, "performance analysis and design of mimo-ofdm system using concatenated forward error correction codes", j. cent. south univ., vol. 24, no. 6, pp. 1322–1343, july 2017. [17] a. h. alqahtani, k. humadi, a. i. sulyman and a. alsanie, "experimental evaluation of mimo-ofdm system with rateless space-time block code", hindawi int. j. antennas and propag., p. 6804582, feb. 2019. [18] s. kinjo, "a. tomlinson-harashima, precoding for dft-spread mimo-ofdm systems", ieice commun. express, vol. 1, no. 4, pp. 148–153, sept. 2012. [19] j. tao, "dft-precoded mimo ofdm underwater acoustic communications", ieee j. ocean. eng., vol. 43, no. 3, pp. 805–819, july 2018. [20] j. tao, l. an, s. yao, l. zhou, x. han and z. qin, "precoded ofdm over underwater acoustic channels", in proceedings of the thirteenth acm international conference on underwater networks & systems, 2018, pp. 1–6. [21] h. a. mohamad, a. idris and k. 
dimyati, "mimo-ofdma subcarrier mapping improvement by using quarter ici-sc with stfbc technique", j. telecommun. electron. comput. eng. (jtec), vol. 9, no. 1–4, pp. 33–36, 2017. [22] m. rakshit, s. bhattacharjee, j. sanyal and a. chakrabarti, "hybrid turbo coding pts with enhanced switching algorithm employing de to carry out reduction in papr in aul-based mimo-ofdm", arab. j. sci. eng., vol. 45, no. 3, pp. 1821–1839, march 2020. [23] m. rakshit, s. bhattacharjee, g. garai and a. chakrabarti, "a novel distributive population-based differential evolution algorithm for slm scheme to reduce papr in massive mimo-ofdm systems", sn comput. sci., vol. 1, no. 5, pp.1–17, sept. 2020. 268 f. b. muslim, m. hussain, u. hashmi, aneesullah, m. aamir, a. zahir [24] d. tse and p. viswanath, fundamentals of wireless communication. cambridge university press, 2005. [25] k. sankar, "mimo with ml equalization", e-notes on www.dsplog.com, posted on december 14 2008. [26] k. m. gharaibeh, nonlinear distortion in wireless systems: modeling and simulation with matlab, john wiley & sons, 2011. [27] t. frank, "block-interleaved frequency division multiple access and its application in the uplink of future mobile radio systems", ph.d. dissertation, technische universität darmstadt, 2010. [28] b. d. jovanović and s. milenković, "the peak windowing for papr reduction in software defined radio base stations", facta univ.: electron. energ., vol. 33, no. 2, pp. 273–287, june 2020. [29] j. meinilä, p. kyösti, t. jämsä and l. hentilä, "winner ii channel models", in radio technologies and concepts for imt-advanced, pp. 39–92, oct. 2009. [30] m. c. p. paredes, f. grijalva, j. carvajal-rodríguez and f. sarzosa, "performance analysis of the effects caused by hpa models on an ofdm signal with high papr", in proceedings of the ieee second ecuador technical chapters meeting (etcm), 2017, pp. 1–5. instruction facta universitatis series: electronics and energetics vol. 35, no 1, march 2022, pp. 
1-11 https://doi.org/10.2298/fuee2201001t © 2022 by university of niš, serbia | creative commons license: cc by-nc-nd original scientific paper influence of oxide thickness variation on analog and rf performances of soi finfet dhananjaya tripathy1,2, debiprasad priyabrata acharya1, prakash kumar rout2, sudhansu mohan biswal2 1department of electronics and communication engineering, national institute of technology, rourkela, india 2department of electronics and instrumentation engineering, silicon institute of technology, bhubaneswar, india abstract. this paper focuses on the impact of variation in the thickness of the oxide (sio2) layer on the performance parameters of a finfet analysed by varying the oxide layer thickness in the range of 0.8nm to 3nm. while varying the oxide layer thickness, the overall width of the finfet is fixed at a value 30nm, and the finfet parameters are analysed for structures with different oxide layer thickness. the parameters like drain current, transconductance, transconductance generation factor, parasitic capacitances, output conductance, cut-off frequency, maximum frequency, gbw, energy and power consumption are calculated to study the influence of finfet oxide (sio2) layer thickness variation. it is detected from the result and analysis that the drain current, transconductance, transconductance generation factor, gain bandwidth and output conductance improve with decrement in oxide layer thickness whereas, the parasitic capacitances, cut-off frequency and maximum frequency degrade when there is a reduction in oxide (sio2) layer thickness. the parameters like energy and consumed power of finfet get better when the oxide (sio2) layer thickness increases. 
key words: finfet, oxide layer thickness, transconductance generation factor, maximum frequency received august 9, 2021; received in revised form january 18, 2022 corresponding author: dhananjaya tripathy department of electronics and communication engineering, national institute of technology, rourkela, india e-mail: 520ec8012@ nitrkl.ac.in * an earlier version of this paper was presented at the 4th international conference on 2021 devices for integrated circuit (devic 2021), may 19-20, 2021, in kalyani, west bengal, india [1]. 2 d. tripathy, d. p. acharya, p. k. rout, s. m. biswal introduction the demand of highly compact and denser ics have created the interest amongst the researchers to downscale the regular silicon mos field effect transistor, which results in the evolution of compact ics but, as a consequence short-channel effects (sce) are developed in the device which degrades the device parameters immensely. so, multiple gate-based devices are considered a solution to continue downscaling. these devices possess improved controllability over lower leakage currents, sces and better yield. the performance can also be improved by varying the thickness of the oxide layer [1-5]. finfet is one of the evolutionary techniques for application based less-power consuming circuits as it displays commendable performance to nullify the short-channel problems due to the fact that multiple gates are monitoring a single channel [6-11]. fin type silicon on insulator-based field effect transistor is the newly evolved technology which is presently used in ics. finfets encompass a triple-gate construction to suppress the major performance problems, such as the sces. the silicon on insulator (soi) technique insulates the internal active area from the lower part of the substrates, which internally reduces the leakage current, parasitic capacitance, and the power dissipation of circuits. hence, soi based finfets are the center of attraction nowadays. 
detailed studies of soi based finfets are presented in [12-18]. constructing tri-gate finfets different approaches has been followed in recent years like soi based finfets, bulk finfet [6-18]. the inverted-t structure finfet [19] is also designed which provides better drain current compared to the soi based finfet. a multilevel logic design concept is adopted in place of complex gates to reduce the process variability and radiation effects. but it is very important to study the impact of the oxide layer thickness on the performance of the device. the oxide layer thickness variation is studied in [1], where the thickness is varied from 3 nm to 10 nm. but, in general the thickness of the oxide layer should not exceed 3nm for a finfet of channel length 30nm. here, the 3-dimensional construction of finfet is analyzed by altering the oxide layer (sio2) thickness, keeping the total dimension of the finfet fixed. to realize the physical mechanism of the device, various performance parameters are evaluated based on the mathematical expressions and finally simulated to get a comparative analysis. in section ii the theory is explained. the result and discussion are presented in section iii. section iv summarizes the total work done in the paper. device structure and simulation setup the core of the finfet i.e. the fin, is placed vertically making an angle of 90⁰ to the finfet body and is responsible for the flow of current. gate material with higher work function covers the silicon fin from three sides to reduce the sces by increasing the control over the device [20]. the 3-dimensional cross-sectional view of the soi-based finfet structure is represented in fig. 1. here the oxide (sio2) layer placed between the fin and the gate is the central point of the discussion. as mentioned in the table 1, the thickness of this layer is varied from 0.8nm to 3nm, keeping the total dimension of the finfet as a constant, i.e. 30nm. 
the fin height and width are taken to be 20 nm and 10 nm, with a channel length of 30 nm. the length of the device is kept at 110 nm, as shown in table 1.
influence of oxide thickness variation on analog and rf performances of soi finfet 3
fig. 1 a 3d cross-sectional view of the soi-based finfet
table 1 device specifications of finfet
parameters                       measurements
channel length                   30 nm
fin height                       20 nm
fin width                        10 nm
fin angle                        90°
equivalent oxide thickness       0.8 nm - 3 nm
ultra-thin body thickness        10 nm
total device length              110 nm
total device width               30 nm
the simulation was carried out using the standard tcad simulation tool silvaco atlas (2016). to achieve better accuracy, the 3d quantum transport equations and the drift-diffusion equations are included. the bohm quantum potential (bqp) model is used to account for the quantum effects that arise in nanoscale devices. to account for the leakage currents caused by thermal generation, the auger recombination/generation and shockley–read–hall (srh) models are used. for junctionless transistors, the quantum confinement effect is not significant, so it is not considered. the gummel-newton method is used for the numerical solution in this study. the temperature is set to 300 k throughout the simulation. the simulation model has been calibrated against published experimental data [21], as shown in fig. 2.
fig. 2 the calibration of the id–vgs characteristics of the finfet against experimental data [22]
results and discussion
to investigate the effect of the oxide (sio2) layer thickness, it was varied in the range of 0.8 nm to 3 nm, while keeping the overall dimension of the finfet fixed at 30 nm.
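as a rough orientation for this thickness sweep, the sketch below estimates the gate oxide capacitance per unit area with the parallel-plate relation cox = ε0·εr/tox. the relative permittivity of 3.9 for sio2 and the intermediate thickness values are assumptions made for illustration; they are not taken from the simulation setup.

```python
# parallel-plate estimate of the oxide capacitance per unit area.
# EPS_R_SIO2 = 3.9 is an assumed textbook value for sio2.
EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m
EPS_R_SIO2 = 3.9          # assumed relative permittivity of sio2


def cox_per_area(t_ox_nm: float) -> float:
    """Return the oxide capacitance per unit area in F/m^2 for a thickness in nm."""
    return EPS0 * EPS_R_SIO2 / (t_ox_nm * 1e-9)


if __name__ == "__main__":
    # 0.8 nm and 3 nm are the limits quoted in the text; 1 nm and 2 nm are
    # hypothetical intermediate points.
    for t_ox in (0.8, 1.0, 2.0, 3.0):
        c = cox_per_area(t_ox)
        # 1 F/m^2 equals 100 uF/cm^2
        print(f"t_ox = {t_ox:.1f} nm -> C_ox = {c * 1e2:.2f} uF/cm^2")
```

the estimate only shows the 1/tox scaling; the simulated three-dimensional gate capacitance also depends on the fin geometry.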
to assess the influence of the oxide layer thickness on important performance parameters such as drain current, transconductance, transconductance generation factor, parasitic capacitances, cut-off frequency, maximum frequency, gain bandwidth, energy and power consumption [22-24], soi finfets with different oxide layer thicknesses were simulated and investigated. the drain current of a device is the major parameter to be observed. a device is more desirable if it delivers more drain current for a given gate voltage. fig. 3 plots the drain current against the gate-to-source voltage for finfets with different sio2 layer thicknesses, and the graph shows that the drain current increases as the oxide layer becomes thinner. decreasing the sio2 thickness increases the oxide capacitance (cox), which in turn raises the drain current, as it is directly proportional to cox.
fig. 3 id ~ vgs curve with varying oxide layer thickness
for operation at higher frequencies, the transconductance (gm) plays a key role, as it represents the amplification capability of the finfet. it is mathematically defined as [25]
gm = ∂id/∂vgs (1)
fig. 4 shows the gm ~ vgs curve for the finfets with different sio2 layer thicknesses; a lower oxide layer thickness provides a higher transconductance. this happens because the transconductance follows the drain current, which increases as the oxide layer thickness is reduced. to analyze the combined impact of transconductance and drain current on the device, the transconductance generation factor needs to be examined. it is defined as the ratio of the transconductance to the drain current [25]
tgf = gm/id (2)
fig. 4 gm ~ vgs curve with varying oxide layer thickness
fig.
5 shows the tgf ~ vgs curve for the finfets with different sio2 layer thicknesses; a lower oxide layer thickness provides a better transconductance generation factor. the next parameter to be analyzed is the output conductance (gds), which determines the overall gain of the device. the gds ~ vgs curve is plotted in fig. 6 for sio2 layer thicknesses from 0.8 nm to 3 nm, and the graph shows that the structure with the thinnest oxide layer has the highest output conductance. the output conductance is proportional to the rate of change of the drain current. since the drain current is higher for devices with a thinner oxide, the output conductance also increases as the oxide layer thickness is reduced.
fig. 5 tgf ~ vgs curve with varying oxide layer thickness
the parasitic capacitances play a vital role in the radio-frequency (rf) performance of any device. the different parasitic capacitances are plotted in fig. 7. the cgd ~ vgs, cgs ~ vgs and cgg ~ vgs curves are shown in fig. 7(a), fig. 7(b) and fig. 7(c) respectively. in each case the thickness of the sio2 layer is varied in the range of 0.8 nm to 3 nm and the behavior of each structure is analyzed. in all cases the parasitic capacitance decreases as the oxide layer thickness increases. the dependence of the parasitic capacitances, i.e. the gate-to-drain capacitance, the gate-to-source capacitance and the total gate capacitance, on the sio2 layer thickness is displayed in fig. 7(d). the parasitic capacitances improve, i.e. become smaller, as the oxide layer thickness increases.
fig. 6 gds ~ vgs curve with varying oxide layer thickness
fig.
7 (a) cgd ~ vgs curve with varying oxide layer thickness; (b) cgs ~ vgs curve with varying oxide layer thickness; (c) cgg ~ vgs curve with varying oxide layer thickness; (d) capacitance ~ oxide layer thickness curve at vgs = 0.8 v
the cutoff frequency (ft) is treated as the most important parameter for rf applications. it is the frequency at which the current gain of the device equals unity and is given by [11]
ft = gm / (2*pi*cgg) (3)
with cgg = cgd + cgs, where cgd and cgs are the gate-to-drain and gate-to-source capacitances respectively. the ft ~ vgs curve is shown in fig. 8 for sio2 thicknesses from 0.8 nm to 3 nm, and the device with the thicker oxide layer achieves the higher cutoff frequency. equation (3) shows that the cutoff frequency is inversely proportional to the gate capacitance, which increases as the oxide layer becomes thinner. so, the device with a thinner oxide layer has a lower cutoff frequency than the device with a thicker oxide layer.
fig. 8 ft ~ vgs curve with varying oxide layer thickness
the maximum frequency of a device is defined as the frequency at which the power gain becomes unity. it is given by [11]
fmax = gm / (2*pi*cgs*(√(4*(rs+ri+rg)*(gds+gm*(cgd/cgs))))) (4)
where rg, rs and ri are the gate, source and channel resistances respectively [26]. the dependence of the maximum frequency on the oxide layer thickness is analyzed in fig. 9, which shows the fmax ~ vgs curve for structures with different sio2 thicknesses; the maximum frequency improves as the oxide layer thickness increases.
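equations (1) to (4) can be evaluated directly from simulated or measured curves. the following is a minimal sketch, assuming the id ~ vgs data are available as plain lists in si units; the function names and all numerical values in the usage part are hypothetical and are not results of this work.

```python
import math


def transconductance(vgs, ids):
    """Eq. (1): gm = d(id)/d(vgs), central differences inside, one-sided at the ends."""
    n = len(vgs)
    gm = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        gm.append((ids[hi] - ids[lo]) / (vgs[hi] - vgs[lo]))
    return gm


def tgf(gm, i_d):
    """Eq. (2): transconductance generation factor, in 1/V."""
    return gm / i_d


def cutoff_frequency(gm, cgs, cgd):
    """Eq. (3): ft = gm / (2*pi*cgg) with cgg = cgs + cgd."""
    return gm / (2.0 * math.pi * (cgs + cgd))


def max_frequency(gm, gds, cgs, cgd, rs, ri, rg):
    """Eq. (4): fmax from gm, gds, the capacitances and the series resistances."""
    return gm / (2.0 * math.pi * cgs
                 * math.sqrt(4.0 * (rs + ri + rg) * (gds + gm * cgd / cgs)))


if __name__ == "__main__":
    # hypothetical id ~ vgs samples (volts, amperes), for illustration only
    vgs = [0.0, 0.2, 0.4, 0.6, 0.8]
    ids = [1e-9, 1e-7, 5e-6, 2e-5, 4e-5]
    gm = transconductance(vgs, ids)
    # hypothetical small-signal values at the last bias point
    cgs, cgd, gds = 3e-17, 2e-17, 1e-5
    print("gm   =", gm[-1], "S")
    print("tgf  =", tgf(gm[-1], ids[-1]), "1/V")
    print("ft   =", cutoff_frequency(gm[-1], cgs, cgd), "Hz")
    print("fmax =", max_frequency(gm[-1], gds, cgs, cgd, 50.0, 50.0, 50.0), "Hz")
```

the finite-difference derivative is a swapped-in numerical choice; a simulator may report gm directly.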
from equation (4) it is clear that the maximum frequency is inversely proportional to the parasitic capacitance, which increases for lower values of the oxide layer thickness. so, the device with a thinner oxide layer has a lower maximum frequency than the device with a thicker oxide layer.
fig. 9 fmax ~ vgs curve with varying oxide layer thickness
the trade-off between gain and bandwidth is quantified by the gain bandwidth product (gbw) [27,28]. for semiconductor devices it is defined as
gbw = gm / (20*pi*cgd) (5)
the gbw ~ vgs curve is shown in fig. 10 for different sio2 thicknesses. the graph shows that the gain bandwidth decreases as the oxide layer thickness increases. from equation (5), the gain bandwidth is inversely proportional to the gate-to-drain capacitance and directly proportional to the transconductance. the transconductance is the dominant parameter, so the gain bandwidth is higher for the device with the thinner oxide layer.
fig. 10 gbw ~ vgs curve with varying oxide layer thickness
along with the analog and rf performance parameters discussed above, two further parameters, the energy and the total power consumption, also need to be studied from the application point of view; they are discussed below. the energy ~ vgs curve for structures with different oxide layer thicknesses is displayed in fig. 11. the graph shows that the energy improves, i.e. decreases, for a thicker oxide layer. this is because the energy (cv^2) depends mainly on the capacitance when the supply voltage is fixed, and, as discussed above, the capacitances are smaller for a thicker oxide layer, which lowers the energy of the device.
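the competing trends described above can be illustrated with a short numerical sketch of equation (5) and of the cv^2 energy. the two parameter sets below are hypothetical and only mimic the qualitative behaviour reported here (a thinner oxide raises both gm and the capacitances); they are not simulation results.

```python
import math


def gain_bandwidth(gm, cgd):
    """Eq. (5): gbw = gm / (20*pi*cgd)."""
    return gm / (20.0 * math.pi * cgd)


def switching_energy(c, vdd):
    """Energy c*v^2 for a capacitance c (farads) at a fixed supply vdd (volts)."""
    return c * vdd ** 2


# hypothetical small-signal values for a thin and a thick oxide
thin = {"gm": 2.0e-3, "cgd": 2.0e-17, "cgg": 5.0e-17}
thick = {"gm": 1.0e-3, "cgd": 1.5e-17, "cgg": 3.5e-17}

if __name__ == "__main__":
    vdd = 0.8  # assumed supply voltage
    for name, p in (("thin", thin), ("thick", thick)):
        print(name,
              "gbw =", gain_bandwidth(p["gm"], p["cgd"]), "Hz,",
              "energy =", switching_energy(p["cgg"], vdd), "J")
```

with these assumed numbers gm grows faster than cgd, so the thin oxide gives the larger gbw, while its larger capacitance gives the larger cv^2 energy.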
the power consumption of any device is proportional to its energy. hence, the power consumption also improves for the structures with a thicker oxide layer, as shown in fig. 12. the power ~ vgs curve is shown in fig. 12 for different sio2 thicknesses, and power ~ oxide thickness is analyzed in fig. 12(b) at constant. the finfet consumes more power for a thinner oxide layer.
fig. 11 energy ~ vgs curve with varying oxide layer thickness
fig. 12 power ~ vgs curve with varying oxide layer thickness
conclusion
in this paper, the basic finfet structure has been analysed by varying the oxide layer thickness while keeping the total dimension of the finfet constant. different analog and radio-frequency performance parameters of the device, such as the drain current, transconductance, transconductance generation factor, parasitic capacitances, output conductance, cut-off frequency, maximum frequency, gain bandwidth product, energy and power consumption, are determined. the analysis shows that the drain current, transconductance, transconductance generation factor, gain bandwidth and output conductance degrade as the oxide layer thickness increases, whereas the parasitic capacitances improve, so that the cut-off frequency and the maximum frequency improve at higher oxide layer thickness. hence, it can be concluded that increasing the oxide layer thickness improves the radio-frequency parameters but degrades the analog parameters. finally, the energy and power dissipation of the finfet are determined for varying sio2 thickness, and it is concluded that these parameters improve as the sio2 thickness increases.
references
[1] d. tripathy, p. k. rout, d. nayak, s. m. biswal, n.
singh, "the impact of oxide layer width variation on the performance parameters of finfet", in proceedings of the ieee conference (devic), may 2021, pp. 577–580.
[2] c. auth, c. allen, a. blattner, d. bergstrom, m. brazier, m. bost, m. buehler, v. chikarmane, t. ghani, t. glassman and r. grover, "a 22 nm high performance and low-power cmos technology featuring fully-depleted tri-gate transistors, self-aligned contacts and high density mim capacitors", in proceedings of the symposium on vlsi technology (vlsit), 2012, pp. 131–132.
[3] a. pal and a. sarkar, "analytical study of dual material surrounding gate mosfet to suppress short-channel effects (sces)", elsevier, pp. 205–212, july 2014.
[4] a. majumdar, z. ren, s. j. koester and w. haensch, "undoped-body extremely thin soi mosfets with back gates", ieee trans. electron devices, vol. 56, no. 10, pp. 2270–2276, sep. 2009.
[5] m. saitoh, k. ota, c. tanaka, k. uchida and t. numata, "10 nm-diameter tri-gate silicon nanowire mosfets with enhanced high-field transport and vth tunability through thin box", in proceedings of the symposium on vlsi technology, 2012, pp. 11–12.
[6] p. zheng, d. connelly, f. ding and t. k. liu, "simulation-based study of the inserted-oxide finfet for future low-power system-on-chip applications", ieee electron device lett., vol. 36, no. 8, pp. 742–744, aug. 2015.
[7] m. d. ko, c. w. sohn, c. k. baek and y. h. jeong, "study on a scaling length model for tapered tri-gate finfet based on 3-d simulation and analytical analysis", ieee trans. electron devices, vol. 60, no. 9, pp. 2721–2727, 2013.
[8] k. biswas, a. sarkar and c. k. sarkar, "spacer engineering for performance enhancement of junctionless accumulation mode bulk finfets", iet circuits, devices & systems, vol. 11, pp. 80–88, sept. 2016.
[9] k. biswas, a. sarkar and a. sarkar, "effect of channel doping and fin shapes on performance of junctionless bulk finfet", in proceedings of the ieee conference (devic), 2020.
[10] k. biswas, a.
sarkar, c. k. sarkar, "impact of fin width scaling on rf/analog performance of junctionless accumulation-mode bulk finfet", acm j. emerg. technol. comput. syst., vol. 12, pp. 1–12, may 2016.
[11] k. biswas, a. sarkar and c. k. sarkar, "fin shape influence on analog and rf performance of junctionless accumulation-mode bulk finfets", microsyst. technol., pp. 2317–2324, jan. 2018.
[12] d. nagy, m. a. elmessary, m. aldegunde, r. valin, a. martinez, j. lindberg, w. g. dettmer, d. perić, a. j. garcia-loureiro and k. kalna, "3-d finite element monte carlo simulations of scaled si soi finfet with different cross sections", ieee trans. nanotechnol., vol. 14, no. 1, pp. 93–100, jan. 2015.
[13] t. matsukawa, k. fukuda, y. x. liu, k. endo, j. tsukada, h. yamauchi, y. ishikawa, s. o'uchi, w. mizubayashi, s. migita and y. morita, "lowest variability soi finfets having multiple vt by back-biasing", in proceedings of the symposium on vlsi technol. syst. appl., 2014, pp. 1–2.
[14] w. schwarzenbach, b.-y. nguyen, f. allibert, c. girard and c. maleville, "ultra-thin body & buried oxide soi substrate development and qualification for fully depleted soi device with back bias capability", solid-state electron., vol. 117, pp. 2–9, mar. 2016.
[15] m. poljak, v. jovanovic and t. suligoj, "improving bulk finfet dc performance in comparison to soi finfet", microelectron. eng., vol. 86, no. 10, pp. 2078–2085, 2009.
[16] h. w. gao, y. h. wang and t. k. chiang, "a quasi-3-d scaling length model for trapezoidal finfet and its application to subthreshold behavior analysis", ieee trans. nanotechnol., vol. 16, no. 2, pp. 281–289, mar. 2017.
[17] t. chiang, "a new short-channel-effect-degraded subthreshold behavior model for double-fin multichannel fets (dfmcfets)", ieee trans. nanotechnol., vol. 16, no. 1, pp. 16–22, jan. 2017.
[18] n. waldron, c. merckling, w. guo, p. ong, l. teugels, s. ansar, d. tsvetanova, f.
sebaai, d. h. van dorp, a. milenin and d. lin, "an ingaas/inp quantum well finfet using the replacement fin process integrated in an rmg flow on 300mm si substrates", in proceedings of the 2014 symposium on vlsi technology digest of technical papers, 2014, pp. 232–233.
[19] e. yu, k. heo and s. cho, "characterization and optimization of inverted-t finfet under nanoscale dimensions", ieee trans. electron devices, vol. 65, no. 8, pp. 3521–3527, aug. 2018.
[20] m. j. h. van dal, g. vellianitis, g. doornbos, b. duriez, t. m. shen, c. c. wu, r. oxland, k. bhuwalka, m. holland, t. l. lee and c. wann, "demonstration of scaled ge p-channel finfets integrated on si", in proceedings of the 2012 international electron devices meeting, 2012, pp. 521–524.
[21] t. bentrcia, f. djefal, e. chebaki and d. arar, "a kriging framework for the efficient exploitation of the nanoscale junctionless dg mosfets including source/drain extensions and hot carrier effect", in proceedings of the materials today, 2017, vol. 4, pp. 6804–6813.
[22] s. k. pattnaik, u. nanda, d. nayak, s. r. mohapatra, a. b. nayak and a. mallick, "design and implementation of different types of full adders in alu and leakage minimization", in proceedings of the 2017 international conference on trends in electronics and informatics (icei), 2017, pp. 924–927.
[23] d. nayak, d. p. acharya, p. k. rout and u. nanda, "a novel charge recycle read write assist technique for energy efficient and fast 20 nm 8t-sram array", solid-state electron., vol. 148, pp. 43–50, oct. 2018.
[24] d. nayak, p. k. rout, s. sahu, d. p. acharya, u. nanda and d. tripathy, "a novel indirect read technique-based sram with ability to charge recycle and differential read for low power consumption, high stability and performance", microelectron. j., vol. 97, pp. 1–11, feb. 2020.
[25] s. manikandan and n. b. balamurugan, "the improved rf/stability and linearity performance of the ultrathin-body gaussian-doped junctionless finfet", j. comput.
electron., vol. 19, no. 2, pp. 613–621, mar. 2020.
[26] a. sarkar and c. k. sarkar, "rf and analogue performance investigation of dg tunnel fet", int. j. electron. lett., vol. 1, no. 4, pp. 210–217, dec. 2013.
[27] s. m. biswal, b. baral, d. de and a. sarkar, "simulation and comparative study on analog/rf and linearity performance of iii–v semiconductor-based staggered heterojunction and inas nanowire (nw) tunnel fet", microsyst. technol., vol. 25, no. 5, pp. 1855–1861, may 2019.
[28] s. misra, s. m. biswal, b. baral, s. k. swain, a. sarkar and s. k. pati, "analytical modelling of a cyljlam mosfet in the subthreshold region using distinct device geometry", j. comput. electron., vol. 20, no. 1, pp. 480–491, feb. 2021.
instruction facta universitatis series: electronics and energetics vol. 27, no 2, june 2014, pp. 259-273 doi: 10.2298/fuee1402259n
physical modeling of electrical and dielectric properties of high-k ta2o5 based mos capacitors on silicon
nenad novkovski
institute of physics, faculty of natural sciences and mathematics, university “ss. cyril and methodius”, arhimedova 3, 1000 skopje, macedonia
abstract. in this paper we present an integral physical model describing the electrical and dielectric properties of mos structures containing a dielectric stack composed of a high-k dielectric (with emphasis on pure and doped ta2o5) and an interfacial silicon dioxide or silicon oxynitride layer. based on the model, an equivalent circuit of the structure is proposed. the validity of the model was demonstrated for structures containing different metal gates (al, au, pt, w, tin, mo) and different ta2o5 based high-k dielectrics, grown on bare or nitrided silicon substrates. the model describes very well the i-v characteristics of the considered structures, as well as the frequency dependence of the capacitance in accumulation. stress-induced leakage currents are also effectively analyzed by the use of the model.
key words: high-k dielectrics, metal-insulator-silicon structures, conduction mechanisms in dielectrics, leakage currents
1. introduction
further scaling of microelectronic devices, required for new generations of integrated circuits, faces multiple challenges, an important one being the fabrication of the ultrathin dielectric layers used particularly in mosfets and drams. as the lateral size of devices decreases, a decrease of the equivalent oxide thickness is required in order to obtain the required capacitance. this requirement can be met either by decreasing the physical thickness or by increasing the permittivity of the dielectric (gate oxide for mosfets, dielectric in mos capacitors of drams). doped, mixed and laminate high-permittivity (high-k) dielectric stacks attract increasing attention as a solution for further improvement of electrical and dielectric properties [1]-[13]. it has been shown that ta2o5, known as one of the most attractive dielectrics for nanoscale dynamic random-access memories, can be improved further by doping with suitable elements [14].
received february 5, 2014
corresponding author: nenad novkovski, institute of physics, faculty of natural sciences and mathematics, university “ss. cyril and methodius”, arhimedova 3, 1000 skopje, macedonia (e-mail: nenad@iunona.pmf.ukim.edu.mk)
260 n. novkovski
detailed studies of the properties of tantalum pentoxide doped with al, ti and hf and mixed with hfo2 have been reported [15]-[30]. in addition, it has been shown that nitridation of the si substrate substantially improves the electrical, dielectric and reliability properties of metal-high-k-si structures [31]. in [32] we described in detail a comprehensive model for the i-v characteristics of metal-ta2o5/sio2-si structures.
in this work we present in an integral form the generalization of the comprehensive model for mis structures containing a dielectric stack composed of a high-k dielectric (particularly pure and doped ta2o5) and an interfacial silicon dioxide or silicon oxynitride layer, and review the important results obtained by using specific cases of this model for various mos structures of the considered type.
2. theoretical model
2.1. band diagram
the band diagram of the considered structure in the case of an al gate is shown in fig. 1.
[fig. 1: band diagram; recoverable labels: vacuum level, al, high-k, sio2 or sioxny, si, ec, ef, ev, 4.05 ev, 4.25 ev, 1.12 ev]
fig. 1 band diagram of the considered structure
in fig. 1, $E_{g,hk}$ and $E_{g,if}$ are the bandgaps of the high-k and the interfacial layer, respectively. $\Phi_e'$ and $\Phi_h'$ are the band offsets for electrons and holes, respectively, at the contact between the high-k and the interfacial layer, while $\Phi_e$ and $\Phi_h$ are the band offsets for electrons and holes, respectively, at the contact between the interfacial layer and the silicon substrate. $\phi_{ms}$ is the work function difference between the metal gate and si, while $\varphi_s$ is the schottky barrier height for electrons. in the case of an al gate, a ta2o5 high-k dielectric and an sio2 interfacial layer the values are those summarized in table 1. the work function difference, $\phi_{ms}$, depends on the si substrate doping and is the same as in the case of the corresponding metal-sio2-si structure. for p-type substrates it is around 0.5 ev.
table 1 values of bandgaps and band offsets for al-ta2o5/sio2-si structures
quantity          value (ev)
$E_{g,if}$        8.97
$E_{g,hk}$        4.4
$\Phi_e$          3.15
$\Phi_h$          4.97
$\Phi_e'$         3.06
$\Phi_h'$         1.51
$\varphi_s$       0.29
physical modeling of ta2o5 based mos capacitors on si 261
2.2. conduction mechanisms
the conduction mechanisms that have to be considered in the general case for the interfacial layer are:
• hopping conduction, which is a result of the quantum diffusion of electrons between the localized states in the insulator, typical of disordered materials.
this is a bulk-limited conduction mechanism, and hence it does not depend on the gate voltage polarity. since the current density in this case is a linear function of the electric field, we can consider it as a conductivity of ohmic type.
• trap-assisted inelastic tunneling [33]-[34]. electrons tunnel from the silicon to the traps in the sio2 layer. as sio2 is an amorphous material with a low trap density, this effect is expected only in films where traps have been created by stress, radiation or process-induced damage. in the case of an sioxny interfacial layer a significantly higher density of traps is to be expected. however, this density is still very low compared to materials with a typically high trap density.
• direct tunneling (through a trapezoidal barrier) and fowler-nordheim injection (through a triangular barrier) into the interfacial layer. the tunneling current can be carried by electrons or holes from the si substrate. the barrier for the tunneling of holes differs from that for electrons, so a marked asymmetry can be observed between the opposite polarities.
a particular mechanism involving both sioxny and high-k is the tunneling through a double barrier (through a trapezoidal barrier in sio2 and a triangular barrier in high-k).
the conduction mechanisms that have to be considered for the high-k dielectric are:
• poole-frenkel mechanism, which is bulk-limited, and hence independent of the gate bias polarity. electrons are excited to the conduction band from the traps by field-enhanced thermal emission and they drift through the layer. because of the high defect density, they are easily trapped by other positively charged defects. new electrons are released from other traps, thus transporting the charge step by step from one surface of the film to the opposite one (fig. 2). when the gate is negative, electrons first need to enter the insulator from the metal gate.
it is to be noted that they do not need to obtain enough energy to enter the conduction band, but just to move to a defect-related state in the vicinity of the metal surface. the activation energies of the defects responsible for the poole-frenkel emission in ta2o5 are 0.2 ev (type a, [35]) and 0.8 ev (type d, most probably the first ionization level of the double-donor oxygen vacancy, [36]). they are close to or lower than the metal-gate fermi level (0.29 ev under the conduction band of ta2o5). we estimated the tunneling probability from the al gate to the neighboring traps to be so high that extremely high current densities, of the order of 100 a/cm^2, can be attained for a voltage drop of only a few mv.
• schottky emission, which is an electrode-limited effect. schottky conduction is excluded for a positively biased gate, because the side of the high-k layer near the negative electrode is not in direct contact with a metal or semiconductor. for a negatively biased gate, the barrier is low (for ta2o5 only 0.29 ev), and hence schottky emission is to be expected. however, it is not expected to be a current-limiting mechanism, because the electrons injected in this way are quickly trapped in the high-k layer near the contact with the metal, and the transport continues by poole-frenkel emission from the traps. namely, the pure schottky effect occurs when electrons are injected from the metal into vacuum. the situation is similar when they are injected into a medium where they can almost freely traverse the distance from the injecting to the opposite electrode, as is the case with ultra-thin sio2 or sioxny if the defect density is fairly low. for metals with higher absolute values of the work function this issue requires further consideration. we observed a particular effect of charge trapping at the interface between the metal gate and the high-k dielectric for au and pt [37]-[39].
although schottky emission from the metal to the high-k conduction band is practically impossible, emission to the traps can substantially influence the leakage currents. for example, in the case of ta2o5 and a pt electrode, the fermi level in the metal is about 0.6 ev lower than the trapping level of the d type defect. in that case the filling of the d type traps can occur by thermal emission from the metal, leading to a schottky-like effect at low applied voltages, as was observed on au-ta2o5-pt-si structures with the pt electrode negatively biased [40]. this issue requires deeper investigation in a separate study on metal-insulator-metal structures. one possible approach to this problem is to use the multi-step trap-assisted tunneling model, as was done in [41] for metal-al2o3-si structures.
fig. 2 illustration of the poole-frenkel conduction mechanism
• the hopping conduction in the ta2o5 layer is of much lower importance, because the poole-frenkel mechanism already gives a much higher conductivity in ta2o5 than the hopping conductivity in sio2. specifically, when ta2o5 is polycrystalline, as is the case with the films studied here [42], the hopping conductivity is very weak, while the trap density (related to oxygen vacancies, grain boundaries etc.) becomes extremely high. therefore, it is reasonable to neglect the hopping conductivity.
2.2. differences between the cases of positive and negative gate
in the case of a positively biased gate, the electrons that tunnel through the sio2 barrier enter the ta2o5 conduction band. they drift for a small distance and then become trapped, but new electrons are subsequently emitted from the traps and continue the transport, step by step, until entering the metal (fig. 3).
[figure labels: e, si, sio2 or sioxny, high-k, metal]
fig.
3 conduction mechanisms at positive gate
in the case of a negatively biased gate, some electrons from the traps near the ta2o5/sio2 interface can move to the localized states in the sio2 layer and then contribute to the hopping conduction by quantum diffusion. tunneling of electrons through the sio2 layer from the ta2o5 layer and of holes from the si substrate could occur. the usual assumption that the electron current gives the dominant contribution in this case is not valid, because fowler-nordheim and direct tunneling are possible where an electron gas from a metal or semiconductor is in contact with an sio2 surface [43]. there, the dominant part of the electrons moving towards this surface is reflected, while a small part tunnels through the sio2 layer, either entering the opposite electrode (direct tunneling) or entering the conduction zone of the layer (fowler-nordheim tunneling). in the case of an insulator, the density of electrons in the conduction zone is practically zero and electron tunneling is practically impossible. therefore only the holes from the substrate contribute to the tunneling current [44]. for sufficiently high fields, the holes injected from the si substrate enter the valence band of the ta2o5 layer. because of the high trap density, after passing a small distance they recombine with the electrons on the traps. special attention has to be devoted to the case of lower fields, where the holes cannot tunnel to the valence band (fig. 4). other authors [45] have attempted to describe a similar situation by double barrier tunneling.
fig.
4 conduction mechanisms at negative gate
[labels recoverable from figs. 3 and 4: si, sio2 or sioxny, high-k, metal; tunneling of electrons; trapping of the electrons injected through sio2 into the high-k conduction band; poole-frenkel transport of electrons; tunneling of holes; trapping of holes; recombination of the electrons from the high-k traps with the holes injected through sio2]
our estimations in connection with the proposed comprehensive model showed poor agreement with the experimental results if a double barrier tunneling mechanism is invoked. the reason is that the dominant conduction mechanism for the ta2o5 layer is poole-frenkel emission and not tunneling. once the charge carriers enter the forbidden gap of the tantalum pentoxide, they become trapped after a short distance, because the defect-related trap density there is extremely high. tunneling is typical of sio2 films and is observed in si3n4 films of very high quality, where the defect density is low and the injected charge carriers can pass long distances (of the order of 100 nm) with a small probability of being trapped. in some cases (sio2 thinner than 4 nm) even ballistic transport is observed [46]. the most probable route of the holes injected into the ta2o5 forbidden gap is to be first trapped near the ta2o5/sioxny interface and then to recombine with electrons from other traps or from the conduction band (fig. 4). a similar situation can also appear in the case of low fields for the opposite gate polarity.
2.3. construction of the model
the current density due to the hopping conductivity in sio2 (jhc) is described by the following expression:
$J_{hc} = \sigma_{if} E_{if}$ (1)
where $\sigma_{if}$ is the temperature dependent hopping conductivity and $E_{if}$ is the field in the interfacial layer.
direct tunneling current density through the interfacial layer ($J_{td}$) is given by the following expression:

$$J_{td} = \frac{q^2}{8\pi h \Phi} E_{if}^2 \exp\left\{ -\frac{8\pi\sqrt{2 q m^*}}{3 h E_{if}}\, \Phi^{3/2} \left[ 1 - \left( 1 - \frac{E_{if} d_{if}}{\Phi} \right)^{3/2} \right] \right\} \qquad (2)$$

and for the fowler-nordheim injection ($J_{fn}$)

$$J_{fn} = \frac{q^2}{8\pi h \Phi} E_{if}^2 \exp\left( -\frac{8\pi\sqrt{2 q m^*}}{3 h E_{if}}\, \Phi^{3/2} \right), \qquad (3)$$

where $q$ is the electron charge, $h$ is the planck constant, $m^*$ is the effective tunneling mass of charge carriers injected through the interfacial layer, $d_{if}$ is the thickness of the interfacial layer, $\Phi$ is the tunneling barrier height and $E_{if}$ is the electric field in it. the total current density flowing through the interfacial layer ($J_{if}$) is given by the following expression:

$$J_{if} = J_{hc} + \begin{cases} J_{td}, & E_{if} d_{if} < \Phi \\ J_{fn}, & E_{if} d_{if} \ge \Phi \end{cases}, \qquad (4)$$

and the voltage drop on the interfacial layer ($V_{if}$) is

$$V_{if} = d_{if} E_{if}. \qquad (5)$$

the current density due to the poole-frenkel effect in the high-k layer ($J_{pf}$) is described by the following expression:

$$J_{pf} = \sigma_{hk}(0)\, E_{hk} \exp\left( \frac{1}{r k T} \sqrt{\frac{q^3 E_{hk}}{\pi \varepsilon_0 K_T}} \right), \qquad (6)$$

where $\sigma_{hk}(0)$ is a temperature dependent defect related constant having dimensions of conductivity, $r$ is the degree of compensation [47], $k$ is the boltzmann constant, $T$ is the temperature, $\varepsilon_0$ is the dielectric permittivity of vacuum, $K_T$ is the optical frequency dielectric constant of the high-k dielectric and $E_{hk}$ is the electric field in it. the voltage drop on the layer ($V_{hk}$) is given by:

$$V_{hk} = d_{hk} E_{hk}, \qquad (7)$$

where $d_{hk}$ is the thickness of the high-k dielectric layer. the numerical procedure consists in simultaneous computation of the two following quantities: the oxide voltage

$$V_{ox} = V_{hk} + V_{if} = d_{hk} E_{hk} + d_{if} E_{if} \qquad (8)$$

and the current density in steady state (kirchhoff's laws)

$$J = J_{pf} = J_{if}. \qquad (9)$$

first the current density $J = J_{if}$ was determined for a given field $E_{if}$ in the interfacial layer. then the field in the high-k layer was computed as an inverse function of the current density $J_{pf} = J$. 
at the end, the oxide voltage was calculated using expression (8). we intend to use a minimum number of fitting parameters. the defect density parameter for the high-k layer was chosen first, because it depends on the technological parameters and is difficult to determine by independent methods. the silicon dioxide layer thickness was also treated as a fitting parameter in a restricted range (2 to 3 nm) close to the measured value, because small variations in it cause substantial variations in the result. later, these results were compared with independent measurements. the hopping conductivity was also treated as a fitting parameter, since there are no available data from independent experiments. because the different mechanisms do not exclude each other, they are considered in a single form for the entire measurement region; as we discussed in [48], this approach is unavoidable in the case of nano-layered dielectrics, where the contributions of different conduction mechanisms cannot be separated by standard methods that assume a single dominant conduction mechanism in a given voltage range. in the case of al-ta2o5/sio2-si structures the following typical values can be taken from the literature: tunneling electron mass in ultrathin sio2, $m_e^* = 0.61\,m_e$ [49], where $m_e$ denotes the free electron mass; tunneling hole mass in sio2, $m_h^* = 0.51\,m_e$; optical frequency dielectric constant of ta2o5, $K_T = n^2 = 2.1^2 = 4.4$; tunneling barrier height for holes in sio2, $\Phi_h = 4.70$ ev [49]; tunneling barrier height for electrons in sio2, $\Phi_e = 3.15$ ev [50]; and compensation factor, $r = 1$ (we consider the poole-frenkel effect without compensation).

the voltage on the stacked insulating layer ($V_{ox}$) can be calculated from the gate voltage ($V_g$), the flatband voltage ($V_{fb}$) and the voltage drop in the semiconductor ($V_s$):

$$V_{ox} = V_g - V_{fb} - V_s. \qquad (10)$$

the value of $V_{fb}$ was determined with the standard method, which is not described here. 
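
the numerical procedure built on expressions (1)-(9) can be sketched as follows. this is a minimal illustration, not the author's code: it assumes hole injection ($m_h^*$, $\Phi_h$ from the list above), the al-gate fitting values of table 2 for the films of [44], a temperature of 300 k, and a barrier height expressed in volts so that it combines with $q$ as in expressions (2) and (3). all function names are hypothetical.

```python
import math

Q = 1.602176634e-19      # elementary charge, C
H = 6.62607015e-34       # Planck constant, J s
K_B = 1.380649e-23       # Boltzmann constant, J/K
EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m
M_E = 9.1093837015e-31   # free-electron mass, kg

# Illustrative inputs (assumptions for this sketch, see lead-in).
M_EFF = 0.51 * M_E        # tunneling hole mass in SiO2
PHI = 4.70                # hole barrier height, taken in volts
K_T = 4.4                 # optical-frequency dielectric constant of Ta2O5
R_COMP = 1.0              # compensation factor r
D_IF = 2.78e-9            # interfacial layer thickness, m
D_HK = 47e-9              # high-k layer thickness, m
SIGMA_IF = 8.1e-17 * 100  # hopping conductivity, ohm^-1 cm^-1 -> S/m
SIGMA_HK0 = 8.2e-11 * 100 # Poole-Frenkel prefactor, ohm^-1 cm^-1 -> S/m
TEMP = 300.0              # assumed temperature, K


def j_tunnel(e_if):
    """Expressions (2) and (3): direct or Fowler-Nordheim tunneling, A/m^2."""
    pref = Q**2 / (8 * math.pi * H * PHI) * e_if**2
    expo = 8 * math.pi * math.sqrt(2 * Q * M_EFF) * PHI**1.5 / (3 * H * e_if)
    if e_if * D_IF < PHI:  # direct tunneling branch of expression (4)
        expo *= 1.0 - (1.0 - e_if * D_IF / PHI) ** 1.5
    return pref * math.exp(-expo)


def j_if(e_if):
    """Expression (4): hopping plus tunneling current in the interfacial layer."""
    return SIGMA_IF * e_if + j_tunnel(e_if)


def j_pf(e_hk):
    """Expression (6): Poole-Frenkel current in the high-k layer."""
    beta = math.sqrt(Q**3 / (math.pi * EPS0 * K_T))
    return SIGMA_HK0 * e_hk * math.exp(beta * math.sqrt(e_hk) / (R_COMP * K_B * TEMP))


def solve_e_hk(j):
    """Invert expression (6) for E_hk by bisection (j_pf is monotonic in E_hk)."""
    lo, hi = 0.0, 1.0
    while j_pf(hi) < j:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if j_pf(mid) < j:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def oxide_voltage(e_if):
    """Expressions (8) and (9): returns (V_ox, J) for a given field E_if in V/m."""
    j = j_if(e_if)
    e_hk = solve_e_hk(j)
    return D_HK * e_hk + D_IF * e_if, j


if __name__ == "__main__":
    for e_if in (2e8, 5e8, 8e8, 1.2e9):
        v_ox, j = oxide_voltage(e_if)
        print(f"E_if = {e_if:.1e} V/m  V_ox = {v_ox:.3f} V  J = {j:.3e} A/m^2")
```

sweeping $E_{if}$ in this way gives the pairs ($V_{ox}$, $J$) that are then compared with the measured curve; the gate voltage follows from expression (10).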
a low value of the fixed charge density in the sio2 was assumed, i.e. the ideal value of the flatband voltage ($V_{fb}^{id}$) was used. this assumption will be discussed later, though it can be simply treated as an approximation that holds for insulating films of high quality, where the oxide charge density is fairly low. the voltage drop in si ($V_s$) is connected with the electric field strength in the interfacial layer ($E_{if}$) by the following expression:

$$E_{if} = \frac{\varepsilon_{si}}{\varepsilon_{if}} \times \begin{cases} \sqrt{\dfrac{2kTp_0}{\varepsilon_0 \varepsilon_{si}}}\, \sqrt{ e^{-qV_s/kT} + \dfrac{qV_s}{kT} - 1 + \dfrac{n_i^2}{p_0^2} \left( e^{qV_s/kT} - \dfrac{qV_s}{kT} - 1 \right) }, & \text{p-type si} \\[2ex] \sqrt{\dfrac{2kTn_0}{\varepsilon_0 \varepsilon_{si}}}\, \sqrt{ e^{qV_s/kT} - \dfrac{qV_s}{kT} - 1 + \dfrac{n_i^2}{n_0^2} \left( e^{-qV_s/kT} + \dfrac{qV_s}{kT} - 1 \right) }, & \text{n-type si} \end{cases} \qquad (11)$$

where $\varepsilon_{si}$ is the relative permittivity of silicon, $\varepsilon_{if}$ is the relative permittivity of the interfacial layer, $n_0$ is the majority carrier density in n-type silicon, $p_0$ is the majority carrier density in p-type silicon and $n_i$ is the intrinsic carrier density in silicon. in strong inversion (positive gate for p-type substrate, negative gate for n-type substrate) the leakage current density reaches an almost saturated value of the order of magnitude of $1\ \mathrm{mA/cm^2}$. this saturation is due to the exhaustion of the minority carriers in the substrate, caused by their extraction from the substrate (electrons for p-type and holes for n-type). namely, the maximum tunneling current density of the electrons from the substrate is limited by the thermal generation rate of electrons in the inversion region of si, similarly to the case of the diode reverse current. the values observed in our experiment are comparable to the values obtained for p-n si diode reverse currents for voltages between 1 v and 10 v.

2.4. equivalent circuit

combining the model described above with the standard description of mis structures [51], a complete equivalent circuit of the considered structure can be constructed (fig. 5). 
diode (d), shown at the left end of the figure, accounts for the effect of exhaustion of minority carriers in strong inversion, as described above. the diode orientation shown in the figure corresponds to an n-type substrate; for a p-type si substrate the orientation is reversed.

fig. 5 equivalent circuit of the considered structure

the meanings of the symbols for physical quantities in fig. 5 are as follows: $R_l$ – serial resistance, $R_{hk}$ – voltage dependent resistance of the high-k layer, $R_{if}$ – voltage dependent resistance of the interfacial layer, $R_{it}$ – interface traps resistance, $C_{hk}$ – capacitance of the high-k layer, $C_{if}$ – capacitance of the interfacial layer and $C_{it}$ – interface traps capacitance. the capacitances of the layers of the dielectric stack are given by the following expressions:

$$C_{hk} = \frac{\varepsilon_0 \varepsilon_{hk} A}{d_{hk}} \qquad (12)$$

and

$$C_{if} = \frac{\varepsilon_0 \varepsilon_{if} A}{d_{if}}, \qquad (13)$$

where $\varepsilon_{hk}$ is the relative permittivity of the high-k layer and $A$ is the electrode area of the capacitor. $R_l$, $R_{it}$, $C_{if}$ and $C_{it}$ are to be extracted from the c-g-v curves at various frequencies, while $R_{hk}$ and $R_{if}$ are extracted from the i-v curves using the model described here. $R_{hk}$ and $R_{if}$ are both voltage dependent.

3. results

3.1. i-v curves

first we discuss those parameters obtained from fitting the theoretical to the experimental curve that can also be determined by independent methods. this is the case with the interfacial layer thickness ($d_{if}$) and the band offsets ($\Phi_e$ and $\Phi_h$) at the contact between si and sio2. for $\Phi_e$ and $\Phi_h$, values close to the literature data, 3.15 ev and 4.70 ev, respectively, have been obtained [44]. in [44], the fitted value $d_{if}$ = 2.8 nm was obtained, close to the value of 2.6 nm measured by transmission electron microscopy. some of the results obtained from applying the model to the experimental i-v curves of different al-high-k/sioxny-si structures are displayed in table 2. 
several important features of the structures are clearly identified by the values of the important parameters.

table 2 values of fitting parameters for al-high-k/sioxny-si structures

r.f. sputtered ta2o5 on bare si at substrate temperature 493 k (unpublished data)
annealed | d_if (nm) | d_hk (nm) | Φ_e (eV) | Φ_h (eV) | σ_hc (Ω^-1 cm^-1) | σ_hk(0) (Ω^-1 cm^-1)
not | 2.90 | 27 | 2.50 | 3.30 | 1×10^-16 | 3.95×10^-17
at 893 k | 2.95 | 27 | 3.05 | 3.40 | 1×10^-16 | 3.95×10^-15
at 1193 k | 2.97 | 26 | 3.15 | 4.70 | 1×10^-16 | 1.98×10^-12

ta2o5 obtained by thermal oxidation of ta in pure o2 at 873 k on bare si [44]
gate | d_if (nm) | d_hk (nm) | Φ_e (eV) | Φ_h (eV) | σ_hc (Ω^-1 cm^-1) | σ_hk(0) (Ω^-1 cm^-1)
al | 2.78 | 47 | 3.15 | 4.70 | 8.1×10^-17 | 8.2×10^-11
au | 2.72 | 47 | 3.15 | 4.70 | 8.1×10^-17 | 6.6×10^-14
w | 2.80 | 47 | 3.15 | 4.70 | 8.1×10^-17 | 1.7×10^-13

r.f. sputtered ta2o5 at 493 k on si nitrided in nitrous oxide at temperatures t_on [52]
t_on (k) | d_if (nm) | d_hk (nm) | Φ_e (eV) | Φ_h (eV) | σ_hc (Ω^-1 cm^-1) | σ_hk(0) (Ω^-1 cm^-1)
973 | 2.65 | 17.3 | 2.92 | 3.35 | 4×10^-15 | 3.3×10^-8
1073 | 2.70 | 17.3 | 2.85 | 3.50 | 1×10^-15 | 3.3×10^-8
1123 | 2.80 | 17.2 | 2.80 | 3.50 | 3×10^-15 | 3.3×10^-8

r.f. sputtered ta2o5 at 493 k on si nitrided in ammonia at temperatures t_on [52]
t_on (k) | d_if (nm) | d_hk (nm) | Φ_e (eV) | Φ_h (eV) | σ_hc (Ω^-1 cm^-1) | σ_hk(0) (Ω^-1 cm^-1)
973 | 2.70 | 17.3 | 2.60 | 3.30 | 1×10^-15 | 3.3×10^-8
1073 | 2.80 | 17.2 | 2.85 | 3.25 | 1×10^-15 | 3.3×10^-8

ta2o5 obtained by thermal oxidation of ta in pure o2 at 873 k on bare si [53]
gate | d_if (nm) | d_hk (nm) | Φ_e (eV) | Φ_h (eV) | σ_hc (Ω^-1 cm^-1) | σ_hk(0) (Ω^-1 cm^-1)
al | 1.84 | 8.1 | 3.15 | 4.4 | 1×10^-15 | 2×10^-9
w | 2.04 | 8.0 | 3.15 | 4.7 | 2×10^-15 | 8×10^-11
au | 2.05 | 8.0 | 3.15 | 4.7 | 5×10^-16 | 8×10^-11

metal-hf:ta2o5/sioxny-si structures (work in progress)
gate | d_if (nm) | d_hk (nm) | Φ_e (eV) | Φ_h (eV) | σ_hc (Ω^-1 cm^-1) | σ_hk(0) (Ω^-1 cm^-1)
ag | 2.56 | 5.44 | 2.6 | 4.2 | 2×10^-16 | 2×10^-16
w | 2.24 | 5.76 | 2.6 | 4.2 | 7×10^-15 | 2×10^-14
tin | 2.10 | 5.90 | 2.6 | 4.2 | 1.2×10^-12 | 1×10^-11

first, as is seen from data for r.f. 
sputtered ta2o5 on bare si at substrate temperature 493 k, unannealed films possess a high defect density, as manifested by a high value of the parameter $\sigma_{hk}(0)$; annealing substantially reduces the density of these defects. annealing also increases the band offsets, thus substantially reducing leakage currents. this is attributed to the improvement of stoichiometry of the interfacial silicon oxide. second, for ta2o5 obtained by thermal oxidation of ta in pure o2 at 873 k on bare si, the band offsets obtained are those for sio2, indicating that thermally grown films possess an sio2-like interfacial layer. the parameter depending on the defect density in the high-k layer, $\sigma_{hk}(0)$, is about two orders of magnitude higher for the reactive al gate than for the nonreactive au, w and tin gates, indicating that deposition of the reactive gate creates a large number of defects in the high-k layer. the thickness of the interfacial layer is practically independent of the gate material for films as thick as 50 nm [44], and weakly dependent on the gate material in the case of films as thin as 10 nm or thinner (nanosized dielectric) [53]. the low-field conductivity ($\sigma_{hc}$) for films as thick as 50 nm is independent of the gate material [44], while for nanosized films it is somewhat reduced in the case of the reactive al gate [53]. therefore, we conclude that in the case of nanosized high-k dielectrics the reactive gate also affects the interfacial layer. third, it is seen that substrate nitridation reduces the band offsets [52]. with this effect alone, the nitridation would degrade the leakage properties of the dielectric films. nevertheless, there is a more important beneficial effect of nitridation, consisting in an increase of the relative permittivity of the interfacial layer and a substantial decrease of the equivalent thickness with nitridation. 
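
the link between the interfacial-layer permittivity and the equivalent thickness can be illustrated with the usual sio2-equivalent thickness of a two-layer stack, $d_{eq} = d_{if}\,\varepsilon_{sio2}/\varepsilon_{if} + d_{hk}\,\varepsilon_{sio2}/\varepsilon_{hk}$. the sketch below uses the thicknesses of one nitrided sample from table 2; the permittivities (25 for ta2o5, 3.9 for sio2, 5.0 for an oxynitride interfacial layer) are assumed round numbers chosen only for illustration, not values from this work.

```python
EPS_SIO2 = 3.9  # relative permittivity of SiO2


def equivalent_thickness(d_if, eps_if, d_hk, eps_hk):
    """SiO2-equivalent thickness of an interfacial-layer/high-k stack (same units as inputs)."""
    return d_if * EPS_SIO2 / eps_if + d_hk * EPS_SIO2 / eps_hk


# Thicknesses in nm from table 2 (N2O nitridation at 1073 K); permittivities are assumptions.
d_if, d_hk, eps_hk = 2.70, 17.3, 25.0
eot_oxide = equivalent_thickness(d_if, 3.9, d_hk, eps_hk)      # SiO2-like interfacial layer
eot_nitrided = equivalent_thickness(d_if, 5.0, d_hk, eps_hk)   # oxynitride interfacial layer
print(f"equivalent thickness: {eot_oxide:.2f} nm (oxide) vs {eot_nitrided:.2f} nm (oxynitride)")
```

with the same physical thicknesses, a higher interfacial permittivity gives a smaller equivalent thickness, which is the effect referred to in the text.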
as a result, leakage currents for the same equivalent thicknesses are lower for films grown on nitrided substrates than for films grown on bare substrates. detailed analyses of the electrical and dielectric properties of different mos structures containing a high-k dielectric grown on nitrided si substrate have been reported in several works [31],[52],[62]. the model is also applicable to structures containing ta2o5 with different metals (one example is given in the last section of table 2). in addition, in [54] we have shown that the model described in this work is also applicable to the case of hfo2 high-k dielectrics, by fitting the experimental i-v curves obtained by other authors [55]. the same or a slightly modified model is expected to be applicable to various similar structures. recently, an analysis of the leakage properties of al-ta2o5/sioxny-si structures based on a derived model has been published by other authors [56].

3.2. effective capacitance

standard methods for characterization of mos structures include measurement of c-v and g-v (or r-v) curves in parallel mode (i.e., cp-v and g-v or rp-v) [51]. an alternative approach is to use c-v and r-v curves obtained in serial mode (cs-v and rs-v). our extensive experience with metal/high-k/si structures suggests that better results are obtained when using serial mode in the characterization of the capacitance properties of the considered structures. this approach has been supported by additional studies of the ac capacitance and resistance measurements at various frequencies [57],[58]. based on the model described here, an equivalent circuit (a simplified version of that shown in fig. 5) for the capacitance in accumulation has been constructed and applied to describe experimental results for measured capacitances and resistances as a function of the signal frequency, both in parallel and serial mode [57]. 
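
as a rough illustration of such a simplified circuit, the sketch below assumes that it consists of the serial resistance $R_l$ in series with two parallel r-c sections ($R_{hk}$, $C_{hk}$) and ($R_{if}$, $C_{if}$), which is the form consistent with expressions (14)-(16). the component values are arbitrary assumptions and the function name is hypothetical.

```python
import math


def series_equivalent(f, r_l, r_hk, c_hk, r_if, c_if):
    """Impedance of R_l + (R_hk || C_hk) + (R_if || C_if) and its serial-mode R_s, C_s."""
    w = 2.0 * math.pi * f
    z = r_l + r_hk / complex(1.0, w * c_hk * r_hk) + r_if / complex(1.0, w * c_if * r_if)
    r_s = z.real               # effective serial resistance
    c_s = -1.0 / (w * z.imag)  # effective serial capacitance
    return z, r_s, c_s


# Assumed values for illustration only (ohms and farads).
R_L, R_HK, C_HK, R_IF, C_IF = 100.0, 1e9, 2e-10, 1e8, 5e-10

for freq in (1e2, 1e4, 1e6):
    _, r_s, c_s = series_equivalent(freq, R_L, R_HK, C_HK, R_IF, C_IF)
    print(f"f = {freq:.0e} Hz  R_s = {r_s:.4g} ohm  C_s = {c_s:.4g} F")
```

at high frequency the serial capacitance tends to the series combination of $C_{hk}$ and $C_{if}$ and the serial resistance tends to $R_l$, while at low frequency both become strongly frequency dependent through $R_{hk}$ and $R_{if}$.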
the impedance of the considered equivalent circuit ($Z$) is given by the following expression:

$$Z = R_l + \frac{R_{hk}}{1 + (2\pi f C_{hk} R_{hk})^2} + \frac{R_{if}}{1 + (2\pi f C_{if} R_{if})^2} + \frac{1}{2\pi i f} \left[ \frac{1}{C_{hk}} \frac{1}{1 + \dfrac{1}{(2\pi f C_{hk} R_{hk})^2}} + \frac{1}{C_{if}} \frac{1}{1 + \dfrac{1}{(2\pi f C_{if} R_{if})^2}} \right], \qquad (14)$$

where $f$ is the measurement signal frequency. for measurements in serial mode (at a given gate voltage $V$ in accumulation), the corresponding effective serial capacitance ($C_s$) and resistance ($R_s$) are frequency dependent and given by the following expressions:

$$C_s(f) = \left[ \frac{1}{C_{hk}} \frac{1}{1 + \dfrac{1}{(2\pi f C_{hk} R_{hk}(V))^2}} + \frac{1}{C_{if}} \frac{1}{1 + \dfrac{1}{(2\pi f C_{if} R_{if}(V))^2}} \right]^{-1} \qquad (15)$$

and

$$R_s(f) = R_l + \frac{R_{hk}}{1 + (2\pi f C_{hk} R_{hk}(V))^2} + \frac{R_{if}}{1 + (2\pi f C_{if} R_{if}(V))^2}. \qquad (16)$$

in [57] excellent fits to the experimental results for al-ta2o5/sio2 structures have been obtained when using expressions (15) and (16). detailed analyses of the c-v, r-v and g-v curves for metal(al,w,au)-ta2o5/sio2 structures, both in parallel and serial mode, have been reported in [51]. all the results obtained are consistent with the model described in this work.

3.3. stress-induced leakage currents

in addition to the description of the leakage currents of fresh structures, this model has been successfully applied to the description of the stress-induced leakage currents. we mainly studied the case of constant current stress. we have shown that the i-v characteristics of stressed al-ta2o5/sio2 structures can be very well described by our model [59]. the increase of the leakage currents with the stress has been attributed to the degradation of the interfacial layer by creation of a high density of defects in a part of it. this part can be degraded to the point where it can be regarded as a conductive material where conduction occurs through percolation paths [59]-[61].

4. 
conclusions

a comprehensive physical model for describing the electrical and dielectric properties of mos capacitors containing a high-k/(sio2,sioxny) dielectric stack has been described in detail. the corresponding equivalent circuit has been constructed and displayed. the proposed model describes very well mos structures containing ta2o5 based dielectric layers, obtained with different technological procedures and with different doping. it has also been shown that the model can be used for other high-k dielectrics such as hfo2. based on the model, degradation of the dielectric properties of the high-k dielectric layer induced by a reactive metal gate, such as al, can be clearly distinguished from other effects. the model is applicable to fresh as well as to high-field/current stressed samples, thus allowing analysis of the stress-induced leakage currents at medium fields. finer details of the effect of various technological processes on the electrical and dielectric properties of the considered structures can be extracted using the model.

acknowledgement: this work was supported by macedonian ministry of education and sciences under contract 13-3573.

references [1] j. zhang, z. li, h. zhou, c. ye and h. wang, “electrical, optical and micro-structural properties of ultrathin hftion films”, applied surface science, in press, http://dx.doi.org/10.1016/j.apsusc.2013.12.064. [2] c. ye, c. zhan, j. zhang, h. wang, t. deng and s. tang, “influence of rapid thermal annealing temperature on structure and electrical properties of high permittivity hftio thin film used in mosfet”, microelectronics reliability 54, 2014, pp. 388–392. [3] s. chen, zh. liu, l. feng, x. che and x. zhao, “the dielectric properties enhancement due to yb incorporation into hfo2”, appl. phys. lett. 103, 2013, pp. 132902 (4 pages). [4] g. lee, b.-k. lai, c. phatak, r. s. katiyar and o. 
auciello, “interface-controlled high dielectric constant al2o3/tiox nanolaminates with low loss and low leakage current density for new generation nanodevices”, j. appl. phys. 114, 2013, pp. 027001 (5 pages). [5] m. ali khaskheli, p. wu, r. chand, x. li, h. wang, sh. zhang, s. chen and yili pei, “structural and dielectric properties of ti and er co-doped hfo2 gate dielectrics grown by rf sputtering”, applied surface science 266, 2013, pp. 355–359 [6] b. toomey, k. cherkaoui, s. monaghan, v. djara, é. o’connor, d. o’connell, l. oberbeck, e. tois, t. blomberg, s.b. newcomb and p.k. hurley, “the structural andelectrical characterization of a hferox dielectric for mim capacitor dram applications”, microelectronic engineering 94, 2012, pp. 7–10 [7] z. essa, c. gaumer, a. pakfar, m. gros-jean, m. juhel, f. panciera, p. boulenc, c. tavernier and f. cristiano, “evaluation and modeling of lanthanum diffusion in tin/la2o3/hfsion/sio2/si high-k stacks”, appl. phys. lett. 101 2012, pp. 182901 (5 pages). [8] t. usui, s. a. mollinger, a. t. iancu, r. m. reis and f. b. prinz, “high aspect ratio and high breakdown strength metal-oxide capacitors”, appl. phys. lett. 101 2012, pp. 033905 (4 pages). [9] w.yang, q.-q. sun, r.-c. fang, l. chen, p. zhou, s.-j. ding and d.w. zhang, “the thermal stability of atomic layer deposited hflaox: material and electrical characterization”, current applied physics 12, 2012, pp. 1445–1447 [10] t. yu, c. jin, x. yang, y. dong, h. zhang, l. zhuge, x. wu and z. wu, “the structure and electrical properties of hftaon high-k films prepared by dibsd”, applied surface science 258, 2012, pp. 2953– 2958 [11] x. zhang, h. tu, y. guo, h. zhao, m. yang, f. wei, y. xiong, z. yang, j. du and w. wang, “atomic configuration of the interface between epitaxial gd doped hfo2 high-k thin films and ge (001) substrates”, j. appl. phys. 111, 2012, pp. 014102 (4 pages) [12] l. ning, f. yang, c. duan, y. zhang, jun liang and z. 
cui, “structural properties and 4f→5d absorptions in ce-doped lualo3: a first-principles study”, j. phys.: condens. matter 24, 2012, pp. 055502 (10 pages) [13] l. kornblum, b. meyler, c. cytermann, s. yofis, j. salzman and m. eizenberg, “investigation of the band offsets caused by thin al2o3 layers in hfo2 based si metal oxide semiconductor devices”, appl. phys. lett. 100, 2012, pp. 062907 (3 pages) [14] k.m.a. salam, h. fukuda and s. nomera, “effects of additive elements on improvement of the dielectric properties of ta2o5 films formed by metalorganic decomposition”, j. appl. phys. 93, 2003, pp. 1169–1175. [15] e. atanassova, n. novkovski, d. spassov, a. paskaleva and a. skeparovski, “time-dependent-dielectricbreakdown characteristics of hf-doped ta2o5/sio2 stack”, microelectron. reliab. 54, 2014, pp. 381–387. [16] e. atanassova, n. stojadinovic, d. spassov, i. manic and a. paskaleva, “time-dependent dielectric breakdown in pure and lightly al-doped ta2o5 stacks”, semicond. sci. technol. 28, 2013, pp. 055006– 055006-9 [17] e. atanassova, d. spassov, n. novkovski, and a. paskaleva, “constant current stress of lightly al-doped ta2o5”, materials science in semiconductor processing 15, 2012, pp. 98–107. [18] y. karmakova, a. paskaleva and e. atanassova, “interfacial layers in ta2o5 based stacks and constituent depth profiles by spectroscopic ellipsometry”, appl. surf. sci. 258, 2012, pp. 4507–4512. [19] e. atanassova, a. paskaleva and d. spassov, “doped ta2o5 and mixed hfo2–ta2o5 films for dynamic memories applications at the nanoscale”, microelectron. reliab. 52, 2011, pp. 642–650. [20] a. paskaleva, m. ťapajna, e. dobročka, k. hušeková, e. atanassova and k. fröhlich, “structural and dielectric properties of ru-based gate/hf-doped ta2o5 stacks”, appl. surf. sci. 257, 2011, pp. 7876–7880. [21] a. skeparovski, n. novkovski, e. atanassova, a. paskaleva and v. k. lazarov, “effect of al gate on the electrical behaviour of al doped ta2o5 stacks”, j. phys. d: appl. phys. 
44, 2011, pp. 235103–235103-10. [22] i. manić, e. atanassova, n. stojadinović, d. spassov and a. paskaleva, “hf-doped ta2o5 stacks under constant voltage stress”, microelectron. eng. 88, 2011, pp. 305–313. [23] d. spassov, e. atanassova and a. paskaleva, “lightly al-doped ta2o5: electrical properties and mechanisms of conductivity”, microelectron. reliab. 51, 2011, pp. 2102–2109. [24] n. novkovski and e. atanassova, “charge trapping during constant current stress in hf-doped ta2o5 films sputtered on nitrided si”, thin solid films 519, 2011, pp. 2262–2267. [25] e. atanassova, n. novkovski, a. paskaleva and d. spassov, “constant current stress-induced leakage current in mixed hfo2– ta2o5 stacks”, microelectron. reliab. 50, 2010, pp. 794–800. [26] a. paskaleva and e. atanassova, “evidence for a conduction through shallow traps in hf-doped ta2o5”, mat. sci. semicond. proc. 13, 2010, pp. 349–355. [27] e. atanassova, m. georgieva, d. spassov and a. paskaleva, “high-k hfo2–ta2o5 mixed layers: electrical characteristics and mechanisms of conductivity”, microelectron. eng. 87, 2010, pp. 668–676. [28] d. spassov, e. atanassova, n. novkovski, “electrical behaviour of ti-doped ta2o5 on n2o and nh3 nitrided si”, semicond. sci. technol. 24, 2009, pp. 075024–075024-10. [29] a. skeparovski, n. novkovski, e. atanassova, d. spassov and a. paskaleva, “temperature dependence of leakage currents in ti doped ta2o5 films on nitrided silicon”, j. phys. d: appl. phys. 42, 2009, pp. 095302–095302-8. [30] a. paskaleva, e. atanassova and n. novkovski, “constant current stress of ti-doped ta2o5 on nitrided si”, j. phys. d: appl. phys. 42, 2009, pp. 025105–025105-8. [31] n. novkovski, “analysis of the improvement of al-ta2o5/sio2-si structures reliability by si substrate plasma nitridation in n2o”, thin solid films 517, 2009, 4394–4401. [32] n. novkovski and e. atanassova, “a comprehensive model for the i-v characteristics of metal-ta2o5/sio2-si structures”, appl. phys. 
a 83, 2006, pp. 435–445. [33] e. rosenbaum and l. f. register, “mechanism of stress-induced leakage current in mos capacitors”, ieee trans. electron dev. 44, 1997, pp. 317–323. [34] m. houssa, m. tuominen, m. naili, v. afanas’ev, a. stesmans, s. haukka and m. m. heyns, “trapassisted tunneling in high permittivity gate dielectric stacks”, j. appl. phys. 87, 2000, pp. 8615–8620. [35] w. s. lau, l. zhong, allen lee, c. h. see, taejoon han, n. p. sandler and t. c. chong, “detection of defect states responsible for leakage current in ultrathin tantalum pentoxide (ta2o5) films by zero-bias thermally stimulated current spectroscopy”, appl. phys. lett. 71, 1997, pp. 500–502. [36] w. s. lau, l. l. leong, t. han and n. p. sandler, “detection of oxygen vacancy defect states in capacitors with ultrathin ta2o5 films by zero-bias thermally stimulated current spectroscopy”, appl. phys. lett. 83, 2003, pp. 2835–2837. [37] n. novkovski, a. skeparovski and e. atanassova, “charge trapping effect at the contact between a highwork-function metal and ta2o5 high-k dielectric”, j. phys. d: appl. phys. 41, 2008, pp. 105302–105302-4. [38] l. stojanovska-georgievska, n. novkovski and e. atanassova, “charge trapping at pt/high-k dielectric (ta2o5) interface”, physica b: condensed matter 406, pp. 3348-3353 (2011). [39] l.s. georgievska, n. novkovski and e. atanassova, “charge trapping at low injection currents in (tin, mo, pt)/ta2o5:hf/sio2/si structures”, 2012 28 th international conference on microelectronics, proceedings, miel2012, pp. 331-334 [40] f.-c. chiu, j.-j. wang, j. y. lee and s. c. wu, “leakage currents in amorphous ta2o5 thin films”, j. appl. phys. 81, 1997, pp. 6911-6915. [41] o. blank, h. reisinger, r. stengl, m. gutsche, f. wiest, v. capodieci, j. schulze and i. eisele, “a model for multistep trap-assisted tunneling in thin high-k dielectrics”, j. appl. phys. 97, 2005, pp. 044107– 044107-7. [42] e. atanassova, d. spassov, a. paskaleva, j. koprinarova and m. 
gueorguieva, “influence of oxidation temperature on the microstructure and electrical properties of ta2o5 on si”, microel. j. 33, 2002, pp. 907–920. [43] m. lenzlinger and e. h. snow, “fowler-nordheim tunneling into thermally grown sio2”, j. appl. phys. 40, 1969, pp. 278-283. [44] n. novkovski and e. atanassova, “injection of holes from the silicon substrate in ta2o5 films grown on silicon”, appl. phys. lett. 85, 2004, pp. 3142-3144. [45] c. chaneliere, j. l. autran and r.a.b. devine, “conduction mechanisms in ta2o5/sio2 and ta2o5/si3n4 stacked structures on si”, j. appl. phys. 86, 1999, pp. 480–486. [46] m. v. fischetti and d. j. dimaria, “hot electrons in sio2: ballistic to steady-state transport”, solid-st. electron. 31, 1988, pp. 629–636. [47] j. r. yeargan and h. l. taylor, “the poole-frenkel effect with compensation present”, j. appl. phys. 39, 1968, pp. 5600–5604. [48] n. novkovski, “limitations in the methods of determination of conduction mechanisms in high-permittivity dielectric nano-layers”, physica b: condensed matter 398, 2007, pp. 28–32. [49] k. n. yang, h. t. huang, m. c. chang, c. m. chu, y. s. chen, m. j. chen, y. m. lin, m. c. yu, s. m. yang, d. c. h. yu and m. s. liang, “a physical model for hole direct tunneling current in p+ poly-gate pmosfets with ultrathin gate oxides”, ieee trans. electron dev. 47, 2000, pp. 2161-2166. [50] n. yang, w.k. henson, j.r. hauser and j. wortman, “modeling study of ultrathin gate oxides using direct tunneling current and capacitance-voltage measurements in mos devices”, ieee trans. electron dev. 46, 1999, pp. 1464-1471. [51] d. k. schroder, semiconductor material and device characterization. hoboken, new jersey: john wiley & sons, 2006, chapter 9, pp. 347–350. [52] n. novkovski, a. paskaleva and e. atanassova, “dielectric properties of rf sputtered ta2o5 on rapid thermally nitrided si”, semicond. sci. technol. 20, 2005, pp. 233–238. [53] n. 
novkovski, “conduction and charge analysis of metal (al, w and au)-ta2o5/sio2-si structures”, semicond. sci. technol. 21, 2006, pp. 945–951. [54] a. skeparovski and n. novkovski, “on the nature of the high-k dielectrics leakage current reduction by postdeposition annealing”, j. optoelectron. adv. mat. 9, 2007, pp. 897–901. [55] w. j. zhu, t.-p. ma, t. tamagawa, j. kim and y. di, “current transport in metal/hafnium oxide/silicon structure”, ieee electron device lett. 23, 2002, pp. 97–99. [56] s. huang, “oxygen annealing effects on transport and charging characteristics of al-ta2o5/sioxny-si structure”, ieee trans. electron. dev. 60, 2013, pp. 2741–2746. [57] n. novkovski and e. atanassova, “frequency dependence of the effective series capacitance of metal-ta2o5/sio2-si structures”, semicond. sci. technol. 22, 2007, pp. 533–536. [58] n. novkovski and e. atanassova, “peculiarities of capacitance measurements of nanosized high-k dielectrics: case of ta2o5”, j. optoelectron. adv. mat.-symposia 1, 2009, pp. 398–403. [59] n. novkovski and e. atanassova, “origin of the stress-induced leakage currents in al-ta2o5/sio2-si structures”, appl. phys. lett. 86, 2005, pp. 152104-1–152104-3. [60] n. novkovski, e. atanassova and a. paskaleva, “stress-induced leakage currents of the rf sputtered ta2o5 on n-implanted silicon”, appl. surf. sci. 253, 2007, pp. 4396–4403. [61] n. novkovski, e. atanassova and a. paskaleva, “model based analysis of electrical and wear-out characteristics of ultra-thin ta2o5/sioxny stacks on si”, proc. 26th international conference on microelectronics, 10-14 may, 2008, vol. 2, pp. 533–536. [62] n. novkovski and e. atanassova, “dielectric properties of ta2o5 films grown on silicon substrates plasma nitrided in n2o”, appl. phys. a 81, 2005, pp. 1191–1195.

facta universitatis series: electronics and energetics vol. 34, no 4, december 2021, pp. 
547-555 https://doi.org/10.2298/fuee2104547s © 2021 by university of niš, serbia | creative commons license: cc by-nc-nd

original scientific paper

a comparative study of optimization methods for eddy-current characterization of aeronautical metal sheets

ben moussa oum salama1, ayad ahmed nour el islam1, tarik bouchala2

1electrical engineering department, faculty of applied sciences, lab. lage, ouargla university, algeria
2electrical engineering department, mohamed boudiaf university msila, algeria

abstract. this paper presents eddy current non-destructive characterization of three aeronautical metal sheets by deterministic and stochastic inversion methods. the procedure consists of associating the finite element method with three optimization algorithms (the simplex method and the genetic and particle swarm algorithms) to simultaneously determine the electric conductivity, magnetic permeability and thickness of al, ti and 304l stainless steel metal sheets largely used in the aeronautical industry. the application of these methods has shown the performance of each inversion algorithm. as a result of a qualitative and quantitative comparison, it was found that the simplex method is more advantageous than the genetic and particle swarm algorithms, since it is faster and more stable.

key words: eddy current sensor, inverse problem, genetic algorithm, simplex method, particle swarm optimization.

1. introduction

eddy current non-destructive testing is a well-known method for material characterization, which is sensitive to the properties of conductive materials, such as electrical conductivity and magnetic permeability [1]. in the aeronautic domain, planes are periodically subjected to inspection and maintenance operations, as is the case at the algerian airline maintenance society. in the non-destructive testing (ndt) division, the eddy current technique is often used for inspecting and evaluating sensitive plane parts. 
among these applications, we perform measurement of the thickness and electric conductivity of metal sheets [2-3].

received march 25, 2021; received in revised form august 14, 2021
corresponding author: ben moussa oum salama, electrical engineering department, faculty of applied sciences, lab. lage, ouargla university, algeria, e-mail: benmoussa.oumsalama@univ-ouargla.dz

in industrial automatic applications, several iterative inversion methods are used to accomplish this objective. in general, the flowchart consists of an iteration loop containing the forward model associated with an inversion algorithm. we recall that the analytical forward method of dodd and deeds gives an exact solution, but the skin and proximity effects in the exciting coil turns are neglected [4-5]. the aim of this paper is to associate the finite element method (fem) with the optimization methods to estimate the thickness, electric conductivity and magnetic permeability of al, ti and stainless steel 304l metal sheets largely used in aeronautic construction. this association yields a comparative study of the starting search interval, the global search time and the relative error for the optimization methods, in order to determine the most advantageous one in terms of reliability and speed.

2. aeronautic construction materials

an airplane fuselage is made, in the majority of cases, of aluminum, because its density is very low, which is an advantage in aeronautics. additionally, this material is much appreciated since it has a good resistance to corrosion and is easily malleable, which makes the construction of different parts easier [3]. on the other hand, stainless steel 304l is less sensitive to corrosion and is ideal for machining and welding of parts in aeronautic applications. 
nowadays, titanium is a key element of aeronautic and space construction; its use is justified by its attractive characteristics: outstanding resistance to corrosion and oxidation, nonmagnetic behaviour, and good thermal and mechanical resistance. with such properties, titanium alloy is a high-quality material for aircraft design, fig. 1.
fig. 1 aeronautic construction materials [3].
3. description of the forward model
the geometry of the considered problem is illustrated schematically in fig. 2. in this study, the metal sheet presents a flat surface with a thin nonconductive coating. in practice, when using eddy currents to measure thickness and electric conductivity, it is important to ensure that the other factors (geometry, specimen temperature and lift-off) are kept under control [5,9]. a pancake-type probe formed of a coil is placed perpendicular to the surface of the tested metal sheet. the geometrical and physical characteristics are given in table 1.
table 1 characteristics of the modeled system
coil | values
current intensity | 0.04 [A]
frequency | 10 [kHz]
inner radius | 5.35 [mm]
length | 2.35 [mm]
height | 2.3 [mm]
metal sheet | values
thickness | 2 [mm]
electric conductivity | that of al, inox 304l, ti
magnetic permeability | that of al, inox 304l, ti
4. mathematical formulation of the electromagnetic forward model
maxwell's equations, describing the physical phenomena of eddy current sensing [6-11], are defined as follows:
$$\nabla \times \mathbf{H} = \mathbf{J}_s + \mathbf{J}, \quad (1)$$
$$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}, \quad (2)$$
$$\nabla \cdot \mathbf{B} = 0, \quad (3)$$
where H is the magnetic field, J is the induced eddy-current density, Js is the current density injected in the coils, E is the electric field, B is the magnetic flux density, and t denotes the time [7-12].
considering the constitutive relations linking the electromagnetic field to the properties of the material:
$$\mathbf{B} = \mu \mathbf{H}, \quad (4)$$
$$\mathbf{J} = \sigma \mathbf{E}, \quad (5)$$
where µ is the magnetic permeability and σ is the electrical conductivity of the materials [13]. the magnetic vector potential A is defined by:
$$\mathbf{B} = \nabla \times \mathbf{A}. \quad (6)$$
fig. 2 studied device configuration
the differential equation describing the eddy current testing phenomena is then expressed by:
$$\nabla \times \left( \frac{1}{\mu} \nabla \times \mathbf{A} \right) = -\sigma \frac{\partial \mathbf{A}}{\partial t} + \mathbf{J}_s \quad (7)$$
considering the angular frequency ω and the coulomb gauge condition $\nabla \cdot \mathbf{A} = 0$, the electromagnetic equation in the time-harmonic regime, using complex amplitudes [8], is expressed by:
$$\nabla \times \left( \frac{1}{\mu} \nabla \times \mathbf{A} \right) = -j\omega\sigma \mathbf{A} + \mathbf{J}_s \quad (8)$$
where A represents the magnetic vector potential, j is the imaginary unit, ω is the angular frequency of the excitation current (rad/s), μ is the magnetic permeability of the media involved (H/m), σ is the electrical conductivity (S/m), and Js is the source current density (A/m2) [10]. the finite element formulation for 2d axisymmetric eddy current phenomena has been developed in many works. for axisymmetric geometries, where A reduces to its azimuthal component A, eq. (8) takes the 2d form [2,4]:
$$\frac{1}{\mu} \left( \frac{\partial^2 A}{\partial r^2} + \frac{1}{r} \frac{\partial A}{\partial r} + \frac{\partial^2 A}{\partial z^2} - \frac{A}{r^2} \right) - j\omega\sigma A = -J_s. \quad (9)$$
this equation describes the problem shown in figure 3.
fig. 3 finite element modeling procedure
5. inversion steps
for the iterative inversion, the process consists of an iteration loop containing the forward model that calculates the sensor impedance (Zc). the output (Zc) is compared to the measured value (Zm); the resulting error is then used by the optimization algorithm (genetic and particle swarm optimization algorithms) as an input in order to improve the estimated parameters. at each iteration, this strategy reduces the obtained error (fitness function).
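the loop described above can be sketched in a few lines of python. this is only an illustration under stated assumptions, not the authors' matlab implementation: toy_forward is a made-up smooth function that stands in for the fem solver and has no physical meaning, the normalized frequencies are hypothetical, and the update rule is a plain compass (coordinate) search rather than the simplex, genetic or particle swarm algorithms compared in the paper. the misfit is a relative least-squares error of the kind defined in eq. (10).

```python
def fitness(z_measured, z_computed):
    """Half the sum of squared relative impedance errors, in the spirit of eq. (10)."""
    return 0.5 * sum(abs((zm - zc) / zm) ** 2
                     for zm, zc in zip(z_measured, z_computed))


def toy_forward(e, sigma, mu, freqs=(1.0, 2.0, 4.0, 8.0)):
    """HYPOTHETICAL stand-in for the FEM forward model (no physical meaning).

    Returns one complex 'impedance' per normalized frequency in freqs.
    """
    return [complex(sigma * e / (1.0 + f * mu), f * mu * e / (1.0 + f * e))
            for f in freqs]


def invert(z_measured, forward, start, tol=1e-12, max_iter=2000):
    """Iteration loop: forward model -> error -> parameter update -> stop test.

    The update rule is a plain compass (coordinate) search with multiplicative
    steps; it is NOT the simplex, GA or PSO used in the paper.
    """
    params = list(start)
    best = fitness(z_measured, forward(*params))
    history = [best]
    step = 0.5                               # relative trial step
    for _ in range(max_iter):
        if best < tol:                       # accepted: error below tolerance
            break
        improved = False
        for i in range(len(params)):
            for factor in (1.0 + step, 1.0 / (1.0 + step)):
                trial = list(params)
                trial[i] *= factor
                err = fitness(z_measured, forward(*trial))
                if err < best:               # keep only error-reducing moves
                    params, best, improved = trial, err, True
        if not improved:
            step *= 0.5                      # refine the search around params
        history.append(best)
    return params, history


# "Measured" data are synthesized from known properties, as in the paper
# (aluminium values of table 2: e = 2 mm, sigma = 37.7 MS/m, mu = 1).
true_params = (2.0, 37.7, 1.0)
z_m = toy_forward(*true_params)
estimate, history = invert(z_m, toy_forward, start=(1.0, 20.0, 2.0))
print("estimate (e, sigma, mu):", estimate)
print("final fitness:", history[-1])
```

because only error-reducing moves are kept, the recorded fitness never increases from one iteration to the next; how close the estimate gets to the true parameters depends on the forward model and on the starting point, which is the sensitivity the paper examines for its three algorithms.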
hence, the inversion process is accepted and stopped when the error is smaller than the tolerance [14,15].
we recall that in the genetic algorithm (ga), the individuals of the population are first created by a random process. each individual carries a set of the parameters to be evaluated. then, the fitness function is computed iteratively for all individuals. following that, couples are crossed, and the mutation step maintains the genetic variety of the population from one generation to the next. in order to generate a superior population, the genetic operators are applied in a way inspired by natural evolution [16].
on the other hand, the simplex method is a very powerful local-descent direct search method for minimizing a real-valued function. each iteration begins with a simplex specified by n+1 vertices and the associated function values. one or more test points are computed, along with their function values. at the end of each iteration, a new simplex is obtained that satisfies some descent conditions on the values of the fitness function [17,18].
the inverse problem consists of finding the parameters (e,σ,µ) such that Zc(e,σ,µ)=Zm, where Zc is the computed impedance of the sensor and Zm is the measured impedance. we have taken values from known properties (thickness, conductivity and magnetic permeability), and the measured values are replaced by those obtained by solving the direct problem with the finite element method. this condition can be recast as the minimization of the following fitness function:
$$\min_{e,\sigma,\mu} \; \frac{1}{2} \sum_{i=1}^{n} \left[ \frac{Z_i^m - Z_i^c(e,\sigma,\mu)}{Z_i^m} \right]^2, \quad (10)$$
where n is the length of the measurement array.
fig. 4 iterative inversion procedure
6. results and discussion
an iterative inversion algorithm is elaborated in order to evaluate the physical and geometrical properties of metal sheets (i.e.
electric conductivity σ, magnetic permeability μ and thickness e). the inversion is achieved by stochastic methods, such as the genetic and particle swarm algorithms, as well as by a deterministic one based on the nelder-mead algorithm, each associated with the finite element method (fem) [9]. it uses selected evaluation parameters and gives the evaluated properties, fig. 4. the estimated parameters and the fitness function versus the iteration number are given in the following figures (figs. 5-7). we recall that these results are obtained for al, ti and 304l stainless steel metal sheets whose characteristics are reported in table 2.
table 2 metal sheets characteristics
material | electric conductivity [MS/m] | magnetic permeability | thickness [mm]
al | 37.7 | 1 | 2
ti | 2.52 | 25 | 2
stainless steel 304l | 1.36 | 160 | 2
6.1. obtained results
to show the precision and the speed of the inversion techniques used, we have implemented them in the matlab environment. the obtained results are shown in the following figures:
fig. 5 electric conductivity obtained for stainless steel, aluminum and titanium
fig. 6 magnetic permeability obtained for stainless steel, aluminum and titanium
fig. 7 thickness obtained for stainless steel, aluminum and titanium
the computing time and the error between the real and estimated values for the three optimization algorithms are summarized in table 3.
table 3 comparison of the results of the three optimization algorithms
material | parameter | real value | ga estimate | pso estimate | sim estimate
stainless steel 304l | σ (MS/m) | 1.36 | 1.34 | 1.34 | 1.35
stainless steel 304l | µ | 160 | 158 | 158 | 159
stainless steel 304l | e (mm) | 2 | 1.8 | 1.8 | 2
al | σ (MS/m) | 37.7 | 37.5 | 37.6 | 37.6
al | µ | 1 | 1.2 | 0.99 | 1
al | e (mm) | 2 | 1.9 | 2 | 2
ti | σ (MS/m) | 2.52 | 2.54 | 2.51 | 2.53
ti | µ | 25 | 23 | 24 | 25
ti | e (mm) | 2 | 2 | 2.3 | 2
computing time (s) | | | 1750 | 1420 | 224
error (%) | | | 1.08 | 1.02 | 0.35
6.2.
discussion
through this application, we have noticed that the results obtained with the simplex, genetic and particle swarm algorithms are close to the actual values, with overall errors between 0.35% and 1.08% (table 3). these results confirm the reliability and the robustness of the inversion procedure. besides, we have deduced that ga and pso are very slow in comparison to the sim because of the high number of fitness function evaluations required at each iteration. moreover, to reach a satisfactory precision, the population size has to be increased to a certain level, which in turn increases the calculation time. the sim method is therefore preferred because it is faster and its performance does not change when the calculation is restarted. nevertheless, the simplex method raises some issues, such as the choice of the regulating parameters (reflection, expansion, contraction) and of the starting step.
7. conclusion
aircraft are periodically subjected to safety and maintenance operations using nondestructive testing methods. in this field, the eddy current technique is widely used for evaluating and inspecting relevant elements of an aircraft. during our traineeship in the algerian airline nondestructive testing division, we noticed that the measurements of the electric conductivity, magnetic permeability and thickness of metal sheets are carried out separately, which increases the inspection time. by contrast, when using inverse algorithms involving artificial intelligence, the measurements can be made simultaneously and rapidly. as stated above, an inversion procedure using optimization algorithms associated with the finite element method is elaborated in the matlab environment. a comparative study of these three methods (ga, sim and pso) for solving the eddy current inversion problem has been proposed in this paper.
as a result, we have deduced that fem-ga and fem-pso are very slow in comparison to fem-sim because of the high number of fitness function evaluations at each iteration. moreover, to reach a satisfactory precision, the population size has to be increased to a certain extent, which in turn increases the calculation time. fem-sim is therefore preferred because it is faster and its performance does not change when the calculation is restarted [17,18].
references
[1] g. cosarinsky, j. fava, m. ruch and a. bonomi, "material characterization by electrical conductivity assessment using impedance analysis", procedia mater. sci., vol. 9, pp. 156–162, 2015.
[2] j. garcia-martin, j. gomez-gill and e. vazquez-sanchez, "non-destructive techniques based on eddy current testing", sensors j., vol. 11, pp. 2525–2565, feb. 2011.
[3] a. abdou, t. bouchala, b. abdelhadi, a. guettafi and a. benoudjit, "real-time eddy current measurement of aeronautical construction material coating thickness", instrum. meas. metrol., vol. 18, no. 5, pp. 3–4, nov. 2019.
[4] x. ma, a. j. peyton and y. y. zhao, "measurement of the electrical conductivity of open-celled aluminum foam using non-contact eddy current techniques", ndt e int., vol. 38, no. 5, pp. 359–367, 2005.
[5] c. v. dodd and w. e. deeds, "analytical solutions to eddy-current probe-coil probe problems", j. appl. phys., vol. 39, no. 6, pp. 2829–2839, sep. 1968.
[6] t. bouchala, b. abdelhadi and a. benoudjit, "fast analytical modeling of eddy current non-destructive testing of magnetic material", j. nondestruct. eval., vol. 32, no. 3, pp. 294–299, sept. 2013.
[7] t. bouchala, b. abdelhadi and a. benoudjit, "novel coupled electric field method for defect characterization in eddy current non-destructive testing", j. nondestruct. eval., vol. 32, no. 4, pp. 1–11, sept. 2013.
[8] t. bouchala, b. abdelhadi and a. benoudjit, "new contactless eddy current non-destructive methodology of electric conductivity measurement", j.
nondestruct. test eval., vol. 30, no. 1, pp. 63–73, jan. 2015.
[9] t. bouchala, b. abdelhadi and a. benoudjit, "application of coupled electric field method for eddy current non-destructive inspection of multilayer structures", j. nondestruct. eval., vol. 30, no. 2, pp. 8–10, march 2015.
[10] d. vielldent, "optimisation des outils en forgeage a chaud par simulation elément finis et méthodes inverse. application a des problèmes industriels", thèse de doctorat, ecole nationale supérieure des mines de paris, 1999.
[11] b. maouche and m. feliachi, "a half analytical formulation for the impedance variation in axisymmetric modeling of eddy current non-destructive testing", epj appl. phys., vol. 33, pp. 59–67, feb. 2006.
[12] b. maouche, a. rezak and m. feliachi, "semi analytical calculation of the impedance of differential sensor for eddy current non-destructive testing", ndt e int., vol. 42, no. 7, pp. 573–580, oct. 2009.
[13] s. zerguini, b. maouche, m. latreche and m. feliachi, "a coupled fictitious electric circuit’s method for impedance of a sensor with ferromagnetic core calculation. application to eddy currents nondestructive testing", epj appl. phys., vol. 48, no. 3, pp. 31202–31207, dec. 2009.
[14] j. blitz, electrical and magnetic methods of non-destructive testing. new york: chapman & hall, 1997.
[15] y. yating, d. pingan and x. luchuan, "coil impedance calculation of an eddy current sensor by the finite element method", russ. j. nondestruct. test., vol. 44, no. 4, pp. 296–302, april 2008.
[16] v. p. lunin, "phenomenological and algorithmic method for the solution of inverse problem of electromagnetic testing", russ. j. nondestruct. test., vol. 42, no. 6, pp. 353–362, june 2006.
[17] i. dolapchiev, k. brandisky and p. ivanov, "eddy current testing probe optimization using a parallel genetic algorithm", serb. j. electr. eng., vol. 5, no. 1, pp.
39–48, may 2008.
[18] a. bouzidi, b. maouche and m. feliachi, "pulsed eddy current nde of groove dimensions by inversion with simplex method associated with coupled electric circuits method", ieee trans. magn., vol. 51, no. 3, pp. 55–61, march 2015.
instruction facta universitatis series: electronics and energetics vol. 27, no. 2, june 2014, pp. 299-316 doi: 10.2298/fuee1402299k
electrification of the vehicle propulsion system – an overview
vladimir a. katić, boris dumnić, zoltan čorba, dragan milićević
university of novi sad, faculty of technical sciences, novi sad, serbia
abstract. to achieve the eu targets for 2020, internal combustion engine cars need to be gradually replaced with hybrid or electric ones, which have low or zero ghg emissions. the paper presents a short overview of the dynamic history of electric vehicles, which led to today's modern solutions. different possibilities for realizing the electric power system are described. electric vehicle (ev) operation is analyzed in more detail. the market future of evs is discussed, and plans for 2020 and up to 2030 are presented. other effects of the electrification of vehicles are also analyzed.
key words: ev short history, electric vehicles, ev power system
1. introduction
the transportation sector is a major energy consumer. statistical data from 2009 and 2010 show that transportation accounts for as much as 19% (2009) of global total energy use [1]. in the eu its share in 2010 reached 31.7%, or 365.2 million toe [2]. it also contributes 23% of the energy-related greenhouse gas (co2) emissions (2012), a significant increase from 6.5% in 1990 [2], [3]. if the current trend continues, transportation energy use and co2 emissions are projected to increase by nearly 50% by 2030 [1]. although the eu 2020 policy target is to decrease greenhouse gas (ghg) emissions by 20% by 2020, the above data show that the transportation sector will not contribute much to it.
this future is not sustainable, as the effects of climate change resulting from the global temperature increase and the fast rise of co2 concentration (fig. 1, left) are evident. such a negative trend needs to be addressed and possible solutions should be pointed out. replacing the internal combustion engine (ice) with electric propulsion in passenger cars is seen as a way of decreasing ghg emissions and mitigating climate change problems [4]. electric propulsion is not new; it dates back to the mid-19th century, when the first electric vehicles (ev) were presented [5]. however, the market was not favorable to evs, and in the 1930s they were totally abandoned. the revival started in the late 1980s, when the environmental awareness of the population increased due to the fast rise of ghg and especially co2 emissions (fig. 1, left). at the same time, the fast depletion of fossil fuels raised oil prices and put forward questions about the energy future of mankind (fig. 1, right). intensive research efforts and new, improved power conversion technology enable rapid development and cost-effective solutions.
received february 12, 2014 corresponding author: vladimir katić, university of novi sad, faculty of technical sciences, trg dositeja obradovića 6, 21000 novi sad, serbia (e-mail: katav@uns.ac.rs)
(source: http://philebersole.wordpress.com/2012/03/19/the-epic-history-of-oil/) (source: http://www.climatechoices.org.uk)
fig. 1 concentration of co2 and temperature change since 1850 (left) and crude oil price history 1861 – 2010 (right)
different structures of electric drive trains are possible: hybrid, as a combination of the ice and an electric motor, or fully electric [4]. propulsion using fuel cells or hydrogen energy is also offered, but it will not be discussed in detail in this paper. nowadays, all major car makers offer some models with hybrid or electric propulsion.
still, the motor type (induction, synchronous, reluctance, brushless dc or other) and battery packaging are not standardized, and there is a lot of room for innovation and improvement. this paper presents an overview of the current status of this field with regard to the problems presented above. additionally, market prospects and trends up to 2030 are considered, showing that the ill fate of the early electric cars will not be repeated.
2. short history
the end of the 19th century brought great discoveries in the field of electrical engineering and raised enormous hopes for its rapid development and application. one of the fascinating presentations of that time was at the world exhibition in chicago in 1893, where a new product for transportation, the electric car, was shown. this was the result of more than 100 years of discoveries and innovations in the field of electricity, which led to the creation of a simple electric carriage powered by non-rechargeable primary battery cells by the scottish inventor robert anderson in 1838, the invention of the rechargeable lead-acid battery by the french physicist gaston planté in 1859, and its basic improvement for use in vehicles by another frenchman, camille faure, in 1881 [5].
the first electric car was made by the french engineer gustave trouvé in 1881. it was a three-wheel vehicle with a 70 w dc motor powered by a lead-acid battery. at the same time, the englishmen william ayrton and john perry developed their solution, an electric tricycle with a motor power of 350 w and a maximum speed of about 15 km/h. the vehicle was supplied from lead-acid batteries, and speed control was achieved by changing the battery connections [5].
the sudden development of electric traction using a dc drive and its commercialization in urban regions (electric trams, trains, ships, etc.) led several companies to start ev manufacturing in 1896. the first cars produced found application as new york city taxis (fig. 2). they had a maximum speed of 32 km/h and a radius of up to 40 km [5].
(source: http://en.kllproject.lv/new-york-yellow-taxi-photo.html)
fig. 2 electric (yellow) taxi in new york around 1901
at the beginning of the 20th century, customers of motorized transport could choose between steam-powered, gasoline and electric vehicles. the market was divided, without any indication as to which drive would be dominant in the future. steam-powered vehicles were fast and cheaper, but they suffered from a long start-up time (they needed about 45 minutes to warm up) and had to make frequent stops for water. vehicles with internal combustion engines had vibration, noise and smell due to exhaust gases. they needed manual operation to start, and changing gears was a particular problem while driving, but the price was moderate and they could be used for longer trips at a reasonable speed without stopping.
the popularity of electric vehicles was due to some advantages they had over their competitors. they were convenient for short distances within the city limits, easy to start and to drive (no difficulties with gear shifting), and reached high speeds. the first man to break the 100 km/h speed barrier was camille jenatzy, a belgian race driver, with his electric car named jamais contente in 1899. electric vehicles were also clean and quiet, but expensive, as they were built for the upper class in the form of massive carriages with fancy interiors and expensive materials. during this period, nearly fifty companies manufactured electric cars, covering 38% of the u.s. market. electric vehicles prospered until the 1920s, with the peak of production in 1912 [5].
electric drive technology did not keep pace with the population's need for long-distance travel, neither in terms of speed nor in terms of a suitable infrastructure for energy supply (battery charging stations).
already in 1913 the general observation was that electric vehicles were losing the competition with gasoline cars. the great depression that began at the end of the 1920s in the united states and in the world drastically limited the resources for innovation in this area, so production gradually decreased in the u.s. and in european companies. the final blow came when the ford motor company developed a system of mass production and launched the famous ford model t at a price 50% lower than that of the corresponding electric cars. by the 1930s electric vehicles had disappeared from the car market in the usa.
however, a series of circumstances brought electric cars back into the focus of researchers and the general public. developments in power electronics and the wide application of semiconductor (solid-state) power converters in the late 1950s reduced losses, improved the operation of electric drives and increased energy efficiency to over 90%. new algorithms of analogue and, later, digital control using microprocessors enable high-quality and reliable motor speed control while occupying little space.
in the sixties, production was limited to small experimental types. the models p50 and peel trident of the peel electric mini car company were suitable for city rides and parking, introducing a three-wheeler structure with fiberglass bodywork (fig. 3, left). similarly, the model enfield 8000 (fig. 3, right) was produced as a small city car in london and had two doors and four seats. a 6 kw dc electric motor and 220 ah lead-acid batteries gave a radius of up to 90 km and a maximum speed of 60 km/h. only 106 units were manufactured, so the price was not competitive and it was no threat to the dominant gasoline-powered cars.
(source: http://www.trendhunter.com) (source: www.veicolielettricinews.it)
fig.
3 peel electric cars (1962) and model 8000 enfield (1969)
one of the turning points in the automotive industry happened in the mid-1970s, when the oil supply was restricted due to political instabilities in the middle east, resulting in a serious energy crisis and a sharp jump in oil prices (fig. 1). additionally, news that the oil reserves are limited and would soon be exhausted brought new concerns. at the same time, concerns for the environment and the high air pollution in big cities, caused by exhaust gases from gasoline cars and threatening the health of the population, started the "green" movement in many countries. with increased environmental awareness and with the effects of ghg (especially co2) emissions on climate change (fig. 1), the movement became worldwide. it encouraged the search for alternatives in the transportation power train and led to the reconsideration of electric cars and to proposals for hybrid solutions.
during the 1980s the research efforts continued, but apart from various models of mini cars, no serious commercial attempts were made. a key problem was the batteries: their great weight and relatively small energy capacity led to a short driving range. an additional challenge came from improvements in the competing heat engine, i.e. significant reductions in the exhaust gases and fuel consumption of the ice.
still, the nineties brought the first model of electric car to the market, the ev1, which general motors produced from 1996 to 2003 (fig. 4, left). the ev1 accelerated from 0 to 100 km/h in 8 s, its maximum speed was 160 km/h and it had a radius of 193 km. the first models in 1996 used 53 ah lead-acid batteries with an internal voltage of 312 v, which enabled a range of 100 km. later models (2nd generation, 1999-2003) switched to nimh (nickel metal hydride) batteries, which reduced the weight and increased the range to up to 240 km.
the batteries were assembled in a package with a capacity of 26.4 kwh, which consisted of 26 modules of 13.2 v and 77 ah, with a bus voltage of 343 v. however, the price of the basic model was high and the batteries needed complete replacement after only 40,000 km, so it did not withstand the competition and was removed from the market.
another significant market attempt was the launch of honda's ev plus model in 1997, which used a 49 kw brushless dc motor and nimh batteries (fig. 4, right). the ev plus had a maximum speed of 130 km/h and a radius of up to 160 km. only 340 units were produced, and they were offered exclusively on lease, as their price was high. in the end, honda pulled all the cars from the market and dismantled most of them in 1999.
(source: http://www.telegraph.co.uk/motoring/picturegalleries/5423182/a-history-of-general-motors-in-pictures.html) (source: http://www.barthworks.com)
fig. 4 general motors ev1 (1996) and honda model ev plus with nimh batteries
frequent price increases and the uncertainty of the oil market, as well as mature environmental awareness, especially in economically developed countries, have contributed to the fact that the beginning of the 21st century is marked by the decision of the majority of the world's great car manufacturers to start the production of hybrid and electric vehicles. the landmark is clearly marked by two models: a hybrid car, the toyota prius, and an electric car, the tesla motors roadster.
the toyota prius is the result of careful thinking and an evolutionary strategy of passenger car development from the ice, over hybrid, to electric propulsion. it is the first successful hybrid model, produced for more than 15 years. it has appeared in three generations (prius, prius + and prius plug-in) and has sold over 3,000,000 units (fig. 5).
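the ev1 battery figures quoted earlier in this section can be cross-checked with simple series-string arithmetic. the short python sketch below is an illustration only; it assumes that the nimh modules are 13.2 v, 77 ah units connected in series, which is the reading consistent with the stated 343 v bus voltage and 26.4 kwh capacity, and it treats the 312 v, 53 ah lead-acid pack as a single string. the function names are hypothetical.

```python
def pack_voltage(n_modules, module_voltage_v):
    """Bus voltage of n identical modules connected in series, in volts."""
    return n_modules * module_voltage_v


def pack_energy_kwh(n_modules, module_voltage_v, capacity_ah):
    """Nominal stored energy of a series string, in kWh (V * Ah / 1000)."""
    return pack_voltage(n_modules, module_voltage_v) * capacity_ah / 1000.0


# EV1 NiMH pack, reading the modules as 13.2 V / 77 Ah (assumption, see text).
nimh_voltage = pack_voltage(26, 13.2)
nimh_energy = pack_energy_kwh(26, 13.2, 77.0)

# First-generation lead-acid pack: 312 V, 53 Ah, treated as a single string.
lead_acid_energy = pack_energy_kwh(1, 312.0, 53.0)

print(f"NiMH pack: {nimh_voltage:.1f} V, {nimh_energy:.1f} kWh")
print(f"lead-acid pack: {lead_acid_energy:.1f} kWh")
```

under these assumptions the nimh pack comes out at about 343 v and 26.4 kwh, matching the quoted values, while the lead-acid pack stores about 16.5 kwh, which is in line with its shorter range.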
the details of these models are widely known, but it is worth mentioning that it is a hybrid solution in which the 500 v electric motor (27 kw, 37 metric hp) is powered directly from an electric generator, which is driven by the internal combustion engine (much as the alternator is driven in standard ice vehicles) [6]. the internal combustion engine has a capacity of 1.8 l and 100 hp, with an average fuel consumption of 4 l/100 km, and the excess energy is stored in batteries. the nimh battery has a capacity of 6.5 ah and consists of 28 modules of 7.2 v, with a total weight of 30 kg and an output voltage of 202 v. the specific battery power is 1300 w/kg, and the durability is 300,000 km. the latest generation, the plug-in prius, includes larger battery packs and can be charged from the public distribution network. other similar hybrid models, especially plug-in solutions, include the chevrolet volt (or vauxhall ampera or opel ampera), ford fusion energi phev, ford c-max energi phev, etc.
(source: http://www.automagazin.rs)
fig. 5 three generations of toyota prius, the top selling hybrid in the world
the tesla roadster appeared in 2008 and represents an innovation in the field of electric cars. its electric propulsion is based on a three-phase, four-pole ac electric motor, which is controlled by a microprocessor-controlled three-phase inverter and powered from a lithium-ion (li-ion) battery with a capacity of 60 kwh and 200,000 km of guaranteed operation (there is an option of 85 kwh batteries with an unlimited warranty period). its driving performance is impressive: it needs only 3.7 s to accelerate to 100 km/h, and with a single battery charge it may drive up to 400 km (fig. 6). for charging it may use the garage battery charger, which can fill the battery in 4 h, or the mobile charger, with which charging takes 6 h.
the new model (2014), the tesla model s, offers some improved options such as fast battery swap and better charging.
(source: http://www.teslamotors.com/roadster)
fig. 6 tesla roadster model 2008
the tesla motors company is also making a significant investment to develop a network of charging stations for the energy supply of electric vehicles across the country. in 2014 there are charging stations in most metropolitan areas in the u.s. with fast chargers of 120 kw, where charging takes only 30 min. the company plans to cover the most important cities and routes in the usa with fast chargers in 2015, enabling an easy coast-to-coast ride (fig. 7).
in recent years, more and more serially produced electric car models have become available on the market, some of which are already well known: the aforementioned tesla roadster, the mitsubishi i-miev, nissan leaf, reva (manufacturer reva electric car comp. from the uk), peugeot ion, citroen c-zero, renault zoe, bmw i3 and others. the leaf and the i-miev, with total sales of over 15,000 units each, are now the best-selling fully electric cars.
(source: http://www.teslamotors.com/supercharger)
fig. 7 the charging station infrastructure in usa – teslamotors plan for 2015
3. electric vehicular power system
a passenger vehicle or motor car has four power systems: mechanical, hydraulic, pneumatic and electrical, which operate a large number of different loads and assist the main power train of the propulsion system, the internal combustion engine (ice). however, all systems are actually powered from the ice, except the batteries, which can be charged off-board (this option is rarely used, i.e. only when the batteries are depleted and need to be recharged). to improve efficiency and reduce the oil consumption and gas emissions of an ice, an electrically powered drive train needs to be involved.
for example, the efficiency of an ice reaches 30-33%, but that of an electric motor (machine) can reach 80-90% or even more.
motor car electrification, i.e. the application of electric energy for powering some apparatus or equipment in an ice-powered vehicle, started way back in 1908, when the first electric device was implemented. it was the electric horn, or klaxon, which was powered directly from dry battery cells. the problem of non-rechargeable dry cells and the need for better lighting were solved by the introduction of rechargeable batteries and a dynamo generator in 1912. the main goal at that time was to simplify driving, especially starting, then to increase safety and to improve convenience and passenger comfort. therefore, many other electrical apparatus and electronic devices were introduced later on (electric starter, windscreen wipers, air-conditioning, modern entertainment systems, on-board computers, information systems, sensors, radars, parking assistance, and many others), demanding more electrical power and a secure and reliable supply. however, these earlier electrification goals did not include the main power train, i.e. the ice, which is actually the trend of the recent decade.
nowadays, motor car electrification has a wider goal, the one mentioned at the beginning of this paper, i.e. to improve efficiency, reduce emissions and improve performance. in addition, the need for a smarter, more reliable and safer vehicle, integrated into modern communication (internet) and social networks, but also connected with other vehicles on the road, requires more electric, electronic, communication and computer systems on board.
the level of electrification is defined as the ratio of the peak electric power to the peak power of all power generating systems (electric and ice) [7]. regarding this feature of a motor vehicle, several levels of electrification may be distinguished:
1.
ice with non-propulsion electric systems
2. more electric vehicle (mev)
3. hybrid electric vehicle (hev)
4. plug-in hybrid electric vehicle (phev)
5. full electric vehicle or battery electric vehicle (ev or bev)
6. future evs: fuel-cell powered evs, solar assisted evs, etc.
3.1. ice with non-propulsion electric systems
an ice vehicle with non-propulsion electric systems uses electric power to operate a number of electrical loads (previously mentioned), but not for the drive train. all those loads are supplied from an alternator (12 v, 0.8-1.7 kwp), i.e. an ac electrical generator with ac/dc conversion, backed up by a lead-acid battery (12 v, 45-110 ah). as the number of loads increases, energy management and higher efficiency become essential. the long-debated proposal to increase the battery voltage to 42 v has been abandoned due to significant additional costs and lower reliability, although many advantages were pointed out [8].
3.2. more electric vehicle (mev)
a more electric vehicle (mev) is a car that keeps its ice propulsion system but optimizes the other (non-propulsion) systems, especially the electrical one. the main characteristic of such vehicles is the integration of the starter and alternator (integrated starter/alternator, isa), which enables easy implementation of the start-stop function and regenerative braking. the start-stop function stops the engine during short stops at traffic lights and in similar situations where the engine would otherwise idle, resulting in lower fuel consumption and co2 emission per km. the regenerative braking function converts the kinetic energy of the vehicle during braking or downhill driving into electrical energy and charges the batteries. this function improves energy management and the utilization of different electrical loads.
3.3. hybrid electric vehicles (hev)
hybrid electric vehicles (hev) are a step in the evolution towards full electric ones.
they are a compromise between the huge investments needed to develop a completely new vehicle model and the requirements of modern society to decrease co2 emissions and fuel consumption. they may also be classified as low emission vehicles in compliance with new legislation, especially in the state of california (usa) [9]. the main idea of the hev is to apply electric energy for propulsion in addition to the ice. depending on the extent of electric propulsion, different levels of hybridization are defined. these levels are expressed by the vehicle’s hybridization factor, i.e. the ratio between its peak electrical power and its peak total (electrical plus mechanical) power. in that sense, hybrid vehicles can be divided into micro hybrids, mild hybrids, power (full) hybrids and energy hybrids [7]. micro hybrids usually have a hybridization factor of 5-10%, mild hybrids 10-25%, and power and energy hybrids 30-50%. the main advantages of hybridization, i.e. of having a dual power train, are that combining an electric motor and an ice gives higher efficiency, better drive flexibility and improved driving range, while fuel consumption and gas emissions are decreased. there are several ways of organizing such a dual power train, so hevs may be additionally classified into series, parallel and series-parallel [10]-[14]. the series hybrid architecture consists of three machines connected in series (fig. 8). the ice drives an ac electric generator that produces power for charging the batteries (dc) and driving the ac electric motor, which is attached to the transmission or directly to the differential or the wheels. to connect the different power systems and voltage levels, an ac/dc and a dc/ac converter are needed. in this way, the dc link decouples the two electrical machines, while the electrical system mechanically decouples the ice from the wheels.
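as a rough illustration of the hybridization factor defined earlier in this section, the following python sketch computes the ratio and maps it onto the bands quoted from [7]. the function names, the example power ratings and the handling of the band edges (10% is treated as mild, and values outside the quoted bands are reported as unclassified) are assumptions made here for illustration and are not taken from the source.

```python
def hybridization_factor(peak_electric_kw: float, peak_ice_kw: float) -> float:
    """Peak electric power divided by peak total (electric + ICE) power."""
    total_kw = peak_electric_kw + peak_ice_kw
    if peak_electric_kw < 0 or peak_ice_kw < 0 or total_kw == 0:
        raise ValueError("peak powers must be non-negative and not both zero")
    return peak_electric_kw / total_kw


def classify_hybrid(hf: float) -> str:
    """Map a hybridization factor (0..1) onto the bands quoted in the text."""
    if 0.05 <= hf < 0.10:
        return "micro hybrid"
    if 0.10 <= hf <= 0.25:   # assumption: the shared 10% edge counts as mild
        return "mild hybrid"
    if 0.30 <= hf <= 0.50:
        return "power or energy hybrid"
    return "unclassified"    # outside the bands given in the text


if __name__ == "__main__":
    # hypothetical vehicle: 20 kW electric machine next to an 80 kW ICE
    hf = hybridization_factor(20.0, 80.0)
    print(f"{hf:.0%} -> {classify_hybrid(hf)}")
```

the same ratio, applied to all power generating systems on board, gives the level of electrification used to order the vehicle classes in this section.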
the problem is that the overall efficiency gain is not significant due to the multiple power conversions.
fig. 8 series hybrid electric vehicle architecture
in a parallel hybrid vehicle both the electrical machine and the ice contribute to propulsion, as they are mechanically coupled to the transmission or the wheels (fig. 9). the electrical machine assists the ice in order to reduce fuel consumption, so it is used mainly during start-up and acceleration. on the other hand, it enables regenerative braking and battery charging through a power electronics converter. the battery, i.e. the energy storage system, is relatively small, providing enough power for short operation but not enough to sustain an all-electric mode, especially at high speeds. for such an application, supercapacitors (ultracapacitors) have recently been proposed [15].
fig. 9 parallel hybrid electric vehicle architecture
in a series-parallel hybrid, two electrical machines are combined with the ice to provide both series and parallel paths for power (fig. 10). the electric motor and the ice are mechanically coupled to deliver power to the wheels. the ice is also coupled with an electrical generator to generate electricity for charging the batteries. the batteries provide power to the electric motor. regenerative charging of the batteries is also possible.
fig. 10 series-parallel hybrid electric vehicle architecture
3.4. plug-in hybrid vehicles (phev)
while hevs have electric assistance to the ice, phevs have a high-energy-density energy storage system that can be externally charged (fig. 11). this allows the vehicle to run in full electric mode or in electric assisted mode. again, series, parallel or series-parallel hybrid power trains are possible. as the batteries have a higher energy capacity than in hevs, such vehicles are also called range-extended hevs.
(source: http://www.mge.com/environment/innovative/hybrid-vehicles.htm)
fig.
11 phev in comparison to hev
phevs may have an on-board charger, which can be bidirectional and enable smart charging, known as the vehicle-to-grid (v2g) charging mode. off-board or public chargers in v2g mode may also be used, together with home chargers (vehicle-to-home, v2h, charging mode). this gives phevs more flexibility and the possibility of running in two operating modes: charge-depleting and charge-sustaining. in the first mode, electrical energy from the batteries provides power until it is consumed, i.e. until the state of charge of the batteries reaches a predefined minimum value. after that the ice is turned on and the vehicle runs in hybrid mode. if the state of charge of the batteries is kept within a predefined range, the operating mode is called charge-sustaining.
3.5. full electric vehicle or battery electric vehicle (ev or bev)
full electric vehicles, also called electric vehicles (evs) or battery electric vehicles (bevs), have an all-electric propulsion system. there is no engine other than the electric motor. the electric power system follows the same concept as at the end of the 19th century, when energy was supplied from rechargeable batteries and an energy conversion block adapted it to the needs of the dc electric motor that powered the car. however, the technology and the efficiency of the whole system have been improved through intensive research and innovation using computer modeling, simulation, emulation, and laboratory and prototype testing [4], [7], [11], [13], [14], [16]-[19].
nowadays, power train components include improved high voltage batteries of high energy density, supercapacitors (or ultracapacitors) of high power density, a battery (energy) management system, a high voltage dc grid (130-400 v), a power conversion system (dc/dc converters and dc/ac inverter), an on-board battery charger (ac/dc converter), an ac motor/generator, a low voltage battery, and a low voltage (12 v) dc power grid for non-propulsion loads. off-board chargers and plug-in features are also included in the system. the complete electric power system of an ev is shown in fig. 12.
fig. 12 typical power system architecture in an ev
3.5.1. batteries and supercapacitors
the batteries and supercapacitors form the energy supply and storage system of an ev, enabling high energy and power supply. normally, batteries (packed to produce a high voltage output) are the main energy source of the propulsion system, determining the operation of the vehicle and its driving range. besides this one, the ev has a separate low voltage battery (12 v) for supplying non-propulsion loads. to improve performance, especially in driving situations with high power demand, such as starting the car or fast acceleration, a supercapacitor is considered as an additional power source in the power train, in parallel with the batteries [16], [20]-[22]. batteries have evolved from lead-acid ones, which are reliable, low-priced and standardized, but heavy (9-15 kg), short-lived (500-800 cycles) and of lower specific energy (33-42 wh/kg) and specific power (0.18 kw/kg). today’s li-ion batteries are more convenient for ev and phev applications, having a lifetime of 3,500 cycles, a specific energy of 130-140 wh/kg, and a specific power of 2.4 kw/kg (fig. 13).
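to make the difference in specific energy concrete, the short python sketch below estimates the cell mass needed to store a given amount of energy with each chemistry. the specific energy ranges are the ones quoted above; the 24 kwh pack size and the function name are hypothetical, and the estimate ignores casing, cooling and the usable depth of discharge.

```python
# specific energy ranges in Wh/kg, as quoted in the text
SPECIFIC_ENERGY_WH_PER_KG = {
    "lead-acid": (33.0, 42.0),
    "li-ion": (130.0, 140.0),
}


def cell_mass_range_kg(energy_kwh: float, chemistry: str) -> tuple[float, float]:
    """Return (lightest, heaviest) cell mass in kg for the stored energy."""
    low_wh_per_kg, high_wh_per_kg = SPECIFIC_ENERGY_WH_PER_KG[chemistry]
    energy_wh = energy_kwh * 1000.0
    # the best specific energy gives the lightest pack, the worst the heaviest
    return energy_wh / high_wh_per_kg, energy_wh / low_wh_per_kg


if __name__ == "__main__":
    pack_kwh = 24.0  # hypothetical pack size, not a figure from the source
    for chemistry in SPECIFIC_ENERGY_WH_PER_KG:
        lightest, heaviest = cell_mass_range_kg(pack_kwh, chemistry)
        print(f"{chemistry}: {lightest:.0f}-{heaviest:.0f} kg")
```

under these assumptions the li-ion cells come out roughly three to four times lighter than lead-acid cells for the same stored energy.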
increased production and innovation will lead to further improvements, so in 2015 li-ion batteries are expected to reach a specific energy of 250-300 wh/kg and a specific power of 3.5 kw/kg, while costs will decrease from 0.5-0.6 €/wh in 2011 to 0.15-0.25 €/wh in 2015 [16]. however, the main problem of long battery charging time remains to be solved. improvements in ac/dc converters, which convert ac power from the public grid to dc either as off-board (public or home) or on-board chargers, have resulted in four charging modes, from fast (20-30 min) to slow (6-8 h) [23]. however, this is still not satisfactory, as people are used to the short refuelling time of the ice. one possible solution is offered by flow batteries, which are a kind of rechargeable fuel cell [4], [24]. there are several types (such as redox, hybrid and membrane-less), but the most promising are vanadium redox batteries, which have a lifetime of 10,000 cycles and can be recharged as quickly and easily as refuelling an ice vehicle, simply by replacing the electrolyte. the main disadvantages are a relatively poor specific energy of 10-20 wh/kg and the complexity and size of the system.
(source: http://en.wikipedia.org/wiki/file:supercapacitors-vs-batteries-chart.png#file)
fig. 13 energy storage devices: energy vs power density
fast charging is achieved with supercapacitors or ultracapacitors (electrical double layer capacitors) for power applications, which are energy storage devices like electrolytic capacitors, but with capacitance values of up to several thousand farads (fig. 13) [15], [21].
their main advantages are very high specific power (from 2 kw/kg to 15 kw/kg in 2013, with an expected further rise to 30 kw/kg), long lifetime (10^5 – 10^6 charge/discharge cycles) and very short charging time (several seconds). on the other hand, their specific energy is relatively low (10-15 wh/kg), so they are not suitable as the sole energy storage unit.
3.5.2. energy conversion
the standard electric power system of ice-powered cars has a 12 v dc bus, which is appropriate for powering all loads. due to the increased power demand in modern vehicles, a 42 v dc bus was considered, but this idea has been abandoned for economic reasons. however, in hevs and evs there is, besides the conventional vehicular loads, an ac electric motor as the main propulsion unit, which needs a higher operating voltage. therefore, a separate high voltage dc bus, supplied from the high voltage battery, is needed [13]. the battery output voltage depends on the state of charge, i.e. on the depletion level, and may vary between 125 v and 200 v. a regenerative dc/dc converter is used to boost the voltage up to the dc bus level of 400 v. if the battery voltage is below the nominal 200 v, then the dc bus voltage is also decreased, down to a minimum of 267 v. the on-board or off-board charger is a three-phase ac/dc converter in h-bridge or back-to-back topology connected to the high voltage dc bus. the dc/dc converter operating in buck mode transfers energy to the high voltage battery and, through a 12 v dc/dc converter, to the low voltage battery as well. the traction inverter, whose function is to provide ac power to the main traction ac motor, has an input voltage range between 190 v and 400 v. the inverter has an h-bridge topology, composed of igbt switches with free-wheeling diodes, and is controlled with space vector modulated pwm or other control algorithms. all these converters operate in switch mode, resulting in high efficiency and low losses.
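the boost operation of the dc/dc converter can be illustrated with the textbook relation for an ideal (lossless, continuous-conduction) boost stage, v_out = v_in / (1 - d), where d is the duty ratio of the switch. the python sketch below applies it to the two operating points quoted above. the function name is hypothetical, and the ideal model is an assumption made here: the source does not describe the actual converter topology or its control law in this detail.

```python
def ideal_boost_duty(v_in: float, v_out: float) -> float:
    """Duty ratio d of an ideal boost converter, from v_out = v_in / (1 - d)."""
    if v_in <= 0 or v_out < v_in:
        raise ValueError("a boost stage needs 0 < v_in <= v_out")
    return 1.0 - v_in / v_out


if __name__ == "__main__":
    # (battery voltage, dc bus voltage) pairs taken from the text
    operating_points = [(200.0, 400.0), (125.0, 267.0)]
    for v_battery, v_bus in operating_points:
        duty = ideal_boost_duty(v_battery, v_bus)
        print(f"{v_battery:.0f} v -> {v_bus:.0f} v: d = {duty:.3f}")
```

in this idealized view the duty ratio stays close to 0.5 at both operating points, which suggests that the bus voltage is lowered together with the battery voltage so that the boost ratio remains near two.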
still, sophisticated energy management is needed to coordinate the energy flow and enable high efficiency. further improvements in these directions are expected in the future.
3.5.3. traction electric motor
although a dc motor seems the logical choice for ev propulsion, as it is powered from dc batteries, today’s solutions are based on ac motors. induction or synchronous ac motors are used as traction motors due to their lower weight and cost, higher reliability and lower maintenance needs. for high power propulsion, the induction motor is used. for example, the tesla roadster uses a 3-phase 4-pole induction motor of 185 kw with a maximum speed of 6,000 rpm. for four-wheel drive, four permanent magnet synchronous motors (pmsm) are mounted as part of the wheel structure. the advantage of this arrangement is the elimination of the mechanical gears and differential used in a single radial machine drive system. this gives higher efficiency, less weight and improved reliability, but in-wheel motors are subject to the usual size and weight restrictions, so they are convenient for small vehicles.
3.6. future evs
nowadays electric vehicles are powered from electric batteries, which are charged from the public electric grid. however, such electric energy is generated partially (30%-70%) in fossil fuel power plants (coal, oil or similar), and therefore such evs do not fully contribute to reducing co2 emissions and improving the environment. in fact, the emissions are only moved from the big cities to the area where the coal or oil plant is located. it may be estimated that the coal plant co2 emission attributable to a 40 kw (52 hp) ev is around 200 gco2/km [16]. therefore, additional efforts and technical innovations are needed to achieve the full green effect of ev operation.
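the order of magnitude of such an estimate can be reproduced with a simple product of vehicle energy consumption and plant emission factor, as in the python sketch below. the consumption of 0.2 kwh/km and the coal plant emission factor of 1000 gco2/kwh are illustrative assumptions chosen here, not inputs stated in [16], and the function name is hypothetical. the fossil share parameter reuses the 30%-70% range quoted above and treats the non-fossil part of the mix as emission free, which is a further simplification.

```python
def co2_g_per_km(consumption_kwh_per_km: float,
                 plant_g_co2_per_kwh: float,
                 fossil_share: float = 1.0) -> float:
    """Upstream CO2 per driven km for electricity drawn from the grid."""
    if not 0.0 <= fossil_share <= 1.0:
        raise ValueError("fossil_share must be between 0 and 1")
    return consumption_kwh_per_km * plant_g_co2_per_kwh * fossil_share


if __name__ == "__main__":
    consumption = 0.2      # kWh/km, assumed for a small ev
    coal_factor = 1000.0   # gCO2/kWh, assumed for a coal plant
    for share in (1.0, 0.7, 0.3):
        grams = co2_g_per_km(consumption, coal_factor, share)
        print(f"fossil share {share:.0%}: {grams:.0f} gco2/km")
```

with these assumed inputs a fully coal-supplied ev lands at the quoted order of 200 gco2/km, and the figure scales down linearly with the fossil share of the generation mix.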
there are several ideas, but for the moment only solar energy and hydrogen energy using fuel cells have been implemented in prototypes of some ev models. solar energy is converted to electrical energy using the photovoltaic effect in photocells. there are two ways of using this renewable energy source. one is charging the batteries while parked, and the other is powering the vehicle from photocells integrated in the body of the vehicle. the first solution is very popular and has led to various solar parking shade designs. the second is practical only for non-propulsion apparatus/loads in the vehicle, such as air-conditioning, or for very light vehicles [25]. a fuel cell (fc) uses h2 gas as the fuel and combines it with oxygen to produce electricity, with h2o as the output. therefore it is environmentally friendly and does not emit any ghg. the complete scheme of an fc ev is shown in fig. 14. the cell produces electricity and stores it in the batteries, from which it is consumed by the electric motor. the process has slow dynamics, so additional power from supercapacitors is considered [26].
fig. 14 fuel cell electric vehicle architecture
4. effects of vehicles electrification
electrification of the vehicle propulsion system and development of the plug-in ev (pev) industry is the final step toward low or even zero emission passenger cars. the process has progressed from the individual efforts of some innovators in the 1960s to today’s determination of all major car makers to include at least one hybrid or electric model in their portfolio. at the moment the number of hevs, phevs and evs sold is rising rapidly, spreading from a group of countries called the ev initiative (evi).
the market success and high public acceptance of hybrid models in the usa, especially the toyota prius and chevrolet volt, of which more than 3,000,000 cars were sold, made a breakthrough in the application of electric energy to the propulsion of modern vehicles. the usa dominates the phev market with a 70% share, followed by japan with 12% and the netherlands with 8% in 2012 [16]. the data show that there were around 180,000 pevs on the road in 2012 [27]. as of december 2013, this number had risen to 380,000 pevs (passenger cars and utility vans) worldwide, and almost 2,000,000 evs (pevs+hevs). fig. 15 (left) shows annual ev sales by drive train (hev, phev, bev) in 2013 and the forecast up to 2022 [27]. it can be seen that the ev market has huge potential and that annual sales of several million units are envisaged. other sources forecast that in 2020 the ev industry (hev+phev+bev) will produce from 5,000,000 units [27] up to 7,500,000 units [28], [29], reaching from 15% [7] up to 20% [28] (fig. 15, right) of all vehicle sales.
fig. 15 left: annual evs worldwide sales forecast 2013–2022 [27]; right: cumulative evs sales 2009-2020 [28].
still, most of the sales will be mild and full hybrid evs, which belong to the class of low emission vehicles (lev). therefore, the goal of decreasing the overall level of co2 by 20% in the eu will not be achievable through electrification alone. however, in the long run, up to 2050, the effects will be more significant. fig. 16 shows the cumulative ghg emission savings of a fleet of 11.2 million evs that will be sold between 2010 and 2020 under three scenarios. the maximum savings reach almost 45 million metric tons of ghg (for comparison, in 2009 the u.s. emitted 7000 million metric tons) [29].
fig.
16 cumulative ghg savings due to use of evs 2010-2030 [29]
another effect of the fast growth of evs may be an increase in electricity demand and an influence on the stability of the electrical grid. an estimate with three similar scenarios shows that an additional 7 twh will be needed in 2020 and 40 twh in 2030 (fig. 17). this is not a significant demand for a country with large capacity such as the usa, so it may be concluded that there will be no major problem in providing the electricity supply for evs. another study shows that in california no additional capacity will be needed to charge 10 million evs between 11 p.m. and 8 a.m., but 30% of new capacity will be required if these vehicles are charged between 5 p.m. and 12 a.m. [30].
fig. 17 number of evs and their electricity demand forecast (2010-2030).
5. conclusion
electrification of vehicles is entering its final stage, in which the remaining ice propulsion is gradually being replaced with electric propulsion. different solutions are possible for the drive train: hybrid, plug-in hybrid or full electric. the ev power system is characterized by a dominant dc bus, an ac electric motor and multiple voltage levels. the main power source is the battery, but additional power may also be supplied from supercapacitors. to operate such a system, several power electronics converters with sophisticated control methods and energy management are needed. also, special on-board and/or off-board chargers are integrated in the system. the market prospects for the ev industry are very promising. a huge rise in production is expected in the coming years. this will decrease ghg emissions, but only in the long run. on the other hand, no significant influence on the existing public power system is expected, especially if battery charging is performed during night hours.
still, to become competitive with ice cars, further improvements and innovative development is needed and expected in the future. acknowledgement: the paper is a part of the research done within the project no. 114-4513508/2013-04 co-financed by the provincial secretariat for science and technological development of a.p. vojvodina. references [1] international energy agency, “transport, energy and co2 – moving toward sustainability”, report, paris, 2009, http://www.iea.org/publications/freepublications/publication/transport2009.pdf [2] european commission – eurostat, “consumption of energy”, data from august 2012, on-line only: http://epp.eurostat.ec.europa.eu/statistics_explained/index.php/consumption_of_energy [3] international energy agency, “technology roadmap: electric and plug-in hybrid electric vehicles (ev/phev)”, report, paris, released 2009, updated june 2011, http://www.iea.org/publications/ freepublications/publication/name,3851,en.html [4] b.k. bose, “global energy scenario and impact of power electronics in 21 st century”, ieee transaction on industrial electronics, vol.60, no.7, pp.2638-2651, july 2013, doi: 10.1109/tie.2012.2203771 [5] ***, “the history of electric vehicles”, electric vehicles news, available on-line (feb. 2014), http://www.electricvehiclesnews.com/history/historyearly.htm [6] r.h. staunton, c.w. ayers, l.d. marlino, j.n. chiasson, t.a. burress, “evaluation of 2004 toyota prius hybrid electric drive system”, report for the u.s. department of energy, may 2006, http://k0bg.com/images/pdf/890029.pdf [7] a. emadi, “transportation 2.0”, ieee power & energy magazine, vol.9, no.4, pp.18-29, july/aug. 2011, doi: http://dx.doi.org/10.1109/mpe.2011.941320 [8] j.g. kassakian, h.c. wolf, j.m. miller, c.j. hurton, “automotive electrical systems circa 2005”, ieee spectrum, vol.33, no.8, pp.22-27, aug. 1996, doi: http://dx.doi.org/10.1109/6.511737 [9] u.s. 
department of energy, alternative fuels data center, “california laws and incentives for air quality / emissions”, 2013, http://www.afdc.energy.gov/laws/laws/ca/reg/3843 [10] a. emadi, s.s. williamson, a. khaligh, “power electronics intensive solutions for advanced electric, hybrid electric, and fuel cell vehicular power systems”, ieee transaction on power electronics, vol.21, no.3, pp.567-577, may 2006, doi: http://dx.doi.org/10.1109/tpel.2006.872378 [11] k. rajashekara, “present status and future trends in electric vehicle propulsion technologies”, ieee journal of emerging and selected topics in power electronics, vol.1, no.1, pp.3-10, march 2013. [12] c. shen, p. shan, t. gao, “a comprehensive overview of hybrid electric vehicles”, international journal of vehicular technology, vol.2011, article id 571683, pages 7, 2011, on line available: http://dx.doi.org/10.1155/2011/571683 [13] h. van hoek, m. boesing, d. van treek, t. schoenen, r.w. de doncker, “power electronics architectures for electric vehicles”, int. conf. on emobility electrical power train, 8-9 nov. 2010, leipzig, doi: 10.1109/emobility.2010.5668048 [14] a. emadi, y.j. lee, k. rajashekara, “power electronics and motor drives in electric, hybrid electric, and plug-in hybrid electric vehicles”, ieee transaction on industrial electronics, vol.55, no.6, pp.2237-2245, june 2008, doi: http://dx.doi.org/10.1109/tie.2008.922768 [15] j.w. dixon, m. ortúza, e. 
wiechmann, “regenerative braking for an electric vehicle using ultracapacitors and a buck-boost converter”, 17th electric vehicle symposium (evs17), montreal (canada), oct.15-18, 2000, http://web.ing.puc.cl/~power/paperspdf/dixon/42a.pdf
[16] n.c. kar, k.l.v. iyer, a. labak, x. lu, ch. lai, a. balamurali, b. esteban, m. sid-ahmed, “courting and sparking: wooing consumers’ interest in the ev market”, ieee electrification magazine, vol.1, no.1, pp.21-31, sep.2013, doi: http://dx.doi.org/10.1109/mele.2013.2272481
[17] e.m. adzic, m.s. adzic, v.a. katic, d.p. marcetic, n.l. celanovic, “development of high-reliability ev and hev ac propulsion drive with ultra-low latency hil environment”, ieee transactions on industrial informatics, vol. 9, no.2, pp.630-639, may 2013, doi: 10.1109/tii.2012.2222649
[18] n. janiaud, f.-x. vallet, m. petit, g. sandou, “electric vehicle powertrain simulation to optimize battery and vehicle performances”, ieee vehicle power and propulsion conference (vppc), lille, 1-3 sep.
2010, doi: http://dx.doi.org/10.1109/vppc.2010.5729141 [19] seref soylu (editor), “electric vehicles – modelling and simulations”, intech europe, rijeka, croatia, 2011, http://www.intechopen.com/books/electric-vehicles-modelling-and-simulations [20] l. sun, c. c. chan, r. liang, q. wang, “state-of-art of energy system for new energy vehicles”, ieee vehicle power and propulsion conference (vppc), september 3-5, 2008, harbin, china, doi: http://dx.doi.org/10.1109/vppc.2008.4677574 [21] m. halper, j. ellenbogen, “supercapacitors: a brief overview”, mitre, mc lean, virginia, usa, march 2006, http://www.mitre.org/sites/default/files/pdf/06_0667.pdf [22] c.c. chan, l. sun, r. liang, q. wang, “current status and future of energy storage system for ev”, 23 rd int. battery, hybrid and fuel cell electric vehicle sym. & exh. (evs-23), anaheim, 2-5 dec. 2007, http://www.lifepo4.info/battery_study/batteries/current_status_and_future_of_energy_storage_system _for_ev.pdf [23] iec 62196-1 standard: “plugs, socket-outlets, vehicle couplers and vehicle inlets–conductive charging of electric vehicles”, genève, 2003. [24] t. nguyen, r. savinell, “flow batteries”, interface, the electrochemical society, vol. 19, no.3, pp.5456, fall 2010, http://www.electrochem.org/dl/interface/fal/fal10/fal10_p054-056.pdf [25] r. sims, p. mercado, w. krewitt, et al., “integration of renewable energy into present and future energy systems”, chapter 8 of the ipcc special report on renewable energy sources and climate change mitigation, cambridge university press, cambridge, u.k. and new york, usa, 2011, http://srren.ipccwg3.de/report/ipcc_srren_ch08.pdf. [26] k. 
rajashekara, “propulsion system strategies for fuel cell vehicles”, sae 2000 world congress, detroit, usa, march 6-9, 2000, http://am.delphi.com/pdf/techpapers/2000-01-0369.pdf
[27] s. shepard, j. gartner, “electric vehicle market forecasts”, navigant research, report, 4q 2013, http://www.navigantresearch.com/research/electric-vehicle-market-forecasts
[28] r. lache, d. galves, p. nolan, “electric cars: plugged in 2”, deutsche bank, fitt research report, nov.2009, http://gold-estate.com/content/lithium/electriccarspluggedin2.pdf
[29] l. schewel, d.m. kammen, “smart transportation: synergizing electrified vehicles and mobile information systems”, environment – science and policy for sustainable development, vol. 4, sep.-oct. 2010, on line available: http://www.environmentmagazine.org/archives/back%20issues/september-october%202010/smart-transportation-full.html
[30] federal communication commission, “connecting america: national broadband plan” march 2010, http://download.broadband.gov/plan/national-broadband-plan.pdf
facta universitatis series: electronics and energetics vol. 35, no 2, june 2022, pp. 187-198 https://doi.org/10.2298/fuee2202187r © 2022 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper
design and performance analysis of full adder using 6-t xor–xnor cell
k srinivasa rao, marupaka aditya, b.s.d. karthik raja, ch. manisai, m. tharun sai reddy, k. girija sravani
department of ece, koneru lakshmaiah education foundation, vaddeswaram, india
abstract. in this paper, a high-speed, low power 6-t xor-xnor circuit is designed and simulated. a 1-bit hybrid full adder (consisting of 16 transistors) built from the xor-xnor circuit and sum and carry blocks is also designed and simulated to improve area and speed. its performance is compared with full adder designs with 20 and 18 transistors, respectively. the proposed circuits are simulated in the microwind tool using 180 and 90 nm cmos technology, and their performance is measured in terms of power, delay, and pdp (power delay product).
key words: xor-xnor circuit, hybrid full adder
1. introduction
logic gates are the basic building blocks of any digital system design. a logic gate is an electronic circuit having at least one input and only one output. logic gates are primarily implemented using diodes or transistors acting as electronic switches. logic circuits include devices such as multiplexers, registers, arithmetic logic units (alus), and computer memory, all the way up through complete microprocessors, which may contain in excess of 100 million gates. at present, most gates are made from mosfets.
the basic circuits we are most familiar with are adders. there are two types of adders, namely the half adder and the full adder. the half adder, which has two inputs and two outputs, is not used in practical applications. full adders, which have three inputs and two outputs, are used in most applications, such as generating memory addresses inside a computer, making the program counter point to the next instruction (the alu makes use of this adder), fft algorithms, fir and iir filters, etc. [1] the conventional full adders used in alus contain around 28 transistors, which results in higher power consumption, delay and area.
received march 8, 2021; accepted march 27, 2022
corresponding author: k srinivasa rao, department of ece, koneru lakshmaiah education foundation (deemed to be university), vaddeswaram, india, e-mail: srinivasakarumuri@gmail.com
but people today are keener on technology that is simple, small, easy to carry, reliable over a longer time and usable in more applications. all the above aspects have led to the development of hybrid technology. building low power hybrid vlsi systems has emerged as a significant performance goal because of the fast growing technology in mobile communications and computation. [2] hybrid technology is the combination of two or more different logic styles. the hybrid 1-bit full adder is mainly a combination of cmos logic design style, transmission gate logic and pass transistor logic.
[3] most of the hybrid adders suffer from poor driving capability and a poor power delay product (pdp) when operated at lower voltages. in [3], a hybrid logic full adder using a 10-transistor xor-xnor circuit is proposed, and performance parameters such as power consumption, delay, pdp and driving capability are simulated in the cadence virtuoso tool using 90-nm cmos technology. the proposed xor–xnor circuit is based on cpl and a cross-coupled structure. a full adder is also proposed using the same xor-xnor circuit together with sum and carry modules, as shown in fig. 1.

fig. 1 block diagram of hybrid logic fa circuit [3]

in [4], xor/xnor circuits and simultaneous xor-xnor circuits are proposed. the proposed xor-xnor circuit has 12 transistors and no not gate on its critical path, and shows good performance in terms of power consumption, delay, output capacitance, driving capability and robustness. it saves almost 16.2%–85.8% in pdp and is 9%–83.2% faster than the other circuits. in [5], a technique is proposed to assess the timing behavior of hybrid full adders made in both cmos and finfet 32-nm technologies, and to compare their performance in multistage circuits using the hspice tool. the circuits include the transistor function full adder (tfa), transmission gate full adder (tga), new-hpsc (hybrid pass static cmos), new14t full adder and a conventional cmos (ccmos) full adder. three parameters are considered for the timing behavior, namely speed, driving capability and input capacitance. in [6], a hybrid full adder design using pass transistors, transmission gates and cmos logic is proposed. the full adder is implemented in 45-nm technology in the cadence simulation tool. the performance parameters of the proposed full adder are compared with those of twenty existing full adders, with the supply voltage ranging from 0.4 v to 1.2 v.
in [7], a 1-bit hybrid full adder using modified xnor gates is proposed to improve area and speed, and its performance is compared with that of a conventional full adder. the proposed design is simulated in 90 nm technology with a 1.2 v supply voltage. research is still ongoing to obtain a full adder that is suitable for practical applications while providing good performance in all aspects.

in this paper, our focus is to reduce the transistor count and delay of a full adder and then to compare it with 20- and 18-transistor full adders. we first tried to implement the proposed logic in xilinx software, which took a long time. we then considered tanner software, but the software was not licensed and was not compatible with 90-nm and smaller technologies. finally, we constructed the circuit in the dsch tool, converted it into a verilog file and simulated it in the microwind tool.

the rest of the paper is organized as follows: in section 2, the two-input 6-t xor-xnor cell is proposed, together with the full adder (fa) using the proposed xor–xnor circuit. in section 3, the performance of the proposed xor–xnor cell and full adder in terms of power, speed, and pdp is compared with that of available xor–xnor circuits and fas. section 4 concludes the paper.

2. proposed design

2.1. xor-xnor circuit

xor-xnor circuits are the basic building blocks of many arithmetic and encryption circuits. the proposed xor-xnor circuit consists of six transistors, three pmos and three nmos, as shown in fig. 2. two inverter circuits are used, one to obtain the inverted input a and the other to invert the xor output, which gives the xnor output. the circuit provides full swing outputs simultaneously.

fig. 2 proposed xor–xnor circuit

2.2.
full adder circuit

for the hybrid logic design, the full adder is built from xor-xnor, sum and carry circuits. the sum (fig. 3) and carry (cout) (fig. 4) circuits implement expressions (1) and (2):

$$\mathrm{sum} = (A \oplus B)\,C' + (A \oplus B)'\,C = A \oplus B \oplus C \tag{1}$$

$$\mathrm{cout} = (A \oplus B)\,C + AB \tag{2}$$

fig. 3 sum module

fig. 4 carry module

both sum and carry circuits are constructed using cmos logic: the sum circuit uses six transistors and provides good driving capability and high robustness, while the carry circuit uses four transistors and consumes less power while providing a better delay. the xor-xnor circuit proposed in section 2.1 and the sum (fig. 3) and carry (fig. 4) circuits are combined to form a full adder of 16 transistors, as shown in fig. 5.

fig. 5 proposed full adder circuit

using the logical expressions (1) and (2), the truth table for this full adder can be derived as follows:

table 1 truth table of full-adder

inputs          outputs
a   b   cin     sum   carry
0   0   0       0     0
0   0   1       1     0
0   1   0       1     0
0   1   1       0     1
1   0   0       1     0
1   0   1       0     1
1   1   0       0     1
1   1   1       1     1

3. results and discussion

all the circuits discussed in the previous section are built in the dsch tool, converted into a verilog file, and compiled in the microwind tool to obtain the circuits shown in fig. 6 and fig. 8. the performance parameters delay, power, and pdp are verified at supply voltages ranging from 0.5 v to 1.2 v at random frequencies.

3.1. xor-xnor circuit

the xor-xnor circuit requires only two inputs, a and b, which are applied as piecewise linear (pwl) input signals. fig. 6 shows the circuit design in the microwind tool and fig. 7 shows the input pattern applied to the xor–xnor circuit.
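as a cross-check of expressions (1) and (2) and of table 1 in section 2.2, the logic can be enumerated in a few lines. this is only a minimal sketch at the boolean level; the function name `full_adder` is a hypothetical helper, and the code says nothing about the transistor-level behaviour of the proposed cell.

```python
from itertools import product


def full_adder(a, b, cin):
    """Evaluate expressions (1) and (2) for single-bit inputs (0 or 1)."""
    x = a ^ b                               # xor output of the first stage
    xn = 1 - x                              # xnor output (complement of xor)
    s = (x & (1 - cin)) | (xn & cin)        # expression (1)
    cout = (x & cin) | (a & b)              # expression (2)
    return s, cout


# Enumerate all eight input combinations in the order used in table 1.
rows = [(a, b, c, *full_adder(a, b, c)) for a, b, c in product((0, 1), repeat=3)]
for row in rows:
    print(*row)
```

each printed row lists a, b, cin, sum and carry, and the pair (carry, sum) is the two-bit binary value of a + b + cin.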
in the output waveforms there are small glitches, but they have no significant effect and full swing is obtained at the output. the xor and xnor outputs are thus obtained simultaneously, which is required for the full adder design.

fig. 6 proposed xor–xnor circuit in microwind

fig. 7 input–output waveforms for the proposed xor–xnor circuit

the performance of the same xor-xnor circuit is compared in two different cmos technologies, 180 nm and 90 nm. for the calculation of pdp, the worst-case delay of the xor and xnor outputs is taken. the xor circuit delay is compared in table 2, the xnor circuit delay in table 3, the power consumption in table 4 and the pdp values in table 5.

table 2 xor circuit delay (ps)

voltage   180 nm   90 nm
0.5 v     31       11
0.6 v     35       13
0.7 v     40       16
0.8 v     52       19
0.9 v     86       23
1.0 v     86       30
1.1 v     105      36
1.2 v     120      49

table 3 xnor circuit delay (ps)

voltage   180 nm   90 nm
0.5 v     0        0
0.6 v     15       15
0.7 v     42       24
0.8 v     65       25
0.9 v     83       37
1.0 v     98       43
1.1 v     115      55
1.2 v     126      62

table 4 power consumption of xor-xnor circuit (µw)

voltage   180 nm   90 nm
0.5 v     6        2
0.6 v     10       4
0.7 v     11       5
0.8 v     12       6
0.9 v     13       7
1.0 v     15       7
1.1 v     15       8
1.2 v     16       9

table 5 pdp of xor-xnor circuit

voltage   180 nm   90 nm
0.5 v     186      22
0.6 v     350      26
0.7 v     440      72
0.8 v     624      75
0.9 v     1118     100
1.0 v     1204     150
1.1 v     1725     325
1.2 v     1712     650

as the technology is scaled down, the power and pdp values improve considerably when the circuit is operated between 0.7 v and 1 v. from the above values it can be concluded that, although the delays may be high, the power and pdp values are suitable for practical use at lower supply voltages.

3.2. full adder circuit

fig.
8 shows the full adder circuit design in the microwind tool and fig. 9 shows the applied input pattern and the corresponding full adder outputs. the circuit provides full swing outputs with small glitches.

fig. 8 proposed fa circuit in microwind

fig. 9 input–output waveforms for the proposed fa circuit

the performance of the same full adder circuit is compared in two different cmos technologies, 180 nm and 90 nm. for the delay calculation, the cin to cout delay is considered, as this delay is crucial for most high-level designs. delay is compared in table 6, power consumption in table 7 and pdp values in table 8.

table 6 full adder circuit delay (ps)

voltage   180 nm   90 nm
0.5 v     1120     1021
0.6 v     1062     1023
0.7 v     1074     1026
0.8 v     1081     1029
0.9 v     1088     1030
1.0 v     1094     1029
1.1 v     1098     1030
1.2 v     1120     1031

table 7 power consumption of full adder circuit (µw)

voltage   180 nm   90 nm
0.5 v     6        6
0.6 v     21       7
0.7 v     32       9
0.8 v     38       10
0.9 v     43       12
1.0 v     48       16
1.1 v     53       35
1.2 v     58       75

table 8 pdp of full adder circuit

voltage   180 nm   90 nm
0.5 v     6720     6126
0.6 v     22302    7161
0.7 v     34368    9234
0.8 v     41078    10290
0.9 v     46784    10360
1.0 v     52512    16464
1.1 v     58194    36050
1.2 v     68032    77325

a comparison of the proposed full adder with the 20t [3] and 18t full adder designs is presented as graphs for each parameter in both 180 nm and 90 nm technologies, for supply voltages from 0.5 v to 1.2 v, as shown in fig. 10.

fig. 10 (a) delay, (b) power, (c) pdp [180 nm] and (d) delay, (e) power, (f) pdp [90 nm] comparison at supply voltages of 0.5 v-1.2 v

the delay of the proposed circuit is compared in fig. 10(a) and fig. 10(d), power in fig. 10(b) and fig. 10(e), and pdp in fig. 10(c) and fig. 10(f).
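the power delay product is simply the product of the tabulated delay and power, so the entries of table 8 can be recomputed from tables 6 and 7. the sketch below does this and lists any entry where the straight product differs from the tabulated value; the variable names are illustrative, and the unit remark assumes delay in ps and power in µw as stated in the table captions.

```python
VOLTAGES = [0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2]      # supply voltage, v

DELAY_PS = {                                              # table 6
    "180 nm": [1120, 1062, 1074, 1081, 1088, 1094, 1098, 1120],
    "90 nm": [1021, 1023, 1026, 1029, 1030, 1029, 1030, 1031],
}
POWER_UW = {                                              # table 7
    "180 nm": [6, 21, 32, 38, 43, 48, 53, 58],
    "90 nm": [6, 7, 9, 10, 12, 16, 35, 75],
}
PDP_TABLE = {                                             # table 8
    "180 nm": [6720, 22302, 34368, 41078, 46784, 52512, 58194, 68032],
    "90 nm": [6126, 7161, 9234, 10290, 10360, 16464, 36050, 77325],
}


def pdp(delay_ps, power_uw):
    """Power delay product; ps times uW gives units of 1e-18 J (attojoules)."""
    return [d * p for d, p in zip(delay_ps, power_uw)]


computed = {tech: pdp(DELAY_PS[tech], POWER_UW[tech]) for tech in DELAY_PS}

# Collect (technology, voltage, computed, tabulated) where the two values differ.
mismatches = [
    (tech, v, calc, tab)
    for tech in computed
    for v, calc, tab in zip(VOLTAGES, computed[tech], PDP_TABLE[tech])
    if calc != tab
]
for row in mismatches:
    print(row)
```

fourteen of the sixteen entries agree exactly with the product; the sketch only reports the two remaining differences (180 nm at 1.2 v and 90 nm at 0.9 v) and does not attempt to decide which tabulated value is the intended one.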
the proposed circuit shows good performance in terms of power and pdp. although the delay values are high, the circuit is applicable for practical uses because it uses the least number of transistors. the approach presented here still faces the challenge of reducing power consumption and delay, which could be addressed by replacing the mosfets with finfets.

table 9 overall comparison of 180-nm technology

table 10 overall comparison of 90-nm technology

table 9 and table 10 present the overall comparison of the performance parameters in 180 nm and 90 nm technology, respectively. the 20t and 18t full adder designs used for the comparison were redesigned in microwind software, which gave values different from those previously reported, and were compared with our 16t full adder. the previous work was done in the cadence tool [4]; since we were unable to obtain the same software, we repeated the simulations in microwind.

table 11 comparison of performance parameters of proposed architecture and existing architectures

architecture   power (mw)
proposed       21
[27]           53
[28]           25
[29]           40
[30]           71

table 11 gives the performance comparison of different architectures in terms of power. as can be seen, the power consumed by the proposed adder is lower than that of the existing architectures.

4. conclusion

in this paper, a new xor-xnor circuit consisting of six transistors is proposed, which reduces the complexity of the circuit and provides full swing outputs simultaneously. this circuit is combined with sum and carry circuits to form a 16-t full adder circuit. the performance of the proposed xor-xnor circuit and the full adder is tested by simulating them in the microwind tool using 180 nm and 90 nm cmos technology.
the proposed circuits show good performance in terms of power and pdp at lower supply voltages between 0.7 v and 1 v. the proposed full adder circuit in 90 nm technology is applicable to higher order cascaded full adder circuits and to practical applications at lower supply voltages. if the same circuit is tested in other software, the results could be better, and the circuit can be used in applications such as digital signal processing and microprocessors.

references

[1] r. rajaei and a. amirany, "nonvolatile low-cost approximate spintronic full-adders for computing in memory architectures", ieee trans. magn., vol. 56, no. 4, pp. 1–8, april 2020.
[2] e. pakniyat, s. r. talebiyan and m. j. a. morad, "design of high performance and low power 16t full adder cell for sub-threshold technology", in proceedings of the ieee international congress on technology, communication and knowledge (ictck), mashhad, iran, 2015, pp. 79–85.
[3] d. radhakrishnan, "low voltage cmos full adder cells", electron. lett., vol. 35, no. 21, pp. 1792–1794, oct. 1999.
[4] j. kandpal, a. tomar, m. agarwal and k. k. sharma, "high-speed hybrid-logic full adder using high-performance 10-t xor-xnor cell", ieee trans. very large scale integr. vlsi syst., vol. 28, no. 6, pp. 1413–1422, june 2020.
[5] h. naseri and s. timarchi, "low-power and fast full adder by exploring new xor and xnor gates", ieee trans. very large scale integr. vlsi syst., vol. 26, no. 8, pp. 1481–1493, aug. 2018.
[6] h.-r. basireddy, k. challa and t. nikoubin, "hybrid logical effort for hybrid logic style full adders in multistage structures", ieee trans. very large scale integr. vlsi syst., vol. 27, no. 5, pp. 1138–1147, may 2019.
[7] m. hasan, m. j. hossein, m. hossain, h. u. zaman and s. islam, "design of a scalable low-power 1-bit hybrid full adder for fast computation", ieee trans. circuits syst. ii: express briefs, vol. 67, no. 8, pp. 1464–1468, aug. 2020.
[8] c. p. kadu and m.
sharma, "area-improved high-speed hybrid 1-bit full adder circuit using 3t-xnor gate", in proceedings of the international conference on computing, communication, control and automation (iccubea), 2017, pp. 1–5.
[9] k. sanapala and r. sakthivel, "ultra-low-voltage gdi-based hybrid full adder design for area and energy-efficient computing systems", iet circuits, devices syst., vol. 13, no. 4, pp. 465–470, may 2019.
[10] d. abedi and g. jaberipur, "decimal full adders specially designed for quantum-dot cellular automata", ieee trans. circuits syst. ii: express briefs, vol. 65, no. 1, pp. 106–110, jan. 2018.
[11] v. kolla, t. nagateja and r. vaddi, "robust and energy efficient non-volatile reconfigurable logic circuits with hybrid cmos-mtjs", in proceedings of the 3rd international conference on emerging electronics (icee), 2016, pp. 1–5.
[12] r. rajaei and s. bakhtavari mamaghani, "ultra-low power, highly reliable, and non-volatile hybrid mtj/cmos based full-adder for future vlsi design", ieee trans. device mater. reliab., vol. 17, no. 1, pp. 213–220, march 2017.
[13] m. keerthana and t. ravichandran, "implementation of low power 1-bit hybrid full adder using 22 nm cmos technology", in proceedings of the ieee 6th international conference on advanced computing and communication systems (icaccs), coimbatore, india, 2020, pp. 1215–1217.
[14] p. bhattacharyya, b. kundu, s. ghosh, v. kumar and a. dandapat, "performance analysis of a low-power high-speed hybrid 1-bit full adder circuit", ieee trans. very large scale integr. vlsi syst., vol. 23, no. 10, pp. 2001–2008, oct. 2015.
[15] h. thapliyal, f. sharifi and s. d. kumar, "energy-efficient design of hybrid mtj/cmos and mtj/nanoelectronics circuits", ieee trans. magn., vol. 54, no. 7, pp. 1–8, july 2018.
[16] t. nikoubin, m. grailoo and s. h.
mozafari, "cell design methodology based on transmission gate for low-power high-speed balanced xor-xnor circuits in hybrid-cmos logic style", j. low power electron., vol. 6, no. 4, pp. 503–512, dec. 2010.
[17] t. nikoubin, m. grailoo and c. li, "energy and area efficient three-input xor/xnors with systematic cell design methodology", ieee trans. very large scale integr. vlsi syst., vol. 24, no. 1, pp. 398–402, jan. 2016.
[18] a. m. shams, t. k. darwish and m. a. bayoumi, "performance analysis of low-power 1-bit cmos full adder cells", ieee trans. very large scale integr. vlsi syst., vol. 10, no. 1, pp. 20–29, feb. 2002.
[19] s. janwadkar and s. das, "design and performance evaluation of hybrid full adder for extensive pdp reduction at 180 nm technology", in proceedings of the 3rd international conference for convergence in technology (i2ct), apr. 06-08, 2018, pp. 1–6.
[20] m. agarwal, n. agrawal and m. a. alam, "a new design of low power high speed hybrid cmos full adder", in proceedings of the international conference on integrated networks, 2014, pp. 448–452.
[21] s. goel, a. kumar and m. bayoumi, "design of robust, energy efficient full adders for deep-submicrometer design using hybrid-cmos logic style", ieee trans. very large scale integr. vlsi syst., vol. 14, no. 12, pp. 1309–1321, dec. 2006.
[22] r. zimmermann and w. fichtner, "low-power logic styles: cmos versus pass-transistor logic", ieee j. solid-state circuits, vol. 32, no. 7, pp. 1079–1090, july 1997.
[23] k. navi, m. maeen, v. foroutan, s. timarchi and o. kavehei, "a low-power full-adder cell for low voltage", vlsi j. integr., vol. 42, no. 4, pp. 457–467, sept. 2009.
[24] a. m. shams, t. k. darwish and m. a. bayoumi, "performance analysis of low-power 1-bit cmos full adder cells", ieee trans. very large scale integr. vlsi syst., vol. 10, no. 1, pp. 20–29, feb. 2002.
[25] j.-f. lin, y.-t. hwang, m.-h. sheu and c.-c.
ho, "a hybrid high-speed and energy efficient 10-transistor full adder design", ieee trans. circuits syst. i, reg. papers, vol. 54, no. 5, pp. 1050–1059, may 2007.
[26] s. kumar, a. kumar and p. bansal, "high speed area efficient 1-bit hybrid full adder", in proceedings of the ieee international conference on electrical, electronics, and optimization techniques (iceeot), chennai, india, 2016, pp. 682–686.
[27] n. temenos and p. p. sotiriadis, "nonscaling adders and subtracters for stochastic computing using markov chains", ieee trans. very large scale integr. vlsi syst., vol. 29, no. 9, pp. 1612–1623, sept. 2021.
[28] v. t. lee, a. alaghi, j. p. hayes, v. sathe and l. ceze, "energy-efficient hybrid stochastic-binary neural networks for near-sensor computing", in proceedings of the design, automation & test in europe conference & exhibition, lausanne, switzerland, mar. 2017, pp. 13–18.
[29] p. ting and j. p. hayes, "eliminating a hidden error source in stochastic circuits", in proceedings of the ieee international symposium defect fault tolerance vlsi nanotechnology syst. (dft), oct. 2017, pp. 1–6.
[30] a. ren et al., "sc-dcnn: highly-scalable deep convolutional neural network using stochastic computing", in proceedings of the 22nd acm international conference on architectural support for programming languages and operating systems (asplos), xi’an, china, apr. 2017, pp. 405–418.

facta universitatis series: electronics and energetics vol. 35, no 1, march 2022, pp. 121-136 https://doi.org/10.2298/fuee2201121s © 2022 by university of niš, serbia | creative commons license: cc by-nc-nd original scientific paper

optimal battery storage location and control in distribution network

miloš stevanović1, aleksandar janjić2, sreten stojanović1, dragan tasić2
1university of niš, faculty of technology, leskovac, serbia
2university of niš, faculty of electronic engineering, niš, serbia

abstract.
the paper discusses the problem of reducing energy losses in electrical networks using a battery energy storage system. one of the main research interests is to define the optimal battery location and control for the given battery characteristics (battery size, maximum charge / discharge power, discharge depth, etc.), network configuration, network load, and daily load diagram. battery management involves determining the state of the battery over one period (whether charging or discharging) and the power at which it operates. optimization techniques were applied to the model described in the paper. the model consists of a fitness function and constraints. the fitness function is the dependence of the power losses in the network on the current battery power, and it is suggested that this dependence be fitted by a power function of order n. the constraints follow from the characteristics of the battery storage: in every time interval, the maximum power that the battery can receive or inject must be respected, and at any time the energy stored in the battery must stay within certain limits. the power losses in the network are represented as a function of the power injected into the nodes of the network. the optimization problem of determining the optimal battery management was successfully solved by applying a genetic algorithm (ga). finally, the optimal battery management algorithm is implemented on a test network. the results of the simulations are presented and discussed.

key words: energy losses, optimal location, battery storage, charge state (soc)

1. introduction

energy security and environmental concerns are becoming an increasingly frequent topic in the 21st century. currently, a large part of the world's energy comes from fossil sources, and humanity is slowly facing the problems of environmental pollution.
therefore, alternative clean energy sources, which have a tolerable impact on the environment, are of great importance. in order to improve the quality of the power system (increase the reliability, decrease the energy losses, improve the voltage profile of the network) and create better distribution flexibility, renewable energy resources (res) are necessary for the power system. photovoltaic devices, electric vehicles and battery storage systems are some examples of distributed energy resources.

received july 13, 2021; received in revised form october 26, 2021 corresponding author: aleksandar janjić, faculty of electronic engineering, niš, aleksandra medvedeva 14, 18000 niš, serbia e-mail: aleksandar.janjic@elfak.ni.ac.rs

a microgrid is a distribution system that includes various renewable energy sources. it can operate in two different modes: in island (autonomous) operation or connected to the power system. the presence of several interconnected microgrids in distribution networks improves the performance and reliability of the power system. the microgrid operator can reduce the operating costs of the system while increasing its reliability and environmental performance [1]. much of the literature deals with these topics [2 – 5]. to make the operation of the electricity network even more flexible and reliable, it is necessary to introduce battery energy storage systems (bess).

electricity losses are one of the main issues for distribution system operators, as the planning, management, and maintenance of the distribution network are based on the associated costs. the cost of supply to end users connected to the distribution system is therefore also affected by the cost of electricity losses.
therefore, the reduction of electricity losses is one of the main goals of electricity distributors, which should ensure efficient and reliable distribution of electricity at an affordable price [3]. distributed generation from renewable energy sources has been growing in recent years, introducing the need for a possible reconfiguration of the current distribution network (dn), which can reduce the load of individual lines and lower energy losses, with correspondingly increased reliability and efficiency [6]. minimizing power losses in the distribution network is one of the main issues of the distribution system, because of the need to reduce distribution network management costs. recent developments in battery storage technologies have introduced new capabilities for power distribution systems within the radial distribution network. batteries can be properly integrated into the grid and managed to reduce electricity losses, thereby increasing production from renewable energy sources and contributing to voltage regulation through the reactive power produced by the battery inverter. reducing electricity losses using bess in the distribution network has become increasingly attractive in recent years due to a significant increase in technological performance and an expected reduction in bess installation costs [7].

energy storage devices in power systems can generally be classified into two types: long-term devices with a relatively long response time and short-term storage devices with a fast response. each bess type can serve a certain set of applications, depending on the range of its technical parameters. the first category is suitable for energy management applications such as peak shaving, loss reduction, island operation, renewable energy, time shift, and long-term voltage control. one of the most important applications in this category is peak shaving of the load diagram, which is studied in detail in [8].
this is especially true for some rapidly evolving technologies, for example lithium-ion batteries, whose capital costs are expected to fall by about 30-50% in the coming years [9, 10]. for these reasons, much of the research is currently focused on different modelling possibilities and simulations of optimal management of bess units connected to the distribution network. the genetic algorithm has proven to be the most favorable in practice for solving this type of problem [11, 12, 13]. however, these methodologies require intensive computation. the main goal of this paper is to define a novel optimization model that reduces the computational time and simultaneously optimizes the location and the scheduling of the battery.

in this paper, the problem of reducing electricity losses using a battery is considered. the aim is to determine the optimal location of batteries and their management to reduce losses in the distribution network. first, the location of the battery is determined based on the sensitivity coefficient of the network nodes, and then the optimization of battery charging / discharging is performed for the selected location. optimal battery management involves determining the charge / discharge power of the battery so that the daily power losses in the network are minimal. for this purpose, a fitness function has been defined, which includes the network and battery models, in the form of the dependence of the network power losses on the current battery power. to obtain a simpler solution, it has been proposed that this dependence be fitted with a power function of order n. to the knowledge of the authors, this way of defining the fitness function has not been investigated in the literature so far.
the optimization problem, in addition to the fitness function, also contains constraints in the form of inequalities and equations that result from the characteristics of the battery storage (battery size, maximum charging / discharging power, discharge depth, etc.). the optimization problem was successfully solved by applying a genetic algorithm (ga). finally, the defined optimal battery management algorithm was applied to one test network. the simulation results showed that the proposed method can be efficiently used to reduce electricity losses using a battery.

the paper is organized as follows: the description of the problem and the modelling of the steady-state operation of the dn and bess are presented in section 2; section 3 describes the methodology of optimal bess location and the equations modelling the optimization goal and constraints; finally, in section 4, radial test grids are presented to validate the proposed methodology for bess siting and to determine the optimal battery charge/discharge power for the selected location.

2. problem description and formulation

this paper uses a network model and a battery storage model, which are described below. the functional dependency of the power losses in the network on the current battery power is defined, and the fitness function for the optimization procedure is formed from it.

2.1. network and battery modelling

2.1.1. network modelling

the nodes and branches of the network are represented by the incidence matrix a and the matrix p, defined as in reference [14]. $S_k$ is the complex load power of the k-th node. power generated in a node is taken as negative (-) and consumed power as positive (+). the incidence matrix a has dimension n x n (number of nodes x number of branches), not counting the root node. matrix a is square because of the radial topology of the distribution network.
the first and second nodes of a branch are identified by -1 and +1, respectively. matrix p, of dimension n x n (number of nodes x number of branches), defines whether the k-th branch is located on the path between the j-th node and the root node. it follows that the matrix equality $A^{T} P = I$ holds. this notation is identical to that presented in reference [14].

branch currents and node voltages are calculated iteratively according to the procedure proposed in reference [14]. the loads connected to the distribution network are characterized by constant currents at each iteration. under this assumption, the load current of the k-th node is independent of the voltage in that node. the current of the k-th node in the j-th iteration is calculated as:

$$I_{nk}^{(j)} = \frac{S_k^{*}}{\sqrt{3}\,U_k^{(j-1)}} \tag{1}$$

where $U_k^{(j-1)}$ is the line voltage calculated in the previous, (j−1)-th, iteration and $I_{nk}^{(j)}$ is the injection current in the k-th node. the node voltages are used for the calculation during the iterative process, and in the first iteration they are set equal to the root node voltage $E_0$. the equation according to kirchhoff's law for each node can be written for each iteration using the incidence matrix as follows:

$$[A]\,[I_b^{(j)}] = [I_n^{(j)}] \tag{2}$$

where $I_b^{(j)}$ are the branch currents and $I_n^{(j)}$ the injection currents in the network nodes in the j-th iteration. once the branch currents are calculated using the preceding formula, the voltage drop $\Delta U_k$ at each node in the network can be written as:

$$\Delta U_k^{(j)} = Z_k\, I_{bk}^{(j)} \tag{3}$$

where $Z_k$ is the impedance of the k-th branch and $I_{bk}^{(j)}$ is the current in the k-th branch in the j-th iteration. therefore, in the j-th iteration, the voltages at the nodes of the network are

$$[U_k^{(j)}] = [E_0] - [P]\,[\Delta U_k^{(j)}] \tag{4}$$

where $E_0$ is the known root node voltage and $\Delta U$ is the voltage drop vector for the nodes of the network.
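the sweep described by equations (1)-(4) can be sketched in a few lines of plain python. this is only an illustration under stated assumptions, not the procedure of reference [14]: the function name `radial_load_flow`, the `parent` list encoding of the radial topology, the three-node feeder data and the stopping tolerance are all hypothetical, and the sketch works in per-unit quantities with the usual conj(s/u) current, so the factor of equation (1) is absorbed in the base values. because the path matrix p is the transposed inverse of the incidence matrix for a radial network, branch currents are obtained directly from p.

```python
def radial_load_flow(parent, z, s, e0=1 + 0j, tol=1e-8, max_iter=100):
    """Iterative load flow for a radial feeder in per-unit (assumed).

    parent[k] is the upstream node of node k (-1 means the root node),
    z[k] is the impedance of the branch feeding node k,
    s[k] is the complex load at node k (consumption positive).
    """
    n = len(parent)
    # Path matrix: P[j][k] = 1 if branch k lies between node j and the root.
    P = [[0] * n for _ in range(n)]
    for j in range(n):
        m = j
        while m >= 0:
            P[j][m] = 1
            m = parent[m]
    u = [e0] * n                                    # flat start at root voltage
    for _ in range(max_iter):
        # Node currents from the previous-iteration voltages, cf. (1).
        i_n = [(s[k] / u[k]).conjugate() for k in range(n)]
        # Branch currents: solving (2) reduces to I_b = P^T I_n for a radial net.
        i_b = [sum(P[j][k] * i_n[j] for j in range(n)) for k in range(n)]
        du = [z[k] * i_b[k] for k in range(n)]      # voltage drops, (3)
        u_new = [e0 - sum(P[j][k] * du[k] for k in range(n)) for j in range(n)]  # (4)
        change = max(abs(u_new[k] - u[k]) / abs(u[k]) for k in range(n))
        u = u_new
        if change < tol:                            # relative-change stopping test
            break
    losses = sum(z[k].real * abs(i_b[k]) ** 2 for k in range(n))
    return u, i_b, losses


# Illustrative three-node feeder: root -> node 0 -> node 1 -> node 2.
parent = [-1, 0, 1]
z = [0.01 + 0.02j] * 3
s = [0.10 + 0.05j, 0.08 + 0.03j, 0.05 + 0.02j]
u, i_b, losses = radial_load_flow(parent, z, s)
print([round(abs(v), 4) for v in u], round(losses, 6))
```

the returned `losses` value is the active power dissipated in the branch resistances, which is the quantity the later optimization tries to reduce.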
finally, the iterative process is terminated when the following condition is satisfied:

$$\frac{\left| U_k^{(j)} - U_k^{(j-1)} \right|}{\left| U_k^{(j-1)} \right|} < \varepsilon \tag{5}$$

in other words, the iterative process ends when the relative change in voltage at all nodes of the network between two successive iterations is less than the given tolerance ε.

2.1.2. battery model

the battery model is based on the battery model from reference [15], in which the battery is treated as a passive component. therefore, the power injected into the battery (battery charge) has a positive sign, while the power injected by the battery into the network (battery discharge) has a negative sign. the energy accumulated in the battery (state of charge, soc) at $t_{i+1}$ is defined by the following linear equation [15]:

$$soc(t_{i+1}) = \eta_{sd}\, soc(t_i) + \Delta t \left( \eta_c\, p_{st,c}(t_i) - \frac{p_{st,d}(t_i)}{\eta_d} \right) \tag{6}$$

where $\eta_{sd}$ is the battery self-discharge efficiency, $\eta_c$ is the charging efficiency, $\eta_d$ is the discharge efficiency, and $p_{st,c}$ and $p_{st,d}$ are the battery powers during charging and discharging, respectively. to limit the charging and discharging power of the battery, these variables should satisfy the following inequalities:

$$0 \le p_{st,c} \le \lambda_c \frac{soc_{max}}{t_c}, \qquad 0 \le p_{st,d} \le \lambda_d \frac{soc_{max}}{t_d}, \qquad 0 \le \lambda_c + \lambda_d \le 1 \tag{7}$$

where $soc_{max}$ is the capacity of the battery, $t_c$ and $t_d$ are the minimum charging and discharging times, respectively, and $\lambda_c$ and $\lambda_d$ are binary variables that determine whether the battery is charged or discharged. for $\lambda_c = 1$ and $\lambda_d = 0$ the battery is charging, for $\lambda_c = 0$ and $\lambda_d = 1$ the battery is discharging, while for $\lambda_c = 0$ and $\lambda_d = 0$ the battery is offline.

2.2. problem formulation

it is necessary to determine the optimal location and the schedule of charging and discharging of the battery throughout the day so that the daily energy losses in the network are kept to a minimum.
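the battery model that constrains this schedule, equations (6) and (7), can be sketched as two small helper functions. the names `soc_step` and `power_limits_ok`, the argument names for the efficiencies and binary charge/discharge flags, and all numeric values are assumptions made for illustration; they are not taken from reference [15].

```python
def soc_step(soc, p_c, p_d, dt, eta_sd=1.0, eta_c=0.95, eta_d=0.95):
    """State of charge after one interval of length dt, following equation (6).

    p_c and p_d are the (non-negative) charging and discharging powers.
    """
    return eta_sd * soc + dt * (eta_c * p_c - p_d / eta_d)


def power_limits_ok(p_c, p_d, lam_c, lam_d, soc_max, t_c, t_d):
    """Check the power limits of equation (7) for binary flags lam_c, lam_d."""
    flags_ok = lam_c in (0, 1) and lam_d in (0, 1) and lam_c + lam_d <= 1
    charge_ok = 0 <= p_c <= lam_c * soc_max / t_c
    discharge_ok = 0 <= p_d <= lam_d * soc_max / t_d
    return flags_ok and charge_ok and discharge_ok


# Illustrative numbers: a 4 MWh battery with 4 h minimum charge/discharge time,
# which limits the power to 1 MW in either direction.
soc_after_charge = soc_step(2.0, p_c=1.0, p_d=0.0, dt=1.0, eta_c=0.9, eta_d=0.9)
soc_after_discharge = soc_step(2.0, p_c=0.0, p_d=0.9, dt=1.0, eta_c=0.9, eta_d=0.9)
print(soc_after_charge, soc_after_discharge)
print(power_limits_ok(1.0, 0.0, 1, 0, soc_max=4.0, t_c=4.0, t_d=4.0))
```

in an optimization loop, `soc_step` would be applied interval by interval and any schedule that violates `power_limits_ok` or drives the state of charge outside its allowed range would be rejected or penalized.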
battery siting is determined based on the sensitivity analysis of the power losses given in reference [16]:

$$ \alpha_{loss,k} = \frac{W_{loss,k} - W_{loss}^{*}}{W_{loss}^{*}} \cdot 100 = \frac{\Delta W_{loss,k}}{W_{loss}^{*}} \cdot 100 \tag{8} $$

where $W_{loss,k}$ corresponds to the daily power losses for the given grid configuration and $W_{loss}^{*}$ are the daily power losses for the reference network configuration.

energy losses during the day can be calculated as the sum of energy losses in individual time intervals. the time interval is chosen so that the power of losses in the network can be considered constant during its duration. therefore, the total energy losses during one day are calculated as:

$$ W = W_1 + W_2 + \dots + W_n = \sum_{i=1}^{n} W_i \tag{9} $$

where $n$ is the number of time intervals and $W_i$ is the energy loss in the i-th time interval. energy losses depend on the power losses in the network and the duration of these losses, while power losses depend on the square of the current flowing through the network elements (line, transformer, generator, etc.). however, the power flow through the lines of the network is largely determined by the injection of power into the nodes of that network. hence the idea that the power of losses in the network is represented by some function that depends on the power of injection in individual nodes of the network. there can be one node with batteries, and there can be more nodes. depending on whether one or more nodes are concerned, an appropriate polynomial function is chosen that gives the relationship between the energy losses and the injection power in the battery nodes. if there is a single node with a battery, the form of the function is as follows:

$$ W_i = \sum_{j=1}^{n} a_{(n-j,i)}\, p_{(bat,i)}^{\,n-j} \tag{10} $$

where $p_{(bat,i)}$ is the battery power in the i-th time interval, and $n$ is the order of the polynomial.
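equations (8) to (10) amount to a few lines of arithmetic, sketched below. the coefficient lists are stored with the highest power first and evaluated with horner's rule. the function names and all numbers are illustrative assumptions, not data from the study.

```python
def loss_sensitivity(w_loss_k, w_loss_ref):
    """Loss sensitivity coefficient in percent, as in eq. (8)."""
    return 100.0 * (w_loss_k - w_loss_ref) / w_loss_ref


def interval_loss(coeffs, p_bat):
    """Polynomial loss model of one interval, eq. (10); highest power first."""
    loss = 0.0
    for a in coeffs:
        loss = loss * p_bat + a  # Horner's rule
    return loss


def daily_loss(coeffs_per_interval, p_schedule):
    """Sum of the interval losses over the day, eq. (9)."""
    return sum(interval_loss(c, p) for c, p in zip(coeffs_per_interval, p_schedule))


# Made-up quadratic models for three intervals and a made-up battery schedule (MW).
models = [[0.02, -0.01, 0.10], [0.03, 0.02, 0.15], [0.02, 0.01, 0.12]]
schedule = [0.5, -1.0, 0.5]
print(round(daily_loss(models, schedule), 4))
print(round(loss_sensitivity(1.257, 1.0), 1))
```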
total energy losses during one day are equal to the sum of losses in individual time intervals, based on equalities (9) and (10):

$$ W = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{(n-j,i)}\, p_{(bat,i)}^{\,n-j} \tag{11} $$

in the case of two nodes with batteries, the form of the polynomial function for the j-th time interval is as follows:

$$ W_j = \sum_{i=1}^{n} a_{(i,j)}\, p_{(bat1,j)}^{\,n-i}\, p_{(bat2,j)}^{\,i} \tag{12} $$

based on equalities (9) and (12), the total energy losses are calculated:

$$ W = \sum_{j=1}^{n} \sum_{i=1}^{n} a_{(i,j)}\, p_{(bat1,j)}^{\,n-i}\, p_{(bat2,j)}^{\,i} \tag{13} $$

3.2.1. one battery connected to the network
in the next example, only one battery is connected to the network. the day is divided into n equal time intervals. the charging and discharging intervals are equal, $t_c = t_d = t$. for each time interval, the dependence of the power loss in the network on the current battery power at a particular node is determined for the range of battery power $[-p_{max}, p_{max}]$. an example of this dependence for n = 24 intervals is given in figure 1. in this example, $p_{max}$ = 1 MW.

fig. 1 dependency of energy losses as a function of battery power

the next step is to fit each of these dependencies analytically with a polynomial of the n-th order, using equations (9) and (10). energy losses for each time interval are thus represented as a function of the power of injection into the nodes of the network. the fitness function is the daily energy loss as a function of battery power. in this example, a quadratic function is chosen for the fitness function because the coefficients $a_{3i}$, $a_{4i}$, etc. are extremely small, practically negligible. it is necessary to determine the vector $p_{bat}$ with appropriate constraints so that energy losses are kept to a minimum.
the fitness function is given as follows:

$$ W = \sum_{i=1}^{n} \left( a_i\, p_{(bat,i)}^{2} + b_i\, p_{(bat,i)} + c_i \right) \tag{14} $$

where $a_i$, $b_i$, $c_i$ are the coefficients of the fitting function and $p_{(bat,i)}$ is the battery power in the i-th hour. the coefficients are selected based on the dependence of network losses on the battery injection power for each node.

3.2.2. two batteries connected to the network
a similar analysis is done for two batteries connected to the network at nodes 11 and 10. the day is divided into n equal time intervals. the charging and discharging intervals are equal, $t_c = t_d = t$. the dependence of the network energy loss on the battery power at the two selected nodes is determined for each time interval. based on a loss sensitivity analysis, a functional relation between the energy losses and the battery power is created for the range of battery power $[-p_{max}, p_{max}]$. an example of the dependence for n = 24 intervals is given in figure 2. in this example, $p_{max}$ = 1 MW. in the case of two connected batteries, equations (12)–(14) are used and the total energy losses during the day are determined. as the individual coefficients of the higher-degree terms of the polynomial are small, they can be neglected, which reduces the order of the polynomial. based on equality (13), the following is obtained:

$$ W = \sum_{i=1}^{n} \left[ a_{(00,i)} + a_{(10,i)}\, p_{(11,i)} + a_{(01,i)}\, p_{(10,i)} + a_{(20,i)}\, p_{(11,i)}^{2} + a_{(11,i)}\, p_{(11,i)}\, p_{(10,i)} + a_{(02,i)}\, p_{(10,i)}^{2} + a_{(21,i)}\, p_{(11,i)}^{2}\, p_{(10,i)} + a_{(12,i)}\, p_{(11,i)}\, p_{(10,i)}^{2} + a_{(03,i)}\, p_{(10,i)}^{3} \right] \tag{15} $$

where $a_{(00,i)}, a_{(10,i)}, \dots, a_{(03,i)}$ are the coefficients of the corresponding variables in the i-th interval, and $p_{(11,i)}$ and $p_{(10,i)}$ are the powers of the batteries at nodes 11 and 10.

fig. 2 dependency of energy losses as a function of the battery powers at nodes 11 and 10

2.2.2. constraints
the constraints are the maximum charging/discharging power and the minimum and maximum battery energy.
charging/discharging power is constrained by the minimum and maximum active power (16):

$$ p_{bat,min} \le p_{bat,i} \le p_{bat,max} \tag{16} $$

where $p_{bat,min}$ is the minimum and $p_{bat,max}$ the maximum battery power. a similar constraint can be applied to the energy of the battery, which is limited by its minimum and maximum value. therefore,

$$ soc_{min} \le soc_0 + \sum_{i=1}^{j} p_{bat,i}\, t_i \le soc_{max}, \qquad j = 1, \dots, n \tag{17} $$

where $soc_0$ is the initial battery energy, so that the middle term is the battery energy at the end of the interval $t_j$, $soc_{min}$ is the minimum battery energy and $soc_{max}$ is the maximum battery energy. to complete the model of this problem, another condition is introduced as a constraint: at the end of the time cycle (the time cycle is one day), the battery returns to the original state in which it was at the beginning of the time cycle. this limitation is justified by the fact that the battery is globally a passive energy element. although at some point it can provide energy to the grid and at another consume energy, in the overall energy balance it neither produces nor consumes energy. based on the above, the following can be written:

$$ \sum_{i=1}^{n} p_{(bat,i)}\, t_i = 0 \tag{18} $$

the last equality expresses that the battery returns to its initial state at the end of the day.
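the three constraints can be checked for a candidate schedule as sketched below. the sketch assumes equal time intervals and a lossless battery, and it reads constraint (17) as a bound on the initial energy plus the cumulative exchanged energy; the function name `schedule_is_feasible` and the numbers are hypothetical.

```python
def schedule_is_feasible(p_bat, dt, soc0, soc_min, soc_max, p_min, p_max, tol=1e-9):
    """Check a battery schedule against constraints (16), (17) and (18).

    p_bat holds one power value per interval (positive = charging).
    """
    # (16): every power value stays between the minimum and maximum power.
    if any(p < p_min - tol or p > p_max + tol for p in p_bat):
        return False
    # (17): the stored energy stays within its limits after every interval.
    soc = soc0
    for p in p_bat:
        soc += p * dt
        if soc < soc_min - tol or soc > soc_max + tol:
            return False
    # (18): the battery returns to its initial state at the end of the cycle.
    return abs(sum(p_bat) * dt) <= tol


# Illustrative four-interval day: charge twice, then discharge twice (MW, h, MWh).
print(schedule_is_feasible([1.0, 1.0, -1.0, -1.0], dt=1.0, soc0=1.0,
                           soc_min=0.0, soc_max=3.0, p_min=-1.0, p_max=1.0))
```

a check of this kind is convenient as a penalty or repair step inside a population-based optimizer, where many candidate schedules violate at least one limit.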
the constraint on the battery energy (17) can be represented in a matrix form for equal time intervals of duration $t$, using equations (19) and (20):

$$ t \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 \\ 1 & 1 & 0 & \cdots & 0 \\ 1 & 1 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & 1 & \cdots & 1 \end{bmatrix} \begin{bmatrix} p_{bat,1} \\ p_{bat,2} \\ p_{bat,3} \\ \vdots \\ p_{bat,n} \end{bmatrix} \le soc_{max} - soc_0 \tag{19} $$

$$ t \begin{bmatrix} -1 & 0 & 0 & \cdots & 0 \\ -1 & -1 & 0 & \cdots & 0 \\ -1 & -1 & -1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ -1 & -1 & -1 & \cdots & -1 \end{bmatrix} \begin{bmatrix} p_{bat,1} \\ p_{bat,2} \\ p_{bat,3} \\ \vdots \\ p_{bat,n} \end{bmatrix} \le soc_0 - soc_{min} \tag{20} $$

in developed form, the equality constraint (18) resembles the last row of (19):

$$ p_{bat,1}\,t + p_{bat,2}\,t + p_{bat,3}\,t + \dots + p_{bat,24}\,t = 0 \tag{21} $$

the fitness function is the daily energy loss as a function of battery power, equation (11). it is necessary to determine the vector $p_{bat}$, which contains the unknown powers in each time interval, with appropriate constraints so that energy losses are kept to a minimum. the degree of the previous function is determined by the coefficient values. coefficients that are smaller than a certain pre-set value are ignored, thus reducing the order of the function. the optimization problem can be formulated as in (22):

$$ \text{fitness function: } \min\, W(p_{bat,i}) \qquad \text{constraints: (1)–(7), (16), (19), (20)} \tag{22} $$

the optimization process is carried out in two steps. in the first step, the location of the battery is selected according to the maximal sensitivity coefficient (8). then, using (22), the optimal values of the battery powers $p_{bat,i}$ are determined. the genetic algorithm is used for optimization.

3. results
3.1. distribution network data
figure 3 shows the test network supplied from the 110 kV network and the 110/20 kV substation. it is a radial distribution system, because breakers s1, s2 and s3 are open.
the nominal voltage of the network is 20 kV. the transformer is modeled as a series impedance. the data for all branches of the grid are given in table 1, while the data of all network loads are given in table 2 [15].

fig. 3 cigre european mv distribution network benchmark

table 1 network lines data
from node   to node   length [km]   r [Ω]    x [Ω]    installation
1           2         2.82          0.7529   0.5732   underground
2           3         4.42          1.1801   0.8984   underground
3           4         0.61          0.1629   0.1240   underground
4           5         0.56          0.1495   0.1138   underground
5           6         1.54          0.4112   0.3130   underground
6           7         0.24          0.0641   0.0488   underground
7           8         1.67          0.4459   0.3395   underground
8           9         0.32          0.0854   0.0650   underground
9           10        0.77          0.2056   0.1565   underground
10          11        0.33          0.0881   0.0671   underground
11          4         0.49          0.1308   0.0996   underground
3           8         1.30          0.3471   0.2642   underground
12          13        4.89          2.2240   1.7914   overhead
13          14        2.99          1.3599   1.0953   overhead
14          8         2.00          0.9096   0.7327   overhead

table 2 network load data
busbar   real power [kW]   reactive power [kvar]   power factor (ind.)
1        -                 -                       -
2        -                 -                       -
3        200               120                     0.86
4        400               250                     0.85
5        1500              930                     0.85
6        3000              2260                    0.80
7        800               500                     0.85
8        200               120                     0.86
9        1000              620                     0.85
10       500               310                     0.85
11       1000              620                     0.85
12       300               190                     0.84
13       200               120                     0.86
14       800               500                     0.85
15       500               310                     0.85
16       1000              620                     0.85
17       200               120                     0.86

the loads connected to this network are a mix of residential and industrial consumption. since the optimization process is performed over a period of one day, daily active power diagrams for residential and industrial consumption are used [6] and shown in figure 4. daily diagrams of reactive power for residential and industrial consumption of end users [6] are shown in figure 5. active and reactive power are given in relative units.

fig. 4 daily active power load diagrams for residential and industrial consumption
fig.
5 daily reactive power load diagrams for residential and industrial consumption

the previous method will be applied in the next two examples. in the first example, one battery in one node is used. that node is selected based on the loss sensitivity analysis given in section 2.2 of this paper. the sensitivity coefficient is calculated for the test grid by adding, at one busbar at a time, loads with flat profiles but different rated powers and power factors [17]. sensitivity coefficients are given in table 3.

table 3 calculated busbar sensitivity to power losses
busbar            11     10     7      9      8      6      5      4      3      2     14    13    1     12
α_loss,k [%]      25.7   25.6   25.5   24.9   24.4   23.9   23.1   22.6   21.9   8.8   8     5.7   0.7   0.6

the battery considered in this study is based on lithium-ion technology, which is one of the most promising technologies, with high energy density, high efficiency, and a relatively high number of charge/discharge cycles, even at higher depths of discharge (dod). in our example, the following battery parameters were used: η_sd = 1, η_c = 1, η_d = 1, soc = 90%. finally, there are no distributed generators in the test network in this study, so the peak load is mostly covered by batteries. in the second example, two batteries were used, connected to nodes 11 and 10 of the network. these nodes were also selected based on the sensitivity analysis.

3.2. results of simulation
3.2.1. results of simulation for one battery connected to the network
on the basis of equation (22), for the case with the battery connected in one node, the hourly battery powers for the optimum location are calculated. the optimum location is determined to be node 11, based on the loss sensitivity analysis. the charging and discharging battery power for every hour for the optimum location is given in figure 6.
a positive sign of the battery power means that the battery is charging, a negative sign means that the battery is discharging, and if the battery power is zero, the battery is offline. the optimization problem is solved with the genetic algorithm in matlab. the maximal generation number for the genetic algorithm is 100, with the population size set to the array length of 24. optimization is performed on an intel(r) xeon(r) cpu e5-26670 @ 2.90 GHz processor with 32 GB ram. the total time of optimization is 23 s. as stated before, the first step in the optimization process is the determination of the optimal location using the sensitivity coefficient. the optimization procedure is thus greatly facilitated. to check the validity of this approach and compare the difference in energy losses on a daily basis for all possible battery locations, an analysis was performed for each node in the network.

fig. 6 charging and discharging battery power per hour

figure 7 shows the energy losses for each node. minimal daily energy losses are obtained for the battery placed in node 10, and the highest in node 12. the difference between daily energy losses in nodes 10 and 12 is 0.15 MW. figure 6 shows the schedule of charging and discharging for the battery placed in node 10. during the night, the load of the network is low, and the battery then receives power from the grid. during peak load periods, the battery injects energy into the network, so less energy is received from the grid. for such a charge and discharge schedule at node 10, the daily energy losses are the lowest.

fig. 7 energy losses for different battery locations

in figure 8, the charging and discharging battery schedule for a battery placed in each node is shown.
fig.
8 daily load diagrams reactive power for residential and industrial consumption

the approximate solution obtained with the sensitivity coefficient (node 11) is practically equivalent to the accurate solution (node 10) because of the small difference between their sensitivity coefficients (α_loss,11 = 25.7; α_loss,10 = 25.6).

3.2.2. results of simulation for two batteries connected to the network
based on equation (15) for the case of batteries connected in two nodes and the optimization formulation (22), the battery powers per hour for the optimum locations are obtained. the optimum locations (nodes 11 and 10) are determined by the loss sensitivity analysis. the resulting charging and discharging battery power for every hour for both batteries is given in figure 9.

fig. 9 battery loads for the locations at nodes 11 and 10

4. conclusion
in this paper, the optimal battery management and the selection of the optimum battery location for a battery energy storage system in a radial distribution network are analyzed. the computational time for the optimization is greatly reduced for two reasons. firstly, the optimum location is found using the sensitivity coefficient. it is shown that this approximation is practically equivalent to the accurate solution. secondly, the energy losses in the network are represented as a function of the power of injection into the nodes of the network. the fitness function represents the dependence of the energy losses in the network on the current battery power, and it is suggested that this function should be fitted by a polynomial of the n-th order. the constraints correspond to the characteristics of the battery (battery size, maximum charge/discharge power, discharge depth, etc.).
at each time interval, whether the battery is being charged, discharged or offline, the limit on the maximum power that the battery can receive or inject must be respected, and the energy stored in the battery must stay within its limits at all times. finally, the optimal battery management algorithm is implemented on the test network. two separate cases were considered: with one and with two batteries in the network. the results of the simulations are presented and discussed. in this way, the optimal locations of the batteries are determined, as well as their charging and discharging schedule in individual time intervals, such that the total energy losses in a given period are minimized. the economic analysis of the number of batteries connected was out of the scope of this paper. the results obtained in the paper clearly show that the energy losses are decreased with the usage of two batteries instead of one, but the price of the battery is not compared with the energy price. this analysis will be the focus of our future research.

acknowledgement: this work has been supported by the ministry of education, science and technological development of the republic of serbia, program for financing scientific research work, ev. no. 451-03-68 / 2020-14 / 200133.

references
[1] l. luo, s. s. abdulkareem, a. rezvani, m. r. miveh, s. samad, n. aljojo, m. pazhoohes, "optimal scheduling of a renewable based microgrid considering photovoltaic system and battery energy storage under uncertainty", journal of energy storage, vol. 28, p. 101306, 2020.
[2] a. rufer, "on the efficiency of energy storage systems – the influence of the exchanged power and the penalty of the auxiliaries", facta universitatis, series: electronics and energetics, vol. 34, no. 2, pp. 173–185, june 2021.
[3] h. saboori, s. jadid, "optimal scheduling of mobile utility-scale battery energy storage systems in electric power distribution networks", journal of energy storage, vol. 31, p.
101615, 2020.
[4] h. karimi, s. jadid, a. makui, "stochastic energy scheduling of multi-microgrid systems considering independence performance index and energy storage systems", journal of energy storage, vol. 33, p. 102083, 2021.
[5] p. firouzmakan, r. hooshmand, m. bornapour, a. khodabakhshian, "a comprehensive stochastic energy management system of micro-chp units, renewable energy sources and storage systems in microgrids considering demand response programs", renewable and sustainable energy reviews, vol. 108, pp. 355–368, 2019.
[6] e. pons, m. repetto, "a topological reconfiguration procedure for maximising local consumption of renewable energy in italian active distribution networks", int. j. sustain. energy, vol. 36, no. 9, pp. 887–900, 2016.
[7] i. staffell, m. rustomji, "maximising the value of electricity storage", journal of energy storage, vol. 8, pp. 212–225, 2016.
[8] z. qing, y. nanhua, z. xiaoping, y. you, d. liu, "optimal siting and sizing of battery energy storage system in active distribution network", in proceedings of the 4th ieee/pes innovative smart grid technologies europe (isgt europe), 2013.
[9] n.s. pearre, l.g. swan, "technoeconomic feasibility of grid storage: mapping electrical services and energy storage technologies", appl. energy, vol. 137, pp. 501–510, 2015.
[10] g. carpinelli, g. celli, s. mocci, f. mottola, f. pilo, d. proto, "optimal integration of distributed energy storage devices in smart grids", trans. smart grid, vol. 4, no. 2, pp. 985–995, 2013.
[11] d. magnor, d.u. sauer, "optimization of pv battery systems using genetic algorithms", energy procedia, vol. 99, pp. 332–340, 2016.
[12] r. sakipour, h. abdi, "optimizing battery energy storage system data in the presence of wind power plants: a comparative study on evolutionary algorithm", sustainability, vol. 12, p. 10257, 2020.
[13] p. boonluk, a. siritaratiwat, p. fuangfoo, and s.
khunkitti, "optimal siting and sizing of battery energy storage systems for distribution network of distribution system operators", batteries, vol. 6, no. 4, 56, 2020.
[14] l. debarberis, p. lazzeroni, s. olivero, v.a. ricci, f. stirano, m. repetto, "technical and economical evaluation of a pv plant with energy storage", in proceedings of the iecon 2013-39th annual conference of the ieee industrial electronics, 2013.
[15] p. lazzeroni, m. repetto, "optimal planning of battery systems for power losses reduction in distribution grids", electric power systems research, vol. 167, pp. 94–112, 2019.
[16] s.b. karanki, d. xu, b. venkatesh, b.n. singh, "optimal location of battery energy storage systems in power distribution network for integrating renewable energy sources", in proceedings of the ieee energy conversion congress and exposition (ecce), 2013.
[17] e. bompard, e. carpaneto, g. chicco, r. napoli, "convergence of the backward/forward sweep method for the load flow analysis of radial distribution systems", electr. power energy syst., vol. 22, no. 7, pp. 521–530, 2000.

facta universitatis series: electronics and energetics vol. 35, no 3, september 2022, pp. 379-391
https://doi.org/10.2298/fuee2203379l
© 2022 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper

verification of calculation method for drone micro-doppler signature estimation
aleksandar lebl, mladen mileusnić, dragan mitić, jovan radivojević, vladimir matić
iritel a.d., belgrade, serbia

abstract. drone micro-doppler signatures obtained by fmcw radars are an excellent basis for malicious drone detection, identification and classification. a number of contributions deal with recorded spectrograms containing these micro-doppler signatures, but very few of them have analyzed the possibility of calculating the echo caused by drone moving parts.
in this paper, starting from the already existing mathematical apparatus, we present such spectrograms as a function of the changing characteristics of drone moving parts: number of rotors, number of blades, blade length and rotor rotation speed. this development is part of a wider project intended to prevent malicious drone usage.

key words: malicious drone detection, fmcw radar, spectrogram, drone micro-doppler signatures, calculation method

1. introduction
drones, or unmanned aerial vehicles (uavs), are more and more present in our everyday lives. they may be used in many friendly types of missions such as, for example, aerial photography, traffic supervision, disaster monitoring, precision agriculture, industrial inspection, goods delivery and so on. on the other side, drones are used for a number of different malicious purposes [1]. drones may carry explosive devices with the intention of causing numerous victims and damage to objects such as airports, stadiums, governmental buildings, residential areas, commercial and industrial facilities, power plants, etc. they may be used for smuggling activities over state borders or into and out of prisons, for causing fire in hardly accessible forest areas or for assassination attempts on important persons. there is a huge number of examples for each of these malicious activities, successfully or unsuccessfully realized. this is the reason why systems for drone detection, identification, localization and classification (dilc) have become very important today.

received november 25, 2021; revised december 28, 2021; accepted january 10, 2022
corresponding author: aleksandar lebl, iritel a.d., 11080 belgrade, batajnički put 23, serbia, e-mail: lebl@iritel.com

the most often applied sensors for drone dilc are radar, rf signal detector, optical camera, thermal camera and acoustic detector. the benefits and drawbacks of each sensor type are emphasized in detail in [2].
drone dilc is usually performed using several of the mentioned sensor types. the selected sensors are then combined in one solution [3]-[6]. among these sensor types radar, especially frequency modulated continuous wave (fmcw) radar, is probably the most often applied technique [7]-[17]. the main principles of fmcw radar realization are described in [18]-[20]. fmcw radar allows reliable classification of the detected drone based on the analysis of drone micro-doppler signatures. several typical drone construction and functional characteristics, such as the number of its rotors, the number of blades in each rotor, the rotor angular velocity and the length of the blades, may be determined only by fmcw radar on the basis of the drone micro-doppler signature, even in bad weather conditions.

among contributions in the domain of fmcw radar, micro-doppler signatures for various drone types are presented and analyzed in [8]-[11], [15], [17]. contribution [8] gives several drone micro-doppler signature graphs in various flying mode phases (takeoff, hovering, flying phase). in this aspect [8] is more complete than our paper, but it lacks an explanation that connects the graphs with the derived formulas for micro-doppler signature calculation. the paper [9] presents a number of micro-doppler signature graphs, but with the addition of signals used for the communication between the drone and its operator, signals for drone video communication and so on. in [9] it is not possible to distinguish the spectrum behaviour that is a consequence of drone flying from the components of other frequency spectrum sources. contributions [10], [15] are interesting because they pave the way for the comparison of drone and bird micro-doppler signatures, since drones and birds are often hard to distinguish due to their similar dimensions.
the paper [21] contains a very detailed theoretical and practical analysis of drone micro-doppler signatures, but on the basis of experiments performed for drones at a distance of only several meters from the radar. drone micro-doppler signature graphs are often analyzed applying artificial intelligence algorithms, as for example in [22].

elements which have influence on the characteristics of the drone micro-doppler signature are briefly emphasized in section 2. the calculation method for drone micro-doppler signature determination is described in detail in section 3. the calculation method is illustrated in section 4 by a number of examples in which the drone physical characteristics and its position relative to the fmcw radar are changed. the concluding comments are given in section 5.

2. drone parts causing micro-doppler effect
all drone moving parts may cause a micro-doppler effect detectable by fmcw radar. this is very important, because even a drone in the hovering state will be detected by the radar sensor. the drone micro-movable parts are its rotors. each drone has a certain number of rotors, as presented in fig. 1. there are $N_r$ = 4 rotors in the example from fig. 1. the drone micro-doppler signature depends on this number of rotors. the second important factor which has influence on the drone micro-doppler signature is the number of blades ($N$) in each rotor. the blades 1 and 2 are designated in fig. 1. the two remaining blade characteristics which determine the micro-doppler signature are the blade length ($L$) and the blade rotation speed ($\Omega$). the drone micro-doppler echo also depends on the elevation angle ($\beta$) of the drone (i.e. of the drone rotors) towards the radar level. this angle is determined by the drone height ($h$) and its distance from the radar ($R_0$).

fig. 1 elements which have influence on the drone micro-doppler signature
3.
calculation method
the method for drone micro-doppler signature calculation may be explained referring to fig. 1. the main characteristic of fmcw radar is that it generates a signal whose frequency varies as a function of time. this frequency change is usually linear (sweep signal) and it is essential for the detection principle of fmcw radar. the generated signal may be expressed by the equation [23]

$$ s(t) = \cos\left( 2\pi\,(f_c + B\,t)\,t \right) \tag{1} $$

where $f_c$ is the starting frequency of the fmcw radar sweep signal and $B$ is the slope of the generated sweep signal. the generated signal is periodically repeated. the returned echo signal from the rotor blades may be expressed by the equation from [7]:

$$ s_{\Sigma}(t) = \sum_{k=0}^{N-1} s_{Lk}(t) = L\,\exp\left\{ -j\,\frac{4\pi}{\lambda}\,(R_0 + z_0 \sin\beta) \right\} \sum_{k=0}^{N-1} \mathrm{sinc}\left(\Phi_k(t)\right)\,\exp\left(-j\,\Phi_k(t)\right) \tag{2} $$

where

$$ \Phi_k(t) = \frac{4\pi}{\lambda}\,\frac{L}{2}\,\cos\beta\,\cos\left( \Omega t + \varphi_0 + \frac{2\pi k}{N} \right), \qquad k = 0, 1, 2, \dots, N-1. \tag{3} $$

in these two equations:
▪ $L$ is the length of each blade;
▪ $N$ is the number of blades in each rotor;
▪ $R_0$ is the distance between the radar and the drone rotor (approximately the same as between the radar and the drone);
▪ $z_0$ is the drone height;
▪ $\beta$ is the drone elevation angle in relation to the radar;
▪ $\Omega$ is the rotor angular rotation speed;
▪ $\varphi_0$ is the initial rotation angle;
▪ $\lambda$ is the fmcw signal wavelength.

the magnitude of the rotor echo signal is

$$ |s_{\Sigma}(t)| = \left| L\,\exp\left\{ -j\,\frac{4\pi}{\lambda}\,(R_0 + z_0 \sin\beta) \right\} \sum_{k=0}^{N-1} \mathrm{sinc}\left(\Phi_k(t)\right)\,\exp\left(-j\,\Phi_k(t)\right) \right|. \tag{4} $$

the echo signal of all drone rotors is calculated according to the expression from [8]:

$$ s_{\Sigma}(t) = \sum_{i=1}^{N_r} \sum_{k=0}^{N-1} s_{Lk,i}(t) = \sum_{i=1}^{N_r} L\,\exp\left\{ -j\,\frac{4\pi}{\lambda}\,(R_{0i} + z_{0i} \sin\beta_i) \right\} \sum_{k=0}^{N-1} \mathrm{sinc}\left(\Phi_{ik}(t)\right)\,\exp\left(-j\,\Phi_{ik}(t)\right) \tag{5} $$

where $N_r$ is the number of drone rotors and

$$ \Phi_{ik}(t) = \frac{4\pi}{\lambda}\,\frac{L}{2}\,\cos\beta_i\,\cos\left( \Omega_i t + \varphi_{0i} + \frac{2\pi k}{N} \right), \qquad k = 0, 1, 2, \dots, N-1. \tag{6} $$

as for the case of only one rotor, the magnitude of the whole drone echo signal is, similarly to equation (4):

$$ |s_{\Sigma}(t)| = \left| \sum_{i=1}^{N_r} L\,\exp\left\{ -j\,\frac{4\pi}{\lambda}\,(R_{0i} + z_{0i} \sin\beta_i) \right\} \sum_{k=0}^{N-1} \mathrm{sinc}\left(\Phi_{ik}(t)\right)\,\exp\left(-j\,\Phi_{ik}(t)\right) \right|. \tag{7} $$

the usual way to analyze drone micro-doppler signatures is the application of drone spectrograms. spectrograms present the frequency spectrum of a signal as a function of time. they are obtained by calculating the short-time fourier transform (stft) [24]:

$$ \mathrm{STFT}(s(m,\omega)) = \sum_{n=-\infty}^{\infty} s_n\, w_{n-m}\, \exp(-j\,\omega\, t_n) \tag{8} $$

or, in the logarithmic scale,

$$ \mathrm{STFT\,(dB)} = 20\,\log_{10} \left| \mathrm{STFT}(s(m,\omega)) \right| \tag{9} $$

the meaning of the variables in (8) is:
▪ $s_n$ – sequence of time samples of the signal whose spectrogram is calculated;
▪ $w_n$ – sequence of time samples of the selected window function;
▪ $m$ – time index, i.e. time shift of the moment for which the spectrogram is calculated;
▪ $\omega$ – frequency of the signal.

the hanning window is most often selected for the calculation of the stft. the sequence of the discretized hanning window function is expressed as [25]:

$$ w_n = \frac{1}{2}\left( 1 - \cos\frac{2\pi n}{N} \right), \qquad n = 0, 1, 2, \dots, N \tag{10} $$

where $N$ here denotes the window length.

4. drone spectrograms
drone spectrograms obtained using equations (2) to (9) are presented in figures 2 to 10. they are derived by varying the mechanical and position characteristics of drones in order to analyze how the change of each parameter influences the spectrogram. the analysis is presented for a hovering drone, which means that the rotor blades are the only moving parts of the drone. the majority of spectrograms are presented for a single rotor, which corresponds to the class of drones in the shape of a helicopter. this class is less numerous than the class of quadcopters (which have four rotors).
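the calculation chain of equations (2) to (10) can be sketched as follows. this is an illustration under stated assumptions, not the authors' program: sinc is taken as sin(x)/x, the stft is evaluated directly for one frame and one frequency, and the names `rotor_echo`, `hann` and `stft_db` are hypothetical. the parameter values are those quoted for the one-rotor, one-blade case (blade length 0.24 m, 30 rotations/s, wavelength 0.0125 m, height 30 m, distance 100 m, sampling rate 20 kHz).

```python
import cmath
import math


def sinc(x):
    """Unnormalized sinc, sin(x)/x (assumed convention)."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def rotor_echo(t, blade_len, n_blades, omega, wavelength, r0, height, phi0=0.0):
    """Complex echo of one rotor at time t, following eqs. (2) and (3)."""
    sin_beta = height / r0
    cos_beta = math.sqrt(1.0 - sin_beta ** 2)
    k_wave = 4.0 * math.pi / wavelength
    carrier = blade_len * cmath.exp(-1j * k_wave * (r0 + height * sin_beta))
    total = 0j
    for k in range(n_blades):
        angle = omega * t + phi0 + 2.0 * math.pi * k / n_blades
        phase = k_wave * (blade_len / 2.0) * cos_beta * math.cos(angle)  # eq. (3)
        total += sinc(phase) * cmath.exp(-1j * phase)
    return carrier * total


def hann(n_points):
    """Window samples w_n for n = 0..n_points, eq. (10)."""
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * n / n_points))
            for n in range(n_points + 1)]


def stft_db(samples, window, start, freq_hz, fs):
    """One STFT value in dB for the frame beginning at sample `start`, eqs. (8)-(9)."""
    acc = 0j
    for i, w in enumerate(window):
        n = start + i
        acc += samples[n] * w * cmath.exp(-2j * math.pi * freq_hz * n / fs)
    return 20.0 * math.log10(max(abs(acc), 1e-12))  # floor avoids log10(0)


fs = 20_000.0                 # sampling rate, Hz
omega = 2.0 * math.pi * 30.0  # 30 rotations/s in rad/s
echo = [rotor_echo(n / fs, 0.24, 1, omega, 0.0125, 100.0, 30.0) for n in range(400)]
window = hann(64)
print(round(abs(echo[0]), 4), round(stft_db(echo, window, 0, 1000.0, fs), 1))
```

evaluating `stft_db` over a grid of frame starts and frequencies yields the spectrogram matrix; a production implementation would normally use an fft for this step instead of the direct sum.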
the starting spectrogram is presented in fig. 2. it corresponds to the case of only one rotor with one blade. the blade rotation speed is ωrot = 30 rotations/s and the blade length is l = 0.24 m. besides these mechanical characteristics, the drone position in relation to the radar is defined by its height h = 30 m and its distance from the radar r0 = 100 m, meaning that the drone elevation angle relative to the radar is β = arcsin(0.3). the radar functional characteristics are the operating frequency f = 24 GHz (operating wavelength 0.0125 m) and the sampling rate fstep = 20 kHz. the time interval for spectrogram presentation is 0.1 s and the waveform repeats 3 times during this interval, i.e. 30 times in 1 s. it means that the spectrogram appearance (time repetition rate) directly follows from the rotor rotation speed.

[spectrogram image: horizontal axis t (sec), 0.002 to 0.098; vertical axis f (hz), 0 to 192; colour scale in 10 dB bands from -140 dB to 10 dB]
fig. 2 drone spectrogram for one rotor with one blade, the blade length l = 0.24 m, blade rotation speed ωrot = 30 rotations/s, drone height h = 30 m, drone distance from radar r0 = 100 m, fmcw radar operating frequency f = 24 GHz, digital sampling rate fstep = 20 kHz.

for our analysis in this paper it is important to notice the frequency at which the signal echo falls below -40 dB, i.e. where the spectrogram colour changes from yellow to green. in the case of the spectrogram from fig. 2, this frequency is 144 Hz. fig. 3 presents the drone spectrogram for the same parameters as in fig. 2, with the only difference that the blade rotation speed is ωrot = 20 rotations/s. two modifications are noticeable as a consequence of the ωrot change: the signal repetition rate has dropped from 3 to 2 during 0.1 s, and the frequency at which the signal echo falls below -40 dB is 96 Hz. in both cases the parameter ratio is 2/3, the same as the ratio of the ωrot values. this change of the important frequency bandwidth is important for our future analysis. fig. 4 presents the drone spectrogram for the case that the blade length has been changed compared to fig. 2. in this case the frequency at which the signal echo falls below -40 dB is a bit more than 72 Hz. it means that the important frequency bandwidth has dropped in the ratio 1/2, the same as the ratio of the blade lengths.

[spectrogram image: horizontal axis t (sec), 0.002 to 0.098; vertical axis f (hz), 0 to 192; colour scale in 10 dB bands from -140 dB to 10 dB]
fig.
3 drone spectrogram for one rotor with one blade, the blade length l=0.24m, blade rotation speed ωrot=20rotations/s, drone height h=30m, drone distance from radar r0=100m, fmcw radar operating frequency f=24ghz, digital sampling rate fstep=20khz.

[spectrogram plot; axes: t (sec) and f (hz); colour scale in 10db steps from 0db to -140db]

fig. 4 drone spectrogram for one rotor with one blade, the blade length l=0.12m, blade rotation speed ωrot=30rotations/s, drone height h=30m, drone distance from radar r0=100m, fmcw radar operating frequency f=24ghz, digital sampling rate fstep=20khz.

fig. 5 presents the drone spectrogram for the case when its height has changed from h1=30m to h2=70m. it means that the cosine of the elevation angle has changed in the ratio

$$ q_{elev} = \frac{\sqrt{1 - \left(\frac{h_1}{r_0}\right)^2}}{\sqrt{1 - \left(\frac{h_2}{r_0}\right)^2}} = 1.335 $$ (11)

the bandwidth of important frequencies has changed in approximately the same ratio: from 144hz to about 109hz for the limit of -40db, i.e. this ratio is 1.32.

fig. 6 presents the spectrogram for the more probable case that the rotor has two blades. the other parameters for this spectrogram are the same as in fig. 2.
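the geometric quantities quoted for figures 2 and 5 can be re-derived numerically. the sketch below only repeats the arithmetic of the text (wavelength from the operating frequency, elevation angle from height and distance, and the ratio of equation (11)); the function names are illustrative and the value of the speed of light is the standard constant, not a figure from the paper.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s, standard constant (not from the paper)


def wavelength(f_hz):
    """Operating wavelength in metres for a carrier frequency in hertz."""
    return SPEED_OF_LIGHT / f_hz


def elevation_cosine(h, r0):
    """Cosine of the elevation angle for drone height h and distance r0 (sin = h/r0)."""
    return math.sqrt(1.0 - (h / r0) ** 2)


lam = wavelength(24e9)                        # about 0.0125 m, as quoted for f = 24 GHz
beta = math.asin(30.0 / 100.0)                # elevation angle for h = 30 m, r0 = 100 m
q_elev = elevation_cosine(30.0, 100.0) / elevation_cosine(70.0, 100.0)  # equation (11)
bandwidth_ratio = 144.0 / 109.0               # -40 dB limit frequencies read from figs. 2 and 5

print(f"wavelength = {lam:.4f} m, beta = {math.degrees(beta):.1f} deg")
print(f"q_elev = {q_elev:.3f}, bandwidth ratio = {bandwidth_ratio:.2f}")
```

the two ratios differ by about one percent, which is consistent with the statement that the bandwidth changes in approximately the same ratio as the cosine of the elevation angle.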
the important frequency bandwidth remains 144hz as in fig. 2, but the repetition rate is twice that in fig. 2, i.e. 6 in total during 0.1s, due to the increased number of blades. a highly similar spectrogram is obtained for the example of a rotor with one blade with two-fold rotation speed (ωrot=60rotations/s) and half the blade length (l=0.12m), so special attention has to be paid to distinguishing these two cases. the spectrogram for this second case is presented in fig. 7.

[spectrogram plot; axes: t (sec) and f (hz); colour scale in 10db steps from 0db to -140db]

fig. 5 drone spectrogram for one rotor with one blade, the blade length l=0.24m, blade rotation speed ωrot=30rotations/s, drone height h=70m, drone distance from radar r0=100m, fmcw radar operating frequency f=24ghz, digital sampling rate fstep=20khz.

[spectrogram plot; axes: t (sec) and f (hz); colour scale in 10db steps from 0db to -140db]

fig. 6 drone spectrogram for one rotor with two blades, the blade length l=0.24m, blade rotation speed ωrot=30rotations/s, drone height h=30m, drone distance from radar r0=100m, fmcw radar operating frequency f=24ghz, digital sampling rate fstep=20khz.

[spectrogram plot; axes: t (sec) and f (hz); colour scale in 10db steps from 0db to -140db]

fig.
7 drone spectrogram for one rotor with one blade, the blade length l=0.12m, blade rotation speed ωrot=60rotations/s, drone height h=30m, drone distance from radar r0=100m, fmcw radar operating frequency f=24ghz, digital sampling rate fstep=20khz.

the echo signal at the frequency 0hz may be used to distinguish whether the case under consideration is that of fig. 6 or that of fig. 7. echo signal amplitude oscillations are significantly greater when the rotation speed is lower, as is illustrated by the characteristics presented in fig. 8 and fig. 9. the peak-to-peak amplitude of the oscillations is as much as about 17db when there are two blades of 0.24m length and their rotation speed is 30 rotations/s (fig. 8), compared to only about 2.5db when there is one blade of 0.12m length moving at ωrot=60rotations/s (fig. 9). to the best of our knowledge, presenting the echo signal at the frequency 0hz as a means of distinguishing spectrograms more reliably in some cases is an original contribution of this paper.

[plot; axes: t [s] and a [db]]

fig. 8 echo at the frequency 0hz for the case of one rotor with two blades, the blade length l=0.24m, blade rotation speed ωrot=30rotations/s, drone height h=30m, drone distance from radar r0=100m, fmcw radar operating frequency f=24ghz, digital sampling rate fstep=20khz.

[plot; axes: t [s] and a [db]]

fig. 9 echo at the frequency 0hz for the case of one rotor with one blade, the blade length l=0.12m, blade rotation speed ωrot=60rotations/s, drone height h=30m, drone distance from radar r0=100m, fmcw radar operating frequency f=24ghz, digital sampling rate fstep=20khz.

the typical drone construction has 4 rotors, each with two blades. the spectrogram for such a construction is presented in fig. 10.
the consequence of a larger number of rotors and blades is that the echo signal periodicity is less obvious and that the limit value of important echo frequencies is practically constant as a function of time.

[spectrogram plot; axes: t (sec) and f (hz); colour scale in 10db steps from 0db to -140db]

fig. 10 drone spectrogram for four rotors with two blades, the blade length l=0.24m, blade rotation speed ωrot=30rotations/s, drone height h=30m, drone distance from radar r0=100m, fmcw radar operating frequency f=24ghz, digital sampling rate fstep=20khz.

the graphs in figures 2-7 may be compared to the selected graph from [8], which corresponds to the micro-doppler signature of rotors obtained by practical recording. the great similarity is obvious, with the exception that the graph in [8] is presented for positive and negative frequencies and the echo signal is symmetrical about the frequency 0hz. this graph from [8] is presented in fig. 11. it has periodicity: the number of periodical changes is 18 during 1s. according to this characteristic, the graph is most similar to the graph in fig. 3. the frequency where the signal echo rapidly decreases is about 100hz.
let us further suppose that the drone elevation angle, i.e. the cosine of the elevation angle, could be determined by some other technique. the final element to determine is then the length of the rotor blade/blades (l). under the assumption that the elevation angle is the same as in fig. 3, we obtain l=0.25m. but, if the drone is situated approximately vertically above the fmcw radar (i.e. the elevation angle tends to 90°) and the spectrogram is without significant changes, the corresponding l quickly grows.

the graph in fig. 10 is similar to the graph from [8] which corresponds to the drone in the hovering state. this graph from [8] is presented in fig. 12. there is no obvious periodicity in the recorded characteristic. such a graph is a clear sign that there is a higher number of rotors, probably with more than one blade.

the summary of conditions for spectrogram calculation in figures 2-10 is presented in table 1. the main specificities that describe the obtained spectrograms for each combination of conditions (i.e. each figure) are also presented in table 1.

fig. 11 practical rotor micro-doppler record [8]

fig.
12 practical drone micro-doppler record [8]

table 1 summary of figure characteristics
figure | conditions for spectrogram calculation | output spectrogram description
2 | 1 rotor, 1 blade, l=0.24m, ωrot=30/s, h=30m, r0=100m, f=24ghz, fstep=20khz | waveform repetition rate 30/s, attenuation 40db at 144hz
3 | figure 2 with ωrot=20/s | waveform repetition rate 20/s, attenuation 40db at 96hz
4 | figure 2 with l=0.12m | waveform repetition rate 30/s, attenuation 40db at 72hz
5 | figure 2 with h=70m (cosine of elevation angle lower by a factor of 1.335) | waveform repetition rate 30/s, attenuation 40db at 109hz
6 | figure 2 with two blades | waveform repetition rate 60/s, attenuation 40db at 144hz
7 | figure 2 with l=0.12m and ωrot=60/s | waveform repetition rate 60/s, attenuation 40db at 144hz
8 | figure 2 with two blades | amplitude oscillations peak-to-peak 17db at 0hz
9 | figure 2 with l=0.12m and ωrot=60/s | amplitude oscillations peak-to-peak 2.5db at 0hz
10 | figure 2 with four rotors and two blades | echo frequencies constant in time, signal periodicity less obvious

5. conclusions

a calculation method for drone micro-doppler signature determination is presented in this paper. the influence of various drone parameters (number of rotors, number of blades forming a rotor, blade rotation rate, blade length) on the spectrogram shape is analyzed. special attention is devoted to how some combinations of drone characteristics that give very similar spectrograms can be distinguished. all results are presented for an fmcw radar operating at the frequency of 24ghz. the method and the results from the paper may be used when measurement results are not available. the results of calculation are compared to similar examples from measurements, and the similarity of the results from these two groups is verified by several practical examples. the results from this paper are related only to the hovering drone.
our plan for future investigation is to try to develop a calculation method for drones in other flying modes (flying, take-off, etc.). micro-doppler spectrograms are applicable to drone detection, identification and classification by artificial intelligence algorithms. our other planned development direction is to use calculated spectrograms for training neural networks in the first phase of constructing such networks, when numerous practical records of various drone types are still not available.

references
[1] v. matić, v. kosjer, a. lebl, b. pavić and j. radivojević, "methods for drone detection and jamming", in proceedings of the 10th international conference on information society and technology (icist), kopaonik, 2020, pp. 16–21.
[2] n. eriksson, conceptual study of a future drone detection system countering a threat posed by a disruptive technology. master thesis in product development, chalmers university of technology, gothenburg, sweden, 2018.
[3] advanced protection systems, ctrl+sky drone detection and neutralization system, 2017, http://apsystems.tech/wp-content/uploads/2018/01/aps_broszura_web.pdf.
[4] droneshield, "product information", 2018.
[5] h. liu, f. qu, y. liu, w. zhao and y. chen, "a drone detection with aircraft classification based on a camera array", in proceedings of the 2018 iop conference series: materials science and engineering, vol. 322, p. 052005, 2018, pp. 1–7.
[6] x. shi, c. yang, c. liang, z. shi and j. chen, "anti-drone system with multiple surveillance technologies: architecture, implementation, and challenges", ieee commun. magaz., vol. 56, no. 4, pp. 68–74, 2018.
[7] v. c. chen, the micro-doppler effect in radar. artech house, second edition, 2019, isbn: 978-1-63081-546-2.
[8] c. zhao, g. luo, y. wang, c. chen and z. wu, "uav recognition based on micro-doppler dynamic attribute-guided augmentation algorithm", remote sensing, vol. 13, no. 6, p. 1205, pp. 1–17, 2021.
[9] t. šević, v. joksimović, i. pokrajac, r.
brusin, b. sazdić-jotić and d. obradović, "interception and detection of drones using rf-based dataset of drones", sci. tech. rev., vol. 70, no. 2, pp. 29–34, 2020.
[10] s. rahman and d. a. robertson, "radar micro-doppler signatures of drones and birds at k-band and w-band", sci. rep., vol. 8, pp. 1–11, 2018.
[11] y. cai, o. krasnov and a. yarovoy, "simulation of radar micro-doppler patterns for multi-propeller drones", in proceedings of the international radar conference (radar-2019), toulon, 2019, pp. 1–5.
[12] w. wang, j. du and j. gao, "multi-target detection method based on variable carrier frequency chirp sequence", sensors, vol. 18, p. 3386, pp. 1–12, 2018.
[13] a. coluccia, g. parisi and a. fascista, "detection and classification of multirotor drones in radar sensor networks: a review", sensors, vol. 20, p. 4172, pp. 1–22, 2020.
[14] m. daković, m. brajović, t. thayaparan and lj. stanković, "an algorithm for micro-doppler period estimation", in proceedings of the 20th telecommunications forum (telfor), belgrade, 2012, pp. 851–854.
[15] p. molchanov, radar target classification by micro-doppler contributions. thesis for the degree of doctor of science in technology, publication 1255, tampere university of technology, finland, october 2014, issn 1459-2045.
[16] e. hyun, y.-s. jin and j.-h. lee, "design and implementation of 24 ghz multichannel fmcw surveillance radar with a software-reconfigurable baseband", j. sensors, vol. 2017, p. 3148237, pp. 1–11, 2017.
[17] b. karlsson, modeling multicopter radar return. master's thesis in applied physics, chalmers university of technology, department of electrical engineering, gothenburg, sweden, 2017.
[18] v. m. milovanović, "on fundamental operating principles and range-doppler estimation in monolithic frequency-modulated continuous-wave radar sensors", fu elec.
energ., vol. 31, no. 4, pp. 547–570, 2018.
[19] c. iovescu and s. rao, the fundamental of millimeter wave radar sensors. texas instruments, 2020.
[20] j. zhu, low-cost, software defined fmcw radar for observations of drones. master thesis, university of oklahoma, graduate college, 2017.
[21] m. passafiume, n. rojhani, g. collodi and a. cidronali, "modeling small uav micro-doppler signature using millimeter-wave fmcw radar", electronics, vol. 10, no. 6, pp. 1–16, 2021.
[22] j. park, j.-s. park and s.-o. park, "small drone classification with light cnn and new micro-doppler signature extraction method based on a-spc technique", https://arxiv.org/abs/2009.14422, pp. 1–5, 2020.
[23] t. tang and c. wu, design of new frequency modulated continuous wave (fmcw) target tracking radar with digital beamforming tracking. defense research and development canada, scientific report drdc-rddc-2019-r175, 2019.
[24] m. ahmadizadeh, an introduction to short-time fourier transform (stft). sharif university of technology, department of civil engineering, 2014.
[25] h. a. gaberson, "a comprehensive windows tutorial", sound and vibration, instrumentation reference issue, pp. 14–23, 2006.

facta universitatis
series: electronics and energetics vol. 34, no 1, march 2021, pp. 105-114
https://doi.org/10.2298/fuee2101105r
© 2021 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper

design of novel multiplexer circuits in qca nanocomputing

hamid rashidi, abdalhossein rezai
acecr institute of higher education, isfahan branch, isfahan, iran

abstract. quantum-dot cellular automata (qca) technology is a promising nano-scale alternative to cmos technology. in digital circuits, a multiplexer is one of the most important components. in this study, an efficient single-layer 2 to 1 qca multiplexer circuit is proposed using majority and inverter gates.
in addition, efficient 4 to 1 and 8 to 1 qca multiplexer circuits are implemented using this 2 to 1 multiplexer circuit. the developed multiplexer circuits are implemented in the qcadesigner tool. according to the results, the developed 2 to 1, 4 to 1, and 8 to 1 multiplexer circuits utilize 16, 96, and 286 qca cells (areas of 0.01 μm², 0.11 μm², and 0.43 μm²), respectively. the results demonstrate that the proposed 8 to 1 multiplexer circuit reduces the cost by about 25%-99% compared to the existing multiplexer circuits.

key words: multiplexer circuit, quantum-dot cellular automata, coplanar, nanotechnology, nanoelectronics

1. introduction

quantum-dot cellular automata (qca) is a nano-scale technology that was developed by lent et al. [1] in 1993. the qca technology can be used for maintaining the trend predicted by moore's law [2]. this technology has many advantages, such as high device density, high switching speed, and low power consumption, in comparison with complementary metal-oxide-semiconductor (cmos) technology [3]. basic devices in this technology consist of qca cells, wire crossings and qca logic gates. the fundamental unit in the qca technology is the qca cell, which is comprised of a square with 4 quantum dots in its corners [1, 4]. it should be noted that each qca cell has only two electrons, which can tunnel between neighboring dots. these two electrons reside in opposite corners, so there are two possible polarizations. fig. 1 shows these two kinds of polarization, p=-1 and p=+1, of qca cells [5].

received july 30, 2020; received in revised form october 10, 2020
corresponding author: abdalhossein rezai
acecr institute of higher education, isfahan, iran
e-mail: rezaie@acecr.ac.ir

fig. 1 two possible polarizations in qca cells, p=-1 and p=+1 [5]

qca wires consist of a number of qca cells which can be used for transferring the input cell polarization [5].
qca wires can be categorized in two groups: (a) single-layer crossing wire, and (b) multilayer crossing wire. in addition, a four-phase (four-zone) clock pulse provides synchronization of information flow in the qca circuits. the qca clock pulse is also employed to reduce power dissipation [3, 6]. the qca cells behave like a single latch in each clock phase and propagate information in the same direction. as illustrated in fig. 2, the qca clock is composed of four phases and each phase is shifted by 90 degrees [3, 6]. in the clock phase, a signal has four states: 1) low-to-high state (switch phase), 2) high state (hold phase), 3) high-to-low state (release phase), 4) low state (relax phase).

fig. 2 four phases of the qca [3, 6]

when the qca clock is in the low-to-high state, the potential energy of the qca cell is low. so, during the switch phase, the tunneling barriers of the qca cells start to rise and the cells start to polarize according to the state of their neighboring cells; this is where the actual computation takes place. in the hold phase, the potential barriers of the qca cells are at the highest level and they prevent electrons from tunneling. during the release phase, the tunneling barriers are gradually lowered and the cell polarization starts to decrease. finally, in the relax phase, the potential barriers are held low, no barrier exists between the dots, and the cells stay in an unpolarized state. the overall delay of a qca circuit can be specified by the number of clock phases on its critical path [3]. in qca circuits, the basic logic units are the majority voter gate (mvg) and the inverter gate [1]. the 3-input mvg is considered the most important gate in the qca technology, because the 2-input or gate and the 2-input and gate can be constructed from an mvg by fixing one of its three inputs to p=+1 or p=-1, respectively [6].
the logic function of the mvg can be defined by the following equation:

$$ \mathrm{maj}(a, b, c) = out = ab + bc + ca $$ (1)

where a, b, and c are the inputs and the output is denoted by out. a four-phase clock pulse provides synchronization of information flow in the qca circuits [7]. in addition, the qca clock is employed for reducing the power dissipation [8]. the qca cells behave similarly to a single latch in each clock phase, so the information is propagated in the same direction. in recent years, many different logic circuits have been developed in the qca technology for various applications, such as qca multiplier [9], qca full adder [5, 10], qca multiplexer [3, 6-9, 11-15], qca counter [16], qca shift register [17], and qca comparator [18, 19]. in addition, multiplexer circuits play a significant role in digital circuit design, such as arithmetic logic unit design [6]. in this study, we develop a circuit with the aim of improving the performance of the single-layer 2 to 1 qca multiplexer. then, efficient single-layer 4 to 1 and 8 to 1 multiplexer circuits are implemented based on this 2 to 1 multiplexer.

2. design of 2 to 1 qca multiplexer

the developed 2 to 1 qca multiplexer is shown in fig. 3. this circuit consists of one inverter gate, one rotate majority voter gate (rmvg) and two original majority voter gates (omvgs). this structure is chosen for area efficiency and because it provides an architecture suitable for a modular design methodology for constructing efficient $2^n$ to 1 multiplexer circuits. the circuit has two inputs, a and b, one address line, s, and one output, f. the output f is expressed by the following equation:

$$ f = a \cdot \bar{s} + b \cdot s $$ (2)

fig. 3 the developed 2 to 1 qca multiplexer circuit (a) logical circuit, (b) layout

to verify and justify the layout of the developed single-layer 2 to 1 qca multiplexer, the qcadesigner tool version 2.0.3 [20] is utilized as a cell-level simulator for qca circuits.
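the majority function of equation (1), its reduction to and/or gates by fixing one input, and the multiplexer function of equation (2) can be checked exhaustively at the logic level. the sketch below is a boolean model only, with logic 1 standing for p=+1 and logic 0 for p=-1; it says nothing about cell layout or clocking, and the way the three majority gates are combined in `mux2` is an assumption that is merely consistent with the gate count given in the text (one inverter and three majority gates).

```python
from itertools import product


def maj(a, b, c):
    """3-input majority voter, equation (1): ab + bc + ca."""
    return (a & b) | (b & c) | (c & a)


def and2(a, b):
    """2-input AND: majority gate with one input fixed to logic 0 (p = -1)."""
    return maj(a, b, 0)


def or2(a, b):
    """2-input OR: majority gate with one input fixed to logic 1 (p = +1)."""
    return maj(a, b, 1)


def inv(a):
    """Inverter gate."""
    return 1 - a


def mux2(a, b, s):
    """2 to 1 multiplexer, equation (2): f = a*not(s) + b*s, from three majority gates."""
    return or2(and2(a, inv(s)), and2(b, s))


# Exhaustive checks over all input combinations.
gates_ok = all(and2(a, b) == (a & b) and or2(a, b) == (a | b)
               for a, b in product((0, 1), repeat=2))
mux_ok = all(mux2(a, b, s) == (b if s else a)
             for a, b, s in product((0, 1), repeat=3))
print(gates_ok, mux_ok)
```

both checks cover every input combination, so they confirm the boolean identities but not the timing or the physical behaviour that the qcadesigner simulation addresses.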
figure 4 shows the simulated waveform of the developed 2 to 1 multiplexer circuit.

fig. 4 the waveform of the developed 2 to 1 multiplexer circuit

it should be mentioned that, for rapid access to simulation results, the bistable approximation simulation engine has been chosen. for optimum layout, the cellular layout of the developed 2 to 1 multiplexer is designed in one layer using 16 qca cells and an area of 0.01 μm². it also takes 0.5 clock cycles to generate the output. table 1 summarizes a comprehensive comparison between the developed single-layer 2 to 1 qca multiplexer circuit and the other circuits in [3, 6-8, 13-15] with regard to the latency (required clock cycles), cell count, circuit area (µm²), and cost, where the cost is defined by the following equation:

$$ \text{cost} = \text{area} \times \text{latency}^2 $$ (3)

table 1 the simulation results of the single-layer 2 to 1 multiplexer circuits
reference | number of cells | area (µm²) | latency | cost
[14] | 27 | 0.03 | 0.75 | 0.0169
[13] | 19 | 0.02 | 0.75 | 0.0113
[7] | 26 | 0.02 | 0.5 | 0.005
[6] | 19 | 0.02 | 0.5 | 0.005
[8] | 23 | 0.02 | 0.5 | 0.005
[15] | 24 | 0.02 | 0.75 | 0.0113
[3] | 15 | 0.01 | 0.5 | 0.0025
this paper | 16 | 0.01 | 0.5 | 0.0025

based on these simulation results, the developed 2 to 1 multiplexer circuit shows an improvement with regard to cost, cell count, latency and circuit area compared to the other 2 to 1 qca multiplexer circuits in [13, 14, 15]. moreover, our developed circuit has advantages with regard to cost, cell count, and circuit area compared to the 2 to 1 qca multiplexer circuits in [6-8]. the results demonstrate that the proposed 2 to 1 multiplexer circuit reduces the cost by about 50%-85% compared to the circuits that are proposed in [6-8, 13-15]. although the 2 to 1 multiplexer circuit in [3] uses one cell fewer than our developed 2 to 1 multiplexer, the architecture of the developed 2 to 1 multiplexer is suitable for a modular design methodology for constructing efficient $2^n$ to 1 multiplexer circuits.

3.
design of 4 to 1 qca multiplexer

the developed single-layer 4 to 1 qca multiplexer circuit is shown in fig. 5, which utilizes three developed 2 to 1 qca multiplexer modules.

fig. 5 the developed single-layer 4 to 1 qca multiplexer circuit (a) layout, (b) logic circuit

the developed circuit consists of two address lines, four inputs, and one output. a, b, c, and d are utilized as input signals, s0 and s1 denote the address lines, and the output signal is denoted by f. the output f is expressed by the following equation:

$$ f = (s_1 \cdot s_0)\, d + (s_1 \cdot \bar{s}_0)\, c + (\bar{s}_1 \cdot s_0)\, b + (\bar{s}_1 \cdot \bar{s}_0)\, a $$ (4)

figure 6 shows the simulated waveform of the developed 4 to 1 multiplexer design.

fig. 6 the waveform of the developed 4 to 1 multiplexer design

for optimum layout, the cellular layout of the developed 4 to 1 multiplexer is designed in one layer using 96 qca cells and an area of 0.11 μm². it also takes 1 clock cycle to generate the output. table 2 summarizes a comprehensive comparison between the developed single-layer 4 to 1 qca multiplexer circuit and the previous 4 to 1 qca multiplexer circuits in [3, 7, 8, 11, 12].

table 2 the simulation results for the 4 to 1 qca multiplexer circuits
reference | number of cells | area (µm²) | latency | cost
[7] | 271 | 0.37 | 4.75 | 8.3481
[11]* | 251 | 0.2 | 1.25 | 0.3125
[11] | 199 | 0.27 | 1.50 | 0.6075
[8] | 155 | 0.24 | 1.25 | 0.375
[12]* | 103 | 0.08 | 1.75 | 0.245
[3] | 107 | 0.15 | 1 | 0.15
this paper | 96 | 0.11 | 1 | 0.11
* multilayer

based on these simulation results, the developed 4 to 1 qca multiplexer circuit has advantages with regard to cost, cell count, latency, and circuit area compared to the other 4 to 1 qca multiplexer circuits in [7, 8, 11, 12]. our developed 4 to 1 multiplexer circuit provides an improvement in terms of cost, number of cells, and circuit area compared to the 4 to 1 qca multiplexer circuit in [3].
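the cost figures in table 2 follow directly from equation (3), cost = area × latency², and can be recomputed from the tabulated areas and latencies. the sketch below does only that arithmetic; the dictionary layout and the function name are illustrative, and the numbers are copied from table 2.

```python
def cost(area_um2, latency_cycles):
    """Cost metric of equation (3): area multiplied by the squared latency."""
    return area_um2 * latency_cycles ** 2


# (area in um^2, latency in clock cycles), copied from table 2.
TABLE_2 = {
    "[7]": (0.37, 4.75),
    "[11] multilayer": (0.20, 1.25),
    "[11]": (0.27, 1.50),
    "[8]": (0.24, 1.25),
    "[12] multilayer": (0.08, 1.75),
    "[3]": (0.15, 1.00),
    "this paper": (0.11, 1.00),
}

costs = {name: cost(area, latency) for name, (area, latency) in TABLE_2.items()}
ours = costs["this paper"]

# Percentage cost reduction of the developed circuit relative to each earlier design.
reductions = {name: 100.0 * (1.0 - ours / value)
              for name, value in costs.items() if name != "this paper"}

for name, value in costs.items():
    print(f"{name:16s} cost = {value:.4f}")
print(f"reduction range: {min(reductions.values()):.1f}% to {max(reductions.values()):.1f}%")
```

the smallest reduction is obtained against [3] and the largest against [7], which brackets the reductions over all designs listed in table 2.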
the developed circuit also provides an improvement in comparison with the 4 to 1 qca multiplexer circuit in [12] with regard to cost, cell count and latency. the results demonstrate that the proposed 4 to 1 multiplexer circuit reduces the cost by about 26%-98% compared to the circuits that are proposed in [3, 7, 8, 11, 12].

4. design of 8 to 1 qca multiplexer

the developed single-layer 8 to 1 qca multiplexer circuit is displayed in fig. 7, which utilizes the developed 2 to 1 multiplexer circuit and two developed 4 to 1 qca multiplexer circuits. the developed circuit consists of eight inputs, one output, and three address lines. a, b, c, d, e, f, g, and h are utilized as input signals, s0, s1, and s2 denote the address lines, and the output signal is denoted by out. the output out is expressed by the following equation:

$$ out = (s_2 \cdot s_1 \cdot s_0)\, h + (s_2 \cdot s_1 \cdot \bar{s}_0)\, g + (s_2 \cdot \bar{s}_1 \cdot s_0)\, f + (s_2 \cdot \bar{s}_1 \cdot \bar{s}_0)\, e + (\bar{s}_2 \cdot s_1 \cdot s_0)\, d + (\bar{s}_2 \cdot s_1 \cdot \bar{s}_0)\, c + (\bar{s}_2 \cdot \bar{s}_1 \cdot s_0)\, b + (\bar{s}_2 \cdot \bar{s}_1 \cdot \bar{s}_0)\, a $$ (5)

fig. 7 the developed single-layer 8 to 1 multiplexer circuit

figure 8 shows the simulated waveform of the developed 8 to 1 multiplexer design.

fig. 8 the waveform of the developed 8 to 1 multiplexer design

for optimum layout, the qca layout of the developed 8 to 1 multiplexer is designed in one layer using 286 qca cells and an area of 0.43 μm². it also takes 1.5 clock cycles to generate the output. table 3 summarizes a comprehensive comparison between the developed single-layer 8 to 1 multiplexer circuit and the previous 8 to 1 qca multiplexer circuits in [3, 7, 8, 11].

table 3 the simulation results for the 8 to 1 qca multiplexer circuits

based on these simulation results, the developed 8 to 1 qca multiplexer circuit has advantages with regard to cost, cell count, latency and circuit area compared to the other 8 to 1 qca multiplexer circuits in [7, 8, 11].
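the modular construction described above (three 2 to 1 modules for the 4 to 1 multiplexer, two 4 to 1 modules plus one 2 to 1 module for the 8 to 1 multiplexer) can be checked against equations (4) and (5) at the logic level. this is a boolean sketch only; which address line drives which stage is an assumption made for illustration (the low-order address bits select within the sub-multiplexers), since the text does not spell out the wiring of the layout.

```python
from itertools import product


def mux2(a, b, s):
    """2 to 1 multiplexer of equation (2): returns a when s = 0 and b when s = 1."""
    return (a & (1 - s)) | (b & s)


def mux4(inputs, s1, s0):
    """4 to 1 multiplexer assembled from three 2 to 1 modules."""
    a, b, c, d = inputs
    low = mux2(a, b, s0)     # selects between a and b
    high = mux2(c, d, s0)    # selects between c and d
    return mux2(low, high, s1)


def mux8(inputs, s2, s1, s0):
    """8 to 1 multiplexer assembled from two 4 to 1 modules and one 2 to 1 module."""
    low = mux4(inputs[:4], s1, s0)    # inputs a..d
    high = mux4(inputs[4:], s1, s0)   # inputs e..h
    return mux2(low, high, s2)


def matches_equations():
    """Check every input pattern and address against equations (4) and (5)."""
    for bits in product((0, 1), repeat=8):
        for s2, s1, s0 in product((0, 1), repeat=3):
            if mux8(bits, s2, s1, s0) != bits[4 * s2 + 2 * s1 + s0]:
                return False
            if mux4(bits[:4], s1, s0) != bits[2 * s1 + s0]:
                return False
    return True


all_ok = matches_equations()
print(all_ok)
```

with inputs ordered a to h, address (s2, s1, s0) selects input number 4·s2 + 2·s1 + s0, which is the selection stated term by term in equations (4) and (5).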
moreover, our developed circuit has advantages with regard to cost, cell count, and circuit area compared to the 8 to 1 qca multiplexer circuit in [3]. the results demonstrate that the proposed 8 to 1 multiplexer circuit reduces the cost by about 25%-99% compared to the circuits that are proposed in [3, 7, 8, 11].

reference | number of cells | area (µm²) | latency | cost
[7] | 1312 | 1.83 | 10.5 | 201.76
[8] | 462 | 0.87 | 1.75 | 2.67
[11] | 494 | 0.58 | 2.25 | 2.94
[3] | 293 | 0.58 | 1.5 | 1.31
this paper | 286 | 0.43 | 1.5 | 0.97

5. conclusions

several kinds of nanotechnologies have been developed for replacing conventional cmos technology [21-23]. the qca technology is one of these nanotechnologies and offers promising advantages. in this study, we have developed a novel and efficient single-layer circuit for the 2 to 1 qca multiplexer based on majority and inverter gates. then, using this 2 to 1 qca multiplexer circuit, the 4 to 1 and 8 to 1 qca multiplexer circuits are developed. the developed circuits for qca multiplexers have been simulated using qcadesigner 2.0.3. according to the results, the developed 2 to 1, 4 to 1, and 8 to 1 multiplexer circuits utilize 16, 96, and 286 qca cells (areas of 0.01 μm², 0.11 μm², and 0.43 μm²), respectively. the results demonstrate that the proposed 8 to 1 multiplexer circuit reduces the cost by about 25%-99% compared to the circuits that are proposed in [3, 7, 8, 11].

references
[1] c.s. lent, p.d. tougaw, w. porod et al., "quantum cellular automata", nanotechnology, vol. 4, no. 1, pp. 49–57, 1993.
[2] j.d. meindl, "beyond moore's law: the interconnect era", comput. sci. eng., vol. 5, no. 1, pp. 20–24, 2003.
[3] h. rashidi, a. rezai and s. soltany, "high-performance multiplexer architecture for quantum-dot cellular automata", j. comput. electron., vol. 15, no. 3, pp. 968–981, 2016.
[4] z. taheri, a. rezai, and h. rashidi, "novel single layer fault tolerance rca construction for qca technology", fu elec. energ., vol. 32, no.
4, pp. 601-613, 2019. [5] d. mokhtari, a. rezai, h. rashidi, f. rabiei, s. emadi and a. karimi, "design of novel efficient full adder circuit for quantum-dot cellular automata technology", fu elec. energ., vol. 31, no. 2, pp. 279-285, 2018. [6] b. sen, m. dutta, m. goswami and b. k. sikdar, "modular design of testable reversible alu by qca multiplexer with increase in programmability", microelectronics j., vol. 45, no. 11, pp. 1522–1532, 2014. [7] r. sabbaghi-nadooshan and m. kianpour, "a novel qca implementation of mux-based universal shift register", j. comput. electron., vol. 13, pp. 1–13, 2013. [8] b. sen, m. goswami, s. mazumdar and b.k. sikdar, "towards modular design of reliable quantum-dot cellular automata logic circuit using multiplexers", comput. electr. eng., vol. 45, pp. 42–54, 2015. [9] j.d. wood and d. tougaw, "matrix multiplication using quantum-dot cellular automata to implement conventional micro-electronics", ieee trans. nanotechnol., vol. 10, no. 5, pp. 1036–1042, 2011. [10] m. hayati, and a. rezaei "design of novel efficient adder and subtractor for quantum-dot cellular automata", int. j. circ. theor. appl., vol. 43, no. 10, pp. 1446–1454, 2015. [11] g. cocorullo, p. corsonello, f. frustaci and s. perri, "design of efficient qca multiplexers", int. j. circ. theor. appl., vol. 44, no. 3, pp. 602–615, 2016. [12] b. sen, a. nag, a. de and b.k. sikdar "towards the hierarchical design of multilayer qca logic circuit", j. comput. sci., vol. 11, pp. 233–244, 2015. [13] b. sen, m. dutta, d. saran and b.k. sikdar, "an efficient multiplexer in quantum-dot cellular automata", in proceedings of the progress in vlsi design and test, lecture notes in computer science, vol. 7373, 2012, pp. 350-351. [14] a. roohi, h. khademolhosseini, s. sayedsalehi, and k. navi, "a novel architecture for quantum-dot cellular automata multiplexer", int. j. comput. sci., vol. 8, pp. 55–60, 2011. [15] r. singh and d. k. 
sharma, "design of efficient multilayer ram cell in qca framework", circuit world, vol. 47, no. 1, pp. 31-41, 2020. [16] m. n. divshali, a. rezai and s.s.f. hamidpour, "design of novel coplanar counter circuit in quantum-dot cellular automata technology", int. j. theor. phys., vol. 58, no. 8, pp. 2677–2691, 2019. [17] m. n. divshali, a. rezai and a. karimi, "towards multilayer qca siso shift register based on efficient d-ff circuits", int. j. theor. phys., vol. 57, no. 11, pp. 3326–3339, 2018. [18] a. shiri, a. rezai and h. mahmoodian, "design of efficient coplanar 1-bit comparator circuit in qca technology", fu elec. energ., vol. 32, no. 1, pp. 119-128, 2019. [19] r. mokhtarii and a. rezai, "investigation and design of novel comparator in quantum-dot cellular automata technology", j. nano-electron. phys., vol. 10, no. 5, pp. 50141-50144, 2018. [20] k. walus, t. dysart, g.a. jullien and r. budiman, "qcadesigner: a rapid design and simulation tool for quantum-dot cellular automata", ieee trans. nanotechnol., vol. 3, no. 1, pp. 26–31, 2004. [21] a. naderi, and m. ghodrati, "improving band-to-band tunneling in a tunneling carbon nanotube field effect transistor by multi-level development of impurities in the drain region", eur. phys. j. plus, vol. 132, no. 12, p. 510, 2017. [22] a. naderi and b. tahne, "methods in improving the performance of carbon nanotube field effect transistors", ecs j. solid-state sci. technol., vol. 5, no. 12, pp. m131-m140, 2016. [23] a. naderi and f. heirani, "improvement in the performance of soi-mesfets by t-shaped oxide part at channel region: dc and rf characteristics", superlattices and microstructures, vol. 111, pp. 1022-1033, 2017.

facta universitatis series: electronics and energetics vol. 33, no 4, december 2020, pp.
631-653 https://doi.org/10.2298/fuee2004631a

elham amouee1, morteza mohammadi zanjireh1, mahdi bahaghighat1, mohsen ghorbani2

received april 15, 2020; received in revised form august 10, 2020
corresponding author: morteza mohammadi zanjireh, computer engineering department, imam khomeini international university, qazvin, iran (e-mail: zanjireh@eng.ikiu.ac.ir)

facta universitatis series: electronics and energetics vol. 28, no 4, december 2015, pp. 507-525 doi: 10.2298/fuee1504507s

horizontal current bipolar transistor (hcbt) – a low-cost, high-performance flexible bicmos technology for rf communication applications

tomislav suligoj1, marko koričić1, josip žilak1, hidenori mochizuki2, so-ichi morita2, katsumi shinomura2, hisaya imai2

1university of zagreb, faculty of electrical engineering and computing, department of electronics, micro- and nano-electronics laboratory, croatia
2asahi kasei microdevices co. 5-4960, nobeoka, miyazaki, 882-0031, japan

abstract. in an overview of horizontal current bipolar transistor (hcbt) technology, state-of-the-art integrated silicon bipolar transistors are described which exhibit ft and fmax of 51 ghz and 61 ghz and an ft·bvceo product of 173 ghz·v, placing them among the highest-performance implanted-base silicon bipolar transistors. hcbt is integrated with cmos in a considerably lower-cost fabrication sequence as compared to standard vertical-current bipolar transistors, with only 2 or 3 additional masks and fewer process steps. due to its specific structure, the charge sharing effect can be employed to increase bvceo without sacrificing ft and fmax. moreover, the electric field can be engineered just by manipulating the lithography masks, achieving high-voltage hcbts with breakdowns up to 36 v integrated in the same process flow with high-speed devices, i.e. at zero additional cost. a double-balanced active mixer circuit is designed and fabricated in hcbt technology.
a maximum iip3 of 17.7 dbm at a mixer current of 9.2 ma and a conversion gain of -5 db are achieved.

key words: bicmos technology, bipolar transistors, horizontal current bipolar transistor, radio frequency integrated circuits, mixer, high-voltage bipolar transistors.

1. introduction

in the highly competitive wireless communication markets, rf circuits and systems are fabricated in technologies that are very cost-sensitive. in order to minimize the fabrication costs, sub-10 ghz applications can be processed by using high-volume silicon technologies. it has been identified that the optimum solution might

received march 9, 2015
corresponding author: tomislav suligoj, university of zagreb, faculty of electrical engineering and computing, department of electronics, micro- and nano-electronics laboratory, croatia (e-mail: tom@zemris.fer.hr)

a new anomalous text detection approach using unsupervised methods

1computer engineering department, imam khomeini international university, qazvin, iran
2department of electrical engineering, raja university, qazvin, iran

abstract. the increasing size of text data in databases requires appropriate classification and analysis in order to acquire knowledge and improve the quality of decision-making in organizations. the process of discovering hidden patterns in a data set, called data mining, requires access to quality data in order to receive a valid response from the system. detecting and removing anomalous data is one of the pre-processing and data cleaning steps in this process. methods for anomalous data detection are generally classified into three groups: supervised, semi-supervised, and unsupervised. this research offers an unsupervised approach for spotting anomalous data in text collections. in the proposed method, a combination of two approaches (i.e., clustering-based and distance-based) is used for detecting anomalies in the text data.
in order to evaluate the efficiency of the proposed approach, the method is applied to four labeled data sets. the accuracy of the naïve bayes and decision tree classification algorithms is compared before and after removal of anomalous data with the proposed method and with some other methods such as density-based spatial clustering of applications with noise (dbscan). our proposed method shows that accuracy of more than 92.39% can be achieved. in general, the results revealed that in most cases the proposed method has a good performance.

key words: anomaly detection, text mining, unsupervised learning, clustering, pre-processing, dbscan algorithm.

© 2020 by university of niš, serbia | creative commons license: cc by-nc-nd

1 introduction

the current age is called the information age. since organizations and institutions record a huge amount of data daily, data recovery alone is not enough to make decisions, so automatic classification and analysis of data is very important. data mining is the process of identifying valid, previously unknown patterns and relationships in a high volume of data [1]. intelligent data exploration helps organizations to discover and predict system behaviors and patterns in order to make better and faster decisions. besides, machine learning (ml) is the science that deals with the development of algorithms and statistical models. in machine learning, the goal is to enable computer systems to perform particular tasks without explicit instructions, relying on patterns and inference instead.
nowadays, this science is widely used in broad fields such as image processing, machine vision, audio signal processing, natural language processing (nlp), communication networks, financial areas, and so on [2–10]. in many topics of data mining, data is classified into structured, semi-structured, and unstructured [11]. data mining and machine learning are strong tools to handle all of these data types. structured data has a predictable and regular format, such as the structure of the tables in relational databases. in contrast, unstructured data does not have a specific structure and its analysis is not so easy. the significant growth and diversity of text data can be considered as an example of this data type. the volume and speed of unstructured data are several times greater than those of structured data. therefore, one of the applied areas in data mining is text mining and natural language processing (nlp). before starting data mining, some steps should be taken in order to prepare the data. the steps of data mining include selecting data, initial cleaning and pre-processing, discovering patterns, and interpreting and displaying them. diagnosis of anomalous data can be considered as a pre-processing step in the data mining path [12].
an anomaly is a pattern that differs from the other patterns existing in the data set. anomaly was first defined by grubbs (1969): an anomaly is data which dramatically deviates from other available samples in the series [13]. the term ‘anomaly in text data’ refers to texts which are abnormal or significantly different from the other texts in terms of concept. in text data, an anomaly can be investigated in terms of a difference in the text author, the subject, the genre, the style of the text, and the emotional tone of the text [14]. the main reason for the development of text mining systems is the increasing volume of textual data in organizations and businesses. one of the challenges in monitoring infectious diseases, such as covid-19, is that large volumes of textual data are produced continuously. in a pandemic, this volume can be far greater than a human being can process [15, 16]. among the applications of this field, the following can be mentioned [17,18]:
1. diagnosis of anomalies in safety reports sent from space stations
2. tracing the subject of news
3. detecting abnormal web content
4. identifying significant patterns in annual financial reports
5. identifying abnormal data in news reports
6. discovering knowledge from medical records
there are several anomaly detection systems. these systems comprise three phases. the first phase is the pre-processing step, which includes removing unwanted words and stemming [19]. in the second phase, text representation (e.g., representing text sentences as vectors) is carried out. the third phase includes text processing for detecting anomalies and comparing documents. anomaly detection is not an easy task. so far, researchers have developed many anomaly detection methods using statistical methods, machine learning, and data mining, but the problem is still open.
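the first two of the three phases described above (pre-processing and vector representation) can be illustrated with a minimal python sketch. it is not the authors' implementation: the tiny stop-word list, the suffix-stripping `crude_stem` helper, and all function names are hypothetical and chosen only for illustration; a real system would use a full stop-word list and a proper stemmer.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (assumption for illustration).
STOP_WORDS = {"the", "a", "an", "of", "in", "and", "is", "are", "to"}


def crude_stem(word):
    """Strip a few common English suffixes (a stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Phase 1: lowercase, tokenize, drop stop words, stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]


def vectorize(docs):
    """Phase 2: map each document to a term-count vector over a shared vocabulary."""
    token_lists = [preprocess(d) for d in docs]
    vocab = sorted({t for tokens in token_lists for t in tokens})
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)  # missing terms count as 0
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


docs = ["the cats are sleeping", "a cat sleeps in the garden"]
vocab, vectors = vectorize(docs)
```

the resulting term-count vectors are the kind of numerical representation on which the third phase (anomaly detection) can operate.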
several approaches are shown in figure 1. the methods of anomaly detection are widespread, and each is used based on the input data type and its application. in one approach, the methods are classified based on access to labelled data. accordingly, the methods are categorized into three main categories [20]:
1. supervised anomaly detection
2. semi-supervised anomaly detection
3. unsupervised anomaly detection
fig. 1: the key components of anomaly detection methods [12]
in the supervised method, both normal and abnormal data are labelled in the training data set. the typical approach in such a method is to create a predictive model for both normal and abnormal data. each test sample is then compared with the model to determine the class to which it belongs [12]. in the semi-supervised method, it is assumed that only normal samples are labelled. since this method does not require anomalous data labelling, it is more widely applicable than the supervised method [11]. in comparison, unsupervised methods do not require training data, so they are more widely applicable than the two previous approaches. the most important advantage of this approach is that it does not need access to labelled data. usually, this group of methods is known as clustering solutions [11]. figure 2 summarizes a set of supervised and unsupervised anomaly detection methods. text clustering refers to the process of dividing a group of texts into similar subgroups based on content. semantic clustering refers to clustering texts based on their contents or meaning [21,22].
fig. 2: various categories of anomalous data detection methods: the supervised and unsupervised approaches [20], [11]. i: the assumption of these methods is that normal data belong to at least one cluster while anomalous data do not belong to any cluster [11]. ii: in this method, a normal instance lies near the centroid of its nearest cluster, while an anomalous sample is far from the center of gravity of its nearest cluster. iii: normal data belong to high-density clusters while anomalous data are distributed in low-density clusters.
the remainder of this paper is organized as follows. in section 2, we review some related works. in section 3, we present the methodology of our proposed method.
the simulations and experimental results of the proposed algorithm are presented in section 4. finally, in section 5, we conclude the paper.

2 related works

there are many studies in the research literature that try to address anomaly detection issues. consequently, different algorithms have been proposed to diagnose anomalies in multidimensional data sets. the key methods used in this area include the distance-based approach, the density-based approach, and their subset methods [23]. many researchers have worked to detect anomalies in textual documents, considering different aspects of text attributes. a number of researchers converted the text to numbers and used algorithms that are suitable for numerical data. others chose a limited part of the documents, such as document titles, to detect anomalies and find a pattern for the data set [24]. in [25], the authors used the conceptual graph method in order to identify anomalous data in text. a conceptual graph is a bipartite graph that includes two different types of nodes (i.e., concepts and relations). this method differs from classical statistical approaches and distance-based approaches. in this approach, a data deviation is identified based on the concept of regularity. using a conceptual graph and the relationships between the entities (concepts), the template pattern is identified, and rare patterns that differ from it are deemed anomalies [25]. sumithiradevi et al. used clustering methods for anomalous data detection. initially, using the greedy method, they improved the k-means algorithm and clustered the data set. then, all records were read, and a flag with the initial value of zero was attributed to each of them. later on, one sample at a time was considered as an outlier and removed from the data set. in the next step, the change in entropy of the remaining set was calculated.
if the entropy of the remaining set increases after removing a data item, the deleted item is an anomaly, and the value of its flag is changed to one [26]. juntao wang et al. made use of the density-based approach in order to remove anomalous data. first, they clustered the data set using the fast k-means approach. then, for each data item in a cluster, the degree of anomaly was calculated, and any item whose degree of anomaly is much larger than one is removed from the data set. in the next step, the average of the remaining members of a cluster is selected as the new center of gravity of the cluster. this procedure continues until the clusters converge, so that all of the anomalous data is removed from the data set [27]. in [28], the authors applied a method similar to the local outlier factor (lof) algorithm in order to determine the degree of anomaly based on distance from the centroid. in the first step, the data set is clustered using a k-means algorithm improved by a genetic algorithm.
in the next step, the data set is filtered, and by defining a threshold, the data items whose degree of anomaly exceeds a specific value are considered anomalous. in this method, a degree of anomaly is determined for each vector based on its distance from the centroid of the cluster. lei et al. used the subtractive clustering algorithm to estimate the potential of each data item to serve as an initial seed according to the neighborhood radius of the samples. in the next step, by combining the silhouette index with the k-means algorithm, they improved the estimated number of clusters. the silhouette index is a criterion for measuring how well a data item is assigned to its cluster, that is, whether the item is closer to the samples in its own cluster or to data from other clusters. if the value obtained is close to one, the assignment of the item to the cluster is desirable, but if the value is closer to 0.5, it is likely that the item belongs to another cluster. finally, the improved cluster-based local outlier factor (cblof) algorithm is used to identify anomalous data [29]. in [30], the authors based their work on improving k-means clustering and implemented their method in parallel. first, using principal component analysis (pca), they reduced the dimensionality of the problem. then, by applying the dd algorithm [31], they improved the performance of k-means. this algorithm selects the initial seeds according to the distribution of the data and improves the quality of their choice. also, instead of using a single point as an initial seed, it uses the average value of several points as the centroid of the cluster. through a number of tests, yin et al. were able to define a threshold for determining the number of clusters in k-means clustering, so that it does not need to be specified by the user [30].
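as a complement to the description of the silhouette index above, the following python sketch computes the silhouette coefficient of a single sample in its standard form, s = (b - a) / max(a, b), where a is the mean distance to the other members of the sample's own cluster and b is the smallest mean distance to the members of any other cluster. in this standard definition the value lies between -1 and 1; values near 1 indicate a good assignment, values near 0 a sample on the border between clusters, and negative values a sample that is closer to another cluster. the euclidean distance, the toy points, and the function names are assumptions made for illustration and are not taken from [29].

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def silhouette(points, labels, i):
    """Silhouette coefficient of sample i; assumes at least two clusters."""
    own = labels[i]
    # a: mean distance to the other members of the sample's own cluster
    same = [euclidean(points[i], points[j])
            for j in range(len(points)) if j != i and labels[j] == own]
    if not same:
        return 0.0  # common convention for singleton clusters
    a = sum(same) / len(same)
    # b: smallest mean distance to the members of any other cluster
    b = min(
        sum(euclidean(points[i], points[j])
            for j in range(len(points)) if labels[j] == c) / labels.count(c)
        for c in set(labels) if c != own
    )
    denom = max(a, b)
    return 0.0 if denom == 0 else (b - a) / denom


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0), (9.0, 0.5)]
labels = [0, 0, 1, 1, 0]  # the last point is deliberately put in the wrong cluster
scores = [silhouette(points, labels, i) for i in range(len(points))]
```

in this toy example the deliberately misassigned last point receives a negative score, while the correctly assigned points of cluster 1 receive scores close to one.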
identifying items or events that do not match the expected patterns or other items in the data set is called anomaly detection. these anomalous items can indicate problems such as structural defects, errors, credit card fraud, and cyber-attacks. the ability to detect anomalous behavior can provide very useful insights in various industries and be an important key to solving these problems. machine learning algorithms make processing faster and more efficient for detecting anomalies. these algorithms can learn from data and make predictions based on that data. in [32], the authors examined the issue of discovering emerging relations from news using machine learning. these relations can help with news-related tasks, such as retrieving news, discovering events, ranking, and more, which is a challenging task. in this research, a novel heterogeneous graph embedding framework for emerging relation detection (heer) and a global graph perspective were presented. heer can embed words and entities by learning from the heterogeneous textual graph and the knowledge graph, and it predicts the emerging relations via a positive and unlabeled learning (pu) classifier. in [33], the authors presented a kernel-based ensemble clustering approach and used a prototype reduction scheme to decrease the time required to generate the ensemble members. they showed that the reduction method could improve the results. their method was a learning process for document clustering that uses correspondence-based aggregation in conjunction with kernel clustering on a matrix constructed using density-biased prototype selection.

3 proposed method

similar to other existing studies such as [32–35], we deal with numerical data in this research. we used a combination of clustering-based and distance-based methods.
at the first, it is required to convert text data into an understandable format for the system. to this end, text documents were converted into vectors. anomalous data is detected in two phases. in the first phase, the k-means algorithm is used for clustering of data items in the k clusters. in the second phase, anomalous data in each cluster is detected based on the similarity comparison of each data item with the document of the centroid of the cluster. in clustering phase, probably some clusters of empty values might be created, and/or one of the anomalous data items is selected as the initial seeds and forming a cluster. therefore, the clustering stage is carried out several times in order to get a more desirable result. in the step after clustering, centroid of each cluster is considered as the representative of that cluster and since the text data is displayed in the vector space, using the cosine similarity (cs) formula, the angle between the data within the cluster is compared to its center. it should be compared with the threshold limit. if the similarity rate is less than the threshold, the data is considered as an anomaly. it should be noted that the number of abnormal data is negligible in comparison with the total data in the set. in this approach, the method of k-means algorithm clustering was used to divide data set to a few smaller parts based on the criterion of cosine similarity which will result in a decrease in the number of comparison of documents. in other words, instead of calculating distance (similarity) of each and every document in the whole set of data, we first divide the set according to the most similar documents in the k-means algorithm clustering so that the number of comparison between documents within each cluster with the cluster centroid 638 e. amouee, m. m. zanjireh , m. bahaghighat, m. ghorbani a new anomalous text detection approach using unsupervised methods... 639 8 and a global graph perspective was presented. 
heer can embed words and entities by learning from the heterogeneous textual graph and the knowledge graph and predicts the emerging relations via a positive and unlabeled learning (pu) classifier. in [33], the authors presented a kernel-based ensemble clustering approach and used a prototype reduction scheme to decrease the time required to generate the ensemble members. they showed that the reduction method could improve the results. the method they used was a learning process for documents clustering that correspondence-based aggregation in conjunction with kernel clustering on a matrix constructed using density-biased prototype selection. 3 proposed method similar to other existing studies such as [32–35], we deal with numerical data in this research. a combination of clustering-based and distance-based methods was used by us. at the first, it is required to convert text data into an understandable format for the system. to this end, text documents were converted into vectors. anomalous data is detected in two phases. in the first phase, the k-means algorithm is used for clustering of data items in the k clusters. in the second phase, anomalous data in each cluster is detected based on the similarity comparison of each data item with the document of the centroid of the cluster. in clustering phase, probably some clusters of empty values might be created, and/or one of the anomalous data items is selected as the initial seeds and forming a cluster. therefore, the clustering stage is carried out several times in order to get a more desirable result. in the step after clustering, centroid of each cluster is considered as the representative of that cluster and since the text data is displayed in the vector space, using the cosine similarity (cs) formula, the angle between the data within the cluster is compared to its center. it should be compared with the threshold limit. if the similarity rate is less than the threshold, the data is considered as an anomaly. 
it should be noted that the number of abnormal data items is negligible in comparison with the total data in the set. in this approach, k-means clustering divides the data set into a few smaller parts based on the cosine similarity criterion, which decreases the number of document comparisons. in other words, instead of calculating the distance (similarity) of each and every document in the whole data set, we first divide the set by grouping the most similar documents with k-means, so that only the documents within each cluster need to be compared with the cluster centroid and fewer comparisons are required. to cluster the text data and to determine the similarity, the bag-of-words approach and cosine similarity were used, respectively. to identify anomalous data in each cluster, we look for data items that differ from the behavioral pattern of the other members, or that differ greatly from the cluster centroid. after the clustering phase, the weight of each cluster is calculated using the following formula:

$$ w(k) = \sum_{x=1}^{n} s_x \tag{1} $$

where k is the cluster number (k = 1, 2, ..., n) and, given that the text attributes have been mapped to vectors, $s_x$ is the similarity of each data item to the cluster centroid. in the next step, the average similarity of the documents in each cluster is calculated by the following formula:

$$ m(k) = \sum_{x=1}^{n} \frac{s_x}{t_n} \tag{2} $$

thus, the total similarity of the documents to the centroid in each cluster is divided by the total number of documents $t_n$, and a numeric value is obtained as the average similarity of documents in each cluster.
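as a small worked example of eqs. (1) and (2), the sketch below computes the weight and the mean similarity of one cluster from a list of similarities. the function names and the similarity values are illustrative assumptions, and $t_n$ is taken here to be the number of documents in the cluster.

```python
def cluster_weight(similarities):
    """Eq. (1): sum of the similarities s_x of the members to the centroid."""
    return sum(similarities)


def mean_similarity(similarities):
    """Eq. (2): cluster weight divided by the number of documents t_n.

    t_n is assumed to be the number of documents in this cluster.
    """
    return cluster_weight(similarities) / len(similarities)


# Hypothetical similarities of four documents to their cluster centroid.
similarities = [0.9, 0.8, 0.7, 0.2]
w_k = cluster_weight(similarities)   # approximately 2.6
m_k = mean_similarity(similarities)  # approximately 0.65
```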
in the next step, the maximum and minimum similarity in each cluster are calculated, and the threshold is derived from them with the following formulas:

$$ s_{k\text{-}\max} = \max(s_1, s_2, \ldots, s_n) \tag{3} $$

$$ s_{k\text{-}\min} = \min(s_1, s_2, \ldots, s_n) \tag{4} $$

$$ \mathrm{avg}(k) = \frac{s_{k\text{-}\max} + s_{k\text{-}\min}}{2} \tag{5} $$

$$ \mathrm{diff}(k) = m(k) - \mathrm{avg}(k) \tag{6} $$

finally, the similarity threshold for each cluster is obtained by this formula:

$$ \mathrm{threshold}(k) = \left| \mathrm{diff}(k) \right| \tag{7} $$

the similarity of each data item to the cluster centroid is compared with the threshold value. if the similarity of the document to the centroid is less than the threshold, the document is considered an outlier in the cluster. as stated, the threshold is the difference between the midpoint of the similarity values and the average of the similarities. since the average of each cluster is obtained by dividing the sum of the values by their number, the presence of an anomaly in the cluster pulls values away from the average and increases the variance. as a result, the mean similarity of the documents alone cannot be a good indicator of the threshold. therefore, the midpoint of the similarities, that is, the average of the similarity values of the most similar and the most different documents, is also entered into the threshold formula, and its difference from the average similarity modulates the threshold. the pseudo-code of the proposed algorithm is shown below:

algorithm 1 the proposed algorithm
require: input ⇒ data set d = {d1, d2, ..., dn}, where n is the number of documents (k is the number of clusters)
ensure: output ⇒ a set of k clusters without outliers
require: choose k objects from d as initial cluster centers
repeat
    1. calculate the distance of each data instance to the centroids using cs
    2. reassign objects to the cluster with the highest similarity
    3. update the cluster centroids according to the cs
until no changes
calculate the weight-based center wk
repeat
    1. calculate the mean cs as m(k)
    2. calculate max(s1, s2, ..., sn) and min(s1, s2, ..., sn)
    3. calculate the similarity threshold for each cluster
    if dn < d(k) then
        delete dn from cluster k
    end if
until end of clusters

3.1 data set

in this study, two data sets (bbc news and bbc sports news) were used [35]. bbc news contains 2225 documents from news articles on the bbc website in five news groups between 2004 and 2005. these five news groups are labeled business, entertainment, politics, sport, and tech. the bbc sports collection contains 737 documents from bbc sports website articles in five sports areas between 2004 and 2005, labeled athletics, cricket, football, rugby, and tennis. each data set thus contains a large number of text files of news published on the bbc website in several newsgroups with different topics. since the news topics differ, the words used in the texts also vary with the type of news: words that are common in political news are not used in sports news, and the words used across sports news are similar to each other but differ from those of business news. as a result, the similarity or difference of documents becomes apparent after conversion to the vector space. documents of the sports news genre are placed in the same cluster at clustering time; if business news documents are placed in this cluster, they are considered outliers. the purpose of this approach is to find such irrelevant documents, which with regard to their subject should be placed in a different cluster. to identify anomalies, the algorithm is run on the text data set with and without pre-processing. pre-processing consists of stemming, removal of stop words, and weighting by the term frequency–inverse document frequency (tf-idf) method. table 1 shows the details of the bbc data sets.
table 1: summary of the data sets

| dataset | description | documents | classes | class labels |
|---|---|---|---|---|
| bbc news | news articles from bbc | 2225 | 5 | business, entertainment, politics, sport, tech |
| bbc sports | sports news articles from bbc | 737 | 5 | athletics, cricket, football, rugby, tennis |

3.2 text pre-processing

performing operations on text, including classification and clustering, requires converting it into a format the system can process. as mentioned earlier, text documents are unstructured data, and to perform the calculations it is necessary to convert them to a structured form.

stemming: the stemming process converts words to their root form. for example, the words ‘apply,’ ‘applied,’ and ‘application’ have the same root and are all converted to the word ‘apply’ [36].

stop words removal: in this step, low-value words such as conjunctions and prepositions, which occur frequently and carry little semantic meaning, are deleted [37].

bag-of-words model: this is a simple representation of text documents used in natural language processing. in this model, each document is represented regardless of grammar and word order, but the number of repetitions of each word matters. the result is a word-document matrix in which every row represents a document and every column represents a word. if a word occurs in the document, 1 is inserted in the corresponding column of the matrix; otherwise 0 is inserted. the first reference to the “bag-of-words model” in a linguistic context can be found in zellig harris’s article on distributional structure [38]. in this study, the bag-of-words model and conversion to the vector space model were used to represent the texts.
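the word-document matrix described above can be illustrated with a short sketch. it follows the binary (0/1) variant stated in the text and makes several simplifying assumptions: tokens are obtained by lowercasing and splitting on whitespace, no stemming or stop-word removal is applied, and the function names and the two sample sentences are hypothetical.

```python
def build_vocabulary(documents):
    """Sorted list of the distinct lowercased, whitespace-separated tokens."""
    return sorted({token for doc in documents for token in doc.lower().split()})


def binary_bag_of_words(documents):
    """Return (vocabulary, matrix): one row per document, one column per word.

    An entry is 1 if the word occurs in the document and 0 otherwise.
    """
    vocabulary = build_vocabulary(documents)
    matrix = []
    for doc in documents:
        tokens = set(doc.lower().split())
        matrix.append([1 if word in tokens else 0 for word in vocabulary])
    return vocabulary, matrix


# Two hypothetical one-sentence documents from different news genres.
docs = ["the match ended in a draw", "the market ended higher"]
vocab, matrix = binary_bag_of_words(docs)
# vocab  -> ['a', 'draw', 'ended', 'higher', 'in', 'market', 'match', 'the']
# matrix -> [[1, 1, 1, 0, 1, 0, 1, 1],
#            [0, 0, 1, 1, 0, 1, 0, 1]]
```

the two rows share only the columns for "ended" and "the", which is the kind of difference the later similarity comparison relies on.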
the vector space model is an algebraic model for representing text documents in the vector space.

weighting words by tf-idf: in this method, each word is assigned a weight based on its frequency in the text relative to its frequency in other texts. this weighting shows how important a word is for a document. the first form of term weighting is due to hans peter luhn (1957) and may be summarized as [39]: the weight of a term that occurs in a document is simply proportional to the term frequency. idf was introduced by karen spärck jones as “term specificity” [40, 41]. although it has worked well as a heuristic, many researchers are still trying to find information-theoretic justifications for it [41]. the criterion is made up of two functions, tf (term frequency) and idf (inverse document frequency): if a specific word is repeated often in a document and rarely in the other documents under investigation, the word is very important. the criterion is the product of the two values (i.e., tf × idf). tf equals the number of occurrences of the word divided by the total number of words in the document [42, 43].

$$ \text{tf-idf}_{t,d} = \mathrm{tf}_{t,d} \times \mathrm{idf}_{t} \tag{8} $$

$$ \mathrm{tf}_{t,d} = \begin{cases} \log(1 + f_{t,d}), & \text{if } f_{t,d} > 0 \\ 0, & \text{otherwise} \end{cases} \tag{9} $$

where t is the term, d is a bag of words (a document in ir terms), and $f_{t,d}$ is the frequency of the term in the bag.
the term frequency can also be normalized by the largest term frequency in the bag:

$$ \mathrm{tf}_{t,d} = \frac{f_{t,d}}{\max\{ f_{t',d} : t' \in d \}} \tag{10} $$

idf is the logarithm of the total number of documents divided by the number of documents containing the target word [44].

$$ \mathrm{idf}_{t,D} = \log \frac{|D|}{|\{ d \in D : t \in d \}|} = \log \frac{n}{\mathrm{df}_t} \tag{11} $$

where n is the cardinality of the corpus $D$ (the total number of classes) and the denominator $\mathrm{df}_t$ is the number of bags in which the term t appears.

$$ \mathrm{idf}_{t} = \log \frac{n}{\mathrm{df}_t} \tag{12} $$

then the tf∗idf weight for a term t in the bag d of a corpus $D$ is defined as:

$$ \mathrm{tf{*}idf}(t, d, D) = \mathrm{tf}_{t,d} \times \mathrm{idf}_{t,D} \tag{13} $$

$$ \mathrm{tf{*}idf}(t, d, D) = \log(1 + f_{t,d}) \times \log \frac{n}{\mathrm{df}_t} \tag{14} $$

for all cases where $f_{t,d} > 0$ and $\mathrm{df}_t > 0$, and zero otherwise. once all frequency values are computed, the term frequency matrix becomes the term weight matrix, whose columns are used as the class term weight vectors that facilitate classification using cosine similarity. according to the idf equation, the fewer documents contain the target word, the more important the word is.

3.3 criterion for assessing the similarity of two documents

cosine similarity is a similarity criterion between two vectors that calculates the cosine of the angle between them. the cosine of a zero angle is 1; as a result, if two vectors coincide, their similarity is equal to one, which is the highest possible similarity between vectors [45]. after the bag of words is prepared, each document is represented in the vector space. then the angle between two vectors (the similarity of two documents) is calculated from the following formula:
for two vectors a and b, cosine similarity is based on their inner product and is defined as:

$$ \mathrm{similarity}(a, b) = \cos(\theta) \tag{15} $$

$$ \cos(\theta) = \frac{a \cdot b}{\|a\| \, \|b\|} \tag{16} $$

$$ a \cdot b = \sum_{i=1}^{n} a_i b_i \tag{17} $$

$$ \|a\| = \sqrt{\sum_{i=1}^{n} a_i^2}, \qquad \|b\| = \sqrt{\sum_{i=1}^{n} b_i^2} \tag{18} $$

$$ \mathrm{similarity}(a, b) = \frac{\sum_{i=1}^{n} a_i b_i}{\sqrt{\sum_{i=1}^{n} a_i^2} \, \sqrt{\sum_{i=1}^{n} b_i^2}} \tag{19} $$

4 experimental results

this section reports the results of using the proposed approach to identify abnormal data in text. the approach was implemented and evaluated as follows. to evaluate the accuracy of the proposed approach, the data set is first divided into two parts: one part is used as training data and the other as test data for measuring the accuracy of the proposed algorithm. the tf-idf coefficient is used to weight the text keywords. since there are five news genres in the data set, the number of clusters in the k-means algorithm is fixed in advance at five. since the data set is labeled, the training data is used to fit the k-means algorithm. then the number of documents placed in each cluster is counted. the index of each document, together with its cosine similarity to the centroid of its cluster (the distance criterion), is stored in a two-dimensional array. following the proposed threshold formula, the lowest and highest similarity values in each cluster relative to the centroid, as well as the average distance values, are calculated.

the following figures present the accuracy of the decision tree and naïve bayes classifiers on the two data sets, before and after pre-processing, after the anomalous data has been removed by the proposed method and by the dbscan method with different neighborhood distances.

fig. 3: accuracy diagram of the decision tree after removing the anomalous data using the proposed method and dbscan with different neighborhood distances on the bbc data set

fig. 4: accuracy diagram of the naïve bayes algorithm after removing the anomalous data using the proposed method and dbscan with different neighborhood distances on the bbc data set

figures 3 and 4 show that, on the bbc data set before the pre-processing phase, the proposed method did not improve the accuracy of the naïve bayes method, but the accuracy of the decision tree with the proposed method increased in comparison with the dbscan method.

fig. 5: accuracy diagram of the decision tree after removing the anomalous data using the proposed method and dbscan with different neighborhood distances on the pre-processed bbc data set

fig. 6: accuracy diagram of the naïve bayes algorithm after removing the anomalous data using the proposed method and dbscan with different neighborhood distances on the pre-processed bbc data set

figures 5 and 6 show that the accuracy of the proposed method on the pre-processed bbc set improved compared to the non-pre-processed set: the accuracy of the naïve bayes method after eliminating the anomalous data with the proposed method increased compared to the dbscan method. the accuracy of the decision tree after eliminating the anomalous data with the proposed method also increased on the pre-processed data set.

fig. 7: accuracy diagram of the decision tree after removing the anomalous data using the proposed method and dbscan with different neighborhood distances on the bbc sports data set

fig. 8: accuracy diagram of the naïve bayes algorithm after removing the anomalous data using the proposed method and dbscan with different neighborhood distances on the bbc sports data set

figures 7 and 8 show that, on the bbc sports data set before the pre-processing phase, the proposed method did not improve the accuracy of the naïve bayes method, but the accuracy of the decision tree after removal of the anomalous data with the proposed method increased in comparison with the dbscan method.

fig. 9: accuracy diagram of the decision tree after removing the anomalous data using the proposed method and dbscan with different neighborhood distances on the pre-processed bbc sports data set

fig. 10: accuracy diagram of the naïve bayes algorithm after removing the anomalous data using the proposed method and dbscan with different neighborhood distances on the pre-processed bbc data set

figures 9 and 10 show that, as in the previous results, the accuracy of the proposed method on the pre-processed bbc sports news set improved compared to the non-pre-processed set: the accuracy of the naïve bayes method after eliminating the anomalous data with the proposed method increased compared to the dbscan method. the accuracy of the decision tree after removal of the anomalous data with the proposed method also increased on the pre-processed data set. table 2 compares two similar works with our proposed method on the same data set.

table 2: results on the bbc news data set

| authors | description | accuracy |
|---|---|---|
| zhang et al. | heterogeneous graph embedding for emerging relation detection from news | 64.4% |
| greene et al. | kernel-based ensemble clustering approach | 88.0% |
| our proposed method | a combination of two approaches (clustering-based and distance-based) | 92.39% |

5 conclusion and future works

in this research, a novel approach for identifying anomalous text data using unsupervised methods was proposed. the advantage of our proposed model as an unsupervised method is that it needs no prior knowledge or training data. in this research, we assumed that the number of anomalous data items is negligible compared to normal data.
the documents studied were in english, and the cosine similarity (cs) criterion was used to compare the distance between documents. therefore, a document that is the least similar to the others is considered an anomalous document. the proposed method combines two approaches, clustering-based and distance-based, to detect anomalies in text data. to evaluate the efficiency of the proposed approach, the method was applied to four labeled data sets. in general, the results show that the classification accuracy after applying the outlier detection algorithm and removing the detected outliers always improves on the pre-processed data sets, and the method also performs well on the non-pre-processed data sets. to determine the threshold, our model iteratively runs computationally intensive algorithms, so the computational complexity of our approach increases. besides, the user must specify k (the number of clusters) at the beginning; an improved k-means could be used as a solution. we also have to repeat the k-means algorithm several times to settle on the best clustering and to prevent the selection of outliers as initial seeds. furthermore, the results show that the pre-processing step can affect the accuracy. in the future, to tackle these issues, we are going to use an improved k-means algorithm, evaluate other types of clustering algorithms, apply the model to other languages, and investigate different distance (similarity) thresholds.

references

[1] z. a. bakar, r. mohemad, a. ahmad, and m. m. deris, “a comparative study for outlier detection techniques in data mining,” in 2006 ieee conference on cybernetics and intelligent systems. ieee, 2006, pp. 1–6.
[2] a. esmaeili kelishomi, a. garmabaki, m. bahaghighat, and j. dong, “mobile user indoor-outdoor detection through physical daily activities,” sensors, vol. 19, no. 3, p. 511, 2019.
[3] m. ghorbani, m. bahaghighat, q. xin, and f. özen, “convlstmconv network: a deep learning approach for sentiment analysis in cloud computing,” journal of cloud computing, vol. 9, no. 1, pp. 1–12, 2020.
[4] m. bahaghighat, l. akbari, and q. xin, “a machine learning-based approach for counting blister cards within drug packages,” ieee access, vol. 7, pp. 83785–83796, 2019.
[5] m. bahaghighat, s. a. motamedi, and q. xin, “image transmission over cognitive radio networks for smart grid applications,” applied sciences, vol. 9, no. 24, p. 5498, 2019.
[6] f. abedini, m. bahaghighat, and m. s’hoyan, “wind turbine tower detection using feature descriptors and deep learning,” facta universitatis, series: electronics and energetics, vol. 33, no. 1, pp. 133–153, 2019.
[7] m. bahaghighat, f. abedini, m. s’hoyan, and a.-j. molnar, “vision inspection of bottle caps in drink factories using convolutional neural networks,” in 2019 ieee 15th international conference on intelligent computer communication and processing (iccp). ieee, 2019, pp. 381–385.
[8] s. hasani, m. bahaghighat, and m. mirfatahia, “the mediating effect of the brand on the relationship between social network marketing and consumer behavior,” acta technica napocensis, vol. 60, no. 2, pp. 1–6, 2019.
[9] m. bahaghighat, q. xin, s. a. motamedi, m. m. zanjireh, and a. vacavant, “estimation of wind turbine angular velocity remotely found on video mining and convolutional neural network,” applied sciences, vol. 10, no. 10, p. 3544, 2020.
[10] m. bahaghighat and s. a. motamedi, “vision inspection and monitoring of wind turbine farms in emerging smart grids,” facta universitatis, series: electronics and energetics, vol. 31, no. 2, pp. 287–301, 2018.
[11] v. chandola, a. banerjee, and v. kumar, “anomaly detection: a survey,” acm computing surveys (csur), vol. 41, no. 3, pp. 1–58, 2009.
[12] j. d. parmar and j. t. patel, “anomaly detection in data mining: a review,” international journal, vol. 7, no. 4, 2017.
[13] r. kaur and s. singh, “a survey of data mining and social network analysis based anomaly detection techniques,” egyptian informatics journal, vol. 17, no. 2, pp. 199–216, 2016.
[14] d. guthrie, “unsupervised detection of anomalous text,” ph.d. dissertation, citeseer, 2008.
[15] j. samuel, g. ali, m. rahman, e. esawi, y. samuel et al., “covid-19 public sentiment insights and machine learning for tweets classification,” information, vol. 11, no. 6, p. 314, 2020.
[16] s. latif, m. usman, s. manzoor, w. iqbal, j. qadir, g. tyson, i. castro, a. razi, m. n. k. boulos, a. weller et al., “leveraging data science to combat covid-19: a comprehensive review,” 2020.
[17] a. hotho, a. nürnberger, and g. paaß, “a brief survey of text mining,” in ldv forum, vol. 20, no. 1. citeseer, 2005, pp. 19–62.
[18] a. k. nassirtoussi, s. aghabozorgi, t. y. wah, and d. c. l. ngo, “text mining for market prediction: a systematic review,” expert systems with applications, vol. 41, no. 16, pp. 7653–7670, 2014.
[19] a. mahapatra, n. srivastava, and j. srivastava, “contextual anomaly detection in text data,” algorithms, vol. 5, no. 4, pp. 469–489, 2012.
[20] m. goldstein and s. uchida, “a comparative evaluation of unsupervised anomaly detection algorithms for multivariate data,” plos one, vol. 11, no. 4, p. e0152173, 2016.
[21] t.-e. lin, h. xu, and h. zhang, “discovering new intents via constrained deep adaptive clustering with cluster refinement,” in aaai, 2020, pp. 8360–8367.
[22] i. aalto et al., “discovering topics in slack message streams,” 2020.
[23] r. kannan, h. woo, c. c. aggarwal, and h. park, “outlier detection for text data,” in proceedings of the 2017 siam international conference on data mining. siam, 2017, pp. 489–497.
[24] m. t. sereshki and m. m. zanjireh, “outlier detection in text data: an unsupervised method based on text similarity and density peak,” 2020.
[25] m. montes-y-gómez, a. gelbukh, and a. lópez-lópez, “detecting deviations in text collections: an approach using conceptual graphs,” in mexican international conference on artificial intelligence. springer, 2002, pp. 176–184.
[26] s. chellamuthu and m. punithavalli, “enhanced k-means with greedy algorithm for outlier detection,” international journal of advanced research in computer science, vol. 3, no. 3, 2012.
[27] j. wang and x. su, “an improved k-means clustering algorithm,” in 2011 ieee 3rd international conference on communication software and networks. ieee, 2011, pp. 44–46.
[28] m. marghny and a. i. taloba, “outlier detection using improved genetic k-means,” arxiv preprint arxiv:1402.6859, 2014.
[29] d. lei, q. zhu, j. chen, h. lin, and p. yang, “automatic k-means clustering algorithm for outlier detection,” in information engineering and applications. springer, 2012, pp. 363–372.
[30] c. yin and s. zhang, “parallel implementing improved k-means applied for image retrieval and anomaly detection,” multimedia tools and applications, vol. 76, no. 16, pp. 16911–16927, 2017.
[31] x.-j. tong, f.-r. meng, and z.-x. wang, “optimization to k-means initial cluster centers,” computer engineering and design, vol. 32, no. 8, pp. 2721–2723, 2011.
[32] j. zhang, c.-t. lu, m. zhou, s. xie, y. chang, and p. s. yu, “heer: heterogeneous graph embedding for emerging relation detection from news,” in 2016 ieee international conference on big data (big data). ieee, 2016, pp. 803–812.
[33] d. greene and p. cunningham, “efficient ensemble methods for document clustering,” department of computer science, trinity college dublin, tech. rep., 2006.
su, “an improved k-means clustering algorithm,” in 2011 ieee 3rd international conference on communication software and networks. ieee, 2011, pp. 44–46. [28] m. marghny and a. i. taloba, “outlier detection using improved genetic kmeans,” arxiv preprint arxiv:1402.6859, 2014. [29] d. lei, q. zhu, j. chen, h. lin, and p. yang, “automatic k-means clustering algorithm for outlier detection,” in information engineering and applications. springer, 2012, pp. 363–372. [30] c. yin and s. zhang, “parallel implementing improved k-means applied for image retrieval and anomaly detection,” multimedia tools and applications, vol. 76, no. 16, pp. 16 911–16 927, 2017. [31] x.-j. tong, f.-r. meng, and z.-x. wang, “optimization to k-means initial cluster centers,” computer engineering and design, vol. 32, no. 8, pp. 2721– 2723, 2011. [32] j. zhang, c.-t. lu, m. zhou, s. xie, y. chang, and s. y. philip, “heer: heterogeneous graph embedding for emerging relation detection from news,” in 2016 ieee international conference on big data (big data). ieee, 2016, pp. 803–812. [33] d. greene and p. cunningham, “efficient ensemble methods for document clustering,” department of computer science, trinity college dublin, tech. rep., 2006. a new anomalous text detection approach using unsupervised methods ...23 [34] j. manoharan, s. h. ganesh, and j. sathiaseelan, “outlier detection using enhanced k-means clustering algorithm and weight-based center approach,” int. j. comput. sci. mobile comput., vol. 5, no. 4, pp. 453–464, 2016. [35] d. greene and p. cunningham, “practical solutions to the problem of diagonal dominance in kernel document clustering,” in proc. 23rd international conference on machine learning (icml’06). acm press, 2006, pp. 377–384. [36] f. n. flores and v. p. moreira, “assessing the impact of stemming accuracy on information retrieval–a multilingual perspective,” information processing & management, vol. 52, no. 5, pp. 840–854, 2016. [37] w. j. wilbur and k. 
sirotkin, “the automatic identification of stop words,” journal of information science, vol. 18, no. 1, pp. 45–55, 1992. [38] z. s. harris, “distributional structure,” word, vol. 10, no. 2-3, pp. 146–162, 1954. [39] h. p. luhn, “a statistical approach to mechanized encoding and searching of literary information,” ibm journal of research and development, vol. 1, no. 4, pp. 309–317, 1957. [40] k. s. jones, “a statistical interpretation of term specificity and its application in retrieval,” journal of documentation, 1972. [41] s. robertson, “understanding inverse document frequency: on theoretical arguments for idf,” journal of documentation, 2004. [42] g. salton and c. buckley, “term-weighting approaches in automatic text retrieval,” information processing & management, vol. 24, no. 5, pp. 513–523, 1988. [43] c. d. manning, p. raghavan, and h. schütze, “scoring, term weighting and the vector space model,” introduction to information retrieval, vol. 100, pp. 2–4, 2008. [44] k. church and w. gale, “inverse document frequency (idf): a measure of deviations from poisson,” in natural language processing using very large corpora. springer, 1999, pp. 283–295. [45] a. huang, “similarity measures for text document clustering,” in proceedings of the sixth new zealand computer science research student conference (nzcsrsc2008), christchurch, new zealand, vol. 4, 2008, pp. 9–56. 652 e. amouee, m. m. zanjireh , m. bahaghighat, m. ghorbani a new anomalous text detection approach using unsupervised methods... 653 facta universitatis series: electronics and energetics vol. 34, no 3, september 2021, pp. 
381-392 https://doi.org/10.2298/fuee2103381t © 2021 by university of niš, serbia | creative commons license: cc by-nc-nd original scientific paper

damping analysis to improve the performance of shunt capacitive rf mems switch

lakshmi narayana thalluri1, k v v kumar2, konari raja sekhar3, n bhushana babu d3, s s kiran4, koushik guha5
1department of ece, andhra loyola institute of engineering and technology, vijayawada, andhra pradesh, india
2department of ece, universal college of engineering and technology, perecharla, a p, india
3department of ece, n s raju institute of technology (autonomous), sontyam, a p, india
4department of ece, lendi institute of engineering and technology, visakhapatnam, a p, india
5national mems design center, department of ece, national institute of technology, silchar, assam, india

abstract. this paper describes the significance of the iterative approach and of structural damping analysis in improving and validating the performance of a shunt capacitive rf mems switch. a micro-cantilever based, electrostatically actuated shunt capacitive rf mems switch is designed, and after multiple iterations on the cantilever structure a modified structure is obtained that requires a low actuation voltage of 7.3 v for a 3 µm deformation. to validate the structure, we performed the damping analysis for each iteration. the low actuation voltage is a consequence of identifying the critical membrane thickness of 0.7 µm and incorporating two slots and holes into the membrane. the holes in the membrane help in stress distribution. we also performed the eigenfrequency analysis of the membrane. the rf mems switch is micromachined on a cpw transmission line with a gap-strip-gap (g-s-g) of 85 µm - 70 µm - 85 µm. the switch rf isolation properties are analyzed with high dielectric constant thin films, i.e., aln, gaas, and hfo2.
for all the dielectric thin films the rf mems switch shows a high isolation of -63.2 db, but there is a shift in the radio frequency. because of the presence of holes in the membrane, the switch exhibits a very low insertion loss of -0.12 db.

key words: vibration analysis, rf mems switches, material science, fem tools analysis.

received march 22, 2021; received in revised form june 06, 2021
corresponding author: lakshmi narayana thalluri, department of ece, andhra loyola institute of engineering and technology, vijayawada, andhra pradesh, india
e-mail: drtln9@gmail.com
* an earlier version of this paper was presented at the international conference on micro/nano electronics devices, circuits and systems (mndcs-2021), 30-31 january, 2021, india [1].

1. introduction

rf mems switches are becoming prominent because of their low power consumption and high linearity [1]. shunt capacitive rf mems switches are particularly useful in rf mems technology, which has great potential in the design of reconfigurable antennas [2]. the frequency range of 1.5-15 ghz is the major band, covering significant wireless applications such as gps, gsm, wi-fi, wi-max, and umts [3]. the major research challenges for electrostatically actuated rf mems switches are reducing the required actuation voltage and improving the switching time and reliability. a proper iterative study helps to obtain better mechanical, electrical and rf properties of the switch. cantilever-based, serpentine, fixed-fixed, and folded membrane structures are popular in the design of mems devices. among these, cantilever-based devices offer a low actuation voltage and better switching properties [4-6]. there is, however, still room to improve the cantilever performance by iterative analysis. material science also helps to choose the most suitable thin film for the substrate, the transmission line and the membrane [7].

2.
related work

over the past decades, several researchers have advanced the research on rf mems switches. electrostatic, magnetostatic, piezoresistive, and thermal actuation are the popular techniques. among these, electrostatic actuation offers major advantages [8]. however, a few research challenges remain for electrostatically actuated rf mems switches, such as improving the reliability, reducing the actuation voltage, and improving the switching time [9, 10]. a prior iterative analysis helps to improve the performance of rf mems switches. material science has a prominent role in the selection of thin films for the transmission lines and the membranes. silicon or glass is generally used for the substrate [11]. the cpw and the membranes are micromachined in au, al, cu, or ti. for capacitive mems switches, the dielectric material plays an important role in improving the rf properties [12]. the rf properties of the switch, i.e., the insertion and isolation losses, depend on the capacitance ratio, defined as the ratio of the downstate capacitance to the upstate capacitance [13].

3. mathematical analysis

the critical stress analysis of the rectangular cantilever is indispensable because it primarily determines the switch reliability. the critical stress (σc) in terms of the cantilever dimensions and the young's modulus (e) can be expressed as [14],

$$\sigma_c = \frac{\pi^2 E t^2}{48\, l^2 (1 - \nu)} \tag{1}$$

where t and l are the cantilever thickness and length and ν is the poisson's ratio.

fig. 1 cantilever membrane

for the cantilever membrane shown in fig. 1, the stiffness is given by the spring constant (k), which can be written as [15],

$$k = \frac{E w t^3}{4 l^3} \tag{2}$$

the resonant frequency of the cantilever membrane can be written as

$$f_r = \frac{\omega_0}{2\pi} = \frac{1}{2\pi}\sqrt{\frac{k}{m}} \tag{3}$$

where m denotes the membrane mass, given as m = ρ·l·w·t. the time required for the mems switch to move from the up state to the down state is known as the switching time.
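as a rough numerical illustration of equations (2) and (3), the following python sketch evaluates the spring constant and the resonant frequency of a plain rectangular cantilever. the material constants for gold (young's modulus of 79 gpa, density of 19300 kg/m^3) and all function names are assumptions made for this example, not values taken from this work. slots, perforations and damping are ignored and the full membrane mass is used, exactly as in equation (3), so the numbers are not expected to reproduce the fem results reported for the final design.

```python
import math


def cantilever_stiffness(E, w, t, l):
    """Spring constant of a rectangular cantilever, eq. (2): k = E*w*t^3 / (4*l^3)."""
    return E * w * t**3 / (4.0 * l**3)


def membrane_mass(rho, l, w, t):
    """Membrane mass m = rho * l * w * t (solid membrane, no slots or holes)."""
    return rho * l * w * t


def resonant_frequency(k, m):
    """Resonant frequency, eq. (3): f_r = (1 / (2*pi)) * sqrt(k / m)."""
    return math.sqrt(k / m) / (2.0 * math.pi)


# Assumed bulk properties of gold (illustrative, not from the paper).
E_GOLD = 79e9        # Young's modulus in Pa
RHO_GOLD = 19300.0   # density in kg/m^3

# Cantilever length, width and thickness in metres (220 um x 200 um x 0.7 um).
l, w, t = 220e-6, 200e-6, 0.7e-6

k = cantilever_stiffness(E_GOLD, w, t, l)
m = membrane_mass(RHO_GOLD, l, w, t)
f_r = resonant_frequency(k, m)

print(f"k   = {k:.4f} N/m")
print(f"m   = {m:.3e} kg")
print(f"f_r = {f_r:.1f} Hz")
```

because k scales with t^3 while m scales with t, halving the thickness in this simplified model lowers the stiffness by a factor of eight and the resonant frequency by a factor of two.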
for an electrostatically actuated mems switch, the switching time can be expressed as

$$t_s \approx 3.67\,\frac{V_{pull\text{-}in}}{V_s\,\omega_0} \tag{4}$$

the insertion and isolation properties of the capacitive switch depend on the switch capacitance ratio. the upstate and downstate capacitances of the rf mems switch can be expressed as [16],

$$C_{up} = \frac{\varepsilon_0 A}{g + \dfrac{t_d}{\varepsilon_r}} \tag{5}$$

$$C_{down} = \frac{\varepsilon_0 \varepsilon_r A}{t_d} \tag{6}$$

where a is the overlap area between the membrane and the cpw strip, g is the air gap, and td is the dielectric thin film thickness. in terms of the return loss and the upstate and downstate capacitances, the insertion losses (s21) can be expressed as

$$S_{21}^2 = \frac{1}{S_{11}^2}\left(\frac{C_{up}}{C_{down}}\right)^2 \tag{7}$$

the isolation losses (s21) depend on the characteristic impedance (z0) and the rf frequency (f0) of the switch and can be expressed as

$$S_{21}^2 = \begin{cases} \dfrac{4}{\omega^2 C_{down}^2 Z_0^2} & \text{for } f \ll f_0 \\[2ex] \dfrac{4 R_s^2}{Z_0^2} & \text{for } f = f_0 \\[2ex] \dfrac{4 \omega^2 L^2}{Z_0^2} & \text{for } f \gg f_0 \end{cases} \tag{8}$$

where rs and l are the series resistance and inductance of the switch.

4. membrane iterative analysis

a rectangular cantilever structure, as shown in fig. 2, is considered from the point of view of the desired radio frequency requirement. its dimensions are given in table 1. we performed an iterative analysis, which helped to decrease the required actuation voltage.

fig. 2 performance improved cantilever structure with bottom electrode.

table 1 performance improved cantilever structure dimensions.

| parameter | variable | value (µm) |
|---|---|---|
| cantilever | cl | 220 |
| | cw | 200 |
| | ct | 0.5 |
| slot1 | l | 10 |
| | w | 160 |
| slot2 | l | 5 |
| | w | 180 |
| perforation | - | 5x5 |
| bottom electrode (be) | bel | 120 |
| | bew | 200 |
| | bet | 0.6 |

overall, we performed multiple iterations on the cantilever membrane by varying the membrane thickness, by placing slots and by incorporating the perforation. the iterations start with a gold cantilever of 220 µm length, 200 µm width and 1 µm thickness, as shown in fig. 3. in the design of rf mems switches, the validation of the membrane properties is very important.
the reliability of the switch depends on multiple parameters in the membrane damping analysis. with switch validation as the primary goal, we considered membrane damping in every iteration. on the whole, we observed the cantilever damping up to 8000 µs. in this iterative process, we noticed a few important points: incorporating slots into the membrane increases the damping duration but helps to reduce the actuation voltage, whereas incorporating holes into the membrane reduces the damping duration but increases the actuation voltage.

fig. 3 cantilever structure iterative analysis

we selected the 6th iteration membrane for the design of the final rf mems switch, i.e., a gold membrane with two slots, perforation and 0.7 µm thickness. it requires an actuation voltage of 7.3 v for a 3 µm displacement, and its switching time is 110 µs, as shown in fig. 4.

fig. 4 cantilever membrane, (a) the displacement distribution under electrostatic actuation, (b) displacement versus switching time

fig. 5 eigen frequencies

in the rf mems switch performance analysis, eigenfrequencies help to analyze the deformation of the membrane during electrostatic actuation, as shown in fig. 5. the real advantage of introducing holes into the membrane is that it helps to improve the insertion properties of the switch. the holes also facilitate the electrostatic actuation and make the release of the membrane during the fabrication process easier.
reducing the membrane thickness lowers the required actuation voltage, but the damping duration stays within limits only down to a certain thickness. if the membrane thickness is below 0.7 µm, the membrane damping duration exceeds the limits. in the 7th iteration, we noticed that for a 0.6 µm thickness the membrane undergoes continuous damping, which will lead to membrane collapse. we therefore selected the membrane with 0.7 µm thickness, which requires 7.3 v for a 3 µm displacement. the designed membrane resonates at 27 khz under electrostatic actuation, as shown in fig. 6.

fig. 6 resonant frequency

perforating the membrane also ensures an improved stress distribution, which improves the reliability of the switch. the stress distribution in the cantilever membrane is shown in fig. 7.

fig. 7 stress distribution in the designed cantilever membrane

5. rf mems switch

the rf mems switch is designed using the performance-improved rectangular membrane with slots and perforation. the cpw transmission line, with silicon used as the substrate, is shown in fig. 8. the height of the silicon substrate is 800 µm.

fig. 8 shunt capacitive rf mems switch with cantilever membrane

a dielectric thin film of 1 µm thickness is placed on top of the silicon substrate for better insulation. a cpw transmission line with a g-s-g of 85 µm - 70 µm - 85 µm is micromachined in gold (au). unlike traditional rf mems switches, this work incorporates a separate actuation electrode of 120 µm × 200 µm × 0.6 µm for the electrostatic actuation of the cantilever, which helps reduce the noise in the rf cpw line. hfo2 of 220 µm length and 70 µm width is used as the dielectric material. its relative dielectric permittivity (εr) is 23. the complete switch dimensions are presented in table 2. the electrostatic actuation with 7.3 v creates an electrostatic force of 7.5 × 10^-7 n.
the membrane spring constant is 0.25 n/m. the capacitance analysis results with high relative permittivity thin films are listed in table 3. the designed rf mems switch shows an isolation of -63.2 db and an insertion loss of -0.12 db, as shown in fig. 9 and fig. 10, respectively. our work is compared with the state of the art in table 4.

fig. 9 isolation losses

fig. 10 insertion losses

table 2 shunt capacitive rf mems switch dimensions

| parameter | description | value (µm) | parameter | description | value (µm) |
|---|---|---|---|---|---|
| sl | substrate dimensions | 800 | dl | dielectric | 220 |
| sw | | 500 | dw | | 70 |
| st | | 800 | bpl | bias line | 50 |
| g-s-g | cpw line & slots | 85-70-85 | bpw | | 50 |
| d | | 120 | g | | 10 |
| e | | 40 | h | | 185 |
| f | | 300 | i | | 50 |

table 3 capacitance ratio

| material | dielectric constant (εr) | dielectric thickness (dt) | upstate capacitance (cup) | downstate capacitance (cdown) | capacitance ratio = cdown/cup |
|---|---|---|---|---|---|
| aln | 9.8 | 0.1 µm | 73.9 ff | 11 pf | 148.8 |
| gaas | 12 | 0.1 µm | 75.6 ff | 13.5 pf | 178.5 |
| hfo2 | 23 | 0.1 µm | 77.3 ff | 26 pf | 336.3 |

table 4 our work comparison with state-of-art

| parameter | [17] | [18] | our work |
|---|---|---|---|
| substrate | glass | silicon | silicon |
| insulator | - | sio2 | sio2 |
| micro mechanical structure | cantilever | cantilever | cantilever |
| damping analysis is performed | no | no | yes |
| air gap (µm) | 3 | 3 | 3 |
| actuation voltage (v) | 16 | 19 | 7.3 |
| total reaction electrostatic force (n) | - | - | 7.5 × 10^-7 |
| displacement (µm) | 3 | 3 | 3 |
| spring constant (n/m) | - | - | 0.25 |
| upstate & downstate capacitances | - & 2.75 pf | - & 0.02 pf | 77.3 ff & 26 pf |
| insertion loss (db) | -0.41 | 0.05 | -0.01 to -0.12 |
| isolation loss (db) | -20 | -43 | -20 to -63.2 |

6. conclusion

the micro-cantilever based, electrostatically actuated shunt capacitive rf mems switch is designed, and after multiple iterations of cantilever structure modification the proposed structure requires a low actuation voltage of 3.34 v for a 3 µm deformation.
this low actuation voltage is a result of identifying the critical membrane thickness of 0.5 µm, and incorporating two slots and an array of holes into the membrane. a similar iterative approach is used to design the final rf mems switch. the rf mems switch is micromachined on a cpw transmission line with a g-s-g of 85 µm - 70 µm - 85 µm. the switch rf isolation properties are analyzed for different high dielectric constant thin films, including aln, gaas, and hfo2. for all the dielectric thin films the rf mems switch shows a high isolation of -63.2 db, but there is a shift in the radio frequency.

references

[1] l. n. thalluri, k v v kumar, k r sekhar, n bhushana babu d, s s kiran, koushik guha, “iterative approach and structure damping analysis to advance the performance of shunt capacitive rf mems switch”, in proceedings of the international conference on micro/nano electronics devices, circuits and systems (mndcs-2021), 30-31 january, 2021, india.
[2] h. r. ansari, s. khosroabadi, “design and simulation of a novel rf mems shunt capacitive switch with a unique spring for ka-band application”, microsystem technologies, 2018.
[3] l. n. cheulkar, v. b. sawant and s. s. mohite, “evaluating performance of thermally curled microcantilever rf mems switches”, materials today: proceedings, 2019.
[4] v. v. reddy, “frequency reconfigurable fractal patch circularly polarized antennas for gsm/wi-fi/wimax applications”, iete journal of research, 2019.
[5] m. perić, s. ilić, s. aleksić, n. raičević, m. bichurin, a. tatarenko, r. petrov, “covered microstrip line with ground planes of finite width”, facta universitatis series: electronics and energetics, vol. 27, no. 4, 2014.
[6] t. l. narayana, k. g. sravani & k. s. rao, “design and analysis of cpw based shunt capacitive rf mems switch”, cogent engineering, 2017.
[7] s. agarwal, r. kashyap, k. guha, s.
baishya, “modeling and analysis of capacitance in consideration of the deformation in rf mems shunt switch”, superlattices and microstructures, 2016.
[8] j. iannacci, “rf–mems for high–performance and widely reconfigurable passive components – a review with focus on future telecommunications, internet of things (iot) and 5g applications”, journal of king saud university science, 2017.
[9] j. iannacci, “rf-mems technology as an enabler of 5g: low-loss ohmic switch tested up to 110 ghz”, sensors and actuators a, vol. 279, pp. 624–629, 2018.
[10] t. ciric, z. marinković, r. dhuri, o. pronić-rančić, v. marković, “hybrid neural lumped element approach in inverse modeling of rf mems switches”, facta universitatis series: electronics and energetics, vol. 33, no. 1, 2020.
[11] h. kuisma, a. cardoso, t. braun, “fan-out wafer-level packaging as packaging technology for mems”, 2020.
[12] r. laishram, o. p. thakur, d. k. bhattacharya, harsh, anshu goyal, renu sharma, jagbir singh, and ramjay pal, “low temperature deposited bst thin films for rf mems switch”, integrated ferroelectrics, vol. 116, pp. 35–40, 2010.
[13] i. tittonen, m. koskenvuori, “electrostatic and rf-properties of mems structures”, silicon based mems materials and technologies, 2020.
[14] t. singh, a. elhady, h. jia, a. mojdeh, c. kaplan, v. sharma, m. basha, e. abdel-rahman, “modeling of low-damping laterally actuated electrostatic mems”, mechatronics, vol. 52, 2018.
[15] t. zengerle, j. joppich, p. schwarz, a. ababneh, h. seidel, “modeling the damping mechanism of mems oscillators in the transitional flow regime with thermal waves”, sensors and actuators: a. physical, 2020.
[16] j. kaczynski, c. ranacher, c. fleury, “computationally efficient model for viscous damping in perforated mems structures”, sensors and actuators a, vol. 314, 2020.
[17] m. angira, d. bansal, p. kumar, k. mehta, k.
rangra, “a novel capacitive rf-mems switch for multifrequency operation”, superlattices and microstructures, vol. 133, 2019.
[18] o. pertin, kurmendra, “pull-in-voltage and rf analysis of mems based high performance capacitive shunt switch”, microelectronics journal, vol. 77, 2018.

facta universitatis series: electronics and energetics vol. 29, no 2, june 2016, pp. 233-241 doi: 10.2298/fuee1602233j

algorithm for uptake assessment in small lesions based on dynamic scintigraphy scans *

milica m. janković1, vera miler jerković1, ana koljević marković2, dejan b. popović1
1university of belgrade – faculty of electrical engineering, belgrade, serbia
2national cancer research center of serbia, belgrade, serbia

abstract. the aim of our research was to develop an algorithm for the estimation and visualisation of radiopharmaceutical uptake based on time-activity-curve (tac) analysis in small regions of interest (roi) in scintigraphic studies. the algorithm is implemented in the labview environment (national instruments, austin, texas) and comprises the following steps: 1) delineation of a grid of small rois over the examined tissue and the corresponding tac processing; 2) background vs tissue separation; 3) the extraction of all “suspected” rois where tacs are not exponentially descending; 4) correlation analysis between the tac corresponding to the central suspected roi and the tacs of neighboring rois; 5) the extraction of a representative tac for the “suspected” area by the principal component analysis technique; and 6) visual interpretation of the radiopharmaceutical distribution in the “suspected” area. the application of the algorithm is presented on data recorded in cases of histopathologically proven parathyroid tumors.

key words: uptake, time activity curve, principal component analysis, parathyroid tumor

1. introduction

scintigraphy is a nuclear medicine diagnostic test for the visualization of the spatial distribution of radioactivity uptake in a tissue.
radioactivity is administered by injection, inhalation or swallowing of medical agents (radiopharmaceuticals) with incorporated radioisotopes, and the spatial distribution of radioactivity uptake is monitored by a planar scintillation camera, a spect (single photon emission computed tomography) camera or a pet (positron emission tomography) camera. dynamic scintigraphy is a diagnostic test for examining the function of organs and physiological systems. the result of this type of scintigraphy is a series of frames (dynamic scintigrams) recorded in short time intervals (10 seconds to 1 minute apart, depending on the type of organ and disease). the time activity curve (tac) is a quantitative indicator of radioactivity uptake changes in a specific region of interest (roi) over time. distinguishing typical tac patterns is of great importance for diagnostic purposes. besides the diagnostic application, scintigraphy has a very important place as a preoperative imaging technique whose main goal is the precise localization of lesions in order to perform minimally invasive surgery [2-5]. in our previous work, we presented the submarine method, based on tac monitoring in small rois and finding abnormal tac patterns corresponding to lesions [6]. the submarine method has proven useful for preoperative dynamic scintigraphic imaging of small lesions, especially in the case of parathyroid imaging [6-8].

* an earlier version of this manuscript received the best oral paper award of the biomedical section at the 58th etran conference, vrnjačka banja, 2-5 june, 2014 [1].
received february 23, 2015; received in revised form june 14, 2015
corresponding author: milica m. janković, university of belgrade – faculty of electrical engineering, bulevar kralja aleksandra 73, belgrade, serbia (e-mail: piperski@etf.rs)
in this paper we introduce an algorithm that allows precise uptake assessment in small lesions based on dynamic scintigrams, and visual interpretation of the uptake distribution in the lesion area based on visualization of the correlation matrix [1,8]. this algorithm is implemented as an additional tool in the submarine software.

2. methods and materials

the typical tac pattern of healthy tissue consists of three phases: an increasing vascular phase (the radioactivity in the target roi grows rapidly), an accumulation uptake phase (radioactivity is accumulated in the target roi) and a washout phase (a phase of exponential radioactivity decrease in the target roi) [9]. in the case of lesions, an atypical tac pattern (prolonged retention of the radiopharmaceutical in the target tissue or even a peak of radioactivity in the washout phase) can be observed, fig. 1.

fig. 1 difference in tac patterns for healthy tissue and lesion

in the case of small lesions (<1 cm^3), it is very difficult or impossible to visually detect abnormal radioactivity uptake in individual frames, while it is clearly visible in the washout phase of the tac, fig. 2. the central roi of the lesion is delineated in black in fig. 2a, and another three rois shifted relative to the central roi are also delineated. the tacs corresponding to the highlighted rois are presented in fig. 2b (tac1, tac2, tac3, tac4). a high degree of correlation can be observed between curves tac1 and tac2 (r=0.93) and between tac1 and tac3 (r=0.97), versus the substantially lower correlation between tac1 and tac4 (r=0.67). tacs corresponding to regions positioned over the lesion are not exponential in the washout phase and are strongly correlated. this fact is used for defining the algorithm for uptake assessment and its visualization in small lesions.

fig.
2 a) a single frame from a dynamic image sequence, taken at the 23rd minute, with delineation of rois in the region of the lesion (4×4 pixels, 6×6 mm); b) tacs corresponding to the rois delineated in a)

2.1. software

the algorithm is implemented in the software for reading and processing dynamic studies introduced by the authors in previous work [10]. the software is developed in the labview 8.6 environment (national instruments, austin, texas) with the additional ni labview biomedical toolkit. the realized application enables:
• selection and readout of a dynamic scintigraphic study consisting of dicom [11] images (each frame is archived as a separate .dcm file);
• rectangular cropping of frames to the region that is to be processed further; the cropping position is selected on one frame and visually inspected in all frames;
• localization and visualization of small lesions by the algorithm for uptake assessment presented in section 1.2.

checking of the principal component analysis conditions (see section 1.3) for the examples presented in section 2 was performed in rstudio, version 0.98.976.

2.2. algorithm description

the algorithm for small lesion localization and visualization consists of six steps shown in fig. 3.

fig. 3 flowchart of the algorithm for the uptake assessment in small lesions. roi – region of interest, tac – time activity curve

step 1
the cropped area, containing the tissue that will be examined, is automatically divided into N small square rois of equal size n × n, where n is the number of pixels (n=4 is the default value, but the user can change it). this number of rois (N) will be reduced in step 2 to the number of rois (T) that belong to the tissue (T≤N). the number of tissue rois (T) will be reduced in step 3 to the number of rois (M) whose tacs are not exponentially descending (M≤T) and thereby indicate abnormal radioactivity uptake and a potential lesion.
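the division of the cropped area into rois described in step 1 can be sketched in python as follows. this is only an illustrative sketch and not the labview implementation: the function name roi_tacs is hypothetical, each tac sample is assumed to be the sum of the counts inside the roi, and incomplete rois at the border of the cropped area are assumed to be discarded.

```python
def roi_tacs(frames, n=4):
    """Split each frame into n x n pixel ROIs and build one raw TAC per ROI.

    frames: list of 2D lists (rows x cols) of counts, one per time point,
            all with the same dimensions.
    Returns a dict mapping (roi_row, roi_col) to a list with one value per
    frame, here the sum of counts in that ROI (an assumption of this sketch).
    """
    rows, cols = len(frames[0]), len(frames[0][0])
    tacs = {}
    # Step through the image in blocks of n pixels; partial blocks are skipped.
    for r0 in range(0, rows - n + 1, n):
        for c0 in range(0, cols - n + 1, n):
            tacs[(r0 // n, c0 // n)] = [
                sum(frame[r][c]
                    for r in range(r0, r0 + n)
                    for c in range(c0, c0 + n))
                for frame in frames
            ]
    return tacs


# Tiny synthetic example: three 8 x 8 frames with uniform counts 1, 2 and 3.
demo_frames = [[[value] * 8 for _ in range(8)] for value in (1, 2, 3)]
demo_tacs = roi_tacs(demo_frames, n=4)
print(len(demo_tacs), "ROIs")
print(demo_tacs[(0, 0)])
```

with the default n=4 and the 1.5 mm pixel size used in the recordings, each roi covers 6 × 6 mm.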
tacs corresponding to all N roi cells are calculated and smoothed by the cubic spline technique using the labview function cubic spline fit.vi [12]. the user can adjust the value (range [0,1]) of the balance parameter (an input parameter of the cubic spline function), subject to the requirement that the coefficient of determination (r-square) is greater than 80% (the user sets the minimum balance parameter for which r-square>80%). the labview function goodness of fit.vi is used to estimate the r-square value from the raw tac and the cubic spline filtered tac.

step 2 maximum values of radioactivity $TAC^{CS}_{i,\max}$ are calculated for all $TAC^{CS}_{i}$ ($i = 1, \dots, N$). the reference value P for discriminating tissue from background is calculated according to the following equation:

$P = \max_{i}\left(TAC^{CS}_{i,\max}\right), \quad i = 1, \dots, N$ (1)

further analysis continues only for those T rois (T ≤ N) that belong to the tissue, i.e. those that satisfy the condition $TAC^{CS}_{i,\max} > m \cdot P$, 0 [...] 0.8), because the prerequisite for pca is good correlation, or too high, in order to avoid multicollinearity (cxy<0.9) [13]. the value of the determinant of the correlation matrix indicates multicollinearity or singularity among the original variables and should not be less than 0.00001; a value below 0.00001 means that some variables are too highly correlated. the kaiser-meyer-olkin (kmo) index is a measure of sampling adequacy [15]. it compares correlations and partial correlations between variables. kmo takes values between 0 and 1, and should be greater than 0.5 if the sample is adequate. bartlett's test of sphericity examines the null hypothesis "the variables are uncorrelated, i.e. the correlation matrix is an identity matrix". therefore, we need a p-value < 0.05 in this test to conclude that the null hypothesis can be rejected.
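the three suitability checks just described (determinant, kmo and bartlett's test) can be sketched with numpy as follows. the authors performed these checks in rstudio; the function pca_suitability, the synthetic tac data and the choice to return the chi-square statistic rather than a p-value are assumptions made for illustration. the formulas used are the standard textbook definitions of the kmo index and of bartlett's sphericity statistic.

```python
import numpy as np


def pca_suitability(X):
    """Pre-analysis checks for PCA on X with shape (n_samples, n_variables).

    Here each column would be one TAC and each row one time point.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    R = np.corrcoef(X, rowvar=False)          # correlation matrix
    det = float(np.linalg.det(R))             # multicollinearity indicator

    # Partial correlations follow from the inverse correlation matrix.
    inv = np.linalg.inv(R)
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)

    off = ~np.eye(p, dtype=bool)              # off-diagonal mask
    r2 = np.sum(R[off] ** 2)
    a2 = np.sum(partial[off] ** 2)
    kmo = float(r2 / (r2 + a2))               # Kaiser-Meyer-Olkin index

    # Bartlett's sphericity statistic, chi-square with p(p-1)/2 d.o.f.
    chi2 = float(-((n - 1) - (2 * p + 5) / 6.0) * np.log(det))
    dof = p * (p - 1) // 2
    return {"determinant": det, "kmo": kmo, "chi2": chi2, "dof": dof}


# Synthetic example (assumption): four noisy, correlated washout curves.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 35)
X = np.column_stack([np.exp(-3.0 * t) * (1.0 + 0.1 * k)
                     + 0.05 * rng.standard_normal(t.size) for k in range(4)])
result = pca_suitability(X)
print(result)
```

the p-value of bartlett's test is the upper tail probability of the chi-square distribution with dof degrees of freedom evaluated at chi2 (for example via scipy.stats.chi2.sf, if scipy is available).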
for choosing the number of principal components we used the kaiser rule and the scree plot, combined with the amount of total variance explained by the chosen principal components (a total variance above 80 % is usually suggested) [16]. rotation of the principal components is used to improve the interpretation of results. we have chosen the orthogonal rotation – varimax [16].

3. results and discussion

we demonstrate the results of the suggested algorithm for radioactivity uptake assessment in two patients who underwent parathyroid scintigraphy at the national cancer research center of serbia, belgrade. scintigraphic recording was performed in patients suspected of having primary hyperparathyroidism (phpt) based on previous biochemical analysis (increased level of parathyroid hormone) and positive ultrasound findings. patient data (biochemical, ultrasound, biopsy) are shown in table 1.

table 1 patients: biochemical, ultrasound and biopsy data. phpt – primary hyperparathyroidism, pth – parathyroid hormone

data                      patient 1              patient 2
gender                    female                 male
age [years]               69                     19
pth [pg/ml]               223                    125
phpt ultrasound           positive               positive
previous thyroidectomy    no                     yes, right
histopathology            parathyroid adenoma    parathyroid cancer
tumor position            right inferior         left superior
tumor volume [mm^3]       60                     55

a siemens e.cam camera and siemens syngo e.soft 2007 software (siemens ag, erlangen, germany) were used for image acquisition. after intravenous 99mtc-mibi administration (with a radioactivity of 500 mbq, 13.5 mci), 35 minutes of dynamic parathyroid scintigraphy (1 frame/min, image matrix 128×128, pixel size 1.5 mm, zoom 3.2, anterior view) were performed. results of the pre-analysis of pca data are presented in table 2. all pca criteria from section 2.3 are satisfied (determinant value>0.00001, kmo>0.5, p-value<0.05 for bartlett's test).
the first principal component carries more than 80% of the total variance, which means that it is representative of the tac changes. fig. 5 shows the results of the algorithm applied in patient 1 for two dimensions of roi cells (3×3 pixels and 4×4 pixels). visual inspection of the standard dynamic scintigram at the moment of the radioactivity peak in the washout phase cannot distinguish the lesion from healthy tissue, unlike the suggested parametric imaging (fig. 5a). better discrimination between lesion and healthy tissue is evident for the smaller roi dimension, which is closer to the real lesion localization (compare fig. 5a left and right). representative tac patterns are presented in fig. 5b. the position of the small parathyroid adenoma (right inferior) was surgically confirmed.

table 2 results of pre-analysis pca data

parameters                        patient 1                  patient 1              patient 2
                                  3×3 pixels (4.5×4.5 mm)    4×4 pixels (6×6 mm)    4×4 pixels (6×6 mm)
determinant value                 0.00006                    0.00009                0.00008
kaiser-meyer-olkin value          0.855                      0.806                  0.805
bartlett's test (p-value)         0.000                      0.000                  0.000
number of principal components    1                          1                      1
amount of total variance [%]      89.97                      91.19                  89.02

fig. 5 a) a single frame from a dynamic image sequence, taken at the 22nd minute, and visual interpretation of lesion localization by the introduced algorithm b) representative tac patterns obtained by principal component analysis

fig. 6a shows the results of the visualization algorithm applied in patient 2. the representative tac pattern is presented in fig. 6b. the position of the small parathyroid cancer (left superior) was surgically confirmed.

fig. 6 a) a single frame from a dynamic image sequence, taken at the 24th minute, and a visual interpretation of lesion localization by the suggested algorithm b) representative tac pattern obtained by principal component analysis

4.
conclusion in this paper, we introduced an algorithm that enables the display of orientation, shape and boundaries of lesions. the algorithm visualizes the propagation of tac correlation in the lesion area. the application of such algorithms is desirable in preoperative diagnostics in order to plan surgery. further investigation will be related to the development of fully automated algorithm for lesion localization from dynamic scintigrams and its evaluation in a larger population with different oncological diseases. acknowledgement: the paper is financially supported by the ministry of education, science and technological development of the republic of serbia (no. 175016) and the company national instruments (slovenia, ljubljana). references [1] m. m. janković, v. miler jerković, a. koljević marković and d. b. popović, "algorithm for the uptake assessment in small lesions in dynamic scintigraphy ", in proceedings of the 58 th etran conference, 25 june, vrnjačka banja, 2014, pp. me 1.1 1-4 [in serbian]. [2] d. fuster, s. vidal-sicart, t. josé-vicente, p. paredes, d. rubello and f. pons, "what is the role of preoperative scintigraphic imaging and the intraoperative gamma probe in secondary hyperparathyroidism?" nucl med commun, vol. 35, no. 5, pp. 443-445, 2014. [3] i. stoffels, m. müller, m.h. geisel, j. leyh, t. pöppel, d. schadendorf and j. klode, "cost-effectiveness of preoperative spect/ct combined with lymphoscintigraphy vs. lymphoscintigraphy for sentinel lymph node excision in patients with cutaneous malignant melanoma", eur j nucl med mol, [epub ahead of print] 2014. [4] m. ibusuki, y. yamamoto, t. kawasoe, s. shiraishi, s. tomiguchi, y. yamashita, y. honda, k. iyama and h. iwase, "potential advantage of preoperative three-dimensional mapping of sentinel nodes in breast cancer by a hybrid single photon emission ct (spect)/ct system", surg oncol, vol. 19, no. 2, pp. 8894, 2010. [5] m. giuliano, s.a. gulec, d. rubello, g. boni, m. puccini, m.r. pelizzo, g. 
manca, d. casara, g. sotti, p. erba, d. volterrani and a.e. giuliano, "preoperative localization and radioguided parathyroid surgery", j nucl med, vol. 44, no. 9, pp. 1443-1458, 2003. [6] a. koljević marković, m. m. janković, i. marković, g. pupić, r. džodić and a. b. delaloye, "parathyroid dual tracer subtraction scintigraphy: small regions method for quantitative assessment of parathyroid adenoma uptake", ann nucl med, vol. 28, pp. 736-745, 2014. [7] m. đurović m, m. m. janković and a. koljević marković, " semi-automatic localization of parathyroid tumors in dynamic sestamibi scintigrams ", in proceedings of the 22 nd telecommunications forum telfor 2014, 25-27 november, belgrade, 2014, pp. 955-958 [in serbian]. [8] m. m. janković, " computer system for acquiring, storing, retrieving and processing images obtained by gamma camera ", phd thesis, university of belgrade – faculty of electrical engineering, 2014 [in serbian]. [9] m. p. sandler, r. e. coleman, j. a. patton, f. j. th. wackers and a. gottschalk, diagnostic nuclear medicine, 4th ed. philadelphia: lippincott williams & wilkins, 2003. [10] m. m. janković, a. koljević marković and d. b. popović, " labview application for analysis of time activity curves in regions of small lesions in nuclear medicine ", in proceedings of the 57 th etran conference, 3-6 june, zlatibor, 2013, pp. me 1.9 1-5 [in serbian]. [11] http://dicom.nema.org/ [12] j. s. fleming and r. w. kenny, "a comparison of techniques for the filtering of noise in the renogram, " phys med biol, vol. 22, no. 2, pp. 359-364, mar. 1977. [13] j. e. jackson and j. wiley, a user's guide to principal components. new york: john wiley and sons, inc., 1991. [14] t. m. lehmann, c. gönner and k. spitzer, "interpolation methods in medical image processing, " ieee trans med imag, vol. 18, no. 11, pp. 1049-1075, nov. 1999. [15] h. f. kaiser, "an index of factorial simplicity," psychometrika, vol. 39, pp. 31-36, 1974. [16] i. t. 
jolliffe, principal component analysis. 2nd ed. new york, usa: springer, 2002. http://dicom.nema.org/ instruction facta universitatis series: electronics and energetics vol. 28, n o 4, december 2015, pp. 637 643 doi: 10.2298/fuee1504637j investigation on cylindrical gate all around  (gaa) to nanowire mosfet for circuit application biswajit jena 1 , kumar prasannajit pradhan 2 , prasanna kumar sahu 2 , sidharth dash 1 , guru prasad mishra 1 , sushanta kumar mohapatra 2 1 device simulation lab, institute of technical education & research, siksha 'o' anusandhan university, khandagiri, bhubaneswar, odisha-751030, india 2 nano electronics laboratory, department of electrical engineering, national institute of technology (nit), rourkela, 769008, odisha india. abstract. undoped cylindrical gate all around (gaa) mosfet is a radical invention and a potential candidate to replace conventional mosfet, as it introduces new direction for transistor scaling. in this work, the sensitivity of process parameters like channel length (lg), channel thickness (tsi), and gate work function (φm) on various performance metrics of undoped single material (sm) and double material (dm) cylindrical gaa (cgaa) to nanowire mosfet are systematically analyzed. the electrical characteristics such as on current (ion), subthreshold leakage current (ioff), the threshold voltage (vth) and transconductance (gm) are evaluated and studied with the variation of device design parameters. the discussion gives the direction towards low standby operating power (lstp) devices as improvement in ioff is approaching 90% in nanowire mosfets. all the device performances of undoped sm and dm cgaa mosfets are investigated through sentaurus device simulator from synopsys inc. key words: cylindrical gate all around, mosfets, sces, analog and rf foms 1. 
introduction to get low cost, high operational speed and better performance, the dimension of the conventional transistors need to be downscaled to sub-nanometer region. the reduction of mosfet dimensions will degrade the gate control over the channel due to the close proximity between the source and drain. this leads to increase various short channel effects (sces) like hot carrier effect, threshold voltage roll-off, and substrate bias effect [1], [2]. many new devices have been introduced in beyond moore’s era [3]–[5] to suppress the sces and enable further scaling down the device. similarly, some multi-gate silicon on insulator (soi) technology has also been proposed to replace the conventional received february 19, 2015; received in revised form may 15, 2015 corresponding author: k. p. pradhan nano electronics laboratory, department of electrical engineering, national institute of technology (nit), rourkela, 769008, odisha india (e-mail: skmctc74@gmail.com) 638 b. jena, k. p. pradhan, p. k. sahu, s. dash, g. p. mishra, s. k. mohapatra mosfet [6]–[10]. however, the cylindrical gate all around (cgaa) mosfet is one of the novel devices which further enables the scaling without hindering the device performance [11]. because of the low characteristic length and higher drive current, cgaa mosfets can achieve higher packing density as compared to the double gate (dg) mosfets [12]–[16]. also, cgaa mosfet has excellent electrostatic control of the channel, robustness against sces, better scaling options, no floating body effect, larger equivalent number of gates, ideal subthreshold swing as compared to other multiple-gate mosfets. hence, the cgaa mosfets are a promising solution for nanoscale technology cmos devices [17]–[21]. and the important device parameters like threshold voltage (vth), and on-off ratio (ion/ioff), are very much sensitive to the device geometry such as channel length (lg), channel thickness (tsi), and gate work function (φm). 
thus, the authors have taken an attempt to present a detailed analysis of the performance dependency of sm and dm cgaa mosfets on device geometry variation. in this paper, different performance metrics, like drain current (id), and transconductance (gm) are systematically presented with the variation of lg, φm, and tsi. along with the introduction, section 2 describes the device structure description that includes all the dimensions, materials and doping concentrations of both sm and dm cgaa mosfets. this section also analyses the physics of the device using device numerical simulations and models activated for simulation. section 3 comprises of all results and discussion. finally, the concluding remarks are presented in section 4. 2. device description and simulation setup the schematic diagram of the fully depleted single material (sm) and dual material (dm) cylindrical gaa (cgaa) mosfet structures used for modeling and simulation are shown in fig. 1 (a) and (b) respectively. the radial and lateral directions of the channel are assumed to be along the radius and the z-axis of the cylinder as shown in fig. 1. the source and drain of the device are uniformly doped with doping concentration of nd = 1× 10 20 cm −3 . the channel is kept undoped. the gate oxide thickness is tox = 1.1 nm. the metal gate work functions, φm=4.6 ev for sm and φm1=4.6 ev, φm2=4.6 ev for dm are considered. (a) (b) fig. 1 schematic structure of cylindrical gate all around (gaa) mosfet (a) single metal gate (b) dual metal gate the simulation is carried out by the device simulator sentaurus, a 3-d numerical simulator from synopsis inc. [22]. to obtain accurate results for mosfet simulation, we cylindrical gate all around (gaa) to nanowire mosfet 639 need to account for the mobility degradation that occurs inside inversion layers. 
the default carrier transport model in sentaurus, the drift-diffusion model, is activated. for the drift-diffusion model, the current densities for electrons and holes are given by:

$J_n = \mu_n \left( n \nabla E_C - 1.5\, n k T\, \nabla \ln m_n \right) + D_n \left( \nabla n - n \nabla \ln \gamma_n \right)$ (1)

$J_p = \mu_p \left( p \nabla E_V + 1.5\, p k T\, \nabla \ln m_p \right) - D_p \left( \nabla p - p \nabla \ln \gamma_p \right)$ (2)

where Jn and Jp are the electron and hole current densities, μn and μp are the electron and hole mobilities, and n and p are the electron and hole densities. γn and γp are the fermi statistics factors, and mn and mp are the spatial effective masses of electrons and holes, respectively. T and k are the temperature and the boltzmann constant. EC and EV are the conduction and valence band energies. Dn and Dp are the diffusion constants for electrons and holes, respectively.

in the simulation, a basic mobility model is used that takes into account doping dependence, high-field saturation (velocity saturation), and transverse field dependence. the silicon band gap narrowing model that determines the intrinsic carrier concentration is activated. models for quantum mechanical effects have not been invoked when the radius of the silicon pillar is changed from 10 nm to 5 nm [23]. a uniform distribution of interface fixed charges of 4×10^11 cm^-2 has been used in the simulation. the electron and hole surface recombination velocities are taken as 1×10^4 cm/s. the models activated in the simulation comprise field dependent mobility, concentration dependent mobility and the velocity saturation model. the model parameters taken from the lookup table are the carrier mobilities µno=1076 cm^2/v·s and µpo=460.9 cm^2/v·s and the electron and hole lifetimes τn=τp=1×10^-7 s; suitable empirical parameters βn, βp are selected to calibrate the drift-diffusion transport model.

3.
results and discussion

in order to analyze the impact of channel length (lg), channel thickness (tsi), and gate work function (φm) on the device performance, the simulation is carried out by varying these parameters. fig. 2(a) and (b) show the drain current (id) on a linear scale as a function of the gate to source voltage (vgs) for different lg of both sm- and dm-cgaa mosfets. the dm-cgaa mosfets show a significant improvement in drive current as compared to the sm-cgaa mosfets. in fig. 2, lg varies from 28 nm to 70 nm, and we can observe that a decrease in lg results in a shift of the characteristics. the on-state current (ion) increases markedly as lg decreases below 30 nm. a shorter channel gives a higher drain current because of the relation id ∝ 1/l.

fig. 3 presents the id-vgs characteristics for different values of metal gate work function (φm) for both sm- and dm-cgaa mosfets. the work function is varied from 4.6 ev to 5.1 ev for sm-cgaa, and φm2 is varied from 4.2 ev to 4.5 ev with φm1=4.6 ev for dm-cgaa, at vds=50 mv (sub-threshold region of operation). in the case of dm-cgaa, φm2 is varied in such a way that it satisfies the design condition φm1>φm2. the results illustrate that the off-state leakage current (subthreshold performance) of the device improves for higher values of metal gate work function. a higher φm increases the threshold voltage, which reduces the leakage current and improves the subthreshold behavior of the device.

(a) (b) fig. 2 drain current (id) in linear scale as a function of gate to source voltage (vgs) for vds=50 mv with variation in lg (28 nm to 70 nm) (a) sm-cgaa (φm=4.6 ev) (b) dm-cgaa (φm1=4.6 ev, φm2=4.4 ev)

(a) (b) fig. 3 drain current (id) in log scale as a function of the gate to source voltage (vgs) for vds=50 mv for different φm (a) sm-cgaa (b) dm-cgaa

fig.
4 (a) and (b) reveal the id dependency on silicon body thickness (tsi) of both sm and dm cgaa mosfets. the characteristic of ioff is also influenced by tsi, which is cleared from fig. 4. as the silicon film gets thinner, there is a significant improvement in leakage current because no further leakage path is available far from the gate. the dm devices show a little higher ioff than sm device cases, but they predict higher drive current (ion) as compared to sm counterparts, which is verified from fig. 2. transconductance (gm) as a function of id for both sm and dm-cgaa mosfets are presented in fig. 5(a) and (b) respectively. from the figure, it is clear that as the channel length decreases the gm value is increasing because of high drain current. the high gm will further enhance the transconductance generation factor (tgf=gm/id) which is the requirement for the realization of circuits operating at low supply voltage. by comparing fig. 5(a) and (b), the dm devices are superior to their sm counterpart. all the extracted and calculated values of dc performances are tabulated in table 1, and table 2, with the variation of silicon body thickness (tsi), and channel length (lg) of both sm and dm-cgaa mosfets. table 1 compares and analyzes the sensitivity of tsi cylindrical gate all around (gaa) to nanowire mosfet 641 on various important parameters like ion, ioff, and vth. we can well control the vth and sces like off state leakage current by reducing tsi with a little compromise in on-state current. hence, people always prefer a ultra-thin body (utb) fully depleted (fd) soi mosfet as the body is completely controlled by the gate and there is no leakage path far from the gate. however, by considering two different gate metals, we can drastically enhance the drive current of the devices. (a) (b) fig. 
4 drain current (id) in log scale as a function of gate to source voltage (vgs) for vds=50 mv with variation in tsi (10 nm to 20 nm) (a) sm-cgaa (φm=4.6 ev) (b) dm-cgaa (φm1=4.6 ev, φm2=4.4 ev)

(a) (b) fig. 5 gm as a function of id at vds=50 mv with variation in lg (a) sm-cgaa (φm=4.6 ev) (b) dm-cgaa (φm1=4.6 ev, φm2=4.4 ev)

table 2 summarizes the corresponding dc performances of both sm and dm devices for different values of channel length. it is clear from table 2 that as the gate length is reduced, analog performance such as transconductance (gm) increases because of the higher drain current of shorter gate length devices. however, a device with shorter lg is more prone to sces because of its high ioff.

table 1 dc performance measures with tsi variation at vds=50 mv

            sm-cgaa (φm=4.6 ev)                  dm-cgaa (φm1=4.6 ev, φm2=4.4 ev)
tsi (nm)    ion (μa)   ioff (pa)   vth (v)       ion (μa)   ioff (na)   vth (v)
10          3.99       1.27        0.40          4.04       0.71        0.28
15          8.31       7.27        0.40          8.71       3.69        0.245
20          11.4       9.70        0.38          13.1       29.2        0.21

table 2 analysis of different parameters with lg variation at vds=50 mv

            sm-cgaa (φm=4.6 ev)                  dm-cgaa (φm1=4.6 ev, φm2=4.4 ev)
lg (nm)     ion (μa)   ioff (pa)   vth (v)       ion (μa)   ioff (pa)   vth (v)
28          3.68       2.44        0.382         4.04       711         0.28
40          3.67       0.114       0.424         3.85       7.21        0.42
55          3.40       0.0386      0.431         3.63       0.838       0.45
70          3.16       0.0248      0.432         3.30       0.360       0.455

4. conclusion

a cylindrical gate all around (gaa) mosfet with gate engineering, i.e., a single gate material (sm) and two different gate electrode materials (dm), is explored, and the performance evaluation is carried out through extensive device simulation with the sentaurus simulator. the sensitivity of various dc performances to device parameters like tsi, φm, and lg is systematically presented. improvement in device performance for low standby operating power (lstp) applications can be achieved with reduced body thickness and higher gate work function.
the subthreshold leakage current is significantly improved when the device approaches to the nanowire, i.e., tsi=10 nm, and for higher φm values. similarly, continuous miniaturization of lg is required for getting high ion and gm. the dm-cgaa shows a higher drive current as compared to sm counterpart with little compromise in off state leakage current. hence, an appropriate selection of the silicon thickness, and metal gate work function give rise to an optimum threshold voltage at a given channel length and drain bias. references [1] k. k. young, "short-channel effect in fully depleted soi mosfets", ieee trans. electron devices, vol. 36, no. 2, pp. 399–402, 1989. [2] s. bangsaruntip, g. m. cohen, a. majumdar, and j. w. sleight, "universality of short-channel effects in undoped-body silicon nanowire mosfets", ieee electron device lett., vol. 31, no. 9, pp. 903–905, 2010. [3] t. skotnicki, j. a. hutchby, t. j. king, h. s. p. wong, and f. boeuf, "the end of cmos scaling: toward the introduction of new materials and structural changes to improve mosfet performance", ieee circuits devices mag., vol. 21, no. 1, pp. 16–26, 2005. [4] j. p. colinge, "multiple-gate soi mosfets", solid. state. electron., vol. 48, no. 6, pp. 897–905, 2004. [5] l. chang, y. c. y. choi, d. ha, p. ranade, s. x. s. xiong, j. bokor, c. hu, and t. j. king, "extremely scaled silicon nano-cmos devices", in proceedings of the ieee, vol. 91, no. 11, pp. 1860–1873, 2003. [6] v. m. srivastava, k. s. yadav, and g. singh, "design and performance analysis of double-gate mosfet over single-gate mosfet for rf switch", microelectronics j., vol. 42, no. 3, pp. 527–534, 2011. [7] j. colinge, "from gate-all-around to nanowire mosfets", in proceedings of the international semiconductor conference, cas 2007, vol. 1, pp. 11–17. [8] k. p. pradhan, s. k. mohapatra, p. k. sahu, and d. k. behera, "impact of high-k gate dielectric on analog and rf performance of nanoscale dg-mosfet", microelectronics j., vol. 45, no. 
2, pp. 144– 151, 2014. [9] s. k. mohapatra, k. p. pradhan, l. artola, and p. k. sahu, "estimation of analog/rf figures-of-merit using device design engineering in gate stack double gate mosfet", mater. sci. semicond. process., vol. 31, no. 0, pp. 455–462, 2015. [10] s. k. mohapatra, k. p. pradhan, and p. k. sahu, "resolving the bias point for wide range of temperature applications in high-k/metal gate nanoscale dg-mosfet", facta universitatis series: electronics and energetics, vol. 27, no. 4, pp. 613–619, 2014. cylindrical gate all around (gaa) to nanowire mosfet 643 [11] t.-k. chiang and j. j. liou, "an analytical subthreshold current/swing model for junctionless cylindrical nanowire fets (jlcnfets) ", facta universitatis series: electronics and energetics, vol. 26, no. 3, pp. 157–173, 2013. [12] s. k. gupta and s. baishya, "modeling of cylindrical surrounding gate mosfets including the fringing field effects", j. semicond., vol. 34, no. 7, pp. 1–6, 2013. [13] m. r. kumar, s. k. mohapatra, k. p. pradhan, and p. k. sahu, "a simple analytical center potential model for cylindrical gate all around (cgaa) mosfet", j. electron devices, vol. 19, pp. 1648–1653, 2014. [14] h. abd-elhamid, b. iñiguez, d. jiménez, j. roig, j. pallarès, and l. f. marsal, "two-dimensional analytical threshold voltage roll-off and subthreshold swing models for undoped cylindrical gate all around mosfet", solid. state. electron., vol. 50, no. 5, pp. 805–812, 2006. [15] r. gautam, m. saxena, r. s. gupta, and m. gupta, "gate all around mosfet with vacuum gate dielectric for improved hot carrier reliability and rf performance", electron devices, ieee trans., vol. 60, no. 6, pp. 1820–1827, 2013. [16] a. cerdeira, m. estrada, j. alvarado, i. garduño, e. contreras, j. tinoco, b. iniguez, v. kilchytska, and d. flandre, "review on double-gate mosfets and finfets modeling", facta universitatis series: electronics and energetics, vol. 26, no. 3, pp. 197–213, 2013. [17] y. pratap, p. ghosh, s. haldar, r. 
s. gupta, and m. gupta, "an analytical subthreshold current modeling of cylindrical gate all around (cgaa) mosfet incorporating the influence of device design engineering", microelectronics j., vol. 45, no. 4, pp. 408–415, 2014. [18] t. k. chiang, "a compact model for threshold voltage of surrounding-gate mosfets with localized interface trapped charges", ieee trans. electron devices, vol. 58, no. 2, pp. 567–571, 2011. [19] l. zhang, c. ma, j. he, x. lin, and m. chan, "analytical solution of subthreshold channel potential of gate underlap cylindrical gate-all-around mosfet", solid. state. electron., vol. 54, no. 8, pp. 806–808, 2010. [20] d. sharma and s. k. vishvakarma, "precise analytical model for short channel cylindrical gate (cylg) gate-all-around (gaa) mosfet", solid. state. electron., vol. 86, pp. 68–74, 2013. [21] i. ferain, c. a. colinge, and j. colinge, "multigate transistors as the future of classical metal–oxide– semiconductor field-effect transistors", nature, vol. 479, pp. 310–316, 2011. [22] http://www.synopsys.com/, "sentaurus tcad user’s manual", in proceedings of the synopsys sentaurus device, pp. 191–403. [23] a. tsormpatzoglou, d. h. tassis, c. a. dimitriadis, g. ghibaudo, g. pananakakis, and r. clerc, "a compact drain current model of short-channel cylindrical gate-all-around mosfets", semicond. sci. technol., vol. 24, no. 7, p. 75017, 2009. 10537 facta universitatis series: electronics and energetics vol. 35, no 4, december 2022, pp. 
483-493 https://doi.org/10.2298/fuee2204483b © 2022 by university of niš, serbia | creative commons license: cc by-nc-nd

original scientific paper

new approach to a ds-cdma-uwb system using a pseudo orthogonal code (poc)

kada biteur1,2, belkacem benadda1,2, ahmed nour el islam ayad3
1dept of telecommunications, university abou bekr belkaid of tlemcen, algeria
2information processing and telecommunication laboratory (ltit), university tahri mohamed, bechar, algeria
3dept of electrical engineering, university kasdi merbah ouargla, algeria

abstract. ultra-wideband direct sequence code division multiple access (ds-cdma) plays an important role in multi-terminal, multi-application communications between uwb devices. in uwb systems, which inject the pulse itself directly into the antenna and therefore occupy a very wide bandwidth, the generation of suitable ds-cdma codes poses a real challenge. in this paper we describe our novel uwb transmission scheme, which uses pseudo-orthogonal time codes (poc) as ds-cdma sequences. the suggested codes are unipolar sequences with chips that may be dynamically modified to target a certain number of users or applications. our approach bypasses the modulation schemes commonly used in uwb systems. moreover, as a perspective for our work, it would be very interesting to realize the new approach on an fpga circuit.

key words: uwb systems, pseudo-orthogonal code (poc), direct sequence-cdma

1. introduction

ultra-wideband (uwb) technology can be integrated into many applications such as wireless personal area networks (wpan) [1-3] and mobile telecommunications (5g today) [4-6]. uwb is a rapidly developing short-range technology with very low power consumption that transmits information over a large portion of the radio spectrum, occupying a bandwidth greater than or equal to 25% of the center frequency or 1.5 ghz [7]. uwb transmitters use very short pulses instead of modulating a carrier signal.
the most commonly used pulse model is the second derivative of a gaussian, whose representation in the time domain is described by (1):

$U(t) = \left(1 - 4\pi \left(\frac{t}{\vartheta}\right)^{2}\right) e^{-2\pi \left(\frac{t}{\vartheta}\right)^{2}}$ (1)

where ϑ represents a time normalization factor.

received february 23, 2022; revised april 18, 2022 and july 30, 2022; accepted august 31, 2022
corresponding author: biteur kada, department of telecommunications, university abou bekr belkaid of tlemcen, algeria, e-mail: biteur.kada@univ-ghardaia.dz

fig. 1 second derivative of a gaussian pulse

for wireless communications in particular, the united states federal communications commission has limited the emitted power to a very low level (below -41.3 dbm/mhz) [8], allowing uwb technology to share spectrum with other users without interference. to obtain the required spreading, various techniques can be used, such as direct sequence (ds) and time-hopping (th) [9]. in th-uwb systems, user data is allotted to time frames, and pulse position modulation (ppm) is employed to eliminate overlap in multiple access networks [10-11]. in ds-uwb techniques [12], on the other hand, time spreading codes are used in the same way as in the traditional direct sequence code division multiple access (ds-cdma) technique, so they have the same advantages as direct sequence spread spectrum (dsss) [13-14]. in this paper we propose a transceiver model suitable for a new approach to direct sequence digital transmission for ultra-wideband applications (ds-uwb), using a pseudo-orthogonal time code (poc). the proposed codes are unipolar sequences characterized by a length l, made up of n elements called "chips", a predefined number of users, and the weight of the code, i.e. the number of chips at level "1".
moreover, to enhance the synchronization between transmitters and receivers, the proposed spreading scheme makes it possible to code the high-level bits '1' and the low-level bits '0' of the data stream separately, with two different codes; the resulting pair of code sequences is unique for each user. the proposed study aims to transmit ds-cdma-uwb without using the classical modulations associated with uwb systems. our new model, based on pseudo orthogonal codes, builds a ds-cdma-uwb system for both the receiver and the emitter sides.

direct sequences for uwb systems are explained in section 2. section 3 introduces the classic modulation schemes used in uwb systems. section 4 details the poc mechanisms. the uwb ds-cdma emitter is detailed in section 5. sections 6 and 7 cover the emitter signal generation and the transmission channel; simulation results for signal acquisition at the receiver are presented in section 8. section 9 concludes this paper.

[fig. 1 axes: time in nanoseconds, amplitude]

2. direct sequence uwb (ds-uwb)

direct sequence spread spectrum systems appear easier to implement since all the pulses are spaced at the same period, which imposes fewer constraints on the components of the transmission chain. indeed, our ds-uwb transmitter scheme uses orthogonal pseudo-random (pn) codes [15] as spreading sequences to encode each bit of information, and the bandwidth of the transmitted pulse is much greater than that of the transmitted binary stream. figure 2 illustrates a block diagram of the ds-uwb signal generator.

fig. 2 block diagram of a ds-uwb signal generator

3. modulations associated with uwb systems

there are three main modulation methods for uwb communications: ppm (pulse position modulation), ook (on-off keying) and pam (pulse amplitude modulation) [16-17].
▪ ppm modulation: the information is encoded in the time position of the pulse; bit '0' is defined by a pulse that is time-shifted with respect to a reference pulse that matches bit '1'.
▪ ook modulation: the presence of a pulse represents the "1" bit and the absence of a pulse represents the "0" bit.
▪ pam modulation: is an access method based on the property of orthogonality of pulses.

fig. 3 modulations associated with uwb systems

4. pseudo-orthogonal codes (poc)

j. a. salehi developed the poc codes in 1989 [18]. these codes are composed of unipolar sequences c = {c j} defined by the following parameters:
▪ l represents the length of the poc code
▪ w stands for the code's weight, which denotes the number of chips at "1"
▪ λa and λc represent the auto-correlation and inter-correlation constraints, respectively.

4.1. number of users

in the event that λa = λc = 1, various works [18-19] have shown that the number of possible users of a poc code sequence is limited by relation (2):

$$N(L, W, 1, 1) \le \left\lfloor \frac{L-1}{W(W-1)} \right\rfloor \tag{2}$$

▪ n: number of users.
▪ l, w: the length of the poc code and the code's weight, respectively.

4.2. construction of codes

the bibd (balanced incomplete block design) method [20] allows us to generate oc (l, w) code sequences when the desired spread length is a prime number. it is a mathematical method based on properties of the primitive roots of a galois field, and it is simple and fast. considering the primitive root α of l, we can get the positions of the chips at 1 of the i-th sequence ci = [pi,0 ; pi,1; …; pi,w-1] for each code according to the parity of w [21]:

if w is even (w = 2m):

$$\begin{cases} P_{i,0} = 0 \\ P_{i,j} = \alpha^{(m \times i) + (j \times k)} \end{cases} \tag{3}$$

with i ∈ [0, n − 1], j ∈ [0, w − 2] and k = 2 × m × n

if w is odd (w = (2 × m) + 1):

$$P_{i,j} = \alpha^{(m \times i) + (j \times k)} \tag{4}$$

with i ∈ [0, n − 1], j ∈ [0, w − 1] and k = 2 × m × n

▪ α is the primitive root of l.
▪ ci = [pi,0; pi,1; …; pi,w−1] gives the positions of the chips at 1 for the i-th code sequence.

table 1 shows the code positions used in our study according to the bibd method. in figure 4, we present the positions of the chips at "1" of the poc code (73, 4) for a number of users n = 6 and a code length l = 73.

table 1 the different positions of the (73, 4, 1, 1) code according to the bibd method (n = 6)

i (code)    first chip    j = 0    j = 1    j = 2
0 (c1)      0             1        8        64
1 (c2)      0             25       54       67
2 (c3)      0             36       41       69
3 (c4)      0             3        24       46
4 (c5)      0             2        16       55
5 (c6)      0             35       50       61

fig. 4 positions of chips at "1" of the code poc (73,4)

5. new model of ds-cdma-uwb emitter

in our proposed model, the bit flow equal to "1" is convolved with the chips of one user's poc code and the bit flow equal to "0" with another user code. this increases the bandwidth of the signal, which is emitted as low-energy gaussian-shaped pulses that are coherent on reception, as explained by figure 5.

fig. 5 ds-cdma-uwb emitter

the ds-cdma-uwb signal transmitted to a user can be expressed as follows:

$$S_{POC\,code}(t) = \left[\sum_{k=-\infty}^{\infty} b_1^{k} \sum_{j=0}^{N_c-1} C_j^{U} + \sum_{i=-\infty}^{\infty} b_0^{k} \sum_{j=0}^{N_c-1} C_j^{\tilde{U}}\right] \oplus W(t - iT_s - jT_c) \tag{5}$$

[fig. 4 panels: user1, user2, user3, user4, user5, user6]
[fig. 5 blocks: data source; the bits "0"; the bits "1"; poc code of a user; poc code of another user; spread spectrum; uwb pulse generator]

▪ $b_0^{k}$, $b_1^{k}$: the 0 and the 1 bits, respectively, of the binary data sent by the k-th source
▪ $W$ is the pulse waveform
▪ $T_c$, $T_s$ are the chip and symbol durations, respectively
▪ $N_c$ is the number of chips
▪ $C_j^{U}$, $C_j^{\tilde{U}}$ are the codes of two different users, which only take chip values 1 or 0, up to n, the number of users.

6.
emitter simulation

we first consider a random sequence of 8 bits modeling the useful information as a limited bit stream. then we use two selected poc codes to spread the spectrum. the codes are completely independent of the random data sequences [20], and this data transmission method uses more bandwidth than a traditional transfer requires. for the purpose of this paper we have selected as an example the 4th and 6th poc sequences (73,4) for our user (all other codes use the same principle), i.e. the bit flow equal to "1" is convolved with the 73 chips of code #4 and the bit flow equal to "0" is convolved with the 73 chips of code #6.

$$S_{(73,4)}(t) = \left[\left(\sum_{k=-\infty}^{\infty} b_1^{k} \sum_{j=0}^{73-1} C_j^{4}\right) + \left(\sum_{i=-\infty}^{\infty} b_0^{k} \sum_{j=0}^{73-1} C_j^{6}\right)\right] \oplus W(t - iT_s - jT_c) \tag{6}$$

the spectrum spreading represented in figure 6 modulates the data sequence "10011011" by means of two pseudo-random poc codes, chosen at a rate much higher than the bit rate of the information signal to be transmitted. that is to say, the convolution is done once between the 73 code chips of user #4 and the bits equal to 1, and once between the 73 code chips of user #6 and the bits equal to 0.

fig. 6 spread spectrum phase for the data sequence "10011011"

[fig. 6 panels: data; code of user #4; code of user #6; spread spectrum; zoom of the spread spectrum on the bits '1' and '0']

6.1. generation of uwb pulses

in this paper, we used the second derivative of the gaussian generated by equation (1) because of its ease of implementation in uwb systems [19-20]. as shown in figure 7, the uwb pulse generator receives the spread data, creates a train of second-order gaussian derivative pulses and outputs the signal through the antenna [22].

fig.
7 uwb pulses

[fig. 7 panels: uwb; uwb signal; zoom of the uwb signal on the bits '1' and '0']

to comply with the regulatory agency's recommendations, the frequency band allocated for uwb transmissions has been grouped into two parts: a so-called "low band", between 3 and 5 ghz, and a so-called "high band", between 6 and 10 ghz [23]. our transmitted ds-uwb signal lies in the low band, according to figure 8, which shows the spectrum and the power spectrum of the uwb signal.

fig. 8 the spectrum and the power spectral of uwb signal

[fig. 8 panels: spectrum of uwb signal, amplitude (v) versus frequency (hz) x 10 7; power spectral of uwb signal, 20 log10 (db) versus frequency (hz) x 10 7]

7. transmission channel

in our work, we did not examine multi-user interference (mui) [24] and intersymbol interference (isi) [25], since these phenomena are not predominant. the only impairment considered in our system is awgn noise. the received signal can be described by r(t) = s(t) + n(t), where s(t) is the signal generated by the transmitter and n(t) denotes the additive gaussian noise [26-28]. figure 9 shows the noisy signal based on the awgn channel model.

fig. 9 awgn channel output, where eb/no=2db

8.
the correlation receiver gaussian white additive noise (awgn) channel

the correlation receiver shown in figure 10 is the optimal receiver for a ds-cdma-uwb chain: it acts as a filter matched to the received signal by means of a correlation device, and it breaks down into three main steps [29]:

▪ multiplication of the received signal r(t) by the poc codes of users #4 and #6 combined with the uwb pulse generator:

$$r_{corr}(t) = r(t) \ast \left[\left(\sum_{k=-\infty}^{\infty} b_1^{k} \sum_{j=0}^{73-1} C_j^{4}\right) + \left(\sum_{i=-\infty}^{\infty} b_0^{k} \sum_{j=0}^{73-1} C_j^{6}\right)\right] \oplus W(t - jT_c) \tag{7}$$

▪ integration of the correlated signal over the bit time:

$$Z_1(i) = \int_{0}^{T_b} r_{corr}(t)\, dt \tag{8}$$

▪ decision making by comparison to a threshold, knowing that the poc codes of users #4 and #6 indicate bits '1' and '0', respectively.

fig. 10 correlation receiver

at reception, it suffices to compare the correlator signal with the possibly generated poc sequence to recover the transmitted signal. figure 11 illustrates the correlator output signal with its power spectrum, the spectrum of the correlator output signal and the recovered data.

fig. 11 the correlator output signal with its power spectral, the spectrum and the data recovered

the new ds-uwb system based on poc orthogonal unipolar codes without modulation was analyzed. we are interested only in the end-to-end ds-uwb transmission chain, and we removed the modulation part in our new approach. poc codes are preconfigured (calculated in advance). our perspective is to implement the new ds-uwb approach on components such as fpga [30-32], soc [33]… because nowadays it is easy to build a transceiver.

9. conclusion

in this work, we suggested a new approach to a multi-user ds-cdma-uwb system using a family of pseudo-orthogonal codes (poc) on an awgn channel with a correlation receiver. applying poc codes offers an approach that differs from those previously used in the literature on ultra-wideband systems.
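the transmission chain summarized here can be sketched end to end as follows. this is a minimal sketch under stated assumptions, not the authors' simulation: the number of samples per chip, the pulse width ϑ, the noise level and the decision rule (picking the larger of the two correlator outputs rather than a fixed threshold) are illustrative choices. codes #4 and #6 are taken from table 1 and the pulse from equation (1).

```python
import math
import random

L = 73                        # code length (chips per bit)
CODE_ONE = [0, 3, 24, 46]     # table 1, code #4: spreads the bits equal to '1'
CODE_ZERO = [0, 35, 50, 61]   # table 1, code #6: spreads the bits equal to '0'
S = 8                         # samples per chip (assumption)
THETA = 0.5                   # pulse normalization, in chip durations (assumption)


def pulse(t, theta=THETA):
    """second derivative of a gaussian pulse, equation (1)."""
    x2 = (t / theta) ** 2
    return (1.0 - 4.0 * math.pi * x2) * math.exp(-2.0 * math.pi * x2)


# one sampled pulse centred in a chip slot of unit duration
PULSE = [pulse((k + 0.5) / S - 0.5) for k in range(S)]


def template(positions):
    """waveform of one bit: a pulse in every chip slot whose chip is '1'."""
    wave = [0.0] * (L * S)
    for p in positions:
        wave[p * S:(p + 1) * S] = PULSE
    return wave


T_ONE, T_ZERO = template(CODE_ONE), template(CODE_ZERO)


def transmit(bits):
    """emitter: each bit selects the pulse train of code #4 or code #6."""
    signal = []
    for b in bits:
        signal.extend(T_ONE if b == 1 else T_ZERO)
    return signal


def awgn(signal, sigma, seed=0):
    """channel: r(t) = s(t) + n(t) with zero-mean gaussian noise."""
    rng = random.Random(seed)
    return [s + rng.gauss(0.0, sigma) for s in signal]


def receive(r):
    """receiver: multiply by each template, integrate over the bit, decide."""
    n = L * S
    bits = []
    for i in range(len(r) // n):
        frame = r[i * n:(i + 1) * n]
        z_one = sum(x * y for x, y in zip(frame, T_ONE))
        z_zero = sum(x * y for x, y in zip(frame, T_ZERO))
        bits.append(1 if z_one > z_zero else 0)
    return bits


data = [1, 0, 0, 1, 1, 0, 1, 1]   # the sequence "10011011" of section 6
recovered = receive(awgn(transmit(data), sigma=0.2))
print(recovered)
```

the two codes share only the chip at position 0, so the correlator matched to the transmitted code collects four pulses while the other collects one; that margin is what the decision step relies on.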
we have given a complete description of the ds-cdma-uwb system, including the transmission and reception formalism. this work allowed us to present and analyze new emission reception approach based on ds-cdma-uwb signal. references [1] k. h. liu, l. cai and x. s. shen, ''exclusive-region based scheduling algorithms for uwb wp'', ieee trans. wirel. commun., 2008, 7, 933–942. [2] z. p. li and g. s. kuo, ''layered mac for high-rate uwb wpan system''. in proceedings of the ieee 64th vehicular technology conference, melbourne, australia, 7–10 may 2006, pp. 1-5. 492 k. biteur, b. benadda, a. n. e. i. ayad [3] n. m. aripin and n. fisal, ''analysis of channel time allocations for mpeg-4 video transmission over uwb wpan'', in proceedings of the ieee symposium on industrial electronics & applications, (isiea 2009), kuala lumpur, malaysia, 4–6 october 2009; vol. 2, pp. 705-710. [4] j. clerk maxwell, a treatise on electricity and magnetism, 3rd ed., vol. 2. oxford: clarendon, 1892, pp. 68-73. [5] b. yu, d. yang and b. wang, ''design of uwb antenna with double band-notched in 5g'', in proceedings of the ieee 5th advanced information technology, electronic and automation control conference (iaeac), 12-14 march 2021, pp. 480-483. [6] a. m. islam, e. i. emon and a. ahmed, ''a metamaterial loaded microstrip patch antenna for lower 5g'', u-nii spectrum, math. model. eng. probl., vol. 7, no. 4, pp. 556-562, dec. 2020. [7] p. tiwari and p. k. malik, ''design of uwb antenna for the 5g mobile communication applications: a review'', in proceedings of the ieee international conference on computation, automation and knowledge management (iccakm), 9-10 jan. 2020, pp. 24-30. [8] d. g. leeper, ''a long-term view of short-range wireless'', ieee computer, vol. 34, no. 6, pp. 39-44, jun 2001. [9] s. elajoumi, a. tajmouati, j. zbitou, a. errkik, a. m. sanchez and m. 
latrachee, ''bandwidth enhancement of compact microstrip rectangular antennas for uwb applications'', telkomnika telecommunication computing electronics and control, vol. 17, no. 3, pp. 1559-1568, 2019. [10] c. r. nassar, f. zhu and z. wu, ''direct sequence spreading uwb systems: frequency domain processing for enhanced performance and throughput in communications'', in proceedings of the ieee international conference on communications, 2003, vol. 3, pp. 2180-2186. [11] b. hu and n. c. beaulieu, ''accurate performance evaluation of time hopping and direct-sequence uwb systems in mmulti-user interference'', ieee trans. commun., vol. 53, no. 6, pp. 1053-1062, 2005. [12] w. wu, z. y. wu and w. ji. xie, ''uwb ppm-th and pam-ds system with time reversal and its improved solution'', in proceedings of the ieee 6th international conference on information and automation for sustainability, 27-29 sept. 2012, pp. 332-336. [13] l. lu and v. k. dubey, ''performance of a complete complementary code-based spread-time cdma system in a fading channel'', ieee trans. veh. technol., vol. 57, no. 1, pp. 250-259, jan. 2008. [14] b. r. vojcic and r. l. pickholtz, ''direct-sequence code division multiple access for ultra-wide bandwidth impulse radio'' in proceedings of the ieee military communications conference (milcom), 2003, vol. 2, pp. 898-902. [15] a. gupta and l. bhaskar, ''performance analysis of different pn sequence and orthogonal spreading sequences in ds-ss'', in proceedings of the ieee 5th international conference confluence the next generation information technology summit, 25-26 sept. 2014, pp. 890-892. [16] n. t. huyen and p. t. hiep, ''proposing adaptive pn sequence length scheme for testing nondestructive structure using ds-uwb'', in proceedings of the 3rd international ieee conference on recent advances in signal processing, telecommunications & computing (sigtelcom), 21-22 march 2019, pp. 10-14. [17] i. opperman, j. iinatti and m. 
hämäläinen, uwb theory and applications, chichester, england: wiley, 2004.
[18] h. s. hamid, m. s. mohammed and m. i. mustafa, ''design low power detection qpsk-transceiver for uwb'', in proceedings of the 3rd international conference on sustainable engineering techniques, iop conf. series: materials science and engineering, vol. 881, 2020, p. 012134.
[19] j. a. salehi and c. a. brackett, ''code division multiple-access techniques in optical fiber networks, part i: fundamental principles'', ieee trans. on comm., vol. 8, no. 37, pp. 824-833, aug. 1989.
[20] k. biteur and m. kandouci, ''successive interference cancellation receiver (sic) in ds-ocdma system'', in proceedings of the 24th international conference on microelectronics (icm), 16-20 dec. 2012, pp. 1-4.
[21] h. chung and p. kumar, ''optical orthogonal codes new bounds and an optimal construction'', ieee trans. inf. theory, vol. 36, pp. 866-873.
[22] k. biteur and m. kandouci, ''conventional receiver with optical limiter in ds-ocdma system'', int. j. adv. eng. technol., vol. 6, no. 4, pp. 1494-1504, sept. 2013.
[23] t. sarkar, a. ghosh, s. chakraborty, l. l. kumar singh, ''a new insightful exploration into a low profile ultra-wide-band (uwb) microstrip antenna for ds-uwb applications'', j. electromagn. waves appl., vol. 35, no. 3, pp. 1-19, 2021.
[24] a. jassim, ''performances of multiuser interference using pulse amplitude modulation with time hoping for ultra wideband'', international journal of electronics, communication & instrumentation engineering research and development (ijecierd), vol. 6, no. 4, aug 2016.
[25] i. čuljak, ž. lučev vasić, h. mihaldinec and h. džapo, ''wireless body sensor communication systems based on uwb and ibc technologies: state-of-the-art and open challenges'', sensors, vol. 20, no. 12, p. 3587, jun 2020.
[26] a.
ramesha, a. nareshb, n. v. seshagiri raoc, ''technique for reduction of inters symbol interference in uwb'', in proceedings of the international conference on emerging trends in engineering, science and technology (icetest), 2015, pp. 812-819. [27] s. im and e. j. powers, ''an algorithm for estimating signal to noise ratio of uwb signals'', ieee trans. veh. technol., vol. 54, no. 5, pp. 1905–1908, 2005. [28] l. bo, q.-z. liu, z.-d. yin and z.-l. wu, ''a novel snr estimator for ds-uwb wireless sensor network'', destech trans. comput. sci. eng., 2017. [29] f. ramirez-mirles, ''on the performance of ultra wideband signals in gaussian noise and dense multipath'', ieee trans. veh. technol., vol. 50, no. 1, pp. 244249, jan. 2001. [30] md. a. azim, h. mohammad, m. rahman and n. amin, ''direct sequence ultra wideband system design for wireless sensor network'', in proceedings of the international conference on computer and communication engineering, 13-15 may 2008, pp. 1131-1135. [31] l. sneler, t. matic and i. galic, ''the fpga system for evaluation of uwb wireless sensor network based on transmitted reference integral pulse frequency modulator'', in proceedings of the ieee zooming innovation in consumer technologies conference (zinc), 2018, pp. 55-57. [32] c. thomos and g. kalivas, ''fpga-based architecture of a ds-uwb channel estimator and rake receiver employing a hybrid selection scheme'', in proceedings of the ieee 17th international conference on telecommunications, 2010, pp. 903-909. [33] m. cervetto, e. marchi and c. g. galarza, ''a fully configurable soc-based ir-uwb platform for data acquisition and algorithm testing'', ieee embed. syst. lett., vol. 13, no. 2, pp. 53-56, june 2021. instruction facta universitatis series: electronics and energetics vol. 28, n o 4, december 2015, pp. 527 540 doi: 10.2298/fuee1504527s detection and suppression of parasitic dc voltages in 400 v ac grids slobodan n. vukosavic  university of belgrade, dept. 
of electrical engineering, 11000 belgrade, serbia abstract. grid connected static power converters inject parasitic dc currents due to the offset in current sensing, control imperfections, asymmetries in power switches and other secondary effects. ever growing number of grid connected converters contributes to an increase of dc bias in ac grids, and this brings the cores of distribution transformers closer to saturation and increases their power losses. this paper provides sensitivity analysis of distribution transformers to the dc bias, and considers solutions for detecting and compensating the parasitic dc components in ac grids. active compensation methods can be advantageously used in suppressing the dc bias at grid connection point of the power converter. the sensing approach proposed in this paper makes use of saturable ferromagnetic cores and a low cost dsp for signal analysis and processing. proposed algorithm uses distortion of the magnetizing current of a parallel connected saturable core due to the bias. experimental results demonstrate the capability for detecting and compensating the bias voltages far below 1 mv in 0.4 kv grids. the paper describes the principles of dc bias detection and it provides the guidelines for the proper design of magnetic components. high precision of the proposed dc bias sensing is thoroughly verified on the experimental setup connected to a 0.4 kv grid. key words: power quality, distribution transformers, power converters, dc bias. 1. introduction dc injection into the low-voltage and medium-voltage ac grids comes mostly from grid connected static power converters. recent developments in power electronics, electrical drives and distributed generation leads to a large number of static power converters connected to the grid, with the potential to inject a parasitic dc bias into the grid. 
static power converters with pwm control can produce ac waveforms with a low distortion factor [1], but they can also introduce parasitic spectral components, including the dc bias. numerical solutions can be used to reduce the parasitic spectral components [2], but the remaining dc offset cannot be eliminated completely. therefore, all the transformerless grid-connected power converters have the potential of introducing a small, parasitic dc offset into the ac grid [3]. widespread use of electronically controlled electrical drives [4], which are often regenerative, makes the problem even more pronounced. recently introduced multiphase and multimotor drives [5] are also capable of introducing a parasitic dc bias through the front end converter. hence, whenever the power interface to the grid is performed through a static power converter, there is a potential for dc injection.

received march 22, 2015
corresponding author: slobodan n. vukosavic, university of belgrade, dept. of electrical engineering, 11000 belgrade, serbia (e-mail: boban@ieee.org)

dc bias in ac grids has an adverse effect on the operation of power transformers [6,7]. adverse consequences are also possible in certain electrical loads [8]. widespread use of distributed power sources attached to the grid through a power electronics interface, as well as an increased use of active rectifiers in modern electrical speed drives [9] and static power converters [10], emphasizes the problem of dc injection. the dc bias current limits specified by the norms [11] and discussed by international working groups are difficult to measure. the consequential dc bias voltages are even lower due to the very low equivalent resistance in ac grids. therefore, the need emerges to measure dc bias voltages and currents in ac grids with high precision.
dc injection of grid connected power converters is caused by the delay mismatch in gating circuits and imperfections of power switches [12], by the offset in current sensing [13], and by dc injection based methods for detecting the stator resistance and temperature in grid-connected ac machines [14]. other sources of dc bias include geomagnetic induced currents [15], hvdc transmission, railway signalling equipment and similar. even a small dc bias may result in saturation of power transformers [16], an increase in their iron losses, increased corrosion and erroneous operation of measurement and protective equipment. relevant norms [11] prescribe the dc injection limit as 0.5% of the grid-connected power converter rated current. on the other hand, a dc bias of 0.5% of the rated current of an sn > 500 kva distribution transformer [17,18] corresponds to more than 50% of the rated magnetizing current, and this would saturate the core and trip the protections. considering the ever growing number of grid connected static power converters, it is essential to devise and use devices for dc bias detection and compensation [10].

distribution power transformers with 0.4 kv secondary windings have a very low winding resistance and a very low magnetizing current [17,18]. a dc bias voltage of only 1 mv may introduce a 5% offset in the magnetizing current, moving the h field in the b-h plane away from the origin. parasitic dc current in a transformer results in half-cycle saturation and an increase in reactive power, leakage flux, stray losses and temperature of the core, clamping plates, the tank walls and bolts. therefore, dc bias detection and compensation is required to suppress the parasitic dc voltages in 0.4 kv grids far below the 1 mv level. transformerless grid-connected power converters are the source of the dc injection. equipped with adequate dc bias sensing and controls [19], they can also be used for suppressing the parasitic dc voltages at the grid connection point.
it is rather difficult to measure very small dc offsets embedded in ac voltages, as the ratio between the two exceeds $10^5$-$10^6$. the required precision of 2-3 ppm has to be maintained over the range of operating conditions. this cannot be achieved even with advanced sensors [13, 20]. considerable effort has been made in improving the precision of dc bias sensing [9, 19, 21, 22] and applying novel sensing techniques within closed loop dc bias suppression systems [10, 12, 23, 24]. in grid connected power converters with an intermediate dc link circuit, parasitic dc injection can be determined from line frequency oscillations of the dc link voltage [10] with a precision of 0.1%. at the same time, the offset introduced by hall effect current sensors placed in the dc link can be removed by auto-calibration [12].

dc injection can also be suppressed [23] by inserting an isolating power transformer, by using half bridge topologies, or by inserting a series blocking capacitor, but these methods increase the cost, size and power losses. therefore, the efforts were mainly focused on improving the accuracy of dc bias sensing methods and devices [13, 18-26]. in most cases, the proposed reading of a very small dc bias in the presence of a large ac signal is based on nonlinear effects in ac excited, dc biased iron cores. even a small bias results in detectable amounts of even harmonics [29-32] in the distorted magnetizing current of saturable iron cores. parasitic dc voltage in an ac grid can be detected by processing the magnetizing current im in a parallel connected choke wound on a saturable iron core. the dc bias sensing proposed in [19, 21, 22, 24-26] compares the positive and negative peaks of the magnetizing current, which gets distorted in the presence of a dc bias. used in conjunction with an 8 a transformerless power converter [24, 25], it suppresses the dc injection to 4 ma.
the same im peak comparing method can be advantageously used [26] in suppressing magnetic saturation in transformers used to connect a static power converter to the grid. with an additional compensation winding on the parallel connected choke [21, 22], the peak comparing method can be used to measure the dc bias in 0.4 kv ac grids, offering precision better than 3 mv for phase voltages uph = [170 v .. 220 v].

in this paper, the problem of detecting and suppressing the dc bias in ac grids is discussed and analyzed. an overview of sensing methods is followed by the proposal of a new, improved sensing technique based on the nonlinearity of a parallel connected choke wound on a saturable iron core [29]. the main objective is achieving precision in dc bias sensing considerably better than 1 mv in 0.4 kv grids. the two main tools in achieving this goal are (i) the algorithm for detecting the bias and (ii) the approach to winding the choke and designing the filters. section ii provides a brief analysis of distribution transformer parameters and studies the effects of parasitic dc voltages in 0.4 kv grids, restating the required precision of dc bias sensing. in section iii, the state of the art sensing solutions are considered with the aim of identifying the factors that limit their accuracy. proposed guidelines for designing the magnetics are summarized in section iv. the algorithm proposed to suppress the dc bias is given in section v, while section vi summarizes experimental results. discussion and conclusions are given in section vii.

2. required accuracy of dc bias sensing

dc bias currents may have a detrimental effect on the integrity of distribution and power transformers or on their long term performance, which has a negative effect on the overall system reliability. typical winding resistances and magnetizing (no load) currents of distribution transformers up to 2500 kva are plotted in fig. 1 from data available in [33].
for transformers rated s = 1 mva and above, the rated magnetizing current stays below 1% while the secondary resistance resides below 0.5%. this means that a dc offset voltage of udc > un/20000 produces a dc bias current equal to the rated magnetizing current. considering the 0.4 kv winding, it is of interest to explore the effect of very small dc voltages on the dc component of the magnetizing current. in fig. 2, the ratio between the dc bias current and the rated magnetizing current is given for udc = 1 mv and udc = 500 µv. for s = 1 mva and above, udc = 1 mv adds a dc offset of more than 5% to the magnetizing current. the iron loss investigation reported in [37] considers 2-, 3-, and 4-limb cores with single phase ac magnetizing and a superimposed dc bias. results plotted in figs. 6 and 7 of [37] suggest that a dc current equal to 5% of the maximum magnetizing current in normal conditions increases the iron losses in 2-, 3-, and 4-limb cores by 9%, 12% and 22%, respectively. although the core loss in distribution transformers is rather low (0.04% for a 1 mva transformer [33]), its change can be an indicator of the severity of the dc injection problem.

fig. 1 relative winding resistance and magnetizing (no load) currents of three phase line frequency distribution transformers up to 2500 kva.

fig. 2 the ratio between the dc bias current and the rated magnetizing current for dc offset voltages of 500 µv and 1 mv.

other effects of dc injection may prove more detrimental to a distribution power transformer. the presence of a dc component contributes to asymmetric saturation of the magnetic core during one sinusoidal semi-period, also called half-cycle saturation, causing a number of adverse effects [34-37]. with half-cycle saturation, transformers have an increase in acoustic noise, reactive power, leakage flux and stray losses, harmonics in induced voltages and losses in leads, clamping plates, transformer tank and bolts.
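the un/20000 figure quoted in this section follows directly from the per-unit limits read from fig. 1. a minimal sketch of the arithmetic, assuming the limiting values of 1% magnetizing current and 0.5% winding resistance and taking un = 400 v as the base voltage (an assumption made here for illustration only):

```python
def dc_voltage_for_rated_magnetizing(u_n, i_mag_pu=0.01, r_pu=0.005):
    """dc voltage whose bias current equals the rated magnetizing current.

    in per-unit terms i_dc = u_dc / r, so setting i_dc = i_mag gives
    u_dc = i_mag_pu * r_pu * u_n, i.e. u_n / 20000 for 1% and 0.5%.
    """
    return i_mag_pu * r_pu * u_n


def dc_offset_ratio(u_dc, u_n, i_mag_pu=0.01, r_pu=0.005):
    """ratio of the dc bias current to the rated magnetizing current."""
    return u_dc / dc_voltage_for_rated_magnetizing(u_n, i_mag_pu, r_pu)


U_N = 400.0                                        # base voltage in volts (assumption)
threshold = dc_voltage_for_rated_magnetizing(U_N)  # u_n / 20000
ratio_1mv = dc_offset_ratio(1e-3, U_N)             # offset caused by 1 mv

print(threshold, ratio_1mv)
```

with these limiting values the threshold is 20 mv and a 1 mv offset shifts the magnetizing current by 5%; since the actual resistance and magnetizing current of 1 mva units lie below the limits, the real shift is larger, which is consistent with the "more than 5%" stated above.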
for the standard magnetic material commonly used in building the magnetic cores of power transformers, the changes of hmax, hrms, specific power losses p and specific apparent power s are given in fig. 3. at high values of the flux density b, a dc offset of only 10% of the peak value can double the apparent power and increase the iron losses by 60%. therefore, it is of interest to suppress the dc bias current far below the level of imnom/10, where imnom stands for the rated magnetizing current.

fig. 3 peak value of the magnetic field, rms value of the magnetic field, specific power losses and specific apparent power s as a function of the flux density.

high efficiency distribution power transformers have lower winding resistances and hence larger dc currents in their windings for the same parasitic dc voltage across the windings. moreover, their operating point in the b-h plane comes closer to saturation. therefore, they are more sensitive to dc bias. with the introduction of high efficiency transformers and the increasing number of grid connected static power converters, the need to sense and compensate dc bias in 0.4 kv ac grids is more evident. suppressing the dc bias below the 1 mv level requires detection methods and devices with considerably lower sensing errors.

3. accuracy of peak detection methods

previously developed methods for sensing parasitic dc voltages in ac grids [19, 21, 22, 24-26] make use of changes in the magnetizing current of parallel chokes, namely, iron core reactors which are connected in parallel to the grid voltage. in the presence of a dc bias, the magnetizing current changes [29] and provides the grounds for detecting the sign and amplitude of the parasitic dc current (fig. 4). the distorted magnetizing current has the maximum positive value imax and the peak negative value imin.
the positive peak of the magnetizing current (imax) and the negative peak (imin) are supposed to be equal in the absence of the dc bias. considering a core which operates next to saturation, an injection of dc bias would result in a considerable change in the magnetizing current. the values imin and imax become different, thus providing the means to obtain the sign and an estimate of the bias. the peak difference Δi is used in the dc bias detectors presented in [19, 21, 22, 24-26]. all of these solutions compare the positive and negative peaks of the magnetizing current in a parallel connected reactor. the very concept of a dc-compensated magnetic core has proved reliable [20] and is also used in closed loop current sensing.

fig. 4 suppression of dc injection from transformerless grid connected power converters.

with the peak detection method applied to grid-connected power converters (fig. 5), it is possible to use the detected signal to correct the pwm pulses of the converter in order to drive the parasitic dc offset down to zero. whenever a parasitic dc bias produces an offset in the magnetizing current, the difference Δi arises in the manner illustrated in fig. 4. the power converter in fig. 5 acts towards eliminating the bias by introducing small changes in the pwm pattern. this approach can be used to suppress the dc injection from transformerless grid connected power converters. any dc injection caused by the converter imperfections results in a dc bias. in turn, the signal Δi is detected from the saturable core. this signal is used to adjust the pwm commands of the grid connected power converter in a way that suppresses the dc injection and brings the difference Δi towards zero.

fig. 5 suppression of dc injection from transformerless grid connected power converters. with spectrum-based sensing approach, the lc filter across the choke is not required.
the difference Δi between the peak values of the magnetizing current depends on the instantaneous values of the current at the instants of zero crossings of the supply voltage. therefore, the value of Δi can be affected by the noise and voltage harmonics coming from the grid. for this reason, state of the art dc bias detectors include a low pass lc filter, designed to maintain the integrity of the detected Δi. this filter is drawn on the left side in fig. 5.

the reported accuracy of peak detection methods shows the capability to detect a dc bias current component within the choke magnetizing current down to 1/30 of the rated ac magnetizing current (idc/imag = 1/30). a drop in accuracy is detected with the ac voltage off the rated value. in table 1, the ac voltage is varied from 68% up to 120%. the minimum detectable dc current idc rises as the voltage shifts away from the rated value, by a factor of 5 at 68% and 7.5 at 120% of the rated voltage. this represents a serious drawback of peak detection methods. this drawback can be removed by replacing the peak detection method with other means of extracting the information on the dc bias from the magnetizing current measured in the parallel choke. sensing precision can also be improved by a better design of the sensing core, focused on increasing the sensitivity.

table 1 reduced sensitivity of the peak detection methods in operation with ac voltages off the rated value
ac voltage   68%    77%    90%    100%   120%
idc/imag     1/6    1/10   1/17   1/30   1/4

4. core design

the sensitivity depends on the ratio idc/imag. the detectable dc voltage udc depends on the sum of the active resistances in the reactor circuit, hence, idc = udc/r. therefore, in order to reduce the minimum detectable udc = idc r = imag (idc/imag) r, and given the ratio (idc/imag), it is of interest to minimize the product imag r. with that in mind, any additional lc filter is counterproductive, as it increases r and reduces sensitivity.
a simple and straightforward way of getting a suitable sensor is to adopt a small, ready-made toroidal transformer, with the primary winding already set for the line frequency and the ac grid voltage. in table 2, a summary is given of the key parameters of standard single phase line frequency transformers wound on a toroidal iron core. these toroidal cores are made of the most common iron sheets, and are available off the shelf. the table comprises the relative magnetizing current and the relative winding resistance for transformers with the rated power ranging from 20va up to 500va. the sensitivity of the core to the dc bias is inversely proportional to the ri product. hence, the core of 50va is five times more sensitive than the core of 20va. on the other hand, increasing the power from 50va up to 500va raises the sensitivity roughly two times. for that reason, it appears suitable to avoid large and heavy 500va cores, and to remain within the 50va range. the rightmost column in table 2 provides the factor ri snom, which is lower for a larger "sensitivity per va". namely, it illustrates how "the investment" into a larger core pays off as an increase in dc bias sensitivity. the most appropriate choices are the cores with 30va and 50va.

table 2 properties of standard toroidal cores used for single phase line frequency transformers
sn [va]   imag/inom   rrelative   ri *1000   ri snom
20        0.0286      0.0483      1.3814     0.0276
30        0.0104      0.0434      0.4514     0.0135
50        0.0088      0.032       0.2816     0.0141
80        0.0096      0.0397      0.3811     0.0305
150       0.0066      0.0341      0.2251     0.0338
200       0.0074      0.0331      0.2449     0.0490
300       0.006       0.0254      0.1524     0.0457
400       0.0055      0.0264      0.1452     0.0581
500       0.0053      0.0238      0.1261     0.0630

the application in fig. 5 requires a sensing core with only one winding, the winding connected across the ac voltage. therefore, it is beneficial to use all the winding space of the core and reduce the winding resistance to the minimum.
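the two derived columns of table 2 can be recomputed from the first three, which also makes the ranking of the cores explicit. the sketch below is only an illustration: the data are copied from table 2, while the function and variable names are hypothetical.

```python
# (rated power [VA], imag/inom, relative winding resistance), copied from table 2
TABLE_2 = [
    (20, 0.0286, 0.0483), (30, 0.0104, 0.0434), (50, 0.0088, 0.032),
    (80, 0.0096, 0.0397), (150, 0.0066, 0.0341), (200, 0.0074, 0.0331),
    (300, 0.006, 0.0254), (400, 0.0055, 0.0264), (500, 0.0053, 0.0238),
]


def core_figures(s_nom, i_mag_rel, r_rel):
    """Derived columns of table 2 for one core."""
    ri = i_mag_rel * r_rel  # a lower ri product means a higher dc bias sensitivity
    return {"s_nom": s_nom, "ri_x1000": 1000.0 * ri, "ri_s_nom": ri * s_nom}


figures = [core_figures(*row) for row in TABLE_2]

# Cores with the lowest ri*snom give the most sensitivity per VA.
best_per_va = sorted(figures, key=lambda f: f["ri_s_nom"])[:2]

# Sensitivity gain when moving from the 20 VA core to the 50 VA core.
gain_20_to_50 = figures[0]["ri_x1000"] / figures[2]["ri_x1000"]

for f in figures:
    print(f"{f['s_nom']:>4} VA  ri*1000 = {f['ri_x1000']:.4f}  ri*snom = {f['ri_s_nom']:.4f}")
print("best per VA:", [f["s_nom"] for f in best_per_va])
```

sorting by the last column singles out the 30va and 50va cores, in line with the choice made in the text.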
hence, an off the shelf toroidal transformer of 50va should be rewound. the secondary winding can be removed, and the available winding space used for a primary winding with a reduced resistance. in this manner, the winding resistance can be halved, and the sensitivity to dc offset doubled.

5. control of the dc bias suppression system

in fig. 5, the sensing choke is connected across the ac voltage. the dc bias within the ac voltage may be injected from the grid side power converter on the right of the figure, but also from other grid side converters connected to the same grid. to begin with, it is necessary to detect the bias. as discussed before, conventional peak detection methods have a series of drawbacks, and there is a need to deploy a more robust, more reliable and more sensitive algorithm for extracting the bias information from the magnetizing current of the choke.

robustness against the grid noise, pwm noise and other noise sources intrinsic to ac grids is a vital feature in sensing the dc bias. instead of relying on time-domain properties of the relevant signals, it is possible to move to the frequency domain and consider the second harmonic of the magnetizing current, known to be proportional to the dc bias. in table 3, the core of a small toroidal single phase transformer is tested for the second harmonic in the presence of the dc bias. the bias voltages are changed from 0mv up to 1.4mv. the test is performed with ac voltages ranging from 70% up to 116%. for a wide range of ac voltages, the amplitude of the second harmonic is proportional to the bias. therefore, it can be advantageously used in detecting the bias. notice in table 3 that the residual second harmonic, obtained with udc=0, does not exceed 0.42% of the rated magnetizing current. considering a sensing core with the weight of m < 0.7 kg, this corresponds to 9 µa, and it contributes a measurement error of 140 µv (0.14 mv).
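the frequency-domain idea can be sketched with a single-bin dft evaluated at the fundamental and at twice the fundamental. this is a minimal illustration on a synthetic current, not the signal processing used in the paper: the function names, the sampling parameters and, in particular, the convention that maps the relative phase onto the sign of the bias are assumptions.

```python
import cmath
import math


def harmonic_phasor(samples, order, samples_per_period):
    """Complex amplitude of one harmonic via a single-bin DFT.

    The record must span an integer number of fundamental periods.
    """
    acc = 0j
    for k, x in enumerate(samples):
        acc += x * cmath.exp(-2j * math.pi * order * k / samples_per_period)
    return 2.0 * acc / len(samples)


def signed_second_harmonic(samples, samples_per_period):
    """Second-harmonic amplitude, signed by its phase relative to the fundamental.

    Rotating h2 by twice the phase of h1 makes the result independent of
    where the record starts within the period.
    """
    h1 = harmonic_phasor(samples, 1, samples_per_period)
    h2 = harmonic_phasor(samples, 2, samples_per_period)
    ref = h1 / abs(h1)
    return (h2 * ref.conjugate() ** 2).real


SPP = 200  # hypothetical number of samples per fundamental period


def synthetic_current(h2, periods=10, shift=0.3):
    """Unit fundamental plus a second harmonic of signed amplitude h2."""
    out = []
    for k in range(periods * SPP):
        theta = 2.0 * math.pi * k / SPP + shift
        out.append(math.cos(theta) + h2 * math.cos(2.0 * theta))
    return out


positive = signed_second_harmonic(synthetic_current(+0.05), SPP)
negative = signed_second_harmonic(synthetic_current(-0.05), SPP)
print(positive, negative)
```

because only two frequency bins are evaluated, grid noise and harmonics at other frequencies average out over the record, which is the robustness argument made above.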
table 3 second harmonic of the magnetizing current, expressed relative to the rated value of the magnetizing current. the values are given for the range of ac voltages and dc bias values.
udc [mv]    0.0      0.2      0.4     0.6     0.8     1.0     1.2     1.4
uac=70%     0.001    0.02     0.026   0.071   0.097   0.12    0.144   0.157
uac=88%     0.002    0.036    0.067   0.093   0.123   0.156   0.184   0.200
uac=100%    0.0026   0.0338   0.07    0.096   0.132   0.158   0.185   0.209
uac=116%    0.0042   0.031    0.078   0.097   0.143   0.157   0.191   0.198

the bias amplitude is obtained from the amplitude of the second harmonic, while the sign is obtained from the phase shift of the second harmonic with respect to the fundamental. in fig. 5, the signals are fed back to the grid connected power converter. within the pwm algorithm of the converter, it is necessary to introduce small changes in the width of the voltage pulses, thus introducing a small dc correction of the output voltages. this change is calculated so as to suppress the dc bias from the grid. as a consequence, the dc offset within the output voltage of the grid connected converter introduces the dc injection required to drive the detected dc bias down to zero. the precision in keeping the bias at zero is defined by the sensor, and it is estimated at 140 µv.

from the results given in table 3, the amplitude of the demodulated 2nd harmonic can be expressed as
$$ h_2(u_{dc}) = k_2 \, u_{dc}, \qquad (1) $$
where k2 ≈ 0.15 for the sensing core under consideration (with udc expressed in mv and h2 relative to the rated magnetizing current, as in table 3), and udc is the dc bias across the primary winding. this bias produces the primary side dc bias current idc. if rp is the primary resistance,
$$ h_2 = k_2 \, r_p \, i_{dc} = k_3 \, i_{dc}. \qquad (2) $$
in fig. 6, the controller produces the modulation index m for the auxiliary pwm h-bridge which feeds the voltage u2 across the compensating winding. as a consequence, the current i2 provides the correction and zeroes out the dc bias within the core.
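as a sanity check on the constant in (1), a least-squares line through the origin can be fitted to each row of table 3. the sketch below is only an illustration: the data are copied from the table, the fitting helper is hypothetical, and the paper does not state how k2 was obtained.

```python
U_DC_MV = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4]

# Second harmonic relative to the rated magnetizing current, copied from table 3.
H2_BY_UAC = {
    70: [0.001, 0.02, 0.026, 0.071, 0.097, 0.12, 0.144, 0.157],
    88: [0.002, 0.036, 0.067, 0.093, 0.123, 0.156, 0.184, 0.200],
    100: [0.0026, 0.0338, 0.07, 0.096, 0.132, 0.158, 0.185, 0.209],
    116: [0.0042, 0.031, 0.078, 0.097, 0.143, 0.157, 0.191, 0.198],
}


def slope_through_origin(x, y):
    """Least-squares slope of y = k * x (no intercept)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


k2_by_uac = {uac: slope_through_origin(U_DC_MV, h2) for uac, h2 in H2_BY_UAC.items()}

for uac, k2 in sorted(k2_by_uac.items()):
    print(f"uac = {uac:3d} %  k2 = {k2:.3f} per mV")
```

for the rows at 88%, 100% and 116% the fitted slope stays close to the k2 ≈ 0.15 quoted in the text, while the 70% row gives a somewhat lower value, consistent with the remark that sensing is more difficult at reduced ac voltage.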
assuming that the primary winding has n1 turns while the secondary (compensating) winding has n2 = q n1 turns, the second harmonic in the presence of both primary and secondary magnetomotive forces is
$$ h_2 = k_3 \, (i_{dc} - q \, i_2). \qquad (3) $$
assuming that the controller has an integral action with the gain ki,
$$ u_2 = \frac{k_i}{s} \, h_2 = \frac{k_i}{s} \, k_3 \, (i_{dc} - q \, i_2). \qquad (4) $$
the current i2 comes as a consequence of the voltage u2. with the resistance r2 and the inductance l2 of the compensating winding,
$$ i_2 = \frac{u_2}{r_2 + s \, l_2}. \qquad (5) $$
eventually, the response of the current i2 to the bias idc is defined by
$$ i_2(s) = \frac{k_3 \, k_i}{s^2 l_2 + s \, r_2 + k_3 \, k_i \, q} \; i_{dc}(s). \qquad (6) $$
the dynamic response of the closed loop can be tuned by the gain ki. since the dc bias fluctuations are rather slow, there is no need to select an excessively fast response and too large a gain. in the experimental setup, the response time is characterized by a time constant of 200ms. in steady state conditions, the current i2 is proportional to the bias idc,
$$ i_2(\infty) = \frac{1}{q} \, i_{dc}(\infty), \qquad (7) $$
and it reflects the bias voltage udc of the grid at the point of common connection (pcc).

fig. 6 using the sensing reactor with the compensating winding. control circuit sets the voltage u2 in order to obtain the current i2 of the compensating winding which zeroes out the offset from the sensing core.

while the circuit in fig. 6 detects the dc bias within the ac grid, the setup in fig. 7 can be used to take active action and compensate the bias. the controller senses the second harmonic and introduces the correction Δm of the modulation index which is used within the grid connected power converter. in this way, a dc current i2 is injected into the grid. when the controller reaches the balance, the current i2 zeroes out the original dc bias of the grid and brings the voltage udc to zero.

fig. 7 using the grid connected power converter as an actuator in closed loop dc bias suppression system.
control circuit detects the second harmonic, concludes on the dc bias, and produces the dc voltage correction u2. this voltage injects the dc current i2 which zeroes out the dc bias detected across the grid connection.

6. experimental results

the setup in figs. 5 and 7 comprises the sensing choke, the signal processing block and the grid connected power converter capable of injecting a controllable dc bias. the gains of the bias-removal control loop are set to obtain a closed loop response characterized by a time constant of 150 ms. experimental results are given in fig. 8, where the trace of the detected dc bias illustrates the operation of the dc bias suppression controller. the scaling is 500ms per division on the x-axis and 0.5mv per division on the vertical axis. an artificial bias of 2.5mv is introduced into the system, and it is removed in, roughly, 200ms.

fig. 8 transient response of the dc bias suppression controller. the scaling of x-axis is 500ms per division. the vertical axis shows detected dc bias with the scaling of 0.5mv per division. an artificial bias of 2.5mv is introduced into the systems, and it is removed in, roughly, 200ms.

steady state accuracy is tested in regimes where the sensing is more difficult, namely, with the ac voltage reduced to 70%, where the dc bias has a lesser effect on the distortion of the magnetizing current. for the closed-loop dc bias suppression shown in fig. 7, table 4 gives the residual error for a range of dc bias voltages. these results demonstrate that, for a range of operating conditions, precision in sensing and removing the dc bias can be maintained with residual errors below 140 µv. considering the amplitude of the superimposed ac voltages, this result brings the measurement precision to better than 1 ppm.
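the closed loop time constant quoted above is set by the relations (3)-(6) of section 5. the sketch below integrates (3)-(5) with a simple euler scheme to show how the compensating current settles at idc/q, as stated in (7). all numerical values (k3, ki, q, r2, l2, idc) are hypothetical, picked only so that the dominant time constant comes out near 150 ms; they are not taken from the experimental setup.

```python
def simulate_compensation(i_dc, k3, k_i, q, r2, l2, t_end=2.0, dt=1e-4):
    """Euler integration of relations (3)-(5); returns a list of (t, i2) samples."""
    u2 = 0.0  # output of the integral controller [V]
    i2 = 0.0  # current of the compensating winding [A]
    trace = []
    for n in range(int(round(t_end / dt))):
        h2 = k3 * (i_dc - q * i2)        # residual second harmonic, relation (3)
        u2 += dt * k_i * h2              # integral action, relation (4)
        i2 += dt * (u2 - r2 * i2) / l2   # winding dynamics, relation (5)
        trace.append(((n + 1) * dt, i2))
    return trace


def time_to_fraction(trace, final_value, fraction=0.632):
    """First time at which the response reaches the given fraction of final_value."""
    for t, value in trace:
        if value >= fraction * final_value:
            return t
    return None


# Hypothetical parameters, for illustration only.
I_DC = 1e-3   # dc bias current seen by the core [A]
Q = 2.0       # turns ratio n2/n1
trace = simulate_compensation(I_DC, k3=1.0, k_i=33.3, q=Q, r2=10.0, l2=0.1)

i2_final = trace[-1][1]
t63 = time_to_fraction(trace, I_DC / Q)
print(f"i2 settles at {i2_final * 1e3:.4f} mA, 63 % reached after {t63 * 1e3:.0f} ms")
```

with these values the loop is overdamped, so the response is monotonic; raising ki shortens the response, in line with the remark that the dynamics can be tuned through this gain.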
table 4 steady state accuracy of the proposed solution
udc [mv]                0.0   0.2   0.4   0.6   0.8   1.0   1.2   1.4
residual error [µv]     80    101   33    129   117   57    73    17

fig. 9 residual error obtained in the steady state, with the circuit given in fig. 6. on x-axis, parasitic dc bias in 0.4kv ac grid is given, expressed in [mv]. residual error is given on y-axis in [µv].

when using the proposed detection method in the manner illustrated in fig. 6, that is, as a sensor, the results are given in fig. 9. these results present the residual error for a range of dc bias voltages, and demonstrate that precision in sensing the dc bias can be maintained with residual errors below 125 µv. compared to uac, the measurement precision is better than 1 ppm.

7. conclusions

the growing number of grid connected converters contributes to an increase of dc bias in ac grids, which brings the cores of distribution transformers closer to saturation and increases their power losses. the paper provides an analysis of contemporary distribution transformers and probes their sensitivity to the dc bias. it also presents a detailed analysis of the available solutions for detecting and compensating the parasitic dc bias in ac grids, and explores their limits. an active compensation method is proposed, where the grid connected power converter monitors the parasitic dc voltages at the point of common connection, and provides the dc voltages which correct and suppress the bias. the sensing approach proposed in this paper makes use of saturable ferromagnetic cores and a low cost dsp for signal analysis and processing. the proposed algorithm uses the distortion of the magnetizing current of a parallel connected saturable core due to the bias. experimental results demonstrate the capability for detecting and compensating bias voltages far below 1 mv in 0.4 kv grids.
for a range of operating conditions, precision in sensing and removing the dc bias can be maintained with residual errors below 140 µv. considering the amplitude of the superimposed ac voltages, this result brings the measurement precision to better than 1 ppm.

references
[1] s. n. vukosavić, p. miljanić, "instantaneous feedback in voltage source inverters, a comparative study between nonlinear and linear approach", in conf. rec. 3rd ieee conf. power electronics and elect. drives, london, 1988, pp. 134-137.
[2] s. n. vukosavić, m. r. stojić, "reduction of parasitic spectral components of digital space vector modulation by real-time numerical methods", ieee trans. power electronics, vol. 10, no. 1, pp. 94-102, feb. 1995.
[3] s. vukosavić, "designing energy conversion systems for the next decade", 16th international symposium on power electronics – ee 2011, novi sad, serbia, 26.-28. october, 2011. invited paper ip.2-2
[4] s. n. vukosavić, digital control of electrical drives, new york 10013, usa: springer, 2007, isbn 978-0-387-25985-7, library of congress 2006935130.
[5] s. n. vukosavić, m. jones, d. dujić, e. levi, "an improved pwm method for a five-leg inverter supplying two three-phase motors", in ieee int. symp. ind. electronics, cambridge, uk, 2008, pp. 160-165.
[6] j. g. kappenman, "transformer dc excitation field test and results", ieee special panel session report, 1989
[7] e. l. harder, "effect of direct current in transformer windings", electric journal, vol. 27, pp. 601, 1930
[8] j.a. orr, a.e. emanuel, "on the need for strict second harmonic limits", ieee trans. power delivery, vol. 15, no. 3, pp. 967–971, july 2000.
[9] l. gertmar, p. karlsson, o. samuelsson, "on dc injection to ac grids from distributed generation", european conference on power electronics and applications, epe2005, dresden, pp 1-10
[10] y. shi, b. liu, s.
duan, "eliminating dc current injection in current-transformer-sensed statcoms", ieee trans. on power electronics, vol. 28, no. 8, pp. 257–265, aug. 2013.
[11] "ieee standard for interconnecting distributed resources with electric power systems", ieee standard 1547-2003
[12] m. armstrong, d.j. atkinson, c.m. johnson, "auto-calibrating dc link current sensing technique for transformerless, grid connected, h-bridge inverter systems", ieee trans. on power electronics, vol. 21, no. 5, pp. 1385-1393, sept. 2006.
[13] m.m. ponjavic, r.m. djuric, "nonlinear modeling of the self-oscillating fluxgate current sensor", ieee sensor journal, vol. 7, no. 11, pp. 1546-1553, nov. 2007.
[14] s.b. lee, t.g. habetler, "an on line stator winding resistance estimation technique for temperature monitoring of line-connected induction machines", ieee trans. on industry applications, vol. 39, no. 3, may/june 2003, pp. 685-694
[15] p.r. price, "geomagnetically induced current effects on transformers", ieee trans. on power delivery, vol. 17, no. 4, oct 2002, pp. 1002-1008
[16] a. ahfock, a.j. hewitt, "dc magnetisation of transformers", iee proc. of electric power applications, vol. 153, no.4, pp. 601-607, july 2006
[17] e.g. tenyenhuis, o. guelph, r.s. girgis, "measured variability of performance parameters of power & distribution transformers", ieee pes transmission and distribution conference and exhibition 21.24. may 2006, pp. 523-528
[18] transformers ge electrical distribution e-catalog, ge industrial solutions 2013
[19] g. buticchi, l. consolini, e. lorenzani, "active filter for removal of the dc current component for single phase power lines", ieee trans. on industrial electronics, vol. 60, no. 10, pp. 4403-4414, oct. 2013
[20] "isolated current and voltage transducers, characteristics, applications, calculations", lem components, 3rd ed., 2004, publication ch 24101 e/us.
[21] g. buticchi, e. lorenzani, "detection method of the dc bias in distribution power transformers", ieee trans.
on industrial electronics, vol. 60, no. 8, pp. 3539-3549, aug. 2013
[22] g. buticchi, e. lorenzani, "a sensor to detect the dc bias of distribution power transformers", ieee international symposium in diagnostics for electric machines, power electronics & drives (sdemped), 5-8 sept. 2011, pp. 63-70
[23] f. berba, d. atkinson, m. armstrong, "a review of minimisation of output dc current component methods in single-phase grid-connected inverters pv applications", 2nd international symposium on environment-friendly energies and applications 2012, pp. 296-301
[24] g. buticchi, g. franceschini, e. lorenzani, c. tassoni, a. bellini, "a novel current sensing dc offset compensation strategy in transformerless grid connected power converters", ieee energy conversion congress and exposition, ecce 2009, 20-24 sept. 2009, pp. 3889-3894
[25] g. buticchi, e. lorenzani, g. franceschini, "a dc offset current compensation strategy in transformerless grid-connected power converters", ieee trans. on power delivery, vol. 26, no. 4, oct. 2011, pp. 2743-2751
[26] g. franceschini, e. lorenzani, g. buticchi, "saturation compensation strategy for grid connected converters based on line frequency transformers", ieee trans. on energy conversion, vol. 27, no. 2, june 2012, pp. 229-237
[27] task force on harmonics modeling and simulation, "modeling devices with nonlinear voltage-current characteristics for harmonic studies", ieee trans. on power delivery, vol. 19, no. 4, oct. 2004, pp. 1802-1822
[28] s. lu, y. liu, j. de la ree, "harmonics generated from a dc biased transformer", ieee trans. on power delivery, vol. 8, no. 2, april 1993, pp. 725-731
[29] x. li, x. wen, p.n. markham, "analysis of nonlinear characteristics for a three-phase, five-limb transformer under dc bias", ieee trans. on power delivery, vol. 25, no. 4, oct. 2010, pp. 2504-2010
[30] x. zhao, j. lu, l. li, z.
cheng, "analysis of the dc bias phenomenon by the harmonic balance finite-element method", ieee trans. on power delivery, vol. 26, no. 1, jan. 2011, pp. 475-485
[31] "hv/lv distribution transformers, trihal cast resin dry type transformers 160 to 2500 kva", france transfo, schneider electric industries sas, april 2005.
[32] d. warner and w. jewell, "an investigation of zero order harmonics in power transformers", ieee trans. on power delivery, vol. 14, no. 3, pp. 972-977, jul. 1999.
[33] p. picher, l. bolduc, a. dutil, and v. pham, "study of the acceptable dc current limit in core-form power transformers", ieee trans. on power delivery, vol. 12, no. 1, pp. 257-265, jan. 1997.
[34] n. takasu, t. oshi, f. miyawaki, s. saito, and y. fujiwara, "an experimental analysis of dc excitation of transformers by geomagnetically induced currents", ieee trans. on power delivery, vol. 9, no. 2, pp. 1173-1182, apr. 1994.
[35] s. lu and y. liu, "fem analysis of dc saturation to assess transformer susceptibility to geomagnetically induced currents", ieee trans. on power delivery, vol. 8, no. 3, pp. 1367-1376, jul. 1993.
[36] s.a. mousavi, g. engdahl, e. agheb, "investigation of gic effects on core losses in single phase power transformers", archives of electrical engineering vol. 60, no. 1, pp. 35-47, 2011.
[37] t. mingxing, y. dongsheng, y. hong, "harmonic characteristic analysis of magnetically saturation controlled reactor", telkomnika, vol. 11, no. 8, august 2013, pp. 4214-4221

facta universitatis series: electronics and energetics vol. 33, no. 3, september 2020, pp.
327-349 https://doi.org/10.2298/fuee2003327v
© 2020 by university of niš, serbia | creative commons license: cc by-nc-nd

high frequency common-mode noise in serdes circuits’ optimized interconnections

roxana vladuță 1,2, lidia dobrescu 2, nicolae militaru 2, dragoș dobrescu 2
1 esilicon romania, bucharest – district 1, romania
2 faculty of electronics, telecommunications and information technology, university politehnica of bucharest, romania

abstract. according to the requirements imposed by the new four-level pulse amplitude modulation (pam4) standard for high-speed data transfer and processing, electrical constraints and manufacturing tolerances in integrated electronic packages call for accurate electromagnetic simulations and new s-parameters analysis, saving time and financial resources in the implementation of next-generation switches, routers or data center circuits. the complexity of the encapsulation substrates of advanced networking class circuits increases massively due to the large number of differential signals that they integrate. differential signaling has replaced single-ended transmission in high-speed circuits due to its many advantages, including increased immunity to crosstalk and electromagnetic interference, but common-mode noise due to timing skew or amplitude imbalance can still affect it. this work tests five different models, and identifies and optimizes the 45° bends, structures that commonly affect the reflections in a differential stripline. then it studies differential transmission lines in stripline topology, implemented in a 12-layered flip-chip package, using s-parameters, inspecting and comparing the common-mode noise. in this way, the paper combines microwave theory with a real chip packaging design, using finite element analysis for electromagnetic field simulation and mixed-mode scattering parameters of differential topologies, towards an optimized structure design.
key words: common-mode noise, differential signaling, electromagnetic interference, finite element analysis model, flip-chip package, multilayer circuit board

received june 1, 2020
corresponding author: lidia dobrescu, faculty of electronics, telecommunications and information technology, university politehnica of bucharest, romania (e-mail: lidia.dobrescu@electronica.pub.ro)

1. introduction

the increasing number of devices connected to the internet and modern cloud storage impose a growing need to move more data much faster [1]. the serializer/deserializer solution, also known as serdes, is used to convert parallel data into serial data, without increasing the number of pins. ieee standards define fast data rates that impose four-level pulse amplitude modulation (pam4) signaling [2]. the price paid is the sensitivity of pam4 to noise and an increasing susceptibility to electromagnetic crosstalk problems in high-speed designs. advanced packaging styles and a constantly reduced area increase the complexity of the design and verification processes. next-generation switches and routers impose power scaling, larger i/o bandwidth and a flexible and optimized architecture.

1.1. differential signaling

differential signaling is a modern implementation method that enhances high-speed data carrying by using two signals, each in its own conductor. a stripline is a transverse electromagnetic (tem) transmission line which uses a flat strip of metal between two parallel ground planes, insulated in a dielectric bulk. the advantages of the planar microwave fabrication process also make the parallel stripline attractive for many other applications such as microwave sensors [3].

fig. 1 stripline transmission in differential topology

the common method to increase noise immunity in a stripline is to replace the single-ended topology with a differential one as shown in fig. 1, where there are two electromagnetically coupled conductors between the ground planes. high bandwidth differential signals can be transmitted if a uniform cross section along the length of the line ensures constant impedance. the greater the coupling, the more robust the pair is to ground bounce noise picked up from the environment [4]. the ground plane allows a common mode of propagation to exist together with the desired differential mode signaling, requiring a mixed theoretical approach.

1.2. chipset encapsulation

the encapsulation structure of an integrated circuit (ic), called the package, has both electrical and structural roles. in fact, it is a passive component that adapts the dimensions of the ic conductive elements to those specific to printed circuit boards (pcb). it also enables the redistribution of the signals to facilitate the connection of several components on the pcb. the complexity of the ic encapsulation substrate is due to the large number of differential signals that it integrates, thus realizing the interconnection between the integrated circuit and the printed circuit board. the interconnection paths can be seen and analyzed as differential paths. they cannot be realized in the form of straight lines, since they must bypass other elements or connect non-aligned structures, so many bends are required.

1.3. propagation issues

common-mode reflections generated in differential transmission lines of stripline or microstrip type are due to route bends and asymmetries, and cause signal degradation. the signal integrity issues of bend discontinuities in a high-speed interconnect design can be investigated using circuit simulators. shiue, guo and lin [5] deal with 45° bends used instead of right-angle bends for common-mode noise reduction. the length of the routes of the differential pair is conventionally measured along the midline of the route.
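to see why the bend geometry matters once the length is measured along the midline, consider a pair routed as two parallel offsets with sharp corners: at each corner the outer trace is longer than the inner one by 2·p·tan(θ/2), where p is the centre-to-centre pitch and θ the turning angle. the sketch below is a purely geometric illustration under that assumption, with a hypothetical pitch and permittivity; it is not the model used in [5].

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum [m/s]


def bend_length_mismatch(pitch, bend_angle_deg):
    """Extra length of the outer trace at one sharp bend of a parallel pair [m]."""
    return 2.0 * pitch * math.tan(math.radians(bend_angle_deg) / 2.0)


def skew_seconds(length_mismatch, eps_r):
    """Delay difference for a TEM line fully embedded in a dielectric [s]."""
    return length_mismatch * math.sqrt(eps_r) / C0


PITCH = 100e-6  # hypothetical centre-to-centre spacing of the pair [m]
EPS_R = 3.5     # hypothetical relative permittivity of the substrate

one_90 = bend_length_mismatch(PITCH, 90.0)
two_45 = 2.0 * bend_length_mismatch(PITCH, 45.0)

print(f"one 90 degree bend : {one_90 * 1e6:.1f} um, {skew_seconds(one_90, EPS_R) * 1e12:.2f} ps")
print(f"two 45 degree bends: {two_45 * 1e6:.1f} um, {skew_seconds(two_45, EPS_R) * 1e12:.2f} ps")
```

for the same total change of direction, two 45° bends accumulate less length mismatch than a single right-angle bend, and bends turning in opposite directions cancel each other's contribution.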
thus, for any bend of the differential pair, the outward path will have a longer length, whereby the propagated signal will have a greater delay [6]. skew is the deviation of propagation delay due to length differences and electrical loading. practical ways of compensating skew have been developed, and a parallel-plate metal patch can act as a compensation capacitance [5]. other technological aspects such as discontinuities, layer-to-layer variation of the dielectric constant or skew due to glass weave can also be considered [7].

1.4. paper structure

section 1 provides a quick view of the interconnection paths in the substrate of an ic package, analyzed as a stripline differential topology. it introduces specific propagation problems such as common-mode reflections, noise and delays, which can be simulated using specific software and can usually be compensated by length matching, and it ends with this outline of the paper. section 2 presents transmission line modeling principles and characterization. mixed-mode s-parameters, as the theoretical basis of the modeling, are briefly described in view of the electromagnetic-field simulations. section 3 presents the ansys hfss simulation methodology and its modeling principles. section 4 shows the layout routing rules for the package and the signal integrity requirements. section 5 presents evaluations of simplified structures. section 6 demonstrates optimized structures for common-mode noise reduction. section 7 summarizes the salient points of this work and highlights the advancements over the state of the art.

2. transmission line modeling and characterization

different approaches can be used for modeling the electromagnetic phenomena within the differential transmission line.

2.1. traditional distributed-element circuits models

in order to model the differential transmission line (see fig. 2), the lumped-element models with conventional passive electrical elements, exemplified in texts by gray and meyer [8], are replaced with other models containing distributed circuit elements per unit length. in this case, a complex distributed circuit analysis is required [8]. the distributed resistance, inductance, capacitance and conductance, known as the primary line constants, can model the transmission line as an infinite series of two-port cells, using the so-called telegrapher’s equations.

fig. 2 equivalent circuit with distributed elements per unit length

although these models were initially developed for microwaves, where concentrated constants are difficult to implement, bockelman affirms [9] that this method remains difficult to apply for measurements or tests in the rf and microwave frequency range.

2.2. models using s-parameters

scattering parameters (s-parameters) are more suitable for characterizing high-speed circuits at rf and microwave frequencies. the mixed-mode s-parameters theory [9] allows a real-mode measurement system and offers a solid basis for electromagnetic field simulation. a coupled line pair, line a and line b, over a common ground plane is analyzed. the four ports are not physical ports, but they can be seen as conceptual tools (see fig. 3).

fig. 3 mixed-mode two-port device

when the s-parameter indices are the same (s11 or s22), this indicates a reflection, because the input and output ports are the same.
the mixed-mode s-parameter matrix becomes:

$$
\begin{bmatrix} b_{dm1} \\ b_{dm2} \\ b_{cm1} \\ b_{cm2} \end{bmatrix} =
\begin{bmatrix}
s_{dd11} & s_{dd12} & s_{dc11} & s_{dc12} \\
s_{dd21} & s_{dd22} & s_{dc21} & s_{dc22} \\
s_{cd11} & s_{cd12} & s_{cc11} & s_{cc12} \\
s_{cd21} & s_{cd22} & s_{cc21} & s_{cc22}
\end{bmatrix}
\begin{bmatrix} a_{dm1} \\ a_{dm2} \\ a_{cm1} \\ a_{cm2} \end{bmatrix}
\tag{1}
$$

where:
a = direct wave (incident on the port);
b = reverse wave (reflected from the port);
dm1 and dm2 = differential mode at port 1 and port 2;
cm1 and cm2 = common mode at port 1 and port 2;
sdd = differential-mode s-parameters;
scc = common-mode s-parameters;
sdc = s-parameters describing the conversion of common-mode waves into differential-mode waves;
scd = s-parameters describing the conversion of differential-mode waves into common-mode waves.

the differential-mode voltage is the difference between the two voltages, establishing a signal that is no longer referenced to ground. the common-mode voltage in a differential topology is the average voltage at a port, i.e. half of the sum of the two voltages. the common-mode current is the sum of the two currents, and the return current for the common-mode signal flows through the ground plane. mixed-mode s-parameters can be measured with a specially designed practical system [9].

usually a channel must only match the characteristic impedance (50 Ω), but for high-speed transmissions the waveform at the connector output is degraded and only the complex s-parameter matrix shows the reflection/transmission characteristics (amplitude/phase) in the frequency domain. mixed-mode s-parameters also cover mode conversions [10]. the mixed-mode s-parameter theory can fully characterize a differential circuit, including coupled line systems, and it will be used in the electromagnetic-field simulation for optimizing the interconnection paths in the ic substrate.
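as an illustration of eq. (1), the sketch below converts a 4-port single-ended s-matrix into the mixed-mode matrix with the similarity transform s_mm = m · s_se · m^T. the port numbering (single-ended ports 1 and 2 form mixed-mode port 1, ports 3 and 4 form mixed-mode port 2), the helper names and the test matrices are assumptions made for this example; they are not taken from the paper or from any simulator.

```python
import math

K = 1 / math.sqrt(2)
# rows: dm1, dm2, cm1, cm2 (same ordering as eq. (1)).
# assumed port pairing: se ports 1,2 -> port 1; se ports 3,4 -> port 2.
M = [[K, -K, 0, 0],
     [0, 0, K, -K],
     [K, K, 0, 0],
     [0, 0, K, K]]

def matmul(a, b):
    """plain matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def to_mixed_mode(s_se):
    """return the 4x4 mixed-mode matrix [[sdd, sdc], [scd, scc]]."""
    m_t = [list(col) for col in zip(*M)]   # M is orthogonal: inverse == transpose
    return matmul(matmul(M, s_se), m_t)

def uncoupled_pair(t_a, t_b):
    """hypothetical matched, uncoupled pair: line a joins se ports 1-3,
    line b joins se ports 2-4, with transmissions t_a and t_b."""
    s = [[0j] * 4 for _ in range(4)]
    s[2][0] = s[0][2] = t_a
    s[3][1] = s[1][3] = t_b
    return s

symmetric = to_mixed_mode(uncoupled_pair(0.9, 0.9))
asymmetric = to_mixed_mode(uncoupled_pair(0.9, 0.8))

sdd21 = symmetric[1][0]        # differential insertion
scd21_sym = symmetric[3][0]    # differential -> common-mode conversion
scd21_asym = asymmetric[3][0]  # non-zero once the two lines differ
```

in this toy case the symmetric pair shows no mode conversion, while the unequal transmissions of the second pair produce a non-zero scd21, which is the kind of term evaluated later as tcd.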
this mixed-mode description allows a transmission line to be evaluated in both differential and common transmission modes, as a main output of the simulation process, in the next section of the paper.

2.3. odd and even propagation mode

in a stripline, the useful differential signal is applied at the end of a pair of coupled lines as a potential difference between the two signal conductors and propagates in the odd mode. the presence of the ground conductor, serving as the current return path, makes common-mode propagation possible. the even-mode signal, also called the common-mode signal, can be expressed as the average of the two amplitudes applied at the end of the coupled lines [4].

2.4. common-mode return loss

high-speed serdes in wire-bond package applications have clearly specified requirements for the scc11 parameter (common-mode return loss), among others. common-mode return loss is related to common-mode noise. na, arseneault et al. [11] show that, for a differential pair, common-mode return loss is a measure of the common-mode signal reflection caused by a mismatch of the common-mode impedance. electromagnetic interference emissions and noise coupling are not strongly related to common-mode return loss. better isolation and better decoupling of power supply noise on reference planes are good solutions for limiting the electromagnetic interference (emi) caused by common-mode noise. many transmission protocols impose clear limits on both differential- and common-mode reflection. common-mode noise mainly affects jitter, which has very small margins for pam4 modulation. also, in long-reach channels where the signal must be amplified by the receiver, the amplified common-mode noise can cause high overshoot voltages at sensitive receivers. any asymmetry in a differential transmission line produces a common-mode signal that propagates through the device. this mode conversion is a main source of electromagnetic interference (emission/radiation).
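a minimal numerical sketch of this effect, using the voltage definitions of section 2.2 (differential = difference, common mode = average): a complementary sine pair has no common-mode content until one leg is delayed. the tone frequency, the skew values and the function names are hypothetical and chosen only for illustration.

```python
import math

def mode_voltages(v_p, v_n):
    """differential and common-mode voltage of a pair."""
    return v_p - v_n, (v_p + v_n) / 2

def common_mode_peak(freq_hz, skew_s, samples=2000):
    """peak common-mode voltage of a unit-amplitude complementary sine pair
    whose negative leg is delayed by skew_s (sampled over one period)."""
    period = 1 / freq_hz
    peak = 0.0
    for i in range(samples):
        t = i * period / samples
        v_p = math.sin(2 * math.pi * freq_hz * t)
        v_n = -math.sin(2 * math.pi * freq_hz * (t - skew_s))
        _, v_cm = mode_voltages(v_p, v_n)
        peak = max(peak, abs(v_cm))
    return peak

# hypothetical example: 14 ghz tone with 0 ps and 1 ps of intra-pair skew
no_skew = common_mode_peak(14e9, 0.0)
with_skew = common_mode_peak(14e9, 1e-12)
```

for this idealized pair the common-mode amplitude follows |sin(π·f·τ)|, so it grows with both frequency f and skew τ.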
the electromagnetic compatibility compliance testing is a new condition for next-generation routers and switches at the end of the design cycle [10]. high frequency common-mode noise in serdes circuits’ optimized interconnections 333 3. electromagnetic-field simulations 3.1. 2d electromagnetic-field simulators ansys 2d extractor uses an automatic mesh refinement in order to obtain a highaccuracy solution over broadband frequencies. it uses the finite element method, dividing the whole 2d geometry into arbitrary-sized triangle elements, as shown in fig. 4. fig. 4 meshes for a differential stripline in an electromagnetic-field simulation when modeling structures with nonlinear characteristics, non-uniform meshes are generally used. this software allows smaller size cells in areas that are physically small but very important regarding electromagnetic field, and bigger cells in less complex regions [12], using an adaptive algorithm towards specific desired convergence criteria [13]. in order to identify a differential stripline that respects the adopted principle, 2d electromagnetic software that models its cross section is used. for every mesh element, the maxwell laws are applied in order to calculate the electric passive elements of the line per unit length. the boundary conditions on the interface between two elements of the geometry are automatically applied. based on these conditions and within the desired frequency range, the 2d structure is analyzed electromagnetically. the convergence criterion of the simulation is defined as a tolerance of the error imposed by the user, in this paper having a value of 0.5%. the main result of this 2d simulation is the characteristic impedance of the stripline structure that will be detailed in section 5. 3.2. 
3d electromagnetic-field simulators the evaluation of differential stripline transmission lines from the common-mode reflections point of view can be accurately performed using 3d electromagnetic field simulation software that implements the finite element method. ansys hfss (high frequency structure simulator) software is a 3d electromagneticfield simulation tool for designing and simulating high frequency electronic products. the software is recognized for its accuracy by both academia and industry [14], generally used for analysis of three-dimensional microwave structures [15]. in this paper, the working method of ansys hfss simulator is based on the discretization of geometry in a tetrahedron network of arbitrary dimensions according to the geometry to be analyzed, as shown in fig. 5. 334 r. vladuta, l. dobrescu, n. militaru, d. dobrescu fig. 5 simulated interconnects the electromagnetic field is calculated by applying maxwell's laws to a fea (finite element analysis) model. the automatic process adapts the mesh in consecutive steps, refining it so that it correctly captures the gradient of the electromagnetic field quantities and the process continues until the s-parameters or other user-defined quantity, change between two consecutive adaptive steps less than the convergence criterion imposed by the user. the frequency response of the geometry is calculated within a frequency range defined by the user. the s-matrix describing the analyzed multiport can be then postprocessed by ansys hfss as a matrix of mixed-mode s-parameters, allowing the evaluation of the transmission line both in differential and in common transmission mode. 4. integrated circuits layout design the package (pkg) with electrical and structural roles can be realized in wire-bond or flip-chip technology [16]. wire-bonding is a robust technology and its cost is a major advantage. 
flip-chip's advantages are a lower-inductance power distribution network, reduced switching noise and ground bounce, and lower parasitics, due to the replacement of the highly inductive wire bonds with smaller solder balls as the interface between the package and the ic. both technologies coexist due to continuous improvements.

4.1. substrate layers

as the encapsulation of application-specific integrated circuits (asics) has evolved from wire-bond to flip-chip technologies, the laminated substrate of the package acts like a mini-pcb with a surface of up to 60 mm × 60 mm, having between 12 and 16 metal layers. the interconnects between the package and the silicon die serve both for electrical connection and as a method of attaching the ic to the substrate, giving it structural stability. the package substrate is a printed wiring structure with the following laminated layers:
• the core, a middle dielectric layer with the greatest thickness, ensuring the rigidity of the printed wiring.
• metal layers deposited on both sides of the core, as a rule made of copper, in which the geometry of the interconnection elements of the electrical circuit (signal paths and power planes) is realized by etching.
• the build-up dielectric layers deposited to separate the metal layers. the etched copper areas are filled with the same dielectric material.
• vias (vertical interconnect accesses), vertical elements used to make the connection between the metal layers. a via is made by laser or mechanical drilling of the metal and dielectric layers through which the connection must be made, followed by plating of the cylinder thus formed or filling it with conductive material, usually copper.
• a thinner dielectric layer, called solder mask, applied over the outer metal layers, which protects the copper against oxidation and accidental short-circuiting.
• in the connection areas with the ic and the pcb, the solder mask is not applied, thus allowing the bumps and the balls to be joined.

the stack of metal and dielectric layers that makes up the substrate studied in this work consists of 2 solder masks of 20 µm, 12 copper layers of 15 µm and 11 intermediate dielectric build-ups of 30 µm. for differential pair routing, the first 3 metal layers are used: the signal paths are created on layer 2 and the reference planes use layers 1 and 3. for frequency modeling of the electrical properties of the dielectric, the ansys electromagnetic-field simulation programs integrate the djordjevic-sarkar mathematical model, which allows the electrical properties to be extrapolated over an entire frequency range starting from the known values at a single frequency point.

4.2. routing rules

the correct execution of electronic circuits involves more than their simulation in an electrical computer-aided design (e-cad) software environment. it is mandatory to consider the manufacturing process from the design stage as well. although the technology for producing printed circuits is advanced and structures of the order of microns can be manufactured, certain rules are imposed on the dimensions and spacing of the conductive elements. these layout routing rules, as shown in fig. 6, differ from one manufacturer to another, depending on the accuracy of the manufacturing methods or on the manufacturing equipment.

fig. 6 routing rules

the tolerance of the manufacturing processes translates into a percentage by which the physical dimensions of the topology elements of a printed circuit may vary from the nominal ones required by the engineer who performs the routing. the tighter the manufacturing tolerance and the greater the accuracy of the circuit geometry dimensions, the more expensive the manufacturing and assembly process will be.
in the highly competitive environment of the electronic device market, balancing design effort and manufacturing cost becomes critical.

in order to realize the differential striplines in a substrate manufactured with advanced technologies, the following manufacturing rules are required:
• minimum width of a path (a): 14 µm;
• maximum width of a route: 89 µm;
• elements with a constant width greater than or equal to 90 µm are considered planes (f);
• minimum spacing between paths (noted a): 14 µm;
• minimum spacing between paths and planes (c): 40 µm.

4.3. signal integrity rules

signal integrity refers to the quality of the electrical signals in terms of amplitude and timing. as most digital systems use variable or even programmable frequency data transfers, the passive elements that make up the transmission medium are required to comply with the signal integrity conditions across the whole frequency range. for the same reason, the signal integrity requirements for a segment of the transmission channel, in this case the package of an ic, are expressed in the frequency domain, while the requirements expressed in the time domain are used to validate the entire transmission channel. the signal integrity rules needed for the proper functioning of the ic are provided by its designer. considering only the transmission of differential signals, the rules of signal integrity are expressed using five main terms:

1. characteristic impedance
the characteristic impedance is the ratio of the amplitudes of the voltage and current of a wave propagating along a transmission line; it is simulated over the frequency domain up to 15 ghz. the value of the differential-mode impedance is twice the value of the odd-mode impedance, and the value of the common-mode impedance is half the value of the even-mode impedance. two parallel traces in a pkg substrate are coupled, and a characteristic differential impedance of 100 Ω will be used in simulations of serdes signals.
the common-mode impedance will be 25 Ω for noise.

2. insertion loss (il)
for a transmission line, the signal power lost in the device is usually expressed in db. the differential attenuation introduced by the package must not exceed 15% of the signal amplitude up to 15 ghz, the first spectral component of the highest-frequency useful signal. accordingly, the insertion s-parameter (s21) should not decrease below -1.4 db (20·log10(0.85) ≈ -1.41 db) at frequencies lower than 15 ghz in serdes circuit simulations [17].

3. return loss (rl)
for a transmission line, the power lost in the signal returned/reflected by a discontinuity, usually due to a mismatch of the terminating load or an impedance discontinuity along the conductive path, is also expressed in db. further referred to as signal reflection, it is expressed by an element of the differential s-parameter matrix (sdd11), and reflections are tolerated if they fall below a frequency-dependent limit. typical rl values range from 15 to 60 db. many designers take 10 db as the critical value and try to keep reflections below -10 db (a return loss of at least 10 db) at the desired signal speeds. in most cases, 60 db is more desirable [18].

4. crosstalk (xtalk)
crosstalk is the mutual influence of two parallel, nearby routed traces. as an undesired phenomenon (an inductive and a capacitive coupling), crosstalk is the effect created in a specific circuit (victim) by the signal transmitted in another circuit (aggressor). it is expressed by specific elements of the differential s-parameter matrix and it has frequency-dependent limits in simulations [19]. xtalk, perceived as a path-based approach for identifying pairs of pathways that may crosstalk, is used in computation [20], [21].

5. common-mode return loss (cmrl)
reflections of the common-mode signal, expressed by the commonly used s-parameter scc11, are tolerated if they fall below a specific frequency-dependent limit in serdes circuit simulations [19].

5.
simplified structures evaluations

by inserting guard traces into a simplified structure (see fig. 7), the crosstalk is reduced, because the electromagnetic waves couple to the guard trace.

fig. 7 guard traces added to a simplified structure

the dimensions of the elements of the simplified structure are shown in table 1.

table 1 routing rules
parameter                                  dimension [µm]
metal layer width                          15
dielectric width                           30
trace width                                21
trace separation                           90
separation between trace and guard trace   55
guard trace width                          90

using the simulation software based on maxwell's equations in the frequency range 1-15 ghz, the capacitance, inductance, and characteristic impedance values are calculated; the maximum path length of 10 mm and the signal rise time of 14 ps are defined as simulation parameters.

5.1. characteristic impedance for stripline topology

as shown in fig. 8, the characteristic impedance is frequency dependent.

fig. 8 simulated characteristic impedance

at 15 ghz, the wavelength becomes twice the differential pair's length, so it becomes very important to match the characteristic impedance here. the values obtained using ansys 2d extractor are in good agreement with the common-mode and differential-mode impedances from the signal integrity rules given in the previous section.

5.2. noise in a differential pair evaluation

a mixed-mode multiport, according to fig. 3, is defined by ports placed at the end of each signal path of the differential pair, with the input (in) at the end where the signal is applied and the output (out) at the end to which the signal is transmitted. the ports used in the simulation are placed as single-ended (se) ports with the standard 50 Ω impedance, referenced to the gnd conductor. in order to highlight the effect of the bends of a differential pair in a package structure, the conversion of the differential signal into a common-mode signal caused by discontinuities introduced in the signal propagation path is evaluated.
the common mode signal generated by a discontinuities (bends in this case) will propagate through the conductive structure in the two main directions: once as a commonly reflected signal having the opposite direction to the source differential signal that will be referred as rcd (reflected common-mode signal by conversion from differential mode) and as a common mode signal transmitted in the same direction as the source differential mode signal that will be referred as tcd (transmitted common-mode signal by conversion from differential mode). since data transmission through a serdes interface is purely differential, both at the transmitter and at the receiver, it can be considered the commonly generated signal by differential conversion as a noise that will be referred as cmn (common mode noise). in this paper, the cmn is evaluated due only to the package structure as a sum of rcd and tcd. high frequency common-mode noise in serdes circuits’ optimized interconnections 339 5.3. evaluation of a differential pair with bends the stripline structure can reduce the common-mode noise using a practical routing scheme, based on the same velocity of even-mode and odd-mode signals [5]. using dual back-to-back coupled bends with different angles, keeping the same routing rules for the matched impedance of the stripline differential pair, the same trace length without the significant skew can be maintained [5]. a right angle in a trace is not desired because the capacitance increases in the region of the bend, and the characteristic impedance changes. this impedance change causes reflections. so, right-angle bends in a trace are avoided and they are replaced with at least with two 45° bends, as shown in fig. 9. fig. 9 noise compensation zones in bends based on this principle, four test models with four bends were developed to evaluate the effect of 45˚ bends on the common reflections. 
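the length mismatch that a single bend introduces between the inner and the outer trace of a pair, which back-to-back bends are meant to compensate, can be estimated with a short sketch. it assumes sharp (non-rounded) corners and a constant centre-to-centre pitch, for which the outer trace is longer by 2·pitch·tan(angle/2). the pitch is built from the table 1 trace width and separation under the assumption that the separation is edge-to-edge, and the relative permittivity of 3.3 is a hypothetical value; neither assumption comes from the paper.

```python
import math

def bend_length_mismatch_um(pitch_um, angle_deg):
    """extra length of the outer trace relative to the inner one for a single
    sharp-cornered bend of a pair routed at constant centre-to-centre pitch."""
    return 2 * pitch_um * math.tan(math.radians(angle_deg) / 2)

def skew_ps(mismatch_um, eps_r):
    """delay difference for a tem wave in a homogeneous dielectric."""
    c_um_per_ps = 299.792458   # speed of light in vacuum, in µm/ps
    return mismatch_um * math.sqrt(eps_r) / c_um_per_ps

pitch = 21 + 90   # trace width + separation from table 1 (edge-to-edge assumed)
mismatch_45 = bend_length_mismatch_um(pitch, 45)
mismatch_90 = bend_length_mismatch_um(pitch, 90)
skew_45 = skew_ps(mismatch_45, eps_r=3.3)   # eps_r is a hypothetical value
```

under these assumptions a 45° bend gives less than half the mismatch of a right-angle bend, and a second bend of equal angle in the opposite direction brings the two trace lengths back into balance.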
in order to facilitate the presentation of the simulation results, the four models, designed as shown in fig. 10, are presented besides the straight model. they will be further referred as a, b, c and d models. the four models with bends are simulated using the same materials, boundary conditions and excitations for the entire frequency range between the dc point and the maximum frequency of the highest spectral component of 45 ghz. model a 1-2.7-1 model b 2-2-1 model c 1.7-1.7-1.7 model d 2-1.3-2 fig. 10 examples of different models with bends investigated 340 r. vladuta, l. dobrescu, n. militaru, d. dobrescu 5.4. differential pair without bends as reference model using an initial 3d simulation for a straight basic structure without bends, the first three s-parameters il, rl and cmrl can be extracted as reference, as shown in fig. 11. fig. 11 differentialand common-mode evaluation for the tested models the commonly used reflection attenuation limit (cmrl) also depicted in fig. 11 describes the common mode reflections generated by a common mode signal. it assumes the condition to be a non-ideal signal, containing both the differential mode component and a common mode noise due to the ic output stage, transmitted through the package. in differential pair structure simulations in frequency domain, outside noise is not determined and the cmrl cannot be interpreted as an effect of the differential pair bends, although in practice they will negatively influence the cmrl. high frequency common-mode noise in serdes circuits’ optimized interconnections 341 fig. 12 mixed-mode evaluation for the tested models the results of common-mode reflected signal converted from differential-mode signal (rcd), common-mode transmitted signal converted from differential-mode signal (tcd) and total common-mode noise generated (cmn) are shown in fig. 12. 
in the package rcd signal is critical due to its direction into the integrated circuit affecting the entire output buffer, more than tcd signal, seriously attenuated in communication channel consisting of three minimum elements: the ic package that emits the signal, the pcb and the ic receiver package, the most attenuating part remaining the pcb element. 342 r. vladuta, l. dobrescu, n. militaru, d. dobrescu 5.5. differential-mode parameters evaluation the il and rl, or sdd21 and sdd11 parameters, of all four models with bends are similar to the ones of the straight model, because the differential attenuation is mainly influenced by the equivalent resistance of the differential pair and while keeping the impedance controlled routing, length matching between the pair traces is enough to keep reflections below the necessary level. the insertion loss (il) for the first model is slightly different at high frequencies because it has the largest distance (2.7mm) between the phase shift and the correction area. in conclusion, for an ic encapsulation circuit, 45° bends can affect differential transmission if the distance between the phase area and the correction area is closed to the wavelength [22]. 5.6. common-mode parameter evaluation the higher the maximum value of cmrl is, the lower the transmission performance through this model will be. by evaluating the results of the cmrl attenuation versus frequency in fig. 11, all four tested models with 45° bends have a more unfavorable behavior than the straight model. the key decision factor in their overall evaluation is the maximum value over the whole analyzed frequency range. on the graph, the maximum cmrl values are periodically repeated after 8.25 ghz, due to the dimensions of the gnd plane of each model corresponding to the quarter of the wavelength of the frequency resonance, finally 6 frequency areas can be delimited on the graph from fig. 
11, mainly linked to the distance between the phase shift and the correction areas. the greater the distance between the two zones becomes, the more the reflections increase, a strong effect is in case of the a model, as the most unfavorable case tested. the most favorable behavior has the c model, with perfect symmetry and an average distance between the two zones, and d model, with a small distance between the zones, although the phase shift and correction segments have the largest length of the tested ones. 5.7. mixed-mode parameters evaluation mixed-mode s-parameters, rcd and tcd, scd11 and scd21 parameters describe signal conversion from differentialto common-mode. the signal resulting from this conversion is added to the noise that can be generated by the ic output stage, triggering a negative chain reaction, which disrupts the useful signal. from fig. 12, the c model has greater rcd values only at high frequencies, above 38 ghz where spectral components have lower amplitude, and the a model can be considered as the worst case. the main elements that can worsen common mode generated reflections are the phase shift zone length and the distance between the two zones. although common mode reflections focus the ic output stage transmitting the signal through the package, the common mode transmitted signal due to the differential conversion affects the ic input stage that receives it as an additional input signal. thus, the two mixed s-parameters values, rcd and tcd are equally important for a proper functioning of a communication channel through the serdes pam4 interface. according to the classification of the average tcd results, the main factor influencing the conversion of a differential signal into a common-mode signal is the distance between the phase shift area and the correction area [23]. 
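the two geometric factors discussed in this section, the distance between the zones and the symmetry of their lengths, can be tabulated for the four models of fig. 10 next to the average cmn values reported in table 2. the sketch below only restates those published numbers; the dictionary layout and the function name are illustrative choices.

```python
# (phase zone, distance, correction zone) in mm from fig. 10;
# average cmn in db from table 2.
models = {
    "a": {"zones": (1.0, 2.7, 1.0), "cmn_db": 45.25},
    "b": {"zones": (2.0, 2.0, 1.0), "cmn_db": 47.56},
    "c": {"zones": (1.7, 1.7, 1.7), "cmn_db": 46.48},
    "d": {"zones": (2.0, 1.3, 2.0), "cmn_db": 48.71},
}

def summarize(models):
    """list distance and zone asymmetry per model, ordered as in table 2."""
    rows = []
    for name, m in models.items():
        phase, distance, corr = m["zones"]
        rows.append({
            "model": name,
            "distance_mm": distance,
            "asymmetry_mm": abs(phase - corr),
            "cmn_db": m["cmn_db"],
        })
    # table 2 lists the models from the highest to the lowest average value
    return sorted(rows, key=lambda r: r["cmn_db"], reverse=True)

summary = summarize(models)
ranking = [row["model"] for row in summary]
```

the resulting order (d, b, c, a) reproduces table 2, with the shortest-distance model d first and the longest-distance model a last.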
the shorter the distance between the phase shift area and the correction area, the shorter the propagation time of the generated phase-shifted noise and the lower the tcd; a short distance is therefore preferable. note that if the two zones do not have equal lengths, the results will be more unfavorable, even if the distance between them is smaller than in the asymmetric case. the total common-mode noise generated by the 45° bends through differential conversion is calculated as the sum of rcd and tcd; it is depicted in fig. 12 and summarized in table 2.

table 2 cmn models classification
model      phase zone [mm]   distance [mm]   corr. zone [mm]   cmn average [db]
straight   –                 –               –                 49.70
d          2                 1.3             2                 48.71
b          2                 2               1                 47.56
c          1.7               1.7             1.7               46.48
a          1                 2.7             1                 45.25

as the average tcd values are generally lower, the same rule about the distance between the two zones remains the main element that influences the total common noise (cmn) generated by conversion from a differential signal.

in conclusion, cmn reduction (limiting the reflections) can be achieved in two ways:
• common impedance matching, in order not to generate reflections of the common-mode noise inserted into the package, and further into the transmission channel, by the ic output stage;
• optimization of the zones with impedance discontinuity, which in printed circuits means optimization of the 45° bends.

the bend optimization can be done considering the three main zones: the phase shift zone, the correction zone and the distance between them. in order to limit common-mode noise generation by differential conversion, which adds to the common-mode noise inserted by the ic, the distance between the two zones should primarily be as small as possible and their lengths should be as close as possible, or even the same.

6. common mode noise in optimized interconnects paths

the conclusions of the previous section are verified for a real package case, shown in fig. 13.
the medium-sized package has 17 mm sides and the stack up previously described in section 4. the package can encapsulate a 4 mm side ic and performs the interconnection between the ic and the pcb of 24 serdes pam4 channels with a 56 gbps per channel rate. the package geometry has been designed using e-cad software, allegro package designer and then automatically recognized in electromagnetic-field simulation software, preserving the accuracy of the geometric details. the differential pair in the real package requires 45˚ bends both to reach its connection to the pcb, a point that is not aligned with the differential signal output ic area and to bypass passive components assembled on the package such as decupling capacitors symbolized using the c letter in fig. 13. besides limiting the space where the differential pairs can be designed due to the passive components mounted on the package surface, the routes are conditioned to bypass the groups of vias that link the passive components mounted on the package and the power distribution network (pdn) on the lower metal layers, usually below the dielectric core. 344 r. vladuta, l. dobrescu, n. militaru, d. dobrescu in the real package paths electromagnetic simulation, the signal conductors’ geometry and the gnd conductor that serves as a reference plane for the routes are identically maintained. a major simplification consists in keeping only the coplanar guard elements from the metallic layer, where the paths of the differential pair are realized in, to highlight the main effect of the bending technique on the common-noise. in fig. 14 the real routing has a length of 7.12 mm, indicating the signal traveling through the ic package, from input (in), towards the pcb, (out). fig. 13 real package fig. 14 initial stripline routing this simplified model will be referred as initial in the electromagnetic-field simulations shown in fig. 15 and in fig. 16, in contrast with the optimized real model which is named pkg. 
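for orientation, the channel figures quoted for this package (24 serdes pam4 channels at 56 gbps each) can be translated into a symbol rate and a nyquist frequency. the sketch relies only on the fact that pam4 carries 2 bits per symbol; the function name is illustrative.

```python
def pam4_link_figures(bit_rate_gbps, channels):
    """symbol rate, nyquist frequency and aggregate rate of a pam4 link."""
    bits_per_symbol = 2                        # pam4: 4 levels -> log2(4) bits
    baud_gbd = bit_rate_gbps / bits_per_symbol
    nyquist_ghz = baud_gbd / 2
    aggregate_gbps = bit_rate_gbps * channels
    return baud_gbd, nyquist_ghz, aggregate_gbps

baud, nyquist, aggregate = pam4_link_figures(56, 24)
```

this gives 28 gbd per channel, a nyquist frequency of 14 ghz and an aggregate rate of 1344 gbps. that 14 ghz lies close to the 15 ghz limit used in the signal integrity rules is an observation of this sketch, not a statement of the paper.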
the initial model is also depicted in fig. 17, where the phase shift zone and the correction zone are identified. the greater number of 45° bends in the initial model, compared to all simplified structures from section 5, disturbs the linear trend of the insertion loss versus frequency, as shown in fig. 15. the 45° bends have no negative effect on reflection, as the differential rl shows. due to the reduced length compared to the five models from section 5, the cmrl has similar values. in fig. 16, rcd has greater values up to 20 ghz and tcd has higher values than the equivalent model, but low enough to be attenuated along the transmission channel. since the negative effect of the 45° bends on the total common noise is pronounced at frequencies up to 20 ghz, the differential pair optimization becomes a true necessity.

the initial pair, shown in fig. 17, is optimized in fig. 18. the optimization consists in changing the dimensions of several zones in order to reduce the common-mode signal reflected and transmitted by differential conversion. the initial differential pair dimensions are shown in table 3, where they are compared with the final dimensions of the optimized structure shown in fig. 18. the aim of this optimization was for the differential pair to have the smallest distance between the phase shift area and the correction area and a smaller number of 45° bends. in order to reduce the number of 45° bends, a compromise was needed with respect to the length of the correction area segment, which became longer.

table 3 real package bends zones dimensions
element               initial dimensions [mm]   optimized dimensions [mm]
straight segment      0.9                       0.7
phase shift d1        0.8                       1.0
d1 – c1 distance      1.4                       0.9
correction zone c1    1.4                       3.3
straight segment      0.3                       0.8
45° bend – d2         –                         –
d2 – c2 distance      1.5                       –
straight segment      0.82                      –

fig. 15 differential and common mode evaluation for the initial and real package

due to the changes made to optimize the differential pair, the length of the pair changed to 6.7 mm. an improvement in the frequency behavior of the differential pair was noticed only in terms of the common-mode and mixed-mode s-parameters.

fig. 16 mixed-mode evaluation for the initial and real package

from the point of view of transmitting a signal or a common-mode noise, the common-mode reflections are improved at high frequencies, above 38 ghz, in the optimized case compared to the initial case, as shown in the last graph in fig. 16. because the distance between the phase shift and correction zones is smaller after optimization, the rcd is reduced. the common-mode signal transmitted in the same direction as the source differential-mode signal, tcd, is greatly attenuated due to the reduced number of bends that the source differential signal encounters in its path (see fig. 18).

fig. 17 initial structure for real stripline routing

fig. 18 optimized structure

7. conclusions

the paper analyzed the common-mode noise effects due to 45° bends in differential transmission lines. the study focused on the stripline differential transmission lines of an ic package, for high-speed data transmitted through a serdes pam4 interface. the 45° bends are mandatory in package design to interconnect the integrated circuit with the pcb, linking points that cannot be aligned and bypassing passive components, such as decoupling capacitors, mounted on the surface of the package. up to 15 ghz, in order to ensure signal integrity, it was enough to balance the lengths of the differential pair routes, neglecting the signal conversion from differential to common mode. this conversion between the two transmission modes is a negative effect introduced by discontinuities in the differential pair.
the study started by identifying, through electromagnetic modeling and simulation, a differential pair structure with coplanar elements and matched differential impedance. using the dimensions identified for the impedance-matched stripline structure, five differential pair test models with an average length of about 10 mm were designed for a flip-chip package: a straight model as reference and four different models including 45˚ double bends, with three main zones of interest: the phase shift zone, the correction zone and the distance between them. the five test models were analyzed in the frequency domain using 3d electromagnetic simulation. the frequency dependence of the differential, common-mode and mixed-mode s-parameters, presented as graphs, was compared with the operating requirements of the ic manufacturer up to 45 ghz. these models were also evaluated in terms of the common-mode noise generated by the 45˚ bends, expressed through mixed-mode parameters: the common-mode reflections due to a differential-mode source signal, scd11 or rcd, and the common-mode signal transmitted in the same sense as the source differential-mode signal, scd21 or tcd. their sum, the total common noise cmn, was also investigated.

comparing the results for the five test models, it was concluded that the main factor that increases the common-mode noise generated by conversion from differential mode is a large distance between the phase shift zone and the correction zone. the next factor is the symmetry between them. this generated common-mode noise, cmn, propagating along the differential paths, behaves unfavorably at high frequencies.

these conclusions were verified by optimizing a stripline differential pair from a real package with a length of 7.12 mm.
the total generated common noise, cmn, has been evaluated by comparing the results of its 3d electromagnetic simulations with the results of the initial model. because the noise generated by the bends is predominantly reflected towards the ic output buffer, it was decided to optimize the differential pair by reducing the distance between the phase shift and the correction zones and by reducing the number of bends from 6 to 4. comparing the initial model with the optimized solution in a real package, an improvement in the total common noise generated by differential conversion has been obtained over the entire frequency range up to 45 ghz.

future developments will focus on the effects of 45˚ bends on the crosstalk between different pairs on adjacent metal layers. in multi-layered flip-chip packages, although the output and the input signals are placed on different layers, they share a metal layer that serves as a reference plane for both classes of differential signals. thus, the common noise generated by the bends in the differential pairs carrying the un-attenuated output signals can be electromagnetically coupled, through the common ground plane, to the input signals that have already been attenuated by the transmission channel.

facta universitatis series: electronics and energetics vol. 32, no. 2, june 2019, pp. 249-265 https://doi.org/10.2298/fuee1902249c

ewma statistics and fuzzy logic in function of network anomaly detection

petar čisar 1, sanja maravić čisar 2
1 university of criminal investigation and police studies, zemun-belgrade, serbia
2 subotica tech, department of informatics, subotica, serbia

abstract. anomaly detection is used to monitor and capture traffic anomalies in network systems.
many anomalies manifest in changes in the intensity of network events. because the ewma control chart can monitor the rate of occurrence of events based on their intensity, this statistic is appropriate for implementation in control-limit-based algorithms. the performance of the standard ewma algorithm can be made more effective by combining the logic of the adaptive threshold algorithm with an adequate application of fuzzy theory. this paper analyzes the theoretical possibility of applying ewma statistics and fuzzy logic to detect network anomalies. different aspects of fuzzy rules are discussed, as well as different membership functions, in an attempt to find the most adequate choice. it is shown that introducing fuzzy logic into the standard ewma algorithm for anomaly detection opens the possibility of early warning of a network attack. in addition, fuzzy logic enables precise determination of the degree of risk.

key words: network anomaly detection, ewma, fuzzy rules, membership functions, operators

1. introduction

intrusion detection is an area of computer security that involves the detection of unwanted manipulations of computer networks. an intrusion detection system (ids) is required to detect all types of malicious network traffic and computer usage that cannot be detected by a conventional firewall (fig. 1). this security method is needed in today's computing environment because it is impossible to keep pace with the current and potential threats and vulnerabilities in our computing systems. an ids may be categorized by its detection mechanism as anomaly-based, signature-based or hybrid (using both of the previous technologies).

received october 4, 2018; received in revised form january 15, 2019
corresponding author: petar čisar, university of criminal investigation and police studies, cara dušana 196, 11080 zemun-belgrade, serbia (e-mail: petar.cisar@gmail.com)

fig. 1 ids elementary configuration [1]

when the ids identifies intrusions as unusual behaviour that differs from the normal behaviour of the monitored system, this analysis strategy is called anomaly detection [2]. the disadvantage of this method is the relatively large number of false alarms, i.e. network situations in which the detection system indicates a non-existent attack. in order to reduce the number of false alarms, the traffic activity is measured. the goal is to decide whether a network situation is an attack or not. in order to make the correct decision, behaviour profiles are defined: normal and abnormal profiles (simple thresholds (limit values) or complex statistical distributions). the false positive rate is considered one of the most important factors in evaluating the performance of an ids.

anomaly detection learns a statistical or neural network model to figure out what is normal. the following techniques can be used: bayesian statistics, neural networks, expert systems and statistical decision theory.

a number of anomalies are seen in changes in the intensity of events occurring in computer networks. because the exponentially weighted moving average (ewma) control chart can supervise the event-occurrence rate on the basis of event intensity, this statistic can be implemented in control-limit-based algorithms. the effectiveness of the standard ewma algorithm can be increased by combining the logic of the adaptive threshold algorithm with an adequate application of fuzzy theory. fuzzy logic enables precise adjustment of the degree to which limit values are exceeded (expressed as a percentage). different aspects of fuzzy rules are described in this paper, including different membership functions, in an attempt to define the most suitable choice. the aim of this paper is to show the theoretical opportunity and examine a possible way of implementing ewma statistics and fuzzy logic to detect network anomalies.
ewma statistics and the fuzzy approach are not new in analyzing and detecting network anomalies and intrusions. ye et al. [3] implemented a chi-square distance metric to measure the deviation of the observed activities from the forecast of normal activities. the results indicate that the chi-square distance measure with ewma forecasting provides better performance in intrusion detection than with the average-based forecasting method. abdeh et al. [4] applied genetic fuzzy systems and showed that they are able to develop accurate and also interpretable intrusion detection systems. yu et al. [5] developed a fuzzy model tuner, with which the user can tune the model fuzzily and still obtain appropriate tuning. the results showed that the system can achieve about 23% improvement. works [6], [7] and [8] also point to the benefits of both approaches for improving the quality of transmission and predicting network anomalies.

senturk et al. [9] combined a popular control chart, the ewma control chart for univariate data, with a fuzzy environment. the fuzzy ewma control charts (fewma) can be used for detecting small shifts in the original data represented by fuzzy numbers (unlike this paper, in which fuzzy logic is applied in the decision-making phase based on the original network values). the fewma control charts decrease the number of false decisions by providing flexibility in the control limits.

dickerson et al. [10] explored using fuzzy systems as the correlation engine for an ids. fuzzy systems have several important characteristics that suit intrusion detection: fuzzy systems can readily combine inputs from widely varying sources, many types of intrusions cannot be crisply defined, and the degree of alarm that can occur with intrusions is often fuzzy.

researchers [11] developed the fuzzy intrusion recognition engine (fire) using fuzzy sets and fuzzy rules.
fire uses simple data mining techniques to process the network input data and generate fuzzy sets for every observed feature. the fuzzy sets are then used to define fuzzy rules to detect individual attacks. fire does not establish any sort of model representing the current state of the system; instead, it relies on attack-specific rules for detection, creating and applying fuzzy logic rules to the audit data to classify it as normal or anomalous. the authors found that the approach is particularly effective against port scans and probes. the primary disadvantage of this approach is the labor-intensive rule generation process.

control charts have characteristics that can be successfully applied to detect shifts caused by network anomalies. in addition to the ewma control charts, a cusum (cumulative sum) algorithm can also be used for monitoring change-points; its behaviour in the case of distributed denial of service (ddos) attacks is described in [31]. the authors in [32] implemented ewma to detect anomalous changes in the intensity of a jamming attack event by using the packet inter-arrival feature of the packets received from the wireless sensor nodes.

this paper consists of six sections. the introduction offers a terminological basis for the problem of intrusion detection, and the second section presents a short overview of anomaly detection techniques. the ewma algorithm is described in the third section, followed by the fourth section dealing with the fuzzy approach. the simulation and results of the analyzed algorithm and its various aspects in an adequate software environment are given in the fifth section. the sixth part elaborates the process of improving the results. finally, the paper closes with a conclusion based on the analyzed cases.

2. intrusion detection

the main challenge in intrusion detection is that of separating anomalous events from normal events.
anomalous events can include actual attacks against a computer system or probes aimed at information reconnaissance, which are more subtle and hence more difficult to detect. another challenge in id is that of false positives. false positives in id occur when an ids reports an intrusion that has not in fact occurred. it has been argued that it is actually this false alarm rate that is the limiting factor in an ids's performance.

the performance of a network ids can be more effective if it includes not only attack signature matching but also traffic analysis at the same time. by using traffic analysis, anomalous traffic is identified as a potential intrusion. traffic analysis does not deal with the payload of a message, but with its other characteristics such as source, destination, routing, length of the message, time it was sent, the frequency of the communication etc. [2]. traffic payload is not always available for analysis: the traffic may be encrypted or it may simply be against policy to analyze packet payload [12].

anomaly detection techniques [13]:
• protocol anomaly detection – protocol anomaly refers to all exceptions related to protocol format and protocol behaviour.
• application payload anomaly – application anomaly must be supported by detailed analysis of application protocols. application anomaly also requires understanding of the application semantics in order to be effective.
• statistical anomaly – to fully characterize the traffic behaviour in any network, various statistical measures are used to capture this behaviour. additionally, the implemented statistical algorithm must recognize the difference between the long-term (assumed normal) and the short-term observations to avoid generating false alarms on normal traffic variations.

network statistical anomaly detection (nsad) attempts to dynamically understand the network and statistically identify traffic that deviates from normal traffic usage and patterns.
nsad systems can be broken down further into threshold, baseline and adaptive systems, with each looking for different triggers to identify anomalous behaviour [2].

3. exponentially weighted moving average

many intrusions manifest in changes in the intensity of events occurring in computer networks. because ewma control charts can monitor the rate of occurrence of events based on their intensity, this technique is appropriate for implementation in control-limit-based algorithms. the performance of the standard ewma algorithm can be made more effective by combining the concept of the adaptive threshold algorithm with an adequate application of fuzzy logic.

the theoretical background in the following section has already been discussed in previous works [14], [15], [16], yet summarizing it again is vital for the comprehension of the entire paper.

the exponentially weighted moving average is a statistic for monitoring a process that averages the data in a way that gives less and less weight to data as they are further removed in time. for the ewma control technique, the decision regarding the state of control of the process depends on the ewma statistic, which is an exponentially weighted average of all prior data, including the most recent measurements. the statistic that is calculated is:

$$\mathrm{ewma}_t = \lambda y_t + (1 - \lambda)\,\mathrm{ewma}_{t-1} \quad \text{for } t = 1, 2, \ldots, n \tag{1}$$

where
• $\mathrm{ewma}_0$ is the mean of historical data
• $y_t$ is the observation at time t
• n is the number of observations to be monitored, including $\mathrm{ewma}_0$
• 0 < λ ≤ 1 is a constant that determines the depth of memory of the ewma.

this equation was formulated by roberts [17]. by the choice of the weighting factor λ, the ewma control procedure can be made sensitive to a small or gradual drift in the process. the parameter λ determines the rate at which "older" data enter into the calculation of the ewma statistic. a value of λ = 1 implies that only the most recent measurement influences the ewma.
thus, a large value of λ (closer to 1) gives more weight to recent data and less weight to older data, while a small value of λ gives more weight to older data [18]. the value of λ is usually set between 0.2 and 0.3 [19], although this choice is somewhat arbitrary. lucas and saccucci [20] have shown that although the smoothing factor λ used in an ewma chart is usually recommended to be in the interval between 0.05 and 0.25, in practice the optimally designed smoothing factor depends not only on the given size of the mean shift δ, but also on a given in-control average run length (arl). arl represents the average number of process points plotted before the first point indicates an out-of-control state (one of the control limits is exceeded).

the estimated variance of the ewma statistic is approximately:

$$\sigma_{\mathrm{ewma}}^2 \approx \frac{\lambda}{2 - \lambda}\,\sigma^2 \tag{2}$$

when t is not small, where σ is the standard deviation calculated from the historical data.

the center line for the control chart is the target value or $\mathrm{ewma}_0$. the upper and lower control limits are [21]:

$$\mathrm{ucl} = \mathrm{ewma}_0 + k\,\sigma_{\mathrm{ewma}} \tag{3}$$

$$\mathrm{lcl} = \mathrm{ewma}_0 - k\,\sigma_{\mathrm{ewma}} \tag{4}$$

where the factor k is either set equal to 3 (the 3-sigma control limits) or chosen using the lucas and saccucci tables [20]. the ewma control chart is especially efficient in detecting small shifts of the monitored process, i.e. shifts of less than 1.5σ.

control charts are specialized time series plots, which assist in determining whether a process is in statistical control. some of the most widely used forms of control charts are x-r charts and individuals charts. these are frequently referred to as "shewhart" charts after the control charting pioneer walter shewhart, who introduced such techniques. these charts are sensitive to relatively large shifts in the process (i.e. of the order of 1.5σ or above). in computer network practice, shifts can be caused by intrusion or attack, for example.
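the recursion in (1) and the control limits in (3) and (4) can be sketched in a few lines of python. this is a minimal illustration, not the authors' implementation: the function names `ewma_series`, `control_limits` and `out_of_control` are hypothetical, and the limits rely on the large-t variance approximation from (2).

```python
import math


def ewma_series(observations, ewma0, lam):
    """Return EWMA_t = lam * y_t + (1 - lam) * EWMA_{t-1} for each observation."""
    if not 0.0 < lam <= 1.0:
        raise ValueError("lam must satisfy 0 < lam <= 1")
    values = []
    previous = ewma0  # EWMA_0: mean of the historical data
    for y in observations:
        previous = lam * y + (1.0 - lam) * previous
        values.append(previous)
    return values


def control_limits(ewma0, sigma, lam, k=3.0):
    """Return (LCL, UCL) = EWMA_0 -/+ k * sigma * sqrt(lam / (2 - lam))."""
    sigma_ewma = sigma * math.sqrt(lam / (2.0 - lam))
    return ewma0 - k * sigma_ewma, ewma0 + k * sigma_ewma


def out_of_control(observations, ewma0, sigma, lam, k=3.0):
    """Return the 1-based time indices whose EWMA value lies outside the limits."""
    lcl, ucl = control_limits(ewma0, sigma, lam, k)
    series = ewma_series(observations, ewma0, lam)
    return [t for t, z in enumerate(series, start=1) if z < lcl or z > ucl]
```

with the parameters the paper uses later in its simulation (ewma0 = 50, σ = 2.0539, λ = 0.3, k = 3), `control_limits` gives limits of about 47.41 and 52.59, in agreement with the lcl and ucl quoted there.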
two types of charts are usually used to detect smaller shifts (less than 1.5σ), namely cusum charts and ewma charts. a cusum chart plots the cumulative sums of the deviations of each sample value from a target value. an alternative technique to detect small shifts is to use the ewma methodology. this type of chart has some very attractive properties, in particular [14], [16]:
1. unlike x-r and individuals charts, all of the data collected over time may be used to determine the control status of a process.
2. like the cusum, the ewma utilizes all previous observations, but the weight attached to data exponentially decreases as the observations become older and older.
3. the ewma is often superior to the cusum charting technique due to the fact that it detects larger shifts better.
4. ewma schemes may be applied for monitoring standard deviations in addition to the process mean.
5. ewma schemes can be used to forecast values of a process mean.
6. the ewma methodology is not sensitive to normality assumptions.

in real situations, the exact value of the shift size is often unknown and can only be reasonably assumed to vary within a certain range. such a range of shifts deteriorates the performance of existing control charts.

calculating the optimal value of the parameter λ is based on the study of authentic samples of network traffic. random variations of network traffic are normal phenomena in the observed sample. in order to decrease or eliminate the influence of individual random variations of network traffic on the occurrence of false alarms, the procedure of exponential smoothing is applied as an aspect of data preprocessing. for any time period t, the smoothed value $s_t$ is determined by computing:

$$s_t = \lambda y_{t-1} + (1 - \lambda)\,s_{t-1} \tag{5}$$

where 0 < λ ≤ 1 and t ≥ 3. this is the basic equation of exponential smoothing. the formulation here is given by hunter [19].
it should be noted that there is an alternative approach in which, according to roberts [17], $y_t$ is used instead of $y_{t-1}$. this smoothing scheme starts by setting $s_2$ to $y_1$ (there is no $s_1$), where $s_i$ stands for the smoothed observation or ewma, and $y_i$ stands for the original observation. the subscripts refer to the time periods 1, 2, ..., n. for example, the value for the third period is $s_3 = \lambda y_2 + (1 - \lambda) s_2$, and so on.

there is no generally accepted statistical procedure for choosing λ. in that situation, the method of least squares might be adequate to determine the optimal value of λ, i.e. the value for which the sum of the squared errors (sse), $\sum_t (s_t - y_t)^2$, is minimized. the method of least squares represents a standard approach to the approximate solution of over-determined systems (i.e. sets of equations in which there are more equations than unknowns). the most important application is in data fitting. the best fit in the least squares sense minimizes the sum of squared residuals, a residual being the difference between an observed value and the fitted value provided by a model.

here is an illustration of this principle through an example [15]. consider the following data set consisting of n observations of data flow over time, for a starting value of λ = 0.1:

table 1 smoothing scheme

time | flow ($y_t$) | $s_t$ | error ($s_t - y_t$) | error squared
1 | $y_1$ | | |
2 | $y_2$ | $y_1$ | $e_2$ | $e_2^2$
3 | $y_3$ | $s_3$ | $e_3$ | $e_3^2$
... | ... | ... | ... | ...
n | $y_n$ | $s_n$ | $e_n$ | $e_n^2$
| | | sum | $\mathrm{sse}_n$

the sum of the squared errors (sse) obtained in this way is $\mathrm{sse}_{0.1}$. after that, the sse is calculated for λ = 0.2. if $\mathrm{sse}_{0.2} < \mathrm{sse}_{0.1}$, then λ = 0.2 is the better value. this iterative procedure is repeated over the range of λ between 0.1 and 0.9. in this way, the best initial choice for λ is determined; then, to get a more precise value, the search optionally continues between λ – δλ and λ + δλ, where δλ is an arbitrarily small interval around λ (for instance, in practical applications, ±10% around the optimal λ).
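the smoothing scheme and the least-squares search for λ described above can be sketched as follows. this is only an illustration under stated assumptions: the helper names (`smooth`, `sse`, `best_lambda`, `refine_lambda`) are hypothetical, the coarse grid is 0.1 to 0.9 in steps of 0.1 as in the text, and the refinement scans a ±10% window with an arbitrarily chosen number of grid points.

```python
def smooth(y, lam):
    """Hunter-style smoothing: s_2 = y_1, s_t = lam*y_{t-1} + (1-lam)*s_{t-1}.

    y is the list [y_1, ..., y_n]; the result is [s_2, ..., s_n].
    """
    if len(y) < 2:
        raise ValueError("at least two observations are needed")
    s = [y[0]]  # s_2 = y_1 (there is no s_1)
    for j in range(1, len(y) - 1):
        s.append(lam * y[j] + (1.0 - lam) * s[-1])
    return s


def sse(y, lam):
    """Sum of squared errors (s_t - y_t)^2 over t = 2..n."""
    return sum((s_t - y_t) ** 2 for s_t, y_t in zip(smooth(y, lam), y[1:]))


def best_lambda(y, candidates=None):
    """Coarse search: the candidate lambda with the smallest SSE."""
    if candidates is None:
        candidates = [round(0.1 * i, 1) for i in range(1, 10)]  # 0.1 .. 0.9
    return min(candidates, key=lambda lam: sse(y, lam))


def refine_lambda(y, coarse, rel_width=0.10, steps=21):
    """Optional fine search in a +/- rel_width window around the coarse value."""
    low = coarse * (1.0 - rel_width)
    high = min(1.0, coarse * (1.0 + rel_width))
    grid = [low + (high - low) * i / (steps - 1) for i in range(steps)]
    return min(grid, key=lambda lam: sse(y, lam))
```

for a short, steadily rising series such as `[10, 20, 30]` the search favours the largest candidate, which matches the intuition that a large λ tracks recent data more closely.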
the initial ewma plays an important role in computing all the subsequent ewmas. there are several approaches to defining this value:
1. setting $s_2$ to $y_1$
2. setting $s_2$ to the target of the process
3. setting $s_2$ to the average of the first four or five observations

it can also be shown that the smaller the value of λ, the more important is the selection of the initial ewma.

the sensitivity of the standard ewma algorithm can be improved by implementing the logic of the adaptive threshold algorithm [22]. namely, a network anomaly in the adaptive algorithm is detected only when the threshold is exceeded for multiple consecutive time intervals (marked with # in the figure below), i.e. if # > k, where the factor k is set by the network security administrator. the sensitivity of this algorithm also depends on the amount by which the threshold is exceeded (βμ_t, where μ_t represents the measured mean in some observation period).

fig. 2 adaptive threshold algorithm

statistical anomaly detection in computer networks and the implementation of ewma statistics in a network environment are also addressed in publications [23], [13], [15] and [24].

4. fuzzy approach

in the misuse or signature model, the intrusion detection problem is viewed as a classification problem: the goal is to classify patterns of the system behaviour into two categories (normal and abnormal), using patterns of known attacks, which belong to the abnormal class, and patterns of the normal behaviour. with fuzzy rules, the solution of this classification problem is based on fuzzy logic concepts.

fuzzy systems have several important characteristics that suit intrusion detection very well [10]:
• fuzzy implementations have been shown to readily combine inputs from widely varying sources (for instance, digital cameras are usually equipped with an auto-focusing feature that estimates the distance. for this purpose, the camera's fuzzy control system uses several different inputs).
• many types of intrusions cannot be crisply defined (e.g. the value of the adaptive (variable) alarm threshold or network values, including intrusions, which in most real cases do not belong to a set of predefined values).
• the degree of alert that can occur with intrusions is often fuzzy because there is no clear distinction between normal and anomalous traffic behaviour in a network.

in fuzzy logic, fuzzy sets define the linguistic notions and membership functions define the truth-value of such linguistic expressions [25]. a collection of fuzzy sets, called a fuzzy space, defines the fuzzy linguistic values or fuzzy classes that an object can belong to. for instance, a fuzzy space of five sets (low, med-low, medium, med-high and high) is shown in the following figure [26].

fig. 3 fuzzy space of five sets

with fuzzy spaces, fuzzy logic allows an object to belong to different classes at the same time. this possibility is helpful when the difference between classes is not well defined. this is the case in the intrusion detection task, where the difference between the normal and the abnormal class is not well defined [27].

anomaly detection by introducing fuzzy logic into the time-based ewma algorithm can be realized through several phases:

definition of set of possible values of the state (inputs) in multiple categories

the basic concept of the analyzed algorithm is the following: for the input network traffic samples, the corresponding ewma values are calculated, and as many consecutive values are observed as there are sets in the fuzzy space. for the purposes of this analysis, the fuzzy space consists of three sets. this implies that the set of regular ewma values of network traffic is divided, depending on the intensity, into three categories.
for instance: low, medium and high ewma values (in accordance with the defined membership function in relation to the threshold, shown in fig. 2). in addition to the criterion of traffic intensity, another variable criterion for anomaly detection can be set in a similar way: the number of consecutive times the threshold (in this case, the upper control limit) is exceeded, #. since by its nature it represents a crisp value that cannot be fuzzified, this paper will, to simplify the analysis without loss of generality, accept the value # = 3. this means that the fuzzy algorithm analyzes a block of three consecutive ewma values (ewma1, ewma2 and ewma3) and, depending on their values, formulates the conclusion about the type of output. the exceeding parameter (the parameter that indicates the percentage above the upper threshold, e.g. 0.5) provides fine tuning of the algorithm, additionally eliminating false alarms.

definition of set of possible actions (or output types) in a few sets or categories

in the case of anomaly detection, the possible set of output actions could be attack alarm, warning alarm and indication of normal condition. in this way, the fuzzy outputs are determined. in order to do this, it is necessary first to set up the empirical definition of fuzzy rules ("if then" rules) or fuzzy relations. fuzzy rules in this case may look like this:
• if all three ewma values of traffic are high (value greater than the threshold), then an attack alarm is generated.
• if two values are high and one is medium, then a warning alarm is generated.
• all other cases represent normal traffic situations.

the next step is defining the membership function that describes the fuzzy sets. a membership function is a curve that defines how each point in the input space is mapped to a membership value (or degree of membership) between 0 and 1.
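the three rules above can be illustrated with a small python sketch that evaluates rule strengths for a block of three consecutive ewma values. everything specific here is an assumption made for illustration: the ewma values are taken to be scaled to [0, 1], the triangular breakpoints of `medium` and `high` are hypothetical (the paper gives its membership functions only graphically), the "and" operator is taken as the minimum, and "all other cases" is modelled as the complement of the strongest alarm or warning rule. it does not reproduce the toolbox's full inference and defuzzification.

```python
def trimf(x, a, b, c):
    """Triangular membership: 0 outside (a, c), rising linearly to 1 at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical fuzzy sets for EWMA values scaled to [0, 1].
def medium(x):
    return trimf(x, 0.0, 0.5, 1.0)


def high(x):
    return trimf(x, 0.5, 1.0, 1.5)


def rule_strengths(e1, e2, e3):
    """Firing strength of each rule for three consecutive EWMA values."""
    h = [high(e) for e in (e1, e2, e3)]
    m = [medium(e) for e in (e1, e2, e3)]
    # Rule 1: all three values are high -> attack alarm.
    alarm = min(h)
    # Rule 2: two values are high and one is medium -> warning alarm.
    warning = max(
        min(h[0], h[1], m[2]),
        min(h[0], m[1], h[2]),
        min(m[0], h[1], h[2]),
    )
    # Rule 3 (assumption): everything else counts as normal traffic.
    normal = 1.0 - max(alarm, warning)
    return {"alarm": alarm, "warning": warning, "normal": normal}


def classify(e1, e2, e3):
    """Return the label of the rule with the largest firing strength."""
    strengths = rule_strengths(e1, e2, e3)
    return max(strengths, key=strengths.get)
```

under these assumed breakpoints, three equal inputs of 0.91 fire the alarm rule most strongly, while two high values followed by a medium one, such as (0.9, 0.9, 0.6), are labelled as a warning.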
there are different types of membership functions. the only condition a membership function must really satisfy is that it must vary between 0 and 1. the function itself can be an arbitrary curve whose shape is chosen to suit the requirements of simplicity, convenience, speed and efficiency. fuzzy inference is the actual process of mapping from a given input to an output using fuzzy logic.

5. simulation and results

the described fuzzy ewma algorithm was simulated with the software package "matlab" fuzzy logic toolbox 2.0. the basic simulation scheme of the algorithm consists of three consecutive input ewma values and one output value (fig. 4). in the fuzzy logic toolbox, there are five parts of the fuzzy inference process:
• fuzzification of the input variables
• application of the fuzzy operator (and or or) in the antecedent
• implication from the antecedent to the consequent
• aggregation of the consequents across the rules
• defuzzification

fig. 4 basic simulation scheme

two types of fuzzy inference systems (fis) can be implemented in the toolbox: mamdani-type and sugeno-type. these two types of inference systems vary somewhat in the way outputs are determined. the sugeno (or takagi-sugeno-kang) method of fuzzy inference is similar to the mamdani method in many respects. the first two parts of the fuzzy inference process, fuzzifying the inputs and applying the fuzzy operator, are exactly the same. the main difference between mamdani and sugeno is that the sugeno output membership functions are either linear or constant. mamdani-type inference, as defined for the toolbox, expects the output membership functions to be fuzzy sets. after the aggregation process, there is a fuzzy set for each output variable that needs defuzzification [28], [29].

the membership functions for the inputs and the output are determined according to the following figures:

fig. 5 input and output membership functions

after the membership functions are created, the next step is the formulation of fuzzy rules, which is done using the rule editor (fig. 6).

fig. 6 fuzzy rules

for the first check of this fuzzy algorithm, the input values (ewma1 = ewma2 = ewma3 = 0.91) were selected with the intention of generating an alarm situation. in accordance with the defined rules, a relatively high output value was expected, and it was obtained (output = 0.945). in practice, the input values are chosen by moving the vertical line that indicates the value, while the program automatically recalculates the output value.

fig. 7 rule viewer (alarm condition)

the form of the membership functions is analyzed for selected combinations of input values that generate a warning and an alarm indication. the situation for input values chosen to generate a warning condition is given in fig. 8.

fig. 8 rule viewer (warning condition)

the output values were examined and compared while changing the offered membership functions [30]. in the case of warning, for the "warning" segment, the following output values were obtained:

table 2 output values – warning

membership function | output value
triangular (trimf) | 0.538
generalized bell (gbellmf) | 0.533
gaussian curve (gaussmf) | 0.535
two-sided composite gaussian curve (gauss2mf) | 0.533

similarly, in the case of alarm, for the "alarm" output segment, the following values were obtained:

table 3 output values – alarm

membership function | output value
triangular (trimf) | 0.945
sigmoidal (sigmf) | 0.943
difference between two sigmoidal functions (dsigmf) | 0.944
product of two sigmoidal functions (psigmf) | 0.943

looking at the results from these tables, only insignificant differences can be found in the output values.
however, the triangular membership function provides the largest output value, so it may be concluded that it is the most appropriate shape for this security algorithm.
simulation of algorithm in conditions of network anomaly
as an illustration of how the described anomaly detection algorithm works, samples of network traffic (acknowledgement (ack) numbers in tcp headers) are taken over an observation period, with the parameters calculated on the basis of historical data: ewma0 = 50 and σ = 2.0539, with λ taken to be 0.3 (an often used value). the control limits are in this case (the factor 0.4201 is sqrt(λ/(2 - λ)) for λ = 0.3):
ucl = 50 + 3*(0.4201)*(2.0539) = 52.5884 ≈ 52.6
lcl = 50 - 3*(0.4201)*(2.0539) = 47.4115 ≈ 47.4
the data in table 4 were obtained from a local server using a packet sniffer (wireshark) that captures and filters packets according to a specific protocol. in order to see only captured packets that use the tcp protocol, it is necessary to enter "tcp" in the filter field, as shown in figure 9 (example). the ack numbers of packets (rounded data values) were extracted 35 times, at equal time intervals (5 minutes).
fig. 9 filtering tcp packets
table 4 network traffic samples and ewma values
sample   data    ewma
0                50.00
1        52      50.60
2        47      49.52
3        53      50.56
4        49.3    50.18
5        50.1    50.16
6        47      49.21
7        51      49.75
8        50.1    49.85
9        51.2    50.26
10       50.5    50.33
11       49.6    50.11
12       47.6    49.36
13       49.9    49.52
14       51.3    50.05
15       47.8    49.38
16       51.2    49.92
17       52.6    50.73
18       52.4    51.23
19       53.6    51.94
20       52.1    51.99
21       53.9    52.56
22       53      52.69
23       52.9    52.76
24       52.5    52.68
25       51.8    52.42
26       49.7    51.60
27       50.5    51.27
28       49.9    50.86
29       48.5    50.15
30       49.6    49.99
31       51.2    50.35
32       48.3    49.74
33       50      49.81
34       50.4    49.99
35       51.6    50.47
analyzing the values in the table above, it should be noted that three consecutive values of network traffic, in the marked samples 21-23, are above the ucl (the 19th sample is a single case that is not of interest for the analysis), simulating in this way a situation suspected of being an alarm. these elevated traffic values result in the expected increase in three consecutive ewma values (marked samples 22-24), which also exceed the ucl. accepting that each network value that is 20% or more greater than the ucl (threshold = 1.2*ucl, by analogy with fig. 2) is a situation which can be interpreted as a certain alarm, and bearing in mind the form of the membership functions (fig. 5), the warning situation is at about 80% (more precisely, between 0.75 and 0.9) of the alarm situation, which is 0.8*0.2 = 0.16. this gives an alarm threshold value of 52.75 (ucl + 0.16). among the three marked ewma values that exceeded the ucl, only one was greater than 52.75. deciding on the basis of majority logic (one alarm situation and two warnings), it is concluded that the analyzed network situation can be interpreted as a warning.
6. improvement of results
in a further phase of the research, the possibility of improving the previously obtained results is examined.
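before turning to the improvements, the ewma values of table 4 and the control limits used in section 5 can be reproduced with a few lines of python. this is a sketch, not the authors' implementation: the function names are hypothetical, and the recursion ewma_t = λ·x_t + (1 - λ)·ewma_(t-1) together with the asymptotic limits ewma0 ± 3·σ·sqrt(λ/(2 - λ)) are the standard ewma chart formulas, which agree with the factor 0.4201 and with the tabulated values. the majority decision with a margin of 0.16 follows the reasoning given in the text.

```python
import math

LAMBDA = 0.3     # smoothing factor λ
EWMA0 = 50.0     # starting value from historical data
SIGMA = 2.0539   # standard deviation from historical data

# samples 1..35 from table 4
DATA = [52, 47, 53, 49.3, 50.1, 47, 51, 50.1, 51.2, 50.5, 49.6, 47.6,
        49.9, 51.3, 47.8, 51.2, 52.6, 52.4, 53.6, 52.1, 53.9, 53, 52.9,
        52.5, 51.8, 49.7, 50.5, 49.9, 48.5, 49.6, 51.2, 48.3, 50, 50.4, 51.6]


def ewma_series(data, lam=LAMBDA, start=EWMA0):
    """return [ewma_0, ewma_1, ..., ewma_n] for the given samples."""
    values = [start]
    for x in data:
        values.append(lam * x + (1.0 - lam) * values[-1])
    return values


def control_limits(center=EWMA0, sigma=SIGMA, lam=LAMBDA, k=3.0):
    """asymptotic lower and upper control limits of the ewma chart."""
    half_width = k * sigma * math.sqrt(lam / (2.0 - lam))
    return center - half_width, center + half_width


def assess(ewma_block, ucl, margin=0.16):
    """majority decision over a block of ewma values that exceeded the ucl."""
    alarms = sum(v > ucl + margin for v in ewma_block)
    return "alarm" if alarms > len(ewma_block) / 2 else "warning"


ewma = ewma_series(DATA)
lcl, ucl = control_limits()
above_ucl = [i for i, v in enumerate(ewma) if v > ucl]
decision = assess([ewma[i] for i in above_ucl], ucl)
```

with the data of table 4, above_ucl contains the samples 22, 23 and 24, and only one of them exceeds ucl + 0.16, so the majority decision is a warning.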
in that sense, the idea was to redefine some of the built-in fuzzy operators, because the fuzzy logic toolbox offers this opportunity. the situation when all three inputs have high values was analyzed. firstly, it was found that choosing a different or method or aggregation method does not affect the output value. for the and method, the options “min” and “prod” were examined, giving output values of 0.945 (min) and 0.934 (prod), respectively. in this research, instead of the function “min”, the square root of the minimum is proposed and its impact on the output value is tested. this function is defined in the file customand.m as:
function y = customand(x)
y = sqrt(min(x));
it gave the output result 0.949, which is better than the result of the “min” function. in addition to this, the research has shown that the “prod” function in the implication part generates a higher output signal than the “min” function, which represents another improvement. similarly, it is shown that the “mom” option (mean of maximum), used instead of the popular “centroid” in the defuzzification part, has the greatest impact on the output. considering the previous improvements, the output values were tested with the following selected options (fig. 10). the final result of this research is given in fig. 11.
fig. 10 the final fuzzy inference options
fig. 11 the final result
the last figure shows the greatest output value (0.99), which confirms the presented conclusions. the improvement compared to the previous case (0.945) is about 5%.
7. conclusion
on the basis of the elaboration presented in this paper, it can be concluded that the introduction of fuzzy logic into the standard ewma algorithm for anomaly detection, the so-called fuzzy ewma (fewma) algorithm, opens the possibility of early warning of a network attack, which contributes to raising the level of security. the standard ewma algorithm does not offer this possibility.
besides, fuzzy logic enables precise determination (fine tuning) of the degree of risk (expressed as a percentage). it is important to emphasize that the improvement of the standard ewma algorithm proposed here can be applied to other threshold-based algorithms, since the ewma algorithm is, at its basis, a security algorithm with a fixed threshold. future work will focus on creating opportunities for practical testing of the presented approach in real time. this involves automating the algorithm by creating appropriate software for taking live network traffic samples, calculating ewma values and making decisions in real time, and linking these functions with a fuzzy module that is able to adjust the membership functions of the input and output.
references
[1] s. drew, intrusion detection faq: what is the role of security event correlation in intrusion detection?, sans institute, http://www.sans.org/security-resources/idfaq/role.php
[2] p. čisar and s. maravić čisar, “network statistical anomaly detection based on traffic model”, annals of faculty engineering hunedoara, international journal of engineering, tome x, fascicule 3, pp. 89–96, 2012.
[3] n. ye, q. chen and c.m. borror, “ewma forecast of normal system activity for computer intrusion detection”, ieee transactions on reliability, vol. 53, no. 4, pp. 557–566, 2004.
[4] m.s. abadeh, h. mohamadi and j. habibi, “design and analysis of genetic fuzzy systems for intrusion detection in computer networks”, expert systems with applications, vol. 38, no. 6, 2011, pp. 7067–7075.
[5] z. yu and j. tsai, “fuzzy model tuning for intrusion detection systems”, in proceedings of the international conference on autonomic and trusted computing, atc 2006, 2006, pp. 193-204.
[6] g. spathoulas and s. katsikas, “reducing false positives in intrusion detection systems”, computers & security, vol. 29, no. 1, pp. 35–44, 2010.
[7] a. silva, e. pontes and f. zhou, “prbs/ewma based model for predicting burst attacks (brute froce, dos) in computer networks”, in proceedings of the international conference on digital information management (icdim), 2014.
[8] h.h.w.j. bosman, anomaly detection in networked embedded sensor systems. university of technology, eindhoven, 2016.
[9] s. senturk, n. erginel, i. kaya and c. kahraman, “fuzzy exponentially weighted moving average control chart for univariate data with a real case application”, applied soft computing, vol. 22, pp. 1–10, 2014.
[10] j.e. dickerson, j. juslin, o. koukousoula, and j.a. dickerson, “fuzzy intrusion detection”, in proceedings of the ifsa world congress and 20th north american fuzzy information processing society (nafips) international conference, vancouver, british columbia, vol. 3, 2001, pp. 1506-1510.
[11] j.e. dickerson and j.a. dickerson, “fuzzy network profiling for intrusion detection”, in proceedings of the nafips 19th international conference of the north american fuzzy information processing society, atlanta, 2000, pp. 301-306.
[12] k. liston, intrusion detection faq: can you explain traffic analysis and anomaly detection?, sans institute, http://www.sans.org/security-resources/idfaq/anomaly_detection.php
[13] g. fengmin, deciphering detection techniques: part ii anomaly–based intrusion detection. white paper, mcafee security, 2003, https://secure.mcafee.com/japan/products/pdf/deciphering_detection_techniques-anomaly-based_detection_wp_en.pdf
[14] p. čisar and s. maravić čisar, “optimization methods of ewma statistics”, acta polytechnica hungarica, vol. 8, no. 5, pp. 73–87, 2011.
[15] p. čisar, s. bošnjak and s. maravić čisar, “ewma-based threshold algorithm for intrusion detection”, computing and informatics, vol. 29, institute of informatics, slovak academy of sciences, bratislava, slovakia, pp. 1089–1101, 2010.
[16] p. čisar, s. bošnjak and s. maravić čisar, “ewma algorithm in network practice”, int. j. of computers, communications & control, vol. v, no. 2, 2010, pp. 160–170.
[17] s.w. roberts, control chart tests based on geometric moving averages. technometrics, 1959.
[18] nist/sematech e-handbook of statistical methods (2008). http://www.itl.nist.gov/div898/handbook/pmc/section3/pmc324.htm
[19] j.s. hunter, the exponentially weighted moving average. journal of quality technology 18, 1986, pp. 203–210.
[20] j.m. lucas and m.s. saccucci, exponentially weighted moving average control schemes: properties and enhancements. technometrics vol. 32, no. 1, 1990, pp. 1-29.
[21] engineering statistics handbook–ewma control charts, http://www.itl.nist.gov/div898/handbook/pmc/section3/pmc324.htm
[22] v. siris and f. papagalou, application of anomaly detection algorithms for detecting syn flooding attacks, 2004, http://www.ist-scampi.org/publications/papers/siris-globecom2004.pdf
[23] s. sorensen, competitive overview of statistical anomaly detection. white paper, juniper networks, 2004.
[24] p. čisar and s. maravić čisar, network statistics in function of statistical intrusion detection. springer publication, studies in computational intelligence, volume 313, springer verlag publication, 2010, pp. 27–35.
[25] m. hellmann, fuzzy logic introduction. a laboratories antennas radar telecom, f.r.e cnrs 2272, equipe radar polarimetrie, 2000, france.
[26] s.m.a. naqshbandi and v.w. samawi, “one-rule genetic-fuzzy classifier”, in proceedings of the 2012 ieee international conference on computer science and automation engineering (csae), vol. 2, 2012, pp. 204–208.
[27] k. subramanian, “emerging intuitionistic fuzzy classifiers for intrusion detection system”, journal of advances in information technology 2.2, pp. 99–108, 2011.
[28] matlab & simulink, “what is sugeno-type fuzzy inference?”, http://www.mathworks.com/help/fuzzy/what-is-sugeno-type-fuzzy-inference.html
[29] b. lazzerini, fuzzy logic toolbox, http://www.unife.it/ing/lm.infoauto/tecniche-controllo/fis_estratto.pdf
[30] fuzzy logic toolbox user's guide, http://www.mathworks.com/help/pdf_doc/fuzzy/fuzzy.pdf
[31] o. osanaiye, k.k.r. choo and m. dlodlo, “change-point cloud ddos detection using packet inter-arrival time”, in proceedings of the 8th ieee computer science & electronic engineering conference (ceec’16), sept 28th-30th 2016, essex, uk.
[32] o. osanaiye, a.s. alfa and g.p. hancke, “a statistical approach to detect jamming attacks in wireless sensor networks”, sensors, vol. 18, no. 6, p. 1691, 2018.

facta universitatis series: electronics and energetics vol. 35, no 4, december 2022, pp. 541-555
https://doi.org/10.2298/fuee2204541t
© 2022 by university of niš, serbia | creative commons license: cc by-nc-n
original scientific paper
green computing for iot – software approach*
haris turkmanović1, ivan popović1, dejan drajić2,3, zoran čiča2
1university of belgrade, school of electrical engineering, department of electronics
2university of belgrade, school of electrical engineering, department of telecommunications
3innovation centre of school of electrical engineering
abstract. more efficient usage of limited energy resources on embedded platforms, found in various iot applications, is identified as a universal challenge in designing such devices and systems. although many power management techniques for control and optimization of device power consumption have been introduced at the hardware and software level, only a few of them address device operation at the application level. in this paper, a software engineering approach for managing the operation of iot edge devices is presented.
this approach involves a set of application-level software parameters that affect the consumption of the iot device and its real-time behavior. to investigate and illustrate the impact of the introduced parameters on the device performance and its energy footprint, we utilize a custom-built simulation environment. the simulation results obtained from analyzing a simplified data producer-consumer configuration of the iot edge tier, under the push-based communication model, confirm that careful tuning of the identified set of parameters can lead to more energy efficient iot end-device operation.
key words: green iot, energy saving, real-time iot, push communication technology, embedded systems
received march 6, 2022; revised april 28, 2022; accepted may 8, 2022
corresponding author: haris turkmanović, university of belgrade, school of electrical engineering, department of electronics, bulevar kralja aleksandra 73, 11120 beograd, serbia, e-mail: haris@etf.bg.ac.rs
* an earlier version of this paper was presented at the 15th international conference on advanced technologies, systems and services in telecommunications (telsiks 2021), october 20-22, 2021, in niš, serbia [1]
1. introduction
many technological achievements in recent years, especially in the field of information and communication technologies, have enabled the usage of a wide range of iot applications and devices. healthcare systems, smart cities, home automation and security, wearable devices, and agriculture are just some of the applications whose rapid development is facilitated by the advancement of various iot communication technologies [2]. according to research [3], the global number of connected iot devices is expected to grow nearly 10% per year, and the number of ip connections by the year 2023 is expected to be three times higher than the total world population. this accelerated growth of network traffic and the increased number of connected iot devices lead to elevated global energy consumption and pollution related to co2 emissions [4]. it has been predicted that iot devices and systems will soon be the leading energy consumers in the domain of information and communication technologies [5]. the utilization of energy-efficient technologies to reduce energy consumption as well as co2 emissions has become mandatory in the design of green iot (giot) systems. there are various definitions of iot systems, which commonly include different edge iot devices distributed all over the iot network. these devices produce data and autonomously communicate with other parts of the iot system without direct human intervention [6]. the design of an iot system that relies on modern iot communication technologies is very complex because of different challenges such as security [7], scalability [8], data management, real-time performance [9], and others [10]. the scope of giot systems is even more complex because it involves a different set of green technologies across the product life cycle. green technologies target hardware and software design, green production, green utilization and green disposal of iot devices. iot edge devices are usually designed as battery-operated embedded devices that have constrained resources and capabilities, such as limited memory and cpu processing power. hardware and software design requirements of such devices are mostly related to efficient usage of the limited available energy. these requirements are important to prolong iot application operational runtime and enable utilization of green technologies. therefore, the limitation in terms of the available energy resources represents the main issue in designing energy-efficient iot applications [11-13] and giot systems.
the base for designing such systems is green computing, whose main goal is to reduce iot devices' energy consumption without degrading their performance [14]. in this paper, we introduce a set of application-level parameters that shape the communication and operational behavior of iot edge devices but also affect their energy consumption and performance. the conducted study aims to explore how a software engineering approach can be used to control this set of parameters to achieve a balance between the energy consumption of iot edge devices, their operational runtime [15], and the required real-time performance of iot services and applications. to quantify the contribution of the selected software engineering approach, a simulation environment is developed as a flexible open-source framework for analyzing the behavior, performance, and energy footprint of arbitrary distributed iot systems and applications [16-17]. the achievements and contributions of our proposed software engineering approach in designing giot edge devices are the following:
▪ introduction of application-level software parameters that enable fine-tuning of iot application performance vs. energy consumption of the iot devices located at the network edge.
▪ extension of the simulation framework with a set of parameters for modeling the consumption of iot system processing and communication elements, which enables comprehensive analysis of the energy requirements of an arbitrary iot system and/or its components.
▪ analysis of real-time performance and energy consumption, which confirmed the trade-off potential of the proposed software approach for driving the operation of giot edge devices.
the rest of the paper is organized as follows. section ii presents related work regarding the existing energy optimization methods. section iii presents a brief overview of the simulation framework with details related to the implementation of energy calculation algorithms.
in section iv, we present the simulation model used in our analysis while section v presents and discusses obtained results. section vi concludes the paper.
2. related work
energy consumption-related problems have been attracting a lot of attention from the scientific and research community. data involved in processing and communication within iot applications are usually generated by battery-powered edge iot devices that sense their environment. to date, various energy optimization methods, which address different energy-intensive aspects of iot systems, have been developed to prolong iot edge device operational time and to provide enhanced real-time performance of giot applications. the rest of the section provides an overview of existing optimization methods, their classification, and approaches related to the energy-aware design of resource-constrained embedded devices found in various iot applications and systems. although the standard methodologies in designing low-power embedded systems involve a range of approaches, from simple usage of low-power products to complex algorithms for scheduling system workload, there is no single universally accepted methodology that fits all application needs. all optimization techniques can be classified into two major categories: hardware and software energy optimization techniques. within this research, we explore only the software optimization techniques which, based on [18], can be further classified as data center-based, cloud computing-based, and virtualization-based techniques. a more detailed classification of software optimization techniques is given in [4], where they are classified into nine different groups. based on [19], all software optimization techniques can be classified into three groups: instruction level, compiler level, and operating system level.
the taxonomy presented in this paper is an entry point for the analysis of energy optimization techniques applicable to iot edge devices. paper [20] presents an overview of such techniques and discusses the influence that certain techniques have on edge device energy consumption and on overall iot system consumption. iot system edge devices are, in a certain sense, similar to sensing node devices used within wireless sensor networks (wsn). therefore, in the case of energy optimization of iot edge devices, some optimization techniques already considered for wsn can be utilized. the review of energy optimization techniques in [21-22] gives a systematic classification of the solutions that can be used to preserve energy in wsn. moreover, these papers divide battery-powered sensor devices into basic energy-consuming subsystems: the sensing subsystem, the processing subsystem and the communication subsystem. this division makes sensing node energy analysis more systematic, and it can also be applied to giot edge devices. it is pointed out that the consumption of each subsystem must be considered equally during the energy profiling analysis of iot edge nodes. in iot applications based on wireless communication technologies, communication subsystems in most cases consume significantly more energy than processing or sensing subsystems. different software methods focus on reducing the consumption of the communication subsystem. based on the research presented in [23], all these methods can be classified into two groups: duty-cycling-based methods and in-processing methods. duty-cycling-based methods reduce energy consumption by disabling communication components when they are not used. in-processing methods use various data compression and/or data aggregation techniques to reduce the amount of data involved in communication.
the amount of data involved in communication increases with the number of edge nodes participating in the communication. different studies have shown that there is a certain similarity between the data produced by sensor nodes [24-25]. data aggregation methods exploit this feature to reduce the overall amount of data in iot applications. instead of forwarding the data instantaneously, data are first collected and then aggregated using functions like sum, average, threshold. some of the data aggregation methods are presented in [26]. it is shown that utilization of these methods has a huge impact on the reduction of sensor node energy consumption, but also decreases real-time performance, since the data delivery delay in iot applications is increased. therefore, when applying methods based on data aggregation, it is important to use two metrics, energy consumption and data delay, to describe overall iot application performance. in [27], an analytical model is presented which enables calculation of energy consumption and packet delivery time when aggregation optimization methods are used. this model allows determining parameter values such as buffering time and maximum number of buffered packets. additionally, this work shows that the utilization of aggregation methods may lead to a significant decrease in energy consumption. in [28], it is shown that the energy consumption of the processing subsystem may increase compared to the communication subsystem when complex memory-intensive compression algorithms are used. however, a study conducted in [28] modifies already developed techniques, such as discrete cosine transform (dct) and discrete wavelet transform (dwt), that can be applied on sensor nodes to compress data. these techniques are modified in such a way that utilization of memory and processing capability is not too high compared to the original algorithm.
it has been shown that, among other things, utilization of these modified techniques leads to a reduction of energy consumption and prolonged device operational runtime. when data aggregation methods are used, it is very important to decide when to send aggregated data to the consumer node. paper [30] names this parameter the transmission period. it is shown that this parameter has a significant impact on sensor node performance in terms of energy, data accuracy, and data freshness. a specific approach is developed in this work which makes it possible to balance energy saving against data availability at the higher tiers of hierarchically organized iot systems. in some practical applications, the sensing subsystem can consume significantly more energy than other parts of the edge device. the work presented in [31] establishes an approach based on a smart sensing policy which achieves lower energy consumption of the sensing subsystem compared to the standard fixed sensing period policy. this policy uses a learning model based on a backpropagation neural network. it has been concluded that this policy may reduce consumption by up to 50%. in [32], an adaptive sampling algorithm is proposed which can dynamically estimate the optimal sampling frequency. the performance of this algorithm is estimated in a simulation of a snow monitoring application. the obtained simulation results show that this algorithm may reduce the energy consumption of the sensing subsystem by up to 97%. besides the techniques that are directly related to our research, there are other techniques that also affect power consumption. in [33], power management techniques are categorized as dynamic voltage and frequency scaling, subthreshold design, asynchronous circuit design and power-gating. in the case of edge devices that utilize real-time operating systems, there are different os-level techniques that impact task scheduling [33-34]. within iot multimedia applications, control of parameters such as frames per second (fps) is also a common approach to lowering energy consumption [36]. in the domain of e-healthcare applications, there are several solutions offering energy-efficient frameworks using the internet of medical things (iomt) protocol to optimize the communication overhead and overall energy consumption while transmitting healthcare data [37-38]. although most reviewed solutions and approaches investigate the impact of individual parameters on power or energy consumption, none of them analyzes the trade-off potential and more complex tuning of device operation through the control of a group of parameters. in contrast, the study presented in this paper provides a comprehensive analysis of the performance and energy consumption properties of iot edge devices during their operation under different setups of the selected application-level parameters.
3. materials and methods
the first part of this section presents general aspects of traffic engineering relevant to iot system energy consumption and performance analysis. a set of application-level software parameters that enable tuning of the performance and consumption properties of iot devices is also introduced. the metrics associated with the quantification of these properties are presented within the first part of this section, while utilization of the simulation framework for energy analysis is described in the rest of the section.
3.1. overview of the approach
there are many different possibilities of iot system realizations, but in most cases, it is possible to identify four main elements: 1) the intelligent devices where data are produced – producer devices, 2) the gateways that extract data, aggregate data, and/or perform protocol translation, 3) the network used to establish communication between devices and 4) the device which receives data – consumer device.
in the simplest representation of an iot system, the system can be considered to consist only of a producer and a consumer device. the communication between producer nodes and the rest of the distributed iot system determines the producer nodes' energy consumption and the real-time performance of the iot application. two traffic engineering strategies can be applied when designing an edge-tier iot system: pull and push [39]. the messaging patterns of these two strategies are illustrated in figure 1. in the case of the pull strategy, data generated on the producer node side are sent to the consumer node side only when the consumer node sends a pull request to the producer node. the pull strategy is suitable for cases where the consumer node is interested in partial data from certain producer nodes (there is a correlation between data sent from different producer nodes to the same consumer nodes). in contrast, the push traffic strategy involves sending data or notifications from the producer node periodically or when a particular event occurs on the producer node. the push strategy forces real-time performance of iot applications [40], although in some iot applications it is also possible to combine both communication strategies.
fig. 1 a) pull b) push traffic strategy
since we observe real-time iot applications, the push traffic strategy is considered as the reference for our research, since it supports a higher number of parameters on the edge-device node's side. table 1 gives an overview of the parameters that are available from an application point of view for tailoring the device operation under the push communication strategy.
table 1 overview of software available parameters for push strategy
parameter                  description
sampling time (st)         defines how often data are generated on the producer node.
aggregation rate (ar)      defines the level of data reduction on the producer node.
transmission period (tp)   defines the period for sending data from the producer to the consumer node.
the performed analysis explores how much these three parameters impact energy consumption and overall real-time performance of iot applications. to quantify this impact, two metrics are introduced:
▪ energy consumption (e) – expressed in milliampere-hours (mah).
▪ average data delivery time (adt) – the time interval that elapses from the moment of data generation to the moment of data processing at the destination node.
3.2. tools and procedures
the simulation framework used within this work enables the creation of arbitrary iot system topologies and analysis of various iot application performance parameters. results obtained by simulation provide a detailed overview of data availability across the entire iot system at any point in time. by analyzing the obtained results, it is possible to quantify various iot application performance parameters such as iot system consumption, real-time performance, and scalability of the iot system architecture. paper [17] already describes how to use the developed simulation framework to quantify the scalability of the iot architecture. in this section, we describe in more detail the main aspects of the simulation framework that are important for understanding how to quantify the influence of a certain set of parameters on iot system energy consumption. the created simulation framework is available as an open-source solution [16] and it can be further developed and adapted to satisfy any requirements which are not supported by the current framework version. the simulation framework comprises a simulation core and a graphical user interface. the simulation core implements all functionalities related to the simulation of iot system behavior on different levels of the iot system architecture.
these functionalities rely on models of the components that exist within an iot system in the general case: the node model, which represents a device that generates or consumes data; the link model, which represents a connection between iot system components; and the protocol model, which encloses all information related to data created and consumed within the iot system. the current version of the simulation framework used in this analysis supports only the simplest models of these components (processing devices, links and protocols), with only basic parameter configuration. the models included in the current framework version do not support packet dropouts, connection losses or packet retransmissions, which can be significant for overall iot application quality-of-service analysis. within each model, a certain set of parameters can be configured. a graphical user interface was developed to simplify the configuration of the model parameters. communication between the framework core and the graphical user interface is established through the model's configuration file. at the end of the simulation, different log files are created. by examining and analyzing the content of these files, it is possible to understand how iot systems behave. a general overview of the simulation framework is presented in figure 2.

fig. 2 simulation framework architecture

the node, link, and protocol models support different parameters. the current version of the simulator supports three node models: producer, gateway, and consumer. node consumption, processing time, adjacent nodes, aggregation level, compression rate, and transmission period are parameters that can be configured for each node model. additionally, for the producer node model, it is possible to define the amount of data produced on the node as well as the data sampling rate.
the link model supports the configuration of the following parameters: link speed, link consumption (transmit and receive), and link speed deviation. data are exchanged between nodes using a protocol model, where it is possible to define the protocol overhead and, optionally, to enable a handshaking mechanism. a more detailed description of the parameters is given in [17].

energy consumption calculation within the simulation framework is based on the current overall node consumption (conc), expressed in ma. to calculate the charge consumed by a node for a specific action, conc is multiplied by the time required to execute that action on the node. the calculation algorithms print the cumulative sum of consumed charge over time (csc), expressed in ma per time resolution r, to the node log file. node energy consumption is directly proportional to the csc value and can be calculated directly if the node supply voltage is known. based on this information, nodes can be profiled according to their energy consumption. the conc value is determined by the current node operation mode as well as by the type of the links used to communicate with adjacent nodes. two operating modes are supported by each node model: active and low power mode. a node is in active mode when data are being processed on the node or are being transmitted from or received by the node. if there is no action on the node, it is in low power mode. for each of these modes, the current node consumption (cncm) can be configured within the node's model configuration file by setting the parameter value related to the specific node mode m (cnca – current node consumption when the node is in active mode, cnclp – current node consumption when it is in low power mode).
link models enable configuration of the current link consumption clcs during different states s, such as transmitting (clct) and receiving (clcr) data. the following equation is used for conc calculation:

$$\mathrm{CONC} = \mathrm{CNC}_m + \mathrm{CLC}_s \qquad (1)$$

where cncm and clcs take values depending on the current actions on the node, as presented in table 2.

table 2 value of conc depending on the action on the node

| action on the node | conc = |
|---|---|
| low power mode | cnclp |
| processing received data | cnca |
| processing received data and receiving new data from another node | cnca + clcr |
| receiving data | cnclp + clcr |
| transmitting data | clct |

improving the available simulation models so that they correspond to practical mcu-based device implementations is seen as a part of future work. the goal of this future work will be to extend the simulation model to accurately represent both device and communication power and performance behaviors. to illustrate the working principle of the developed algorithms and the potential of the developed simulation framework for profiling a node's energy consumption, we examine the behavior of a simple node n which is connected to the rest of the iot system over link l. information relevant to this example is presented in tables 3 and 4.

table 3 node n parameter values

| parameter name | value | unit |
|---|---|---|
| processing speed | 50 | [b/s] |
| data production rate | 15 | [s] |
| data size | 50 | [b] |
| cnclp | 10 | [ma] |
| cnca | 90 | [ma] |

table 4 link l parameter values

| parameter name | value | unit |
|---|---|---|
| link speed | 12.5 | [b/s] |
| clcr | 400 | [ma] |
| clct | 400 | [ma] |

figure 3 shows part of the node log file obtained after the completed simulation. the shown part of the log file includes actions on the node inside the time interval [45s, 72s].

fig. 3 part of node's log file

the node log file is given in the form of a csv file, where each value in a single row represents information about a node parameter value at a specific point in time.
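the conc lookup of table 2 and the charge accumulation described above can be sketched in a few lines of python. this is only an illustrative sketch, not the simulator's actual code: the names `NodeCurrents`, `conc`, `cumulative_charge` and `energy_mj` are hypothetical, the time resolution is assumed to be one second, and the 3.3 v supply voltage is an assumed value. the currents are those of tables 3 and 4, and the action durations (2.4 s, 9.6 s, 3 s) are taken from the intervals reported for figure 4, with the first interval treated as pure data processing for simplicity.

```python
from dataclasses import dataclass


@dataclass
class NodeCurrents:
    """Configured currents in mA (hypothetical container, not the simulator's format)."""
    cnc_lp: float  # node consumption in low power mode
    cnc_a: float   # node consumption in active mode
    clc_t: float   # link consumption while transmitting
    clc_r: float   # link consumption while receiving


def conc(action: str, c: NodeCurrents) -> float:
    """Current overall node consumption, following Table 2 as printed."""
    table = {
        "low_power": c.cnc_lp,
        "processing": c.cnc_a,
        "processing_and_receiving": c.cnc_a + c.clc_r,
        "receiving": c.cnc_lp + c.clc_r,
        "transmitting": c.clc_t,
    }
    return table[action]


def cumulative_charge(schedule, c: NodeCurrents):
    """Running sum of CONC * duration, in mA*s, after each (action, seconds) entry."""
    total = 0.0
    csc = []
    for action, duration_s in schedule:
        total += conc(action, c) * duration_s
        csc.append(total)
    return csc


def energy_mj(charge_mas: float, supply_voltage: float) -> float:
    """Energy in millijoules: charge in mA*s times supply voltage in volts."""
    return charge_mas * supply_voltage


# Node N and link L from Tables 3 and 4.
node_n = NodeCurrents(cnc_lp=10, cnc_a=90, clc_t=400, clc_r=400)
# Durations read from the intervals reported for Figure 4 (a simplification).
schedule = [("processing", 2.4), ("transmitting", 9.6), ("low_power", 3.0)]
csc = cumulative_charge(schedule, node_n)

if __name__ == "__main__":
    print("cumulative charge [mA*s]:", [round(x, 3) for x in csc])
    print("energy at an assumed 3.3 V [mJ]:", round(energy_mj(csc[-1], 3.3), 3))
```

in this sketch the transmission interval dominates the consumed charge, which is consistent with the large difference between the link and node currents in tables 3 and 4.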
more information about specific values is given in [17], while in this analysis we focus only on the values important for energy analysis: the timestamp (1st value), conc (next to last value), and csc (last value). the obtained values are extracted and visualized in figure 4. the analysis shown in figure 4 illustrates the charge and the consumption of the selected iot node for the selected time interval [45-73s]. time intervals 1 and 4 include all node actions which occur in the node log file within intervals [45-47.4s] and [60-62.4s], respectively. these actions mostly consist of processing the created data. time intervals 2 and 5 represent actions on the node within [47.4-57s] and [62.4-72s], where the processed data are transmitted. after the node sends the data, there are no more actions on the node, and the node goes to low-power mode. this node state is observed in time interval 3 within [57-60s], as found from the log file.

fig. 4 node charge consumption and conc values within time interval [45s-73s]

4. case study

this section describes the experiment setup, including the iot system topology and the node and link configuration, and presents the simulation results illustrating the impact of the introduced application-level parameters on node operation and consumption. the parametric analysis and the discussion of the associated trade-off properties are also given. in our analysis, data communication at the edge tier of the iot system is modeled as an interaction between the data producer node (iot edge device) and the corresponding consumer (destination) node located higher in the hierarchy of the iot system. data from edge devices are pushed toward the data consumer device through a link used to establish communication between the producer and the consumer device. this iot system is illustrated in figure 5, while the parameters used in the simulation are presented in tables 5 and 6.

fig. 5 illustration of iot system used in our analysis

table 5 producer node parameter values

| parameter name | value | unit |
|---|---|---|
| processing speed | 1 | [mb/s] |
| data size | 100 | [b] |
| data overhead | 70 | [b] |
| cnclp | 20 | [ma] |
| cnca | 110 | [ma] |

table 6 link parameter values

| parameter name | value | unit |
|---|---|---|
| link speed | 18 | [kb/s] |
| maximum transmission unit (mtu) | 1500 | [b] |
| clct | 410 | [ma] |
| clcr | 410 | [ma] |

an edge device can be considered a simple mcu-based embedded system which gathers data by sensing its environment, performs simple data processing such as data aggregation, and provides physical connectivity with the rest of the iot system. from the software's perspective, it is only possible to control parameters such as the data sampling rate, aggregation rate, and transmission rate. the range of values of these three parameters is shown in table 7.

table 7 range of parameter values

| parameter name | range | unit |
|---|---|---|
| data sampling time | 0.1-10 | [s] |
| aggregation rate | 1-100 | |
| transmission period | 1-10 | [s] |

the analysis of the results obtained by varying these three parameters within the presented ranges is given in the next section.

5. results analysis

results obtained by simulation are presented in figure 6. results are normalized to the operating point with the coordinates q0(ar0, tp0, st0) = (10, 10s, 1s). the normalized results are adopted to illustrate the potential of adjusting the parameter values on observed system properties given on different scales. each parameter value at q0 is selected as the midpoint of the parameter range on the logarithmic scale. furthermore, the parameter range is chosen to avoid boundary conditions of system operation where the sampling interval is comparable with the data processing time and/or communication latency. based on the obtained results, it is possible to quantify the impact of these parameters on reducing energy consumption, but also on reducing the average time of data availability on consumer nodes.
normalized energy consumption (e) and optimization cost function (o) values are presented on each graphics' left side, while normalized average data delivery time (adt) values are presented on the right side. from figure 6-i it is noticeable that increasing the aggregation rate within the first half of the observed range [0.1-1] reduces data payload size which leads to the reduction of the total energy consumption (~0.26). within the second half of the observed range [1-10], increasing aggregation rate to a lesser extent contributes to a further reduction of energy consumption (~constant) because payload size becomes negligible to protocol header size. from the same graph, it can be also noticed that the increase in aggregation rate does not cause a significant change in data delivery time (~0.03). this impact is expected because the change of the aggregation rate does not change the outcome in terms of the data availability time, but it changes only the form of the exchanged data since the original data are embedded within the aggregated data format. the change in the transmission period has a significantly greater impact on the reduction of energy consumption compared to the impact of the aggregation rate parameter, because of the reduced activity of the iot communication subsystem. it can be seen in figure 6-ii that due to the increase in transmission period, energy consumption decreases almost linearly along with the entire observed range. on the other hand, there is a proportional degradation of data availability time and corresponding real-time performance. the effect of the sampling rate parameter is shown in figure 6-iii. by controlling this parameter, we can achieve certain energy-saving up to half of the observed range [0.1 – 1], like in the case of the aggregation rate parameter. 
however, in contrast to the other two parameters, in the second half of the observed range [1 – 10] it is possible to achieve significantly better characteristics in the domain of data availability at the consumer node side. to quantify the trade-off that can be achieved by tuning certain parameters, we introduce the optimization cost function defined as:

$$O = k \cdot E + q \cdot ADT \qquad (2)$$

where the parameters k and q take values within the range [0, 1] and are related by the equality k = 1 – q. the purpose of this cost function is to establish the relation between the power consumption and performance domains in order to find the optimal operating point for the iot edge device. the cost function provides the background for tuning a certain parameter within the iot device at the edge tier to optimize power consumption and/or overall iot system real-time performance. operating at the best performance, without concern for consumption, corresponds to the operating point with maximal sampling rate and communication rate and without data aggregation. thus, optimizing only performance implies that the value of q is set to 1 while k equals 0. if both k and q are higher than zero, then we can talk about a trade-off in the power-performance domain. the analysis conducted in this paper considers both requirements equally important, and consequently both parameter values are set to 0.5.

fig. 6 influence of aggregation rate (i), transmission period (ii) and sampling time (iii) on energy consumption (left scale – blue) and average data delivery time (right scale – orange) vs trade-off optimization norm (left scale – black)

by analyzing the cost functions presented in figure 6, it is possible to find the optimal operating point by tuning only a single parameter.
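a minimal python sketch of how the cost function (2) can be evaluated over a single-parameter sweep is given below. the function names `cost` and `best_operating_point` are hypothetical, and the sweep values are made-up illustrative numbers, not data read from figure 6; they only mimic the general tendency of energy falling and delivery time rising as a parameter grows.

```python
def cost(e_norm: float, adt_norm: float, q: float = 0.5) -> float:
    """Cost function O = k*E + q*ADT with k = 1 - q, on normalized inputs."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    k = 1.0 - q
    return k * e_norm + q * adt_norm


def best_operating_point(sweep, q: float = 0.5):
    """Return the (parameter, E, ADT) triple of the sweep with minimal cost."""
    return min(sweep, key=lambda point: cost(point[1], point[2], q))


# Illustrative sweep only: (normalized parameter, normalized E, normalized ADT).
# The values are assumptions chosen for the example, not simulation results.
sweep = [
    (0.1, 3.0, 0.1),
    (0.3, 1.9, 0.3),
    (1.0, 1.0, 1.0),
    (3.0, 0.7, 3.0),
]

if __name__ == "__main__":
    for q in (0.0, 0.5, 1.0):
        param, e, adt = best_operating_point(sweep, q)
        print(f"q={q}: best parameter={param}, cost={cost(e, adt, q):.2f}")
```

setting q to 1 selects the point with the shortest delivery time, setting q to 0 selects the point with the lowest energy consumption, and intermediate values move the selected point between these two extremes.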
following the shape of the cost function o presented in figure 6-i, decreasing the value of ar below ar0 results in a significant increase in the value of the optimization cost function. in contrast, increasing the value of ar above ar0 has only a minor effect on the cost function's value. the optimal ar value is therefore located at the end of the observed range. as visible from figure 6-ii, varying the value of the transmission period (tp) parameter away from tp0 degrades the value of o, since the optimal value of the tp parameter is found around tp0. on the other hand, as observable from figure 6-iii, the optimal st value is located to the left of st0, where the cost function has its minimum value. the optimal operating point in the 3d space of system parameters is found from the criterion of minimizing the cost function. if both the performance and the power consumption depend on the operating point parameters linearly and in opposite directions, then the optimal parameters are expected to be found midway between the boundary values. as the dependences are not linear, as is obvious from figure 6, a more complex relationship between the optimization criterion and the system parameters is expected.

6. conclusion

the energy requirements and the performance in the operation of the iot edge device are analyzed through the investigation of the typical data producer-consumer relationship. as the more generalized option, the iot edge device was considered a typical data producer which operates under a push-based communication model. iot edge device operation, under the influence of the identified set of parameters, was investigated utilizing the custom-built simulation environment.
the simulation results have shown that the control of parameters such as sampling rate, aggregation rate, and transmission period at the data producer side can lead to more optimal behavior of iot systems in the power-performance domain, where the optimization criteria can be tuned to fulfill the particular application requirements. simulation results confirmed the trade-off potential, where adjusting parameters often has opposite effects on the power requirement of the iot edge device node and the resulting real-time performance of the iot application. this trade-off potential was quantified by the introduced cost function, which defines the relationship between the power and performance domains in linear form. by introducing the cost function, it has been shown that it is possible to find the optimal operating point where iot system real-time performance and edge device energy consumption are optimized in the case where power consumption and performance are equally important. the exact position of the optimal operating point in the 3d space of system parameters is difficult to estimate without comprehensive parametric analysis because of the complex relationship between system parameters and system power consumption and performance. in general, lowering energy consumption while compromising real-time performance presumes less frequent sampling with a higher aggregation rate and a lower communication rate. the utilization of this approach can result in the development of an algorithm that would control the introduced parameters to achieve an optimal compromise and enable the design of giot applications. the design and implementation of an algorithm that controls the introduced set of parameters to achieve optimal operation of the edge devices, thereby enabling the deployment of giot applications, is seen as a part of future work.

acknowledgment: this work has been supported by the ministry of education, science and technological development of the republic of serbia.

references

[1] h. turkmanović, i. popović, d. drajić and z. čiča, "launching real-time iot applications on energy-aware embedded platforms", in proceedings of the 15th international conference on advanced technologies, systems and services in telecommunications (telsiks), pp. 279-282, 2021. [2] r. lu, x. li, x. liang, x. shen and x. lin, "grs: the green, reliability, and security of emerging machine to machine communications", ieee commun. mag., vol. 49, no. 4, pp. 28-35, april 2011. [3] cisco, "cisco annual internet report 2018-2023", march 2020. [4] a. s. h. abdul-qawy, n. m. s. almurisi and s. tadisetty, "classification of energy saving techniques for iot-based heterogeneous wireless nodes", procedia comput. sci., vol. 171, pp. 2590-2599, 2020. [5] x. liu and n. ansari, "toward green iot: energy solutions and key challenges", ieee commun. mag., vol. 57, no. 3, pp. 104-110, march 2019. [6] p. k. verma, r. verma, a. prakash, a. agrawal, k. naik, r. tripathi, m. alsabaan, t. khalifa, t. abdelkader and a. abogharaf, "machine-to-machine (m2m) communications: a survey", j. netw. comput. appl., vol. 66, pp. 83-105, 2016. [7] t. xu, j. b. wendt and m. potkonjak, "security of iot systems: design challenges and opportunities", in proceedings of the ieee/acm international conference on computer-aided design (iccad), pp. 417-423, 2014. [8] a. damian and l. kung-kiu, "evaluating iot service composition mechanisms for the scalability of iot systems", future gener. comput. syst., vol. 108, pp. 827-848, 2020. [9] b. diène, j. j. p. c. rodrigues, o. diallo, e. h. m. ndoye and v. v. korotaev, "data management techniques for internet of things", mech. syst. signal process., vol. 138, april 2020. [10] c. c. sobin, "a survey on architecture, protocols, and challenges in iot", wirel. pers. commun., vol. 112, pp. 1383-1429, 2020. [11] n.
kimura and s. latifi, "a survey on data compression in wireless sensor networks," in proceedings of the international conference on information technology: coding and computing (itcc'05) volume ii, vol. 2, april 2005, pp. 8-13. [12] a. ali, g. a. shah and j. arshad, "energy-efficient techniques for m2m communication: a survey", j. netw. comput. appl., vol. 68, pp. 42-55, june 2016. [13] a. azari and g. miao, "energy-efficient mac for cellular-based m2m communications", in proceedings of the ieee global conference on signal and information processing (globalsip), december 2014, pp. 128-132. [14] m. muniswamaiah, t. agerwala and c. c. tappert, "green computing for internet of things", in proceedings of the 7th ieee international conference on cyber security and cloud computing (cscloud)/2020 6th ieee international conference on edge computing and scalable cloud (edgecom), 2020, pp. 182-185. [15] h. turkmanović and i. popović, "a systematic approach for designing battery management system for embedded applications", in proceedings of the zooming innovation in consumer technologies conference (zinc), may 2021, pp. 85-90. [16] h. turkmanovic, https://github.com/turkmanovic/lsnsimulator.git, github/turkmanovic, lsnsimulator. [17] h. turkmanović, i. popović, z. čiča and d. drajić, "simulation framework for performance analysis in multi-tier iot systems", in proceedings of the 29th telecommunications forum (telfor), 2021, pp. 1-4, [18] u. b. k. ramesh, s. sentilles and i. crnkovic, "energy management in embedded systems: towards a taxonomy", in proceedings of the first international workshop on green and sustainable software (greens), 2012, , pp. 41-44. [19] r. arshad, s. zahoor, m. a. shah, a. wahid and h. yu, "green iot: an investigation on energy saving practices for 2020 and beyond", ieee access, vol. 5, pp. 15667-15681, 2017. [20] a. haider, t. umair, h. james, z. xiaojun, l. liu, z. yongjun, b. faycal, a. abbes, f. kaniz, a. 
niko, "a survey on system level energy optimisation for mpsocs in iot and consumer electronics", comput. sci. rev., vol. 41, p. 100416, aug. 2021. [21] g. anastasi, m. conti, m. francesco and a. passarella, "energy conservation in wireless sensor networks: a survey", ad hoc netw., vol. 7, no. 3, pp. 537-568, may 2009. [22] r. soua and p. minet, "a survey on energy efficient techniques in wireless sensor networks", in proceedings of the 4th joint ifip wireless and mobile networking conference, october 2011, pp. 1-9. [23] t. srisooksai, k. keamarungsi, p. lamsrichan, k. araki, "practical data compression in wireless sensor networks: a survey", j. netw. comput. appl., vol. 35, no. 1, pp. 37-59, january 2012. [24] d. parker, m. stojanovic and c. yu, "exploiting temporal and spatial correlation in wireless sensor networks", in proceedings of the asilomar conference on signals, systems and computers, november 2013, pp. 442-446. [25] y. zhou, l. yang, l. yang and m. ni, "novel energy-efficient data gathering scheme exploiting spatial-temporal correlation for wireless sensor networks", wirel. commun. mobile comput., vol. 2019, p. 4182563, 2019. [26] s. randhawa and s. jain, "data aggregation in wireless sensor networks: previous research, current status, and future directions", wireless pers commun., vol. 97, pp. 3355-3425, july 2017. [27] s.-y. tsai, s.-i. sou and m.-h. tsai, "reducing energy consumption by data aggregation in m2m networks", wireless pers commun., vol. 74, pp. 1231-1244, jan. 2014. [28] i. solis and k. obraczka, "the impact of timing in data aggregation for sensor networks", in proceedings of the ieee international conference on communications (ieee cat. no. 04ch37577), vol. 6, 2004, pp. 3640-3645. [29] t. sheltami, m. musaddiq and e. shakshuki, "data compression techniques in wireless sensor networks", future gener. comput. syst., vol. 64, pp. 151-162, nov.
2016. [30] i. solis and k. obraczka, "the impact of timing in data aggregation for sensor networks", in proceedings of the ieee international conference on communications (ieee cat. no. 04ch37577), vol. 6, 2004, pp. 3640-3645. [31] w. kim and i. jung, "smart sensing period for efficient energy consumption in iot network", sensors, vol. 19, no. 22, p. 4915, nov. 2019. [32] c. alippi, g. anastasi, c. galperti, f. mancini and m. roveri, "adaptive sampling for energy conservation in wireless sensor networks for snow monitoring applications", in proceedings of the ieee international conference on mobile adhoc and sensor systems, october 2007, pp. 1-6. [33] m. hempstead, m. j. lyons, d. brooks and g.y. wei, "survey of hardware systems for wireless sensor networks", j. low power electronics, vol. 4, pp. 1-10, april 2008. [34] w. h. cheng, w. isaac, y. cheng-wen, a. alagan and o. mohammad, "energy-efficient tasks scheduling algorithm for real-time multiprocessor embedded systems", j. supercomput., vol. 62, pp. 967-988, nov. 2012. [35] s. li and j. huang, "energy efficient resource management and task scheduling for iot services in edge computing paradigm", in proceedings of the ieee international symposium on parallel and distributed processing with applications and ieee international conference on ubiquitous computing and communications (ispa/iucc), december 2017, pp. 846-851. [36] c. h. lin, j. c. liu and c. w. liao, "energy analysis of multimedia video decoding on mobile handheld devices", comput. stand. interfaces, vol. 32, no. 1-2, pp. 10-17, jan. 2010. [37] s. a. alvi, g. a. shah, w. mahmood, "energy efficient green routing protocol for internet of multimedia things", in proceedings of the ieee tenth international conference on intelligent sensors, sensor networks and information processing (issnip), may 2015, pp. 1-6. [38] s. tanzila, h. khalid, a. imran and r. amjad, "secure and energy-efficient framework using internet of medical things for e-healthcare", j. 
infect. public health, vol. 13, no. 10, pp. 1567-1575, july 2020. [39] a. lindgren, f. b. abdesslem, b. ahlgren, o. schelén and a. m. malik, "design choices for the iot in information-centric networks", in proceedings of the 13th ieee annual consumer communications and networking conference (ccnc), january 2016, pp. 882-888. [40] r. c. sofia and p. m. mendes, "an overview on push-based communication models for information-centric networking", future internet, vol. 11, no. 3, p. 74, march 2019.

instruction facta universitatis series: electronics and energetics vol. 28, no. 4, december 2015, pp. 657-671 doi: 10.2298/fuee1504657b

a latency optimized biased implementation style weak-indication self-timed full adder

padmanabhan balasubramanian
school of computer engineering, nanyang technological university, singapore

abstract. this article presents a biased implementation style weak-indication self-timed full adder design that is latency optimized. the proposed full adder is constructed using the delay-insensitive dual-rail code and adheres to 4-phase handshaking. performance comparisons of the proposed full adder vis-à-vis other strong and weak-indication full adders are done on the basis of a 32-bit self-timed ripple carry adder architecture, with the full adders and ripple carry adders realized using a 32/28nm cmos process. the results show that the proposed full adder reduces latency by 63.3% against the best of the strong-indication full adders whilst reporting a decrease in area of 10.6% and featuring comparable power dissipation. on the other hand, when compared with the existing optimized weak-indication full adder, the proposed full adder is found to reduce the latency by 25.1% whilst causing an increase in area of just 1.6%, with no associated power penalty.

key words: self-timed design, full adder, rca, indication, standard cells, cmos

1.
introduction

self-timed design, which constitutes a robust flavor of asynchronous design, is considered to be a viable alternative and/or a necessary supplement to mainstream synchronous design by the semiconductor industry association [1] due to several reliability and variability issues, which have become prominent in the nanoscale electronics regime. random dopant fluctuations, sub-wavelength lithography, high heat flux, electro-migration, hot carrier effects, negative bias temperature instability, stress-induced variation, electrostatic discharge, process-induced defects, and metrology and other manufacturing defects [2] are complicated issues which have become more pronounced in the nanoelectronics era compared to the microelectronics era and are indeed difficult to deal with. to circumvent these issues, various material-level, device-level, process-level, circuit-level and system-level solutions have been developed, and further developments are underway [1].

received april 06, 2015; received in revised form june 08, 2015
corresponding author: padmanabhan balasubramanian, school of computer engineering, nanyang technological university, 50 nanyang avenue, singapore 639798 (e-mail: balasubramanian@ntu.edu.sg)

at the circuit/system level, self-timed design has been drawing sustained interest from the research community over the past decades due to several inherent advantages such as low noise [3] and almost nil electro-magnetic interference (emi) [4], greater modularity [5], the ability to cope with process, temperature, and parametric variations with ease, thus being inherently adaptive [6] [7], consuming power only when and where active [5] [8], and being self-checking [9]. low noise and low emi imply that self-timed circuits are innately resistant to side channel attacks [10] [11] and are therefore preferable for secure banking, financial and other sensitive applications.
modularity, also known as design reusability, and the capacity to tolerate process, temperature and parametric uncertainties imply that self-timed circuits are well positioned to deal with statistical timing analysis and reliability issues whilst delivering average-case performance. because power is consumed only on demand, depending on when and where it is required, self-timed circuits/systems form a natural choice for ultra low power vlsi designs, where complementary design strategies such as multiple supplies, multiple thresholds, and dynamic voltage and/or frequency scaling may be deployed to leverage the maximum benefits from a self-timed design. being self-checking, self-timed circuits/systems conform to the design-for-testability paradigm, although complexities may be involved in the testability of certain asynchronous elements, for example the c-element, which incorporates feedback; nevertheless, some feasible approaches are reported [12] [13]. asynchronous designs are primarily classified as bundled-mode and input-output mode, and here we only consider the input-output mode, which is the more robust of the two as it employs unbounded delay models for components (gates) and/or interconnect [14]. asynchronous circuits/systems corresponding to the input-output mode are commonly referred to as self-timed circuits/systems [15]. the fundamental architecture of a self-timed system is shown in figure 1, which has a centrally located function block. the function block in a self-timed system is equivalent to the combinational logic of a synchronous system, with the exceptions that it is realized using delay-insensitive codes and that it must not only produce the correct outputs for the applied inputs but also signal the completion of internal data computation. thus the function block forms the heart of a self-timed system that performs data processing.
in this article, the term ‘function block’ may refer to an arithmetic element, say the full adder, or a sub-system, for example, a ripple carry adder (rca). the self-timed system portrayed in figure 1 utilizes delay-insensitive codes (here, the dual-rail code) for data representation, communication and processing, and 4-phase return-to-zero (rtz) handshaking. the dual-rail code is the simplest member of the generic family of delay-insensitive m-of-n codes [16], where m wires are asserted ‘high’ (i.e. binary 1) out of a total of n wires to represent binary data. in a dual-rail code, a data wire d is encoded using two wires viz. d0 and d1, where d = 1 is represented by d1 = 1 and d0 = 0, and d = 0 is represented by d0 = 1 and d1 = 0. when d1 and d0 signify a binary value of 0 or 1 according to the assignments mentioned, it is called ‘valid data’. the state of both d0 and d1 being equal to 0 is referred to as the ‘spacer’. note that d0 and d1 cannot both transition to 1: this state is illegal and invalid because the coding scheme adopted is unordered [17], i.e. no codeword may form a subset of another codeword.
referring to figure 1, the 4-phase handshake protocol is explained as follows (see footnote 1). the dual-rail data bus that feeds the current stage register is initially in the spacer state, and the acknowledge input (ackin) for the current stage register is high (binary 1), since the acknowledge output (ackout) provided by the next stage register is low (binary 0). the current stage register now transmits a codeword (i.e. valid data). this results in low to high transitions on the bus wires (i.e. one rail of every dual-rail signal is asserted as binary 1) feeding the function block.
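as an illustration of the dual-rail code described above, the following python sketch encodes a data bit onto the rail pair (d1, d0), decodes it, and classifies a bus as valid data or spacer. the function names and the 'intermediate' label for a partially arrived bus are assumptions made for illustration and do not come from the article.

```python
from typing import List, Optional, Tuple

SPACER: Tuple[int, int] = (0, 0)  # both rails low


def encode(bit: int) -> Tuple[int, int]:
    """Return the rail pair (d1, d0) that represents one data bit."""
    if bit not in (0, 1):
        raise ValueError("bit must be 0 or 1")
    return (1, 0) if bit == 1 else (0, 1)


def decode(d1: int, d0: int) -> Optional[int]:
    """Return the data bit, or None for the spacer; (1, 1) is illegal."""
    if (d1, d0) == (1, 1):
        raise ValueError("illegal state: both rails high")
    if (d1, d0) == SPACER:
        return None
    return 1 if d1 == 1 else 0


def bus_state(rails: List[Tuple[int, int]]) -> str:
    """Classify a bus of rail pairs (label names are hypothetical)."""
    decoded = [decode(d1, d0) for d1, d0 in rails]
    if all(value is None for value in decoded):
        return "spacer"
    if all(value is not None for value in decoded):
        return "valid"
    return "intermediate"  # some signals have arrived, others have not
```

in this model a bus alternates between "spacer" and "valid", passing through "intermediate" while individual rails are still switching.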
after the next stage register receives a codeword, subsequent to data processing in the function block, it drives the ackout wire to binary 1, and the ackin wire assumes binary 0. the current stage register waits for the ackin signal to become 0 and then resets the data bus, i.e. the data bus feeding the function block is driven to the spacer state. after an unbounded but finite and positive amount of time taken for resetting the function block and for the passage of spacer data to the following register stage, the next stage register drives ackout to 0, and ackin assumes 1. a data transaction is now said to be complete, and the system is ready to proceed with the next transaction. the application of data in the self-timed system depicted by figure 1 follows the sequence: valid data – spacer – valid data – spacer, and so forth.
fig. 1 standard self-timed system architecture employing delay-insensitive data encoding and 4-phase handshaking (blocks: current stage register, function block, next stage register, completion detectors; signals: data bus, ackin, ackout)
footnote 1: this explanation remains valid for any delay-insensitive data encoding scheme.
2. function blocks – classification and timing behavior
self-timed function blocks are classified as strongly indicating and weakly indicating [18] depending on the manner in which they indicate (i.e. acknowledge) the arrival of the primary inputs. the differences between the properties of strong and weak-indication function blocks are explained using the illustrative timing diagram shown as figure 2.
2.1. strong-indication function block
a strongly indicating function block [19] waits for all the valid/spacer primary inputs to arrive and then starts to compute and produce the desired valid/spacer primary outputs. the strong input-output conditions are given as follows:
• all the primary inputs must attain the valid/spacer state before any primary output attains the valid/spacer state
• all the primary outputs must have attained the valid/spacer state before any primary input attains the spacer/valid state
2.2. weak-indication function block
a weakly indicating function block [20] is capable of producing valid/spacer primary outputs subsequent to the arrival of just a subset of the valid/spacer primary inputs. however, the production of at least one valid/spacer primary output is delayed until all the valid/spacer primary inputs have arrived. reference [21] discusses two kinds of weak-indication function block implementations: i) distributed implementation [22], where the task of indicating the primary inputs is shared between the primary outputs, and ii) biased implementation [18] [23], where the responsibility of indicating the primary inputs is delegated to a single primary output. the weak input-output conditions are given below:
• some valid/spacer primary outputs are produced subsequent to the arrival of a subset of the valid/spacer primary inputs
• all the valid/spacer primary inputs should have arrived before all the respective valid/spacer primary outputs are produced
• all the valid/spacer primary outputs should have been produced before any subsequent spacer/valid primary input(s) arrive
fig. 2 input-output behavior of strong and weak-indication function blocks (traces: inputs arrived; outputs produced, strong-indication; outputs produced, weak-indication; levels: all, none)
3. weak-indication: basic, distributed, and biased implementations
three types of weakly indicating self-timed full adder implementations viz. basic, distributed, and biased implementations are discussed in this section.
3.1. basic implementation
the weakly indicating dims full adder [24], shown in figure 3, is an example of the basic weak-indication implementation style.
the circuit consists of two levels, with c-elements realizing the product terms in the first level and or gates summing up the product terms in the second level. only one product term will be activated to produce the sum or carry outputs, thus satisfying the monotonic cover constraint [14], which requires that the product terms comprising a boolean function be mutually disjoint [25] [26]. the c-element (see footnote 2) is highlighted by the circle with the marking ‘c’ on its periphery. a1, a0, b1, b0, cin1 and cin0 represent the dual-rail primary inputs, while sum1, sum0, cout1 and cout0 represent the dual-rail primary outputs. a1 and a0 represent the augend, and b1 and b0 represent the addend inputs. the logic equations governing the full adder are:
sum1 = a0b0cin1 + a0b1cin0 + a1b0cin0 + a1b1cin1 (1)
sum0 = a0b0cin0 + a0b1cin1 + a1b0cin1 + a1b1cin0 (2)
cout1 = a0b1cin1 + a1b0cin1 + a1b1 (3)
cout0 = a0b1cin0 + a1b0cin0 + a0b0 (4)
from (1) to (4), it is evident that the sum outputs depend upon all the primary inputs, while the carry outputs may not. when carry-generation occurs (i.e. a1 = b1 = 1), cout1 would output binary 1 irrespective of the value of the carry input cin1 or cin0. on the other hand, when the carry-kill condition occurs (i.e. a0 = b0 = 1), cout0 would output binary 1 regardless of the value of the incoming carry signal cin1 or cin0. thus for both valid data and spacer, while the sum outputs have to wait for the arrival of all the primary inputs, the carry outputs may not. however, since the product terms are realized using c-elements for both the sum and carry outputs, the carry-output logic provides an additional acknowledgement for some/all of the primary inputs besides the sum outputs. when the carry-propagate condition (i.e.
a1 = b0 = 1 or a0 = b1 = 1) occurs, the arrival of the augend, addend and carry inputs is indicated by both the sum and carry outputs, i.e. the responsibility of indicating the primary full adder inputs is neither shared between the sum and carry outputs nor confined to a single primary output (i.e. the sum or carry output); rather, multiple acknowledgments tend to manifest for both valid data and spacers. as a result, if the full adder shown in figure 3 is cascaded to form an n-bit rca and if the carry signal propagates through a maximum of m out of n full adder stages in the rca, the forward latency and the reverse latency would each be specified by o(m). the forward latency signifies the maximum propagation delay incurred in processing the valid data inputs, and the reverse latency denotes the maximum propagation delay encountered for the passage of spacer data inputs (i.e. the time taken for the reset of the self-timed circuit/system) [14]. hence, the cycle time, which is the time taken for a single data transaction and is computed as the sum of forward and reverse latencies, would be specified by o(2m).
footnote 2: the muller c-element/c-gate basically governs the rendezvous of the input signals. hence the c-element is also referred to as an ‘input-complete element’. it outputs a 1/0 only if all its inputs are 1s/0s respectively. it retains the existing steady-state in case the inputs are different.
fig. 3 dims weak-indication full adder
3.2. distributed implementation
martin’s full adder [22], which forms an example of the distributed weak-indication implementation, is shown in figure 4. equations (5) to (8) are synthesized by martin’s full adder using a full-custom static cmos design involving 42 transistors. in fact, (5) to (8) represent the factorized forms of (1) to (4).
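the statement that (5) to (8) are factorized forms of (1) to (4) can be checked by exhaustive enumeration. the sketch below, with hypothetical function names, transcribes both sets of equations from the article and compares them over all 64 combinations of the six rail values; it also checks that, for valid dual-rail inputs, the outputs encode the usual binary full adder result.

```python
from itertools import product


def sop_form(a1, a0, b1, b0, c1, c0):
    """Equations (1) to (4): sum-of-products form (c1/c0 stand for cin1/cin0)."""
    sum1 = (a0 & b0 & c1) | (a0 & b1 & c0) | (a1 & b0 & c0) | (a1 & b1 & c1)
    sum0 = (a0 & b0 & c0) | (a0 & b1 & c1) | (a1 & b0 & c1) | (a1 & b1 & c0)
    cout1 = (a0 & b1 & c1) | (a1 & b0 & c1) | (a1 & b1)
    cout0 = (a0 & b1 & c0) | (a1 & b0 & c0) | (a0 & b0)
    return sum1, sum0, cout1, cout0


def factored_form(a1, a0, b1, b0, c1, c0):
    """Equations (5) to (8): factorized form."""
    equal = (a0 & b0) | (a1 & b1)    # augend and addend bits are equal
    differ = (a0 & b1) | (a1 & b0)   # carry-propagate condition
    sum1 = (equal & c1) | (differ & c0)
    sum0 = (equal & c0) | (differ & c1)
    cout1 = (differ & c1) | (a1 & b1)
    cout0 = (differ & c0) | (a0 & b0)
    return sum1, sum0, cout1, cout0


def forms_agree() -> bool:
    """Compare both forms over every combination of the six rails."""
    return all(sop_form(*v) == factored_form(*v)
               for v in product((0, 1), repeat=6))


def matches_binary_adder() -> bool:
    """Check the dual-rail outputs against a + b + cin for valid inputs."""
    for a, b, c in product((0, 1), repeat=3):
        rails = (a, 1 - a, b, 1 - b, c, 1 - c)
        total = a + b + c
        s, co = total & 1, total >> 1
        if factored_form(*rails) != (s, 1 - s, co, 1 - co):
            return False
    return True
```

both checks are purely boolean and say nothing about timing or indication, which depend on how the product terms are realized (c-elements versus ordinary gates).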
the nmos network realizes the full adder functionality and is activated when the valid data inputs are supplied, while the pmos network is activated during the application of spacers and resets the full adder during the rtz phase.
sum1 = (a0b0 + a1b1) cin1 + (a0b1 + a1b0) cin0 (5)
sum0 = (a0b0 + a1b1) cin0 + (a0b1 + a1b0) cin1 (6)
cout1 = (a0b1 + a1b0) cin1 + a1b1 (7)
cout0 = (a0b1 + a1b0) cin0 + a0b0 (8)
fig. 4 martin’s weak-indication full adder: (a) sum logic, (b) carry output logic
when valid data inputs are supplied, the operation of martin’s full adder is identical to that of the weakly indicating dims full adder discussed previously. therefore the forward latency of an n-bit self-timed rca employing martin’s full adder would be specified by o(m), where m signifies the maximum length of the carry chain activated in the rca. however, when spacer data are applied, the carry output is reset through the spacer states of the augend and addend inputs, while the sum output is reset only after the carry input has become a spacer as well. therefore, the responsibility of indicating the primary inputs is distributed between the primary outputs (viz. sum and carry outputs) in martin’s full adder, especially when spacer data are applied during the rtz phase. since the carry signal alone may be required to propagate through the full adders in an n-bit rca, subject to the propagate mode becoming active, martin’s full adder paves the way for a fast and simultaneous reset of all the intermediate carry outputs in the rca by involving just one full adder delay, soon after the corresponding augend and addend inputs of the full adders have become spacers regardless of the carry input.
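the quantity m, the maximum length of the carry chain activated in the rca, depends on the operands applied. the sketch below estimates it as the longest run of consecutive bit positions in carry-propagate mode (augend bit different from addend bit). this counting convention and the function name are assumptions made for illustration, since the text does not spell out how m is counted.

```python
def longest_propagate_run(a: int, b: int, n: int) -> int:
    """Longest run of consecutive stages, out of n, with a_i != b_i."""
    propagate = (a ^ b) & ((1 << n) - 1)  # bit i set: stage i propagates
    longest = 0
    run = 0
    for i in range(n):
        if (propagate >> i) & 1:
            run += 1
            longest = max(longest, run)
        else:
            run = 0  # a generate or kill stage breaks the chain
    return longest
```

for example, with a = 0b0111 and b = 0b0001 in a 4-bit rca, stages 1 and 2 are in propagate mode, so the estimate is 2; with a = 0xffffffff and b = 0 in a 32-bit rca every stage propagates and the estimate equals n.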
once the carry output of a kth stage full adder attains a spacer, which in turn serves as the carry input for the (k + 1)th stage full adder, the sum output of the (k + 1)th stage full adder in the rca would also become a spacer, which entails another full adder delay. therefore the reverse latency of the n-bit self-timed rca employing a cascade of martin’s full adders is not data-dependent, unlike the previous case, but is just a constant, which is approximately equal to two full adder delays. as a result, the cycle time of the n-bit rca incorporating a cascade of martin’s full adders is specified by o(m + 2).
3.3. biased implementation
seitz’s weakly indicating full adder design [18], shown in figure 5, constitutes a good example of the biased implementation style, and synthesizes (1) to (4). note that the full adder shown in figure 5 is similar in many aspects to that portrayed by figure 3, with a few exceptions: i) the product terms are realized using and gates in figure 5 instead of the c-elements in figure 3, and ii) the intermediate sum outputs viz. intsum1 and intsum0 are combined with the output of the 6-input or gate (org) that logically sums up the dual-rail input signals to produce the primary sum outputs viz. sum1 and sum0. the carry output logic of seitz’s weak-indication full adder may utilize the carry-generate or the carry-kill condition similar to that of the dims weak-indication full adder or martin’s full adder.
to explain the biased approach prevalent in the design of seitz’s weak-indication full adder, let us consider two example scenarios. when valid data inputs are applied to the full adder shown in figure 5, and assuming that the carry propagates, one of the 3-input and gates present in the first level of the full adder would transition to 1, which will cause a similar transition on intsum1 or intsum0. even with a single dual-rail primary input transitioning to 1, org would transition to 1.
however, with isochronic fork assumptions [27] imposed on the primary inputs, it is implied that the low to high transitions on the primary inputs of the and gate are simultaneously accompanied by similar transitions on the inputs of the 6-input or gate. the isochronic fork implies that when a transition arrives on one branch of a node and is acknowledged, the transitions on all other branches of the same node are also assumed to have arrived at the same time and hence they are considered to be acknowledged. subsequently, the low to high transition on intsum1/intsum0 is combined with the transition on org, resulting in the production of a low to high transition on the primary sum output viz. sum1/sum0. notice that the low to high transition on the output of an and gate would also cause a similar transition on the carry output viz. cout1/cout0. in a subsequent rtz phase, the and gate which experienced a low to high transition earlier would now output the spacer, and this can happen even with any one of its inputs assuming the spacer state, which leads to the production of a spacer output on cout1/cout0, whichever was asserted high previously.
fig. 5 seitz’s weak-indication full adder
let us now consider that the carry-generate mode is active. under this consideration, when valid data inputs are supplied to the full adder shown in figure 5, a 3-input and gate and a 2-input and gate (which implements ‘a1b1’) would transition to 1. with org also transitioning to 1, either sum1 or sum0 would experience a low to high transition, and cout1 would also experience a low to high transition. notice here that cout1 acknowledges the arrival of only the augend and addend inputs, i.e.
a1 and b1, and not the carry input cin1/cin0. however, intsum1/intsum0 and subsequently sum1/sum0 indicate the arrival of the augend and addend inputs as well as the carry input. in the following rtz phase, even with either a1 or b1 becoming a spacer, cout1/cout0, which transitioned to 1 earlier, would now assume the spacer state. thus cout1/cout0 may not specifically acknowledge the rtz of a1 and/or b1, nor is the rtz of the carry input acknowledged. however, org, which indicates the rtz of those dual-rail primary inputs that experienced a low to high transition previously, when coupled with intsum1/intsum0 results in the rtz of sum1/sum0 respectively. hence the primary sum output is found to assume the entire responsibility of duly indicating the rtz of all the primary inputs. this deliberation would equally apply for the carry-kill condition.
from the preceding discussions, it may be understood that in the case of seitz’s weak-indication full adder, the sum output assumes the responsibility of indicating the complete arrival of all the primary inputs subsequent to the application of valid or spacer data inputs, and that the carry output is freed from indication constraints; thus there is a bias towards the carry output. this is beneficial, as it paves the way for fast carry propagation between the full adder stages in an n-bit rca. also, note that the 6-input or gate shown as part of figure 5 cannot be decomposed arbitrarily due to the gate orphan problem [21] [29] that would arise for the application of valid data. a gate orphan is an unacknowledged transition at a gate output.
fig. 6 naïve decomposition of the 6-input or gate, potentially causing a gate orphan: (a) safe realization, (b) unsafe realization
to explain the gate orphan problem associated with a naïve logic decomposition, consider figure 6, wherein the 6-input or gate is decomposed into a 4-input or gate and a 3-input or gate. suppose that during a valid data phase cin1 transitions to 1. output org will then transition to 1 in both the realizations shown in figure 6 without waiting for the low to high transitions to occur on the remainder of the dual-rail inputs viz. a1/a0 and b1/b0. subsequently, if a1/a0 and b1/b0 also experience transitions, they will not be acknowledged by the or gate in figure 6a, but they do not give rise to wire orphans since they are considered to be acknowledged by the and gate(s) present in the first level of figure 5 through the isochronic fork assumption. let us revisit the same scenario of cin1 transitioning to 1 before a1/a0 and b1/b0 experience low to high transitions, now with reference to figure 6b. it can be seen that after cin1 experiences a low to high transition, the output org will also experience a low to high transition irrespective of any transition occurring on the intermediate output int. subsequently, if a1/a0 and b1/b0 also transition to 1, the internal output int will experience a transition to 1. however, the low to high transition on int will not be acknowledged by the output org, and this unacknowledged transition on the internal gate output (int) is referred to as a gate orphan, which may be eliminated only through sophisticated timing assumptions. gate orphans are problematic and tend to affect the robustness of a self-timed circuit/system. therefore, self-timed implementations should be devoid of gate orphans in order to be robust.
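the scenario described for figure 6b can be replayed with a small zero-delay model. the sketch below assumes, as in the text, that the 4-input or gate combines a1, a0, b1 and b0 into int, and that the 3-input or gate combines int, cin1 and cin0 into org. the function name and the rule used to flag an orphan (int rises while org does not change) are simplifications introduced here for illustration.

```python
from typing import List


def find_gate_orphans(arrival_order: List[str]) -> List[str]:
    """Replay rising input transitions in order and report rising
    transitions on int that cause no transition on org."""
    inputs = dict.fromkeys(["a1", "a0", "b1", "b0", "cin1", "cin0"], 0)
    int_out = 0
    org = 0
    orphans = []
    for name in arrival_order:
        inputs[name] = 1
        # 4-input or gate of the decomposed (unsafe) realization
        new_int = inputs["a1"] | inputs["a0"] | inputs["b1"] | inputs["b0"]
        # 3-input or gate producing the output org
        new_org = new_int | inputs["cin1"] | inputs["cin0"]
        if new_int != int_out and new_org == org:
            orphans.append("int")  # int switched, but org did not react
        int_out, org = new_int, new_org
    return orphans
```

with the arrival order cin1, a1, b0 the model reports an orphan on int, because org has already risen when int switches; with the order a1, b0, cin1 the rise of int is what raises org, so no gate orphan is reported.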
it can be inferred from [28] that a strong-indication n-bit rca constructed using strongly indicating full adder blocks has fixed forward and reverse latencies of o(n), and hence exhibits the worst-case cycle time of o(2n). on the other hand, the weak-indication n-bit rca composed using basic weak-indication full adders has similar forward and reverse latencies of o(m), and hence features a cycle time of o(2m), where m denotes the maximum length of carry propagation in the n-bit rca. the weak-indication n-bit rca constructed using the distributed or biased implementation style weak-indication full adders has forward and reverse latencies of o(m) and o(2) respectively, and hence the least cycle time of o(m + 2) [28]. given these, weak-indication realizations are preferable to their strong-indication counterparts for self-timed design of arithmetic circuits.
4. proposed biased implementation style weak-indication full adder
the proposed full adder, which corresponds to the biased implementation style of weak-indication, is depicted in figure 7. it synthesizes (5) to (8) using 4 simple gates, viz. 2-input or gates, and 10 complex gates. among the 10 complex gates, 8 are 2-input muller c-elements, where a 2-input c-element is realized using an ao222 cell with feedback, and the remaining 2 are ao21 gates. assuming x and y are the inputs and z is the output of a 2-input c-element, z = xy + (x + y) z. presuming that a and b are the inputs given to the and logic part of an ao21 gate and with c as its other input, the output of the ao21 gate, say d, is given by d = ab + c. when ab and/or c are equal to 1, d equates to 1; hence the ao21 gate is said to be input-incomplete. in general, with the exception of the c-gate, all other logic gates tend to exhibit input-incomplete behavior.
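the two gate equations above, z = xy + (x + y) z for the 2-input c-element and d = ab + c for the ao21 gate, can be modeled behaviorally as follows. this is a minimal sketch: the class and function names are hypothetical, and the model ignores gate delays and the ao222-with-feedback realization.

```python
class CElement2:
    """2-input muller c-element: z = xy + (x + y) z."""

    def __init__(self, state: int = 0) -> None:
        self.z = state  # stored output, retained when the inputs differ

    def update(self, x: int, y: int) -> int:
        self.z = (x & y) | ((x | y) & self.z)
        return self.z


def ao21(a: int, b: int, c: int) -> int:
    """ao21 complex gate: d = ab + c (input-incomplete)."""
    return (a & b) | c
```

the c-element output rises only after both inputs are 1 and falls only after both are 0, which is the input-complete behavior; the ao21 output can rise on c alone, without a and b having arrived, which is why it is input-incomplete.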
it can be seen in figure 7 that the product terms corresponding to the sum logic are realized using input-complete c-elements. hence the sum output would indicate the complete arrival of all the primary inputs viz. augend, addend and carry inputs for both valid and spacer data. on the other hand, the carry output logic is realized using a mix of input-complete c-elements and input-incomplete complex gates. since the sum output fully indicates the arrival of all the primary inputs during the valid and spacer data phases, the carry output at best provides multiple acknowledgments for the arrival of valid or spacer data on the augend and addend inputs and/or the carry input. hence the proposed full adder features a bias toward the carry output in terms of relaxing its indication constraints, and the advantages associated with such an implementation in terms of lower forward and reverse latencies and cycle time have been articulated earlier. to elaborate on this, let us consider the following:
• carry-propagate mode: once the primary inputs assume valid data states, internal outputs intm1 or intm2 and intm3, shown in figure 7, would transition to 1. depending on whether cin1 or cin0 experiences a low to high signal transition, a similar transition is reflected on the corresponding primary output, cout1 or cout0. when spacer data are applied subsequently, even with intm1 or intm2 and intm3 becoming a spacer, the carry output which transitioned to 1 earlier would now be reset regardless of the carry input becoming a spacer
• carry-generate or carry-kill mode: when valid data are supplied through the primary inputs, an intermediate output intm4 or intm5, highlighted in figure 7, makes a low to high signal transition, which is followed by a similar transition on cout1/cout0 respectively. this could occur regardless of a transition on the carry input. subsequently, when spacer data are applied on the primary inputs, intm4 or intm5 which transitioned to 1 earlier would now assume the spacer state, which is acknowledged by the respective carry output. this could also happen irrespective of the carry input becoming a spacer
fig. 7 proposed weak-indication full adder
5. simulation results and discussion
a number of 32-bit self-timed rcas were constructed in a semi-custom design fashion at the gate-level by utilizing the various strong and weak-indication full adders separately. the structural integrity of the different gate-level self-timed full adders was preserved during the physical realization (technology mapping) to pave the way for a legitimate comparison, and they were implemented using the elements of a 32/28nm cmos cell library [30]. only the 2-input c-element was designed manually, using the ao222 cell with feedback, and was made available to realize the self-timed designs; the 3-input c-elements were decomposed safely into 2-input c-elements using the method of [29]. the self-timed rcas comprise the function block, the input registers, and the completion detection circuit. the input registers and the completion detector part of the various rcas are identical, and only the function blocks differ. hence the differences between the simulation results of the various rcas can be attributed to the differences between their constituent full adders. more than 1000 random input vectors were supplied to the rcas at time intervals of 20ns through test benches in order to capture the switching activities. the .vcd files generated were subsequently used for power estimation using synopsys tools.
since the eda tool estimates just critical path timing, only the worst-case forward latency was evaluated. appropriate wire loads were included automatically whilst performing the simulations. as part of the advanced timing analysis, a virtual clock was used just to constrain the input and output ports of the rcas, and it did not consume any power. the power, (forward) latency, and area results obtained for the various 32-bit rcas are shown in table 1. the indication type of each full adder is highlighted in the 1st column of table 1. the area of the rcas and the respective full adders are given before and after the semicolon in the 4th column. the gates present in the critical path of the different rcas are mentioned in the 5th column. the simulation results correspond to a typical case specification (1.05v, 25 °c) of the 32/28nm cmos process [30]. the primary sum and carry outputs of the rcas possess fanout-of-4 drive strength.
table 1 power, latency and area parameters of different 32-bit self-timed rcas incorporating distinct full adders
full adder and its indication type | power (µw) | latency (ns) | rca; full adder area (µm²) | critical path elements
singh [31] – strong | 2190 | 14.61 | 2529; 54.64 | 2 ce2, 2 or3
dims [24] – strong | 2181 | 9.26 | 2504.60; 53.88 | ce2, or4
dims [24] [14] – weak | 2177 | 8.24 | 2423.27; 51.34 | ce2, or3
toms [32] – strong | 2172 | 9.04 | 2293.14; 47.27 | ce2, 2 or2
folco et al. [33] – weak | 2171 | 7.00 | 2016.63; 38.63 | ce2, or2
sssc [23] – weak | 2174 | 4.43 | 2097.96; 41.17 | ao222
toms & edwards [34] – weak | 2192 | 9.66 | 2642.85; 58.20 | and2, ce2, or3
proposed – weak | 2171 | 3.32 | 2049.16; 39.65 | ao21
ce2: 2-input c-element; and2: 2-input and gate; or2/3/4: 2/3/4-input or gate; ao222 and ao21 are complex gates
the differences in the latency figures of the various rcas are due to the different logical operators found in their critical paths, as mentioned in table 1.
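relative differences between rows of table 1 follow from simple ratios. the sketch below recomputes them for three of the weak-indication rows; the dictionary layout and helper names are assumptions made for illustration, while the numbers are taken from table 1.

```python
# (power in µW, latency in ns, rca area in µm²), copied from table 1
TABLE_1 = {
    "folco_weak": (2171, 7.00, 2016.63),
    "sssc_weak": (2174, 4.43, 2097.96),
    "proposed_weak": (2171, 3.32, 2049.16),
}


def percent_reduction(reference: float, new: float) -> float:
    """Reduction of `new` relative to `reference`, in percent."""
    return 100.0 * (reference - new) / reference


def percent_increase(reference: float, new: float) -> float:
    """Increase of `new` relative to `reference`, in percent."""
    return 100.0 * (new - reference) / reference


_, folco_latency, folco_area = TABLE_1["folco_weak"]
_, sssc_latency, _ = TABLE_1["sssc_weak"]
_, prop_latency, prop_area = TABLE_1["proposed_weak"]

latency_vs_folco = round(percent_reduction(folco_latency, prop_latency), 1)
latency_vs_sssc = round(percent_reduction(sssc_latency, prop_latency), 1)
area_vs_folco = round(percent_increase(folco_area, prop_area), 1)
```

with the table values, the proposed design shows about 52.6% lower latency than the folco et al. design, about 25.1% lower latency than the sssc design, and about 1.6% more rca area than the folco et al. design.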
it can be seen in table 1 that the 32-bit rca incorporating the proposed full adder features the least latency of 3.32ns among its counterparts, thanks to the ao21 cell used for implementing the carry output logic of the proposed full adder. with respect to power dissipation, the 32-bit rca featuring folco et al.’s full adder is comparable with that incorporating the proposed full adder since both dissipate a similar average power of 2171µw. it can be seen that the total power dissipation does not vary much across the different rcas, although the variations in area are quite significant. this is because self-timed designs, unlike synchronous designs, have a unique signal propagation path for each input pattern as they adhere to the monotonic cover constraint [14]. in terms of area, the 32-bit rca incorporating folco et al.’s full adder occupies the least area, while the 32-bit rca constructed using the proposed full adder occupies just 1.6% more silicon. however, the latter achieves 52.6% lower latency than the former. also, note that this latency reduction is not achieved at the expense of any extra power dissipation for the latter compared to the former. the sssc full adder, which is a gate-level design, features a carry output logic that is similar to the carry output logic of martin’s full adder, which is a transistor-level design. hence the sssc full adder was considered as a substitute for martin’s full adder in implementing the rca, as both have similar latency and cycle time metrics. the weak-indication sssc full adder corresponds to the biased implementation style, similar to the proposed full adder, and the proposed full adder reduces latency by 25.1% compared to the sssc full adder for a 32-bit rca implementation with no penalty in terms of power or area parameters.
6. conclusion
a new full adder design that corresponds to the biased implementation style of weak-indication was presented. for an n-bit rca realized using the proposed full adder, the forward and reverse latencies and cycle time are specified by o(m), o(2), and o(m + 2) respectively, where m denotes the maximum number of full adder stages in the n-bit rca through which the carry propagates. example 32-bit self-timed rca implementations incorporating different strong and weak-indication full adders were analyzed. against the best of the strong-indication full adders, the proposed full adder reports respective reductions in latency and area by 63.3% and 10.6% whilst dissipating similar power. on the other hand, in comparison with the existing optimized weak-indication full adder, the proposed full adder achieves reduced latency by 25.1% with no power penalty albeit at a small area expense of 1.6%. overall, from a combined power-latency-area perspective, the proposed full adder is found to yield optimum quality-of-results.
references
[1] semiconductor industry association’s itrs report. available: http://www.itrs.net
[2] s. kundu and a. sreedhar, nanoscale cmos vlsi circuits: design for manufacturability, mcgraw-hill, usa, 2010.
[3] n.c. paver, p. day, c. farnsworth, d.l. jackson, w.a. lien and j. liu, "a low-power, low noise, configurable self-timed dsp", in proceedings of the 4th international symposium on advanced research in asynchronous circuits and systems, 1998, pp. 32-42.
[4] g.f. bouesse, g. sicard, a. baixas and m. renaudin, "quasi delay insensitive asynchronous circuits for low emi", in proceedings of the 4th international workshop on electromagnetic compatibility of integrated circuits, 2004, pp. 27-31.
[5] c.h. van kees berkel, m.b. josephs and s.m. nowick, "scanning the technology applications of asynchronous circuits", in proceedings of the ieee, vol. 87, pp. 223-233, 1999.
[6] k.j. kulikowski, v. venkataraman, z. wang, a. taubin and m.
karpovsky, "asynchronous balanced gates tolerant to interconnect variability", in proceedings of the ieee international symposium on circuits and systems, 2008, pp. 3190-3193.
[7] i.j. chang, s.p. park and k. roy, "exploring asynchronous design techniques for process-tolerant and energy-efficient subthreshold operation", ieee journal of solid-state circuits, vol. 45, pp. 401-410, 2010.
[8] o.c. akgun, j. rodrigues and j. sparsø, "minimum-energy sub-threshold self-timed circuits: design methodology and a case study", in proceedings of the 16th ieee international symposium on asynchronous circuits and systems, 2010, pp. 41-51.
[9] i. david, r. ginosar and m. yoeli, "self-timed is self-checking", journal of electronic testing: theory and applications, vol. 6, pp. 219-228, 1995.
[10] z.c. yu, s.b. furber and l.a. plana, "an investigation into the security of self-timed circuits", in proceedings of the 9th international symposium on asynchronous circuits and systems, 2003, pp. 206-215.
[11] d. sokolov, j. murphy, a. bystrov and a. yakovlev, "design and analysis of dual-rail circuits for security applications", ieee transactions on computers, vol. 54, pp. 449-460, 2005.
[12] d. koppad and a. efthymiou, "bist for strongly-indicating asynchronous circuits", in proceedings of the 17th ifip international conference on very large scale integration, 2009, pp. 215-218.
[13] a. efthymiou, "initialization-based test pattern generation for asynchronous circuits", ieee transactions on vlsi systems, vol. 18, pp. 591-601, 2010.
[14] j. sparsø and s. furber (editors), principles of asynchronous circuit design: a systems perspective, kluwer academic publishers, netherlands, 2001.
[15] balasubramanian padmanabhan, "self-timed logic and the design of self-timed adders", phd thesis, school of computer science, the university of manchester, uk, 2010.
[16] t.
verhoeff, "delay-insensitive codes – an overview", distributed computing, vol. 3, pp. 1-8, 1998. [17] b. bose, "on unordered codes", ieee transactions on computers, vol. 40, pp. 1-8, 1988. [18] c.l. seitz, "system timing", introduction to vlsi systems, c. mead and l. conway (editors), pp. 218-262, addison-wesley, reading, massachusetts, usa, 1980. [19] p. balasubramanian and d.a. edwards, "efficient realization of strongly indicating function blocks", in proceedings of the ieee computer society annual symposium on vlsi, 2008, pp. 429-432. [20] p. balasubramanian and d.a. edwards, "a new design technique for weakly indicating function blocks", in proceedings of the 11th ieee workshop on design and diagnostics of electronic circuits and systems, 2008, pp. 116-121. [21] c. jeong and s.m. nowick, "block-level relaxation for timing-robust asynchronous circuits based on eager evaluation", in proceedings of the 14th ieee international symposium on asynchronous circuits and systems, pp. 95-104, 2008. [22] a.j. martin, "asynchronous datapaths and the design of an asynchronous adder", formal methods in system design, vol. 1, pp. 117-137, 1992. [23] p. balasubramanian and d.a. edwards, "a delay efficient robust self-timed full adder", in proceedings of the ieee 3rd international design and test workshop, 2008, pp. 129-134. [24] j. sparsø and j. staunstrup, "delay-insensitive multi-ring structures", integration, the vlsi journal, vol. 15, pp. 313-340, 1993. [25] p. balasubramanian and d.a. edwards, "self-timed realization of combinational logic", in proceedings of the 19th international workshop on logic and synthesis, 2010, pp. 55-62. [26] p. balasubramanian, r. arisaka and h.r. arabnia, "rb_dsop: a rule based disjoint sum of products synthesis method", in proceedings of the 12th international conference on computer design, 2012, pp. 39-43. [27] a.j.
martin, "the limitation to delay-insensitivity in asynchronous circuits", in proceedings of the 6th mit conference on advanced research in vlsi, 1990, pp. 263-278. [28] p. balasubramanian and n.e. mastorakis, "timing analysis of quasi-delay-insensitive ripple carry adders – a mathematical study", in proceedings of the 3rd european conference of circuits technology and devices, 2012, pp. 233-240. [29] p. balasubramanian and n.e. mastorakis, "qdi decomposed dims method featuring homogeneous/heterogeneous data encoding", in proceedings of the international conference on computers, digital communications and computing, 2011, pp. 93-101. [30] synopsys digital standard cell library saed_edk32/28_core databook, revision 1.0.0, 2012. [31] n.p. singh, "a design methodology for self-timed systems", m.sc. thesis, mit laboratory for computer science technical report tr-258, 1981. [32] w.b. toms, "synthesis of quasi-delay-insensitive datapath circuits", phd thesis, school of computer science, the university of manchester, uk, 2006. [33] b. folco, v. bregier, l. fesquet and m. renaudin, "technology mapping for area optimized quasi delay insensitive circuits", in proceedings of the ifip international conference on very large scale integration, 2005, pp. 146-151. [34] w.b. toms and d.a. edwards, "a complete synthesis method for block-level relaxation in self-timed datapaths", in proceedings of the 10th international conference on application of concurrency to system design, 2010, pp. 24-34. facta universitatis series: electronics and energetics vol. 31, no. 4, december 2018, pp. 519-528 https://doi.org/10.2298/fuee1804519j considerations on the importance of proper heat transfer coefficient modeling in air cooled electronic systems marcin janicki, agnieszka samson, tomasz raszkowski, tomasz torzewicz, andrzej napieralski department of microelectronics and computer science, lodz university of technology, poland abstract.
this paper illustrates, based on a practical example of a hybrid circuit, the influence of proper heat transfer coefficient modelling in air cooled electronic systems on the accuracy of thermal simulations. this circuit contains a transistor heat source and a set of temperature sensors. the measurements of their temperature responses are taken in natural convection and forced air cooling conditions. the experimental data provide the information necessary to estimate the local heat transfer coefficient values in heat source and temperature sensor locations. moreover, the experiments rendered possible the fitting of parameters of an empirical heat transfer coefficient model for different surface temperature rise values and cooling air velocities, and hence allowed significant improvement of thermal simulation accuracy. key words: thermal modeling, air cooling, heat transfer coefficient. 1. introduction temperature is the most important factor influencing the operation of all electronic systems and, at the same time, it is the principal cause of their failures [1]. thus, thermal analyses have become an indispensable stage of electronic system design. unfortunately, usually in simulations it is assumed that thermal model parameter values are temperature independent. however, in most real cases this assumption may not be true since, as shown in [2], the heat transfer coefficient values depend strongly on surface temperature rise and cooling air velocity. obviously, there exist thermal models, mostly empirical, which allow taking into account such dependencies, however typically for flat plates these relations were derived assuming the uniformity of surface temperature or heat flux [3]-[5]. 
thus, considering that in electronic circuits heat sources usually occupy a small part of their surface, large temperature gradients occur and consequently local heat transfer coefficient values differ considerably in various locations, rendering those standard relations useless in practice [6]-[7]. received september 9, 2018. corresponding author: marcin janicki, lodz university of technology, 116 żeromskiego street, 90-924 lodz, poland (e-mail: janicki@dmcs.pl). this problem is illustrated in this paper based on the example of a real hybrid circuit containing a transistor heat source and several thermistor temperature sensors. during the experiments, the heat source and sensor temperature values were measured both in natural and forced convection cooling conditions with variable air velocity. local heat transfer coefficient values were estimated by coupling the forward thermal solver with different inverse algorithms, which implement selected methods described in [8]-[10] and a simple algorithm proposed by the authors. moreover, the experiments made it possible to fit the empirical model parameters. the fitting was carried out employing various swarm intelligence algorithms [11]-[13]. the following section of this paper introduces in detail the investigated test hybrid circuit and provides the results of its temperature measurements. then, the estimation results of local heat transfer coefficient values are presented and discussed. finally, these values are fitted to an empirical model allowing the computation of the local heat transfer coefficient at a chosen location for any surface temperature rise and cooling air velocity. 2. experimental results 2.1. hybrid circuit description the investigations presented in this paper concern a test hybrid circuit, whose layout is presented in fig.
1, containing a bipolar junction transistor (bjt) and five platinum thermistors with a positive thermal coefficient (pt1-5). all these components can be used as temperature sensors and the bjt can serve at the same time as a heat source. the pcb is manufactured in the insulated metal substrate technology and is made of a 1 mm-thick aluminum alloy. the board is a long but narrow rectangle with the devices located in the middle of its width so that the heat diffusion along the substrate gradually becomes quasi one-dimensional. the distance between successive sensors doubles each time. initially all devices were calibrated using a cold plate. the bjt was calibrated for the emitter current values that were later used for heating, i.e. currents ranging from 250 mA to 1000 mA in steps of 250 mA, and the thermistor current was 750 µA. as expected, the measured device temperature characteristics were linear. the magnitude of the bjt base-emitter voltage sensitivity decreased with the emitter current, from -1.86 mV/K at 250 mA to -1.51 mV/K at 1000 mA, and the thermistor sensitivity was 96 µV/K. fig. 1 layout of the hybrid circuit (dimensions in millimeters; the bjt, the sensors pt1-pt5 and the wind direction are marked). 2.2. temperature measurement results the experiments consisted of measuring the dynamic thermal responses of the transistor heat source and the sensors for various levels of dissipated power, both with natural convection cooling and with forced convection cooling in a wind tunnel at different air velocities. the transistor temperature values were recorded with the transient thermal tester t3ster manufactured by mentor graphics and the thermistor temperature values were registered with a custom measurement board designed by the authors.
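because the calibration characteristics are linear, a measured change of the sensor voltage converts to a temperature rise by a single division. the lines below are a minimal sketch of that conversion, assuming the sensitivity stays constant over the whole temperature range; the function name and the example voltage change are hypothetical, and only the sensitivity value comes from the calibration described above.

```python
def temperature_rise(delta_v_mv: float, sensitivity_mv_per_k: float) -> float:
    """Convert a change of sensor voltage (mV) into a temperature rise (K)."""
    if sensitivity_mv_per_k == 0:
        raise ValueError("sensitivity must be non-zero")
    return delta_v_mv / sensitivity_mv_per_k


# Sensitivity reported for the BJT at 1000 mA emitter current (mV/K).
BJT_SENSITIVITY_AT_1000_MA = -1.51

# Hypothetical reading: the base-emitter voltage dropped by 90.6 mV during heating.
rise = temperature_rise(-90.6, BJT_SENSITIVITY_AT_1000_MA)
print(f"{rise:.1f} K")
```

the negative sensitivity and the negative voltage change cancel, so a falling base-emitter voltage maps to a positive temperature rise.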
the dependence of bjt and sensor steady state temperature rise values on dissipated power in natural convection cooling conditions, for the four values of emitter current used during the calibration, is plotted in fig. 2. as can be seen, this dependence is not perfectly linear: for the power dissipation of 7 W the temperature rise values of all thermistors are at least 20% lower than those that would be obtained by linear extrapolation of the values measured for the lowest power dissipation. this clearly indicates the existence of a nonlinearity related to the change of the heat transfer coefficient value. this effect is apparently not as strong for the bjt; however, as demonstrated in [14], this results from the increase of the package thermal resistance, which counterbalances the change of cooling conditions. the transistor and sensor steady state temperature rise values obtained at the heating current of 1000 mA for different air velocities are shown in fig. 3. the wind direction, indicated in fig. 1 by the arrow, was chosen deliberately so that the sensors are not heated by the air passing over the board. the figure shows that forced air movement noticeably reduces the temperature rise values, although at higher air velocities the additional reduction is less pronounced. moreover, the transistor temperature rise was reduced by one fourth, whereas for the sensor locations the reduction is more significant and ranges from nearly 50% for the sensor closest to the source to 80% for the furthest one. the results obtained for the other heating current values are very similar, except that the temperature rise values are correspondingly lower. finally, the recorded device dynamic thermal responses are presented in fig. 4. for the sake of clarity, the results are shown only for the heat source and two selected sensors located 10 mm and 40 mm away from the bjt.
as can be seen, there is a visible delay of a few seconds in the thermal response at the sensor locations due to the diffusion of generated heat through the board. moreover, the change of cooling conditions affects the heating curves only after approximately 10 s. 3. heat transfer coefficient modelling 3.1. estimation of local heat transfer coefficient values for the purposes of the local heat transfer coefficient estimation, a three-dimensional thermal model of the circuit was created. taking into account that the only nonlinearity is related to temperature dependent boundary conditions, the temperature solutions were obtained with the analytical green's function solver because of short computation time. this forward solver was coupled with different inverse estimation algorithms. the current local heat transfer coefficient values were updated in each iteration step based on the comparison between measured and simulated data. the procedure was stopped when the difference was smaller than the assumed simulation accuracy. fig. 2 device steady state temperature rise for different dissipated power values with natural convection cooling (temperature rise [K] versus power [W] for bjt and pt1-pt5). fig. 3 device steady state temperature rise for different air velocities at the 1000 mA emitter current with forced convection cooling (temperature rise [K] versus air velocity [m/s] for bjt and pt1-pt5). fig. 4 measured heating curves of selected devices for different air velocities (temperature rise [K] versus time [s] for bjt, pt2 and pt4 at 1.0 A, with air velocities of 0.0 m/s and 2.0 m/s).
the effectiveness of the tested inverse estimation algorithms was assessed using, as performance indicators, the number of iteration steps and the simulation time required to attain the desired accuracy. these algorithms included the false position, the newton-raphson, the secant and the bisection methods. however, the best results were obtained employing a simple numerical algorithm proposed by the authors. this algorithm has the shortest computation time, mainly because of the smallest number of algebraic operations required and partly owing to the efficient storage of previously computed heat transfer coefficient values. obviously, these results cannot be generalized directly, since the computation time always depends on the initial heat transfer coefficient value $h_0$ or the initial step size of the coefficient value update $\Delta h_0$. for a more detailed description of the algorithm tests, please refer to [15]. the main idea of the proposed algorithm consists in the iterative improvement of the heat transfer coefficient estimates $h$. when the difference between the measured temperature value and the simulated one is positive, the temperature is underestimated and the coefficient value has to be decreased, and when the difference is negative, this value is increased. if the difference changes its sign, the step size $\Delta h$ is reduced and the search becomes more accurate. it is also worth mentioning that in each iteration step the current heat transfer coefficient values and their corresponding temperature values in the transistor and sensor locations are stored, which allows a significant reduction of the computation time. the total value of the heat transfer coefficient is supposed to model both radiation and convection cooling mechanisms occurring at outer structure surfaces and its value could be split into two components, as shown in eq. 1, which can be estimated independently.
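the iterative search described above can be sketched in a few lines of python. this is only an illustration under stated assumptions, not the authors' implementation: the step is halved on every sign change (the text only says it is reduced), the forward model is a hypothetical lumped stand-in for the green's function solver, and all names, the power and the area are invented for the example.

```python
from typing import Callable, Dict, Tuple


def estimate_h(
    measured_rise: float,
    forward: Callable[[float], float],
    h0: float = 5.0,
    step0: float = 1.0,
    tol: float = 0.05,
    max_iter: int = 1000,
) -> Tuple[float, int]:
    """Return (h, iterations) such that |measured - forward(h)| < tol."""
    h, step = h0, step0
    cache: Dict[float, float] = {}  # h -> simulated temperature rise
    prev_sign = 0
    for iteration in range(1, max_iter + 1):
        if h not in cache:  # reuse temperatures computed for earlier h values
            cache[h] = forward(h)
        diff = measured_rise - cache[h]
        if abs(diff) < tol:
            return h, iteration
        sign = 1 if diff > 0 else -1
        if prev_sign != 0 and sign != prev_sign:
            step /= 2.0  # assumption: halve the step when the sign changes
        # diff > 0: temperature is underestimated, h is too large, decrease it
        candidate = h - sign * step
        while candidate <= 0.0:  # keep the coefficient physically meaningful
            step /= 2.0
            candidate = h - sign * step
        h = candidate
        prev_sign = sign
    raise RuntimeError("no convergence within max_iter iterations")


def toy_forward(h: float, power: float = 7.0, area: float = 0.01) -> float:
    """Hypothetical lumped model: steady-state rise = P / (h * A)."""
    return power / (h * area)


h_est, n_iter = estimate_h(48.0, toy_forward)
print(f"h = {h_est:.3f} W/(m^2 K) after {n_iter} iterations")
```

the dictionary plays the role of the stored coefficient and temperature pairs: when the search steps back to a value it has already visited, the forward solver is not called again.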
the first of them encompasses the phenomena which depend on the surface and the surrounding ambient temperature values, i.e. mainly radiation and natural convection, and the other one reflects the effects related to the cooling air velocity with forced convection. $h = h_{nc} + h_{fc}$ (1) therefore, the first component $h_{nc}$ was estimated first, assuming that its initial value equalled 5 W/(m²K), which corresponds approximately to the theoretical value for pure radiation cooling at room temperature. the initial step size for the update of the heat transfer coefficient value $\Delta h$ was 1 W/(m²K) and the desired temperature simulation accuracy (stopping criterion) was set to 0.05 K. the estimation results obtained using the algorithm proposed by the authors for the four bjt heating current values are shown with different markers in fig. 5. as can be seen, the estimated dependence of the local heat transfer coefficient value on circuit surface temperature rise is not a linear function and the heat transfer coefficient ranges from 9.7 W/(m²K) at 12 K over the ambient temperature to 15.4 W/(m²K) when the temperature rise amounts to 91 K. next, the second component of the heat transfer coefficient value, due to the forced movement of the cooling air, was estimated. first, the total values of the coefficient were determined, as in the case of the radiation and natural convection component, and then the previously stored coefficient values dependent only on the surface temperature were subtracted from the total values. the estimation results obtained for the heat source are plotted in fig. 6, again with different markers corresponding to the respective bjt heating currents. as can be seen, between the air velocities of 1 m/s and 3 m/s the values of this component are at least doubled. the results obtained for sensor locations are very similar. fig.
5 estimated dependence of the radiation and natural convection component on the surface temperature rise. fig. 6 estimated dependence of the forced convection component on the air velocity. 3.2. determination of model parameter values once the values of the heat transfer coefficient were estimated, it was possible to fit these values to a simple parametric model allowing the computation of local heat transfer coefficient values for given temperature rise and air velocity values. the first component $h_{nc}$, related to the surface temperature rise $\Delta T$, can be modeled by the relation presented in eq. 2. when a surface is at ambient temperature, the constant 'a' reflects the radiation cooling. the theoretical value of the exponent 'c' for a surface of uniform temperature and with pure natural convection cooling is 0.25 [4]-[5], but in reality it is much higher because it depends also on the radiation and surface temperature gradients. $h_{nc} = a + b\,\Delta T^{c}$ (2) the heat transfer coefficient component related to the forced convection cooling can be modeled by a power function of the air velocity $v$ shown in eq. 3. the theoretical value of the exponent 'e' is in the range from 2/3 to 3/4 [4]-[5]. $h_{fc} = d\,v^{e}$ (3) in order to estimate the unknown model parameters contained in eqs. 2-3, a modified version of the bee colony optimization algorithm, described in [15], was used. the results of the fitting obtained in the case of the natural and forced convection cooling are plotted in figs. 7-8 respectively. the thick black line in these figures represents the results of the global fitting carried out for all measurement data available.
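the two relations of eqs. 2-3 are simple enough to evaluate directly. the following python sketch does so; the default parameter values are those read from the global fit reported in eq. 4 of this paper, the function names are hypothetical, and the units are assumed to be W/(m²K) for the coefficient, K for the temperature rise and m/s for the air velocity.

```python
def h_natural(delta_t: float, a: float = 4.83, b: float = 1.85,
              c: float = 0.395) -> float:
    """Radiation and natural convection component, eq. 2: a + b * dT**c."""
    if delta_t < 0:
        raise ValueError("temperature rise must be non-negative")
    return a + b * delta_t ** c


def h_forced(velocity: float, d: float = 8.67, e: float = 0.778) -> float:
    """Forced convection component, eq. 3: d * v**e."""
    if velocity < 0:
        raise ValueError("air velocity must be non-negative")
    return d * velocity ** e


def h_total(delta_t: float, velocity: float) -> float:
    """Total coefficient as the sum of both components, eq. 1."""
    return h_natural(delta_t) + h_forced(velocity)


for rise in (12.0, 91.0):
    print(f"dT = {rise:5.1f} K -> h_nc = {h_natural(rise):.2f} W/(m^2 K)")
print(f"h_fc(3 m/s) / h_fc(1 m/s) = {h_forced(3.0) / h_forced(1.0):.2f}")
```

with these defaults the natural convection component comes out close to the locally estimated 9.7 W/(m²K) at a 12 K rise and somewhat above the 15.4 W/(m²K) estimated at 91 K, which is expected for a global fit, and the forced convection component more than doubles between 1 m/s and 3 m/s.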
the final expression for the total value of the heat transfer coefficient is given by the following formula: $h = 4.83 + 1.85\,\Delta T^{0.395} + 8.67\,v^{0.778}$ (4) the fitted dependence of the first heat transfer coefficient component $h_{nc}$ shows a very good agreement with the measurements. the value of model parameter 'a' is close to the theoretical one and the value of the exponent 'c', as expected, is relatively high due to the small size of the heat source occupying only slightly more than 1% of the circuit surface area. on the other hand, the value of the exponent 'e' in the forced convection component $h_{fc}$ is just slightly higher than the one reported in literature. the global fit results in fig. 8 seem not to be accurate, but the black line represents, as already mentioned, the results produced for the entire measurement data set. more accurate results can be obtained when the fitting is performed individually for the particular temperature sensor locations. then, the heat transfer coefficient values predicted for the transistor location are much higher due to its elevated temperature. fig. 7 fitting results of the heat transfer coefficient component modeling the radiation and natural convection cooling. fig. 8 fitting results of the heat transfer coefficient component modeling the forced convection cooling. the final parametric model for the heat transfer coefficient value, expressed by eq. 4, was implemented in the forward thermal solver. the simulations were repeated with the variable coefficient values and with constant values equal to the lowest and the highest value occurring in the variable-coefficient simulation. the results of these simulations are compared in fig. 9 with the measurement. the assumption of a low heat transfer coefficient value leads to a significant overestimation of the heat source temperature (dotted line), by almost 30%.
on the other hand, the high value of the coefficient assures an accurate result in the thermal steady state (solid line), but in the time range in which the heat diffuses through the board, between 1 s and a few minutes, the simulation errors exceed 10 K. fig. 9 comparison of measured and simulated transistor heating curves (temperature rise [K] versus time [s] for the bjt at 1.0 A and 0 m/s: measurement and simulations with high, variable and low coefficient values). fig. 10 comparison of temperature profiles along the cross-section of the circuit board in different cooling conditions (normalised temperature rise versus distance [mm] for natural and forced convection). finally, it was instructive to compare the temperature profiles across the circuit board obtained in the case of the natural and forced convection cooling. these results, presented in fig. 10, represent the temperature rise relative to the heat source temperature. as can be seen, with the forced convection cooling the temperature profile is much steeper and consequently the differences in the heat transfer coefficient values are more important, but the thermal influence coefficients between the heat source and sensor locations decrease, reducing the so-called 'cooling circle'. similar conclusions were drawn also in [16]. 4. conclusions this paper discussed the problem of proper modelling of the heat transfer coefficient value in electronic circuits. currently, it is common practice in thermal simulations to assume the same coefficient value, usually around 10 W/(m²K) for natural convection, for the entire surface of an electronic circuit.
however, as demonstrated here based on a practical example, the use of constant coefficient values might lead to significant simulation errors, both in transient and thermal steady states. this is caused mainly by the fact that the heat transfer coefficient values can vary across the surface of a circuit even by an order of magnitude. the simple parametric model proposed by the authors allowed the computation of the local heat transfer coefficient values as a function of surface temperature rise and cooling air velocity. this model, fitted to experimental data, can be used for the iterative updates of the local heat transfer coefficient values during thermal simulations. when included in standard fem thermal analysis tools or lumped thermal models, such as the ones described in [17]-[18], it could significantly improve the thermal simulation accuracy. additionally, it was shown that the parameter values in the proposed model differ substantially from the standard heat transfer textbook formulas, which are derived mostly for cases when surface temperature or heat flux is uniform. references [1] r. ross, ed., microelectronics failure analysis: desk reference. edfas, asm international, 2011. [2] m. janicki, z. sarkany and a. napieralski, “impact of nonlinearities on electronic device transient thermal responses”, microelectron. j., vol. 45, pp. 1721-1725, december 2014. [3] f.p. incropera and d.p. de witt, fundamentals of heat and mass transfer. wiley & sons, 2002. [4] j.p. holman, heat transfer. mcgraw-hill, 1986. [5] j.h. lienhard iv and j.h. lienhard v, a heat transfer textbook. phlogiston press, 2012. [6] g. ellison, thermal computations for electronics: conductive, radiative, convective air cooling. crc press, 2010. [7] y. m. li and a.
ortega, “forced convection from a rectangular heat source in uniform shear flow: the conjugate peclet number in the thin plate limit”, in proceedings of the 6 th itherm. seattle, wa: ieee, 1998, pp. 284-294. [8] r. aster, b. borchers, and c. thurber, parameter estimation and inverse problems. elsevier, 2005. [9] m.n. ozisik and h.r.b. orlande, inverse heat transfer. taylor & francis, 2000. [10] m. janicki and s. kindermann, “recovering temperature dependence of heat transfer coefficient in electronic circuits”, inverse probl. sci. en., vol. 17, pp. 1129-1142, october 2009. [11] d. karaboga, and b. akay, “survey: algorithms simulating bee swarm intelligence”, artif. intell. rev., vol. 31, pp. 68-85, june 2009. [12] d. teodorovic, p. lucic, g. z. markovic and m. dell'orco, “bee colony optimization: principles and applications”, in proceedings of the 8 th neurel. belgrade: ieee, 2006, pp. 151-156. [13] a. colorni, m. dorigo, and v. maniezzo, “distributed optimization by ant colonies”, in proceedings of the ecal. paris: elsevier, 1991, pp. 134-142. [14] t. torzewicz, a. samson, t. raszkowski, a. sobczak, m. janicki, m. zubert, a. napieralski, “thermal analysis of hybrid circuits with variable heat transfer coefficient”. in proceedings of the 33 rd semitherm, san jose, ca, 2017, pp. 19-22. [15] t. raszkowski and a. samson, “application of genetic algorithm and swarm intelligence algorithms to heat transfer coefficient estimation”, bulletin de la société des sciences et des lettres de łódź. série: recherches sur les déformations, vol. lxvii, pp. 103-125, october 2017. [16] g. a. (w.) luiten, “characteristic length and cooling circle”, in proceedings of the 26 th semitherm. santa clara, ca: ieee, 2010, pp. 7-13. [17] r. menozzi, p. cova, n. delmonte, f. giuliani, and g. sozzi, “thermal and electro-thermal modeling of components and systems: review of the research at the university of parma”, facta universitatis, series: electronics and energetics, vol. 28, pp. 
325-344, september 2015. [18] s. stanisic, m. jevtic, b. das, and z. radakovic, “fem cfd analysis of air flow in kiosk substation with oil immersed distribution transformer”, facta universitatis, series: electronics and energetics, vol. 31, pp. 411-423, september 2018. facta universitatis series: electronics and energetics vol. 33, no. 4, december 2020, pp. 669-686 https://doi.org/10.2298/fuee2004669d © 2020 by university of niš, serbia | creative commons license: cc by-nc-nd a review of real time smart systems developed at university of niš danijel danković, miloš đorđević faculty of electronic engineering, university of niš, serbia abstract. this paper presents a bibliographic review of the smart systems implemented so far and their applications. it also presents a new smart mobile system developed for monitoring microclimatic parameters. this system is primarily intended for real-time monitoring of microclimatic parameters, such as air quality, where the presence of carbon monoxide (co) is monitored. the mobile system described in this manuscript can be installed in public transport vehicles to obtain information on microclimatic parameters along a known route. to obtain such information along a random route, the system can be installed in a taxi vehicle. the system can also generate a map from the measured data, based on gps coordinates. the system is based on a group of embedded sensors, a gps module, a pic microcontroller as a core and a server system, and wireless internet access through a global system for mobile telecommunications (gsm) module with general packet radio service (gprs) as the communication protocol. key words: smart mobile system, internet of things, pic microcontroller, sensor technology 1.
introduction with the increase in the number of vehicles, and with the reduction of green areas in cities to make room for residential buildings and parking spaces, the harmful impact on microclimatic parameters has increased significantly. today, the urban population makes up almost half of the world's population. it is estimated that a city of one million citizens produces about 25,000 tons of co2 and co and about 300,000 tons of water waste every day [1]. these quantities increase every year due to urbanization, which progressively reduces people's quality of life. in order to monitor the parameters that greatly affect air quality, as well as other microclimatic parameters, it is necessary to have a large number of points where these parameters are monitored. knowing in which parts of the city the pollution is greatest, it is possible to propose certain corrective measures, such as traffic regulation, use of pollutant filters, or change of heating fuel type (transition of heating plants to natural gas and renewable energy sources). received july 17, 2020; received in revised form august 30, 2020. corresponding author: danijel danković, faculty of electronic engineering, university of niš, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: danijel.dankovic@elfak.ni.ac.rs). the systems used to monitor microclimatic parameters require large amounts of money (installation costs, regular maintenance and servicing needs), which means that only a small number of cities opt for this solution. to avoid this, it is necessary to implement a smart autonomous mobile system that can replace a large number of systems installed at fixed measuring points. in order to know the position where the parameters were measured, it is necessary to use a gps module to obtain location information.
based on this, it is possible to determine where the air pollution is higher, as well as at what time of the day, month or year. the advantage of such systems is that a higher density of measuring points is possible, which makes the entire system for monitoring and measuring microclimatic parameters cheaper. the aim of the manuscript is to develop a smart mobile system for real-time monitoring of microclimatic parameters, such as air quality, where the presence of carbon monoxide (co) is monitored, as well as other microclimatic parameters (various sensors can be added, which changes the set of monitored parameters). such a system can be part of smart cities, since it is an autonomous system for monitoring environmental parameters. the advantage of such a system compared to conventional static systems is that it can be installed on vehicles (public transport, police, taxis, etc.), which means that the coverage of the monitored area is almost unlimited. if an established route is to be monitored, public transport can be used, while taxi vehicles can be used when a random route is needed. for example, in the city of nis, one taxi vehicle covers about 400 km on average in 24 hours, making about 70 individual rides. in addition to monitoring microclimatic parameters, it is possible to generate a map using the data provided by the system, based on gps coordinates. the system has a wide range of applications based on the meteorological/microclimatic parameters that it measures: temperature, humidity, atmospheric pressure, altitude, lighting, and detection and measurement of carbon monoxide (co). all measurements are accompanied by the time and date of measurement and by gps coordinates, so that each measurement is tied to the location where it was performed.
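a record that ties every set of readings to its time and position could look like the python sketch below. it is only an illustration of the data structure implied by the description above: the class, the field names, the units and the sample values are all hypothetical, and the text does not specify the format actually used by the system.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Measurement:
    """One set of readings tagged with time and GPS position (assumed fields)."""
    timestamp: str          # date and time of measurement, ISO 8601 (assumed)
    latitude: float         # decimal degrees
    longitude: float        # decimal degrees
    temperature_c: float    # air temperature, degrees Celsius (assumed unit)
    humidity_pct: float     # relative humidity, percent (assumed unit)
    pressure_hpa: float     # atmospheric pressure, hPa (assumed unit)
    altitude_m: float       # altitude, metres (assumed unit)
    illuminance_lux: float  # lighting level, lux (assumed unit)
    co_ppm: float           # carbon monoxide concentration, ppm (assumed unit)


def to_payload(measurement: Measurement) -> str:
    """Serialize a record to JSON, e.g. for upload to a server."""
    return json.dumps(asdict(measurement), sort_keys=True)


def from_payload(payload: str) -> Measurement:
    """Rebuild a record from its JSON form."""
    return Measurement(**json.loads(payload))


# Hypothetical sample; the values are invented for illustration only.
sample = Measurement("2020-07-17T12:00:00", 43.32, 21.90,
                     28.5, 41.0, 1009.3, 195.0, 52000.0, 1.2)
print(to_payload(sample))
```

keeping the coordinates and the timestamp inside the same record as the readings means that every stored value can later be placed on a map without joining separate data sets.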
gps coordinates and time and date information are stored with the data on the server and are available when the results are downloaded. the collected data can then be added to the map. for the purposes of system testing, various measurements were performed in niš, in parts of the city that are not covered by the measuring points used for global monitoring. according to the air quality monitoring site "air pollution in serbia: real-time air quality index visual map" [2], there are 5 measuring points in niš where air quality parameters are monitored. these points are sufficient for a global view of air quality, but a significantly higher number of points is needed to assess the situation in specific parts of the city. also, since the highest density of residential buildings, people and cars is in the central parts of cities, as many measuring points as possible should be set up there. it is important to note that several other factors that affect air quality should also be taken into account, and these are more often located outside the central parts of the city. factors with a negative effect include the type of building (residential or for other purposes), the type of fuel used for heating (fossil fuels, natural gas, electricity, etc.), the type of public city transport (buses, trolleybuses, metro, etc.) and, perhaps the most influential, the presence of heavy industry.
there are many modular systems for measuring and acquiring atmospheric parameters on the market, but few of them are comprehensive systems that combine all modules for measuring/observing both microclimatic and atmospheric parameters and are also mobile. for example, one can find a system that measures only one parameter, wind speed [3]; that anemometer is part of a meteorological station project. it is important to note that different systems have been developed at the university of niš. smart systems such as meteorological stations, smart farms, and smart systems within the smart faculty are data collection systems that remotely collect information on meteorological/ambient (microclimatic) parameters. in addition to storing the collected data in the cloud or in a database on a web server, each system, depending on its purpose, takes the actions expected of it on the basis of the collected data [4]. some of these systems have been implemented and are described in more detail in section 2. this paper has two goals. the first is a bibliographic review of the smart systems implemented so far, their applications, and the advantages they offer over systems described in published journals. the second is to present a smart mobile system for monitoring microclimatic parameters, which is described in detail in section 3.
2. theoretical background with a review of authors’ previous investigation
there are many different implementations of smart autonomous systems for monitoring microclimatic parameters, which can be classified by the communication technologies and the storage media they use. most implementations use wireless technologies for communication between the sensor part and the main unit.
after monitoring and measuring the parameters, a smart system stores the measured data so that the end user can easily access the results from anywhere and use the stored data as needed. based on these needs, smart autonomous systems have been developed using different technologies. four groups of smart systems can be distinguished, based on:
1. custom microcontroller and mobile application (bluetooth for communication).
2. custom microcontroller and computer (radio frequency (rf) for communication).
3. nodemcu running esp8266 wi-fi module and cloud or database on the web server (wi-fi internet for communication).
4. custom microcontroller and cloud or database on the web server (global system for mobile communications (gsm) module for communication).
some implementations of smart autonomous systems for monitoring microclimatic parameters deserve mention. in [1], the authors described a mobile system that can measure nitrogen dioxide (no2) and carbon monoxide (co). the system is equipped with a gps module in order to obtain the location where the microclimatic parameters are measured. as the authors stated, the implemented system was tested, but the measurement intervals were not uniform, so the results were not obtained in real time. this shortcoming matters a great deal to users who need real-time parameter information. therefore, our primary task was to develop a system able to monitor microclimatic parameters in real time together with the location of the measurements. our system also measures more microclimatic parameters than the system described in [1]. in [4], the authors described a mechatronic system for measuring environmental parameters.
the system is based on the arduino due development board with atmega 328 microcontroller. the sensor part of this system consists of the sht1x temperature and humidity sensor, which is much less reliable than the sensor used in our system. in addition, four ds1820 temperature sensors were used, whose measuring range is smaller than that of the digital sensor used in our system, and four bpw 34 photodiode light sensors, which are not reliable enough compared with the digital sensor we used. finally, they used a noise sensor consisting of a czn-15e capacitive microphone and an mcp 601 i/p. the system is static, so several such systems are needed to measure microclimatic parameters, which further increases the cost. the next manuscript [5] presented a device developed by the authors for monitoring and controlling microclimatic parameters within a livestock barn. the system is static and monitors and controls parameters such as temperature, air humidity, ammonia concentration and carbon dioxide. the sensor nodes are interconnected by rf communication using zigbee modules, which allow only a relatively short range for sending or receiving data. such a system is therefore limited to a narrow application, namely monitoring microclimatic parameters in a small area. the authors also do not specify which sensors were used, so there is no specific information on the measurement ranges. moreover, the manuscript states that the system was tested for a few hours, unlike the system we implemented, where the testing period lasted at least 7 days. finally, to our knowledge, no available mobile system is as comprehensive as the one described in this manuscript, which is the main motivation for this work.
furthermore, the systems we implemented earlier did not allow the measurement of microclimatic parameters at different locations, since they are static systems. these systems are described in more detail in our previous papers [6, 7, 8, 9], published in relevant journals and at international conferences. the graphical illustration of the realized systems is shown in fig. 1.
fig. 1 block diagram with an overview of realized smart systems and proposed smart mobile system.
in our previous research [6], a real-time smart meteorological station based on embedded sensors and iot technology was analyzed (fig. 2). the station had two main parts: a pic microcontroller, which together with the built-in sensors formed the core of the measuring system, and the thingspeak internet of things platform, which stored the data sent through gsm/gprs communication modules. the microcontroller belongs to the microchip family of pic microcontrollers. the sensors monitored and measured meteorological and ambient parameters such as temperature, humidity, atmospheric pressure, altitude, wind speed, light intensity, and natural gas (lpg) concentration. fig. 2 shows the measured results for temperature (the other measured parameters are reported in [6]). thanks to this wide range of meteorological and ambient parameters, the system was used not only in meteorological stations, botanical gardens, libraries, and hospitals but also in mines, since in addition to the concentration of lpg it could detect and measure the presence of methane (ch4). in addition to the listed parameters, it is possible to set the period between two measurements and the duration of the measurement, since the system has an rtc module that keeps the current time. the disadvantage of this system is that it is static, unlike the mobile system described in section 3 of this manuscript. also, the sensor used to detect and measure the concentration of natural gas (lpg) and carbon monoxide (co) was less reliable than the sensor used in the system described in section 3.
fig. 2 block scheme of the smart weather station. block scheme is based on [6]. the measured results for temperature are given as an example.
our next research [7] concerns the application of iot technology and smart systems in industry, more precisely the implementation of iiot technology (fig. 3). the described real-time system monitors power line poles in order to prevent a pole from falling, which would interrupt the power supply and could hit a passing car (if the pole is next to a road) or injure people. to avoid this scenario, the slope of each pole was monitored using an accelerometer in order to identify the poles that could cause problems. each pole has a unique id, so the slope of each pole can be tracked independently. in addition to the slope, parameters such as temperature, humidity, and atmospheric pressure were monitored, so that the people in charge of maintaining the poles knew the atmospheric conditions awaiting them in the field.
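the slope monitoring described above can be illustrated with a minimal python sketch. it is not the firmware from [7]: it assumes a static 3-axis accelerometer whose z-axis is aligned with the upright pole, so that gravity is the only measured acceleration. the function names, the pole ids, the readings and the 5 degree threshold are all hypothetical.

```python
import math

def tilt_angle_deg(ax: float, ay: float, az: float) -> float:
    """Return the angle in degrees between the sensor z-axis and gravity.

    Assumes a static 3-axis accelerometer whose z-axis points along the
    pole when the pole is upright, so the only acceleration is gravity.
    """
    horizontal = math.hypot(ax, ay)  # component perpendicular to the pole axis
    return math.degrees(math.atan2(horizontal, az))

def critical_poles(readings: dict, threshold_deg: float = 5.0) -> list:
    """Return the ids of poles whose tilt exceeds the (hypothetical) threshold."""
    return [pole_id for pole_id, (ax, ay, az) in readings.items()
            if tilt_angle_deg(ax, ay, az) > threshold_deg]

# Made-up readings in units of g, keyed by hypothetical pole ids.
readings = {
    "pole-001": (0.00, 0.00, 1.00),  # upright
    "pole-002": (0.17, 0.00, 0.98),  # leaning by roughly 10 degrees
}
print(critical_poles(readings))
```

keying the readings by pole id mirrors the per-pole tracking described in the text; a real deployment would also need filtering, since wind and vibration add acceleration on top of gravity.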
the system consisted of three hierarchical levels, from the level with the least intelligence (level 1), through level 2, which connected the poles to the cloud server, to the control room (level 3). fig. 3 shows the measured results for the slope angle (the other measured parameters are reported in [7]). level 1 consisted of a pic microcontroller that measured the parameters (angle of inclination, temperature, humidity, and atmospheric pressure) and, depending on the inclination of the pole, sent information to level 2 according to a defined measurement period. the data was sent using rf modules to keep the system as cheap as possible to implement. level 2 was a pic microcontroller that received information about each pole through the rf module and forwarded that data to the cloud or database using the gsm/gprs module, since level 2 was located in the open. level 3 was a database or cloud server together with a control room, from which the measurement data could be accessed and messages could be sent to field teams about which pole was critical.
fig. 3 block scheme of the smart system for supervision and monitoring of the power line poles using iiot technology. block scheme is based on [7]. the measured results for slope of angle are given as an example.
as part of the research [8], we implemented a smart agriculture system that monitors and controls a greenhouse and the most important microclimatic parameters in it (fig. 4). based on these parameters, it is possible to improve the quality and quantity of yield in the greenhouse.
besides, the system monitored and controlled the greenhouse irrigation system, so that the necessary fertilizer was delivered to the plants at the right time. monitoring and control of the greenhouse were organized in three levels. level 1 was the control of ventilation (air conditioner and door) and the safety net. level 2 was the control of irrigation (water temperature, water level in the tank) and of the amount of feed to be added to the tank. level 3, the most complex part of the system, monitored and controlled parameters such as greenhouse air temperature, soil temperature, air humidity, soil moisture, atmospheric pressure, soil ph, wind speed (to protect the greenhouse structure), light intensity and the amount of carbon dioxide (co2). fig. 4 shows the measured results for air humidity and soil moisture (the other measured parameters are reported in [8]). the entire system is driven by a pic microcontroller that sends data to a database or cloud using a gsm/gprs module. a smartphone application was developed so that the greenhouse could be monitored and controlled remotely.
fig. 4 block scheme of the smart autonomous agricultural system for improving yields in greenhouse based on sensor and iot. block scheme is based on [8]. the measured results for air humidity and soil moisture are given as an example.
as part of the smart faculty research [9], we implemented a system for monitoring and controlling microclimatic parameters at the faculty, more precisely in amphitheaters and classrooms, to provide better conditions for teaching. the system (fig. 5) monitored and controlled microclimatic parameters such as temperature, humidity, atmospheric pressure, light intensity, and carbon dioxide (co2) concentration.
the system is realized as a set of control nodes, where each ambient parameter is regulated by an actuator: the air conditioner and ventilation for temperature and humidity, and the blinds/venetian blinds for lighting intensity in amphitheaters and classrooms. the system is part of a smart faculty which, based on the known number of students attending classes in an amphitheater or classroom, could set adequate conditions for the best possible student work. the system was designed entirely in the altium designer software tool for designing printed circuit boards [10], and a 3d model was made and realized as shown in [9]. as in the previous systems, the central component is a pic microcontroller. fig. 5 shows the measured results for temperature and relative humidity (the other measured parameters are reported in [9]). a gsm/gprs module was used to send the measured data to the database on the server. a smartphone application was developed for remote monitoring and control. all research [6, 7, 8, 9] concerns smart systems that monitor and measure meteorological, ambient and microclimatic parameters in real time, with the disadvantage that they are static, i.e. they are not mobile and cannot measure parameters at multiple locations.
fig. 5 block scheme of the smart data logger system based on sensor and internet of things technology as part of the smart faculty. block scheme is based on [9]. the measured results for temperature and relative humidity are given as an example.
3. development of smart mobile system
this manuscript presents a model of a smart mobile data logger for real-time monitoring of microclimatic parameters, based on a pic microcontroller and a cloud platform.
the smart system is designed to be mobile, scalable and easy to set up and extend. it is based on a pic microcontroller that manages the whole system. it includes embedded sensors for observing and measuring the microclimatic parameters, a gps module that provides the location where the measurements were made, and a gprs module that uploads data to the cloud platform.
3.1. design of solution
the smart mobile data logger system for real-time monitoring consists of 7 segments, shown in fig. 6. the power supply serves all other blocks. the microcontroller pic18f45k22 [11], which is the core of the entire system, manages the microclimatic sensor block, used for microclimatic measurements and observations, and the gps block, used for location information. the gsm/gprs block, realized using the sim800l module [12], is also controlled by this microcontroller.
fig. 6 block scheme of the smart real time mobile microclimatic monitoring system based on sensor and iot technology.
the sensor list is shown in table 1.
table 1 sensors and their measurement characteristics
sensor | measurement | measurement range | ref.
bme280 | temperature, air humidity, atmospheric pressure, altitude | temperature: -40°c to +85°c; air humidity: 0% to 100%; atmospheric pressure: 300 to 1100 mbar | [13]
bh1750 | light intensity | 0 lx to 65535 lx | [14]
mq-7 | carbon monoxide | 20 ppm to 2000 ppm | [15]
communication between the pic18f45k22 microcontroller and the bme280 and bh1750 sensors is realized via the i2c bus. a global positioning system (gps) module neo6mv2 [16] is used to obtain the location where the microclimatic parameters were observed and measured. the information of interest for this smart mobile system is geographic latitude and longitude in the format (xx.xxxx (n), yy.yyyy (e)).
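as an illustration of how the coordinate format above might be obtained, the following python sketch converts the position fields of an nmea gga sentence, which gps receivers of this kind typically output, from degrees and decimal minutes to decimal degrees. it is a sketch under assumptions, not the firmware of the system: the function names and the sample sentence are hypothetical, and the nmea checksum is not validated.

```python
def nmea_to_decimal(value: str, hemisphere: str) -> float:
    """Convert an NMEA (d)ddmm.mmmm field to signed decimal degrees."""
    raw = float(value)
    degrees = int(raw // 100)        # leading digits are whole degrees
    minutes = raw - degrees * 100    # remainder is minutes
    decimal = degrees + minutes / 60.0
    return -decimal if hemisphere in ("S", "W") else decimal

def gga_position(sentence: str) -> str:
    """Format the position of a GGA sentence as 'xx.xxxx (N), yy.yyyy (E)'.

    The checksum is not validated in this sketch.
    """
    fields = sentence.split(",")
    lat, lat_hem, lon, lon_hem = fields[2:6]
    lat_dd = abs(nmea_to_decimal(lat, lat_hem))
    lon_dd = abs(nmea_to_decimal(lon, lon_hem))
    return f"{lat_dd:.4f} ({lat_hem}), {lon_dd:.4f} ({lon_hem})"

# Made-up sample sentence, for illustration only.
sample = "$GPGGA,101500.00,4319.2000,N,02153.7000,E,1,08,0.9,195.0,M,,M,,"
print(gga_position(sample))
```

the hemisphere letter is kept in the formatted string, so the sign returned by `nmea_to_decimal` is only needed when the coordinates are passed on as plain numbers.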
this module communicates with the microcontroller using uart serial communication. the real time clock (rtc) module ds1307 [17] was used to set the current time and determine the measurement step. finally, the gsm/gprs module sim800l sends the measured data and location information to the cloud (thingspeak [18]), realized on the matlab web server. like the gps module neo6mv2, this module communicates with the microcontroller via uart (rx/tx) serial communication, using at commands. to interact with the user while working with the smart mobile data logger system, a 4x20 character lcd is used [19]. the lcd shows the current measurement results and the time until the next measurement. at start-up, the user must set the ip address by sms message, so that the gsm/gprs module knows where to send the measured data. once the ip address is set, the uart serial communication, the i2c bus and the a/d converter are configured. finally, the sensors and modules are initialized, after which measurement and sending of data to the cloud begin.
3.2. software design of smart system
each step in the operation of the smart data logger system is represented in the flowchart of the software shown in fig. 7. this algorithm is based on our previous systems [6, 9]. however, the previous systems used a secure digital (sd) memory card as a backup storage medium in case there was no internet access, so as not to create a "hole" in the measurement interval, i.e. not to lose measured data. another difference is that the gps module, together with the part of the algorithm for its configuration, is present in this algorithm.
fig. 7 basic algorithm of the embedded software of smart mobile monitoring system
4. experimental results
the microclimatic parameters were measured with a prototype of the smart mobile data logger system in the city of niš in order to validate it. all the measured results are shown in fig. 8. however, driving a vehicle produces a large amount of data for a given area, which makes the data hard to analyze directly. for this reason, the city can be divided into cells (larger or smaller, depending on the need), and only the most recent measurement is assigned to each cell.
fig. 8 generated map based on the obtained results measured by the realized system
taking temperature measurement as an example, the vehicle moves from cell to cell (marked with numbers 1 to 5), as illustrated in fig. 9. a single pass through a cell may yield several measurements, but for better visibility only the result of the last measurement is shown, as illustrated in fig. 10. the points t1, t2, t3, and t4 in fig. 10 (the last measured points in each of the specific cells) coincide with the measured points shown in fig. 9. the vertical lines in the chart in fig. 9 mark the moments when the vehicle left one cell and entered the next, so all points measured in one cell can be seen. when multiple systems are installed on different vehicles, the last measurement from any vehicle in a cell will be recorded for that cell. the route taken by the vehicle will not normally be shown; it is shown here only to describe the operation of the system in detail.
fig. 9 measured temperature data with marked points that were last measured in specific cells
fig. 10 generated map based on the latest results measured by the realized system in specific cells (vehicle direction is also shown)
the functionality of this system is shown on the example of temperature measurement. however, other parameters were also measured, as shown in fig. 11.
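the cell-based reduction described above, where each cell keeps only its most recent measurement, can be sketched in python as follows. this is a minimal illustration under assumptions, not the mapping code used for fig. 10: the grid origin, the cell size in degrees, the record layout and the sample values are all hypothetical.

```python
import math
from typing import NamedTuple

class Measurement(NamedTuple):
    timestamp: float      # seconds since the start of the drive
    lat: float            # decimal degrees
    lon: float            # decimal degrees
    temperature_c: float

def cell_index(lat: float, lon: float,
               origin=(43.30, 21.85), cell_deg: float = 0.01):
    """Map a position to a (row, col) cell of a regular grid.

    The origin and the cell size are hypothetical; larger or smaller
    cells can be chosen depending on the need.
    """
    row = math.floor((lat - origin[0]) / cell_deg)
    col = math.floor((lon - origin[1]) / cell_deg)
    return row, col

def latest_per_cell(measurements):
    """Keep only the most recent measurement in each cell."""
    latest = {}
    for m in measurements:
        key = cell_index(m.lat, m.lon)
        if key not in latest or m.timestamp > latest[key].timestamp:
            latest[key] = m
    return latest

# Made-up drive: two samples in one cell, then one in a neighbouring cell.
drive = [
    Measurement(0.0, 43.3050, 21.8550, 24.1),
    Measurement(60.0, 43.3070, 21.8580, 24.6),
    Measurement(120.0, 43.3150, 21.8650, 25.0),
]
for cell, m in latest_per_cell(drive).items():
    print(cell, m.temperature_c)
```

because the comparison uses the timestamp rather than the order of arrival, the same function also covers the case of several vehicles reporting into one cell.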
microclimatic parameters were measured on friday, june 26, 2020 in niš. measurements were performed on a working day, when traffic is heavy.
fig. 11 measured parameters using the realized system (temperature and co concentration (field 1 and field 2) – first two charts, air pressure and air humidity (field 3 and field 4) – second two charts, light intensity and gps coordinates (field 5 and field 6) – third two charts)
the results recorded during testing can be used by experts from various fields such as tourism and catering, traffic, and meteorology, as well as experts dealing with air and environmental pollution. the results give them more detailed information that is important for their activities and for taking action in accordance with the obtained results. since our system also records the location (gps coordinates) of each measurement (coordinates are shown in field 6 for each measurement separately), the microclimatic parameters in each area can be monitored in much more detail, even where there are no measuring points that monitor the level of air pollution globally (fig. 12).
fig. 12 real time air quality measurement places [2]
the air pollution index monitoring site [2] shows that only a small part of niš is monitored. the data on this site is also updated only every few hours (usually every 2 to 3 hours). most of the city with significant traffic is not covered by systems for monitoring microclimatic parameters.
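the field assignment listed in the caption of fig. 11 can be illustrated by building a request for the thingspeak channel update endpoint. the python sketch below is only an assumption about how one measurement could be packed: the function name, the write key and the sample values are hypothetical, and storing both coordinates as a single comma-separated string in field 6 is a guess based on the caption, not a confirmed detail of the system.

```python
from urllib.parse import urlencode

THINGSPEAK_UPDATE_URL = "https://api.thingspeak.com/update"

def build_update_url(write_api_key: str, temperature_c: float, co_ppm: float,
                     pressure_mbar: float, humidity_pct: float,
                     light_lx: float, lat: float, lon: float) -> str:
    """Build a channel update URL using the field order of fig. 11."""
    params = {
        "api_key": write_api_key,
        "field1": temperature_c,
        "field2": co_ppm,
        "field3": pressure_mbar,
        "field4": humidity_pct,
        "field5": light_lx,
        # Assumption: both coordinates share field 6 as one string.
        "field6": f"{lat:.4f},{lon:.4f}",
    }
    return f"{THINGSPEAK_UPDATE_URL}?{urlencode(params)}"

# Hypothetical key and made-up values, for illustration only.
url = build_update_url("HYPOTHETICAL_KEY", 24.6, 35, 990.2, 48.5, 12000,
                       43.32, 21.895)
print(url)
```

on the device itself the same request would be issued by the gsm/gprs module rather than by python; the sketch only shows how the query string is laid out.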
our smart mobile system for real-time monitoring can cover a large percentage of the city area. it can be used within city transport and taxi vehicles, which reduces the cost of installing a large number of systems, since one system is enough to cover the entire city. as stated earlier in the manuscript, the realized system is modular and can therefore use other sensors, i.e. monitor other microclimatic parameters, depending on the needs of the user.
5. discussion and future work
the systems we have implemented so far find application within large smart systems such as smart faculties, where they represent one segment of the whole complex system. in addition, one system has found application in agriculture, and the smart meteorological station is used not only within meteorological stations but also in mines (since it has sensors that monitor microclimatic parameters that are vital both for the mine and for the miners in it). this manuscript describes a smart mobile system for monitoring microclimatic parameters that can replace a large number of static systems. the static systems that we realized were grouped by their primary field of application (represented by the block diagram in fig. 1). each system in the block diagram was realized completely, from idea to realization. first, the functionality of each system was confirmed in the laboratory, and after that in real working conditions (by realizing a prototype on a protoboard). once testing of the prototype proved its functionality, a printed circuit board was designed using the altium designer software tool.
after that, when the systems are completely physically realized, they are tested in real conditions for the prescribed 7 days needed to confirm their functionality. the systems we have implemented are suitable for outdoor and indoor application, with the proviso that the systems suitable for outdoor application are implemented for different needs and spheres. in the future, the idea is to test the implemented systems in laboratory and real conditions so that they require no additional maintenance and servicing, which would significantly reduce the financial resources required to implement such systems. for this, it is not enough to test the reliability of one component within the system; we want to test the reliability of the entire system as a whole. in [20], the authors note that few manuscripts deal with this problem. specifically, they state that “system-level condition monitoring has not been explored sufficiently compared with component-level counterpart”.
6. conclusion
the manuscript describes the implemented smart mobile system for real-time monitoring and measuring of microclimatic parameters. the system was successfully tested in real conditions in the city of niš and the results obtained are presented in the paper. the realized system is useful because it can replace a large number of static systems. in addition, the proposed system records the location where measurements were made, based on gps coordinates. finally, the realized system is modular, so it can be expanded if necessary.
acknowledgment: this work has been supported by the ministry of education, science and technological development of the republic of serbia.
references
[1] v. rajs, v. milosavljevic, z. mihajlovic, m. zivanov, s. krco, d. drajic, b. prokic, “realization of instrument for environmental parameters measuring”, elektronika ir elektrotechnika, vol. 20, no. 6, pp. 61–66, 2014.
[2] air pollution in serbia: real-time air quality index visual map.
[3] e. avallone, p. c. moralli, p. s. g. natividade, p. h. palota, j. f. de costa, j. r. antonio, s. a. v. juniorm, “an inexpensive anemometer using arduino board”, facta universitatis, series: electronics and energetics, vol. 32, no. 3, pp. 359–368, september 2019.
[4] b. mihai, “about the smart weather station”, acta universitatis cibiniensis – technical series, vol. lxviii, no. 3, pp. 26–29, 2016.
[5] y. zhang, o. chen, g. liu, w. shen, g. wang, “environment parameters control based on wireless sensor network in livestock buildings”, international journal of distributed sensor networks, vol. 12, no. 5, may 2016.
[6] m. djordjevic and d. dankovic, “a smart weather station based on sensor technology”, facta universitatis, series: electronics and energetics, vol. 32, no. 2, pp. 195–210, june 2019.
[7] m. djordjevic, j. vracar and a. stojkovic, “supervision and monitoring system of the power line poles using iiot technology”, in proceedings of the 55th international scientific conference on information, communication and energy systems and technologies (icest), 2020.
[8] m. djordjevic, v. paunovic, d. dankovic and b. jovičić, “smart autonomous agricultural system for improving yields in greenhouse based on sensor and iot technology”, in proceedings of the 2nd young researchers conference (yours), 2020, p. 12.
[9] m. djordjevic, b. jovicic, s. markovic, v. paunovic and d. dankovic, “a smart data logger system based on sensor and internet of things technology as part of the smart faculty”, journal of ambient intelligence and smart environments (jaise), vol. 12, no. 4, pp. 359–373, 2020.
[10] altium designer pcb software: https://www.altium.com/altium-designer/.
[11] pic18f45k22: http://www.microchip.com/wwwproducts/en/pic18f45k22. accessed: 01.07.2020.
[12] gsm/gprs sim800l: http://simcom.ee/documents/sim800/sim800_hardware%20design_v1.08.pdf.
[13] bme280 sensor bosch sensortec: https://cdn-shop.adafruit.com/datasheets/bst-bme280_ds001-10.pdf.
[14] bh1750fvi sensor ics – mouser electronics: http://rohmfs.rohm.com/en/products/databook/datasheet/ic/sensor/light/bh1721fvc-e.pdf.
[15] mq-7 sensor: https://www.sparkfun.com/datasheets/sensors/biometric/mq-7.pdf
[16] gps module neo6mv2: https://www.u-blox.com/sites/default/files/products/documents/neo6_datasheet_(gps.g6-hw-09005).pdf
[17] ds1307 – part number search – maxim integrated: https://datasheets.maximintegrated.com/en/ds/ds1307.pdf.
[18] thingspeak cloud database: http://thingspeak.com.
[19] lcd display 20x4 – vishay: https://www.vishay.cco/docs/37314/lcd020n004l.pdf [on-line].
[20] z. ni, x. lyu, o. p. yadav, b. n. singh, s. zheng, d. cao, “overview of real-time lifetime prediction and extension for sic power converters”, ieee transactions on power electronics, vol. 35, no. 8, pp. 7765–7794, august 2020.

facta universitatis series: electronics and energetics vol. 28, no 4, december 2015, pp. 585-596 doi: 10.2298/fuee1504585c
numerical calculation of shielding effectiveness of enclosure with apertures based on em field coupling with wire structures
tatjana cvetković 1, vesna milutinović 1, nebojša dončov 2, bratislav milovanović 3
1 regulatory agency for electronic communications and postal services, belgrade, serbia
2 faculty of electronic engineering, university of niš, niš, serbia
3 singidunum university of belgrade, belgrade, serbia
abstract.
shielding effectiveness of a protective metal enclosure with apertures and a receiving antenna placed inside is considered numerically. the purpose of the antenna, here considered as a dipole, is to detect the electromagnetic (em) field level within the enclosure and to transfer this information via a coaxial cable to a network analyzer. this follows the experimental procedure used to measure the shielding effectiveness of an enclosure. a numerical model, based on the transmission-line matrix (tlm) method enhanced with the so-called wire node, is used to simulate this dipole antenna/cable arrangement in order to investigate how much it disturbs the level of shielding effectiveness due to its two-way coupling with the em field inside the enclosure. the numerical model, whose accuracy is verified by comparison with experimental results available in the literature, is used to consider the influence of the radius and length of the dipole-receiving antenna and the impact of the cable presence on the distribution of the em field inside the enclosure and on the shift of resonant frequencies for normal and oblique incidence.

key words: enclosure, apertures, shielding effectiveness, tlm method, dipole-receiving antenna, coaxial cable

1. introduction

many factors can influence the behavior of an electronic system in terms of electromagnetic compatibility (emc) [1]. the character and importance of electromagnetic (em) radiation from different parts of the system, and the impact of externally generated interference on the functional integrity of the entire system, have to be considered during the system design.
to provide the em signature of equipment and immunity to electromagnetic interference (emi) that both meet limits specified in emc standards, a designer can seek to minimize interference at its source, reduce coupling paths by choosing a suitable layout, apply shielding, filtering, grounding, etc.

received november 18, 2014; received in revised form march 5, 2015
corresponding author: tatjana cvetković, regulatory agency for electronic communications and postal services, višnjićeva 8, 11000 belgrade, serbia (e-mail: tatjana.cvetkovic@ratel.rs)

enclosures, usually built of highly conductive materials, are often used to protect the system from external em fields but also to reduce emi emission from equipment. their protective characteristic is often expressed as a ratio between the field strength without and with the shield at some point inside the enclosure (so-called shielding effectiveness, se) and it can be defined in terms of both the electric and the magnetic field. the em characteristics of the materials used to build the enclosure walls, as well as the structure and form of the enclosure, can significantly influence the value of this parameter and therefore the immunity of the whole system. apertures of various forms, intended for heat dissipation, insertion of control panels and outgoing or incoming cables, airing or other purposes, are integral parts of the shielding enclosure. apertures can significantly degrade the shielding performance of the enclosure, as em radiation penetrates through them into the inside/outside space. besides apertures, there is also penetration via diffusion through the conducting enclosure walls [1], but it is less significant if the conductivity of the walls is high. the em energy that penetrates into the enclosure often couples into wires that are part of transmission lines, such as printed circuit boards (pcbs) or cables, and then propagates, causing further interference.
therefore, it is important to consider all these coupling mechanisms in order to design an enclosure with satisfactory se over the frequency range of interest. there are many methods that can be used for the calculation of the se. some of them are analytical methods [2-4], which can be very efficient but have some limitations. for example, the equivalent waveguide circuit proposed in [3,4] was developed to tackle normal [3] and oblique incidence and arbitrary location of apertures on the enclosure walls [4], but only in the case of an empty enclosure. numerical methods are also often used for the se calculation, e.g. the finite difference time domain (fdtd) method [5], the method of moments (mom) [6] and the transmission line matrix (tlm) method [7-10]. they are generally applicable to complex problems, usually without any limitations but at a high computational cost. in addition, once the enclosure is built, measurements can be conducted to determine its se. to measure the level of the em field at some critical points inside the enclosure, a receiving antenna such as a monopole [3] or dipole [4] can be used, while a coaxial cable transfers the induced current/voltage information to an instrument that records the measurement results (network analyzer, spectrum analyzer, emc receiver). in [11-13] the authors numerically considered the impact of the physical presence of the receiving dipole antenna on the level of the em field inside the enclosure, and therefore on the accuracy of the estimated se when the antenna is used in an experimental procedure. the motivation for this research was that the equivalent circuital model presented in [4] provided se results slightly different from the measurements due, as stated in [4], to its inability to include the receiving antenna. also, some of the authors of this paper have previously shown that an antenna placed inside a microwave cavity can influence the em field distribution and the location of resonant frequencies [14].
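the se defined in the introduction, the ratio of the field strength without and with the shield at the same observation point, is commonly reported in decibels. the following is a minimal sketch under that assumption (20 log10 of the magnitude ratio, as is usual for field quantities); the function name and the sample values are illustrative and are not taken from the paper.

```python
import math


def shielding_effectiveness_db(field_without_shield, field_with_shield):
    """SE in dB from field magnitudes sampled at the same observation point.

    Assumes the usual decibel convention for field quantities:
    SE = 20 * log10(|E_without| / |E_with|).
    """
    if field_without_shield <= 0 or field_with_shield <= 0:
        raise ValueError("field magnitudes must be positive")
    return 20.0 * math.log10(field_without_shield / field_with_shield)


# Illustrative magnitudes only: the shield reduces the field by a factor of 100.
se_db = shielding_effectiveness_db(1.0, 0.01)
print(f"SE = {se_db:.1f} dB")
```

the same function applies to the induced-voltage variant of the procedure if the voltage at the dipole center without and with the enclosure is passed instead of the field magnitude.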
the tlm method incorporating the compact wire model, developed in [15] and adapted in [16] for a cylindrical mesh, has been used to create a numerical model capable of taking the antenna presence into account. in addition, in [12] the equivalent circuital model proposed in [3,4] has been extended to include the receiving antenna. since the numerical model, unlike the circuital model, can account for the two-way interaction between the em field inside the enclosure and the receiving dipole antenna, only the numerical model is used in this paper to analyze the impact of wire parameters on the se. besides directly sampling the em field in the space near the antenna to estimate the se of the enclosure as in [11,12], another way of collecting the signal, which corresponds to the experimental procedure, has also been proved valid [13]. a current signal can be picked up directly from the antenna and then used to find the voltage induced in the center of the dipole antenna, from which the se is calculated. in this paper, the numerical model is applied to a rectangular enclosure with three different aperture patterns on the front wall, and a comparison with measured se results [4] is provided. an analysis of the influence of wire radius and length on the level of the se of the enclosure and on the location of resonant frequencies, more detailed than the one conducted in [13], is presented here for the case of a plane wave of normal incidence and with horizontal polarization. in addition, an obliquely incident plane wave defined by appropriate azimuth, elevation and polarization angles is used as an excitation in order to demonstrate how placing the antenna in three different positions to detect each field component significantly influences the detected level of the se. signal transfer to the measurement instrument via a coaxial cable is taken into account by directly modeling the cable in the numerical model.
its impact on the se is illustrated for a plane wave of normal or oblique incidence to the frontal panel and with horizontal or arbitrary electric polarization.

2. tlm method enhanced with wire node describing enclosure with apertures and receiving antenna inside

the tlm method [17] is a differential time-domain numerical technique based on temporal and spatial sampling of em fields. the fundamental building block in the tlm method is known as the symmetrical condensed node (scn), consisting of 12 interconnected link lines (fig. 1) that model a cuboid piece of space (δx × δy × δz). the network of interconnected scns, together with the chosen parameters of the link lines, is used here to describe the em properties of the medium inside and outside the enclosure. in addition, short and open stubs (not shown in fig. 1a) can be attached to scns to model inhomogeneous and lossy materials. imperfectly conducting enclosure walls are modeled by terminating the tlm link lines at the edge of the enclosure space with an appropriate load. the cross-section of each aperture on an enclosure wall is described by using a fine tlm mesh so that there are several nodes across the aperture width and length. aperture depth is also taken into account by using a few nodes across the wall thickness. in the case of an array of apertures on enclosure walls (e.g. for ventilation purposes), a compact air-vent model [8] has proved to be more efficient, while in the case of slots (apertures whose length is significantly larger than the dimensions across the slot) a compact slot model can be used [9]. regarding the wire structures inside the enclosure, the most efficient solution is to use a compact wire model consisting of an additional wire network, formed by two link lines and one short-circuit stub line, embedded within the scn (so-called wire node, fig. 1b). as a wire node models one wire segment, a column of nodes is usually required to model the whole wire length.
this model allows for accurate modeling of wires having a considerably smaller diameter than the node size (up to 40% of the node cross-section size). in addition, it models signal propagation along the wires, while allowing for interaction with the external em field (two-way interaction). the presence of a wire increases the capacitance and inductance of the medium in which it is placed, so the characteristic impedances of the link and stub lines, $Z_{wi}$ and $Z_{wsi}$, $i \in (x, y, z)$, respectively, have to be chosen to model this increase in capacitance and inductance in the direction of the wire axis, maintaining at the same time synchronism with the rest of the transmission line network.

fig. 1 a) symmetrical condensed node (scn), b) a column of wire nodes (scns with additional wire networks; two link lines and one short-circuit stub line per node model one wire segment)

for example, in the case of a wire running in the y direction, the characteristic impedances of the link and stub lines can be expressed as:

$$Z_{wy} = \frac{\Delta t}{\Delta y \, C'_w}, \qquad Z_{wsy} = \frac{\Delta y \, L'_w}{\Delta t} - Z_{wy} \tag{1}$$

where $\Delta y$ represents the dimension of the tlm node in the direction of the wire segment passing through the node, $\Delta t$ is the time-step discretization, and $C'_w$ and $L'_w$ are the wire capacitance and inductance per unit length, respectively, calculated as:

$$C'_w = \frac{2\pi\varepsilon}{\ln(k_{ci} \, \Delta y_c / r_w)}, \qquad L'_w = \frac{\mu \ln(k_{li} \, \Delta y_c / r_w)}{2\pi} \tag{2}$$

where $\varepsilon$ and $\mu$ are the permittivity and permeability of the medium, $\Delta y_c$ represents the mean cross-section dimension of the tlm node for the y direction, $\Delta y_c = (\Delta x + \Delta z) / 2$, $r_w$ is the wire radius, and $k_{ci}$ and $k_{li}$ are factors obtained empirically by using the known tlm network characteristics and the mean dimensions of the node cross-section in the wire running direction [15].
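the evaluation of (1) and (2) for one wire node can be sketched as follows. this is only an illustration: the empirical factors k_c and k_l are hypothetical placeholder values (the paper takes them from [15] and does not list them), the node size and wire radius are example inputs, and the time step dt = Δl / (2c) is assumed as the usual choice for a uniform scn mesh. the non-negativity check on the stub impedance is an added safeguard and is not stated in the text.

```python
import math

EPS0 = 8.8541878128e-12            # vacuum permittivity, F/m
MU0 = 4.0e-7 * math.pi             # vacuum permeability, H/m
C0 = 1.0 / math.sqrt(EPS0 * MU0)   # speed of light in vacuum, m/s


def wire_node_impedances(dx, dy, dz, dt, r_w, k_c, k_l, eps=EPS0, mu=MU0):
    """Link- and stub-line impedances of a y-directed wire node, eqs. (1)-(2)."""
    dy_c = 0.5 * (dx + dz)  # mean cross-section dimension of the node
    if not (k_c * dy_c > r_w and k_l * dy_c > r_w):
        raise ValueError("equivalent radii must exceed the wire radius")
    c_w = 2.0 * math.pi * eps / math.log(k_c * dy_c / r_w)   # eq. (2), F/m
    l_w = mu * math.log(k_l * dy_c / r_w) / (2.0 * math.pi)  # eq. (2), H/m
    z_wy = dt / (dy * c_w)          # link-line impedance, eq. (1)
    z_wsy = dy * l_w / dt - z_wy    # short-circuit stub impedance, eq. (1)
    if z_wsy < 0.0:
        raise ValueError("negative stub impedance for these parameters")
    return z_wy, z_wsy


# Hypothetical inputs: 5 mm cubic node, 0.1 mm wire radius, placeholder factors.
dl = 5.0e-3
dt = dl / (2.0 * C0)
z_wy, z_wsy = wire_node_impedances(dl, dl, dl, dt, r_w=0.1e-3, k_c=0.1, k_l=0.3)
print(f"Z_wy = {z_wy:.1f} ohm, Z_wsy = {z_wsy:.1f} ohm")
```

by construction the two impedances add up to Δy L'_w / Δt, so the link lines carry the wire capacitance and the stub supplies the inductance that the link lines alone do not provide.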
the current in a straight wire segment running in the y direction can be found as [15]:

$$I_{wy} = \frac{2\,(V^{i}_{wyn} + V^{i}_{wsy} - V^{i}_{wyp}) + V_{ab}}{2 Z_{wy} + Z_{wsy} + Z_{ab}}, \tag{3}$$

where $V_{ab}$ and $Z_{ab}$ describe the two-way coupling of the link lines of the scn polarized in the y direction (pulses $V_{xny}$, $V_{xpy}$, $V_{zny}$ and $V_{zpy}$ and characteristic admittances $Y_{xy}$ and $Y_{zy}$) and the open-circuit stub of the scn polarized in the z direction (pulse $V_{oy}$ and characteristic admittance $Y_{oy}$) with the additional wire network [17]:

$$V_{ab} = \frac{2\,(V^{i}_{xny} + V^{i}_{xpy})\,Y_{xy} + 2\,(V^{i}_{zny} + V^{i}_{zpy})\,Y_{zy} + 2\,V^{i}_{oy}\,Y_{oy}}{2\,(Y_{xy} + Y_{zy}) + Y_{oy}}, \tag{4}$$

$$Z_{ab} = \frac{1}{2 Y_{xy} + 2 Y_{zy} + Y_{oy}}. \tag{5}$$

similar expressions per wire node can be derived for straight wire segments running in the other two directions. the 3d tlmscn solver, based on the scn and designed at the microwave lab of the faculty of electronic engineering in nis, incorporates the wire node feature, which allows wire structures to be modeled efficiently.

3. numerical analysis and results

a rectangular enclosure with three different aperture patterns on the front wall (fig. 2) is used to analyze the influence of a dipole-receiving antenna and a coaxial cable connection on the se. the dimensions of the enclosure are the same as specified in [4]. it should be noted that the radius of the dipole-receiving antenna used in the measurements is not specified in [4]. externally generated interference is represented as a plane wave of normal incidence to the wall with apertures and with horizontal electric polarization ($E_y$). a fine tlm mesh has been used to accurately describe the aperture cross-sections and the wall thickness of 2 mm in the numerical model of an empty enclosure. similar behavior of the se in the considered frequency range can be noticed in fig. 3 for all three aperture patterns.
the only difference is in the level of the se, due to the different percentage of wall surface covered by apertures, which results in a different amount of em energy penetrating into the space inside the enclosure.

fig. 2 a) enclosure of a rectangular cross-section, b) front panel with one or three differently sized apertures

fig. 3 se of enclosure with various aperture patterns on the front wall: tlm model of enclosure without antenna

two wires running in the y direction, each with a length of 50 mm and a radius of 0.1 mm, and separated by 2 mm, are then used to create a receiving dipole antenna. the wires are represented by the tlm wire node as explained in section 2. the position of both wire arms of the antenna inside the enclosure is defined according to [4]. both wire arms are terminated by resistors in order to have the dipole antenna loaded with the coaxial cable impedance. the physical presence of the coaxial cable is modeled and discussed later in the paper. a balun, often used between unbalanced and balanced transmission lines, is not considered in this paper, but the cable-antenna coupling is realized so as to be symmetric. the currents induced on both wire arms of the loaded dipole antenna are of equal amplitudes and opposite signs, and they can be used to find the voltage induced between the mutually nearest ends of the two wires, i.e. in the center of the dipole antenna. for illustration purposes, the currents at the terminating resistors of the dipole wire arms, as well as the induced voltage, are shown in fig. 4 for the case of the enclosure with one (10 × 50) mm² aperture on the front wall.

fig. 4 induced currents at the terminating resistors of wire arms (overlapping black and red solid lines) and voltage induced between wire arms as a function of frequency

the numerical result for the se obtained from the voltage difference between the two arms of the loaded dipole antenna, without and with the enclosure with one (10 × 50) mm² aperture on the front wall, is shown in fig. 5. it is also compared to the case when only the presence of the dipole antenna is taken into account (i.e. the em field is sampled directly at a point in space between the dipole wire arms). it can be observed that the level of the se is always lower when the dipole antenna loaded with the coaxial cable impedance is taken into account, and that in some frequency regions the tlm model with the loaded antenna follows the experimental results more closely. similar conclusions can be reached for the other two aperture patterns. however, it should be pointed out that a more accurate comparison of these two tlm models with the experimental results has to include the presence of the balun and its characteristics in the considered frequency range, which were not given in [4].

fig. 5 se of enclosure with one (10 × 50) mm² aperture on the front wall: tlm models of enclosure with antenna, with loaded antenna, and measurements [4]

as already illustrated in [13], wires of different radius and length, used to represent the receiving dipole antenna arms, can influence the se and shift the resonant frequencies of the enclosure. at and around resonant frequencies the level of the se can be very low, indicating poor shielding, so it is important to determine their values accurately as well. therefore, a more detailed analysis of the influence of the antenna physical dimensions on the em field distribution inside the enclosure is conducted here. the radius of both wire arms was changed in the range (0-2) mm, while the length was varied in the range (50-150) mm.
their impact on the level of the se and on the location of three resonant frequencies of the enclosure with one (10 × 50) mm² aperture on the front wall, for a horizontally polarized plane wave of normal incidence, is shown in figs. 6-9. figs. 6 and 7 show how δse varies as a function of frequency for different radii and lengths of the antenna, respectively, where δse represents the difference between the level of the se of the considered enclosure in the absence and in the presence of the dipole antenna. thicker and longer wires affect the level of the se more strongly in comparison with the case when the antenna is not placed inside the enclosure. this impact is almost constant over the considered frequency range except around resonant frequencies, where the shift of resonant frequencies also has to be taken into account. the location of the resonant frequencies, compared to the case of an empty enclosure, moves towards lower frequencies with the increase of wire radius and length, as shown in figs. 8 and 9, respectively, for the three resonant frequencies indicated in [4]. this shift can be significant, especially for thicker and longer wires and, as in the case of the level of the se, it is due to the stronger influence of the antenna on the total em field inside the enclosure.

fig. 6 difference between the level of the se of enclosure with one (10 × 50) mm² aperture on the front wall in the absence and presence of dipole antenna with various radii and length of 100 mm

fig. 7 difference between the level of the se of enclosure with one (10 × 50) mm² aperture on the front wall in the absence and presence of dipole antenna with various lengths and radius of 0.1 mm

[three panels: the first, the second and the third resonant frequency fr (ghz) versus radius of the wire (mm)]
fig. 8 resonant frequency of the se of enclosure with one (10 × 50) mm² aperture on the front wall versus radius of the dipole antenna; antenna length is 100 mm

[three panels: the first, the second and the third resonant frequency fr (ghz) versus length of the wire (mm)]
fig. 9 resonant frequency of the se of enclosure with one (10 × 50) mm² aperture on the front wall versus length of the dipole antenna; antenna radius is 0.1 mm

in all previously considered examples, a plane wave of normal incidence to the frontal panel and with horizontal electric polarization ($E_y$) is used as an excitation. as a result, the dominant em field component excited inside the enclosure is in the y direction, and the dipole-receiving antenna has to be positioned only in this direction during the measurement in order to calculate the se.
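the two quantities plotted in figs. 6-9, δse and the location of a resonant frequency, can be sketched as follows. this is a minimal illustration on synthetic curves, not on data from the paper: it assumes that a resonance shows up as the se minimum inside a chosen frequency band, and the dip shape, the band limits, the 2 db offset and all names are hypothetical.

```python
def delta_se(se_empty_db, se_antenna_db):
    """Pointwise difference: SE without antenna minus SE with antenna (dB)."""
    return [a - b for a, b in zip(se_empty_db, se_antenna_db)]


def resonant_frequency(freq_ghz, se_db, f_lo, f_hi):
    """Frequency of the SE minimum inside the band [f_lo, f_hi]."""
    in_band = [(s, f) for f, s in zip(freq_ghz, se_db) if f_lo <= f <= f_hi]
    if not in_band:
        raise ValueError("no samples in the requested band")
    return min(in_band)[1]


def synthetic_se(f, f_res, floor_db):
    """Synthetic SE curve: flat level with one resonance-like dip."""
    return floor_db - 30.0 / (1.0 + ((f - f_res) / 0.002) ** 2)


# Synthetic curves on a 1 MHz grid from 0.85 GHz to 0.95 GHz.
freq = [0.85 + 0.001 * k for k in range(101)]
se_empty = [synthetic_se(f, 0.904, 40.0) for f in freq]
se_antenna = [synthetic_se(f, 0.900, 38.0) for f in freq]

dse = delta_se(se_empty, se_antenna)
f_empty = resonant_frequency(freq, se_empty, 0.88, 0.92)
f_antenna = resonant_frequency(freq, se_antenna, 0.88, 0.92)
shift = f_empty - f_antenna  # positive when the antenna lowers the resonance
print(f"resonance shift = {shift * 1e3:.1f} MHz")
```

away from the dip the synthetic δse stays close to the constant offset, while near the resonance it is dominated by the frequency shift, which is the behavior described for figs. 6 and 7.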
however, in general, a plane wave can have an arbitrary incident angle and polarization, so that all three em field components can exist inside the enclosure. this requires that, in the experimental procedure, the dipole-receiving antenna is placed separately along each of the three cartesian axes. therefore, the dipole antenna presence influences the detected level of each em field component and, hence, it has a stronger impact on the accuracy of determining the total se of the enclosure. as an illustration, the se of the enclosure with three (10 × 50) mm² apertures on the front wall is calculated for an obliquely incident plane wave with the azimuth angle 60º, elevation angle 90º and polarization angle 30º. numerical tlm results for the case of an empty enclosure (without dipole-receiving antenna) and the case when a loaded dipole antenna is present inside the enclosure are shown in fig. 10. it can be seen that, especially in some frequency ranges, the presence of the dipole antenna leads to a strong underestimation of the level of the se of the enclosure.

fig. 10 se of enclosure with three (10 × 50) mm² apertures on the front wall and incident plane wave with azimuth angle 60º, elevation angle 90º and polarization angle 30º: tlm models of enclosure without antenna and with loaded antenna

the physical presence of the coaxial cable inside the enclosure and its impact on the se are considered next. a coaxial cable of length 140 mm is placed along the x direction in order to transfer the signal picked up from the center of the dipole-receiving antenna to the network analyzer. the radius of the outer conductor of the coaxial cable is chosen to be equal to the radius of the dipole-receiving antenna.
for a plane wave of normal incidence to the frontal panel with apertures and with horizontal electric polarization ($E_y$), the presence of the coaxial cable does not have any influence on the se of the enclosure, as the dominant em field component is in the y direction while the cable runs in the x direction. therefore, the se of the enclosure is the same as shown in fig. 5 for the tlm model of the enclosure with the loaded antenna. however, for an obliquely incident plane wave with the azimuth angle 60º, elevation angle 90º and polarization angle 30º, the coaxial cable has some impact on the se of the enclosure (fig. 11), as a small current, induced by the x component of the em field excited inside the enclosure, runs along the cable. it can be seen that the se of the enclosure obtained by using the tlm model with the antenna and cable differs slightly from the se calculated by using the tlm model of the enclosure with the loaded antenna. the difference is noticeable at the frequencies where the induced current running along the cable is not negligible and therefore enhances the detected level of the x component of the em field.

fig. 11 se of enclosure with three (10 × 50) mm² apertures on the front wall and incident plane wave with azimuth angle 60º, elevation angle 90º and polarization angle 30º: tlm models of enclosure with loaded antenna, with antenna and cable, and current induced in the cable

4. conclusion

in this paper, the tlm method with the wire node model is used to describe a dipole-receiving antenna and a coaxial cable inside the enclosure and their two-way interaction with the em field. the numerical model follows the experimental arrangement for se measurements. it is shown that both the antenna and the cable, when present to monitor the em field, can have an impact on the measured level of the se and on the measured values of the resonant frequencies of the enclosure.
the level of their impact depends on the antenna radius and length, but also on the cable orientation and on the incident angle and polarization of the external interference. based on these results, the authors will continue their research on the impact of different types of monitoring antennas (dipole, monopole, loop) on the em field distribution inside the protective enclosure and will try to extend the described numerical model to take into consideration the characteristics of a balun placed between the antenna and the cable.

acknowledgement: this work has been partially supported by the ministry for education, science and technological development of serbia, project number tr32052.

references

[1] c. christopoulos, principles and techniques of electromagnetic compatibility, crc press, 2007.
[2] h. a. mendez, "shielding theory of enclosures with apertures", ieee trans. electromagn. compat., vol. 20, no. 2, pp. 296–305, 1978.
[3] m.p. robinson, t. m. benson, c. christopoulos, j.f. dawson, m.d. ganley, a.c. marvin, s.j. porter, d.w.p. thomas, "analytical formulation for the shielding effectiveness of enclosures with apertures", ieee trans. electromagn. compat., vol. 40, no. 3, pp. 240–248, 1998.
[4] j. shim, d.g. kam, j.h. kwon, j. kim, "circuital modeling and measurement of shielding effectiveness against oblique incident plane wave on apertures in multiple sides of rectangular enclosure", ieee trans. electromagn. compat., vol. 52, no. 3, pp. 566–577, 2010.
[5] l.j. nuebel, j. l. drewniak, r. e. dubroff, t.h. hubing, t. p. van doren, "emi from cavity modes of shielding enclosures – fdtd modeling and measurements", ieee trans. electromagn. compat., vol. 42, no. 1, pp. 29–38, 2000.
[6] s. ali, d.s. weile, t. clupper, "effect of near field radiators on the radiation leakage through perforated shields", ieee trans. electromagn. compat., vol. 47, no. 2, pp. 367–373, 2005.
[7] b.l. nie, p.a. du, y.t. yu, z.
shi, "study of the shielding properties of enclosures with apertures at higher frequencies using the transmission-line modeling method", ieee trans. electromagn. compat., vol. 53, no. 1, pp. 73–81, 2011.
[8] n. doncov, b. milovanović, z. stankovic, "extension of compact tlm air-vent model on rectangular and hexagonal apertures", applied computational electromagnetic society (aces) journal, vol. 26, no. 1, pp. 64-72, 2011.
[9] a. mallic, d. p. johns, a. j. wlodarczyk, "tlm modelling of wires and slots", in proceedings of international zurich symposium on electromagnetic compatibility, zurich, switzerland, pp. 515-520, 1993.
[10] v. milutinovic, t. cvetkovic, n. doncov, b. milovanovic, "analysis of enclosure shielding properties dependence on aperture spacing and excitation parameters", in proceedings of the ieee telsiks conference, niš, serbia, vol. 2, pp. 521-524, 2011.
[11] t. cvetković, v. milutinović, n. dončov, b. milovanović, "tlm modelling of receiving dipole antenna impact on shielding effectiveness of enclosure", int. j. reasoning-based intelligent systems, vol. 5, no. 3, pp. 202-207, 2013.
[12] v. milutinović, t. cvetković, n. dončov, b. milovanović, "circuital and numerical models for calculation of shielding effectiveness of enclosure with apertures and monitoring dipole antenna inside", radioengineering, vol. 22, no. 4, pp. 1249-1257, 2013.
[13] t. cvetković, v. milutinović, n. dončov, b. milovanović, "numerical investigation of monitoring antenna influence on shielding effectiveness characterization", applied computational electromagnetic society (aces) journal, vol. 29, no. 11, pp. 837-845, 2014.
[14] j. jokovic, b. milovanovic, n. doncov, "numerical model of transmission procedure in a cylindrical metallic cavity compared with measured results", int. journal of rf and microwave computer-aided engineering, vol. 18, no. 4, pp. 295-302, 2008.
[15] a.j. wlodarczyk, v. trenkic, r. scaramuzza, c.
christopoulos, "a fully integrated multiconductor model for tlm", ieee trans. microwave theory tech., vol. 46, no. 12, pp. 2431-2437, 1998.
[16] t. dimitrijevic, j. jokovic, b. milovanovic, n. doncov, "tlm modeling of a probe-coupled cylindrical cavity based on compact wire model in the cylindrical mesh", int. journal of rf and microwave computer-aided engineering, vol. 22, no. 2, pp. 184-192, 2012.
[17] c. christopoulos, the transmission-line modelling (tlm) method, ieee/oup series, piscataway, nj, 1995.

facta universitatis
series: electronics and energetics vol. 34, no 4, december 2021, pp. 557-567
https://doi.org/10.2298/fuee2104557r
© 2021 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper

feature extraction for person gait recognition applications

adnan ramakić1, zlatko bundalo2, željko vidović3
1 rectorate, university of bihać, bihać, bosnia and herzegovina
2 faculty of electrical engineering, university of banja luka, banja luka, bosnia and herzegovina
3 university of east sarajevo, faculty of transport and traffic engineering, doboj, bosnia and herzegovina

abstract. in this paper we present some features that may be used in person gait recognition applications. gait recognition is an interesting way of identifying people. during a gait cycle, each person creates unique patterns that can be used for identification. also, gait recognition methods ordinarily do not need interaction with a person, which is the main advantage of these methods. features used in person gait recognition methods can be obtained with widely available rgb and rgb-d cameras. in this paper we present two features that are suitable for use in gait recognition applications: the height of a person and the step length of a person. both were extracted from depth images obtained from an rgb-d camera.
for experimental purposes, we used a custom dataset created in an outdoor environment using a long-range stereo camera.

key words: gait recognition, gait energy image, backfilled gait energy image, height of a person, step length of a person.

1. introduction

people may be identified using different biometric methods. examples of these methods are fingerprint, retina and iris recognition (identification based on eye features), facial recognition, keystroke dynamics, voice recognition etc. generally, they may be divided into physiological and behavioral biometric methods. physiological biometric methods include fingerprint, retina and iris recognition, hand geometry, facial recognition etc., while behavioral biometric methods include methods like keystroke dynamics, voice recognition, person signature recognition, gait recognition etc.

received april 3, 2021; received in revised form august 14, 2021
corresponding author: zlatko bundalo, faculty of electrical engineering, university of banja luka, 5 patre, 78 000 banja luka, bosnia and herzegovina, e-mail: zlatbun2007@gmail.com

most of the methods listed above need some kind of interaction with a person during the identification process. on the other hand, gait recognition is a method that ordinarily does not need any interaction with a person during the identification process. using some types of long-range cameras (e.g. the zed stereo camera), some facial recognition methods may also be conducted without interaction with a person. the gait recognition approaches in use today are model-based or appearance-based. the model-based approach ordinarily exploits different parts of the human body to create a model that is used for identification purposes. some of the human body parts that are ordinarily used with model-based approaches are legs, arms, etc. in other words, measures related to these body parts are used (e.g. arm length).
the appearance-based approach ordinarily uses representations of a person's silhouette. research related to gait recognition has usually been done using rgb or rgb-d cameras (sometimes also called rgb and rgb-d sensors) and datasets created with them. earlier work used rgb cameras, but today rgb-d cameras are also in use. an rgb-d camera provides depth data along with rgb data. most of the research done using rgb-d cameras was realized with the kinect sensor developed by microsoft. in this paper we analyze some features that may be used along with well-known gait recognition methods. these features are the height of a person and the step length of a person. both features may be obtained from the depth images of an rgb-d camera. the gait recognition methods used along with these features are the appearance-based methods gei (gait energy image) [1] and bgei (backfilled gait energy image) [2]. gei is an image that contains the silhouettes (aligned, normalized and averaged) of a person over a gait cycle. bgei is similar to gei in that it is also an image of a person's silhouettes, but the silhouettes are back filled from the front-most pixels.

2. related work

there is a large number of works in the field of gait recognition. approaches that deal with gait recognition are usually divided into two types: the model-based approach and the appearance-based approach. the model-based approach uses explicit models to represent and track different parts of the human body (e.g. legs or arms) over time, while the appearance-based approach ordinarily uses human silhouettes that are extracted from rgb or depth images. the appearance-based approach usually does not use explicit models. in this paper we focus mainly on some appearance-based approaches. han and bhanu [1] presented a spatiotemporal gait representation called gait energy image (gei).
gei is an image that contains averaged silhouettes, normalized and aligned, of a person during a gait cycle. sivapalan et al. [3] presented a gait energy volume (gev). authors [3] extended gei with a 3d and used reconstructed voxel volumes instead of temporally averaging segmented silhouettes. sivapalan et al. [2] also presented backfilled gait energy image (bgei). bgei is a feature that may be constructed using side-view silhouettes or frontal depth images. bgei is an image, such as gei, with a person silhouettes, but silhouettes of a person are back filled from front most pixels. feature extraction for person gait recognition applications 559 hofmann et al. [4] used depth information with a gei in a way that gei required silhouettes were calculated using a depth data. also, authors [4] proposed and a feature defined as depth gradient histogram energy image (dghei). arora and srivastava [5] presented gait gaussian image (ggi), a period based gait technique that is used for feature extraction of gait image during a gait cycle. iwashita et al. [6] presented an approach in which an image that contains a human body is divided in multiple areas. for every mentioned area features are extracted and used in gait recognition process. ramakić et al. [7] and lenac et al. [8] presented approaches where they used appearancebased methods, such as gei and bgei, and height feature obtained from depth images for gait recognition tasks. lenac et al. [8] presented hgei-i and hgei-f methods where in case of first method, hgei-i, early fusion of information is realized while hgei-f performs late fusion of information. hgei-i represents a method for gait recognition where are combined gei features and feature height of a person in a way that height of a person is added as one of a features alongside features obtained from gei. then, classification process is realized after integration of gei features and height of a person feature. 
in the hgei-f method, gei and height of a person are considered separately, and a single prediction is made based on the results of the individual classifications. ramakić et al. [9] also presented a method for gait recognition that exploits silhouettes of a person along with height of a person and step length of a person. chattopadhyay et al. [10] presented a pose depth volume (pdv) feature for frontal gait recognition. bashir et al. [11] proposed a gait representation called gait entropy image (geni). geni encodes in a single image the randomness of pixel values in the silhouette images over a gait cycle. portillo-portillo et al. [12] presented an approach for gait recognition that exploits gei and direct linear discriminant analysis (dlda) in order to create a view-invariant model for identification. lishani et al. [13] proposed an approach based on the haralick features extracted from gei. rudek et al. [14] presented a method for gait classification based on analysis of the trajectory of pressure centers extracted from the contact points of the feet with the ground.

3. feature extraction
the features that were extracted and used in the gait recognition process are the real height of a person and the step length of a person. the main idea of this paper is to use additional features along with well-known appearance-based gait recognition methods such as gei and bgei.

3.1. gei and bgei
gei represents a person's silhouettes (aligned, normalized and temporally averaged) over a gait cycle in one image. gei is defined as:

$$G(i, j) = \frac{1}{N} \sum_{t=1}^{N} I(i, j, t) \qquad (1)$$

where $N$ is the number of silhouette frames in the gait cycle, $t$ is the frame number within the gait cycle, and $I(i, j, t)$ is the original silhouette image at frame $t$, with $(i, j)$ the pixel coordinates in the image. examples of gei images for three people and for three different datasets are shown in fig. 1.
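the averaging in eq. (1) can be illustrated with a minimal sketch. it assumes that the silhouettes are already aligned and normalized to a common size and are stored as nested lists of 0/1 values; the function name gait_energy_image and the tiny 2 x 3 frames are hypothetical and are not taken from the paper.

```python
from typing import List

Silhouette = List[List[int]]  # binary image: 1 = person pixel, 0 = background


def gait_energy_image(frames: List[Silhouette]) -> List[List[float]]:
    """Pixel-wise mean of aligned binary silhouettes over one gait cycle (eq. (1))."""
    if not frames:
        raise ValueError("at least one silhouette frame is required")
    rows, cols = len(frames[0]), len(frames[0][0])
    for frame in frames:
        if len(frame) != rows or any(len(row) != cols for row in frame):
            raise ValueError("all frames must have the same size")
    n = len(frames)  # N in eq. (1)
    return [
        [sum(frame[i][j] for frame in frames) / n for j in range(cols)]
        for i in range(rows)
    ]


# Hypothetical gait cycle of four 2 x 3 silhouettes (illustration only).
cycle = [
    [[0, 1, 0], [0, 1, 0]],
    [[0, 1, 1], [0, 1, 0]],
    [[0, 1, 0], [1, 1, 0]],
    [[0, 1, 0], [0, 1, 0]],
]
gei = gait_energy_image(cycle)
print(gei)
```

pixels that belong to the person in every frame get the value 1, while pixels covered only during part of the cycle (e.g. swinging limbs) get intermediate values.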
the first row shows gei images from our own custom dataset created with a long-range stereo camera, the second row shows gei images from the well-known casia dataset b [15] [16] [17], and the third row shows gei images from the well-known tum gait from audio, image and depth (tum-gaid) dataset [18].

fig. 1 examples of gei images, for three different people, from own custom dataset, casia dataset b [15] [16] [17], and tum-gaid [18] dataset

bgei is created in the same way as gei, with the difference that bgei is constructed by first back filling the binary silhouettes: the front-most pixel on each row is found, and the row is filled from it to the back of the image. examples of bgei images, for three different people, for our own custom dataset and the tum-gaid dataset are shown in fig. 2.

fig. 2 examples of bgei images, for three people, from own custom dataset and tum-gaid [18] dataset

3.2. height of a person and step length of a person
height of a person and step length of a person were extracted from depth images. depth images contain information about the distance of a specific object from the camera. using this information, the real height of a person and the step length of a person may be estimated. these two features are conceptually simple and robust, and they can be easily extracted and combined with gait recognition methods. used on its own, height of a person or step length of a person is not a reliable means of identification, but in combination with a gait recognition method it can improve the overall identification score. height of a person may be estimated from a depth image as the distance between the top-most point on the person's silhouette and the ground plane. the top-most point on the silhouette is represented as a vector that contains the following values (eq. (2)):

$$p_2 = \begin{bmatrix} x \\ y \\ w \\ 1 \end{bmatrix} \qquad (2)$$

where $x$ and $y$ are 2d image pixel coordinates and $w$ is the disparity value.
the ground plane can be detected using random sample consensus (ransac). ransac is a robust model-fitting method that is often used for plane detection in point cloud data. the output of this step is the set of plane parameters $a$, $b$, $c$ and $d$. in order to obtain 3d coordinates for the top-most point in the cartesian coordinate system, a perspective transformation matrix was used. this matrix is obtained through the camera calibration process and is shown in eq. (3):

$$Q = \begin{bmatrix} 1 & 0 & 0 & -c_x \\ 0 & 1 & 0 & -c_y \\ 0 & 0 & 0 & f \\ 0 & 0 & -\frac{1}{T_x} & \frac{c_x - c'_x}{T_x} \end{bmatrix} \qquad (3)$$

if the perspective transformation matrix is known, the matrix multiplication in eq. (4) is performed:

$$p'_3 = Q \, p_2 \qquad (4)$$

the result is the following homogeneous 3d point vector, as shown in eq. (5):

$$p'_3 = \begin{bmatrix} x' \\ y' \\ z' \\ w' \end{bmatrix} \qquad (5)$$

in order to obtain metric values and real values for distance, it is necessary to divide all vector elements by $w'$, as shown in eq. (6) and eq. (7):

$$p_3 = p'_3 / w' \qquad (6)$$

$$p_3 = \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} \qquad (7)$$

in order to obtain the distance between the top-most point (eq. (7)) and the ground plane, eq. (8) is used:

$$d = \frac{\left| a \cdot x + b \cdot y + c \cdot z + d \right|}{\sqrt{a^2 + b^2 + c^2}} \qquad (8)$$

the obtained distance $d$ on the left-hand side represents the height of a person (the $d$ in the numerator is the plane parameter). height of a person may also be estimated from the height of the person's silhouette in the depth image. in that case, two points are necessary: the top-most point and the bottom point between the legs of the person. both points can be calculated as described above for the top-most point. the distance between these two points represents the height of a person. in this paper, height of a person was estimated as the distance between these two points, the top-most point and the bottom point. this is illustrated in fig. 3. step length of a person was estimated as the distance between two points defined on the left and right leg, which is also illustrated in fig. 3.
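the chain of eqs. (2) to (8) can be sketched as follows. this is a minimal illustration under stated assumptions, not the authors' implementation: the camera parameters (focal length, principal points, baseline), the pixel coordinates and the ground plane are hypothetical values, the function names are invented for this sketch, and the sign convention of $T_x$ (negative here, so that depth comes out positive) is an assumption.

```python
import math
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]


def reprojection_matrix(f: float, cx: float, cy: float,
                        cx_right: float, tx: float) -> List[List[float]]:
    """Perspective transformation matrix Q with the layout of eq. (3)."""
    return [
        [1.0, 0.0, 0.0, -cx],
        [0.0, 1.0, 0.0, -cy],
        [0.0, 0.0, 0.0, f],
        [0.0, 0.0, -1.0 / tx, (cx - cx_right) / tx],
    ]


def to_3d(q: Sequence[Sequence[float]], x: float, y: float,
          disparity: float) -> Point3D:
    """Eqs. (2), (4)-(7): multiply Q by (x, y, w, 1) and divide by w'."""
    p2 = (x, y, disparity, 1.0)
    xh, yh, zh, wh = (sum(q[r][c] * p2[c] for c in range(4)) for r in range(4))
    return (xh / wh, yh / wh, zh / wh)


def point_plane_distance(p: Point3D,
                         plane: Tuple[float, float, float, float]) -> float:
    """Eq. (8): distance from point p to the plane a*x + b*y + c*z + d = 0."""
    a, b, c, d = plane
    return abs(a * p[0] + b * p[1] + c * p[2] + d) / math.sqrt(a * a + b * b + c * c)


# Hypothetical calibration: f = 700 px, principal point (640, 360), 0.12 m baseline.
q = reprojection_matrix(f=700.0, cx=640.0, cy=360.0, cx_right=640.0, tx=-0.12)

head = to_3d(q, 640, 150, 28.0)        # top-most silhouette point
feet = to_3d(q, 640, 570, 28.0)        # bottom point between the legs
left_leg = to_3d(q, 590, 570, 28.0)
right_leg = to_3d(q, 690, 570, 28.0)

height_two_points = math.dist(head, feet)             # variant used in the paper
height_plane = point_plane_distance(head, (0.0, 1.0, 0.0, -0.9))  # assumed ground plane
step_length = math.dist(left_leg, right_leg)
print(height_two_points, height_plane, step_length)
```

with these made-up numbers both height variants give about 1.8 m, because the assumed ground plane passes through the bottom point; on real data the two estimates may differ, for example when the ground plane fit is noisy.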
the average value of the step length of a person, obtained over a gait cycle, was used.

fig. 3 depth image with estimated height of a person and step length of a person (in meters)

4. experimental setup
in this paper we used our own custom dataset that contains 14 persons. all 14 persons in the dataset walk normally. the dataset was recorded in an outdoor environment using a zed long-range stereo camera. all persons in the dataset walk without any accessories (e.g. a backpack) and were captured at a 90-degree angle. the experimental test was conducted using matlab. for each of the 14 persons there are six gei and bgei images as well as height and step length values. we used bag of features with a vocabularysize of 500 (the default value in matlab). this value corresponds to k in the k-means clustering algorithm that was used on the extracted feature descriptors. we also set pointselection to detector. pointselection is the selection method for picking point locations. feature points were selected from gei and bgei images using the speeded-up robust features (surf) algorithm. people identification was realized as a classification process using the support vector machine (svm) algorithm. after the features were obtained, dimensionality reduction was conducted using principal component analysis (pca). five-fold cross-validation was also used. this means that the original dataset is partitioned so that one part is used to train the algorithm and the remaining data is used for testing. in k-fold cross-validation, the data are partitioned into k randomly chosen subsets, or folds, of approximately equal size. one subset is used to validate the model trained using the remaining subsets. the process is repeated k times. fig. 4 shows the steps of feature extraction and classification.

fig. 4 steps during feature extraction

the steps shown in fig. 4 can be described as follows. first, person segmentation from the depth image is necessary. after person segmentation, silhouettes of a person can be extracted and height of a person and step length of a person can be estimated. silhouettes of a person can also be extracted from rgb images. if silhouettes of a person are available, gei or bgei (depending on what is being created) can be created. when gei or bgei images are available for each person, features can be extracted from them. features from gei or bgei images, together with height of a person and step length of a person, are then classified using the svm algorithm. the final step yields the results of classification.

5. results
the experimental research was done using the dataset with 14 people in gait. the methods that were tested are gei, bgei, gei along with height and step length of a person (gei-height-step integration), bgei along with height and step length of a person (bgei-height-step integration), and height of a person as the only feature. step length of a person was not tested as a stand-alone feature because it is not reliable enough for people identification when it is the only feature. when height of a person was used, only the height values were classified with the svm classifier. in the case of gei or bgei, only the extracted image features were classified using the svm classifier. when bgei and gei were used along with height and step length of a person, the height and step length values were added as additional features alongside the features from gei and bgei, and all of them were classified together with the svm classifier. the results of classification are shown in table 1.
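the integration of features and the k-fold partitioning described above can be sketched as follows. this is a minimal illustration, assuming that each of the 14 x 6 = 84 images is one sample and that integration simply appends the two scalar values to the image feature vector; the function names fuse and k_fold_indices are hypothetical, and the matlab bag-of-features, pca and svm steps are deliberately not reproduced here.

```python
import random
from typing import List, Sequence, Tuple


def fuse(image_features: Sequence[float], height: float,
         step_length: float) -> List[float]:
    """Append height and step length to the features extracted from a gei/bgei image."""
    return list(image_features) + [height, step_length]


def k_fold_indices(n_samples: int, k: int,
                   seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Split sample indices into k random folds of approximately equal size.

    Returns k (train_indices, test_indices) pairs; every sample is used
    for testing exactly once.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


# Hypothetical sample: two image features plus height and step length in meters.
sample = fuse([0.10, 0.20], height=1.80, step_length=0.43)

# 14 persons x 6 images = 84 samples, five folds as in the experimental setup.
splits = k_fold_indices(n_samples=84, k=5)
for train, test in splits:
    print(len(train), len(test))
```

in practice the appended values may need scaling so that they are comparable with the image features; the paper does not state whether this was done.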
table 1 classification results

| method | result with svm classifier |
|---|---|
| height feature | 83.3% |
| bgei | 85.7% |
| gei | 97.6% |
| bgei + height and step length (bgei-height-step integration) | 94.0% |
| gei + height and step length (gei-height-step integration) | 100.0% |

height as a method for people identification has the lowest result, which is expected because many people have the same or similar height. bgei has a better result than height of a person but a lower one than gei. gei as a method for gait recognition has a good overall result of 97.6%. when height and step length of a person were used along with gei and bgei, the overall identification results improved. when bgei was used along with height and step length of a person (bgei-height-step integration), the result is 94.0%, compared with 85.7% when bgei was used as a stand-alone method for gait identification. likewise, the result of gei-height-step integration is better than that of gei as a stand-alone method: 97.6% when only gei was used and 100% with gei-height-step integration. fig. 5 shows the obtained results for all methods used, while fig. 6 shows a comparison between the methods.

fig. 5 obtained results for used methods

fig. 6 comparison between used methods

6. conclusion
gait recognition is an interesting means of identification. people may be identified with different methods such as fingerprint, retina and iris recognition, facial recognition, voice recognition etc. most of the above-mentioned methods require some interaction with a person during the identification process. gait recognition methods ordinarily do not need any interaction with a person during identification, and that is the main advantage of this type of method. there are two approaches to gait recognition: the model-based approach and the appearance-based approach.
in this paper we presented some additional features that can be used along with well-known appearance-based gait recognition methods such as gei and bgei. these additional features are height and step length of a person. the experimental research showed that integrating the features of well-known gait recognition methods, such as gei and bgei, with additional features such as height and step length of a person improves the accuracy of identification compared with stand-alone use of the appearance-based methods (from 85.7% to 94.0% for bgei and from 97.6% to 100% for gei). the experimental results show that when gei-height-step integration was used, the people identification result reaches 100% on the custom dataset used.

references
[1] j. han and b. bhanu, "individual recognition using gait energy image", ieee trans. pattern anal. mach. intell., vol. 28, no. 2, pp. 316–322, feb. 2006.
[2] s. sivapalan, d. chen, s. denman, s. sridharan and c. fookes, "the backfilled gei-a cross-capture modality gait feature for frontal and side-view gait recognition", in proceedings of the international conference digital image computing techniques and applications (dicta), ieee, 2012, pp. 1–8.
[3] s. sivapalan, d. chen, s. denman, s. sridharan and c. fookes, "gait energy volumes and frontal gait recognition using depth images", in proceedings of the international joint conference on biometrics (ijcb), ieee, 2011, pp. 1–6.
[4] m. hofmann, s. bachmann and g. rigoll, "2.5 d gait biometrics using the depth gradient histogram energy image", in proceedings of the 5th international conference biometrics: theory, applications and systems (btas), ieee, 2012, pp. 399–403.
[5] p. arora and s. srivastava, "gait recognition using gait gaussian image", in proceedings of the 2nd international conference signal processing and integrated networks (spin), ieee, 2015, pp. 791–794.
[6] y. iwashita, k. uchino, and r.
kurazume, "gait-based person identification robust to changes in appearance", sensors, vol. 13, no. 6, pp. 7884–7901, june 2013. [7] a. ramakić, d. sušanj, k. lenac and z. bundalo, "depth-based real-time gait recognition", j. circuits, syst. comput., vol. 29, no. 16, p. 2050266, 2020. [8] k. lenac, d. sušanj, a. ramakić and d. pinčić, "extending appearance based gait recognition with depth data", appl. sci., vol. 9, no. 24, p. 5529, dec. 2019. [9] a. ramakić, z. bundalo and d. bundalo, "a method for human gait recognition from video streams using silhouette, height and step length", j. circuits, syst. comput., vol. 29, no. 7, p. 2050101, june 2020. [10] p. chattopadhyay, a. roy, s. sural and j. mukhopadhyay, "pose depth volume extraction from rgb-d streams for frontal gait recognition", j. vis. commun. image represent., vol. 25, no. 1, pp. 53–63, jan. 2014. [11] k. bashir, t. xiang and s. gong, "gait recognition using gait entropy image", in proceedings of the 3rd international conference on imaging for crime detection and prevention, 2009, pp. 1–6. [12] j. portillo-portillo, r. leyva, v. sanchez, g. sanchez-perez, h. perez-meana, j. olivares-mercado, k. toscano-medina and m. nakano-miyatake, "a view-invariant gait recognition algorithm based on a joint-direct linear discriminant analysis", appl. intell., vol. 48, no. 5, pp. 1200–1217, may 2018. [13] a. o. lishani, l. boubchir, e. khalifa and a. bouridane, "human gait recognition based on haralick features", signal, image video process., vol. 11, no. 6, pp. 1123–1130, sep. 2017. [14] m. rudek, n.m. silva, j.p. steinmetz and a. jahnen, "a data-mining based method for the gait pattern analysis", “facta univ. mech. eng., vol. 13, no. 3, pp. 205-215, 2015. [15] s. zheng, j. zhang, k. huang, r. he and t. tan, "robust view transformation model for gait recognition", in proceedings of the international conference on image processing (icip), ieee, 2011, pp. 2073–2076. [16] s. yu, d. tan and t. 
tan, "a framework for evaluating the effect of view angle, clothing and carrying condition on gait recognition", in proceedings of the 18th international conference on pattern recognition (icpr), vol. 4, ieee, 2006, pp. 441–444. [17] "the institute of automation, chinese academy of sciences (casia)", link: http://www.cbsr.ia.ac.cn/ english/gait%20databases.asp, (accessed: 25.03.2021.) [18] m. hofmann, j. geiger, s. bachmann, b. schuller and g. rigoll, "the tum gait from audio, image and depth (gaid) database: multimodal recognition of subjects and traits", j. vis. commun. image represent., vol. 25, no. 1, pp. 195–206, jan. 2014. facta universitatis series: electronics and energetics vol. 35, no 2, june 2022, pp. 269-282 https://doi.org/10.2298/fuee2202269p © 2022 by university of niš, serbia | creative commons license: cc by-nc-nd original scientific paper wk-fnn design for detection of anomalies in the computer network traffic danijela protić1, miomir stanković2, vladimir antić3 1center for applied mathematics and electronics, belgrade, serbia 2mathematical institute of sasa, belgrade, serbia 3center for applied mathematics and electronics, belgrade, serbia abstract. anomaly-based intrusion detection systems identify abnormal computer network traffic based on deviations from the derived statistical model that describes the normal network behavior. the basic problem with anomaly detection is deciding what is considered normal. supervised machine learning can be viewed as binary classification, since models are trained and tested on a data set containing a binary label to detect anomalies. weighted k-nearest neighbor and feedforward neural network are highprecision classifiers for decision-making. however, their decisions sometimes differ. in this paper, we present a wk-fnn hybrid model for the detection of the opposite decisions. it is shown that results can be improved with the xor bitwise operation. 
the sum of the binary “ones” is used to decide whether additional alerts are activated or not.

key words: wk-fnn, anomaly detection, weighted k-nearest neighbor, feedforward neural network

received october 11, 2021; received in revised form december 6, 2021
corresponding author: danijela protić, center for applied mathematics and electronics, belgrade, serbia (e-mail: adanijela@ptt.rs)

1. introduction
due to the enormous increase in computer applications in the last few decades, the need for protection of computer networks has multiplied [1]. intrusion detection systems (idss) are the main defense of the network infrastructure, used to detect attacks or to indicate anomalies in the behavior of the computer network. signature-based (misuse) idss proactively detect the presence of known maliciousness. the most practical method to detect the signature of malicious content is to measure the similarity between the detected pattern of current network activity and the already known patterns of various types of malicious attacks [2]. anomaly detection is performed by detecting changes in system behavior or usage patterns [3]. the identification of anomalies in the network is essential to diagnose attacks or failures that seriously affect the performance and security of the computer network [4, 5]. the goal of an anomaly-based ids is to proactively detect any activity or event on a host computer or network that shows a deviation from normal network behavior [2]. in order to provide a suitable solution for the detection of anomalies in the computer network, the concept of normality is fundamental. the idea of normality is usually introduced through a formal model that expresses the relationship between the variables involved in the dynamics of the system, so that an event is recognized as abnormal when its degree of deviation from the profile or behavior of the system, as specified by the normality model, is high enough [6].
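the xor-based comparison named in the abstract can be illustrated with a minimal sketch. it assumes that both classifiers output one binary label per instance (0 = normal, 1 = anomaly); the function names, the example labels and the threshold rule (alert when the number of disagreements reaches a user-chosen threshold) are assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Sequence


def disagreement_bits(wknn_labels: Sequence[int],
                      fnn_labels: Sequence[int]) -> List[int]:
    """Bitwise XOR of two binary label sequences: 1 marks an opposite decision."""
    if len(wknn_labels) != len(fnn_labels):
        raise ValueError("both classifiers must label the same instances")
    return [a ^ b for a, b in zip(wknn_labels, fnn_labels)]


def alert_needed(bits: Sequence[int], threshold: int) -> bool:
    """Hypothetical rule: raise an additional alert once the number of ones
    (opposite decisions) reaches the threshold."""
    return sum(bits) >= threshold


# Hypothetical outputs of the two classifiers for five sessions.
wknn = [0, 1, 1, 0, 1]
fnn = [0, 1, 0, 0, 0]

bits = disagreement_bits(wknn, fnn)
print(bits, sum(bits), alert_needed(bits, threshold=2))
```

xor is a natural choice here because it yields 1 exactly when the two decisions differ, so summing the result counts the disagreements without any further bookkeeping.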
in the last few decades, machine learning has started to play an important role in anomaly detection [6, 7, 8]. in supervised machine learning, anomaly detection can be thought of as a kind of binary classification, since the data sets for training and testing the models contain binary labels: one for normal observations and one for abnormal observations. it should be noted that the troubleshooting data set can be quite unbalanced in detecting anomalies. therefore, it is important to use some data transformation algorithms prior to supervised learning. in this article we propose a three-step algorithm that removes all irrelevant features from the kyoto 2006+ dataset and normalizes the instances so that the influence of one feature cannot dominate the others. after pre-processing is completed, there were nine features left to train two binary classifiers, namely the weighted k-nearest neighbor (wk-nn) and the feedforward neural network (fnn). the classifiers show a high precision in decision making but, in some cases their decisions are different. the proposed wk-fnn hybrid model recognizes the opposite decisions based on a bitwise exclusive or (xor) operation between the outputs of the classifiers. the binary sum of the opposing decisions is used as the basis for the additional warnings. two alerts are combined. trigger alert reacts to the opposite decisions and threshold-based alert allows users to prioritize alerts that are rated as critical. 2. literature review since the nature of the features and the number of instances determine the applicability of anomaly detection techniques, the analysis of the high-dimensional data sets becomes a challenge for researchers [9, 10]. in the last few decades, researchers have investigated the intrusion detection systems for various purposes and on the different datasets. in [11] and [12] the authors compare the darpa98, kdd cup ’99, nsl-kdd, kyoto 2006+ and caida datasets. 
in addition, the authors in [13] have compared signature-based and anomaly-based classification and examined the iscx2012, cic-ids-2017 and cse-cic2018 datasets in the context of feature selection and attack types. in [14] the authors describe the functionality of the adfa-ld and adfa-wd datasets and compare them with the darpa98, kdd cup ’99, nsl-kdd and cic-ids-2017 datasets. the datasets are simulated or captured from real computer network traffic, and differ in size, number of features, purpose, type of attacks, etc. the main characteristics of the above datasets are summarized in table 1.

table 1 description of the datasets

| dataset | type of the attacks | features | kind of traffic | description |
|---|---|---|---|---|
| adfa-ld and adfa-wd | hydra-ftp, hydra-ssh, adduser, java-meterpreter, meterpreter, webshell | 26 | from the host for normal activities, with user behavior ranging from web browsing to latex document preparation | created from the evaluation of the system-call-based hids; linux and unix os (ld) and windows (wd) |
| awid | attacks on 802.11 (authentication request, probe request, injection, arp flooding) | 156 features extracted from each packet | emulated (small network, 11 clients) | wlan traffic in packet-based format; 37 million packets captured in one hour |
| caida | ddos | network traffic traces | real (collected on high-speed monitors) | collected on a commercial backbone link from 2008 to 2019; does not contain diversity of attacks |
| cic-ids2017 | botnets, cross-site scripting, dos, ddos, goldeneye, hulk, rudy, slowhttptest, slowloris | more than 80 | emulated (small network) | captured over a period of 5 days; contains network traffic in packet-based and bidirectional flow-based format |
| cse-cic2018 | brute force, heartbleed, botnet, dos, ddos, web attacks, infiltration of the network from the inside | more than 80 | emulated (simulated scenarios) | 10 days of network traffic and log files of 50 machines on the attacker side and 420 pcs and 30 servers in the victim organization |
| darpa98 | dos, privilege escalation (r2l and u2r), probing | 41 | emulated (small network) | 7 weeks of network traffic in packet-based format and audit log |
| iscx 2012 | scenarios: infiltrating the network from the inside, http dos, ddos using an irc botnet, ssh brute force attack | 20 | emulated (small network) | 7 days of packet network traffic observed |
| kdd cup ’99 | dos, privilege escalation (r2l and u2r), probing | 42 | emulated (small network) | derived from the darpa98 dataset; five weeks of network traffic in packet-based format |
| kyoto 2006+ | attacks against honeypots (dos, exploits, malware, port scans, shellcode) | 24 | real (honeypots and regular servers) | 3 years of real packet-based network traffic; packets converted into sessions |
| nsl-kdd | dos, privilege escalation (r2l and u2r), probing | 42 | emulated (small network) | derived from the kdd cup ’99 dataset; does not contain redundant records in the training set nor duplicates in the test set |

as shown in table 1, all datasets, with the exception of the kyoto 2006+ dataset, are either simulated network data or come from actual network traffic that is mainly used for signature detection. the kyoto 2006+ dataset is also the only one intended for anomaly-based ids modelling. for these reasons, this study uses the kyoto 2006+ dataset as the basis for binary classification experiments with machine learning (ml) models. machine learning is effective in eliminating redundant and irrelevant data, increasing learning accuracy and improving comprehensibility of the results [15]. feature selection has a direct influence on the efficiency of the results and offers a way to reduce computation time, improve accuracy, and enable a better understanding of the classification models or the data.
in the case of an anomaly detection, the labels assigned to the data instances are usually in the form of binary values [16]. machine learning models can be very effective in learning normal or abnormal patterns from training data and in detection of the anomalies in the computer networks [17]. the kyoto 2006+ dataset is captured and created in actual network traffic to classify network traffic as normal or abnormal. since the purpose of this work is to present the hybrid classifier for improved anomaly detection in binary classification this data set is used in experiments. the kyoto 2006+ dataset is unbalanced data set in which the amounts of normal and abnormal data are unbalanced. in [18], the authors present a series of tests they carried out to assess the effectiveness of ml techniques in detecting anomalies and present the algorithms that gave the best results. in [19] the authors carried out experiments with 10 daily records from the kyoto 2006+ dataset and showed that accuracy decreases slightly when the number of features is reduced from 17 to 9 and the instances range from -1 to 1. in supervised machine learning, wk-nn has the highest accuracy of a variety of machine learning models. in [20] the author proposes a method that can detect large-scale attacks in real time with weighted k-nn classifiers. the key factor in developing an anomaly-based intrusion detection system is the selection of significant features for decision-making. a good feature selection for choosing meaningful and as few features as possible plays a key role in successful anomaly-based ids. in [21] the authors proposed a new learning algorithm for pseudo-neighbor elimination and anomaly detection based on the wk-nn model in order to minimize the effects of these distant neighbors. in [22] the authors examine the applicability of the feedforward architecture of neural networks for traffic prediction and compare the performance of different back-propagation algorithms. 
the prediction is made for various random aggregates of traffic flows. the performance analysis showed the effectiveness of the proposed method for an adequate choice of the learning algorithm. in [23] the authors approached an ids using a 2-layered feedforward neural network. in the training phase, the early-stop strategy is used to overcome the problem of overfitting in neural networks. the proposed system is assessed against the darpa dataset. the selected connections from the darpa dataset are preprocessed and the feature range is converted into [-1, 1]. these modifications affect final detection results in particular. in [24] the authors proposed an ids model that uses the feedforward neural network and back-propagation algorithms along with various optimization techniques to minimize the overall computational overhead while maintaining a high level of performance. the experimental results on the benchmark nsl-kdd dataset show that in some cases the accuracy of the proposed ids model is better than that of the other ids models. because of its high performance and low computational requirements, the proposed model was a suitable candidate for real-time implementation. in [25] the authors showed results on the accuracy of two fnn classifiers with short processing time when deciding on anomalies in the behavior of complex computer networks. in [26] the authors used a pc-generated offline data set to assess the performance of two neural network-based techniques. in this data set, each data point corresponds to a normal or anomaly class. it is assumed that the anomaly data is the intruder data, obtained by disabling some pc controllers, audio drivers, graphics drivers, etc. in this article, the authors took 15 randomly selected features from the log file, which contains 20,000 records. the authors have shown that the fnn classifiers are approximately 98% accurate.
hybrid models for anomaly detection are also the topic of various research. in the scenario given in [27], the authors propose a hybrid online-offline system in which the offline model, based on radius nearest neighbor, maintains the general properties of the network traffic, while the online model, based on the support vector machine, learns continuously; the two models work together to detect anomalies. the method is evaluated using the nsl-kdd 2009 dataset. this model achieved an accuracy of ~95% with known anomalies. it should be noted that the nsl-kdd dataset is a simulation of the computer network traffic on a middle-size american military base [11]. in order to improve the detection performance and to reduce the tendency towards frequent attacks, a two-stage hybrid method based on binary classification and the k-nn technique is proposed in [28]. first, binary classifiers and an aggregation module are used to efficiently identify the exact classes of network connections. afterwards, the connections whose classes remain uncertain are classified by the k-nn algorithm. the second step builds on the results of the first step and is a useful addition to it. by combining the two steps, the proposed method achieves reliable results on the nsl-kdd data set [11]. network alerts are a critical aspect of network performance monitoring because they are designed to provide information technology (it) administrators with quick insight into network problems. therefore, network alerting should be an important consideration for those choosing their network alerting tools. in [29] the author provides information on the four main types of network alerts. real-time alerts periodically or continuously scan all areas of the network for network behavior problems. the time between each network pass is an important consideration as it determines how quickly network problems are identified.
intelligent alerts provide details about the problem, when and where it occurred, and which areas of the network are affected. flexible delivery alerts are network monitoring notification tools that can be configured for scheduled and hourly alerts to ensure alerts are received at the right time. critical and tiered alerts are tools for minimizing the number of network notifications. network monitoring alerts, also known as threshold alerts, are tools that support critical and tiered alerts, so that the user can prioritize alerts that are critical or violate a preconfigured network configuration. systems with tiered alerting assign problems to one of several categories. alerts are processed according to the importance of the category. in [30] the authors confirm that alert ranking classifies alerts according to their dangerousness. the alert ranking tactic requires that the functionality responsible for the alert classification should not be computationally expensive; otherwise the advantage of the quick response, which is obtained by a prioritized reaction to dangerous alerts, is negated. in [31] the authors explain that not all classification algorithms are equally accurate. therefore, it is important to carefully select the criteria that can accurately classify the alerts based on the specific security needs of an organization. in [32] the authors describe the efficiency of the basic methods for rule-based alert classification and explain that engineers usually concentrate primarily on critical alerts, but not on errors and warnings. they claim that engineers should investigate more alerts. at the same time, they find that a lot of time is wasted on investigating non-serious warnings (low precision), while many serious alerts are still missed. in [33] the authors divide alerts into low- and high-level alerts and point out that high-level alert management is a potential task that helps the administrator to analyze alerts correctly and to allocate time and effort.

3. data collection

the kyoto 2006+ dataset is a publicly available and widely used dataset in network-based intrusion detection research. the dataset includes more than three years of actual traffic data collected from honeypots (solaris 8 for intel, windows xp (no patch, sp2, fully patched), nepenthes, others), darknet sensors, and other systems (mail server to collect various types of mails, web crawler developed by ntt information sharing platform laboratories, windows xp to evaluate malware activities) deployed on five different computer networks inside and outside kyoto university [34]. the kyoto 2006+ dataset was developed by deploying honeypots in the network, but it does not provide any details on the types of attack [13]. in addition, the ids bro has been used to convert packet-based traffic into a format called sessions. ids bro is a signature- and behavior-based analysis framework that provides detailed data on the hypertext transfer protocol (http), the domain name system (dns), the secure shell (ssh) communication protocol and strange network behavior [35]. thanks to its analysis engine, it is suitable for high performance network monitoring, protocol analysis, and real-time application layer status information. the bro event engine is responsible for receiving the internet protocol (ip) packets and converting them into events forwarded to the policy script interpreter, which then produces an output [36]. during the observation period (from 2006 to 2009) more than 50 million sessions with normal traffic, 43 million sessions with known attacks and 425 thousand sessions with unknown attacks were recorded. each session includes 24 features: 14 statistical features derived from the kdd cup ’99 dataset and 10 additional flow-based features (ip addresses, ports, and duration) [11, 37].
a feature label indicates the presence of attacks [38]. in the original data set, there were three labels: 1 for normal sessions, -1 for known attacks, and -2 for unknown attacks. however, since unknown attacks are very rare in the dataset (~0.7%), we assigned the same label (-1) to known and unknown attacks, which leads to binary classification [39]. the main problem associated with the kyoto 2006+ dataset is its size. in this study, this problem is solved with a pre-processing algorithm, which removes all irrelevant features (categorical features, statistical features related to the connection duration, and features for further analyses) and normalizes the instances of the relevant features with a hyperbolic tangent function to the range [-1, 1]. after the pre-processing is completed, features 5-13 remain for the evaluation of the models, and the feature label identifies the session as normal or abnormal [19, 40]. table 2 shows the description of the features used in the experiments. in this research, the notation of the instances is as follows: the number of instances in a daily record is referred to as the total number of instances, the number of instances labelled with 1 is referred to as the number of normal instances, while the number of instances labelled with -1 denotes the number of anomalous instances.

table 2 description of the features from the kyoto 2006+ dataset
feature | description
count | the number of connections whose source ip address and destination ip address are the same as those of the current connection in the past two seconds.
same_srv_rate | % of connections to the same service in the count feature.
serror_rate | % of connections that have ‘syn’ errors in the count feature.
srv_serror_rate | % of connections that have ‘syn’ errors in the srv_count feature (% of connections whose service type is the same as that of the current connection in the past two seconds).
dst_host_count | among the past 100 connections whose destination ip address is the same as that of the current connection, the number of connections whose source ip address is also the same as that of the current connection.
dst_host_srv_count | among the past 100 connections whose destination ip address is the same as that of the current connection, the number of connections whose service type is also the same as that of the current connection.
dst_host_same_src_port_rate | % of connections whose source port is the same as that of the current connection in the dst_host_count feature.
dst_host_serror_rate | % of connections that have ‘syn’ errors in the dst_host_count feature.
dst_host_srv_serror_rate | % of connections that have ‘syn’ errors in the dst_host_srv_count feature.
label | indicates whether the session was an attack or not; ‘1’ means normal, ‘-1’ means a known attack was observed in the session, and ‘-2’ means an unknown attack was observed in the session.

4. wk-fnn model

a classification model generally maps the input data to a specific target and determines which label to assign to new, unlabeled data. with binary classification, a classifier assigns the input data to one of two classes. the wk-fnn hybrid model is based on two binary classifiers. the wk-nn classifier is a lazy learner that stores the training data and labels, and waits for the test data. instead of focusing on building a general model, it works by storing instances of the training data together with their classes. the fnn, an eager learner, creates a classification model based on the training data set before it receives data for prediction. the basic idea of the wk-nn is to extend the k-nearest neighbor (k-nn) algorithm, which stores all instances corresponding to the training data in n-dimensional space. predictions for a new instance x are made by searching the entire training set for the k closest neighbors and summarizing the output variable for these cases.
the classification is based on a simple majority vote among these neighbors. wk-nn extends the k-nn such that instances of the training set which are particularly close to the new instance have more weight in the decision than those that are more distant. the main idea is to make distant neighbors less influential than close ones in the majority vote, by giving more weight to the nearest points and less to the more distant ones [41, 42]. to do this, the distances

$$d_w(\mathbf{x}, \mathbf{y}) = \sqrt{\sum_{i=1}^{p} (x_i - y_i)^2}$$

are converted into weights. the simplest conversion function is the inverse of the distance. the closest k points are weighted with weights $w = 1 / d_w(\mathbf{x}, \mathbf{y})^2$ (the weight decreases with increasing distance). the fnn consists of a series of layers with highly connected neurons in each layer, with the final layer producing the outputs that relate the inputs to the desired output, so that

$$y_i(\mathbf{w}, \mathbf{W}) = F_i\left(\sum_{j=1}^{q} W_{ij}\, f_j\left(\sum_{l=1}^{m} w_{jl}\, x_l + w_{j0}\right) + W_{f0}\right) \qquad (1)$$

where $f_j$ and $F_i$ denote the hidden and output layer transfer functions, m represents the number of inputs $x_l$, q represents the number of hidden-layer neurons feeding each output $y_i$, $\mathbf{w}$ and $\mathbf{W}$ are weight matrices, and $w_{j0}$ and $W_{f0}$ are biases [43]. the fnn is trained through an iterative process that modifies the weights so that the given inputs map to an appropriate response. in this way, the inputs are classified according to the target classes. in general, fnns have a large number of parameters, which can lead to estimation problems when converging to a correct set of parameter values [44]. for this reason, the weights are updated according to the levenberg-marquardt (lm) algorithm [45, 46]. the design of the wk-fnn model is based on the wk-nn and fnn binary classifiers, which work in parallel and decide on the anomaly in the behavior of the computer network. the basic idea is to train wk-nn and fnn with the same training set and evaluate the high-precision classifiers (figure 1).
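the inverse-squared-distance weighting described above for the wk-nn can be sketched in a few lines of python. this is a minimal illustration under stated assumptions, not the authors' matlab implementation: the function name `wknn_predict`, the toy data, and the small constant `eps` that guards against division by zero are all introduced here for the example.

```python
import math
from collections import defaultdict


def wknn_predict(train_X, train_y, x, k=10, eps=1e-12):
    """Classify x by an inverse-squared-distance weighted vote of its k nearest neighbours."""
    # Euclidean distance from x to every stored training instance (lazy learner:
    # nothing is fitted in advance, the whole training set is searched per query).
    dists = [(math.dist(row, x), label) for row, label in zip(train_X, train_y)]
    dists.sort(key=lambda pair: pair[0])

    votes = defaultdict(float)
    for d, label in dists[:k]:
        # w = 1 / d^2; eps is an assumption that avoids dividing by zero
        # when x coincides with a training instance.
        votes[label] += 1.0 / (d * d + eps)
    return max(votes, key=votes.get)


# Hypothetical toy data: label 1 = normal, label -1 = anomaly.
train_X = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.9, 1.0), (1.0, 0.9)]
train_y = [1, 1, -1, -1, -1]

print(wknn_predict(train_X, train_y, (0.2, 0.1), k=5))  # prints 1
```

with k = 5 a plain majority vote over this toy data would return -1, because three of the five neighbours are anomalies; the weighted vote returns 1 because the two normal instances lie much closer to the query point.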
subsequently, the classification of the unknown network traffic is carried out by both classifiers. the decisions about the anomaly are transmitted to the xor block, where the number of opposing decisions is calculated. finally, the percentage of opposing decisions triggers an alert.

fig. 1 classifiers’ training

the wk-fnn model is a three-layer structure. the first layer classifies the network traffic with both the wk-nn and the fnn. a bit-by-bit xor operation is carried out in the second layer. the third layer of the wk-fnn marks the opposing decisions (fig. 2).

fig. 2 wk-fnn model

through the classification, both classifiers decide about the unknown network traffic, and the outputs of each of the classifiers (decision about normal network behavior or the anomaly) are then passed on to the xor block, where the bitwise ‘exclusive or’ operation is performed. the opposing decisions are recognized by performing the xor logical operation on the classification results, which is logically true (1) if exactly one of the two outputs is non-zero. otherwise the result is logical false (0). the sum of the opposing decisions in the decision block is then calculated as follows

$$sum_{out} = \sum_{k=1}^{length(dataset)} xor(out1_k, out2_k) \qquad (2)$$

where $out1_k$ and $out2_k$ represent the k-th results of the classification, and $out_k = xor(out1_k, out2_k)$. the result is then passed on to the decision-making engine. the opposing decisions indicated by the bit-by-bit xor operation can generate different types of alerts, depending on the organizational structure and information security requirements such as confidentiality, integrity and data availability. the alerts can be sent to the network administrator or to another ids.
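eq. (2) amounts to counting the positions at which the two decision vectors differ. a minimal python sketch follows; the helper names `to_bit` and `opposing_decisions` and the ten example decisions are hypothetical, and the mapping of labels to bits (1 for normal, 0 for anomaly) is assumed from the labelling of the dataset.

```python
def to_bit(label):
    """Map a class label to a bit: 1 (normal) -> 1, -1 (anomaly) -> 0."""
    return 1 if label == 1 else 0


def opposing_decisions(out1, out2):
    """Return (sum_out, percentage) for two equally long binary decision vectors."""
    if len(out1) != len(out2) or not out1:
        raise ValueError("decision vectors must be non-empty and of equal length")
    # Eq. (2): bitwise XOR is 1 only where the two classifiers disagree.
    sum_out = sum(a ^ b for a, b in zip(out1, out2))
    return sum_out, 100.0 * sum_out / len(out1)


# Hypothetical decisions of the two classifiers on ten sessions.
wknn_labels = [1, 1, -1, 1, -1, 1, 1, 1, -1, 1]
fnn_labels = [1, -1, -1, 1, -1, 1, 1, 1, 1, 1]

out1 = [to_bit(v) for v in wknn_labels]
out2 = [to_bit(v) for v in fnn_labels]
sum_out, percent = opposing_decisions(out1, out2)
print(sum_out, percent)  # prints 2 20.0
```

the function does not say which classifier is right; it only reports how often the two disagree, which is the quantity the decision-making engine receives.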
it should be noted that the additional anomaly alerts are separate from regular it alerts. therefore, it is necessary to define an anomaly alert promotion rule in order to generate an it alert based on the anomaly alerts. the promotion rule of the wk-fnn model is based on the ratio between the number of opposing decisions and the total number of decisions of the classifiers, expressed as a percentage. a decision is made based on a linear threshold scale. the basic idea behind the decision is that the number of contradicting decisions is low if the two classifiers are really highly accurate. otherwise, the results will not be reliable. instead of making additional decisions about what is normal or abnormal, the decision block points out the difference in the classifiers’ decisions. if both choose ‘normal’ or both choose ‘anomaly’, their decisions are not different. otherwise, their decisions differ, and the result of the xor operation is binary one. as ‘normal’ traffic corresponds to binary 1 (label equal to 1) and ‘anomaly’ corresponds to binary 0 (label equal to -1), the number of opposing decisions equals the sum of all xor results, because identical decisions result in zero after the xor operation. the number of opposing decisions $sum_{out}$ is given in eq. 2. divided by the total number of decisions, $length(dataset)$, and multiplied by 100, it gives the percentage of opposing decisions. for this experiment, we have divided the priority levels of the alert scale into five different categories: negligible/insignificant alerts (whitelisted), potential threats (they have no direct influence on network traffic and the network structure), warnings (provide information about the risks), silent alarms (critical with ticketing) and high priority alarms (signal an attack). the scale is linear and divided into five groups, each two percentage points wide, covering the range from 0 to 10%.
it should be noted that the scale can be chosen differently, depending on the security needs of the organization.

5. experimental results

the experiments are carried out on three days of computer network traffic recorded on the kyoto university computer network in february 2007. all models are selected based on processing time and memory usage and are simulated in the matlab classification learner platform for a 64-bit windows os installed on a machine with an intel core i7 2.7 ghz cpu and 16 gb of ram. the wk-nn model approximates an instance by the weighted sum of its 10 nearest neighbors (k = 10). weights are calculated based on inverse distances. the fnn with one hidden layer, nine inputs, nine nodes in the hidden layer and one output node is trained with the lm algorithm. in order for the lm algorithm to work correctly, the hyperbolic tangent activation function is used for each node, since it is differentiable, centered around 0, and its output range is [-1, 1]. the weights are initialized to small random numbers, because the optimization begins as a gradient descent (gd) algorithm, which speeds up the convergence of the lm algorithm and minimizes wrong approximations [20]. the wk-fnn model design is tested as follows:
▪ each of the three daily sets, consisting of 57278, 57279 and 58317 instances, is divided into two subsets: set1, with 75% of the instances, is used to train and test the classifiers, while set2, with 25% of the instances, is used for the wk-fnn tests;
▪ set1 is then divided into two groups of instances: 70% are used to train the classifiers and the remaining instances are used to test both models;
▪ set2 is used for testing the opposing decisions of the classifiers; the results are passed on to the xor module;
▪ the sum of all opposing decisions is sent to the alarm detector.
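the splitting scheme in the list above can be sketched as follows. the shuffling before the split, the fixed seed, the rounding by floor division, and the function name `split_daily_record` are assumptions made for this example; the text does not state how the instances are assigned to the subsets.

```python
import random


def split_daily_record(instances, seed=0):
    """Split one daily record into (train, test, verify) subsets.

    set1 = 75% of the instances, of which 70% train and 30% test the classifiers;
    set2 = the remaining 25%, used to verify the wk-fnn model.
    """
    rows = list(instances)
    random.Random(seed).shuffle(rows)  # assumption: random assignment

    n_set1 = (3 * len(rows)) // 4      # 75% of the daily record
    set1, set2 = rows[:n_set1], rows[n_set1:]

    n_train = (7 * len(set1)) // 10    # 70% of set1
    return set1[:n_train], set1[n_train:], set2


train, test, verify = split_daily_record(range(1000))
print(len(train), len(test), len(verify))  # prints 525 225 250
```

for a record of 1000 instances the three subsets hold 52.5%, 22.5% and 25% of the data, which are the shares obtained by applying 70% and 30% to the 75% subset.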
in summary, 52.5% and 22.5% of the instances of each daily record are used to train and test the classifiers, respectively, and 25% of the instances are used to verify the wk-fnn model. the performance of the classifiers is calculated in terms of accuracy (acc). acc represents the ratio of the number of correctly classified instances to the total number of instances, given as follows [47]

$$acc = \frac{TP + TN}{TP + TN + FP + FN} \qquad (3)$$

a true positive (tp) result indicates that the anomaly has been correctly identified as “anomaly”. a true negative (tn) means that the ids has correctly classified the normal behavior as “normal”. a false positive (fp) indicates a misclassification of the normal behavior of the network as an “anomaly”. a false negative (fn) indicates abnormal behavior of the network that has been mistakenly assigned to the “normal” class. the accuracy results for the classifiers and the number of opposing decisions recognized by the wk-fnn model are shown in table 3. the opposing decisions are calculated for 25% of the instances from each daily record.

table 3 accuracy of the classifiers and the number of opposing decisions
instances | accuracy fnn [%] | accuracy wk-nn [%] | opposing decisions [%] | opposing decisions [instances]
57278 | 99.3 | 99.5 | 8.08 | 1157
57279 | 99.3 | 99.3 | 3.18 | 456
58317 | 99.0 | 99.1 | 0.67 | 98

in table 3, the opposing decisions [instances] = $sum_{out}$ (every binary 1 triggers an alert), and the ratio of the number of opposing decisions to the total number of decisions (anomaly score) is given by the opposing decisions [%] = $\frac{sum_{out}}{length(dataset)} \cdot 100\%$. it can be seen that there is no relationship between the number of instances in the daily set and the opposing decisions. the anomaly score ranges from 0 to 10% and is used as the threshold value for the additional alert. there are some concerns about the priorities and the percentages associated with the conflicting decisions.
a higher percentage of opposing decisions indicates greater uncertainty in the classification and incomplete knowledge of the event, which is not related only to the decisions of the classifiers [48]. in general, uncertainty in decisions can arise from the following sources: (1) data errors (uncertainties about past events), (2) forecast errors (uncertainties about future events) and (3) model (residual) errors (differences between what is observed and what the model shows). the wk-fnn supports resolving the uncertainty by calculating the percentage of the opposing decisions of the classifiers, but it cannot determine the probability of a certain event occurring. it is designed to provide a warning about the conflicting decisions of the classifiers. the decision makers, knowing all the possible versions of the resolved issues, then use this auto-generated alert to resolve information security related issues in their organization. the alert scale presented in this paper was chosen to indicate a low probability of serious effects on network security for a small percentage of opposing decisions and a high probability of an attack on the computer network for a high percentage of opposing decisions. it should be noted that there are other options for selecting the decision criteria, the threshold value ranges, and the alert colors, which can be modified depending on the additional protection requirements and the sensitivity of the information to the potential threats. in [49] multi-criteria decision-making (mcdm) is presented. the authors examined the changes in the measurement scale and the formulation of criteria. in [50] the authors proposed evaluation metrics to measure the effectiveness of collaborative decisions based on the likelihood of trust in collaborative decision-making processes. in [51] the author proposes a prioritization of alerts, which can be achieved by integrating several methods.
in the experiments presented in this paper, the percentages of the opposing decisions are divided into five alert groups (see table 4).

table 4 linear threshold scale and ranges of the opposing decisions
threshold range [%] | alert grouping and colouring | observed opposing decisions [%]
0-1.99 | negligible alert (black) | 0.67
2-3.99 | potential threat (blue) | 3.18
4-5.99 | warning (green) | -
6-7.99 | silent alarm (orange) | -
≥ 8.00 | high priority alarm (red) | 8.08

although simple manual rules cannot always adequately capture the complex and interacting patterns of factors that influence the priority of the alerts, the manual rules proposed here were used to classify the five different priority levels. generally, the levels can be divided into three main categories: (1) errors/failures (negligible alert, potential threat), (2) warnings and (3) critical level (silent alarm, high-priority alarm) [32]. a negligible alert means that the alert is whitelisted (the lowest probability of serious effects on network security and the smallest percentage of opposing decisions). the administrator can exclude certain activities which generate alerts, based on the user analytics. for the purposes of this research, the percentage for the negligible alert is set to be less than 2%. a potential threat means that the alert may result from network disturbances and has no negative impact on business. the warning displays the known alert sources, acts as an information aggregator, provides information about the risks, and generates a hazard message. the silent alarm signals a high-level discrepancy in the decisions of the classifiers and causes a ticket to be issued (the proof of authentication or authorization must be verified). a high priority alarm signals the highest probability of an attack (the highest percentage of opposing decisions).
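the threshold scale of table 4 maps a percentage of opposing decisions to an alert group. the sketch below is one possible encoding; the names `ALERT_SCALE` and `alert_group` are hypothetical, and treating the ranges as contiguous half-open intervals (so that a value such as 1.995 still counts as negligible) is an assumption, since the table lists the bounds to two decimals only.

```python
# Upper bounds (exclusive) of the first four groups, in percent.
ALERT_SCALE = [
    (2.0, "negligible alert"),
    (4.0, "potential threat"),
    (6.0, "warning"),
    (8.0, "silent alarm"),
]


def alert_group(score_percent):
    """Return the alert group for a percentage of opposing decisions."""
    if score_percent < 0:
        raise ValueError("percentage of opposing decisions cannot be negative")
    for upper, name in ALERT_SCALE:
        if score_percent < upper:
            return name
    # Everything from 8% upwards is treated as the highest priority.
    return "high priority alarm"


# Percentages of opposing decisions reported in table 3.
for score in (0.67, 3.18, 8.08):
    print(score, alert_group(score))
```

applied to the three daily records, the mapping yields a negligible alert, a potential threat and a high priority alarm. because the scale is a plain list of bounds, an organization with different security needs could swap in other ranges without touching the function.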
the ranking list can be adapted according to a few other factors relating to the dataset (label, total number of instances) and metrics (accuracy, precision, recall), and, depending on the changes to the system, new types of alerts can be added [32]. the ranking can be combined with methods that reduce or reclassify a given list of rankings [52].

6. conclusion

this article introduces the design of a wk-fnn hybrid model that warns of opposing decisions about anomalies in the computer network. the model consists of a classification module, an xor block and a decision-making engine. in the classification module two high-precision binary classifiers work in parallel. the classifiers take into account 9 features with normalized instances to decide whether the network traffic is abnormal or not. the decisions of the classifiers are passed on to the xor block, where the exclusive or binary operation is carried out. binary 1 triggers an additional anomaly alert, which is sorted into one of the predefined alert groups. the results show the presence of additional alerts related to the negligible alert, potential threat, and high-priority alarm.

acknowledgement: a part of this research was presented at the 21st international arab conference on information technology, 6th of october 2020, giza, egypt.

references
[1] d. protic, "neural cryptography," military technical courier, vol. 64, no. 2, pp. 483–492, 2016.
[2] j. sen and s. methab, "machine learning applications in misuse and anomaly detection," 2009. available: https://arxiv.org/ftp/arxiv/papers/2009/2009.06709.pdf.
[3] d. dasgupta and h. brian, "mobile security agents for the network traffic analysis," in proceedings of the darpa information survivability conference and exposition ii discex01, 2001, vol. 2, pp. 332–340.
[4] a. kind, m. p. stoecklin and x. dimitropoulos, "histogram-based traffic anomaly detection," ieee transactions on network and service management, vol. 6, no. 2, pp. 110–121, june 2009.
[5] p. čisar and s. marvić čisar, "ewma statistics and fuzzy logic in function of network anomaly detection," facta universitatis, series: electronics and energetics, vol. 32, no. 2, pp. 249–265, june 2019.
[6] m. h. bhuyan, d. k. bhattacharyya and j. k. kalita, "network anomaly detection: methods systems and tools," ieee communication surveys & tutorials, vol. 16, no. 1, pp. 303–336, first quarter 2014.
[7] v. hodge and j. austin, "a survey on outlier detection methodologies," artificial intelligence review, vol. 22, no. 2, pp. 85–126, 2004.
[8] t. nguyen and g. armitage, "a survey of techniques for internet traffic classification using machine learning," ieee commun. surveys tutorials, vol. 10, no. 4, pp. 56–76, 2008.
[9] s. omar, a. ngadi and h. h. jebur, "machine learning techniques for anomaly detection: an overview," international journal of computer applications, vol. 79, no. 2, pp. 33–41, october 2013.
[10] c. jie, l. jiawei, w. shulin and y. sheng, "feature selection in machine learning: a new perspective," neurocomputing, vol. 300, pp. 70–79, 26 july 2018.
[11] d. protic, "review of kdd cup ’99, nsl-kdd and kyoto 2006+ datasets," military technical courier, vol. 66, no. 3, pp. 580–595, 2018.
[12] b. bohara, j. bhuyan, f. wu and j. ding, "a survey on the use of data clustering for intrusion detection system in cybersecurity," int. j. netw. secur. appl., vol. 12, no. 1, pp. 1–18, jan 2020.
[13] a. thakkar and r. lohiya, "a review of the advancement in the intrusion detection datasets," international conference on computational intelligence and data science (iccids 2019), procedia computer science, vol. 167, pp. 636–645, 2020.
[14] a. khraisat, i. gondal, p. vamplew and j. kamruzzaman, "survey of intrusion detection systems: techniques, datasets and challenges," cybersecurity, pp. 2–20, 2019.
[15] s. khalid, t. khalil and s. nasreen, "a survey of feature selection and feature extraction techniques in machine learning," in proceedings of the 2014 science and information conference, 2014, pp. 372–378.
[16] o. osanaiye, o. ogundile, f. aina and a. periola, "feature selection for intrusion detection system in a cluster-based heterogeneous wireless sensor network," facta universitatis, series: electronics and energetics, vol. 32, no. 2, pp. 315–330, june 2019.
[17] m. bahrololum, e. salahi and m. khaleghi, "machine learning techniques for feature reduction in intrusion detection systems: a comparison," in proceedings of the 2009 fourth international conference on computer sciences and convergence information technology, pp. 1091–1095, 2009.
[18] y.-g. cheong, k. park, h. kim, j. kim and s. hyun, "machine learning based intrusion detection systems for class imbalanced datasets," journal of the korea institute of information security and cryptology, vol. 27, no. 6, 2017, pp. 1385–1395.
[19] d. protic and m. stankovic, "detection of anomalies in the computer network behaviour," european journal of engineering and formal sciences, vol. 4, no. 1, pp. 7–13, 2020.
[20] ming-yang su, "real-time anomaly detection systems for denial-of-service attacks by weighted k-nearest neighbor classifier," expert systems with applications, vol. 38, no. 4, pp. 3492–3498, april 2011.
[21] j. dhar, a. shukla, m. kumar and p. gupta, "a weighted mutual k-nearest neighbour for classification mining," arxiv.org. submitted on 14 may 2020. https://arxiv.org/abs/2005.08640 [cs.lg].
[22] c. callegari, s. giordano and m. pagano, "neural network based anomaly detection," in proceedings of the 2014 ieee 19th international workshop on computer aided modeling and design of communication links and networks (camad), 2014, pp. 310–314.
[23] f. haddadi, s. khanchi, m. shetabi and v. derhami, "intrusion detection and attack classification using feed-forward neural network," in proceedings of the 2010 second international conference on computer and network technology, 2010, pp. 262–266.
[24] b. subba, s. biswas and s. karmakar, "a neural network based system for intrusion detection and attack classification," in proceedings of the 2016 twenty second national conference on communication (ncc), 2016, pp. 1–6.
[25] d. protic and m. stankovic, "a hybrid model for anomaly-based intrusion detection in complex computer networks," in proceedings of the 21st international arab conference on information technology, 6th of october 2020, giza, egypt, pp. 1–8.
[26] s. k. gutam and h. om, "computational neural network regression model for host based intrusion detection system," perspectives in science, vol. 8, pp. 93–95, september 2016.
[27] m. odiathevar, w. k. g. seah and m. frean, "a hybrid online offline system for network anomaly detection," in proceedings of the 2019 28th international conference on computer communications and networks (icccn), 2019, pp. 1–9.
[28] l. li, y. yu, s. bai, y. hou and x. chen, "an effective two-step intrusion detection approach based on binary classification and k-nn," ieee access, vol. 6, pp. 12060–12073, 2018.
[29] j. griffin, "all about network alerts + best tools," by solarwinds on october 29, 2020. available: https://logicalread.com/network-alerts/.
[30] f. ullah and m. ali babar, "architectural tactics for big data cybersecurity analytic systems: a review," the journal of systems and software, vol. 151, pp. 81–118, 2019.
[31] s. allier et al., "a framework to compare alert ranking algorithms," in proceedings of the reverse engineering (wcre), 19th working conference on. ieee, 2012.
[32] n. zhao, p. jin, l. wang, x. yang, r. liu, w. zhang, k. sui and d. pei, "automatically and adaptively identifying severe alerts for online service systems," in proceedings of the infocom, 2020.
[33] w. alhakami, "alerts clustering for intrusion detection systems: overview and machine learning perspectives," international journal of advanced computer science and applications, vol. 10, no. 5, pp. 573–582, 2019.
[34] j. song, h. takakura, y. okabe, m. eto, d. inoue and k. nakao, "statistical analysis of honeypot data and building of kyoto 2006+ dataset for nids evaluation," in proceedings of the 1st workshop on building anal. datasets and gathering experience returns for security, salzburg, april 10-13, 2011, pp. 29–36.
[35] k. demertzis, "the bro intrusion detection system," project: machine learning to cyber security, 2018, doi: 10.31140/rg.2.2.35333.40168.
[36] r. mccarthy, "network analysis with the bro security monitor," 2014, retrieved from https://www.admin-magazine.com/archive/2014/24/network-analysis-with-the-bro-network-security-monitor, 7 november 2021.
[37] kdd cup ‘99 dataset. [internet] http://kdd.ics.uci.edu/dataset/kddcup’99/kddcup’99.html, 2018.
[38] m. ring, s. wunderlich, d. scheuring, d. landes and a. hotho, "a survey of network-based intrusion detection data sets," arxiv:1903.02460v2 [cs.cr] 6 jul 2019, pp. 1–17.
[39] y.e maleh, "security and privacy management, techniques, and protocols," igi global, usa, 2018, pp. 266–267.
[40] d. protic and m. stankovic, "anomaly-based intrusion detection: feature selection and normalization instance to the machine learning model accuracy," european journal of engineering and formal sciences, vol. 1, no. 3, pp. 43–48, 2018.
[41] m. zhao and j. chen, "improvement in comparison of weighted k nearest neighbor classifiers for model selection," journal of software engineering, vol. 10, pp. 109–118, 2016.
[42] m. faryaneh, "weighted k-nearest neighbors (wknn)," matlab central file exchange, https://www.mathworks.com/matlabcentral/fileexchange/74111-weighted-k-nearest-neighbors-wknn.
[43] w. f. schmidt, m. a. kraaijveld and r. p. w. duin, "feed forward neural networks with random weights," the netherlands, delft university of technology, faculty of applied physics, 1992, 0-8186-2915-0/92, ieee, pp. 1–4.
[44] d. protic, "feedforward neural networks: the levenberg-marquardt optimization and the optimal brain surgeon pruning," military technical courier, vol. 63, no. 3, pp. 11–28, 2015.
[45] k. levenberg, "a method for the solution of certain problems in least squares," quarterly of applied mathematics, vol. 5, pp. 164–168, 1944.
[46] d. marquardt, "an algorithm for least-squares estimation of nonlinear parameters," siam journal in applied mathematics, vol. 11, no. 2, pp. 431–441, 1963.
[47] c. ambedkar and v. k. babu, "detection of probe attacks using machine learning techniques," international journal of research studies in computer science and engineering, vol. 2, no. 3, pp. 25–29, 2015.
[48] m. kurhade and r. wankhade, "an overview on decision making under risk and uncertainty," international journal of science and research, vol. 5, no. 4, pp. 416–422, april 2016.
[49] d. pamucar, d. bozanic and a. randjelovic, "multi-criteria decision-making: an example of sensitivity analysis," serbian journal of management, vol. 12, no. 1, 2017.
[50] a. ramos, m. lazar, r. f. filho and j. j p. c. rodrigues, "a security metric for evaluation of collaborative intrusion detection systems in wireless sensor networks," in proceedings of the 2017 ieee international conference on communications (icc), 2017, pp. 1–6.
[51] l. zomlot, "handling uncertainty in intrusion analysis," thesis for phd, 2014. http://doi.org/10.13140/rg.2.1.4936.4326.
[52] t. h. ho, j. j. hull and s. n. sirihari, "decision combination in multiple classification systems," ieee transaction on pattern analysis and machine intelligence, vol. 16, no. 1, pp. 66–75, january 1994.
facta universitatis series: electronics and energetics vol. 29, no. 2, june 2016, pp. 205-218 doi: 10.2298/fuee1602205p

a secure e-voting for the student parliament

dragoljub pilipovic 1, djordje babic 2
1 slobomir p university, bijeljina, bosnia and herzegovina
2 school of computing, union university, belgrade, serbia

abstract. e-voting is a service or system that collects individual human inputs and aggregates them into a group decision. e-voting is usually regarded as a part of e-government, but in this paper we consider e-voting for one specific population. the proposed e-voting system is intended for the student population and student parliament elections. in this paper, we describe the concept of p.u.t. (personal unique token) and ways to distribute p.u.t.s to students. at the end, we present software designed for the student parliament use case.

key words: e-voting, evs (electronic voting system), p.u.t., randomizer, secure distribution, student parliament, uml (unified modeling language).

1. introduction

in today's society, digital services have become an important part of everyone's lives. electronic voting is deployed in many countries worldwide [1] [2]. over the last two decades, the number of theoretical and practical solutions and research papers in this field has increased every year. we have noticed the existence of more than two hundred papers from the year 2000 onwards.
however, one scientific paper is considered the pioneer in this area: an overview of a mix-net scheme for e-voting by david chaum [3]. the main focus in the e-voting field is most often on technical dimensions: cryptographic algorithms, e-voting protocols and schemes, trusted hardware, software implementation, security, etc. recently, the focus has moved to the organizational, social, and political aspects of e-voting. despite a large amount of research in the field, e-voting has had no big real-world success. direct recording electronic (dre) machines for e-voting have been criticized many times [4-6]. the us department of defense proposed a remote, internet-based voting system for elections, the secure electronic registration and voting experiment (serve), but the report [7] strongly recommends against deploying this e-voting system. e-voting schemes are the core of e-voting protocols and e-voting systems. we mention the three that are most often represented in the literature.

received january 29, 2015; received in revised form november 1, 2015 corresponding author: dragoljub pilipovic slobomir p university, bijeljina, pf 70 pavlovića put 76, 76300 slobomir, bosnia and herzegovina (e-mail: dragoljub.pilipovic@gmail.com)

mix networks (mix-nets) are a cryptographic primitive generally used to obfuscate a path through a network [3]. mix-nets usually consist of a set of servers, named mixes, which ensure that the output of a mix-net cannot be correlated with its input. in e-voting, mix-nets simulate an anonymous channel between a voter and the ballot box [8-10]. the second e-voting scheme mentioned here is based on the blind signature primitive. in this primitive, messages are signed by a signer, but the message content is kept hidden from the signer. in e-voting, a voter sends an encrypted and blinded ballot to the ballot box.
after validating the ballot, the voter unblinds the ballot and, therefore, gets a validated ballot which can no longer be linked to its original content [11-12]. the last group of e-voting schemes are homomorphic encryption schemes. they are based on the algebraic homomorphic properties of a few public-key cryptosystems, which permit tallying an election without decrypting any single vote [13-14]. in addition to the three aforementioned schemes, there is an interesting paper-based scheme, prêt à voter. here, a voter retains a part of the ballot as their encrypted receipt [15]. technological measures such as e-voting may increase voter turnout, but some researchers found that with e-voting the turnout returns to the initial level at a later stage [16]. the electronic voting system (evs) described in this paper is specific because it is designed for the student population, consisting mostly of young people, who are characterized by a particular state of mind and practical attitude. they are open-minded, mobile, independent, and not involved in politics (at least this is valid for the majority). they use all the latest technological advances, so all devices used in e-voting are familiar to them. however, students are apolitical in the sense that they have low voting turnout [17-19]. the problem of distrust in voting can be significantly reduced, and perhaps can even disappear, if e-elections are conducted with educated voters and e-voting is made simpler. the system proposed in this paper is intended only for the student population. the students' 'job' consists precisely of adopting (and constructing) something new. we believe that students of computer science will have no obstacles to understanding and adopting this way of voting. furthermore, we hope that one day the whole population will have such an attitude.
the evs for the student parliament has to be safe like other evss, but it does not need an extremely high level of safety features, as it affects a smaller part of society. in addition, due to the characteristics of the student population, especially their open mind and independence, we assume that it will be unusual for students to sell their e-votes to others. the presented evs will prevent misuse; for example, coercion to vote for someone else is countered by appropriate mechanisms proposed in our evs. the main contribution of our paper is the concept of voting vectors, called p.u.t., each of which consists of a readable string of digits and letters. a printed p.u.t. list is anonymous as long as the voter keeps it in a secret and secure place. in addition, we show the design and implementation of an electronic voting system (evs) based on the p.u.t. concept. the evs provides very high usability because of its simplicity. the system requires only limited capabilities on the side of the voter: an ordinary computer and practically any web browser. the paper is organized as follows. in the second section, we present the concept of p.u.t. vectors for e-voting. the third section explains two ways to secure the distribution of the generated vectors for e-voting. in the next section, the entire protocol, i.e. the system for e-voting in student elections, is described. finally, we show the design and layout of software for e-voting based on the principles given in the previous sections.

2. p.u.t. concept

the most important issue in voting is the voter identity problem. here, this problem is solved by using a method called personal unique tokens (we use the abbreviation p.u.t. from this point on). in the classical, paper based model of e-voting, the voter identity and his/her choice are sent separately to the information system for e-voting.
the voters' identities are most often sent through a digital signature, and the voters' choice has to use a secure connection to the database. the main idea of the p.u.t. concept is to merge into one entity these two things, which are separated physically and in time, although one follows immediately after the other. this entity is sent to the information system for e-voting. in this way, a p.u.t. represents the identity of the voter and his choice at the same time (fig. 1).

fig. 1 personal unique tokens

the data sent as the vote of a specific user is a unique answer (a vector), which is unique to each voter and to each option he can vote for. a p.u.t. is formed in such a way that it is easy to read and easy to enter into the computer. table 1 shows an example of p.u.t. lists; each voter column is the p.u.t. list of one specific voter. each row in table 1 holds the p.u.t. data that should be sent to the evs for a specific voting option. in this case, the voting option is represented by the name and surname of an election candidate. one voter has only one p.u.t. list for a specific election race.

table 1 example of the p.u.t. list for e-voting

| option | p.u.t. – first voter | ... | p.u.t. – last voter |
|---|---|---|---|
| slobodan milutinović | abc 123 cba | ... | ghw 111 kkk |
| milan milošević | fgh 907 uso | ... | jap 231 gap |
| boris nikolić | uuu 000 iii | ... | hag 790 grw |
| tomislav tadić | eae 888 nhf | ... | iae 834 yiy |

here, we propose the format aaa xxx aaa for generating p.u.t.s, where a is a letter of the latin alphabet (this alphabet is available on every keyboard, operating system and type of device, and practically everyone knows latin letters) and x is a decimal digit. in order to calculate the number of available p.u.t.s, for each group of three positions we use the formula for the number of variations of k elements over a set of n elements with repetition:

$$V(n, k) = n^{k} \qquad (1)$$

after that, we multiply the resulting numbers with each other:

$$26^{3} \cdot 10^{3} \cdot 26^{3} = 17.576 \cdot 1.000 \cdot 17.576 \qquad (2)$$

in this way, we calculate the number of available personal unique answers, which is equal to 308.915.776.000 (a little over 300 billion). this number should satisfy any existing elections for the student parliament. one alternative is to use both upper-case and lower-case letters of the latin alphabet while reducing the number of letter positions. in this way, there are two groups of two letters instead of two groups of three letters (the format becomes aa xxx aa). this case-sensitive alternative offers 7.311.616.000 unique p.u.t. values.

$$(26 + 26)^{2} \cdot 10^{3} \cdot (26 + 26)^{2} = 2.704 \cdot 1.000 \cdot 2.704 \qquad (3)$$

another alternative is to use a script with a very large number of different characters. it is believed that a chinese reader must know 4.000 characters for conventional literacy [20]. if the character positions are further reduced to 1 + 1, with characters taken from the chinese script, we get 16.000.000.000 variations of p.u.t. values. the format then looks like a xxx a.

$$4.000^{1} \cdot 10^{3} \cdot 4.000^{1} = 4.000 \cdot 1.000 \cdot 4.000 \qquad (4)$$

3. distribution of answers

distribution of potential e-votes, in the form of a p.u.t. list, to the student voters is a critical and very sensitive step. the reason is that the p.u.t. list contains the identity of the voter and the virtual ballots, and therefore should be protected from misuse. in fact, this list must be kept in the strictest confidentiality. only the voter may have insight into its content. there are two possibilities that meet this requirement: the paper option and the software option. in both distribution options, it is necessary that the voter himself (i.e. a student in our case) appears at the site of the student election organizers. the paper option is triggered when the operator chooses it by pressing the corresponding button in the election software (details of the election software are provided in the following sections). the p.u.t. list is printed for a student who is present and whose identity has previously been established.
the default printer prints without a preview. the printer is designed in such a way that the text is printed at the bottom of the paper, and therefore the printed text is invisible to the operator or anyone else. in addition, the operator cannot see the p.u.t. list on the screen because this option is not implemented in the election software. the paper is loaded into an opaque envelope without turning the printed page over (envelope and paper are of the same size), or it is folded without turning the printed page over and then put in a usual-size envelope. the sealed envelope with the printed values is then given to the voter. the voters, who watch out for irregularities in the operator's work, monitor the registration process all the time. the software option of p.u.t. distribution encrypts the list with a strong symmetric algorithm. the evs generates a windows executable file which is transferred to the voter's usb flash drive, or possibly to optical media (for example cd and/or dvd), and then handed to the voter. when executed, this applet has only one text box for entering the password that unlocks and displays the bitmap image with the p.u.t. list. the design and features of this applet are similar to a digital wallet. for more portability, java applets can be generated with a similar purpose, or even native applications for all popular operating systems (windows, linux, mac os x, android, ios).

4. architecture and protocol

the protocol of the proposed evs has three phases. each phase consists of several actions. in the sequel, we describe these three phases and the corresponding actions (fig. 2).
a. phase i. phase i is the pre-election phase, which comprises all the necessary preparatory work for the second phase. it consists of the following two actions: 1.
evs initialization consists of: launching the software for e-voting in the appropriate mode; allocating responsibilities/duties to administrators; and establishing the list of students eligible to vote, which is either obtained externally or made in this sub-phase.
2. voter registration of students who want to vote through the e-voting system, which must be done in person (face to face), when a voter gets the p.u.t. list. the voters subsequently receive additional instructions that enable successful e-voting, e.g. the address for voting, authentication data, etc.
b. phase ii. this is the voting phase, in which voters make their selection electronically. it consists of the following sub-phases:
1. voter authentication confirms that the person is indeed a voter eligible to vote, i.e. he/she is on the list for e-elections. the voters use the instructions and data obtained in sub-phase i-2.
2. the ballot is made available to voters. it should be blank and anonymous.
3. the act of voting, in which the voter fills in the ballot and submits it to the evs.
fig. 2 phases in proposed evs (phase i: initialization, registration; phase ii: authentication, ballot, voting; phase iii: counting; integration, evaluation and revision)
c. phase iii. this phase comprises all activities that take place after the elections close.
1. vote counting consists of ballot collection (e.g. the ballots are forwarded from individual polling stations to the central location), preparation for processing (e.g. if ballots arrive encrypted, they are decrypted), and finally the counting of e-votes according to the rules defined in sub-phase i-1 in order to obtain accurate results. the results can be made public, for example through the news channel on the official website of the election commission.
2. integration, evaluation and revision.
if e-voting is only part of the overall election, then the voting results are summed up in order to obtain the final results. evaluation assesses the process of e-voting and generates statistics that help to improve the whole process and make it more transparent. revisions are performed if there is suspicion about the results or the implementation of e-voting procedures, but most often a revision comes down to a recount of e-votes [21]. the architecture and protocol for e-voting proposed here are intended for imaginary elections for a student parliament. their properties should scale easily to general-purpose elections. it is assumed that there is a database with several thousand students who are eligible voters. the next assumption is that there will also be candidates, a few dozen, for example. the entire process of e-voting is divided into three stages, which do not overlap in time. these stages are the registration process, the voting day and the post-election period. in the first stage, students are registered for e-voting with identification documents such as an id card or a student card. registered students are recorded in the database and, at that point, they receive their p.u.t. lists. p.u.t.s are unique, randomly generated series of decimal digits and latin alphabet letters, as explained above. each personal token is located at the intersection of a voter's row and a candidate's column of the p.u.t.s matrix. the p.u.t. list is given to the student voter using one of the two options described in the previous section. on voting day, students vote on the web site using a simple form, which any web browser is able to read and render because it consists mainly of basic html elements. the web site for the elections is the equivalent of polling stations. the main e-voting page is removed from the web server after the voting closes.

5. unlinkability, anonymity and verifiability

in the proposed evs, there are two databases.
the first database (db) is completely offline (i.e. separated from the internet) and the second db is used just for e-voting on the internet. the second db contains anonymous p.u.t. lists and the results of voting, while the first db contains everything else. the second db is made from the first db (see details in the next section). the database on the web server (the second one), which is used to check whether a certain p.u.t. value is valid or not, does not contain personal data of students, but only a list of p.u.t. values and their associated candidates. the following is a description of the properties of the proposed evs (graphically depicted in fig. 3).
unlinkability: since p.u.t.s are not derived from the personal data of voters and are therefore independent of them, and since they are uniformly distributed over the domain of values, there is no direct link between voters and e-votes.
anonymity: if the p.u.t. list is kept strictly secret, then the voter and his e-vote will be anonymous.
verifiability: upon closing e-voting, all valid p.u.t.s cast as e-votes (the voted p.u.t.s in fig. 3) are gathered from the second db and published on the website of the election commission. at this point, all students can check whether their e-votes are recorded. the whole p.u.t. list is also published on the web site, so anyone can verify whether the final results are correct, because each p.u.t. value can be connected to an election candidate.

fig. 3 the flow of p.u.t.s, p.u.t. lists, voted p.u.t.s and anonymized p.u.t.s

6. design and implementation of evs

all previous sections can be seen as a programming task, or as user requirements for the process of developing software. for the software development, we use the well-known and common larman method [22].
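The verifiability property described in section 5 can be illustrated with a short sketch. It assumes that the published content of the second db is available as a plain mapping from every p.u.t. value to its candidate, together with the list of voted p.u.t.s. The names `published_tokens`, `voted_tokens`, `tally` and `was_recorded` are hypothetical and are not part of the authors' software, and counting a repeated token only once is likewise an assumption made here.

```python
from collections import Counter


def tally(published_tokens, voted_tokens):
    """Recount an election from the published data.

    published_tokens: dict mapping each p.u.t. value to a candidate name
    voted_tokens: iterable of p.u.t. values that were cast as e-votes
    A token that is not in the published list raises ValueError;
    a token that appears more than once is counted once (assumption).
    """
    counts = Counter()
    for token in set(voted_tokens):
        if token not in published_tokens:
            raise ValueError(f"unknown p.u.t.: {token}")
        counts[published_tokens[token]] += 1
    return dict(counts)


def was_recorded(my_token, voted_tokens):
    """Let a voter check that the token he or she submitted was published."""
    return my_token in set(voted_tokens)


# sample values taken from table 1 (first and last voter, two candidates)
published = {
    "abc 123 cba": "slobodan milutinović",
    "fgh 907 uso": "milan milošević",
    "ghw 111 kkk": "slobodan milutinović",
    "jap 231 gap": "milan milošević",
}
voted = ["abc 123 cba", "jap 231 gap"]

print(tally(published, voted))
print(was_recorded("abc 123 cba", voted))
```

Because the mapping holds only tokens and candidates, anyone can rerun such a recount without access to personal data, which is the point of keeping the first db offline.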
here, we show only the most significant parts of the project documentation. figure 4 provides a global use case diagram divided by domain areas. it can be seen that there are two actors: an operator (synonymous with a voting official) and a student (synonymous with an eligible voter). in total, there are sixteen use cases (not shown here) in four domain areas. based on these use cases, we design the evs software.

fig. 4 global use case diagram (actors: operator, student; domain areas: student's data, ballot's data, e-voting, p.u.t.'s data)

the following three figures show the gui (graphical user interface) of the proposed evs software. the software is developed in the visual studio ide, using the c# programming language and the ado.net and asp.net technologies. figure 5 shows the server part of the software, which targets the windows 8 platform, and the remaining figures 6-8 show the client part. the student's data tab implements operations on eligible voters, i.e. students with voting rights. students can be added, modified or deleted, and their data can be imported from an .xml file with the appropriate schema. this is designed and implemented based on the student's data use cases. the ballot's data tab handles the maintenance of election data for the student parliament, in particular setting the voting options (e.g. candidates) for the virtual ballot. this is based on the ballot's data use cases. the p.u.t.s data tab implements the features for p.u.t. generation, printing, and creating the digital wallet application with the p.u.t. list. there is an issue (we call it the 'randomizer efficiency problem') that shows up as a slow rate of generating p.u.t. values and requires further optimization.

algorithm 1 pseudo-code for generation of the p.u.t. list for all eligible voters

main_program():
  initialize put_list as matrix[maxvoters, maxvoteoptions];
  for i := 1 to (maxvoters * maxvoteoptions) do
    x := round(i / maxvoteoptions);              // voter row
    y := i - (x - 1) * maxvoteoptions;           // option column
    repeat
      temp := generate_candidate_for_put();
      duplicate := false;
      for j := 1 to i - 1 do                     // compare with every token stored so far
        xj := round(j / maxvoteoptions);
        yj := j - (xj - 1) * maxvoteoptions;
        if put_list(xj, yj) == temp then
          duplicate := true;
          break;
        end if
      end for
    until not duplicate;
    put_list(x, y) := temp;
  end for

generate_candidate_for_put():
  initialize candidate as array[9];
  for i := 1 to 6 do                             // six letters: positions 1-3 and 7-9
    temp := rand_between('a', 'z');
    if i <= 3 then place := i else place := i + 3;
    candidate[place] := temp;
  end for
  for i := 1 to 3 do                             // three digits: positions 4-6
    temp := rand_between('0', '9');
    candidate[i + 3] := temp;
  end for
  return candidate;

round(x):                                        // rounds up to the next integer
  if integer(x) == x then y := x else y := integer(x) + 1;
  return y;

figure 5 displays a part of the evs software that is used to adjust the settings of e-voting. the check box 'use captcha check' determines whether a test is applied to tell whether or not the voter is human (captcha: completely automated public turing test to tell computers and humans apart). the check box 'use web forum' determines whether the output of e-voting goes to a public web site. the text box 'minimal duration of e-voting' sets the shortest time, in seconds, needed to finish casting an e-vote.

fig. 5 gui for some e-voting use cases

figures 6 and 7 show two different client parts of the proposed evs software, which are designed and implemented from two use cases of the e-voting domain area. figure 6 displays the client part of the evs in ubuntu linux in the opera web browser. it is a virtual polling place and a virtual ballot. a voter fills in the first row of text boxes in the form with a value from his/her own p.u.t. list, and then types the captcha series in the second row.
when the timer runs out, a student can press the vote button and send the completed virtual ballot to the server. figure 7 depicts the applet with the p.u.t. list. this is partly covered by the e-voting use cases. physically, the applet consists of three parts. the first part of the applet is an executable file (with the .exe extension), which is the same for all voters. at the top of the applet, there is a password field (in fact, the password is the decryption key), next to it there is a view button, and below there is the decrypted image with p.u.t. values. the second part of the applet is a readme.txt file with instructions for using the applet. the third part, the most important component of the applet, is picture.jpg. this file holds the bitmap image with a high compression rate, which is encrypted with the aes algorithm. the encryption key length is 128 bits (16 bytes in the form of a password). the symmetric algorithm aes is selected because it is well supported in the windows operating system with .net framework 3.5, on which the whole software ecosystem is built.

fig. 6 virtual ballot in the client part of evs

fig. 7 applet digiwallforevs with decrypted p.u.t.s

fig. 8 sequence diagram for phase ii of proposed evs

all three components are packed in a .zip archive with the name 'extract-to-folderpseudorandom_number'. here, the trailing number is taken from the sequence of a pseudorandom generator whose seed is set at the start of the evs software. the image with p.u.t.s is rendered with random font, size, color and style in order to decrease the chance of success of the ocr (optical character recognition) process. the password for decryption is not stored anywhere; only the voter, as the legitimate user, knows it. to make the program code harder to read, the free obfuscator software obfuscar is used.
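As a complement to algorithm 1 and to equations (2)-(4), the following Python sketch counts the token space of a format string and fills a voters × options matrix with unique tokens. Checking duplicates with a set (instead of rescanning all stored tokens) and drawing randomness from the standard `secrets` module are assumptions made here for illustration; they are one possible answer to the 'randomizer efficiency problem', not the authors' C# implementation. All function names are hypothetical.

```python
import secrets
import string

LETTERS = string.ascii_lowercase   # 26 latin letters
DIGITS = string.digits             # 10 decimal digits


def token_space(fmt="aaa xxx aaa", letters=26, digits=10):
    """Number of distinct tokens for a format: 'a' = letter, 'x' = digit."""
    size = 1
    for ch in fmt:
        if ch == "a":
            size *= letters
        elif ch == "x":
            size *= digits
    return size


def generate_put(fmt="aaa xxx aaa"):
    """Draw one random token; spaces in the format are copied as-is."""
    out = []
    for ch in fmt:
        if ch == "a":
            out.append(secrets.choice(LETTERS))
        elif ch == "x":
            out.append(secrets.choice(DIGITS))
        else:
            out.append(ch)
    return "".join(out)


def generate_put_matrix(max_voters, max_vote_options, fmt="aaa xxx aaa"):
    """Return a max_voters x max_vote_options matrix of unique tokens."""
    if max_voters * max_vote_options > token_space(fmt):
        raise ValueError("format too small for the requested number of tokens")
    seen = set()                   # constant-time duplicate check
    matrix = []
    for _ in range(max_voters):
        row = []
        while len(row) < max_vote_options:
            token = generate_put(fmt)
            if token not in seen:
                seen.add(token)
                row.append(token)
        matrix.append(row)
    return matrix


print(token_space("aaa xxx aaa"))              # 308915776000, equation (2)
print(token_space("aa xxx aa", letters=52))    # 7311616000, equation (3)
print(token_space("a xxx a", letters=4000))    # 16000000000, equation (4)
print(generate_put_matrix(3, 4))
```

With a set, each new token costs one lookup regardless of how many tokens already exist, whereas the scan in algorithm 1 grows with the number of stored tokens.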
a sequence diagram in figure 8 shows the objects in the proposed evs and their interactions in the order in which they occur. the sequence diagram is a form of interaction diagram in uml, and it models the collaboration of objects based on a time sequence. databases are stored in the relational database system sql server. figures 9 and 10 display the schemas of both relational databases.

fig. 9 relational scheme for offline db

fig. 10 relational scheme for online db

in order to give a better view of the hardware and software components of our evs and their layout, figure 11 shows the uml deployment diagram.

fig. 11 deployment diagram of evs components (nodes and artifacts: client (https, html), firewall, web server (internet information services), dbms (sql server 2008), windows 32-bit application (.net 3.5, c#), asp.net virtual ballot)

7. discussion and conclusion

in june 2015, we conducted a pilot e-voting with the software presented in this article. the aims were to obtain practical experience in conducting the process of e-voting and to collect feedback from the participants. the pilot was locally coordinated by the pilot election team. there were two separate e-voting groups. the first consisted of second-year students and the second of fourth-year students. both groups belonged to the department of information technology, 47 students in total. students from the first group voted for the types of project in a database course that they needed to finish as a pre-exam engagement, while students from the second group voted for the leader of a software project they needed to make. the first group's voting imitated elections with political parties and/or political options, while the other resembled voting for candidates in elections. after the election period, we organized mini surveys among the e-voting participants. the survey had five questions: one of the yes/no type, one question with open answers, and the rest of the likert type with a 1-6 scale.
students' evaluation of e-voting and the evs was generally positive. nevertheless, the usability of the evs was rated at the middle of the scale. the reason for this can be seen from the answers to the open-type question: students asked for a mobile application for e-voting, particularly an application for the android operating system. prêt à voter (the evs mentioned in the introduction) was used in student elections in both luxembourg and surrey [23]. the use of punchscan in the 2007 student elections at the university of ottawa is documented in [24]. punchscan is a paper-based evs with optical-scan counting of votes. bingo voting was applied in the election of the student parliament at the karlsruhe institute of technology in 2008 [25]. the security of that evs relies on trusted random number generating devices such as bingo machines. it has the property of e2e (end-to-end) verifiability. these evss, used in academic and student environments, are not considered comparable to our evs, since they have not fully eliminated the need for paper. therefore, for the sake of comparison, we use the evs described in [26]. it is based on personal smart cards with a digital signature protocol and mixing of votes. it is technically more complex than our evs, since it requires an additional device, i.e. a smart card reader. our system does not require a personal smart card, which is still not widespread, as a prerequisite. otherwise, any general election evs or e-voting scheme can be used for the purposes of academic and student voting, e.g. the most frequently mentioned schemes from the introduction. still, this is not happening, because it would be irrational and a waste of resources. our evs has a relatively simple structure and organization, which leads to overall practicality of implementation. besides its simplicity, the evs has three important properties that make it suitable for the intended purpose. these are the properties of unlinkability, anonymity and verifiability.
the proposed evs can be applied, with additional effort, to general-purpose elections. in addition, with some minor changes or even in the identical form given here, it might be applied to any kind of e-voting, such as corporate voting, syndicate voting, etc. however, the e-voting pilot showed that the usability of the e-voting software must be improved. the web site for e-voting must have a professional appearance. next, trusted mobile applications for e-voting should be developed, which will be a sort of gateway or shortcut to the e-voting web site. security can be improved easily and significantly by installing a qualified digital certificate on the e-voting server to provide an https connection. for application in general-purpose elections, it is necessary to analyze the evs with an appropriate methodology or protection profiles. currently, there are three protection profiles for evs: bsi-pp-0031 in germany, pp-civis in france, and ieee p1583 in the usa [27]. further analysis could model potential threats based on the components in a scheme, in conjunction with attack trees offering possible ways to handle such threats and/or errors [28]. another option is to use the applied π-calculus to symbolically describe the system and its operations and attributes [29]. finally, the pilot project should be implemented on a large electorate in order to identify errors and receive feedback from a larger number of participants.

references

[1] project world map of e-voting.cc gmbh, [online]. available at: http://www.e-voting.cc/en/itelections/world-map/ (current 2015). [2] d. pilipović, “razvoj, trenutno stanje i perspektive e-glasanja” (in serbian), in proc. infoteh '14, jahorina, pp. 1225-1228, 2014. [3] d. chaum, “untraceable electronic mail, return addresses, and digital pseudonyms”, communications of the acm, vol. 24, no. 2, pp. 84-90, 1981. [4] t. kohno, a. stubblefield, a. rubin, and d. wallach, “analysis of an electronic voting system”, in proc.
of ieee symposium on security and privacy, pp. 27–40, 2004. [5] electronic voting machine information sheet, sequoia voting systems avc edge, version 1.1., eff (electronic frontier foundation), 2006. [6] j. bannet, d. w. price, a. rudys, j. singer, and d. s. wallach, “hack-a-vote: security issues with electronic voting systems”, ieee security and privacy, vol. 2, no. 1, pp. 32–37, 2004. [7] d. jefferson, a. d. rubin, b. simons, and d. wagner, “a security analysis of the secure electronic registration and voting experiment (serve)”, 2004, [online]. available at: http://www.servesecurityreport.org (current january 2004). [8] c. a. neff, “a verifiable secret shuffle and its application to e-voting”, acm conference on computer and communications security, pp. 116–125, 2001. [9] m. jakobsson, a. juels, and r. l. rivest, “making mix nets robust for electronic voting by randomised partial checking”, in proc. of the 11th usenix security symposium, berkeley, pp. 339–353, 2002. [10] d. wikström, “a sender verifiable mix-net and a new proof of a shuffle”, in proc. of the advances in cryptology asiacrypt 2005, pp. 273–292, 2005. [11] g. dini, “a secure and available electronic voting service for a large-scale distributed system”, future generation computer systems, vol. 19, no. 1, pp. 69-85, 2003. [12] yu-yi chen, jinn-ke jan, and chin-ling chen, “the design of a secure anonymous internet voting system”, computers & security, vol. 23, no. 4, pp. 330-337, 2004. [13] r. cramer, r. gennaro, and b. schoenmakers, “a secure and optimally efficient multi-authority election scheme”, european transactions on telecommunications, vol. 8, pp. 481–490, 1997. [14] m. hirt and k. sako, “efficient receipt-free voting based on homomorphic encryption”, in proc. of the advances in cryptology eurocrypt 2000, bruges belgium, pp. 539–556, 2000. [15] p.
ryan, “a variant of the chaum voter-verifiable scheme”, in proceedings of the workshop on issues in the theory of security, wits '05, new york, pp. 81–88, 2005.
[16] v. d. besselaar, et al., “experiments with e-voting technology: experiences and lessons”, building the knowledge economy: issues, applications, case studies, ios press, 2003.
[17] e. dahlstrom, j. d. walker, and c. dziuban, “ecar study of undergraduate students and information technology”, boulder, co: educause center for applied research, 2012.
[18] m. r. jeffreys, nursing student retention: understanding the process and making a difference, springer publishing company, 2012.
[19] t. nyundu, k. naidoo, and t. chagonda, “getting involved on campus: student identities, student politics, and perceptions of the student representative council (src)”, jssa vol. 6, pp. 149-161, 2015.
[20] j. norman, chinese, cambridge university press, 1988, p. 73.
[21] d. pilipović, “međunarodni standardi i preporuke kod elektronskog glasanja” (in serbian), infoteh '14, jahorina, pp. 1233-1236, 2014.
[22] c. larman, “applying uml and patterns: an introduction to object-oriented analysis and design and iterative development”, prentice hall ptr, 2004.
[23] c. z. acemyan, p. kortum, m. d. byrne, and d. s. wallach, “usability of voter verifiable, end-to-end voting systems: baseline data for helios, prêt à voter, and scantegrity ii”, usenix journal of election technology and systems (jets), vol. 2, no. 3, july 2014.
[24] a. essex, j. clark, r. t. carback, and s. popoveniuc, “punchscan in practice: an e2e election case study”, in proc. 2007 iavoss workshop on trustworthy elections (wote), ottawa, canada, 2007.
[25] m. bär, c. henrich, j. müller-quade, s. röhrich, and c. stüber, “real world experiences with bingo voting and a comparison of usability”, workshop on trustworthy elections, wote 2008, 2008.
[26] r. krimmer, a. ehringfeld, m. traxl, “the use of e-voting in the austrian federation of students elections 2009”, in proc.
of the 4th international conference evote 2010, 2010.
[27] k. lee, y. lee, d. won, and s. kim, “protection profile for secure e-voting systems”, ispec 2010, lncs 6047, pp. 386–397, 2010.
[28] j.h. espedahlen, “attack trees describing security in distributed internet-enabled metrology”, master's thesis, department of computer science and media technology, gjøvik university college, 2007.
[29] s. kremer, m. ryan, and b. smyth, “election verifiability in electronic voting protocols”, esorics 2010, lncs 6345, pp. 389–404, 2010.

facta universitatis series: electronics and energetics vol. 29, no. 1, march 2016, pp. 127-138 doi: 10.2298/fuee1601127m
linearization of broadband two-way microstrip doherty amplifier
nataša maleš-ilić 1, aleksandra đorić 2, aleksandar atanasković 1
1 faculty of electrical engineering, university of niš, serbia
2 innovation center of advanced technologies, niš, serbia
abstract. the linearization of a broadband two-way microstrip doherty amplifier designed for application in the frequency range 0.9-1.0 ghz is considered in this paper. the amplifier, characterized by a maximum output power of 8 w, is designed in microstrip technology for broadband applications. for linearization purposes, the second- and fourth-order nonlinear signals are extracted at the output of the peaking cell, adjusted in amplitude and phase, and fed to the input and output of the carrier transistor over microstrip hairpin band-pass filters. the effects of the linearization are examined in simulation for two sinusoidal signals separated by frequency intervals from 5 mhz to 30 mhz, at different input power levels up to saturation, as well as for an orthogonal frequency division multiplexing (ofdm) digitally modulated signal.
key words: doherty amplifier, broadband, microstrip, ofdm signal, linearization, second- and fourth-order nonlinear signals, intermodulation products.
1.
introduction
the doherty amplifier topology enables high efficiency over a range of output power, which is one of the most attractive characteristics required in contemporary wireless communication systems. its design is therefore the subject of extensive research for application as a power amplifier in transmitters that need to support multistandard modulation schemes and to provide high efficiency, wideband operation, and linear performance. significant efforts have been devoted to the development of linearization techniques for the suppression of nonlinear distortion in a doherty amplifier: post-distortion compensation [1], the feedforward linearization technique [2], the predistortion linearization technique [3] and their combination [4], as well as digital predistortion [5]-[7].
received january 17, 2015; received in revised form march 23, 2015
corresponding author: nataša maleš-ilić, faculty of electrical engineering, university of niš, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: natasa.males.ilic@elfak.ni.ac.rs)
the linearization effects of the fundamental signals' second-order (im2) and fourth-order (im4) nonlinear signals, at the frequencies of the second harmonics and around them, on the standard (two-way, three-way and three-stage) doherty amplifiers were investigated in simulation [8]. we applied the approach where the im2 and im4 signals are injected together with the fundamental signals into the carrier amplifier input and fed at its output [9]. additionally, the influence of the im2 and im4 signals on doherty amplifier linearity was verified experimentally on standard symmetrical and asymmetrical two-way doherty amplifiers [9]-[11]. a broadband two-way doherty amplifier including an additional circuit for linearization was designed in a configuration with ideal lumped elements in the input and output matching circuits [12].
matching circuit design was based on filter structures with lumped elements and adequate transformations such as norton transformations [13], [14]. after applying appropriate transformations to the lumped elements [15], we designed and analyzed a doherty amplifier whose matching circuits comprise ideal transmission lines combined with lumped elements [16]. in order to design a matching circuit realizable in microstrip technology, the microstrip lines were calculated starting from the ideal transmission lines and combined with commercially available smd (surface mounted device) components instead of the ideal lumped elements in the matching circuits [17]. additionally, the impact of the band-pass filters in the linearization circuit on the amplifier performance was analyzed for four different filter types: doubly and singly terminated ideal band-pass filters, a hairpin microstrip band-pass filter, and a combination of a singly terminated stop-band filter with a hairpin filter [16].
in this paper, a broadband two-way doherty amplifier was designed in microstrip technology, so that the input and output matching circuits of the transistors are based only on microstrip lines. different transformations [15] were applied to the initial lumped-element matching circuits of the doherty amplifier in order to achieve adequate performance (gain, efficiency, linearity). the complete microstrip doherty amplifier is expected to be simpler to realize and implement, especially at higher frequencies, than the amplifier whose matching circuits combine microstrip lines and lumped elements [17]. also, an additional circuit for linearization was included in the broadband doherty amplifier, which operates over the frequency range 0.9-1.0 ghz.
in the applied linearization method, the second- and fourth-order nonlinear signals are extracted at the output of the peaking transistor of the doherty amplifier, adjusted in amplitude and phase in two independent branches, and inserted into the input and output of the carrier cell over the hairpin microstrip band-pass filters. the effects of linearization are considered for two-tone signals separated by frequency intervals from 5 mhz to 30 mhz at input power levels from 5 dbm up to 15 dbm (total signal power is at 2 db back-off level in relation to the maximal output power). additionally, to the authors' knowledge, this paper is the first to analyse the linearization of a broadband two-way doherty amplifier by the technique that exploits the second- and fourth-order nonlinear signals for an ofdm digitally modulated signal. in the authors' recent work, the linearization for an ofdm signal was carried out on a narrowband single-stage power amplifier for lte application by the injection of only second-order nonlinear signals (second harmonics), as described in [18].
this paper is organized as follows: section 2 covers the design of the broadband two-way doherty amplifier and the additional linearization circuit, including the simulation results for the main amplifier characteristics, gain and efficiency. section 3 presents the simulation results of the linearization for two sinusoidal fundamental signals (two-tone test) over a range of signal output power and for various frequency intervals between the two signals, as well as the results of the linearization for an ofdm digitally modulated signal over a power range. the conclusions are reported in section 4.
2.
design and characteristics of broadband doherty amplifier
advanced design system (ads) was used for the design of the two-way doherty amplifier in the standard configuration [1], [2], [4], [5] (the schematic diagram is shown in fig. 1). the amplifier is characterized by two quarter-wave impedance transformers with the characteristic impedances r0 = 50 Ω and rt = r0/√2 in the output combining circuit, as well as a 3 db quadrature branch-line coupler at the input to compensate for the phase difference of 90° caused by the 50 Ω quarter-wave impedance transformer at the output. the offset line at the output of the peaking amplifier transforms its output impedance to an open circuit in the low power region to prevent current leakage from the carrier amplifier. the carrier and peaking cells were designed using freescale's mrf281s ldmosfet with 4 w peak envelope power. the carrier amplifier was biased in class-ab (vd = 26 v, vg = 5.1 v (13.5% of idss, the transistor saturation current)), whereas the peaking amplifier operates in class-c (vd = 26 v, vg = 3.6 v). the source and load matching impedances of the carrier and peaking cells were determined by source-pull and load-pull analysis; for the carrier cell they are zs = (5.5 j15) Ω and zl = (12.5 j27.5) Ω, respectively, and for the peaking cell zs = (3.55 j15.7) Ω and zl = (3.95 j30.85) Ω.
the broadband operation of the amplifier was achieved by using input and output matching circuits of the transistors based on filter structures with lumped elements. a detailed insight into the design process of the broadband matching circuits with lumped elements, with all necessary transformations, is given in [12]. the next step is a transformation of the lumped-element matching circuits into matching circuits with transmission lines characterized by the appropriate characteristic impedances and electrical lengths.
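as a rough illustration of the output combining network described above, the following python sketch evaluates the load seen by the carrier cell through ideal quarter-wave transformers. it assumes lossless lines, rt = r0/√2, an ideal open circuit presented by the peaking branch at low power, and equal currents from both cells at peak power; the function and variable names are illustrative and are not taken from the ads design.

```python
import math


def quarter_wave_zin(z_line, z_load):
    """Input impedance of an ideal lossless quarter-wave line (assumption)."""
    return z_line ** 2 / z_load


R0 = 50.0                  # ohm, system impedance and carrier-side transformer
RT = R0 / math.sqrt(2.0)   # ohm, output transformer (about 35.4 ohm)

# Impedance at the combining node: the RT line terminated by the R0 load.
z_node = quarter_wave_zin(RT, R0)

# Low power: peaking branch assumed to be an ideal open circuit,
# so the carrier-side line is terminated by z_node alone.
z_carrier_low = quarter_wave_zin(R0, z_node)

# Peak power: equal currents assumed from both cells,
# so each branch effectively sees twice the node impedance.
z_carrier_peak = quarter_wave_zin(R0, 2.0 * z_node)

print(f"combining node impedance: {z_node:.1f} ohm")
print(f"carrier load, low power:  {z_carrier_low:.1f} ohm")
print(f"carrier load, peak power: {z_carrier_peak:.1f} ohm")
```

under these idealized assumptions the carrier load falls from 2·r0 at low power to r0 at peak power, which is the load modulation that the standard doherty configuration relies on.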
in this paper, direct transformations of the resonant lc circuits into appropriate transmission lines or stubs were applied: the parallel lc circuit was approximated by an open quarter-wave stub connected in parallel, the parallel l was replaced by a parallel shorted quarter-wave stub, whereas the serial lc circuit, the serial l and the parallel c were each replaced by an appropriate transmission line connected in series [15]. in order to design a matching circuit realizable in microstrip technology, all ideal transmission lines were converted into microstrip lines on a substrate with the parameters: dielectric constant εr = 2.2, substrate height h = 0.635 mm and metallization thickness t = 17 μm. in order to prevent losses of the fundamental signals, the stabilization of the carrier and peaking amplifiers was performed by resistances whose values were selected in the range 300-600 Ω. these resistances were connected in parallel with the rf chokes in the dc power supply circuits of the carrier and peaking transistors and were also placed in parallel with the input matching circuit in the carrier cell, as shown in fig. 1.
the im2 and im4 signals generated at the output of the peaking transistor were extracted through the band-pass filter (bpf1), whose center frequency is that of the second harmonics, so that it passes the signals for linearization (im2 and im4 signals). the linearization circuit consists of two independent linearization branches, each of which includes a variable attenuator, a variable phase shifter and an amplifier for the amplitude and phase adjustment of the im2 and im4 signals. the im2 and im4 signals are delivered over the band-pass filters bpf2 and bpf3 to the carrier transistor input and output, respectively, as illustrated in fig. 1.
fig. 1 schematic diagram of broadband two-way microstrip doherty amplifier with additional circuit for linearization (length (l) and width (w) are in millimeters).
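to give a feeling for the microstrip dimensions implied by this substrate, the sketch below estimates the width and quarter-wave length of a 50 Ω line using the closed-form synthesis formulas found in standard microwave textbooks. it assumes zero metallization thickness and no dispersion, so it is only a first-order estimate and not the procedure used in ads; the 0.95 ghz frequency and all names are chosen for illustration.

```python
import math


def microstrip_width_ratio(z0, eps_r):
    """Approximate W/h for a target impedance z0 (closed form, t = 0 assumed)."""
    a = (z0 / 60.0) * math.sqrt((eps_r + 1.0) / 2.0) \
        + (eps_r - 1.0) / (eps_r + 1.0) * (0.23 + 0.11 / eps_r)
    narrow = 8.0 * math.exp(a) / (math.exp(2.0 * a) - 2.0)
    if narrow < 2.0:
        return narrow  # narrow-line branch is self-consistent
    # Wide-line branch (W/h > 2).
    b = 377.0 * math.pi / (2.0 * z0 * math.sqrt(eps_r))
    return (2.0 / math.pi) * (
        b - 1.0 - math.log(2.0 * b - 1.0)
        + (eps_r - 1.0) / (2.0 * eps_r)
        * (math.log(b - 1.0) + 0.39 - 0.61 / eps_r)
    )


def effective_permittivity(w_over_h, eps_r):
    """Quasi-static effective permittivity of a microstrip line."""
    return (eps_r + 1.0) / 2.0 \
        + (eps_r - 1.0) / 2.0 / math.sqrt(1.0 + 12.0 / w_over_h)


EPS_R = 2.2      # substrate dielectric constant (from the text)
H_MM = 0.635     # substrate height in mm (from the text)
F_GHZ = 0.95     # illustrative frequency, centre of the 0.9-1.0 GHz band

ratio = microstrip_width_ratio(50.0, EPS_R)
width_mm = ratio * H_MM
eps_eff = effective_permittivity(ratio, EPS_R)
# Free-space wavelength in mm is 299.792458 / f[GHz].
quarter_wave_mm = 299.792458 / (4.0 * F_GHZ * math.sqrt(eps_eff))

print(f"50-ohm line width:   {width_mm:.2f} mm")
print(f"effective eps:       {eps_eff:.3f}")
print(f"quarter-wave length: {quarter_wave_mm:.1f} mm")
```

the quarter-wave length of several centimetres obtained from such an estimate is consistent with the remark that large board dimensions are to be expected at l-band.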
since the band-pass filters are directly connected to the gate and drain of the transistors in the doherty amplifier, they have the greatest influence on the amplifier behaviour. the band-pass filter was designed as a hairpin filter (hp) with three sections at the center frequency 1.9 ghz and 20% bandwidth on the microstrip substrate. insertion of the linearization branches into the doherty amplifier over the band-pass filters deteriorates the amplifier performance [16]. accordingly, in order to correct the degradation and improve the amplifier characteristics, an additional optimization of the microstrip line parameters (length and width) in the input and output matching circuits of the carrier and peaking amplifiers was performed.
the output power, gain and drain efficiency (de) of the doherty amplifier as a function of the input power are shown in fig. 2 for single-tone excitation at the frequencies 0.90 ghz, 0.925 ghz, 0.95 ghz, 0.975 ghz and 1.00 ghz. it can be noted that the maximal gain achieved is around 21 db. the gain characteristics observed at the excitation signal frequencies vary within approximately 2 db up to an input signal power level of approximately 15 dbm. however, they differ less and less as the input power increases from 15 dbm to 20 dbm. in the low power range, the drain efficiency at the denoted frequencies deviates from the de at 0.95 ghz by at most 5%, whereas the differences in the higher power range reach at most 8%. the drain efficiency attained at the maximal output power of around 40 dbm is 53%.
the doherty amplifier layout including the hairpin band-pass filters is illustrated in fig. 3. because of the insertion of t-junctions for line connections, as well as the bending of microstrip lines required to achieve appropriate circuit placement on the substrate, additional corrections of the microstrip line lengths were performed in comparison to the dimensions denoted in fig. 1.
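the quoted output power and drain efficiency can be turned into a simple power budget. the short python sketch below does this arithmetic for the rounded values given in the text (about 40 dbm output power and 53% drain efficiency); it neglects the rf drive power, and the helper names are illustrative.

```python
import math


def dbm_to_watt(p_dbm):
    """Convert a power level in dBm to watts."""
    return 10.0 ** (p_dbm / 10.0) / 1000.0


def watt_to_dbm(p_watt):
    """Convert a power in watts to dBm."""
    return 10.0 * math.log10(p_watt * 1000.0)


P_OUT_DBM = 40.0    # approximate maximal output power (from the text)
DRAIN_EFF = 0.53    # drain efficiency at that power (from the text)

p_out_w = dbm_to_watt(P_OUT_DBM)
p_dc_w = p_out_w / DRAIN_EFF      # DC power drawn from the drain supply
p_diss_w = p_dc_w - p_out_w       # dissipated power, RF drive neglected

print(f"output power:     {p_out_w:.2f} W")
print(f"dc supply power:  {p_dc_w:.2f} W")
print(f"dissipated power: {p_diss_w:.2f} W")
```

since drain efficiency is defined as the ratio of rf output power to dc supply power, the dissipated power follows directly from the two quoted figures.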
it should be stressed that large circuit board dimensions are expected for the operational frequency at l-band. also, the additional circuit for linearization used for obtaining the simulation results in this paper consists of ideal elements from the ads library. a branch of the fabricated circuit for linearization, shown in fig. 4, is planned to be used in further experimental verification of the linearization approach on the microstrip broadband two-way doherty amplifier. it comprises the m/a-com pin diode variable attenuator ma4vat2007-1061t, two minicircuits 180° voltage variable phase shifters jsphs-23+ to provide a phase shift of 360°, and the skyworks high linear 2w power amplifier sky65120. this linearization circuit, which was already used in experiments performed for the linearization of narrowband amplifiers [10], [11], satisfies the requirements for broadband applications.
fig. 2 output power, gain and de in terms of one-tone input power at 0.9 ghz, 0.925 ghz, 0.95 ghz, 0.975 ghz and 1.00 ghz.
fig. 3 layout of broadband two-way microstrip doherty amplifier including band-pass filters for extraction and injection of the signals for linearization.
fig. 4 fabricated linearization circuit.
3. linearization results
in order to evaluate the impact of the proposed linearization technique on the designed microstrip doherty amplifier, a two-tone test was carried out in ads. one sinusoidal signal at the frequency 0.95 ghz and another shifted in frequency by 5 mhz to 30 mhz, in increments of 5 mhz, were simultaneously applied to the amplifier input. the analysis was carried out for different input signal power levels. the power level of the third-order intermodulation products, before and after the linearization, in terms of the frequency interval between the signals is presented in fig.
5 for the fundamental signal power levels at the amplifier input: 5 dbm, 8 dbm, 12 dbm and 15 dbm. according to the theoretical analysis of the linearization approach given in [8]-[10], the linearization signals, im2 and im4, which are adjusted to the appropriate amplitudes and phases and injected at the input and output of the amplifier transistor, can reduce both the third- and fifth-order distortion of the fundamental signal. however, the degree of suppression depends on the relations between the amplitudes as well as the phases of the im2 and im4 signals generated at the peaking amplifier output. therefore, when the required relations between the amplitudes and phases are not fulfilled, only one kind of intermodulation product can be lowered sufficiently. in this paper the parameters of the linearization circuits were optimized to decrease the third-order intermodulation products, while the fifth-order intermodulation products were required to remain at power levels as low as possible.
it can be noted that, after the linearization in the considered power range, a significant reduction of the im3 products was attained. the suppression of the im3 products is the highest, more than 40 db, for the lowest input power of 5 dbm. however, the reduction of the intermodulation products diminishes as the power level and the interval between the signals increase: at the input power of 8 dbm the im3 products are lowered by around 20 db, whereas at the input power of 15 dbm the reduction of the im3 products is around 10 db.
fig. 5 third-order intermodulation products of broadband microstrip doherty amplifier at different input power levels: a) 5 dbm, b) 8 dbm, c) 12 dbm, d) 15 dbm.
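the frequency plan of the two-tone test can be sketched as follows. the python example below lists, for each tone spacing, the third- and fifth-order intermodulation products near the fundamentals and the second- and fourth-order components around the second harmonics, and checks that the latter fall inside an idealized passband of the hairpin filter (centre frequency 1.9 ghz, 20% bandwidth, as stated in section 2). the brick-wall passband, the selection of fourth-order terms (only 3f1 − f2 and 3f2 − f1) and all names are simplifying assumptions.

```python
def two_tone_products(f1, spacing):
    """Frequencies (MHz) of selected mixing products for tones f1 and f1 + spacing."""
    f2 = f1 + spacing
    return {
        "im3": (2 * f1 - f2, 2 * f2 - f1),
        "im5": (3 * f1 - 2 * f2, 3 * f2 - 2 * f1),
        "im2": (2 * f1, f1 + f2, 2 * f2),
        "im4": (3 * f1 - f2, 3 * f2 - f1),
    }


F1_MHZ = 950                      # first tone (from the text)
SPACINGS_MHZ = range(5, 35, 5)    # 5 MHz to 30 MHz in 5 MHz steps

# Idealized passband of the hairpin filter: 1.9 GHz centre, 20 % bandwidth.
PASS_LOW_MHZ = 1900 * (1 - 0.10)
PASS_HIGH_MHZ = 1900 * (1 + 0.10)

all_in_band = True
for spacing in SPACINGS_MHZ:
    products = two_tone_products(F1_MHZ, spacing)
    injected = products["im2"] + products["im4"]
    in_band = all(PASS_LOW_MHZ <= f <= PASS_HIGH_MHZ for f in injected)
    all_in_band = all_in_band and in_band
    print(spacing, products["im3"], products["im5"], in_band)
```

for every spacing considered, the listed second- and fourth-order components lie within this idealized passband, while the im3 and im5 products stay close to the fundamental tones.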
the linearization results accomplished in this paper for the broadband doherty amplifier whose matching circuits consist entirely of microstrip lines are very similar to the results achieved in the linearization of the broadband two-way doherty amplifier designed with combined matching circuits comprising microstrip lines and real smd lumped elements [17]. the suppression of the im3 products for all considered frequency intervals between the signals up to 30 mhz at higher input power levels is approximately the same for both configurations; the only difference is at lower power levels, where a greater reduction of the im3 products is obtained with the pure microstrip broadband doherty configuration. additionally, it should be noted that, besides giving comparable or better linearization results, the broadband microstrip doherty amplifier is simpler and less demanding to realize than the combined topology with smd components, especially at higher frequencies. moreover, the microstrip amplifier allows additional tuning of the width and length of the microstrip lines in the matching circuits if a correction of the amplifier characteristics is required.
additionally, the designed amplifier was tested with an ofdm signal at 0.95 ghz carrier frequency and 17 mhz spectrum width over a range of signal power levels in order to assess the impact of the proposed linearization technique on the microstrip broadband doherty amplifier. the results of the analysis, the adjacent channel power ratio (acpr) before and after the linearization in terms of different average output power levels of the fundamental signals, are shown in fig. 6. the parameters of the linearization circuits were optimized to suppress the third-order intermodulation products.
the acpr is observed at ±16 mhz offset from the carrier (the range of dominant third-order distortion) over the 0.32 mhz frequency span. it can be noted that the acpr was improved by around 8 db to 10 db for average output power levels of 20 dbm to 30 dbm. the acpr improvement decreases as the output power rises, so that it is around 3 db at the maximal observed power level. the fifth-order intermodulation products are dominant at ±30 mhz offset from the centre frequency in the output spectrum. it follows from fig. 7 that the acpr, evaluated over the 0.32 mhz span, remains at its pre-linearization level for lower output power levels up to approximately 26 dbm, whereas an improvement of up to 8 db is observed at higher power. it follows from fig. 8, which shows the average output power of the fundamental signal for the ofdm test, that the applied linearization method has a negligible influence, of a few tenths of a db, on the output power of the fundamental signals. the fundamental signal power may even be increased after applying the proposed linearization technique. the output spectra for the ofdm digitally modulated signal obtained in simulation before and after the linearization for 12 dbm average input power level are given in fig. 9. the average output power corresponds to a level of around 8 db back-off, but considering the 12 db peak-to-average power ratio of the ofdm signal, we may infer that the linearization results, which indicate an acpr improvement of 11 db for the lower channel and 9 db for the upper channel, were achieved for the doherty amplifier operating at high signal power approaching saturation.
fig. 6 acpr before linearization (solid line) and after linearization (dashed line) at ±16 mhz offset from the carrier (the range of dominant third-order distortion) for ofdm signal in a range of average output power.
fig.
7 acpr before linearization (solid line) and after linearization (dashed line) at ±30 mhz offset from the carrier (the range of dominant fifth-order distortion) for ofdm signal in a range of average output power.
fig. 8 output power of ofdm signal before and after the linearization.
fig. 9 output spectra before and after linearization for ofdm signal at average input power of 12 dbm.
4. conclusion
the effect of the linearization technique on the broadband two-way doherty amplifier designed in microstrip technology is considered in this paper. the doherty amplifier is designed to operate over the frequency range 0.9-1.0 ghz in a configuration with matching circuits that consist only of microstrip lines. the applied linearization technique uses the even-order nonlinear signals (im2 and im4) at the frequencies of the second harmonics and around them. these linearization signals, which are generated at the output of the peaking cell, are guided through two linearization branches. after being set to the appropriate amplitude and phase, the linearization signals are fed to the carrier transistor input and output over the microstrip hairpin band-pass filters. the results attained in simulation for the two-tone test indicate a satisfactory suppression of the im3 products, even when the frequency interval between the signals increases. however, as the power level and the interval between the signals grow, the suppression of the intermodulation products falls off. moreover, in the case of a broadband ofdm digitally modulated signal, the linearization method improves the acpr in the range of dominant third-order intermodulation products and keeps the fifth-order nonlinearities at sufficiently low power. the output power of the useful fundamental signal is only negligibly reduced, or is even increased at higher power, which contributes to the efficiency of the amplifying system.
it should be stressed that the amplifier designed completely as a microstrip circuit offers possibilities for simpler practical realization, especially at higher frequencies. also, the degradation of the amplifier performance that appears when commercially available smd components are used for the lumped elements in the matching circuits is avoided. besides, the microstrip line dimensions may be trimmed additionally to improve the amplifier performance. it should be mentioned that the dimensions of the microstrip two-way doherty amplifier are expectedly large in the l frequency band of circuit operation. this paper aims to analyze the effects of the proposed linearization approach for application in a broadband doherty amplifier realizable in microstrip technology, while further work will be directed toward the design and linearization of amplifiers at higher frequencies, where smaller circuit dimensions are expected.
acknowledgement: this work was supported by the ministry of education, science and technological development of republic of serbia, the project number tr-32052.
references
[1] k. j. cho, w. j. kim, j. h. kim and s. p. stapleton, “linearity optimization of a high power doherty amplifier based on post-distortion compensation”, ieee microwave and wireless components letters, vol. 15, no. 11, pp. 748-750, 2005.
[2] k. j. cho, j. h. kim and s. p. stapleton, “a highly efficient doherty feedforward linear power amplifier for w-cdma base-station applications”, ieee trans. microwave theory tech., vol. 53, no. 1, pp. 292-300, 2005.
[3] b. shin, j. cha, j. kim, y. y. woo, j. yi, b. kim, “linear power amplifier based on 3-way doherty amplifier with predistorter”, in proceedings of the ieee mtt-s int. microw. symp. digest, 2004, pp. 2027-2030.
[4] t. ogawa, t. iwasaki, h. maruyama, k. horiguchi, m. nakayama, y. ikeda and h.
kurebayashi, “high efficiency feed-forward amplifier using rf predistortion linearizer and the modified doherty amplifier”, in proceedings of the ieee mtt-s int. microw. symp. digest, 2004, pp. 537-540.
[5] l. piazzon, r. giofre, p. colantonio, and f. giannini, “a wideband doherty architecture with 36% of fractional bandwidth”, ieee microwave and wireless components letters, vol. 23, no. 11, pp. 626-628, nov. 2013.
[6] david yu-ting wu and slim boumaiza, “a modified doherty configuration for broadband amplification using symmetrical devices”, ieee trans. microwave theory tech., vol. 60, no. 10, pp. 3201-3213, 2012.
[7] khaled bathich, georg boeck, “wideband harmonically-tuned gan doherty power amplifier”, in proceedings of the ieee mtt-s international microwave symposium digest (mtt), 2012, montreal, canada, pp. 1-3.
[8] a. atanasković, n. maleš-ilić, b. milovanović, “the linearization of doherty amplifier”, microwave review, no. 1, vol. 14, pp. 25-34, september 2008.
[9] a. atanasković, n. maleš-ilić, b. milovanović, “linearization of two-way doherty amplifier”, in eumic 2011 proceedings, conference proceedings on cd, manchester, uk, euma october 10-11, poster 01-17, pp. 304-307, 2011.
[10] a. atanasković, n. maleš-ilić, b. milovanović, “linearization of power amplifiers by second harmonics and fourth-order nonlinear signals”, microwave and optical technology letters, vol. 55, issue 2, pp. 425-430, february 2013.
[11] aleksandar atanasković, nataša maleš-ilić, bratislav milovanović, “linearization of symmetrical and asymmetrical two-way doherty amplifier”, facta universitatis (nis), series: electronics and energetics, vol. 25, no. 2, pp. 161-170, 2012.
[12] a. đorić, n. maleš-ilić, a. atanasković, b. milovanović, “linearization of broadband doherty amplifier”, in telsiks 2013 proceedings, niš, serbia, october 16-19, vol. 2, 2013, pp. 509-512.
[13] g. matthaei, l. young, and e. m. t. jones, microwave filters, impedance-matching networks, and coupling structures.
norwood, ma: artech house, 1980.
[14] d. dawson, “closed-form solution for the design of optimum matching networks”, ieee trans. microw. theory tech., vol. 57, no. 1, pp. 121-129, 2009.
[15] r. w. rhea, hf filter design and computer simulation. new york noble, pp. 89, 1994.
[16] a. đorić, n. maleš-ilić, a. atanasković, “analysis of linearization circuit impact on broadband doherty amplifier performances”, in proceedings of the 1st international conference on electrical, electronic and computing engineering, icetran 2014, vrnjačka banja, serbia, june 2-5, 2014, pp. mti2.4.1-6.
[17] a. đorić, n. maleš-ilić, a. atanasković, “broadband microstrip doherty amplifier design and linearization”, in icest 2014 proceedings, niš, serbia, 25-27 june, 2014, pp. 131-134.
[18] mijušković jelena, bukvić branko, nešković nataša, maleš-ilić nataša, budimir djuradj, “compensation of nonlinear distortion in rf power amplifiers by injection for lte applications”, in proceedings of the 22nd telecommunications forum telfor 2014 on cd, pp. 352-355, belgrade, 2014.

facta universitatis series: electronics and energetics vol. 30, no. 1, march 2017, pp. 27-38 doi: 10.2298/fuee1701027z
calculation model for the induced voltage in rectangular coils above conductive plates
siquan zhang 1, nathan ida 2
1 department of electrical and automation, shanghai maritime university, shanghai, 201306, china
2 department of electrical and computer engineering, the university of akron, akron, oh, 44325-3904, usa
abstract. electromagnetic ndt methods, and in particular eddy currents, play an important role in nondestructive testing of conducting materials. in testing conductive structures, rectangular coils are often more useful than circular coils.
a particular configuration consists of two rectangular coils located above the conductive plates, one placed parallel to the plates serving as an excitation coil and the other perpendicular to the plates serving as a sensing coil. in this work we derive analytical expressions for the induced voltage variations in the pick-up coil. then the influences of the plate thickness, the exciting frequency and the moving speed of the conductor on the induced voltage variation are analyzed. the analytical calculation results are verified using the finite element method.
key words: eddy current testing, conductive plates, rectangular coil, induced voltage, finite element method.
received august 17, 2016
corresponding author: nathan ida, department of electrical and computer engineering, the university of akron, akron, oh, 44325-3904, usa (e-mail: ida@uakron.edu)
1. introduction
eddy current testing (ect) techniques are widely used in testing of conductive structures, with the advantage of high sensitivity when testing for surface flaws [1-3]. in standard eddy current testing, a circular coil carrying current is used to test the conductive specimen. the alternating current in the coil generates an alternating magnetic field, which interacts with the test specimen and generates eddy currents. however, rectangular coils are more useful than circular coils, because the rectangular coil is not axisymmetric; hence it affects the field inside the medium, resulting in higher sensitivity to sub-surface flaws [4]. in spite of these advantages, rectangular coils have seldom been discussed in the literature.
in this paper, we analyse a model with two rectangular coils, one serving as the exciting coil and the other as the pick-up coil, both located above the conductive plates. the characteristics of the conductive materials or the parameters of flaws can be evaluated from the induced voltage variation in the pick-up coil.
the validity of the theoretical analysis is confirmed by the finite element method (fem).
2. theoretical analysis
2.1. analytical model
fig. 1 shows two rectangular single-turn coils located above multi-layer conductive plates. the exciting coil is parallel to the surface of the conductor, which coincides with the z = 0 plane. the exciting coil has side lengths $2a_1$ and $2b_1$ and a lift-off $z_0$. an ac harmonic current $I e^{j\omega t}$ flows in the coil. the pick-up coil is parallel to the yz plane and perpendicular to the conductor; it has side lengths $2a_2$ and $2b_2$ and a lift-off $z_0 + w_2$. the thickness, conductivity and permeability of the two-layer conductive plate are assumed to be $d_i$, $\sigma_i$ and $\mu_i$ (i = 1, 2), and the conductive media are assumed to be linear, isotropic and homogeneous.
fig. 1 filamentary rectangular coils above a multi-layer conductor
to simplify the analysis, the solution region is divided into regions 0, 1 and 2. in region 0 (z > 0), the incident magnetic flux density $B_i$ generated by the exciting current and the reflected magnetic flux density $B_r$ generated by the induced eddy currents exist simultaneously. the incident magnetic flux density $B_i$ can be expressed through the vector potential $A_i$ as:
$\nabla^2 A_i = -\mu_0 J$ (1)
$B_i = \nabla \times A_i$ (2)
the reflected magnetic flux density $B_r$ satisfies the following:
$\nabla \times B_r = 0$ (3)
$\nabla^2 B_r = 0$ (4)
region 1 ($-d \le z \le 0$) is the top conductive plate. the magnetic flux density $B_1$ in this region satisfies the following:
$\nabla^2 B_1 - \mu_1 \sigma_1 v \frac{\partial B_1}{\partial y} - j\omega\mu_1\sigma_1 B_1 = 0$ (5)
$\nabla \cdot B_1 = 0$ (6)
region 2 ($z \le -d$) is the lower conductive plate.
the magnetic flux density $B_2$ in this region satisfies:
$\nabla^2 B_2 - \mu_2 \sigma_2 v \frac{\partial B_2}{\partial y} - j\omega\mu_2\sigma_2 B_2 = 0$ (7)
$\nabla \cdot B_2 = 0$ (8)
to solve these equations, the double fourier transform and its inverse are introduced:
$B(\xi,\eta,z) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} B(x,y,z)\, e^{-j(\xi x + \eta y)}\, dx\, dy$ (9)
$B(x,y,z) = \frac{1}{4\pi^2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} B(\xi,\eta,z)\, e^{j(\xi x + \eta y)}\, d\xi\, d\eta$ (10)
where ξ and η are the integration variables.
2.2. incident magnetic flux density
the single filamentary rectangular coil consists of four finite-length wires, as shown in fig. 1. by solving (1), the vector potential generated at an arbitrary point $P(x,y,z)$ by a source point $(x',y',z')$ in the coil can be written as:
$A(x,y,z) = \frac{\mu_0}{4\pi}\int_V \frac{J(x',y',z')}{R}\, dv'$ (11)
where $J$ is the current density in the coil, $V$ is the coil segment carrying current, and $R$ is the distance from $P(x,y,z)$ to the source point $(x',y',z')$:
$R = \sqrt{(x-x')^2 + (y-y')^2 + (z-z')^2}$ (12)
performing the fourier transform on (11), the expression of the vector potential in the region $z < z_0$ is obtained as:
$A(\xi,\eta,z) = \frac{\mu_0}{4\pi}\int_V J(x',y',z')\left\{\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\frac{1}{R}\, e^{-j(\xi x+\eta y)}\, dx\, dy\right\} dv' = \frac{\mu_0}{2}\int_V J(x',y',z')\, \frac{1}{\sqrt{\xi^2+\eta^2}}\, e^{-j(\xi x'+\eta y')}\, e^{\sqrt{\xi^2+\eta^2}\,(z-z')}\, dv'$ (13)
similarly, the components of the incident magnetic flux density are obtained by performing the fourier transform on (2):
$B_x = j\eta A_z - \frac{\partial A_y}{\partial z}$, $B_y = \frac{\partial A_x}{\partial z} - j\xi A_z$, $B_z = j\xi A_y - j\eta A_x$ (14)
as shown in fig. 1, the wire parallel to the x axis satisfies $J(x',y',z') = I$, $y' = y_0$ and $z - z_0 < 0$. substituting these into (13), the x component of the vector potential becomes:
$A_x = \frac{\mu_0}{2}\int_V J(x',y',z')\, \frac{1}{\sqrt{\xi^2+\eta^2}}\, e^{-j(\xi x'+\eta y')}\, e^{\sqrt{\xi^2+\eta^2}\,(z-z')}\, dv' = \frac{\mu_0 I\, e^{\sqrt{\xi^2+\eta^2}\,(z-z_0)}}{2\sqrt{\xi^2+\eta^2}}\, e^{-j\eta y_0}\int_{-x_0}^{x_0} e^{-j\xi x'}\, dx' = \frac{\mu_0 I \sin(\xi x_0)\, e^{-j\eta y_0}\, e^{\sqrt{\xi^2+\eta^2}\,(z-z_0)}}{\xi\sqrt{\xi^2+\eta^2}}$ (15)
similarly, the wire parallel to the y axis satisfies $J(x',y',z') = I$, $x' = x_0$ and $z - z_0 < 0$. substituting these into (13), the y component of the vector potential becomes:
$A_y = \frac{\mu_0 I\, e^{\sqrt{\xi^2+\eta^2}\,(z-z_0)}}{2\sqrt{\xi^2+\eta^2}}\, e^{-j\xi x_0}\int_{-y_0}^{y_0} e^{-j\eta y'}\, dy' = \frac{\mu_0 I \sin(\eta y_0)\, e^{-j\xi x_0}\, e^{\sqrt{\xi^2+\eta^2}\,(z-z_0)}}{\eta\sqrt{\xi^2+\eta^2}}$ (16)
the x component of the magnetic flux density can be obtained by substituting (15) and (16) into (14) as follows:
$B_{ix} = -\frac{\partial A_y}{\partial z} = -\frac{\partial}{\partial z}\{A_y(a_1, 2b_1, z_0) - A_y(-a_1, 2b_1, z_0)\} = \frac{2j\mu_0 I \sin(\xi a_1)\sin(\eta b_1)}{\eta}\, e^{\sqrt{\xi^2+\eta^2}\,(z-z_0)}$ (17)
similarly, the y and z components of the magnetic flux density can be obtained as:
$B_{iy} = \frac{2j\mu_0 I \sin(\xi a_1)\sin(\eta b_1)}{\xi}\, e^{\sqrt{\xi^2+\eta^2}\,(z-z_0)}$ (18)
$B_{iz} = \frac{2\mu_0 I \sqrt{\xi^2+\eta^2}\, \sin(\xi a_1)\sin(\eta b_1)}{\xi\eta}\, e^{\sqrt{\xi^2+\eta^2}\,(z-z_0)}$ (19)
the general solution for the z component of the incident magnetic flux density in region 0 is:
$B_{iz} = C_{iz}\, e^{\sqrt{\xi^2+\eta^2}\, z}$ (20)
where the coefficient $C_{iz}$ is:
$C_{iz} = \frac{2\mu_0 I \sqrt{\xi^2+\eta^2}\, \sin(\xi a_1)\sin(\eta b_1)}{\xi\eta}\, e^{-\sqrt{\xi^2+\eta^2}\, z_0}$ (21)
2.3. reflected magnetic flux density
performing the fourier transform on (4), the reflected magnetic flux density in region 0 can be expressed as:
$\frac{\partial^2 B_r}{\partial z^2} - (\xi^2+\eta^2) B_r = 0$ (22)
in similar fashion, performing the fourier transform on (5) and (7), the magnetic flux density in regions 1 and 2 can be expressed as:
$\frac{\partial^2 B_1}{\partial z^2} - (\xi^2+\eta^2 + j\eta\mu_1\sigma_1 v + j\omega\mu_1\sigma_1) B_1 = 0$ (23)
$\frac{\partial^2 B_2}{\partial z^2} - (\xi^2+\eta^2 + j\eta\mu_2\sigma_2 v + j\omega\mu_2\sigma_2) B_2 = 0$ (24)
the normal component of $B$ and the tangential components of $H$ must be continuous on the z = 0 and z = -d planes.
applying the continuity of $B_z$, we obtain
$B_{iz} + B_{rz} = B_{1z}$ (z = 0) (25)
$B_{1z} = B_{2z}$ (z = -d) (26)
applying the continuity of $H_x$, we obtain
$\frac{1}{\mu_0}(B_{ix} + B_{rx}) = \frac{1}{\mu_1} B_{1x}$ (z = 0) (27)
$\frac{1}{\mu_1} B_{1x} = \frac{1}{\mu_2} B_{2x}$ (z = -d) (28)
applying the continuity of $H_y$, we obtain
$\frac{1}{\mu_0}(B_{iy} + B_{ry}) = \frac{1}{\mu_1} B_{1y}$ (z = 0) (29)
because $\nabla \cdot J = 0$, the current density component $J_z$ does not exist in regions 1 and 2, and we get:
$\eta B_{1x} = \xi B_{1y}$ (30)
$\eta B_{2x} = \xi B_{2y}$ (31)
the following equations are obtained from (3):
$j\eta B_{rz} = \frac{\partial B_{ry}}{\partial z}$ (32)
$j\xi B_{rz} = \frac{\partial B_{rx}}{\partial z}$ (33)
following similar steps, the following equations are obtained from (6) and (8):
$j\xi B_{1x} + j\eta B_{1y} + \frac{\partial B_{1z}}{\partial z} = 0$ (34)
$j\xi B_{2x} + j\eta B_{2y} + \frac{\partial B_{2z}}{\partial z} = 0$ (35)
the coefficient of the reflected magnetic flux density is obtained by solving the above equations:
$D_{rz} = \frac{(1-n) + (1+n)\, p\, e^{-2\alpha_1 d}}{(1+n) + (1-n)\, p\, e^{-2\alpha_1 d}}\, C_{iz}$ (36)
where
$\xi = \kappa\cos\varphi$, $\eta = \kappa\sin\varphi$, $\kappa = \sqrt{\xi^2+\eta^2}$, $m = \frac{\alpha_2\mu_1}{\alpha_1\mu_2}$, $p = \frac{1-m}{1+m}$, $n = \frac{\alpha_1\mu_0}{\kappa\mu_1}$, $\alpha_1^2 = \xi^2+\eta^2 + j\eta\mu_1\sigma_1 v + j\omega\mu_1\sigma_1$ (37)
let
$\Gamma = \frac{(1-n) + (1+n)\, p\, e^{-2\alpha_1 d}}{(1+n) + (1-n)\, p\, e^{-2\alpha_1 d}}$ (38)
the coefficient of the reflected magnetic flux density becomes:
$D_{rz} = \Gamma\, \frac{2\mu_0 I \kappa \sin(\xi a_1)\sin(\eta b_1)}{\xi\eta}\, e^{-\kappa z_0}$ (39)
$D_{rx} = -\frac{j\xi}{\kappa} D_{rz} = -\Gamma\, \frac{2j\mu_0 I \sin(\xi a_1)\sin(\eta b_1)}{\eta}\, e^{-\kappa z_0}$ (40)
the x component of the reflected magnetic flux density becomes:
$B_{rx} = D_{rx}\, e^{-\kappa z} = -\Gamma\, \frac{2j\mu_0 I \sin(\xi a_1)\sin(\eta b_1)}{\eta}\, e^{-\kappa (z+z_0)}$ (41)
the x component of the reflected magnetic flux density in region 0 is obtained by performing the inverse fourier transform on (41):
$B_{rx} = -\frac{j\mu_0 I}{2\pi^2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \Gamma\, \frac{\sin(\xi a_1)\sin(\eta b_1)}{\eta}\, e^{-\kappa(z+z_0)}\, e^{j(\xi x+\eta y)}\, d\xi\, d\eta$ (42)
fig. 2 shows two multi-turn rectangular coils obtained by extending the two single-turn coils shown in fig. 1 in width and length respectively. the coil parallel to the surface of the conductor is the excitation coil and the coil perpendicular to the conductor is the pick-up coil.
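the ratio in (36) and (38) between the reflected and the incident coefficient can be evaluated numerically. the following is a minimal sketch, not the authors' implementation: the function names are hypothetical, and $\alpha_2$ is assumed to be defined by analogy with $\alpha_1$ in (37), since only $\alpha_1$ is given explicitly there.

```python
import cmath
import math

MU0 = 4e-7 * math.pi  # permeability of free space (H/m)


def alpha(xi, eta, omega, mu, sigma, v=0.0):
    """Layer constant, alpha^2 = xi^2 + eta^2 + j*eta*mu*sigma*v + j*omega*mu*sigma.

    The form follows (37) for layer 1; using it for layer 2 is an assumption.
    cmath.sqrt returns the root with non-negative real part (decaying solution).
    """
    return cmath.sqrt(xi**2 + eta**2
                      + 1j * eta * mu * sigma * v
                      + 1j * omega * mu * sigma)


def reflection_ratio(xi, eta, omega, d, mu1, sigma1, mu2, sigma2, v=0.0):
    """Ratio D_rz / C_iz of (36) for a top plate of thickness d on a lower plate."""
    kappa = math.hypot(xi, eta)                 # sqrt(xi^2 + eta^2)
    a1 = alpha(xi, eta, omega, mu1, sigma1, v)
    a2 = alpha(xi, eta, omega, mu2, sigma2, v)
    m = (a2 * mu1) / (a1 * mu2)
    p = (1 - m) / (1 + m)
    n = (a1 * MU0) / (kappa * mu1)
    e = cmath.exp(-2 * a1 * d)
    return ((1 - n) + (1 + n) * p * e) / ((1 + n) + (1 - n) * p * e)


if __name__ == "__main__":
    # plate data of table 2, 200 um top layer, 2 khz, stationary conductor;
    # the spatial frequencies (rad/m) are arbitrary illustrative values
    omega = 2 * math.pi * 2e3
    g = reflection_ratio(100.0, 150.0, omega, 200e-6, MU0, 3.8e7, MU0, 5.8e7)
    print(abs(g), cmath.phase(g))
```

two limiting cases give a quick sanity check of the expression: with zero conductivity and free-space permeability in both layers the ratio vanishes (no reflected field), and at very high frequency it tends to -1 (the field is fully excluded from the conductor).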
the numbers of turns of the excitation and pick-up coils are $n_1$ and $n_2$ respectively. the lower surfaces of the two rectangular coils are level with each other.
fig. 2 configuration of two multi-turn rectangular coils
the reflected magnetic flux density generated by the multi-turn rectangular exciting coil shown in fig. 2 is obtained by integrating (42) with respect to the width and length as follows:
$B_{rx}^{total} = \frac{n_1}{w_1 h_1}\int_{z_1}^{z_1+h_1} dz_0 \int_0^{w_1} B_{rx}\, dp$
$= -\frac{j\mu_0 I n_1}{2\pi^2 w_1 h_1}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\frac{\Gamma}{\eta}\left\{\int_0^{w_1}\sin[\xi(a_1+p)]\sin[\eta(b_1+p)]\, dp\right\}\left\{\int_{z_1}^{z_1+h_1} e^{-\kappa z_0}\, dz_0\right\} e^{-\kappa z}\, e^{j(\xi x+\eta y)}\, d\xi\, d\eta$
$= -\frac{j\mu_0 I n_1}{2\pi^2 w_1 h_1}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\frac{\Gamma K_1}{\eta\kappa}\left[e^{-\kappa(z+z_1)} - e^{-\kappa(z+z_1+h_1)}\right] e^{j(\xi x+\eta y)}\, d\xi\, d\eta$ (43)
where
$\int_{z_1}^{z_1+h_1} e^{-\kappa z_0}\, dz_0 = \frac{1}{\kappa}\left[e^{-\kappa z_1} - e^{-\kappa(z_1+h_1)}\right]$ (44)
$K_1 = \int_0^{w_1}\sin[\xi(a_1+p)]\sin[\eta(b_1+p)]\, dp = \frac{\sin[(\xi-\eta)w_1 + \xi a_1 - \eta b_1] - \sin(\xi a_1 - \eta b_1)}{2(\xi-\eta)} + \frac{\sin(\xi a_1 + \eta b_1) - \sin[(\xi+\eta)w_1 + \xi a_1 + \eta b_1]}{2(\xi+\eta)}$ (45)
fig. 3 compares the variation of the x component of the reflected magnetic flux density calculated from (43) with that simulated using maxwell 3d. the simulation results are obtained by subtracting the x component of the magnetic flux density without the conductor from the x component of the magnetic flux density with the conductor. the points shown belong to the line between (-16, 0, 5) and (16, 0, 5), which is located below the exciting coil and above the conductive plate. it can be seen that the analytical results agree very well with the simulated results.
fig. 3 variation of the x component of the magnetic flux density calculated from the analytical method (fourier transform) and from fem simulation (horizontal axis: position along x axis (mm); vertical axis: flux density in gauss)
3. induced voltage in pickup coil
3.1. magnetic flux penetrating through the pick-up coil
to obtain the reflected magnetic flux penetrating through the multi-turn rectangular pickup coil shown in fig. 2, we first derive the reflected magnetic flux penetrating through a single-turn rectangular coil with side lengths $2a_2$ and $2b_2$, assumed to be located at $(c, 0, z_c)$, where $z_c = z_1 + w_2 + a_2$. the reflected magnetic flux penetrating through the single-turn coil is obtained by integrating (43) over the area of the coil:
$\Phi_r = \int_{z_c-a_2}^{z_c+a_2} dz \int_{-b_2}^{b_2} B_{rx}^{total}\big|_{x=c}\, dy$
$= -\frac{j\mu_0 I n_1}{2\pi^2 w_1 h_1}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\frac{\Gamma K_1}{\eta\kappa}\, e^{j\xi c}\left[e^{-\kappa z_1} - e^{-\kappa(z_1+h_1)}\right]\left\{\int_{z_c-a_2}^{z_c+a_2} e^{-\kappa z}\, dz\right\}\left\{\int_{-b_2}^{b_2} e^{j\eta y}\, dy\right\} d\xi\, d\eta$
$= -\frac{j\mu_0 I n_1}{\pi^2 w_1 h_1}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\frac{\Gamma K_1}{\eta^2\kappa^2}\left[e^{-\kappa z_1} - e^{-\kappa(z_1+h_1)}\right] e^{-\kappa z_c}\, e^{j\xi c}\left[e^{\kappa a_2} - e^{-\kappa a_2}\right]\sin(\eta b_2)\, d\xi\, d\eta$ (46)
the reflected magnetic flux penetrating through the multi-turn rectangular pickup coil is then obtained by integrating (46) with respect to the width and length of the pickup coil as follows:
$\Phi_r^{total} = \frac{n_2}{w_2 h_2}\int_{c-h_2/2}^{c+h_2/2} dc' \int_0^{w_2} \Phi_r\, dp$
$= -\frac{j\mu_0 I n_1 n_2}{\pi^2 w_1 h_1 w_2 h_2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\frac{\Gamma K_1}{\eta^2\kappa^2}\left[e^{-\kappa(z_c+z_1)} - e^{-\kappa(z_c+z_1+h_1)}\right]\left\{\int_{c-h_2/2}^{c+h_2/2} e^{j\xi c'}\, dc'\right\}\left\{\int_0^{w_2}\left[e^{\kappa(a_2+p)} - e^{-\kappa(a_2+p)}\right]\sin[\eta(b_2+p)]\, dp\right\} d\xi\, d\eta$
$= -\frac{2j\mu_0 I n_1 n_2}{\pi^2 w_1 h_1 w_2 h_2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\frac{\Gamma K_1 K_2 \sin(\xi h_2/2)}{\xi\eta^2\kappa^2}\, e^{j\xi c}\left[e^{-\kappa(2z_1+w_2+a_2)} - e^{-\kappa(2z_1+w_2+a_2+h_1)}\right] d\xi\, d\eta$ (47)
where
$K_2 = \int_0^{w_2}\left[e^{\kappa(a_2+p)} - e^{-\kappa(a_2+p)}\right]\sin[\eta(b_2+p)]\, dp$
$= \frac{\kappa\sin[\eta(b_2+w_2)]\left[e^{\kappa(a_2+w_2)} + e^{-\kappa(a_2+w_2)}\right] - \eta\cos[\eta(b_2+w_2)]\left[e^{\kappa(a_2+w_2)} - e^{-\kappa(a_2+w_2)}\right]}{\kappa^2+\eta^2} - \frac{\kappa\sin(\eta b_2)\left[e^{\kappa a_2} + e^{-\kappa a_2}\right] - \eta\cos(\eta b_2)\left[e^{\kappa a_2} - e^{-\kappa a_2}\right]}{\kappa^2+\eta^2}$ (48)
3.2. induced voltage in the rectangular pickup coil
the relationship between the magnetic flux penetrating through the pickup coil and the induced voltage is:
$V = -\frac{d\Phi}{dt} = -j\omega\Phi$ (49)
therefore, the induced voltage can be derived as:
$V = -\frac{2\omega\mu_0 I n_1 n_2}{\pi^2 w_1 h_1 w_2 h_2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\frac{\Gamma K_1 K_2 \sin(\xi h_2/2)}{\xi\eta^2\kappa^2}\, e^{j\xi c}\left[e^{-\kappa(2z_1+w_2+a_2)} - e^{-\kappa(2z_1+w_2+a_2+h_1)}\right] d\xi\, d\eta$ (50)
4. results
the induced voltage variation of the rectangular pick-up coil is now calculated, taking the influencing factors into account, from the expressions derived in the previous section. the parameters of the coils and the conductive plates are given in tables 1 and 2 respectively.
table 1 parameters of the rectangular coils
exciting coil          pick-up coil
a1 (mm)   12           a2 (mm)   3
b1 (mm)   12           b2 (mm)   5
z1 (mm)   1            z1 (mm)   1
w1 (mm)   2            w2 (mm)   5
h1 (mm)   8            h2 (mm)   2
turns     500          c (mm)    6
                       turns     300
table 2 parameters of the conductive plate
top layer      σ1 (s/m)   3.8×10^7     μr1   1
lower layer    σ2 (s/m)   5.8×10^7     μr2   1
fig. 4 shows the induced voltage due to the conductive plates as a function of the excitation frequency. the top-layer conductor is 200 μm thick, the lower layer is semi-infinite, and both conductors are stationary. c is the distance from the center of the pick-up coil to the z axis. it can be seen from fig. 4 that the variation of the induced voltage increases with frequency. at any given exciting frequency, a pick-up coil at a larger distance from the z axis has a higher induced voltage.
fig. 4 induced voltage in the pickup coil as a function of exciting frequency (horizontal axis: frequency (khz); vertical axis: real part of induced voltage (v); curves: c = 3 mm, c = 6 mm, c = 9 mm)
fig. 5 compares the induced voltage calculated from the analytical method and fem simulation.
the analytical results are calculated as the square root of the sum of the squares of the real and imaginary parts of the induced voltage. the fem results are the effective values of the induced voltage obtained in the pick-up coil, simulated with a time-dependent formulation.
fig. 5 comparison of the induced voltage variation in the rectangular pick-up coil from the analytical method and fem at different excitation frequencies (horizontal axis: frequency (khz); vertical axis: induced voltage in pick-up coil (v))
the induced voltages in the coil for different thicknesses of the top-layer conductor are shown in fig. 6. the excitation frequencies are fixed at 0.5, 2 and 5 khz respectively, and the conductor is stationary. the distance from the center of the pick-up coil to the z axis is fixed at 9 mm. the induced voltage variation initially increases with the thickness, reaches a maximum at a specific thickness, and then decreases with increasing thickness. as can be seen from fig. 6, a higher excitation frequency produces a higher maximum at a smaller thickness, but the induced voltage then decreases faster as the excitation frequency increases.
fig. 6 induced voltage in the pickup coil as a function of top-layer conductor thickness (horizontal axis: thickness of top-layer conductor (mm); vertical axis: real part of induced voltage (v); curves: 0.5 khz, 2 khz, 5 khz)
the speed characteristics are shown in fig. 7. the induced voltage variations are calculated at speeds from v = 0 to 50 m/s. the excitation frequency is fixed at 2 khz. fig. 7 shows the difference between the induced voltage of the coil at different speeds of the conductor and the induced voltage of the coil when the conductor is stationary.
the induced voltage variation of the rectangular coil keeps increasing with the moving speed of the conductor, and the maximum variation of the induced voltage is obtained with a top-layer conductor thickness of 200 μm.
fig. 7 induced voltage of the pickup coil at different speeds of the conductor (horizontal axis: moving speed of conductor (m/s); vertical axis: real part of the voltage difference (mv); curves: top-layer thickness 50 μm, 100 μm, 200 μm, 1000 μm)
5. conclusion
a closed-form expression for the induced voltage between a pair of rectangular coils above a multi-layered conductive plate has been derived using a 2d fourier transform method. the excitation coil is parallel to the plates and the pickup coil is perpendicular to the conductor. we discussed the factors that influence the induced voltage, such as the excitation frequency, the thickness of the top-layer conductor and the speed of the conductor. the calculation model and results can be extended and used in the forward model of quantitative detection for eddy current testing of multi-layer conductive structures.
acknowledgment: the authors would like to thank shanghai maritime university and the national natural science foundation of china (51175321) for their financial support.
references
[1] t. theodoulidis, n. poulakis, a. dragogias, "rapid computation of eddy current signals from narrow cracks", ndt&e international, vol. 43, pp. 13-19, 2010.
[2] l. guohou, h. pingjie, c. peihua, "quantitative nondestructive estimation of deep defects in conductive structures", international journal of applied electromagnetics and mechanics, vol. 33 (3-4), pp. 1273-1278, 2010.
[3] j.w. luquire, w.e. deeds, c.v. dodd, "alternating current distribution between planar conductors", journal of applied physics, vol. 41 (10), pp. 3983-3991, 1970.
[4] t.p. theodoulidis, e.e. kriezis, "impedance evaluation of rectangular coils for eddy current testing of planar media", ndt & e international, vol. 35 (6), pp. 407-414, 2002.
[5] y. lei, x. ma, "calculation of impedance in an eddy-current coil by numerical integration method", transactions of china electrotechnical society, vol. 11 (1), pp. 17-20, 1996.
[6] p. huang, z. wu, j. zheng, "inversion algorithms for multi-layered thickness measurement in eddy current testing", chinese journal of scientific instrument, vol. 26 (4), pp. 428-432, 2005.
[7] t. theodoulidis, e. kriezis, "series expansions in eddy current nondestructive evaluation models", journal of materials processing technology, vol. 161 (5), 2005.
[8] c.v. dodd, w.e. deeds, "analytical solutions to eddy current probe-coil problems", journal of applied physics, vol. 39 (6), pp. 2829-2838, 1968.
[9] y.u. yating, d.u. ping an, l.i. daisheng, "computational methods of coil impedance of eddy current sensor", chinese journal of mechanical engineering, vol. 43 (2), pp. 210-214, 2007.
[10] j.-l. ren, h.-b. diao, j.-h. tang, "simulation of the lift-off effect of eddy current testing based on ansys", chinese journal of sensors and actuation, vol. 21 (6), pp. 967-971, 2008.

facta universitatis series: electronics and energetics vol. 35, no 3, september 2022, pp. 421-435 https://doi.org/10.2298/fuee2203421m
© 2022 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper
control of series impedance of power lines using power flow controller
aleksandar aco marković1,2, slobodan vukosavić2,3
1university of banja luka, faculty of electrical engineering, banja luka, republic of srpska, bosnia and herzegovina
2university of belgrade, faculty of electrical engineering, belgrade, serbia
3serbian academy of sciences and arts, belgrade, serbia
abstract. this paper discusses the ability of a unified power flow controller (upfc) to modulate both the series resistance r and the series reactance x of an overhead power line.
the classical power flow control system of the upfc is modified so that the standard input reference signals (active and reactive powers) are replaced by reference signals for the series resistance and reactance. using the procedure described in this work, the reference signals for active and reactive powers are generated indirectly. the operation of the upfc in the proposed operation mode is analyzed using computer simulation, based on a single machine infinite bus (smib) model with constant impedance loads and two parallel lines. the goal is to show that the upfc is capable of controlling both series line parameters (r and x) directly and independently by means of a simple control system without additional decoupling controllers. an additional task is to show that power flows can be controlled indirectly in this way. the step response of the series line resistance and reactance is used to validate the operation of the proposed control system. the obtained results clearly show that all goals are fulfilled.
key words: unified power flow controller, impedance regulation, power system, power flows
received february 16, 2022; revised april 4, 2022; accepted may 25, 2022
corresponding author: aleksandar aco marković, university of banja luka, faculty of electrical engineering, 5 patre, banja luka, 78 000 banja luka, republic of srpska, bosnia and herzegovina, e-mail: aleksandar-aco.markovic@etf.unibl.org
1. introduction
with the introduction of variable sources and electronically controlled loads in ac grids, the integrity of the grid is challenged by reduced system inertia, by limited support for transients from power electronics devices, and by the quite new and different static and dynamic properties of the sources and loads that interface the ac grid through grid-side inverters. at the same time, the electric power required to run the internet and digitalization is rising steadily, while the decarbonization of transport by means of electrification requires a further increase in electric energy demand. these growing trends have a great impact on the power system, which must respond to increasingly complex requirements. some of these requirements are the response time of the system, the stability margin and the quality of the electrical energy delivered to the consumer. to fulfil all these requirements, flexible alternating current transmission system (facts) devices are introduced into the system. the most complex and the most substantial of all facts devices is the unified power flow controller (upfc). the main reason for introducing a upfc into the system is the need for independent control of active and reactive power flows in power systems [1]. currently, upfcs are mostly used in two different operation modes: either voltage and power flow control mode or active power oscillation reduction mode. many algorithms have been proposed for both operation modes. control algorithms for voltage and power flow control are often based on proportional-integral (pi) controllers. the simplest control system is described in the dq reference frame and generates the desired upfc voltage reference from the acquired feedback signals [2]. the feedback signals are usually line currents as well as active and reactive powers. there are several similar control systems that differ only slightly in their parameter settings, chosen to achieve better performance or faster response [3-5]. however, some authors prefer using several feedback signals (up to three) to achieve better performance and faster stabilization [6],[7].
in this way, several pi controllers are connected in cascade, which reduces the phase margin and has a negative impact on stability and robustness. these problems are not discussed in the literature. besides pi controllers, some novel approaches are discussed too, such as fuzzy controllers and neural networks. these types of controllers are suitable for nonlinear systems such as the power system. fuzzy controllers are used in a hybrid version, where only the p part of the pi controller is fuzzy-based and everything else is classical pi control [8]. a more complex approach uses a complete pi controller based on fuzzy logic [9-11]. it is noted that fuzzy-based control schemes provide a faster response, at the cost of a rather complex and involved selection of suitable membership function types and domains, mostly performed on a trial-and-error basis rather than by an exact mathematical procedure that would lead to predictable results. additionally, these algorithms can be computationally intensive. neural networks are also an option for upfc control. usually, simple algorithms based on radial basis neural networks with a single neuron in the hidden layer and a gaussian activation function are used [8]. there is also a hybrid version of the controller which combines classical pi control with a neural network. the neural network, based on error back propagation, uses the deviation of the variables of interest to generate an output which is summed with the outputs of the pi controllers [12]. the latest approach is to use neural network based controllers to generate auxiliary signals for the reduction of active power oscillations [13]. in this way, a faster response and attenuation of active power oscillations can be achieved. however, these algorithms also have several shortcomings. their main disadvantage is that their stability cannot be mathematically proven [14], and they are rather complex for practical implementation.
it can be seen that almost all algorithms in the relevant literature use active and reactive power and nominal bus voltage as reference signals. some modifications of these algorithms use d and q axis currents which are calculated from the active and reactive power references. there are some attempts to use the upfc for reactance control [6], [15], [16]. however, in these works the upfc is used only as a shunt device, so it is not capable of controlling resistance in this operation mode. sometimes the term "impedance control" is used to describe reactance control, as explained in [17]. the authors did not find any relevant literature dealing with the use of a upfc for independent control of series line resistance and reactance.
in this paper, a control solution is proposed in which the series resistance r and the series reactance x are treated as reference signals, while the upfc performs a complete emulation of the series impedance. this means that the upfc generates the appropriate voltages to maintain the series resistance and reactance at the desired levels, thus exploiting the whole potential of the upfc hardware.
2. topology and mathematical model of upfc
in this section the topology of a standard upfc system is discussed. additionally, a mathematical model of the upfc suitable for series power line impedance modulation is derived from the classical power flow upfc model. all mathematical equations are self-derived from the proposed equivalent schemes.
2.1. upfc topology
the topology of a upfc device is shown in fig. 1.
fig. 1 topology of upfc
in fig. 1, the upfc is connected to bus k, and it can control the power flow between buses k and k+1 along the power line with impedance zt. the device consists of two power converters (pc1 and pc2), whose operation is based on power electronics switching devices. the first upfc, installed in the usa, used gate turn-off (gto) thyristors as switching devices, which operated at grid frequency.
the latest upfcs, installed in china, use insulated gate bipolar transistors (igbt) as switching devices, combined in modular multilevel converters (mmc), and they operate at frequencies near 1 khz [18],[19]. in the upfc topology, two power transformers are obligatory (tr1 and tr2, fig. 1). the shunt transformer (tr1) is a classical power transformer. the series transformer (tr2) has a much more complicated construction, since it has to withstand the line current and sometimes even short circuit currents for a small fraction of time. auxiliary transformers (atr1 and atr2, fig. 1) are not always necessary. they are usually used when gtos are used in the power converters, to create the appropriate phase shift. the series transformer (tr2) and the series converter (pc2) form the series part of the upfc, which is called a static synchronous series compensator (sssc). the shunt transformer (tr1) and the shunt converter (pc1) together form the shunt part of the upfc, which is called a static compensator (statcom). these two devices can operate separately from each other. however, when the dc switch (dcs) is closed, the shunt and series parts share the same dc link, and that configuration is called a upfc. in this configuration, it is possible to achieve more complex control tasks than by using the statcom and the sssc independently.
2.2. upfc mathematical model
to describe the upfc more precisely, the equivalent scheme shown in fig. 2.a can be observed.
fig. 2 a - upfc equivalent scheme, b - phasor diagram
variables uk and uk+1 represent the complex voltages on busbars k and k+1, respectively. complex voltage use denotes the series voltage inserted into the power line through the series power transformer. complex voltage ush is generated using the shunt transformer. the modified voltage phasor on the sending end, u'k, is the vector sum of voltages uk and use. impedance zsh describes the shunt impedance of the upfc, while zt is the series impedance of the transmission power line.
line current i flows through the series transformer, and current ish flows through the shunt part of the upfc, supplying the dc link with the appropriate energy. in order to see how the upfc-generated voltages use and ush influence power system operation, the apparent power on the sending end can be observed (1), where complex quantities are underlined.
$\underline{S}_k = \underline{U}'_k\, \underline{I}^* = \underline{U}'_k \left(\frac{\underline{U}'_k - \underline{U}_{k+1}}{\underline{Z}_T}\right)^*$ (1)
according to the phasor diagram (fig. 2b), the voltages can be expressed using their effective values and phases (2).
$\underline{U}_k = U_k e^{j\delta_k}$, $\underline{U}_{k+1} = U_{k+1} e^{j\delta_{k+1}}$, $\underline{U}_{se} = U_{se} e^{j\delta_{se}}$ (2)
substituting (2) into (1), and using the previously explained condition $\underline{U}'_k = U_k e^{j\delta_k} + U_{se} e^{j\delta_{se}}$, equation (1) becomes (3).
$\underline{S}_k = \frac{U_k'^2}{R_T - jX_T} - \frac{U_k U_{k+1} e^{j(\delta_k - \delta_{k+1})} + U_{k+1} U_{se} e^{j(\delta_{se} - \delta_{k+1})}}{R_T - jX_T}$ (3)
the real and imaginary parts of (3) are given by (4) and (5), respectively. for simplicity, the resistance r is neglected, because the ratio r/x is approximately 1/11 for 400 kv power lines. further, the appropriate phases are expressed as δ = δk - δk+1 and δ' = δse - δk+1.
$P = \frac{U_k U_{k+1}}{X_T}\sin(\delta) + \frac{U_{se} U_{k+1}}{X_T}\sin(\delta') = f(U_{se}, \delta_{se})$ (4)
$Q = \frac{U_k'^2}{X_T} - \frac{U_k U_{k+1}}{X_T}\cos(\delta) - \frac{U_{se} U_{k+1}}{X_T}\cos(\delta') = f(U_{se}, \delta_{se})$ (5)
equations (4) and (5) represent the active and reactive powers on the sending end, respectively. these equations are functions of the effective value of the series voltage use and of its phase δse. active power p can be dominantly controlled by generating the appropriate phase δse, while reactive power q is controlled by generating an adequate series voltage amplitude. the importance of the upfc lies in the fact that the effective value of the series voltage use can be changed from zero to its maximal value use,m, and the series voltage phase δse can be changed from 0 to 2π. this is possible only because the two power converters share the same dc link. in power control mode of operation, the shunt part of the upfc is used for delivering energy to the series part. the active power exchanged between the two converters is denoted as pex.
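the step from (1) to (4) and (5) can be checked numerically. the sketch below is illustrative only: the function names and the per-unit operating point are hypothetical, and the line is assumed lossless (r neglected), as in the text.

```python
import cmath
import math


def power_from_formulas(Uk, dk, Uk1, dk1, Use, dse, XT):
    """Sending-end P and Q from (4) and (5), lossless line (R_T = 0)."""
    delta = dk - dk1        # delta  = delta_k  - delta_{k+1}
    delta_p = dse - dk1     # delta' = delta_se - delta_{k+1}
    # effective value of the modified sending-end voltage U'_k = U_k + U_se
    Ukp = abs(cmath.rect(Uk, dk) + cmath.rect(Use, dse))
    P = Uk * Uk1 / XT * math.sin(delta) + Use * Uk1 / XT * math.sin(delta_p)
    Q = (Ukp**2 / XT
         - Uk * Uk1 / XT * math.cos(delta)
         - Use * Uk1 / XT * math.cos(delta_p))
    return P, Q


def power_from_phasors(Uk, dk, Uk1, dk1, Use, dse, XT):
    """Same quantities computed directly from (1) with Z_T = j*X_T."""
    Ukp = cmath.rect(Uk, dk) + cmath.rect(Use, dse)
    I = (Ukp - cmath.rect(Uk1, dk1)) / (1j * XT)
    S = Ukp * I.conjugate()
    return S.real, S.imag


if __name__ == "__main__":
    # hypothetical per-unit operating point
    args = (1.0, 0.10, 1.0, 0.0, 0.05, 1.2, 0.1)
    print(power_from_formulas(*args))
    print(power_from_phasors(*args))
```

both functions should return the same pair of values for any operating point, which confirms that (4) and (5) are the real and imaginary parts of (1) once the line resistance is neglected.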
it should be pointed out that reactive power cannot be transferred through the dc link, so every converter has to generate or absorb its reactive power locally. the shunt part is also used for keeping the voltage amplitude of bus k at the desired level, which is done by absorbing or injecting reactive energy. additionally, this part of the upfc is used for controlling the dc link voltage by controlling the exchanged active power pex. the apparent power generated or absorbed by the shunt part, ssh, can be expressed by (6).

$$S_{sh} = U_k I_{sh}^{*} = U_k \left( \frac{U_k - U_{sh}}{Z_{sh}} \right)^{*} \tag{6}$$

the real part of (6) represents the shunt active power psh and the imaginary part is the shunt reactive power qsh. the model of the dc link can be described by (7).

$$P_{ex} = P_{sh} - P_{se} = i_C u_{DC} = u_{DC}\, C \frac{du_{DC}}{dt} \tag{7}$$

in (7), pse represents the active power generated by the series part of the upfc, ic is the current flowing through the dc link capacitor, udc is the dc link voltage and c represents the capacitor capacitance. traditionally, control of the upfc is done by generating appropriate series use and shunt ush voltages. these voltages are generated by the control system (fig. 1), whose goal is to regulate the active and reactive powers as well as the nominal voltage on the k-th busbar.

3. proposed control scheme

the main idea for the control system is to use the desired values of line resistance and reactance as reference signals. these signals are then used to calculate the appropriate references for active and reactive powers. to accomplish this, the control system of the series part of the upfc should be modified, while the control system of the shunt part can be kept the same as in the standard control systems of upfc used in the power flow control mode of operation.

3.1. upfc series part control scheme

unlike the previously described classical control schemes, the upfc can also be used in impedance control operation mode. to formulate the control law, the equivalent scheme shown in fig. 3 can be observed.

fig. 3 upfc equivalent scheme for impedance control operation mode

in this case, the series part of the converter can be observed as a variable impedance z, unlike the classical study where the series part is represented by a voltage source (fig. 2a). the line current i should remain the same, independently of the equivalent scheme (fig. 2a or fig. 3). the line current from fig. 3 can be expressed by (8).

$$I = \frac{U_k + U_{se} - U_{k+1}}{Z_L} \tag{8}$$

in this case, the voltage vector use can be varied, while zt is constant. the line current calculated using the equivalent scheme from fig. 3 is given by (9).

$$I = \frac{U_k - U_{k+1}}{Z_e} \tag{9}$$

in (9), ze is the equivalent line impedance, expressed as the sum of the variable part of the impedance z and the fixed impedance zt. the currents expressed by (8) and (9) should be equal. from this equality, the expression for the variable part of the impedance is easily obtained (10).

$$Z = -\frac{U_{se}}{I} \tag{10}$$

the variable impedance z is expressed using the series injection voltage use and the line current i, both of which can be measured in a real power system. the apparent power on the power line, according to fig. 3, is expressed by (11).

$$S_{k,ref} = U_k \left( \frac{U_k - U_{k+1}}{Z_{e,ref}} \right)^{*} = P_{ref} + jQ_{ref} \tag{11}$$

equation (11) shows that the reference values for active and reactive power, pref and qref, respectively, can be expressed indirectly by assigning reference values for the equivalent impedance ze. the calculated power references pref and qref are compared with the measured active and reactive powers given by (4) and (5). the active power error signal is the input of a pi controller (pi1, fig. 4.a), whose output is the imaginary part useq of the complex voltage vector use. the reactive power error signal feeds another pi controller (pi2, fig. 4.a), whose output represents the real part used of the complex voltage vector use. the control scheme of the series part of the upfc is shown in fig. 4.a.

fig. 4 a. upfc series part control scheme, b. upfc shunt part control scheme

3.2. upfc shunt part control scheme

for the proposed control scheme, based on impedance control, the shunt part can be controlled classically. that is, the shunt part complex voltage is generated using two pi controllers. the complete control scheme of the shunt part of the upfc is shown in fig. 4.b. the first pi regulator (pi3, fig. 4.b) is used to generate the real part ushd of the complex voltage ush. this regulator is fed by the error signal generated as the difference between the reference dc link voltage udc,ref and the measured dc link voltage udc, which is obtained using (7). the imaginary part ushq of the complex voltage vector ush is generated using a pi controller (pi4, fig. 4.b), whose input signal is the difference between the reference (usually nominal) voltage on bus k, uk,ref, and the measured voltage uk. the controllers used in the control schemes (fig. 4) are discrete type pi controllers in positional form with anti-windup mechanism (fig. 5).

fig. 5 discrete type pi controller with anti-windup mechanism

in fig. 5, signals f and y represent the input and output signals, respectively. parameters kp and ki are the proportional and integral gains, respectively, while parameter kc is calculated as the ratio ki/kp. the sampling time is denoted as t. all control parameters are given in appendix a.

4. test system model

operation of the upfc in impedance control mode is tested by means of computer simulation on a simple power system, shown in fig. 6. the system is the classical single-machine infinite bus system with parallel lines. this type of system is widely used for demonstration of upfc performance in power regulation and active power oscillation suppression [20–23].

fig. 6 test system model

the model of the test power system (fig. 6) consists of four buses. buses 1 and 4 are generator buses, whereby bus 1 is the slack bus. buses 1 and 2 are connected by means of power lines having impedances zt1 and zt4, respectively.
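as an illustration of the discrete pi controllers with anti-windup used in the control schemes of fig. 4 (section 3.2), the following python sketch shows one possible positional-form implementation. the internal structure of fig. 5 is not reproduced in the text, so the back-calculation anti-windup law, the use of kc as the back-calculation gain, and the output limits y_min and y_max are assumptions made only for illustration. the gains are those of pi1 from appendix a, while the sampling time and the limits are hypothetical values.

```python
from dataclasses import dataclass


@dataclass
class DiscretePI:
    """Positional-form discrete PI with back-calculation anti-windup.

    Assumed structure (not taken from fig. 5): the integrator state is
    corrected by kc * (y - u), where u is the unsaturated output and y
    the saturated one. kc = ki / kp as stated in the text.
    """
    kp: float            # proportional gain
    ki: float            # integral gain
    t_s: float           # sampling time t (hypothetical value below)
    y_min: float         # lower output limit (assumption)
    y_max: float         # upper output limit (assumption)
    integ: float = 0.0   # integrator state

    @property
    def kc(self) -> float:
        return self.ki / self.kp

    def step(self, f: float) -> float:
        """Process one input sample f (the error) and return the output y."""
        u = self.kp * f + self.integ                # unsaturated output
        y = min(max(u, self.y_min), self.y_max)     # saturated output
        # integrate the error plus the anti-windup correction
        self.integ += self.t_s * (self.ki * f + self.kc * (y - u))
        return y


# pi1 gains from appendix a; 1 ms sampling and +/-0.1 p.u. limits are assumed
pi1 = DiscretePI(kp=0.1, ki=1.0, t_s=1e-3, y_min=-0.1, y_max=0.1)
for _ in range(5000):
    y_sat = pi1.step(1.0)      # constant positive error drives saturation
y_after = pi1.step(-1.0)       # error reversal
```

with this assumed structure the integrator state stays bounded while the output is saturated, so the output leaves the limit on the first sample after the error changes sign.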
buses 2 and 3 are connected by means of parallel lines with impedances zt2 and zt3. constant impedance loads are connected to buses 2, 3 and 4, and their impedances are denoted as zl1, zl2 and zl3, respectively. the unified power flow controller is connected to bus 2, in series with the power line whose impedance is zt2. thus, the upfc is used for control of the impedance of this power line and simultaneously for controlling the bus 2 voltage amplitude. power generator g1 (fig. 6) is the slack generator, so it is modeled as a constant voltage source with nominal voltage. a detailed model of generator g2 is given in [24]. it consists of models of the electrical and mechanical subsystems suitable for observation of transient and steady state periods. the excitation system of this generator is modeled as a standard ieee type 1 excitation system. the system frequency controller is an integral type controller, while the turbine controller is modeled as a widely used first order system with droop characteristics. power lines are described by their series impedances, and the shunt parts of the power lines are neglected. all loads are modeled as constant impedance loads. parameters of the test system model are given in appendix b and they are expressed in the per unit system with respect to base power 100[mva] and base voltage 220[kv].

5. simulation results

in order to explore the ability of the upfc to control the series line impedance, a computer simulation is created in matlab simulink. the simulation is prepared according to the test system model (fig. 6) and the upfc mathematical model described in section iii. the simulation is divided into nine time segments (t1 – t9), each lasting 5[s]. the first time interval t1 starts at time t1 = 10[s] and lasts until time t2 = 15[s], and the last one, t9, starts at time t9 = 50[s] and lasts until the end of the simulation, which is 55[s]. all time intervals are shown in fig. 7.
the simulation results are observed from the time t1 = 10[s] in order to get clearer results and to skip the transient period. the aim of this simulation is to show the ability of the upfc to independently regulate line resistance and reactance. in order to cover the great majority of possible outcomes, different references of x and r are generated in every time interval. these are represented by step changes. the step responses of the measured resistance (black) and reactance (blue) of line 2 are given in fig. 7. dashed traces in fig. 7 represent nominal line parameters, when no compensation is done, that is re,ref=rt2=0.03[p.u] and xe,ref=xt2=0.2[p.u].

fig. 7 step change of equivalent line impedance

the goal is to generate higher and lower values of resistance and reactance compared to the uncompensated line parameters, in order to investigate whether the upfc is capable of independently compensating both line parameters. the step responses of x and r in fig. 7 show that the measured equivalent resistance and reactance follow the reference signals without steady state error. the step responses are almost aperiodic. when the reference of one of the parameters (x or r) is changed while the other parameter is kept constant, an undershoot or overshoot occurs in the response of the parameter which is kept constant. this can be observed in the transition from time period t3 to t4, when r=0.05[p.u] is kept constant and greater than nominal (uncompensated) and x=0.15[p.u], which is lower than nominal. in this case a disturbance in the measured resistance occurs in the form of an overshoot. however, this disturbance is negligible, and it happens due to the so-called coupling between active and reactive powers. similar disturbances can be seen in the transition from time period t2 to t3, when an overshoot occurs in the time response of the measured reactance, whereas in the transition from time period t6 to t7. the summarized results of the simulation for fig. 7 are given in table 1.
table 1 gives a brief description of time periods t1 to t9 using symbols that describe the direction of change of x and r relative to the previous time period: the symbol "−" means no change in x or r, "↘" lower x or r and "↗" higher x or r.

table 1 summarized results of step responses of equivalent line x and r
time periods: t1 t2 t3 t4 t5 t6 t7 t8 t9 (each with columns x and r)
dir.: "−" • • • • • • • •; "↘" • • • • •; "↗" • • • • •
step: ap.1 • • • • • • • • • • • •; over.2 • • •; under.3 • •
1aperiodic 2overshoot 3undershoot

the step response of equivalent x and r is characterized by its type, which can be aperiodic, with overshoot or with undershoot. the results in table 1 also show that the time response is mostly aperiodic. time responses of the variable resistance (blue) and reactance (red) are shown in fig. 8. when no compensation is done (time periods t1 and t8), the variable r and x are zero, which is in accordance with the theoretical discussion. the step responses are the same as the step responses of equivalent x and r (fig. 7), since the latter represent the sum of these signals with the constant, uncompensated values of x and r. it is interesting to notice that the variable x and r generated by the upfc can be both positive and negative. especially interesting is the possibility of generating negative resistance.

fig. 8 step change of variable reactance and resistance

the change of line resistance and reactance influences the active (red trace) and reactive (blue trace) powers in line 2, shown in fig. 9. dashed traces in fig. 9 represent the active and reactive powers when no compensation is done. step responses of the active and reactive powers are almost aperiodic. an overshoot in the active power step response happens in the transition from time period t3 to t4 (1.2%) and in the transition from time period t5 to t6 (2.9%).
however, these overshoots are under 5%, which is considered acceptable. it is important to notice that no oscillations in the active power response are present. comparing the results in fig. 7 with the results in fig. 9, it can be concluded that the step change in line reactance has the greatest impact on power changes, which is in accordance with the theoretical discussion.

fig. 9 active and reactive power change

to investigate the step response of the equivalent line resistance (black trace) more closely, fig. 10 can be observed. the traces shown in fig. 10 are the same as the traces from fig. 7, only enlarged.

fig. 10 the step response of equivalent line resistance

step responses of the equivalent resistance are mostly aperiodic, as previously stated. the step response is quite fast and it reaches the steady state within 4[s]. the enlarged parts in fig. 10 show the exact time responses of the equivalent resistance in the transition from time period t6 to t7, when an overshoot of 4% occurs, and in the transition from time period t8 to t9, when an overshoot of 3% occurs. these are acceptable values. however, greater disturbances evidently occur in the transitions from t3 to t4, t4 to t5 and t6 to t7. these disturbances can be lowered by designing an appropriate decoupling controller. further, the time responses of the upfc and bus 2 voltages are observed (fig. 11). the main purpose of the upfc is to insert a series voltage into the line in order to generate the reference equivalent resistance and reactance. step responses of the d (red trace) and q (blue trace) components of the upfc series voltage are shown in fig. 11a. when no compensation is done (time periods t1 and t8), the series voltage is equal to zero, which means that the series part of the upfc is inactive. in other time periods the series voltage changes in the appropriate manner to meet the regulation goals.
time responses are obviously aperiodic with very fast response, with time constant below 1[s]. the amplitude of the series injected voltage is within the range of 0.1[p.u], which is the typical maximal value of inserted series voltage in practical implementation [18].

fig. 11 a. upfc series voltage, b. upfc shunt voltage, c. bus 2 voltage amplitude, d. dc link voltage

the main task of the upfc shunt part is to keep the bus 2 and dc link voltages at nominal level. fig. 11c and fig. 11d show that this task is successfully accomplished, since the observed voltages are kept constant during all time periods and no disturbances are noted. the reason for this is the upfc shunt voltage, whose d (red trace) and q (blue trace) components are shown in fig. 11b, and which is also kept constant during all time periods thanks to the shunt part control system.

6. conclusion

the paper discusses the possibility of supporting the integrity of ac grids by introducing a unified power flow controller (upfc), enabled by the proposed controller, capable of modulating both the series resistance r and the series reactance x of an overhead power line. the proposed controller is simple to set and straightforward to use. the proposed operation mode of the upfc is tested on a single-machine infinite bus system consisting of four buses with a generator modeled in detail. the results show that the upfc is very efficient in compensating the line equivalent resistance and reactance. the step responses are aperiodic with zero steady state error and small settling time. decoupling controllers are not required, as the disturbances that take place during step changes of the reference signals are quite insignificant. this way, active and reactive powers on the line are controlled indirectly, by changing the line impedance. no oscillations in the active power step response are noted.
the described capability of the upfc has the potential of being used for attenuation of power angle deviations and power oscillations in large scale power systems experiencing significant power disturbances. however, this possibility is yet to be proven.

7. appendix a

parameters of the four pi regulators used, numbered as in fig. 4, are: kp1=0.1, ki1=1, kc1=10; kp2=0.1, ki2=1.4, kc2=14; kp3=2, ki3=10, kc3=5; kp4=5, ki4=10, kc4=2.

8. appendix b

parameters of the test power system are as follows:
▪ generator g2: xd=1.2[p.u], x'd=0.3[p.u], xq=1[p.u], t'd0=5[s], h=6[s], k=0.02;
▪ generator g2 voltage regulator: ka=20, ta=0.2[s];
▪ generator g2 turbine: tch=0.4[s];
▪ turbine regulator: tsv=0.2[s];
▪ system frequency regulator: tf=1[s];
▪ power lines: zv1=0.01+j0.1[p.u], zv2=0.03+j0.2[p.u], zv3=0.03+j0.4[p.u], zv4=0.01+j0.2[p.u];
▪ loads: zl1=2+j1[p.u], zl2=0.8+j0.6[p.u], zl3=0.8+j0.6[p.u];
▪ upfc parameters: zsh=0.001+j0.08[p.u], c=0.5[p.u].

references

[1] l. gyugyi, "unified power flow control concept for flexible ac transmission systems", iee proceedings c, vol. 139, no. 4, pp. 323–331, july 1992.
[2] s. d. round, q. yu, l. e. norum and t. m. undeland, "performance of a unified power flow controller using a d-q control system", in proceedings of the sixth international conference on ac and dc power transmission, london, 1996, pp. 357–362.
[3] k. r. padiyar and a. m. kulkarni, "control design and simulation of unified power flow controller", ieee trans. power deliv., vol. 13, no. 4, pp. 1348–1354, oct. 1998.
[4] i. papic, p. zunko, d. povh and m. weinhold, "basic control of unified power flow controller", ieee trans. power syst., vol. 12, no. 4, pp. 581–588, nov. 1997.
[5] h. fujita, y. watanabe and h. akagi, "control and analysis of a unified power flow controller", ieee trans. power electron., vol. 14, no. 6, pp. 1021–1027, nov. 1999.
[6] l. liu, y. zhang, p. zhu, y. kang and j.
chen, "control scheme and implement of a unified power flow controller", in proceedings of the international conference on electrical machines and systems, nanjing, 2005, pp. 1170–1175.
[7] l. liu, p. zhu, y. kang and j. chen, "power-flow control performance analysis of a unified power-flow controller in a novel control scheme", ieee trans. power deliv., vol. 22, no. 3, pp. 1613–1619, july 2007.
[8] p. k. dash, s. mishra and g. panda, "damping multimodal power system oscillation using a hybrid fuzzy controller for series connected facts devices", ieee trans. power syst., vol. 15, no. 4, pp. 1360–1366, nov. 2000.
[9] f. m. albatsh, s. mekhilef, s. ahmad, h. mokhlis, "fuzzy logic based upfc and laboratory prototype validation for dynamic power flow control in transmission lines", ieee trans. ind. electron., vol. 64, no. 12, pp. 9538–9548, dec. 2017.
[10] m. khaksar, a. rezvani and m. h. moradi, "simulation of novel hybrid method to improve dynamic responses with pss and upfc by fuzzy logic controller", neural comput. appl., vol. 29, pp. 837–853, feb. 2018.
[11] n. narayana and r. k. mallick, "enhancement of small signal stability of power system using upfc based damping controller with novel optimized fuzzy pid controller", j. intell. fuzzy syst., vol. 35, no. 1, pp. 501–512, july 2018.
[12] h. c. tsai, j. h. liu and c. c. chu, "integrations of neural networks and transient energy functions for designing supplementary damping control of upfc", ieee trans. ind. appl., vol. 55, no. 6, pp. 6438–6450, dec. 2019.
[13] h. c. tsai and c. c. chu, "upfc supplementary damping control synthesis: a forward neural networks approximated energy function approach", in proceedings of the ieee industry applications society annual meeting (ias), 2018, pp. 1–8.
[14] m. januszewski, j. machowski and j. w. bialek, "application of the direct lyapunov method to improve damping of power swings by control of upfc", iet proceedings – gener. transm. distrib., vol. 151, no. 2, pp.
252–260, april 2004.
[15] m. a. sayed and t. takeshita, "line loss minimization in isolated substations and multiple loop distribution systems using the upfc", ieee trans. power electron., vol. 29, no. 11, pp. 5813–5822, nov. 2014.
[16] k. k. sen and m. l. sen, introduction to facts controllers: theory, modeling and applications, john wiley & sons, new jersey, 2009, chapter 2, pp. 58–62.
[17] m. h. haque, "application of upfc to enhance transient stability limit", in proceedings of the ieee power engineering society general meeting, 2007, pp. 1–6.
[18] x. yang, w. wang, h. cai, p. song and z. xu, "installation, system-level control strategy and commissioning of the nanjing upfc project", in proceedings of the ieee power and energy society general meeting, 2017, pp. 1–5.
[19] y. cui, y. yu, w. bao, y. feng, q. guo, w. xie and m. jin, "analysis of application effect of 220 kv upfc demonstration project in shanghai grid", dianli xitong baohu yu kongzhi/power system protection and control, vol. 46, pp. 136–142, 2018.
[20] s. k. samal, p. c. panda, "damping of power system oscillations by using unified power flow controller with pod and pid controllers", in proceedings of the international conference on circuits, power and computing technologies (iccpct-2014), 2014, pp. 662–667.
[21] a. m. shotorbani, a. ajami, m. p. aghababa and s. h. hosseini, "direct lyapunov theory-based method for power oscillation damping by robust finite-time control of unified power flow controller", iet gener. transm. distrib., vol. 6, no. 9, pp. 822–830, nov. 2012.
[22] h. huang, l. zhang, o. oghorada and m. mao, "analysis and control of a modular multilevel cascaded converter-based unified power flow controller", ieee trans. ind. appl., vol. 57, no. 3, pp. 3202–3213, june 2021.
[23] m. khaksar, a. rezvani and m. h.
moradi, "simulation of novel hybrid method to improve dynamic responses with pss and upfc by fuzzy logic controller", neural comput. and appl., vol. 29, pp. 837–853, feb. 2018.
[24] p. w. sauer, m. a. pai and j. h. chow, power system dynamics and stability: with synchrophasor measurement and power system toolbox, 2nd edition, wiley-ieee press, 2017, chapter 4, pp. 53–70.

facta universitatis series: electronics and energetics vol. 28, no 4, december 2015, pp. 507–525 doi: 10.2298/fuee1504507s

horizontal current bipolar transistor (hcbt) – a low-cost, high-performance flexible bicmos technology for rf communication applications

tomislav suligoj1, marko koričić1, josip žilak1, hidenori mochizuki2, so-ichi morita2, katsumi shinomura2, hisaya imai2
1university of zagreb, faculty of electrical engineering and computing, department of electronics, micro- and nano-electronics laboratory, croatia
2asahi kasei microdevices co. 5-4960, nobeoka, miyazaki, 882-0031, japan

received march 9, 2015
corresponding author: tomislav suligoj, university of zagreb, faculty of electrical engineering and computing, department of electronics, micro- and nano-electronics laboratory, croatia (e-mail: tom@zemris.fer.hr)

abstract. in an overview of horizontal current bipolar transistor (hcbt) technology, the state-of-the-art integrated silicon bipolar transistors are described, which exhibit ft and fmax of 51 ghz and 61 ghz and an ft·bvceo product of 173 ghz·v and are among the highest-performance implanted-base silicon bipolar transistors. hcbt is integrated with cmos in a considerably lower-cost fabrication sequence as compared to standard vertical-current bipolar transistors, with only 2 or 3 additional masks and fewer process steps. due to its specific structure, the charge sharing effect can be employed to increase bvceo without sacrificing ft and fmax. moreover, the electric field can be engineered just by manipulating the lithography masks, achieving high-voltage hcbts with breakdowns up to 36 v integrated in the same process flow with high-speed devices, i.e. at zero additional cost. a double-balanced active mixer circuit is designed and fabricated in hcbt technology. the maximum iip3 of 17.7 dbm at mixer current of 9.2 ma and conversion gain of -5 db are achieved.

key words: bicmos technology, bipolar transistors, horizontal current bipolar transistor, radio frequency integrated circuits, mixer, high-voltage bipolar transistors.

1. introduction

in the highly competitive wireless communication markets, the rf circuits and systems are fabricated in technologies that are very cost-sensitive. in order to minimize the fabrication costs, the sub-10 ghz applications can be processed by using the high-volume silicon technologies. it has been identified that the optimum solution might

facta universitatis series: electronics and energetics vol. 31, no 4, december 2018, pp. 547–570 https://doi.org/10.2298/fuee1804547m

on fundamental operating principles and range-doppler estimation in monolithic frequency-modulated continuous-wave radar sensors

vladimir m. milovanović
faculty of engineering, university of kragujevac

received september 10, 2018
corresponding author: vladimir m. milovanović, department of electrical engineering, faculty of engineering, university of kragujevac, sestre janjic 6, 34000 kragujevac, serbia (e-mail: vlada@kg.ac.rs)

abstract. the diverse application areas of emerging monolithic noncontact radar sensors that are able to measure an object’s distance and velocity are expected to grow in the near future to scales that are now nearly inconceivable. a classical concept of frequency-modulated continuous-wave (fmcw) radar, tailored to operate in the millimeter-wave (mm-wave) band, is well-suited to be implemented in the baseline cmos or bicmos process technologies.
high volume production could radically cut the cost and decrease the form factor of such sensing devices, thus enabling their omnipresence in virtually every field. this introductory paper explains the key concepts of mm-wave sensing starting from a chirp as an essential signal in linear fmcw radars. it further sketches the fundamental operating principles and block structure of contemporary fully integrated homodyne fmcw radars. crucial radar parameters like the maximum unambiguously measurable distance and speed, as well as range and velocity resolutions, are specified and derived. the importance of both beat tones in the intermediate frequency (if) signal and the phase in resolving small spatial perturbations and obtaining the 2-d range-doppler plot is pointed out. radar system-level trade-offs and chirp/frame design strategies are explained. finally, the nonideal and second-order effects are commented on and examples of practical fmcw transmitter and receiver implementations are summarized.

key words: fmcw, frequency-modulated continuous-wave, radar, mm-wave, linear chirp, range-doppler, sensors, radar-on-a-chip (roc), single-chip radar.

1 introduction

applications of portable short-range contact-less radar sensors which provide simultaneous information on the presence, position and relative radial velocity are virtually countless. these radar systems not only have the potential to improve the service quality in numerous existing fields [1–3], but are also expected to be the driving force for many novel use-cases in the near future. multiple sensing technologies based on laser/optical, ultrasound and radio waves have been proposed in the past. among those, the millimeter-wave (mm-wave) radio frequency radars attracted considerable attention thanks to their robustness [4] against bad weather conditions and harsh environments.
historically, mm-wave radar sensors were built from discrete components and therefore reserved only for low-volume markets. however, a prospective single-chip integrated solution with a low unit cost and small form factor, often referred to as the radar-on-chip (roc), would lead to its omnipresence in consumer and industrial electronic devices, along with probable pervasive use in a variety of areas spanning from automotive to healthcare. two fundamentally different microwave ranging methods, pulse-based and continuous-wave (cw), coexist. the former are simply inefficient for monolithic integration [5], as they inherently suffer from higher peak(-to-average) power. unmodulated cw radars can only determine the relative target velocity through the doppler shift. nevertheless, if the appropriate [6] kind of carrier modulation is employed, distances can also be resolved. pseudorandom noise modulated cw radars [7] that exploit pulse compression techniques for temporal energy distribution are a viable option, especially for lower node digitally-intensive implementations [8], but come with a major drawback [9]: their baseband bandwidth equals half of the radio frequency (rf) one. this fact proves to be particularly bothersome in ultrahigh resolution sensors where power-hungry data converters are unavoidable. finally, the classical frequency-modulated cw (fmcw) radar, as will be presented in this article, in its simplest homodyne incarnation, transmits a sequence of linear chirps that are simultaneously used as a local oscillator signal for the receiver’s frequency mixer. assuming no nonlinear distortions occur on the pathway, when the transmitted chirp is mixed with its received reflections that are attenuated, delayed in time and possibly shifted in frequency, the intermediate frequency, being the low-pass filtered heterodyning product, will contain information on the target’s distance (via time of flight) and its velocity (doppler effect).
by analogy with acoustics, the resulting frequency difference at the mixer’s output is referred to as the beat frequency.

recently, the fmcw radars drew considerable attention [10–21], partially owing to their high integration potential. although the main driver in developing these small footprint solutions was the automotive industry [1], a gradual breakthrough into other spheres is evident. regulatory committees of the itu and the etsi even assigned the dedicated 77-81 ghz range in the w-band as part of the spectrum to be automotive specific, which is often referred to as the so-called “short-range radar” (srr) band. in spite of that, having a device that could operate in the frequency band where an unlicensed spectral emission is permitted would be favorable for its widespread adoption.
namely, choosing one of the industrial, scientific, and medical (ism) radio bands might turn out advantageous for cross-disciplinary expansion of fmcw-based sensors that will not be limited to vehicular radar systems. the mm-wave radars can grasp important benefits of higher frequency operation that are not only related to antenna size. as will also be shown in the next sections, the fmcw multitarget differentiation ability is directly proportional to the irradiated chirp bandwidth. in the prospect of the fcc’s relatively recent extension of the unlicensed part in the v band [22], which now incorporates a complete 57-71 ghz frequency range, previously unfeasible spatial target discrimination is enabled. in other words, these 14 ghz of contiguous unlicensed spectrum translate to a centimeter-order space resolution, thus allowing fmcw-type radars to be used in complex indoor and outdoor scenes which contain an abundance of close proximity objects. all this sets a fruitful ground for universal ranging radar devices, which will dominate the future markets. the first commercial roc solutions have already appeared [23] and more are expected to follow fairly soon. this paper is intended to make a rather gentle introduction to the area of integrated fmcw mm-wave radar sensors as they are presently built. it focuses on the main operating principles in estimating object distance/range and its relative radial velocity in sensor devices that are based on fast fmcw modulation and slow time processing, which gives multiple advantages. in order to follow the elaborated matter, a general undergraduate-level knowledge in electronics and signal processing is assumed. the rest of the paper is organized as follows. the concept of a frequency chirp as the fundamental signal in fmcw radars is introduced in section 2. section 3 then elaborates on the operating principles of fmcw sensors, with two subsections each devoted to range and velocity estimation.
final subsection gives some system-level trade-offs and explains chirp/frame design decisions. present state of the art fmcw radar transmitter and receiver architectures are examined in section 4 and finally section 5 concludes the article. 548 v. m. milovanović on range-doppler estimation in integrated fmcw radar sensors 549 504 v. milovanović −ac 0 +ac t0 t0 + tc t0 + 2tc f0 fc f0 + b t0 t0 + tc t0 + 2tc am pl it ud e a (t ) time t fr eq ue nc y f (t ) time t fig. 1. a time sequence of linear up-chirp waveforms plotted as amplitude versus time (upper subplot) and frequency versus time (lower subplot) resembling the sawtooth wave. 2 chirp as the fundamental signal of an fmcw radar a sine wave or a sinusoid whose frequency increases (up-chirp) and/or decreases (down-chirp) with time is called a chirp or, although less often in this context, a sweep. in particular, linear chirps, i.e. signals in which the frequency changes linearly with time, are at the heart of every fmcw radar. specifically, in a linear chirp, a representative example of which is plotted in fig. 1, the instantaneous frequency f varies exactly linearly with time t: f(t) = f0 + b tc (t − t0) = f0 + s(t − t0) , (1) where f0 is the starting frequency at time point t = t0, while s = b/tc is the rate of frequency change or the frequency slope, sometimes also referred to as the chirpyness. the slope is defined using two parameters, namely the chirp bandwidth b and its duration tc, also called the modulation time. 550 v. m. milovanović on range-doppler estimation in integrated fmcw radar sensors 551 504 v. milovanović −ac 0 +ac t0 t0 + tc t0 + 2tc f0 fc f0 + b t0 t0 + tc t0 + 2tc am pl it ud e a (t ) time t fr eq ue nc y f (t ) time t fig. 1. a time sequence of linear up-chirp waveforms plotted as amplitude versus time (upper subplot) and frequency versus time (lower subplot) resembling the sawtooth wave. 
since the time derivative of the phase $\phi$ is the angular frequency, the phase of any oscillating signal is the time integral of its frequency function, and therefore the phase is expected to grow like $\phi(t + \Delta t) \approx \phi(t) + 2\pi f(t)\,\Delta t$ as a function of time. this results in:
$$ \phi(t) = \phi_0 + 2\pi \int_{t_0}^{t} f(\tau)\,d\tau = \phi_0 + 2\pi \left[ f_0\,(t - t_0) + \frac{B}{2T_c}\,(t - t_0)^2 \right], \tag{2} $$
where $\phi_0$ is the initial phase at time point $t = t_0$. differentiating the previous expression verifies that $\phi'(t) = 2\pi f(t)$, as expected. finally, the corresponding time-domain function for a sinusoidal linear chirp is the sine of the quadratic-phase signal in radians and can be written:
$$ y_c(t) = v_{tx}(t) = A_c \sin\left( \phi_0 + 2\pi f_0 t + \pi \frac{B}{T_c}\,(t - mT_c)^2 \right), \tag{3} $$
where $A_c$ is the chirp’s amplitude and where $t_0 = 0$, under the assumptions that the sweeps are performed continuously and that $m$ indexes the $m$th chirp.
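relations (1)-(3) can be checked numerically. the following python sketch is only an illustration and not code from the paper: the function names and the parameter values (77 ghz start frequency, 1 ghz bandwidth, 50 µs chirp) are assumptions chosen for the example, and the phase is simply restarted at every chirp, which is a simplification of (3). it evaluates the instantaneous frequency and the quadratic phase, and compares a finite-difference derivative of the phase with the frequency from (1).

```python
import math

def chirp_frequency(t, f0, bandwidth, tc, t0=0.0):
    """Instantaneous frequency of a linear up-chirp, as in (1)."""
    slope = bandwidth / tc  # S = B / Tc
    return f0 + slope * (t - t0)

def chirp_phase(t, f0, bandwidth, tc, t0=0.0, phi0=0.0):
    """Phase obtained by integrating the frequency from t0 to t, as in (2)."""
    slope = bandwidth / tc
    dt = t - t0
    return phi0 + 2.0 * math.pi * (f0 * dt + 0.5 * slope * dt * dt)

def chirp_sample(t, amplitude, f0, bandwidth, tc, phi0=0.0):
    """One sample of a repeated sawtooth chirp (phase restarted per chirp)."""
    t_in_chirp = t % tc  # time elapsed since the start of the current chirp
    return amplitude * math.sin(chirp_phase(t_in_chirp, f0, bandwidth, tc, phi0=phi0))

def numeric_frequency(t, f0, bandwidth, tc, dt=1e-12):
    """Central-difference estimate of phi'(t) / (2 pi)."""
    dphi = chirp_phase(t + dt, f0, bandwidth, tc) - chirp_phase(t - dt, f0, bandwidth, tc)
    return dphi / (2.0 * dt) / (2.0 * math.pi)

if __name__ == "__main__":
    f0, bandwidth, tc = 77e9, 1e9, 50e-6  # illustrative values only
    t = 20e-6
    print("analytic frequency:", chirp_frequency(t, f0, bandwidth, tc))
    print("numeric frequency: ", numeric_frequency(t, f0, bandwidth, tc))
```

the two printed values should agree to within rounding error, which is the numerical counterpart of the statement that the derivative of the phase equals $2\pi f(t)$.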
the carrier frequency can be defined in terms of the starting frequency and the modulation bandwidth as $f_c = f_0 + B/2$ and represents the central frequency of the spectrum band that is being covered. typical frequency bands of interest in the mm-wave part are around 64 ghz for unlicensed and 79 ghz for automotive applications. bandwidth spans depend on the targeted radar range resolution but are on the order of up to several ghz, while chirp modulation times vary from tens of microseconds up to a millisecond.
2.1 sawtooth versus triangular wave linear chirps
in the recent past, triangle (concatenation of up-chirp and down-chirp) slow fmcw modulation waveforms with typical chirp durations in the millisecond range were dominant. as will be seen in the next section, the resulting output frequency of an fmcw radar is concurrently influenced by the target’s range and its relative radial velocity, thus estimating both parameters simultaneously from a single linear chirp/sweep is an unresolvable task. by using up-slope and down-slope chirps, which produce slightly different beat frequencies for an object in motion, the two parameters can be decoupled. however, this procedure suffers from ambiguity when there are multiple moving objects, and the ghost targets that will appear must be identified and discarded. in contrast to this, fast sawtooth fmcw modulations, which typically last up to a hundred microseconds, automatically resolve object range and velocity into a 2-d image and are in exclusive focus for the rest of the paper.
fig. 2. simplified high-level block diagram of a typical homodyne fmcw radar, which includes a linear chirp synthesizer whose output is both transmitted and used as the local oscillator.
3 the operating principles of homodyne fmcw radars
an fmcw radar transmits a chirp signal, defined more closely in the previous section, and captures its reflections from objects located in the propagation path. a high-level simplified block diagram of a homodyne fmcw radar is shown in fig. 2 and features a single transmitter (tx) and a single receiver (rx) antenna. the radar’s general operating principles are the following:
• an fmcw synthesizer generates an appropriate chirp signal;
• the generated chirp is first amplified by a power amplifier (pa);
• after amplification the chirp is transmitted by a transmit antenna;
• chirps reflected back from objects are captured by the receive antenna;
• the received signal is then passed through a low-noise amplifier (lna);
• a down-conversion frequency mixer combines the rx and tx signals at its inputs to produce an intermediate frequency signal at its output;
• the intermediate frequency (if) signal is also referred to as the beat frequency and it contains information on the irradiated objects/targets.
additionally, it should be noted that not only does the instantaneous output frequency of the down-conversion mixer at any point in time correspond to the difference of the instantaneous frequencies of the two input signals at that point in time, but the initial phase of the output signal also equals the difference between the initial phases of the two input signals.
fig. 3. static multitarget detection with an fmcw radar: (a) spatial object positioning, (b) time-domain transmitted and received reflected up-chirps with the corresponding mixing products, (c) amplitude spectrum of the appropriately windowed intermediate frequency signal.
it is crucial to remark that the received chirp reflected from a single object is actually just a time-delayed replica of the transmitted chirp. this is best illustrated in fig. 3 for a somewhat more complicated case of three objects.
since the mixing product will be the difference between the instantaneous input frequencies, and since the rf mixer input is just the delayed version of the local oscillator (lo) signal that is being transmitted, in the ideal case the if signal will possess a fixed frequency component proportional to the reflected signal delay. the delay between the transmitted and the received chirp is equal to the round-trip delay $2R/c$, where $R$ denotes the distance between the radar and the object, and $c$ is the speed of light, while the constant of proportionality is the transmitted chirp’s slope $S$. as a consequence, every object that is irradiated by the radar will produce a constant frequency component in the if signal with the value of $2RS/c$.
for the case of a simple stationary or quasi-stationary scene, the beat frequency tone $f_b$ and the object’s range are related as:
$$ f_b = \frac{2RS}{c} = \frac{2B}{cT_c} \cdot R \iff R = \frac{c f_b}{2S} = \frac{cT_c}{2B} \cdot f_b, \tag{4} $$
where the chirp slope is $S = B/T_c$ and any radar or object movement is negligible. the previous statements mean that a single transmitted chirp, when reflected from multiple objects located at different distances in front of a radar, produces multiple received chirps, each delayed by a different amount depending on the distance to that particular object. therefore, the produced if signal will be composed of several tones that correspond to each of the reflections, and the frequency of each is directly proportional to the range of that object. the initial phase of every component in the if signal will also be the difference between the phase of the tx chirp and the phase of the rx chirp at the time instant corresponding to the start of the if signal, or more precisely, to the start of that particular frequency component of the if signal.
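relation (4) and the statement that the mixer output frequency equals the difference of the instantaneous input frequencies can be illustrated with a small python sketch. this is a hedged example rather than the author's code: the function names, the ideal noise-free mixer model, and the numerical values (77 ghz, 1 ghz, 50 µs, targets at 5, 20 and 60 m) are all assumptions made for illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def beat_frequency(range_m, bandwidth_hz, chirp_time_s):
    """Beat frequency f_b = 2 R S / c of a static target, as in (4)."""
    slope = bandwidth_hz / chirp_time_s
    return 2.0 * range_m * slope / C

def range_from_beat(beat_hz, bandwidth_hz, chirp_time_s):
    """Inverse of (4): R = c f_b / (2 S)."""
    slope = bandwidth_hz / chirp_time_s
    return C * beat_hz / (2.0 * slope)

def tx_phase(t, f0_hz, slope):
    """Phase of a chirp that starts at t = 0 with zero initial phase."""
    return 2.0 * math.pi * (f0_hz * t + 0.5 * slope * t * t)

def mixer_if_frequency(range_m, f0_hz, bandwidth_hz, chirp_time_s, t, dt=1e-7):
    """IF frequency from a numerical derivative of the TX/RX phase difference."""
    slope = bandwidth_hz / chirp_time_s
    delay = 2.0 * range_m / C  # round-trip delay of the reflected chirp

    def if_phase(time):
        # The received chirp is modeled as a delayed replica of the TX chirp.
        return tx_phase(time, f0_hz, slope) - tx_phase(time - delay, f0_hz, slope)

    return (if_phase(t + dt) - if_phase(t - dt)) / (2.0 * dt) / (2.0 * math.pi)

if __name__ == "__main__":
    f0, bandwidth, tc = 77e9, 1e9, 50e-6  # illustrative values only
    for r in (5.0, 20.0, 60.0):
        fb = beat_frequency(r, bandwidth, tc)
        fb_mixer = mixer_if_frequency(r, f0, bandwidth, tc, t=25e-6)
        print(f"R = {r:5.1f} m -> f_b = {fb:.1f} Hz, mixer model = {fb_mixer:.1f} Hz")
```

the evaluation time passed to `mixer_if_frequency` has to lie after the round-trip delay and before the end of the chirp, which is the interval in which tx and rx chirps overlap.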
it is important to note that the if signal is only valid from the time the reflected signal is received at the rx antenna until the end of the current tx chirp. so in order to digitize the if signal using an adc, it should be assured that sampling begins only after the round-trip time $2R/c$ has elapsed from the beginning of the tx chirp, and lasts only as long as the tx signal is present. in practical implementations the round-trip delay is typically just a small fraction of the total chirp duration $T_c$, thus the nonoverlapping segment of the transmitted chirp is usually negligible. for example, for an object that is $R = 150$ m away from the radar and a chirp modulation time of $T_c = 20$ µs, this delay accounts for only approximately 5% of the total sweep duration.
3.1 target distance estimation and radar range resolution
in a quasi-stationary scene, radial object velocities with respect to a radar are negligible and, as explained, if such a scene is composed of multiple targets, the produced if signal will contain multiple frequency components. in other words, the frequency spectrum of such an if signal will reveal multiple tones, the frequency of each being proportional to the distance between each object and the radar. if two objects are close to each other, or at least at a similar distance from the radar, their tones in the if signal are also close. certainly the most natural and one of the most popular methods of processing the if signal is the fourier transform. it is generally known that longer observation periods yield better frequency resolution so that, for example, an observation window of $T$ seconds in length can independently resolve frequency components that are separated by at least $1/T$ hertz.
one of the most important properties of every radar is its range resolution, which refers to the radar’s ability to resolve two closely spaced objects. more precisely, it determines the minimum spacing between two objects which still show up as two separate frequency peaks in the if signal spectrum. obviously, one way to improve the range resolution of a radar is to extend the observation window, which, looking at fig. 3, further implies increasing the chirp duration and consequently its bandwidth, if the slope is preserved. analytically, two or more distinct if signal tones can be resolved as long as $\Delta f > 1/T_c$, where the small portion at the beginning of the chirp which is associated with the round-trip delay is discarded. it is known that two objects that are spatially $\Delta R$ apart produce tones separated by $\Delta f = 2\Delta R\,S/c$. eliminating $\Delta f$ from the previous two expressions and having in mind that the slope is $S = B/T_c$, the expression for the radar’s range resolution is obtained:
$$ \Delta R > \frac{c}{2ST_c} \implies \Delta R > \frac{c}{2B}, \tag{5} $$
which exclusively depends on the chirp bandwidth $B$. thus, an fmcw radar with a chirp bandwidth of 5 ghz can achieve a range resolution of about 3 cm.
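the numbers quoted above follow directly from (5) and from the round-trip delay $2R/c$. the short python sketch below reproduces them; the helper names are hypothetical and the code is meant as an illustration of the formulas, not as part of any radar library.

```python
C = 299_792_458.0  # speed of light in m/s

def range_resolution(bandwidth_hz):
    """Minimum resolvable target spacing c / (2 B), as in (5)."""
    return C / (2.0 * bandwidth_hz)

def tone_separation(delta_r_m, bandwidth_hz, chirp_time_s):
    """IF tone spacing 2 * dR * S / c of two targets that are delta_r_m apart."""
    slope = bandwidth_hz / chirp_time_s
    return 2.0 * delta_r_m * slope / C

def is_resolvable(delta_r_m, bandwidth_hz, chirp_time_s):
    """Fourier criterion: the tones must be more than 1 / Tc apart."""
    return tone_separation(delta_r_m, bandwidth_hz, chirp_time_s) > 1.0 / chirp_time_s

def delay_fraction(range_m, chirp_time_s):
    """Share of the chirp duration taken up by the round-trip delay 2 R / c."""
    return (2.0 * range_m / C) / chirp_time_s

if __name__ == "__main__":
    print(f"resolution at 5 GHz:  {100 * range_resolution(5e9):.2f} cm")
    print(f"resolution at 14 GHz: {100 * range_resolution(14e9):.2f} cm")
    print(f"delay share, 150 m / 20 us: {100 * delay_fraction(150.0, 20e-6):.1f} %")
    # The verdict does not change with chirp duration, only with bandwidth.
    print(is_resolvable(0.05, 5e9, 20e-6), is_resolvable(0.05, 5e9, 1e-3))
```

the last line checks the same 5 cm spacing for a 20 µs and a 1 ms chirp of equal bandwidth, which illustrates that the chirp duration cancels out of (5).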
although from fourier transform properties it may intuitively seem that, for a fixed bandwidth $B$, chirps of longer duration $T_c$ would imply longer if observation windows and better resolving capabilities, the if signal tones will also be lower in frequency because of the less steep chirp, and therefore more densely grouped, hence being proportionally harder to differentiate.
besides range resolution, another important parameter is the maximum range of a radar. as the high-level block diagram of fig. 2 indicates, the if signal is usually filtered and digitized by an analog-to-digital converter (adc) for further postprocessing inside the following digital signal processing (dsp) chain. the maximum detectable distance of a radar $R_{max}$ will produce a tone of frequency $2R_{max}S/c$, and the adc’s sampling rate should be at least twice as high in order to appropriately discretize this (real) baseband signal. viewed the other way around, for the adc’s maximum sampling rate of $f_s$ the maximum distance that an fmcw radar can see is determined by:
$$ R_{max} = \frac{c f_s}{4S} = \frac{cT_c f_s}{4B} = \frac{cN}{4B}, \tag{6} $$
which follows directly from the sampling theorem for a bandlimited if signal, and where $N = T_c f_s$ denotes the number of adc samples per chirp. consequently, if it turns out that the adc’s sampling rate presents a bottleneck, the maximum detectable range can always be traded for the chirp’s slope. typically, radars tend to use lower chirp slopes for larger maximum range.
the discrete nature of the sampled if signal suggests the use of the discrete fourier transform (dft) for further postprocessing. the actual algorithm which is employed is the fast fourier transform (fft). since this processing operation resolves objects in range, it is commonly referred to as the “range-fft” in radar literature.
it seems appropriate to stress one of the most important benefits of fmcw radars, which is also observable from fig. 3, and that is the difference between the rf bandwidth and the if bandwidth. specifically, the rf bandwidth is the frequency range from $f_0$ up to $f_0 + B$ which is spanned by the chirp, and a larger rf bandwidth directly translates to better range resolution. typical rf bandwidths are on the order of a few hundred megahertz up to several gigahertz. on the other hand, a higher if bandwidth primarily enables the fmcw radar to see at larger distances and enables faster/steeper chirps. the if bandwidths are typically on the order of a megahertz up to a dozen megahertz. hence, the uniqueness of fmcw radar sensors is that huge rf bandwidths neither imply nor necessitate extremely fast data converters.
3.2 radial velocity estimation and radar velocity resolution
for the nonstationary case in which there are nonnegligible object or radar movements, all distance measurements through round-trip delay are going to be affected by either signal compression or elongation, depending on whether the object is moving towards or away from the radar. this effective frequency shift due to relative movement is caused by the well-known doppler effect. small spatial displacements of an object $\Delta d$ will have an effect on both the if signal’s frequency and its phase. in mm-wave radars, small displacements are the ones that are comparable to the wavelength, which is on the order of several millimeters for typical radar bands. slight spatial displacements will lead to small round-trip delay changes. spatial object variation does not have an effect on the initial phase of the received rf signal, but it does affect the current phase of the transmitted signal and hence also the phase of the if signal. more formally speaking, for very small displacements the higher order terms can be neglected.
furthermore, based on (2) the phase offset of the transmitted signal can be expressed in terms of small displacements as
$$ \Delta\phi = 2\pi f_0\,\Delta t = 2\pi f_0\,\frac{2\Delta d}{c} = \frac{4\pi}{\lambda_0} \cdot \Delta d, \tag{7} $$
where $\Delta t$ represents the round-trip delay change caused by the object’s range displacement and $\lambda_0 = c/f_0$ is the wavelength of the transmitted rf signal.
it is crucial to note that the phase of the if signal changes linearly with small displacements of the object distance and also that the phase is much more sensitive to small spatial perturbations than the actual if tone frequency. to gain a numerical sense of the previous fact, assume $\Delta d = \lambda_0/4$, which for a typical automotive radar band is on the order of one millimeter.
based on the elaborations from the last subsection, any spatial object displacement that is much smaller than the radar’s range resolution (a few centimeters for present state-of-the-art devices), that is $\Delta d \ll \Delta R$, will be practically indiscernible in the frequency spectrum. on the other hand, the phase changes by $\Delta\phi = \pi = 180^\circ$ for a quarter-wavelength displacement. thus, the if signal’s phase is very sensitive to small changes in object range.
this gives all the tools for effective velocity measurement of an object by an fmcw radar. the basic idea is to transmit two consecutive chirps of duration $T_c$. each of the two reflected chirps is processed through an fft to detect the range of the object. the range-fft corresponding to each chirp will have a peak at the same location but with a different phase. the measured phase difference of the two peaks corresponds to the spatial motion of the object. assuming that an object with a radial velocity of $v$ traverses $\Delta d = vT_c$ in time $T_c$, then substituting this into (7) and rearranging, the object velocity can be directly estimated from the measured phase difference as:
$$ v = \frac{\lambda_0}{4\pi T_c} \cdot \Delta\phi. \tag{8} $$
hence, the phase difference measured across two consecutive chirps can be exploited to estimate the velocity of a single object in front of the radar. since the phase difference measurement is unambiguous only in cases in which $|\Delta\phi| < \pi$, the maximum unambiguously measurable velocity is:
$$ v_{max} = \frac{\lambda_0}{4T_c}. \tag{9} $$
this further implies that measuring higher $v_{max}$ requires faster/shorter chirps.
the previously described method that combines two consecutive chirps does not only work for measuring the velocity of a single object, but is also applicable to multiple objects, as long as they are located at different ranges from the radar. however, it will not work if multiple moving objects with different velocities are all equidistantly located from the radar at the time of measurement.
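the two-chirp velocity estimate in (7)-(9), including the ambiguity that appears once $|\Delta\phi|$ exceeds $\pi$, can be sketched in a few lines of python. the function names and the 77 ghz / 50 µs operating point are assumptions made for this illustration; the phase wrap models what a phase measurement confined to one cycle would return.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def phase_shift(displacement_m, f0_hz):
    """IF phase change 4 * pi * dd / lambda0 for a small displacement, as in (7)."""
    return 4.0 * math.pi * displacement_m / (C / f0_hz)

def velocity_from_phase(delta_phi_rad, f0_hz, chirp_time_s):
    """Velocity lambda0 * dphi / (4 * pi * Tc), as in (8)."""
    return (C / f0_hz) * delta_phi_rad / (4.0 * math.pi * chirp_time_s)

def max_unambiguous_velocity(f0_hz, chirp_time_s):
    """v_max = lambda0 / (4 * Tc), as in (9)."""
    return (C / f0_hz) / (4.0 * chirp_time_s)

def wrap_phase(phi_rad):
    """Wrap a phase to the interval (-pi, pi], as a measured phase would be."""
    return math.atan2(math.sin(phi_rad), math.cos(phi_rad))

def measured_velocity(true_velocity_mps, f0_hz, chirp_time_s):
    """Velocity a two-chirp phase comparison would report for a given true velocity."""
    dphi = phase_shift(true_velocity_mps * chirp_time_s, f0_hz)
    return velocity_from_phase(wrap_phase(dphi), f0_hz, chirp_time_s)

if __name__ == "__main__":
    f0, tc = 77e9, 50e-6  # illustrative values only
    print("v_max:", max_unambiguous_velocity(f0, tc), "m/s")
    for v in (10.0, 25.0):  # the second value lies above v_max and aliases
        print(v, "->", measured_velocity(v, f0, tc))
```

a true velocity above $v_{max}$ is reported shifted by a multiple of $2v_{max}$, which is the aliasing that motivates shorter chirps.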
this is because the range-fft of both chirps would yield a single peak whose frequency would correspond to range, but whose phase change would represent a combined signal from all of these equi-range objects, and hence a simple phase comparison technique would not suffice.
fig. 4. two-dimensional (2-d) fft processing of an fmcw frame containing $M$ chirps, with $N$ samples taken from each chirp. the first $M \times N$ matrix contains the raw radar data (raw adc samples). after the first fft, which is performed on each matrix column, the range is resolved. the second fft, performed across matrix rows, resolves the doppler frequency. the example radar scene consists of five moving objects in total; only two peaks are observable after the range-fft, yet all five targets emerge after the doppler-fft. peak bins (objects) are shaded.
one way of estimating the velocities of multiple equidistant objects is to transmit a series of more than two consecutive equally spaced chirps, just as fig. 4 illustrates. again, under the assumption of relatively slow motion, the range-fft corresponding to each of these chirps would yield peaks in identical frequency locations. nevertheless, the phase of each magnitude peak in the spectral domain across chirps would be different, since it incorporates phase contributions from all of these equidistant objects. performing yet another fft round, now across this discrete sequence of chirps, would result in peaks corresponding to the normalized angular frequencies of each object velocity. the obtained angular frequencies $\omega$ can be used to back-calculate the object velocities from (8) by substituting $\Delta\phi = \omega$, i.e., by equating the phase difference between consecutive chirps with the discrete angular frequency.
the transform that is performed across chirps is often referred to as the “doppler-fft”, while the sequence of $M$ equispaced chirps on which it is performed is called a frame. therefore, a basic transmission unit of an fmcw radar is the frame.
just as the range estimation capability has its range resolution, the velocity extraction has its own resolution. analogously to range, there is a certain minimum separation between normalized angular frequencies so that they show up as two independent peaks in the doppler-fft spectrum. as with the continuous fourier transform, the longer the dft input sequence, the better the resolution.
more precisely, a sequence of $M$ samples can resolve discrete angular frequencies that are separated by more than $2\pi/M$ radians per sample or, equivalently, $1/M$ cycles per sample, since one cycle is equal to $2\pi$ radians. so, in the continuous case, the resolution is inversely proportional to the observation time $T$, while in the discrete case it is inversely proportional to the number of observed samples $M$.
evidently, a way to improve the velocity resolution is to increase the number of chirps per frame. analytically, two distinct normalized angular frequencies can be resolved as long as $\Delta\omega > 2\pi/M$, and since two velocities that are $\Delta v$ apart produce angular frequencies that differ by $\Delta\omega = 4\pi\,\Delta v\,T_c/\lambda_0$, eliminating $\Delta\omega$ from those expressions and accounting that the frame duration is given as $T_f = MT_c$ yields
$$ \Delta v > \frac{\lambda_0}{2MT_c} \implies \Delta v > \frac{\lambda_0}{2T_f}, \tag{10} $$
where $T_c$ is the separation between the adjacent chirps. this is the expected result given the already mentioned inverse proportionality of the velocity resolution to the frame duration or, more precisely, to the number of chirps in a frame.
range and velocity estimation is best summarized in fig. 4, which provides insight into the transformations and data organization. samples taken from an adc corresponding to each chirp in a frame are stored as the columns of a data matrix. a range-fft performed on each column resolves objects in range. subsequently, a doppler-fft is performed along the rows of the range-fft results to resolve objects in the velocity or doppler dimension. the process of taking the range-fft followed by the doppler-fft is together called the two-dimensional fft (2-d fft) in the fmcw literature [24].
fig. 5. a practical implementation of fmcw slow-time 2-d fft radar processing in which the range-fft is performed on the fly as data samples for each chirp become available.
just as illustrated in fig. 5, in practical radar dsp implementations, the range-fft is usually accomplished in line, as soon as the samples from an adc for each chirp become available and prior to storing them into memory. in contrast, the doppler-fft can only be performed once all the range-fft output data points have become available. therefore, a radar dsp system should be equipped with a sufficient amount of memory to store the complete content of all the range-fft outputs corresponding to a frame.
once the 2-d fft has been performed on a complete frame, the so-called range-doppler response can be obtained. a practical example, visualized in the range-velocity grid, is shown in fig. 6, where two objects can be clearly identified as peaks that stand out from the noise floor or surrounding clutter. noise suppression near the range edges comes from the band-pass filtering. it should also be mentioned that the limitation on the maximum unambiguously measurable velocity imposed by (9) can actually be extended using some higher level algorithms, but they fall beyond the scope of this article.
as a final remark, the radial velocity in the above derivation is assumed to be both constant and sufficiently small so that the illuminated object does not move from one range bin to another across the duration of a single frame.
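the complete range-doppler procedure can be mimicked on synthetic data. the python sketch below is a toy model under several loudly stated assumptions and is not the processing chain of any particular device: the if signal is real-valued and noise-free, no window is applied, targets sit exactly on the range/velocity grid, range migration and the doppler shift of the beat tone within a chirp are neglected, and a naive dft stands in for the fft so that the example needs no external library. chirps are stored as rows here rather than as columns. all names and parameter values are hypothetical.

```python
import cmath
import math

C = 299_792_458.0  # speed of light in m/s

def dft(x):
    """Naive O(n^2) DFT, used only as a dependency-free stand-in for an FFT."""
    n = len(x)
    return [sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
            for k in range(n)]

def simulate_frame(targets, f0, bandwidth, tc, n_samples, n_chirps):
    """Noise-free real IF samples; frame[m][n] is sample n of chirp m."""
    slope = bandwidth / tc
    fs = n_samples / tc  # ADC rate such that N samples span one chirp
    lam = C / f0
    frame = []
    for m in range(n_chirps):
        row = []
        for n in range(n_samples):
            sample = 0.0
            for rng, vel in targets:
                fb = 2.0 * rng * slope / C             # beat frequency, (4)
                dphi = 4.0 * math.pi * vel * tc / lam  # chirp-to-chirp phase step, (7)
                sample += math.cos(2.0 * math.pi * fb * n / fs + m * dphi)
            row.append(sample)
        frame.append(row)
    return frame

def range_doppler(frame):
    """Range-DFT per chirp, then Doppler-DFT per range bin: rd[range_bin][doppler_bin]."""
    n_chirps = len(frame)
    half = len(frame[0]) // 2  # a real signal gives N/2 unambiguous range bins
    profiles = [dft(chirp)[:half] for chirp in frame]
    return [dft([profiles[m][r] for m in range(n_chirps)]) for r in range(half)]

def detect(rd, f0, bandwidth, tc, rel_threshold=0.5):
    """Return (range_m, velocity_mps) for all cells above a relative threshold."""
    n_chirps = len(rd[0])
    dr = C / (2.0 * bandwidth)             # range bin size, (5)
    dv = (C / f0) / (2.0 * n_chirps * tc)  # velocity bin size, (10)
    peak = max(abs(cell) for row in rd for cell in row)
    hits = []
    for r, row in enumerate(rd):
        for d, cell in enumerate(row):
            if abs(cell) > rel_threshold * peak:
                signed = d if d < n_chirps // 2 else d - n_chirps  # negative velocities
                hits.append((r * dr, signed * dv))
    return hits

if __name__ == "__main__":
    f0, bandwidth, tc, n_samples, n_chirps = 77e9, 1e9, 50e-6, 32, 16
    dr = C / (2.0 * bandwidth)
    dv = (C / f0) / (2.0 * n_chirps * tc)
    # Two targets share a range bin and differ only in velocity.
    targets = [(5 * dr, 2 * dv), (5 * dr, -3 * dv), (11 * dr, 5 * dv)]
    frame = simulate_frame(targets, f0, bandwidth, tc, n_samples, n_chirps)
    for rng, vel in detect(range_doppler(frame), f0, bandwidth, tc):
        print(f"range {rng:.3f} m, velocity {vel:+.2f} m/s")
```

keeping only the first $N/2$ range bins corresponds to the maximum range $cN/(4B)$ in (6), and the two targets placed in the same range bin become separable only after the second transform, in line with the discussion of equidistant objects above.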
5, in practical radar dsp implementations, the range-fft is usually accomplished in line as soon as the samples from an adc for each chirp become available and prior to storing them into memory. contrary to previous, the doppler-fft can only be performed once all the range-fft output data points have become available. therefore, a radar dsp system should be equipped with sufficient amount of memory to store the complete content of all the range-fft outputs corresponding to a frame. once the 2-d fft has been performed on a complete frame, the so-called range-doppler response can be obtained. a practical example, visualized in the range-velocity grid, is shown in fig. 6, where two objects can be clearly identified as peaks that stand out from the noise floor or surrounding clutter. noise suppression near the range edges comes from the band-pass filtering. it should also be mentioned that the limitation on maximum unambiguously measurable velocity imposed by (9) can actually be extended using some higher level algorithms, but they fall beyond the scope of this article. as a final remark, the radial velocity in the above derivation is assumed to be both constant and sufficiently small so that the illuminated object does not move from one range bin to another across the duration of a single frame. on range-doppler estimation in integrated fmcw radar sensors 515 awr1243 sensor: highly integrated 76–81-ghz radar front-end 3 may 2017 for emerging adas applications produces a beat-frequency (intermediate frequency [if] frequency) output, which is digitized and subsequently processed in a dsp. figure 1 shows the received fmcw signal, which comprises different delayed and attenuated copies of the transmitted signal corresponding to various objects. 
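As a minimal sketch of the 2-D FFT processing described above, the following Python/NumPy snippet simulates the IF samples of a single point target, stores each chirp as a column of the data matrix, and applies the range-FFT down the columns followed by the Doppler-FFT along the rows. All numeric values (carrier frequency, slope, sampling rate, target range and velocity) and all function and variable names are illustrative assumptions rather than parameters taken from the article. An idealized, noise-free complex IF signal without windowing is assumed, and a positive velocity is assumed to produce a positive phase step from chirp to chirp.

```python
import numpy as np

C = 3.0e8  # speed of light in m/s (rounded)

def simulate_if(n, m, fs, slope, tc, lam0, r, v):
    """Ideal complex IF samples of one point target (assumed signal model)."""
    t = np.arange(n)[:, None] / fs          # fast time within one chirp
    k = np.arange(m)[None, :]               # chirp index (slow time)
    fb = 2.0 * slope * r / C                # beat frequency 2*S*R/c
    dphi = 4.0 * np.pi * v * tc / lam0      # phase step between adjacent chirps
    return np.exp(1j * (2.0 * np.pi * fb * t + dphi * k))

def range_doppler_map(data):
    """2-D FFT of an N x M data matrix whose columns are the chirps."""
    range_fft = np.fft.fft(data, axis=0)        # range-FFT on each column
    doppler_fft = np.fft.fft(range_fft, axis=1) # Doppler-FFT along each row
    return np.fft.fftshift(doppler_fft, axes=1) # put zero velocity in the middle

# Hypothetical chirp and frame parameters
n, m = 256, 64            # samples per chirp, chirps per frame
fs = 5.12e6               # ADC sampling rate in Hz
tc = 50e-6                # inter-chirp time in s
slope = 30e12             # chirp slope in Hz/s
lam0 = C / 77e9           # wavelength at an assumed 77 GHz carrier

dv = lam0 / (2 * m * tc)          # velocity resolution, as in (10)
dr = C * fs / (2 * slope * n)     # range covered by one range-FFT bin

# Place the target exactly on FFT bins so the peak location is unambiguous
data = simulate_if(n, m, fs, slope, tc, lam0, r=20 * dr, v=5 * dv)
rd = np.abs(range_doppler_map(data))

r_bin, v_bin = np.unravel_index(np.argmax(rd), rd.shape)
est_range = r_bin * dr
est_velocity = (v_bin - m // 2) * dv
print(est_range, est_velocity)
```

In a streaming implementation of the kind shown in Fig. 5, the first FFT would be applied to each column as soon as it arrives, and only the second FFT has to wait for the complete matrix.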
Fig. 6. Radar 2-D FFT image of the so-called range-Doppler response showing the range and relative velocity of two point objects that stand out as peaks above the noise floor.

Besides object velocity estimation, measuring the phase change of the IF signal over multiple antennas separated in space (instead of multiple chirps separated in time) can be used to resolve the angular dimension of objects. The differential distance of an object to each antenna is exploited to estimate the angle of arrival. However, extracting target angle information is likewise outside the scope of this article.

Finally, in addition to measuring the angle of arrival and object velocity, the fact that the phase of the IF signal is very sensitive to small movements is also the basis for interesting applications such as vibration or heartbeat monitoring, among others. The only assumption is that the movements are small, so that the maximum displacement of the object is on the order of a fraction of the wavelength λ0. Even though the effect on the IF frequency tone will be negligible, the phase of the frequency peak will exhibit periodic behavior in response to the oscillatory movement of the object. In connection to that, the maximum phase deviation is related to the maximum object displacement, which provides a means to extract the vibration amplitude. In a similar way, the periodicity can be estimated, and thus the time evolution of the phase can yield both the amplitude and the period of the vibration.
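A minimal sketch of this vibration measurement idea is given below. It assumes the round-trip phase relation Δφ = 4πΔd/λ0, which is the displacement counterpart of the relation Δω = 4πΔvTc/λ0 used for velocity, and a noise-free phase of the range-FFT peak sampled once per chirp. The function name, the 77 GHz carrier, and the vibration parameters are hypothetical.

```python
import numpy as np

C = 3.0e8  # speed of light in m/s (rounded)

def vibration_from_phase(phase, tc, lam0):
    """Estimate vibration amplitude (m) and frequency (Hz) from the peak phase.

    phase : phase of the range-FFT peak, one value per chirp, in radians
    tc    : time between adjacent chirps in s
    lam0  : carrier wavelength in m
    """
    disp = np.unwrap(phase) * lam0 / (4.0 * np.pi)  # phase -> displacement
    disp = disp - disp.mean()                       # remove the static offset
    amplitude = (disp.max() - disp.min()) / 2.0     # half of peak-to-peak swing
    spectrum = np.abs(np.fft.rfft(disp))
    k = np.argmax(spectrum[1:]) + 1                 # strongest non-DC bin
    freq = k / (len(disp) * tc)
    return amplitude, freq

# Hypothetical scenario: 0.2 mm vibration at 5 Hz observed for 1 s
lam0 = C / 77e9
tc = 1e-3
t = np.arange(1000) * tc
true_amp, true_freq = 0.2e-3, 5.0
phase = 4.0 * np.pi * true_amp * np.sin(2.0 * np.pi * true_freq * t) / lam0
wrapped = np.angle(np.exp(1j * phase))              # what a receiver would observe

amp, freq = vibration_from_phase(wrapped, tc, lam0)
print(amp, freq)
```

The displacement here is a small fraction of λ0, in line with the assumption stated in the text. With noisy data, a peak-to-peak amplitude estimate of this kind would be biased, so a fit to the dominant spectral component may be preferable.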
3.3 Radar requirement mapping to chirp and frame parameters

Having derived the equations that define the maximum unambiguously measurable range and velocity, as well as their corresponding resolutions, it is also important to know how to exploit them to design an FMCW transmit signal that meets certain end-user requirements. Assuming that the specifications for range resolution (ΔR), maximum range (Rmax), velocity resolution (Δv) and maximum velocity (vmax) are given and dictated by a certain application, there are multiple strategies for mapping this set of requirements to chirp and frame parameters. A sketch of one possible design method is as follows:

• The carrier frequency/wavelength is determined by the frequency band.
• The chirp bandwidth is directly dictated by the range resolution, B = c/(2ΔR).
• The inter-chirp time is ruled only by the maximum velocity, Tc = λ0/(4vmax).
• Since both B and Tc are fixed, the chirp slope S = B/Tc is also locked.
• The frame duration is governed by the velocity resolution, Tf = λ0/(2Δv).
• Finally, it is assumed that the data converter's sampling rate is sufficiently high, fs = 4SRmax/c, to support the IF signal bandwidth of 2SRmax/c.

However, in practice the process of arriving at the desired chirp and frame parameters might involve several iterations, simply because the FMCW radar sensor could have some additional constraints that have not been addressed so far. For example, the maximum IF bandwidth could exceed what the ADC's sampling frequency can support. In such cases, a trade-off between the chirp slope and the maximum measurable distance might be needed: in order to increase Rmax, the chirp slope would have to be decreased. On the other hand, if the modulation time Tc is frozen based on vmax, a lower modulation rate S directly translates to a worse range resolution.
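The design steps listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `map_requirements`, the dictionary keys, and the example specification (77 GHz carrier, 0.1 m range resolution, 50 m maximum range, 0.5 m/s velocity resolution, 20 m/s maximum velocity) are hypothetical, and the number of chirps per frame is simply rounded up from Tf/Tc.

```python
import math

C = 3.0e8  # speed of light in m/s (rounded)

def map_requirements(f0, d_r, r_max, d_v, v_max):
    """Map radar requirements to chirp and frame parameters (one possible method)."""
    lam0 = C / f0                      # wavelength set by the frequency band
    bandwidth = C / (2.0 * d_r)        # B = c / (2 * delta_R)
    tc = lam0 / (4.0 * v_max)          # Tc = lambda0 / (4 * v_max)
    slope = bandwidth / tc             # S = B / Tc
    tf = lam0 / (2.0 * d_v)            # Tf = lambda0 / (2 * delta_v)
    # Small tolerance so that rounding noise does not add a spurious chirp
    chirps = math.ceil(tf / tc - 1e-9)
    if_bandwidth = 2.0 * slope * r_max / C   # highest beat frequency of interest
    fs = 2.0 * if_bandwidth                  # fs = 4 * S * R_max / c
    return {
        "wavelength": lam0,
        "bandwidth": bandwidth,
        "tc": tc,
        "slope": slope,
        "tf": tf,
        "chirps_per_frame": chirps,
        "if_bandwidth": if_bandwidth,
        "fs": fs,
    }

# Hypothetical specification used only to exercise the function
params = map_requirements(f0=77e9, d_r=0.1, r_max=50.0, d_v=0.5, v_max=20.0)
for name, value in params.items():
    print(f"{name}: {value:.6g}")
```

If the resulting `fs` exceeds what a given ADC supports, the slope would have to be reduced, which for a fixed Tc lowers the bandwidth and therefore worsens the range resolution, as discussed in the text.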
So, basically, for a fixed modulation time, a short-range radar has a steeper chirp slope and consequently a larger chirp bandwidth and better range resolution, while a long-range radar has a lower slope and therefore a smaller bandwidth and poorer resolution.

Besides the mentioned maximum sampling frequency, other device limitations connected with either the analog front end or the digital back end are often present. For example, there is always a certain maximum slope that an FMCW synthesizer can generate. Related to that, due to a finite settling period, device-specific requirements for the idle time between adjacent chirps usually need to be honored. On the back-end side, the device must have sufficient memory to store the range-FFT output data for all the chirps in the frame, as required by the Doppler-FFT.

4 Contemporary mm-wave FMCW radar sensor examples

Contrary to communication systems, where wireless signal receivers are more complicated than their transmitter counterparts, this is not the case with FMCW radar sensors, where the TX needs to satisfy stringent chirp generation requirements. Namely, although they were not elaborated in the previous sections, the object detection quality of FMCW-based sensors depends on many nonideal effects, such as chirp nonlinearity or synthesizer phase noise.
Fig. 7. A contemporary fully-integrated FMCW radar-on-chip (RoC) sensor solution which consists of a single transmitter (TX) and a single receiver (RX) antenna and processing chain.

A simplified block diagram of a modern monolithic FMCW radar-on-chip (RoC) sensor is shown in Fig. 7, which sketches its main components. It consists of two major functional blocks: (i) the RF sensor front end, containing antennas, signal creation and transmission, signal reception and conditioning, and analog-to-digital sampling and conversion, and (ii) the digital back end, which converts time-domain samples into frequency information, identifies targets and calculates their distances, relative radial velocities and angles, and can even perform some advanced functions like target classification or object tracking. In the mm-wave bands of interest, antennas are mostly realized as patch or dipole antennas on the printed-circuit board (PCB) due to their dimensions.

4.1 FMCW radar transmitters

The key components of FMCW transmitters are the FMCW synthesizers. They synthesize the transmitted radar signal and provide the desired modulation schemes. The most important signal conditioning parameters are the transmitter phase noise and the generated chirp nonlinearity, and both have a profound effect on extracting relevant target information from background clutter and noise.

A common block in the vast majority of FMCW synthesizers is the oscillator. Integrated voltage-controlled oscillators (VCOs) and digitally-controlled oscillators (DCOs) [13] in CMOS and BiCMOS technologies are generally nonlinear with respect to the input control signal due to the nonlinear varactor devices [18] which are used in resonators as the frequency control elements. Accordingly, the biggest issue in an FMCW synthesizer is the compensation of the inherent nonlinearity of the DCO/VCO frequency tuning curve.

Various methods for FMCW signal generation have been proposed so far, each with its own advantages and disadvantages. The most intuitive method is based on the open-loop oscillator, in which the compensation of its nonlinearity is achieved via a lookup table (LUT) and a digital-to-analog converter (DAC). A drawback of this method is the frequency drift with temperature or supply voltage variations, which demands periodic updating of the LUT. Apart from the aforementioned variations, unwanted load fluctuations and disturbances, which cannot be compensated, also have a large effect on the oscillator frequency. Therefore, oscillator nonlinearity is often compensated in closed-loop systems such as PLLs.

In feedback-loop-based FMCW synthesizers, dominated by phase-locked loop (PLL) systems, the carrier frequency can be modulated by directly imposing the control signal of a VCO, by modulating the reference frequency of an integer-N PLL [5], or by using a fractional-N PLL to change the feedback frequency divider ratio [10–15], hence producing the modulation. The advantages of direct VCO modulation are a simple circuit structure and the absence of additional noise sources. On the other hand, direct VCO modulation requires a loop bandwidth at least an order of magnitude smaller than the modulation frequency, which results in a very low filter cross-over frequency that is impractical for integration. A method of modulating the reference frequency of an integer-N PLL, also known as direct digital frequency synthesis (DDFS), employs a LUT and a DAC to convert a digital word representing phase to an analog voltage.
The use of the DAC constitutes the main disadvantage of this method, because the nonlinearity of its transfer characteristic, the settling time, the finite slew rate and the jitter coming from the DAC result in spurious signals and serious phase noise degradation at the output of the FMCW synthesizer.

Probably the most suitable method for FMCW signal generation is based on fractional-N PLLs [25]. This method requires neither a low-noise DAC nor a LUT, and it provides highly linear frequency sweeps. Thus, it is widely adopted in contemporary integrated FMCW radar sensor modules. Various frequency synthesizer architectures based on fractional-N PLLs have been reported. They include: a PLL with a fundamental-frequency VCO or DCO [5,10,12,13,26], a PLL with a push-push VCO [27], a PLL and a frequency multiplier [14,16,28–30], and a PLL tied with an injection-locked oscillator [31]. Each of these oscillator architectures and methods has its own pros and cons, which are summarized in [26]. The choice of the actual synthesizer architecture mainly depends on the required PLL phase noise and output amplitude, but also on the process technology that is being used.

A recent example of an FMCW synthesizer packed in a complete transmitter module [32] provides, in a reasonable modulation time window, an extremely large chirp bandwidth of more than 10 GHz, thus enabling range resolutions that are better than 1.5 cm. It is intended to serve as a ubiquitous short-distance radar solution that operates in the unlicensed spectrum band around 65 GHz and to compete in diverse fields of demanding consumer products, like emerging gesture sensors, but also in industrial applications.

Even though at first glance it might seem counterintuitive, excluding the transceiver chain, in particular the low-noise and power amplifiers, short-range radars (SRRs) are actually more challenging to design than long-range ones. A dominant source of difficulties in SRRs arises from the limited time frame associated with targets in close proximity to the radar. Specifically, as can be deduced from Fig. 3, for a fixed modulation slope, lower beat frequencies correspond to objects located at smaller radii. Therefore, it is generally beneficial to decrease the modulation time without compromising the bandwidth in order to push the beat notes of closer targets away from the flicker noise corner frequency. This in turn increases the signal-to-noise ratio (SNR), and consequently improves the detectability of weaker objects.
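The numbers quoted above can be checked with a short calculation. The sketch below evaluates the range resolution ΔR = c/(2B) for a 10 GHz chirp and, assuming the beat-frequency relation fb = 2SR/c with slope S = B/Tc, shows how shortening the modulation time moves the beat note of a close target to a higher frequency. The target distance and the two modulation times are hypothetical values chosen only for illustration.

```python
import math

C = 3.0e8  # speed of light in m/s (rounded)

def range_resolution(bandwidth):
    """Range resolution c / (2 * B) in meters."""
    return C / (2.0 * bandwidth)

def beat_frequency(bandwidth, tc, distance):
    """Beat frequency 2 * S * R / c of a static target, with S = B / Tc."""
    slope = bandwidth / tc
    return 2.0 * slope * distance / C

bandwidth = 10e9                       # chirp bandwidth quoted in the text
res = range_resolution(bandwidth)      # about 0.015 m, i.e. 1.5 cm

# Hypothetical close target at 0.3 m, swept with a slow and a fast chirp
fb_slow = beat_frequency(bandwidth, tc=1e-3, distance=0.3)    # about 20 kHz
fb_fast = beat_frequency(bandwidth, tc=100e-6, distance=0.3)  # about 200 kHz

print(f"range resolution: {res * 100:.2f} cm")
print(f"beat note, 1 ms chirp:   {fb_slow / 1e3:.1f} kHz")
print(f"beat note, 100 us chirp: {fb_fast / 1e3:.1f} kHz")
```

A tenfold shorter modulation time at the same bandwidth raises the beat note tenfold, which is the mechanism by which close targets are moved away from the low-frequency flicker noise region.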
Nonlinearity, manifested as an instantaneous frequency deviation from the ideal chirp, disturbs the beat tone and thereby deteriorates the radar's measurement accuracy and precision. Even though faster chirps of high bandwidth, i.e., steeper ones, are more prone [30] to nonlinear frequency excursions, the above-mentioned state-of-the-art radar transmitter [32] is able to achieve superb frequency sweep linearity at acceptable phase noise levels. Generally speaking, the use of a closed-loop PLL enables the generation of highly linear chirps, which avoid smearing of the FFT peaks and thus retain the full benefit of the range resolution associated with a high RF bandwidth.

Although a wide RF bandwidth improves the radar's range resolution, it typically leads to a longer chirp duration, which in turn limits the maximum unambiguous velocity due to undersampling of the Doppler frequency shift. Hence, supporting steeper chirps, i.e., higher frequency ramp slopes, is essential to achieve higher range resolutions without compromising the maximum velocity. As a side advantage, a wider IF bandwidth relaxes the design of the analog baseband filters (moderate roll-off), but requires higher analog-to-digital converter (ADC) sampling rates to achieve equal maximum detectable distances.

Another subtle but substantial advantage of steeper modulation slopes is illustrated in Fig. 8 for the case of triangular chirps, and relates to the fact that spatially equidistant targets yield more widely separated tones in the beat-frequency domain. Thus, the noise skirt from one target produces less interference in the detection of nearby objects. For stationary targets, the previous statements can be expressed analytically as

$$ f_{b2} - f_{b1} = \frac{2B}{cT_c}\,(R + d) - \frac{2B}{cT_c}\,R = \frac{2B}{cT_c}\,d = \frac{2}{c}\,S\,d, \tag{11} $$

where d is the radial distance between the targets with respect to the radar.

Fig. 8. Effect of the modulation slope S on the beat frequency separation in static multi-target detection scenarios: (a) spatial object positioning, (b) time-domain transmitted FMCW triangular chirps, (c) IF amplitude spectrum for three different modulation rates.

For all the mentioned reasons, it is important to simultaneously increase the RF bandwidth and reduce the modulation time, thus supporting steeper slopes. To achieve that, many technical challenges have to be tackled.

4.2 FMCW radar receivers

Although the receiver is just as important as the transmitter, less attention is devoted to it here due to its greater similarity to wireless communication transceivers. In the simple homodyne implementation, the FMCW receiver is just a plain direct-conversion radio receiver where the modulated signal is frequency translated in a single conversion step. This avoids additional complexity, but since the TX and RX frequencies differ, it also yields some properties much like those of a superheterodyne receiver, e.g., instead of being zero, the IF is sufficiently large. Some additional simplifications in terms of LO injection are present, too. Namely, in the case of an up-chirp sawtooth modulation high-side injection is present, while in the case of a down-chirp sawtooth modulation low-side injection applies. For the case of a triangular modulation, high-side and low-side injection apply to the rising and falling frequency slopes, respectively. Even though the transmitter signal energy can leak through the mixer and then reflect back to create a self-mixing DC offset, the baseband processing chain usually starts with a high-pass filter to alleviate this effect.

Finally, leading-edge FMCW radar sensors adopt the complex baseband receiver architecture, which uses quadrature mixers with complex IF and ADC chains that include both in-phase (I) and quadrature (Q) channels. This so-called IQ baseband architecture brings several advantages, but the most straightforward one seems to be better noise figure performance (up to 3 dB in theory), because the image band noise foldback into the band of interest is eliminated.
other benefits include reduced impact of rf intermodulation products due to receiver’s nonlinearity combined with the presence of strong tx-to-rx antenna coupling and spillover or very near objects like, e.g., a car bumper. 5 conclusions an introduction to radar systems that adopt frequency-modulated continuous waves, or fmcw, to measure range and velocity of remote objects has been made in this article. it has been explained that the received fmcw signal from the remote objects comprises of different time delayed and frequency shifted copies of transmitted chirp signals. an elaborate analysis on how the received signal can be processed in order to obtain the useful information has been performed and the fundamental operating principles of fmcw radars was discussed. some basic limitations in terms of resolution and maximum measurable distance and speed were shown. finally, the examples of recent cutting-edge integrated fmcw radar transceiver implementations are given. since the focus was on the most simple siso radar sensors, angle-of-arrival estimation, beamforming and mimo radar techniques were omitted. also, the so-called radar range equation which is a kind of a link budget for radars, as well as range precision and accuracy were not covered because depending on the actual algorithm it may vary from centimeters down to micrometers. in spite of that, a good head start in the fmcw topic is hopefully provided. 566 v. m. milovanović on range-doppler estimation in integrated fmcw radar sensors 567 522 v. milovanović acknowledgements the author would like to thank the colleagues from novelic microsystems and faculty of engineering, university of kragujevac on helpful discussions. he would also like to acknowledge support granted by the ministry of education, science and technological development through the iii-41007 project. references [1] j. hasch, e. topak, r. schnabel, t. zwick, r. weigel, and c. 
waldschmidt, “millimeter-wave technology for automotive radar sensors in the 77 ghz frequency band,” ieee trans. microw. theory techn., vol. 60, no. 3, pp. 845–860, mar. 2012. [2] c. li, z. peng, t. y. huang, t. fan, f. k. wang, t. s. horng, j. m. muñozferreras, r. gómez-garcía, l. ran, and j. lin, “a review on recent progress of portable short-range noncontact microwave radar systems,” ieee trans. microw. theory techn., vol. 65, no. 5, pp. 1692–1706, may 2017. [3] m. pauli, b. göttel, s. scherr, a. bhutani, s. ayhan, w. winkler, and t. zwick, “miniaturized millimeter-wave radar sensor for high-accuracy applications,” ieee trans. microw. theory techn., vol. 65, no. 5, pp. 1707–1715, may 2017. [4] l. yujiri, m. shoucri, and p. moffa, “passive millimeter wave imaging,” ieee microw. mag., vol. 4, no. 3, pp. 39–50, sep. 2003. [5] t. mitomo, n. ono, h. hoshino, y. yoshihara, o. watanabe, and i. seto, “a 77 ghz 90 nm cmos transceiver for fmcw radar applications,” ieee j. solid-state circuits, vol. 45, no. 4, pp. 928–937, apr. 2010. [6] m. skolnik, introduction to radar systems, 3rd ed. mcgraw-hill, 2002. [7] s. trotta, h. knapp, d. dibra, k. aufinger, t. f. meister, j. bock, w. simburger, and a. l. scholtz, “a 79 ghz sige-bipolar spread-spectrum tx for automotive radar,” in ieee int. solid-state circuits conf. (isscc) dig. tech. papers, feb. 2007, pp. 430–613. [8] d. guermandi, q. shi, a. dewilde, v. derudder, u. ahmad, a. spagnolo, i. ocket, a. bourdoux, p. wambacq, j. craninckx, and w. v. thillo, “a 79ghz 2 × 2 mimo pmcw radar soc in 28-nm cmos,” ieee j. solid-state circuits, vol. 52, no. 10, pp. 2613–2626, oct. 2017. [9] w. v. thillo, v. giannini, d. guermandi, s. brebels, and a. bourdoux, “impact of adc clipping and quantization on phase-modulated 79 ghz cmos radar,” in 2014 11th eur. radar conf. (eurad), oct. 2014, pp. 285–288. [10] j. lee, y. a. li, m. h. hung, and s. j. huang, “a fully-integrated 77-ghz fmcw radar transceiver in 65-nm cmos technology,” ieee j. 
solid-state circuits, vol. 45, no. 12, pp. 2746–2756, dec. 2010. 568 v. m. milovanović on range-doppler estimation in integrated fmcw radar sensors 569 522 v. milovanović acknowledgements the author would like to thank the colleagues from novelic microsystems and faculty of engineering, university of kragujevac on helpful discussions. he would also like to acknowledge support granted by the ministry of education, science and technological development through the iii-41007 project. references [1] j. hasch, e. topak, r. schnabel, t. zwick, r. weigel, and c. waldschmidt, “millimeter-wave technology for automotive radar sensors in the 77 ghz frequency band,” ieee trans. microw. theory techn., vol. 60, no. 3, pp. 845–860, mar. 2012. [2] c. li, z. peng, t. y. huang, t. fan, f. k. wang, t. s. horng, j. m. muñozferreras, r. gómez-garcía, l. ran, and j. lin, “a review on recent progress of portable short-range noncontact microwave radar systems,” ieee trans. microw. theory techn., vol. 65, no. 5, pp. 1692–1706, may 2017. [3] m. pauli, b. göttel, s. scherr, a. bhutani, s. ayhan, w. winkler, and t. zwick, “miniaturized millimeter-wave radar sensor for high-accuracy applications,” ieee trans. microw. theory techn., vol. 65, no. 5, pp. 1707–1715, may 2017. [4] l. yujiri, m. shoucri, and p. moffa, “passive millimeter wave imaging,” ieee microw. mag., vol. 4, no. 3, pp. 39–50, sep. 2003. [5] t. mitomo, n. ono, h. hoshino, y. yoshihara, o. watanabe, and i. seto, “a 77 ghz 90 nm cmos transceiver for fmcw radar applications,” ieee j. solid-state circuits, vol. 45, no. 4, pp. 928–937, apr. 2010. [6] m. skolnik, introduction to radar systems, 3rd ed. mcgraw-hill, 2002. [7] s. trotta, h. knapp, d. dibra, k. aufinger, t. f. meister, j. bock, w. simburger, and a. l. scholtz, “a 79 ghz sige-bipolar spread-spectrum tx for automotive radar,” in ieee int. solid-state circuits conf. (isscc) dig. tech. papers, feb. 2007, pp. 430–613. [8] d. guermandi, q. shi, a. dewilde, v. derudder, u. 
ahmad, a. spagnolo, i. ocket, a. bourdoux, p. wambacq, j. craninckx, and w. v. thillo, “a 79 ghz 2 × 2 mimo pmcw radar soc in 28-nm cmos,” ieee j. solid-state circuits, vol. 52, no. 10, pp. 2613–2626, oct. 2017. [9] w. v. thillo, v. giannini, d. guermandi, s. brebels, and a. bourdoux, “impact of adc clipping and quantization on phase-modulated 79 ghz cmos radar,” in 2014 11th eur. radar conf. (eurad), oct. 2014, pp. 285–288. [10] j. lee, y. a. li, m. h. hung, and s. j. huang, “a fully-integrated 77-ghz fmcw radar transceiver in 65-nm cmos technology,” ieee j. solid-state circuits, vol. 45, no. 12, pp. 2746–2756, dec. 2010. [11] n. pohl, t. jaeschke, and k. aufinger, “an ultra-wideband 80 ghz fmcw radar system using a sige bipolar transceiver chip stabilized by a fractional-n pll synthesizer,” ieee trans. microw. theory techn., vol. 60, no. 3, pp. 757–765, mar. 2012. [12] t. n. luo, c. h. e. wu, and y. j. e. chen, “a 77-ghz cmos fmcw frequency synthesizer with reconfigurable chirps,” ieee trans. microw. theory techn., vol. 61, no. 7, pp. 2641–2647, jul. 2013. [13] w. wu, r. b. staszewski, and j. r. long, “a 56.4-to-63.4 ghz multi-rate all-digital fractional-n pll for fmcw radar applications in 65 nm cmos,” ieee j. solid-state circuits, vol. 49, no. 5, pp. 1081–1096, may 2014. [14] j. park, h. ryu, k. w. ha, j. g. kim, and d. baek, “76-81-ghz cmos transmitter with a phase-locked-loop-based multichirp modulator for automotive radar,” ieee trans. microw. theory techn., vol. 63, no. 4, pp. 1399–1408, apr. 2015. [15] g. hasenaecker, m. van delden, t. jaeschke, n. pohl, k. aufinger, and t. musch, “a sige fractional-n frequency synthesizer for mm-wave wideband fmcw radar transceivers,” ieee trans. microw. theory techn., vol. 64, no. 3, pp. 847–858, mar. 2016. [16] j. h. song, c. cui, s. k. kim, b. s. kim, and s.
nam, “a low-phase-noise 77-ghz fmcw radar transmitter with a 12.8-ghz pll and a ×6 frequency multiplier,” ieee microw. compon. lett., vol. 26, no. 7, pp. 540–542, jul. 2016. [17] h. jia, l. kuang, w. zhu, z. wang, f. ma, z. wang, and b. chi, “a 77 ghz frequency doubling two-path phased-array fmcw transceiver for automotive radar,” ieee j. solid-state circuits, vol. 51, no. 10, pp. 2299–2311, oct. 2016. [18] i. m. milosavljević, ð. p. glavonjić, d. p. krčum, l. v. saranovac, and v. m. milovanović, “a highly linear and fully-integrated fmcw synthesizer for 60 ghz radar applications with 7 ghz bandwidth,” springer analog integr. circuits signal process., vol. 90, no. 3, pp. 591–604, mar. 2017. [19] m. hitzler, s. saulig, l. boehm, w. mayer, w. winkler, n. uddin, and c. waldschmidt, “ultracompact 160-ghz fmcw radar mmic with fully integrated offset synthesizer,” ieee trans. microw. theory techn., vol. 65, no. 5, pp. 1682–1691, may 2017. [20] a. townley, p. swirhun, d. titz, a. bisognin, f. gianesello, r. pilard, c. luxey, and a. m. niknejad, “a 94-ghz 4tx-4rx phased-array fmcw radar transceiver with antenna-in-package,” ieee j. solid-state circuits, vol. 52, no. 5, pp. 1245–1259, may 2017. [21] e. öztürk, d. genschow, u. yodprasit, b. yilmaz, d. kissinger, w. debski, and w. winkler, “a 60-ghz sige bicmos monostatic transceiver for fmcw radar applications,” ieee trans. microw. theory techn., vol. 65, no. 12, pp. 5309–5323, dec. 2017. [22] federal communications commission (fcc), “operation within the band 57–71 ghz, title 47 cfr part 15, subpart c, § 15.255,” nov. 2016. [23] b. p. ginsburg, k. subburaj, s. samala, k. ramasubramanian, j. singh, s. bhatara, s. murali, d. breen, m. moallem, k. dandu, s. jalan, n. nayak, r. sachdev, i. prathapan, k. bhatia, t. davis, e. seok, h. parthasarathy, r. chatterjee, v. srinivasan, v. giannini, a. kumar, r. kulak, s. ram, p.
gupta, z. parkar, s. bhardwaj, y. c. rakesh, k. a. rajagopal, a. shrimali, and v. rentala, “a multimode 76-to-81 ghz automotive radar transceiver with autonomous monitoring,” in ieee int. solid-state circuits conf. (isscc) dig. tech. papers, feb. 2018, pp. 158–160. [24] v. winkler, “range doppler detection for automotive fmcw radars,” in proc. eur. radar conf., oct. 2007, pp. 166–169. [25] w. wang, x. chen, and h. wong, “a system-on-chip 1.5 ghz phase locked loop realized using 40 nm cmos technology,” facta universitatis, series: electronics and energetics, vol. 31, no. 1, pp. 101–113, mar. 2018. [26] s. kang, j. c. chien, and a. m. niknejad, “a w-band low-noise pll with a fundamental vco in sige for millimeter-wave applications,” ieee trans. microw. theory techn., vol. 62, no. 10, pp. 2390–2404, oct. 2014. [27] a. ergintav, y. sun, f. herzel, h. j. ng, g. fischer, and d. kissinger, “a 61 ghz frequency synthesizer in sige bicmos for 122 ghz fmcw radar,” in proc. eur. microw. integr. circuits conf., oct. 2016, pp. 325–328. [28] g. liu, a. trasser, and h. schumacher, “a 64-84-ghz pll with low phase noise in an 80-ghz sige hbt technology,” ieee trans. microw. theory techn., vol. 60, no. 12, pp. 3739–3748, dec. 2012. [29] h. j. ng, a. fischer, r. feger, r. stuhlberger, l. maurer, and a. stelzer, “a dll-supported, low phase noise fractional-n pll with a wideband vco and a highly linear frequency ramp generator for fmcw radars,” ieee trans. circuits syst. i, reg. papers, vol. 60, no. 12, pp. 3289–3302, dec. 2013. [30] j. vovnoboy, r. levinger, n. mazor, and d. elad, “a dual-loop synthesizer with fast frequency modulation ability for 77/79 ghz fmcw automotive radar applications,” ieee j. solid-state circuits, vol. 53, no. 5, pp. 1328–1337, may 2018. [31] a. musa, r. murakami, t. sato, w. chaivipas, k. okada, and a. matsuzawa, “a low phase noise quadrature injection locked frequency synthesizer for mmwave applications,” ieee j. solid-state circuits, vol. 46, no. 11, pp. 
2635–2649, nov. 2011. [32] i. m. milosavljević, d. p. krčum, d. p. glavonjić, s. p. jovanović, v. r. mihajlović, d. m. tasovac, and v. m. milovanović, “a sige highly integrated fmcw transmitter module with a 59.5-70.5 ghz single sweep cover,” ieee trans. microw. theory techn., vol. 66, no. 9, pp. 4121–4133, sep. 2018.

facta universitatis series: electronics and energetics vol. 35, no 2, june 2022, pp. 199-216 https://doi.org/10.2298/fuee2202199k © 2022 by university of niš, serbia | creative commons license: cc by-nc-nd

original scientific paper

website quality assessment. a case study of gsm hosting forum

hossein kardanmoghaddam, mohsen masoumi
department of computer engineering, birjand university of technology, birjand, iran

abstract. the ever-growing amount of data on mobile phones, tablets, and smart electronic devices available on the internet, and the need to use this data to solve problems, highlight the importance of evaluating and validating data-sharing websites. the gsmhosting website plays a key role in communicating with its users and providing them with services related to the repair of mobile phones and smart electronic devices. the purpose of this study was to determine its quality from the perspective of mobile phone repair technicians. the participants were 100 technicians from birjand in south khorasan province (iran) who used the gsmhosting website as a reference. the website quality assessment was conducted in the summer of 2020. the study applied a descriptive, cross-sectional survey method based on a questionnaire. the questionnaire covered 11 website dimensions: routing, information, delivery, apparent features, security, reputation, society, entertainment, provided goods and services, reliability, and trust. scores were given on a likert scale. the validity of the questionnaire was determined using the opinions of web experts.
the spss software and descriptive and inferential statistical methods were used to analyze the data. the results indicated that the average quality of this website was acceptable with respect to the technicians’ goals. addressing the problems identified in some of the website dimensions will increase the overall quality of the website and better support technicians in their activities.

key words: website, web user, internet, user satisfaction

received may 5, 2021; received in revised form march 27, 2022
corresponding author: hossein kardanmoghaddam, department of computer engineering, birjand university of technology, birjand, iran (e-mail: h.kardanmoghaddam@birjandut.ac.ir)

1. introduction

the growing use of the internet, the increasing number of online customers, and the transactions carried out online indicate that the internet is at the center of today's commercial transactions. under these conditions, the internet has become an effective channel for e-commerce [1]. from the companies’ perspective, websites are useful for promoting products and services as well as generating income. given the highly competitive environment, websites have become a dynamic marketing tool whose optimization is an effective means of customer acquisition. website quality is vital for businesses to attract and retain loyal customers. the quality of website services plays a key role in determining the success or failure of e-commerce [2] [11]. the quality of e-services refers to customers' evaluations and general judgments about the quality of the service provided to them in virtual markets. when online customers receive quality service, they tend to stay loyal to the service provider's website. the quality of internet services also improves customer satisfaction, and satisfied customers are more loyal to the website. repeated purchases by loyal customers generate income as well as profitability for web businesses.
organizations and managers need quality websites for the following reasons. first, there is no face-to-face communication on a website, and interactions take place through technology. some aspects of human interaction cannot be replaced by technology, and their absence should be compensated for by improving website quality [3]. research has also shown that in 60% of cases people cannot find what they want on websites, which results in a significant number of unwanted and repeated visits to a website and, in turn, a waste of time and energy [4]. therefore, regularly reviewing and evaluating websites, and then identifying their strengths and weaknesses, provides appropriate strategies for policy making and decision making [5]. assessing the quality of websites in order to discover their problems can be very productive. content and textual assessment have been around for a long time, but websites have brought new aspects and applications, including the rapid updating and dissemination of information, each of which requires separate study [6].

in the mobile and electronics repair domain, many people are not familiar with the basics of repair work, and this lack of knowledge may lead to wasted time and money. in many parts of the world, people try to share experience and knowledge in the field of electronics repair, especially mobile phone repair, by creating specialized forums and launching websites and groups in cyberspace. these forums work in several areas: firstly, on the parts, accessories, and boxes needed to solve hardware and software problems; secondly, in the development of various operating systems, e.g.
the android operating system (from february 2021 to february 2022, more than 40% have used the android operating system [7]); thirdly, in providing a place for reviewing and solving software and hardware problems and for sharing people's experiences across different fields; fourthly, in creating an environment for questions and answers on problems related to electronics, especially mobile phones, where problems are answered by webmasters or other users. creating good-quality web-based environments is an increasing necessity. on the whole, a person in such an environment is able to procure the required equipment, acquire the necessary training, share information, and obtain information about the maintenance of electronic components. one of the websites that can help a lot in this regard is the gsmhosting (gsmhosting.com) website, which has many users in different countries, especially in the middle east (see: https://www.alexa.com/siteinfo/gsmhosting.com).

there is a lot of information and advice on the internet for mobile phone repair technicians, in the form of blogs, videos, or even step-by-step tutorials. one of the best resources is specialized forums. these forums are a great source of information because they bring together many people, including professionals, amateurs, and even salespeople. they provide a unique perspective on the solution to a problem and therefore offer users a comprehensive list of solutions. the gsmhosting.com website, known as the gsm forum, is one of the most comprehensive communities for mobile phone repair technicians and a great source of information on how to fix mobile phone problems. this website is one of the oldest mobile phone repair platforms, with members from all over the world who use its regularly updated posts.
almost every type of mobile phone and its problems are reviewed and answered on this website. the website has sections for mobile phone topics and for the various types of repair problems that technicians may encounter. the data on the website is categorized so that users can simply move from one post to another and easily find solutions to their repair problems. there are also various discussions about different hardware and software, and people can find thousands of topics about each of the major mobile manufacturers. one section of the website covers the latest technology news and other interesting items regarding the sale of the latest products in the field of mobile phones. the internal forum of this website is a good place for membership and exchange of information for people who are looking for information about mobile phone repair. in addition, the option to search without registering makes the website easier to use for people who only want to follow the discussions. this study examines the quality of the gsmhosting website from the perspective of mobile phone repair technicians in birjand, south khorasan province (iran).

in this paper, the definition of quality, user satisfaction, and website evaluation models are introduced in the second section. literature reviews and other works on the quality evaluation of different websites are discussed in the third section. the research method is described in the fourth section, and the fifth section gives the descriptive findings. the sixth section presents the inferential findings. research hypotheses, conclusions, and comparisons with similar work are discussed in the seventh section, together with recommendations for improving the website and suggestions for future research.

2.
the definition of quality and website quality evaluation models

the literature review shows that there is no standard definition of website quality, owing to the large variety of existing websites; in addition, the dimensions of service quality vary depending on the type of website. for example, dimensions such as reliability, ease of use, and security are of particular importance for websites selling physical products, while the search capacity and reliability of digital information are important for those offering products or services. quality is a set of features and specifications of a product or service which can meet explicit or implicit needs [8]. in most definitions, customer satisfaction and meeting customers' demands are considered the most important factors of quality. based on research on website requirements [9], the needs of all stakeholders (owners, users, developers) should be considered. this research focuses on the users of the website. user satisfaction is the user's perception of the extent to which his/her needs are met. according to parasuraman [10], service quality is the extent and direction of the difference between customers' perceptions and expectations of service. kotler (2000) described satisfaction as a person's pleasant or unpleasant feeling resulting from comparing the perceived performance or results of a product with their expectations [11]. customer satisfaction is the motivation that customers gain from products or services; it is based on past experience and the evaluation of service effectiveness [12].

zeithaml (2002) [13] introduced the quality of provided services as the difference between customers' expectations and their perceptions of received services, and developed the e-servqual tool to measure the quality of electronic services. this model includes seven dimensions: efficiency, reliability, implementation, privacy, responsiveness, compensation, and contact. four dimensions (efficiency, reliability, implementation, and privacy) constitute the main e-servqual scale [14]. the main scale applies when users have no trouble using the website. the other three dimensions (responsiveness, compensation, and contact) form a recovery scale for e-servqual, which applies when users have difficulty using the site. service quality can be defined as the difference between customers' pre-service expectations of performance and their perceptions of received services [15]. lee et al. define the quality of received service as a general belief and attitude towards service excellence, and regard attention to service quality as a reflection of the extent and direction of the differences between customer perceptions and expectations [16].

the servqual questionnaire was designed by parasuraman et al. [17] to measure service quality by comparing customers' expectations with their perception of actual performance. five dimensions are considered in the questionnaire:
▪ perceptible factors: physical facilities, equipment, and the appearance of employees.
▪ reliability: the ability to deliver the promised services with complete confidence and accuracy.
▪ responsiveness: willingness to help customers and provide services without delay.
▪ confidence: employees' knowledge, politeness, and humility, in addition to their ability to convey a sense of trust and confidence to customers.
▪ empathy: paying attention to each customer, availability, sensitivity, and effort to understand customer needs.

kang & bradley (2002) [18] presented a process model based on the conceptual gap of information technology (it) quality, which determines seven gaps (distances) between users and it service providers. this model is developed based on the main gap (distance) model of service quality (servqual).
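as a small illustration of the gap definition recalled above (service quality as the difference between perceptions and expectations), the following python sketch computes a gap score for each of the five servqual dimensions. the ratings, the assumed 1-7 rating scale, and the function name `servqual_gaps` are hypothetical and chosen only for this example; they are not taken from the cited questionnaires.

```python
from statistics import mean

# The five SERVQUAL dimensions as named in the text above.
DIMENSIONS = ["perceptible factors", "reliability", "responsiveness",
              "confidence", "empathy"]


def servqual_gaps(expectations, perceptions):
    """Return {dimension: mean perception - mean expectation}.

    Both arguments map a dimension name to a list of ratings
    (hypothetical data on an assumed 1-7 scale).
    A negative gap means the service falls short of expectations.
    """
    return {d: mean(perceptions[d]) - mean(expectations[d])
            for d in DIMENSIONS}


# Hypothetical ratings from three respondents per dimension.
expectations = {d: [6, 7, 6] for d in DIMENSIONS}
perceptions = {
    "perceptible factors": [5, 6, 5],
    "reliability": [6, 7, 6],
    "responsiveness": [4, 5, 4],
    "confidence": [6, 6, 6],
    "empathy": [7, 7, 7],
}

gaps = servqual_gaps(expectations, perceptions)
overall_gap = mean(gaps.values())
for name, gap in gaps.items():
    print(f"{name}: {gap:+.2f}")
print(f"overall: {overall_gap:+.2f}")
```

in this made-up data set, responsiveness shows the largest negative gap, so it would be the first dimension to look at; a gap of zero (reliability here) means that perceptions match expectations.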
moreover, their model tests the applicability of servqual, that is, whether this standard and popular instrument designed to measure service quality can also measure it service quality and, ultimately, the performance of an it department, using a modified three-column servqual model. their questionnaire uses 13 questions instead of the 22 questions of servqual, eliminating the perceptible factors. they used two factors, the skills of it service providers and the characteristics of the it service, and estimated the 7 defined gaps. the study introduces three levels of it service quality, based on three levels of users' behavioral perspective on it services: the ideal level, the acceptable level, and the perceived level. a new conceptual model was developed by integrating these three levels with the main gap model, and it identified seven gaps between suppliers and users of it services. the study indicated that there are two separate service factors, individual features and it service features, for the three to four criteria evaluated in the proposed model. moreover, it showed that the three-column servqual model is more efficient than the original model.

parasuraman et al. [19] empirically evaluated the quality of electronic services using a multiple-item scale, e-s-qual, which assesses the quality of the electronic service provided to customers through a website. this approach has four general factors: efficiency, compliance, system availability, and confidentiality. most studies conducted on the quality of websites have focused on their dimensions. for example, the webqual measuring tool has twelve dimensions: work-relevant information, interaction, trust, response time, design, conjecture, appearance, innovation, flow, coherent communication, business process, and substitutability. this set of characteristics also has some limitations [20]. yang et al.
extracted the following model from the existing literature and assessed it [20] (fig. 1).

fig. 1 model of yang et al. [20]

reichheld and schefter state that customer loyalty plays an important role in online shopping and relies on the customer's satisfaction with the online seller [12]. loyalty refers to the customer's repeated intention to purchase products or services through a particular online retailer. loyalty is assessed through many behavioral criteria, such as total share of purchase, repurchase, and probability of purchase [21]. satisfaction is considered as the overall satisfaction with the technology of the service provider through which online transactions have been conducted. customer satisfaction is the key indicator of a company's profitability and market share and an index of its financial health. satisfied customers tend to repurchase products, which increases the company's market share [22]. customer loyalty is defined as the time spent purchasing similar products from a particular category relative to the total number of purchases made by the buyer in that category. accordingly, maximizing loyalty and the long-term value of customer purchases are the most important goals of a website [11]. internet loyalty, known as electronic loyalty, is in fact a developed form of traditional loyalty that explains online customer behavior. electronic loyalty differs from traditional loyalty in that it emphasizes customer loyalty in internet-based business; it is defined as the perceived desire to visit or reuse a website and to purchase from that website in the future [22]. zeithaml [22] defined website quality as the extent to which the internet facilitates the efficient and effective purchasing and delivery of real products and the completion of the purchasing process.
jeon and jeong (2017) [23] assessed the effect of website service quality (efficiency, system accessibility, implementation, and privacy) on the satisfaction and loyalty of internet customers. the relationships between the variables of this research are shown in fig. 2. webqual is a tool for assessing user perceptions of service quality [24] and is based on quality function deployment (qfd). qfd is a systematic and structured process that determines and takes into account the user needs at each stage of the production, implementation, or development of information services. the use of qfd begins with the registration of user needs as the basis for determining quality needs. understanding the customers' language and their explicit and implicit desires is one of the most important concerns of this step in qfd. the quality criteria determined in qfd are presented to users and their feedback is received. then the quality of the product or service is measured. barnes & vidgen (2005) [24] applied an online questionnaire with a modified list of criteria affecting website quality to evaluate the quality of four uk business school websites. fink & nyaga (2009) [25] used webqual guidelines to evaluate websites, adding a dimension called risk. they assessed the quality of the websites of major public accounting firms by surveying the opinions of potential employees and analyzing the data in order to establish best quality practices for public accounting websites. in this study, a modified webqual questionnaire was used to evaluate the websites of six public accounting firms. the data analysis clearly shows the highest quality for the usability construct at all levels, while risk is the characteristic with the lowest quality level. this study applied multiple models (design knowledge, constructivism, and expediency).
the dimensions included usability (learning, setting up services, ease of use, design appropriateness, competence, attractive appearance), interactivity (personalization, service search, credibility (reputation), search facilities, convenient communication, interactive efforts), information (level of detail, reliability, relationship, acceptability, usefulness for understanding contents, appropriate format), and risk (access security, communication security, transaction security, service delivery safety, information security, private criteria). the risk dimension is added and can be treated in the same way as the other three main webqual dimensions. findings are discussed using both statistics and researcher values. webb & webb (2004) [26] presented a conceptual model (site-qual) of factors affecting consumer perception of business to customer (b2c) websites. the fundamental hypothesis of the site-qual model consists of two main criteria: information quality and process quality. information quality includes accessibility, content, proper presentation, display, and intrinsic quality. process quality includes stability, responsiveness, reliability, empathy, and the quality of tangible system components. in another study, davidson & joan (2005) [27] presented a process model to evaluate the quality of e-services with the following features: 1. website evaluation requirements; 2. management beliefs about customer requirements; 3. web design and implementation. golovkova et al. (2019) [28] presented an e-service satisfaction model with the following features: 1. easy to use; 2. resolution; 3. consistency; 4. composition and structure; 5. content.

fig. 2 website services quality based on the jeon and jeong concept model [23]

the acsi indicator (american customer satisfaction index) provides an appropriate statistical model to predict customer behavior. the acsi model includes indicators such as the probability of the customer returning to the website. the sample is randomly selected from among the website visitors. each website is categorized based on the views of its visitors in different spectrums of satisfaction. the satisfaction indicators of acsi are content, performance, site traffic, site performance, understanding, and searching of the website, which ultimately result in a score between 0 and 100 for the website. in the acsi model, the main inputs are expectations, perceived quality, and perceived value, and the outputs are the main consequences of customer satisfaction, namely customer complaints and customer loyalty. the output of the acsi model is developed based on the general marketing model. according to this model, the immediate consequence of growing customer satisfaction is a reduction in complaints and an improvement in loyalty. therefore, the two indicators of complaints and customer loyalty are determined as the output of the acsi model [28]. the quality of website services plays a leading role in the success or failure of e-commerce. the quality of e-services refers to customers' overall evaluation and judgment of the superiority of the quality of the service delivered to them in virtual markets. the higher the quality of the services provided to online customers, the more loyal they tend to stay to the service provider's website. the quality of internet services also improves the level of customer satisfaction, and satisfied customers are more loyal to the website [29].

3. background

hasan and abuelrub emphasized that, in designing a comprehensive framework for evaluating websites, the dimensions of content, design, organization, and user-friendliness must be considered [30]. yang et al. [31] stated in their study that active, helpful, and useful links are important in website design. using a conceptual model, robbins ss [32] examined the structural and content characteristics of 90 business websites based on culture and type of occupation.
the results showed that the websites studied were not significantly different in terms of structural features, but that there was a significant difference between most of the content features of the websites in different cultural groups. mcinerney and bird [33] evaluated the quality of the content of genetically modified food websites using the wqet tool. the results showed that, in data retrieval, accessibility is the only feature of website quality that is considered. finally, a model was developed to evaluate the quality of websites. hill et al. [34] looked at physics education websites for middle schools. using a checklist, they surveyed 285 school websites in southern california for structure and content. the findings showed that most of the physics education websites are incomplete in terms of structure and content. carlos and vera silva [35] evaluated the websites of higher education institutions and concluded that they still have to make an effort to be able to play a role in today's competitive world. in his study, sugak [36] examined the reasons why russian universities' websites have a low rank and found that the content of these websites was not of good quality. many studies have been carried out in iran on the websites of university libraries from different aspects [37] [38] [39] [40]. these studies showed that the websites of iranian university libraries are of poor quality in terms of criteria such as home-page content [40] and up-to-date material and interlinks [38], and that the performance of these websites is very different from that of websites in advanced countries [41]. mohamadesmaeil s, movahedi stated in their study that the websites of the national medical libraries of iran were poor compared to the same websites in the united states in terms of criteria such as being up-to-date, efficiency, and reliability of information [42]. ahmadi n.
and hosseyni (2014) [43] evaluated the websites of the central libraries of iranian state universities. the results showed that, in the "useful" dimension, the highest score belonged to "trust" and the lowest score to "response time" in all libraries studied. parvin et al. [44] conducted a study aimed at evaluating the quality of the website of the ministry of sport and youth affairs through a survey of 31 experts in sport media management using the analytical hierarchy process (ahp). the results show that the usability weight of the website of the ministry of sport and youth is 0.62 and its attractiveness weight is 0.38, and that the content of the website has the highest weight and interaction the lowest. the research in [45] evaluated the quality of the websites of iranian research institutes of medical sciences. the checklist used in this study consisted of 4 dimensions: content quality, design, organization, and user-friendliness of the websites. in the websites surveyed, on average, 81% of the content criteria, 85% of the design criteria, 89% of the organization criteria, and 77% of the user-friendliness criteria were satisfied, and the results indicated that the websites of the research institutes of iranian universities of medical sciences were at a good level in the quality assessment. the results of the study by karkin and janssen [46] showed that the websites studied are not satisfactory in terms of questions and answers, frequently asked questions, and user interaction. the research in [47] evaluated the quality and ranking of persian websites in the field of diabetes (12 websites, covering all iranian diabetes websites) based on the qualitative webmedqual scale. persian diabetes websites gained about 50% of the webmedqual score and are generally at a medium level. these websites scored more than 50%, and were thus ranked above average, in terms of 3 indicators: information content, design, and links.
their score was less than 50% on the other indicators, such as source credibility, accessibility and usability, user support and confidentiality. fathi [48] examined the quality of the websites of selected sports federations in terms of attractiveness and usability, according to students and graduates of physical education and sports sciences in iran. the research was carried out over the internet using a questionnaire completed by 521 respondents. the findings showed that the websites of the selected federations were at a medium to low-medium level, both in terms of attractiveness and usability. kriemadis et al. (2010) [49] assessed the websites of greek and english football clubs and found many differences between them. the quality of the english club websites was superior to that of the greek clubs, as they provided more marketing opportunities. the websites were also compared in terms of features such as information, communication, promotion, sales form, user information collection and website design. gonzalez et al. (2015) [50] used webqual tools to evaluate the quality of sports websites and also applied the qfd approach. they analyzed customers' views of the websites of some of the most popular teams, including barcelona, manchester united, liverpool, and la galaxy. the results showed that the official website of barcelona outperformed its competitors in providing high-quality services. this website also had a higher average in terms of information appropriateness, being up-to-date, and attractive design from the customers' perspective. di fatta et al. (2016) [51] assessed and classified the quality dimensions of the virtual education website of a university based on a combined model of webqual and kano satisfaction. the standard webqual 4 questionnaire was used as the research tool, and multiple regression analysis was applied to assess the effect of asymmetric performance. pamučar et al.
(2018) [52] evaluated the performance of websites in several organizations using the ahp method. they prepared a list of the most important evaluation criteria and calculated the weight of each criterion with the ahp method. choi and kim (2019) [53] evaluated e-commerce websites using the ahp and servqual models, focusing on the services provided by mobile operators and on improving them. singh & prasher (2019) [54] conducted a similar study on the development of a service quality model for hospital web portals based on the same models (ahp and servqual). efe (2019) [55] applied a combination of the topsis and servqual models for the evaluation and improvement of website quality and prioritized the quality evaluation criteria to provide improvement solutions.

numerous studies have been conducted in the field of website evaluation, and several approaches have been applied to measure the quality of websites and assess their structural and content features, including determining factors, link analysis, and website quality evaluation using various tools. the literature review found no study on evaluating websites related to mobile phone repair technicians, which highlights the importance of the current research.

4. research method

in terms of purpose, this research is applied research, and in terms of data collection method it is descriptive-survey research. a random sampling method was used. the questionnaires were printed and distributed offline in the summer of 2020 to mobile phone repair technicians in birjand city in south khorasan province (iran). after the questionnaires were collected and reviewed, 100 questionnaires were analyzed.
in this study, rosenbaum's standard website quality questionnaire (2005) was applied to collect data. it has 58 questions in 11 dimensions (routing, information, delivery, apparent features, security, reputation, society, entertainment, provided goods and services, reliability, and trust). a five-point likert scale (from 1 strongly disagree to 5 strongly agree) was used in the questionnaire. the face and content validity of the questionnaire were confirmed using the opinions of professors and experts in the field of web and management. moreover, its reliability was obtained using cronbach's alpha method (0.841). rosenbaum's model has several advantages in terms of theoretical foundations. it considers all the aspects of the quality of a website and covers all the aspects of the user's intuitive understanding. the model also showed the necessary reliability in our research. spss and excel software were used for descriptive statistics and for inferential statistics to analyze the data obtained from the questionnaires. the questionnaire used in the study has been applied in other works, including [56] and [57].

5. descriptive findings

of the subjects, 89 (89%) were male and 11 (11%) female; 45 (45%) were single and 55 (55%) married. the age distribution of the study population is shown in table 1; as can be seen, 38 subjects (38%) were 25 years old or younger, 44 (44%) were 26-30 years old, and 18 (18%) were older than 30 years.

table 1 distribution of absolute and relative frequency of the studied subjects by age

age                     number of respondents   percentage
25 years and younger    38                      38
26-30 years             44                      44
older than 30 years     18                      18
total                   100                     100

the education level of the study population was as follows: 40% of the subjects had a middle school degree or diploma, 15% had an associate degree, and 45% had a bachelor's degree or higher.
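as an aside on the reliability coefficient reported in the research method (cronbach's alpha = 0.841), the following minimal python sketch shows how such a coefficient is typically computed from item-level likert responses. the study's raw responses are not available here, so the `toy` matrix and the function name `cronbach_alpha` are hypothetical illustrations and not the authors' spss procedure.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a respondents-by-items matrix of scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    Sample variances (n - 1 denominator) are used throughout.
    """
    k = len(responses[0])                      # number of questionnaire items
    item_columns = list(zip(*responses))       # transpose: one tuple per item
    item_var_sum = sum(variance(col) for col in item_columns)
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical data: 5 respondents answering 3 items on a 1-5 Likert scale.
toy = [
    [4, 5, 4],
    [3, 4, 3],
    [5, 5, 4],
    [2, 3, 2],
    [4, 4, 5],
]
alpha = cronbach_alpha(toy)
print(round(alpha, 3))
```

with real data, each row would hold one technician's 58 item scores. values above roughly 0.7 are conventionally read as acceptable reliability, although that threshold is a rule of thumb rather than part of this study.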
table 2 shows the frequency distribution of the study population in terms of income per month.

table 2 distribution of absolute and relative frequency of the study people by monthly income

income (per month)             number of respondents   percentage
800,000 tomans and less        31                      31
800,000-1,500,000 tomans       42                      42
more than 1,500,000 tomans     27                      27
total                          100                     100

as the table above shows, the largest share of the study population (42%) had a monthly income between 800,000 and 1,500,000 tomans ($1 = 13,000 tomans). work experience in the mobile market is one of the most important factors in using and getting acquainted with the above-mentioned websites; its distribution is given in table 3.

table 3 distribution of absolute and relative frequency of the individuals surveyed by years working in the field of mobile

experience           number of respondents   percentage
1-2 years            33                      33
3-4 years            32                      32
5-6 years            20                      20
more than 6 years    15                      15
total                100                     100

as the table above shows, the highest frequency (33%) was for people with 1-2 years of work in the field of mobile and the lowest frequency (15%) was for people with more than 6 years of work in the field of mobile. the descriptive indicators of the quality of the considered web site are given in table 4.

table 4 descriptive indices related to the gsmhosting website quality variable as a whole and its components (with a range of 0-5)

variable                       mean   standard deviation   median   mode
routing                        3.55   0.32                 3.40     3.40
information                    3.93   0.18                 3.94     4.00
delivery                       4.20   0.24                 4.17     4.33
apparent features              2.44   0.34                 2.50     2.50
security                       3.57   0.29                 3.50     3.50
reputation                     4.47   0.21                 4.40     4.40
society                        4.23   0.33                 4.25     4.25
entertainment                  4.11   0.36                 4.00     4.00
provided goods and services    3.77   0.32                 3.83     3.83
reliability                    3.83   0.34                 3.80     3.80
trust                          4.28   0.23                 4.25     4.25
website quality as a whole     3.87   0.12                 3.89     3.91

as the table above shows, according to the users' viewpoints, the highest mean quality score for the gsmhosting website was related to the reputation component (4.47 ± 0.21) and the lowest mean was for the apparent features component (2.44 ± 0.34). the mean score of the quality of the gsmhosting website as a whole was 3.87 ± 0.12 from the subjects' viewpoint.

6. inferential findings

the viewpoints of the mobile phone repair technicians in birjand, south khorasan province (iran), who use this website are compared by gender in table 5.

table 5 comparison of the mean scores of the surveyed people regarding the quality of the gsmhosting website as a whole and its components by gender

variable                       gender   mean   standard deviation   t      df   p
routing                        male     3.56   0.33                 1.24   98   0.22
                               female   3.44   0.22
information                    male     3.93   0.18                 0.38   98   0.70
                               female   3.95   0.15
delivery                       male     4.21   0.25                 1.12   98   0.27
                               female   4.12   0.17
apparent features              male     2.45   0.34                 1.03   98   0.02
                               female   2.34   0.30
security                       male     3.58   0.29                 1.06   98   0.29
                               female   3.48   0.34
reputation                     male     4.47   0.22                 0.35   98   0.73
                               female   4.49   0.10
society                        male     4.21   0.34                 1.23   98   0.22
                               female   4.34   0.26
entertainment                  male     4.09   0.36                 1.04   98   0.30
                               female   4.21   0.31
provided goods and services    male     3.77   0.31                 0.18   98   0.86
                               female   3.79   0.35
reliability                    male     3.81   0.34                 1.55   98   0.12
                               female   3.98   0.26
trust                          male     4.27   0.23                 0.31   98   0.76
                               female   4.30   0.22
website quality as a whole     male     3.87   0.12                 0.12   98   0.91
                               female   3.87   0.08

the result of the independent t-test in the above table shows that the mean score of the surveyed subjects' viewpoint regarding the quality of the gsmhosting website as a whole and its components was not significantly different by gender (p > 0.05). however, the mean score of apparent features was significantly different (p = 0.02) in the studied subjects. to determine which difference was significant for these groups, tukey's pairing post-hoc test was used.
the result showed that the mean score of apparent features of females was significantly higher than that of men (p < 0.05).

table 6 comparison of the mean scores of the studied people viewpoints regarding the quality of the gsmhosting website as a whole and its components by education level

variable                       level of education                 mean   standard deviation   f      df      p
routing                        middle school degree and diploma   3.55   0.33                 0.03   2, 97   0.03
                               associate degree                   3.53   0.34
                               bachelor's degree and higher       3.56   0.51
information                    middle school degree and diploma   3.93   0.18                 0.01   2, 97   0.99
                               associate degree                   3.93   0.17
                               bachelor's degree and higher       3.93   0.19
delivery                       middle school degree and diploma   4.16   0.22                 2.48   2, 97   0.09
                               associate degree                   4.32   0.18
                               bachelor's degree and higher       4.19   0.27
apparent features              middle school degree and diploma   2.39   0.34                 1.56   2, 97   0.22
                               associate degree                   2.38   0.36
                               bachelor's degree and higher       2.51   0.32
security                       middle school degree and diploma   3.61   0.25                 1.01   2, 97   0.37
                               associate degree                   3.57   0.31
                               bachelor's degree and higher       3.52   0.32
reputation                     middle school degree and diploma   4.49   0.14                 0.30   2, 97   0.74
                               associate degree                   4.45   0.14
                               bachelor's degree and higher       4.46   0.28
society                        middle school degree and diploma   4.18   0.40                 0.76   2, 97   0.47
                               associate degree                   4.25   0.16
                               bachelor's degree and higher       1.26   0.31
entertainment                  middle school degree and diploma   4.05   0.44                 1.08   2, 97   0.34
                               associate degree                   4.20   0.25
                               bachelor's degree and higher       4.13   0.30
provided goods and services    middle school degree and diploma   3.77   0.30                 1.31   2, 97   0.27
                               associate degree                   3.89   0.29
                               bachelor's degree and higher       3.74   0.34
reliability                    middle school degree and diploma   3.82   0.37                 0.40   2, 97   0.67
                               associate degree                   3.91   0.27
                               bachelor's degree and higher       3.82   0.33
trust                          middle school degree and diploma   4.26   0.25                 0.62   2, 97   0.54
                               associate degree                   4.33   0.18
                               bachelor's degree and higher       4.27   0.23
website quality as a whole     middle school degree and diploma   3.86   0.11                 1.03   2, 97   0.36
                               associate degree                   3.91   0.10
                               bachelor's degree and higher       3.87   0.12

the result of the one-way anova test in the table above shows that the mean score of the views of the study subjects regarding the quality of the gsmhosting website as a whole and its components was not significantly different based on education level (p > 0.05). however, the mean routing score of the studied subjects was significantly different in terms of education degree (p = 0.03). to determine which difference was significant for each of the groups mentioned, tukey's post-hoc test was used, which shows that the mean routing score of those with a middle school degree or diploma was significantly higher than that of those with a bachelor's degree or higher (p < 0.05).

7. discussion and conclusion

hypothesis 1: the quality of a website depends on gender

cell phone repair technicians, also called support specialists, are primarily concerned with repairing and troubleshooting parts when they break down or malfunction. since most mobile phone repair technicians are men, the population of female technicians is much smaller, and the interest and ability of men in technical work is said to be higher, most of this group, and therefore most users of the website, are men (in this research, the number of women who use the site is about one-eighth that of men). except for the apparent features component, this study found no significant difference between men and women in the other components. the site's appearance was less detailed from the women's point of view, and the men's view of the site's features was more superficial than the women's. so in this website, more attention should be paid to appearance features based on men's views.
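the gender and education comparisons in tables 5 and 6 can be cross-checked from the reported means and standard deviations alone. the sketch below is an illustration under stated assumptions, not the authors' spss analysis: it assumes equal variances for the t statistic, takes the group sizes 89 and 11 from the descriptive findings, and infers education group sizes of 40, 15 and 45 from the reported percentages of the 100 respondents. the function names `pooled_t` and `one_way_f` are hypothetical.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t for two independent groups from summary statistics.

    Assumes equal population variances (pooled standard deviation).
    Returns the t statistic and its degrees of freedom.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_err, df


def one_way_f(groups):
    """One-way ANOVA F statistic from (mean, sd, n) triples, one per group.

    Returns F and the (between, within) degrees of freedom.
    """
    k = len(groups)
    n_total = sum(n for _, _, n in groups)
    grand_mean = sum(mean * n for mean, _, n in groups) / n_total
    ss_between = sum(n * (mean - grand_mean) ** 2 for mean, _, n in groups)
    ss_within = sum((n - 1) * sd ** 2 for _, sd, n in groups)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, (df_between, df_within)


# Table 5, apparent features: male (n = 89) versus female (n = 11).
t_apparent, df_t = pooled_t(2.45, 0.34, 89, 2.34, 0.30, 11)

# Table 6, delivery: group sizes 40/15/45 are an assumption (see lead-in).
f_delivery, df_f = one_way_f([(4.16, 0.22, 40), (4.32, 0.18, 15), (4.19, 0.27, 45)])

print(round(t_apparent, 2), df_t)
print(round(f_delivery, 2), df_f)
```

for these inputs the recomputed statistics land close to the tabulated t for apparent features and f for delivery; small gaps are expected because the published means and standard deviations are rounded. the sketch stops at the test statistics and does not compute p-values, which would require the t and f distribution functions.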
hypothesis 2: the quality of a website depends on age

there is no significant difference between the age groups in any component, so the site does not require changes in the investigated components.

hypothesis 3: the quality of a website depends on income

since the site's customers do not fall into a single income bracket, the site must meet the requirements of all of them by offering both low-priced and high-priced products. since high-income customers are fewer than the rest, the site should focus on offering cheaper products as well as seasonal incentives and discounts.

hypothesis 4: the quality of a website depends on education

customers with various degrees purchase from the site, so in practice education does not have much impact on the components of the web site. in fact, most of the site's customers are mobile phone repair technicians with at least a diploma who have learned their technical knowledge from companies in special education and vocational classes. however, the routing score of those with an associate degree or higher differs from that of those with a diploma or middle school degree, so the site must be modified to enhance the routing for all of its clients regardless of literacy level.

conclusion

websites play a key role in disseminating information. the most important goal of any website is to provide information that meets the expectations of users. website evaluation methods are used to identify and control the content produced on websites more accurately. evaluating websites based on principles and standards and using valid scientific tools seems to be essential. each website evaluation method has its own characteristics that can be used in evaluative research according to the conditions and objectives of the evaluation. users go to websites to meet their needs. when a website is of good quality, it is easy for users to use.
this will attract the user and increase the time spent on the site, which ultimately encourages the user to use the website as well as to buy from it. the higher the quality of the website, the easier and more useful it is to use; as a result, the user's desire to stay on and use the site increases, people gain trust, and the probability of buying increases. in today's marketing, retaining existing customers costs less than attracting new ones; therefore, all actions should be planned to improve the quality of website performance, to satisfy the user and turn him into a customer. if the user interface of a website is neat and carefully designed, it can not only make the user a customer but also make him a regular and loyal one. many people who search for a problem on the web do not even know enough about the features of their question or the answer they are looking for. they hope to find the answers they want by going to different websites. this costs many people a lot of time and money. the gsmhosting website plays an important role in communicating with and providing services to users in the field of mobile, electronic and smart device repairs. through this website, people can share their knowledge, information and experiences related to products and services with other people. most mobile phone repair technicians use this website as a reference, so it has great potential and can become one of the largest markets in the world in the field of mobile and smart devices. according to the results of this study, users' satisfaction with the site is high, but there are shortcomings in the appearance and routing of the site. the acceptability of these components should also be considered; a more careful review of them, together with general changes, can increase site performance.
the administrators of this website are advised, in addition to making changes in the appearance of the website, to create pages on social networks, which would enable interaction and communication among potential and past users. appearance is one of the most important factors in user satisfaction. even a website that offers the best products will not be very successful in satisfying users if it cannot display good site graphics and quality images of its products. the appearance of a website introduces its brand and helps users understand its features better; with attractive appearance features, users will trust the website more. the mean score of the subjects' views on the quality of this website as a whole and its components did not differ significantly in terms of gender, age, income, education level, number of households and marital status. however, the mean score of apparent features differed significantly by gender: the mean score of females was significantly higher than that of males in this study. the routing score of people with a middle school degree or diploma was significantly higher than that of people with a bachelor's degree or higher. comparison of the results of this research with those of other studies on website quality shows that in most cases the results are consistent [58-63]. none of the marketing and advertising strategies will be effective if users are not satisfied with the quality of the website. brands that do not try to satisfy their users may disappear very soon. for this reason, many webmasters try to check the quality of their sub-websites at different times. in fact, users are the focus of a business and the reason a successful website exists. low-quality website support and customer service create more costs for the webmaster in the long run.
users are the focus of any business, and they trust websites that provide up-to-date information and keep up with the latest technologies. when users are satisfied with the quality of a website, they are more likely to return to that site. to strengthen the public image of the website in cyberspace, the rules related to website ranking should be observed, so that users can reach the website link among the first search engine results simply by entering words related to mobile phone repair, without needing the exact website address. according to the research findings, it is recommended that the administrators of the website promote it according to the scores earned in each section and identify its weaknesses. this study suggests that future researchers pay more attention to partnership factors as well as factors related to the technology acceptance model and fuzzy-based website quality assessment methods [64-66]. also, applying other models for evaluating the quality of website services can test the reliability of the results of this research. the researcher faced some limitations due to the small sample size; the results would probably be more accurate with a larger sample. the present study was conducted on the quality of the gsmhosting.com [67] website among mobile phone technicians in birjand in south khorasan province in iran, so the results should be generalized to other mobile phone technicians with caution. more importantly, this research considered mobile phone repair technicians in birjand, eastern iran; thus, it is not possible to generalize to the whole community of mobile phone repair technicians in iran.
another limitation of this study was the lack of similar research on the quality of the website from the perspective of mobile phone repair technicians to compare the results. it is recommended to evaluate the quality of other websites related to the field of mobile phones such as gsmlover.com, howardforums.com, cent.com, cellphoneforums.net, androidcentral.com, imore.com, xda-developers.com in future research to provide a comprehensive and complete review and comparison in this regard. references [1] https://www.statista.com/markets/413/e-commerce/; https://ecommerceguide.com/ecommerce-statistics/ [2] l. mich, "evaluating website quality by addressing quality gaps: a modular process", in proceedings of the ieee international conference on software science, technology and engineering, 2014, pp. 42–49. [3] h. baghban, r. toudar, z. daliry and a. nasermalavani "assessment of islamic azad university website and position of marvdasht islamic azad university website in this system", info. commun. technol. edu. sci., vol. 1, no. 4, pp. 59–79, 2011. [4] i. f. aguillo, j. l. ortega and m. fernandez "webometric ranking of world universities. introduction methodology and future developments", high edu. europe, vol. 33, no. 2–3, pp. 233–244, sept. 2008. [5] l. mich and r. baggio "evaluating facebook pages for small hotels a systematic approach", info. technol. tour, vol. 15, pp. 209–231, aug. 2015. [6] m. j. metzger and a. j. flanagin, "credibility and trust of information in online environments: the use of cognitive heuristics", j. pragmat., vol. 59, pp. 210–220, dec. 2013. [7] web analytics made easy statcounter global stats, (https://gs.statcounter.com/os-market-share) [8] s. k. m. ho, tqm, an integrated approach: implementing total quality through japanese 5-s and iso 9000, london: kogan page, 1995. [9] l. mich, m. franch, and g. cilione, "the 2qcv3q quality model for the analysis of web site requirements", j. web eng., vol. 2, no. 1, pp. 105–127, sept. 2003. [10] a. 
parasuraman, "assessing and improving service performance for maximum impact: insights from two-decade-long research journey", perform. meas. metr., vol. 5, no. 2, pp. 45–52, aug. 2004. [11] r. chinomona, g. masinge and m. sandada, "the influence of e-service quality on customer perceived value, customer satisfaction and loyalty in south africa", mediterr. j. soc. sci., vol. 5, no. 9, pp. 331–341, may 2014. [12] m. b. khan, and k. f. khawaja, "the relationship of e-crm, customer satisfaction and customer loyalty. the moderating role of anxiety", middle-east j. sci. res., vol. 16, no. 4, pp. 531–535, jan. 2013. [13] v. a. zeithaml, a. parasuraman and a. malhotra, "service quality delivery through web sites: a critical review of extant knowledge". j. acad. mark. sci., vol. 30, no. 4, pp. 362–375, oct. 2002. [14] h. baber, "e-servqual and its impact on the performance of islamic banks in malaysia from the customer’s perspective", j. asian finance, economics and business (jafeb), vol. 6, no. 1, pp. 169–175, feb. 2019. [15] c. f. chen, "investigating structural relationships between service quality, perceived value, satisfaction, and behavioral intentions for air passengers: evidence from taiwan", transp. res. a: policy pract., vol. 42, no. 4, pp. 709–717, may 2008. [16] j. h. lee, h. d. kim, y. j. ko and m. sagas, "the influence of service quality on satisfaction and intention: a gender segmentation strategy", sport manage. rev., vol. 14, no. 1, pp. 54–63, feb. 2011. [17] j. iwaarden, t. wiele, l. ball and r. millen, "perceptions about the quality of web sites: a survey amongst students at northeastern university and erasmus university", inf. manag., vol. 41, no. 8, pp. 947–959, nov. 2004. [18] h. kang and g.
bradley, "measuring the performance of it services: an assessment of servqual", int. j. account. inf. syst., vol. 3, no. 3, pp. 151–164, oct. 2002. [19] a. parasuraman, v. a. zeithaml and a. malhotra, "e-s-qual. a multiple-item scale for assessing electronice service quality", j. serv. res., vol. 7, no. 3, pp. 213–233, feb. 2005. [20] z. yang, s. cai, z. zhou and n. zhou, "development and validation of an instrument to measure user perceived service quality of information presenting web portals", inf. manag., vol. 42, no. 4, pp. 575–589, may 2005. [21] h. h. chen, s. c. chen and c. c. yang, "the impact of service quality and relationship quality on customer loyalty in e-tourism", in proceedings of the international conference on business and information, kitakyusu, japan, 2010, pp. 5-7. [22] n. asgari, m. h. ahmadi, m. shamlou, a. r. farokhi and m. farzin, "studying the impact of e-service quality on e-loyalty of customers in the area of e-banking services", j. mgmt. & sustainability, vol. 4, no. 2, pp. 126–133, may 2014. [23] m. m. jeon and m. jeong, "customers’ perceived website service quality and its effects on e-loyalty", int. j. contemp. hosp. manag., vol. 29, no. 1, pp. 438–457, jan. 2017. [24] s. j. barnes and r. t. vidgen, "data triangulation in action: using comment analysis to refine web quality metrics", in proceedings of the 13th european conference on information systems, information systems in a rapidly changing economy, ecis 2005, regensburg, germany, 2005, pp. 1–12. [25] d. fink and c. nyaga, "evaluating web site quality: the value of a multi paradigm approach", benchmarking: int. j., vol. 16, no. 2, pp. 259–273, april 2009. [26] h. w. webb and l. a. webb, "sitequal: an integrated measure of web site quality", j. enterp. inf. manag., vol. 17, no. 6, pp. 430–440, dec. 2004. [27] r. davidson and c. 
joan, "determining the existence of electronic service quality gaps in the australian wine industry", school of commerce, research paper series: 05-02, jan. 2005 [28] a. golovkova, j. eklof, a. malova and o. podkorytova, "customer satisfaction index and financial performance: a european cross country study", int. j. bank mark., vol. 37, no. 2, pp. 479–491, feb. 2019. [29] s. asadpoor and a. abolfazli, "effect of electronic service quality on customer satisfaction and loyalty saderat bank’s customers", int. j. scientific study, vol. 5, no. 4, pp. 407–411, july 2017. [30] l. hasan and e. abuelrub, "assessing the quality of web sites", appl. comput. inform., vol. 9, no. 1, pp. 11–29, jan. 2011. [31] zf. yang, y. shi, b. wang and h. yan, "website quality and profitability evaluation in ecommerce firms using two-stage dea model", procedia comput. sci., vol. 30, pp. 4–13, 2014. [32] s. s. robbins and a. c. stylianou, "global corporate web sites: an empirical investigation of content and design", inf. manag., vol. 40, no. 3, pp. 205–210, jan. 2003. [33] mcinerney, c. & bird, n., ibid. [34] g. m. hill, m. tucker and j. hannon, "an evaluation of secondary school physical education websites", phys. educator, vol. 67, no. 3, pp. 114–127, july 2010. [35] v. s. carlos and r. g. rodrigues, "web site quality evaluation in higher education institutions", proc. technol., vol. 5, pp. 273–282, 2012. [36] d. b. sugak "rankings of a university's web sites on the internet", sci. tech. inf. process., vol. 38, no. 1, pp. 17–19, may 2011. [37] s. ebadi "the study used the website of the central library of the three universities, the students and librarians" [thesis], school of educational sciences and psychology, ferdowsi university of mashhad; 2007. [in persian]. [38] f. osareh and z. papi "quality assessment of library website of iranian state universities: some suggestions for quality improvement", information sciences & technology, vol. 23, no. 4, pp. 35–70, 2008. [in persian]. 
[39] m. nowkarizi and t. abedini "usability evaluation of the central library websites of the universities dominated by the ministry of science, research and technology", library and information research journal, vol. 2, no. 1, pp. 153–174, 2012. [in persian]. [40] a. hamdipour "assessment study of library website of iranian universities of medical sciences and suggestions for improvement", health inf. manage, vol. 8, no. 2, pp. 176–188, 2011. [in persian]. [41] r. asgari and a. shabani "an evaluation of the electronic reference services of the websites of university libraries in the central region of iran", national studies on librarianship and information organization, vol. 23, no. 3, pp. 6–20, 2012. [in persian]. [42] s. mohamadesmaeil and f. movahedi, "comparative evaluation of websites of u.s. national library of medicine and iranian national library of medicine", health inf. manage, vol. 10, no. 3, pp. 1–11, 2013. [in persian]. [43] n. ahmadi and h. r. hosseyni, "evaluation of web pages of central libraries of iranian public universities via integration of webqual model", j. appl. environ. biol. sci., vol. 4, no. 5, pp. 81–89, april 2014. [44] n. parvin, a. farahani, n. parvin and s. ebrahim hesari, "quality assessment website of the ministry of sport and youth using analytical hierarchy process", comm. manage. sports med., vol. 4, no. 4, pp. 63–71, aug. 2017. [45] r. ansari and r. khajouei, "evaluating the quality of websites of research institutes affiliated to iranian universities of medical sciences", health inf. manage, vol. 13, no. 5, pp. 320–325, dec. 2016. [46] n. karkin and m. janssen "evaluating websites from a public value perspective: a review of turkish local government websites", int. j. inf. manage., vol. 34, no. 3, pp. 351–363, june 2014. [47] n. shahrabifarahani, m. shekofteh, m. kazerani and z.
emami, "an evaluation of persian diabetes websites based on webmedqual (2016)", iranian j. endocrinol. metab., vol. 20, no. 3, pp. 142–150, sept. 2018. [48] f. fathi "quality assessment of selected sport federations websites", appl. res. sport manage., vol. 1, no. 4, pp. 17–36, dec. 2012. [49] t. kriemadis, c. terzoudis and n. kartakoullis, "internet marketing in football clubs: a comparison between english and greek websites", soccer soc., vol. 11, no. 3, pp. 291–307, april 2010. [50] m. e. gonzalez, g. quesada, j. davis and c. mora-monge, "application of quality management tools in the evaluation of websites: the case of sports organizations", qual. manag. j., vol. 22, no. 1, pp. 30–46, nov. 2017. [51] d. di fatta, r. musotto and w. vesperi, "analyzing e-commerce websites: a quali quantitative approach for the user perceived web quality (upwq)", int. j. mark. stud., vol. 8, no. 6, pp. 33–44, nov. 2016. [52] d. pamučar, ž. stević and e. k. zavadskas, "integration of interval rough ahp and interval rough mabac methods for evaluating university web pages", appl. soft. comput., vol. 67, pp. 141–163, june 2018. [53] s. b. choi and j. m. kim, "multimedia mobile application e-commerce service satisfaction", multimed. tools appl., vol. 78, no. 5, pp. 5217–5231, june 2017. [54] a. singh and a. prasher, "measuring healthcare service quality from patients’ perspective: using fuzzy ahp application", total qual. manag. bus. excell., vol. 30, no. 3–4, pp. 284–300, march 2017. [55] b. efe, "website evaluation using interval type-2 fuzzy-number-based topsis approach" in multicriteria decision-making models for website evaluation, 2019, pp.166–185, igi global. [56] m. s. rosenbaum, "meet the cyberscape", mark. intell. plan., vol. 23, no. 7, pp. 636–647, dec. 2005. [57] a. d. ghahnavieh, "the impact of the quality characteristics of websites on the intention internet purchasing", yazd university, department of economics, management and accounting, master thesis, 2015. [58] r. 
van der merwe and j. bekker, "a framework and methodology for evaluating e-commerce website", internet res., vol. 13, no. 5, pp. 330–341, dec. 2003. [59] h. medyawati and a. mabruri, "website quality: case study on local government bank and state own bank in bekasi city", procedia – soc. behav. sci., vol. 65, no. 3, pp. 1086–1091, dec. 2012. [60] k. m. griffiths and h. christensen, "quality of web based information on treatment of depression: cross sectional survey", bmj, vol. 321, no. 7275, pp. 1511–1515, dec. 2000. [61] t. l. lissman and j. k. boehnlein, "a critical review of internet information about depression", psychiatr. serv., vol. 52, no. 8, pp. 1046–1052, aug. 2001. [62] n. j. reavley and a. f. jorm, "the quality of mental disorder information websites: a review", patient educ. couns., vol. 85, no. 2, pp. e16–e25, nov. 2011. [63] r. zahedi, b. taheri, l. shahrzadi, m. tazhibi and h. ashrafi-rizi, "quality of persian addiction websites: a survey based on silberg, discern and wqet instruments (2011)", acta inform. med., vol. 21, no. 1, pp. 4–7, march 2013. 216 h. kardanmoghaddam, m. masoumi [64] r. rekik and i. kallel, "fuzz-web: a methodology based on fuzzy logic for assessing web sites", int. j. comput. inf. syst. ind. manag. appl., vol. 5, pp. 126–136, dec. 2013. [65] g. kabir and m. a. akhtar hasin, "comparative analysis of topsis and fuzzy topsis for the evaluation of travel website service quality", int. j. qual. res., vol. 6, no. 3, pp. 169–185, sept. 2012. [66] d. k. kardaras, b. karakoatas and x. j. mamakou, "content presentation personalization and media adaptation in tourism web sites using fuzzy delphi method and fuzzy cognitive maps", expert syst. appl., vol. 40, no. 6, pp. 2331–2342, may 2013. [67] http://websites.milonic.com/gsmhosting.com/ http://websites.milonic.com/gsmhosting.com/ facta universitatis series: electronics and energetics vol. 34, no 4, december 2021, pp. 
605-630 https://doi.org/10.2298/fuee2104605a © 2021 by university of niš, serbia | creative commons license: cc by-nc-nd original scientific paper effects of connecting a scattered solar generation unit to the grid on the cloud passage using optimization algorithms ali aljbori, mahdi zarif department of electrical engineering, mashhad branch, islamic azad university, mashhad, iran abstract. today, the limitation of fossil fuel resources and related issues such as the possible depletion of fossil energy reserves, global warming, environmental pollution, price instability, and the growing energy needs of industrial and urban centers have prompted the international community to seek appropriate alternatives. examples include nuclear energy, solar energy, geothermal energy, wind energy, and ocean waves. renewable energy is attractive owing to the simplicity of the applied technology compared to nuclear energy technologies. moreover, such energies play a key role in new energy systems worldwide without raising problems similar to nuclear waste. the increasing use of renewable energies has nevertheless given rise to significant complications. one of the main operational issues in this regard is the uncertainty of electricity generation by solar power plants, which is caused by the passage of clouds. the present study aimed to investigate the effects of cloud passage on the production of solar power plants. initially, a control system was designed to control a high-penetration solar power plant in the network, and the maximum allowable percentage of penetration was calculated for different loads. for this purpose, three algorithms (de, pso, and ica) were used to determine the maximum power point (mpp) of the solar arrays, including under shaded conditions. according to the results, the imperialist competitive algorithm (ica) was faster compared to the other algorithms.
key words: photovoltaics, solar cells, maximum power point tracking, particle swarm optimization, differential evolution, imperialist competitive algorithm received april 4, 2021; received in revised form june 25, 2021 corresponding author: ali aljbori department of electrical engineering, mashhad branch, islamic azad university, mashhad, iran e-mail: aljbori.a1989@gmail.com 1. introduction today, the limitation of fossil fuel resources and other factors such as the possible depletion of fossil energy reserves, global warming, environmental pollution, price instability, and the growing energy needs of industrial and urban centers have prompted the international community to seek appropriate alternatives. examples include nuclear energy, solar energy, geothermal energy, wind energy, and ocean waves. on the other hand, such energies play a key role in new energy systems worldwide. despite the increasing demand for electricity and the growing need for high-quality and reliable electricity, the lack of responsive production, distribution, and transmission infrastructures in large electricity networks has in some cases led to the further development of scattered (distributed) energy resources [1]. the use of distributed generation resources to supply part of the load increases the reliability of the power system, provided that the distributed generation sources are properly placed. furthermore, losses could be decreased and voltage profiles could be improved, which ultimately leads to increased energy efficiency [2]. solar energy is considered to be the most viable option among the various scattered production sources given the problems associated with air pollution, as well as the abundance of high-power sunlight. a solar power plant is cost-efficient and able to cover a large portion of an area load (thereby reducing air pollution) when it is large-scale in terms of energy production.
such a plant supplies a significant share of the load of a feeder and is therefore known as a high-penetration scattered source. the use of high-penetration solar power plants in the distribution network has numerous advantages and several technical disadvantages. ohmic voltage drop in distribution networks is an important issue, and tap-changing transformers are used for its compensation. distribution lines are mostly radial and designed for power to flow in one direction. by inserting a high-penetration solar power plant into parts of the feeder, the flow direction is reversed, thereby reducing the current sent by the distribution substation and also causing a significant decline in the voltage drop across the distribution network. if on-load tap changers, which can change the transformer tap under load, are not used in the feeder, the voltage at the point of common coupling (pcc) will be higher than usual. the issue becomes more acute when the consumed load changes during the day. since the highest amount of electricity is generated by the solar power plant during the time with the lowest load consumption, the penetration of the power plant is maximized in this period. therefore, the manual tap changers currently used in distribution networks cannot be used for voltage regulation. the effects of cloud passage may be highly destructive to the voltage and power balance in a distribution network. if the generation capacity of the solar power plant is comparable to that of the main power plant (steam, gas), the disturbance of the grid power balance could become problematic due to the instantaneous reduction of the generated power in the solar power plant. this occurs because a steam or gas power plant can compensate for the instantaneous reduction in production only at its limited ramp rate, which may in turn cause a power shortage in large parts of the network. in addition, the emergence of voltage fluctuations in the network could lead to customer dissatisfaction or the inefficient operation of network equipment.
to date, several studies have evaluated the connection of the photovoltaic system to the grid. "integrated autonomous voltage regulation and islanding detection for high penetration pv applications": this paper proposes an autonomous unified var controller to address the system voltage issues and unintentional islanding problems associated with distributed photovoltaic (pv) generation systems. the proposed controller features the integration of both voltage regulation (vr) and islanding detection (id) functions in a pv inverter based on reactive power control [2]. "a novel approach for ramp-rate control of solar pv using energy storage to mitigate output fluctuations caused by cloud passing": this paper proposes a strategy where the ramp-rate of the pv panel output is used to control the pv inverter ramp-rate to a desired level by deploying energy storage (which can be available for other purposes, such as storing surplus power, countering voltage rise, etc.) [3]. "a study of dispersed photovoltaic generation on the pso system": results of a study on dispersed photovoltaic (pv) generation on the public service company of oklahoma (pso) system with simulated dispersed pv generation are presented [4]. "influence of photovoltaic power generation on required capacity for load frequency control": the authors developed a mathematical model to evaluate the impact of small (rooftop) photovoltaic (pv) power-generating stations on economic and performance factors for a larger-scale power system, and applied this model to the tokyo metropolitan area [5]. in all of these papers, data are limited regarding the problems that could occur within the network; examples are power swings, increased/decreased voltage profiles, failure of protection devices, the cloud effect, power plant harmonics, and network frequency regulation [3-12].
the necessity of building solar power plants and connecting them to the grid, together with the unforeseen issues that occur with the introduction of these power plants to the grid, has motivated the current research. the present study aimed to investigate the effects of connecting a scattered solar generation unit to the grid during cloud passage and to determine the feeder load changes for a fixed consumer from an operational perspective. 2. materials and methods to date, several studies have evaluated the connection of the photovoltaic system to the network and various penetration rates of photovoltaic systems in the electricity network. however, data are limited regarding the problems that could occur within the network; examples are power swings, increased/decreased voltage profiles, failure of protection devices, the cloud effect, power plant harmonics, and network frequency regulation [3-12]. in this study, we assessed the effects of connecting a high-penetration solar power plant to the distribution network in terms of voltage and the changes in the feeder load for a constant consumer from an operational perspective. when the power plant was connected to the distribution network, which had transformers with manual tap changers that cannot be operated under load, the voltage increased at the end of the line and at the point where the power plant was connected to the network. the ansi standards allow 4% overvoltage for distribution networks. given that the impedance of distribution lines is higher than the standard value in some cases, it is paramount to investigate the effects of the overvoltage. on the other hand, the feeder is mainly powered by the solar power plant, and it is not an economical option to incorporate large amounts of energy storage in high-power plants. as a result, when a cloud passes and casts a shadow on the solar array panels, their instantaneous power decreases significantly.
the power reduction disturbs the power balance of the distribution network and the operation of the power plant, and also changes the voltage. in this study, we also evaluated the effects of cloud transit by initially defining a photovoltaic system and a solar panel model. 2.1. photovoltaic system [13-16] photovoltaics (pv) refers to a solar power generation system in which solar cells are used for the direct production of electricity from solar radiation. these solar cells are semiconductor devices, commonly made of silicon. when sunlight shines on a photovoltaic cell, a potential difference occurs between the negative and positive electrodes, causing a current to flow between them. pv is classified as a renewable energy technology, and a photovoltaic system consists of several components and subsystems, including the photovoltaic generator with its mechanical mounting, the battery (energy storage subsystem), control equipment, monitoring and measurement devices, and a backup generator. 2.2. solar panel modeling [17] the physical structure of a solar cell is similar to that of a diode whose p-n junction is exposed to sunlight. the energy absorbed from the light in this area leads to the production and transfer of carriers (electrons and holes) and their collection at the output terminal. a solar panel consists of several photovoltaic cells with external connections in series, parallel, or series-parallel. figure 1 shows the function of a solar cell. figure 2 depicts the equivalent circuit of a solar cell. fig. 1 solar cell function [2, 4] fig. 2 equivalent circuit of a solar cell in this study, the characteristics of the solar panel were determined based on the parameters of equations 1 to 3 as follows:
$I = I_{ph} - I_o \left( e^{V_D/(A V_T)} - 1 \right) - \frac{V + R_s I}{R_p}$ (1)
$I_{ph} = S \cdot \left( I_{sc} - \alpha (T - 25) \right)$ (2)
$P = V I = V \left( I_{ph} - I_o \left( e^{V_D/(A V_T)} - 1 \right) - \frac{V + R_s I}{R_p} \right)$ (3)
where q is the electron charge, k is the boltzmann constant, $V_T = kT/q$ is the thermal voltage, t represents the absolute cell temperature (k), a is the diode emission (ideality) coefficient, $I_o$ is the reverse saturation current, $I_{ph}$ is the photovoltaic component of the current, s is the solar irradiance (kw/m2), α is the short-circuit current temperature coefficient, $I_{sc}$ is the cell short-circuit current under standard conditions (25°c, radiation: 1 kw/m2), $V_D$ is the diode voltage, $R_s$ is the series parasitic resistance, $R_p$ is the parallel parasitic resistance, v is the solar cell terminal voltage, i is the solar cell terminal current, and p represents the solar cell output power. if the series and parallel resistances are neglected ($R_s \approx 0$ and $R_p \to \infty$) and short-circuit conditions are considered, the source current of the model is approximately equal to the short-circuit current. equation 4 was applied for the assessment of the solar cell.
$I \approx I_{sc} \left( 1 - e^{(V - V_{oc})/(A V_T)} \right)$ (4)
photovoltaic systems could be used to generate electricity in any setting with a high potential for the absorption of solar energy. due to the high cost of solar cell production compared with electricity generation from fossil fuels, photovoltaic systems are most commonly used in remote areas that are not served by the national electricity grid (e.g., villages and border areas). other applications of these systems include street lighting in cities, photovoltaic water pumping, portable solar power supplies, power supply systems for telecommunication and seismic stations, and tunnel lighting systems for mountain roads. the v-i characteristics of a solar array as a function of temperature and radiation intensity are given in equations 5-11.
$v_{SA} = \frac{N}{\lambda} \ln\left( \frac{M I_{ph} - i_{SA} + M I_o}{M I_o} \right) - \frac{N}{M} R_S\, i_{SA}$ (5)
$\Delta T = T - T_r$ (6)
$\Delta i = \alpha\, (I_{sc} - I_{scr})\, \Delta T + \left( \frac{I_{sc}}{I_{scr}} - 1 \right) I_{scr}$ (7)
$\Delta v = -\beta\, \Delta T - R_s\, \Delta i$ (8)
$v_{SA}^{new} = v_{SA} + \Delta v$ (9)
$i_{SA}^{new} = i_{SA} + \Delta i$ (10)
$P_{SA} = v_{SA}^{new}\, i_{SA}^{new}$ (11)
in these equations, t and $T_r$ are the working point temperature and the nominal temperature of the solar panel, respectively, α is the temperature coefficient of the current, β is the temperature coefficient of the voltage, λ is the solar cell coefficient listed in table 1, and $I_{sc}$ and $I_{scr}$ represent the short-circuit current at the operating point and the nominal short-circuit current of the solar panel, respectively. for a silicon solar panel (n=36, m=1), equation 5 becomes equation 12.
$v_{SA} = k \ln\left( \frac{I_{ph} - i_{SA} + I_o}{I_o} \right) - i_{SA}$ (12)
in addition, the generated current at a specific radiation level could be obtained using equation 13.
$I_{ph} = \left[ I_{scr} + \alpha (T - T_r) \right] \cdot (S/1000)$ (13)
figure 3 shows the p-i characteristic of a solar array for different radiation intensities, and figure 4 depicts the characteristic for different temperatures. these curves show that the output power of a solar array is highly nonlinear and depends largely on the radiation intensity and the ambient temperature. therefore, the maximum power point (mpp) changes with changes in the temperature and radiation intensity. to achieve the optimal operating point of the system, it is essential to use an mppt algorithm. fig. 3 the p-i characteristics of the solar arrays at a constant temperature and different radiation intensities [18-23] fig. 4 the p-i characteristics of the solar arrays at a constant radiation intensity and different temperatures [18-23] 2.3. maximum power point tracking (mppt) algorithm to use a solar array at the maximum operating point, an mppt method is required to detect the power peak.
this method obtains the voltage and current at which the solar array delivers the maximum output power. figures 5 and 6 show the v-i and p-i characteristic curves of a solar array at the radiation intensity of 1,000 w/m^2 and temperature of 25°c, respectively. furthermore, these figures illustrate the characteristics of the solar array under partial shading conditions, with five cells at the radiation intensity of 800 w/m^2 and temperature of 23°c, as well as five cells at the radiation intensity of 500 w/m^2 and temperature of 25°c. as is shown in figures 5 and 6, several peaks are present under shady conditions, so that many mppt methods do not function properly under partial shading and converge to a local optimum instead of the global mpp. to solve this issue, the mppt algorithm could be based on random optimization methods. fig. 5 v-i with and without shadow [18-23] fig. 6. p-i with and without shadow [18-23] 2.3.1. mppt-based optimization algorithm the main advantage of stochastic optimization algorithms is that they can escape from local optimal points. stochastic optimization algorithms such as pso and de could detect the global optimal point of a problem, which makes them useful in finding the maximum power point of solar arrays under shady conditions. we considered the output power of the solar array as the objective function of the optimization problem as in equation 14. figure 7 shows the structure of the studied system. in this system, the tracker delivers the maximum power of the pv module to the battery. the maximum power tracking system consists of a dc-dc converter, a pid control system, and a controller (pso mppt/de mppt). the pso mppt/de mppt unit uses environmental parameters at its input, along with equations 5-13 to determine the voltage and current of the maximum power.
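as a small worked example of how the environmental inputs enter the model, the sketch below evaluates equation 13 for the unshaded and the two shaded cell groups described above. it is only an illustration: it assumes that the short-circuit current of table 1 (4.8 a) can be used as the reference value $I_{scr}$ and that the reference temperature is 25°c; the function and variable names are hypothetical.

```python
# Photocurrent of equation 13: I_ph = [I_scr + alpha * (T - T_r)] * S / 1000.
# Assumption: I_scr is taken as the short-circuit current listed in table 1.
I_SCR = 4.8        # A, reference short-circuit current (table 1, assumed to be I_scr)
ALPHA = 0.002086   # A/degC, current-temperature coefficient (table 1)
T_REF = 25.0       # degC, reference temperature (assumed)


def photocurrent(irradiance_w_m2, temperature_c):
    """Return the photocurrent in amperes for a given irradiance and cell temperature."""
    return (I_SCR + ALPHA * (temperature_c - T_REF)) * irradiance_w_m2 / 1000.0


# Operating conditions quoted in the text for the partial-shading example.
conditions = {
    "unshaded (1000 W/m2, 25 degC)": (1000.0, 25.0),
    "shaded group 1 (800 W/m2, 23 degC)": (800.0, 23.0),
    "shaded group 2 (500 W/m2, 25 degC)": (500.0, 25.0),
}

for label, (s, t) in conditions.items():
    print(f"{label}: I_ph = {photocurrent(s, t):.3f} A")
```

the shaded groups generate a markedly lower photocurrent than the unshaded cells, which is the origin of the additional peaks in the characteristic curves.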
the inputs of the mppt method based on the stochastic optimization algorithm included t (cell temperature), s (radiation intensity), nshade (number of the cells in the shade), tshade (temperature of the cells in the shade), and sshade (radiation intensity of the cells in the shade). fig. 7. structure of studied system the maximum power tracking system could reach the operating point of maximum power (pmax) by adjusting the duty cycle of the dc-dc converter. the steady-state relation of the dc-dc boost converter is as follows:
$V_0 = \frac{V_{SA}}{1 - d}$ (14)
where d is the duty cycle, $V_0$ is the output voltage of the dc-dc boost converter, and $V_{SA}$ is the output voltage of the solar array. the optimal duty cycle was calculated using equation 14, in which $V_{SA} = V_{mp}$ and $V_0$ = 25 v (battery voltage). in the following sections, we describe the random algorithms used in the article. 2.3.2. particle swarm optimization (pso) algorithm [24-25] in recent decades, the particle swarm optimization (pso) algorithm, a stochastic search method for general optimization, was proposed by eberhart et al. based on models of simple social systems to solve nonlinear problems such as the optimal distribution of reactive power. the algorithm is highly efficient, and its characteristics have been discussed further in the literature. it is based on research on different communities (e.g., bird flocks) and on a very simple concept. therefore, the time required for the calculations is very short, and the algorithm does not require considerable memory. in addition, the algorithm was developed for nonlinear and continuous optimization problems, while it could also be used for problems with discrete variables. 2.3.3.
differential evolution (de) algorithm [26-28] differential evolution (de) is a simple population-based algorithm, which randomly searches for the optimal point of the problem. this algorithm is able to optimize nonlinear and non-differentiable objective functions. in the de algorithm, the population consists of vectors with real values, and the key advantage of the algorithm is that it yields better answers than other methods within the same time. 2.3.4. imperialist competitive algorithm (ica) [29-30] the imperialist competitive algorithm (ica) is an evolutionary computational method, which determines the optimal answers to various optimization problems. it solves mathematical optimization problems by mathematically modeling the process of social and political evolution. in terms of application, the ica is classified as an evolutionary optimization algorithm similar to the genetic algorithm (ga), pso, and ant colony optimization (aco). the main advantage of the imperialist competitive algorithm is its high speed compared to other optimization algorithms. 2.4. microgrids [31] microgrids are low-voltage electrical networks consisting of scattered energy sources such as microturbines, solar cells, wind turbines, and fuel cells. moreover, microgrids include energy storage equipment such as batteries and flywheels, as well as controllable loads. microgrids could be used both when connected to the grid and when disconnected from it, which greatly increases the reliability of the delivered energy. apart from the conversion of renewable or other energy sources into electrical energy, connecting any source (microgrid) to a distributed generation system raises several issues. one of the issues during the connection of microgrids to the network is the presence of nonlinear loads, which are related to power electronic structures and other similar loads that are used in the network or the microgrid. 3.
simulation 3.1. introduction with the expansion of industries and population growth, the need for energy is increasing each day. given the shortage of the energy generated from fossil fuels, the global community is paying more attention to renewable energy. renewable energies are the energies that are obtained from sunlight and used in two fashions; the first is the use of solar thermal energy for domestic, industrial, and power plants, and the second is the direct conversion of light from the sun into electricity using pv solar cells. the main advantages of solar cells include free and pollution-free fuel, and the disadvantages are the high initial costs and low system efficiency. each year, the cell production has higher yields and lower prices than the previous years, while low yields and relatively high prices per cell remain among the challenges of this technology. the main barriers to the use of this technology are the scientific and technical weakness in conversion due to the lack of knowledge and field experience, variable and alternating amounts of energy due to climatic and seasonal changes, and changes in the direction of radiation. to exploit the available resources, mechanical systems are required to place solar panels in the direction of uninterrupted sunlight at any time; this method is known as a solar tracking system. furthermore, an electronic system is essential to place the output of the solar panels at a suitable operating point with the maximum transmission power. however, the placement of solar panels at the point of maximum power may be problematic as in the non-linearity of the output characteristic of the solar cell and the variability of this characteristic in terms of light radiation and even cell temperature. therefore, a system should be implemented for the control of solar cells along with the placement of solar cells at the optimal working point. 
in case of a change in this point due to climatic conditions, the maximum transmission power of the system could be tracked continuously and rapidly, so that the solar cell would remain at the optimal point. this section simulates the proposed method for determining the mpp of a solar array. in the present study, we initially investigated the effects of temperature changes and radiation intensity on the mpp of solar arrays, followed by the effects of shadow on the performance of the solar arrays. in addition, the pso, de, and ica random optimization algorithms were used to determine the mpp of the solar arrays at different temperatures, radiation intensities, and atmospheric conditions. 3.2. climatic conditions in determining mppt of solar arrays in the current research, we evaluated the effects of climatic conditions on the mpp of the solar arrays. as is shown in figure 8, the solar cell was initially modeled in the matlab simulink environment. according to the findings, the p-i and v-i characteristics of the solar cells were nonlinear and dependent on the environmental conditions (temperature and radiation). in addition, the working point of the solar cells depended on the connected load. to recognize the behavior of a solar cell, an electrically equivalent model should be developed based on separate electrical components with well-established behavior. an ideal solar cell is modeled with a current source in parallel with a diode, although no solar cell is ideal in practice. in the present study, we added a shunt resistor and a series resistor to the model. table 1 shows the basic characteristics of these cells at the temperature of 25°c.
table 1 characteristics of silicon solar cell arrays at t = 25 ℃
parameter | value
current-temperature coefficient | $\alpha = 0.002086\ \mathrm{A/^{\circ}C}$
voltage-temperature coefficient | $\beta = 0.0779\ \mathrm{V/^{\circ}C}$
reverse saturation current | $I_o = 0.5 \times 10^{-4}\ \mathrm{A}$
short-circuit current | $I_{SC} = 4.8\ \mathrm{A}$
series resistance of the solar cell | $R_S = 0.0277\ \Omega$
solar cell coefficient | $\lambda = 20.41\ \mathrm{V^{-1}}$
fig. 8 solar photovoltaic modeling figures 9 and 10 show that at the temperature of 25°c and radiation intensity of 1,000 w/m2, the solar array had an mpp current of 4.55 a, a voltage of 7.59 v, and a power of 34.54 w. notably, only one mpp was observed since there were no shadow conditions in the solar array. fig. 9 curve i-v (t = 25 ℃, radiant intensity s = 1000 w/m2) https://climatescience.org/advanced-energy-solar/ fig. 10 curve p-v (t = 25 ℃, s = 1000 w/m2) figure 11 shows the effects of the changes in radiation intensity on the mpp of the solar arrays at the temperature of 25°c. as can be seen, the decreased intensity of radiation in the solar arrays led to the reduction of the maximum power of the solar arrays. figure 12 depicts the effects of temperature changes on the mpp of the solar arrays at the radiation intensity of 1,000 w/m2. as is observed, the increased temperature of the solar arrays led to the reduction of the maximum power of the solar arrays. fig. 11 curve p-i (t = 25 ℃, varied radiant intensity) fig. 12 curve p-i (s = 1000 w/m2, varied temperature) 3.3. determining mppt of solar arrays with partial shading partial shading is caused by the shadows of buildings, trees, and clouds (moving shadows). as a result of the shadow, instead of a single maximum power point, several peaks were observed in the voltage-current characteristic of the module in the present study.
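the appearance of several peaks can be reproduced with a very small model. the sketch below is an illustration under stated assumptions rather than the simulink model of the paper: it assumes a series string of three sections, each protected by an ideal bypass diode, describes each section with the simplified characteristic of equation 4 solved for the voltage, and uses the radiation levels 1000, 600, and 260 w/m2. the short-circuit current is taken from table 1, whereas the open-circuit voltage and the product $A V_T$ of each section are hypothetical values chosen only for illustration.

```python
import math

I_SC_STC = 4.8   # A, short-circuit current at 1000 W/m2 (table 1)
V_OC = 7.0       # V, open-circuit voltage of one section (hypothetical value)
A_VT = 0.4       # V, product A * V_T of one section (hypothetical value)
IRRADIANCE = (1000.0, 600.0, 260.0)  # W/m2, one value per section


def section_voltage(current, irradiance):
    """Equation 4 solved for V; an ideal bypass diode clamps the section at 0 V."""
    x = 1.0 - current / (I_SC_STC * irradiance / 1000.0)
    if x <= 0.0:
        return 0.0  # the section cannot carry this current and is bypassed
    return max(V_OC + A_VT * math.log(x), 0.0)


def string_power(current):
    """Output power of the series string at a given string current."""
    return current * sum(section_voltage(current, s) for s in IRRADIANCE)


def local_maxima(step=0.001):
    """Scan the P-I curve on a uniform grid and return its local maxima."""
    n = int(round(I_SC_STC / step))
    power = [string_power(k * step) for k in range(n + 1)]
    return [(k * step, power[k]) for k in range(1, n)
            if power[k - 1] < power[k] > power[k + 1]]


for i_peak, p_peak in local_maxima():
    print(f"local peak at I = {i_peak:.2f} A, P = {p_peak:.1f} W")
```

with these assumed values the scan finds three local maxima, one per radiation level, and the global maximum lies on the middle peak, so a method that only climbs the nearest hill may settle on a lower peak.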
with regard to the moving shadow conditions, the photovoltaic system module is typically divided into three sections (figure 13). in the current research, radiation and facade data were collected in each section and used with a sample photovoltaic system in matlab software to calculate the short-circuit current and open-circuit voltage of the module section. the data were fed to the random algorithms to calculate the voltage and current of the section at its maximum power point. with the changes in the radiation and temperature in section i, the values of isc and voc also changed. therefore, the appropriate function was automatically adjusted to find the new value of the maximum power point of this section. it was assumed that the radiation s and the temperature conditions of the section remained unchanged while the maximum power point of the same section was searched by a random algorithm. after obtaining the maximum power point values of each independent section, the maximum power point of the entire module of the photovoltaic system was determined from the instantaneous mean of the maximum power points obtained from each module section. the process was repeated in case of any changes in the radiation of each section of the module. fig. 13 solar array module with different radiation intensities in this section, the mpp of the solar arrays was initially determined under shadow conditions. for this purpose, two scenarios were tested, the details of which are presented in table 2. as is depicted in figures 14-17, several local optima were created in the solar array under shadow conditions, making it impossible to search for the mpp of the solar array using conventional methods.
table 2 two scenarios under shadow conditions (radiant intensity)
section | first case (w/m2) | second case (w/m2)
cell1 | 1000 | 1000
cell2 | 600 | 300
cell3 | 260 | 200
fig. 14 curve i-v (t=25℃; first case) fig. 15 curve p-v (t=25℃; first case) fig.
16 curve i-v (t=25℃; second case) fig. 17 curve p-v (t=25℃; second case) 3.4. determining mppt using random algorithms in the studied system, the maximum power point tracker connected the module of the photovoltaic system to the battery. the maximum power point tracker consisted of a dc-dc boost converter and a control system (maximum power point tracking by random algorithms). in general, the maximum power point tracking unit based on random algorithms uses environmental parameters at its inputs to determine the current and voltage of the maximum power through the model equations. in the present study, the inputs of the maximum power point tracking unit were the cell temperature, the solar radiation, and the number, temperature, and radiation of the cells in the shade. the maximum power point tracker adjusted the operating point of the solar array for the maximum power by adjusting the duty cycle of the boost converter. to determine the maximum power point of the solar arrays, two modes (cases) were defined for the optimization algorithms, with three scenarios in each case. the defined modes are shown in tables 3 and 4. in these cases, the number of the cells in the shade, their temperature, and their radiation intensity differed in each scenario.
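the duty cycle adjustment mentioned above follows directly from equation 14. as a brief worked example, the sketch below solves equation 14 for d, using the mpp voltage of 7.59 v reported in section 3.2 for 25°c and 1,000 w/m2 and the battery voltage of 25 v given in section 2.3.1. the function name is hypothetical, and an ideal, lossless boost converter in continuous conduction is assumed.

```python
def boost_duty_cycle(v_array, v_out):
    """Duty cycle of an ideal boost converter from equation 14: V0 = V_SA / (1 - d)."""
    if not 0.0 < v_array <= v_out:
        raise ValueError("a boost converter requires 0 < v_array <= v_out")
    return 1.0 - v_array / v_out


V_MP = 7.59        # V, array voltage at the maximum power point (section 3.2)
V_BATTERY = 25.0   # V, battery voltage at the converter output (section 2.3.1)

d = boost_duty_cycle(V_MP, V_BATTERY)
print(f"duty cycle d = {d:.4f}")
print(f"check: V_SA / (1 - d) = {V_MP / (1.0 - d):.2f} V")
```

for these values the duty cycle is close to 0.70; a lower mpp voltage under shading requires a larger duty cycle to hold the same output voltage.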
table 3 basic conditions of solar arrays in assessment of random algorithms in first case
condition | scenario 1 | scenario 2 | scenario 3
number of cells in non-shady conditions | 40 | 30 | 20
cell temperature in non-shady conditions (℃) | 25 | 25 | 25
intensity of cell radiation in non-shady conditions (w/m2) | 900 | 900 | 900
number of cells in shade | 0 | 10 | 20
cell temperature in shade (℃) | 0 | 20 | 20
intensity of cell radiation in shade (w/m2) | 0 | 500 | 500
table 4 basic conditions of solar arrays in assessment of random algorithms in second case
condition | scenario 1 | scenario 2 | scenario 3
number of cells in non-shady conditions | 50 | 40 | 35
cell temperature in non-shady conditions (℃) | 30 | 30 | 30
intensity of cell radiation in non-shady conditions (w/m2) | 750 | 750 | 750
number of cells in shade | 0 | 10 | 15
cell temperature in shade (℃) | 0 | 25 | 25
intensity of cell radiation in shade (w/m2) | 0 | 600 | 600
tables 5-7 show the basic parameters of the pso, de, and ica algorithms. in all these algorithms, the number of iterations, the number of control parameters (population dimension), and the initial population size were equal.
table 5 parameters of de algorithm
parameter | value
population size | 10
number of dimensions of each population | 1
number of repetitions | 50
table 6 parameters of pso algorithm
parameter | value
population size | 10
number of dimensions of each population | 1
number of repetitions | 50
w max. | 0.9
w min. | 0.3
c1 | 2.05
c2 | 2.05
table 7 parameters of ica algorithm
parameter | value
population size | 10
number of dimensions of each population | 1
number of repetitions | 50
number of empires | 2
figures 18-20 depict the speed and convergence of the pso, de, and ica algorithms in the first case, and the second case is shown in figures 21-23. as can be seen, the convergence speed of the ica algorithm was moderately higher compared to the other algorithms. notably, these algorithms are based on random numbers, and their convergence rate may change each time the program is run.
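the dependence on the random numbers can be illustrated with a minimal pso sketch that uses the settings of table 6 (10 particles, one dimension, 50 iterations, inertia weight between 0.9 and 0.3, c1 = c2 = 2.05). everything else is an assumption made for illustration and not the implementation used in the study: the inertia weight is decreased linearly, the velocity is clamped to 20% of the search range, the search variable is the string current, and the objective is a hypothetical three-section shaded string based on equation 4 with assumed section parameters.

```python
import math
import random

I_SC = 4.8                           # A, short-circuit current at 1000 W/m2 (table 1)
V_OC, A_VT = 7.0, 0.4                # V, hypothetical per-section parameters
IRRADIANCE = (1000.0, 600.0, 260.0)  # W/m2, one value per series section


def string_power(current):
    """Power of a shaded series string (equation 4 per section, ideal bypass diodes)."""
    voltage = 0.0
    for s in IRRADIANCE:
        x = 1.0 - current / (I_SC * s / 1000.0)
        if x > 0.0:
            voltage += max(V_OC + A_VT * math.log(x), 0.0)
    return current * voltage


def pso_mppt(seed, n_particles=10, n_iter=50, w_max=0.9, w_min=0.3, c1=2.05, c2=2.05):
    """One-dimensional PSO over the string current; returns (current, power, history)."""
    rng = random.Random(seed)
    lo, hi = 0.0, I_SC
    v_max = 0.2 * (hi - lo)          # velocity clamp (assumption)
    pos = [rng.uniform(lo, hi) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    pbest = pos[:]
    pbest_val = [string_power(p) for p in pos]
    g = max(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g], pbest_val[g]
    history = [gbest_val]
    for it in range(n_iter):
        w = w_max - (w_max - w_min) * it / (n_iter - 1)  # linearly decreasing inertia
        for k in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            vel[k] = w * vel[k] + c1 * r1 * (pbest[k] - pos[k]) + c2 * r2 * (gbest - pos[k])
            vel[k] = max(-v_max, min(v_max, vel[k]))
            pos[k] = max(lo, min(hi, pos[k] + vel[k]))
            value = string_power(pos[k])
            if value > pbest_val[k]:
                pbest[k], pbest_val[k] = pos[k], value
                if value > gbest_val:
                    gbest, gbest_val = pos[k], value
        history.append(gbest_val)
    return gbest, gbest_val, history


for seed in (1, 2, 3):
    i_best, p_best, _ = pso_mppt(seed)
    print(f"seed {seed}: I = {i_best:.3f} A, P = {p_best:.2f} W")
```

running the sketch with different seeds gives different convergence histories, while a fixed seed reproduces the same run, which is one way to compare algorithms on an equal footing.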
according to our findings, the speed of the training-based algorithm was higher than the other algorithms. in addition, the optimal point was obtained accurately in all the repetitions.

fig. 18 function of random algorithm in determining mppt (mode 1, scenario 1)
fig. 19 random algorithm performance in determining mppt (mode 1, scenario 2)
fig. 20 performance of random algorithm in determining mppt (mode 1, scenario 3)
fig. 21 performance of random algorithm in determining mppt (mode 2, scenario 1)
fig. 22 performance of random algorithm in determining mppt (mode 2, scenario 2)
fig. 23 performance of random algorithm in determining mppt (mode 2, scenario 3)

the accuracy and performance of the proposed method were evaluated using a photovoltaic system consisting of a solar panel, dc/dc converter, battery, and control system (mppt), which was simulated using matlab simulink software. furthermore, three stochastic optimization algorithms (pso, de, and ica) were utilized to compare the performance of the mppt. to evaluate the performance, efficiency, and accuracy, the method was compared with the vmppt, p&o, and cmppt methods, as in the previous paper on this subject.

3.5. mppt based on perturb and observe (p&o) algorithm [32-33]

the p&o method has been widely used for mppt given its convenience. in a typical p&o algorithm, the operating point voltage of the solar array is perturbed in one direction and the resulting output power is observed. if the power change is positive, the operating point of the system has moved toward the mpp, and the voltage must be perturbed again in the same direction.
if the power change is negative, the operating point has moved away from the mpp, and the voltage must be changed in the direction opposite to the previous perturbation (figure 24).

fig. 24 p&o method

3.6. mppt based on voltage (vmppt) [34-36]

in vmppt, the relationship between the output voltage of the array at the maximum power point (vsa, which is the same as vmp) and the open-circuit voltage (voc) is considered linear:

$V_{mp} = K_v V_{oc}$ (15)

in the equation, kv is a constant known as the voltage coefficient. figure 25 shows the value of kv=vmp/voc based on different temperatures and radiation conditions where nshade is equal to 10, 15, 20, and 25, respectively. based on these results, it is clear that the kv value is not fixed, and the vmppt method is inaccurate in shady conditions. the algorithm of this method is depicted in figure 26.

fig. 25 voltage coefficient calculated in different temperature and radiation intensity conditions; a) 10 shaded cells, b) 15 shaded cells, c) 20 shaded cells, d) 25 shaded cells [34-36]
fig. 26 vmppt method algorithm [34-36]

3.7. mppt based on current (cmppt) [34-36]

in cmppt, the relationship between the output current of the array at the maximum power point (isa, which is imp) and its short-circuit current (isc) is considered linear:

$I_{mp} = K_i I_{sc}$ (16)

in the equation, ki is a constant known as the current coefficient. figure 26 shows the value of ki = imp / isc based on different temperature and radiation conditions where nshade is equal to 10, 15, 20, and 25, respectively. based on these results, it is clear that the ki value is not fixed, and the cmppt method is inaccurate in shady conditions. the algorithm of this method is shown in figure 27.

fig. 26 coefficient calculated in different temperature and radiation intensity conditions; a) 10 shaded cells, b) 15 shaded cells, c) 20 shaded cells, d) 25 shaded cells [34-36]
fig. 27.
cmppt method algorithm [34-36]

3.8. mppt based on artificial neural network [37-38]

the method presented in that study is a two-stage maximum power tracking method that determines the maximum point for two modules connected in series. in the first step, the radiation and the temperature of the array are measured and the final p–i curve is found. then, a search algorithm is implemented to approximate the mpp location using the current and power at the mpp [37-38]. when the weather conditions change beyond a certain level, the search is repeated. in the second step, the search for the mpp on the actual characteristic curve starts from the point estimated in the first stage or from the previous operating point, depending on the changes in the operating conditions. in one such case, the p&o and rcc methods are used to perform the second phase of this algorithm, and the neural network is realized, correspondingly [37-38].

fig. 28 the schematic diagram of the proposed approach [37]
fig. 29 the power of the pv array in the first simulation [37]
fig. 30 the current of the pv array in the first simulation [37]

the comparison of the obtained results showed that this method has two steps and therefore is not faster than our methods, and also does not guarantee system stability.

3.9. mppt based on fuzzy logic [39-40]

the fuzzy logic controller changes the duty cycle by sensing the power output of the solar panel. the proposed controller is aimed at adjusting the duty cycle of the dc-dc converter switch to track the maximum power of a solar cell array. the inputs to the fuzzy logic system are the error (e) and the change in error (c). the output is the change in duty cycle (dd) at sampling instant k [40]. the fuzzy logic consists of the following stages: fuzzification, rule base, inference system and defuzzification.
the fuzzy variables are divided into 5 linguistic hedges: negative big (nb), negative small (ns), zero (ze), positive small (ps) and positive big (pb). the membership functions are chosen as shown in fig. 31. fig. 32 and fig. 33 show the linguistic hedges of change in error and change in duty cycle, respectively. the result is shown in fig. 34.

fig. 31 linguistic hedges of change in error [39]
fig. 32 linguistic hedges of change in error [39]
fig. 33 linguistic hedges of change in duty cycle [39]
fig. 34 output voltage of the solar panel without mppt

as can be seen, shadow conditions are not considered in this method. the comparison of the obtained results (section 3.4) with the findings of other studies (sections 3.5 to 3.9) indicated that stochastic optimization algorithms are superior to the other methods, as they are not erroneous in shady conditions.

4. conclusion

in the implementation of the network in this study, the load was considered a point load, and all the solar panels were centrally simulated. in other words, the cloud passage affected all the solar panels similarly, and the implemented network became centralized. this approach is common in industrial plants, power plants, and domestic use, although industrial use is more frequent than domestic use since residential areas often lack the necessary space to build a centralized solar power plant. in industrial areas, energy storage devices (e.g., batteries) must be used to compensate for the voltage drop applied to the system, which may in turn cause islanding and load interruption. these batteries, albeit temporarily, allow the photovoltaic system to compensate for the microgrid power shortage locally, thereby eliminating the need for the network to compensate for the power shortage. as a result, the voltage drop does not occur due to the current influx in the network.
in this study, a control system was designed to control the high-penetration solar power plants in the network. furthermore, the penetration was obtained at different loads, and the effects of cloud passage on the system were also simulated. the results of these simulations are as follows:
1. the characteristics of a solar array depend on atmospheric conditions (e.g., temperature and radiation intensity).
2. in non-shady conditions, solar arrays have only one maximum power point, while some local optimal points could be observed in shady conditions. consequently, it would be difficult to use conventional methods to determine the mpp of solar arrays.
3. stochastic optimization algorithms could be used to determine the mpp of solar arrays in shadow conditions. in this study, three algorithms (de, pso, and ica) were employed to determine the mpp of the solar arrays in shady conditions, and the speed of the ica (colonial competition) algorithm was observed to be higher compared to the other algorithms.

references
[1] s. chalmers, m. hitt, j. underhill, p. anderson, p. vogt and r. ingersoll, "the effect of photovoltaic power generation on utility operation", ieee trans. power appar. syst., vol. pas-104, no. 3, pp. 524–530, march 1985. [2] y. zhou, h. li and l. liu, "integrated autonomous voltage regulation and islanding detection for high penetration pv applications", ieee trans. power electron., vol. 28, no. 6, pp. 2826–2841, 2012. [3] m. j. e. alam, k. m. muttaqi and d. sutanto, "a novel approach for ramp-rate control of solar pv using energy storage to mitigate output fluctuations caused by cloud passing", ieee trans. energy convers., vol. 29, no. 2, pp. 507–518, march 2014. [4] w. t. jewell, r. ramakumar and s. r. hill, "a study of dispersed photovoltaic generation on the pso system", ieee trans. energy convers., vol. 3, no. 3, pp. 473–478, sept.
1988. [5] h. asano, k. yajima, y. kaya, "influence of photovoltaic power generation on required capacity for load frequency control". ieee trans. energy convers.. vol. 11, no. 1, pp. 188–193, march 1996. [6] s. a. pourmousavi, a. s. cifala and m. h. nehrir. "impact of high penetration of pv generation on frequency and voltage in a distribution feeder" in proceedings of the north american power symposium (naps), ieee, 2017, pp. 1–8. [7] m. e. baran, h. hooshyar, z. shen and a. huang, "accommodating high pv penetration on distribution feeders", ieee trans. smart grid, vol. 3, no. 2, pp. 1039–1046, june 2012. [8] guest editorial, "progress in electricmachines, power converters and their control for wave energy generation", iet electr. power appl., vol. 14, no. 5, april 2020. [9] n. patapoff and d. mattijetz, "utility interconnection experience with an operating central station mwsized photovoltaic plant", ieee trans. power appar. syst., vol. pas-104, no. 8, pp. 2020–2024, aug. 1985. [10] d. cyganski, j. orr, a. chakravorti, a. emanuel, e. gulachenski, c. root and r. c. bellemare, "current and voltage harmonic measurements at the gardner photovoltaic project", ieee trans. power deliv., vol. 4, no. 1, pp. 800–809, jan. 1989. [11] m. h. moradi and a. r. reisi, "a hybrid maximum power point tracking method for photovoltaic systems". sol. energy, vol. 85, no. 11, pp. 2965–2976, nov. 2011. [12] z.-d. zhong, h.-b. huo, x.-j. zhu, g.-y. cao, y. ren, "adaptive maximum power point tracking control of fuel cell power plants", j. power sources, vol. 176, no. 1, pp. 259–269, jan. 2008. [13] photovoltaics and distributed generation, www.fsec.ucf.edu [14] f. antony, c. durschner, k. h. remmers, photovoltaic for professionals: solar electric systems marking, design and installation. routledge, 2007. [15] european photovoltaic industry association, market report 2011. 2012. [16] p. pereira da silva, g. dantas, g. ivan pereira, l. câmara, n. j. 
de castro, "photovoltaic distributed generation – an international review on diffusion, support policies, and electricity sector regulatory adaptation", renew. sust. energy rev., vol. 103, pp. 30–39, april 2019. [17] p. chaudhary and m. rizwan "energy management supporting high penetration of solar photovoltaic generation for the smart grid using solar forecasts and pumped hydro storage system", renew. energy, vol. 118, pp. 928–946, april 2018. [18] a. mohapatra, b. nayak, p. das and k. barada-mohanty, "a review on mppt techniques of pv system under partial shading condition", renew. sust. energy rev., vol. 80, pp. 854–867, dec. 2017.
[19] j. gosumbonggot and g. fujita, "partial shading detection and global maximum power point tracking algorithm for photovoltaic with the variation of irradiation and temperature", energies, vol. 12, no. 2, pp. 1–22. jan. 2019. [20] m. premkumar and r. sowmya, "an effective maximum power point tracker for partially shaded solar photovoltaic systems", energy rep., vol. 5, pp. 1445–1462, nov. 2019. [21] j. qi, y. zhang and y. chen," modeling and maximum power point tracking (mppt) method for pv array under partial shade conditions", renew. energy, vol. 66, pp. 337–345, june 2014. [22] b. liu, k. li, d. d. niu, y. a. jin and y. liu, "the characteristic analysis of the solar energy photovoltaic power generation system", in proceedings of the 5th global conference on materials science and engineering, iop conf. series: materials science and engineering, 2017, vol. 164, p. 012018. [23] s. dubey, j. narotam-sarvaiya and b. seshadri, "temperature dependent photovoltaic (pv) efficiency and its effect on pv production in the world – a review", energy procedia, vol. 33, pp. 311–321, 2013. [24] x.-s. yang, nature-inspired optimization algorithms. science direct, 2014. [25] b. seixas gomes de almeida and v. c. leite. "particle swarm optimization: a powerful technique for solving engineering problems" in swarm intelligence recent advances, new perspectives and applications. intechopen, 2019. [26] p. rocca, g. oliveri and a. massa, "differential evolution as applied to electromagnetics", ieee antennas propag. mag., vol. 53, no. 1, pp. 38–49, feb. 2011. [27] r. storn and k. price, "differential evolution a simple and efficient heuristic for global optimization over continuous spaces". j. glob. optim., vol. 11, no. 4, pp. 341–359, dec. 1997. [28] r.
storn, "on the usage of differential evolution for function optimization", in proceedings of the biennial conference of the north american fuzzy information processing society (nafips), 1996, pp. 519–523. [29] e. atashpaz-gargari and c. lucas, "imperialist competitive algorithm: an algorithm for optimization inspired by imperialistic competition" in proceedings of the ieee congress on evolutionary computation, 2007, pp. 4661–4666. [30] s. hosseini and a. al khaled, "a survey on the imperialist competitive algorithm metaheuristic: implementation in engineering domain and directions for future research", appl. soft comput., vol. 24, pp. 1078–1094, nov. 2014. [31] a. majzoobi and a. khodaei, "application of microgrids in supporting distribution grid flexibility", ieee trans. power syst., vol. 32, no. 5, pp. 3660–3669, sept. 2017. [32] j. ahmed and z. salam, "an improved perturb and observe (p&o) maximum power point tracking (mppt) algorithm for higher efficiency", appl. energy, vol. 150, pp. 97–108, july 2015. [33] m. a. elgendy, b. zahawi and d. j. atkinson, "evaluation of perturb and observe mppt algorithm implementation techniques", in proceedings of the 6th iet international conference on power electronics, machines and drives, 2012, pp. 1–6. [34] m. a. s. masoum and m. sarvi, "voltage and current based mppt of solar arrays under variable insolation and temperature conditions" in proceedings of the 43rd international universities power engineering conference, 2008, pp. 1–5. [35] m. a. s. masoum, h. dehbonei and e. f. fuchs, "theoretical and experimental analyses of photovoltaic systems with voltageand current-based maximum power-point tracking", ieee trans. energy convers., vol. 17, no. 4, pp. 514–522, dec. 2002. [36] m. veerachary, t. senjyu and k. uezato, "voltage-based maximum power point tracking control of pv system", ieee trans. aerosp. electron. syst., vol. 38, no. 1, pp. 262–270, jan. 2002. [37] z. zandi and a. h. 
mazinan, "maximum power point tracking of the solar power plants in shadow mode through artificial neural network", complex intell. syst., vol. 5, pp. 315–330, oct. 2019. [38] syafaruddin, e. karatepe and t. hiyama, "artificial neural network-polar coordinated fuzzy controller based maximum power point tracking control under partially shaded conditions", iet renew. power gener., vol. 3, no. 2, pp. 239–253, may 2009. [39] r. mahalakshmi, a. aswin kumar and a. kumar," design of fuzzy logic based maximum power point tracking controller for solar array for cloudy weather conditions", in proceedings of the power and energy systems: towards sustainable energy (pestse), 2014, pp. 1–4. [40] m. s. cheik, c. larbes, g. f. kebir and a. zerguerras, "maximum power point tracking using a fuzzy logic control scheme", revue des energies renouvelables, vol. 10, no. 32, pp 387–395, sept. 2007.
facta universitatis series: electronics and energetics vol. 35, no 1, march 2022, pp. 43-59 https://doi.org/10.2298/fuee2201043c © 2022 by university of niš, serbia | creative commons license: cc by-nc-nd original scientific paper

first principle insight into co-doped mos2 for sensing nh3 and ch4 *

bibek chettri1, abinash thapa2, sanat kumar das1, pronita chettri1, bikash sharma2
1department of physics, sikkim manipal institute of technology, majitar, sikkim, india
2department of electronics and communication engineering, sikkim manipal institute of technology, majitar, sikkim, india

abstract. in this work we present an atomistic computational study of the adsorption of ammonia (nh3) and methane (ch4) on co doped mos2. the adsorption distance, adsorption energy (ead), charge transfer (qt), bandgap, density of states (dos), projected density of states (pdos), transport properties, sensitivity and recovery time are reported. the diffusion property of the system was calculated using the nudged elastic band (neb) method.
the calculated results depict that after suitable doping of co on mos2 monolayer decreases the resistivity of the system and makes it more suitable for application as a sensor. after adsorbing nh3 and ch4, co doped mos2 bandgap, dos and pdos become more enhanced. the adsorption energy calculated for nh3 and ch4 adsorbed co doped mos2 are -0.9 ev and -1.4 ev. the reaction is exothermic and spontaneous. the i-v curve for co doped mos2 for ch4 and nh3 adsorption shows a linear increase in current up to 1.4 v and 2 v, respectively, then a rapid decline in current after increasing a few volts. the co doped mos2 based sensor has a better relative resistance state, indicating that it can be employed as a sensor. the sensitivity for ch4 and nh3 were 124 % and 360.5 %, respectively, at 2 v. with a recovery time of 0.01s, the nh3 system is the fastest. in a high-temperature condition/environment, the co doped mos2 monolayer has the potential to adsorb nh3 and ch4 gas molecules. according to neb, ch4 gas molecules on co doped mos2 has the lowest energy barrier as compared to nh3 gas molecules. our results indicate that adsorbing nh3 and ch4 molecules in the interlayer is an effective method for producing co doped mos2 monolayers for use as spintronics sensor materials. key words: density functional theory, gas sensor, adsorption energy, tmd. received september 1, 2021; received in revised form november 17, 2021 corresponding author: bikash sharma sikkim manipal institute of technology, sikkim, india e-mail: ju.bikash@gmail.com * an earlier version of this paper was presented at the 4th international conference on 2021 devices for integrated circuit (devic 2021), may 19-20, 2021, in kalyani, west bengal, india [1]. b. chettri, a. thapa, s. k. das, p. chettri, b. sharma 44 1. introduction nh3 and ch4 are the common gases which are used for industrial and agricultural purposes [1]. 
nh3 and ch4 are colourless and tasteless gases that are difficult to identify, and they make people suffocate when its concentration is high in the air [2][3][4]. ch4 reduces the level of oxygen, resulting in headaches, dizziness, increased rate of heartbeat and causing breathlessness in human beings [5][6][7]. therefore, good and sensitive gas sensors for the detection of hazardous gases such as ammonia (nh3) and methane (ch4), is critical for both industrial and civilian purposes [8][9][10]. hence, the demand for gas sensors with high sensitivity, low power consumption and short recovery time has increased [11][12]. in recent years, two-dimensional (2d) materials such as transition metal dichalcogenides (tmds) have gained immense attention. 2d mos2, n-type semiconductor [13] with a bandgap of 1.3-1.8 ev [13][14][15] has been one of the most promising materials for the application of gas sensors due to its superior sensitivity [16][17]. mos2 monolayer has a large surface-to-volume ratio[18], tunable electrical properties [19][20] and magnetic properties [21][22]. much research has been focused on making mos2 more prominent by suitable doping [23][24]. xianxian et al. investigated the adsorption behaviour of rh-doped mos2 monolayer towards so2, sof2 and so2f2 and found the improved performance towards the adsorption of gas molecules as compared to pristine mos2 [25]. guochao et al. verified the excellent sensitivity property of au doped mos2 for sensing c2h6 and c2h2 [26]. likewise, zhen et al. confirmed the center of mos2 as the best possible site for doping fe, co, ni and cu [27]. the doping and codoping on mos2 confirmed the high sensitivity of no and no2 gas by ehab et al. [28]. not only in mos2 but doping has significantly increased the material properties of other nanomaterials [29]. chettri et al. explored the changes in the electronic and magnetic properties of h-bn after suitable doping [30]. y. wang et al. 
used dft to determine that fe doped mos2 could be used as a spintronic gas sensor to detect no gas molecules [31]. additionally, using dft, y.-h. zhang et al. discovered that transition metal-doped mos2 can be used as a spintronic gas sensor to detect co gas molecules [32]. the gas sensing properties of ptn doped wse2 nanosheet to sf6 breakdown products were explored by linga xu et al., who discovered that incorporating a pt atom greatly increases the sensing properties of wse2 nanosheet [33]. the overall results show the novelty of mos2 after suitable doping. transition metal (tm) doped mos2 is more prominent since the interaction between tm and mos2 is strong [1] and provides numerous free electrons [34][1]. due to the strong orbital hybridization between the atoms and gas molecules, increased sensitivity is observed [35]. in this paper, we explored the adsorption properties of a co doped mos2 monolayer (hereafter referred to as co-mos2). the adsorption distance, binding energy, adsorption energy and charge transfer were studied to investigate the most stable configuration of doping of co and adsorption of gas molecules on mos2 monolayer. further, the bandgap, density of states (dos) and projected density of states (pdos) were studied to understand the electronic properties. in the end, the i-v characteristics followed by sensitivity and recovery time were calculated to understand the property of the gas sensor.

2. methodology

density functional theory [36] computation was carried out in quantumatk [37]. for the exchange-correlation term, the perdew-burke-ernzerhof generalised gradient approximation (gga-pbe) functional was used [38][39] with norm-conserving troullier-martins type pseudopotentials [40]. the double-zeta polarized (dzp) basis set was used [41] with a 5×5×1 monkhorst-pack k-point grid [42]. the density of states was computed with a denser k-point grid of 15×15×1 [32].
to understand the magnetic characteristics of the system, we used spin-polarized computations for all calculations [31]. the geometry was relaxed with the limited-memory broyden-fletcher-goldfarb-shanno (lbfgs) algorithm [43] with a force tolerance of 0.05 ev/å [43]. the cut-off energy for all the calculations was 100 ha. dispersion correction was included through grimme's dft-d2 method. the pulay mixer algorithm was implemented to control the self-consistent iteration with a tolerance of 0.0002 ry and a maximum of 100 steps [44]. a 4×4×1 supercell of mos2 monolayer with 15 å vacuum space along the z-direction, containing 32 s atoms and 16 mo atoms, was considered for the calculation. the calculated lattice constant of mos2 is 3.8 å, which satisfies the theoretical value [45][46]. the mos2 monolayer was fully relaxed with stillinger-weber (sw) potentials [47]. the binding energy of co-mos2 is calculated using [48],

$$E_b = E_{\mathrm{MoS_2}} + E_{\mathrm{Co}} - E_{\mathrm{Co-MoS_2}}$$

here, $E_{\mathrm{MoS_2}}$, $E_{\mathrm{Co}}$ and $E_{\mathrm{Co-MoS_2}}$ represent the total energy of pristine mos2, isolated co and the co-mos2 monolayer, respectively. the adsorption energy of nh3 and ch4 gas molecules on co-mos2 is calculated using [48],

$$E_{ads} = E_{\mathrm{Co-MoS_2}} + E_{gas} - E_{\mathrm{Co-MoS_2-gas}}$$

here, $E_{\mathrm{Co-MoS_2}}$, $E_{gas}$ and $E_{\mathrm{Co-MoS_2-gas}}$ represent the total energy of the co-mos2 monolayer, the isolated gas molecule and the gas molecule adsorbed on the co-mos2 monolayer, respectively. the total charge transfer qt was obtained using the mulliken method [49]. qt is calculated by [49],

$$Q_t = Q_{adsorbed(gas)} - Q_{isolated(gas)}$$

where $Q_{adsorbed(gas)}$ and $Q_{isolated(gas)}$ are the charges carried by the gas molecule after and before adsorption, respectively. we used a two-probe configuration with the left electrode, right electrode, and central region to investigate the system's transport properties [50]. the non-equilibrium green's function (negf) approach, as implemented in quantumatk [51], was used to compute the transport properties of the considered structure.
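the binding energy, adsorption energy, and charge transfer defined earlier in this section are all simple differences of quantities taken from separate calculations. the sketch below only illustrates that bookkeeping with the sign conventions written above; the function names and the numerical inputs are hypothetical placeholders and are not results of this work.

```python
def binding_energy(e_mos2, e_co, e_co_mos2):
    """E_b = E_MoS2 + E_Co - E_Co-MoS2, all total energies in eV."""
    return e_mos2 + e_co - e_co_mos2


def adsorption_energy(e_co_mos2, e_gas, e_co_mos2_gas):
    """E_ads = E_Co-MoS2 + E_gas - E_Co-MoS2-gas, all total energies in eV."""
    return e_co_mos2 + e_gas - e_co_mos2_gas


def charge_transfer(q_adsorbed, q_isolated):
    """Q_t = Q_adsorbed(gas) - Q_isolated(gas), Mulliken charges in units of e."""
    return q_adsorbed - q_isolated


if __name__ == "__main__":
    # placeholder totals chosen only to exercise the formulas (not computed values)
    print(binding_energy(-100.0, -5.0, -109.5))
    print(adsorption_energy(-109.5, -20.0, -130.4))
    print(charge_transfer(8.1, 8.0))
```

keeping the three definitions as separate functions makes the sign convention explicit, which matters when values are compared with other works that define the adsorption energy with the opposite sign.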
the device supercell was sampled using a 2d fast fourier transform (fft2d) poisson solver with 1×1×150 k-points for the device simulation [41]. on the electrode faces, the dirichlet boundary condition is used, whereas, on all other faces, the boundary condition is set to periodic. for all device computations, the average fermi level is used as the energy zero parameter in the krylov self-energy calculator. the transmission is derived from the device's extended green's function as follows:

$$T(E) = \mathrm{Tr}\left[\Gamma_L(E)\, G(E)\, \Gamma_R(E)\, G^{\dagger}(E)\right]$$

the green's function is given by [52][53]

$$G(E) = \left[ES - H - \Sigma_L(E) - \Sigma_R(E)\right]^{-1}$$

the current-voltage characteristics are then obtained by integrating the transmission function over the bias window [54]:

$$I(V_a) = G_0 \int_{E_F - eV_a/2}^{E_F + eV_a/2} T(E, V_a)\, dE$$

the sensitivity of co-mos2 to the adsorbed gas molecules was obtained from the equation [55]

$$S = \left[(R - R_0)/R_0\right] \times 100\%$$

where $R_0$ and $R$ represent the resistance of co-mos2 without and with gas adsorption, respectively. in addition, to study the property of the gas sensor we estimated the recovery time. a faster recovery time indicates a better gas sensor [50]. the recovery time is calculated using the following formula [56]:

$$\tau = A^{-1} e^{E_a/(k_B T)}$$

where $A$ is the apparent factor, which is equal to $10^{12}$ s$^{-1}$, $k_B$ is the boltzmann constant ($8.62 \times 10^{-5}$ ev/k), $E_a$ is the absolute value of the adsorption energy and $T$ is the temperature [57]. the diffusion coefficient is calculated using the equation below [58][59]:

$$D = \frac{1}{6N} \lim_{t \to \infty} \frac{d}{dt} \sum_{i=1}^{N} \left\langle \left[r_i(t) - r_0(t)\right]^2 \right\rangle$$

where $r_i(t)$ denotes the position of atom i at time t, $N$ is the number of diffusing atoms in the system, and $r_0(t)$ is the initial position of atom i.

3. result and discussion

the structure of nh3 and ch4 is shown in fig. 1 (a) and (b) respectively. the central n atom in nh3 molecules bonds with three h atoms.
the bond length of n with each of the three h atoms is 1 å, with a bond angle of 108.03º. in the ch4 molecule, the central c atom bonds with four h atoms, with a bond length and bond angle of 1.09 å and 109.47º, respectively. according to the mulliken population analysis, the n and h atoms have a positive charge of 4.941e and 1.019e, respectively, in nh3, and the c and h atoms have a positive charge of 3.776e and 1.056e, respectively, in ch4. table 1 summarizes the respective values.

the most stable structure of the co-doped monolayer was identified by calculating the binding energy, the charge transfer and the interatomic distances for three positions of the co atom: on top of an s atom (st), on top of a mo atom (mot) and above the hexagonal ring (hollow) of mos2. the calculated parameters are shown in table 2. the calculated binding energy for the hollow site is 3.97 ev and the total charge transfer is -0.003e. the distances to the s and mo atoms are 1.41 å and 3.54 å, respectively. the binding energy of the mot site was calculated to be the highest, i.e., 4.54 ev, and the lowest was for the st site, i.e., -4.3 ev. the charge transfer of mot and st is 0.021e and 0.163e, respectively. on the hollow site the co atom loses electrons after doping, whereas on the mot and st sites it gains electrons. from here we can conclude that the mot site has strong binding energy and a shorter distance between s-co and mo-co atoms. the binding is relatively strong, and mot is the most favorable position for the co atom doped on the mos2 monolayer. the most stable structure, with co on top of the mo atom (mot), is shown in fig. 2 (c) and (d).

fig. 2 the structure of mos2 in (a) top view and (b) side view, and of co-mos2 in (c) top view and (d) side view

fig.
1 the structure of (a) nh3 and (b) ch4

table 1 parameters for nh3 and ch4 molecules

| molecule | bond distance | bond angle | qt (e) |
|---|---|---|---|
| ch4 | c-h: 1.09 å | c-h: 109.47º | c: 3.776; h: 1.056 |
| nh3 | n-h: 1 å | n-h: 108.03º | n: 4.941; h: 1.019 |

table 2 parameters for co-mos2

| site | eb (ev) | qt (e) | ds (å) | dmo (å) |
|---|---|---|---|---|
| st | -4.3 | 0.163 | 1.41 | 3.54 |
| mot | 4.5 | 0.021 | 1.81 | 2.58 |
| hollow | 3.9 | -0.003 | 2.26 | 3.09 |

furthermore, to investigate the effect of co on mos2, we calculated electronic properties such as the bandgap, the density of states (dos) and the projected density of states (pdos) of the pristine mos2 monolayer and the co-mos2 monolayer. the optimized structure of the mos2 monolayer is shown in fig. 2 (a) and (b). the bond length between the mo and s atoms is 2.42 å and the s-mo-s bond angle is 81.63º, which is close to the previous study [60]. the calculated bandgap of pristine mos2 is 1.68 ev, and after substitution of the co atom the bandgap reduces to 0.2 ev. the bandgap value calculated for monolayer mos2 is consistent with earlier literature [20][25][61]. the reduction in bandgap signifies an improvement in the conduction property of the material, since less energy is required for electrons to move between the valence band and the conduction band. the band structures of pristine mos2 and co-mos2 are shown in fig. 3 (a) and (b), respectively. the dos graph of pristine mos2 is shown in fig. 5 (a). the black and red lines in this diagram represent spin up and spin down, respectively. the separation between the valence band and the conduction band in the dos is very similar to that in the band structure of pristine mos2. the dos of pure mos2 is symmetric, with spin up and spin down mirroring each other. this result indicates that mos2 in its purest form is not magnetic.

fig. 3 the calculated bandgap of (a) pristine mos2, (b) co-mos2, (c) co-mos2 adsorbed ch4 and (d) co-mos2 adsorbed nh3
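the link between a spin-symmetric dos and the absence of magnetism can be made quantitative by integrating the difference between the spin-up and spin-down dos over the occupied states. the sketch below is a minimal illustration under stated assumptions: the function name, the coarse energy grid and the toy dos values are hypothetical, the fermi level is assumed to lie on a grid point, and a simple trapezoid rule is used. it is not the post-processing used in this work.

```python
def net_spin_moment(energies, dos_up, dos_down, e_fermi=0.0):
    """Trapezoid integral of (dos_up - dos_down) over energies <= e_fermi.

    With the DOS in states/eV per spin channel, the result is the net
    number of unpaired electrons (in Bohr magnetons).
    """
    moment = 0.0
    for k in range(len(energies) - 1):
        e0, e1 = energies[k], energies[k + 1]
        if e1 > e_fermi:
            break  # only occupied states contribute
        d0 = dos_up[k] - dos_down[k]
        d1 = dos_up[k + 1] - dos_down[k + 1]
        moment += 0.5 * (d0 + d1) * (e1 - e0)
    return moment


# Toy data on a coarse grid (eV); values are illustrative only.
energies = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5]
symmetric_up = [0.5, 1.0, 2.0, 1.0, 0.0, 0.0]
symmetric_down = [0.5, 1.0, 2.0, 1.0, 0.0, 0.0]
polarized_down = [0.5, 1.0, 1.0, 1.0, 0.0, 0.0]

moment_symmetric = net_spin_moment(energies, symmetric_up, symmetric_down)
moment_polarized = net_spin_moment(energies, symmetric_up, polarized_down)
print(moment_symmetric, moment_polarized)
```

a mirrored dos gives a vanishing integral, while any imbalance between the two spin channels below the fermi level gives a finite net moment.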
to better understand the electronic properties, we examined the pdos of pristine mos2, shown in fig. 6 (a). the p orbitals of s atoms dominate the upper and lower parts of the valence band. similarly, the d orbitals of mo atoms dominate the upper and lower parts of the conduction band. in both the conduction band and the valence band, there is a significant overlap of the p orbitals of the s atoms with the d orbitals of the mo atoms near the fermi level. this overlap indicates strong hybridization between the two atoms. fig. 5 (b) and 6 (b) show the dos and pdos graphs for co-mos2, respectively. the co d orbitals introduce impurity states around the fermi level, narrowing the energy bandgap to 0.2 ev. because the spin-up and spin-down states are asymmetric, the computed pdos shows that the system has become magnetic. according to the pdos, the hybridization around the fermi level in the conduction band is due to the d orbitals of mo atoms and the d orbitals of co atoms. similarly, the p orbitals of s atoms and the d orbitals of co atoms cause hybridization near the fermi level in the valence band. both the lower and upper sides of the conduction and valence bands show hybridization.

3.1. adsorption property of ch4 and nh3 on co-mos2

to investigate the most stable position for the adsorption of ch4 molecules on co-mos2, we considered three possible sites. table 4 summarizes the adsorption energy, charge transfer, and adsorption distance calculations. the c atom was placed close to the co atom (c-co), the h atom was placed close to the co atom (h-co), and both the h and c atoms were placed close to the co atom (h-c-co). the adsorption energy for h-co was calculated to be -0.2 ev, and the h-co distance was calculated to be 1.37 å. the adsorption energy for c-co was determined to be 0.1 ev. for the h-c-co position, a larger adsorption energy (in magnitude) of -1.4 ev was calculated.
at this site the gas molecule is adsorbed at a distance of 1.75 å for c-co and 1.57 å for h-co. because of the position in which ch4 was placed above co-mos2, there is a slight increase in adsorption distance when compared to the other two positions. furthermore, the charge transfer for all three positions was calculated using mulliken analysis. the total charge transfer qt was negative for all three positions, indicating that ch4 molecules act as an electron donor, transferring electrons to co-mos2. the c-co, h-co, and h-c-co sites have qt values of -0.116e, -0.17e, and -0.076e, respectively. from these results, we determined that the h-c-co site is the most stable for ch4 adsorption on the co-mos2 monolayer, and we examined this configuration in more detail. fig. 4 (c) and 4 (d) depict the most stable position. at energies of -0.17 ev, -0.18 ev, 0.19 ev, -0.2 ev, and -0.3 ev, strong peaks can be seen in the valence band, which is more populated than the conduction band. in order to better understand the role of spin density at the fermi level, we show the pdos of ch4 on co-mos2 in fig. 6 (c). the s, p, and d orbitals of the h, c, and co atoms, respectively, have high peaks around the fermi level in the upper and lower valence band. at the lower side of the conduction band, substantial hybridization of the s, p, and d orbitals of the h, c, and co atoms can be seen, indicating a large contribution to spin polarization from the ch4 molecule. our findings imply that adsorbing ch4 molecules in the interlayer is a good way to create a co-mos2 monolayer as a spintronics sensor material. when mos2 is doped with co, it becomes a spintronics-based ch4 sensor.

fig.
4 the structure of co-mos2 with adsorbed nh3 in (a) top view and (b) side view, and with adsorbed ch4 in (c) top view and (d) side view

table 3 parameters for adsorption of nh3 on co-mos2

| site | ead (ev) | qt (e) | distance (å) |
|---|---|---|---|
| h-co | -0.9 | -0.173 | h-co: 1.23 |
| n-co | -0.8 | -0.436 | n-co: 1.28 |
| h-n-co | -0.1 | -0.114 | h-co: 1.55; n-co: 1.89 |

table 4 parameters for adsorption of ch4 on co-mos2

| site | ead (ev) | qt (e) | distance (å) |
|---|---|---|---|
| c-co | 0.1 | -0.116 | c-co: 1.95 |
| h-co | -0.2 | -0.17 | h-co: 1.37 |
| h-c-co | -1.4 | -0.076 | c-co: 1.75; h-co: 1.57 |

fig. 5 density of states of (a) pristine mos2, (b) co-mos2, (c) adsorbed nh3 and (d) adsorbed ch4 on co-mos2

we adopted three types of models to investigate the most stable site for the adsorption of nh3 molecules on a co-mos2 monolayer. the h atom was placed near the co atom (h-co), the n atom was placed near the co atom (n-co), and both the n and h atoms were placed near the co atom (h-n-co). the calculated parameters of these sites are presented in table 3. the lowest adsorption energy (in magnitude) was calculated for the h-n-co site, -0.1 ev, followed by the n-co site, -0.8 ev. the highest adsorption energy was obtained for the h-co site, -0.9 ev. when the adsorption energy is negative, the adsorption process is exothermic. the adsorption distances of the h-n-co site are 1.55 å for h-co and 1.89 å for n-co. the n-co site shows a shorter adsorption distance of 1.28 å, which might be affected by the alignment of the nh3 molecule placed near co-mos2. for the h-co site, the adsorption distance is reduced to 1.23 å, the shortest among all the sites. a shorter adsorption distance signifies a stronger interaction between the gas molecule and the co-mos2 monolayer. in addition, the total charge transfer qt obtained by mulliken analysis was found to be negative for all the sites.
the negative qt indicates that nh3 acts as an electron donor and transfers electrons to co-mos2. the corresponding total charge transfers of the h-co, n-co and h-n-co sites are -0.173e, -0.436e and -0.114e, respectively. due to the shorter adsorption distance, strong adsorption energy and negative qt, the h-co site is considered one of the most stable sites for adsorption of nh3 on co-mos2. furthermore, we calculated the bandgap, dos and pdos to understand the electronic properties of the h-co site. the most stable site is shown in fig. 4 (a) and (b). the dos for co-mos2 with adsorbed nh3 is shown in fig. 5 (d). in contrast to the case without nh3, a few new states appear near the fermi level when nh3 is adsorbed on co-mos2. furthermore, the spin-up and spin-down channels cross the fermi level, giving a bandgap of 0 ev and indicating a magnetic metallic character. these features could be caused by the presence of the nh3 molecule. the pdos graph in fig. 6 (d) shows the effects of nh3 gas adsorption on co-mos2. the primary peaks of nh3 on co-mos2 are formed by the p orbitals of n atoms and are positioned at -0.3 ev and 0 ev, as seen in fig. 6 (d). the orbitals of h atoms produce states with energies of 4.9 ev, which is far from the fermi level. although the contributions of the p orbitals are close to the fermi level, their peaks are much weaker than those of the s orbitals. the d orbitals of co atoms also produce some impurity states just below the fermi level. as seen in the pdos, the d and p orbitals of the co and n atoms play a key role in enhancing the conductivity of the nh3 adsorbed system. our findings suggest that adsorbing nh3 molecules in the interlayer is a promising technique to make a co-mos2 monolayer that can be used as a spintronics sensor. mos2 becomes a spintronics-based nh3 sensor when it is doped with co.

fig.
6 projected density of states of (a) pristine mos2, (b) co-mos2, (c) adsorbed nh3 and (d) adsorbed ch4 on co-mos2

3.2. transport property of ch4 and nh3 on co-mos2

the i-v characteristic curve aids in determining the resistance state of the sensing device. fig. 7 (a) and 7 (b) show the device supercells that we used in our calculations. fig. 8 depicts the i-v characteristic curve of the co-mos2 sensor. the current in co-mos2 increases linearly up to 5.8 µa at a bias voltage of 1.4 v and decreases linearly as the bias voltage is increased further. similarly, the current increases linearly with the bias voltage in the ch4 and nh3 configurations. the greatest current value, 9.2 µa, is achieved in the nh3 configuration at a bias voltage of 2 v, as seen in the graph. in the ch4 configuration, a value of 5.8 µa is achieved at an applied bias voltage of 1.4 v. after that, the current starts to decrease for both configurations and approaches its minimum value. table 5 shows the resistance state of the co-mos2 based sensor at 2 v. the co-mos2 without the detecting gas has a high resistance state of 921 kω at 2 v. the resistance at 2 v is lower in the ch4 and nh3 configurations, i.e., 411 kω and 200 kω, respectively.

fig. 7 device supercell of (a) nh3 and (b) ch4 on co-mos2

fig. 8 i-v plot for co-mos2 monolayer for adsorption nh3 and ch4

table 5 the co-mos2 based sensor resistance state at 2 v

| device | voltage (v) | resistance (kω) |
|---|---|---|
| co-mos2 | 2 | 921 |
| co-mos2-ch4 | 2 | 411 |
| co-mos2-nh3 | 2 | 200 |

3.3. sensitivity, recovery time and diffusion barrier of ch4 and nh3 on co-mos2

it is a well-known fact that a good sensor must have excellent selectivity to detect specific gas molecules. we also computed the sensitivity of ch4 and nh3 configurations for this purpose.
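the resistances in table 5 can be turned into a relative resistance change directly. the sketch below is illustrative only: the function name is hypothetical and the choice of normalisation is an assumption. it evaluates the change relative to the pristine resistance and relative to the resistance with gas, because the two conventions give different percentages. with the table 5 values, normalising by the resistance with gas gives about 124 % for ch4 and 360.5 % for nh3, which are the figures reported in this section.

```python
R_PRISTINE = 921.0  # kOhm, Co-MoS2 at 2 V (table 5)
R_WITH_GAS = {"ch4": 411.0, "nh3": 200.0}  # kOhm at 2 V (table 5)


def relative_change(r, r0, reference):
    """Magnitude of the resistance change, as a percentage of `reference`."""
    return abs(r - r0) / reference * 100.0


for gas, r in R_WITH_GAS.items():
    vs_pristine = relative_change(r, R_PRISTINE, reference=R_PRISTINE)
    vs_with_gas = relative_change(r, R_PRISTINE, reference=r)
    print(f"{gas}: {vs_pristine:.1f} % of R0, {vs_with_gas:.1f} % of R")
```

stating which resistance is used as the reference matters when comparing sensitivities across studies, since the same pair of resistances yields very different percentages.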
we evaluated the sensitivity of the co-mos2 monolayer to the target molecules at 2 v. the ch4 adsorption sensitivity of co-mos2 was 124 %, and the sensitivity of co-mos2 to nh3 adsorption was calculated to be 360.5 %.

table 6 bandgap, sensitivity, and recovery time of ch4 and nh3 on co-mos2

| configuration | bandgap | sensitivity | recovery time |
|---|---|---|---|
| co-mos2-ch4 | -0.9 | 124% | $1.4 \times 10^{8}$ s at 350 k; $4.3 \times 10^{5}$ s at 400 k; $4.7 \times 10^{3}$ s at 450 k |
| co-mos2-nh3 | -0.8 | 360.5% | 9 s at 350 k; 0.2 s at 400 k; 0.01 s at 450 k |

aside from that, the recovery time of ch4 and nh3 on co-mos2 is examined because reusability is an important indicator for gas sensors. the desorption time for the nh3 configuration is 9 s at 350 k and 0.2 s at 400 k. at 450 k, the fastest recovery time, 0.01 s, was calculated for nh3 molecules. for ch4 molecules, the fastest recovery time is $4.7 \times 10^{3}$ s at 450 k; at 400 k and 350 k, the recovery times were $4.3 \times 10^{5}$ s and $1.4 \times 10^{8}$ s, respectively. because the ch4 system has the largest adsorption energy, its recovery is slow. according to the computed values, the recovery time decreases as the temperature rises. hence, the co-mos2 monolayer is well suited to monitoring nh3 and ch4 in high-temperature environments such as industrial furnaces. the diffusion characteristics of gas molecules in co-mos2 are crucial for evaluating response performance: a quick diffusion of gas molecules in the sensing material results in a fast response and a short recovery time of the gas sensor [62][63][64].
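the recovery times quoted above follow from the recovery-time formula given in the computational section, with the apparent factor of 10^12 s^-1, the boltzmann constant of 8.62×10^-5 ev/k, and the adsorption energies of the most stable sites (-0.9 ev for nh3, -1.4 ev for ch4). a minimal python sketch, with a hypothetical function name, reproduces them:

```python
import math

K_B = 8.62e-5           # Boltzmann constant in eV/K, as given in the text
APPARENT_FACTOR = 1e12  # apparent factor A in 1/s, as given in the text


def recovery_time(e_ads, temperature):
    """tau = A^-1 * exp(|E_ads| / (k_B * T)), in seconds."""
    return math.exp(abs(e_ads) / (K_B * temperature)) / APPARENT_FACTOR


# Adsorption energies (eV) of the most stable sites from tables 3 and 4.
for gas, e_ads in (("nh3", -0.9), ("ch4", -1.4)):
    for temperature in (350, 400, 450):
        tau = recovery_time(e_ads, temperature)
        print(f"{gas} at {temperature} K: {tau:.2e} s")
```

because the adsorption energy enters through an exponential, the 0.5 ev difference between the two molecules translates into recovery times that differ by several orders of magnitude at the same temperature.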
as a result, the energy barriers of the gases are calculated using the nudged elastic band (neb) method in quantumatk. the energy barrier of ch4 and nh3 gas molecules on co-mos2 is computed for each of the nine diffusion images. in our neb calculation, image 1 is the initial state and image 9 is the final state. the image dependent pair potential approach with a maximum distance of 1 å was employed to generate the neb images. the energy barriers for all the images are listed in table 7. the diffusion barriers for ch4 along the pathways vary from 0.01 ev to 1.69 ev, which is much lower than the range for nh3, whose diffusion barrier varies from 0.89 ev to 5.15 ev along the paths. this means that ch4 gas molecules diffuse considerably more easily than nh3 gas molecules in co-mos2. furthermore, the diffusion of the gases in the co-mos2 monolayer is not isotropic: image 1 to 9 for ch4 and image 1 to 2 for nh3 correspond to the lowest diffusion barriers, which is due to the inherent lack of electronic and structural symmetry [65].

table 7 diffusion barrier of ch4 and nh3 on co-mos2

| diffusion image | ch4 barrier (ev) | nh3 barrier (ev) |
|---|---|---|
| 1 to 2 | 0.44 | 0.89 |
| 1 to 3 | 1.05 | 2.05 |
| 1 to 4 | 1.15 | 2.68 |
| 1 to 5 | 1.69 | 2.57 |
| 1 to 6 | 1.47 | 1.89 |
| 1 to 7 | 0.9 | 0.96 |
| 1 to 8 | 0.2 | 0.99 |
| 1 to 9 | 0.01 | 5.15 |

fig. 9 diffusion barrier of (a) ch4 and (b) nh3 on co-mos2

4. conclusion

using the dft method, we investigated the adsorption distance, adsorption energy, charge transfer, bandgap, dos, pdos, transport property, sensitivity and recovery time of the nh3 and ch4 adsorbed co-mos2 monolayer. the top of the mo atom was found to be the most stable position for doping the co atom in mos2, with the highest binding energy of 4.5 ev. the doping of the co atom in mos2 drastically reduces the bandgap from 1.68 ev to 0.2 ev. this suggests the conduction property of mos2 is enhanced.
we found that the nh3 and ch4 systems have short adsorption distances. the adsorption energies of the nh3 and ch4 systems are -0.9 ev and -1.4 ev, respectively. from the charge transfer, nh3 and ch4 molecules act as electron donors and co-mos2 as an electron acceptor. after the co atom was substituted in the mos2 monolayer, a magnetic property was detected. according to the i-v characteristics computed by negf, our device shows a linear increase in the current up to 1.4 v and 2 v for the ch4 and nh3 configurations, respectively, and a variation in resistance. furthermore, the co-mos2 monolayer shows exceptional sensitivity to the adsorption of ch4 and nh3 molecules, with sensitivities of 124 % and 360.5 %, respectively. the recovery time suggests that the nh3 and ch4 systems are suitable for high-temperature applications. the fastest recovery time, 0.01 s, was obtained for nh3. according to the computed energy barriers, ch4 gas molecules diffuse more easily in co-mos2. therefore, the co-mos2 monolayer is suitable for the adsorption of nh3 and ch4 gas molecules and has high potential for industrial applications. as a result, the co-mos2 monolayer appears to be a potential candidate for use as a spintronic sensor to detect nh3 and ch4 molecules.

acknowledgement. this work was supported by all india council for technical education (aicte) govt. of india under research promotion scheme for north-east region (rps-ner) vide ref.: file no. 8-139/rifd/rps-ner/policy-1/2018-19.

references

[1] p. karki, b. chettri, a. thapa, p. chettri and b. sharma, "first principle study of mos2 adsorbed transition metal for sensing nh3 and ch4", in proceedings of the devices for integrated circuit (devic), 2021, pp. 659–661.
[2] k. wetchakun, t. samerjai, n. tamaekong, c. liewhiran, c. siriwong, v. kruefu, a. wisitsoraat, a. tuantranont and s. phanichphant, "semiconducting metal oxides as sensors for environmentally hazardous gases", sens.
actuators b: chem., vol. 160, no. 1, pp. 580–591, dec. 2011. [3] b. tian, t. huang, j. guo, h. shu, y. wang and j. dai, "gas adsorption on the pristine monolayer gep3 : a first-principles calculation", vacuum, vol. 164, pp. 181–185, june 2019. [4] r. cao, b. zhou, c. jia, x. zhang and z. jiang, "theoretical study of the no, no2, co, so2, and nh3 adsorptions on multi-diameter single-wall mos2 nanotube", j. phys. d. appl. phys., vol. 49, no. 4, p. 045106, dec. 2015. [5] t. abbasi and s.a. abbasi, "“renewable” hydrogen: prospects and challenges", renew. sustain. energy rev., vol. 15, no. 6, pp. 3034–3040, aug. 2011. [6] f. barbir, "fuel cells and hydrogen economy", chem. ind. chem. eng. q., vol. 11, no. 3, pp. 105–113, june 2005. [7] x. cheng, z. shi, n. glass, l. zhang, j. zhang, d. song, z. s. liu, h. wang and j. shen, "a review of pem hydrogen fuel cell contamination: impacts, mechanisms, and mitigation", j. power sources, vol. 165, no. 2, pp. 739–756, mar. 2007. [8] h. luo, y. cao, j. zhou, j. feng, j. cao and h. guo, "adsorption of no2, nh3 on monolayer mos2 doped with al, si, and p: a first-principles study", chem. phys. lett., vol. 643, pp. 27–33, jan. 2016. [9] d. j. late, t. doneux and m. bougouma, "single-layer mose2 based nh3 gas sensor", appl. phys. lett., vol. 105, p. 233103, dec. 2014. [10] n. yamazoe, "toward innovations of gas sensor technology", sensors actuators, b chem., vol. 108, pp. 2–14, july 2005. [11] a. zettl, "extreme oxygen sensitivity of electronic properties of carbon nanotubes", science, vol. 287, no. 5459, pp. 1801–1804, mar. 2000. [12] k. kalantar-zadeh and b. fry, nanotechnology-enabled sensors, springer, 2008. [13] z. huang, x. peng, h. yang, c. he, l. xue, g. hao, c. zhang, w. liu, x. qi and j. zhong, "the structural, electronic and magnetic properties of bi-layered mos2 with transition-metals doped in the interlayer", rsc adv., vol. 3, pp. 12939–12944, june 2013. 
[14] y. zhang, w. zeng and y. li, "the hydrothermal synthesis of 3d hierarchical porous mos2 microspheres assembled by nanosheets with excellent gas sensing properties", j. alloys compd., vol. 749, pp. 355–362, june 2018. [15] r. wang, b. a. ruzicka, n. kumar, m. z. bellus, h.y. chiu and h. zhao, "ultrafast and spatially resolved studies of charge carriers in atomically thin molybdenum disulfide", phys. rev. b condens. matter mater. phys., vol. 86, p. 045406, july 2012. [16] s. cui, z. wen, x. huang, j. chang and j. chen, "stabilizing mos2 nanosheets through sno2 nanocrystal decoration for high-performance gas sensing in air", small, vol. 11, no. 19, pp. 2305–2313, may 2015. [17] q. zhou, c. hong, y. yao, s. hussain, l. xu, q. zhang, y. gui and m. wang, "hierarchically mos2 nanospheres assembled from nanosheets for superior co gas-sensing properties", mater. res. bull., vol. 101, pp. 132–139, may 2018. [18] d. zhang, j. wu, p. li and y. cao, "room-temperature so2 gas-sensing properties based on a metal-doped mos2 nanoflower: an experimental and density functional theory investigation", j. mater. chem. a, vol. 5, pp. 20666–20677, sep. 2017. [19] d. j. late, y. k. huang, b. liu, j. acharya, s. n. shirodkar, j. luo, a. yan, d. charles, u. v. waghmare, v. p. dravid and c. n. r. rao, "sensing behavior of atomically thin-layered mos2 transistors", acs nano, vol. 7, no. 6, pp. 4879–4891, may 2013. [20] j. wang, q. zhou, l. xu, x. gao and w. zeng, "gas sensing mechanism of dissolved gases in transformer oil on ag–mos2 monolayer: a dft study", phys. e low-dimensional syst. nanostructures, vol. 118, p. 113947, apr. 2020. [21] a. m. hu, l. l. wang, w. z. xiao, g. xiao and q. y. rong, "electronic structures and magnetic properties in nonmetallic element substituted mos2 monolayer", comput. mater. sci., vol. 107, pp. 72–78, sep. 2015. [22] d. ma, w. ju, t. li, x. zhang, c. he, b. ma, y. tang, z. lu and z. 
yang, "modulating electronic, magnetic and chemical properties of mos2 monolayer sheets by substitutional doping with transition metals", appl. surf. sci., vol. 364, pp. 181–189, feb. 2016. [23] l. zhang, t. liu, t. li and s. hussain, "a study on monolayer mos2 doping at the s site via the first principle calculations", phys. e low-dimensional syst. nanostructures, vol. 94, pp. 47–52, oct. 2017. [24] h. cui, x. zhang, g. zhang and j. tang, "pd-doped mos2 monolayer: a promising candidate for dga in transformer oil based on dft method", appl. surf. sci., vol. 470, pp. 1035–1042, mar. 2019. [25] x. gui, q. zhou, s. peng, l. xu and w. zeng, "adsorption behavior of rh-doped mos2 monolayer towards so2, sof2, so2f2 based on dft study", phys. e low-dimensional syst. nanostructures, vol. 122, p. 114224, aug. 2020. [26] g. qian, q. peng, d. zou, s. wang, b. yan and q. zhou, "first-principles insight into au-doped mos2 for sensing c2h6 and c2h4," front. mater., vol. 7, p. 22., feb. 2020. [27] z. xiao, w. wu, x. wu and y. zhang, "adsorption of no2 on monolayer mos2 doped with fe, co, and ni, cu: a computational investigation", chem. phys. lett., vol. 755, p. 137768, sep. 2020. [28] e. salih and a.i. ayesh, "first principle study of transition metals codoped mos2 as a gas sensor for the detection of no and no2 gases", phys. e low-dimensional syst. nanostructures, vol. 131, p. 114736, july 2021. [29] m. w. iqbal, e. elahi, a. amin, g. hussain and s. aftab, "chemical doping of transition metal dichalcogenides (tmdcs) based field effect transistors: a review", superlattices microstruct., vol. 137, p. 106350, jan. 2020. [30] b. chettri, p. k. patra, lalmuanchhana, lalhriatzuala, s. verma, b. k. rao, m. l. verma, v. thakur, n. kumar, n. n. hieu and d.p. rai, "induced magnetic states upon electron–hole injection at b and n sites of hexagonal boron nitride bilayer: a density functional theory study", int. j. quantum chem., vol. 121, no. 16, p. e26680, aug. 2021. [31] y. wang, x. 
shang, x. wang, j. tong and j. xu, "density functional theory calculations of no molecule adsorption on monolayer mos2 doped by fe atom", mod. phys. lett. b, vol. 29, no. 27, p. 1550160, oct. 2015. [32] y. h. zhang, j. l. chen, l. j. yue, h. l. zhang and f. li, "tuning co sensing properties and magnetism of mos2 monolayer through anchoring transition metal dopants", comput. theor. chem., vol. 1104, pp. 12–17, mar. 2017. [33] l. xu, y. gui, w. li, q. li and x. chen, "gas-sensing properties of ptn-doped wse2 to sf6 decomposition products", j. ind. eng. chem., vol. 97, pp. 452–459, may 2021. [34] p. sharma, m. lepcha, b. chettri, a. thapa, p. chettri and b. sharma, "first principle study of mos2 adsorbed transition metal for sensing urea and methanol", in proceedings of the devices for integrated circuit (devic), 2021, pp. 655–658. [35] t. li, y. gui, w. zhao, c. tang and x. dong, "palladium modified mos2 monolayer for adsorption and scavenging of sf6 decomposition products: a dft study", phys. e low-dimensional syst. nanostructures, vol. 123, p. 114178, sep. 2020. [36] s. smidstrup, d. stradi, j. wellendorff, p. a. khomyakov, u. g. vej-hansen, m.-e. lee, t. ghosh, e. jónsson, h. jónsson and k. stokbro, "first-principles green’s-function method for surface calculations: a pseudopotential localized basis set approach", phys. rev. b, vol. 96, p. 195309, nov. 2017. [37] s. smidstrup, t. markussen, p. vancraeyveld, j. wellendorff, j. schneider, t. gunst, b. verstichel, d. stradi, p. a. khomyakov, u. g. vej-hansen, others, "quantumatk: an integrated platform of electronic and atomic-scale modelling tools", j. phys condens. matter., vol. 32, p. 15901, 2020. [38] j. p. perdew, k. burke and m. ernzerhof, "generalized gradient approximation made simple", phys. rev. lett., vol. 77, p. 1396, oct. 1996. [39] j. p. perdew, k. burke and m. ernzerhof, "perdew, burke, and ernzerhof reply", phys. rev. lett., vol. 80, p. 
891, jan. 1998. [40] n. troullier and j. l. martins, "efficient pseudopotentials for plane-wave calculations", phys. rev. b, vol. 43, no. 3, pp. 1993–2006, jan. 1991. [41] a. sengupta, "on the junction physics of schottky contact of (10, 10) mx2 (mos2, ws2) nanotube and (10, 10) carbon nanotube (cnt): an atomistic study", appl. phys. a mater. sci. process., vol. 123, p. 227, mar. 2017. [42] h. j. monkhorst and j. d. pack, "special points for brillouin-zone integrations", phys. rev. b, vol. 13, p. 5188, june 1976. [43] j. schneider, j. hamaekers, s. t. chill, s. smidstrup, j. bulin, r. thesen, a. blom and k. stokbro, "atkforcefield: a new generation molecular dynamics software package", model. simul. mater. sci. eng., vol. 25, p. 85007, oct. 2017. [44] p. pulay, "convergence acceleration of iterative sequences. the case of scf iteration", chem. phys. lett., vol. 73, no. 2, pp. 393–398, july 1980. [45] a. ramasubramaniam and d. naveh, "mn-doped monolayer mos2: an atomically thin dilute magnetic semiconductor", phys. rev. b condens. matter mater. phys., vol. 87, p. 195201, may 2013. [46] p. wu, n. yin, p. li, w. cheng and m. huang, "the adsorption and diffusion behavior of noble metal adatoms (pd, pt, cu, ag and au) on a mos2 monolayer: a first-principles study", phys. chem. chem. phys., vol. 19, pp. 20713–20722, aug. 2017. [47] j. w. jiang, h. s. park and t. rabczuk, "molecular dynamics simulations of single-layer molybdenum disulphide (mos2): stillinger-weber parametrization, mechanical properties, and thermal conductivity", j. appl. phys., vol. 114, p. 064307, aug. 2013. [48] y. li, x. zhang, d. chen, s. xiao and j. tang, "adsorption behavior of cof2 and cf4 gas on the mos2 monolayer doped with ni: a first-principles study", appl. surf. sci., vol. 443, pp. 274–279, june 2018. [49] y. chen, x. wang, c. shi, l. li, h. qin and j. hu, "sensing mechanism of sno2(1 1 0) surface to h2: density functional theory calculations", sensors actuators, b chem., vol. 220, pp. 
279–287, dec. 2015. [50] d. stradi, u. martinez, a. blom, m. brandbyge and k. stokbro, "general atomistic approach for modeling metal-semiconductor interfaces using density functional theory and nonequilibrium green’s function", phys. rev. b, vol. 93, p. 155302, apr. 2016. [51] m. brandbyge, j.-l. mozos, p. ordejón, j. taylor and k. stokbro, "density-functional method for nonequilibrium electron transport", phys. rev. b, vol. 65, p. 165401, mar. 2002. [52] s. datta and h. van houten, "electronic transport in mesoscopic systems", phys. today, vol. 49, no. 5, p. 70, may 1996. [53] s. datta, "nanoscale device modeling: the green’s function method", superlattices microstruct., vol. 28, no. 4, pp. 253–278, oct. 2000. [54] p. srivastava, v. sharma and n. k. jaiswal, "adsorption of cocl gas molecule on armchair boron nitride nanoribbons for nano sensor applications", microelectron. eng., vol. 146, pp. 62–67, oct. 2015. [55] j. prasongkit, v. shukla, a. grigoriev, r. ahuja, v. amornkitbamrung, "ultrahigh-sensitive gas sensors based on doped phosphorene: a first-principles investigation", appl. surf. sci., vol. 497, p. 143660, dec. 2019. [56] y. h. zhang, y. bin chen, k. g. zhou, c. h. liu, j. zeng, h. l. zhang and y. peng, "improving gas sensing properties of graphene by introducing dopants and defects: a first-principles study", nanotechnol., vol. 20, no. 18, p. 185504, apr. 2009. [57] s. peng, k. cho, p. qi and h. dai, "ab initio study of cnt no2 gas sensor", chem. phys. lett., vol. 387, pp. 271–276, apr. 2004. [58] g. henkelman, b. p. uberuaga and h. jónsson, "climbing image nudged elastic band method for finding saddle points and minimum energy paths", j. chem. phys., vol. 113, p. 9901, nov. 2000. [59] g. henkelman and h. jónsson, "improved tangent estimate in the nudged elastic band method for finding minimum energy paths and saddle points", j. chem. phys., vol. 113, p. 9978, nov. 2000. [60] j. 
wang, q. zhou, z. lu, y. gui and w. zeng, "adsorption of h2o molecule on tm (au, ag)doped-mos2 monolayer: a first-principles study", phys. e low-dimensional syst. nanostructures, vol. 113, pp. 72–78, sep. 2019. [61] s. ahmad and s. mukherjee, "a comparative study of electronic properties of bulk mos2 and its monolayer using dft technique: application of mechanical strain on mos2 monolayer", graphene, vol. 3, no. 4, pp. 52–59, oct. 2014. [62] a. ghosh and s. b. majumder, "modeling the sensing characteristics of chemi-resistive thin film semiconducting gas sensors", phys. chem. chem. phys., vol. 19, pp. 23431–23443, sep. 2017. [63] h. wang, g. gao, g. wu, h. zhao, w. qi, k. chen, w. zhang and y. li, "fast hydrogen diffusion induced by hydrogen pre-split for gasochromic based optical hydrogen sensors", int. j. hydrogen energy, vol. 44, no. 29, pp. 15665–15676, june 2019. [64] y. qiao, j. wu, x. cheng, y. pang, z. lu, x. lou, q. li, j. zhao, s. yang and y. liu, "construction of robust coupling interface between mos2 and nitrogen doped graphene for high performance sodium ion batteries", j. energy chem., vol. 48, pp. 435–442, sep. 2020. [65] s. mukherjee, a. banwait, s. grixti, n. koratkar and c. v. singh, "adsorption and diffusion of lithium and sodium on defective rhenium disulfide: a first principles study", acs appl. mater. interfaces, vol. 10, no. 6, pp. 5373–5384, jan. 2018. instruction facta universitatis series: electronics and energetics vol. 35, no 1, march 2022, pp. 
13-28 https://doi.org/10.2298/fuee2201013j © 2022 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper
planar cmos and multigate transistors based wide-band ota buffer amplifiers for heavy resistance load*
remya jayachandran1, dhanaraj kakkanattu jagalchandran2, perinkolam chidambaram subramaniam2
1department of electronics and communication engineering, nie mysore, karnataka, india
2department of electronics and communication engineering, nit calicut, kerala, india
abstract. analog buffer amplifier configurations capable of driving heavy resistive loads using different operational transconductance amplifiers (otas) are presented in this paper. the cmos ota buffer configurations are designed using the 0.18 µm scl technology library in the cadence virtuoso tool, and the multigate transistor ota buffers in the tcad sentaurus tool. the cmos ota buffer configuration using the simple ota outperforms the ota buffer circuits using other otas in terms of power dissipation and stability. measured results show that the ota buffer circuit works well for resistive loads below 100 Ω. gain tuning of up to 5 v/v is achieved with rl equal to 50 Ω and an output swing of 1 v. the ota buffer configuration implemented using multigate transistors with a resistive load below 1 kΩ exhibits a bandwidth of around 5 ghz and a tunable gain of up to 5 v/v.
key words: ota, buffer amplifier, resistive load, multigate transistor
1. introduction
modern electronic systems are predominantly made up of digital circuits. however, every signal in nature is in analog form. to interface suitably with natural signals, all electronic systems need analog circuit based submodules. the fast-growing demand for high-speed and high-frequency integrated circuits has motivated researchers to design high-performance analog circuits. many applications require an analog buffer amplifier capable of driving a heavy resistive load (i.e.
high load currents, typically due to resistive loads of less than 100 Ω) [1, 2]. in [2], ota buffer amplifier configurations with high input dynamic range, wide bandwidth, tunable gain and heavy load driving capability are proposed. the hardware implementation and testing of these cmos ota buffer configurations are presented in [1].
received august 9, 2021; received in revised form october 24, 2021
corresponding author: remya jayachandran, department of electronics and communication engineering, nie mysore (e-mail: remyajayachandran@nie.ac.in)
* an earlier version of this paper was presented at the 4th international conference on 2021 devices for integrated circuit (devic 2021), may 19-20, 2021, in kalyani, west bengal, india [1].
a cmos buffer amplifier in class-ab configuration is commonly used to drive resistive loads. buffer amplifier configurations for driving heavy resistive loads that are variants of the class-ab theme are reported in the literature [3-5]. modern electronic gadgets have high-speed processors implemented using non-planar device structures instead of conventional planar transistors, while the analog part still uses cmos designs based on planar transistors. fabricating the analog and digital circuits of the same electronic system in two different process technologies results in high cost. hence, this paper also focuses on realizing analog circuits in new process technology nodes. digital circuits fabricated using the scaled-down new transistors are reported in the literature [6-16]. compared to digital design, analog circuit design is more sensitive to the device parameters. in scaled-down devices, the nonlinearities that dominate can affect the circuit design. as the device dimensions reduce, the design complexity increases more in analog circuit design than in digital circuit design.
the digital circuits have already switched to the nano-scale regime. research on analog circuit design using scaled-down device architectures is still ongoing [7-11]. among the multigate transistors, the gate-all-around fet (gaafet) is a device with better gate control over the channel, which can be scaled down below 5 nm. another multigate device is the reconfigurable field-effect transistor (rfet), which can be configured as an n-type fet or a p-type fet by applying an appropriate bias. rfet device structures with triple gate, double gate and single gate are reported in the literature [9-12]. among these rfet devices, the single gate rfet (sg-rfet) is a simple device whose structure is similar to that of the gaafet device. analog and digital circuits designed using sg-rfet and gaafet devices can give an insight into the design possibilities in nanoscale implementation. the ota buffer amplifier configurations discussed in [2], implemented using the sg-rfet ota and gaafet ota, are presented in [7]. in this paper, a comparison of ota buffer configurations implemented using different cmos otas and using different non-planar device ota topologies is presented. the performance of the ota buffer configurations proposed in [2], implemented using different cmos configurations, is demonstrated. to the best of our knowledge, this kind of work is reported for the first time in the literature. the experimental results of the ota buffer configurations fabricated using the simple ota are compared with the theoretical results, as explained in detail in the results and discussion section. the logic circuits implemented using different rfet devices are presented in [15], in which the logical effort of the logic gates using rfet is low compared to the cmos based design. the low propagation delay due to the reduced parasitic capacitance makes this rfet device outperform the conventional transistors.
as analog circuits using rfet devices are not reported in the literature, we demonstrate the ota buffer configurations using non-planar transistor otas (sg-rfet ota and gaafet ota) to analyse the possibilities of non-planar transistors in analog circuit design. the electrical characterization of the sg-rfet and gaafet devices and the circuits based on them are presented in [7] and [8], and are used in this work for the buffer configuration implementation. the cmos ota buffer configurations are implemented in the cadence virtuoso tool in the 0.18 µm technology node, and the multigate ota buffer configurations in the 2d tcad sentaurus tool. the paper is organized as follows: section 2 presents the ota buffer amplifier architectures, cmos ota topologies and multigate transistors. section 3 covers results and discussion, followed by the conclusion in section 4.
2. implementation of ota buffer configurations
the analog buffer amplifier using an ota is shown in fig. 1 and is named buffer configuration 1 in [2]. the n-stage unity gain and tunable gain ota buffer circuits, named buffer configuration 2 and buffer configuration 3 respectively, are presented in [2]. buffer configuration 1 uses a single ota with voltage series feedback, as shown in fig. 1. the open-loop ota has high output impedance. to obtain heavy resistance load driving capability, the output impedance of the open-loop ota can be reduced by connecting the ota in a voltage series feedback configuration. the output impedance of buffer configuration 1 depends on the gm of the ota. hence, to increase the load drive capability and to obtain a gain nearer to unity, the gm of the ota can be increased, which in turn increases power dissipation.
fig. 1 buffer configuration 1
fig. 2 n-stage buffer configuration 2
fig. 3 n-stage buffer configuration 3
in n-stage buffer configuration 2, the feedback to the otas except the first stage is zero, i.e. the negative terminal of each of these otas is connected to ac ground. this results in a nonuniform differential voltage swing at the input of each ota in n-stage buffer configuration 2. figure 2 shows n-stage buffer configuration 2. the n-stage tunable gain "non-inverting type" buffer configuration 3 [2] is shown in fig. 3. the output impedance of n-stage buffer configuration 3 is low compared to buffer configuration 1 and n-stage buffer configuration 2. furthermore, the gain depends on the feedback factor β1 of n-stage buffer configuration 3. by varying the feedback voltage of the first stage (β1vout), the gain can be varied, which makes the ota buffer configuration function as a tunable gain buffer amplifier. the ota block can be selected or designed according to the area of application.
2.1 cmos ota topologies
a variety of single-ended output cmos ota topologies have been reported in the literature [17-19]. among these, four ota topologies, namely the simple ota, folded cascode ota (fc-ota), recycled folded cascode ota (rfc-ota) and nauta’s ota, are considered in the buffer design for driving a resistive load. other ota topologies have very high dc voltage gain and are not used for the ota buffer design because of their reduced bandwidth. fig. 4 (a) depicts the simple ota, where the voltage gain of the first stage is almost unity. hence the overall dc voltage gain depends on the second stage, which is not very high. fig. 4 (b) and 5 (a) represent the fc-ota and rfc-ota configurations respectively, with dc voltage gain less than 60 db. fig. 5 (b) represents nauta’s ota using inverters. the simple ota, fc-ota, rfc-ota and nauta’s ota are designed and implemented in the cadence virtuoso tool with the scl 0.18 µm library. table 1 presents the parameters of the different cmos otas implemented in the 0.18 µm technology node.
ota circuits designed using non-planar devices (multigate transistors) are discussed in the next section.
table 1 parameters of cmos ota configurations
parameters | recycled folded cascode ota (rfc-ota) | folded cascode ota (fc-ota) | simple ota | nauta's ota
technology node (nm) | 180 | 180 | 180 | 180
supply voltage (v) | ±0.9 | ±0.9 | ±0.9 | ±0.9
rl (kΩ) | 100 | 100 | 100 | 100
gain (db) | 56 | 48 | 28 | 35
gbw (mhz) | 55 | 18 | 6000 | 6500
gm (mS) | 7.5 | 6 | 5 | 5
2.2 multigate transistors
the demand for miniaturization has forced a significant downscaling in the physical size of devices. as device dimensions shrink, gate control over the channel of planar transistors becomes more difficult. new device architectures such as multi-gate devices, carbon nanotubes, tunnel fets, single-electron devices and reconfigurable fets can outperform the conventional planar transistors in terms of speed, area and power consumption [9-16]. among these devices, the reconfigurable field-effect transistor (rfet) is an emerging multigate device which can be configured as an n-type fet or a p-type fet by applying an appropriate bias to the terminals. fabrication of the rfet device is less complex compared to the existing mos transistors, as there is no need for doping. rfet device structures with triple gate, double gate and single gate are reported in the literature.
fig. 4 (a) simple ota (b) folded cascode ota (fc-ota)
fig. 5 (a) recycled folded cascode ota (rfc-ota) (b) bram nauta's ota
a simple and high-performance rfet with a single control gate (sg-rfet) is proposed in [11]. compared to the triple gate rfet and polarity gate rfet, the sg-rfet has a very simple structure. the structure of the sg-rfet device is similar to that of the multigate device gate-all-around fet (gaafet). fig. 6 (a) and (b) show the device structure of sg-rfet and gaafet.
in the gaafet, the gate material extends to surround the channel on all sides in order to attain maximum electrostatic integrity. the cylindrical type gaafet offers the lowest natural length, which allows further scaling of the device. the electrical characterization of sg-rfet and gaafet and the ota buffer circuits using the sg-rfet ota and gaafet ota are presented in [7,8]. fig. 7 (a) [7,16] depicts the id-vg characteristics of the sg-rfet device with different dimensions. tunneling current dominates when the gate voltage exceeds the threshold voltage of the device. below the threshold voltage, thermionic emission current dominates. fig. 7 (b) [7] compares the electrical characteristics of the sg-rfet device with those of the gaafet device. the characterization and mathematical analysis of the non-planar sg-rfet device structure are reported in [16]. the sub-threshold current model and surface potential model of the sg-rfet device derived in [16] show near agreement with the simulation results. the electrical characterization of the sg-rfet and the analog and digital circuits implemented using the sg-rfet device reported in [7-8] show good performance in terms of gain, bandwidth and output characteristics. the digital circuits implemented using these multigate devices are presented in [17]. to enhance the current drive of the sg-rfet, the mobility of the charge carriers is increased by using strained silicon as the channel.
fig. 6 (a) sg-rfet (b) gaafet implemented in tcad sentaurus tool
fig. 7 (a) id-vg characteristics of sg-rfet device for different device dimensions (b) id-vg characteristics of sg-rfet and gaafet device
the ota buffer circuits simulated using the strained silicon channel sg-rfet device are also discussed in [7,8]. the sg-rfet with gate length (lg) 50 nm and device length (lt) 220 nm and the gaafet with gate length 50 nm are chosen for the ota circuit implementation.
the simulation results of two-stage ota buffer configurations implemented using planar mosfets and multigate transistors are presented in the next section.
3. results and discussion
3.1. buffer configuration 1
buffer configuration 1 with a resistive load is implemented using four ota types, namely fc-ota, rfc-ota, simple ota and nauta’s ota, and the circuit performance is analysed for different resistive loads. for driving a heavy resistance load (< 1 kΩ), a large current is required to attain a good output swing, which increases power dissipation. it is observed that an increase in gm increases the load driving capability, which results in a high output swing. moreover, the transconductance of the ota should be increased to get any significant voltage gain (less than unity), particularly for a heavy load rl. table 2 shows the simulation results of buffer configuration 1 implemented using different ota configurations. buffer configuration 1 implemented using the sg-rfet ota and gaafet ota is also analysed in the sentaurus tcad tool. the inverter using the sg-rfet device is presented in [7-8,16] and compared with the gaafet inverter. the sg-rfet inverter circuit has a bandwidth of 650 mhz and a gain of 70 v/v. the sg-rfet based inverter has a lower propagation delay (20 ps for cl = 0.35 ff, vdd = 2 v) than the gaafet device because of its lower equivalent rc switching delay [8].
table 2 buffer configuration 1 using cmos otas
parameters | recycled folded cascode ota (rfc-ota) | folded cascode ota (fc-ota) | simple ota [2] | nauta's ota
technology node (nm) | 180 | 180 | 180 | 180
supply voltage (v) | ±0.9 | ±0.9 | ±0.9 | ±0.9
rl (kΩ) | 5 | 5 | 5 | 5
gain (v/v) | 0.98 | 0.97 | 0.96 | 0.97
ugb (mhz) | 48 | 11 | 5200 | 5000
gm (mS) | 7.5 | 6 | 5 | 6
the ota topology using the sg-rfet inverter circuit is used in the ota buffer configurations, and is able to drive resistive loads < 1 kΩ.
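the remark above that a heavy resistive load demands a large drive current can be made concrete with ohm's law. the following python sketch estimates the peak load current and the average load power for a sinusoidal output. it assumes that the quoted swing is a peak-to-peak value and that the load is purely resistive; the function names are illustrative and do not come from the paper.

```python
def peak_load_current(v_pp, r_load):
    """Peak current (A) needed to drive a sinusoid of v_pp volts peak-to-peak."""
    return (v_pp / 2.0) / r_load


def average_load_power(v_pp, r_load):
    """Average power (W) delivered to a resistive load by that sinusoid."""
    v_peak = v_pp / 2.0
    return v_peak ** 2 / (2.0 * r_load)


if __name__ == "__main__":
    # Assumed 1 V peak-to-peak swing; loads chosen to span light to heavy.
    for r_load in (1000.0, 100.0, 50.0):
        i_pk = peak_load_current(1.0, r_load)
        p_avg = average_load_power(1.0, r_load)
        print(f"rl = {r_load:6.0f} ohm: i_peak = {i_pk * 1e3:5.2f} mA, "
              f"p_avg = {p_avg * 1e3:6.3f} mW")
```

under these assumptions, moving from a 1 kΩ load to a 50 Ω load raises the required peak current by a factor of 20, which is why the load current and hence the power dissipation grow for heavy loads.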
table 3 presents the simulation results obtained for buffer configuration 1 using multigate transistors. due to the dense meshing at the interface regions in the device, it is difficult to simulate circuits with more components in the sentaurus tcad tool. the ota configuration used in the circuit simulation is nauta’s ota, as it contains only inverter blocks. the circuit parameters used in the tcad simulation (multigate transistors) are supply voltage vdd = 2 v, output load resistance rl = 1 kΩ, input sinusoidal signal amplitude vp−p = 1 v, and frequency 10 khz. it is observed that the gaafet ota based buffer 1 has a wider bandwidth and a gain closer to unity for a resistive load of 1 kΩ than the other ota based buffer configuration 1 circuits.
table 3 buffer configuration 1 using multigate transistors
parameters | gaafet ota [7] | sg-rfet ota [7] | strained si channel sg-rfet ota [7] | cmos nauta’s ota
supply voltage (v) | 2 | 2 | 2 | 2
rl (kΩ) | 5 | 5 | 5 | 5
gain (v/v) | 0.98 | 0.97 | 0.96 | 0.97
ugb (ghz) | 5.8 | 0.56 | 0.78 | 3.5
gm (mS) | 1 | 0.79 | 0.9 | 0.9
3.2. buffer configuration 2
table 4 presents the simulation results of two-stage buffer configuration 2 using the simple ota, fc-ota, rfc-ota and nauta’s ota. from the simulation results, it is observed that buffer configuration 2 implemented using the simple ota performs better than that using the fc-ota, rfc-ota and nauta’s ota. two-stage buffer configuration 2 circuits using non-planar transistor otas (sg-rfet ota, strained silicon sg-rfet ota and gaafet ota) are characterized in the sentaurus tcad tool. using the device module in the tcad tool, dc and ac analyses of the two-stage buffer 2 are carried out with a resistive load. the circuit parameters used in the simulation are supply voltage vdd = 2 v, output load resistance rl = 1 kΩ, input sinusoidal signal amplitude vp-p = 1 v, and frequency 1 khz.
for two-stage buffer 2 using the sg-rfet ota and gaafet ota, a peak-to-peak output amplitude of 0.99 v is obtained from the simulation, which corresponds to a gain of 0.99 v/v.
table 4 two-stage buffer configuration 2 using cmos otas
parameters | recycled folded cascode ota (rfc-ota) | folded cascode ota (fc-ota) | simple ota [2] | nauta's ota
technology node (nm) | 180 | 180 | 180 | 180
supply voltage (v) | ±0.9 | ±0.9 | ±0.9 | ±0.9
rl (kΩ) | 1 | 1 | 1 | 1
gain (v/v) | 0.99 | 0.99 | 0.99 | 0.99
ugb (mhz) | 48 | 11 | 5200 | 5000
gm (mS) | 7.5 | 6 | 5 | 6
as rl reduces, the buffer circuit’s gain reduces, as expected. table 5 presents the simulation results of the buffer 2 configuration implemented using different multigate otas. the performance of the sg-rfet ota buffer amplifier is compared with that of the gaafet ota. the gaafet ota based buffer 2 configuration exhibits a wide bandwidth of 5 ghz for rl = 1 kΩ.
table 5 two-stage buffer 2 configuration using multigate transistors
parameters | gaafet ota [7] | sg-rfet ota [7] | strained si channel sg-rfet ota [7] | cmos nauta’s ota
supply voltage (v) | 2 | 2 | 2 | 2
rl (kΩ) | 1 | 1 | 1 | 1
gain (v/v) | 0.99 | 0.99 | 0.99 | 0.99
ugb (ghz) | 5.8 | 0.56 | 0.78 | 3.5
gm (mS) | 1 | 0.79 | 0.9 | 0.9
3.3. single-stage variable gain buffer configuration
n-stage tunable gain cmos ota buffer amplifiers named buffer configuration 3 are presented in [2]. a monte carlo simulation has been carried out for two-stage buffer configuration 3 using the rfc-ota, fc-ota and simple ota to verify the robustness of the design against process mismatch. buffer configuration 3 is useful in many applications such as biomedical applications, consumer electronics, video applications and other industrial applications. figs.
8 (a)–(d), 9 (a)–(d) and 10 (a)–(d) show, for the simple ota, fc-ota and rfc-ota respectively, the distribution of the unity gain bandwidth (ugb), the bandwidth of two-stage buffer 3 with rl = 50 Ω, the gain variation of buffer 3 with rl = 50 Ω and gain = 1 v/v, and the gain variation of buffer 3 with rl = 50 Ω and gain = 5 v/v, for 300 samples (n), along with the respective mean (µ) and standard deviation (σ). it is observed from the plots that the proposed buffer 3 using the simple ota is robust even with local mismatches.
fig. 8 (a) distribution of ugb of simple ota, (b) bandwidth of two-stage buffer configuration 3 with rl = 50 Ω, (c) gain variation of buffer 3 with rl = 50 Ω, gain = 1 v/v and (d) gain variation of buffer 3 with rl = 50 Ω, gain = 5 v/v, for 300 samples (n)
fig. 8 (a) presents the distribution of ugb of the simple ota, fig. 8 (b) presents the bandwidth of two-stage buffer configuration 3 with rl = 50 Ω, fig. 8 (c) shows the gain variation of buffer 3 with rl = 50 Ω, gain = 1 v/v and fig. 8 (d) shows the gain variation of buffer 3 with rl = 50 Ω, gain = 5 v/v, for 300 samples (n). fig. 9 (a) presents the distribution of ugb of the fc-ota, fig. 9 (b) shows the bandwidth of two-stage buffer configuration 3 with rl = 50 Ω, fig. 9 (c) shows the gain variation of buffer 3 with rl = 50 Ω, gain = 1 v/v and fig. 9 (d) presents the gain variation of buffer 3 with rl = 50 Ω, gain = 5 v/v, for 300 samples (n). it is observed from the plots that the proposed buffer 3 using the simple ota is robust even with local mismatches.
fig. 9 (a) distribution of ugb of fc-ota, (b) bandwidth of two-stage buffer configuration 3 with rl = 50 Ω, (c) gain variation of buffer 3 with rl = 50 Ω, gain = 1 v/v and (d) gain variation of buffer 3 with rl = 50 Ω, gain = 5 v/v, for 300 samples (n)
fig.
10 (a) presents the distribution of ugb of the rfc-ota, fig. 10 (b) presents the bandwidth of two-stage buffer configuration 3 with rl = 50 Ω, fig. 10 (c) shows the gain variation of buffer 3 with rl = 50 Ω, gain = 1 v/v and fig. 10 (d) depicts the gain variation of buffer 3 with rl = 50 Ω, gain = 5 v/v, for 300 samples (n). a two-stage buffer configuration 3 requires four otas, including the feedback circuit. for n = 1, buffer configuration 3 reduces to the configuration shown in fig. 11, named buffer configuration 4, which can drive a resistive load and is presented in [7]. the feedback factor depends on both the gain and the output load of the proposed ota buffer configuration (buffer configuration 4). unlike in buffer configuration 3, orthogonal gain tuning with load is not possible in buffer configuration 4.
fig. 10 (a) distribution of ugb of rfc-ota, (b) bandwidth of two-stage buffer configuration 3 with rl = 50 Ω, (c) gain variation of buffer 3 with rl = 50 Ω, gain = 1 v/v and (d) gain variation of buffer 3 with rl = 50 Ω, gain = 5 v/v, for 300 samples (n)
table 6 single-stage variable gain ota buffer configuration
parameters | recycled folded cascode ota (rfc-ota) | folded cascode ota (fc-ota) | simple ota | nauta's ota
technology node (nm) | 180 | 180 | 180 | 180
supply voltage (v) | ±0.9 | ±0.9 | ±0.9 | ±0.9
rl (kΩ) | 1 | 1 | 1 | 1
gain (v/v) | 1 | 1 | 1 | 1
ugb (mhz) | 42 | 9.3 | 5000 | 4800
gm (mS) | 7.5 | 6 | 5 | 6
fig. 11 buffer configuration 4
as the feedback factor depends on both the gain and the output load, the maximum output swing that can be attained for this configuration is limited. table 6 presents the simulation results of buffer configuration 4 using the simple ota, fc-ota, rfc-ota and nauta’s ota. table 7 presents the simulation results of the multigate ota based buffer configuration 4.
it is observed that the gaafet ota buffer configuration 4 has a wide bandwidth, as the gaafet ota has a ugb of around 5.4 ghz. simple design, low fabrication complexity, reconfigurability and reduced area make the sg-rfet ota buffer outperform the gaafet ota buffer and cmos ota buffer configurations. it is observed that the gaafet ota buffer has a wide bandwidth compared to the other multigate otas presented in this paper.
table 7 single-stage variable gain ota buffer using multigate transistors
parameters | gaafet ota [7] | sg-rfet ota [7] | strained si channel sg-rfet ota [7] | cmos nauta’s ota
supply voltage (v) | 2 | 2 | 2 | 2
rl (kΩ) | 1 | 1 | 1 | 1
gain (v/v) | 1 | 1 | 1 | 1
ugb (ghz) | 5.4 | 0.38 | 0.6 | 3
gm (mS) | 1 | 0.79 | 0.9 | 0.9
3.4. experimental results
from the simulation results, it is observed that the performance of the ota buffer amplifier depends on the ota configuration used in the design. the simple ota shows better performance in terms of stability when compared with the fc-ota and rfc-ota. the simple ota configuration is therefore selected for the hardware implementation of the ota buffer configurations. the cmos ota buffer configurations, namely buffer 1, buffer 2 and buffer 3, are fabricated in a single ic at scl chandigarh. the die size is 2 mm × 2 mm. the silicon areas required for buffer configuration 1, buffer configuration 2 and buffer configuration 3 are 54 µm × 45 µm, 250 µm × 85 µm and 235 µm × 160 µm respectively. a 32-pin qfn package is used for the buffer ic. the bonding diagram with i/o pads of the buffer ic is shown in fig. 12 (a). the simplicity of this buffer configuration is that the feedback fractions can be set externally with respect to the load and gain. figure 12 (b) shows the experimental setup for testing the buffer ic. the experimental results are discussed in detail in [1]. the layout of the basic ota used in the buffer configurations for hardware implementation is shown in fig. 13.
the buffer ic contains buffer configuration 1, buffer configuration 2 (three and four stage) and buffer configuration 3 (three and four stage), together with the feedback circuits, all connected to a common supply voltage. the bandwidth of the buffer ic is limited by the i/o pads used in designing the buffer ic. the simulation results of the buffer ic with i/o pads show that the bandwidth is limited to 150 mhz. without the i/o pads, the post-layout simulation of each buffer amplifier (buffer configuration 1, buffer configuration 2 and buffer configuration 3) in the ic shows a bandwidth above 900 mhz. the parasitic components in the routing also reduce the bandwidth of the buffer amplifier configuration.
fig. 12 (a) bonding diagram of buffer ic [1] (b) experimental setup [1]
fig. 13 layout of the simple ota
the output obtained from the buffer ic is compared with the simulation results. figure 14 (a) and (b) compare the gain with respect to load variation of buffer configuration 1 and buffer configuration 2. as the load reduces, the gain reduces due to the loading effect.
fig. 14 comparison of the gain with respect to load variation of (a) buffer configuration 1 and (b) buffer configuration 2 for different load values
as the output impedance is reduced for buffer configuration 2 due to the voltage series feedback, the gain remains close to unity for an output load up to 100 Ω. the gain ($A_v$) [2] and output impedance ($Z_{out}$) [2] of the buffer 2 configuration are given as
$$A_v = \frac{1}{1 + \left(\dfrac{g_o}{g_m}\right)^{n}\left(1 + \dfrac{g_l}{g_o}\right)} \tag{1}$$
$$Z_{out} = \frac{1/g_o}{1 + \left(\dfrac{g_m}{g_o}\right)^{n}} \tag{2}$$
where $g_o$ and $g_m$ are the output conductance and transconductance of the ota in buffer configuration 2, and $g_l$ is the output load of the buffer configuration.
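as a numerical illustration of how the number of stages affects the gain and output impedance of buffer configuration 2, the python sketch below evaluates both quantities for a few values of n. it assumes the closed-loop forms a_v = 1 / (1 + (g_o/g_m)^n (1 + g_l/g_o)) and z_out = (1/g_o) / (1 + (g_m/g_o)^n), with g_l taken as the load conductance 1/rl. the function names and the example values g_m = 5 mS, g_o = 0.2 mS and rl = 100 Ω are illustrative assumptions, not measured parameters of the fabricated ic.

```python
def buffer2_gain(gm, go, gl, n):
    """Closed-loop voltage gain of the n-stage buffer (assumed form of eq. (1))."""
    return 1.0 / (1.0 + (go / gm) ** n * (1.0 + gl / go))


def buffer2_zout(gm, go, n):
    """Closed-loop output impedance in ohms (assumed form of eq. (2))."""
    return (1.0 / go) / (1.0 + (gm / go) ** n)


if __name__ == "__main__":
    gm = 5e-3          # OTA transconductance in siemens (assumed)
    go = 0.2e-3        # OTA output conductance in siemens (assumed)
    gl = 1.0 / 100.0   # load conductance for an assumed 100 ohm load
    for n in (1, 2, 3, 4):
        av = buffer2_gain(gm, go, gl, n)
        zout = buffer2_zout(gm, go, n)
        print(f"n = {n}: gain = {av:.4f} v/v, z_out = {zout:.4f} ohm")
```

for n = 1 the gain expression reduces to g_m / (g_m + g_o + g_l), the familiar result for a single ota in unity feedback, which gives a quick sanity check on the assumed form.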
as n increases, the gain increases (approaching unity) for low output load (rl). the output impedance of buffer configuration 2 reduces with an increase in n. figure 14 (a) and (b) present the variation in gain with respect to different output loads. for buffer configuration 3, setting vout/vin = $A$ gives the value of the feedback factor $\beta_1$ [2] as:
$$\beta_1 = \frac{1}{A} - \frac{g_{o1}}{g_{m1}} \tag{3}$$
where $g_{o1}$ and $g_{m1}$ are the output conductance and transconductance of the first stage ota in buffer configuration 3. using eq. (3), the feedback factor to obtain the required gain can be determined.
fig. 15 gain dependent feedback factor
the main advantage of this configuration is that gain tuning is independent of the output load. the mismatch between the experimental results and the theoretical values for low resistance loads is due to the assumption made in the theoretical analysis that the otas used in the buffer configuration are identical. in practice, there is a slight mismatch in the ota parameters due to transistor mismatch or process variations, which results in deviation of the experimental results from the theoretical and simulation results. as the load reduces, the effect of variation in the ota parameters becomes dominant, as the gm and go of each ota determine the output impedance of the buffer configurations. figure 15 shows the values of the feedback factor for different gain values calculated using the feedback factor equations derived in [2]. it is observed from fig. 14 and fig. 15 that the experimental results show near agreement with the theoretical and simulation results. figure 15 compares the experimental, simulation and theoretical values of the gain dependent feedback factor for different gains, which are close to one another. for buffer configuration 3, gain tuning of up to 5 v/v and an output swing of 1 v are achieved with rl equal to 50 Ω.
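the feedback factor needed for a target gain in buffer configuration 3 can be tabulated with a few lines of python. this is only a sketch: it assumes that the feedback factor relation has the form β1 = 1/a − g_o1/g_m1, where a is the required gain vout/vin, and the function name and the first-stage values g_m1 = 5 mS and g_o1 = 0.2 mS are illustrative assumptions rather than parameters of the fabricated ic.

```python
def feedback_factor(gain, gm1, go1):
    """Feedback factor beta_1 for a target gain (assumed form of eq. (3))."""
    if gain <= 0:
        raise ValueError("gain must be positive")
    return 1.0 / gain - go1 / gm1


if __name__ == "__main__":
    gm1 = 5e-3    # first-stage transconductance in siemens (assumed)
    go1 = 0.2e-3  # first-stage output conductance in siemens (assumed)
    for gain in (1, 2, 3, 4, 5):
        beta1 = feedback_factor(gain, gm1, go1)
        print(f"gain = {gain} v/v -> beta_1 = {beta1:.3f}")
```

under this assumed form the feedback factor does not involve the load, which is consistent with the statement that gain tuning in buffer configuration 3 is independent of the output load.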
hence, the ota buffer configurations are found useful in adcs, dacs, plls and automatic gain control circuits, where tunable gain is preferred.
4. conclusion
all-ota buffer configurations capable of driving resistive loads, implemented using different cmos ota topologies, are discussed in this paper. experimental results of cmos ota buffer configurations using the simple ota show near agreement with the theoretical values. a cmos ota buffer with tunable gain is useful in cmos ics for applications such as biomedical applications, consumer electronics, video applications and other industrial applications. ota buffer circuits using the gaafet ota, sg-rfet ota and strained silicon sg-rfet ota with resistive load are analysed in the tcad sentaurus tool. the gaafet ota buffer circuit outperforms the other multigate ota buffer circuits in terms of bandwidth. the simulation results show the feasibility of non-planar transistor circuits in analog circuit design. the ota buffer configurations using multigate transistor otas will be useful in applications such as biomedical devices, adc drivers and wireless sensor nodes.
references
[1] r. jayachandran, k. j. dhanaraj and p. c. subramaniam, "hardware realization and testing of multistage ota buffer amplifier for heavy resistive load", in proceedings of 2021 devices for integrated circuit (devic), 2021, pp. 550–554.
[2] r. jayachandran, p. c. subramaniam and k. j. dhanaraj, "a novel tunable gain cmos buffer amplifier for large resistive loads", integration, vol. 77, pp. 1–12, march 2021.
[3] y. ha, m. li and a. q. liu, "a new cmos buffer amplifier design used in low voltage mems interface circuits", analog integ. circuits signal process., vol. 27, no. 1–2, pp. 7–17, apr. 2001.
[4] k. moolpho and j. ngarmnil, "low voltage high-performance class-ab fgmos buffer", in proceedings of ieee asia pacific conference on circuits and systems, 2006, pp. 1779–1782.
[5] c. mohan and p. m.
furth, "a 16-ohm audio amplifier with 93.8-mw peak load power and 1.43-mw quiescent power consumption", ieee trans. circuits and systems ii: express briefs, vol. 59, no. 3, pp. 133–137, march 2012.
[6] x. qiu, d. chen and z. wang, "response of ring oscillator to periodic interference on the power supply", aeu-int. j. electron. commun., vol. 82, pp. 383–390, dec. 2017.
[7] j. remya et al., "high performance reconfigurable fet for a simple variable gain buffer amplifier design", int. j. electron., apr. 2021. (published online)
[8] r. jayachandran, r. s. komaragiri and p. c. subramaniam, "reconfigurable circuits based on single gate reconfigurable field-effect transistors", in proceedings of 6th ieee international conference on electronics, computing and communication technologies (conecct), 2020, pp. 1–5.
[9] v. khadem et al., "an analytical approach to model capacitance and resistance of capped carbon nanotube single electron transistor", aeu-int. j. electron. commun., vol. 90, pp. 97–102, june 2018.
[10] y.-m. lin et al., "high-performance carbon nanotube field-effect transistor with tunable polarities", ieee trans. nanotechnol., vol. 4, no. 5, pp. 481–489, sept. 2005.
[11] g. darbandy, m. claus and m. schröter, "high-performance reconfigurable si nanowire field-effect transistor based on simplified device design", ieee trans. nanotechnol., vol. 15, no. 2, pp. 289–294, march 2016.
[12] a. heinzig et al., "reconfigurable silicon nanowire transistors", nano letters, vol. 12, no. 1, pp. 119–124, nov. 2011.
[13] w. m. weber et al., "tuning the polarity of si-nanowire transistors without the use of doping", in proceedings of 8th ieee conference on nanotechnology, nano’08, 2008, pp. 580–581.
[14] f. wessely, t. krauss and u. schwalke, "cmos without doping: multi-gate silicon-nanowire field-effect-transistors", ieee j. solid-state circ., vol. 70, pp. 33–38, apr. 2012.
[15] d. sacchetto, y. leblebici and g.
de micheli, "ambipolar gate-controllable sinw fets for configurable logic circuits with improved expressive capability", ieee electron device lett., vol. 33, no. 2, pp. 143–145, feb. 2012. [16] r. ranjith et. al., "two dimensional analytical model for a reconfigurable field effect transistor", superlattice. microstruct., vol. 114, pp. 62–74, feb. 2018. [17] r. s. assaad and j. silva-martinez, "the recycling folded cascode: a general enhancement of the folded cascode amplifier", ieee j. of solid-state circ., vol. 44, no. 9, pp. 2535–2542, sept. 2009. [18] a. s. khade, v. vyas and m. sutaone, "performance enhancement of advanced recycling folded cascode operational transconductance amplifier using an unbalanced biased input stage", integration, vol. 69, pp. 242–250, nov. 2019. [19] d. binkley, b. blalock and j. rochelle, "optimizing drain current, inversion level, and channel length in analog cmos design", analog integ. circuits signal process, vol. 47, no. 2, pp. 137–163, march 2006. 10173 facta universitatis series: electronics and energetics vol. 35, no 3, september 2022, pp. 349-377 https://doi.org/10.2298/fuee2203349r © 2022 by university of niš, serbia | creative commons license: cc by-nc-nd original scientific paper combined effects of electrostatic and electromagnetic interferences of high voltage overhead power lines on aerial metallic pipeline djekidel rabah1, mohamed lahdeb1, sherif salama m. ghoneim2, djillali mahi1 1laboratory for analysis and control of energy systems and electrical systems lacosere, laghouat university (03000), algeria 2electrical engineering department, college of engineering, taif university, p.o. box 11099, taif 21944, saudi arabia abstract. the main purpose of this paper is to model and analyze the electrostatic and electromagnetic interferences between a hv overhead power line and an aerial metallic pipeline situated parallel at a close distance. 
the modelling of these interferences is typically done for safety reasons, to ensure that the induced voltage does not pose any risk to the operating and maintenance personnel or to the integrity of the pipeline. the methodologies adopted for the electrostatic and electromagnetic interferences are based, respectively, on the charge and current simulation methods combined with the teaching learning based optimization (tlbo) algorithm. the friedman test analysis indicates that the tlbo algorithm can be used for parameter optimization, as it gave the better results. where the induced currents and voltages exceed the limits authorized by the international cigre standard, mitigation measures become necessary. the simulation results were compared with those provided, respectively, by the admittance matrix analysis and carson's method, and good agreement was obtained.

key words: charge simulation method (csm), current simulation technique (cst), teaching learning based optimization (tlbo), friedman test, hv power line, aerial metallic pipelines

received november 3, 2021; revised february 25, 2022; accepted february 28, 2022
corresponding author: djekidel rabah, laboratory for analysis and control of energy systems and electrical systems lacosere, laghouat university (03000), algeria
e-mail: rabah03dz@live.fr

acronyms:
ac: alternating current
cigre: international council on large electric systems
csm: charge simulation method
cst: current simulation technique
dc: direct current
eas: evolutionary algorithms
emf: electromotive force
fdm: finite difference method
fem: finite element method
fpa: flower pollination algorithm
ga: genetic algorithm
hs: harmony search
hv: high voltage
ieee: institute of electrical and electronics engineers
nna: nodal network analysis
of: objective function
pso: particle swarm optimization
tlbo: teaching learning based optimization

1.
introduction

metallic pipelines (buried or aerial) that transport hydrocarbons and water and share a common right-of-way with high-voltage overhead transmission lines are subject to electrostatic and electromagnetic interference created by the electric and magnetic fields that these hv power lines emit under normal operating conditions. these fields can induce voltages and currents in metallic pipelines installed in the immediate vicinity of the lines. in some cases, the induced voltages can reach levels high enough to be hazardous to operating personnel who come into contact with the pipeline, and to damage the pipeline itself, its associated equipment and its cathodic protection systems [1-4]. consequently, the induced voltages on metallic pipelines must be reduced to acceptable levels to protect personnel and to ensure the integrity of the pipeline. it is therefore important to assess the electrostatic and electromagnetic interference between transmission lines and pipelines, for performance and safety reasons, under normal operating conditions of the electric network.

interference problems involving hv overhead power lines and metallic pipelines have been widely addressed in the literature, where substantial research has been devoted to evaluating inductive and capacitive interference with various analytical and numerical methods. different simulation methodologies have been used [5,6]; they generally rely on the transmission line approach [7-15], on the finite element method (fem) alone [16-20] or in combination with circuit analysis [21-26], on nodal network analysis [27,28], on the finite difference method (fdm) [29,30] or on the charge simulation method (csm) [31-35].
the transmission line approach takes thevenin equivalent circuits as its basic assumption and provides reasonably good results for the induced voltage. the finite element method (fem) is a robust approach that gives reliable and accurate results for the induced voltage. the circuit theory approach gives more conservative results because it does not take into account the effects of infinite transmission line length. nodal network analysis (nna) can predict the induced voltage with sufficient accuracy. the finite difference method (fdm) is sufficiently rigorous and leads to accurate results. the charge simulation method (csm) is one of the most widely used approaches because it lends itself to optimization and gives accurate results.

this paper proposes a numerical modelling analysis of the electrostatic and electromagnetic couplings between hv overhead power lines and a nearby aerial metallic pipeline using hybrid simulation methods. the computation methodologies are based, in turn, on the charge simulation method (csm) and the current simulation technique (cst) [36-38]. the main constraints of these methods are, respectively, the number and position of the fictitious charges and of the line current filaments. evolutionary algorithms (eas) are commonly used to solve the associated optimization problem, that is, to find the values of these parameters that give a sufficiently precise solution of the couplings. evolutionary algorithms are stochastic optimization methods based on a rough simulation of the natural evolution of populations. one of the most prominent evolutionary algorithms is teaching learning based optimization (tlbo).
teaching learning based optimization (tlbo) is a stochastic optimization metaheuristic originally proposed by rao et al. in 2011 [39]. this population-based search algorithm is inspired by the teaching-learning process and models the influence of a teacher on the output of students in a classroom; it is widely used because of its good performance, efficiency and simplicity of implementation [40]. in recent years it has been applied successfully to optimization problems in many scientific and engineering fields. finally, the simulation results obtained by the two proposed combined methods are validated by comparison with those given, respectively, by the analytical approaches based on admittance matrix analysis and carson's equations [15,35].

2. coupling mechanisms

in electricity, coupling is the transfer of energy from one element of an electrical system to another. alternating voltages and currents can be induced on metallic pipelines near hv power transmission lines by three main mechanisms: electrostatic, electromagnetic and conductive coupling.

2.1. electrostatic coupling from hv power line to pipeline

only a metallic pipeline installed above ground level is subject to electrostatic coupling; a buried pipeline is protected by the shielding effect of the ground. a pipeline located above ground level near a hv power line can acquire a large voltage to ground. this voltage is due to charge accumulation through the capacitance between the hv power line conductors and the pipeline, in series with the capacitance between the pipeline and the ground, which together form a capacitive voltage divider, as illustrated in figure 1 [1-3].

fig. 1 electrostatic coupling from hv power line to a metallic pipeline (labels in the figure: r, s, t, g, c1, c2, pipeline, ground level)

2.2.
electromagnetic coupling from hv power line to pipeline

electromagnetic interference results from the time variation of the magnetic field generated by the hv power lines, as shown in figure 2. aerial and buried pipelines running parallel to, or in close proximity to, hv transmission lines are subjected to voltages induced by the time-varying magnetic flux produced by the line currents, according to faraday's law of electromagnetic induction. the induced voltage causes currents to circulate in the pipeline and voltages to appear between the pipeline and the surrounding earth [1-3].

fig. 2 electromagnetic coupling from hv power line to a metallic pipeline

2.3. conductive coupling from hv power line to pipeline

conductive coupling appears when a phase-to-earth or phase-to-phase-to-earth fault occurs. in this case, a large current flows to earth through the pylon earthing, as shown in figure 3. this current raises the ground potential in the proximity of the metallic pipeline. the resulting high voltage stresses the pipeline coating and can cause arcs that damage the coating or the pipeline itself. in addition, this voltage difference can pose an electric shock hazard to a person directly touching the pipeline [1-3].

fig. 3 conductive coupling from hv power line to a metallic pipeline (labels in figures 2 and 3: r, s, t, g, conductor, single phase-ground fault, fault currents, conductive soil, pipeline, magnetic field, ground level)

3. electrostatic coupling calculation

the charge simulation method (csm) is a numerical tool for solving boundary value problems of laplace's equation. the method was initially proposed by steinbigler in 1969 [41], and was subsequently developed into a powerful and efficient tool for calculating the electric field of high-voltage equipment.
in fact, this method is simple to use and implement; it handles the problem quickly while providing an accurate solution [42, 43-45]. in this method, each conductor is represented by a number of fictitious infinite line charges placed inside the conductor on a cylinder of fictitious radius. most csm problems have a plane of symmetry, generally the earth, whose reference potential is conventionally taken as zero; the ground effect is then taken into account by introducing image charges [46-50]. the number of boundary (contour) points selected on the conductor surface is taken equal to the number of simulated charges, and the charges are placed so that the dirichlet boundary conditions are satisfied. once the magnitudes of the simulated charges are known, the potential at any point outside the conductors follows from the superposition theorem [50-53]:

$$V_i = \sum_{j=1}^{n_c} P_{ij}\, q_j \tag{1}$$

where $n_c$ is the total number of simulated charges and $P_{ij}$ is the maxwell potential coefficient at the contour point $i$ due to the simulated charge $q_j$.

the magnitudes of the simulated charges are first computed by solving the system of $n_c$ linear equations in $n_c$ unknown charges [50-53]:

$$[q_j]_{n_c} = [P_{ij}]^{-1}_{n_c \times n_c}\, [V_{ci}]_{n_c} \tag{2}$$

where $[q_j]$ is the column vector of the simulated charges on the conductors, $[V_{ci}]$ is the column vector of the known potentials at the boundary points of the conductors, and $[P_{ij}]$ is the matrix of the maxwell potential coefficients of the conductors.

as an example, figure 4 shows three point charges in free space placed at different distances from the point $m_i$.
according to the superposition principle, the potential $V_i$ at this point is [41]:

$$V_i = P_{i1} q_1 + P_{i2} q_2 + P_{i3} q_3 = \frac{q_1}{4\pi\varepsilon_0 r_1} + \frac{q_2}{4\pi\varepsilon_0 r_2} + \frac{q_3}{4\pi\varepsilon_0 r_3} \tag{3}$$

once the simulated charges have been obtained from equation (2), it is necessary to check that they reproduce the boundary conditions actually imposed on the conductor surfaces, in order to get the best calculation precision. first, several check points are selected on the conductor surfaces and the potential produced there by the simulated charges is computed. second, the relative error between this computed potential and the potential actually applied to the conductors is determined; it indicates the simulation accuracy. if the accuracy does not satisfy the simulation criterion, the number and/or location of the simulated charges must be changed. once this is done, the electric field strength at any point can be computed [50-53].

fig. 4 three point charges in free space

the charge simulation method (csm) is widely used to calculate the electric field strength in the vicinity of very high voltage overhead transmission lines. line charges of infinite length are generally used for overhead power lines, because the conductor radius is negligible compared with its length. the typical placement of simulated charges and contour points in the conductor/pipeline cross-section is shown in figure 5.

fig.
5 two-dimensional arrangement of simulation charges and contour points for the line conductor and the pipeline (legend: simulated charges, contour points, check point; r1: real radius of the conductor/pipeline; r2: fictitious radius of the conductor/pipeline)

the coordinates of the contour points and simulated charges in the orthogonal frame are given by the following equations [35,36,51]:

$$x_k = x_0 + r_k \cos\!\left[\frac{2\pi}{n}(k-1)\right], \qquad y_k = y_0 + r_k \sin\!\left[\frac{2\pi}{n}(k-1)\right] \tag{4}$$

where $r_k = r_1$ if $k = i$ (contour points) and $r_k = r_2$ if $k = j$ (simulated charges); $n$ is the number of points on the circle; $y_0$ is the height of the conductor/pipeline above ground level; $x_0$ is the horizontal coordinate of the conductor/pipeline.

the electric field generated by an electric charge is described by gauss's law. for a three-phase transmission line, in a rectangular coordinate system, the horizontal and vertical components of the electric field intensity due to all the simulated charges, including the image charges, are [50-53]:

$$E_{xi} = \sum_{j=1}^{n_c} (f_x)_{ij}\, q_j, \qquad E_{yi} = \sum_{j=1}^{n_c} (f_y)_{ij}\, q_j \tag{5}$$

where $(f_x)_{ij}$ and $(f_y)_{ij}$ are the electric field intensity coefficients between the contour points and the simulated charges $q_j$. the total electric field strength at any observation point is calculated as follows [43]:

$$E_{res} = \sqrt{E_{xi}^2 + E_{yi}^2} \tag{6}$$

the voltage induced on the aerial metallic pipeline by the capacitive effect of all the electric charges that simulate the conductors is evaluated as follows [1,33]:

$$V_{ind} = \frac{1}{2\pi\varepsilon_0} \sum_{j=1}^{n_c} q_j \ln\!\left[\frac{\sqrt{(x - x_j)^2 + (y + y_j)^2}}{\sqrt{(x - x_j)^2 + (y - y_j)^2}}\right] \tag{7}$$

where $(x, y)$ are the coordinates of the observation point and $(x_j, y_j)$ are the coordinates of the simulated charges.
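the procedure of equations (1), (2), (4) and (7) can be illustrated for a single conductor above a perfectly conducting ground. the following is only a minimal sketch under stated assumptions, not the authors' implementation: the function names, the small gaussian-elimination solver and the numerical values in the usage note are hypothetical, and the potential coefficient is the standard one for an infinite line charge with its image, which is the kernel that appears in equation (7).

```python
import math

EPS0 = 8.854187817e-12  # permittivity of free space, F/m


def ring_points(x0, y0, radius, n):
    """Equation (4): n points evenly spaced on a circle centred at (x0, y0)."""
    return [(x0 + radius * math.cos(2 * math.pi * k / n),
             y0 + radius * math.sin(2 * math.pi * k / n)) for k in range(n)]


def coeff(point, charge):
    """Potential coefficient of an infinite line charge plus its image (assumed form)."""
    (x, y), (xj, yj) = point, charge
    d_image = math.hypot(x - xj, y + yj)  # distance to the image charge
    d_real = math.hypot(x - xj, y - yj)   # distance to the real charge
    return math.log(d_image / d_real) / (2 * math.pi * EPS0)


def solve(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def simulate_conductor(x0, y0, r1, r2, n, volt):
    """Equation (2): charges on radius r2 that impose `volt` at contour points on r1."""
    contour = ring_points(x0, y0, r1, n)
    charges = ring_points(x0, y0, r2, n)
    p = [[coeff(c, q) for q in charges] for c in contour]
    return charges, solve(p, [volt] * n)


def potential(x, y, charges, q):
    """Equations (1) and (7): superposition of all simulated charges at (x, y)."""
    return sum(qj * coeff((x, y), pos) for pos, qj in zip(charges, q))
```

for example, `simulate_conductor(0.0, 10.0, 0.015, 0.005, 8, 100e3)` returns the charge positions and magnitudes for a hypothetical conductor 10 m above ground, and `potential(5.0, 1.0, charges, q)` then evaluates the potential at a pipeline location; evaluating `potential` at check points on the conductor surface gives the accuracy check described above.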
if a person is in contact with the ground and touches this pipeline at the same time, he receives an electric shock; the current passing through his body is given by the following relationship [1, 32]:

$$I_{shock} = j\,\omega\, C_p\, L_p\, V_{ind} \tag{8}$$

where $L_p$ is the length of the pipeline exposed to the electrostatic coupling, $C_p$ is the pipeline's capacitance to earth per unit length, and $\omega$ is the angular frequency.

when the discharge current in the human body exceeds the steady-state safe limit of 10 ma defined by the cigre standard [1], it must be reduced below the admissible level; the best protection is to connect the metallic pipeline to ground through an adequate resistance $R_g$, whose value must satisfy [1,54]:

$$R_g \le \frac{R_{body}}{\beta - 1} \tag{9}$$

where $R_{body}$ is the body resistance and $\beta$ is the ratio $\beta = I_{shock} / I_{adm}$. according to the american standard ieee 80:2013, the overall resistance of the human body is usually taken equal to 1000 Ω [1,55].

3. electromagnetic coupling calculation

many analytical and numerical methods are available for modelling and simulating the magnetic induction due to very high voltage (vhv) overhead transmission lines. the current simulation technique (cst) is the most suitable method for two-dimensional computation, as it is a reliable and efficient tool for the numerical solution of the magnetic induction equation in open-boundary problems. its basic principle is very similar to that of the charge simulation method (csm) [37, 38]. high voltage transmission lines may use bundled conductors (multiple sub-conductors per phase) to increase the transport capacity. in this method, the current in each sub-conductor is represented by a finite number $n_f$ of current filaments, which are distributed over a cylindrical surface of fictitious radius $r_j$.
in a three-phase transmission line with bundled conductors, if each phase consists of $m$ identical sub-conductors, the total number of sub-conductors is $3m$, as shown in figure 6. the number and position of the simulated filament currents depend on the total number of power line conductors, their spatial arrangement and the boundary conditions. the simulation currents $I_i$ ($i = 1, \ldots, 3 m n_f$) of all the sub-conductors must satisfy the following conditions [56-59]:

1. the normal component of the magnetic field intensity on the sub-conductor surfaces is zero, according to the biot-savart law.
2. the sum of the filament currents that simulate the current in a sub-conductor must equal the real current flowing through that sub-conductor.

after several contour points have been selected on the sub-conductor surfaces, the unknown simulation currents are obtained by solving the following system of equations:

$$A_{ij} = \sum_{i=1}^{3 m n_f} K_{ij}\, I_i = 0, \qquad j = 1, 2, 3, \ldots, 3m(n_f - 1) \tag{10}$$

$$\sum_{i=(q-1) n_f + 1}^{q\, n_f} I_i = I_{cq}, \qquad q = 1, 2, 3, \ldots, 3m \tag{11}$$

where $m$ is the number of sub-conductors per phase; $n_f$ is the number of filament line currents; $K_{ij}$ is the coefficient of the normal magnetic field defined by the coordinates of the $i$th contour point and the $j$th filament line current, given by [37,38]:

$$K_{ij} = \frac{\mu_0}{2\pi} \ln\!\left(\frac{r_j}{r_{ij}}\right) \tag{12}$$

where $r_{ij}$ is the distance between the simulation current point $j$ and the contour point $i$ on the sub-conductor surface, and $r_j$ is the fictitious radius of the current filament simulation (see figure 7).

fig. 6 three phase transmission line above ground with the images of line conductors

fig. 7 normal and tangential field components at a point on the sub-conductor surface
once the filament currents have been obtained by solving equations (10) and (11), their values and positions can be checked by the same steps described above for the charge simulation method (csm). in quasi-static analysis, the magnetic induction $\vec{B}$ is derived from the curl of the vector potential $\vec{A}$; for a potential directed along the line axis, the horizontal and vertical components of the magnetic induction along the two perpendicular axes ($x$ and $y$) are [37,38]:

$$\vec{B} = \nabla \times \vec{A} \;\Rightarrow\; B_{xi} = \frac{\partial A_{ij}}{\partial y} \quad \text{and} \quad B_{yi} = -\frac{\partial A_{ij}}{\partial x} \tag{13}$$

where $A_{ij}$ is the magnetic potential generated by the currents of the hv power line conductors, which can be expressed as [37,38]:

$$A_{ij} = \frac{\mu_0}{2\pi} \sum_{i=1}^{3 m n_f} I_i\, K_{ij} \tag{14}$$

this magnetic induction calculation takes the earth effect into account: the currents induced in the earth are represented by image filament currents located at a penetration depth $D_e$ below the earth's surface, which can be calculated from [37,38]:

$$D_e = 658.87 \sqrt{\frac{\rho_s}{f}} \tag{15}$$

where $\rho_s$ is the electrical resistivity of the soil and $f$ is the frequency of the source current.
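as a rough numerical illustration of equation (15) and of the role of the image currents, the sketch below computes the penetration depth and the resultant flux density of one straight filament together with its earth-return image. it is a simplified sketch and not the cst itself: the function names are hypothetical, a single filament with a real-valued current is assumed, the field of each filament is the standard one for an infinite straight conductor, and the image is assumed to carry the opposite current at the position $(x_i, -y_i - D_e)$.

```python
import math

MU0 = 4e-7 * math.pi  # permeability of free space, H/m


def penetration_depth(rho_s, f):
    """Equation (15): penetration depth in metres (rho_s in ohm.m, f in Hz)."""
    return 658.87 * math.sqrt(rho_s / f)


def filament_field(x, y, xi, yi, current):
    """Flux density (bx, by) of an infinite filament carrying `current` along +z."""
    dx, dy = x - xi, y - yi
    r2 = dx * dx + dy * dy
    k = MU0 * current / (2 * math.pi * r2)
    return -k * dy, k * dx


def field_with_earth_return(x, y, xi, yi, current, rho_s, f):
    """Resultant flux density of a filament and its assumed image at depth D_e."""
    d_e = penetration_depth(rho_s, f)
    bx1, by1 = filament_field(x, y, xi, yi, current)
    # assumed image: opposite current, mirrored below the surface at depth d_e
    bx2, by2 = filament_field(x, y, xi, -yi - d_e, -current)
    return math.hypot(bx1 + bx2, by1 + by2)
```

for a soil resistivity of 100 Ω·m at 50 hz, `penetration_depth(100.0, 50.0)` is about 932 m, so the image contributes only a small correction to the field close to the line.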
finally, the resulting magnetic induction at a given point in space is obtained by combining the horizontal and vertical components of equation (13) [37,38]:

$$B_{res} = \sqrt{B_{xi}^2 + B_{yi}^2} \tag{16}$$

in this magnetic induction calculation it is also desirable to take into account the currents induced in the earth wires and in the metallic pipeline by the three-phase currents flowing in the phase conductors. they can be calculated from the following expression [60,61]:

$$[I_g] = -[Z_{gg}]^{-1}\, [Z_{gc}]\, [I_c] \tag{17}$$

where $Z_{gg}$ are the self impedances of the earth wires and metallic pipeline; $Z_{gc}$ are the mutual impedances between the phase conductors and the earth wires / metallic pipeline; $I_c$ are the currents in the three-phase conductors of the power line; $I_g$ are the induced currents in the earth wires and metallic pipeline.

in the extremely low frequency domain, the self and mutual longitudinal impedances of conductors with ground return can be obtained from the simplified carson-clem formulas, respectively [60,61]:

$$Z_{gg} = R_g + \frac{\mu_0\, \omega}{8} + j\, \frac{\mu_0\, \omega}{2\pi} \ln\!\left(\frac{D_e}{R_{gm}}\right) \tag{18}$$

$$Z_{gc} = \frac{\mu_0\, \omega}{8} + j\, \frac{\mu_0\, \omega}{2\pi} \ln\!\left(\frac{D_e}{D_{gc}}\right) \tag{19}$$

where $R_g$ is the dc resistance of the conductor; $R_{gm}$ is the geometric mean radius of the conductor; $D_{gc}$ is the mutual distance between two conductors; $\omega$ is the angular frequency; $D_e$ is the penetration depth of the earth return; $\mu_0$ is the permeability of free space.

the induced voltage on the aerial metallic pipeline due to the magnetic effect can be calculated through faraday's law of electromagnetic induction.
this law states that a magnetic induction that changes with time induces a voltage in the pipeline; the total time-varying flux $\varphi_t$ due to all the currents flowing in the conductors and linking the pipeline is calculated as a surface integral [62-64]:

$$\varphi_t = \int_S B_{res}\, dS \tag{20}$$

where $\varphi_t$ is the total flux produced by all the power line conductors and $S$ is the total surface area.

the metallic pipeline and its earth return form a closed loop located at the coordinates shown in figure 8. the total magnetic flux $\varphi_t$ through the surface $S$ defined by the coordinates of the power line conductors and the pipeline can be expressed as follows [62-64]:

$$\varphi_t = \frac{\mu_0\, l}{4\pi} \sum_{i=1}^{n} I_i \ln\!\left[\frac{(x_p - x_i)^2 + (y_p + D_e + y_i)^2}{(x_p - x_i)^2 + (y_p - y_i)^2}\right] \tag{21}$$

where $(x_i, y_i)$ are the coordinates of the power line conductors, $(x_p, y_p)$ are the coordinates of the metallic pipeline, and $l$ is the exposed length of the pipeline.

finally, using the total magnetic flux, the voltage induced on the metallic pipeline by the magnetic coupling can be expressed as follows [62-64]:

$$V_{ind} = -\frac{\partial \varphi_t}{\partial t} = -j\, \omega\, \varphi_t \tag{22}$$

in case of direct accidental contact with the metallic pipeline, the shock current flowing through the human body can be calculated as [15,55]:

$$I_{shock} = \frac{V_{ind}}{Z_{pipe} + R_{body} + R_c} \tag{23}$$

where $R_{body}$ is the human body resistance; $R_c$ is the ground contact resistance of a person; $Z_{pipe}$ is the total impedance of the metallic pipeline, calculated from the equation below [1]:

$$Z_{pipe} = \frac{\sqrt{\rho_p\, \mu_0\, \mu_p\, \omega}}{\pi D_p \sqrt{2}} + \frac{\mu_0\, \omega}{8} + j \left[ \frac{\sqrt{\rho_p\, \mu_0\, \mu_p\, \omega}}{\pi D_p \sqrt{2}} + \frac{\mu_0\, \omega}{2\pi} \ln\!\left( \frac{3.7 \sqrt{\rho_s\, \omega^{-1} \mu_0^{-1}}}{D_p} \right) \right] \tag{24}$$

where $D_p$ is the pipeline's diameter; $\mu_p$ is the relative permeability of the pipeline's metal; $\rho_p$ is the pipeline's resistivity.

fig.
8 determination of the induced voltage on the metallic pipeline (labels in the figure: pipeline at $(x_p, y_p)$, the point $(x_p, -y_p - D_e)$, conductor coordinates $(x_i, y_i)$ and $(x_j, y_j)$, $V_{ind}$)

for touch voltages, for a soil with surface resistivity $\rho_s$, the contact resistance $R_c$ is calculated as [15]:

$$R_c = 3.125\, \rho_s \tag{25}$$

in some cases, the induced voltage exceeds the acceptable limit recommended by international standards; the international cigre regulations require that safety measures be taken if the voltage on the pipeline exceeds 50 v in steady state [1]. in this case, mitigation is necessary to keep the voltage within the permitted limit; it is enough to connect the metallic pipeline to ground with two identical electrodes, one at each end of the pipeline.

4. teaching learning based optimization (tlbo)

teaching learning based optimization (tlbo) is a meta-heuristic optimization algorithm proposed by rao et al. [39]. it is inspired by the teaching-learning process and is based on the effect of a teacher's influence on the output of students in a classroom environment. the teacher-student interaction is the fundamental inspiration for this algorithm: a group of learners in a classroom is considered as the population, and the different subjects offered to the learners correspond to the different design variables of the optimization problem. the results of a learner are analogous to the objective function value of the optimization problem, the number of exams is the number of iterations, and the best solution in the whole population is considered the teacher. the major advantage of this algorithm is that it does not require algorithm-specific control parameters. the teacher and the learners are the two essential components of the algorithm; it therefore describes two learning processes, one through the teacher (the teacher phase) and one through interaction with other learners (the learner phase) [65-70].

4.1.
teacher phase

during this phase, the teacher aims to impart knowledge to the learners and tries to improve the average result of the classroom. how much the learners gain depends on the quality of the teaching and on the skills of the learners present in the class. taking this into account, the difference between the teacher's result and the learners' mean result in each subject is expressed as follows [65-70]:

$$diff_i = r_i \left( x_{t,i} - T_F\, M_i \right) \tag{26}$$

where $x_{t,i}$ is the teacher's result and $M_i$ is the mean result of the learners in subject $i$; $r_i$ is a random number in [0, 1]; $T_F$ is the teaching factor, which depends on the teaching quality and equals either 1 or 2. the value of $T_F$ is chosen at random by the following formula [65-70]:

$$T_F = \mathrm{round}\left[ 1 + \mathrm{rand}(0,1)\{2 - 1\} \right] \tag{27}$$

through teaching and the transfer of knowledge, the learners' results are modified in the upcoming test; the update is represented by the following expression [65-70]:

$$x'_{j,i} = x_{j,i} + diff_i \tag{28}$$

where $x'_{j,i}$ and $x_{j,i}$ are the new and old grades that learner $j$ earned in exam $i$, respectively. the better of the two results is accepted and used as input for the learner phase.

4.2. learner phase

in this second phase, the learners increase their knowledge by interacting with each other, in particular by discussing with better learners and working as a collective team, which helps to produce the best results $x''$. for a population of size $n$, the learning interaction between two learners $a$ and $b$ in each exam, for minimization problems, is expressed as follows [65-70]:

$$x''_{a,i} = \begin{cases} x'_{a,i} + r_i \left( x'_{a,i} - x'_{b,i} \right) & \text{if } f(x'_{a,i}) < f(x'_{b,i}) \\ x'_{a,i} + r_i \left( x'_{b,i} - x'_{a,i} \right) & \text{if } f(x'_{b,i}) < f(x'_{a,i}) \end{cases} \tag{29}$$

$x''$ is accepted into the population if it gives a better function value.
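the two phases of equations (26) to (29) can be sketched as a short minimization routine. this is an illustrative sketch under stated assumptions rather than the authors' code: the function name `tlbo`, its parameters, the clipping of candidates to the search bounds and the per-learner drawing of the teaching factor are all choices made for this example.

```python
import random


def tlbo(objective, bounds, n_learners=20, n_iter=100, seed=0):
    """Minimal TLBO sketch: returns (best solution, best objective value)."""
    rng = random.Random(seed)
    dim = len(bounds)

    def clip(x):
        # keep every design variable inside its bounds (an assumption of this sketch)
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, bounds)]

    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_learners)]
    cost = [objective(x) for x in pop]

    for _ in range(n_iter):
        # teacher phase, equations (26) to (28)
        teacher = pop[cost.index(min(cost))][:]
        mean = [sum(x[d] for x in pop) / n_learners for d in range(dim)]
        for j in range(n_learners):
            tf = rng.randint(1, 2)  # teaching factor, 1 or 2 as in equation (27)
            new = clip([pop[j][d] + rng.random() * (teacher[d] - tf * mean[d])
                        for d in range(dim)])
            c = objective(new)
            if c < cost[j]:  # greedy acceptance of the better result
                pop[j], cost[j] = new, c

        # learner phase, equation (29)
        for a in range(n_learners):
            b = rng.choice([k for k in range(n_learners) if k != a])
            sign = 1.0 if cost[a] < cost[b] else -1.0
            new = clip([pop[a][d] + rng.random() * sign * (pop[a][d] - pop[b][d])
                        for d in range(dim)])
            c = objective(new)
            if c < cost[a]:
                pop[a], cost[a] = new, c

    best = cost.index(min(cost))
    return pop[best], cost[best]
```

for instance, `tlbo(lambda x: sum(v * v for v in x), [(-5.0, 5.0)] * 2)` minimizes a simple quadratic test function; in the setting of this paper the objective would instead return the relative error obtained for a given number and position of simulated charges or filaments.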
the implementation steps of the tlbo algorithm can be summarized as follows [71-73]:

step 1: define the optimization problem (minimization) and initialize the parameters of the algorithm: the population size, the number of variables, the maximum number of iterations, and the objective function $f(x)$.
step 2: randomly initialize the grades (solutions) $x_{i,j}$ of the $n$ learners ($j = 1, 2, \ldots, n$) in exam $i = 1$.
step 3: calculate the objective function for the $n$ learners in exam $i$.
step 4: calculate $M_i$ and $x_{t,i}$, identifying the best solution as the teacher according to $x_{teacher} = x \,|\, f(x) = \min$.
step 5: calculate $diff_i$ for exam $i$ according to equation (26), using the teaching factor $T_F$.
step 6: calculate $x'_{j,i}$ for the $n$ learners in exam $i$ according to equation (28), compare the two solutions $x'_{j,i}$ and $x_{j,i}$, and pass the better one to the next step.
step 7: choose pairs of learners at random, update the solutions according to equation (29) and keep the better one for the next step.
step 8: calculate the objective function for all learners and check whether the stopping criterion is met (the optimal solution is obtained); otherwise, iterate from step 4.

for the charge simulation method (csm), the objective function is the relative error, which has the simple form given in the following equation [38]:

$$OF_1 = \frac{1}{n_c} \sum_{i=1}^{n_c} \frac{\left| V_{ci} - V_{vi} \right|}{V_{ci}} \times 100 \tag{30}$$

where $V_{vi}$ is the exact potential to which the conductor is subjected, $V_{ci}$ is the potential computed at the check points, and $n_c$ is the total number of check points.
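equation (30) is a mean relative error expressed in percent, and it can be written in a few lines. the helper below is a hypothetical illustration (the name `relative_error_percent` is not from the paper); it takes the magnitude of the denominator so that it also works for complex phasor values, which is an assumption of this sketch.

```python
def relative_error_percent(v_check, v_exact):
    """Equation (30): mean relative error over the check points, in percent.

    v_check: potentials computed at the check points (V_ci)
    v_exact: potentials actually applied to the conductor (V_vi)
    """
    if len(v_check) != len(v_exact) or not v_check:
        raise ValueError("need two non-empty sequences of equal length")
    total = sum(abs(vc - vv) / abs(vc) for vc, vv in zip(v_check, v_exact))
    return 100.0 * total / len(v_check)
```

a function of this kind can serve as the quantity to be minimized by the optimizer, with the number and position of the simulated charges as design variables.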
for the current simulation technique (cst), the objective function is the relative error of the magnetic potential [38]:

$$OF_2 = \frac{1}{n_f} \sum_{i=1}^{n_f} \frac{\left| A_{ci} - A_{vi} \right|}{A_{ci}} \times 100 \tag{31}$$

where $A_{ci}$ is the magnetic potential calculated at the current filament points, $A_{vi}$ is the new magnetic potential estimated at the matching points, and $n_f$ is the total number of matching points.

5. friedman's statistical test

to establish that one optimization algorithm performs better than others, the friedman nonparametric test is most often used to determine whether the algorithms are statistically different and to rank them in terms of performance and speed, so that the best of them can be applied to the optimization problem. in general, the outcome of a statistical test is decided by computing the p-value and comparing it with a previously defined threshold (traditionally 5%). if the p-value is less than this threshold, the null hypothesis is rejected in favor of the alternative hypothesis, and the test result is declared statistically significant [74-77]. in this paper, friedman's statistical test is used to analyze the minimum values of the objective function obtained from different optimization algorithms, namely teaching learning based optimization (tlbo) [78], the flower pollination algorithm (fpa) [79], the harmony search algorithm (hs) [80], particle swarm optimization (pso) and the genetic algorithm (ga) [81], in order to identify the most efficient algorithm.

6. validation methods

in the case of electrostatic coupling, the voltage induced on the metallic pipeline by the hv power line conductors can be evaluated using the admittance matrix technique. under steady-state operating conditions, for a symmetrical hv overhead transmission power
mahi line system with an aerial metallic pipeline, the shunt admittance matrix per unit length of the proposed electric circuit is determined by the following equation [1,54,82-85]. 1[ ] [ ] ij ij y j p − = (32) where, pij is the potential coefficients matrix of the proposed circuit (overhead power line conductors and metallic pipeline). then the current-voltage relations for this electric system can be represented in matrix form as follows:    i ij ii y v =  (33) the resulting matrix of shunt admittances for the total number of conductors (including three-phase conductors, earth wires and metallic pipeline) is represented below [1,53,81-85]: cc cp cgc c p pc pp pg p g gc gp gg g y y yi v i y y y v i y y y v             =                  (34) where, c, p and g are subscripts which represent respectively the three-phase conductors, metallic pipeline and earth wires. the current through the earthed earth wires is equal to zero; they can be removed by replacing (ig = 0) in equation (34), which gives: ' ' ' ' cc c pc c p pp c pp y yi v i vy y      =            (35) where, ' ' ' ' , , cg gc cg gp cc cc cp cp gg gg pg gc pc cp pc pc pp pp gg gg y y y y y y y y y y y y y y y y y y y y  = − = −    = − = −   (36) for an insulated metallic pipeline, the current flowing through it is zero ip = 0, by substituting it in equation (35), the resulting pipeline voltage to earth due to the electrostatic coupling with the hv power line can easily be deduced and given by the following relation [1,53,81-85]: ' 1 '[ ]= -[ ] [ ] [ ] p pc pp c v y y v − (37) where, vc is the column of the known three-phase voltages to earth of the hv power line conductors. in electromagnetic coupling case, under steady state conditions, the induced voltage on the metallic pipeline can be obtained by applying carson’s method. 
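the reduction from equation (34) to equation (37) can be checked numerically with a small sketch. to stay self-contained it treats each of the blocks $c$, $p$ and $g$ as a single conductor (scalar blocks), and the admittance values are hypothetical placeholders, not data from this study; the function name `pipeline_voltage` is also an assumption.

```python
def pipeline_voltage(Y, v_c):
    """Pipeline voltage from a 3x3 admittance matrix ordered (c, p, g).

    Each block is a scalar here (one conductor, one pipeline, one earth wire),
    which is a simplification of the three-phase case.
    """
    (y_cc, y_cp, y_cg), (y_pc, y_pp, y_pg), (y_gc, y_gp, y_gg) = Y
    # Equation (36): eliminate the earth wire row and column.
    y_pc_red = y_pc - y_pg * y_gc / y_gg
    y_pp_red = y_pp - y_pg * y_gp / y_gg
    # Equation (37): insulated pipeline, I_p = 0.
    return -y_pc_red * v_c / y_pp_red


# Hypothetical per-unit-length shunt admittances [S/m], purely illustrative.
Y_EXAMPLE = [
    [3.0e-9j, -0.1e-9j, -0.4e-9j],
    [-0.1e-9j, 5.0e-9j, -0.05e-9j],
    [-0.4e-9j, -0.05e-9j, 2.5e-9j],
]
```

because the same factor $j\omega$ multiplies every entry of the admittance matrix in equation (32), it cancels in equation (37), so the pipeline voltage depends only on the ratios of the potential coefficients.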
this approach is based on the principle of mutual impedances between the conductors of the hv power line and the metallic pipeline; these impedances are determined using carson's formula mentioned previously in equation (19) [4,85-90]. the voltage that appears between the metallic pipeline and the adjacent earth is calculated in two steps: first, the electromotive force (emf) induced along the metallic pipeline by the variable magnetic field is determined, and then the induced voltage along the metallic pipeline is obtained.

the total longitudinal electromotive force (emf) induced on the metallic pipeline is obtained through the mutual impedances between the pipeline and the power line conductors, which carry time-varying alternating currents. in the case where the overhead power line is equipped with one earth wire, the induced electromotive force (emf) is calculated according to the following equation [4,85-90]:

$$E_{ind} = -I_{c1} Z_{pc1} - I_{c2} Z_{pc2} - I_{c3} Z_{pc3} - I_{g1} Z_{pg1} \qquad (38)$$

this relation can be easily reduced to the general form below:

$$E_{ind} = -\sum_{i=1}^{n_i} I_i Z_{pi} \qquad (39)$$

where $Z_{pi}$ are the mutual impedances between the conductors of the power line (phase conductors, earth wires) and the metallic pipeline, $I_i$ are the currents passing through the three-phase conductors and the earth wires of the power line, and $n_i$ is the total number of conductors in the hv power line. the induced voltage on the metallic pipeline for a length of exposure $l$ to the electromagnetic coupling can be found using the formula given below [4,85-90]:

$$V_{ind} = E_{ind}\, l \qquad (40)$$

as can be seen, this approach assumes that the induced emf is constant over the entire length of the metallic pipeline.
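equations (39) and (40) amount to a complex weighted sum followed by a multiplication by the exposure length. the sketch below shows this with python complex numbers; the function names and the impedance values in the usage note are hypothetical, and the mutual impedances are assumed to be given per kilometre so that the length can be expressed in km.

```python
import cmath
import math


def balanced_currents(magnitude):
    """Three balanced phase currents (phasors) with 120 degree spacing."""
    return [cmath.rect(magnitude, math.radians(angle)) for angle in (0.0, -120.0, 120.0)]


def induced_voltage(currents, mutual_impedances, length_km):
    """Equations (39)-(40): E_ind = -sum(I_i * Z_pi), V_ind = E_ind * l.

    currents           -- conductor current phasors [A]
    mutual_impedances  -- conductor-to-pipeline mutual impedances [ohm/km]
    length_km          -- length of exposure [km]
    """
    emf = -sum(i * z for i, z in zip(currents, mutual_impedances))
    return emf * length_km
```

for example, `abs(induced_voltage(balanced_currents(500.0), [z1, z2, z3], 25.0))` gives the voltage magnitude for three phase conductors with mutual impedances `z1`, `z2`, `z3`. if the three impedances were equal, the balanced currents would cancel and the result would be zero, which is why the induced voltage depends on the asymmetry of the conductor distances to the pipeline.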
consider an hv overhead vertical single circuit transmission line of 275 kv, with one earth wire and an aerial insulated metallic pipeline in the immediate vicinity; the arrangement and geometric coordinates of the overhead power line and metallic pipeline are shown in figure 9. the pipeline runs parallel to the axis of the hv overhead power line at a separation distance of 45 m; its height above the ground is 1 m and its radius is 0.3 m. the length of the metallic pipeline exposed to the ac interference is 25 km. the three-phase currents in the hv power line are assumed to be balanced, with a magnitude of 500 a and a nominal system frequency of 50 hz. the earth is assumed to be homogeneous with a resistivity of 100 Ω·m; the ac resistance is 0.1586 Ω/km for the phase conductor, 0.1489 Ω/km for the earth wire and 0.5 Ω/km for the metallic pipeline.

fig. 9 single circuit hv vertical configuration with an aerial metallic pipeline

7. results and discussions

firstly, the aim is to select the best parameters to insert in the simulation methods to achieve results with satisfactory accuracy. in order to obtain the optimal number and location of fictitious charges and current filaments, it is necessary to use a robust and powerful optimization algorithm. in this context, the performances of different optimization algorithms (pso, fpa, hs, tlbo, ga) were compared using the friedman statistical test under the same conditions, so that they could be ranked according to their performance. to ensure a fair comparison, these algorithms were implemented in matlab (r2014a), and the experiments for each algorithm were repeated 10 times on the same computer running the windows 10 operating system. the parameter settings of all optimization algorithms are shown in table 1.
table 1 parameter settings of each algorithm (100 iterations)

| algorithm | parameter settings |
|---|---|
| particle swarm optimization (pso) | swarm size n = 20; learning factors c1 = 2, c2 = 2; inertia weight wmax = 1.2, wmin = 0.4 |
| flower pollination algorithm (fpa) | population size n = 20; switch probability p = 0.8 |
| harmony search algorithm (hs) | harmony memory size hms = 5; harmony memory consideration rate hmcr = 0.95; pitch adjustment rate par = 0.25; bandwidth distance bw = 0.02*(ub - lb) |
| teaching learning based optimization (tlbo) | population size n = 20 |
| genetic algorithm (ga) | population size n = 20; mutation probability = 0.2; crossover probability = 0.4; number of bits = 25 |

the statistical and comparative analysis of the results obtained by the selected optimization algorithms, following the friedman ranking test, is presented in table 2.

table 2 results of friedman's statistical test of the optimization algorithms

| test statistic | value |
|---|---|
| friedman's chi-square statistic | 84 |
| degrees of freedom (df) | 4 |
| number of observations n | 21 |
| standard deviation (sigma) | 1.5811 |
| prob>chi-sq (p-value) | 2.47e-17 |

| algorithm | mean rank |
|---|---|
| pso | 3 |
| fpa | 4 |
| hs | 5 |
| tlbo | 1 |
| ga | 2 |

friedman's statistical test shows that the difference between the performances of the compared algorithms is significant: the p-value is very low, well below the critical value (p = 0.05). moreover, the tlbo algorithm achieved the first rank, with the lowest simulation error, and provides the best performance compared to the other algorithms. consequently, the tlbo algorithm can be used to solve the optimization problems in the methods adopted for the induced voltage calculation. the variation of the objective functions (of) given in equations (30) and (31) with the number of iterations is represented in figure 10; it shows the search process followed by this algorithm and the minimization of these objective functions.
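the figures reported in table 2 can be cross-checked from the mean ranks alone. the sketch below assumes the standard friedman statistic without tie correction and uses the closed-form chi-square tail probability for an even number of degrees of freedom; the function names are hypothetical.

```python
import math


def friedman_chi_square(mean_ranks, n_observations):
    """Friedman statistic from mean ranks (no tie correction assumed)."""
    k = len(mean_ranks)
    rank_sums = [n_observations * r for r in mean_ranks]
    chi2 = (12.0 / (n_observations * k * (k + 1))) * sum(r * r for r in rank_sums) \
        - 3.0 * n_observations * (k + 1)
    return chi2, k - 1


def chi2_tail_even_df(x, df):
    """P(X > x) for a chi-square variable with an even number of degrees of freedom."""
    half = df // 2
    return math.exp(-x / 2.0) * sum((x / 2.0) ** i / math.factorial(i) for i in range(half))


# Mean ranks from table 2: tlbo = 1, ga = 2, pso = 3, fpa = 4, hs = 5; n = 21.
chi2, df = friedman_chi_square([1, 2, 3, 4, 5], 21)
p_value = chi2_tail_even_df(chi2, df)
sigma = math.sqrt(5 * (5 + 1) / 12.0)  # coincides with the tabulated sigma
```

this yields a chi-square statistic of 84 with 4 degrees of freedom and a p-value of about 2.47e-17, in agreement with table 2. integer mean ranks of exactly 1 to 5 mean that every observation ranked the five algorithms in the same order, which is the case where the statistic reaches its maximum value n(k - 1) = 84.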
it can be clearly observed that the objective function values decrease as the number of iterations increases and converge towards a minimum. the optimal values of the parameters to be inserted in these simulation methods are summarized in table 3.

table 3 optimum values of the simulation methods (csm and cst)

| algorithm + method | parameter | phase conductor | earth wire | pipeline | of value |
|---|---|---|---|---|---|
| csm + tlbo | fictitious charges number | 22 | 15 | 23 | 2e-14 |
| csm + tlbo | fictitious radius [m] | 0.036 | 0.008 | 0.14 | |
| cst + tlbo | current filaments number | 25 | 19 | 30 | 9.9e-07 |
| cst + tlbo | fictitious radius [m] | 0.03 | 0.01 | 0.1 | |

fig. 10 objective functions variation with number of iterations (panels: csm+tlbo and cst+tlbo; axes: iteration number, objective function)

for the electrostatic coupling analysis, figure 11 shows the lateral profile of the electric field distribution with and without the metallic pipeline. the graph shows that the initial electric field distribution is symmetrical about a point located 7 m from the suspension pylon. the presence of the metallic pipeline has a relatively significant effect on the electric field at the exact location of the pipeline: at this point the electric field increases slightly on the pipeline's surface because of the induced electrical charges accumulated on this surface. therefore, it can be concluded that the presence of a metallic pipeline in the immediate vicinity of an overhead power line distorts the electric field at the place where the pipeline is installed. the profile of the perturbed electric field on the pipeline's surface, for different pipeline positions on both sides of the right-of-way, is shown in figure 12.
it can be observed that the perturbed electric field reaches its maximum value (e = 7.12 kv/m) for a horizontal separation distance of the pipeline equal to +7 m; moving away from this point on either side, the electric field intensity declines and becomes almost minimal very far from the point of symmetry of the electric field. as a result, it is suggested that the pipeline be located as far as possible from the power line in order to effectively reduce the electric field effects on this pipeline.

fig. 11 electric field profile with and without the metallic pipeline at 1 m above the ground (axes: lateral distance (x) [m], electric field [kv/m]; curves: without pipeline, with pipeline)

fig. 12 perturbed electric field profile on the metallic pipeline's surface (axes: pipeline position from the power line center [m], electric field [kv/m]; marked point: x = 45, y = 0.2397)

figure 13 shows the induced voltage profile on the pipeline's surface as a function of the separation distance of the pipeline along the right-of-way. generally, the voltage induced on the metallic pipeline is directly proportional to the perturbed electric field, so its distribution is very similar to that of the perturbed electric field; the maximum value of the induced voltage is obtained at a separation distance of the pipeline equal to +7 m. as a general suggestion, it is highly recommended that the metallic pipeline be installed at a proximity distance, called the critical distance, where the induced voltage is below the values prescribed by international standards.

under normal operating conditions, the discharge current due to the capacitive coupling through the body of a person touching the metallic pipeline, for different separation distances along the right-of-way, is shown in figure 14.
it is important to note that the discharge current level is directly related to the induced voltage value: the higher the induced voltage, the more intense the resulting current. the discharge current in this case study is 17 ma, a value considered unacceptable from a personnel safety point of view.

fig. 13 induced voltage on the insulated metallic pipeline due to hv power line (axes: pipeline position from the power line center [m], induced voltage on the pipeline [v]; marked point: x = 45, y = 72.89)

fig. 14 intensity of shock current flowing in human body (axes: pipeline position from the power line center [m], discharge current [ma]; marked point: x = 45, y = 17)

when the discharge current through the human body exceeds the safety limit of 10 ma recommended by the cigre standard, a protection procedure must be implemented; it is sufficient to connect the metallic pipeline to earth through an appropriate resistance calculated according to equation (9). the grounding resistance of the pipeline as a function of its horizontal proximity distance is shown in figure 15. as can be seen from this figure, the grounding resistance varies inversely to the discharge current. in this study example, the metallic pipeline is therefore grounded through a suitable resistance equal to 1429 Ω.

fig. 15 calculation of the earthing resistance of metallic pipeline (axes: pipeline position from the power line center [m], earthing resistance [ohms]; marked point: x = 45, y = 1429)

for the electromagnetic coupling analysis, figure 16 shows the lateral profile of the magnetic induction distribution with and without the metallic pipeline, taking into account the effect of the currents induced in the earth wire and the metallic pipeline. without the pipeline, the profile is symmetrical about a point close to the center of the power line (x = +6 m); moving away from this point on either side, the magnetic induction intensity decreases rapidly with the lateral distance. the figure also indicates that the presence of a metallic pipeline in proximity to the power line disturbs the magnetic induction distribution: the profile is distorted where the metallic pipeline is installed. the pipeline is affected by the magnetic induction because of the current generated at the ends of this pipeline by the electromagnetic coupling.

fig. 16 magnetic induction profile with and without the metallic pipeline at 1 m above the ground (axes: lateral distance [m], magnetic induction [t]; curves: without pipeline, with pipeline)

the effect of the metallic pipeline's location along the right-of-way on the perturbed magnetic induction profile at its surface is shown in figure 17. it can be seen that the maximum value of the perturbed magnetic induction (b = 4.1 µt) is obtained near the lateral phase, at a separation distance of the pipeline equal to x = +6 m; from this position, the magnetic induction decreases continuously with the lateral position of the metallic pipeline and reaches much lower values very far from the power line center.

fig. 17 perturbed magnetic induction profile on the metallic pipeline's surface (axes: pipeline position from the power line center [m], magnetic induction [t]; marked point: x = 45, y = 1.886e-06)

the induced voltage on the metallic pipeline as a function of the pipeline's position along the right-of-way is shown in figure 18. as can be seen in this figure, the induced voltage is maximum when the pipeline is located at a proximity position equal to +6 m, then it decreases progressively as the lateral distance of the pipeline increases on both sides. it is important to note that the magnitude of the induced voltage in the metallic pipeline is directly proportional to the magnetic induction. in this case study the pipeline is located 45 m from the pylon center, and the induced voltage obtained on the metallic pipeline is 270.9 v; this value is much higher than the maximum value of 50 v permitted by the cigre standard.

fig. 18 induced voltage profile on the metallic pipeline (axes: pipeline position from the power line center [m], induced voltage [v]; marked point: x = 45, y = 270.9)

the variation of the electric shock current flowing through a person coming into contact with the metallic pipeline, as a function of its separation distance from the pylon, is illustrated in figure 19. as reflected in this figure, the shock current that flows through the human body on accidental contact is proportional to the magnitude of the induced voltage on the metallic pipeline, so the form of its graph is very similar to that of the induced voltage. in this case study, during normal operation, the shock current due to accidental contact with the metallic pipeline is 204.2 ma, which represents a significant and severe risk for the human body when compared with the admissible body current.
fig. 19 intensity of shock current flowing through the human body (axes: pipeline position from the power line center [m], shock current in human body [ma]; marked point: x = 45, y = 204.2)

induced voltages on the metallic pipeline that exceed the maximum value of 50 v admissible by the international cigre standard may pose a threat to the integrity of the pipeline and a risk to the safety of personnel. it then becomes imperative to implement an attenuation technique to keep the induced voltage within the recommended limit; it suffices to install low-value shunt resistances between the ends of the pipeline and the earth, which allow the current to be evacuated to earth along the pipeline section. figure 20 shows the electrode resistance value as a function of the separation distance of the metallic pipeline; this graph gives the earthing resistance values that ensure the safety of personnel and of the metallic pipeline. the behavior of the earthing resistance profile is opposite to that of the electric shock current.

fig. 20 resistance of the ground electrode of metallic pipeline (axes: pipeline position from the power line center [m], ground resistance [ohm]; marked point: x = 45, y = 3.555)

figure 21 shows the voltage applied to the electric system that combines in series the metallic pipeline and the electrode resistance to obtain a safety limit voltage (50 v). in this case study, it is necessary to install an earthing resistance equal to 3.555 Ω at each end of the metallic pipeline.

fig. 21 safe voltage in the electrode resistance (axes: electrode resistance value [ohm], electrode voltage [v]; marked point on the safe induced voltage line: x = 3.555, y = 50)

the results presented in figures 22 and 23 show the combined effect of the electrostatic and electromagnetic couplings, represented by the total induced voltage on the metallic pipeline and by the total discharge current passing through the human body. it can clearly be seen that the values obtained, depending on the position of the metallic pipeline along the right-of-way, are very significant. they can constitute a serious danger for the safety of intervention and maintenance personnel, a great threat to the pipeline integrity through degradation caused by metal corrosion and damage to the applied coatings, and a risk of failure of the cathodic protection system and of the various devices connected to the metallic pipeline. in order to protect intervention and maintenance personnel and to ensure the cost-effective functioning of metallic pipelines, the application of a mitigation procedure is necessary.

fig. 22 total induced voltage on the metallic pipeline due to the combined effect (axes: pipeline position from the power line center [m], total induced voltage on pipeline [v]; curves: electrostatic effect, electromagnetic effect, combined effect; marked points at x = 45: y = 72.89, y = 270.9, y = 343.8)

fig. 23 total shock current intensity flowing through the human body due to the combined effect (axes: pipeline position from the power line center [m], total electric shock current [ma]; curves: electrostatic effect, electromagnetic effect, combined effect; marked points at x = 45: y = 17, y = 204.2, y = 221.2)

finally, in order to verify the effectiveness of the proposed methods, the induced voltages obtained for the electrostatic and electromagnetic couplings were compared with those computed respectively by the admittance matrix analysis and by carson's approach, for the same data and the same geometry. figures 24 and 25 show the comparisons between the induced voltage values. the comparison indicates a very good correlation between the graphs of the different methods, and the maximum estimated relative errors between the values of these methods for the two coupling cases were within the permissible range, which is sufficient to validate the precision of the methods adopted.

fig. 24 comparison of the induced values by the two calculation methods for electrostatic coupling (axes: pipeline position from the power line center [m], induced voltage [v]; curves: admittance matrix method, csm+pso; marked points at x = 45: y = 72.89, y = 72.52)

fig. 25 comparison of the induced values by the two calculation methods for electromagnetic coupling (axes: pipeline position from the power line center [m], induced voltage [v]; curves: faraday's law, carson's method; marked points at x = 45: y = 271.2, y = 270.9)

8. conclusion

in this paper, a rigorous quasi-static modeling approach is used to analyze the electrostatic and electromagnetic couplings under normal operating conditions between an hv power transmission line and an aerial metallic pipeline placed in parallel and in close proximity.
two hybrid simulation methods based on the charge simulation method (csm) and the current simulation technique (cst), combined with teaching learning based optimization (tlbo), were presented. this algorithm is applied in order to find the optimal position and the appropriate number of simulation charges and current filaments required by these methods. the intensities of the perturbed electric and magnetic fields and the induced voltage on the metallic pipeline were analyzed.

for electrostatic coupling, the results show that the presence of an aerial metallic pipeline in the vicinity of an hv overhead power transmission line distorts the electric field at the pipeline's surface because of the static electric charges accumulated on this insulated surface. the maximum value of the induced voltage on the pipeline occurs at a separation distance of 7 m; it then declines rapidly on both sides of this distance and becomes almost negligible at a critical distance, at which it is recommended to lay the metallic pipeline. if the discharge current flowing in the human body during direct contact with the metallic pipeline exceeds the authorized safety limit, a mitigation procedure is recommended; it is sufficient to ground the metallic pipeline through an appropriate resistance.

for electromagnetic coupling, the results show that the presence of an aerial metallic pipeline in close proximity to an hv overhead power line disturbs the distribution of the magnetic field at the metallic pipeline's surface because of the current induced in this pipeline. the maximum induced voltage on the metallic pipeline is obtained when the pipeline is located at a proximity distance equal to +6 m from the pylon; it then decreases rapidly as the separation distance of the metallic pipeline increases on either side of the pylon. the discharge current which passes through the human body when it accidentally touches the metallic pipeline is linearly proportional to the magnitude of the induced voltage. when the resulting induced voltage on the metallic pipeline exceeds the safety threshold of 50 v, it can present risks for the safety of intervention and maintenance personnel, and also for the pipeline's equipment; these risks can be eliminated by applying a mitigation measure, for which it is sufficient to connect the two ends of the metallic pipeline to the earth through suitable resistances.

the numerical results given by the developed hybrid methods were compared with the results obtained by two different approaches, one for each of the studied couplings; the comparison shows a good agreement between the simulation results, which confirms the efficiency and the validity of the proposed methods.

references

[1] cigre, guide on the influence of high voltage ac power systems on metallic pipelines, working group 36.02, technical brochure no. 095, 1995. [2] r. a. gummow, a/c interference guideline final report, nace corrosion specialist, no.17, canadian energy pipeline association, 2014. [3] en 50443, effects of electromagnetic interference on pipelines caused by high voltage a.c. railway systems and/or high voltage a.c. power supply systems, cenelec report no: ics 33.040.20; 33.100.01, 2009. [4] australian new zealand standard, electrical hazards on metallic pipelines, standards australia, standards new zealand, 4853:2000. [5] d. d. micu, e. simion, d. micu and a. ceclan, "numerical methods for induced voltage evaluation in electromagnetic interference problems", in proceedings of the 9th international conference on electrical power quality and utilisation, 2007, pp. 1–6. [6] k. hyoun-su, h. y. min, j. g. chase and c. h.
kim, "analysis of induced voltage on pipeline located close to parallel distribution system", energies, vol. 14, pp. 8536–8536, 2021. [7] j. dabkowski, "how to predict and mitigate a.c. voltages on buried pipelines", pipeline & gas j., vol. 206, pp. 19–21, 1979. [8] a. taflove and j. dabkowski, "prediction method for buried pipeline voltages due to 60 hz ac inductive coupling part i analysis", ieee trans. power apparatus and systems., vol. pas-98, no. 3, pp. 780–787, 1979. [9] j. dabkowski, "the calculation magnetic coupling from overhead transmission lines", ieee trans. power appar. syst., vol. pas-100, no. 8, pp. 3850–3860, 1981. [10] f. p. dawalbi and r. d. southey, "analysis of electrical interference from power lines to gas pipelines part i: computation methods", ieee power eng. rev., vol. 9, no. 7, pp.70–70, 1989. [11] f. p. dawalibi and r. d. southey, "analysis of electrical interference from power lines to gas pipelines. ii. parametric analysis", ieee trans. power deliv., vol. 5, no. 1, pp. 415–421, 1990. [12] g. djogo and m. m. a. salama, "calculation of inductive coupling from power lines to multiple pipelines", electr. power syst. res., vol. 41, no. 1, pp. 75–84, 1997. [13] d. d. micu, g. c. christoforidis and l. czumbil, "ac interference on pipelines due to double circuit power lines: a detailed study", electr. power syst. res., vol. 103, pp. 1–8, 2013. [14] a. muresan, t. a. papadopoulos, l. czumbil, a. i. chrysochos, t. farkas and d. chioran, "numerical modeling assessment of electromagnetic interference between power lines and metallic pipelines: a case study", in proceedings of the 9th international conference on modern power systems. cluj-napoca, 2012, pp. 1–6. [15] r. djekidel and d. mahi, "calculation and analysis of inductive coupling effects for hv transmission lines on aerial pipelines", przegląd elektrotechniczny., vol. 190, no.9, pp. 151–156, 2014. [16] l. li and x. 
gao, "ac corrosion interference of buried long distance pipeline", in proceedings of the 3rd international conference on intelligent control-measurement and signal processing and intelligent oil field. xi’an, 2012, pp. 342–346. [17] k. j. satsios, d. p. labridis and p. s. dokopoulos, "finite element computation of field and eddy currents of a system consisting of a power transmission line above conductors buried in nonhomogeneous earth", ieee trans. power deliv., vol. 13, no. 3, pp. 876–882, 1998. [18] a. cristofolini, a. popoli and l. sandrolini, "numerical modelling of interference from ac power lines on buried metallic pipelines in presence of mitigation wires", in proceedings of the 2018 ieee international conference on environment and electrical engineering and 2018 ieee industrial and commercial power systems europe, palermo, 2018, pp. 1–5. [19] a. popoli, l. sandrolini and a. cristofolini, "finite element analysis of mitigation measures for ac interference on buried pipelines", in proceedings of the ieee international conference on environment and electrical engineering and industrial and commercial power systems europe, genova, 2019, pp. 1–5. [20] a. popoli, a. cristofolini, l. sandrolini, b. t. abe and a. jimoh, "assessment of ac interference caused by transmission lines on buried metallic pipelines using f.e.m," in proceedings of the 2017 international applied computational electromagnetics society symposium, firenze, 2017, pp. 1–2. [21] n. abdullah, "hvac interference assessment on a buried gas pipeline", iop conf. series: earth and environ. sci., vol. 704, no. 1, pp. 012009, 2021. [22] g. c. christoforidis, p. s. dokopoulos and k. e. psannis, "induced voltages and currents on gas pipelines with imperfect coatings due to faults in a nearby transmission line", in proceedings of the ieee international conference on porto power tech.
porto, 2001, pp. 401–406. [23] g. c. christoforidis and d. p. labridis, "inductive interference of power lines on buried irrigation pipelines", in proceedings of the ieee international conference of power, bologna, 2003, pp. 196–202. [24] g. c. christoforidis, d. p. labridi and p. s. dokopoulos, "a hybrid method for calculating the inductive interference caused by faulted power lines to nearby buried pipelines", ieee trans. power deliv., vol. 20, no. 2, pp. 1465–1473, 2005. [25] a. popoli, a. cristofolini and l. sandrolini, "a numerical model for the calculation of electromagnetic interference from power lines on nonparallel underground pipelines", math. comput. simul., vol. 183, pp. 221–233, 2021. [26] c. andrea, a. popoli, l. sandrolini, g. pierotti and m. simonazzi, "laplace transform for finite element analysis of electromagnetic interferences in underground metallic structures", appl. sci., vol. 12, no. 2, pp. 872–872, 2022. [27] h. g. lee, t. h. ha, y. c. ha, j. h. bae and d. k. kim, "analysis of voltages induced by distribution lines on gas pipelines," in proceedings of the ieee international conference on power system technology. singapore, 2004, pp. 598–601. [28] s. al‐alawi, a. al‐badi and k. ellithy, "an artificial neural network model for predicting gas pipeline induced voltage caused by power lines under fault conditions", int. j. comput. math. electr. electron. eng., vol. 24, no. 1, pp. 69–80, 2005. [29] a. popoli, l. sandrolini and a. cristofolini, "comparison of screening configurations for the mitigation of voltages and currents induced on pipelines by hvac power lines", energies j., vol. 14, pp. 3855–3855, 2021. [30] m. a. elhirbawy, l. s. jennings, s. m. ai dhalaan and w. w. l. keerthipala, "practical results and finite difference method to analyze the electric and magnetic field coupling between power transmission line and pipeline", in proceedings of the ieee international symposium on circuits and systems, 2003, pp. 431–434. 
[31] mazen abdel-salam, abdallah al-shehri, "induced voltages on fence wires and pipelines by ac power transmission lines", ieee trans. ind. appl., vol. 30, no. 2, pp. 341–349, 1994. [32] m. m. saied, "the capacitive coupling between ehv lines and nearby pipelines", ieee trans power deliv., vol. 19, no. 3, pp. 1225–1231, 2004. [33] a. gupta and m. j. thomas, "coupling of high voltage ac power line fields to metallic pipelines", in proceedings of the 9th ieee international conference on electromagnetic interference and compatibility (incemic 2006), bangalore, 2006, pp. 278–283. [34] h. m. ismail, a. m. amin and s. alkhoudary, "comparative study of the effect of hvtl electrostatic fields on gas pipelines using the atp-lcc& csm methods", int. j. eng. res. technol., vol. 2, no. 9, pp. 3037–3043, 2013. [35] r. djekidel and s. a. bessidek, "estimation and mitigation of electrostatic interferences on metallic pipeline by hv overhead power line using differential evolution algorithm", electrotehnica, electronica, automatica eea, vol. 64, no. 3, pp. 83–90, 2016. [36] r. djekidel, s. a. bessedik and a. hadjadj, "electric field modeling and analysis of ehv power line using improved calculation method", fu electr. energ., vol. 31, no. 3, pp. 425–445, 2018. [37] r. djekidel, s. a. bessedik and s. akef, "accurate computation of magnetic induction generated by hv overhead power lines", fu electr. energ., vol. 32, no. 2, pp. 267–285, 2019. [38] t. meriouma, s. a. bessedik and r. djekidel, "modelling of electric and magnetic field induction under overhead power line using improved simulation techniques", eur. j. electr. eng., vol. 23, no. 4, pp. 289–300, 2021. [39] r. v. rao, v. j. savsani and d. p. vakharia, "teaching–learning-based optimization: a novel method for constrained mechanical design optimization problems", comput. aided des. j., vol. 43, no. 3, pp. 303–315, 2011. [40] s. li, w. gong, l. wang, x. yan and c.
hu, "a hybrid adaptive teaching–learning-based optimization and differential evolution for parameter identification of photovoltaic models", energy convers. manag., vol. 225 p. 113474, 2020. [41] n. h. malik, "a review of the charge simulation method and its applications," ieee trans. electr. insul., vol. 24, no. 1, pp. 3–20, 1989. [42] f. lai, y. wang, y. lu and j. wang, "improving the accuracy of the charge simulation method for numerical conformal mapping", math. probl. eng., vol. 2017, p. 3603965, 2017. [43] r. djekidel and d. mahi, "effect of the shield lines on the electric field intensity around the high voltage overhead transmission lines", amse journals -series: modelling a., vol. 87; no. 1, pp. 1–16, 2014. [44] r. djekidel, d. mahi and a. ameur, "analysis of parameters affecting the capacitive interference between pipelines and power overhead line using genetic algorithms", int. j. electr. eng. inform., vol. 8, no. 2, pp. 315–330, 2016. [45] r. djekidel, "optimum phase configuration and location of the aerial pipeline in the vicinity of a high voltage overhead line", period. polytech. electr. eng. comput. sci., vol. 60, no. 2, pp. 143–150, 2016. [46] r. m. radwn and m. m. samy, "calculation of electric fields underneath six phase transmission lines," j. electr. syst., vol. 12, no. 4, pp. 839–851, 2016. [47] m. m. samy and a. m. emam, "computation of electric fields around parallel hv and ehv overhead transmission lines in egyptian power network", in proceedings of the ieee international conference on environment and electrical engineering and ieee industrial and commercial power systems europe, italy, 2017, pp. 1– 5. [48] y. wang and c. lv, "electric field calculation of the improved charge simulation method based on hybrid coding", chinese automation congress, pp. 1208–1213, 2019. [49] s. nakasumi, k. kikunaga, y. harada, m. ohkubo and k. takagi, "error evaluation of defect shape identification using charge simulation method for static electricity", j. 
electrostatics., vol. 114, p. 103633, 2021. [50] r. djekidel, s. a. bessedik and a. c. hadjadj, "assessment of electrical interference on metallic pipeline from hv overhead power line in complex situation", fu electr. energ., vol. 34, no. 1, pp. 53–69, 2021. [51] r. djekidel, a. choucha and a. c. hadjadj, "efficiency of some optimization approaches with the charge simulation method for calculating the electric field under extra high voltage power lines," iet gener. transm. distrib., vol. 11, no. 17, pp. 4167–4174, 2017. [52] f. yang, w. he, w. deng and t. chen, "a genetic algorithm‐based improved charge simulation method and its application", int. j. comput. math. electr. electron. eng., vol. 28, no. 6, pp. 1701–1709, 2009. [53] r. wang, j. tian, f. wu, z. zhang and h. liu, "pso/ga combined with charge simulation method for the electric field under transmission lines in 3d calculation model", electronics, vol. 8, no. 10, pp. 1140, 2019. [54] n. tleis, power systems modeling and fault analysis theory and practice, elsevier, second edition 2019, pp. 835–861. [55] ieee std 80-2013, ieee guide for safety in ac substation grounding, (revision of ieee standard 802000), 2013, pp. 1-226. [56] y. degui, l. bing, d. jun, h. danmei and w. xihong, "power frequency magnetic field of heavy current transmit electricity lines based on simulation current method", ieee world autom. congr., pp. 1–4, 2008. [57] r. roshdy, a. s. mazen, m. abdel-bary and s. mohamed, "laboratory validation of calculations of magnetic field mitigation underneath transmission lines using passive and active shield wires", innovative syst. des. eng., vol. 2, no. 4, pp. 218–232, 2011. [58] r. m. radwan, m. abdel-salam, m. m. samy and a.m. mahdy, "passive and active shielding of magnetic fields underneath overhead transmission lines theory versus experiment", in proceedings of the 17th international middle east power systems conference. mansoura, 2015, pp. 1–10. [59] m. abdel-salam, h. abdullah, m. th. 
el-mohandes and h. el-kishky, "calculation of magnetic fields from electric power transmission lines", electr. power syst. res., vol. 49, pp. 99–105, 1999. [60] m. albano, r. turri, s. dessanti, a. haddad and h. griffiths, b. howat, "computation of the electromagnetic coupling of parallel untransposed power lines", in proceedings of the 41st international universities power engineering conference. newcastle upon tyne, 2006, pp. 303–307. [61] r. djekidel, s. a. bessedik, p. spitéri and d. mahi, "passive mitigation for magnetic coupling between hv power line and aerial pipeline using pso algorithms optimization", electr. power syst. res., vol. 165, pp. 18–26, 2018. [62] k. yamazaki, t. kawamoto and h. fujinami, "requirements for power line magnetic field mitigation using a passive loop conductor", ieee trans. power deliv., vol. 15, no. 2, pp. 646–651, 2000. [63] p. cruz, c. izquierdo and m. burgos, "optimum passive shields for mitigation of power lines magnetic field", ieee trans. power deliv., vol. 18, no. 4, pp. 1357–1362, 2003. [64] a. r. memari, "optimal calculation of impedance of an auxiliary loop to mitigate magnetic field of a transmission line", ieee trans. power deliv., vol. 20, no. 2, pp. 844–850, 2005. [65] d. tang, j. zhao and h. li, "an improved tlbo algorithm with memetic method for global optimization", int. j. adv. comput. technol., vol. 5, no. 9, pp. 942–949, 2013. [66] h. r. e. h. bouchekara, m. a. abido and m. boucherma, "optimal power flow using teaching learning based optimization", electr. power syst. res., vol. 114, pp. 49–59, 2014. [67] p. sarzaeim, o. b. haddad and x. chu, teaching-learning-based optimization (tlbo) algorithm. in: advanced optimization by nature-inspired algorithms. studies in computational intelligence, springer, singapore, vol. 720, pp. 51–58, 2018. [68] m. m. puralachetty, v. k. pamula, l. m. gondela, v. n. b.
akula, "teaching-learning-based optimization with two-stage initialization", in proceedings of the ieee students' international conference on electrical, electronics and computer science. bhopal, 2016, pp. 1–5. [69] r. venkata-rao, v. patel, "an improved teaching-learning-based optimization algorithm for solving unconstrained optimization problems", scientia iranica., vol. 20, no. 3, pp. 710–720, 2013. [70] o. bozorg-haddad, p. sarzaeim and h. a. loáiciga, "developing a novel parameter-free optimization framework for flood routing", sci. rep., vol. 11, no. 1, p. 16183, 2021. [71] r. venkata-rao and v. patel, "an elitist teaching-learning-based optimization algorithm for solving complex constrained optimization problems," int. j. ind. eng. comput., vol. 3, no. 4, pp. 535–560, 2012. [72] x. he, j. huang, y. rao and l. gao, "chaotic teaching-learning-based optimization with lévy flight for global numerical optimization", comput. intell. neurosci., vol. 8341275, pp. 1687–5265, 2016. [73] s. sleesongsom and s. bureerat, "four-bar linkage path generation through self-adaptive population size teaching-learning based optimization", knowledge-based syst., vol. 135, pp. 180–191, 2017. [74] t. hastie, r. tibshirani and j. friedman, the elements of statistical learning: data mining, inference, and prediction. new york: springer, second edition 2009, pp.745. [75] d. joaquin, g. salvador, m. daniel, h. francisco, "a practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence algorithms", swarm evol. comput., vol. 1, no.1, pp. 3–18, 2011. [76] m. a. el-shorbagy and a. y. ayoub, "integrating grasshopper optimization algorithm with local search for solving data clustering problems", int. j. comput. intell. syst., vol. 14, no. 1, pp. 783–793, 2021. [77] h. moayedi, h. nguyen and l. 
kok-foong, "nonlinear evolutionary swarm intelligence of grasshopper optimization algorithm and gray wolf optimization for weight adjustment of neural network", eng. with comput., vol. 37, no. 2, pp. 1265–1275, 2021. [78] w. li, y. fan and q. xu, "teaching-learning-based optimization enhanced with multiobjective sorting based and cooperative learning", ieee access j., vol. 8, p. 65937, 2020. [79] m. m. samy, s. barakat and h. s. ramadan, "a flower pollination optimization algorithm for an off-grid pv-fuel cell hybrid renewable system", int. j. hydrog. energy, vol. 44, no. 4, pp. 2141–2152, 2019. [80] n. sinsuphan, u. leeton and t. kulworawanichpong, "optimal power flow solution using improved harmony search method," appl. soft comput. j., vol. 13, no. 5, pp. 2364–2374, 2013. [81] s. shabir and r. singla, "a comparative study of genetic algorithm and the particle swarm optimization", int. j. electr. eng., vol. 9, no. 2, pp. 215–223, 2016. [82] m. h. shwehdi, m. a. alaqil and s. mohamed, "emf analysis for a 380 kv transmission ohl in the vvicinity of buried pipelines", ieee access j., vol. 8, pp. 3710–3717, 2020. [83] r. djekidel and d. mahi, "capacitive interferences modelling and optimization between hv power lines and aerial pipelines", int. j. electr. comput. eng., vol. 4, no. 4, pp. 486–497, 2014. [84] m. samy and a. emam, "induced pipeline voltage nearby hybrid transmission lines", innovative syst. des. eng., vol. 8, no. 3, pp. 31–40, 2017. [85] r. djekidel, a. hadjadj and s. a. bessedik, "electrostatic and electromagnetic effects of hv overhead power line on above metallic pipeline", in proceedings of the 5th ieee international conference on electrical engineering, boumerdes, 2017, pp. 1–6. [86] k. b. adedeji, "effect of hvtl phase transposition on pipelines induced voltage", indones. j. electr. eng. inform., vol. 4, no. 2, pp. 93–101, 2016. [87] a. hellany, m. nassereddine and m. 
nagrial, "analysis of the impact of the ohew under full load and fault current", int. j. energy environ., vol. 1, no. 4, pp. 727–736, 2010. [88] m. nassereddine and a. hellany, "ac interference study on pipeline: the impact of the ohew under full load and fault current", in proceedings of the 2nd ieee international conference on computer and electrical engineering, dubai, 2009, pp. 497–501. [89] k. b. adedeji, a. a. ponnle, b. t. abe, a. a. jimoh, a. m. i. abu-mahfouz and y. hamam, "gui-based ac induced corrosion monitoring for buried pipelines near hvtls", eng. letters., vol. 26, no. 4, pp. 489–497, 2018. [90] m. vakilian, k. valadkhani, a. shaigan, a. nasiri and h. gharagozlo, "a method for evaluation and mitigation of ac induced voltage on buried gas pipelines", scientia iranica, vol. 9, no. 4, pp. 311–320, 2002.

facta universitatis series: electronics and energetics vol. 29, no 2, june 2016, pp. 285-296 doi: 10.2298/fuee1602285p

dielectric properties of la/mn codoped barium titanate ceramics

vesna paunović 1, vojislav mitić 1,2, miloš marjanović 1, ljubiša kocić 1
1 university of niš, faculty of electronic engineering, niš, serbia
2 institute of technical sciences of sasa, belgrade, serbia

abstract. la/mn codoped batio3 ceramics with various la2o3 content, ranging from 0.3 to 1.0 at% la, were investigated regarding their microstructure and dielectric properties. the content of mno2 was kept constant at 0.01 at% mn in all samples. la/mn codoped and undoped batio3 were obtained by a modified pechini method and sintered in air at 1300 °C for two hours. a homogeneous and completely fine-grained microstructure with an average grain size from 0.5 to 1.5 μm was observed in samples doped with 0.3 at% la. in highly doped samples, apart from the fine-grained matrix, local areas with secondary abnormal grains were observed. the dielectric properties were investigated as a function of frequency and temperature.
the dielectric permittivity of the doped batio3 was in the range of 3945 to 12846 and decreased with an increase of the additive content. the highest values of the dielectric constant at room temperature (εr = 12846) and at the curie temperature (εr = 17738) were measured for the 0.3 at% la doped samples. the dissipation factor ranged from 0.07 to 0.62. the curie constant (C), curie-weiss temperature (T0) and critical exponent (γ) were calculated using the curie-weiss and the modified curie-weiss law. the highest value of the curie constant (C = 3.27·10^5 K) was measured in the 1.0 at% la doped samples. the obtained values of γ ranged from 1.04 to 1.5, which pointed to a sharp phase transformation from the ferroelectric to the paraelectric phase.

key words: barium titanate, ceramics, dielectrical properties

received may 8, 2015; received in revised form october 20, 2015
corresponding author: vesna paunović, faculty of electronic engineering, university of niš, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: vesna.paunovic@elfak.ni.ac.rs)

1. introduction

barium titanate has attracted a considerable amount of attention over the years due to its excellent physical and electrical properties and numerous practical applications [1-3]. batio3 based ceramics are widely used for multilayer capacitors (mlccs), ptc thermistors, varistors, and dynamic random access memories (dram) in integrated circuits [4-6]. for mlcc applications, dielectric materials need to be electrically insulating and exhibit high permittivity values and low dielectric losses at room temperature. as overload protection devices, they are required to be semiconducting at room temperature and undergo a sharp rise in resistivity when heated above the ferro- to paraelectric phase transition temperature, Tc [7]. at room temperature, batio3 adopts a tetragonal perovskite structure and is a ferroelectric with high permittivity.
it transforms to the cubic, paraelectric state at the curie temperature, Tc, of 132 °C. also, undoped batio3 is electrically insulating at room temperature. the dielectric properties of batio3 depend on the synthesis method, density, grain size, and sintering procedure. consequently, there is considerable interest in the preparation of powders of high homogeneity and ceramics of high density and small grain size. homogeneous starting powders can be obtained by conventional solid state reaction, the oxalate precipitation method and the modified pechini process [8, 9]. the pechini method of preparation has the advantage of raising the permittivity of modified batio3, compared with samples obtained by conventional solid state sintering.

the electrical and dielectric properties of batio3 ceramics can be modified by using various types of additives, as well as processing procedures [10-12]. generally, ions with large radius and low valence, like la3+, ca2+, dy3+ and y3+, tend to enter the a sites (ba2+ sites), while ions with small radius and higher valence, like nb5+ and ta5+, favor the b sites (ti4+) [13-17]. substitution of the barium or titanium ion with small concentrations of ions with a similar radius can lead to structure and microstructure changes and, furthermore, modify the dielectric and ferroelectric properties. some of the dopants shift the transition temperature of batio3 or broaden the εr-T curve, and many of them cause diffuseness of the ferroelectric transition. the phase transition from the ferroelectric to the paraelectric phase can show either a sharp dielectric maximum or a diffuse dielectric maximum, which is characteristic of relaxor ceramics. according to literature data, partial substitution of ba or ti ions with dopants such as la, zn and sb causes the formation of a diffuse phase transition, high dielectric constant and low losses, while sn, ce, zr, bi and hf cause the appearance of ferroelectric relaxor behavior [17].
the addition of cazro3 to batio3 ceramics enhances the capacitance of the capacitor and reduces the curie temperature [18]. the dielectric behavior of nb5+ modified batio3 ceramics is governed by the presence of non-ferroelectric regions, which lower the dielectric constant. the shift of the curie temperature towards lower temperatures is attributed to the replacement of ba2+ with bi3+ [19]. the addition of sb inhibits grain growth, promotes the formation of a uniform microstructure and also increases the dielectric constant.

among the additives, lanthanum, la, is the most efficient in raising the dielectric permittivity of modified batio3 ceramics [20-22]. la as a donor dopant decreases the grain size and enhances the dielectric constant. in la doped ceramics the curie temperature was shifted towards lower temperatures and the dielectric constant values were much higher than in pure batio3. also, it was found that the dielectric losses decrease with the addition of la to batio3. at higher concentrations of la, the dielectric maximum was broadened. a relaxor-type frequency dependence of permittivity was also found in batio3.

the substitution of la3+ on the ba2+ sites requires the formation of negatively charged defects. there are three possible compensation mechanisms: barium vacancies ($V_{Ba}''$), titanium vacancies ($V_{Ti}''''$) and electrons ($e'$) [23-25].

small additions of lanthanum (< 0.5 at%), which replace the ba ions, lead to the formation of a bimodal microstructure and n-type semiconductivity, which has been widely believed to occur via an electronic compensation mechanism, if the samples are heated in a reducing or argon atmosphere.

$$\mathrm{La_2O_3 + 2TiO_2 \rightarrow 2La_{Ba}^{\bullet} + e'} \qquad (1)$$

in heavily doped samples (≥ 0.5 at%) sintered in air atmosphere, which are characterized by a small grained microstructure, a high insulation resistance and life stability of the multilayer capacitors can be achieved.
the principal doping mechanism is the ionic compensation mechanism (titanium vacancy compensation mechanism).

$$\mathrm{La_2O_3 + 3TiO_2 \rightarrow 2La_{Ba}^{\bullet} + V_{Ti}'''' + 3Ti_{Ti} + 9O_O} \qquad (2)$$

at low partial pressure of oxygen the characteristic mechanism is electronic compensation, while at high partial pressure it is ionic compensation.

mno2 is frequently added to batio3, together with other additives, in order to reduce the dissipation factor. manganese has a double role. as an acceptor dopant incorporated at ti4+ sites, it can be used to counteract the effect of the oxygen vacancy donors. as an additive segregating at grain boundaries, it can prevent exaggerated grain growth. manganese is a valence-unstable acceptor-type dopant, which may take different valence states, mn2+, mn3+ or even mn4+, during the post-sintering annealing process. mn2+ is stable in the cubic phase and is easily oxidized to the mn3+ state, which is more stable in the tetragonal phase. for codoped systems [26-28], the formation of donor-acceptor complexes such as $2[\mathrm{La_{Ba}^{\bullet}}]\text{-}[\mathrm{Mn_{Ti}''}]$ prevents a valence change from mn2+ to mn3+.

generally, in codoped batio3 ceramics, the controlled incorporation of a donor dopant, such as la, in combination with an acceptor (mn) leads to the formation of ceramics with a uniform microstructure and a high dielectric constant at room temperature as well as at the curie temperature. the codoped ceramics showed lower dielectric losses compared to the undoped ceramics. another reason for using modified batio3 is that the additives shift the curie temperature into a temperature range that can be used effectively, significantly below 132 °C.

the purpose of this paper is to study the dielectric properties of la/mn codoped batio3 ceramics, obtained by the pechini method, as a function of different dopant concentrations. the curie-weiss and modified curie-weiss law were used to clarify the influence of the dopant on the dielectric properties of batio3.

2.
experiments and methods

the la/mn codoped batio3 ceramics were prepared from an organometallic complex based on the modified pechini procedure [9], starting from barium and titanium citrates. this method provides a low-temperature powder synthesis process (below 800 °C), good stoichiometry and easy incorporation of dopants in the crystal lattice. the content of the additive oxide, la2o3, ranged from 0.3 to 1.0 at%. the content of mno2 was kept constant at 0.01 at% in all samples. for comparison purposes, samples free of la and mn were prepared in the same manner.

the modified pechini process was carried out as a three stage process for the preparation of a polymeric precursor resin. solutions of titanium citrate and barium citrate were mixed and heated at 90 °C, and then la and mn were added. the temperature was raised to 120–140 °C to promote polymerization and remove the solvents. the decomposition of most of the organic carbon residue was performed in an oven at 250 °C for 1 h and then at 300 °C for 4 h. thermal treatment of the obtained precursor was performed at 500 °C for 4 h, 700 °C for 4 h and 800 °C for 2 h. after drying at room temperature and passing through a sieve, the barium titanate powder was obtained. the powders were isostatically pressed at 98 MPa into disks 10 mm in diameter and 2 mm thick. the samples were sintered in air atmosphere at 1300 °C for 2 h with a heating rate of 10 °C/min.

the bulk density was measured by the archimedes method. the specimens are denoted, for example, 0.3 la/mn-batio3 for the specimen with 0.3 at% la and 0.01 at% mn, and so on. the microstructures of the as-sintered or chemically etched samples were observed by a scanning electron microscope jeol-jsm 5300 equipped with an eds (qx 2000s) system. capacitance and dissipation factor were measured using an agilent 4284a precision lcr meter in the frequency range from 20 Hz to 1 MHz.
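The text does not state how the dielectric constant was derived from the measured capacitance. A minimal sketch, assuming the usual parallel-plate relation εr = C·d/(ε0·A) and, as a simplification, the as-pressed disk dimensions quoted above (10 mm diameter, 2 mm thickness; sintering shrinkage and the actual electrode area are ignored), could look as follows. The function names are hypothetical and are not taken from the paper.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m

def disk_area(diameter_m):
    """Area of a circular electrode covering the full disk face."""
    return math.pi * (diameter_m / 2.0) ** 2

def relative_permittivity(capacitance_f, diameter_m=10e-3, thickness_m=2e-3):
    """Parallel-plate estimate (assumption): eps_r = C * d / (eps0 * A)."""
    return capacitance_f * thickness_m / (EPS0 * disk_area(diameter_m))

def capacitance_for(eps_r, diameter_m=10e-3, thickness_m=2e-3):
    """Inverse relation: capacitance expected for a given eps_r."""
    return eps_r * EPS0 * disk_area(diameter_m) / thickness_m

# Capacitance that would correspond to eps_r = 12846 for the assumed geometry.
c_example = capacitance_for(12846)
```

Under these assumptions, a permittivity of 12846 corresponds to a capacitance of roughly 4.5 nF, which lies well within the range of a standard lcr meter.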
the variation of the dielectric permittivity with temperature was measured in the temperature interval from 20 to 180 °C. the dielectric parameters such as the curie-weiss temperature (T0), curie constant (C) and critical exponent γ were calculated according to the curie-weiss and modified curie-weiss law.

3. microstructure characteristics

the relative density of the la/mn codoped samples varied from 90% to 95% of the theoretical density (td), depending on the amount of additives, being lower for higher dopant concentrations. the main characteristic of the low doped samples, i.e. the samples doped with 0.3 at% of la, is a completely fine-grained and homogeneous microstructure with a fairly narrow size distribution. the grain sizes ranged from 0.5 to 1.5 μm (fig. 1a) and there was no evidence of any secondary abnormal grain growth. with an increase of the additive content, the microstructure of the specimens doped with 0.5 at% of la showed quite significant grain growth with varied grain size. besides a small amount of 1 μm grains, most of the grains were approximately 3-8 μm (fig. 1b).

fig. 1 sem images of la/mn codoped batio3, a) 0.3 at% la and b) 0.5 at% la.

the microstructure evolution in the samples doped with 1.0 at% of la was quite different from that observed in the other samples. in 1.0 at% la doped samples, apart from the fine-grained matrix with a grain size of 2-3 μm, some local areas with secondary abnormal grains (fig. 2a) were observed. the size of the secondary abnormal grains was in the range 10-15 μm. for undoped batio3 ceramics (fig. 2b), the microstructure was characteristically non-uniform, with a grain size distribution from 1 to 15 μm.

fig. 2 sem images of a) 1.0 at% la/mn codoped batio3 and b) undoped batio3 ceramics.
the difference in microstructural features is also associated with the inhomogeneous distribution of la, as can be seen in the eds spectra taken from different areas in the same sample (fig. 3). the existence of x-ray peaks for lanthanum (l-la peak) in the eds spectrum of the 1.0 at% doped sample indicates that la-rich regions coexist with the nominal perovskite phase. it is worth noting that concentrations less than 1.0 at% could not be detected by the eds attached to the sem, unless an inhomogeneous distribution or segregation of the additive was present. the la-rich regions are associated with the small grained microstructure, whereas the eds spectrum free of la corresponds to the abnormal grains. also, the eds analysis did not reveal any mn content, thus a homogeneous distribution of mn throughout the specimens can be assumed.

fig. 3 sem/eds images of 1.0 la/mn codoped batio3.

3. dielectrical characteristics

all la/mn doped samples that were investigated are electrical insulators with an electrical resistivity ρ of about 10^8 Ω·cm at room temperature. the high resistivity indicates that the ionic compensation mechanism (titanium vacancy compensation mechanism) is exclusively involved during the la incorporation into the batio3 matrix and, due to the immobility of cation vacancies at room temperature, the doped samples remain insulating.

the observed microstructural characteristics, which depend on the type and concentration of additive, have a direct influence on the dielectric properties. the dielectric properties of batio3 ceramics (dielectric permittivity εr and dissipation factor tan δ) were measured as a function of frequency and temperature. the dielectric constant was determined in the frequency range from 20 Hz to 1 MHz. after the initial high value at low frequency, the dielectric constant becomes nearly constant at frequencies greater than 10 kHz.
with an increase of additive content, the dielectric constant decreases. the highest value of the dielectric constant (εr = 12846) was measured for samples doped with 0.3 at% of la, characterized by a small-grained microstructure and high sintering density (fig. 4). the lowest value of the dielectric constant (εr = 5200) was measured for 1.0 at% la doped samples. for the undoped batio3 ceramic, the dielectric constant was 2230, and for these samples the dielectric constant was essentially independent of frequency.

fig. 4 dielectric constant of undoped and la/mn-batio3 ceramics as a function of frequency.

the dielectric loss (tan δ) values are in a wide range from 0.07 to 0.62 (fig. 5). the main characteristic of all doped specimens is that, after the initial high dielectric loss values, tan δ decreases and is nearly independent of frequency above 20 kHz. the highest value of tan δ, and a considerable change of tan δ vs. frequency from 0.61 to 0.2, were recorded in 0.3la/mn doped batio3 ceramics.

fig. 5 the dielectric losses as a function of frequency for undoped and la/mn-batio3 ceramics.

the dielectric properties of batio3 ceramics can also be analyzed through the permittivity-temperature dependence (fig. 6). the variation of the dielectric constant as a function of temperature clearly displays the effects of additive content and microstructural composition on the dielectric properties. the highest value of the dielectric constant at room temperature (εr = 12846) and at the curie temperature (εr = 17738) was measured for the 0.3la/mn codoped batio3 samples, which are also characterized by a small grained and uniform microstructure and high density.

fig. 6 dielectric constant of batio3 ceramics as a function of temperature.

with an increase of additive content the dielectric constant decreases.
for the samples doped with 1.0 at% la, the dielectric constant at room temperature is 3945 and at the curie temperature is 8270. the variations in dielectric constant between low and heavily codoped la/mn ceramics, sintered at the same temperature, can be attributed on the one hand to the different density (density decreases with an increase of additive content) and, on the other hand, to the presence of a la-rich phase and the formation of secondary abnormal grains, which lead to a decrease in the dielectric permittivity.

in general, a pronounced permittivity-temperature response and a sharp phase transition from the ferroelectric to the paraelectric phase at the curie temperature are observed for all doped batio3 samples and for undoped batio3. this can be seen from the ratio of the permittivity at the curie point (εrmax) to that at room temperature (εrmin), i.e. εrmax/εrmin, which has a value of 1.38 for the 0.3 at% doped samples, 1.7 for the 0.5la/mn doped samples, and 2.09 for the 1.0 at% doped batio3.

the curie temperature (Tc) of the codoped samples is shifted towards lower temperatures compared to undoped batio3 ceramics, for which the curie temperature is 134 °C. for doped samples, Tc ranged from 110 °C for 0.3la/mn batio3 to 122 °C for 1.0la/mn batio3 ceramics (table 1). the shift of the curie temperature for the codoped ceramics was heavily dependent on the donor/acceptor ratio. in the 0.3la/mn-batio3 ceramics the donor/acceptor ratio is 30, and in 1.0la/mn it is 100. with increasing la concentration and the formation of donor-acceptor complexes $2[\mathrm{La_{Ba}^{\bullet}}]\text{-}[\mathrm{Mn_{Ti}''}]$, the possibility of oxidation of mn2+ to the mn3+ and mn4+ states was reduced, so the influence of mn on the shift in curie temperature in 1.0la/mn batio3 ceramics was smaller.

fig. 7 reciprocal value of εr as a function of temperature.
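The ratios quoted above can be checked directly against the permittivity values listed in table 1. The short sketch below is only an arithmetic illustration; the dictionary keys and function names are hypothetical labels, and the donor/acceptor ratio is assumed to be the simple quotient of the nominal la and mn contents in at%.

```python
# (eps_r at room temperature, eps_r at the Curie temperature), from table 1
SAMPLES = {
    "0.3la/mn": (12846, 17738),
    "0.5la/mn": (6550, 11196),
    "1.0la/mn": (3945, 8270),
}

MN_AT_PERCENT = 0.01  # constant acceptor content in all codoped samples

def peak_ratio(eps_room, eps_curie):
    """Ratio eps_rmax / eps_rmin used as a measure of the transition sharpness."""
    return eps_curie / eps_room

def donor_acceptor_ratio(la_at_percent, mn_at_percent=MN_AT_PERCENT):
    """Nominal donor/acceptor ratio (assumed to be la content over mn content)."""
    return la_at_percent / mn_at_percent

ratios = {name: peak_ratio(*values) for name, values in SAMPLES.items()}
for name, value in ratios.items():
    print(f"{name}: eps_rmax/eps_rmin = {value:.3f}")
```

The computed ratios agree with the quoted 1.38 and 1.7; for the 1.0 at% sample the quotient 8270/3945 is approximately 2.096, so the quoted 2.09 appears to be truncated rather than rounded.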
all specimens have a sharp phase transition and follow the curie-weiss law:

$$\varepsilon_r = \frac{C}{T - T_0} \qquad (3)$$

where $C$ is the curie constant and $T_0$ the curie-weiss temperature, which is close to the curie temperature. the curie-weiss temperature ($T_0$) was obtained from the linear extrapolation of the inverse dielectric constant at temperatures above $T_c$ down to zero (fig. 7). the curie-weiss temperature decreased with an increase of additive concentration. the curie constant ($C$) was obtained by fitting the plot of the inverse dielectric constant vs. temperature; it is determined by the slope of this curve for data above $T_c$. with an increase of dopant amount, the curie constant increased. the highest value of $C$ ($C = 3.27 \cdot 10^5$ K) was measured for the 1.0 at% la doped samples. the value of the curie constant is related to the grain size and porosity of the samples. the curie constant for the undoped batio3 ceramic is $C = 2.12 \cdot 10^5$ K. the curie constant and the curie-weiss temperature values are given in table 1.

in order to investigate the curie-weiss behavior, the modified curie-weiss law was used [29]:

$$\frac{1}{\varepsilon_r} - \frac{1}{\varepsilon_{r\max}} = \frac{(T - T_{\max})^{\gamma}}{C'} \qquad (4)$$

where $\varepsilon_r$ is the dielectric constant, $\varepsilon_{r\max}$ the maximum value of the dielectric constant, $T_{\max}$ the temperature at which the dielectric constant has its maximum, $\gamma$ the critical exponent for diffuse phase transformation (dpt) and $C'$ the curie-weiss-like constant. the dielectric parameters for undoped and doped batio3 ceramics, together with the values calculated according to the modified curie-weiss law, are given in table 1.
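The two fitting procedures described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' actual procedure: it uses an ordinary least-squares line, the function names are hypothetical, and the data are synthetic points generated from eqs. (3) and (4). The values C = 3.27·10^5 K, T0 = 87.1 °C, Tmax = 122 °C and εrmax = 8270 are borrowed from table 1 only to make the synthetic data plausible, while γ = 1.5 and C' = 10^6 are arbitrary illustrative choices.

```python
import math

def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def curie_weiss_fit(temps, eps):
    """Fit 1/eps_r = (T - T0) / C to data above Tc; returns (C, T0)."""
    slope, intercept = linear_fit(temps, [1.0 / e for e in eps])
    return 1.0 / slope, -intercept / slope

def modified_curie_weiss_fit(temps, eps, t_max, eps_max):
    """Fit ln(1/eps_r - 1/eps_max) = gamma * ln(T - T_max) - ln(C').

    Returns (gamma, C'); only points with T > T_max may be passed in.
    """
    xs = [math.log(t - t_max) for t in temps]
    ys = [math.log(1.0 / e - 1.0 / eps_max) for e in eps]
    gamma, intercept = linear_fit(xs, ys)
    return gamma, math.exp(-intercept)

# Synthetic check of the Curie-Weiss fit (temperatures in deg C above Tc).
temps = [130.0, 140.0, 150.0, 160.0, 170.0, 180.0]
eps_cw = [3.27e5 / (t - 87.1) for t in temps]
c_fit, t0_fit = curie_weiss_fit(temps, eps_cw)

# Synthetic check of the modified law with illustrative gamma and C'.
eps_mod = [1.0 / (1.0 / 8270 + (t - 122.0) ** 1.5 / 1.0e6) for t in temps]
gamma_fit, c_prime_fit = modified_curie_weiss_fit(temps, eps_mod, 122.0, 8270)
```

Because only temperature differences enter both laws, the fits can be carried out with temperatures in °C while C and C' keep their kelvin-based units. With measured data, the choice of the temperature window above Tc would affect the fitted values, which this sketch does not address.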
table 1 dielectric parameters for undoped and la/mn codoped batio3 samples

sample          εr at 300 k   εr at tc   tan δ at 300 k   tc [°c]   t0 [°c]   c [10^5 k]   γ
pure batio3     2230          5488       0.067            134       101.1     2.12         1.402
0.3la/batio3    12846         17738      0.610            110       106.9     1.95         1.509
0.5la/batio3    6550          11196      0.248            118       94.8      2.67         1.044
1.0la/batio3    3945          8270       0.177            122       87.1      3.27         1.536

the critical exponent of the nonlinearity γ was calculated from the best fit of the curve ln(1/εr − 1/εm) vs. ln(t − tm), as shown in fig. 8. the critical exponent γ represents the slope of the curve. for a single batio3 crystal, γ is 1.08, and it gradually increases up to 2 for diffuse phase transformation in doped batio3.

fig. 8 the modified curie-weiss plot ln(1/εr − 1/εm) vs. ln(t − tm) for batio3 samples. the slope of the curve determines the critical exponent γ.

as can be seen in fig. 8, the critical exponent γ is in the range from 1.044 to 1.536, which is in agreement with the experimental data. these samples are characterized by a sharp phase transition from the ferroelectric to the paraelectric phase at the curie point. the highest value of the critical exponent (γ = 1.536) was calculated for the 1.0 at% la/mn doped samples.

4. conclusion

the dielectric properties of la/mn codoped ceramics depend heavily on the additive concentration and on the microstructure obtained during sintering. all samples have a resistivity of 10^8 Ωcm and are electrical insulators at room temperature. the highest values of the dielectric constant at room temperature (εr = 12846) and at the curie temperature (εr = 17738) were measured for the 0.3 at% la/mn doped ceramics. this composition displayed a high density and a small-grained microstructure. with an increase of the additive content, the dielectric constant decreased; for the samples doped with 1.0 at% la, εr is 3945.
the differences in dielectric constant values between lightly and heavily doped batio3 are due, first, to the different density (porosity) of the doped ceramics and, second, to the presence of non-ferroelectric la-rich regions and secondary abnormal grains. the dielectric loss values span a wide range, from 0.07 to 0.62. after initially greater values at low frequency, tan δ decreases and is nearly independent of frequency above 20 khz. all specimens followed the curie-weiss law with a sharp phase transition. the curie temperature of doped batio3 ceramics was shifted towards lower temperatures compared with undoped batio3. the curie temperature values ranged from 110 °c for 0.3la/mn batio3 to 122 °c for 1.0la/mn batio3 ceramics. the curie constant increases with increasing additive content. the highest value of c (c = 3.27×10^5 k) was measured in samples doped with 1.0 at% la. the critical exponent γ is in the range from 1.044 to 1.536, which points to a sharp phase transformation from the ferroelectric to the paraelectric phase at the curie temperature.

acknowledgement: this research is a part of the project “directed synthesis, structure and properties of multifunctional materials” (172057). the authors gratefully acknowledge the financial support of serbian ministry of education, science and technological development for this work.

references
[1] h. kishi, n. kohzu, j. sugino, h. ohsato, y. iguchi, t. okuda, "the effect of rare-earth (la, sm, dy, ho and er) and mg on the microstructure in batio3", j. e. ceram. soc., vol. 19, pp. 1043-1046, 1999.
[2] lj. zivkovic, v. paunovic, n. stamenkov, m. miljkovic, "the effect of secondary abnormal grain growth on the dielectric properties of la/mn co-doped batio3 ceramics", science of sintering, vol. 38, pp. 273-281, 2006.
[3] m. vijatovic petrovic, j. bobic, t. ramoska, j. banys, b.
stojanovic, "electrical properties of lanthanum doped barium titanate ceramics, materials characterization, vol. 62, pp.1000-1006, 2011. [4] d.h. kuo, c.h. wang, w.p. tsai, "donor and acceptor cosubstituted batio3 for nonreducible multilayer ceramic capacitors", ceramics international, vol. 32, pp.1-5, 2006. [5] j. qi, z. gui, y. wang, q. zhu, y. wu, l. li, "ptcr effect in batio3 ceramics modified by donor dopant", ceramic international, vol. 28, pp.141-143, 2002. [6] m. wegmann, r. bronnimann, f. clemens, t. graule, "barium titanate-based ptcr thermistor fbers: processing and properties", sens. actuators a: phys., vol. 135 (2), pp. 394–404, 2007. [7] e. brzozowski, m.s. castro, "conduction mechanism of barium titanate ceramics", ceramics international, vol. 26, pp. 265-269, 2000. [8] w. caia, c. fu, z. lin, x. deng, w. jiang, "influence of lanthanum on microstructure and dielectric properties of barium titanate ceramics by solid state reaction", advanced materials research, vol. 412, pp. 275-279, 2012 [9] m.p.pechini, method of preparing lead and alkaline earth titanates and coating method using the same to form a capacitor, us patent no. 3.330.697, 1967. [10] a. ianculescu, z.v. mocanu, l.p. curecheriu, l. mitoseriu, l. padurariu, r. trusca, "dielectric and tunability properties of la-doped batio3 ceramics", journal of alloys and compounds, vol. 509, issue 41, pp. 10040–10049, 2011. [11] a.k.yadav, c.gautam, " dielectric behavior of perovskite glass ceramics", j. mater sci: materials in electronics, vol. 25, pp. 5165-5187, 2014. [12] a.k.yadav, c.gautam, "a review on crystallisation behaviour of perovskite glass ceramics", advances in applied ceramics, vol. 113 (4), pp.193-207, 2014. [13] e.j. lee, j. jeong, y.h. han, "defects and degradation of batio3 codoped with dy and mn", jpn. j. appl. phys. vol. 45, pp. 822-825, 2006. [14] s. m. park, y. h. han, "dielectric relaxation of oxygen vacancies in dy-doped batio3", journal of the korean physical society, vol. 
57, no. 3, pp. 458–463, 2010.
[15] k.j. park, c.h. kim, y.j. yoon, s.m. song, "doping behaviors of dysprosium, yttrium and holmium in batio3 ceramics", j.e.ceram.soc., vol. 29, pp. 1735-1741, 2009.
[16] s.m. bobade, d.d. gulwade, a.r. kulkarni, p. gopalan, "dielectric properties of a- and b-site doped batio3 (i): la- and al-doped solid solution", j. appl. phys., 97:074105, 2005.
[17] d. gulwade, p. gopalan, "dielectric properties of a- and b-site doped batio3: effect of la and ga", physica b, 404, pp. 1799–805, 2009.
[18] p.r. krishnamoorthy, p. ramaswamy, b.h. narayana, "cazro3 additives to enhance capacitance properties in batio3 ceramic capacitors", j. mater. sci. mater. electron., vol. 3, pp. 176–180, 1992.
[19] y. yuan, m. du, s. zhang, z. pei, "effects of binbo4 on the microstructure and dielectric properties of batio3-based ceramics", j. mater. sci. mater. electron., vol. 20, pp. 157–162, 2009.
[20] v. paunovic, l.j. zivkovic, v. mitic, "influence of rare-earth additives (la, sm and dy) on the microstructure and dielectric properties of doped batio3 ceramics", science of sintering, vol. 42, pp.
69–79, 2010. [21] w. li, z. xu, r. chu, p. fu, "structure and dielectric behavior of la-doped batio3 ceramics", adv. mater.res., vol. 105–106, pp. 252–254, 2010. [22] f.d. morrison, d.c. sinclair, a.r. west, "electrical and structural characteristics of lanthanum-doped barium titanate ceramics", j. appl. phys., vol 86, pp. 6355–6366, 1999. [23] r. zhang, j.f. li, d. viehland, "effect of aliovalent substituents on the ferroelectric properties of modified barium titanate ceramics: relaxor ferroelectric behavior", j.am.ceram.soc., vol.87, pp. 864-870, 2004. [24] f.d. morrison, a.m. coats, d.c.sinclair, a.r.west, "charge compensation mechanisms in la-doped batio3", j.europ.ceram.soc., vol. 6, no. 3, pp. 219-232, 2001. [25] f.d. morrison, d.c.sinclair, a.r.west, "doping mechanisms and electrical properties of la-doped batio3ceramics", int. j. inorg. mater., vol. 3, pp.1205–1210, 2001. [26] h. kishi, n. kohzu, y. iguchi, j. sugino, m. kato, h. ohasato, t. okuda, "occupation sites and dielectric properties of rare-earth and mn substituted batio3", j.europ.ceram.soc., vol. 21, pp. 1643-1647, 2001. [27] h. miao, m. dong, g.tan, y.pu, "doping effects of dy and mg on batio3 ceramics prepared by hydrothermal method", journal of electroceramics, vol. 16, pp. 297–300, 2006. [28] k.albertsen, d.hennings, o.steigelmann, "donor-acceptor charge complex formation in barium titanate ceramics: role of firing atmosphere", journal of electroceramics, 2:3, pp. 193-198, 1998. [29] k. uchino, s. namura, "critical exponents of the dielectric constants in diffuse-phase transition crystals", ferroelectrics letters, vol.44, pp. 55–61, 1982. instruction facta universitatis series: electronics and energetics vol. 28, n o 3, september 2015, pp. 495 505 doi: 10.2298/fuee1503495p capacitive methods for testing of power semiconductor devices  vaclav papež 1 , jiri hájek 1 , bedrich kojecký 2 1 department of electrotechnology, faculty of electrical engineering, czech technical university in prague. 
2 prague, czech republic

abstract. the electrical capacity of power semiconductor devices is an important parameter that can be used not only for testing the component itself but also in practical applications, e.g. in series-connected high-voltage devices. this paper first analyzes the theoretical voltage distribution on the bases of a polarized p-n junction, as well as the size of the capacity. the measurement of the voltage dependence of the capacity using the resonance principle is illustrated on samples of 4 kv and 6 kv thyristors. the agreement between the theoretical estimate of the capacity, the voltage dependence of the capacity measured by the resonance principle, and the dependence determined experimentally from the injected charge confirms the correctness of the applied procedures and assumptions.

key words: capacity, p-n junction, voltage dependence, series connection of devices.

1. introduction

most of the world's leading manufacturers of power semiconductor devices offer discrete rectifying elements (diodes/thyristors) with off-state and reverse voltages up to 6 kv or 7 kv. converters for higher voltages must therefore be constructed from series-connected devices. the devices for a series connection (a so-called high voltage stack) must be chosen according to the following rules:
• for static processes, the components must have "consistent" i-v characteristics; dissipation resistors are often used to make the voltage distribution uniform,
• for dynamic loading (in a frequency application), the commutating charge of the components must also be considered; therefore resistor dividers are often supplemented with capacitors.
knowing the voltage dependence of the dynamic capacity of reverse polarized devices can help to design and optimize series-connected high voltage stacks. in the following section, the distribution of charges and capacity between both bases of a polarized p-n junction will be theoretically described.
received february 24, 2015; received in revised form april 30, 2015
corresponding author: vaclav papež, department of electrotechnology, faculty of electrical engineering, czech technical university in prague, czech republic (e-mail: mnet.ok1vvp@atlas.cz)

the evaluation of the voltage dependence of the dynamic capacity of a reverse polarized junction can also serve as a non-destructive measurement method for evaluating some physical and technological parameters of the device material. the same method can be used to evaluate the quality of finished encapsulated devices, since it allows the real value of the electric field at the p-n junction, or the resistivity of the initial bulk silicon used for wafer processing, to be verified.

2. basic theoretical analysis

the p-n junction of a high-voltage silicon semiconductor device is generally created by a sufficiently long high-temperature (above 1200 °c) diffusion of acceptor atoms (al, b) into a single-crystal n-type si wafer with a typical resistivity on the order of 100 Ωcm. the p-n junction extends to a depth of 80 µm to 120 µm; the concentration profile of the dopants follows the complementary error function (erfc(x)) or a gaussian distribution. under reverse polarization at a constant applied voltage, a constant reverse current formed by the so-called diffusion and recombination components flows through the structure [1], [2]. at room temperature and a voltage on the order of 1.0 kv, the reverse current is on the order of 1 µa. this assumption holds for the 2” devices (both diodes and thyristors) used in the experiments. the distribution of the electric field in the individual layers is described by poisson's equation

$$\frac{dE}{dx} = \frac{q N(x)}{\varepsilon} \qquad (1)$$

where $E$ is the electric field, $N$ is the density of electrically charged dopants, and $\varepsilon$ is the permittivity of si.
in the wide n base, the density of donors $N_d$ is constant and the voltage distribution $V_n$ has a simple shape

$$V_n(x) = \frac{q N_d}{\varepsilon_0 \varepsilon_r}\left(x\,d_n - \frac{x^2}{2}\right) \qquad (2)$$

where $d_n$ is the width of the space charge region (scr) in the n base. an exact solution of poisson's equation for the adjacent p layer is much more complicated. for the further considerations, however, the essential quantity is the voltage $V_p$ on this layer. the layer inherently determines the maximum allowable voltage on the n layer, and thus the total reverse voltage of the junction structure

$$V_t = V_n + V_p \qquad (3)$$

the requirement that the charge $Q$ on the adjacent layers be equal represents another output of poisson's equation

$$Q_n = Q_p = S\sqrt{2\,\varepsilon_0 \varepsilon_r\, q N_d V_n} \qquad (4)$$

because the reverse currents in the stationary mode are low, practically usable values of $Q$ can be obtained only by numerically integrating the time course of the current flowing through the structure under pulse loading at a sufficiently high frequency (on the order of 100 hz), or by numerically integrating the charging current of a parametric capacitor representing the monitored junction. for a sinusoidal load, the charge pumped during one half-cycle, substituted into expression (4), determines the part of the total voltage that falls on the n layer. after further substitution into expression (3), we can compute the otherwise unmeasurable voltage $V_p$ on the p layer

$$V_p = V_t - \frac{Q^2}{2 S^2 \varepsilon_0 \varepsilon_r\, q N_d} \qquad (5)$$

the differential (measured) capacity of the n layer is

$$C_n = S\sqrt{\frac{\varepsilon_0 \varepsilon_r\, q N_d}{2 V_n}} \qquad (6)$$

from the formula for the total capacity $C_t$ of the adjacent layers in series

$$C_t = \frac{C_n C_p}{C_n + C_p} \qquad (7)$$

we can determine the dependence of $C_p$ as a function of the voltage on the individual layers.
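as a rough illustration of how expressions (3) to (7) fit together, the sketch below computes the voltage and capacity split between the two layers from an injected charge. the area, applied voltage, and charge echo the values quoted for the symmetrical thyristors in the conclusions; the donor density `n_d`, the relative permittivity of 11.7 for si, and the function names are assumptions made only for this illustration. the shortcut cn = q/(2 vn) used as a consistency check follows from combining (4) and (6).

```python
import math

Q_E = 1.602176634e-19      # elementary charge [C]
EPS_0 = 8.8541878128e-12   # vacuum permittivity [F/m]
EPS_R_SI = 11.7            # relative permittivity of silicon (assumed)


def n_layer_voltage(charge, area, n_d):
    """Solve expression (4), Q = S*sqrt(2*eps0*epsr*q*Nd*Vn), for Vn."""
    return charge ** 2 / (2.0 * area ** 2 * EPS_0 * EPS_R_SI * Q_E * n_d)


def n_layer_capacity(v_n, area, n_d):
    """Expression (6): differential capacity of the n layer."""
    return area * math.sqrt(EPS_0 * EPS_R_SI * Q_E * n_d / (2.0 * v_n))


def series_capacity(c_n, c_p):
    """Expression (7): total capacity of the two layers in series."""
    return c_n * c_p / (c_n + c_p)


def p_layer_capacity(c_t, c_n):
    """Expression (7) solved for Cp, given the total capacity and Cn."""
    return c_t * c_n / (c_n - c_t)


area = 16e-4       # junction area: 16 cm^2 expressed in m^2
v_t = 6000.0       # applied voltage [V]
charge = 2.5e-6    # injected charge [C]
n_d = 1.5e19       # donor density [m^-3], hypothetical value

v_n = n_layer_voltage(charge, area, n_d)
v_p = v_t - v_n                      # expressions (3) and (5)
c_n = n_layer_capacity(v_n, area, n_d)
print(f"Vn = {v_n:.0f} V, Vp = {v_p:.0f} V, Cn = {c_n * 1e12:.0f} pF")
```

since the result depends strongly on the assumed donor density, the printed numbers should be read as an order-of-magnitude check rather than as a reproduction of the measured values.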
if the dependence of the capacity $C_t$ on the voltage $V_t$ applied to the junction is measured, then the charge accumulated in the junction capacities can be expressed as

$$Q(t) = \int_0^t C_t(V)\,\frac{dV(t)}{dt}\,dt \qquad (8)$$

the result of the integration does not depend on the course of the function $V(t)$. the voltage dependence of the charge accumulated in the junction capacities can be obtained by substituting the inverse function $t = f(V)$ into (8).

3. description of the samples

all the following experiments and measurements were carried out on two independent groups of thyristors. the two groups are manufactured by entirely different technologies but are intended for similar applications (phase control rectifiers, “f” housing puck design). each group of thyristors contained five samples taken from one production batch. the first group contains samples of phase controlled rectifiers (pcr) with a reverse and off-state voltage of 4 kv and a diameter of 2". these pcrs were made by soldering technology, in which the si wafer is soldered with a 30 µm thick alsi film onto a molybdenum substrate of the same diameter (53 mm). the thickness of the mo disc base is 1.2 mm; the thickness of the si substrate is about 800 µm, and soldering to the anode side takes place in vacuum at about 700 °c. the required off-state voltage of 4 kv allows the use of a simple two-layer positive and negative bevelling (at an angle of about 30°) from the cathode side. the acid-etched bevelling (a solution of hf and hno3) is protected by a conventional silicone gel, hipac q1-9205. the active area of the blocking junction at the cathode side of the thyristor is about 1700 mm²; the active area of the reverse junction at the anode side is about 2150 mm². the active area of the cathode is coated with a layer of vapor-deposited contact metal (aluminum). the design of the pcr uses a built-in amplifying gate. a simplified cross-section of the thyristor (without housing) is shown in fig. 1, left.
fig. 1 cross-section of the non-symmetrical structure of a 4 kv pcr (left) and of the 6 kv structure processed by strictly symmetrical free-floating technology (right). labels in the drawing: molybdenum disc (diameter 53 mm, thickness 1.2 mm); si wafer, 800 µm thick (left) and 1350 µm thick (right); radii r = 26.5 mm and r = 22 mm; al layers; silicone gel.

the second group of samples contains phase controlled rectifiers with a reverse and off-state voltage of 6 kv and a diameter of 2". this pcr uses free-floating technology and a very thick (up to 1350 µm) si wafer. the thyristor is loosely mounted between two dilatation mo discs with a thickness of about 1 mm. the si wafer is edged on both sides; two two-layer negative bevellings are used again. a thick protective layer of the silicone rubber hipac q1-9205 is applied on the etched bevelling. this thyristor also uses the design with the amplifying gate. the active areas of the blocking and reverse junctions of the thyristor are approximately the same, about 1600 mm², and are coated with a thin layer of vapor-deposited aluminum. a simplified cross-section of the thyristor (without housing) is shown in fig. 1, right.

4. provided measurements

4.1 measurement of voltage properties

the dc reverse and off-state i-v characteristics were measured by a dc method using a high voltage power supply sz 10/2. the power supply was controlled by a computer program in a voltage range of 0-6 kv or 0-8 kv, with a current limit of 2 ma. the dc voltage was applied at a dv/dt rate of 1 kv/s in both polarities. the gate terminal of the tested pcr was left open. the characteristics were measured in a short time (6-8 seconds), so the influence of the temperature increase was negligible given the low power loss. within the achieved measurement accuracy, the samples in each group of thyristors behaved nearly identically. the results presented here and below were always obtained for a single sample.
the measured values were not altered by any statistical processing.

fig. 2 typical dc reverse and off-state i-v characteristics of both groups of samples (panels: 4 kv pcr, 6 kv pcr; axes: v (kv), i (ma); curves: reverse polarity, off-state polarity).

4.2 measurement of the charge

the first way of measuring the injected charge is based on a measurement with the voltage analyzer schuster sml 698. the device uses a pulse method [3]. the measured waveforms of the reverse/off-state voltage (v) and of the injected capacity current (i) are shown in figure 3. the chosen waveform corresponds to a half period of a 50 hz sinusoidal voltage. the applied voltage was lower than the breakdown voltage during the entire measurement; the breakdown voltage was measured by the dc method described in the previous section.

fig. 3 pulse measurement at the analyzer schuster sml 698: reverse and off-state voltage and current waveforms (4 kv pcr left; 6 kv pcr right; axes: t (ms), v (kv), i (ma)).

the measured values of the capacity current were integrated numerically over the interval from zero to the maximum applied voltage (5 ms). in this way, the dependence of the accumulated charge injected from the area of the expanding p-n junction on the externally applied voltage was obtained. the injected charge is shown in figure 4 as the waveform “sml 698”. another method of determining the injected charge is based on the dependence of the parametric p-n junction capacity on the applied voltage. this method is described below.
the injected charge obtained using this method is shown in figure 4 as the waveform “dynamic capacity”.

fig. 4 experimentally and numerically determined injected charges from a p-n junction area (4 kv pcr left; 6 kv pcr right; axes: v (kv), q (nc); curves: sml 698 and dynamic capacity, shown for off-state and reverse polarity in the 4 kv panel).

4.3 measurement of capacity of p-n junction under reverse bias voltage

the measurement of the capacity of a reverse polarized semiconductor p-n junction is a methodology commonly used in manufacturing monitoring and testing of semiconductor devices. the measurement is very simple in principle, see fig. 5. in the ideal case, a capacitance meter (an ac rlc meter) and a low-power regulated dc power supply delivering the required bias voltage are sufficient. the measured p-n junction is biased by a reverse voltage from the dc power source through the impedance z1, which is chosen so that only a negligible part of the measuring current passes through it and, at the same time, the dc voltage drop between the source and the measured junction is small. in usual measurements at a frequency of units to tens of khz and at a reverse current of units to tens of μa, a resistor with a resistance from several hundreds of kω to units of mω is used as the decoupling impedance. the rlc meter is separated from the dc bias circuit by the capacitor c. the capacity of this capacitor must be chosen much higher than the maximum measured capacity, so that the measurement results need not be corrected. the capacitor must also withstand the maximum dc supply voltage without being damaged.
if the capacitor cannot be charged by the current passing through the rlc meter, the charging current is carried by the impedance z2, which must satisfy the same requirements as the impedance z1.

fig. 5 block diagram of the connection between an rlc analyzer and the investigated p-n junction (dut) (blocks: rlc meter, overvoltage protection, z2, c, z1, dut, dc source).

the measuring circuit can be supplemented with overvoltage protection circuits, which must be designed to have a minimum effect on the measured capacity and to prevent the overvoltage from penetrating to the rlc meter. however, their protection effectiveness is usually not high. if the measured p-n junction breaks down during high-voltage measurements, the overvoltage protection circuits are usually unable to protect the rlc meter. the most serious drawback of the measuring circuit is that the separation of the high-voltage biasing circuit from the measuring part of the rlc circuit is only virtual. any rapid change of voltage in the high-voltage part of the circuit is transmitted to the measuring part, in the worst case at the full voltage level. a breakdown, an avalanche process in the measured junction, or an imperfect contact in the circuit of the measured junction between the decoupling capacitor and the impedance meter gives rise to an overvoltage at the rlc meter terminals, which usually destroys the rlc meter when measuring at voltages greater than several tens of volts. for our measurements, we used a new measuring circuit design in which the p-n junction whose capacity is being measured is inserted into a resonance circuit. the resonance frequency of the circuit is evaluated, and the sought p-n junction capacity is determined from its value. the resonance frequency of a circuit with inductance $L$ and capacitance $C$ is expressed by the formula

$$f = \frac{1}{2\pi\sqrt{LC}} \qquad (9)$$

it must be considered that the capacity of the resonance circuit is not determined exclusively by the measured p-n junction capacity $C_m$, but also by the coil self-capacity and the connection capacity, together denoted $C_c$, and by the capacity $C_s$ of the decoupling capacitor that is connected in series with the measured capacity. expression (10) holds for the capacity $C$, which must also be considered when evaluating the measured capacity

$$C = \frac{C_m C_s}{C_m + C_s} + C_c \qquad (10)$$

the frequency of the resonance circuit is easily evaluated by using the resonance circuit as the frequency-controlling circuit of an oscillator and measuring the operating frequency of the oscillator, as shown in figure 6.

fig. 6 resonance method of measuring the capacity of a biased p-n junction (elements: cs, rs, dut, dc source, l, oscillator, counter).

both the inductance and the resonance circuit can also be utilized for the construction of a special measuring circuit. the coil, which has a minimal dc resistance, can be used to build an effective decoupling circuit that reduces the penetration of overvoltage into the measuring circuit. the resonance circuit works as a narrowband filter which strongly inhibits the penetration of the energy of potential avalanche processes and discharges in the high-voltage part of the circuit into other circuits. in the resonance circuit, the coil itself and any further decoupling capacitor are chosen to withstand high voltage. the operating frequency of the oscillator is evaluated, and from its value the sought p-n junction capacity is determined either by computation according to expressions (9) and (10) or automatically. if a processor is used as the frequency meter, the measured capacity can be evaluated automatically. such an evaluation can be done easily, e.g., by reading the sought values from a table of precalculated results. to reach the maximum overvoltage protection of the measuring device, the oscillator can be designed with a vacuum tube as the active element.
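the computation according to expressions (9) and (10) can be sketched as follows. the component values (coil inductance, separating capacitor, stray capacity) and the function names are hypothetical and serve only to show the inversion from a counted frequency back to the junction capacity; they are not the values of the described apparatus.

```python
import math


def total_capacity_from_frequency(freq_hz, inductance_h):
    """Expression (9) solved for C: C = 1 / ((2*pi*f)^2 * L)."""
    return 1.0 / ((2.0 * math.pi * freq_hz) ** 2 * inductance_h)


def junction_capacity(c_total, c_series, c_stray):
    """Expression (10) solved for Cm, with C = Cm*Cs/(Cm + Cs) + Cc."""
    c_branch = c_total - c_stray          # series combination of Cm and Cs
    if not 0.0 < c_branch < c_series:
        raise ValueError("total capacity is inconsistent with Cs and Cc")
    return c_series * c_branch / (c_series - c_branch)


def resonance_frequency(c_m, c_series, c_stray, inductance_h):
    """Forward model: expressions (10) and (9) applied in sequence."""
    c_total = c_m * c_series / (c_m + c_series) + c_stray
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * c_total))


# Hypothetical circuit values, for illustration only.
L_COIL = 1.0e-3      # coil inductance [H]
C_S = 100.0e-9       # separating capacitor [F]
C_C = 20.0e-12       # coil self-capacity plus connections [F]

f_meas = resonance_frequency(300.0e-12, C_S, C_C, L_COIL)
c_total = total_capacity_from_frequency(f_meas, L_COIL)
c_m = junction_capacity(c_total, C_S, C_C)
print(f"f = {f_meas / 1e3:.1f} kHz, Cm = {c_m * 1e12:.1f} pF")
```

a lookup table, as mentioned in the text, would simply tabulate `junction_capacity` over the expected frequency range in advance.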
the energy sufficient to damage a vacuum tube is much higher than the energy sufficient to damage a semiconductor element [4]. a block diagram of the system for measuring the capacity of high-voltage semiconductor devices is shown in fig. 7.

fig. 7 block diagram of an apparatus for capacity measurement (elements: cs, rs, dut, hv supply, cc, l, va, tube oscillator, counter).

the measured p-n junction is represented by a diode (dut), see figure 7. one lead of the diode (the anode) is connected to the coil l of the resonant circuit, whereas the second lead (the cathode) is connected to the separating capacitor cs and the separating resistor. the capacity of the separating capacitor is usually relatively high. it is greater than the highest measured capacity, so that the sensitivity of the measuring device at the highest measured capacity is not diminished. the impedance of the resonance circuit transformed into the node between the measured p-n junction and the separating capacitor is small. the separating resistor has a high resistance so as not to damp the resonance circuit too much. the resistor also connects the high voltage supply (hv) to the measured p-n junction. the capacitor cc is not a physically existing component; the capacity cc represents the self-capacitance of the coil and the capacity of the connections, which must be considered when evaluating the measured capacity. this design also allows the alternation of the polarization voltage. the resonance circuit (consisting of dut, cc, l) is designed as the controlling resonance circuit of the vacuum-tube oscillator. the operating frequency of the oscillator determines the capacity of the measured p-n junction. the high-impedance terminal of the resonant circuit is the node between the measured p-n junction and the coil l. this terminal is connected to the control grid of the oscillating tube via the separating capacitor. the feedback is created from the second grid of the tube by a coupling coil.
the output signal from the oscillator is taken from a separating transformer in the anode circuit, so that the oscillator operates as a three-point oscillator of the meissner type with electron coupling. the operating frequency of the oscillator is evaluated by a simple digital frequency counter or a digital processor. this frequency can also be used for the automatic evaluation of the measured capacity. the described device allows the capacity to be measured in the range of 10 pf to 10 nf with an accuracy better than 1 %. at the same time, the measured object (p-n junction) is polarized by a dc voltage adjustable from a few tens of volts to 8 kv.

5. results of the measurements and their discussion

the typical measured dependence of the thyristor capacities on the externally applied voltage is shown in figure 8. in theory, the following equation can be used for the total junction capacity $C_t$

$$C_t^{2}\,V_n = \frac{1}{2}\,S^{2}\,q\,\varepsilon_0 \varepsilon_r N_d \qquad (11)$$

where $C_t$ is the total capacity of the diode and $V_n$ is the voltage distributed on the n base. from the measured waveforms shown in figure 4, the experimental equation for the total capacity $C_t$ can be derived relatively accurately

$$C_t^{2}\,V_t = k \qquad (12)$$

where $k$ is a general constant.

fig. 8 dependence of the sample capacities on the applied voltage (4 kv pcr left; 6 kv pcr right; axes: v (v), ct (pf); curves: reverse polarity, off-state polarity).

for a thyristor, there are generally two similar dependences, obtained with the thyristor polarized by the reverse voltage or by the off-state voltage; whether they coincide depends on whether the areas of the blocking junctions in the thyristor structure are the same or different for the two polarities. the measured dependences c-v and i-t were used to create the final (target) dependence q-v.
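the step from a measured c-v dependence to the q-v dependence can be sketched numerically. the example below assumes the idealized law (12) with a hypothetical constant `K` (chosen so that the total capacity is 200 pf at 6 kv) and integrates the capacity over voltage with the trapezoidal rule, which corresponds to expression (8) after the substitution t = f(v). the integration starts at 10 v rather than at zero because the idealized law diverges at zero voltage; this lower limit is an assumption of the sketch.

```python
import math


def charge_from_capacity(voltages, capacities):
    """Cumulative trapezoidal integral of C(V) dV; returns Q at each voltage."""
    charge = [0.0]
    for i in range(1, len(voltages)):
        dv = voltages[i] - voltages[i - 1]
        mean_c = 0.5 * (capacities[i] + capacities[i - 1])
        charge.append(charge[-1] + mean_c * dv)
    return charge


# Hypothetical constant of expression (12): Ct = 200 pF at Vt = 6 kV.
K = (200.0e-12) ** 2 * 6000.0

v_start, v_stop, steps = 10.0, 6000.0, 6000
voltages = [v_start + (v_stop - v_start) * i / steps for i in range(steps + 1)]
capacities = [math.sqrt(K / v) for v in voltages]   # Ct = sqrt(k / Vt)

q_curve = charge_from_capacity(voltages, capacities)
q_num = q_curve[-1]

# Closed form of the same integral: Q = 2*sqrt(k)*(sqrt(V) - sqrt(V0)).
q_ref = 2.0 * math.sqrt(K) * (math.sqrt(v_stop) - math.sqrt(v_start))
print(f"numerical Q = {q_num * 1e6:.3f} uC, closed form = {q_ref * 1e6:.3f} uC")
```

with measured data, `voltages` and `capacities` would be the points of the c-v curve, and the closed-form comparison would be replaced by the charge obtained from the integrated current.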
this final dependence was also used for mutual comparison. the sought q-v dependence was obtained by three different approaches:
• in the first case, the dependence was obtained as the time integral of the charging current.
• in the second case, the inverse dependence was obtained: the v-q dependence, i.e. the dependence of the voltage of the polarized p-n junction on the accumulated charge. here the measured voltage dependence of the junction capacity is used, as given by eq. (13).
• in the third case, the voltage dependence of the junction capacity was evaluated approximately: the total capacity of the junction was considered as a series connection of the two capacities $C_n$ and $C_p$ according to equation (7). the capacity $C_n$ was simply approximated by equation (6) as inversely proportional to the square root of $V_n$, and the capacity $C_p$ as inversely proportional to the cube root of $V_p$ (14).

$$V(n) = \sum_{i=1}^{n} \frac{Q(i)}{C(V)} \qquad (13)$$

$$C_p = \frac{k}{\sqrt[3]{V_p}} \qquad (14)$$

the dependence of the polarizing voltage of the device on the accumulated charge was determined in the same way as in the second case. the only difference was that the total capacity of the device was approximated by equation (15), for which the optimum values of the constants were found numerically.

$$C_t = \left(\frac{\sqrt{V_n}}{k_1} + \frac{\sqrt[3]{V_p}}{k_2}\right)^{-1} \qquad (15)$$

from the $C_n$ capacity values (6) mentioned in the previous section, some basic parameters of the samples can be calculated.

6. conclusions

the main contribution of this paper is the description of newly developed measuring equipment. this equipment is designed for measuring the capacity of a semiconductor p-n junction under high-voltage bias. using a standard rlc analyzer carries the risk of equipment damage due to voltage penetration into the analyzer. the principle of the described technical solution is the connection of the measured capacity (e.g. a p-n junction) to a resonance circuit.
the measured capacity is evaluated from the resonance frequency of the circuit or from the frequency of the oscillator. the advantage of the described solution is the use of a coil as a separating circuit element. the coil effectively prevents the penetration of surge voltage into the measuring circuit (rlc analyzer). the resonance circuit serves as a narrow band-pass filter heavily suppressing possible surge voltage. the coil and the other separating capacities are designed to withstand high voltage peaks. in previous works [5], only the properties of high voltage diodes were observed with the described equipment. for the diode samples with junction area s = 18 cm², for the applied voltage vt = 6 kV and an accumulated charge of 3 µC, the following values were determined: vn = 5.3 kV; cn = 294 pF; cp = 2900 pF. the space charge region extension in the n base and the maximum electric field intensity in the region can be specified as xmax = 680 µm and emax = 15.6 kV/mm. similar voltage and capacitance distributions between the p and n bases were obtained for the thyristor samples described herein:
▪ for the samples of symmetrical thyristors with junction area s = 16 cm², for the applied voltage vt = 6 kV in reverse and off-state polarity and an accumulated charge of 2.5 µC, the following values were determined: vn = 4.9 kV; cn = 257 pF; cp = 1400 pF. the space charge region extension in the n base and the maximum electric field intensity in the region can be specified as xmax = 630 µm and emax = 15.8 kV/mm.
▪ for the samples of soldered thyristors with junction area s = 17 cm², for the applied voltage vt = 4 kV in the reverse polarity and an accumulated charge of 3.1 µC, the following values were determined: vn = 3.4 kV; cn = 330 pF; cp = 4500 pF. the space charge region extension in the n base and the maximum electric field intensity in the region can be specified as xmax = 420 µm and emax = 16.2 kV/mm.
▪ for the samples of soldered thyristors with junction area s = 21 cm², for the applied voltage vt = 4 kV in the off-state polarity and an accumulated charge of 3.1 µC, the following values were determined: vn = 3.3 kV; cn = 410 pF; cp = 4500 pF. the space charge region extension in the n base and the maximum electric field intensity in the region can be specified as xmax = 400 µm and emax = 16.5 kV/mm.
these results correspond well to achievable si material parameters (emax 22 kV/mm), as well as to the technological parameters of the components.
acknowledgement: the authors would like to thank the company abb s. r. o. polovodiče, novodvorská street 138a/1768, prague, both for providing the thyristor samples and for access to the voltage measurements on the schuster sml 698 equipment. we would also like to thank ms. němcová and mr. bušek, both from fee ctu, for helping with the translation into english.
references
[1] b. j. baliga, modern power devices. new york: john wiley & sons, 1987.
[2] s. k. ghandhi, semiconductor power devices. new york: john wiley & sons, 1977.
[3] schuster elektronik gmbh, blocking voltage tester for power semiconductors sml 698, operating manual. www.schuster-elektronik.de. [on-line].
[4] v. papež, apparatus to measure capacitance of power high-voltage semiconductor devices, patent cz27126. www.upv.cz. [on-line].
[5] v. papež, j. hájek, b. kojecký, "complementary methods for a diagnostic evaluation of physical and electrical parameters of power silicon devices", in proceedings of isps'14, prague, 2014, pp. 111-116.
facta universitatis series: electronics and energetics vol. 34, no 4, december 2021, pp. 511-524 https://doi.org/10.2298/fuee2104511l © 2021 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper
evaluation of the magnetic field generated by a power cable in proximity of a joint bay: comparison between two different approaches
giovanni lucca
mermec ste, milano, italy
abstract.
the paper presents two different approaches for evaluating the magnetic flux density field produced by an underground power cable in proximity of areas where joint bays are present; as is known, in those areas the field levels are generally much higher than the ones generated along the ordinary route. the first, more rigorous 3d approach takes into account the actual geometry of the power cable conductors in the joint bay, while the second is based on a simplified 2d approach. the main result of the comparison is that the 2d approach, even at short lateral distances from the cable, overestimates the field; therefore, one could adopt this method to rapidly and conservatively evaluate the distance of compliance (established by each specific national authority) from the cable, in order to ensure protection of the population from exposure to power frequency magnetic fields.
key words: magnetic field, power cables, human exposure to magnetic field, joint bay, biot-savart law, 3d and 2d method.
1. introduction
according to laws and regulations in many countries of the world, limits have been established to protect the general population and workers against exposure to 50-60 Hz magnetic fields. we may distinguish between limits to protect people against short-term exposure (which produces acute effects on the human body) and limits to protect people against long-term exposure; considering the latter issue and following cautionary principles, many countries have fixed limits, not to be exceeded, in places where continuative human presence is expected. it is worthwhile to mention that such limits are much lower (by one or even two orders of magnitude) than the ones adopted for short-term exposure.
received march 23, 2021; received in revised form may 6, 2021
corresponding author: giovanni lucca, mermec ste, via stamira d'ancona 9, 20127 milano, italy, e-mail: vanni_lucca@inwind.it
for example, icnirp (international commission on non-ionizing radiation protection) [1] suggested the limit of 200 µT (at the frequency of 50-60 Hz) for the population, but many individual countries adopt, as a further cautionary measure against long-term effects, lower limits of the order of a few µT. in any case, this paper covers neither the environmental or biological effects of magnetic fields nor the merits of any specific field level. thus, when a new power line has to be built, in order to comply with the limits fixed by national regulations, one of the design steps consists in a preventive evaluation of the magnetic field generated by the line itself and in establishing a minimum distance from the power line axis which guarantees that, at any point beyond that distance, the magnetic flux density field does not exceed the limit fixed by the national regulations. we name such a distance the distance of compliance, and the acronym doc will be used in the rest of the paper. for example, in italy, the limit of 3 µT has been established by national law and thus the doc is based on such a limit; the value of the doc clearly depends on the characteristics of the power line and on the current it carries. once the doc has been evaluated, one can imagine drawing, along the power line route, a strip having half-width equal to the doc. inside this strip, no continuative human presence is allowed. one has to bear in mind that the doc may not be constant along the power line route because, when the geometric disposition of the conductors changes, the doc varies as well. an example of that, occurring when dealing with underground power cables, is represented by the joint bays. we recall that the joint bays are located along the power cable route at those points where two different cable sections are joined together and that, just in correspondence of them, the magnetic field produced by the cable is strongly increased compared to the field level far from the joint bay itself.
that occurs because inside the joint bay, for technical reasons, the power cable conductors (phases) have to be arranged in flat disposition with an increased distance between them, thus producing a magnetic field that can be 5-10 times higher than the one produced by the same cable when the phases have trefoil configuration (the typical disposition outside the joint bay). a further point has to be mentioned: inside the joint bay the three phases of the cable progressively change their geometric arrangement, because they widen out and go deeper (see a very simple sketch in fig. 1), so that the invariance of the phase disposition along the cable route is broken; this implies that the model of infinite conductors [2], or equivalently the 2d model, normally adopted for calculating the magnetic field far from the joint bay, is no longer applicable, and one has to use a more complex 3d model that takes into account the varying distance among the conductors [3]. this does not imply that the simplified 2d model is useless: in fact, the aim of this work is to show, by comparing the results obtained by means of the 3d and 2d models, that the latter can still be adopted if the main purpose is to conservatively evaluate the doc relevant to the power cable under study.
fig. 1 sketch (not to scale) of the disposition of the conductors inside a joint bay
2. description of 3d model
the 2d model is based on the well-known biot-savart law applied to infinitely long, rectilinear conductors; since it is broadly described in the literature, we omit any further detail about it. information about this model can be found in [2].
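as an illustration of the 2d model recalled above, the following python sketch superposes the fields of infinitely long conductors parallel to the x-axis and compares a trefoil with a flat arrangement. the burial depth, the conductor spacings, the phase assignment and the function name `field_2d` are assumptions chosen for illustration only; they are not taken from the paper.

```python
import cmath
import math

MU0 = 4e-7 * math.pi  # vacuum magnetic permeability, H/m


def field_2d(conductors, currents, y, z):
    """Modulus of B (tesla) at (y, z) for infinite conductors parallel to x.

    conductors: list of (y, z) axis positions in metres.
    currents:   list of complex current phasors in amperes.
    """
    by = 0j
    bz = 0j
    for (yc, zc), i in zip(conductors, currents):
        dy, dz = y - yc, z - zc
        coeff = MU0 * i / (2 * math.pi * (dy * dy + dz * dz))
        # current along +x: B points along x_hat x (dy, dz) = (-dz, dy)
        by += -coeff * dz
        bz += coeff * dy
    return math.sqrt(abs(by) ** 2 + abs(bz) ** 2)


# balanced three-phase currents of 1000 A (phases 0, -120, +120 degrees)
currents = [1000 * cmath.exp(1j * math.radians(a)) for a in (0, -120, 120)]

depth = 1.2      # assumed burial depth of the cable axis, m
s_trefoil = 0.1  # assumed phase spacing in trefoil, m
s_flat = 0.5     # assumed phase spacing in flat disposition, m

h = s_trefoil * math.sqrt(3) / 2  # height of the trefoil triangle
trefoil = [(0.0, -depth + h / 2),
           (-s_trefoil / 2, -depth - h / 2),
           (s_trefoil / 2, -depth - h / 2)]
flat = [(-s_flat, -depth), (0.0, -depth), (s_flat, -depth)]

b_trefoil = field_2d(trefoil, currents, 0.0, 0.0)
b_flat = field_2d(flat, currents, 0.0, 0.0)
print(f"trefoil: {b_trefoil * 1e6:.1f} uT, flat: {b_flat * 1e6:.1f} uT")
```

with these assumed dimensions the flat arrangement gives a clearly larger field at ground level than the trefoil one, which is the qualitative behaviour described in the text; the actual ratio depends on the real spacings and depth.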
on the contrary, it is useful to describe in more detail the 3d model which, as mentioned before, allows for a correct evaluation of the magnetic flux density field when the geometry of the problem is more complex, provided that the source of the field can be represented by a suitable number of straight wire segments carrying a known current. this 3d model is still based on the biot-savart law, which is now applied to rectilinear conductors of finite length (see for example [4-6]). thus, coming to our problem, if the power cable is composed of $M$ conductors and each one of them is discretised by $N$ wire segments, we have $N_{tot} = MN$ wire segments modelling the whole power cable. finally, if $B_{xk}(x,y,z)$, $B_{yk}(x,y,z)$, $B_{zk}(x,y,z)$ are respectively the x, y, z components of the magnetic flux density field produced in the generic point of space $(x,y,z)$ by the k-th wire segment, the total magnetic flux density field is given by:
$B_{xtot}(x,y,z) = \sum_{k=1}^{N_{tot}} B_{xk}(x,y,z)$ (1)
$B_{ytot}(x,y,z) = \sum_{k=1}^{N_{tot}} B_{yk}(x,y,z)$ (2)
$B_{ztot}(x,y,z) = \sum_{k=1}^{N_{tot}} B_{zk}(x,y,z)$ (3)
and the modulus of the total field is:
$B_{tot}(x,y,z) = \sqrt{|B_{xtot}(x,y,z)|^{2} + |B_{ytot}(x,y,z)|^{2} + |B_{ztot}(x,y,z)|^{2}}$ (4)
the next step is to provide suitable analytical expressions for calculating the field produced by a single wire segment.
2.1. field produced by a straight wire segment
let us consider a generic straight wire having length $S$ and carrying a constant current $I$ (where $I$ is a complex phasor); let us represent such a wire by means of an oriented segment with end points a and b of coordinates $(x_a, y_a, z_a)$ and $(x_b, y_b, z_b)$ respectively.
by assuming the positive direction from a to b, we can define the relevant direction cosines $l$, $m$, $n$ by:
$l = \frac{x_b - x_a}{S}, \quad m = \frac{y_b - y_a}{S}, \quad n = \frac{z_b - z_a}{S}$ (5)
each point belonging to this segment can be represented by means of the following equations in parametric form:
$x(s) = x_a + l s, \quad y(s) = y_a + m s, \quad z(s) = z_a + n s$ (6)
the parameter $s$ appearing in (6) belongs to the interval $[0, S]$. the starting point for the calculation of the magnetic flux density field produced by the wire conductor is given by ampere's law in differential form (see for example [7]), which expresses the field produced in a generic point $(x,y,z)$ by a conductor having infinitesimal length $d\vec{l}$ and carrying a current $I$ (see fig. 2).
fig. 2 sketch of wire segment
this law is given by the formula:
$d\vec{B}(x,y,z,s) = \frac{\mu_0 I}{4\pi} \, \frac{d\vec{l} \times \vec{u}_r(x,y,z,s)}{r^{2}(x,y,z,s)}$ (7)
where the distance $r(x,y,z,s)$ between the infinitesimal element $d\vec{l}$ and the point $(x,y,z)$ is:
$r(x,y,z,s) = \sqrt{(x - x(s))^{2} + (y - y(s))^{2} + (z - z(s))^{2}}$ (8)
the vector $d\vec{l}$ is given by:
$d\vec{l} = (l\,ds,\; m\,ds,\; n\,ds)^{T}$ (9)
the unit vector $\vec{u}_r(x,y,z,s)$ is expressed by:
$\vec{u}_r(x,y,z,s) = \left( \frac{x - x(s)}{r(x,y,z,s)},\; \frac{y - y(s)}{r(x,y,z,s)},\; \frac{z - z(s)}{r(x,y,z,s)} \right)^{T}$ (10)
and $\mu_0$ is the vacuum magnetic permeability. the total field $\vec{B}(x,y,z)$ can be obtained by integrating (7) over the whole segment length; hence, by taking into account (8), (9), (10) one gets:
$\vec{B}(x,y,z) = \frac{\mu_0 I}{4\pi} \int_0^{S} \frac{(l\vec{u}_x + m\vec{u}_y + n\vec{u}_z) \times [(x - x(s))\vec{u}_x + (y - y(s))\vec{u}_y + (z - z(s))\vec{u}_z]}{[(x - x(s))^{2} + (y - y(s))^{2} + (z - z(s))^{2}]^{3/2}} \, ds$ (11)
where $\vec{u}_x$, $\vec{u}_y$, $\vec{u}_z$ are respectively the unit vectors relevant to the x, y, z axes.
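the integral (11) can be checked numerically. the python sketch below is a minimal illustration, not the author's implementation: it integrates (11) with the trapezoidal rule, evaluates the same integral through the standard antiderivative of $1/(s^{2} + k_1 s + k_2)^{3/2}$, and superposes segments as in (1)-(4). the function names and the test geometry (a 10 m wire along z carrying 1000 A, observed 1 m away) are hypothetical.

```python
import numpy as np

MU0 = 4e-7 * np.pi  # vacuum magnetic permeability, H/m


def segment_field_numeric(a, b, current, p, n_steps=20001):
    """Trapezoidal integration of eq. (11) for a straight segment a -> b."""
    a, b, p = (np.asarray(v, dtype=float) for v in (a, b, p))
    length = np.linalg.norm(b - a)
    u = (b - a) / length                 # direction cosines (l, m, n), eq. (5)
    s = np.linspace(0.0, length, n_steps)
    r_vec = p - (a + s[:, None] * u)     # field point minus source point, eq. (6)
    r = np.linalg.norm(r_vec, axis=1)    # distance, eq. (8)
    integrand = np.cross(u, r_vec) / r[:, None] ** 3
    ds = s[1] - s[0]
    integral = (integrand[:-1] + integrand[1:]).sum(axis=0) * ds / 2.0
    return MU0 * current / (4.0 * np.pi) * integral


def segment_field_closed(a, b, current, p):
    """Same field from the analytic antiderivative of the integrand."""
    a, b, p = (np.asarray(v, dtype=float) for v in (a, b, p))
    length = np.linalg.norm(b - a)
    u = (b - a) / length
    d = p - a
    k1 = -2.0 * np.dot(d, u)
    k2 = np.dot(d, d)
    disc = 4.0 * k2 - k1 ** 2            # zero only if p lies on the wire axis
    root_end = np.sqrt(length ** 2 + k1 * length + k2)
    scalar = (4.0 * length + 2.0 * k1) / (disc * root_end) \
        - 2.0 * k1 / (disc * np.sqrt(k2))
    return MU0 * current / (4.0 * np.pi) * np.cross(u, d) * scalar


def total_modulus(segments, currents, p):
    """Superpose all segments (eqs. (1)-(3)) and return the modulus, eq. (4)."""
    b_tot = sum(segment_field_closed(a, b, i, p)
                for (a, b), i in zip(segments, currents))
    return np.sqrt(np.sum(np.abs(b_tot) ** 2))


# hypothetical check: 10 m wire along z carrying 1000 A, field point 1 m away
a, b, p = (0.0, 0.0, -5.0), (0.0, 0.0, 5.0), (1.0, 0.0, 0.0)
b_num = segment_field_numeric(a, b, 1000.0, p)
b_cf = segment_field_closed(a, b, 1000.0, p)
print(b_num, b_cf)
```

for this geometry both routines should agree with the textbook finite-wire result $\mu_0 I (\sin\theta_2 - \sin\theta_1)/(4\pi d)$, directed along y. the closed form is undefined for field points on the wire axis, where $4k_2 - k_1^{2} = 0$.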
by splitting up (11) into the three components, it is possible to analytically integrate each one of the three expressions (which we omit for brevity), so that one finally gets the following closed-form expressions for the magnetic flux density field components:
$B_x(x,y,z) = \frac{\mu_0 I}{4\pi} \, [m(z - z_a) - n(y - y_a)] \left[ \frac{4S + 2k_1}{(4k_2 - k_1^{2})\sqrt{S^{2} + k_1 S + k_2}} - \frac{2k_1}{(4k_2 - k_1^{2})\sqrt{k_2}} \right]$ (12)
$B_y(x,y,z) = \frac{\mu_0 I}{4\pi} \, [-l(z - z_a) + n(x - x_a)] \left[ \frac{4S + 2k_1}{(4k_2 - k_1^{2})\sqrt{S^{2} + k_1 S + k_2}} - \frac{2k_1}{(4k_2 - k_1^{2})\sqrt{k_2}} \right]$ (13)
$B_z(x,y,z) = \frac{\mu_0 I}{4\pi} \, [l(y - y_a) - m(x - x_a)] \left[ \frac{4S + 2k_1}{(4k_2 - k_1^{2})\sqrt{S^{2} + k_1 S + k_2}} - \frac{2k_1}{(4k_2 - k_1^{2})\sqrt{k_2}} \right]$ (14)
the quantities $k_1$ and $k_2$ being given by:
$k_1 = -2\,[(x - x_a)l + (y - y_a)m + (z - z_a)n]$ (15)
$k_2 = (x - x_a)^{2} + (y - y_a)^{2} + (z - z_a)^{2}$ (16)
3. validation of the formulas for 3d model
even if the 3d model based on the biot-savart law is broadly described in the literature, it may be useful to compare the results obtained by applying the formulas presented in section 2 with the ones obtained by using some other very specific formulas, appearing in [8], that may serve as a benchmark. such a benchmark is represented by the case of three parallel conductors of finite length carrying a balanced three-phase current. the three parallel conductors, carrying a current of 1000 A, have length l = 10 m, lie on the x-z plane and are separated by a distance d = 0.3 m. the highest conductor has phase 240°, the lowest conductor has phase 120° and the middle conductor has phase 0° (see fig. 3).
fig. 3 three conductors of finite length with vertical disposition
in figs. 4a and 4b the magnetic flux density field, calculated according to the two different formulations, is shown. in fig. 4a, the field is evaluated along the x axis for different values of z; conversely, in fig.
4b, the field is evaluated along the y axis for different values of z. as one can see, in both figures all the pairs of curves corresponding to the same value of z are superimposed.
fig. 4a comparison of the b field; plot along the x axis for different values of z
fig. 4b comparison of the b field; plot along the y axis for different values of z
4. comparison between the two models
coming to the study of the joint bay, we shall consider two cases under the hypothesis that the power cable axis is parallel to the x-axis, while y is the lateral distance from the cable axis and z is the height/depth above/below the air-soil interface. the first case deals with a single circuit power line, while the second case deals with a double circuit power line. in both examples the power cables carry a balanced current of 1000 A and the metallic sheaths of the phase conductors are supposed to be earthed at a single point, so that no induced current can circulate on them. the models can, however, be easily adapted to the case of solid bonding, i.e. the sheaths are earthed at least at two points, so that induced currents can circulate on them, thus reducing the magnetic field; see [2], [4], [9] for more details. in that case, the current i appearing in formulas (7) and (11) to (14) represents the sum of the current flowing on the phase conductor and the one circulating on its own sheath. in particular, [2] and [4] give some indications on how to calculate the sheath current in a simplified way. alternatively, the sheath current can be calculated more precisely by modelling the power cable as a multiconductor line; see [10], [11] for an exhaustive description of the multiconductor algorithm. lastly, the influence of the soil can be neglected because, at 50-60 Hz and with balanced currents circulating on the phase conductors, its contribution is very small.
in fact, one could take it into account by considering, for each conductor, a respective image current placed at a complex depth $h_{im}$ given by [12-13]:
$h_{im} = h + 2(1 - j)\sqrt{\frac{\rho_s}{2\omega\mu_0}}$ (17)
where $h$ is the depth of the phase conductor, $j$ is the imaginary unit, $\omega = 2\pi f$ is the angular frequency and $\rho_s$ is the soil resistivity; at 50-60 Hz and for typical values of soil resistivity (100 Ω·m to 10000 Ω·m), one can immediately see that $h_{im}$ is in the range from some hundreds of meters to some kilometers; thus, the image current of each conductor (which represents the soil effect) is so far away that it has practically no influence on the value of the magnetic field evaluated at a distance of some meters from the cable.
4.1. single circuit power line
we consider a power line section 160 m long composed of one single three-phase circuit having, just at the middle, a joint bay of length 8 m, so that its length is much less than the length of the power cable section under study. the conductors in the sections preceding and following the joint bay have trefoil disposition, while the arrangement of the conductors inside the joint bay is the one sketched in fig. 1. in fig. 5 we show the magnetic flux density field evaluated along the power cable axis at the air-soil interface by using the 2d and 3d models. in particular, we can notice the very fast increase/decrease of the field at the beginning/end of the joint bay, where the conductors change their configuration from trefoil to flat and vice versa.
fig. 5 b field along the power cable axis; y=0, z=0
in fig. 6 we show the field evaluated at the middle of the joint bay (x=0) versus the lateral distance from the cable axis; the calculations are relevant to different heights from the soil.
fig. 6 b field versus lateral distance from power cable axis: continuous lines 3d model, dashed lines 2d model
in fig. 7, starting from the results shown in fig.
6, we plotted the percent relative error between the field calculated according to the 2d model and the 3d model respectively, i.e.:
$E_{\%}(x,y,z) = 100 \, \frac{B_{3D}(x,y,z) - B_{2D}(y,z)}{B_{3D}(x,y,z)}$ (18)
fig. 7 percent relative error versus lateral distance from power cable axis
from a look at figs. 5-7 it is evident that:
▪ along the power cable route, the influence of the joint bay is restricted to a short interval of a few meters before and after the joint bay itself.
▪ inside and just outside the joint bay area, the fields calculated according to the two models are not very different (percent relative error less than 10%) and the values obtained by the 3d model are higher than those of the 2d model.
▪ by increasing the lateral distance from the joint bay, the differences between the values obtained by the two models increase and we can notice that the 2d model overestimates the field.
4.2. double circuit power line
we consider a power line section 160 m long composed of a double three-phase circuit having, just at the middle, a joint bay of length 19 m, so that the joint bay length is much less than the length of the power cable section under study. in this case, the geometry of the conductors inside the joint bay is more complicated because, for reasons of space, the two joints are staggered with respect to each other; see a sketch in fig. 8. as in the previous case, the conductors are disposed in trefoil configuration outside the joint bay. the currents along the two circuits are disposed in order to minimize the field from a certain distance from the caviduct axis, i.e. the phase disposition is bac-cab [2]. the colors associated with the phases are the same as in fig. 3.
fig. 8 double circuit: sketch (not to scale) of the joint bay
we can notice that, in the first half of the joint bay, the conductors of the circuit on the left widen out while the conductors of the circuit on the right maintain their compact disposition; the reverse occurs in the second half of the joint bay. in fig. 9 we show the magnetic flux density field evaluated along the caviduct axis at the air-soil interface by using the 2d and 3d models.
fig. 9 b field along the caviduct axis; y=0, z=0
in fig. 10 we show the field evaluated, according to the two models, at the middle of the left joint versus the lateral distance from the caviduct axis; the calculations are relevant to different heights from the soil. the results relevant to the right joint are exactly symmetrical with respect to the ones shown in fig. 10; so, for brevity, we omit them.
fig. 10 b field versus lateral distance from caviduct axis: continuous lines 3d model, dashed lines 2d model
in fig. 11, starting from the results shown in fig. 10, we plotted the percent relative error between the field calculated according to the 2d model and the 3d model respectively.
fig. 11 percent relative error versus lateral distance from caviduct axis
from a look at figs. 9-11 we can observe that the same remarks made for the single circuit still hold, with the difference that the discrepancies between the two models are more significant, especially when moving away from the caviduct axis; compare figs. 7 and 11.
5. conclusions
the basic conclusion we can draw from this study is that the 2d model allows for a conservative evaluation of the field outside the area occupied by the joint bay while, if an assessment of the field is needed in the area just over the joint bay, the 3d model has to be preferred, even if the differences between the two models are not very large.
therefore, we can summarize the outcome of our analysis by means of the following remarks:
1) if the doc around a joint bay has to be assessed, one can use the 2d model, which is simpler and just needs the knowledge of the coordinates of the axes of the conductors when they are in flat disposition inside the joint bay. the 2d approach could also be adopted for a simplified evaluation of the field in close proximity of the joint bay, provided that the latter is located far from areas where a continuative human presence is expected.
2) if a more precise evaluation of the magnetic flux density field is required, especially if mitigation measures are needed (such as shields, passive loops or other [14], [15]), then the use of the 3d model is necessary.
references
[1] icnirp, "icnirp guidelines for limiting exposure to time varying electric and magnetic fields (1 Hz-100 kHz)", health physics, vol. 99, pp. 818-836, dec. 2010.
[2] cigre joint task force 36.01/21, magnetic field in hv cable systems 1/ systems without ferromagnetic component. cigre, 1996, pp. 1-5.
[3] cigre, wg b1.23, tb 559, impact of emf on current ratings and cable systems. cigre, 2013, pp. 99.
[4] italian standard cei 211-4, guide to calculation methods of electric and magnetic fields generated by power-lines and electrical substations, 2nd edition, 2008, pp. 19-21. (in italian).
[5] k. j. binns, p. j. lawrenson and c. w. trowbridge, the analytical and numerical solution of electromagnetic fields. john wiley & sons, 1992, pp. 76-79.
[6] t. modrić, s. vujević and d. lovrić, "3d computation of the power lines magnetic field", prog. electromagn. res. m, vol. 41, pp. 1-9, 2015.
[7] e. c. jordan and k. g. balmain, electromagnetic waves and radiating systems. new jersey: prentice hall, 1968, chapter 3, pp. 86-88.
[8] cigre, wg c4.204, tb 373, mitigation techniques of power-frequency magnetic fields originated from electric power systems. cigre, 2009, pp. 18.
[9] j. r. riba ruiz and x. alabern morera, "effects of the circulating sheath currents in the magnetic field generated by an underground power line", in proceedings of the international conference on renewable energies and power quality icrepq'06, palma de mallorca, spain, 5-7 april 2006, pp. 26-30.
[10] itu-t, directives concerning the protection of telecommunication lines against harmful effects from electric power and electrified railway lines, itu, vol. 3, chapter 5, 1989.
[11] c. r. paul, analysis of multiconductor transmission lines. john wiley & sons, 1994, chapter 4, pp. 186-246.
[12] itu-t, directives concerning the protection of telecommunication lines against harmful effects from electric power and electrified railway lines, itu, vol. 2, 1999, p. 144.
[13] r. g. olsen et al., "magnetic fields from electric power lines: theory and comparison to measurements", ieee trans. power deliv., vol. 3, no. 4, pp. 2127-2136, oct. 1988.
[14] a. canova and l. giaccone, "a novel technology for magnetic field mitigation: high magnetic coupling passive loops", ieee trans. power deliv., vol. 26, no. 3, pp. 1625-1633, july 2011.
[15] p. maioli, "shielding of electromagnetic field generated by hhv underground cable lines", in proceedings of the 8th international conference on insulated power cables jicable'11, versailles, france, 19-23 june 2011.
facta universitatis series: electronics and energetics vol. 33, no 4, december 2020, pp. 655-668 https://doi.org/10.2298/fuee2004655b © 2020 by university of niš, serbia | creative commons license: cc by-nc-nd
application of cluster analysis in the behaviour of traffic participants relating to the use of safety systems and mobile phones
marija blagojević, stefan šošić
university of kragujevac, faculty of technical sciences čačak, serbia
abstract. this paper presents a cluster analysis related to the behavior of traffic participants in relation to the use of safety systems and mobile phones.
the data on traffic behavior were downloaded from an open data portal in serbia. three types of cluster analysis have been applied: hierarchical clustering, bayesian information criterion (bic) clustering and model clustering. the obtained results point to the various possibilities of using these three clustering methods in the field of traffic and suggest further research.
key words: cluster analysis, traffic accidents, safety systems
1. introduction
along with the development of society, traffic and traffic communication are developing. traffic is a basic factor in the development of each society, and the level of traffic development is used to measure the level of development of a particular society. traffic in serbia is of paramount importance because of the country's location at the crossroads of the balkans. in the area of legislation, the following significant regulations have been adopted in serbia:
1. the law on road traffic safety ("the official gazette of the republic of serbia", no. 41/2009, 53/2010, 101/2011, 32/2013 – constitutional court decision, 55/2014, 96/2015 – other law, 9/2016 – constitutional court decision, 24/2018, 41/2018, 41/2018 – other law, 87/2018 and 23/2019) [1]
2. strategy on waterborne transport development of the republic of serbia, 2015-2025 [2]
in the era of open data, the data on traffic in serbia have also been given their place. a separate section of the serbian open data portal [3] is devoted to public safety, and it has been the source of the data used for cluster analysis in this paper. traffic monitoring and analysis play a key role in raising the level of transport of goods and passengers. statistics and indicators that characterize traffic are numerous, and often their collection and the formation of databases are limited by their availability and by the efficiency of the system itself.
received april 24, 2020; received in revised form june 11, 2020
corresponding author: marija blagojević, university of kragujevac, faculty of technical sciences čačak, svetog save 65, 32102 čačak, serbia, e-mail: marija.blagojevic@ftn.kg.ac.rs
the application of modern statistical and mathematical methods in the evaluation of traffic enables a comprehensive analysis that includes a large number of indicators, as well as a large amount of data. the aim of this paper is to analyze the available open databases on traffic in the domain of the use of safety systems and mobile phones, grouping the data into clusters whose number was obtained by statistical preprocessing. the goal of the research is to group traffic participants according to the environment in which the highest number of offenses is committed. the paper is organized as follows: section 2 gives a literature review, section 3 is dedicated to the data, and section 4 describes the research methodology. section 5 presents the results, while section 6 consists of the discussion and concluding remarks.
2. literature review
numerous research studies deal with cluster analysis in traffic. the study presented in [4] shows an analysis of data originating from vehicle trajectories obtained by simulation. two strategies were implemented: "two platoon clustering strategies for cacc; an ad hoc coordination strategy and a local coordination strategy". the analysis conducted in [5] provides an overview of the use of the spatial clustering method for macro-level traffic crash analysis. the analysis was based on open source point-of-interest data, which were downloaded from an open source web site. traffic accidents are discrete and non-negative events, and the parameters that are used require further evaluation in order to determine the correlation and thus identify the distribution of traffic crash frequency. in [6], while analysing traffic accidents, the authors sought to identify key factors that influence the severity of the accident. the latent class cluster (lcc) was used as a preliminary analysis tool.
the incidents that occurred in granada (spain) in 2005-2008 were analysed. the clustering technique (in combination with other techniques) was also used in the research presented in [7]. these techniques were used to predict the annual average daily traffic, which is relevant to a large number of applications. cluster analysis and regression analysis were used in [8] to create an algorithm which would be used to "estimate the number of traffic accidents and estimate the risk of traffic accidents in a study area". the authors of [9] paid special attention to younger drivers and their lifestyles in order to establish a correlation with traffic accidents. by using cluster analysis they defined the groups of users with similar lifestyles. the research presented in [10] shows the application possibilities and efficiency of latent class clustering with the aim of identifying homogeneous types of traffic accidents. the motivation for this research stemmed from the fact that traffic accident data are most often heterogeneous. some authors, as in [11], created the architecture of a dynamic clustering system using beowulf class clusters and now. if we compare the research conducted in this paper with the studies given above, we can observe both similarities and differences in the approach. the basic concept underlying the clustering method is the same in all studies. some papers also deal with traffic accidents, but in different contexts. the main difference is reflected in the approach to traffic accident analysis related to the use of safety systems and mobile phones.
3. data
in order to implement any data mining technique, we first need a data set to be analyzed. for our research, a data set of the indicators of traffic participants' behavior has been downloaded from the serbian open data portal.
these indicators describe the behavior of road users and are indirect indicators of traffic safety in serbia. three excel documents were available on the open data portal, and we directed our research to the indicators of traffic participants’ behavior with regard to the use of safety systems and mobile phones. the document shows the id of the territory to which the indicator relates, the year of measurement of the indicator, the type of vehicle in which the traffic participant was observed, the type of indicator, and the value of the indicator on roads in settlements, outside settlements and on highways. the data from excel, containing 1932 records, were raw data, which cannot be used in that form to build a data frame. for that reason, r studio software [12] was used to implement the clustering technique. r studio is an integrated development environment for the r programming language, which is used for statistical computing and graphics.

fig. 1 raw data from excel

the columns representing vehicles and indicators must be serialized first. data serialization is the process of converting structured data to a format that allows sharing or storage of the data in a form that allows recovery of its original structure. serialization was done with the "keras" library for r studio, whose tokenizer method was used to accomplish the process.

fig. 2 data frame after serialization

after the serialization and before creating a data frame, the data were scaled. in the r programming language there is a scale function which places continuous variables on a unit scale by subtracting the mean of the variable and dividing the result by the variable's standard deviation. as a result, the transformed values keep the same relationship to one another but have mean 0 and standard deviation 1. the data set used for the research contains values which vary in range and are represented in different units. the clustering algorithms used in this research use the euclidean distance between two data points in their computations.
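the scaling step described above can be sketched in a few lines. this is a minimal illustration in python rather than the r code used in the study; the function names and the sample rows are hypothetical, and the sample standard deviation (n - 1 denominator) is assumed because that is what r's sd computes.

```python
from statistics import mean, stdev


def scale_column(values):
    """Center a column on its mean and divide by its sample standard deviation."""
    m = mean(values)
    s = stdev(values)  # n - 1 denominator (assumed to match R's sd)
    return [(v - m) / s for v in values]


def scale_rows(rows):
    """Standardize every column of a row-oriented table of numbers."""
    columns = list(zip(*rows))                      # rows -> columns
    scaled_columns = [scale_column(col) for col in columns]
    return [list(row) for row in zip(*scaled_columns)]  # columns -> rows


# Hypothetical records: two indicators measured on very different ranges.
rows = [[10.0, 0.2], [20.0, 0.4], [30.0, 0.9]]
scaled = scale_rows(rows)
for row in scaled:
    print(row)
```

after this transformation both columns contribute on a comparable scale to any euclidean distance computed between rows.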
if scaling is not done, it can affect the results, because mixed units and ranges are used in the computations. the results would vary between different units. to bypass that issue, the data frame needs to be scaled to the same level.

fig. 3 data frame after scaling

table 1 presents all variables with their type.

table 1 variables and their types

no.  variable name                    type
1    id                               numerical
2    year                             numerical
3    vehicle                          categorical
4    indicator                        categorical
5    % of using in colony             numerical
6    % of using outside the colony    numerical
7    % of using on highway            numerical
8    % of using in total              numerical
9    class colony                     numerical
10   class outside the colony         numerical
11   class highway                    numerical
12   class total                      numerical

4. methodology

the data mining technique applied to solve the research problem was clustering. to understand the clustering technique, the term cluster needs to be explained first. a cluster refers to a group of objects that belong to the same class. that means that similar objects are grouped in one cluster and dissimilar objects in another. based on that, a cluster of data objects can be presented as one group. the process of grouping data objects by similarity is called clustering. according to [13], clustering is an unsupervised classification technique in pattern analysis. the main advantage of this technique is that it is adaptable to changes and helps to separate the useful characteristics that distinguish different groups. in our study, traffic participants have been clustered according to the most common location where they committed violations: in settlements, outside settlements or on highways. an important thing to consider when choosing a clustering algorithm is whether the algorithm scales to the data set which is used for clustering. the algorithm should have good performance and efficiency, since the data set used for clustering can contain a huge amount of data.
complexity notation is used for determining the efficiency of an algorithm. algorithms whose complexity is higher than $O(n^2)$ are impractical and inefficient. in the proposed research only algorithms with complexity lower than or equal to $O(n^2)$ are used. for example, the k-means algorithm, which is explained below, has a complexity of $O(n)$, which means that the algorithm scales linearly with n.

one of the simpler learning algorithms that solve the clustering problem is k-means and it can be applied to these results. the idea is to define k centers, one for each cluster. hofmeyr in [15] noticed that "clusters are associated with compact collections of points arising around a set of cluster centroids". k-means clustering computes the distance between samples and forms clusters by representing a gene as a vector of expression values according to yang et al. [14]. different locations of the k centers give different results. after that, a loop is created, and as a result the k centers change location step by step until they stop moving. the goal of this algorithm is to minimize the squared-error objective function:

$$J(V) = \sum_{i=1}^{c} \sum_{j=1}^{c_i} \left( \lVert x_i - v_j \rVert \right)^2 \qquad (1)$$

where the function parameters represent the following:
$\lVert x_i - v_j \rVert$ is the euclidean distance between $x_i$ and $v_j$,
$c_i$ is the number of data points in the i-th cluster,
$c$ is the number of cluster centers.

when the first cluster centers have been calculated, each new center must be recalculated using the following function:

$$v_i = \frac{1}{c_i} \sum_{j=1}^{c_i} x_j \qquad (2)$$

where the sum runs over the $c_i$ data points assigned to cluster i. with this function, the distance between each data point and the newly obtained cluster centers is recalculated. if no data point has been reassigned, the process stops.

fig. 4 clusters after finishing the process of calculating cluster centers

the clustering techniques which have been applied to the data set of traffic participants’ behavior indicators in serbia are:

• hierarchical clustering

hierarchical clustering can be divided into two types of hierarchical cluster analysis strategies, agglomerative and divisive.
hierarchical agglomerative clustering (hac), also known as the bottom-up approach, is more informative than the unstructured set of clusters returned by flat clustering. algorithms used for hac treat each data item as a singleton cluster at the outset and then successively agglomerate pairs of clusters until all clusters have been merged into a single cluster that contains all data. on the other hand, divisive clustering, also known as the top-down approach, requires a method for splitting a cluster that contains the whole data set and proceeds by splitting clusters recursively until individual data items have been split into singleton clusters. based on the data provided, the euclidean distance must be calculated: $d(x_i, x_j) = \lVert x_i - x_j \rVert$ represents the basic distance between any two elements of $X$, and the minimum distance is used for defining the sub-set distance:

$$\Delta(X_i, X_j) = \min_{x \in X_i,\, y \in X_j} d(x, y) \qquad (3)$$

according to abbas [16], hierarchical clustering algorithms "combine or divide existing groups, creating a hierarchical structure that reflects the order in which groups are merged or divided". in contrast to agglomerative clustering, divisive hierarchical clustering starts from the root containing the whole data set $X$, and splits this root node into two child nodes containing respectively $X_1$ and $X_2$ (so that $X = X_1 \cup X_2$ and $X_1 \cap X_2 = \emptyset$), and so on recursively until we reach the leaves that store the data elements as singletons. the divisive method, according to wei et al. [17], has a top-down style "in which the data objects are initially treated as a unified cluster that is gradually split until the desired number of clusters is obtained".

• clustering based on the bayesian information criterion (bic), proposed by schwarz [18]

according to [19], this approach chooses one among a set of candidate models $M = \{M_1, M_2, \ldots, M_m\}$ to represent a given data set $D = \{D_1, D_2, \ldots, D_n\}$.
the bic of model $M_i$ is defined as:

$$\mathrm{BIC}(M_i) = \log P(D_1, D_2, \ldots, D_n \mid M_i) - \frac{1}{2} d_i \log n \qquad (4)$$

where $d_i$ is the number of independent parameters in model $M_i$ and $P(D_1, D_2, \ldots, D_n \mid M_i)$ is the maximized likelihood for the model.

• model based clustering

mclachlan and peel [20] and fraley and raftery [21] gave reviews of the area of model based clustering. different clustering algorithms have different objective functions, but the general idea is to minimize the distance between the objects in the same cluster while maximizing the distance between the objects in different clusters. minimization of the intra-cluster distance can also be viewed as the minimization of the distance between each data point $x_i$ and the cluster means $c_j$. given a set of clusters $C_j$, the expected sse can be calculated as follows:

$$E\left( \sum_{j} \sum_{x_i \in C_j} \lVert c_j - x_i \rVert^2 \right) = \sum_{j} \sum_{x_i \in C_j} \int \lVert c_j - x_i \rVert^2 f(x_i)\, dx_i \qquad (5)$$

where $\lVert \cdot \rVert$ is a distance metric between a data point $x_i$ and a cluster mean $c_j$, and $f(x_i)$ denotes the probability density of $x_i$. cluster means are given by:

$$c_j = E\left( \frac{1}{|C_j|} \sum_{x_i \in C_j} x_i \right) = \frac{1}{|C_j|} \sum_{x_i \in C_j} \int x_i f(x_i)\, dx_i \qquad (6)$$

for all clustering techniques mentioned above, the process of serialization and normalization of the data frame must be done before applying any of the clustering algorithms. k-means can be applied to the resulting data frame. it is a simple learning algorithm which has already been mentioned and described earlier in this section.

5. results and discussion

the research shows the results of the analysis with different clustering methods. the data set mentioned above was used for the research, but with different clustering algorithms. the aim of the research was to group traffic participants according to the environment in which the highest number of offences was committed. as it can be seen from the command above, we have taken only the percentages of the offences committed in each environment. after applying the learning algorithm, the clusters can be represented by plotting in r studio. a dendrogram given in fig. 5 is the result.

fig.
5 representation of clusters with relation between them

5.1. hierarchical clustering

in the proposed research divisive hierarchical clustering was used due to the fact that it is more efficient, having a lower complexity of $O(n^2)$. it is also more accurate: agglomerative clustering makes decisions by considering local patterns without considering the global distribution of the data. those early decisions cannot be reversed, and that affects the result given by hierarchical agglomerative clustering. there can be several closest pairs of subsets, but we have chosen only one pair at each iteration, after which the iteration process is repeated from the beginning. in other words, we have applied a permutation on the elements of $X$ and re-run the algorithm. for numerical data, we can slightly modify the initial data set by adding some small random noise drawn uniformly in (0,1) to bypass this problem. one disadvantage of complete linkage is that it is very sensitive to outliers (that is, artifact data that should have been removed beforehand when possible, during the cleaning stage of the data set) (fig. 6).

the dendrogram has colored cuts (fig. 6). each cut represents traffic participants’ behavior in different traffic areas. at a given height a flat clustering is obtained. the cut path does not need to be at a constant height. the dendrogram allows one to obtain many flat partitions. here, three different cuts are shown at a constant height, h = 3. hierarchical clustering is tightly linked to a class of distances called the class of ultrametrics. a distance is said to be an ultrametric if it is a metric and if it satisfies the following:

$$d(x_i, x_j) \le \max\big( d(x_i, x_k),\, d(x_k, x_j) \big) \quad \text{for all } x_i, x_j, x_k \qquad (7)$$

fig. 6 dendrogram representing hierarchical clustering by rows

the same technique can be represented by a dendrogram with clustering by rows down with values (fig. 7).

fig.
7 dendrogram representing hierarchical clustering by rows down with values

5.2. clustering based on bayesian information criterion (bic)

bayes factors, approximated by the bayesian information criterion (bic), have been successfully applied to the problem of determining the number of components in a model and for deciding which among the three partitions most closely matches the data for a given model. partitions are determined by a combination of hierarchical clustering and the expectation-maximization (em) algorithm. the em algorithm is an effective approach for performing maximum likelihood estimation in the presence of latent variables. it does this by estimating the values of the latent variables, then optimizing the model, and repeating these two steps until convergence. as such, it is an appropriate approach to use with the bayesian information criterion (bic) for estimating the parameters of the distributions. this approach can give much better results than the existing methods. moreover, the em result also provides a measure of uncertainty. the model based classification is able to match the traffic classification of a traffic offences data set much more closely than the standard k-means, in the absence of any training data.

fig. 8 plot representing the bayesian information criterion (bic)

the plot shows the bayesian information criterion (bic) for the model-based methods applied to the traffic offences data. the first local maximum occurs for the unconstrained model with three clusters.

5.3. clustering based on model

in this section we illustrate the model-based approach to clustering using a three-dimensional data set involving 1932 observations of traffic offences in different environments in serbia. the plot in fig. 9 depicts the uncertainty of the classification produced by the best model (unconstrained, three clusters) indicated by the bic. fig.
10 presents a plot showing the traffic offences classification, which partitions the data into three groups. the variables have the following meanings: in settlements, outside settlements and on highways. the clusters overlap and are far from spherical in shape. as a result, many clustering procedures would not work well for this application. for example, figure 3 shows the (1, 3) projection of three-cluster classifications obtained by the single-link (nearest-neighbor) method, standard k-means and the model-based method for an unconstrained gaussian mixture.

fig. 9 model based on uncertainty clustering

fig. 10 model based on classification clustering

fig. 11 gaussian finite mixture model fitted by em algorithm

fig. 11 shows reciprocal condition estimates for six different gaussian mixture models for up to nine clusters. it should also be clear that em started from the partitions obtained by hierarchical clustering and should not be repeated for the following clusters once a traffic offence is encountered. the assumption of three classes is artificial for the single link and k-means methods, while for the model-based method the bic has been used to determine the number of groups. nearest-neighbor discrimination assigns a data point to the same group as the point in the training set nearest to it. the first local maximum occurs for the unconstrained model with three clusters. for the initial values in em, $z_{ik}$ was used, given by the equation for the discrete classification from agglomerative hierarchical clustering for the unconstrained model ($\lambda_k D_k A_k D_k^{T}$) in all cases, leaving the model selection to the em phase (8).
$$z_{ik} = \begin{cases} 1 & \text{if } x_i \text{ belongs to group } k \\ 0 & \text{otherwise} \end{cases} \qquad (8)$$

the dbscan clustering algorithm belongs to the density-based clustering techniques, where the word density refers to the spatial disposition of data points that are dense when forming a group: two data points are in the same cluster if their distance is smaller than the threshold. the result of the dbscan algorithm is a set of clusters along with their additional merge points (fig. 12).

fig. 12 model based on density clustering

moreover, the number of clusters is not fixed a priori, but is estimated as a feature of the partition of the observations. to summarize, our model is based on the weakness of the natural clustering rule of species sampling mixture models of parametric densities, by which we mean that two observations $x_i$ and $x_j$ are in the same cluster if, and only if, the latent parameters $\theta_i$ and $\theta_j$ are equal.

6. discussion/conclusions

the results show the areas where most traffic offenses are committed. this means that special attention should be paid to these areas. all algorithms follow the same pattern, which is related to the calculation of the distance between the samples. the complexity of an algorithm can be an advantage, but it becomes a disadvantage if the complexity is too large (it implies longer execution of the clustering calculations). the discussion of the presented results can be organized around the data used, the variables and a comparison of the algorithms used:

data: the data are collected from the open data portal, so such "raw" data cannot be used directly. before use, the data must be processed with techniques like serialization and scaling. for research purposes in this area, more frequent data updates are desirable. a special contribution would be an open api that would display selected data in real time.

variables: in the chosen data set some variables are more important than others.
variables such as the percentage of use in a specific traffic area are more important than the vehicle, the type of offense indicator and the year of data collection. clustering results are also related to the areas with the most and the fewest traffic offenses. the key factor which determines the result is the percentage of use in a specific traffic area. for example, regardless of the vehicle in which an offense is committed, the focus can be placed mainly on the area where the offense is committed.

comparison: the algorithms can be compared regarding complexity:

• k-means: $O(n)$, scaling linearly with the number of data points (section 4)
• hierarchical agglomerative clustering (hac): initially $O(n^3)$ in time and $O(n^2)$ in memory; in the proposed research the time consumption is reduced to $O(n^2 \log n)$
• em: $O(n \cdot k \cdot i)$, where i is the number of iterations (which is unbounded in principle, but can be set to 0 if no iteration is needed)
• bic: $O(i \log n)$, where i is the number of parameters. it penalizes the complexity of the model, where complexity refers to the number of parameters in the model.

bearing in mind the obtained results, several conclusions can be drawn:

• cluster analysis can be successfully applied to problems related to traffic accidents, and all the clustering techniques mentioned have their advantages and drawbacks;
• there is a significant correlation between the behavior of traffic participants and the use of safety systems and mobile phones;
• each of the types of cluster analysis used is significant and complementary to the cluster information.

in practice, results gained by clustering can be used to define new road safety laws or to change existing ones. by determining the areas where most traffic offenses are committed, experts can focus road safety laws on those areas. focusing only on those areas can achieve better quality road safety laws, which can reduce the number of traffic offenses.
the proposed research has opened possibilities for further improvement by clustering in different ways. clustering could be used to determine the area in which most traffic offenses are committed, and the data frame for that specific area could then be clustered again. the proposed improvement could determine details such as in which vehicle most traffic offenses are committed and which type of traffic offense is most common. with those details, experts creating road safety laws could focus not only on the area in which most traffic offenses are committed, but also on the specific vehicle and type of offense. that will result in more specific laws focused on the most problematic safety factors. newly created laws are expected to reduce the number of traffic offenses committed. traffic safety equipment can also be installed to prevent some offenses. future work is concerned with exploring the possibility of using neural networks in order to predict the behavior of traffic participants based on the available open data.

acknowledgement: this study was supported by the ministry of education, science and technological development of the republic of serbia, and these results are parts of the grant no. 451-03-68/2020-14/200132 with university of kragujevac faculty of technical sciences čačak.

references

[1] the law on road traffic safety ("the official gazette of the republic of serbia", no. 41/2009, 53/2010, 101/2011, 32/2013 – constitutional court decision, 55/2014, 96/2015 – other law, 9/2016 – constitutional court decision, 24/2018, 41/2018, 41/2018 – other law, 87/2018 and 23/2019).
[2] strategy on waterborne transport development of the republic of serbia, 2015 – 2025, available at: http://aler.rs/files/strategija_razvoja_vodnog_saobracaja_republike_srbije_od_2015_do_2020_godine_sl_gl_rs_br_3_2015.pdf, last access march 23rd, 2020.
[3] open data portal, https://data.gov.rs/sr/, last access march 23rd, 2020.
[4] z. zhong, e. lee, m. nejad and j.
lee, "influence of cav clustering strategies on mixed traffic flow characteristics: an analysis of vehicle trajectory data", transportation research part c: emerging technologies, vol. 115, 102611, june 2020.
[5] r. jia, a. khadka and i. kim, "traffic crash analysis with point-of-interest spatial clustering", accident analysis & prevention, vol. 121, pp. 223–230, december 2018.
[6] j. ona, g. lopez, r. mujalli and f. calvo, "analysis of traffic accidents on rural highways using latent class clustering and bayesian networks", accident analysis & prevention, vol. 51, pp. 1–10, march 2013.
[7] a. sfyridis and p. agnolucci, "annual average daily traffic estimation in england and wales: an application of clustering and regression modeling", journal of transport geography, vol. 83, 102658, february 2020.
[8] k. ng, w. hung and w. wong, "an algorithm for assessing the risk of traffic accident", journal of safety research, vol. 33, pp. 387–410, october 2002.
[9] n. gregersen and h. berg, "lifestyle and accidents among young drivers", accident analysis & prevention, vol. 26, pp. 297–303, june 1994.
[10] b. depaire, g. wets, k. vanhoof, "traffic accident segmentation by means of latent class clustering", accident analysis & prevention, vol. 40, pp. 1257–1266, july 2008.
[11] d. kehagias, m. grivas, and g. pantziou, "using a hybrid platform for cluster, now and grid computing", facta univ. ser.: elec. energ., vol. 18, no. 2, pp. 205–218, august 2005.
[12] r studio, retrieved from: https://rstudio.com/.
[13] a.k. jain, m.n. murty, p.j. flynn, "data clustering: a review", acm comput. surveys, vol. 31, no. 3, pp. 264–323, 1999.
[14] w. yang, h. long, l. ma, h. sun, "research on clustering method based on weighted distance density and k-means", procedia computer science, vol. 166, pp. 507–511, 2020.
[15] d. hofmeyr, "degrees of freedom and model selection for k-means clustering", computational statistics & data analysis, vol. 149, 2020.
[16] o.a.
abbas, "comparisons between data clustering algorithms", the international arab journal of information technology, vol. 5, no. 3, pp. 320–325, july 2008.
[17] w. wei, j. liang, x. guo, p. song, y. sun, "hierarchical division clustering framework for categorical data", neurocomputing, vol. 341, pp. 118–134, may 2019.
[18] g. schwarz, "estimating the dimension of a model", the annals of statistics, vol. 6, pp. 461–464, 1978.
[19] b. zhou, j. hansen, "unsupervised audio stream segmentation and clustering via the bayesian information criterion", in proceedings of the 6th international conference on spoken language processing, beijing, china, 2000.
[20] g. mclachlan, d. peel, "robust cluster analysis via mixtures of multivariate t-distributions", lecture notes in computer science, vol. 1451, pp. 658–666, 1998.
[21] c. fraley, a. raftery, "model-based clustering, discriminant analysis, and density estimation", journal of the american statistical association, vol. 97, pp. 611–631, 2002.

instruction facta universitatis series: electronics and energetics vol. 27, n o 2, june 2014, pp. 235 249 doi: 10.2298/fuee1402235s

execution time – area tradeoff in ga using residual load decoder: integrated exploration of chaining based schedule and allocation in hls for hardware accelerators

anirban sengupta 1, reza sedaghat 2, vipul kumar mishra 1
1 computer science and engineering, indian institute of technology, indore, india
2 electrical and computer engineering, ryerson university, toronto, canada

abstract.
design space exploration is an indispensable segment of high level synthesis (hls) design of hardware accelerators. this paper presents a novel technique for area-execution time tradeoff using residual load decoding heuristics in genetic algorithms (ga) for integrated design space exploration (dse) of scheduling and allocation. this approach is also able to resolve issues encountered during dse of data paths for hardware accelerators, such as accuracy of the solution found, as well as the total exploration time during the process. the integrated solution found by the proposed approach satisfies the user specified constraints of hardware area and total execution time (not just latency), while at the same time offers a twofold unified solution of chaining based schedule and allocation. the cost function proposed in the genetic algorithm approach takes into account the functional units, multiplexers and demultiplexers needed during implementation. the proposed exploration system (expsys) was tested on a large number of benchmarks drawn from the literature for assessment of its efficiency. results indicate an average improvement in quality of results (qor) greater than 26 % when compared to a recent well known ga based exploration method. key words: area; high level synthesis; exploration; scheduling; chaining; execution 1. introduction as the complexity of very large scale integration (vlsi) designs increases, the design of application specific integrated circuits (asic) should be addressed at higher levels of abstraction in order to meet the growing challenges. of late there has been a major shift among all well-known electronic design automation (eda) vendors from traditional register transfer level (rtl) designs to high level synthesis. 
however, for comprehensive high level system designs, efficient design space exploration techniques are required during hls that can concurrently meet the user specified constraints of hardware area and execution time. furthermore, design space exploration should also be able to concurrently resolve the orthogonal issues encountered during dse, such as minimizing the time of the exploration process and maximizing the precision required. hence, the tremendous advancement of highly complex digital vlsi circuits in the current generation of portable devices and other electronic products has mainly become possible owing to the efficient design techniques developed so far [1].

received january 27, 2014
corresponding author: reza sedaghat, electrical and computer engineering, ryerson university, toronto, canada (e-mail: rsedagha@ee.ryerson.ca)

the process of hls can be broadly classified into three phases. the first phase involves the conversion of the algorithm into a data flow graph (dfg). the second phase includes scheduling, which assigns operations to the appropriate control steps. allocation, the third phase in high level synthesis, is the data-path synthesis that allocates hardware resources such as registers and busses, and binds the operations of the dfg to functional units [1]. the hls phase consists of interdependent tasks such as scheduling and allocation. scheduling is the process of assigning the operations to specific control steps, while resource allocation refers to the assignment of the functional units that perform the operations and of the multiplexers and demultiplexers that switch between different inputs and outputs. however, solving the integrated scheduling and allocation problem by exhaustive analysis is computationally prohibitive [1].

2.
related work

the problem of design space exploration was addressed in [2], where the authors have proposed the use of a genetic algorithm in the binding and allocation phase in high level synthesis. this method involves crossover dependence on the force directed data path binding completion algorithm. one of the problems with [2] is that the method accepts a scheduled data flow graph as an input. this clearly signifies the inability of their approach to resolve the scheduling problem. authors in [3] have also proposed a genetic algorithm for time constrained scheduling. the chromosome is encoded with the permutation of operations, which is decoded by a list decoder into a valid schedule. however, the approach does not handle chaining and execution time optimization. in addition, authors in [4] have proposed a problem space genetic algorithm for design space exploration of data paths. the authors have used the concept of heuristic/problem pair to convert a data flow graph into a valid schedule. the chromosome is encoded based on the "work remaining" value of each node. one of the problems with approach [4] is that the second special parent chromosome built in correspondence with the minimum functional units (i.e. serial implementation) does not differ in the work remaining field of the first special chromosome. this may not always properly lead to reaching the optimal solution. further, the cost function considers only latency and not total execution time. the problem of design space exploration was also addressed in [5] by suggesting order of efficiency, which assists in deciding preferences amongst the different pareto optimal points. research in [6] suggested that identification of a few superior design points from the pareto set suffices for an excellent design process. evolutionary algorithms in [7], such as the genetic algorithm (ga), have been suggested to yield better results for the design space exploration process.
the use of ga has also been suggested as a framework for dse of data paths in high level synthesis in [8]. authors in this approach have proposed a priority order based chromosome for the data schedules and an independent chromosome for the functional units. their work uses the robust search capabilities of the genetic algorithm for scheduling and allocation of the datapath with the aim of finding a solution for both the module selection and scheduling. one of the drawbacks of [8] is that the approach does not consider resource binding. thus, the cost function proposed does not reflect the multiplexer and demultiplexer resources. furthermore, like other ga design space exploration approaches, [8] only considers optimization of latency and area. another approach introduced by researchers in [1] was also based on pareto optimal analysis. according to their work, the design space was arranged in the form of an architecture vector design space for architecture variant analysis and optimization of performance parameters. though the results proved promising, the approach was unable to handle chaining based scheduling. furthermore, in [9] and [10], authors described another approach to dse in high level systems based on binary encoding of the chromosomes. the work shown in [11] for dse used an evolutionary algorithm for successful evaluation of the design for an application specific soc. approaches [9]-[11] only considered traditional latency and not the execution time constraint for data pipelining. the work shown in [12] discusses the optimization of area, delay and power in behavioral synthesis, but does not focus on high level design space exploration using a multi chromosomal genetic algorithm, nor does it consider execution time during data pipelining.
furthermore, the authors of [13] introduce a tool called systemcodesigner that offers rapid design space exploration with rapid prototyping of behavioral systemc models. automated integration was achieved by integrating behavioral synthesis into their design flow, while the authors of [14] describe state-of-the-art high-level synthesis techniques for dynamically reconfigurable systems. additionally, the authors of [15]-[17] also used genetic algorithms for scheduling and resource allocation in data path synthesis.

another class of scheduling methods employed previously is probabilistic in nature. for example, simulated annealing (sa) and simulated evolution (se) based scheduling techniques have been used for the high level synthesis problem. the authors of [18], [19] proposed a simulated annealing scheduling method called 'salsa', which uses many probabilistic search operators to enhance the performance of the sa-based technique for the high level synthesis problem. moreover, they also proposed an extended binding model for handling the scheduling problem in high level synthesis. furthermore, the authors of [20] also used sa for the scheduling problem with simultaneous minimization of registers and function units. se was proposed in [21] for solving the combined problem of scheduling and resource allocation in high level synthesis. none of the aforementioned approaches [15]-[21], however, considers execution time, chaining or data pipelining. in contrast to the proposed approach, [15]-[17] do not incorporate a special seeding process based on serial and parallel implementation to efficiently guide the ga to an optimal/near-optimal solution.

other previously proposed approaches [22], [23] are based on integer linear programming (ilp). here the computational complexity is massive: although these approaches provide good results, they consume enormous time. furthermore, the concept of data pipelining based on execution time was not shown during system trade-off.
constructive approaches [24]-[27] are very straightforward to implement but suffer from the major drawback of producing poor quality solutions owing to their greedy nature.

3. the proposed approach for genetic algorithm based exploration system (expsys)

the approach proposed in this paper for finding the optimal integrated scheduling, allocation, binding and module selection employs a special multi chromosomal compound chromosome structure that can search the design space efficiently. it provides an integrated solution to the problem of scheduling, allocation and binding by yielding a set of hardware resources that contains the details of the functional units (e.g. number and kind). further, this solution reduces the cost function based on the constraints provided for hardware area (consisting of function units, multiplexers and demultiplexers) and execution time (considering latency, cycle time and number of sets of data to be executed). in order to reduce the final cost, the module selection indicates the optimal number of resources needed of each kind, as well as the right version of a specific resource needed from the module library during implementation. the expsys has been developed with a new chromosome encoding technique that consists of separate chromosome structures for each of the resources, rather than the traditional method of a single chromosome structure representing all the resources. moreover, the proposed approach also includes an independent chromosome representation of the module allocation fields.

3.1 the expsys overview

the input to the ga framework is the behavioral description of the dataflow graph (dfg), or the high level description of the algorithm in c language, that describes the behavior of the application.
in addition to the behavioral description of the application, the input to the ga framework also includes the set of user specified design constraints for hardware area and execution time (with the user specified weight factors for the hardware area-execution time tradeoff), control parameters for the genetic algorithm, and the module library, which contains three kinds of information, viz. maximum resources available, clock cycles and area. the proposed framework comprises two basic units. the first unit is the proposed heuristic that acts as an input to the skeleton for the genetic algorithm. the second unit processes the information provided by the first unit to produce a final integrated scheduling, allocation and module selection solution. the proposed skeleton (algorithm) for the genetic algorithm is shown in fig. 1. it uses a new heuristic based on a residual load criterion that assigns a specific priority to each operation in the chromosome structure. the first parent (p1) chromosome of the nodal string (this string is defined later in section 3.2) is encoded based on the residual load (α) of each resource from the asap scheduling graph. in contrast, each operation of the second parent (p2) nodal string is encoded based on the difference between the latency obtained by asap scheduling with maximum resources (l_asap) and the residual load (α) obtained for that operation (oi) in the p1 chromosome. hence, the encoded value of each operation (oi) of the second parent chromosome is calculated using equation (1).

$$\beta(o_i) = l_{asap} - \alpha(o_i) \tag{1}$$

the rest of the parents of the population in the nodal string, encoded with residual load values, are obtained by random perturbation. the other parent chromosomes (p3…pn) obtained by the perturbation function should be individuals lying between the parent p1, derived from the schedule based on maximum resources, and the parent p2, derived based on minimum resources.
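as a minimal sketch of equation (1), the python snippet below derives the p2 encoding from a p1 encoding using β(oi) = l_asap - α(oi), as also stated in step 3b of fig. 1. the operation names and the individual residual load values are hypothetical and chosen only for illustration; the function name `encode_second_parent` is likewise an assumption, not part of expsys.

```python
def encode_second_parent(alpha, l_asap):
    """Return beta(o_i) = l_asap - alpha(o_i) for every operation o_i."""
    return {op: l_asap - load for op, load in alpha.items()}

# hypothetical p1 residual loads in clock cycles (cc); latency of 12cc assumed
alpha_p1 = {"o1": 12, "o2": 8, "o3": 4, "o4": 2}
beta_p2 = encode_second_parent(alpha_p1, l_asap=12)
print(beta_p2)  # {'o1': 0, 'o2': 4, 'o3': 8, 'o4': 10}
```

the operation with the highest priority in p1 receives the lowest value in p2, so the two special parents sit at opposite ends of the priority range.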
this is more logical because the optimal solution to the integrated problem lies somewhere between the maximum and the minimum resource. the developed perturbation function, which yields the residual load values, is given in equation (2):

$$pf = \frac{\alpha + \beta}{2} \pm \mu \tag{2}$$

where 'µ' is a random value between 'α' and 'β'. the random value 'µ' is added to the perturbation function because, in order to have more diversity in the initial population, the residual load values for the rest of the parents (p3…pn) should differ (note: this residual load value determines the priority among nodes during the decoding process). moreover, greater diversity results in searching all the corners of the design space, thereby assisting in finding the optimal/near-optimal solution. ignoring 'µ' in the above function would encode the nodal string part for the rest of the parents (p3…pn) with the same residual load values, thereby reducing the diversity of the initial population. the function in equation (2) is used when encoding the values of the nodal string for the rest of the parents. on the other hand, the perturbation of the resource allocation string (this string is defined later in section 3.2) for the other parents is obtained by applying the perturbation rule given below, after the ga skeleton of fig. 1.

algorithm
1) schedule the dfg using the asap algorithm and calculate the latency (l).
2) generation g = 1.
3) create the initial population by chromosome encoding with a priority list of nodes based on 'residual load', as follows:
   a) encode the first parent (p1) of the nodal string using the residual load (α) based on the asap schedule. encode the first parent (p1) of the resource allocation string with maximum resources.
   b) encode the second parent (p2) of the nodal string using the residual load (β) calculated as l_asap - α(oi), based on minimum resources. encode the second parent (p2) of the resource allocation string with minimum resources.
   c) create the rest of the parents (p3…pn) of the nodal string with residual load based on the perturbation function (α + β)/2 ± µ, where 'µ' is a random value between 'α' and 'β'.
4) perform crossover with very high probability (pcross) among parents to create off-spring.
5) decode the chromosomes using the proposed 'residual load heuristic' to find scheduling solutions by binding dfg operations to fus and allocating muxes and demuxes.
6) get information about the functional units (fu), such as versions, area occupied, clock cycles etc., from the module library.
7) calculate the global cost function and determine the fitness of each individual. the global cost function considers:
   a) total area, which is a combination of i) area of fu, ii) area of mux and iii) area of demux;
   b) total execution time, which is a combination of i) latency, ii) cycle time and iii) number of sets of data.
8) perform mutation on the least fit nodal string chromosome and resource allocation string chromosome with probability pm = 0.25. mutation is performed once every generation.
9) decode the mutated chromosomes using the proposed 'residual load heuristic' to find scheduling solutions, and then calculate the cost of the mutated chromosome again.
10) select the best population from the set of off-spring and parents of this generation and take it forward to the next generation. increment g (g = g+1) and repeat while g < g(max).
11) end ga run.

fig. 1 the proposed skeleton for the expsys

perturbation rule for the resource allocation chromosome for the rest of the parents
1. randomly pick any two nodes (v1, v2) from the chromosome that represents the resource allocation.
2. randomly select an integer value (i) between or equal to 'α' and 'β' for that specific operation (node). hence, α <= i <= β.

once the parents for the initial population are formed, direct crossover is applied. crossover creates the off-spring of that generation. for every mating between two parents, two off-spring can be created. if, for example, the number of parents in the population is 8, then 16 off-spring will be produced. therefore, the total population of the first generation is 24. the next task is to decode the generated individuals of the first generation by applying a new 'residual load heuristic' that always results in a valid schedule. during formation of the schedule solution, data dependency is strictly followed before any operation is selected for scheduling. the global cost function is then determined in order to judge the fitness of each individual solution. the least fit individual is mutated in the hope of a better solution. after mutation, the mutated chromosome is decoded again and its fitness is evaluated. the best fit individuals from this first generation are then forwarded to the next generation. this process continues until the specified maximum generation g(max) is reached.

3.2 chromosome representation

suitable encoding of the problem dictates the capability of the genetic algorithm to find optimal or near-optimal solutions. the proposed approach uses a multi chromosome structure consisting of independent strings to separately represent the priority of the nodes of the dfg for each fu type and the resource allocation information. the approach is called multi chromosomal because each fu (resource) is represented as an independent substring in the nodal string structure. it has two independent strings to separately represent the nodes of the dfg (called 'nodal string') and the resource allocation (called 'resource allocation string').
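the seeding of the remaining parents (p3…pn) described in section 3.1 can be sketched roughly as follows. this is only an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the clamping of the perturbed residual load to the range spanned by the two special parents is an added assumption (made so that the values stay between p1 and p2), and copying the untouched allocation genes from p1 is also an assumption, since the text does not say which values they take. the maximum and minimum allocations are the ones quoted for the differential equation solver example.

```python
import random

def perturb_nodal(alpha, beta, rng):
    """Equation (2): pf = (alpha + beta) / 2 +/- mu for one operation."""
    lo, hi = min(alpha, beta), max(alpha, beta)
    mu = rng.uniform(lo, hi)            # random value between alpha and beta
    sign = rng.choice((-1, 1))          # the "+/-" in the perturbation function
    value = (alpha + beta) / 2 + sign * mu
    # assumption: clamp so the value stays between the p1 and p2 encodings
    return max(lo, min(hi, value))

def perturb_allocation(p_max, p_min, rng):
    """Pick two resource types and redraw each between its min and max."""
    child = dict(p_max)                 # assumption: other genes copied from p1
    for fu in rng.sample(sorted(child), 2):
        child[fu] = rng.randint(p_min[fu], p_max[fu])
    return child

rng = random.Random(0)                  # fixed seed, only for repeatability
p1_alloc = {"m": 3, "a": 3, "s": 2, "c": 1}   # maximum resources (p1)
p2_alloc = {"m": 1, "a": 1, "s": 1, "c": 1}   # minimum resources (p2)
p3_alloc = perturb_allocation(p1_alloc, p2_alloc, rng)
p3_load = perturb_nodal(alpha=12, beta=0, rng=rng)
print(p3_alloc, p3_load)
```

calling both functions repeatedly with the same random generator yields parents p3…pn whose genes differ from one another, which is the diversity argument made for the random term 'µ'.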
the 'nodal string' contains the residual load value of each node, which determines the priority of the nodes during scheduling. the 'residual load heuristic' is used when decoding the nodal string in order to obtain a valid scheduling solution. the 'resource allocation string' contains a list of integers that indicate the maximum number of resources allowed during scheduling: a substring of integers represents the maximum number of functional units of each type available for scheduling in every time step of the schedule. this encoding scheme for both the resource allocation string and the nodal string ensures that the genetic algorithm always produces a valid schedule and can reach all the corners of the design space to explore the integrated solution of scheduling, allocation and binding. the encoding scheme for the 'nodal string' and the 'resource allocation string' is illustrated with the benchmark 'differential equation solver'. for clarity, small delay values in clock cycles (cc) are used in this demonstration; real values have been used during experimentation. the schedule of the dfg of the differential equation solver using asap is shown in fig. 2. the latency (l) obtained is 12cc (note: multipliers and adders/subtractors are assumed to take 4cc and 2cc respectively). the corresponding chromosome encoding for the first parent (p1) of the nodal string is shown in fig. 3(a). the total residual load of each operation (node) is obtained by summation of the residual load of the successor operations following that node. e.g. for node 1, the residual load is (4+4+2+2) cc = 12cc. the second parent (p2) chromosome is encoded based on the residual load values obtained using equation (1). the second parent (p2) chromosome encoding is shown in fig. 3(b).
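one possible reading of the residual load computation is sketched below. it assumes that the residual load of a node is its own delay plus the largest residual load among its successors, which reproduces the (4+4+2+2) cc = 12cc value quoted for node 1. the graph is a hypothetical fragment and not the actual dfg of fig. 2, and the names `DELAY`, `DFG` and `residual_load` are illustrative only.

```python
from functools import lru_cache

# delays in clock cycles (cc) as assumed in the text
DELAY = {"mul": 4, "add": 2, "sub": 2}

# hypothetical dfg fragment: node -> (operation type, successor nodes)
DFG = {
    1: ("mul", (3,)),
    2: ("mul", (3,)),
    3: ("mul", (4,)),
    4: ("sub", (5,)),
    5: ("sub", ()),
    6: ("add", ()),
}

@lru_cache(maxsize=None)
def residual_load(node):
    """Own delay plus the largest residual load among the successors."""
    op, successors = DFG[node]
    return DELAY[op] + max((residual_load(s) for s in successors), default=0)

print({node: residual_load(node) for node in DFG})
# {1: 12, 2: 12, 3: 8, 4: 4, 5: 2, 6: 2}
```

nodes that start the longest chain of dependent operations get the largest value, which is why the value can serve as a scheduling priority.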
the rest of the parents of the initial population are obtained using equation (2), which is a perturbation function used to encode the residual load values. the residual load values for the rest of the parents always lie between the values of the first parent and the second parent. this scheme has been developed because the optimal solution to the problem should always lie between the serial and the maximally parallel implementation [4]. on the other hand, the first parent (p1) and second parent (p2) of the resource allocation string, shown in fig. 3(a) and fig. 3(b), are based on the user specified maximum and minimum resources respectively. for example, the first parent (p1) of the resource allocation string shown in fig. 3(a) consists of three multipliers, three adders, two subtractors and one comparator, while the second parent (p2) shown in fig. 3(b) consists of one multiplier, one adder, one subtractor and one comparator. the rest of the parents (p3…p8) of the 'resource allocation string' are obtained using the algorithm in fig. 3. the 'resource allocation string' for the rest of the parents of the initial population is also encoded with multiplier, adder, subtractor and comparator options (note: 'm', 'a', 's' and 'c' refer to multipliers, adders, subtractors and comparators respectively in the resource allocation string). thus, the final solution found by the proposed expsys indicates the final combination of multipliers, adders, subtractors and comparators needed to implement the problem under the user specified hardware area and execution time constraints. the nodal string and the resource allocation string for the rest of the parents are shown in fig. 4(a) and fig. 4(b) respectively.
for example, in the case of fig. 4(a), the encoding of the third parent for the resource allocation string is obtained by first randomly picking any two nodes, m (multiplier) and a (adder), and then randomly selecting an integer value between '3' and '1' for m and between '3' and '1' for a. the randomly selected value for both m and a is '2'. similarly, the rest of the parent chromosomes can be built by perturbation. this type of perturbation for the 'resource allocation string', together with the perturbation function for the 'nodal string' described before, aids in searching all the possible combinations of the design space so that the ga can reach an optimal or near-optimal solution.

fig. 2 scheduling of differential equation solver using asap

fig. 3 chromosome encoding for the first parent (a) and second parent (b)

fig. 4 chromosome encoding for the third parent (a) and fourth parent (b)

fig. 5 crossover between p1 and p2

3.3 crossover technique

crossover is a technique for producing off-spring when two parents mate. the parents are selected by a binary tournament selection method [28]. in this work, we propose independent direct crossover of the two independent strings, viz. the nodal string and the resource allocation string, to produce separate off-spring for each with a very high crossover probability (pcross = 1.0). furthermore, the direct crossover is applied to each substructure of the nodal string structure. for example, direct crossover is independently applied to the adder substring, multiplier substring, subtractor substring, etc. of each nodal string as well as resource allocation string. since the nodal string encodes the residual load of each operation for a particular fu, the crossover only crosses the residual load values. hence the precedence relationship among the operators is not violated.
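the independent direct crossover of the substrings can be illustrated with the short sketch below. it assumes that a nodal string is stored as a mapping from fu type to a list of residual load values in a fixed node order; this representation, the function name `direct_crossover` and the numeric values are assumptions made for the example only.

```python
import random

def direct_crossover(parent_a, parent_b, rng):
    """Cross two nodal strings, cutting every fu substring independently."""
    child_1, child_2 = {}, {}
    for fu, genes_a in parent_a.items():
        genes_b = parent_b[fu]
        cut = rng.randint(0, len(genes_a))      # random cut point per substring
        child_1[fu] = genes_a[:cut] + genes_b[cut:]
        child_2[fu] = genes_b[:cut] + genes_a[cut:]
    return child_1, child_2

# hypothetical residual loads, listed per fu type in a fixed node order
p1 = {"mul": [12, 8, 8, 4], "add": [2, 2], "sub": [4, 2]}
p2 = {"mul": [0, 4, 4, 8], "add": [10, 10], "sub": [8, 10]}
off_1, off_2 = direct_crossover(p1, p2, random.Random(1))
print(off_1)
print(off_2)
```

every position still refers to the same node after crossing and only its residual load value changes, which is why the precedence relationship among the operations is left untouched.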
3.3.1 multi-point crossover of the nodal string before the crossover scheme can be applied to the nodal strings, the two parents are randomly divided into two halves at point n. the crossover point selected during crossing is absolutely random. this is because the nodal string is encoded with residual load values of the nodes and crossover operation only crosses the residual load values, hence choosing a random cut point for crossover does not disturb the precedence relationship among the nodes. only random cut point has been used in the proposed work as this technique has been widely used by other approaches and provided efficient results. the proposed crossover is called multi-point because each substring of the nodal string representing independent fus is divided at a different point. for example, applying the direct crossover operator to the nodal string between the first parent (fig. 3(a)) and second parent (fig.3(b)) at point 2 for multiplier and point 1 for adder and subtractor, yields offspring 1 and offspring 2 respectively. offspring 1 inherits all the properties of the first half from the first parent, while the second half of the offspring is inherited from the second parent. the properties that are inherited from the parents are the residual load values and its corresponding node numbers (operations). the offspring 1 obtained after crossover between p1 and p2 is shown in fig 5(a), while offspring 2 obtained after crossover between p2 and p1 is shown in fig. 5(b). similarly the other offspring are obtained by crossing between the rest of the parents. for the sake of brevity, the rest of the offspring obtained have been omitted in this paper. 3.3.2 crossover of the resource allocation string the resource allocation string is responsible for encoding the number of hardware functional units of each type available for scheduling operations in each time step. 
since the number of allocated functional units of each type is totally independent of the others, 1-point crossover can be easily applied. for instance, in the case of the dfg for the differential equation solver benchmark, the two parents (p1 and p2) for the resource allocation string are shown in fig. 3(a) and 3(b) respectively. p1 represents a solution with three multipliers, three adders, two subtractors and one comparator, while p2 represents a solution with one multiplier, one adder, one subtractor and one comparator. application of the direct crossover at a random cut point between p1 and p2 yields offspring 1, while crossing between p2 and p1 yields offspring 2, as shown in fig. 5(b).

3.4 mutation operation

3.4.1 mutation operator of the nodal string

the mutation algorithm for the resource allocation string is adopted from [8] and is based on random increment or decrement, while the mutation for the nodal string is shown below:

algorithm
1. randomly pick any two nodes (vi, vj) from the nodal string [k].
2. swap the residual load values of the two selected nodes: if vi = li and vj = lj, then vi = lj and vj = li.

according to the algorithm, any two nodes (vi, vj) in the string (k) are randomly selected for mutation. next, the residual load values of the two selected nodes are swapped. for example, let the residual load values of the two selected nodes (vi) and (vj) be 'li' and 'lj' respectively. after mutation, the new residual load value for node (vi) is 'lj' and for node (vj) is 'li'. this mutation technique drastically alters the residual load values, which act as the priority to select the operations for scheduling. as a result of this drastic alteration, the new operation to be scheduled can vastly affect the scheduling cost.
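the swap mutation of the nodal string can be written in a few lines. the sketch below assumes one substring of the nodal string is held as a list of residual load values; the function name and the sample values are hypothetical.

```python
import random

def mutate_nodal_substring(loads, rng):
    """Swap the residual load values of two randomly chosen nodes."""
    mutated = list(loads)                         # leave the parent untouched
    i, j = rng.sample(range(len(mutated)), 2)     # two distinct positions
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated

original = [12, 8, 8, 4, 2]                       # hypothetical residual loads
mutated = mutate_nodal_substring(original, random.Random(3))
print(original, "->", mutated)
```

the multiset of residual load values is preserved; only the assignment of priorities to nodes changes, so the decoder may pick a different operation first.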
3.5 decoding process (determination of a valid schedule)

the decoding of chromosomes always results in a valid scheduling solution, which strictly obeys the data dependency present between the operations. for the decoding process, a 'residual load heuristic' is proposed. the residual load heuristic is shown in fig. 6. for example, in the case of offspring 1, the resource allocation string and the nodal string are shown in fig. 5(a) and fig. 5(b) respectively. the resource allocation string of offspring 1 represents an allocation solution containing three multipliers, three adders, one subtractor and one comparator. on the other hand, the priority of each operation for a particular type of fu is indicated by the residual load values in the nodal string (fig. 5(b)). therefore, for the dataflow graph shown in fig. 3, the scheduling solution of offspring 1 is shown in fig. 7. the resulting solution is a valid schedule, allocation and binding obtained for offspring 1. it provides an integrated solution to the concurrent problem of scheduling, allocation and binding.

3.6 global cost function and fitness evaluation methodology

the objective of the proposed approach is to simultaneously reduce the execution time required for a specific set of data and the total hardware area occupied. most of the previous approaches [2], [4], [7], [8] have only considered latency as a design constraint and not total execution time, which comprises the latency, the cycle time and also the number of sets of data to be executed. in the presented approach, a comprehensive cost function has been developed that considers the total execution delay, taking data pipelining as well as the total hardware area into account. the decoding process strictly follows the 'residual load heuristic' and hence always results in a feasible solution. the cost function (cg) developed, which considers total execution time and area, is shown in eq. (3).
fig. 6 flow chart for residual load heuristic

fig. 7 chaining schedule and allocation to offspring 1 (decoded)

$$c_g = w_1 \frac{t_{exe} - t_{cons}}{t_{max}} + w_2 \frac{[a_{fu} + (a_{mux} + a_{demux})] - a_{cons}}{a_{max}} \tag{3}$$

texe = total execution time taken for execution of the given sets of data, where texe is calculated using the function from [1] given in equation (4):

$$t_{exe} = l + (n - 1) \cdot t_c \tag{4}$$

l = latency of the scheduling solution.
tc = cycle time of the scheduling solution. (note: the cycle time is the difference in clock cycles between any consecutive outputs of pipelined data instances. the cycle time information is therefore not extracted from the module library, since it is not readily available there. as an example, consider the cycle time calculation for the integrated solution of fig. 7: the output for the first set of data arrives after 14cc, while the output for the second instance of data arrives after 26cc. thus, due to pipelining there is a cycle time difference of 12cc resulting from the initiation interval. the cycle time during pipelining, which is the effect of considering the initiation interval during data pipelining, has therefore also been taken into account during the exploration process.)
at = total area, calculated using eq. (5):

$$a_t = a_{fu} + (a_{mux} + a_{demux}) \tag{5}$$

n = number of sets of data to be executed.
cg = global cost of the integrated solution.
tcons = execution time specified by the user.
tmax = maximum execution time taken by a solution during the specific generation (g).
afu = total area of the functional units.
amux = total area of the multiplexers used during implementation.
ademux = total area of the demultiplexers used during implementation.
acons = area constraint specified by the user.
amax = maximum hardware area of a solution during the specific generation (g).
w1 and w2 = user specified preference of the constraints.
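the quantities defined above can be combined in a short sketch. the reading of the global cost used here, a weighted sum of the normalized gaps between the achieved and the user specified execution time and area, is an assumption about equation (3), and the function names are hypothetical. the latency of 14cc and the cycle time of 12cc are the fig. 7 values quoted in the text; all area figures and all constraint and maximum values are made up for illustration, and w1 = w2 = 0.5 follows the note of table 1.

```python
def execution_time(latency, cycle_time, n_sets):
    """Equation (4): pipelined execution time for n_sets data sets."""
    return latency + (n_sets - 1) * cycle_time

def total_area(a_fu, a_mux, a_demux):
    """Equation (5): functional units plus multiplexers and demultiplexers."""
    return a_fu + (a_mux + a_demux)

def global_cost(t_exe, t_cons, t_max, a_t, a_cons, a_max, w1=0.5, w2=0.5):
    """Assumed reading of equation (3): weighted, normalized constraint gaps."""
    return w1 * (t_exe - t_cons) / t_max + w2 * (a_t - a_cons) / a_max

# latency 14cc and cycle time 12cc are the fig. 7 values quoted in the text
t_two = execution_time(latency=14, cycle_time=12, n_sets=2)       # 26
t_1000 = execution_time(latency=14, cycle_time=12, n_sets=1000)   # 12002

# all area figures and the constraint/maximum values below are hypothetical
a_t = total_area(a_fu=8000, a_mux=600, a_demux=400)
cost = global_cost(t_1000, t_cons=15000, t_max=20000,
                   a_t=a_t, a_cons=10000, a_max=12000)
print(t_two, t_1000, a_t, cost)
```

under this reading, a solution that meets both constraints exactly has a cost of zero, and a solution that stays below both constraints has a negative cost.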
the cost function requires input from various sources to evaluate the fitness of each solution found. for the calculation of the execution time, the sources consist of: a) module library information, b) data extracted for the hardware implementation, c) the data flow graph and d) the scheduling solution found after decoding the chromosome (latency), together with the number of sets of data and the cycle time.

3.7 termination criterion for the genetic algorithm

the maximum generation has been kept constant for each benchmark run. although making the number of generations proportional to the problem size is more logical, settling on an average number of maximum generations for both small and large size benchmarks is a good compromise. based on the experiments, the maximum generation g(max) was therefore retained at 100.

4. experimental results

various dsp benchmarks [29], [30], such as digital filter, auto regressive filter (arf), discrete wavelet transformation (dwt), digital butterworth filter, band pass filter (bpf), elliptic wave filter (ewf), mpeg motion vectors, mesa: matrix multiplication and jpeg: down sample, were tested and verified. the proposed approach has been implemented in java and run on an intel core i5-2450m processor, 2.5 ghz, with 3mb l3 cache memory and 4gb ddr3 ram. expsys finds optimal/near-optimal results for all the benchmark applications. moreover, the proposed expsys was also compared to [8] on the mentioned benchmarks under the same constraints in order to assess the quality and strength of the proposed approach. the proposed approach achieved a better quality of result (qor, determined by eq. (6)), as shown in table 1. furthermore, expsys also considers the cycle time resulting from the initiation interval, together with the latency, to create a genuinely pipelined functional data-path during performance calculation.
[8], on the other hand, is not able to optimize the execution time considerably due to its inability to create a genuinely pipelined functional data-path. thus, to determine the execution time in [8], the 'n' sets of processing data are multiplied directly with the latency:

$$t_{exe}^{[8]} = n \cdot l$$

the qor is determined as:

$$qor = \frac{1}{2}\left(\frac{a_t}{a_{max}} + \frac{t_{exe}}{t_{max}}\right) \tag{6}$$

with respect to the achieved qor, expsys produces better solutions than [8] for all the benchmarks, as evident in table 1. for example, in the case of the arf benchmark, the optimal resource configuration found is 3(*) and 1(+), the area of the solution is 10934au, the execution time is 54281µs and the qor is 0.35. on the other hand, [8], based on the same constraints, yields an optimal resource configuration of 4(*), 1(+) with 13776au area, 45630µs execution time and 0.36 qor. expsys achieves an average improvement in qor greater than 26% (table 1).

table 1 experimental results of comparison with [8] for the dsp benchmarks
(note: us = micro seconds and au = area unit; au = 1 transistor, g(max) = 100 and w1 = w2 = 0.5; execution times are for n = 1000)

| dsp benchmark | fu (expsys) | fu ([8]) | mux (expsys) | mux ([8]) | demux (expsys) | demux ([8]) | execution time (us), expsys | execution time (us), [8] | time constraint (us) | area (au), expsys | area (au), [8] | area constraint (au) | qor (expsys) | qor ([8]) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| auto regressive filter (arf) | 3(*),1(+) | 4(*),1(+) | 8 | 10 | 4 | 5 | 54281 | 45630 | 70000 | 10934 | 13776 | 15000 | 0.35 | 0.36 |
| discrete wavelet transformation (dwt) | 4(*),1(+) | 2(*),1(+) | 10 | 6 | 5 | 3 | 10844 | 66420 | 30000 | 13776 | 8092 | 10000 | 0.38 | 0.56 |
| digital butterworth filter | 2(*),1(+) | 3(*),1(+) | 6 | 8 | 3 | 2 | 22880 | 22410 | 30000 | 8092 | 10934 | 9000 | 0.42 | 0.49 |
| band pass filter (bpf) | 4(*),1(+) | 2(*),1(+) | 10 | 6 | 5 | 3 | 11642 | 68310 | 30000 | 13776 | 8092 | 15000 | 0.42 | 0.52 |
| elliptic wave filter (ewf) | 3(*),1(+) | 2(*),2(+) | 8 | 8 | 4 | 4 | 21085 | 46440 | 50000 | 10934 | 10500 | 8000 | 0.45 | 0.57 |
| jpeg downsample | 2(*),1(+) | 1(*),1(+) | 6 | 4 | 3 | 2 | 10818 | 29700 | 15000 | 8092 | 5250 | 15000 | 0.31 | 0.59 |
| mpeg motion vector | 4(*),1(+) | 5(*),1(+) | 10 | 12 | 5 | 6 | 32680 | 35640 | 40000 | 13776 | 16618 | 25000 | 0.24 | 0.27 |
| discrete cosine transformation (dct) | 4(*),1(+) | 2(*),2(+) | 10 | 8 | 5 | 4 | 31467 | 88290 | 50000 | 13776 | 10500 | 15000 | 0.33 | 0.47 |
| mesa horner | 3(*),1(+) | 2(*),1(+) | 8 | 6 | 4 | 3 | 10843 | 65070 | 25000 | 10934 | 8092 | 12000 | 0.35 | 0.59 |
| mesa matrix multiplication | 7(*),1(+) | 4(*),2(+) | 16 | 12 | 8 | 6 | 53628 | 132570 | 200000 | 32570 | 16184 | 40000 | 0.19 | 0.24 |

5. conclusion

this paper proposed a novel technique for area-execution time tradeoff using residual load decoding heuristics in a genetic algorithm (ga) for integrated design space exploration (dse). to the best of the authors' knowledge, this approach is the first ga-based dse method for area-execution time tradeoff in hls. based on the results obtained from the experiments, the proposed expsys is able to provide not only competitive but also superior results for almost all tested dsp benchmarks.

acknowledgement: this work is supported by the optimization and algorithm research lab (opral), ryerson university, canadian microelectronics corporation (cmc), motorola, nserc crsng, ontario innovation trust and sun microsystems. additionally, this work acknowledges the assistance provided by science and engineering research board (serb), department of science and technology, govt. of india.

references

[1] anirban sengupta, reza sedaghat, zhipeng zeng, "a high level synthesis design flow with a novel approach for efficient design space exploration in case of multi parametric optimization objective", microelectronics reliability, elsevier, volume 50, issue 3, march 2010, pages 424-437.
[2] c. mandal, p. p. chakrabarti, and s. ghose, "gabind: a ga approach to allocation and binding for the high-level synthesis of data paths," ieee transaction on vlsi, vol. 8, no. 5, pp. 747–750, oct. 2000.
[3] m. j. m. heijlingers, l. j. m. cluitmans, and j. a. g. jess, "high-level synthesis scheduling and allocation using genetic algorithms," in proc. asp-dac., pp. 61–66, 1995.
[4] m. k. dhodhi, f. h. hielscher, r. h. storer, and j. bhasker, "datapath synthesis using a problem-space genetic algorithm," in ieee trans. comput.-aided des., vol. 14, pp. 934–944, 1995.
[5] i. das, "a preference ordering among various pareto optimal alternatives," structural and multidisciplinary optimization, 18(1):30–35, aug. 1999.
[6] alessandro g. di nuovo, maurizio palesi, davide patti, "fuzzy decision making in embedded system design," proc. of 4th intl conference on hardware/software codesign and system synthesis, pp. 223-228, october 2006.
[7] j. c. gallagher, s. vigraham, and g. kramer, "a family of compact genetic algorithms for intrinsic evolvable hardware," ieee trans. evolutionary computation, vol. 8, no. 2, pp. 1–126, apr. 2004.
[8] vyas krishnan and srinivas katkoori, "a genetic algorithm for the design space exploration of datapaths during high-level synthesis," ieee tran. on evolutionary computation, vol. 10, no. 3, 2006.
[9] e. torbey and j. knight, "high-level synthesis of digital circuits using genetic algorithms," in proc. int. conf. evol. comput., pp. 224–229, may 1998.
[10] e. torbey and j. knight, "performing scheduling and storage optimization simultaneously using genetic algorithms," in proc. ieee midwest symp. circuits systems, pp. 284–287, 1998.
[11] giuseppe ascia, vincenzo catania, alessandro g.
di nuovo, maurizio palesi, davide patti, “efficient design space exploration for application specific systems-on-a-chip”, jrnl of systems architecture 53, pp: 733–750, 2007.
[12] a. c. williams, a. d. brown and m. zwolinski, “simultaneous optimisation of dynamic power, area and delay in behavioural synthesis”, iee proc.-comput. digit. tech, vol. 147, no. 6, pp: 383-390, 2000.
[13] christian haubelt, thomas schlichter, joachim keinert, mike meredith, “systemcodesigner: automatic design space exploration and rapid prototyping from behavioral models”, proceedings of the 45th annual acm ieee design automation conference, pages 580-585, 2008.
[14] xuejie zhang and kam w. ng, “a review of high-level synthesis for dynamically reconfigurable fpgas”, microprocessors and microsystems, elsevier, volume 24, issue 4, pages 199-211,1 2000.
[15] n. wehn et al., “a novel scheduling and allocation approach to datapath synthesis based on genetic paradigms,” in proc. ifip working conf. logic architecture synthesis, pp. 47–56, 1991.
[16] r. m. san and j. p. knight, “genetic algorithms for optimization of integrated circuit synthesis,” in proc. 5th int. conf. genetic algorithms, san mateo, ca, pp. 432–438, 1993.
[17] r. j. cloutier and d. e. thomas, “the combination of scheduling, allocation and mapping in a single algorithm,” in proc. 27th design automation conf., pp. 71–76, jun. 1990.
[18] j. a. nestor and g. krishnamoorthy, “salsa: a new approach to scheduling with timing constraints,” ieee trans. comput.-aided des., vol. 12, pp. 1107–1122, 1993.
[19] g. krishnamoorthy and j. a. nestor, “data path allocation using extended binding model,” in proc. 32nd acm/ieee design automation conf., pp. 279–284, 1992.
[20] s. devadas and a. r. newton, “algorithms for hardware allocation in data path synthesis,” ieee trans. comput.-aided des., vol. 8, pp. 768–781, 1989.
[21] t. a. ly and j. t. mowchenko, “applying simulated evolution to high level synthesis,” ieee trans. comput.-aided des., vol. 12, no.
2, pp. 389–409, feb. 1993.
[22] c. h. gebotys and m. i. elmasry, “global optimization approach for architectural synthesis,” ieee trans. comput.-aided des., vol. 12, pp. 1266–1278, 1993.
[23] c. t. hwang, j. h. lee, y. c. hsu, and y. l. lin, “a formal approach to the scheduling problem in high-level synthesis,” ieee trans. comput.-aided des., vol. 10, no. 2, pp. 464–475, feb. 1991.
[24] g. de micheli, synthesis and optimization of digital circuits. new york: mcgraw-hill, 1994.
[25] r. camposano, “path-based scheduling for synthesis,” ieee trans. cad., vol. 10, pp. 85–93, 1991.
[26] p. g. paulin and j. p. knight, “force-directed scheduling for the behavioral synthesis of asics,” ieee trans. comput.-aided des., vol. 8, no. 6, pp. 661–679, 1989.
[27] a. c. parker, j. t. pizarro, and m. mlinar, “maha: a program for datapath synthesis,” in proc. 23rd acm/ieee design automation conf., 1986, pp. 461–466.
[28] t. blickle and l. thiele, “a mathematical analysis of tournament selection,” in proc. 6th int. conf. genetic algorithms, pp. 9–16, 1995.
[29] http://www.cbl.ncsu.edu/benchmarks/.
[30] saraju p. mohanty, nagarajan ranganathan, elias kougianos and priyadarsan patra, “low-power high-level synthesis for nanoscale cmos circuits”, chapter high-level synthesis fundamentals, springer us, 2008.
facta universitatis series: electronics and energetics vol. 35, no 1, march 2022, pp. i-ii © 2022 by university of niš, serbia | creative commons license: cc by-nc-nd
guest editorial
advanced low-dimensional nanoelectronic devices: physics and modeling
nanoelectronic devices of various kinds are essential for vlsi circuits. the struggle to follow moore’s law is becoming increasingly difficult and complex, requiring numerous novel approaches in order to continue decreasing the dimensions of devices which are already firmly established in the nano-world. as an example, the most advanced state-of-the-art vlsi chips (microprocessors) can currently contain more than 50 billion transistors per chip.
as far as the actual physical dimensions are concerned, in 2021 the ibm company announced their 2 nm chip. the efforts behind such achievements are enormous. this special issue on advanced planar nanoelectronics investigates some points of interest related to the physics of such devices, as well as their simulation, thus giving its contribution to the existing trends in this rapidly evolving and constantly expanding field. on may 19–20, 2021, the ieee kgec student branch chapter, in association with department of ece, kgec, technically co-sponsored by ieee eds kolkata chapter, organized international conference “devices for integrated circuit (devic)”, held in virtual mode as a measure of precaution against the covid-19 pandemic. the devic 2021 ended being a major international conference in the area of electronic devices for application in integrated circuits, with more than 300 submitted papers. it brought together leading scientists, researchers and industry professionals who shared their information and experiences and discussed practical challenges encountered and solutions adopted related to the latest developments in the area of electronic devices, circuits and vlsi. the conference was dedicated to the design, modeling and simulation of nanoelectronic devices, components, circuits and systems. the acceptance rate for the conference was about 50%, which has shown the stringent quality criteria applied to all contributions. the full proceedings of the conference were published by the ieee (isbn: 978-1-7281-99559) and can be found at ieee xplore. selected papers from devic 2021 were used as a loose inspiration for writing extended and modified and amended manuscripts with qualitatively new results for this special section of facta universitatis series: electronics and energetics. thus the articles published here had been specifically written for this special section, being loosely based on the corresponding devic 2021 presentations. 
each newly produced manuscript was subject to a rigorous peer reviewing procedure in which two or three reviewers from different countries were engaged. five papers altogether were selected for this special issue. the chosen articles are the following:
1. dhananjaya tripathy, debiprasad priyabrata acharya, prakash kumar rout, sudhansu mohan biswal, "influence of oxide thickness variation on analog and rf performances of soi finfet"
2. remya jayachandran, k. j. dhanaraj, p. c. subramaniam, "planar cmos and multigate transistors based wide-band ota buffer amplifiers for heavy resistance load"
3. surajit bosu, baibaswata bhattacharjee, "all-optical frequency encoded dibit-based parity generator using reflective semiconductor optical amplifier with simulative verification"
4. bibek chettri, abinash thapa, sanat kumar das, pronita chettri, bikash sharma, "first principle insight into co-doped mos2 for sensing nh3 and ch4"
5. pranati ghoshal, chanchal dey, sunit kumar sen, "realization of a modified 8-bit semiflash analog to digital converter based on bit segmentation scheme"
received january 31, 2022
the guest editors hope that the high quality of the papers included in this issue will encourage young authors to present their own achievements. the greatest pleasure for the editors would be to see new publications inspired by this special section. the guest editors would like to express their gratitude to all of the authors who ensured the existence of this special issue through their excellent contributions. the gratitude also extends to the organizers of the devic 2021 who assembled such a choice group of world-class researchers, to fuee editor-in-chief, prof. danijel danković, as well as to the late member of the serbian academy of sciences and arts, prof. ninoslav stojadinović, who, before his untimely death, initiated and outlined the work on this special section, in cooperation with the general chair of devic 2021, prof. dr. angsuman sarkar.
guest editors:
prof. dr. angsuman sarkar, professor, kalyani government engineering college, university of kalyani, kalyani, west bengal, india
prof. dr. arpan deyasi, assistant professor, department of electronics and communication engineering, rcc institute of information technology, kolkata, india
prof. dr. jyotsna kumar mandal, professor, faculty of engineering, technology and management, kalyani university, kalyani, nadia, west bengal, india
prof. dr. chandan kumar sarkar, professor, department of electronics and telecommunication engineering, jadavpur university, jadavpur, kolkata, west bengal, india
prof. dr. zoran jakšić, full research professor, institute of chemistry, technology and metallurgy, national institute of the republic of serbia – university of belgrade, serbia; associate editor, facta universitatis series: electronics and energetics
facta universitatis series: electronics and energetics vol. 28, no 2, june 2015, pp. 275-286 doi: 10.2298/fuee1502275d
numerical analysis of zno thin layers having rough surface
santolo daliento 1, pierluigi guerriero 1, maria luisa addonizio 2, alessandro antonaia 2
1 department of electrical engineering and information technology, university of naples, naples, italy
2 enea research center, località granatello, portici, italy
abstract. in this paper an automated procedure for the analysis of transparent conductive oxide (tco) layers exhibiting rough surfaces is proposed. the method is based on the interaction between matlab and the sentaurus tcad and is aimed at reducing the computational effort needed for full three dimensional analyses. experiments performed on cvd deposited zno layers, showing the reliability of the method for describing their optical properties, are reported. a semi-empirical technique for the extraction of the tco refractive index is shown as well.
key words: tco, afm, zno, refractive index
1.
introduction
the characterization of transparent and conductive oxides (tco) is an open task for the scientific community. this is mainly due to the fact that in many applications, for example when they are exploited as anti reflective coatings in optoelectronic devices, they exhibit very rough surfaces, and conventional one dimensional models cannot reliably describe their behavior. many advanced software packages offer full three dimensional capabilities for accurate numerical simulations of arbitrarily shaped surfaces but, as they operate on a discretized mesh which reproduces the real surface, the number of required grid points is often too large, leading to impractical computational times. a second issue is the unreliability of geometrical models for modeling light propagation when roughness induces diffraction effects.
received september 8, 2014; received in revised form december 5, 2014. corresponding author: santolo daliento, department of electrical engineering and information technology, university of naples, via claudio 21, naples, italy (e-mail: daliento@unina.it)
in this paper we propose an automated procedure, based on the interaction between matlab and sentaurus tcad [1,2], which finds, in a generic surface, the smallest area characterized by the same statistical features (average roughness, standard deviation and so on) as the whole surface; thus, a reduced device, with the same optical properties as the real one, can be defined and analyzed. the same procedure looks for the existence of a two dimensional section where statistical parameters are conserved as well, so that, in some cases, the analysis can be reduced to a two dimensional one.
the availability of this procedure allowed us to check the limits of light propagation models for a given structure, and led us to determine whether an exact solution of maxwell equations is needed to take into account diffraction phenomena [3,4]. moreover, the availability of a 3d model allowed us to set up a semi empirical procedure to obtain the wavelength dependent refractive index of thin zno layers. it should be underlined that our aim was to look for a global index able to describe the light intensity transmitted beyond a given tco film which, independently of its actual physical meaning, allows reliable numerical simulations of optoelectronic devices. many samples were specifically fabricated by varying deposition parameters to achieve different surface roughness (from smooth to very rough). comparison between experiments and simulations, the latter performed by importing real device geometries, proved the reliability of our approach.
the paper is organized as follows. in section ii the matlab code which evaluates statistical parameters and looks for the minimal surface is described. in section iii the procedure is applied to experimental samples made of cvd deposited zno thin films [5,6]. section iv describes the procedure for evaluating the complex refractive index of deposited layers. conclusions are drawn in section v.
2. processing of afm images
as previously mentioned, all layers analyzed in this paper were cvd deposited zno thin films. fig.1 shows an example chosen among the atomic force microscopy (afm) images acquired for our samples.
fig. 1 afm image of a cvd deposited zno thin film
as can be seen, pyramidal shapes are randomly distributed over the surface, thus assuring an antireflective behavior of the film. as a first step of the processing, the afm images were loaded into the matlab environment. a graphical user interface (gui), shown in fig.
2, was built for user friendly management of the files. the gui allows command input and result visualization; as an example, in the upper right corner of fig.2 we can see the image of fig.1 reproduced as a matlab plot.
fig. 2 matlab gui interface
the matlab code underlying the gui has three main features: it evaluates the statistical roughness parameters of the whole image; it looks for the smallest surface with the same parameters; and it automatically generates the sentaurus input file for numerical analysis. statistical parameters are evaluated according to the definition of the effective roughness height given in [7]
$$r_{rms} = \sqrt{\frac{1}{s}\iint_{s} r^{2}(x,y)\,dx\,dy} \qquad (1)$$
where r is the difference between the profile height and its mean value and s is the area of the sample. the parameter given in (1) is iteratively evaluated by considering decreasing portions of the total area by means of a successive halving criterion. the algorithm stops when the smallest area still holding the starting value of the effective roughness is found. then, the algorithm verifies that different portions of the total surface having the area just defined exhibit the same parameters, thus assuring the statistical robustness of the procedure. once a reduced surface is chosen (an example is shown in the lower right corner of fig.2) the three dimensional analysis of the properties of the thin layer, with respect to light propagation, can be effectively performed by exploiting a correspondingly reduced set of grid points. for the example shown in fig.2 we achieved a reduction of grid points of about 75% and a reduction of the computational time greater than 90%. with the aim of further reducing computational efforts, the matlab code gives the chance to verify whether the analysis of the surface properties can be reduced to a two dimensional problem.
in other words, the effective roughness defined by (1) is evaluated along a finite set of transversal cross sections (the step of the scan is adjustable from the gui). if the effective roughness for a cross section is the same as that already evaluated for the reduced surface, the subsequent numerical analysis is performed in a two-dimensional reference system; otherwise, a full three-dimensional analysis is performed. the procedure ends after automatically generating the input code for the sentaurus environment.
as an example, fig.3 shows an image of the 2d grid generated by sentaurus after receiving the input from matlab, while fig.4 shows the analogous grid for a 3d case.
fig. 3 sentaurus 2d mesh for the cross section of an afm image
fig. 4 three dimensional sentaurus mesh for an afm image
once statistical parameters have been determined, a further simplification, which can optionally be adopted, consists in the substitution of the real surface with an equivalent surface formed only by regular pyramids. the geometries of the pyramids should be chosen so as to have the same statistical parameters as the real surface. the reliability of this simplification is evidenced in fig.5 and fig.6.
fig. 5 sentaurus structure with non uniform (left) and uniform (right) pyramids
the structure on the left side of fig.5 has pyramids with different heights, while the structure on the right side has uniform pyramids whose heights are chosen to give the same statistical parameters as the non uniform structure.
fig. 6 transmittance profiles evaluated by sentaurus for structures with different surface shape but same effective roughness.
as can be seen in fig. 6 the transmittances evaluated for the two cases are perfectly coincident.
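the roughness evaluation, the successive halving search, the robustness check and the cross-section scan described in this section can be sketched as follows. this is a minimal python sketch, not the authors' matlab code: the function names, the 5% relative tolerance, the choice of halving the side of a top-left square window, and the synthetic test surface are all assumptions introduced for illustration.

```python
import numpy as np


def rms_roughness(height):
    """Discrete form of eq. (1): square root of the mean squared deviation
    of the heights from their mean value."""
    r = np.asarray(height, dtype=float) - np.mean(height)
    return float(np.sqrt(np.mean(r ** 2)))


def smallest_equivalent_patch(height, rel_tol=0.05, min_size=8):
    """Halve the side of a top-left square window until its roughness departs
    from the full-image value by more than rel_tol (assumed stopping rule).
    Returns (side of the smallest accepted window, full-image roughness)."""
    target = rms_roughness(height)
    size = min(height.shape)
    while size // 2 >= min_size:
        half = size // 2
        if abs(rms_roughness(height[:half, :half]) - target) > rel_tol * target:
            break
        size = half
    return size, target


def patches_agree(height, size, target, rel_tol=0.05):
    """Robustness check: every non-overlapping size x size tile must show the
    same roughness as the whole image, within rel_tol."""
    rows, cols = height.shape
    for y in range(0, rows - size + 1, size):
        for x in range(0, cols - size + 1, size):
            tile = height[y:y + size, x:x + size]
            if abs(rms_roughness(tile) - target) > rel_tol * target:
                return False
    return True


def matching_cross_sections(height, target, step=1, rel_tol=0.05):
    """Rows (scanned every `step` pixels) whose 1D roughness matches the
    target, i.e. candidates for a two-dimensional analysis."""
    return [y for y in range(0, height.shape[0], step)
            if abs(rms_roughness(height[y, :]) - target) <= rel_tol * target]


if __name__ == "__main__":
    # Hypothetical "AFM image": periodic bumps with a 16-pixel period.
    idx = np.arange(128)
    profile = np.sin(2 * np.pi * idx / 16)
    surface = np.outer(profile, profile)
    side, target = smallest_equivalent_patch(surface)
    print(side, target, patches_agree(surface, side, target))
    print(matching_cross_sections(surface, target, step=2)[:4])
```

on a real afm map the tolerance would have to be tuned against the measurement noise, and an empty list from `matching_cross_sections` would correspond to the case in which the full three-dimensional analysis is retained.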
the coincidence of the two transmittance profiles in fig. 6 means that, in principle, the optical properties of a real surface, like that of fig.1, can be reliably analyzed by considering a corresponding simplified surface, once statistical parameters have been evaluated.
3. experiments
all deposited zno films were both optically and electrically characterized; in particular, optical properties were defined in terms of transmittance profiles. experimental transmittances were then compared with those numerically evaluated by sentaurus on the previously defined reduced surfaces. sentaurus makes available three models for describing both light propagation through materials of given optical properties and light interaction with interfaces between materials with different optical properties. the simplest one is the tmm (transfer matrix method) model [8], which only applies to flat surfaces and is therefore not suitable for our cases. the second model considers light as formed by discrete “rays” and applies geometric optics laws for reflection and transmission. usually, this method is preferred over others because of the reduced computational effort. however, it should be considered that diffraction effects may not be negligible, because the width of pyramids like those shown in fig.1 can be comparable with the wavelengths in the visible spectrum. in such cases the exact solution of maxwell equations should be evaluated; the model provided for this purpose is termed emw (electro magnetic wave solver) in the sentaurus environment. the reliability of both the ray-tracing and emw models was checked with reference to samples either subjected or not subjected to diffraction effects. as an example, fig. 7 shows the comparison between the measured transmittance and the corresponding profile evaluated by means of the ray-tracing method when no diffraction is present.
fig. 7 comparison between a measured transmittance (experiment) and a numerical result achieved by means of the ray-tracing model (2d model).
note that, in this case, according to the criteria defined in section ii, a 2d analysis was performed. the agreement between the curves shown in fig.7 shows that the 2d analysis is accurate enough; moreover, this result supports the reliability of the grid point reduction procedure described in the previous section. an example of results obtained by means of a 3d analysis is reported in fig.8.
fig. 8 measured and numerical transmittance profiles (t (%) versus wavelength (nm); curves: experiment, geometrical model, emw model) for a surface encountering diffraction effects. the emw model correctly reproduces the experimental curve.
the figure refers to a surface manifesting diffraction effects. as can be seen, the transmittance profile achieved by means of the geometrical approach (ray-tracing) is a poor description of the measured transmittance. on the other hand, the transmittance profile obtained with the emw model correctly reproduces the experimental one. it should be emphasized again that the numerically evaluated transmittance profile was achieved on a reduced 3d surface automatically determined by the matlab code.
4. refractive index measurements
the procedure which allows evaluating the transmittance profile once the geometry and optical parameters of a given material are known can conversely be used for determining the optical parameters of an unknown material once the surface geometry and measured transmittance profiles are available. the method described hereafter refers to the characterization of a zno layer deposited on a glass substrate (of assigned thickness and optical parameters) whose transmittance profile was previously measured for different angles of the incident light. only the refractive index of the zno tco is assumed to be unknown.
the transmittance was first determined by sentaurus across the range of possible values of both the real part n and the imaginary part k of the refractive index. indeed, for a given structure, it is always possible to draw a surface like that shown in fig.9, which, for an assigned wavelength, is the locus of the transmittances compatible with all possible n,k couples.
fig. 9 transmittance for a zno on glass structure as a function of the complex refractive index of the zno
the surface shown in fig.9 was obtained by assuming normal incidence (0°) of the light; it was compared with the transmittance actually measured in the same conditions, which was 82%. the intercept of this value with the surface gives a curve in the n,k plane, as shown in fig.10. all n,k couples belonging to the curve lead to the same value of the transmittance; from an application point of view this means that, in order to evaluate the light which is transmitted to an underlying device, all those couples are equivalent and the actual refractive index is not needed. the above result only applies to normal incidence of the light. however, in real devices, the angle of incidence of the light is usually not known a priori, thus a refinement of the extraction is needed.
fig. 10 intercept between the transmittance surface and the measured transmittance. the line is the locus of all refractive indexes giving the same transmittance for the assigned zno on glass structure.
to this end we first considered a possible uncertainty affecting the measurement of the transmittance.
fig. 11 transmittance affected by a measurement error.
as an example, in fig.11 we considered that the measured transmittance was 82% ± 1%. therefore, a region of the n,k plane (instead of a curve), where the refractive indexes compatible with the measured transmittance lie, was now identified.
the portion of the n,k plane between the two lines is, indeed, the locus of refractive indexes giving transmittances in the assigned range. then, the transmittance surface was determined for a 15° incidence angle, as shown in fig.12, and compared with the corresponding measured transmittance, 78% ± 1%, which is shown as well.
fig. 12 transmittance surface and measured transmittance for a 15° incidence angle.
a new region of possible n,k couples compatible with the measured transmittance was thus determined. as the tco is always the same, its actual refractive index, which is independent of the incidence angle, should give both measured transmittances. hence the actual refractive index belongs to both regions identified in fig.11 and fig.12.
fig. 13 locus of refractive indexes giving the transmittance measured for both 0° and 15° incidence angles (axes: n from 1.4 to 2.0, k from 0 to 0.20; regions: 0°, 15°, compatibility surface).
for the sake of clarity, fig.13 shows, in the n,k plane, the two regions identified in fig.11 and fig.12; points lying in the shaded region give both the transmittance measured at 0° and the transmittance measured at 15°. the procedure can be iterated to further reduce the spread of the n,k values to be adopted for simulations. fig.14 shows the result we obtained by adding two further incidence angles, 45° and 60°.
fig. 14 locus of refractive indexes giving the transmittance measured for 0°, 15°, 45° and 60° incidence angles (axes: n from 1.4 to 2.0, k from 0 to 0.20; regions: 0°, 15°, 45°, 60°, compatibility surface).
as can be seen, a very narrow region is now identified which gives all n,k couples compatible with all measured transmittances.
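the intersection of compatibility regions across incidence angles can be illustrated with a short python sketch. it is a sketch under stated assumptions, not the authors' procedure: `toy_transmittance` is a deliberately simple stand-in for the sentaurus simulation (a single air/film interface plus absorption, with no glass substrate, no interference and no roughness), and the film thickness, wavelength, grids and "true" index used below are hypothetical values chosen only to generate synthetic measurements.

```python
import numpy as np


def toy_transmittance(n, k, theta_deg, wavelength_nm=600.0, thickness_nm=500.0):
    """Toy stand-in for the simulator (NOT the Sentaurus model): unpolarized
    Fresnel loss at the air/film interface, evaluated with the real index n,
    times Beer-Lambert absorption (governed by k) along the refracted path."""
    theta_i = np.radians(theta_deg)
    cos_i = np.cos(theta_i)
    cos_t = np.sqrt(1.0 - (np.sin(theta_i) / n) ** 2)   # Snell's law, n >= 1
    r_s = ((cos_i - n * cos_t) / (cos_i + n * cos_t)) ** 2
    r_p = ((n * cos_i - cos_t) / (n * cos_i + cos_t)) ** 2
    alpha = 4.0 * np.pi * k / wavelength_nm              # absorption coefficient
    return (1.0 - 0.5 * (r_s + r_p)) * np.exp(-alpha * thickness_nm / cos_t)


def compatible_region(model, n_grid, k_grid, theta_deg, t_meas, t_err=0.01):
    """Boolean mask over the (n, k) grid: True where the simulated
    transmittance lies within t_meas +/- t_err at this incidence angle."""
    nn, kk = np.meshgrid(n_grid, k_grid, indexing="ij")
    return np.abs(model(nn, kk, theta_deg) - t_meas) <= t_err


def intersect_regions(model, n_grid, k_grid, measurements, t_err=0.01):
    """Intersect the compatibility regions of all {angle: transmittance}
    measurements; the actual (n, k) must lie in every one of them."""
    mask = np.ones((len(n_grid), len(k_grid)), dtype=bool)
    for theta_deg, t_meas in measurements.items():
        mask &= compatible_region(model, n_grid, k_grid, theta_deg, t_meas, t_err)
    return mask


if __name__ == "__main__":
    n_grid = np.linspace(1.4, 2.0, 61)
    k_grid = np.linspace(0.0, 0.2, 81)
    # Synthetic "measurements" generated from a hypothetical true index.
    angles = (0.0, 15.0, 45.0, 60.0)
    meas = {a: float(toy_transmittance(1.8, 0.01, a)) for a in angles}
    for count in range(1, len(angles) + 1):
        subset = {a: meas[a] for a in angles[:count]}
        mask = intersect_regions(toy_transmittance, n_grid, k_grid, subset)
        print(count, "angle(s):", int(mask.sum()), "compatible grid points")
```

each additional angle can only remove points from the mask, which is the mechanism by which the region narrows as more incidence angles are added; how quickly it narrows depends entirely on the transmittance model, so the counts printed by this toy example say nothing about the real zno samples.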
in principle this procedure should give the “actual” refractive index (if measurements were not affected by uncertainty, two angles of incidence would be enough to univocally define n and k). for comparison purposes, the wavelength dependent real part of the refractive index obtained with the present method is compared in fig.15 with the profile obtained on the same sample by means of an inverse procedure based on ellipsometric measurements. ellipsometry is usually considered very reliable and its results are often assumed as reference.
fig. 15 comparison between the real part of the refractive index for a zno on glass thin layer measured by means of an ellipsometric method (inverse procedure) and the one achieved in this work (refractive index n versus wavelength (nm); curves: this work, inverse procedure)
the n,k couples given by the inverse procedure were exploited to evaluate the transmittance profiles of the assigned structure by means of sentaurus simulations. results are shown in fig.16.
fig. 16 comparison between measured transmittance profiles and the transmittance profiles evaluated by sentaurus when the refractive indexes measured by means of the ellipsometric procedure were used (transmittance [%] versus wavelength [nm]; curves: ellipsometric, experimental; incidence angles 0° and 60°).
as can be seen, experiments significantly differ from numerical results, especially in the low wavelength range. on the other hand, it is worth noting that measured transmittances are perfectly coincident with those achieved by exploiting for numerical simulations the n,k couples obtained by means of the method presented in this work.
this result could be considered quite trivial because we extract the n,k couples directly from those measurements; nevertheless, it allows a more reliable modeling of the operation of an optoelectronic device, where the main issue is the correct estimation of the wavelength dependent light availability.
5. conclusions
in this paper an automated procedure for analyzing afm images in the matlab environment has been presented. the procedure evaluates statistical parameters which qualify the roughness of the surface and looks for the minimal area holding the same parameters, with the aim of simplifying sentaurus three-dimensional simulations. the procedure has allowed the extraction of an effective refractive index for zno thin layers. experiments showing the overall reliability of the procedure have been reported.
references
[1] “matlab user’s guide” mathworks, www.mathworks.it
[2] sentaurus user’s guide, http://www.synopsys.com/tcad/devicesimulation/pages/sentaurusdevice.aspx
[3] s. daliento, p. guerriero, m. l. addonizio, a. antonaia, e. gambale, "refractive index measurement in tco layers for micro optoelectronic devices", in proceedings of the 29th international conference on microelectronics, miel 2014. belgrade, serbia, 12-14 may 2014, pp. 265–268.
[4] s. daliento, p. guerriero, m. l. addonizio, a. antonaia, "approximate analysis of optical properties for zno rough surfaces", in proceedings of the 29th international conference on microelectronics, miel 2014. belgrade, serbia, 12-14 may 2014, pp. 261-264.
[5] m. l. addonizio, a. antonaia, "enhanced electrical stability of lp-mocvd-deposited zno:b layers by means of plasma etching treatment", journal of physical chemistry, vol. 117, no. 46, pp. 24268–24276, 2013.
[6] o. tari, a. aronne, m. l. addonizio, s. daliento, e. fanelli, p. pernice, "sol-gel synthesis of zno transparent and conductive films: a critical approach", solar energy materials and solar cells, vol. 105, pp. 179-186, 2012.
[7] din en iso 4287, http://www.iso.org/iso/catalogue_detail.htm?csnumber=10132
[8] s. j. orfanidis, electromagnetic waves and antennas, rutgers university nj, 1999
facta universitatis series: electronics and energetics vol. 32, no 2, june 2019, pp. 231-238 https://doi.org/10.2298/fuee1902231l
plug-and-play transceiver with high gain and ultra low noise figure for ieee 802.15.4 application
josue lopez-leyva, miguel ponce-camacho, ariana talamantes-alvarez
center for innovation and design, cetys university, microwave street, ensenada, mexico
abstract. this paper shows the design and performance simulation of a 2.4 ghz plug-and-play transceiver based on a high speed switch for ieee 802.15.4 applications. the electrical design was optimized taking into account the scattering parameters, input-output impedance matching and minimum trace width. the simulation results show a noise figure of 0.38 db and a gain of 21 db at a particular temperature in reception mode, and report the transmission scattering parameters (s12 and s21) and the reflection scattering parameters (all the remaining parameters) for both operation modes (power amplifier and low noise amplifier).
key words: power amplifier, low noise amplifier, scattering parameters.
1. introduction
nowadays, wireless communication systems are necessary to improve and expand the variety of services for the private, public and personal sectors [1,2]. in particular, the concepts of internet of things (iot) and machine-to-machine (m2m) impose a tendency towards monitoring, control and data acquisition for different types of clients [3].
although there is a large number of wireless communication systems, these require improvements to some parameters, such as the extension of coverage (i.e. link distance), considering the trade-offs between energy consumption, the complexity of the electronic design, and cost-effectiveness. in order to improve these parameters, the power amplifier (pa) and the low noise amplifier (lna) are suitable technical options for full-duplex high-end telecom systems; both have important features such as noise figure, gain, linearity, single/multiple narrow/wide bands and impedance matching [4,5]. however, designing and manufacturing these circuits with high performance for all parameters is a difficult task. resizing the pa and lna is a trend, but the gain-size trade-off is a major issue [6,7]. an lna+pa circuit with higher gain and lower noise figure (nf) is required for wide coverage applications where plug-and-play transceiver systems are needed [8,9]. in terms of low data rate wireless personal area network technologies, ieee 802.15.4 is the most useful standard due to the extended life of the device based on low power consumption [10,11]. wide coverage applications based on this protocol are a crucial issue that the presented novel and optimum transceiver can solve. we propose a reduced plug-and-play transceiver in comparison with the traditional transceiver. the principal objective of our proposal is to increase the distance of communication links without the digital processing performed in traditional transceivers.

received september 6, 2018; received in revised form november 14, 2018
corresponding author: josue lopez-leyva, cetys university, center for innovation and design, mexico (e-mail: josue.lopez@cetys.mx)

this paper is organized as follows: section 2 is dedicated to the general description of the electrical design.
section 3 shows the simulation results regarding scattering parameters in both operation modes, noise figure and gain performance. section 4 concludes the paper and mentions the future work for the manufacture of the electrical board with industrial quality level.

2. electronic design

fig. 1 shows the block diagram of the transceiver (pa+lna); multisim software was used for the simulation analysis. the general set-up presents the lna subsystem, where the incoming signal is received by the antenna (sma connector) and fed to a high speed rf switch. the switch presents a high isolation based on an rlc circuit and two diode circuits, i.e. a dual switching diode circuit (baw56lt1) and a high shunt signal isolator / low shunt insertion loss diode (bar81w) with a switching rate up to 2 ghz. in particular, the rf switch has a control port to commute between transmission and reception mode. after the lna block, the electrical signal is fed to another rf switch to send the signal to the processing board. the signal path and the way of processing for the pa are the same as for the lna. in addition, two test points were established in order to measure the scattering parameters (s-parameters) [12] using a network analyzer (na) for different time slots (i.e., slot #1 for reception mode, which relates port #2 as input and port #1 as output, and slot #2 for transmission mode, which relates port #1 as input and port #2 as output).

fig. 1 block diagram of transceiver. blue trace describes the lna-pa path and red trace describes the pa-lna path with the respective measurement points at slot 1 and 2.

fig. 2a) shows the general electronic diagram for the high speed switch / rf isolator based on the diodes mentioned for transmission mode. a mode controller is used in order to switch modes using the connection points c and d.
in particular, connection point c enables or disables the pa circuit shown in fig. 2b), and connection point d controls the lna circuit shown in fig. 3b), while connection points a and b are the input and output of the pa circuit. an input matching impedance network (imn) and an output matching impedance network (omn) were implemented at the input and output ports of the pa, respectively, as fig. 2 shows.

fig. 2 a) electronic diagram for high speed switch / rf isolator for transmission mode, b) electronic diagram of the pa.

fig. 3 a) electronic diagram for high speed switch / rf isolator for reception mode, b) electronic diagram of lna.

fig. 3a) shows the general electronic diagram for the high speed switch / rf isolator for reception mode. in general, the electronic diagrams shown in fig. 2a) and 3a) are similar; however, particular inductance and capacitance values are modified in order to optimize the imn and omn. in addition, the connection points e and f represent the input and output of the lna circuit. as mentioned, the pa circuit uses the bfp650 transistor; therefore, the first step of the design is to measure the current-voltage characteristics in order to choose and set the q-point (operating or quiescent point). fig. 4 shows the relation between vce and ic for different ib, where the trace corresponding to ib = 6 ma was selected for vce = 3.3 v in order to establish proper operating conditions (q-point) based on the input data signal. the same procedure was performed to determine the q-point of the bfp843f used in the lna circuit, and the same biasing voltage (vce) was chosen.

fig. 4 analysis of the transistor bfp650. blue trace describes the vce-ic relation for different ib. red trace is the optimum steady state for q point.

an important issue in the circuit design is the impedance matching between the electronic element and the transmission line on the pcb.
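the wide-strip closed form referred to as eq. (1) can be sketched in a few lines of python. this is only an illustration: the expression used for the effective dielectric constant is a common quasi-static estimate that the paper does not state, and the board values (1.6 mm substrate, relative permittivity 4.4, 3.0 mm trace) are hypothetical, not taken from the design.

```python
import math


def microstrip_z0(w, h, eps_r):
    """Return (z0 in ohm, eps_eff) for a wide microstrip line with w/h > 1."""
    ratio = w / h
    if ratio <= 1:
        raise ValueError("this closed form is only intended for w/h > 1")
    # Assumption (not stated in the paper): quasi-static estimate of the
    # effective dielectric constant of a microstrip line.
    eps_eff = (eps_r + 1) / 2 + (eps_r - 1) / 2 / math.sqrt(1 + 12 / ratio)
    # Wide-strip characteristic impedance, same form as eq. (1).
    bracket = ratio + 1.393 + 0.667 * math.log(ratio + 1.444)
    z0 = 120 * math.pi / (math.sqrt(eps_eff) * bracket)
    return z0, eps_eff


# Hypothetical board values, chosen only to exercise the formula.
z0, eps_eff = microstrip_z0(w=3.0e-3, h=1.6e-3, eps_r=4.4)
print(f"eps_eff = {eps_eff:.2f}, z0 = {z0:.1f} ohm")
```

sweeping w for a fixed h and permittivity with this function gives a quick first estimate of the trace width needed for a 50 ω line before turning to a transmission line calculator.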
the characteristic impedance (z0) of the microstrip line can therefore be calculated using some physical and electromagnetic parameters, as eq. (1) shows [13].

$$z_0 = \frac{120\pi}{\sqrt{\varepsilon_{eff}}\left[\dfrac{w}{h} + 1.393 + 0.667\,\ln\left(\dfrac{w}{h} + 1.444\right)\right]} \qquad (1)$$

where w is the width, h is the dielectric thickness and εeff is the effective dielectric constant. in particular, eq. (1) is only suitable for microstrip satisfying the relation w/h > 1. however, to optimize this matching, a transmission line calculator was used, where the transmission line type, length and dielectric material characteristics were selected to produce z0 ≈ 50 ω and a minimum capacitance and inductance (see fig. 5). due to the high power demand of the circuit, a trace width analysis was performed in order to calculate the minimum trace width based on the root mean square (rms) electric current in each electrical path. fig. 6 shows the printed circuit board (pcb) layout based on the aforementioned parameters and fig. 7 shows the three-dimensional view of the pcb. the ultiboard software was used for the pcb designs.

fig. 5 transmission line calculator used to determine the characteristic impedance based on particular physical features of the dielectric material and microstrip.

by using the matching circuits imn and omn shown in fig. 2 and 3, the input and output impedances of the pa and lna are obtained as follows: for the pa, zin = 50.1 ω and zout = 49.48 ω, while for the lna circuit, zin = 49.3 ω and zout = 45.05 ω. the impedance matching was performed using l-section networks (i.e. using an inductor and a capacitor); however, the trade-off between bandwidth and gain was considered in the complete design.

fig. 6 pcb layout using c0402 packaging in each electronic element.

fig. 7 3d view of printed circuit board layout.

3. simulation results

fig.
8 shows the simulation results of the s-parameters for the reception mode. the s12 value means that there is a high transmission power ratio (≈ 21 db) of the complete circuit (lna+pa+high speed rf switch), while the s22 value (≈ -19 db) and the s11 value (≈ 21 db) mean that good matching is achieved at the input and output ports, respectively, in the reception operation mode. the s21 value (≈ -27 db) indicates adequate isolation between the input and output ports.

fig. 8 performance of the pa-lna scheme in the reception mode (port #2 is the input and port #1 is the output).

fig. 9 performance of the pa-lna scheme in the transmission mode (port #1 is the input and port #2 is the output).

with respect to the s-parameters in the transmission mode (see fig. 9), s21 and s12 are the most important because they describe the transmitted and the reflected signal levels (≈ 18 db and ≈ -19 db, respectively). in addition, fig. 10 shows the performance of nf and gain (g) depending on the temperature variation at 2.4 ghz. the nf is ≈ 0.6 db and the gain is ≈ 21 db at 27 °c.

fig. 10 nf and gain of the pa-lna scheme in transmission mode (slot #1) at 2.4 ghz with temperature variations.

in addition, the nf and g parameters were evaluated at 18.8 °c (i.e. 292 k, temperature standard). in this case, nf is ≈ 0.38 db and g is ≈ 21.5 db.

4. conclusion

this paper presented a transceiver circuit that has good performance parameters considering s-parameters, noise figure and gain based on the detailed design for imn and omn. the plug-and-play feature provides an easy way to extend the coverage of different traditional wireless systems based on the ieee 802.15.4 standard. it is important to clarify that the principal objective of the proposal is to increase the distance of the communication link of systems based on ieee 802.15.4.
therefore, although conventional and commercial transceivers perform other processes (e.g. digital-to-analog converter, frequency synthesizer, among others), our proposal only focuses on improving the transmission and reception modes without considering modulation, synchronization, coding or encryption schemes, among others. in particular, the analysis for the pcb design is based on microstrip transmission lines, although a ground layer is added in order to improve the performance. because of this, the transmission lines shown in fig. 7 could be mistaken for a conventional coplanar waveguide (cpw); in fact, the impedance analysis is not performed considering a cpw. currently, we have a first prototype that uses fr4 dielectric material in order to perform some accelerated life testing (alt) and technical operating production (top). in addition, the transceiver circuit has been manufactured using a flexible dielectric material and other types of transmission lines in order to enhance the electronic performance.

acknowledgement: this work was supported by the grant of center for innovation and design (ceid), cetys university as an internal scientific and technical project. in addition, this article was prepared within the frame of the industrial-academic relationship of the ceid. in particular, thanks to the native english speaking colleagues who supported this document.

references

[1] j. g. d. hester, j. kimionis and m.m. tentzeris, “printed motes for iot wireless networks: state of the art, challenges, and outlooks”. trans. microwave theory and techniques, vol. 65, pp. 1819–1830, may 2017.
[2] g. zheng, c. hua, r. zheng and q. wang, “toward robust relay placement in 60 ghz mmwave wireless personal area networks with directional antenna”, trans. mobile computing, vol. 15, pp. 762–773, march 2016.
[3] j.w. raymond, t.o. olwal and a.m.
kurien, “cooperative communications in machine to machine (m2m): solutions, challenges and future work”. access, vol. 6, pp. 9750–9766, february 2018.
[4] j-e. baek, y.m cho and k-c. ko, “analysis of design parameters reducing the damage rate of low-noise amplifiers affected by high-power electromagnetic pulses”, trans. plasma science, vol. 46, pp. 524–529, march 2018.
[5] h. laaouane, j. foshi and s. bri, “design of a low noise amplifier for lte radio base station receivers”, in proceedings of the international conference on wireless technologies, embedded and intelligent systems (wits). morocco, ieee, 2017, pp. 1–5.
[6] w-l. ou, y-k. tsai, p-y. tseng and l-h. lu, “a 2.4-ghz dual-mode resizing power amplifier with a constant conductance output matching”. in proceedings of the international system-on-chip conference. munich, ieee, 2017, pp. 258–261.
[7] p. qin and q. xue, “compact wideband lna with gain and input matching bandwidth extensions by transformer”, microwave and wireless components letters, vol. 27, pp. 657–659, july 2017.
[8] j. p. carmo, n. dias, p. m. mendes, c. couto and j. h. correia, “low-power 2.4-ghz rf transceiver for wireless eeg module plug-and-play”. in proceedings of the international conference on electronics, circuits and systems, nice, ieee, 2006, pp. 1144–1147.
[9] w-t. fang and y-s. lin, “highly integrated switched beamformer module for 2.4-ghz wireless transceiver application”, trans. microwave theory and techniques, vol. 64, pp. 2933–2942, sept. 2016.
[10] h-j. jeon, t. demeechai, w-g. lee, d-h. kim and t-g. chang, “ieee 802.15.4 bpsk receiver architecture based on a new efficient detection scheme”, trans. signal processing, vol. 58, pp. 4711–4719, sept 2010.
[11] a. zolfaghari, m-e. said, m. youssef, g. zhang, t-t. liu, f. cattivelli, y-i. syllaios, f. khan, f-q. fang, j. wang, k-y. jason-li, fh. liao, d-s. jin, v. roussel, d-u. lee and f-m.
hameed, “a multimode wpan (bluetooth, ble, ieee 802.15.4) soc for low-power and iot applications”, in proceedings of the symposium on vlsi circuits, kyoto, ieee, 2017, pp. c74–c75.
[12] b. lehmeyer, m.t. ivrlač and j.a. nossek, “lna noise parameter measurement”, in proceedings of the european conference on circuit theory and design, trondheim, ieee, 2015 pp. 1–4.
[13] j.w.n. rogers and c. plett, “radio frequency integrated circuit design”. norwood: artech house, 2010, chapters 4, pp. 63–93.

facta universitatis series: electronics and energetics vol. 28, no 3, september 2015, pp. 457-464
doi: 10.2298/fuee1503457c

using a two-contact circular test structure to determine the specific contact resistivity of contacts to bulk semiconductors

aaron m. collins, yue pan, anthony s. holland
school of science engineering and health, rmit university, australia

abstract. we present a numerical method to extract specific contact resistivity (scr) for three-dimensional (3-d) contact structures using a two-electrode test structure. this method was developed using finite element modeling (fem). experimental measurements were performed for contacts of 200 nm nickel (ni) to p⁺-type germanium (ge) substrates and 200 nm of titanium (ti) on 4h-silicon carbide (sic). the scr obtained was (2.3-27)×10⁻⁶ ω·cm² for the ni-ge contacts and (1.3-2.4)×10⁻³ ω·cm² for the ti-sic contacts.

key words: specific contact resistivity, test structures, ohmic contact.

1. introduction

specific contact resistivity (ρc, [ω·cm²]) is one of the most important parameters in studying metal-semiconductor interfacial properties. this parameter is useful to determine the quality of a contact between two materials, because specific contact resistivity is geometry independent. methods of testing this parameter are therefore of great use to reliability simulations. for measuring the specific contact resistivity, several test structures and methods have been reported [1-6].
among them, the transmission line model (tlm) and the circular transmission line model (ctlm) are commonly used [7] due to their long-standing reliability as testing methods. analysis using the tlm and ctlm is based on a two-dimensional (2-d) model which assumes no voltage drop in the semiconductor layer in the vertical direction. however, due to the reducing size of semiconductor devices and decreased ρc, this vertical voltage drop in the semiconductor layer could lead to errors in the derivation of specific contact resistivity using either tlm or ctlm. furthermore, the prevalence of mems semiconductor devices suggests the need for a 3-d test structure for determining ρc of contacts to such devices. correction factors are commonly used to increase the accuracy of derived specific contact resistivity in 3-d circumstances [8], but not in the technique used in this paper. in this paper, we present a numerical method to extract specific contact resistivity for 3-d contact structures using a two-electrode circular test structure derived from investigation of the conventional three-electrode ctlm [9]. the method was developed using finite element modeling (fem) of ohmic contacts between a metal layer and a semiconductor substrate, and the scaling behavior of this method was also determined and is discussed in this paper. this method finds its most useful application in areas where the lateral dimensions are far greater than the vertical. experimental measurements using the proposed test structure were performed for contacts of 200 nm ni to p-type ge substrates and contacts of 200 nm ti to 4h-sic, and the specific contact resistivity was determined to be (2.3-27)×10⁻⁶ ω·cm² and (1.3-2.4)×10⁻³ ω·cm², respectively.

received december 2, 2014; received in revised form february 10, 2015
corresponding author: yue pan, school of science engineering and health, rmit university, australia (e-mail: s3265073@student.rmit.edu.au)

2.
the structure

as defined by berger [10], the parameter η is used to determine whether an ohmic contact between a metal and a semiconductor is in a 3-d circumstance or not. in (1), when η ≤ 1, we have a 3-d contact; otherwise it is a 2-d contact. note that ρb and t are the resistivity and the thickness of the semiconductor layer, respectively.

$$\eta = \frac{\rho_c}{\rho_b\, t} \qquad (1)$$

to create a pure 3-d situation, the test structure is assumed to be fabricated on a semiconductor substrate which has a relatively large thickness to make sure η ≤ 1. the test pattern for determining ρc in such a 3-d circumstance is shown in fig. 1 and consists of a central dot contact and a ring contact. the radius of the central dot is r0 and the inner and outer radii of the outer electrode are r1 and r2, respectively. mesa isolation is not needed, as is the case for all ctlm type test structures. in this paper, r0, r1, r2, ρb and ρc are all the parameters which determine the total resistance rt that is measured between the two electrodes. it can be written in the following form, which is useful in the study of the scaling behavior of this method (discussed later).

$$r_t = f\{r_0, r_1, r_2, \rho_b, \rho_c\} \qquad (2)$$

by measuring rt, ρc can be found when the resistivity of the semiconductor layer ρb and the geometry are known.

fig. 1 isometric view of the schematic of the proposed 3-d two-contact circular test structure.

3. the method

the analytical solutions to the current-voltage relationship of the proposed test structure were deemed to be too difficult or impossible to obtain. therefore, we present a numerical method to determine ρc which is developed using finite element modeling (fem) of ohmic contacts between a metal layer and a semiconductor substrate [11].

a. finite element modeling

fem can be used to accurately model the electrical behavior of ohmic contacts between a metal and a semiconductor.
creating a model requires the following information: (i) test structure geometry, (ii) conductivity of each layer in the structure and (iii) specific contact resistivity ρc of each interface in the structure. msc nastran, a finite element program originally developed for nasa, is used for the electrical analysis, while msc patran is used for creating models and meshing. fig. 2 shows a section of the fem model used to develop solutions for the 3-d ohmic contact test structure. it consists of three layers: the metal layer on the top, the bulk semiconductor on the bottom and the very thin interfacial layer between them. only a 45° sector is modeled to reduce the time taken for the analysis to run. the current is injected at the center electrode and the potential of the outer electrode is set to zero.

fig. 2 equipotentials (in millivolts) in the semiconductor layer in a 3-d situation for the finite-element modeling example where r0 = 3 μm, r1 = 5 μm, and r2 = 9 μm. (a 45° sector of the test structure is presented).

the voltage contours in fig. 2 show that when the thickness of the semiconductor layer t is beyond a certain value t′, little current goes through the bottom of the semiconductor substrate. this means that when the metal contacts the substrate directly, the thickness of the semiconductor layer t can be considered infinite beyond this t′ (which is relatively small compared to typical substrate thickness). a number of models were analyzed using fem with ρb and ρc varying from 0.0001 ω·cm to 0.001 ω·cm and from 1×10⁻⁹ ω·cm² to 1×10⁻⁴ ω·cm², respectively. the geometry is fixed and the thickness of the semiconductor layer is set to be large enough to make sure the model is 3-d and little current goes through the bottom of the substrate. by doing this, we obtain a value of rt for each combination of ρb and ρc. plotting rt as a function of ρc with variable ρb, we get fig. 3. from fig.
3, we can pick the right curve for the known semiconductor resistivity ρb and find the value of ρc using the experimentally determined total resistance rt.

fig. 3 fem analysis results for total resistance rt between the two electrodes as a function of ρc with ρb varying from 0.0001 ω·cm to 0.001 ω·cm. geometry is fixed: r0 = 3 μm, r1 = 5 μm, and r2 = 9 μm.

b. scaling behavior

the scaling behavior of this method, with rt written as a function f of the geometry, ρb and ρc as in (2), is shown in (3).

$$n\, f\{r_0, r_1, r_2, \rho_b, \rho_c\} = f\{m r_0, m r_1, m r_2, mn\rho_b, m^2 n\rho_c\} \qquad (3)$$

using (3), the plots in fig. 3 will be the same with ρc, rt and ρb scaled by factors of m²n, n and mn, respectively. thus, the structure is universal and applicable for ohmic contacts where the resistive effects of the semiconductor and the contact can be described by ρb and the geometry of the electrodes. for example, when m = 1 and n = 10, we get fig. 4, which has the same shape of plots as fig. 3 but for a new set of ρb.

fig. 4 fem analysis results for total resistance rt between the two electrodes as a function of ρc with ρb varying from 0.001 ω·cm to 0.01 ω·cm. geometry is fixed: r0 = 3 μm, r1 = 5 μm, and r2 = 9 μm. note that this figure can be scaled using (3).

4. experimental and results

experimental measurements using the proposed test structure were performed for contacts of 200 nm ni to ge substrates. a number of two-contact circular test patterns were prepared on a p-type germanium substrate. the geometries vary from r0 = 6 μm, r1 = 10 μm and r2 = 18 μm to r0 = 24 μm, r1 = 40 μm and r2 = 72 μm. fig. 5 shows an optical micrograph of an example pattern fabricated with r0 = 15 μm, r1 = 25 μm and r2 = 45 μm.

fig. 5 optical micrograph of a two-contact circular test structure fabricated on p-type ge. the geometry size is r0 = 15 μm, r1 = 25 μm and r2 = 45 μm.

the contacts were prepared in the following way. the p-type 3 inch germanium wafer with a thickness of 220 μm was diced into squares with dimensions of 1×1 cm² and cleaned in az 100 solvent at 80 ºc for 15 minutes, followed by acetone, isopropyl alcohol and deionized water, and dried in nitrogen gas. az 1512 was then spin coated on the surface of the wafers, followed by soft baking at 90 ºc for 90 seconds. after removing the edge bead of the photoresist, the wafers were exposed to uv light for 8 seconds, soaked in chlorobenzene for 60 seconds and developed in 1:4 di water: az 400k for 25 seconds. after depositing 200 nm of ni on the ge substrate by electron beam evaporation and soaking in acetone, the ni electrode patterns were formed by a lift-off technique using ultrasound equipment at 90 ºc for 30 minutes. finally, the wafers were cleaned in deionized water and dried using nitrogen gas. the same process was conducted in order to prepare the sic substrates with ti deposited to a thickness of 200 nm. in addition to the photolithographic steps discussed above, the sic samples were heat treated at 1100 ºc for 30 minutes in an argon environment. it is known that ti and sic will produce a schottky contact when deposited with no treatment applied. therefore this extra step was taken to ensure that the ti contacted the sic uniformly and to create an ohmic contact.

fig. 6 optical micrograph of a two-contact circular test structure fabricated on n-type 4h-sic. the geometry size is: r0 = 30 μm, r1 = 50 μm and r2 = 90 μm.

the resistivity of the ge substrate was determined before the wafer was diced using the four point probe technique and was found to be 0.035 ω·cm. measurements were taken for ten different test patterns of the dimensions described above. a probing station with 0.6 μm radius tips, a multimeter and a current supply were used in the measurements. the current/voltage characteristic of each two-contact circular pattern indicates that ohmic contacts were generated between as-deposited ni and ge. the measured total resistance rt ranged from 4.78 ω to 17.23 ω with different dimensions of patterns.
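the fabricated patterns are integer multiples of the modeled geometry (for the smallest pattern, r0 = 6 μm against 3 μm in the fem model), so a measured reading has to be mapped onto the fem curves through the scaling relation (3) before ρc can be read off. the python sketch below shows only that bookkeeping. it assumes that the geometry scales by m while ρb, ρc and rt scale by mn, m²n and n; the function names and the choice n = 35 are hypothetical, the rt value is one of the smallest-pattern readings, and the curve look-up itself is not reproduced because the fem data are not given in the text.

```python
import math


def to_reference(rho_b, r_t, m, n):
    """Map a measured (rho_b, r_t) onto the reference fem geometry.

    Assumes the measured pattern is the reference geometry scaled by m,
    with rho_b scaled by m*n and r_t scaled by n, as in relation (3).
    """
    return rho_b / (m * n), r_t / n


def from_reference(rho_c_ref, m, n):
    """Scale a rho_c value read from the reference curves back to the sample."""
    return m * m * n * rho_c_ref


# Smallest fabricated pattern: r0 = 6 um versus r0 = 3 um in the fem model.
m = 6.0 / 3.0
# Hypothetical choice of n that brings rho_b into the 0.0001-0.001 ohm*cm range.
n = 35.0

rho_b_ref, r_t_ref = to_reference(rho_b=0.035, r_t=15.68, m=m, n=n)
print(f"look up the curve for rho_b = {rho_b_ref:.4g} ohm*cm at r_t = {r_t_ref:.4g} ohm")
print(f"then multiply the rho_c read from that curve by {from_reference(1.0, m, n):.0f}")
```

with these numbers the reference curve is the one for ρb = 0.0005 ω·cm, the reading is taken at rt = 0.448 ω, and the ρc value read from the curve is multiplied by m²n = 140.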
the values of ρc were then determined using fig. 4 and (3) and varied from 2.3×10⁻⁶ ω·cm² to 2.7×10⁻⁵ ω·cm². this can be seen in table 1.

table 1 experimental results for determining specific contact resistivity for as-deposited nickel to germanium substrate contacts

pattern   geom.   rt (ω)   ρc (ω·cm²)
1         a       15.68    3.7×10⁻⁶
2         a       17.23    6.5×10⁻⁶
3         a       14.77    2.3×10⁻⁶
4         b       6.98     1.3×10⁻⁵
5         b       6.48     1.1×10⁻⁵
6         b       5.93     7.9×10⁻⁶
7         b       5.54     5.3×10⁻⁶
8         b       6.06     8.8×10⁻⁶
9         c       4.43     2.1×10⁻⁵
10        c       4.78     2.7×10⁻⁵

a: r0 = 6 μm, r1 = 10 μm, r2 = 18 μm. b: r0 = 15 μm, r1 = 25 μm, r2 = 45 μm. c: r0 = 24 μm, r1 = 40 μm, r2 = 72 μm.

table 2 experimental results for determining specific contact resistivity for heat treated titanium to silicon carbide substrate contacts

pattern   geom.   rt (ω)   ρc (ω·cm²)
1         c       140      2.4×10⁻³
2         c       125      1.8×10⁻³
3         c       129      1.9×10⁻³
4         c       137      2.1×10⁻³
5         c       150      2.4×10⁻³
6         d       70       1.5×10⁻³
7         d       63       1.3×10⁻³
8         d       96       2.1×10⁻³
9         d       103      2.4×10⁻³
10        d       98       2.1×10⁻³

c: r0 = 24 μm, r1 = 40 μm, r2 = 72 μm. d: r0 = 30 μm, r1 = 50 μm, r2 = 90 μm.

similarly to the ge substrate, the resistivity of the sic samples was measured before fabrication using the four-point probe method. from this measurement the resistivity was determined to be 0.01 ω·cm. using ten different patterns of two differing sizes, measurements were taken as per the described method. the resistance measurements taken from the patterns ranged from 70 ω to 150 ω, increasing as the patterns became smaller in size. with these measurements taken from the sic samples, ρc was determined to be between 1.3×10⁻³ ω·cm² and 2.4×10⁻³ ω·cm². the full results can be viewed in table 2.

5. conclusion

a numerical method for determining specific contact resistivity between a metal and a semiconductor ohmic contact in a 3-d circumstance using a two-contact circular test structure was presented. it was developed using a finite element modeling program.
specific contact resistivity for as-deposited ni contacts to p-type ge substrates was obtained by using the proposed test structure and was determined to be (2.3-27)×10⁻⁶ ω·cm² using the presented method. in addition, the process was conducted a second time on heat treated ti contacts on sic to provide a second independent set of results. the specific contact resistivity was determined to be (1.3-2.4)×10⁻³ ω·cm². the results show that, with known semiconductor substrate resistivity ρb and a fixed geometry, ρc can be determined conveniently using a scaling equation by picking data points from the reported figures.

references

[1] d. k. schroder, semiconductor material and device characterization, 3rd ed. hoboken, nj: wiley, pp. 135-157, 2006.
[2] g. k. reeves and h. b. harrison, "obtaining the specific contact resistance from transmission line model measurements", ieee electron device lett., vol. edl-3, no. 5, pp. 111–113, may 1982.
[3] s. j. proctor, l. w. linholm, and j. a. mazer, "direct measurements of interfacial contact resistance, end contact resistance, and interfacial contact layer uniformity", ieee trans. electron devices, vol. ed-30, no. 11, pp. 1535–1542, november 1983.
[4] v. gudmundsson, p. hellstrom, and m. ostling, "error propagation in contact resistivity extraction using cross-bridge kelvin resistors", ieee trans. electron devices, vol. 59, no. 6, pp. 1585–1591, june 2012.
[5] k. w. j. findlay, w. j. c. alexander, and a. j. walton, "the effect of contact geometry on the value of contact resistivity extracted from kelvin structures", in proceedings of the ieee int. conf. microelectron. test struct., march 1989, vol. 2, pp. 133–138.
[6] d. b. scott, r. a. chapman, c.-c. wei, s. s. mahant-shetti, r. a. haken, and t. c. holloway, "titanium disilicide contact resistivity and its impact on 1-μm cmos circuit performance", ieee trans. electron devices, vol. ed-34, no. 3, pp. 562–574, march 1987.
[7] g. k.
reeves, "specific contact resistivity using a circular transmission line model", solid state electron, vol. 23, no. 5, pp. 487-490, may 1980.
[8] a. s. holland, g. k. reeves, p. w. leech, "universal error corrections for finite semiconductor resistivity in cross-kelvin resistor test structures", ieee trans. electron devices, vol. 51, no. 6, pp. 914-919, june 2004.
[9] y. pan, g. k. reeves, p. w. leech and a. s. holland, "analytical and finite-element modeling of a two-contact circular test structure for specific contact resistivity", ieee trans. electron devices, vol. 60, pp. 1202-1207, march 2013.
[10] h. h. berger, "models for contacts to planar devices", solid state electron, vol. 15, no. 2, pp. 145-158, february 1972.
[11] y. pan, a. m. collins and a. s. holland, "determining specific contact resistivity to bulk semiconductor using a two-contact circular test structure", in proceedings of the ieee international conference on miel, may 2014, pp. 257-260.

facta universitatis series: electronics and energetics vol. 28, no 3, september 2015, pp. 345-381
doi: 10.2298/fuee1503345j

rf mems/nems resonators for wireless communication systems and adsorption-desorption phase noise

ivana jokić 1,2, miloš frantlović 1,2, zoran djurić 3, miroslav l. dukić 4
1 school of electrical engineering, university of belgrade, bulevar kralja aleksandra 83, 11000 belgrade, serbia
2 institute of chemistry, technology and metallurgy, center of microelectronic technologies, university of belgrade, njegoševa 12, 11000 belgrade, serbia
3 serbian academy of sciences and arts, institute of technical sciences sasa, knez mihailova 35, 11000 belgrade, serbia
4 singidunum university, danijelova 29, 11000 belgrade, serbia

abstract.
during the past two decades a considerable effort has been made to develop radio-frequency (rf) resonators which are fabricated using the micro/nanoelectromechanical systems (mems/nems) technologies, in order to replace conventional large off-chip components in wireless transceivers and other high-speed electronic systems. the first part of the paper presents an overview of rf mems and nems resonators, including those based on two-dimensional crystals (e.g. graphene). the frequency tuning in mems/nems resonators is then analyzed. improvements that would be necessary in order for mems/nems resonators to meet the requirements of wireless systems are also discussed. the analysis of noise of rf mems/nems resonators and oscillators is especially important in modern wireless communication systems due to increasingly stringent requirements regarding the acceptable noise level in each new generation. the second part of the paper presents the analysis of adsorption-desorption (ad) noise in rf mems/nems resonators, which becomes pronounced with the decrease of components' dimensions, and is not sufficiently elaborated in the existing literature about such components. finally, a theoretical model of phase noise in rf mems/nems oscillators will be presented, with a special emphasis on the influence of the resonator ad noise on the oscillator phase noise.

key words: mems resonator, nems resonator, tunable resonator, graphene resonator, adsorption-desorption noise, oscillator phase noise

received march 2, 2015
corresponding author: zoran djurić, serbian academy of sciences and arts, institute of technical sciences sasa, knez mihailova 35, 11000 belgrade, serbia (e-mail: zoran.djuric@itn.sanu.ac.rs)

1. introduction

over the past two decades, wireless communications have been a subject of intensive development, and tremendous growth has taken place in this area of technology and industry.
modern wireless terminals have become universal mobile personal devices which unite the functions of a telephone, a computer with internet access, a radio navigational device, a multimedia center etc., and in every new generation operate in a greater number of frequency ranges (multiband operation) and according to a greater number of communication standards (multistandard operation). this course of development poses new challenges related to the design of these devices' transceivers. having in mind the requirements for a small size, low power consumption and low cost of mobile terminals, it is apparent that the multiband multistandard front end, i.e. the transceiver part in which the processing of high frequency signals is performed, is the most critical. in transceivers operating at frequencies around 1 ghz and higher, which is common in modern mobile personal devices, it is still not technologically possible to ensure entirely software-based adaptation to an arbitrary communication standard. the processing of signals at those frequencies (filtering, amplification, frequency conversion) is performed by analog circuits, and it often requires the implementation of off-chip (discrete) passive components and several separate integrated circuits (ic), because the performance of the corresponding integrated components and circuits implemented in cmos technology is not satisfactory. bearing in mind the trend towards an increasing number of frequency bands in which a mobile terminal is used, the current approach towards designing multiband multistandard transceivers, which implies the introduction of an additional set of rf analog circuits and off-chip passive components for each new band and wireless standard, is becoming inefficient because it leads to an unacceptable increase in complexity, power consumption, size and price. particularly critical is the increase in the number of off-chip passive rf components.
in the next generation of mobile terminals, the reconfigurability of the rf part of transceivers should be achieved by using as small a number of components as possible, with a simultaneous increase in the integration level: by replacing the discrete passive components (such as rf filters, duplexers, switches, impedance matching circuits, resonators in frequency references and frequency synthesizers, etc.) with integrated ones, and by introducing components with tunable parameters instead of a number of discrete ones with fixed parameters. the most desirable solution implies the application of integrated tunable (reconfigurable) passive components. however, in this regard, the possibilities of conventional technologies are limited. the requirements for the reconfigurability of the rf front-end, better rf performance and a higher level of integration of the rf segment of the transceiver in future systems generate the need for high-quality passive rf components, applicable in a wide frequency range, as well as for those with tunable parameters, which will be integrated in cmos circuits. nowadays, mems and nems technologies are considered to have the potential for the realization of rf components which are able to meet the mentioned requirements. rf mems/nems resonators are being developed with the intention to replace large off-chip components, such as rf filters and quartz resonators, in wireless transceivers [1-4].

in the first part of this paper we present a short overview of rf mems resonators, including their classifications, principle of operation, their main characteristics relevant for wireless transceiver applications and advantages compared to solutions based on conventional technologies. the achieved values of basic parameters will be given through examples of rf mems resonators reported in the literature.
the capability of rf mems and nems resonators to meet future needs will also be considered, including the possibility of frequency tuning in mems and nems mechanical resonators. resonators based on two-dimensional (2d) crystals (such as graphene) will also be included in the analysis. comments will be given on the necessary improvements and the direction of future research in this field, with the intention to optimize rf mems and nems components according to the requirements of both current and future systems, especially having in mind the need for nems resonators. the analysis of noise generation mechanisms which are specific for these components is of particular significance in this respect: it leads to the optimal resonator design and operating conditions, which ensure minimal noise and, accordingly, minimization of signal degradation. in the second part of the paper we will present the analysis of adsorption-desorption (ad) noise in rf mems and nems electromechanical resonators, which becomes pronounced with components' decreasing dimensions and mass. finally, we present a theoretical model of phase noise in oscillators using rf mems/nems resonators as frequency determining elements, considering the influence of the resonator ad noise.

2. rf mems and nems resonators

radio-frequency mems and nems technologies are intended for the realization of mems and nems passive components (variable capacitors, inductors, resonators, switches), which may be basic elements of more complex functional blocks such as tunable filters, impedance-matching networks, phase shifters, reference oscillators, frequency synthesizers, antenna switches, etc., all of them operating at radio frequencies to mm-wave frequencies (i.e. up to the order of 10 ghz and above). the development of rf mems components began at the end of the 1980s.
high rf performance (even of tunable components), dimensions at the micrometer scale, the technological compatibility with cmos and other ic technologies which enables their integration with active electronics, low power consumption, and mass production make them promising candidates for application in wireless transceivers. integrated rf mems components can directly replace off-chip traditional components (rf filters and crystal oscillator references, as well as other passives and rf switches) in conventional transceiver architectures, as shown in fig. 1a. furthermore, tunable rf mems components can be used to ensure front-end reconfigurability in multiband multistandard transceivers, significantly reducing its complexity, as shown in fig. 1b.

2.1. rf mems resonators – a short overview of existing components and future needs

a resonator is a basic element of oscillators and frequency selective circuits. the applications of resonators in wireless transceivers are numerous. they include rf filters and duplexers, tunable tanks of voltage controlled oscillators (vcos), frequency references, frequency synthesizers and clock generators. the basic parameters of resonators are the resonance frequency, f0, and the quality factor, q, while other important parameters are frequency stability (in time – e.g. long/short-term; with temperature; with pressure etc.), power handling capability and series resistance. the value of the product f0·q is often used as an indicator of the performance of the resonator. the required values of the resonator parameters depend on its application. for example, resonance frequencies of resonators used in various stable frequency references (for the operation of cellular modules, gps modules, microprocessors, real-time clocks etc.) and filters with different central frequencies in wireless transceivers cover a wide range (f0 are of the order of 1 khz – 1 ghz).
the quality factor of a vco resonant tank in a superheterodyne receiver can be 30–50, but the q of rf bandpass filters, with the central frequencies in the range 0.8–5.5 ghz, must be much greater (q~500–10000). the highest q (typically greater than 10^5, even 10^6) is required in frequency references (oscillators). the temperature frequency stability of frequency references in mobile terminals should be better than ±10 ppm in the range 0–70 °c [2]. the maximal acceptable resonator frequency variation for frequency synthesis is ±2 ppm in the same temperature range. in rf preselect filters and image-rejection filters the maximum temperature coefficient of frequency can be ~10 ppm/°c. a long-term resonator frequency stability better than 3 ppm/year is needed [4]. the resistance should be low enough to allow impedance matching to conventional rf circuits (typically 50 Ω).

fig. 1 simplified block-diagram of a hypothetical multiband multistandard wireless transceiver rf front end, illustrating applications of rf mems components (shown in color/gray): a) direct replacement of non-integrated conventional components with integrated mems components, b) significantly simplified rf front end as a result of the application of tunable mems components.

resonators can generally be divided into electromagnetic and electromechanical. in modern wireless transceivers, electromagnetic resonators are lc circuits, while electromechanical ones include saw (surface acoustic wave) and baw (bulk acoustic wave) resonators. saw and baw resonators (which also include quartz resonators) are off-chip components, and have better performance than electromagnetic ones, especially than integrated ones. saw resonators are used in high-performance rf filters and duplexers, with central frequencies up to 2 ghz. the nominal frequencies of crystal resonators are of the order of 10 khz–10 mhz.
with the increase in the resonance frequency of quartz resonators, the value of the q factor decreases. the best quartz resonators (whose production is complex and expensive) with a frequency of 10 mhz have q=10^6. the series resistance of quartz resonators is 50 Ω, whereas a frequency stability of at best 1 ppm over a temperature range of 100 °c may be achieved by choosing the optimum oscillating mode and crystal cut. conventional resonators are highly reliable and technologically mature, but none of the mentioned resonator types can simultaneously meet the following requirements: resonance frequencies in the ghz range, high q at ghz frequencies, tunability, low power consumption, high frequency stability, small dimensions, low cost and integration (especially monolithic) with cmos circuits. therefore, they are an obstacle for full integration and miniaturization of ghz multiband multistandard wireless transceivers.

during the past two decades a considerable effort has been made in the development of rf mems resonators in order for them to be used in wireless transceivers instead of conventional bulky off-chip resonators [1-4]. both electromagnetic and electromechanical resonators have been fabricated using the mems technology. as far as mems resonators of these two types with equal resonance frequency are concerned, mems electromechanical resonators are smaller and have a higher quality factor than electromagnetic ones. in the following text we will present electromechanical (em) rf mems resonators in greater detail. some of the main advantages that mems technologies bring into the field of em resonators include small dimensions, high values of both f0 and the f0·q product (f0 up to the order of 1 ghz, q comparable to conventional resonators), low power consumption, the possibility of low-cost mass production and integration (monolithic or hybrid) with cmos ics.
an especially attractive feature of mems resonators is the tunability of their parameters. em resonators consist of the resonant mechanical structure and the input and output electromechanical transducers. the operation of em resonators is based on mechanical oscillations of a resonant structure, which are actuated by the input electrical signal and converted into the output electrical signal. an input em transducer converts electrical energy into mechanical energy (i.e. electric voltage into force or mechanical stress), whereas an output transducer converts mechanical energy into electrical energy (i.e. displacement or deformation of the resonant structure into an output electrical signal). the actuation is usually achieved through the action of electrostatic (es) or magnetic force, or it is based on a piezoelectric (pe) effect or on thermally induced expansion. the mechanism of conversion of mechanical energy into an electrical signal can be capacitive, piezoelectric, piezoresistive, etc. the most common are the combination of es (capacitive) actuation and capacitive detection of mechanical oscillations and the combination of pe actuation and detection. electromechanical transducers are characterized by the coefficient of electromechanical coupling, which is a measure of the efficiency of energy conversion between electrical and mechanical domains of a resonator. it depends on the shape and dimensions of the resonant structure, material parameters, mode of oscillation, the transduction mechanism, the transducer's parameters, as well as the position and size of electrodes, and it significantly influences the parameters of the resonator (e.g. the equivalent series resistance) [3].

the capacitive actuation and detection of mechanical oscillation of the resonant structure are achieved as shown in fig. 2a. dc voltage vp is applied to the resonant structure, whereas alternating driving voltage vi is applied to the input electrode. these two voltages together generate a time-variable electric force acting on the resonant structure and exciting its mechanical resonant oscillation if the frequency of the excitation electrical signal is equal to the mechanical resonance frequency of the structure (which depends on the geometrical parameters of the resonant structure and parameters of the materials from which it is made). during the resonant oscillation of the structure, the distance between the structure and the output electrode changes in time. consequently, the corresponding capacitance and the output current also change, this change being directly proportional to the instantaneous value of the oscillation amplitude and bias voltage vp. accordingly, the output current is generated only if vp≠0, and the capacitive resonators are switched on and off by a simple mechanism (by turning the polarization voltage on and off). in order to achieve low power consumption and integrability with the integrated circuits, the value of vp should be sufficiently low. therefore, the distance g between the electrodes of capacitive transducers and the resonant structure should be less than 1 μm. a smaller electrode gap and a greater surface area of electrodes ensure a greater em coupling coefficient. the coupling coefficient in es transduction is directly proportional to the bias voltage.

fig. 2 schematic representation of mems resonators, illustrating their basic electromechanical configuration and principle of operation: a) a resonator with capacitive actuation and detection of mechanical oscillation, b) with piezoelectric (pe) actuation and detection of bulk acoustic waves, c) with pe actuation and detection of surface acoustic waves.

the resonant structure of capacitive resonators can be in the shape of a cantilever, clamped-clamped or free-free beam, membrane, disk, square plate, ring or comb.
they are usually made of silicon, but silicon carbide, silicon nitride, diamond, germanium, silicon germanium, gallium arsenide, nickel, etc. can also be used. the most common configurations of electrodes are parallel plates and interdigitated (comb) electrodes. a conducting material or a dielectric covered with a thin conductive layer can be used to manufacture electrodes. electrodes are placed in such a way as to maximize coupling into a desired mode of vibration. capacitive resonators oscillate in flexural or torsional modes, but there are also capacitive resonators with bulk oscillation modes: extensional longitudinal, extensional contour (i.e. radial for disk resonators), wine-glass and lamé.

the piezoelectric mechanism of excitation and detection of mechanical waves is based on the use of piezoelectric materials which are prone to mechanical deformation in the presence of an electric field (inverse pe effect), while the induced mechanical deformation of the material generates a voltage at the output port (pe effect). the main functional components of the resonator are a layer of a pe material (e.g. aln, zno, lead zirconate titanate (pzt)), reflector structures (surfaces) and the electrodes to which the driving electric signal is applied, or which are used to detect the generated voltage (figs. 2b and 2c). metal electrodes are placed directly on the pe layer. the pe layer is usually in the form of a square, rectangular or circular plate, but it may also have the form of a circular or square ring. the driving electrical signal excites acoustic waves in the pe material, and the reflector surfaces confine the generated acoustic waves. this enables the acoustic resonance to be established when the frequency of the excitation electrical signal is equal to the mechanical resonance frequency determined by the geometry of the system and the parameters of the material.
there are several types of pe rf mems resonators, but they can generally be divided into saw and baw resonators. in a saw resonator, input and output interdigitated electrodes and reflectors are placed on the same surface of the pe layer. in baw resonators, a thin layer of a pe material is placed between the electrodes. in saw and baw resonators, mechanical waves are formed on the surface and within the volume of the thin layer of a pe material, respectively. saw resonators usually oscillate in the rayleigh mode. the most common oscillating modes in baw resonators are the bulk extensional mode in the direction of piezolayer thickness, i.e. the direction of the excitation electrical field (such as in fbars – thin film bulk acoustic resonators), or the lateral extensional mode (in lbars – lateral bars). the resonance frequency of an fbar depends on the thickness of the pe layer. the fbar resonance in the range of lower ghz frequencies is formed in layers of the pe material about 1 μm thick, whereas the lateral dimensions of the resonator are of the order of 10-100 μm. the lateral extension modes may be along one direction or a contour, whereas the resonance frequency is determined by the lateral dimensions (e.g. the width of the ring acting as the resonant structure). the lateral bulk modes can also be formed in hybrid saw-baw structures, in which the direction of the excitation electric field belongs to the lateral plane (lfer – lateral field-excited resonator). the interdigitated electrodes of these resonators are located on a single surface of the pe layer (just like in a saw resonator), bulk standing acoustic waves are formed (as in baw resonators), whereas the resonance frequency is determined by the distance between two adjacent "fingers" of interdigital electrodes (the lateral dimension, as in saw resonators).
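The way a single dimension sets the resonance frequency (layer thickness in FBARs, width or finger spacing in lateral-mode resonators) can be illustrated with the half-wavelength standing-wave relation f0 ≈ v/(2d). The sketch below is only a first-order estimate and is not taken from the paper: it ignores electrode mass loading, uses the same velocity for thickness and lateral modes, and assumes an illustrative acoustic velocity of about 11 000 m/s for AlN. The function name is hypothetical.

```python
def half_wavelength_frequency(acoustic_velocity_m_s: float, dimension_m: float) -> float:
    """Fundamental frequency of a standing wave with half a wavelength across dimension_m."""
    if acoustic_velocity_m_s <= 0 or dimension_m <= 0:
        raise ValueError("velocity and dimension must be positive")
    return acoustic_velocity_m_s / (2.0 * dimension_m)


# Assumed, illustrative value (not from the paper): acoustic velocity in AlN.
ALN_VELOCITY_M_S = 11_000.0

# Thickness mode: PE layer about 1 um thick, as quoted for FBARs in the text.
fbar_f0 = half_wavelength_frequency(ALN_VELOCITY_M_S, 1e-6)

# Lateral mode: a 10 um wide structure, the lower end of the lateral sizes quoted.
lateral_f0 = half_wavelength_frequency(ALN_VELOCITY_M_S, 10e-6)

print(f"thickness mode, 1 um layer : {fbar_f0 / 1e9:.2f} GHz")
print(f"lateral mode, 10 um width  : {lateral_f0 / 1e6:.0f} MHz")
```

Because the lateral dimension is set by lithography rather than by film thickness, resonators with different frequencies can in principle be defined on the same layer, which is the property exploited in lateral-mode designs.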
less commonly, pe actuation is used to excite the flexural modes of oscillation of suspended resonant structures comprising one or more layers of pe material [5]. in the piezoelectric mechanism of energy conversion, the coefficient of em coupling is larger than in the capacitive mechanism, for resonators of similar shape and size. it is greater in fbar resonators than in lbars. aln resonators have a slightly smaller coupling coefficient than zno and pzt components, but their piezoelectric properties are excellent and they are suitable for high-frequency applications [6].

the mechanical resonant frequency of the structure is determined by its stiffness and mass, i.e. by its geometry, dimensions and material parameters. the q factor of the resonator (unloaded q) is by definition equal to 2π times the ratio of the energy stored in the resonator and the energy lost per one cycle of oscillation. a high resonator q results in low resonator impedance (i.e. series resistance). the fulfillment of requirements to be met by an oscillator in terms of phase noise and frequency stability, as well as the filter insertion loss and selectivity, power dissipation, etc. also depends on the q value. the value of q is determined by different mechanisms of energy loss, both internal and external. external loss mechanisms may include the loss of mechanical energy in places where the resonant structure is fixed during oscillation, or they can be a result of the presence of the surrounding medium (e.g. air or other gas mixtures) or external circuits. as for internal mechanisms, mechanical energy is dissipated in the resonator or on the surface of the resonant structure as a result of the presence of the bulk and surface defects, and thermoelastic effects that lead to the irreversible transformation of acoustic energy into heat [3].
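The relations described in this paragraph can be sketched numerically. The block below assumes a lumped mass-spring model, f0 = (1/2π)·sqrt(k/m), the usual energy definition of Q (with the factor 2π), and the standard rule that independent loss mechanisms add as 1/Q_total = Σ 1/Q_i. None of these formulas is written out in the paper, and the numerical values are hypothetical.

```python
import math


def resonance_frequency(stiffness_n_per_m: float, mass_kg: float) -> float:
    """Resonance frequency in Hz of a lumped mass-spring resonator."""
    return math.sqrt(stiffness_n_per_m / mass_kg) / (2.0 * math.pi)


def quality_factor(energy_stored_j: float, energy_lost_per_cycle_j: float) -> float:
    """Unloaded Q: 2*pi times stored energy over energy lost per cycle."""
    return 2.0 * math.pi * energy_stored_j / energy_lost_per_cycle_j


def combined_q(*q_values: float) -> float:
    """Total Q when several independent loss mechanisms act together."""
    return 1.0 / sum(1.0 / q for q in q_values)


# Hypothetical micro-resonator: effective stiffness 1 kN/m, effective mass 1 ng.
f0 = resonance_frequency(1.0e3, 1.0e-12)
print(f"f0 = {f0 / 1e6:.2f} MHz")

# Hypothetical loss budget: anchor loss, gas damping, thermoelastic damping.
q_total = combined_q(1.0e5, 2.0e4, 5.0e4)
print(f"Q_total = {q_total:.0f}")
```

The combined value is always below the smallest individual Q, which is why a single dominant mechanism (for example gas damping at higher pressure) determines the achievable quality factor.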
with the increase of gas pressure in the surrounding medium inside the resonator cavity, the energy loss due to the gas damping can grow and prevail over losses caused by other mechanisms. therefore, it is usually necessary to ensure that resonators operate in a vacuum packaging. the value of the pressure at which q begins to decrease due to gas damping with further increase in pressure depends on the resonance frequency and dimensions of the resonant structure. this pressure value is lower in resonators of smaller dimensions (at the same f0), as well as in resonators of lower f0. the energy loss due to other mechanisms can be minimized by optimizing the design of the resonator (choice of material, shape, size and place where the resonant structure is fixed, choice of mode of oscillation, etc.). for example, q increases with a decreased resonator surface-to-volume ratio. bulk mode resonators have a greater q than flexural, whereas si bulk resonators have greater q than aln bulk resonators. thermoelastic losses set the upper fundamental thermodynamic limit of the resonator quality factor, and also of the f0·q product in resonators whose resonant frequency is lower than 1 ghz [7]. the value of q increases with decreasing temperature.

the resonance frequency of fbar resonators is typically in the range between 400 mhz and 10 ghz, and their q factor is usually 1000-3000. the relatively low values of the q of aln-based resonators are a consequence of material losses, which are specifically related to metal electrodes that are placed directly on the pe material. as far as the value of the product f0·q (on the scale of 10^12) is concerned, lbars in the form of an aln ring with a lateral contour mode and lfe (saw-baw) resonators [8-10] have a prominent place among pe resonators. for example, the q factor of saw-baw mems resonators with the resonance frequency of 843 mhz – 1.64 ghz, manufactured using the cmos-compatible process, is up to 2200 in air [8].
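As a quick check of the orders of magnitude quoted for the f0·Q figure of merit, the sketch below multiplies values given in the text. Pairing the maximum Q of 2200 with either end of the 843 MHz to 1.64 GHz range is an assumption made only for illustration, since the paper does not state at which frequency that Q was obtained; the function name is hypothetical.

```python
def fq_product(f0_hz: float, q: float) -> float:
    """Figure of merit f0*Q in Hz."""
    return f0_hz * q


examples = [
    ("quartz resonator, 10 MHz, Q = 1e6", 10e6, 1e6),
    ("SAW-BAW MEMS, 843 MHz, Q = 2200 (assumed pairing)", 843e6, 2200.0),
    ("SAW-BAW MEMS, 1.64 GHz, Q = 2200 (assumed pairing)", 1.64e9, 2200.0),
]

for label, f0_hz, q in examples:
    print(f"{label}: f0*Q = {fq_product(f0_hz, q):.2e} Hz")
```

Under these assumptions the piezoelectric examples fall on the 10^12 scale mentioned above, while the best quartz resonators reach about 10^13.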
the second lateral resonator [11], monolithically integrated with cmos circuits, has f0=1.01 ghz, and q around 7000 in air.

capacitive transduction influences the mechanical resonance frequency of a resonator through the effect of spring constant softening [12-14]. namely, in capacitive mems resonators, along with the mechanical component, the effective stiffness has an electrical component (which depends on the dc-bias voltage, resonator-to-electrode gap spacing, g, the electrode overlap area, ae, and the permittivity of the dielectric which fills the gap, εd). that is why the overall resonance frequency is different from the mechanical resonance frequency and is determined not only by the dimensions and the material parameters of the resonant structure, but also by the parameters vp, g, ae and εd. capacitive mems resonators generally have higher q than piezoelectric ones (because in piezoelectric resonators lossy metallic electrodes are deposited on top of the resonant structure). the q factor of capacitive resonators that oscillate in bulk modes is greater than that in flexural resonators of the same resonance frequency due to lower energy loss (clamping loss, dissipation due to the surrounding medium, due to surface defects, thermoelastic effect) [15, 16]. the q factor of capacitive bulk mode resonators remains at its maximum at higher values of pressure (i.e. q>10^5 at pressures of ~10^4 pa [17]) in comparison to flexural ones, thus vacuum-packaging of certain types of resonators oscillating in bulk modes is not necessary in order to achieve a greater q. however, there are other effects of gas presence in the environment (e.g. resonance frequency drift and fluctuations) that influence the choice of operating conditions and resonator packaging. fig.
3 shows the values of f0q reported in the literature for some of the realized capacitive mems resonators, for which we also show the values of the resonance frequency and the q factor [14, 16-31]. this diagram can be used for the comparison of the performance of rf mems resonators and quartz resonators. it can be observed that f0q values corresponding to the best available quartz resonators are already reached by mems structures. also, mems resonators go far beyond conventional cmos (lc) resonators in terms of the presented parameters. fig. 3 values of f0q reported in the literature for some of the realized capacitive mems resonators. capacitive resonators can be completely fabricated using materials that are compatible with silicon ic technologies, making them suitable for monolithic integration with transistor circuits. although the highest level of integration is achieved by fabricating a resonator on the same chip with cmos circuitry [32], which means that the fabrication of the resonator is embedded in an existing cmos flow (it either precedes cmos processing steps or is performed between them), it almost always implies degradation of the performance of either cmos circuits or the resonator. the most convenient method of monolithic integration is the post-cmos integration (or above-cmos), in which the mems processing is done on top of prefabricated cmos layers. in that case, the mems and cmos processing steps are optimized and the mutual influence is minimized. however, then the temperature at which mems processing steps are performed is limited to 450c, which influences the choice of resonator material [3]. for example, sige and ni can be processed using low temperature techniques in order to fabricate resonators. 354 i. jokić, m. frantlović, z. djurić, m. l. dukić pe materials are not standard for si technologies, which makes the integration of pe mems resonators with ics difficult, with the exception of the resonator with an aln layer. 
mems technology can be used for the fabrication of pe thin films. sputtered aln thin films can be processed below 450 °c and they are therefore suitable for the integration of mems resonators on top of cmos circuitry. recent technological advances allow for the integration of pe resonators with cmos circuits on the same chip [6]. for example, the saw resonator (q below 500) fabricated by combining a standard 0.6 μm cmos process and mems technology [33] is monolithically integrated on si with active cmos circuits. over a period spanning more than a decade, great attention has been dedicated to the integration of fbars with cmos circuits [34-37]. fbar filters and duplexers are suitable for hybrid integration within the mcm (multi-chip module). the monolithic fbar-cmos integration is more complicated to perform than the integration of mems saw resonators and cmos. the monolithically integrated 2 ghz fbar on si, described in [35], has q=780, which is one of the better results. another aln-based ghz fbar resonator, fabricated on top of bicmos circuitry, is presented in [38]. lbars fabricated by processes compatible with si ic technologies have better performance than cmos-compatible fbars. examples are the above-mentioned cmos-compatible lbars in the form of contour-mode oscillating aln rings and lfers, suitable e.g. for post-cmos integrated on-chip direct ghz frequency synthesis in reconfigurable multiband wireless communications [9-11, 39]. however, it should be pointed out that although the mems-cmos monolithic integration is desirable in terms of miniaturization, it may not be the best solution in terms of cost.

mems resonators operating in modes in which f0 depends on lateral dimensions of a resonant structure (most of the capacitive bulk modes, saw, lbar) are suitable for the realization of monolithically integrated rf filter banks, which consist of a large number of resonators with different resonant frequencies.
switches are commonly used for the selection of filters from a filter bank. due to the simple filter selection (without switches that cause attenuation), and fabrication compatibility with si ic technologies, capacitive resonators with bulk oscillating modes enable the fabrication of rf filter banks with minimal dimensions and minimal energy loss, monolithically integrated with active cmos circuits. the change of the resonance frequency in mems resonators with temperature is a result of the temperature dependence of the young modulus of elasticity, the thermal expansion of materials, and mechanical stress in the resonant structure due to different coefficients of thermal expansion of the resonant structure and structures surrounding it. the relative change in the resonance frequency in most mems resonators is a linear function of temperature, with a negative slope [40]. the temperature coefficient of frequency of a si resonator is typically between –15 ppm/°c and –30 ppm/°c [40]. temperature stability of the frequency of pe resonators fabricated using the aln technology is about –25 ppm/°c [41]. these values are acceptable for the implementation of preselect rf filters and rf image-rejection filters, but not for oscillators in which temperature stabilization of frequency is necessary. different methods for temperature compensation are used in mems resonators: at the level of fabrication process, resonator design or external circuits [42-48, 3]. for example, doping of silicon reduces the temperature-dependence of the modulus of elasticity [45, 46]. 
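To see why uncompensated temperature coefficients of this size are unacceptable for oscillators, the linear model described above can be evaluated over the 0–70 °C range and compared with the ±10 ppm requirement for frequency references quoted earlier [2]. The sketch assumes a purely linear dependence and a reference temperature of 25 °C; the reference temperature and the function names are assumptions, not values from the paper.

```python
def frequency_shift_ppm(tcf_ppm_per_c: float, temperature_c: float,
                        reference_c: float = 25.0) -> float:
    """Relative frequency shift in ppm for a linear temperature coefficient."""
    return tcf_ppm_per_c * (temperature_c - reference_c)


def worst_case_shift_ppm(tcf_ppm_per_c: float, t_min_c: float, t_max_c: float,
                         reference_c: float = 25.0) -> float:
    """Largest absolute shift over [t_min_c, t_max_c]; for a linear model it occurs at an end point."""
    return max(abs(frequency_shift_ppm(tcf_ppm_per_c, t, reference_c))
               for t in (t_min_c, t_max_c))


REQUIRED_STABILITY_PPM = 10.0  # +/-10 ppm over 0-70 degC for frequency references [2]

for tcf in (-15.0, -25.0, -30.0):  # ppm/degC, values quoted for Si and AlN resonators
    worst = worst_case_shift_ppm(tcf, 0.0, 70.0)
    verdict = "meets" if worst <= REQUIRED_STABILITY_PPM else "exceeds"
    print(f"TCF = {tcf:6.1f} ppm/degC: worst-case shift {worst:7.1f} ppm "
          f"({verdict} the +/-{REQUIRED_STABILITY_PPM:.0f} ppm requirement)")
```

Even the smallest coefficient in the quoted range gives a shift of several hundred ppm, roughly two orders of magnitude above the requirement, which motivates the compensation methods discussed next.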
a greater temperature stability of frequency can be achieved by fabricating the resonator using a combination of materials whose thermal expansion coefficients are different and/or whose temperature coefficients of the young modulus are different, so that an appropriate design can ensure the elimination of individual effects of temperature changes [47]. the effective stiffness and, consequently, the resonance frequency can be varied in order to ensure temperature compensation [49]. this may be achieved e.g. by a mechanical deformation of the structure, using temperature-dependent mechanical stress [50]. the adjustment of the resonance frequency can be done by changing the gap between the electrode and the resonant structure [12], the electrode overlap area [13] or the dc bias voltage, which change the effective stiffness as a function of the temperature [48]. the frequency adjustment range in capacitive resonators is larger than in piezoelectric ones. a range of 8.4% has been achieved in capacitive resonators by means of integrated heaters [51]. in fbars operating above 1 ghz, a tuning range of 1.47% has been achieved by using a tuning voltage of 7 v [52]. one of the temperature compensation methods using external circuits is based on the temperature-dependent frequency synthesis. by applying various methods of compensation or their combination in mems resonators, the temperature variation of the frequency is reduced to the value of 0.1-300 ppm in the temperature range between 60 k and 200 k [3]. in commercial mems oscillators (sitime inc.), the temperature compensation is achieved using digital techniques (a temperature sensor and an external cmos circuit are used) [53]. long-term frequency drift (i.e.
frequency aging) of mems resonators of the order of ppm/year is observed [54], and it can be as low as 1 ppm/year [53, 55], which is better than that of a typical quartz resonator. the resonance frequency aging depends on the hermeticity of the resonator package. the resonance frequency of em resonators changes with pressure variations due to various effects. in quartz resonators, the influence of pressure on the modulus of elasticity dominates, due to which the resonance frequency increases linearly with the increase of pressure [41]. the resonance frequency of mems resonators decreases with increasing pressure, probably due to adsorption (binding) of the particles of surrounding gases onto the surface of the resonator. this effect is more pronounced in resonators of smaller size and mass. the value of the equivalent series resistance (also called motional resistance), rm, is important for the coupling of a resonator with other rf circuits. it should be low enough for the appropriate matching to the impedance of conventional rf circuits, which is typically 50 Ω. in filters, the signal attenuation in the passband is smaller at lower values of series resistance. in oscillators, the necessary amplification depends on rm. lower values of a resonator's series resistance in oscillators enable the amplifier gain to be lower, leading, consequently, to lower power consumption. at lower rm the output power is higher. higher rm values make it harder to fulfil the requirements for starting and maintaining oscillations [3], and increase the oscillator's phase noise. capacitive resonators typically have rm of the order of 1-100 kΩ (regardless of the resonance frequency [3]), which can be a problem for coupling the resonators with antennae or other rf devices.
for example, a flexural resonator (f0=5.1 mhz, q=80000) presented in [31] has rm=35 kΩ; the si resonator oscillating in flexural mode (f0=1 mhz, q=1000), described in [56], has a relatively small value of motional resistance (340 Ω), whereas the flexural si resonator (f0=14 mhz, q=1500) from [57] has an extremely high value (1 MΩ). a si square wine-glass mode resonator [29] has rm=10 kΩ (f0=2 mhz, q=4.05·10^6). the motional resistance of a square si plate resonator oscillating in the contour mode (f0=1.31 mhz, q=130000) is 4.47 kΩ [23]. the resistance of a si bulk-mode resonator in the form of a disk (2.1 kΩ, f0=24 mhz, q=53000) [58] is of the same order of magnitude, as well as that of the 145 mhz resonator (2.4 kΩ) [59]. si bulk-mode resonators presented in [60] (f0=13 mhz, q=10^5) and [61] (f0=60 mhz, q=6200) have relatively small values of rm, amounting to 500 Ω and 966 Ω, respectively. the bulk mode ring resonator described in [62] has rm=200 kΩ (f0=1.95 ghz, q=8000). the reason for the high resistance values is in the nature of the electrostatic transduction mechanism in which a low intensity force is generated and used for actuation. there are several ways to reduce rm: by using a higher bias voltage, by reducing the distance between the actuation/detection electrode and the resonant structure, by changing the resonator design (ensuring a greater overlap surface area of electrodes in capacitive transducers), by using several mechanically coupled resonators in parallel, etc. pe resonators have lower motional resistance compared to capacitive ones similar in shape and size (due to a higher em coupling coefficient). in addition, the value of the impedance in pe resonators decreases with the increase of f0. fbars have lower motional resistance than lbars at the same electrode surface area (in fbars, it is easy to achieve the rm of 50 Ω). however, low rm values have also been achieved in lateral pe resonators.
for example, the rm of aln contour-mode ring-shaped resonators (f0 in the range 223–656 mhz) is between 56 Ω and 205 Ω [9]. in lbars in the form of a circular ring, the series resistance depends on the mean radius, whereas in square ring lbars it depends on the mean side length of the base, so by varying those dimensions rm can be adjusted without changing the resonant frequency, which is determined by the width of the ring. for example, a cmos-compatible lfer (f0=1.01 ghz) has rm≈150 Ω [11, 39]. when the amplitude of oscillation becomes comparable with the characteristic dimension of the resonant structure in pe resonators or with the distance between the electrodes of capacitive resonators, the em coupling coefficient and the effective stiffness of the resonator start to depend on the deformation, i.e. on the amplitude of the alternating actuation voltage, and the resonator begins to work in a nonlinear regime [18, 63]. nonlinear effects are a consequence of material nonlinearities, electromechanical coupling, or they have a different mechanical origin. generally, when stress and strain reach a certain value, a linear relationship between them ceases to exist and the resonator stiffness constant becomes a function of stress and strain. in pe resonators, piezoelectric coefficients begin to depend non-linearly on strain. in capacitive resonators the nonlinear behavior is partly a direct consequence of a nonlinear relationship between the capacitance and the change in the distance between the electrodes. in addition, at high power levels parasitic modes can be excited along with the desired mode of oscillation. nonlinear effects limit the maximum amplitude of oscillation, i.e. the maximum signal power at which the resonator operates in the linear regime. however, it is desirable that the amplitude of oscillation be as high as possible, i.e. that the resonator be capable of handling high power levels (for example, in order to reduce the oscillator phase noise).
the power handling capability of mems resonators is lower than that of quartz resonators. for example, in flexural mems resonators, it is of the order of 1 μw, while in quartz resonators it is 100 times greater [18]. mems resonators with a higher stiffness can be driven by higher power levels while operating in the linear regime (non-linear effects are less pronounced); accordingly, the resonators which oscillate in bulk modes are better in this respect than flexural ones (for example, maximal power may be of the order of 1 mw). because of the higher q factor and higher resonance frequencies compared to flexural resonators, capacitive resonators with bulk oscillation modes are more suitable for realization of filters and frequency synthesis in wireless transceivers. a smaller distance between the electrodes of capacitive transducers and higher voltage vp lead to the increase of non-linearity and reduce the power handling capability. capacitive resonators operate with a higher signal power in the linear regime if the overlap surface area of the electrodes is larger. resonators with the pe transduction mechanism are more linear and have a better power handling capability (fbar up to the order of 1 w). in mems resonators, a higher output signal power can be obtained using mechanically coupled parallel resonator arrays [64]. based on the achieved parameter values, dimensions and cmos compatibility, rf mems resonators are considered as a solution for realization of fully integrated rf systems. however, the potential of rf mems resonators for applications in wireless communications has not been sufficiently exploited yet. in the case of capacitive resonators, remaining problems that prevent wider practical application are high motional resistance and insufficient power handling, while in the case of high f0 pe resonators it is a low q (limited to several thousand).
nevertheless, significant results achieved after the year 2000 (especially in terms of better temperature stability, better long-term stability, improved packaging) have enabled the commercialization of both pe and capacitive mems resonators. an example of a commercial product is agilent's fbar filter from 2001. this was the first rf mems component that appeared on the market. fbar filters have better performance (a higher q factor, lower insertion loss, better selectivity, better temperature stability, the ability to work with rf powers greater than 1 w) than conventional saw filters at frequencies around 2 ghz and above. mems baw (fbar) resonators find the main application in mobile terminals, where they have replaced traditional duplexers, which are bulky non-integrated components. they are also used in bandpass filters, which are traditionally realized using conventional off-chip saw components of larger dimensions. for example, by using fbar components in mobile terminals instead of two saw filters that are traditionally used for pcs bands 1850-1880 mhz and 1880-1910 mhz, the area needed for the transmission filter is reduced by 90%, and so is the realization cost. fbar duplexers and filters are currently the only novel pe frequency selective components that meet all the required specifications of wireless standards and enable the miniaturization of the rf part of the transceivers of multiband multimode mobile terminals. the aln fbar resonator manufactured by avago technologies is the most successful mems resonator in the past decade [8]. capacitive mems resonators that can be used instead of quartz resonators have also become commercially available (discera, sitime). the first silicon mems oscillator manufactured by discera appeared in 2003. its resonant structure is in the form of a cantilever, with dimensions of 30 × 8 μm^2 and a resonance frequency of 19.2 mhz.
the following year, the same manufacturer presented the first integrated mems tunable oscillator with a nominal frequency of 1.6 ghz, intended to be used as a voltage controlled oscillator of the local oscillator in transceivers of mobile terminals; however, it is not commercially available. in 2006, the first sitime mems oscillators with a resonant structure in the shape of a square ring oscillating in the flexural mode appeared in the market. also commercially available is a sitime resonator with a resonance frequency of 5 mhz (q=80000, vp=1.8-4.6 v, long-term frequency stability 0.5 ppm/year, 800 × 600 × 150 μm^3 in size), fabricated on si. a si mems chip is placed on a cmos chip that contains amplifier circuits, circuits for temperature compensation and programmable memory. the dimensions of the packaging that contains both chips are 2 × 1.6 × 0.25 mm^3 and this is currently the smallest programmable oscillator. and what are the future needs? the development of mems resonators with f0~1 ghz and q>10^4 would enable implementation of new and compact multiband multistandard transceiver architectures (e.g. direct channel selection at the rf stage) [65]. frequency references in the ghz range are also desirable in future wireless communications and other high-speed electronic systems. although the possibility of fabrication of monolithically integrated mems resonator arrays of different resonance frequencies is very significant for multiband transceivers, enabling versatility and reconfigurability on a small surface, the ultimate objective in that sense is resonators whose parameters are adjustable in a wide range, resulting in a significant reduction in the number of necessary components. therefore, high-q resonators oscillating at ghz frequencies and tunability of the resonance frequency are highly needed in future systems. in the next subsection the means for achieving these goals will be considered.

2.2.
achieving ghz resonance frequencies and the resonance frequency tuning

in order to achieve the resonance frequency of a mechanical structure in the ghz range one has to choose appropriate geometry, dimensions and the material of the structure. the analysis will be performed for a doubly clamped beam resonator, since it is commonly used as a model structure in theoretic considerations. from the expression for the mechanical resonance frequency of the clamped-clamped beam oscillating in the first flexural mode, $f_0 = 1.03\,(h/L^2)\,(E/\rho)^{1/2}$, it is obvious that the resonance frequency will be higher if a structure is made of a material with a high $E/\rho$ ratio ($E$ is the material's young modulus, $\rho$ is its density) and also if the geometric parameter $h/L^2$ ($h$ is the beam thickness, and $L$ is its length) is high. fig. 4 shows the calculated dependence of the resonant frequency on $h/L^2$ for the beams made of different materials commonly used in mems and characterized by the ratio $E/\rho$. this diagram is created based on the diagram in ref. [66]. it leads to the conclusion about the values of the $h/L^2$ ratio at which ghz frequencies can be achieved with a beam made of a certain material. the calculation results (according to the expression for $f_0$ given above) suggest that resonators with ghz fundamental resonance frequencies have nanometer dimensions, and they are, therefore, fabricated using nems technologies. in a majority of mems resonators realized so far, resonance frequency tuning is implemented in order to compensate temperature or fabrication process variations of the resonator parameters. several frequency-tuning methods have been reported as mentioned before. however, tunable rf components for multiband transceivers require a much greater frequency tuning range compared to both temperature and process variation compensation.
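the scaling argument behind the expression for the first flexural mode can be checked numerically. the following is a minimal sketch that evaluates f0 = 1.03 (h/l^2) (e/ρ)^(1/2) for a few beam sizes. the material constants are assumed nominal textbook values and the beam dimensions are hypothetical; neither is taken from this paper or from fig. 4.

```python
import math

# Assumed nominal material data: (Young modulus in Pa, density in kg/m^3).
# Illustrative textbook-order values, not taken from the paper.
MATERIALS = {
    "si": (169e9, 2330.0),
    "sic": (430e9, 3200.0),
}


def beam_f0(h, length, young, rho):
    """First flexural mode of a clamped-clamped beam: 1.03 (h/L^2) sqrt(E/rho)."""
    return 1.03 * (h / length ** 2) * math.sqrt(young / rho)


if __name__ == "__main__":
    # Hypothetical beams, all with L/h = 10, to show the effect of downscaling.
    for length in (10e-6, 1e-6, 0.5e-6):
        h = length / 10.0
        for name, (young, rho) in MATERIALS.items():
            f0 = beam_f0(h, length, young, rho)
            print(f"{name:3s}  L = {length * 1e6:5.2f} um  "
                  f"h = {h * 1e9:6.1f} nm  f0 = {f0 / 1e6:9.1f} MHz")
```

at a fixed aspect ratio the frequency scales as 1/l, so under these assumptions the ghz range is approached only once the beam length drops to about a micrometer and the thickness to about a hundred nanometers, in line with the conclusion drawn from the diagram.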
mechanical tuning methods based on the change of the resonator's effective spring constant can yield a high tuning range without significant degradation of the q factor, and are simple to implement. resonators with a high f0 due to a high mechanical stiffness (oscillating in bulk modes) have a lower tunability than flexural resonators. one of the methods for frequency tuning through the change of the effective spring constant is based on the application of mechanical tension, i.e. tensile strain on the resonant structure. for example, the resonant structure can be exposed to mechanical stress by using electrothermal actuators (they require a significant amount of additional surface area), by utilizing a more compact capacitive tuning in which the structure deforms under the influence of electrostatic force, or by some other mechanism. we analyze the resonance frequency dependence on mechanical tension in the case of a clamped-clamped beam, in order to quantitatively estimate the capabilities of the method in terms of both the tuning range and the influence of the resonator's parameters on the tuning range.

fig. 4 doubly clamped beam resonator oscillating in the first flexural mode: a) the dependence of the resonance frequency on the geometrical parameter $h/L^2$ ($h$ is the beam thickness, and $L$ is its length), for the beams made of different materials, characterized by the ratio $E/\rho$, b) left axis: dependence of the eigenvalue $z_1$ (determining the resonance frequency) on the tension-dependent parameter $p$, right axis: ratio of the resonance frequency of an arbitrary beam under the tension and the resonance frequency in the absence of the tension, as a function of $p$ ($p = 12(L/h)^2\varepsilon$, $\varepsilon$ is the uniaxial tensile strain).
the resonance frequency of the n-th flexural mode of a resonant structure in the shape of a doubly clamped beam, under the tension $N$, is given as

$$f_{n0}(N) = f_{n0}(0)\,\frac{z_n^2(N)}{z_n^2(0)}\sqrt{1+\frac{p(N)}{z_n^2(N)}} = \frac{z_n^2(N)}{2\pi L^2}\sqrt{\frac{EI}{\rho W h}}\,\sqrt{1+\frac{p(N)}{z_n^2(N)}} \qquad (1)$$

where $f_{n0}(0)$ is the beam's resonance frequency in the absence of the tension, $L \times W \times h$ are the beam's dimensions (length × width × thickness), $E$ is the young modulus, $I$ is the moment of inertia ($I = Wh^3/12$ for a beam with a rectangular cross-section), and the tension-dependent parameter $p$ is

$$p(N) = \frac{N L^2}{EI} = 12\left(\frac{L}{h}\right)^2 \varepsilon \qquad (2)$$

in the above expression $\varepsilon$ is the applied uniaxial strain. the equation that has to be solved [67] for $z_n$ in order to obtain the resonance frequency of the n-th flexural mode of a doubly clamped beam under tension, written in a convenient form, is

$$\cos(z_n)\cosh\!\left(z_n\sqrt{1+p/z_n^2}\right) - \frac{p}{2 z_n^2\sqrt{1+p/z_n^2}}\,\sin(z_n)\sinh\!\left(z_n\sqrt{1+p/z_n^2}\right) = 1 \qquad (3)$$

this equation is solved for the first (fundamental) oscillation mode, considering three characteristic cases:

a) when $p$ is low, i.e. $p \approx 0$, which corresponds to the absence of tension, eq. (3) becomes

$$\cos(z_n)\cosh(z_n) = 1 \qquad (4)$$

which for $n=1$ yields $z_1(0)=4.73$.

b) when $p$ is high, thus $p/z_n^2 \gg 1$, eq. (3) is approximately

$$\sin\!\left(z_n - \arctan\!\left(2z_n/\sqrt{p}\right)\right) = 0 \qquad (5)$$

and its solution corresponding to the first mode is $z_1 = \pi$.

c) for arbitrary $p$, eq. (3) was solved numerically.

the obtained dependence $z_1(p)$ is shown graphically in fig. 4b (the left axis). based on it the frequency ratio $f_1(N)/f_1(0)$ is calculated as a function of the parameter $p$ and shown in the same diagram (the right axis). (in the remaining text and diagrams the first mode resonance frequency will be denoted with $f_0$ instead of $f_{10}$.) this diagram gives a general insight into the amount of change of the resonance frequency of an arbitrary beam resonator oscillating in the given mode, attainable by applying an arbitrary tensile strain.
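the step from the general frequency equation (3) to the high-tension limit of case b) can be sketched as follows. this is a reconstruction under the stated assumption $p/z_n^2 \gg 1$, written with the symbols of eqs. (1)-(3); it is not quoted from ref. [67].

```latex
% Assumption: p / z_n^2 >> 1, so that
%   sqrt(1 + p/z_n^2) ~ sqrt(p)/z_n   and   cosh(x) ~ sinh(x) >> 1 for x = z_n sqrt(1 + p/z_n^2).
\begin{align*}
\cos z_n \cosh x - \frac{p}{2 z_n^2\sqrt{1+p/z_n^2}}\,\sin z_n \sinh x &= 1,
  \qquad x = z_n\sqrt{1+p/z_n^2} \\
\Rightarrow\quad \cos z_n - \frac{\sqrt{p}}{2 z_n}\,\sin z_n &\approx 0
  \qquad \text{(divide by } \cosh x \text{, neglect } 1/\cosh x\text{)} \\
\Rightarrow\quad \tan z_n &\approx \frac{2 z_n}{\sqrt{p}} \\
\Rightarrow\quad \sin\!\left(z_n - \arctan\frac{2 z_n}{\sqrt{p}}\right) &\approx 0,
  \qquad z_1 \to \pi \ \text{ as } p \to \infty .
\end{align*}
% Inserting z_1 = pi and sqrt(1 + p/z_1^2) ~ sqrt(p)/pi into eq. (1), with p = N L^2 / (E I):
\[
f_{10}(N) \approx \frac{\pi\sqrt{p}}{2\pi L^2}\sqrt{\frac{EI}{\rho W h}}
          = \frac{1}{2L}\sqrt{\frac{N}{\rho W h}} .
\]
```

in this limit the bending stiffness drops out and the expression reduces to the fundamental frequency of a stretched string, which is consistent with the resonance frequency being determined by the tension alone.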
for a resonator with a given $L/h$ ratio and under a certain amount of strain, the parameter $p$ follows from eq. (2), and the frequency tuning ratio can then be read from the diagram in fig. 4b. for example, assuming the maximal strain of 1% (corresponding to the yield strength of common semiconductor materials used in mems), for mems resonators with the ratio $L/h=60$ the parameter $p=432$, so the maximum tuning ratio of 3.3 is obtained from the diagram. the diagram in fig. 5a shows the dependence of the first flexural mode resonance frequency of doubly clamped beam resonators ($L/h=60$), made of various semiconductor materials commonly used in mems, on the applied uniaxial tensile strain. this dependence is obtained by applying the presented theory. two distinct regions can be observed in the diagram. in the first region, which corresponds to low values of $\varepsilon$ ($p/z_n^2 \ll 1$, so eq. (4) is valid), the resonance frequency is practically independent of tension. this is the bending-dominated resonant frequency region. in the second region the increase of the resonant frequency with the applied strain can be clearly seen, and also the mentioned ratio $f_0(\varepsilon)/f_0(0)$ of 3.3, which corresponds to $\varepsilon=1\%$. as the tension increases, it begins to dominantly determine the resonance frequency ($p/z_n^2 \gg 1$), so this region is called the tension-dominated resonance frequency region. also, it can be concluded from the same diagram that the ratio $f_0(\varepsilon)/f_0(0)$ does not depend on the parameters of the material. the dependence of the same ratio on $\varepsilon$ is shown in fig. 5b for three different values of $L/h$ (20, 100 and 1000). the values of $\varepsilon$ at which the resonance frequency is bending-dominated (or tension-dominated) depend on the ratio $L/h$. at higher $L/h$ ratios lower $\varepsilon$ values are required in order to attain a certain factor of the resonance frequency change, i.e. the same strain applied in a resonator of a higher $L/h$ ratio yields a higher $f_0(\varepsilon)/f_0(0)$ ratio.
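the numerical solution of eq. (3) for arbitrary p, and the tuning ratio that follows from eq. (1), can be reproduced with a short script. the following is a minimal sketch, not the authors' implementation: it uses plain bisection, divides eq. (3) by the hyperbolic cosine to avoid overflow at large p, and assumes that the first-mode root always lies between π and 4.74 (the two limiting cases discussed above). all function names are hypothetical.

```python
import math

Z1_FREE = 4.730040744862704  # first root of cos(z) * cosh(z) = 1 (no tension)


def _residual(z, p):
    """Eq. (3) divided by cosh(a), a = sqrt(z^2 + p), to avoid overflow."""
    a = math.sqrt(z * z + p)
    e = math.exp(-a)
    inv_cosh = 2.0 * e / (1.0 + e * e)  # equals 1 / cosh(a)
    return (math.cos(z)
            - p / (2.0 * z * a) * math.sin(z) * math.tanh(a)
            - inv_cosh)


def z1(p, tol=1e-12):
    """First-mode eigenvalue z1(p), found by bisection on [pi, 4.74]."""
    lo, hi = math.pi, 4.74  # residual is negative at lo and positive at hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if _residual(mid, p) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def tuning_ratio(p):
    """Ratio f(strain) / f(0) for the first mode, from eq. (1)."""
    z = z1(p)
    return (z / Z1_FREE) ** 2 * math.sqrt(1.0 + p / (z * z))


def p_from_strain(l_over_h, strain):
    """Eq. (2): p = 12 (L/h)^2 * strain."""
    return 12.0 * l_over_h ** 2 * strain


if __name__ == "__main__":
    p = p_from_strain(60.0, 0.01)  # the L/h = 60, 1% strain example in the text
    print(f"p = {p:.1f}, z1 = {z1(p):.4f}, tuning ratio = {tuning_ratio(p):.2f}")
```

for l/h=60 and 1% strain the script should return a tuning ratio close to the value of 3.3 read from the diagram. because the ratio depends only on p, the same functions can be used for any material and any l/h.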
for fixed $L/h$, the maximum achievable resonance frequency depends on the maximal possible strain (it depends on the resonator's material properties and the maximum value of the control voltage in the given application).

fig. 5 doubly clamped beam resonators oscillating in the first flexural mode: a) the dependence of the resonance frequency of beams ($L/h=60$) made of various semiconductor materials commonly used in mems, on the applied uniaxial tensile strain, b) the dependence of the ratio of the resonance frequency of a beam under the tension (i.e. axial strain) and the resonance frequency in the absence of the tension, on strain, for three different values of $L/h$.

based on the presented analysis and results from the literature, the following conclusions can be made:
1. in order to achieve nominal resonance frequencies in the ghz range, the resonator needs to be made of a material with a high $E/\rho$ ratio (i.e. the material should be stiff and/or light); also, the resonant structure needs to be of nanometer dimensions (the domain of nems technologies),
2. for a wide frequency tuning range to be achieved by application of tensile strain, the material needs to have a high yield strength (i.e. to withstand a high strain); it is also necessary for the $L/h$ ratio to be as high as possible,
3. high resonance frequencies of mems resonators are typically achieved in structures of high mechanical stiffness, which makes frequency tuning difficult. nems resonators, however, achieve high resonance frequencies while having the mechanical compliance needed for tunability [68].

2.3. transition from mems to nems resonators. rf mems/nems dimensions scaling challenges

nanoelectromechanical systems (nems) contain mechanical features with at least one dimension under 1 micrometer. since the year 2000 a significant advance has been achieved in the development of nems resonators.
due to unique mechanical properties, nems resonators provide a promising basis for future ultrafast communication systems, highly-sensitive force and mass sensors, biomedicine etc. a majority of nems resonators is in the shape of a nanoscale beam (doubly-clamped, cantilever or free-standing) made of si or sic, that oscillates in response to an applied external force [66, 69-72]. their length is typically between 1 and 20 μm, while the thickness and width are smaller than 1 μm. the effective mass of nems is usually 10^-14 g, and typical resonance frequencies are in the range between 1 mhz and 10 ghz [70], while the dissipated power can be as low as 10^-17 w. the first rf si resonator of nanoscale dimensions (7.7 μm × 330 nm × 800 nm, beam shape), reported in 1996, had the fundamental resonance frequency around 70 mhz and q=1.8·10^4 [69]. later, a fundamental frequency of 380 mhz was reached with a silicon nems beam resonator, with q of the order of 1000 at room temperature [73]. the first nems resonators whose resonance frequency exceeded 1 ghz were sic beams [74]. dimensions of one of them, with the resonance frequency of 1.029 ghz, are 1.1 μm × 120 nm × 75 nm, and q~10^4. other materials typical for mems, such as gallium arsenide, silicon nitride, aluminum nitride and nanocrystalline diamond, are also used for fabrication of nems beam resonators with similar values of q as previously mentioned, and f0 being in the range from the order of 10 mhz to the order of 100 mhz [75, 76]. for example, a doubly clamped nanobeam aln resonator (4 μm × 900 nm × 320 nm) oscillating in flexural mode has the resonance frequency of 78.2 mhz, and q=670 at room temperature [77]. metallic (au, pt, al, ti) nems resonators were also demonstrated, having fundamental resonance frequencies of the order of 10–100 mhz, and q of the order of 1000 at the temperature of 4 k [78, 79].
nanostructures of a high aspect ratio (defined as the ratio $L/h$ or $L/d$, where $d$ is the structure diameter) are called nanowires. they can be made of si, sic, au, ag, pt, ge, zno, gaas, sin etc. [78-83]. some of the techniques for fabrication of nems resonators are inherited from mems. however, the transition from micro- to nanoscale often implies qualitatively new technological solutions. different methods exist for fabrication of nems/nanowire resonators, and can be divided into the following categories: top-down, bottom-up and hybrid methods. in top-down methods nems devices are made of bulk materials or thin films that are patterned by lithography and etching to create fully released structures such as clamped beams and cantilevers. top-down methods are e.g. those based on standard electron beam lithography (ebl), superlattice nanowire pattern-transfer (snap), nanoimprint lithography (nil) or stencil lithography. top-down methods typically provide a high level of control regarding the design and geometry of the resonator. by using ebl very high aspect ratios of nems structures are achieved, such as $L/h \approx 250$ in 20-25 nm thick sic nanowires presented in [81]. this method enables fabrication of nanowires using different materials such as si, gaas, sin etc. nanowires made of au, cr, al, ti, nb, pt or ni by using the snap method are reported. one of them is a suspended pt nanowire with a diameter of 20 nm and a length of 0.75 μm (a diameter as small as 8 nm is possible) [80]. nil offers high resolution (5 nm) and also high-volume fabrication. metallic nanowires can be fabricated by using wafer-scale stencil lithography. in [82] 70 nm thin al nanowires, 5 μm long, fabricated by using this method are presented. bottom-up methods include, for example, synthesis of si, sic, gan, zno nanowires by vapor-liquid-solid (vls) growth. a general problem of bottom-up methods is the control of nanowire length, diameter and spatial distribution.
another disadvantage of these methods is their low efficiency. hybrid top-down/bottom-up methods include, for example, integration of si nanowire synthesis into device fabrication [83]. these methods enable better control of nanowire dimensions. si nanowires of a 50-150 nm diameter, about 2 μm in length, fabricated by using a hybrid method, are presented in [71]. among them are a metalized si nanowire with a resonance frequency of about 200 mhz and q≈2500 (measured in high vacuum at cryogenic temperatures), whose resistance is matched to 50 Ω, an 80 mhz non-metalized si nanowire with q=13100, and a non-metalized si nanowire with f0=215 mhz and q=5750 (the resistance of the latter two is of the order of 10-100 kΩ). it can be concluded that the product f0·q of the order of 10^12 is achieved in nanowire resonators. there are several methods of actuation and detection of motion in nems resonators: electrostatic, optical (e.g. free-space or fiber-optical interferometry), piezoelectric, magnetomotive/electromotive, piezoresistive, methods based on single electron transistors (set), atomic point contact (limited to resonators made of conductive materials that do not form surface oxides), photonic transduction etc. however, not all the mentioned methods are equally suitable for applications requiring a compact actuation and detection system (preferably on a chip), for resonators of extremely small dimensions or those made of arbitrary materials. for example, magnetomotive transduction is often the right choice for both drive and detection of motion of very small structures oscillating at extremely high frequencies. however, its significant limitation is that it requires a strong magnetic field (1–16 t) produced by a superconducting magnet. such a strong field also causes circuit loading and renders the setup large and expensive.
purely capacitive transduction methods suffer from very low signal levels and parasitic coupling at very high frequencies. optical transduction cannot be realized on-chip, and its sensitivity decreases as the resonator size scales down, because the diameter of the focused beam is limited from below by diffraction (this limitation can be overcome by using waveguides of a submicron cross-section, located on the substrate near the resonator). the photonic method implies significant fabrication difficulties for dimensions of the order of 10 nm and below. promising results have been reported regarding optical multiplexing techniques. the transduction efficiency of the piezoelectric method is very high and it is extensively used in bulk mode mems resonators. however, the method is less convenient for small structures as the crystalline structure must be maintained for a material to be piezoelectric, thus limiting the lowest possible device thickness. nevertheless, both nems cantilevers and doubly-clamped beams have been successfully demonstrated with pe layers as thin as 100 nm, proving that pe layers can be used as efficient nanoscale transducers [77]. intrinsic amplification mechanisms such as transistor-based charge modulation are promising at room temperature. the pronounced piezoresistive effect in doped si nanowires due to longitudinal strain enables integrated piezoresistive self-sensing of strain or displacement (a nanowire without a patterned piezoresistor loop is a transducer) [84]. some other methods (based on set or superconductive microwave cavity) are used only at low temperatures [85], and are difficult to implement. one of the interesting new all-on-chip solutions applicable at room temperature is, for example, a self-transducing nems system based on the cointegration of a finfet transistor and a suspended doubly-clamped beam silicon resonator [86]. in this example electrostatic actuation and transistor-based detection of the resonator motion are used.
there are significant benefits of scaling down the dimensions, such as high speed operation, higher resonance frequencies, higher component density and better integration. however, apart from the mentioned features of nems components, which are a result of miniaturization, and beneficial in high-frequency signal processing and sensing applications, there are problems that remain to be solved on both the theoretical and the experimental level. the studies in the field of nems are at the forefront of physical and engineering sciences. most developments in this field are currently confined to theoretic models, simulations and laboratory experiments, with nems components in a prototype stage, at best. some solutions used in mems do not scale well into the nems domain. apart from technological issues regarding reproducibility and control of surface and bulk properties of extremely small structures, remaining issues include efficient energy conversion mechanisms and coupling between nems and other components and circuits. the small size of nems typically results in a small motional signal. in spite of many different transduction methods applied in nems resonators so far, inducing resonator motion and detecting weak mechanical signals at very high oscillation frequencies, at room temperature and with low power dissipation remain challenging, especially if a compact solution, such as a system-on-chip, is required. a small motional signal can easily be overwhelmed by parasitic coupling or background noise. as the dimensions decrease, so does the signal-to-noise ratio. there are also problems with energy loss at clamping sites (clamping loss), which increases with f0, as well as with surface losses and other effects leading to the increase in energy dissipation, and, consequently, the q factor decrease. therefore, there are limits to the reduction of device size.
q of nems resonators can reach thousands, even tens of thousands, but, except in rare cases, such q values have been reached only at temperatures below 25 k. a 50 nm thick sin square membrane with the fundamental resonance frequency of 133 khz is an example of a nems resonator with q of the order of 10^6 at room temperature [87]. however, attaining a high fundamental resonance frequency without decreasing the q value is the remaining problem in nems development. as in mems, solutions are also needed for reduction of the series resistance of nems resonators and improvement of their power handling. from the theoretical standpoint, it is important to analyze the applicability of the continuum approach to the calculation of mechanical characteristics of an extremely small resonator. reduction of the noise caused by physical effects that become pronounced as the dimensions decrease is another important task which requires both theoretic and experimental research. in mems, and especially in nems resonators, additional noise generating mechanisms exist that are characteristic for structures of small dimensions and mass, and high surface-to-volume ratio. it is therefore necessary to investigate their influence on the resonator performance as a function of dimensions of the structures and the operating conditions. in section 3, by analyzing the adsorption-desorption (ad) noise that becomes prominent as the dimensions and mass of the components decrease, we contribute to the theory of noise in mems and nems resonators, and subsequently to the theory of phase noise in oscillators using rf mems/nems resonators as frequency determining elements.

2.4. rf nems resonators based on 2d crystals

recent years have seen increasing interest in nems that utilize carbon nanostructures, such as one-dimensional (1d) nanotubes or two-dimensional (2d) beams or membranes, as building blocks.
these structures are based on graphene, a planar sheet of carbon atoms arranged in a honeycomb lattice. graphene structures, consisting of one or a few atomic layers, are intrinsically nanoscale. since the carbon-based nanostructures emerged, the continued miniaturization of resonant nems has advanced into atomically thin 2d or 1d nems. graphene has high 2d elastic stiffness (2d young's modulus e2d=340 n/m, corresponding to e=1 tpa reduced to a single atomic layer) and high breaking strain (25%), which exceed the values for any of the thin-film materials currently used for nems. it also has a low mass ($\rho_{2D}=7.4\cdot10^{-7}$ kg/m$^2$) and a high e/ρ ratio. the strength of the carbon-carbon bond makes graphene quite flexible. being atomically thin, graphene structures have an extremely high l/h ratio. these characteristics imply that the conclusions 1-3 made at the end of section 2.2 are in favor of graphene-based resonators as opposed to mems and nems resonators made of traditional materials: graphene structures can have high resonance frequencies which can be further increased and widely tuned by application of large strains. for example, for a graphene beam of 1 μm in length, exposed to 1% of strain, the parameter p (eq. (2)) is about $10^{6}$, which, according to the diagram in fig. 4b, yields a ratio of the resonance frequencies of the stretched and unstretched beam as high as $10^{2}$ (in typical mems resonators the ratio is 3.3, as stated earlier). apart from that, graphene's charge-tunable conductance and large charge mobility allow efficient electrical transduction of mechanical vibration to an electrical signal. thus, there is a growing interest in the development of graphene-based nems resonators. in fig. 6a a schematic diagram is given of a graphene doubly-clamped beam exposed to tensile strain. fig.
6b shows the strain dependence of the first mode resonance frequencies of 1 μm and 2 μm long doubly-clamped graphene beams (obtained by using eqs. (1)-(3)). it indicates that the resonance frequency can be increased by several orders of magnitude when stretching is applied, compared to the resonance frequency of the unstretched beam, which is a consequence of a high l/h ratio (l/h is about 3000 for a 1 μm long graphene beam). because of that, ghz frequencies can be attained even when resonance frequencies of unstretched beams are in the mhz range (typical for 2d resonators 1-5 μm long). as can be seen by comparison of figs. 5a and 6b, in atomically thin structures the values of ε above which the tension-dominated resonance frequency region begins are significantly lower than in mems/nems resonators of a lower l/h ratio. in the following text a short description will be given of two types of carbon (graphene) based resonators: carbon nanotube resonators and graphene (beam and membrane) resonators. a carbon nanotube (cnt) is a hollow cylinder of covalently bonded carbon atoms. depending on the number of graphene sheets that are rolled concentrically, such a structure can be a single-walled carbon nanotube (swnt) or a multi-walled carbon nanotube (mwnt). the first successfully fabricated nanotubes were reported in 1991 [88]. typically, the diameter of a swnt is 1-2 nm, and the length is several micrometers (swnts several millimeters long have also been reported [89]). the diameter of a mwnt is usually 2-25 nm, and its length is several tens of micrometers (they can be grown up to several centimeters in length [90]). currently, these bottom-up structures are typically synthesized by chemical vapor deposition (cvd), using the catalyst-assisted method, which yields nanotubes that are defect-free or have only a few defects. fig.
6 a) schematic representation of a 2d (graphene) beam resonator under tension, b) the strain dependence of the first mode resonance frequencies of 1 μm and 2 μm long doubly-clamped graphene beams.

the first nanotube resonator was made out of a mwnt in 1999 [91]. tunable swnt and mwnt resonators were reported in 2004 [92]. a method of actuation and detection of nanotube resonator motion, suitable for realization on a single chip, is described in [92]. the actuation is achieved through the electrostatic interaction between the tube and the underlying gate electrode, while the detection relies on the transistor properties of the nt, i.e. on the change in the conductance of the nanotube due to modulation of the nt-gate capacitance, which is caused by vibration of the nanotube and measured by using the frequency mixing technique. this method was applied in characterization of doubly-clamped swnts with the diameter of 1-4 nm, the length of 1.75-3 μm, the resonance frequency in the range of 5.1-333 mhz, and q of 50-100 (q=100 corresponds to the nt with the lowest f0, and q=50 to the nt with the highest f0, measured in vacuum at room temperature) [93]. nanotube resonators operating in the ghz range, potentially applicable in rf systems, have been demonstrated recently [94, 95]. mechanical resonances as high as 39 ghz have been observed in carbon nanotube resonators [96]. typical values of the q factor of nt resonators (from the order of 10 to the order of 100 at room temperature) are lower than those of nems resonators made of conventional materials. for example, for a doubly-clamped swnt of 3 μm in length and with f0=26.1 mhz, a q factor of about 90 is measured at room temperature and at the pressure of $10^{-4}$ torr [97]. a detailed consideration of different dissipation mechanisms in nt resonators is given in [98]. lowering the temperature reduces dissipation, allowing for quality factors up to 2000 [99].
the highest reported q in nts exceeds $10^{5}$ (swnt, f0=350 mhz), but it is obtained at a temperature as low as 25 mk, when tensile strain is applied [100]. one of the highest q values reported at room temperature and the pressure of $10^{-4}$ torr (about 700) was obtained for a doubly-clamped swnt (3 μm in length, f0≈83 mhz), where the increase of q is attained by applying the parametric amplification concept [97]. the frequency of the swnt resonators presented in [97] is tuned by varying the gate voltage, which changes the electrostatic force, so both the stretching (i.e. the increase of tension) of the nanotube and the electrostatic interaction with the gate occur, changing the effective spring constant of the nanotube. at the tuning voltage of 10 v, a resonance frequency increase of as much as 200% is reported (the spring hardening effect due to increased tension dominates the electrostatically induced spring softening effect). in [93], by varying the gate voltage from 2 v to 3.5 v the tension in the nt was changed, thus enabling the adjustment of the resonance frequency in the range of 7-14 mhz. a tunable band-pass filter utilizing a singly-clamped nt resonator is analyzed in [101]. the center frequency and the bandwidth of the filter are voltage-tunable: an increase of the frequency by over 100% is attainable by varying the tuning voltage from 0 v to 50 v, while the bandwidth simultaneously decreases by 50%. graphene was obtained for the first time by mechanical exfoliation from graphite [102]. nowadays, graphene structures can be fabricated by a combination of top-down and bottom-up methods. unlike nts, graphene can be grown over large areas (using cvd or sic annealing). moreover, graphene can be patterned at the wafer scale by standard lithographic processes, compatible with other top-down processing techniques, which makes its integration with other components possible.
the first graphene electromechanical resonator was demonstrated in 2007 [103]. the motion of a suspended graphene sheet (in the form of a doubly-clamped beam) was actuated by a laser-based optical method or an electrical method, and an interferometric method was applied for detection. subsequently, electrostatic excitation of mechanical vibrations combined with spm (scanning probe microscopy) detection was used for graphene resonators [104]. the electrical detection method, based on the change of conductivity of vibrating single-layer graphene due to the change of its distance from the gate, is described in [105], where the feasibility of actuating and detecting resonance on a single chip was confirmed. graphene resonators in the form of doubly-clamped beams have q of the order of 10-100. for example, a single-layer graphene beam (1100 × 1930 × 0.3 nm$^3$) has f0=70.5 mhz and q=78, and a 15 nm thick multilayer graphene beam has f0=42 mhz and q=210, both at room temperature and pressure below $10^{-6}$ torr [103]. a graphene resonator in the form of a circular membrane (4 μm in diameter) has f0=52.19 mhz and q=55 [68]. drum-like graphene resonators of high q factor are also reported; for example, a q factor of about 2400 is obtained at room temperature and pressure lower than $6\cdot10^{-3}$ torr for a membrane of 22.5 μm in diameter (f0 about 4 mhz) [106]. in graphene resonators q increases as the temperature decreases. as the temperature decreases to 50 k, the q factor of beam resonators rises above 1000 [103], while at 5 k it can be as high as 10000 (f0=130 mhz) [105]. energy dissipation mechanisms that determine the q factor value in graphene resonators are reviewed in detail in [98]. due to a built-in tension, resonance frequencies of graphene resonators are higher than those expected for structures of the given dimensions when only bending is taken into account. the built-in tension originates from the fabrication process [103].
a typical built-in strain is of the order of $10^{-5}$-$10^{-4}$ [105]. a high tunability of the graphene resonance with the applied gate voltage, which induces tension in the resonant structure, is observed [105]. the resonance frequency of a 3 μm wide and 1.1 μm long single-layer graphene beam increases from 30 mhz to 65 mhz as the gate voltage changes from 0 v to 7 v [105]. a resonance frequency tunability as high as 400% has been reached [105]. tensile strain not only increases the resonance frequency but can also significantly reduce dissipation (i.e. increase the q factor) [98]. the increase of q with the tensile strain is also observed in si, sin and gaas nems resonators [98]. in ref. [68], oscillators containing a graphene nems resonator are reported, whose frequency can be electrostatically tuned by as much as 14%. self-sustaining mechanical vibrations are generated and transduced at room temperature using simple electrical circuitry. the 52.2 mhz nems oscillator based on a graphene circular drum (4 μm in diameter) has q=4015 at room temperature. the graphene vco, which is the first prototype device for rf applications based on graphene nems, presented in the same reference, shows promising performance. also, in [68] experimental data pertinent to phase noise of graphene oscillators were presented for the first time. graphene resonators have more reproducible characteristics than nts. signal levels in graphene resonators are improved compared to those in nt resonators, due to the ability to fabricate micrometer-wide structures with higher conductance than that of one-dimensional nanotubes [105]. by exposing a micrometer-scale graphene resonator to strain, an increase in the resonance frequency (into the ghz range) can be achieved without a decrease in the signal level, and the dynamic range also increases with the strain (since the amplitude at the onset of nonlinearity increases with strain).
this is an advantage compared to top-down nems, in which high resonance frequencies are achieved by reducing the resonator dimensions, which in turn causes a decrease of both the output signal magnitude and the amplitude at the onset of nonlinearity, also decreasing the dynamic range and making ghz-range transduction difficult [105]. in spite of the great potential of carbon nanotubes and graphene in resonator applications due to their extraordinary mechanical properties (enabling high resonance frequencies and high tunability), there are a number of issues that need to be addressed in order to enable practical applications. for example, integration of cnts and control over their location on-chip make mass production of cnt-based nems devices difficult to achieve. further research activities aim to provide simpler and more reproducible techniques for fabrication of ultraclean nanotubes, and to explore frequency tuning mechanisms and nanotube nonlinear dynamics. the improvement of quality factors of both nanotube and graphene resonators is an important task. cnt and graphene structures have the largest surface-to-volume ratios, so surface effects become increasingly important to investigate. noise generation mechanisms in these structures also require further investigation in order for their ultimate performances to be determined. also related to the subject of this paper are adsorbed mass fluctuations that generate the adsorption-desorption phase noise of resonators and oscillators. the extremely low mass of graphene structures and their large surface-to-volume ratio make these resonators highly sensitive to added mass. therefore, the properties of such structures are highly sensitive to the amount of adsorbates and its change [107]. due to the high sensitivity to mass, stochastic adsorbed mass fluctuations could influence the fluctuations of graphene resonator parameters, i.e. the resonator total noise.
there is not enough data in the literature about the effects of gas adsorption on mechanical and electrical parameters of graphene resonators and their oscillation, and therefore this topic requires further investigation.

3. phase noise in rf mems/nems

operation of mems and nems components (including rf) is based on the interaction between the mechanical and the electrical domain of the system. thus, apart from the noises inherent to electrical and electronic devices (e.g. thermal (johnson) noise, shot noise, generation-recombination (gr) noise, 1/f noise), generated in em transducer circuits, amplifiers and other electronic parts of mems/nems systems, noise analysis in mems/nems has to include noise generating mechanisms in the mechanical domain [108]. fundamental (internal) mechanical noises are the consequence of the stochastic nature of physical processes occurring inside the mems/nems component or at the interface between the mechanical structure and the environment, which result in stochastic fluctuations of the displacement of a mechanical structure and/or of its mechanical resonance properties, thus causing fluctuations of the electrical output signal. characteristic fundamental mechanical noises of mems/nems resonators are thermo-mechanical (tm) noise, noise due to temperature fluctuations (tf), and adsorption-desorption (ad) noise. their contribution to the total noise increases and may become dominant as the dimensions, mass and displacement of mechanical structures decrease. the short-term frequency stability, which is a significant parameter of mems/nems resonators, is determined by the phase noise. the oscillator phase noise, which degrades the signal transmission quality, originates from phase fluctuations caused by noise generation mechanisms in the oscillator's electronic circuit and also from the resonator noise [109].
the fundamental mechanical noises cause unavoidable stochastic fluctuations of the phase and frequency of a resonant structure oscillation and determine the lowest (fundamental) limit of the mems/nems resonator noise. therefore, tm, tf and ad noise are considered as a measure of the mems/nems resonator ultimate performance. although theoretical considerations of tm, tf and ad noise in mems/nems were published in the literature [108, 110-114], a need for more comprehensive models of noise in mems/nems resonators and oscillators still exists. further in this section, theoretical models of ad noise in rf mems/nems em resonators and of phase noise in mems/nems oscillators, presented in our papers [113, 115-123], will be briefly reviewed, and then the results of the quantitative analysis will be given.

3.1. adsorption-desorption noise in mems/nems resonators

in mechanical structures of micrometer or sub-micrometer dimensions and minuscule mass, whose displacement is in the nanometer range, the effects of physical phenomena that are negligible in the macroscopic world become significant. among such phenomena are adsorption and desorption of surrounding gases, which spontaneously and inevitably occur on surfaces of all solid bodies at temperatures higher than 0 k and at pressures above 0 pa. the stochastic nature of both the instantaneous adsorption and desorption rate results in stochastic fluctuations of the number of adsorbed particles, $N(t)$, and consequently the total adsorbed mass fluctuates ($\Delta m(t)$), causing fluctuations of the mechanical resonance frequency of the mems/nems structure, $\Delta f(t)$, i.e. the resonator adsorption-desorption (ad) frequency and phase noise. the mean power of ad phase noise of a resonator in the bandwidth of 1 hz at the offset frequency $\nu$ from the nominal frequency $f_0$, expressed in dbc/hz, is [123]

$$L_r(\nu)=10\log\left(\frac{S_{\Delta f}(\nu)}{2\nu^2}\right)=10\log\left(\left(\frac{f_0}{2m}\right)^2\frac{S_{\Delta m}(\nu)}{2\nu^2}\right) \tag{6}$$

where $S_{\Delta f}(\nu)$ is the power spectral density (psd) of resonator ad frequency noise. both these quantities are determined by the psd of the adsorbed mass fluctuations, $S_{\Delta m}(\nu)$ ($m$ is the resonator mass). in order to perform the statistical analysis of ad processes by using the approach which is common for gain-loss processes (to which both ad processes and generation-recombination (gr) processes belong), the equations describing the change of the number of adsorbed gas particles in time are written in the general langevin form

$$\frac{dN_i}{dt}=G_{ie}(N_1,N_2,\ldots,N_n)-R_{ie}(N_1,N_2,\ldots,N_n)+\xi_i,\quad i=1,2,\ldots,n \tag{7}$$

valid for different types of ad processes. here, $n$ is the total number of gases in the resonator surroundings whose particles adsorb in a single layer on the resonator surface [113, 115-119], or the total number of adsorbed layers in the case of multilayer single-gas adsorption [120, 121]. the index $i$ refers to the $i$-th gas in a gas mixture, or to the particles in the $i$-th adsorbing layer (which are not covered by the $(i+1)$-th layer). the equivalent rate of "generation" of adsorbed particles of the type $i$ (i.e. of increase of their number) and the equivalent rate of their "recombination" (i.e. of decrease of their number), $G_{ie}$ and $R_{ie}$, respectively, take into account the influences of all the processes relevant for the change of $N_i$, and their forms differ depending on the analyzed case of adsorption on the surface of micro/nanostructures: single-gas single-layer adsorption [115], single-layer adsorption of an arbitrary number of gases [116, 117], adsorption in an arbitrary number of layers [120, 121], or adsorption coupled with mass transfer [118, 119]. $\xi_i$ is the stochastic source function. for small fluctuations ($\Delta N_i$) of the number of adsorbed particles around the corresponding equilibrium value ($N_{ie}$), so that $\Delta N_i\ll N_{ie}$, eqs. (7) can be written in the matrix form

$$\frac{d\left([\Delta N_1\ \Delta N_2\ \ldots\ \Delta N_n]^T\right)}{dt}=-\mathbf{K}\,[\Delta N_1\ \Delta N_2\ \ldots\ \Delta N_n]^T+[\xi_1\ \xi_2\ \ldots\ \xi_n]^T \tag{8}$$

$$\frac{d(\Delta\mathbf{N})}{dt}=-\mathbf{K}\,\Delta\mathbf{N}+\boldsymbol{\xi} \tag{9}$$

where $\mathbf{K}$ is the $n\times n$ matrix of elements $K_{ij}=-(\partial G_{ie}/\partial N_j-\partial R_{ie}/\partial N_j)|_{\mathbf{N}=\mathbf{N}_e}$, $\mathbf{N}=[N_1\ N_2\ \ldots\ N_n]^T$, $\Delta\mathbf{N}=[\Delta N_1\ \Delta N_2\ \ldots\ \Delta N_n]^T$, and $\mathbf{N}_e=[N_{1e}\ N_{2e}\ \ldots\ N_{ne}]^T$ is the vector of steady-state values $N_{ie}$, which are obtained from the steady-state conditions $G_{ie}(N_{1e},N_{2e},\ldots,N_{ne})=R_{ie}(N_{1e},N_{2e},\ldots,N_{ne})$. by performing fourier analysis, a square $n\times n$ matrix $\mathbf{S}_{\Delta N^2}(\omega)$ is obtained

$$\mathbf{S}_{\Delta N^2}(\omega)=(\mathbf{K}+j\omega\mathbf{I})^{-1}\,\mathbf{S}_\xi\,(\mathbf{K}^T-j\omega\mathbf{I})^{-1} \tag{10}$$

its elements $(i,j)$ are single-sided spectral and cross-spectral densities, $S_{\Delta N_i\Delta N_j^*}(\omega)$ [119, supplementary data]. according to the stochastic analysis of gr processes [124] and the analogy with ad processes, the psd of the $i$-th source function equals $S_{\xi_i}=4G_{ie}(N_{1e},N_{2e},\ldots,N_{ne})=4G_{iee}$. since $\xi_i$ and $\xi_j$ ($i\neq j$) are statistically independent, $\mathbf{S}_\xi$ is a diagonal $n\times n$ matrix of elements $S_{\xi,ii}=S_{\xi_i}$. $\mathbf{I}$ is the $n\times n$ unit matrix, and $\omega=2\pi\nu$. the total fluctuation of the adsorbed mass is $\Delta m=M_1\Delta N_1+M_2\Delta N_2+\ldots+M_n\Delta N_n$, where $M_i$ is the mass of a single particle of the type $i$ (the number of such particles at the surface is $N_i$). the psd of adsorbed mass fluctuations is

$$S_{\Delta m,n}(\omega)=\mathbf{M}\,\mathbf{S}_{\Delta N^2}(\omega)\,\mathbf{M}^T \tag{11}$$

($\mathbf{M}=[M_1\ M_2\ \ldots\ M_n]$) and it can be expressed in the general form

$$S_{\Delta m,n}(\omega)=\left(\sum_{i=0}^{n-1}K_i\,\omega^{2i}\right)\Big/\prod_{i=1}^{n}\left(1+\omega^2\tau_i^2\right) \tag{12}$$

coefficients $K_i$ and $\tau_i$ are obtained analytically for the given case of adsorption. the resonator ad phase and frequency noise are now obtained using eq. (6), where $S_{\Delta m}(\nu)$ is given by eq. (11) or (12). in order to illustrate the applications of the presented approach, the results obtained for several characteristic cases of adsorption will be given.
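as an illustration of how eqs. (10), (11) and (6) chain together, the following python sketch evaluates them numerically. it is only a sketch under stated assumptions: the function names are hypothetical, numpy is assumed to be available, the rate coefficient and adsorption rate in the example are invented for illustration and are not results from the cited papers, and the particle mass is that of one nitrogen molecule. only $f_0$ and $m$ are taken from the example analyzed in section 3.3.

```python
import numpy as np


def adsorbed_mass_psd(K, G_eq, M, omega):
    """Single-sided PSD of adsorbed-mass fluctuations, eqs. (10)-(11).

    K     : n x n matrix of linearised rate coefficients K_ij, in 1/s
    G_eq  : equilibrium "generation" rates G_ie(N_1e, ..., N_ne), in 1/s
    M     : masses of single adsorbed particles M_i, in kg
    omega : angular Fourier frequency 2*pi*nu, in rad/s
    """
    K = np.atleast_2d(np.asarray(K, dtype=float))
    M = np.asarray(M, dtype=float).reshape(1, -1)
    identity = np.eye(K.shape[0])
    # Diagonal source matrix with elements S_xi,ii = 4 G_ie(N_e).
    S_xi = np.diag(4.0 * np.asarray(G_eq, dtype=float))
    # Eq. (10): spectral matrix of the particle-number fluctuations.
    S_N = (np.linalg.inv(K + 1j * omega * identity) @ S_xi
           @ np.linalg.inv(K.T - 1j * omega * identity))
    # Eq. (11): project onto the particle masses.
    return float(np.real(M @ S_N @ M.T)[0, 0])


def resonator_ad_phase_noise(S_dm, f0, m, nu):
    """AD phase noise of the resonator in dBc/Hz, eq. (6)."""
    return 10.0 * np.log10((f0 / (2.0 * m)) ** 2 * S_dm / (2.0 * nu ** 2))


# f0 and m follow section 3.3; K11 and G1 are hypothetical values,
# M1 is the mass of one N2 molecule (28 atomic mass units).
f0, m = 50e6, 8.18e-16
K11, G1, M1 = 1.0e3, 1.0e9, 4.65e-26
nu = 100.0
S_dm = adsorbed_mass_psd([[K11]], [G1], [M1], 2.0 * np.pi * nu)
print(resonator_ad_phase_noise(S_dm, f0, m, nu))
```

for $n=1$ the matrix expression collapses to a single lorentzian with the time constant $1/K_{11}$, which gives a quick sanity check of any implementation before it is applied to gas mixtures or multilayer adsorption.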
for example, in the case of single-gas single-layer adsorption [113, 115, 123]

$$S_{\Delta m,1}(\nu)=\frac{4M_1^2\,G_{1e}(N_{1e})\,\tau_1^2}{1+(2\pi\nu)^2\tau_1^2} \tag{13}$$

where $\tau_1=1/K_{11}=-(\partial G_{1e}/\partial N_1-\partial R_{1e}/\partial N_1)^{-1}|_{N_1=N_{1e}}$, and $N_{1e}$ is obtained from the steady-state condition $G_{1e}(N_{1e})=R_{1e}(N_{1e})$. considering the mass transfer process, the psd of the adsorbed mass fluctuations for the single-gas single-layer case is also given by eq. (13), with $\tau_{1,mt}=\tau_1(1+G_{1ee}RT/(Ak_mp))$ instead of $\tau_1$ ($p$ is the gas pressure, $T$ is the temperature, $A$ is the resonator surface area, and $k_m$ is the mass transfer coefficient) [118]. mass transfer processes of particles in a resonator chamber can influence the fluctuations of the number of adsorbed particles, especially in low-pressure (low-concentration) environments [118]. for the resonator operating in a two-gas atmosphere (the simplest case of multiple gas adsorption), the psd of the mass adsorbed in a single layer is [116, 117]

$$S_{\Delta m,2}(\omega)=4\,\frac{\tau_1^2\tau_2^2}{\tau_z^2}\,\frac{(M_1^2G_{1ee}+M_2^2G_{2ee})(1+\omega^2\tau_z^2)}{(1+\omega^2\tau_1^2)(1+\omega^2\tau_2^2)} \tag{14}$$

where $\tau_{1,2}=2\{K_{11}+K_{22}\pm[(K_{11}+K_{22})^2-4(K_{11}K_{22}-K_{12}K_{21})]^{1/2}\}^{-1}$ and $\tau_z^2=(M_1^2G_{1ee}+M_2^2G_{2ee})[(M_1K_{22}-M_2K_{21})^2G_{1ee}+(M_2K_{11}-M_1K_{12})^2G_{2ee}]^{-1}$. the same expression is valid in the case of two-layer adsorption, but $\tau_{1,2}$ and $\tau_z$ depend on different parameters because the functions $G_{ie}$ and $R_{ie}$ are different [120].

3.2. phase noise in rf mems/nems oscillators

oscillators, which produce continuous periodic signals from dc power, are important for modern communication systems, due to their versatile applications including timing references and frequency synthesizers. the effects of oscillator phase noise become increasingly detrimental with the introduction of new wireless standards based on advanced modulation schemes [125], which makes modeling of the phase noise of rf oscillators in modern wireless communication systems very useful. therefore, the theory of phase noise is being constantly improved [125, 126].
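looking back at the two-gas spectrum of eq. (14) in section 3.1, a short numerical cross-check against the general matrix expressions (10)-(11) can be sketched as follows. this is a minimal sketch, not the authors' code: the function names are hypothetical, numpy is assumed, and the rate matrix, adsorption rates and particle masses are invented illustrative values (the masses correspond to n2 and o2 molecules).

```python
import numpy as np


def psd_two_gas_closed_form(K, G_eq, M, omega):
    """Two-gas, single-layer PSD of adsorbed mass in the form of eq. (14)."""
    (K11, K12), (K21, K22) = K
    G1, G2 = G_eq
    M1, M2 = M
    trace = K11 + K22
    root = np.sqrt(trace ** 2 - 4.0 * (K11 * K22 - K12 * K21))
    tau1 = 2.0 / (trace + root)
    tau2 = 2.0 / (trace - root)
    weight = M1 ** 2 * G1 + M2 ** 2 * G2
    # tau_z squared, obtained from the numerator of M (K + j w I)^-1.
    tau_z2 = weight / ((M1 * K22 - M2 * K21) ** 2 * G1
                       + (M2 * K11 - M1 * K12) ** 2 * G2)
    return (4.0 * tau1 ** 2 * tau2 ** 2 / tau_z2 * weight
            * (1.0 + omega ** 2 * tau_z2)
            / ((1.0 + omega ** 2 * tau1 ** 2) * (1.0 + omega ** 2 * tau2 ** 2)))


def psd_two_gas_matrix(K, G_eq, M, omega):
    """Same quantity evaluated directly from eqs. (10)-(11)."""
    K = np.asarray(K, dtype=float)
    M = np.asarray(M, dtype=float).reshape(1, -1)
    identity = np.eye(2)
    S_xi = np.diag(4.0 * np.asarray(G_eq, dtype=float))
    S_N = (np.linalg.inv(K + 1j * omega * identity) @ S_xi
           @ np.linalg.inv(K.T - 1j * omega * identity))
    return float(np.real(M @ S_N @ M.T)[0, 0])


# Hypothetical rate coefficients (1/s), adsorption rates (1/s) and masses (kg).
K = [[1500.0, 500.0], [200.0, 900.0]]
G_eq = (1.0e9, 4.0e8)
M = (4.65e-26, 5.31e-26)
for nu in (1.0, 100.0, 1.0e4):
    w = 2.0 * np.pi * nu
    print(nu, psd_two_gas_closed_form(K, G_eq, M, w), psd_two_gas_matrix(K, G_eq, M, w))
```

the two columns printed for each offset frequency should agree to within rounding, since the closed form is just the $2\times2$ inverse written out; the sketch assumes real, distinct time constants $\tau_1$ and $\tau_2$.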
a signal generated by a real oscillator can be expressed by $u(t)=A_0\,2^{-1/2}e^{j2\pi f_0t}e^{j\varphi(t)}$, where $A_0$ is the amplitude (usually considered as constant [125]), $f_0$ is the carrier frequency, and $\varphi(t)$ is the resulting stochastic fluctuation of the phase, caused by noise generating mechanisms in the components that constitute the oscillator. the spectrum of the signal $u(t)$ is located around the frequency $f=f_0$, and shaped by $\theta(t)=e^{j\varphi(t)}$. therefore, when considering only the spectrum shape, it is convenient to analyze the spectrum translated to the baseband, which is then the spectrum of $\theta(t)$. if the psd of $\theta(t)$ is denoted with $S_\theta(\nu)$, the oscillator phase noise (expressed in dbc/hz) is [125]

$$L_{osc}(\nu)=10\log(S_\theta(\nu)) \tag{15}$$

($\nu$ is the fourier frequency). $S_\theta(\nu)$ is obtained by using the wiener-khinchin theorem

$$S_\theta(\nu)=\int_{-\infty}^{\infty}R_\theta(\tau)\,e^{-j2\pi\nu\tau}d\tau=\int_{-\infty}^{\infty}e^{-\sigma_\varphi^2(\tau)/2}\,e^{-j2\pi\nu\tau}d\tau \tag{16}$$

where $R_\theta(\tau)$ is the autocorrelation function of $\theta(t)$, and $\sigma_\varphi^2(\tau)$ is the phase jitter variance (the variance of the phase increments) [125]. since $\varphi(t)$ is the integration result of stationary frequency noise, the variance $\sigma_\varphi^2(\tau)$ is related to the psd of frequency noise $S_{\Delta f}(\nu)$ according to the expression [125, 127]

$$\sigma_\varphi^2(\tau)=\int_{-\infty}^{\infty}S_{\Delta f}(\nu)\,\frac{\sin^2(\pi\nu\tau)}{(\pi\nu)^2}\,d\nu \tag{17}$$

considering a mems oscillator consisting of a mems mechanical resonator (as a frequency selective element) and sustaining electronics, the total phase fluctuations are caused by noise generation mechanisms in the oscillator's electronic circuit, but also by the resonator mechanical frequency noises. due to the noise induced by dissipation processes in a resonator and sustaining circuits (called the brownian motion noise, or white noise), the phase undergoes a diffusion process, with the diffusion constant $D_\varphi$. the omnipresent 1/f noise of oscillator components also causes phase fluctuations. the corresponding variances, $\sigma^2_{\varphi,b}(\tau)$ and $\sigma^2_{\varphi,1/f}(\tau)$, are given in [125, 126]. by using eq.
(17) the variance of the phase increments due to ad noise, $\sigma^2_{\varphi,ad}(\tau)$, can be obtained, applying $S_{\Delta f}(\nu)=(f_0/2m)^2S_{\Delta m}(\nu)$, where $S_{\Delta m}(\nu)$ is given by eq. (12). we determined the variance $\sigma^2_{\varphi,ad}(\tau)$ for single-gas single-layer adsorption (by using eqs. (13) and (17)) [122]. in the presence of these noises the corresponding variances are

$$\sigma^2_{\varphi,b}(\tau)=2D_\varphi\tau,\qquad \sigma^2_{\varphi,1/f}(\tau)=k\tau^2,\qquad \sigma^2_{\varphi,ad}(\tau)=p\left[\tau-\tau_1\left(1-e^{-\tau/\tau_1}\right)\right]/2 \tag{18}$$

where $k$ is related to parameters of the 1/f noise in an oscillator circuit, and $p=G_{1e}(N_{1e})\,\tau_1^2M_1^2(f_0/2m)^2$. assuming that the mentioned noise sources are not correlated, the total variance equals the sum of the components that correspond to each of the frequency noise sources. the corresponding total autocorrelation function equals the product of the autocorrelation functions of all noise contributors. the individual spectra are denoted with $S_{\theta,b}(\nu)$, $S_{\theta,1/f}(\nu)$ and $S_{\theta,ad}(\nu)$, and they are obtained using eqs. (16) and (18). the psd of $\theta(t)$, i.e. $S_\theta(\nu)$, is the convolution of the individual power spectra in the frequency domain. the oscillator phase noise, which includes the influence of the resonator ad noise, is then obtained from eq. (15). the component which represents the contribution of the lorentzian ad frequency noise of the resonator to the oscillator's total phase noise is [122]

$$S_{\theta,ad}(\nu)=2\tau_1\,\mathrm{Re}\left\{\frac{e^{a}}{a^{z}}\int_0^{a}t^{z-1}e^{-t}dt\right\}=2\tau_1\,\mathrm{Re}\left\{\frac{e^{a}}{a^{z}}\,\gamma(z,a)\right\} \tag{19}$$

where $\gamma(z,a)$ is the lower incomplete gamma function, $a=p\tau_1/4$, $z=p\tau_1/4+j\omega\tau_1$, and $\omega=2\pi\nu$.

3.3. results of numerical calculations and discussion

by applying the derived theoretical model, we analyzed single-layer adsorption on the surface of a mems/nems silicon resonator with the nominal resonance frequency of 50 mhz ($m=8.18\cdot10^{-16}$ kg, $A=8.89\cdot10^{-12}$ m$^2$). fig.
7a shows the dependence of the resonator ad phase noise ($L_r(\nu)=10\log(0.5(f_0/2m)^2S_{\Delta m}(\nu)/\nu^2)$ [123]) on the pressure of the gas inside the resonator's housing, at various offset frequencies from the nominal, at $T=300$ k, for the case of the resonator operating in a single-gas atmosphere. the gas is nitrogen [123]. it can be observed that this noise has a low magnitude at near-atmospheric pressures, but becomes significant as the pressure decreases. this observation is very significant when optimization of operating conditions of mems/nems resonators is performed. in a majority of studies resonator tm noise is analyzed and the operating conditions are chosen so as to minimize it. this means that low pressure values are chosen at which energy dissipation due to the surrounding medium is low enough, so tm noise is also reduced. at typical pressure values in evacuated mems/nems resonator packages, ad noise has its maximum, and, therefore, it can become dominant [115, 123]. therefore, the optimization of operating conditions of rf mems/nems resonators in order to minimize their total noise must be performed based on the analysis of all the noises dependent on the ambient pressure and temperature. ad noise becomes increasingly important in such analysis with the increase of resonance frequencies, i.e. with the decrease of resonator dimensions, as can be seen in fig. 7b. ad phase noise of the same resonator, but operating in a two-gas atmosphere, is shown in fig. 7c, as a function of both the gas 1 pressure (the gas 2 pressure is $p_2=10^{3}$ pa, $T=300$ k) and the offset frequency. it can be observed that the presence of multiple gases affects resonator ad noise. in [116, 117] it is shown that in a certain pressure and frequency range the magnitude of the ad noise spectrum for two-gas adsorption is lower than for the case of one gas.
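for readers who wish to reproduce curves of the kind discussed in this section, the oscillator spectrum component of eq. (19) can be evaluated without a complex-argument incomplete gamma routine. the sketch below relies on the standard series $\gamma(z,a)=a^{z}e^{-a}\sum_{k\ge0}a^{k}/(z(z+1)\cdots(z+k))$, which turns the bracket in eq. (19) into a plain power series. this is an assumption-laden illustration rather than the procedure used in [122]: the function name is hypothetical, the parameter values are invented, and the truncated series is only suitable for moderate values of $a$.

```python
import math


def s_theta_ad(nu, p, tau1, max_terms=500):
    """Contribution of Lorentzian AD frequency noise to S_theta, eq. (19).

    Uses (e^a / a^z) * gamma(z, a) = sum_{k>=0} a^k / (z (z+1) ... (z+k)),
    with a = p*tau1/4 and z = a + j*2*pi*nu*tau1 (requires p > 0).
    """
    a = p * tau1 / 4.0
    z = complex(a, 2.0 * math.pi * nu * tau1)
    term = 1.0 / z            # k = 0 term of the series
    total = term
    for k in range(1, max_terms):
        term *= a / (z + k)   # ratio of consecutive terms
        total += term
        if abs(term) < 1e-16 * abs(total):
            break
    return 2.0 * tau1 * total.real


# Hypothetical parameters: tau1 = 1 ms and p chosen so that a = 1.
tau1 = 1.0e-3
p = 4.0 / tau1
for nu in (0.0, 10.0, 100.0, 1000.0):
    print(nu, s_theta_ad(nu, p, tau1))
```

at $\nu=0$ and $a=1$ the series reduces to $\sum_k 1/(k+1)!=e-1$, which offers a simple check of the implementation; for large $a$ the terms grow before they decay, so a dedicated special-function library would be a safer choice there.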
however, since the decrease does not exist at all frequencies, the effect of the gas mixture composition on the total ad noise in the bandwidth of interest should be observed. the presented theory enables an analysis that yields the optimal gas mixture composition at which the resonator ad noise is minimized. the psd $S_{\theta,ad}(\nu)$ for a single-gas atmosphere at different gas pressures [122] is shown in fig. 7d, for the same example as shown in fig. 7a. the pressure values are chosen as follows: one of them approximates the pressure at which ad noise has its maximum, another one is an order of magnitude lower, and the remaining one is an order of magnitude higher. this is the first result that illustrates the spectral dependence of the component of the oscillator phase noise that is caused by the ad frequency noise. its influence on oscillator noise must be analyzed together with the other two components ($S_{\theta,b}(\nu)$ and $S_{\theta,1/f}(\nu)$), by determining the total psd of $\theta(t)$, i.e. $S_\theta(\nu)$ (as the convolution in the frequency domain of the three psds), and subsequently (based on eq. (15)) obtaining the oscillator phase noise, which includes the influence of the resonator ad noise.

fig. 7 a) dependence of the mems/nems resonator ad phase noise (single-gas single-layer adsorption) on gas pressure and offset frequency ($T=300$ k, nitrogen, $f_0=50$ mhz), b) ad phase noise (single-gas single-layer adsorption) as a function of the resonance frequency ($p=0.01$ pa), c) ad phase noise for a two-gas atmosphere, as a function of the gas 1 pressure and offset frequency (pressure of the gas 2 is $p_2=10^{3}$ pa), d) the calculated psd (eq. (19)) of the oscillator phase noise constituent caused by the ad process of a single gas of pressure $p$.

4. conclusion

rf components based on mems and nems structures are expected to have an important role in achieving new levels of integration and reconfigurability of transceivers in future mobile terminals.
mems/nems resonators have generated a significant interest because of their ultra-high resonance frequencies, small size, very low operating power, high quality factors and possibility of integration with silicon ic technologies. rf mems resonators have already become a competitive alternative to conventional components used for realization of rf filtering and frequency synthesis in wireless transceivers. significant results achieved after the year 2000 (especially in terms of better temperature stability, better long-term stability, improved packaging) have enabled the commercialization of both piezoelectric and capacitive mems resonators. due to unique mechanical properties (enabling high resonance frequencies), especially their mechanical compliance needed for high frequency tunability, nems resonators, including 2d (graphene) resonators, provide a promising basis for future ultrafast communication systems. the studies in the field of nems are at the forefront of physical and engineering sciences. however, a number of issues need to be addressed in order to enable practical applications. there are problems that remain to be solved on both the theoretical and the experimental level. most developments in this field are currently confined to theoretical models, simulations and laboratory experiments, with nems components in a prototype stage, at best. noise generation mechanisms in mems and nems resonant structures require further investigation in order for their ultimate performances to be determined. the extremely low mass of these resonators and their large surface-to-volume ratio make them highly sensitive to added mass. therefore, stochastic adsorbed mass fluctuations influence the fluctuations of mems/nems resonator parameters, i.e. they are a source of adsorption-desorption (ad) noise which contributes to the resonator total noise.
by analyzing the ad noise, which becomes prominent as the dimensions and mass of the components decrease, we have contributed to the theory of noise in mems and nems resonators, and subsequently to the theory of phase noise in oscillators using rf mems/nems resonators as frequency determining elements. the theoretical model of ad phase noise enables prediction of ad noise during the design of such components, so that the dominant noise generating mechanism can be identified and the resonator parameters and operating conditions can then be optimized for minimal noise.

acknowledgement. this work was funded by the serbian ministry of education, science and technological development (project tr 32008) and by the serbian academy of sciences and arts (project f/150). the authors would like to express their gratitude to prof. dr. gradimir milovanović, full member of the serbian academy of sciences and arts, for his contribution in solving mathematical problems.

references
[1] j. basu and t. k. bhattacharyya, "microelectromechanical resonators for radio frequency communication applications", microsystem technologies, vol. 17, pp. 1557-1580, 2011. [2] b. kim, m. a. hopcroft and r. n. candler, "silicon mems resonators for timing applications", in microelectronics to nanoelectronics materials, devices & manufacturability, a. b. kaul, ed., crc press, 2012, pp. 79-108. [3] j. t. m. van beek and r. puers, "a review of mems oscillators for frequency reference and timing applications", j. micromech. microeng., vol. 22, pp. 013001 1-35, 2012. [4] c. t.-c. nguyen, "vibrating rf mems overview: applications to wireless communications", in proceedings of spie: micromachining and microfabrication process technology, vol. 5715, photonics west: moems-mems 2005, san jose, california, 2005, pp. 11-25. [5] d. e. serrano, r. tabrizian and f.
ayazi, "tunable piezoelectric mems resonators for real-time clock", in proceedings of the joint conference of the ieee international frequency control and the european frequency and time forum (fcs), san francisco, ca, 2011, pp. 1-4. [6] i. voiculescu and a. n. nordin, "acoustic wave based mems devices, development and applications", in: microelectromechanical systems and devices, n. islam, ed., intech, 2012, chapter 4, pp. 65-86. [7] r. tabrizian, m. rais-zadeh and f. ayazi, "effect of phonon interactions on limiting the f·q product of micromechanical resonators", in proceedings of the international solid-state sensors, actuators and microsystems conference (transducers 2009), denver, co, 2009, pp. 2131–2134. [8] c. zuo, j. van der spiegel and g. piazza, "1.05-ghz cmos oscillator based on lateral field-excited piezoelectric aln contour mode mems resonators", ieee trans. ultrason. ferroelectr. freq. control, vol. 57, pp. 82-87, 2010. [9] g. piazza, p. j. stephanou, j. m. porter, m. b. j. wijesundara and a. p. pisano, "low motional resistance ring-shaped contour-mode aluminium nitride piezoelectric micromechanical resonators for uhf applications", in proceedings of the 18 th ieee international conference on micro electromechanical systems (mems 2005), 2005, pp. 20-23. [10] m. rinaldi, c. zuniga and g. piazza, "5-10 ghz aln contour-mode nanoelectromechanical resonators", in proceedings of the 22 nd ieee international conference on micro electro mechanical systems (mems 2009), 2009, pp. 916-919. [11] h. m. lavasani, p. wanling, b. harrington, r. abdolvand and f. ayazi, "a 76 db 1.7 ghz 0.18 μm cmos tunable tia using broadband current pre-amplifier for high frequency lateral micromechanical oscillators", ieee journal of solid-state circuits, vol. 46, pp. 224-235, jan 2011. [12] w.-c. chen, w. fang and s.-s.
li, "quasi-linear frequency tuning for cmos-mems resonators", in proceedings of the 24 th ieee international conference on micro electro mechanical systems (mems 2011), 2011, pp. 784-787. [13] g. k. ho, k. sundaresan, s. pourkamali and f. ayazi, "low-motional-impedance highly-tunable i 2 resonators for temperature compensated reference oscillators", in proceedings of the ieee micro electro mechanical systems conference (mems‘05), miami, fl, 2005, pp. 116-120. [14] h. g. barrow, t. l. naing, r. a. schneider, t. o. rocheleau, v. yeh, z. ren and c. t.-c. nguyen, "a real-time 32.768-khz clock oscillator using a 0.0154-mm 2 micromechanical resonator frequencysetting element", in proceedings of the ieee international freq. control symposium, baltimore, md, 2012, pp. 1-6. [15] z. hao, s. pourkamali and f. ayazi, "vhf single-crystal silicon elliptic bulk-mode capacitive disk resonators–part i: design and modeling", j. microelectromech. syst., vol. 13, pp. 1043–1053, 2004. [16] j. e. y. lee and a. a. seshia, "5.4-mhz single-crystal silicon wine glass mode disk resonator with quality factor of 2 million", sens. actuators a, vol. 156, pp. 28–35, 2009. [17] j. wang, j. e. butler, t. feygelson and c. t.-c. nguyen, "1.51-ghz polydiamond micromechanical disk resonator with impedance-mismatched isolating support", in proceedings of the 17 th ieee international conference on micro electro mechanical systems, maastricht, the netherlands, 2004, pp. 641-644. [18] frederic nabki, "silicon carbide micro-electromechanical resonators for highly integrated frequency synthesizers", phd thesis, mcgill university, montreal, canada, 2009. [19] k. wang, a.-c. wong and c. t.-c. nguyen, "vhf free–free beam high-q micromechanical resonators", ieee/asme j. microelectromech. syst., vol. 9, pp. 347-360, 2000. [20] s. pourkamali and f. 
ayazi, "soi-based hf and vhf single-crystal silicon resonators with sub-100 nanometer vertical capacitive gaps", in proceedings of the 12 th international conference on solid state sensors, actuators and microsystems (transducers ‘03), boston, 2003, pp. 837-840. [21] y. naito, p. helin, k. nakamura, j. de coster, b. guo, l. haspeslagh, k. onishi and h. tilmans, "high-q torsional mode si triangular beam resonators encapsulated using sige thin film", in proceedings of the ieee international electron devices meeting, san francisco, 2010, pp. 154–157. [22] t. j. cheng and s. a. bhave, "high-q, low impedance polysilicon resonators with 10 nm air gaps", in proceedings of the 23 rd ieee international conference on micro electro mechanical systems (mems 2010), wanchai, hong kong, 2010, pp. 695-698. [23] v. kaajakari, t. mattila, a. oja, j. kiihamäki and h. seppä, "square-extensional mode single-crystal silicon micromechanical resonator for low-phase-noise oscillator applications", ieee electron device lett., vol. 25, pp. 173–175, 2004. [24] g. wu, d. xu, b. xiong and y. wang, "a high-performance bulk mode single crystal silicon microresonator based on a cavity-soi wafer", j. micromech. microeng., vol. 22, pp. 025020 1-8, 2012. [25] y.-w. lin, s.-s. li, z. ren and c. t.-c. nguyen, "low phase noise array-composite micromechanical wine-glass disk oscillator", in proceedings of the ieee international electron devices meeting, washington dc, 2005, pp. 287-290. [26] m. u. demirci, m. a. abdelmoneum and c. t.-c. nguyen, "mechanically corner-coupled square microresonator array for reduced series motional resistance", in proceedings of the 12 th international conference on solid-state sensors & actuators (transducers’03), boston, massachusetts, 2003, pp. 955-958. [27] s.-s. li, y.-w. lin, y. xie, z. ren and c. t.-c.
nguyen, "micromechanical "hollow-disk" resonators", in proceedings of the 17 th ieee international conference on micro electro mechanical systems (mems 2004), 2004, pp. 821-824. [28] p. ovartchaiyapong, l. m. a. pascal, b. a. myers, p. lauria and a. c. bleszynski jayich, "high quality factor single-crystal diamond mechanical resonators", appl. phys. lett., vol. 101, pp. 163505 1-4, 2012. [29] j. e.-y. lee and a. a. seshia, "square wine glass mode resonator with quality factor of 4 million", in proceedings of the 7 th ieee conference on sensors, lecce, italy, pp. 1257–1260, 2008. [30] d. weinstein and s. a. bhave, "internal dielectric transduction in bulk-mode resonators", j. microelectromech. syst., vol. 18, pp. 1401–1408, 2009. [31] r. henry and d. kenny, "comparative analysis of mems, programmable, and synthesized frequency control devices versus traditional quartz based devices", in proceedings of the ieee frequency control symposium, honolulu, hi, 2008, pp. 396–401. [32] j. l. lopez, j. verd, j. teva, g. murillo, j. giner, f. torres, a. uranga, g. abadal and n. barniol, "integration of rf-mems resonators on submicrometric commercial cmos technologies", j. micromech. microeng., vol. 19, pp. 015002 1-10, 2009. [33] a. n. nordin and m. e. zaghloul, "modeling and fabrication of cmos surface acoustic wave resonators", ieee trans. on microwave theory tech., vol. 55, pp. 992-1001, 2007. [34] h. campanella, e. cabruja, j. montserrat, a. uranga, n. barniol and j. esteve, "thin-film bulk acoustic wave resonator floating above cmos substrate", ieee electron device lett., vol. 29, pp. 28-30, 2008. [35] m. hara, j. kuypers, t. abe and m. esashi, "mems based thin film 2 ghz resonator for cmos integration", in proceedings of ieee mtt-s international microwave symposium digest, philadelphia, pa, 2003, vol. 3, pp. 1797-1800. [36] b. p. otis and j. m. rabaey, "a 300-μw 1.9-ghz cmos oscillator utilizing micromachined resonators", ieee j. solid-state circuits, vol. 38, pp.
1271-1274, 2003. [37] j. s. wang and k. m. lakin, "sputtered aln films for bulk-acoustic-wave devices", in proceedings of ultrasonics symposium, chicago, il, 1981, pp. 502-505. [38] m. a. dubois, j. f. carpentier, p. vincent, c. billard, g. parat, c. muller, p. ancey and p. conti, "monolithic above-ic resonator technology for integrated architectures in mobile and wireless communication", ieee j. solid-state circuits, vol. 41, pp. 7–16, 2006. [39] b. p. harrington, m. shahmohammadi and r. abdolvand, "toward ultimate performance in ghz mems resonators: low impedance and high q", in proceedings of the 23 rd ieee international conference on micro electro mechanical systems (mems), wanchai, hong kong, 2010, pp. 707-710. [40] l. khine, "performance parameters of micromechanical resonators", phd. thesis, national university of singapore, 2010. [41] j. wang, z. ren and c. t.-c. nguyen, "1.156-ghz self-aligned vibrating micromechanical disk resonator", ieee trans. ultrason. ferroelect. freq. control, vol. 51, pp. 1607-1628, 2004. [42] k. sundaresan, g. k. ho, s. pourkamali and f. ayazi, "electronically temperature compensated silicon bulk acoustic resonator reference oscillators", ieee j. solid-state circuits, vol. 42, pp. 1425–1434, 2007. [43] j. salvia, m. messana, m. ohline, m. a. hopcroft, r. melamud, s. chandorkar, h. k. lee, g. bahl, b. murmann and t. w. kenny, "exploring the limits and practicality of q-based temperature compensation for silicon resonators", in proceedings of international electron devices meeting, san francisco, ca, 2008, pp. 1-4. [44] w.-t. hsu and c. t.-c. nguyen, "stiffness-compensated temperature insensitive micro-mechanical resonators", in proceedings of ieee international micro electro mechanical systems conference, las vegas, nevada, 2002, pp. 731–734. [45] a. k. samarao and f.
ayazi, "temperature compensation of silicon micromechanical resonators via degenerate doping", in proceedings of ieee international electron devices meeting, baltimore, 2009, pp. 1–4. [46] a. k. samarao, g. casinovi and f. ayazi, "passive tcf compensation in high q silicon micromechanical resonators", in proceedings of the 23 rd ieee international conference on microelectromechanical systems, hong kong, 2010, pp. 116–119. [47] r. melamud, s. a. chandorkar, k. bongsang, h. k. lee, j. c. salvia, g. bahl, m. a. hopcroft and t. w. kenny, "temperature insensitive composite micromechanical resonators", j. microelectromechanical systems, vol. 18, pp. 1409–1419, 2009. [48] h. k. lee, m. a. hopcroft, r. k. melamud, b. kim, j. salvia, s. chandorkar and t. w. kenny, "electrostatic tuning of hermetically encapsulated composite resonators", in proceedings of ieee solid state sensor, actuator and microsystems workshop, hilton head, 2008, pp. 48–51. [49] j. h. seo, k. s. demirci, a. byun, s. truax and o. brand, "novel temperature compensation scheme for microresonators based on controlled stiffness modulation", in proceedings of international conference on solid-state sensors, actuators and microsystems (transducers 2007), 2007, pp. 2457–2360. [50] h. wan-thai, j. r. clark and c. nguyen, "mechanically temperature-compensated flexural-mode micromechanical resonators", in proceedings of ieee international electron devices meeting, san francisco, ca, 2000, pp. 399-402. [51] f. nabki, t. a. dusatko and m. n. el-gamal, "frequency tunable silicon carbide resonators for mems above ic", in proceedings of ieee custom integrated circuits conference, san jose, ca, 2008, pp. 185-188. [52] w. pang, h. zhang, h. yu, c.-y. lee and e. s. kim, "electrical frequency tuning of film bulk acoustic resonator", j. microelectromechanical systems, vol. 16, pp. 1303-1313, 2007. [53] m. lutz, a. partridge, p. gupta, n. buchan, e. klaassen, j. mcdonald and k.
petersen, "mems oscillators for high volume commercial applications", in proceedings of the 14 th ieee international conference on solid state sensors, actuators and microsystems, 2007, pp. 49–52. [54] w. t. hsu, "reliability of silicon resonator oscillators", in proceedings of ieee international frequency control symposium and exposition, miami, fl, 2006, pp. 389–392. [55] m. lutz, j. mcdonald, p. gupta, a. partridge, c. dimpel and k. petersen, "new mems timing references for automotive applications", in advanced microsystems for automotive applications, j. valldorf, w. gessner, eds., berlin: springer, 2007, pp. 279-289. [56] y. w. lin, s. lee, z. ren and c. t.-c. nguyen, "series-resonant micromechanical resonator oscillator", in proceedings of ieee international electron devices meeting, washington, dc, 2003, pp. 39.4.1–39.4.4. [57] t. mattila, o. jaakkola, j. kiihamaki, j. karttunen, t. lamminmaki, p. pantakari, a. oja, h. seppa, h. kattelus and i. tittonen, "14 mhz micromechanical oscillator", sens. actuators a, vol. 97-98, pp. 497-502, 2002. [58] m. sworowski, f. neuilly, b. legrand, a. summanwar, p. philippe and l. buchaillot, "fabrication of 24-mhz-disk resonators with silicon passive integration technology", ieee electron device lett., vol. 31, pp. 23–25, 2010. [59] h. lavasani, a. k. samarao, g. casinovi and f. ayazi, "a 145 mhz low phase-noise capacitive silicon micromechanical oscillator", in proceedings of international electron devices meeting, san francisco, ca, 2008, pp. 675–678. [60] p. rantakari, v. kaajakari, t. mattila, j. kiihamoki, a. oja, i. tittonen and h. seppa, "low noise, low power micromechanical oscillator", in proceedings of the 13 th international conference on solidstate sensors, actuators and microsystems (transducers), seoul, 2005, vol. 2, pp. 2135–2138. [61] m. akgul, b. kim, l. w. hung, y. lin, w.-c. li, w.-l. huang, i. gurin, a. borna and c. t.-c. 
nguyen, "oscillator far-from carrier phase noise reduction via nano-scale gap tuning of micromechanical resonators", in proceedings of the solid-state sensors, actuators and microsystems conference (transducers), denver, co, 2009, pp. 798–801. [62] m. ziaei-moayyed, j. hsieh, j.-w. p. chen, e. p. quevy, d. elata and r. t. howe, "higher-order mode internal electrostatic transduction of a bulk-mode ring resonator on a quartz substrate", in proceedings of the 17th international solid-state sensors, actuators and microsystems conference (transducers), denver, co, 2009, pp. 2338–2341. [63] v. kaajakari, t. mattila, j. kiihamaki, h. kattelus, a. oja and h. seppa, "nonlinearities in single-crystal silicon micromechanical resonators", in proceedings of the 12 th international conference on solid state sensors, actuators and microsystems, boston, ma, 2003, pp. 1574–1577. [64] m. u. demirci and c. nguyen, "mechanically corner-coupled square microresonator array for reduced series motional resistance", ieee/asme j. microelectromechanical systems, vol. 15, pp. 1419-1436, dec. 2006. [65] c.t.-c. nguyen, "mems technology for timing and frequency control", ieee trans. ultrason. ferroelect. freq. control, vol. 54, pp. 251-270, 2007. [66] y. t. yang, k. l. ekinci, x. m. h. huang, l. m. schiavone, m. l. roukes, c. a. zorman and m. mehregany, "monocrystalline silicon carbide nanoelectromechanical systems", appl. phys. lett., vol. 78, pp. 162-164, 2001. [67] h. a. c. tilmans, m. elwenspoek and j. h. j. fluitman, "micro resonant force gauges", sens. actuators a, vol. 30, pp. 35-53, 1992. [68] c. chen, s. lee, v. v. deshpande, g.-h. lee, m. lekas, k. shepard and j. hone, "graphene mechanical oscillators with tunable frequency", nat. nanotechnol., vol. 8, pp. 923-927, 2013. [69] a. n. cleland and m. l.
roukes, "fabrication of high frequency nanometer scale mechanical resonators from bulk si crystals", appl. phys. lett., vol. 69, pp. 2653-2655, 1996. [70] k. l. ekinci and m. l. roukes, "nanoelectromechanical systems", rev. sci. instrum., vol. 76, pp. 061101, 2005. [71] x. l. feng, r. r. he, p. d. yang and m. l. roukes, "very high frequency silicon nanowire electromechanical resonators", nano lett., vol. 7, pp. 1953-1959, 2007. [72] x. m. h. huang, x. l. feng, c. a. zorman, m. mehregany and m. l. roukes, "vhf, uhf and microwave frequency nanomechanical resonator", new j. phys., vol. 7, pp. 247 1-15, 2005. [73] d. w. carr, s. evoy, l. sekaric, h. g. craighead and j. m. parpia, "measurements of mechanical resonance and losses in nanometer scale silicon wires", appl. phys. lett., vol. 75, pp. 920-922, 1999. [74] x. m. h. huang, c. a. zorman, m. mehregany and m. l. roukes, "nanodevice motion at microwave frequencies", nature, vol. 421, pp. 496-496, 2003. [75] a. n. cleland, m. pophristic and i. ferguson, "single-crystal aluminum nitride nanomechanical resonators", appl. phys. lett., vol. 79, pp. 2070-2072, 2001. [76] l. sekaric, j. m. parpia, h. g. craighead, t. feygelson, b. h. houston and j. e. butler, "nanomechanical resonant structures in nanocrystalline diamond", appl. phys. lett., vol. 81, pp. 4455-4457, 2002. [77] r. b. karabalin, m. h. matheny, x. l. feng, e. defaÿ, g. le rhun, c. marcoux, s. hentz, p. andreucci and m. l. roukes, "piezoelectric nanoelectromechanical resonators based on aluminum nitride thin films", appl. phys. lett., vol. 95, pp. 103111 1-3, 2009. [78] a. husain, j. hone, h. w. ch. postma, x. m. h. huang, t. drake, m. barbic, a. scherer and m. l. roukes,"nanowire-based very-high-frequency electromechanical resonator", appl. phys. lett., vol. 83, pp. 1240-1242, 2003. [79] t. f. li, y. pashkin, o. astafiev, y. nakamura, j. s. tsai and h. im, "high-frequency metallic nanomechanical resonators", appl. phys. lett., vol. 92, pp. 043112 1-3, 2008. 
[80] n. a. melosh, a. boukai, f. diana, b. gerardo, a. badolato, p. m. petroff and j. r. heath, "ultrahigh-density nanowire lattices and circuits", science, vol. 300, pp. 112-115, 2003. [81] x. l. feng, m. h. matheny, c. a. zorman, m. mehregany and m. l. roukes, "low voltage nanoelectromechanical switches based on silicon carbide nanowires", nano lett., vol. 10, pp. 2891-2896, 2010. [82] o. vazquez-mena, g. villanueva, v. savu, k. sidler, m. a. f. van den boogaart and j. brugger, "metallic nanowires by full wafer stencil lithography", nano lett., vol. 8, pp. 3675-3682, 2008. [83] r. he, d. gao, r. fan, a. i. hochbaum, c. carraro, r. maboudian and p. yang, "si nanowire bridges in microtrenches: integration of growth into device fabrication", adv. mater., vol. 17, pp. 2098-2102, 2005. [84] r. r. he, x. l. feng, m. l. roukes and p. d. yang, "self-transducing silicon nanowire electromechanical systems at room temperature", nano lett., vol. 8, pp. 1756-1761, 2008. [85] c. regal, j. teufel and k. lehnert, "measuring nanomechanical motion with a microwave cavity interferometer", nat. physics, vol. 4, pp. 555–560, 2008. [86] s. t. bartsch, a. lovera, d. grogg and a. m. ionescu, "nanomechanical silicon resonators with intrinsic tunable gain and sub-nw power consumption", acs nano, vol. 6, pp. 256–264, 2012. [87] b. m. zwickl, w. e. shanks, a. m. jayich, c. yang, a. c. bleszynski jayich, j. d. thompson and j. g. e. harris, "high quality mechanical and optical properties of commercial silicon nitride membranes", appl. phys. lett., vol. 92, pp. 103125 1-3, 2008. [88] s. iijima, "helical microtubules of graphitic carbon", nature, vol. 354, pp. 56-58, 1991. [89] s. m. huang, x. y. cai and j. liu, "growth of millimeter-long and horizontally aligned single-walled carbon nanotubes on flat substrates", j. amer. chem. soc., vol. 125, pp. 5636-5637, 2003. [90] h. w. zhu, c. l. xu, d. h. wu, b. q. wei, r. vajtai and p. m.
ajayan, "direct synthesis of long single-walled carbon nanotube strands", science, vol. 296, pp. 884-886, 2002. [91] p. poncharal, z. l. wang, d. ugarte and w. a. de heer, "electrostatic deflections and electromechanical resonances of carbon nanotubes", science, vol. 283, pp. 1513-1516, 1999. [92] v. sazonova, y. yaish, h. üstünel, d. roundy, t. a. arias and p. l. mceuen, "a tunable carbon nanotube electromechanical oscillator", nature, vol. 431, pp. 284-287, 2004. [93] vera sazonova, "a tunable carbon nanotube resonator", phd. thesis, cornell university, 2006. [94] h. b. peng, c. w. chang, s. aloni, t. d. yuzvinsky and a. zettl, "ultrahigh frequency nanotube resonators", phys. rev. lett., vol. 97, pp. 087203 1-4, 2006. [95] d. garcia-sanchez, a. san paulo, m. j. esplandiu, f. perez-murano, l. forró, a. aguasca and a. bachtold, "mechanical detection of carbon nanotube resonator vibrations", phys. rev. lett., vol. 99, pp. 085501 1-4, 2007. [96] e. a. laird, f. pei, w. tang, g. a. steele and l. p. kouwenhoven, "a high quality factor carbon nanotube mechanical resonator at 39 ghz", nano lett., vol. 12, pp. 193–197, 2011. [97] chung-chiang wu, "carbon based nanoelectromechanical resonators", phd. thesis, university of michigan, 2012. [98] m. imboden and p. mohanty, "dissipation in nanoelectromechanical systems", physics reports, vol. 534, pp. 89–146, 2014. [99] b. lassagne, d. garcia-sanchez, a. aguasca and a. bachtold, "ultrasensitive mass sensing with a nanotube electromechanical resonator", nano lett., vol. 8, pp. 3735–3738, 2008. [100] a. huttel, g. steele, b. witkamp, m. poot, l. kouwenhoven and h. van der zant, "carbon nanotubes as ultrahigh quality factor mechanical resonators", nano lett., vol. 9, pp. 2547–2552, 2009. [101] benjamin jose aleman, "carbon nanotube and graphene nanoelectromechanical systems", phd. thesis, university of california, berkeley, 2011. [102] a. k. geim, k. s. novoselov, "the rise of graphene", nat. mater., vol. 6, pp. 183–191, 2007. 
[103] j. s. bunch, a. m. van der zande, s. s. verbridge, i. w. frank, d. m. tanenbaum, j. m. parpia, h. g. craighead and p. l. mceuen, "electromechanical resonators from graphene sheets", science, vol. 315, pp. 490-493, 2007. [104] d. garcia-sanchez, a. m. van der zande, a. san paulo, b. lassagne, p. l. mceuen and a. bachtold, "imaging mechanical vibrations in suspended graphene sheets", nano lett., vol. 8, pp. 1399-1403, 2008. [105] c. y. chen, s. rosenblatt, k. i. bolotin, w. kalb, p. kim, i. kymissis, h. l. stormer, t. f. heinz and j. hone, "performance of monolayer graphene nanomechanical resonators with electrical readout", nat. nanotech., vol. 4, pp. 861-867, 2009. [106] r. a. barton, b. ilic, a. m. van der zande, w. s. whitney, p. l. mceuen, j. m. parpia and h. g. craighead, "high, size-dependent quality factor in an array of graphene mechanical resonators", nano lett., vol. 11, pp. 1232–1236, 2011. [107] y. s. greenberg, y. a. pashkin and e. il'ichev, "nanomechanical resonators", physics – uspekhi, vol. 55, pp. 382-407, 2012. [108] z. djurić, "mechanisms of noise sources in microelectromechanical systems", introductory invited paper, microelectron. reliab., vol. 40, pp. 919-932, 2000. [109] f. l. walls and j. r. vig, "fundamental limits on the frequency stabilities of crystal oscillators", ieee trans. ultrason. ferroel. freq. control, vol. 42, pp. 576-589, 1995. [110] j. r. vig and y. kim, "noise in mems resonators", ieee trans. ultrason. ferroelect. freq. control, vol. 46, pp. 1558-1565, 1999. [111] a. n. cleland and m. l. roukes, "noise processes in nanomechanical resonators", j. appl. phys., vol. 92, pp. 2758-2769, 2002. [112] z. djurić, "noise in nanoelectromechanical systems", invited paper, in proceedings of the 1 st international workshop on nanoscience & nanotechnology iwon 2005, belgrade, serbia and montenegro, 2005, pp. 33-36. [113] z.
djurić, "noise in microsystems and semiconductor photodetectors", in proceedings of the xliv conference etran, sokobanja, serbia, 2000, pp. 9-16. [114] y. k. yong and j. r. vig, "resonator surface contamination – a cause of frequency fluctuations?", ieee trans. ultrason. ferroelect. control, vol. 36, pp. 452-458, 1989. [115] z. djurić, o. jakšić and d. randjelović, "adsorption–desorption noise in micromechanical resonant structures", sens. actuators a, vol. 96, pp. 244-251, 2002. [116] z. djurić, i. jokić, m. frantlović, o. jakšić and d. vasiljević-radović, "adsorbed mass and resonant frequency fluctuations of a microcantilever caused by adsorption and desorption of particles of two gases", in proceedings of the 24 th international conference on microelectronics miel 2004, vol. 1, niš, serbia, 2004, pp. 197-200. [117] z. djurić, i. jokić, m. frantlović and o. jakšić, "fluctuations of the number of particles and mass adsorbed on the sensor surface surrounded by a mixture of an arbitrary number of gases", sens. actuators b, vol. 127, pp. 625-631, 2007. [118] i. jokić, z. djurić, m. frantlović, k. radulović, p. krstajić and z. jokić, "fluctuations of the number of adsorbed molecules in biosensors due to stochastic adsorption–desorption processes coupled with mass transfer", sens. actuators b, vol. 166–167, pp. 535–543, 2012. [119] m. frantlović, i. jokić, z. djurić and k. radulović, "analysis of the competitive adsorption and mass transfer influence on equilibrium mass fluctuations in affinity-based biosensors", sens. actuators b, vol. 189, pp. 71-79, 2013. [120] z. djurić, i. jokić, m. frantlović and k. radulović, "two-layer adsorption and adsorbed mass fluctuations on micro/nanostructures", microel. eng., vol. 86, pp. 1278-1281, 2009. [121] z. djurić, i. jokić, m. djukić and m. frantlović, "fluctuations of the adsorbed mass and the resonant frequency of vibrating mems/nems structures due to multilayer adsorption", microel. eng., vol. 87, pp. 1181-1184, 2010. 
[122] i. jokić, m. frantlović and z. djurić, "rf mems and nems components and adsorption-desorption induced phase noise", in proceedings of the 29 th international conference on microelectronics miel 2014, belgrade, serbia, 2014, pp. 117-124. [123] i. jokić, m. frantlović, z. djurić and m. dukić, "adsorption-desorption phase noise in rf mems/nems resonators", in proceedings of the 10 th international conference on telecommunications in modern satellite, cable and broadcasting services telsiks, niš, serbia, 2011, pp. 114-117. [124] k.m. van vliet and j.r. fasset, "fluctuations due to electronic transitions and transport in solids", in fluctuation phenomena in solids, r. e. burgess, ed., new york and london: academic press, 1965, pp. 267-354. [125] s. yousefi, t. eriksson and d. kuylenstierna, "a novel model for simulation of rf oscillator phase noise", in proceedings of the ieee radio and wireless symposium, new orleans, 2010, pp. 428-431. [126] g. v. klimovitch, "near-carrier oscillator spectrum due to flicker and white noise", in proceedings of the ieee international symposium on circuits and systems iscas 2000, vol. 1, geneva, 2000, pp. 703-706. [127] m. j. buckingham, noise in electronic devices and systems, ellis horwood ltd., 1983.

facta universitatis series: electronics and energetics vol. 30, no. 2, june 2017, pp. 235-244 doi: 10.2298/fuee1702235g

performance analysis of dual-branch selection diversity system using novel mathematical approach

aleksandra golubović 1, nikola sekulović 2, mihajlo stefanović 1, dejan milić 1
1 university of niš, faculty of electronic engineering, niš, republic of serbia
2 college of applied technical sciences, niš, republic of serbia

abstract. in this paper, a novel mathematical approach for evaluating the probability density function (pdf) of the instantaneous signal-to-interference ratio (sir) at the receiver output in an interference-limited environment is proposed.
a dual-branch selection combining (sc) receiver operating over correlated weibull fading channels and applying the sir algorithm is considered. an analytical expression for the joint pdf of the desired signal and interference at the receiver output is derived and used to evaluate the pdf of the instantaneous sir. the expression for the pdf of the sir is then used for system performance analysis, with outage probability, average bit error probability (abep) and average output sir as performance measures. numerical results are presented graphically, showing the effects of fading severity, average input sir and level of correlation on the diversity receiver performance. in addition, the results obtained in this paper for the pdf of the instantaneous sir are compared to the results obtained when the pdf of the instantaneous sir is calculated directly.

key words: cochannel interference, correlated channels, decision algorithms, selection diversity, weibull fading channels.

received july 8, 2016; received in revised form october 16, 2016
corresponding author: aleksandra golubović, faculty of electronic engineering, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: aleksandra321@gmail.com)

1. introduction
the main performance limitations in wireless communication systems are fading and cochannel interference (cci). fading emerges due to multipath propagation, while cci develops as a side effect of frequency reuse. in order to make the system design as accurate as possible, several models are used to describe the statistical behaviour of the multipath fading envelopes, depending on the propagation environment. the most frequently used in the literature are rayleigh, rice, nakagami-m and weibull. this paper focuses on the weibull distribution since it is simple and flexible, yet not exploited as much as the other models. it represents an excellent fit to experimental fading channel measurements for indoor [1], [2] and outdoor [3]-[5] environments.
wireless communication system performance can be improved at relatively low cost by diversity techniques. the basic idea behind diversity systems is simultaneous reception of the same radio signal over two or more paths in order to increase the overall signal-to-noise ratio (snr) [6]. the diversity paths can be separated in space, frequency or time, and in all cases some redundancy in the time, frequency and/or spatial domain is required [7]. compared with other diversity techniques, space diversity is power- and bandwidth-efficient, which makes it the most commonly used diversity technique [8]. if the best of the received signals is selected, or if they are properly combined, the outage time can be substantially reduced [9]. depending on the complexity restrictions of the communication system and the amount of channel state information (csi) available at the receiver, space diversity can use several principal types of combining techniques. techniques like maximal ratio combining (mrc) and equal-gain combining (egc) require some amount of csi of the received signal and a separate receiver chain for each branch of the diversity system, which increases system complexity. on the other hand, a selection combining (sc) receiver processes only one of the diversity branches at a time, so it is much simpler and cheaper to realize in practice [6]. in an interference-limited environment, where the level of cci is sufficiently high compared to noise, the sc receiver can employ one of the following combining algorithms: the desired signal (ds) algorithm, the signal-to-interference ratio (sir) algorithm or the total signal (ts) algorithm [10]. the sir algorithm selects the diversity branch that has the highest sir, and it usually provides the best results in interference-limited systems. l-branch egc and mrc receivers operating over non-identical weibull fading channels have been considered in [11].
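as a rough illustration of the sir decision rule described above, the following python sketch estimates the outage probability of a dual-branch sc receiver by monte carlo simulation. it is a minimal sketch under stated assumptions, not the analysis developed in this paper: the two branches are taken as independent (the paper treats correlated branches), a single fading parameter `beta` is shared by the desired signal and the interference, the average power is taken as `omega = E[x**beta]`, and the function names `weibull_envelope`, `sc_sir_output` and `outage_probability` are hypothetical.

```python
import random


def weibull_envelope(rng, beta, omega):
    """Draw one Weibull envelope x with shape beta and E[x**beta] = omega.

    Assumption: omega is the mean of x**beta, so the scale is omega**(1/beta).
    """
    return rng.weibullvariate(omega ** (1.0 / beta), beta)


def sc_sir_output(rng, beta, omega_d, omega_c):
    """Output SIR of a dual-branch SC receiver using the SIR algorithm.

    Assumption: the two branches fade independently (no correlation).
    """
    branch_sirs = []
    for _ in range(2):
        x = weibull_envelope(rng, beta, omega_d)  # desired signal envelope
        y = weibull_envelope(rng, beta, omega_c)  # interference envelope
        y = max(y, 1e-300)  # guard against a zero draw
        branch_sirs.append(x / y)
    # SIR algorithm: select the branch with the larger SIR.
    return max(branch_sirs)


def outage_probability(beta, omega_d, omega_c, threshold,
                       trials=100_000, seed=1):
    """Estimate P(output SIR < threshold) by Monte Carlo simulation."""
    rng = random.Random(seed)
    outages = sum(
        sc_sir_output(rng, beta, omega_d, omega_c) < threshold
        for _ in range(trials)
    )
    return outages / trials


if __name__ == "__main__":
    print(outage_probability(beta=2.0, omega_d=10.0, omega_c=1.0,
                             threshold=1.0))
```

under these assumptions the single-branch outage probability for a threshold t is t**beta * omega_c / (omega_d + t**beta * omega_c), and the dual-branch value is its square, which gives a quick sanity check of the simulation. branch correlation, which the paper accounts for, would raise the outage probability above this independent-branch value.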
the performance of digital communication receivers that employ the sir algorithm over weibull fading channels was thoroughly investigated in [12]-[14]. the performance of an sc diversity system that operates over correlated weibull fading channels and applies the sir decision algorithm is studied in [12] for a dual-branch system, in [13] for a triple-branch system and in [14] for an l-branch system. a system that uses the ds algorithm, where both the desired signal and the interference are correlated and subject to weibull fading, is presented in [15] for the dual-branch case and in [16] and [17] for the triple-branch case. this paper presents a novel mathematical approach for deriving an expression for the probability density function (pdf) of the instantaneous sir at the output of a selection combining diversity system with two correlated weibull fading channels that applies the sir algorithm. the mathematical approach used in [12] for the same system calculates the pdf of the instantaneous sir at the system output directly, whereas this paper first calculates the joint pdf of the desired signal and interference at the output and then uses the result to calculate the pdf of the instantaneous sir at the output. finally, the results obtained in this paper are compared with the results obtained in [12] and [15].

performance analysis of dual-branch selection diversity system using novel mathematical approach

2. system and channel model
we consider an sc diversity system with two branches in an interference-limited weibull fading environment. in practice, diversity systems are applied in small-size terminals, and complete independence between branches cannot be achieved, which degrades the diversity gain. in such a case, the desired signal envelopes ($x_1$, $x_2$) and the cci envelopes ($y_1$, $y_2$) experience correlated weibull fading with joint pdfs [18, eq.
(11)]
$$p_{x_1x_2}(x_1,x_2)=\frac{\beta_1\beta_2\,x_1^{\beta_1-1}x_2^{\beta_2-1}}{\Omega_{d1}\Omega_{d2}(1-\rho)}\exp\left[-\frac{1}{1-\rho}\left(\frac{x_1^{\beta_1}}{\Omega_{d1}}+\frac{x_2^{\beta_2}}{\Omega_{d2}}\right)\right]I_0\left(\frac{2\sqrt{\rho}\,x_1^{\beta_1/2}x_2^{\beta_2/2}}{(1-\rho)\sqrt{\Omega_{d1}\Omega_{d2}}}\right),\tag{1}$$
$$p_{y_1y_2}(y_1,y_2)=\frac{\beta_1\beta_2\,y_1^{\beta_1-1}y_2^{\beta_2-1}}{\Omega_{c1}\Omega_{c2}(1-\rho)}\exp\left[-\frac{1}{1-\rho}\left(\frac{y_1^{\beta_1}}{\Omega_{c1}}+\frac{y_2^{\beta_2}}{\Omega_{c2}}\right)\right]I_0\left(\frac{2\sqrt{\rho}\,y_1^{\beta_1/2}y_2^{\beta_2/2}}{(1-\rho)\sqrt{\Omega_{c1}\Omega_{c2}}}\right),\tag{2}$$
where $\rho$ represents the branch correlation coefficient ($0\le\rho\le1$) and $\beta_i$ is the weibull fading parameter of the $i$-th branch, which expresses fading severity ($\beta_i>0$). as the value of the weibull fading parameter increases, fading severity decreases. $\Omega_{di}=\overline{x_i^{\beta_i}}$ and $\Omega_{ci}=\overline{y_i^{\beta_i}}$ are the average powers of the desired and interference signals at the $i$-th branch ($i=1,2$), respectively. $I_n(\cdot)$ is the modified bessel function of the first kind and $n$-th order [19, eq. (8.445)].

the instantaneous values of the sir on the first and second diversity branches are defined as $z_1=x_1/y_1$ and $z_2=x_2/y_2$, respectively. the joint pdf of these random variables is
$$p_{z_1z_2}(z_1,z_2)=\int_0^\infty\!\!\int_0^\infty y_1y_2\,p_{x_1x_2}(z_1y_1,z_2y_2)\,p_{y_1y_2}(y_1,y_2)\,dy_1\,dy_2.\tag{3}$$
an sc receiver based on the sir algorithm chooses and outputs the branch with the larger sir, i.e. $z=\max\{z_1,z_2\}$. applying the concepts of probability, the pdf of the instantaneous sir at the output of the sc combiner can be obtained as
$$p_z(z)=\int_0^z p_{z_1z_2}(z,z_2)\,dz_2+\int_0^z p_{z_1z_2}(z_1,z)\,dz_1.\tag{4}$$
the approach described by (3) and (4) is used in previously published papers that study sc receivers. in this work, we propose a mathematical approach based on the calculation of the joint pdf of the desired and interference signal envelopes. the joint pdf of the desired and interference signal envelopes on the input diversity branches can be easily expressed as
$$p_{x_1y_1x_2y_2}(x_1,y_1,x_2,y_2)=p_{x_1x_2}(x_1,x_2)\,p_{y_1y_2}(y_1,y_2).\tag{5}$$
when a dual-branch sc diversity system uses the sir algorithm, one of two conditions has to be fulfilled: 1.
$x=x_1$, $y=y_1$, $\dfrac{x_1}{y_1}>\dfrac{x_2}{y_2}\Rightarrow y_2>\dfrac{y}{x}x_2$,
2. $x=x_2$, $y=y_2$, $\dfrac{x_2}{y_2}>\dfrac{x_1}{y_1}\Rightarrow y_1>\dfrac{y}{x}x_1$.

in that case, the joint pdf of the desired signal and interference envelopes at the output of the dual-branch sc receiver based on the sir algorithm can be obtained as
$$p_{xy}(x,y)=\int_0^\infty\!\!\int_{\frac{y}{x}x_2}^\infty p_{x_1y_1x_2y_2}(x,y,x_2,y_2)\,dy_2\,dx_2+\int_0^\infty\!\!\int_{\frac{y}{x}x_1}^\infty p_{x_1y_1x_2y_2}(x_1,y_1,x,y)\,dy_1\,dx_1,\tag{6}$$
which, by substituting (5) and after some mathematical manipulations, yields
$$\begin{aligned}
p_{xy}(x,y)={}&\beta_1^2\exp\left[-\frac{1}{1-\rho}\left(\frac{x^{\beta_1}}{\Omega_{d1}}+\frac{y^{\beta_1}}{\Omega_{c1}}\right)\right]\sum_{i,j=0}^{\infty}\frac{\rho^{i+j}\,\Gamma(i+j+2)\,x^{\beta_1(i+1)-1}\,y^{\beta_1(j+1)-1}}{(1-\rho)^{i+j}\,(i!\,j!)^2\,(i+1)\,(\Omega_{d1}\Omega_{d2})^{i+1}(\Omega_{c1}\Omega_{c2})^{j+1}}\left(\frac{y}{x}\right)^{\beta_2(j+1)}\\
&\times\left(\frac{1}{\Omega_{d2}}+\frac{y^{\beta_2}}{\Omega_{c2}\,x^{\beta_2}}\right)^{-(i+j+2)}{}_2F_1\left(1,\,i+j+2;\,i+2;\,\left(1+\frac{\Omega_{d2}\,y^{\beta_2}}{\Omega_{c2}\,x^{\beta_2}}\right)^{-1}\right)\\
&+\beta_2^2\exp\left[-\frac{1}{1-\rho}\left(\frac{x^{\beta_2}}{\Omega_{d2}}+\frac{y^{\beta_2}}{\Omega_{c2}}\right)\right]\sum_{m,n=0}^{\infty}\frac{\rho^{m+n}\,\Gamma(m+n+2)\,x^{\beta_2(m+1)-1}\,y^{\beta_2(n+1)-1}}{(1-\rho)^{m+n}\,(m!\,n!)^2\,(m+1)\,(\Omega_{d1}\Omega_{d2})^{m+1}(\Omega_{c1}\Omega_{c2})^{n+1}}\left(\frac{y}{x}\right)^{\beta_1(n+1)}\\
&\times\left(\frac{1}{\Omega_{d1}}+\frac{y^{\beta_1}}{\Omega_{c1}\,x^{\beta_1}}\right)^{-(m+n+2)}{}_2F_1\left(1,\,m+n+2;\,m+2;\,\left(1+\frac{\Omega_{d1}\,y^{\beta_1}}{\Omega_{c1}\,x^{\beta_1}}\right)^{-1}\right),
\end{aligned}\tag{7}$$
where ${}_2F_1(\cdot,\cdot;\cdot;z)$ represents the gaussian hypergeometric function [19, eq. (9.100)] and $\Gamma(\cdot)$ represents the gamma function [19, eq. (8.310.1)]. to the best of the authors' knowledge, the above expression for the joint pdf of the desired and interference signal envelopes at the output of the sir-based sc receiver is novel in the open technical literature. the pdf of the instantaneous sir at the sc output can be calculated using the following equation:
$$p_z(z)=\int_0^\infty y\,p_{xy}(zy,y)\,dy.\tag{8}$$
by substituting (7) in (8) and after integration, the final expression for the pdf of the instantaneous sir at the receiver output is derived as
$$\begin{aligned}
p_z(z)={}&\beta_1(1-\rho)^2\sum_{i,j=0}^{\infty}\frac{\rho^{i+j}\,\Gamma^2(i+j+2)\,z^{\beta_1(i+1)-\beta_2(j+1)-1}}{(i!\,j!)^2\,(i+1)\,(\Omega_{d1}\Omega_{d2})^{i+1}(\Omega_{c1}\Omega_{c2})^{j+1}}\left(\frac{z^{\beta_1}}{\Omega_{d1}}+\frac{1}{\Omega_{c1}}\right)^{-(i+j+2)}\\
&\times\left(\frac{1}{\Omega_{d2}}+\frac{1}{\Omega_{c2}\,z^{\beta_2}}\right)^{-(i+j+2)}{}_2F_1\left(1,\,i+j+2;\,i+2;\,\left(1+\frac{\Omega_{d2}}{\Omega_{c2}\,z^{\beta_2}}\right)^{-1}\right)\\
&+\beta_2(1-\rho)^2\sum_{m,n=0}^{\infty}\frac{\rho^{m+n}\,\Gamma^2(m+n+2)\,z^{\beta_2(m+1)-\beta_1(n+1)-1}}{(m!\,n!)^2\,(m+1)\,(\Omega_{d1}\Omega_{d2})^{m+1}(\Omega_{c1}\Omega_{c2})^{n+1}}\left(\frac{z^{\beta_2}}{\Omega_{d2}}+\frac{1}{\Omega_{c2}}\right)^{-(m+n+2)}\\
&\times\left(\frac{1}{\Omega_{d1}}+\frac{1}{\Omega_{c1}\,z^{\beta_1}}\right)^{-(m+n+2)}{}_2F_1\left(1,\,m+n+2;\,m+2;\,\left(1+\frac{\Omega_{d1}}{\Omega_{c1}\,z^{\beta_1}}\right)^{-1}\right).
\end{aligned}\tag{9}$$
the pdf of the instantaneous sir at the output of the same system, obtained using the mathematical approach described by (3) and (4), is presented in [12] by (11).

table 1 comparison of the number of terms of (9) and of (11) in [12] needed to achieve accuracy at the fourth significant digit (β1=2, β2=3, s1=s2=10 db)

         z=5                                z=25
         (9) in this paper   (11) in [12]   (9) in this paper   (11) in [12]
ρ=0.2    4                   6              5                   6
ρ=0.5    12                  13             13                  15
ρ=0.8    34                  34             34                  41

considering that convergence is a significant problem in infinite-series expressions, table 1 summarizes the number of terms that need to be summed in the expressions for the pdf of the instantaneous sir at the sc output obtained in this paper and in [12] to achieve accuracy at the 4th significant digit after truncation of the infinite series. instead of the individual signal and interference powers that appear in equation (9), the table uses their ratio at the input of the $i$-th branch of the selection combiner, $s_i=\Omega_{di}/\Omega_{ci}$, $i=1,2$.
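the idea of counting the terms needed for four significant digits can be sketched on a much simpler series. the example below uses the power series of the modified bessel function $I_0(u)=\sum_k (u/2)^{2k}/(k!)^2$, which is the source of the infinite sums in the expressions above, rather than the full double series. the stopping rule (stop once the last added term is below half a unit of the fourth significant digit of the running sum) is an assumption; the criterion actually used for table 1 is not stated in the text.

```python
from typing import Iterator, Tuple


def i0_terms(u: float) -> Iterator[float]:
    """Yield successive terms (u/2)^(2k) / (k!)^2 of the I0(u) power series."""
    k = 0
    term = 1.0
    while True:
        yield term
        k += 1
        term *= (u / 2.0) ** 2 / (k * k)


def terms_for_accuracy(terms: Iterator[float],
                       digits: int = 4,
                       max_terms: int = 1000) -> Tuple[int, float]:
    """Return (number of terms summed, partial sum).

    Summation stops when the last added term is no larger than
    0.5 * 10**(-digits) relative to the running sum (assumed criterion).
    """
    tol = 0.5 * 10.0 ** (-digits)
    total = 0.0
    for n, term in enumerate(terms, start=1):
        total += term
        if abs(term) <= tol * abs(total):
            return n, total
        if n >= max_terms:
            break
    raise RuntimeError("series did not reach the requested accuracy")


if __name__ == "__main__":
    for u in (2.0, 6.0):
        n, value = terms_for_accuracy(i0_terms(u))
        print(f"u={u}: {n} terms, partial sum {value:.6f}")
```

a larger argument needs more terms, which mirrors the trend in table 1, where the term count grows with the correlation coefficient.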
the results show that the expression obtained in this paper converges more rapidly than expression (11) in [12], which makes it more manageable for system analysis.

3. system performance analysis
the performance of the dual-branch sc system operating over correlated weibull fading channels is analysed using the analytically obtained expression for the pdf of the instantaneous sir at the output. the performance indicators considered in this section are outage probability, average bit error probability (abep) and average output sir. the influence of fading severity, correlation coefficient and average powers is studied. moreover, the numerical results are compared with the numerical results in [12] to verify the mathematical approach proposed in this work.

3.1. outage probability
being a basic system performance measure in an interference-limited environment, the outage probability, $P_{out}$, can be defined as the probability that the output sir drops below a specified threshold $z_{th}$:
$$P_{out}=\int_0^{z_{th}}p_z(z)\,dz.\tag{10}$$
fig. 1 depicts the outage probability of a balanced ($s_1=s_2=s$) dual-branch sc receiver as a function of the outage threshold for different system parameters. the results obtained in this paper match perfectly the results obtained in [12]. the outage probability decreases for lower values of the outage threshold and higher weibull fading parameters. for higher values of the outage threshold, when the desired signal is dominant, the system performance deteriorates as the weibull fading parameter increases. for fixed values of the weibull fading parameters, the system performance deteriorates as the correlation coefficient increases.

[figure: outage probability versus outage threshold [db] for s = 6 db; markers show the results obtained using [12] for the corresponding parameters]
fig.
1 outage probability of dual-branch sc system

a comparison of the outage probability results obtained when the ds algorithm [15] and the sir algorithm are used is illustrated in fig. 2 for different fading severity. the branches of the receiver are correlated and balanced. it can be seen that the system with the sir algorithm shows slightly better outage performance than the system with the ds algorithm.

[figure: outage probability versus outage threshold [db] for the sir and ds algorithms; two fading parameter values (1 and 4), correlation coefficient 0.6, s1 = s2 = 2 db]
fig. 2 comparison of outage probability for sir and ds decision algorithms

3.2. average bit error probability
abep is one of the important first-order performance measures. it is often used for system performance evaluation because it is the most revealing of the nature of the system behaviour. abep is calculated using the conditional bit error probability (bep), which is a function of the modulation/detection scheme employed by the system. in this paper, two modulations are considered, bdpsk and bfsk. for these two cases, the conditional bep for a given sir is
$$P_e(z)=\frac{1}{2}e^{-gz^2},\tag{11}$$
where $g$ is the modulation constant: $g=1$ for bdpsk and $g=1/2$ for bfsk. the abep at the sc output can be evaluated directly by averaging the conditional bep over the pdf of $z$:
$$P_e=\int_0^\infty P_e(z)\,p_z(z)\,dz.\tag{12}$$
fig. 3 illustrates the abep of a balanced dual-branch sc receiver with bfsk and bdpsk signalling for different correlation coefficients. the results obtained in [12] perfectly match the results obtained in this paper. the system performance is better for lower values of the correlation coefficient, which means that the system performance improves as the distance between the antennas increases.
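the averaging in (12) can also be approximated by simulation. the sketch below is a monte carlo illustration under loudly stated assumptions: the conditional bep is taken as $\frac{1}{2}e^{-gz^2}$ with $z$ the envelope ratio, the two branches are generated as independent (that is, $\rho=0$, not the correlated model of the paper), both branches share one fading parameter, and the interference power is normalized to one. all function names are hypothetical.

```python
import math
import random
from typing import List

MODULATION_CONSTANT = {"bdpsk": 1.0, "bfsk": 0.5}


def conditional_bep(z: float, modulation: str) -> float:
    """Conditional bit error probability for envelope-ratio sir z (assumed form)."""
    g = MODULATION_CONSTANT[modulation]
    return 0.5 * math.exp(-g * z * z)


def weibull_envelope(beta: float, omega: float, rng: random.Random) -> float:
    """Draw a Weibull envelope x such that x**beta is exponential with mean omega."""
    return (omega * rng.expovariate(1.0)) ** (1.0 / beta)


def output_sir_samples(n: int, beta: float, s_db: float,
                       rng: random.Random) -> List[float]:
    """Output sir of a dual-branch sir-based sc receiver, independent branches."""
    omega_d = 10.0 ** (s_db / 10.0)   # desired power; interference power is 1
    samples = []
    for _ in range(n):
        branch_sir = []
        for _branch in range(2):
            x = weibull_envelope(beta, omega_d, rng)
            y = weibull_envelope(beta, 1.0, rng)
            branch_sir.append(x / y if y > 0.0 else math.inf)
        samples.append(max(branch_sir))   # sir algorithm: keep the larger ratio
    return samples


def abep(samples: List[float], modulation: str) -> float:
    """Sample average of the conditional bep, a Monte Carlo version of (12)."""
    return sum(conditional_bep(z, modulation) for z in samples) / len(samples)


if __name__ == "__main__":
    rng = random.Random(1)
    for s_db in (0.0, 10.0, 20.0):
        z = output_sir_samples(20000, 2.5, s_db, rng)
        print(s_db, abep(z, "bdpsk"), abep(z, "bfsk"))
```

because the branches are independent here, the numbers are not expected to reproduce the correlated curves of the paper; the sketch only shows how the averaging step works.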
when the correlation is too high, deep fades can occur in the branches simultaneously, which results in a low degree of improvement from the considered space diversity. it is obvious from the figure that the system with bdpsk signalling shows better performance than the system with bfsk signalling, which is in agreement with the conclusion presented in [6].

[figure: abep versus s [db] for bfsk and bdpsk signalling; markers show the results obtained using [12] for the corresponding parameters]
fig. 3 the influence of correlation coefficient on abep of dual-branch sc system

[figure: abep versus s [db] for bfsk and bdpsk signalling; markers show the results obtained using [12] for the corresponding parameters]
fig. 4 the influence of fading severity on abep of dual-branch sc system

in fig. 4, the abep of a balanced dual-branch sc receiver with bfsk and bdpsk signalling is presented for different fading intensity. it is obvious that the system performance is better in the environment with the lower fading parameter. it is interesting to note that, for lower values of s, bfsk signalling with a lower value of β shows worse system performance than bdpsk signalling with a higher value of β, while for higher values of s the situation is reversed. this can be explained by the fact that, in the considered scenario, the desired signal and the cci, which is inferior for higher values of s, are exposed to the same fading severity.

3.3. average output sir
the average output sir is one more useful parameter that is used in wireless communications when cci is present. it can be calculated by
$$\bar{z}_{sc}=\int_0^\infty z\,p_z(z)\,dz.\tag{13}$$
based on (9) and (13), fig. 5 is plotted.
it shows that the results obtained using (9) match perfectly the results obtained using the mathematical approach presented in [12]. the figure shows that the average output sir degrades rapidly for higher values of the correlation coefficient. it is also obvious that the system performance is better for higher values of s, which is more significant for lower values of the fading parameters.

[figure: average output sir versus correlation coefficient for β1 = β2 = 2.5 and β1 = β2 = 4.7, each with s = 3 db and s = 6 db; markers show the results obtained using [12] for the corresponding parameters]
fig. 5 average output sir as a function of correlation coefficient

4. conclusion
this paper studies the performance of a dual-branch sc receiver operating over correlated weibull fading channels in the presence of weibull distributed cci for the case when the sir algorithm is applied. the pdf of the instantaneous sir at the system output was derived using a mathematical approach based on the calculation of the joint pdf of the desired signal and interference signal envelopes at the output. using the pdf of the instantaneous sir at the system output, outage probability, abep and average output sir were evaluated as efficient system performance measures. numerical results were presented graphically, describing the influence of the correlation coefficient, fading severity and average input sir on the overall system performance. in addition, the obtained results were compared with the results in [12] and, as expected, matched them perfectly. it was shown that the expression for the pdf of the instantaneous sir obtained in this paper converges faster than the expression in [12]; the novel expression derived in this paper can therefore be used more efficiently. moreover, the joint pdf of the desired signal and interference signal envelopes at the system output can be used to calculate other important distributions.
for example, the pdf of the sum of the desired signal and interference signal envelopes can be obtained and applied in the performance analysis of a system with micro- and macrodiversity when the macrodiversity combiner uses the total power signal algorithm. motivated by these facts, the subject of our future work will be the generalization of the mathematical approach to an arbitrary order of diversity and to a macrodiversity system based on the ts algorithm.

references
[1] f. babich, g. lombardi, "statistical analysis and characterization of the indoor propagation channel," ieee trans. commun., vol. 48, pp. 455-464, mar. 2000.
[2] h. hashemi, "the indoor radio propagation channel," proc. ieee, vol. 81, pp. 943-968, july 1993.
[3] g. tzeremes, c. g. christodoulou, "use of weibull distribution for describing outdoor multipath fading," in proc. of the ieee antennas and propagation society international symposium 1, 2002, pp. 232-235.
[4] n. s. adawi, et al., "coverage prediction for mobile radio systems operating in the 800/900 mhz frequency range," ieee trans. veh. technol., vol. 37, no. 1, pp. 3-72, feb. 1988.
[5] n. h. shepherd, "radio wave loss deviation and shadow loss at 900 mhz," ieee trans. veh. technol., vol. 26, pp. 309-313, nov. 1977.
[6] m. k. simon, m. s. alouini, digital communications over fading channels, john wiley & sons, inc. 2000.
[7] y. li, x. g. xia, g. wang, "simple iterative methods to exploit the signal-space diversity," ieee trans. commun., vol. 53, no. 1, pp. 32-38, jan. 2005.
[8] j. boutros, e. viterbo, "signal space diversity: a power and bandwidth-efficient diversity technique for rayleigh fading channel," ieee trans. inf. theory, vol. 44, pp. 1453-1467, july 1998.
[9] s. h. lin, t. c. lee, m. f. gardina, "diversity protections for digital radio-summary of ten-year experiments and studies," ieee commun. magazine, vol. 26, no. 2, feb. 1988, pp. 51-64.
[10] w. jakes, microwave mobile communications, john wiley & sons, inc. 1974.
[11] g. k. karagiannidis, d. a. zogas, n. c.
sagias, s. a. kotsopoulos, g. s. tombras, "equal-gain and maximal-ratio combining over nonidentical weibull fading channels," ieee trans. wireless commun., vol. 4, no. 3, pp. 841-846, may 2005.
[12] m. c. stefanovic, d. m. milovic, a. m. mitic, m. m. jakovljevic, "performance analysis of system with selection combining over correlated weibull fading channels in the presence of cochannel interference," int. j. aeü, vol. 62, no. 9, oct. 2008, pp. 695-700.
[13] p. spalevic, n. sekulovic, z. georgios, e. mekic, "performance analysis of sir-based triple selection diversity over correlated weibull fading channels," facta universitatis, series electronics and energetics vol. 23, no. 1, apr. 2010, pp. 89-98.
[14] m. stefanovic, d. draca, a. panajotovic, n. sekulovic, "performance analysis of system with l-branch selection combining over correlated weibull fading channels in the presence of cochannel interference," int. j. commun. systems, vol. 23, no. 2, pp. 139-150, feb. 2010.
[15] a. golubovic, n. sekulovic, m. stefanovic, d. milic, i. temelkovski, "performance analysis of dual-branch selection diversity receiver that uses desired signal algorithm in correlated weibull fading environment," tehnicki vjesnik-technical gazette, vol. 21 no. 5, pp. 953-957, 2014.
[16] n. sekulovic, m. stefanovic, a. golubovic, i. temelkovski, b. trenkic, m. peric, s. milosavljevic, "performance analysis of triple-branch selection diversity based on desired signal algorithm over correlated weibull fading channels," ttem technics technologies education management, vol. 7, no. 3, pp. 1013-1019, 2012.
[17] n. sekulović, a. golubović, č. stefanović, m. stefanović, "average output signal-to-interference ratio of system with triple-branch selection combining based on desired signal algorithm over correlated weibull fading channels," facta universitatis series automatic control and robotics, vol. 11, no 1, pp. 37-43, 2012.
[18] n. c. sagias, g. k.
karagiannidis, "gaussian class multivariate weibull distributions: theory and applications in fading channels," ieee trans. inf. theory, vol. 51, no. 10, pp. 3608-3619, oct. 2005.
[19] i. gradshteyn, i. ryzhik, table of integrals, series and products, 7th ed., ny: academic press, 2007.

facta universitatis series: electronics and energetics vol. 35, no 3, september 2022, pp. 313-331
https://doi.org/10.2298/fuee2203313d
© 2022 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper

a new approach for direct discretization of fractional order operator in delta domain
sujay kumar dolai1, arindam mondal2, prasanta sarkar3
1dream institute of technology, faculty of electrical engineering, india
2pailan college of management & technology, faculty of electrical and electronics engineering, india
3nitttr kolkata, faculty of electrical engineering, india

abstract. a fractional order system (fos) comprises fractional order operators. to obtain the discretized version of a fractional order system, the first step is to discretize the fractional order operator, commonly expressed as $s^{\pm\alpha}$, $0<\alpha<1$. the fractional order operator can be used as a fractional order differentiator or integrator, depending upon the value of $\alpha$. in general, there are two approaches to the discretization of the fractional order operator: the indirect method and the direct method. the direct discretization method relies on forming a generating function in which the fractional order operator $s^{\pm\alpha}$ is expressed as a function of $z$ in the shift operator parameterization; the continued fraction expansion (cfe) method is then used to obtain the corresponding discrete-domain rational transfer function. there is an inherent problem with this discretization method when the shift operator parameterization (discrete z-domain) is used.
at a fast sampling rate, the discretized version of a continuous time operator or system should resemble its continuous time counterpart, provided the sampling theorem is satisfied. at very high sampling rates, however, the shift operator parameterized system fails to provide meaningful information because of numerical ill-conditioning. to overcome this problem, the delta operator parameterization is considered for discretization in this paper; at fast sampling rates, the continuous time results can be obtained from the discrete time experiments, and therefore a unified framework can be developed that yields the discrete time and continuous time results hand in hand. in this paper, a new generating function is proposed to discretize the fractional order operator using the gauss-legendre 2-point quadrature rule. the function is then expanded using the cfe to obtain a rational approximation of the fractional order operator. detailed mathematical formulations, together with matlab simulation results for different fractional order systems, are presented to demonstrate the novelty of this formulation for discretization of the fos in the complex delta domain.

key words: continuous fraction expansion, direct discretization, delta operator, fractional order operator, fractional order system

received september 30, 2021; revised january 26, 2022; accepted july 4, 2022
corresponding author: arindam mondal, pailan college of management & technology, faculty of electrical and electronics engineering, india
e-mail: arininstru@gmail.com

1. introduction
the concept of fractional calculus [1-2] came into existence around 300 years ago. it remained an unexplored part of engineering until fractional calculus was developed further in the mid-nineteenth century.
over time, the field attracted researchers because of its diverse properties, which can be applied in various fields of engineering as well as in various branches of science [3-7]. fractional order calculus has immense potential to change the way we see, manipulate and design the nature around us. the fundamental unit of a non-integer order system is the operator $s^{\alpha}$, which acts as a fractional order differentiator or integrator [8-9] depending on whether $\alpha$ is positive or negative. an important part of the digital realization of a fractional order system is the discretization of this operator. to implement the fos in real time, rationalization is the only available procedure, either in continuous time or in discrete time. there are various methods for continuous time approximation of fractional order operators [10-12]. once the operator has been converted to a continuous time rational transfer function [13], existing discretization methods can be used to obtain the discretized version of the fos [14-18]. this is known as the indirect method of discretization of the fos. the second method is the direct discretization method, in which the rational transfer function in the z-domain is obtained directly via different generating functions, such as those proposed by euler, tustin and al-alaoui. in the subsequent step, the generating function is expanded using methods such as continued fraction expansion (cfe) [19]. there has been an increased demand for digital system implementation. to implement the fos digitally, the sampling rate must be increased to at least 10 times the original system bandwidth. the increased sampling rate brings the poles of the z-domain transfer function closer to each other and concentrates them near the point (1,0) in the discrete z-plane. this results in an unstable system due to the finite word length effect [15].
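the clustering of poles near the point (1,0) can be seen numerically. the sketch below maps two hypothetical continuous-time poles to the z-domain with $z=e^{s\Delta}$ and, for comparison, to the delta domain with the standard mapping $\gamma=(z-1)/\Delta$. the pole values, the sampling intervals and the function names are assumptions chosen only for illustration.

```python
import cmath


def shift_domain_pole(s: complex, delta: float) -> complex:
    """Map a continuous-time pole s to the z-domain for sampling interval delta."""
    return cmath.exp(s * delta)


def delta_domain_pole(s: complex, delta: float) -> complex:
    """Map a continuous-time pole s to the delta domain, gamma = (z - 1) / delta."""
    return (cmath.exp(s * delta) - 1.0) / delta


if __name__ == "__main__":
    poles = (-1.0, -2.0)  # hypothetical continuous-time poles
    for delta in (1e-1, 1e-2, 1e-4):
        z = [shift_domain_pole(p, delta) for p in poles]
        g = [delta_domain_pole(p, delta) for p in poles]
        # the z-domain poles approach 1 and each other as delta shrinks,
        # while the delta-domain poles approach the continuous-time poles
        print(delta, abs(z[0] - z[1]), [round(v.real, 4) for v in g])
```

as the sampling interval shrinks, the gap between the two z-domain poles shrinks roughly in proportion to it, so a fixed word length resolves them less and less well, whereas the delta-domain poles stay near -1 and -2.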
the conventional, or shift operator, representation of a discrete time system fails to provide a meaningful description of the underlying continuous-time system at a fast sampling rate. to circumvent this problem, the delta operator parameterization was introduced [20], in which, at a very high sampling frequency, the continuous time results and the discrete time results are obtained at the same time. the superiority of the delta operator parameterization, along with its various applications, is reported in [21-29]. in this paper, a method is proposed by which the fractional order operator is directly discretized [30-31] in the delta domain. first, a generating function is proposed in the delta domain using a useful numerical tool known as the gauss-legendre 2-point quadrature rule [32]. the classical cfe method is then adopted to expand this generating function and obtain the rational approximation of the fractional operator in the discrete delta domain. the significant contributions of this paper are as follows. earlier research on the discretization of fractional order systems discretized the fractional order operator in the shift operator parameterization. in this work, the fos is directly discretized using the delta operator parameterization, so that at a very fast sampling frequency the discrete time results resemble those of the continuous time counterpart. another important contribution is that the gauss-legendre 2-point quadrature rule is used for the closed-form approximation of the log(1 + x) function to minimize the approximation error. comparisons with other standard methods are made to demonstrate the efficacy of the proposed method.
the paper is organized as follows: in section 2, the fractional order operator and systems are discussed; section 3 presents the direct discretization method of the fo operator in the delta domain; simulation and result analysis are discussed with different examples in section 4; and the conclusion is drawn in section 5.

2. fractional order system and its discretization (direct method) using traditional methods
2.1. fractional order operator and fractional order system
a fractional order system is a system whose order is not an integer. such a system is represented by a fractional order differential equation, and the laplace transform of this equation gives the transfer function. a non-integer order system can be described by the following equation [30]:
$$a_nD^{\alpha_n}y(t)+a_{n-1}D^{\alpha_{n-1}}y(t)+\dots+a_0D^{\alpha_0}y(t)=b_mD^{\beta_m}u(t)+b_{m-1}D^{\beta_{m-1}}u(t)+\dots+b_0D^{\beta_0}u(t)$$
where
$${}_mD_\tau^{\alpha}=\begin{cases}\dfrac{d^{\alpha}}{d\tau^{\alpha}}&(\alpha>0)\\[4pt]1&(\alpha=0)\\[4pt]\displaystyle\int_m^{\tau}(d\tau)^{-\alpha}&(\alpha<0)\end{cases}\tag{1}$$
is known as the integro-differentiator operator. taking the laplace transform of eq. (1) under zero initial conditions gives the transfer function
$$G(s)=\frac{Y(s)}{U(s)}=\frac{b_ms^{\beta_m}+b_{m-1}s^{\beta_{m-1}}+\dots+b_0s^{\beta_0}}{a_ns^{\alpha_n}+a_{n-1}s^{\alpha_{n-1}}+\dots+a_0s^{\alpha_0}}\tag{2}$$
where $L[y(t)]=Y(s)$ and $L[u(t)]=U(s)$. if all the orders are integer multiples of a common base order $\alpha$, the system given in eq. (2) is said to be of commensurate order, and the differential equation reduces to the following form:
$$\sum_{k=0}^{n}a_kD^{k\alpha}y(t)=\sum_{k=0}^{m}b_kD^{k\alpha}u(t)\tag{3}$$
where $\alpha_k,\beta_k=k\alpha$, $\alpha\in R^{+}$. there are two popular definitions, the grünwald-letnikov (gl) and riemann-liouville (rl) definitions, to express the operator ${}_mD_\tau^{\alpha}$. here, the rl definition is considered.
the rl definition is
$${}_mD_t^{\alpha}f(t)=\frac{1}{\Gamma(k-\alpha)}\frac{d^k}{dt^k}\int_m^t\frac{f(p)}{(t-p)^{\alpha-k+1}}\,dp\tag{4}$$
where $m$ and $t$ are the bounds of the operation and $\Gamma(\cdot)$ represents euler's gamma function. for the purpose of analysis, the fractional order differentiator is considered in this section. for convenience, the fractional order system (differentiator) is realized in the complex s-domain, which is obtained by taking the laplace transform of eq. (4):
$$L\{{}_mD_t^{\alpha}f(t)\}=s^{\alpha}F(s),\quad\text{for }0<\alpha<1\tag{5}$$

3. delta domain discretization method of fractional order operator
to obtain better finite-word-length behaviour under fast sampling, the forward shift operator is replaced by the delta operator [20]. the delta operator is a forward difference operator defined as
$$\delta=\frac{q-1}{\Delta}\tag{6}$$
where $q$ is the forward shift operator and $\Delta$ is the sampling time or interval. for a differentiable signal $x(t)$, at a high sampling rate ($\Delta\to0$) the delta operator ($\delta$) converges to the continuous-time derivative operator, as shown in eq. (7):
$$\lim_{\Delta\to0}\delta x(t)=\lim_{\Delta\to0}\frac{x(t+\Delta)-x(t)}{\Delta}=\frac{dx(t)}{dt}\tag{7}$$
the variable corresponding to $z$ in the shift operator parameterization is denoted by $\gamma$ in the complex delta domain, and the relationship between the two complex variables is given in eq. (8) [20]:
$$\gamma=\frac{z-1}{\Delta}=\frac{e^{s\Delta}-1}{\Delta}\tag{8}$$
in the fast sampling limit ($\Delta\to0$) the delta-domain discrete-time frequency variable ($\gamma$) coincides with the continuous-time frequency variable ($s$), as shown below; this is the property that is exploited in this work:
$$\lim_{\Delta\to0}\gamma=\lim_{\Delta\to0}\frac{e^{s\Delta}-1}{\Delta}=\lim_{\Delta\to0}\frac{1+s\Delta+\frac{s^2\Delta^2}{2!}+\dots-1}{\Delta}=s\tag{9}$$
to obtain the mapping between $s$ and $\gamma$, we substitute $z=e^{s\Delta}$ in eq. (8), as shown above. after taking the logarithm of both sides, the relationship between the two domains is established by eq. (10).
$$s=\frac{1}{\Delta}\ln(1+\Delta\gamma)\tag{10}$$
now the $\ln(1+\Delta\gamma)$ function is approximated in closed form so that the cfe expansion becomes possible. applying the gauss-legendre 2-point quadrature rule [32], the closed-form approximation of $\ln(1+x)$ is obtained as
$$\ln(1+x)\approx\frac{6x+3x^2}{6+6x+x^2}\tag{11}$$
this approximation is known as 2p-gilog. replacing $x$ by $\Delta\gamma$ in eq. (11), the expression becomes
$$\ln(1+\Delta\gamma)\approx\frac{6(\Delta\gamma)+3(\Delta\gamma)^2}{6+6(\Delta\gamma)+(\Delta\gamma)^2}\tag{12}$$
substituting eq. (12) into eq. (10) gives
$$s=\frac{1}{\Delta}\ln(1+\Delta\gamma)\approx\frac{6\gamma+3\Delta\gamma^2}{6+6\Delta\gamma+\Delta^2\gamma^2}\tag{13}$$
in the fast sampling limit ($\Delta\to0$) the discrete-time frequency variable ($\gamma$) in the delta domain coincides with the continuous-time frequency variable ($s$), as can be seen from eq. (13). therefore, in the fast sampling limit, the complex variable in the continuous domain is approximated by the complex variable in the discrete delta domain. a fo differentiator is written as
$$G(s)=s^r\quad(0<r<1)\tag{14}$$
the cfe2p-gilog method is used to discretize $s^r$ directly in the delta domain. the discretization of the fractional order operator is accomplished in two stages. first, the required generating function is selected; it defines the approximate mapping between the delta-domain discrete-time variable ($\gamma$) and the continuous-time variable ($s$). in the next stage, the selected generating function is expanded to obtain the discrete time approximation of $s^r$ in the form of a transfer function in the delta domain. in this work, eq. (13) is chosen as the generating function, and the cfe method is used to expand it to obtain the corresponding integer order approximation of $s^r$ in the delta domain.
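the quality of the closed-form logarithm approximation and of the resulting mapping from $\gamma$ to $s$ can be checked numerically. the following is a minimal sketch, assuming the rational form $(6x+3x^2)/(6+6x+x^2)$ for $\ln(1+x)$; the function names `gilog_2p` and `s_from_gamma` and the test values are hypothetical.

```python
import cmath
import math


def gilog_2p(x: complex) -> complex:
    """Rational approximation of ln(1 + x): (6x + 3x^2) / (6 + 6x + x^2)."""
    return (6.0 * x + 3.0 * x * x) / (6.0 + 6.0 * x + x * x)


def s_from_gamma(gamma: complex, delta: float) -> complex:
    """Approximate s = ln(1 + delta * gamma) / delta with the rational form."""
    return gilog_2p(delta * gamma) / delta


if __name__ == "__main__":
    # error of the logarithm approximation on a few real arguments
    for x in (0.01, 0.1, 0.5, 1.0):
        print(x, abs(gilog_2p(x) - math.log1p(x)))
    # the mapped variable approaches gamma itself as delta shrinks
    gamma = 10j  # hypothetical point on the imaginary axis
    for delta in (1e-1, 1e-2, 1e-4):
        exact = cmath.log(1.0 + delta * gamma) / delta
        print(delta, s_from_gamma(gamma, delta), exact)
```

for small $\Delta\gamma$ the rational form and the exact logarithm agree closely, and both mapped values approach $\gamma$, which is the fast sampling behaviour the text relies on.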
                ++ +   r r cfegs 2 2 )()(66 )(36 )( (15) the mathematical expression for cfe approximation is as follow: .....2 )3( 5 )2( 2 )2( 3 )1( 2 )1( 1 1)1( + − + + + − + + + − + +=+ pr pr pr pr pr rp p r (16) 2 2 2 6 3 1 6 6  +  − +  +   is substituted in place of p in the eq. (16) to get the equivalent form of eq. (15). now executing cfe approximation of 2 2 2 6 3 6 6 r   +      +  +      for third order, and fifth order in delta domain are obtained as given in eq. (17) and eq. (18) respectively. 3122030 3323130 22 2 3 66 36 )( bbbb aaaa cfegs r r +++ +++ =                 ++ + = −−− −−−  (17) 318 s. k. dolai, a. mondal, p. sarkar 5 1 4 2 3 3 2 4 1 5 0 5 1 4 2 3 3 2 4 1 5 0 22 2 5 66 36 )( bbbbbb aaaaaa cfegs r r +++++ +++++ =                 ++ + = −−−−− −−−−−  (18) table 1 numerator coefficients for fifth order approximation in delta domain 15 14 13 12 5 11 10 9 8 7 6 (( 1)( 2)(1073741824 16911433728 13354663936 1002254106624 3869945888768 20886278111232 129327405203456 138817498447872 1829934470742016 712034173267968 12 num d r r r r r r r r r r r r = + + + + − − + + − − − + 5 4 3 2 608455533286400 13516106683236096 41479456532696640 59408759887249392 55098059015583104 92016444345172880))) r r r r r + − − + + coefficients numerator 0a 17 16 15 14 13 12 11 10 9 ( ((3 ) (1073741824 20132659200 66236448768 928367247360 6849998880768 7271932231680 184246347759616 290937273384960 1987732155678720 6479472582389760 8 681248407199 r / r r r r r r r r r r −  − + + − − + − + + 7 6 5 4 3 2 5 8464 49917404936559360 24285774583584448 156814916118867136 206087133711558336 138493101617423408 386245451066684864 184032888690345760)) num r r r r r r r / d + + − + − 1 a 15 16 14 13 12 11 10 9 8 ( ((3 ) (140928614400 8053063680 320612597760 7395229040640 42899897057280 102849152286720 1256995122708480 791103107235840 15398862690017280 30803 / 
r r r r r r r r r r 7 6 5 4 3 2 5 473842032640 81700461363356160 272889395307594240 112347728802349920 1002215818766418432 425460690581907136 1423614499033773056 1274294568187684864)) num r r r r r r r / d 2a 2 15 2 14 2 13 2 12 2 11 2 10 2 9 2 8 2 7 ( ((3 / ) (28185722880 422785843200 65179484160 26212722278400 87432002273280 570237360537600 3060794314260480 4379676740812800 43583729456762880 7143762905395 r r r r r r r r r r −   −  +  +  −  −  +  +  −  + 2 6 2 5 2 4 2 3 2 2 2 2 5 200 300546723808853760 267563840484326400 992987073349200768 1262317112875803648 1327256882051046912 2019580911193378048 )) / num r r r r r r d  +  −  −  +  +  −  3a 3 13 3 14 3 12 3 11 3 10 3 9 3 8 3 7 3 6 (-((3 / ) (634178764800 56371445760 2247811399680 44392513536000 12672785448960 1194755633971200 1827599513026560 15451885751500800 32314828390440960 1020310420 r r r r r r r r r r      +  +    +  + + 3 5 3 4 3 3 3 2 3 3 5 11494400 243411076800568320 -330417309695155200 842112615992487808 436883676571452608 1161233166549210368 )) / num r r r r r d    +  +   4a 4 13 4 12 4 11 4 10 4 9 4 8 4 7 4 6 4 5 ( ((3 / ) (63417876480 396361728000 4510596464640 27834502348800 122752979435520 750819041280000 1602561749483520 9729145062604800 10683837557406720 643703179525 r r r r r r r r r r −   −  −  +  +  −  −  +  +  − 4 4 4 3 4 2 4 4 5 05600 34908135243891840 208843823902442400 46559875383003776 276641682514481024 )) / num r r r r d  −  +  +  −  5a 5 12 5 10 5 8 5 6 5 4 5 2 5 5 ((3 / ) (31708938240 2255298232320 61376489717760 801280874741760 5341918778703360 17454067621945920 23279937691501888 )) / r num r r r r r r d   −  +  −  +  −  +  a new approach for direct discretization of fractional order operator in delta domain 319 table 2 denominator coefficients of fifth order approximation in delta domain 2 3 5 4 5 6 7 8 6 (( 1)( 2)(55098059015583104 59408759887249392 41479456532696640 13516106683236096 12608455533286400 712034173267968 
-1829934470742016 -138817498447872 712034173267968 -182993447074 dn r r r r r r r r r r r = + + + + 7 8 9 10 11 12 13 14 15 2016 -138817498447872 129327405203456 20886278111232 3869945888768 -1002254106624 13354663936 16911433728 1073741824 92016444345172880)) r r r r r r r r r + + + + + + coefficients denominator 0b 2 3 4 5 6 7 8 9 10 ((386245451066684864 138493101617423408 206087133711558336 156814916118867136 24285774583584448 49917404936559360 6812484071998464 6479472582389760 1987732155678720 290937273384960 r r r r r r r r r r + − − + + + − − + 11 12 13 14 15 16 17 5 184246347759616 7271932231680 6849998880768 928367247360 66236448768 20132659200 1073741824 184032888690345760)) / r r r r r r r dn + + − − + + + + 1b 2 3 4 5 6 7 8 ((1274294568187684864 1423614499033773056 -425460690581907136 -1002215818766418432 -112347728802349920 272889395307594240 81700461363356160 30803473842032640 -15398862690017280 791 r r r r r r r r  +     +  +    + 9 10 11 12 13 14 15 16 5 103107235840 1256995122708480 102849152286720 -42899897057280 7395229040640 320612597760 140928614400 8053063680 )) / r r r r r r r r dn  +  +    +  +  +  2b 2 2 2 2 2 3 2 4 2 5 2 6 2 7 ((1327256882051046912 2019580911193378048 1262317112875803648 992987073349200768 267563840484326400 300546723808853760 7143762905395200 43583729456762880 4379676740812800 r r r r r r r  +  −  −  +  +  −  −  −  2 8 2 9 2 10 2 11 2 12 2 13 2 14 2 15 5 3060794314260480 570237360537600 87432002273280 26212722278400 65179484160 422785843200 28185722880 )) / r r r r r r r r dn +  +  −  −  +  +  +  3b 3 3 3 2 3 3 3 4 3 5 3 6 3 7 3 ((436883676571452608 1161233166549210368 842112615992487808 330417309695155200 243411076800568320 102031042011494400 32314828390440960 15451885751500800 1827599513026560 r r r r r r r  +  −  −  +  +  −  −  +  8 3 9 3 10 3 11 3 12 3 13 3 14 5 1194755633971200 12672785448960 44392513536000 2247811399680 634178764800 56371445760 )) / r r r r r r r dn +  −  − 
 −  +  +  320 s. k. dolai, a. mondal, p. sarkar 4b 4 4 4 2 4 3 4 4 4 5 4 6 4 7 4 8 ((46559875383003776 276641682514481024 208843823902442400 34908135243891840 64370317952505600 10683837557406720 9729145062604800 1602561749483520 750819041280000 12275 r r r r r r r r  +  −  −  +  +  −  −  +  + 4 9 4 10 4 11 4 12 4 13 5 2979435520 27834502348800 4510596464640 396361728000 63417876480 )) / r r r r r dn  −  −  +  +  5b 5 5 2 5 4 5 6 5 8 5 10 5 12 5 5 ((23279937691501888 17454067621945920 5341918778703360 801280874741760 61376489717760 2255298232320 31708938240 )) ) / r r r r r r dn  −  +  −  +  −  +   4. simulation and result analysis to prove the effectiveness of the portrayed approach, three examples are taken. example 1: a 1/4th order differentiator is considered in this example [25] with transfer function as shown below: 0.25 ( ) r g s s s= = (19) the direct discretization of 1/4th order differentiator in delta domain is expressed as follows: 0.25 2 0.25 2 2 2 0.01 6 3 ( ) 6 6 p gilodel s g cfe − =    +         +  +     (20) the third and fifth order approximation of 0.25s in delta domain after continued fraction expansion of                   ++ + 25.0 22 2 66 36   results in eq. (21) and eq. (22) respectively. 
the sampling time is considered to be $\Delta = 0.01$ s.

$$ G_{2p\text{-}gilogdel3}(\gamma) = \mathrm{CFE}\left\{\left(\frac{6\gamma+3\Delta\gamma^{2}}{6+6\Delta\gamma+\Delta^{2}\gamma^{2}}\right)^{0.25}\right\}_{\Delta=0.01} = \frac{5.48\times10^{-7}\gamma^{3}+0.0003722\gamma^{2}+0.06519\gamma+1.8}{1.317\times10^{-7}\gamma^{3}+0.0001026\gamma^{2}+0.02198\gamma+1} \tag{21} $$

$$ G_{2p\text{-}gilogdel5}(\gamma) = \mathrm{CFE}\left\{\left(\frac{6\gamma+3\Delta\gamma^{2}}{6+6\Delta\gamma+\Delta^{2}\gamma^{2}}\right)^{0.25}\right\}_{\Delta=0.01} = \frac{3.238\times10^{-11}\gamma^{5}+3.685\times10^{-8}\gamma^{4}+1.466\times10^{-5}\gamma^{3}+0.002369\gamma^{2}+0.1322\gamma+1.439}{7.781\times10^{-12}\gamma^{5}+9.632\times10^{-9}\gamma^{4}+4.252\times10^{-6}\gamma^{3}+0.0007911\gamma^{2}+0.05562\gamma+1} \tag{22} $$

for $G_{2p\text{-}gilogdel5}(\gamma)$ the denominator and numerator coefficients are calculated using table 1 and table 2, taking r = 0.25 and $\Delta$ = 0.01. the frequency responses of the delta domain transfer functions $G_{2p\text{-}gilogdel3}(\gamma)$ and $G_{2p\text{-}gilogdel5}(\gamma)$ are shown in fig. 1. the magnitude and phase errors of the third order and fifth order approximate transfer functions with respect to the original 1/4th order differentiator are shown in fig. 2. the graphs show that the precision of the approximation improves as the order of approximation increases.

fig. 1 fifth order and third order approximation of $s^{0.25}$ in delta domain using proposed method

fig. 2 error comparison between fifth order and third order approximation of $s^{0.25}$ in delta domain using proposed method

over the whole range of frequency, the magnitude response is more accurate than the phase response. the approximations are compared on the basis of the maximum absolute magnitude and phase errors, as shown in table 3. since the approximation results for the fifth order are more accurate than those of the third order, the fifth order cfe approximation has been chosen to develop the frequency responses for the different systems considered in this paper.
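the frequency response of a delta domain transfer function is obtained by evaluating it at $\gamma = (e^{j\omega\Delta}-1)/\Delta$, which follows from eq. (8) with $s = j\omega$. the python sketch below does this for the fifth order approximation and reports the error against the ideal $(j\omega)^{0.25}$. the coefficient values are those read from eq. (22) as printed; the helper names and the chosen test frequency are assumptions made for illustration only.

```python
import cmath
import math

DELTA = 0.01  # sampling time in seconds, as in example 1
R = 0.25      # order of the differentiator

# Coefficients read from eq. (22), highest power of gamma first.
NUM5 = [3.238e-11, 3.685e-8, 1.466e-5, 0.002369, 0.1322, 1.439]
DEN5 = [7.781e-12, 9.632e-9, 4.252e-6, 0.0007911, 0.05562, 1.0]


def polyval(coeffs, x):
    """Evaluate a polynomial with Horner's rule (highest power first)."""
    acc = 0
    for c in coeffs:
        acc = acc * x + c
    return acc


def delta_response(num, den, omega, delta):
    """Response of num(gamma)/den(gamma) at gamma = (exp(j*omega*delta) - 1)/delta."""
    gamma = (cmath.exp(1j * omega * delta) - 1) / delta
    return polyval(num, gamma) / polyval(den, gamma)


def errors(omega):
    """Magnitude error (dB) and phase error (degrees) versus (j*omega)**R."""
    ratio = delta_response(NUM5, DEN5, omega, DELTA) / (1j * omega) ** R
    return 20 * math.log10(abs(ratio)), math.degrees(cmath.phase(ratio))


if __name__ == "__main__":
    mag_db, phase_deg = errors(10.0)  # hypothetical test frequency in rad/s
    print(f"omega=10 rad/s: magnitude error {mag_db:.3f} dB, phase error {phase_deg:.2f} deg")
```

one consistency check on the printed coefficients: the ratio of the leading coefficients, 3.238e-11 / 7.781e-12, is about 4.16, which matches $(3/\Delta)^{0.25}$, the large-$\gamma$ limit of the generating function in eq. (13) raised to the power 0.25.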
at a sampling time of 0.01s = , the fifth order discrete realization of 1/4th order differentiator is considered based upon the four methods described in this paper namely cfe of al-alaoui (cfeal), cfe of tustin (cfeto), cfedo and cfe of 2p gilog in delta domain (cfe2pgilogdel) and following results are obtained. -1 -2 -3 -4 -5 5 -1 -2 -3 -4 -50.01 (1409-3221z +2435z -639.5z +6.82 z +5.449 z ) ( ) (430.9-861.9 z +533.6 z -90.88 z -7.06 z +z ) al g z = = (23) -1 -2 -3 -4 -5 5 -1 -2 -3 -4 -50.01 (226.5 56.63 245.4 43.65 51.03 3.761 ) ( ) (60.24 15.06 65.25 -11.61 13.57 ) tus z z z z z g z z z z z z= + + = + + + (24) 11 8 4 5 3 2 2 5 12 5 9 4 6 3 20 01 3.238 3.685 1.466 0.002369 0.1322 1.439 ( ) 7.781 9.632 4.252 0.000791 0.05562 1 p gilogdel δ . γ γ γ γ γ g γ γ γ γ γ γ − = + + + + + = + + + + + (25) 5 5 4 5 3 5 2 4 5 5 4 4 5 3 5 2 50.01 7157 1.282 4.186 3.512 7.025 2057 ( ) 2417 7.373 3.56 4.158 1.242 6526 cfedo g =  +  +  +  +  +   +  +  +  +  + (26) table 3 absolute maximum phase error and magnitude error for discretization of 0.25th order differentiator using cfe2p-gilogdel approximation order maximum magnitude error (db) maximum phase error (degree) fifth 0.92 7.7415 third 1.27 30.5 example 2: a fractional order system [25] is considered: 1 0.97 2.813 ( ) 0.191g s s = + (27) for the discretization of the above system, sampling time considered is s0001.0= . the discretization of this continuous time transfer function results in four rational approximation t.f. as given by eq. (28), eq. (29), eq. (30) and eq. (31), by using four methods cfeal, cfeto, cfedo and cfe2p-gilogdel, respectively,. 7 7 1 7 2 7 3 6 4 4 5 5 7 8 1 8 2 7 30 0001 7 4 5 5 1 693 4 564 4 33 1 665 2 032 2 54 ( ) 8 85 2 387 2 266 8 719 1 064 1 33 al -δ . . . z + . z . z + . z + . z g z . . z + . z . z + . z + . 
z = = (28) a new approach for direct discretization of fractional order operator in delta domain 323 5 5 -1 5 -2 5 -3 4 -4 4 -5 5 -1 -2 -3 -4 -50.0001 (3.142 -3.048 z 2.177 z + 2.052 z + 1.822 z -1.486 z ) ( ) (21.15+20.51z -14.65 z -13.81 z +1.226 z +z ) tus g z = = (29) 18 4 13 4 9 3 5 2 5 18 5 13 4 8 3 20.0001 7157 1.076 3.584 4.511 0.1549 9.672 ( ) 6.698 5.628 1.874 0.0002356 0.8057 cfedo g − − − − − −=  +  +  +  +  +  =  +  +  +  +  35.91+ (30) 5 4 4 5 3 6 2 5 2 5 4 5 5 4 5 3 5 20 0001 4238 3.512 7.721 1.341 6.34 6.204 ( ) 2.207 2.25 4 603 2.431 2861 24.51 p gilogdel δ . γ γ γ γ γ g γ γ γ . γ γ γ − = + + + + + = + + + + + (31) example 3: the fo system [14] is chosen and the transfer function is as follows: 2 0.638 41.89 4 68) 28.(g s s = + (32) here the sampling rate is made higher and that is considered as 0.00001s = . the discretization of this continuous time transfer function results in four rational approximation t.f., as given by eq. (33), eq. (34), eq. (35) and eq. (36), by using four methods cfeal, cfeto, cfedo and cfe2p-gilogdel, respectively. 8 9 1 9 2 8 3 7 4 6 5 5 6 6 1 6 2 6 30 00001 5 4 5 8.257 2.07 1.782 5.872 4 794 3.057 ( ) 1.926 4 829 4.156 1.37 1.118 7132 al δ . z z z . z z g z . z z z z z − − − − − − − −= − − − + − + + = − + − + + (33) 7 7 1 7 2 7 3 6 4 6 5 5 4 4 1 4 2 4 30 00001 4 5 2 732 1 743 2 541 1 277 4 282 1 033 ( ) 6 372 4 065 5 927 2 978 9987 2410 tus δ . . . z . z .. z . z . z g z . . z . z . z z z − − − − − − − −= − − − − + + − = − + + + − (34) 6 5 7 4 8 3 8 2 7 4 5 4 5 5 4 5 3 5 20 00001 4 5 662 7.637 2 036 1.436 2.053 7 853 ( ) 1 315 1 751 4 964 2.913 3.077 546.5 cfedo δ . . γ γ . γ γ γ . g γ . γ . γ . γ γ γ = + + + + + = + + + + + (35) 21 5 15 4 9 3 2 4 2 5 23 5 17 4 12 30 00001 7 2 5 507 5 827 2 116 0 0003013 13 34 7 089 ( ) 1 285 1 359 4 936 7 029 0 03112 165 3 p gilogdel δ . . γ . γ . γ . γ . γ . g γ . γ . γ . γ . γ . γ . − − − − − − −= + + + + + = + + + + + (36) 324 s. k. dolai, a. mondal, p. sarkar fig. 
3 frequency response comparison after discretization of g(s) using four methods at r = 0.25 and $\Delta$ = 0.01

fig. 4 frequency response comparison after discretization of g1(s) using four methods at r = 0.97 and $\Delta$ = 0.0001

fig. 5 frequency response comparison after discretization of g2(s) using four methods at r = 0.638 and $\Delta$ = 0.00001

four different discretization methods are used to discretize three fractional order systems, as shown in the three examples. the frequency responses of all the fractional order systems, along with the frequency responses of their corresponding discrete-time approximations, are shown in fig. 3, fig. 4 and fig. 5, respectively. in all the discretization methods the magnitude approximation turns out to be better than the phase approximation. from fig. 3, fig. 4 and fig. 5, it is evident that the proposed method, cfe2p-gilogdel, produces excellent frequency responses in the frequency range of 0.001 rad/s to 1000 rad/s. on the basis of this analysis, the proposed method is more promising than the other three approaches with respect to approximation of the original fractional order system. moreover, the outcomes have been compared with another method developed in the delta domain, and the superiority of the proposed method is established. at a very small sampling time ($\Delta$ = 0.00001), i.e. at a high sampling rate, the cfe2p-gilogdel method provides frequency responses very close to those of the original fractional order system, as can be seen from fig. 5. this leads to a unified approach to the discretization of a fractional order operator or system in the complex delta domain: at a high sampling rate the continuous-time and discrete-time results are obtained at the same time, which is the main reason for developing discrete-time systems in the delta operator parameterization. fig.
6 magnitude and phase error after discretization of g(s) using four methods at r = 0.25 and $\Delta$ = 0.01

fig. 7 magnitude and phase error after discretization of g1(s) using four methods at r = 0.97 and $\Delta$ = 0.0001

fig. 8 magnitude and phase error after discretization of g2(s) using four methods at r = 0.638 and $\Delta$ = 0.00001

table 4 absolute maximum magnitude error and phase error for four discretization methods for different systems

| fos | max. magnitude error (db): cfe2pgilogdel | cfedo | al-alaoui | tustin | max. phase error (degree): cfe2pgilogdel | cfedo | al-alaoui | tustin |
|---|---|---|---|---|---|---|---|---|
| $G(s) = s^{0.25}$ | 0.72 | 1.06 | 1.11 | 1.2 | 7.74 | 18.3 | 44.79 | 44.88 |
| $G_1(s)$, eq. (27) | 1.66 | 2.12 | 5.83 | 24.27 | 44.46 | 45.1 | 79.83 | 88.02 |
| $G_2(s)$, eq. (32) | 7.6 | 7.94 | 28.76 | 35.78 | 82.54 | 82.8 | 103.52 | 112.44 |

[pole-zero maps (real axis vs. imaginary axis); panels: cfe2p-gilog 3rd order, r = 0.97; cfe2p-gilog 5th order, r = 0.97]

fig. 9 pole-zero plot for the third-order and fifth order approximation of $s^{0.97}$ using cfe2p-gilogdel method

[pole-zero maps (real axis vs. imaginary axis); panels: cfedo 3rd order, r = 0.97; cfedo 5th order, r = 0.97]

fig. 10 pole-zero plot for the third-order and fifth order approximation of $s^{0.97}$ using cfe-do method

[pole-zero maps (real axis vs. imaginary axis); panels: tustin 3rd order, r = 0.97; tustin 5th order, r = 0.97]

fig. 11 pole-zero plot for the third-order and fifth order approximation of $s^{0.97}$ using tustin method

[pole-zero maps (real axis vs. imaginary axis); panels: al-alaoui 3rd order, r = 0.97; al-alaoui 5th order, r = 0.97]

fig. 12 pole-zero plot for the third-order and fifth order approximation of $s^{0.97}$ using al-alaoui method

from table 4, it is clearly observed that when the sampling time is reduced to a very small value ($\Delta$ = 0.00001 s, i.e. a very high sampling rate), the maximum absolute magnitude and phase errors are much higher for discretization using the tustin and al-alaoui methods in the z-domain than for discretization using the delta operator parameterization.
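the pole-zero maps of figs. 9 to 12 can also be assessed numerically. with $z = 1 + \Delta\gamma$ from eq. (8), the unit circle of the z-plane maps to a circle of radius $1/\Delta$ centred at $-1/\Delta$ in the $\gamma$-plane, so a delta domain pole is stable when $|1 + \Delta\gamma| < 1$. the python sketch below applies this standard test; the function names and the example pole locations are hypothetical and are not the poles of the transfer functions in this paper.

```python
def is_stable_delta(poles, delta):
    """True if every delta-domain pole satisfies |1 + delta*gamma| < 1."""
    return all(abs(1 + delta * p) < 1 for p in poles)


def is_stable_z(poles):
    """True if every z-domain pole lies strictly inside the unit circle."""
    return all(abs(p) < 1 for p in poles)


def delta_to_z(poles, delta):
    """Map delta-domain poles to the z-plane using z = 1 + delta*gamma."""
    return [1 + delta * p for p in poles]


if __name__ == "__main__":
    delta = 0.01  # sampling time in seconds (illustrative)
    examples = {
        "left-half-plane pair inside the circle": [-50 + 20j, -50 - 20j],
        "right-half-plane pole": [5 + 0j],
        "left-half-plane pole outside the circle": [-250 + 0j],
    }
    for label, poles in examples.items():
        print(label, "->", "stable" if is_stable_delta(poles, delta) else "unstable")
```

the third example shows that a negative real part alone is not sufficient in the delta domain for finite $\Delta$; as $\Delta \to 0$ the stability circle grows towards the whole left half plane, which is consistent with the continuous-time limit in eq. (9).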
the graphical representation can also be viewed in fig. 8. it can also be seen that the proposed method is superior to the other methods in the literature. in addition, a comparison has been made for the fifth order approximation of $s^{0.97}$ using another delta domain based approach, the cfedo method, for which poles lie in the right half of the plane (fig. 10), making the rational transfer function of the system unstable, whereas for the method proposed in this paper the poles of both the third order and fifth order approximations lie within the stability region, so the system is stable. so, it is evident that the proposed method delivers the best approximation among all four discretization methods and is a viable alternative in the literature on direct discretization of fractional order operators in the delta domain.

the following analysis compares the direct discretization of the fractional order operator ($s^{\alpha}$, $0 < \alpha < 1$) with its indirect discretization in the delta domain. for illustration, a 1/4th order differentiator is considered. this operator is discretized indirectly, using the oustaloup approximation [33] as an intermediate step. the rational approximation of $s^{0.25}$ obtained using [33] is given in eq. (37).

$$ \frac{3.162 s^{7} + 1899 s^{6} + 2.411\times10^{5} s^{5} + 7.763\times10^{6} s^{4} + 6.586\times10^{7} s^{3} + 1.472\times10^{8} s^{2} + 8.343\times10^{7} s + 1\times10^{7}}{s^{7} + 834.3 s^{6} + 1.472\times10^{5} s^{5} + 6.586\times10^{6} s^{4} + 7.763\times10^{7} s^{3} + 2.411\times10^{8} s^{2} + 1.899\times10^{8} s + 3.162\times10^{7}} \tag{37} $$

eq. (37) is discretized in the delta domain to get the rational approximation of $s^{0.25}$.

$$ \frac{3.162 \gamma^{7} + 1532 \gamma^{6} + 1.745\times10^{5} \gamma^{5} + 5.357\times10^{6} \gamma^{4} + 4.459\times10^{7} \gamma^{3} + 9.897\times10^{7} \gamma^{2} + 5.595\times10^{7} \gamma + 6.7\times10^{6}}{\gamma^{7} + 667 \gamma^{6} + 1.056\times10^{5} \gamma^{5} + 4.52\times10^{6} \gamma^{4} + 5.243\times10^{7} \gamma^{3} + 1.619\times10^{8} \gamma^{2} + 1.273\times10^{8} \gamma + 2.119\times10^{7}} \tag{38} $$

the rational approximation of $s^{0.25}$ in the delta domain using the proposed direct discretization method is given in eq. (39).

$$ \frac{5.75821\times10^{-16} \gamma^{5} + 6.5524\times10^{-12} \gamma^{4} + 2.6066\times10^{-8} \gamma^{3} + 0.000042 \gamma^{2} + 0.0235 \gamma + 2.55954}{7.7805\times10^{-17} \gamma^{5} + 9.63178\times10^{-13} \gamma^{4} + 4.25183\times10^{-9} \gamma^{3} + 0.000007 \gamma^{2} + 0.00556 \gamma + 1} \tag{39} $$

a comparative analysis between the indirect and direct discretization using delta operator based parameterization is shown graphically in fig. 13 and fig. 14, respectively.

fig. 13 frequency response using indirect discretization of $s^{0.25}$ at $\Delta$ = 0.001 s

fig. 14 frequency response using direct discretization of $s^{0.25}$ at $\Delta$ = 0.001 s

from figs. 13 and 14 it is clear that with direct discretization the magnitude and phase plots resemble those of the 1/4th order differentiator in the continuous time domain, whereas there is a notable deviation of the magnitude and phase curves when indirect discretization is used. therefore, direct discretization of the fractional operator in the delta domain is superior to indirect discretization.

5. conclusion

in this paper, a new direct discretization method for fractional order operators is proposed. the traditional discretization methods for fractional order operators work in the discrete z-domain, and at a high sampling frequency the resulting system fails to provide meaningful information. in contrast, delta operator parameterized systems give continuous-time results at high sampling frequency. in this work, an approximate mapping between the s-domain and the delta domain is established through the trapezoidal quadrature rule, and the traditional cfe method is used to obtain the rational transfer function corresponding to the fractional order operator in the discrete delta domain.
simulation results show that the proposed discretization method using the delta operator produces a satisfactory frequency response approximation of the original fractional order system in comparison with the other discretization methods considered. at a fast sampling rate, the discretized system produces almost the same frequency responses as its continuous-time counterpart. this demonstrates that the suggested approach is a viable alternative to the direct discretization methods for fractional order operators or systems available in the literature, and it leads to a unified approach for direct discretization of fos in the delta domain.

references

[1] k. b. oldham and j. spanier, the fractional calculus: theory and application of differentiation and integration to arbitrary order. dover books on mathematics, 2006.
[2] k. s. miller and b. ross, an introduction to the fractional calculus and fractional differential equations. new york: wiley, 1993.
[3] r. caponetto, g. dongola, l. fortuna and i. petras, fractional order systems, modeling and control applications. singapore: world scientific, 2010, pp. 43–54.
[4] h. h. sun, b. onaral and y. tsao, "application of the positive reality principle to metal electrode linear polarization phenomena", ieee trans. biomed. eng., vol. 31, pp. 664–674, 1984.
[5] h. h. sun, a. a. abdelwahab and b. onaral, "linear approximation of transfer function with a pole of fractional order", ieee trans. autom. control, vol. 29, pp. 441–444, 1984.
[6] s. b. skaar, a. n. michel and r. k. miller, "stability of viscoelastic control systems", ieee trans. autom. control, vol. 33, no. 4, pp. 348–357, 1988.
[7] n. engheta, "fractional calculus and fractional paradigm in electromagnetic theory", in proceedings of the viith international conference on mathematical methods in electromagnetic theory, 1998, pp. 879–880.
[8] i.
podlubny, "fractional-order systems and $PI^{\lambda}D^{\mu}$ controllers", ieee trans. autom. control, vol. 44, pp. 208–214, 1999.
[9] n. nakagava and k. sorimachi, "basic characteristics of a fractance device", ieice trans. fundamentals, vol. e75-a, pp. 1814–1819, 1992.
[10] b. m. vinagre, i. podlubny, a. hernández and v. feliu, "some approximation of fractional-order operator used in control theory and application", j. fractional calc. and appl., vol. 3, pp. 231–248, 2000.
[11] d. xue, c. zhao and y. chen, "a modified approximation method of fractional order system", in proceedings of the ieee international conference on mechatronics and automation (icma), luoyang, 2006, pp. 1043–1048.
[12] m. khanra, j. pal and k. biswas, "rational approximation of fractional operator - a comparative study", in proceedings of the ieee international conference on power, control and embedded systems (icpces), allahabad, 2010, pp. 1–5.
[13] a. oustaloup, la derivation non entiere: theorie, synthese et applications. hermes, paris, 1995.
[14] b. t. krishna, "studies on fractional-order differentiators and integrators: a survey", signal processing, vol. 91, pp. 386–426, 2011.
[15] g. maione, "high-speed digital realizations of fractional operator in the delta-domain", ieee trans. autom. control, vol. 56, pp. 697–702, 2011.
[16] r. d. keyser and c. i. muresan, "analysis of a new continuous-to-discrete-time operator for the approximation of fractional order system", in proceedings of the ieee international conference on systems, man and cybernetics (smc), budapest, 2016, pp. 003211–003216.
[17] b. t. krishna, "design of fractional-order differ integrators using reduced order s to z transforms", in proceedings of the 10th ieee international conference on industrial and information systems, peradeniya, 2015, pp. 469–473.
[18] y. q. chen and k. l.
moore, "discretization schemes for fractional-order differentiators and integrators", ieee trans. circuits syst., vol. 49, no. 3, pp. 363–367, 2000.
[19] y. chen, i. petráš and d. xue, "fractional order control - a tutorial", in proceedings of the american control conference, 2009, pp. 1397–1411.
[20] r. h. middleton and g. c. goodwin, digital control and estimation - a unified approach. englewood cliffs, new jersey: prentice-hall, 1990.
[21] r. h. middleton and g. c. goodwin, "improved finite word length characteristics in digital control using delta operator", ieee trans. autom. control, vol. 31, pp. 1015–1021, 1986.
[22] j. cortes-romero, a. luviana-juarez and h. sira-ramirez, "a delta operator approach for the discrete-time active disturbance rejection control on induction motors", math. probl. eng., vol. 2013, pp. 1–9, 2013.
[23] y. zhao and d. zhang, "h-∞ fault detection for uncertain delta operator systems with packet dropout and limited communication", in proceedings of the american control conference, seattle, 2017, pp. 4772–4777.
[24] j. gao, s. chai, m. shuai, b. zhang and l. cui, "detecting false data injection attack on cyber-physical system based on delta operator", in proceedings of the 37th chinese control conference, wuhan, 2018, pp. 5961–5966.
[25] j. swarnakar, p. sarkar, m. dey and l. joyprakash singh, "a unified approach for reduced order modelling of fractional order system in delta domain - a unified approach", in proceedings of the ieee region 10 humanitarian technology conference (r-10 htc), 2017, pp. 144–150.
[26] p. sarkar, r. r. shekh and a. iqbal, "a unified approach for reduced order modelling of fractional order system in delta domain", in proceedings of the ieee international automatic control conference (cacs), taichung, 2016, pp. 257–262.
[27] s. ganguli, g. kaur and p. sarkar, "global heuristic methods for reduced-order modelling of fractional-order systems in the delta domain: a unified approach", ricerche di matematica, 2021.
[28] l. a.
quezada-téllez, l. franco-pérez and g. fernandez-anaya, "controlling chaos for a fractional-order discrete system", ieee open j. circuits syst., vol. 1, pp. 263–269, 2020.
[29] o. lamrabet, e. h. tissir and f. e. haoussi, "controller design for delta operator time-delay systems subject to actuator saturation", in proceedings of the international conference on intelligent systems and computer vision (iscv), 2020, pp. 1–6.
[30] i. pan and s. das, intelligent fractional order systems and control, studies in computational intelligence. berlin heidelberg: springer-verlag, 2013.
[31] b. m. vinagre, y. q. chen and i. petráš, "two direct tustin discretization methods for fractional-order differentiator/integrator", j. franklin institute, vol. 340, pp. 349–362, 2003.
[32] s. k. khattri, "new close form approximation of ln(1+x)", the teaching of mathematics, vol. 12, pp. 7–14, 2009.
[33] j. baranowski, w. bauer, m. zagorowska, t. dziwinski and p. piatek, "time-domain oustaloup approximation", in proceedings of the 20th international conference on methods and models in automation and robotics (mmar), 2015, pp. 116–120.

facta universitatis series: electronics and energetics vol. 27, no. 4, december 2014, pp. 613-619 doi: 10.2298/fuee1404613m

resolving the bias point for wide range of temperature applications in high-k/metal gate nanoscale dg-mosfet

sushanta k. mohapatra, kumar p. pradhan, prasanna k. sahu
nanoelectronics lab., department of electrical engineering, national institute of technology, rourkela, odisha, india

abstract. this article investigates the zero-temperature-coefficient (ztc) bias point and its associated performance metrics of a high-k metal gate (hkmg) dg-mosfet in nanoscale. the ztc bias point is defined as the point at which the device parameters are independent of temperature.
the discussion includes subthreshold slope (ss), drain induced barrier lowering (dibl), on-off current ratio (ion/ioff), transconductance (gm), output conductance (gd) and intrinsic gain (av). from the results, it is confirmed that there are two different ztc bias points, one for ids (ztcids) and the other for gm (ztcgm). the points are obtained as ztcids = 0.552 v and ztcgm = 0.410 v, which will open important opportunities in analog circuit design for wide range of temperature applications.

key words: dg-mosfets, hkmg, sces, analog foms, ztc point.

received july 3, 2014; received in revised form september 14, 2014
corresponding author: sushanta k. mohapatra, nanoelectronics lab., dept. of electrical engineering, national institute of technology, rourkela, 769008, odisha, india (e-mail: skmctc74@gmail.com)

1. introduction

there is growing interest in and demand for circuits that operate at high temperatures, for use in military, automotive, nuclear and other industrial applications, and such circuits need to be analysed at the nanoscale. silicon on insulator (soi) based cmos devices have the potential for operation at both low and high temperatures. it is desirable to bias digital and analog circuits designed for high temperature applications at a point where the v-i characteristics show little or no variation with respect to temperature. this point is typically known as the ztc point [1-5]. previously, shoucair [1] and prijic, et al. [3] identified the ztc point for bulk cmos in both linear and saturation regions for temperatures between 25 °c and 200 °c. researchers like groeseneken, et al. [4] and jeon, et al. [5] have shown the existence of the ztc point for soi mosfets. osman, et al. [6] presented a systematic analysis of the ztc point for partially depleted (pd) soi mosfets over a wide range of temperatures (25 °c to 300 °c), and identified two distinct ztc points, in the linear as well as in the saturation region. tan, et al. [2] identified that the ztcids exists in both linear and saturation regions, whereas the ztcgm lies only in the saturation region for a fully depleted (fd), lightly doped, enhancement-mode soi n-mosfet.

the double gate (dg) mosfet fabricated on soi wafers is one of the most promising candidates due to its attractive features of low leakage currents, high current drivability (ion) and transconductance (gm), reduced short channel effects (sces), steeper subthreshold slopes, and suppression of the latch-up phenomenon [7-10]; it is also a very good option for analog applications [11-14]. hardly any work has been reported on the ztc point for multi-gate technology. beyond a certain vgs, the dependence of id on temperature reverses. this is due to the degradation in mobility or the high electric field effect at higher gate bias [2]. as far as we know, this is the first attempt at a detailed analysis of the ztc point over a wide range of temperatures (100 k to 400 k) for analog applications of a dg mosfet with hkmg technology. various performance metrics of the device have been systematically examined, including sce-related metrics such as ss, dibl and the ion/ioff ratio, and some important analog figures of merit (foms) such as gm, gd and av.

2. device description and simulation setup

in the 2-d numerical simulation, a symmetric device structure as shown in fig. 1(a) has been modelled. the silicon channel is covered above and below by oxide layers forming a gate stack (gs) with an equivalent oxide thickness (eot) of 1.1 nm. the metal gate work function is taken as 4.6 ev. the channel length is 40 nm and the width is fixed at 1 μm. the source and drain extensions are 60 nm, with contacts placed vertically (s and d, respectively). the doping is p-type, $1\times10^{16}$ cm$^{-3}$, for the channel and n-type, $1\times10^{20}$ cm$^{-3}$, for the source and drain.

fig.
1 (a) schematic structure of the nanoscale hkmg double gate n-mosfet, (b) calibration between simulation and experimental data of threshold voltage as a function of temperature.

the 2-d numerical device simulator atlas is employed to simulate the planar dg-mosfet with high-k/metal gate technology. according to the itrs, the drain bias has been fixed at vdd = 1.0 v [15]. to study the analog performance, the simulation is performed with vds = 0.5 v and vgs = 0 v to 1.0 v. in the simulation, the inversion-layer lombardi constant voltage and temperature (cvt) mobility model has been used, which takes into account the effect of transverse fields along with the doping- and temperature-dependent parameters of the mobility. the shockley–read–hall (srh) generation and auger recombination models are used for minority carrier recombination. the fermi-dirac model uses a rational chebyshev approximation that gives results close to the exact values. the temperature model parameter sets the operating temperature in kelvin, which is varied from 100 k to 400 k.

interface charges trapped during the pre- and post-fabrication processes are a common phenomenon, and they cannot be neglected in nanoscale device fabrication. the presence of trapped charges creates an additional non-linear potential and a varying electric field across the gate dielectric. according to (1), the high-k gate stack reduces the electric field across the gate stack layer because of its high permittivity, so a lower electric field is required to induce the inversion-layer charge [16]:

$$Q_{ch} = \varepsilon_{di} E_i \qquad (1)$$

where $Q_{ch}$ is the inversion charge, $\varepsilon_{di}$ the permittivity of the dielectric and $E_i$ the electric field. even if the fixed oxide and interface trapped charge densities are very large, only a moderate potential is required across the high-k gate stack layer. consequently, the reduction of threshold voltage and supply voltages can be maintained at reasonable values.
this low electric field promotes gate stack reliability even with a large amount of unwanted charge inside. as the device has a high-k gate stack, the interface trapped charge effects are included in the simulation. the trapped charge densities are considered at the semiconductor-insulator interface. the typical concentration of trapped charges considered in this work is 4×10¹¹ cm⁻² at the interface [17]. the electron and hole surface recombination velocity is taken as 1×10⁴ cm/sec. in the simulation all the junctions of the structure are assumed to be abrupt. furthermore, we have chosen two numerical techniques, gummel and newton, to obtain solutions [17]. fig. 1(b) shows excellent agreement in the trend of threshold voltage between our simulation results and the experimental data reported in [4] over a wide range of temperature.

3. results and discussion

in this section, the device scalability and analog performance metrics are discussed. threshold voltage (vth), subthreshold swing (ss), dibl, on-state drive current (ion), off-state leakage current (ioff) and ion/ioff ratio are the important figures of merit (foms) for device scalability. as far as analog circuits are concerned, the most important parameters are the transconductance (gm), output conductance (gd) and intrinsic gain (av).

fig. 2(a) and (b) describe all three important sces, namely the variation of vth, ss and dibl with temperature. the threshold voltage is determined from the ids–vgs characteristics. it is taken as the value of vgs for which ids reaches 10⁻⁶ a/μm at vds = 0.5 v. the dibl is calculated as per (2):

$$\mathrm{DIBL} = \frac{\partial V_{th}}{\partial V_{ds}} = \frac{V_{th1} - V_{th2}}{V_{ds2} - V_{ds1}} \qquad (2)$$

where $V_{th1}$ and $V_{th2}$ are the threshold voltages observed at the two drain biases vds1 = 0.5 v and vds2 = 1.0 v. from fig. 2(a) and (b), it should be noted that vth decreases with an increase in temperature, whereas the ss and dibl values increase as temperature increases.
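The two extraction steps described above (a constant-current threshold voltage followed by the finite-difference DIBL of (2)) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the log-linear interpolation between samples, the synthetic sweep and the two threshold voltages in the usage lines are assumptions made for the example, not data or code from the paper.

```python
import math


def threshold_voltage(vgs, ids, i_crit=1e-6):
    """Constant-current threshold voltage.

    Returns the gate voltage at which the drain current first reaches
    i_crit (A/um), interpolating log10(Ids) linearly between samples.
    """
    for k in range(1, len(vgs)):
        lo, hi = ids[k - 1], ids[k]
        if lo < i_crit <= hi:
            frac = (math.log10(i_crit) - math.log10(lo)) / (
                math.log10(hi) - math.log10(lo)
            )
            return vgs[k - 1] + frac * (vgs[k] - vgs[k - 1])
    raise ValueError("drain current never crosses the criterion")


def dibl_mv_per_v(vth1, vth2, vds1=0.5, vds2=1.0):
    """DIBL as in (2), converted from V/V to mV/V."""
    return 1e3 * (vth1 - vth2) / (vds2 - vds1)


# Hypothetical sweep: an ideal exponential with a 100 mV/decade slope.
vgs = [0.0, 0.25, 0.5]
ids = [1e-9 * 10 ** (v / 0.1) for v in vgs]
print(round(threshold_voltage(vgs, ids), 3))

# Hypothetical threshold voltages lying 7.19 mV apart.
print(round(dibl_mv_per_v(0.30000, 0.29281), 2))
```

With vds2 − vds1 = 0.5 v, a dibl of 14.38 mv/v corresponds to a threshold-voltage shift of about 7.19 mv, which is why that spacing is used in the last line.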
the typical value for the ss of a multi-gate mosfet is 60 mv/decade. according to fig. 2(b), the ss is below this value for t < 300 k, increases with temperature, and reaches its typical value at t = 300 k (room temperature). the dibl value is quite impressive throughout the entire temperature range: since vth varies only slightly between the two vds values at temperatures from 200 k to 400 k, the dibl ranges from 5 mv/v to 14.38 mv/v.

fig. 2 (a) vth as a function of temperature for different vds, (b) ss and dibl as a function of temperature for vds=0.5 v.

fig. 3(a) and (b) show ion and ioff, respectively, over a wide range of temperatures at vgs=0.5 v and vds=0.5 v. the on-state current (ion) is extracted as the maximum drain current (id) from the ids–vgs characteristics at vgs=0.5 v and vds=0.5 v. the off-state current (ioff) is extracted as the drain current (id) at vgs=0 and vds=vdd. ioff is very low for t < 300 k and then increases with temperature; this is due to the low ss and high vth values at low temperatures. the temperature dependence of id is influenced by vth as:

$$I_d(T) \propto \mu(T)\,[V_{gs} - V_{th}(T)] \qquad (3)$$

the temperature-dependent id(t) is directly related to the µ(t) and vgs–vth(t) terms. because vth decreases with increasing temperature, the vgs–vth(t) term grows and causes id(t) to increase, as shown in fig. 3(a) and (b).

fig. 3 (a) on state current (ion), (b) off state current (ioff), as a function of temperature for different vds.

the ion/ioff ratio is a very important parameter for switching applications; it should be very high for a good switch. according to fig. 4(a), ion/ioff is 2.30×10¹⁴ at t=100 k, then falls as temperature increases and reaches 1.20×10⁴ at t=400 k.
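The temperature trend of the subthreshold swing discussed earlier in this section, which also drives the rise of ioff with temperature, can be put in context with the ideal thermal limit ln(10)·kT/q. The short sketch below evaluates that limit at the simulated temperatures; it is a textbook reference value, not a result of the paper, and the function name is an assumption made for the example.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
Q_E = 1.602176634e-19   # elementary charge, C


def ss_thermal_limit_mv_per_dec(t_kelvin):
    """Ideal subthreshold swing ln(10)*k*T/q, in mV/decade."""
    return 1e3 * math.log(10) * K_B * t_kelvin / Q_E


# Temperatures used in the simulation study (100 K to 400 K).
for t in (100, 150, 200, 250, 300, 350, 400):
    print(t, "K:", round(ss_thermal_limit_mv_per_dec(t), 2), "mV/decade")
```

The limit scales linearly with temperature and is close to the 60 mv/decade figure quoted for room temperature.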
at higher temperatures, the high value of ion caused by the lower vth and the high value of ioff caused by the high ss compensate each other and give rise to a nearly constant ion/ioff. fig. 4(b) shows the variation of id and gm with vgs for different operating temperatures. as per (3), at high gate bias the µ(t) term dominates, because at higher t lattice scattering dominates and reduces the channel mobility, which in turn reduces id. at low gate bias the vgs–vth(t) term causes id to increase with increasing temperature because vth is lower at higher temperatures. these two opposite effects cancel each other out at a value of vgs where id shows minimum variation with t. this point is called the ztc bias point. the gm–vgs plot is obtained by differentiating id with respect to vgs. at vgs < vth (channel weakly inverted) id is due to diffusion. the diffusion current increases with t because of the rise in intrinsic carrier concentration. at vgs > vth, gm decreases as t increases due to mobility degradation.

fig. 4 (a) on-off current ratio (ion/ioff) as a function of temperature, (b) drain current (id) and transconductance (gm) as a function of vgs for different values of operating temperature.

the reduction in vth with increasing temperature increases gm, whereas the degradation of mobility reduces it. these two phenomena compensate each other and give rise to a ztc bias point for gm. from the figure we can conclude that the transconductance ztc point (0.410 v) is lower than the drain current ztc bias point (0.552 v). the ztcids and ztcgm bias points are two important measures in analog circuit design. in opamp (operational amplifier) design, to maintain constant dc current levels, the devices need to be biased at the ztcids point, while input devices can be biased at the ztcgm point to achieve stable circuit parameters.
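The definition of the ZTC bias point used above (the gate voltage at which the curves for different temperatures show minimum variation) translates directly into a small search over a family of transfer curves. The sketch below is one possible way to do it, assuming the curves share a common vgs grid; the function names and the straight-line toy curves, which are constructed to cross at a hypothetical 0.55 v, are illustrative assumptions and not the authors' procedure or data. The same search applied to gm–vgs curves would give the ztcgm point.

```python
def relative_spread(values):
    """(max - min) / mean of a list of positive samples."""
    mean = sum(values) / len(values)
    return (max(values) - min(values)) / mean


def ztc_point(vgs, curves):
    """Return the vgs sample with the smallest relative spread across temperatures.

    curves maps temperature (K) -> list of samples taken on the vgs grid.
    """
    spreads = [
        relative_spread([samples[k] for samples in curves.values()])
        for k in range(len(vgs))
    ]
    k_best = min(range(len(vgs)), key=spreads.__getitem__)
    return vgs[k_best]


def toy_curves(vgs, v_cross=0.55, i_cross=1e-4):
    """Hypothetical straight-line Id(Vgs) curves that all cross at v_cross."""
    slopes = {100: 3.5e-4, 300: 2.5e-4, 400: 2.0e-4}  # steeper when colder
    return {
        t: [i_cross + a * (v - v_cross) for v in vgs]
        for t, a in slopes.items()
    }


grid = [round(0.30 + 0.01 * k, 2) for k in range(51)]  # 0.30 V to 0.80 V
print(ztc_point(grid, toy_curves(grid)))
```

A relative rather than absolute spread is used so that the very small subthreshold currents, which differ by orders of magnitude between temperatures, are not mistaken for a crossing point.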
the simulated output current (ids) and output conductance (gd) versus drain bias (vds) at vgs=0.5 v for different temperatures are plotted in fig. 5(a). because of the above-mentioned µ(t) and vth effects with respect to temperature, ids decreases as t increases below the ztc point, and the reverse happens after the ztc point, for both parameters. the ztc point for gd is lower than the output current ztc point. the intrinsic gain (av = gm/gd) is a valuable fom for the operational transconductance amplifier (ota) and is given in fig. 5(b). from fig. 5(b), high gain is observed at high temperatures in the subthreshold regime, with the reverse effect in the above-threshold region. from this it can be concluded that the device shows better results in the subthreshold regime at higher t and is a good candidate in the above-threshold regime at lower t.

fig. 5 (a) output current (id) and output conductance as a function of vds for different values of operating temperature, (b) intrinsic gain (av) as a function of vgs for different values of operating temperature.

the important performance metrics are tabulated in table 1. the table makes it clear that our device shows very impressive values in the low temperature range. the ion/ioff and av of the device increase as temperature decreases and attain their maximum values at t=100 k, while the ss falls with decreasing temperature.

table 1 extracted parameters for various temperatures

temperature (k)   ion/ioff     dibl (mv/v)   ss (mv/decade)   av (db)
400               1.20×10⁴     14.38         83.52            38.780
350               8.08×10⁴     12.52         72.83            40.514
300               1.02×10⁶     10.33         62.30            42.402
250               3.45×10⁷     7.78          51.85            44.355
200               6.50×10⁹     5.00          41.45            46.311
150               1.19×10¹³    11.75         18.80            48.248
100               2.30×10¹⁴    –             20.87            50.180

4. conclusion

the ztc bias points of the hkmg dg-mosfet are investigated using 2-d numerical simulation. the results presented in this work give a detailed idea about the ztc bias point for parameters like id and gm.
the results provided here can serve as a good design tool for circuits intended for a wide range of temperature applications and show promising solutions to minimize the temperature degradation of analog circuits. the work identified the distinct ztc points for the device in nanoscale.

references
[1] f. s. shoucair, “analytical and experimental methods for zero-temperature-coefficient biasing of mos transistors,” electronics letters, vol. 25, pp. 1196–1198, 1989.
[2] t. h. tan and a. k. goel, “zero-temperature-coefficient biasing point of a fully depleted soi mosfet,” microwave and optical technology letters, vol. 37, no. 5, pp. 366–370, june 2003.
[3] z. prijic, s. s. dimitrijev, and n. stojadinovic, “the determination of zero temperature coefficient point in cmos transistors,” microelectronics reliability, vol. 32, no. 6, pp. 769–773, 1992.
[4] g. groeseneken, j. p. colinge, h. e. maes, j. c. alderman, and s. holt, “temperature dependence of threshold voltage in thin-film soi mosfet's,” ieee electron device letters, vol. 11, no. 8, pp. 329–331, 1990.
[5] d. s. jeon and d. e. burk, “a temperature-dependent soi mosfet model for high-temperature application (27 °c–300 °c),” ieee transactions on electron devices, vol. 38, no. 9, pp. 2101–2110, 1991.
[6] ashraf a. osman, mohamed a. osman, numan s. dogan, and mohamed a. iman, “zero-temperature-coefficient biasing point of partially depleted soi mosfet’s,” ieee transactions on electron devices, vol. 42, no. 9, pp. 1709–1711, september 1995.
[7] k. suzuki, y. tosaka, t. tanaka, h. horie, y. arimoto, “scaling theory of double-gate soi mosfet’s,” ieee transactions on electron devices, vol. 40, no. 12, pp. 2326–2329, 1993.
[8] c. wann, k. noda, t. tanaka, m. yoshida, and c. hu, “a comparative study of advanced mosfet concepts,” ieee transactions on electron devices, vol. 43, pp. 1742, oct. 1996.
[9] j. p. colinge, “multiple-gate soi mosfets,” solid-state electronics, vol. 48, no. 6, pp. 897–905, 2004.
[10] h.-s. p. wong, “beyond the conventional transistor,” ibm j. res. & dev., vol. 46, no. 2/3, march/may 2002.
[11] a. kranti, t. m. chung, d. flandre, j. p. raskin, “laterally asymmetric channel engineering in fully depleted double gate soi mosfets for high performance analog applications,” solid-state electronics, vol. 48, pp. 947–59, 2004.
[12] n. mohankumar, b. syamal, c. k. sarkar, “influence of channel and gate engineering on the analog and rf performance of dg mosfets,” ieee transactions on electron devices, vol. 57, no. 4, pp. 820–826, april 2010.
[13] a. sarkar, a. k. das, s. de, c. k. sarkar, “effect of gate engineering in double gate mosfets for analog/rf applications,” microelectronics journal, vol. 43, pp. 873–882, july 2012.
[14] r. k. sharma, m. bucher, “device design engineering for optimum analog/rf performance of nanoscale dg mosfets,” ieee transactions on nanotechnology, vol. 11, no. 5, pp. 992–998, sept. 2012.
[15] the international technology roadmap for semiconductors. (2011). [online]. available: http://public.itrs.net/.
[16] s. m. sze, “physics of semiconductor devices (3rd edition),” wiley, 2007.
[17] atlas manual: silvaco int. santa clara, 2008.

facta universitatis series: electronics and energetics vol. 32, no 2, june 2019, pp. 239-247 https://doi.org/10.2298/fuee1902239r

design and analysis of quadrifilar helical antenna for cube-sats using c-band frequency range for satellite communication

pinku ranjan 1 , mihir patil 2 , amit bage 3 , brajesh kumar 2 , sandeep kumar p. 
2
1 department of computer science & engineering, abv-indian institute of information technology and management, gwalior, madhya pradesh – 474015, india
2 department of electronics and communication engineering, srm institute of science and technology, kattankulathur, chennai, tamil nadu – 603203, india
3 department of electronics and communication engineering, national institute of technology, hamirpur, himachal pradesh – 177005, india

abstract. the design and analysis of a quadrifilar helical antenna are presented in this paper. the proposed antenna is designed for cube-sats in the low earth and medium earth orbits. it is a combination of four helical antennas, each separated by 90° and excited separately at the feeding point. the antenna is designed for operation at 4.5 ghz with an impedance bandwidth of 11.11 %. the antenna is designed in two steps. the first step is the design of a ground plane that makes the antenna operate at 4.5 ghz. the second step is to analyze the antenna’s performance for different helix angles using the best ground plane dimensions obtained in the first step. the gain versus frequency curve has been obtained, and the designed antenna has a gain of more than 4 db at the resonant frequency of 4.5 ghz.

key words: quadrifilar antenna, satellite communication, coaxial probe feed

1. introduction

because of the huge building, assembling and launching costs of large satellites, many private institutions that are willing to contribute even a small part to space exploration give priority to cube and microsatellites. in modern microwave and millimeter wave communication systems, the use of the quadrifilar helical antenna is increasing day by day. this is due to the very large beam-width and high gain provided by the antenna [1-4]. it has become a major pillar of antenna design for satellite communication.
owing to the evolution of electronics and vlsi technology, it is now possible for small satellites to perform fairly difficult space exploration tasks. thus, there is a need to design small antennas suitable for cube-sats.

received september 14, 2018; received in revised form february 15, 2019. corresponding author: amit bage, department of ece, srm institute of science and technology, kattankulathur, chennai, india, 603203 (e-mail: bageism@gmail.com)

in [5], deployable helical antennas are presented for cube-sats. the deployment mechanism is used so that the antenna takes up as little space as possible. it therefore requires a ground plane that is also deployable, to reflect out the back lobes. in some cases this might help, but because it requires a deploying mechanism, it becomes very hard for small satellites to carry out the job reliably. it could even worsen the radiation pattern if not deployed properly, and would thus be prone to errors. a dual-band quadrifilar helix antenna using stepped-width arms has been demonstrated in [6]. in [7], an omnidirectional antenna radiating circularly polarized waves is presented.

the c-band is selected for the antenna operation because the antenna’s dimensions are very small in this band, which makes it very suitable for 1u, 2u and 3u cube-sats. in addition, much less free space loss is incurred than in the x and ku bands, and going towards higher frequencies leads to more atmospheric losses. the s-band is rejected because it would require an antenna of about 14 cm in height, whereas the c-band provides the same at about half the height. satellites in low earth orbits have much less time for direct line of sight communication. thus, they need to be properly oriented when the line of sight communication can be established.
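The free space loss argument above can be made concrete with the standard free-space path loss expression, FSPL = 20·log10(4πdf/c). The sketch below compares the 4.5 ghz operating frequency with representative x-band and ku-band frequencies; the 8 ghz and 12 ghz values and the 500 km slant range are assumptions chosen only for illustration and do not come from the paper.

```python
import math

C = 299_792_458.0  # speed of light, m/s


def fspl_db(distance_m, freq_hz):
    """Free-space path loss in dB for a given distance and frequency."""
    return 20 * math.log10(4 * math.pi * distance_m * freq_hz / C)


distance = 500e3  # assumed slant range of 500 km (illustrative LEO value)
bands = {"c-band": 4.5e9, "x-band (assumed 8 ghz)": 8e9, "ku-band (assumed 12 ghz)": 12e9}

reference = fspl_db(distance, bands["c-band"])
for name, freq in bands.items():
    loss = fspl_db(distance, freq)
    print(f"{name}: {loss:.2f} dB ({loss - reference:+.2f} dB vs c-band)")
```

The difference between two bands depends only on the frequency ratio, 20·log10(f2/f1), so it is the same at any distance.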
as one qfh antenna can serve only 180°, two qfh antennas are required to cover the whole 360° view of the satellite. the antenna requires a phase difference of 90° between adjacent helices. in [8], a very cost-efficient and very small circuit is designed to provide phase-shifted signals, which helps to lower the burden of generating and sending out such signals. the basic design of the quadrifilar helical antenna has been demonstrated in [9]. on that basis, an antenna is proposed and further optimized. gain enhancement techniques have been presented in [10]. in [11], printed circuit discontinuities have been taken into account. this works as a resonating structure and thus allows only certain frequencies to be received by the antenna. in 2011, kumar et al. [12] presented a circularly polarized (cp) quadrifilar helix (qfh) antenna.

this manuscript presents the design and analysis of the quadrifilar helical antenna. it is a combination of four helical antennas, each separated by 90° and excited separately at the feeding point. the antenna is operated at 4.5 ghz, with 11.11 % impedance bandwidth. the antenna works as a circularly polarized antenna in 4.28–4.64 ghz. the length of the antenna is 7.5 cm and the bottom cylinder, which is below the ground plane, is 1 cm, while the length of the feed cylinder (above the ground plane) is 3.6 mm. the numerical simulation analysis has been carried out using ansys high-frequency structure simulator (version 15). the manuscript is organized as follows. in the first section, the quadrifilar helical antenna’s geometry is presented. in the second section, results and discussion are presented. the last section gives the final conclusion.

2. antenna geometry

the pitch of the helices is 15 cm and each helix has half a turn, making a total length of 7.5 cm, which is 1.125λ. the helix radius is 1.15 cm and the radius of the wire is 0.5 mm.
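The geometric figures quoted above can be cross-checked with a few lines of arithmetic: the axial length follows from pitch times number of turns, the electrical length from the free-space wavelength at 4.5 ghz, and the absolute bandwidth from the 11.11 % figure. The sketch below is only such a check; the function names are chosen for the example.

```python
C = 299_792_458.0  # speed of light, m/s


def wavelength_cm(freq_hz):
    """Free-space wavelength in centimetres."""
    return 100 * C / freq_hz


def axial_length_cm(pitch_cm, turns):
    """Axial length of a helix with the given pitch and number of turns."""
    return pitch_cm * turns


def electrical_length(length_cm, freq_hz):
    """Physical length expressed in free-space wavelengths."""
    return length_cm / wavelength_cm(freq_hz)


f0 = 4.5e9                             # operating frequency from the text
length = axial_length_cm(15.0, 0.5)    # 15 cm pitch, half a turn
print("axial length (cm):", length)
print("wavelength (cm):", round(wavelength_cm(f0), 3))
print("length in wavelengths:", round(electrical_length(length, f0), 3))
print("bandwidth for 11.11 % (MHz):", round(0.1111 * f0 / 1e6, 1))
```

The result reproduces the 7.5 cm length and agrees with the stated 1.125λ to within rounding of the speed of light.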
in [13] the best antenna characteristics are achieved using an angle of 73°, whereas in this design the antenna has a helix angle of 81.28° to attain a better radiation pattern. all four helices have the same dimensions. each helix is rotated by 90° with respect to the previous helix. the number of segments per turn is taken as 36, which is the default value. the top part of the antenna has four cylinders whose axes are perpendicular to the z-axis. they support the antenna structure from the top and are called top cylinders. these cylinders have a total height of 11.5 mm and a radius of 1 mm. these values are chosen such that each cylinder can easily accommodate the helix. the top cylinders are made of copper, and thus no losses are incurred in the design. four metallic rods are placed above the ground plane to support the antenna structure from below. these are called bottom cylinders. all four rods have a radius of 0.9 mm and a height of 7.5 mm, such that they can easily accommodate the incoming helix. the four helices, the four top supporting cylinders and the four bottom supporting cylinders are united to make one radiating antenna structure. copper is assigned as the material of the structure, and it is assigned a perfect e boundary condition. below these cylindrical rods there is a square ground plane. below the ground plane, there are four copper rods of 0.9 mm radius. these are called feed cylinders. each feed cylinder rod is surrounded by a teflon tube with an inner radius of 0.9 mm and an outer radius of 3.018 mm. the radius is chosen such that it gives a total impedance of 50 ohms. this makes any cable with 50 ohms of impedance suitable for attachment to the antenna. impedance matching plays a major role in power transmission through the antenna.
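The choice of a 0.9 mm inner radius and a 3.018 mm outer radius for the teflon-filled feed can be checked against the standard characteristic impedance of a coaxial line, Z0 = (η0 / 2π√εr)·ln(b/a). The sketch below applies that textbook formula with the teflon dielectric constant of 2.1 quoted for this design; the function name is an assumption, and the formula is a standard result rather than something stated in the paper.

```python
import math

MU_0 = 4e-7 * math.pi            # vacuum permeability, H/m (classical value)
EPS_0 = 8.8541878128e-12         # vacuum permittivity, F/m
ETA_0 = math.sqrt(MU_0 / EPS_0)  # free-space wave impedance, about 376.73 ohm


def coax_impedance_ohm(inner_radius, outer_radius, eps_r):
    """Characteristic impedance of a coaxial line with a uniform dielectric."""
    return ETA_0 / (2 * math.pi * math.sqrt(eps_r)) * math.log(outer_radius / inner_radius)


# Feed dimensions from the text: 0.9 mm conductor, 3.018 mm teflon tube, eps_r = 2.1.
z0 = coax_impedance_ohm(0.9e-3, 3.018e-3, 2.1)
print(round(z0, 2), "ohm")
```

Only the ratio of the two radii matters, so the same impedance is obtained for any pair of radii with the ratio 3.018/0.9.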
the height of the feed cylinder is 3.6 mm. the height is selected so that the antenna can be easily mounted on any structure. excitation is applied to the ports on the bottom faces of these feed cylinders. each of the four ports is excited separately. each port is fed with a signal phase-shifted by 90° with respect to the adjacent port. all the ports are fed in the clockwise direction. the dielectric constant of teflon is 2.1. the inner copper tube is responsible for transferring the electrical signal from the wave-port to the antenna structure. the cross sections are kept as small as possible, while still easily accommodating the incoming helices, to avoid losses. four holes of radius 0.9 mm are subtracted from the ground plane so as to allow the electric signals to pass through it. fig. 1 shows the side view of the proposed antenna. fig. 2 shows only the bottom view of the ground plane. fig. 3 shows the cross-sectional view of the whole antenna. fig. 4 shows the direction of alignment for the feeds of the 4 ports of the antenna.

fig. 1 side view of qfh antenna
fig. 2 bottom view of the ground plane.
fig. 3 cross-section view of qfh antenna
fig. 4 feed alignment of the 4 ports.

3. result and discussion

the antenna’s input characteristics have been analyzed for the desired operating frequency. the ground plane dimension has been analyzed for the lowest reflection coefficient. from the simulations, as shown in fig. 5, it is found that the reflection coefficient is lowest for a ground plane of length 3.75 cm, which is half of the total height of the antenna. the antenna is simulated in hfss with the following design constraints. the maximum no. 
of passes taken is 6 and the maximum delta s is 0.02. the step size is kept as 0.01 so as to depict the antenna parameters as accurately as possible. the minimum length of the ground plane for which the simulation is evaluated is 3 cm. below 3 cm the ground plane would not be able to support the helix structure. the lowest value of the reflection coefficient is -12 db at a ground plane length of 3 cm, and it decreases further until the length reaches 3.75 cm, where the lowest value is about -28.8 db. after that, the value increases up to -16.7 db at 5 cm and then decreases again. the value at a 6 cm ground plane is about -21.4 db. beyond 6 cm of ground plane length, however, the resonating frequency starts to move towards 4.4 ghz. at 7, 8, 9 and 10 cm the lowest reflection coefficient stays between -19.5 and -20.5 db, but the antenna resonates at 4.4 ghz.

fig. 5 plot for reflection coefficient against frequency for the different lengths of the ground plane.

after this, keeping the length of the ground plane at 3.75 cm, further optimization is attempted by evaluating the results for different helix angles.

fig. 6 reflection coefficients for different helix angle with the constant ground plane of 3.75 cm.

from fig. 6, it can be inferred that the antenna resonates at 4.8 ghz for a helix angle of 70º and at 4.4 ghz for 75º. from 80º to 85º, the resonating frequency stays at 4.5 ghz. the antenna performs best at the helix angle of 81.3º, with a minimum reflection coefficient of -28.8041 db. the helix angles are evaluated only up to 85º because, above 85º, it becomes impossible to mount the coaxial feeds, as they intersect with each other. the final |s11| versus frequency curve has been extracted and is shown in fig. 7. from fig. 
7, it can be inferred that it has a resonant frequency of 4.5 ghz with 11.11 % impedance bandwidth.

fig. 7 reflection coefficient for the ground plane of dimension 3.75 cm and helix angle of 81.3º.

the far-field analysis has been done for the proposed antenna at its resonant frequency (4.5 ghz). the radiation patterns for the xz-plane and xy-plane are shown in fig. 8 and fig. 9, respectively. the difference between the co- and cross-polarized components is more than 15 db. the e-plane view has the maximum value of e field radiation in that cross-sectional plane, which is shown in fig. 8. similarly, the h-plane has the maximum value of h field radiation in that cross-sectional plane, which is shown in fig. 9. thus, the antenna provides a very promising radiation pattern.

fig. 8 radiation pattern for the optimum antenna dimension for xz-plane (e-plane).
fig. 9 radiation pattern for optimum antenna dimension for xy-plane (h-plane).
fig. 10 gain (db) vs. frequency (ghz) for phi=80º and theta=110º.
fig. 11 axial ratio of the antenna.
fig. 12 3-d radiation pattern of the proposed antenna.

the gain versus frequency curve has been analyzed for the proposed antenna and is shown in fig. 10. it can be inferred that the antenna gives a maximum gain of 4.2254 db at phi=80º and theta=110º at the resonant frequency of 4.5 ghz. the antenna gain is constant throughout the operating frequency band. fig. 11 shows the axial ratio of the antenna. the antenna works as a circularly polarized antenna from 4.28 to 4.64 ghz. fig. 12 shows the 3-d radiation pattern of the proposed antenna, in which the maximum gain is 4.22 dbi. a radiation efficiency of 79 % is obtained at 4.5 ghz. the proposed antenna is compared with other quadrifilar helical antennas in table 1. it can be inferred from the data that the antenna has a very high bandwidth of 500 mhz, as compared to other designs.
also, the proposed antenna has a moderate gain as compared to the other designs presented in table 1. linearity in gain over the bandwidth proves very helpful. thus, the novelty in this design is the impedance bandwidth and gain of the antenna. it supports 500 mhz of bandwidth, with a linear gain of 4.22 db. this is the major advantage of the design. the design is a result of intense optimization of the antenna’s height, ground plane size and the diameter of the cylindrical rod. all this is possible with the very simple design of the antenna using the metallic rods.

table 1 comparison of the proposed antenna with other antenna designs.

ref.       resonating frequency (ghz)   bandwidth (mhz)   gain (db)   length of the antenna (cm)   fractional impedance bandwidth
[1]        2.51                         20                2.32        4183                         0.0079
[11]       0.86                         95                6.4         16.1                         0.110
[12]       4.2                          500               3.5         4.6                          0.1190
[13]       1.53                         200               6.2         19.5                         0.1307
our work   4.5                          500               4.22        7.5                          0.1111

4. conclusion

the quadrifilar helical antenna has been designed at 4.53 ghz resonant frequency with 11.11% impedance bandwidth (4.3 ghz to 4.8 ghz). the optimized antenna has a total height of 7.5 cm with half a turn, and it performs best with a ground plane of 3.75 cm x 3.75 cm and a helix angle of 81.3º. it gives a gain of about 4.2 db at the resonant frequency of 4.5 ghz at phi=80º and theta=110º. this paper shows that the qfh antenna is a very good candidate for omnidirectional cube-sat applications with a good gain.

acknowledgment: the authors would like to thank the department of science and technology (dst), government of india, for its support through the fist project.

references
[1] n. bhuma and c. himabindh, “right hand circular polarization of a quadrifilar helical antenna for satellite and mobile communication systems,” recent advances in space techn. servic. and climate change 2010 (rsts & cc-2010), chennai, pp. 307–310, 2010.
[2] chapari, z. h. firouzeh, r. moini and s. h. h. 
sadeghi, “a low weight s-band quadrifilar helical antenna for satellite communication,” in proceedings of the 13th intern. symp. on antenna techn. and applied electromag. and the canadian radio science meeting, toronto, 2009, pp. 1-3c.
[3] t. cvetković, v. milutinović, n. dončov, b. milovanović, "numerical calculation of shielding effectiveness of enclosure with apertures based on em field coupling with wire structures", facta universitatis, series: electronics and energetics, vol. 28, no. 4, pp. 585–596, 2015.
[4] mengmeng and h. weina, “a printed quadrifilar-helical antenna for ku-band mobile satellite communication terminal,” in proceedings of the 17th intern. conf. on comm. techn. (icct), chengdu, 2017, pp. 755–759.
[5] j. costantine, y. tawk, i. maqueda, m. sakovsky, g. olson, s. pellegrino, c. g. christodoulou, “uhf deployable helical antennas for cubesats,” ieee trans. on antennas and propag., vol. 64, no. 9, pp. 3752–3759, 2016.
[6] g. byun, h. choo, s. kim, “design of a dual-band quadrifilar helix antenna using stepped-width arms,” ieee trans. on antennas and propag., vol. 63, no. 4, pp. 1858–1862, april 2015.
[7] j. hou, x. sun and h. yang, “design of a high gain quadrifilar helix antenna for satellite mobile communication,” in proceedings of the china-japan joint microw. conf., hangzhou, 2011, pp. 1–3.
[8] m. s. ghaffarian, s. khajepour and g. moradi, “a quadrifilar helix antenna using low cost planar feeding circuit,” in proceedings of the 24th iranian conf. on electrical engg. (icee), shiraz, 2016.
[9] adams, r. greenough, r. wallenberg, a. mendelovicz and c. lumjiak, “the quadrifilar helix antenna,” ieee trans. on antennas and propag., vol. 22, no. 2, pp. 173–178, 1974.
[10] s. gao, q. luo, and f. zhu, “circularly polarized antennas,” hoboken, nj, usa: wiley, nov. 2013.
[11] y. tawk, m. chahoud, m. fadous, j. costantine and c. g. christodoulou, “the miniaturization of a partially 3-d printed quadrifilar helix antenna,” ieee trans. 
on antennas and propag., vol. 65, no. 10, pp. 5043–5051, oct. 2017. [12] p. kumar, m. kumar, c. kumar, s. kumar, v. srinivasan, “integrated quadrifilar helix at c-band for spacecraft omni antenna system,” in proceedings of the ieee applied electromag. conf. (aemc), pp. 1–4, 2011. [13] z. y. zhang, l. yang, s. l. zuo, m. u. rehman, g. fu, c. zhou, “printed quadrifilar helix antenna with enhanced bandwidth,” iet microw. antennas & propag., vol. 11, pp. 732–736, 2017. facta universitatis series: electronics and energetics vol. 34, no 1, march 2021, pp. 53-69 https://doi.org/10.2298/fuee2101053d © 2021 by university of niš, serbia | creative commons license: cc by-nc-nd original scientific paper assessment of electrical interference on metallic pipeline from hv overhead power line in complex situation rabah djekidel, sid ahmed bessedik, abdechafik hadjadj laboratory for analysis and control of energy systems and electrical systems lacosere, laghouat university (03000), algeria abstract. sharing corridors between high voltage alternating current (hvac) power lines and metallic pipelines has become quite common. voltages can be induced on pipelines from hv power lines, which may cause a risk of electric shock to the operator and serious corrosion damage on metallic pipelines. this paper aims to examine the capacitive coupling between aerial metallic pipelines and hv power lines in perfect parallelism case and in general situation which is formed by parallelism, approaches and crossings, using a combination of charge simulation method and artificial bee colony (abc) algorithm. the electric field at the pipeline's surface and the induced voltage on the pipeline are strongly affected by the pipeline separation distance. the presented simulation results are compared with those obtained from the admittance matrix analysis, a good agreement has been obtained. 
key words: charge simulation method (csm), artificial bee colony (abc) algorithm, capacitive coupling, hv overhead power line, aerial pipelines

received june 24, 2020; received in revised form december 24, 2020
corresponding author: djekidel rabah, electrical engineering department, university of amar telidji of laghouat, bp 37g route of ghardaïa, laghouat 03000, algeria (e-mail: rabah03dz@live.fr)

1. introduction

aerial and buried metallic pipelines often share common corridors with hv overhead power lines over long distances. the electric and magnetic fields emitted by these hv power lines cause ac interference on metallic pipelines located in close proximity. therefore, in many cases, the adjacent metallic pipelines are subjected to high induced ac voltages and currents [1-6]. there are three different coupling mechanisms: capacitive, inductive and conductive. these electrical interferences raise three main concerns: a risk of electrocution for personnel working on the pipeline, damage to the pipeline's insulation coating, and a risk to the integrity of the pipeline and its associated protective equipment. it is therefore often necessary to verify the induced voltage levels carefully and, in some cases, to adopt mitigation systems and safety measures [1-6].

in the simple case, the metallic pipeline runs exactly parallel to the conductors of the hv power line; this exposure is termed perfect parallelism. in the general, more complex situation, the zone of ac interference is composed of parallelisms, approaches and crossings [1].
in this regard, the purpose of the present paper is to assess the ac voltages induced by capacitive coupling between hv power lines and adjacent aerial metallic pipelines located in the same corridor, both in the case of perfect parallelism and in the complex situation. the induced voltage is assessed using the charge simulation method (csm). the accuracy of this technique depends strongly on the number and location of both the simulating charges and the contour points. to address this constraint, the artificial bee colony (abc) algorithm, one of the most commonly used evolutionary algorithms for such optimization problems, is employed [7, 8]. the performance of the adopted hybrid technique is verified by comparison with results obtained from the admittance matrix analysis.

2. capacitive coupling mechanism

only metallic pipelines installed above ground are subject to capacitive coupling from the hv power line. the coupling is produced by the electric field of the hv power line, which induces electric charges on the aerial metallic pipeline. the arrangement acts as a voltage divider formed by the capacitance between the hv overhead power line and the aerial pipeline, which is insulated from the ground, in series with the capacitance between the aerial pipeline and the adjacent earth, as shown in fig. 1 [1,5,9,10].

fig. 1 electrostatic coupling from a hv power line to a metallic pipeline

3. charge simulation method (csm)

in this technique, the real distributed charges on the surface of a conductor are replaced by a system of discrete fictitious charges arranged inside the conductor. the values of these fictitious charges are evaluated by satisfying the boundary conditions at a number of selected points, called contour points, on the surface of the conductor.
once the values of the fictitious charges are known, the potential at any point in the region outside the conductors can be determined using the superposition principle as follows [9-17]:

$$V_i = \sum_{j=1}^{n_t} p_{ij}\, q_j \qquad (1)$$

where $n_t$ is the number of discrete fictitious charges and $p_{ij}$, called the potential coefficient, is the potential at point (i) caused by a unit charge $q_j$. it depends on the relative distance between the contour point (i) and the fictitious charge (j), and can be expressed by the following equation [9-17]:

$$p_{ij} = \frac{1}{2\pi\varepsilon_0}\,\ln\sqrt{\frac{(x_i - x_j)^2 + (y_i + y_j)^2}{(x_i - x_j)^2 + (y_i - y_j)^2}} \qquad (2)$$

where $(x_i, y_i)$ are the coordinates of the boundary contour point and $(x_j, y_j)$ are the coordinates of the discrete fictitious charge.

first, the values of the fictitious charges are determined by solving the linear system given in equation (3) [9-17]:

$$[q_j] = [p_{ij}]^{-1}\,[V_{ci}] \qquad (3)$$

where $[p_{ij}]$ is the potential coefficients matrix, $[q_j]$ is the column vector of discrete fictitious charges, and $[V_{ci}]$ is the column vector of known potentials at the contour points (boundary conditions).

after calculating the values of the fictitious charges, we choose $n_c$ verification points located on the contour of the conductors and calculate the new electrical potential $V_{vi}$ produced there by the simulation charges. the relative error between this newly calculated potential and the real potential $V_{ci}$ applied to the phase conductors represents the accuracy of the simulation. the simulation is acceptable if this relative error is less than the selected precision. if not, the simulation procedure should be repeated with a different number and/or position of the simulation charges [9-17].
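the procedure described above (build the potential coefficients, solve for the fictitious charges, then check the potential at verification points) can be sketched for the simplest possible case. this is only a minimal illustration, not the authors' implementation: it assumes a single conductor above a perfectly conducting ground, places charges and contour points at equal angles on two concentric circles, and puts the verification points halfway between contour points. the function names, the small gaussian-elimination helper and the numerical values in the usage lines are all hypothetical.

```python
import math

EPS0 = 8.854187817e-12  # vacuum permittivity in F/m


def potential_coefficient(xi, yi, xj, yj):
    """Potential at (xi, yi) per unit line charge at (xj, yj), image below ground included."""
    d_image = math.hypot(xi - xj, yi + yj)
    d_direct = math.hypot(xi - xj, yi - yj)
    return math.log(d_image / d_direct) / (2.0 * math.pi * EPS0)


def ring(x0, y0, radius, n, offset=0.0):
    """n points evenly spaced on a circle, optionally rotated by `offset` radians."""
    return [(x0 + radius * math.cos(2.0 * math.pi * k / n + offset),
             y0 + radius * math.sin(2.0 * math.pi * k / n + offset))
            for k in range(n)]


def solve(a, b):
    """Solve a.x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def csm_single_conductor(v0, height, r_cond, r_charge, n):
    """Return the fictitious charges and the worst relative error at the check points."""
    contour = ring(0.0, height, r_cond, n)        # boundary points on the surface
    charges_xy = ring(0.0, height, r_charge, n)   # fictitious charges inside the conductor
    p = [[potential_coefficient(xi, yi, xj, yj) for (xj, yj) in charges_xy]
         for (xi, yi) in contour]
    q = solve(p, [v0] * n)                        # charges that satisfy the boundary condition
    check = ring(0.0, height, r_cond, n, offset=math.pi / n)
    worst = 0.0
    for (xi, yi) in check:
        v = sum(potential_coefficient(xi, yi, xj, yj) * qj
                for (xj, yj), qj in zip(charges_xy, q))
        worst = max(worst, abs(v - v0) / v0)
    return q, worst


# hypothetical values: 1 kV conductor, 10 m above ground, 15 mm radius, 6 charges
charges, error = csm_single_conductor(1000.0, 10.0, 0.015, 0.0075, 6)
print("total charge per unit length [C/m]:", sum(charges))
print("worst relative potential error:", error)
```

if the reported error exceeds the chosen precision, the number of charges `n` or the charge radius `r_charge` would be changed and the calculation repeated, which is the step the optimization algorithm is meant to automate.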
to calculate the electric field due to the hv power line, each conductor of the power line is treated as an infinite line charge. the two-dimensional (2-d) coordinates of the fictitious charges and contour points in the cross section of the conductor/pipeline are shown in fig. 2 [9,10,16,17].

fig. 2 2-d arrangement of simulation charges and contour points for the line conductor and the metallic pipeline

the coordinates of the contour points and fictitious charges are obtained from the following two formulas [9,10,16,17]:

$$x_k = x_0 + r\,\cos\!\left(\frac{2\pi}{n_k}(k-1)\right), \qquad y_k = y_0 + r\,\sin\!\left(\frac{2\pi}{n_k}(k-1)\right) \qquad (4)$$

where $r = r_1$ for the contour points ($k = i$) and $r = r_2$ for the fictitious charges ($k = j$), $n_k$ is the number of points distributed around the corresponding circle, $y_0$ is the vertical coordinate of the conductors and metallic pipeline above ground, and $x_0$ is their horizontal coordinate.

the components of the electric field are calculated by superposing the contributions of all the charges. in a cartesian coordinate system, the horizontal and vertical components $E_x$ and $E_y$ of the electric field for $n_t$ discrete fictitious charges are given by [9-17]:

$$E_x = \frac{1}{2\pi\varepsilon_0}\sum_{j=1}^{n_t} q_j\left[\frac{x - x_j}{(x - x_j)^2 + (y - y_j)^2} - \frac{x - x_j}{(x - x_j)^2 + (y + y_j)^2}\right]$$

$$E_y = \frac{1}{2\pi\varepsilon_0}\sum_{j=1}^{n_t} q_j\left[\frac{y - y_j}{(x - x_j)^2 + (y - y_j)^2} - \frac{y + y_j}{(x - x_j)^2 + (y + y_j)^2}\right] \qquad (5)$$

where $(x, y)$ are the coordinates of the observation point and $(x_j, y_j)$ are the coordinates of the discrete fictitious charges. the resulting strength of the electric field at the observation point p is obtained from the horizontal and vertical components and can be written in the form [9-17]:
$$E_{res} = \sqrt{E_x^2 + E_y^2} \qquad (6)$$

under steady-state conditions of the hv power line, the ac induced voltage on the metallic pipeline due to the fictitious charges is determined using equation (7) [1,12]:

$$V_{ind} = \frac{1}{2\pi\varepsilon_0}\sum_{j=1}^{n_t} q_j\,\ln\sqrt{\frac{(x - x_j)^2 + (y + y_j)^2}{(x - x_j)^2 + (y - y_j)^2}} \qquad (7)$$

when a person touches the metallic pipeline, the human body is charged and undergoes an electric shock. the discharge current that flows through the body is given by the following relation [1,12,18]:

$$I_{shock} = C_p\,L_p\,\frac{dV_{ind}}{dt} = j\,\omega\,C_p\,L_p\,V_{ind} \qquad (8)$$

where $L_p$ is the length of the metallic pipeline exposed to the capacitive (electrostatic) coupling, $\omega$ is the angular frequency of the power system, and $C_p$ is the metallic pipeline's capacitance to earth per unit length, given by the inverse of the pipeline potential coefficient defined above.

if the magnitude of the discharge current is higher than the admissible exposure limit set by the international standard iec 60479-1:2005 for steady-state conditions at the industrial frequency of 50 hz, which is equal to 10 ma for adult males [1,19], the metallic pipeline must be earthed through a low resistance $R_s$ to reduce the discharge current below the acceptable limit. this earthing resistance must be lower than [1,5,12,14]:

$$R_s \le \frac{R_{body}}{\beta - 1} \qquad (9)$$

where $R_{body}$ is the body resistance and $\beta$ is the ratio $\beta = I_{shock}/I_{adm}$. in accordance with the american standard ieee 80:2000, the overall resistance of the human body is typically taken equal to 1000 Ω [1,20].

4. general situation

in the general (complex) situation of exposure between an overhead hv power line and a metallic pipeline, the area of influence is generally made up of three different cases: parallelism, approaches and crossings. this situation is illustrated in fig. 3.

fig. 3 zone of influence

in this case, the distance separating the metallic pipeline, or a section of the metallic pipeline, from the different conductors of the overhead hv power line is no longer constant. two such situations are illustrated in fig. 4.

fig. 4 conversion of non-parallel exposures to parallel exposures between a hv power line and a metallic pipeline: (a) oblique exposure, (b) crossing exposure

in both cases, the non-parallel pipeline exposure can be converted to a perfect parallelism in which the aerial pipeline is parallel to the hv power line at an equivalent distance given by the following equation [1,18]:

$$x_{eq} = \sqrt{x_1\,x_2} \qquad (10)$$

with

$$\frac{1}{3} \le \frac{x_1}{x_2} \le 3 \qquad (11)$$

where $x_{eq}$ is the geometric mean distance to the hv power line, and $x_1$ and $x_2$ are the minimum and maximum distances of the metallic pipeline to the hv power line.

the resulting ac voltage of the metallic pipeline to earth can be evaluated as the average of the voltages in each section, weighted by the ratio of the section length to the pipeline's complete length, as follows [1,18]:

$$V_p = \frac{1}{L}\sum_{i=1}^{n} V_{p(i)}\,L_i \qquad (12)$$

where $V_{p(i)}$ is the ac induced voltage per unit length in section i, $L_i$ is the length of section i, n is the number of sections of the aerial pipeline, and $L$ is the complete length of the zone of influence.

5. artificial bee colony (abc)

to improve the simulation accuracy of the charge simulation method (csm), the artificial bee colony (abc) algorithm is used to find the optimal position and number of both the discrete fictitious charges and the contour points. the artificial bee colony algorithm was introduced by dervis karaboga in 2005 for continuous optimization problems. it is a metaheuristic algorithm inspired by the foraging behavior of honey bees. the artificial bee colony contains three groups: scouts, onlookers and employed bees.
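as an illustration of how the three groups of bees interact, the following is a minimal sketch of an abc minimization loop. it follows the standard scheme of karaboga [7,8] rather than the authors' own code: the abandonment threshold `limit`, the clamping of candidates to the search bounds, the fixed random seed and all function names are assumptions made for this example, and the sphere function in the usage lines is only a stand-in for the csm relative-error objective.

```python
import random


def abc_minimize(f, lower, upper, sn=10, limit=20, cycles=200, seed=0):
    """Minimise a non-negative objective f over the box [lower, upper] (sketch only)."""
    rng = random.Random(seed)
    dim = len(lower)

    def random_source():
        # scout rule: uniform random point inside the bounds
        return [lower[i] + rng.random() * (upper[i] - lower[i]) for i in range(dim)]

    def fitness(cost):
        # fitness used for onlooker selection, valid for cost >= 0
        return 1.0 / (1.0 + cost)

    sources = [random_source() for _ in range(sn)]
    costs = [f(x) for x in sources]
    trials = [0] * sn
    best_cost = min(costs)
    best_x = sources[costs.index(best_cost)][:]

    def try_neighbour(m):
        # perturb one randomly chosen parameter of source m towards/away from source k
        k = rng.choice([j for j in range(sn) if j != m])
        i = rng.randrange(dim)
        phi = rng.uniform(-1.0, 1.0)
        candidate = sources[m][:]
        value = candidate[i] + phi * (candidate[i] - sources[k][i])
        candidate[i] = min(max(value, lower[i]), upper[i])  # assumption: clamp to bounds
        cost = f(candidate)
        if cost < costs[m]:          # greedy selection
            sources[m], costs[m], trials[m] = candidate, cost, 0
        else:
            trials[m] += 1

    for _ in range(cycles):
        for m in range(sn):          # employed bees: one trial per food source
            try_neighbour(m)
        fits = [fitness(c) for c in costs]
        total = sum(fits)
        for _ in range(sn):          # onlookers: pick sources in proportion to fitness
            m = rng.choices(range(sn), weights=[v / total for v in fits])[0]
            try_neighbour(m)
        cycle_best = min(costs)      # memorise the best source before scouting
        if cycle_best < best_cost:
            best_cost = cycle_best
            best_x = sources[costs.index(cycle_best)][:]
        for m in range(sn):          # scouts: replace exhausted sources
            if trials[m] > limit:
                sources[m] = random_source()
                costs[m] = f(sources[m])
                trials[m] = 0
    return best_x, best_cost


def sphere(x):
    """Stand-in objective with its minimum (zero) at the origin."""
    return sum(v * v for v in x)


solution, cost = abc_minimize(sphere, [-5.0, -5.0], [5.0, 5.0])
print("best point:", solution, "cost:", cost)
```

in the setting of this paper, `f` would be replaced by a function that runs the charge simulation for a given number and radius of fictitious charges and returns the relative potential error at the verification points.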
in the abc algorithm, the initial population consists of sn food sources generated randomly in the search space. a candidate food source $v_{mi}$ in the neighbourhood of each source is then generated according to the following equation [7,8,21,22]:

$$v_{mi} = x_{mi} + \varphi_{mi}\,(x_{mi} - x_{ki}) \qquad (13)$$

where $x_k$ is a randomly selected candidate solution ($m \neq k$), i is a randomly selected parameter index, and $\varphi_{mi}$ is a random number within the range [-1,1].

each food source is associated with a fitness value which characterizes its amount of nectar. this value is calculated according to equation (14) [7,8,21,22]:

$$fit_m(x_m) = \frac{1}{1 + f_m(x_m)}, \qquad f_m(x_m) \ge 0 \qquad (14)$$

a food source is chosen in a probabilistic manner by evaluating the probability $p_m$, which depends on the nectar content of the food source. this probability is determined as follows [7,8,21,22]:

$$p_m = \frac{fit_m(x_m)}{\sum_{m=1}^{SN} fit_m(x_m)} \qquad (15)$$

finally, if a solution is abandoned, a new solution $x_m$ is produced randomly by the scout bees using the following expression [7,8,21,22]:

$$x_{mi} = l_i + rand(0,1)\,(u_i - l_i) \qquad (16)$$

where rand(0,1) is a random number within the range [0, 1], and $u_i$ and $l_i$ are the upper and lower bounds of the solution space of the objective function.

the objective function used in this method is based on the relative error; its form is given by the following equation [9,10,16,17]:

$$OF(n_t, r_{fi}) = \frac{1}{n_t}\sum_{i=1}^{n_t}\frac{\left|V_{ci} - V_{vi}\right|}{V_{ci}} \qquad (17)$$

where $V_{ci}$ is the real voltage applied to the phase conductors of the hv power line, $V_{vi}$ is the new electrical voltage obtained at the verification points (check points), and $n_t$ is the total number of verification points.

6.
admittance matrix method

for problems related to electrical charges, the pipeline ac voltage to earth due to capacitive coupling, for a given exposure length with the high voltage (hv) overhead power line, can be assessed using the admittance matrix technique. the resulting matrix consists of the self and mutual admittances of the hv overhead power line conductors and the aerial metallic pipeline. this approach is simple and easy to implement; it is fast and provides an accurate solution.

for a balanced (symmetrical) three-phase ac power system with an aerial pipeline, under steady-state conditions of the hv power line (the three phases have the same amplitude and are phase shifted by 120°), the power line admittance per phase per unit length is obtained from the following formula [18,23]:

$$[Y] = j\,\omega\,[P]^{-1} \qquad (18)$$

where $[P]$ is the potential coefficients matrix (the inverse of the capacitance matrix per unit length).

the ac currents in the hv power system are represented in matrix form as follows:

$$[I] = [Y]\,[V] \qquad (19)$$

the resultant matrix of shunt admittance per unit length for the three-phase system with the ground wires and aerial metallic pipeline is given by the following system of equations [18,23]:

$$\begin{bmatrix} I_c \\ I_p \\ I_g \end{bmatrix} = \begin{bmatrix} Y_c & Y_{cp} & Y_{cg} \\ Y_{pc} & Y_p & Y_{pg} \\ Y_{gc} & Y_{gp} & Y_g \end{bmatrix} \begin{bmatrix} V_c \\ V_p \\ V_g \end{bmatrix} \qquad (20)$$

where the subscripts 'c', 'p' and 'g' represent the three phase conductors, the pipeline and the ground wires, respectively. the matrix can be reduced by eliminating the earthed ground wires, setting the current $I_g = 0$ in equation (20), which gives [18]:

$$\begin{bmatrix} I_c \\ I_p \end{bmatrix} = \begin{bmatrix} Y'_c & Y'_{cp} \\ Y'_{pc} & Y'_p \end{bmatrix} \begin{bmatrix} V_c \\ V_p \end{bmatrix} \qquad (21)$$

with

$$Y'_c = Y_c - Y_{cg}\,Y_g^{-1}\,Y_{gc}, \qquad Y'_{cp} = Y_{cp} - Y_{cg}\,Y_g^{-1}\,Y_{gp},$$
$$Y'_{pc} = Y_{pc} - Y_{pg}\,Y_g^{-1}\,Y_{gc}, \qquad Y'_p = Y_p - Y_{pg}\,Y_g^{-1}\,Y_{gp} \qquad (22)$$

for an aerial insulated pipeline, substituting $I_p = 0$ in equation (21) yields the pipeline ac voltage to earth due to capacitive coupling with the hv power line, which is expressed by the equation below [18]:

$$[V_p] = -\,[Y'_p]^{-1}\,[Y'_{pc}]\,[V_c] \qquad (23)$$

where $V_c$ are the known three-phase voltages to earth of the hv overhead power line.

7. results and discussions

we consider in this study a 400 kv ac overhead power transmission line arranged in a single horizontal configuration, with an aerial isolated metallic pipeline in close proximity, under operating conditions, for two types of situations. the first is the simple situation (perfect parallelism between the hv overhead power line and the aerial metallic pipeline), and the second is the general situation (the complex case), as illustrated in fig. 5. the physical data and the geometric coordinates of the hv power line are shown in fig. 6.

fig. 5 representation of situation between hv overhead power line and aerial pipeline, (a) case of perfect parallelism, (b) case of general situation

fig. 6 single circuit overhead hv horizontal configuration with an aerial pipeline

at first, the artificial bee colony is applied in order to determine the optimal position and number of discrete fictitious charges and contour points used in the charge simulation method (csm), which makes it possible to obtain sufficient precision in the simulation. figure 7 shows the continuous decrease in the objective function value (of) given in equation (17) as a function of the number of iterations of this minimization algorithm.

the optimal values of the parameters versus iteration number are shown in fig. 8, where it becomes apparent that the optimization algorithm converges quickly to the optimal solutions. the indices 'pc', 'ew' and 'pl' represent the phase conductors, earth wires and metallic pipeline, respectively.

fig. 7 the optimization process of the objective function with the number of iterations

fig. 8 convergence of search parameters towards the optimal values

in the case of perfect parallelism, fig. 9 shows the lateral profile of the electric field strength at the metallic pipeline surface. it is clear from the graph that the presence of the metallic pipeline disturbs the electric field distribution: the electric field increases significantly in the zone where the pipeline is located, because of the induced electric charges accumulated on the pipeline surface.

fig. 9 perturbed electric field strength on the metallic pipeline surface

the perturbed intensity of the electric field on the metallic pipeline's surface, for varying separation distances from the hv power line center, is shown in fig. 10. it can be observed that the electric field strength has a low value under the middle phase conductor and then increases progressively to a maximum value at a critical location of the pipeline. beyond this point, it declines rapidly as one moves away from the hv power line center. as a result, the electric field strength on the metallic pipeline surface is effectively minimized when the pipeline is located as far away from the power line center as possible.

fig. 10 perturbed electric field strength as a function of the pipeline separation distance

fig. 11 induced voltage values as a function of the pipeline separation distance from the power line center

fig.
11 presents the induced voltage values on the metallic pipeline's surface as a function of its separation distance. it is evident from this figure that the induced voltage profile is generally similar to that of the electric field. from the mid-point of the hv power line, the induced voltage rises until it reaches its maximum value, and then declines gradually as the pipeline separation distance from the mid-point of the hv power line increases. it is strongly recommended that the pipeline be kept at a distance, called the critical distance, where the induced voltage is very close to zero.

fig. 12 strength of shock current flowing through the human body

under normal operating conditions, the shock current due to capacitive coupling in a worker who accidentally touches the metallic pipeline, sited at different distances from the hv power line center, is shown in fig. 12. it is important to note that the current intensity is directly proportional to the ac induced voltage level: a high induced voltage on the metallic pipeline results in a high shock current on contact with the pipeline.

fig. 13 calculation of the earthing resistance of metallic pipeline

for shock current values greater than the safety limit for operating personnel (the limit value advised by the cigre guide is 10 ma), a concrete preventive measure must be applied: it is enough to connect the metallic pipeline to the ground through an appropriate resistance calculated in accordance with equation (9) above. the earthing resistance of the metallic pipeline as a function of its separation distance from the hv power line center is presented in fig. 13. as can be noted from this figure, the earthing resistance profile varies inversely with that of the electric shock current.

in general situation, as shown in fig.
(5-b) above, the zone of influence of the circuit (hv overhead power line and metallic pipeline) is divided into parallel sections so that the geometrical condition of equation (11) is respected. the separation distances (power line to pipeline) and lengths of parallelism calculated for each section from the longitudinal and transverse coordinates are presented in table 1.

table 1 dimensions of the sections of the zone of influence: hv power line - metallic pipeline
the columns are: point coordinates (x, y) in m; ratio $r_a = y_i / y_{i+1}$; separation distance $y_{eq} = \sqrt{y_i\,y_{i+1}}$ in m; equivalent length $l_{eq} = \sqrt{(x_{i+1} - x_i)^2 + (y_{i+1} - y_i)^2}$ in m.
x (m) | y (m) | ratio | separation distance (m) | equivalent length (m)
0 | 100 | / | / | /
500 | 70 | 1.43 | 83.67 | 500.9
1000 | 40 | 1.75 | 52.91 | 500.9
2000 | 40 | 1 | 40 | 1000
3000 | 40 | 1 | 40 | 1000
4000 | 40 | 1 | 40 | 1000
5000 | 40 | 1 | 40 | 1000
6000 | 40 | 1 | 40 | 1000
6050 | 20 | 2 | 28.28 | 53.85
6075 | 10 | 2 | 14.14 | 26.93
6190 | -10 | -1 | 6 | 116.73
6280 | -20 | 0.5 | 14.14 | 90.55
6550 | -50 | 0.4 | 31.62 | 271.66
7000 | -100 | 0.5 | 70.71 | 452.77

fig. 14 illustrates the variation of the ac induced voltage in each section of the zone of influence as a function of the equivalent length of parallelism. it can be noted from this figure that the magnitude of the ac induced voltage increases as the separation distance between the metallic pipeline and the hv overhead power line decreases, and remains constant where the separation is constant. as the aerial pipeline approaches the hv power line, the ac induced voltage reaches a maximum value, and it then decreases where the metallic pipeline crosses the hv power line. as the metallic pipeline moves away from the hv overhead power line, the ac induced voltage again reaches its maximum value at the same separation distance. in addition, the further we get from the hv overhead power line, the more the induced voltage decreases, down to very low values. it can be concluded that the magnitude of the ac induced voltage depends directly on the separation distance between the metallic pipeline and the hv overhead power line, and is significantly influenced by the equivalent length of exposure along the area of influence.

fig. 14 illustration of ac induced voltage along the metallic pipeline in complex case

fig. 15 3-d representation of the ac induced voltage along the metallic pipeline in complex case

fig. 15 describes the three-dimensional (3-d) variation of the average total pipeline ac voltage to earth as a function of the equivalent length of parallelism and the pipeline separation distance. it is evident from this figure that the ac induced voltage on the metallic pipeline is directly proportional to the equivalent length of exposure; on the other hand, it is inversely proportional to the separation distance between the hv overhead power line and the metallic pipeline.
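the section geometry listed in table 1 and the length-weighted average of equation (12) can be reproduced with a few lines of code. this is a minimal sketch with hypothetical function names; it assumes that both end points of a section lie on the same side of the power line, so the crossing section (which the table treats with a separate value) is deliberately rejected rather than guessed, and the voltages passed to the averaging function in the usage lines are arbitrary illustrative numbers, not results from the paper.

```python
import math


def section_geometry(p_start, p_end):
    """Return (ratio, equivalent distance, equivalent length) for one oblique section."""
    (x1, y1), (x2, y2) = p_start, p_end
    if y1 * y2 <= 0:
        # the crossing section is handled separately in the table; not covered here
        raise ValueError("end points must lie on the same side of the power line")
    ratio = y1 / y2                          # ratio of the two end separations
    y_eq = math.sqrt(y1 * y2)                # geometric mean separation, eq. (10)
    length = math.hypot(x2 - x1, y2 - y1)    # length of the oblique section
    return ratio, y_eq, length


def ratio_ok(ratio):
    """Geometrical condition of eq. (11)."""
    return 1.0 / 3.0 <= ratio <= 3.0


def weighted_voltage(voltages, lengths):
    """Length-weighted average of the section voltages, eq. (12)."""
    total = sum(lengths)
    return sum(v * l for v, l in zip(voltages, lengths)) / total


# first section of table 1: from (0, 100) to (500, 70)
r, y_eq, l_eq = section_geometry((0.0, 100.0), (500.0, 70.0))
print(f"{r:.2f} {y_eq:.2f} {l_eq:.1f}")   # -> 1.43 83.67 500.9
print(ratio_ok(r))                         # -> True

# arbitrary illustrative section voltages and lengths
print(weighted_voltage([2.0, 4.0], [1.0, 3.0]))   # -> 3.5
```

splitting the route until every section satisfies `ratio_ok` is what keeps the geometric-mean approximation acceptable for each piece.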
the last step is devoted to validating this modelling by comparing the simulation results obtained by the combined approach (csm + abc) with those obtained by the admittance matrix analysis. this simple method is strongly recommended for calculating the ac induced voltage on the metallic pipeline from the hv overhead power line, as it provides a fast simulation tool. in fig. 16, we see a good agreement between the values of the induced voltage. this comparison confirms the obtained results and validates the effectiveness and accuracy of the adopted approach. it is important to note that validating the simple case of perfect parallelism is sufficient to validate the complex case, since the complex case is a series combination of simple cases of parallelism.

fig. 16 comparison of the ac induced voltage values by the two calculation methods (horizontal axis: pipeline position from power line center [m]; vertical axis: induced voltage [kv]; curves: csm+abc and admittance matrix analysis)

8. conclusion

in this study, an improved method is used to evaluate the capacitive coupling between the hv overhead power line and an aerial metallic pipeline, based on the charge simulation method (csm) combined with the artificial bee colony (abc) algorithm. the results show that the perturbed electric field on the metallic pipeline, for different separation distances from the hv power line center, has a low value at the hv power line center, increases to reach its peak value, and then decreases progressively as one moves away from this center. the ac induced voltage on the metallic pipeline is directly proportional to the electric field; its graphic representation is similar to that of the electric field.
generally, if the shock current in a human body touching the metallic pipeline exceeds the authorized limit, the metallic pipeline must be earthed through an adequate resistance. this protection and mitigation measure is compulsory in order to reduce the ac induced voltage to levels that are safe for personnel touching the metallic pipeline. in the general situation, the magnitude of the ac induced voltage on the metallic pipeline is considerably influenced by the distance separating the hv power line and the metallic pipeline, and also by the equivalent length of parallel exposure. the performance of the coupled method (csm + abc) is confirmed by the comparison between its results and those obtained by the admittance matrix analysis, which shows a good agreement.

references
[1] cigre, electric and magnetic fields produced by transmission systems, description of phenomena practical guide for calculation, wg 36-01, paris 1980.
[2] bs en 50443, effects of electromagnetic interference on pipelines caused by high voltage a.c. railway systems and/or high voltage a.c. power supply systems, cenelec, 2009.
[3] h. m. ismail, "effect of oil pipelines existing in an hvtl corridor on the electric-field distribution", ieee trans. power deliv., vol. 22, no. 4, pp. 2466-2471, 2007.
[4] icnirp, "international commission on non-ionizing radiation protection, guidelines for limiting exposure to time-varying electric and magnetic fields (1hz to 100 khz)", health physics, vol. 99, no. 6, pp. 818-836, 2010.
[5] r. djekidel and d. mahi, "calculation and analysis of inductive coupling effects for hv transmission lines on aerial pipelines", przegląd elektrotechniczny, vol. 90, no. 9, pp. 151-156, 2014.
[6] iarc, non-ionizing radiation, part 1: static and extremely low-frequency (elf) electric and magnetic fields.
iarc monographs on the evaluation of carcinogenic risks to humans, vol. 80, pp. 1-395, 2002. [7] d. karaboga and b. basturk, "a powerful and efficient algorithm for numerical function optimization: artificial bee colony (abc) algorithm", j. glob. optim., vol. 39, pp. 459-471, 2007. [8] d. karaboga and b. akay, "a comparative study of artificial bee colony algorithm", appl. math. comput., vol. 244, no. 1, pp. 108-132, 2009. [9] r. djekidel, a. ameur, d. mahi and a. hadjadj, "electrostatic interference calculation from hv power lines to aerial pipelines using hybrid pso – csm approach", in proceedings of the 9th jordanian international electrical and electronics engineering conference (jieeec), jordan, 2015, pp. 1-6. [10] d. rabah, h. a. chafik and s. a. bessedik, "electrostatic and electromagnetic effects of hv overhead power line on above metallic pipeline", in proceedings of the 5th international conference on electrical engineering – boumerdes (icee-b), boumerdes, october 2017, pp. 1-6. [11] n.h. malik, "a review of the charge simulation method and its applications", ieee trans. electr. insul., vol. 24, no. 1, pp. 3-20, 1989. [12] r. djekidel and d. mahi, "capacitive interferences modeling and optimization between hv power lines and aerial pipelines", int. j. electr. comput. eng., vol. 4, no. 4, pp. 486, 2014. [13] d. himadri, "implementation of basic charge configurations to charge simulation method for electric field calculations", international journal of advanced research in electrical, electronics and instrumentation engineering, vol. 3, no. 5, pp. 9607-9611, may 2014. [14] r. m. radwn and m. m. samy, "calculation of electric fields underneath six phase transmission lines", j. electr. syst., vol. 12, no. 4, pp. 839-851, 2016. [15] m. samy and a. emam, "induced pipeline voltage near-by hybrid transmission lines", innovative systems design and engineering, vol. 8, no. 3, pp. 31-40, 2017. [16] r. djekidel, s. bessedik and s. 
akef, "3d modelling and simulation analysis of electric field under hv overhead line using improved optimisation method", iet science, measurement & technology, 2020.
[17] r. djekidel, s.a. bessedik and a. hadjadj, "electric field modeling and analysis of ehv power line using improved calculation method", fu elec. energ., vol. 31, no. 3, pp. 425-445, 2018.
[18] n. d. tleis, power systems modeling and fault analysis. theory and practice. elsevier, 2008.
[19] technical specification, basic safety publication: effects of current on human beings and livestock, iec, ts 60479-1, fourth edition 2005-07.
[20] m.a. el-sharkawi, electric safety: practice and standards. university of washington, crc press book. taylor & francis group, llc, 2014.
[21] v. r. nayak and g. m. kakandikar, artificial bee colony algorithm. department of mechanical engineering, zeal education society's, dnyanganga college of engineering and research, pune 41, 2014-2015.
[22] b. kumar and d. kumar, "a review on artificial bee colony algorithm", int. j. eng. technol., vol. 2, no. 3, pp. 175-186, 2013.
[23] m. h. shwehdi, m. a. alaqil and s. r. mohamed, "emf analysis for a 380kv transmission ohl in the vicinity of buried pipelines", ieee access, vol. 8, pp. 3710-3717, 2020.
facta universitatis series: electronics and energetics vol. 32, no 1, march 2019, pp. 51-63 https://doi.org/10.2298/fuee1901051d

reduction of susceptibility from electromagnetic interference in sensorless foc of ipmsm *

lindita dhamo 1, aida spahiu 1, mitja nemec 2, vanja ambrozic 2
1 polytechnic university of tirana, faculty of electrical engineering, tirana, albania
2 university of ljubljana, faculty of electrical engineering, ljubljana, slovenia

abstract: this paper presents the main problems in the practical implementation of field oriented control (foc) developed for an interior permanent magnet synchronous motor (ipmsm). the main sources of electromagnetic interference (emi) noise are discussed, and practical aspects of using a position sensor are presented. the control system is based on a dsp processing unit, together with an inverter and an encoder. the main problem addressed in this paper is the reduction of vibrations in the torque and speed response of a real system by replacing a hardware device of the control system that is very susceptible to emi noise, the encoder, with a soft block in the control unit, a sliding mode observer, which is less sensitive to emi. the experimental results with this control structure show a considerable ripple reduction at steady state in torque, speed and current, as a consequence of the reduced sensitivity to emi noise.

key words: emi, ipmsm, sensorless foc.

1.
introduction

the pmsm has become truly competitive with the induction motor in terms of lifetime cost. this motor has recently become quite attractive thanks to its many advantages, because magnets, instead of windings, are used for rotor magnetization [2]. the phase inductance of a pmsm is lower than that of an induction motor. thus, the effect of electromagnetic noise is greater in a pmsm than in an induction motor [3]. electromagnetic interference (emi), the appropriate term when referring to lower frequencies, or radio frequency interference (rfi), the appropriate term when referring to higher frequencies, is unwanted electrical noise that can interfere with signaling or communication equipment. drives with a switching frequency of 8 khz or higher have many harmonic frequencies, which produce problematic emissions affecting sensitive equipment.

received february 3, 2018; received in revised form october 12, 2018
corresponding author: aida spahiu, polytechnic university of tirana, faculty of electrical engineering, sheshi “nene tereza” nr.4, 1000, tirana, albania (e-mail: aida.spahiu@fie.upt.al)
* an earlier version of this paper was presented at the 13th international conference on applied electromagnetics (пес 2017), august 30 - september 01, 2017, in niš, serbia [1].

reducing the pwm carrier frequency reduces these effects and lowers the risk of common-mode noise interference. higher carrier frequencies are less efficient for the drive, but lower carrier frequencies are less efficient for the motor. in general, confining electrical noise as close to the noise source as possible is the best way to protect sensitive devices from emi. there are many studies on the reduction of emi in ac drive systems. random pwm techniques have been developed to suppress emi in power converters [4]–[8], and it has been shown that this method can reduce acoustic noise and mechanical vibration.
random pwm is carried out in various ways, such as random switching frequency, random pulse position and random switching techniques. it was shown that acoustic noise and emi are suppressed by using a random pwm technique in the svpwm algorithm [9]–[10]. methods with a varying switching frequency, like random or chaotic pwm, are generally applied to induction motors. a chaotic signal is obtained more easily than a random signal and is also simpler to apply [11]. various techniques are available and discussed in the literature, such as chaotic sinusoidal pwm, chaotic pulse position pwm, hybrid chaotic spwm and chaotic sv-pwm methods [12]–[14]. emi can have adverse effects on electrical components in the motor control panel, contributing to a loss of serial communication, nuisance drive trips and disturbance of control signals. emi not only degrades the performance of electrical equipment but also decreases the lifetime of components and increases the cost of equipment maintenance. this paper deals with electromagnetic interference (emi) and its prevention through the design of the control system. it presents the case in which the device sensitive to emi, the noise receiver, is replaced with a soft block in the control scheme that is less sensitive to emi, in order to reduce the negative effect of the emi propagating in the system.

2. mathematical model of an ipmsm with system uncertainties

2.1 dynamic model of an ipmsm

applying kirchhoff’s voltage law (kvl) to the dq-axis equivalent circuits of a three-phase ipmsm yields the following voltage equations in the synchronously rotating d-q reference frame:

$$v_{qs} = R_s i_{qs} + L_{qs}\,\dot{i}_{qs} + \omega L_{ds} i_{ds} + \omega \lambda_m \tag{1}$$

$$v_{ds} = R_s i_{ds} + L_{ds}\,\dot{i}_{ds} - \omega L_{qs} i_{qs} \tag{2}$$

where $v_{ds}$ and $v_{qs}$ are the dq-axis voltages, $i_{ds}$ and $i_{qs}$ are the dq-axis currents, $R_s$ is the stator resistance, $L_{ds}$ and $L_{qs}$ are the dq-axis inductances, $\omega$ is the electrical rotor speed, and $\lambda_m$ is the magnetic flux.
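as a minimal sketch of how (1) and (2) can be evaluated numerically, the python fragment below solves the two voltage equations for the current derivatives and also gives the steady-state voltages obtained when the derivative terms are zero. the parameter values are those listed in table 1 of this paper; the names `IpmsmParams`, `current_derivatives` and `steady_state_voltages` are hypothetical and are not part of the authors' dsp software.

```python
from dataclasses import dataclass


@dataclass
class IpmsmParams:
    # values from table 1 of this paper, converted to si units
    rs: float = 0.06        # stator resistance [ohm]
    lds: float = 0.068e-3   # d-axis inductance [H]
    lqs: float = 0.086e-3   # q-axis inductance [H]
    lam_m: float = 0.0373   # magnet flux linkage [Wb]


def current_derivatives(p, v_ds, v_qs, i_ds, i_qs, omega):
    """solve (1) and (2) for di_ds/dt and di_qs/dt; omega is the electrical speed [rad/s]."""
    di_qs = (v_qs - p.rs * i_qs - omega * p.lds * i_ds - omega * p.lam_m) / p.lqs
    di_ds = (v_ds - p.rs * i_ds + omega * p.lqs * i_qs) / p.lds
    return di_ds, di_qs


def steady_state_voltages(p, i_ds, i_qs, omega):
    """dq voltages that keep the currents constant (derivative terms of (1), (2) set to zero)."""
    v_qs = p.rs * i_qs + omega * p.lds * i_ds + omega * p.lam_m
    v_ds = p.rs * i_ds - omega * p.lqs * i_qs
    return v_ds, v_qs
```

feeding the output of `steady_state_voltages` back into `current_derivatives` at the same operating point should return derivatives close to zero, which is a quick consistency check on the signs of the cross-coupling terms.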
in addition, the electromagnetic torque can be obtained from the following electrical and mechanical equations:

$$T_e = \frac{3}{2}\,\frac{p}{2}\left[\lambda_m i_{qs} + (L_{ds} - L_{qs})\, i_{ds} i_{qs}\right] \tag{3}$$

$$T_e - T_L = \frac{2}{p} B\,\omega + \frac{2}{p} J\,\dot{\omega} \tag{4}$$

where $T_e$ and $T_L$ are the electromagnetic and load torques, $p$ is the number of pole pairs, $B$ is the viscous friction coefficient, and $J$ is the rotor inertia. substituting (3) into (4) yields the following speed dynamic equation:

$$\dot{\omega} = \frac{3}{2}\,\frac{p^2}{4}\,\frac{\lambda_m}{J}\, i_{qs} - \frac{B}{J}\,\omega - \frac{p}{2J}\, T_L + \frac{3}{2}\,\frac{p^2}{4}\,\frac{L_{ds} - L_{qs}}{J}\, i_{ds} i_{qs} \tag{5}$$

2.2 the extraction of rotor position

the rotor position is extracted from the αβ back-emf components using the inverse tangent method. in this method the rotor position angle is determined as follows (the sign of the argument follows the sign convention adopted for the back-emf components):

$$\hat{\theta} = \tan^{-1}\!\left(\frac{\hat{e}_\alpha}{\hat{e}_\beta}\right) \tag{6}$$

however, the position calculated by this method depends on the quality of the estimated back emf. because of the low sampling frequency, the estimated back emf has both phase and magnitude shifts, which bring oscillations and a phase shift to the estimated position. in order to mitigate the oscillation of the estimated position, an estimated-speed feedback algorithm is used to improve the inverse tangent method for position calculation, as shown in fig. 2 and expressed by (7):

$$\hat{\theta}_2[k] = \hat{\theta}[k-1] + \omega[k-1]\, T_s \tag{7}$$

the block diagram of the algorithm for improving the inverse tangent method for rotor position calculation is shown in figure 2. the rotor position is selected by comparing two candidates for the k-th time step: $\hat{\theta}_1[k]$, which is obtained from the smo, and $\hat{\theta}_2[k]$, which has been calculated at the end of the (k-1)-th time step. the error $\varepsilon[k]$ between $\hat{\theta}_1[k]$ and $\hat{\theta}_2[k]$ is calculated as their difference at the beginning of the k-th time step.
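the position calculation of (6) and (7), together with the threshold comparison between the two candidates that the text states next, can be sketched in a few lines of python. this is an illustration under stated assumptions, not the authors' dsp code: the function names `wrap_angle` and `select_position`, the `margin` argument, the `atan2` argument order and the wrapping of the angle difference to [-pi, pi) are all assumptions introduced here.

```python
import math


def wrap_angle(theta):
    """wrap an angle to [-pi, pi) so that angle differences are compared correctly."""
    return (theta + math.pi) % (2.0 * math.pi) - math.pi


def select_position(e_alpha, e_beta, theta_prev, omega_prev, ts, margin):
    """return the selected rotor position and the error between the two candidates."""
    # candidate 1: inverse tangent of the estimated back-emf components, as in (6)
    # (argument order and sign are an assumed convention)
    theta_1 = math.atan2(e_alpha, e_beta)
    # candidate 2: previous position advanced by the previous speed, as in (7)
    theta_2 = wrap_angle(theta_prev + omega_prev * ts)
    # error between the candidates, wrapped so that 2*pi jumps do not count as errors
    err = abs(wrap_angle(theta_1 - theta_2))
    # accept the observer-based angle only if it is close to the prediction
    theta = theta_1 if err < margin else theta_2
    return theta, err
```

with a sample time of 0.00015 s (the value printed on the time axes of the experimental figures), a back-emf pair that is consistent with the predicted angle is accepted, while an outlier falls back to the prediction from the previous step.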
if the generated error $\varepsilon[k]$ is smaller than the predetermined position error margin, then $\hat{\theta}[k] = \hat{\theta}_1[k]$; otherwise, $\hat{\theta}[k] = \hat{\theta}_2[k]$. in implementation, this method of extracting the rotor position has shown good speed-control performance for the ipmsm, and the oscillations in the estimated rotor position are mitigated.

[figure: blocks "atan2", "compensation of phase", "delay" and "selection of rotor position"; signals $\hat{e}_\alpha[k]$, $\hat{e}_\beta[k]$, $\hat{\theta}_1[k]$, $\hat{\theta}_2[k]$, $\hat{\theta}[k]$, $\hat{\theta}[k-1]$, $\omega[k-1]$, $T_s$]
fig. 2 block diagram to improve the inverse tangent method for position calculation.

3. emi noises and effects

electromagnetic compatibility (emc) has grown in importance at such a rapid pace because of the increasing frequency and use of digital electronics in today’s world and the virtually worldwide imposition of governmental limits on the radiated and conducted noise emissions of digital electronic products [15]. there are three ways to prevent interference: suppress the emission at its source, make the coupling path as inefficient as possible, and make the receptor less susceptible to the emission. although all three alternatives should be kept in mind, the “line of defense” in this work is to make the receptor less susceptible to the emission. the paper shows the effect of replacing a device of the control system (the absolute encoder) with a soft block (a sliding mode observer) in the control scheme, in order to reduce the disturbances caused by emi. the experimental results confirm the effectiveness of sensorless field oriented control of the ipmsm with a sliding mode observer in decreasing the sensitivity of the control system to emi noise.

3.1. emi noise transmission path

each type of interference problem includes a source, a receptor (the victim that suffers from the emi noise), and a transmission path between the source and the receptor. conducted emi is defined as interference that uses conductors as a path from a source to a receptor.
for example, a motor encoder grounded to a noisy connection would conduct noise to the drive encoder interface. the conducted noise could cause the drive encoder interface to receive inexact voltage signals, preventing the motor drive from reading the rotor position and speed correctly and thus causing drive faults. at first, it may be supposed that the root cause of the drive malfunction is an incorrect parameter setting or possibly a faulty drive interface board. closer inspection reveals the culprit to be poor grounding of the encoder cable. radiated emi is defined as interference that uses a wireless path from a source to the receptor. this is commonly seen in motor control panels where ac motor wires are laid in parallel next to low-voltage control wiring. the result is coupling between the wires, causing disturbances on the data transmission line. for example, if the motor wires were laid in close proximity to a serial link between the motor controller and the drive, the coupling of the signals may corrupt the data packets being transferred between the controller and the drive.

3.2. emi noise sources

the motor drive system (mds) in industrial applications has become a new noise source because its switching frequency, operating voltage and current variations have increased, causing unwanted effects such as common-mode (cm) noise and electromagnetic interference (emi) [16]. hence, the analysis of the noise propagation paths is necessary for understanding and improving the system reliability. noise propagation paths are mainly composed of an inverter, a three-phase cable, a ball bearing, an electric motor, and multiple ground nodes. in particular, the electric motor is an active electrical load of the inverter as well as the mechanical power source of the vehicle. therefore, unwanted current flows to the whole vehicle body through the electric motor by capacitive coupling in both the electric components and the mechanical parts.
the emc of electronic circuits is to a great extent determined by the way the components are laid out and interconnected. signal lines with their corresponding return lines form an antenna, which is able to radiate electromagnetic energy; the magnitude is determined by the current amplitude, the frequency and the geometrical area of the current loops. there are three typical sources of emi: power supply lines, signal lines carrying high frequencies, and oscillator circuits. an important source of electromagnetic interference noise is crosstalk. this essentially refers to the unintended electromagnetic coupling between wires and pcb lands that are in close proximity. crosstalk is distinguished from antenna coupling in that it is a near-field coupling problem. crosstalk between wires in cables or between lands on pcbs concerns the intrasystem interference performance of the product; that is, the source of the electromagnetic emission and the receptor of this emission are within the same system. thus this reflects the third concern in emc: the design of the product such that it does not interfere with itself. with clock speeds and data transfer rates in digital control systems steadily increasing, crosstalk between lands on pcbs is becoming a significant mechanism for interference in modern digital systems.

3.3. receptors of emi

in a real digital control system, there are several devices sensitive to emi, like encoders, tachometers, analog signals and measurement devices, communication networks and devices, microprocessor devices etc. each of them shows specific symptoms when affected by emi noise. for encoders, the symptoms may include encoder counts jumping around at standstill and non-repeatable positioning when moving. for tachometers, they may include incorrect speed reporting or unexpected speed fluctuations.
for analog signals and measurement devices, the symptoms may include unexpected voltage spikes, ripple, or jitter on the analog signal, causing incorrect and non-repeatable readings. for communication networks and devices, they almost always include loss of communication or errors in reading or writing data. for microprocessor devices, they can include loss of communications, faults or failure in the processor, digital inputs or outputs triggering unexpectedly, and analog inputs or outputs reporting an incorrect value. the devices listed above are all very important and irreplaceable, except the encoder. in the sensorless control system that we have developed, the elimination of one of the receivers most sensitive to emi noise significantly reduces the negative effects, like ripple in analog signals: torque, speed and current. in this paper, the effect of replacing the encoder with an observer of sliding mode type is investigated.

4. ipmsm sensorless control

4.1. control unit

the control system for sensorless foc of the ipmsm with a sliding mode observer, developed in this study, is composed of three main blocks: the control unit, the power module and the measurement unit. the control unit is based upon a piccolo f28069 controlstick dsp by ti [17]. it consists of an adc converter, pwm channels and a floating-point central processing unit. the stator windings of the ipmsm are supplied from a conventional 3-phase power module made up of 6 mosfets, operated as switches for break control.

4.2. measurement unit

the measurement unit is a decisive part of the closed-loop control system, and the encoder has a crucial role since the performance of foc depends directly on accurate rotor position information. in this study, only the effect of the encoder on emi noise is considered. the idea has been realized through a soft block added to the control unit block.
instead of the absolute encoder, a sliding mode observer is designed to calculate the rotor position and rotor speed needed by the foc algorithm. the results for the important quantities of the control system are then compared.

4.3. modular philosophy of digital motor control

although it is a standardized platform, the modular ti piccolo f28069 controlstick dsp provides a smooth way for customers to quickly port the reference software to customized hardware. ti’s modular philosophy, which clearly separates modules into cpu-dependent and peripheral-dependent (driver) categories, greatly simplifies the porting process. the ipmsm speed controller, together with the speed calculator that works from position information, is the appropriate partitioning point in this system due to its complexity and reusability. this modular philosophy of ti’s platforms has encouraged and allowed us to develop and modify the standard dmc system into a sensorless one. figure 1 shows an overall block diagram of the proposed observer-based nonlinear sliding mode speed control system.

[figure: block diagram with ac/dc converter, dc bus, 3-phase inverter, gate drivers, power supplies, analog conditioning and motor; inside the piccolo f28069: 12-bit adc, epwm module, eqep, serial interface, gpio, svpwm, phase current reconstruction, bus over-voltage protection, pi, foc, speed calculator, observer smo and speed estimator; signals: reference speed, actual speed, torque reference, iq_ref, id_ref, ualpha, ubeta, ialpha, ibeta, angle]
fig. 1 control scheme of sensorless smo ipmsm drive.

the blocks “speed estimator” and “observer smo” are added to the existing control scheme in order to calculate the rotor position and rotor speed from the stator voltages and currents, which are digitized and transformed by the clarke transformation (to uα, uβ, iα and iβ). this removes the need for the encoder, which otherwise supplies very important information, the rotor position.
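the clarke transformation mentioned above can be illustrated with a short python sketch for the case where only two phase currents are measured, as in the experimental setup of this paper. the amplitude-invariant form of the transformation and the assumption that the three phase currents sum to zero (no neutral connection) are choices made here for illustration; the function name `clarke_two_phase` is hypothetical and does not come from the ti software.

```python
import math


def clarke_two_phase(i_a, i_b):
    """amplitude-invariant clarke transformation from two measured phase currents.

    assumes i_a + i_b + i_c = 0, so the third phase current is not needed.
    """
    i_alpha = i_a
    i_beta = (i_a + 2.0 * i_b) / math.sqrt(3.0)
    return i_alpha, i_beta
```

for a balanced set of unit-amplitude phase currents, `i_a = cos(theta)` and `i_b = cos(theta - 2*pi/3)`, the function returns `(cos(theta), sin(theta))`, i.e. a vector of the same amplitude rotating in the stationary αβ frame.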
by means of a soft key, the control scheme provides sensored or sensorless operation, so that experimental results for quantities like torque, speed and currents can be compared and analyzed.

5. experimental setup

in order to verify the performance and effectiveness in emi noise reduction of the proposed observer-based nonlinear sliding mode controller, experiments are carried out with a prototype ipmsm drive system based on a piccolo f28069 controlstick dsp. figure 2 shows the experimental setup of the sensorless smo ipmsm drive. the hardware circuit consists of an ipmsm, a product of the slovenian company mahle-letrika dedicated to electric power steering systems, a three-phase inverter with 6 mosfets (irfp4410), a control board with an f28069 controlstick dsp (floating-point), an absolute optical encoder (hengstler ad35, 22 bit), two hall-effect current sensors (lts15np), and a pmsm motor as load in a back-to-back configuration. table 1 shows the parameters of the ipmsm used in the experiment. the dc-link voltage (295 vdc) is obtained from the utility (ac 230v/50hz) using a single-phase full-bridge rectifier. the two phase currents (ia, ib) are measured by the lts15np hall sensors and then converted into digital form using two 12-bit a/d converters. in addition, the rotor position (θ), which is used to execute the coordinate transformation for foc, is measured by the absolute encoder and fed to the texas instruments piccolo f28069 controlstick dsp via a 32-bit qep. note that the rotor speed (ω) required to perform the feedback control can be easily obtained by differentiating θ with respect to time.

table 1 ipmsm parameters.

parameter             symbol   unit     value
rated power           P_n      W        600
rated speed           ω_n      rpm      1250
stator resistance     R_s      Ω        0.06
d-axis inductance     L_d      mH       0.068
q-axis inductance     L_q      mH       0.086
total linkage flux    λ_pm     Wb       0.0373
pole pairs            p        -        3
inertia               J        kg·m²    0.0001682

fig.
2 experimental setup of sensorless smo of ipmsm drive.

6. experimental results and discussions

a variety of experiments have been performed. results for sensor and sensorless mode are compared in order to evaluate the emi noise reduction obtained by replacing the absolute encoder with a sliding mode observer. since emi noise is due to many complex and coupled factors, the effect of removing only the position sensor is checked in all electrical and mechanical quantities that are important for the quality of control, like torque, speed, and currents. the nonlinear control used in our experiments, sliding mode control, suffers from chattering when implemented in real-time control of an ipmsm drive. the chattering is superimposed on the speed and torque ripples, which worsens the situation. however, the encoder is the hardware part of the drive most susceptible to emi, and replacing that hardware with software, the smo, reduces the possibility of emi affecting the drive operation. experimentally, the torque ripples are reduced by up to 50%. furthermore, the existence of a non-observable zone at very low motor speeds is a weak point of operation for the ipmsm drive. the results for the speed response in steady state are therefore taken in two different regimes: rated speed and low speed, 15 rev/s and 3.5 rev/s, respectively. the figures are presented at an appropriate scale to compare the amplitudes of the ripples for both sensor and sensorless operation (the reference signal is shown for speed). the experimental results show that sensorless control exhibits less ripple in the electromagnetic torque.

[figure: torque comparison [nm] versus time [samples]; traces: torque sensored, torque sensorless]
fig. 3 experimental results for comparison of electromagnetic torque with sensor and sensorless control.
from figure 3 it is clear that the electromagnetic torque during sensorless operation is more stable and has less ripple. the ripple amplitude of the torque during sensorless operation is reduced by up to 50% of the ripple amplitude of the torque during sensor operation. this is not an isolated fact that occurs accidentally. replacing the encoder with the soft block smo directly confronts one of the receivers of emi noise, i.e. the position sensor. in general, the “first line of defense” is to suppress the emission as much as possible at the source, but it is a valid strategy to make the control system “deaf” to a part of the emi noise.

[figure: speed comparison versus time [samples]; traces: speed calculated from encoder, speed estimated from smo, speed reference]
fig. 4 experimental results for comparison of speed response in steady state with sensor and sensorless control during rated speed regime.

figure 4 shows the experimental results for the speed response at steady state for both sensor and sensorless operation near rated speed. the results show very clearly that sensorless operation provides smooth speed control, almost equal to the reference speed. compared with the speed response of sensor operation, the accuracy of speed estimation during sensorless operation is very high and the speed error is approximately zero. so, from the emi noise point of view, the rotor speed shows a great benefit of using a sensorless scheme for vector control of the ipmsm. another important quantity for the field oriented control algorithm is the direct current id. in order to verify the validity of our strategy, we have to check the effect on other important quantities in field oriented control, like rotor speed and direct current id.
the currents id and iq are variables calculated by the clarke and park vector transformations of the digitized real currents flowing into the stator of the ipmsm, sensed with hall-effect sensors and digitized with the adc converter. the direct current id is a flux-producing component that, during execution of the foc algorithm, is forced to zero in order to achieve the maximum torque production for a given stator current. because of its importance, the results for the current id during sensor and sensorless operation are put together in figure 5, where it is clearly shown that the ripple amplitude of the current id is reduced by 50% during sensorless operation.

[figure: id comparison [a] versus time [samples]; traces: id-sensored, id-sensorless]
fig. 5 experimental results showing comparison between direct current id in sensored and sensorless control.

sensorless drive systems based on state observers suffer from unobservability in the region of very low speed.

[figure: rotor speed comparison [rev/s] versus time [samples]; traces: reference speed, speed calculated from encoder, speed estimated from smo]
fig. 6 experimental results for comparison of speed response in steady state during very low speed regime with sensor and sensorless control.

since this is the lower boundary of functionality of the observer, where all quantities tend to be uncontrollable, it is useful to compare the speed responses for both sensor and sensorless operation. figure 6 shows a better behavior of the observer at very low speed than with the encoder. the speed fluctuations during sensorless operation are less than 50% of those in sensor operation. the transient response is quite important when analyzing the behavior of a device. figure 7 shows the transient response for torque during sensor and sensorless operation.
it is clearly shown that sensorless operation has a smaller overshoot and needs about half the time to stabilize.

[figure: torque transient response [nm] versus time [samples]; traces: with sensor, sensorless]
fig. 7 experimental results for comparison of transient response for torque during sensor and sensorless operation.

figure 8 shows the transient responses for speed during sensor and sensorless operation. it is clearly shown that sensorless operation has a smaller overshoot (approximately 7%) and needs almost the same time to stabilize.

[figure: transient response of rotor speed versus time [samples, 1 sample = 0.00015 s]; traces: calculated speed from encoder, estimated speed from smo, reference speed]
fig. 8 experimental results for comparison of transient response for rotor speed during sensor and sensorless operation.

fig. 9 shows the results for angle estimation during sensorless operation with the sliding mode observer. the accuracy of the angle estimation is crucial for control performance, because the angle estimated by the smo block is used to calculate the rotor speed and to perform the clarke and park vector transformations in order to generate the right voltage value in the svpwm block.

[figure: rotor position [rad] versus time [samples, 1 sample = 0.00015 s]; traces: angle from encoder, angle estimated with smo, error of estimated angle]
fig. 9 experimental results for rotor position estimation by smo and encoder.
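the ripple comparisons quoted for the steady-state traces (torque, speed and id) can be reproduced from logged samples with a simple metric. the paper does not state which ripple measure was used, so the peak-to-peak definition below is an assumption, and the function names `peak_to_peak` and `ripple_reduction_percent` are hypothetical; the sketch only shows how a statement such as "reduced by up to 50%" could be checked on two recorded traces.

```python
def peak_to_peak(samples):
    """peak-to-peak ripple of a steady-state trace (assumed ripple measure)."""
    return max(samples) - min(samples)


def ripple_reduction_percent(sensored, sensorless):
    """reduction of the peak-to-peak ripple, in percent of the sensored ripple."""
    ref = peak_to_peak(sensored)
    if ref == 0.0:
        raise ValueError("sensored trace has no ripple to compare against")
    return 100.0 * (ref - peak_to_peak(sensorless)) / ref
```

as a usage example with made-up numbers, a sensored trace swinging between -0.60 and -0.56 and a sensorless trace swinging between -0.59 and -0.57 give a reduction of about 50%. traces should be taken over the same steady-state window, since a peak-to-peak measure is sensitive to isolated spikes.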
as a summary of the experimental results, one may conclude that eliminating one of the receptors of emi noise (the encoder) makes the control system less susceptible to it. this strategy, although ranked third, is very useful in achieving a better overall performance of the control system. a careful review of all results confirms that the sensorless mode of operation has many benefits. this kind of nonlinear control (sliding mode) suffers from chattering when implemented in real-time control of ac drives, including ipmsm drives. the chattering is superimposed on the speed and torque ripples, which worsens the situation. however, the encoder is the hardware part of the drive most susceptible to emi, and replacing that hardware with software, the smo block, reduces the possibility of emi affecting the drive operation. experimentally, the torque ripples are reduced in total by up to 50%.

7. conclusions

the paper presents the main problems in the practical implementation of sensorless foc of an ipmsm. electromagnetic disturbances have a pernicious influence on the calculations performed in the control unit as well as on the operation of the absolute encoder. eliminating one of the devices that suffer from emi noise, by replacing it with a sliding mode observer, provides a noticeable improvement in the speed and torque response of the control system. this improvement is reflected in a decreased effect of emi, higher efficiency, less vibration, and better overall performance. different experiments were performed on sensor and sensorless foc, and measurements of currents, torque and rotor speed were compared.

acknowledgement: this paper presents a part of the work supported by the research program of erasmus mundus/basileus iv (2013-2014), in laboratories of department of mechatronics, faculty of electrical engineering, university of ljubljana, slovenia.

references

[1] l. dhamo, a. spahiu, m. nemec, v.
ambrozic, "electromagnetic interferation reduction by using sensorless foc of ipmsm with piccolof28069 controlstick", in proceedings of the extended abstracts of the 13th international conference on applied electromagnetics (пес 2017), niš, serbia, 2017, pp. 69.
[2] b. k. bose, "power electronics and motor drives: advances and trends". usa: elsevier, 2006.
[3] y. xu, q. yuan, j. zou, y. li, "analysis of triangular periodic carrier frequency modulation on reducing electromagnetic noise of permanent magnet synchronous motor", ieee trans. magn., vol. 48, no. 11, pp. 4424-4427, 2012.
[4] r. l. kirlin, s. kwok, s. legowski, a. m. trzynadlowski, "power spectra of a pwm inverter with randomized pulse position", ieee trans. power electron., vol. 9, no. 5, pp. 463-472, 1994.
[5] k. s. kim, y. g. jung, y. c. lim, "a new hybrid random pwm scheme", ieee trans. power electron., vol. 24, no. 1, pp. 192-200, 2009.
[6] s. kaboli, j. mahdavi, a. agah, "application of random pwm technique for reducing the conducted electromagnetic emissions in active filters", ieee trans. ind. electron., vol. 54, no. 4, pp. 2333-2343, 2007.
[7] a. m. hava, e. un, "performance analysis of reduced common-mode voltage pwm methods and comparison with standard pwm methods for three-phase voltage-source inverters", ieee trans. power electron., vol. 24, no. 1, pp. 241-252, 2009.
[8] y. c. lim, s. o. wi, j. n. kim, y. g. jung, "a pseudorandom carrier modulation scheme", ieee trans. power electron., vol. 25, no. 4, pp. 797-805, 2010.
[9] j.-y. chai, y.-h. ho, y.-c. chang, c.-m. liaw, "on acoustic-noise reduction control using random switching technique for switch mode rectifiers in pmsm drive", ieee trans. ind. electron., vol. 55, no. 3, pp. 1295-1309, 2008.
[10] h. khan, e. miliani, k. e. k. drissi, "discontinuous random space vector modulation for electric drives: a digital approach", ieee trans. power electron., vol. 27, no. 12, pp. 4944-4951, 2012.
[11] k. t. chau, z.
wang, "chaos in electric drive systems-analysis, control and application". singapore: wiley, 2011.
[12] h. li, y. liu, j. lu, t. zheng, x. yu, "suppressing emi in power converters via chaotic spwm control based on spectrum analysis approach", ieee trans. ind. electron., vol. 61, no. 11, pp. 6128-6136, 2014.
[13] z. wang, k. t. chau, c. h. liu, "improvement of electromagnetic compatibility of motor drives using chaotic pwm", ieee trans. magn., vol. 43, pp. 2612-2614, 2007.
[14] z. zhang, k. t. chau, z. wang, w. li, "improvement of electromagnetic compatibility of motor drives using hybrid chaotic pulse width modulation", ieee trans. magn., vol. 47, no. 10, pp. 4018-4021, 2011.
[15] c.r. paul, "introduction to electromagnetic compatibility", second edition, john wiley & sons, inc., hoboken, new jersey, 2006.
[16] dabi, j., zare, f., ledwich, g., ghosh, a., "leakage current and common mode voltage issues in modern ac drive systems", power engineering conference, 2007. aupec 2007. australasian universities, pp. 1-6, 9-12 dec. 2007.
[17] e. haseloff, "printed circuit board layout for improved electromagnetic compatibility", texas instruments, 1996.

facta universitatis series: electronics and energetics vol. 28, no 1, march 2015, pp. 101-125

boolean differential equations: a common model for classes, lattices, and arbitrary sets of boolean functions

bernd steinbach 1 and christian posthoff 2
1 institute of computer science, freiberg university of mining and technology, bernhard-von-cotta-str. 2, d-09596 freiberg, germany
2 department of computing and information technology, the university of the west indies, trinidad & tobago

abstract: the boolean differential calculus (bdc) significantly extends the boolean algebra because not only boolean values 0 and 1, but also changes of boolean values or boolean functions can be described.
a boolean differential equation (bde) is a boolean equation that includes derivative operations of the boolean differential calculus. this paper aims at the classification of bdes, the characterization of the respective solutions, algorithms to calculate the solution of a bde, and selected applications. we will show that not only classes and arbitrary sets of boolean functions but also lattices of boolean functions can be expressed by boolean differential equations. in order to reach this aim, we give a short introduction into the bdc, emphasize the general difference between the solutions of a boolean equation and a bde, explain the core algorithms to solve a bde that is restricted to all vectorial derivatives of f (x) and optionally contains boolean variables. we explain formulas for transforming other derivative operations to vectorial derivatives in order to solve more general bdes. new fields of applications for bdes are simple and generalized lattices of boolean functions. we describe the construction, simplification and solution. manuscript received november 17, 2014 *an earlier version of this paper was presented as invited paper at the the 13th serbian mathematical congress (smc 2014), may 22-25, 2014, in vrnjačka banja, serbia. corresponding author: bernd steinbach institute of computer science, freiberg university of mining and technology, bernhard-von-cottastr. 2, d-09596 freiberg, germany (e-mail: steinb@informatik). 101 facta universitatis series: electronics and energetics vol. 28, no 1, march 2015, pp. 101 125 boolean differential equations a common model for classes, lattices, and arbitrary sets of boolean functions bernd steinbach1 and christian posthoff2 1institute of computer science, freiberg university of mining and technology, bernhard-von-cotta-str. 
2, d-09596 freiberg, germany 2department of computing and information technology, the university of the west indies, trinidad & tobago abstract: the boolean differential calculus (bdc) significantly extends the boolean algebra because not only boolean values 0 and 1, but also changes of boolean values or boolean functions can be described. a boolean differential equation (bde) is a boolean equation that includes derivative operations of the boolean differential calculus. this paper aims at the classification of bdes, the characterization of the respective solutions, algorithms to calculate the solution of a bde, and selected applications. we will show that not only classes and arbitrary sets of boolean functions but also lattices of boolean functions can be expressed by boolean differential equations. in order to reach this aim, we give a short introduction into the bdc, emphasize the general difference between the solutions of a boolean equation and a bde, explain the core algorithms to solve a bde that is restricted to all vectorial derivatives of f (x) and optionally contains boolean variables. we explain formulas for transforming other derivative operations to vectorial derivatives in order to solve more general bdes. new fields of applications for bdes are simple and generalized lattices of boolean functions. we describe the construction, simplification and solution. manuscript received november 17, 2014 *an earlier version of this paper was presented as invited paper at the the 13th serbian mathematical congress (smc 2014), may 22-25, 2014, in vrnjačka banja, serbia. corresponding author: bernd steinbach institute of computer science, freiberg university of mining and technology, bernhard-von-cottastr. 2, d-09596 freiberg, germany (e-mail: steinb@informatik). 101 facta universitatis series: electronics and energetics vol. 28, no 1, march 2015, pp. 
51 76 doi: 10.2298/fuee1501051s received november 17, 2014 *an earlier version of this paper was presented as invited paper at the the 13th serbian mathematical congress (smc 2014), may 22-25, 2014, in vrnjačka banja, serbia. corresponding author: bernd steinbach institute of computer science, freiberg university of mining and technology, bernhard-von-cotta-str. 2, d-09596 freiberg, germany (e-mail: steinb@informatik) 102 b. steinbach and c. posthoff the basic operations of xboole are sufficient to solve bdes. we demonstrate how a xboole-problem program (prp) of the freely available xboole-monitor quickly solves some bdes. keywords: boolean differential calculus, boolean differential equation (bde), classes of boolean functions, lattices of boolean functions, arbitrary sets of boolean functions, lattice-bde, xboole. 1 introduction boolean variables are the simplest variables. a boolean variable carries only one element of the set b = {0, 1}. these two values can be easily distinguished from each other in technical systems. therefore we get more and more digital systems. boolean functions, the operations of the boolean algebra, and boolean equations [1] are well known to solve many tasks for digital systems. the solution of a boolean equation is a set of boolean vectors which describes, e.g., the behavior of a circuit. however, this theory is restricted to fixed values for a given point in time. the boolean differential calculus (bdc) [1–3] allows us to study the change of the values of boolean variables and boolean functions. the substitution of the derivative operations of the bdc into a boolean equation leads to a boolean differential equation (bde) [4, 5]. a bde has a more general solution and consequently opens a wider field of applications. we show in this paper how lattices of boolean functions can be expressed by bdes. 
we will show that additionally to the well known lattices of incompletely specified functions also more general lattices of boolean functions can be described in an easy way using the new special lattice-bde. the generalized lattices of boolean functions open a new field of applications. the rest of this paper is organized as follows. section 2 defines the derivative operations of the bdc and briefly explains their meaning. section 3 summarizes the known theory of boolean differential equations, introduces algorithms that calculates either sets of classes of boolean functions or arbitrary sets of boolean functions as the solution of a bde, explains a method to map a more general bde into the form needed for one of these algorithms, and shows how a xbooleproblem programs (prp) can solve a bde in the xboole-monitor. lattices of boolean functions are discussed in section 4 as a new field of applications of bdes. both the well known lattices that describe incompletely specified functions (isf) and a more general type of lattices of boolean functions are uniquely described using lattice-bdes. several examples show that the known algorithm can be used to solve different types of lattice-bdes. finally, section 5 concludes this paper. 102 b. steinbach and c. posthoff the basic operations of xboole are sufficient to solve bdes. we demonstrate how a xboole-problem program (prp) of the freely available xboole-monitor quickly solves some bdes. keywords: boolean differential calculus, boolean differential equation (bde), classes of boolean functions, lattices of boolean functions, arbitrary sets of boolean functions, lattice-bde, xboole. 1 introduction boolean variables are the simplest variables. a boolean variable carries only one element of the set b = {0, 1}. these two values can be easily distinguished from each other in technical systems. therefore we get more and more digital systems. 
boolean functions, the operations of the boolean algebra, and boolean equations [1] are well known to solve many tasks for digital systems. the solution of a boolean equation is a set of boolean vectors which describes, e.g., the behavior of a circuit. however, this theory is restricted to fixed values for a given point in time. the boolean differential calculus (bdc) [1–3] allows us to study the change of the values of boolean variables and boolean functions. the substitution of the derivative operations of the bdc into a boolean equation leads to a boolean differential equation (bde) [4, 5]. a bde has a more general solution and consequently opens a wider field of applications. we show in this paper how lattices of boolean functions can be expressed by bdes. we will show that additionally to the well known lattices of incompletely specified functions also more general lattices of boolean functions can be described in an easy way using the new special lattice-bde. the generalized lattices of boolean functions open a new field of applications. the rest of this paper is organized as follows. section 2 defines the derivative operations of the bdc and briefly explains their meaning. section 3 summarizes the known theory of boolean differential equations, introduces algorithms that calculates either sets of classes of boolean functions or arbitrary sets of boolean functions as the solution of a bde, explains a method to map a more general bde into the form needed for one of these algorithms, and shows how a xbooleproblem programs (prp) can solve a bde in the xboole-monitor. lattices of boolean functions are discussed in section 4 as a new field of applications of bdes. both the well known lattices that describe incompletely specified functions (isf) and a more general type of lattices of boolean functions are uniquely described using lattice-bdes. several examples show that the known algorithm can be used to solve different types of lattice-bdes. 
finally, section 5 concludes this paper. 52 b. steinbach, c. posthoff boolean differential equations – a common model for classes, lattices... 53 102 b. steinbach and c. posthoff the basic operations of xboole are sufficient to solve bdes. we demonstrate how a xboole-problem program (prp) of the freely available xboole-monitor quickly solves some bdes. keywords: boolean differential calculus, boolean differential equation (bde), classes of boolean functions, lattices of boolean functions, arbitrary sets of boolean functions, lattice-bde, xboole. 1 introduction boolean variables are the simplest variables. a boolean variable carries only one element of the set b = {0, 1}. these two values can be easily distinguished from each other in technical systems. therefore we get more and more digital systems. boolean functions, the operations of the boolean algebra, and boolean equations [1] are well known to solve many tasks for digital systems. the solution of a boolean equation is a set of boolean vectors which describes, e.g., the behavior of a circuit. however, this theory is restricted to fixed values for a given point in time. the boolean differential calculus (bdc) [1–3] allows us to study the change of the values of boolean variables and boolean functions. the substitution of the derivative operations of the bdc into a boolean equation leads to a boolean differential equation (bde) [4, 5]. a bde has a more general solution and consequently opens a wider field of applications. we show in this paper how lattices of boolean functions can be expressed by bdes. we will show that additionally to the well known lattices of incompletely specified functions also more general lattices of boolean functions can be described in an easy way using the new special lattice-bde. the generalized lattices of boolean functions open a new field of applications. the rest of this paper is organized as follows. 
2 derivative operations of the boolean differential calculus

the bdc defines the boolean differential $dx$, which is also a boolean variable and is equal to 1 in the case that $x$ changes its value. additionally, the bdc defines several differential operations of boolean functions which describe certain general properties depending on possible directions of change. we confine ourselves to derivatives of the bdc where the direction of change is fixed.

for simple derivative operations the direction of change is restricted to a single variable $x_i$. the (simple) derivative

$$\frac{\partial f(x_i, \mathbf{x}_1)}{\partial x_i} = f(x_i = 0, \mathbf{x}_1) \oplus f(x_i = 1, \mathbf{x}_1) \tag{1}$$

is equal to 1 if the function $f(x_i, \mathbf{x}_1)$ changes its value when the value of $x_i$ changes. the (simple) minimum

$$\min_{x_i} f(x_i, \mathbf{x}_1) = f(x_i = 0, \mathbf{x}_1) \wedge f(x_i = 1, \mathbf{x}_1) \tag{2}$$

is equal to 1 when the value of the function $f(x_i, \mathbf{x}_1)$ remains unchanged equal to 1 while the value of $x_i$ changes. the (simple) maximum

$$\max_{x_i} f(x_i, \mathbf{x}_1) = f(x_i = 0, \mathbf{x}_1) \vee f(x_i = 1, \mathbf{x}_1) \tag{3}$$

is equal to 0 when the value of the function $f(x_i, \mathbf{x}_1)$ remains unchanged equal to 0 while the value of $x_i$ changes.

vectorial derivative operations similarly describe cases where all variables of the vector $\mathbf{x}_0$ change their values at the same point in time. the following formulas define the vectorial derivative, the vectorial minimum, and the vectorial maximum:

$$\frac{\partial f(\mathbf{x}_0, \mathbf{x}_1)}{\partial \mathbf{x}_0} = f(\mathbf{x}_0, \mathbf{x}_1) \oplus f(\overline{\mathbf{x}}_0, \mathbf{x}_1) , \tag{4}$$

$$\min_{\mathbf{x}_0} f(\mathbf{x}_0, \mathbf{x}_1) = f(\mathbf{x}_0, \mathbf{x}_1) \wedge f(\overline{\mathbf{x}}_0, \mathbf{x}_1) , \tag{5}$$

$$\max_{\mathbf{x}_0} f(\mathbf{x}_0, \mathbf{x}_1) = f(\mathbf{x}_0, \mathbf{x}_1) \vee f(\overline{\mathbf{x}}_0, \mathbf{x}_1) . \tag{6}$$

the simple derivative operations can sequentially be executed with regard to different variables of a subset $\mathbf{x}_0 = (x_1, \ldots, x_m)$. such m-fold derivative operations describe a special property for subspaces $\mathbf{x}_1 = \text{const}$. the following formulas define the m-fold derivative, the m-fold minimum, the m-fold maximum, and the $\Delta$-operation:

$$\frac{\partial^m f(\mathbf{x}_0, \mathbf{x}_1)}{\partial x_1 \partial x_2 \ldots \partial x_m} = \frac{\partial}{\partial x_m}\left( \ldots \left( \frac{\partial}{\partial x_2} \left( \frac{\partial f(\mathbf{x}_0, \mathbf{x}_1)}{\partial x_1} \right)\right) \ldots \right) , \tag{7}$$

$$\min_{\mathbf{x}_0}{}^{m} f(\mathbf{x}_0, \mathbf{x}_1) = \min_{x_m}\left( \ldots \left( \min_{x_2} \left( \min_{x_1} f(\mathbf{x}_0, \mathbf{x}_1) \right)\right) \ldots \right) , \tag{8}$$

$$\max_{\mathbf{x}_0}{}^{m} f(\mathbf{x}_0, \mathbf{x}_1) = \max_{x_m}\left( \ldots \left( \max_{x_2} \left( \max_{x_1} f(\mathbf{x}_0, \mathbf{x}_1) \right)\right) \ldots \right) , \tag{9}$$

$$\Delta_{\mathbf{x}_0} f(\mathbf{x}_0, \mathbf{x}_1) = \min_{\mathbf{x}_0}{}^{m} f(\mathbf{x}_0, \mathbf{x}_1) \oplus \max_{\mathbf{x}_0}{}^{m} f(\mathbf{x}_0, \mathbf{x}_1) . \tag{10}$$

because of the limited space, we skip all theorems about relations between certain derivative operations and refer to [3, 6].

3 boolean differential equations

3.1 an introductory example

if we know the function f(x) and need the result of any derivative operation, then we simply apply the definition and get as result a uniquely specified boolean function.

example 1. we take the boolean function

$$f(\mathbf{x}) = f(x_1, x_2, x_3) = x_1 \vee x_2 x_3 \tag{11}$$

and use definition (4) to calculate the vectorial derivative of f(x) with regard to $(x_1, x_3)$:

$$\begin{aligned}
\frac{\partial f(x_1, x_2, x_3)}{\partial (x_1, x_3)} &= f(x_1, x_2, x_3) \oplus f(\bar{x}_1, x_2, \bar{x}_3) \\
&= (x_1 \vee x_2 x_3) \oplus (\bar{x}_1 \vee x_2 \bar{x}_3) \\
&= x_1 \oplus x_2 x_3 \oplus x_1 x_2 x_3 \oplus \bar{x}_1 \oplus x_2 \bar{x}_3 \oplus \bar{x}_1 x_2 \bar{x}_3 \\
&= (x_1 \oplus \bar{x}_1) \oplus x_2 (x_3 \oplus \bar{x}_3) \oplus x_2 (x_1 x_3 \oplus \bar{x}_1 \bar{x}_3) \\
&= \bar{x}_2 \oplus x_2 (x_1 \oplus \bar{x}_3) \\
&= \bar{x}_2 \vee (x_1 \oplus \bar{x}_3) ,
\end{aligned}$$

and get as result the unique boolean function:

$$g(x_1, x_2, x_3) = \bar{x}_2 \vee (x_1 \oplus \bar{x}_3) . \tag{12}$$

arrows in figure 1 illustrate the pairs of function values which determine the result of this vectorial derivative. different function values in these pairs lead to function values 1 in the calculated vectorial derivative.
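the definitions (4), (5), (6) and example 1 can be checked numerically. the following is only a minimal python sketch under stated assumptions: functions are represented as python callables on 0/1 tuples, and the helper names (`flip`, `vectorial_derivative`, `vectorial_min`, `vectorial_max`) are hypothetical and not taken from the paper or from xboole.

```python
from itertools import product

def flip(x, idx):
    """Negate the variables of x at the 0-based positions in idx."""
    return tuple(1 - v if k in idx else v for k, v in enumerate(x))

def vectorial_derivative(f, n, idx):
    """Truth table of f(x0, x1) XOR f(not x0, x1), cf. definition (4)."""
    return {x: f(x) ^ f(flip(x, idx)) for x in product((0, 1), repeat=n)}

def vectorial_min(f, n, idx):
    """Truth table of f(x0, x1) AND f(not x0, x1), cf. definition (5)."""
    return {x: f(x) & f(flip(x, idx)) for x in product((0, 1), repeat=n)}

def vectorial_max(f, n, idx):
    """Truth table of f(x0, x1) OR f(not x0, x1), cf. definition (6)."""
    return {x: f(x) | f(flip(x, idx)) for x in product((0, 1), repeat=n)}

# Example 1: f = x1 OR (x2 AND x3), derived with regard to (x1, x3).
f = lambda x: x[0] | (x[1] & x[2])
g = vectorial_derivative(f, 3, {0, 2})

# Closed form of the result: g = (NOT x2) OR (x1 XOR NOT x3).
g_closed = {x: (1 - x[1]) | (x[0] ^ (1 - x[2]))
            for x in product((0, 1), repeat=3)}
assert g == g_closed
```

a simple derivative is obtained from the same helper by passing a single position, e.g. `vectorial_derivative(f, 3, {0})` for the derivative with regard to x1.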
fig. 1. related function values of the vectorial derivative of example 1 (karnaugh maps of f(x) and g(x); the arrows of the original figure connect the pairs of points that differ in both x1 and x3).

x2 x3    f(x1=0)  f(x1=1)    g(x1=0)  g(x1=1)
0  0     0        1          1        1
0  1     0        1          1        1
1  1     1        1          0        1
1  0     0        1          1        0

tab. 1. pairs of function values and their result of $\partial f(\mathbf{x}_0, \mathbf{x}_1) / \partial \mathbf{x}_0 = g(\mathbf{x}_0, \mathbf{x}_1)$ (all columns for $\mathbf{x}_1 = \text{const.}$)

f(x0, x1)   f(not x0, x1)   g(x0, x1)   g(not x0, x1)
0           0               0           0
1           1               0           0
0           1               1           1
1           0               1           1

now we consider the reverse situation: we know g(x) and want to find the function f(x) such that g(x) is the result of a derivative operation of the unknown function f(x). from figure 1, we can conclude that the vectorial derivatives of several functions f(x) with regard to (x1, x3) result in the same function g(x) (12). table 1 shows that, for a vectorial derivative, two different patterns of function values of f(x) result in the same pair of function values of g(x). in the special case of example 1 (see also figure 1), the two possible patterns of function values for each of the four pairs lead to $2^4 = 16$ different functions f(x) with the same function g(x1, x2, x3) (12) as result of the vectorial derivative with regard to (x1, x3).
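the count of 16 functions can be confirmed by exhaustive search. the sketch below is an illustration only: it assumes truth tables stored as python dictionaries and enumerates all 256 functions of three variables; the names `POINTS`, `flip_x1_x3` and `solutions` are hypothetical.

```python
from itertools import product

POINTS = list(product((0, 1), repeat=3))  # all (x1, x2, x3)

def flip_x1_x3(x):
    """Simultaneous change of x1 and x3 (the direction used in example 1)."""
    return (1 - x[0], x[1], 1 - x[2])

# f of example 1 and its vectorial derivative g with regard to (x1, x3).
f_example = {x: x[0] | (x[1] & x[2]) for x in POINTS}
g = {x: f_example[x] ^ f_example[flip_x1_x3(x)] for x in POINTS}

# Keep every function whose vectorial derivative equals g.
solutions = []
for values in product((0, 1), repeat=len(POINTS)):
    f = dict(zip(POINTS, values))
    if all(f[x] ^ f[flip_x1_x3(x)] == g[x] for x in POINTS):
        solutions.append(f)

print(len(solutions))  # number of functions with the same derivative
```

each of the four pairs of points contributes one free binary choice, which is why the search returns exactly the functions described in the text, including the original function of example 1.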
figure 2 shows the karnaugh-maps of these 16 functions using the same encoding of the leftmost karnaugh-maps for all of them. the original function f(x1, x2, x3) belongs to this set of functions and is labeled as f3(x1, x2, x3) in figure 2.

fig. 2. sixteen functions with the same vectorial derivative g(x1, x2, x3) (12), $\partial f(x_1, x_2, x_3) / \partial (x_1, x_3) = g(x_1, x_2, x_3)$. each row lists the karnaugh-map values in the order (x2 x3, x1) = (00, 0), (00, 1), (01, 0), (01, 1), (11, 0), (11, 1), (10, 0), (10, 1).

f1(x)    0 1 0 1 0 1 0 0
f2(x)    0 1 0 1 0 0 1 0
f3(x)    0 1 0 1 1 1 0 1
f4(x)    0 1 0 1 1 0 1 1
f5(x)    0 0 1 1 0 1 0 0
f6(x)    0 0 1 1 0 0 1 0
f7(x)    0 0 1 1 1 1 0 1
f8(x)    0 0 1 1 1 0 1 1
f9(x)    1 1 0 0 0 1 0 0
f10(x)   1 1 0 0 0 0 1 0
f11(x)   1 1 0 0 1 1 0 1
f12(x)   1 1 0 0 1 0 1 1
f13(x)   1 0 1 0 0 1 0 0
f14(x)   1 0 1 0 0 0 1 0
f15(x)   1 0 1 0 1 1 0 1
f16(x)   1 0 1 0 1 0 1 1
g(x)     1 1 1 1 0 1 1 0

another interesting question: is there for each given function $g(\mathbf{x}_0, \mathbf{x}_1)$ at least one solution function $f(\mathbf{x}_0, \mathbf{x}_1)$ of the simple bde

$$\frac{\partial f(\mathbf{x}_0, \mathbf{x}_1)}{\partial \mathbf{x}_0} = g(\mathbf{x}_0, \mathbf{x}_1) \; ? \tag{13}$$

due to the commutativity of the $\oplus$-operation we have

$$g(\mathbf{x}_0, \mathbf{x}_1) = \frac{\partial f(\mathbf{x}_0, \mathbf{x}_1)}{\partial \mathbf{x}_0} = f(\mathbf{x}_0, \mathbf{x}_1) \oplus f(\overline{\mathbf{x}}_0, \mathbf{x}_1) = \frac{\partial f(\overline{\mathbf{x}}_0, \mathbf{x}_1)}{\partial \mathbf{x}_0} = g(\overline{\mathbf{x}}_0, \mathbf{x}_1) . \tag{14}$$

therefore a function $f(\mathbf{x}_0, \mathbf{x}_1)$ exists only in the case

$$\begin{aligned}
g(\mathbf{x}_0, \mathbf{x}_1) &= g(\overline{\mathbf{x}}_0, \mathbf{x}_1) \\
g(\mathbf{x}_0, \mathbf{x}_1) \oplus g(\overline{\mathbf{x}}_0, \mathbf{x}_1) &= g(\overline{\mathbf{x}}_0, \mathbf{x}_1) \oplus g(\overline{\mathbf{x}}_0, \mathbf{x}_1) \\
g(\mathbf{x}_0, \mathbf{x}_1) \oplus g(\overline{\mathbf{x}}_0, \mathbf{x}_1) &= 0 \\
\frac{\partial g(\mathbf{x}_0, \mathbf{x}_1)}{\partial \mathbf{x}_0} &= 0 .
\end{aligned} \tag{15}$$

we call (15) the integrability condition and can conclude: the functions $f(\mathbf{x}_0, \mathbf{x}_1)$ of the bde (13) exist if and only if $g(\mathbf{x}_0, \mathbf{x}_1)$ satisfies the integrability condition (15). all solution functions of the bde (13) are given by

$$f_i(\mathbf{x}_0, \mathbf{x}_1) = x_i \wedge g(\mathbf{x}_0, \mathbf{x}_1) \oplus h_j(\mathbf{x}_0, \mathbf{x}_1) \tag{16}$$

with $\mathbf{x}_0 = (x_1, \ldots, x_i, \ldots, x_k)$, $\mathbf{x}_1 = (x_{k+1}, \ldots, x_n)$, and

$$\frac{\partial h_j(\mathbf{x}_0, \mathbf{x}_1)}{\partial \mathbf{x}_0} = 0 . \tag{17}$$

we learn from this example:
1. a boolean differential equation $\partial f(x_1, x_2, x_3) / \partial (x_1, x_3) = g(x_1, x_2, x_3)$ includes the unknown function f(x1, x2, x3).
2. there are solutions of a bde only if the right-hand function g(x1, x2, x3) satisfies a special integrability condition.
3. the general solution of an inhomogeneous bde is built using a single special solution of the inhomogeneous bde and the set of all solutions of the associated homogeneous bde. the associated homogeneous bde is built by replacing the right-hand side of an inhomogeneous bde by 0.
4. generally, the solution of a boolean differential equation is a set of boolean functions. this is a significant difference to boolean equations. the solution of a boolean equation is a set of boolean vectors.

3.2 bdes of simple and vectorial derivatives - separation of classes

generally, a boolean differential equation (bde) is an equation in which derivative operations of an unknown function f(x) occur. in order to find a convenient solution method, we restrict in this subsection bdes such that only the function f(x) and all its simple and vectorial derivatives are allowed in expressions on both sides of an equation:

$$D_l\left( f(\mathbf{x}), \frac{\partial f(\mathbf{x})}{\partial x_1}, \frac{\partial f(\mathbf{x})}{\partial x_2}, \ldots, \frac{\partial f(\mathbf{x}_0, \mathbf{x}_1)}{\partial \mathbf{x}_0}, \ldots, \frac{\partial f(\mathbf{x})}{\partial \mathbf{x}} \right) = D_r\left( f(\mathbf{x}), \frac{\partial f(\mathbf{x})}{\partial x_1}, \frac{\partial f(\mathbf{x})}{\partial x_2}, \ldots, \frac{\partial f(\mathbf{x}_0, \mathbf{x}_1)}{\partial \mathbf{x}_0}, \ldots, \frac{\partial f(\mathbf{x})}{\partial \mathbf{x}} \right) . \tag{18}$$

using the $\oplus$-operation, each bde (18) can be transformed into a homogeneous restrictive bde:

$$D\left( f(\mathbf{x}), \frac{\partial f(\mathbf{x})}{\partial x_1}, \frac{\partial f(\mathbf{x})}{\partial x_2}, \ldots, \frac{\partial f(\mathbf{x}_0, \mathbf{x}_1)}{\partial \mathbf{x}_0}, \ldots, \frac{\partial f(\mathbf{x})}{\partial \mathbf{x}} \right) = 0 . \tag{19}$$

the following definition supports the description of the solution procedure.

definition 1. let g(x) be a solution function of (19). then

1. $$\left[ g(\mathbf{x}), \frac{\partial g(\mathbf{x})}{\partial x_1}, \frac{\partial g(\mathbf{x})}{\partial x_2}, \ldots, \frac{\partial g(\mathbf{x})}{\partial \mathbf{x}} \right]_{\mathbf{x} = \mathbf{c}} \tag{20}$$
is a local solution for x = c,

2. $$D(u_0, u_1, \ldots, u_{2^n - 1}) = 0 \tag{21}$$
is the boolean equation, associated to the boolean differential equation (19), and has the set of local solutions sls,

3. $$\nabla g(\mathbf{x}) = \left( g(\mathbf{x}), \frac{\partial g(\mathbf{x})}{\partial x_1}, \frac{\partial g(\mathbf{x})}{\partial x_2}, \ldots, \frac{\partial g(\mathbf{x})}{\partial \mathbf{x}} \right) \tag{22}$$

the local solutions (20) are the key to solve a bde. the values of the simple and vectorial derivatives in a single selected point of the boolean space specify the function values in all other points of this space. figure 3 shows the function $g(x_1, x_2) = x_1 \vee \bar{x}_2$ where function values 1 are indicated by double circles, values 1 of derivatives are indicated by solid arrows, and values 0 of derivatives are indicated by dotted arrows.

fig. 3. values of the function $g(x_1, x_2) = x_1 \vee \bar{x}_2$ and their simple and vectorial derivatives for (x1, x2) = (1, 1).

example 2. assume, we know the local solution for (x1, x2) = (1, 1) as depicted in figure 3. then we can conclude:
• due to the known point: $g(\mathbf{x})|_{(x_1, x_2) = (1,1)} = 1 \rightarrow g(1, 1) = 1$,
• due to the direction of change $dx_1\, \overline{dx_2}$: $\left. \frac{\partial g(\mathbf{x})}{\partial x_1} \right|_{(x_1, x_2) = (1,1)} = 1 \rightarrow g(0, 1) = 0$,
• due to the direction of change $\overline{dx_1}\, dx_2$: $\left. \frac{\partial g(\mathbf{x})}{\partial x_2} \right|_{(x_1, x_2) = (1,1)} = 0 \rightarrow g(1, 0) = 1$, and
• due to the direction of change $dx_1\, dx_2$: $\left. \frac{\partial g(\mathbf{x})}{\partial (x_1, x_2)} \right|_{(x_1, x_2) = (1,1)} = 0 \rightarrow g(0, 0) = 1$.

hence, a possible solution function $g(\mathbf{x}) = x_1 \vee \bar{x}_2$ could be reconstructed based on the knowledge of the local solution in a single point of the boolean space.

the bde (19) contains all elements of $\nabla f(\mathbf{x})$ either in non-negated or negated form. hence, all these elements can be encoded by boolean variables $u_i$ as shown in table 2.
tab. 2. mapping of the bde into the associated boolean equation

index      binary code    u_i            associated element
0          (0 . . . 00)   u_0            f(x)
1          (0 . . . 01)   u_1            ∂f(x)/∂x1
2          (0 . . . 10)   u_2            ∂f(x)/∂x2
3          (0 . . . 11)   u_3            ∂f(x)/∂(x1, x2)
:          :              :              :
2^n − 1    (1 . . . 11)   u_{2^n − 1}    ∂f(x)/∂x

the result of this substitution is an associated boolean equation d(u) = 0. the solution of this boolean equation is the set of all local solutions sls(u). there are local solutions for which no global solution functions of the bde (19) exist.
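the reconstruction idea of example 2 can be sketched in a few lines: the function value in a neighboring point is the known value combined by xor with the corresponding derivative value. this is only an illustrative sketch; the helper name `function_from_local_solution` is hypothetical, and the index convention (bit k of the index marks a change of x_{k+1}) is assumed from the binary codes of table 2.

```python
def function_from_local_solution(u, c):
    """Rebuild a truth table from a local solution u given in the point c.

    u[0] is the function value in c; u[index] is the value of the simple or
    vectorial derivative whose changed variables are encoded by the bits of
    index (least significant bit = x1), as in table 2.
    """
    n = len(c)
    table = {}
    for index, ui in enumerate(u):
        # Point reached from c by changing the variables selected by index.
        point = tuple(c[k] ^ ((index >> k) & 1) for k in range(n))
        # A derivative value 1 means the function value differs from u[0].
        table[point] = u[0] if index == 0 else u[0] ^ ui
    return table

# Local solution of example 2 in (x1, x2) = (1, 1):
# g = 1, dg/dx1 = 1, dg/dx2 = 0, dg/d(x1, x2) = 0.
g = function_from_local_solution([1, 1, 0, 0], (1, 1))

# The reconstructed function is x1 OR (NOT x2).
assert all(g[(x1, x2)] == (x1 | (1 - x2)) for x1 in (0, 1) for x2 in (0, 1))
```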
the sufficient condition for a global solution function of the bde (19) is that a local solution exists for each point of the boolean space $B^n$:

$$\forall \mathbf{c} \in B^n : \quad \nabla f(\mathbf{x})|_{\mathbf{x} = \mathbf{c}} \in sls(\mathbf{u}) . \tag{23}$$

an important consequence of (23) is that the solutions of the bde (19) consist of classes of boolean functions as characterized by theorem 1; the proof is given in [4, 5].

theorem 1. if the boolean function f(x) is a solution function of the boolean differential equation (19), then all boolean functions

$$f(x_1, x_2, \ldots, x_n) = f(x_1 \oplus c_1, x_2 \oplus c_2, \ldots, x_n \oplus c_n) \tag{24}$$

for $\mathbf{c} = (c_1, \ldots, c_n) \in B^n$ are also solution functions of (19).

the set of local solutions sls(u) expresses by $u_0$ the function value in one point of $B^n$ and by $u_i$, $0 < i < 2^n$, the values of changes with regard to all the other points of $B^n$. the separation of function classes becomes easier when the function values in all points of $B^n$ are uniquely given. the function d2v (derivative to value) uses (25) to transform the set sls(u) into the set sls′(v):

$$v_0 = u_0 , \qquad v_i = u_0 \oplus u_i , \quad \text{with } i = 1, 2, \ldots, 2^n - 1 . \tag{25}$$

due to (24), the exchange of $x_i$ and $\bar{x}_i$ does not change the set of solution functions. the function epv (exchange of pairs of values) realizes this exchange:

$$sls^{T}(\mathbf{v}) = epv(sls'(\mathbf{v}), i) . \tag{26}$$

applied to variables $v_i$ this exchange is described by:

$$v_{(m + 2k \cdot 2^{i-1})} \Longleftrightarrow v_{(m + (2k+1) \cdot 2^{i-1})} , \tag{27}$$

with $i = 1, 2, \ldots, n$, $m = 0, 1, \ldots, 2^{i-1} - 1$, $k = 0, 1, \ldots, 2^{n-i} - 1$. table 3 enumerates the index pairs which are used in the function epv for boolean spaces up to $B^4$.

tab. 3. index pairs defined by (27) for the exchange of function values

i = 1       i = 2       i = 3       i = 4
0 ⇔ 1       0 ⇔ 2       0 ⇔ 4       0 ⇔ 8
2 ⇔ 3       1 ⇔ 3       1 ⇔ 5       1 ⇔ 9
4 ⇔ 5       4 ⇔ 6       2 ⇔ 6       2 ⇔ 10
6 ⇔ 7       5 ⇔ 7       3 ⇔ 7       3 ⇔ 11
8 ⇔ 9       8 ⇔ 10      8 ⇔ 12      4 ⇔ 12
10 ⇔ 11     9 ⇔ 11      9 ⇔ 13      5 ⇔ 13
12 ⇔ 13     12 ⇔ 14     10 ⇔ 14     6 ⇔ 14
14 ⇔ 15     13 ⇔ 15     11 ⇔ 15     7 ⇔ 15

the intersection of the given set sls′(v) with the exchanged set sls^T(v) for all variables $x_i$ separates the set of global solution functions from the local solutions which are not sufficient. algorithm 1 describes the procedure to solve the bde (19) in detail. the solution vectors v of algorithm 1 specify, substituted into (28), all solution functions of the bde (19).

$$f(x_1, x_2, \ldots, x_n) = \bar{x}_1 \bar{x}_2 \ldots \bar{x}_n \wedge v_0 \;\vee\; x_1 \bar{x}_2 \ldots \bar{x}_n \wedge v_1 \;\vee\; \ldots \;\vee\; x_1 x_2 \ldots x_n \wedge v_{2^n - 1} \tag{28}$$
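the steps d2v (25), epv (26), (27) and the intersection loop of algorithm 1 can be sketched in python. this is a brute-force illustration under stated assumptions, not the xboole implementation: the associated boolean equation is solved by enumerating all vectors u, the function names (`d2v`, `epv_pairs`, `epv`, `solve_bde`) are chosen here for readability, and the sample bde $\partial f / \partial x_1 \vee \partial f / \partial x_2 = 0$ is a hypothetical example that does not appear in the paper.

```python
from itertools import product

def d2v(sls):
    """Transform local solutions u into value vectors v, cf. (25)."""
    return {(u[0],) + tuple(u[0] ^ ui for ui in u[1:]) for u in sls}

def epv_pairs(i, n):
    """Index pairs exchanged for variable x_i in B^n, cf. (27)."""
    step = 2 ** (i - 1)
    return [(m + 2 * k * step, m + (2 * k + 1) * step)
            for k in range(2 ** (n - i)) for m in range(step)]

def epv(vectors, i, n):
    """Exchange the pairs of values belonging to x_i in every vector."""
    result = set()
    for v in vectors:
        w = list(v)
        for a, b in epv_pairs(i, n):
            w[a], w[b] = w[b], w[a]
        result.add(tuple(w))
    return result

def solve_bde(d, n):
    """Sketch of algorithm 1: d(u) == 0 is the associated boolean equation."""
    sls = {u for u in product((0, 1), repeat=2 ** n) if d(u) == 0}
    s = d2v(sls)
    for i in range(1, n + 1):
        s &= epv(s, i, n)  # keep only vectors that survive the exchange
    return s

# Hypothetical bde in B^2: df/dx1 OR df/dx2 = 0, i.e. u1 OR u2 = 0.
solutions = solve_bde(lambda u: u[1] | u[2], 2)
print(sorted(solutions))
```

for this sample bde the associated boolean equation has four local solutions, but only the value vectors of the two constant functions survive the intersections, which illustrates that not every local solution belongs to a global solution function. `epv_pairs(i, 4)` reproduces the columns of table 3.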
algorithm 1 separation of function classes
require: bde (19) with function f(x) containing n variables
ensure: set of boolean vectors $\mathbf{v} = (v_0, v_1, \ldots, v_{2^n - 1})$ that describe, substituted in (28), the set of all solution functions of the bde (19)
    sls(u) ← solution of be (21) associated with bde (19)
    sls′(v) ← d2v(sls(u))
    for i ← 1 to n do
        sls^T(v) ← epv(sls′(v), i)
        sls′(v) ← sls′(v) ∩ sls^T(v)
    end for

3.3 bdes of simple and vectorial derivatives as well as variables - separation of arbitrary function sets

the solution of a boolean differential equation of the type (18) is a set of function classes characterized by (24). additional boolean variables x in a boolean differential equation of the type (29) restrict the derivatives to certain points of the boolean space. therefore, selected functions of the function classes can belong to the set of solution functions. hence, a bde (29) has an arbitrary set of boolean functions as solution.

$$D_l\left( f(\mathbf{x}), \frac{\partial f(\mathbf{x})}{\partial x_1}, \frac{\partial f(\mathbf{x})}{\partial x_2}, \ldots, \frac{\partial f(\mathbf{x}_0, \mathbf{x}_1)}{\partial \mathbf{x}_0}, \ldots, \frac{\partial f(\mathbf{x})}{\partial \mathbf{x}}, \mathbf{x} \right) = D_r\left( f(\mathbf{x}), \frac{\partial f(\mathbf{x})}{\partial x_1}, \frac{\partial f(\mathbf{x})}{\partial x_2}, \ldots, \frac{\partial f(\mathbf{x}_0, \mathbf{x}_1)}{\partial \mathbf{x}_0}, \ldots, \frac{\partial f(\mathbf{x})}{\partial \mathbf{x}}, \mathbf{x} \right) . \tag{29}$$

a bde (29) can be solved using a slightly modified algorithm. the variables x remain in the associated boolean equation (30) associated to (29):

$$D_l(u_0, u_1, \ldots, u_{2^n - 1}, x_1, x_2, \ldots, x_n) = D_r(u_0, u_1, \ldots, u_{2^n - 1}, x_1, x_2, \ldots, x_n) . \tag{30}$$

a detailed analysis in [4] reveals that the local solutions must be split into the cofactors s0 for $x_i = 0$ and s1 for $x_i = 1$ and the exchange of function pairs must be restricted to s1. algorithm 2 describes the detailed steps to solve a bde (29).

algorithm 2 separation of functions
require: bde (29) in which the function f(x) depends on n variables
ensure: set s of boolean vectors $\mathbf{v} = (v_0, v_1, \ldots, v_{2^n - 1})$ that describe, substituted in (28), the set of all solution functions of the bde (29)
    sls(u, x) ← solution of the be (30) associated with bde (29)
    s(v, x) ← d2v(sls(u, x))
    for i ← 1 to n do
        s0(v, x \ (x1, . . . , xi)) ← $\max_{x_i} [\bar{x}_i \wedge s(\mathbf{v}, x_i, \ldots, x_n)]$
        s1(v, x \ (x1, . . . , xi)) ← $\max_{x_i} [x_i \wedge s(\mathbf{v}, x_i, \ldots, x_n)]$
        st1(v, x \ (x1, . . . , xi)) ← epv(s1(v, x \ (x1, . . . , xi)), i)
        s(v, x \ (x1, . . . , xi)) ← s0(v, x \ (x1, . . . , xi)) ∩ st1(v, x \ (x1, . . . , xi))
    end for

3.4 more general boolean differential equations

in addition to the simple and vectorial derivatives all other derivative operations can be used within a bde. we do not need special solution algorithms for such more general boolean differential equations because the theorems of the bdc allow us the transformation of all types of derivative operations into the elements of $\nabla f(\mathbf{x})$ (22).
$$ \min_{x_i} f(\mathbf{x}) = f(\mathbf{x}) \wedge \overline{\frac{\partial f(\mathbf{x})}{\partial x_i}} \tag{31} $$

$$ \max_{x_i} f(\mathbf{x}) = f(\mathbf{x}) \vee \frac{\partial f(\mathbf{x})}{\partial x_i} \tag{32} $$

$$ \min_{\mathbf{x}_0} f(\mathbf{x}_0, \mathbf{x}_1) = f(\mathbf{x}_0, \mathbf{x}_1) \wedge \overline{\frac{\partial f(\mathbf{x}_0, \mathbf{x}_1)}{\partial \mathbf{x}_0}} \tag{33} $$

$$ \max_{\mathbf{x}_0} f(\mathbf{x}_0, \mathbf{x}_1) = f(\mathbf{x}_0, \mathbf{x}_1) \vee \frac{\partial f(\mathbf{x}_0, \mathbf{x}_1)}{\partial \mathbf{x}_0} \tag{34} $$

$$ \min^2_{(x_1, x_2)} f(\mathbf{x}) = f(\mathbf{x}) \wedge \overline{\frac{\partial f(\mathbf{x})}{\partial x_1}} \wedge \overline{\frac{\partial f(\mathbf{x})}{\partial x_2}} \wedge \overline{\frac{\partial f(\mathbf{x})}{\partial (x_1, x_2)}} \tag{35} $$

$$ \max^2_{(x_1, x_2)} f(\mathbf{x}) = f(\mathbf{x}) \vee \frac{\partial f(\mathbf{x})}{\partial x_1} \vee \frac{\partial f(\mathbf{x})}{\partial x_2} \vee \frac{\partial f(\mathbf{x})}{\partial (x_1, x_2)} \tag{36} $$

$$ \frac{\partial^2 f(\mathbf{x})}{\partial x_1 \partial x_2} = \frac{\partial f(\mathbf{x})}{\partial x_1} \oplus \frac{\partial f(\mathbf{x})}{\partial x_2} \oplus \frac{\partial f(\mathbf{x})}{\partial (x_1, x_2)} \tag{37} $$

$$ \Delta_{(x_1, x_2)} f(\mathbf{x}) = \frac{\partial f(\mathbf{x})}{\partial x_1} \vee \frac{\partial f(\mathbf{x})}{\partial x_2} \vee \frac{\partial f(\mathbf{x})}{\partial (x_1, x_2)} \tag{38} $$

General formulas to express the simple or vectorial minimum or maximum by the function $f(\mathbf{x})$ and simple or vectorial derivatives are given in (31), ..., (34). The equations (35), ..., (38) describe how all 2-fold derivative operations can be transformed into expressions that contain only the function $f(\mathbf{x})$ and simple or vectorial derivatives. These formulas can be generalized to any m ≤ n for functions in $B^n$ [6]. Using these transformations, each more general BDE results either in the BDE (18), for which the solution classes can be calculated using Algorithm 1, or in the BDE (29), so that the arbitrary solution set is found using Algorithm 2.

3.5 Solving a Boolean differential equation using an XBOOLE PRP

The XBOOLE-Monitor can be downloaded (for free) from: http://www.informatik.tu-freiberg.de/xboole/ and provides many operations which can be applied to sets of Boolean functions. All XBOOLE operations are explained in the help system of the XBOOLE-Monitor. We summarize here some XBOOLE operations needed to solve a BDE. All XBOOLE objects are indicated by numbers. The XBOOLE operations can be executed in the XBOOLE-Monitor by means of

1. a selected and parametrized menu item,
2. a tool-bar button followed by the same dialog to specify the parameters,
3. an XBOOLE command that specifies both the operation and the objects, and
4. an XBOOLE problem program (PRP) that contains a sequence of commands.

The main data structure is the ternary vector list (TVL). All XBOOLE operations are executed in a Boolean space. The user must specify the number of Boolean variables which can be used in a Boolean space using the XBOOLE command:

    space vmax sno

where vmax is the number of variables and sno is the number of the space. Variables in a wanted order can be attached to an XBOOLE space using:

    avar sni

where the names of the variables, separated by spaces and finished by a point (.), are declared on the following lines. A Boolean equation or a system of Boolean equations is solved using:

    sbe sni tno

where tno is the object number of the result TVL and the Boolean equation is given on the following lines, finished by a point (.), using the operation signs ‘/’ for the negation, ‘&’ for the conjunction ∧, ‘+’ for the disjunction ∨, ‘#’ for the antivalence ⊕, ‘=’ for the equivalence ⊙ as well as for separating both sides of the equation, and ‘,’ to separate the equations within a system of Boolean equations.

Logic operations for the input TVLs (tni, tni1, tni2) and the calculation of the output TVL (tno) are given by:

• negation $f_o = \overline{f}$ (complement): cpl tni tno
• conjunction $f_o = f_1 \wedge f_2$ (intersection): isc tni1 tni2 tno
• disjunction $f_o = f_1 \vee f_2$ (union): uni tni1 tni2 tno
• antivalence $f_o = f_1 \oplus f_2$ (symmetric difference): syd tni1 tni2 tno
• equivalence $f_o = f_1 \odot f_2$ (complement of the symmetric difference): csd tni1 tni2 tno

An ordered set of variables can be defined as an XBOOLE object called variables tuple (VT) using:

    vtin sni vtno

followed by the variables in the needed order as in the case of the command avar. Two such VTs define ordered pairs of variables which can be exchanged using:

    cco tni vtni1 vtni2 tno

where the exchange of columns of tni is realized for all defined pairs of variables. Alternatively, the VTs can be directly specified within a special XBOOLE command:

    cco tni tno

which exchanges, e.g., column x1 with column x8 and column x2 with column x9. A single VT is used to specify the variables for a vectorial or m-fold derivative operation; e.g.:

    maxk tni vtni tno

calculates the k-fold maximum with regard to all variables specified in VT vtni, and

    maxk tni tno

calculates the 2-fold maximum with regard to (x1, x2).

Example 3. Bent functions [7–9] are the most non-linear functions which are needed in cryptography. The simplest bent functions are specified by the BDE:

$$ \frac{\partial^2 f(x_1, x_2)}{\partial x_1 \partial x_2} = 1 . \tag{39} $$

Using (37) we get the equivalent BDE:

$$ \frac{\partial f(x_1, x_2)}{\partial x_1} \oplus \frac{\partial f(x_1, x_2)}{\partial x_2} \oplus \frac{\partial f(x_1, x_2)}{\partial (x_1, x_2)} = 1 . \tag{40} $$

This BDE contains two simple derivatives and one vectorial derivative. Hence, it can be solved using Algorithm 1. The associated Boolean equation of (40) is

$$ u_1 \oplus u_2 \oplus u_3 = 1 . \tag{41} $$

     1 space 32 1
     2 avar 1
     3 u0 u1 u2 u3 v0 v1 v2 v3.
     4 sbe 1 1
     5 u1#u2#u3=1.
     6 sbe 1 2
     7 v0=u0,
     8 v1=u0#u1,
     9 v2=u0#u2,
    10 v3=u0#u3.
    11 isc 1 2 3
    12 vtin 1 4
    13 u0 u1 u2 u3.
    14 maxk 3 4 5
    15 vtin 1 6
    16 v0 v2.
    17 vtin 1 7
    18 v1 v3.
    19 cco 5 6 7 8
    20 isc 5 8 9
    21 vtin 1 10
    22 v0 v1.
    23 vtin 1 11
    24 v2 v3.
    25 cco 9 10 11 12
    26 isc 9 12 13

Fig. 4. Listing of the PRP to solve the BDE (40).

Figure 4 shows the PRP that can be used by the XBOOLE-Monitor to solve BDE (40). The solution of (41) is stored as XBOOLE object 1. The lines 6 to 14 in Figure 4 realize the d2v transformation of line 2 of Algorithm 1 so that the XBOOLE object 5 stores SLS′(v). The first sweep of the loop in lines 3 to 6 of Algorithm 1 is executed in lines 15 to 20, and the second sweep of this loop leads to the XBOOLE object 13 as final result in lines 21 to 26 of Figure 4.

Table 4 shows in the left part the solution TVL of BDE (40). The eight v-vectors describe two classes of Boolean functions. The expressions of these functions are built by the substitution of the v-vectors of the solution into (28).
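The PRP of Figure 4 can be cross-checked outside the XBOOLE-Monitor. The following Python sketch mirrors the steps of Algorithm 1 with plain sets of tuples instead of TVLs. The names `d2v` and `epv` are borrowed from Algorithm 1, but the function bodies, the name `solve_bde_classes`, and the exhaustive enumeration of all u-vectors are assumptions made for this illustration (they are not XBOOLE operations), and the approach is only practical for very small n.

```python
from itertools import product


def d2v(u):
    """Map a local solution u = (f, derivatives) to local function values v.

    Follows lines 7 to 10 of Figure 4: v0 = u0 and vj = u0 XOR uj.
    """
    return (u[0],) + tuple(u[0] ^ uj for uj in u[1:])


def epv(vectors, i):
    """Exchange the index pairs of column i of Table 3 (index XOR 2**(i-1))."""
    bit = 1 << (i - 1)
    return {tuple(v[j ^ bit] for j in range(len(v))) for v in vectors}


def solve_bde_classes(n, local_equation):
    """Sketch of Algorithm 1 for a BDE without explicit variables x_i."""
    size = 1 << n
    # Line 1: solve the associated Boolean equation by enumeration.
    sls_u = [u for u in product((0, 1), repeat=size) if local_equation(u)]
    # Line 2: transform the local solutions from u to v.
    sls_v = {d2v(u) for u in sls_u}
    # Lines 3 to 6: intersect with the exchanged set for every variable.
    for i in range(1, n + 1):
        sls_v &= epv(sls_v, i)
    return sls_v


# Associated Boolean equation (41): u1 XOR u2 XOR u3 = 1.
solutions = solve_bde_classes(2, lambda u: (u[1] ^ u[2] ^ u[3]) == 1)
for v in sorted(solutions):
    print(v)
```

Each printed tuple is one vector (v0, v1, v2, v3) that can be substituted into (28).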
Table 4 shows these functions in the right part, associated with the classes 1 and 2.

Tab. 4. Solution functions of BDE (40)

    v0 v1 v2 v3    class 1                                      class 2
    0  1  1  1     $x_1 \vee x_2$
    1  0  0  0                                                  $\overline{x}_1 \wedge \overline{x}_2$
    1  0  1  1     $\overline{x}_1 \vee x_2$
    0  1  0  0                                                  $x_1 \wedge \overline{x}_2$
    0  0  0  1                                                  $x_1 \wedge x_2$
    1  1  1  0     $\overline{x}_1 \vee \overline{x}_2$
    1  1  0  1     $x_1 \vee \overline{x}_2$
    0  0  1  0                                                  $\overline{x}_1 \wedge x_2$

4 Boolean differential equations for lattices of Boolean functions

4.1 Lattices of incompletely specified Boolean functions

A lattice of Boolean functions is a special set of functions that has the following properties:

• if both $f_1(\mathbf{x})$ and $f_2(\mathbf{x})$ belong to the lattice, then $f(\mathbf{x}) = f_1(\mathbf{x}) \wedge f_2(\mathbf{x})$ also belongs to this lattice of Boolean functions, and
• if both $f_1(\mathbf{x})$ and $f_2(\mathbf{x})$ belong to the lattice, then $f(\mathbf{x}) = f_1(\mathbf{x}) \vee f_2(\mathbf{x})$ also belongs to this lattice of Boolean functions.

A lattice of Boolean functions is very often used for the design of a digital circuit. From another point of view, such a lattice is sometimes called an incompletely specified function (ISF). One method to describe the lattice of functions of an ISF uses two mark functions called:

1. ON-set $q(\mathbf{x})$: all functions of the lattice must be equal to one for $q(\mathbf{x}) = 1$, and
2. OFF-set $r(\mathbf{x})$: all functions of the lattice must be equal to zero for $r(\mathbf{x}) = 1$.

Using these known mark functions, each function $f(\mathbf{x})$ of the set of functions $\{f_i(\mathbf{x})\}$ must satisfy the inequalities:

$$ f(\mathbf{x}) \geq q(\mathbf{x}) , \tag{42} $$

$$ f(\mathbf{x}) \leq \overline{r(\mathbf{x})} , \tag{43} $$

which can be transformed into the equivalent restrictions:

$$ \overline{f(\mathbf{x})} \wedge q(\mathbf{x}) = 0 , \tag{44} $$

$$ f(\mathbf{x}) \wedge r(\mathbf{x}) = 0 . \tag{45} $$

The system of equations (44) and (45) can be merged into a single Boolean equation:

$$ \overline{f(\mathbf{x})} \wedge q(\mathbf{x}) \vee f(\mathbf{x}) \wedge r(\mathbf{x}) = 0 . \tag{46} $$

The functions $q(\mathbf{x})$ and $r(\mathbf{x})$ in (46) are known and can be represented by expressions of Boolean variables connected by Boolean operations. The term $f(\mathbf{x})$ in (46) describes the unknown functions of the lattice. Hence, the equation (46) fits the type of Boolean differential equations (29), where only $f(\mathbf{x})$ and variables $x_i$ but no simple or vectorial derivatives occur. All functions of a lattice which is specified by the BDE (46) can be calculated using Algorithm 2.

Example 4. Let $q(\mathbf{x}) = x_1 x_3 \vee \overline{x}_2 \overline{x}_3$ and $r(\mathbf{x}) = \overline{x}_1 x_2$ be the given mark functions of a lattice of Boolean functions $f(\mathbf{x})$. Using (46), we get the BDE of this lattice:

$$ \overline{f(\mathbf{x})} \wedge (x_1 x_3 \vee \overline{x}_2 \overline{x}_3) \vee f(\mathbf{x}) \wedge \overline{x}_1 x_2 = 0 \tag{47} $$

and the associated Boolean equation:

$$ \overline{u}_0 \wedge (x_1 x_3 \vee \overline{x}_2 \overline{x}_3) \vee u_0 \wedge \overline{x}_1 x_2 = 0 . \tag{48} $$

Figure 5 shows the PRP that is used in the XBOOLE-Monitor to solve the BDE (47). After the definition of the Boolean space $B^{32}$ in line 1, the used variables are defined in lines 2 to 5, and the associated Boolean equation is solved in lines 6 to 8. The BDE (47) contains only $f(\mathbf{x})$ out of the vector $\nabla f(\mathbf{x})$ so that the transformation d2v can be restricted to the mapping of u0 to v0 in lines 9 to 14. Because the BDE contains variables $x_i$, Algorithm 2 must be used to separate the set of solution functions. All steps of the loop of lines 3 to 8 of Algorithm 2 are realized in lines 15 to 53 of the PRP in Figure 5. The lines 15 to 27 of the PRP describe the first sweep of this loop for x1. The indices of the variables vi are taken from the first four rows of column i = 1 of Table 3, where the VT 11 uses the left values and the VT 12 the right values. The intermediate solution of this sweep is stored into the same XBOOLE object 5 that represents the function S of Algorithm 2. Hence, the fragment of lines 15 to 27 of the first sweep of the loop can be reused in lines 28 to 40 for the second sweep with x2 and the VTs specified by column i = 2 of Table 3, and in lines 41 to 53 for the third sweep with x3 and the VTs specified by column i = 3 of Table 3, respectively.

Tab. 5. Solution functions of BDE (47)

    v0 v1 v2 v3 v4 v5 v6 v7    solution functions
    1  1  0  1  1  1  0  1     $f_1(\mathbf{x}) = x_1 \vee \overline{x}_2$
    1  1  0  1  0  1  0  1     $f_2(\mathbf{x}) = x_1 \vee \overline{x}_2 \overline{x}_3$
    1  1  0  0  1  1  0  1     $f_3(\mathbf{x}) = x_1 x_3 \vee \overline{x}_2$
    1  1  0  0  0  1  0  1     $f_4(\mathbf{x}) = x_1 x_3 \vee \overline{x}_2 \overline{x}_3$

     1 space 32 1
     2 avar 1
     3 u0
     4 v0 v1 v2 v3 v4 v5 v6 v7
     5 x1 x2 x3.
     6 sbe 1 1
     7 /u0&(x1&x3+/x2&/x3)+
     8 u0&(/x1&x2)=0.
     9 sbe 1 2
    10 v0=u0.
    11 isc 1 2 3
    12 vtin 1 4
    13 u0.
    14 maxk 3 4 5
    15 sbe 1 6
    16 x1=0.
    17 isc 5 6 7
    18 maxk 7 6 8
    19 cpl 6 6
    20 isc 5 6 9
    21 maxk 9 6 10
    22 vtin 1 11
    23 v0 v2 v4 v6.
    24 vtin 1 12
    25 v1 v3 v5 v7.
    26 cco 10 11 12 13
    27 isc 8 13 5
    28 sbe 1 6
    29 x2=0.
    30 isc 5 6 7
    31 maxk 7 6 8
    32 cpl 6 6
    33 isc 5 6 9
    34 maxk 9 6 10
    35 vtin 1 11
    36 v0 v1 v4 v5.
    37 vtin 1 12
    38 v2 v3 v6 v7.
    39 cco 10 11 12 13
    40 isc 8 13 5
    41 sbe 1 6
    42 x3=0.
    43 isc 5 6 7
    44 maxk 7 6 8
    45 cpl 6 6
    46 isc 5 6 9
    47 maxk 9 6 10
    48 vtin 1 11
    49 v0 v1 v2 v3.
    50 vtin 1 12
    51 v4 v5 v6 v7.
    52 cco 10 11 12 13
    53 isc 8 13 5

Fig. 5. Listing of the PRP to solve the BDE (47).

Table 5 enumerates the four solution functions of the BDE (47) calculated by the XBOOLE-Monitor using the PRP of Figure 5. The columns v3 and v4 show that this lattice contains all functions which are smaller than or equal to the supremum function $f_1(\mathbf{x})$ and greater than or equal to the infimum function $f_4(\mathbf{x})$.

4.2 Generalized lattices of Boolean functions

Each derivative operation transforms a given lattice of Boolean functions into another lattice of Boolean functions. The resulting lattices are more general because the function values 0 and 1 can be chosen not only for a single x-pattern, but also for pairs of x-patterns. The proof that all derivative operations of any lattice result in such a generalized lattice is given for the first time in [10].

The generalized lattices are described in [10] by the mark functions $q(\mathbf{x})$ and $r(\mathbf{x})$ as well as an independence matrix (IDM) that has the shape of an echelon, with all elements below the main diagonal equal to 0. The IDM describes all independent directions of change in which no function of the lattice changes its values. Associated with the independence matrix IDM, an independence function $f^{id}(\mathbf{x})$ is defined in [10]. Knowing the mark functions $q(\mathbf{x})$ and $r(\mathbf{x})$ and the independence function $f^{id}(\mathbf{x})$, a Boolean equation given in [10] allows us to check for each function $f(\mathbf{x})$ whether it belongs to the generalized lattice or not.

Instead of the two mark functions and the independence matrix, we suggest here a single Boolean differential equation to describe a generalized lattice of Boolean functions.
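The echelon shape of an independence matrix can be illustrated with ordinary Gaussian elimination over GF(2). The sketch below is not one of the algorithms of [10]; it only assumes that each direction of change is encoded as a 0/1 row over (x1, ..., xn) and reduces a list of such rows to a reduced echelon form, from which linearly dependent directions drop out. The function name `gf2_echelon` and the sample rows are illustrative.

```python
def gf2_echelon(rows):
    """Reduced row echelon form over GF(2); rows are tuples of 0/1 values."""
    rows = [list(r) for r in rows]
    n_cols = len(rows[0]) if rows else 0
    pivot_row = 0
    for col in range(n_cols):
        # Find a row at or below pivot_row with a 1 in this column.
        pivot = next(
            (r for r in range(pivot_row, len(rows)) if rows[r][col]), None
        )
        if pivot is None:
            continue
        rows[pivot_row], rows[pivot] = rows[pivot], rows[pivot_row]
        # Clear this column in all other rows (addition in GF(2) is XOR).
        for r in range(len(rows)):
            if r != pivot_row and rows[r][col]:
                rows[r] = [a ^ b for a, b in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
    # Rows below pivot_row are all zero, i.e. dependent directions.
    return [tuple(r) for r in rows[:pivot_row]]


# Three directions of change over (x1, x2, x3):
# (x1, x2), (x1, x3) and (x2, x3).
directions = [(1, 1, 0), (1, 0, 1), (0, 1, 1)]
independent = gf2_echelon(directions)
print(independent)
```

For these three directions only two rows remain, because the third direction is the XOR of the other two.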
The directions in which no function of the lattice changes its values are described in a restrictive BDE by simple and vectorial derivatives which are connected by disjunctions (∨):

$$ \bigvee_{i=1}^{n} \frac{\partial f(\mathbf{x})}{\partial \mathbf{x}_{0i}} = 0 , \tag{49} $$

where

• i is the row index of the IDM,
• values 1 in the row of the IDM specify the variables of $\mathbf{x}_{0i}$, and
• $\frac{\partial f(\mathbf{x})}{\partial \mathbf{x}_{0i}} = 0$ if all elements of the i-th row in IDM(f) are equal to 0.

A BDE (49) can be solved by Algorithm 1, and the special structure of the BDE ensures that the set of classes of solution functions describes a lattice of Boolean functions.

Example 5. The generalized lattice of the Boolean function f(x1, x2, x3) in which all functions do not change their function values if either x1 and x3 or x2 and x3 are commonly changed at the same point in time can be described by the BDE:

$$ \frac{\partial f(\mathbf{x})}{\partial (x_1, x_3)} \vee \frac{\partial f(\mathbf{x})}{\partial (x_2, x_3)} = 0 \tag{50} $$

and the associated Boolean equation:

$$ u_5 \vee u_6 = 0 . \tag{51} $$

Figure 6 shows the PRP that is used in the XBOOLE-Monitor to solve the BDE (50). After the definition of the Boolean space $B^{32}$ in line 1, the used variables are defined in lines 2 to 4, and the associated Boolean equation is solved in lines 5 to 6. The BDE (50) contains only two vectorial derivatives out of the vector $\nabla f(\mathbf{x})$ so that the transformation d2v can be restricted to the mapping of u5 to v5 and u6 to v6 in lines 7 to 14, where u0 is needed as counterpart to v0. Because no variables $x_i$ occur explicitly, Algorithm 1 can be used to separate the classes of solution functions. All steps of the loop of lines 3 to 6 of Algorithm 1 are realized in lines 15 to 32 of the PRP in Figure 6. The lines 15 to 20 of the PRP describe the first sweep of this loop for the exchange of x1. The indices of the variables vi are taken from the first four rows of column i = 1 of Table 3, where the VT 6 uses the left values and the VT 7 the right values. The intermediate solution of this sweep is stored into the same XBOOLE object 5 that represents the function SLS′ of Algorithm 1. Hence, the fragment of lines 15 to 20 of the first sweep of the loop can be reused in lines 21 to 26 for the second sweep with regard to x2 and the VTs specified by column i = 2 of Table 3, and in lines 27 to 32 for the third sweep with regard to x3 and the VTs specified by column i = 3 of Table 3, respectively.

     1 space 32 1
     2 avar 1
     3 u0 u5 u6
     4 v0 v1 v2 v3 v4 v5 v6 v7.
     5 sbe 1 1
     6 u5+u6=0.
     7 sbe 1 2
     8 v0=u0,
     9 v5=u0#u5,
    10 v6=u0#u6.
    11 isc 1 2 3
    12 vtin 1 4
    13 u0 u5 u6.
    14 maxk 3 4 5
    15 vtin 1 6
    16 v0 v2 v4 v6.
    17 vtin 1 7
    18 v1 v3 v5 v7.
    19 cco 5 6 7 8
    20 isc 5 8 5
    21 vtin 1 6
    22 v0 v1 v4 v5.
    23 vtin 1 7
    24 v2 v3 v6 v7.
    25 cco 5 6 7 8
    26 isc 5 8 5
    27 vtin 1 6
    28 v0 v1 v2 v3.
    29 vtin 1 7
    30 v4 v5 v6 v7.
    31 cco 5 6 7 8
    32 isc 5 8 5

Fig. 6. Listing of the PRP to solve the BDE (50).

Tab. 6. Solution functions of BDE (50)

    v0 v1 v2 v3 v4 v5 v6 v7    class    solution function
    1  1  1  1  1  1  1  1     1        $f_1(\mathbf{x}) = 1$
    1  0  0  1  0  1  1  0     2        $f_2(\mathbf{x}) = \overline{x}_1 \oplus x_2 \oplus x_3$
    0  1  1  0  1  0  0  1     2        $f_3(\mathbf{x}) = x_1 \oplus x_2 \oplus x_3$
    0  0  0  0  0  0  0  0     3        $f_4(\mathbf{x}) = 0$

Table 6 enumerates the four solution functions of the BDE (50) calculated by the XBOOLE-Monitor using the PRP of Figure 6. The solution of the BDE (50) consists of three classes of Boolean functions. The class 1 only contains the function $f_1(\mathbf{x}) = 1$, which is the supremum $\overline{r(\mathbf{x})}$ of the lattice. Due to (24), a solution class in $B^3$ generally contains $2^3 = 8$ functions.
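The four solution functions of Table 6 can also be confirmed independently of Algorithm 1 by testing every function of three variables directly against the BDE (50). The sketch below assumes the usual definition of the vectorial derivative, f(x) ⊕ f(x with the selected variables complemented), and the index convention of (28), in which x1 is the least significant bit of the index of v. The helper names are hypothetical.

```python
from itertools import product

N = 3
SIZE = 1 << N


def vectorial_derivative(v, mask):
    """Truth vector of f(x) XOR f(x with the variables in mask complemented)."""
    return tuple(v[j] ^ v[j ^ mask] for j in range(SIZE))


def satisfies_bde_50(v):
    """Check that both vectorial derivatives of (50) vanish at every point."""
    d13 = vectorial_derivative(v, 0b101)  # direction (x1, x3)
    d23 = vectorial_derivative(v, 0b110)  # direction (x2, x3)
    return not any(a | b for a, b in zip(d13, d23))


def is_lattice(functions):
    """Closure of a set of truth vectors under conjunction and disjunction."""
    fs = set(functions)
    return all(
        tuple(a & b for a, b in zip(f, g)) in fs
        and tuple(a | b for a, b in zip(f, g)) in fs
        for f in fs
        for g in fs
    )


solutions = sorted(
    v for v in product((0, 1), repeat=SIZE) if satisfies_bde_50(v)
)
for v in solutions:
    print(v)
print(is_lattice(solutions))
```

The enumeration visits all 256 truth vectors, so it scales only to very small numbers of variables; it serves as a cross-check, not as a replacement for the set-based algorithms.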
two times four of these functions are identical in case of class 2: f2(x) = x1 ⊕ x2 ⊕ x3 = x1 ⊕ x2 ⊕ x3 = x1 ⊕ x2 ⊕ x3 = x1 ⊕ x2 ⊕ x3 , f3(x) = x1 ⊕ x2 ⊕ x3 = x1 ⊕ x2 ⊕ x3 = x1 ⊕ x2 ⊕ x3 = x1 ⊕ x2 ⊕ x3 . the class 3 only contains the function f4(x) = 0 that is the infimum q(x) of the lattice. the solution lattice contains only two of 28 − 2 = 254 functions of b3 which are greater than q(x) = 0 and smaller than r(x) = 1. nevertheless, the four function of table 6 constitute a lattice—both the conjunction and the disjunction of any pair of these function result in one of these functions. it should be mentioned that there are four different bde having the same lattice of solutions shown in table 6. the most extended bde with this solution lattice is: ∂ f (x) ∂ (x1, x2) ∨ ∂ f (x) ∂ (x1, x3) ∨ ∂ f (x) ∂ (x2, x3) = 0 . (52) the equivalent other three bde with the same solution are built by omitting one of the three vectorial derivatives: ∂ f (x) ∂ (x1, x3) ∨ ∂ f (x) ∂ (x2, x3) = 0 , (53) ∂ f (x) ∂ (x1, x2) ∨ ∂ f (x) ∂ (x2, x3) = 0 , (54) ∂ f (x) ∂ (x1, x2) ∨ ∂ f (x) ∂ (x1, x3) = 0 , (55) where (53) is identical with (50) and used in example 5. the reason for these alternative bde with the same solution lattice is that only two of the three directions of change in (52) of the vectorial derivatives are linearly independent. two algorithms in [10] describe how the unique independent directions of change can be found. the bde (53) is the unique representative bde for the solution lattice of table 6. example 5 explores a generalized lattice with the special mark functions q(x) = 0 and r(x) = 0, in order to emphasize that only a special subset of functions between these two mark function belong to the lattice. however, this restriction is not necessary. the bde (46) can be combined with the bde (49) to the bde of a most 70 b. steinbach, c. posthoff boolean differential equations – a common model for classes, lattices... 71 122 b. steinbach and c. 
posthoff general lattice: f (x) ∧ q(x) ∨ f (x) ∧ r(x) ∨ n ∨ i=1 ∂ f (x) ∂ x0i = 0 . (56) example 6. we assume that a generalized lattice of the boolean function f(x) is described by the bde: f (x) ∧ ((x1 ⊕ x3) ∧ (x2 ⊕ x4)) ∨ f (x) ∧ ((x1 ⊕ x3) ∧ (x2 ⊕ x4))∨ ∂ f (x) ∂ (x1, x3) ∨ ∂ f (x) ∂ (x2, x4) = 0 (57) which has the associated boolean equation: f (x) ∧ ((x1 ⊕ x3) ∧ (x2 ⊕ x4)) ∨ f (x) ∧ ((x1 ⊕ x3) ∧ (x2 ⊕ x4)) ∨ u5 ∨ u10 = 0 . (58) figure 7 shows the prp that is used to solve the bde (57) of a most general lattice of boolean functions. after the definition of the boolean space b32 in line 1; the used variables are defined in lines 2 to 8, and the associated boolean equation is solved in lines 9 to 12. the bde (57) contains only f (x) and two vectorial derivatives out of the vector ∇ f (x) so that the transformation d2v can be restricted to the mapping of u0 to v0, u5 to v5, and u10 to v10, in lines 13 to 20. due to the existing variables xi algorithm 2 must be used to separate the set of solution functions which have the structure of a lattice due to the structure of the bde to be solved. all steps of the loop of lines 3 to 8 of algorithm 2 are realized in lines21 to80of theprpinfigure 7. the lines21 to35of theprpdescribe thefirst sweep of this loop for x1. the indices of the variables vi are taken from the column i = 1 of table 3 where the vt 11 uses the left values and the vt 12 the right values. the intermediate solution of this sweep is stored to the same xboole-object 5 that represent the function s of algorithm 2. hence, the fragment of lines 21 to 35 of the first sweep of the loop can be reused in lines 36 to 50 for the second sweep with x2 and the vts specified by column i = 2 of table 3, in lines 51 to 65 for the third sweep with x3 and the vts specified by column i = 3, and finally in lines 66 to 80 for the fourth sweep with x4 and the vts specified by column i = 4 of table 3, respectively. 
table 7 enumerates the four solution functions of the bde (57) calculated by the xboole-monitor using the prp of figure 7. the function values in the columns v0, . . . , v15 confirm that the calculated four functions form a lattice. it can also be seen in these columns that not all functions which are smaller than the supremum f1 and larger than the infimum f4 belong to this generalized lattice of boolean functions. 72 b. steinbach, c. posthoff boolean differential equations – a common model for classes, lattices... 73 122 b. steinbach and c. posthoff general lattice: f (x) ∧ q(x) ∨ f (x) ∧ r(x) ∨ n ∨ i=1 ∂ f (x) ∂ x0i = 0 . (56) example 6. we assume that a generalized lattice of the boolean function f(x) is described by the bde: f (x) ∧ ((x1 ⊕ x3) ∧ (x2 ⊕ x4)) ∨ f (x) ∧ ((x1 ⊕ x3) ∧ (x2 ⊕ x4))∨ ∂ f (x) ∂ (x1, x3) ∨ ∂ f (x) ∂ (x2, x4) = 0 (57) which has the associated boolean equation: f (x) ∧ ((x1 ⊕ x3) ∧ (x2 ⊕ x4)) ∨ f (x) ∧ ((x1 ⊕ x3) ∧ (x2 ⊕ x4)) ∨ u5 ∨ u10 = 0 . (58) figure 7 shows the prp that is used to solve the bde (57) of a most general lattice of boolean functions. after the definition of the boolean space b32 in line 1; the used variables are defined in lines 2 to 8, and the associated boolean equation is solved in lines 9 to 12. the bde (57) contains only f (x) and two vectorial derivatives out of the vector ∇ f (x) so that the transformation d2v can be restricted to the mapping of u0 to v0, u5 to v5, and u10 to v10, in lines 13 to 20. due to the existing variables xi algorithm 2 must be used to separate the set of solution functions which have the structure of a lattice due to the structure of the bde to be solved. all steps of the loop of lines 3 to 8 of algorithm 2 are realized in lines21 to80of theprpinfigure 7. the lines21 to35of theprpdescribe thefirst sweep of this loop for x1. the indices of the variables vi are taken from the column i = 1 of table 3 where the vt 11 uses the left values and the vt 12 the right values. 
The intermediate solution of this sweep is stored in the same XBOOLE-object 5 that represents the function S of Algorithm 2. Hence, the fragment of lines 21 to 35 of the first sweep of the loop can be reused in lines 36 to 50 for the second sweep with $x_2$ and the VTs specified by column i = 2 of Table 3, in lines 51 to 65 for the third sweep with $x_3$ and the VTs specified by column i = 3, and finally in lines 66 to 80 for the fourth sweep with $x_4$ and the VTs specified by column i = 4 of Table 3.

Table 7 enumerates the four solution functions of the BDE (57) calculated by the XBOOLE-monitor using the PRP of Figure 7. The function values in the columns v0, ..., v15 confirm that the calculated four functions form a lattice. These columns also show that not all functions which are smaller than the supremum $f_1$ and larger than the infimum $f_4$ belong to this generalized lattice of Boolean functions.

     1  space 32 1
     2  avar 1
     3  u0 u5 u10
     4  v0 v1 v2 v3
     5  v4 v5 v6 v7
     6  v8 v9 v10 v11
     7  v12 v13 v14 v15
     8  x1 x2 x3 x4.
     9  sbe 1 1
    10  /u0 & (/(x1 # x3) & (x2 # x4)) +
    11  u0 & ((x1 # x3) & (x2 # x4)) +
    12  u5 + u10 = 0.
    13  sbe 1 2
    14  v0 = u0,
    15  v5 = u0 # u5,
    16  v10 = u0 # u10.
    17  isc 1 2 3
    18  vtin 1 4
    19  u0 u5 u10.
    20  maxk 3 4 5
    21  sbe 1 6
    22  x1 = 0.
    23  isc 5 6 7
    24  maxk 7 6 8
    25  cpl 6 6
    26  isc 5 6 9
    27  maxk 9 6 10
    28  vtin 1 11
    29  v0 v2 v4 v6
    30  v8 v10 v12 v14.
    31  vtin 1 12
    32  v1 v3 v5 v7
    33  v9 v11 v13 v15.
    34  cco 10 11 12 13
    35  isc 8 13 5
    36  sbe 1 6
    37  x2 = 0.
    38  isc 5 6 7
    39  maxk 7 6 8
    40  cpl 6 6
    41  isc 5 6 9
    42  maxk 9 6 10
    43  vtin 1 11
    44  v0 v1 v4 v5
    45  v8 v9 v12 v13.
    46  vtin 1 12
    47  v2 v3 v6 v7
    48  v10 v11 v14 v15.
    49  cco 10 11 12 13
    50  isc 8 13 5
    51  sbe 1 6
    52  x3 = 0.
    53  isc 5 6 7
    54  maxk 7 6 8
    55  cpl 6 6
    56  isc 5 6 9
    57  maxk 9 6 10
    58  vtin 1 11
    59  v0 v1 v2 v3
    60  v8 v9 v10 v11.
    61  vtin 1 12
    62  v4 v5 v6 v7
    63  v12 v13 v14 v15.
    64  cco 10 11 12 13
    65  isc 8 13 5
    66  sbe 1 6
    67  x4 = 0.
    68  isc 5 6 7
    69  maxk 7 6 8
    70  cpl 6 6
    71  isc 5 6 9
    72  maxk 9 6 10
    73  vtin 1 11
    74  v0 v1 v2 v3
    75  v4 v5 v6 v7.
    76  vtin 1 12
    77  v8 v9 v10 v11
    78  v12 v13 v14 v15.
    79  cco 10 11 12 13
    80  isc 8 13 5

Fig. 7. Listing of the PRP to solve the BDE (57).

Tab. 7. Solution functions of BDE (57)

    v0  v1  v2  v3  v4  v5  v6  v7  v8  v9  v10 v11 v12 v13 v14 v15  solution function
    1   1   1   0   1   1   0   1   1   0   1   1   0   1   1   1    $f_1(\mathbf{x}) = \overline{(x_1 \oplus x_3)} \vee \overline{(x_2 \oplus x_4)}$
    1   0   1   0   0   1   0   1   1   0   1   0   0   1   0   1    $f_2(\mathbf{x}) = \overline{x_1 \oplus x_3}$
    0   1   1   0   1   0   0   1   1   0   0   1   0   1   1   0    $f_3(\mathbf{x}) = x_1 \oplus x_2 \oplus x_3 \oplus x_4$
    0   0   1   0   0   0   0   1   1   0   0   0   0   1   0   0    $f_4(\mathbf{x}) = \overline{(x_1 \oplus x_3)} \wedge (x_2 \oplus x_4)$

5 Conclusions

The Boolean Differential Calculus significantly extends the field of application of Boolean algebra. Simple and vectorial derivative operations evaluate pairs of function values in the selected direction of change. The values of m-fold derivative operations depend on the values of the given function in whole subspaces.

An unknown function $f(\mathbf{x})$ and derivative operations of this function appear in Boolean expressions on both sides of a Boolean differential equation (BDE). The solution of each BDE is a set of Boolean functions. This is an important extension of a Boolean equation, which has a set of binary vectors as its solution.

The explored algorithms allow us to solve BDEs which describe either sets of classes of Boolean functions or arbitrary sets of Boolean functions. We demonstrated that these algorithms can easily be implemented using the XBOOLE-monitor. The problem programs (PRP) used show all details of the introduced algorithms to solve a BDE. Using the XBOOLE-library, all these steps can be wrapped by special programs such that the set of solution functions of a BDE is created automatically.
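As a cross-check that does not rely on the XBOOLE-monitor, the solution set of the BDE (57) can also be found by brute force over all $2^{16}$ functions of four variables. The sketch below is only an illustration, not part of the PRP. It assumes the index convention i = x1 + 2·x2 + 4·x3 + 8·x4 for the value $v_i$ (consistent with the variable tuples in lines 28 to 33 of the PRP), and it takes the two mark functions from lines 10 and 11 of the PRP, where `/` denotes negation. The helper names `a`, `b`, `solves_bde57` and `is_lattice` are hypothetical.

```python
from itertools import product


def a(i):
    """x1 xor x3 for the input vector with index i = x1 + 2*x2 + 4*x3 + 8*x4."""
    return (i & 1) ^ ((i >> 2) & 1)


def b(i):
    """x2 xor x4 under the same index convention."""
    return ((i >> 1) & 1) ^ ((i >> 3) & 1)


def solves_bde57(f):
    """Return True if the truth vector f (16 values) satisfies BDE (57) for every x."""
    for i in range(16):
        q = (1 - a(i)) & b(i)   # mark function that forces f = 1 (PRP line 10)
        r = a(i) & b(i)         # mark function that forces f = 0 (PRP line 11)
        d13 = f[i] ^ f[i ^ 5]   # vectorial derivative with respect to (x1, x3)
        d24 = f[i] ^ f[i ^ 10]  # vectorial derivative with respect to (x2, x4)
        if ((1 - f[i]) & q) | (f[i] & r) | d13 | d24:
            return False
    return True


def is_lattice(funcs):
    """Check closure of a set of truth vectors under bitwise AND and OR."""
    s = set(funcs)
    return all(
        tuple(x & y for x, y in zip(g, h)) in s
        and tuple(x | y for x, y in zip(g, h)) in s
        for g in s
        for h in s
    )


# Exhaustive search over all 65536 truth vectors (v0, ..., v15).
solutions = [f for f in product((0, 1), repeat=16) if solves_bde57(f)]

for f in solutions:
    print(" ".join(map(str, f)))
print("closed under AND/OR:", is_lattice(solutions))
```

Under these assumptions the search returns exactly the four truth vectors listed in Table 7, and the closure test confirms that they form a lattice.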
The algorithms known from [4,5] are able to separate (depending on the type of the BDE) either classes of solution functions or arbitrary sets of solution functions. We presented in this paper special types of BDEs which either combine certain classes into a lattice of Boolean functions or restrict the arbitrary sets of solutions to a lattice of Boolean functions.

There is a wide field of applications of BDEs. Many examples are explained in [4]. Here, we introduced three new types of BDEs. All of them describe lattices of Boolean functions.

1. A BDE of the type (46) describes a well-known and widely used lattice of Boolean functions that can alternatively be expressed by an incompletely specified function. Such a BDE can be solved using Algorithm 2.

2. A BDE of the type (49) describes a lattice with the infimum function q(x) = 0 and the supremum function r(x) = 1 and can be solved using Algorithm 1. Such a lattice does not contain all functions which are greater than q(x) and smaller than r(x).

3. A BDE of the more general type (56) merges the BDEs of cases 1 and 2. Such a BDE must be solved using Algorithm 2, because the mark functions q(x) and r(x) are not restricted to those of case 2.

It can be very difficult to find a BDE for a needed set of Boolean functions.
However, it is an advantage that BDEs for lattices can be built in a straightforward manner based on the mentioned types. Therefore, we call BDEs of the types (46), (49), and (56) lattice-BDEs. The result of a lattice-BDE (49) or (56) is a generalized lattice of Boolean functions that cannot be expressed by an incompletely specified function. In this way these lattice-BDEs open many new fields of application, e.g., in circuit design. Conversely, the lattice-BDE (46) is a sub-type of the most general lattice-BDE (56), so that the widely used lattices of Boolean functions that fit an incompletely specified Boolean function are integrated into the new theory in a natural manner. It is a challenge for the future to utilize lattice-BDEs for specific applications.

References

[1] C. Posthoff and B. Steinbach, Logic Functions and Equations – Binary Models for Computer Science. Dordrecht: Springer, 2004.
[2] B. Steinbach and C. Posthoff, "Boolean differential calculus," in Progress in Applications of Boolean Functions. Morgan & Claypool Publishers, San Rafael, CA, USA, 2010, pp. 55–78.
[3] B. Steinbach and C. Posthoff, "Boolean differential calculus – theory and applications," Journal of Computational and Theoretical Nanoscience, vol. 7, no. 6, pp. 933–981, 2010.
[4] B. Steinbach and C. Posthoff, Boolean Differential Equations. Morgan & Claypool Publishers, 2013.
[5] B. Steinbach, "Lösung binärer Differentialgleichungen und ihre Anwendung auf binäre Systeme," Ph.D. dissertation, TH Karl-Marx-Stadt (Germany), 1981.
[6] D. Bochmann and C. Posthoff, Binäre dynamische Systeme. Munich, Vienna: Oldenbourg, 1981.
[7] O. S. Rothaus, "On 'bent' functions," J. Combinatorial Theory, vol. 20, pp. 300–305, 1976.
[8] B. Steinbach and C. Posthoff, "Classification and generation of bent functions," in Proceedings Reed-Muller 2011 Workshop, Gustavelund Conference Centre, Tuusula, Finland, 2011, pp. 81–91.
[9] B. Steinbach and C. Posthoff, "Classes of bent functions identified by specific normal forms and generated using Boolean differential equations," Facta Universitatis (Niš), vol. 24, no. 3, pp. 357–383, 2011.
[10] B. Steinbach, "Generalized lattices of Boolean functions utilized for derivative operations," in Materiały konferencyjne KNWS'13, Łagów, Poland, 2013, pp. 1–17.

Facta Universitatis, Series: Electronics and Energetics, Vol. 32, No 2, June 2019, pp. 267-285
https://doi.org/10.2298/fuee1902267r

Accurate computation of magnetic induction generated by HV overhead power lines

Djekidel Rabah (1), Bessedik Sid Ahmed (1), Samar Akef (2)
(1) Department of Electrical Engineering, Laghouat University, Algeria
(2) Department of Electrical Engineering, Cairo University, Egypt

Abstract. This paper proposes a 3D quasi-static numerical model for calculating the magnetic induction produced by high voltage overhead power lines. It uses the current simulation technique (CST) combined with the particle swarm optimization algorithm (PSO) to determine the appropriate position and number of the filamentary current loops for an accurate computation. The exact catenary form of the power line conductors is taken into account in this calculation. The simulation results show a pronounced effect of the conductor sag on the magnetic induction distribution, especially at mid-span, where the magnetic induction is highest. The maximum magnetic induction at 1 m above the ground level is 8.87 μT at the mid-span point; at the pylon foot, the maximum value is significantly reduced, to 3.94 μT. These values respect the limits set by the ICNIRP guidelines for magnetic induction strength for occupational and public exposure.
The simulation results for the magnetic induction are compared with those obtained from the 3-D integration method, and a fairly good agreement is found.

Key words: current simulation technique (CST), magnetic induction, sag effect, 3-D integration method, particle swarm optimization (PSO)

Received October 10, 2018; received in revised form February 5, 2019
Corresponding author: Djekidel Rabah, Department of Electrical Engineering, University of Amar Telidji of Laghouat, BP 37G route of Ghardaïa, Laghouat 03000, Algeria (e-mail: rabah03dz@live.fr)

1. Introduction

Population growth raises the demand for electric energy and accelerates the concentration of transmission lines with a high operating voltage level. These power lines create electric and magnetic fields, which has raised serious questions about the potential health and environmental effects associated with high field intensities around the lines. The possible effects of electric and magnetic fields on human health and the environment are discussed in several research projects [1-4]. The limits of exposure to electromagnetic fields (EMF) are derived from the International Commission on Non-Ionizing Radiation Protection (ICNIRP). At the low frequency of 50 Hz, the reference levels for public exposure are 200 μT (magnetic induction) and 5 kV/m (electric field). The corresponding reference levels for occupational exposure are 1 mT and 10 kV/m [5].

The accurate evaluation of the electric and magnetic fields generated by electric power lines is very important in many areas of research and necessary in many applications. In recent years, several publications have addressed the calculation of the electric and magnetic fields created by power transmission lines.
Most assume that the power line conductors are straight, horizontal and parallel to a flat ground; the sag due to the conductor weight is either neglected or introduced by taking an average height between the maximum and the minimum height of the power line [6-8].

In this paper, in order to obtain a more accurate computation of the magnetic induction distribution around electric power transmission lines, a 3-D quasi-static numerical model combining the current simulation technique (CST) with particle swarm optimization (PSO) is used. The CST is an effective approach that is well adapted to the simulation of overhead power transmission lines and their particularities, such as bundled conductors and the constraints posed by the line geometry. This calculation takes into account the catenary form of the overhead power line conductors. The main difficulty in this technique is choosing the optimal number and position coordinates of the filamentary line currents in the sub-conductors [9]. To solve this constrained optimization problem, we use the particle swarm optimization (PSO) method. PSO is a powerful optimization algorithm, inspired by the behavior of a flock of birds, which is capable of finding a global optimum solution [10,11]. In order to verify the accuracy of the combined method, the results are compared with those obtained using the 3-D integration approach.

2. Current simulation technique (CST)

By analogy with the charge simulation method (CSM), which is applied to transmission lines to calculate the electric field, it is possible to define a current simulation technique (CST) for calculating the magnetic field of the line conductors. In this technique, for a three-phase bundled conductor line with m sub-conductors per phase, each sub-conductor current is simulated by a finite number n of filamentary line currents distributed on a fictitious cylindrical surface of radius $R_j$.
The simulation currents $I_j$ must satisfy the following conditions [9,12,13]:

1. Zero normal component of the magnetic field strength on the sub-conductors' surfaces, following the Biot–Savart law.
2. The sum of the filamentary line currents simulating the sub-conductor current must be equal to the sub-conductor current.

To determine the unknown filamentary currents, a set of equations is formulated at a number of boundary points chosen on the sub-conductors' surfaces to satisfy the boundary conditions as follows [9,12,13]:

$$A_i = \sum_{j=1}^{3nm} P_{ij}\, I_j = 0, \qquad i = 1, 2, \ldots, 3m(n-1) \tag{1}$$

$$\sum_{j=(q-1)n+1}^{qn} I_j = I_{cq}, \qquad q = 1, 2, 3, \ldots, 3m \tag{2}$$

where $P_{ij}$ is the magnetic normal field coefficient determined by the coordinates of the i-th boundary point and the j-th filamentary line current. It is given by [12]:

$$P_{ij} = \frac{\mu_0}{2\pi} \ln\frac{R_j}{r_{ij}} \tag{3}$$

where $r_{ij}$ is the distance between the simulation current point (j) and the match point (i), and $R_j$ is the radius of the filamentary line current, as shown in Fig. 1. Another expression for the magnetic coefficient is used in [9, 13]:

$$P_{ij} = \frac{1}{2\pi r_{ij}} \sin(\theta_{ij}) \tag{4}$$

where the angle $\theta_{ij}$ is obtained from the angles shown in Fig. 1.

Fig. 1 Normal and tangential field components at a point on the sub-conductor surface

Once equations (1) and (2) are solved for the unknown filamentary line currents, the deviation of the normal component of the magnetic field strength from zero is evaluated at a set of check points (match points) chosen on the sub-conductors' surfaces. With the values and positions of the simulation currents known, the distribution of the magnetic field in any region can then be calculated. The horizontal and vertical components of the magnetic flux density at any point in the space around the HV power line are given by the following equations [12-15]:

$$\begin{aligned}
B_{xj} &= -\frac{\mu_0}{2\pi} \sum_{j=1}^{3nm} I_j \left[ \frac{y_i - y_j}{r_{ij}^2} - \frac{y_i + y_j + D_{erc}}{r_{ij}'^2} \right] \\
B_{yj} &= \frac{\mu_0}{2\pi} \sum_{j=1}^{3nm} I_j \left[ \frac{x_i - x_j}{r_{ij}^2} - \frac{x_i - x_j}{r_{ij}'^2} \right]
\end{aligned} \tag{5}$$

where $(x_i, y_i)$ and $(x_j, y_j)$ are the coordinates of the observation point and of the simulation line current, respectively; $r_{ij}$ is the distance between each conductor and the observation point above ground; $r'_{ij}$ is the distance between each image conductor and the observation point (see Fig. 2).

For the magnetic field calculation, the image of a filamentary current of a sub-conductor is not located at a depth equal to the real sub-conductor height above ground. The additional depth $D_{erc}$, called the depth of penetration, can be expressed as follows [14]:

$$D_{erc} = 658.87 \sqrt{\frac{\rho_s}{f}} \tag{6}$$

where $\rho_s$ is the electrical resistivity of the earth in Ω·m and f is the frequency of the source current in Hz.

Fig. 2 Magnetic field generated by a real current and its image at an observation point

The magnitude of the total magnetic induction at any desired point P is calculated by combining the horizontal and vertical components:

$$B_t = \sqrt{B_{xj}^2 + B_{yj}^2} \tag{7}$$

3. Particle swarm optimization (PSO)

PSO is a robust stochastic optimization algorithm developed by Russell C. Eberhart and James Kennedy in 1995, inspired by the social behavior of bird flocking and fish schooling. It uses a number of agents (particles) that constitute a swarm moving around in the search space to find globally optimal solutions in nonlinear and high-dimensional spaces.
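The swarm search described here can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses the common inertia-weight update with the settings reported in Table 1 (swarm size 20, 100 iterations, c1 = c2 = 2, w = 0.5), clamps positions to the search bounds (an added assumption), and minimizes a hypothetical stand-in objective `toy_objective`, since the real fitness requires the full CST model.

```python
import random


def pso(objective, bounds, swarm_size=20, iterations=100,
        w=0.5, c1=2.0, c2=2.0, seed=0):
    """Minimize `objective` over the box `bounds` with a basic PSO loop."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Initialize the swarm uniformly inside the search space, at rest.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(swarm_size)]
    vel = [[0.0] * dim for _ in range(swarm_size)]
    p_best = [p[:] for p in pos]
    p_val = [objective(p) for p in pos]
    g_idx = min(range(swarm_size), key=p_val.__getitem__)
    g_best, g_val = p_best[g_idx][:], p_val[g_idx]
    history = [g_val]  # best fitness found so far, per iteration

    for _ in range(iterations):
        for k in range(swarm_size):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + cognitive pull + social pull.
                vel[k][d] = (w * vel[k][d]
                             + c1 * r1 * (p_best[k][d] - pos[k][d])
                             + c2 * r2 * (g_best[d] - pos[k][d]))
                # Position update, clamped to the bounds (assumption).
                pos[k][d] = min(hi, max(lo, pos[k][d] + vel[k][d]))
            val = objective(pos[k])
            if val < p_val[k]:          # update the particle's own best
                p_best[k], p_val[k] = pos[k][:], val
                if val < g_val:         # update the global best
                    g_best, g_val = pos[k][:], val
        history.append(g_val)
    return g_best, g_val, history


def toy_objective(p):
    """Hypothetical stand-in for the fitness: a bowl with its minimum at (12, 0.05)."""
    return (p[0] - 12.0) ** 2 + 1.0e4 * (p[1] - 0.05) ** 2


# Search ranges borrowed from Table 1: number of filaments and fictitious radius [m].
best, best_val, hist = pso(toy_objective, [(2.0, 30.0), (0.01, 0.1)])
print(best, best_val)
```

In the actual problem the number of filaments is an integer, so a real implementation would round the first coordinate before evaluating the fitness; that detail is left out of this sketch.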
In the main loop of the algorithm, the velocities and positions of the particles are iteratively updated using the following equations [16-18]:

$$v_{i+1} = w\, v_i + c_1 r_1 \left(p_{best_i} - x_i\right) + c_2 r_2 \left(g_{best_i} - x_i\right) \tag{8}$$

$$x_{i+1} = x_i + v_{i+1} \tag{9}$$

where $x_i$ and $v_i$ are the position and velocity of particle i; $p_{best_i}$ and $g_{best_i}$ are, respectively, the local best position obtained by particle i and the global best position found so far in the entire population; w is a parameter (the inertia weight) controlling the flying dynamics; $r_1$ and $r_2$ are random variables in the range [0, 1]; $c_1$ and $c_2$ are factors controlling the relative weighting of the corresponding terms.

The PSO algorithm consists of the following steps:

1. Initialize the swarm from the solution space
2. Evaluate the fitness of each particle
3. Update the individual and global bests
4. Update the velocity and position of each particle
5. Go to step 2, and repeat until a termination condition has been reached

The objective function (fitness function) used in this method is based on the relative error between the magnetic coefficient estimated by the match current points and the magnetic potential of the simulation current points. It is given by:

$$OF = \frac{1}{N_t} \sum_{i=1}^{N_t} \left| \frac{A_{ci} - A_{mi}}{A_{ci}} \right| \cdot 100 \tag{10}$$

where $A_{ci}$ is the magnetic potential calculated by the current points, $A_{mi}$ is the new magnetic potential estimated by the match simulation current points, and $N_t$ is the total number of check points.

4. 3D integration technique

Fig. 3 shows a conductor span suspended freely between two adjacent pylons, which are at the same level and separated by a horizontal distance L. The conductor takes the form of a catenary curve, provided that it is perfectly flexible and that its weight is uniformly distributed along its length [19-25].

Fig.
3 The basic catenary geometry for a single conductor line

The equation of the catenary shape of a conductor placed in the yz plane is given by:

$$y(z) = h_{min} + 2\alpha \sinh^2\!\left(\frac{z}{2\alpha}\right) \tag{11}$$

where z is the longitudinal position along the z axis (for a symmetrical line, z = 0 is normally chosen at mid-span); $h_{min}$ is the minimum height at mid-span; $h_{max}$ is the maximum height at the ends of the span; α is the solution of the transcendental equation

$$2\left[\frac{h_{max} - h_{min}}{L}\right] u = \sinh^2(u) \tag{12}$$

with $u = L/(4\alpha)$, where L is the span length between two pylons, in meters.

The parameter α is also associated with the mechanical parameters of the line (catenary constant):

$$\alpha = \frac{T_h}{w_c} \tag{13}$$

where $T_h$ is the horizontal tension at the low point of the conductor curve (N) and $w_c$ is the linear weight of the conductor (N/m).

Fig. 4 shows the profile of the catenary of an overhead conductor. The magnetic induction generated at an arbitrary point $P(x_0, y_0, z_0)$ by a sagging conductor of an overhead power line with span L between pylons can be determined by applying the Biot–Savart law [21-25]:

$$\mathbf{B} = \frac{\mu_0 I}{4\pi} \int_L \frac{d\mathbf{l} \times \mathbf{r}}{r^3} \tag{14}$$

where $\mu_0$ is the permeability of free space; I is the line current; $\mathbf{r}$ is the distance vector from the source point (x, y, z) to the field point $P(x_0, y_0, z_0)$, given by:

$$\mathbf{r} = (x_0 - x)\,\mathbf{i} + (y_0 - y)\,\mathbf{j} + (z_0 - z)\,\mathbf{k} \tag{15}$$

$d\mathbf{l}$ is the differential element in the direction of the current. From the geometry shown in Fig. 4 it follows that

$$d\mathbf{l} = dy\,\mathbf{j} + dz\,\mathbf{k} = \sinh\!\left(\frac{z}{\alpha}\right) dz\,\mathbf{j} + dz\,\mathbf{k} \tag{16}$$

The magnetic field produced at any point above the ground by M sagging phase conductors and their images, which account for the effect of a conducting ground, is determined by applying the superposition principle. The total magnetic field is given by [14, 21-25]:

$$\mathbf{B}_{res} = \sum_{i=1}^{M} \left[ B_{xi}\,\mathbf{a}_x + B_{yi}\,\mathbf{a}_y + B_{zi}\,\mathbf{a}_z \right] \tag{17}$$

where:

$$B_{xi} = \frac{\mu_0 I_i}{4\pi} \int_{-L/2}^{L/2} \left[ \frac{\sinh\!\left(\frac{z_i}{\alpha}\right)(z - z_i) - (y - y_i)}{d_i} - \frac{\sinh\!\left(\frac{z_i}{\alpha}\right)(z - z_i) + (y + y_i + D_{erc})}{d_i'} \right] dz_i \tag{18}$$

$$B_{yi} = \frac{\mu_0 I_i}{4\pi} \int_{-L/2}^{L/2} \left[ \frac{x - x_i}{d_i} - \frac{x - x_i}{d_i'} \right] dz_i \tag{19}$$

$$B_{zi} = -\frac{\mu_0 I_i}{4\pi} \int_{-L/2}^{L/2} \left[ \frac{\sinh\!\left(\frac{z_i}{\alpha}\right)(x - x_i)}{d_i} - \frac{\sinh\!\left(\frac{z_i}{\alpha}\right)(x - x_i)}{d_i'} \right] dz_i \tag{20}$$

Fig. 4 Sagging conductor of an overhead power line between two adjacent pylons

with:

$$d_i = \left[(x - x_i)^2 + (y - y_i)^2 + (z - z_i)^2\right]^{3/2} \tag{21}$$

$$d_i' = \left[(x - x_i)^2 + (y + y_i + D_{erc})^2 + (z - z_i)^2\right]^{3/2} \tag{22}$$

where $(x_i, y_i, z_i)$ are the coordinates of the conductor and L is the distance between two towers (the span length).

This calculation also takes into account the induced currents circulating in the earth wires. These currents can be calculated by the relation given below [18]:

$$[I_g] = -[Z_{ii}]^{-1}\,[Z_{ij}]\,[I_c] \tag{23}$$

where $Z_{ii}$ is the self-impedance matrix of the earth wires, $Z_{ij}$ is the matrix of mutual impedances between the phase conductors and the earth wires, and $I_c$ is the matrix of currents passing through the phase conductors.

The self and mutual impedances per unit length, obtained from the Carson-Clem formulae, are given by [18-26]:

$$Z_{ii} = R_i + \pi^2 f \cdot 10^{-4} + j\omega \cdot 2 \cdot 10^{-4} \ln\!\left(\frac{D_{erc}}{R_{gm}}\right) \quad [\Omega/\mathrm{km}] \tag{24}$$

$$Z_{ij} = \pi^2 f \cdot 10^{-4} + j\omega \cdot 2 \cdot 10^{-4} \ln\!\left(\frac{D_{erc}}{d_{ij}}\right) \quad [\Omega/\mathrm{km}] \tag{25}$$

where $R_i$ is the DC resistance per unit length of the conductor (Ω/km); $R_{gm}$ is the geometric mean radius of the conductor (m); $d_{ij}$ is the distance between the centers of two conductors of the power line; ω is the angular frequency; and $D_{erc}$ is the depth of the equivalent earth return conductor given in equation (6).

Fig.
5 Tower-to-tower geometry, showing mid-span between two towers

In the present work, a single-circuit 275 kV overhead power line with a symmetrical horizontal phase conductor configuration and two earth wires is considered (see Fig. 5). The three-phase currents on the power line are assumed to be balanced, with a magnitude of 500 A. The earth is assumed to be homogeneous with a resistivity of 100 Ω·m, and the nominal frequency is f = 50 Hz. A simplified schematic diagram of this transmission line, with the arrangement and geometric details in the vicinity of the suspension pylon, is shown in Fig. 6.

Fig. 6 275 kV single-circuit three-phase overhead transmission line (dimension labels in the figure: radius 14.6 mm, 20 m, 26 m, 7 m, 10 m, radius 31.8 mm, 0.4 m)

6. Results and discussions

The first step in this study is to determine the currents induced in the ground wires of the power line, so that their effect can be taken into account:

$$I_{g1} = 18.41\, e^{\,i(65.21^\circ)}\ \mathrm{A}, \qquad I_{g2} = 18.51\, e^{\,i(-162.98^\circ)}\ \mathrm{A}$$

The second step consists in determining the optimal number of current filament points and their locations. This search for the best parameters is carried out using a particle swarm optimization algorithm (PSO). The input parameters used in the PSO algorithm and the CST technique are presented in Table 1.

Table 1 Parameter settings used for the PSO algorithm and the CST technique

    Algorithm+CSM parameters
    Current simulation technique (CST)
        Range of current filament points:                  2–30
        Range of fictitious radius for phase conductor:    0.01–0.1 m
        Range of fictitious radius for ground wire:        0.003–0.013 m
    Particle swarm optimisation algorithm (PSO)
        Swarm size:          n = 20
        Learning factors:    c1 = c2 = 2
        Weights:             wmin = 0.5 and wmax = 0.5
        Maximum iterations:  100

The convergence of the objective function of equation (10) with the number of iterations is shown in Fig. 7. The search process of the algorithm over successive iterations, together with the optimal solutions, is shown in Figs. 8 and 9; the PSO algorithm clearly converges rapidly to these values.

Fig. 7 Convergence of the objective function used in the PSO algorithm (fitness function versus iteration number)

Fig. 8 Convergence of the optimum values of the number of filamentary line currents (phase conductor and ground wire)

Fig. 9 Convergence of the optimum values of the position of the filamentary current (phase conductor and ground wire)

Fig. 10 shows the magnetic induction distribution under the power line at 1 m above the ground level, at the pylon foot and at mid-span, at different points across the transmission line corridor. The magnetic induction is maximum under the middle phase conductor and decreases rapidly with the lateral distance. In general, the magnetic induction distribution follows a nearly Gaussian shape.

Fig. 10 Magnetic induction profile at mid-span and pylon foot calculated at 1 m above the ground level

Fig. 11 illustrates the longitudinal profile of the magnetic induction under the middle phase conductor, where the magnetic induction is greatest. The magnetic induction immediately below the lowest point of the power line (mid-span) is significantly higher than in the proximity of a pylon and at some distance from the line.
This illustrates that taking the conductor sag into account in the magnetic induction calculation is a practical way to model the real behavior of the power line. Owing to the sag, the maximum value of the magnetic induction increases from 3.94 μT at the pylon foot to 8.87 μT at mid-span. The maximum values obtained are below the thresholds defined by the ICNIRP guidelines.

Fig. 11 Longitudinal magnetic induction profile calculated at 1 m above the ground level

Fig. 12 Contour lines of the magnetic induction strength around the phase conductors

The contour lines of the magnetic induction distribution around the power line in the xy plane are depicted in Fig. 12. The magnetic induction is highest around and under the power line conductors and decreases rapidly with the distance from the pylon axis. The different levels of magnetic induction result from the variation of the coordinates (x, y) of the calculation points above the ground level.

Fig. 13 Mapping of the magnetic induction generated by the HV power line

Fig.
14 Three-dimensional (3-D) magnetic induction profile at 1 m above the ground level

Fig. 13 shows the mapping of the magnetic induction of the power line in an area defined by the height above ground and the lateral distance from the pylon axis. The highest levels of magnetic induction occur around the surface of the phase conductors; the magnetic induction gradually decreases with increasing lateral distance from the center of the power line in both directions of the transmission line corridor.

Fig. 14 shows the three-dimensional profile of the magnetic induction over an area covering the right-of-way on either side of the transmission line center and the longitudinal span between the suspension pylons. The highest levels of magnetic induction are confined to a small area under and in the immediate vicinity of the conductors at mid-span. The magnetic induction then decreases slowly toward the pylons and with increasing lateral distance from the power line center in both directions along the right-of-way corridor.

Fig. 15 Magnetic induction profiles for different relative permeability values of soil at pylon foot

Fig. 16 Magnetic induction profile at mid-span and pylon foot calculated at 1 m above the ground level, obtained with the analytical formula of magnetostatics

Fig. 15 shows the effect of the relative permeability of the soil on the lateral profile of the magnetic induction at 1 m above the ground level, underneath the power line at the pylon foot. An increase in the relative permeability of the soil in the range investigated (μr = 1 to 5) results in a slight increase of the magnetic induction.
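The mid-span concentration visible in Figs. 11 and 14 follows directly from the conductor height given by equations (11) and (12). As a rough illustration, the sketch below solves the transcendental equation (12) for u = L/(4α) by bisection and then evaluates the catenary height (11) along the span. The numerical values (span 300 m, heights 20 m and 10 m) are illustrative assumptions only and are not claimed to be the line data of the paper; the function names are hypothetical.

```python
import math


def catenary_alpha(h_max, h_min, span):
    """Solve 2*((h_max - h_min)/span)*u = sinh(u)**2 for u > 0 and return alpha = span/(4*u)."""
    k = (h_max - h_min) / span  # requires h_max > h_min

    def g(u):
        return math.sinh(u) ** 2 - 2.0 * k * u

    # g is negative just above u = 0 and positive for large u: bracket the root.
    lo, hi = 1e-9, 1.0
    while g(hi) <= 0.0:
        hi *= 2.0
    for _ in range(200):  # plain bisection
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    u = 0.5 * (lo + hi)
    return span / (4.0 * u)


def conductor_height(z, alpha, h_min):
    """Catenary height of equation (11); z = 0 is the mid-span point."""
    return h_min + 2.0 * alpha * math.sinh(z / (2.0 * alpha)) ** 2


# Illustrative (assumed) geometry: 300 m span, 20 m at the pylons, 10 m at mid-span.
SPAN, H_MAX, H_MIN = 300.0, 20.0, 10.0
alpha = catenary_alpha(H_MAX, H_MIN, SPAN)
for z in (0.0, 50.0, 100.0, 150.0):
    print(f"z = {z:6.1f} m  height = {conductor_height(z, alpha, H_MIN):6.2f} m")
```

Because the field of a line current falls off with distance, the lower conductor height near z = 0 is what raises the induction at mid-span relative to the pylon foot.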
Fig. 16 shows the magnetic induction profile obtained by modelling each original conductor with a single current filament located at the center of the conductor cross-section. For the same geometrical configuration, the magnetic induction obtained at mid-span and at the pylon foot is approximately the same as that obtained by the current simulation technique (CST).

Fig. 17 Magnetic induction profile at 1 m above the ground level calculated by the 3-D integration method

In order to validate the method adopted in this study, the 3-D integration method was applied, taking the conductor sag into consideration. Fig. 17 shows the lateral distribution of the magnetic induction calculated by the 3-D integration method at 1 m above the ground level, at the pylon foot and at mid-span. The magnetic induction is maximum at the center of the power line and is progressively reduced as one moves away from the center of the transmission line corridor, reaching negligible values far from the power line center.

Fig. 18 illustrates the effect of the height of the calculation point above the ground on the magnetic induction at the point where the latter is maximum (under the middle phase conductor, x = 0), using the combined method and the 3-D integration technique. Increasing the height of the calculation point increases the amplitude of the magnetic induction. The magnetic induction values calculated in the vicinity of the surface of the phase conductors using the combined method are well correlated with those obtained by the 3-D integration technique.
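To make the 3-D integration check more concrete, the Biot–Savart integral of equations (14) to (16) can be approximated numerically for one sagging conductor. The sketch below is a deliberately reduced illustration, not the authors' code: it treats a single conductor over one span, omits the image conductor and the earth-wire currents, and uses a simple midpoint rule. The function name `b_sagging_conductor`, the catenary constant of 1125 m and the geometry are assumptions made for the example; only the current magnitude of 500 A comes from the text.

```python
import math

MU0 = 4.0e-7 * math.pi  # permeability of free space [H/m]


def b_sagging_conductor(current, alpha, h_min, span, point, x_c=0.0, steps=4000):
    """Midpoint-rule Biot-Savart integral over one catenary span.

    Simplifications (assumptions): one conductor at lateral position x_c,
    no image conductor, no earth wires. Returns (Bx, By, Bz) in tesla.
    """
    x0, y0, z0 = point
    dz = span / steps
    bx = by = bz = 0.0
    for n in range(steps):
        z = -span / 2.0 + (n + 0.5) * dz
        y = h_min + 2.0 * alpha * math.sinh(z / (2.0 * alpha)) ** 2  # eq. (11)
        slope = math.sinh(z / alpha)                                  # dy/dz, eq. (16)
        rx, ry, rz = x0 - x_c, y0 - y, z0 - z                         # eq. (15)
        r3 = (rx * rx + ry * ry + rz * rz) ** 1.5
        # Components of dl x r with dl = (0, slope, 1) dz
        bx += (slope * rz - ry) / r3 * dz
        by += rx / r3 * dz
        bz += -slope * rx / r3 * dz
    c = MU0 * current / (4.0 * math.pi)
    return c * bx, c * by, c * bz


def magnitude(b):
    return math.sqrt(sum(component * component for component in b))


# Assumed geometry: 300 m span, 10 m clearance at mid-span, alpha = 1125 m.
mid_span = b_sagging_conductor(500.0, 1125.0, 10.0, 300.0, (0.0, 1.0, 0.0))
pylon = b_sagging_conductor(500.0, 1125.0, 10.0, 300.0, (0.0, 1.0, 150.0))
print(f"|B| at mid-span: {magnitude(mid_span) * 1e6:.2f} uT")
print(f"|B| at span end: {magnitude(pylon) * 1e6:.2f} uT")
```

A useful sanity check for such a routine is the flat limit: for a very large α the catenary degenerates into a straight finite wire, for which the Biot–Savart integral has a closed form.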
the curves of the two methods coincide.

[fig. 17: magnetic induction [µT] versus lateral distance [m], curves at pylon foot and at mid-span.]

fig. 18 magnetic induction profile as a function of the observation point height above ground

the lateral distribution of the magnetic induction computed by the combined cst+pso method was compared with that calculated by the 3-d integration method. as shown in fig. 19, a very good agreement is achieved: the two sets of curves coincide.

fig. 19 comparison of magnetic induction levels between the proposed method and the 3-d integration method

[fig. 18: magnetic induction [µT] versus height of calculation point above ground [m]. fig. 19: magnetic induction [µT] versus lateral distance [m]. curves in both figures: mid-span and pylon foot, each for cst+pso and for the 3-d integration method.]

fig. 20 comparison between the magnetic induction values obtained using the proposed method and the measured values reported in reference [22]

in order to confirm the accuracy of the presented method, the obtained results were compared with measured values available in the literature [22]. as can be seen from the comparison shown in fig. 20, the simulated values of the magnetic induction are close to the measured ones.
it should also be noted that most of the measured values at 1 m above the ground level are lower than the calculated values, because the calculation neglects the metallic objects located in the immediate vicinity of the power line, which act as shields. nevertheless, a very good agreement is achieved, which validates the accuracy of the presented method.

7. conclusions

in this paper, a 3-d quasi-static numerical model for computation of the magnetic induction generated by overhead power lines is presented. a particle swarm optimization (pso) algorithm is applied to obtain the most appropriate position and number of the filamentary current loops used in the current simulation technique (cst), so that the solution is sufficiently accurate. the results show that the magnetic induction is maximum at the center of the power line and decreases with increasing lateral distance. the magnetic induction immediately below the lowest point of the power line is significantly higher than in the proximity of the pylon and at some distance from the line; the magnetic induction around the pylon is much lower than at mid-span. the calculated magnetic induction under the hv power line is within the icnirp safety limits for general public and occupational exposure. the results obtained by the proposed method were compared with those obtained by the 3-d integration method. the simulation results are almost identical and are visually superimposed; this comparison is sufficient to confirm the accuracy of the combined method.

[fig. 20: magnetic induction [nT] versus lateral distance [m], curves for cst+pso and measured values.]

references [1] ch. j. portier, m.s.
wolfe, "assessment of health effects from exposure to power line frequency electric and magnetic fields", working group report, niehs and emfrapid, august 1998. [2] k. olden, "health effects from exposure to power-line frequency electric and magnetic fields", national institute of environmental health sciences, niehs report, pl 102-486, section 2118, 1999. [3] m. havas, "biological effects of low frequency electromagnetic fields", chapter 10, electromagnetic environments and health in buildings, spon press, london, pp. 207-232, 2004. [4] t. samaras, "preliminary opinion on potential health effects of exposure to electromagnetic fields", scientific committee on emerging and newly identified health risks scenihr, health effects of emf, november 2013. [5] icnirp, "international commission on non-ionizing radiation protection, "guidelines for limiting exposure to time-varying electric and magnetic fields (1hz to 100 khz) ", health physics, vol. 99, no.6, pp. 818–836, 2010. [6] cigré, "electric and magnetic field produced by transmission systems", working group 01 (interference and fields) of study committee 36, paris, 1980. [7] m.l.p. filho, j.r. cardoso, c.a.f. sartori, m.c. costa, b.p. de-alvarenga, a.b. dietrich, l. m. r. mendes , "upgrading urban high voltage transmission line: impact on electric and magnetic fields in the environment", in proceedings of the ieee/pes transmission and distribution conference and exposition, vol. 8, 2004, pp. 788–793. [8] k. a. vyas, j. g. jamnani, "analysis and design optimization of 765 kv transmission line based on electric and magnetic fields for different line configurations", in proceedings of the ieee 6th international conference on power systems (icps), march 2016, pp. 1–6. [9] m. abdel-salam, h. abdallah, m. th. el-mohandes, h. el-kishky, "calculation of magnetic fields from electric power transmission lines", electric power systems research, vol. 49, no. 2, pp. 99–105, march 1999. [10] r. djekidel, a. ameur, d. mahi, a. 
hadjadj, "electrostatic interference calculation from h-v power lines to aerial pipelines using hybrid pso-csm approach", in proceedings of the ieee 9th jordanian international electrical and electronics engineering conference, pp.1–6, oct 2015. [11] s. a. bessedik, h. hadi, "prediction of flashover voltage of insulators using least squares support vector machine with particle swarm optimization", electric power systems research, vol. 104, pp. 87– 92, 2013. [12] d. yao, b. li, j. deng, d. huang, x. wu, "power frequency magnetic field of heavy current transmit electricity lines based on simulation current method", in proceedings of the ieee automation congress, 2008, pp. 1–4. [13] r. radwan, m. abdel-salam, a.b. mahdy, m. samy, "laboratory validation of calculations of magnetic field mitigation underneath transmission lines using passive and active shield wires", innovative systems design and engineering, vol. 2, no. 4, pp. 218–232, 2011. [14] j.r. riba ruiz, a. g. espinosa, "magnetic field generated by sagging conductors of overhead power lines", computer applications in engineering education, vol. 19, no 4, pp.787–794, 2011. [15] a. r. memari, w. janischewskyj, "mitigation of magnetic field near power lines", ieee transactions on power delivery, vol. 11, no. 3, pp. 1577–1586, jul 1996. [16] j. kennedy, r c. eberhart, "particle swarm optimization", in proceedings of the ieee international conference on neural networks, australia, vol. 19, 1995, pp. 1942–1948. [17] f. rebahi, a. bentounsi, h. bouchekara, r. rebbah, "optimization design of a doubly salient 8/6 srm based on three computational intelligence methods", turkish journal of electrical engineering & computer sciences, vol. 24, no. 5, pp. 4454–4464, 2016. [18] r. djekidel, s. a. bessedik, p. spiteri, d. mahi, " passive mitigation for magnetic coupling between hv power line and aerial pipeline using pso algorithms optimization", electric power systems research, vol. 165, pp. 18–26, 2018. [19] k. budink, w. 
machczynski, j. szymenderski, "voltage induced by currents in power line sagged conductors in nearby circuits of arbitrary configuration", archives of electrical engineering, vol. 64, no. 2, pp. 227–236, 2015. [20] m. p. arabani, b. porkar, s. porkar, "the influence of conductor sag on spatial distribution of transmission line magnetic field", cigre, paris, paper b2-202, 2004. [21] a. z. el-dein, "magnetic-field calculation under ehv transmission lines for more realistic cases", ieee transactions on power delivery, vol. 24, no. 4, pp. 2214–2222, oct 2009.
[22] i. n. ztoupis, i. f. gonos, i. a. stathopulos, "calculation of power frequency fields from high voltage overhead lines in residential areas", in proceedings of the 18th international symposium on high voltage engineering, paper a-01, 2013, pp. 61–69. [23] k. deželak, g. stumberger, f. jakl, "emissions of electromagnetic fields caused by sagged overhead power lines", przegląd elektrotechniczny, vol. 87, no. 3, pp. 29–32, 2011. [24] m. perić, s. aleksić, "influence of conductor sag on magnetic field distribution in vicinity of power lines", international journal of emerging sciences, vol. 1, no. 4, pp. 564–574, december 2011. [25] a. z. el-dein, "the effects of the span configurations and conductor sag on the magnetic field distribution under overhead transmission lines", journal of physics, vol. 1 no. 2, pp. 11–23, july 2012. [26] m. albano, r. turri, s. dessanti, a. haddad, h. griffiths, b. howat, "computation of the electromagnetic coupling of parallel untransposed power lines", ieee power engineering conference, vol. 1, pp. 303-307, 2006. 6279 facta universitatis series: electronics and energetics vol. 33, n o 4, december 2020, pp.
605-616 https://doi.org/10.2298/fuee2004605l © 2020 by university of niš, serbia | creative commons license: cc by-nc-nd computation of per-unit-length internal impedance of a multilayer cylindrical conductor with possible dielectric layers dino lovrić, slavko vujević, ivan krolo university of split, faculty of electrical engineering, mechanical engineering and naval architecture, split, croatia abstract. in this manuscript, a novel method for computation of per-unit-length internal impedance of a cylindrical multilayer conductor with conductive and dielectric layers is presented in detail. in addition to this, formulas for computation of electric and magnetic field distribution throughout the entire multilayer conductor (including dielectric layers) have been derived. the presented formulas for electric and magnetic field in conductive layers have been directly derived from maxwell equations using modified bessel functions. however, electric and magnetic field in dielectric layers has been computed indirectly from the electric and magnetic fields in contiguous conductive layers which reduces the total number of unknowns in the system of equations. displacement currents have been disregarded in both conductive and dielectric layers. this is justifiable if the conductive layers are good conductors. the validity of introducing these approximations is tested in the paper versus a model that takes into account displacement currents in all types of layers. key words: internal impedance, multilayer cylindrical conductor, dielectric layers, conductive layers, modified bessel functions. 1. introduction conductors composed of different types of materials are often used in a number of engineering applications [1,2]. since each material used in the conductor has certain advantages and disadvantages, by carefully combining different types of conductor materials one can obtain a structure in which advantages of one material used negates the disadvantages of another material. 
however, the resulting multilayer structure becomes more challenging to accurately model. this is for example the case when performing various electromagnetic compatibility analyses [3,4], harmonic and transient analyses of transmission lines [5] as well as harmonic and transient analyses of grounding systems [6,7]. received march 27, 2020; received in revised form may 25, 2020 corresponding author: dino lovrić faculty of electrical engineering, mechanical engineering and naval architecture, r. boskovica 32, 21 000 split, croatia e-mail: dlovric@fesb.hr 606 d. lovrić, s. vujević, i. krolo in order to obtain the distribution of electric and magnetic fields inside the multilayer structure, authors in the available literature mainly utilize a cascade of two-port networks [8-10]. this approach leads to certain numerical instabilities that are inherent to the transfer matrix of the system, where some elements of the matrix tend to infinity for high frequencies even for extra thin layers, which was demonstrated in [11]. in paper [11], however, the authors derive the equations for computation of electric and magnetic field distribution within the multilayer structure directly from maxwell equations and base the solutions on modified bessel functions [12] which have proven to be the most numerically stable choice [13-15]. accurate distribution in all layers is obtained by forming a system of linear equations from boundary conditions which is then easily solved. the formulas are derived to maximize numerical stability and robustness of the proposed algorithm. in the model from [11] all layers of the multilayer conductor are characterized by electrical conductivity, permittivity and permeability, hence both conductive and displacement currents have been taken into account in all types of layers. in this paper, a slightly different approach to model a multilayer structure is proposed and tested. 
first of all, the multilayer structure consists of two types of layers: conductive layers, made of materials that are good conductors, and dielectric layers, unlike in [11] where the layers are general. the proposed model consists of an arbitrary number of conductive layers, where a single dielectric layer can be situated between two conductive layers. secondly, displacement currents have been disregarded in all layers. this is only possible if the conductive layers are made of materials which are good electrical conductors. thirdly, in the proposed model, the conductive layers are the only layers which contribute to the formation of the system of equations. the distribution of electric and magnetic fields in dielectric layers is computed indirectly from the boundary conditions on contiguous conductive layers. the effect of introducing these simplifications is tested in the numerical examples part of the manuscript.

2. model of the multilayer cylindrical conductor with dielectric layers

the multilayer cylindrical conductor analyzed in this paper can have an arbitrary number of conductive layers (m). in addition to this, the model of the multilayer conductor allows the existence of a single dielectric layer between two conductive layers, which means that for a total number of m conductive layers there can be a maximum of m-1 dielectric layers (the last layer of the conductor is a conductive layer). an arbitrary i-th conductive layer is characterized by its internal radius $r_i^{in}$, external radius $r_i^{ex}$, electrical conductivity $\sigma_i$ and magnetic permeability $\mu_i$, whereas each of the dielectric layers is defined by its magnetic permeability $\mu_i^d$ and, indirectly, by the external and internal radii of the contiguous conductive layers. electrical permittivity does not appear in the model since the displacement currents have been disregarded in all layers.
all layer materials are considered to be linear and isotropic, and the parameters describing them are not frequency dependent. to better illustrate this, fig. 1 depicts the i-th and (i+1)-th conductive layers. the i-th dielectric layer illustrated in fig. 1 is defined by the external radius of the i-th conductive layer and the internal radius of the (i+1)-th conductive layer. if the dielectric layer is nonexistent, then $r_i^{ex} = r_{i+1}^{in}$. the formulas have been derived so that this case is taken into account directly, without modification. to keep the model simple, an i-th dielectric layer is joined to each i-th conductive layer, where i = 1, 2, ..., m-1; it can be an actual dielectric layer or a fictive dielectric layer. the last conductive layer does not have a joined dielectric layer. if the first layer is solid, then $r_1^{in} = 0$, whereas if it is tubular, then $r_1^{in} \neq 0$.

fig. 1 dielectric layer between two conductive layers

3. distribution of electric and magnetic field in conductive and dielectric layers

the formulas for the computation of electric and magnetic field inside an arbitrary i-th conductive layer of the multilayer conductor are derived directly from maxwell equations for good conductors. due to axial symmetry, the electric field only has the component in the direction of the conductor current, whereas the magnetic field only has the azimuthal component. unlike in paper [11], in this paper displacement currents have been disregarded in the entire multilayer conductor, including dielectric layers, because they are significantly smaller than the conductive currents even at higher frequencies, provided that the conductive layers can be considered good conductors. the effect of disregarding displacement currents will be tested in the numerical examples section of this paper.
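the good-conductor assumption can be checked with a short sketch that evaluates the ratio of displacement to conduction current density, ω·ε/σ, for a time-harmonic field. the conductivities below are the ones used in the numerical examples of the paper; the chosen frequencies and the assumption ε = ε0 inside the conductor are illustrative only.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity [F/m]


def displacement_to_conduction_ratio(sigma, freq, eps_r=1.0):
    """Return |J_displacement| / |J_conduction| = omega * eps / sigma."""
    omega = 2.0 * math.pi * freq
    return omega * eps_r * EPS0 / sigma


# Conductivities [S/m] from the paper's examples; frequencies [Hz] are assumed
for sigma in (1.37e6, 10e6, 59.6e6):
    for freq in (50.0, 1e6, 1e9):
        ratio = displacement_to_conduction_ratio(sigma, freq)
        print(f"sigma = {sigma:.3g} S/m, f = {freq:.3g} Hz: ratio = {ratio:.3e}")
```

even for the least conductive material and a frequency of 1 GHz the ratio stays many orders of magnitude below one, which is consistent with neglecting displacement currents in the conductive layers.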
the expressions for computation of electric and magnetic field inside an arbitrary i-th conductive layer are derived directly from maxwell equations using modified bessel functions [12] with disregarded displacement currents. for improved numerical stability of the electromagnetic models, the modified bessel functions have been scaled up/down to produce values of similar order of magnitude, thus avoiding any underflow/overflow numerical problems [13-15]:

$$I_0^s(\gamma_i r) = \exp(-\alpha_i r)\, I_0(\gamma_i r); \qquad I_1^s(\gamma_i r) = \exp(-\alpha_i r)\, I_1(\gamma_i r) \tag{1}$$

$$K_0^s(\gamma_i r) = \exp(\alpha_i r)\, K_0(\gamma_i r); \qquad K_1^s(\gamma_i r) = \exp(\alpha_i r)\, K_1(\gamma_i r) \tag{2}$$

expressions for computation of electric and magnetic field inside an arbitrary i-th conductive layer written using scaled modified bessel functions are:

$$H_i = I_{tot} \left\{ C_i^s\, I_1^s(\gamma_i r) \exp[-\alpha_i (r_i^{ex} - r)] + D_i^s\, K_1^s(\gamma_i r) \exp[-\alpha_i (r - r_i^{in})] \right\} \tag{3}$$

$$E_i = I_{tot}\, \frac{\gamma_i}{\sigma_i} \left\{ C_i^s\, I_0^s(\gamma_i r) \exp[-\alpha_i (r_i^{ex} - r)] - D_i^s\, K_0^s(\gamma_i r) \exp[-\alpha_i (r - r_i^{in})] \right\} \tag{4}$$

$$\gamma_i = \sqrt{\omega \mu_i \sigma_i}\, \exp\left(j \frac{\pi}{4}\right) = \alpha_i (1 + j) \tag{5}$$

where $I_0^s$ is the scaled complex-valued modified bessel function of the first kind of order zero, $K_0^s$ is the scaled complex-valued modified bessel function of the second kind of order zero, $I_1^s$ is the scaled complex-valued modified bessel function of the first kind of order one, $K_1^s$ is the scaled complex-valued modified bessel function of the second kind of order one, $C_i^s$ and $D_i^s$ are the unknown scaled complex-valued coefficients for the i-th conductive layer, $\gamma_i$ is the complex wave propagation constant of the i-th conductive layer, $\sigma_i$ is the electrical conductivity of the i-th conductive layer, $\mu_i$ is the magnetic permeability of the i-th conductive layer, $\omega$ is the circular frequency of the conductor current, $\alpha_i$ is the attenuation constant of the i-th conductive layer, $I_{tot}$ represents the phasor of the total multilayer conductor current and $r$ is the distance of the observation point from the axis of the multilayer cylindrical conductor.

computation of electric and magnetic field inside a dielectric layer between the i-th and (i+1)-th conductive layers is achieved indirectly, from the values of electric and magnetic fields on the outer edge of the i-th conductive layer. computation of magnetic field at an observation point located inside a dielectric layer between the i-th and (i+1)-th conductive layers is performed using ampere's law, disregarding the displacement currents. the magnetic field intensity in the dielectric layer is integrated along a curve (circle of radius r with the center located on the axis of the conductor), which produces the following equation:

$$2 \pi r H_i^d = I_{enc} \tag{6}$$

where $H_i^d$ is the magnetic field inside the dielectric layer and $I_{enc}$ is the harmonic current enclosed inside the circle of radius r. the magnetic field on the outer edge of the i-th conductive layer encloses the same amount of harmonic current, so it can be written as:

$$2 \pi r_i^{ex}\, H_i \big|_{r = r_i^{ex}} = I_{enc} \tag{7}$$

by introducing (7) into (6), the following equation is obtained for computation of magnetic field inside a dielectric layer between the i-th and (i+1)-th conductive layers:

$$H_i^d = \frac{r_i^{ex}}{r}\, H_i \big|_{r = r_i^{ex}} \tag{8}$$

as for the computation of electric field inside the dielectric layer between the i-th and (i+1)-th conductive layers, fig. 2 describes how this is achieved. the dash-dotted line represents the axis of the multilayer conductor, which lies on the z-axis of the coordinate system. the black rectangle in the figure represents the curve over which the line integral of the electric field intensity present in faraday's law of induction is evaluated, the black arrow denoting the positive path of integration as dictated by the right hand rule.

fig. 2 computation of electric field inside the dielectric layer between the i-th and (i+1)-th conductive layers

the line integral reduces to the following expression, since the values of electric field are constant along the parts of the integration curve parallel to the conductor axis and cancel each other out on the parts of the curve perpendicular to the conductor axis:

$$E_i^d = E_i \big|_{r = r_i^{ex}} + j \omega \Phi_i^d \tag{9}$$

the per-unit-length magnetic flux $\Phi_i^d$ through the surface bounded by the integration curve depicted in fig. 2 is easily obtained by integrating the magnetic flux density through the surface:

$$\Phi_i^d = \mu_i^d \int_{r_i^{ex}}^{r} H_i^d \, dr = \mu_i^d\, r_i^{ex} \ln\frac{r}{r_i^{ex}}\, H_i \big|_{r = r_i^{ex}} \tag{10}$$

where $\mu_i^d$ is the magnetic permeability of the dielectric layer located between the i-th and (i+1)-th conductive layers. by introducing (10) into (9) and rearranging the expression, one can obtain the following expression for electric field inside the dielectric layer between the i-th and (i+1)-th conductive layers:

$$E_i^d = E_i \big|_{r = r_i^{ex}} + j \omega \mu_i^d\, r_i^{ex} \ln\frac{r}{r_i^{ex}}\, H_i \big|_{r = r_i^{ex}} \tag{11}$$

since the electric and magnetic fields inside dielectric layers are computed indirectly, the number of unknown complex-valued coefficients has been reduced, unlike in paper [11] where each layer, be it conductive or dielectric, adds two unknowns to the subsequent system of equations. computation of unknown coefficients slightly varies depending on whether the multilayer conductor is a solid conductor ($r_1^{in} = 0$) or a tubular conductor ($r_1^{in} \neq 0$). the computation of unknown complex-valued coefficients from the boundary conditions is given in appendix a.

4.
per-unit-length internal impedance of the multilayer conductor

per-unit-length internal impedance of the multilayer conductor with possible dielectric layers, where the displacement currents have been disregarded in the entire conductor, is computed from the value of electric field on the outer edge of the multilayer conductor using the following expression:

$$\bar{Z} = \frac{E_m \big|_{r = r_m^{ex}}}{I_{tot}} = Z \exp(j \varphi) \tag{12}$$

where $Z$ is the modulus of the per-unit-length internal impedance of the multilayer conductor and $\varphi$ is the phase angle of the per-unit-length internal impedance of the multilayer conductor. substituting equation (4) into (12), one obtains the following expression for computation of per-unit-length internal impedance of the multilayer conductor:

$$\bar{Z} = \frac{\gamma_m}{\sigma_m} \left\{ C_m^s\, I_0^s(\gamma_m r_m^{ex}) - D_m^s\, K_0^s(\gamma_m r_m^{ex}) \exp[-\alpha_m (r_m^{ex} - r_m^{in})] \right\} \tag{13}$$

5. comparison with a model of the multilayer conductor from [11]

in paper [11], a model of a multilayer conductor is developed which consists of an arbitrary number of layers which can feature arbitrary electrical and magnetic parameters. this means that each layer has its electrical conductivity, permittivity and magnetic permeability, hence conductive and displacement currents have been taken into account in all layers. in this paper, however, a different approach is proposed which totally disregards displacement currents in all layers. in the following two examples, the effects of this will be investigated.

5.1. example 1

in the first numerical example the following multilayer conductor with four layers in total is considered:
1. $r_1^{in}$ = 0; $r_1^{ex}$ = 5 mm; σ1 = 1.37 MS/m; μ1 = 1.02·μ0; ε1 = ε0
2. $r_2^{in}$ = 5 mm; $r_2^{ex}$ = 10 mm; σ2 = 59.6 MS/m; μ2 = 0.999994·μ0; ε2 = ε0
3. $r_3^{in}$ = 10 mm; $r_3^{ex}$ = 15 mm; σ3 = 0; μ3 = μ0; ε3 = ε0
4. $r_4^{in}$ = 15 mm; $r_4^{ex}$ = 20 mm; σ4 = 10 MS/m; μ4 = μ0; ε4 = ε0

as can be seen from the previous list, the first two layers are conductive layers, the third layer is a dielectric layer, whereas the final layer is a conductive layer. these parameters have been implemented into the model from [11], which takes displacement currents into account in all layers, and into the proposed model, where displacement currents have been disregarded. per-unit-length internal impedance is computed for a set of frequencies ranging from very low to very high frequencies. fig. 3 depicts the moduli of the per-unit-length internal impedance for both models, whereas fig. 4 depicts the phase angle of the impedance. as can be observed from both figures, the curves coincide throughout the observation interval. the maximum difference between the moduli of per-unit-length internal impedances is 1.0179·10^-10, whereas the maximum difference between the phase angles of per-unit-length internal impedances is 1.6209·10^-7.

fig. 3 comparison of moduli of per-unit-length internal impedance computed by the model from [11] and the proposed model for the first example

fig. 4 comparison of phase angles of per-unit-length internal impedance computed by the model from [11] and the proposed model for the first example

5.2. example 2

in the second numerical example, a tubular multilayer conductor with seven layers in total, consisting of four thin conductive layers and three dielectric layers, is considered:
1. $r_1^{in}$ = 4 mm; $r_1^{ex}$ = 5 mm; σ1 = 59.6 MS/m; μ1 = 0.999994·μ0; ε1 = ε0
2. $r_2^{in}$ = 5 mm; $r_2^{ex}$ = 7 mm; σ2 = 0; μ2 = μ0; ε2 = ε0
3. $r_3^{in}$ = 7 mm; $r_3^{ex}$ = 8 mm; σ3 = 1.37 MS/m; μ3 = 1.02·μ0; ε3 = ε0
4. $r_4^{in}$ = 8 mm; $r_4^{ex}$ = 10 mm; σ4 = 0; μ4 = μ0; ε4 = ε0
5. $r_5^{in}$ = 10 mm; $r_5^{ex}$ = 11 mm; σ5 = 10 MS/m; μ5 = μ0; ε5 = ε0
6. $r_6^{in}$ = 11 mm; $r_6^{ex}$ = 13 mm; σ6 = 0; μ6 = μ0; ε6 = ε0
7. $r_7^{in}$ = 13 mm; $r_7^{ex}$ = 14 mm; σ7 = 59.6 MS/m; μ7 = 0.999994·μ0; ε7 = ε0

these parameters have been implemented into the model from [11], which takes displacement currents into account in all layers, and into the proposed model, where displacement currents have been disregarded. per-unit-length internal impedance is computed for a set of frequencies ranging from very low to very high frequencies. fig. 5 depicts the moduli of the per-unit-length internal impedance for both models, whereas fig. 6 depicts the phase angle of the impedance. as can be observed from both figures, the curves coincide throughout the observation interval. the maximum difference between the moduli of per-unit-length internal impedances is 1.1872·10^-10, whereas the maximum difference between the phase angles of per-unit-length internal impedances is 1.7847·10^-4.

fig. 5 comparison of moduli of per-unit-length internal impedance computed by the model from [11] and the proposed model for the second example

fig. 6 comparison of phase angles of per-unit-length internal impedance computed by the model from [11] and the proposed model for the second example

5.3. discussion

other than the presented two numerical examples, numerous comparisons have been made for different kinds of multilayer conductors and the authors came to the same conclusion. disregarding the displacement currents in both the conductive layers and the dielectric layers introduces a practically insignificant error in the model if the conductive layers are good conductors. furthermore, the number of unknowns in the system of equations is reduced since the dielectric layers are treated differently than in [11], which reduces computation time by approximately 23%.
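the reduction in the number of unknowns can be made concrete with a small sketch that stores the layers of the second example, checks the structural rules of the model (contiguous radii, no two adjacent dielectric layers, first and last layer conductive) and counts the unknowns: two per layer for the general model of [11], two per conductive layer for the proposed model. the class and function names are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    r_in: float   # internal radius [m]
    r_ex: float   # external radius [m]
    sigma: float  # electrical conductivity [S/m]; 0 marks a dielectric layer

    @property
    def is_dielectric(self):
        return self.sigma == 0.0


def check_structure(layers, tol=1e-12):
    """Raise ValueError if the layer list violates the model assumptions."""
    if layers[0].is_dielectric or layers[-1].is_dielectric:
        raise ValueError("first and last layer must be conductive")
    for inner, outer in zip(layers, layers[1:]):
        if abs(inner.r_ex - outer.r_in) > tol:
            raise ValueError("adjacent layers must share a radius")
        if inner.is_dielectric and outer.is_dielectric:
            raise ValueError("only one dielectric layer between conductors")


def unknowns_general(layers):
    """Model of [11]: every layer contributes two unknown coefficients."""
    return 2 * len(layers)


def unknowns_proposed(layers):
    """Proposed model: only conductive layers contribute (2 * m)."""
    return 2 * sum(1 for layer in layers if not layer.is_dielectric)


# Layers of example 2 (radii converted from mm to m)
EXAMPLE_2 = [
    Layer(4e-3, 5e-3, 59.6e6),
    Layer(5e-3, 7e-3, 0.0),
    Layer(7e-3, 8e-3, 1.37e6),
    Layer(8e-3, 10e-3, 0.0),
    Layer(10e-3, 11e-3, 10e6),
    Layer(11e-3, 13e-3, 0.0),
    Layer(13e-3, 14e-3, 59.6e6),
]

check_structure(EXAMPLE_2)
print(unknowns_general(EXAMPLE_2), unknowns_proposed(EXAMPLE_2))  # prints "14 8"
```

for this conductor the linear system shrinks from 14 to 8 unknowns; the counts alone do not predict the reported saving in computation time, which also depends on how the bessel functions are evaluated.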
it can also be noted here that the model presented in [11] was tested against similar models available in the literature which are based on a cascade of a set of two-port networks [8-10] and proved equally accurate and far more stable. therefore, by comparing the proposed model to the model presented in [11] one can also validate this proposed model relative to the other models available in literature. 6. conclusion in this paper a model of the multilayer conductor with conductive and dielectric layers is proposed. in the proposed model the effect of totally disregarding displacement currents in both conductive layers and dielectric layers was examined and the authors came to the conclusion that displacement currents have negligible effects on the distribution of electric and magnetic field inside a multilayered conductor even for higher frequency values. in addition to this, the dielectric layers are taken into account indirectly in a way that does not add additional unknown coefficients to the system of linear equations, which reduces computation time. references [1] s. olsen, c. traeholt, a. kuhle, o. tonnesen, m. daumling, and j. ostergaard, "loss and inductance investigations in a 4-layer superconducting prototype cable conductor, " ieee transactions on applied superconductivity, vol. 9, no. 2, pp. 833–836, 1999. [2] v. morgan, "effects of alternating and direct current, power frequency, temperature, and tension on the electrical parameters of acsr conductors, " ieee transactions on power delivery, vol. 18, no. 3, pp. 859–866, 2003. [3] z. zhihua, m. weiming, "ac impedance of an isolated flat conductor," ieee transactions on electromagnetic compatibility, vol. 44, no. 3, pp. 482–486, 2002. [4] w. mingli, f. yu, "numerical calculations of internal impedance of solid and tubular cylindrical conductors under large parameters," iee proceedings generation, transmission and distribution, vol. 151, no. 1, pp. 67–72, 2004. [5] g. a. antonini, a. orlandi, c. r. 
paul, "internal impedance of conductors of rectangular cross section," ieee transactions on microwave theory and techniques, vol. 47, no. 7, pp. 979–985, 1999. [6] s. vujević, p. sarajčev, d. lovrić, "time-harmonic analysis of grounding system in horizontally stratified multilayer medium," electric power systems research, vol. 83, pp. 28–37, 2011. [7] l. grcev, f. dawalibi, "an electromagnetic model for transients in grounding systems," ieee transactions on power delivery, vol. 5, pp. 1779–1781, 1990. [8] j. a. brandao faria, "a circuit approach for the electromagnetic analysis of inhomogeneous cylindrical structures," progress in electromagnetics research b, vol. 30, pp. 223–238, 2011. [9] j. a. brandao faria, "a matrix approach for the evaluation of the internal impedance of multilayered cylindrical structures," progress in electromagnetics research b, vol. 28, pp. 351–367, 2011. [10] k. kubiczek, m. kampik, "highly accurate and numerically stable matrix computations of the internal impedance of multilayer cylindrical conductors," ieee transactions on electromagnetic compatibility, vol. 62, no. 1, pp. 204–211, 2020. [11] s. vujević, d. lovrić, i. krolo, i. duvnjak, "computation of electric and magnetic field distribution inside a multilayer cylindrical conductor," progress in electromagnetics research m, vol. 88, pp. 53–63, 2020. [12] m. abramowitz, i. a. stegun, handbook of mathematical functions with formulas, graphs, and mathematical tables, new york: dover publications, 1964. [13] s. vujević, d. lovrić, v. boras, "high-accurate numerical computation of internal impedance of cylindrical conductors for complex arguments of arbitrary magnitude," ieee transactions on electromagnetic compatibility, vol. 56, pp. 1431–1438, 2014. [14] s. vujević, d. 
lovrić, "on the numerical computation of cylindrical conductor internal impedance for complex arguments of large magnitude", facta universitatis series: electronics and energetics, vol. 30, no. 1, pp. 81–91, 2017. [15] d. lovrić, s. vujević, "accurate computation of internal impedance of two-layer cylindrical conductors for arguments of arbitrary magnitude," ieee transactions on electromagnetic compatibility, vol. 60, no. 2, pp. 347–353, 2018. appendix a formation of the system of equations for computing the unknown complex-valued coefficients model of the multilayer cylindrical conductor presented in this paper can have an arbitrary number of conductive layers (m). this means that there are 2∙m unknown scaled complex-valued coefficients s ic and s id (i = 1, 2, ..., m) which one needs to compute in order to know the electric and magnetic field distribution in all layers. unknown scaled complex-valued coefficients are obtained by forming and solving a system of 2∙m linear equations which are derived from the boundary conditions between layers and the boundary conditions on the edges of the multilayer conductor. the first 2∙(m-1) equations in the system of equations are formed from the boundary conditions between layers requiring that the tangential components of electric field intensity and magnetic field intensity are continuous on border between two adjacent layers. the possible existence of a dielectric layer between two conductive layers is taken into account in the following boundary conditions, which are directly derived from equations (9) and (12) where in this case, r is substituted with in ir 1 : 1...,,2,1; 1 1 1      mih r r h in i ex i rr iex i in i rr i (a1) 1...,,2,1; 1 1 1       mieh r r nrje in i ex i ex i rr i rr iex i in iex i d i rr i  (a2) equations (a1-a2) are valid for both cases when the first conductive layer is a solid layer or if it is a tubular layer. 
one additional equation, also valid for both cases, is derived from the boundary condition on the outer edge of the multilayer cylindrical conductor:

$H_m\big|_{r = r_m^{ex}} = \dfrac{I_{tot}}{2\pi r_m^{ex}}$ (a3)

the final equation in the system of equations differs between solid and tubular cylindrical conductors, since it is derived from the boundary condition on the innermost edge of the conductor (if it exists). if the first conductive layer is a solid cylindrical conductor, the internal radius of the first layer equals zero. since modified bessel functions of the second kind tend to infinity as their argument tends to zero, they must be eliminated in order to preserve the physical validity of the results. the final equation in the system of equations for the solid cylindrical conductor is therefore:

$D_1^{s} = 0$ (a4)

however, if the first conductive layer is a tubular cylindrical conductor, the internal radius of the first layer does not equal zero, so no singularity issues occur. 
hence, the boundary condition on the innermost edge of the conductor can be included as the final equation in the system of equations for the tubular cylindrical conductor: 0 1 1   in rr h (a5) by introducing equations for electric and magnetic field described by (1-2) into (a1a5), the following system of 2∙m equations is obtained: equations 1 to 2∙m-2: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ( ) ( ) exp[ ( )] ( ) exp[ ( )] ( ) 0 ; 1, 2, ..., 1 s s ex s s ex ex in i i i i i i i i i in s s in ex ini i i i i i iex i in s s in i i i i ex i c i r d k r r r r c i r r r r r d k r i m r                                         (a6) 1 0 1 1 0 1 1 1 0 1 1 1 1 1 ( ) ( ) ( ) ( ) exp[ ( )] ( ) exp[ ( d in s s ex ex s exi i i i i i i i iex i i d in s s ex ex s ex ex ini i i i i i i i i i i iex i i s s ini i i i i i i i i r c i r r n i r r r d k r r n k r r r r c i r r                                                           1 1 1 0 1 1 1 )] ( ) 0 ; 1, 2, ..., 1 ex in i s s ini i i i i i i r d k r i m                    (a7) 616 d. lovrić, s. vujević, i. krolo equation 2∙m-1: 1 1 1 ( ) ( ) exp[ ( )] 2 s s ex s s ex ex in m m m m m m m m m ex m c i r d k r r r r              (a8) equation 2∙m: 1 1 11 1 1 1 1 1 1( ) exp[ ( )] ( ) 0 s s s s in ex in in i c i r r r d k r            (a9) or 01  s d (a10) 10607 facta universitatis series: electronics and energetics vol. 35, no 4, december 2022, pp. 557-570 https://doi.org/10.2298/fuee2204557s © 2022 by university of niš, serbia | creative commons license: cc by-nc-n original scientific paper event-triggered sliding mode control for constrained networked control systems* andrej sarjaš, dušan gleich university of maribor, faculty of electrical engineering and computer science, maribor, slovenia abstract. 
the paper describes an event-triggered nonlinear control (etnc) approach for constrained networked feedback control systems (nfcs). the real-time controller execution is implemented based on the event-triggering paradigm. a nonlinear variable structure is used for the controller design. the nonlinear approach is based on a predefined sliding variable, defined by the system states, with a nonlinear switching function. the system's stability is analyzed with regard to the evolution of the sliding variable. the event-triggered operation of the nonlinear controller is based on a prescribed triggering rule. the stability boundary of the sliding variable is subject to the preselected triggering condition, whose selection is a tradeoff between system performance, network constraints and transmission capabilities. the main focus of the event-triggering approach is lowering the utilization of network resources in the steady-state behavior of the nfcs. the presented approach ensures a non-zero inter-event time of controller execution, which enables scheduling and optimization of the network operation with regard to the network constraints and real-time system performance. the efficiency of the presented method is demonstrated by a comparison with the classical time-triggering approach. measurements on a real system support the results.

key words: event-triggering, networked control system, variable structure control, sliding mode control

1. introduction

networked feedback control systems have been researched extensively over the last two decades [1]. new communication technologies integrated into tiny devices with decent computational capability offer a vast range of remote applications in distributed or decentralized structures. depending on the network structure and the number of connected devices, the implementation of the nfcs is critical. new methods are being derived that improve network usage and ensure system performance with respect to the controller implementation and execution. 
the paper introduces a nonlinear control law with event-triggered execution.

received march 23, 2022; revised april 29, 2022; accepted june 15, 2022
corresponding author: andrej sarjaš, university of maribor, faculty of electrical engineering and computer science, koroška cesta 46, si-2000 maribor, slovenia, e-mail: andrej.sarjas@um.si
* an earlier version of this paper was presented at the 15th international conference on advanced technologies, systems and services in telecommunications (telsiks 2021), october 20-22, 2021, in niš, serbia [1]

sliding mode control (smc) is an effective approach to ensure the prescribed performance of a closed-loop system despite external disturbances and system uncertainty [1]-[4]. depending on the controller structure, the sliding mode controller is straightforward to implement and does not require much computational time. all controllers in a real-time system are implemented in discrete form, which results in a hybrid system where continuous and discrete systems are interconnected [3]-[6]. the most commonly used approach for controller implementation is the sample-and-hold technique, or time-triggering (tt) approach. time triggering means that the controller output is updated at equidistant time intervals, also known as the sampling time. such a tt closed-loop system is easier to design, owing to the vast number of techniques and approaches developed for time-sampled systems. on the other hand, the tt system requires constant resource utilization and data transmission over the network. the event-triggering (et) approach offers an alternative to tt [7]. in contrast to tt, the closed-loop et system is updated based on the evaluation of a trigger rule. in other words, the controller is updated when the system states violate the triggering rule, which means that the controller is no longer updated periodically at fixed time intervals. 
such an implementation of the controller is more efficient than the tt implementation and requires fewer computational resources, especially when the sliding manifold is reached. for this reason, et is beneficial for networked control systems (ncs), where the trigger mechanism reduces network transmission and is suitable for systems with data-rate constraints [8]-[10]. network constraints such as variable round trip time (rtt), limited data transmission, and package drops are detrimental to the ncs [11], [12]. these network parameters reduce system performance considerably and can lead to unstable operation. the presented work introduces an smc controller design with an associated triggering rule, which ensures ncs stability and takes all the network parameters into account during the design procedure. the derived event-triggered sliding mode controller (et-smc) introduces triggering boundaries with regard to the admissible lower inter-event time value and network delay [13], [14]. the et-smc design for ncs is divided into two steps. the first step is the smc controller design with preselected system dynamics and parametrized sliding variables [15]-[17]. the second step is the selection of the triggering boundary with regard to the system tracking performance and the robustness to ncs uncertainty. in comparison with similar linear et paradigms, the presented approach still ensures the smc properties and effectively lowers the computational burden and network usage. the controller parameter selection can be posed as an optimization procedure. the optimal parameter selection can be evaluated as a tradeoff between network utilization under ncs uncertainties and closed-loop performance, such as tracking capability, transient performance, network delay, etc. 
the assessment of the admissible lower inter-event time of the et shows the direct influence of the et controller on network utilization during the reaching and sliding phases of the sliding variable evolution. the efficiency of the proposed controller is evaluated on a real-time system.

2. sliding mode controller design

the sliding mode controller (smc) synthesis uses the system

$\dot{x}_1 = x_2, \quad \dot{x}_2 = -b x_2 + g v + d,$ (1)

where $x(t) = [x_1(t) \; x_2(t)]^T \in \mathbb{R}^2$ is the state vector and $v(t) \in \mathbb{R}$ is the control variable. $g: \mathbb{R} \to \mathbb{R}$ and $b: \mathbb{R} \to \mathbb{R}$ are system parameters, and $d: \mathbb{R} \to \mathbb{R}$ is a disturbance. for the smc design, the bounds of the system parameters are given as $0 < b \le b_{max}$, $g_{min} \le g \le g_{max}$, with $g_{min}, g_{max}, b_{max} \in \mathbb{R} \setminus \{0\}$. for tracking, new system states are introduced, $\xi_1 = x_d - x_1$, $\xi_2 = \dot{x}_d - x_2$, where $x_d$ is the desired value and $\dot{x}_d$ is its derivative. the transformed system is given as

$\dot{\xi}_1 = \xi_2, \quad \dot{\xi}_2 = -b \xi_2 - g v + d_\xi,$ (2)

where $\xi = [\xi_1, \xi_2]^T$, $d_\xi = -d + \ddot{x}_d + b \dot{x}_d$, and $\sup_{t \ge 0} |d_\xi(t)| \le \delta_d$ holds. the sliding variable is designed as $s = c\xi$ for $c \in \mathbb{R}^2$, where $c = [c_1 \; 1]$, $c_1 > 0$. differentiating $s = c\xi$ with respect to time gives

$\dot{s} = c_1 \dot{\xi}_1 + \dot{\xi}_2 = (c_1 - b)\xi_2 - g v + d_\xi = 0.$ (3)

regarding (3), the bound $|d_\xi| \le \delta_d$, and the sliding property that brings the sliding variable to the sliding manifold $s = \dot{s} = 0$, the smc controller can be selected as

$v = g^{-1}\left((c_1 - b)\xi_2 + \rho \, \mathrm{sign}(s)\right),$ (4)

where $\rho > \delta_d$ holds. with the smc controller (4) designed, the et mechanism is introduced in the next section. since the controller (4) contains a discontinuous nonlinear term, the solution of the feedback system (2), (3) with controller (4) is understood in the filippov sense [18].

3. event-triggered sliding mode control for ncs

the derivation of the event-triggered rule is based on the analysis of the reaching-phase stability of the sliding variable [2]. 
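the continuous-time design of section 2 can be checked numerically with a short simulation. the sketch below integrates the error dynamics (2) under the control law (4) with a forward-euler step. the names xi1, xi2 (tracking-error states) and rho (switching gain) are labels chosen here for illustration, because the original symbols are not legible in this copy; b, g and c1 are the values quoted in section 5, whereas rho = 10, the sinusoidal disturbance of amplitude 2, the step size and the initial state are hypothetical choices.

```python
import math


def sign(x: float) -> float:
    """signum function with sign(0) = 0."""
    return float((x > 0) - (x < 0))


def sliding_variable(xi1: float, xi2: float, c1: float) -> float:
    """s = c * xi with c = [c1, 1]."""
    return c1 * xi1 + xi2


def smc_control(xi1, xi2, c1, b, g, rho):
    """control law (4): v = ((c1 - b) * xi2 + rho * sign(s)) / g."""
    s = sliding_variable(xi1, xi2, c1)
    return ((c1 - b) * xi2 + rho * sign(s)) / g


def simulate(xi1, xi2, c1=5.2, b=3.3, g=0.897, rho=10.0,
             dist_amp=2.0, dt=1e-4, steps=50_000):
    """forward-euler integration of the error dynamics (2) under law (4)."""
    for k in range(steps):
        v = smc_control(xi1, xi2, c1, b, g, rho)
        d = dist_amp * math.sin(k * dt)  # hypothetical bounded disturbance
        dxi1 = xi2
        dxi2 = -b * xi2 - g * v + d
        xi1, xi2 = xi1 + dt * dxi1, xi2 + dt * dxi2
    return xi1, xi2


xi1_end, xi2_end = simulate(1.0, 0.0)
print(abs(sliding_variable(xi1_end, xi2_end, 5.2)), abs(xi1_end))
```

because the switching gain exceeds the disturbance amplitude, the sliding variable is driven into a narrow band around zero whose width scales with the step size, which is the discrete-time quasi-sliding behaviour that the following section refers to.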
it is worthy of mentioning that the discrete implementation of smc can not reach a sliding manifold completely. as a result, the quasisliding mode is obtained [16], [19], where the sliding variable is limited with boundary , where  it is subject to the sampling time, sliding parameter, and disturbance d . furthermore, the presented work is limited to the et approach, where the band  will be determined regarding the trigger mechanism and preselected inter-event time. the et-smc after two consecutive updates is given as ( ) ( ) ( ) ( )( )( )1 1 2 ,et n nv t g c b t sign s t  − = − + (5) where tn is the last update, t is the current time between two updates, and is t  [tn,tn+1). ,s +     560 a. sarjaš, d. gleich theorem 1: consider system (2) with the sliding manifold s = 0. the parameter  is given so that ( )1 1 2( ) ( ) ( ) ( ( )) ,et n nv t g c b t sign s t  − = − + (6) for all t > 0, where 2 2 2( ) ( ) ( )ne t t t = − . the event triggering is established if the controller gain is selected as d   +  (7) where holds 0  . proof: before continuing to prove, the remaining et error variables are introduced, e1(t) = 1(t) − 1(tn), and e(t) = (t) − (tn). for the stability test, the lyapunov function is presented v(t) = s(t)2/2 for the time interval t  [tn,tn+1), where 0n  . differentiation v with respect to time t the derivative v is given as 1 2 (( ) ). et v ss s c b gv d= = − − + (8) substituting the controller (5) in (8) gives 1 2 1 2 1 2 1 2 2 (( ) ( ) ( ) ( )) (( ) ( ) ( ) ( ) ( ( )) ( )) (( )( ( ) ( )) ( ( )) ( )) et n n n n n v s c b t gv t d t s c b t c b t sign s t d t s c b t t sign s t d t        = − − + = − − − − + = − − − + ( ) 2 1 2 2 (t) 1 2 ( ) ( ( ) ( )) ( ) ( ) , n d e d d d s c b t t s s s c b e t s s s s s s s           − − − +   − − +   − +   − − −   − where  > 0. 
concerning the condition (7) and assumption sign(s(tn)) = sign(s(t)), it is to be noted that the sliding variable is approaching the sliding manifold, where s = 0. the above is true if at the time of triggering t = tn holds e2(tn) = e2(t) = 0, then the sliding variable s is bounded with , where, 2 1 ( ) ( ) ( ) ( ) , ( ) n n s t s t c t c t c e c k e c k k c b     − = − =   = − (9) regarding 2e e and 2k e e= . the parameter k is defined as ( ) 2 2 1 2 1 c b k   − = + and  is an upper limit of the 0 1sup ( )t e t     . the boundary  is defined as  = { , }s c k   =  , where the triggering rule in (6) can be defined as, 1 2 1 ( ) ( )e t c b −  − , (10) which is the end of the proof. event-triggered sliding mode control for constrained networked control system 561 the stability of the remaining system in (2) with controller (5) needs to be assessed after the stability analysis of the sliding variable with triggering condition. regarding the reaching phase boundary (9), it can be derived 2 1 1s c = − , where 1 1 1s c = − holds. with the introduction of the lyapunov function 2 1 / 2v = , the stability can be assessed as, 1 1 1 1 1 1 1 1 1 ( ) 1 , v s c c s c       = = −   = − −    with respect to conditions (6),(9), the system is stable if it holds that 1 1 1 0c s − −  . thus, the closed-loop system is stable with respect to s, and the system trajectory 1 is bounded by 1 1 1 . ( ) k c c c b   − (11) 4. event-triggered sliding mode control for ncs the structure of the network control system is depicted in fig. 1. the controller algorithm is executed on the network computer, where the triggering rule is evaluated on the plant. we assume that the plant has a real-time system with computational ability and communication interfaces. the real-time system on the plant side is used for noncomplex computation such as triggering condition evaluation, signal conditioning, and communication capability. 
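the plant-side evaluation of the triggering condition can be sketched as follows. the sketch reads rule (10) as "no update while |e2(t)| < alpha / |c1 - b|", where e2 is the difference between the current second error state and the value sent at the last update. this reading, the class and method names, and the threshold value alpha are assumptions made for illustration only; they are not taken from the authors' implementation.

```python
class EventTrigger:
    """decides on the plant side whether a controller update must be requested."""

    def __init__(self, c1: float, b: float, alpha: float):
        # admissible deviation of the second error state between two updates
        self.threshold = alpha / abs(c1 - b)
        self.last_sent = None  # state value transmitted at the last update

    def should_update(self, xi2: float) -> bool:
        """return True (and store the state) when the rule is violated."""
        if self.last_sent is None or abs(xi2 - self.last_sent) >= self.threshold:
            self.last_sent = xi2
            return True
        return False


# hypothetical samples of the second error state, checked once per period
trigger = EventTrigger(c1=5.2, b=3.3, alpha=1.9)
flags = [trigger.should_update(x) for x in [0.0, 0.5, 1.2, 1.3, 2.5]]
print(flags)
```

only the samples for which the method returns True would cause a request to be sent to the server; between these events the last controller output is held.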
the user datagram protocol (udp) is used for the given et-smc implementation. the data have been transmitted over different network hops, where additional time delay and package loss may occur. the package loss in the network is modeled as a loss delay [12], [13], where the maximal allowed round trip time (rtt) of the network is used for package loss detection. the plant side uses a dedicated package-loss timer, and if the watchdog timer is expired, then the request for new data is demanded from the server. we assume that two consecutive losses can not be accrued for the package loss occurrence. network network smc plant server data flow fig. 1 networked controller structure with et-smc 562 a. sarjaš, d. gleich the controller feedback structure is presented in fig. 2, where the triggering condition determines the network usage. the controller (5) is implemented on the server, and the triggering mechanism is on the plant side. plantsmc u xexd triggering condition network network server fig. 2 et-smc feedback configuration the inter-event time of the et-smc is determined regarding the error analysis of the two consecutive sampled states, 1 1 1 2 2 2 ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) n n t t td d d d e t e t t t tdt dt dt dt       −     = =    −    , (12) where ( ) 0 n t = , according to the last update. substitute (12) in (2), (5) which gives 0 1 0 0 0 ( ) ( ) ( ) ( ) 0 0 1 et n d e t t d t v t b gdt         + −      −      , 1 0 1 0 0 0 ( ) ( ) ( ( )) 0 0 1 1 cdc n bba t d t sign s t c         = + −      −      ( ( ) ( )) ( ( )) ( ) ( ) ( ) . 
c n c n d c c n c d d a e t t b sign s t b d t a e t a t b b     = + − +  + + +  the solution of the differential equations is ( )( ) ( ) ( 1)c n a t tc n c d d c a t b b e t e a   −+ +   − , (13) where the minimal inter-event time  = t − tn is determined as ( ) min 1 1 ln 1 ( ) ( ) c c c n c d d k a a c b a t b b        +   − + +    (14) it can be seen that the inter-event time depends on triggering condition  and selected controller parameters c1 and . regarding the uncertainty of the network, the delay n is introduced with the update time tn. the update sequence 0{ }n n nt   = + corresponds to the event-triggered sliding mode control for constrained networked control system 563 update time tn and means that the controller is not updated with the last states, wherein the inter-event time is extended by delay value n. hence the error (13) grows till the next update time tn+1. the triggering sequence is admissible, regarding if 1 0,n n nt t n+  +  and the triggering rules (6),(10) ensure system stability. the derivation of the delay boundary, where the triggering rule ensures the system stability, is similar to the derivation of the inter-event time in (13),(14). for a given derivation, we assumed that the controller (5) at the time t  [tn, tn + n) is not updated with the current state (tn), whereby the further updates are executed at t  [tn + n, tn+1 + n+1), and the analysis involves the controller structure with past value v(t) = g−1((c1 − b)2(tn−1)+ sign(s(tn−1))). the admissible interevent time is caused by the delay, which ensures that the system stability with triggering condition (10) is, ( ) ( ) ( )( ) n 1 1 1 ln 1 ( ) c c c n n c d d k a a c b a t t b b      −    = +  − + + +    (15) the system is stable, and the boundary (11) is preserved if n  n it holds. for proper parameter selection, it is necessary to assume the maximally allowed delay in the network. 
the delay boundary is given as the supremum of the network delay over all updates n ≥ 0. once the crucial parameters for the event-triggering implementation have been derived, the network structure and the communication protocol are designed. the focal point of the network system is a protocol that ensures simple transmission and minimal package loss with low rtt. all transmitted data must be transparent to the server and the client, and the measured data must not be ambiguous. the designed protocol enables package-loss detection and allows the controller execution to be adapted to a classical tt or et manner. the package-loss algorithm is essential for controller output recovery. if a package loss is detected or the rtt timer reaches the threshold, the controller output must be updated; otherwise, the closed-loop system runs in open loop. the update can be done with a new data transmission request from the server or with an internal model-based approach. resending the request is a straightforward way to update the controller, whereas the model-based approach is more complex and advanced. in the model-based approach, the system data are obtained from a model or from system approximation algorithms such as fuzzy sets and neural networks. the model-based approach requires more computational resources on the server or client side, but it can ensure faster output recovery than sending a new transmission request. given sufficient computational resources, the model-based update can act as a redundant system in the case of irregularities on the network or system. the structure of the designed protocol for the client communication is presented in fig. 3.

[ ids | rtts | rttc | data1 | data2 | ... | datan | crc | # ]
fig. 3 the communication protocol of the client message

the ids presents the server address, which is the main system of the ncs. the tags rtts and rttc are timing data of the network rtt, one on the server side and the other on the client side. 
each side measures its own rtt: the server's rtts is the time from server send to server receive, and the client's rttc is defined analogously, from client send to client receive. package loss and network irregularities can be detected by comparing rttc and rtts. the tags data1, data2, ..., datan are the transmitted states of the system. the estimation and detection of network irregularities through different measured parameters are not the main objective of the presented work and will not be discussed further. all additional parameters of the protocol that are not directly involved in the ncs operation are just starting points for further research on the assessment of network quality and reliability. the protocol message is concluded with a cyclic redundancy check (crc) and the delimiter #. the response message from the server to the client is presented in fig. 4.

[ idc | rtts | rttc | cont1 | cont2 | ... | crc | # ]
fig. 4 the communication protocol of the server message

the idc presents the client address, whereas rtts, rttc, crc and # are the same parameters as in the client message presented in fig. 3. the tags cont1, cont2, ... are the controller update values. all time values and data are represented in 4-byte float format. the id and crc are represented as 32-bit integer values. the length of the message is determined by the number of transmitted system states (data), whereas the id, rtts, rttc, crc, and # values are mandatory and are the control parameters of the protocol. since the protocol measures the network rtt on both the server and the client side, possible network uncertainty has to be acknowledged. the network uncertainty can be represented as a network delay, where the information takes time to travel from the sender to the receiver over different network hops. 
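the client message layout of fig. 3 (32-bit id, two 4-byte float rtt values, a variable number of 4-byte float states, 32-bit crc, delimiter #) can be sketched with the python standard library. the text does not specify the byte order, the crc algorithm or the encoding of the delimiter, so little-endian packing, zlib.crc32 over all preceding bytes and a single ascii "#" byte are assumptions of this sketch, and the function names are hypothetical.

```python
import struct
import zlib

HEADER_BYTES = 12  # id (4) + rtts (4) + rttc (4)


def pack_client_message(server_id, rtt_s, rtt_c, states):
    """build [ ids | rtts | rttc | data1..datan | crc | # ] as bytes."""
    body = struct.pack(f"<I2f{len(states)}f", server_id, rtt_s, rtt_c, *states)
    crc = zlib.crc32(body)  # assumed crc algorithm
    return body + struct.pack("<I", crc) + b"#"


def unpack_client_message(msg):
    """validate delimiter and crc, then return (id, rtts, rttc, states)."""
    if len(msg) < HEADER_BYTES + 5 or msg[-1:] != b"#":
        raise ValueError("malformed message")
    body, crc_bytes = msg[:-5], msg[-5:-1]
    if (len(body) - HEADER_BYTES) % 4 != 0:
        raise ValueError("payload is not a whole number of 4-byte floats")
    (crc,) = struct.unpack("<I", crc_bytes)
    if zlib.crc32(body) != crc:
        raise ValueError("crc mismatch")
    n_states = (len(body) - HEADER_BYTES) // 4
    server_id, rtt_s, rtt_c, *states = struct.unpack(f"<I2f{n_states}f", body)
    return server_id, rtt_s, rtt_c, states


message = pack_client_message(1, 0.008, 0.003, [1.5, -2.25, 0.5])
print(len(message), unpack_client_message(message))
```

the server message of fig. 4 has the same shape, with the client id in the first field and the controller update values in place of the states, so the same two functions could be reused for it under the same assumptions.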
the delay can cause an unwanted effect on the feedback system, such as an oscillation, slower response, deteriorated disturbance rejection capability, and even unstable operation. the delayed system needs special awareness in the controller design. in the proposed approach, the delayed system is presented as an additional elapsed time after requesting a new update from the client-side. the delay caused a more extended operation in the unstable region given in (11),(13),(14). the inter-event time (14) is extended, and the permitted state boundary is extended (13). such time delay lowers the performance of the closed-loop system and tracking capability. the system's stability is ensured with the proper selection of the controller gain given in (7). if the time delay is modeled as a parametric uncertainty with a prescribed bound,  then the controller gain selection can be lowered for the admissible delay boundary. d   +  +  (16) besides the network delay, package loss can occur in the network. unlike the network delay, package loss is generally described as information that never arrives at the destination. in the ncs approach, different types of package loss are known; newer arrived, out of order, and multiple package arrivals. in the tt-ncs approach, the state observer with a controller on the server-side is mainly used to recover the loosed packages [8]. in the et technique, the package loss stability criteria can be analyzed regarding the lyapunov stability function of the reaching phase in the et-smc operation, where the package loss is modeled such as the error, ep(t) = (t) − (t), for time t  [tk, tk+1), where 2    . the state (t) presents the last update after the package loss. the number of packages lost is equal to  − 1, which  = 2 means one lost package. 
the proof of the event-triggered sliding mode control for constrained networked control system 565 stability is similar to the proof presented in (8), where the lyapunov function is equal to 2 ( , ) ( , ) / 2v t s t = , and its derivative is 1 2 1 2 2 ( ) 1 (( ) ( ) ( ) ( )) ( ) ( ( ) ( )) ( ( )) ( ) ( ) ( ) ( ) p et e t p d p d p d v s c b t gv t d t s c b t t sign s t d t s c b e t s s s s s s             = − − +    = − − − +      − − +   − +   − − −  regarding the assumption  > n it holds p  . after a consecutive package lost, the system is stable if the controller gain ensure the given condition,  > p + d (17) the 1 trajectory is bounded by 1 1 1 ( ) p p k c c c b   − , (18) where is ( ) 2 2 1 2 1 p p c b k   − = + . after solving the differential equation ( , ) d e t dt  given in (12), the minimal inter-event time is ( ) min 1 1 ln 1 ( ) ( )p p p c c c c d d k a a c b a t b b         +   − + +    (19) it is evident that the package loss higher the boundary of the output trajectory 1. if the output boundary needs to be in the prescribed range (11), (18), the controller gain and interevent time (14), (19) need to be selected at lower values. the closed-loop performance needs to be reduced to ensure higher robustness of the network uncertainties. the package loss can be detected with tts,c measurement on both sides of the network. with the proper selection of the maxtts,c, and delay parameter , the desirable performance of the closed-loop system can be ensured; otherwise, the lowered closed-loop or unstable behavior can occur. 5. experimental results the dual servo system is used for the validation of the presented et-ncs approach. the servo system is presented in fig 5. the client is implemented on the arm® cortex®m7 based stm32f7xx mcu with an iwip stack for transparent udp communication with the presented ncs protocol presented in figs. 3 and 4. 
the lwip stack on the stm32f7 enables 100base-tx communication speed. the arm embedded system is responsible for measuring the current, velocity, and angle of the servo system and provides actuation to the motor drive with pulse width modulation (pwm) at a frequency of 10 kHz and a resolution of 4 mV/duty. before transmission, all measurements are preprocessed with different signal processing algorithms to ensure the high fidelity and reliability of the measured data. the brushed motors in fig. 5 have a maximal velocity of 3500 rpm at 24 V and a maximal load current of 4 A.

fig. 5 real-time system with network socket

the network is composed of an arm embedded system, a router and a pc server. the embedded system issues a request for the controller update, which is sent to the server. the request message structure is defined by the protocol presented in fig. 3. after receiving the client's message, the server calculates the new controller output and prepares the server message back to the client, fig. 4. the network is presented in fig. 6.

[fig. 6 block labels: client (stm32f7; current, velocity and angle measurements; processing; adc, pulses, pwm; et-trigger; lwip, udp), router, server (smc python script, udp)]
fig. 6 ncs-network configuration

the sliding mode controller is implemented in python 3.7. the main components of the python script are the udp server and additional timer-interrupt threads for the tt implementation and the rtts measurements. the closed-loop performance of the tt and et implementations is evaluated with the performance indices

$rms_w = \sqrt{\frac{1}{n_s} \sum_{k=1}^{n_s} w_k^2}, \quad w \in \{x_1, s, v_{tt}, v_{et}\},$ (20)

$flag_v = \frac{n_i}{n_s}, \quad n_i = \sum_{i=1}^{n_s} \theta_i,$ (21)

with $\theta_i = 1$ if the controller is updated at sample i (for $v_{et}$, when the triggering condition (10) is violated) and $\theta_i = 0$ otherwise, where $n_s$ and $n_i$ are the numbers of triggering events for controllers $v_{tt}$ and $v_{et}$, respectively. 
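the two indices can be computed from logged data as sketched below. the function names and the sample values are illustrative assumptions; the rms index follows (20), and the update ratio is the share of samples at which the controller was actually updated, which is how the flag column of table 1 is read here.

```python
import math


def rms(samples):
    """root-mean-square value of a logged signal, as in (20)."""
    return math.sqrt(sum(w * w for w in samples) / len(samples))


def update_ratio(update_flags):
    """share of samples with a controller update (n_i / n_s)."""
    return sum(1 for flag in update_flags if flag) / len(update_flags)


# hypothetical logs: a controller output and its update flags
v_log = [2.0, -2.0, 2.0, -2.0]
flag_log = [True, False, False, False]
print(rms(v_log), update_ratio(flag_log))
```

for a time-triggered controller every sample is an update, so the ratio is 1; one minus the ratio gives the relative reduction in network requests for the event-triggered controller.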
the controller vtt stands for the tt execution of the controller algorithm presented in (4) as v. the controllers vtt and vet are tested in the same condition, with equal references values and a sampling time of 10ms for tt execution and periodic triggering evaluation for vet execution n  10ms (15). the parameters of the system presented in (1), (2) are, b = 3.3, g = 0.897, d = 7.1.the selected controller parameters are, c1 = 5.2,  = 16.2,  = 19.7, p = 19.7, tts,c = 11ms. the network performance is presented in fig. 7. fig. 7 measured rtts and rttc values of the ncs network the periodic triggering evaluation is selected properly regarding the measured rtt values for server and client trigger = 10ms. in each trigger period, only measured data are examined concerning the triggering boundary . figs. 8 and 9 present the ncs performance of the controller v = vtt. fig. 8 tracking capability, rpm value, and vtt controller output of the ttncs 568 a. sarjaš, d. gleich fig. 9 sliding variable and controller update flag of the ttncs figs. 10 and 11 present the ncs performance of the controller vet fig. 10 tracking capability, rpm value, and vet controller output of the etncs fig. 11 sliding variable and controller update flag of the etncs event-triggered sliding mode control for constrained networked control system 569 the estimated indices values (20),(21) are presented in table 1. table 1 performance indices of tt-ncs and et-ncs figs. 8-11 show the implementation results of the tt-ncs and et-ncs strategies. the advantages of both approaches are shown clearly. the tt-ncs has better tracking performance regarding table 1 and the rmsx1 value. this result was expected, due to the constant controller update with a prescribed sampling time of 10ms. on the other hand, the tt approach uses constant network resources. for a given experiment, at least two messages are transmitted in each 10ms time frame. 
regarding rmsx1 of the et-approach, the tracking capability is slightly deteriorated. the lower performance is the result of the nonlinear switching function of v, the unstable boundary region of the output variable x1 derived in (11), and the triggering condition. the network usage in the et-strategy is reduced drastically, especially when the system reaches the sliding manifold. the average update time for et-ncs is 41ms, presented in column avg(ts/n) of table 1. the average update time is related closely to the preselected triggering boundary and the course of the reference value. the triggering boundary affects the tracking capability of the closed-loop system directly. the employment of the et-ncs system is a tradeoff between the usage of network resources and the accuracy of the system. in the given experiment, the network usage of the tracking system is reduced by roughly 70% (the update flag in table 1 falls from 100% to 28.7%), and the output rmsv value is reduced from 4.56 to 1.82. the et approach can also be considered a chattering alleviation technique for sliding mode controllers with an explicit output signum function, which is studied extensively within different implementation techniques and adaptation algorithms [18]-[21].
6. conclusion
the paper presents an event-triggered nonlinear controller implementation for a networked control system. compared to the classic time-triggered implementation, the approach is beneficial for ncs systems with data rate constraints, where the network constraints can be considered during the controller design. the experimental results confirm the theoretical assumptions and the derivation of the et-ncs. the network usage and the embedded system utilization are reduced. the et technique can be a viable alternative for tt feedback systems, especially where the computational and network resources are limited or are themselves subject to optimization. the work is a good starting point for research on multi-agent systems, distributed control, and task scheduling in embedded systems.
the central supervised server system can share its computation capacity with other distributed systems and control multiple sub-plants remotely, where the number of network requests can be lowered significantly and estimated in advance.
acknowledgement: this research was funded by the slovenian research agency (arrs) grant number p2-0065.
table 1
ncs   rmsx1   rmsv   rmss   avg(ts/n)   rtts     rttc     flag
vtt   83.2    4.56   57.2   10ms        8.23ms   3.21ms   100%
vet   85.7    1.82   58.4   41ms        8.43ms   2.78ms   28.7%
references
[1] a. sarjaš and d. gleich, "nonlinear event-triggered networked feedback control system under data-rate constrains", in proceedings of the 15th international conference on advanced technologies, systems and services in telecommunications (telsiks), 2021, pp. 376-379. [2] a. k. behera, b. bandyopadhyay and x. yu, "periodic event-triggered sliding mode", automatica, vol. 96, pp. 1916-1931, jan. 2018. [3] v. i. utkin, sliding modes in control and optimization. new york: springer-verlag, 1992. [4] c. edwards and s. k. spurgeon, sliding mode control: theory and applications. taylor and francis, 1998. [5] i. furtat, y. orlov and a. fradkov, "finite-time sliding mode stabilization using dirty differentiation and disturbance compensation", int. j. robust nonlinear control, vol. 29, no. 3, pp. 793-809. [6] k. j. åström, "event based control", in a. astolfi and l. marconi (eds.), analysis and design of nonlinear control systems, pp. 127-147, berlin, heidelberg, springer, 2006. [7] k. j. åström and b. m. bernhardsson, "comparison of riemann and lebesgue sampling for first-order stochastic systems", in proceedings of the 41st ieee conference on decision and control (cdc), las vegas, nv, usa, 2002, pp. 2011-2016. [8] a. ferrara, g. p. incremona and v. stocchetti, "networked sliding mode control with chattering alleviation", in proceedings of the 53rd ieee conference on decision and control, los angeles, ca, usa, december 2014, pp. 5542-5547. [9] e. kofman and j. h.
braslavsky, "level crossing sampling in feedback stabilization under data-rate constraints", in proceedings of the 45th ieee conference on decision control (cdc), san diego, ca, usa, dec. 2006, pp. 4423-4428. [10] j. ludwiger, m. steinberger, m. horn, g. kubin and a. ferrara, "discrete time sliding mode control strategies for buffered networked systems", in proceedings of the 57th ieee conference on decision control, miami beach, fl, usa, dec. 2018, pp. 6735-6740. [11] m. cucuzzella, g. p. incremona and a. ferrara, "event-triggered variable structure control", int. j. control, vol. 93, no. 2, pp. 252-260, jan. 2019. [12] j. ludwiger, m. steinberger and m. horn, "spatially distributed networked sliding mode control", ieee control syst. lett., vol. 3, no. 4, pp. 972-977, may 2019. [13] j. ludwiger, m. steinberger, m. horn, g. kubin and a. ferrara, "discrete time sliding mode control strategies for buffered networked systems", in proceedings of the 57th ieee conference on decision control, miami beach, fl, usa, dec. 2018, pp. 6735-6740. [14] a. k. behera and b. bandyopadhyay, "event-triggered sliding mode control for a class of nonlinear systems", int. j. control, vol. 89, no. 9, pp. 1916-1931, jan. 2016. [15] a. k. behera, b. bandyopadhyay and x. yu, "periodic event-triggered sliding mode", automatica, vol. 96, pp. 1916-1931, jan. 2018. [16] a. k. behera and b. bandyopadhyay, "robust sliding mode control: an event-triggering approach", ieee trans. circuits syst. ii: express briefs, vol. 64, no. 2, pp. 146-150, feb. 2017. [17] w. gao, y. wang and a. homaifa, "discrete-time variable structure control system", ieee trans. ind. electron., vol. 42, no. 2, pp. 117-122, april 1995. [18] s. koch and m. reichhartinger, "discrete-time equivalents of the super-twisting algorithm", automatica, vol. 107, pp. 190-199, 2019. [19] b. brogliato and a. polyakov, "digital implementation of sliding-mode control via the implicit method: a tutorial", int. j. robust nonlinear control, vol. 
31, no. 9, pp. 3528-3586, 2021. [20] v. utkin, "discussion aspects of high-order sliding mode control", ieee trans. automat. contr., vol. 61, pp. 829-833, 2016. [21] u. p. ventura and l. fridman, "design of super-twisting control gains: a describing function based methodology", automatica, vol. 99, pp. 175-180, 2019.
facta universitatis series: electronics and energetics vol. 35, no 2, june 2022, pp. 145-154 https://doi.org/10.2298/fuee2202145м © 2022 by university of niš, serbia | creative commons license: cc by-nc-nd
review paper
prior knowledge based neural modeling of microstrip coupled resonator filters
zlatica marinković1, miloš mitić1, branka milošević1, marin nedelchev2
1university of niš, faculty of electronic engineering, niš, serbia 2technical university of sofia, faculty of telecommunications, sofia, bulgaria
abstract. the design of microstrip coupled resonator filters includes determination of the coupling coefficients between the filter resonator units. in this paper a novel modeling procedure exploiting the prior knowledge neural approach is proposed as an efficient alternative to the standard electromagnetic (em) simulations and to the neural models based purely on artificial neural networks (anns). its accuracy is similar to that of the em simulations, and it requires less training data and less model development time than the models based purely on anns.
key words: artificial neural networks, coupled filters, design, microstrip
1. introduction
microstrip coupled resonator filters act as bandpass filters and are widely exploited in modern microwave communication systems. planar filters are a good choice for realizing low passband loss and a high rejection ratio in the stopband. they are easily manufactured using printed circuit board (pcb) technology with high accuracy and at a relatively low price. the responses of planar filters do not vary when they are manufactured in series, and their adjustment and tuning is straightforward.
the variety of classical and cross-coupled topologies of microstrip filters can realize the chebyshev equiripple and quasi-elliptic responses. the preferred resonators for practical realizations are half-wavelength resonators and their compact variants, hairpin and square open loop resonators [1]. the square open loop resonators offer compact size at a good quality factor, inheriting the frequency properties of the half-wavelength resonator. as many microwave systems are relatively narrowband, the square open loop resonators can realize narrow bandwidths with weak coupling coefficients at a reasonable distance between them.
received february 17, 2022
corresponding author: zlatica marinković, university of niš, faculty of electronic engineering, 18106 niš, aleksandra medvedeva 14, serbia, e-mail: zlatica.marinkovic@elfak.ni.ac.rs
the cross-coupled filters with a quasi-elliptic frequency response require clear identification of the sign of the coupling coefficient, which in turn requires determining whether the coupling is electric, magnetic or mixed, especially between the non-adjacent resonators. compared to the half-wavelength resonators, the square open loop resonators resolve this difficulty thanks to the flexibility of their coupling topologies. the filter synthesis process follows the classical approach through the calculation of the coupling matrix according to the chosen approximation. in microwave systems, the most popular and most widely implemented approximation is the chebyshev one [2]-[3]. in [2] the design process of the polynomials and the transversal coupling matrix is given. many authors offer matrix rotations to transform the canonical or transversal matrix to the exact matrix corresponding to the chosen topology [1]-[2]. an optimization method for direct calculation of the inter-resonator coupling coefficients is proposed in [4].
nevertheless, whatever synthesis method is chosen, the distance between the resonators should be calculated precisely. in [1] it is proposed to utilize a full-wave em simulator, which is a rigorous approach but is time consuming and requires high computational power. to avoid time consuming em simulations or complex optimization methods, new approaches based on artificial neural networks (anns) have been proposed to model the dependence of the filter coupling properties on the physical dimensions of the filter resonators and/or the properties of the chosen dielectric material [5]-[6]. moreover, the ann based approach has been applied to perform inverse modeling of the filter. namely, the anns are used to determine the distance between the filter resonators for the given coupling properties [7]-[8] or for the given resonator dimensions and coupling coefficient [5]-[6]. however, the models of the filter coupling properties developed in [5] are valid for only one dielectric material (i.e., for one specified value of the relative dielectric constant). in other words, a new neural model has to be developed for each dielectric material. to build a model valid for different values of the relative dielectric constant, it would be necessary to acquire a larger amount of em simulated data, which would be time consuming and would make the whole modeling procedure inefficient. in this paper we propose a novel approach to microstrip coupled resonator filter modeling, based on the prior knowledge neural approach. namely, instead of exploiting the anns only, here the anns are combined with empirical formulae for the approximate determination of the filter coupling coefficient. this approach provides a single model for all considered values of the relative dielectric constant. moreover, the model can be built with less data than the separate purely ann based models.
the rest of the paper is structured as follows. the considered microstrip coupled resonator structure and the empirical expressions used for the approximate determination of the filter coupling coefficients are described in section 2. section 3 contains a brief background of the prior knowledge neural approach. the novel prior knowledge neural model is proposed in section 4, whereas the obtained results and the discussion are given in section 5. section 6 contains conclusions.
2. microstrip coupled resonator filters
the square open loop resonator is a half-wavelength long microstrip line with open ends (see fig. 1a). the form of the resonator is symmetrical, and due to this symmetry the electromagnetic field distribution along it is predictable. the open ends are supposed to be shortened because of the fringe capacitance [9]-[10].
fig. 1 (a) the topology of the microstrip square open loop resonator, (b) an example of coupled resonators
the different orientations of the resonators on the top plane of the substrate form various kinds of coupling topologies. the coupling is achieved through the fringe fields when the resonators are adjacent to each other. the electric field is stronger than the magnetic field near the open end of the resonator, and the magnetic field is predominant at the center of the resonator. the strengths of the electric and magnetic fields decay rapidly with the distance from the open end and from the center of the resonator, respectively. the coupling structure in fig. 1b performs mixed coupling, for which it is not possible to determine which field is dominant. the value of the coupling coefficient of the coupled resonators in fig. 1b is much lower, because the currents are out of phase. this topology is applicable in narrow bandwidth filters.
the considered microstrip resonator is of a square shape with the side length a and the line width w, fabricated on a substrate having the height h and the relative dielectric constant εr; s denotes the distance between the coupled resonators. the coupling coefficient k (including mixed electric and magnetic coupling) is calculated precisely by em simulators, but a rough value of the coupling coefficient can be calculated using the following expressions [11]:
$$ k = k'_m + k'_e, \tag{1} $$
$$ k'_m = 0.5\,k_m, \tag{2} $$
$$ k'_e = 0.6\,k_e. \tag{3} $$
the coefficient of magnetic coupling km and the coefficient of electric coupling ke are calculated as:
$$ k_e = \frac{\pi}{16}\, f_e \exp(-a_e)\exp(-b_e)\exp(-d_e), \tag{4} $$
$$ k_m = \frac{\pi}{16}\, f_m \exp(-a_m)\exp(-b_m)\exp(-d_m), \tag{5} $$
where:
$$ a_e = 0.2259 - 0.01571\,\varepsilon_r + 0.1\sqrt{\varepsilon_r + 1}\,\frac{w}{h}, \tag{6} $$
$$ b_e = \left[1.0678 + 0.226\,\ln\!\left(\frac{\varepsilon_r + 1}{2}\right)\right]\left(\frac{s}{h}\right)^{p_e}, \tag{7} $$
$$ p_e = 1.0886 + 0.03146\left(\frac{w}{h}\right)^{4}, \tag{8} $$
$$ d_e = \left[0.1608 - 0.06945\sqrt{\frac{a}{h}}\right]\left(\frac{s}{h}\right)^{1.15}, \tag{9} $$
$$ f_e = -0.9605 + 1.4087\sqrt{\frac{a}{h}} - 0.2443\,\frac{a}{h}, \tag{10} $$
$$ a_m = -0.06864 + 0.14142\,\frac{w}{h} + 0.08655\left(\frac{w}{h}\right)^{3}, \tag{11} $$
$$ b_m = 1.2\left(\frac{s}{h}\right)^{p_m}, \tag{12} $$
$$ p_m = 0.8885 - 0.1751\sqrt{\frac{w}{h}}, \tag{13} $$
$$ d_m = \left[1.154 - 0.8242\sqrt{\frac{a}{h}} + 0.1417\,\frac{a}{h}\right]\frac{s}{h}, \tag{14} $$
$$ f_m = -0.5014 + 1.0051\sqrt{\frac{a}{h}} - 0.1557\,\frac{a}{h}. \tag{15} $$
3. prior knowledge neural modeling approach
owing to their excellent fitting capabilities, artificial neural networks have found many applications in the field of rf and microwaves [12]-[19]. most of the applications have been based on the black-box modeling approach, which means that one or more anns are used to extract the relationship between the sets of the input and the output parameters (see fig. 2a). however, in order to make the modeling procedure more efficient, less time consuming and more accurate without increasing the number of training data, the prior knowledge input (pki) neural approach can be applied (see fig. 2b) [12].
namely, in the pki approach, besides the original n input parameters, the ann has additional inputs. they represent the prior knowledge, meaning that they are correlated to some extent with the output parameters. in general, the number of prior knowledge input parameters (l) may, but need not, be equal to the number of the output parameters (m). the prior knowledge can be, for instance, the values of the outputs obtained by an approximate or simplified method.
fig. 2 (a) black-box neural modeling approach, (b) prior knowledge input neural modeling approach
the anns used in this work are multilayer perceptron networks having one input layer, one output layer and one or two hidden layers [12]. the transfer function of the input layer neurons is a unitary transfer function. the hidden layer neurons have sigmoid transfer functions, whereas the output layer neurons have a linear transfer function. the levenberg-marquardt algorithm is used for the ann training. the pki approach requires the values of the prior knowledge parameters for each data sample used for the ann training, as well as later for testing and employing the developed model. the average test error (ate), the worst case error (wce) and the pearson product-moment correlation coefficient (r) have been used as the metrics for comparing the models [11]. if the error of the ann response for the i-th input combination (i-th sample), ki, compared to the corresponding target value, kti, relative to the dynamic range of the target values in the test set (kt max − kt min), is calculated as
$$ \Delta_i = \frac{k_i - k_{ti}}{k_{t\max} - k_{t\min}}, \tag{16} $$
the ate, wce and r are defined as follows:
$$ ate = \frac{1}{n}\sum_{i=1}^{n} |\Delta_i|, \tag{17} $$
$$ wce = \max_{i=1,\dots,n} |\Delta_i|, \tag{18} $$
$$ r = \frac{\sum_{i=1}^{n} (k_i - \bar{k})(k_{ti} - \bar{k}_t)}{\sqrt{\sum_{i=1}^{n} (k_i - \bar{k})^2 \, \sum_{i=1}^{n} (k_{ti} - \bar{k}_t)^2}}, \tag{19} $$
where n is the number of the samples in the considered set, and $\bar{k}$ and $\bar{k}_t$ are the mean values of the ann response and the target values, respectively:
$$ \bar{k} = \frac{1}{n}\sum_{i=1}^{n} k_i \quad \text{and} \quad \bar{k}_t = \frac{1}{n}\sum_{i=1}^{n} k_{ti}. \tag{20} $$
4. proposed model
in the proposed model, an ann (fig. 3) is trained to predict the coupling coefficient for the given resonator dimensions a, s, w, the substrate height h and the relative dielectric constant εr. besides these original input parameters, the ann has an additional input representing the prior knowledge, which is the approximate value of the coupling coefficient, here marked as kapprox, which is calculated by eqs. (1)-(15) given in section 2. the training and test sets consist of data samples, where one sample contains one combination of the values of the original input parameters, the kapprox calculated for the given input combination, and the corresponding target value of the coupling coefficient k obtained by precise simulations in the full-wave em simulator.
fig. 3 proposed pki neural model of the microstrip coupled resonator coupling coefficient
5. results and discussion
the proposed approach has been applied to model the microstrip coupled resonator coupling coefficient by exploiting the same data used in [6] for developing the black-box neural models aimed at predicting the coupling coefficient for the given resonator dimensions and the properties of the substrate, k = f(a, w, s, h), for a constant value of εr. the considered ranges of the input dimensions as well as the considered values of εr are given in table 1. the training set consisted of 2089 samples covering all four εr values, whereas the test set consisted of 40 samples not used in the training set. several anns with different numbers of hidden neurons were trained, and the best model was obtained with the ann having two hidden layers, each containing 17 neurons.
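the prior knowledge input described in section 4 can be illustrated with a short python sketch that evaluates the empirical expressions (1)-(15) and appends the result to the original inputs. this is a sketch under stated assumptions rather than the authors' implementation: the function names are hypothetical, the factor π/16 and the placement of the square roots follow the usual form of the expressions from [11], the dimensions are taken in mm (only the ratios to h enter the formulae), and the example geometry is chosen arbitrarily within the ranges considered in the paper.

```python
import math


def k_electric(a, w, s, h, eps_r):
    """Approximate electric coupling coefficient, eqs. (4) and (6)-(10)."""
    a_h, w_h, s_h = a / h, w / h, s / h
    a_e = 0.2259 - 0.01571 * eps_r + 0.1 * math.sqrt(eps_r + 1.0) * w_h
    p_e = 1.0886 + 0.03146 * w_h ** 4
    b_e = (1.0678 + 0.226 * math.log((eps_r + 1.0) / 2.0)) * s_h ** p_e
    d_e = (0.1608 - 0.06945 * math.sqrt(a_h)) * s_h ** 1.15
    f_e = -0.9605 + 1.4087 * math.sqrt(a_h) - 0.2443 * a_h
    return math.pi / 16.0 * f_e * math.exp(-a_e) * math.exp(-b_e) * math.exp(-d_e)


def k_magnetic(a, w, s, h):
    """Approximate magnetic coupling coefficient, eqs. (5) and (11)-(15)."""
    a_h, w_h, s_h = a / h, w / h, s / h
    a_m = -0.06864 + 0.14142 * w_h + 0.08655 * w_h ** 3
    p_m = 0.8885 - 0.1751 * math.sqrt(w_h)
    b_m = 1.2 * s_h ** p_m
    d_m = (1.154 - 0.8242 * math.sqrt(a_h) + 0.1417 * a_h) * s_h
    f_m = -0.5014 + 1.0051 * math.sqrt(a_h) - 0.1557 * a_h
    return math.pi / 16.0 * f_m * math.exp(-a_m) * math.exp(-b_m) * math.exp(-d_m)


def k_approx(a, w, s, h, eps_r):
    """Rough mixed coupling coefficient, eqs. (1)-(3)."""
    return 0.5 * k_magnetic(a, w, s, h) + 0.6 * k_electric(a, w, s, h, eps_r)


def pki_input(a, w, s, h, eps_r):
    """ANN input vector: the original inputs followed by the prior knowledge."""
    return [a, w, s, h, eps_r, k_approx(a, w, s, h, eps_r)]


# hypothetical geometry (mm) within the considered ranges
print(pki_input(a=7.0, w=1.0, s=1.0, h=1.27, eps_r=10.2))
```

the last element of the returned vector plays the role of kapprox; the target value used for training would still come from the full-wave em simulation.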
the ate, wce and r values for the training set and the test set are given in table 2. the corresponding scatter plots showing the correlation of the predicted and target values for the training and test sets are given in fig. 4.
table 1 considered ranges/values of the input parameters
parameter   range/values
a           (5 – 20) mm
w           (0.1 – 4) mm
s           (0.1 – 3.5) mm
h           (0.254 – 1.575) mm
εr          2.33, 4.4, 6.15, 10.2
table 2 test statistics for the training and the test sets
set            ate[%]   wce[%]   r
training set   0.5      2        0.99967
test set       0.24     2.55     0.99981
table 3 comparison of the predicted and target values for ten chosen test samples
k target   k – ann model   ae         re[%]
0.096523   0.097757        0.001234   1.28
0.082074   0.082802        0.000728   0.89
0.066264   0.065804        0.000460   0.69
0.068744   0.066463        0.002280   3.32
0.073213   0.074809        0.001596   2.18
0.066295   0.065904        0.000390   0.59
0.074939   0.075631        0.000692   0.92
0.070582   0.070506        0.000075   0.10
0.058675   0.059139        0.000464   0.79
0.047486   0.047668        0.000183   0.38
fig. 4 correlation of the ann generated coupling coefficient and the reference target values (a) training set, (b) test set
small errors in predicting both the training and the test values, as well as a good correlation, show that the proposed model not only learnt the training data well but also has a good generalization accuracy on the test set not seen by the ann during the training phase. as an additional illustration, table 3 reports, for ten randomly selected test samples, the target and predicted values together with the corresponding absolute errors (ae, the absolute difference of the predicted and target values) and relative errors (re, the ae divided by the target value and expressed in percent). the rest of the test samples showed similar errors. the relative errors are mostly below 2%, which can be considered a good prediction accuracy.
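the error measures used above can be reproduced with a few lines of python. the sketch below, whose function names are hypothetical, recomputes ae and re for the ten samples of table 3 and evaluates ate, wce and r in the sense of eqs. (16)-(19) on those same ten samples. since the dynamic range of these ten samples differs from that of the full 40-sample test set, the ate and wce obtained here are illustrative and are not expected to match table 2.

```python
import math

# target and ann-predicted coupling coefficients of the ten samples in table 3
k_target = [0.096523, 0.082074, 0.066264, 0.068744, 0.073213,
            0.066295, 0.074939, 0.070582, 0.058675, 0.047486]
k_ann = [0.097757, 0.082802, 0.065804, 0.066463, 0.074809,
         0.065904, 0.075631, 0.070506, 0.059139, 0.047668]


def abs_rel_errors(pred, target):
    """Per-sample absolute error and relative error in percent."""
    return [(abs(p - t), 100.0 * abs(p - t) / t) for p, t in zip(pred, target)]


def ate_wce_r(pred, target):
    """ATE[%], WCE[%] and Pearson r following eqs. (16)-(19)."""
    n = len(target)
    span = max(target) - min(target)          # dynamic range of the targets
    deltas = [abs(p - t) / span for p, t in zip(pred, target)]
    ate = 100.0 * sum(deltas) / n
    wce = 100.0 * max(deltas)
    mean_p = sum(pred) / n
    mean_t = sum(target) / n
    num = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, target))
    den = math.sqrt(sum((p - mean_p) ** 2 for p in pred)
                    * sum((t - mean_t) ** 2 for t in target))
    return ate, wce, num / den


for ae, re in abs_rel_errors(k_ann, k_target):
    print(f"ae = {ae:.6f}, re = {re:.2f} %")
print(ate_wce_r(k_ann, k_target))
```

small differences in the last digit with respect to table 3 can appear because the tabulated values are themselves rounded.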
this model includes the dependence of the coupling coefficient on the relative dielectric constant, which was not possible to achieve with a simple black-box model by using the available data, i.e. without increasing the training set. to investigate how much the training set can be downsized while keeping the same level of accuracy of the proposed model, additional analyses have been performed. with this aim, the training set has been reduced by removing certain data samples, taking care that all considered areas of the input space remained properly represented.
fig. 5 correlation of the ann generated coupling coefficient and the reference target test values for the models trained with the (a) training set of 873 samples, (b) training set of 692 samples
the proposed model has been developed for each reduced size training set, ensuring the same level of training accuracy as in the initial case. the models have been further tested on the same test set (consisting of 40 samples) used for testing the model developed by using the full training set. the process of downsizing the training set was stopped when the accuracy in predicting the test values started to get worse. in total, the test has been performed with four data sets consisting of 1230, 1036, 873 and 692 data samples. the test statistics are shown in table 4.
table 4 test statistics for the test set obtained by the models trained with the reduced size training sets
training set             ate[%]   wce[%]   r
reduced – 1230 samples   0.93     5.48     0.998672
reduced – 1036 samples   1.01     4.20     0.998495
reduced – 873 samples    1.05     3.90     0.998728
reduced – 692 samples    6.25     26.47    0.952975
it can be seen that the accuracy of the first three models is very similar. however, for the last data set, although the model was well trained, the correlation with the target test values has significantly decreased, which is confirmed by the higher errors.
this can be clearly seen from fig. 5, which gives the scatter plots of the predicted data versus the target data for the last two data sets, containing 873 and 692 samples: the model trained with 692 samples shows much higher discrepancies between the predicted and the target values of the coupling coefficient. it can be concluded that the number of training data can be more than halved compared to the considered initial training set (873 instead of 2089 samples). this further means that the proposed approach can be exploited to develop the model for determining the coupling coefficient with a much smaller number of training data than the pure black-box model.
6. conclusion
in this paper a novel modeling procedure exploiting the prior knowledge neural approach is proposed for accurate determination of the coupling coefficient of a microstrip coupled resonator. unlike the black-box neural approach, in which an ann is exploited to model the dependence of the coupling coefficient on the filter geometry and substrate properties, in the proposed model an additional input of the ann is a value of the coupling coefficient obtained by mathematical expressions for its approximate calculation, representing the prior knowledge for the ann. by introducing the prior knowledge, the number of needed samples in the training data is reduced, which means that less time is needed to acquire the training data by the time consuming em simulations, making the whole process of the model development more efficient and faster. compared to the black-box model, the proposed model needs significantly less training data to reach the desired accuracy. moreover, it gives a good accuracy in the cases where the black-box approach would need much more data. in the considered case, with the available training data, the model includes the dependence on the relative dielectric constant, which was not possible to achieve with a pure ann model.
the model provides values of the coupling coefficient which are very close to the target values obtained by the em simulations. as the ann can be described by a set of mathematical expressions based on the basic mathematical operations and the exponential function, the ann response can be calculated in a very short time. consequently, the ann accompanied with the expressions representing the prior knowledge can be used for instant prediction of the coupling coefficient. in other words, the proposed model can be successfully used as a fast and accurate replacement of the em simulation for the coupling coefficient determination. looking from the side of the expressions used as the prior knowledge, which are used for approximate determination of the coupling coefficient, the ann can be seen as an addition to these expressions improving their accuracy.
acknowledgement: the presented research has been supported by the ministry for education, science and technological development of serbia and by the ministry of education, republic bulgaria and faculty of telecommunications under contract number дн07/19/15.12.2016 "methods of estimation and optimization of the electromagnetic radiation in urban areas".
references
[1] j. hong and m. j. lancaster, microstrip filters for rf/microwave applications, john wiley & sons, 2001. [2] r. j. cameron, c. m. kudsia and r. m. mansour, microwave filters for communication systems: fundamentals, design, and applications, second edition, john wiley & sons, 2018. [3] r. j. cameron, "advanced coupling matrix synthesis techniques for microwave filters", ieee trans. microw. theory tech., vol. 51, no. 1, pp. 1–10, jan. 2003. [4] s. amari, "synthesis of cross-coupled resonator filters using an analytical gradient-based optimization technique", ieee trans. microw. theory tech., vol. 48, no. 9, pp. 1559–1564, sept. 2000. [5] m. mitić, m. nedelchev, a. kolev and z.
marinković, "ann based design of microstrip square open loop resonator filters", in proceedings of the joint international conference on digital arts, media and technology with ecti northern section conference on electrical, electronics, computer and telecommunications engineering , pattaya, thailand, 11–14 march 2020, pp. 158–161. [6] m. nedelchev, m. mitić, a. kolev and z. marinković, "modeling and design of microstrip coupled resonator filters based on anns", in proceedings of the 43rd international conference on telecommunications and signal processing, milan, italy, july 7-9, 2020, pp. 470–473. [7] m. nedelchev, z. marinković and a. kolev, "ann based design of planar filters using square open loop dgs resonators", in proceedings of the 53rd international scientific conference on information, communication and energy systems and technologies icest 2018, sozopol, bulgaria, june 28-30, 2018, pp. 59–92. [8] m. nedelchev, z. marinković and a. kolev, "ann modelling of planar filters using square open loop dgs resonators", in proceedings of the 4th eai international conference on future access enablers of ubiquitous and intelligent infrastructures (fabulous 2019), sofia, bulgaria, march 28-29, 2019, pp. 363–371. [9] j.-s. hong and m. j. lancaster, "transmission line filters with advanced filtering characteristics", in proceedings of the mtt-s international microwave symposium digest, vol. i, boston, ma, usa, june 2000, pp. 319–322. [10] j.-s. hong and m. j. lancaster. "theory and experiment of novel microstrip slow-wave open-loop resonator filters", ieee trans. microw. theory tech., vol. 45, no. 12, pp. 2358–2365, dec. 1997. [11] j.-s. hong and m. j. lancaster, "couplings of microstrip square open-loop resonators for cross-coupled planar microwave filters", ieee trans. microw. theory tech., vol. 44, no. 11, pp. 2099–2109, nov. 1996. [12] q. j. zhang and k. c. gupta, neural networks for rf and microwave design, artech house, 2000. [13] h. kabir, l. zhang, m. yu, p. 
aaen, j. wood and q. j. zhang, "smart modelling of microwave devices", ieee microw. mag., vol. 11, no. 3, pp. 105–108, may 2010. [14] z. marinković, g. crupi, a. caddemi, v. marković and d. m.m.‐p. schreurs, "a review on the artificial neural network applications for small‐signal modeling of microwave fets", int. j. numer. model el., e2668, may/june 2020. [15] z. stanković, n. dončov, "prediction of the em signal delay in the ionosphere using neural model", facta univ. ser.: elec. energ., vol. 32, no. 2, pp. 287–302, 2019. [16] t. ćirić, z. marinković, r. dhuri, o. pronić-rančić and v. marković, "hybrid neural lumped element approach in inverse modeling of rf mems switches", facta univ. ser.: elec. energ., vol. 33, no. 1, pp. 27–36, march 2020. [17] j. jin, f. feng, j. n. zhang, s. x. yan, w. c. na and q. j. zhang, "a novel deep neural network topology for parametric modeling of passive microwave components", ieee access, vol. 8, pp. 82273–82285, may 2020. [18] q.-j. zhang, e. gad, b. nouri, w. na and m. nakhla, "simulation and automated modeling of microwave circuits: state-of-the-art and emerging trends", ieee j. microwaves, vol. 1, no. 1, pp. 494–507, jan. 2021. [19] j. n. zhang, f. feng, j. jin, w. zhang, z. zhao and q.-j. zhang, "adaptively weighted yield-driven em optimization incorporating neuro-transfer function surrogate with applications to microwave filters", ieee trans. microw. theory tech., vol. 69, no. 1, pp. 518–528, jan. 2021.
facta universitatis series: electronics and energetics vol. 36, no 1, march 2023, pp.
43-51 https://doi.org/10.2298/fuee2301043r © 2023 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper
dual band mimo antenna for lte, 4g and sub–6 ghz 5g applications
pinku ranjan1, swati yadav2, amit bage3
1department of electrical / electronic engineering, atal bihari vajpayee-indian institute of information technology and management (abv-iiitm), gwalior, india 2department of electronics & telecommunication engineering, college of engineering roorkee, uttarakhand-247667, india 3department of electronics and communication engineering, national institute of technology, hamirpur, india
abstract. in this manuscript, a compact mimo antenna for wireless applications is presented. the proposed antenna consists of an f-shaped radiator with a circular slot in the center and a rectangular ground plane on the other side of the substrate. the overall size of the proposed antenna is 48 × 48 mm2. the antenna is designed to work in two frequency bands, from 1.5 to 2.3 ghz and from 3.7 to 4.2 ghz, with resonant frequencies of 1.8 ghz and 3.9 ghz respectively. the diversity performance of the antenna is also evaluated by using a variety of parameters such as the envelope correlation coefficient (ecc), diversity gain (dg), total active reflection coefficient (tarc), etc. the value of ecc is 0.02, which shows good diversity performance of the antenna. to validate the simulated results, the proposed antenna has been fabricated and measured, and the simulated and measured results show good agreement with each other.
key words: mimo antenna; envelope correlation coefficient (ecc); total active reflection coefficient (tarc)
1. introduction
in worldwide terms, wireless communication is considered to be the fastest growing technology. in 2020, it is expected that 70 percent of the world's population will have at least one smartphone. each new generation of wireless communication requires improvements in data rate, antenna size and gain.
a technology that fulfills the higher demands of such future wireless communication is the use of multiple input multiple output (mimo) antennas. in mimo antenna design technology, multiple antennas are used on both the transmitting and the receiving side in order to increase the radio link capacity. in this technique, more than one data signal is simultaneously transmitted or received over the same radio channel. by using the mimo technology, the signal capturing capacity of the receiver is increased by allowing antennas to combine their data streams that arrive from different paths at different times. mimo is a central technique in current research and will play a key role in the next generation wireless systems, including 5g networks.

received may 26, 2022; revised july 14, 2022, and july 19, 2022; accepted july 25, 2022
corresponding author: pinku ranjan, abv-indian institute of information technology and management (iiitm), gwalior, madhya pradesh, india
e-mail: pinkuranjan@iiitm.ac.in

for the mimo antenna system, isolation between the two radiating elements is very important. therefore, the two radiators should be designed in such a way that the isolation between them is less than –15 db. ensuring the isolation between the antenna elements within a miniaturized size is a big challenge for antenna designers. in the past few years many researchers have proposed different mimo antennas with different techniques [1]–[10]. in [11], an antenna with a two-element semi-ring along with a uwb amplifier is presented to design a mimo antenna. an annular slot antenna and two shorts in opposite directions, placed at 45 degrees between the microstrip lines, are used to achieve isolation [12]. in 2018 [13], a. dkiouak et al. presented a compact mimo antenna for wireless applications based on two symmetrical monopoles with a t-shape junction. the t-shape junction is used to enhance the isolation between the two antennas. 
to suppress the reactive coupling between the different antenna elements of a mimo antenna, the technique of parasitic elements is used [14]. in [15], a highly isolated compact 2 x 2 mimo antenna is designed using pifa, and dgs is used to improve the inter-port isolation.

in the proposed design, a simple f–shaped radiator is used to obtain the dual band function of the mimo antenna for wireless communication. the f–shaped radiator is chosen to obtain the desired bands of application. the antenna has been designed to work in two different frequency bands, 1.5–2.3 ghz and 3.7–4.2 ghz, with resonant frequencies of 1.8 ghz and 3.9 ghz respectively. the numerical analysis has been carried out using high frequency simulation software (hfss).

the organization of the manuscript is as follows. in section 2, the antenna design and configuration are presented with the design steps. in section 3, simulated and measured return loss, isolation between ports and radiation patterns are presented. in section 4, diversity performance is evaluated in terms of ecc, tarc, and dg. finally, the conclusion is provided in section 5.

2. antenna design and configuration

2.1. methodology

the flowchart of the proposed antenna from design specification to fabrication and measurement is shown in fig. 1. the design methodology of the proposed mimo antenna starts from the antenna design specification. after the design specification, a single element antenna is designed with the desired frequency response. the single element antenna is then modified to a double element square patch mimo antenna. in order to achieve the desired mimo characteristics and frequency ranges, an f-shaped mimo antenna is designed with a circular slot. in the next step, all the parameters of the designed antennas are optimized and the performance is checked. once the desired performance is achieved, the proposed antenna is fabricated and measured. 
fig. 1 flow diagram from antenna specification to fabrication

2.2. design parameter

the front view and back view of the proposed antenna with its dimensions are shown in fig. 2. the antenna has been designed on an fr-4 dielectric substrate having thickness = 1.6 mm, copper thickness = 0.035 mm, dielectric constant = 4.4, and loss tangent = 0.02. the overall dimensions of the proposed antenna are 48 × 48 mm2. the antenna consists of two f-shaped radiating elements, which are placed horizontally next to each other on the top side of the substrate, along with a rectangular ground plane, which is designed on the bottom side of the dielectric substrate.

(a) (b)
fig. 2 (a) top view and (b) bottom view of proposed antenna where l = 48, lp1 = 23, lp2 = 41, lp3 = 4, lp4 = 6, lp5 = 5, w = 48, wp1 = 3, wp2 = 13, wp3 = 4, wp4 = 5, wp5 = 8 and lg1 = 20 (all in mm)

the design steps of the proposed antenna are shown in fig. 3. in order to achieve the desired characteristics, there are three design steps. at first, a square shape radiator is designed along with the microstrip feed line as shown in fig. 3(a). the square shape antenna shows dual band performance with 1.4–2.3 ghz and 3.9–4.4 ghz bands, which are not the desired operating frequency bands. also, for the square shaped antenna, the second operating band shows poor impedance matching.

(a) (b) (c)
fig. 3 evolution of the proposed antenna (a) antenna 1 (b) antenna 2 (c) antenna 3

as a result, two f-shape slots are etched from the radiator in the next stage, as shown in fig. 3(b). the two f-slots are etched from the rectangular patch with different dimensions. this f–shaped antenna operates in the 1.5–2.3 ghz and 3.8–4.3 ghz frequency ranges, which is not the required lte and sub-6 ghz 5g band. in order to achieve the desired frequency band, the second operating band has been shifted to the lower frequency. 
in order to shift it to lower frequencies, a circular slot is etched from the upper part of the rectangular patch along with the f-slot. using this circular slot, the proposed antenna achieves the desired dual band performance with two operating bands from 1.8–2.3 ghz and 3.7–4.2 ghz. the width of the microstrip feed is kept the same for all three designs and is equal to 3 mm. the gap between the two radiators is 8 mm.

the simulated s-parameters for the antennas of fig. 3 are shown in fig. 4. fig. 4(a) reveals that antenna 3, which is the f–shaped structure with the circular slot, shows good performance. it is also clear from fig. 4(b) that the designed f–shaped antenna shows good impedance matching over the two frequency bands with center frequencies of 1.8 ghz and 3.9 ghz, and that the isolation between the two antennas is less than –5 db and –15 db for the two bands.

(a) (b)
fig. 4 (a) simulated s12/s21 parameter versus frequency for antenna 1, 2 and 3 and (b) simulated s11/s22 parameters versus frequency for antenna 1, 2 and 3

to analyze the behavior of the antenna, the surface current densities at the two resonant frequencies are determined. for calculating the current densities at the two resonant frequencies, port 1 is excited. the current distribution of the proposed antenna at 1.8 ghz and 3.9 ghz is shown in fig. 5.

(a) (b)
fig. 5 surface current densities at (a) 1.8 ghz and (b) 3.9 ghz

the figure reveals that, at the 1.8 ghz resonant frequency, the surface current is uniformly distributed at the feed line and the lower part of the f-shaped radiator. at 3.9 ghz the current is uniformly distributed at the lower strip of the f-shaped structure.

3. result and discussion

in order to validate the numerical analysis, the proposed dual band mimo antenna has been fabricated using a pcb prototype machine. the fabricated top view and bottom view are shown in fig. 6. 
the fabricated antenna is measured using an agilent n5230a vector network analyzer.

(a) (b)
fig. 6 fabricated photograph of the proposed dual band mimo antenna, (a) top view, and (b) bottom view

the simulated and measured s-parameters (s11/s22 and s12/s21) are compared in fig. 7. the figure shows that they are in good agreement with each other. the measurement of radiation patterns is performed inside an anechoic chamber for each element, keeping the other element terminated with a matched load. the radiation pattern at the two frequencies is calculated for the two principal planes (e-plane and h-plane) as shown in fig. 8.

(a) (b)
fig. 7 the comparison of simulated and measured results (a) s11/s22 db and (b) s12/s21 db

(a) (b)
fig. 8 simulated radiation pattern for proposed antenna (a) at 1.8 ghz and (b) at 3.9 ghz

fig. 8(a) shows the simulated radiation pattern of the proposed antenna at 1.8 ghz for the e and h planes, and the radiation pattern at 3.9 ghz for the e and h planes is shown in fig. 8(b). the figure shows that the antenna has a consistent radiation pattern in both frequency bands. fig. 9 shows the gain of the presented antenna for the two frequency bands. from the figure, it is clear that the designed antenna has a gain of around 4 db and 2 db at 1.8 ghz and 3.9 ghz respectively.

fig. 9 simulated gain versus frequency graph for the proposed dual band mimo antenna

a comparison of the characteristics of the proposed mimo antenna with a few other reported mimo antennas [11, 12, 13, 14 and 15] is given in table 1.

table 1 comparison of presented antenna with previous literature
ref. | impedance bw (ghz) | isolation (db) | size (mm) | electrical size in guided wavelength | ecc
[11] | 1.8–5.5 | -12 | 50×90×0.76 | 1.13 × 2.0486 × 0.0173 | 0.33
[12] | 3–12 | -15 | 80×80×0.6 | 4.19 × 4.19 × 0.031 | (not given)
[13] | 2.35–3.05 and 5.12–5.51 | -12 | 43×37×1.6 | 0.811 × 0.69 × 0.0302 | 0.001
[14] | 3.2–3.7, 5.1–5.6, 6.7–7.5 | -30 | 70×50×0.6 | 1.6886 × 1.2061 × 0.0145 | (not given)
[15] | 5.2–6 | -25 | 100×50×0.8 | 3.95 × 1.97 × 0.0316 | <0.5
proposed antenna | 1.5–2.3 and 3.3–4.2 | -15 | 48×48×1.6 | 0.6377 × 0.6377 × 0.0213 | <0.002

4. diversity performance

for a mimo antenna, the diversity performance shows how efficiently the two antennas work individually. the diversity performance can be quantified using different parameters such as the envelope correlation coefficient (ecc), diversity gain (dg), total active reflection coefficient (tarc), etc. the capacity of each antenna to receive information individually is expressed through the ecc. to achieve good performance, the value of ecc should be less than 0.2, and it can be calculated from the s-parameters using the method proposed in [16]:

$$ECC = \frac{\left|S_{11}^{*} S_{12} + S_{21}^{*} S_{22}\right|^{2}}{\left(1 - |S_{11}|^{2} - |S_{21}|^{2}\right)\left(1 - |S_{22}|^{2} - |S_{12}|^{2}\right)} \tag{1}$$

the diversity gain (dg) can be calculated from the envelope correlation coefficient. for the proposed dual band mimo antenna, the diversity gain in db can be calculated using [16]:

$$DG = 10\sqrt{1 - |ECC|^{2}} \tag{2}$$

the simulated ecc and dg of the proposed antenna are shown in fig. 10. the figure shows that the ecc of the proposed antenna is below 0.002 in both frequency bands, which ensures good diversity performance of the presented mimo antenna. in the same figure, the diversity gain of the proposed antenna is above 9.9 db at the resonant frequencies.

in the transmission and reception process of mimo antenna systems, the joint operation of multiple antennas affects the overall operating bandwidth and efficiency. the effect of multiple antenna elements on each other is expressed through the total active reflection coefficient (tarc). 
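as a minimal sketch of how ecc and dg values of the kind quoted above could be obtained from complex two-port s-parameters, the python snippet below evaluates the standard s-parameter expression for ecc and then the diversity gain. the function names and the sample s-parameter values are illustrative assumptions, not data from the fabricated antenna. the second denominator factor uses |s22|^2 and the dg uses a factor of 10, which is the commonly used form and is consistent with the roughly 9.9 db reported in the text.

```python
import math


def ecc_from_s(s11, s12, s21, s22):
    """Envelope correlation coefficient from complex two-port S-parameters.

    Assumes a lossless antenna, which is the usual condition for the
    S-parameter based ECC expression.
    """
    num = abs(s11.conjugate() * s12 + s21.conjugate() * s22) ** 2
    den = (1 - abs(s11) ** 2 - abs(s21) ** 2) * (1 - abs(s22) ** 2 - abs(s12) ** 2)
    return num / den


def diversity_gain_db(ecc):
    """Diversity gain in dB from the ECC (assumed form: 10*sqrt(1 - |ECC|^2))."""
    return 10 * math.sqrt(1 - abs(ecc) ** 2)


# Hypothetical S-parameters (linear, complex) at one frequency point.
S11 = complex(0.2, 0.0)
S22 = complex(0.2, 0.0)
S12 = complex(0.1, 0.0)
S21 = complex(0.1, 0.0)

ecc = ecc_from_s(S11, S12, S21, S22)
dg = diversity_gain_db(ecc)
print(f"ECC = {ecc:.6f}, DG = {dg:.4f} dB")
```

in practice the same two functions would be evaluated at every frequency point of the simulated or measured s-parameter sweep to produce curves like those in fig. 10.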
tarc is defined as the square root of the ratio of the total reflected power to the total incident power, and it is the apparent return loss of the overall mimo antenna system. for the dual-band mimo system, the value of tarc can be calculated using the equation given in [17]:

$$TARC = \frac{\sqrt{|S_{11} + S_{12}|^{2} + |S_{21} + S_{22}|^{2}}}{\sqrt{2}} \tag{3}$$

the value of tarc should be <0 db for mimo communication. the simulated tarc of the proposed mimo antenna is shown in fig. 11. the figure reveals that at both resonant frequencies the tarc is below -25 db.

fig. 10 simulated ecc and dg of the proposed antenna

fig. 11 simulated tarc of the proposed dual band mimo antenna

5. conclusion

this manuscript introduces a small mimo antenna for lte 4g and the sub-6 ghz 5g channel. the suggested antenna operates effectively in two frequency bands, 1.8–2.3 ghz and 3.7–4.3 ghz, having bandwidths of 500 mhz and 600 mhz respectively. the measured and simulated results of the proposed antenna are compared and show good agreement with each other. the antenna also shows good diversity performance with a low envelope correlation coefficient, good diversity gain and a low value of tarc. the radiation patterns in the e–plane and h–plane at both resonant frequencies are omnidirectional.

acknowledgement: the author would like to acknowledge iit kanpur for carrying out the antenna fabrication at their institute.

references
[1] b. x. wang, w. q. huang and l. l. wang, "ultra-narrow terahertz perfect light absorber based on surface lattice resonance of a sandwich resonator for sensing applications", rsc advances, vol. 7, pp. 42956-42963, 2017.
[2] d. hu, t. meng, h. wang, y. ma and q. zhu, "ultra-narrow-band terahertz perfect metamaterial absorber for refractive index sensing application", results phys., vol. 19, p. 103567, 2020.
[3] f. yan, q. li, h. tian, z. wang and l. 
li, "ultrahigh q-factor dual-band terahertz perfect absorber with dielectric grating slit waveguide for sensing", j. phys. d: appl. phys., vol. 53, p. 235103, 2020. [4] q. xie, g. dong, b. wang and w. huang, "design of quad-band terahertz metamaterial absorber using a perforated rectangular resonator for sensing applications", nanoscale res. lett., vol. 13, p. 137, 2018. [5] m. janneh, a. de marcellis, e. palange, a. t. tenggara and d. byun, "design of a metasurface-based dualband terahertz perfect absorber with very high q-factors for sensing applications", opt. commun., vol. 416, pp. 152-159, 2018. [6] w. yin, z. shen, s. li, l. zhang and x. chen, "a three-dimensional dual-band terahertz perfect absorber as a highly sensitive sensor", front. phys., vol. 9, p. 665280, 2021. [7] x. hu, g. xu, l. wen, h. wang, y. zhao, y. zhang, d. r. s. cumming and q. chen, "metamaterial absorber integrated microfluidic terahertz sensors", laser photonics rev., vol. 10, pp. 962-969, 2016. [8] l. cong, s. tan, r. yahiaoui, f. yan, w. zhang and r. singh, "experimental demonstration of ultrasensitive sensing with terahertz metamaterial absorbers: a comparison with the metasurfaces", appl. phys. lett., vol. 106, p. 031107, 2015. [9] a. kovačević, m. potrebić and d. tošić, "sensitivity analysis of possible thz virus detection using quad-band metamaterial sensor", in proceedings of the ieee 32nd international conference on microelectronics (miel), niš, serbia, 2021, pp 107-110. [10] n. akter, m. m. hasan and n. pala, "a review of thz technologies for rapid sensing and detection of viruses including sars-cov-2", mdpi biosensors, vol. 11, p. 349, 2021. [11] n. shen, p. tassin, t. koschny and c. soukoulis, "comparison of goldand graphene-based resonant nanostructures for terahertz metamaterials and an ultra-thin graphene-based modulator", phys. rev. b, vol. 90, no. 11, p. 115437, 2014. [12] wipl-d pro 17, 3d electromagnetic solver, wipl-d d.o.o., belgrade, serbia, 2021. 
available online: http://www.wipl-d.com (accessed on 29 april 2022).
[13] b. dadonaite, b. gilbertson, m. l. knight, s. trifković, s. rockman, a. laederach, l. e. brown, e. fodor and d. l. v. bauer, "the structure of the influenza a virus genome", nat. microbiol., vol. 4, no. 11, pp. 1781-1789, 2019.
[14] m. amin, o. siddiqui, h. abutarboush, m. farhat and r. ramzan, "a thz graphene metasurface for polarization selective virus sensing", carbon, vol. 176, pp. 580-591, 2021.
[15] b. wang, a. sadeqi, r. ma, p. wang, w. tsujita, k. sadamoto, y. sawa, h. r. nejad, s. sonkusale, c. wang et al, "metamaterial absorber for thz polarimetric sensing", in proceedings of the spie, terahertz, rf, millimeter, and submillimeter-wave technology and applications xi, san francisco, ca, usa, 2018, vol. 10531, pp. 1-7.
[16] f. lan, f. luo, p. mazumder, z. yang, l. meng, z. bao, j. zhou, y. zhang, s. liang, z. shi et al, "dualband refractometric terahertz biosensing with intense wave-matter-overlap microfluidic channel", biomed. opt. express, vol. 10, pp. 3789-3799, 2019.

facta universitatis series: electronics and energetics vol. 35, no 4, december 2022, pp. 587-601 https://doi.org/10.2298/fuee2204587a © 2022 by university of niš, serbia | creative commons license: cc by-nc-n
original scientific paper

doherty amplifier linearization by digital injection methods*

aleksandar atanasković1, nataša maleš-ilić1, aleksandra đorić1, djuradj budimir2
1faculty of electronic engineering, university of niš, niš, serbia
2university of westminster, london, uk

abstract. verification of two linearization methods, applied to an asymmetrical two-way microstrip doherty amplifier in experiment and to a symmetrical two-way doherty amplifier in simulation, is performed in this paper. the laboratory set-ups are formed to generate the baseband nonlinear linearization signals of the second order. 
after being tuned in magnitude and phase in the digital domain the linearization signals modulate the second harmonics of fundamental carrier. in the first method, adequately processed signals are then inserted at the input and output of the main doherty amplifier transistor, whereas in the second method, they are injected at the outputs of the doherty main and auxiliary amplifier transistors. the experimental results are obtained for 64qam digitally modulated signals. as a proof of concept, the linearization methods are also verified in simulation, for doherty amplifier designed to work in 5g band below 6 ghz, utilizing 20 mhz lte signal. key words: doherty amplifier, baseband signal, second harmonic, linearization, experimental verification. 1. introduction in modern wireless communications, the efficiency of rf transmitters largely depends on the efficiency of power amplifiers (pa), so the development of 5g/6g systems requires new pa architectures that will ensure that amplifiers hold high efficiency while maintaining good linearity. therefore, it is necessary to find a compromise between the key parameters of the pa, such as efficiency, power output and linearity. with the classic architecture of the pa, this is not easy to be achieved, and it is very difficult to optimize all the key parameters of pa. usually the optimal design of the pa for one parameter leads to the degradation of another important parameter; therefore, the solution of this problem is to design an energy efficient pa, which is then to be linearized by one of the appropriate linearization techniques. 
a pa topology characterized by high efficiency is the doherty amplifier (da), which is widely used in contemporary wireless communication systems.

received march 31, 2022; revised june 10, 2022; accepted june 28, 2022
corresponding author: aleksandar atanasković, faculty of electronic engineering, university of niš, aleksandra medvedeva 14, 18000 niš, serbia
e-mail: aleksandar.atanaskovic@elfak.ni.ac.rs
*an earlier version of this paper was presented at the 15th international conference on advanced technologies, systems and services in telecommunications (telsiks 2021), october 20-22, 2021, in niš, serbia [1]

in recent times, different linearization techniques [2] have been used for pa nonlinearity compensation, such as feedback linearization [3-4], feedforward linearization [5-6], digital predistortion [7-9] and digital injection methods [10-11].

in earlier work we deployed the digital linearization technique [10], [12-14], which processes the i and q signals to generate adequate 2nd order baseband linearization signals adjusted in magnitude and phase angle. these signals are then driven at the gate and drain of the amplifier transistor, after modulating the 2nd harmonic of the fundamental carrier, in order to lower the nonlinearity of the single stage pa [10], [12], and the two-way da [10], [13]. in [14], the da was linearized by inserting the modulated signals for linearization at the outputs of the main and auxiliary amplifier transistors. the comparison of the two digital linearization methods was carried out in simulation on the designed broadband microstrip da, for different two-tone signal power levels and a maximum tone separation of 30 mhz, as well as for an ofdm signal.

in this paper, various experiments are performed on a doherty amplifier fabricated in microstrip technology [15] for the evaluation of the two linearization methods. the tests were carried out for a 64qam signal with a useful spectrum bandwidth of 2 mhz. 
measured results show the adjacent channel power ratio (acpr) at the dominant third-order intermodulation products and at the fifth-order intermodulation products. to confirm the efficiency of the linearization methods, the verification was also performed in simulation for a da designed to operate at 3.5 ghz [16]. the simulation was performed for a 20 mhz lte signal and various output power levels up to the 1-db compression point.

2. analysis

the applied linearization methods can be explained theoretically by modeling the nonlinearity of the amplifier transistor with a taylor-series polynomial model, which does not include memory effects. the fet output current ($i_{ds}$) in terms of the gate-source voltage ($v_{gs}$) and drain-source voltage ($v_{ds}$) is given by eq. 1, [10], [12], [15].

$$
\begin{aligned}
i_{ds}^{m/a} ={} & g_{m1} v_{gs}^{m/a} + g_{m2} \left(v_{gs}^{m/a}\right)^{2} + g_{m3} \left(v_{gs}^{m/a}\right)^{3} \\
& + g_{d1} v_{ds}^{m/a} + g_{d2} \left(v_{ds}^{m/a}\right)^{2} + g_{d3} \left(v_{ds}^{m/a}\right)^{3} \\
& + g_{m1d1} v_{gs}^{m/a} v_{ds}^{m/a} + g_{m2d1} \left(v_{gs}^{m/a}\right)^{2} v_{ds}^{m/a} + g_{m1d2} v_{gs}^{m/a} \left(v_{ds}^{m/a}\right)^{2}
\end{aligned}
\tag{1}
$$

where $g_{mx}$ represents the transconductance terms, $g_{dy}$ the drain-conductance terms and $g_{mxdy}$ the mixed terms (the order of each coefficient can be calculated as $x + y$), and m/a relates to the main and auxiliary amplifiers in the doherty circuit. the nonlinear terms defined by the coefficients $g_{d1}$ to $g_{d3}$ can be neglected according to previously performed analysis. also, the mixing terms of the 3rd order, $g_{m1d2}$ and $g_{m2d1}$, produce 3rd order intermodulation products (im3) that can be considered to cancel each other to some extent, so they are omitted from the final equations for the im3 of the da output current given in the text below, based on the results obtained in [10], [12], [15]. 
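the memoryless polynomial model described by eq. 1 can be sketched numerically as follows. this is only an illustration of the structure of the model (three transconductance terms, three drain-conductance terms and three mixed terms); the function name and all coefficient values are hypothetical placeholders, not values extracted for the transistors used in the paper.

```python
def drain_current(vgs, vds, c):
    """Memoryless third-order Taylor model of the FET drain current.

    `c` maps coefficient names (gm1..gm3, gd1..gd3, gm1d1, gm2d1, gm1d2)
    to their values; the order of a mixed coefficient gmxdy is x + y.
    """
    return (
        c["gm1"] * vgs + c["gm2"] * vgs**2 + c["gm3"] * vgs**3
        + c["gd1"] * vds + c["gd2"] * vds**2 + c["gd3"] * vds**3
        + c["gm1d1"] * vgs * vds
        + c["gm2d1"] * vgs**2 * vds
        + c["gm1d2"] * vgs * vds**2
    )


# Hypothetical coefficients, chosen only to exercise the function.
EXAMPLE_COEFFS = {
    "gm1": 0.10, "gm2": 0.02, "gm3": -0.005,
    "gd1": 0.01, "gd2": 0.001, "gd3": 0.0,
    "gm1d1": 0.004, "gm2d1": 0.0005, "gm1d2": 0.0003,
}

print(drain_current(0.5, 2.0, EXAMPLE_COEFFS))
```

evaluating the same function once with coefficients for the main cell and once for the auxiliary cell mirrors the m/a notation used in the text.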
however, those mixing terms are included in the equations that describe the 5th order intermodulation products of the da output current (im5), so that the influence of the injected 2nd order linearization signals on the im5 can be explained.

the baseband signals for linearization are formed by adequate processing of the in-phase i and quadrature-phase q components of the digital signal, resulting in the in-phase linearization component $I_{IM2} = I^{2} - Q^{2}$ and the quadrature-phase component $Q_{IM2} = 2IQ$, which are the products of the 2nd order nonlinearity. those signals are tuned in magnitude by $a_{\{i,o\}}^{m/a}$ and in phase by $\theta_{\{i,o\}}^{m/a}$, where the subscripts i and o denote the adaptation coefficients for the injection of the linearization signals at the input and output of the amplifier transistor. the baseband signals prepared in this manner then modulate the second harmonic of the fundamental carrier.

in the first linearization approach applied in this paper to the doherty amplifier, the signals for linearization are inserted at the input (together with the fundamental signal), as given by eq. 2, and at the output of the main amplifier transistor in the da, eq. 3, whereas in the second approach the signals for linearization are led to the transistor outputs of the main, eq. 3, and auxiliary stages, eq. 4, in the da. 
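the formation of the 2nd order baseband linearization signals described above can be sketched in python as follows. the snippet forms i^2 - q^2 and 2iq from one i/q sample, applies a magnitude and phase adjustment, and places the result on the second harmonic of the carrier. treating the magnitude/phase tuning as a complex rotation of the baseband pair is an assumption made here for illustration, and all function names and numeric values are hypothetical.

```python
import cmath
import math


def im2_baseband(i, q):
    """Second-order baseband products: (I^2 - Q^2, 2*I*Q)."""
    return i * i - q * q, 2 * i * q


def tune(i_im2, q_im2, a, theta):
    """Scale by `a` and rotate by `-theta` (assumed complex-baseband tuning)."""
    z = a * cmath.exp(-1j * theta) * complex(i_im2, q_im2)
    return z.real, z.imag


def second_harmonic_sample(i_im2, q_im2, w0, t):
    """Quadrature-modulate the baseband pair onto the carrier second harmonic."""
    return i_im2 * math.cos(2 * w0 * t) - q_im2 * math.sin(2 * w0 * t)


# One hypothetical I/Q sample, carrier angular frequency and time instant.
I, Q = 0.6, -0.3
W0 = 2 * math.pi * 1.0e3
T = 1.23e-4

i2, q2 = im2_baseband(I, Q)
fundamental = I * math.cos(W0 * T) - Q * math.sin(W0 * T)

# Squaring the fundamental yields a DC term plus half of the second-harmonic
# signal built from (I^2 - Q^2, 2IQ), which is why these are 2nd order products.
lhs = fundamental ** 2
rhs = 0.5 * (I * I + Q * Q) + 0.5 * second_harmonic_sample(i2, q2, W0, T)
print(abs(lhs - rhs) < 1e-12)
```

the factor of one half that appears in this identity is the same factor that multiplies the second-harmonic term in the injected signals.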
$$v_{gs}^{m} = v_{s}^{m}\left[I\cos(\omega_0 t) - Q\sin(\omega_0 t)\right] + \frac{1}{2} a_{i}^{m} e^{-j\theta_{i}^{m}}\left[(I^{2} - Q^{2})\cos(2\omega_0 t) - 2IQ\sin(2\omega_0 t)\right] \tag{2}$$

$$v_{ds}^{m} = v_{o}^{m}\left[I\cos(\omega_0 t) - Q\sin(\omega_0 t)\right] - \frac{1}{2} a_{o}^{m} e^{-j\theta_{o}^{m}}\left[(I^{2} - Q^{2})\cos(2\omega_0 t) - 2IQ\sin(2\omega_0 t)\right] \tag{3}$$

$$v_{ds}^{a} = v_{o}^{a}\left[I\cos(\omega_0 t) - Q\sin(\omega_0 t)\right] - \frac{1}{2} a_{o}^{a} e^{-j\theta_{o}^{a}}\left[(I^{2} - Q^{2})\cos(2\omega_0 t) - 2IQ\sin(2\omega_0 t)\right] \tag{4}$$

where $v_{s}^{m}$ and $v_{o}^{m}$ are the magnitudes of the input and output signals of the main amplifier transistor at the fundamental frequency, and $v_{o}^{a}$ is the magnitude of the output signal of the auxiliary amplifier transistor at the fundamental frequency.

the distorted output current of the doherty amplifier analysed for the im3 products is expressed by eq. 5 when the first linearization method is applied and by eq. 6 for the second linearization method. the im5 products are given by eqs. 7 and 8 for the two linearization methods.

$$
\begin{aligned}
\left. i_{out}^{1st}\right|_{IM3} = \Big[ & \tfrac{3}{4}\left(v_{s}^{m}\right)^{3} g_{m3}^{m} + \tfrac{3}{4}\left(v_{s}^{a}\right)^{3} g_{m3}^{a} + \tfrac{1}{2} a_{i}^{m} e^{-j\theta_{i}^{m}} v_{s}^{m} g_{m2}^{m} \\
& - \tfrac{1}{4} a_{o}^{m} e^{-j\theta_{o}^{m}} v_{s}^{m} g_{m1d1}^{m} + \tfrac{1}{4} a_{i}^{m} e^{-j\theta_{i}^{m}} v_{o}^{m} g_{m1d1}^{m} \Big] (I^{2} + Q^{2})\left(I\cos(\omega_0 t) - Q\sin(\omega_0 t)\right)
\end{aligned}
\tag{5}
$$

$$
\begin{aligned}
\left. i_{out}^{2nd}\right|_{IM3} = \Big[ & \tfrac{3}{4}\left(v_{s}^{m}\right)^{3} g_{m3}^{m} + \tfrac{3}{4}\left(v_{s}^{a}\right)^{3} g_{m3}^{a} \\
& - \tfrac{1}{4} a_{o}^{m} e^{-j\theta_{o}^{m}} v_{s}^{m} g_{m1d1}^{m} - \tfrac{1}{4} a_{o}^{a} e^{-j\theta_{o}^{a}} v_{s}^{a} g_{m1d1}^{a} \Big] (I^{2} + Q^{2})\left(I\cos(\omega_0 t) - Q\sin(\omega_0 t)\right)
\end{aligned}
\tag{6}
$$

$$
\begin{aligned}
\left. i_{out}^{1st}\right|_{IM5} = \Big[ & \tfrac{5}{8}\left(v_{s}^{m}\right)^{5} g_{m5}^{m} + \tfrac{5}{8}\left(v_{s}^{a}\right)^{5} g_{m5}^{a} + \tfrac{3}{2}\left(a_{i}^{m}\right)^{2} e^{-j2\theta_{i}^{m}} v_{s}^{m} g_{m3}^{m} \\
& + \tfrac{1}{2}\left(a_{o}^{m}\right)^{2} e^{-j2\theta_{o}^{m}} v_{s}^{m} g_{m1d2}^{m} - a_{i}^{m} a_{o}^{m} e^{-j(\theta_{i}^{m} + \theta_{o}^{m})} v_{o}^{m} g_{m1d2}^{m} \\
& + \tfrac{1}{2}\left(a_{i}^{m}\right)^{2} e^{-j2\theta_{i}^{m}} v_{o}^{m} g_{m2d1}^{m} - a_{i}^{m} a_{o}^{m} e^{-j(\theta_{i}^{m} + \theta_{o}^{m})} v_{s}^{m} g_{m2d1}^{m} \Big] (I^{2} + Q^{2})^{2}\left(I\cos(\omega_0 t) - Q\sin(\omega_0 t)\right)
\end{aligned}
\tag{7}
$$

$$
\begin{aligned}
\left. i_{out}^{2nd}\right|_{IM5} = \Big[ & \tfrac{5}{8}\left(v_{s}^{m}\right)^{5} g_{m5}^{m} + \tfrac{5}{8}\left(v_{s}^{a}\right)^{5} g_{m5}^{a} \\
& + \tfrac{1}{2}\left(a_{o}^{m}\right)^{2} e^{-j2\theta_{o}^{m}} v_{s}^{m} g_{m1d2}^{m} + \tfrac{1}{2}\left(a_{o}^{a}\right)^{2} e^{-j2\theta_{o}^{a}} v_{s}^{a} g_{m1d2}^{a} \Big] (I^{2} + Q^{2})^{2}\left(I\cos(\omega_0 t) - Q\sin(\omega_0 t)\right)
\end{aligned}
\tag{8}
$$

the 1st and 2nd terms in eqs. 5 and 6 represent the da linearity degradation caused by the 3rd order nonlinearity of the amplifier stages. the 3rd to 5th terms in eq. 5 are the nonlinear products of the second order between the fundamental signals and the linearization signals injected at the input and at the output of the main amplifier transistor in the da. the 3rd and 4th terms in eq. 6 are the mixing terms of the 2nd order between the fundamental signals and the linearization signals applied at the outputs of the main and auxiliary amplifiers in the da. it can be observed that the nonlinear terms of the 2nd order generated by the injection of the linearization signals can reduce the originally produced im3 distortion when the magnitude and phase of the linearization signals are adjusted adequately, [10], [12], [15].

the 1st and 2nd terms of eqs. 7 and 8 define the im5 products of the da output current generated by the 5th order nonlinearity of the da main and auxiliary stages. the additional terms for the two linearization methods are the 3rd order products that mix the linearization signals with the fundamental signals, which can reduce the original im5 products if their magnitudes and phases are related appropriately, [10], [12], [15].

3. 
da design

two doherty amplifiers were designed to verify the proposed linearization methods: a two-way asymmetrical doherty amplifier operating at 900 mhz central frequency [15] and a symmetrical doherty amplifier operating at 3.5 ghz central frequency [16].

the linearization effects were examined in experiment on the fabricated two-way asymmetrical doherty amplifier shown in figure 1, which consists of:
1. main amplifier and frequency diplexers;
2. auxiliary amplifier and frequency diplexer;
3. offset line and output combining networks;
4. pi attenuator;
5. power combiner for injection of the signal for linearization at the auxiliary amplifier output;
6. port for the injection of the linearization signal at the main transistor input;
7. port for the injection of the linearization signal at the main transistor output.

a detailed description of the doherty amplifier design can be found in [15]. it should be pointed out that the wilkinson power combiner denoted as 5 in figure 1 was used for another purpose in the linearization method exploited in [15]; in this paper, one port of the combiner was utilized for the linearization.

the maximal transducer gain of 9 db was measured for the fabricated two-way asymmetrical doherty amplifier with the main amplifier biased in class-ab (vd = 5 v, vgm = −3 v) and the auxiliary amplifier operating in the class-c regime (vd = 5 v, vga = −5 v), when the ap602a-2 gaas mesfet transistor was used in the amplifying cells. moreover, the measured 1-db compression point of the da is at 15 dbm output power, and a maximum output power of 18 dbm is achieved.

the symmetrical two-way doherty amplifier that operates in the 5g band below 6 ghz was designed according to the instructions given in [16]. the da was designed by using the cgh40010f gan hemt transistor. the drain voltage is 28 v, whereas the gate voltages of the main and auxiliary amplifiers are -2.8 v and -5.7 v, respectively. 
the main characteristics that relate to gain, 1-db compression point, power added efficiency (pae), dc power consumption etc. are presented in table 1 for the frequency of 3.5 ghz. additionally, the gain, gain compression, pae and supply current are shown in figure 2 over the range of da output power.

fig. 1 asymmetrical two-way doherty amplifier (all dimensions are in millimeters)

table 1 characterization of two-way symmetrical da 3.5 ghz

a) b)
fig. 2 symmetrical two-way doherty amplifier characteristics: a) gain and gain compression; b) pae and supply current

4. measurement set-up

the measurement set-up shown in figure 3 was established to verify in experiments the linearization methods developed by our research group [17]. the linearization methods for power amplifiers are based on 2nd order baseband nonlinear digital signals which, after being adequately modified and processed in the baseband, modulate the second harmonic of the fundamental carrier. the measurement system was designed to enable verification of linearization methods that generally use two such linearization signals. the measuring system can generate three independent, synchronized signals: the fundamental signal and two linearization signals at the frequency of the second harmonic of the fundamental signal. two ni usrp 2920 models and one ni usrp 2922 model were used for the measurement system, and they are connected to the computer via an ethernet switch. the labview environment was used for the implementation of the interface for management and control of the ni usrp devices (figure 4). a particular challenge during the implementation of the measurement system was the synchronization of the ni usrp devices. 
two of the three usrp devices were synchronized by a mimo cable (using the mimo expansion input), while the third device was synchronized with the other two via external 10 mhz reference and 1 pps signals. a rigol dg1022 two-channel function/arbitrary waveform generator was used as the generator of these signals. the outputs of the function generator were distributed to the corresponding inputs of the ni usrp device (ref in and pps in).

a) b)
fig. 3 experimental verification of linearization methods 1 and 2: a) measurement set-up; b) schematic diagram

fig. 4 interface for management and control of the ni usrp devices

when using the ni usrp with the labview environment, it is necessary to provide high processing power, a large amount of fast memory and a fast ethernet connection on the computer to which the ni usrps are connected, which is especially required if multiple usrp devices are used at the same time. the lack of any of these resources can significantly affect the reliable operation of the ni usrp and lead to frequent downtime [17].

to demonstrate the results of linearization, the generation of the useful 64qam signal and of the signals for linearization, together with the control of their magnitude and phase, was performed by the usrp platforms. the linearization effects were examined on the fabricated two-way asymmetrical doherty amplifier operating at 900 mhz central frequency, shown in figure 1. the output spectra and the adjacent channel power ratios (acprs) before and after the linearization, for the 64qam modulation format and different signal power levels, were measured with an exa signal analyzer n9010a.

5. results of linearization

the asymmetrical two-way doherty amplifier was tested for a 64qam signal with 2 mhz useful channel bandwidth. 
the central frequency of operation is 900 mhz. the linearization effects were measured on the fabricated da for input signal power levels from 1 dbm to 5 dbm. the results shown in figures 5 to 7 compare the acprs obtained without linearization and with two digital linearization methods: 1) the first, standard method, which injects the linearization signals at the gate and drain of the transistor in the main cell of the da, and 2) the second, modified method, in which the linearization signals are applied at the drains of the main and auxiliary amplifier transistors in the da.

fig. 5 output spectrum for 64qam signal of 2 mhz useful signal frequency bandwidth for input signal power 1 dbm: a) before linearization; b) after linearization by 1st method; c) after linearization by 2nd method

the acprs are evaluated in the lower and upper adjacent channels (at ±2 mhz offset from the carrier, where im3 products are dominant) and in the alternate channels (at ±3 mhz offset from the carrier, where im5 products are dominant). for 1 dbm input power, the acpr in the adjacent channels is improved by 4 db for both linearization methods, whereas for 3 dbm input power it is improved by 6 db for the 1st method and 8 db for the 2nd method. when the power is increased to 5 dbm, the acprs decrease by 3 db and 5 db for the 1st and 2nd methods, respectively. no evident improvement in the alternate channels can be noticed for 1 dbm and 3 dbm input power levels, but it amounts to 4 db at 5 dbm. comparing the measured results with the simulated results presented in [14], we can infer that the 2nd linearization method achieves slightly better acpr improvement in the adjacent channels, especially for higher power, as was also deduced in [14] from the analysis of simulated results.
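as an illustration of how an acpr improvement in db is obtained from a spectrum, the following minimal python sketch sums the power in the main and adjacent channels and compares the ratio before and after linearization. the spectra are synthetic and the function names (`band_power`, `acpr_db`, `spectrum`) are hypothetical; this is not the processing performed by the signal analyzer.

```python
import math

def band_power(freqs_hz, psd_lin, f_lo, f_hi):
    """Sum linear PSD bins whose frequency lies in [f_lo, f_hi)."""
    return sum(p for f, p in zip(freqs_hz, psd_lin) if f_lo <= f < f_hi)

def acpr_db(freqs_hz, psd_lin, offset_hz, bw_hz):
    """ACPR in dB: adjacent-channel power relative to main-channel power."""
    main = band_power(freqs_hz, psd_lin, -bw_hz / 2, bw_hz / 2)
    adj = band_power(freqs_hz, psd_lin,
                     offset_hz - bw_hz / 2, offset_hz + bw_hz / 2)
    return 10 * math.log10(adj / main)

# Synthetic baseband spectrum on 100 kHz bins (assumed numbers, not measured data).
freqs = [k * 1e5 for k in range(-40, 40)]

def spectrum(adj_level):
    """Flat 2 MHz main channel at level 1, everything else at adj_level."""
    return [1.0 if -1e6 <= f < 1e6 else adj_level for f in freqs]

before = acpr_db(freqs, spectrum(1e-3), 2e6, 2e6)
after = acpr_db(freqs, spectrum(1e-3 / 10 ** 0.4), 2e6, 2e6)
improvement_db = before - after   # positive value means lower adjacent-channel power
```

in this synthetic case the adjacent-channel level is lowered by a factor of 10^0.4, so the improvement is 4 db at the +2 mhz offset over a 2 mhz channel.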
even though the simulated results for the two-tone test show a more apparent improvement when the 2nd method is used, it should be noted that, for the ofdm signal test in simulation, less divergence between the results of the two linearization methods can be observed at higher power, closer to the amplifier saturation region.

fig. 6 output spectrum for 64qam signal of 2 mhz useful signal frequency bandwidth for input signal power 3 dbm: a) before linearization; b) after linearization by 1st method; c) after linearization by 2nd method

the symmetrical two-way doherty amplifier was simulated for a 20 mhz lte signal at 3.5 ghz central frequency of operation. both linearization methods were considered for different power levels up to the 1-db compression point. simulation results obtained without and with the linearization methods are shown in figures 8 to 10. it can be observed from the figures that the acpr at ±20 mhz offsets from the carrier frequency over 2 mhz bandwidth is improved by about 6 db at 32 dbm output power (near the 1-db compression point) for the 1st method. for lower power levels, the acpr improvement is considerably larger: about 11 db at 27 dbm output power and nearly 12 db at 22 dbm output power for the 1st method. for all power levels, the 2nd method improves the acpr by 1 db to 2 db more than the 1st method. also, a slight asymmetry in the acpr reduction can be observed between the lower (-20 mhz offset) and upper (+20 mhz offset) adjacent channels for both methods.

fig. 7 output spectrum for 64qam signal of 2 mhz useful signal frequency bandwidth for input signal power 5 dbm: a) before linearization; b) after linearization by 1st method; c) after linearization by 2nd method

6.
conclusion

experimental results of the linearization of an asymmetrical doherty amplifier fabricated in microstrip technology, obtained by applying two digital linearization methods, are presented in this paper, as well as simulation results for a symmetrical doherty amplifier designed to operate in the 5g band below 6 ghz. the linearization methods utilize adequately processed baseband digital signals that modulate the second harmonic of the fundamental carrier. in the 1st linearization method, the signals formed for the linearization are injected at the input and output of the main transistor in the doherty amplifier, while in the 2nd method these signals are led to the outputs of the main and auxiliary amplifier transistors in the da circuit. the ni usrp platforms programmed by labview software were used to generate the useful 64qam signals for the da test and for the measurements of acprs in adjacent and alternate channels at various input power levels. additionally, these platforms form the signals for linearization and process them in amplitude and phase. measurements performed by the signal analyzer illustrate the results of the two applied linearization methods and compare them to the states before the linearization.

fig. 8 output spectrum for lte signal at 22 dbm output power: a) before linearization; b) after linearization by 1st method; c) after linearization by 2nd method

on the basis of the achieved experimental results, it can be noticed that the 2nd method provides slightly better results for higher power than the 1st method in the adjacent channels, where the 3rd order im products are dominant.
the same conclusion can be derived for the alternate channels (the band of dominant 5th order im products), but these improvements of only 1 db or 2 db are insignificant, except for the 5 dbm input power, where higher improvements of acprs were attained with both linearization methods. based on the linearization results obtained in simulation for the symmetrical da with the 20 mhz lte signal at 3.5 ghz, it can be assumed that the proposed linearization method can be successfully used for 5g band signals with a bandwidth of 20 mhz. the test for wider 5g modulation formats is a subject of further analysis.

fig. 9 output spectrum for lte signal at 27 dbm output power: a) before linearization; b) after linearization by 1st method; c) after linearization by 2nd method

fig. 10 output spectrum for lte signal at 32 dbm output power: a) before linearization; b) after linearization by 1st method; c) after linearization by 2nd method

acknowledgement: this work was supported by the ministry of education, science and technological development of republic of serbia (grant no. 451-03-68/2022-14/200102) and science fund of the republic of serbia (grant no. 6398983 serbian science and diaspora collaboration program: vouchers for knowledge exchange – project name: digital even-order linearization of 5g power amplifiers in bands below 6ghz delfin).

references

[1] a. atanasković, n. m. ilić, a. djorić and d. budimir, "doherty amplifier linearization in experiments by digital injection methods", in proceedings of 15th international conference on advanced technologies, systems and services in telecommunications telsiks 2021, niš, serbia, october 20-22, 2021, pp. 82-85.
[2] a. borel, v. barzdėnas and a.
vasjanov, "linearization as a solution for power amplifier imperfections: a review of methods", electronics, vol. 10, no. 9, may 2021. [3] s. kang, e. t. sung and s. hong, "dynamic feedback linearizer of rf cmos power amplifier", ieee microw. wirel. compon. lett., vol. 28, no. 10, pp. 915-918, oct. 2018. [4] j. li, r. shu, q. j. gu, "a fully-integrated cartesian feedback loop transmitter in 65nm cmos", in proceedings of the ieee mtt-s international microwave symposium digest, honololu, hi, usa, june 2017, pp. 103-106. [5] h. choi, y. jeong, c. d. kim and j. s. kenney, "efficiency enhancement of feedforward amplifiers by employing a negative group-delay circuit", ieee trans. microw. theory techn., vol. 58, no. 5, pp. 1116-1125, may 2010. [6] r. n. braithwaite, "a comparison for a doherty power amplifier linearized using digital predistortion and feedforward compensation", in proceedings of the 2015 ieee mtt-s international microwave symposium, ims 2015, phoenix, az, usa, 17–22 may 2015, pp. 1-4. [7] s. jung, o. hammi and f. m. ghannouchi, "design optimization and dpd linearization of gan-based unsymmetrical doherty power amplifiers for 3g multicarrier applications", ieee trans. microw. theory techn., vol. 57, no. 9, pp. 2105-2013, sept. 2009. [8] p. l. gilabert, d. vegas, z. ren, g. montoro, j. r. perez-cisneros, m. n. ruiz, x. si and j. a. garcia, "design and digital predistortion linearization of a wideband outphasing amplifier supporting 200 mhz bandwidth", in proceedings of the ieee topical conference on rf/microwave power amplifiers for radio and wireless applications, pawr 2020, san antonio, tx, usa, 26–29 january 2020, pp. 46-49. [9] s. n. ali, p. agarwal, s. gopal and d. heo, "transformer-based predistortion linearizer for high linearity and high modulation efficiency in mm-wave 5g cmos power amplifiers", ieee trans. microw. theory techn., vol. 67, no. 7, pp. 3074–3087, may 2019. [10] a. đorić, a. atanasković, n. maleš-ilić and m. 
živanović: "linearization of rf pa by even-order nonlinear baseband signal processed in digital domain", int. j. electron., vol. 106, no.12, pp. 1904-1918, dec. 2019. [11] d. bondar, n. d. lopez, z. popovic and d. budimir, "linearization of high-efficiency power amplifiers using digital baseband predistortion with iterative injection", in proceedings of the ieee radio and wireless symposium, new orleans, la, usa, 10–14 january 2010, pp. 148-151. [12] a. atanasković, n. males-ilić, k. blau, a. đorić and b. milovanović, "rf pa linearization using modified baseband signal that modulates carrier second harmonic", microw. rev., vol. 19, no. 2, pp. 119-124, dec. 2013. [13] a. đorić, n. maleš-ilić, a. atanasković and v. marković, "linearization of broadband doherty amplifier by baseband signal that modulates second harmonic" in proceedings of the ieee eurocon 2017, ohrid, macedonia, 6-8 july, 2017, pp. 206-211. [14] a. đorić, a. atanasković, b. alorda and n. maleš-ilić: "linearization of doherty amplifier by injection of digitally processed baseband signals at the output of the main and auxiliary cell", in proceedings of the 14th international conference on advanced technologies, systems and services in telecommunications telsiks 2019, niš, serbia, october 23-25, 2019, pp. 339-342. [15] n. maleš-ilić, a. atanasković, k. blau and m. hein, "linearization of asymmetrical doherty amplifier by the even-order nonlinear signals", int. j. electron., vol. 103, no. 8, pp. 1318-1331, aug. 2016. [16] z. zhang, z. cheng and g. liu, "a power amplifier with large high-efficiency range for 5g communication", sensors, vol. 20, no. 19, oct. 2020. [17] a. atanasković, n. m. ilić, a. đorić and d. budimir, "experimental verification of the impact of the 2nd order injected signals on doherty amplifiers nonlinear distortion", in proceedings of 29th telecommunications forum – telfor 2021, belgrade, serbia, november 23-24, 2021, pp. 1-4. 
facta universitatis series: electronics and energetics vol. 30, no. 2, june 2017, pp. 245-256 doi: 10.2298/fuee1702245l

novel approach to modelling of lightning current derivative

karl lundengård 1, milica rančić 1, vesna javor 2, sergei silvestrov 1
1 mälardalen university, ukk, division of applied mathematics, västerås, sweden
2 university of niš, faculty of electronic eng., dept. of power engineering, niš, serbia

abstract. a new approach to mathematical modelling of lightning current derivative is proposed in this paper. it builds on the methodology, previously developed by the authors, for representing lightning current and electrostatic discharge (esd) current waveshapes. it uses a multi-peaked form of the analytically extended function (aef) for approximation of current derivative waveshapes. the aef parameters are estimated using the marquardt least-squares method (mlsm), and the framework for fitting the multi-peaked aef to a waveshape with an arbitrary number of peaks is briefly described. this procedure is validated by performing a few numerical experiments, including fitting the aef to single- and multi-peaked waveshapes corresponding to measured current derivatives.

key words: analytically extended function, lightning current derivative, lightning current function, lightning stroke, marquardt least-squares method

1. introduction

besides different parameters of the lightning electromagnetic field and lightning discharge currents, which greatly endanger the functionality of power systems, electrical equipment and electronic devices, the lightning current derivative signal is often measured at tall instrumented towers, towers at elevated terrain and at rocket-triggered stations [1]-[8]. approximation of current derivatives is important for calculation of lightning induced overvoltages and for further improvements of lightning discharge models [1], [5], [9].
generalizing the function for representing lightning currents from [10]-[12], the proposed multi-peaked analytically extended function (aef) has been applied by the authors to modelling of different lightning currents, including those defined in the iec standard 62305-1 [13], slow and fast-decaying ones, as well as measured ones, see e.g. [14]-[16]. furthermore, it has recently been used in [9] and [17] for representation of the electrostatic discharge (esd) current corresponding to the iec standard 61000-4-2 waveshape as given in [18]-[19]. the aef's parameters were fitted to the desired current waveshapes using the marquardt least-squares method (mlsm) [20].

received august 29, 2016; received in revised form november 26, 2016
corresponding author: vesna javor, university of niš, faculty of electronic engineering, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: vesna.javor@elfak.ni.ac.rs)

in this paper we explore the possibility of reproducing the waveshape of the lightning current first derivative using the aef and adjusting its non-linear parameters by means of the mlsm. the validity of the approximation and of this methodology is tested by performing a few numerical experiments related to modelling of lightning current derivative signals measured at the cn tower [5]. since installation, simultaneous measurements of currents and current derivatives by rogowski coils, together with the corresponding electromagnetic field values detected by sensors and recordings by high-speed cameras a few km from the tower, have been providing useful data for analysis [1]-[5]. reflection coefficients are estimated for the cn tower and employed for magnetic field calculation in [5]. reflections occur from the tip of this tower, from the top and bottom of its restaurant and from the ground, as well as at the front of the upward-propagating lightning return-stroke channel, and produce peaks in the current derivative waveshape.
in this paper, lightning current derivative approximation is done taking into account the initial peak and subsequent peaks in the derivative waveshape, regardless of their cause. the same procedure may be used in the case when measured current derivatives have multi-peaked waveshapes for other reasons, e.g. due to various configurations of the terrain and some tall structures, or due to lightning current channel discontinuities and branching.

2. modelling of the lightning current derivative

2.1. analytically extended function (aef) and some of its properties

the basic building block of the multi-peaked aef is, as referred to in [18], the power exponential function (pef) given by

$$x(\beta; t) = \left(t e^{1-t}\right)^{\beta}, \quad 0 \le t, \tag{1}$$

where the β-parameter determines the steepness of both its rising and its decaying part. the aef is constructed as a function consisting of piecewise linear combinations of pefs that have been scaled and translated to ensure that the resulting function is continuous. in [18], it is defined as

$$\frac{\mathrm{d}i(t)}{\mathrm{d}t} = \begin{cases} \left(\sum\limits_{k=1}^{q-1} I_{dm_k}\right) + I_{dm_q} \sum\limits_{k=1}^{n_q} \eta_{q,k}\, x_{q,k}(t), & t_{m_{q-1}} \le t \le t_{m_q},\ 1 \le q \le p, \\ \left(\sum\limits_{k=1}^{p} I_{dm_k}\right) \sum\limits_{k=1}^{n_{p+1}} \eta_{p+1,k}\, x_{p+1,k}(t), & t_{m_p} \le t,\ q = p+1, \end{cases} \tag{2}$$

where:
- $I_{dm_1}, I_{dm_2}, \ldots, I_{dm_p}$ is the difference in height between each pair of peaks,
- $t_{m_1}, t_{m_2}, \ldots, t_{m_p}$ are the times corresponding to these peaks,
- $n_q > 0$ is the number of terms in each time interval,
- $\eta_{q,k}$ are real values so that $\sum_{k=1}^{n_q} \eta_{q,k} = 1$, and
- $x_{q,k}(t)$ are pefs defined by $\beta_{q,k}$ parameters in the following way:

$$x_{q,k}(t) = \begin{cases} \left[\dfrac{t - t_{m_{q-1}}}{\Delta t_{m_q}} \exp\left(1 - \dfrac{t - t_{m_{q-1}}}{\Delta t_{m_q}}\right)\right]^{\beta_{q,k}^2 + 1}, & 1 \le q \le p, \\ \left[\dfrac{t}{t_{m_p}} \exp\left(1 - \dfrac{t}{t_{m_p}}\right)\right]^{\beta_{q,k}^2}, & q = p+1, \end{cases} \tag{3}$$

where $\Delta t_{m_q} = t_{m_q} - t_{m_{q-1}}$.
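the piecewise definition of the aef can be made concrete with a short python sketch that evaluates the pef and a multi-peaked aef at a given time. it assumes $t_{m_0} = 0$ for the first interval; the function names and all parameter values below are hypothetical and chosen only to show continuity at the peaks, they are not fitted to any measured data.

```python
import math

def pef(beta, t):
    """Power exponential function x(beta; t) = (t * exp(1 - t))**beta for t >= 0."""
    return (t * math.exp(1.0 - t)) ** beta

def aef(t, t_m, i_dm, eta, beta):
    """Evaluate a multi-peaked AEF at time t.

    t_m   : peak times t_m1..t_mp (increasing; t_m0 = 0 is an assumption here)
    i_dm  : height differences I_dm1..I_dmp
    eta   : p + 1 lists of weights, each summing to 1
    beta  : p + 1 lists of beta parameters, same shapes as eta
    """
    p = len(t_m)
    for q in range(1, p + 1):                      # intervals 1..p
        t_prev = t_m[q - 2] if q > 1 else 0.0
        if t <= t_m[q - 1]:
            u = (t - t_prev) / (t_m[q - 1] - t_prev)
            s = sum(e * pef(b * b + 1.0, u)
                    for e, b in zip(eta[q - 1], beta[q - 1]))
            return sum(i_dm[:q - 1]) + i_dm[q - 1] * s
    u = t / t_m[-1]                                # decaying interval q = p + 1
    s = sum(e * pef(b * b, u) for e, b in zip(eta[p], beta[p]))
    return sum(i_dm) * s

# Hypothetical two-peak example (values are illustrative only).
T_M = [1.0, 3.0]
I_DM = [2.0, -0.5]
ETA = [[1.0], [0.4, 0.6], [1.0]]
BETA = [[1.0], [0.5, 2.0], [1.5]]
values = [aef(t, T_M, I_DM, ETA, BETA) for t in (0.0, 1.0, 3.0, 6.0)]
```

at the peak times the function takes the accumulated heights (here 2.0 and 1.5), which is the continuity property that the scaling and translation of the pefs are meant to guarantee.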
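expression (2) can be written more compactly as

$$\frac{\mathrm{d}i(t)}{\mathrm{d}t} = \begin{cases} \left(\sum\limits_{k=1}^{q-1} I_{dm_k}\right) + I_{dm_q}\, \boldsymbol{\eta}_q^{\mathrm{T}} \mathbf{x}_q(t), & t_{m_{q-1}} \le t \le t_{m_q},\ 1 \le q \le p, \\ \left(\sum\limits_{k=1}^{p} I_{dm_k}\right) \boldsymbol{\eta}_{p+1}^{\mathrm{T}} \mathbf{x}_{p+1}(t), & t_{m_p} \le t,\ q = p+1, \end{cases} \tag{4}$$

after introducing $\boldsymbol{\eta}_q = [\eta_{q,1}\ \eta_{q,2}\ \ldots\ \eta_{q,n_q}]^{\mathrm{T}}$ and $\mathbf{x}_q(t) = [x_{q,1}(t)\ x_{q,2}(t)\ \ldots\ x_{q,n_q}(t)]^{\mathrm{T}}$.

the first derivative of the multi-peaked aef corresponds to the second derivative of the lightning discharge current, i(t), and can be easily found since the aef consists of elementary functions. the compact form is given by

$$\frac{\mathrm{d}^2 i(t)}{\mathrm{d}t^2} = \begin{cases} I_{dm_q}\, \dfrac{t_{m_q} - t}{(t - t_{m_{q-1}})\, \Delta t_{m_q}}\, \boldsymbol{\eta}_q^{\mathrm{T}} \mathbf{B}_q \mathbf{x}_q(t), & t_{m_{q-1}} \le t \le t_{m_q},\ 1 \le q \le p, \\ \left(\sum\limits_{k=1}^{p} I_{dm_k}\right) \dfrac{t_{m_p} - t}{t\, t_{m_p}}\, \boldsymbol{\eta}_{p+1}^{\mathrm{T}} \mathbf{B}_{p+1} \mathbf{x}_{p+1}(t), & t_{m_p} \le t,\ q = p+1, \end{cases} \tag{5}$$

where $\mathbf{B}_q$ are diagonal matrices:

$$\mathbf{B}_q = \begin{cases} \operatorname{diag}\left(\beta_{q,1}^2 + 1,\ \beta_{q,2}^2 + 1,\ \ldots,\ \beta_{q,n_q}^2 + 1\right), & 1 \le q \le p, \\ \operatorname{diag}\left(\beta_{q,1}^2,\ \beta_{q,2}^2,\ \ldots,\ \beta_{q,n_q}^2\right), & q = p+1. \end{cases}$$

based on this expression, it is easy to see that the current's second derivative is also continuous since it will be zero at each $t_{m_q}$.

the integral of the aef corresponds to the lightning discharge current i(t) and is also relatively straightforward to find, since the integral of the pef can be written using the lower incomplete gamma function [21], i.e.

$$\int_{t_0}^{t_1} x(\beta; t)\, \mathrm{d}t = \frac{e^{\beta}}{\beta^{\beta+1}} \left(\gamma(\beta+1, \beta t_1) - \gamma(\beta+1, \beta t_0)\right) = \frac{e^{\beta}}{\beta^{\beta+1}}\, \Delta\gamma(\beta+1, t_0, t_1), \tag{6}$$

where $\gamma(\beta, t) = \int_0^t \tau^{\beta-1} e^{-\tau}\, \mathrm{d}\tau$ is the incomplete gamma function and $\Delta\gamma(\beta+1, t_0, t_1)$ abbreviates the difference in the brackets.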
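expression (6) can be checked numerically. the sketch below evaluates the lower incomplete gamma function by its power series (a simple stand-in chosen here to stay within the standard library, adequate only for moderate arguments) and compares the closed form with a composite simpson rule applied directly to the pef. the function names are hypothetical.

```python
import math

def lower_gamma(s, x, terms=200):
    """Lower incomplete gamma function via its power series:
    gamma(s, x) = x**s * exp(-x) * sum_k x**k / (s (s+1) ... (s+k))."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / s
    total = term
    for k in range(1, terms):
        term *= x / (s + k)
        total += term
    return x ** s * math.exp(-x) * total

def pef_integral(beta, t0, t1):
    """Closed-form integral of the PEF over [t0, t1], following (6)."""
    c = math.exp(beta) / beta ** (beta + 1.0)
    return c * (lower_gamma(beta + 1.0, beta * t1)
                - lower_gamma(beta + 1.0, beta * t0))

def pef_integral_numeric(beta, t0, t1, n=2000):
    """Composite Simpson rule on the PEF, used only as a cross-check (n even)."""
    f = lambda t: (t * math.exp(1.0 - t)) ** beta
    h = (t1 - t0) / n
    s = f(t0) + f(t1)
    s += 4.0 * sum(f(t0 + (2 * j - 1) * h) for j in range(1, n // 2 + 1))
    s += 2.0 * sum(f(t0 + 2 * j * h) for j in range(1, n // 2))
    return s * h / 3.0

closed = pef_integral(2.0, 0.2, 1.5)
numeric = pef_integral_numeric(2.0, 0.2, 1.5)
```

for β = 1 and the interval [0, 1] the closed form reduces to e − 2, which gives an independent check of the prefactor in (6).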
combining (6) and (2) we obtain the integral of the rising part of the aef

$$\begin{aligned} \int_{t_a}^{t_b} \frac{\mathrm{d}i(t)}{\mathrm{d}t}\, \mathrm{d}t ={} & (t_{m_a} - t_a) \sum_{k=1}^{a-1} I_{dm_k} + I_{dm_a}\, \Delta t_{m_a} \sum_{k=1}^{n_a} \eta_{a,k}\, g_a(t_a, t_{m_a}) \\ & + \sum_{q=a+1}^{b-1} \Delta t_{m_q} \left( \sum_{k=1}^{q-1} I_{dm_k} + I_{dm_q} \sum_{k=1}^{n_q} \eta_{q,k}\, \hat{g}\left(\beta_{q,k}^2 + 1\right) \right) \\ & + (t_b - t_{m_{b-1}}) \sum_{k=1}^{b-1} I_{dm_k} + I_{dm_b}\, \Delta t_{m_b} \sum_{k=1}^{n_b} \eta_{b,k}\, g_b(t_{m_{b-1}}, t_b), \end{aligned} \tag{7}$$

for $0 \le t_{m_{a-1}} \le t_a \le t_{m_a}$ and $t_{m_{b-1}} \le t_b \le t_{m_b}$, with

$$g_q(t_0, t_1) = \frac{e^{\beta_{q,k}^2 + 1}}{\left(\beta_{q,k}^2 + 1\right)^{\beta_{q,k}^2 + 2}}\, \Delta\gamma\left(\beta_{q,k}^2 + 2,\ \frac{t_0 - t_{m_{q-1}}}{\Delta t_{m_q}},\ \frac{t_1 - t_{m_{q-1}}}{\Delta t_{m_q}}\right)$$

and

$$\hat{g}(\beta) = \frac{e^{\beta}}{\beta^{\beta+1}}\, \gamma(\beta+1, \beta).$$

the integration formula corresponding to the decaying part is

$$\int_{t_0}^{t_1} \frac{\mathrm{d}i(t)}{\mathrm{d}t}\, \mathrm{d}t = t_{m_p} \left(\sum_{k=1}^{p} I_{dm_k}\right) \sum_{k=1}^{n_{p+1}} \eta_{p+1,k}\, g_{p+1}(t_0, t_1), \quad t_{m_p} \le t_0 \le t_1 < \infty,$$

where $g_{p+1}$ is defined analogously to $g_q$, using the exponent $\beta_{p+1,k}^2$ and the scaled arguments $t_0/t_{m_p}$ and $t_1/t_{m_p}$, i.e.

$$\int_{t_{m_p}}^{\infty} \frac{\mathrm{d}i(t)}{\mathrm{d}t}\, \mathrm{d}t = t_{m_p} \left(\sum_{k=1}^{p} I_{dm_k}\right) \sum_{k=1}^{n_{p+1}} \eta_{p+1,k}\, \tilde{g}\left(\beta_{p+1,k}^2\right), \tag{8}$$

with

$$\tilde{g}(\beta) = \frac{e^{\beta}}{\beta^{\beta+1}} \left(\Gamma(\beta+1) - \gamma(\beta+1, \beta)\right)$$

and $\Gamma(\beta) = \int_0^{\infty} t^{\beta-1} e^{-t}\, \mathrm{d}t$, the gamma function [21].

2.2. marquardt least-squares method (mlsm)

a detailed explanation of the mlsm algorithm is given in [15], [16]; here we only go over the parts specific to the multi-peaked aef. the mlsm is used for estimating the β-parameters, and from these the corresponding η-parameters are calculated. in each iteration step, the η-parameters are obtained using the regular least-squares method, since for fixed β-parameters the aef is linear in η. based on these η-parameters, a new set of β-parameters is found. the mlsm uses a jacobian matrix, denoted by j, containing partial derivatives of the residuals.
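the linear step for the η-parameters can be sketched in python for the simplest non-trivial case of two terms in one rising interval. the constraint η1 + η2 = 1 is eliminated by substitution, which leaves a one-dimensional least-squares problem. this is an illustrative simplification, not the authors' implementation; the function names and the synthetic data are hypothetical, and the samples are assumed to be already normalized (offset by the preceding peak heights and divided by the interval's height difference).

```python
import math

def pef(beta, u):
    """Power exponential function (u * exp(1 - u))**beta."""
    return (u * math.exp(1.0 - u)) ** beta

def fit_eta_two_terms(u, y, beta1, beta2):
    """Least-squares eta_1 (with eta_2 = 1 - eta_1) for y ~ eta_1*x_1 + eta_2*x_2.

    u : normalized times (t - t_prev) / dt within one rising interval
    y : normalized samples of the waveshape at those times
    """
    x1 = [pef(beta1 ** 2 + 1.0, ui) for ui in u]
    x2 = [pef(beta2 ** 2 + 1.0, ui) for ui in u]
    d = [a - b for a, b in zip(x1, x2)]      # regressor x_1 - x_2
    r = [yi - b for yi, b in zip(y, x2)]     # target   y - x_2
    eta1 = sum(di * ri for di, ri in zip(d, r)) / sum(di * di for di in d)
    return eta1, 1.0 - eta1

# Synthetic data generated with known weights (illustrative values only).
u_pts = [0.1 * k for k in range(1, 10)]
y_pts = [0.3 * pef(0.5 ** 2 + 1.0, ui) + 0.7 * pef(2.0 ** 2 + 1.0, ui)
         for ui in u_pts]
eta_hat = fit_eta_two_terms(u_pts, y_pts, 0.5, 2.0)
```

with noise-free synthetic data the weights used to generate the samples are recovered; with more than two terms the same idea leads to an ordinary constrained linear least-squares problem.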
the least-squares fitting of the multi-peaked aef to a set of data points can be done separately between each pair of peaks (and after the final one), and the corresponding j matrix is

$$\mathbf{J} = \begin{bmatrix} p_{q,1}(t_{q,1}) & p_{q,2}(t_{q,1}) & \cdots & p_{q,n_q}(t_{q,1}) \\ p_{q,1}(t_{q,2}) & p_{q,2}(t_{q,2}) & \cdots & p_{q,n_q}(t_{q,2}) \\ \vdots & \vdots & \ddots & \vdots \\ p_{q,1}(t_{q,k_q}) & p_{q,2}(t_{q,k_q}) & \cdots & p_{q,n_q}(t_{q,k_q}) \end{bmatrix}, \tag{9}$$

where $k_q$ is the number of data points between the (q-1)th and qth peak, $t_{q,r}$ is the time corresponding to these data points, and

$$p_{q,k}(t_{q,r}) = \left.\frac{\partial}{\partial \beta_{q,k}} \frac{\mathrm{d}i(t)}{\mathrm{d}t}\right|_{t = t_{q,r}} = \begin{cases} 2\, I_{dm_q}\, \eta_{q,k}\, \beta_{q,k}\, h_q(t_{q,r})\, x_{q,k}(t_{q,r}), & 1 \le q \le p, \\ 2 \left(\sum\limits_{j=1}^{p} I_{dm_j}\right) \eta_{q,k}\, \beta_{q,k}\, h_q(t_{q,r})\, x_{q,k}(t_{q,r}), & q = p+1, \end{cases}$$

with

$$h_q(t) = \begin{cases} \ln\left(\dfrac{t - t_{m_{q-1}}}{\Delta t_{m_q}}\right) - \dfrac{t - t_{m_{q-1}}}{\Delta t_{m_q}} + 1, & 1 \le q \le p, \\ \ln\left(\dfrac{t}{t_{m_p}}\right) - \dfrac{t}{t_{m_p}} + 1, & q = p+1. \end{cases}$$

3. aef representing measured lightning current derivative examples

in this section we validate our model by attempting to represent measured lightning current derivative data obtained at the 553 m cn tower, toronto, canada [5]. time and current values corresponding to aef peaks were chosen manually and the rest of the aef parameters were obtained using the framework briefly described in section 2.2. the number of time intervals and the number of terms in each of them vary from example to example. the general notation aefp(n1, …, np) for nq, q=1, …, p, is used to denote an aef with p peaks and the chosen number of terms nq in each time interval q.

3.1. single-peaked waveshape

the first example illustrates the application of a single-peaked aef to representation of the measured initial current derivative impulse occurring in the first 0.5 μs given in [5, fig. 4]. the best fit was obtained by choosing two terms in each of the two time intervals: 0-tm and tm-0.5 μs (the moment tm corresponds to the maximum current derivative).
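the partial derivatives that fill the jacobian matrix of section 2.2 follow from differentiating a rising-part pef with exponent β² + 1 with respect to β, which gives the factor 2β(ln u − u + 1). the python sketch below compares this analytic expression with a central finite difference for one term of a rising interval. the function names and numerical values are hypothetical and serve only as a consistency check of the formula.

```python
import math

def x_rise(beta, u):
    """Rising-part PEF term with exponent beta**2 + 1,
    where u = (t - t_prev) / dt is the normalized time in the interval."""
    return (u * math.exp(1.0 - u)) ** (beta ** 2 + 1.0)

def p_rise(i_dm, eta, beta, u):
    """Analytic derivative of i_dm * eta * x_rise(beta, u) with respect to beta."""
    h = math.log(u) - u + 1.0
    return 2.0 * i_dm * eta * beta * h * x_rise(beta, u)

def p_numeric(i_dm, eta, beta, u, eps=1e-6):
    """Central finite difference of the same quantity, for comparison only."""
    return i_dm * eta * (x_rise(beta + eps, u) - x_rise(beta - eps, u)) / (2.0 * eps)

analytic = p_rise(3.0, 0.4, 1.2, 0.35)
finite_diff = p_numeric(3.0, 0.4, 1.2, 0.35)
```

since ln u − u + 1 is never positive for u > 0, increasing β always lowers the term away from the peak, which is how β controls the steepness of the waveshape.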
the current derivative value at t=0 is treated as the first point of approximation, so there are 4 terms in total for these 2 intervals. the obtained aef2(2,2) model is illustrated in fig. 1a along with the measured data, the data points used for the mlsm fitting, and the locations of peaks observed in this waveshape. using the expressions (7) and (8) we also obtained the aef's integral, i.e. the lightning discharge current. it can be observed in fig. 1b along with the numerically integrated measured data.

fig. 1 a) aef2(2,2) representing measured lightning current derivative from [5, fig. 4], and b) the corresponding lightning current

3.2. multi-peaked waveshapes

in this part we attempt modelling of measured current derivative data that include the initial and a number of subsequent impulses. the recorded waveshapes have a great number of peaks and are therefore harder to model using standard functions, but they are well suited for modelling by the multi-peaked aef. the first example corresponds to a lightning discharge event measured at the cn tower using the rogowski coil positioned at 474 m, illustrated in [5, fig. 2] over 10 μs. such a current derivative waveshape corresponds to a typical fast-rising negative lightning discharge, which occurs in about 80% of the registered cases (in 126 flashes out of 160 [5]). the complexity of the aef used for modelling of such multi-peaked waveshapes depends on the desired level of accuracy of the data representation. fig. 2 presents two aefs with different numbers of peaks, including the starting current derivative value at t = 0 and other peaks chosen such that they correspond to local maxima only: a) aef6(1,2,2,2,2,2) with 6 intervals and 11 terms in total, and b) aef8(1,2,2,2,2,2,2,2) with 8 intervals and 15 terms in total.
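the totals quoted for the models follow directly from the aefp(n1, …, np) notation: the number of intervals is the number of entries in the brackets and the total number of terms is their sum. a small python helper (hypothetical, written only to make this arithmetic explicit) can check the stated figures.

```python
import re

def parse_aef(label):
    """Split a label such as 'aef6(1,2,2,2,2,2)' into (intervals, total terms).

    An optional trailing letter after the index (as in 'aef13b') is accepted.
    """
    m = re.fullmatch(r"aef(\d+)[a-z]?\(([\d,\s]+)\)", label.strip())
    if m is None:
        raise ValueError("not an aef label: " + label)
    terms = [int(n) for n in m.group(2).split(",")]
    if len(terms) != int(m.group(1)):
        raise ValueError("label index and number of intervals disagree")
    return len(terms), sum(terms)

summary = {name: parse_aef(name)
           for name in ("aef2(2,2)",
                        "aef6(1,2,2,2,2,2)",
                        "aef8(1,2,2,2,2,2,2,2)")}
```

for the three labels above this gives 4, 11 and 15 terms, in agreement with the totals stated in the text.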
the increased number of time intervals improves the representation of the waveshape in the period between the fourth and fifth peak of aef6, and also after its sixth peak, so that the total number of intervals in aef8 is increased by 2, and the number of terms by 4.

fig. 2 multi-peaked aefs (using starting point and maxima only) representing measured lightning current derivative from [5, fig. 2]: a) aef6(1,2,2,2,2,2) with 6 peaks, b) aef8(1,2,2,2,2,2,2,2) with 8 peaks

additional improvement is needed and could be achieved by further segmentation that also includes local minima, as well as by increasing the number of terms. two such aef models are illustrated in figs. 3a and 3b, both with thirteen peaks, but with different numbers of terms chosen to represent some of the intervals. the thirteen peaks in aef13 include 4 minima added to aef8 and also one more maximum at its ending part, so that the number of peaks is increased from 8 (in fig. 2b) to 13 (in figs. 3a and 3b). these two aefs are denoted by a) aef13a(1,1,1,1,1,2,1,1,1,1,2,2,1) with 13 intervals and 16 terms in total, and b) aef13b(1,1,2,1,2,2,1,2,1,2,2,2,2) with 13 intervals and 21 terms in total; they differ in the intervals where the number of terms is increased from 1 to 2.

fig. 3 multi-peaked aefs with 13 peaks (using starting point, 8 maxima and 4 minima) representing measured lightning current derivative from [5, fig. 2]: a) aef13a(1,1,1,1,1,2,1,1,1,1,2,2,1), b) aef13b(1,1,2,1,2,2,1,2,1,2,2,2,2)

results for the same lightning current derivative measured at the cn tower are given for the first 7 μs in figs. 4a and 4b, for fitting by aefs corresponding to data from [5, fig. 6]. model aef7(1,2,2,2,2,2,2) with 7 peaks (starting point and maxima only) and 13 terms is presented in fig.
4a, and it is able to capture the initial impulse and subsequent peaks due to reflections at the tower discontinuities. aef7 has one more peak added at the end of aef6, and 2 more terms in total. in fig. 4b, the aef13c(1,2,2,2,2,2,2,2,2,1,1,2,2) model is presented with a total of 13 peaks (the starting point, 8 maxima and 4 minima), which almost perfectly models the measured set of data. it has 23 terms in total, with 4 terms added and 2 excluded compared to aef13b. the difference between those two is that 1 peak was added for aef13c between the tenth and eleventh peak of aef13b, which significantly improved the approximation, and the thirteenth peak was excluded from the end of aef13b.

fig. 4 multi-peaked aef representing measured current derivative from [5, fig. 6]: a) aef7(1,2,2,2,2,2,2) with 7 peaks (using starting point and maxima only), b) aef13c(1,2,2,2,2,2,2,2,2,1,1,2,2) with 13 peaks (starting point, 8 maxima & 4 minima)

figure 5 illustrates the lightning discharge currents corresponding to the multi-peaked current derivative waveshapes modelled above. again, expressions (7) and (8) are employed to calculate them, and the numerically integrated measured data is also given for comparison. fig. 5a corresponds to the aef13b(1,1,2,1,2,2,1,2,1,2,2,2,2) model shown in fig. 3b, while fig. 5b relates to the model aef13c(1,2,2,2,2,2,2,2,2,1,1,2,2) from fig. 4b.

fig. 5 lightning currents corresponding to derivatives modelled by aefs: a) aef13b(1,1,2,1,2,2,1,2,1,2,2,2,2) from fig. 3b, b) aef13c(1,2,2,2,2,2,2,2,2,1,1,2,2) from fig. 4b

4. conclusions

approximation of lightning current derivatives is needed for calculation of lightning induced effects and for improvements of lightning discharge models. the suitability of the multi-peaked aef for representing lightning current derivatives is presented in this paper through a few examples.
the aef's non-linear parameters are calculated using the marquardt least-squares method (mlsm), so that the measured current derivative signals [5] are well approximated. the approximation by aefs in this paper is done for single- and multi-peaked current derivative waveshapes. increasing the number of maxima and minima, as well as the total number of terms, improves the approximation of the current derivative by the aef. the lightning current waveshape is obtained with great accuracy as the analytically integrated aef representation of the measured derivative. multi-peaked lightning current derivatives are characteristic of lightning discharges to tall towers and high structures at elevated terrain, but also of subsequent lightning strokes and lightning current channels with discontinuities and branching. further work should be aimed at including such a current and its derivative function into lightning stroke models in order to reproduce the measured lightning electromagnetic field at certain distances.

references

[1] k. elrodesly and a. m. hussein, "cn tower lightning return-stroke current simulation", journal of lightning research, vol. 4, suppl 2: m3, pp. 60-70, 2012.
[2] a. m. hussein, m. milewski, and w. janischewskyj, "correlating the characteristics of the cn tower, lightning return-stroke current with those of its generated electromagnetic pulse", ieee transactions on electromagnetic compatibility, vol. 50, no. 3, pp. 642-650, aug. 2008.
[3] a. m. hussein, m. milewski, e. burnazovic and w. janischewskyj, "current waveform characteristics of cn tower negative and positive lightning", in proceedings of the x international symposium on lightning protection, curitiba, brazil, 2009, pp. 451-456.
[4] b. kordi, r. moini, w. janischewskyj, a. m. hussein, v. o. shostak and v. a. rakov, "application of the antenna theory model to a tall tower struck by lightning", journal of geophysical research, vol. 108, no. d17, 4542, doi: 10.1029/2003jd003398, 2003.
[5] m. milewski and a. m.
hussein, "tall-structure lightning return-stroke modelling", in proceedings of the 14th international middle east power systems conference (mepcon’10), cairo university, egypt, 2010, paper id 313, pp. 947-952. [6] f. rachidi, w. janischewskyj, a. m. hussein, c. a. nucci, s. guerrieri, b. kordi and j-s. chang, "current and electromagnetic field associated with lightning–return strokes to tall towers", ieee transactions on electromagnetic compatibility, vol. 43, no. 3, pp. 356-367, aug. 2001. [7] v. a. rakov, “transient response of a tall object to lightning”, ieee transactions on electromagnetic compatibility, vol. 43, no. 4, pp. 654-661, 2001. [8] m. a. uman, j. schoene, v. a. rakov, k. j. rambo and g. h. schnetzer, “correlated time derivatives of current, electric field intensity, and magnetic flux density for triggered lightning at 15 m”, journal of geophysical research, vol. 107, no. d13, doi: 10.1029/2000jd000249, 2002. [9] v. javor, "an analytically extended function for representing the lightning current first derivative", in proceedings of the int. colloquium on lightning and power systems, bologna, italy, 2016, p13_s3.2, pp. 1-8. [10] v. javor and p. d. rancic, “a channel-base current function for lightning return-stroke modeling”, ieee transactions on electromagnetic compatibility, vol. 53, no. 1, pp. 245-249, feb. 2011. [11] v. javor, "multi-peaked functions for representation of lightning channel-base currents", in proceedings of 2012 international conference on lightning protection iclp, vienna, austria, 2012, pp. 1–4. 256 k. lundengård, m. ranĉić, v. javor, s. silvestrov [12] v. javor, "new function for representing iec 61000-4-2 standard electrostatic discharge current", facta universitatis, series: electronics and energetics, vol. 27(4), pp. 509-520, 2014. [13] iec 62305-1, protection against lightning part i: general principles ed. 2.0, 2010-12. [14] k. lundengård, m. ranĉić, v. javor and s. 
silvestrov, "application of the multi-peaked analytically extended function to representation of some measured lightning currents", serbian journal of electrical engineering, vol. 13(2), pp. 1-11, 2016.
[15] k. lundengård, m. rančić, v. javor and s. silvestrov, "estimation of parameters for the multi-peaked aef current functions", methodology and computing in applied probability, springer, pp. 1-15, 2016. doi: 10.1007/s11009-016-9501-z
[16] k. lundengård, m. rančić, v. javor and s. silvestrov, "on some properties of the multi-peaked analytically extended function for approximation of lightning discharge currents", engineering mathematics i: electromagnetics, fluid mechanics, material physics and financial engineering, series: springer proceedings in mathematics & statistics, vol. 178, eds. s. silvestrov and m. rančić, springer, heidelberg, 2016, pp. 151-172, ebook isbn 978-3-319-42082-0; hardcover isbn 978-3-319-42081-3; doi 10.1007/978-3-319-42082-0
[17] k. lundengård, m. rančić, v. javor and s. silvestrov, "multi-peaked analytically extended function representing electrostatic discharge (esd) currents", in aip conference proceedings of icnpaa 2016, la rochelle, france, 2016, pp. 1-10.
[18] emc part 4-2: testing and measurement techniques electrostatic discharge immunity test. iec international standard 61000-4-2, basic emc publication, 1995+a1:1998+a2:2000.
[19] emc part 4-2: testing and measurement techniques electrostatic discharge immunity test. iec international standard 61000-4-2, basic emc publication, ed. 2, 2009.
[20] d. w. marquardt, "an algorithm for least-squares estimation of nonlinear parameters", journal of the society for industrial and applied mathematics, vol. 11(2), pp. 431-441, 1963.
[21] m. abramowitz and i. a. stegun, handbook of mathematical functions with formulas, graphs, and mathematical tables. 1964, dover, new york.

facta universitatis series: electronics and energetics vol.
32, no. 1, march 2019, pp. 1-23 https://doi.org/10.2298/fuee1901001p

application of python programming language in measurements

predrag pejović
university of belgrade, school of electrical engineering, belgrade, serbia

abstract. application of the python programming language to the automation of measurement systems and the creation of virtual instruments is discussed in this paper. requirements imposed on the software in order to perform these tasks are listed, and python modules that support them are presented. application of the proposed techniques is illustrated by seven examples in different application areas. analysis of software evolution, as well as of the evolution of professional education, leads to the conclusion that application of python in automating measurement systems is promising.

key words: computerized instrumentation, electric variables measurement, impedance measurement, measurements, measurement techniques, software measurement.

1. introduction

over the years of its existence, software has evolved. in the early days of computers, the program was not stored in the machine; instead, the functionality was hard wired for each application. a great step forward occurred with stored programs, at first written in a machine language. such languages are considered the first generation of programming languages. the second generation of programming languages involves assembly languages, somewhat more readable than the machine languages, but still heavily dependent on a particular instruction set architecture. finally, the third generation of programming languages, a prominent example of which was fortran, which appeared among the first in this generation and gained huge popularity, provided an abstraction that separated the programmer from the machine instruction set architecture, enabling code portability. with portable code, software libraries appeared, accumulating knowledge and programming experience, and programming became a social activity.
high level libraries, like libraries for numeric computation, are nowadays very rich and complete, and it is most likely that an everyday problem a programmer faces is already solved and included in a library. in this manner, programming became a social activity: a programmer relies on program development tools, such as compilers and integrated development environments, developed by other programmers, as well as on software libraries, if he or she wants to program efficiently. this focused programming on solving specific tasks, while general, frequently encountered problems already have readily available library solutions. this led to "gluing" languages, designed to provide efficient inclusion of library solutions, and to glue them together to solve a specific problem. evolution of software libraries, growing in size and capabilities on a daily basis, further supported this concept.

received november 12, 2018
corresponding author: predrag pejović, university of belgrade, school of electrical engineering, 73 kralj aleksandar blvd, 11000 belgrade, serbia (e-mail: peja@etf.rs)

a prominent example of a programming language that supports the "gluing" concept is python [1]. it is designed to be readable, with simple and clear syntax, while being highly extensible by inclusion of software libraries, named modules, which can be used comfortably thanks to a convenient namespace system. python modules can be written in python, but also in c or c++. furthermore, it is possible to link fortran libraries to python modules. in this manner, a vast software heritage can be efficiently used in python applications. a huge list of useful modules is included in the python standard library [2]. the modules used in the applications this paper focuses on are [3–6]. however, the real power of python is in the fact that it enables easy and straightforward inclusion of user contributed modules, outside the python standard library.
application of python in measurements relies on these modules and their flexibility to adjust to current trends in the development of electronic measurement equipment. a review of such external modules [7–17], needed for automating electrical measurements and for creating virtual instruments, is presented in this paper. furthermore, the author of this paper contributed some modules [18–20].

along with the evolution of computers and computer languages, the people who use computers evolved, too. in serbia, the last generations that did not learn programming in high school are retiring nowadays, and the first generations that learned machine languages, assembly languages, fortran, cobol, and basic in high school (the "programmer" specialization in serbian high school curricula lasted from 1977 to 1989) are about 10 years from retirement. python is about to be taught in the sixth grade of elementary schools, and many schools and universities worldwide use python as the first programming language. very soon we might expect every professional in any of the technical or science disciplines to be proficient in programming, most likely in the python programming language, which is rapidly becoming a standard language for high level programming.

many social obstacles are present in automating measurement processes and creating virtual instruments at the time this article is being written. the driving force of this impediment is particular human interests, a process common to automation of any kind. the effects of such a temporary impediment are expected to vanish, as they vanished in every other automation process. for example, nowadays simple electrical measurements are performed by a digital multimeter, which contains a microcontroller to process the data.
another example involves building construction, where distance measurements were dominated by the measuring tape just a few years ago, while nowadays almost everyone uses a digital laser distance meter. on the other end of the process, the measured data are processed by a computer. who connects the two? in some cases, still a human, collecting the data, writing it down in a notebook, and typing it back into a computer. such jobs are likely to disappear, since multimeters with computer connectivity are already available. common and standardized communication protocols, preferably wireless, and standardized data processing software are still needed, but they are likely to appear, since there are no technological obstacles to providing them.

another option provided by computer supported measurements is the creation of virtual instruments. as an example, consider a digital oscilloscope, which is a common piece of equipment in any lab and provides signal samples. by acquiring these samples and processing them on a computer, power, apparent power, reactive power, power factor, displacement power factor, and total harmonic distortion can be measured, which creates virtual instruments that measure quantities the oscilloscope initially could not measure.

the current state of evolution of measurement equipment, computers, and the people who operate both is such that one might expect most measurements in the future to be electronics based, with the measurement results presented in digital form. furthermore, all the data are already processed by computers. connectivity between instruments (which are computers in their construction, being built around microcontrollers) and data processing computers is likely to increase to the level where it becomes an assumed part of any instrument.
some knowledge of programming is already assumed, and it is likely that in a decade every professional will be proficient in python, limiting the need for graphical programming languages in measurement applications and enabling all the necessary programming tasks to be performed in a general purpose programming language, without requiring any specialized knowledge or training.

all these facts suggest that python is a convenient choice of programming language to support measurement automation and virtual instruments. this conclusion appeared spontaneously and independently in many places, resulting in a significant amount of available literature, like [21–27]. according to the available literature, it seems that at the moment application of python in measurements is most popular in advanced scientific experiments. examples of applications which might find the approach proposed in this paper useful are [28–32].

this paper is written on the tenth anniversary of the author's use of python in electrical measurements for measurement automation and creation of virtual instruments. all measurements for the experimental results in [33] and all the papers aggregated in it were performed using virtual instruments that post-process the data collected using a digital oscilloscope. the software was ported to python, as presented in [34]. furthermore, the same technology is used to create different instruments and systems in [35–40]. this paper aggregates the experience gained and lists all the modules and techniques necessary to design automated measurement systems and virtual instruments, providing some examples. the tools are chosen to minimize the required application specific knowledge, and such that all the tools are free software.

2. requirements imposed by automated measurement systems and virtual instruments

at first, let us review functionality required by the design of automated measurement systems and virtual instruments.
first, communication with instruments should be provided in full duplex, providing computers with the ability to send commands to instruments as well as to receive data containing measurement results. the idea is not new: it originated in the late 1960s [41], emerging with hp-ib, later renamed gpib after wide acceptance, and finally standardized as ieee 488 and ieee 488.2. thus, about half a century ago it was already evident that measurements are time consuming and boring, and that these processes should be automated by connecting instruments to a computer. standardization of commands followed, resulting in scpi commands in 1990 [42, 43], almost three decades ago. nowadays gpib still exists; however, the interface is somewhat outdated and expensive, requiring expensive cables, and is thus being replaced by general purpose higher bandwidth standard interfaces like usb and ethernet, for which communication hardware is available at low prices due to high production volume.

the second task to be performed is data processing and storage. computers are efficient in that, and many libraries to perform these tasks exist, as well as database utilities which may be required in case huge datasets are being processed.

the third task is data visualization, since providing a graphical representation of measurement results is frequently required.

the fourth group of tasks, which are always required, covers timestamping and time control, like providing the delays necessary for the system to reach the steady state or providing timed measurements at required time instants, as in monitoring of climate parameters.

also, in free software, which is in the focus of this paper, it is common practice to use other general purpose tools to provide specific functionality of the designed system. in an example which will be covered in this text, the text processing system latex is used for automatic report generation.
to provide such functionality, communication with the operating system should be provided, to start other programs and to control their execution.

in some cases, a graphical user interface is needed, especially when the designed system is going to be used by less qualified personnel or by many people, so tools for providing this functionality should be available. for all of the listed tasks, appropriate python modules are already available as free software.

in some cases, experimental hardware should be reconfigured during the measurement process, as when devices under test should be switched or a reference value should be changed. an inexpensive way to do that is to use the arduino platform [44], which can be easily controlled by a python program using [17]. the arduino mega [45] board is of special interest, since it provides as many as 54 digital inputs and outputs at a moderate price.

3. python modules useful for automated measurements and virtual instruments

3.1. communication with instruments

as already discussed, communication with instruments reduces to the exchange of ascii strings when the instruments support scpi [42, 43] commands. thus, when considering an instrument purchase, support of scpi should be an important criterion, since it enables users to create their own programs to control the instrument. nowadays, the most popular means of communicating with instruments are usb and ethernet, which is also a consideration in instrument selection.

communication over the usb interface is provided using the usbtmc protocol [46]. python support for this protocol is provided by the python-usbtmc module [7]. in [47], a script to install python-usbtmc on debian-based gnu/linux systems (tested on ubuntu and linux mint distributions) is provided.
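as a minimal sketch of the exchange of ascii strings described above, the following code sends scpi queries and converts the replies to numbers using built-in functions only. it assumes an instrument object with an `ask()` method, as offered by the `usbtmc.Instrument` class of python-usbtmc; the `FakeInstrument` class, the helper names, the placeholder usb ids, and the example query strings are hypothetical and serve only to make the sketch runnable without hardware.

```python
class FakeInstrument:
    """Hypothetical stand-in for a real instrument (no hardware needed)."""

    def __init__(self, replies):
        self.replies = replies  # maps a query string to the reply string
        self.sent = []          # log of the queries received

    def ask(self, command):
        # mimics the send-then-read behaviour of an ask()-style method
        self.sent.append(command)
        return self.replies[command]


def read_float(instrument, query):
    """Send an SCPI query and convert the ASCII reply to a float."""
    return float(instrument.ask(query).strip())


def read_floats(instrument, query):
    """Send an SCPI query and convert a comma-separated reply to floats."""
    return [float(field) for field in instrument.ask(query).split(",")]


if __name__ == "__main__":
    # With real hardware one might instead write (placeholder USB IDs):
    #     import usbtmc
    #     instrument = usbtmc.Instrument(0x1234, 0x5678)
    instrument = FakeInstrument({"MEAS:VOLT:DC?": "+1.23456E+00\n"})
    print(read_float(instrument, "MEAS:VOLT:DC?"))
```

since the helpers only rely on `ask()`, the same code could be reused with any transport that offers a method of that name; the actual query strings depend on the instrument and should be taken from its programming manual.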
the python-usbtmc module provided effective communication with the agilent 33220a signal generator [48] and the tektronix tbs 1052b-edu oscilloscope [49], which is in everyday use in the laboratory for electronics at the school of electrical engineering, university of belgrade, in the electrical measurements class [50], as well as with the multimeter [51]. a python module used to support communication and control of the oscilloscope [49] is given in [18].

communication over ethernet is provided using the vxi-11 protocol [52], implemented in the python-vxi11 module [8]. the module has been successfully used in [38] in communication with [48] and [53], and is in everyday use in [50, 54].

another popular communication interface, used with older equipment, is the rs-232 interface. communication over that interface is supported by the python-serial [9] module. besides, this module supports communication over usb with some devices, like the arduino boards [44]. a python class that supports communication with tektronix oscilloscopes is provided at [19], and it has been used successfully with tds 210, tds 220, tds 1000, and tps 2024 oscilloscopes.

providing communication with measurement equipment is the most specific part of measurement automation and of the design of virtual instruments as proposed in this paper. after the communication has been established, everything else is common general-purpose programming. communication with instruments according to scpi [42, 43] reduces to the exchange of ascii strings, and conversion of such strings is readily available in python even with built-in functions, which might be supplemented by the string module of the python standard library if more complex string operations are needed.

3.2. data processing

data processing required by measurement methods is readily provided in python using numpy module [10], primarily.
the module provides numerically efficient array objects and operations over these objects, including basic linear algebra and fft, among other numerical methods. in case some advanced numerical algorithms are needed, the scipy library [11] is available, although most of the tasks are performed by numpy. it is worth mentioning that the pylab programming environment, with namespaces set to provide a user friendly numerical programming environment, is available [12], although it was recently deprecated for encouraging old fashioned programming styles. in case extensive data analysis is necessary, the pandas module is available [13].

3.3. data visualization

to provide data visualization in python, matplotlib [14] seems to be the best known tool. it provides data plotting, both 2d and 3d, and saving the diagrams in a plethora of formats. it should be noted that other well known and mature tools are also available.

3.4. timestamping and time control

access to the system clock is provided by the time module of the python standard library [3]. the module is intuitive and comfortable to program with. in both python versions, 2 and 3, modules with the same names are available for the tasks required by the applications considered in this paper; only the versions of modules for python 2 are cited in this document, assuming equivalent module availability for version 3.

3.5. access to other programs

in some cases, like automated report generation, it is necessary to access other programs, like latex [55] or convert [56] for image data format conversion. such functionality is provided by the sys [4] and os [5] modules of the python standard library [2].

3.6. graphical user interface design

in case the designed system is intended for specific use, with a limited number of experienced users, it is not likely that creating a graphical user interface (gui) would be an interesting option. however, if the audience that uses the program is wider, a gui is required.
fortunately, there are many gui development tools and modules available for python, including rapid application development tools. in an example presented in this paper, taken from [38], the gui was created using the tkinter [6] module, being the simplest and already included in the python standard library. other very popular and advanced tools are available, like pyqt [15] and wxpython [16], which might be of interest in more complex designs.

4. supporting programs

the use of the python programming language in the gnu/linux environment provides the option of simple interfacing with other free software tools. only two such programs will be mentioned: latex [55], which was used for automatic report generation in [34], resulting in [57], and convert, used to convert image data formats provided by the digital oscilloscope, as used in [18]. any other program could easily be invoked from python, and its output used in further processing.

5. the use of arduino platform

arduino [44] is a very popular prototyping platform, characterized by free software and open hardware, which greatly fueled its popularity. the platform itself can be utilized as an instrument, either using its built-in ad converters or connecting external high precision converters. however, a different application is suggested here, based upon the availability of a large number of digital ports which can be configured either as inputs or as outputs: reconfiguration of the measurement system, and in some cases indication of the system state. easy and direct interfacing with python is provided using [17], and with [45] up to 54 digital signals can easily be controlled. relays operated by arduino digital output voltage and current levels are readily available, so reconfiguration of the measurement system can be easily provided.

6. application examples

6.1.
applications in power electronics and electric power

power electronics is the principal research area of the author, who started to use virtual instrumentation in power electronics to support research in three phase rectifiers that resulted in a number of papers aggregated in [33]. the measurements required to support the research included measurement of power, apparent power, reactive power, power factor, displacement power factor, total harmonic distortion, and efficiency. additionally, characterization of components, like recording magnetizing curves and analyzing component constitutive relations and losses, was required. specific equipment to perform these tasks is nowadays available, but it is narrow in application and highly expensive. for research purposes, virtual instruments were created, performing digital post-processing of recorded waveforms.

after the python based instrumentation, entirely based on free software, had been introduced, the methods were ported to education, to laboratory exercises in power electronics 2 [34]. to illustrate automation of the measurement process, a 92-page measurement report is automatically generated during a lab exercise that lasts for only two hours, an example being available at [57]. as an example, in fig. 1 waveforms of voltages and currents at the 6-pulse three-phase rectifier inputs are presented, and their spectra are given in fig. 2. effects caused by commutation of the diodes, like the notches in the input voltages and the limited slope of the input currents, are observable. in the spectra, the absence of harmonic components at multiples of three times the line frequency is observable, which matches analytical results. to improve the input current spectra and to reduce the harmonic pollution, 12-pulse rectifiers are applied, and waveforms that correspond to this rectifier are presented in fig. 3, while corresponding spectra are given in fig. 4.
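some of the quantities listed above can be obtained from sampled waveforms in a few lines. the following is a minimal sketch, not the author's implementation: it assumes that the voltage and current samples are taken simultaneously and cover a whole number of periods, and that thd is defined relative to the fundamental with the mean value removed. all function names are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square value of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def power_quantities(v, i):
    """Active power, apparent power and power factor from simultaneous samples."""
    p = sum(vk * ik for vk, ik in zip(v, i)) / len(v)  # mean instantaneous power
    s = rms(v) * rms(i)                                # apparent power
    return p, s, p / s


def harmonic_rms(samples, n, periods=1):
    """RMS value of the n-th harmonic; samples must cover `periods` whole periods."""
    ns = len(samples)
    w = 2.0 * math.pi * n * periods / ns
    c = 2.0 / ns * sum(x * math.cos(w * k) for k, x in enumerate(samples))
    s = 2.0 / ns * sum(x * math.sin(w * k) for k, x in enumerate(samples))
    return math.hypot(c, s) / math.sqrt(2.0)


def thd(samples, periods=1):
    """Total harmonic distortion relative to the fundamental (assumed definition)."""
    mean = sum(samples) / len(samples)
    ac = [x - mean for x in samples]  # remove the dc component
    fundamental = harmonic_rms(ac, 1, periods)
    distortion_sq = max(rms(ac) ** 2 - fundamental ** 2, 0.0)
    return math.sqrt(distortion_sq) / fundamental
```

for example, a waveform made of a unit fundamental and a third harmonic of amplitude 0.1 gives a thd of 0.1 with this definition. with numpy the sums would typically be replaced by array operations, which is considerably faster for long records.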
with the 12-pulse rectifier, reduced distortion is readily observable. collected samples are used to determine input power, output power, efficiency, power factor, displacement power factor and total harmonic distortion (thd) of the input currents and voltages. signal processing is simplified by the fact that the system frequency is the line frequency, known in advance, so spectral leakage is avoided by taking an appropriate number of samples. in systems with variable frequency this issue should be considered, and it is discussed in this paper in the section that covers frequency response measurement.

fig. 1 waveforms of the input currents and voltages, 6-pulse rectifier
fig. 2 spectra of the input currents and voltages, 6-pulse rectifier
fig. 3 waveforms of the input currents and voltages, 12-pulse rectifier
fig. 4 spectra of the input currents and voltages, 12-pulse rectifier

a direct application of the same technology, with a minor extension to provide timed measurement and timestamping, is presented in [35], where long lasting measurements, over a week, of the line voltage and its total harmonic distortion (thd) were provided. a diagram presenting measured thd values is given in fig. 5, indicating periodic behavior during working days and a specific pattern during weekends. to provide the diagram of fig. 5, measurements were made every minute over a week, and 10080 data points were collected and presented. to illustrate daily variations of the thd, the part of the waveform of fig. 5 that corresponds to workdays is plotted in fig. 6 such that the curves for each day are plotted one atop another. close to periodic behavior can be observed, illustrating the effects of daily human activities on the voltage thd. on the other hand, the thd exhibits a different pattern during weekends. to illustrate that, the same methodology as for the workdays, presented in fig. 6, is applied, and the result is presented in fig.
7. significant reduction of the bump from 08 to 16 hours can be readily observed, corresponding to the reduction of business activity during weekends. for the rest of the day, the thd profile remains about the same.

furthermore, the software based virtual instrument is able to record the root-mean-square (rms) value of the phase voltage at every point. measurements were made every minute over a week, and the resulting phase voltage histogram is presented in fig. 8. in this manner, registration of the phase voltage is obtained by applying general purpose instruments and some controlling software that provides measurement automation and timestamping.

fig. 5 thd of the phase voltage
fig. 6 thd of the phase voltage, workdays
fig. 7 thd of the phase voltage, weekend
fig. 8 histogram of the phase voltage rms value

the proposed measurement methods were extended to cover both measurement and control of a solar power generator, aiming at maximum power point tracking of the solar panel [36]. rapid prototyping is achieved using general purpose instruments and a personal computer to close the loop, which was possible due to the low frequency dynamics in the loop. as a part of the same project, a solar power harvester was designed as presented in [37], where the solar panel is kept at the maximum power point by an adjustable resistive load, and the harvested power is measured in order to estimate the average, minimum and maximum power that could be harvested at the specified location, as it depends on weather conditions.

6.2. dc voltage calibrator

a different application of the proposed methods is presented in [38], where the design of a special instrument is approached using the software tools. a dc voltage calibrator was needed, which is an expensive instrument, narrow in application, and not worth purchasing for the particular application.
a substitute is created by closing a loop that includes two general purpose instruments, a programmable signal generator [48] and a highly precise multimeter [53]. the system is presented in fig. 9. the voltage assigned to the signal generator is adjusted in order to generate the required voltage, and an improvement in accuracy of two orders of magnitude is achieved, placing the generated voltage error within a limit of about 500 μV, as depicted in fig. 10. the data of fig. 10 contain 20001 data points, obtained using an automated system which loops the required calibrator output voltage over all possible values in the available range. assuming just 30 seconds of manual work per data point, which is fairly optimistic, the measurement process would last for more than 165 man hours.

since the instrument was intended for use by less qualified personnel, a graphical user interface (gui) was built using the tkinter [6] module. the choice was made considering the application as low demanding, not requiring rapid application development tools, and having in mind that the tkinter module is a part of the python standard library [2]. a screenshot of the resulting gui is shown in fig. 11, and it presents the assigned voltage, the measured voltage, and the voltage assigned to the generator, which is an intermediate step governed by the feedback loop. in the example of fig. 11 an offset of 23 mV had to be added to compensate for the signal generator error and to locate the calibrator error within the 500 μV limit.

fig. 9 the calibrator system
fig. 10 voltage error, closed loop calibrator
fig. 11 graphical user interface of the calibrator

6.3.
applications in education

the developed techniques proved to be successful in education: free access to all of the source code is available, the code can be analyzed in classes and shared with students, and the time spent in the laboratory, limited due to the huge lab burden, can be effectively utilized, illustrating key concepts instead of spending time on trivial repetitive tasks. the first application of the proposed methods in education was made in power electronics 2 [34], where lab exercises were introduced to illustrate the theory presented in the course, as shown here by figs. 1–4. after this successful implementation, the course of electrical measurements [50] was reformed, as reported in [39]. after a year, the course was further updated, since new oscilloscopes were obtained, providing much faster data acquisition and enabling the introduction of even more experiments, since the intellectually and educationally idle processing time was reduced further. a set of nine new laboratory exercises was created [54]. according to student questionnaires, students enjoyed the concept, which reduced hard work and increased the number of experiments, focusing on the essence instead of the trivia.

6.4. measurement of frequency response

another application example of the proposed techniques, used both in education [39, 50] and in practice, is an automated system for frequency response measurement [40]. the system is intended to measure the frequency response of transmittance and immittance, and a numerically intensive technique is used to measure amplitude and phase, extracting the first harmonic. this approach is applied to remove the influence of noise on the amplitude and phase measurements; as will be illustrated, noise is significantly present in measurements encountered in practice. in this manner, precise measurements are obtained, since all of the collected samples affect the result, filtering the noise out.
the algorithm starts with selecting the time scale such that the minimal number of signal periods is covered by the oscilloscope time frame. the frequency is assigned to the signal generator, being an independent variable, thus the signal period is known. the time span of the oscilloscope screen belongs to a discrete set of values achievable by the given oscilloscope, and the span that includes the lowest number of whole signal periods is selected, determining the oscilloscope time scale. for the oscilloscope applied [49] the number of periods covered by a screen is either one or two, depending on the signal frequency, as depicted in fig. 12.

fig. 12 number of periods covered by the oscilloscope screen

after the time scale has been selected, the number of samples taken into account is computed by rounding $2500\, n_{per}\, T_0 / T_{span}$, where 2500 is the number of samples per time frame for the given oscilloscope, $n_{per}$ is the number of signal periods per time frame, shown in fig. 12, $T_0$ is the signal period, and $T_{span}$ is the time span covered by the time frame. the number of samples is solely dependent on the signal frequency, and the diagram is shown in fig. 13. in [40], an older version of the algorithm is presented, reducing the scope to only one signal period, but in cases when more than one signal period is covered by the oscilloscope screen, due to the limitations imposed by the discrete set of available time scale values, better results are obtained by taking two periods into account, and the improved algorithm is presented in this paper.

fig. 13 the number of considered samples

the algorithm assumes that the number of considered samples $n_s$ is known, and that samples of signals $x(t)$ and $y(t)$ are available as $x_k$ and $y_k$ for $k \in \{0, \ldots, n_s - 1\}$.
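the selection of the time span and the computation of the number of samples can be sketched as follows. this is an illustration under stated assumptions: the list of available time spans is supplied by the caller (the values used in the usage note are made up), and "the span that includes the lowest number of whole signal periods" is interpreted as the shortest available span that holds at least one whole period. the function names are hypothetical.

```python
import math

SAMPLES_PER_FRAME = 2500  # samples per time frame, as quoted in the text


def select_time_span(t0, available_spans):
    """Pick the shortest available span that holds at least one whole period.

    Returns (t_span, n_per), n_per being the number of whole periods in the span.
    """
    for t_span in sorted(available_spans):
        # small tolerance so that t_span == t0 counts as one whole period
        n_per = math.floor(t_span / t0 + 1e-9)
        if n_per >= 1:
            return t_span, n_per
    raise ValueError("signal period exceeds the longest available time span")


def number_of_samples(t0, t_span, n_per):
    """Samples covering n_per whole periods: round(2500 * n_per * T0 / Tspan)."""
    return round(SAMPLES_PER_FRAME * n_per * t0 / t_span)
```

with assumed spans of 1 ms, 2.5 ms, 5 ms and 10 ms, a 450 hz signal fits one period into the 2.5 ms span, while a 900 hz signal fits two periods into the same span; in both cases 2222 of the 2500 samples are taken into account.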
weighting functions are computed next, according to

$$c_k = 2 \cos\left(\frac{2 \pi \, n_{per} \, k}{n_s}\right) \qquad (1)$$

and

$$s_k = 2 \sin\left(\frac{2 \pi \, n_{per} \, k}{n_s}\right) \qquad (2)$$

according to fourier analysis, the cosine component of the signal x(t) is obtained as

$$x_c = \frac{1}{n_s} \sum_{k=0}^{n_s - 1} x_k \, c_k \qquad (3)$$

while the sine component is

$$x_s = \frac{1}{n_s} \sum_{k=0}^{n_s - 1} x_k \, s_k. \qquad (4)$$

after the cosine and sine components are determined by applying fourier analysis, effectively filtering the noise out, the signal amplitude is obtained as

$$x_m = \sqrt{x_c^2 + x_s^2} \qquad (5)$$

and the phase is obtained as

$$\varphi_x = \operatorname{atan2}(x_s, x_c) \qquad (6)$$

using the atan2 function, which takes two arguments and provides the result in the range $(-\pi, \pi]$. the same signal processing is performed on the signal y(t), resulting in the values $y_c$, $y_s$, $y_m$, and $\varphi_y$. finally, the transfer function magnitude is obtained as

$$|h(j \omega_0)| = \frac{y_m}{x_m} \qquad (7)$$

and the phase is obtained as

$$\varphi_{h0} = \varphi_y - \varphi_x \qquad (8)$$

the value $\varphi_{h0}$ is named "raw phase" since it takes values in the range $(-2\pi, 2\pi)$, given that $\varphi_x, \varphi_y \in (-\pi, \pi]$. the value is correct, due to the phase periodicity over $2\pi$, but it is convenient to provide the phase value in the range $(-\pi, \pi]$. to this aim, the phase is adjusted by an appropriate shift of $2\pi$ according to

$$\varphi_h = \begin{cases} \varphi_{h0} + 2\pi, & \varphi_{h0} \le -\pi \\ \varphi_{h0}, & \varphi_{h0} \in (-\pi, \pi] \\ \varphi_{h0} - 2\pi, & \varphi_{h0} > \pi. \end{cases} \qquad (9)$$

this concludes the algorithm for one point, at the specified frequency value. the algorithm is repeated over the specified frequency range for the specified number of data points.

as the first example, consider the circuit of fig. 14, used to illustrate the frequency response effects caused by the capacitor, and to identify the frequency range where it behaves approximately as an open circuit and the range where it behaves approximately as a short circuit. the program is run, and the frequency response is obtained as presented in the diagram of fig. 15, clearly indicating areas of flat frequency response where the capacitor could be considered either as an open circuit, below 1 khz in the considered case, or as a short circuit, above 100 khz in the considered case.

fig. 14 the circuit, $r_1 = r_2 = 1$ kΩ, $c_1 = 10$ nf

fig. 15 frequency response of the circuit

the same system could be used for immittance measurements, of impedance and admittance, using the circuit of fig. 16. in the circuit of fig. 16, r is used as a reference resistor, and the current through the measured impedance is computed as

$$i = \frac{v_1 - v_2}{r} \qquad (10)$$

the same data processing algorithm as for the transfer functions is applied, taking the signals $v_2(t)$ and i(t) as y(t) and x(t) if impedance computation is the goal. applying the method to electronic components provides insight into their operation and suggests suitable modeling strategies. as an example, the frequency response of the impedance of a capacitor c = 1 nf is presented in fig. 17. the result matches expectations, and the barely noticeable deviations of the measured phase from 90° at the beginning and at the end of the diagram are caused by the huge difference between the capacitor impedance at the considered frequency and the impedance reference of r = 20 kΩ. this is expected, since the measured impedance varies over four decades, i.e. $10^4$ times, over the considered frequency range, while a constant reference impedance is used. to improve the result, the suggested approach of using an arduino to reconfigure the circuit, adapting the reference impedance value to the measured impedance, should be applied.

fig. 16 circuit structure for impedance measurement; r is the impedance reference value

fig. 17 frequency response of a capacitor impedance, c = 1 nf

in contrast to the capacitor impedance frequency response, which follows the ideal model, the impedance of an inductor is presented in fig. 18.
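the signal processing of eqs. (1)–(10) can be sketched in a few lines of plain python. this is an illustrative sketch under stated assumptions, not the author's code: the function names are hypothetical, and the sign inside atan2 is chosen here so that the phase is a lead angle, x(t) = x_m cos(ωt + φ_x), which may differ in sign from the argument order printed in eq. (6). the record is assumed to hold exactly n_per whole periods in n_s samples.

```python
import math


def first_harmonic(samples, n_per):
    """Amplitude and phase of the component with n_per periods in the record."""
    n_s = len(samples)
    x_c = 0.0
    x_s = 0.0
    for k, x_k in enumerate(samples):
        angle = 2.0 * math.pi * n_per * k / n_s
        x_c += x_k * 2.0 * math.cos(angle)  # weighting function c_k, eqs. (1), (3)
        x_s += x_k * 2.0 * math.sin(angle)  # weighting function s_k, eqs. (2), (4)
    x_c /= n_s
    x_s /= n_s
    amplitude = math.hypot(x_c, x_s)        # eq. (5)
    # Assumed sign convention: lead angle, x(t) = x_m * cos(w*t + phase).
    phase = math.atan2(-x_s, x_c)
    return amplitude, phase


def wrap_phase(phi):
    """Shift a raw phase from (-2*pi, 2*pi) into (-pi, pi], as in eq. (9)."""
    if phi > math.pi:
        return phi - 2.0 * math.pi
    if phi <= -math.pi:
        return phi + 2.0 * math.pi
    return phi


def transfer_point(x, y, n_per):
    """Magnitude and phase of y relative to x at one frequency, eqs. (7)-(9)."""
    x_m, phi_x = first_harmonic(x, n_per)
    y_m, phi_y = first_harmonic(y, n_per)
    return y_m / x_m, wrap_phase(phi_y - phi_x)


def impedance_point(v1, v2, r_ref, n_per):
    """Impedance from the two recorded voltages, with the current from eq. (10)."""
    current = [(a - b) / r_ref for a, b in zip(v1, v2)]
    return transfer_point(current, v2, n_per)
```

because every sample contributes to the two sums, uncorrelated noise averages out, which is the noise-filtering property the text relies on. repeating `transfer_point` or `impedance_point` over a list of generator frequencies would produce one data point per frequency.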
the inductor has a rated inductance of 10 mh, but it exhibits inductive behavior only in the frequency range from about 1 khz to about 200 khz. at low frequencies, the parasitic resistance of the winding dominates the impedance, while at high frequencies the parasitic capacitance of the winding dominates the response, resulting in a capacitor-like frequency response above the resonant frequency of about 400 khz. the results are obtained using a reference resistor of 500 Ω.

fig. 18 frequency response of an inductor impedance, l = 10 mh

as a final example covering impedance measurements, consider the frequency response of an electrolytic capacitor impedance, presented in fig. 19. to measure the impedance of the electrolytic capacitor, a dc offset of 2 v has been applied, and the measurements are made with a 0.5 v amplitude of the signal generator ac component. the capacitor shows dominantly capacitive behavior only at frequencies lower than 300 hz; in the frequency range from 300 hz to about 300 khz, an equivalent series resistance slightly above 1 Ω dominates the impedance. above 300 khz, the equivalent series inductance starts to dominate the impedance behavior. to illustrate the waveforms captured during the measurement process and the noise filtering, the waveforms recorded while measuring the electrolytic capacitor impedance at the frequency of 289.087 hz are presented in fig. 20. the red trace corresponds to the capacitor current, the yellow trace is the capacitor voltage, and the cyan trace is the input voltage ac component. a significant presence of noise in the capacitor voltage waveform can be readily observed. a similar situation occurs in the frequency range from 300 hz to 300 khz, where the capacitor voltage is low. regardless of the noise, consistent measurements of amplitude and phase are presented in fig. 19, indicating that the noise is successfully removed by the signal processing and does not affect the measurement result.

fig. 19 frequency response of an electrolytic capacitor impedance, c = 470 μf

fig. 20 waveforms recorded during the electrolytic capacitor impedance measurement: yellow: capacitor voltage; red: signal proportional to the capacitor current; cyan: voltage of the signal generator

in the educational application of [39, 50], measurement of a transmission line transfer function is an experiment that attracts a lot of student attention, and it has the educational value of connecting courses that cover circuit theory to engineering practice. due to the nature of the problem, a linear frequency scale is appropriate, and the transfer function of an open transmission line is presented in fig. 21. resonances and a nonlinear phase response can be readily observed. repeating the experiment with a properly terminated transmission line, the results of fig. 22 are obtained, indicating a flat amplitude response and a linear phase response, corresponding to a close-to-ideal transmission system.

fig. 21 frequency response of an open transmission line

fig. 22 frequency response of a properly terminated transmission line

7. conclusions

in this paper, the application of the python programming language in creating automated measurement systems and virtual instruments is discussed. it is shown that creating such systems requires a set of specific tasks that are not frequent in common application programming. the tasks are listed, and the python modules that support them are identified. it is shown that for all of the specific tasks there are python modules readily available, either from the python standard library or from external sources, some of them highly specialized to support communication with instruments. effective methods of including modules and arranging them in separate namespaces turned out to be useful in the considered application.
it is also shown that other programs, like latex for text processing, might be useful in creating automated measurement tools, providing automatic report generation, which might be of use in certifying laboratories. the use of the arduino platform is proposed to provide automatic reconfiguration controlled by the measurement system, as well as indication of the system state and performance. tools for controlling arduino platforms directly from python are identified.

application of the proposed methods is illustrated in four different areas, as reported by the author in seven papers. applications started in power electronics, and positive experiences spread to metrology, in the design of a dc voltage calibrator, to education, where the methods were used in modernizing two courses, and to measurements of system frequency response, as applied in electronics, acoustics, and control system design. in some of these applications, the selection of the time scale and of the number of considered samples in the case of variable signal frequency is controlled by an updated algorithm presented in this paper.

the overall conclusion is that python is an adequate tool for creating automated measurement systems and virtual instruments, due to its modular structure and openness to contributed modules. in the choice of programming tools, attention has been paid to favoring general purpose tools and techniques, to minimize specific knowledge requirements. bearing in mind the evolution of software and of the people who work in metrology, wide application of the proposed approach and methods can be expected, and it has already started in several places independently.
references

[1] python programming language, official website, [online] available: http://www.python.org/
[2] the python standard library, [online] available: https://docs.python.org/3/library/
[3] time - time access and conversions, [online] available: https://docs.python.org/2/library/time.html
[4] sys - system-specific parameters and functions, [online] available: https://docs.python.org/2/library/sys.html
[5] os - miscellaneous operating system interfaces, [online] available: https://docs.python.org/2/library/os.html
[6] graphical user interfaces with tk, [online] available: https://docs.python.org/2/library/tk.html
[7] python-usbtmc, [online] available: https://github.com/python-ivi/python-usbtmc
[8] python vxi-11, [online] available: https://github.com/python-ivi/python-vxi11
[9] pyserial, [online] available: https://pythonhosted.org/pyserial/
[10] numpy, [online] available: http://www.numpy.org/
[11] scipy, [online] available: https://www.scipy.org/
[12] scipy: pylab, [online] available: https://scipy.github.io/old-wiki/pages/pylab
[13] pandas, [online] available: https://pandas.pydata.org/
[14] matplotlib, [online] available: https://matplotlib.org/
[15] pyqt's modules, [online] available: http://pyqt.sourceforge.net/docs/pyqt4/modules.html
[16] wxpython, [online] available: https://wxpython.org/
[17] python-arduino-proto-api-v2, [online] available: https://github.com/vascop/python-arduino-proto-api-v2
[18] p. pejović, oscusb, python module to support communication with oscilloscopes over usb, [online] available: http://tnt.etf.bg.ac.rs/~oe2em/oscusb.py
[19] p. pejović, oscusb, python module to support communication with oscilloscopes over rs-232, [online] available: http://tnt.etf.bg.ac.rs/~oe2em/oscrs232.py
[20] p. pejović, oscusb, python module to support presentation of numbers in engineering notation, [online] available: http://tnt.etf.bg.ac.rs/~oe2em/engineeringnotation.py
[21] j. m. hughes, real world instrumentation with python: automated data acquisition and control systems. o'reilly media, inc., 2010.
[22] g. real, l. raviola, m. f. jauré, and a. o. vitali, "data acquisition system for didactic laboratories based on open-source hardware and free software," in proceedings of the 2015 xvi ieee workshop on information processing and control (rpic), 2015, pp. 1-6.
[23] j. l. johnson, h. t. wörden, and k. v. wijk, "place: an open-source python package for laboratory automation, control, and experimentation," journal of laboratory automation, vol. 20, no. 1, pp. 10-16, 2015.
[24] i. j. koenka, j. sáiz, and p. c. hauser, "instrumentino: an open-source software for scientific instruments," chimia international journal for chemistry, vol. 69, no. 4, pp. 172-175, 2015.
[25] i. j. koenka, j. sáiz, and p. c. hauser, "instrumentino: an open-source modular python framework for controlling arduino based experimental instruments," computer physics communications, vol. 185, no. 10, pp. 2724-2729, 2014.
[26] f. j. f. martín, m. v. llopis, j. c. c. rodríguez, j. r. b. gonzález, and j. m. blanco, "low-cost open-source multifunction data acquisition system for accurate measurements," measurement, vol. 55, pp. 265-271, 2014.
[27] a. j. lewis, m. campbell, and p. stavroulakis, "performance evaluation of a cheap, open source, digital environmental monitor based on the raspberry pi," measurement, vol. 87, pp. 228-235, 2016.
[28] v. davidović, d. danković, s. golubović, s. djorić-veljković, i. manić, z. prijić, a. prijić, n. stojadinović, and s. stanković, "nbt stress and radiation related degradation and underlying mechanisms in power vdmosfets," facta universitatis, series: electronics and energetics, vol. 31, no. 3, pp. 367-388, 2018.
[29] s. k. mohapatra, k. p. pradhan, and p. k. sahu, "resolving the bias point for wide range of temperature applications in high-k/metal gate nanoscale dg-mosfet," facta universitatis, series: electronics and energetics, vol. 27, no. 4, pp. 613-619, 2014.
[30] s. k. mohapatra, k. p. pradhan, and p. k. sahu, "ztc bias point of advanced fin based device: the importance and exploration," facta universitatis, series: electronics and energetics, vol. 28, no. 3, pp. 393-405, 2015.
[31] i. manić, d. danković, v. davidović, a. prijić, s. djorić-veljković, s. golubović, z. prijić, and n. stojadinović, "effects of pulsed negative bias temperature stressing in p-channel power vdmosfets," facta universitatis, series: electronics and energetics, vol. 29, no. 1, pp. 49-60, 2015.
[32] x. saura, m. riccio, j. suñé, a. irace, and e. miranda, "study on the spatial generation of breakdown spots in mim capacitors with different aspect ratios," facta universitatis, series: electronics and energetics, vol. 28, no. 2, pp. 177-192, 2015.
[33] p. pejović, "three-phase diode rectifiers with low harmonics: current injection methods," springer, 2007.
[34] p. pejović, m. simić, "virtual instruments for power electronics based on free software tools," in proceedings of the 17th international symposium on power electronics, ee 2013, novi sad, october-november 2013.
[35] p. pejović, m. simić, "a system for measuring mains voltage parameters and logging the data," in proceedings of the 18th international symposium on power electronics, ee 2015, novi sad, october 2015.
[36] v. lazarević, m. bjelica, p. pejović, "maximum power point tracking control system of photovoltaic module using free software and standard laboratory equipment," in proceedings of the 18th international symposium on power electronics, ee 2015, novi sad, october 2015.
[37] p. pejović, m. bjelica, "a simple system to estimate on-site solar energy harvesting," in proceedings of the 18th international symposium on power electronics, ee 2015, novi sad, october 2015.
[38] p. pejović, a. zeković, "software supported dc voltage calibrator," in proceedings of the xi international symposium industrial electronics, indel 2016, banja luka, november 3-5, 2016.
[39] p. pejović, "electrical measurements revisited: experiences from modernizing the course," in proceedings of the ieee eurocon 2017, ohrid, republic of macedonia, 6-8 july 2017, pp. 838-844.
[40] p. pejović, "an automated system for frequency response measurement based on free software tools," in proceedings of the xii international symposium industrial electronics, indel 2018, banja luka, november 1-3, 2018.
[41] wikipedia contributors, ieee-488, [online] available: https://en.wikipedia.org/wiki/ieee-488
[42] wikipedia contributors, standard commands for programmable instruments, [online] available: https://en.wikipedia.org/wiki/standard_commands_for_programmable_instruments
[43] standard commands for programmable instruments (scpi), [online] available: http://www.ivifoundation.org/docs/scpi-99.pdf
[44] m. banzi, getting started with arduino, second edition, o'reilly media, 2011.
[45] arduino mega 2560 rev3, [online] available: https://store.arduino.cc/arduino-mega-2560-rev3
[46] universal serial bus test and measurement class specification (usbtmc), revision 1.0, april 14, 2003, [online] available: http://sdpha2.ucsd.edu/lab_equip_manuals/usbtmc_1_00.pdf
[47] p. pejović, usbtmcinstall.zip, [online] available: http://tnt.etf.bg.ac.rs/~oe2em/usbtmcinstall.zip
[48] agilent technologies, agilent 33220a 20 mhz waveform generator user's guide, [online] available: http://cp.literature.agilent.com/litweb/pdf/33220-90002.pdf
[49] tbs1000b-edu series datasheet, [online] available: https://www.tek.com/datasheet/digital-storageoscilloscope-0
[50] p. pejović, electrical measurements, course web site, [online] available: http://tnt.etf.bg.ac.rs/~oe2em/
[51] keysight technologies digital multimeters, 34460a digital multimeter, 6 (1/2) digit, basic truevolt, [online] available: https://literature.cdn.keysight.com/litweb/pdf/5991-1983en.pdf
[52] vmebus extensions for instrumentation tcp/ip instrument protocol specification vxi-11, revision 1.0, the vxibus consortium, 1995, [online] available: http://www.vxibus.org/files/vxi_specs/vxi-11.zip
[53] agilent 34410a and 34411a multimeters, [online] available: http://cp.literature.agilent.com/litweb/pdf/59893738en.pdf
[54] p. pejović, "laboratorijske vežbe iz električnih merenja," [online] available: https://zenodo.org/record/1311557/files/prirucnik.pdf?download=1
[55] ctan comprehensive tex archive network, [online] available: https://ctan.org/
[56] imagemagick convert, [online] available: https://imagemagick.org/script/convert.php
[57] twelve pulse rectifier lab report example, [online] available: http://tnt.etf.bg.ac.rs/~ms1ee2/report-12-pulse2.pdf

facta universitatis series: electronics and energetics vol. 34, no 1, march 2021, pp. 115-131 https://doi.org/10.2298/fuee2101115p
© 2021 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper

synthesis of composite logic gate in qca embedding underlying regular clocking

jayanta pal (1), dhrubajyoti bhowmik (2,a), ayush ranjan singh (2,b), apu kumar saha (2,c), bibhash sen (3)
(1) department of information technology, tripura university, agartala, tripura, india
(2) department of (a) cse, (b) ee, (c) mathematics, nit agartala, barjala, tripura, india
(3) department of cse, national institute of technology, durgapur, west bengal, india

abstract. quantum-dot cellular automata (qca) has emerged as one of the alternatives to current cmos technology. it has the advantages of computing at a faster speed, consuming lower power, and working at the nanoscale.
besides these advantages, qca logic is limited to its primitive gates, the majority voter and the inverter, which limits cost-efficient logic circuit realization. numerous designs have been proposed to realize various intricate logic gates in qca at the penalty of non-uniform clocking and improper layout. this paper proposes a composite gate (cg) in qca, which realizes all the essential digital logic gates such as and, nand, inverter, or, nor, and exclusive gates like xor and xnor. reportedly, the proposed design is the first of its kind to generate all basic logic in a single unit. the most striking feature of this work is the augmentation of the logic block with the underlying clocking circuit, making it a more realistic circuit. the reliable, efficient, and scalable (res) underlying regular clocking scheme is utilized to enhance the proposed design's scalability and efficiency. the relevance of the proposed design is best illustrated by the coplanar implementation of 2-input symmetric functions, achieving a 33% gain in gate count without any garbage output. the dissipated energy of both designs has been evaluated and analyzed. the end product is verified using the qcadesigner 2.0.3 simulator, and qcapro is employed for the study of power dissipation.

key words: composite gate, regular clocking, 2-input symmetric function, qca, res clocking, basic gates, energy dissipation.

received august 1, 2020; received in revised form september 5, 2020
corresponding author: jayanta pal, department of information technology, tripura university, suryamaninagar, tripura, india (e-mail: jayantapal@tripurauniv.in)

1. introduction

the performance of cmos technology degrades due to its high power consumption, high leakage current, quantum effects, and feature size. this has led the international technology roadmap for semiconductors (itrs) to look for a noteworthy and viable alternative to cmos vlsi.
in this context, quantum-dot cellular automata (qca) [1] is considered one of the rising nanotechnologies able to overcome the restrictions of current cmos vlsi [2]. qca employs quantum dots rather than the transistors used in cmos. circuit design in qca enjoys low power dissipation [3] and high operational speed at the nanoscale [4]. the transmission and distribution of information in qca occur through the position of electrons and the interaction of cells in the presence of a clock signal. it does not involve any flow of electrons, as is the case in cmos. the basic structural components of a qca based architecture are the 3-input majority gate (mv) [5, 6], the inverter [7, 8], and the array of cells [9, 10]. besides, fan-out plays a significant role in signal splitting for the substantial implementation of qca arrays [11]. four-phase clocking [12], with the switch, hold, release, and relax phases, is used to control the direction of signal propagation. the purpose of these phases is to allow or deny the tunneling of the confined electrons in a qca cell and to achieve a stable logic state. any complex circuit can be implemented in qca based technology with these basic primitives [13–16]. however, qca logic design is limited to the majority voter gate and the inverter only, which hampers cost-effective logic synthesis [17].

a minimal number of functional logic blocks is preferred in the realization of a sophisticated qca logic design. a composite logic gate (cg) allows the maximum realization of logic functions and reduces the circuit complexity and qca cost. the use of an underlying regular clocking is essential for the design, considering the timing issue, for error-free and high-performance behavior [18, 19]. a few attempts have been made to address the issue of regular clocking [18–20]. however, 2ddwave [20] lacks proper feedback path realization, and use clocking [18] lacks multi-directional information flow, although it addresses the feedback issue present in [20].
the issues were adequately tackled in [19] with a robust, efficient and scalable (res) scheme introducing a multi-input based clock zone. researchers have proposed numerous designs employing regular clocking [18–21] and also without regular clocking [22–27]. most of the designs have utilized more than two logic outputs of the primitive gate. nevertheless, no design has been reported that produces all possible primary logic functions for the given inputs.

although researchers in the field of qca technology have proposed and presented many remarkable designs, in most cases they did not take into account the vital use of the underlying regular clock. as a result, they ended up with cell layouts with irregular clocking zones. such irregular clocking makes fabrication more difficult. at the same time, the clocking scheme should be well-defined and regular. a composite gate (cg) is capable of generating all possible logic gates like and, nand, inverter, or, nor, xor and xnor for the given inputs. this paper aims to design a low-cost qca gate that realizes all possible logic operations, and to design the same using the res clocking scheme. until now, no previous attempt has been made to realize a cg with such flexibility and functionality.

on the other hand, the synthesis of symmetric boolean functions is an essential aspect of cryptology [28], and it draws considerable attention from researchers in qca. different designs and techniques to synthesize symmetric functions can be seen in [29–33]. the proposed cg can be utilized as a better choice for the synthesis of two-input symmetric functions.

the salient features of the proposed work are as follows:
▪ a composite gate is proposed, in order to produce all logic gates in a single unit.
▪ a regular clocking scheme (res) is applied to ensure the design's scalability and efficiency.
▪ the design capability of the circuit has been investigated.
▪ the cost-efficiency of the circuit is evaluated.
▪ the energy dissipation has also been evaluated for both designs (with and without regular clock).

the remaining part of the paper is arranged in the following order: section 2 discusses the basics of qca. in section 3, the different proposals and designs for underlying clocking schemes are discussed. the composite gate's basics, the circuit proposed in qca, and the simulated result are presented in section 4. the performance and applicability analysis is discussed in section 5. the analysis of the power dissipation of the proposed design is discussed in section 6. section 7 concludes the article.

2. quantum-dot cellular automata

in this section, the basic design and working principle of quantum-dot cellular automata (qca) technology are discussed. a square-shaped qca cell consists of two electrons residing in two of the four quantum dots inside it. the information flows through the effect of the coulombic repulsion force between the electrons; an electron moves between neighboring dots via tunneling [34], but not between neighboring cells, due to the high inter-cell potential.

2.1. qca basic structure

according to the alignment, or position, of the electrons in the qca cell, the cell can be in one of two polarization states: polarization +1 (binary logic 1) or polarization -1 (binary logic 0). likewise, according to the position of the qca dots, the cell can be a 90° cell or a 45° cell. because of the coulombic repulsion force, the electrons tend to maintain the maximum distance, placed diagonally, as depicted in fig. 1(a). the polarization of each cell (say P) can be formulated with eq. 1 as follows.

$$P = \frac{(p_1 + p_3) - (p_2 + p_4)}{p_1 + p_2 + p_3 + p_4} \qquad (1)$$

where $p_i$ represents the probability of the presence of an electron in the ith quantum dot [35].

fig. 1 basic structural components of qca.
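as a quick numerical check of eq. 1, the cell polarization can be evaluated for the two fully polarized configurations. the sketch below is illustrative only; the function name and the convention that dots 1 and 3 form the diagonal giving +1 are read off eq. 1 and are otherwise assumptions.

```python
def polarization(p1, p2, p3, p4):
    """Cell polarization from the four dot occupation probabilities, eq. 1."""
    total = p1 + p2 + p3 + p4
    if total == 0:
        raise ValueError("at least one dot must have nonzero occupation")
    return ((p1 + p3) - (p2 + p4)) / total


if __name__ == "__main__":
    print(polarization(1.0, 0.0, 1.0, 0.0))  # electrons on dots 1 and 3
    print(polarization(0.0, 1.0, 0.0, 1.0))  # electrons on dots 2 and 4
    print(polarization(0.5, 0.5, 0.5, 0.5))  # unpolarized cell
```

the first two calls correspond to the binary 1 and binary 0 states, while equal occupation of all four dots gives zero polarization, the state a cell reaches when it carries no information.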
the qca cells are set alongside one another to form a qca wire, and the data propagates through it. the coulombic repulsion force between electrons makes the polarization of the last cell in the array the same as that of the first one. it can also be the opposite polarization, depending on the structure [36]. there can be two different structures of qca wire (fig. 1(b)). the inverter (not gate) can be implemented in two ways (fig. 1(c)), and it is used whenever the reverse polarized value of a cell is needed. the basic building blocks of most qca based circuits are the majority voter gate (mv) and the inverter. the functional logic of the mv with three inputs p, q, and r is given by eq. 2. it is designed with 5 qca cells: three input cells, one device cell, and one output cell (fig. 1(d)).

$$m(p, q, r) = pq + qr + rp \qquad (2)$$

to propagate data through the intersecting point of wires, two commonly known wire crossings exist in qca: the coplanar approach [37] and the multi-layer approach [38, 39]. the crossover can also be achieved in any of three (3) ways by proper placement of clock zones. it can also be implemented using the two types of qca wires (90° cells and 45° cells) crossed vertically and horizontally. on the other hand, a multi-layer wire crossing can be implemented using different single layers created one above another (fig. 1(e)). still, it is not a suitable approach from a fabrication point of view.

2.2. qca clocking

apart from the basic building blocks in qca design, it is also essential to have an appropriate synchronization of clock signals with the majority gate, with optimized delay. the qca signals are driven by four clock phases: the switch phase, hold phase, release phase, and relax phase. qca cells in a circuit should be organized in consecutive clocking zones, with proper synchronization achieved by means of some added cells [40] (see fig. 1(f)).
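returning to the majority voter of eq. 2, a short sketch can confirm its truth table and the way the other gates are derived from it. the reductions used here, fixing the third input to 0 or 1 to obtain and and or, and composing three majority gates with inverted inputs to obtain xor, are the ones the paper itself relies on in section 4; the python function names are hypothetical and the code models only the boolean behavior, not cell layout or clocking.

```python
from itertools import product


def majority(p, q, r):
    """Three-input majority voter, eq. 2: m(p, q, r) = pq + qr + rp."""
    return (p & q) | (q & r) | (r & p)


def inverter(p):
    """Logical inversion of a single bit."""
    return 1 - p


def and_gate(a, b):
    return majority(a, b, 0)  # third input fixed to logic 0


def or_gate(a, b):
    return majority(a, b, 1)  # third input fixed to logic 1


def xor_gate(a, b):
    # maj(maj(a, b', 0), maj(a', b, 0), 1)
    return majority(majority(a, inverter(b), 0),
                    majority(inverter(a), b, 0), 1)


if __name__ == "__main__":
    for a, b in product((0, 1), repeat=2):
        print(a, b, and_gate(a, b), or_gate(a, b), xor_gate(a, b))
```

nand, nor, and xnor follow by applying `inverter` to the three outputs, so no additional majority gates are needed for them.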
the inter-dot potential barrier gradually increases between the qca cells during the switch phase, which leads the input cell to interact with the neighboring cells and results in a polarized state [41]. the actual computation is achieved in this phase. the barrier is high during the hold phase and prevents the tunneling of electrons between the dots. the barrier gradually decreases in the release phase. on reaching its lowest value in the relax phase, the cell loses its polarized value entirely and has no effect on neighboring cells.

3. related work

usually, qca design follows the rule of considering a minimum or maximum cell count in a particular clocking zone. however, this does not justify the design, and the proper use of clocking phases is ignored, which leads to the need for appropriate use of well-defined, underlying, and regular clocking schemes. in this regard, several works have been proposed and presented, as reported in [42–45], but none of them has considered a regular layout of the underlying clocking scheme. such a layout not only helps enhance the performance of the design, but also ensures scalability, reliability, and ease of fabrication. the concept was first suggested by [12], and [46] further proposed that the clocking should be regular, uniform, and of bounded shape. but neither of them formulated any scheme to be implemented.

fig. 2 regular clocking scheme (a) 2ddwave [20] (b) use [18] (c) res [19].

a clocking scheme was formulated by [20] under the name of two-dimensional qca clocking schemes. however, it does not support a proper feedback path; thus, the implementation of sequential circuits was complicated. the issue was taken care of by [18] with multi-layer wire crossing. in addition, they argue that it can also work for coplanar crossing. however, both solutions are physically challenging to fabricate.
an alternative of all the hurdles, as mentioned earlier, was proposed in [19] with an added benefit of the three-dimensional flow of information in any particular clock zone. along with a proper regular underlying clocking scheme, a universal qca design is also essential to realize a complex circuitry. the qca implementation relies on the majority gate and inverter. therefore, a single unit capable of generating the basic functions like and, or & xor is the need of the hour. it will enhance the performance of the circuit, makes it cost-effective, and exploitable. a few works can be seen in [32,33,44,47] with different aspects of aim and motivation. but none considered design of the composite gate, using underlying clocking and minimization of garbage outputs while realizing a complex circuit. whereas and-nand methodology [32], uqcalg implementation [33] also analyzed the applicability and performance with the synthesis of symmetric function. all these factors have influenced to propose a cost-effective composite gate with the capability of generating the essential logic functions like and, or, xor gate. moreover, the res clocking scheme enhances efficiency, and scalability is in the center of attention. 4. proposed composite gate digital systems are constructed from the basic building blocks, also termed as logic gates. these gates include and, or, nand, nor, not, xor, and xnor gates. the functionality of all the logic gates can be implemented, simulated, and verified using quantum-dot cellular automata technology [48]. many attempts have been reported in qca, but no suitable design found aims in proposing a circuit with primitive logic like nand, and, nor, or, xor, x-nor as output. therefore, the logic gate design with a potentiality to produce universal logic functions like and, or & xor is necessary. the logic operations can be framed in qca with the majority gate’s help and inverter only, as shown in fig. 3. 120 j. pal, d. bhowmik (a) (b) fig. 
3 composite gate (a) block diagram (b) schematic diagram.

such a design supports the synthesis of complex logic functions in which most of the basic logic functions are used simultaneously, with a minimum number of circuit blocks. the proposed design receives two inputs, a and b, and generates the basic logic outputs and (a.b), or (a+b), and xor (a.b'+a'.b). the functional logic equation of the proposed design can be expressed as eq. 3, where a and b are the inputs to the composite gate.

f(a,b) = {a.b, a+b, a.b'+a'.b} (3)

all the functional outputs can also be expressed using the majority voter gate and the inverter in qca as follows.

and = maj(a,b,0), equivalent to a.b
or = maj(a,b,1), equivalent to a+b
xor = maj(maj(a,b',0), maj(a',b,0), 1), equivalent to a ⊕ b

the remaining logic functions follow in the same way: the nand gate is expressed as maj(a',b',1) or maj(a,b,0)', the nor gate as maj(a',b',0) or maj(a,b,1)', and the x-nor gate as maj(maj(a',b',0), maj(a,b,0), 1) = (a⊕b)'. in other words, they can be generated directly by applying an inverter to the generated outputs and, or, and xor, respectively, without any additional primitives. additionally, the inversion of an input can be obtained using the xor logic alone by setting the other input to logic 1: xor(a,b) = a' for b = 1, and xor(a,b) = b' for a = 1. the nand and nor gates are called universal gates, as they can generate the basic logic gates and, or, and not [48]. to the best of the authors' knowledge, this is the first attempt to design a cost-efficient composite gate. moreover, the proposed composite logic gate is also presented with a proper cell layout using the res clocking scheme, discussed later in this section.

4.1. qca realization

as depicted in the schematic diagram (see fig.
3(b)), it can be inferred that the proposed cg requires five (5) majority gates and two (2) inverters. the qca cell layout of the cg is shown in fig. 4(a); it targets low area, high device density, and enhanced computational speed. it occupies an area of 0.13 µm2, uses 88 qca cells, and has a latency of 0.75 clock cycles (3 clock zones).

fig. 4 (a) qca layout of the proposed composite gate (b) simulated result.

the result generated by the simulation of the proposed design is shown in fig. 4(b). when the outcomes were verified against the simulated values, no data loss was observed. the 2-input composite gate is simulated and verified using the qcadesigner [49] simulator with default parameters. the result of the circuit has been tested and validated against the truth tables of the underlying logic gates. for an intersection in the flow of information, a coplanar wire crossing (preferably the clock-based approach) has less fabrication overhead than a multi-layer or rotated-cell-based crossing. although any kind of crossing is still a matter of research, it is an essential part of any large circuit. in the proposed design, clock-based coplanar wire crossing can be used efficiently if cascading is required. as mentioned earlier, without an underlying clocking scheme, the design may not be a suitable candidate for fabrication. with this consideration, an extended design is proposed next.

4.2. regular clock based design

a regular clocking scheme is essential to enable the specification of standard cells, routing algorithms, and the development of placement and fabrication flows that allow qca technology to progress [18]. it can be seen from the existing regular clocking schemes, as shown in fig.
5 that the neighboring clock zones can be identified by continuous zone numbering. for example, clock zone 0 (switch phase) is denoted by 1, followed by clock zone 1 (hold phase), and so on up to clock zone 3 (relax phase). information propagates through the clock zones sequentially: from clock zone 0 to 1, then 1 to 2, 2 to 3, and 3 back to 0, and so on. furthermore, to facilitate a feedback path, an opposite direction of flow must also be available in the clocking scheme. these constraints have been met, along with a three-dimensional flow of information, in the res clocking scheme [19]. as recommended in [20], to distribute the clock zone signals, diagonally placed metal wires are buried under the qca design, and a 4-phase clock generator generates the signals. fig. 5(a) shows the uniform generation of the electric field for each clock zone in the res clocking scheme. fabrication techniques can adequately realize these structures for the circuits.

fig. 5 res clocking scheme (a) circuitry that generates the electric fields for the clock (b) extended version.

the clocking scheme block can be replicated as required, depending on the area of the circuit. accordingly, the extended version of res is shown in fig. 5(b). the green arrows show the directions of flow; reverse directional flow is also available side by side for feedback purposes. however, it creates unnecessary white space merely to provide the opposite directions in a circuit. to overcome this problem, a cell at clock zone 1 is fixed just below the top left corner, creating three-way routing options. thus, a three-way information flow is achieved in the res underlying regular clocking scheme. it offers the additional benefit of realizing a three-input majority voter gate in a single clock zone. taking this advantage, the extended design of the composite gate using the res underlying regular clocking scheme is shown in fig.
6(a), and the corresponding simulated waveform is presented in fig. 6(b). the extended circuit using res clocking occupies an area of 0.64 µm2 and uses 261 cells across nine clock zones (latency = 2.25).

fig. 6 (a) proposed composite gate using res underlying clocking scheme, (b) simulated waveform.

table 1 composite gate with clocking vs without clocking
composite gate   cell   latency   area (µm2)   regular   functions generated
fig. 4(a)        88     0.75      0.13         no        (and, or, xor)
fig. 6(a)        261    2.25      0.64         yes       (and, or, xor) with complement

the structural information of the proposed designs is tabulated in table 1. table 1 shows that the circuit developed using the res underlying regular clocking scheme requires a larger area and more cells. in return, it can be fabricated with metal wires buried under the circuit, which makes the circuit practically realizable.

5. applicability and performance analysis

the suitability of a circuit can be justified through its application. this section discusses the best use of the proposed composite gate in qca. at the same time, the performance of the proposed design is also analyzed.

5.1. synthesis of symmetric function

if each variable in a boolean function appears either only in uncomplemented or only in complemented form in its sum of products (sop) expression, the function is called unate. a switching function f(x1, x2,..., xn) of the variables x1, x2,..., xn is termed totally symmetric if it remains invariant under any permutation of the variables [32]. the weight w of a vertex v (a set of variables in which each variable appears only once) is the number of uncomplemented variables in v.
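the definitions of total symmetry and vertex weight given above can be checked numerically for the composite gate outputs. the following python sketch is illustrative only: all helper names (maj, inv, cg_and, cg_or, cg_xor, weight, is_totally_symmetric, true_weights) are assumptions introduced here and are not part of the original design flow; the majority-gate expressions are the ones stated in section 4.

```python
from itertools import permutations, product


def maj(a, b, c):
    """Three-input majority voter: returns 1 when at least two inputs are 1."""
    return int(a + b + c >= 2)


def inv(a):
    """Inverter on a single bit."""
    return 1 - a


# Composite gate outputs written with majority gates and inverters only,
# following the expressions given in Section 4 of the text.
def cg_and(a, b):
    return maj(a, b, 0)


def cg_or(a, b):
    return maj(a, b, 1)


def cg_xor(a, b):
    return maj(maj(a, inv(b), 0), maj(inv(a), b, 0), 1)


def weight(vertex):
    """Weight of a vertex: number of uncomplemented (logic 1) variables."""
    return sum(vertex)


def is_totally_symmetric(f, n):
    """True if f is invariant under every permutation of its n inputs."""
    for v in product((0, 1), repeat=n):
        if any(f(*p) != f(*v) for p in permutations(v)):
            return False
    return True


def true_weights(f, n):
    """Sorted weights of the vertices on which f evaluates to 1."""
    return sorted({weight(v) for v in product((0, 1), repeat=n) if f(*v)})


if __name__ == "__main__":
    for name, f in (("and", cg_and), ("or", cg_or), ("xor", cg_xor)):
        print(name, is_totally_symmetric(f, 2), true_weights(f, 2))
```

for a symmetric function, the value depends only on the weight of the input vertex, which is why the list returned by true_weights is enough to describe each of the three outputs in this sketch.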
total symmetry is described by a set of integers, say a=(x1, x2,..., xn) with a⊂(0, 1, 2,..., n): all the vertices with weight w ∈ a appear in the function. likewise, an n-variable symmetric function is represented as fn(x1, x2,..., xn). if the set a consists only of consecutive integers (xi, xi+1,..., xn), the symmetric function is called consecutive. a function that is both unate and symmetric is called unate symmetric, and it is always consecutive.

5.1.1. 2-input symmetric functions

there are 2^(n+1) − 2 different symmetric functions possible for n variables, excluding the constant functions 0 and 1. thus, the number of symmetric functions of two input variables is 2^(2+1) − 2 = 2^3 − 2 = 6, as shown in table 2.

table 2 two-input symmetric functions
sl no   function
1       a.b
2       a'+b'
3       a+b
4       a'.b'
5       a.b'+a'.b
6       a.b+a'.b'

for 2-input symmetric functions, a single composite gate is sufficient to produce all the two-variable symmetric functions. in this case, no garbage output is produced. a garbage output is defined here as an unused intermediate output generated at any step before the final outcome. to understand this in terms of qca and the majority voter gate, let us analyze the functions below:

function 1: a.b is the and logic of the input variables. it can easily be implemented using a majority voter gate.
function 2: in a'+b', or logic is applied to the complemented forms of the inputs. by de morgan's law, the same function can be written as (a.b)', which is the nand logic of the inputs. inverting the output of function 1 serves the purpose, without any extra logic.
function 3: a+b is the or logic of the input variables. in this case also, a single majority voter gate is enough for the synthesis.
function 4: a'.b' is the and of the complemented forms of the inputs. like the second function, it can also be written as (a+b)', the complement of the or logic of the inputs. similar to function 2, inverting the output of function 3 generates this output.
function 5: the expression a.b'+a'.b ands each input with the complement of the other and ors the two minterms. it is simply the exclusive or of the inputs. in this case, three majority gates are used, following the structure of the function.
function 6: the expression a.b+a'.b' is the complement of x-or, also termed x-nor. here, it is again the inversion of the result produced in function 5.

to sum up, the six (6) symmetric functions of 2 input variables correspond exactly to the outputs of the proposed composite gate and their inverted forms, as shown in fig. 7(a). therefore, this is the simplest form of design, ensuring the fastest operation in the realization of 2-input symmetric functions.

fig. 7 composite gate based realization of 2-variable symmetric functions (a) block diagram (b) schematic diagram.

5.1.2. composite gate based realization

this section explores the realization of 2-input symmetric functions to demonstrate the effective use of the composite gate. the proposed circuit is best suited for purposes where most of the basic logic functions are required simultaneously. the most effective use of the composite gate structure introduced in this work is the case where all the outputs are utilized. 2-input symmetric functions can be realized without any garbage or extra output (refer to the schematic diagram in fig. 7(b)), which reduces the cost of circuit design, and the same can be realized in qca as in fig. 8(a). this circuit (similar to fig.
4) is presented to show the implementation of the 2-input symmetric functions using the proposed circuit, as it is best suited for the purpose. the only difference is that fig. 8(a) includes the inverted forms of the and, or, and x-or logic outputs generated by the design proposed in fig. 4(a).

fig. 8 (a) composite gate based realization of 2-variable symmetric functions, (b) simulation result.

it shows that the realization uses all the outputs of the composite gate and the inverted form of each, without any garbage outputs. no wire crossing is needed in the composite gate-based implementation. however, as mentioned earlier, wire crossing may be required for a cascaded circuit. it can also be noted that the proposed composite gate-based realization of 2-input symmetric functions is the most area-saving, with only a single gate for all the functions, compared to the few existing designs. the simulation result of the corresponding qca-based realization, shown in fig. 8(b), verifies the outputs against the logic of the 2-input symmetric functions.

5.2. performance analysis

a few works in the relevant field have been reported in [32,33,47]. among them, [33] and [47] have proposed layouts for the realization of 2-input symmetric functions. these designs generate garbage outputs, while other designs suffer from wire crossing. still, no single gate has so far been developed that accommodates all basic logic functions and at the same time uses a regular clocking scheme. the analysis and comparison of the effectiveness of the composite gate-based realization with its counterparts in the literature are shown in table 3.
therefore, the efficiency of the composite gate-based realization is established by comparison with the existing designs on features such as the number of wire crossings, gate count, garbage outputs, and support for an underlying clocking scheme. according to the results, analysis, and comparison for the 2-input symmetric functions, the composite gate is the most suitable choice for realizing them. the simulated output for the realization of 2-input symmetric functions, as shown in fig. 8(b), confirms the synthesis.

table 3 performance of composite gate in synthesis of 2-input symmetric functions
gate            wire crossing   gate used   regular clocking   garbage output produced
mv+inv          4               8           no
ulg [47]        2               7           no
uqcalg [33]     0               3           no                 3
and-nand [32]   0               3           no                 2
proposed cg     0               1           yes                0

6. power analysis

this section presents the power dissipated by both proposed designs (with and without clocking). the widely used power estimation tool qcapro has been used to obtain the results [50]. qcapro is capable of handling circuits with a large number of cells. it uses a fast approximation technique to estimate the non-adiabatic switching power loss of a qca circuit. a temperature (qcapro parameter) of 2 kelvin was considered in this research. the power dissipation of the design (average leakage energy dissipation and average switching energy dissipation) is evaluated at three levels of tunneling energy (0.5 ek, 1 ek, and 1.5 ek). the energy consumption maps for the proposed design at a tunneling energy of 0.5 ek are presented in fig. 9. a comparative analysis of the energy dissipation at these tunneling energy levels is recorded in table 4 for the proposed composite gate. as the proposed design is the first of its kind and no comparable design was found in the literature, a comparison was not feasible.

(a) cg without clocking (b) cg using res clocking fig.
9 energy consumption map under tunneling energy 0.5 ek, at 2 k temperature.

table 4 the power dissipation for the proposed composite gate (all values in mev)
design      average leakage energy dissipation   average switching energy dissipation   total energy consumption
            0.5 ek    1 ek     1.5 ek             0.5 ek    1 ek     1.5 ek               0.5 ek    1 ek     1.5 ek
fig. 4(a)   23.77     76.48    140.59             154.69    135.32   115.62               178.46    211.80   256.21
fig. 6(a)   78.52     244.14   439.96             429.15    369.58   312.17               507.67    613.72   752.13

the total energy consumption of the proposed designs is represented graphically in fig. 10. it can be observed that the regular clock-based design requires more cells than the same design without clocking. this may result in a larger area for the design implemented with regular clocking. the graph of total energy dissipation also shows that the use of regular clocking in qca design may lead to higher power dissipation. however, considering the practical design perspective and cost-efficiency, it is preferable to design a circuit with embedded regular clocking rather than to focus only on basic metrics under arbitrary clocking.

fig. 10 total dissipated energy of the proposed circuit designs.

7. conclusion

a coplanar design for a cost-efficient composite logic gate is proposed in this paper. the gate is capable of generating all the significant 2-input basic logic operations such as and, nand, not, or, nor, xor, and xnor. the proposed composite gate is designed in two forms, with and without a regular clocking scheme. this article is the first of its kind to propose a gate that generates all basic logic functions. the simulation results support this claim and agree with the truth tables. apart from that, the proposed design is best suited for direct implementation of 2-input symmetric functions.
a single block of the composite gate is sufficient to realize all the functions. it achieves a gain of 33.33% in gate count, and garbage outputs are eliminated completely in comparison with its counterparts. in other words, the experimental study shows that the proposed design performs better overall. the composite gate design has also been extended with the underlying regular clocking scheme res, which serves to make the gate robust, efficient, and scalable. the functionality of the circuit has been verified using the qcadesigner version 2.0.3 tool. additionally, a comprehensive analysis of dissipated power is presented using the qcapro simulator. we have proposed the design for combinational circuits only. as future work, we plan to apply the technique to sequential circuits.

references

[1] c. s. lent, p. d. tougaw, w. porod, et al., "quantum cellular automata", nanotechnology, vol. 4, no. 1, p. 49, 1993. [2] n. h. weste and d. harris, cmos vlsi design: a circuits and systems perspective. pearson education india, 2015. [3] j. timler and c. s. lent, "power gain and dissipation in quantum-dot cellular automata", j. appl. phys., vol. 91, no. 2, pp. 823–831, 2002. [4] y. lu, m. liu, and c. lent, "molecular quantum-dot cellular automata: from molecular structure to circuit dynamics", j. appl. phys., vol. 102, no. 3, p. 034311, 2007. [5] g. l. snider, a. o. orlov, i. amlani, et al., "quantum-dot cellular automata: line and majority logic gate", jpn. j. appl. phys., vol. 38, no. 12s, p. 7227, 1999. [6] s.-s. ahmadpour, m. mosleh and s. r. heikalabad, "the design and implementation of a robust single-layer qca alu using a novel fault-tolerant three-input majority gate", j. supercomput., vol. 76, pp. 1–31, 2020. [7] p. d. tougaw and c. s. lent, "logical devices implemented using quantum cellular automata", j. appl. phys., vol. 75, no. 3, pp. 1818–1825, 1994. [8] m. goswami, m.
roychoudhury, j. sarkar, et al., "an efficient inverter logic in quantum-dot cellular automata for emerging nanocircuits", arab. j. sci. eng., vol. 45, pp. 1–12, 2019. [9] c.-k. wang, i. yakimenko, i. zozoulenko, et al., "dynamical response in an array of quantum-dot cells", j. appl. phys., vol. 84, no. 5, pp. 2684–2689, 1998. [10] z. chu, h. tian, z. li, et al., "a high-performance design of generalized pipeline cellular array", ieee comput. archit. lett., vol. 19, no. 1, pp. 47–50, 2020. [11] a. chaudhuri, m. sultana, d. sengupta, et al., "a reversible approach to two's complement addition using a novel reversible tcg gate and its 4 dot 2 electron qca architecture", microsyst. technol., vol. 25, no. 5, pp. 1965–1975, 2019. [12] k. hennessy and c. s. lent, "clocking of molecular quantum-dot cellular automata", j. vac. sci. technol. b: microelectron. nanometer struct. process. meas. phenom., vol. 19, no. 5, pp. 1752–1755, 2001. [13] m. goswami, m. r. choudhury, and b. sen, "a realistic configurable level triggered flip-flop in quantum-dot cellular automata", in proceedings of the international symposium on vlsi design and test, 2019, pp. 455–467. [14] b. sen, m. dutta, d. saran, et al., "an efficient multiplexer in quantum-dot cellular automata", in proceedings of the progress in vlsi design and test, 2012, pp. 350–351. [15] y. adelnia and a. rezai, "a novel adder circuit design in quantum-dot cellular automata technology", int. j. theor. phys., vol. 58, no. 1, pp. 184–200, 2019. [16] h. r. roshany and a. rezai, "novel efficient circuit design for multilayer qca rca", int. j. theor. phys., vol. 58, no. 6, pp. 1745–1757, 2019. [17] w. liu, l. lu, m. o'neill, et al., "a first step toward cost functions for quantum-dot cellular automata designs", ieee trans. nanotechnol., vol. 13, no. 3, pp. 476–487, 2014. [18] c. a. t. campos, a. l. marciano, o. p. v. neto, et al., "use: a universal, scalable, and efficient clocking scheme for qca", ieee trans.
comput.-aided design integr. circuits syst., vol. 35, no. 3, pp. 513–517, 2015. [19] m. goswami, a. mondal, m. h. mahalat, et al., "an efficient clocking scheme for quantum-dot cellular automata", int. j. electron. lett., vol. 8, no. 1, pp. 83–96, 2019. [20] v. vankamamidi, m. ottavi, and f. lombardi, "two-dimensional schemes for clocking/timing of qca circuits", ieee trans. comput.-aided design integr. circuits syst., vol. 27, no. 1, pp. 34–44, 2007. [21] m. abutaleb, "robust and efficient quantum-dot cellular automata synchronous counters", microelectron. j., vol. 61, pp. 6–14, 2017. [22] m. raj, l. gopalakrishnan, and s.-b. ko, "design and analysis of novel qca full adder-subtractor", int. j. electron. lett, pp. 1–14, 2020. [23] r. singh, s. s. das, and v. sarada, "design of a compact negative-edge triggered t flip-flop in qca technology", j. electr. eng. technol., vol. 11, no. 2, pp. 139–146, 2020. [24] t. n. sasamal, a. k. singh, and a. mohan, "design of registers and memory in qca", in quantum-dot cellular automata based digital logic circuits: a design perspective, pp. 119– 137, 2020. [25] a. shiri, a. rezai, and h. mahmoodian, "design of efficient coplanar comparator circuit in qca technology", fu elec. energ., vol. 32, no. 1, pp. 119–128, 2019. [26] m. n. divshali, a. rezai, and s. s. f. hamidpour, "design of novel coplanar counter circuit in quantum dot cellular automata technology", int. j. theor. phys., vol. 58, no. 8, pp. 2677–2691, 2019. [27] z. taheri, a. rezai, and h. rashidi, "novel single layer fault tolerance rca construction for qca technology", fu elec. energ, vol. 32, no. 4, pp. 601–613, 2019. [28] y. xianyang and b. guo, "further enumerating boolean functions of cryptographic significance", j. cryptol., vol. 8, no. 3, pp. 115–122, 1995. [29] d. l. dietmeyer, "generating minimal covers of symmetric functions", ieee trans. comput.-aided design integr. circuits syst, vol. 12, no. 5, pp. 710–713, 1993. [30] s. chakrabarti, s. das, d. k. 
das, et al., "synthesis of symmetric functions for path-delay fault testability", ieee trans. comput.-aided design integr. circuits syst., vol. 19, no. 9, pp. 1076–1081, 2000. [31] m. perkowski, p. kerntopf, a. buller, et al., "regular realization of symmetric functions using reversible logic", in proceedings of the euromicro symposium on digital systems design, warsaw, poland, 2001, pp. 245-252. [32] p. k. bhattacharjee, "use of symmetric functions designed by qca gates for next generation ic", int. j. comput. theory eng., vol. 2, no. 2, p. 211, 2010. [33] b. sen, m. dalui, and b. k. sikdar, "introducing universal qca logic gate for synthesizing symmetric functions with minimum wire-crossings", in proceedings of the international conference and workshop on emerging trends in technology, 2010, pp. 828–833. [34] c. s. lent and p. d. tougaw, "a device architecture for computing with quantum dots", in proceedings of the ieee, vol. 85, no. 4, pp. 541–557, 1997. [35] p. d. tougaw and c. s. lent, "dynamic behavior of quantum cellular automata", j. appl. phys., vol. 80, no. 8, pp. 4722–4736, 1996. [36] k. kim, k. wu and r. karri, "quantum-dot cellular automata design guideline", ieice transactions on fundamentals of electronics, communications and computer sciences, vol. 89, no. 6, pp. 1607–1614, 2006. [37] r. devadoss, k. paul and m. balakrishnan, "clocking-based coplanar wire crossing scheme for qca", in proceedings of 23rd international conference on vlsi design, bangalore, india, 2010, pp. 339-344. [38] b. sen, a. nag, a. de, et al., "towards the hierarchical design of multilayer qca logic circuit", j. comput. sci., vol. 11, pp. 233–244, 2015. [39] m. n. divshali, a. rezai and a. karimi, "towards multilayer qca siso shift register based on efficient d-ff circuits", int. j. theor. phys., vol. 57, no. 11, pp. 3326–3339, 2018. [40] b. sen, r. mukherjee, k. mohit, et al., "design of reliable universal qca logic in the presence of cell deposition defect", int. j. 
electron., vol. 104, no. 8, pp. 1285–1297, 2017. [41] b. sen, m. r. chowdhury, r. mukherjee, et al., "reliability-aware design for programmable qca logic with scalable clocking circuit", j. comput. electron., vol. 16, no. 2, pp. 473–485, 2017. [42] a. n. bahar, r. laajimi, m. abdullah-al-shafi, et al., "toward efficient design of flip-flops in quantum-dot cellular automata with power dissipation analysis", int. j. theor. phys., vol. 57, no. 11, pp. 3419–3428, 2018. [43] j. pal, s. bhattacharjee, a. k. saha, et al., "study on temperature stability and fault tolerance of adder in quantum-dot cellular automata", in proceedings of the 5th international conference on signal processing, computing and control (ispcc), solan, india, 2019, pp. 69–74. [44] m. dalui, b. sen, and b. k. sikdar, "fault tolerant qca logic design with coupled majority-minority gate", int. j. comput. appl., vol. 1, no. 29, pp. 81–87, 2010. [45] b. sen, m. dutta, d. k. singh, et al., "qca multiplexer based design of reversible alu", in ieee international conference on circuits and systems (iccas), kuala lumpur, malaysia, 2012, pp. 168–173. [46] m. janez, p. pecar, and m. mraz, "layout design of manufacturable quantum-dot cellular automata", microelectron. j., vol. 43, no. 7, pp. 501–513, 2012. [47] y. xia and k. qiu, "design and application of universal logic gate based on quantum-dot cellular automata", in proceedings of the 11th ieee international conference on communication technology, hangzhou, china, 2008, pp. 335–338. [48] j. pal, p. dutta, and a. k. saha, "realization of basic gates using universal gates using quantum-dot cellular automata", in proceedings of the international conference on computing and communication systems, 2018, pp. 541–549. [49] k. walus, t. j. dysart, g. a. jullien, et al., "qcadesigner: a rapid design and simulation tool for quantum-dot cellular automata", ieee trans. nanotechnol., vol.
3, no. 1, pp. 26–31, 2004. [50] s. srivastava, a. asthana, s. bhanja, et al., "qcapro-an error-power estimation tool for qca circuit design", in ieee international symposium of circuits and systems (iscas), rio de janeiro, brazil, 2011, pp. 2377–2380.

facta universitatis series: electronics and energetics vol. 35, no 2, june 2022, pp. 243-252 https://doi.org/10.2298/fuee2202243d © 2022 by university of niš, serbia | creative commons license: cc by-nc-nd original scientific paper

area and power-efficient reconfigurable digital down converter on fpga

debarshi datta1, himadri sekhar dutta2
1electronics & communication engineering department, makaut kolkata, west bengal, india
2electronics & communication engineering department, kalyani government engineering college, nadia, west bengal, india

abstract. this paper presents a field-programmable gate array (fpga)-based digital down converter (ddc) that can reduce the bandwidth from about 70 mhz to 182.292 khz. the proposed ddc consists of a polyphase coordinate rotation digital computer (cordic) processor and a multirate filter. the advantage of the polyphase cordic processor is that it processes input data at a high sample rate and produces a noiseless baseband spectrum in a computationally efficient way. the pipelined multirate filter works at a high clock speed. moreover, the multirate filter generates a fractional sample rate factor using a cubic b-spline farrow filter. the proposed ddc is coded in optimized hardware description language (hdl) and tested on a kintex-7 xilinx fpga as the target device. experimental results indicate that the proposed design saves chip area and power consumption and operates at high speed without loss of any functionality. additionally, the proposed design offers sufficient spurious-free dynamic range (sfdr) and produces less than 1 hz frequency resolution at the output.
key words: digital down converter (ddc), coordinate rotation digital computer (cordic), half-band (hb) filter, field programmable gate array (fpga), matlab

1. introduction

a high-performance digital down converter (ddc) is essential in modern communication [1]. sample rate reduction plays an important role in data communication systems because of their various data rates. hence, a field-programmable gate array (fpga)-based ddc architecture is attractive, since it is far more flexible than application-specific integrated circuits (asic) [2]. furthermore, an fpga implementation of the ddc performs well in frequency response and phase characteristics, with a high-precision output.

received september 3, 2021; received in revised form december 23, 2021 corresponding author: debarshi datta electronics & communication engineering department, makaut kolkata, west bengal, india e-mail: debarshidatta7@gmail.com

in the last decade, several researchers have reported different hardware-efficient ddc architectures on fpga devices. recently, l. l. motta et al. [3] proposed a digital up-down converter using polyphase cascaded integrator-comb (cic) filters. the simulation results show the functional verification of the filters, and the design achieved high performance using fixed-point filter coefficients. l. guo et al. [4] suggested a parallel ddc architecture using a numerically controlled oscillator (nco). the nco was decomposed into several sinusoidal sequences. these sequences are multiplied by the input signals to produce complex waveforms. the design was verified in matlab and tested on an fpga board. similarly, x. liu et al. [5] proposed a reconfigurable ddc architecture that down-converted a signal of about 3.6 ghz to the output range of 1 ks/s-225 ms/s.
the design was implemented on the xilinx kintex-7 device, and the synthesis results were reported in terms of resources and power consumption. furthermore, b. tietche et al. [6] described fpga-based resampling circuits for software-defined radio applications. the implementation schemes controlled the spurious-free dynamic range (sfdr). v. obradović et al. [7] discussed a flexible ddc architecture for a wideband direction finder. the ddc was tested on xilinx kintex-7 using xilinx ip cores to implement the filter chain. similarly, j. thabet et al. [8] presented a reconfigurable ddc design implemented on a virtex-7 fpga board to obtain high speed and low power consumption. the design reduced complexity for use in multi-standard gnss receivers. furthermore, a. agarwal et al. [9] suggested a coordinate rotation digital computer (cordic)-based ddc on xilinx virtex-6 fpga for multi-standard radio communications and achieved a maximum operating speed of 240 mhz.

however, all the existing designs have drawbacks in hardware implementation: they consume a large area and considerable power on the fpga platform. a cost-efficient reconfigurable ddc architecture is therefore attractive for communication systems, and a hardware-efficient, flexible ddc architecture is required that can meet the needs of practical applications. the proposed design uses a polyphase and pipelined architecture to improve the operating speed. in addition, the truncation process in each unit reduces the area requirements. finally, the proposed design is tested on the xilinx kintex-7 fpga board. the implementation results indicate that the proposed ddc reduces hardware resources and power compared to existing architectures without losing any significant information.

the organization of this paper is as follows: section 2 describes the proposed architecture and its components. results are discussed in section 3.
section 4 concludes the paper.

2. proposed architecture

the proposed ddc consists of a polyphase cordic processor and a multirate filter, as shown in fig. 1. the polyphase cordic processor handles a high-data-rate input signal beyond 1 ghz. the multirate filter stages, namely the multi-stage cic, half-band (hb), and cubic b-spline farrow filters, are connected in cascade to achieve a high decimation factor and to produce the correct baseband spectrum.

fig. 1 proposed ddc architecture

the total sample rate factor $R$ is calculated as

$R = f_s / f_{out} = R_1 \times R_2 \times 2 \times R_3$ (1)

where $f_s$ and $f_{out}$ are the input and output sampling rates, respectively, $R_1$ is the decimation factor of the polyphase cordic processor, $R_2$ is the decimation factor of the multi-stage cic filter, $R_3$ is the decimation factor of the cubic b-spline farrow filter, and the factor 2 comes from the hb filter. the sample rate factors can be changed dynamically in real time to match the application, so the design is highly flexible. the frequency resolution at the output is $f_s/2^{32}$ (= 0.8381 hz for $f_s$ = 3.6 ghz). the following sub-sections describe each component of the proposed design.

2.1. polyphase cordic processor

the polyphase cordic processor can work with the high-sample-rate output of an analog-to-digital converter (adc), typically the adc12d1800. the proposed polyphase cordic processor is shown in fig. 2. each polyphase component operates at a rate of $f_s/R_1$, which makes the polyphase cordic processor more feasible on the fpga platform [10]. to achieve correct output, $f_s$ and $R_1$ must satisfy

$R_1 \le f_s / W$ (2)

where $W$ is the bandwidth of the input signal.

fig. 2 polyphase cordic processor

from the polyphase algorithm, the signal $g_i(n)$ can be represented as

$g_i(n) = x(nR_1 + i)$ (3)

where $i = 0, 1, \ldots, (R_1 - 1)$ and $x(n)$ is the input sequence. hence, the in-phase ($y_I(n)$) and quadrature ($y_Q(n)$) parts of the polyphase cordic processor are expressed as [11]

$y_I(n) = \sum_{i=0}^{R_1-1} g_i(n)\, I_{c_i}(n) = \sum_{i=0}^{R_1-1} x(nR_1 + i) \cos[2\pi (nR_1 + i) f_0 / f_s]$ (4)

and

$y_Q(n) = \sum_{i=0}^{R_1-1} g_i(n)\, Q_{c_i}(n) = \sum_{i=0}^{R_1-1} x(nR_1 + i) \sin[2\pi (nR_1 + i) f_0 / f_s]$ (5)

respectively, where $f_0$ is the central frequency. to eliminate unwanted frequency components and further reduce the sample rate while ensuring a correct output signal, both $y_I(n)$ and $y_Q(n)$ are passed through multirate decimation filters.

2.2. cic filter

the cic filter performs low-pass filtering to remove the multiple image copies and produces a very narrow passband for the ddc system [12]. the cic is a highly efficient decimation filter placed just after the polyphase cordic processor. a multi-stage cic filter is typically used to reduce the sidelobes while maximizing the main-lobe gain [13]. this work uses a pipelined 4-stage cic decimation filter, shown in fig. 3. the additional register in the integrator and comb sections reduces the critical path delay.

fig. 3 4-stage truncated pipeline-based cic filter

the filter gain is calculated as [14]

$G = (R_2 D)^N = 65536$ (for $R_2 = 8$, number of stages $N = 4$, and comb delay $D = 2$) $= 48.16$ db (6)

the full-resolution data width at the output stage is

$B_{out} = [B_{in} + N \log_2(R_2 D)] = 36$ bits ($B_{in} = 20$) (7)

fig. 4 depicts the magnitude response of the cic filter.

fig. 4 magnitude response of the cic filter for $R_2 = 8$, $D = 2$, $N = 4$

generally, the integrators work at a high sample rate with a large data width. hence, truncation is necessary to reduce the word length without losing the desired information. the five least significant bits (lsbs) are truncated from the 36-bit output of the first integrator, so the second integrator works with only 31 bits. following the same procedure, the third and fourth integrators work with only 26 bits and 21 bits, respectively.
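the arithmetic behind eqs. (6) and (7) and the integrator word lengths can be reproduced with a few lines of python. this is only a sketch: the function names are illustrative and not from the paper, the 48.16 db figure is assumed to be $10\log_{10}(G)$ because that is the value it matches numerically, and the square brackets in eq. (7) are read as rounding up.

```python
import math


def cic_gain(r2: int, d: int, n: int) -> int:
    """Gain of an N-stage CIC decimator, eq. (6): G = (R2 * D)^N."""
    return (r2 * d) ** n


def cic_gain_db(r2: int, d: int, n: int) -> float:
    """Gain in dB; 10*log10 is assumed because it reproduces the quoted 48.16 dB."""
    return 10.0 * math.log10(cic_gain(r2, d, n))


def cic_full_width(b_in: int, r2: int, d: int, n: int) -> int:
    """Full-resolution output width, eq. (7): B_in + N * log2(R2 * D), rounded up."""
    return b_in + math.ceil(n * math.log2(r2 * d))


def integrator_widths(full_width: int, stages: int, lsbs_dropped: int) -> list:
    """Word length seen by each integrator when a fixed number of LSBs
    is truncated after every stage (5 LSBs per stage in the text)."""
    return [full_width - lsbs_dropped * k for k in range(stages)]


if __name__ == "__main__":
    print(cic_gain(8, 2, 4))                 # 65536
    print(round(cic_gain_db(8, 2, 4), 2))    # 48.16
    print(cic_full_width(20, 8, 2, 4))       # 36
    print(integrator_widths(36, 4, 5))       # [36, 31, 26, 21]
```

changing `r2`, `d` or `n` shows how quickly the full-resolution width grows, which is the motivation for truncating between stages.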
as a consequence, the truncation process reduces the output data width to 16-bit. usually, the matlab tool provides the data length in each stage. the passband frequency (𝜔𝑝) is π n⁄ 𝑅2.

2.3. hb filter

the cic filter does not provide a flat response, and its non-flatness must be compensated in other processing stages. after the cic filter, the hb filter is used to correct the passband droop [15]. the hb filter is symmetric about the cut-off frequency π/2. fig. 5 shows a 31-tap symmetric hb filter with decimation factor 2.

fig. 5 31-tap transpose symmetric hb filter

the pass-band frequency ($\omega_p$) is 0.45π, and the stop-band frequency ($\omega_s$) is 0.55π. the transpose symmetric hb architecture reduces the number of multiplication units [16], so the computational workload falls significantly. for this work, the hb filter coefficients are 16-bit fixed-point values generated using the “firhalfband” matlab function [17].

2.4. cubic b-spline farrow structure

finally, a cubic b-spline farrow structure is used to produce the fractional-rate output, changing the sample rate by a factor of 3/2. this type of implementation provides a better reconstruction of the signal than conventional lagrange interpolation [18], [19]. the calculation of the cubic b-spline farrow structure is described below. the $N$-th degree b-spline in the time domain is expressed as [14]

$\beta^N(t) = \frac{1}{N!} \sum_{k=0}^{N+1} (-1)^k \binom{N+1}{k} \left(t - k + \frac{N+1}{2}\right)_+^N$ (8)

where $\beta^N$ denotes the $N$-th degree b-spline and $(u)_+^N$ equals $u^N$ for $u > 0$ and zero otherwise. for $N = 3$, the cubic spline, the polynomial becomes

$\beta^3(t) = \frac{1}{6} \sum_{k=0}^{4} (-1)^k \binom{4}{k} (t - k + 2)_+^3$ (9)

$= \frac{1}{6}(t+2)_+^3 - \frac{2}{3}(t+1)_+^3 + t_+^3 - \frac{2}{3}(t-1)_+^3 + \frac{1}{6}(t-2)_+^3$ (10)

the reconstruction spline is the sum of weighted b-spline sequences, expressed as

$y(t) = \sum_k x(k)\, \beta^3(t - k)$ (11)

consider the samples taken at times $t = -1, 0, 1, 2$; from eq.
(11), the four b-spline terms are calculated as

$y(d) = x(n+2)\beta^3(d-2) + x(n+1)\beta^3(d-1) + x(n)\beta^3(d) + x(n-1)\beta^3(d+1)$

$= x(n+2)\frac{d^3}{6} + x(n+1)\left[\frac{1}{6}(d+1)^3 - \frac{2}{3}d^3\right] + x(n)\left[d^3 - \frac{2}{3}(d+1)^3 + \frac{1}{6}(d+2)^3\right] + x(n-1)\left[-\frac{1}{6}(d-1)^3\right]$

$= x(n+2)\frac{d^3}{6} + x(n+1)\left[-\frac{d^3}{2} + \frac{d^2}{2} + \frac{d}{2} + \frac{1}{6}\right] + x(n)\left[\frac{d^3}{2} - d^2 + \frac{2}{3}\right] + x(n-1)\left[-\frac{d^3}{6} + \frac{d^2}{2} - \frac{d}{2} + \frac{1}{6}\right]$ (12)

to realize the above equations in a farrow structure, the factors of the fractional delay powers $d^k$ are collected in the following four equations:

$d^0$: $0 + x(n+1)/6 + 2x(n)/3 + x(n-1)/6 = c_0$
$d^1$: $0 + x(n+1)/2 + 0 - x(n-1)/2 = c_1$
$d^2$: $0 + x(n+1)/2 - x(n) + x(n-1)/2 = c_2$
$d^3$: $x(n+2)/6 - x(n+1)/2 + x(n)/2 - x(n-1)/6 = c_3$ (13)

where $c_0$, $c_1$, $c_2$, and $c_3$ are the spline matrix coefficients and $d$ lies between 0 and 1. the coefficients in eq. (13) are transformed into the z-domain to realize the transfer functions of the farrow filter architecture, as shown in fig. 6. farrow filters are well suited to fractional sample rate conversion because a single programmable fractional delay parameter sets the delay without changing the filter coefficients [19].

fig. 6 cubic b-spline farrow structure [20]

3. result analysis

the following sub-sections describe the result analysis in detail.

3.1. design specifications

the proposed system targets mobile communication specifications. all floating-point data are converted to fixed-point data to achieve the stopband specifications. the specifications of the proposed ddc are summarized as follows:
i. input signal bandwidth: 70 mhz
ii. output signal bandwidth: 182.292 khz
iii. decimation factor: 384 ($R_1 = 4$, $R_2 = 2^5$, hb = 2, $R_3 = 3/2$)
iv. input data width: 16-bit
v. output data width: 20-bit
vi. passband ripple ≤ 0.1 db
vii. stopband attenuation ≥ 80 db

3.2. data truncation

truncation is applied in each signal path to protect against overflow errors. each polyphase branch can be represented as an fir filter.
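the branch structure of eqs. (3)-(5) can be sketched in python as follows. this is an illustrative floating-point model only: it uses `math.cos` and `math.sin` where the paper uses a cordic rotator, applies no truncation, and the function names are not from the paper. each branch holds $g_i(n) = x(nR_1 + i)$, runs at $f_s/R_1$, and contributes one multiply per output sample; the second function mixes at the full rate and sums blocks of $R_1$ samples, and serves only as a reference for comparison.

```python
import math


def polyphase_mixer(x, r1, f0, fs):
    """Eqs. (3)-(5): branch i carries g_i(n) = x(n*r1 + i) at rate fs/r1."""
    branches = [x[i::r1] for i in range(r1)]
    n_out = len(x) // r1
    y_i, y_q = [], []
    for n in range(n_out):
        acc_i = 0.0
        acc_q = 0.0
        for i, g in enumerate(branches):
            phase = 2.0 * math.pi * (n * r1 + i) * f0 / fs
            acc_i += g[n] * math.cos(phase)  # in-phase product of branch i
            acc_q += g[n] * math.sin(phase)  # quadrature product of branch i
        y_i.append(acc_i)
        y_q.append(acc_q)
    return y_i, y_q


def full_rate_reference(x, r1, f0, fs):
    """Mix every sample at rate fs, then sum non-overlapping blocks of r1 samples."""
    mix_i = [v * math.cos(2.0 * math.pi * k * f0 / fs) for k, v in enumerate(x)]
    mix_q = [v * math.sin(2.0 * math.pi * k * f0 / fs) for k, v in enumerate(x)]
    n_out = len(x) // r1
    y_i = [sum(mix_i[n * r1:(n + 1) * r1]) for n in range(n_out)]
    y_q = [sum(mix_q[n * r1:(n + 1) * r1]) for n in range(n_out)]
    return y_i, y_q


if __name__ == "__main__":
    fs, f0, r1 = 8.0, 1.0, 4  # hypothetical values chosen for a short test
    x = [math.cos(2.0 * math.pi * k * f0 / fs) for k in range(32)]
    print(polyphase_mixer(x, r1, f0, fs)[0][:2])
    print(full_rate_reference(x, r1, f0, fs)[0][:2])
```

both functions return the same sequences up to rounding, which is the point of the polyphase form: the arithmetic is identical, but each branch only needs to run at $f_s/R_1$.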
the multiplication-accumulation is described as follows. an $m$-bit binary word in signed 2’s complement fixed-point rational format can take values from the subset $S$ [21]

$S = \{ s/2^{y_1} \mid -2^{m-1} \le s \le 2^{m-1} - 1,\ s \in \mathbb{Z} \}$ (14)

which is represented as p($x_1$, $y_1$), with $x_1 = m - y_1 - 1$ integer bits and $y_1$ fractional bits. using fixed-point arithmetic, the multiplication is calculated as

p($x_1$, $y_1$) × p($x_2$, $y_2$) = p($x_1 + x_2 + 1$, $y_1 + y_2$) or p($x_3$, $y_3$) (15)

let the multiplication and accumulation results be denoted by p($x_3$, $y_3$) and p($x_4$, $y_4$), respectively, so that

p($x_4$, $y_4$) = a($x_3 + \lfloor \log_2(r - 1) \rfloor$, $y_3$) [where $r = R_1 + 1$] (16)

for example, if the input data is p(8, 7) and the coefficient data is a(3, 12), the multiplication and accumulation data are p(12, 19) and p(14, 19), respectively [for $R_1 = 4$]. after word length reduction, the output data is p(14, 19–14), i.e. a(14, 5), with a data word length of (14 + 5 + 1) = 20 bits, which is the input of the cic filter. the output word length of the cic filter is 16-bit [described in section 2.2]. again, the farrow fir output word length is 20-bit.

3.3. fpga implementation

the proposed ddc design is simulated in the xilinx vivado 2017.4 tool and implemented on a kintex-7 xc7k70t-fbg676 with 16-bit input precision to meet the desired specifications. the design is coded in the verilog hardware description language (hdl). additionally, code optimization techniques reduce the logic resources and power [22], [23]. the compilation report contains slices, luts, iob blocks, maximum frequency, and power consumption. table 1 lists the synthesis results for each component of the proposed design.
table 1 resource utilization of each component of the proposed ddc architecture

| synthesis parameters | polyphase cordic processor ($R_1$ = 4) | polyphase cordic processor ($R_1$ = 8) | polyphase cordic processor ($R_1$ = 16) | cic filter ($R_2 = 2^5$) | hb filter (31-tap, decimation by 2) | cubic b-spline farrow filter ($R_3$ = 3/2) |
|---|---|---|---|---|---|---|
| slice registers | 1758 | 3650 | 8521 | 1290 | 1948 | 3182 |
| 6-input luts | 832 | 1975 | 3932 | 556 | 878 | 2185 |
| iobs | 62 | 62 | 62 | 65 | 80 | 86 |
| brams | 2 | 4 | 8 | 0 | 0 | 8 |
| dsp48es | 4 | 8 | 16 | 0 | 0 | 36 |

3.4. validation

for verification, the chipscope outputs are sent back to the matlab r2015a tool. fig. 7 shows an sfdr of 88 db, obtained using 1024 samples with unity signal amplitude.

fig. 7 power spectrum of proposed ddc

3.5. comparison

table 2 compares the proposed ddc design with existing designs. the proposed design uses data truncation to reduce the resources, and this area reduction leads to lower power. moreover, the pipelined version of the proposed architecture enhances the operating speed. relative to the design of liu et al. [5], the area (slices: 13552 to 8178) and power (1.446 w to 0.970 w) are reduced by 39.65% and 32.92%, respectively. the polyphase cordic processor improves the sfdr, which is 88 db. the results suggest that the proposed ddc is an energy-efficient architecture suitable for real-time signal processing applications.

table 2 comparison report of existing architectures and proposed solution

| synthesis parameters | obradović et al. [7] (kintex-7), $f_s$ = 120 mhz, $R$ = 6 | liu et al. [5] (kintex-7), $f_s$ = 3.6 ghz, $R$ = 20 | proposed solution |
|---|---|---|---|
| slices | 37066 | 13552 | 8178 |
| luts | 69499 | 7269 | 4451 |
| brams | not available | 22 | 10 |
| dsp48es | 1034 | 83 | 40 |
| fmax (mhz) | not available | 454.5 | 512 |
| power (w) | not available | 1.446 | 0.970 |
| sfdr (db) | not available | 83.3 | 88 |

4. conclusion

this paper presents a flexible fpga-based ddc architecture that can be matched to a range of digital radio specifications. the proposed design uses a polyphase and pipelined structure, which saves area and improves the operating speed.
the multirate filter performs sample rate reduction and channel filtering with enhanced sensitivity and selectivity. these design techniques increase the operating speed. furthermore, truncation and an optimized coding style are used to improve area efficiency and reduce power. additionally, the proposed design achieves an sfdr of 88 db. thus, the presented ddc design is well suited to real-time applications.

acknowledgement: the authors express their sincere gratitude to makaut for providing the xilinx tools and fpga board.

references
[1] a. v. oppenheim and r. w. schafer, discrete-time signal processing, third edition. prentice hall, 2010.
[2] w. wolf, fpga-based system design. englewood cliffs, nj: prentice hall, 2004.
[3] l. l. motta, b. a. acurio, n. f. t. aniceto and luís geraldo p. meloni, "design and implementation of a digital down/up conversion directly from/to rf channels in hdl", integration, vol. 68, pp. 30–37, sept. 2019.
[4] l. guo, f. tan, p. zhan and h. zeng, "decomposing numerically controlled oscillator in parallel digital down conversion architecture", j. circuits, syst. comput., vol. 26, no. 9, p. 1750126, feb. 2017.
[5] x. liu, x. yan, z. wang, and q. deng, "design and fpga implementation of a reconfigurable digital down converter for wideband applications", ieee trans. on vlsi systems, vol. 25, no. 12, dec. 2017.
[6] b. h. tietche, o. romain, and b. denby, "a practical fpga-based architecture for arbitrary-ratio sample rate conversion", j. sign. process. syst., vol. 78, pp. 147–154, feb. 2015.
[7] v. obradović, p. okiljević, n. kozić and d. ivković, "practical implementation of digital down conversion for wideband direction finder on fpga", sci. tech. rev., vol. 66, no. 4, pp. 40–46, jan. 2016.
[8] j. thabet, r. barrak, n. kamoun, n. khouja and a.
ghazel, "a reconfigurable digital down converter architecture for multistandard gnss receiver", in proceedings of the 14th international symposium on communications and information technologies (iscit), incheon, 2014, pp. 404–408.
[9] a. agarwal, l. boppana and k. r. kodali, "a factorization method for fpga implementation of sample rate converter for a multi-standard radio communications", in proceedings of the 2013 tencon spring, sydney, nsw, 2013, pp. 530–534.
[10] d. datta, p. mitra and h. s. dutta, "fpga implementation of high performance digital down converter for software defined radio", microsyst. technol., vol. 28, pp. 533–542, aug. 2019.
[11] j. e. volder, "the cordic trigonometric computing technique", ire trans. electron. comput., vol. ec–8, pp. 330–334, sept. 1959.
[12] e. b. hogenauer, "an economical class of digital filters for decimation and interpolation", ieee trans. acoust., speech, signal process., vol. assp-29, no. 2, pp. 155–162, april 1981.
[13] q. jing, y. li, and j. tong, "performance analysis of multi-rate signal processing digital filters on fpga", eurasip j. wirel. commun. netw., p. 31, feb. 2019. https://doi.org/10.1186/s13638-019-1349-9.
[14] u. meyer-baese, digital signal processing with field programmable gate arrays, springer, third edition, 2007.
[15] p. p. vaidyanathan and t. q. nguyen, "a “trick” for the design of fir half-band filters", ieee trans. circuits syst., vol. cas–34, no. 3, mar. 1987.
[16] a. n. willson, "desensitized half-band filters", ieee trans. circuits syst.–i: regul. pap., vol. 57, no. 1, pp. 152–167, jan. 2010.
[17] mathworks hdl coder, https://www.mathworks.com/products/hdl-coder.html. accessed 14 aug. 2019.
[18] r. ratan, s. sharma and a. k. kohli, "cubic lagrange polynomial-based designing of efficient interpolators", int. j. electron. lett., vol. 2, no. 1, pp. 8–16, nov. 2013.
[19] c.
farrow, "a continuously variable digital delay element", in proceedings of the ieee international symposium on circuits and systems (iscas88), 1988, pp. 2642–2645.
[20] d. datta, p. mitra and h. s. dutta, "implementation of fractional sample rate digital down converter for radio receiver applications", in proceedings of the devices for integrated circuit (devic), kalyani, 2021, pp. 94–98. http://dx.doi.org/10.1109/devic50843.2021.9455805.
[21] r. yates, "fixed-point arithmetic: an introduction", 2007. available at: https://courses.cs.washington.edu/courses/cse467/08au/labs/l5/fp.pdf.
[22] s. navid shahrouzi and darshika g. perera, "hdl code optimizations: impact on hardware implementations and cad tools", in proceedings of the ieee pacific rim conference on communications, computers and signal processing (pacrim), canada, 2019, pp. 1–9.
[23] z. zulfikar, "novel area optimization in fpga implementation using efficient vhdl code", jurnal rekayasa elektrika, vol. 10, no. 2, pp. 61–66, oct. 2012.

instruction facta universitatis series: electronics and energetics vol. 29, no. 1, march 2016, pp. 1-10 doi: 10.2298/fuee1601001l

enhanced dynamic voltage clamping capability of clustered igbt at turn-off period

hong y. long, mark r. sweet, e. m. sankara narayanan
department of electrical and electronic engineering, university of sheffield, uk

abstract. one of the critical requirements for high power devices is rugged and reliable operation under harsh operating conditions. in this paper, we present the dynamic voltage clamping capability of 3.3 kv field-stop clustered igbt devices under extreme inductive load conditions. the pmos trench gate cigbt structure shows outstanding performance, with a fast turn-off time and low overshoot voltage. further optimization of the current gain of the cigbt structure is analyzed through numerical evaluation. cigbt technology thus achieves a step forward in the safe operating area of high voltage devices.
key words: insulated gate bipolar transistor (igbt), power semiconductor devices, clustered igbt

received june 08, 2015
corresponding author: e. m. sankara narayanan, department of electrical and electronic engineering, university of sheffield, united kingdom (e-mail: s.madathil@sheffield.ac.uk)

1. introduction

similar to short-circuit failure, dynamic latch-up of high voltage igbts represents another practical failure mode during device turn-off under dynamic avalanche conditions. the overshoot of anode voltage that occurs during device turn-off, especially for parallel-connected power modules, is critical for igbt operation and must be kept within the safe operating area (soa). manufacturers and circuit designers have tried to suppress the peak voltage by reducing the anode current turn-off di/dt, by de-rating, and by using voltage clamping circuits or snubbers. however, these methods unavoidably increase the turn-off switching loss, cost and complexity of the system. the self-voltage clamping characteristics of igbts have been reported in [1-4]. the device must be capable of absorbing all the energy stored in the inductance during abnormal conditions [5]. it is important to develop igbts that survive without destruction even under dynamic avalanche conditions [6].

during turn-off, the abrupt reduction of gate voltage stops the injection of electrons from the n-channel. the anode current continues to flow due to the inductive load and must be sustained by the hole current. the flowing holes modify the effective carrier concentration in the n-drift region. the profile of the electric field is determined by the poisson equation in eq. (1)

$\frac{dE}{dx} = \frac{q N_{eff}}{\varepsilon_s}$ (1)

wherein $N_{eff}$ is the effective carrier concentration in the n-drift region, $q$ is the elementary charge and $\varepsilon_s$ is the permittivity of silicon. these extra carriers lead to an increase in $N_{eff}$.
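a rough numerical feel for this effect can be obtained with the python sketch below. it rests on assumptions that are not stated in the paper: the one-dimensional form $dE/dx = qN_{eff}/\varepsilon_s$ of the poisson equation, $N_{eff} = N_D + p$ with electrons neglected once the channel has shut off, holes moving at a saturated velocity of about $10^7$ cm/s so that $p = J_p/(q v_{sat})$, and purely hypothetical values for the drift doping and current density.

```python
Q = 1.602e-19               # elementary charge (C)
EPS_SI = 11.7 * 8.854e-14   # permittivity of silicon (F/cm)


def hole_density(j_p: float, v_sat: float = 1.0e7) -> float:
    """Hole density (cm^-3) needed to carry current density j_p (A/cm^2)
    when holes drift at the saturated velocity v_sat (cm/s, assumed value)."""
    return j_p / (Q * v_sat)


def field_slope(n_eff: float) -> float:
    """Electric field gradient dE/dx (V/cm^2) for an effective charge
    concentration n_eff (cm^-3), from the 1-D Poisson equation."""
    return Q * n_eff / EPS_SI


if __name__ == "__main__":
    n_d = 2.0e13    # hypothetical n-drift doping (cm^-3)
    j_p = 100.0     # hypothetical hole current density (A/cm^2)
    p = hole_density(j_p)
    print(f"hole density: {p:.3e} cm^-3")
    print(f"slope without holes: {field_slope(n_d):.3e} V/cm^2")
    print(f"slope with holes:    {field_slope(n_d + p):.3e} V/cm^2")
```

with these illustrative numbers the mobile holes outnumber the background doping, so the field gradient steepens severalfold for the same depletion width.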
the increase in $N_{eff}$ modifies the electric field profile and may force the device into dynamic avalanche through the high peak electric field. this process is stable if the additionally generated electrons and holes are balanced in number; it then continues until all the remaining excess carriers are removed, at which point the dynamic avalanche ends abruptly. otherwise, the avalanche-generated carriers can drive the process out of control and lead to device failure. due to the stray inductance in the circuit, the igbt anode voltage overshoots, and eventually the electric field can punch through the n-drift region. when the anode voltage reaches the dc bias voltage, the anode current begins to fall as the current transfers to the diode at a rate depending on the stray inductance and the peak anode voltage. the capability of power devices to dissipate the large amount of power generated during this period could be improved by employing a high igbt internal pnp gain, β [1]. more holes then balance the effective carriers in the n-drift region, but this approach increases the turn-off loss and the off-state leakage current.

in this paper, we demonstrate the dynamic avalanche ruggedness of the 3.3 kv field-stop clustered igbt (cigbt) [7-10] with self-voltage clamping capability. the technology shows improved safe and efficient operation and eases design constraints at the system level.

2. self-clamped inductive switching capability

2.1. device structure

the cigbt is a mos-bipolar device employing a controlled thyristor concept to significantly reduce the on-state voltage drop. it has the unique capability to clamp the cathode cell potential by punch-through of an n-well region between the p-base and p-well, termed “self-clamping”. this feature improves the current saturation characteristics and enables better short-circuit performance [11].
the single-cell schematic structures of the 3.3 kv class conventional cigbt, pmos trench gate cigbt and field-stop igbt are shown in fig. 1(a)-(c), respectively. the igbt structure model is optimized for comparison purposes [11]; as a result, all structures have the same cell dimensions. the pmos trench cigbt [12], fig. 1(b), is identical to the conventional cigbt, fig. 1(a), except that a pmos trench gate (width = 1 µm, depth = 4 µm) connects the p-base to the gate. the pmos and nmos gates are connected together to form a three-terminal device. the pmos channels conduct only during the turn-off cycle, when the gate voltage is negative, and provide a path for the hole current. a constant lifetime of 50 µs is chosen for both electrons and holes, and it is assumed that the edge termination has no impact on device performance under this condition. the simulated cigbt structures consider only one full cell, although in reality each cluster can consist of 50 to 100 cathode cells.

fig. 1 schematic structure diagram of (a) planar gate cigbt, (b) planar gate cigbt with deep pmos trench channel and (c) conventional planar gate igbt.

2.2. device turn-off performance

the 3.3 kv fs cigbt and igbt structures shown in fig. 1 are simulated to compare their capability to clamp voltage under extreme conditions. the circuit configuration for the inductive turn-off is shown in fig. 2. the devices are turned off at vdc = 2500 v, ianode = 150 a and tj = 25 ˚c. a large stray parasitic inductance of lc = 2.4 µh is also included in the circuit. it is important to point out that no gate resistor is used in the circuit. conventional technology normally requires a large gate resistance to suppress dynamic avalanche generation, but the turn-off loss then increases, mainly due to the reduced dv/dt and longer turn-off time [13].
a further increase in turn-off losses, together with the de-rating applied to power devices, causes a significant loss in soa capability. the reduction of rg in the new technology provides much lower power losses and a shorter delay time during the turn-off transient compared to conventional technology.

fig. 2 circuit setup of inductive load turn-off simulation.

fig. 3 igbt and cigbt turn-off waveforms (vdc = 2500 v, ia = 150 a, tj = 25 ˚c, solid line: anode voltage; dashed line: anode current).

fig. 3 shows the turn-off waveforms of the planar gate cigbt, pmos trench gate cigbt and conventional igbt. the maximum voltage peak across the igbt during the transient is about 400 v higher than that of the cigbt devices and is associated with strong voltage oscillation. the planar gate cigbt has a slower dv/dt than the igbt. this is because the cigbt has several times higher conductivity modulation of the n-drift region due to thyristor conduction [10], so it takes longer to remove the excess carriers from its n-drift region. on the other hand, the low dv/dt helps to keep current and voltage within the soa and eases the power stress across the device, so a lower voltage peak and less oscillation are observed. the pmos trench gate cigbt performs best, displaying both a fast turn-off time and a low voltage peak compared to the other two structures.

the current flow lines of the planar gate igbt, planar gate cigbt and pmos trench gate cigbt at 200 ns after gate turn-off, when the mos channel of these devices has cut off and the devices enter dynamic avalanche in the n-drift region, are shown in fig. 4(a)-(c), respectively. the cigbt devices behave differently from the igbt because their current is carried by a controlled thyristor. the holes within the p-well region flow through the depleted n-well at the saturated hole velocity and are collected at the cathode contact.
it should be noted that the n-well is completely depleted when the anode voltage exceeds the self-clamping value of the n-well. avalanche-generated electrons and holes can also be identified from the laterally directed current flow lines. the pmos trench gate conducts during the turn-off period, when the gate voltage goes negative. it extracts the holes vertically through the trench gate channels and enhances the capability of the cigbt to remove charge underneath the cathode region. a lower current density can thus be achieved.

fig. 4 current flow lines of (a) igbt, (b) cigbt, and (c) pmos trench gate cigbt structure at time = 200 ns.

after the gate voltage is turned off, the dc voltage is supported within the structure by the formation of the depletion region. depending on the concentration of the excess carriers in the depletion region, the width of the depletion region expands with time, allowing the device to support a larger anode voltage. the electric field profiles in the n-drift region during the turn-off period are plotted in fig. 5. the electric field expands towards the anode contact to support higher voltages and eventually punches through to the n-buffer region at the maximum clamped voltage. it should be noted that the electric field peaks at the cathode side lie at different positions because the forward blocking voltage is supported by the p-base/n-drift junction in the igbt, whereas it is supported by the p-well/n-drift junction in the cigbt devices.

fig. 5 simulated electric field distribution of the structures after gate turn-off (solid line: time = 200 ns, dash line: time = 400 ns, and dotted line: time at maximum anode voltage).

fig. 6 simulated effective carrier concentration of the structures based on the results shown in fig. 5 (solid line: time = 200 ns, dotted line: time = 400 ns, and dash line: time at maximum anode voltage).
the electric field of the planar gate cigbt expands at a slower rate than in the other two devices. this can be explained by the neff concentration across the structures, as shown in fig. 6. in the planar gate cigbt, a significantly higher portion of the carriers is concentrated at the cathode side than in the other two structures. over time, neff moves towards the anode contact and becomes more evenly distributed across the whole region. it is important to notice that, with the help of the pmos trench gate, the number of holes at the cathode side is greatly reduced in comparison to the conventional cigbt. this technology provides an efficient way to remove excess carriers.

3. optimization of current gain of fs-cigbt

the self-clamping of the overshoot voltage can be achieved by optimizing the n-buffer layer in fs technology. this results in the larger soa required for high voltage devices. as in the short-circuit condition, the self-clamped voltage is influenced by the internal pnp current gain, βpnp, which is a function of the anode emitter efficiency, γanode, the base transport factor, αt, and the effective n-buffer thickness, weff, as stated in eq. (2).

( ⁄ ) (2)

where lp is the hole diffusion length, which depends on the carrier mobility, lifetime and temperature. the fs cigbt therefore requires an optimum βpnp to enable the device to withstand the overshoot voltage successfully. the 3.3 kv planar gate fs cigbt is simulated under the same circuit configuration as in section 2.2 to determine the influence of the pnp current gain on the dynamic clamping performance of the cigbt with different n-buffer thicknesses and anode peak doping concentrations. fig. 7 shows the turn-off waveforms with n-buffer thicknesses varying from 5 µm to 30 µm with the same peak concentration of $5.0 \times 10^{15}$ cm$^{-3}$.
it can be observed that the dv/dt of the anode voltage after mos channel turn-off is greatly influenced by the n-buffer thickness. reducing the n-buffer thickness also reduces the overshoot voltage, but at the expense of a longer current fall time during the transient. in comparison, the turn-off waveforms of the structure with varying anode peak concentration, at a constant n-buffer thickness (15 µm) and doping concentration ($5.0 \times 10^{15}$ cm$^{-3}$), are shown in fig. 8. as expected from the increase in current gain produced by a higher peak anode doping concentration, a reduced self-clamped voltage is achieved at the expense of turn-off loss.

fig. 9 illustrates the peak power density during the turn-off transient as a function of n-buffer thickness and anode peak doping concentration. the peak power density decreases with decreasing n-buffer thickness. the same trend can be found for the igbt plotted for comparison. because a thinner buffer increases the number of holes injected into the n-drift region during the transient, the peak power density is reduced. however, the reduction is less significant when the n-buffer thickness is less than 15 µm. other constraints, such as stray inductance and carrier mobility, limit further improvement in the peak power density once there are sufficient holes to maintain a normal electric field distribution in the n-drift region. in the case of the igbt, the peak power density is higher for the same n-buffer thickness due to a higher electric field peak across the n-drift region than that exhibited by the cigbt, as explained in the previous section.

fig. 7 cigbt turn-off waveforms with variable n-buffer thickness from 5 µm to 30 µm (vdc = 2500 v, ia = 200 a, tj = 25 ˚c, rg = 0 ω).

fig. 8 cigbt turn-off waveforms with variable anode peak concentration (vdc = 2500 v, ia = 200 a, tj = 25 ˚c, rg = 0 ω).
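the sensitivity of the pnp gain to buffer thickness can be illustrated with the python sketch below. it does not reproduce the authors' eq. (2): it assumes the common textbook relations $\alpha_T = 1/\cosh(W_{eff}/L_p)$ and $\beta_{pnp} = \gamma\alpha_T/(1 - \gamma\alpha_T)$, which may differ from the expression used in the paper, and the emitter efficiency and diffusion length values are hypothetical.

```python
import math


def base_transport_factor(w_eff_um: float, l_p_um: float) -> float:
    """Textbook base transport factor: alpha_T = 1 / cosh(W_eff / L_p)."""
    return 1.0 / math.cosh(w_eff_um / l_p_um)


def pnp_gain(gamma_anode: float, w_eff_um: float, l_p_um: float) -> float:
    """Textbook common-emitter gain: beta = alpha / (1 - alpha),
    with alpha = gamma_anode * alpha_T."""
    alpha = gamma_anode * base_transport_factor(w_eff_um, l_p_um)
    return alpha / (1.0 - alpha)


if __name__ == "__main__":
    gamma = 0.6   # hypothetical anode emitter efficiency
    l_p = 10.0    # hypothetical hole diffusion length in the buffer (um)
    for w in (5.0, 10.0, 15.0, 20.0, 30.0):
        print(f"w_eff = {w:4.1f} um -> beta_pnp = {pnp_gain(gamma, w, l_p):.3f}")
```

under these assumptions the gain falls steeply as the buffer thickens, while it scales far more gently with the emitter efficiency, which is the qualitative behaviour the discussion of n-buffer thickness and anode doping relies on.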
for a constant n-buffer thickness of 15 µm, the peak power density of the cigbt with increasing peak anode doping concentration is also plotted in the same figure. a higher anode peak concentration also increases the pnp current gain, but the peak power density shows only a slight reduction compared to that obtained by varying the n-buffer thickness. this is because, as eq. (2) suggests, βpnp changes exponentially with the n-buffer thickness, whereas it changes only linearly with γanode. the trade-off between turn-off power loss and maximum self-clamped voltage is plotted in fig. 10 for n-buffer thicknesses from 5 µm to 30 µm. by controlling the n-buffer thickness, the trade-off between voltage clamping capability and turn-off loss can be optimized. as can be concluded from the above results, the 3.3 kv fs cigbt device exhibits good voltage clamping capability and turn-off loss performance.

fig. 9 peak power density during turn-off.

fig. 10 turn-off loss and clamped voltage dependence on the n-buffer thickness.

4. conclusion

this paper has shown the dynamic voltage clamping capability of the planar gate cigbt, pmos trench gate cigbt and conventional igbt under extreme stray inductance and zero gate resistance. the removal of the excess charge stored in the n-drift region determines the turn-off time and the maximum clamped voltage. the pmos trench gate provides a more efficient way to extract holes through the induced p-channel when the gate voltage goes negative. among the three types of structure simulated, it exhibits the lowest losses, the fastest turn-off time and the smoothest switching waveforms. the self-voltage clamping feature of the cigbt can be further improved through structural optimization of the internal pnp current gain. a high current gain gives better over-voltage protection, but increases the turn-off power loss.
a low current gain should also be avoided, as it shifts the peak electric field from the cathode to the anode side and induces oscillation during the process. the simulation analysis has shown that greater optimization of the performance of fs devices is achieved through the freedom provided by the n-buffer than by npt technology. there is a considerable impact on the soa capability and power losses of the fs cigbt. the new protection feature of the fs cigbt can simplify the system design and offer greater optimization of the performance of high voltage devices.

references
[1] m. rahimo, a. kopta, s. eicher, u. schlapbach, and s. linder, "a study of switching-self-clamping-mode "sscm" as an over-voltage protection feature in high voltage igbts," in proceedings of the 17th international symposium on power semiconductor devices & ics, pp. 67-70, 2005.
[2] a. rahimo, a. kopta, s. eicher, u. schlapbach, and s. linder, "switching-self-clamping-mode "sscm", a breakthrough in soa performance for high voltage igbts and diodes," ispsd '04, in proceedings of the 16th international symposium on power semiconductor devices & ics, pp. 437-440, 2004.
[3] m. otsuki, y. onozawa, s. yoshiwatari, and y. seki, "1200v fs-igbt module with enhanced dynamic clamping capability," ispsd '04, in proceedings of the 16th international symposium on power semiconductor devices & ics, pp. 339-342, 2004.
[4] j. yedinak, j. wojslawowicz, b. czeck, r. baran, d. reichl, d. lange, p. shenoy, and g. dolny, "enhanced igbt self clamped inductive switching (scis) capability through vertical doping profile and cell optimization," in proceedings of the 14th international symposium on power semiconductor devices & ics, pp. 289-292, 2000.
[5] j. yedinak, j. merges, j. wojslawowicz, a. bhalla, d. burke, and g. dolny, "operation of an igbt in a self-clamped inductive switching circuit (scis) for automotive ignition," ispsd '98, in proceedings of the 10th international symposium on power semiconductor devices & ics, pp. 399-402, 1998.
[6] j. lutz and r. baburske, "dynamic avalanche in bipolar power devices," microelectronics reliability, vol. 52, pp. 475-481, mar 2012.
[7] e. m. s. narayanan, m. r. sweet, n. luther-king, k. vershinin, o. spulber, m. m. de souza, and j. v. s. c. bose, "a novel, clustered insulated gate bipolar transistor for high power applications," in proceedings of the international semiconductor conference, cas 2000, vols 1 and 2, pp. 173-181,542, 2000.
[8] m. sweet, n. luther-king, s. t. kong, e. m. s. narayanan, j. bruce, and s. ray, "experimental demonstration of 3.3kv planar cigbt in npt technology," ispsd '08, in proceedings of the 20th international symposium on power semiconductor devices & ics, pp. 48-51, 2008.
[9] k. vershinin, m. sweet, o. spulber, s. hardikar, n. luther-king, m. m. de souza, s. sverdloff, e. m. s. narayanan, and d. hinchley, "influence of the design parameters on the performance of 1.7kv, npt, planar clustered insulated gate bipolar transistor (cigbt)," ispsd '04, in proceedings of the 16th international symposium on power semiconductor devices & ics, pp. 269-272, 477, 2004.
[10] n. luther-king, e. m. s. narayanan, l. coulbeck, a. crane, and r. dudley, "comparison of trench gate igbt and cigbt devices for increasing the power density from high power modules," ieee transactions on power electronics, vol. 25, pp. 583-591, mar 2010.
[11] h. y. long, l. ngwendson, e. sankara narayanan, and m. sweet, "numerical evaluation of the short-circuit performance of 3.3-kv cigbt in field-stop technology," ieee transactions on power electronics, vol. 27, pp. 2673-2679, 2012.
[12] n. luther-king, m. sweet, and e. m. s. narayanan, "performance of a trench pmos gated, planar, 1.2 kv clustered insulated gate bipolar transistor in npt technology," in proceedings of the 21st international symposium on power semiconductor devices & ics, pp. 164-167, 2009.
[13] t. ogura, h. ninomiya, k. sugiyama, and t.
inoue, "turn-off switching analysis considering dynamic avalanche effect for low turn-off loss high-voltage igbts," ieee transactions on electron devices, vol. 51, pp. 629-635, apr 2004.

facta universitatis series: electronics and energetics vol. 29, no 2, june 2016, pp. 193-204 doi: 10.2298/fuee1602193p

an intelligent power mosfet driver asic circuit with additional integrated safety operation functionality

jurij podržaj1, janez trontelj2
1 letrika lab d.o.o., sempeter pri gorici, slovenia
2 university of ljubljana, faculty of electrical engineering, laboratory for microelectronics, ljubljana, slovenia

abstract. this paper extends the previously presented conference paper [1] on a power mosfet driver asic with an intelligent driving algorithm for modern power mosfet devices. the intelligent driving algorithm concept proposes driving the power mosfet gate with a controlled source/sink current from the power mosfet driver circuit. such an approach enables greater control over the power mosfet operation, especially during switching events. in addition to the previously published work, this paper presents the implementation of the intelligent driving algorithm and the driver safety functions on a single integrated asic. the paper concludes with a presentation of some functions of the asic manufactured in cmos technology.

key words: mosfet driver, driving algorithm, asic, safety functions

1. introduction

common awareness of preserving a clean environment and trends towards low energy consumption affect our daily lives in many ways. one example can be recognized in the increasing use of different types of modern electrical motor controllers in various applications (e.g. electrical cars). the implementation of electrical motor controllers offers higher conversion efficiency of electrical energy to mechanical energy (e.g.
movement, rotation), especially if standard propulsion systems using fossil fuels are replaced. a large assortment of motor controllers for various drive applications exists on the market. the motor controllers are designed for many voltage/current ratings, with power ratings varying from several watts up to several megawatts. in many applications, speed control of the attached electrical motor is achieved through pulse width modulation (pwm).

received january 12, 2015; received in revised form december 7, 2015
corresponding author: jurij podržaj, letrika lab d.o.o., polje 15, 5290 sempeter pri gorici, slovenia (e-mail: jurij.podrzaj@si.mahle.com)

it must also be mentioned that other driving control algorithms [2], [3], [4] exist. their selection and implementation depend on the application. in such motor controllers, power semiconductor devices (e.g. mosfet, igbt) are usually switched at frequencies from several khz up to 20 khz. for battery-powered electrical drive systems, used in electrical cars and other light facilities vehicles, power mosfet devices dominate due to their high switching current capabilities and low voltage operation. mosfet devices, with their low on-state resistance rdson, low freewheeling diode forward voltage vsd, acceptably low switching power losses, and acceptably low temperature dependence, are preferred at low operating voltages, even when high switching current densities must be handled. in order to handle high current densities, the topology of connecting several power mosfet drivers in parallel is commonly used. in addition to the aforementioned electrical properties of the power mosfet device, the power controller designer must also be familiar with the mechanisms of power loss sources.
the analysis of power mosfet device power loss sources under various operating conditions is described in more detail in [5], [6], [7], [8] and many other papers. the power losses of a switching mosfet device can be divided into two parts. the first part originates from operation of the device in static conditions (rdson, leakage, vds). the second part arises when the power device is being switched and is defined as dynamic power losses. the static power losses are correlated with the electrical properties of the power mosfet device, which are defined by the manufacturer’s process capabilities and can be found in the device’s datasheet. in addition to the device’s electrical properties and operation mode, a non-negligible part of the power loss can also originate from the packaging and the ambient temperature, as described in papers [9], [10], [11], which analyze the influence of the package on the power losses and power mosfet performance. the dynamic part of the total power loss needs to be studied case by case in more detail, since many factors influence the power device efficiency. a major part of the dynamic losses depends on the switching characteristics of the connected power device. an important contribution to the dynamic part of the total power loss originates from the driver circuit operation, analyzed in [13], [14], and from how efficiently the power device is switched. this paper should be considered as a continuation of the previously published work [1], where some design ideas and techniques of the intelligent power mosfet driver were addressed. the previous work described in [1] was mostly focused on the analysis of various influences on the power mosfet device parameters which directly and indirectly influence the introduced switching algorithm.
in addition, the paper introduces further functionality which would increase the safety level of the system in which such a power mosfet driver is implemented. this paper presents the implementation of the proposed intelligent driving algorithm in the form of an integrated circuit (asic) and introduces some simulation and measurement results of the prototyped integrated circuit.

2. power mosfet switching

the power mosfet device is a voltage-controlled device in which the connected gate signal controls the device operation mode. many different power mosfet devices with various voltage/current/power ratings, static and dynamic properties, technologies, packages etc. are available on the market. the selection of the power mosfet device strongly depends on the power mosfet’s operation mode and end application. consequently, the designer must also consider how adequate control over the power mosfet device will be achieved. there are many automotive applications with a power mosfet device, such as various inverters/converters (e.g. dc-ac). the complete system operation requires an engine control unit (ecu), where a microcontroller or dsp is usually employed. the power available on the i/o pin is limited because microcontrollers are low voltage and low power devices. a driver therefore usually has to be placed between the microcontroller output pin and the power mosfet control pin (gate). in other words, the driver can be defined as an electrical circuit which provides conditioned signals with adequate electrical energy for the efficient driving of the connected power mosfet device. when selecting a driver for controlling a power mosfet device, or even several power mosfets connected in parallel, the designer has to consider many parameters which can influence the switching performance and consequently the performance characteristics of the target application.
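why a dedicated driver is needed between the microcontroller pin and the gate can be illustrated with a first-order estimate. the sketch below is not taken from the paper: it uses the standard relations p = qg·vdrive·fsw for the average gate drive power and i = qg/tsw for the average gate current needed to complete a transition within tsw. the gate charge, the drive voltage and the target switching time are hypothetical values chosen only for illustration; the 20 khz switching frequency is the upper value mentioned in the introduction.

```python
def gate_drive_power(q_gate, v_drive, f_sw):
    """Average power taken from the driver supply to charge and
    discharge the gate once per switching period, in watts."""
    return q_gate * v_drive * f_sw


def gate_current_for_switch_time(q_gate, t_switch):
    """Average gate current needed to move q_gate within t_switch, in amperes."""
    return q_gate / t_switch


if __name__ == "__main__":
    Q_GATE = 200e-9    # C, hypothetical total gate charge (not from the paper)
    V_DRIVE = 12.0     # V, assumed gate drive amplitude
    F_SW = 20e3        # Hz, upper switching frequency mentioned in the text
    T_SWITCH = 100e-9  # s, hypothetical target transition time

    print(f"average gate drive power: {gate_drive_power(Q_GATE, V_DRIVE, F_SW) * 1e3:.1f} mW")
    print(f"required gate current:    {gate_current_for_switch_time(Q_GATE, T_SWITCH):.2f} A")
```

under these assumptions the average power is small, but the current needed during the transition is in the ampere range, far beyond what a typical microcontroller pin can deliver.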
2.1. mosfet device parameters

the simulation circuit shown in fig. 1 was used to obtain the simulation results presented in fig. 2. simulations were performed in a spice environment.

[fig. 1 schematic, recoverable annotations: gate pulse source vdri = pulse(0 12 1u 1n 1n 1u 2u) with rser={rdri}; rdown 10k; rgs_down 10ohm; vbat 60v; rl 1ohm; device u1 ipb017n06n3 with tj and tcase pins; diode d1; nodes vgs, vds, vdri; directives .tran 0 3u 0 1n, .step param dvth list -1 0 1, .param dc 0, .param ls 1.8n, .param ld 1n, .param lg 4n]

fig. 1 spice simulation circuit

the switching performance depends strongly on the selected power mosfet device, especially on the device structure and process deviations, over which the application designer has no influence. the following simulation therefore shows the deviation of only a single parameter, in this case the gate threshold voltage vth. the simulation results presented in fig. 2 indicate the signal behavior when only the device threshold voltage vth is varied.

[fig. 2 plot: v(vgs) and ix(u1:gate) in the upper graph, v(vds) and ix(u1:drain) in the second graph; traces for min, typ and max vth]

fig. 2 switching waveforms simulated with single ipb017n06n3 device in package

fig. 2 presents spice simulation results of the trench mosfet device (ipb017n06n3), where the voltage v(vgs) and current ix(u1:gate) waveforms of the mosfet control pin (gate) are shown in the upper graph. the second graph shows the signals v(vds) and ix(u1:drain) of the connected mosfet device. the results of both graphs shown in fig.
2 show how the signals change when only the gate threshold voltage vth is varied between the device’s specified minimum (“min vth”), typical, and maximum (“max vth”) values. these results cover the variation of only a single device process parameter. for further analysis of the power mosfet switching behaviour, other process variations and the interactions between parameters should also be analysed.

2.2. driver properties

the driver of the power mosfet device provides a conditioned control signal to the gate of the power mosfet device in order to control the operation of the connected device. the voltage/current and power capacities of the driver, whether realized as an asic or as an electrical circuit with multiple components, have to be tailored to the power mosfet device or to a network of such devices (usually connected in parallel in order to obtain higher power density capabilities). the properties and functionality of the selected driver circuit have, in addition to the mosfet device properties and the application layout design, an important impact on the power mosfet switching, performance (losses and efficiency), emi behaviour, bill-of-material (bom), size, and price. the designer has to consider the influences of the aforementioned construction parts in order to meet the specified end application requirements, such as emi compatibility, power density per unit of volume, and many others. in fig. 3 the simulation results of power mosfet device operation are shown for different driver output current capabilities. the simulation circuit (fig. 1) and the description of the presented graphs are identical to those of the previously presented simulation results. the driver output current capability is controlled by adjusting the internal driver resistance rdri (0Ω, 12Ω and 24Ω).
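to first order, the effect of rdri on the gate drive can be sketched with a simple rc model. this is an illustration under stated assumptions and not the paper’s spice model: the gate is treated as a constant capacitance, and the drive voltage, the remaining gate-loop resistance and the capacitance are hypothetical values. only the three rdri values come from the text.

```python
V_DRIVE = 12.0      # V, assumed drive amplitude
R_GATE_LOOP = 10.0  # ohm, hypothetical gate-loop resistance excluding rdri
C_GATE = 10e-9      # F, hypothetical constant gate capacitance


def peak_gate_current(r_dri):
    """Initial gate current at the start of a transition, in amperes."""
    return V_DRIVE / (r_dri + R_GATE_LOOP)


def gate_time_constant(r_dri):
    """RC time constant of the gate charging path, in seconds."""
    return (r_dri + R_GATE_LOOP) * C_GATE


if __name__ == "__main__":
    for r_dri in (0.0, 12.0, 24.0):  # rdri values used in the simulation
        print(f"rdri = {r_dri:4.1f} ohm: "
              f"peak current {peak_gate_current(r_dri):.2f} A, "
              f"time constant {gate_time_constant(r_dri) * 1e9:.0f} ns")
```

a larger rdri lowers the available gate current and lengthens the time constant, which is the qualitative trend behind slower du/dt and di/dt edges and longer delays.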
the resistance rdri influences the switching du/dt and di/dt performance of the power mosfet device, as seen in the lower graph of fig. 3. the impact of the driver’s current capability is even more noticeable when the power mosfet device is being switched ‘off’. not only is the du/dt and di/dt behaviour considerably altered, but additional time delays also occur. those time delays result not only in an increase of switching power losses, but also in a limited available minimal time pause (‘dead-time’). the ‘dead-time’ needs to be controlled and adjusted in cases where the operation of two or more power switches (e.g. h-bridge circuit topology) can lead to uncontrolled coincident operation. this can establish a short circuit path between the connected power source terminals. when a short circuit path through the series connected power switches occurs, it can lead to a sudden increase in the power source current, increased power losses, and elevated temperatures. in some cases this can also lead to power switch destruction and even application failure.

[fig. 3 plot: v(vgs) and ix(u1:gate); v(vds) and ix(u1:drain); traces for rdri = 0Ω, 12Ω and 24Ω]

fig. 3 switching waveforms of the power mosfet with different driver current capabilities (simulated with driver internal resistance rdri of 0Ω, 12Ω and 24Ω)

in addition to the previously mentioned power mosfet device and driver selection, the board layout also plays an important role in the power system design. the electrical and thermal properties of the board used for the final assembly of the power system components should be considered with respect to the expected operation of the power mosfet device and the end application environment conditions (e.g. ambient temperature, storage, mechanical loads, emi, etc.). a detailed analysis of the board layout is outside the main scope of this paper, so the direct and indirect effects of the board layout on the power mosfet device and driver operation are not discussed in detail here. some further approaches to the analysis of the power module layout are highlighted in [15], [16], [17].

3. asic driver concept

in the following section some of the design ideas and techniques of an intelligent power mosfet asic driver are introduced. this concept emphasizes only some highlighted approaches to power mosfet driving and to the implementation of protection and safety functionality. an idea for the asic implementation is also introduced.

3.1. power mosfet driving

the driving of the power mosfet device directly influences the device’s performance, operational reliability and lifetime, and effects related to emi susceptibility must also not be neglected. many power mosfet driving techniques and approaches have been introduced in the past. in the previously reported work [18], [19], alternative mosfet drivers with voltage controlled output are summarized and described in more detail. instead of ‘conventional’ voltage control of vgs (the voltage applied between gate and source) of the connected power mosfet, the proposed driving technique uses an integrated controlled current source connected to the asic driver output. the idea of using a current controlled asic power mosfet driver output is to obtain control over the source/sink current provided to the gate terminal of the power mosfet device during each of the switching segments.
the introduced intelligent current driving technique optimizes switching efficiency and enables a higher level of control over du/dt and di/dt during switching of the power mosfet device. as an additional feature, the asic driver should also implement various algorithms which can automatically adjust its internal current driving algorithm parameters according to sensed values of the power mosfet device, such as die power, mosfet die temperature, voltage drop vdson, etc. in fig. 4 a block diagram of the integrated intelligent power mosfet driver asic is presented. the block diagram introduces an implementation of the intelligent algorithm with safety functions and the configuration of the input/output pins, which are analyzed in more detail in the previously published work [18]. besides the power supply pins, the introduced block diagram of the intelligent driver also has additional input/output pins which are required for its operation.

[fig. 4 blocks: “safety functions”; “di/dt & du/dt control”; controlled current source “on”; controlled current source “off”. pins: “on/off”, “di/dt, du/dt”, “fail safe”, “output” (to the power mosfet gate)]

fig. 4 block diagram of the intelligent driver and safety function of the mosfet driver asic

the control signal connected to the “on/off” pin, usually provided by an external microcontroller unit, controls the operation of the connected power mosfet. the “on/off” pin is connected to the internal functional block “di/dt & du/dt control” and controls the operation of the connected controlled current sources “on” and “off”. the controlled current source marked “on” conditions the gate control signal while the power mosfet transistor is being switched on.
the second current source, marked “off”, is active when the power mosfet is being switched off, that is, when there is no conduction between the drain and source terminals of the power mosfet. an external signal connected to the pin “di/dt, du/dt” is used for programming the current values of both controlled current sources separately. the pin named “output” is connected to the gate terminal of the connected power mosfet device. the safety functions are represented by the functional block “safety functions”. this block monitors the voltage drop between the drain and source of the connected power mosfet device and the temperature of the driver asic. when the voltage drop or the temperature exceeds its limit, the operation of the driver is disabled. the internal signal of the “safety functions” block overrides the command provided by the external control signal connected to the “on/off” pin and sets the highest available sinking current of the controlled current source “off”. when any of the predefined monitored values is exceeded, the “safety functions” block immediately turns off the connected power mosfet device in order to avoid any further uncontrolled events and possibly also further damage to the application. furthermore, the “fail safe” signal, a part of the “safety functions” block, is connected to the external control unit and is used as an interrupt for the external control unit when any of the monitored operation parameters (e.g. overcurrent, temperature, battery voltage) of the power mosfet is outside the predefined limits.

3.2. asic protection and safety functions

in addition to the asic driver’s efficient and reliable driving of the connected power mosfet device, the implementation of safety functionality should be considered. standard functionality can already be found in existing solutions, such as those mentioned in [20] and [21].
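the override behaviour of the “safety functions” block described above can be sketched as a small piece of decision logic. this is a behavioural illustration only, not the asic implementation: the class and function names, the limit values, and the use of a 3-bit code for the sink current setting are assumptions made for the example.

```python
from dataclasses import dataclass

MAX_SINK_CODE = 0b111  # assumed: highest setting of the "off" current source


@dataclass
class Limits:
    max_vds_on: float    # V, hypothetical limit for the monitored voltage drop
    max_die_temp: float  # degC, hypothetical limit for the driver die temperature


@dataclass
class DriverCommand:
    gate_on: bool    # True: "on" current source active, False: "off" source active
    sink_code: int   # setting of the controlled current source "off"
    fail_safe: bool  # interrupt line to the external control unit


def safety_override(on_off, sink_code, vds_on, die_temp, limits):
    """Return the effective driver command after the safety check.

    A monitored value above its limit overrides the external on/off
    command, selects the strongest sink current and raises fail_safe.
    """
    fault = vds_on > limits.max_vds_on or die_temp > limits.max_die_temp
    if fault:
        return DriverCommand(gate_on=False, sink_code=MAX_SINK_CODE, fail_safe=True)
    return DriverCommand(gate_on=on_off, sink_code=sink_code, fail_safe=False)


if __name__ == "__main__":
    limits = Limits(max_vds_on=0.5, max_die_temp=150.0)  # illustrative values
    print(safety_override(True, 0b011, vds_on=0.2, die_temp=80.0, limits=limits))
    print(safety_override(True, 0b011, vds_on=0.9, die_temp=80.0, limits=limits))
```

in the second call the excessive voltage drop forces the gate off regardless of the external command, which mirrors the priority of the safety block over the “on/off” pin.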
the proposed implementation of the power mosfet driver includes the following safety functions: temperature detection of the power mosfet driver die, monitoring of the battery voltage, and overcurrent protection of the controlled power mosfet device.

3.3. asic driver implementation

power mosfet device drivers are usually assembled in different ic packages (e.g. soic). the connections between the drivers and the controlled power mosfet devices also differ between applications, and the parasitic influences of such connections cannot be universally determined. in order to optimize the influence of the connections on the performance, a combined integration of the power mosfet device structure and the asic driver circuit on a single die is proposed. fig. 5 shows an asic integrated power mosfet driver prototype with the introduced intelligent driving algorithm and safety functions implemented.

fig. 5 intelligent power mosfet driver prototype photo; die size: (1612 × 1192) µm

the asic has been prototyped in 0.25µm cmos technology with a maximal breakdown voltage of 40v. the marked areas on the prototyped power mosfet driver die in fig. 5 give an illustrative indication of the individual function blocks introduced in the block diagram shown in fig. 4. in addition, some peripheral function blocks required for integrated circuit operation are added. a detailed description of the peripheral function blocks presented in fig. 5 is not the topic of this paper. such an approach, reported in [20], was previously mentioned, but that solution only addresses high voltage power devices where switching current densities do not have any notable influence on the operation of the integrated asic driver. the combination of the power and cmos structures on a single substrate introduces new challenges for future work.

3.4. measurement results of the prototyped asic

in this section some preliminary measurement results of the prototyped asic are presented. figures 6 and 7 show oscilloscope waveform captures of the “input” and “output” signals of the asic at three different set values of the controlled current sources “on” and “off”, respectively. the asic was supplied with a 12vdc supply and a load capacitor of 10nf was connected to the asic “output” pin. the “input” pin was controlled with an external signal generator.

fig. 6 switching waveform of the intelligent power mosfet driver prototype during load capacitor charging

fig. 6 shows the voltage signal on the “output” pin, connected to the 10nf load capacitor, at the maximum (“111”), middle (“011”) and minimum (“000”) pre-set current capability of the controlled current source “on”. the measurement results show the influence of the different controlled current source values, which directly affect the du/dt slope of the connected capacitor load during charging. the current capability of the controlled current source “on” ranges from 550ma to 100ma for the maximum and minimum pre-set values, respectively. fig. 7 presents the waveforms of the connected capacitor load during discharging for different pre-set values of the controlled current source “off”. the “output” voltage signal marked “111” in fig. 7 shows the signal behaviour when the maximal available discharge current of the asic is pre-set. the “output” signal marked “000” shows the signal behaviour when the minimal available discharge current of the controlled current source “off” is selected. the pre-set current of the controlled current source “off” can be set in eight programmable steps from 330ma to 270ma.

fig. 7 switching waveform of the intelligent power mosfet driver prototype during load capacitor discharging

the presented measurement results show the implementation of the controlled current sources “on” and “off” in the first asic prototype.
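the link between the pre-set current and the du/dt slope on the capacitive load follows from i = c·du/dt. the sketch below evaluates this relation for the current limits and the 10nf load quoted above. it assumes an ideal constant-current source over the full 12v swing, which a real output stage will only approximate, so the numbers are estimates and not measured values.

```python
C_LOAD = 10e-9   # F, load capacitor on the "output" pin (from the text)
V_SUPPLY = 12.0  # V, asic supply (from the text), taken as the full swing


def slew_rate(i_source, c_load=C_LOAD):
    """du/dt of an ideal constant-current source into a capacitor, in V/s."""
    return i_source / c_load


def ramp_time(i_source, delta_v=V_SUPPLY, c_load=C_LOAD):
    """Time to move the capacitor voltage by delta_v at constant current, in s."""
    return c_load * delta_v / i_source


if __name__ == "__main__":
    settings = [
        ('"on"  maximum', 0.550),
        ('"on"  minimum', 0.100),
        ('"off" maximum', 0.330),
        ('"off" minimum', 0.270),
    ]
    for label, current in settings:
        print(f"{label}: {slew_rate(current) / 1e6:5.1f} V/us, "
              f"{ramp_time(current) * 1e9:6.0f} ns for a 12 V swing")
```

with these assumptions the charging slope spans roughly 55 v/µs down to 10 v/µs, while the narrower range of the “off” source gives a correspondingly smaller spread during discharging.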
the asic prototype proved that independent control of charging and discharging of the connected load capacitor is achieved, and it was measured without any external electrical components.

4. conclusion

an important part of power mosfet device operation requires a novel approach to the efficient and reliable control of such devices. the paper introduces some simulation results of various influences of the power mosfet device parameters and analyses their effect on switching performance. the paper concludes with an introduction of some design aspects and ideas of the intelligent asic driver and driving algorithm and the integration of protection and safety functions, and proposes a combined implementation of the asic driver and power mosfet device on a single die. the article also presents a photo of the first prototyped power mosfet driver asic with the intelligent driving algorithm and some integrated safety functions.

acknowledgement: the paper is a part of the research done within the collaboration of letrika lab d.o.o. and university of ljubljana, faculty of electrical engineering, laboratory for microelectronics. the authors would like to thank the staff of the laboratory for microelectronics and colleagues at letrika lab d.o.o. and mahle letrika d.o.o. for their valuable support and knowledge sharing.

references
[1] j. podrzaj, a. sesek, j. trontelj, “an intelligent power mosfet driver with improved functionality,” in proc. 50th int. conf. microelectronics. devices materials, vol. 50, october 2014, pp. 59–63.
[2] v. ambrozic, modern control of ac drives. ljubljana: university of ljubljana, faculty of electrical engineering, 1996.
[3] w. leonhard, control of electrical drives, 3rd ed. berlin; new york: springer, 2001.
[4] p. c. krause, analysis of electric machinery and drive systems, 2nd ed. new york: ieee press, 2002.
[5] b. j.
baliga, advanced power mosfet concepts. new york: springer, 2010.
[6] y. ren, m. xu, j. zhou, and f. c. lee, “analytical loss model of power mosfet,” ieee trans. power electron., vol. 21, no. 2, pp. 310–319, mar. 2006.
[7] l. aubard, g. verneau, j. c. crebier, c. schaeffer, and y. avenas, “power mosfet switching waveforms: an empirical model based on a physical analysis of charge locations,” in power electronics specialists conference, 2002. pesc 02. 2002 ieee 33rd annual, 2002, vol. 3, pp. 1305–1310.
[8] z. j. shen, y. xiong, x. cheng, y. fu, and p. kumar, “power mosfet switching loss analysis: a new insight,” in conference record of the 2006 ieee industry applications conference, 2006. 41st ias annual meeting, 2006, vol. 3, pp. 1438–1442.
[9] i.-j. km, s.-k. hwang, y.-i. choi, and m.-k. han, “a design methodology for the minimum die area of power mosfet’s considering thermal resistance of the package,” in proceedings of the 4th international symposium on power semiconductor devices and ics, 1992. ispsd ’92, 1992, pp. 202–205.
[10] c. yue, j. lu, x. zhang, and y.-s. ho, “effects of package type, die size, material and interconnection on the junction-to-case thermal resistance of power mosfet packages,” in 12th international conference on electronic packaging technology and high density packaging (icept-hdp), 2011, pp. 1–6.
[11] x. fan and s. haque, “emerging mosfet packaging technologies and their thermal evaluation,” in the eighth intersociety conference on thermal and thermomechanical phenomena in electronic systems, 2002. itherm 2002, 2002, pp. 1102–1108.
[12] l. t. sim and y. w. chet, “high performance and reliable to package,” in electronic manufacturing technology symposium (iemt), 2012 35th ieee/cpmt international, 2012, pp. 1–6.
[13] r.-h. tzeng and c.-l. chen, “a low-consumption regulated gate driver for power mosfet,” ieee trans. power electron., vol. 24, no. 2, pp. 532–539, feb. 2009.
[14] j. fu, z. zhang, y.-f. liu, and p. c.
sen, “mosfet switching loss model and optimal design of a current source driver considering the current diversion problem,” ieee trans. power electron., vol. 27, no. 2, pp. 998–1012, feb. 2012.
[15] y. liu, power electronic packaging: design, assembly process, reliability and modeling, 2012 edition. new york: springer, 2012.
[16] w. tursky and p. beckedahl, “advanced drive systems,” in proceedings 35th annu. ieee power electron. spec. conf., vol. 2004, no. 6, pp. 4499–4502, 2004.
[17] t. stockmeier, “from packaging to ‘un’-packaging trends in power semiconductor modules,” in 20th international symposium on power semiconductor devices and ic’s, 2008. ispsd ’08, 2008, pp. 12–19.
[18] j. podrzaj and j. trontelj, “a new concept of intelligent power asic driver technique for semiconductor power devices,” in proc. 46th int. conf. microelectron. devices mater. workshop opt. sens., vol. 46, pp. 167–170, october 2010.
[19] j. podrzaj, a. sesek, and j. trontelj, “intelligent power mosfet driver asic,” in 2012 proceedings of the 35th international convention mipro, 2012, pp. 107–111.
[20] m. bildgen, “from standard to intelligent mosfet,” in conference record of the 1992 ieee industry applications society annual meeting, vol.1, 1992, pp. 1212–1217.
[21] l. chen, f. z. peng, and d. cao, “a smart gate drive with self-diagnosis for power mosfets and igbts,” in twenty-third annual ieee applied power electronics conference and exposition, apec 2008, 2008, pp. 1602–1607.

facta universitatis series: electronics and energetics vol. 36, no 1, march 2023, pp.
1-16 https://doi.org/10.2298/fuee2301001l © 2023 by university of niš, serbia | creative commons license: cc by-nc-nd original scientific paper performance analysis of finfet based inverter, nand and nor circuits at 10 nm ,7 nm and 5 nm node technologies abdelaziz lazzaz1, khaled bousbahi2, mustapha ghamnia3 1,3laboratoire des sciences de la matière condensée (lsmc), département physique, université d’oran 1 ahmed ben bella, oran, algérie. 2ecole supérieure du génie electrique et energétique d’oran, (esgeeo), algérie abstract. advancement in the semiconductor industry has transformed modern society. a miniaturization of a silicon transistor is continuing following moore’s empirical law. the planar metal-oxide semiconductor field effect transistor (mosfet) structure has reached its limit in terms of technological node reduction. to ensure the continuation of cmos scaling and to overcome the short channel effect (sce) issues, a new mos structure known as fin field-effect transistor (finfet) has been introduced and has led to significant performance enhancements. this paper presents a comparative study of cmos gates designed with finfet 10 nm, 7 nm and 5 nm technology nodes. electrical parameters like the maximum switching current ion, the leakage current ioff, and the performance ratio ion/ioff for n and p finfet with different nodes are presented in this simulation. the aim and the novelty of this paper is to extract the operating frequency for cmos circuits using quantum and stress effects implemented in the spice parameters on the latest microwind software. the simulation results show a fitting with experimental data for finfet n and p 10 nm strctures using quantum correction. finally, we have demonstrate that finfet 5 nm can reach a minimum time delay of td=1.4 ps for cmos not gate and td=1 ps for cmos nor gate to improve integrated circuits ic. key words: finfet, quantum effect, cmos not gate, cmos nor gate, cmos nand gate, microwind 1. 
introduction the rapid development of nanoelectronics technology is closely related to solving the problem of minimum layout dimensions. the efficient miniaturization of a transistor has been one of the most important topic for integrating a greater number of electronic components in a single chip. received april 17, 2022; revised may 22, 2022, june 05, 2022 and june 16, 2022; accepted july 16, 2022 corresponding author: abdelaziz lazzaz laboratoire des sciences de la matière condensée (lsmc), département physique, université d’oran 1 ahmed ben bella, oran, algérie e-mail: lazzaz.abdelaziz@gmail.com 2 a. lazzaz, k. bousbahi, m. ghamnia finfet is one of the best alternative for replacing mosfet which encounter the problem of the sce like drain induced barrier lowering (dibl), and the increase of leakage current when the channel length is reduced below 32 nm. researchers around the world have tried to improve the performance of finfets by the introduction of high k dielectric materials and strained silicon technology [1]. since the conventional mosfet has reached its limit, the multi gate finfet has been one of the most promising devices for cmos technology and the different analytical studies of finfet is a current topic of research in large foundries like tsmc[6], samsung and intel, they are aiming to create the most efficient cmos circuits. shiqi liu et al. in 2021[21] have simulated an ultra-thin si finfet with a width of 0.8 nm by using ab initio quantum transport simulations. the results of their simulation confirm that even with the gate length down to 5 nm, the on-state current, delay time, power dissipation, and energy-delay product of the optimized ultra-thin si finfet still meet the high-performance applications. dhananjaya tripathy et al in 2022 [22] have examined the impact of variation in the thickness of the oxide (sio2) layer on the performance parameters of a finfet. 
the results confirm that a rise in sio2 thickness improves the energy and power dissipation of the finfet. lazzaz et al. in 2022 [23] simulated a theoretical model based on the bohm quantum potential (bqp) theory and compared it with experimental data. the theory fits the experiment after optimization and correction using the right values of the geometric parameters. bourahla et al. in 2021 [24] demonstrated that a ta2o5 gate material with high permittivity (k = 27) yields better values of the performance parameters (vth, ss, ion, ioff, ion/ioff ratio, gm, and electric field e) than other dielectrics such as sio2, sno2 and zro2, which improves the performance of the device. lazzaz et al. in 2021 [2] demonstrated the impact of the metal gate work function on the performance of the dg finfet 10 nm with silvaco tcad tools. uttam kumar das et al. in 2021 [25] carried out a comparative study of silicon finfets against carbon nanotube and 2d-fets for advanced node cmos logic applications. the results of this simulation confirm that the finfet delivers more than three times higher drive current, as well as five times better energy-delay performance. rajeev ratna vallabhuni et al. in 2020 [26] simulated a 2-bit comparator designed with 18 nm finfet technology. the simulation characterizes the cmos comparator in terms of power and delay using the cadence virtuoso tool. the results of this simulation confirm that finfets can be used where a fast switching rate is required, to improve the efficiency of control devices and to make compact devices. j. jena et al. in 2022 [27] simulated finfet-based inverter design and optimization for the 7 nm technology node. their results confirm that, according to the sidewall orientation (<100> or <110>), the mobility enhancement of electrons and holes is more than 100% (>100%) and less than 25% (<25%), respectively. c. auth et al. in 2017 [31] presented an industry-leading 10 nm cmos technology node with excellent finfet transistor and interconnect performance and aggressive design rule scaling. their results show a high-performance, high-density sram featuring a 0.0312 µm² cell size fabricated using all 10 nm process features. s. panchanan et al. in 2021 [35] simulated an analytical model of the tri-gate metal-oxide-semiconductor field effect transistor (tg mosfet) for short channel lengths below 10 nm using tcad software. the model is examined by varying channel length, oxide thickness, gate voltage, drain voltage and doping concentration. their results confirm that, to obtain identical surface potentials, the oxide thickness of hfo2 must be larger than that of sio2. unlike sio2, the minima of the surface potential remain constant with channel length for hfo2. b. vandana et al. in 2018 [36] explored the analog analysis and higher order derivatives of the drain current (id) with respect to the gate source voltage (vgs), by introducing a channel engineering technique for 3d conventional and wavy junctionless finfets (jlt) with silicon germanium (si1-0.25ge0.25) as the device layer. their results confirm that better gate control over the channel is observed for wavy structures and that a high id is induced as lg scales down. n. p. maity et al. in 2019 [37] simulated a double-gate (dg) heterojunction tunnel finfet structure with a source overlap region to optimize its performance, and validated the technology computer-aided design (tcad) simulation results by modeling the surface potential, electric field, and threshold voltage. suparna panchanan et al. in 2021 [38] proposed an analytical model for the surface potential and threshold voltage of the undoped (or lightly doped) tri-gate fin field effect transistor (tg-finfet) and validated it using technology computer-aided design (tcad) simulation. suparna panchanan et al. in 2022 [39] studied a lambert w function-based drain current model of the lightly doped short channel tri-gate fin field effect transistor (tg-finfet). their results confirm that a precise drain current is obtained by adding the quantum mechanical effect (qme), which also improves the efficiency of the model. shaheen saleh et al. in 2018 [41] demonstrated the roles and impacts of various effects and aging mechanisms on finfet transistors compared to planar transistors, using the physics of failure mechanisms to fit a comprehensive aging model.

the above literature survey indicates the importance of using high-k dielectrics in finfet devices and the importance of the multi-gate finfet for overcoming the sce and improving channel control. in this paper, we present a comparative study of different cmos gates (not, nand and nor gates) based on the 10 nm, 7 nm and 5 nm technology nodes, in order to extract optimal geometric parameters for an operational finfet device for future applications such as sram circuits.

2. device structure and simulation

tri gate (tg) finfet technology is based on a vertical fin characterized by the fin length (l), fin height (hfin) and fin width (wfin), as shown in figure 1. finfet devices have been used in a variety of innovative digital and analog circuit designs. the tg (tri gate) structure has been developed recently, and its ability to control three sides of the channel has been used to reduce the circuit area, its capacitance and the variation of the threshold voltage. throughout the last few years, cmos scaling and improvements in processing technologies have led to continuous enhancement in circuit speeds due to the miniaturization of the finfet device.
the main difference between the bulk finfet and the soi finfet is the buried oxide (box), which isolates the body from the substrate, minimizes the leakage current due to quantum effects, and reduces the parasitic junction capacitance and the source/drain capacitance. despite the enhancement that soi finfet technology brings to the device, one of its drawbacks is the self-heating effect, because the thin active body lies on silicon oxide, which is a good thermal insulator. during operation, the power consumed by the active region cannot be dissipated easily; therefore, the temperature of the thin body rises, and this decreases the mobility and the current of the device [32]. in this work, the finfet structure has been simulated with microwind 3.8 software using the parameters provided in table 1. the right panel of figure 1 shows the 3d schematic of the simulated finfet 10 nm and the left panel shows the design layout of the device.

fig. 1 n finfet 10 nm

table 1 different parameters of the simulated device [6] [7][12]

| notation | description | finfet 10 nm | finfet 7 nm | finfet 5 nm |
|---|---|---|---|---|
| ls, ld | length of drain/source | 22 nm | 16 nm | 12 nm |
| lg | gate length | 18 nm | 16 nm | 14 nm |
| tox | oxide thickness | 1 nm | 0.9 nm | 0.9 nm |
| hfin | fin height | 46 nm | 46 nm | 46 nm |
| wfin | fin width | 7 nm | 6 nm | 5 nm |

table 1 shows the design parameters that we have employed for the circuit simulations in the present work. the primary obstacles to the scaling of cmos gate lengths to 10 nm and beyond are the short channel effect and the leakage current, which lead to low yield. the finfet offers better control over the sce and hence overcomes these obstacles to scaling. the circuit simulation is done using microwind 3.8, which we have used to simulate electrical circuits in the transient domain. the microwind tool facilitates circuit-level analysis and performance simulation of integrated circuits.
the predictive technology model (ptm) integrated in microwind provides accurate, customizable, and predictive model files for future transistor and interconnect technologies [28][29]. we have simulated different logic circuits, namely the not gate and the 2-input nand and 2-input nor gates, for leakage power dissipation, delay time and power delay product (pdp) at the 10 nm, 7 nm and 5 nm technology nodes, and a comparison is made to check the technology scaling.

the threshold voltage can be represented by the following equation [11]:

$$V_{th} = \phi_{ms} + 2\phi_f + \frac{Q_d}{C_{ox}} + \frac{Q_{ss}}{C_{ox}} + V_{in} \qquad (1)$$

$\phi_{ms}$: work function difference between gate and fin, $Q_{ss}$: charge in the gate dielectric, $C_{ox}$: oxide capacitance, $Q_d$: depletion charge, $\phi_f$: fermi potential, $V_{in}$: input voltage.

power dissipation plays a crucial role in the overall performance of circuits in the sub-10 nm regime, and it is an important performance metric for checking the effectiveness of the proposed technique. the time delay is a performance metric for evaluating the switching speed of the circuit; it is calculated by the following equation [9]:

$$t_d = \frac{t_{phl} + t_{plh}}{2} \qquad (2)$$

$t_{phl}$: high to low transition delay; $t_{plh}$: low to high transition delay.

leakage power dissipation is also an important parameter for designers because it affects the performance and reliability of the electronic device. the leakage power dissipation is calculated using the following equation [8]:

$$P_{leakage} = V_{dd}\, I_{leakage} \qquad (3)$$

where $V_{dd}$ is the supply voltage and $I_{leakage}$ is the leakage current.

scaling is a very important step in the design of the finfet structure, where the scaling factor is given by the following equation [6]:

$$\text{scaling factor} = W_{fin} + 2\,t_{ox} \qquad (4)$$

$W_{fin}$: fin width; $t_{ox}$: oxide thickness.

pdp (power delay product) is an essential requirement for better performance of the circuits.
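as a quick numerical check of equations (2) to (4), the following python sketch evaluates the average delay, the leakage power and the scaling factor. the function names are illustrative and are not part of microwind; the delay and leakage inputs in the usage lines are placeholder values chosen only for illustration, while the fin widths and oxide thicknesses are taken from table 1.

```python
def propagation_delay(t_phl, t_plh):
    """Average propagation delay, eq. (2): mean of the two transition delays."""
    return (t_phl + t_plh) / 2.0


def leakage_power(v_dd, i_leakage):
    """Static leakage power, eq. (3): supply voltage times leakage current."""
    return v_dd * i_leakage


def scaling_factor(w_fin, t_ox):
    """Scaling factor, eq. (4): fin width plus twice the oxide thickness."""
    return w_fin + 2.0 * t_ox


# Placeholder inputs (assumptions, not results from the paper).
t_d = propagation_delay(1.5e-12, 1.7e-12)   # seconds
p_leak = leakage_power(0.8, 1.0e-9)         # watts, assuming 1 nA of leakage

# Geometry from Table 1 (w_fin, t_ox in nm) for the 10 nm, 7 nm and 5 nm nodes.
table1_geometry = {"10 nm": (7.0, 1.0), "7 nm": (6.0, 0.9), "5 nm": (5.0, 0.9)}
factors = {node: scaling_factor(w, t) for node, (w, t) in table1_geometry.items()}

print(f"t_d = {t_d:.3e} s, p_leak = {p_leak:.3e} W")
for node, value in factors.items():
    print(f"scaling factor at {node}: {value:.1f} nm")
```

with the table 1 geometry the scaling factor shrinks from node to node because both the fin width and the oxide thickness are reduced.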
technology scaling increases the power dissipation and delay values; therefore, the lowest value of pdp indicates better performance at the scaled technology nodes. pdp is given by the following equation [17]:

$$\text{pdp} = \text{power dissipation} \times \text{delay} \qquad (5)$$

the following equation represents the drain current in the sub-threshold mode used in this simulation:

$$I_{ds} = I_{ds}(V_{on}, V_{ds})\; e^{\frac{q\,(V_{gs} - V_{on})}{n k T}} \qquad (6)$$

$V_{gs}$: gate source voltage, $n$: body coefficient, $k$: boltzmann constant, $T$: temperature, $q$: electron charge.

$$I_{ds0} = \frac{W_{eff}}{L_{eff}}\,\mu_{eff}\,\frac{\varepsilon_0 \varepsilon_r}{t_{oxe}}\,V_{gsteff}\,V_{dseff}\;\frac{1 - \dfrac{A_{bulk}\,V_{dseff}}{2\,V_{gsteff} + 4\,v_t}}{1 + \dfrac{V_{dseff}}{\varepsilon_{sat}\,L_{eff}}} \qquad (7)$$

$W_{eff}$: effective width, $L_{eff}$: effective length, $\varepsilon_0$: vacuum permittivity, $\varepsilon_r$: relative permittivity, $t_{oxe}$: oxide thickness, $V_{gsteff}$: gate source effective voltage, $V_{dseff}$: drain source effective voltage, $\varepsilon_{sat}$: saturation permittivity, v: carrier velocity.

in 3d nanochannel devices, the sce modifies the drain current expression by a correction factor cf for the post-threshold voltage regime:

$$cf = \frac{\lambda}{\lambda + L} \qquad (8)$$

$\lambda$: mean free path, $L$: channel length; cf is also called the transition coefficient.

figure 2 represents the transfer characteristics of the n finfet 10 nm and illustrates a comparison between the theoretical and experimental transfer characteristics in the subthreshold regime. the gate voltage is swept from 0 v to 0.8 v for drain voltages of 0.05 v, 0.1 v and 0.2 v. the maximum value of the drain current is the on current, obtained when vgs = vdd = 0.8 v; its value is 10.5 µa. the leakage current, which is the value of the current when vgs = 0, is 2.75 na. to fit the experimental results, the drain current is modified by the correction factor cf given in equation (8). this coefficient represents the transport mode transition factor. the transport is quasi-ballistic in the channel.
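the correction factor of equation (8) and the on/off ratio implied by the currents quoted above can be evaluated with a few lines of python. this is only a sketch: the function names are illustrative, the mean free path of 10 nm is a placeholder assumption (the text does not state the value used), and the sub-threshold helper assumes the exponential form q(vgs − von)/(nkt) with a reference current supplied by the caller.

```python
import math

Q = 1.602176634e-19   # elementary charge (C)
K_B = 1.380649e-23    # boltzmann constant (J/K)


def correction_factor(mean_free_path, channel_length):
    """Transition coefficient of eq. (8): lambda / (lambda + L)."""
    return mean_free_path / (mean_free_path + channel_length)


def subthreshold_current(i_ref, v_gs, v_on, n=1.0, temperature=300.0):
    """Exponential sub-threshold law: i_ref * exp(q (v_gs - v_on) / (n k T)).

    i_ref is the current at v_gs = v_on; n and temperature are assumptions.
    """
    return i_ref * math.exp(Q * (v_gs - v_on) / (n * K_B * temperature))


def on_off_ratio(i_on, i_off):
    """Performance ratio ion/ioff."""
    return i_on / i_off


# Placeholder mean free path (assumption); gate length of the 10 nm node from Table 1.
cf = correction_factor(10e-9, 18e-9)

# Currents quoted in the text for the n finfet 10 nm: 10.5 uA on, 2.75 nA off.
ratio = on_off_ratio(10.5e-6, 2.75e-9)

print(f"cf = {cf:.3f}, ion/ioff = {ratio:.2f}")
```

the ratio computed from the quoted currents matches the value of 3818.18 reported later for the n finfet 10 nm.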
this transition coefficient takes into consideration the type of charge carrier (n or p); therefore, the correction value differs between the two structures. the fit of the simulated results to the experimental data is due to the quantum correction, which gives a good convergence between the two curves. the calculated parameters, such as the effective mobility in the n channel, the diffusion coefficient and the unidirectional thermal velocity, are used to compute the mean free path used in equation (8). the electron carrier mobility used in this simulation of the n finfet 10 nm is 350 cm²/v.s. the average mobility value has been extracted from the berkeley spice model for finfet 10 nm [28]. it is noted that for the gate voltages 0.1 v and 0.2 v, the simulation curves fit the experimental data [31] very well; there is therefore a good convergence between the theoretical model and the experimental points at these gate voltages. the discrepancy at 0.4 v and 0.5 v can be explained by the presence of complex scattering phenomena which are very difficult to model.

fig. 2 transfer characteristics of n finfet 10 nm [31]

figure 3 represents the transfer characteristics of the p finfet 10 nm and illustrates a comparison between the theoretical and experimental transfer characteristics. we note that the on current is 50 µa and the leakage current is 44.76 na. the threshold voltage in this simulation is 0.20 v, and the decrease of the threshold voltage is due to the increase of the quasi-fermi level. the values of the drain voltage have been chosen to calculate the threshold voltage and to fit the curve to the experimental data [31].

fig. 3 transfer characteristics of p finfet 10 nm

figure 4 represents the transfer characteristics of the n finfet 7 nm; we note that the on current is 0.306 ma and the leakage current ioff is 82.536 na.
low static power technologies need a higher threshold voltage, but the miniaturization of integrated circuits and of the channel length decreases the threshold voltage. the threshold voltage in this simulation is 0.22 v [33]. the leakage current in this simulation of the n finfet 7 nm is lower than that calculated by suyog gupta et al. [4].

fig. 4 transfer characteristics of n finfet 7 nm

figure 5 represents the transfer characteristics of the p finfet 7 nm; we note that ion is 0.250 ma and the leakage current is ioff = 221.571 na. the on current in this simulation of the p finfet 7 nm is higher than that calculated by t. dash et al. [18], and the leakage current is lower than that calculated by suyog gupta et al. [4].

fig. 5 transfer characteristics of p finfet 7 nm

figure 6 represents the transfer characteristics of the n finfet 5 nm; we note that the on current is 0.240 ma and the leakage current is 81.694 na. the threshold voltage is 0.23 v in this simulation; the increase of its value is due to the fermi level, and to obtain a better threshold voltage we need to increase the fin height [3][14]. the on current in this simulation is higher than that calculated by n. p. maity et al. [5]. we can control and minimize the leakage current in this structure for different channel lengths by optimizing the geometric parameters in order to obtain optimal results.

fig. 6 transfer characteristics of n finfet 5 nm

figure 7 represents the transfer characteristics of the p finfet 5 nm; we note that the maximum current ion is 0.199 ma and the leakage current is 219.31 na. we think that the increase of the leakage current is caused by the leaked quantum confinement and by the choice of geometric parameters such as the gate oxide, which raises the conduction band, so that more potential is needed to create an inversion layer [13].

fig. 7 transfer characteristics of p finfet 5 nm

table 2 gives the performance ratio ion/ioff, the threshold voltage vth and the dibl calculated for the finfet 10 nm, 7 nm and 5 nm structures [10]. the table also presents a comparative study with international roadmap for devices and systems (irds) results [30]. the supply voltage is 0.8 v for finfet 10 nm and 7 nm and 0.65 v for finfet 5 nm. these parameters are extracted from the berkeley spice model [28].

table 2 performance ratio of finfet 10 nm, 7 nm and 5 nm

| device | finfet 10 nm | finfet 7 nm | finfet 5 nm |
|---|---|---|---|
| ion/ioff values for n structure | 3818.18 | 3707.36 | 2937.79 |
| ion/ioff values for p structure | 1117.6 | 1128.30 | 907.36 |
| vth (v) for n structure | 0.24 | 0.22 | 0.23 |
| vth (v) for p structure | 0.20 | 0.20 | 0.22 |
| ion/ioff for n structure (irds) [30] | 950 | 930 | 840 |
| dibl n finfet (mv/v) | 49.5 | 45.5 | 40.5 |
| dibl p finfet (mv/v) | 50.5 | 46.5 | 41.5 |

we note that the better performance ratio of the n finfet is for finfet 7 nm, due to the leakage current, and that the higher performance ratio of the p finfet is for finfet 10 nm, due to the minimum strain effect on the on current.

3. cmos gates designs

this paper considers three design styles for digital logic circuit structures using finfets. the circuit diagrams of the finfet-based not, nand and nor gate designs, along with the ordinary cmos, are shown in figure 8.

fig. 8 (1): not gate, (2): cmos nand, (3) cmos nor [8]

the three cmos circuits (nand, nor and inverter) based on finfets have been analyzed using the microwind 3.8 tool. the first step is the implementation of the three finfet-based circuits in order to create the layout styles [16]. the design rules must be checked before applying the inputs. the design rule used in this simulation is the lambda-based design rule. the value of lambda is fixed to 8 nm [6] [15].
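a lambda-based rule expresses every layout dimension as a multiple of lambda, so converting a rule to physical units is a single multiplication. the python sketch below illustrates this with lambda = 8 nm as stated above; the helper names and the minimum width of 2 lambda used in the example are hypothetical and are not taken from the microwind rule files.

```python
LAMBDA_NM = 8.0  # value of lambda stated in the text (nm)


def lambda_to_nm(n_lambda, lambda_nm=LAMBDA_NM):
    """Convert a dimension expressed in lambda units to nanometres."""
    return n_lambda * lambda_nm


def meets_min_width(width_nm, min_width_lambda, lambda_nm=LAMBDA_NM):
    """Check a drawn width against a minimum-width rule given in lambda units."""
    return width_nm >= lambda_to_nm(min_width_lambda, lambda_nm)


# Hypothetical rule: minimum width of 2 lambda (assumption for illustration only).
min_width_nm = lambda_to_nm(2)
print(f"2 lambda = {min_width_nm:.0f} nm")
print("16 nm wide shape passes:", meets_min_width(16.0, 2))
print("12 nm wide shape passes:", meets_min_width(12.0, 2))
```

expressing rules in lambda means the same layout can be rescaled to another node by changing only the value of lambda.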
figure 9.a represents the layout design of the cmos not gate with finfet 5 nm using microwind 3.8 and figure 9.b represents the 3d structure of the cmos not gate with finfet 5 nm [19].

fig. 9 (a) design layout cmos inverter, (b) cmos inverter 3d structure

figure 10.a represents the design layout of the cmos nand gate with finfet 5 nm using microwind 3.8 and figure 10.b represents the 3d structure of the cmos nand gate with finfet 5 nm.

fig. 10 (a) design layout cmos nand gate, (b) cmos nand 3d structure

figure 11.a represents the design layout of the cmos nor gate with finfet 5 nm using microwind 3.8 and figure 11.b represents the 3d structure of the cmos nor gate with finfet 5 nm.

fig. 11 (a) design layout cmos nor gate, (b) cmos nor gate 3d structure

figure 12 represents the vtc curves of the different cmos circuits.

fig. 12 (a) vtc curves of cmos not gate, (b) vtc curves of cmos nor gate, (c) vtc curves of cmos nand gate

the noise margin is a measure of design margin that ensures circuit operation within specified conditions, and it is closely related to the dc transfer curve [40]. this parameter determines the allowable noise voltage on the input of a gate such that the output will not be corrupted. the specification most commonly used to describe the noise margin (or noise immunity) uses two parameters: the low noise margin nml and the high noise margin nmh [8]. table 3 lists the parameters calculated from the vtc (voltage transfer curve), and figure 13 represents the values of the power delay product (pdp) for the different cmos gates. we note that the better value of pdp for the not gate is for finfet 5 nm, and for the cmos nand and nor gates it is for finfet 7 nm.

table 3 calculated parameters of different cmos finfet gates

| parameter | not gate 10 nm | not gate 7 nm | not gate 5 nm | nand gate 10 nm | nand gate 7 nm | nand gate 5 nm | nor gate 10 nm | nor gate 7 nm | nor gate 5 nm |
|---|---|---|---|---|---|---|---|---|---|
| vdd (v) | 0.80 | 0.80 | 0.65 | 0.80 | 0.80 | 0.65 | 0.80 | 0.80 | 0.65 |
| vsp (v) | 0.385 | 0.385 | 0.386 | 0.397 | 0.391 | 0.3999 | 0.379 | 0.373 | 0.371 |
| vil (v) | 0.3243 | 0.3243 | 0.3189 | 0.3445 | 0.3445 | 0.3351 | 0.3148 | 0.3189 | 0.3202 |
| voh (v) | 0.7750 | 0.7687 | 0.7656 | 0.7509 | 0.7562 | 0.7562 | 0.7187 | 0.7718 | 0.7562 |
| vih (v) | 0.4378 | 0.4391 | 0.4418 | 0.4513 | 0.4472 | 0.4472 | 0.4189 | 0.4216 | 0.4437 |
| vol (v) | 0.0531 | 0.0406 | 0.0406 | 0.0375 | 0.0437 | 0.0406 | 0.0343 | 0.05 | 0.0437 |
| nml (v) | 0.2712 | 0.2836 | 0.2782 | 0.3070 | 0.3007 | 0.2944 | 0.2804 | 0.2689 | 0.2764 |
| nmh (v) | 0.3372 | 0.3296 | 0.3238 | 0.2996 | 0.3090 | 0.3090 | 0.2998 | 0.3502 | 0.3125 |
| td (ps) | 1.6 | 1.5 | 1.4 | 2.20 | 2.20 | 2.10 | 1.10 | 1.10 | 1.0 |
| p (µw) | 0.446 | 0.357 | 0.460 | 0.475 | 0.686 | 0.5950 | 0.325 | 0.416 | 0.401 |
| pdp (10-18 w.s) | 0.7136 | 0.5355 | 0.6440 | 1.0450 | 1.5092 | 1.2495 | 0.3575 | 0.4576 | 0.4010 |

p: power dissipation in static cmos, pdp: power delay product, td: time delay, vol: maximum low output voltage, voh: minimum high output voltage, vil: maximum low input voltage, vih: minimum high input voltage, vsp: switching point voltage.

fig. 13 power delay product (pdp) for different cmos gates

figure 14 represents the time delay of the different cmos gates; we note that the optimal device is finfet 5 nm due to its low time delay. the results obtained for each digital application at the 10 nm, 7 nm and 5 nm finfet nodes show a system tradeoff. we note that as we scale down the device from 10 nm to 5 nm, the time delay decreases because the supply voltage has been decreased [34].

fig. 14 time delay for different cmos gates

the fluctuation in the power delay product (pdp) is due to the fluctuation of the static power dissipation, and it is a minor issue because the system reliability has improved [20].
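the noise margins and the pdp in table 3 follow directly from the other rows: nml = vil − vol, nmh = voh − vih, and pdp = p × td. the python sketch below reproduces the not gate 10 nm column under these definitions; the function names are illustrative, and reading the power row in microwatts is an assumption that is consistent with the pdp row being given in units of 10-18 w.s.

```python
def noise_margins(v_il, v_ih, v_ol, v_oh):
    """Return (nml, nmh): low margin v_il - v_ol, high margin v_oh - v_ih."""
    return v_il - v_ol, v_oh - v_ih


def power_delay_product(power_w, delay_s):
    """Power delay product, eq. (5), in watt-seconds."""
    return power_w * delay_s


# Not gate, finfet 10 nm column of Table 3.
nml, nmh = noise_margins(v_il=0.3243, v_ih=0.4378, v_ol=0.0531, v_oh=0.7750)

# Power assumed to be in microwatts, delay in picoseconds.
pdp = power_delay_product(0.446e-6, 1.6e-12)

print(f"nml = {nml:.4f} V, nmh = {nmh:.4f} V")
print(f"pdp = {pdp / 1e-18:.4f} x 10^-18 W.s")
```

applying the same two functions to the remaining columns gives a simple consistency check of the table.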
conclusion

as ultra large scale integration (ulsi) advances, new challenges such as the sce have arisen because of the scaling of transistors. from the simulation results, it has been observed that leakage power dissipation is a major issue in the modern semiconductor industry and that finfet devices have the advantages needed to overcome it. the simulation results for finfet-based digital applications in the nanometer regime at the 10 nm, 7 nm and 5 nm technology nodes are studied here with the educational tool microwind, and a comparative analysis of the different finfet technology nodes is carried out. from the simulation results, one can conclude that the time delay and the power delay product of finfet-based cmos devices are crucial parameters for improving the performance of cmos circuits. we confirm in this study that significant progress has been made by introducing a new generation of 5 nm finfet devices, which improves the switching performance and decreases the time delay of cmos circuits compared with the 10 nm and 7 nm nodes. the results of this simulation confirm that the proper selection of the supply voltage and geometric parameters is important for obtaining high-speed and stable cmos circuits.

acknowledgement: the authors wish to thank pr etienne sicard and mr vinay sharma for their helpful suggestions in this work.

references
[1] b. yu, l. chang and s. ahmed, "finfet scaling to 10 nm gate length", in proceedings of the ieee digest. international electron devices meeting, 2002, pp. 251-254.
[2] a. lazzaz, k. bousbahi and m. ghamnia, "modeling and simulation of dg soi n finfet 10 nm using hafnium oxide", in proceedings of the 21st ieee international conference on nanotechnology (nano), 2021, pp. 177-180.
[3] x. zhang, d. connelly and p.
zheng, "analysis of 7/8-nm bulk-si finfet technologies for 6t-sram scaling", ieee trans. electron devices, vol. 63, no 4, pp. 1502-1507, 2016.
[4] s. gupta, v. moroz and l. smith, "7-nm finfet cmos design enabled by stress engineering using si, ge, and sn", ieee trans. electron devices, vol. 61, no. 5, pp. 1222-1230, 2014.
[5] n. maity, r. maity and s. maity, "comparative analysis of the quantum finfet and trigate finfet based on modeling and simulation", j. comput. electron., vol. 18, no 2, pp. 492-499, 2019.
[6] e. sicard and l. trojman, "introducing 5-nm finfet technology in microwind", hal open science, hal0325444, 2021.
[7] n. bourahla, a. bourahla and b. hadri, "comparative performance of the ultra-short channel technology for the dg-finfet characteristics using different high-k dielectric materials", indian j. phys., vol. 95, pp. 1977-1984, 2020.
[8] n. weste and d. harris, cmos vlsi design: a circuits and systems perspective, pearson education india, 2015.
[9] j. baker, cmos circuit, design, layout and simulation, ieee press series on microelectronic systems, pp. 332-375, 2010.
[10] y. eng, l. hu, t. chang, s. hsu, c. chiou, t. wang and c. yang, "importance of $\Delta V_{\text{DIBLSS}}/(I_{\text{on}}/I_{\text{off}})$ in evaluating the performance of n-channel bulk finfet devices", ieee j. electron devices soc., pp. 207-213, 2018.
[11] m. lundstrom, fundamentals of nanotransistors, world scientific publishing company, vol. 6, 2017, pp. 100-300.
[12] n. collaert, high mobility materials for cmos applications, woodhead publishing, 2018, pp. 115-280.
[13] y. chauhan, d. lu and s. venugopalan, finfet modeling for ic simulation and design: using the bsimcmg standard, academic press, 2015, pp. 72-200.
[14] m. tang, f. pregaldiny and c. lallement, "quantum compact model for ultra-narrow body finfet", in proceedings of the 10th international ieee conference on ultimate integration of silicon, 2009, pp. 293-296.
[15] e.
sicard, "introducing 20 nm technology in microwind", hal open science, hal-03324322, pp. 3-20, 2011.
[16] e. sicard and s. dhia, "microwind & dsch: version 3", insa, pp. 1-90, 2004.
[17] r. sharma and s. verma, "comparitive analysis of static and dynamic cmos logic design", in proceedings of the ieee international conference on computing and communication technologies, 2011, pp. 231-234.
[18] t. dash, s. dey and s. das, "performance comparison of strained-sige and bulk-si channel finfets at 7 nm technology node", j. micromech. microeng., vol. 29, no. 10, p. 104001, 2019.
[19] l. artola, g. hubert and m. alioto, "comparative soft error evaluation of layout cells in finfet technology", microelectron. reliab., vol. 54, no. 9-10, pp. 2300-2305, 2014.
[20] v. vashishtha and l. clark, "comparing bulk-si finfet and gate-all-around fets for the 5 nm technology node", microelectron. j., vol. 107, p. 104942, 2021.
[21] s. liu, j. yang and l. xu, "can ultra-thin si finfets work well in the sub-10 nm gate-length region?", nanoscale, vol. 13, no 10, pp. 5536-5544, 2021.
[22] d. tripathy, d. acharya and p. rout, "influence of oxide thickness variation on analog and rf performances of soi finfet", fu: elec. energ., vol. 35, no. 1, pp. 001-011, 2022.
[23] a. lazzaz, k. bousbahi and m. ghamnia, "optimized mathematical model of experimental characteristics of 14 nm tg n finfet", micro and nanostructures, p. 207210, 2022.
[24] n. bourahla, b. hadri and n. boukortt, "impact of high-k dielectric material on ultra-short-dg-finfet performance", in proceedings of the 15th international ieee conference on advanced technologies, systems and services in telecommunications (telsiks), 2021, pp. 78-81.
[25] u. das, m. hussain, "benchmarking silicon finfet with the carbon nanotube and 2d-fets for advanced node cmos logic application", ieee trans. electron devices, vol. 68, no 7, pp. 3643-3648, 2021.
[26] r. vallabhuni, d. sravya and m.
shalini, "design of comparator using 18nm finfet technology for analog to digital converters", in proceedings of the 7th international ieee conference on smart structures and systems (icsss), 2020, pp. 1-6.
[27] j. jena, d. jena and e. mohapatra, "finfet-based inverter design and optimization at 7 nm technology node", silicon, vol. 14, pp. 10781-10794, 2022.
[28] s. sinha, g. yeric and v. chandra, "exploring sub-20nm finfet design with predictive technology models", in proceedings of the ieee dac design automation conference, 2012, pp. 283-288.
[29] e. sicard and l. trojman, "introducing 5-nm finfet technology in microwind", hal open science, hal0325444, 2021.
[30] international roadmap for devices and systems. available at: https://irds.ieee.org/ (2018 edition).
[31] c. auth, a. aliyarukunju and m. asoro, "a 10nm high performance and low-power cmos technology featuring 3rd generation finfet transistors, self-aligned quad patterning, contact over active gate and cobalt local interconnects", in proceedings of the ieee international electron devices meeting (iedm), 2017, pp. 29.1.1-29.1.4.
[32] p. vora and r. lad, "a review paper on cmos, soi and finfet technology", design and reuse industry articles, pp. 1-10, 2017.
[33] m. tang, f. prégaldiny and c. lallement, "explicit compact model for ultranarrow body finfets", ieee trans. electron devices, vol. 56, no. 7, pp. 1543-1547, 2009.
[34] j. hu and x. yu, "near-threshold full adders for ultra low-power applications", in proceedings of the second ieee pacific-asia conference on circuits, communications and system, 2010, pp. 300-303.
[35] s. panchanan, r. maity and s. baishya, "a surface potential model for tri-gate metal oxide semiconductor field effect transistor: analysis below 10 nm channel length", eng. sci. technol. int. j., vol. 24, no. 4, pp. 879-889, 2021.
[36] b. vandana, d. kumar and s. mohapatra, "impact of channel engineering (si1-0.25 ge0.25) technique on gm (transconductance) and its higher order derivatives of 3d conventional and wavy junctionless finfets (jlt)", facta universitatis, series electronics and energetics, vol. 31, no. 2, pp. 257-265, 2018.
[37] n. maity, r. maity and s. baishya, "an analytical model for the surface potential and threshold voltage of a double-gate heterojunction tunnel finfet", j. comput. electron., vol. 18, no 1, pp. 65-75, 2019.
[38] s. panchanan, r. maity, "modeling, simulation and analysis of surface potential and threshold voltage: application to high-k material hfo2 based finfet", silicon, vol. 13, no. 10, pp. 3271-3289, 2021.
[39] s. panchanan, r. maity and s. baishya, "modeling, simulation and performance analysis of drain current for below 10 nm channel length based tri-gate finfet", silicon, vol. 14, pp. 11519-11530, 2022.
[40] l. wang, y. chang and k. cheng, electronic design automation: synthesis, verification, and test, morgan kaufmann (ed), 2009.
[41] s. shaheen, g. golan, m. azoulay, "a comparative study of reliability for finfet", facta universitatis, series electronics and energetics, vol. 31, no 3, pp. 343-366, 2018.

facta universitatis series: electronics and energetics vol. 29, no 3, september 2016, pp. 357-365 doi: 10.2298/fuee1603357k

an architectural design for cloud of things

abhirup khanna
b.tech cse with specialization in cloud computing and virtualization technology, university of petroleum and energy studies (upes), dehradun, uttarakhand, india

abstract. in recent times the world has seen an exponential rise in the number of devices connected to the internet. this widespread expansion of the internet and growth in the number of interconnected devices has led to the rise of many new age technologies. the internet of things (iot), being one of them, allows devices that are connected through the internet to communicate with one another.
it provides a new way of looking at pervasive computing, wherein "things", be they sensors, embedded devices, actuators or humans, interact with one another. currently, however, iot faces a number of challenges related to scalability, interoperability, storage capacity, processing power and security, all of which act as deterrents to its practical implementation. cloud computing, the buzzword of the it industry, is well suited to handle these challenges, which leads towards the integration of cloud and iot. in this paper, we present a layered architecture for cloud of things, i.e. the amalgamation of cloud computing and internet of things. the architecture provides a scalable approach for iot as it allows dynamic addition of n number of "things". moreover, the architecture allows end users to host their applications on the cloud and access iot systems remotely. towards the end, the paper discusses a use case that validates the proposed architecture.

key words: cloud of things, internet of things, cloud computing, ubiquitous computing

1. introduction

at present the internet has become a key aspect of everyone's life. from shopping malls to banks, from e-health to military equipment, the internet has made its mark. with the advancement of the internet, more and more people located at distant places around the globe are able to connect with one another. this expansion of the internet has given birth to the new idea of having every object connected to every other. soon the number of things connected to the internet will surpass the number of people living on earth. according to an estimate given by cisco, 50 billion devices would be connected through the internet by the year 2020.
the future will have things communicating with one another rather than with humans; in fact they would be talking on behalf of humans [1]. this rise in the outreach of the internet is gradually leading towards an era of the internet of things (iot), wherein the objects (things) connected to the internet share information with other objects as well as with humans. new ways of communication would evolve, allowing humans and things to communicate with one another. iot can be seen as a revolution in the field of computer science and would play a vital role in shaping the future of computing.

received june 30, 2015; received in revised form november 12, 2015
corresponding author: abhirup khanna, university of petroleum and energy studies (upes), bidholi, via prem nagar, dehradun, uttarakhand 248007, india (e-mail: abhirupkhanna@yahoo.com)

the term internet of things was introduced back in 1999 by kevin ashton. at that time many people thought it to be just another name for m2m communication and did not realize how big iot could become. it is true that the concept of iot follows the principles of m2m communication, but it cannot be considered another name for it [2]. m2m communication found its first applications in the late 1960s and early 1970s. it was a term used by the telecom industry to denote point to point communication. m2m communication merely connected embedded devices to one another through cellular or wired networks. the internet of things, on the other hand, goes far beyond this, having an ip based networking model along with the integration of sensors and embedded devices. iot allows various kinds of heterogeneous devices to connect to one another, collect data, exchange information and act on this information in the real world with the help of actuators. iot facilitates the use of wireless sensor networks (wsn) in order to collect information from sensors present at remote locations.
a wsn comprises n self powered sensing nodes connected through a wireless network. these nodes detect events, gather information and transmit this information to their base stations. to be precise, iot is not just about embedded devices connected to one another; rather, it consists of a large set of actors that together ensure its proper functioning. the actors that constitute the entire system of the internet of things include sensors, embedded devices (things), sensor networks, actuators and humans. sensors gather data that is transmitted to embedded devices through sensor networks. things process this data, generate information and exchange this information with one another or even with humans. specific actions are performed by the actuators or humans in accordance with the processed information.

in iot there is always data that is being exchanged, processed or depicted in the physical world. as the number of "things" increases, the data exchanged or processed by them will also increase, leading to an outburst of unstructured heterogeneous data. present day embedded devices lack the capabilities to store and process this huge amount of data, which points towards the integration of cloud computing and internet of things [3]. cloud computing is one of the major game changers in the field of computer science. nowadays, a lot is heard about cloud computing and how it is being implemented in every walk of life. cloud computing is a computing model that allows users to access resources on a pay as you use basis. cloud is built on the foundation of virtualization, thus allowing its users to access a virtually unlimited amount of resources from remote locations.
dynamic resource allocation, platforms to host heterogeneous services and applications, virtualization of resources, and virtually unlimited storage and processing capabilities are what make cloud the buzzword of the it industry. the integration of cloud computing and iot will give rise to a new computing paradigm having the benefits of both iot and cloud. this new paradigm can be referred to as cloud of things (cot), wherein cloud acts as a central control and processing unit and things are the real world entities which collect data and represent information in the form of suitable actions. but in order to make cot a reality there is an urgent need for an architecture that depicts its internal and external working. the architecture would define the various actors, along with their functionalities, required to construct an ecosystem for cot. the aim of this paper is to present such an architecture, representing the amalgamation of cloud and iot. in this paper we propose a layered architecture for cot that leverages the capabilities of cloud and explores the outreach of the internet of things.

the rest of the paper is organized as follows. section 2 discusses the challenges of iot and the benefits of its integration with cloud. section 3 reviews some of the architectures for the internet of things. section 4 presents the proposed architecture for cloud of things. section 5 gives a use case to validate the proposed architecture. finally, section 6 concludes the paper.

2. challenges for cloud in iot

so far we have discussed the benefits of iot and how its implementation could ease the way of living. but several challenges stand in the way of implementing iot and of its potential to become the future of computing [4]. below are some of the prominent challenges which need to be addressed before implementing iot.
1) interoperability: it is said that iot is based on diversity and not interoperability, but it is essential for an iot driven system to foster both technical and semantic interoperability [5]. every system working on the guidelines of iot should allow various kinds of heterogeneous devices to connect to one another. devices should be able to communicate among themselves irrespective of their operating systems or hardware configurations. semantic interoperability also needs to be ensured so that every device has a correct and consistent interpretation of the exchanged information.
2) data access and control: data sharing is an essential part of iot. it would be beneficial for all if various organizations could come together and share their data in order to gain useful insights. who can access and control this data remains an open question, as data ownership is still a concern for iot.
3) security and privacy: data integrity and privacy are a major concern for iot, as most of the data exchanged comprises users' personal information. issues such as protecting users' privacy and manufacturers' ip, and detecting and blocking malicious activity, fall under the security threats pertaining to iot [6]. implementing energy efficient data encryption schemes along with a proper authentication mechanism is a challenge for iot.
4) storage capacity: embedded devices used in iot lack the storage capabilities needed to store the huge volumes of data collected from various sensors. their inability to store large amounts of data makes the system inefficient and leads to the creation of incomplete data sets.
5) processing power: things involved in iot lack processing capabilities and are thus unable to process huge volumes of data. this lack of processing power leads to incomplete information which, when acted upon, leads to incorrect actions.
6) power consumption: the devices used in the internet of things, be they sensors or actuators, require power to run. new research is needed to promote the use of low power devices that consume little battery power and can run for years.
7) reliability: iot systems need to be reliable in order to meet industry standards. no single point of failure should hamper the working of the entire system. the system needs to be flexible, robust and fault tolerant in nature [7].
8) scalability: one major question related to the internet of things is how big it can become or, put differently, how scalable iot is [8]. there are very few systems or architectures that fully explore the scalability of iot. a lot of work needs to be done in designing systems for iot that facilitate a dynamic increase and decrease in the number of things.

considering the challenges mentioned above, cloud seems to be the best solution for all of them. integration of cloud with iot will allow iot systems to have access to virtually unlimited storage and processing capabilities along with efficient security mechanisms. amalgamation with cloud will provide flexibility, scalability and robustness to the entire system. cloud will also act as a platform where service providers can host their services and monitor the working of the entire system. for end users, cloud would act as an interface through which they can interact and communicate with their devices. the fusion of iot and cloud will also benefit cloud providers, as they will be able to extend the reach of their services to real world entities in a more dynamic and distributed manner.

3. related work

since the emergence of iot, many architectures have been proposed in order to implement it in a practical scenario. similarly, many frameworks have been proposed that exhibit a fusion of cloud and iot. in this section, we present some of the research works pertaining to this area.
• diat stands for distributed internet-like architecture for things. it is a layered architecture for iot that works on the principles of service oriented architecture (soa) and ensures minimum human involvement [9]. the architecture comprises three layers, namely the virtual object layer (vol), the composite virtual object layer (cvop) and the service layer. all three layers are combined, along with their functionalities, into a stack called the iot daemon. this daemon forms the core of the entire architecture. the vol acts as an interface between the physical and virtual worlds and is responsible for the virtual representation of objects. the work of the cvop is to ensure communication and interaction between the virtual objects present at the vol. last comes the service layer, whose work is to manage and monitor the various services. it can also initiate service creation on its own in order to make the entire system automated.
• marm, also known as multi agent based rfid middleware, is software built on the principles of agent oriented software engineering [10]. it also has a layered architecture, with three layers for device management, data management and user interface. another architecture, proposed in [11], makes use of a cell based structure in order to ease the traffic congestion between rfid readers and tags.
• next is an architecture which addresses the integration of cloud and iot. cloudthings is an architecture that aims at the integration of cloud computing and iot and interacts with all three delivery models of cloud (iaas, paas, saas) [12]. the purpose of the architecture is to enhance the experience of application development and management through the use of cloud computing.
• when dealing with the internet of things, mobile devices play a major role.
with the advancements in smartphone technology, mobile devices are a natural match for what we call a "thing" in iot. mosden focuses on this aspect and provides a middleware between a mobile device and iot [13]. the middleware uses mobile devices as sensing units and transmits the sensed data to the backend systems. the work of the developer is thus made easier, since code can be written at the backend rather than on the mobile device itself.
• thin clients have always been used to propagate the principles of ubiquitous computing, and now they are being used in designing systems for iot. the architecture proposed in [14] makes use of thin clients as thin servers which act as an interface for low level devices such as sensors and actuators. the architecture deploys communication protocols such as coap and http to facilitate communication between various applications and devices. the apps and thin servers make use of restful api calls to interact with the low end devices. the application model of the architecture works similarly to web mashups and enables developers to reuse their code in designing new services. for the discovery of new nodes the architecture relies on metadata such as rfid tags, names, geospatial information, etc.

4. proposed architecture

a scalable and robust architecture is required to ensure the proper working and implementation of a cot based ecosystem. the architecture must cope with the never ending requirements of the end user as well as tackle the challenges mentioned in section 2. constructing an architecture is the first step towards a solution. in this section we propose an architecture for cloud of things which would act as a blueprint for the technology, and we describe the various components that constitute it. below is a detailed description of the layered architecture, along with its various actors pertaining to cloud of things.
sensing layer: this layer comprises the various kinds of sensors present in the system. the sensors act as the eyes and ears of the system: they detect events, gather information and transmit it to the subsequent network layer. every sensor can be categorized on the basis of three parameters, namely sensor type, methodology and sensing parameters. sensor type defines which type of sensor it is, i.e. whether it is a homogeneous or a heterogeneous sensor, or a single dimensional or multidimensional sensor. methodology describes the way in which the sensor gathers information, which can be either active or passive. active sensing means direct collection of data, e.g. from an mri, while passive sensing means inferring data (e.g. blood pressure) from the data collected by active sensing. sensing parameters are the number of parameters which a sensor is able to sense. a sensor might sense just one parameter, like body temperature, or many parameters, as in the case of an ecg. the sensing layer may also comprise rfid readers which gather information from rfid tags. these rfid tags can store large amounts of information and can easily be attached to any object, be it an animal, a consumer product or a human being.

fig. 1 layered architecture for cloud of things

communication layer: it is also known as the network layer. the purpose of this layer is to maintain communication among the various sensors, things and humans. the three broad categories of communication that take place are:
• sensor to thing.
• thing to thing.
• human to thing.
it is the communication layer which receives information from the sensing layer and forwards it to the control layer. the network layer comprises two gateways which act as collection points to combine information collected from various sensors and rfid readers.
these gateways combine all forms of unstructured information and transmit it to the subsequent control layer. the communication layer makes use of several networks in order to maintain interaction at various levels. wireless sensor networks (wsn) form the core of the network layer. in a wsn, sensors are connected through a wireless network and transmit information to their respective hosts through wireless communication. another type of sensor network which the network layer uses is the body sensor network (bsn). it consists of sensing nodes that are placed inside or on a patient's body. the work of the sensing nodes is to monitor and sense physiological parameters of a patient, such as blood pressure and body temperature. the communication layer may also comprise nsg, i.e. net generation networks, which are a combination of body sensor networks and social networks. the communication layer works on the ip based networking model and provides a unique ip address to every node (sensor, thing, human) connected through the system. since the number of nodes in a system can increase at an exponential rate, the communication layer implements the ipv6 addressing scheme to map every node.

control layer: it is the most important layer of the entire architecture. it is the control layer which derives useful insights by performing computations over the data received from the communication layer. in technical terms, the control layer can be considered the cloud layer, as it is where all the data is stored and processed. the control layer is also known as the service layer, as it provides a platform for service providers to host their services. it also acts as a web portal for end users to add, delete and monitor their devices (things and sensors). any device can become a part of the system by registering itself. after successful registration every device is allotted a unique id and password.
the id is usually the ip address of the device, and the password is used for secure authentication. once the device is added, an entry for it is made in the cloud database. it is the scalable and robust nature of cloud that facilitates dynamic addition and removal of nodes. the control layer receives data from the communication layer and stores it in the databases. it then applies algorithms and performs computations on the stored data in order to find interesting patterns. the computations performed on the data depend on the service to which the data belongs. the results of processing are returned to the end user and are either acted upon by the actuators or represented in the form of useful information (knowledge).

actuation layer: the purpose of this layer is to represent information received from the control layer. it is the actuators which receive useful insights coming from the control layer and represent them in the physical world. the actuation layer comprises robotic arms, led screens, motors, pulleys, etc. the process of actuation can be either manual or automatic. in manual actuation, human intervention is involved and results are carried out by humans based upon the suggestions given by the control layer, whereas in automatic actuation the actuators work on their own in accordance with the information received from the control layer.

5. proof of concept

over the past few years internet based technologies have found their way into numerous health care applications. with the advancements in sensor technologies, iot is finding use in several medical applications [15]. iot aims at easing people's lives, and it does the same when dealing with patients. iot based systems are able to provide convenience to both patients and doctors by offering services such as real time monitoring of the patient, health management, emergency management and patient information management.
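as a rough illustration of what real time monitoring with emergency handling amounts to on the cloud side, the following python sketch checks each reading against a normal range and returns the actions to trigger. the parameter names, the numeric ranges and the action strings are assumptions made purely for illustration; they are placeholders, not clinical guidance, and none of them is specified in the paper.

```python
from dataclasses import dataclass

# Hypothetical normal ranges (placeholders for illustration only,
# not clinical guidance and not taken from the paper).
NORMAL_RANGES = {
    "systolic_bp_mmhg": (90.0, 140.0),
    "body_temp_c": (36.0, 37.5),
    "pulse_bpm": (60.0, 100.0),
}


@dataclass
class Reading:
    """One sensor reading as it might arrive at the cloud application."""
    parameter: str
    value: float


def out_of_range(readings, ranges=NORMAL_RANGES):
    """Return the names of parameters whose value lies outside its range."""
    flagged = []
    for reading in readings:
        low, high = ranges[reading.parameter]
        if not (low <= reading.value <= high):
            flagged.append(reading.parameter)
    return flagged


def decide_actions(readings, ranges=NORMAL_RANGES):
    """Map flagged parameters to actions (action names are assumptions)."""
    flagged = out_of_range(readings, ranges)
    if not flagged:
        return []
    # One alert per abnormal parameter, then a single escalation step.
    actions = ["alert:" + name for name in flagged]
    actions.append("escalate_emergency")
    return actions
```

calling `decide_actions` with a list of `Reading` objects yields an empty list while all values are in range, and one alert per abnormal parameter followed by a single escalation entry otherwise. keeping the ranges in a plain dictionary means a service provider could swap them per patient without touching the checking logic.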
in order to validate the proposed architecture we created a test bed for it. the use case is a health care monitoring system. the system monitors the physiological parameters of a patient in real time and takes suitable actions if the value of a parameter goes out of range. the system also monitors the geospatial location of the patient with the help of a gps sensor. the working of the health care monitoring system is as follows.
• the system monitors three physiological parameters of a patient, namely blood pressure, body temperature and pulse rate. the sensors used for this purpose are a pressure sensor, a temperature sensor and a pulse sensor.
• all the sensors are connected to an arduino board. the board is also connected to a 2.4" tft lcd screen in order to display suitable information.
• the raw data is collected from the sensors and transmitted to an application running on the cloud. data is transferred using the internet protocol. the application stores the data in the cloud database and checks whether the parameters lie in the normal range or not.
• if any of the parameters goes beyond its normal range, the application communicates with the arduino board.
• the arduino board takes suitable actions, such as displaying the name of a prescribed medicine or transmitting the coordinates of the patient to the application by communicating with the gps sensor.
• once the application has received the geospatial coordinates, it can book an appointment with the doctor for a house visit or call an ambulance to that specific location.

6. conclusion

over the last decade, the internet has changed drastically, as have the needs of its users. with the growing popularity of the internet, the number of users accessing it has also increased. in this paper, we propose an architecture for cloud of things, which is an amalgamation of cloud computing and internet of things.
cot is a new age technology that copes with the ever increasing size of the internet as well as with the never ending requirements of the end users. the proposed architecture describes the various actors, along with their functionalities, that are required to set up a cloud integrated iot system. upcoming technologies such as fog computing or cloudlets will be more appropriate for iot than cloud, which should lead researchers to explore new avenues related to them. in the future, people from different walks of life can make use of this architecture to implement a cot system in a practical scenario.

references
[1] m. gomes, r. da rosa righi, c. da costa, “internet of things scalability: analyzing the bottlenecks and proposing alternatives”, in proceedings of the ieee 6th international congress on ultra modern telecommunications and control systems and workshops (icumt), 2014, pp. 269-276.
[2] c. doukas, l. capra, f. antonelli, e. jaupaj, a. tamilin, i. carreras, “providing generic support for iot and m2m for mobile devices”, in proceedings of the ieee rivf international conference on computing & communication technologies-research, innovation, and vision for the future (rivf), 2015, pp. 192-197.
[3] s. w. kum, j. moon, t. lim, j. i. park, “a novel design of iot cloud delegate framework to harmonize cloud-scale iot services”, in proceedings of the ieee international conference on consumer electronics (icce), 2015, pp. 247-248.
[4] v. gazis, m. goertz, m. huber, a. leonardi, k. mathioudakis, a. wiesmaier, f. zeiger, “short paper: iot: challenges, projects, architectures”, in proceedings of the 18th international conference on intelligence in next generation networks (icin), 2015, pp. 145-147.
[5] o. vermesan, p. friess, internet of things-global technological and societal trends from smart environments and spaces to green ict, river publishers, 2011.
[6] r. h.
weber, “internet of things–new security and privacy challenges”, computer law & security review, vol. 26, no. 1, pp. 23-30, 2010.
[7] h. d. ma, “internet of things: objectives and scientific challenges”, journal of computer science and technology, vol. 26, no. 6, pp. 919-924, 2011.
[8] d. miorandi, s. sicari, f. de pellegrini, i. chlamtac, “internet of things: vision, applications and research challenges”, ad hoc networks, vol. 10, no. 7, pp. 1497-1516, 2012.
[9] c. sarkar, a. uttama nambi sn, r. prasad, a. rahim, r. neisse, g. baldini, diat: a scalable distributed architecture for iot, 2012.
[10] l. v. massawe, f. aghdasi, j. kinyua, “the development of a multi-agent based middleware for rfid asset management system using the passi methodology”, in proceedings of the sixth international conference on information technology: new generations (itng'09), 2009, pp. 1042-1048.
[11] a. solanas, j. domingo-ferrer, a. martínez-ballesté, v. daza, “a distributed architecture for scalable private rfid tag identification”, computer networks, vol. 51, no. 9, pp. 2268-2279, 2007.
[12] j. zhou, t. leppanen, e. harjula, m. ylianttila, t. ojala, c. yu, l. t. yang, “cloudthings: a common architecture for integrating the internet of things with cloud computing”, in proceedings of the ieee 17th international conference on computer supported cooperative work in design (cscwd), 2013, pp. 651-657.
[13] c. perera, p. p. jayaraman, a. zaslavsky, d. georgakopoulos, p. christen, “mosden: an internet of things middleware for resource constrained mobile devices”, in proceedings of the 47th hawaii international conference on system sciences (hicss), 2014, pp. 1053-1062.
[14] m. kovatsch, s. mayer, b. ostermaier, “moving application logic from the firmware to the cloud: towards the thin server architecture for the internet of things”, in proceedings of the sixth international conference on innovative mobile and internet services in ubiquitous computing (imis), 2012, pp. 751-756.
[15] j. choi, m.
ha, j. im, j. byun, k. kwon, w. yoon, d. kim, “the patient-centric mobile healthcare system enhancing sensor connectivity and data interoperability”, in proceedings of the international conference on recent advances in internet of things (riot), 2015, pp. 1-6.

facta universitatis series: electronics and energetics vol. 28, no 4, december 2015, pp. 557-570 doi: 10.2298/fuee1504557p

conversion model of the radiation-induced interface-trap buildup and its hardness assurance application

vyacheslav sergeevich pershenkov
national research nuclear university mephi (moscow engineering physics institute), russia

abstract. a model is presented which confirms that the interaction of trapped positive charges (hydrogenous species) in the oxide with electrons from the substrate is an important component of radiation-induced interface-trap buildup. the “one-to-$k_{oi}$” relationship between the number of trapped holes annealed and the number of interface traps generated is used for prediction of mos device response in the space environment. a model of enhanced low dose rate sensitivity (eldrs) is proposed. the eldrs conversion model is based on the assumption that there are two types of traps: shallow and deep. the time constants of these traps are different and correspond to interface-trap buildup at high dose rates for shallow traps and at low dose rates for deep traps. a possible physical mechanism for the elimination of the eldrs effect in silicon-germanium (sige) bipolar transistors is described. an original mechanism of interface-trap buildup saturation based on the radiation-induced charge neutralization (ricn) effect is presented.

key words: mos device, bipolar device, interface trap, conversion model, eldrs, hardness assurance

1. introduction

total ionizing dose effects in mos and bipolar devices for space electronics are associated with the buildup of radiation-induced positive oxide trapped charge $Q_{ot}$ and interface traps $N_{it}$.
electron-hole generation, initial hole yield, continuous-time-random-walk, deep hole trapping and annealing are described in detail in [1], and the physical model of [1] is commonly used. the most developed model of radiation-induced interface-trap buildup is the two-stage “hydrogen” model [2,3]. the other model (the so-called “conversion” model [4,5]) is based on the assumption that the generation of interface traps is connected with the neutralization of positive charge by substrate or radiation-induced electrons.

in this work the conversion model of interface-trap buildup is used to estimate the long-term operation of mos and bipolar devices in the space environment. introducing a quantitative relationship between the two physical processes makes it possible to develop numerical prediction methods for estimating the long-term operation of mos and bipolar devices in space missions. the use of the conversion model for the description of the low dose rate effect in sige transistors and of interface-trap buildup saturation is also described.

received april 30, 2015
corresponding author: vyacheslav sergeevich pershenkov, national research nuclear university mephi (moscow engineering physics institute), russia (e-mail: vspershenkov@mephi.ru)

2. conversion model of interface-trap buildup

radiation-induced buildup of interface traps $N_{it}$ is a problem that has been known for the last 35 years [2,3]. in addition to the work [4], where interface-trap generation is connected with electron capture by trapped holes, less widely known experimental results are described in [5]. the experimental dependence of the threshold voltage shift $\Delta V_{it}$ (caused by interface-trap buildup) on the annealing time for four different tests is presented in fig. 1. the maximum change of $\Delta V_{it}$ is observed in test 1, when both electrons and hydrogenous species are present near the surface.
in the other cases, when there are no electrons (test 2), no hydrogen species (test 3), or neither is present near the interface (test 4), the shift ∆vit is substantially reduced. these experimental data confirm the hypothesis that the presence of hydrogen alone is not enough for effective interface-trap buildup. the interaction between hydrogen complexes and electrons from the substrate is an important component of this process.

fig. 1 interface-trap component of the threshold voltage shift ∆vit versus the annealing time in the hydrogen atmosphere (after ref. [5])

3. prediction of mos devices response in space environment

the total radiation-induced threshold voltage shift ∆vth is usually separated into the components due to oxide-trapped (∆vot) and interface-trap (∆vit) charge buildup:

$\Delta v_{th} = \Delta v_{ot} + \Delta v_{it}$ (1)

to separate the accumulation and annealing effects, which occur simultaneously during irradiation, the technique of linear response theory can be used. at time t the ∆vot response to an arbitrary irradiation starting at t = 0 and described by the dose rate function γ(t) can be obtained through the convolution integral [6]:

$\Delta v_{ot}(t) = \int_0^t \gamma(t')\,\Delta v_r(t - t')\,dt'$ (2)

where ∆vr(t - t') is the impulse response function. to describe the annealing process we use the equation for ∆vr introduced in [6]. if after the end of irradiation at t → ∞ all trapped holes are completely annealed, the impulse response function ∆vr is given by [6]

$\Delta v_r(t) = \Delta v_0 / (1 + t/t_0)^{\nu}$, (3)

where ∆v0, t0 and ν are fitting constants. for irradiation time tir, using this impulse response function with γ(t) = γ0 for t < tir and γ(t) = 0 for t > tir, we have

$\Delta v_{ot}(t) = c\,[(1 + t/t_0)^{1-\nu} - 1], \quad t \le t_{ir}$, (4a)

$\Delta v_{ot}(t) = c\,[(1 + t/t_0)^{1-\nu} - (1 + (t - t_{ir})/t_0)^{1-\nu}], \quad t > t_{ir}$, (4b)

where $c = \gamma_0 t_0 \Delta v_0 / (1 - \nu)$. similar equations were derived in [6].
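as a cross-check of the linear-response step, the sketch below evaluates the convolution integral (2) numerically for a constant dose rate and compares it with the closed forms (4a) and (4b). it assumes the impulse response (3) has the form ∆v0/(1 + t/t0)^ν with ν < 1; the function names, the midpoint-rule integration and the parameter values are illustrative assumptions, not part of the original work.

```python
def impulse_response(t, dv0, t0, nu):
    """Eq. (3): shift per unit dose remaining a time t after a dose impulse."""
    return dv0 / (1.0 + t / t0) ** nu


def dvot_closed_form(t, t_ir, rate, dv0, t0, nu):
    """Eqs. (4a)/(4b): constant dose rate `rate` applied for 0 <= t <= t_ir."""
    c = rate * t0 * dv0 / (1.0 - nu)
    grow = (1.0 + t / t0) ** (1.0 - nu)
    if t <= t_ir:
        return c * (grow - 1.0)
    return c * (grow - (1.0 + (t - t_ir) / t0) ** (1.0 - nu))


def dvot_numeric(t, t_ir, rate, dv0, t0, nu, steps=20000):
    """Eq. (2) by the midpoint rule; the dose rate is zero after t_ir."""
    upper = min(t, t_ir)
    h = upper / steps
    total = 0.0
    for i in range(steps):
        t_prime = (i + 0.5) * h
        total += rate * impulse_response(t - t_prime, dv0, t0, nu) * h
    return total


if __name__ == "__main__":
    # Illustrative (assumed) values: rad/s, V/rad, s, dimensionless.
    params = dict(rate=100.0, dv0=14e-6, t0=1.5, nu=0.081)
    for t in (50.0, 100.0, 1000.0):
        closed = dvot_closed_form(t, 100.0, **params)
        numeric = dvot_numeric(t, 100.0, **params)
        print(f"t = {t:7.1f} s  closed = {closed:.6e} V  numeric = {numeric:.6e} V")
```

for ν = 0 the closed form reduces to ∆v0 multiplied by the accumulated dose, which is the no-annealing limit.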
if no annealing occurs (ν = 0), the threshold voltage shift would reach its maximum value

$\Delta v_{ot\_max}(t) = \Delta v_0\, d$, (5)

where d is the total absorbed dose. the threshold voltage shift ∆vit includes fast and slow components. we suppose that for times greater than about 10^-3 s the fast component is proportional to the dose

$\Delta v_{it\_fast}(t) = \Delta v_i\, d$, (6)

where ∆vi is a fitting constant. according to the conversion model of interface-trap buildup, the interface state density is proportional to the decrease of positive charge, i.e. there is a conversion coefficient koi which reflects the strong correlation between the accumulation of slow interface states and trapped hole annealing. following this approach we can write for the slow interface density component ∆nit_slow:

$\Delta n_{it\_slow} = k_{oi}\,(\Delta n_{ot\_max} - \Delta n_{ot})$, (7)

where ∆not_max corresponds to ∆vot_max. in this case for the slow component we have:

$\Delta v_{it\_slow} = k_{oi}\,(\Delta v_{ot\_max} - \Delta v_{ot})$, (8)

note that the annealing of interface traps is ignored, because at room temperature they decay with a time constant of several years. finally, we have the analytical equation for the interface-trap voltage shift:

$\Delta v_{it} = (k_{oi}\,\Delta v_0 + \Delta v_i)\, d - k_{oi}\,\Delta v_{ot}$, (9)

the practical formula for hardness assurance application of the mosfet voltage shift response can be derived from equation (1):

$\Delta v_{th} = (k_{oi}\,\Delta v_0 + \Delta v_i)\, d + (1 - k_{oi})\,\Delta v_{ot}$, (10)

where ∆vot is calculated using (4a). equation (10) has five fitting parameters: koi, ∆v0, ∆vi, t0 and ν, which can be found numerically using experimental data obtained in laboratory tests with high dose rate irradiation. there are several approaches to the fitting procedure: solving a nonlinear least squares problem for the five unknown parameters, applying separation techniques, and so on. a more convenient approach is to find the three constants ∆v0, t0 and ν from the experimental data on ∆vot and the two constants koi and ∆vi from the analysis of ∆vit(t).
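a minimal sketch of how relationships (9) and (10) could be evaluated once the five fitting parameters are known. it assumes a constant dose rate during irradiation, the closed forms (4a)/(4b) for ∆vot, and the sign convention of the text (all shifts treated as magnitudes); the function names, parameter values and the two dose rates are illustrative assumptions only.

```python
def dvot(t, t_ir, rate, dv0, t0, nu):
    """Oxide-trapped-charge shift, eqs. (4a)/(4b), for a constant dose rate."""
    c = rate * t0 * dv0 / (1.0 - nu)
    grow = (1.0 + t / t0) ** (1.0 - nu)
    if t <= t_ir:
        return c * (grow - 1.0)
    return c * (grow - (1.0 + (t - t_ir) / t0) ** (1.0 - nu))


def dose(t, t_ir, rate):
    """Total absorbed dose accumulated up to time t."""
    return rate * min(t, t_ir)


def dvit(t, t_ir, rate, dv0, t0, nu, koi, dvi):
    """Interface-trap shift, eq. (9): fast part plus converted trapped holes."""
    d = dose(t, t_ir, rate)
    return (koi * dv0 + dvi) * d - koi * dvot(t, t_ir, rate, dv0, t0, nu)


def dvth(t, t_ir, rate, dv0, t0, nu, koi, dvi):
    """Total threshold-voltage shift, eq. (10)."""
    d = dose(t, t_ir, rate)
    return (koi * dv0 + dvi) * d + (1.0 - koi) * dvot(t, t_ir, rate, dv0, t0, nu)


if __name__ == "__main__":
    # Assumed fitting parameters (V/rad, s, dimensionless); not measured data.
    p = dict(dv0=14e-6, t0=1.5, nu=0.081, koi=0.73, dvi=2.5e-7)
    total_dose = 1.0e4  # rad, hypothetical mission dose
    for rate in (100.0, 0.01):  # laboratory-like and space-like dose rates, rad/s
        t_ir = total_dose / rate
        print(f"rate = {rate:g} rad/s: "
              f"dvot = {dvot(t_ir, t_ir, rate, p['dv0'], p['t0'], p['nu']):.4f} V, "
              f"dvit = {dvit(t_ir, t_ir, rate, **p):.4f} V, "
              f"dvth = {dvth(t_ir, t_ir, rate, **p):.4f} V")
```

evaluating the same total dose at two dose rates illustrates the intended use: parameters fitted at a high laboratory dose rate are reused to estimate the response at a much lower rate.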
the constants can be extracted from at least three experimental points of ∆vot and ∆vit versus t. a reasonable time for the first measurement is 1 s after the end of irradiation. monte-carlo simulation shows that the second point can correspond to the interval 2 tir and the third measurement can be done at 100 tir [7]. the results of parameter extraction for our experimental data as well as for data taken from [8-11] are listed in table 1.

table 1 parameters extracted from experimental data (after ref. [7]). a "-" marks a value that is not legible in the source.

data | vg (v) | ∆v0 (10^-6 v/rad) | t0 (s) | ν | koi | ∆vi (10^-7 v/rad)
[8], fig 2 | 5.0 | 0.35 | 26 | 0.082 | 0.0 | 0.6
[9], fig 1 | 6.0 | 14 | 1.5 | 0.081 | 0.73 | 2.5
[9], fig 4 | 6.0 | 3.6 | 0.018 | 0.078 | 0.44 | 4.5
[10], fig 5 | 5.0 | - | 8900 | 0.405 | 0.25 | -
[11], fig 13 | 2.5 | 21 | 110 | 0.1 | 0.41 | 12
experiment: n-channel, 30nm | 0 | 1.1 | 0.0004 | 0.074 | 0.0 | 1.5
experiment: n-channel, 30nm | 2.5 | 0.83 | 0.0016 | 0.083 | 0.12 | 0.14
experiment: n-channel, 30nm | 5.0 | 0.6 | 0.019 | 0.092 | 0.12 | 0.0028
experiment: n-channel, 100nm | 0 | 20 | 15 | 0.026 | 1.0 | 56
experiment: n-channel, 100nm | 2.5 | 23 | 16 | 0.035 | 1.0 | 59
experiment: n-channel, 100nm | 5.0 | 22 | 48 | 0.078 | 1.0 | 98

4. low dose rate effect in bipolar devices

the low dose rate effect in bipolar transistors, or enhanced low-dose-rate sensitivity (eldrs), consists in more severe degradation of the current gain of a bipolar structure for a given total dose when the dose is accumulated at a low dose rate [12]. the eldrs model in this work is based on the hydrogen-electron (h-e) conversion model. the motivation for this development is to create a model that allows a quantitative numerical estimation of the radiation degradation of bipolar transistor current gain for an arbitrary dose rate and temperature. because the h-e model is based on the conversion of radiation-induced positive trapped charge to interface traps, the model described below is called the eldrs conversion model.
to explain the classical annealing of radiation-induced positive charge [13] and the reversibility of the annealing effect [14], it is necessary to consider two positions of positive centers in the oxide forbidden gap: the non-rechargeable centers located about 1 ev above the sio2 valence band [12], and the rechargeable part of the oxide trapped charge located opposite the silicon forbidden gap [13]. direct tunneling of substrate electrons to positive centers located opposite the silicon forbidden gap is impossible because the energy of the tunneling electron must remain constant (a basic principle of quantum mechanics). but tunneling to thermally activated positive centers is still possible. the energy level of a positive center can reach the silicon conduction band due to thermally excited vibration of the lattice (fig. 2,a). the positive charge can also be neutralized by hole emission to the silicon valence band (fig. 2,b). below, the case of an interaction of a positive charge and an electron (fig. 2,a) is considered.

fig. 2 conversion of oxide charge (qot)rech to interface trap nit: capture of an electron e (a), emission of a hole h (b). ec and ev are the energy levels of the si conduction and valence bands

an interaction of thermally excited rechargeable positive charges and tunneling substrate electrons leads, according to the conversion model, to interface-trap buildup. the physical nature of the conversion process can be connected with a change in the distance between the positive si+ and neutral si0 atoms (eγ′ center, hole trap) after electron capture by the eγ′ center [15]. the probability of excitation of a positive oxide center up to the conduction band depends on its energy depth in the oxide relative to the si forbidden gap. the shallow oxide traps (near the conduction band) are converted in a short time, while the deep traps (opposite the middle of the si forbidden gap) need much more time for conversion.

for simplicity, it is supposed that there are two kinds of oxide traps: shallow traps with a short conversion time, responsible for the degradation at high dose rates, and deep traps, which determine the increase of the excess base current at long irradiation times, i.e. at low dose rates (fig. 3). the shallow traps are converted with time constant τs; the conversion time of the deep traps is τd. essentially, the conversion time of the deep traps τd is responsible for eldrs.

fig. 3 the shallow (qot)s and deep (qot)d oxide trapped charges with conversion times τs and τd

as shown in [16], the degradation of the base current as a function of dose rate (for irradiation times much longer than 1 s) can be written as:

$\Delta i_b = (k_s + k_d)\, d - k_d\, \gamma\, \tau_d\, [1 - e^{-d/(\gamma \tau_d)}]$, (11)

where ks is the excess base current per unit dose at high dose rate; kd is the excess base current per unit dose at low dose rate; γ is the dose rate; d is the total dose. the conversion of oxide charge to interface traps is a thermally stimulated process. to account for the effect of temperature on base current degradation, the dependence of the deep trap conversion time on temperature is introduced. the temperature dependence of the time constant τd can be described by the arrhenius equation:

$\tau_d = \tau_{d0} \exp(e_a / kt)$, (12)

where τd is the conversion time of deep traps; t is temperature; ea is the activation energy of the oxide trap thermal excitation; k is the boltzmann constant; τd0 is the pre-exponential coefficient. thus the eldrs conversion model has 4 fitting parameters: ks, kd, ea and τd0. they are extracted by the following steps presented in [17]:
1. the constant ks, which determines the contribution of shallow trapped charge conversion to base current degradation, is estimated as the ratio of the base current degradation to the specified total dose at 10 rad(sio2)/s irradiation.
2.
the deep trap conversion time τd is estimated from data of the post-irradiation anneal following high dose rate irradiation to the specified total dose. the pre-exponential constant τd0 and the activation energy ea in (12) are derived from the data for two different temperatures of elevated-temperature post-irradiation anneal.
3. the constant kd, which determines the contribution of deep trapped charge conversion to base current degradation at low dose rate, is estimated from elevated-temperature irradiation data. kd is derived from (11), where the constant τd for the elevated temperature used is calculated from (12) (the values of τd0 and ea are determined in step 2).

the eldrs conversion model was validated by comparison with previously reported experimental data. two examples are shown below. fig. 4 shows calculated results obtained from relationship (11) together with the experimental results of [18]. relationship (11) describes the experimental data [18] well for the following values of the fitting constants: ks = 1.35·10^-3 na/rad(sio2), kd = 8.65·10^-3 na/rad(sio2), τd = 2.2·10^5 s (for lateral pnp) and ks = 0.16·10^-3 na/rad(sio2), kd = 1.49·10^-3 na/rad(sio2), τd = 5.0·10^5 s (for substrate pnp). the same results for [19] are shown in fig. 5. the fitting constants for that case are: ks = 0.33·10^-3 na/rad(sio2), kd = 6.33·10^-3 na/rad(sio2), τd = 3.0·10^5 s.

fig. 4 excess base current versus dose rate. experimental [18] and calculated data from relationship (11).

fig. 5 excess input base current of lm158 versus dose rate. experimental [19] (dots) and calculated data from conversion model (11).

the proposed conversion model also explains why the base current starts growing 10^5 s after the cessation of the short-term, high dose rate irradiation [19]. the reason is that the charge in the deep oxide traps has no time to be converted into interface traps during the short-term, high dose rate irradiation.
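the dose-rate dependence and the two-temperature extraction of step 2 can be sketched as follows. the sketch assumes that relationship (11) has the form ∆ib = (ks + kd)·d - kd·γ·τd·(1 - exp(-d/(γ·τd))), which tends to ks·d at high dose rate and to (ks + kd)·d at low dose rate, and that (12) is τd = τd0·exp(ea/kt). the function names, the total dose and the anneal temperatures are hypothetical; only the lateral pnp fitting constants are taken from the text.

```python
import math

K_B_EV = 8.617e-5  # Boltzmann constant in eV/K


def excess_base_current(dose, rate, ks, kd, tau_d):
    """Assumed form of eq. (11); expm1 avoids cancellation at high dose rate."""
    x = dose / (rate * tau_d)
    return (ks + kd) * dose + kd * rate * tau_d * math.expm1(-x)


def tau_deep(temp_k, tau_d0, ea_ev):
    """Eq. (12): Arrhenius dependence of the deep-trap conversion time."""
    return tau_d0 * math.exp(ea_ev / (K_B_EV * temp_k))


def arrhenius_from_two_points(t1_k, tau1, t2_k, tau2):
    """Solve eq. (12) for (tau_d0, ea) from conversion times at two temperatures."""
    ea_ev = K_B_EV * math.log(tau1 / tau2) / (1.0 / t1_k - 1.0 / t2_k)
    tau_d0 = tau1 * math.exp(-ea_ev / (K_B_EV * t1_k))
    return tau_d0, ea_ev


if __name__ == "__main__":
    # Lateral pnp constants quoted in the text: nA/rad(SiO2), nA/rad(SiO2), s.
    ks, kd, tau_d = 1.35e-3, 8.65e-3, 2.2e5
    total_dose = 2.0e4  # rad(SiO2), hypothetical
    for rate in (1e-3, 1e-2, 1e-1, 1.0, 10.0, 100.0):
        d_ib = excess_base_current(total_dose, rate, ks, kd, tau_d)
        print(f"rate = {rate:8.3f} rad/s  excess base current = {d_ib:7.2f} nA")
```

using `math.expm1` keeps the high dose rate limit numerically stable, where d/(γ·τd) is very small.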
it is not accidental that the measured value τd = 3.0·10^5 s is of the same order of magnitude as the onset delay in [19].

5. eldrs in sige transistors

the activation energy of a deep positive oxide center with energy eot in the oxide (fig. 6) can be presented as the sum of the energy of thermal excitation ∆ed from eot to the electron energy at the conduction band edge ec and the energy of elastic coupling of the positive center with lattice atoms:

$e_a = \Delta e_d + e_{latt}$, (13)

where ea is the activation energy of the positive oxide trap; ∆ed = ec - eot; ec is the electron energy at the conduction band edge; eot is the energy level of the positive trap in the oxide; elatt is the energy of elastic coupling of the positive center with lattice atoms. in sige hbts the ge content causes bandgap narrowing in the base region. the bandgap narrowing ∆eg reduces the energy interval (∆ed)sige needed for an interaction of the thermally excited deep oxide traps and tunneling substrate electrons. this reduces the deep trap conversion time, so that during irradiation at any dose rate all oxide trapped charges have time to be converted into interface traps. as a result, deep traps can act as shallow traps, and eldrs is eliminated. the reduction of the excitation energy needed for conversion of deep traps in sige transistors depends on the bandgap narrowing ∆eg of the base region under the base spacer interface:

$(\Delta e_d)_{sige} = (\Delta e_d)_{si} - \Delta e_g$, (14)

where (∆ed)sige is the thermal excitation energy for conversion of the oxide deep traps in a sige transistor; (∆ed)si is the thermal excitation energy for conversion of the oxide deep traps in a conventional si transistor; ∆eg is the bandgap narrowing of the base region under the base spacer interface of the sige hbt.

fig. 6 the energy of thermal excitation ∆ed from the level of the positive trap in the oxide eot to the conduction band edge ec. eg is the bandgap of the semiconductor.
it can be shown using the results of [16] that for conventional bipolar devices the deep traps are located 0.21 ev to 0.29 ev below the edge of the conduction band. fig. 7 presents the effect of bandgap narrowing on the excitation energy ∆ed which is sufficient for conversion of deep traps into interface traps. line 1 in fig. 7 corresponds to the initial value ∆ed = 0.29 ev, line 2 to the initial value ∆ed = 0.21 ev. the dotted line shows the boundary between the eldrs region and the region where eldrs is absent (eldrs-free).

fig. 7 the effect of bandgap narrowing ∆eg on the excitation energy ∆ed. the dotted line presents the boundary between the eldrs region and the region where eldrs is absent (eldrs-free).

we consider that the eldrs boundary (existence or absence of eldrs) corresponds to ∆ed = 0.12 ev, for the following physical reason. the spreading of the energy location of the positive oxide traps by thermal excitation can be estimated as ±(2-3) kt. this means that shallow and deep energy levels can be distinguished as different traps if the energy gap between their locations is more than approximately 5 kt, or about 0.125 ev at room temperature. for ∆ed more than 0.12 ev the shallow and deep oxide traps act as different traps and eldrs can be observed (above the dotted line in fig. 7). for ∆ed less than 0.12 ev the shallow and deep oxide traps are equivalent to one trap and eldrs cannot be observed (under the dotted line in fig. 7). in sige hbts the bandgap narrowing is of the order of 0.1 ev to 0.2 ev. fig. 8 shows the valence band offset as a function of ge content [20, fig.9].

fig. 8 valence band offset as a function of ge content (after ref. [20]).

therefore, for sige devices eldrs will not be observed (no-eldrs region in fig. 7) if the bandgap narrowing is more than 0.1 ev (line 2) or 0.18 ev (line 1). it is very probable that the parameters of modern sige hbts lie within the “no eldrs” region.
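the eldrs-free criterion can be written down in a few lines. the sketch below assumes relationship (14) in the form (∆ed)sige = (∆ed)si - ∆eg and the 0.12 ev boundary stated above; the function names and the chosen temperature of 290 k are illustrative assumptions.

```python
K_B_EV = 8.617e-5          # Boltzmann constant in eV/K
ELDRS_BOUNDARY_EV = 0.12   # boundary value of the excitation energy from the text


def excitation_energy_sige(d_ed_si, d_eg):
    """Eq. (14): excitation energy left after bandgap narrowing d_eg (all in eV)."""
    return d_ed_si - d_eg


def is_eldrs_free(d_ed_si, d_eg, boundary=ELDRS_BOUNDARY_EV):
    """True when deep traps behave like shallow ones, i.e. below the boundary."""
    return excitation_energy_sige(d_ed_si, d_eg) < boundary


def thermal_spread_ev(temp_k, factor=5.0):
    """Energy separation of about 5 kT used to tell shallow and deep traps apart."""
    return factor * K_B_EV * temp_k


if __name__ == "__main__":
    print(f"5 kT at 290 K = {thermal_spread_ev(290.0):.4f} eV")
    for d_ed_si in (0.21, 0.29):          # deep-trap range for conventional si devices
        for d_eg in (0.05, 0.10, 0.18):   # hypothetical bandgap narrowing values
            state = "eldrs-free" if is_eldrs_free(d_ed_si, d_eg) else "eldrs"
            print(f"d_ed(si) = {d_ed_si:.2f} eV, d_eg = {d_eg:.2f} eV -> {state}")
```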
this conclusion agrees with the experimental data of [20], which states: “and to first order, enhanced low dose rate sensitivity (eldrs) is not observed in sige hbts, which is clearly good news since it is a traditional concern in most si bjt technologies” [20, page 2001]. the eldrs conversion model can give a physical explanation of this statement.

6. saturation of the radiation-induced interface-trap buildup

the analysis of this section is based on the assumption that the positive charge of trapped holes in the oxide is transformed through electron capture into a new defect (the ad center) with two energy states in the forbidden gap of si [21]. this is a point defect for which the higher energy level is acceptor-like and the lower energy level is donor-like. the following process of ad center generation and annihilation is proposed. the strained si-si bond (oxygen vacancy) serves as the precursor for this radiation-induced defect. this precursor can be treated as a non-activated donor center d. the radiation-induced holes are captured by deep d traps, creating a positively charged $d^+$ center: $d + h = d^+$. free electron capture by the $d^+$ center causes its transformation to the two-level ad center: $d^+ + e = a^0d^0$. the ad defect can be found in four different states: $a^0d^0$, $a^-d^0$, $a^0d^+$, $a^-d^+$. the superscripts after a and d designate the charge state of the acceptor and donor levels respectively: $a^0d^0$: the acceptor level is empty, the donor level is occupied; $a^-d^0$: the acceptor and donor levels are occupied; $a^0d^+$: the acceptor and donor levels are empty; $a^-d^+$: the acceptor level is occupied, the donor level is empty. the charge exchange of $a^0d^0$ with radiation-induced or substrate electrons leads to $a^-d^0$ and $a^0d^+$.
the charge state $a^-d^+$ cannot be stable and is assumed to relax immediately back to the d precursor due to the energy released during the electron transition from the higher (a) to the lower (d) level. therefore, the appearance of the $a^-d^+$ state leads to the annihilation of the ad center. the saturation can be explained by two competing processes: accumulation and annihilation (annealing). in mathematical form it can be written

$dn_{it}/dt = g - n_{it}/(\tau_{ann})_{it}$, (15)

where g is the accumulation rate of interface traps; nit is the density of interface traps; (τann)it is the time constant of interface state annihilation. in saturation, dnit/dt = 0 and nit reaches the saturated value

$(n_{it})_{sat} = g\,(\tau_{ann})_{it}$, (16)

the accumulation rate of nit buildup is proportional to the dose rate

$g = (k_{acc})_{it}\,\gamma$, (17)

where (kacc)it is a coefficient characterizing interface trap accumulation; γ is the dose rate. therefore

$(n_{it})_{sat} = (k_{acc})_{it}\,(\tau_{ann})_{it}\,\gamma$, (18)

the value of (nit)sat is proportional to the dose rate γ if (kacc)it and (τann)it are constants. but, as follows from experimental data, the interface trap concentration in saturation (nit)sat is a very weak function of the dose rate. changing the dose rate by more than 4 orders of magnitude, from 300 krad (si)/min to 13 rad (si)/min, leads to a very small variation of (nit)sat [22]. the same result is obtained in [23,24], where the saturation of nit was observed when the dose rate was changed from 333 rad (sio2) to 5.25 rad (sio2). the coefficient (kacc)it is a very weak function of the dose rate. this follows from the linear dependence of nit buildup at small total doses, which agrees with numerous experimental data reported in [22, 24, 25]. the value (nit)sat does not depend on the dose rate γ if (τann)it is inversely proportional to γ, i.e. if the annihilation (annealing) of interface traps depends on the dose rate. it is therefore necessary to consider the radiation-induced charge neutralization (ricn) effect.
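the competition between accumulation and annihilation can be illustrated numerically. the sketch below integrates relationship (15), assumed here in the form dnit/dt = g - nit/(τann)it with g = (kacc)it·γ, and compares the result with the saturated value (18). it also assumes, following the hypothesis stated above, that (τann)it is inversely proportional to the dose rate through a constant called `kad` here. the function names and the numerical values are illustrative placeholders, not data from the paper.

```python
import math


def nit_saturation(kacc, tau_ann, rate):
    """Eq. (18): saturated interface-trap density."""
    return kacc * rate * tau_ann


def nit_analytic(t, kacc, tau_ann, rate, n0=0.0):
    """Exact solution of eq. (15) for constant dose rate and constant tau_ann."""
    n_sat = nit_saturation(kacc, tau_ann, rate)
    return n_sat + (n0 - n_sat) * math.exp(-t / tau_ann)


def nit_euler(t_end, kacc, tau_ann, rate, n0=0.0, steps=100000):
    """Explicit Euler integration of eq. (15), used as an independent check."""
    dt = t_end / steps
    g = kacc * rate
    n = n0
    for _ in range(steps):
        n += dt * (g - n / tau_ann)
    return n


if __name__ == "__main__":
    kacc = 6.4e4  # traps per rad(SiO2) per cm^2, assumed
    kad = 1.6e7   # rad(SiO2), assumed proportionality constant tau_ann = kad / rate
    for rate in (170.0, 1700.0):  # rad(SiO2)/s
        tau_ann = kad / rate
        n_end = nit_euler(10.0 * tau_ann, kacc, tau_ann, rate)
        print(f"rate = {rate:6.0f}: tau_ann = {tau_ann:.3e} s, "
              f"nit after 10 tau = {n_end:.3e} cm^-2, "
              f"saturation = {nit_saturation(kacc, tau_ann, rate):.3e} cm^-2")
```

with (τann)it taken as `kad / rate`, the saturated density printed for both dose rates is the same, which is the dose-rate independence the argument requires.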
usually the ricn effect refers to the annealing of oxide trapped charge. in this work we propose the ricn effect as the basic mechanism of interface-trap annealing. consider the case when annihilation takes place from the $a^0d^+$ configuration after capture of a radiation-induced electron. the $a^0d^+$ state transforms to the $a^-d^+$ state, which is not stable and is assumed to relax immediately back to the d precursor. the nit annihilation process can be described by the relationship from the shockley-read-hall recombination theory [26]

$(dn_{it}/dt)_{ann} = -v_{th}\,\sigma_t\, n\, n_{it}$, (19)

where vth is the thermal velocity; σt is the capture cross-section of the ad center; n is the concentration of radiation-induced electrons. the concentration of radiation-induced electrons equals

$n = k_p\, k_y\, \gamma$, (20)

where kp is the generation rate per unit dose rate; ky is the electron yield; γ is the dose rate. substituting (20) into equation (19) gives

$(dn_{it}/dt)_{ann} = -v_{th}\,\sigma_t\, k_p\, k_y\, \gamma\, n_{it} = -n_{it}/(\tau_{ann})_{it}$, (21)

where

$(\tau_{ann})_{it} = k_{ad}/\gamma$, (22)

$k_{ad} = 1/(v_{th}\,\sigma_t\, k_p\, k_y)$, (23)

it follows from (18) that

$(n_{it})_{sat} = (k_{acc})_{it}\, k_{ad}$, (24)

the density of interface traps in saturation, as follows from (24), depends on the product of the interface trap accumulation coefficient (kacc)it and the constant kad, which is a function of the thermal velocity, the capture cross-section of the ad center, and the generation rate and electron yield of radiation-induced electrons. consider some results of work [25] using relationship (24). n-channel metal-oxide-semiconductor field effect transistors (mosfets) from two vendors (vendor “a” and vendor “b”) were irradiated with x-rays. the devices from the two vendors had different initial values of interface trap density and were irradiated at different dose rates, which are presented in table 2 together with the estimated values of (kacc)it and kad.

table 2 experimental conditions and estimation results for transistor vendors from [25].
vendor | dose rate (rad(sio2)/s) | initial nit (cm^-2) | (kacc)it (rad(sio2)^-1 cm^-2) | kad (rad(sio2)) | (nit)sat (cm^-2)
vendor “a” | 170 | 2·10^10 | 6.4·10^4 | 1.6·10^7 | 1·10^12
vendor “b” | 1700 | 2·10^11 | 1.15·10^6 | 1.7·10^7 | 2·10^13

the values of kad for the two vendors are nearly the same despite the different initial nit values and irradiation dose rates. this means that the model presented in this work is able to describe the physical mechanism of interface-trap buildup saturation correctly. the value of (kacc)it is determined by the initial nit buildup rate and depends on the parameters of the manufacturing process and on the irradiation dose rate. additional information concerning interface-trap buildup saturation can be found in [27].

7. conclusion

the eldrs conversion model for modeling the radiation-induced degradation of bipolar device parameters under low dose rate irradiation is described. the model is based on the concept that the radiation-induced interface-trap buildup is connected with the hydrogen-electron mechanism, where both hydrogenous species and electrons are responsible for radiation-induced interface-trap formation. the interaction of trapped positive charges (hydrogenous species) and electrons from the substrate leads to the formation of interface traps. a main feature of the eldrs conversion model is its fitting parameter extraction technique. the model was validated by comparing it with previously reported experimental data for different technologies and devices. according to the conversion model of interface-trap buildup, bandgap narrowing of the sige bipolar transistor base region reduces the deep trap conversion time and, as a result, during irradiation at any dose rate all oxide trapped charges have enough time to be converted into interface traps. therefore, there is no difference between test dose rate and low dose rate irradiation (eldrs-free).
the interface-trap buildup saturation is explained by an interaction of the radiation-induced electrons with centers which were formed during conversion process. references [1] t.r. oldham, f.b. mclean, "total ionizing dose effects in mos oxides and devices", ieee trans. nucl. sci. ns-50, no. 3, 483-499, 2003. doi: 10.1109/tns.2003.812927 [2] p.s. winokur, h.e. boesch jr., "interface state generation in radiation-hard oxides", ieee trans. on nuclear science, vol. 27, no. 6, pp. 1647-1650, 1980. doi: 10.1109/tns.1980.4331083 [3] f.b. mclean, "a framework for understanding radiation-induced interface state in sio2 mos structures", ieee trans. on nuclear science, vol. 27, no. 6, pp. 1651-1657, 1980. doi: 10.1109/tns.1980.4331084 [4] s.k. lai, "interface trap generation in silicon dioxide when electrons are captured by trapped holes", j. appl. phys., vol. 54, pp.2540-2546, may 1983. doi: 10.1063/1.332323 [5] a.v. sogoyan, s.v. cherepko, v.s. pershenkov, "hydrogen-electron model of radiation induced interface trap buildup on oxide-semiconductor interface", russian microelectronics, vol. 43, no. 2, pp. 162-164, 2014. [6] f.b. mclean, "generic impulse response function for mos systems and its application to linear response analysis", ieee trans. on nuclear science, vol. 35, no 6, pp. 1178-1185, 1988. doi: 10.1109/23.25436 [7] v.s. pershenkov, v.v. belyakov, s.v. cherepko, i.n. shvetzov-sholovky, "threepoint method of prediction of mos device response in space environments", ieee trans. nuclear science, vol. 40, no. 6, pp.1714-1720, 1993. doi: 10.1109/23.273488 [8] m.p. baze, r.e. plaag, a.h. johnston, "a comparison of methods for total dose testing of bulk cmos and cmos/sos devices", ieee trans. on nuclear science, vol. 36, no. 6, pp. 1818-1824, 1990. doi: 10.1109/23.101195 [9] d.m. fleetwood, p.s. winokur, j.r. schwank, "using laboratory x-ray and cobalt-60 irradiations to predict cmos device response in strategic and space environments", ieee trans. 
on nuclear science, vol. 35, no. 6, pp. 1497-1505, 1988. doi: 10.1109/23.25487
[10] b.j. mrstik, r.w. rendell, "si/sio2 interface state generation during x-ray irradiation and during postirradiation exposure to a hydrogen ambient", ieee trans. on nuclear science, vol. 38, no. 6, pp. 1101-1110, 1991. doi: 10.1109/23.124081
[11] a.j. lelis, t.r. oldham, w.m. delancey, "response of interface traps during high-temperature anneal", ieee trans. on nuclear science, vol. 38, no. 6, pp. 1590-1596, 1991. doi: 10.1109/23.124150
[12] r.l. pease, r.d. schrimpf, d.m. fleetwood, "recent advances in understanding total-dose effects in bipolar transistors", ieee trans. on nuclear science, vol. 57, no. 4, pp. 1894-1908, 2009. doi: 10.1109/radecs.1995.509744
[13] p.j. mcwhorter, s.l. miller, w.m. miller, "modeling the anneal of radiation-induced trapped holes in a varying thermal environment", ieee trans. nucl. sci., vol. 37, no. 6, p. 1683, dec. 1990. doi: 10.1109/23.101177
[14] v.v. emelianov, a.v. sogoyan, o.v. meshurov, v.n. ulimov, v.s. pershenkov, "modeling the field and thermal dependence of radiation-induced charge annealing in mos devices", ieee trans. on nucl. sci., vol. ns-43, no. 6, pp. 2572-2578, 1996. doi: 10.1109/23.556838
[15] e.p. reilly, j. roberson, "theory of defects in vitreous silicon dioxide", phys. rev. b, vol. 27, no. 6, p. 3780 (1981).
[16] v.s. pershenkov, d.v. savchenkov, a.s. bakerenkov, v.n. ulimov, a.y. nikiforov, a.i. chumakov, a.a. romanenko, "the conversion model of low dose rate effect in bipolar transistors", in proceedings of radecs, pp. 286-393, 2009. doi: 10.1109/radecs.2009.5994661
[17] a.s. bakerenkov, v.v. belyakov, v.s. pershenkov, a.a. romanenko, d.v. savchenkov, v.v. shurenkov, "extracting the fitting parameters for the conversion model of enhanced low dose rate sensitivity in bipolar devices", russian microelectronics, vol. 42, issue 1, pp. 48-52, january 2013. doi: 10.1134/s1063739712040026
[18] s.c. witczak, r.d.
schrimpf, k.f. galloway, d.m. fleetwood, r.l. pease, j.m. puhl, d.m. schmidt, w.e. combs, j.s. suehle, "accelerated tests for simulating low dose rate gain degradation of lateral and substrate pnp bipolar junction transistors", ieee trans. on nucl. sci., vol. 43, no. 6, pp. 3151-3160, 1996.
[19] r.k. freitag, d.b. brown, "study of low-dose-rate effects on commercial linear bipolar ics", ieee trans. on nucl. sci., vol. 45, no. 6, pp. 2649-2658, 1998.
[20] j.d. cressler, "radiation effects in sige technology", ieee transactions on nuclear science, vol. 60, no. 3, pp. 1992-2014, june 2013. doi: 10.1109/tns.2013.2248167
[21] v.s. pershenkov, s.v. cherepko, a.v. sogoyan, v.v. belyakov, v.n. ulimov, v.v. abramov, a.v. shalnov, v.i. rusanovsky, "proposed two-level acceptor-donor (ad) center and nature of switching traps in irradiated mos structures", ieee transactions on nuclear science, vol. 43, no. 6, pp. 2579-2586, 1996.
[22] m.p. baze, r.e. plaag, a.h. johnston, "dose dependence of interface traps in gate oxides at high levels of total dose", ieee transactions on nuclear science, vol. 36, no. 6, pp. 1858-1864, 1989. doi: 10.1109/23.556839
[23] j. boch, y.g. velo, f. sainge, n. roche, r.d. schrimpf, j. vaille, l. dusseau, c. chatry, e. lorfevre, r. ecoffet, a.d. touboul, "the use of dose rate switching technique to characterize bipolar devices", ieee transactions on nuclear science, vol. 53, no. 6, pp. 3347-3353, 2009. doi: 10.1109/tns.2009.2033686
[24] h.j. barnaby, r.d. schrimpf, r.l. pease, p. cole, t. turflinger, j. kreig, j. titus, d. emily, m. gehlhausen, s.c. witczak, m.c. maher, d. van nort, "identification of degradation mechanisms in bipolar linear voltage comparator through correlation of transistor and circuit response", ieee transactions on nuclear science, vol. 46, no. 6, pp. 1666-1673, 1999. doi: 10.1109/23.819136
[25] j.m. benedetto, h.e. boesch, jr., f.b.
mclean, "dose and energy dependence of interface trap formation in cobalt-60 and x-ray environments", ieee transactions on nuclear science, vol. 35, no. 6, pp. 1260-1264, 1988. doi: 10.1109/23.25449
[26] s.m. sze, physics of semiconductor devices, new york, wiley, 1981.
[27] v.s. pershenkov, a.s. bakerenkov, a.v. solomatin, v.v. belyakov, v.v. shurenkov, "mechanism of the saturation of the radiation induced interface buildup", applied mechanics and materials, vol. 565, pp. 142-146, 2014. doi: 10.4028/www.scientific.net/amm.565.142

instruction facta universitatis series: electronics and energetics vol. 29, no. 3, september 2016, pp. 407-417 doi: 10.2298/fuee1603407s

a platform for a smart learning environment

konstantin simić 1, marijana despotović-zrakić 1, živko bojović 1, branislav jovanić 2, đorđe knežević 1
1 faculty of organizational sciences, university of belgrade, serbia
2 institute of physics, university of belgrade, serbia

abstract. in this paper, a modular platform which provides student services for a smart educational environment is described. the platform represents a point of mutual integration of various services, such as a hosting platform for students’ projects, a platform for integrating an sms service with students’ web applications, and an internet of things platform which enables acquiring data from sensors distributed within the university building and controlling various actuators. the platform is deployed as a part of a smart learning environment. it is integrated with a single sign-on service and it uses cas and oauth2. a rest api is also provided. the php symfony framework and relational and non-relational databases are used for deploying the platform. the platform was evaluated and tested.

key words: smart environment, e-learning, web application, platform as a service

1.
introduction

internet of things (hereinafter: iot) enables interconnecting smart devices, such as sensors, actuators, microcontrollers and microcomputers, with other information-communication infrastructure [1][2]. using smart devices increases the level of automation of everyday tasks, which leads to better productivity in many different environments. smart environments, such as smart homes, smart classrooms or smart factories, are formed by connecting and adding a large number of smart devices to an existing communication infrastructure. in the education sphere, iot has wide application possibilities which have not been used enough. by using adequate sensors and actuators, it is possible to track different features of a physical environment, to detect whether these features are correlated with learning and teaching processes, and to dynamically change some features of the environment according to needs. iot technologies are an integral part of smart learning environments such as smart classrooms.

received july 3, 2015; received in revised form november 15, 2015 corresponding author: živko bojović, faculty of organizational sciences, university of belgrade, jove ilića 154, 11000 belgrade, serbia (e-mail: zivko@elab.rs)

a smart classroom is a concept which integrates several information and communication technologies to enable collaborative learning in order to improve the overall learning and teaching processes [3]. different technologies can be used for deploying a smart classroom, such as nfc, smart mobile devices, multimedia devices etc. furthermore, a learning environment should be a pleasant place for teaching and learning. therefore, a smart classroom should be equipped with systems for heating and cooling, light and presence management, and with the necessary equipment for the realization of the teaching process.
the hardware equipment in smart classrooms is usually managed by adequate software. the existence of a platform that integrates all services within a smart learning environment is important for teachers and students.

this research presents the development of a platform for a smart learning environment. this iot platform integrates several student services and collects data from the smart learning environment through various sensors, actuators, microcomputers and microcontrollers. the aim of this research is to enhance the learning of the internet of things in an academic environment by creating projects in the developed iot platform. the research is conducted within the department of e-business (hereinafter: elab) at the faculty of organizational sciences, university of belgrade. the elab iot platform was developed to help students learn iot in an interesting way and achieve better learning outcomes.

2. literature review

the internet of things can be defined as a loosely coupled, decentralized system made of smart objects, or autonomous physical or digital network-equipped objects, which are able to collect environmental data and to process these data [4]. the internet becomes a network of all devices, not solely computers. according to gartner’s predictions, around 26 billion devices will be an integral part of the internet by the year 2020 [5].

the umbrella term “internet of things” is usually used for grouping sensors, actuators, microcontrollers and microcomputers into smart environments. sensors are analog or digital devices able to detect physical characteristics of the environment, such as temperature, humidity, pressure, noise level etc. [6]. actuators are devices which work like switches: they can be used for controlling other devices. sensors and actuators are not enough by themselves for creating smart environments. they are often used together with more complex devices, such as microcontrollers and microcomputers.
in the sphere of the iot, a widely used microcontroller platform is arduino and one of the best-known microcomputers is raspberry pi.

an important aspect of the iot is connecting with other networks. in the era of broadband technologies such as wifi and lte, this issue is becoming increasingly important. data provided by different devices should be available anytime and anywhere. by creating an adequate platform in the cloud, it is possible to integrate multiple data sources and to analyze these large amounts of data.

the vision of the internet of things can be seen from two aspects: the internet aspect, which focuses on providing adequate internet services, and the things aspect, which includes collecting and processing data acquired from devices. smart devices are going to be key elements in software developed using object-oriented architecture.

the iot platform can connect sensor infrastructures, which act as data generators, with clients interested in obtaining the data, which act as consumers. a sensor infrastructure can contain one or many sensors. they can be mobile and connected to the same cloud wirelessly. data acquired by the sensor infrastructure are stored in a non-relational database. clients are then able to access these data [7]. the database can be delocalized and distributed in order to exchange and store information.

nowadays, iot platforms are based on cloud infrastructure. these platforms are mainly used for collecting data from sensors and other smart devices in the environment in which they are implemented. cloud services and resources can be delivered by three cloud service models [8][9][10]: platform as a service (paas), software as a service (saas) and infrastructure as a service (iaas). in this research the focus is on iot paas. platform as a service enables developers to consume the resources in iaas and deploy their applications onto a virtualized cloud platform [9].
one of the most widely used paas is xively. it is a free online service enabling developers to deploy their own applications based on the iot and data acquired from sensors. xively manages large amounts of data every day. it is used by individuals, organizations and companies all around the world. data can be sent from different sensors and devices. xively has the following features:
• analysing and processing historical data acquired from sensors.
• sending real-time notifications and alerts related to any devices.
• calling custom scripts if user-defined conditions are fulfilled.
xively is built to encourage open ecosystems, such as digital electric meters, weather stations, biosensors and other devices.

besides iot platforms like xively, there are also iot platforms which are developed for specific purposes. most of these platforms are used in a business context for providing paid services. one example is aneka paas, an adaptable, extensible and flexible cloud platform that enhances the performance and efficiency of applications by harnessing resources from private, public or hybrid clouds [4]. aneka supports provisioning resources on different public cloud providers such as amazon ec2, windows azure and gogrid. application domains of aneka are science, finance, entertainment and media, manufacturing and engineering, telecommunication, and health and life science [11].

the clem project has established a cloud-based ecosystem for e-learning, resource sharing and support for mechatronic vocational education teachers and learners [9]. clem is a platform that allows a large number of distributed mechatronic devices to become sharable and to be used for e-learning. this paas is used for project realization.

having analysed studies from the literature, the authors concluded that there is a lack of iot paas platforms developed for educational purposes. the iot platform presented in this research was modelled on the xively platform.
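the third feature listed for xively above, calling custom scripts when user-defined conditions are fulfilled, can be illustrated with a minimal python sketch. the names `Trigger` and `process_reading` are hypothetical and are not taken from xively or from the elab platform; the sketch only shows the general idea of pairing a condition with an action.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Trigger:
    """Hypothetical rule: run `action` when `condition` holds for a reading."""
    condition: Callable[[float], bool]
    action: Callable[[str, float], None]


def process_reading(sensor_id: str, value: float, triggers: List[Trigger]) -> int:
    """Apply every trigger to one reading and return how many actions fired."""
    fired = 0
    for trigger in triggers:
        if trigger.condition(value):
            # The "custom script" is modelled here as a plain Python callable.
            trigger.action(sensor_id, value)
            fired += 1
    return fired
```

in such a design, a notification or alert is simply one kind of action, so the same mechanism could cover both real-time alerts and custom scripts.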
the elab iot platform is developed in the php language with the symfony framework. the main aim of the platform is collecting and evaluating data from the smart learning environment. furthermore, this platform is mainly developed to enhance learning of iot and the realization of internet of things projects in an academic environment. the developed iot platform is an integral part of the elab student platform. the elab iot platform can be integrated with other platforms and deployed in different smart environments.

3. designing a platform for smart learning environment

3.1. platform architecture

the e-business department (elab) within the faculty of organizational sciences, university of belgrade, organizes courses in the following fields: internet technologies, e-business, computer simulation, mobile computing, and internet marketing. each year, e-courses are attended by more than 700 students [12]. the central student service of the department is moodle, which enables enrolling students in different courses for all levels of studies, downloading all teaching materials and managing assignments.

due to the specific nature of different subjects, new student services which enable new functionalities need to be deployed. for example, students enrolled in the internet marketing course need to use web hosting and sms services, while students enrolled in the internet of things course need to use a platform which can control sensors and actuators. all new services and platforms should be integrated and they should use sso (single sign-on). the student platform should include various heterogeneous services and it should be extendable. for that reason, the platform architecture has to be modular and integrated with other services. the logical infrastructure is shown in figure 1. this solution should integrate a new platform with current services. by using the api, the platform can be integrated with moodle lms.
this integration enables gathering information about students and using this information in various contexts.

fig. 1 a model of educational infrastructure based on the internet of things.

the model consists of four components. cloud computing infrastructure and virtualized resources can be used for creating a highly scalable and reliable infrastructure. identity management software is used for providing unique user accounts and single sign-on services. the lms is used for administrating courses for students. for running the lms, relational databases and web servers are required. big data infrastructure and non-relational databases (nosql) are used for collecting various data about students and data from sensors.

the second component is an iot platform infrastructure, which consists of two subcomponents. the platform for learning iot enables students to use data from sensors, to control different actuators and to deploy their own smart environments for testing and educational purposes. the other subcomponent is related to the production environment, where wireless sensor networks are used for enhancing students’ experience and for introducing new educational services. the last two components are used for integrating other components of the infrastructure and for providing application programming interfaces (apis) to external users.

the elab student platform represents a point of integration of all student services. elab student is a modular platform. currently, the following modules are operational:
1. sms module, which enables integration with an sms gateway device and provides an api for sending and receiving short messages via the public mobile network;
2. hosting module, which enables publishing students’ projects on the internet;
3. iot module, which is used for publishing and reading data from sensors and managing actuators and other smart devices.
the model of the proposed architecture is shown in figure 2.
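the role of the iot module described above, publishing readings and reading them back, can be illustrated with a minimal in-memory sketch. the class and method names (`SensorStore`, `publish`, `read`) are hypothetical and are not taken from the elab platform, which is implemented in php with symfony; python is used here only for brevity.

```python
from bisect import insort
from collections import defaultdict
from datetime import datetime


class SensorStore:
    """Hypothetical in-memory stand-in for the platform's sensor data storage."""

    def __init__(self):
        # sensor id -> list of (timestamp, value) tuples kept sorted by timestamp
        self._readings = defaultdict(list)

    def publish(self, sensor_id, value, timestamp):
        """Store one reading for a sensor, keeping readings in time order."""
        insort(self._readings[sensor_id], (timestamp, value))

    def read(self, sensor_id, start=None, end=None):
        """Return readings for a sensor, optionally limited to [start, end]."""
        return [
            (ts, v)
            for ts, v in self._readings[sensor_id]
            if (start is None or ts >= start) and (end is None or ts <= end)
        ]
```

a call such as `store.read("temp-1", start=datetime(2015, 7, 3, 10, 30))` would then return only the readings taken at or after the given moment, which is the kind of time-window query a history graph needs.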
[figure 2 labels: authentication, authorization, sso, identity management; common components: html5, css3, js, jquery, presentation, moodle lms synchronization, global configuration, module configuration, administration; rest services, web services, user devices, access; main services modules: data, domains & dns, ftp, configuration (hosting); sending sms, receiving sms (sms); projects, devices, sensors, actuators (iot)]
fig. 2 the platform architecture

3.2. digital identity management

problems of digital identities are important in constructing platforms and other software solutions. a separate identity layer is necessary in a complex information system made from various heterogeneous parts. without the identity layer, integration of these parts would not be possible. digital identity is related to managing the relationship between individuals and objects that they use. it represents a set of digital subjects and their attributes. in other words, digital identity includes a set of information about the owner of the identity, who can be an individual, a company or even a service [13].

for managing digital identities, several protocols are used. saml is the best-known protocol for managing distributed identities [14]. it is an xml-based framework which provides user authentication and authorization. other frameworks used for identity management include cas, oauth and openid.

a digital identity management tier enables centralized storage of all user accounts in ldap, as well as centralized authentication, authorization and single sign-on/single sign-out features. a cas server in combination with openldap and radius servers is used for digital identity management. cas (central authentication service) is a single sign-on protocol and server developed by jasig.
using this solution, users of the platform enter their authentication credentials only once; afterwards they have access to all pages they are authorized for (figure 3).

[figure 3 labels: cas server, moodle lms, elab student portal, other services, radius server, ldap server; groups: services, identity management]
fig. 3 platform services and identity management

3.3. use cases

use cases of the elab iot platform are shown in figure 4. there are four main use cases: project management, device management, sensor management and working with data from sensors.

the elab iot platform works with projects. students can create their iot projects and register team members. projects work similarly to folders in an operating system. each project can be either private (viewable only by team members) or public (viewable by all users). in each project, devices such as raspberry pi or arduino can be registered. for each device, some metadata, such as ip address and location, can be added. under devices, the particular sensors and actuators which are physically connected to the device can be added. students can use the api to write data from devices, sensors and actuators to the platform.

fig. 4 the platform use cases

4. deploying a platform for smart learning environment

4.1. technologies used in deployment

the platform for iot is highly modular and it is created as a part of the elab student platform, which integrates all student services. the elab iot platform is developed in the php language with the symfony framework. symfony includes reusable php components which enable creating powerful web applications. it is an mvc-based framework. for rendering views, the twig templating engine is used, and for working with data, the preferred data-mapping tool is doctrine orm.

for developing the elab iot platform, both relational (mysql with doctrine orm) and non-relational (mongodb) databases are used. mongodb is a nosql database which uses bson data collections.
documents in this format can be rendered as human-readable json, and the format is convenient for working with large sets of data. for communicating with the cas server, besimplessoauthbundle is used. this bundle can map user entities created in symfony to cas users. also, cas login and registration forms are used.

4.2. core component of the platform

the core component of the elab student platform integrates all modules, the identity service, templates, the identity module and tools for integration with other software. this component enables registering new users on the platform. it also manages which services are available to particular students. an administrator is able to deny access to particular services or to the whole platform.

the elab student platform uses the bootstrap frontend framework for designing templates. the basic template is slightly modified to suit the needs of the platform. the homepage of the platform, where students can choose their desired service, is shown in figure 5.

fig. 5 the platform homepage

4.3. internet of things module

the module for the internet of things is one of the key features of the platform. using this module, students can add their iot projects, the devices they work with, and sensors and actuators. students are able to use the provided api to send current data collected from sensors and to read historical data, measured at any moment. the platform stores various metadata related to devices and sensors. for devices, a user can define their type, latitude, longitude, image and description. for sensors, a user can set their reliability, type, measuring units, image and description.

figure 6 shows the procedure for creating a new project. first, a student can select team members from the list. the list shows all students who have a cas account, who are registered on the elab student platform and who have not yet created an iot project.
afterwards, they can add devices which they want to include in the project. finally, students can view values from sensors and a graph of historical values. they can also filter the graph by entering the start and end date and time they want to see (figure 7).

fig. 6 creating a project

fig. 7 graph with sensor data

4.4. results of using the platform

in order to evaluate the usability of the designed environment, the research was conducted within the course internet of things in undergraduate studies at the faculty of organizational sciences, university of belgrade. in this research, 37 students participated. all of them had similar backgrounds and interests in the sphere of business informatics. the course consisted of 12 lectures, grouped into the following topics: introduction to internet of things technologies, defining scenarios for automating smart environments, microcomputers and microcontrollers, developing web services, and developing web and mobile applications for smart environment automation.

after completing the course, students' knowledge was evaluated. students' assignments, projects and knowledge tests were used to calculate the final grade. each assignment was a part of the student's final project. each project was related to designing and implementing a smart environment such as a smart home, smart classroom, smart parking, etc. for the implementation of a smart environment, students used the elab iot platform. the average grade that students achieved in the course was 8.75 (on a scale of 6 to 10).

during the course, students were given a survey. questions were mostly based on the five-point likert scale. students were asked to assess the quality of the four topics studied during the course: arduino, raspberry pi, web applications and web services. all the topics were studied using the described platform.
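likert-scale responses of this kind are typically summarized per topic by a mean and a standard deviation. the sketch below shows one way to compute such a summary in python. the function name `summarize` and the sample responses are hypothetical, and the use of the sample (n-1) standard deviation is an assumption, since the paper does not state which estimator was used.

```python
from statistics import mean, stdev


def summarize(responses):
    """Return (mean, sample standard deviation) of Likert responses, rounded to 2 places."""
    return round(mean(responses), 2), round(stdev(responses), 2)


# Hypothetical responses on a 1-5 scale; these are not data from the study.
example = [5, 4, 4, 5, 3]
print(summarize(example))
```

if the population standard deviation were intended instead, `statistics.pstdev` could be substituted for `stdev`.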
table 1 shows summary results of the survey analysis for each topic (x: mean grade, from 1 to 5; 𝛿: standard deviation).

table 1 survey analysis
parameter     topic              x      𝛿
interesting   arduino            4.31   0.62
interesting   raspberry pi       4.43   0.65
interesting   web applications   4.22   0.67
interesting   web services       3.89   0.81
simplicity    arduino            4.03   0.65
simplicity    raspberry pi       3.81   0.66
simplicity    web applications   3.30   1.27
simplicity    web services       3.62   1.01
motivation    arduino            4.22   0.64
motivation    raspberry pi       4.05   0.70
motivation    web applications   3.51   0.90
motivation    web services       3.59   0.801

evaluation results show that the students were interested in learning iot and developing smart environments using the described platform. the designed platform could effectively support teaching and learning, leading to good results on knowledge tests and a high level of students' satisfaction and motivation.

5. conclusion

in this paper, we designed and deployed an internet of things platform. this platform is a part of the broader elab student platform and it was able to help students with their iot projects. students were able to register their iot devices and to send data from them to the platform. they were also able to browse saved data. in the future, this platform is going to be extended with more features. one of the planned functionalities is better integration with big data infrastructure.

acknowledgement: the paper is a part of the research done within the project 174031. the authors would like to thank the mntrs for financial support.

references
[1] l. atzori, a. iera, g. morabito, the internet of things: a survey, computer networks, vol. 54, issue 15, 28 october 2010, pp. 2787-2805.
[2] e. borgia, the internet of things vision: key features, applications and open issues, computer communications, vol. 54, 2014, pp. 1-31.
[3] s.s. yau, s.k.s. gupta, e.k.s. gupta, f. karim, s.i. ahamed, y. wang, b.
wang, smart classroom: enhancing collaborative learning using pervasive computing technology, in asee 2003 annual conference and exposition, 2003, pp. 13633-13642.
[4] j. gubbi, r. buyya, s. marusic, and m. palaniswami, “internet of things (iot): a vision, architectural elements, and future directions,” futur. gener. comput. syst., vol. 29, no. 7, pp. 1645-1660, 2013.
[5] gartner, “gartner identifies the top 10 strategic technology trends for 2014,” gartner, orlando, florida, 2013. [online]. available: http://www.gartner.com/newsroom/id/2603623.
[6] s. s. iyengar, n. parameshwaran, v. v. phoha, n. balakrishnan, and c. d. okoye, fundamentals of sensor network programming: applications and technology. wiley-ieee press, 2010.
[7] c. floerkemeier, the internet of things: first international conference, iot 2008, march 26-28, 2008, proceedings, vol. 4952. zurich, switzerland: springer, 2008.
[8] l. george, developing software online with platform-as-a-service technology, ieee comput., vol. 41, 2008, pp. 13-15.
[9] k.-m. chao, a. e. james, a. g. nanos, j.-h. chen, s.-d. stan, i. muntean, g. figliolini, p. rea, c. b. bouzgarrou, p. vitliemov, j. cooper, j. van capelle, “cloud e-learning for mechatronics: clem”, future generation computer systems, vol. 48, 2015, pp. 46-59.
[10] m. despotović-zrakić, k. simić, a. labus, a. milić, b. jovanić, scaffolding environment for adaptive e-learning through cloud computing, educational technology & society, vol. 16, no. 3, 2013, pp. 301-314.
[11] y. wei, k. sukumar, c. vecchiola, d. karunamoorthy and r. buyya, chapter 27. aneka cloud application platform and its integration with windows azure, in cloud computing methodology, systems, and applications, edited by boualem benatallah, crc press, 2011, pp. 645–679.
[12] a. labus, k. simić, m. vulić, m. despotović-zrakić, and z.
bogdanović, “an application of social media in elearning 2.0,” in proceedings of the 25th bled econference edependability: reliable and trustworthy estructures, eprocesses, eoperations and eservices for the future, 2012, pp. 557–572.
[13] y. zhang and j.-l. chen, "universal identity management model based on anonymous credentials," in proceedings of the 2010 ieee international conference on services computing (scc), 2010.
[14] k. d. lewis and j. e. lewis, “web single sign-on authentication using saml,” int. j. comput. sci. issues, vol. 1, no. 8, pp. 41-48, 2009.

facta universitatis series: electronics and energetics vol. 36, no 2, june 2023, pp. 239-251 https://doi.org/10.2298/fuee2302239s © 2023 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper

strength analysis of a blade with different cross-sections*

bader somaiday1, ireneusz czajka1, muhammad a. r. yass2
1agh university of science and technology, krakow, poland
2university of technology, baghdad, iraq

abstract. the efficiency of horizontal axis wind turbine (hawt) blades is examined in this paper with respect to the effect of the cross-section airfoil type. three different airfoils were examined: symmetric (naca 0012), asymmetric (naca 4412), and supercritical (eppler 417). the analyses combined theory and experiment. theoretical analyses were carried out using fortran 90 code and the blade element momentum-based qblade code. the blade was designed using solidworks software and manufactured on a 3d printer for testing purposes. the findings of experimental tests supported the conclusions of the theory. the research revealed that the eppler 417 blade, which has a supercritical airfoil, performed better than the other examined blades. naca 4412, naca 0012, and eppler 417 have power coefficients of 0.516, 0.492, and 0.510, respectively. according to the experimental data, the eppler 417 airfoil outperforms the other airfoils in terms of power and speed reduction.
to calculate the deformation and stresses of the three blades with various cross sections, cfd analysis was done in ansys workbench. the cfd results showed that naca 4412 has the highest strength, but eppler 417 was considered the optimum cross-section based on power generation and acceptable stress values.

key words: hawt, cfd analysis, optimum power coefficient, qblade code

1. introduction

the aerodynamic efficiency of an airfoil is defined by the lift-to-drag ratio. it achieves the highest values at a specific angle of attack, and the value of this angle varies between airfoils. [2] the lift-to-drag ratio depends on zero-lift drag, aspect ratio, and span efficiency and is independent of weight. the use of airfoils in a wind turbine is no more limited than in an aircraft wing, because a wind turbine operates at a lower speed than an aircraft. [3] there have been many theoretical and scientific studies on the performance of wind turbine blades.

received september 29, 2022; revised december 11, 2022; accepted january 06, 2023
corresponding author: bader somaiday, agh university of science and technology, krakow, poland (e-mail: somaiday@agh.edu.pl)
* an earlier version of this paper was presented at the 7th virtual international conference science, technology and management in energy (energetics 2021), belgrade, serbia, december 16-17 2021 [1].

symbols
a    axial factor
á    angular factor
Cd   drag coefficient
Cl   lift coefficient
Cp   power coefficient
N    number of revolutions of rotor per minute (rpm)
P    power (w)
r    radius (m)
R    radius of turbine rotor (m)
U    wind speed (m/s)
𝛼    angle of attack (degree)
𝜆    tip speed ratio
φ    relative flow angle (degree)
Ω    angular velocity of the rotor (rad/s)

acronyms
cfd     computational fluid dynamics
hawt    horizontal axis wind turbine
bem     blade element momentum
naca    national advisory committee for aeronautics
fem     finite element method
urans   unsteady reynolds averaged navier-stokes

in [4], an aerodynamic performance evaluation system was investigated using two groups of naca profiles, from the five-digit naca series (63-221, 65415; 23012, 23021) and the four-digit naca series (2421, 2412, 4412, 4424), for three hawt blades. the same airfoils were used along the entire blade. a computer program was developed to automate the entire procedure. the results show that the elementary power coefficient of naca 4412 and naca 23012 was higher than that of the other profiles.

in [5], a stable and aerodynamic design using the naca 4412 profile, with a blade length of 800 mm and a power of 600 w, was proposed for a mini hawt. the chord length and twist angle distributions of the initial blade model were calculated. a reasonable compromise between high efficiency and good starting torque was obtained. the blades were developed using matlab software. the optimized blade chord was reduced by 24% and the thickness by 44%. the power level of the optimized blade was significantly increased, by 30% compared to the standard blade.

in [6], the design and optimization of a small hawt blade using custom code was explained. the blades were made using naca 4412, naca 2412 and naca 1812 at a wind speed of 5 m/s, which was the most frequent wind speed prevailing in the indian peninsula.
based on a self-created code based on bem theory, an optimum blade profile was generated which performs with high efficiency using multiple airfoils. the twist angle distribution, chord distribution and other parameters for different airfoil sections along the blade were determined using the proposed code. for a rotor with a diameter of 4.46 m, a power coefficient of 0.490 and an output power of 0.56 kw were obtained. the blade analysis result obtained using q-blade software showed reliable agreement with the proposed code and wind turbine performance analysis. the power coefficient obtained using the matlab code was 0.490, which was very close to that obtained using q-blade (0.514). in addition, the difference in output power between the two values was only 28.58 w.

in [7], the behaviour and performance of a multi-section hawt blade with and without a fence were researched. the multi-section hawt blade was designed using supercritical airfoils (sg6043, fx63-137s and fx66-s-196v). the overall performance of the multi-section blade was compared with the single-section naca 4412 blade. numerical analyses were performed using the author's code (fortran 90) and the qblade package based on bem theory. the multi-section blades show an increase of about 8% in power coefficient compared to the single-section blades. boundary layer theory was used to design the fences, and their position was determined experimentally. the use of fences gave an increase in total power coefficient of about 16% and high flutter stability.

in [8], the aero-elastic behavior of flexible horizontal axis wind turbine (hawt) blades was investigated using a combined computational and aerodynamic modelling approach. to study the unsteady aerodynamic properties of the blade airfoil, the b-l (beddoes-leishman) dynamic stall model was added to the modified blade element momentum (bem) model.
in [9], the aero-elastic model of a multi-rotor hawt was investigated. the method used was to integrate the single-rotor tool hawcstab2 with the multi-rotor tool hawc2. this high-fidelity linear time-invariant aero-elastic modelling method was verified by comparing the frequency responses of different rotors.

in [10], it was studied how the aero-elastic behavior of a multi-megawatt (mw) hawt is influenced by the fidelity of the aerodynamic simulation. the main purpose of this research is the comparison between engineering model results and cfd aero-elastic simulation results, which need less empirical modeling. to investigate the influence of the aerodynamic models on the aero-elastic results for a large hawt, two distinct models (bem-based and cfd-based) were used.

in [11], a two-part study proposes horizontal axis wind turbines with improved aero-elastic performance and hence increased yearly energy output. the structural characteristics of a standard blade were idealized using an adaptable shear. the power curve of this development was evaluated, and it was shown to boost yearly energy output over traditional systems by 1.51%, relative to the maximum power at a wind speed of 15 m/s.

in [12], the urans equations were coupled with the fem in a flexible way to describe the aero-elastic behaviour of the tjæreborg horizontal axis wind turbine blades. this approach was validated by comparing simulated and experimental data at four different horizontal inflow wind speeds. the aero-elastic behaviour of the tjæreborg wind turbine was also estimated and studied for yaw angles of 10, 30, and 60 degrees.

in [13], a horizontal axis wind turbine rotor blade model was developed to show the coupling effect of rotor bulk rotation and blade flexible motion.
the model was created with lagrange's technique, and the blade was discretized using the finite element method (fem). the two different relationships between wind aerodynamics and structural behavior are captured in this design.

in [14], a wind speed model was developed for n-bladed horizontal axis wind turbines, taking into account wind shear and building shadow effects. a systematic approach was used to calculate the wind shear, building shadow, synthesis, and equivalent wind velocity disturbance components, as well as their relative positions in the rotor disc region.

in [15], the influence of diagonal inflow on the wake parameters of a hawt inside a wind farm was investigated. a hawt with a generator limit of 30 kw and a rotor diameter of 10.0 m was employed in this work. a field study was used to analyse the influence of tilt angle on the energy and thrust efficiency of the hawt. on this foundation, the hawt's wake properties were investigated for various wind orientations and pitch angles. the peak power coefficients cp were 0.31, 0.33, and 0.27, corresponding to tip speed ratios λ = 7.5, 7.4, and 6.8 at pitch angles β = 0°, 2°, and 4°, respectively. the wind turbine experimental model predicts around 51% of annual power generation compared to the experimental research model.

in this study, the behavior and performance of blades with different cross-sections, namely symmetrical, asymmetrical and supercritical airfoils (naca 0012, naca 4412 and eppler 417), were investigated experimentally and with ansys. the horizontal axis wind turbine blade design was performed using fortran 90 code and qblade software based on bem theory.

2. the blade cross-section

table 1 shows the airfoil characteristics of the cross-sections, and the airfoil shapes are shown in figure 1. the hawt rotor design parameters are shown in table 2.
table 1 cross-section airfoils distribution
airfoil | max t/c | max cl/cd
naca 4412 | 12% at 30% | 129.4 at 5.25°
naca 0012 | 12% at 30% | 75.6 at 7.5°
eppler 417 | 14.2% at 38.35% | 135.9 at 2.25°
fig. 1 airfoils geometry
table 2 parameter of design
parameter | value
rotor diameter | 1.07 m
hub diameter | 0.20 m
number of blades | 3
rated power | 600 w
cut-in speed | 2 m/s
3. power coefficient
fig. 2 shows the maximum power that can be produced by wind flowing through the ring [7]. the velocity around the disc is assumed to be constant (u2 = u3), with the assumption that the upstream and downstream pressures are equal. the following equations [16] yield the rotor power coefficient.
$$u_2 = \frac{u_1 + u_4}{2} \qquad (1)$$
$$a = \frac{u_1 - u_2}{u_1} \qquad (2)$$
$$u_2 = u_1 (1 - a) \qquad (3)$$
$$u_4 = u_1 (1 - 2a) \qquad (4)$$
$$P = \int_{r_h}^{R} dP = \int_{r_h}^{R} \Omega \, dQ \qquad (5)$$
$$C_p = \frac{P}{P_{wind}} = \frac{\int_{r_h}^{R} \Omega \, dQ}{\frac{1}{2} \rho \pi R^2 u_1^3} \qquad (6)$$
$$C_p = \frac{8}{\lambda^2} \int_{\lambda_h}^{\lambda} \lambda_r^3 \, a' (1 - a) \left[ 1 - \frac{C_d}{C_l} \cot \varphi \right] d\lambda_r \qquad (7)$$
the tip speed ratio is
$$\lambda = \frac{2 \pi R n}{60 \, u_1} \qquad (8)$$
fig. 2 wind turbine actuator disk model
4. design and manufacturing blades
a program in fortran (f.90) was written and the qblade package was used to calculate the aerodynamic data and power factor based on blade element momentum (bem) theory, as shown in tables 3, 4 and 5 and figure 3. solidworks software was used to design the 3d blade shapes (see figure 4). the developed models were fabricated on a 3d printer (see figure 5). due to the limited size of the printer's print area, the blades were divided into several sections and then combined. the blades with different profiles in sections (naca 4412, naca 0012 and eppler 417) were mounted to the wind turbine for testing, as shown in figure 6.
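the actuator-disk relations of section 3 can be checked numerically. the sketch below is not taken from the authors' fortran 90 program; it assumes the standard one-dimensional momentum result cp = 4a(1 − a)², which follows from u2 = u1(1 − a) and u4 = u1(1 − 2a), and it evaluates the tip speed ratio as blade tip speed over free-stream speed. all function names and the inputs are hypothetical.

```python
import math


def power_coefficient(a: float) -> float:
    """Ideal actuator-disk power coefficient for axial induction factor a.

    Assumes Cp = 4 a (1 - a)^2, the standard momentum-theory result.
    """
    return 4.0 * a * (1.0 - a) ** 2


def wake_velocities(u1: float, a: float) -> tuple:
    """Return (u2, u4): velocity at the disk and in the far wake."""
    return u1 * (1.0 - a), u1 * (1.0 - 2.0 * a)


def tip_speed_ratio(radius_m: float, rpm: float, u1: float) -> float:
    """Tip speed ratio: (2 * pi * R * n / 60) / u1, with n in rev/min."""
    return 2.0 * math.pi * radius_m * rpm / (60.0 * u1)


if __name__ == "__main__":
    # Scan a in [0, 0.5] to locate the maximum of Cp numerically.
    candidates = [i / 1000.0 for i in range(0, 501)]
    best_a = max(candidates, key=power_coefficient)
    print(f"a at max Cp ~ {best_a:.3f}, Cp ~ {power_coefficient(best_a):.4f}")
```

the scan lands near a = 1/3, where cp approaches the betz limit of 16/27; any measured or simulated power coefficient for a real rotor should stay below that bound.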
table 3 naca 4412 cross-section geometry
no. | position (m) | chord (m) | twist (deg) | foil
1 | 0.00 | 0.167 | 43.42 | naca 4412
2 | 0.10 | 0.156 | 23.14 | naca 4412
3 | 0.17 | 0.136 | 16.72 | naca 4412
4 | 0.27 | 0.109 | 10.91 | naca 4412
5 | 0.37 | 0.090 | 7.42 | naca 4412
6 | 0.47 | 0.076 | 5.11 | naca 4412
7 | 0.57 | 0.066 | 3.47 | naca 4412
8 | 0.67 | 0.058 | 2.25 | naca 4412
9 | 0.77 | 0.052 | 1.30 | naca 4412
10 | 0.87 | 0.046 | 0.56 | naca 4412
11 | 0.97 | 0.042 | -0.05 | naca 4412
12 | 1.07 | 0.038 | -0.56 | naca 4412
table 4 naca 0012 cross-section geometry
no. | position (m) | chord (m) | twist (deg) | foil
1 | 0.00 | 0.206 | 41.43 | naca 0012
2 | 0.10 | 0.194 | 21.14 | naca 0012
3 | 0.17 | 0.168 | 14.72 | naca 0012
4 | 0.27 | 0.136 | 8.91 | naca 0012
5 | 0.37 | 0.112 | 5.42 | naca 0012
6 | 0.47 | 0.094 | 3.11 | naca 0012
7 | 0.57 | 0.082 | 1.47 | naca 0012
8 | 0.67 | 0.072 | 0.25 | naca 0012
9 | 0.77 | 0.064 | -0.69 | naca 0012
10 | 0.87 | 0.057 | -1.44 | naca 0012
11 | 0.97 | 0.052 | -2.05 | naca 0012
12 | 1.07 | 0.048 | -2.56 | naca 0012
table 5 eppler 417 cross-section geometry
no. | position (m) | chord (m) | twist (deg) | foil
1 | 0.00 | 0.283 | 47.42 | eppler 417
2 | 0.10 | 0.266 | 27.14 | eppler 417
3 | 0.17 | 0.231 | 20.72 | eppler 417
4 | 0.27 | 0.186 | 14.90 | eppler 417
5 | 0.37 | 0.153 | 11.42 | eppler 417
6 | 0.47 | 0.129 | 9.11 | eppler 417
7 | 0.57 | 0.112 | 7.47 | eppler 417
8 | 0.67 | 0.098 | 6.25 | eppler 417
9 | 0.77 | 0.087 | 5.31 | eppler 417
10 | 0.87 | 0.079 | 4.55 | eppler 417
11 | 0.97 | 0.072 | 3.94 | eppler 417
12 | 1.07 | 0.066 | 3.44 | eppler 417
fig. 3 cross-section blades by qblade package
fig. 4 cross-section blade by solidworks (a) naca 4412 blade (b) naca 0012 blade (c) eppler 417 blade
fig. 5 3d printing process
fig. 6 wind turbines (a) with naca 4412 cross-section blades (b) with naca 0012 cross-section blades (c) with eppler 417 cross-section blades
the material assigned to the blades was carbon fiber, and its properties are shown in figure 7. pressures of 14.25 pa, 251 pa, 986.4 pa and 1370 pa were applied to derive the post-processing results of total deformation, equivalent stress and equivalent strain shown in table 7. fig.
7 carbon fiber properties
5. results and discussion
the primary design element of a wind turbine blade is the cross-sectional area of the airfoil, which transforms the airflow velocity into a pressure distribution along the length of the blade. in this investigation, symmetrical, asymmetrical, and supercritical profiles have been used. when evaluating the performance of the profiles, the primary factors to consider are the amount of energy absorbed from the free stream, the maximum lift-to-drag ratio, and the angle of attack. not just the power factor peak, but also the airfoil's overall cross-sectional efficiency, was taken into account. figure 8 demonstrates that, compared to the other profiles, the eppler 417 profile produced less drag. additionally, as shown in figure 9, the eppler 417 profile produced the greatest pressure dispersion in the second third of the blade radius. as indicated in fig. 10, the naca 4412 profile had the highest power factor value (cp = 0.516), followed by the naca 0012 (cp = 1.491) and the eppler 417 (cp = 0.510) profiles. however, according to figure 11, the eppler 417 profile had the highest overall efficiency. according to the experimental findings (see table 6), eppler 417 performs best and produces more power than the other profiles. cfd results showed that naca 4412 undergoes less deformation and stress (figures 12 to 17 and table 7).
fig. 8 normal force distribution along the blades radius
fig. 9 tangential force distribution along the blades radius
fig. 10 the power coefficient of the cross-sections blades versus tip speed ratio
fig.
11 the area under power coefficient curve (a) naca 4412 cross-section blade (b) naca 0012 cross-section blade (c) eppler 417 cross-section blade
table 6 experimental results
wind speed (m/s) | naca 4412 rpm | naca 4412 power (w) | naca 0012 rpm | naca 0012 power (w) | eppler 417 rpm | eppler 417 power (w)
3 | 62 | 13 | 51 | 10 | 68 | 20
4.2 | 88 | 93 | 69 | 28 | 95 | 122
5.4 | 107 | 144 | 96 | 125 | 122 | 173
6.5 | 125 | 306 | 114 | 285 | 147 | 330
7.5 | 171 | 506 | 132 | 363 | 178 | 546
fig. 12 equivalent stress for naca 4412 at 251 pa and 1370 pa
fig. 13 equivalent stress for naca 0012 at 251 pa and 1370 pa
fig. 14 equivalent stress for eppler 417 at 251 pa and 1370 pa
fig. 15 total deformation for naca 4412 at 251 pa and 1370 pa
fig. 16 total deformation for naca 0012 at 251 pa and 1370 pa
fig. 17 total deformation for eppler 417 at 251 pa and 1370 pa
table 7 cfd results
blade model | applied pressure (pa) | total deformation (mm) | equivalent stress (mpa) | equivalent strain (mm/mm)
eppler 417 | 14.25 | 0.1016 | 0.0653 | 3.49e-06
eppler 417 | 251 | 1.7888 | 1.1505 | 6.15e-05
eppler 417 | 986.4 | 7.0299 | 4.5213 | 2.42e-04
eppler 417 | 1370 | 9.7638 | 6.2796 | 3.35e-04
naca 0012 | 14.25 | 0.4138 | 0.1628 | 8.70e-06
naca 0012 | 251 | 7.2898 | 2.8688 | 1.53e-04
naca 0012 | 986.4 | 28.648 | 11.274 | 6.02e-04
naca 0012 | 1370 | 39.789 | 15.659 | 8.37e-04
naca 4412 | 14.25 | 2.05e-04 | 1.49e-04 | 8.13e-09
naca 4412 | 251 | 3.61e-03 | 2.62e-03 | 1.43e-07
naca 4412 | 986.4 | 1.42e-02 | 0.01028 | 5.63e-07
naca 4412 | 1370 | 1.97e-02 | 0.01427 | 7.81e-07
6. conclusions
the effects of several cross-section airfoil types on hawt blade efficiency were studied. analysis was done on three different airfoils: supercritical (eppler 417), symmetric (naca 0012), and asymmetric (naca 4412). the analyses combined theory and experiment. theoretical analyses were carried out using fortran 90 code and the blade element momentum-based qblade code. the findings of the experimental tests supported the conclusions of the theory. at a small angle of attack, supercritical airfoils always produce the highest lift-to-drag ratio.
eppler 417 has a high chord length and twist angle, so it generates the highest power. the cfd results show that naca 4412 has the lowest total deformation and equivalent stress. this is because naca 4412 has a greater cross-section area, and stress is inversely proportional to area. overall, eppler 417 is the optimum blade cross-section, as it produces more power and has less deformation than the naca 0012.
references
[1] b. somaiday, i. czajka, m. a. r. yass, "the influence of cross-section airfoil on the hawt efficiency", in proceedings of the 7th virtual international conference on science, technology and management in energy (energetics 2021), belgrade, serbia, 16-17 2021, pp. 545-551.
[2] c. bak et al., "wind tunnel test on wind turbine airfoil with adaptive trailing edge geometry", in proceedings of the 45th aiaa aerospace sciences meeting and exhibit, 2007, p. 1016.
[3] d. g. hull, fundamentals of airplane flight mechanics, vol. 19. springer, 2007.
[4] n. tenguria, n. d. mittal and s. ahmed, "evaluation of performance of horizontal axis wind turbine blades based on optimal rotor theory", j. urban environ. eng., vol. 5, no. 1, pp. 15-23, 2011.
[5] s. a. kale and r. n. varma, "aerodynamic design of a horizontal axis micro wind turbine blade using naca 4412 profile", int. j. renew. energy res., vol. 4, no. 1, pp. 69-72, 2014.
[6] f. javed, s. javed, t. bilal and v. rastogi, "design of multiple airfoil hawt blade using matlab programming", in proceedings of the ieee international conference renewable energy resources application, 2016, pp. 425-430.
[7] a. h. muheisen, m. a. r. yass and i. k. irthiea, "enhancement of horizontal wind turbine blade performance using multiple airfoils sections and fences", j. king saud univ. eng. sci., 2021.
[8] w. w. mo, d. y. li, x. n. wang, c. t. zhong, "aeroelastic coupling analysis of the flexible blade of a wind turbine", energy, vol. 89, pp.
1001-1009, 2015.
[9] o. t. filsoof, a. yde, p. bøttcher and x. zhang, "on critical aeroelastic modes of a tri-rotor wind turbine", int. j. mech. sci., p. 106525, 2021.
[10] m. sayed, l. klein, th. lutz and e. kramer, "the impact of the aerodynamic model fidelity on the aeroelastic response of a multi-megawatt wind turbine", renew. energy, vol. 140, pp. 304-318, 2019.
[11] m. capuzzi, a. pirrera and p. m. weaver, "a novel adaptive blade concept for large-scale wind turbines. part ii: structural design and power performance", energy, vol. 73, pp. 25-32, 2014.
[12] l. dai, q. zhou, y. zhang, s. yao, s. kang and x. wang, "analysis of wind turbine blades aeroelastic performance under yaw conditions", j. wind eng. ind. aerod., vol. 171, pp. 273-287, 2017.
[13] d. ju and q. sun, "modeling of a wind turbine rotor blade system", j. vib. acoust. trans. asme, vol. 139, pp. 1–15, 2017.
[14] s. wan, l. cheng and x. sheng, "numerical analysis of the spatial distribution of equivalent wind speeds in large-scale wind turbines", j. mech. sci. technol., vol. 31, no. 2, pp. 965-974, 2017.
[15] y. wang, y. kamada, t. maeda, j. xu, s. zhou, f. zhang and c. cai, "diagonal inflow effect on the wake characteristics of a horizontal axis wind turbine with gaussian model and field measurements", energy, vol. 238, p. 121692, 2022.
[16] m. ragheb and a. m. ragheb, "wind turbines theory-the betz equation and optimal rotor tip speed ratio", fundam. adv. top. wind power, vol. 1, no. 1, pp. 19-38, 2011.
[17] m. a. r. yass, "highest power coefficient of horizontal axis wing turbine (hawt) using multiple airfoil section", test eng. manag., vol. 83, pp. 30029-30041, 2020.
facta universitatis series: electronics and energetics vol. 27, no 2, june 2014, pp.
251–258 doi: 10.2298/fuee1402251c
cmos ic radiation hardening by design
alessandra camplani, seyedruhollah shojaii, hitesh shrimali, alberto stabile, valentino liberali
infn-milano and department of physics, università degli studi di milano, via g. celoria, 16 – 20133 milano, italy
abstract. design techniques for radiation hardening of integrated circuits in commercial cmos technologies are presented. circuits designed with the proposed approaches are more tolerant to both total dose and single event effects. the main drawback of the techniques for radiation hardening by design is the increase of silicon area, compared with a conventional design.
key words: radiation hardening, cmos technology, integrated circuits
1. introduction
commercial integrated circuits (ics) may not have an adequate level of immunity to radiation to guarantee good reliability in harsh environments. radiation hard circuits undergo a set of qualification tests before being used in space (satellites) or in nuclear applications (high energy physics, nuclear power plants, medical equipment for radiology and radiotherapy). however, it is worth remarking that all electronic equipment can be affected by low dose rate radiation from various sources: e.g., natural radioactivity in materials, high energy cosmic rays, x-ray scanners in airports, etc. the evolution of ic fabrication technology towards ever denser integration has a twofold effect on radiation tolerance: at modern nano-scale size, devices are more tolerant to cumulative (long-term) effects, but on the other hand they are more prone to soft errors due to single events. therefore, the design of complex integrated systems should account for such effects. in recent years, specific techniques have been developed to obtain integrated circuits with a high immunity to radiation.
radiation tolerance can be increased either by modifying the fabrication process (rhbp: radiation hardening by process), or by adopting design techniques (rhbd: radiation hardening by design). in this paper, rhbd techniques are presented, to achieve a satisfactory tolerance to both total dose and single event effects in mos devices and circuits.
received february 3, 2014
corresponding author: valentino liberali, infn-milano and department of physics, università degli studi di milano, via g. celoria, 16 – 20133 milano, italy (e-mail: valentino.liberali@mi.infn.it)
2. interaction between radiation and silicon
the interaction between external radiation (photons like x-rays and γ-rays, charged particles like protons, electrons, and heavy ions, or neutral particles) and a semiconductor may cause two main phenomena: ionization and displacement.
2.1 ionization phenomenon
when radiation interacts with the semiconductor material, an electron in the valence band may acquire enough energy to pass into the conduction band. therefore, an electron-hole pair (hep) is generated: a free electron is present in the conduction band and a hole in the valence band. if an electric field exists in the ionization region (e.g., in biased devices), heps are separated and carriers move within the semiconductor, giving an extra (parasitic) current. then, the carriers may recombine, remain trapped, or drift into an electrode. the ionization phenomenon is measured with the linear energy transfer (let). the let indicates the quantity of energy lost by the incident particle along its path into the target material. the let depends on the atomic number and energy of the particle, on the target material, and on the collision location:
$$\mathrm{LET} = \frac{1}{\rho} \cdot \frac{dE}{dx} \quad \left[ \frac{\mathrm{MeV} \cdot \mathrm{cm}^2}{\mathrm{mg}} \right] \qquad (1)$$
where $\rho$ is the density of the target material, and $dE/dx$ indicates the average energy transferred into the target material per unit length along the particle trajectory.
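as a rough numerical illustration of eq. (1), the sketch below converts a linear stopping power de/dx into a let value by dividing by the target density, and then estimates the charge generated per micrometre of track. it assumes the usual mass-stopping-power form let = (1/ρ)·(de/dx) in mev·cm²/mg and a mean energy of about 3.6 ev per electron-hole pair in silicon; the function names and example inputs are hypothetical and do not come from the paper.

```python
ELEMENTARY_CHARGE_C = 1.602176634e-19  # coulombs
SILICON_DENSITY_MG_PER_CM3 = 2330.0    # about 2.33 g/cm^3
PAIR_ENERGY_EV = 3.6                   # assumed mean energy per e-h pair in Si


def let_from_stopping_power(de_dx_mev_per_cm: float,
                            density_mg_per_cm3: float = SILICON_DENSITY_MG_PER_CM3) -> float:
    """Return LET in MeV*cm^2/mg from a linear stopping power in MeV/cm."""
    if density_mg_per_cm3 <= 0:
        raise ValueError("density must be positive")
    return de_dx_mev_per_cm / density_mg_per_cm3


def charge_per_micron_fc(let_mev_cm2_per_mg: float,
                         density_mg_per_cm3: float = SILICON_DENSITY_MG_PER_CM3,
                         pair_energy_ev: float = PAIR_ENERGY_EV) -> float:
    """Estimate the generated charge (fC) per micrometre of particle track."""
    # MeV/cm -> eV/um: multiply by 1e6 (eV per MeV), divide by 1e4 (um per cm).
    ev_per_um = let_mev_cm2_per_mg * density_mg_per_cm3 * 1e6 / 1e4
    pairs_per_um = ev_per_um / pair_energy_ev
    return pairs_per_um * ELEMENTARY_CHARGE_C * 1e15


if __name__ == "__main__":
    let = let_from_stopping_power(2330.0)  # hypothetical stopping power, MeV/cm
    print(f"LET = {let:.2f} MeV*cm^2/mg")
    print(f"charge per um = {charge_per_micron_fc(let):.2f} fC")
```

under these assumptions, a let of 1 mev·cm²/mg in silicon corresponds to roughly 10 fc of generated charge per micrometre of track, which gives a feel for how let relates to the parasitic charge discussed in the text.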
ionization effects can be divided into two main categories:
- temporary ionization effect is due to hep separation and generation of a parasitic current;
- fixed ionization effect is due to trapping of carriers in insulators, where the mobility of carriers is lower than in the semiconductor, or at the interface between insulator and semiconductor; when positive charges are trapped, a shift of device parameters occurs, and circuit performance may be affected.
2.2 displacement
when a neutral particle interacts with the silicon lattice, it transfers energy to lattice atoms. a transferred energy greater than 20 ev can displace a silicon atom, which moves toward an interstitial position, and the displaced atom may displace other atoms along its trajectory. defects due to atom displacement in the silicon lattice act as energy levels within the band-gap. these levels alter the electrical properties of the semiconductor (e.g., lifetime of minority carriers, doping density, mobility, etc.).
3. radiation effects on ics
damaging effects due to radiation can be divided into two major categories: cumulative effects due to a long-time exposure to radiation, and single event effects due to the interaction with a single particle.
3.1 cumulative effects
from the viewpoint of circuit performance, cumulative effects can be divided into total ionizing dose (tid) effects, caused either by charged particles (e.g., electrons or protons) or by photons (x-rays and γ-rays), and displacement damage dose (ddd) effects, caused by massive particles (e.g., neutrons, protons, or heavy ions). in cmos integrated circuits, the region most sensitive to cumulative effects is the gate oxide. when a single particle collides with the oxide, heps are generated; if the ionized region is crossed by an electric field, electrons and holes are separated.
electrons are quickly collected by neighboring electrodes because their mobility is approximately 20 cm²/(v·s), while holes move slowly by hopping transport toward the sio2-si interface, because their mobility ranges from 10⁻⁴ cm²/(v·s) to 10⁻¹¹ cm²/(v·s). these holes remain trapped in the oxide for a long time (approximately from 10³ s to 10⁶ s) [1]. the trapped holes can be seen as fixed positive charges, which introduce a negative shift in threshold voltage $\Delta V_{th}$, given by:
$$\Delta V_{th} = -\frac{q}{C_{ox}} \Delta N_{ot} = -\frac{q \, t_{ox}}{\varepsilon_{ox}} \Delta N_{ot} \qquad (2)$$
where $q$ is the elementary charge, $C_{ox} = \varepsilon_{ox}/t_{ox}$ is the oxide capacitance per unit area, $N_{ot}$ is the density of holes trapped in the oxide, $\varepsilon_{ox}$ is the dielectric constant of the oxide, and $t_{ox}$ is the oxide thickness. to a first approximation, $\Delta V_{th}$ is proportional to $t_{ox}^2$. for very thin gate oxides (e.g., for thickness lower than approximately 3 nm), the threshold shift becomes negligible [2]. however, field oxides are thick (approximately in the range from 100 nm to 1000 nm) and trap positive charges. charge trapping occurs especially in the shallow trench isolation (sti) regions, at the transition between thick field oxide and thin gate oxide. the region on the side of an sti can be modeled as a parasitic transistor in parallel with the mos transistor channel. parasitic transistors have the same length as the designed transistors; however, their threshold voltage is larger, due to the thick oxide, so the parasitic transistors are normally turned off. however, positive charges trapped in the thick oxide region attract negative carriers, and this charge can be seen as a fixed charge on the gate of the parasitic transistors, which could turn on, thus creating a parasitic path between drain and source, in parallel with the mos transistor channel. in an nmos transistor, tid may induce a parasitic channel between the source and the drain, leading to a leakage current when the nmos device is in the “off” state (fig. 1).
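to make the oxide-thickness dependence concrete, the sketch below evaluates a threshold shift of the form Δvth = −q·Δnot·tox/εox for a thin gate oxide and a thick field oxide. it assumes this form of eq. (2) and a fixed, purely illustrative trapped-hole density; the function name and numeric inputs are hypothetical, and only q, ε0 and the relative permittivity of sio2 (about 3.9) are standard constants.

```python
ELEMENTARY_CHARGE_C = 1.602176634e-19   # coulombs
EPS0_F_PER_CM = 8.8541878128e-14        # vacuum permittivity, F/cm
EPS_R_SIO2 = 3.9                        # relative permittivity of SiO2


def threshold_shift_v(n_ot_per_cm2: float, t_ox_nm: float) -> float:
    """Threshold-voltage shift (V) for a trapped-hole areal density and oxide thickness."""
    if t_ox_nm <= 0:
        raise ValueError("oxide thickness must be positive")
    t_ox_cm = t_ox_nm * 1e-7                        # nm -> cm
    c_ox = EPS_R_SIO2 * EPS0_F_PER_CM / t_ox_cm     # oxide capacitance, F/cm^2
    return -ELEMENTARY_CHARGE_C * n_ot_per_cm2 / c_ox


if __name__ == "__main__":
    n_ot = 1e11  # hypothetical trapped-hole density, cm^-2
    for t_ox in (3.0, 300.0):  # thin gate oxide vs. thick field oxide, nm
        print(f"t_ox = {t_ox:6.1f} nm -> dVth = {threshold_shift_v(n_ot, t_ox):.4f} V")
```

with the trapped density held fixed, the shift grows linearly with the oxide thickness; the quadratic dependence quoted in the text appears when the trapped charge itself also increases with the oxide thickness, which this sketch does not model.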
furthermore, channel carriers can be trapped at the si-sio2 interface [3], decreasing carrier mobility and transconductance.
fig. 1 holes trapped in the shallow trench isolation (sti)
in a pmos transistor, tid causes an increase of the threshold voltage and a reduction of the effective channel width. the latter effect is negligible for usual transistor sizes; however, for very narrow pmos devices (with ≪ 1⁄ ), this effect must be taken into account [4]. ddd effects are due to collisions between neutral particles and nuclei of silicon belonging to the lattice structure [5]. lattice defects at the si-sio2 interface introduce energy states in the band-gap, which may trap channel carriers. the threshold voltage shift is:
$$\Delta V_{th} = -\frac{Q_{it}}{C_{ox}} \qquad (2)$$
where $Q_{it}$ is the trapped charge at the interface, which depends on device biasing. moreover, trap states due to lattice defects facilitate electron transitions between valence band and conduction band, and the carrier mobility decreases [6]:
$$\mu = \frac{\mu_0}{1 + \alpha \cdot \Delta N_{it}} \qquad (3)$$
where $\mu_0$ is the pre-irradiation mobility, $\alpha$ is a parameter dependent on the chosen technology, and $\Delta N_{it}$ is the number of charges trapped at the interface. it is important to point out that nowadays tid effects are negligible in the ic core. therefore, only the circuits at the ic periphery (pad ring) require special care, due to the higher voltage and the thicker oxide of the periphery transistors.
3.2 single event effects
single event effects (see) are due to charge generation in a reverse-biased p-n junction in the cmos ic. the junction may be part of a mos transistor (drain-body or source-body), or may be a well-substrate junction.
fig. 2 charge generation and parasitic current in a reverse-biased p-n junction
the electric field in the reverse-biased p-n junction separates electrons and holes.
the generated carriers are collected by neighbouring electrodes, thus giving a parasitic current with a peak due to carrier drift, followed by a tail due to carrier diffusion (fig. 2). from a functional viewpoint, the current due to see may cause a soft error, which is a non-destructive and temporary effect, or a hard error, which causes irreversible effects and is destructive. a soft error is a non-destructive see, i.e., an effect that does not cause permanent damage to the ic [7]. soft errors occur when the total parasitic charge generated is larger than the critical charge of the affected node. a single event transient (set) is a transient glitch which affects the voltage of a node in combinational logic. transients are temporary; however, they may propagate to adjacent nodes, where the effects of other sets can add up. sometimes, the sum of sets can trigger damaging effects [8]. a single event upset (seu) occurs when a see changes the logic value of a memory cell (e.g., a latch), or when set propagation toggles the data stored in a memory [9]. if a seu affects two or more memory cells, a multiple bit upset (mbu) occurs. a seu in the control logic may lead to a single event functional interruption (sefi). a single event latch-up (sel) is due to a see that triggers a positive gain loop formed by parasitic bipolar transistors in cmos technology, leading to a high current in the loop, which may damage the ic interconnections if the device is not turned off promptly [10]. other destructive sees are the single event burnout (seb), which occurs in high voltage devices when an avalanche multiplication mechanism is triggered by a parasitic charge in a reverse-biased p-n junction [11], and the single event gate rupture (segr), in which the displacement effect combined with a high parasitic gate current can result in a gate oxide rupture [12]. seb and segr occur in power mos transistors, and are not a concern for cmos logic.
hence, they will not be considered in the following sections of the paper. sensitivity to see is measured by the cross section (in square centimeters), which represents the sensitive area of the device.
4. design of radiation-hardened mos devices
special design techniques can be adopted to improve device tolerance to radiation.
4.1 nmos transistors
edge-less transistors (elt) are mos transistors with an annular gate shape. this geometry has been shown to reduce current leakage due to cumulative effects in nmos transistors, even at very high total doses, at the expense of a larger area, as shown in fig. 3(a) [13]-[14].
fig. 3 layout of (a) nmos elts; (b) conventional pmos transistors
when using elts, the internal side of the ring-shaped transistor should be used as the drain terminal of the mos device, and the external side should be the source terminal. in this way, the design minimizes the area of the drain, which is the most sensitive node for see, thus reducing the cross-section.
4.2 pmos transistors
pmos transistors are not prone to current leakage, since hole trapping does not attract channel carriers. therefore, pmos transistors do not require the elt shape, and they can be designed with conventional geometry, as shown in fig. 3(b), in order to save area and to maintain the ratio between pull-up and pull-down transistor sizes.
4.3 guard rings
the use of double guard rings around p-wells and n-wells, biased to constant voltages, prevents sel [14]. moreover, the use of guard rings around transistors of the same type biased at different voltages reduces inter-device leakage, since positive charges trapped in the sti oxide cannot induce a parasitic channel between n-type diffusions at different voltages (fig. 4). fig. 5 shows a detail of the layout of a logic circuit employing both guard rings and elts.
(a) (b) fig.
4 cross-section of two nmos transistors: (a) without guard rings; (b) with guard rings between the two transistors
fig. 5 portion of a layout with elts and guard rings
compared to conventional layout design, elts and guard rings require a larger silicon area. therefore, a higher level of radiation tolerance can be achieved only at the expense of a larger area [15].
5. design of radiation-hardened cmos circuits
an ic designer may use other radiation hardening techniques, such as redundancy and error correcting codes at the architectural level, and optimization of logic cells at the circuit level.
5.1 architectural solutions
at the architectural level, radiation hardness can be improved by using redundant logic, such as ecc (error correcting codes). another example is “scrambling” in a memory array: the physical location of bits does not correspond to the logical bit position, to avoid logical multiple bit upsets (mbu) due to see. a further improvement can be obtained by storing each bit of a byte in a different memory array, and by providing each memory array with separate bit-line and word-line decoders, to avoid mbus due to address upset [15].
5.2 logic circuits
sensitivity to see can be analyzed through injection of “soft faults” in different circuit locations [17]. simulation results demonstrate that the nodes most sensitive to set are the circuit nodes which are not directly connected to voltage supplies. therefore, set sensitivity can be reduced by using fully cmos logic and by minimizing the number of transistors which are not directly connected to supplies [18]. to mitigate sefi, the number of feedback loops in the circuits must be minimized.
6. conclusion
this paper has presented an overview of the effects due to the interaction between radiation and ics. the overview also emphasizes some design techniques developed to avoid or to mitigate radiation effects.
it is important to remark that design solutions to improve radiation hardness lead to an increase of the ic area. nevertheless, they should be adopted when robustness in a radiation environment is an important parameter. in addition, rhbd techniques, in comparison with other approaches (shielding or component selection), can be applied to different fabrication processes in order to increase the overall radiation hardening.
references
[1] p. j. mcwhorter and p. s. winokur, “simple technique for separating the effects of interface traps and trapped-oxide charge in metal-oxide-semiconductor transistors,” appl. phys. lett., vol. 48, pp. 133–135, jan. 1986.
[2] n. s. saks, m. g. ancona, and j. a. modolo, “radiation effects in mos capacitors with very thin oxides at 80 k,” ieee trans. nucl. sci., vol. 31, pp. 1249–1255, dec. 1984.
[3] f. b. mclean, “a framework for understanding radiation-induced interface states in sio2 mos structures,” ieee trans. nucl. sci., vol. 27, pp. 1651–1657, dec. 1980.
[4] m. gaillardin, v. goiffon, s. girard, m. martinez, p. magnan, and p. paillet, “enhanced radiation-induced narrow channel effects in commercial 0.18 μm bulk technology,” ieee trans. nucl. sci., vol. 58, pp. 2807–2815, dec. 2011.
[5] g. baccarani and m. r. wordeman, “transconductance degradation in thin-oxide mosfet’s,” ieee trans. electron devices, vol. 30, pp. 1295–1304, oct. 1983.
[6] j. r. schwank, f. w. sexton, m. r. shaneyfelt, and d. m. fleetwood, “total ionizing dose hardness assurance issues for high dose rate environments,” ieee trans. nucl. sci., vol. 54, pp. 1042–1048, aug. 2007.
[7] t. c. may, “soft errors in vlsi: present and future,” ieee trans. comp., hybrids, manufact. technol., vol. 2, pp. 377–387, dec. 1979.
[8] j. l. andrews, j. e. schroeder, b. l. gingerich, w. a. kolasinski, r. koga, and s. e. diehl, “single event error immune cmos ram,” ieee trans. nucl. sci., vol. 29, pp. 2040–2043, dec. 1982.
[9] l. t. clark, k. c. mohr, k. e. holbert, x. yao, j. knudsen, and h. shah, “optimizing radiation hard by design sram cells,” ieee trans. nucl. sci., vol. 54, pp. 2028–2036, dec. 2007.
[10] johnston, “the influence of vlsi technology evolution on radiation induced latchup in space systems,” ieee trans. nucl. sci., vol. 43, pp. 505–521, apr. 1996.
[11] j. h. hohl and k. f. galloway, “analytical model for single event burnout of power mosfets,” ieee trans. nucl. sci., vol. 34, pp. 1275–1280, dec. 1987.
[12] f. wheatley, j. l. titus, and d. i. burton, “single-event gate rupture in vertical power mosfets; an original empirical expression,” ieee trans. nucl. sci., vol. 41, pp. 2152–2159, dec. 1994.
[13] g. anelli, m. campbell, m. delmastro, f. faccio, s. floria, a. giraldo, e. heijne, p. jarron, k. kloukinas, a. marchioro, p. moreira, and w. snoeys, “radiation tolerant vlsi circuits in standard deep submicron cmos technologies for the lhc experiments: practical design aspects,” ieee trans. nucl. sci., vol. 46, pp. 1690–1696, dec. 1999.
[14] calligaro, v. liberali, a. stabile, m. bagatin, s. gerardin, and a. paccagnella, “a multi-megarad, radiation hardened by design 512 kbit sram in cmos technology,” in proc. ieee int. conf. on microelectronics (icm), cairo, egypt, dec. 2010, pp. 375–378.
[15] m. benigni, v. liberali, a. stabile, and c. calligaro, “design of rad-hard sram cells: a comparative study,” in proc. ieee int. conf. on microelectronics (miel), niš, serbia, may 2010, pp. 279–282.
[16] stabile, v. liberali, and c. calligaro, “a radiation hardened 512 kbit sram in 180 nm cmos technology,” in proc. int. conf. on electronics, circuits and systems (icecs), hammamet, tunisia, dec. 2009, pp. 655–658.
[17] do, v. liberali, a. stabile, and c. calligaro, “layout-oriented simulation of non-destructive single event effects in cmos ic blocks,” in proc. eur. conf. on radiation and its effects on components and systems (radecs), bruges, belgium, sep. 2009.
[18] stabile, v.
liberali, and c. calligaro, “design of a rad-hard library of digital cells for space applications,” in proc. int. conf. on electronics, circuits and systems (icecs), malta, sept. 2008, pp. 149–152.
/jpeg2000coloracsimagedict << /tilewidth 256 /tileheight 256 /quality 30 >> /jpeg2000colorimagedict << /tilewidth 256 /tileheight 256 /quality 30 >> /antialiasgrayimages false /cropgrayimages true /grayimageminresolution 300 /grayimageminresolutionpolicy /ok /downsamplegrayimages true /grayimagedownsampletype /bicubic /grayimageresolution 300 /grayimagedepth -1 /grayimagemindownsampledepth 2 /grayimagedownsamplethreshold 1.50000 /encodegrayimages true /grayimagefilter /dctencode /autofiltergrayimages true /grayimageautofilterstrategy /jpeg /grayacsimagedict << /qfactor 0.15 /hsamples [1 1 1 1] /vsamples [1 1 1 1] >> /grayimagedict << /qfactor 0.15 /hsamples [1 1 1 1] /vsamples [1 1 1 1] >> /jpeg2000grayacsimagedict << /tilewidth 256 /tileheight 256 /quality 30 >> /jpeg2000grayimagedict << /tilewidth 256 /tileheight 256 /quality 30 >> /antialiasmonoimages false /cropmonoimages true /monoimageminresolution 1200 /monoimageminresolutionpolicy /ok /downsamplemonoimages true /monoimagedownsampletype /bicubic /monoimageresolution 1200 /monoimagedepth -1 /monoimagedownsamplethreshold 1.50000 /encodemonoimages true /monoimagefilter /ccittfaxencode /monoimagedict << /k -1 >> /allowpsxobjects false /checkcompliance [ /none ] /pdfx1acheck false /pdfx3check false /pdfxcompliantpdfonly false /pdfxnotrimboxerror true /pdfxtrimboxtomediaboxoffset [ 0.00000 0.00000 0.00000 0.00000 ] /pdfxsetbleedboxtomediabox true /pdfxbleedboxtotrimboxoffset [ 0.00000 0.00000 0.00000 0.00000 ] /pdfxoutputintentprofile () /pdfxoutputconditionidentifier () /pdfxoutputcondition () /pdfxregistryname () /pdfxtrapped /false /createjdffile false /description << /ara /bgr /chs /cht /cze /dan /deu /esp /eti /fra /gre /heb /hrv (za stvaranje adobe pdf dokumenata najpogodnijih za visokokvalitetni ispis prije tiskanja koristite ove postavke. stvoreni pdf dokumenti mogu se otvoriti acrobat i adobe reader 5.0 i kasnijim verzijama.) 
/hun /ita /jpn /kor /lth /lvi /nld (gebruik deze instellingen om adobe pdf-documenten te maken die zijn geoptimaliseerd voor prepress-afdrukken van hoge kwaliteit. de gemaakte pdf-documenten kunnen worden geopend met acrobat en adobe reader 5.0 en hoger.) /nor /pol /ptb /rum /rus /sky /slv /suo /sve /tur /ukr /enu (use these settings to create adobe pdf documents best suited for high-quality prepress printing. created pdf documents can be opened with acrobat and adobe reader 5.0 and later.) >> /namespace [ (adobe) (common) (1.0) ] /othernamespaces [ << /asreaderspreads false /cropimagestoframes true /errorcontrol /warnandcontinue /flattenerignorespreadoverrides false /includeguidesgrids false /includenonprinting false /includeslug false /namespace [ (adobe) (indesign) (4.0) ] /omitplacedbitmaps false /omitplacedeps false /omitplacedpdf false /simulateoverprint /legacy >> << /addbleedmarks false /addcolorbars false /addcropmarks false /addpageinfo false /addregmarks false /convertcolors /converttocmyk /destinationprofilename () /destinationprofileselector /documentcmyk /downsample16bitimages true /flattenerpreset << /presetselector /mediumresolution >> /formelements false /generatestructure false /includebookmarks false /includehyperlinks false /includeinteractive false /includelayers false /includeprofiles false /multimediahandling /useobjectsettings /namespace [ (adobe) (creativesuite) (2.0) ] /pdfxoutputintentprofileselector /documentcmyk /preserveediting true /untaggedcmykhandling /leaveuntagged /untaggedrgbhandling /usedocumentprofile /usedocumentbleed false >> ] >> setdistillerparams << /hwresolution [2400 2400] /pagesize [612.000 792.000] >> setpagedevice facta universitatis series: electronics and energetics vol. 32, n o 3, september 2019, pp. 
439-448 https://doi.org/10.2298/fuee1903439v © 2019 by university of niš, serbia | creative commons license: cc by-nc-nd a brief overview of stochastic instruments for measuring flows of electrical power and energy  vladimir vujicic 1 , dragan pejic 2 , aleksandar radonjic 3 1 vladimir vujicic entrepreneur consultant in electrical engineering and energetics, novi sad, serbia 2 faculty of technical sciences, novi sad, serbia 3 institute of technical sciences of the serbian academy of sciences and arts, belgrade, serbia abstract. this paper gives a brief overview of three instruments suitable for measuring the flow of electrical power and energy. the first instrument is a single-phase power analyzer, while the other two instruments are double and quadruple three-phase power analyzers. in addition to overviewing these instruments, the paper presents a possible improvement of a quadruple three-phase power analyzer. the implementation of this improvement would make it possible to use a quadruple three-phase power analyzer as support for the phasor measurement unit. key words: electrical power, electrical energy, power grid network, stochastic instruments, measurement accuracy, phasor measurement unit. 1. introduction electrical energy is the most common and widely used type of energy in the world. it can be easily converted to other forms of energy, such as heat, light, or mechanical power. in industry, electrical energy is usually calculated indirectly: as the product of electrical power and time. in contrast to it, electrical power is calculated directly: as the product of voltage and current. the precise measurement of all these quantities is a necessary precondition for the proper operation of technological equipment. their values give us all the necessary information about the technological process. in the offline mode, this information can be used to analyze and improve the process from the economic point of view. 
in contrast, in the online mode, the obtained information is used as input data for various scada systems. in this way, it is possible to perform real-time control of very complex technological processes.

received october 25, 2018; received in revised form april 23, 2019
corresponding author: aleksandar radonjic, institute of technical sciences of the serbian academy of sciences and arts, knez-mihailova 35/iv, 11000 belgrade, serbia (e-mail: sasa_radonjic@yahoo.com)

in the last 20 years a large number of methods for measuring electrical power and energy have been developed. they are implemented in three types of devices: 1) instruments for measuring the quality of electrical energy, 2) power analyzers, and 3) smart meters. besides having different roles, these devices have different prices: the instruments for measuring the quality of electrical energy are one order of magnitude more expensive than power analyzers and two orders of magnitude more expensive than smart meters.

2. the vmp20 instrument

the vmp20 instrument is a single-phase power analyzer (fig. 1). it was designed in 1996 by the authors and their colleagues. this device, based on a national patent [1], measures four quantities at two-second intervals:
1) single-phase voltage (with an accuracy of 0.5 % of full scale),
2) single-phase current (with an accuracy of 0.5 % of full scale),
3) single-phase active power (with an accuracy of 1 % of full scale),
4) the grid frequency (with an accuracy of 0.02 % of full scale).

fig. 1 the vmp20 instrument.

the instrument is connected to a pc via an rs232 interface. the software installed on the supporting pc (vmpcalc 2.1) is intended for additional processing of the measured data (fig. 2).
this includes:
a) the calculation of the reactive and apparent power,
b) the calculation of the impedance,
c) the calculation of the minimum and maximum values of all measured quantities,
d) the calculation of the mean value and standard deviation of all measured quantities,
e) the calculation of the peak power (maximum 15-minute average power),
f) the calculation of the maximum 15-minute average current value,
g) the calculation of the maximum 15-minute average reactive and apparent power,
h) the generation of reports for a given time interval,
i) the graphical representation and visualization of the measured/calculated quantities.

using these functions, the authors have successfully tested the ability of the vmp20-based system (vmp20 instrument + pc + vmpcalc 2.1 software) to detect various disturbances in a low voltage distribution network (lvdn) [2]. some examples are illustrated in figs. 3 and 4.

fig. 2 the basic window of vmpcalc 2.1.
fig. 3 continual measurement of the phase voltage using the vmp20 instrument.
fig. 4 continual measurement of the grid frequency using the vmp20 instrument.

3. the mm2/mm4 instrument

as a result of intensive research, in the early 2010s the authors and their colleagues designed two new instruments: a double three-phase power analyzer, called mm2, and a quadruple three-phase power analyzer, called mm4 (fig. 5).

fig. 5 the mm2 instrument (left) and mm4 instrument (right).

both devices are based on national patents [1] and [3] and use a two-bit sddft processor [4] to process some measured data.
owing to this, one mm4 device can measure up to 70 quantities:
1) 3 voltage rms values (with an accuracy of 0.2 % of full scale) [5],
2) 16 current rms values (with an accuracy of 0.2 % of full scale) [5],
3) 12 active powers (with an accuracy of 0.5 % of full scale) [5],
4) 38 fundamental fourier coefficients (with an accuracy of 0.2 % of full scale) [6],
5) the power grid frequency (with an accuracy of 0.02 % of full scale) [7].

unlike the vmp20, the mm4 is connected to a pc via a usb cable. the software installed on the pc (vmpcalc 3.0) performs three-phase processing and can calculate fryze's reactive power (rp) in the three phases and the fundamental of budeanu's rp in the three phases (based on the measured values of the fundamental fourier coefficients). the authors have successfully tested the ability of the mm-based system (mm2/mm4 instrument + pc + vmpcalc 3.0 software) to detect, locate and measure unregistered electricity consumption. one such test was performed five years ago for the needs of the serbian national power distribution company’s branch (formerly called “elektrovojvodina”). in that case, along with the company's system (system1), an additional mm4-based system (system2) was installed as a redundant system. the key hardware elements of this system (two mm2/mm4 instruments and one pc) were placed in the substation and connected at the output of the distribution transformer (fig. 6). in addition, one energy power meter (labeled br in fig. 6) was placed and connected on each distant pole. thanks to this approach, it was possible to measure electricity consumption independently of the company's system. by comparing the measurement results of system1 and system2, it was possible to detect and locate unregistered electricity consumption. in this particular case, we found a large discrepancy between recorded and actual consumption (especially on the 4th pole) (fig. 7).
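the comparison of the two systems described above can be sketched in a few lines of python. this is a minimal illustration, not the authors' software: the function name, the 5 % tolerance and all energy readings are hypothetical, and it is assumed that both systems provide an energy total per pole for the same time interval.

```python
def find_unregistered(registered_kwh, measured_kwh, rel_tol=0.05):
    """Return {pole: missing_kwh} for poles where the independently measured
    energy exceeds the registered energy by more than rel_tol (relative)."""
    suspects = {}
    for pole, measured in measured_kwh.items():
        registered = registered_kwh.get(pole, 0.0)
        missing = measured - registered
        if measured > 0 and missing / measured > rel_tol:
            suspects[pole] = missing
    return suspects


# hypothetical 96-hour energy totals per pole, in kWh
registered = {1: 410.0, 2: 385.0, 3: 402.0, 4: 250.0}   # company's system (system1)
measured = {1: 415.0, 2: 390.0, 3: 405.0, 4: 520.0}     # redundant system (system2)

suspects = find_unregistered(registered, measured)
print(suspects)
```

small differences within the tolerance are treated as measurement error, so only poles with a clearly larger measured consumption are reported.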
fig. 6 a schematic diagram of system2.
fig. 7 the 96-hour measurement results obtained using system1 (green line) and system2 (red line).

4. further improvement of the mm4 instrument

among all of the above-mentioned instruments, the most advanced is the mm4. it performs measurements in both the time domain (the measurement of the rms value of the voltage/current and the measurement of the active power) (fig. 8) and the fourier domain (the measurement of fourier coefficients of the input voltage/current) (fig. 9). from figs. 8 and 9 it can be seen that the a/d conversion and mac (multiply and accumulate) operations are extremely simple. the instrument is, therefore, simple and reliable, and it has a small number of systematic errors that can be easily identified and eliminated [8].

fig. 8 two-bit mac scheme in time domain.
fig. 9 two-bit mac scheme in transformation domain.

the first scheme is intended for measuring the mean value of the product of two analog signals $f_1(t)$ and $f_2(t)$ (e.g. voltage and current). for that purpose, two uncorrelated dithers $h_1$ and $h_2$ have to be added (fig. 8). in that case, the output value $\bar{\Psi}$ will be equal to

$$\bar{\Psi} = \frac{1}{N}\sum_{i=1}^{N}\Psi_1(i)\,\Psi_2(i) = \frac{1}{T}\int_{0}^{T} f_1(t)\,f_2(t)\,dt \qquad (1)$$

where $T$ denotes the measurement interval length and $N$ the number of samples. the second scheme (fig. 9), on the other hand, is intended for measuring the harmonic components (fourier coefficients $a_j$ and $b_j$) of the input signal $f_1(t)$. as can be seen, the analog sum of the signals $f_2(t)$ and $h_2$ is replaced by memorized two-bit samples of a dithered base function (dbf).
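as a rough numerical check of eq. (1), the time-domain scheme can be simulated in python. this is only a sketch under stated assumptions, not the instrument's fpga implementation: the three-level quantizer with levels -r, 0, +r, the uniform dither of width r, and all function names and signal amplitudes are illustrative choices that are not taken from the paper.

```python
import numpy as np


def two_bit_adc(x, dither, r=1.0):
    """Assumed three-level quantizer (levels -r, 0, +r) applied to x + dither."""
    return r * np.clip(np.round((x + dither) / r), -1, 1)


def stochastic_mean_product(f1, f2, n, r=1.0, seed=0):
    """Average of the product of two dithered, coarsely quantized signals."""
    rng = np.random.default_rng(seed)
    t = np.arange(n) / n                      # normalized time, interval T = 1
    h1 = rng.uniform(-r / 2, r / 2, n)        # two independent (uncorrelated) dithers
    h2 = rng.uniform(-r / 2, r / 2, n)
    psi1 = two_bit_adc(f1(t), h1, r)
    psi2 = two_bit_adc(f2(t), h2, r)
    return np.mean(psi1 * psi2)


cycles = 50                                   # whole number of periods inside T


def voltage(t):
    return 0.8 * np.cos(2 * np.pi * cycles * t)


def current(t):
    return 0.6 * np.cos(2 * np.pi * cycles * t - np.pi / 3)


estimate = stochastic_mean_product(voltage, current, n=400_000)
exact = 0.5 * 0.8 * 0.6 * np.cos(np.pi / 3)   # mean of the product of two sinusoids
print(estimate, exact)
```

under these assumptions each quantized sample is an unbiased estimate of the corresponding input sample, so the averaged product approaches the mean value of the product of the two signals as the number of samples grows.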
for instance, if $f_2(t) = R_2\cos(j\omega t)$, the output value $\bar{\Psi}$ will be equal to

$$\bar{\Psi} = \frac{1}{N}\sum_{i=1}^{N}\Psi_1(i)\,\Psi_2(i) = \frac{1}{T}\int_{0}^{T} f_1(t)\,R_2\cos(j\omega t)\,dt = \frac{R_2}{2}\,a_j = \frac{a_j}{2} \qquad (2)$$

where $R_2 = 1$ represents the dbf range, while $\omega$ denotes the fundamental frequency. analogously, if $f_2(t) = R_2\sin(j\omega t)$, the output value $\bar{\Psi}$ will be equal to

$$\bar{\Psi} = \frac{1}{N}\sum_{i=1}^{N}\Psi_1(i)\,\Psi_2(i) = \frac{1}{T}\int_{0}^{T} f_1(t)\,R_2\sin(j\omega t)\,dt = \frac{R_2}{2}\,b_j = \frac{b_j}{2} \qquad (3)$$

in [4], it was shown that the mm4 measures all parameters necessary for the calculation of the electrical power (according to the ieee std. 1459-2010). all signal processing is performed by two fpga chips, which were made nine years ago. thanks to the great advancement of fpga technology [12], the performance of the mm4 instrument can be greatly improved. one such improvement would make it possible to use the mm4 as support for the phasor measurement unit (pmu). for instance, the problem of measuring the present power value using four digitized samples of the voltage and current taken in the sliding half-cycle of the grid frequency was formulated and solved in [11]. the authors of [11] have shown that, for this purpose, one needs to know the values of both the fundamental and the largest odd higher harmonic. one solution to this problem is the application of four stochastic digital dft (sddft) processors (figs. 10 and 11). one sddft (fig. 10) is intended to calculate the fourier coefficients within one voltage cycle (20 ms). however, by using four sddft processors, which are successively "phase-shifted" by π/2 (fig. 11), it is possible to measure the fourier coefficients within the sliding quarter-cycle of the voltage signal (fig. 12).

fig. 10 optimal two-bit sddft processor for measuring 2m fourier coefficients.
fig. 11 optimal two-bit sddft processor for measuring i-th fourier coefficient within the sliding quarter-cycle of the voltage signal.
fig. 12 optimal two-bit quadruple sddft processor for measuring 2k+1 odd fourier coefficients within the sliding quarter-cycle of the grid frequency.

unlike the mm4, which is synchronized with the grid frequency, which varies [3], the pmu is synchronized with astronomical time, which does not vary [10]. as a result, the output data from the mm4 (one quadruple sddft processor) may be delayed by up to two sampling periods of the pmu, i.e. 5 ms. by embedding two quadruple sddft processors inside the mm4, this delay can be reduced to at most half a sampling period of the pmu, i.e. 1.25 ms. a special problem is the determination of the largest odd higher harmonic. it needs to be solved within a few microseconds, which is a topic beyond the scope of this paper.

5. discussion

the instruments described in the previous sections enable control and monitoring of the flow of electrical power and energy in an lvdn. the number of users of electrical energy can be practically arbitrary: from several tens to several thousands. an additional advantage is the fact that the mm2 and mm4 instruments are based on fpga technology. therefore, they can be improved without a new hardware design. some improvements in that sense were presented in [4] and [8]. the first reference describes the improvement in terms of accuracy, while the second one shows how to determine the consumer's profile (capacitive, inductive, thermogenic or mixed) and its behavior. all these features were obtained by reprogramming the fpga chips. besides this, practical experience has shown that the pc is the most sensitive component of the system. thus, its replacement with a beaglebone device [10] was suggested in [9].
on the other hand, by replacing the existing fpga chips with more advanced ones, it is possible to measure the fourier coefficients within the sliding half-cycle of the grid frequency. consequently, it is also possible to measure the present electrical power within the sliding half-cycle of the grid frequency. it is interesting to note that this requires embedding at least three additional sddft processors, while the rest of the instrument remains unchanged.

6. conclusion

in this paper, we gave an overview of three instruments that have been constructed by the authors and their colleagues. compared to corresponding commercial solutions, they provide control of the flow of electrical power and energy that is an order of magnitude cheaper and no less reliable. because of their significantly lower price, they can also be used as redundant systems alongside scada systems. one example of such a system is described in this paper. finally, the paper presents a proposal for a significant improvement of the mm4 instrument. it is based on embedding three additional sddft processors that are successively "phase-shifted" by π/2. this improvement is a necessary precondition for solving a significant problem in practice: determining the fundamental and the largest odd higher harmonic in the power grid, which enables the calculation of the present power value.

acknowledgement: this work was supported by the serbian ministry of education and science under grant tr 32019.

references
[1] v. vujicic and s. milovancev, “digitalni instrument za merenje proizvoda dva analogna periodična signala”, yu patent p-742/95, 1995 (in serbian).
[2] v. vujicic et al., “concept of stochastic measurements in the fourier domain”, in proceedings of the ieee 16th international conference on harmonics and quality of power, may 2014, pp. 288-292.
[3] v. vujicic, “digitalni instrument za merenje harmonika”, yu patent p-628/96, 1998 (in serbian).
[4] d. pejic et al., “stochastic digital dft processor and its application to measurement of reactive power and energy”, measurement, vol. 124, pp. 494-504, aug. 2018.
[5] v. vujicic et al., “low frequency stochastic true rms instrument”, ieee trans. instrum. meas., vol. 48, no. 2, pp. 467-470, apr. 1999.
[6] p. sovilj et al., “stochastic measurement of reactive power using a two-bit a/d converter”, in proceedings of the imeko tc-4 int. symp. on understanding the world through electrical and electronic measurement, sept. 2016, pp. 176-179.
[7] a. radonjic, p. sovilj and v. vujicic, “stochastic measurement of power grid frequency using a two-bit a/d converter”, ieee trans. instrum. meas., vol. 63, no. 1, pp. 56-62, jan. 2014.
[8] m. urekar et al., “accuracy improvement of the stochastic digital electrical energy meter”, measurement, vol. 98, pp. 139-150, feb. 2017.
[9] d. davidovic et al., “optimalni redundantni merni sistem za nadzor tokova električne snage i energije”, in proceedings of the energetika 2017, zlatibor, mar. 2017 (in serbian).
[10] https://beagleboard.org/bone
[11] a. ghanavati, h. lev-ari and a. stankovic, “a sub-cycle approach to dynamic phasors with application to dynamic power quality metrics”, ieee trans. power delivery, vol. 33, no. 5, pp. 2217-2225, oct. 2018.
[12] https://indico.cern.ch/event/283113/contributions/1632265/attachments/522019/720041/zibell_how_fpgas_work.pdf

facta universitatis series: electronics and energetics vol. 34, no 4, december 2021, pp. 631-645
https://doi.org/10.2298/fuee2104631c
original scientific paper

a review on the pursuit of an optimal microwave absorber

soma chakraborty1, soumik chakraborty2
1department of electronics and communication engineering, indian institute of information technology, nagpur, maharashtra-441108, india
2department of electronics and instrumentation engineering, nit silchar, assam-788010, india

abstract. mitigation of electromagnetic radiation is essential for reliable communication of information. the challenge lies in achieving sufficiently good absorption over a broad range of frequencies. considering the applications in airborne and handheld devices where lightweight, thin, conformable and broadband absorbers are desired, numerous techniques and methods are applied to design broadband absorbers. in this review paper, a detailed analysis of electromagnetic absorbers, including their evolution, the materials used, and characteristics such as absorption efficiency over the years, is presented. recent research progress on various polymer-based and metamaterial-based microwave shields is included along with the findings. several prospects such as broadbanding, flexibility and multibanding are described here.
various material and structural compositions offering good absorption performance in different frequency bands are also summarized; these techniques can be used for suppressing electromagnetic interference and radar signature. the paper specifies the aspects one encounters while designing and realizing a perfect microwave absorber. explored here are several works of distinguished authors which are based on various techniques used to achieve good absorption performance with ease of mounting.

key words: absorber, electromagnetic, metamaterial, microwave, bandwidth

received may 16, 2021; received in revised form august 20, 2021
corresponding author: soma chakraborty, department of electronics and communication engineering, indian institute of information technology, nagpur, maharashtra-441108, india (e-mail: soma.ch15@gmail.com)

1. introduction

electromagnetic emissions are usually generated and transmitted during the operation of wireless and electronic devices. beyond a certain level, these emissions cause operational interference and are classified as electromagnetic interference (emi). the growth of modern high-speed electronic devices packaged alongside electromagnetic wave emitting sources, in applications such as cellular telephony, wi-fi, bluetooth, etc., is posing newer challenges for the designer. in addition, this multitude of applications has created an even more congested electromagnetic environment, leading to operational challenges for systems in close proximity [1]. the electromagnetic vulnerability and radiation hazard have to be controlled to obtain an electromagnetically compatible (emc) environment by reducing emi. microwave absorbers/shields are generally used to sufficiently reduce emi. traditional microwave absorbers can be dielectric, magnetic or magneto-dielectric. the structures consist of one or more filler materials reinforced in a matrix material, thus forming a composite with or without a metal back.
the electrical or magnetic properties of these materials can be altered to achieve high absorption (reflection loss) over a broad band of frequencies. although dielectric microwave absorbing materials achieve good absorption performance, the thickness of the absorber increases by many orders to get good attenuation. for effective absorption there should be minimum reflection from the absorber surface. when the two reflected waves (from the front surface and from the back of the absorber) are out of phase, they cancel each other and so reduce reflection. this is possible if the two waves interfere destructively, i.e., have a path difference of λ/2. since the wave reflected from the back travels twice the thickness of the absorber (t), this requires t to be equal to an odd multiple of λg/4, where λg = λ0 / (|εr||μr|)^(1/2), and |εr| and |μr| are the moduli of the complex permittivity (εr) and complex permeability (μr), respectively. the magnetic component of the absorber improves matching at the air-absorber interface (z′ = √(μ/ε)). magnetic losses along with dielectric losses enhance attenuation of the incident wave, resulting in reduced thickness of the absorber as the guide wavelength reduces by a factor of 1/√(με). magnetic materials offer an effective way of attenuating electromagnetic waves by way of better impedance matching at the interface of the absorber and also reducing its thickness.

then, there are metamaterial absorbers, which are artificially engineered homogeneous materials consisting of periodic unit cells that possess electromagnetic characteristics not found in natural materials [2]. the word “meta” means beyond, and “metamaterials” stands for artificial composite materials. the homogeneity condition is attained by making the unit cell size (a) much smaller than the wavelength of the incident wave in the guided medium, and the effective homogeneity condition is satisfied for a ≤ λg/4. the structure consists of top and bottom conducting layers isolated by a dielectric interlayer.
the dielectric layer in the middle controls the input impedance and impedance matching, yielding equivalent inductances (l) and capacitances (c) which form an lc equivalent circuit. since both electric and magnetic fields are involved in em wave propagation, permeability (µ) together with permittivity (ɛ) plays an important role in absorber performance.

fig. 1 schematic representation of absorbing type emi shielding mechanism

a schematic representation of the various wave components involved in absorption is shown in figure 1. design of microwave absorbers with enhanced absorption performance requires two important conditions to be satisfied: the impedance matching characteristic and the attenuation characteristic. when an electromagnetic wave is incident on an absorber, reflection takes place at the free space-absorber interface due to the mismatch in impedance. reflections can be minimized if the impedance of the absorber is matched to the free space impedance, resulting in penetration of the wave into the absorber, which is the first condition. within the absorber, dissipation of radio frequency (rf) energy is maximized, resulting in rapid attenuation of the amplitude as the wave propagates in the absorber structure. this is the second condition. hence, in this review, an effort has been made to describe the need for a polarization-insensitive microwave absorber offering optimal broadband absorption up to a wide angle of incidence with minimum thickness, as well as cost.

2. absorbers the archives

investigations on electromagnetic wave absorbers started in the netherlands, with the first known absorber being patented in 1936, which was a quarter-wave resonant type structure comprising carbon black (cb) and tio2. carbon black provides the dissipation, and a high dielectric constant can be achieved using tio2 for reduced thickness [3].
absorbers were first used practically during world war ii (1939-1945), when germany used two types of absorbing materials for camouflaging submarines and periscopes [4, 5]. one of them was the “wesch” material, in the form of a rubber sheet of about 0.3 inches thickness infused with carbonyl powder, with a grid-like structure resonating at 3 ghz. the other was the “jaumann” absorber of about 3 inches thickness, consisting of rigid plastic and resistive sheets placed alternately with decreasing resistances, providing a gradual transition from a low to a high loss medium with a reflection loss of more than -20 db over the range of 2-15 ghz. it was during this period that j. l. snoek explored the possibility of using ferrite as an absorber [6]. during 1941-1945, materials known as “harp” (halpern-anti-radar-paint) were used by the united states for airborne and shipborne applications in the x-band. the reflection loss of the absorbers used was in the range of 15-20 db at resonance. the absorbers used for the airborne environment contained disc-shaped aluminum flakes (high dielectric constant of 150) infused in rubber and cb, whereas the absorbers used for the shipborne environment consisted of a high concentration of iron particles bound by neoprene rubber (dielectric constant of 20); their thicknesses were 0.025 inches and 0.07 inches, respectively. the magnetic permeability of iron shows resonance behavior at such high frequencies. around that time, another absorber, commonly known as the salisbury screen absorber, was also developed in the radiation laboratory [7]. this quarter-wavelength absorber, in which a resistive sheet (cloth coated with graphite) of around 377 ohm is located a quarter wavelength in front of a metal surface, behaved as a resonating structure.
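the resonant behaviour of such a screen can be illustrated with a simple normal-incidence transmission-line calculation. the following python sketch is only an illustration under stated assumptions: an ideal resistive sheet, a lossless spacer backed by a perfect conductor, and an air spacer (eps_r = 1); the function and variable names are hypothetical.

```python
import numpy as np

Z0 = 376.73          # free-space wave impedance, ohms
C = 299_792_458.0    # speed of light, m/s


def salisbury_reflection_db(freq_hz, sheet_ohm, spacer_m, eps_r=1.0):
    """Reflection (dB) at normal incidence for a resistive sheet placed
    spacer_m in front of a metal plate (ideal, lossless spacer assumed)."""
    beta = 2 * np.pi * freq_hz * np.sqrt(eps_r) / C
    z_spacer = Z0 / np.sqrt(eps_r)
    # metal-backed spacer seen from the sheet: a short-circuited line
    z_short = 1j * z_spacer * np.tan(beta * spacer_m)
    # the sheet resistance appears in parallel with the shorted line
    z_in = sheet_ohm * z_short / (sheet_ohm + z_short)
    gamma = (z_in - Z0) / (z_in + Z0)
    return 20 * np.log10(np.abs(gamma))


f0 = 3e9
quarter_wave = C / (4 * f0)                      # spacer thickness for an air gap
at_resonance = salisbury_reflection_db(f0, 377.0, quarter_wave)
off_resonance = salisbury_reflection_db(f0 / 2, 377.0, quarter_wave)
print(f"spacer = {quarter_wave * 1e3:.1f} mm")
print(f"reflection at 3 GHz: {at_resonance:.1f} dB, at 1.5 GHz: {off_resonance:.1f} dB")
```

at the design frequency the shorted quarter-wave spacer looks like an open circuit, so the incident wave sees only the sheet, which is nearly matched to free space; away from that frequency the match degrades, which is why the structure is narrowband.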
this arrangement, in which a resistive cloth (known as uskon cloth) was adhered to one side of a 0.75 inch thick slice of wood and a metal foil to the other side, showed resonance at 3 ghz with absorbance over 20-30 % of the frequency range when used practically.

simultaneously, structurally modified absorbers were being investigated by the radiation laboratory. it was observed that reflections at normal incidence were reduced when using long pyramid-shaped absorber structures, due to absorption of the multiple reflections generated in the direction of the vertex of the pyramid. proper impedance matching can be achieved by using graded and tapered absorbers as they provide a progressive transition of the impedance [8-11]. broadbanding was experimentally achieved by many organizations using several structurally modified absorber surfaces such as cones, hemispheres and wedges. filler materials included carbon, graphite, iron oxide, powdered iron, aluminum and copper, steel wool, metal wires, etc., which were loaded into plaster of paris, various plastics and ceramics to be used as free-space absorbers.

3. traditional absorbers broad banding aspects

in the early 1950s, the commercial “hair” broadband absorber was manufactured by emerson in the us by drenching animal hair in carbon black. the absorbers were 2 inches, 4 inches and 8 inches thick, attaining a reflection coefficient of around -20 db for normal incidence over the frequency range of 2400-10000 mhz, 1000 mhz and 500 mhz, respectively. buckley at emerson & cuming, inc. redesigned the hair absorbers to show an improved performance of -40 db reflection coefficient when the front surface is convoluted. a schematic of the first dallenbach layer magnetic absorber, shown in figure 2, was patented using ferrite materials [12].
in the course of this period, meyer, a german scientist, presented a few innovative concepts associated with microwave absorption such as resistance-loaded loops and dipoles, slotted resistive foils, strips of magnetic and resistive materials with different inclinations, magnetic loading, etc. this is how research into frequency selective surfaces (fss) came into being.

fig. 2 a schematic of single layer dallenbach absorber

magnetic materials as possible absorber fillers were investigated continuously during the 1960’s and 70’s [13]. in the late 1960’s, suetake in his patent described a broadband absorber structure of thickness 12.3 inches comprising a graphite-made zig-zag wall inclined at the front of a ferrite plate, with a reflection coefficient of less than 0.1 in the frequency range of 0.1 to 1 ghz [14]. also, absorption was controlled by coating several structured absorbers such as foams, netlike or honeycomb structures with a paint-like material containing carbon particles or fibres, or alloys of different metals such as nickel chromium alloy, etc. [15]. another type of absorber, employing as the absorbent a plasma which could be generated by a radioactive substance requiring about 10 curies/cm2, was studied by m. e. nahmias in his patent, which was a conjecture at that time [16]. evolution in the materials used as inclusions and in the design process were the key factors of absorber development in the 1980’s. jaumann absorbers were modified from the design point of view by using graded layers so as to achieve high absorption bandwidth. theoretical design of absorbers rose in this era, using computational models such as transmission line models and the floquet theorem to calculate reflectivity from material properties and to study periodic structures, respectively.
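as an illustration of how a transmission line model gives reflectivity from material properties, the following python sketch evaluates the standard normal-incidence expression for a single homogeneous layer backed by a perfect conductor (a dallenbach-type layer), together with the quarter-wave thickness λg/4 with λg = λ0/(|εr||μr|)^(1/2) from the introduction. it is a sketch under stated assumptions, not a design tool from any of the cited works: the material constants are hypothetical and taken as frequency independent, and an e^(jωt) convention is assumed, so that losses appear as negative imaginary parts.

```python
import numpy as np

C = 299_792_458.0   # speed of light, m/s


def reflection_loss_db(freq_hz, eps_r, mu_r, thickness_m):
    """Reflection loss (dB) of a metal-backed single layer at normal incidence."""
    n = np.sqrt(complex(eps_r) * complex(mu_r))
    z_norm = np.sqrt(complex(mu_r) / complex(eps_r))
    # input impedance of the shorted line, normalized to free space
    z_in = z_norm * np.tanh(1j * 2 * np.pi * freq_hz * thickness_m * n / C)
    gamma = (z_in - 1) / (z_in + 1)
    return 20 * np.log10(np.abs(gamma))


def quarter_wave_thickness(freq_hz, eps_r, mu_r):
    """Thickness equal to a quarter of the guided wavelength in the layer."""
    lambda_g = (C / freq_hz) / np.sqrt(abs(eps_r) * abs(mu_r))
    return lambda_g / 4


# hypothetical magneto-dielectric composite, constants assumed frequency independent
eps_r = 12 - 4j
mu_r = 1.5 - 1j
thickness = quarter_wave_thickness(10e9, eps_r, mu_r)

freqs = np.linspace(2e9, 18e9, 161)
rl = reflection_loss_db(freqs, eps_r, mu_r, thickness)
best = freqs[np.argmin(rl)]
print(f"thickness = {thickness * 1e3:.2f} mm, deepest dip near {best / 1e9:.1f} GHz")

# a lossless layer on metal reflects everything, i.e. about 0 dB
lossless = reflection_loss_db(10e9, 4.0, 1.0, 5e-3)
```

measured, frequency-dependent permittivity and permeability data would replace the constants in practice; the same function can then be swept over thickness to look for the best match.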
dielectric materials were improved by including fillers such as rods, wires and discs, which exhibited promising results, and conducting polymers, such as low-density polyacetylene, were also studied as possible candidates for absorption [15, 17]. in 1988, experiments on rcs reduction conducted by the us department of defense validated the use of a class of schiff base salts dissolved in aircraft structural materials. this substance, which was much lighter than ferrites, had been used as radar absorbing paint for stealth aircraft [18]. chiroshield, a thin salisbury shield made of chiral materials, was introduced in 1989, offering increased absorption at low thickness over a wide range of frequencies compared to the conventional ones. either chirality could be incorporated into the prevailing materials or new chiral composites could be fabricated. within a chiral medium there is mutual coupling and induction of electric and magnetic fields, and the losses in permittivity imitate losses in permeability and vice versa [19]. optimization techniques such as the genetic algorithm were used to optimize the structures of jaumann absorbers, along with deeper research into circuit analog absorbers and fss, and this work continued into the 1990s. absorber composites made of different fibres or netlike structures coated with conducting polymers were also on the rise. tunable resonant absorbers made of conducting polymers were also investigated by varying the resistive and capacitive elements in the absorber, until carbon nanotubes (cnts), discovered by iijima, came to light in 1991 [15, 20]. cnts thus laid the foundation for a new type of radar absorbing material consisting of nanoparticles. until the discovery of cnts, carbon fillers such as carbon black, graphite, expanded graphite, etc. continued to be used as radar absorbing materials.
single-walled and multi-walled cnts have been extensively utilized, showing wide microwave absorption. high absorption could be achieved with a low weight percentage of cnts because their high aspect ratio (= length/diameter) helps in attaining a low percolation threshold at very low loading [21]. in addition to cnts, their 3d structures such as graphene nanosheets, graphene oxide and reduced graphene oxide (rgo) have also been prepared by different chemical methods for radar absorber applications [22]. being lightweight, flexible and corrosion resistant makes graphene an attractive component of em wave absorbing materials, like other carbon-based materials, which possess the advantages of low density, high thermal stability and high chemical stability [23, 24]. unfortunately, the direct application of graphene in em absorption is restricted by its high εr value, which causes impedance mismatch [25, 26]. efforts have been made to improve matching by mixing graphene with different magnetic fillers [27] and by using modified graphene (rgo). but this also resulted in other downsides, such as aggregation, restacking and the need for high filler content, which again hamper practical applicability. then the concept of ‘plainification’ appeared, in which, instead of adding more filler, an interface type of structure is included to achieve superior properties [28–30]. this requires fine adjustment of the structure, and the process remains challenging. a recent study on the development of em shielding materials based on plant-based cellulose nanofibres has shown the path towards using environmentally friendly materials.
the material is a lightweight, conductive and porous cellulose-polyaniline aerogel with a thickness of 5 mm, which shows 95% absorption in the x-band and real-time heat dissipation demonstrated on a mobile phone, with great prospects for applications in portable electronics [31]. lately, studies related to wear-on-body microwave communication have been introduced, in which textiles are coated with em shielding materials so as to prevent any adverse effect of electronic devices on our health. coating layers of composites, in which conductive polymer mixtures incorporating metallic nanoparticles, nanowires or carbon-based nanostructures are used along with conventional textiles [32, 33], provide a shielding effectiveness of more than 20 db over the x-band at a thickness of 2.236 mm. nevertheless, including such coatings on textiles still remains a challenge due to conformability and washability issues. an interesting approach towards tunable absorbers was experimentally studied by estevez et al., in which two different hybrid fillers (cnt/aw and rgo/aw nanowires) were bound by silicone resin and studied in the x-band. the polarization loss originating from the interfacial polarization relaxations at the cnt-resin and cnt-wire interfaces leads to higher dielectric losses. the domain wall motion due to the wires leads to ferromagnetic resonance and contributes to the magnetic losses. tuning of the absorber is thus controlled by the amount of cnt coating, which guides the interfacial interactions. a high reflection loss of -35 db is obtained for an rgo coating thickness of 2.7 mm at 11.3 ghz [34]. tunable em wave absorption and shielding was achieved at a thickness of 1.65 mm by growing variable-length cnts with embedded cobalt nanoparticles on natural cotton using the cvd method. the highly elastic and easily compressible absorbers are lightweight, show an absorption intensity of -43 db and also shield 99% of the incident wave over a bandwidth of ~5 ghz in the frequency range from 2-18 ghz.
cnts with shorter length and lower conductivity are favorable for microwave absorption, whereas with longer cnts the conductivity increases, which enhances the shielding effectiveness [35]. 4. metamaterial absorbers the origin in the quest for a perfect absorber, the use of metamaterials provides an encouraging solution to the problem of electromagnetic interference. metamaterials are usually structured geometrically as a periodic arrangement of unit cells (metallic or dielectric elements) demonstrating wave characteristics that do not exist in nature, and are thus often described as an artificially engineered homogeneous medium [2]. depending on the size and shape (geometry) of the unit cells, electromagnetic properties such as permittivity (ϵ) and permeability (μ) can be altered over a wide range, including negative values. the development of thin metamaterial absorbers that are conformable and easy to fabricate, with high absorption over a wide bandwidth, is still in progress. a few pioneering works in this field are discussed here, starting with the origin of the concept. it was in the years 1996 and 1999 that john pendry, along with his group, first experimentally realized the concepts of negative permittivity and negative permeability, respectively. absorption in metamaterial-based absorbers is of a resonant type, and the resonant frequency is governed by the inductance and capacitance arising from the dimensions of the unit cells of the structure. the first metamaterial absorber was based on the split-ring resonator (srr). an array of srrs was placed periodically in the x-z plane on a resistive sheet of 1 mm thickness providing a resistance of 377 Ω, like a salisbury screen, with the minimum s11 being observed at around 2 ghz [36].
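the dependence of the resonance on the unit-cell inductance and capacitance can be sketched with the ideal lumped lc formula f0 = 1/(2π√(lc)). this is only a first-order picture; the effective l and c values below are hypothetical and are not taken from any of the cited designs.

```python
import math


def lc_resonance_hz(inductance_h: float, capacitance_f: float) -> float:
    """Resonant frequency f0 = 1 / (2*pi*sqrt(L*C)) of an ideal lumped LC circuit."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


# Hypothetical effective values for one unit cell (assumptions for illustration).
L_EFF = 1e-9   # 1 nH
C_EFF = 1e-12  # 1 pF

f0 = lc_resonance_hz(L_EFF, C_EFF)
print(f"f0 = {f0 / 1e9:.2f} GHz")

# If a geometry change quadruples the effective capacitance, f0 is halved.
f0_larger_c = lc_resonance_hz(L_EFF, 4 * C_EFF)
print(f"f0 with 4*C = {f0_larger_c / 1e9:.2f} GHz")
```

the sketch makes the scaling explicit: since f0 varies as 1/√(lc), larger unit cells (larger l and c) shift the absorption peak to lower frequencies.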
then, the idea of absorbers based on the electric ring resonator (err), also known as the electric field driven lc resonator (elc), was presented, in which the incident e-field causes a current to flow at the top surface and energy is stored within the metallic patches, producing inductance and capacitance. here, fr-4 substrates are used as the dielectric material, on top of which unit cells are patterned as shown in figure 3. an absorption peak of 96% was observed at around 11.65 ghz [37]. since these absorbers had a narrow absorption bandwidth, a 3-d microwave absorber combining the elc and srr structures was then developed for broadband absorption, which exhibited a peak absorption of 99% at 2.4 ghz [38]. the relatively thin λ/5 thickness of the elc-srr structure in the propagation direction makes it more beneficial when compared to the typical λ/2 or greater thickness of traditional foam pyramidal absorbers. also, lumped circuit elements could be added to this structure to introduce tunability. fig. 3 a schematic of the unit cell in electric ring resonator (err). 4.1. multiple banding in 2010, a metamaterial absorber with a triple-layered unit cell structure was developed by li et al., and a dual-band resonant behavior was observed at 11.15 ghz and 16.01 ghz with maximum absorption of 97% and 99%, respectively [39]. the structure incorporates a cross-shaped resonator (csr) and a complementary cross-shaped resonator (ccsr) in one unit cell to make it compact. it also displays improved impedance matching to free space due to the mutual coupling effect between the two resonators. the ohmic loss and the dielectric loss account for the absorption, since there exists an electrical resonance which leads to ohmic loss. in 2013, bhattacharyya et al. obtained a triple-band polarization-independent absorption by using different combinations and sizes of square-shaped closed ring resonators [40].
the surface current distribution around the square rings controls the overall permeability of the structure, thus leading to absorption at different frequencies. the absorber exhibited a triple-band absorption response with one band lying in the x-band and two in the c-band. likewise, arrangements of concentric squares and circular rings were used to explore polarization insensitivity in triple-band metamaterial absorbers [41, 42]. to improve the absorption bandwidth, a metamaterial absorber based on sectional asymmetric structures was realized using cst studio suite by gong et al. [43], which had a thickness of 1.9 μm and was composed of au and si3n4. due to the resonant behavior of the metamaterial, these absorbers suffer from a narrow absorption bandwidth, limiting their usage in applications. for broadband absorption, multilayering is one of the techniques that is also used in metamaterial absorbers. lee and lee implemented multiresonance structures of different geometric dimensions in a single unit cell to widen the working bandwidth. the structure is 0.8 mm thick and demonstrates a maximum absorption of 93% at 10 ghz with a bandwidth of 970 mhz [44]. when unit cells of varying geometric dimensions with a small difference in scaling factor between cells are arrayed periodically, their resonant absorption peaks overlap, thus increasing the bandwidth. if the scaling factor between the cells increases, the response splits into distinct resonant peaks. dual- and triple-band metamaterial absorbers with wideband absorption were developed by kollatou et al. by utilizing the scalability property of metamaterials [45]. special arrangements of donut-based resonators, as shown in figure 4, were also implemented in order to achieve multiband absorption. multiple absorption peaks of 97%, 97%, 98% and 98% were observed at 6.5 ghz, 7.4 ghz, 9.2 ghz and 11.0 ghz, respectively [46]. 4.2.
broadbanding another technique for widening the absorption bandwidth was attempted by gu et al., in which hexagonal metal dendritic units of different sizes are closely placed to combine the absorption peaks of each unit in an isotropic ma. absorption greater than 80% is observed for normal incidence and for oblique incidence of less than 45° in the frequency range from 9.05 ghz to 11.4 ghz [47]. as the unit cell of a symmetrical structure can resonate identically for different polarizations, there were several investigations on polarization-insensitive absorbers using highly symmetrical structures, such as a rotational structure [48], a four-fold symmetrical structure [49], or higher order symmetrical structures [50]. then, using the high absorption of magnetic materials, a two-layered hybrid absorber was implemented by li et al., in which a non-planar metamaterial was integrated with magnetic absorbing materials to obtain 90% absorption over the range from 2 to 18 ghz [51]. the top layer is an arrangement of aluminium unit cells stacked on a metal-backed magnetic layer, which is composed of carbonyl iron flake powder infused in epoxy with a weight ratio of 2.65:1. although the structure has the advantage of inheriting the characteristics of both the magnetic absorber and the non-planar metamaterial absorber, for absorption at lower frequencies and broadband absorption respectively, it is quite complex, which might cause several fabrication errors. following this, yin et al. developed a less complex, polarization-independent and thin broadband metamaterial absorber by using two tapered hyperbolic metamaterial waveguide arrays of different dimensions, which has a 90% absorption bandwidth from 2.3 to 40 ghz [52]. the absorption bandwidth is enhanced by appropriate selection of the geometrical boundaries of each hyperbolic metamaterial waveguide connected in some pattern.
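the absorption percentages quoted in this section are conventionally derived from measured or simulated s-parameters. as a minimal sketch, assuming the usual power balance a = 1 - |s11|² - |s21|² for a passive structure (with s21 = 0 when the absorber is metal-backed), the relation between a reflection level in db and the absorbed fraction can be computed as follows; the function names are illustrative only.

```python
def s_db_to_magnitude(level_db: float) -> float:
    """Convert an S-parameter level in dB (20*log10|S|) to a linear magnitude."""
    return 10 ** (level_db / 20)


def absorption(s11: complex, s21: complex = 0.0) -> float:
    """Absorbed power fraction A = 1 - |S11|^2 - |S21|^2 (S21 = 0 if metal-backed)."""
    return 1.0 - abs(s11) ** 2 - abs(s21) ** 2


# A reflection level of -10 dB on a metal-backed absorber means 90% absorption.
for level_db in (-10.0, -20.0):
    a = absorption(s_db_to_magnitude(level_db))
    print(f"S11 = {level_db:6.1f} dB -> absorption = {a:.2%}")
```

under this convention, a "90% absorption bandwidth" and a "-10 db bandwidth" describe the same frequency span for a metal-backed absorber.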
a wideband double layer circuit analog absorber, whose upper layer contains an arrangement of resistor-loaded square loops printed on dielectric substrates, was realized by ghosh et al. the layers are separated by an air spacer, adding up to a total thickness of 4.6 mm. the bottom layer helps in increasing the total bandwidth and produces a new resonance when the wave is incident normally. the absorber exhibits 90% absorption from 5.10 to 18.08 ghz. however, fabrication of such absorbers is difficult [53]. fig. 4 schematic of a donut-shaped resonator. a wideband switchable metamaterial absorber was investigated by kim et al., in which lumped elements and microfluidic channels with a liquid metal alloy are combined in order to reduce rcs in the x-band and c-band [54]. the metamaterial absorber incorporated chip resistors and a modified jerusalem cross resonator (jcr), which was adjusted by loading slotted circular rings into the whole structure. the jcr consists of slotted circular rings, resistors and microfluidic channels. the absorber was fabricated on a flexible substrate, and the microfluidic channels are imprinted on a polydimethylsiloxane (pdms) material. an absorption rate of 90% was observed covering almost the whole x-band, from 7.43 to 14.34 ghz, and the c-band, from 5.62 to 7.3 ghz, with empty channels and liquid metal-filled channels, respectively. water has been used in designing microwave absorbers because of its frequency dispersive nature at microwave frequencies and because it is abundantly available all over the world [55-57]. following this, yoo et al. designed a series of metamaterial absorbers on four different substrates, viz., fr-4, pet, paper and glass, in the frequency range from 8-18 ghz, using a periodic arrangement of water droplets which act as resonators [58].
each droplet is placed on the top layer of the structure with a suitable height and diameter, which control the absorption and bandwidth, for an overall absorber thickness of 2.36 mm. an absorption rate of 93% on fr-4, pet, paper and glass substrates was observed in the frequency ranges of 8.3–12.07 ghz, 11.23–12.36 ghz, 9.2–16.5 ghz and 12.05–12.65 ghz, respectively. in another study, pang et al. used a water-based metamaterial absorber of 3.5 mm thickness incorporating water as a dielectric substrate [59]. the hybrid substrate, being a combination of water and a low-permittivity material, allows a leak-proof structure which can be easily fabricated, presenting a 90% wideband absorption from 6.2 to 19 ghz. in another work, an ultra-thin three-dimensional water-substrate array, based on dielectric reservoirs filled with distilled water and organized periodically on a metal-backed metamaterial absorber, was used. a triangular metallic fishbone structure was also incorporated periodically between the water substrate and the dielectric reservoir, attaining an ultra-broadband absorption in the frequency range from 2.6 to 16.8 ghz [60]. recently, in 2019, a low-cost flexible water-based metamaterial using 3d printing technology was proposed, which offered 90% absorption over the broad frequency range of 5.9-25.6 ghz. the overall thickness of the structure was 4 mm, where distilled water was selected as the absorbing material and thermoplastic urethane was used to hold the water. the absorber is insensitive to polarization and shows good microwave absorption performance over a wide range of incidence angles [61]. another ultra-broadband metamaterial absorber was presented in which tetramethylurea was added to water in order to alter the dielectric properties, and this solution was used for absorption. the four-layered structure achieved an absorption of 90% covering the frequency range of 4.3 to 40 ghz [62].
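the frequency-dispersive behaviour of water that these designs exploit is often approximated by a single-pole debye relaxation, ε(f) = ε∞ + (εs - ε∞)/(1 + j2πfτ). the sketch below evaluates this model across the microwave range. the parameter values are approximate room-temperature figures assumed here for illustration; they are not taken from the cited absorber papers, and the detailed temperature dependence is given in [55].

```python
import math

# Approximate room-temperature Debye parameters for water (assumed values).
EPS_STATIC = 78.4   # static relative permittivity
EPS_INF = 5.2       # high-frequency relative permittivity
TAU_S = 8.3e-12     # relaxation time in seconds


def water_permittivity(freq_hz: float) -> complex:
    """Single-pole Debye model; returns eps' - j*eps'' (exp(+j*omega*t) convention)."""
    omega = 2.0 * math.pi * freq_hz
    return EPS_INF + (EPS_STATIC - EPS_INF) / (1.0 + 1j * omega * TAU_S)


for f_ghz in (2.0, 10.0, 18.0):
    eps = water_permittivity(f_ghz * 1e9)
    print(f"{f_ghz:5.1f} GHz: eps' = {eps.real:6.2f}, eps'' = {-eps.imag:6.2f}")
```

in this model the real part falls and the loss term stays large across the x- and ku-bands, which is consistent with water acting as a lossy, broadband dielectric in the absorbers described above.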
an ultra-broadband polarization-insensitive metamaterial absorber was developed by munaga et al., which presented a 10-db absorption bandwidth in the c-band (3.78–8.28 ghz) at normal incidence and for incident angles of less than 45° [63]. further investigation of broadband absorbers was done by hoa et al., who reported a polarization-insensitive absorber incorporating a rotationally symmetrical multilayer structure [64]. the absorber was based on a periodic arrangement of metallic/dielectric conical frustums, which show 90% absorption at large angles of incidence up to 60°. another four-layered, 4.2 mm thick, ultra-wideband ionic liquid based metamaterial absorber was designed using 3d printing technology. [emim] [n(cn)2] was chosen due to its highly lossy nature and was injected into a periodic, 3d-printed array of photopolymer cylinders. an absorption rate of 90% was reported in the frequency range of 9.26–49 ghz, along with good absorption performance for oblique incidence of 45°. using a low dielectric constant photopolymer material as the top layer improved the impedance matching [65]. a switchable c-band polarization-insensitive absorber composed of a periodic arrangement of square loops along with pin diodes, to provide switching between single-band and multiband absorption, was reported by ghosh et al. [66]. to provide the bias voltage to all the switches, a biasing network has been implemented without disturbing the resonance of the structure. the 4-axial symmetrical design of the structure provides polarization insensitivity for all angles. under normal incidence, the broadband switchable structure exhibits in the off state a 10 db absorption bandwidth of 4.66 ghz (3.56–8.16 ghz), whereas in the on state good reflection is observed over the whole frequency range.
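the benefit of a low dielectric constant top layer for impedance matching, noted above, can be illustrated with the textbook normal-incidence reflection coefficient at the boundary between air and a half-space of relative permittivity εr and permeability μr, Γ = (z - 1)/(z + 1) with z = √(μr/εr). this is a simplified sketch that ignores finite thickness and the metal backing; the material values are hypothetical.

```python
import cmath


def interface_reflection(eps_r: complex, mu_r: complex = 1.0) -> complex:
    """Normal-incidence reflection coefficient at an air / half-space boundary."""
    z_norm = cmath.sqrt(mu_r / eps_r)  # wave impedance relative to free space
    return (z_norm - 1.0) / (z_norm + 1.0)


# Hypothetical non-magnetic materials: the front-face reflection grows with eps_r.
for eps_r in (2.0, 10.0, 50.0):
    gamma = interface_reflection(eps_r)
    print(f"eps_r = {eps_r:5.1f}: reflected power fraction = {abs(gamma) ** 2:.3f}")

# When eps_r equals mu_r the impedance matches free space and nothing is reflected.
print(abs(interface_reflection(3.0, 3.0)))
```

the same relation explains why a high-εr filler such as graphene reflects strongly at its front face unless its permittivity is lowered or balanced by a magnetic component.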
a dual-band metamaterial absorber structure consisting of two circular rings, showing absorption at oblique angles larger than 60°, was designed and studied by ayop et al. [67]. in 2016, another symmetrical dual-band absorber structure, consisting of a rectangular ring, a cross and a slotted cross, was realized by the same team with an angle of incidence of more than 77° for the x-band [68]. 4.3. conformability the development of conformable absorbers is the need of the hour, so that absorbers can be easily mounted on any surface. a flexible metamaterial absorber using printing technology was presented for cylindrical surfaces [69]. the unit cell of the absorber structure is based on a jcr, which is printed on a flexible polyethylene-terephthalate (pet) polymer using silver nanoparticle ink. the structure, on a 0.62 mm thick substrate, shows 95% absorption at 9.21 ghz on a flat surface as well as on a cylindrical surface with a diameter of 9.12 cm, for all polarizations at oblique incidence angles of less than 30°. a few other flexible metamaterial absorbers were realized by many groups, such as a polarization-insensitive, 1.19 mm thick absorber designed on a flexible paper substrate by yoo et al. using an inkjet printing technique [70]. the inkjet-printed metamaterial absorber is fabricated on a paper substrate by applying silver nanoparticle ink using an inkjet printer. it offers 95% absorption at 9.09 ghz for all polarizations at oblique incidence angles up to 30°. then, huang et al. observed 90% absorption in both the x and ku bands when conductive graphene nano-flake ink is used to print an fss on top of a flexible silicon dielectric material through a stencil printing method [71]. the 2 mm thick structure enables conformable bending and provides a fractional bandwidth of 62% with an exceptional reduction in rcs.
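fractional bandwidth figures such as the one above are commonly defined as the absorption bandwidth divided by the centre frequency, fbw = 2(fh - fl)/(fh + fl). the sketch below assumes this definition (the cited works may use a slightly different one) and applies it, as an example, to the 3.56–8.16 ghz band quoted earlier for the switchable absorber of [66].

```python
def fractional_bandwidth(f_low: float, f_high: float) -> float:
    """Fractional bandwidth 2*(f_high - f_low)/(f_high + f_low); any consistent unit."""
    if not 0 < f_low < f_high:
        raise ValueError("expected 0 < f_low < f_high")
    return 2.0 * (f_high - f_low) / (f_high + f_low)


# Band edges in GHz taken from the text for the off state of the absorber in [66].
fbw = fractional_bandwidth(3.56, 8.16)
print(f"fractional bandwidth = {fbw:.1%}")
```

with these band edges the fractional bandwidth is about 78.5%, which allows absorbers operating in different bands to be compared on a common footing.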
using a screen printing technique, another notable flexible metamaterial absorber for wearable devices was designed on an ordinary textile using conductive silver [72]. the top of the unit cell of the structure was designed in the form of a channel logo and was backed by copper tape. the 1.2 mm thick absorber was simple to design and presented the opportunity of integrating metamaterial absorbers with wearable technology. the absorber showed good absorption at 10.8 ghz when the wave is normally incident. an interesting wideband textile-based metamaterial absorber using the same technique was presented by singh et al. as a wearable microwave absorber offering more than 90% absorption from 7.39 to 18 ghz. in the 3-layered structure, the top layer is a printed cloth of various kinds (fr4, plain weave cotton cloth and twill weave cotton cloth), which is separated from the ground plane by a flexible dielectric foam. the fabricated absorber was treated with polydimethylsiloxane (pdms) to make it hydrophobic [73]. an x-band lightweight metamaterial absorber using an agnw resistive film was described by lee et al. [74]. the structure, which is 7.5 mm thick and consists of a cross-shaped resistive agnw film on top of a styrofoam dielectric material backed by a conductor, shows a 90% absorption bandwidth from 6 to 14 ghz for all polarizations. a graphite-based metamaterial absorber was also designed, in which graphite rather than copper was used to construct the surface pattern, as graphite has a low electric conductivity, high corrosion resistance, low density and high skin depth. a graphite square ring is placed on a layer of fr4 with an aluminium back, offering an absorption bandwidth from 11.36 to 18 ghz [75].
switchable metamaterial absorbers based on the split ring resonator (srr) were fabricated by 3d printing technology, realizing single-band and dual-band switching as well as simultaneous absorption in three bands (4.5 ghz, 6 ghz and 8.8 ghz), for controllable absorption and selective filtering by rotating the units. the main body of a single srr unit is composed of polylactic acid (pla), and the hollow interior of the unit is filled with liquid metal to observe the regulation of absorption at the incident angle of 240° [76]. an ultra-broadband, lightweight, magnetic metamaterial absorber consisting of a periodically arranged, subwavelength-scaled stepped structure was designed and presented, offering absorption from 1.23 to 19 ghz up to an incident angle of about 45°. each unit-cell structure is made of a mixture of carbonyl iron powder and resin and is composed of four stacked cuboids of equal length and width. the magnetic loss of the magnetic material, the multi-resonances and the edge diffraction effects of the stepped structures at different frequencies contribute to a broad absorption band [77]. to date, there have been a number of investigations on microwave absorbers with varying absorption levels, bandwidths and polarization independence over a wide range of incidence angles, with various thicknesses and other parameters in different frequency ranges [78-88]. a comparison of the absorption, bandwidth, thickness, etc., of a few recently reported wideband microwave absorbers is listed in table 1.
table 1
sl. no. | type of absorber | maximum reflection loss (db) | frequency range (ghz) | -10 db bandwidth (ghz) | thickness (mm) | year | ref
1 | dielectric | -53.9 | 2-18 | 4.56 | 3.5 | 2019 | [25]
2 | dielectric | -62.25 | 8-18 | 6.64 | 2.7 | 2019 | [26]
3 | dielectric | -32.0 | 8-12 | 4 | 5.0 | 2020 | [31]
4 | magneto dielectric | -35.0 | 8-12 | 3.2 | 2.7 | 2018 | [34]
5 | dielectric | -43.0 | 2-18 | 5.08 | 1.6 | 2019 | [35]
6 | metamaterial | - | 6-12 | multiple bandwidth | 1.2 | 2013 | [45]
7 | metamaterial | -10 | 4-12 | multiple bandwidth | 3.1 | 2016 | [53]
8 | metamaterial | -19.1 | 3.56-8.16 | ~5 | 1.2035 | 2016 | [65]
9 | dielectric | -21 | 6.4-15 | ~8.5 | 1.0 | 2020 | [89]
10 | magneto dielectric/hybrid | -20 | 8-12 | ~6 | 1.0 | 2019 | [90]
11 | metamaterial | -16.42 | 4-8.12 | ~4.12 | 5.0 | 2015 | [63]
12 | metamaterial | -20 | 4-10 | multiple bandwidth | 1.035 | 2014 | [91]
5. conclusion in order to design a microwave absorber, many challenging aspects, such as good absorption over a wide bandwidth, polarization sensitivity over a wide range of incidence angles, low thickness, conformability, etc., have to be considered. some of these aspects, together with the historical achievements on various conventional and metamaterial absorbers, are discussed here. different sets of materials, such as conductive and non-conductive polymers and magnetic and non-magnetic nanomaterials, along with techniques to maximize the absorption bandwidth, are considered. symmetrical structures using srrs, fss, varactor diodes, pin diodes, lumped elements, fractal structures, multilayering, etc. are some of the approaches recently used to address polarization sensitivity and incidence angle. the use of magnetic, thermoplastic and water-based substrates is also presented, as they not only maximize absorption efficiency but some are also easily moldable into thin sheets and mountable on any surface. in addition to the benefits and limitations, several critical aspects encountered in designing a near-perfect microwave absorber are analyzed in order to give an overview of the current scenario.
there are numerous possible applications of microwave absorbers in various civilian and defense sectors. the pursuit of an ultra-thin, compact microwave absorber with broadband behavior, and of a thin, economical absorber with enhanced features for more practical airborne applications, is of great interest and remains quite challenging. references [1] x. c. tong, advanced materials and design for electromagnetic interference shielding. london: taylor and francis, 2009. [2] v. g. veselago, "the electrodynamics of substances with simultaneously negative values of ɛ and μ", soviet physics: uspekhi, vol. 10, pp. 509-514, 1968. [3] w. h. emerson, "electromagnetic wave absorbers and anechoic chambers through the years", ieee trans. antennas propag., vol. 21, no. 4, pp. 484-490, july 1973. [4] o. halpern, "method and means for minimizing reflection of high-frequency radio waves", us patent 2923934, 1960. [5] o. halpern, m. h. j. johnson and r. w. wright, "isotropic absorbing layers", us patent 2951247, 1960. [6] j. l. snoek, "dispersion and absorption in magnetic ferrites at frequencies above one mc/s", physica, vol. 14, pp. 207-217, may 1948. [7] w. w. salisbury, "absorbent body for electromagnetic waves", us patent 2599944, 1952. [8] j. w. tiley, "radio wave absorption device", us patent 2464006, 1949. [9] h. a. tanner, "fibrous microwave absorber", us patent 2977591, 1961. [10] e. b. mcmillan, "microwave radiation absorbers", us patent 2822539, 1958. [11] o. halpern, m. h. j. johnson and r. w. wright, "isotropic absorbing layers", us patent 2951247, 1960. [12] w. dallenbach and w. kleinsteuber, "reflection and absorption of decimeter-waves by plane dielectric layers", hochfreq. u elektroak, vol. 51, pp. 152-156, 1938. [13] l. wesch, "resonance absorber for electromagnetic waves", us patent 3526896, 1970. [14] k. suetake, "super wide band wave absorber", us patent 3623099, 1971. [15] p. saville, review of radar absorbing materials.
technical memorandum drdc atlantic tm 2005-003, 2005. [16] m. e. nahmias, "method and means for reducing reflections of electromagnetic waves", us patent 4030098, 1977. [17] a. feldblum, et al., "microwave properties of low-density polyacetylene", j. polym. sci.: polym. phys. ed., vol. 19, no. 1, pp. 173-179, jan. 1981. [18] k. j. vinoy and r. m. jha, "trends in radar absorbing materials technology", sadhana, vol. 20, pp. 815-850, oct. 1995. [19] d. l. jaggard, n. engheta and j. c. liu, "chiroshield: a salisbury/dallenbach shield alternative", electron. letters, vol. 26, pp. 1332-1334, aug. 1991. [20] m. f. lin and d. s. chuu, "low-frequency plasmons in metallic carbon nanotubes", phys. rev. b, vol. 56, pp. 1430-1439, july 1997. [21] r. ramasubramaniam, et al., "homogeneous carbon nanotube/polymer composites for electrical applications", appl. phys. lett., vol. 83, pp. 2928-2930, sept. 2003. [22] c. wang, et al., "overview of carbon nanostructures and nanocomposites for electromagnetic wave shielding", carbon, vol. 140, pp. 696-733, dec. 2018. [23] t. chen, et al., "hexagonal and cubic ni nanocrystals grown on graphene: phase-controlled synthesis, characterization and their enhanced microwave absorption properties", j. mater. chem., vol. 22, pp. 15190, aug. 2012. [24] d. chuai, et al., "enhanced microwave absorption properties of flake-shaped fepcb metallic glass/graphene composites", compos. part a: appl. sci. manuf., vol. 89, pp. 33-39, oct. 2016. [25] p. b. liu, et al., "synthesis of lightweight n-doped graphene foams with open reticular structure for high-efficiency electromagnetic wave absorption", chem. eng. j., vol. 368, pp. 285–298, july 2019. [26] s. r. lu, et al., "permittivity-regulating strategy enabling superior electromagnetic wave absorption of lithium aluminum silicate/rgo nanocomposites", acs appl. mater. interfaces, vol. 11, pp. 18626–18636, april 2019. [27] x. y.
lv, et al., "investigation on the enhanced electromagnetism of ni/rgo nanocomposites synthesized by an in situ process", mater. lett., vol. 201, pp. 43–45, aug. 2017. [28] y. p. shi, et al., "achieving excellent metallic magnet-based absorbents by regulating the eddy current effect", j. appl. phys., vol. 126, pp. 105109, sept. 2019. [29] y. h. li, et al., "vertical interphase enabled tunable microwave dielectric response in carbon nanocomposites", carbon, vol. 153, pp. 447–457, nov. 2019. [30] x. y. li and k. lu, "improving sustainability with simpler alloys", science, vol. 364, no. 6442, pp. 733–734, may 2019. [31] a. r. pai, et al., "ultra-fast heat dissipating aerogels derived from polyaniline anchored cellulose nanofibers as sustainable microwave absorbers", carbohydr. polym., vol. 246, pp. 116663, oct. 2020. [32] j-s. roh, et al., "electromagnetic shielding effectiveness of multifunctional metal composite fabrics", text. res. j., vol. 78, pp. 825–835, sept. 2008. [33] k. fu, et al., "conductive textiles", in engineering of high-performance textiles, m. miao, j. h. xin, eds. woodhead publishing, 2018, pp. 305–334. [34] d. estevez, et al., "complementary design of nano-carbon/magnetic microwire hybrid fibers for tunable microwave absorption", carbon, vol. 132, pp. 486–494, june 2018. [35] y. cheng, et al., "lightweight and flexible cotton aerogel composites for electromagnetic absorption and shielding applications", adv. electron. mater., vol. 6, pp. 1900796, nov. 2019. [36] f. bilotti, et al., "an srr-based microwave absorber", microw. opt. technol. lett., vol. 48, pp. 2171-2175, aug. 2006. [37] n. i. landy, et al., "perfect metamaterial absorber", phys. rev. lett., vol. 100, pp. 207402, may 2008. [38] s. gu, et al., "a broadband low-reflection metamaterial absorber", j. appl. phys., vol. 108, pp. 064913, sept. 2010. [39] m. li, et al., "perfect metamaterial absorber with dual bands", prog. electromagn. res., vol. 108, pp. 37–49, sept. 2010. [40] s.
bhattacharyya, et al., "triple band polarization-independent metamaterial absorber with bandwidth enhancement at x-band", j. appl. phys., vol. 114, pp. 094514, sept. 2013. [41] b. bian, et al., "novel triple-band polarization-insensitive wide-angle ultra-thin microwave metamaterial absorber", j. appl. phys., vol. 114, 194511, nov. 2013. [42] o. b. ayop, et al., "triple band circular ring-shaped metamaterial absorber for x-band applications", prog. electromagn. res. m, vol. 39, pp. 65–75, oct. 2014. [43] c. gong, et al., "broadband terahertz metamaterial absorber based on sectional asymmetric structures", sci rep., vol. 6, p. 32466, aug. 2016. [44] h. m. lee and h. s. lee "a method for extending the bandwidth of metamaterial absorber". int. j. antennas propag., vol. 2012, pp. 1-7, nov. 2012. [45] t. m. kollatou, et al., "a family of ultra-thin, polarization-insensitive, multi-band, highly absorbing metamaterial structures", prog. electromagn. res., vol. 136, pp. 579–594, jan. 2013. [46] j. w. park, et al., "multi-band metamaterial absorber based on the arrangement of donut-type resonators", opt. express, vol. 21, no. 8, pp. 9691–9702, april 2013. [47] s. gu, et al., "planar isotropic broadband metamaterial absorber", j. appl. phys., vol. 114, pp. 163702, oct. 2013. [48] f. c. seman and r. cahill, "performance enhancement of salisbury screen absorber using resistively loaded spiral fss", microw. opt. technol. lett., vol. 53, pp. 1538–1541, april 2011. [49] j.
zhao, et al., "a tunable metamaterial absorber using varactor diodes", new j. phys., vol. 15, p. 043049, april 2013. [50] s. li, et al., "wideband, thin, and polarization-insensitive perfect absorber based the double octagonal rings metamaterials and lumped resistances", j. appl. phys., vol. 116, p. 043710, july 2014. [51] w. li, et al., "integrating non-planar metamaterials with magnetic absorbing materials to yield ultrabroadband microwave hybrid absorbers", appl. phys. lett., vol. 104, p. 022903, jan. 2014. [52] x. yin, et al., "ultra-wideband microwave absorber by connecting multiple absorption bands of two different-sized hyperbolic metamaterial waveguide arrays", sci rep., vol. 5, p. 15367, oct. 2015. [53] h. sun, et al., "broadband and broad-angle polarization-independent metasurface for radar cross section reduction", sci rep., vol. 7, p. 40782, jan. 2017. [54] h. k. kim, et al., "wideband-switchable metamaterial absorber using injected liquid metal", sci rep., vol. 6, p. 31823, aug. 2016. [55] w. ellison, "permittivity of pure water, at standard atmospheric pressure, over the frequency range 0–25 thz and the temperature range 0–100°c", j. phys. chem. ref. data, vol. 36, 1–18, feb. 2007. [56] a. andryieuski, et al., "water: promising opportunities for tunable all dielectric electromagnetic metamaterials", sci rep., vol. 5, p. 13535, aug. 2015. [57] m. odit, et al., "experimental demonstration of water based tunable metasurface", appl. phys. lett., vol. 109, p. 011901, july 2016. [58] y. j. yoo, et al., "metamaterial absorber for electromagnetic waves in periodic water droplets", sci rep., vol. 5, p. 14018, sept. 2015. [59] y. pang, et al., "thermally tunable water-substrate broadband metamaterial absorbers", appl. phys. lett., vol. 110, p. 104103, march 2017. [60] y. shen, "thermally tunable ultra-wideband metamaterial absorbers based on three-dimensional water-substrate construction", sci rep., vol. 8, p. 4423, march 2018. [61] w. 
zhuang, et al., "design and optimization of a flexible water-based microwave absorbing metamaterial", appl. phys. express, vol. 12, may 2019. [62] j. zhang, et al., "ultra-broadband microwave metamaterial absorber with tetramethylurea inclusion", opt. express, vol. 27, no. 18, pp. 2559525602, sept. 2019. [63] p. munaga, et al., "a fractal-based compact broadband polarization insensitive metamaterial absorber using lumped resistors", microw. opt. technol. lett., vol. 58, no. 2, pp. 343–347, feb. 2016. [64] n. thi quynh hoa, et al., "wide-angle and polarization-independent broadband microwave metamaterial absorber", microw. opt. technol. lett., vol. 59, no. 5, pp. 1157–1161, march 2017. [65] f. yang, et al., "ultrabroadband metamaterial absorbers based on ionic liquids", appl. phys. a, vol. 125, p. 149, feb. 2019. [66] s. ghosh and k. v. srivastava, "polarization-insensitive single-and broadband switchable absorber/reflector and its realization using a novel biasing technique", ieee trans. antennas propag., vol. 64, no. 8, pp. 3665–3670, may 2016. [67] o. ayop, et al. "dual band polarization insensitive and wide angle circular ring metamaterial absorber", in proceedings of the conf. antennas and propagation (eucap), hague, 2014, pp. 955–957. [68] o. ayop, et al., "dual-band metamaterial perfect absorber with nearly polarization-independent", appl. phys. a, vol. 123, p. 63, 2017. [69] h. k. kim, et al., "flexible inkjet-printed metamaterial absorber for coating a cylindrical object", opt. express, vol. 23, no. 5, pp. 5898–5906, march 2015. [70] m. yoo, et al., "silver nanoparticle-based inkjet-printed metamaterial absorber on flexible paper", ieee antennas wirel. propag. lett., vol. 14, pp. 1718–1721, april 2015. [71] x. huang et al., "experimental demonstration of printed graphene nano-flakes enabled flexible and conformable wideband radar absorbers", sci rep., vol. 6, p. 38197, dec. 2016. [72] d. 
lee et al., "textile metamaterial absorber using screen printed channel logo", microw. opt. technol. lett., vol. 59, no. 6, pp. 1424–1427, june 2017. [73] g. singh, et al., "fabrication of a non-wettable wearable textile-based metamaterial microwave absorber", j. phys. d: appl. phys., vol. 52, p. 385304, july 2019. [74] j. lee and b. lee, "wideband absorber using silver nanowire resistive film", electron. letters, vol. 52, pp. 631–633, april 2016. [75] x. chen, et al., "a graphite-based metamaterial microwave absorber", ieee antennas wirel. propag. lett., vol. 18, pp. 1016–1020, march 2019. [76] c. kejian, et al., "switchable 3d printed microwave metamaterial absorbers by mechanical rotation control", j. phys. d: appl. phys., vol. 53, p. 305105, may 2020. [77] j. ning et al., "ultra-broadband microwave absorption by ultra-thin metamaterial with stepped structure induced multi-resonances", results phys., vol. 18, p. 103320, sept. 2020. https://iopscience.iop.org/journal/1882-0786 https://iopscience.iop.org/volume/1882-0786/12 a review on the pursuit of an optimal microwave absorber 645 [78] f. s. santos and v. f. rodriguez-esquerre, "water-based broadband metamaterial absorbers operating at microwave frequencies", metamaterials, metadevices, and metasystems, vol. 2020, p. 114602g, aug. 2020. [79] d. sood, "ultrathin compact triple-band polarization-insensitive metamaterial microwave absorber", in mobile radio communications and 5g networks, lecture notes in networks and systems, n. marriwala, c.c. tripathi, d. kumar, s. jain, eds. vol. 140, 2021. [80] m. zhen, et al., "multi-spectral functional metasurface simultaneously with visible transparency, low infrared emissivity and wideband microwave absorption", infrared phys. technol., vol. 110, p. 103469, nov. 2020. [81] x. zhang, et al., "3-d printed swastika-shaped ultrabroadband water-based microwave absorber", ieee antennas wirel. propag. lett., vol. 19, no. 5, pp. 821–825, march 2020. [82] s. 
dongyong, et al., "comptibility of optical transparency and microwave absorption in c-band for the metamaterial with second-order cross fractal structure", physica e, vol. 116, p. 113756, feb. 2020. [83] h. wu, et al., "design and analysis of a five-band polarization-insensitive metamaterial absorber", int. j. antennas propag., vol. 2020, pp. 1–12, dec. 2020. [84] w. zhendong, et al., "broadband microwave absorber with a double-split ring structure", plasmonics, vol. 15, pp. 1863–1867, dec. 2020. [85] t. m. cuong, et al., "broadband microwave coding metamaterial absorbers", sci rep., vol. 10, p. 1810, feb. 2020. [86] a. e. assal, et al., "toward an ultra-wideband hybrid metamaterial based microwave absorber", micromachines, vol. 11, no. 10, p. 930, oct. 2020. [87] s. a. naqvi and m. a. baqir, "ultra-wideband symmetric g-shape metamaterial-based microwave absorber", j. electromagn. waves appl., vol. 32, no. 16, pp. 2078-2085, july 2018. [88] k. chaudhary, et al., "optically transparent protective coating for ito-coated pet-based microwave metamaterial absorbers", ieee trans. compon. packaging manuf. technol., vol. 10, no. 3, pp. 378-388, march 2020. [89] r. bhattacharyya, et al., "defect reconstruction in graphene for excellent broadband absorption properties with enhanced bandwidth", appl. surf. sci., vol. 537, p. 147840, jan. 2021. [90] r. bhattacharyya, et al., "graphene oxide-ferrite hybrid framework as enhanced broadband absorption in gigahertz frequencies", sci. rep., vol. 9, p. 12111, aug. 2019. [91] s. bhattacharyya and k. v. srivastava, "triple band polarization-independent ultra-thin metamaterial absorber using elc resonator", j. appl. phys., vol. 115, p. 064508, feb. 2014. instruction facta universitatis series: electronics and energetics vol. 28, n o 3, september 2015, pp. 
407–421 doi: 10.2298/fuee1503407s

utility needs smarter power meters in order to reduce economic losses

dejan stevanović 1, predrag petković 2
1 innovation centre of advanced technology (icnt), niš, serbia
2 university of niš, faculty of electronic engineering, niš, serbia

abstract. whenever the delivered power is greater than the sum of the power registered at the points of common coupling (pcc), the utility suffers losses. this paper shows that even in an ideal case, without any abuse by users, losses occur because of inadequate measurement equipment and a deficient billing policy. namely, common household power meters register only active energy, while power meters for industrial applications register reactive energy as well. consequently, the billing policy is based on only one or both of these values. this approach does not follow the change in the end-user load profile, which has become highly nonlinear. the current through a nonlinear load deviates from a sine waveform, so a part of the delivered power remains invisible to the power distribution system. therefore, the utility registers significant economic losses. to solve this problem we recommend that distortion power be measured and included in the billing policy. this has not been the case so far because the electric power community has not been aware of the amount of distortion power in the contemporary grid, and because power meters have not been able to measure it. this paper demonstrates how to overcome the obstacle with a minor modification of ordinary electronic power meters. the proposed solution is verified by a set of measurements on different types of loads that are commonly used in households and offices.

key words: electronic meters, harmonics, distortion power, utility losses.

1. introduction

one of the main demands facing modern society is to reduce power consumption, or to use energy more efficiently.
electronics has responded to this demand with high-efficiency appliances that consume less power. however, the battle for every watt on the consumer's side is useless if it is not supported on the utility side. recent research published in [1, 2] shows that annual transmission and distribution losses reach up to 6% of the total generated energy (2% in transmission and 4% in distribution). that represents losses of about 7 billion euros every year. this number includes losses that occur in the medium and low voltage lines and in primary and secondary substations.

received july 8, 2014; received in revised form december 22, 2014
corresponding author: dejan stevanović, innovation centre of advanced technology (icnt), niš, serbia (e-mail: dejan.stevanovic@icnt.rs)

many countries have passed laws that require utilities to reduce these losses by 1.5% each year [3]. the losses can be classified as technical and non-technical. technical losses occur due to dissipation in the components of the electricity system (transformers, transmission and distribution lines, and the measurement system); they are therefore physical losses. non-technical losses refer to energy that is delivered and consumed but, for some reason, not recorded in sales, metering and billing. many different approaches exist for reducing losses in a power grid, some of which are published in [3]. according to schneider electric, about 90% of non-technical losses occur in the medium and low voltage (mv/lv) grid. in european countries they are assumed to range between 1,000 € and 10,000 € per mv/lv substation per year [3]. this has put the mv/lv grid at the top of the priorities for loss reduction.

the first step in cutting the losses is to monitor them and to detect their sources. in the past this was hardly feasible and very expensive. fortunately, that is not the case today.
smart meters give inexpensive and precise insight into the current status of particular grid parameters. they allow the utility to measure many parameters that define the quality of the delivered electric power. unfortunately, because established views are slow to change, some decisions affecting the power system are not made in time. the most obvious misconception concerns the character of the consumers connected to the electricity grid. power grid measurement theory assumes that all loads are entirely linear, either resistive or reactive. this implies that the current follows the sinusoidal voltage waveform (with a possible phase lead or lag for reactive loads). for a long time, household loads were basically resistive (heaters, incandescent lights), while industrial loads usually had a strongly inductive character (ac motors). consequently, it was sufficient for power meters to register only active power in households, and reactive power as well for industrial customers. however, the character of loads has changed drastically since the last quarter of the 20th century. one of the biggest european utilities, enel s.p.a. (ente nazionale per l'energia elettrica; www.enel.com), identified the problem of losses related to the changed load profile and therefore included reactive energy in the billing policy for households. as a result, it has replaced more than 20 million household electric power meters with upgraded electronic power meters capable of measuring both active and reactive energy [4]. in this way the losses were lowered, but not completely eliminated.

we will show in this paper that the main source of losses caused by consumers in a contemporary grid is not unregistered reactive power. as presented in some earlier papers [5, 6, 7], and as will be shown here on further examples, the main source of losses is unregistered distortion power.
we show that, for most contemporary power-saving electronic devices, distortion power has the same order of magnitude as active power. being unregistered in the system, it appears as a 'phantom' part of the delivered power that the utility sees as a loss. therefore, we suggest introducing distortion power into the billing policy in order to reduce the losses. to do so, however, we need a simple and reliable way to quantify it. we claim that there is a simple, and therefore smart, way to enhance electronic power meters with the option of registering distortion power. this new feature will make the existing smart meters smarter.

this paper is organized as follows. the next section considers the nature of loads in the contemporary power grid. the third section surveys the historical development of power meters and presents the theory of operation behind contemporary power meters. the fourth section describes how to upgrade the meters and the billing policy in order to reduce the economic losses of the utility. the fifth section presents results measured using the suggested upgrade; it shows that the main cause of losses is not the reactive but the distortion part of power. the final section concludes the paper by analyzing the results obtained by measurements on different types of loads that are commonly used in households and offices.

2. nature of loads at contemporary power grid

the orientation towards energy-efficient electronic devices has led to transistors in most consumer electronics operating in switching mode. to the power grid they represent nonlinear loads. at the beginning of the energy saving era their nominal power was low, so they did not jeopardize the operation of the utility. the demand for energy efficiency has rapidly increased the number of such devices connected to the grid.
therefore, the community is faced with the fact that the total load is becoming mostly nonlinear. at a nonlinear load the current waveform is distorted in comparison with the voltage waveform because of the presence of higher-order harmonics. let us assume that the grid is supplied from an ideal voltage source with zero internal resistance. then the voltage retains only the fundamental frequency $f_1$ = 50 hz (60 hz), while the current has harmonic components at frequencies $f_n = n f_1$, $n$ = 2, 3, ... active power then appears only at $f_1$. therefore, power meters capable of measuring only active power will not register the power caused by all other components of the current. like any other unregistered power, this causes economic losses on the utility side. the losses increase as the nonlinearity rises. moreover, since the resistance of transmission lines is not zero, the current harmonics introduce harmonics into the grid voltage as well. the presence of harmonics in both voltage and current further complicates the measurement of active, reactive and apparent power, which is based on a model of a linearly loaded grid. in reality, harmonics contribute less than 3% of the total active or reactive power [8]. their main contribution is an additional component of power known as distortion power. unfortunately, standard power meters, including modern smart meters, do not register this component. accordingly, utilities suffer huge losses. as we and other authors have shown by measurement [5], [6], [7], this component of power cannot be neglected. in addition, harmonics cause indirect losses to the utility through malfunctions of other power grid equipment [9], [10].

we claim that it is necessary to upgrade the meters so that they can register all components of power. otherwise, end users will not care about the level of nonlinearity of their loads.
on the contrary, they will be encouraged to use energy saving devices that reduce the bill but pollute the grid voltage with harmonics. the purpose of this paper is to show the real level of utility losses caused by billing only active energy and to offer a low-cost solution. implemented within contemporary electronic power meters, the solution makes smart meters smarter. the next section presents an overview of power meters, their advantages and disadvantages.

3. history and development of power meters

the meter is a crucial part of the electric utility infrastructure. it registers the amount of electricity transferred from the service to a customer. consequently, both the utility and the customer should be able to trust the measured results. conventional electromechanical power meters earned broad trust because they are accurate, simple, low-cost and durable. the operation of an electromechanical meter is based on the interaction between phase-shifted magnetic fluxes whose intensities are proportional to the current and the voltage. an orthogonal phase angle provides the maximum electromagnetic torque. therefore, a 90° phase shift is added when the voltage and current are in phase. as a result, the meters do not register an active component of power in the case of completely reactive loads. they are reliable for what they were designed for: energy metering in a grid with linear loads. consequently, they remained in use in their original form regardless of the development of electronics. however, they could not resist the solid-state revolution. eventually, after a century of consistent usage, their withdrawal from the market began. the major power meter manufacturers have replaced electromechanical models with solid state ones, which has opened the metering market to new enterprises. fig. 1 illustrates the trend of the replacements between 1998 and 2010 [11].
[figure 1: number of manufacturers offering solid state meters and number of manufacturers offering electromechanical meters, per year from 1998 to 2010; manufacturers shown: echelon, elster, ge, itron, landis+gyr, sensus]

fig. 1 replacement of electromechanical meter production with solid state versions (diagram taken from [11])

the new metering technology has enriched the meter with options to register rms voltage and current and to measure frequency, apparent power and power factor. electronic power meters rely on digital signal processing. instantaneous values of the attenuated line voltage are sent to an adc, where they are sampled at a rate of at least twice the highest frequency of interest (according to the nyquist-shannon theorem) and digitized. simultaneously, on another channel, the voltage equivalents of the current acquired by current transformers are converted to digital signals as well. a dsp unit then calculates all the necessary power line quantities from the digitized voltage and current samples according to the following mathematical procedure. the instantaneous value of a power line signal (current or voltage) in the time domain can be expressed as:

$$x(t) = \sum_{h=1}^{M} \sqrt{2}\, X_{RMS,h} \cos(2\pi f_h t + \varphi_h), \tag{1}$$

where $f_h$ and $\varphi_h$ denote the frequency and phase angle of the $h$-th harmonic, $h$ is the order of the harmonic and $M$ is the highest harmonic. after discretization at equidistant time intervals $T$, the $n$-th sample becomes

$$x(nT) = \sum_{h=1}^{M} \sqrt{2}\, X_{RMS,h} \cos\left(2\pi \frac{f_h}{f_{sampl}} n + \varphi_h\right), \tag{2}$$

where $f_{sampl}$ is the sampling frequency ($f_{sampl} = 1/T$). by definition, the rms value of the signal over one second is:

$$X_{RMS} = \sqrt{\frac{1}{N} \sum_{n=1}^{N} x^2(nT)}, \tag{3}$$

where $N$ is the number of samples per second. the active power is the average of the product of the instantaneous values of current and voltage:

$$P = \frac{1}{N} \sum_{n=1}^{N} v(nT)\, i(nT) = \frac{1}{N} \sum_{n=1}^{N} p(nT). \tag{4}$$

the reactive power ($Q$) is defined similarly, after shifting the phase of the voltage samples by π/2:

$$Q = \frac{1}{N} \sum_{n=1}^{N} v_{\pi/2}(nT)\, i(nT) = \frac{1}{N} \sum_{n=1}^{N} q(nT), \tag{5}$$

where $v_{\pi/2}(nT)$ denotes the $n$-th sample of the phase-shifted voltage. the product of the rms values of voltage and current is the apparent power:

$$S = V_{RMS}\, I_{RMS}. \tag{6}$$

in addition, electronic meters offer better accuracy. in response to the new features, the american national standards institute (ansi) introduced new regulations with stricter accuracy requirements. ansi c12.20 established accuracy classes 0.2 and 0.5. the class numbers correspond to the maximum percentage metering error at normal loads. typical household electronic power meters of class 0.5 have replaced electromechanical meters, which were typically built to meet the class 1 standard (ansi c12.1), as fig. 2 illustrates [11]. moreover, the lower measuring limit for ansi c12.1 was 0.3 a (72 w), while ansi c12.20 requires solid state meters to register power greater than 24 w (0.1 a). this was a significant accuracy improvement.

[figure 2: measurement error limits [%] versus current [amps rms]; curves: class 0.5 accuracy limits per ansi c12.20 (solid state meters) and class 1 accuracy limits per ansi c12.1 (electromechanical meters)]

fig. 2 accuracy class comparison [11]

an important issue in metering development is how much better accuracy will cost. after more than a century of experience in power metering and two decades of solid state meter usage, this is no secret. the relation between practical needs and the cost of improved accuracy is well understood. an interesting benefit-to-cost analysis of the profitability of improving metering accuracy was published in [12]. it took into account the contributions of metering inaccuracies and of voltage and current transformer errors to the total system error.
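the figures quoted from [12] for this analysis (a 0.2% meter with 0.15% voltage and current transformers, a price of $0.03/kwh and a 10 mw load) are consistent with a root-sum-square combination of the three errors. the sketch below is an illustration under that assumption, not the procedure of [12]: the combination rule, the constant load over 8760 hours and the function names are all assumptions introduced here.

```python
import math

HOURS_PER_YEAR = 8760  # assumption: the 10 MW load is constant all year


def system_error_percent(meter, vt, ct):
    """Root-sum-square of three independent percentage errors (assumed rule)."""
    return math.sqrt(meter ** 2 + vt ** 2 + ct ** 2)


def annual_loss_usd(error_percent, load_kw=10_000.0, price_per_kwh=0.03):
    """Yearly value of the energy mis-registered at the given total error."""
    yearly_energy_kwh = load_kw * HOURS_PER_YEAR
    return yearly_energy_kwh * price_per_kwh * error_percent / 100.0


# Voltage and current transformer errors of 0.15% each, as quoted from [12].
losses = {
    meter_error: annual_loss_usd(system_error_percent(meter_error, 0.15, 0.15))
    for meter_error in (0.2, 0.1, 0.05)
}

for meter_error, loss in losses.items():
    print(f"meter error {meter_error}%: about ${loss:,.2f} per year")
print(f"gain from 0.2% to 0.1%:  ${losses[0.2] - losses[0.1]:,.2f}")
print(f"gain from 0.1% to 0.05%: ${losses[0.1] - losses[0.05]:,.2f}")
```

under these assumptions the three transformer and meter errors are treated as independent, which is why halving the meter error brings progressively smaller gains once the transformer errors dominate.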
the analysis considered a meter error of 0.2% and errors of 0.15% each for the voltage and current transformers, which results in a total system error of 0.292%. assuming a price of $0.03/kwh, the analysis showed that the system error causes economic losses of $7,661.87 per year at a load with a nominal power of 10 mw. under the same circumstances, an improved meter with an error of 0.1% yields a difference of $1,498.66. taking the meter error down to 0.05% would yield an additional $435 over the 0.1% error meter. fig. 3 shows how the benefit of better accuracy decreases rapidly below 0.1% [12].

[figure 3: benefit-to-cost ratio versus meter error in % (0.2, 0.1, 0.05, 0.025); curves: benefit over 0.5% and benefit over 0.2%]

fig. 3 benefit-to-cost analysis for improving metering accuracy at a 10 mw site [12]

4. method for eliminating losses caused by nonlinear loads at the power grid

both manufacturers and utilities use a set of tests to verify that meters meet the requirements of the appropriate standards [13], [14]. during the manufacturing process, each individual meter is calibrated and verified. upon receiving power meters from the manufacturer, the utility performs an additional accuracy test, either on each meter or on a sample basis. the meters are calibrated at the fundamental frequency. therefore, they do not register the total consumed energy precisely when harmonics exist. in classic electromechanical meters the error rises with the frequency of the harmonics, because the intensity of the magnetic fluxes decreases in proportion to the order of the harmonic. the meters make an error of 40% of the true value at the third harmonic (150 hz), and almost 80% at the seventh harmonic (350 hz) [15]. the error can be positive or negative depending on the character of the reactance (inductive or capacitive).
as stated in the introduction and as will be shown in the next section, the contemporary power grid is polluted with harmonics. therefore, even well calibrated solid state meters will not register all the delivered power. they measure active and reactive energy with an accuracy that meets the standards (iec 62053-22 and iec 62053-23 [13, 14]), but distortion power is not registered. the distortion power is actually delivered to the customers but is not visible to the power distributor. namely, when active, reactive and apparent power are calculated according to (4), (5) and (6), respectively, the well-known relation for sine-wave conditions is obtained:

$$S^2 = P^2 + Q^2. \tag{7}$$

however, in the presence of harmonics, this equation turns into an inequality:

$$S^2 > P^2 + Q^2. \tag{8}$$

budeanu noticed this as early as 1927. he introduced the term distortion power and revised the equation for apparent power:

$$S^2 = P^2 + Q^2 + D^2, \tag{9}$$

where $D$ denotes distortion power. in the absence of harmonics, $D = 0$ and $S^2 = P^2 + Q^2$, so (7) is a special case of (9). with known $S$, $P$ and $Q$, the distortion power can be found as:

$$D = \sqrt{S^2 - P^2 - Q^2}. \tag{10}$$

equation (10) is suitable for implementation within solid state power meters. in order to meet the regulations, electronic meters already provide $P$, $Q$ and often $S$ as well. if an electronic power meter does not provide the apparent power, the feature can be added by implementing (6) within its dsp block. distortion power can then easily be calculated using (10). the implementation requires only a small modification of the software or the dsp. the utility is then able to take $D$ into account and to introduce it into a new billing policy. this gives the utility the possibility to charge for all components of the delivered power and to eliminate the losses that exist in the system.

direct application of (10) for billing requires certain corrections. namely, the existing power quality standards define the allowed value for each harmonic. the two best known standards in this area are ieee 519-1995 and iec/en 61000-3-2. the utility should therefore not charge for distortion energy caused by harmonics that stay below the limits allowed by the standard. so, it makes sense to define a threshold value for $D$ that is deducted from the value measured using (10). this value represents the acceptable amount of distortion. it is appropriate to express it in terms of the apparent power. one way is to define the threshold value $D_t$ as:

$$D_t = \gamma S, \tag{11}$$

where γ denotes a constant that should be defined (by a standard or by the utility). the authors of this paper analyzed the allowed limit for each harmonic of current and voltage defined by the ieee 519-1995 standard and suggest that the correction factor γ should be equal to the maximum allowed value of thdi. note that the value of thdi used as the correction factor is not the measured value. according to ieee 519-1995, the allowed value of thdi is 10% in the case where isc/il < 20 (isc is the maximum short-circuit current at the pcc, while il is the maximum fundamental current measured over a 15- or 30-minute interval at the pcc). this is important because the measured value of thdi can be significant and reach up to 90%. in this way only customers that exceed the threshold will be charged for the additionally consumed power, $D_p$:

$$D_p = D - D_t, \tag{12}$$

where $D$ denotes the measured distortion power. the customer will therefore not be penalized for using loads whose harmonics are within the limits defined by the ieee 519-1995 standard.

the standard iec 61000-3-2 cannot be used for calculating the correction factor because it applies a different philosophy to define the allowed amount of harmonics.
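equations (10) to (12) can be sketched in a few lines. the following is a minimal illustration, not the authors' implementation: the function names are hypothetical, γ = 0.10 is the 10% thdi value mentioned above, and clamping the billable part at zero (no credit for customers below the threshold) is an assumption that the text implies but does not state as a formula. the first set of input values is taken from the es20wbulb row of table 1.

```python
import math


def distortion_power(s, p, q):
    """Eq. (10): D = sqrt(S^2 - P^2 - Q^2).

    A slightly negative radicand, which can arise from rounding of the
    measured values, is clamped to zero (assumption).
    """
    return math.sqrt(max(s * s - p * p - q * q, 0.0))


def billable_distortion(s, p, q, gamma=0.10):
    """Eq. (11)-(12): D_p = D - gamma * S.

    Assumption: customers below the threshold are charged nothing,
    so the result is never negative.
    """
    threshold = gamma * s  # D_t in eq. (11)
    return max(distortion_power(s, p, q) - threshold, 0.0)


# Energy saving lamp (S, P, Q from the es20wbulb row of table 1).
d_lamp = distortion_power(29.07, 18.30, -8.81)
dp_lamp = billable_distortion(29.07, 18.30, -8.81)
print(f"lamp: D = {d_lamp:.2f}, billable D_p = {dp_lamp:.2f}")

# Nearly resistive load (illustrative values): D stays below the threshold.
dp_resistive = billable_distortion(100.0, 99.9, 1.0)
print(f"resistive load: billable D_p = {dp_resistive:.2f}")
```

because $D$ is a difference of squares of nearly equal numbers when the load is almost linear, small rounding errors in $S$, $P$ and $Q$ are strongly amplified there; the threshold in (11) also absorbs part of that uncertainty.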
namely, ieee 519-1995 limits harmonics primarily at the service entrance (pcc), while iec 61000-3-2 is applied at the terminals of end-user equipment. ieee 519-1995 therefore defines the maximum allowed value of each harmonic (as a percentage of the fundamental current) in order to prevent interactions between neighboring customers within the power system. the intention of the iec limits, however, is to reduce harmonic pollution in an industrial plant. moreover, iec 61000-3-2 defines four classes of equipment (classes a, b, c and d) and assigns different harmonic limits to each class. these limits are given as the maximum allowed harmonic current for classes a, b and d, and as a percentage of the fundamental current for class c. the next section demonstrates the suggested methodology on real examples. the results were obtained using an industrial standard power meter manufactured by ewg [16].

5. confirmation of the method by measuring

the main advantage of the suggested billing method is its applicability. namely, most ordinary electronic power meters on the market can be enhanced with the ability to calculate distortion power according to (10). the only requirement is that the meter is capable of measuring $S$, $P$ and $Q$ separately according to (4), (5) and (6). in our case we use a meter produced by ewg [16]. the operation of this meter is based on the standard integrated circuit 71m6533 [17]. this power meter completely fulfils the standards iec 62053-22 and iec 62053-23 [13, 14]. the only additional effort was to collect and store the data provided by the meter. for research purposes we used a pc to perform this task. figure 4 shows the realised set-up. its simplicity is obvious: it consists of the meter, the load and a pc. communication between the meter and the pc is done through the optical port and an rs232 interface.
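the chain from samples to distortion power, i.e. equations (3) to (6) followed by (10), can be illustrated on synthetic signals. the sketch below is not the meter firmware or the authors' script: the sampling frequency, the signal amplitudes and all names are assumptions chosen for illustration. the voltage is a pure 50 hz sine wave, and the current has a fundamental lagging by 30° plus a third harmonic.

```python
import math

F1 = 50.0                     # fundamental frequency [Hz]
FS = 6400.0                   # assumed sampling frequency [Hz]: 128 samples per period
N = int(FS)                   # one second of samples = 50 whole periods
QUARTER = int(FS / F1) // 4   # samples in a quarter period (pi/2 at f1)


def rms(x):
    """Eq. (3): root of the mean of the squared samples."""
    return math.sqrt(sum(s * s for s in x) / len(x))


def mean_product(a, b):
    """Average of the sample-by-sample product, as in eq. (4) and (5)."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


def measure(v, i):
    """Return (P, Q, S, D) from voltage and current samples."""
    # Delay the voltage by a quarter period; a circular shift is valid
    # because the window holds a whole number of periods.
    v_shifted = v[-QUARTER:] + v[:-QUARTER]
    p = mean_product(v, i)           # eq. (4)
    q = mean_product(v_shifted, i)   # eq. (5)
    s = rms(v) * rms(i)              # eq. (6)
    d = math.sqrt(max(s * s - p * p - q * q, 0.0))  # eq. (10)
    return p, q, s, d


w = 2.0 * math.pi * F1
t = [n / FS for n in range(N)]
# 230 V rms sinusoidal voltage.
v = [math.sqrt(2) * 230.0 * math.sin(w * tn) for tn in t]
# Current: 1 A rms fundamental lagging by 30 degrees plus 0.8 A rms third harmonic.
i = [math.sqrt(2) * (1.0 * math.sin(w * tn - math.radians(30.0))
                     + 0.8 * math.sin(3.0 * w * tn)) for tn in t]

P, Q, S, D = measure(v, i)
print(f"P = {P:.2f}, Q = {Q:.2f}, S = {S:.2f}, D = {D:.2f}")
```

with a sinusoidal voltage the third harmonic of the current contributes nothing to $P$ or $Q$, yet it raises the rms current and therefore $S$; in this example the whole harmonic contribution ends up in $D$.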
dedicated software processes the data and forwards them to a matlab script that calculates the distortion power.

[figure 4: block diagram; 230v/50hz power inlet, electronic power meter, power outlet to the load, signal output from the meter to the rs232 input of a computer with software]

fig. 4 set-up circuit for distortion power measurement

table 1 measurement results for different types of loads

| row | load | vrms | irms | s | p | q | d | p/s | q/s | d/s | d/p |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ilb philips 100w | 219.9 | 0.42 | 92.48 | 92.32 | 0.74 | 2.54 | 1.00 | 0.01 | 0.03 | 0.03 |
| 2 | water kettle | 216.24 | 7.906 | 1709.59 | 1709.54 | 0.39 | 13.51 | 1.00 | 0.00 | 0.01 | 0.01 |
| 3 | fan | 225.41 | 0.008 | 1.83 | 1.19 | 1.32 | 0.42 | 0.65 | 0.72 | 0.23 | 0.35 |
| 4 | fridge | 225.68 | 0.641 | 144.66 | 98.64 | 104.78 | 14.77 | 0.68 | 0.72 | 0.10 | 0.15 |
| 5 | fl18w | 218.62 | 0.08 | 17.49 | 11.33 | -5.80 | 11.99 | 0.65 | 0.33 | 0.69 | 1.06 |
| 6 | es20wbulb | 218.55 | 0.13 | 29.07 | 18.30 | -8.81 | 20.79 | 0.63 | 0.30 | 0.72 | 1.14 |
| 7 | es20whelix | 219.01 | 0.14 | 30.66 | 18.61 | -9.38 | 22.49 | 0.61 | 0.31 | 0.73 | 1.21 |
| 8 | es15wbulb | 219.74 | 0.09 | 19.56 | 12.10 | -5.51 | 14.34 | 0.62 | 0.28 | 0.73 | 1.19 |
| 9 | es11whelix | 221.73 | 0.08 | 17.74 | 10.42 | -5.38 | 13.31 | 0.59 | 0.30 | 0.75 | 1.28 |
| 10 | es9wbulb | 216.06 | 0.06 | 12.75 | 7.58 | -3.64 | 9.58 | 0.59 | 0.29 | 0.75 | 1.26 |
| 11 | es7wspot | 217.75 | 0.04 | 9.58 | 5.83 | -2.87 | 7.04 | 0.61 | 0.30 | 0.73 | 1.21 |
| 12 | led reflector 10w | 223.05 | 0.093 | 20.74 | 10.81 | -3.75 | 17.30 | 0.52 | -0.18 | 0.83 | 1.60 |
| 13 | led reflector 30w | 222.72 | 0.272 | 60.58 | 32.23 | -8.96 | 50.51 | 0.53 | -0.15 | 0.83 | 1.57 |
| 14 | led bulb 6w | 222.65 | 0.042 | 9.35 | 5.10 | 0.96 | 7.78 | 0.55 | 0.10 | 0.83 | 1.53 |
| 15 | monitor g2320hdbl | 222.9 | 0.179 | 39.90 | 23.65 | -4.67 | 31.79 | 0.59 | 0.12 | 0.80 | 1.34 |
| 16 | monitor w2241s | 223.88 | 0.289 | 64.70 | 40.28 | -9.40 | 49.75 | 0.62 | 0.15 | 0.77 | 1.24 |
| 17 | monitor1909w | 221.79 | 0.20 | 43.91 | 24.69 | -7.15 | 35.61 | 0.56 | 0.16 | 0.81 | 1.44 |
| 18 | dell-optiplex360 | 220.57 | 0.39 | 85.80 | 61.09 | 10.24 | 59.37 | 0.71 | 0.12 | 0.69 | 0.97 |
| 19 | air conditioner (cooling-mode) | 217.14 | 4.73 | 1026.86 | 1006.03 | 107.44 | 175.48 | 0.98 | 0.10 | 0.17 | 0.17 |
| 20 | printer hp1505 standby | 211.26 | 0.04 | 8.24 | 3.60 | -3.35 | 6.61 | 0.44 | 0.41 | 0.80 | 1.84 |
| 21 | printer hp1505 printing | 208.19 | 2.68 | 557.32 | 548.39 | -8.05 | 99.07 | 0.98 | 0.01 | 0.18 | 0.18 |

table 1 summarises the measured results obtained for different types of loads. the first four rows represent a group of linear loads, while the rest are nonlinear. as expected, for linear resistive loads (incandescent lamp and water kettle) the active power almost equals the apparent power. the small difference is due to the limited accuracy of the power meter. the more accurate the power meter, the smaller the difference and, consequently, the more precise the measured value of $D$. in our case we used a power meter that measures active and apparent power with an error of less than 0.5%, and reactive power with an error of less than 2%. more information about the measuring error of the power meter can be found in [18]. because of the errors made while measuring active, reactive and apparent power, some small value of distortion power appears.

the next two loads in table 1 (fan, fridge) represent linear reactive loads that draw a significant amount of reactive power $Q$; in this case the reactive power is greater than the active power. moreover, the results measured at these loads show a small amount of distortion power $D$. the main reason for distortion power at a linear reactive load is harmonics in the supply voltage. according to the ieee 519-1995 standard, the utility has to provide voltage with thd < 5% (total harmonic distortion). in reality, therefore, the voltage waveform is not a pure sine wave; it contains a small amount of harmonics. the impedance of a reactive load is not the same at different harmonics (z1 ≠ z3 ≠ … ≠ zh, where h denotes the h-th harmonic). consequently, the currents at different harmonics are not equal, which shows up as $D$ ≠ 0 [9].

rows 5 to 14 in table 1 represent a group of energy saving lamps. their number on the grid has increased rapidly in the recent past, which makes them very interesting for research.
namely, one of the easiest ways to shrink the power bill and carbon footprint is to use energy saving lamps (led bulbs, cfls) instead of incandescent light bulbs. energy saving lamps represent a typical green-green situation, saving money and helping the environment. however, the price for improving the 'green' features is paid from a quite different account. the basic problem with energy saving lamps lies in their nonlinear nature: they generate current harmonics, so irms increases with the harmonic content. this is reflected in an increase of s and d as well. the columns p/s and d/s in table 1 indicate the seriousness of the problem. for energy saving lamps (rows 5 to 14) the unregistered distortion power amounts to between 0.69 and 0.83 of the apparent power. it exceeds the registered active power, which ranges between 0.52 and 0.65 of the apparent power. for most of the nonlinear loads in table 1, d is greater than p; the ratio d/p ranges from 0.17 to 1.84. the obtained results are consistent with data recently published in [6] for similar types of energy saving lamps. it is important to note that the measurements in [6] were based on a quite different approach, grounded in fft analysis. fortunately, a growing number of manufacturers are aware of the problems caused by their products and try to improve them using different filtering methods to reduce the generated harmonics. for example, most philips branded led lamps use a valley-fill circuit [19, 20]; the toshiba lamp contains a passive filter [21, 22], while osram embeds an active filter [21, 22]. regardless of the filter used, the current of these loads is not sinusoidal [21]. consequently, loads with pfc filters still produce a considerable amount of harmonics, and some improvement of their filters is necessary. however, the question is whether this is worthwhile for the manufacturer.
an alternative is to employ active harmonic compensation systems at the pcc [23]. harmonics produced by all nonlinear loads connected at the pcc would then be diminished, and the manufacturers' costs would be lower. however, this opens a new question of who should pay for such a system: the customer or the utility. fig. 5 compares the active and distortion components of power for the light bulbs in table 1. only in the first case, the incandescent lamp bulb (ilb) philips 100w, which is a linear load, is the active power much greater than the distortion power. in all other cases the distortion power is of the same order of magnitude as, but greater than, the active power.

fig. 5 comparison between active and distortion power spent on different types of light bulb

fig. 6 shows timing diagrams of power consumption for five bulbs switched according to the schedule in table 2. the proposed method successfully detects all changes. when the incandescent lamp bulb (philips 100w) turns on (events 3 and 8), the total apparent and active powers rise, while the distortion and reactive powers remain at their previous values. similarly, when it turns off (event 6), the active and apparent powers decrease, while the distortion and reactive powers stay constant. figure 6 clearly shows that distortion power is registered only when one of the nonlinear loads is turned on.

fig. 6 power consumption timing diagram measured for a group of five bulbs switched according to the schedule in table 2.

table 2 schedule of switching different types of bulbs
state | es20wbulb | es20whelix | ilb philips 100w | es15wbulb | es11whelix
1 | on | off | off | off | off
2 | on | on | off | off | off
3 | on | on | on | off | off
4 | on | on | on | on | off
5 | on | on | on | on | on
6 | on | on | off | on | on
7 | on | on | off | off | on
8 | on | on | on | off | on
9 | on | on | on | off | off

someone could disregard these results, claiming that energy saving lamps are small loads (p < 20 w).
however, one should not overlook the number of these loads on the grid. namely, about 20% of world electricity consumption is related to lighting [24]. for public lighting alone, the average operating time reaches more than 12 hours per day during the winter and is not less than 9 hours per day during the summer in this region (the balkans). the last seven loads in table 1 present results measured at loads commonly used in households and offices. the most interesting of all the observed loads is the printer hp m1505 in stand-by mode: in this mode the distortion power is 1.84 times the active power. someone may argue that the real value of p is relatively small (3.6 w), so it is not important for the power distributor. the authors of this paper do not agree. we should not neglect two important facts. firstly, electronic equipment spends most of the time in stand-by mode. secondly, the utility measures losses in terms of real money, which is correlated to energy, i.e. the integral of power over time. moreover, in printing mode the absolute value of d reaches almost 100 var. it is interesting to observe the power consumption of typical nonlinear loads over some time interval. we have tracked the consumption of a personal computer and an air conditioner. the power consumption of a pc during setup is shown in fig. 7. for the entire observed interval the distortion power is greater than the active power. without doubt, the unregistered or wasted energy is greater than the active part. this is the main reason for losses at a utility. fig. 8 presents the power consumption of an air conditioner while starting the heating mode. it is interesting to see that the distortion power has the same order of magnitude as the reactive power and exceeds 200 var.

fig. 7 typical power consumption timing diagram of a pc (dell optiplex 360) during setup

fig. 8 typical power consumption timing diagram of an air conditioner

all examples confirm that distortion power exists on the grid. however, awareness of the losses caused by nonlinear loads does not exist. therefore, the billing policy might change. it would be reasonable to bill consumers that pollute the grid with harmonics at a higher rate. however, this requires meters capable of measuring all components of apparent power according to (10). the suggested modification can be made at the software level or in hardware. namely, it is not difficult to implement the distortion power calculation within the dedicated dsp [25] or a microprocessor unit that already exists in electronic power meters. of course, in both cases the manufacturer's intervention is needed. a software upgrade requires access to, and changes in, the source code. a hardware intervention requires replacement of the metering integrated circuit. besides, we have a solution based on a new piece of hardware that is able to collect the required data through the optical head and to calculate and display the distortion power. the hardware has been prototyped on an fpga and will be reported in a separate paper. such meters are a little smarter than contemporary electronic meters because they have one feature more. therefore, they are able to solve the problem of economic losses on the utility side caused by not registering distortion power.

6. conclusion

the results presented in this paper should inform utilities that the main cause of the registered losses is inadequate measuring equipment. moreover, the paper offers an easy, reliable and low-cost solution to the problem. the measured results show that utilities have large losses because they do not register the distortion part of the delivered power. actually, as far as the authors were able to find out, almost all utilities in europe base their billing policy only on active and reactive power measurements.
however, in real life the number of nonlinear loads is increasing rapidly. this is the cost of a greater usage of energy efficient equipment that reduces the carbon footprint. for example, for energy efficient lighting equipment (cfls and led lamps) that replaces classic light bulbs, the losses due to distortion power rise above 60% in comparison to the active power. although each energy saving lamp is a small power consumer, their total number cannot be ignored. moreover, their operating time is usually very long in comparison to other devices, so the total consumed energy becomes significant. this paper shows that the contribution of distortion power to the consumed apparent power is similar for other equipment and gadgets commonly used in offices and households (air conditioners, monitors, pcs, printers, etc.). moreover, measurements during particular operating regimes (set-up phase of a pc, printer stand-by) show that the distortion power is greater than the active power all the time. all presented results show that utilities have an increased level of losses when using power meters that measure only active power. therefore, some countries have already started to replace old electromechanical power meters that measure only active power with new electronic meters that are capable of measuring reactive power as well. this has reduced losses by a few percent, but the problem of losses is not solved. as a solution, the authors of this paper suggest using distortion power as an additional parameter in the billing policy. for this to happen, it is necessary to provide information on the distortion power spent at the customer connection point. we claim that this is possible with a minor upgrade of smart electronic power meters that will make them smarter. someone may suggest that it is enough to register only the apparent power and separate active from non-active power.
however, this would be a step back, because ordinary electronic power meters are capable of measuring apparent, active and reactive power separately. as presented in this and some previous papers [24], only a minor modification of the dedicated dsp hardware, or of the software for the dsps or mcus built into the meters, is sufficient to enable them to measure distortion power as well. armed with this feature, meters installed at the pcc give the utility an opportunity to identify the sources of harmonics on the grid [5], [7]. this opens the possibility to bill each component of power separately. in our opinion, this will diminish the losses and will serve as a powerful tool for managing the loading profile of consumers.

acknowledgment: this work has been partly funded by the serbian ministry of education, science and technological development under the contract no. tr32004.

7. references
[1] international energy agency, "energy statistics and balances of non-oecd countries and energy statistics of oecd countries, and united nations, energy statistics yearbook".
[2] eurelectric, power statistics 2010, full report, page 16.
[3] m. clemence, r. coccioni, a. glatigny, "how utility electrical distribution networks can save energy in the smart grid era", schneider electric, april 2013.
[4] e. moulin, "measuring reactive power in energy meters", metering international, 2002.
[5] p. petković, d. stevanović, "detection of power grid harmonic pollution sources based on upgraded power meters", journal of electrical engineering, vol. 65, no. 3, pp. 163-168, 2014.
[6] m. dimitrijević, v. litovski, "power factor and distortion measuring for small loads using usb acquisition module", journal of circuits systems and computers, world scientific publishing co. pte. ltd., singapore, vol. 20, no. 5, pp. 867-880, august 2011.
[7] d. stevanović, p. petković, "the efficient technique for harmonic sources detection at power grid", przegląd elektrotechniczny, pp. 196-199, 2012.
[8] j. g. webster, the measurement, instrumentation, and sensors handbook, ieee press, 1999, chapter 40.
[9] g. k. singh, "power system harmonics research: a survey", european transactions on electrical power, vol. 19, pp. 151–172, august 2007.
[10] integral energy power quality centre, "harmonic distortion in the electric supply system", technical note no. 3, march 2000.
[11] electric power research institute (epri), "accuracy of digital electricity meters", may 2010.
[12] a. i. lance, "a high accuracy standard for electricity meters", schneider electric, april 2011.
[13] iec 62053-22, electricity metering equipment (a.c.) – particular requirements – static meters for active energy (classes 0.2s and 0.5s).
[14] iec 62053-23 (2003), electricity metering equipment (a.c.) – particular requirements – part 23: static meters for reactive energy (classes 2 and 3).
[15] j. driesen, v. craenenbroeck, d. v. dommelen, "the registration of harmonic power by analog and digital power meters", ieee trans. on instrumentation and measurement, vol. 47, no. 1, february 1998.
[16] ewg multi metering solutions, www.ewg.rs.
[17] http://datasheets.maximintegrated.com/en/ds/71m6533-71m6534h.pdf.
[18] s. puzović, b. koprivica, a. milovanović, m. đekić, "analysis of measurement error in direct and transformed-operated measurement system for electric energy and maximum power measurement", facta universitatis, series: electronics and energetics, vol. 27, no. 3, pp. 389-398, september 2014.
[19] k. k. sum, "improved valley-fill passive current shaper", in proceedings of the international power conversion intelligent motion conference, 1997, pp. 43-50.
[20] dylan dah-chuan lu, "analysis of an ac-dc valley-fill power factor corrector (vfpfc)", ecti transactions on electrical eng., electronics, and communications, vol. 5, no. 2, pp. 23-28, august 2007.
[21] s. uddin, h. shareef, a. mohamed, "power quality performance of energy-efficient low-wattage led lamps", measurement, vol. 46, 2013, pp. 3783-3795.
[22] s. uddin, h. shareef, a. mohamed, m. a. hanan, "harmonics and thermal characteristics of low wattage led lamps", przegląd elektrotechniczny (electrical review), r. 88 nr 11a/2012, pp. 266-271.
[23] https://www.youtube.com/watch?v=lqhffcgctew.
[24] c. hammerschmidt, "research project illuminates oleds' industrial future", eetimes, dec. 2011.
[25] b. jovanović, m. damnjanović, d. stevanović, "the decomposition of dsp control logic block", in proceedings of the small system simulation symposium 2012, niš, serbia, 2012, pp. 119-124.

facta universitatis series: electronics and energetics vol. 34, no 1, march 2021, pp. 37-51
https://doi.org/10.2298/fuee2101037b
© 2021 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper

integrated green submersible pumping system for future generation

bidrohi bhattacharjee1*, pradip kumar sadhu1, ankur ganguly2, ashok kumar naskar2
1department of electrical engineering, indian institute of technology (indian school of mines), dhanbad, jharkhand, india
2department of electrical engineering, techno international batanagar, kolkata, india

abstract. in this system solar power is used for cultivation. solar photovoltaic cells convert solar energy into electricity through the solar photovoltaic (spv) effect. the generated dc voltage is then converted into ac voltage by the pump controller, and this ac voltage is used as the input of a variable frequency drive (vfd). the vfd acts as a motor controller that controls the submersible pump motor by varying the frequency and voltage of its input power supply. the vfd is associated with the pump controller. the output of the pump controller is a regulated three-phase ac voltage, which is directly connected to the submersible pump.
the water drawn from bore wells by the solar water pump is supplied for irrigation as required. this is a fully off-grid system appropriate for village areas. the main goal of the system is to use solar energy at a minimum running cost. the solar powered project is completely eco-friendly, and the plant is relatively clean and needs little maintenance. the project helps to reduce the cost of electricity, as well as to minimize the overall agricultural cost.

key words: variable frequency drives (vfd), submersible pump, solar panel, irrigation, pump controller

received may 4, 2020; received in revised form october 23, 2020
corresponding author: bidrohi bhattacharjee, research scholar, department of electrical engineering, indian institute of technology (indian school of mines), dhanbad, jharkhand-826004, india, e-mail: onlybidrohi@gmail.com

1. introduction

water is a basic human necessity, but a large part of the nepalese population is deprived of access to hygienic, safe and adequate water. nepal is a landlocked country, and most villages in nepal have to rely on small brooks flowing from the mountains; villagers have to travel for hours to get safe water. still, safe and unpolluted water is often unavailable. one of the main reasons is that the water level is falling below the normal ground level, both naturally and because of anthropogenic deposits. this creates many difficulties for rural regions in collecting sufficient water from normal tube wells or with pumps driven by renewable energy sources [1]-[4]. normally, submersible pumps are used to solve such problems. these pumps are best suited to collecting water in places with very low water levels; the main remaining challenge is the unavailability of electricity.
the common issue in the rural communities of nepal is that they suffer from power cuts every day, so driving the submersible pumps has become a nightmare that severely affects their irrigation. a solar photovoltaic (spv) system has been implemented to overcome the electricity issues [5]-[8]. the system uses renewable sources of energy and has become a largely successful project that meets the villagers' need for water without obstacle, using completely green energy [9]-[12]. the water head ranges from 45 to 50 meters, with a water discharge of 36,000 liters per hour. this paper discusses the execution and implementation of an integrated solar photovoltaic system that generates electricity for running a submersible pump at minimum running cost [10]-[18]. the spv system contains photovoltaic cells in a series-parallel combination to achieve the required voltage. the cells arranged in series-parallel in the panel absorb solar energy [19]-[22] and convert it into electrical energy as dc voltage. the dc voltage, however, does not remain steady, because the intensity of sunlight changes during the day. the pump controller maintains a constant voltage during operation to provide water for farming work [23]-[25].

2. pump installation and operation

the solar submersible pump installation was performed in the following steps. first, the purpose and nature of the irrigation requirement of the specific village are surveyed, as well as the climate and the nature of the soil in the area. after that, a hydraulic analysis of the pumping system is required to calculate the depth of the underground water level. wind flow and wind pressure also play a vital role, so a wind flow calculation is required. in this project the solar modules withstood wind flow of up to 150 km/h [26].
the sailing effect is caused by strong wind flow of up to 150 km/h along the ground level, which is extremely detrimental to photovoltaic installations. therefore, during the installation, the behavior of the structure and pv module at wind speeds of 140 km/h to 150 km/h was observed in a wind tunnel test, and the solar panels passed the test successfully. as the structures were laid by pvh (pvhardware), they will not be affected by bad weather. the working temperature of the solar module is -20°c to 55°c with ip65 protection. the peak photovoltaic power required to drive the 7.5 hp submersible pump must be determined, and the size of the pv panel is chosen to deliver the required output power. during pv array layout, the shadow effect must be minimized for optimum and uninterrupted electrical output. shadows occur because of permanent structures, trees, or other types of civil constructions. the inter-row spacing of solar modules can also cause shadowing. to avoid shadowing due to row spacing, the following relation is used:

$$\text{row spacing} = \text{height of solar module} \times \frac{\cos(\text{azimuth angle})}{\tan(\text{altitude angle})} \tag{1}$$

in this project, row spacing = 4 m × cos 85° / tan 15° = 4 m × 0.087 / 0.26 = 1.34 m (approx. 1.5 m).

with this layout, electric power is generated by the solar panels. the photovoltaic arrays are made up of a combination of solar panels that convert sunlight into electricity, which is used for driving the motor and submersible pump set. the electrical energy is supplied through cables to the electric motor that runs the pumping system. the pumping system draws water from the bore well: the rotation of the motor shaft, which is attached to the pump, makes the pump collect the underground water and supply it to the fields. this system demands a shadow-free area for installation of the solar panels.

2.1. performance ratio and plant yield calculation

the total energy generated per annum by a solar operated plant to meet the consumers' demand is called the plant yield.

$$pr = (1-\alpha) \times (1-t) \times (1-s) \times (1-v) \times \eta_{inverter} \tag{2}$$

$$\text{plant yield} = p_{sun} \times p_{rated} \times pr \times 365 \tag{3}$$

where pr is the performance ratio of the plant; psun is the average solar irradiation per day (kwh/m²/day), i.e. the solar radiation received by a surface at the panel installation site; prated is the dc power produced by the particular solar module under standard test conditions; α is the manufacturer's tolerance, meaning the permissible limits of variation in the physical dimensions of a component of the pump (the value of α is 0.96 for a 17-stage submersible pump [27]); t is the percentage loss of efficiency due to diverse factors such as ohmic losses and pv losses due to temperature and irradiance; s denotes the solar module soiling loss; v is the impedance loss of the power cable; and ηinverter is the efficiency of the inverter. significant factors affecting the rate of soiling are the wind velocity, the angle and direction of the panel, humidity, fog, dew, and the geographical location of the panel. the losses due to soiling vary from 1.5% to 6.2%, depending solely on the location of the pv plant.

2.2. methodology of total dynamic head calculation

the factors that contribute to the total dynamic head (tdh) model were identified. the factors are: vertical rise, pumping level, static water level, drawdown, dynamic water level, pump depth, well depth, and friction losses due to the insert coupling, threaded adapter (plastic to thread), standard tee (flow-through run), standard tee (flow-through side), gate valve and swing check valve. the schematic representation of the research is shown in fig. 1.

fig. 1 schematic diagram of total dynamic head of a submersible pump

the frictional loss is calculated from the data in table 1.

table 1 practical data related with submersible pump installation
parameter | acronym | value
friction loss | fl | 0.55 m
pumping level | pl | 158.50 m
total length of pipe | lt | 281.03 m
vertical rise | vr | 61.11 m
fittings equivalent of pipe | fe | 0.91 m
number of same fittings | nf | 4

the frictional loss of the fittings, in meters, is

$$fl = \left[ lt + \sum (nf \times fe) \right] \times fh \times 30.48^{-1} \tag{4}$$

in this project the values of vertical rise and pumping level are predetermined, so the total dynamic head can be written as

$$tdh = pl + vr + \left[ lt + \sum (nf \times fe) \right] \times fh \times 30.48^{-1} \tag{5}$$

$$tdh = pl + vr + fl \tag{6}$$

applying equation (5) with the values in table 1 gives tdh = 737.31 feet (224.88 m).

for calculating the pump capacity, the following factors were considered for the model village in nepal with a population of 2347:
total water demand = 70 klit
average solar energy per day = 6 hrs
average water required per hour = 12 klit
water required per minute = 12000 / 60 = 200 lit/minute
power rating in watt = hp × 746 watt = 7.5 × 746 = 5595 watt (7)

2.3. pump capacity design

daily water demand = 70 lpcd
7.5 hp pump discharge = 600 lpm for 48 m head
total water discharge in an hour = 600 × 60 = 36,000 lit; in 6 hrs = 36,000 × 6 = 216,000 lit (8)

2.4. pump specification

power rating of pump = 7.5 hp or 5.59 kw
water discharge = 600 lpm
water head = 48 m
voltage: 280-440 v
stages: 19
housing material: stainless steel
pump bush material: gun metal
insulation class: b
bore size: 4 inch or 101.6 mm
coating: ced (cathodic electro-deposition)
protection class: ip68

2.5. panel design

one panel of 1 m length × 2 m width = 0.15 kw
total wattage of 7 panels = 1 kw
for the 7.5 hp load, total number of 150 watt panels required = (7.5 × 746) ÷ 150 = 38 nos. (9)
area required for panel fittings = 1 × 2 × 38 = 76 sqm (10)
adding 100% more space for the gaps between the panels and the area around them, the total working area = 76 sqm × 2 = 152 sqm (11)

3. technical specifications of solar water pump controller system

in this project a 3-phase solar pump controller is used; it is an electronic device that combines an inverter, maximum power point tracking (mppt) and a variable frequency drive (vfd). the maximum output pressure of the pump controller is greater than 0.55 bar. the system works at its optimum condition throughout the day at any intensity of sunlight. the starting current of an induction motor is very high and nonlinear compared with its running current, which is almost regulated. the vfd in the controller eliminates the high starting current of the induction motor, so the motor starts very smoothly. for the 7.5 hp submersible pump, the vfd has an ip 65 rated enclosure to endure severe environments. the variable frequency drive shall be suitable for operating at a voltage of 415 v with ±10% tolerance and an input supply frequency of 50 hz with ±5% tolerance. the frequency regulation shall be ±0.1% of the rated maximum frequency under steady state and ±5% during transient conditions. the output frequency of the drive shall vary from 2.5 hz to 50 hz. the maximum overload of the vfd is 150% of rated current for 60 s, and the drive works properly at ambient temperatures from -10 to +40°c with a maximum humidity of 95%.

4. pump efficiency calculation

flow rate (q) = 120 m3/h
water head (h) = 48 m
input power to pump = 21 kw
the hydraulic power in kw is given by:

$$\text{hydraulic kw} = \frac{q\,[\mathrm{m^3/s}] \times h\,[\mathrm{m}] \times \rho\,[\mathrm{kg/m^3}] \times g\,[\mathrm{m/s^2}]}{1000} = \frac{(120/3600) \times 48 \times 1000 \times 9.81}{1000} = 15.69\ \text{kw} \tag{12}$$

pump efficiency = hydraulic kw / input power to the pump = (15.69 ÷ 21) × 100 = 74.71%

this efficiency is sufficient for farming in the area. the submersible pump has an outlet diameter of 4 inches and a water discharge of 600 lpm. the irrigation problem in this area is resolved with the submersible pump running for an average of 6 hours per day. the power of polluted pv cells was reduced by about 12%, while naturally cleaned cells lost about 8% compared with clean cells [28]. table 2 shows a series of data obtained by direct measurements in the agricultural field. the table indicates how the pump efficiency varies with the water head, and the maximum efficiency for a particular water level.

table 2 optimum efficiency
water head (m) | pump efficiency (%)
25 | 38
30 | 47
35 | 62
40 | 80
45 | 75
50 | 78
55 | 75
60 | 70
65 | 65

the relationship between efficiency and the inlet width of the pump diffuser is shown in fig. 2. this figure also shows the locations of the optimum operating point efficiency and the forecast of maximum efficiency.

fig. 2 pump maximum efficiency curve with maximum operating points

table 3 the optimum flow rate with different inlet and corresponding efficiency
diffuser inlet width (b3) in mm | single-stage head (h) in m | efficiency (%)
40 | 16 | 77.4
45 | 17.01 | 81.0
50 | 17.20 | 83.7
55 | 13.87 | 75.9

table 3 gives the results of various experiments, which reveal that the efficiency of a submersible pump varies with parameters such as the width of the diffuser inlet, the water flow and the water head. the diffuser inlet width is measured with an inside caliper and a measuring tape.
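the hydraulic power and efficiency calculation of (12) can be sketched as follows. the sketch assumes that the flow figure of 120 is a flow per hour, as the division by 3600 in (12) implies; the function names `hydraulic_power_kw` and `pump_efficiency_percent` are illustrative and do not come from the paper.

```python
RHO_WATER = 1000.0  # density of water, kg/m^3 (value used in (12))
G = 9.81            # gravitational acceleration, m/s^2


def hydraulic_power_kw(flow_m3_per_h, head_m, rho=RHO_WATER, g=G):
    """Hydraulic power in kW for a flow in m^3/h lifted through head_m metres."""
    flow_m3_per_s = flow_m3_per_h / 3600.0
    return flow_m3_per_s * head_m * rho * g / 1000.0


def pump_efficiency_percent(hydraulic_kw, input_kw):
    """Pump efficiency as hydraulic power over input power, in percent."""
    return 100.0 * hydraulic_kw / input_kw


# Values from section 4: q = 120 (assumed m^3/h), h = 48 m, input power 21 kW
p_hyd = hydraulic_power_kw(120.0, 48.0)
eta = pump_efficiency_percent(p_hyd, 21.0)
print(f"hydraulic power = {p_hyd:.2f} kW, efficiency = {eta:.1f} %")
```

with the section 4 inputs this yields about 15.7 kw and 74.7%, matching the paper's figures up to rounding (the paper truncates the hydraulic power to 15.69 kw before dividing).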
an inside caliper consists of two adjustable jaws or legs for measuring the diffuser inlet width. the right side of the caliper has an adjusting screw and nut. to measure the diffuser inlet width, the jaws of the caliper are first adjusted to the inlet width with the help of the adjusting screw and nut. the gap between the lower parts of the jaws is then measured with either a scale or a measuring tape; that reading is the diffuser inlet width.

5. pump control technology

the frequency regulation system, implemented as the vfd of the project, is very effective in maintaining the flow of water at a certain pressure during pump operation. at the same time, this technology prevents water wastage and controls power consumption. the operation of the vfd relies on varying the input voltage and frequency of the pump motor. a variable frequency drive (vfd) is a kind of motor controller that runs an electric motor by altering the frequency and voltage of its power supply. the vfd can also control the ramp-up and ramp-down of the motor during start and stop, respectively. a vfd can adjust the supplied power to match the energy requirement of the driven equipment, and this is how it conserves energy or optimizes energy consumption. the vfd is the innovation that brought ac pump motors back into widespread use. the speed of an ac induction motor can be modified by changing the frequency of the voltage used to power it. this means that if the voltage applied to an ac motor has a frequency of 50 hz (used in countries like india), the motor operates at its rated speed. the pump controller manages the regulated voltage, so no battery is required in this project for backup. a voltage regulator is employed to supply continuous power to the microcontroller.
the microcontroller regulates the switching of the pump, which in turn provides water to the crops as per their needs. the water level in the overhead water tank is controlled by a water level controller, which runs the pump to supply water from the water reservoir. ac pumps are utilized in this system as they are cost-effective and efficient. the pumps require an ac supply, whereas the solar panels provide dc power. therefore, an inverter is used to convert this dc supply to an ac supply. fig. 3 shows the simulated waveform obtained with matlab/simulink software. this software has been used to make the project more acceptable. in fig. 4, the vertical axes of the two separate plots indicate the flow rate in m3/s and in lpm, and the horizontal axis indicates time. the simulation shows the relationship between the flow rate and time. both curves show that the pump starts to deliver water after almost 0.12 sec, which is very fast. the pump flow rate is initially 5.35 × 10−3 m3/sec (320 lpm) and then increases to 5.85 × 10−3 m3/sec (350 lpm), which is slightly higher than the calculated value.

fig. 3 simulated waveform of the output voltage without pump controller

fig. 4 pump flow rate in [m3/s] and [lpm] against time

bright sunlight provides an illuminance of about 98000 lux on a perpendicular surface. the voltage coming directly from the solar array, without the influence of the pump controller, fluctuates. the generated dc voltage is unregulated and depends upon the intensity of light, so the output voltage is not constant throughout the day: as the intensity of light decreases, the generated voltage also decreases. with the solar pump controller, the voltage remains almost constant as the intensity of light changes.
when the sunshine varies during the solar hours, the power input to the controller also varies, and the variable frequency drive (vfd) varies its output to control the input voltage of the pump and the speed of the motor. the pump controller thus maintains a constant speed, and the output voltage of the controller is almost constant. the voltage vector of the inverter is supplied to the pump motor and holds a steady ratio of voltage to frequency (v/hz). the controller obtains feedback data from the driven motor and corrects the output voltage or frequency to the preferred values. the control system is entirely based on svpwm (space vector modulated pwm) and a soft-computing-based algorithm. this project is solely an integrated system and not design-based work. the solar inverter with mppt v/f drive gives the maximum torque even at minimum sunlight. no battery is required in this project. the panel is directly connected to the pump controller, and the output of the controller is connected to the submersible pump motor. the dsp tracks the point at which the maximum power can be extracted from the solar module or array by changing the pwm pattern and the modulation frequency, so that the pump motor always runs on the maximum power extracted from the panel and with a steady torque over a wide range of sunlight intensity, from morning till evening. this method provides 35% more energy and therefore 35% more pumped water. this additional water has greatly improved agriculture in that area.
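the steady voltage-to-frequency (v/hz) ratio described above can be illustrated with a short python sketch. it is a simplified assumption, not the authors' controller: the rated point is taken as the 414 v load voltage reported in table 4 at the 50 hz supply frequency mentioned in section 5, the pole count is hypothetical, and the function names are invented for this example. the real drive (svpwm with mppt and motor feedback) is considerably more involved.

```python
# illustrative constant v/hz sketch; rated point and pole count are assumptions
RATED_VOLTAGE_V = 414.0     # load voltage from table 4
RATED_FREQUENCY_HZ = 50.0   # supply frequency mentioned in section 5


def vf_voltage(frequency_hz):
    """Voltage command that keeps V/f constant, capped at the rated voltage."""
    ratio = RATED_VOLTAGE_V / RATED_FREQUENCY_HZ  # volts per hertz
    return min(ratio * frequency_hz, RATED_VOLTAGE_V)


def synchronous_speed_rpm(frequency_hz, poles=2):
    """Synchronous speed of an ac induction motor: 120 * f / poles."""
    return 120.0 * frequency_hz / poles


if __name__ == "__main__":
    for f in (20.0, 35.0, 50.0):
        print(f"{f:4.0f} Hz -> {vf_voltage(f):6.1f} V, "
              f"{synchronous_speed_rpm(f):6.0f} rpm (2-pole assumption)")
```

lowering the frequency lowers the voltage in proportion, which is how a drive of this kind can slow the motor under weak sunlight while keeping the motor flux, and hence the available torque, roughly unchanged.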
this project was created to solve an agricultural problem in a particular area. fig. 5 shows the output power, or shaft power, of the pump [kw] against time [sec]. the shaft power of the pump follows the same trend as the motor speed; the full-load shaft power is 5.5 kw.

fig. 5 shaft power (kw) of pump vs time (sec)

fig. 6 single line block diagram of submersible pump operation

fig. 6 shows the single line diagram of the control circuit of the submersible pump. when sunlight falls on the solar array, an unregulated dc voltage is generated photovoltaically. this voltage is converted to the appropriate 3-phase ac voltage for the pump motor by the pump controller. the output terminals of the pump controller are directly connected to the pump motor. the pump controller provides the rated power at the specified voltage to the motor. the pump motor is fully submerged in underground water. the motor is hermetically sealed and close-coupled to the body of the pump. the submersible pump pushes water to the ground surface by converting the rotating mechanical energy of the motor into kinetic energy. water is first pulled into the pump at the intake, where the rotation of the impeller pushes it through the diffuser. from the diffuser, the water is driven at high pressure through the pipelines to the ground surface. later, the water is used for farming.

table 4 test result
panel voltage dc (volt) | load voltage (volt) | load (submersible pump) (kw) | load current (amp)
24 | 414 | 5.595 | 13

real-time experimental results are given in table 4. the results were obtained by connecting the metering devices directly to the pump circuit.
the experimental results of this project show that the theoretical efficiency (about 74.7%) is almost equal to the efficiency calculated from the practical data.

$$P_{in} = \sqrt{3} \times V_L \times I_L \times \cos\phi = \sqrt{3} \times 414 \times 13 \times 0.8 = 7457.5\ \mathrm{W} \tag{13}$$

output power = 5595 w (7.5 hp)

practical efficiency = (5595 / 7457.5) × 100 = 75%

pump performance curve

the characteristic curve of a submersible pump is the relation between the flow rate produced and the total head. the pump performance curve helps to choose the proper pump rating for a particular project. the pump performance curve in fig. 7 shows the relation between the total head (m) and the water flow rate (m3/s). the curve droops: the flow rate decreases as the total head of the pump increases, and the maximum water flow is available at the minimum water head.

fig. 7 pump performance curve

determining the optimum operating point

the best operating point of a submersible pump is typically determined from the fluid flow and the total head at maximum working efficiency.
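the practical efficiency obtained from equation (13) and table 4 can be re-computed with a few lines of python. the sketch below is only a check of the arithmetic; the function names are illustrative, and the power factor of 0.8 is the value used in equation (13).

```python
import math

# illustrative check of equation (13); function names are hypothetical


def three_phase_input_power_w(line_voltage_v, line_current_a, power_factor):
    """Three-phase electrical input power: sqrt(3) * V_L * I_L * cos(phi)."""
    return math.sqrt(3.0) * line_voltage_v * line_current_a * power_factor


def practical_efficiency_percent(output_w, input_w):
    """Ratio of output power to electrical input power, in percent."""
    return output_w / input_w * 100.0


if __name__ == "__main__":
    # values from table 4: 414 V, 13 A, 5.595 kW load
    p_in = three_phase_input_power_w(414.0, 13.0, 0.8)
    eff = practical_efficiency_percent(5595.0, p_in)
    print(f"input power: {p_in:.1f} W, practical efficiency: {eff:.1f} %")
```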
the optimum operating point of the submersible pump depends on the impeller type. the characteristic equation of the impeller can be derived from the basic equation of the submersible pump as follows, where $Q$ is the pump flow, $N$ is the pump speed in rpm, $\eta_h$ (also written $\eta$) is the hydraulic efficiency, $\psi$ is the impeller outlet exclusion coefficient, $D$ is the impeller outlet diameter, $p$ is the theoretical head correction factor and $\beta$ is the impeller blade outlet angle:

pump flow q = 50 gm/m
impeller outlet diameter d = 150 mm
theoretical head correction factor p = 1.42
impeller outlet exclusion ψ = 26 mm
impeller blade outlet angle cot β = cot 65°
hydraulic efficiency η = 81.4%
pump speed n = 3200 rpm

$$H = \frac{\eta_h}{(1+p)\,g}\left(\frac{\pi D N}{60}\right)^2 - \left(\frac{\eta_h}{(1+p)\,g} \times \frac{N \cot\beta}{60\,\psi\,\eta}\right) Q \tag{14}$$

with $\eta_h = 81.4\%$ substituted in the first term:

$$H = \frac{81.4\%}{(1+p)\,g}\left(\frac{\pi D N}{60}\right)^2 - \left(\frac{\eta_h}{(1+p)\,g} \times \frac{N \cot\beta}{60\,\psi\,\eta}\right) Q$$

the water flow was measured with a magnetic flow meter, whereas the total head of the water was measured manually with a measuring tape. fig. 8 and fig. 9 show the flow rate for decreasing and increasing water head, respectively; the horizontal axis represents the total head and the vertical axis the water flow rate.

fig. 8 total water head vs flow rate curve (decreasing mode)

fig. 9 total water head vs flow rate curve (increasing mode)

fig. 10 to fig. 14 show the different parts of the solar submersible pump unit and the installation method of the pump.

fig. 10 overall solar integrated submersible pump
fig. 11 pump outlet
fig. 12 pump installation technique
fig. 13 pump controller
fig. 14 variable frequency drive

6. degradation of solar power and efficiency

one of the disadvantages of solar systems is that the output of all solar panels begins to degrade over time.
as the years progress, the output of solar panels starts to decline due to micro cracks that develop in the silicon solar cells. as the degradation accumulates, the panels can eventually become unusable. the main cause of solar panel degradation is that materials expand and contract at different rates with temperature changes, which puts the joints between different materials under strain and causes slow deterioration. solar panels are also damaged by humidity and excessive heat. according to the specification, the depreciation of output power and efficiency of a monocrystalline solar cell is minimal compared to other cell types, but the warranty period still implies that after 20 years both the quality and the efficiency of the panel will have decreased at a certain rate. solar panel efficiency is the ratio of the converted electrical energy to the light energy falling on the panel. panels are typically about 20% efficient as per specification. this plant will give at least 97% of normal power in the first year. for the next 10 years it will run at 91% of total power, and it will continue to provide 83% of the total power for up to 20 years.

7. cost analysis and payback calculation

although the initial cost of solar power is somewhat higher, it is comparatively more profitable than conventional electricity. this project shows that using solar power instead of conventional electrical power is profitable.

table 5 cost of overall project
sl no | name of material | cost in indian currency (inr)
1 | monocrystalline solar panel (number of panels: 38, maximum power = 150) | 55/per watt. (40x150) = 6000/- (per panel). total panel cost: 6000 (each) x 38 = 228000
2 | submersible pump (phase: 3 phase, motor power: 7.5 hp or 5.5 kw, voltage: 280–440 v) | 26000/-
3 | solar pump controller with ac drive for 7.5 hp submersible pump | 16000/-
4 | cost of mounting structure with installation cost | 72000/-
5 | cost of cable | 15000/-
6 | labor cost | 20000/-
| total cost | 382000/-

the submersible pump draws a load of 7.5 hp (5595 w) for 6 hours per day over one month, and the consumer tariff is 9 rs (indian rupee) per unit. indian currency is prevalent in the markets of india and nepal. consider 1 unit = 1 kwh.

$$\text{total energy in a month} = 5595\ \mathrm{W} \times 6\ \mathrm{h} \times 30\ \text{days} = 1007100\ \mathrm{Wh} \tag{15}$$

so the total consumed units are 1007100 / 1000, i.e. about 1007 units per month on average. the cost per unit is 9 rs.

$$\text{monthly electricity bill} = 1007 \times 9 = 9063\ \text{rs} \tag{16}$$

$$\text{average annual electricity cost} = 9063 \times 12 = 108756\ \text{rs} \tag{17}$$

the initial installation and investment cost of the solar system is comparatively very high, but the warranty period of pv panels is long, almost 20 years. manufacturers give a guarantee of about 5 years for pump controllers and other accessories. various manufacturers also offer a linear performance guarantee of 5 years for the submersible pump. the overall investment cost of the project is 382000/- as per table 5. the average annual cost of electricity from the conventional energy source is 108756 rs. if the depreciation of solar power is not taken into account, the payback period is around 3.5 years; if it is taken into account, the payback period is about 5 years.

8. conclusion

in the village areas of nepal, access to electricity through the conventional supply system is very poor and the maintenance of the electrical power system is not much better, as frequent power cuts are a common incident; moreover, the cost of electricity is very high for common people.
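for reference, the cost arithmetic of section 7 (equations (15) to (17)) and the simple payback period without depreciation can be reproduced with the python sketch below. the function names are illustrative assumptions; the monthly energy is truncated to whole units, as in the text, and the effect of panel degradation on the payback period is not modelled.

```python
# illustrative check of equations (15)-(17); function names are hypothetical


def monthly_units_kwh(load_w, hours_per_day, days):
    """Monthly energy in whole units (1 unit = 1 kWh), truncated as in the text."""
    energy_wh = load_w * hours_per_day * days
    return energy_wh // 1000


def annual_electricity_cost(units_per_month, tariff_per_unit):
    """Annual bill from the monthly units and the tariff in rs per unit."""
    return units_per_month * tariff_per_unit * 12


def simple_payback_years(investment, annual_saving):
    """Investment divided by the yearly saving, ignoring panel degradation."""
    return investment / annual_saving


if __name__ == "__main__":
    units = monthly_units_kwh(5595, 6, 30)       # equation (15)
    annual = annual_electricity_cost(units, 9)   # equations (16) and (17)
    payback = simple_payback_years(382000, annual)
    print(f"{units} units/month, {annual} rs/year, payback {payback:.2f} years")
```

the result of roughly 3.5 years matches the payback quoted in section 7 for the case without depreciation.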
by eliminating the dependence on conventional electricity and using solar power, it is possible to improve the quality of agriculture by supplying the required amount of water throughout the day. solar energy is the main input of the pump and it takes no money to get solar energy. so, this project reduces the running cost of electricity. moreover, daily maintenance cost of this project is very low because the only maintenance is cleaning the dust from the panel from time to time. this method does not require much human resources throughout the day and it is possible to use the stored ground water properly for better quality of agriculture. as a result, the cost of farming in that village has dropped significantly, and the economic development of the people of that area has been remarkable. in addition to agricultural job, the water obtained in this way meets the demand for daily life in the area. references [1] s. rahman, alternative energy sources: the quest for sustainable energy. ieee power and energy magazine, 2007. [2] r. foster, m. ghassemi and a. cota, solar energy: renewable energy and the environment. crc press, 2009. [3] s. miric and m. nedeljkovic, "the solar photovoltaic panel simulator", rev. roum. sci. techn. électrotechn. et énerg., vol. 60, no. 3, pp. 273–281, 2015. [4] r. kumar, b. singh, "buck boost converter fed bldc motor drive for pv array based water pumping", in proceedings of the ieee international conference on power electronics, drives and energy systems (pedes), mumbai, india, 2014, pp. 1–6, 16–19. [5] h. wang and d. zhang, "the stand-alone pv generation system with parallel battery charger", in proceedings of the international conference on electrical and control engineering (icece’10), wuhan, china, 2010, pp. 4450-4453. [6] m. kolhe, "techno-economic optimum sizing of a stand-alone solar photovoltaic system", ieee trans. energy convers., vol. 24, no. 2, pp. 511–519, 2009. [7] d. debnath and k. 
chatterjee, "a two stage solar photovoltaic based standalone scheme having battery as energy storage element for rural deployment", ieee trans. ind. electron., vol. 62, no. 7, pp. 4148–4157, 2015. [8] s. krithiga and n. g. a. gounden, "power electronic configuration for the operation of pv system in combined grid-connected and stand-alone modes", iet power electron., vol. 7, no. 3, pp. 640–647, 2014. integrated green submersible pumping system for future generation 51 [9] i. j. balaguer-álvarez and e. i. ortiz-rivera, "survey of distributed generation islanding detection methods", ieee latin amer. trans., vol. 8, no. 5, pp. 565–570, 2010. [10] c. a. hill, m. c. such, d. chen, j. gonzalez and w. m. grady, "battery energy storage for enabling integration of distributed solar power generation", ieee trans. smart grid, vol. 3, no. 2, pp. 850–857, 2012. [11] w. xiao, f. f. edwin, g. spagnuolo and j. jatskevich, "efficient approaches for modeling and simulating photovoltaic power systems", ieee j. photovoltaics, vol. 3, no. 1, pp. 500–508, 2013. [12] s. jain, a.k. thopukara, r. karampuri and v.t. somasekhar, "a single-stage photovoltaic system for a dual-inverter-fed open-end winding induction motor drive for pumping applications", ieee trans. power electron., vol. 30, no. 9, pp. 4809–4818, 2015. [13] le an and d.d.-c. lu, "design of a single-switch dc/dc converter for a pv-battery-powered pump system with pfm+pwm control", ieee tran. ind. electron., vol. 62, no. 2, pp. 910–921, 2015. [14] q. yan, x. wu, x. yuan, y. geng and q. zhang, "minimization of the dc component in transformer less three-phase grid-connected photovoltaic inverters", ieee trans. power elec., vol. 30, no. 7, pp. 3984–3997, 2015. [15] a. sangwongwanich, y. yang and f. blaabjerg, "high-performance constant power generation in gridconnected pv systems", ieee trans. power elec., vol. 31, no. 3, pp. 1822–1825, 2016. [16] l. campanhol, s. silva, a. junior and v. 
bacon, "dynamic performance improvement of a grid-tied pv system using a feed-forward control loop acting on the npc inverter currents", ieee trans. ind. electron., vol. 64, no. 3, pp. 2092–2101, 2017. [17] a. radwan and y. mohamed, "power synchronization control for grid-connected current-source inverter-based photovoltaic systems", ieee trans. energy conv., vol. 31, no. 3, pp. 1023–1036, 2016. [18] g. n. tiwari and s. dubey, fundamentals of photovoltaic modules and their applications, centre for energy studies, indian institute of technology (iit) delhi, new delhi, india, rsc publishing, 2010, pp. 99–100. [19] p. singh and n.m. ravindra, "temperature dependence of solar cell performance an analysis", solar energy materials & solar cells, vol. 101, pp. 36–45, 2012. [20] a. m. nader, d. abderrahmane and a. said, "optimization of the performance of a photovoltaic pumping system by neuro fuzzy and direct torque control", rev. roum. sci. techn. – électrotechn. et énerg., vol. 59, no. 3, pp. 279–289, 2014. [21] p.k.r. reddy and j.n. reddy, "photovoltaic energy conversion system for water pumping application", int. j. emerg. trends electr. electron., vol. 10, no. 2, pp. 2320-2369, 2014. [22] s. biswas and m. tariq iqbal, "dynamic modeling of a solar water pumping system with energy storage", hindawi journal of solar energy, vol. 2018, id 8471715, 2018. [23] s.s. chandel, m. nagaraju naik, v. sharma and r. chandel, "degradation analysis of 28 year field exposed mono-c-si photovoltaic modules of a direct coupled solar water pumping system in western himalayan region of india", renew. energy, vol. 78, pp. 193–202, 2015. [24] s. lal, p. kumar and r. rajora, "techno-economic analysis of solar photovoltaic based submersible water pumping system for rural areas of an indian state rajasthan", sci. j. energy eng., vol. 1, no. 1, pp. 1-4, 2013. [25] s.s. chandel, m. nagaraju naik and r. 
chandel, "review of solar photovoltaic water pumping system technology for irrigation and community drinking water supplies", renewable and sustainable energy reviews, vol. 49, pp. 1084-1099, 2015. [26] z. zhang and t. stathopoulos, "wind loads on solar panels mounted on flat rooftops: progress and limitations", in proceedings of the 2014 world congress on advance in civil, environmental, and materials research (acem 14), busan, korea, 2014. [27] k. ragu and p.v. mohanram, "tolerance design of multistage radial flow submersible pumps", mechanika, vol. 1, no. 63, pp. 64-70, 2007. [28] m. t. chaichan, b. a. mohammed and h. a. kazem, "effect of pollution and cleaning on photovoltaic performance based on experimental study" int. j. sci. eng. res., vol. 6, no. 4, pp. 594-601, 2015. 10918 facta universitatis series: electronics and energetics vol. 36, no 2, june 2023, pp. 171-188 https://doi.org/10.2298/fuee2302171k © 2023 by university of niš, serbia | creative commons license: cc by-nc-nd original scientific paper triple-band stub loaded patch antenna with high gain for 5g sub-6 ghz, wlan and wimax applications using dgs lalit kumar, vandana nath, bvr reddy university school of information, communication and technology, guru gobind singh indraprastha university, new delhi, india abstract. microstrip antennas have become ubiquitous in today's wireless communication world due to their low profile, low cost, and simplicity in fabricating on circuit boards. however, poor performance characteristics, such as limited bandwidth, low power handling capabilities, and low gain, limit their applicability in various instances. path loss will be substantial in 5th generation (5g) wireless communication due to the utilization of high-frequency bands. a high-gain antenna with a small size is necessary to address this issue. 
a compact tri-band slotted monopole antenna with high and consistent gain employing a defected ground plane structure (dgs) has been investigated and implemented in this study. the proposed antenna uses three inverted l-shaped stubs connected to the radiating element to cover the desired bands while keeping the antenna size small. the designed antenna has two key characteristics: (i) wide bandwidth and (ii) reasonable gain. the antenna covers 2.45 and 5.6 ghz wlan, 2.4 ghz wi-fi, 2.5 and 5.2 ghz wimax and the 3.7 ghz sub-6 ghz band of 5g for mobile communication. the overall substrate size of the antenna is 30 × 17 × 1.6 mm3 and the electrical dimensions are 0.49 λl × 0.28 λl × 0.026 λl, where λl is the free-space wavelength at 2.45 ghz. the measured reflection coefficient (s11 < -10 db) covers 2.4–2.52 ghz (bandwidth 112 mhz), 3.4–4.1 ghz (bandwidth 700 mhz) and 5.2–6.6 ghz (bandwidth 1359 mhz), with fractional bandwidths of 5.1% in the lower frequency band, 18.6% in the mid frequency band and 23.7% in the high frequency band. a prototype antenna has also been developed using an inexpensive, low-profile 1.6 mm thick fr-4 (εr = 4.4) substrate. the measured peak gains are 1.35 db at 2.45 ghz, 2.55 db at 2.65 ghz and 3.8 db at 5.5 ghz. the simulated results have been validated against experimental measurements, and the outcomes are consistent. the proposed antenna design is very compact and easy to fabricate due to the absence of vias.

key words: 5g sub-6 ghz, slotted patch antenna, monopole, multiband, wide bandwidth

received july 19, 2022; revised september 19, 2022 and october 15, 2022; accepted october 18, 2022
corresponding author: lalit kumar, university school of information, communication and technology, guru gobind singh indraprastha university, new delhi, india, e-mail: lalitkr12@yahoo.com

1.
introduction the advancement of mobile technology and the continuously reducing size of mobile devices has necessitated antenna designers to design antennas that can operate across multiple frequency bands while still compact and having adequate gain and efficiency. antennas with multiple frequency bands and small sizes can be employed in mobile devices. furthermore, high-gain antennas are highly beneficial for satellite communication applications. broadband antennas have also grown in popularity in recent years. however, while the microstrip patch antenna offers the advantages of low cost and tiny size, it has the limitations of poor gain and narrow bandwidth. other requirements include a low profile, simple design, ease of fabrication and low cost. these make planar microstrip patch antennas a better and more popular choice [1]. in addition, it is challenging to cover multiple bands while keeping the antenna size minimal with adequate bandwidth and gain [2]. the sub-6 ghz frequency offers high-speed data transfer over long distances owing to its low latency and high traffic density. these features of the sub-6 ghz frequency band are suitable for access points (ap) and base stations (bs) communication for machine-tomachine(m2m), internet of things (iot) and device-to-device (d2d) technologies, along with existing wimax, wlan, and lte bands [3]. a practical solution for such a system design is a multiband antenna with a frequency band selection capability. several printed monopole antennas with different geometries are demonstrated with a reduced size while increasing bandwidth to meet wlan, wimax, and 5g sub-6 ghz technology standards [3]. therefore, modifying an antenna's geometry to occupy a small volume is necessary to reduce its overall size. it is worth mentioning that stub matching and slotting techniques are widely used to maintain the compactness of the antenna. the stubs increase the current path length to get a fundamental resonating mode [4-8]. 
for improving impedance matching in multiband operations while reducing antenna volume, slots of different types and geometries are implemented [9-13]. in the present era of antenna design, one of the most important considerations is how to maximise the bandwidth of compact antennas. therefore, many researchers have suggested diverse techniques to develop a small antenna with broadband characteristics with multiband operations. some proposed methods include a thick substrate, shorting pins, active and passive devices, stacked patches, various feeding mechanisms and an impedance-matching network. defected ground structure (dgs) is one such type of bandwidth enhancement technique where some defects are introduced, or slots are carved in the ground plane to suppress the cross-polarisation radiation, reduce the antenna size and achieve the desired performance [14-18]. m. karthieyen et al. [7] proposed a tri-band antenna using t-shaped strips and rectangular slot defects at the ground side. the proposed antenna operates at three frequency bands, viz. 2.47-2.77 ghz, 3.3-3.7 ghz and 5.10-6.62 ghz. the fr-4 substrate dimensions are 33 × 17 × 1.6 mm3. the average gain ranges from 2 to 3.9 db, but the efficiency ranges from 60% to 80%, covering wlan and wimax bands only. using an arc-shaped dgs, a planar monopole antenna for triple-band operation is proposed in [11] for wireless lan and wimax bands. the proposed antenna combines monopole rings and a defective ground plane. monopoles are made up of a rectangular ring and a rectangular patch connected by a straight metal strip that may create lowerand middle-frequency bands and omnidirectional radiation patterns. the substrate is fr-4 with 1.6 mm thickness, but a relatively larger antenna size is 36 × 39 mm2. antenna resonates at 2.45 ghz, 3.55 ghz and 5.5 ghz and 1.22 db, 2.15 db and 4.06 db gain and 88%, 99% and 94% efficiencies in the respective triple-band stub loaded patch antenna with high gain for 5g sub-6 ghz, wlan... 
173 bands. a small slotted monopole antenna is presented by h. ahmed et al. [19] for wireless lan and wimax frequency bands. resonance in three bands has been achieved by etching a rectangle patch with bevel, pi, and inverted l-shaped slots. the antenna is 27.5 × 20 mm2 in size and operates in three bands: 2.37 2.52 ghz, 3.35 3.90 ghz, and 4.97 7.85 ghz. the antenna radiation pattern is almost omnidirectional, having 90 % efficiency and 4 dbi gain across the three wlan and wimax frequency ranges. in [20], a multiband monopole antenna is presented using a rectangular patch etched with two crossed cshaped slots and two e shaped slots in a curtailed ground plane. the total dimension of the antenna is 29 × 36 × 0.8 mm3, and the antenna casts on an fr-4 substrate. the antenna achieves a maximum gain of 2.5 dbi and maximum efficiency of 98 % at a higher frequency band, and it covers both the wireless lan and wimax frequency bands. but antenna has a comparatively lower gain and larger size. defected ground structure (dgs) has been utilised in microstrip antennas to increase the bandwidth and gain and suppress higher mode harmonics, mutual coupling between neighbouring elements, and cross-polarisation to improve the radiation properties of microstrip antennas. for wlan / wimax applications, a. ibrahim et al. [21] have developed an antenna using dgs for tri-band operations. the antenna has relatively lower bandwidths of 197 mhz, 118 mhz, and 90 mhz and resonates at three frequencies: 2.4 ghz, 3.5 ghz, and 5.8 ghz. the substrate is fr-4, with a dielectric constant of 4.3 and larger dimensions of 34 × 30 × 1.6 mm3. in [22], a triple-band antenna structure using a defective ground plane is suggested for use in wlan, wimax, and wi-fi applications. the proposed antenna achieves a higher peak gain of 3.88 dbi, 3.87 dbi, and 3.83 dbi and comparatively fewer impedance bandwidths of 14.61%, 5.42%, and 5.40 %, respectively. the antenna resonates at 2.47 ghz, 3.55 ghz, and 5.55 ghz. 
fr-4 substrate is used to fabricate the antenna. five split ring resonators (srrs) units are fabricated on the ground plane, complicating the antenna design. the constructed antenna is suitable for triple-band applications because it combines monopole and srrs. the total antenna size is 83 × 56 × 1.56 mm3. a complex metamaterial (mtm) based monopole antenna design is proposed by m. kasmaei et al. [23], which can function in 3g, wlan, and wimax frequency bands. the antenna covers 2.45 and 5.2 wlan bands (2.02 2.62 ghz and 5.12 5.34 ghz) along with 3.5 ghz wimax (3.48 4.56 ghz) bands. the antenna dimensions are also comparatively larger, 45 × 40 × 1mm3, built on an fr-4 substrate with a relative permittivity of 4.4. the impedance bandwidths of 600 mhz, 1080 mhz, and 220 mhz and peak gain of 2.23 db, 2.81 db and 1.91 db are obtained at 2.02 2.62 ghz, 3.48 4.56 ghz, and 5.12 5.34 ghz, respectively. a significant amount of research has been conducted on approaches for increasing bandwidth and gain in hexagonal-shaped patch planar antennas [24-27]. the designs presented by the research community have considerably improved the outcomes by utilising fractal approaches, various feeding strategies such as cpw/coaxial feeding, and srr (split-ring resonator). however, these techniques make the antenna structure large and bulkier with a complex design. a small 5g reconfigurable antenna for four frequency bands, 2.4 ghz, 3.1 ghz, and 3.4 ghz, presented in [24] provides frequency selectivity through a grouped element switch. the proposed monopole antenna has a small footprint, 37 × 35 × 1.6 mm3, built on an fr-4 substrate with relative permittivity of 4.3. the proposed configuration can function as an omnidirectional antenna with a bandwidth of 200 mhz, 682 mhz, 590 mhz and 960 mhz at 2 ghz, 3.4 ghz, 2.45 ghz and 3.1 ghz, respectively. the antenna achieves a peak gain of 1.95 db only and an efficiency greater 174 l. kumar, v. nath, bvr reddy than 85%. 
a multiband hexagonal patch antenna using an fr-4 substrate consisting of a circular slot on a radiating element and four rectangular slots is proposed in [26]. the antenna resonates at 2.40 ghz, 5.03 ghz, and 8.67 ghz, covering wlan, wimax and x-band. the total antenna dimension is 35 × 30 × 1.2 mm3 and attains a peak gain of 1.63 db, 1.38 db and 2.95 db in the respective lower, middle and higher bands. a stubloaded, hexagonal ring patch antenna excited through a triangular shape coplanar waveguide (cpw) transmission line is presented in [27]. the antenna covers wlan/wimax and itu bands from 2.9 to 5.9 ghz and 7 to 10 ghz. the substrate is made of fr-4 with a thickness of 0.8 mm and a total dimension of 20 × 20 mm2. for multiband operation, stubs are attached to the slotted hexagonal patch. but the paper does not discuss the antenna's gain and efficiency parameters. although the designs discussed above use the same substrate material, fr-4, they utilise distinct patch and ground geometries and are constructed using different bandwidth enhancement methods. as a result, most antennas are ineffective in one or two operating bands and have comparatively large sizes and moderate gains with complex antenna designs. this article investigates a wide-slot hexagonal shape patch using the dgs plane. a defected ground structure (dgs) reduces antenna size, improves radiation performance, and suppresses cross-polarisation. slotting is also used, which decreases the antenna volume while increasing the current route length and is also used to maintain the patch's size. the antenna's distinctive feature is its ability to operate across multiple bands using stubs while retaining a compact size and simple structure. the present article evaluates the antenna's performance using frequency domain properties such as gain, reflection coefficient, and radiation patterns. 
the proposed antenna has adequate efficiency and high gain with compactness while covering wireless lan, wimax, and 5g sub-6 ghz frequency bands with a simple design that is less bulky, easy to fabricate, and easy to integrate with other devices. section 2 of the paper discusses the theoretical aspects of the proposed antenna. antenna working using vector e-field and current distribution has been explained in section 3. an equivalent electric model of the proposed antenna using ads software is described and compared with the proposed antenna reflection coefficient (s11) in section 4. simulated and measured results are discussed in section 5, and finally, the paper is concluded in section 6. 2. theoretical analysis the proposed monopole antenna structure is straightforward and has a hexagonal shape patch and an equivalent wide hexagonal slot etched through the main patch. three inverted l-shaped stubs are added to generate the desired resonating frequencies. an inverted lstub is attached to one of the main hexagon's outer edges, and the other two are within the slotted patch's vicinity. a defected ground plane is etched on the other side of the substrate. there is no ground on the opposite side of the main radiating element, mainly covering the feed line. the following section describes the theoretical dimension calculation for patch, stub and feed lines. for the wideband antenna, the lower band edge frequency is the fundamental design parameter for monopole antennas rather than the resonance frequency [28]. the lower edge frequency can be determined by comparing the area of the monopole antenna to that of a cylindrical monopole antenna with a comparable height, 'l,' and radius, 'r' [28]. triple-band stub loaded patch antenna with high gain for 5g sub-6 ghz, wlan... 
$$f_{le} = \frac{7.2}{(l_c + p + r_c)\,k}\ \text{GHz} \qquad (1)$$
where
f_le = lower edge frequency,
r_c = radius of the cylindrical monopole antenna in cm,
l_c = length of the cylindrical monopole antenna in cm,
p = probe length in cm (distance between the partial ground plane and the patch vertex),
k = 1.15 for the fr-4 substrate having a thickness of 1.6 mm [28].
the patch is fed at its vertex, and the values of l_c and r_c defined above are given by:
$$l_c = 2\,l_h \qquad (2)$$
$$r_c = \frac{\sqrt{3}\,l_h}{8\pi} \qquad (3)$$
where l_h is the side length of the hexagon. putting these values in equation (1) and using the values of p and k, the lower edge frequency can be calculated. the first hexagonal patch is constructed with f_le = 3.6 ghz to keep the antenna size small. a 50 Ω microstrip transmission line has been used to feed the hexagonal patch. the following equation relates the microstrip line's dimensions to its characteristic impedance z_c [29]:
$$z_c = \frac{120\pi}{\sqrt{\varepsilon_{reff}}\left\{\dfrac{w}{h} + 1.393 + 0.667\ln\left(\dfrac{w}{h} + 1.444\right)\right\}} \qquad (4)$$
where h is the thickness of the dielectric substrate and w is the width of the microstrip feed line. ε_reff is the effective dielectric constant and is calculated as:
$$\varepsilon_{reff} = \frac{\varepsilon_r + 1}{2} \qquad (5)$$
the current path length has to be extended to generate the fundamental mode at the desired frequencies. therefore, stubs are used whose length determines the resonating frequency. the length of each stub can be adjusted separately to obtain the desired resonance frequency without affecting the main patch characteristics. three l-stubs are attached to the main patch to cover the wi-fi/wlan and wimax frequency ranges. the stub length is optimised to be close to a quarter wavelength at the resonant frequencies f_l = 2.45 ghz and f_h = 5.5 ghz, using equations (6) and (7) [29]:
$$l_{1@2.45} = \frac{c}{4 f_l \sqrt{\varepsilon_{reff}}} \qquad (6)$$
$$l_{2@5.5} = \frac{c}{4 f_h \sqrt{\varepsilon_{reff}}} \qquad (7)$$
2.1. antenna structure
the final antenna, a slotted hexagonal patch with inverted l-stubs and a dgs plane on the ground side, is shown in figure 1 in top, rear, and side views.
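the design relations (1)-(7) of section 2 can be checked numerically. the following is a minimal sketch, not the authors' design script. the function names are hypothetical, and two inputs are assumptions rather than values stated in the text: the hexagon side l_h = 7.8 mm (taken here as half of wp = 15.6 mm from table 1) and the probe length p = 1 mm.

```python
import math

C = 3.0e8  # speed of light in free space, m/s


def lower_edge_frequency_ghz(l_h_cm, p_cm, k=1.15):
    """Equations (1)-(3): lower band-edge frequency of the hexagonal monopole."""
    l_c = 2.0 * l_h_cm                                # eq. (2)
    r_c = math.sqrt(3.0) * l_h_cm / (8.0 * math.pi)   # eq. (3)
    return 7.2 / ((l_c + p_cm + r_c) * k)             # eq. (1)


def eps_reff(eps_r):
    """Equation (5): simplified effective dielectric constant."""
    return (eps_r + 1.0) / 2.0


def microstrip_zc(w, h, eps_eff):
    """Equation (4): characteristic impedance; w and h in the same unit."""
    ratio = w / h
    return 120.0 * math.pi / (
        math.sqrt(eps_eff) * (ratio + 1.393 + 0.667 * math.log(ratio + 1.444))
    )


def quarter_wave_stub_mm(f_hz, eps_eff):
    """Equations (6)-(7): quarter-wavelength stub length in mm."""
    return C / (4.0 * f_hz * math.sqrt(eps_eff)) * 1e3


if __name__ == "__main__":
    e_eff = eps_reff(4.4)  # FR-4, eps_r = 4.4
    # ASSUMED inputs: l_h = 0.78 cm and p = 0.1 cm are not given explicitly.
    print("f_le  [GHz]:", lower_edge_frequency_ghz(0.78, 0.1))
    print("l1    [mm] :", quarter_wave_stub_mm(2.45e9, e_eff))
    print("l2    [mm] :", quarter_wave_stub_mm(5.5e9, e_eff))
    print("z_c   [ohm]:", microstrip_zc(2.8, 1.6, e_eff))  # wf and h from table 1
```

under these assumptions the sketch gives a lower edge frequency of roughly 3.65 ghz and stub lengths of about 18.6 mm and 8.3 mm. these are starting values only; the optimised lengths in table 1 (s1 + s = 20.86 mm and p1 + p2 = 9 mm) come from the authors' simulation tuning.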
the fr-4 epoxy substrate, with relative permittivity εr = 4.4, loss tangent tan δ = 0.02 and a thickness of 1.6 mm, has been used for fabrication. table 1 lists the optimal geometric dimensions of the presented antenna. the width of the three inverted l-strips is kept fixed. the fundamental configuration of the proposed antenna started with a hexagonal patch, whose dimensions are obtained using equation (1). the patch has been connected to a 50 Ω microstrip feed line at one of its vertices. figure 2(a) shows the initial-stage antenna, which resonates at 4.2 and 7.5 ghz, has a lower cutoff frequency of 3.66 ghz and an overall impedance bandwidth of 5.33 ghz. on the opposite side of the substrate, the defected ground plane is etched, covering most of the feed line, and no ground exists below the main radiating patch. a greater bandwidth has been attained because of the defected partial ground plane.
fig. 1 final proposed antenna
fig. 2 a) stage 1; b) stage 2; c) stage 3 of the antenna
table 1 optimised simulated dimensions of the presented antenna
symbol  dimension (mm)   symbol  dimension (mm)   symbol  dimension (mm)
lsub    30               wsub    17               lg      9
lf      10               wf      2.8              d1      1
ls      9                wp      15.6             h       1.6
lc1     4.5              wc2     2.5              d       0.5
p1      7                p2      2                wc1     2.3
s       6.86             s1      14               b       1.5
a       14.5
finally, at stage 3, two inverted l-stubs with length l2 = p1 + p2 are extruded in the region of the slotted patch to generate a resonance peak in the fh = 5.5 ghz band, as illustrated in figure 2(c). for greater bandwidth at the higher frequency bands, final slots are carved at the corners of the ground plane. the hfss v.19 simulator is used to model the proposed antenna. the reflection coefficient (s11) characteristics for all three stages are shown in figure 3.
fig. 3 simulated reflection coefficient curves of the three stages of the presented antenna
3. antenna operation
the simulated surface current concentration and vector e-field at all three resonating frequencies are depicted in figures 4 and 5, which make the antenna's operation easy to understand. the feed line, patch edges, and ground plane edges have the maximum current density. the neighbouring modes must overlap with one another to create a broad frequency spectrum. it can be seen that the surface current concentration is maximum on the top inverted l-shaped strip at 2.45 ghz and on the middle inverted l-shaped strips at 5.5 ghz. from the vector e-field plot, it can be concluded that the antenna is linearly polarised and has maximum radiation in the desired direction at the desired resonating frequencies. thus, from the reflection coefficient (s11) curves and surface current distributions, the purpose of each extended l-strip of the presented antenna can be understood.
fig. 4 simulated current distribution (i) at 2.45 ghz, (ii) at 3.65 ghz, and (iii) at 5.5 ghz
fig. 5 simulated e-vector plot (a) at 2.45 ghz, (b) at 3.65 ghz, and (c) at 5.5 ghz
4. modelling of equivalent circuit
the equivalent circuit model reveals the characteristics of the resonant frequencies and their significance to the input impedance. as seen in figure 6, ads software is used to implement the equivalent circuit model of the designed antenna. the antenna reflection coefficient (s11) response is used to create the equivalent circuit model. each region where the reflection coefficient (s11) is below -10 db is fitted to an equivalent parallel rlc resonant circuit using foster's canonical form. figure 7 compares the reflection coefficient (s11) obtained from hfss with that from ads. the results of the hfss simulation and ads are slightly different. the values of the capacitors, inductors, and resistors are varied to obtain the proper response, which shifts the resonance frequencies towards those of the hfss simulation results.
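the equivalent-circuit idea can be illustrated with a short sketch. it assumes a foster first-form topology, that is, a series r1-l1-c1 branch followed by four parallel rlc cells connected in series, referenced to 50 Ω. the exact ads schematic of figure 6 is not reproduced in the text, so this topology, the function names and the reference impedance are assumptions; the element values are those listed in table 2.

```python
import math

Z0 = 50.0  # reference impedance in ohms (assumed)

# Parallel RLC cells (R in ohm, L in H, C in F), values from table 2.
CELLS = [
    (53.43, 0.133e-9, 4.8e-12),    # prlc1
    (28.69, 0.26e-9, 2.55e-12),    # prlc2
    (197.6, 0.639e-9, 5.90e-12),   # prlc3
    (90.5, 0.703e-9, 1.913e-12),   # prlc4
]
SERIES = (12.4, 2.3e-9, 0.428e-12)  # R1, L1, C1 from table 2


def cell_resonance_hz(l, c):
    """Resonant frequency of one isolated parallel RLC cell."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l * c))


def cell_impedance(r, l, c, f_hz):
    """Impedance of a parallel RLC cell at frequency f_hz."""
    w = 2.0 * math.pi * f_hz
    admittance = 1.0 / r + 1.0 / (1j * w * l) + 1j * w * c
    return 1.0 / admittance


def input_impedance(f_hz):
    """ASSUMED topology: series R1-L1-C1 plus all cells in series."""
    w = 2.0 * math.pi * f_hz
    r1, l1, c1 = SERIES
    z = r1 + 1j * w * l1 + 1.0 / (1j * w * c1)
    return z + sum(cell_impedance(r, l, c, f_hz) for r, l, c in CELLS)


def s11_db(f_hz):
    """Reflection coefficient magnitude in dB with respect to Z0."""
    z = input_impedance(f_hz)
    gamma = (z - Z0) / (z + Z0)
    return 20.0 * math.log10(abs(gamma))


if __name__ == "__main__":
    for f_ghz in (2.45, 3.65, 5.5):
        print(f_ghz, "GHz ->", round(s11_db(f_ghz * 1e9), 2), "dB")
```

the isolated cell resonances returned by `cell_resonance_hz` need not coincide with the antenna bands, because the cells interact with the series branch; only the full-network response is comparable with figure 7.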
it can be seen that the results from hfss and ads are fairly well matched. however, the values at higher frequencies deviate because the equivalent circuit model is roughly compared to 50 hz. the rlc element values derived from the equivalent circuit are tabulated in table 2 for the desired frequency bands. it can be concluded from the above discussion that the surface current distribution, vector e-field, and equivalent circuit model explain the operation of the presented antenna and give good insight into its behaviour at the resonant frequencies.
fig. 6 rlc equivalent circuit modelled in ads
fig. 7 a comparison of the hfss and ads reflection coefficients
table 2 equivalent rlc components for the antenna
parameters          prlc1   prlc2   prlc3   prlc4
inductor (nh)       0.133   0.26    0.639   0.703
capacitor (pf)      4.8     2.55    5.90    1.913
resistance (ohms)   53.43   28.69   197.6   90.5
series elements: l1 = 2.3 nh, c1 = 0.428 pf, r1 = 12.4 ohm
5. parametric analysis
the antenna is subjected to a parametric analysis to determine its ideal dimensions and to improve its performance. this section shows the effects of adjusting different geometric parameters of the antenna. an inverted l-stub of length l1 = s1 + s is attached to the upper side of the radiating element to cover the 2.45 ghz frequency band. the lowest resonating frequency can be altered by changing this l-stub length without affecting the other resonant frequencies, as shown in figure 8(a). two inverted l-stubs of length l2 = p1 + p2 are added inside the hexagonal slot to make the antenna resonate in the 5.5 ghz wlan band. by changing the length of these l-stubs, the resonating frequency for the 5.5 ghz wlan band can be altered independently without affecting the lower and mid-frequency bands. figure 8(b) depicts the change in the resonating frequency when the length p2 is altered.
three slots have been etched in the partial ground plane on the other side of the substrate. the centre slot lies just below the feed line, and its width wc1 affects the crest value of the reflection coefficient. figure 8(c) shows that the s11 dip is deepest at the optimised width, indicating good matching at the resonant frequencies of all three bands. figure 8(d) shows the effect on the resonant frequencies in the middle and higher bands when the width wc2 of the ground slots at the edges is varied. except at the optimised width, the middle and higher resonant frequencies shift upwards with poorer matching as the width increases. the lower frequency band is least affected by wc2, apart from changes in its impedance matching.
fig. 8 parametric analysis by altering (a) l1: variation of the lower resonating frequency (2.45 ghz), (b) p2: variation of the higher resonating frequency (5.5 ghz), (c) wc1: variation in s11, (d) wc2: variation in s11
6. simulated and measured results
this section compares the experimental and simulated outcomes. a prototype antenna is constructed on an fr-4 substrate using the dimensions of table 1. a female edge-mounted sma connector with 50 Ω impedance has been used to excite the antenna. the fabricated antenna is tested experimentally, and the findings are compared with the simulated results. figure 9(a-b) shows the fabricated antenna's top and bottom surfaces. the antenna characteristics are measured on the vna kc901c model from measall technology.
fig. 9 fabricated antenna (a) top patch, (b) bottom ground
figure 10(a) shows the simulated reflection coefficient (s11), figure 10(b) compares the simulated and measured results, and figure 11(a-d) shows the measured reflection coefficient (s11) for all three frequency bands.
fig. 10 (a) hfss generated reflection coefficient (s11), (b) simulated vs measured s11
the tiny size, fabrication process errors, sma connection quality, and soldering faults contribute to the differences between the simulated and measured results. table 3 compares the simulated and measured resonant frequencies and bandwidths.
table 3 measured vs simulated resonant frequencies and bandwidths
                       simulated                                   measured
band                   resonant frequency (ghz)  bandwidth (mhz)   resonant frequency (ghz)  bandwidth (mhz)
lower frequency band   2.45                      112               2.42                      280
mid-frequency band     3.78                      710               3.7                       249
high-frequency band    5.5                       1359              5.7                       1105
figure 12(a-f) illustrates the experimental and simulated far-field radiation patterns at the low, mid, and high resonance frequencies in the h-plane (φ = 90°) and the e-plane (φ = 0°). a bi-directional e-plane pattern and an omnidirectional h-plane pattern are observed in all three bands. at higher frequencies, the radiation patterns in both planes become distorted and less omnidirectional in the h-plane. such distortions occur at high frequencies because higher-order modes are excited. however, the measured radiation patterns are relatively stable in both planes.
fig. 11 measured reflection coefficient (s11 in db) a) total bandwidth, b) lower frequency band, c) mid-frequency band, d) higher frequency band
fig. 12 simulated and measured e-plane and h-plane radiation patterns at resonant frequencies: (a) h-plane (φ = 90°) and (b) e-plane (φ = 0°) at 2.45 ghz; (c) h-plane and (d) e-plane at 3.65 ghz; (e) h-plane and (f) e-plane at 5.5 ghz
figure 13(a) depicts the simulated and measured gain variation. the gain of the proposed antenna is measured using the gain transfer technique and a standard horn antenna. the measured peak gain is 1.35 db at 2.45 ghz, 2.55 db at 3.65 ghz and 3.8 db at 5.5 ghz. the gain increases with frequency because the antenna becomes electrically larger relative to the wavelength. a peak gain of almost 5.5 db is achieved at 6 ghz. the simulated gain ranges from 1.5 to 5.8 db, and the measured gain ranges from 1.5 to 5.5 db. figure 13(b) shows the radiation efficiency of the proposed antenna in each frequency band, simulated in hfss. the radiation efficiency increases with frequency, with a maximum efficiency of about 96% found in the mid-frequency region.
fig. 13 (a) simulated and measured gain in db, (b) simulated efficiency of the presented antenna
the proposed antenna has achieved more than 80% radiation efficiency for all bands. the characteristics of the presented antenna, viz. size, gain, frequency bands, impedance bandwidth, and substrate material, are compared in table 4 with wideband monopole antenna designs published in recent research articles.
table 4 comparison of the referenced antennas with the presented antenna
ref.          size (mm³)    electrical size w.r.t. free-space wavelength (λl)   operating frequencies (ghz)   impedance bandwidth (mhz)   peak gain (db)   substrate material
[7]           33×17×1.6     0.55×0.30×0.026                                     2.5/3.5/5.5                   300/400/1520                3.9              fr-4
[8]           60×50×1.6     0.72×0.60×0.019                                     1.8/3.5/5.4                   140/180/200                 5.18             fr-4
[10]          40×40×1.6     0.64×0.64×0.0256                                    2.4/3.5/5.5                   360/400/450                 3.3              fr-4
[11]          36×39×1.6     0.58×0.63×0.026                                     2.45/3.55/5.5                 170/960/740                 4                fr-4
[19]          27.5×20×1.5   0.44×0.32×0.024                                     2.44/3.5/5.5                  150/550/2880                4                fr-4
[20]          36×29×0.8     0.59×0.47×0.013                                     2.45/3.3/5.5                  330/140/1060                2.5              fr-4
[21]          34×30×1.6     0.55×0.48×0.026                                     2.43/3.5/5.7                  197/118/90                  2.9              fr-4
[22]          83×56×1.56    1.35×0.92×0.0256                                    2.47/3.55/5.55                380/190/300                 3.9              fr-4
[23]          45×40×1       0.64×0.565×0.014                                    2.12/4.12/5.16                600/1080/220                1.75             fr-4
[24]          37×35×1.6     0.49×0.46×0.021                                     2/2.45/3.1/3.4                200/590/682/960             1.95             fr-4
[30]          14×16×1.6     0.224×0.256×0.0256                                  2.4/5.8                       400/1500                    3.1              fr-4
[31]          12×16×1.5     0.192×0.256×0.024                                   2.4/5.8                       -                           1.44             fr-4
this antenna  30×17×1.6     0.49×0.28×0.026                                     2.45/3.65/5.5                 112/710/1359                5.5              fr-4
7. conclusion
this research article has evaluated and experimentally validated a compact tri-band antenna covering the wlan/wimax and 5g sub-6 ghz bands. the total surface dimension of the antenna is 30 × 17 mm², with a simple design on low-profile fr-4 substrate material and wideband characteristics. the presented antenna operates in three bands, resonating at 2.45 ghz, 3.65 ghz, and 5.5 ghz, by extruding three inverted l-shaped extensions from the slotted primary patch. peak gains of 1.34 db (at 2.45 ghz), 2.55 db (at 3.65 ghz), and 3.8 db (at 5.5 ghz) have been achieved. the presented antenna achieved a peak gain of 5.5 db in the upper-frequency region. the wide bandwidth has been achieved using the defected ground plane while keeping the overall antenna volume to a minimum. good impedance matching is obtained for all the operating bands by etching an equivalent hexagonal slot in the main radiating patch.
the omnidirectional h-plane and bi-directional e-plane patterns, with stable gain across the operating frequency bands, have also been achieved. the proposed antenna is a good choice for near-future 5g sub-6 ghz application systems and for the wlan and wimax bands because of its compact size, stable gain, and more than 80% efficiency.
references
[1] k. f. lee and k. f. tong, "microstrip patch antennas: basic characteristics and some recent advances", in proceedings of the ieee, vol. 100, no. 7, pp. 2169-2180, 2012.
[2] r. b. waterhouse, microstrip patch antennas: a designer's guide, springer science & business media, 2003, chapter 4-5, pp. 167-274.
[3] n. kishore and a. senapati, "5g smart antenna for iot application: a review", int. j. commun. syst., vol. 35, no. 13, p. e524, jan. 2022.
[4] w. zaman, h. ahmad and h. mehmood, "a miniaturised meandered printed monopole antenna for triband applications", microw. opt. technol. lett., vol. 60, no. 5, pp. 1265-1271, 2018.
[5] r. n. tiwari, p. singh and b. k. kanaujia, "asymmetric u-shaped printed monopole antenna embedded with t-shaped strip for bluetooth, wlan/wimax applications", wirel. netw., vol. 26, no. 1, pp. 51-61, 2020.
[6] h. ahmad, w. zaman, m. rehman and f. c. seman, "the smallest form factor monopole antenna with meandered radiator for wlan and wimax applications", iete j. res., vol. 68, no. 4, pp. 3010-3018, 2022.
[7] m. karthikeyan, r. sitharthan, t. ali, s. pathan, j. anguera and s. shanmuga, "stacked t-shaped strips compact antenna for wlan and wimax applications", wirel. pers. commun., vol. 123, no. 2, pp. 1523-1536, 2022.
[8] a. s. elkorany, a. n. mouse, s. ahmad, d. a. saleeb, a. ghaffar, m. soruri, m. dalarsson, m. alibakshikenari and e. limiti, "implementation of a miniaturised planar tri-band microstrip patch antenna for wireless sensors in mobile applications", sensors, vol. 22, no. 2, p. 667, 2022.
[9] j. park, j. minjoo, n. hussain, s. rhee, p. kim and n. kim, "design and fabrication of triple-band folded dipole antenna for gps/dcs/wlan/wimax applications", microw. opt. technol. lett., vol. 61, no. 5, pp. 1328-1332, 2019.
[10] u. patel, m. parekh, a. desai and t. upadhyaya, "wide slot tri-band antenna for wireless local area network/worldwide interoperability for microwave access applications", int. j. commun. syst., vol. 34, no. 12, p. e4897, 2021.
[11] s. wang, f. kong, k. li and l. du, "a planar triple-band monopole antenna loaded with an arc-shaped defected ground plane for wlan/wimax applications", int. j. microw. wirel. technol., vol. 13, no. 4, pp. 381-389, 2021.
[12] a. z. manouare, s. ibnyaich, d. seetharamdoo, a. e. idrissi and a. ghammaz, "design, fabrication and measurement of a novel compact triband cpw-fed planar monopole antenna using multi-type slots for wireless communication applications", j. circ. syst. comput., vol. 29, no. 2, p. 2050032, 2020.
[13] b. kumar, b. k. shukla, a. somkuwar and o. p. meena, "analysis of hexagonal wide slot antenna with a parasitic element for wireless application", prog. electromagn. res. c, vol. 94, pp. 145-159, 2019.
[14] s. mahapatra and m. n. mohanty, "an optimised feed hexagonal antenna with defective ground plane for uwb body area network application", instrum. mes. métrolog, vol. 20, no. 5, pp. 261-267, 2021.
[15] p. p. singh and s. k. sharma, "design and fabrication of a triple band microstrip antenna for wlan, satellite tv and radar applications", prog. electromagn. res. c, vol. 117, pp. 277-289, 2021.
[16] h. v. pallavi, g. m. m. naik, a. p. j. chandra and paramesha, "enhancement of radiation characteristics in a planar microstrip patch antenna using defected ground structure", turk. j. comput. math. education (turcomat), vol. 12, no. 12, pp. 3157-3166, 2021.
[17] m. h. reddy and d. sheela, "a compact ultra-wideband patch antenna using defected ground structure", 3c tecnología. glosas de innovación aplicadas a la pyme, edición especial, pp. 567-76, 2021.
[18] p. m. mpele, f. m. mbango, d. b. o. konditi and f. ndagijimana, "a tri-band and miniaturised planar antenna based on countersink and defected ground structure techniques", int. j. rf microw. comput.-aided eng., vol. 31, no. 5, p. e22617, 2021.
[19] h. ahmad, w. zaman, s. bashir and m. u. rahman, "compact triband slotted printed monopole antenna for wlan and wimax applications", int. j. rf microw. comput.-aided eng., vol. 30, no. 1, p. e21986, 2020.
[20] chandan, "truncated ground plane multiband monopole antenna for wlan and wimax applications", iete j. res., vol. 68, no. 4, pp. 2416-2421, 2022.
[21] a. ibrahim, n. a. fazil and r. dewan, "triple-band antenna with defected ground structure (dgs) for wlan/wimax applications", j. phys.: conf. ser., vol. 1432, no. 1, p. 012071, iop publishing, 2020.
[22] a. pandya, t. upadhyaya and k. pandya, "tri-band defected ground plane based planar monopole antenna for wi-fi/wimax/wlan applications", prog. electromagn. res. c, vol. 108, pp. 127-136, 2021.
[23] m. kasmaei, e. zareian-jahromi, r. basiri and v. mashayekhi, "miniaturised triple-band monopole antenna loaded with a via-less mtm for 3g, wimax, and wlan applications", int. j. microw. wirel. technol., vol. 14, no. 5, pp. 601-608, 2022.
[24] s. ullah, i. ahmad, y. raheem, s. ullah and t. ahmad, "hexagonal shaped cpw feed based frequency reconfigurable antenna for wlan and sub-6 ghz 5g applications", in the proceedings of ieee international conference on emerging trends in smart technologies (icetst), 2020, pp. 1-4.
[25] r. mark, n. mishra, k. madal, p. p. sarkar and s. das, "hexagonal ring fractal antenna with a dumbbell-shaped defected ground structure for multiband wireless applications", aeu-int. j. electron. commun., vol. 94, pp. 42-50, 2018.
[26] a. o. fadamiro, j. d. ntawangaheza, o. j. famoriji, z. zhang and f. lin, "design of a multiband hexagonal patch antenna for wireless communication systems", iete j. res., vol. 68, no. 3, pp. 1675-1682, 2022.
[27] z. khan, r. p. dwivedi and k. u. kiran, "stub loaded compact hexagonal ring antenna for wlan/wimax/itu applications", in the proceedings of teqip iii sponsored ieee international conference on microwave integrated circuits, photonics and wireless networks (imicpw), 2019, pp. 98-102.
[28] g. kumar and k. p. ray, broadband microstrip antennas, artech house, 2003, chapter 9, pp. 357-378.
[29] c. a. balanis, antenna theory: analysis and design, wiley, 2005.
[30] m. v. yadav and s. baudha, "dual-band miniaturised and modified circular patch radiator for wifi/wlan applications", in the proceedings of ieee indian conference on antennas and propagation (incap), 2019, pp. 1-4.
[31] s. baudha, m. v. yadav and n. joshi, "miniaturised dual-band arrow shaped planar antenna", telecommun. radio eng., vol. 78, no. 19, pp. 1719-1728, 2019.

facta universitatis series: electronics and energetics vol. 28, no 1, march 2015, pp. 153-164 doi: 10.2298/fuee1501153b
total ionizing dose effects and radiation testing of complex multifunctional vlsi devices
dmitry boychenko, oleg kalashnikov, alexander nikiforov, anastasija ulanova, dmitry bobrovsky, pavel nekrasov
national research nuclear university (nrnu) "mephi", moscow, russian federation
abstract. total ionizing dose (tid) effects and radiation tests of complex multifunctional very-large-scale integration (vlsi) integrated circuits (ics) raise some particular issues compared with conventional "simple" ics. the main difficulty is to organize informative and quick functional tests directly under irradiation. a functional test approach specific to complex multifunctional vlsi devices is presented, and the basic radiation test procedure is discussed as applied to some typical examples.
key words: total ionizing dose (tid) effect, radiation test, functional failure, operating mode
1. introduction
radiation hardness requirements are typical for all kinds of microelectronic parts for space, avionics, military and nuclear physics applications. radiation tests have to be performed to qualify each type of ic at its design, manufacturing or application stage [1], [2]. a very large number of radiation test results have been published recently, but most of them concentrate on rather simple devices under test: transistors and digital or analog ics with a rather simple set of radiation-sensitive parameters and well-developed measurement procedures [3]-[5]. for modern complex vlsi ics the same test approach is widely used: the modes and conditions of ic operation under irradiation are chosen to be as simple as possible, and only the simplest electrical parameters are measured, for example output voltages and power supply or input currents. at the same time, most radiation test facilities were not originally adapted for ic radiation tests and have rather long signal cables (about ten meters or more), which rules out real functional testing under irradiation for high-frequency and precision devices.
the aim of this research is to prove the necessity and demonstrate the possibility of complex functional testing (ft) of vlsi ics under tid irradiation, and to show the existence of critical electrical and functional modes of ic operation. we have reviewed the most typical problems of tid testing of multifunctional vlsi ics, which are illustrated by the radiation behavior of different ics [6]-[16].
received september 3, 2014; received in revised form december 4, 2014
corresponding author: aleksandr nikiforov, national research nuclear university (nrnu) "mephi", moscow, russian federation (e-mail: aynik@spels.ru)
we also provide typical guidelines for the ft procedure (both hardware and software) and present a suitable gamma irradiation facility. in this paper we concentrate on experimental research into tid effects, but all the main results and conclusions can be extended to transient radiation effects (tre), displacement damage (dd) and single event effects (see) [17]-[20].
2. functional and parametric failures
the tid behavior of complex vlsi ics is usually non-trivial: the simultaneous total dose degradation of different elements and their mutual influence often lead to mixed parametric-functional ic failures. the results of testing the 32-bit risc processor idt79r308125mj are shown in fig. 1 as an example [6]. the processor malfunction (internal cache memory errors) is accompanied by a parametric failure (supply current increase).
fig. 1 processor cache memory errors number and supply current vs. total dose.
it is difficult to separate parametric and functional tid failures for some types of complex ics, such as adcs and dacs, because parametric degradation leads to functional failures of these ics. in fig. 2 a set of adc parameters is presented as a function of total dose [7]. the increase of gain error and nonlinearity results from the degradation of the adc transfer function (fig. 3) at 200 krad, when the adc actually does not operate. similar radiation behavior is observed in testing of the flash memory slcf128mm1ui (stec) (fig. 4). the extreme fall in read speed means in fact that the flash memory cannot operate properly [8].
fig. 2 tid degradation of adc conversion parameters and supply current.
fig. 3 tid degradation of adc transfer function.
fig. 4 read speed (kb/s) vs. tid level (total dose, krad(si)) for slcf128mm1ui, samples 1-3.
the analysis of tid test data for different ic types demonstrates the critical importance of ft during radiation testing of complex multifunctional vlsi ics. statistics from our test center on the dominant tid failure mechanism (parametric or functional) are presented in fig. 5. one can see the clear prevalence of parametric tid failures for simple logic, while other (complex) types of ics are characterized by a substantial or even dominant share of functional failures [9].
fig. 5 functional vs. parametric tid failures quantities for various ics classes.
the essential problem of ft design is the proper selection of ic operating modes under irradiation. it is known that tid hardness usually depends strongly on the electric bias and operating mode under irradiation [10]. the hardness levels in the best-case and worst-case modes of a particular ic may differ by a factor of several. the radiation sensitivity of different units of an ic can vary significantly, since the elements operate in different electrical modes under irradiation. fig. 6 shows that the soi risc microprocessor powerpc7448 (e2v) demonstrates no uniformity in tid hardness. in 'normal' mode (the processor executes a test program under irradiation) the failure dose is much higher (6-10 times) than in 'periodical restarts' mode under irradiation. it was found that the boot unit of the microprocessor is the most tid sensitive [12]. static and dynamic operation modes of an ic under irradiation usually lead to very different estimates of hardness levels. for example, the total dose graphs presented in fig. 7 demonstrate an extreme increase of ram supply current in static irradiation mode, while the samples irradiated in dynamic mode are much harder [13].
fig. 6 functional failure tid for different irradiation modes of powerpc7448 microprocessor.
fig. 7 ram supply current vs. total dose at static and dynamic operation modes under irradiation.
3. ics operating mode under irradiation
in some cases the difference in tid hardness between samples irradiated in various operating modes is so large that the samples irradiated in one mode do not fail at all, whereas the hardness level of the samples irradiated in another mode is rather low. for example, fig. 8 presents the results of tid testing of the mil-std-1553b receiver bus-65163-220y (ddc) [14]. some samples were irradiated in data transfer mode and the others in 'silent' mode (without data transfer). one can see that the second group of samples did not fail at all. these examples demonstrate the importance of correctly selecting the ic operating mode under irradiation. in most cases the purpose of the test is to determine the ic hardness level in the worst-case mode. complex multifunctional vlsi ics can often operate in dozens of modes, which is why preliminary analysis and research should be carried out to find that mode.
fig. 8 supply current vs. tid level for bus-65163-220y samples irradiated in transfer mode and "silent" mode.
4. testing during and after irradiation
as mentioned before, the radiation test procedure is sometimes based on measurements of only the most primitive electrical parameters under irradiation. the complete functional testing, if any, is performed with a delay of several hours or even days after irradiation. this is due to the inability of a radiation test center to carry out an informative test under irradiation. the test procedure is often based on special equipment (industrial ic testers) which is neither adapted to the radiation test environment nor compatible with irradiation facilities, and which does not support remote testing. according to our experience and data, however, it is very important to execute ft directly during irradiation.
testing ic samples after irradiation would distort the real picture of radiation behavior and the hardness level because of annealing, which can result even in full recovery of operation. in fig. 9 two graphs of cmos adc inl are shown: the first is measured immediately after the 100 krad(si) irradiation and the second 12 hours later (t = +25ºc) [15]. one can see that 12-hour annealing leads to recovery of the adc's operation. fig. 10 demonstrates another example of tid hardness level distortion because of annealing [8]. annealing was carried out at a temperature of +25ºc. the samples of 4 mbit static ram were irradiated to 60-120 krad(si) and were tested both during irradiation and after 24-hour annealing. the test after annealing demonstrated full functional recovery and a significant supply current drop.
fig. 9 adc inl measured immediately after 100 krad(si) irradiation and 12 hours later (t = +25ºc) (datasheet margins are shown by dashed lines at ±4 lsb).
fig. 10 ram errors number (black symbols) and supply current (gray symbols) vs. tid during irradiation and after annealing (t = +25ºc).
5. low dose rate effects
an important feature of the space radiation environment is its low intensity (dose rate), in the range of $10^{-3}$ ... $10^{-4}$ rad(si)/s, and it is well known that low dose rate effects can affect the total dose hardness of ics [3], [21]. it should be noted that in most cases radiation tests are carried out in the dose rate range 10 ... 1000 rad(si)/s. low dose rate effects are especially important for bipolar ics, which are often characterized by increased parameter degradation and low tid hardness at low dose rate, the so-called eldrs (enhanced low dose rate sensitivity) effect [22], [23]. this effect has to be taken into account when planning the testing procedure of bipolar ics for space applications.
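the gap between space and laboratory dose rates translates directly into test duration, since irradiation time is simply the accumulated dose divided by the dose rate. a minimal sketch follows; the 100 krad(si) target dose and the function names are illustrative assumptions, not values prescribed by the paper.

```python
SECONDS_PER_DAY = 86400.0


def irradiation_time_s(dose_krad, dose_rate_rad_per_s):
    """Time in seconds needed to accumulate dose_krad at a constant dose rate."""
    if dose_rate_rad_per_s <= 0:
        raise ValueError("dose rate must be positive")
    return dose_krad * 1e3 / dose_rate_rad_per_s


def irradiation_time_days(dose_krad, dose_rate_rad_per_s):
    """Same duration expressed in days."""
    return irradiation_time_s(dose_krad, dose_rate_rad_per_s) / SECONDS_PER_DAY


if __name__ == "__main__":
    target_krad = 100.0  # ASSUMED illustrative target dose
    for rate in (100.0, 10.0, 1e-3, 1e-4):  # rad(Si)/s, lab vs. space-like rates
        days = irradiation_time_days(target_krad, rate)
        print(f"{rate:g} rad(Si)/s -> {days:.4g} days")
```

at 100 rad(si)/s the assumed dose is reached in 1000 s, whereas at $10^{-3}$ rad(si)/s it takes more than three years, which is why direct testing at space-like dose rates is rarely practical.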
at the same time, for cmos ics the dose rate influences the hardness in the opposite way: the hardness levels are usually higher at low dose rates [24], [25]. the example in fig. 11 demonstrates the influence of radiation dose rate on the tid hardness of the cmos flash memory wf1m32b (white electronic designs). the hardness under low dose rate conditions proved to be about 1.5 times higher than at high dose rate [26].

for complex devices the low dose rate effects may be more complicated. both parameter degradation and functional performance of such ics often behave differently at low and average radiation intensities. in fig. 12 the nonlinearity of the adc ad7890 (analog devices) vs. total dose at two dose rates is presented. since radiation tests are usually carried out at average dose rates, the real on-board tid hardness of cmos ics may be higher than that determined in laboratory radiation tests.

direct low dose rate testing is usually hard to carry out because of the long time required. there are many accelerated testing techniques, usually based on irradiation at average intensity followed by annealing (see [27], the well-known mil-std-883h test method), but all of them have restrictions and adequacy problems. we suggested and implemented an "engineering" technique for low dose rate effects estimation. this technique is based on a combination of dose rates during testing and allows real tid hardness levels to be obtained without great loss of time. the basic structure of the technique for cmos ics is as follows:
(1) radiation testing of some samples (about half of a lot) at average dose rate (10 ... 100 rad (si)/s) to estimate the tid failure level;
(2) comparison of this level with the required hardness level;
(3) if the determined tid failure level is above the requirements, the low dose rate effect need not be taken into account, because this effect can only improve the tid hardness of cmos ics;
(4) if the determined tid failure level is below the requirements by more than a factor of 3, the low dose rate conditions also need not be tested, because according to our experience this effect cannot improve the hardness level estimate by more than a factor of 2-2.5;
(5) if the failure level is below the requirements by less than a factor of 3, the remaining samples of the lot are tested at low dose rate (0.005 ... 0.05 rad (si)/s) to determine the tid failure level for real space conditions.
this rational technique reduces the amount of low dose rate testing several times over [28].

fig. 11 total dose degradation of wf1m32b supply current (supply current, µa, vs. total dose, krad (si)): black symbols: 10 rad(si)/s, white symbols: 0.04 rad(si)/s.

fig. 12 total dose degradation of ad7890 nonlinearity (nonlinearity, lsb, vs. total dose, krad (si)): black symbols: 5 rad(si)/s, white symbols: 0.01 rad(si)/s.

6. radiation test facilities

the experimental results presented in the previous sections demonstrate the necessity of special test equipment for radiation testing of complex multifunctional vlsi ics. as a rule, industrial ic testers are not optimal for the radiation test procedure. that is why specialized technical solutions have been designed to combine complex ft of different vlsi ics with the restrictions of modern irradiation facilities.
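the five-step "engineering" technique for cmos ics given in section 5 can be sketched as a small decision function. this is only an illustration under stated assumptions: the names `NextStep` and `plan_low_dose_rate_test` are hypothetical, levels are taken in krad (si), and the handling of the boundary cases (failure level exactly equal to the requirement, or exactly a factor of 3 below it) is an assumption, since the text does not specify them.

```python
from enum import Enum


class NextStep(Enum):
    """Possible outcomes of steps (3)-(5); names are illustrative only."""
    SKIP_LEVEL_MET = "failure level meets the requirement, no low dose rate test"
    SKIP_GAP_TOO_LARGE = "gap exceeds what low dose rate can recover, no test"
    TEST_LOW_DOSE_RATE = "irradiate remaining samples at 0.005 ... 0.05 rad(Si)/s"


def plan_low_dose_rate_test(failure_krad, required_krad, max_gap=3.0):
    """Decide whether the remaining samples need a low dose rate test.

    failure_krad  -- TID failure level found at average dose rate (step 1)
    required_krad -- required hardness level (step 2)
    max_gap       -- factor of 3 quoted in steps (4) and (5)
    """
    if failure_krad <= 0 or required_krad <= 0:
        raise ValueError("dose levels must be positive")
    # Step (3): low dose rate can only improve CMOS hardness.
    # ASSUMPTION: a failure level equal to the requirement counts as met.
    if failure_krad >= required_krad:
        return NextStep.SKIP_LEVEL_MET
    # Step (4): improvement is at most about 2-2.5 times, so a larger gap
    # cannot be closed. ASSUMPTION: exactly a factor of 3 falls into step (5).
    if required_krad / failure_krad > max_gap:
        return NextStep.SKIP_GAP_TOO_LARGE
    # Step (5): gap below a factor of 3, so test at low dose rate.
    return NextStep.TEST_LOW_DOSE_RATE


if __name__ == "__main__":
    for failure, required in [(100, 60), (10, 60), (30, 60)]:
        print(failure, required, plan_low_dose_rate_test(failure, required).name)
```

only the third case in the usage loop (failure level a factor of 2 below the requirement) would send the rest of the lot to the long low dose rate run, which is the source of the time saving claimed for the technique.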
a universal vlsi ic test system has therefore been designed, based on national instruments hardware, labview software and a set of test boards adapted to the variety of complex multifunctional ics under test [29]. the general system structure is presented in fig. 13.

fig. 13 general structure of the ni-based vlsi ics radiation test system.

radiation testing of vlsi ics also requires convenient irradiation facilities. the basic requirements are low electromagnetic interference and short signal lines. original dual-zone co-60 (energy 1.25 mev, dose rate 0.01...1 rad/s) and cs-137 (energy 0.66 mev, dose rate 0.2...20 rad/s) gamma facilities have been specially designed and installed at the nrnu mephi-spels test center. their unique feature is a distance of about 1 m between the ft equipment and the device under test. used in conjunction with a compact x-ray tester (energy up to 0.1 mev, dose rate 1...500 rad/s) and a pulsed linear electron accelerator in x-ray mode (linac, energy 2 mev, dose rate 0.5...100 rad/s), these facilities allow tid radiation tests of complex multifunctional vlsi ics to be carried out over the practical range of irradiation intensities, which is necessary to estimate tid hardness for all kinds of applications [30], [31], [32].

7. conclusion

in conclusion, the radiation behavior of multifunctional vlsi ics differs from that of "simple" ics in significant features, which lead to specifics in the test procedure. these must be taken into account when planning and preparing the test experiment.
according to practical tid test experience, the following features of the radiation test procedure for complex multifunctional vlsi ics can be summarized:
• it is important to investigate and correctly select the worst-case ic bias conditions and operating modes under irradiation;
• functional and parametric tests should accompany each other;
• tests should be executed directly under irradiation;
• the influence of low dose rate effects should be taken into consideration during testing.
these principles form the foundation of the basic test technique and equipment used in the radiation test center of nrnu mephi-spels. the test system and procedure presented have been checked and verified in hundreds of real radiation tests of complex multifunctional vlsi devices.

references
[1] a. nikiforov, a. chumakov, v. telets, et al., “ic space radiation effects experimental simulation”, in proc. of workshop "space radiation environment modelling new phenomena and approaches", oct. 7-9, 1997, moscow, russia, p. 4.11.
[2] v. belyakov, a. chumakov, a. nikiforov, v. pershenkov, “ic's radiation effects modeling and estimation”, microelectronics reliability, 1999, v. 40, № 12, pp. 1997-2018.
[3] v. belyakov, v. pershenkov, g. zebrev, et al., “methods for the prediction of total-dose effects on modern integrated semiconductor devices in space: a review”, russian microelectronics, 2003, v. 32, № 1, pp. 25-38.
[4] d. gromov, v. elesin, g. petrov, et al., “radiation effects in nanoelectronic elements”, semiconductors, 2010, v. 44, № 13, pp. 1669-1702.
[5] d. boychenko, l. kessarinskiy, d. pechenkina, “the influence of the electrical conditions on total dose behavior of the analog switches”, in radecs proceedings, 2011, sevilla, spain, pp. 822-824.
[6] p. nekrasov, a. demidov, o.
kalashnikov, “functional checks of microprocessors during radiation tests”, instruments and experimental techniques, 2009, v. 52, № 2, pp. 196-199.
[7] o. kalashnikov, a. demidov, a. nikiforov, et al., “integrating analog-to-digital converter radiation hardness test technique and results”, ieee transactions on nuclear science, 1998, v. 45, № 6(1), pp. 2611-2615.
[8] a. boruzdina, a. ulanova, n. grigor'ev, a. nikiforov, “radiation-induced degradation in the dynamic parameters of memory chips”, russian microelectronics, 2012, v. 41, № 4, pp. 259-265.
[9] o. kalashnikov, “statistical variations of integrated circuits radiation hardness”, in radecs proceedings, 2011, sevilla, spain, pp. 661-665.
[10] o. kalashnikov, “cmos integrated circuits total dose functional upset sensitivity to operation mode”, in proc. of the 4th workshop on electronics for lhc experiments, 1998, rome, italy, pp. 484-485.
[11] a. kirgizova, a. nikiforov, n. grigor'ev, et al., “dominant mechanisms of transient-radiation upset in cmos ram vlsi circuits realized in sos technology”, russian microelectronics, 2006, v. 35, № 3, pp. 162-176.
[12] a. karakozov, o. korneev, p. nekrasov, et al., “bias conditions and functional test procedure influence on powerpc7448 microprocessor tid tolerance”, in radecs proceedings, 2013, oxford, uk (to be published).
[13] a. akhmetov, d. boychenko, d. bobrovskiy, et al., “system on module total ionizing dose distribution modeling”, in proceedings of the international conference on microelectronics miel, 2014, belgrade, serbia, pp. 329-331.
[14] d. bobrovsky, o. kalashnikov, p. nekrasov, “functional control technique for fpga total ionizing dose testing”, radecs proceedings, 2012, biarritz, france.
[15] o. kalashnikov, a. artamonov, a. demidov, et al., “adc/dac radiation test technique”, workshop record 4th european conf. "radiations and their effects on devices and systems" (radecs 97), 1997, palm beach-cannes, france, pp. 56-60.
[16] o. kalashnikov, a. nikiforov,
“tid behavior of complex multifunctional vlsi devices”, in proceedings of the international conference on microelectronics miel, 2014, belgrade, serbia, pp. 455-458.
[17] a. nikiforov, p. skorobogatov, “physical principles of laser simulation for the transient radiation response of semiconductor structures, active circuit elements, and circuits: a nonlinear model”, russian microelectronics, 2006, v. 35, № 3, pp. 138-149.
[18] g. davydov, v. luchinin, a. nikiforov, “effect of irradiation with fast neutrons on electrical characteristics of devices based on cvd 4h-sic epitaxial layers”, semiconductors, 2003, v. 37, № 10, pp. 1229-1233.
[19] a. chumakov, a. vasil'ev, a. yanenko, et al., “single-event-effect prediction for ics in a space environment”, russian microelectronics, 2010, v. 39, № 2, pp. 74-78.
[20] d. bobrovsky, o. kalashnikov, p. nekrasov, “an estimate of the fpga sensitivity to effects of single nuclear particles”, russian microelectronics, 2012, v. 41, № 4, pp. 226-230.
[21] j.r. schwank, “basic mechanisms of radiation effects in the natural space environment”, nsrec short course, 1994.
[22] a.h. johnston et al., “enhanced damage in bipolar devices at low dose rates: effects at very low dose rates”, ieee trans. nucl. sci., 1996, vol. 43, № 6, p. 3049.
[23] v.s. pershenkov, a.i. chumakov, a.y. nikiforov et al.,
“interface trap model for the low-dose-rate effect in bipolar devices”, in radecs proceedings, 2007, deauville, france, pp. 1-6.
[24] d.m. fleetwood, “total ionizing dose effects in mos and low-dose-rate-sensitive linear-bipolar devices”, ieee trans. nucl. sci., 2013, vol. 60, № 3, pp. 1706-1730.
[25] p.j. mcwhorter, s.l. miller, w.m. miller, “modeling the anneal of radiation-induced trapped holes in a varying thermal environment”, ieee trans. nucl. sci., 1990, vol. 37, № 6, pp. 1682–1689.
[26] a. petrov, a. vasil'ev, a. ulanova, a. chumakov, a. nikiforov, “flash memory cells data loss caused by total ionizing dose and heavy ions”, central european journal of physics, 2014, v. 12, issue 10, pp. 725-729.
[27] mil-std-883h, ionizing radiation (total dose) test procedure, department of defense, test method standard, microcircuits, 2010.
[28] d. boychenko, o. kalashnikov, a. karakozov, a. nikiforov, “the rational technique for cmos ics total dose hardness evaluation with low dose rate effects”, russian microelectronics, 2014, v. 43 (to be published).
[29] d. bobrovsky, g. davydov, a. petrov, et al., “national instruments platform based hardware-software system for electronic devices radiation experiment application”, electronics, 2012, v. 5(97), pp. 91-106.
[30] a. chumakov, o. kalashnikov, a. nikiforov, et al., “'reis-ie' x-ray tester: description, qualification technique and results, dosimetry procedure”, in proc. of the 1998 ieee radiation effects data workshop, pp. 164-169.
[31] a. chumakov, a. nikiforov, v. pershenkov, et al., “prediction of local and global ionization effects on ics: the synergy between numerical and physical simulation”, russian microelectronics, 2003, v. 32, № 2, pp. 105-118.
[32] a. sogoyan, a. artamonov, a. nikiforov, d. boychenko, “method for integrated circuits total ionizing dose hardness testing based on combined gamma- and x-ray irradiation facilities”, facta universitatis, series: electronics and energetics, 2014, vol. 27, no. 3, pp.
329-338.

facta universitatis series: electronics and energetics vol. 27, no 1, march 2014, pp. 103-112 doi: 10.2298/fuee1401103k

multiple quantum walkers on the line using hybrid coins: a possible tool for quantum search

ioannis g.
karafyllidis¹, paul isaac hagouel²
¹ democritus university of thrace, department of electrical and computer engineering, 671 00 xanthi, greece
² optelec, 11 chrysostomou smyrnis street, 54 622 thessaloniki, greece

abstract. in this paper discrete quantum walks with different coins used for odd and even time steps are studied. these coins are called hybrid. the calculation results are compared with those for the most frequently used coin, the hadamard transform. furthermore, quantum walks on the line which involve two or more quantum walkers with hybrid coins are studied. quantum walks with entangled walkers and hybrid coins are also studied. the results of these calculations show that the proposed types of quantum walks can be used for quantum search, because the walker can be directed towards preferred directions and can also be confined in certain segments of the line.

key words: quantum walk, quantum computing, simulation

1. introduction

quantum walks are quantum versions of classical random walks. they were first introduced in 1993 [1] and since then considerable work has been done on this subject. quantum walks are useful models for physical processes such as brownian motion and may serve as a basis for the development of new quantum algorithms [2], [3]. furthermore, the quantum walk is a natural model for quantum search using parallel quantum computer architectures [4], [5]. quantum walks may also become an effective tool for studying biological systems [6]. several studies of continuous-time and discrete-time quantum walks on the line [7], [8] and some implementation proposals [9]-[11] have been published. the effect of noise on the discrete-time quantum walk has also been studied [12]. recently, a study of a quantum walk on the line with one walker and several coins has been published [13]. on the other hand, in [14] a quantum walk on the line with two entangled walkers and one coin, the hadamard transform, is studied.
in this paper a study of discrete quantum walks in which multiple walkers use hybrid coins is presented. "hybrid coin" means the use of two different coins, one for odd and one for even time steps. the question to be answered by this study is: is it possible to use multiple quantum walkers and hybrid coins to direct the search towards a desired direction and, furthermore, is it possible to confine the search in a desirable segment of the line? calculation results show that this is possible.

received january 9, 2014
corresponding author: ioannis g. karafyllidis, democritus university of thrace, department of electrical and computer engineering, 671 00 xanthi, greece (e-mail: ykar@ee.duth.gr)

2. quantum walk on the line with hybrid coins

in a discrete quantum walk a walker (which can be a particle or a state) moves on a one-dimensional periodic lattice. the sites of this lattice are numbered by:

$$i = 0, \pm 1, \pm 2, \ldots, \pm n \tag{1}$$

the hilbert space of the discrete quantum walk comprises two subspaces, the location subspace $\mathcal{H}_L$, which is spanned by the basis:

$$\{|i\rangle\} = |{-n}\rangle, \ldots, |{-2}\rangle, |{-1}\rangle, |0\rangle, |1\rangle, |2\rangle, \ldots, |n\rangle \tag{2}$$

and the two-dimensional coin subspace, $\mathcal{H}_C$, which is spanned by the two coin basis states $|0\rangle$ and $|1\rangle$. the hilbert space, $\mathcal{H}$, of the quantum walk is:

$$\mathcal{H} = \mathcal{H}_L \otimes \mathcal{H}_C \tag{3}$$

the state of the quantum walker found at location $j$ with the coin in state $|0\rangle$ is $|j, 0\rangle$. the quantum walk usually starts with the walker in state $|0, 0\rangle$. at each step of the walk two operations are applied to the walker state. the coin toss operation, $C$, which acts on the coin state, is applied first:

$$C|j,0\rangle = c_{0,0}|j,0\rangle + c_{0,1}|j,1\rangle, \qquad C|j,1\rangle = c_{1,0}|j,0\rangle + c_{1,1}|j,1\rangle \tag{4}$$

any two-dimensional unitary transformation can be used as a coin toss operation. usually, the hadamard transform, $H$, is used.
in this case:

$$H = \begin{pmatrix} c_{0,0} & c_{0,1} \\ c_{1,0} & c_{1,1} \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \tag{5}$$

the second operation applied is the walker shift operation, $S$, which acts on the location state and is given by:

$$S = \sum_{j=-n}^{n} \left( |j-1\rangle\langle j| \otimes |0\rangle\langle 0| + |j+1\rangle\langle j| \otimes |1\rangle\langle 1| \right) \tag{6}$$

this operation shifts the walker to the right (towards $+n$) if the coin state is $|1\rangle$ and to the left if the coin state is $|0\rangle$. the probability distribution for a discrete quantum walk with initial walker state $|0, 0\rangle$, in which the hadamard transform is used as the coin, is shown in figure 1. the probability distribution is biased towards the left because of quantum interference. the initial walker state $|0, 1\rangle$ results in a probability distribution biased towards the right.

fig. 1 probability distribution for a quantum walk after 40 steps. the initial state is $|0, 0\rangle$ and the hadamard transform is used for the coin toss.

the dependence of the probability distribution on the initial coin state leads naturally to the question: what is the probability distribution in the case where two coins are used alternately? the case where the coin used in odd steps is the hadamard transform and the coin used in even time steps is a phase shift, $P$, is considered first. the phase shift is given by:

[equation (7): matrix of the phase shift $P$ with phase angle $\phi$]

figure 2 shows the probability distribution in this case. the initial walker state is $|0, 0\rangle$ and the phase angle, $\phi$, is 60°. the probability distribution is biased towards the left, as in the case where only the hadamard transform was used, but the probability to find the walker in certain locations, which are periodically distributed, is larger. on the other hand, there is zero probability to find the walker in many more locations than in the case where only the hadamard transform was used. this is an expected result for all phase angles, in the case of a single walker.
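the walk defined by equations (4)-(6), with one coin on odd steps and another on even steps, can be sketched numerically as follows. this is a minimal python/numpy sketch and not the authors' code: the function names (`hadamard`, `phase_shift`, `hybrid_walk`) are hypothetical, the lattice is simply made wide enough that the periodic boundary is never reached, and the phase shift is assumed to be the standard gate diag(1, e^{iφ}), which may differ from the matrix used for the paper's figures.

```python
import numpy as np


def hadamard():
    """Hadamard coin of equation (5)."""
    return np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)


def phase_shift(phi):
    """ASSUMPTION: standard phase-shift gate diag(1, exp(i*phi))."""
    return np.array([[1, 0], [0, np.exp(1j * phi)]], dtype=complex)


def hybrid_walk(steps, odd_coin, even_coin, start=0, coin=0):
    """Return (positions, probabilities) after `steps` steps.

    odd_coin is applied on steps 1, 3, 5, ... and even_coin on steps
    2, 4, 6, ...; the walker starts in the basis state |start, coin>.
    """
    half = steps + abs(start)              # wide enough to avoid wrap-around
    psi = np.zeros((2 * half + 1, 2), dtype=complex)   # psi[site, coin]
    psi[start + half, coin] = 1.0
    for t in range(1, steps + 1):
        c = odd_coin if t % 2 == 1 else even_coin
        psi = psi @ c.T                    # coin toss acts on the coin index
        left = np.roll(psi[:, 0], -1)      # coin 0 moves j -> j - 1
        right = np.roll(psi[:, 1], 1)      # coin 1 moves j -> j + 1
        psi = np.stack([left, right], axis=1)
    positions = np.arange(-half, half + 1)
    return positions, (np.abs(psi) ** 2).sum(axis=1)


if __name__ == "__main__":
    pos, p_h = hybrid_walk(40, hadamard(), hadamard())
    _, p_hp = hybrid_walk(40, hadamard(), phase_shift(np.deg2rad(60)))
    print("hadamard only, mean position:", (pos * p_h).sum())
    print("hybrid H/P, occupied sites:", pos[p_hp > 1e-12])
```

with the assumed diagonal phase shift, the even steps do not mix the coin states, so the walker moves two sites in the same direction per pair of steps; this is one way to see why more sites carry zero probability than in the hadamard-only walk.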
phase shift is important in the case of multiple walkers, because it affects quantum interference.

fig. 2 probability distribution for a quantum walk after 40 steps in the case where two coins, namely $H$ and $P$, are used alternately. the initial walker state is $|0, 0\rangle$.

the case where the coin used in odd steps is the hadamard transform and the coin used in even time steps is a more general transform, $G$, is considered next. the transform $G$ is given by:

$$G = \begin{pmatrix} \cos(\phi) & \sin(\phi) \\ \sin(\phi) & -\cos(\phi) \end{pmatrix} \tag{8}$$

figure 3(a) shows the probability distribution after 40 steps in this case, where $\phi$ = 30°. the initial walker state is $|0, 0\rangle$. the walker is localized in the region [-10, +10]. the probability to find the walker in locations outside this region is of the order of $10^{-3}$. it is therefore acceptable to say that using the aforementioned hybrid coin we can confine the walker within a certain segment of the line. the walker can be confined in any segment of the line [x-10, x+10] by setting the initial walker state to $|x, 0\rangle$. different values of $\phi$ result in walker confinement in regions of different sizes. for example, figure 3(b) shows the probability distribution after 40 steps where $\phi$ = 55° and the initial walker state is $|20, 0\rangle$. in this case the walker is confined in the region [18, 22] or [20-2, 20+2], that is ±2 around its initial location.

fig. 3 probability distribution for a quantum walk after 40 steps in the case where the coins $H$ and $G$ are used alternately. (a) initial walker state $|0, 0\rangle$ and $\phi$ = 30°. (b) initial walker state $|20, 0\rangle$ and $\phi$ = 55°.

3. multiple walkers on the line

more than one walker can be used in order to exploit quantum interference. a number of $W$ walkers can be used.
these walkers can be distinct particles, each one with a different initial state. in this case the initial state of the quantum walk, $|w_{in}\rangle$, is given by:

$$|w_{in}\rangle = a_1|w_1, c_1\rangle + a_2|w_2, c_2\rangle + a_3|w_3, c_3\rangle + \cdots + a_W|w_W, c_W\rangle \tag{9}$$

with

$$|a_1|^2 + |a_2|^2 + |a_3|^2 + \cdots + |a_W|^2 = 1 \tag{10}$$

multiple walkers can also be states of the same particle located initially at different locations. in this case:

$$|a_1|^2 + |a_2|^2 + |a_3|^2 + \cdots + |a_W|^2 = 1 \tag{11}$$

figure 4 shows the probability distribution after 40 steps of quantum walk. a hybrid coin $H$ and $P$ with $\phi$ = 40° is used. the initial state for this walk is:

$$|w_{in}\rangle = \frac{1}{\sqrt{2}}|14, 0\rangle + \frac{1}{\sqrt{2}}|12, 0\rangle \tag{12}$$

the calculation results show that the walk is directed towards the left.

fig. 4 probability distribution after 40 steps of quantum walk with a hybrid coin $H$ and $P$ with $\phi$ = 40°. the initial state for this walk is given by equation (12).

figure 5 shows the probability distribution after 40 steps of quantum walk in which a hybrid coin $H$ and $G$ with $\phi$ = 30° is used. the initial state for this walk is:

$$|w_{in}\rangle = \frac{1}{\sqrt{2}}|{-18}, 0\rangle + \frac{1}{\sqrt{2}}|18, 1\rangle \tag{13}$$

the walkers are confined into two segments of the line. from figure 5 it is evident that the two probability patterns are located symmetrically to the left and to the right of the origin and are exactly the same, displaced by 36 line sites.

fig. 5 probability distribution after 40 steps of quantum walk with a hybrid coin $H$ and $G$ with $\phi$ = 30°. the initial state for this walk is given by equation (13).

a quantum walk with two entangled walkers will be considered next. the initial state of the walk is:

$$|w_{in}\rangle = \frac{1}{\sqrt{2}}|6, 0\rangle + \frac{1}{\sqrt{2}}|5, 1\rangle \tag{14}$$

a hybrid coin $H$ and $G$ with $\phi$ = 50° is used. the calculation results are shown in figure 6. the probability distribution pattern has mirror symmetry with respect to the origin. more than two walkers can be used.
let us examine the case of a quantum walk with three walkers in which a hybrid coin $H$ and $G$ with $\phi$ = 30° is used. the initial state is:

$$|w_{in}\rangle = \frac{1}{\sqrt{3}}|{-10}, 1\rangle + \frac{1}{\sqrt{3}}|{-5}, 1\rangle + \frac{1}{\sqrt{3}}|11, 0\rangle \tag{15}$$

fig. 6 probability distribution after 40 steps of quantum walk with a hybrid coin $H$ and $G$ with $\phi$ = 50°. the initial state for this walk is given by equation (14).

the calculation results shown in figure 7 indicate that the walkers are confined within the region [-20, 20]. there is a non-zero probability to locate a walker in every location within the region [-15, 1].

fig. 7 probability distribution after 40 steps of quantum walk with a hybrid coin $H$ and $G$ with $\phi$ = 30°. the initial state for this walk is given by equation (15).

the periodic structure of the probability distribution shown in figure 8 was obtained using four walkers with hybrid coin $H$ and $G$ ($\phi$ = 30°). the initial state was:

$$|w_{in}\rangle = \frac{1}{2}|{-20}, 0\rangle + \frac{1}{2}|{-10}, 1\rangle + \frac{1}{2}|10, 0\rangle + \frac{1}{2}|20, 1\rangle \tag{16}$$

a periodic probability distribution with more periods can be obtained using more walkers. figures 5 and 6 indicate that two periods correspond to two walkers.

fig. 8 probability distribution after 40 steps of quantum walk with a hybrid coin $H$ and $G$ with $\phi$ = 30°. the initial state for this walk is given by equation (16).

4. conclusions

in this paper a study of discrete quantum walks involving multiple walkers and hybrid coins was presented. an analytical study of these quantum walks is probably impossible because the probability distribution patterns depend on the choice of hybrid coins, the number of walkers and the initial states. a large variety of patterns can be achieved, including walker confinement, walker direction and periodic patterns. the results presented here indicate that the quantum walk on the line with multiple walkers using hybrid coins is an effective tool for quantum search.
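the multi-walker initial states of section 3 can be sketched in the same numerical setting. the sketch below is illustrative only: it treats the initial state of equation (9) as a superposition of basis states $|w_k, c_k\rangle$ of a single register, checks the normalization of equation (10), and evolves it with alternating coins. the helper names (`initial_state`, `evolve`) are hypothetical, the phase shift is again assumed to be the standard gate diag(1, e^{iφ}), and the example uses the sites 14 and 12 of equation (12) as printed.

```python
import numpy as np

# Hadamard coin of equation (5)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)


def phase_shift(phi):
    """ASSUMPTION: standard phase-shift gate diag(1, exp(i*phi))."""
    return np.diag([1.0, np.exp(1j * phi)])


def initial_state(terms, half_width):
    """Build psi[site, coin] from a list of (amplitude, site, coin) terms."""
    amps = np.array([a for a, _, _ in terms], dtype=complex)
    # normalization condition of equation (10)
    if not np.isclose(np.sum(np.abs(amps) ** 2), 1.0):
        raise ValueError("amplitudes must satisfy sum |a_k|^2 = 1")
    psi = np.zeros((2 * half_width + 1, 2), dtype=complex)
    for a, site, coin in terms:
        psi[site + half_width, coin] += a
    return psi


def evolve(psi, steps, odd_coin, even_coin):
    """Apply coin toss and shift `steps` times, alternating the two coins."""
    for t in range(1, steps + 1):
        c = odd_coin if t % 2 == 1 else even_coin
        psi = psi @ c.T                              # coin toss
        psi = np.stack([np.roll(psi[:, 0], -1),      # coin 0: one site left
                        np.roll(psi[:, 1], 1)],      # coin 1: one site right
                       axis=1)
    return psi


half_width = 60                                      # lattice runs -60 ... +60
terms = [(1 / np.sqrt(2), 14, 0), (1 / np.sqrt(2), 12, 0)]
psi = evolve(initial_state(terms, half_width), 40, H, phase_shift(np.deg2rad(40)))
positions = np.arange(-half_width, half_width + 1)
prob = (np.abs(psi) ** 2).sum(axis=1)
print("total probability:", prob.sum())
print("mean position:", (positions * prob).sum())
```

passing a different list of terms (three or four walkers, or a different pair of coins) reuses the same two helpers; whether a given pair of walkers interferes then depends on the initial sites and coin states chosen.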
references
[1] y. aharonov, l. davidovich and n. zagury, quantum random walks, physical review a, 48, 1993, 1687.
[2] a. ambainis, quantum walks and their algorithmic applications, quant-ph/0403120.
[3] j. kempe, quantum random walks: an introductory overview, contemporary physics, 44, 2003, 302.
[4] i. g. karafyllidis, simulation of entanglement generation and variation in quantum computation, journal of computational physics, 200, 2004, 383.
[5] i. g. karafyllidis, definition and evolution of quantum cellular automata with two qubits per cell, physical review a, 70, 2004, 044301.
[6] tai-hsin hsu and su-long nyeo, diffusion coefficients of two-dimensional viral dna walks, physical review e, 67, 2003, 051911.
[7] d. ben-avraham, e. m. bolt and c. tamon, one-dimensional continuous-time quantum walks, quantum information processing, 3, 2004, 295.
[8] o. buerschper and k. burnett, stroboscopic quantum walks, quant-ph/0406039.
[9] w. dur, r. raussendorf, v. m. kendon and h.-j. briegel, quantum walks in optical lattices, physical review a, 66, 2002, 052319.
[10] j. du, h. li, x. xu, m. shi, j. wu, z. zhou and r. han, experimental implementation of the quantum random-walk algorithm, physical review a, 67, 2003, 042316.
[11] h. jeong, m. paternostro and m. s. kim, simulation of quantum random walks using the interference of a classical field, physical review a, 69, 2004, 012310.
[12] d. shapira, o. biham, a. j. bracken and m. hackett, one-dimensional quantum walk with unitary noise, physical review a, 68, 2003, 062315.
[13] p. ribeiro, p. milman and r. mosseri, aperiodic quantum random walks, physical review letters, 93, 2004, 190503.
[14] y. omar, n. paunkovic, l. sheridan and s. bose, quantum walk on a line with two entangled particles, quant-ph/0411065.

facta universitatis series: electronics and energetics vol. 34, no 3, september 2021, pp.
333-366 https://doi.org/10.2298/fuee2103333t
© 2021 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper

a comprehensive overview of recent developments in rf-mems technology-based high-performance passive components for applications in the 5g and future telecommunications scenarios*

girolamo tagliapietra, jacopo iannacci
center for sensors and devices (sd), fondazione bruno kessler (fbk), trento-38123, italy

abstract. the goal of this work is to provide an overview of the current development of radio-frequency microelectromechanical systems technology, with special attention to those passive components bearing significant application potential in the currently developing 5g paradigm. due to the capabilities required of such a communication standard in terms of high data rates, extended allocated spectrum, use of massive mimo (multiple-input multiple-output) systems, beam steering and beam forming, the focus will be on devices like switches, phase shifters, attenuators, filters, and their packaging/integration. for each of these topics, several valuable contributions have appeared in the last decade, underlining the improvements produced in the state of the art and the chance for rf-mems technology to play a prominent role in the actual implementation of the 5g infrastructure.

key words: rf-mems, switches, packaging, phase shifters, attenuators, filters

1. introduction

since the first discussion of rf-mems technology in the late '90s, research and expectations concerning this topic have followed a fluctuating path, alternating between the sensation of general maturity and the need for some improvement to allow its practical utilization.
after an initial enthusiasm towards rf passives based on mems, triggered by their outstanding performance, a subsequent phase of research was needed to overcome practical limits of these electro-mechanical systems, such as reliability and the possibility to package and integrate the devices with other existing technologies. once these issues were addressed and rf-mems reached a renewed solidity, their supposed wide-scale market absorption in rf and millimeter wave communications was opposed by a scenario already dominated by cheaper and more consolidated technologies. in fact, the advantages and commendable performance of rf-mems were not essential for the implementation of the communication standards of the time [2].

received march 08, 2021; received in revised form april 05, 2021
corresponding author: girolamo tagliapietra, center for sensors and devices (sd), fondazione bruno kessler (fbk), trento-38123, italy, e-mail: gtagliapietra@fbk.eu
* an earlier version of this paper was presented at the 1st international conference on micro/nanoelectronics devices, circuits and systems (mndcs 2021), 29-31 january 2021, in silchar, assam, india [1].

nevertheless, the last half decade portrays a more encouraging scenario: an increasing presence of rf-mems on the market, determined by a general maturity of this technology. nowadays, components based on rf-mems are commonly available [3], so that they can be employed in base stations [4] and mobile devices [5]. moreover, rf-mems are predicted to play a prominent role in the deployment of the emerging 5g mobile communication standard, capitalising on their remarkable features of reconfigurability, low power consumption and rf performance over wide portions of the frequency spectrum [6]. differently from previous opportunities, in the current scenario the increasing interest in rf-mems is actually driven by the needs of the market.
the use of femtocells, picocells and microcells to achieve high data rates imposes the employment of reliable, fast and wideband devices, capable of operating along frequency bands below 6 ghz and up to the (24.25-29.5) ghz, (37-43.5) ghz, and (64-71) ghz bands, depending on the specific country [7][8]. from a more “macroscopic” point of view, the use of highly reconfigurable mimo antennas, together with proper beam forming and beam steering of the basic radiation pattern, is introduced to maximize the bandwidth available to users [7]. as mentioned in [9], in order to properly implement a network characterized by such a variety of access points, various classes of powerful rf passives are required, including:
▪ very-wideband switches and switching networks (e.g. mpmt, multiple pole multiple throw), characterized by low losses, high isolation and extremely low cross-talk between adjacent channels along the aforementioned frequency bands.
▪ reconfigurable filters with substantial rejection along the stopband and minimal attenuation along the passband.
▪ very-wideband impedance tuners with multiple states.
▪ programmable attenuators with multiple configurations and constantly flat attenuation along the considered portions of spectrum.
▪ very-wideband digital/analog phase shifters with multiple achievable values of phase shift.
▪ hybrid devices, comprising both attenuation and phase shifting capabilities into a single device.
▪ miniaturized antennas and antenna arrays permitting monolithic integration of the radiating element with the abovementioned components.
thus, depending on the specific desired functionalities, rf-mems devices that could be adopted for implementation of the 5g standard should satisfy different specific constraints. however, despite the variety of factors that may influence the requirements for a particular component, it is possible to quantitatively outline a set of reference performance to be pursued, which is reported in table 1.
as reported in [10][11], in the framework of mems devices, this set of reference performance must be matched with the separate and underlying issue of reliability. the goal of the present work is to provide a general overview about the state of the art of some rf-mems devices, which are considered as key components of rf front ends in the abovementioned 5g scenario. switches are taken into account in section 2 as one of the first components to be implemented by this technology, and for their massive presence in almost every complex rf-mems and non-rf-mems device. the section is complemented by a set of common guidelines for an optimal design of rf-mems switches. packaging is considered in section 3 as a critical step to enhance reliability, and to guarantee the practical integration of such devices on common platforms. since beam forming techniques are basically performed by attenuating and phase-shifting the signal feeding a radiating element, phase shifters are discussed in section 4, including their most common architectures. for the same reason, attenuators are treated in section 5. as critical components in every rf front end, filters are treated in section 6. these different topics are presented by explaining their basic working principles, and examining several research contributions available in the scientific literature.
table 1 desirable characteristic features for devices operating in 5g network.
feature | desired value | motivation
frequency range | <6 & (24.25-71) ghz | the diversification of the infrastructure imposes such wide range. mobile terminals will have to operate correctly when connected to base stations or cells.
isolation | > -30 or -40 db | a crucial feature in reconfigurable switch-based passives, it should be as high as possible.
loss | < -1 db | given the multiple components housed in common rf front ends, single losses should be minimized along the widest achievable frequency span.
cross-talk | < -50 or -60 db | presence of mimo systems and closely spaced components prescribes maximum isolation between adjacent channels.
switching time | < 200 or 300 µs | the low latency required by the diversified network requires fast commutation of switching devices.
control voltage | < 2 v | compliance with typical bias voltages of circuits and avoidance of ad hoc additional circuitry in mobile devices.
2. switches
rf-mems based switches represent one of the early and principal components in this field of research. they have been investigated since the second half of the 1990s and they are part of almost all complex networks and devices implemented in such a technology. in order to play a prominent role in the current 5g paradigm, switches have to comply with the specific requirements reported in table 1, together with a satisfying reliability, which should be greater than one billion cycles in some applications. many valuable research solutions addressed the above-mentioned requirements in recent years, although trade-offs are generally needed between electrical and mechanical performances. from the point of view of classification, the movable membrane allowing or avoiding the electrical connection of the signal line to the output port, or to rf ground, may realize an ohmic (metal-metal) or capacitive (metal-dielectric) contact; on the basis of the contact, the switch will be ohmic or capacitive. the target of the connection (signal line or ground) determines the series or shunt nature of the device; hybrid series-shunt implementations are also achievable by particular arrangements [12]. the membrane may move vertically or horizontally, while its structure may be based on a cantilever or a clamped-clamped type, employing different shapes of beams and membranes.
actuation may take place by piezoelectric, electrothermal, electromagnetic or electrostatic transduction mechanism, with the last one being the most widely employed. from a practical point of view, isolation performances of ohmic switches are quite broadband, limited only by the capacitance between the movable membrane and the signal line, when the membrane is not actuated. this capacitance determines a decreased impedance as frequency increases, affecting overall isolation of the device. in case of capacitive switches, the contact point is covered by a dielectric layer. this determines a low capacitance in the non-actuated “off” state, and a high capacitance when the membrane is actuated in the “on” state. since this limits the isolation performances at both high and low frequencies, capacitive switches can be considered more narrow-band. due to this variation of their capacitance values, capacitive switches are usually evaluated on the basis of their “on/off” capacitance ratio, which should be as high as possible. additional details concerning advantages or vulnerabilities of such devices in terms of structure, performances, and frequency range of usability are investigated in depth in [11]. considering the large amount of literature in this field, some interesting proposals of the last half decade are considered. to this end, table 2 is provided for a better comparison of the discussed research items in terms of their characteristic parameters.
2.1. ohmic switches
considering ohmic switches, [13] presents a buckle-beam structure, whose actuation follows a vertical movement to connect the two interrupted sections of the signal line, thanks to a buried electrode. the beam structure is based on a chevron thermal actuator, in which the provided voltage causes heating and deformation of the structure, with consequent horizontal displacement and de-actuation. its working principle is represented in figure 1a.
besides its remarkable electrical performances within the measured (0-20) ghz range, the device based on silicon substrate has been thermally characterized in the range from -20 to 140°c to test possible detrimental effects of common packaging processes on the original structure. the results demonstrate minimal variations of pull-in (actuation) voltage and losses.
with the target of a device operating at d band ((110-170) ghz), [14] proposes an spdt (single pole double throw) switch integrated in bicmos (bipolar cmos) technology, based on an initial t-junction connected to two terminal spst (single pole single throw) switches. each spst switch is encompassed in a cavity containing two actuation electrodes on both sides of the elevated signal line, an overhead membrane, and a metallic plate for protection on top of the cavity, as reported in figure 1b. the actuation of the membrane will connect the signal line to ground, isolating the corresponding output port. by choosing an actuation voltage of 60 v, the authors reached switch-on and switch-off times smaller than 10 μs.
another substantial contribution comes from [15], in which the traditional series ohmic cantilever switch is provided with two additional cantilevers allowing a shunt connection to ground of the cpw (coplanar waveguide), as shown in figure 1c. actuation of additional shunt connections has a dual purpose: decreasing the voltage swing on the main series contact and increasing isolation of the output port. this protection mechanism has proven to extend reliability of this 400x300 μm2 device up to 500 million cycles with 1 w rf power. besides electrical performances, the switching time is remarkable too, being 30.4 μs and 39.8 μs for switch-on and switch-off time, respectively.
fig. 1 (a) movable membrane reported in [13], (b) cross section of switch cavity reported in [14], (c) surface current distribution on switch realized in [15] when lateral cantilevers are actuated to ensure maximum isolation.
in [16] a traditional series ohmic switch based on gaas mmic (monolithic microwave integrated circuit on gallium arsenide) technology is discussed, with a series of etched holes on the surface of the membrane, in order to decrease the pull-in voltage. power handling characterization demonstrated acceptable performances up to 36 dbm, while reliability of 2.4x10^7 cycles has been verified.
a more innovative design is reported in [17], where a laterally moving switch is proposed in order to connect the two terminations of the signal line. the movable section is always in touch with the input section, and it comes in touch with the output section by electrostatic actuation. triangular beams in the central part of the membrane structure provide mechanical restoring force, to move the section back to its rest position. a voltage of 90 v must be applied to ensure proper contact between the signal lines, with 38 μs and 57 μs for switch-on and switch-off time, respectively. performances for an application in the 5g scenario have been considered at 3.5 ghz and 28 ghz. more complex networks have been implemented by this basic switch, including spdt, sp3t (single pole 3 throws), and sp6t (single pole 6 throws), shown to be operational beyond 1 billion cycles with 1 w rf power. the footprint of these devices based on alumina substrate does not exceed 0.7 mm2.
another design has been simulated in [18], comprising: a cantilever vertical switch, with serpentine shaped beams, a center membrane and two free ends (contacts) provided with holes.
the presence of holes on membrane and free ends, a 2.4 μm air gap, the choice of material and serpentine shaped beams allowed authors to reach a simulated low actuation voltage and commendable switch-on time of 55 μs. moreover, the presence of two contact points is intended to increase both reliability and current-carrying capabilities of the device, decreasing the occurrence of detrimental surface nonidealities on the entire contact area.
2.2. capacitive switches
considering capacitive switches, significant contributions have been proposed by [19][20][21], covering different aspects. the first one concerns a 4-beam cantilever structure: each beam is placed on a corner of a square area, each one provided with its shared tether and two springs. stability of the structure against stress gradient is guaranteed by a central joint. underlying dimples are employed to prevent a collapse of the structure and instability of capacitance value. fabricated on quartz substrate, this 400x400 μm2 device has a switch-on time of 55 μs. besides the original design, depicted in figure 2a, such work is remarkable in terms of power handling capabilities: the device has been characterized in its pull-in and de-actuation voltage, showing acceptable variations within (1-12) w. hot switching conditions are also evaluated, demonstrating little sensitivity of the capacitance to temperature variations within 25-125°c.
focused on ka-band, and on a design minimally affected by hard fabrication temperatures, [20] proposes a folded-leg structure characterized after packaging process. both actuation voltage and up-state capacitance exhibited minimal changes after 200°c thermal treatment, demonstrating a stable behavior of such quartz-based device during the fabrication process. equally focusing on ka-band, [21] offers a device provided with an actuation electrode on both sides of signal line.
vertical movable membrane actuates in just 15 μs, and it features large perforated areas between surfaces devoted to actuation and capacitive coupling, connected to cpw ground by meandered beams. as indicated by table 2, measured electrical parameters are remarkable, and this is also due to the high capacitance ratio of this roughly 1 mm2 device, fabricated on high resistivity silicon.
the implementation proposed in [24] is marked by remarkable performances as well, since it exhibits minimal insertion loss along a wide frequency range, with a reduced actuation voltage (3.5 v). in fact, when all its membranes are actuated, an insertion loss smaller than 0.1 db is achievable in the (0-25) ghz range. the simulated 640x820 μm2 structure is based on silicon substrate and it is embodied in a cpw configuration with four cantilever membranes, each one provided with a single serpentine shaped beam. activation of a single membrane or of multiple membranes determines different configurations, in which return loss is smaller than -20 db along intervals varying from 1.6 to 16.1 ghz.
targeting k-band satellite applications, [22] introduces a simple and effective design on silicon nitride, simultaneously achieving low actuation voltage (4.7 v) and insertion loss (0.3 db) up to 45 ghz. capitalising on meandered beams, a traditional rectangular perforated membrane with stepped profile, and 1.7 μm air gap, a structure with 175 μs switch-on time and high capacitance ratio (132) is obtained.
fig. 2 (a) top view of the concept described in [19], (b) conceptual representation of capacitive switch realized in [23].
a more complex device is suggested by [23], combining thermal actuation and electrostatic holding: the first for the actual downward or upward movement, and the second for holding the actuated membrane. in this way, low actuation voltage is achieved, still involving low power consumption.
as visible in figure 2b, the moving membrane is composed of two metal layers: the first one for upward (c and d electrodes) and the second for downward actuation (a and b electrodes), with an interposed dielectric layer. the holding electrode (e) completes the device, attracting the first layer and keeping the whole membrane actuated. this double electrothermal mechanism is employed also in case of stiction, in order to provide additional restoring force when the membrane is stuck because of micro-welding or charges trapped in the dielectric. in fact, its measured electrical properties remained constant up to 21 million cycles, while no stiction took place even after 80 million cycles. with an actuation and holding voltage of 0.3 and 15.4 v, this 160x300 μm2 device demonstrates high isolation up to 20 ghz.
table 2 comparison between the different discussed realizations, including ohmic and capacitive switches. reported values of return loss (s11), insertion loss (s12 when closed), or isolation (s12 when open) refer to the worst value along the measured interval.
work | type | measured @ (ghz) | ret. loss (db) | ins. loss (db) | isolation (db) | actuation (v)
[13] | ohm | 0-20 | 25 | 0.45 | 20 | 90
[14] | ohm | 140 | - | 1.42 | 54.5 | 60
[15] | ohm | 0-40 | - | 0.48 | 36 | 60
[16] | ohm | 1-40 | 20.09 | 0.25 | 22.5 | 17.5
[17] | ohm | 3.5&28 | 21&11.4 | 0.28&2.82 | 28 | 90
[18] | ohm | 1-40 | 15 | 0.6 | 19 | 3.1
[19] | cap | 20 | 8 | 1.1 | 12 | 55
[20] | cap | 35 | 18 | 0.3 | 35 | 25
[21] | cap | 35 | 27 | 0.29 | 20.5 | 18.3
[24] | cap | 5 | 28 | 2 | 49 | 3.5
[22] | cap | 27 | 13 | 0.3 | 35 | 4.7
[23] | cap | 2.4 | - | 0.28 | 31 | 15.4
2.3. set of reference guidelines in design and optimization of multiphysics design of rf-mems
all the previously mentioned realizations exhibit respectable performances, covering different portions of the spectrum allocated to the current 5g paradigm.
bearing in mind the target requirements, some of the mentioned solutions are more favorable in terms of electrical performances, while others may be more convenient for their reliability, and others, again, for the possibility to be easily integrated with other circuits. further solutions, instead, score viable trade-offs between all the mentioned characteristics. in any case, the existing vulnerabilities of a certain design concept can be properly addressed, depending on their nature. to this end, as pointed out in [11], there are different guidelines that may help a designer involved in the optimization of a particular device. starting from the electromechanical behavior of rf-mems, a shorter switching time can be achieved by acting on the actuation principle, as well as on the structural or electrical features of the device at stake. piezoelectric actuation generally guarantees a faster actuation, but it is prone to unwanted parasitic actuations [25]; in particular, mismatches of lattice or thermal expansion coefficients of the different layers will determine the presence of residual stress in the structure, leading to cracks and bends [26]. from the structural point of view, multiple and large holes on the movable membranes are a wise option, since they increase the speed of actuation by reducing the damping of air. this is one of the criteria behind the design of the membrane reported in [22]. another option consists in modulating the amplitude of the biasing waveform, as discussed in [27]. while a smaller voltage (as compared to the pull-in voltage) can be adopted to keep the membrane actuated, a short and relatively high voltage peak can accelerate the movement of the membrane. an easier integration on common wireless or mobile devices usually requires low actuation voltages, compatible with typical supply voltages of cmos technology (roughly from 1.5 to 6 v). such low voltages avoid the presence of converters, saving costs and space on chips.
another beneficial effect is an enhanced reliability, provided by the reduction of mechanical shock on the movable membrane. within the framework of electrostatic actuation, there are three main structural strategies to achieve such low voltages: decreasing the spring constant, maximizing the actuation area, and decreasing the air gap. the spring constant of the movable structure can be lowered by adopting folded or meandered beams with reduced thickness, as in [20][22]. the negative consequence of a reduced spring constant of a structure is an attenuation of its restoring force, which, in turn, determines an increased risk of stiction (missed release of the actuated membrane after the zeroing of the actuation voltage). this issue can be counteracted by inserting additional electrodes, which could help the proper release of the membrane; but sometimes this solution is not affordable: beside the significant cost in terms of complexity of the design, in some cases, the housing of electrodes above the membrane may excessively complicate the fabrication process. in those cases, a viable option is represented by a proper exploitation of packaging, as discussed in section 3. maximizing the actuation area consists in maximizing the overlapping area between the movable structure and the fixed electrodes; for this purpose, designers would prefer membranes with relatively large areas devoted to actuation, and large electrodes, as in [21]. the decrease of the air gap is a viable option, but it should be carefully managed in case of capacitive switches. in that case, a small gap between the non-actuated membrane and the underlying contact layer would significantly increase the off-state capacitance. however, the corresponding limitation in the achievable isolation can be compensated by matching circuits that mitigate the parasitic capacitance [28][29]. 
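the three structural strategies listed above (lower spring constant, larger actuation area, smaller air gap) can be related to each other through the textbook parallel-plate approximation of the pull-in voltage, v_pi = sqrt(8 k g0^3 / (27 ε0 a)). the python sketch below is only illustrative: it assumes an ideal parallel-plate actuator without fringing fields or dielectric layer, and the values chosen for k, a and g0 are hypothetical, not taken from any of the cited works.

```python
import math

EPS0 = 8.854e-12  # vacuum permittivity, F/m


def pull_in_voltage(k, area, gap):
    """Parallel-plate pull-in voltage in volts.

    k    : spring constant of the movable structure (N/m)
    area : overlap area between membrane and fixed electrode (m^2)
    gap  : air gap at rest (m)
    Idealized model: fringing fields and dielectric layers are neglected.
    """
    return math.sqrt(8.0 * k * gap ** 3 / (27.0 * EPS0 * area))


if __name__ == "__main__":
    # Hypothetical baseline: k = 10 N/m, 200 um x 200 um electrode, 2 um gap.
    area = 200e-6 * 200e-6
    base = pull_in_voltage(10.0, area, 2.0e-6)
    print(f"baseline:            {base:.2f} V")
    print(f"half spring const.:  {pull_in_voltage(5.0, area, 2.0e-6):.2f} V")
    print(f"double area:         {pull_in_voltage(10.0, 2 * area, 2.0e-6):.2f} V")
    print(f"half air gap:        {pull_in_voltage(10.0, area, 1.0e-6):.2f} V")
```

under this approximation, halving the gap lowers the pull-in voltage by a factor of about 2.8, whereas halving the spring constant or doubling the area gives a factor of about 1.4 each, so the gap is the most sensitive of the three parameters.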
better electrical performances of switching devices can be obtained by a wise selection of the structure and materials, or by introducing additional electrical components. in particular, structural modifications may involve the deployment of additional shunt membranes on a common series ohmic spst switch. this choice is the one adopted in [15], demonstrating beneficial effects on isolation. another strategy regarding series ohmic switches is reported in [30], in which a triangular cantilever membrane acts as a tapered section of the transmission line, reducing the return loss of the device. in case of capacitive switches, the achievable isolation is mainly limited by the up-state capacitance, while insertion loss is bounded by down-state capacitance [31]. for this reason, a common practice is to maximize the capacitance ratio. the ratio can be maximized at structural level by adopting bending structures or by introducing floating metal layers. bending structures, such as warped beams or plates, represent a viable option, but they may lead to an increased actuation voltage. the addition of a floating layer generally represents a better solution: the metal plate on top of the upper dielectric layer does not affect the up-state capacitance, but it acts as a wider upper plate of a capacitor when the membrane is actuated. this leads to a higher down-state capacitance. the solution of a floating metal layer should be preferred also because of a practical issue: the surface roughness of the dielectric is generally not accounted for in simulations. actuated membranes with a non-uniform adhesion to the underlying dielectric layer, or surface roughness, will determine an interposed layer of air, decreasing the achievable down-state capacitance [32][33]. the adoption of the floating layer makes it possible to overcome these practical issues.
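the effect of an interposed air layer on the capacitance ratio can be estimated with a simple series-capacitor model, in which the air layer and the dielectric layer are stacked between two parallel plates. the following python sketch is an illustration under stated assumptions, not a model taken from the cited works: fringing fields are neglected, the residual air layer is treated as uniform, and the dielectric thickness (100 nm), relative permittivity (7.5) and air gap (2 μm) are hypothetical values.

```python
EPS0 = 8.854e-12  # vacuum permittivity, F/m


def stacked_capacitance(area, t_dielectric, eps_r, air_gap):
    """Capacitance (F) of an air layer in series with a dielectric layer.

    The dielectric of thickness t_dielectric and relative permittivity eps_r
    is equivalent to an air layer of thickness t_dielectric / eps_r.
    """
    return EPS0 * area / (air_gap + t_dielectric / eps_r)


def capacitance_ratio(area, t_dielectric, eps_r, gap_up, residual_air=0.0):
    """Down-state over up-state capacitance.

    residual_air models the thin air layer left by surface roughness or
    non-uniform adhesion when the membrane is actuated (0 = ideal contact).
    """
    c_up = stacked_capacitance(area, t_dielectric, eps_r, gap_up)
    c_down = stacked_capacitance(area, t_dielectric, eps_r, residual_air)
    return c_down / c_up


if __name__ == "__main__":
    area = 100e-6 * 100e-6  # hypothetical 100 um x 100 um overlap
    ideal = capacitance_ratio(area, 100e-9, 7.5, 2.0e-6)
    rough = capacitance_ratio(area, 100e-9, 7.5, 2.0e-6, residual_air=20e-9)
    print(f"ideal contact:        ratio = {ideal:.1f}")
    print(f"20 nm residual air:   ratio = {rough:.1f}")
```

with these hypothetical numbers, a residual air layer of only 20 nm reduces the ratio to less than half of its ideal value, which illustrates why a floating metal layer, whose contact with the membrane is not affected by such an air layer, is attractive.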
regarding the substrate and the dielectric, the choice of materials and their thickness is an important subject. good electrical performances are usually obtained by using substrates with high resistivity, such as quartz, glass, high-resistivity silicon, gallium arsenide (gaas) or silicon-on-glass (siog) [34][35][36]. in case of capacitive switches, dielectric materials with a high permittivity and arranged in a very thin layer are preferable, since they ensure a high capacitance when the membrane is actuated. the main candidates are piezoelectric lead zirconate titanate (pzt), strontium titanate oxide (srtio3), tantalum pentoxide (ta2o5) and hafnium dioxide (hfo2) [37][38][39]. however, while some of them are relatively expensive, the thickness of the dielectric must be chosen according to the available fabrication process, and it must sustain the pull-in voltage without a breakdown. the introduction of additional electrical components generally involves the deployment of shunt inductors or sections of transmission lines. in fact, the beams supporting the movable structure introduce some parasitic inductance, affecting the isolation of the device. this inductance is generally compensated by introducing matching sections or reactive components [29]. reliability is an important issue, investigated in depth in the framework of rf-mems devices, thus different strategies have been proposed so far. the occurrence of stiction can be mitigated or counteracted by some of the already mentioned strategies, such as the insertion of additional electrodes to enforce the release, the decrease of pull-in voltage, the modification of the voltage actuation waveform or the adoption of additional shunt membranes. besides these techniques, other expedients regarding the structure and employed materials can be listed.
from a structural point of view, the most common are the use of frames with reinforcing bars on the movable structure, and stopping pillars or thin posts under the membrane [40]. the underlying idea is to prevent deformations of the membrane and its complete adhesion to the dielectric layer, in which residual trapped charges may be present. other possibilities to overcome the problem of trapped charge are the adoption of lateral actuation mechanisms, as in [17], or the more "radical" option of a dielectric-less design [41]. an increased lifetime of the device can be achieved also by a proper selection of materials. in case of ohmic switches, stiction is mainly caused by increase of resistance, welding, fusion or transfer of metal contacts. this is generally prevented by the adoption of harder metals, such as molybdenum (mo), palladium (pd), tungsten (w) and rhodium (rh), often organized in multilayers [42][43][44]. in case of capacitive switches, besides charge injection and trapping, reliability is threatened by mechanical shock and the breakdown of the dielectric layer. these problems can be mitigated by using layers that may rapidly release charges, such as aluminium nitride (aln) and pzt [45], by removing surface roughness and nonidealities, or by adopting robust structures with double dielectric layers.
3. packaging
when talking about reliability of mems components, it is not possible to omit the closely related field of packaging. besides enclosing mems structures in a “sealed” space, less sensitive to changes of temperature or humidity, packaging avoids contamination with harmful agents and external mechanical shocks. however, this indispensable interface to the outer world introduces additional costs of production, increased volume and impairment of the performances of the original (naked) mems device. the two main choices for packaging at technology level are chip-scale and wafer-level package.
the latter is the most widely employed for its lower cost, efficiency in packaging process, reduced size and decreased impairment of performances. such a technique can rely on two fundamental approaches: application of a cap and thin-film packaging. the former approach adopts silicon-based materials as a cap covering the device, as depicted in figure 3a. typical materials for the realization of the cap are represented by benzocyclobutene (bcb), low-temperature co-fired ceramic (ltcc), silicon (si), and quartz (sio2). depending on the presence (or absence) of a sealing material, interposed between the cap and the device wafer, intermediate bonding (or direct bonding) techniques are employed to seal the cap. intermediate bonding is generally more convenient because of its relatively low temperatures ((200-400)°c), since direct bonding techniques may require up to 1000°c (in case of fusion bonding, as reported in [46]). the cap is placed on the mems device by optical alignment and typically sealed by gold (au), sio2, bcb, or su-8.
fig. 3 (a) general practical implementation of capping reported in [47], (b) detail of quartz capped sp4t (single pole 4 throw) switch, reported in [48].
examples are reported in [48][49][50][51] showing switches encapsulated by intermediate bonding techniques at temperatures in the range (200-250)°c, which have not been particularly impaired by packaging procedures. in particular [48] presents various devices, including the sp4t switch represented in figure 3b, protected by a quartz cap fixed on substrate by epoxy polymer su-8. depending on the substrate flatness, two fabrication methods are proposed: one for substrates with cavities and holes, and one for flat substrates. experimental results of electrical parameters show minimal degradation of insertion loss (an increase of 0.1-0.2 db) up to 30 ghz, providing a capped device not exceeding -0.7 db up to 40 ghz.
return loss is slightly impaired by the impedance mismatch introduced by the cap, reaching values up to 10 db in the range up to 20 ghz. in the remaining portion of the measured spectrum, the return loss of the capped device matches that of the uncapped version. encapsulation did not drastically affect electro-mechanical properties. while for a shunt capacitive switch there was no change in capacitance and actuation voltage was just slightly increased, a measured sp4t switch showed a minimal change in series resistance and a decrease of actuation voltage. a more recent contribution comes from [52], in which a silicon cap bonded by spin-coated bcb is accurately modeled and added to different switch devices. the different mounted caps have been tested in terms of resistance to shear stress, by applying horizontal force only to the cap while keeping the wafer fixed. packages exhibited a high average break force (force needed to detach the cap) of 21 n. the increase of insertion loss due to the proposed package is around 0.2 db, with an average increase of actuation voltage of 4.2 v among the nine measured switches. all capped switches show remarkably low insertion loss (<-0.44 db), and reasonable return loss (<-20.9 db) all over the 40 ghz measured interval, operating for more than 10.2 billion cycles. the approach of thin film packaging generally involves a sacrificial layer placed on movable parts of a mems device, covered with a thin film layer. such layer will be perforated to remove the sacrificial layer, and then covered with a proper layer for final sealing, as shown in figure 4a. in this case, gold [53] and silicon-based materials are commonly employed solutions for the thin film, as observed in [47]. such materials, besides various polymers, can be adopted also as sealing layer, generally patterned by deposition, evaporation and electroplating, spin-coating or plasma-enhanced chemical vapor deposition (pecvd).
examples of such technology are reported in [53][54][55][56], showing encapsulated devices still maintaining remarkable rf performances after the packaging process. a more recent proposal comes from [57], in which a protective dielectric shell comprises an electrode, biased to ensure maximum vertical upward displacement of the movable membrane of a capacitive switch. this is beneficial both for the capacitance ratio and reliability aspects. possible configurations of this device are represented in figure 4b. contact holes and release channels are etched on sides of the shell to remove the sacrificial layer; a si3n4 layer is then deposited by pecvd to hermetically seal the structure. measurements over hours of operations with square bipolar biasing voltages show that both actuation voltage and capacitance suffer from minimal variations, leading to an increase of just 0.2 db in insertion loss.
fig. 4 (a) general practical implementation of thin-film packaging reported in [47], (b) cross section of the device reported in [57] in both its states.
an interesting contribution is represented by [58], in which an investigation regarding the amount of perforated area on the thin film for the proper release of encapsulated devices is reported. in this work, thin film layer is perforated with holes of different diameters (7 and 8 μm), and different realizations of perforated films are obtained, each one with a different percentage of perforated area. after the identification of the most appropriate percentage (at least 57%), different combinations of power and time are considered for the proper release of the considered devices. complete 3d fem (finite element method) models of the considered realizations on gold cpw have been implemented and simulated, showing no significant variations of return or insertion loss introduced by the packaging phase.
table 3 comparison between the different discussed packaging implementations, including both cap and thin film solutions. the provided dimensions, in the order of mm and tens or hundreds of µm, refer to the whole packaged device.
work  type  cap or film/sealing materials  dimensions (µm)  ins. loss (db/@ghz)  isolation (db/@ghz)  temp. (°c)
[49]  cap  ltcc/au  1.4x0.9x0.8  0.5/5  20/5  250
[50]  cap  bcb/bcb  2mm2x28  0.7/30  18/30  250
[51]  cap  si  5000x4000x950  0.5/6  24/2.5
[48]  cap  sio2/su-8  0.7/30  200
[52]  cap  si/bcb  0.4/35  35/35  250
[53]  film  au/sio2  27/2  <120
[54]  film  si3n4/au  75x80x 1.2  30/2
[56]  film  na/sio2  x x4  0.45/142  51.6/142  200
[55]  film  sio2/alx2010  <600x600x  0.25/12  35/12
[57]  film  si3n4  70x50x  2/25  8/25
4. phase shifters
being vital components in the architecture of phased arrays and rf front ends, phase shifters are abundantly employed in radar systems (l, s, c, x bands), as well as for mobile and satellite communications (ku, k, ka bands). current developments in mobile communications, with the adoption of millimeter wave frequencies, have raised a marked interest in mems-based phase shifters. consequently, the literature regarding this field is quite populated. the first implementations of phase shifters based on rf-mems were proposed in the late '90s, both concerning digital [59] and analog [60][61] realizations. shortly after, the main topologies concerning the architecture of the proposed phase shifters were defined, according to the following classification: switched-line (sl), loaded-line (ll), distributed mems transmission line (dmtl), and reflection-type (rt) phase shifters. as reported by more comprehensive books and overviews, such as [62][63][64], the working principle of switched-line devices is the switching between signal paths of different electrical length, in order to achieve the desired phase delay.
however, when a wide phase shifting range or multiple small phase steps are required, such a strategy may lead to bulky arrangements. conceptually similar, the basic idea of loaded-line phase shifters is to load a transmission line with two reactive impedance networks. often realized with stubs, capacitive or inductive loads will cause phase lags or leads respectively, while the transmission line section will ensure impedance matching. as in the case of switched-line devices, extended shifting performances will lead to bulky phase shifters. in the case of dmtl, the transmission line of the signal path is periodically loaded with varactors, changing the distributed capacitance of the line and, in turn, the line phase constant and, finally, the resulting phase delay at the output port. when wide shifting or precision performances are required, the presence of multiple mems capacitive membranes leads to arrangements with pronounced area and high aspect ratio, prone to reliability issues. rt phase shifters typically employ quadrature couplers with equal and tunable reflective loads connected to the coupled ports, controlling the phase delay of the reflected signal. although this approach may seem less troublesome, it is characterized by an underlying trade-off between the achievable shift range and the insertion loss. such a trade-off is due to the need for effective modeling, design and matching of the coupler and of the reflective load.
as highlighted in [65], although phase shifters based on rf-mems represented a powerful solution to simplify the architecture of phased arrays, some substantial improvements were mandatory at that time. the main requirement was a general reduction of the physical footprint. other concerns were mostly related to the switching units employed within the phase shifters, the most relevant ones being limited reliability, high switching time and limited power handling capabilities.
these factors confined early rf-mems phase shifters to arrays with relatively slow scanning and limited radiated power, while the physical size prevented the integration when low microwave frequencies were at stake (<20 ghz). many research solutions of the last decade remarkably faced the challenge of increasing the shifting range and phase precision, and considerable steps ahead have been made in the direction of miniaturization, still lowering insertion loss. tables 4 and 5 are provided along the present section for a better comparison of the discussed research items in terms of their characteristics.
4.1. switched-line
concerning switched-line topologies of rf-mems phase shifters, some relevant contributions are reported in [66][67][68]. in particular, the device presented in [66] and shown in figure 5a has a wide shifting range and remarkable power handling capabilities (5 w), at the expense of losses (-3.1 db insertion loss) and footprint (7.5x6 mm2). the employed sp2t and sp4t switches demonstrated very limited switching time (5 μs) and high reliability (100 m cycles), despite a rather high actuation voltage of 90 v. since both [67][68] propose 1-bit 180° phase shifters, their intrinsic simplicity allows lower losses and a smaller area (9 and 15 mm2). additional suggestions are offered by [69][70][71][72]. in particular, [69] suggests a 4-bit phase shifter for lte base stations, fabricated on a common pcb substrate. the path from the input to the output port is composed of two sections, each one provided with four paths of different electrical length, as depicted in the block scheme of figure 5b. the switching is performed by two spdt switches on both sides of a section, actuated by a 90 v signal in 10 μs. a 5-bit architecture focused on the ku band is described in [70]. as reported in figure 5c, out of the three sections in which the device is divided, two are characterized by four paths with different electrical length selected by a sp4t switch on each side.
the remaining section contains just two paths, selected by a couple of sp2t switches. by applying a 53 v pull-in voltage, the switch-on time is 28 μs for both kinds of micro-relays. the overall area of the device is 3.19x5.17 mm2. in both cold and hot switching conditions, the device has been demonstrated with more than 10 million cycles from 0.1 to 1 w of rf power. the 3-bit device reported in [71] has been characterized in the ka band, and it is composed of two sp8t switches selecting one of the eight available signal paths. with a total area of 5.9 mm2, this phase shifter based on alumina substrate employs switches with actuation at 65 v. reliability measurements at room temperature reported a satisfactory behavior beyond 100 million cycles for the range of power (0.1-1) w, and beyond 30 million cycles for 2 w of rf power. possible improvements for the two previous works are reported in [72], together with a dmtl solution for a k band phase shifter by the same authors. another recent proposal is provided by [73], with two realizations of 4-bit phase shifters for k and ka frequency bands, in both cases composed of 4 sections comprising a reference and a shifting path. a couple of monolithically integrated mems spst switches control the activation of each path by a 45 v signal, allowing 16 phase states. fabricated variants for k band show a footprint of 10.5x7.5 mm2.
fig. 5 (a) switched-line phase shifter described in [66], (b) block scheme of the device reported in [69], (c) top view of architecture proposed in [70].
4.2. loaded-line
regarding loaded-line architecture, noteworthy results are obtained in [74][75]. the first one describes a device composed of a transmission line periodically loaded by a total number of eight 4-bit tunable capacitive loads. the control of capacitors is achieved by (lateral) electrostatic and (vertical) electrothermal actuation.
the overall 16 states allow the digital capacitors to be tuned from 15.5 to 111.6 ff, changing the phase velocity of the line and determining a maximum phase shift of 337.5°. while electrostatic actuation requires 20 v, electrothermal actuation requires just 4.8 v. the proposed phase shifter has been designed to be monolithically integrated with conventional cmos, and involves an overall footprint of 4.32x0.5 mm2, although the aspect ratio is still relatively high. another digital phase shifter is offered by [75], i.e. a 3-bit device allowing a total 315° of shift. in a more classic design, quarter wavelength microstrip sections are periodically loaded with different susceptances, sequentially composing 180°, 90° and 45° shifting sections, within an area of 20.5x6.5 mm2.
table 4 comparison between the different discussed realizations of phase shifters, comprising switched-line and loaded-line topologies. for each realization, the reported return loss, insertion loss, and phase error refer to the worst case among the various observed states.
work  type  measured @(ghz)  ret. loss (db)  ins. loss (db)  min shift (°)  max shift (°)  error (°)
[66]  sl  1  15  3.1  5.625  337.5
[67]  sl  30  1.5  0  187  7
[68]  sl  24  19.2  1.65  0  184.6  4.6
[69]  sl  2.7  13.4  0.89  5  75  2.05
[70]  sl  17  22  3.72  11.25  348.75  1.14
[71]  sl  35  14  5.07  45  315  1.78
[73]  sl  19.5  15  6.8  22.5  180  4.5
[74]  ll  32  15  3.9  45  337.5
[75]  ll  2.5  8.5  4  45  315  7.9
4.3. distributed mems transmission line
concerning dmtl topology, multiple suggestions have been provided even only in the last half decade. different examples are reported, covering a certain variety of possible arrangements and improvements. the most classic examples of dmtl design are reported in [76][77][78].
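the dmtl working principle recalled at the beginning of section 4 (the bridges change the distributed capacitance of the line, hence its phase constant and the phase delay) can be illustrated with a minimal numerical sketch. it relies on the standard lossless transmission line relation β = ω·sqrt(L·C) and assumes operation well below the bragg frequency of the periodic structure; all function names and component values below are hypothetical and are not taken from any of the cited works.

```python
import math

def loaded_line_phase_deg(freq_hz, l_per_m, c_per_m, c_bridge_f, spacing_m, n_cells):
    """Phase delay (degrees) of n_cells periodic cells of a lossless loaded line.

    Assumption: the frequency is well below the Bragg frequency, so each bridge
    can be treated as an extra distributed capacitance c_bridge_f / spacing_m.
    """
    c_total = c_per_m + c_bridge_f / spacing_m                     # F/m, line plus bridges
    beta = 2.0 * math.pi * freq_hz * math.sqrt(l_per_m * c_total)  # phase constant, rad/m
    return math.degrees(beta * spacing_m * n_cells)

def dmtl_phase_shift_deg(freq_hz, l_per_m, c_per_m, c_up_f, c_down_f, spacing_m, n_cells):
    """Differential phase shift between bridge-down and bridge-up states."""
    up = loaded_line_phase_deg(freq_hz, l_per_m, c_per_m, c_up_f, spacing_m, n_cells)
    down = loaded_line_phase_deg(freq_hz, l_per_m, c_per_m, c_down_f, spacing_m, n_cells)
    return down - up

# hypothetical example: 30 GHz, 100-ohm unloaded line, 10 bridges spaced by 300 um
shift = dmtl_phase_shift_deg(30e9, 5e-7, 5e-11, 20e-15, 60e-15, 300e-6, 10)
print(f"differential phase shift: {shift:.1f} deg")
```

under these assumptions the shift grows linearly with the number of cells, which is consistent with the remark that wide shifting ranges lead to long, high aspect ratio layouts.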
since a dmtl phase shifter is basically composed of a transmission line loaded with mems bridges, the work in [77] is the study of a simple one-bit dmtl cell aimed at the improvement of a basic switched-line 1-bit phase shifter in the range 1-6 ghz. such study reports that adding a mems bridge on the shifting signal path introduces an improvement of 40% and 25% as compared to initial size and losses, respectively. despite the traditional design, an innovative proposal is discussed in [76]. in this work, the signal line is loaded with 26 capacitive bridges, with 7 possible lengths. since longer bridges need a lower actuation voltage (10 v) as compared to shorter bridges (40 v), each buried electrode controlling a bridge is connected to the same dc voltage pad. in this manner, a single signal will actuate all bridges characterized by the same length, determining a precise phase shift. such design avoids the presence of multiple control pads, reducing the footprint of this 7-state device to 7x0.8 mm2.
fig. 6 (a) schematic section of dmtl phase shifter described in [83], (b) top view of 4-bit device offered by [84], (c) schematic of the arrangement proposed in [85].
following the traditional "straight loaded line" approach, [78] emphasizes the importance of miniaturization for movable components. in particular, the fabrication of bridges with a scaled area allowed the authors to design a basic 30° delay cell with 1x0.66 mm2 area, reduced actuation voltage (14.6 v), and low switching time (0.6 μs). the promising performances of the cell thus permitted to cascade six of them, obtaining a 180° phase shifter with 4x1 mm2 area on quartz substrate. an improvement of the basic bridge cell loading the transmission line is provided by [79]. besides the standard actuation electrode, two additional electrodes are introduced: each one on a side of the signal line, on a higher pillar.
in this manner, besides the non-actuated state, each basic cell may achieve two different capacitive states, extending its reconfigurability. by cascading 32 of such cells, the authors propose a 6-bit phase shifter with a maximum shift of 180°, by 5.625° steps. characterized by reasonably low actuation voltages (6.8 and 3.4 v), the device exhibits an overall length of 12.8 mm. additional remarkable works are represented by [80][81][82], comprising dmtl configurations working in the part of spectrum from x band to 60 ghz, with considerably small losses and phase error, investigated in depth also in terms of reliability.
a more compact and innovative design is proposed in [83], with two horizontal parallel fixed plates: the lower is the base for the dielectric of the cpw, while the upper is the support to which a piezoelectric transducer is attached. as shown in figure 6a, a slab with high dielectric constant is attached to the lower side of the transducer. the 35 μm air gap separating the slab from the underlying cpw is varied by the transducer, inducing broad changes of load (and thus phase delay) on the signal line. in order to extend the achievable phase shift, a grating and a serpentine shaped cpw line have been investigated, besides the standard straight cpw configuration. with an overall length of 3 mm, the device is fabricated on simple high-resistance silicon. as compared to previous designs, an arrangement that drastically reduces the aspect ratio of the overall device is presented in [85]. in this case, a 5-bit phase shifter is implemented by a spiral-shaped cpw configuration of the signal line, loaded with 31 capacitive mems switches, as depicted in figure 6c. the actuation of switches takes place by groups. in detail, the switches are divided into five groups, involving shifts of 11.25°, 22.5°, 45°, 90° and 180°. the device has a remarkable footprint of 2.6x2.6 mm2.
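the group-wise actuation described for [85] can be sketched as a binary-weighted sum: each control bit enables one group, and the output phase is the sum of the enabled group shifts. the snippet below is only an illustration built on the five group shifts quoted above; the function names are hypothetical, and ideal (error-free) group shifts are assumed.

```python
from itertools import product

# group shifts quoted in the text for the 5-bit device, in degrees
GROUP_SHIFTS_DEG = (11.25, 22.5, 45.0, 90.0, 180.0)

def phase_for_word(bits, weights=GROUP_SHIFTS_DEG):
    """Ideal phase shift (degrees) for a control word, one bit per switch group."""
    if len(bits) != len(weights):
        raise ValueError("one control bit per group is required")
    return sum(w for b, w in zip(bits, weights) if b)

def all_states(weights=GROUP_SHIFTS_DEG):
    """Sorted list of every ideal phase state reachable with the given groups."""
    return sorted(phase_for_word(bits, weights)
                  for bits in product((0, 1), repeat=len(weights)))

states = all_states()
print(len(states), "states, from", states[0], "to", states[-1], "degrees")
```

with ideal weights the 32 states are uniformly spaced by 11.25°, and the largest one equals the 348.75° maximum shift listed for [85] in table 5.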
one last proposal to be mentioned about dmtl phase shifters is [84]. in this work, two compact devices (3 and 4-bit) are proposed, both exploiting the advantages of combining switched-line and dmtl topologies. in case of the 3-bit device, two bits control the sp4t switches (selecting one of the four delay lines), while one bit controls capacitive bridges loading the lines. the same principle applies to the 4-bit version displayed in figure 6b. both devices are designed to operate in ka band, at 35 ghz, showing valuable maximum return loss (<-12 db), contained phase error (<3.8°), and compact sizes of 2.24x1.91 mm2 and 2.78x2.69 mm2, respectively. moreover, both devices exhibit a low aspect ratio. devices have been evaluated also in terms of reliability, sustaining more than 1 billion cycles with 0.1 w of rf power in cold switching conditions, and more than 400 million cycles with 0.5 w in hot switching conditions (85°c). in addition, the inclusion of the two phase shifters within a low cost package has been accurately characterized. 4.4. reflection type regarding the reflection type architecture, research contributions in the last decade mainly focused on the improvement of both quadrature coupler and reflective loads. some noteworthy contributions are represented by [86][87][88]. in the first one, a 2-bit device composed by two directional couplers (3db branch line) and four loads is presented. activation of loads connected to the output ports of a coupler by mems switches determines a 45°, 180° or 225° shift. implemented on a sapphire substrate, the whole circuit covers an area of 7.9x4 mm2. this device, which has been conceived for the ka band, was characterized also in terms of power handling capabilities, showing a commendable behavior up to 32 dbm input power. among the different advances in the field of rf-mems phase shifters listed in [87], a couple of them are interesting from the point of view of the design strategy. 
in particular, for the design of devices working in k and ka band, a hybrid distributed-reflective architecture is adopted. when a small shift (22.5° or 45°) is required, only the input microstrip lines of the coupler are loaded with capacitive radial stubs by mems switches, without actuating any load on the other ports of the coupler. conversely, when a higher shift is required (from 90° to 270°), capacitive loads are activated on the output lines of the coupler by switches. two packaged realizations of such architecture have been measured in their 16 phase states, one working at 15 ghz and shown in figure 7a, the other at 21 ghz. the packaged devices proved to be able to sustain over 250 billion switch actuations. a robust implementation of a phase shifter mounted on a base station for electronically scanned antennas is reported in [88]. it is composed of two stages: one introducing 0°, 11.25°, 22.5° or 33.75°, while the other introduces 0°, 45° or 90° shift. each stage comprises a quadrature coupler and lc loads, both realized by microstrip. loads are selected by commercial omron spdt mems switches. the proposed device is realized on a common ro4003 substrate. the employed switches have been extensively characterized in terms of reliability: at 2 ghz and in hot switching conditions, switches operated beyond 100 m cycles (in the (0.1-0.3) mw range) and slightly more than 5 m cycles (in the (0.5-1) w range).
fig. 7 (a) realization of reflection type phase shifter working at 15 ghz, described in [87], (b) top view of the architecture proposed by [89].
a significant suggestion in terms of reconfigurability and shifting range comes from [90]. after a detailed characterization of a proposed reconfigurable quadrature hybrid coupler based on mems varactors, this one is adopted in the implementation of a 360° phase shifter.
in this case, each output branch of the reconfigurable coupler is connected to a dual-resonant lc load, realized by two parallel branches connected to ground, each one including an inductor and a 5-bit varactor, controllable by 35 v actuation. this kind of load allows a 360° phase shift, without the need of including another stage. by acting on the 10 bits it is then possible to achieve 1024 phase states, although some of them will be quite similar in terms of the resulting shift. a device operating at k band, with a more classic design, is proposed in [91]. a lange coupler and two reflective paths, each one provided with three switches along the lines, are enclosed in a footprint of 2.45x1.77 mm2. the length of the short-ended lines can be decreased by actuating one of the shunt switches. while the total length of the load line corresponds to a 135° shift, actuation may introduce 90°, 45° or 0°. the same concept is exploited in [89], where a coupler made by folded broad-side coupled lines is presented, after a detailed theoretical analysis of the qualities to be pursued in the design of both the coupler and the reflective loads. its ultrawideband performances include reflection coefficients of all ports smaller than -25 db and isolation greater than -28 db, over the whole (70-86) ghz interval. eight switches are comprised in such 3-bit device depicted in figure 7b, for a total footprint of 2.68x0.70 mm2, considering the device and different pads. the most recent analog implementation is offered by [92], in which the output ports of the lange coupler are connected to an lc reflective load (three inductive line sections spaced out by couples of shunt varactors) in a cpw configuration. the capacitance of the contactless varactors loading the lines is simultaneously tuned from 85 to 200 ff by the control voltage applied to the pads of a chevron electrothermal actuator, thus determining the output phase shift of this 4x2.6 mm2 device.
the continuous tuning range covers 120° shift, achievable by an 8 v control signal.
table 5 comparison between the different discussed realizations, comprising distributed mems transmission line and reflection type topologies. for each realization, return loss, insertion loss, and phase error referred to the worst case among the various configurations are reported.
work  type  measured @(ghz)  ret. loss (db)  ins. loss (db)  min shift (°)  max shift (°)  error (°)
[76]  dmtl  30  13  1.7  106
[78]  dmtl  22  12  3  30  180
[79]  dmtl  30  16  0.8  5.625  180  <1
[83]  dmtl  30  12  2.55  170
[85]  dmtl  40  9.9  15.1  11.25  348.75  3.72
[84]  dmtl  35  14-12  3.8-5.4  45-22.5  315-337.5  2.3-3.8
[86]  rt  26  7  2.5  45  225
[87]  rt  15-21  10-12  2.2-2.2  22-22  337-337  7-5.3
[88]  rt  2  21  0.93  11.25  123.75  5
[90]  rt  2  25  2.4  360  11.25
[91]  rt  20  13.61  1.61  45  135  1.15
[89]  rt  74  18  4.9  22.5  195  8.6
[92]  rt  28  15  6.35  120
5. attenuators
differently from the previous two kinds of device, rf-mems based attenuators are not characterized by a dense presence of scientific literature, remaining a relatively niche scenario. nonetheless, they also represent key components in the evolution towards the iot and 5g paradigms, together with impedance tuners, filters and resonators. among the remarkable characteristics typical of rf-mems, in the case of attenuators the most valuable are reconfigurability, broadband behavior and good linearity. desirable attenuators should exhibit multiple configurations of attenuation, each one with a flat and stable value of impedance along a wide range of frequencies. table 6 is provided at the end of the present section for a better comparison of the discussed proposals in terms of their characteristic parameters. from a general point of view, devices operating by either digital or analog control are present in the literature. since the first decade of the new millennium, the former has been the most widespread, as can be observed by looking at [93][94].
in [93] a 6-bit attenuator is presented: the 8.77 mm2 device in cpw configuration comprises two spdt switches and two branches loaded with six series resistors and corresponding bypass cantilever membranes. while the switches are meant to select one branch or the other by ~15 v actuation, the cantilevers will short or load the resistors on the selected branch. with 64 possible impedance states, the device demonstrated remarkable linear performances in terms of insertion and return loss. conceptually similar, [94] proposes a cpw-based structure in which the signal line crosses an initial section with 3 series resistors, and then separates into two parallel lines, each loaded with 3 resistors and their movable membrane. the two branches eventually meet to form the output line, as shown in figure 8a. by an actuation voltage between 70 and 80 v, and by 7-bit sequences, it is then possible to control 128 possible impedance states. also in this case the performances are stable, and are demonstrated over a wider interval.
fig. 8 (a) complete architecture described in [94], (b) schematic of device proposed in [96].
substantially new contributions appeared only in the last half decade with [95][96], both proposing 3-bit attenuators in cpw configuration and demonstrated up to 20 ghz. with a reduced footprint of 3.2 mm2, [96] offers a structure composed of three equal sub-units in series on the main signal line, determining an attenuation of 5, 10, 20 db respectively. as depicted by the schematic of figure 8b, each sub-unit is composed of a t-shaped resistive network and two cantilever switches: one switch controls the shunt connection to ground, the other allows the two series resistive loads to be bypassed. the proposed design achieves 8 different states with a limited attenuation error, at the price of 70 v actuation.
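as an illustration of the 3-bit arrangement just described, the sketch below enumerates the attenuation states obtained by cascading 5, 10 and 20 db sub-units, and computes the resistor values of an ideal matched symmetric t-network using the textbook design equations. the 50 Ω reference impedance and the function names are assumptions made for the example; the resistor values actually adopted in [96] are not reported here.

```python
from itertools import product

def t_pad_resistors(att_db, z0=50.0):
    """Series and shunt resistors (ohm) of an ideal matched symmetric T attenuator.

    Textbook relations with k = 10**(att_db / 20):
      r_series = z0 * (k - 1) / (k + 1)   (each of the two series arms)
      r_shunt  = 2 * z0 * k / (k**2 - 1)
    z0 = 50 ohm is an assumption, not a value taken from the cited work.
    """
    if att_db <= 0:
        raise ValueError("attenuation must be positive")
    k = 10.0 ** (att_db / 20.0)
    r_series = z0 * (k - 1.0) / (k + 1.0)
    r_shunt = 2.0 * z0 * k / (k * k - 1.0)
    return r_series, r_shunt

def attenuation_states(sub_units_db=(5, 10, 20)):
    """All total attenuations (dB) obtainable by enabling any subset of sub-units."""
    return sorted(sum(a for bit, a in zip(bits, sub_units_db) if bit)
                  for bits in product((0, 1), repeat=len(sub_units_db)))

print(attenuation_states())
for att in (5, 10, 20):
    rs, rp = t_pad_resistors(att)
    print(f"{att} dB: series {rs:.2f} ohm, shunt {rp:.2f} ohm")
```

the eight states span 0 to 35 db in 5 db steps, in agreement with the maximum attenuation and step listed for [96] in table 6.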
in a similar way, [95] proposes a device composed of three cascaded sub-units of 10, 20, 40 db attenuation. each sub-unit comprises a t-junction power splitter, leading to an unloaded branch or to a branch loaded with a π-shaped resistive network by its spst switch. specular spst switches and a t-junction complete the sub-unit. such design allows a wide attenuation (70 db) to be reached, with a reduced pull-in voltage (30 v) and footprint (2.45x4.34 mm2). the previous research items are generally demonstrated over a limited range of frequencies, with a limited number of possible states (except [93][94]). the first proposal of an attenuator covering a wider spectrum came with [97]. in this work, an 8-bit device is presented, measured in the interval (0.01-110) ghz. realized in cpw configuration and with an overall footprint of 3x1.95 mm2, the attenuator is substantially composed of eight cascaded modules of ohmic switches along the main rf signal line, actuated by 45 v signals. depending on the type of module, each switch will contribute to short (when actuated) the resistor in series on the signal line or to connect (when actuated) the signal line to a shunt resistor to ground. the latter module thus may enable or remove the shunt resistive path to ground, while the former module may short the series resistor, and so on, alternately. such asymmetric network is measured under a wide variety of configurations, considering vswr (voltage standing wave ratio), insertion and return loss, and identifying the most advisable configurations and the causes of the inadvisable ones. the first proposals of analog attenuators came only during the last couple of years. in [98], the authors present a design based on a meandered cpw configuration. a membrane with a low-resistivity silicon slab is deployed over the signal line, with a highly magnetized permanent magnet on the upper side facing an overhead planar spiral coil.
the direction of the current determines the attraction (or repulsion) of the underlying magnet and membrane, while the amount of current determines the variation of the gap between the slab and the signal line and, in turn, the quantity of signal power absorbed by the silicon slab. this 2x5 mm2 device exhibits low insertion loss, for an attenuation range of 16.4 db. a radically different design is proposed in [99]. this work is substantially based on two quadrature hybrid couplers and two varactors. while the "isolated" ports of the couplers are terminated with a matched load, the corresponding "through" and "coupled" ports of the couplers are connected by separate rf lines, each loaded with a shunt varactor. the plates of a varactor are on the two sides of the rf line and are moved by an electrothermal chevron-type actuator, driven by a control signal with a maximum value of 9.8 v. besides the actuation voltage, the area of the device is moderate as well (3.8x3.1 mm2). despite the limited attenuation and frequency range, the previous two proposals represent remarkable steps ahead in the direction of further low-cost and low-power reliable rf-mems attenuators.
table 6 comparison between the different discussed realizations of attenuators.
work  range (ghz)  measured @(ghz)  ret. loss (db)  ins. loss (db)  max att. (db)  step (db)  error (db)
[93]  0.05-13  8  15  1.8  19.7
[94]  0.1-40  20  2.5  38.5
[95]  0.1-20  0.1-20  11.9  0.73  70  10  <3%
[96]  0-20  0-20  12  2  35  5  <5%
[97]  0.01-110  20  5  45
[98]  21.5-26.5  23.5  18  1.2  17.6  <0.2%
[99]  58-62  60  20  10  25
6. filters
filters play a crucial role in every rf front-end, enabling a suitable signal-to-noise ratio for the desired signal contributions and rejecting the out-of-band noise within the desired frequency bands.
moreover, by considering the growing overpopulation of the limited available spectrum in recent years, it is possible to understand why the role has gained an increasing relevance [100]. given the additional portions of spectrum provided by 5g paradigm and the required compliance with different communication standards, the presence of agile reconfigurable filters represents a stringent need for radio front-ends of future mobile devices [101]. in addition to these technical considerations, the requirements of the market must be taken into account as well; this determines a constant demand of devices characterized by low cost and power consumption, small size, light weight, wide tunability and high performances. these motivations make the design of reconfigurable filters a particularly challenging task, under a constant attention by researchers. the generic structure of a reconfigurable filter includes a resonator, made by various types of transmission lines, lumped elements, cavities or dielectric solids, whose resonant frequency, bandwidth or filtering nature is modified by tuning elements [102]. from the point of view of classification, these devices can be categorized on the basis of their resonators. bearing this in mind, the three main categories consist in 3d, planar, or lumped lc structures. the first one includes resonators made by waveguides (e.g. substrate integrated waveguides siw), cavities (e.g. evanescent-mode resonators eva) and dielectric resonators (dr). for the sake of brevity, resonators exploiting acoustic or ferromagnetic principles are not discussed in the present treatise. the second category is composed by resonant structures made with cpw, microstrip and striplines, generally arranged according to different layouts, such as comb-line, hairpin or edge-coupled configurations. lc structures are usually employed at lower frequencies (<1 ghz), as the wavelength of signals would make inconvenient the use of planar structures [103]. 
the reconfigurability of these devices is provided by tuning elements, which are generally represented by banks of capacitors, varactors, switches to select reactive loads, or actuation mechanisms that modify the dimensional (and thus electrical) features of the resonator. rf-mems technology has emerged as a valuable candidate for widening reconfigurability in filters since the early 2000s [104], capitalizing on its intrinsic high quality factor (q), negligible losses and power consumption, and simple biasing of its switches and lumped components. for the sake of completeness, semiconductors represent another common choice as tuning element, but they are employed only when a short switching time is a primary concern, due to their inconvenient losses and quality factor [105]. filters can be evaluated on the basis of power handling and consumption, linearity and tuning speed, but features like q, tuning range, insertion and return loss, size and integrability with common electronic supports remain the most accepted figures of merit. considering the vast amount of literature in this field, some interesting research solutions of the last half decade are reported in the next paragraphs. tables 7 and 8 are provided along the present section, for a better comparison of the different research items in terms of their features.
6.1. three-dimensional
regarding resonators based on waveguides, k band application and a high q factor are the target of the bandpass filters reported in [106]. among the different discussed realizations, an interesting one adopts virtual movable walls. in this approach, the resonator is obtained by inserting spaced metallic septa along the e-plane (the plane along which electric field propagates) in the center of the waveguide. the lateral walls of the waveguide facing the septa present a recess, and six dies are arranged on the wall.
each die is composed of a quartz substrate on which a conductive path can be enabled or interrupted by an ohmic switch. the actuation of the switches establishes a conductive path that will electrically "hide" the lateral recess of the guide, varying the effective electrical width of the resonant cavity represented by the space between the central septa and the lateral walls. this configuration allowed a 725 mhz shift from the basic resonant frequency (22.17 ghz). although its q factor and electrical performances are remarkable, the device is characterized by the typically huge dimensions and aspect ratio of waveguide-based devices. in fact, the overall length (48 mm) makes this device more suitable for fixed infrastructures. another bandpass implementation featuring waveguide resonant cavities is provided by [107], in which an alternation of inverters and cavities composes the whole structure. the three stages of inverters consist of dielectric substrate laminates on which rf-mems switches are integrated, whose activation allows a 2.4 ghz tuning range for the resonance of the two cavities, up to 14.6 ghz, by three possible intermediate steps. also in the case of this device for ku band applications, a remarkable q factor is achieved for the three states, at the expense of volume and aspect ratio. regarding resonators based on eva cavities, k and ka are the bands selected by the authors of [108][109] for their bandpass filter. the structure of the resonator is traditional: a gold-sputtered cavity on silicon substrate, encompassing two posts, and covered by a diaphragm. the two areas of the diaphragm near the posts are provided with electrodes, for a variable deflection of the two areas toward the underlying posts. in this configuration, the resonant frequency (20 ghz) of the cavity can be shifted through the deflection, which modifies the capacitance between the posts and the corresponding areas of the diaphragm.
this architecture enabled a high q factor and an outstanding continuous tuning range of 20 ghz, while maintaining good electrical performance. the price to pay for such achievements corresponds to the typical drawbacks of micromachined 3d resonators: increased fabrication complexity and large actuation voltages. the same arrangement is adopted for the tunable filter presented in [110], operating in v band. the two-pole filter demonstrated a fairly wide tuning range ((57.7-63) ghz) and limited insertion loss, featuring a cpw-microstrip transition for simpler integration with planar layout technologies. integration is also simplified by a further advance in the miniaturization of the device, whose size does not exceed 5x8 mm2.

a similar arrangement is proposed for ku band operation in [111], with a copper cavity encompassing two internal posts and fed by a cpw line. in this case two implementations are offered, one using clamped-clamped rf-mems structures, and the other using disk-shaped patches of vanadium dioxide (vo2) as the tuning mechanism controlling the areas of the diaphragm facing the posts. the second approach involves a thermal activation of the material to trigger an abrupt change of its conductivity (metal-to-insulator transition), which in turn varies the parasitic capacitance between the disks and the upper part of the posts. this strategy achieved a 1 ghz tuning range from the original resonant frequency (14.7 ghz), while providing a satisfactory q factor and reasonable electrical performance.

table 7 comparison between the different discussed realizations of filters. reported values of return loss and insertion loss refer to the worst-case value.

work  | type (bp/bs) | tuning range (%) | q unloaded | ret. loss (db) | ins. loss (db) | actuation (v)
[106] | bp           | 3.2              | 1450       | -13            | -1.3           | 70
[107] | bp           | 18               | 1700       | -15            | -0.7           | 100
[108] | bp           | 100              | 540        | -15            | -3.09          | 180
[110] | bp           | 9                | 330        |                | -2.9           | 140
[111] | bp           | 7                | 1198       | -14            | -2.26          |
[112] | bp           | 200              | 64         | -10.5          | -3.5           | 0.5
[113] | bp           | 3                | 3850       |                | -0.44          |
[114] | bp           | 35               | 92         | -10            | -4.29          |
[115] | bp           | 19               | 22         | -15            | -0.2           |

another analog filter based on a metallic cavity and an inner post is provided by [112]. this implementation involves a sort of inverted arrangement: the post extends vertically from the upper side, facing the movable membrane on the bottom side of the cavity. among the different realizations, the most interesting one is a 45° top-angled post facing a gold plate tuned by thermally driven actuators. the need for precise and constant control over the position of the membrane requires a sensing capacitance and a polysilicon thermometer under the membrane, enabling closed-loop control of the power supplied to the actuators. the simulated performance of this arrangement demonstrated a wide continuous tuning range, covering the c, x, and ku bands in the (6-18) ghz interval, and limited insertion loss. in addition, the device requires less than 0.5 v for a remarkably fast actuation (1 μs).

a more traditional filter based on a cavity and posts is described in [113], whose tuning relies on commercial omron mems switches: the switches establish or break the connection to ground of a pair of striplines inserted in the cavity, fed by common subminiature-version-a (sma) probes. the absence of lumped components allowed the authors to avoid undesired degradation of the high q factor of this 2-bit device, which is greater than 2000 for all the tuning states. designed for c band applications, the filter demonstrated minimal insertion loss (<-0.44 db) within a relatively limited 530 mhz tuning range in the vicinity of the design frequency of 7 ghz.
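the tuning range (%) column of table 7 can be related to the figures quoted in the text with a short calculation. the sketch below expresses a frequency shift as a percentage of a reference frequency; the choice of reference (here the base resonant frequency quoted in the text) is an assumption, since the reviewed works do not necessarily normalize the range in the same way, and the function name is illustrative.

```python
def fractional_tuning_range(shift_hz: float, reference_hz: float) -> float:
    """Return a frequency shift as a percentage of a reference frequency."""
    if reference_hz <= 0:
        raise ValueError("reference frequency must be positive")
    return 100.0 * shift_hz / reference_hz


# (shift, reference) pairs taken from the figures quoted in the text.
examples = {
    "[106]": (725e6, 22.17e9),  # 725 MHz shift from 22.17 GHz
    "[111]": (1e9, 14.7e9),     # 1 GHz range from 14.7 GHz
    "[114]": (305e6, 865e6),    # 305 MHz range around 865 MHz
}

for work, (shift, reference) in examples.items():
    print(f"{work}: {fractional_tuning_range(shift, reference):.1f} %")
```

for [106], a 725 mhz shift from 22.17 ghz gives about 3.3 %, close to the 3.2 % listed in table 7.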
regarding resonators based on siw, [114] proposes a couple of realizations, employing folded-ridged quarter-mode siw resonators and operating at l and s bands. in both bandpass realizations, the tuning mechanism comprises a pair of commercial sp4t rf-mems switches selecting lumped reactive loads, while the two-pole resonator consists of two folded-ridged siw cavities coupled by an iris opening. a quarter-mode siw resonator is obtained by cutting a siw waveguide along a plane of symmetry of the resonance associated with its fundamental mode. this strategy enabled a substantial miniaturization of a single siw-based resonator, featuring a λ/16 transverse width, as compared to the reported state of the art for such devices. in terms of electrical performance, the filter equipped with capacitive loads exhibits reasonable insertion and return loss across its eight tuning states, providing a 305 mhz tuning range in the vicinity of the center operating frequency (865 mhz).

another interesting implementation is described in [115], involving a half-mode siw resonator loaded by a complementary split ring resonator (csrr) and a movable mems membrane. in this arrangement, the csrr is engraved on top of the 6x6 mm2 half-mode siw, and an overhead metallic movable membrane realizes the tuning, acting as a variable capacitance. this configuration yielded a satisfactory 500 mhz tuning range, starting from the basic resonant frequency (3.15 ghz). in addition, simulations show a remarkably small insertion loss (<-0.2 db) and appreciable return loss (>-15 db) for the reported tuning states.

fig. 9 (a) explanatory cross section of the eva cavities described in [108] and [109], (b) 3d model of the device reported in [115], (c) scanning electron microscope (sem) photograph of the layout realized in [116].

6.2. planar structures

regarding filters based on planar structures, the first proposed device [116] is a tunable bandstop filter for ka-band applications, realized in cpw configuration. the resonance is achieved by arranging a csrr on both the signal line and the ground sides. the default resonance frequency (38.8 ghz) can be shifted to 35.32 ghz by activating three movable membranes on top of the resonator engraved on the signal line, letting them act as varactors. the novelty of this contribution is that it is the first proposal of a csrr-based cpw filter compatible with cmos technology. the device, featuring a gold metallization on a silicon substrate, is also remarkable in terms of dimensions, not exceeding 1.46x5 mm2.

the same frequency band is targeted in [117], in which a bandstop filter based on shape defected ground (sdg) is presented. more specifically, the device consists of a cpw configuration, periodically loaded by four recesses with spiral-shaped metal engravings on both its ground planes. the resonant frequency (26 ghz) of this arrangement is controlled by an spst rf-mems switch on each spiral, whose actuation increases the overall length of the spiral itself. the reported simulations show an encouraging 3 ghz tuning range and a remarkable stopband attenuation.

an interesting contribution comes from [118], combining spurline and dmtl approaches to obtain a tunable bandstop filter working at ku band. in spurline-based filters, the resonance is achieved by short-ended stubs along the signal line of a cpw. in this case, reconfigurability is provided by shunt capacitive mems switches, whose actuation reduces the length of the stubs and thus shifts the resonant frequency (18 ghz). the whole 1.2x10 mm2 device is composed of three sections: two sections, each loaded by a pair of stubs, and an interposed section.
in the latter, the signal line is periodically equipped with three overhead movable mems membranes, realizing a typical dmtl configuration. the actuation of these membranes is intended to tune the bandwidth of the filter, thus realizing a device that is reconfigurable in terms of both frequency and bandwidth. from a quantitative point of view, the filter shows good performance within its 2.5 ghz tuning range; it also provides minimal insertion loss (<-1 db) while non-operational, and a remarkably large attenuation level (>-70 db) for all its tuning states. a similar approach is adopted in [119], with two cascaded sections of cpw loaded in dmtl fashion, coupled by series capacitive rf-mems switches.

targeting c band and high attenuation along the stopband, [120] proposes a three-pole bandpass filter encompassing commercial components. in fact, the three cascaded half-wavelength resonators are realized in microstrip on a rogers ro4003c substrate, while each digital variable capacitor loading a resonator is a commercial product by cavendish kinetics. the resulting device is quite compact, providing a wide tuning range by means of 5 states covering the (5.3-7) ghz interval. in each state, the attenuation along the stopband is greater than -30 db up to 10 ghz, while maintaining limited losses.

a comparable architecture for l band applications is described in [121], comprising a three-pole resonator made of coupled quarter-wavelength sections of microstrip in combline configuration. commercial rf-mems varactors are employed for tuning the basic resonant frequency (2.1 ghz) and for controlling the capacitive coupling between the two microstrip resonators. in this case, too, the layout is quite compact, since the size does not exceed 9x5 mm2; the attenuation in the stopband is comparable as well. with the proposed device, a remarkable 700 mhz tuning range can be achieved through 21 capacitance states, with appreciable electrical performance.

table 8 comparison between the different discussed realizations of filters. reported values of return loss and insertion loss refer to the worst-case value.

work  | type (bp/bs) | tuning range (%) | q unloaded | ret. loss (db) | ins. loss (db) | actuation (v)
[116] | bs           | 10               |            | -2             | -31            | 25.65
[117] | bs           | 10               |            | -3             | -42            |
[118] | bs           | 16               |            | -10            | -75            |
[120] | bp           | 32               | 200        | -17.5          | -5.3           |
[121] | bp           | 50               | 253        | -20            | -2             | 30
[122] | bp           | 650              |            | -12.7          | -6.8           | 40

a tunable bandpass filter covering almost the entire uhf band is presented in [122]. the basic structure consists of a half-wavelength and a quarter-wavelength microstrip resonator in a concentric (nested) arrangement. a pair of rf-mems switches controls the activation of each resonator, so that activating one, the other, or both resonators realizes a first coarse tuning of the resonant frequency. fine tuning is then performed by a bank of switched capacitors connected to each resonator. the nested arrangement allowed the authors to achieve fairly limited dimensions (26x36 mm2) for a device working at relatively low frequencies, fabricated on a common rt/duroid substrate and equipped with commercial lumped components. in terms of capabilities, the wide tuning range ((0.4-3) ghz) of this device is inevitably counterbalanced by modest electrical performance.

7. conclusions

the abovementioned examples of rf-mems devices portray the current state of the art of a technology that is now mature enough to face the challenges posed by 5g applications. besides intrinsic reconfigurability, considerable wideband performance and low power consumption, research on rf-mems has provided components characterized by increasing miniaturization and reliability, manageable control signals, and compatibility with existing technologies. among the different proposals, some seem more appropriate for fixed infrastructures, while others may suit mobile low-power devices thanks to their limited size and control voltage.
many devices proved satisfactorily reliable under extensive power and temperature measurements, supported by low-cost and effective packaging strategies. the frequencies covered by the mentioned examples comprise both the portion below 6 ghz and higher ones, involving all the bands engaged in the realization of 5g infrastructure. in addition, the electrical performance of the reported rf-mems devices proved to be a valuable asset of this technology, in view of a potentially significant role in the 5g scenario.

acknowledgement: this paper is published thanks to the support of the project “wafer level micropackaging di rf mems switch per applicazioni spaziali” (f36c18000400005) funded by the agenzia spaziale italiana (asi).

references

[1] g. tagliapietra and j. iannacci, “overview of recent developments in rf-mems technology with reference to 5g emerging applications,” in proceedings of the 1st international conference on micro/nanoelectronics devices, circuits and systems (mndcs 2021), pp. 1–31, 2021. [2] j. iannacci, “rf-mems: an enabling technology for modern wireless systems bearing a market potential still not fully displayed,” microsystem technologies, vol. 14, no. 09, 2015. [3] omron website. omron electronic components. [online], accessed: 19 november 2020. https://www.components.omron.com/. [4] l. wood, “global radio frequency (rf) mems market 2019-2023: expected to grow at a cagr of approx 37%, with aac technologies, analog devices, broadcom, cavendish kinetics, and qorvo at the forefront - researchandmarkets.com”, [online], accessed: 19 november 2020, https://www.businesswire.com/news/home/20190221005400/en/global-radio-frequency-rf-mems-market-2019-2023/ . [5] cavendish kinetics, “cavendish powers nubia z9: world's first smartphone with dual antenna tuning,” [online], accessed: 19 november 2020.
https://www.cavendish-kinetics.com/release/cavendish-powersnubia-z9-worlds-first-smartphone-with-dual-antenna-tuning/ . [6] j. iannacci, “internet of things (iot); internet of everything (ioe); tactile internet; 5g a (not so evanescent) unifying vision empowered by eh-mems (energy harvesting mems) and rf-mems (radio frequency mems),” sensors and actuators a: physical, vol. 272, no. 02, 2018. [7] j. rodriguez. fundamentals of 5g mobile networks. wiley, 2015. [8] gsm association. 5g spectrum public policy position. [online], accessed: 19 november 2020. https://www.gsma.com/spectrum/wp-content/uploads/2020/03/5g-spectrum-positions.pdf . [9] j. iannacci, “rf-mems technology: an enabling solution in the transition from 4g-lte to 5g mobile applications,” in 2017 ieee sensors, 2017, pp. 1–3. [10] j. iannacci, “rf-mems for high-performance and widely reconfigurable passive components a review with focus on future telecommunications, internet of things (iot) and 5g applications”, journal of king saud university science, vol. 29, no. 4, pp. 436–443, 2017. si: smart materials and applications of new materials. [11] l. ma, n. soin, m. h. mohd daut, and s. f. wan muhamad hatta, “comprehensive study on rf-mems switches used for 5g scenario”, ieee access, vol. 7, pp. 107506-107522, 2019. [12] y. yuan-wei, z. jian, j. shi-xing, and y. shi, “a high isolation series-shunt rf mems switch”, sensors, vol. 9, no. 6, pp. 4455–4464, 2009. [13] j. sun, z. li, j, zhu, y. yu, and l. jiang, “design of dc-contact rf mems switch with temperature stability”, aip advances, vol. 5, march 2015. [14] s. t. wipf, a. goritz, m. wietstruck, c. wipf, b. tillack, and m. kaynak, “d-band rf-mems spdt switch in a 0.13 um sige bicmos technology”, ieee microwave and wireless components letters, vol. 26, pp. 1002–1004, 2016. [15] y. liu, y. bey, and x. 
liu, “high-power high-isolation rf-mems switches with enhanced hotswitching reliability using a shunt protection technique”, ieee transactions on microwave theory and techniques, vol. 65, no. 9, pp. 3188–3199, 2017. [16] c. chu and x. liao, “one to 40ghz ultra-wideband rf mems direct-contact switch based on gaas mmic technique”, iet microwaves, antennas propagation, vol. 12, no. 6pp. 879–884, 2018. [17] s. dey, s. koul, a. poddar, and u. rohde, “compact, broadband and reliable lateral mems switching networks for 5g communications”, progress in electromagnetics research m, vol. 86 pp. 163–171, 2019. [18] n. narang and p. singh, “metal contact rf mems switch design for high performance in ka band”, in proceedings of the iop conference series: materials science and engineering, vol. 872, p. 012020, 06 2020. [19] h. yang, h. zareie, and g. m. rebeiz, “a high power stress-gradient resilient rf mems capacitive switch”, journal of microelectromechanical systems, vol. 24, no. 3, pp. 599–607, 2015. [20] k. demirel, e. yazgan, s. demir, and t. akin, “a folded leg ka-band rf mems shunt switch with amorphous silicon (a-si) sacrificial layer”, microsystem technologies, vol. 23, pp. 1191–1200, 2017. [21] m. li, j. zhao, z. you, and g. zhao, “design and fabrication of a low insertion loss capacitive rf mems switch with novel micro-structures for actuation”, solid-state electronics, vol. 127, pp. 32–37, 2017. [22] g. s. kondaveeti, k. guha, s. r. karumuri, and a. elsinawi, “design of a novel structure capacitive rf mems switch to improve performance parameters”, iet circuits, devices systems, vol. 13, no. 7, pp. 1093–1101, 2019. [23] u. chae, h. y. yu, c. lee, and i. j. cho, “a hybrid rf mems switch actuated by the combination of bidirectional thermal actuations and electrostatic holding”, ieee transactions on microwave theory and techniques, vol. 68, no. 8, pp. 3461–3470, 2020. 
[24] a. bojesomo, n. saeed, and i. m. elfadel, “a multiband rf mems switch with low insertion loss and cmos-compatible pull-in voltage”, in proceedings of the 2018 symposium on design, test, integration packaging of mems and moems (dtip), 2018, pp. 1–4. [25] w. tian, p. li, and l.x. yuan, “research and analysis of mems switches in different frequency bands”, micromachines, vol. 9, no. 04, 2018. [26] a. algamili, m. haris, md khir, j. dennis, a. ahmed, s. omar, s. ba hashwan, and m. junaid, “a review of actuation and sensing mechanisms in mems-based sensor devices”, nanoscale research letters, 01 2021. [27] d. a. czaplewski, c. w. dyck, h. sumali, j. e. massad, j. d. kuppers, i. reines, w. d. cowan, and c. p. tigges, “a soft-landing waveform for actuation of a single-pole single-throw ohmic rf mems switch”, journal of microelectromechanical systems, vol. 15, no. 6, pp. 1586–1594, 2006. [28] d. peroulis, s. p. pacheco, k. sarabandi, and l. p. b. katehi, “electromechanical considerations in developing low-voltage rf mems switches”, ieee transactions on microwave theory and techniques, vol. 51, no. 1, pp. 259–270, 2003. [29] s. fouladi and r. r. mansour, “capacitive rf mems switches fabricated in standard 0.35-um cmos technology”, ieee transactions on microwave theory and techniques, vol. 58, no. 2, pp. 478–486, 2010. [30] a. gopalan and u. k.
kommuri, “design and development of miniaturized low voltage triangular rf mems switch for phased array application”, applied surface science, vol. 449, pp. 340–345, 2018. 4th international conference on nanoscience and nanotechnology. [31] h. wei, z. deng, x. guo, y. wang, and h. yang, “high on/off capacitance ratio rf mems capacitive switches”, journal of micromechanics and microengineering, vol. 7, no. 5, p. 055002, mar 2017. [32] j. iannacci, a. repchankova, d. macii, and m. niessner, “a measurement procedure of technologyrelated model parameters for enhanced rf-mems design”, in proceedings of the 2009 ieee international workshop on advanced methods for uncertainty estimation in measurement, pp. 44–49, 2009. [33] j. iannacci, “mixed-domain fast simulation of rf and microwave mems-based complex networks within standard ic development frameworks”, 04, 2010. [34] j.-m. kim, j.-h. park, c.-w. baek, and y.-k. kim, “the siog-based single-crystalline silicon (scs) rf mems switch with uniform characteristics”, journal of microelectromechanical systems, vol. 13, no. 6, pp. 1036–1042, 2004. [35] a. persano, a. tazzoli, p. farinelli, g. meneghesso, p. siciliano, and f. quaranta, “k-band capacitive mems switches on gaas substrate: design, fabrication, and reliability”, microelectronics reliability, vol. 52, no. 9, pp. 2245–2249, 2012. [36] s. touati, n. lorphelin, a. kanciurzewski, r. robin, a. rollier, o. millet, and k. segueni, “low actuation voltage totally free flexible rf mems switch with antistiction system”, in proceedings of the 2008 symposium on design, test, integration and packaging of mems/moems, 2008, pp. 66–70. [37] j. y. park, g. h. kim, k. w. chung, and j. u. bu, “fully integrated micromachined capacitive switches for rf applications”, in proceedings of the 2000 ieee mtt-s international microwave symposium digest (cat. no.00ch37017), vol. 1, pp. 283–286, 2000. [38] a. persano, a. cola, g. de angelis, a. taurino, p. siciliano, and f. quaranta. 
“capacitive rf mems switches with tantalum-based materials”, journal of microelectromechanical systems, vol. 20, no. 2, pp. 365–370, 2011. [39] m. angira, g.m. sundram, k. rangra, d. bansal, and k. maninder, “on the investigation of an interdigitated, high capacitance ratio shunt rf-mems switch for x-band applications”, 05, 2013. [40] v. mulloni, f. solazzi, g. resta, f. giacomozzi, and b. margesin, “rf-mems switch design optimization for long-term reliability”, in proceedings of the 2013 symposium on design, test, integration and packaging of mems/moems (dtip), 2013, pp. 1–6. [41] p. blondy, a. crunteanu, c. champeaux, a. catherinot, p. tristant, o. vendier, j. l. cazaux, and l. marchand, “dielectric less capacitive mems switches”, in proceedings of the 2004 ieee mtt-s international microwave symposium digest (ieee cat. no.04ch37535), vol. 2, pp. 573–576, 2004. [42] f. ke, j. miao, and j. oberhammer, “a ruthenium-based multimetal-contact rf mems switch with a corrugated diaphragm”, journal of microelectromechanical systems, vol. 17, no. 6, pp. 1447–1459, 2008. [43] c. d. patel and g. m. rebeiz, “a high-reliability high-linearity high-power rf mems metal-contact switch for dc-40-ghz applications”, ieee transactions on microwave theory and techniques, vol. 60, no. 10, pp. 3096–3112, 2012. [44] j. pal, y. zhu, j. lu, d. dao, and f. khan, “high power and reliable spst/sp3t rf mems switches for wireless applications”, ieee electron device letters, vol. 37, no. 9, pp. 1219–1222, 2016. [45] m. fernandez-bolanos badia, e. buitrago, and a. m. ionescu, “rf mems shunt capacitive switches using aln compared to si3n4 dielectric”, journal of microelectromechanical systems, vol. 21, no. 5, pp. 1229–1240, 2012. [46] c. harendt, h. graf, b. hofflinger, and e. penteker, “silicon fusion bonding and its characterization”, journal of micromechanics and microengineering, vol. 2, no. 113, 01, 1999. [47] w. tian, x.
wang, j. niu, h. cui, y. chen, and y. zhang, “research status of wafer level packaging for rf mems switches” in proceedings of the 21st international conference on electronic packaging technology (icept)”, 2020, pp. 1–5. [48] f. giacomozzi, v. mulloni, s. colpo, a. faes, g. sordo, and s. girardi, “rf-mems devices packaging by using quartz caps and epoxy polymer sealing rings”, in proceedings of the 2013 symposium on design, test, integration and packaging of mems/moems (dtip), 2013, pp 1–6. [49] t. katsuki, t. nakatani, h. okuda, o. toyoda, s. ueda, and f. nakazawa, “a highly reliable singlecrystal silicon rf-mems switch using au sub-micron particles for wafer level ltcc cap packaging”, in proceedings of the 2nd ieee cpmt symposium japan, 2012, pp. 1–4. [50] s. seok, j. kim, m. fryziel, n. rolland, p. rolland, h. maher, w. simon, and r. baggen, “wafer-level bcb cap packaging of integrated mems switches with mmic”, in proceedings of the ieee/mtt-s international microwave symposium digest, 2012, pp. 1–3. [51] r. goggin, p. fitzgerald, b. stenson, e. carty, and p. mcdaid, “commercialization of a reliable rf mems switch with integrated driver circuitry in a miniature qfn package for rf instrumentation applications” in proceedings of the ieee mtt-s international microwave symposium, 2015, pp. 1–4. [52] i. comart, c. cetintepe, e. sagiroglu, s. demir, and t. akin, “development and modeling of a waferlevel bcb packaging method for capacitive rf mems switches”, journal of microelectromechanical systems, vol. 28, no. 4, pp. 724–731, 2019. [53] f. barriere, a. crunteanu, a. bessaudou, a. pothier, f. cosset, d. mardivirin, and p. blondy. zero level metal thin film package for rf mems. in 2010 topical meeting on silicon monolithic integrated circuits in rf systems (sirf), pages 148-151, 2010. [54] k. nadaud, f. roubeau, a. pothier, p. blondy, l. zhang, and r. 
stefanini, “high q zero level packaged rf-mems switched capacitor arrays”, in proceedings of the 11th european microwave integrated circuits conference (eumic), 2016, pp. 448–451. [55] f. souchon, d. saint-patrice, j. l. pornin, d. bouchu, c. baret, and b. reig, “thin film packaged redundancy rf mems switches for space applications, in proceedings of the 19th international conference on solid-state sensors, actuators and microsystems (transducers), 2017, pp. 175–178. [56] s. t. wipf, a. goritz, m. wietstruck, c. wipf, b. tillack, a. mai, and m. kaynak, “thin film wafer level encapsulated rf-mems switch for d-band applications” in proceedings of the 11th european microwave integrated circuits conference (eumic), 2016, pp. 452–455. [57] n. belkadi, k. nadaud, c. hallepee, d. passerieux, and p. blondy, zero-level packaged rf-mems switched capacitors on glass substrates, journal of microelectromechanical systems, vol. 29, no. 1pp. 109–116, 2020. [58] a. persano, f. quaranta, a. taurino, p. siciliano, and j. iannacci, “thin film encapsulation for rf mems in 5g and modern telecommunication systems”, sensors, vol. 20, pp. 1–12, 04, 2020. [59] a. malczewski, s. eshelman, b. pillans, j. ehmke, and c. l. goldsmith, “x-band rf mems phase shifters for phased array applications”, ieee microwave and guided wave letters, vol. 9, no. 12, pp. 517–519, 1999. [60] s. barker and g. m. rebeiz, “distributed mems true-time delay phase shifters and wide-band switches”, ieee transactions on microwave theory and techniques, vol. 46, no. 11, pp. 1881–1890, 1998. [61] a. s. nagra and r. a. york, “distributed analog phase shifters with low insertion loss”, ieee transactions on microwave theory and techniques, vol. 47, no. 9, pp. 1705–1711, 1999. [62] s. koul and s. dey, “radio frequency micromachined switches, switching networks, and phase shifters”, crc press, may 2019. [63] y.a. 
wang, “rf mems switches and phase shifters for 3d mmic phased array antenna systems”, university of cincinnati, 2002. [64] a. chakraborty and b. gupta, “paradigm phase shift: rf mems phase shifters: an overview”, ieee microwave magazine, vol. 18, no. 1, pp. 22–41, 2017. [65] g. m. rebeiz, guan-leng tan, and j. s. hayden, “rf mems phase shifters: design and applications”, ieee microwave magazine, vol. 3, no. 2, pp. 72–81, 2002. [66] j. lampen, s. majumder, c. ji, and j. maciel, “low-loss, mems based, broadband phase shifters”, in proceedings of the ieee international symposium on phased array systems and technology, 2010, pp. 219–224. [67] r. malmqvist, c. samuelsson, b. carlegrim, p. rantakari, t. vaha-heikkilla, a. rydberg, and j. varis, “ka-band rf mems phase shifters for energy starved millimetre-wave radar sensors”, in proceedings of the cas (international semiconductor conference), 2010, vol. 1, pp. 261–264. [68] t. watanabe, r. yamazaki, t. furutsuka, s. tanaka, and k. suzuki, “a quasi-millimeter wave band phase shifter with mems shunt switches”, in proceedings of the asia-pacific microwave conference, 2014, pp. 64–66. [69] y. huang, j. bao, x. li, y. wang, and y. du, “a 4-bit switched-line phase shifter based on mems switches”, in proceedings of the 10th ieee international conference on nano/micro engineered and molecular systems, 2015, pp. 405–408. [70] s. dey and s. k. koul, “reliability analysis of ku-band 5-bit phase shifters using mems sp4t and spdt switches”, ieee transactions on microwave theory and techniques, vol. 63, no. 12, pp. 3997–4012, 2015. [71] s. koul, s. dey, a. poddar, and u. rohde, “ka-band reliable and compact 3-bit true-time-delay phase shifter using mems single-pole-eight-throw switching networks”, journal of micromechanics and microengineering, vol. 26, 08 2016. [72] s. k. koul, s. dey, a. k. poddar, and u. l.
rohde, “micromachined switches and phase shifters for transmit/receive module applications”, in proceedings of the 46th european microwave conference (eumc), 2016, pp. 971–974. [73] j. iannacci, g. resta, a. bagolini, f. giacomozzi, e. bochkova, e. savin, r. kirtaev, a. tsarkov, and m. donelli, “rf-mems monolithic k and ka band multistate phase shifters as building blocks for 5g and internet of things (iot) applications”, sensors, vol. 20, no. 13, 05, 2020. [74] j. reinke, l.wang, g. k. fedder, and t. mukherjee, “a 4-bit rf mems phase shifter monolithically integrated with conventional cmos”, in proceedings of the ieee 24th international conference on micro electromechanical systems, 2011, pp. 748–751. [75] y. lin, y. chou, and c. chang, “a balanced digital phase shifter by a novel switching-mode topology”, ieee transactions on microwave theory and techniques, vol. 61, no. 6, 2361–2370, 2013. [76] m. bakri-kassem and r. r. mansour, “a novel self collapsed corrugated mems phase shifter”, in proceedings of the 2013 european microwave integrated circuit conference, 2013, pp. 360–363. [77] a. razeghi and b. a. ganji, “an improved switched-line phase shifter using distributed mems transmission line”, majlesi journal of telecommunication devices, vol. 4, no. 3, nov. 2015. [78] a. chakraborty and b. gupta, “utility of rf mems miniature switched capacitors in phase shifting applications”, aeu international journal of electronics and communications, vol. 75, 03 2017. [79] s. afrang, k. samandari, and g. rezazadeh, “a small size ka band six-bit dmtl phase shifter using new design of mems switch”, microsystem technologies, vol. 23, 06 2017. [80] s. dey and s. koul, “design, development and characterization of an x-band 5 bit dmtl phase shifter using an inline mems bridge and mam capacitors”, journal of micromechanics and microengineering, vol. 24, 09 2014. [81] s. dey and s. 
facta universitatis series: electronics and energetics vol. 28, no. 3, september 2015, pp. 393–405 doi: 10.2298/fuee1503393m

ztc bias point of advanced fin based device: the importance and exploration

sushanta k mohapatra, kumar p. 
pradhan, prasanna k. sahu nano electronics laboratory, department of electrical engineering, national institute of technology (nit), rourkela, 769008, odisha, india

abstract. this work evaluates the temperature compensation point (tcp), or zero temperature coefficient (ztc) point, of a sub-20 nm finfet. the sensitivity of various performance metrics of the fin based device to its geometry parameters, and its reliability over a wide temperature range (25 °c to 225 °c), are reviewed to extend the benchmark of device scalability. the impact of fin height (hfin), fin width (wfin), and temperature (t) on performance metrics including on-off ratio (ion/ioff), transconductance (gm), gain (av), cut-off frequency (ft), static power dissipation (pd), energy (e), energy delay product (edp), and sweet spot (gmft/id) of the finfet is studied using the commercially available tcad simulator sentaurus™ from synopsys inc.

key words: finfet, tcp or ztc, hfin, wfin, static and dynamic performances.

1. introduction and background concept

of the two types of transistors, bipolar devices (bjts) are more temperature sensitive and show large variations in the operating point with temperature fluctuations. unipolar devices (fets) are less prone to instabilities due to temperature effects, but the device performance still needs to be investigated precisely when the transistor dimensions enter the nanometre scale, because the physical, chemical, mechanical, thermal and optical properties of devices change significantly from those at larger scales. from the basic operating principle point of view, a mosfet is a voltage controlled majority carrier device. the movement of majority carriers is controlled by the voltage applied on the control electrode (called the gate), which is insulated from the bulk semiconductor body by a thin metal oxide layer. 
the electric field produced by the gate voltage modulates the conductivity of the semiconductor material in the region between the main current carrying terminals, called the drain (d) and the source (s) [1].

received march 5, 2015 corresponding author: sushanta k mohapatra nano electronics laboratory, department of electrical engineering, national institute of technology (nit), rourkela, 769008, odisha india (e-mail: skmctc74@gmail.com)

changes in temperature affect system speed, power, and reliability. this effect is caused by changes in the threshold voltage (vth), mobility (µ), and saturation velocity (vsat) of the device. the resulting changes in device current can lead to failures [2]. vth, µ, vsat and supply voltage (vdd) are all technology dependent parameters, with predicted values available down to the 22 nm node [itrs]. the use of high-k dielectrics and metal gates to alleviate nanoscale gate leakage problems also alters vth, µ and vsat. the combination of these changes makes it difficult to determine the effect of temperature on the device performance [3]. the temperature effect is important to consider because of thermal runaway. in the temperature dependence region, circuits continue to speed up as temperature increases. the higher temperatures could result in thermal runaway resulting from the exponential temperature dependence of leakage current, which may already be dominating the total power consumption in the nanoscale regime [4]. mosfets are widely used in military, satellite communications, medical equipment, automobile, nuclear, wireless and mobile communications applications, etc., in amplifier design, analog integrated circuits (ics), digital cmos design, mixed-signal ics, power electronics and switching devices. given the demand across this variety of applications and the use of nanoscale transistors, it is important to analyze the performance over a wide range of temperatures [5]. 
according to the literature, several technologies have been explored as options for both low and high temperature operation. among them are complementary metal oxide semiconductor (cmos), silicon on insulator (soi) [6], and iii-v semiconductors. the unwanted flow of high leakage current through the well junction and the presence of latch-up limit the use of bulk cmos devices at high temperatures. however, due to the absence of the well and of latch-up in soi devices, they can be preferred for both low and high temperature operation [7]–[9]. vadasz and grove [10] reported the temperature dependence of the bulk mosfet below the saturation region. in agreement between theory and experiment, the variation of channel conductance with temperature is shown to be due to the variation of the threshold voltage and of the inversion layer mobility. bipolar transistors are considered to be unusable at low temperatures as a consequence of strongly reduced current gain [11], [12]. gaensslen et al. [13] presented an enhancement mode fet with a channel length of 1 µm suitable for operation at liquid nitrogen temperature. they claimed that the performance of fet devices is significantly improved at 77 k in terms of device turn-on time, 1.7 to 4 times higher transconductance, and an increasing threshold voltage. other advantages are a 1000-fold decrease in inversion layer leakage currents, 6 times higher silicon thermal conductivity, and 6 times lower aluminium line resistance. krull and lee [14] reported that no significant difference in the temperature dependence of threshold voltage was observed between ‘thick-film’ soi and bulk mosfets. groeseneken et al. [15] documented that, in thin-film soi n-channel mosfets, the device is fully depleted below a critical temperature and is no longer fully depleted above it. 
the drain current id is influenced by two terms, i.e., channel mobility µ and threshold voltage vth, as [16]

$$ I_d(T) \propto \mu(T)\,[V_{gs} - V_{th}(T)] \qquad (1) $$

the mobility term of (1) forces id to decrease, whereas the [vgs − vth] term increases id, with increase in temperature. the two terms therefore act on id in opposite directions, and their effects cancel at one fixed value of the gate bias voltage, which is defined as the zero temperature coefficient (ztc) bias point. the so called ztc point has been identified for bulk cmos by shoucair [17] and prijic et al. [18], in both the linear and the saturation regions for temperatures between 27 °c and 200 °c. later, groeseneken et al. [15] and jeon and burk [9] demonstrated the existence of the ztc point experimentally for thin and thick-film soi mosfets, respectively [19]. both experimental and analytical results for the ztc point of a partially depleted (pd) soi mosfet over a high temperature range (25 °c-300 °c) have been presented by osman et al. [20]. they identified two distinct temperature coefficient points, in the linear as well as in the saturation region. tan et al. [16] analysed the fully depleted (fd) and lightly doped enhanced soi n-mosfet over a wide range of operating temperature (300 k-600 k). it is desirable to bias digital and analog circuits meant for wide temperature applications at a point where the v-i characteristics show little or no variation with respect to temperature. this inflection point is typically known as the temperature compensation point (tcp) or zero temperature coefficient (ztc) point [15], [18], [20]–[22].

2. ztc bias point

there are two ztc points for a transistor, one for the drain current and the other for the transconductance, and in general they have different values in the linear and saturation regions. 
these ztc points are defined as the points at which the drain current or the transconductance remains constant and independent of temperature. the ztc points are the values of vgs at which the reduction of the threshold voltage is counter-balanced by the reduction of the mobility, so that the drain current or the transconductance remains constant as the temperature varies. for gate voltages lower than ztc, the decrease of threshold voltage is dominant and the drain current increases with temperature, while for gate voltages higher than ztc, the mobility degradation predominates and the drain current decreases with temperature. the ztc is a very important bias point for analog designers as it corresponds to a gate voltage at which the device dc performance remains constant with temperature [19], [23], [24].

3. significance of ztc bias

ztc biasing is one of the important techniques in high temperature design, especially for the operational transconductance amplifier (ota). the principal advantages of the ztc technique are [25]:
• it maintains a constant operating point over a wide range of temperatures so that no transistors operate out of saturation.
• it ensures stability of the circuit over a wide range of temperatures.
• it offers design simplicity and ensures reliable circuit operation when several stages are used. it provides a bias point that is temperature independent.
the main disadvantages of ztc are as follows. the high overdrive voltage associated with ztc bias results in reduced intrinsic gain, due to the low gm, as well as reduced signal swing. the reduction of gm with temperature can affect the small signal performance of the amplifier, such as gain and bandwidth, especially when the amplifier is required to operate over a wide range of temperatures. 
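The balance between threshold voltage and mobility described in Section 2 can be made explicit with a short derivation. This is only a sketch under assumed textbook temperature models that the paper does not state: a power-law mobility, $\mu(T) = \mu_0 (T/T_0)^{-m}$, and a linearly decreasing threshold voltage, $V_{th}(T) = V_{th0} - \alpha (T - T_0)$. The symbols $\mu_0$, $V_{th0}$, $T_0$, $m$ and $\alpha$ are illustrative and are not taken from the source. Substituting these models into (1) and setting the temperature derivative of $I_d$ to zero gives:

```latex
\begin{align}
I_d(T) &\propto \mu_0 \left(\frac{T}{T_0}\right)^{-m}
          \left[V_{gs} - V_{th0} + \alpha\,(T - T_0)\right], \\
\frac{\partial \ln I_d}{\partial T}
       &= -\frac{m}{T} + \frac{\alpha}{V_{gs} - V_{th}(T)} = 0
          \;\Longrightarrow\; V_{gs} - V_{th}(T) = \frac{\alpha T}{m}, \\
V_{gs,\mathrm{ZTC}} &= V_{th0} + \alpha T_0 + \alpha T \left(\frac{1}{m} - 1\right).
\end{align}
```

Under these assumptions the bias point is exactly independent of temperature only for $m = 1$, where $V_{gs,\mathrm{ZTC}} = V_{th0} + \alpha T_0$; for other exponents it drifts with $T$. The sign of the derivative also reproduces the behaviour stated in Section 2: below the ZTC voltage the threshold term dominates and $I_d$ rises with temperature, and above it the mobility term dominates and $I_d$ falls.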
the multi-gate structures like the double gate (dg) mosfet fabricated on soi wafers are among the most promising candidates due to their attractive features of low leakage current, high current drivability (ion), high transconductance (gm), reduced short channel effects (sces), steeper subthreshold slopes, and suppression of the latch-up phenomenon [26]–[31]. recent works [32]–[34] reported a detailed analysis of the inflection point of dg mosfets with hkmg technology, examining its reliability over a wide range of temperature variations (100 k-400 k) for both analog and rf applications. to meet market requirements, the density of transistors in a chip and the performance in terms of speed and power consumption need to be increased. transistor miniaturization is one of the major concerns behind performance and cost. undesirable short channel effects (sces) [35] and excessive vth variation occur beyond the 32 nm technology node, hence the search for new technologies and methodologies. the new methodologies lead in two directions. the first is the introduction of new materials into classical single gate mosfets, such as uniaxial/biaxial strain in the channel region to enhance the carrier mobility, and high-k dielectric materials as gate oxide to minimize the gate leakage current. the second is the development of non-classical multigate mosfets (mug-fets), which is a very good concept for further scaling of the device dimensions. so, integrated device manufacturers (idms), foundries and electronic design automation (eda) companies are investing more, with an emphasis on the most promising 3-d finfet technology. the advantages of finfet technology are higher drain current and switching speed, less than half the dynamic power requirement, and 90% less static leakage current [36], [37].

4. finfet design

fig. 1 (a) perspective 3-d (b) 2-d cross sectional view of soi finfet

the geometrical process parameters of finfets are:
• gate length (lg): the physical gate length of finfets.
• fin height (hfin): the height of the silicon fin.
• fin width (wfin): the width of the silicon fin.
• gate oxide thickness (tox): the thickness of the gate oxide.
• underlap channel length (lun): the region under the si3n4 spacer.
among all the parameters, hfin and wfin are the two that play a major role and are investigated here. a tradeoff is required: a wider fin results in unacceptable sces, whereas a narrower fin increases parasitic resistance and is hard to manufacture. similarly, from the manufacturing point of view, a taller fin achieves a better layout efficiency and higher current. so we have adopted various design parameters like wfin/lg = 0.25, 0.5, 0.6, 0.8, 1 and hfin/lg = 0.25, 0.6, 0.8, 1, 1.1, 1.3 in our simulation [38]–[40]. an n-channel mosfet having sio2 as interfacial oxide, with a high-k material (si3n4) as spacer in the underlap regions (lun), is modeled. lun is taken as 5 nm on both sides of the channel, towards the source and the drain. fig. 1(a) and (b) show a three dimensional view and a 2-d cross sectional view of the finfet, with source/drain length (ls/ld) of 40 nm. the source/drain doping is gaussian in nature with a peak nd of 10^20 cm^-3. the equivalent oxide thickness (eot) is 0.9 nm [39], [41], [42] and the supply voltage is vdd = 0.7 v. the work function of the gate electrode is assumed to be 4.5 ev. the channel is undoped, which augments the effective mobility and hence the current density from the source [35].

5. simulation setup

the numerical simulation uses the drift diffusion approach [43], and the models activated in the simulation comprise a field dependent mobility, concentration dependent mobility and velocity saturation model. 
the technology parameters and the supply voltages employed for the device simulations follow the analog itrs roadmap [44] for below 50 nm gate length devices. the work functions of the metal gates are adjusted to achieve the desired vth value. physical models accounting for the electric field dependence of mobility are invoked in the simulation. the inversion layer mobility models [45], along with shockley–read–hall (srh) [46], [47] and auger recombination models, are included. the inversion-layer lombardi mobility model calculates the mobility degradation that normally occurs due to higher surface scattering near the semiconductor to insulator interface, and also includes coulomb and phonon scattering. it accounts for the effect of transverse fields along with doping and temperature dependent parameters of mobility. the srh and auger recombination models are applied for minority carrier recombination. in addition, the basic mobility model is employed to consider the effect of doping dependence, high-field saturation (velocity saturation), and transverse field dependence. the impact ionization and band to band auger recombination models are included in the simulation. the silicon band gap narrowing model, which determines the intrinsic carrier concentration, is activated.

6. effect of hfin and wfin on scalability

in this section, the scalability of the device is discussed in terms of the on-state drive current (ion) and the off-state leakage current (ioff). the effect of fin height (hfin) and fin width (wfin) on drain current is shown in fig. 2(a) and (b) respectively. to analyze the improvement in gm (∂id/∂vgs) with increase in hfin/lg ratio, we have studied the id-vgs curve. the subthreshold slope (ss) is an important parameter for calculating the off state current. 
furthermore, ss is calculated as:

$$ SS\ (\mathrm{mV/dec}) = \frac{\partial V_{gs}}{\partial(\log_{10} I_d)} \qquad (2) $$

$$ I_d \propto \exp\!\left(\frac{q V_{gs}}{\eta k T}\right) \qquad (3) $$

where the logarithm is in base 10, id is the drain current, vgs is the gate voltage, q is the electron charge, k is boltzmann’s constant, η is the body factor and t is the temperature. at room temperature (300 k) and under ideal conditions (η=1), the function exp(qvgs / kt) changes by a factor of 10 for every 60 mv change in vgs, so the ideal value of ss is 60 mv/decade.

fig. 2 drain current (id) of the device in log scale as a function of gate to source voltage (vgs) with variability of process parameter (a) hfin (b) wfin.

from fig. 2(a), as the hfin/lg ratio increases, a high leakage current is observed and ss also increases. however, with the same high hfin/lg ratio, the parasitic resistance problem can be avoided, which further increases the drain current. similarly, fig. 2(b) demonstrates that the leakage current can be significantly reduced for lower wfin/lg ratios. this is because, by picking a smaller wfin, we can minimize the longitudinal electric field at the source side owing to the proximity of the multiple gates. from both figures, it can be noticed that ss increases with both ratios, i.e. hfin/lg and wfin/lg, although its value remains very close to the ideal one, i.e. 60 mv/decade. ion and ioff depend strongly on the device geometry parameters hfin and wfin. so, there is always a trade-off between ion and ioff in device design, and device engineers can choose the optimum parameter dimensions according to the requirements of specific applications.

fig. 3 on current (ion) and leakage current (ioff) with variation of (a) hfin (b) wfin at vgs= vds= vdd.

the important figure of merit for digital applications, i.e. ion versus ioff for different hfin/lg and wfin/lg ratios, is presented in fig. 3(a) and (b). 
as discussed previously, both ion and ioff increase with the increase in hfin. this confirms that taller fins are required for high current drivability, whereas narrow fins give better sce immunity. this is because an increase in hfin results in a decrease of the electric field in the silicon region, which enhances the carrier mobility and hence the on state current. by comparing ion and ioff for all hfin/lg cases, we can say that hfin = 0.6 x lg is the optimum one as it yields a moderate value for both ion and ioff. fig. 3(b) shows the same ion versus ioff benchmark for different wfin/lg ratios. from the figure, a wider fin (wfin = 1 x lg) gives unacceptable sces, whereas a narrower fin (wfin = 0.2 x lg) is more difficult to fabricate. so, we can take the moderate one, i.e. wfin = 0.6 x lg, as the optimized wfin/lg ratio.

6. investigation of analog performance with variation of temperature

the temperature dependency of id follows (1): the mobility term (which is degraded by scattering effects at high t) forces id to decrease, whereas the [vgs − vth] term (which improves at higher t as vth decreases) increases id with increase in temperature. the two terms thus act on id in opposite directions, and their effects cancel at a fixed value of the gate bias voltage; this inflection point is called the temperature compensation point (tcp). fig. 4(a) shows the variation of id with vgs at different temperatures. according to equation (1), at high gate bias µ(t) dominates because of the heavy lattice scattering at higher t, which reduces the channel mobility and hence id. at low gate bias, the [vgs − vth] term causes id to rise because vth shrinks with an increase in t. these two opposite effects cancel each other at a value of vgs where id shows minimal fluctuation with t. 
this inflection point, as shown in fig. 4(a), is located at about vgs = 0.34 v. this creates an opportunity to use multigate mosfets for integrated circuit applications.

fig. 4 (a) drain current (id) as function of gate voltage (vgs) both in linear and log scale (b) leakage current (ioff) versus on current (ion) with variation of temperature.

fig. 4(b) plots the variation of ion and ioff for different temperatures. from the figure, it can be observed that ion and ioff behave in opposite ways with temperature variation. for high t values, the device shows a fairly large ioff and low ion, and the reverse holds at low t. this is because, as temperature increases, the mobility of carriers decreases due to scattering effects, which reduces ion. the degradation in ioff at high temperatures is attributed to lattice vibration and phonon scattering phenomena, which play a significant role as t increases. the gm-vgs plot can be obtained by taking the derivative of id with respect to vgs. at vgs < vth, the channel is weakly inverted and id is due to diffusion. the diffusion current increases with t because of the increase in intrinsic carrier concentration, as in einstein’s relation: d = μkbt/q, where d is the diffusion constant, μ stands for mobility, kb is boltzmann’s constant, q is the electron charge and t represents temperature. at vgs > vth, the value of gm decreases with t due to the mobility degradation. the reduction in vth with temperature enhances gm, whereas the degradation of mobility reduces gm. these two phenomena compete, giving rise to a temperature compensation point for gm. from fig. 5 (a), we can conclude that the transconductance ztc point (0.14 v) is lower than the drain current ztc bias point (0.34 v). 
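As an illustration of how a drain-current ZTC point can be located from Id-Vgs curves taken at two temperatures, the following minimal Python sketch finds the gate voltage at which two curves cross. Everything in it is hypothetical: the toy model, its parameter values (`vth0`, `alpha`, `t0`, `m`, `k0`) and the function names are placeholders and are not the paper's TCAD setup. The model follows the proportionality in (1) above threshold only, so its crossing voltage is not expected to reproduce the 0.34 V reported for the simulated FinFET.

```python
def drain_current(vgs, temp_k, vth0=0.2, alpha=1e-3, t0=300.0, m=1.0, k0=1e-4):
    """Toy above-threshold drain current, Id ~ mu(T) * [Vgs - Vth(T)].

    All default parameters are illustrative assumptions, not fitted values.
    """
    mobility_factor = (temp_k / t0) ** (-m)   # mobility falls as T rises
    vth = vth0 - alpha * (temp_k - t0)        # threshold falls as T rises
    return k0 * mobility_factor * max(vgs - vth, 0.0)


def find_ztc(vgs_points, id_cold, id_hot):
    """Return the Vgs where two Id-Vgs curves cross (linear interpolation).

    Returns None if the curves do not cross inside the sweep.
    """
    diffs = [hot - cold for hot, cold in zip(id_hot, id_cold)]
    for i in range(len(diffs) - 1):
        d0, d1 = diffs[i], diffs[i + 1]
        if d0 == 0.0:
            return vgs_points[i]
        if d0 * d1 < 0.0:  # sign change: the crossing lies in this interval
            frac = d0 / (d0 - d1)
            return vgs_points[i] + frac * (vgs_points[i + 1] - vgs_points[i])
    return vgs_points[-1] if diffs[-1] == 0.0 else None


# Sweep above threshold only, at 25 C and 225 C (the paper's temperature limits).
vgs_sweep = [0.25 + 0.02 * i for i in range(24)]
t_cold, t_hot = 298.15, 498.15
id_cold = [drain_current(v, t_cold) for v in vgs_sweep]
id_hot = [drain_current(v, t_hot) for v in vgs_sweep]

vgs_ztc = find_ztc(vgs_sweep, id_cold, id_hot)
print(f"ZTC gate voltage of the toy model: {vgs_ztc:.3f} V")
```

With the assumed exponent m = 1, the crossing falls at vth0 + alpha * t0 = 0.5 V for any pair of temperatures. Below that voltage the hot curve lies above the cold one, and above it the order reverses, which is the behaviour the text describes around the ZTC point. With measured or simulated data, the two current lists would simply replace the model output.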
the inflection points for id and gm are two important figures of merit (fom) in analog circuit design for both high and low temperature applications. in opamp (operational amplifier) based circuit design, the transistors used in the biasing string can be biased at the inflection point for drain current to maintain a constant dc current level. the input devices may be biased at the inflection point for transconductance to achieve stable circuit parameters. the above points are obtained for constant bias conditions in floating body or body tied mosfet configurations. hence there is only one possibility to bias the transistor, i.e. either at the inflection point for id or at that for gm. moreover, this point is usually affected by process variations. hence, the bias conditions are picked according to the nature of the application.

fig. 5 (a) transconductance (gm) and (b) cut off frequency (ft) as a function of gate voltage (vgs) with variation of temperature.

the cut-off frequency (ft), plotted in fig. 5(b), plays a vital role in evaluating the rf performance of the device. generally, ft is the frequency at which the current gain is unity [42]:

$$ f_T = \frac{g_m}{2\pi C_{gg}} \qquad (4) $$

where gm and cgg are the transconductance and the total gate capacitance, respectively. the enhancement in ft occurs at higher drive current and lower t values. this improvement in ft is partly due to the increment in gm and partly because of the low values of intrinsic capacitance. at low temperature, the improvement of ft is due to a steep increase in mobility and in turn gm. in addition, it reveals the advantage of the multigate technology, which exhibits ztc bias points over a wide range of temperatures (t = 25 °c to 225 °c).

fig. 6 (a) intrinsic gain (av) versus cut off frequency (ft) (b) sweet spot as a function of drain current (id) with variation of temperature. 
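Relation (4) can be checked against the paper's own numbers with a few lines of Python. The sketch below takes the total gate capacitance and peak fT that Table 2 lists for 25 °C and inverts (4) to obtain the transconductance they imply. The function names are hypothetical, and the calculation assumes that the tabulated cgg applies at the bias where fT peaks, which the paper does not state explicitly.

```python
import math


def cutoff_frequency(gm_siemens, cgg_farads):
    """Unity-current-gain frequency, f_T = g_m / (2*pi*C_gg), in Hz."""
    return gm_siemens / (2.0 * math.pi * cgg_farads)


def transconductance_from_ft(ft_hz, cgg_farads):
    """Invert the same relation to get g_m in siemens."""
    return 2.0 * math.pi * ft_hz * cgg_farads


# Table 2 values at 25 C: C_gg = 92.236e-18 F, peak f_T = 465 GHz.
cgg_25c = 92.236e-18
ft_25c = 465e9

gm_25c = transconductance_from_ft(ft_25c, cgg_25c)
print(f"implied gm at peak fT: {gm_25c * 1e6:.1f} uS")
print(f"round-trip fT: {cutoff_frequency(gm_25c, cgg_25c) / 1e9:.1f} GHz")
```

Under that assumption the implied transconductance is roughly 0.27 mS. Because cgg changes by well under 1% across the temperatures in Table 2, the drop in peak fT from 465 GHz to 308 GHz tracks the drop in gm almost directly.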
the intrinsic gain (av = gm/gd) is a valuable fom for the operational transconductance amplifier (ota) and is shown in fig. 6 (a). from the graph, a similar type of analysis can be made as in the case of gm and gd. a high gain can be obtained at high temperatures in the subthreshold region, with the reverse effect in the super threshold region. fig. 6(b) presents one crucial parameter for analog/rf applications, i.e. the ‘sweet spot’ (a compromise among power, speed of operation and linearity), which is given by the peak of the product of the transconductance to current ratio (gm/id) and the cut-off frequency (ft). the variation of the ‘sweet spot’ with id over a broad range of t (25 °c to 225 °c) can be examined in fig. 6(b). the device shows considerably higher gmft/id values at low t, which gradually decay with the increase in t. the extracted static parameters ion, ioff, ion/ioff, and power dissipation (pd = ioff*vdd) for a wide range of t are arranged in table 1. all the parameters show significant improvements in the lower range of t values, and the performance starts deteriorating as t increases. there is a 77.04% reduction in ioff, a 79.01% improvement in on-off ratio, and a 77.04% reduction in pd when t steps down from 75 °c to 25 °c.

table 1 static performance of finfet with t variation

temp. (°c)   ion (μa)   ioff (na)   ion/ioff   pd = ioff*vdd (x10^-8 w)
25           128        16.03       7961.96    1.122
75           117        69.83       1670.96    4.888
125          108        211.01      512.33     14.77
175          101        492.53      205.52     34.47
225          95.6       950.03      100.64     66.50

in a similar fashion, table 2 reveals the dynamic analysis of the finfet with respect to temperature sensitivity. performances like ft, ‘sweet spot’, energy, and edp are extracted and compared for different temperatures. like the static performances discussed above, the dynamic parameters also show clear improvements at lower temperatures.

table 2 ac/dynamic performance of finfet for different values of t

temp. (°c)   cgg (x10^-18 f)   peak ft (ghz)   sweet spot (thz/v)   delay cv/ieff (ps)   energy cv^2 (x10^-18 j)   edp (x10^-29 js)
25           92.236            465             23.5                 0.506                45.195                    2.29
75           92.178            412             13.8                 0.553                45.167                    2.5
125          92.217            370             9.47                 0.597                45.186                    2.7
175          92.325            336             6.7                  0.638                45.239                    2.89
225          92.422            308             5.05                 0.677                45.286                    3.06

7. conclusion

the dc characteristics of a 20 nm n-channel finfet for variation in fin width and fin height are studied using the sentaurus device simulator. from the results obtained by geometrical parameter variation, we can say that taller fins are required for higher current drivability and narrower fins are required for higher immunity to sces. the wfin = 0.6 x lg and hfin = 0.8 x lg cases show the desired device performance in terms of ion and ioff. when developing novel architectures to enable further miniaturization to meet the itrs requirements, the evaluation of ztc/tcp is one of the key analyses for optimal device operation and reliability. we have systematically analyzed the sensitivity of various finfet performances to temperature variation. from the presented outcomes of this work, it is evident that there exist different inflection points for id and gm, which should be seriously taken into consideration for finfet based circuit operation.

references

[1] s. m. sze, physics of semiconductor devices, third edit. john wiley and sons inc., 2009.
[2] i. m. filanovsky and a. allam, "mutual compensation of mobility and threshold voltage temperature effects with applications in cmos circuits", circuits syst. i fundam. theory appl. ieee trans., vol. 48, no. 7, pp. 876–884, 2001.
[3] b. cheng, m. cao, r. rao, a. inani, p. vande voorde, w. m. greene, j. m. c. stork, z. yu, p. m. zeitzoff, and j. c. s. woo, "the impact of high-κ gate dielectrics and metal gate electrodes on sub-100 nm mosfets", electron devices, ieee trans., vol. 46, no. 7, pp. 1537–1544, 1999.
[4] n. s. kim, t. 
austin, d. baauw, t. mudge, k. flautner, j. s. hu, m. j. irwin, m. kandemir, and v. narayanan, "leakage current: moore’s law meets static power", computer (long. beach. calif)., vol. 36, no. 12, pp. 68–75, 2003. [5] h. p. wong, s. member, d. j. frank, p. m. solomon, c. h. j. wann, and j. j. welser, "nanoscale cmos", in proceedings of the ieee, vol. 87, no. 4, pp. 537–570, 1999. [6] m. bruel, "silicon on insulator material technology", electron. lett., vol. 31, no. 14, pp. 1201–1202, 1995. [7] g. k. celler and s. cristoloveanu, "frontiers of silicon-on-insulator", j. appl. phys., vol. 93, no. 9, pp. 4955–4978, 2003. [8] g. reichert, c. raynaud, o. faynot, f. balestra, and s. cristoloveanu, "submicron soi-mosfets for high temperature operation (300k-600k)", microelectron. eng., vol. 36, no. 1–4, pp. 359–362, jun. 1997. [9] d.-s. jeon and d. e. burk, "a temperature-dependent soi mosfet model for high-temperature application (27 °c-300 °c) ", ieee trans. electron devices, vol. 38, no. 9, pp. 2101 – 2111, 1991. [10] l. vadasz and a. s. grove, "temperature dependence of mos transistor characteristics below saturation", ieee trans. electron devices, vol. 24, no. 12, pp. 863–866, 1966. [11] b. lengeler, "semiconductor devices suitable for use in cryogenic environments", cryogenics (guildf)., vol. 14, no. 8, pp. 439–447, 1974. [12] e. s. schlig, "low-temperature operation of ge picosecond logic circuits", solid-state circuits, ieee j., vol. 3, no. 3, pp. 271–276, 1968. [13] f. h. gaensslen, v. l. rideout, e. j. walker, and j. j. walker, "very small mosfet’s for lowtemperature operation", ieee trans. electron devices, vol. 24, no. 3, pp. 218–229, 1977. [14] w. a. krull and j. c. lee, "demonstration of the benefits of soi for high temperature operation", in proceedings of the ieee sos/soi technology workshop, 1988, p. 69. [15] g. groeseneken, j.-p. colinge, h. e. maes, j. c. alderman, and s. 
holt, "temperature dependence of threshold voltage in thin-film soi mosfets", ieee electron device lett., vol. 11, no. 8, pp. 329–331, 1990. [16] t. h. tan and a. k. goel, "zero-temperature-coefficient biasing point of a fully depleted soi mosfet", microw. opt. technol. lett., vol. 37, no. 5, pp. 366–370, 2003. [17] f. s. shoucair, "analytical and experimental methods for zero-temperature-coefficient biasing of mos transistors", electron. lett., vol. 25, no. 17, pp. 1196–1198, 1989. [18] z. d. prijić, s. s. dimitrijev, and n. d. stojadinović, "the determination of zero temperature coefficient point in cmos transistors", microelectron. reliab., vol. 32, no. 6, pp. 769–773, jun. 1992. [19] m. emam, j. c. tinoco, d. vanhoenacker-janvier, and j. p. raskin, "high-temperature dc and rf behaviors of partially-depleted soi mosfet transistors", solid. state. electron., vol. 52, no. 12, pp. 1924–1932, 2008. [20] a. a. osman, m. a. osman, n. s. dogan, and m. a. imam, "zero-temperature-coefficient biasing point of partially depleted soi mosfet’s", ieee trans. electron devices, vol. 42, no. 9, pp. 1709–1711, 1995. [21] z. d. prijić, s. s. dimitrijev, and n. d. stojadinović, "analysis of temperature dependence of cmos transistors’ threshold voltage", microelectron. reliab., vol. 31, no. 1, pp. 33–37, 1991. [22] z. prijić, z. pavlović, s. ristić, and n. stojadinović, "zero-temperature-coefficient (ztc) biasing of power vdmos transistors", electron. lett., vol. 29, no. 5, pp. 435–437, 1993. [23] b. gentinne, j. p. eggermont, and j. p. colinge, "performances of soi cmos ota combining ztc and gain-boosting techniques", electron. lett., vol. 31, no. 24, pp. 2092–2093, 1995. [24] m. el kaamouchi, m. s. moussa, j.-p. raskin, and d. vanhoenacker-janvier, "zero-temperature-coefficient biasing point of 2.4-ghz lna in pd soi cmos technology", in proceedings of the european microwave conference, 2007, pp. 1101–1104. [25] d. flandre, l. demeus, v.
dessard, a. viviani, b. gentinne, and j.-p. eggermont, "design and application of soi cmos otas for high-temperature environments", in proceedings of the 24th european solid-state circuits conference, esscirc ’98., 1998, pp. 404–407. [26] k. suzuki, t. tanaka, y. tosaka, h. horie, and y. arimoto, "scaling theory for double-gate soi mosfet’s", ieee trans. electron devices, vol. 40, no. 12, pp. 2326–2329, 1993. [27] c. h. wann, k. noda, t. tanaka, m. yoshida, and c. hu, "a comparative study of advanced mosfet concepts", ieee trans. electron devices, vol. 43, no. 10, pp. 1742–1753, 1996. [28] j. p. colinge, "multiple-gate soi mosfets", solid. state. electron., vol. 48, no. 6, pp. 897–905, 2004. [29] h. s. p. wong, "beyond the conventional transistor", ibm j. res. dev., vol. 46, no. 2, pp. 133–168, 2002. [30] p. k. sahu, s. k. mohapatra, k. p. pradhan, p. k. sahu, s. k. mohapatra, and k. p. pradhan, "impact of downscaling on analog/rf performance of sub-100nm gs-dg mosfet", j. microelectron. electron. components mater., vol. 44, no. 2, pp. 119–125, 2014. [31] p. k. sahu, s. k. mohapatra, and k. p. pradhan, "a study of sces and analog foms in gs-dgmosfet with lateral asymmetric channel doping", j. semicond. sci., vol. 13, no. 6, pp. 647–654, 2013. [32] s. k. mohapatra, k. p. pradhan, and p. k. sahu, "resolving the bias point for wide range of temperature applications in high-k/metal gate nanoscale dg-mosfet", facta univ. ser. electron. energ., vol. 27, no. 4, pp. 613–619, 2014. [33] s. k. mohapatra, k. p. pradhan, and p. k. sahu, "temperature dependence inflection point in ultrathin si directly on insulator (sdoi) mosfets: an influence to key performance metrics", superlattices microstruct., vol. 78, pp. 134–143, 2015. [34] p. k. sahu, s. k. mohapatra, and k. p. pradhan, "zero temperature-coefficient bias point over wide range of temperatures for single-and double-gate utb-soi n-mosfets with trapped charges", mater. sci. semicond. process., vol. 31, pp. 175–183, 2015. 
[35] v. a. sverdlov, t. j. walls, and k. k. likharev, "nanoscale silicon mosfets: a theoretical study", ieee trans. electron devices, vol. 50, no. 9, pp. 1926–1933, 2003. [36] c. r. manoj, m. nagpal, d. varghese, and v. r. rao, "device design and optimization considerations for bulk finfets", ieee trans. electron devices, vol. 55, no. 2, pp. 609–615, 2008. [37] a. kranti and g. a. armstrong, "device design considerations for nanoscale double and triple gate finfets", in proceedings of the ieee international soi conference, 2005, vol. 2005, pp. 96–98. [38] h. shang, l. chang, x. wang, m. rooks, y. zhang, b. to, k. babich, g. totir, y. sun, e. kiewra, m. ieong, and w. haensch, "investigation of finfet devices for 32nm technologies and beyond", in tech. dig. pap. symp. vlsi technol., 2006. [39] b. ho, x. sun, c. shin, and t. liu, "design optimization of multigate bulk mosfets", ieee trans. electron devices, vol. 60, no. 1, pp. 28–33, 2013. [40] x. sun, v. moroz, n. damrongplasit, c. shin, and t. j. k. liu, "variation study of the planar groundplane bulk mosfet, soi finfet, and trigate bulk mosfet designs", ieee trans. electron devices, vol. 58, no. 10, pp. 3294–3299, 2011. [41] m. g. c. de andrade, j. a. martino, m. aoulaiche, n. collaert, e. simoen, and c. claeys, "behavior of triple-gate bulk finfets with and without dtmos operation", solid. state. electron., vol. 71, pp. 63–68, 2012. [42] k. p. pradhan, s. k. mohapatra, p. k. sahu, and d. k. behera, "impact of high-k gate dielectric on analog and rf performance of nanoscale dg-mosfet", microelectronics j., vol. 45, no. 2, pp. 144–151, 2014. [43] s. selberherr, analysis and simulation of semiconductor devices. wien–new york: springer-verlag, 1984. [44] "the international technology roadmap for semiconductors", 2011. [45] c. lombardi, s. manzini, a. saporito, and m.
vanzi, "a physically based mobility model for numerical simulation of nonplanar devices", ieee trans. comput. des. integr. circuits syst., vol. 7, no. 11, pp. 1164–1171, 1988. [46] w. shockley and w. t. read, "statistics of the recombination of holes and electrons", phys. rev., vol. 87, pp. 835–842, 1952. [47] r. n. hall, "electron-hole recombination in germanium", phys. rev., vol. 87, p. 387, 1952.

facta universitatis series: electronics and energetics vol. 27, no. 1, march 2014, pp. 129-136 doi: 10.2298/fuee1401129t

thz technology for vision systems

j. trontelj 1 , a. sešek 1 , a. švigelj 2
1 faculty of electrical engineering, laboratory for microelectronics, tržaška 25, 1000 ljubljana, slovenija
2 letrikalab d.o.o, polje 15, 5290 šempeter pri gorici, slovenija

abstract. thz radiation brings new technology challenges and new opportunities to overcome some of the current application obstacles. this paper presents a portable thz system operating at room temperature. the presented solution is robust, inexpensive, and convenient for many applications. the thz sensor fabricated at the faculty of electrical engineering in the laboratory for microelectronics is currently one of the best sensors in its operating frequency range. it reaches a sensitivity of up to 1000v/w and an nep down to 5pw/√hz in vacuum. the proposed system solution can cover a variety of applications. some imaging results captured with the proposed system at different stand-off distances are shown in the paper.

key words: thz sensors, thz systems, stand-off thz detection, bolometer

1. thz technology

nowadays many kinds of material inspection systems, used for example in the semiconductor industry, the paper industry, and medical and security applications, as well as many other see-through systems, are based on x-ray techniques. as x-rays are ionizing and thus extremely harmful for biological tissue, their use is limited in area and time.
therefore, new technologies and non-destructive methods are emerging, especially for biological tissue. thz technology promises a suitable substitute, as it is non-ionizing and offers other properties that are available only in the thz frequency region of the electromagnetic spectrum. thz waves propagate through different non-metal materials such as plastics, clothes, paper, ceramics, some thermal insulation materials, and also dry wood. furthermore, very good reflection can be obtained from flat surfaces, and especially from metals. the main obstacle is humidity, which causes very high attenuation in the thz spectral range. air humidity absorbs thz radiation and makes thz waves unsuitable for use at long stand-off distances. it was also shown that when thz waves propagate through or reflect from different materials, especially drugs and explosives, a specific fingerprint of each material can be recognized. all of these facts open many new possibilities for the use of thz waves in e.g. material quality control, the pharmaceutical industry, medical imaging, and security, which is one of the fastest-emerging fields.

received january 10, 2014. corresponding author: janez trontelj, faculty of electrical engineering, laboratory for microelectronics, tržaška 25, 1000 ljubljana, slovenija (e-mail: janez.trontelj1@guest.arnes.si)

generally, there are two approaches to the generation and detection of thz waves. the first is the electro-optical approach, which is based on an ultra-short laser pulse with a pulse width in the femtosecond range. such a pulse illuminates a special crystal or semiconductor material, and an electromagnetic wave in the thz region is emitted. the same material can also be used for detection. this time domain spectroscopy (tds) principle is mostly used for spectroscopy, but imaging can also be done. imaging is very time consuming, as systems consist of one source and only one detector.
with such setups, frequencies from hundreds of ghz up to 8 thz and above can be achieved, with power in the microwatt range. the second is the microwave approach, where signals are generated at frequencies as low as 10 ghz with discrete rf components and then multiplied. with these techniques continuous wave signals with frequencies up to 1.2 thz can be generated. the typical power at 300 ghz can reach up to 20 milliwatts. commonly used detectors are schottky diodes and bolometers with an antenna, where an incident wave causes a temperature change of the detector. such detectors are small enough to be combined into arrays and integrated into systems used for imaging. it should be noted that spectroscopy is also emerging as a future application.

1.1. time domain thz spectroscopy system

an optical approach to thz generation and detection is used in many different types of time domain spectroscopy (tds) thz systems. the common basis of all of them is to split a femtosecond laser beam into two signals. one is used for thz wave generation and the second, also called the probing signal, is used for thz pulse reconstruction on the detector. various methods are used for the generation and detection of thz waves, such as photoconductive generation and optical rectification. to reconstruct the whole thz pulse, the probing signal has to be time shifted over the whole thz pulse. this is done by delaying the probing signal with a motorized optical delay line. when using photoconductive generation and detection, the signals have to coincide directly on the detector itself. when the optical rectification detection method is used, both signals have to coincide on the crystal, where the probing signal is polarized according to the incident thz pulse intensity. the difference in polarization gives the thz signal strength information.
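the time shift introduced by a motorized delay line follows directly from the extra optical path. the sketch below assumes a retroreflector geometry in which the beam traverses the stage displacement twice and propagates in air with n ≈ 1; the text does not specify the delay-line layout, so the geometry, the function names and the 10 µm step size are all hypothetical.

```python
C_VACUUM = 299_792_458.0  # speed of light in vacuum, m/s


def delay_from_displacement(displacement_m: float,
                            passes: int = 2,
                            refractive_index: float = 1.0) -> float:
    """Time delay (s) added when the stage moves by displacement_m.

    passes=2 models a retroreflector: the beam covers the displacement
    on the way in and again on the way out (assumed geometry).
    """
    return passes * refractive_index * displacement_m / C_VACUUM


def nyquist_frequency(step_m: float, passes: int = 2) -> float:
    """Highest frequency (Hz) resolvable when sampling every step_m."""
    return 1.0 / (2.0 * delay_from_displacement(step_m, passes))


# Hypothetical scan step of 10 micrometres.
step_m = 10e-6
step_delay_s = delay_from_displacement(step_m)
max_frequency_hz = nyquist_frequency(step_m)
print(f"delay per step: {step_delay_s * 1e15:.1f} fs")
print(f"nyquist limit:  {max_frequency_hz / 1e12:.2f} THz")
```

under these assumptions a step of about 10 µm is already needed to sample a pulse whose spectrum extends to several thz, which is one reason why point-by-point tds imaging is slow.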
at first, thz time domain spectroscopy setups were mostly built on optical tables, where various optical components such as beam splitters and focusing parabolic mirrors could be easily adjusted. such systems are very sensitive to mechanical stress, and the samples have to be small enough to fit into the measurement space of the system. they operated mostly in transmission mode. to change to reflection mode, a large part of the setup usually had to be changed. to obtain a visual thz image, the sample had to be moved with built-in translation stages. therefore, new systems were built using fiber optics to improve the flexibility of measurements. now both the main and probing laser signals can be transferred over a distance of a few meters using fiber optics, and thz pulses are generated at the remote location. this allows thz response measurements of larger objects. also, a visual thz image can be easily reconstructed with transmitting and receiving thz heads mounted on the translation stages.

1.2. continuous wave thz systems

the core of a continuous wave (cw) thz source is a voltage controlled oscillator operating at a few ghz with a precise pll to achieve low phase noise and sufficient output power. normally the chosen base frequency is 12.5ghz, which can be easily multiplied to several hundreds of ghz. during multiplication, which involves filtering and amplifying higher harmonics, the main issue is to keep the phase noise low and to achieve high output power. the power at the output horn antenna is usually up to 100 milliwatts and is highly dependent on the number of multiplications. with electronic sources a frequency of up to 1thz can be reached, which is highly suitable for 2d and 3d imaging. for the detector, many sensors can be used, as the power is higher and the thz beam illuminates a larger area. the primary choices are bolometer type sensor arrays, schottky diode arrays, pyro sensor arrays, etc. [1].

2.
lmfe thz sensor and system

the thz system in the laboratory for microelectronics (lmfe) is a cw thz-based system with a scanning mechanism, thz lenses and a microbolometer detection array [2], [3].

fig. 1 block diagram of a lmfe thz imaging system (blocks: thz source, beam splitter, object, pivoting mirror, lo, sensor array, data acquisition)

the thz source is a 12.5ghz source, multiplied by x4, x2, and x3 (x24 in total), which gives an output frequency of 300ghz with a peak power of 5mw. the thz beam is split in a ratio of 40:60 with a silicon beam splitter. the larger part of the beam continues to the observed object. there it is reflected and redirected to a sensor array by a pivoting mirror. the pivoting mirror scans through the vertical dimension of the object. the beam then falls on the sensor array, which gives the horizontal dimension, and both thz beams are merged to achieve heterodyne detection. the core of the system is a 2x16 thz sensor array, fabricated and assembled in lmfe.

2.1. thz sensor and sensor array

the sensors used in the thz array [4]-[6] are designed and fabricated in lmfe. lmfe owns a cmos technology that is able to produce sensors and systems on silicon down to 500nm. the technological procedure of the thz sensor fabrication was described in patent [7]. materials for the sensor were evaluated with the equation

$S_e = f\left(\dfrac{TC\,\sqrt{\rho}}{G}\right)$ (1)

where se is sensitivity, tc is the temperature coefficient of the material, rho is sheet resistance, and g is thermal conduction [8]. from the several appropriate materials, titanium was chosen. the main goal of the design was to achieve the highest sensitivity (se) and low noise equivalent power (nep), and to match the sensor and antenna impedances.

fig. 2 lmfe thz sensor

as the detection principle is based on a titanium thermistor, a double dipole antenna is attached to it to receive thz energy and transfer it to the bolometer, which is thereby heated, and consequently its resistance changes according to the energy received.
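the source figures quoted for the block diagram of fig. 1 can be cross-checked with a few lines of arithmetic. the sketch below multiplies the 12.5 ghz base frequency through the x4, x2, x3 chain and splits the 5 mw peak power 40:60. it ignores all losses in the splitter and optics, and it assumes that the smaller share serves as the reference (lo) beam; both points, as well as the variable names, are assumptions of this sketch.

```python
from math import prod

BASE_FREQUENCY_HZ = 12.5e9     # oscillator frequency from the text
MULTIPLIER_STAGES = (4, 2, 3)  # x4, x2, x3 chain from the text
PEAK_POWER_W = 5e-3            # peak output power from the text
SPLIT_RATIO = (40, 60)         # beam splitter ratio from the text

total_multiplication = prod(MULTIPLIER_STAGES)
output_frequency_hz = BASE_FREQUENCY_HZ * total_multiplication


def split_power(power_w: float, ratio: tuple) -> tuple:
    """Divide power_w according to an integer ratio, assuming no loss."""
    total = sum(ratio)
    return tuple(power_w * part / total for part in ratio)


# Assumption: the smaller share is the reference (LO) beam, the larger
# share illuminates the object, as stated for the larger part in the text.
reference_power_w, object_power_w = split_power(PEAK_POWER_W, SPLIT_RATIO)

print(f"multiplication: x{total_multiplication}")
print(f"output: {output_frequency_hz / 1e9:.0f} GHz")
print(f"object beam: {object_power_w * 1e3:.1f} mW, "
      f"reference beam: {reference_power_w * 1e3:.1f} mW")
```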
figure 2 shows the realization of such a sensor. the sensors are fabricated from a silicon wafer, which is partly etched in the thermistor-antenna area to achieve better parameters regarding thermal dissipation. doubled contact pads allow connection from both sides. the antenna and connections are made of aluminum. equation (2) describes the power conditions on the bolometer:

$P = \dfrac{U_b^2}{R} + \dfrac{U_n^2}{R} + \dfrac{U_s^2}{R}$ (2)

as can be seen from equation (2), three basic power components are present on the thermistor: biasing power (ub^2/r), noise power (un^2/r), and signal power (us^2/r) [9]. thz sensors fabricated at lmfe have a 300ghz central frequency, a sensitivity of up to 1000v/w, and an nep down to 5pw/√hz when in a vacuum and at room temperature. the sensors are fabricated as quadruples for easier handling and simple array assembly. the sensors are biased with a current of i0/4, where i0 is the current limit at which electrical damage occurs.

fig. 3 thz sensor array

figure 3 presents the thz sensor array (2 x 16 sensors). the opening under the sensor can be clearly seen, while the 3 µm silicon nitride membrane is practically invisible. the cavity under the sensors is λ/4 deep and acts as a resonator which amplifies the thz signal.

2.2. lmfe thz system

the thz system was partly described in the block diagram in figure 1. the real setup is shown in figure 4.

fig. 4 thz sensor system

the image in figure 4 presents a portable thz system which consists of four main blocks, as presented in the block diagram in figure 1. some other system parts can be seen in figure 4, such as the thz receiving lens and the illumination focus lens. for its operation the system also needs additional low noise amplifiers, which are located below the sensor array, a supply for each block, and an a/d converter. digitization is performed with a 16-bit national instruments a/d card with 32 channels and a 2mhz total sampling frequency.
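two of the numbers in this section follow from simple relations and can be checked quickly: the λ/4 cavity depth at the 300 ghz centre frequency, and the per-channel sampling rate if the 2 mhz total rate is shared equally among the 32 channels. the equal sharing, the use of the free-space wavelength (no correction for the membrane or for any cavity filling) and all names below are assumptions of this sketch.

```python
C_VACUUM = 299_792_458.0  # speed of light in vacuum, m/s


def quarter_wave_depth(frequency_hz: float) -> float:
    """Depth (m) of a lambda/4 cavity for the free-space wavelength."""
    return C_VACUUM / frequency_hz / 4.0


def per_channel_rate(total_rate_hz: float, channels: int) -> float:
    """Sampling rate per channel if the total rate is shared equally."""
    if channels <= 0:
        raise ValueError("channels must be positive")
    return total_rate_hz / channels


cavity_depth_m = quarter_wave_depth(300e9)    # roughly a quarter millimetre
channel_rate_hz = per_channel_rate(2e6, 32)   # one A/D channel per sensor

print(f"cavity depth: {cavity_depth_m * 1e6:.0f} um")
print(f"per-channel rate: {channel_rate_hz / 1e3:.1f} kS/s")
```

with one a/d channel per sensor, the 32 channels match the 2 x 16 array size given in the text.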
the collected and converted data are transferred to a pc and processed to produce the thz image of the hidden object.

3. imaging results

the thz system is capable of scanning through different materials that are opaque to visible light but transparent to thz radiation, such as cardboard, plastics, paper, and packaging material. it covers an area of approximately 0.1m x 0.1m at a 1m distance with basic lenses and basic optical adjustments. with special stand-off lenses, a maximal area of 0.2m x 0.2m at 5m was achieved. the test setup of the system is presented in figure 5, where the lenses, thz source, and thz detector array box are separated to allow for different tests, different observation distances, and different operation modes (reflection and transmission mode).

[fig. 4 labels: thz receiving lens, illumination focus lens, thz pivoting mirror, thz beam splitter, thz scanning array]

thz lenses are made of polystyrene and have different diameters according to the distance of the observed object. in many cases, a lens can be included in the thz source block and/or in the receiving block. the design of the lens is important, as it can significantly influence image quality and system resolution. for larger lens diameters a fresnel principle is used, and for smaller diameters continuous lenses are chosen. three different objects were imaged to demonstrate all operational modes and the stand-off operation.

fig. 5 test thz system

figure 6 was captured in transmission mode. the observed objects were plant leaves, where water vessels can be clearly seen in the thz image. the visual image is added for better understanding.

fig. 6 visual and thz image in transmission mode

the main vessel in the center is almost opaque due to its high water content (thz waves are totally absorbed), while other parts of the leaf are semi-transparent according to their water content level.
the next mode is the reflection mode, for which two different objects at two different distances are presented. the first, in figure 7, is a paper clip scanned at a distance of 0.36m.

fig. 7 visual and thz image of a paper clip

the upper two images in figure 7 present a thz and a visual image of a paper clip without a barrier cover material, and the images below present the same clip with two additional layers of textile cover, to demonstrate the penetration of thz waves. figure 8 presents the imaging result of the thz system for a small carpenter knife taken at a 5m stand-off distance.

fig. 8 visual and thz image of a carpenter knife

the upper image pair in figure 8 presents the uncovered object, and the bottom pair shows the object covered with two layers of textile. the knife is clamped in expanded polystyrene, which is transparent to thz radiation.

4. conclusions

this paper describes the thz vision system developed in the university of ljubljana laboratory for microelectronics. both transmitted and reflected images are shown, giving excellent resolution at up to a 5m stand-off distance. the core of the system is the thz thermistor sensor, which is fabricated in lmfe and achieves one of the best reported results in sensitivity (se=1000v/w) and noise equivalent power (nep=5pw/√hz) at room temperature.

acknowledgement: the thz research was partly funded by the ministry of defense of the republic of slovenia and the namaste center of excellence.

references

[1] a. rogalski, f. sizov, "terahertz detectors and focal plane arrays," opto-electronics review 19(3), 346–404, (2011) [2] j. trontelj, a. sešek, a. švigelj, thz focal plane array for high resolution 3d imaging. in: leitner, raimund (ed.), arnold, thomas (ed.). international thz conference 2013: september 9-10, 2013, villach, austria. [wien]: österreichische computer gesellschaft, cop. 2013, pages 37–41, illustr. [3] j.
trontelj, room temperature antenna-sensor array and thz vision demonstrator, 7th nato set-124 task group business meeting, oslo, norway, 8. 9. 11. 2010; nato / rto 2010 [4] l. pavlovič, d. kostevc, a. pleteršek, m. maček, a. sešek, j. trontelj: 300 ghz microbolometer parallel-dipole antenna for focal-plane-array imaging, nato-otan, set-159, specialists meeting on terahertz and other electromagnetic wave techniques for defense and security; vilnius, lithuania, may 3-4 2010; nato / rto 2010 [5] j. trontelj, a. sešek, m. maček, thz sensor array operating at room temperature, set169 symposium on 8th nato military sensing symposium, friedrichshafen, germany, 16 – 18.5.2011; nato / rto 2011 [6] m. podhraški, a. švigelj, m. maček, j. trontelj, thermal analysis of thz microbolometer. v: belavič, darko (ur.), šorli, iztok (ur.), 48th international conference on microelectronics, devices and materials & the workshop on ceramic microsystems, september 19 september 21, 2012, otočec, slovenia. proceedings. ljubljana: midem society for microelectronics, electronic components and materials, 2012, pp. 421–426, illustr. [7] j. trontelj, m. maček, a. sešek. a detection system and a method of making a detection system: 1307052.9 20130418. [newport]: united kingdom intellectual property office, 2013. [8] l. pavliček, d. kostovec, a. pleteršek, m. maček, a. sešek, j. trontelj, 300 ghz microbolometer parallel-dipole antenna for focal-plane-array imaging. nato-otan, set-159, specialists meeting on terahertz and other electromagnetic wave techniques for defense and security, vilnus, lithuania, may 3-4, 2010, rto/nato, 2010. [9] j. trontelj, m. maček, a. sešek, a. švigelj, uncooled nanometric scale bolometer system for thz sensor array. v: 2011 nanoelectronic devices for defense & security (nano-dds) conference, 29 aug. 1. sept. / brooklyn, new york. 
technical program & abstract digest: 2011 nano-dds conference theme: present & future roles of nanotechnology in the forensic sciences, 2011.

facta universitatis series: electronics and energetics vol. 33, no. 4, december 2020, pp. 571-581 https://doi.org/10.2298/fuee2004571m © 2020 by university of niš, serbia | creative commons license: cc by-nc-nd

power transformer health index estimation using evidential reasoning

srdjan milosavljević 1 , aleksandar janjić 2
1 electrotechnical institute “nikola tesla” belgrade, serbia
2 university of niš, faculty of electronic engineering, niš, serbia

abstract. a market-oriented power distribution system requires a well-planned budget with scheduled preventive and corrective maintenance, or replacement of units that are in an unsatisfactory condition. in recent years, the concept of the transformer health index as an integral part of resource management was adopted for the condition assessment and ranking of energy power transformers (ets). however, because of the lack of regular measurements and inspections, the confidence in the health index value is greatly reduced. the paper proposes a novel methodology for et condition assessment and lifetime extension through the establishment of priorities for control and maintenance. the solution is based on an upgraded health index, where the confidence in the measurement results is calculated using an evidential reasoning algorithm based on dempster-shafer theory. a novel, two-level hierarchical model of the et health index is proposed, with real weighting factor values. in this way, the methodology for et ranking includes the value of the available information describing the current et state. the proposed methodology is tested on real data of an installed et and compared with the traditional health index calculation.

key words: dempster-shafer, evidential reasoning, health index, condition evaluation

1.
introduction

reliability of energy power transformers (ets) is vital in maintaining the stability of the power system. the market-oriented system and deregulation in the electricity industry require a well-planned schedule of preventive maintenance and corrective maintenance or replacement of units that are in unsatisfactory condition. however, inspection and testing schedules are predetermined and defined by legislation or internal regulations and company rules for all substations, regardless of their status and importance [1]. in the current practice of most electric utilities, condition diagnostics of each individual et has been presented descriptively, especially in the field of chemical and electrical tests. in recent years, work has been done on defining a methodology to perform an integral quantification of et states based on the results of chemical and electrical tests, maintenance data and work history data, by introducing a state index or so-called "health index" (hi) which would rank ets by their actual condition.

received march 5, 2020; received in revised form may 7, 2020. corresponding author: aleksandar janjić, faculty of electronic engineering, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: aleksandar.janjic@elfak.ni.ac.rs)

transformer indexing by operating condition, with additional risk analysis, enables a better understanding of the availability and reliability of large transformer populations. hi is a tool that combines the results of in-service electrical testing, laboratory (chemical) testing of transformer oil, maintenance data and work history data to manage basic resources and build priorities when designing maintenance plans, using a numerical ranking of transformer status and capital investment. in [2], a practical hi calculation method is given, combining the impact of all available data and criteria based on common practices and technical standards.
based on the standard model of twenty-four diagnostic factors, three additional factors (loss factor at very low frequency, conductivity factor and polarization index) are used for the hi calculation in [3]. the hi concept can be extended to other equipment, as in [4], where hi was determined for around 2000 secondary substations, each consisting of a mv switchgear, a mv/lv transformer and a lv rack. a comprehensive study of previous research on the transformer health index using mathematical models, algorithms or expert judgment is given in [5]. the problem with the traditional hi calculation is the generation of an overall assessment of the transformer's condition by aggregating the above judgments in a rational way. furthermore, very few studies deal with the uncertainties, accuracy and confidence of the inspection results. the evidential reasoning (er) approach is a suitable method for dealing with the aggregation problem, turning a transformer condition assessment problem into a multi-criteria decision problem. the process can model various types of qualitative and quantitative uncertainties and is developed on the basis of dempster-shafer evidence theory [6] and the evaluation analysis model [7]. with the introduction of the concepts of belief structure [8, 9] and the belief decision matrix, it became possible to model various types of uncertainties in a unified format. recently, the er methodology has been applied to et condition assessment. in [10], various dissolved gas analysis (dga) methods were given different subjective judgment grades. then, the concept of a preference degree was introduced to quantify these evaluation grades and subjective judgments with uncertainty. the er approach is used in [11] for transformer winding assessment based on frequency response analysis (fra), but the degree of uncertainty, as in the previous study, relies only on the expert’s judgement.
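to make the belief-structure idea concrete, the sketch below combines two pieces of evidence about a transformer's condition with the classical dempster rule of combination. this is plain dempster-shafer combination, not the full er algorithm used in the cited works, which additionally handles attribute weights; the grade names, the mass values and the function name are hypothetical.

```python
from itertools import product


def dempster_combine(m1: dict, m2: dict) -> dict:
    """Combine two mass functions with Dempster's rule.

    Keys are frozensets of hypotheses, values are masses summing to 1.
    Mass falling on empty intersections (conflict) is discarded and the
    remainder is renormalised.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = (
                combined.get(intersection, 0.0) + mass_a * mass_b
            )
        else:
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {h: mass / (1.0 - conflict) for h, mass in combined.items()}


# Hypothetical condition grades and the frame of discernment.
GOOD, FAIR, POOR = (frozenset({g}) for g in ("good", "fair", "poor"))
THETA = GOOD | FAIR | POOR  # mass on THETA expresses ignorance

# Illustrative evidence from two inspections (made-up numbers).
oil_test = {GOOD: 0.6, FAIR: 0.3, THETA: 0.1}
electrical_test = {GOOD: 0.5, POOR: 0.2, THETA: 0.3}

fused = dempster_combine(oil_test, electrical_test)
for hypothesis, mass in sorted(fused.items(), key=lambda kv: -kv[1]):
    print(sorted(hypothesis), round(mass, 4))
```

the mass left on the whole frame after combination is the part of the belief that remains unassigned, which is how incomplete or outdated measurements show up in the result.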
the integrated fuzzy and evidential reasoning model is presented in [12], with previous operation history, results of the latest inspection and states of the on-load tap changer taken as evidence to assess the working state of the transformer. the fuzzy model is proposed for generating the original basic probability assignments for the second-level model. the testing data of indices are normalized according to the attention value on transformer tests and operation standards, but the practical grade assessment of different et components has not been analysed. this paper presents a new methodology for et condition assessment and prioritization, solving three main problems of previous condition assessment approaches:
• rational aggregation of different et components,
• uncertainties, accuracy and confidence of the inspection results,
• consistent grade assessment and weighting of different et components.
the novel methodology is based on the upgraded hi, where the er methodology is used for the quantification of uncertain data, as a general, multi-level evaluation process for dealing with multi-criteria decision problems. a basic tree structure necessary for er assessment is developed based on the modified two-level transformer model and the individual hi of every component. the importance of different components and of different inspection methods are both evaluated by the real and practical weighting factors used in et maintenance practice. the et condition is represented as a belief distribution over all possible health states. the comparison with the traditional hi calculation method shows that the novel methodology gives more accurate results in the presence of obsolete and inaccurate measuring data. the rest of the paper is organized as follows.
after the introductory section, section 2 presents the methodology: it briefly outlines the hi approach and how it works as a prioritization method, and explains the evidential reasoning algorithm. section 3 provides an illustrative example of the proposed methodology, data analysis and a discussion, while section 4 gives a conclusion and further research activities.
2. health index assessment
2.1. health index definition
in recent years, the numerical assessment (indexing) of the current state of ets and other high-voltage equipment in plants by assigning a hi has emerged as a tool that could effectively provide a transition to condition based maintenance. hi is a numerical value that can be used to estimate the overall condition of an et. by individually evaluating the most representative key factors that are vital to the reliable operation of transformers and mathematically aggregating them into a quantitative index, this value provides information on the "health" of the et. with this index, it is possible to evaluate the state of a large population of distribution transformers and group them according to their state. by introducing this concept, availability and reliability increase while maintenance costs are reduced. the assessment of the condition of an et is based on [13]:
• results of electrical and chemical tests
• maintenance information
• transformer work history (previous loading)
• condition of equipment: isolators, cooling system, transformer tank, expansion tank and auxiliary equipment
• the estimated condition of the paper insulation
• expert opinion.
hi represents the sum of these estimates.
it is very important to view the health index as a variable parameter because, by performing a multi-parameter analysis of the condition, it changes over the life of the et [14]. the assessment of the condition of the et should include an assessment of the condition of the key parts: magnetic core and coil, solid insulation and insulating oil, bushings and voltage regulators, cooling system, transformer tank, expansion tank and auxiliary equipment. the assessment is based on the results obtained by applying appropriate test methods in the field of chemical and electrical testing and visual inspection as well as evaluation of load histories [15, 16]. the health indices for each of these parts, as well as the et hi, must be determined.
2.2. weighting factors of examination methods
the transformer health index should include an assessment of the condition of its key parts (table 1). each part of the et is assigned a weight factor wd based on the impact it has on the overall condition of the et. the impact of a part of the et is also estimated according to the current statistics of the place of occurrence of failure in et [11]. weighting factors are given based on experience, and can take an integer value from 1 to 5, as shown in table 1. the source of the weighting factor values is industry practice. condition monitoring and assessment have been performed for a long time in the serbian power industry and the factors are the result of accumulated practice and experience. a more detailed explanation is given in [17].
table 1 weighting factors for different et components
no   et component                                        weighting factor (wd)
1    magnetic core                                       3
2    geometry and electric contacts of windings          4
3    insulation                                          4
4    bushings                                            5
5    on-load tap changer                                 5
6    dissolved gas analysis (dga) for the active part    5
7    transformer oil                                     4
8    transformer tank and auxiliary equipment            2
9    work history                                        3

different test methods are used to evaluate the condition of each of the above parts of the et. some parts are joined by a group of appropriate test methods, each corresponding to a weight factor wm = (1–5), depending on how accurately the results of that method can describe the state of the et component (table 2).

table 2 weighting factors of different inspection methods
et component                                  no   inspection method                                    weighting factor (wm)
magnetic core                                 1    open circuit test / sfra                             5
geometry and electric contacts of windings    2    resistance testing                                   5
geometry and electric contacts of windings    3    leakage inductance test / sfra                       5
insulation                                    4    insulation resistivity / tgδ and capacitance test    5
insulation                                    5    pdc / rvm / fds / water content in oil               4
insulation                                    6    furan derivatives analysis                           3
bushings                                      7    tgδ and capacitance                                  4
on-load tap changer                           8    static/dynamic resistance testing                    5
dga analysis for the active part              9    dissolved gas analysis (dga)                         4
transformer oil                               10   physical and chemical oil characteristics            5
transformer oil                               11   content of water in oil                              4
transformer tank and auxiliary equipment      12   testing of cooling system and auxiliary equipment    2
transformer tank and auxiliary equipment      13   visual inspection / leakage control                  2
work history                                  14   loading and operation history                        3

since the dissolved gas analysis (dga) of the transformer oil sample may indicate a problem of overheating or the occurrence of particles, but cannot reliably define the location of the resulting fault, it is singled out as a special category. this limits its impact on the value of the total hi, but not on specific components, such as windings or cores.
2.3.
overall health index
the overall health index of a transformer can be calculated using:
$$HI = \frac{\sum_{i=1}^{n} O_{di} W_{di}}{\sum_{i=1}^{n} W_{di}} \qquad (1)$$
where $O_d$ is a grade for each individual et part in the range $0 \le O_d \le 3$:
$$O_d = \frac{\sum_{i=1}^{k} O_{mi} W_{mi}}{\sum_{i=1}^{k} W_{mi}} \qquad (2)$$
in equations (1) and (2), $n$ corresponds to the number of components, while $k$ corresponds to the number of test methods for which there are applicable results and which assess the state of a given system. the estimate $O_m$ for a method is given by an expert on the basis of the results of the last and previous tests, experience and specificity of individual ets, and using the criteria given in the applicable standards and technical recommendations. the possible range is $0 \le O_m \le 3$. the state estimates for electrical measurements are given in descriptive terms: “good”, “moderately good”, “moderately bad”, and “bad”. the numerical range of each corresponding estimate for the health index calculation is shown in table 3.

table 3 comparison of electrical and chemical test scores with appropriate numerical estimates for hi calculations
test results       hi
good               3
moderately good    2 ≤ hi < 3
moderately bad     1 ≤ hi < 2
bad                < 1

given that three-stage grading is usually used to diagnose the condition ("good", "doubtful" and "poor"), the second grade in the methodology is divided into two grades: "moderately good" and "moderately bad". the criteria for the two grades are the same; the difference is that the “moderately good” rating indicates dubious results, but without major changes over time, e.g. when comparing the last two to three trials, and it continues the follow-up with more frequent testing. on the other hand, the rating "moderately bad" indicates a growing trend of deterioration of the transformer state, and it tightens control by more frequent testing, recommends additional testing, or emphasizes the need to plan for a specific intervention in the coming period.
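as a rough illustration of equations (1) and (2), the following python sketch computes the component grades and the overall hi. the grades and weights are those reported for the case-study transformer in table 4; the function names, the dictionary layout and the use of `None` for a method without an applicable result are assumptions made only for this example.

```python
def weighted_grade(grades, weights):
    """Weighted mean used in eqs. (1) and (2).

    Pairs whose grade is None (no applicable test result) are skipped,
    mirroring the remark that k counts only methods with results.
    """
    pairs = [(g, w) for g, w in zip(grades, weights) if g is not None]
    return sum(g * w for g, w in pairs) / sum(w for _, w in pairs)


def hi_label(hi):
    """Map a numerical hi to the descriptive grades of table 3."""
    if hi >= 3:
        return "good"
    if hi >= 2:
        return "moderately good"
    if hi >= 1:
        return "moderately bad"
    return "bad"


# component -> (wd, [(om, wm), ...]); values taken from table 4,
# the dga method weight (4) is taken from table 2
components = {
    "oil": (4, [(3, 5), (2, 4)]),
    "insulation": (4, [(1, 5), (None, 4), (3, 3)]),  # fds: no result
    "active part": (5, [(2, 4)]),
    "windings": (4, [(3, 5), (3, 5)]),
}

# eq. (2): grade of every component
od = {
    name: weighted_grade([g for g, _ in methods], [w for _, w in methods])
    for name, (_, methods) in components.items()
}

# eq. (1): overall health index
hi = weighted_grade(
    [od[name] for name in components],
    [wd for wd, _ in components.values()],
)

print({name: round(value, 2) for name, value in od.items()})
print(round(hi, 2), hi_label(hi))
```

skipping methods without results keeps a missing test from being counted as a "bad" grade, which is the behaviour implied by the definition of k in equation (2).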
because of the irregular inspection period, it is hard to perform an accurate yearly et condition assessment. some data may be several years old and the main problem in interpretation is the lack of confidence in testing results. in this paper, evidential reasoning is used for the quantification of different parameters and the algorithm is presented in the following section.
2.4. evidential reasoning algorithm
in a two-level hierarchy of attributes with a general attribute at the top level and $L$ basic attributes $e_i$ ($i = 1, \ldots, L$) at the lower level, the set of lower level attributes is defined as follows:
$$E = \{e_1, \ldots, e_i, \ldots, e_L\}. \qquad (3)$$
the weights of the attributes are presented by $\omega = \{\omega_1, \ldots, \omega_i, \ldots, \omega_L\}$, where $\omega_i$ is the relative weight of the $i$th lower level attribute ($e_i$) with a value between 0 and 1 ($0 \le \omega_i \le 1$). the evaluation grades are represented by
$$H = \{H_1, \ldots, H_n, \ldots, H_N\}, \qquad (4)$$
(it is assumed that $H_{n+1}$ is preferred to $H_n$). an assessment for the $i$th basic attribute $e_i$ may be represented by the following distribution:
$$S(e_i) = \{(H_n, \beta_{n,i}),\ n = 1, \ldots, N\}, \quad i = 1, \ldots, L, \qquad (5)$$
where $\beta_{n,i}$ denotes a degree of belief, with $\beta_{n,i} \ge 0$ and $\sum_{n=1}^{N} \beta_{n,i} \le 1$. if $\sum_{n=1}^{N} \beta_{n,i} = 1$, then the assessment $S(e_i)$ is complete. otherwise, the assessment $S(e_i)$ is incomplete. eq. (6) denotes a complete lack of information on $e_i$:
$$\sum_{n=1}^{N} \beta_{n,i} = 0 \qquad (6)$$
let $H_n$ be a grade to which the general attribute is assessed with a certain degree of belief $\beta_n$. the problem is to generate $\beta_n$ by aggregating the assessments for all associated basic attributes $e_i$. for this purpose, the following algorithm is used. let $m_{n,i}$ be a basic probability mass representing the degree to which the $i$th basic attribute $e_i$ supports the judgment that the general attribute $y$ is assessed to the grade $H_n$. respectively, let $m_{H,i}$ be the remaining probability mass unassigned to any individual grade after all the $N$ grades, concerning the attribute $e_i$, are considered. the basic probability mass is calculated in (7): $m_{n,i} = \omega_i \beta_{n,i}, \quad n = 1, \ldots, N.$
(7)
the weight normalization is given by the following expression:
$$\sum_{i=1}^{L} \omega_i = 1 \qquad (8)$$
the remaining probability mass is calculated as:
$$m_{H,i} = 1 - \sum_{n=1}^{N} m_{n,i} = 1 - \omega_i \sum_{n=1}^{N} \beta_{n,i} \qquad (9)$$
suppose that $E_{I(i)}$ is the subset of the first $i$ attributes, $E_{I(i)} = \{e_1, e_2, \ldots, e_i\}$, and accordingly let $m_{n,I(i)}$ be the probability mass defined as the degree to which all the $i$ attributes support the judgment that $y$ is assessed to the grade $H_n$. also, $m_{H,I(i)}$ is the remaining probability mass unassigned to individual grades after all the basic attributes in $E_{I(i)}$ have been assessed. the probability masses $m_{n,I(i)}$, $m_{H,I(i)}$ for $E_{I(i)}$ can be calculated from the basic probability masses $m_{n,j}$ and $m_{H,j}$ for all $n = 1, \ldots, N$, $j = 1, \ldots, i$. considering all the above statements, the recursive evidential reasoning algorithm can be summarized by the following expressions:
$$m_{n,I(i+1)} = K_{I(i+1)} \left( m_{n,I(i)}\, m_{n,i+1} + m_{n,I(i)}\, m_{H,i+1} + m_{H,I(i)}\, m_{n,i+1} \right), \quad n = 1, \ldots, N \qquad (10)$$
$$m_{H,I(i+1)} = K_{I(i+1)}\, m_{H,I(i)}\, m_{H,i+1} \qquad (11)$$
$$K_{I(i+1)} = \left[ 1 - \sum_{t=1}^{N} \sum_{\substack{j=1 \\ j \neq t}}^{N} m_{t,I(i)}\, m_{j,i+1} \right]^{-1}, \quad i = 1, \ldots, L-1 \qquad (12)$$
where $K_{I(i+1)}$ is a normalizing factor so that $\sum_{n=1}^{N} m_{n,I(i+1)} + m_{H,I(i+1)} = 1$ is ensured. it is important to note that the basic attributes in $E_{I(i)}$ are numbered arbitrarily and that the initial values are $m_{n,I(1)} = m_{n,1}$ and $m_{H,I(1)} = m_{H,1}$. finally, in the original evidential reasoning algorithm the combined degree of belief $\beta_n$ for a general attribute is given by:
$$\beta_n = m_{n,I(L)}, \quad n = 1, \ldots, N \qquad (13)$$
$$\beta_H = m_{H,I(L)} = 1 - \sum_{n=1}^{N} \beta_n \qquad (14)$$
while $\beta_H$ denotes the degree of incompleteness of the assessment. the algorithm for the et assessment can be presented in the following five steps:
step 1. define a set of l inspection methods (basic attributes) influencing the assessment of the et component state (m is the number of components, i.e. upper level attributes). determine the importance weighting of every inspection method wm and each component wd.
step 2. for each attribute $e_i$ and evaluation grade $H_n$ a degree of belief $\beta_{n,i}$ is assigned. the basic probability mass $m_{n,i}$, representing the degree to which the $i$th inspection method $e_i$ supports the hypothesis that the health index is assessed to the $n$th evaluation grade $H_n$, is calculated (eqs. (7)–(9)).
step 3. the combined probability masses are generated by aggregating all the basic probability assignments using the recursive er algorithm (10)–(14). this step is repeated for each basic attribute of one component.
step 4. calculate the combined degrees of belief for a higher level property. the combined probability masses are generated by aggregating all the probability assignments from the previous step using the recursive er algorithm (10)–(14). this step is repeated for each et component.
step 5. the procedure is terminated and the utility can be calculated.
the flowchart is presented graphically in figure 1.
fig. 1 flowchart for the et condition assessment (flowchart blocks: l, m, n, wd, wm; βi,j, mi,j; "all attributes are calculated?"; combined degree of belief for a general attribute βn; combined degrees of belief for a component; "all components are calculated?"; process terminated)
the methodology is illustrated on real data from an et operating in a serbian distribution utility and compared with the traditional hi calculation.
3. case study
the methodology for the condition assessment will be applied to an existing transformer 110/35/10 kv, 20/20/10 mva operating in eps (electric power industry of serbia). starting from the complete model presented in tables 1 and 2, a reduced model concerning only the main transformer parts without the on-load tap changer is presented in figure 2. because of the different dates of the inspection methods, different degrees of belief are presented in table 5.
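the recursive er aggregation of section 2.4 (equations (7) and (9)–(14)) can be sketched in python as follows. this is a minimal sketch under stated assumptions rather than the authors' implementation: the function names are hypothetical, the grades are ordered (h3, h2, h1, h0), and the example combines the two oil inspection methods using the beliefs from table 5 with method weights 5/9 and 4/9 (the weights 5 and 4 of table 2 normalized according to (8)). the numbers it produces depend on these assumptions and are not claimed to reproduce table 6.

```python
def basic_masses(beliefs, weight):
    """Eqs. (7) and (9): m_n = w * beta_n and m_H = 1 - sum_n m_n."""
    masses = [weight * b for b in beliefs]
    return masses, 1.0 - sum(masses)


def combine(m_a, mh_a, m_b, mh_b):
    """One recursion step, eqs. (10)-(12).

    (m_a, mh_a) are the masses combined so far, (m_b, mh_b) are the
    basic masses of the next attribute.
    """
    n = len(m_a)
    # total mass assigned to conflicting grade pairs (t != j)
    conflict = sum(
        m_a[t] * m_b[j] for t in range(n) for j in range(n) if j != t
    )
    k = 1.0 / (1.0 - conflict)  # normalizing factor, eq. (12)
    m = [
        k * (m_a[i] * m_b[i] + m_a[i] * mh_b + mh_a * m_b[i])
        for i in range(n)
    ]  # eq. (10)
    return m, k * mh_a * mh_b  # eq. (11)


def er_aggregate(belief_rows, weights):
    """Combine all attributes; returns (beta_n list, beta_H), eqs. (13)-(14)."""
    masses = [basic_masses(b, w) for b, w in zip(belief_rows, weights)]
    m, mh = masses[0]
    for m_next, mh_next in masses[1:]:
        m, mh = combine(m, mh, m_next, mh_next)
    return m, mh


# oil component: physical/chemical test and water content (table 5),
# grades ordered (h3, h2, h1, h0); weights are an assumption (see lead-in)
oil_beliefs = [
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.8, 0.2, 0.0],
]
oil_weights = [5 / 9, 4 / 9]

beta, beta_h = er_aggregate(oil_beliefs, oil_weights)
print([round(b, 3) for b in beta], round(beta_h, 3))
```

an attribute with no information at all (all beliefs equal to zero, as for the missing fds test) has a remaining mass of one, so combining it leaves the already aggregated masses unchanged.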
the degree of belief denotes the source’s level of confidence when assessing the level of fulfilment of a certain property. for instance, due to the lack of a frequency domain spectroscopy (fds) test, all its belief values are equal to zero.
fig. 2 hierarchical scheme for transformer hi assessment (inspection method labels shown in the figure: (10), (11), (4), (5), (6), (9), (2), (3))
the numbers above the inspection methods in figure 2 represent the ordinal number of the inspection method listed in table 2. the actual gradings for the transformer 110/35/10 kv were made during regular inspection and maintenance activities and they are presented in table 4. the results for physical and chemical measurements, active resistance and leakage inductance are two years old.

table 4. transformer assessment using traditional hi
component      wd   inspection method   wm   om
oil            4    phys, chem          5    3
oil            4    h2o                 4    2
insulation     4    tgδ                 5    1
insulation     4    fds                 4    -
insulation     4    furan               3    3
active part    5    dga                 -    2
windings       4    r                   5    3
windings       4    l                   5    3

using the traditional hi calculation method (equation (2)), the grade $O_d$ for oil, insulation, active part and windings equals 2.56, 1.75, 2 and 3, respectively. using equation (1), the value of hi is given in (15).
$$HI = \frac{\sum_{i=1}^{n} O_{di} W_{di}}{\sum_{i=1}^{n} W_{di}} = \frac{4 \cdot 2.56 + 4 \cdot 1.75 + 5 \cdot 2 + 4 \cdot 3}{17} = 2.3 \qquad (15)$$
as stated before, some measurements are not up to date (two years old) and some inspection methods are not absolutely accurate. the new methodology requires the initial degrees of belief listed in table 5. weighting factors for the et components (wd) and for the testing methods (wm) are also presented in the table. starting from the values in tables 1 and 2, the factors are normalized to fulfil the condition (8).

table 5 initial data for the degrees of belief calculation
component      wd     inspection method    wm     hi = 3   hi = 2   hi = 1   hi = 0
oil            0.24   phys, chem (βi,1)    0.55   0.5      0.5      0        0
oil            0.24   h2o (βi,2)           0.45   0        0.8      0.2      0
insulation     0.24   tgδ (βi,1)           0.41   0        0        0.8      0
insulation     0.24   fds (βi,2)           0.34   0        0        0        0
insulation     0.24   furan (βi,3)         0.25   0.8      0        0        0
active part    0.28   dga (βi)             -      0        0.9      0        0
windings       0.24   r (βi,1)             0.5    0.5      0.3      0        0
windings       0.24   l (βi,2)             0.5    0.5      0.3      0        0

by recursively using equations (10)–(14) for the aggregation of the probability masses for individual inspection methods, the probability masses for individual et components are obtained and presented in table 6. for instance, the assessment of the transformer oil (oil) for the grades h3 = “good”, h2 = “moderately good”, h1 = “moderately bad” and h0 = “bad” equals 0.17, 0.44, 0.045 and 0, respectively. the remaining probability mass (mh,i) equals 0.34.

table 6 degrees of belief for main transformer components
component     βi,3    βi,2    βi,1    βi,0   mh,i
oil           0.17    0.44    0.045   0      0.34
insulation    0.062   0       0.14    0      0.8
dga           0       0.252   0       0      0.748
windings      0.153   0.07    0       0      0.777

by using equations (10)–(14) and the values calculated in step 3, we get the combined degrees of belief for h3 = “good”, h2 = “moderately good” and h1 = “moderately bad” equal to 0.32, 0.175 and 0.08, respectively. using the traditional hi calculation method, the transformer is graded as “moderately good” (table 3). the er methodology, however, gives the distribution of belief states, with a 0.44 degree of belief that the transformer is in a moderately good state, and a significant value that the transformer could be in the better state (0.17). according to current practice in eps, grading the transformer in category 2 means that inspection should be carried out more often, resulting in increased expenses and non-supplied energy. further research will be focused on the estimation of financial losses resulting from the interruption of electricity supply that can be caused by an et failure.
4.
conclusions
calculating the transformer health index produces an extremely useful tool for quality resource management, analysis of the current state of transformers in the network and planning of preventive maintenance. this index provides an assessment of the status of the power transformer, which makes it possible to perform a comparative analysis between individual transformers and parts of the distribution system, to set priorities, to adequately channel financial resources and to plan corrective measures to improve the hi, that is, to ensure transformer operational readiness. the methodology presented in the paper uses the er approach, one of the latest developments in multi-criteria decision-making, applied to the prioritization of ets according to their condition. the methodology proved to be very useful in the field of reliability and stability of the distribution system. unlike the traditional hi calculation method, the er methodology gives a distribution of belief over the health states, showing that the transformer could be in better condition. according to current practice in eps, grading the transformer in lower categories means that inspection should be carried out more often, resulting in increased expenses and non-supplied energy. currently, the methodology does not address the precise economic model for the estimation of financial losses resulting from unnecessary interruption of electricity supply caused by inspections or, on the other hand, interruptions that can be caused by failure. therefore, further research will be focused on a more precise estimation of financial losses resulting from the interruption of electricity supply that can be caused by an et failure or unnecessary inspections.
references
[1] d. stevanović, a. janjić, “influence of circuit breaker replacement on power station reliability”, facta universitatis, series: electronics and energetics, vol. 32, no. 3, pp. 331–344, 2019.
[2] a. n. jahromi, r. piercy, s. cress, j. r. r. service, w. fan, “an approach to power transformer asset management using health index”, ieee electrical insulation magazine, vol. 25, no. 2, pp. 20–34, 2009.
[3] b. gorgan, p.v. notingher, v.l. badicu, g. tanasescu, “calculation of power transformers health indexes”, annals of the university of craiova, electrical engineering series, no. 34, pp. 13–18, 2010.
[4] m. vermeer, j. m. wetzer, p.c.j.m. van der wielen, e. de haan, e. de meulemeester, “asset-management decision-support modeling, using a health and risk model”, 2015 ieee eindhoven powertech, pp. 1–6.
[5] a. azmi, j. jasni, n. azis, m.z.a. ab. kadir, “evolution of transformer health index in the form of mathematical equation”, renewable and sustainable energy reviews, vol. 76, pp. 687–700, 2017.
[6] g.a. shafer, “mathematical theory of evidence”, princeton university press, princeton, 1976.
[7] z.j. zhang, j.b. yang, d.l. xu, “a hierarchical analysis model for multi-objective decision making”, analysis, design and evaluation of man–machine system, oxford, uk, pp. 13–18, 1990.
[8] j.b. yang, m.g. singh, “an evidential reasoning approach for multiple attribute decision making with uncertainty”, ieee transactions on systems, man, and cybernetics, vol. 24, no. 1, pp. 1–18, 1994.
[9] j.b. yang, d.l. xu, “on the evidential reasoning algorithm for multi-attribute decision analysis under uncertainty”, ieee transactions on systems, man and cybernetics part a: systems and humans, vol. 32, no. 3, pp. 289–304, 2002.
[10] w. h. tang, k. spurgeon, q. h. wu, z. j. richardson, “an evidential reasoning approach to transformer condition assessments”, ieee transactions on power delivery, vol. 19, no. 4, 2004.
[11] a. shintemirov, w.h. tang, q.h. wu, “transformer winding condition assessment using frequency response analysis and evidential reasoning”, iet electr. power appl., vol. 4, no. 3, pp. 198–212, 2010.
[12] r. liao, h. zheng, s. grzybowski, l. yang, y. zhang, and y.
liao, “an integrated decision-making model for condition assessment of power transformers using fuzzy approach and evidential reasoning”, ieee transactions on power delivery, vol. 26, no. 2, 2011.
[13] e. duarte, d. falla, j. gavin, m. lawrence, t. mcgrail, d. miller, p. prout, b. rogan, “a practical approach to condition and risk based power transformer asset replacement”, in proceedings of the ieee international symposium on electrical insulation, ieee, 2010.
[14] f. scatiggio, a. fraioli, v. iuliani, m. pompili, “health index: the terna’s practical approach for transformers fleet management”, cigre, paris, 2014.
[15] n. dominelli, “equipment health rating of power transformers”, in proceedings of the ieee international symposium on electrical insulation, indianapolis, usa, september 2004, pp. 163–168.
[16] “life management techniques for power transformers”, cigre working group 12.18, brochure 227, 2003.
[17] j. ponocko et al., “health index as the part of asset management in the area of power transformers” (in serbian), cigre conference, zlatibor, 2015.
https://ieeexplore.ieee.org/xpl/conhome/5542406/proceeding

instruction facta universitatis series: electronics and energetics vol. 30, no 1, march 2017, pp. 67–80
doi: 10.2298/fuee1701067l
a non-inverting buck-boost converter with an adaptive dual current mode control
srđan lale 1, milomir šoja 1, slobodan lubura 1, dragan d. mančić 2, milan đ. radmanović 2
1 university of east sarajevo, faculty of electrical engineering, east sarajevo, bosnia and herzegovina
2 university of niš, faculty of electronic engineering, niš, serbia
abstract. this paper presents an implementation of adaptive dual current mode control (adcmc) on a non-inverting buck-boost converter. a verification of the converter operation with the proposed adcmc has been performed in steady state and during disturbances in the input voltage and the load resistance.
the given simulation and experimental results confirm the effectiveness of the proposed control method.
key words: adaptive dual current mode control, non-inverting buck-boost converter, operating modes, transient response
1. introduction
a non-inverting buck-boost power electronics converter is one of the most versatile non-isolated converter topologies. it has become increasingly popular in many applications, including electric vehicles [1], dc microgrids [2], battery-powered portable electronic devices (e.g. cellular phones and laptops) [3], [4], power factor correction (pfc) circuits [5], photovoltaic systems [6], etc. the non-inverting buck-boost converter provides an output voltage that is either lower or higher than the input voltage. this property is significant in so-called dynamic voltage scaling (dvs)-based power-efficient supplies, which provide adjustable voltage levels according to the instantaneous operating conditions [7]. one of the most important features of this converter type is bidirectional operation, which is especially useful in applications such as dc microgrids and electric vehicles. the conventional two-switch topology of the non-inverting buck-boost converter is shown in fig. 1 (a), being the result of a cascaded combination of a buck converter followed by a boost converter. it contains two power switches t1 and t2. if bidirectional operation of the non-inverting buck-boost converter is required, the four-switch topology from fig. 1 (b) must be used, where the diodes d1 and d2 from fig. 1 (a) are replaced with additional power switches t3 and t4.
received january 25, 2016; received in revised form april 10, 2016. corresponding author: srđan lale, university of east sarajevo, faculty of electrical engineering, vuka karadžića 30, 71126 lukavica, east sarajevo, bosnia and herzegovina (e-mail: srdjan.lale@etf.unssa.rs.ba)
fig.
1 a 2-switch (a) and 4-switch (b) topology of the non-inverting buck-boost converter
depending on the ratio between the input voltage vg and the output voltage vo, the non-inverting buck-boost converter can operate in buck mode (vo<vg) or boost mode (vo>vg). as discussed in [8], these operating modes can be achieved in different ways. a conventional way is to control the switches t1 and t2 simultaneously with the same gate signal. although this switching scheme is simple, it provides low converter efficiency. in order to increase the efficiency, the operating modes are split: the converter operates either as a buck converter (only switch t1 is controlled, while t2 is always turned off) when vo<vg, or as a boost converter (only switch t2 is controlled, while t1 is always turned on) when vo>vg. however, the control of the switches is more complicated in this case, because it is necessary to provide mode detection and a smooth and stable transition between the modes. different control methods can be applied to the non-inverting buck-boost converter, depending on the application. this paper is focused on using only current mode control (cmc). in most cases, for example in [2], [3], [5], [9], regardless of the applied cmc method, it is suggested that the non-inverting buck-boost converter operates either as a buck or as a boost converter, as described above. in [5], a non-inverting buck-boost converter as a part of a pfc rectifier works in both modes during the fundamental period. after detection of each operating mode, the built-in control logic decides whether to work as conventional peak cmc (pcmc) or valley cmc (vcmc). therefore, the control shifts between pcmc (boost mode with duty cycle below 0.5) and vcmc (buck mode with duty cycle above 0.5) when the input rectified voltage crosses the output dc voltage, without the need for slope compensation. in [9] a synchronous buck-boost led driver controller is presented, which uses a more complex control as a combination of pcmc and vcmc with slope compensation.
however, as stated in [2], an implementation of conventional cmc methods on this converter, such as pcmc and vcmc, is not a simple task, because they require information about the converter operating modes. an average cmc (acmc) can be applied to the non-inverting buck-boost converter without determination of operating modes [2]. by using the dual-carrier modulator described in [2], it is possible to achieve a smooth transition between the buck and the boost mode and to precisely control the inductor current throughout the entire operating range. there are other acmc approaches, for example in [3], which unlike the above mentioned acmc [2] has a mode selector circuit, which determines the operating mode during a switching cycle. due to the inherent ability of natural transition between pcmc and vcmc and vice versa, a dual current mode control (dcmc) proposed in [10] could be suitable for implementation on the non-inverting buck-boost converter, with simultaneously controlled switches t1 and t2. the converter will operate in buck mode with pcmc (duty cycle below 0.5) and in boost mode with vcmc (duty cycle above 0.5). in this way, there is no need for detection of operating modes. also, the converter is stable for the entire range of duty cycle from 0 to 1, that is, subharmonic oscillations do not exist. on the other hand, all excellent features of pcmc and vcmc are preserved, such as fixed switching frequency, good dynamics and, what is very important, simplicity. a modified version of dcmc, named adaptive dual current mode control (adcmc), is proposed in [11] and elaborated in detail in [12], which improves some features of dcmc, while the basic operating principles remain the same. in [12], only simulation results are given for the non-inverting buck-boost converter.
in this paper, besides some simulation results, the experimental verification of adcmc of this converter is presented. the paper is organized in the following way. the basic operating principles of adcmc, on the example of the non-inverting buck-boost converter, are described in section 2. the simulation and experimental results are presented in section 3. section 4 gives the concluding remarks.
2. operating principles of adcmc of non-inverting buck-boost converter
the basic scheme of adcmc of the conventional non-inverting buck-boost converter is presented in fig. 2 (a). the switches t1 and t2 are controlled simultaneously with the same gate signal. in order to increase the converter efficiency, there is a possibility of a synchronous version of this converter, where the diodes d1 and d2 are replaced with power switches, as shown in fig. 1 (b). in this paper, a synchronous version is not used. however, as stated in the introduction, if bidirectional operation of the non-inverting buck-boost converter is required, these two additional switches are necessary. the quiescent value of the output voltage $V_o$ of the non-inverting buck-boost converter from fig. 2 (a) is equal to:
$$V_o = \frac{D}{1-D} V_g, \qquad (1)$$
where $D$ and $V_g$ are the quiescent values of the duty cycle and the input voltage, respectively. according to (1), when $0<D<0.5$ the converter operates in buck mode ($V_o<V_g$), and when $0.5<D<1$ it operates in boost mode ($V_o>V_g$). by applying pcmc when $D<0.5$ (buck mode) and vcmc when $D>0.5$ (boost mode), a stable operation of the non-inverting buck-boost converter is guaranteed for the entire range of $D$ without slope compensation. instead of mode detection and artificial shifting between pcmc and vcmc, dcmc proposed in [10] is suitable for this application, because it has a natural ability of shifting between pcmc, when $D<0.5$, and vcmc, when $D>0.5$, without any mode selector circuits. similarly to pcmc and vcmc, dcmc has a drawback in the existence of a peak-to-average current error (a difference between the reference current $i_{ref}$ and the average value of the inductor current $\langle i_L(t) \rangle_{T_s}$ over the switching period $T_s$). in the ideal case of cmc, the aim is
in ideal case of cmc, the aim is 70 s. lale, m. šoja, s. lubura, d. d. manĉić, m. đ. radmanović to control precisely the average value of the inductor current over each switching period, that is, to make this error equal to zero. in order to eliminate peak-to-average current error, an enhanced version of dcmc is proposed in [11], named adcmc. fig. 2 a) adcmc of the non-inverting buck-boost converter, b) operating modes a non-inverting buck-boost converter with an adaptive dual current mode control 71 the operating modes of adcmc applied to the non-inverting buck-boost converter are shown in fig. 2 (b). the main difference between adcmc and dcmc is in the fact that the width between peak iref+ib and valley iref-ib current boundaries (the current bandwidth 2ib), is not constant and predefined for adcmc, unlike dcmc, but it is adaptive and online calculated by using the instantaneous peak-to-peak ripple of the inductor current δilpp on each switching period ts. the adaptive current bandwidth 2ib for the non-inverting buckboost converter is calculated as (fig. 2 (a)): 2 , ( ) g o b ib lpp ib s g o v v i k i k lf v v     (2) where kib is the scaling gain (kib≥1), fs=1/ts is the switching frequency, and l is the inductance value. the gain kib determines whether 2ib≥δilpp. when kib=1, the adaptive current bandwidth 2ib becomes equal to the measured instantaneous peak-to-peak current ripple δilpp, giving zero peak-to-average current error. it is evident from (2) that the calculation of adaptive current bandwidth 2ib depends on the inductance value l, which can be inconvenient if the l parameter is wrong or variable in different operating conditions. the wrong l parameter will lead to inaccurate current bandwidth 2ib and the appearance of the peak-to-average current error. a possible solution for this issue is to directly measure the instantaneous peak-to-peak ripple from the measured inductor current. this solution will be considered in the future work. 
a detailed analysis of adcmc, including small-signal models and the design of the output voltage compensator gc(s), is presented in [12] for three types of dc–dc power electronics converters: buck, boost, and non-inverting buck–boost converter. this paper is focused on the experimental verification of adcmc of the non-inverting buck-boost converter.

3. simulation and experimental results

the operation of the non-inverting buck-boost converter under adcmc, with the topology from fig. 2 (a), was verified with simulations in matlab/simulink and experimentally. the parameters of the non-inverting buck-boost converter working in the continuous conduction mode (ccm), which are the same for both simulations and experiments, are listed in table 1. the experimental setup is shown in fig. 3. the developed setup can be used for testing adcmc on various types of converters, because the prototype is built as a universal four-quadrant (4q) converter that can easily be configured to the desired topology, such as buck, boost, non-inverting buck-boost, etc.

table 1 parameters of the non-inverting buck-boost converter

vg [v]      12
l [µh]      220
c [µf]      1000
r [Ω]       20
fs [khz]    23

fig. 3 experimental setup: 1) the prototype of the non-inverting buck-boost converter; 2) electronic module for measurements and inner current loop; 3) pc with built-in mf624 board; 4) driver module; 5) input voltage source of the converter; 6) power supply units; 7) tektronix mso 2014 oscilloscope

a separate electronic module, connected to the mf624 multifunctional data acquisition input/output digital board [13], is used to implement the measurements and the inner current loop. the inductor current, which is required by the inner current loop, is measured with the lem current transducer hx 10-np [14].
the converter input and output voltages are measured with galvanic isolation via the il300 optocoupler [15] and sampled by the 14-bit a/d converter (conversion time about 2 µs) of the mf624 board. the mf624 board is built into the computer and provides real-time processing in the matlab/simulink environment. the outer voltage loop and the calculation of the adaptive current bandwidth 2ib for adcmc are implemented in real time in simulink, using the real time windows target (rtwt) environment. the reference current iref and the current boundaries iref+ib and iref-ib are obtained from the 14-bit d/a converter of the mf624 board and fed into the inner current loop. the fundamental sampling time for real-time operation in simulink was set to 25 μs, which is the minimum sampling time for this hardware. power mosfets irf540n (100 v, 33 a) [16] are used as power switches t1 and t2. a dual-channel galvanically isolated mosfet driver module (turn on/off delay of 0.6 µs) was developed for driving the power switches.

the primary objective of the performed simulations and experiments is to demonstrate that the proposed adcmc can be successfully applied to the non-inverting buck-boost converter, ensuring stable operation in all operating modes and good dynamic properties, regardless of the application. several cases of converter operation were tested: steady state in buck and boost operating modes, step changes in the input voltage and the load resistance, and a gradual change of the input voltage.

3.1. operation of the non-inverting buck-boost converter in steady state

the output compensator, as a key part of the outer voltage loop, produces the reference current iref for the inner current loop (fig. 2 (a)). in steady state, the reference current has a practically constant value.
therefore, in order to test the behavior of the inner current loop in steady state, the outer voltage loop can be disabled and the reference current set manually as a constant signal. the steady-state operation of the non-inverting buck-boost converter with adcmc was tested for both cases: with and without the outer voltage loop. with the voltage loop disabled, two values of the reference current were used to provide buck and boost operating modes. the simulation waveforms of the inductor current in steady state are shown for iref = 0.5 a (buck mode) in fig. 4 (a) and iref = 5 a (boost mode) in fig. 4 (b). the corresponding experimental waveforms are given in fig. 5 (a), (b). in the second case, a simple proportional-integral (pi) compensator was employed to regulate the output voltage. a design procedure for the output voltage compensator is derived in detail in [12]. as in the first case, both operating modes were considered. the simulation waveforms of the inductor current in steady state are shown for two values of the output voltage: vo = 7 v (buck mode) and vo = 30 v (boost mode), in fig. 4 (c) and fig. 4 (d), respectively. the corresponding experimental results are presented in fig. 5 (c), (d). it is evident from fig. 4 that the reference current and the average value of the inductor current match very closely. a very small peak-to-average current error still exists, which can be attributed to the delays in the numerical calculation of the simulation. the experimental results from fig. 5 are similar to the simulation results from fig. 4. a small peak-to-average current error appears as a consequence of imperfections of the components used to realize adcmc. on the basis of the results from fig. 4 and fig. 5 it can be concluded that adcmc provides a stable operation of the non-inverting buck-boost converter for both ranges of the duty cycle: d < 0.5 and d > 0.5.

3.2.
robustness to the disturbances in the input voltage and load

it is very important to evaluate how sensitive a converter with a given control method is to the various disturbances that can occur during operation. in this paper, step and gradual changes of the input voltage and step changes in the load resistance were considered. line regulation, defined as the converter's ability to maintain the specified output voltage despite changes in the input voltage, was tested for adcmc of the non-inverting buck-boost converter.

fig. 4 the simulation waveforms of the inductor current, reference current and current boundaries in steady state, when the outer voltage loop is: a), b) disabled; c), d) enabled

fig. 5 the experimental waveforms of the inductor current, reference current and current boundaries in steady state, when the outer voltage loop is: a), b) disabled; c), d) enabled

first, step changes from 12 v to 6 v and vice versa were introduced in the input voltage. the output voltage was regulated to 9 v. the load resistance was set to r=10 Ω. these step changes were performed in order to force a transition from buck to boost mode and vice versa, and to examine the dynamic behavior of adcmc. the waveforms of the output voltage and the inductor current are shown in fig. 6 (a), (b) (simulation) and fig. 7 (experiment). the same parameters of the output voltage compensator were used in both simulations and experiments. as the simulation and experimental results show, the converter naturally crosses from buck to boost mode and vice versa. due to the adaptation of the current bandwidth 2ib, the transition of the inductor current from one mode to another is smooth, which gives satisfactory line regulation.

fig.
6 the simulation waveforms of the output voltage and the inductor current, for the step changes in the input voltage (a), (b) and the load resistance (c), (d)

fig. 7 the experimental waveforms of the output voltage (top) and the inductor current (bottom), when the input voltage changes from 12 v to 6 v (left) and vice versa (right)

fig. 8 the experimental waveforms of the output voltage (top) and the inductor current (bottom), when the load resistance changes from 20 Ω to 10 Ω (left) and vice versa (right)

fig. 9 the experimental waveforms of the output voltage, when the input voltage changes from 6 v to 12 v (top) and vice versa (bottom), for σ=100, 150, 200 and 500

fig. 10 the experimental waveforms of the output voltage, when the load resistance changes from 10 Ω to 20 Ω (top) and vice versa (bottom), for σ=100, 150, 200 and 500

in order to test the step load response, step changes in the load resistance from r=20 Ω to r=10 Ω and vice versa were performed. the output voltage was regulated to 20 v. the simulation and experimental waveforms of the output voltage and the inductor current are shown in fig. 6 (c), (d) and fig. 8, respectively. it is evident that adcmc successfully rejects the introduced load disturbances. the transient response of the output voltage to the considered step disturbances also depends on the designed output voltage compensator, as shown in fig. 9 and fig. 10. several values of the parameter σ, which determines the transient response time (about 5/σ) and the gains of the pi compensator [12], are considered. the experimental results in fig. 9 and fig. 10 show that higher values of the adjustable parameter σ give better responses in terms of transient response time and over/undershoot.
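the approximate relation between σ and the transient response time quoted above can be tabulated with a few lines of python. this is only a sketch: it assumes that σ is expressed in 1/s and uses the 5/σ rule of thumb stated in the text; the function name is hypothetical.

```python
def response_time_ms(sigma):
    """Approximate transient response time 5/sigma, converted to milliseconds."""
    return 5.0 / sigma * 1e3


# Values of sigma considered in fig. 9 and fig. 10.
for sigma in (100, 150, 200, 500):
    print(f"sigma={sigma:3d}  response time approx. {response_time_ms(sigma):.1f} ms")
```

under this assumption the four tested values span response times from about 50 ms (σ=100) down to about 10 ms (σ=500).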
the optimization of the output voltage compensator is not the subject of this paper. the aim was to obtain satisfactory results in accordance with the design procedure from [12] (the chosen settling time is about 10-50 ms). also, the output voltage loop is designed to be slow in order to emphasize the behavior of the inner current loop.

fig. 11 the experimental waveforms of the output voltage (top) and the inductor current (bottom), for the gradual change of the input voltage from 15 v to 5 v (left) and vice versa (right)

besides the step changes, a gradual linear change in the input voltage was also introduced in the experiments. the input voltage was gradually changed from 15 v to 5 v and vice versa, while the output voltage was regulated to 10 v, in order to make a gradual transition from buck to boost mode and vice versa. the experimental results are shown in fig. 11. adcmc is clearly robust against these changes. the output voltage is successfully regulated, without any disruption at the transition between the two operating modes.

4. conclusion

in this paper, an implementation of a novel adcmc method on the non-inverting buck-boost converter has been presented. the simulation and experimental results confirm that there is no need to detect the converter operating mode, because this method ensures a natural and stable transition between the buck and the boost mode, and vice versa. the results show that adcmc provides a stable operation of the non-inverting buck-boost converter for the entire range of duty cycle from 0 to 1. it is also robust against disturbances, such as step and gradual changes in the input voltage and step changes in the load resistance, with good dynamic performance.
future work will apply the proposed adcmc of the non-inverting buck-boost converter in various popular applications, such as battery chargers/dischargers, led drivers, etc., and compare it with other relevant methods in the same applications.

references

[1] m. a. khan, a. ahmed, i. husain, y. sozer and m. badawy, "performance analysis of bidirectional dc–dc converters for electric vehicles", ieee trans. ind. appl., vol. 51, no. 4, pp. 3442-3452, july/aug. 2015.
[2] i. aharon, a. kuperman and d. shmilovitz, "analysis of dual-carrier modulator for bidirectional noninverting buck–boost converter", ieee trans. power electron., vol. 30, no. 2, pp. 840-848, feb. 2015.
[3] wei chia-ling, chen chin-hong, wu kuo-chun and ko i-ting, "design of an average-current-mode noninverting buck–boost dc–dc converter with reduced switching and conduction losses", ieee trans. power electron., vol. 27, no. 12, pp. 4934-4943.
[4] c.-h. tsai, y.-s. tsai and h.-c. liu, "a stable mode-transition technique for a digitally controlled non-inverting buck–boost dc–dc converter", ieee trans. ind. electron., vol. 62, no. 1, pp. 475-483, jan. 2015.
[5] g. k. andersen and f. blaabjerg, "current programmed control of a single-phase two-switch buck-boost power factor correction circuit", ieee trans. ind. electron., vol. 53, no. 1, pp. 263-271, feb. 2006.
[6] t.-f. wu, c.-l. kuo, k.-h. sun, y.-k. chen, y.-r. chang and y.-d. lee, "integration and operation of a single-phase bidirectional inverter with two buck/boost mppts for dc-distribution applications", ieee trans. power electron., vol. 28, no. 11, pp. 5098-5106, nov. 2013.
[7] l. feng and m. dongsheng, "design of digital tri-mode adaptive-output buck–boost power converter for power-efficient integrated systems", ieee trans. ind. electron., vol. 57, no. 6, pp. 2151-2160, june 2010.
[8] haifeng fan, "design tips for an efficient non-inverting buck-boost converter", analog applications journal, texas instruments, pp. 20-25, 2014.
[9] linear technology, "60v 4-switch synchronous buck-boost led driver controller", lt3791 datasheet, rev. b, 2012. available: http://cds.linear.com/docs/en/datasheet/3791fb.pdf.
[10] a. v. anunciada and m. m. silva, "a new current mode control process and applications", ieee trans. power electron., vol. 6, no. 4, pp. 601–610, oct. 1991.
[11] s. lale, m. šoja, s. lubura and m. radmanović, "modeling and analysis of new adaptive dual current mode control", in proceedings of the 10th international symposium on industrial electronics indel 2014, vol. 10, no. t-02, pp. 73–76. available: http://www.indel.etfbl.net/resources/proceedings_2014/indel_2014_paper_11.pdf.
[12] s. lale, m. šoja and s. lubura, "a modified dual current mode control method with an adaptive current bandwidth", int. j. circ. theor. appl., 2015.
[13] humusoft, "mf624 multifunction i/o card", mf624 user’s manual, 2014. available: http://www2.humusoft.cz/www/datacq/manuals/mf624um.pdf.
[14] lem, "current transducer hx 05..15-np", hx 10-np datasheet. available: http://www.lem.com/docs/products/hx%205_15-np_e%20v10.pdf.
[15] vishay semiconductors, "linear optocoupler, high gain stability, wide bandwidth", il300 datasheet. available: http://www.vishay.com/docs/83622/il300.pdf.
[16] international rectifier, "hexfet ® power mosfet", irf540n datasheet. available: http://www.irf.com/product-info/datasheets/data/irf540n.pdf.

facta universitatis, series: electronics and energetics, vol. 27, no 3, september 2014, pp. 455-465, doi: 10.2298/fuee1403455k

load modelling at low voltage using continuous measurements

lidija m.
korunović 1, milica rašić 1, nenad floranović 2, vladica aleksić 3
1 faculty of electronic engineering, university of niš
2 research and development centre “alfatec”
3 public utility company “elektrodistribucija vranje”

abstract: the paper presents the results of load modelling at the low voltage level of a transformer station (ts) 10/0.4 kv/kv supplying predominantly residential load. the necessary data is obtained from a recording device and measuring information system for continuous measurements permanently installed in the distribution network concerned. the identified parameters of the static exponential load model are classified according to day periods and days of the week, and statistically analysed in order to obtain reliable parameter estimates. these are compared with literature data for the same residential load class. possibilities for further application of the described load modelling procedure are listed.

key words: load modelling, recording device, exponential load model, parameter estimation

1. introduction

load modelling is a mature topic, but in recent years it has again become very challenging and important. there is renewed interest in load modelling in both industry and academia, due to the appearance of new types of loads and modern nonlinear electrical and electronic equipment offering increased efficiency and controllability. these are compact fluorescent lamps (cfls), light-emitting diode (led) light sources, adjustable speed drives (asds), inverter-interfaced distributed generation, plug-in electric vehicle chargers, etc. furthermore, it is expected that accurate modelling of loads and other network components will be even more important for realising increased flexibility and improved energy efficiency of future electric power networks [1]. there are two general methodologies for load modelling: the component based and the measurement based approach [2].
the first one assumes a priori knowledge of load models and corresponding load model parameters of individual low voltage (lv) load components. afterwards, load characteristics at higher voltage levels can be derived from the known load model components and their composition at lower voltages, by applying some aggregation method (e.g. [3], [4]).

received february 21, 2014; received in revised form may 17, 2014. corresponding author: l. m. korunović, faculty of electronic engineering, a. medvedeva 14, 18000 niš, republic of serbia (e-mail: lidija.korunovic@elfak.ni.ac.rs)

it is very difficult, however, to establish the exact load composition at medium and high voltage levels [5]. the load composition at a bus changes with the season, the day of the week and the time of day. therefore, the results obtained by the component based load aggregation approach should be used with caution, and the measurement based approach is regarded as the better one. the measurement based approach [6]-[10] relies on field measurements at selected electric power system buses, performed specifically for load modelling purposes, or on laboratory measurements of low voltage devices aimed at obtaining their load models. the results obtained for the investigated load buses can be used for modelling the load at other buses if the load structure there is similar. since load composition changes with time, it is recommended to identify load model parameters for different seasons, different days of the week and characteristic time intervals during a day [11]. data for load modelling using field measurements can be obtained from field tests or from continuous measurements. field tests may be performed only in time periods when they cannot endanger electric power system operation. during these tests, operating conditions have to stay within acceptable, predefined limits, which preserves safety and the quality of supply.
therefore, the ranges of the voltage changes during the tests are rather narrow [11], [12]. continuous measurements imply data recording, by equipment with a relatively large memory, in time intervals when disturbances in the electric power system occur [13]. some of these recording devices can transmit or process the data [12]. the measurement based approach requires the permission of the electrical distribution company and relatively expensive measuring equipment. in a deregulated electricity market, and in a market economy in general, it is becoming very difficult to get permission for field tests in the electric power system. additionally, the engagement of the utility's employees is needed. therefore, continuous measurements are much more promising, especially in a smart grid environment. the idea of this paper is to present a load modelling procedure that uses pre-installed recording equipment for continuous measurements in a distribution network. thus, there is no additional cost for the purchase and installation of generally very expensive measuring equipment. since new types of recording devices can transmit data to the control centre, it is no longer necessary to get permission for every entrance into the utility's substations, as was required for local access to data stored in older types of devices. in general, the procedure is applicable to all voltage levels. in the paper it is demonstrated on the example of the low voltage aggregate load. this is of additional importance, since there is no published research that deals with load modelling using the measurement based approach at low voltage. the main reasons are: the need for very expensive recording equipment, the large number of load buses at lv, and the stochastic nature of the aggregate load at these buses, which disturbs the identification of load model parameters.
the final aim of this paper is to demonstrate that statistically reliable parameters of the load at lv can be obtained for different days of the week and different day periods.

2. data recording, storing and transmission

in the public utility company “elektrodistribucija vranje” several recording devices have been installed at the low voltage level of tss 10/0.4 kv/kv in order to measure the load consumption, which enables comparison of the delivered and consumed energy and helps to find possible causes of increased energy losses. the technical capabilities of such a device as an embedded part of the measuring information system (mis) [14] are discussed briefly in this paper. these are: measuring, recording, storing and transmission of recorded data. a device of the mks–i2–5 type has been installed in ts “asambair krug” 10/0.4 kv/kv, which supplies predominantly residential load (households account for 92.3 % of all 349 lv consumers in this consumption area). this transformer station is the first of three tss 10/0.4 kv/kv fed by the same 10 kv feeder. the feeder is supplied from ts “vranje 1” 35/10 kv/kv, located in the urban area of the town of vranje, which is fed by a ts 110/35 kv/kv with two on-load tap-changing transformers (2×31.5 mva, 110±10×1.5%/36.75/10.5 kv/kv/kv). the simplified one-line diagram of the device connections in ts "asambair krug" is depicted in fig. 1. the current probes of the device are connected to the transformer secondary via the existing current measuring transformers (ct), and the voltage probes are connected to the lv bus directly.

fig. 1 principal diagram of device connections in considered ts

currents and voltages are recorded and processed by mis.
it can store a significant number of variables such as: true effective (rms) current values, phase and phase-to-phase rms voltages, single-phase and three-phase real, reactive and apparent power, individual and total harmonic distortions of particular currents and voltages, etc. the variables to be stored, as well as the sampling rate, depend on user settings. the recorded data is stored on the 14 gb hard disk of the device. the stored data can be accessed locally through the built-in device functionality, using a usb memory stick and the gui (graphical user interface) of the device. the data is automatically transferred to the usb memory stick after it is connected to the usb port, i.e. a software module is activated and the data is transferred. another way of accessing the stored data locally is via ethernet, using a crossover cable and the ssh (secure shell) protocol with a suitable ssh client. the data can also be accessed remotely in the control centre of the public utility company “elektrodistribucija vranje”. the data transmission to the control centre is performed automatically by executing pre-set scripts over a gsm/gprs communication channel or a digital radio modem, with transfer speeds of 200 kb/s and 9.6 kb/s, respectively. in both cases, the stored data is accessed using an ssh client, which is a software program that implements the ssh protocol for connecting to a remote computer. the central server located in the control centre stores the recorded mis data, which can be further processed, analysed and used for operation and planning purposes. since the rms voltages and the real and reactive power required for load modelling are recorded by the device, they can be used for this purpose. the application of the pre-installed mis and the data stored by the central server, in the domain of load modelling, is demonstrated below.

3.
operation data

a period of ten days of normal distribution network operation is chosen for the analysis. the device recorded data from 1st to 10th march 2012 with a sampling period of 12 s, according to the pre-defined time-interval setting for the monitored parameters of energy consumption in the considered consumption area. the recording device generates a special file format that is optimized for data transfer and compresses well. the average size of a one-day file is 400 kb, so the total size of the uncompressed data for the ten-day period is 4 mb. therefore, it is easy to transmit, store and process the recorded data for the discussed period, but also for longer time intervals, e.g. several months or year(s). furthermore, it is expected that the number of recording devices in distribution network(s) will increase in the future and enable simultaneous data collection from various consumption areas of different tss. this will facilitate a widespread application of the proposed load modelling procedure. in the considered lv distribution network the number of remotely accessed devices is limited only by the provider capacity in the case of the gsm/gprs communication channel, i.e. by the capacity of the subnet protocol of the provider. in the case of digital radio modem transmission, the limitations are: line of sight between the two transmitting antennas, bandwidth and transmitter signal strength. figs. 2a), 2b) and 2c) present the voltage, real power and reactive power curves recorded in ts “asambair krug” on sunday, 4th march, and figs. 2d), 2e) and 2f) the corresponding curves recorded on friday, 9th march. the general observations listed for these curves also apply to the corresponding curves for other days.
the voltage varies within a relatively narrow range: from maximum values of 417.3 v on sunday and 417.4 v on friday, measured during early morning hours in the period of low load, to minimum values of somewhat more than 396.5 v in the period of high evening load on sunday and 391.9 v around 10 am on friday. the real power curves are characterized by decreases during the night, minimum values (174.9 kw and 187.4 kw on sunday and friday, respectively) in the early morning, and subsequent increases up to the late morning hours, when they start to vary in narrow ranges. the load tends to increase from the early evening hours, up to 594.4 kw at 19:38:08 on sunday and up to 549.2 kw at 20:00:15 on friday. both real power curves decrease before midnight. these curves are also characterized by stochastic load changes. stochastic rapid changes are more notable in the reactive power curves. the minimum values of these curves are 42.6 kvar and 39.6 kvar on sunday and friday, respectively, and the corresponding maximum values are 103.4 kvar and 113.6 kvar. the challenge is to identify load model parameters under such stochastic load behaviour, which is typical for lv networks. it should be emphasized that the trends of the voltage curves and the numerous small voltage changes are not predominantly influenced by the supplied load, but rather by the rest of the electric power network and the voltage regulation at higher voltage levels. in the presented load modelling procedure only abrupt voltage changes greater than 1.5 % of the rated voltage (un) are selected for the analysis. changes smaller than 1.5 % of un cause very small real and reactive power changes that are strongly influenced by stochastic load variations. these variations significantly disturb the identification of load model parameters. the analysis of the curve from fig.
2a) reveals that only two abrupt voltage changes greater than 1.5 % occurred during sunday. at 02:01:38 the voltage decreased from approximately 414 v to 405 v (i.e. by nearly 2.3 % of un) and immediately afterwards increased to approximately the initial value. similar abrupt voltage changes are typical for early morning hours, and were also recorded on friday (fig. 2d)). the considered lv network and the supplying medium voltage network do not include controlled capacitors or other regulating equipment set to operate when voltage changes or voltage values exceed predefined limits. therefore, it can be concluded that the described variations are the result of voltage regulation performed by changing the tap position of the two on-load tap-changing transformers in ts 110/35 kv/kv that supply ts “vranje 1”.

fig. 2 voltage a), real b) and reactive power c) curves during sunday, and voltage d), real e) and reactive power f) curves recorded on friday

fig. 3 zooms in on two voltage changes (and the corresponding real and reactive power responses) that started at 1:09:36 on friday, 9th march. these are a voltage decrease and a voltage increase of approximately 6 % of un. time in this figure is labelled with integers: zero denotes the recording moment when the first change started, and two negative and four positive numbers correspond to two prior and four subsequent recording moments, respectively. stochastic changes of the real and reactive power while the voltage is almost constant are notable in figs. 3b) and 3c), respectively. load parameters identified in the presence of such stochastic load variations should be accepted with caution. therefore, the load modelling procedure presented in this paper includes the analysis of a large number of voltage changes and a statistical analysis of the identified load model parameters in order to obtain estimated, but reliable parameters.

fig.
3 two subsequent voltage changes a), and real b) and reactive c) power responses

4. load model

each voltage change in fig. 3 is described by two points. the voltage decrease and the corresponding real and reactive power responses start at recording moment 0 (point 0) and finish at moment 1 (point 1), while the voltage increase and its immediate power responses start at moment 1 and finish at recording moment 2. therefore, a very simple load model can be used, such as the exponential load model with the frequency dependence term neglected (since the voltage changes much more than the frequency). it is one of the most frequently used static load models [1]:

$$P = P_n \left(\frac{U}{U_n}\right)^{k_{pu}}, \qquad (1)$$

$$Q = Q_n \left(\frac{U}{U_n}\right)^{k_{qu}}. \qquad (2)$$

in model (1)-(2), $P$ and $Q$ denote the real and reactive load power at voltage $U$, $P_n$ and $Q_n$ are the real and reactive load power at the rated voltage $U_n$, and the model parameters $k_{pu}$ and $k_{qu}$ are the voltage exponents of the real and reactive power, respectively. if both voltage exponents are 0, 1 or 2, the load is of constant power, constant current or constant impedance type, respectively. there is a variant of model (1)-(2) with the initial real and reactive power values, p0 and q0, at the initial voltage u0, instead of pn and qn at un [2]. the exponential load model describes the changes of the real and reactive power of predominantly residential load with voltage variations in the range from 0.915 pu to 1.1 pu, with acceptable errors [7]. these errors are up to 1.1 % for real and 6 % for reactive power. the analysis of all abrupt voltage changes during the considered period of ten days shows that the voltage stays within a relatively small range during all 42 changes, i.e. from 417.4 v (1.044 pu) to 377.6 v (0.944 pu). therefore, the exponential load model is adequate.
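the behaviour of model (1)-(2) for different voltage exponents can be illustrated with a short python sketch. the function name and the numerical values (a rated voltage of 400 v, consistent with the per-unit values quoted above, and 500 kw at rated voltage) are assumptions chosen for illustration only.

```python
def exponential_load(u, u_n, s_n, k):
    """Static exponential load model: S = S_n * (U / U_n) ** k.

    Applies to real power (k = k_pu) and to reactive power (k = k_qu) alike.
    """
    return s_n * (u / u_n) ** k


U_N = 400.0   # assumed rated voltage [V]
P_N = 500.0   # hypothetical real power at rated voltage [kW]

# Exponents 0, 1 and 2 correspond to the three classical load types.
for k, label in ((0, "constant power"),
                 (1, "constant current"),
                 (2, "constant impedance")):
    p = exponential_load(0.95 * U_N, U_N, P_N, k)
    print(f"k={k} ({label}): P = {p:.2f} kW at 0.95 pu")
```

the larger the exponent, the more strongly the consumed power follows a voltage change, which is why the exponent is a compact indicator of load composition.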
the parameters of the variant of the exponential load model that includes the initial voltage and power values can be calculated as:

$$k_{pu} = \frac{\ln(p/p_0)}{\ln(u/u_0)} \approx \frac{(p - p_0)/p_0}{(u - u_0)/u_0}, \tag{3}$$

$$k_{qu} = \frac{\ln(q/q_0)}{\ln(u/u_0)} \approx \frac{(q - q_0)/q_0}{(u - u_0)/u_0}, \tag{4}$$

where the approximation holds for small voltage (and therefore small power) deviations. according to (3) and the data from fig. 3, kpu=1.778 and kpu=1.554 are identified for the voltage decrease and increase, respectively. analogously, kqu=3.272 and kqu=3.853 are obtained for the data from the same figure using (4). more precise load modelling results can be obtained by increasing the sampling rate, by filtering [1] and/or by selecting the polynomial load model, which has proved to be more accurate [10]. however, the main aim of the paper is to demonstrate that it is possible to use small files of recorded data and still obtain reliable model parameters. namely, utilities in serbia very often face problems with the transmission of large data files. if the sampling rate is increased, such files will result from measurements at numerous lv buses over long periods of time (which is the final aim of the research whose beginning is presented in this paper).

5. parameter estimation

further analysis of the recorded voltage values revealed that 42 voltage changes greater than 1.5 % of un occurred during the ten considered days, while the maximum recorded change was 7.6 %. the data is separated into two sets. the first set is used for model tuning and includes data from 1st march to 3 pm of 7th march, with 28 voltage changes. the second set is used for model verification and includes data from 3 pm of 7th march to 10th march, with 14 recorded voltage changes.
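the identification step in (3)-(4) can be illustrated with a short python sketch. it is only a minimal example under stated assumptions: the function names and the numerical readings are hypothetical and are not taken from the recordings analysed in the paper.

```python
import math


def voltage_exponent(x0, x, u0, u):
    """Exact form of (3)/(4): ln(x/x0) / ln(u/u0), with x standing for p or q."""
    return math.log(x / x0) / math.log(u / u0)


def voltage_exponent_linearised(x0, x, u0, u):
    """Small-deviation approximation of (3)/(4): relative power change over relative voltage change."""
    return ((x - x0) / x0) / ((u - u0) / u0)


# Hypothetical readings (assumed values, not measured data).
u0, u = 400.0, 376.0            # a 6 % voltage decrease, in volts
p0 = 100.0                      # initial real power, in kW
p = p0 * (u / u0) ** 1.6        # power that a load with exponent 1.6 would draw

k_exact = voltage_exponent(p0, p, u0, u)
k_lin = voltage_exponent_linearised(p0, p, u0, u)
print(round(k_exact, 3), round(k_lin, 3))
```

for a change of this size the two forms give slightly different values, which is why the approximation is tied to small deviations.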
table 1 lists the number of the considered voltage changes from the first data set, grouped according to different time intervals of the day and three characteristic days of the week: working day, saturday and sunday. in the majority of time intervals and on both working days and weekend days, voltage changes greater than 1.5 % were recorded and the corresponding load model parameter was obtained.

table 1 number of voltage changes in time intervals and days of the week

time interval [h]   number of changes   day of the week   number of changes
0-6                 10                  working day       32
6-12                4
12-18               14                  saturday          4
18-24               0                   sunday            2

the identified parameters are also grouped according to time intervals and days of the week, and statistically analysed. the results of this analysis are the mean values and standard deviations of the parameters for three time intervals and three days (table 2). the mean, i.e. estimated, value of kpu is the smallest in the afternoon, from 12 h to 18 h (kpu=1.382), and the largest in the morning, from 6 h to 12 h (kpu≈2). this can be explained by the changes in load composition during the day, indicating that resistive load devices are the predominant load component in the morning hours. parameter kpu of resistive load devices is 2, as listed in numerous papers and books such as [2, 3, 7]. the estimated value of kpu on different days of the week, and therefore the average load composition, is almost the same. this parameter belongs to a narrow range, from 1.561 to 1.662. the standard deviation of kpu is the measure of parameter dispersion. this standard deviation belongs to a relatively wide range, from 0.055 for sunday to 0.958 for the period 12-18 h. greater values indicate that the load composition changes significantly in the considered time intervals or days, but they can also be caused by larger stochastic load changes.
thus, the analysis of small voltage variations, in the range from 1 % to 1.5 % of un, in the time interval 18-24 h and their real power responses yields a mean value of kpu of 2.017, and the corresponding standard deviation is of the order of the mean value (it is 1.681). such a large standard deviation confirms the statement from section 3: parameters obtained on the basis of such small voltage changes are very unreliable.

table 2 mean values and standard deviations of the parameters

parameter   time interval [h]   mean    standard deviation   day of the week   mean    standard deviation
kpu         0-6                 1.874   0.498                working day       1.662   0.831
            6-12                2.013   0.958                saturday          1.561   0.308
            12-18               1.382   0.774                sunday            1.608   0.055
kqu         0-6                 3.659   0.784                working day       2.998   0.603
            6-12                2.449   0.163                saturday          3.916   1.264
            12-18               3.023   0.637                sunday            3.550   0.255

the mean, i.e. estimated, value of kqu grouped according to the time intervals changes from 2.449 in the period 6-12 h to 3.659 in the early morning hours (0-6 h). according to small voltage deviations of 1-1.5 % of un, the mean value of kqu is 4.301 in the interval from 18 h to 24 h, but again the standard deviation is almost equal to the mean value: it is 4.039. mean values of kqu grouped according to the days of the week belong to a somewhat narrower range, from 2.998 during working days to 3.916 on sunday, confirming that the load composition varies with day periods rather than with the days of the week. standard deviations of kqu are several times smaller than the corresponding mean values. the mean values of parameters from table 2 represent load behaviour during the 28 examined voltage changes very well. this is proved by simulations of both real and reactive power responses to these voltage changes by the exponential model with the corresponding mean values of parameters from table 2.
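the grouping and statistical analysis behind table 2 can be sketched as follows. this is an illustrative python example only: the sample values are hypothetical, the helper names are invented for the sketch, and the use of the sample (n-1) standard deviation is an assumption, since the paper does not state which estimator was applied.

```python
from collections import defaultdict
from statistics import mean, stdev


def interval_label(hour):
    """Map an hour of the day (0-23) to one of the six-hour intervals used in the tables."""
    start = (hour // 6) * 6
    return f"{start}-{start + 6}"


def summarise(samples):
    """Group (hour, exponent) pairs by time interval and return {interval: (mean, std)}."""
    groups = defaultdict(list)
    for hour, k in samples:
        groups[interval_label(hour)].append(k)
    # Assumption: sample standard deviation; a single value gets 0.0.
    return {
        label: (mean(values), stdev(values) if len(values) > 1 else 0.0)
        for label, values in groups.items()
    }


# Hypothetical identified kpu values with the hour at which each change occurred.
samples = [(1, 1.8), (2, 2.0), (13, 1.2), (14, 1.4), (16, 1.6)]
stats = summarise(samples)
for label, (m, s) in sorted(stats.items()):
    print(label, round(m, 3), round(s, 3))
```

grouping by day of the week would follow the same pattern with a different key function.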
when the parameters of time intervals are used, the errors are greater than 5 % in only three of 56 simulations, while the largest error is 6.8 %. similarly, simulations that assume the estimated parameters for the corresponding days of the week produce an error beyond the acceptable limit of 5 % in four cases (i.e. in approximately 7 % of 56 simulations). furthermore, the estimated, mean parameters are validated on the second set of the recorded data. this set includes four voltage changes in the time interval 0-6 h and ten changes in the time interval 12-18 h, both on working days. very small deviations from the measured values are obtained. when the parameters of particular time intervals are used, the deviations do not exceed 6.6 % and in about 10 % of cases they are greater than 5 %. similarly, in approximately 7 % of cases the errors are greater than 5 % when the mean values for a working day are used. this indicates that parameter values obtained on the basis of about ten or more voltage changes (table 1) can be used for load modelling under similar loading conditions and that reliable parameters are estimated although the nature of load at lv is rather stochastic. however, in the time interval 18-24 h no large voltage change occurred, and in the interval 6-12 h, on saturday and on sunday a very small number of changes was recorded, i.e. there was no statistically significant sample of data (or no data at all). therefore, a further analysis based on measurements over longer time intervals, which will provide more relevant data in all time intervals and all days of the week, will be the subject of future research. it can also include determining the parameters in different months, seasons and years. measurements over long time intervals also make it possible to adjust and verify the model recursively. this methodology is applied to the second set of data, while the estimated parameters from the first data set are used as initial values.
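the validation described above (simulating the power response with the estimated exponent and counting how often the error exceeds 5 %) can be sketched in python as follows. the events are hypothetical, and the error definition (absolute deviation relative to the measured value) is an assumption, because the paper does not give the formula explicitly.

```python
def simulate_power(x0, u0, u, k):
    """Exponential load model with initial values: x = x0 * (u / u0) ** k."""
    return x0 * (u / u0) ** k


def relative_error_percent(measured, simulated):
    """Assumed error definition: |simulated - measured| / |measured| in percent."""
    return abs(simulated - measured) / abs(measured) * 100.0


def share_above_limit(errors, limit=5.0):
    """Percentage of simulations whose error exceeds the limit."""
    return sum(e > limit for e in errors) / len(errors) * 100.0


# Hypothetical voltage changes: (p0 [kW], u0 [V], u [V], measured p [kW]).
events = [
    (100.0, 400.0, 380.0, 92.0),
    (50.0, 390.0, 410.0, 60.0),
]
k_mean = 1.662  # mean kpu for a working day from table 2

errors = [
    relative_error_percent(p_meas, simulate_power(p0, u0, u, k_mean))
    for p0, u0, u, p_meas in events
]
share = share_above_limit(errors)
print([round(e, 2) for e in errors], share)
```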
for every subsequent voltage change, the load model parameters are identified and adjusted. thus, the old (previously determined) parameter value is adjusted proportionally to the difference between the identified and the old value of the concerned parameter, and the new parameter value is obtained. it is used for model validation by calculating the difference between the measured power response to the subsequent voltage change and the corresponding simulated response on the basis of this new value. the model is validated since such differences are less than 5 %: in 7 % of simulations when parameters that are grouped according to the time intervals are adjusted, and in 10 % of simulations when parameters grouped according to days are adjusted. most of the final parameter values are very close to those from table 2, indicating that the load composition remained almost the same in the period (on the days) belonging to the second data set. for example, the final parameters of working days are kpu=1.623 and kqu=3.169, while the corresponding parameters from table 2 are 1.662 and 2.998, respectively. it should be emphasized that the results presented in this paper are very valuable because they concern a concrete, time-variable load. the use of parameter values from the literature can lead to significant errors, since these correspond to the load in a different country or a different region of the same country, with different load composition. generally, the literature does not provide parameters for different time periods of the day and different days of the week. thus, for residential load in winter, [2], [15], and [10] just list
kpu=1.5 and kqu=3.2, kpu=1.5-1.7 and kqu=2.5-2.6, and kpu=1.761 and kqu=3.656, respectively. when the parameters from [2], the mean values from [15] and the parameters from [10] are used for simulation of the 84 measured p and q demands analysed in this paper, the errors are greater than 5 % in approximately 10 %, 20 % and 14 % of the cases, respectively. therefore, data from [2] can be treated as adequate for the considered load, but in general, literature data can lead to significant errors. on the other hand, [7] provides the exponential load model parameters of the predominantly residential load at 10 kv in the town of niš (republic of serbia) for two winter days, friday and saturday, in three day periods: morning, afternoon and night. these parameters (table 3) change in narrower ranges than the mean values of parameters from table 2: kpu belongs to the range from 1.716 to 1.861 and kqu to the range from 2.962 to 3.616. data from table 3 is used for simulations of the real and reactive power responses that are recorded in ts "asambair krug" in the corresponding time intervals and days. in 12.5 % of cases the simulation errors are greater than 5 %, proving that the use of parameters of the same load class, but obtained at a different voltage level, in a different town, and several years earlier, can cause unacceptable errors.

table 3 exponential load model parameters [7]

time interval   morning             afternoon           night
day             friday   saturday   friday   saturday   friday   saturday
kpu             1.716    1.791      1.767    1.812      1.861    1.774
kqu             3.565    3.616      3.135    3.169      2.962    3.138

the load modelling procedure demonstrated in this paper can be broadly used in “elektrodistribucija vranje” and other utilities equipped with the devices described in this paper or similar recording devices. by now, approximately 200 recording devices similar to the device described in the paper have been installed in serbian utility companies. thus, the procedure based on continuous measurements will allow load model parameters to be obtained in different regions of the country and for different load classes.
additionally, the estimated load model parameters can be obtained in different months and seasons, and also separately for small and large voltage variations.

6. conclusion

the paper presents a simple and efficient load modelling procedure that enables tracking of model parameters at low voltage over a long time period. it uses the data from a previously installed recording device and the measuring information system, stored on a central server in the control centre. in this way additional costs for generally expensive equipment are avoided and easy data access is enabled. the identified parameters are grouped according to day periods and days of the week and then statistically analysed, providing parameter estimates that proved to be reliable for a load with significant stochastic changes. the estimated values of kpu and kqu vary approximately from 1.4 to 2 and from 2.4 to 3.7, respectively, depending on the period of the day and indicating changes of load composition. the variations of the parameter estimates are greater across day periods than across days of the week. the described load modelling procedure can be applied simultaneously at numerous low voltage buses over a long time period, providing parameter estimates in different months and seasons that are valuable for both exploitation and planning purposes.

acknowledgement: the paper is a part of the research done within the projects iii44006 and iii44004 supported by the ministry of education, science and technological development of the republic of serbia.

references
[1] cigre wg c4.605, "modelling and aggregation of loads in flexible power networks", report tb 566, cigre, oct. 2013.
[2] p. kundur, power system stability and control. new york: mcgraw-hill, 1994.
[3] j. r. ribeiro and f. j. lange, "a new aggregation method for determining composite load characteristics", ieee trans. power app. syst., vol. pas-101, pp. 2869-2875, aug. 1982.
[4] w. w. price, k. a. wirgau, a.
murdoch, j. mitsche, e. vaahedi and m. a. el-kady, "load modeling for power flow and transient stability computer studies", ieee trans. power syst., vol. 3, pp. 180-187, feb. 1988.
[5] j. milanović, "on unreliability of exponential load models", elect. power syst. res., vol. 49, pp. 1-9, feb. 1999.
[6] d. karlsson and d. hill, "modelling and identification of nonlinear dynamic loads in power systems", ieee trans. power syst., vol. 9, pp. 157-166, feb. 1994.
[7] l. korunović and d. stojanović, "load model parameters in low and medium voltage distribution networks", elektroprivreda, vol. 55, no. 2, pp. 46-56, 2002. (in serbian)
[8] w.-s. kao, c.-j. lin, c.-t. huang, y.-t. chen and c.-y. chiou, "comparison of simulated power system dynamics applying various load models with actual recorded data", ieee trans. power syst., vol. 9, pp. 248-254, feb. 1994.
[9] l. hajagos and b. danai, "laboratory measurement of modern loads subjected to large voltage changes for use in voltage stability studies", project 113-t-1040 final report, canadian electricity association, may 1996.
[10] l. m. korunović, d. p. stojanović and j. v. milanović, "identification of static load characteristics based on measurements in medium-voltage distribution network", iet gen. transm. distrib., vol. 2, pp. 227-234, mar. 2008.
[11] d. p. stojanović, l. m. korunović and j. v. milanović, "dynamic load modelling based on measurements in medium voltage distribution network", elect. power syst. res., vol. 78, pp. 228-238, feb. 2008.
[12] y. baghzouz and c. quist, "determination of static load models from ltc and capacitor switching tests", in proceedings of ieee power engineering society summer meeting, 2000, vol. 1, pp. 389–394.
[13] b.-k. choi, h.-d. chiang, y. li, h. li, y.-t. chen, d.-h. huang and m. g. lauby, "measurement-based dynamic load models: derivation, comparison, and validation", ieee trans. power syst., vol. 21, pp. 1276-1283, aug. 2006.
[14] http://alfatec.rs/proizvodi/merno-kontrolni-sistemi-mks/
[15] ieee task force on load representation for dynamic performance, "load representation for dynamic performance analysis (of power systems)," ieee trans. power syst., vol. 8, no. 2, pp. 472–482, may 1993.

facta universitatis series: electronics and energetics vol. 36, no 2, june 2023, pp. 299-314
https://doi.org/10.2298/fuee2302299m
© 2023 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper

analysis of portable system for sound acquisition of vehicles powered by internal combustion engines

marko milivojčević1, emilija kisić2, dejan ćirić3
1academy of technical and art applied studies, school of electrical and computer engineering, belgrade, serbia
2metropolitan university, faculty of information technology, belgrade, serbia
3university of niš, faculty of electronic engineering in niš, niš, serbia

abstract. in this paper a portable system for acquisition of sound generated by passenger vehicles powered by internal combustion engines is described and analyzed. the acquisition system is developed from scratch and tested in order to satisfy the requirements such as high-quality of audio recordings, high mobility, robustness and privacy respect. with this acquisition system and adequate signal processing, the main goal was to collect a large amount of clear audio recordings that will form a quality dataset. in further research, this dataset will be used for machine learning model training and testing, i.e. for developing a system for automatic recognition of the type of car engine based on fuel.

key words: acoustic based acquisition system, dataset, audio signals, internal combustion engines

1. introduction

applications of artificial intelligence algorithms to audio signals are becoming more numerous over time [1-4].
sound classification, audio event detection and audio scene recognition are examples of tasks that are successfully realized in practice by applying machine or deep learning [5, 6]. in this context, machine and deep learning could be used to identify the type of internal combustion engine with regard to the fuel, based on the sound generated by the engine. namely, the sound of these engines differs depending on the fuel used: petrol (gasoline) or diesel. the human ear can recognize this difference, that is, whether the sound comes from a petrol or a diesel engine. those facts and the need to classify passenger vehicles by fuel as a result of improved environmental standards [7] have served as major pillars of the present research. its main aim is to develop a system for automatic recognition of engine type based on the sound generated by the engine, that is, to build a machine/deep learning model that will be able to recognize the type of engine with high accuracy, where the input to the model will be the engine sound.

received november 09, 2022; revised january 18, 2023; accepted february 06, 2023
corresponding author: marko milivojčević, academy of technical and art applied studies, school of electrical and computer engineering, belgrade, serbia, e-mail: markom@viser.edu.rs

since the successful implementation of machine/deep learning requires an adequate dataset (containing, in this case, audio samples), a specialized acquisition system has been developed for this purpose. details of the development of such an acquisition system for the collection of audio samples of passenger vehicles powered by internal combustion engines are presented here. the first requirement that the acquisition system should satisfy is automation, because manual collection of a large number of samples would require a lot of time and might introduce certain differences in conditions during the acquisition.
then, the collected data should meet the requirements for quality, duration and invariability of environmental conditions in order to provide reliable information regarding the acoustic characteristics. the paper is divided into several sections. the technical characteristics of the system, the hardware configuration and selection of components, as well as the acquisition procedure and processing of the collected audio signals, are presented in the section related to methodology. the section describing the results provides a tabular presentation of the system efficiency for three cases of time interval between the start of detection of two consecutive vehicles, as well as the presentation of audio signals in the time and spectral domain as a measure of validity of the obtained images for further analysis with a machine or deep learning system. the paper ends with concluding remarks.

2. acquisition system and procedure description

in the earlier phases of the research, the influence of microphone position in the area below the engine compartment on the characteristics of audio recordings was analyzed in detail [8]. as a result, it was determined that the basic characteristics of the audio signal varied minimally regardless of where the measuring microphone was placed, as long as the microphone was directly below the engine compartment [8]. in that regard, depending on the vehicle, the target area where the microphone can be placed below the engine is approximately 1.2 m by 1.2 m. because of that, it is possible to collect relevant audio samples regardless of the exact position of the vehicle when it is stopped above the microphone. based on the previous findings, the acquisition system uses a microphone positioned in the area below the engine compartment, chosen as the most suitable area in terms of "purity" of sound [8, 9], and audio recording begins only after the presence of the vehicle is detected.
in this way, audio samples of engine operation in the idle mode are collected, without the microphone itself being positioned on the vehicle. the system has been developed to be mobile, so that it can be set up independently of the availability of power sources, and it is fully designed to run on battery power. additionally, the system is designed to be autonomous, i.e., not to require human presence during operation. as the system has limited memory space, it was necessary to develop several verification steps before the current audio sample is written to the memory. specifically, this system has four levels of verification before storing the audio recording, which resulted in a dataset of recordings that contains only sounds of interest, i.e., engine operation. when the system is applied in real conditions involving the presence of interfering noise sources and different engine load modes, despite a large number of successfully collected audio recordings, some recordings containing not only the desired engine mode but also other engine modes appeared in the formed dataset. so, it was necessary to develop a procedure that detects and then extracts the idling mode of the internal combustion engine. in order to make the system as autonomous as possible, the requirement for minimum energy consumption dictated the application of the simplest possible procedure for separating the desired engine mode. thus, the procedure for extracting the engine idle mode applied here is based on audio signal processing in the time domain, i.e., on the signal envelope. it is worth mentioning that the number of recordings containing only the engine idling mode is also affected by the minimum time period between the start of detection of two consecutive vehicles.

2.1.
acquisition system

the main goal of collecting audio samples of engine operation is to make a dataset containing sounds of passenger vehicles recorded in real conditions. in this way, the future classification system will be able to work properly in such conditions, such as those at entrances to underground garages, toll plazas, gas stations, etc. the generated dataset of audio samples should preferably have characteristics that enable its use in different machine and deep learning approaches [10]. they include support vector machine (svm) [11], k-nearest neighbors (k-nn) [12], deep forest [13] or various deep neural network architectures such as multilayer perceptron [14] or convolutional neural network [15]. for this purpose, the audio samples may be transformed either into a selected set of features or into images, such as spectrogram-like images, or they may be used in the existing format (raw audio signals). the entrance to an underground garage with a ramp was chosen as the most suitable space for collecting audio samples, since the vehicle has to stop there until the driver takes the card / token. during this period, the car is static and idling. even if it has a start / stop system, it will run in idle mode for a certain period of time. in addition, in such a situation, the movement of the vehicle is so directed that there is no possibility of mechanical damage to the microphone and sensor placed on the ground in the space between the wheels. the block diagram of the system is presented in fig. 1, and the realized system in a laboratory environment is shown in fig. 2.

fig. 1 block diagram of the sound acquisition system

fig. 2 realized acquisition system in laboratory environment

the system has been developed so that the presence of vehicles is detected with the ultrasonic sensors before the process of recording the engine operation sound begins (the first level of verification).
ultrasonic sensors are primarily selected because, unlike widely used cameras, they do not affect user privacy. also, these sensors are among the cheapest on the market, have very low power consumption, and are accurate enough to detect vehicles. this type of vehicle detection enables the installation of the system almost everywhere, because there is no possibility of interference with any existing induction sensors at the entrance ramp or of violating laws related to user privacy. in order to avoid detection of objects that are not vehicles of interest, two sensors are used. the sensors are positioned so that one measures the distance along the horizontal (x) axis and the other one along the vertical (y) axis. the plane formed by the ultrasound sensors is not perpendicular to the direction of vehicle movement, as shown in fig. 3. by using the sensors placed in the described way, the possibility of detecting two-wheelers and pedestrians that might also appear at the ramp is eliminated. namely, due to the position and orientation of the sensor placed on the ground, two-wheelers can only be detected if they pass directly above, i.e., over the sensor. however, even in such a case, they will not meet the distance requirement of the side sensor if they move in the intended direction of entering the garage. if the sensors were located in the same plane, then it would theoretically be possible for a motorcycle to be oriented perpendicularly to the intended direction of movement of the vehicle, i.e., above the ground sensor and facing the side sensor with the front or rear wheel. with the sensors positioned in two planes, a motorcycle would have to be in an almost impossible position to enter the garage, i.e., it would need to hit the ramp in order to satisfy the condition of vehicle presence on both sensors.
in a similar manner, a pedestrian who is above the ground sensor could move in tandem with another pedestrian who would satisfy the condition of the side sensor if both sensors were in the same plane. however, if the distances are measured in different planes, it is more difficult and less likely to meet the condition of vehicle presence on both sensors.

fig. 3 the acquisition system positioned at the entrance to the underground garage, where horizontal (x) and vertical (y) axis as well as horizontal and vertical plane, which is also the plane formed by the axes, are presented

readily available waterproof ultrasonic distance measurement modules containing an ultrasonic sensor jsn-sr04t, whose specification is given in [16], are used in the acquisition system. these modules, that is, sensors, are controlled by a microcontroller within the arduino nano platform [17], where distances are set for the specific measurement case. distance measurement is realized by the short-term emission of an ultrasonic signal triggered by the arduino, after which the arduino measures the time until the reflected signal appears. the distance to an obstacle is calculated from the measured time required for the signal to reach the obstacle and return, and from the speed of sound in air. since measurement of the distance to the vehicle does not require precision better than 1 cm, the best results were obtained with a trigger signal lasting 10 microseconds. thus, if both sensors detect an object (the horizontal sensor at a distance of less than 80 cm and the vertical sensor at a distance of less than 40 cm), the microcontroller registers a vehicle presence and sends this information using serial communication to the raspberry pi computer [18]. this computer represents the central part and heart of the acquisition system.
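the distance calculation and the two-sensor presence condition described above can be sketched as follows. this is a python illustration of the logic only, not the arduino firmware used in the paper: the function names are hypothetical and the speed of sound of 343 m/s (air at about 20 °c) is an assumption, while the 80 cm and 40 cm limits are taken from the text.

```python
SPEED_OF_SOUND_M_S = 343.0   # assumed value for air at about 20 degrees C
HORIZONTAL_LIMIT_CM = 80.0   # limit for the side (horizontal) sensor, from the text
VERTICAL_LIMIT_CM = 40.0     # limit for the ground (vertical) sensor, from the text


def echo_to_distance_cm(echo_time_s, speed_m_s=SPEED_OF_SOUND_M_S):
    """Convert round-trip echo time to one-way distance in centimetres."""
    # The pulse travels to the obstacle and back, hence the division by two.
    return speed_m_s * echo_time_s / 2.0 * 100.0


def vehicle_present(horizontal_echo_s, vertical_echo_s):
    """A vehicle is registered only if both sensors report an object within their limits."""
    return (
        echo_to_distance_cm(horizontal_echo_s) < HORIZONTAL_LIMIT_CM
        and echo_to_distance_cm(vertical_echo_s) < VERTICAL_LIMIT_CM
    )


# Hypothetical echo times in seconds.
print(echo_to_distance_cm(0.002))
print(vehicle_present(0.004, 0.002))
print(vehicle_present(0.005, 0.002))
```

requiring both conditions at once is what rejects a two-wheeler or pedestrian that satisfies only one of the sensors.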
the reason for using an additional microcontroller alongside the raspberry pi, which can also control and read ultrasonic sensors, is the need to detect vehicle presence continuously, i.e., in parallel with recording the audio. by having both the raspberry pi and the additional microcontroller (arduino nano), the two activities that are supposed to run in parallel, vehicle detection and audio recording, can be realized in an easier and more reliable way. each audio sample is recorded with an omnidirectional microphone placed on the ground in the area below the vehicle. in this way, in almost all cases, the microphone is positioned directly below the vehicle’s engine after the vehicle has stopped in front of the ramp. in order to obtain the highest quality audio recordings, the akg c562cm omnidirectional microphone is used, whose specifications are listed in [19] and presented in fig. 4.

fig. 4 characteristics of akg c562cm microphone: (a) frequency and (b) polar response [19]

the microphone and the ultrasonic sensor that measures the vertical distance are placed in a purpose-made cable protector (fig. 5a) made of industrial rubber with a hardness of 90 shore a. for the purpose of collecting audio recordings, the edges of this cable protector had to be processed at an appropriate angle so that the sound of a wheel crossing over it is negligible in the recordings. the processing angle was determined empirically and was approximately 150°. in addition to protecting the cables that connect the microphone and ultrasonic sensor to the rest of the system, the cable protector is designed to protect both the microphone and the sensor in case a vehicle wheel passes directly over them (fig. 5b).
when the cable protector is placed at the measuring position, it is not necessary to fasten it to the base, because it is not subject to slipping and moving due to the structure of the rubber and its width of 20 cm. the protector stayed at almost the same position during the acquisition regardless of how large and heavy the vehicles passing over it were.

fig. 5 cable and ultrasonic sensor protection: (a) purpose-made cable protector and (b) microphone/sensor protection

the hardware limitations of the raspberry pi computer in terms of maximum sampling frequency and number of bits for audio signal quantization, as well as the need for microphone phantom power, resulted in the insertion of an a/d converter between the microphone and the raspberry pi computer. for the purpose of a/d conversion and microphone power supply, a dedicated high-quality audio interface irig pre hd is employed, which is also a battery-powered device whose specifications are given in [20]. additionally, the use of an external a/d converter enables the raspberry pi to run at lower processing power and lower power consumption. on the raspberry pi computer, the developed python code is run after the power is turned on. within this code, the serial communication via usb port is listened to in order to receive the information from the arduino about the vehicle presence. when a vehicle is detected, a series of processes is carried out, as described in the next subsection.

2.2. acquisition procedure

after the vehicle presence is detected, and in order to save the battery, the raspberry pi starts the microphone listening mode via the a/d interface. only when the detected sound level is above the set threshold does the storage of the stream in the buffer begin (the second level of verification). the threshold level is determined empirically at 74 db.
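the level check of the second verification step can be sketched in python as follows. the paper does not specify how the sound level is computed, so this is only an assumed approach: the rms level of a block of normalised samples is converted to decibels and shifted by a calibration offset that maps digital level to the scale on which the 74 db threshold was set. the function names and the offset value of 100 db are hypothetical.

```python
import math

THRESHOLD_DB = 74.0  # empirically determined threshold from the text


def level_db(samples, calibration_offset_db):
    """RMS level of normalised samples (-1..1) in dB, shifted by a calibration offset."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0.0:
        return float("-inf")
    return 20.0 * math.log10(rms) + calibration_offset_db


def above_threshold(samples, calibration_offset_db, threshold_db=THRESHOLD_DB):
    """True when the block level is above the threshold, i.e. buffering may start."""
    return level_db(samples, calibration_offset_db) > threshold_db


# Hypothetical blocks of audio samples and an assumed calibration offset.
OFFSET_DB = 100.0
loud_block = [0.1, -0.1] * 100     # RMS 0.1, about 80 dB with this offset
quiet_block = [0.01, -0.01] * 100  # RMS 0.01, about 60 dB with this offset

print(above_threshold(loud_block, OFFSET_DB))
print(above_threshold(quiet_block, OFFSET_DB))
```

in a real setup the offset would have to be obtained by calibrating the microphone and audio interface against a known sound level.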
in this way, an accidental excitation of the sensors that can be caused by a passing pedestrian, dog or cat is avoided. the audio recording duration is initially set to 5 s, and after this time has elapsed, the stream stops. in order to avoid an accidental excitation potentially caused by a passing motorcycle, the stream is additionally checked after it has stopped. namely, at the location where the samples were taken, as in most underground garages, motorcycles are allowed to enter without any obstruction next to the ramp, so they are not stopped at the entrance. the check is simple: the signal level two seconds after the beginning of the stream is tested against the threshold set in the previous step (the third level of verification). if the threshold condition is met in that segment of the stream, the stream is stored as a wav file on the sd card. the entire procedure is shown in the flowchart given in fig. 6.
fig. 6 acquisition procedure flowchart
the fourth level of verification is a specially developed algorithm that extracts only the engine idle mode (stationary signal) from the existing wav file. the description of this procedure is given in the next subsection. the initial installation of the system at the entrance ramp of the underground garage showed that the system detected only vehicles and that the audio recordings contained only signals originating from internal combustion engines. however, the waiting time of vehicles above the microphone varied considerably from case to case. for this reason, three different approaches to audio signal recording (a, b and c) were applied, based on the activity of the ultrasound sensors. in the first one (a), after detecting the object (vehicle), the ultrasound sensors remained inactive for 5 s until the rest of the system finished recording the audio sample.
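the second and third levels of verification described above can be sketched in a few lines of python. this is only an illustration, not the authors' code: the function names, the block length and the calibration constant `ref` (which maps sample values to the db scale on which the 74 db threshold is defined) are assumptions made for the example.

```python
import math

THRESHOLD_DB = 74.0   # threshold level quoted in the text
CHECK_AT_S = 2.0      # position of the third-level check in the stream


def level_db(block, ref=1.0):
    """RMS level of a block of samples in dB relative to `ref`.

    `ref` is a hypothetical calibration constant, not a value from the paper.
    """
    if not block:
        return float("-inf")
    rms = math.sqrt(sum(x * x for x in block) / len(block))
    return 20.0 * math.log10(rms / ref) if rms > 0.0 else float("-inf")


def passes_third_check(stream, fs, block_s=0.1, threshold_db=THRESHOLD_DB, ref=1.0):
    """True if the block starting CHECK_AT_S into the stream exceeds the threshold."""
    start = int(CHECK_AT_S * fs)
    block = stream[start:start + int(block_s * fs)]
    return level_db(block, ref) > threshold_db


fs = 1000                                        # low rate, only for the demo
car = [10000.0] * (5 * fs)                       # stays loud for the whole 5 s
motorcycle = [10000.0] * fs + [1.0] * (4 * fs)   # loud for 1 s, then gone
print(passes_third_check(car, fs), passes_third_check(motorcycle, fs))  # True False
```

the synthetic "motorcycle" stream is rejected because its level has already dropped two seconds after the start, which is the behaviour the third verification level relies on.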
in approach b, the fixed time of 5 s of sensor inactivity after detection was replaced by a time of 8 s. in the third approach (c), the sensors were constantly active in order to detect when the vehicle left the space above the microphone, so that no command to start the next recording was sent to the rest of the system while the same vehicle was still present. if the sensors detected the absence of a vehicle in two successive iterations separated in time by 50 microseconds, the system interpreted this as the vehicle having left the position. this is important because, when a vehicle waits for a long time, one of the sensors occasionally measures a greater distance to the vehicle, caused by higher-order reflections of the ultrasonic waves. such a measurement is interpreted as non-compliance with the presence condition. the phenomenon is attributed to the dispersion of ultrasonic waves that can occur due to the shape of the vehicle’s body. during the system testing, this phenomenon proved to be rare. as for the negative effects of constant exposure to ultrasonic waves, the ultrasonic sensors used are of very low power and designed to measure distances of up to 4.5 m, which means that the signal level can be negligible at longer distances due to dispersion. considering the configuration of the entrances to underground garages, the width of the passage for vehicles must be at least 3 m. thus, if a pedestrian passage exists, it can only be found at a distance greater than 3 m from the sensor. besides, within the few hours of the acquisition, fewer than 10 people were seen in the pedestrian passage, all of them further than 5 m from the sensors.
2.3.
extraction of idling mode of operation
in order to extract the stationary part of each recorded audio signal, which corresponds to the engine idle, by the time-domain signal processing applied here, it is necessary to determine the threshold (time instant) after which the non-stationary part of the signal should be rejected. due to the nature of the problem, the stationary part always appears at the beginning of the signal, see figs. 7, 8 and 9 given in the next section. there are no cases where the idling occurs later (in the middle or at the end of the signal). the threshold therefore needs to be found at a certain time point after the signal starts, i.e., at the first moment when the signal becomes non-stationary. the analysis of the waveforms of the recorded signals in the time domain shows that at the moment when the signal ceases to be stationary, its amplitude abruptly increases. thus, at that moment, there is a noticeable increase (jump) in the signal envelope. the idea for extracting the stationary part of a recorded signal is to generate the signal envelope and calculate the difference between the current and previous envelope values along the envelope. while the signal is stationary, this difference is expected to be small. on the other hand, at the moment when the signal ceases to be stationary, the difference must be significantly greater than at the time instants before that moment. the first time instant, going from the beginning to the end of the signal, at which the difference between the current and the previous envelope value increases significantly is a candidate for setting the threshold. this significant increase needs to be quantified.
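the idea described above (frame-wise envelope, difference of neighbouring envelope values, first significant jump) can be sketched as follows. this is a minimal illustration under stated assumptions, not the authors' implementation: the default frame size, overlap and jump constant are the values the paper reports for its recordings, while the function names and the normalisation of the envelope to a peak of 1 are assumptions of this sketch.

```python
def frame_envelope(signal, frame_size=4000, overlap=1000):
    """Per-frame maximum of |signal|, normalised to a peak of 1 (assumed)."""
    hop = frame_size - overlap
    peak = max(abs(x) for x in signal)
    if peak == 0:
        peak = 1.0
    last_start = max(len(signal) - frame_size, 0)
    return [max(abs(x) for x in signal[s:s + frame_size]) / peak
            for s in range(0, last_start + 1, hop)]


def first_jump_frame(env, a=0.1):
    """Index of the first frame whose envelope exceeds the previous one by more than a."""
    for t in range(1, len(env)):
        if env[t] - env[t - 1] > a:
            return t
    return None  # no jump found: the whole recording is treated as idle


def idle_cutoff_seconds(signal, fs, frame_size=4000, overlap=1000, a=0.1):
    """Cutoff time: jump frame index scaled by signal duration / number of frames."""
    env = frame_envelope(signal, frame_size, overlap)
    t = first_jump_frame(env, a)
    duration = len(signal) / fs
    return duration if t is None else t * duration / len(env)


fs = 44100
# synthetic recording: 2 s of quiet "idle" followed by 3 s of louder "load"
recording = [0.2] * (2 * fs) + [1.0] * (3 * fs)
print(round(idle_cutoff_seconds(recording, fs), 2))
```

for the synthetic recording the cutoff lands within one frame of the true transition at 2 s, which illustrates that the time resolution of the threshold is set by the frame size.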
if the signal envelope is denoted as env(t) and the threshold representing the upper time limit of the stationary signal part as $t_l$, the threshold itself can be determined as:

$$t_l = \min\{\, t : env(t) - env(t-1) > a \,\} \cdot \frac{t_s}{n_f} \tag{1}$$

where $t_s$ denotes the duration of the signal, and $n_f$ is the number of frames in which the signal maxima are calculated in the procedure of generating the signal envelope. $a$ is a constant with the empirically determined value of 0.1. since the threshold has to be set at the first time instant after the envelope jump, looking from the signal beginning to its end, the smallest value that satisfies the condition in (1) is taken as the threshold $t_l$. more precisely, since the time variable t is given in frames used for generating the signal envelope, the condition min{env(t)-env(t-1)>a} returns the envelope frame in which there is an envelope jump indicating a transition from the stationary to the non-stationary part of the signal. in order to obtain the exact time instant for setting the threshold, the obtained envelope frame value has to be scaled by $t_s/n_f$. in our case, the frame size for generating the envelope is 4000 samples with a frame overlap of 1000 samples. this means that the resolution for setting the threshold $t_l$ is determined by the frame size, which can be chosen in accordance with the nature of the analyzed signal.
3. analysis of recorded audio signals
positioning the acquisition system at the entrance of an underground garage with a ramp, where a vehicle has to stop in order to take a token, gave results that exceeded expectations in terms of the quality and number of audio recordings. these recordings have the following parameters: sampling rate of 44.1 khz, bit depth of 16 bits and fixed duration of 5 s, resulting in a file size of approximately 431 kb, which makes it possible to store approximately 67800 audio samples, assuming an effective storage space of 28 gb on the 32 gb sd card.
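the file size and storage figures quoted above can be checked with a short calculation. the sketch assumes single-channel (mono) recordings, ignores the wav header and file-system overhead, and interprets kb and gb as binary units; none of these details is stated in the paper.

```python
def wav_payload_bytes(fs_hz, bit_depth, duration_s, channels=1):
    """Size of the raw PCM payload in bytes (WAV header ignored)."""
    return fs_hz * (bit_depth // 8) * channels * duration_s


payload = wav_payload_bytes(44_100, 16, 5)   # 441000 bytes per 5 s recording
size_kib = payload / 1024                    # about 430.7, i.e. roughly 431 kb
capacity = (28 * 1024**3) // payload         # recordings that fit into 28 gb
print(payload, round(size_kib, 1), capacity)
```

under these assumptions the capacity comes out at roughly 68000 recordings, which is of the same order as the figure of approximately 67800 given in the text.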
the power supply, a power bank with a capacity of 10 ah, lost about 20% of its capacity during 2 hours of recording, which shows that the system can operate on this power supply for about 10 hours in a completely autonomous way. in parallel with the autonomous operation of the system, manual records of the engine type by fuel were made, meaning that the samples were labeled manually. this was done to identify possible errors, e.g. an audio recording that would be unusable due to excessive environmental noise, which indoors typically comes from the garage ventilation. this case did not happen in practice, as a result of a correctly set threshold that determines the beginning of the recording. considering the three approaches mentioned above (a, b and c), the most important finding after analyzing the recordings is that no vehicle passed by the acquisition system without triggering the system to record the sound of its engine operation. also, events other than passenger vehicles passing by did not falsely trigger the system, and no completely blank recording was obtained. table 1 provides a comparative overview of these three approaches in terms of the number of samples collected as well as the usability of the samples. it is worth mentioning that during the collection of audio samples, a very small percentage of vehicles belonged to the older generation of vehicles. the majority of diesel vehicles belonged to the generation with common rail injection, while the majority of gasoline vehicles had multipoint indirect injection. table 1 gives the total number of audio recordings and the number of useful audio recordings. here, the latter contain the engine idling mode sounds, while the rest of the recordings still contain vehicle engine sounds, but not the idling mode of operation; instead, they contain the sound of a vehicle leaving the ramp.
the large majority of recordings are useful: their share of the total number of recordings is above 90% and is the highest, close to 97%, for the ultrasonic detection approach c. comparing the three ultrasonic detection approaches in table 1, it can be noticed that approach a, with a fixed interval of sensor inactivity of 5 s, gave the most audio samples, with useful recordings amounting to as much as 202% of the number of vehicles. this approach is primarily suitable for generating the largest possible dataset, but it is not suitable from the point of view of efficient usage of the storage resources. if the system employs this approach for detection and recognition of the engine type in real conditions, there will be cases where the same vehicle is detected more than once. strictly speaking, this increased number of recorded audio signals for some vehicles could have certain detrimental effects on machine/deep learning due to the overrepresentation of these vehicles in comparison to others. although the number of recordings for the majority of vehicles is at most two, these effects will be analyzed in the next phases of the research. besides, if necessary, the redundant recordings of the same vehicle could easily be removed from the dataset according to the time of recording.
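removing redundant recordings by time of recording could look like the sketch below. it is only an illustration: the grouping rule (recordings that start within `max_gap_s` seconds of the previous one are attributed to the same vehicle), the value of `max_gap_s` and the file names are assumptions, not part of the described system.

```python
from datetime import datetime, timedelta


def keep_first_per_vehicle(recordings, max_gap_s=10.0):
    """Keep only the first recording of each burst of closely spaced recordings.

    recordings: iterable of (start_time, file_name) tuples.
    max_gap_s:  hypothetical maximum gap between recordings of one vehicle.
    """
    kept = []
    previous_time = None
    for stamp, name in sorted(recordings):
        if previous_time is None or (stamp - previous_time).total_seconds() > max_gap_s:
            kept.append((stamp, name))
        previous_time = stamp
    return kept


t0 = datetime(2022, 1, 1, 10, 0, 0)
recordings = [
    (t0, "v1_a.wav"),
    (t0 + timedelta(seconds=5.5), "v1_b.wav"),   # same vehicle, second recording
    (t0 + timedelta(seconds=60), "v2_a.wav"),
    (t0 + timedelta(seconds=66), "v2_b.wav"),
    (t0 + timedelta(seconds=72), "v2_c.wav"),
]
print([name for _, name in keep_first_per_vehicle(recordings)])  # ['v1_a.wav', 'v2_a.wav']
```

a purely time-based rule like this would also merge two different vehicles that arrive within `max_gap_s` of each other, so the gap would have to be chosen with the detection approach (a, b or c) in mind.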
table 1 comparative overview of three different detection approaches (a, b and c) in terms of the number of samples collected as well as the usability of the samples

| | a (sensors inactive for 5 s after vehicle detection) | b (sensors inactive for 8 s after vehicle detection) | c (continuous detection of vehicles by sensors) |
|---|---|---|---|
| number of vehicles that passed through the acquisition system | 50 | 100 | 100 |
| number of detected vehicles | 50 | 100 | 100 |
| total number of audio recordings | 111 | 143 | 122 |
| number of useful audio recordings | 101 | 133 | 118 |
| number of idle mode records only (without any additional processing) | 69 | 97 | 109 |
| percentage of vehicles detected | 100% | 100% | 100% |
| percentage of useful recordings in relation to the total number of recordings | 90.99% | 93% | 96.72% |
| percentage of useful recordings in relation to the number of sampled vehicles | 202% | 133% | 118% |
| percentage of recordings of idle mode only without additional processing in relation to the number of sampled vehicles | 138% | 97% | 109% |
| percentage of recordings not requiring the fourth level of verification in relation to the total number of recordings | 62.16% | 67.83% | 89.34% |

approach b (time interval of sensor inactivity of 8 s) also gave good results in terms of the number of vehicles detected and the amount of audio recordings. however, it has the lowest percentage of recordings of idling mode only without additional processing compared to the number of sampled vehicles. this approach uses memory resources more efficiently than the first approach. the most complex approach (c), continuous detection with the recognition of the next vehicle, gave the fewest audio recordings in relation to the number of detected vehicles. on the other hand, this approach is the most efficient in terms of memory utilization, achieving a high percentage of clean recordings.
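the percentages in table 1 follow directly from the counts in its first rows. the short check below recomputes them; the dictionary layout and key names are illustrative choices, not taken from the paper.

```python
# counts copied from table 1
counts = {
    "a": {"vehicles": 50, "total": 111, "useful": 101, "idle_only": 69},
    "b": {"vehicles": 100, "total": 143, "useful": 133, "idle_only": 97},
    "c": {"vehicles": 100, "total": 122, "useful": 118, "idle_only": 109},
}


def pct(part, whole):
    """Percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


def summarise(c):
    return {
        "useful_vs_total": pct(c["useful"], c["total"]),
        "useful_vs_vehicles": pct(c["useful"], c["vehicles"]),
        "idle_vs_vehicles": pct(c["idle_only"], c["vehicles"]),
        "idle_vs_total": pct(c["idle_only"], c["total"]),
    }


summary = {name: summarise(c) for name, c in counts.items()}
total_useful = sum(c["useful"] for c in counts.values())
for name, row in summary.items():
    print(name, row)
print("useful recordings in total:", total_useful)
```

the total of useful recordings over the three approaches is 352 for 250 vehicles.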
in this way, the lowest redundancy among samples and the highest percentage of useful recordings in relation to the total number of recordings were obtained. the latter led to the least need for additional processing (saving cpu resources) and for additional power from the power supply. in all three approaches from a to c, one or two audio recordings per vehicle were obtained for the majority of vehicles. here, the first recording represented the stationary engine idling mode without exception, fig. 7. in most of the samples, the second recording (in some cases the last one) partially contained the engine idling mode followed by an increase in the crankshaft speed and a partial engine load mode in order to accelerate the vehicle, as shown in figs. 8 and 9. there were no cases where the partial load mode of the engine appeared before the idling mode in the recordings. in these three figures (figs. 7, 8 and 9), the audio signals of vehicles of approximately the same generation are presented. here, the signals’ amplitudes are normalized; hence the focus is on differences in the signal level caused by the change in operating mode.
fig. 7 audio signal of (a) petrol and (b) diesel engine at idle, without changing the mode
fig. 8 audio signal of (a) petrol and (b) diesel engine having early operation mode change from idle to load mode (during the recording interval)
fig. 9 audio signal of (a) petrol and (b) diesel engine having late operation mode change from idle to load mode (during the recording interval)
the calculation of the threshold (i.e., the time instant of the audio signal until which the engine is in the idling mode) used for the extraction of the idling mode of operation is illustrated in figs. 10 and 11, where the threshold is marked with a purple vertical line.
the terms “early” and “late” refer to the cases where the operation mode change happens earlier (up to 1 s) and later (after 1 s) in the recorded signal, respectively. in the recorded signals where there is no change in the engine operation mode, the threshold (cutoff time) cannot be determined in the described way. in such a case, the entire audio track is selected as engine idle and is used for further analysis and processing.
fig. 10 waveform and envelope of the audio signal of (a) petrol and (b) diesel engine with an early change of operation mode (the threshold is marked by a vertical line)
fig. 11 waveform and envelope of the audio signal of (a) petrol and (b) diesel engine with a late change of operation mode (the threshold is marked by a vertical line)
the waveforms of the characteristic audio signals extracted in the described way are presented in figs. 12 and 13. for the presented case of the diesel vehicle in which an early operation mode change occurred (almost at the very beginning of the recording), the calculated threshold (cutoff) time was also very close to the beginning of the signal (fig. 10b), which means that this recording is rejected by the function that checks the duration of the stationary mode. this duration can be set according to the requirement on the minimal length of the signals. depending on the particular need, the signal length may be either shorter or longer. in the present case, the duration of the stationary mode is set to 0.5 s, meaning that the minimal signal length is 0.5 s.
fig. 12 audio signal of (a) petrol and (b) diesel engine at idle extracted from the recordings with a late change of operation mode
fig.
13 audio signal of (a) petrol and (b) diesel engine at idle extracted from the recordings with an early change of operation mode
as the mapping of audio signals into an adequate image format [21, 22], such as spectrogram-like images, is increasingly used in modern signal processing and deep learning, the obtained audio signals are also presented in the form of spectrograms, see figs. 14, 15 and 16. some properties are present in the spectrograms of both engine types (petrol and diesel), such as stronger components at low and mid frequencies than at high frequencies, as well as rather steady-state behavior along the time axis. however, these images also contain certain differences between the sounds of petrol and diesel engines, such as a more uniform energy distribution along the frequency axis for the petrol engine and more prominent particular frequency components for the diesel engine. a more detailed analysis of the recorded audio signals and their representations in different domains, as well as of the correlation between the signals and vehicle types by fuel, will be done in the very next phase of the research.
fig. 14 spectrogram of (a) petrol and (b) diesel engine audio signal at idle, without changing the mode and without applying the idling mode extraction
fig. 15 spectrogram of (a) petrol and (b) diesel engine audio signal at idle with a late change of operation mode
fig. 16 spectrogram of (a) petrol and (b) diesel engine audio signal at idle with an early change of operation mode
4. conclusions
considering the number of recordings containing exclusively the idling mode of the vehicles in reference to the number of sampled vehicles, it can be seen that the developed acquisition system has collected at least one such recording for each vehicle. also, the system has not recorded a single blank audio file, and it is rather robust to false triggering.
in addition, the selected amount of memory proved to be sufficient, and the most critical part of the system, the battery power, gave very satisfactory results in terms of system autonomy. since 250 vehicles in total passed over the microphone and sensors placed on the ground without any consequences for functionality, the condition of robustness has been satisfied, and the ability of unattended use has also been proven. the developed additional processing of recorded signals, which extracts exclusively the engine idle mode from the entire audio recording, has made it possible to create a dataset of audio samples containing only this target mode of operation. the acquisition system has proven to be efficient for recording the sound of a passenger vehicle at idle regardless of the type of fuel. the number of audio recordings can also be affected by the approach applied for detecting the presence of a vehicle using ultrasound sensors. this results in a larger or smaller number of recordings with higher or lower redundancy between the recordings, respectively. by using the developed acquisition system, a dataset has been created consisting of 352 audio recordings for 250 vehicles containing the sound of engines in the idling mode of operation. this acquisition system can find application in different use cases, including control of car entrance in restricted areas of smart cities, prevention of misfueling at gas stations, and optimization of road usage or noise prevention based on engine fuel type. in such cases, this proof-of-concept system could be implemented as an embedded system on a dedicated single platform. depending on the particular application and its requirements, the acquisition system might be modified to become even less demanding.
thus, taking into account the relatively high sound pressure levels at the microphone (above 74 db) and the proximity of the source, the condenser akg c562cm microphone might be replaced by an electro-dynamic microphone that does not require phantom power. since the dynamic range of the acquired signals is not expected to be that large, the bit depth might be smaller than the 16 bits used here. in addition, after developing an adequate classifier and considering the useful frequency range, it would be worthwhile to explore the option of reducing the sampling frequency. the generated dataset of audio samples will play an important role in future work on developing a system for automatic recognition of the type of engine based on the fuel used. this system will be designed by applying an adequate deep or machine learning approach for classification and employing the created dataset for model training and testing. based on the samples from the generated dataset, it can be concluded that the spectrograms of petrol and diesel engines at idle seem to be different, forming a strong basis for achieving high accuracy in engine type classification.
acknowledgment: this work has been supported by the ministry of education, science and technological development of the republic of serbia, contract no. 451-03-68/2022-14/200102.
references
[1] s. das, a. dey, a. pal and n. roy, "applications of artificial intelligence in machine learning: review and prospect", int. j. comput. appl., vol. 115, no. 9, pp. 31-41, april 2015.
[2] p. dhanalakshmi, s. palanivel and v. ramalingam, "classification of audio signals using svm and rbfnn", expert syst. appl., vol. 36, no. 3, part 2, pp. 6069-6075, 2009.
[3] p. dhanalakshmi, s. palanivel and v. ramalingam, "classification of audio signals using aann and gmm", appl. soft. comput., vol. 11, no. 1, pp. 716-723, 2011.
[4] h. ponce, p. ponce and a.
molina, "adaptive noise filtering based on artificial hydrocarbon networks: an application to audio signals", expert syst. appl., vol. 41, no. 14, pp. 6512-6523, 2014.
[5] z. liu, j. huang, y. wang and t. chen, "audio feature extraction and analysis for scene classification", in proceedings of first signal processing society workshop on multimedia signal processing, princeton, nj, usa, 23-25 june 1997, pp. 343-348.
[6] t. birtchnell, "listening without ears: artificial intelligence in audio mastering", big data & society, vol. 5, no. 2, july 2018.
[7] g. p. chossière, r. malina, f. allroggen, s. d. eastham, r. l. speth and s. r. h. barrett, "country- and manufacturer-level attribution of air quality impacts due to excess nox emissions from diesel passenger vehicles in europe", atmospheric environ., vol. 189, pp. 89-97, sept. 2018.
[8] m. milivojčević, f. pantelić, d. ćirić, "pozicioniranje mikrofona prilikom snimanja audio karakteristika motora putničkih vozila" (microphone positioning when recording audio characteristics of passenger car engines), in proceedings of 63rd national conference on electrical, electronic and computing engineering etran, srebrno jezero, serbia: 3-6 june 2019, pp. 58-62 (in serbian).
[9] m. milivojčević, f. pantelić and d. ćirić, "comparison of frequency characteristics of sound generated by internal combustion engines depending on fuel", in proceedings of 26th noise and vibration, niš, serbia: 6-7 december 2018, pp. 115-120.
[10] n. evans, automated vehicle detection and classification using acoustic and seismic signals. ph.d. thesis, university of york, 2010.
[11] h. frederick, a. winda and m. iwan solihin, "automatic petrol and diesel engine sound identification based on machine learning approaches", in proceedings of the international conference on automotive, manufacturing, and mechanical engineering. bali, indonesia: 26-28 september 2018, published at e3s web of conferences, vol. 130, article no.
01011.
[12] a. d. mayvana, s. a. beheshtib and m. h. masoom, "classification of vehicles based on audio signals using quadratic discriminant analysis and high energy feature vectors", int. j. soft comput., vol. 6, no. 1, pp. 5364, feb. 2015.
[13] a. wieczorkowska, e. kubera, t. słowik and k. skrzypiec, "spectral features for audio based vehicle and engine classification", j. intell. inf. sys., vol. 50, pp. 265-290, 2018.
[14] e. alexandre, l. cuadra, s. salcedo-sanz, a. pastor-sánchez and c. casanova-mateo, "hybridizing extreme learning machines and genetic algorithms to select acoustic features in vehicle classification applications", neurocomput., vol. 152, pp. 58-68, march 2015.
[15] s. d. badiger and m. uttarakumari, "vehicle classification using machine learning algorithms based on the vehicular acoustic signature", sci. tech. dev., vol. 8, no. 11, pp. 369-374, nov. 2019.
[16] ultrasonic waterproof range finder datasheet. available at: https://www.jahankitshop.com/getattach.aspx?id=4635&type=product.
[17] a. pajankar, kickstart to arduino nano. susteren, the netherlands: elektor international media, 2022.
[18] b. r. kent, science and computing with raspberry pi. san rafael, usa: morgan & claypool publishers, 2018.
[19] c562 cm specifications. available at: https://www.akg.com/microphones/boundary%20layer%20microphones/c562cm.html.
[20] digital high definition microphone interface specifications. available at: https://www.ikmultimedia.com/products/irigprehd/.
[21] s. amiriparian, m. gerczuk, s. ottl, n. cummins, m. freitag, s. pugachevskiy, a. baird and b. schuller, "snore sound classification using image-based deep spectrum features", in proceedings of interspeech 2017, stockholm, sweden, august 20–24, 2017, pp. 3512-3516.
[22] d. ćirić, z. perić, j. nikolić, n.
vučić, "audio signal mapping into spectrogram-based images for deep learning applications", in proceedings of 20th international symposium infoteh-jahorina (infoteh), east sarajevo, bosnia and herzegovina: march 17-19, 2021, pp. 1-6.
facta universitatis
series: electronics and energetics vol. 32, no 2, june 2019, pp. 303-314
https://doi.org/10.2298/fuee1902303p
the influence of busbars connection on fuselink temperature at fast fuses
adrian plesca
gheorghe asachi technical university of iasi, faculty of electrical engineering, romania
abstract. using a three-dimensional thermal modelling and simulation software package based on the finite element method, the paper presents a comparison between the thermal behaviour of a fast fuse without busbar terminals and that of the same fuse with busbars mounted on it. the maximum fuselink temperature is lower in the second case, in which the thermal model takes into consideration the busbar connections, i.e. the real situation. also, a thermal analysis for different types of load variation has been done for both fast fuse geometries, with and without busbar terminals.
key words: busbars, fast fuses, fuselinks, heating, thermal modelling and simulations
1. introduction
at first sight, the manufacture and the working principle of the electric fuse do not seem hard to understand; in fact, the operation of this device is very complex [1, 2]. choosing the right fuses for safety purposes nevertheless involves certain criteria and test methodologies in order to prevent fuse failure [3]. the operation of electric fuses has been widely studied from theoretical, experimental and modelling points of view [4, 5]. fuses for power semiconductors are used in different application areas, from the distribution network to special applications such as automotive [6], adjustable speed drives [7], photovoltaic systems [8], power substations [9] or microgrids [10].
different models of fuses have been developed, usually based on a mathematical representation of the arc physics. these models include transient heating and fusion of notched strip elements in sand, arc ignition and subsequent burn-back, radial expansion of the arc channels due to fusion of the sand, merging of adjacent arcs, and many other second-order effects [11-13]. different parameters such as thermal distribution, thermal flux and electrical potential in all fuse parts are obtained [13]. fast fuses for power semiconductor protection have been continually developed, in general on the basis of experimental methods. because the processes which govern the operation of fuselinks are many and complex, their analysis is very complicated and several simplifying assumptions are required. the most complicated situation is that of prearcing times longer than those which correspond to an adiabatic process, because the current densities in the fuselinks are not constant over their cross-section or along their lengths due to the presence of the restrictions [14, 15]. in addition, resistivity increases as the fuselink temperature rises, and the effects of various component parts, like the fine-grain filler, outer body, end caps, connecting cables or busbars, must be considered in the temperature distribution analysis [16]. fuses for power semiconductors have been investigated in [17-20]. in [21], inverter design requirements for safe fuse blowing are investigated. the effects of differing fuse characteristics and the influence on semiconductors during fuse interruption are evaluated in theory and with a practical setup. design hints for fuse locations and selection have been presented.
received december 14, 2018; received in revised form february 20, 2019 corresponding author: zoran stanković gheorghe asachi technical university of iasi, faculty of electrical engineering, romania (e-mail: aplesca@ee.tuiasi.ro)
along with the main grid protection, hybrid ac/dc microgrids also require special attention to their protection [22]. energy storage systems are studied from the point of view of their protection in order to reduce the incident energy of the arc flash. the use of fast-acting fuses is effective, reducing the incident energy to low values [23]. this paper aims to study the influence of the busbar connections of the fast fuse from the temperature distribution point of view, using a specific software package based on the finite element method, in order to model and simulate their complex thermal behaviour in steady-state conditions.
2. thermal analysis of fast fuses
in the case of variable loads or transient conditions, it is possible to set up an equivalent thermal circuit in which every section of the fast fuse assembly is represented by its thermal resistance and thermal capacity, taking into account that a part of the heat released in the fuselink is accumulated in every section. the most important elements to be taken into consideration in the equivalent thermal circuit are the fuselink, the quartz sand, the ceramic body of the fuse and the busbar connections. fig. 1 presents the equivalent thermal circuit, which describes the fast fuse assembly in a simplified way by lumped (concentrated) parameters.
fig. 1 equivalent thermal circuit of the fast fuse assembly
the notations in the figure have the following meaning: $R_f$ – thermal resistance of the fuselink, $R_n$ – thermal resistance of the quartz sand, $R_c$ – thermal resistance of the ceramic body of the fuse and $R_b$ – thermal resistance of the busbar connection. similarly, $C_f$ is the thermal capacity of the fuselink, $C_n$ – thermal capacity of the quartz sand, $C_c$ – thermal capacity of the fuse ceramic body and $C_b$ – thermal capacity of the busbar. the product of a thermal resistance and the corresponding thermal capacity gives the thermal time constant, i.e.
$R_f C_f$ – thermal time constant of the fuselink. this is an important parameter, since it determines the time after which the fuselink reaches the steady-state temperature (after approximately 5 thermal time constants). the values of the thermal resistances and capacities can be roughly calculated from the dimensions and the characteristics of the materials of which the assembly is made. the correspondence between the thermal parameters and the physical components of the fast fuse is shown in fig. 2.
fig. 2 main components of the fast fuse (1 – fuselink; 2 – quartz sand; 3 – ceramic body; 4 – terminal for busbar connection)
for the thermal computation in the transient regime, the response of the fast fuse to an applied power step can be defined, fig. 3. it is called the transient thermal impedance $Z_{th}(t)$ and is usually determined experimentally:

$$Z_{th}(t) = \frac{\theta(t)}{P} \tag{1}$$

where: $Z_{th}(t)$ is the transient thermal impedance; $\theta(t)$ – fuselink temperature rise; $P$ – step pulse power.
fig. 3 transient thermal impedance
$Z_{th}(t)$ can be regarded as a measure of how heat is evacuated from and stored in the fuselink at the time t. under constant heating and cooling conditions, the temperature of a homogeneous body of small size varies exponentially towards a steady-state value, determined by the power released as heat in the body and by the body-to-ambient thermal resistance. thus, $Z_{th}(t)$ is conveniently approximated by a sum of exponentials:

$$Z_{th}(t) = \sum_{j=1}^{k} R_j \left(1 - e^{-t/T_j}\right) \tag{2}$$

the number k of exponential terms, the time constants $T_j = R_j C_j$ and the thermal resistances $R_j$ do not correspond to the number of physical elements or to the values calculated from the characteristics and dimensions of the materials of which the fast fuse is made. instead, they are generally determined directly from the curve of $Z_{th}(t)$.
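the sum-of-exponentials approximation of the transient thermal impedance is straightforward to evaluate numerically. the sketch below is only an illustration: the three resistance and time-constant values are hypothetical, chosen for the example, and are not data for any real fast fuse.

```python
import math


def z_th(t, r, tc):
    """Transient thermal impedance as a sum of exponential terms.

    r:  thermal resistances R_j (K/W), hypothetical values in the demo
    tc: thermal time constants T_j (s), hypothetical values in the demo
    """
    if t <= 0.0:
        return 0.0
    return sum(rj * (1.0 - math.exp(-t / tj)) for rj, tj in zip(r, tc))


r = [0.5, 1.0, 2.0]     # K/W, illustrative only
tc = [0.1, 1.0, 10.0]   # s, illustrative only
step_power = 20.0       # W, illustrative only

for t in (0.1, 1.0, 10.0, 50.0):
    print(t, round(step_power * z_th(t, r, tc), 2))   # temperature rise in K
```

after five times the largest time constant (50 s here) the impedance is within about 1% of its final value, the sum of the resistances, in line with the rule of thumb of approximately 5 time constants mentioned above.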
the thermal response of a single element can be extended to a complex system, such as a fast fuse with or without busbar connections, whose thermal equivalent circuit comprises a ladder network of separate resistance and capacitance terms. transient thermal impedance data, derived on the basis of a step input of power, can be used to calculate the thermal response of fast fuses for a variety of one-shot and repetitive pulse inputs. further on, the thermal response for commonly encountered situations, fig. 4, has been computed; it is of great value to the circuit designer who must specify a fast fuse and its characteristics [24].

fig. 4 types of thermal load (a – rectangular pulse; b – rectangular pulse series; c – partial sinusoidal pulse)

the time variation of the rectangular pulse input power is shown in fig. 4a and its expression is given by equation (3), where $P_M$ is the pulse amplitude and $\tau$ the pulse width:

$$P(t) = \begin{cases} P_M, & 0 \le t \le \tau \\ 0, & t > \tau \end{cases} \qquad (3)$$

the fuselink temperature rise is given by

$$\Delta\theta(t) = \begin{cases} P_M \sum_{i=1}^{k} R_i \left(1 - e^{-t/T_i}\right), & 0 \le t \le \tau \\ P_M \sum_{i=1}^{k} R_i \left(e^{\tau/T_i} - 1\right) e^{-t/T_i}, & t > \tau \end{cases} \qquad (4)$$

fig. 4b shows the rectangular pulse series, with period $T$, and equation (5) describes this kind of input power:

$$P(t) = \begin{cases} P_M, & nT \le t \le nT + \tau \\ 0, & nT + \tau < t \le (n+1)T \end{cases} \qquad (5)$$

the thermal response is given by equation (6). for a very large number of rectangular pulses, i.e. $n \to \infty$, relation (7) is obtained.
      1 1 ( 1) 1 1 1 1 1 , 1 ( ) 1 1 1 1 n tt nt nt t t t t t k m i t i t n n tt nt t k t t m i t i t i i i i i i i i i e e e e p r if nt t nt e t e p r e e if nt t n t e                                                                       (6) 1 1 1 1 , 1 ( ) 1 ( 1) 1 t t t k t m i t i t t t k t m i t i t i i i i i i e p r e if nt t nt e t e p r e if nt t n t e                                         (7) a partial sinusoidal pulse series waveform is shown in fig. 4c. the equation which describes this type of waveform is given by (8). sin( ) , ( ) 0 ( 1) m p t if nt t nt p t if nt t n t               (8) in order to establish the fuselink temperature when n  , it will use the relation (9), where the notations for , z and  are described in the expressions (10). 308 a. plesca 1 2 1 2 sin( ) sin( ) sin( ) 1 1 ( ) ,( ) sin( ) sin( ) 1 1 ( ) ( 1) t t t k t m i i i t i t i t t k t m i i i t i t i i i i i i i e p z t r e e t if nt t ntt e p r e e t if nt t n                                                                                   t                 (9) 2 12 2 2 21 1 1 sin 2 1 2; ( cos ) sin 2 ; 2 cos k i ik k ii i i i i k i i i i i i r r ctg z r tg t r                        (10) from the previous obtained equations to be used for the computation of the temperature rise of the fuselink of the fast fuse in the case of different type of loads, it is easily to be observed that on the one hand the calculations using the analytical formula are not so facile, and on the other hand, it needs to know the thermal resistances ri and the thermal constants ti from the exponential decomposition of the transient thermal impedance zth. 
as mentioned before, these can be obtained through experimental tests for a given fast fuse. therefore, in order to obtain more quickly the fuselink temperature rise for different types of thermal load, as well as the temperature variation in any other component of the fast fuse assembly, the solution is to use numerical methods such as the finite element method.

3. thermal modelling and simulations

a three-dimensional model of a fast fuse has been developed using a specific software package, pro-engineer, an integrated thermal design tool for accurate thermal analysis of devices. the subject was a fast fuse of type ar with a rated current of 400 A, a rated voltage of about 700 V and rated power losses of 65 W [25]. the 3d model takes into consideration all the component parts of a fast fuse: outer cap, end tag, rivet, inner cap, ceramic body, fuselink and granular quartz, as shown in fig. 5. a simplified geometry was considered for the rivets. the busbar terminals, as part of a bidirectional rectifier bridge equipped with power diodes, have been added to this thermal model of the fast fuse, fig. 6. taking into account that the rated power losses of the fuse are about 65 W and the rated current is 400 A, the rated resistance will be

$$R_n = \frac{P_n}{I_n^2} \approx 0.4\ \mathrm{m\Omega} \qquad (11)$$

fig. 5 geometrical model of the fast fuse (1 – outer cap; 2 – end tag; 3 – rivet; 4 – inner cap; 5 – ceramic body; 6 – fuselink; 7 – granular quartz)

when the considered fast fuse is connected in series to protect the power semiconductor of the bidirectional rectifier bridge, during normal operating conditions at a current of 315 A, the resulting power loss is

$$P = R_n I^2 = 39.69\ \mathrm{W} \qquad (12)$$

in this case, because the fuse has three fuselink elements and assuming an equal distribution of the current flow, every fuselink will dissipate 13.23 W.
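the arithmetic behind relations (11) and (12) can be reproduced in a few lines. this is only a worked check using the figures quoted in the text (65 W, 400 A, 315 A, three fuselinks); the variable names are chosen for the example, and the rounding of the rated resistance to 0.4 mΩ follows the value stated in (11).

```python
rated_power_w = 65.0        # rated power losses of the fuse
rated_current_a = 400.0     # rated current
load_current_a = 315.0      # current in normal operating conditions
n_fuselinks = 3             # fuselink elements sharing the current equally

# Relation (11): rated resistance, about 0.406 milliohm, quoted as 0.4 milliohm.
rated_resistance_ohm = rated_power_w / rated_current_a ** 2
rounded_resistance_ohm = round(rated_resistance_ohm, 4)

# Relation (12): power loss at 315 A using the rounded resistance.
power_loss_w = rounded_resistance_ohm * load_current_a ** 2
per_fuselink_w = power_loss_w / n_fuselinks

print(f"R_n = {rated_resistance_ohm * 1e3:.3f} mOhm")
print(f"P   = {power_loss_w:.2f} W, per fuselink {per_fuselink_w:.2f} W")
```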
the analyzed fuse has the following overall dimensions: length: 50 mm, square cross-section: 59 mm x 59 mm, end tag diameter: 24 mm. the fuselink has a length of 38 mm, a width of 15 mm, a thickness of 0.15 mm and a notch diameter of 2.3 mm; its thermal time constant is about 12 ms [24].

fig. 6 geometrical model of the fast fuse assembly (1 – fast fuse; 2 – busbar connections)

the material properties of every component part of the fuse are given in table 1. the heat load has been applied on the surfaces of the fuselink elements, 13.23 W on each one, with a uniform spatial distribution on these surfaces. the ambient temperature has been considered to be about 25 °C. from experimental tests, the convection coefficient $k_t = 14.24$ W/(m²·°C) was computed for this type of fast fuse [24]. the convection coefficient has been applied on the surfaces of the outer caps, end tags, rivets, ceramic body and busbar connections, with a uniform spatial variation and a bulk temperature of 25 °C.

table 1 material data and coefficients at 20 °C (numbers in brackets give the corresponding components from fig. 5)

material                                    ρ (kg/m³)   c (J/kg°C)   λ (W/m°C)
ceramic body (5)                            2400        1088         1
copper (1, 2)                               8900        387          385
iron fe40 (3)                               7190        420.27       52.028
granular quartz (7)                         1500        795          0.325
silver (6)                                  8210        377          121.22
insulation material / pressed carton (4)    1400        0.099        0.063

because in the analyzed situation the temperature on the surface of the ceramic body of the fuse and on the surface of the busbars did not increase much (to a maximum of 33 °C) in comparison with the initial, ambient temperature of about 25 °C, the convection coefficient has been considered to have a constant value which does not depend on the temperature variation. for all thermal simulations, the 3d finite element software pro-mechanica has been used. the mesh of this 3d fuse thermal model has been built using tetrahedral solid elements, with 62544 elements and 14322 nodes.
the single pass adaptive convergence method has been used to solve the thermal steady-state simulation. then, steady-state thermal simulations have been carried out for the fast fuse together with its busbar connections. the temperature distribution inside the fuse and through the fuselink elements is shown in fig. 7. the maximum temperature, on the fuselinks, is 177.6 °C and the minimum, on the surface of the busbar connections, is about 28.34 °C.

fig. 7 temperature distribution through the fuse structure at 50% cross section

furthermore, the temperature distribution along the bottom busbar connection mounted on the fuse terminals has been computed, fig. 8. the temperature has been considered along the curve bounded by the points a, b and c. a maximum value can be observed in the middle of the busbar and minimum temperatures at the ends. hence, the busbar connections act like a heatsink for the fast fuse: they spread the fuse heating from the middle to the end parts of the busbars and then to the environment. in order to validate the thermal simulation results, some experimental tests have been performed. it can be noticed that the experimental values at the ends of the busbar are higher than the simulated ones, whereas at the middle point the simulated and experimental values are very close (32.26 °C with respect to 32.3 °C). this can be explained by the fact that at the ends of the busbar there are connections to the power supply of the uncontrolled bridge rectifier. these terminal connections act as additional sources of power loss for the busbar.

fig. 8 temperature distribution along the bottom busbar connection (curve bounded by a, b and c points).
comparison between simulation and experimental tests (axes: length [mm], temperature [°C]; series: simulation, experiment)

further on, thermal simulations have been carried out in order to compute the maximum temperature of the fuselink both with and without busbar connections, for the different types of load presented in fig. 9. it was taken into account that a fast fuse usually has to protect power semiconductors, such as diodes or thyristors in different types of power rectifiers, against short-circuits. hence, the following types of thermal load have been considered, only for the single-phase circuit: one-way uncontrolled bridge rectifier (fig. 9a), bidirectional uncontrolled bridge rectifier (fig. 9b), one-way controlled bridge rectifier (fig. 9c, 135 electrical degrees firing angle) and bidirectional controlled bridge rectifier (fig. 9d, 135 electrical degrees firing angle). $P_M$ is the maximum power loss, about 18.71 W for each fuselink, and $T$ is the period of the sinusoidal waveform at 50 Hz, i.e. 20 ms. the results of all these thermal simulations, i.e. the maximum temperatures for the fast fuse without busbar connections and for the fuse with the busbar terminals mounted, are summarized in table 2.

it can be noticed that the higher values of the maximum temperature are obtained for the fuse without busbar connections. also, the highest temperature is obtained when the thermal load corresponds to the bidirectional uncontrolled bridge rectifier. this is the situation of the single-phase bridge rectifier made with power diodes. the minimum temperature, 97.6 °C, is obtained for the fuse with busbar connections when the thermal load has the time variation of a single-phase one-way controlled bridge rectifier with a 90° firing angle.
this is the case of the single-phase rectifier made with thyristors. it can be observed that for a higher firing angle (i.e. 135°el) the maximum temperature increases (103.7 °C for a one-way controlled bridge rectifier and 152.8 °C for a bidirectional controlled bridge rectifier, both with busbar connections).

fig. 9 different types of thermal load (a – one-way uncontrolled bridge rectifier; b – bidirectional uncontrolled bridge rectifier; c – one-way controlled bridge rectifier; d – bidirectional controlled bridge rectifier)

table 2 comparison between maximum temperatures [°C]

load type                                                        without busbars   with busbars
one-way uncontrolled bridge rectifier                            164.3             115.5
one-way uncontrolled bridge rectifier – rms value                160.2             110.8
bidirectional uncontrolled bridge rectifier                      225.5             177.6
bidirectional uncontrolled bridge rectifier – rms value          220.5             173.5
one-way controlled bridge rectifier (θ = 135°el)                 154.3             103.7
one-way controlled bridge rectifier (θ = 135°el) – rms value     150.1             98.2
bidirectional controlled bridge rectifier (θ = 135°el)           205.5             152.8
bidirectional controlled bridge rectifier (θ = 135°el) – rms value   202.5         147.8
one-way controlled bridge rectifier (θ = 90°el)                  148.2             97.6
one-way controlled bridge rectifier (θ = 90°el) – rms value      143.2             91.5
bidirectional controlled bridge rectifier (θ = 90°el)            198.8             146.5
bidirectional controlled bridge rectifier (θ = 90°el) – rms value    195.4         140.2

hence, for a higher firing angle in the case of controlled power semiconductor devices, the maximum temperature on the fuselinks becomes higher. also, for the same types of thermal load, thermal simulations have been performed considering the thermal load as a constant value equal to its rms value. it can be noticed that all the values obtained in this way are smaller than in the first analyzed case, in which the wave-shape of each thermal load was considered.
therefore, calculating the maximum temperatures from the rms value of the thermal load alone does not give satisfactory results. the thermal analysis can be extended to other types of thermal load variation, for instance those specific to three-phase power rectifiers in which the power semiconductors are diodes or thyristors. in this way, the thermal stress can be established, for a given power rectifier, both for the fast fuse including all its busbar connections and for the protected device, the power semiconductor. moreover, the thermal behaviour of other types of power semiconductor equipment, such as inverters or frequency converters, can be analyzed.

4. conclusions

the thermal response of fast fuses for a variety of one-shot and repetitive pulse inputs has been computed with the aim of offering valuable formulae to power circuit designers. a transient thermal calculation using the analytical formulae is very complex and difficult to carry out. a more exact and efficient thermal calculation of fuses for different types of thermal load can therefore be done using modelling and simulation software packages based on the finite element method. in this way, the temperature can be computed anywhere inside or on the fast fuse assembly. the proposed three-dimensional thermal model includes all the necessary components of a fast fuse, such as outer caps, end tags, rivets, inner caps, ceramic body, fuselink elements and granular quartz; also, the simulations consider the whole thermal model, not parts of it or cross-sections. it can be concluded that in all analysed cases the highest maximum temperature has been obtained for the fast fuse without busbar terminal connections. this is because the busbar connections act like heatsinks on the surface of the end caps. these terminals, being made of copper, have a good thermal conductivity and, through thermal convection, they spread the fuse heating into the environment.
also, the maximum temperature of the fast fuse structure depends on the load type, with the highest value in the case of the bidirectional uncontrolled bridge rectifier and the lowest when the load has the waveform specific to a thyristor conducting for a certain time period in a one-way rectifier. the use of 3d simulation software may improve fast fuse design and offers the possibility of finding new solutions for better protection of power semiconductor devices.

references

[1] g. hoffmann and u. kaltenborn, "thermal modelling of high voltage h.r.c. fuses and simulation of tripping characteristic", in proceedings of the 7th international conference on electric fuses and their applications. gdansk, poland, 2003, pp. 174–180.
[2] m. wilniewczyc, p.m. mcewan and d. crellin, "finite-element analysis of thermally-induced film debonding in single and two-layer thick-film substrate fuses", in proceedings of the 6th international conference on electric fuses and their applications. torino, italy, 1999, pp. 29–33.
[3] r. k. huang and s. nilsson, "fuse selection criteria for safety applications", in proceedings of the 2012 ieee symposium on product compliance engineering (ispce). portland or, usa, 5-7 november 2012, pp. 1–8.
[4] w. bussiere, "electric fuses operation, a review: 1. pre-arcing period", in proceedings of the iop conference series: materials science and engineering, 2012, vol. 29, no. 1, p. 012001.
[5] w. bussiere, "electric fuses operation, a review: 2. arcing period", in proceedings of the iop conference series: materials science and engineering, 2012, vol. 29, no. 1, p. 012002.
[6] m. naidu, s. gopalakrishnan, and t. w. nehl, "fault-tolerant permanent magnet motor drive topologies for automotive x-by-wire systems", ieee trans. on ind. app., vol. 46, no. 2, pp. 841–848, january 2010.
[7] d. w. schlegel, d. neeser, l. verstegen and n.
lemberg, "characteristics, selection guidelines and performance of circuit protection devices for asds", in proceedings of the 2013 ieee industry applications society annual meeting. lake buena vista, fl, usa, december 2013, pp. 1–22.
[8] m. k. alam, f. h. khan, j. johnson and j. flicker, "pv faults: overview, modeling, prevention and detection techniques", in proceedings of the 2013 ieee 14th workshop on control and modeling for power electronics (compel). salt lake city, usa, 2013, pp. 1–7.
[9] s. madhusoodhanan, d. patel, s. bhattacharya, j. a. carr and z. wang, "protection of a transformerless intelligent power substation", in proceedings of the 2013 4th ieee international symposium on power electronics for distributed generation systems (pedg). rogers, ar, usa, 2013, pp. 1–8.
[10] d. m. bui, s. l. chen, c. h. wu, k. y. lien, c. h. huang and k. k. jen, "review on protection coordination strategies and development of an effective protection coordination system for dc microgrid", in proceedings of the 2014 ieee pes asia-pacific power and energy engineering conference (appeec). hong kong, china, 2014, pp. 1–10.
[11] s. memiaghe, w. bussière and d. rochette, "numerical method for prearcing times: application in hbc fuses with heavy fault-currents", in proceedings of the conference on electric fuses and their application (icefa). clermont-ferrand, france, 2007, pp. 127–132.
[12] s-h lee, "application of high voltage current limiting fuse model using atp-draw", ieee trans. on dielectr. electr. insul., vol. 17, no. 6, pp. 1806–1813, december 2010.
[13] h. f. farahani and m. sabaghi, "analysis of current harmonic on power system fuses using ansys", indian journal of science and technology, vol. 5.3, pp. 2396–2400, 2012.
[14] c. cañas, l. fernández and r. gonzález, "minimum breaking current obtaining in fuses", in proceedings of the 6th international conference on electric fuses and their applications. torino, italy, 1999, pp. 69–74.
[15] k. jakubiuk and w.
aftyka, "heating of fuse-elements in transient and steady-state", in proceedings of the 7th international conference on electric fuses and their applications. gdansk, poland, 2003, pp. 181–187.
[16] l. o. eriksson, d. e. piccone, l. j. willinger and w. h. tobin, "selecting fuses for power semiconductor devices", ieee ind. appl. mag., vol. 2, pp. 19, 1996.
[17] a. pleşca, "numerical thermal analysis of fuses for power semiconductors", electr. pow. syst. res., vol. 83, no. 1, pp. 144–150, 2012.
[18] j. l. soon and d. d. c. lu, "design of fuse–mosfet pair for fault-tolerant dc/dc converters", ieee trans. power electron., vol. 31, no. 9, pp. 6069–6074, february 2016.
[19] p. nuutinen, p. peltoniemi and p. silventoinen, "short-circuit protection in a converter-fed low-voltage distribution network", ieee trans. power electron., vol. 28, no. 4, pp. 1587–1597, 2013.
[20] w. zhang, d. xu, p. n. enjeti, h. li, j. t. hawke and h. s. krishnamoorthy, "survey on fault-tolerant techniques for power electronic converters", ieee trans. power electron., vol. 29, no. 12, pp. 6319–6331, february 2014.
[21] m. gleissner and m. m. bakran, "a real-life fuse design for a fault-tolerant motor inverter", in proceedings of the 18th european conference on power electronics and applications (epe'16 ecce europe). karlsruhe, germany, 2016, pp. 1–11.
[22] s. mirsaeidi, x. dong, s. shi and d. tzelepis, "challenges, advances and future directions in protection of hybrid ac/dc microgrids", iet renew. power gen., vol. 11, no. 12, pp. 1495–1502, november 2017.
[23] f. m. gatta, a. geri, s. lauria, m. maccioni and f. palone, "arc flash in large energy storage systems–hazard calculation and mitigation", ieee trans. ind. appl., vol. 54, no. 3, pp. 2926–2933, january 2018.
[24] a. plesca, "overcurrent protection systems for power semiconductor installations", phd thesis, gheorghe asachi technical university of iasi, 2001.
[25] ferraz – shawmut, ultra – fast fuse, datasheet.
facta universitatis series: electronics and energetics vol. 31, no. 4, december 2018, pp. 599-612 https://doi.org/10.2298/fuee1804599k

a blind decision feedback equalizer with efficient structure-criterion switching control

vladimir r. krstić, nada bogdanović
“mihajlo pupin” institute, belgrade, serbia

abstract. this paper proposes a new method of structure-criterion switching control for the self-optimized blind decision feedback equalizer (dfe) scheme, which operates by switching between adaptation modes according to the mean square error (mse) convergence state. the new switching control shortens the blind acquisition period of the dfe and, consequently, speeds up its effective convergence rate. the switching control is based on a variable switching threshold which combines the commonly used mse estimate of the dfe’s output with the a posteriori error of the all-pole whitener that performs front-end amplitude equalization during the blind operation mode. the efficiency of the dfe switching control is verified by simulations of a single-carrier system transmitting qam signals over multipath channels.

key words: blind equalization, decision feedback equalizer, maximum joint entropy, operation mode switching control.

1. introduction

in this paper, we address a new method for increasing the convergence rate of the decision feedback equalizer (dfe) scheme, based on an improvement of the equalizer’s operation mode switching control. with respect to the earlier version [1], presented at the 5th icetran2017 conference, this paper includes a new set of case studies together with the most recent simulation results. besides complexity, the convergence rate of blind equalization is an issue of the utmost importance for its use in today's communication systems, which continually strive for increased data throughput and frequency efficiency [2].
because of that, the frequency efficiency advantages achieved by removing a training sequence from the system [3], [4] have to be accompanied by an adequate equalization convergence rate if the benefits of blind equalization are to be preserved.

to reconstruct an unknown source signal, blind equalizers use the higher-order statistics of the channel outputs as well as some knowledge of the given signal statistics. in such an environment, the resulting symbol-by-symbol blind algorithms are typically characterized by a relatively low convergence rate and a high residual mean square error (mse) [3], [4] compared to conventional pilot-trained equalizers employing algorithms based on second-order statistics [5]. as a way to mitigate these drawbacks, a two-step adaptation strategy is commonly used, dividing the blind equalization task between blind and decision-directed operation modes [4], [6]. in the initial (blind) operation mode, the equalizer adjusts its adaptive parameters to open the “eye diagram” sufficiently and then, depending on the convergence state, switches adaptation to the decision-directed (dd) operation mode, which should guarantee both the successful continuation of the convergence process and the maximal reduction of the output mse. in such a scenario, blind equalizers must be provided with an algorithm estimating some measure of convergence state or signal quality, as well as an appropriate performance threshold for deciding the operation mode switch.

received february 12, 2018; received in revised form april 24, 2018
corresponding author: vladimir r. krstić, university of belgrade, institute „mihajlo pupin“, volgina 15, 11060 beograd, serbia (email: vladimir.krstic@pupin.rs)
*the earlier version of this paper was presented at the 5th international conference icetran2017, kladovo, serbia, june 5-8, 2017. [1]
this task, like blind equalization itself, is not easy because it depends on unknown system parameters, such as the source signal and the channel characteristic. operation mode switching control based on the online mse estimation of the equalizer’s output and its comparison with a preselected threshold level is an often used method because of its simplicity [6]. on the other hand, this scheme strongly depends on both the efficiency of the applied mse estimation and the threshold level, which is selected heuristically according to the given signal statistics and the assumed channel characteristics. an alternative but more complex approach is to join the equalizer’s operation mode switching control with its blind adaptation algorithm, aiming at a soft switching scheme [7], [8] which eliminates the above mentioned difficulties and possibly improves equalization performance. in [7], the noise-predictive decision feedback equalizer (dfe) smoothly transforms the equalization process between its two extreme stages: blind linear and dd steady state. for that purpose, the equalizer employs a soft decision device defined by a linear convex function combining the identity function (linear) and a hard decision (nonlinear) device. in [8], using a similar convex mixing rule, soft switching blind equalization is considered more generally in the context of linear blind equalization. this soft-switching scheme combines the outputs of two linear equalizers working in parallel: one adapted blindly and the other adapted using the dd-lms algorithm minimizing the mse. both schemes merge the equalizer’s adaptation algorithm and the operation mode control function into one adaptation task that does not need a switching threshold. in this paper we consider the blind dfe called soft-dfe [9], using an operation mode switching control scheme based on both the online estimation of the mse and a variable switching threshold [1], [10].
the purpose of using a variable threshold instead of a fixed one includes several goals, such as relaxing the issue of mse threshold level selection and speeding up the equalizer’s effective convergence rate, all with minimal computational complexity. besides, these goals are concerned with keeping the error propagation phenomenon [11], a major drawback of blind dfe equalization, under control, guaranteeing a high equalization success rate. the paper is organized as follows. section 2 describes the soft-dfe structure-criterion optimization scheme. in section 3 the insufficiency of the existing switching control is addressed and then the new control, which combines the variable threshold with online mse estimation, is introduced. in section 4 the efficiency of the variable-threshold switching control is verified by simulations.

2. soft-dfe: background and problem definition

2.1. structure-criterion optimization

a simplified baseband model of a single-carrier qam (quadrature amplitude modulated) system with the soft-dfe is presented in fig. 1, where the in-phase and quadrature components of the complex-valued symbols $\{a_n\}$, generated in time intervals of $T$ seconds, are independent identically distributed real zero-mean variables with a finite variance and sub-gaussian distribution; the time-invariant channel pulse response $\{h_n\}$ represents the combined effects of the transmitter filter, the channel impulse response and the anti-alias filter at the receiver side; and the noise is a zero-mean white gaussian process independent of the source data. the signal $x(t)$ at the input of the equalizer’s feedforward part, realized as a fractionally-spaced equalizer (fse), is sampled at the rate $2/T$ and its odd and even samples $x(t_0 + nT + iT/2) = x_{i,n}$, $i = 1, 2$, are alternately shifted into the delay lines of the corresponding fir filters represented by the coefficient vectors $\mathbf{c}_i$.
fig. 1 simplified model of transmission system with dfe (soft-dfe)

the operation of the soft-dfe is based on the principles of the self-optimized dfe scheme [6] which, in order to eliminate the error propagation effects, optimizes both the structure and the cost criteria according to its convergence state. specifically, the soft-dfe optimizes both the filter structure, which includes four fir filters, two in the fff (feedforward) part and two in the fbf (feedback) part, and the combination of three cost criteria: joint entropy maximization (jem) [12], constant modulus [13] and minimum mse (mmse) [5]. also, besides the blind and tracking operation modes, which are commonly performed by blind equalizers, a new soft-transition mode has been introduced into the soft-dfe scheme in order to mitigate the error propagation effects caused by a rapid structure-criterion switch from the blind to the decision-directed adaptation mode. at the beginning of the blind mode, the soft-dfe transforms its structure into a cascade of four linear signal transformers: the gain control (gc), whitener (wt), blind equalizer (te) and phase rotator (pr), which operate independently of each other except for the gc-wt pair, fig. 2a. effectively, in the blind mode the soft-dfe acts as a t/2-fse linear equalizer [14], dividing the equalization task between the whitener of the received signal and the te equalizer, where the wt-jem and the te-cm perform, respectively, the channel amplitude and phase equalization. in the next, soft-transition mode the soft-dfe proceeds to adapt the filters by combining the mmse and jem criteria, fig. 2b. finally, in the tracking mode, the soft-dfe continues to converge to the mmse steady state using the dd-lms algorithm.

fig.
2 soft-dfe structure-criterion transformation: (a) blind mode and (b) soft-transition mode (sfbf with jem, dotted line) and tracking mode (fbf with dd-lms, solid lines)

the phase rotator pr is realized as a modified variant of the decision-directed phase-locked loop [15] that, using a reduced signal constellation based only on the symbols with the largest energy [16], aims to avoid the catastrophic effects caused by carrier phase estimation that exploits insufficiently open signals; this is particularly critical for high-order signal constellations such as 64-qam and higher.

2.2. algorithms

in this subsection, the adaptation algorithms used by the soft-dfe are revisited in the order in which the operation modes are switched.

gain control. the gain control gc is realized as a single-coefficient equalizer [6] whose task is to recover the power of the source signal. the gc operation is enhanced by the whitener’s outputs $u_{i,n}$, $i = 1, 2$, and given by the recursion

$$G_{i,n+1} = G_{i,n} - \mu_g \left[\, |u_{i,n}|^2 - \sigma_a^2 \right], \qquad g_{i,n+1} = \sqrt{|G_{i,n+1}|} \qquad (1)$$

where $\mu_g$ is the adaptation step size and $\sigma_a^2$ is the variance of the source symbols $\{a_n\}$.

jem whitening algorithm. the whitener wt of the received signal is realized as an all-pole filter (equalizer) to compensate for the channel amplitude distortion, i.e., to recover the second-order statistics of the given source signal by using the entropy-based jem cost [12].
the corresponding stochastic-gradient jem whitening algorithm (jem-vl) [16] is given by

$$u_{i,n} = x_{i,n} - \mathbf{b}_{i,n}^T \mathbf{u}_{i,n}, \qquad i = 1, 2 \qquad (2)$$

$$\mathbf{b}_{i,n+1} = \mathbf{b}_{i,n} - \gamma_n \mathbf{b}_{i,n} + \mu_{bb}\, u_{i,n} \left(1 - \beta_w |u_{i,n}|^2\right) \mathbf{u}_{i,n}^* \qquad (3)$$

where $\mathbf{u}_{i,n} = [u_{i,n,1}, \ldots, u_{i,n,N}]^T$ and $\mathbf{b}_{i,n} = [b_{i,n,1}, \ldots, b_{i,n,N}]^T$ are, respectively, the whitener’s regression and coefficient vectors, $\gamma_n \ge 0$ is the time-variable leaky factor, $\beta_w$ is the free parameter representing the slope of the employed neuron function, $\mu_{bb}$ is a step size, $N$ is the span of the whitener delay line given in $T$ periods, and the superscripts $T$ and $*$ signify, respectively, transpose and conjugation. the specific feature of the jem-vl algorithm, besides the slope $\beta_w$ controlling its entropic capability, is its variable leaky factor $\gamma_n$. acting in opposition to the entropy-gradient term, the leaky term $\gamma_n \mathbf{b}_n$ controls the magnitudes of the whitener coefficients, preventing superfluous coefficients from degrading the equalizer convergence process. the undesirable influence of superfluous coefficients is particularly exposed at the time the equalizer switches from the blind to the decision-directed operation mode. the adaptation of the leaky factor $\gamma_n$ is based on the analysis of the whitener’s a posteriori errors and the heuristic punish/award rule [17], which decides when and by how much to increase or decrease the leaky factor. accordingly, the leaky adaptation rule in jem-vl comprises the following three operations: the calculation of the a posteriori errors with ($\gamma_n > 0$) and without ($\gamma_n = 0$) coefficient leakage, the decision when to increase or decrease the leaky factor, and the decision by how much to do so.
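the punish/award step for the leaky factor can be sketched in a few lines of python. this is a minimal illustration under stated assumptions, not the authors' implementation: it assumes that the leak counter is decreased when the a posteriori error obtained with leakage is larger in magnitude than the one obtained without leakage and increased otherwise, and that the leaky factor is a linear function of the counter. the function and parameter names (`update_leak_counter`, `leak_factor`, `l_d`, `l_u`, `m_max`, `gamma_max`) are chosen for the example.

```python
def update_leak_counter(m, err_leaky, err_plain, l_d, l_u, m_max):
    """Punish/award step on the integer leak counter m, kept within [0, m_max].

    Assumption: leakage is reduced when it made the a posteriori error worse,
    and increased otherwise.
    """
    if abs(err_leaky) > abs(err_plain):
        return max(m - l_d, 0)        # punish: less leakage
    return min(m + l_u, m_max)        # award: more leakage

def leak_factor(m, m_max, gamma_max):
    """Quantized leaky factor, assumed linear in the counter value."""
    return gamma_max * m / m_max

# Example with hypothetical settings: counter in 0..10, steps of 1.
m = 5
m = update_leak_counter(m, err_leaky=0.20 + 0.10j, err_plain=0.10 + 0.05j,
                        l_d=1, l_u=1, m_max=10)
print(m, leak_factor(m, m_max=10, gamma_max=0.01))
```

keeping the counter as an integer and deriving the leaky factor from it means the factor can only take `m_max + 1` distinct values, which keeps the rule cheap to evaluate at every symbol.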
the a posteriori error estimate $e_n^{vl}$ for $\gamma_n > 0$ in jem-vl is given by

$$u_n = x_n - \mathbf{b}_{n+1}^T \mathbf{u}_n \qquad (4)$$

$$e_n^{vl} = u_n \left(1 - \beta_w |u_n|^2\right) \qquad (5)$$

and the corresponding a posteriori error estimate $e_n^{w}$ for $\gamma_n = 0$ in (3), which corresponds to the original whitening algorithm jem-w [9], is given by

$$\mathbf{b}_{n+1} = \mathbf{b}_n - \mu_{bb}\, u_n \left(1 - \beta_w |u_n|^2\right) \mathbf{u}_n^* \qquad (6)$$

$$u_n = x_n - \mathbf{b}_{n+1}^T \mathbf{u}_n \qquad (7)$$

$$e_n^{w} = u_n \left(1 - \beta_w |u_n|^2\right) \qquad (8)$$

it should be noted that the a posteriori errors given in (5) and (8) are obtained using the same current value of the whitener input $x_n$; in the above recursions the index $i = 1, 2$ is omitted for simplicity.

in the next step, based on the comparison of the obtained a posteriori errors, the “if-else” relation

if $e_n^{vl} > e_n^{w}$ then
set $m_{n+1} = \max(m_n - l_d, 0)$
else
set $m_{n+1} = \min(m_n + l_u, M)$
end if (9)

decides when to decrease or to increase the leaky factor and, finally, the quantized function

$$\gamma_n = f(m_n) = \gamma_{\max}\,(m_n / M) \qquad (10)$$

estimates how much to decrease or to increase the leaky factor, employing the parameters $(m_0, l_d, l_u, M)$ and $\gamma_{\max}$, where $m_n \in \{0, \ldots, M\}$ is an independent variable.

cma algorithm. the constant modulus algorithm (cma) is realized in its commonly used variant for the dispersion function of order $p = 2$ [13]

$$y'_{i,n} = \mathbf{c}_{i,n}^T \mathbf{u}_{i,n}, \qquad y_n = \sum_{i=1}^{2} y'_{i,n} \qquad (11)$$

$$\mathbf{c}_{i,n+1} = \mathbf{c}_{i,n} - \mu_{fb}\, y'_{i,n} \left( |y'_{i,n}|^2 - R_c \right) \mathbf{u}_{i,n}^*, \qquad R_c = \frac{E\{|a_n|^4\}}{E\{|a_n|^2\}} \qquad (12)$$

where $\mathbf{c}_{i,n} = [c_{i,0}, \ldots, c_{i,M-1}]^T$ is the coefficient vector of the fff, $\mu_{fb}$ is an adaptation step size and the constant $R_c$ is the kurtosis of the source signal, which measures the distance of the source probability density function (pdf) from normality [18]. assuming the amplitude equalization is done efficiently by the gc-wt pair, the t/2-fse-cma has the task of equalizing the channel phase distortion by retrieving the kurtosis statistic of the source signal [19].

soft jem algorithm.
the performance of the soft-dfe in the soft-transition mode is characterized by the behaviour of the sfbf equalizer, which operates between the original soft fbf equalizer maximizing the joint entropy of the neuron outputs [9] and a hard dd fbf equalizer suffering from incorrect decisions $\hat{a}_n$. the operation of the sfbf is described by the following relations

$$y_n = \exp(-j\hat{\varphi}_n)\, \mathbf{c}_n^T \mathbf{u}_n \qquad (13)$$

$$z_n = y_n - \mathbf{b}_n^T \hat{\mathbf{a}}_n \qquad (14)$$

$$\mathbf{b}_{n+1} = \mathbf{b}_n - \mu_{bs}\, z_n \left(1 - \beta_d |z_n|^2\right) \hat{\mathbf{a}}_n^* \qquad (15)$$

where $\hat{\varphi}_n$ is a carrier phase estimate, $\hat{\mathbf{a}}_n = [\hat{a}_{n-1}, \ldots, \hat{a}_{n-N}]^T$ is the vector of previously detected symbols, $\mu_{bs}$ is a step size and $\beta_d$ is the neuron slope, which is determined by the given source statistic [20].

tracking mode. in the tracking mode, the soft-dfe approaches the mmse steady state and continues to follow slow channel variations using dd-lms algorithms in both its fff and fbf parts, jointly optimizing the mmse criterion given by

$$J_{mmse}(\mathbf{c}_n, \mathbf{b}_n, \hat{\varphi}_n) = E\{ |z_n - \hat{a}_n|^2 \} \qquad (16)$$

it should be noted that, although the soft-dfe strives to reach a global mmse solution, local solutions cannot be avoided entirely because the soft-dfe’s final convergence state depends on the local $J_{jem}(\mathbf{b})$ and $J_{cm}(\mathbf{c})$ criteria.

3. switching control with variable threshold

the soft-dfe controls the convergence state using the mse monitor, which estimates the output mse online and, according to the a priori selected mse threshold levels (tl), switches the structure and adaptation criterion through the three operation modes. to switch from the blind to the soft-transition mode and from the soft-transition to the tracking mode, the monitor compares the estimated mse with the tl1 and tl2 thresholds, respectively. also, to switch the pr operation between a reduced and a full signal constellation, the mse is compared with threshold tl3.
since the latter indicates the signal constellation opening, it is also utilized as a measure of equalization successfulness, given by the equalization success index (esi), which is defined as the ratio of the number of successful equalizations to the total number of monte carlo runs. thus, the soft-dfe controls its convergence process completely by mse thresholds satisfying the relation tl1 > tl2 > tl3.

3.1. mse switching control

the online estimation of the mse in the blind mode is given by the relation

$$\mathrm{mse}_{b,n} = \lambda\, \mathrm{mse}_{b,n-1} + (1 - \lambda) \left( |y_n| - R_c \right)^2 \qquad (17)$$

where the forgetting factor $\lambda > 0$ regulates the quality of the estimation process and typically takes values slightly less than 1.0. the same mse estimation principle is also used during the subsequent soft-transition and tracking modes, provided that the error $(z_n - \hat{a}_n)$ is substituted for the error $(|y_n| - R_c)$ in (17).

the quality of the $\mathrm{mse}_{b,n}$ estimate obtained by (17) suffers from several weaknesses. firstly, $\mathrm{mse}_{b,n}$ is a crude estimate of the mse for all non-constant modulus qam signals (i.e., all except 4-qam), because the term $(|y_n| - R_c)$ on the right-hand side of (17) is not a real error but a dispersion measure of the modulus of the symbol estimates with respect to the constant $R_c$. secondly, the $\mathrm{mse}_{b,n}$ estimate aggregates the mse affected by the cascaded gc-wt-te (see fig. 2a) with the dominant influence of the te-cma algorithm, which is based on the fourth-order statistic represented by the constant $R_c$ (12). in other words, the estimate $\mathrm{mse}_{b,n}$ relies mostly on the outlier-sensitive kurtosis statistic [18], neglecting the second-order statistic reconstructed by the wt-jem.

to illustrate the soft-dfe convergence behaviour controlled by the $\mathrm{mse}_{b,n}$ estimator, fig. 3 presents the results of convergence tests carried out for three different heuristically selected thresholds $\mathrm{mse}_{tf}$ (tl1), using the system in fig. 1 with a 64-qam signal and the mp-e channel; see the channel amplitude in fig. 5 in the next section.
if we suppose that the optimal mean square error $\mathrm{mse}_{b,opt}$ is achievable during the blind mode once the equalizer’s coefficients have reached the optimal setup, then three typical equalization scenarios are possible: 1) for $\mathrm{mse}_{tf} \approx \mathrm{mse}_{b,opt}$ the equalizer successfully switches from the blind to the dd operation mode; 2) for $\mathrm{mse}_{tf} < \mathrm{mse}_{b,opt}$ the equalizer stays longer in the blind mode than in case 1) or, possibly, never reaches the soft-transition mode, so that the equalization ends in failure; and, finally, 3) for $\mathrm{mse}_{tf} > \mathrm{mse}_{b,opt}$ the equalizer switches to the dd mode faster than in case 1), but the mmse steady-state performance is not guaranteed, and even some pathological states are possible. as can be seen from the presented convergence curves, the threshold tl1=8.02 db is the best of the three thresholds.

fig. 3 mse convergence curves obtained for three different fixed thresholds tl1; soft-dfe single run test for 64-qam signal and mp-e channel

3.2. variable threshold

in order to compensate for the insufficiency of the $\mathrm{mse}_{b,n}$ estimation given by (17), we have combined a fixed threshold $\mathrm{mse}_{tl}$, generally different from the threshold $\mathrm{mse}_{tf}$, with the whitener’s a posteriori errors $e_{i,n+1}^{vl}$, introducing in such a way the variable threshold $\mathrm{mse}_{tlv}$ [10]

$$\mathrm{mse}_{tlv} = \mathrm{mse}_{tl} - s \left( e_{1,n+1}^{vl} + e_{2,n+1}^{vl} \right) \qquad (18)$$

which includes two terms, the fixed threshold $\mathrm{mse}_{tl}$ and the variable term $s\,(e_{1,n+1}^{vl} + e_{2,n+1}^{vl})$, where $s$ is a small positive scale factor. it is worth noting that the scale factor $s$ should be selected through the analysis of the ratio between the sum of the a posteriori errors and the $\mathrm{mse}_{tl}$ threshold. the first verifications of the variable threshold model have proved its efficiency for values of $s$ that scale the a posteriori term down to a level comparable with the $\mathrm{mse}_{tl}$ term. the full exploration of the variable threshold usage and its limitations remains a subject of further study.
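as a small numerical illustration of the variable threshold in (18), the following python sketch may help. it is only a sketch under stated assumptions: the function name is hypothetical, the a posteriori errors enter through their magnitudes, and all quantities are treated as plain numbers in the same (linear) units.

```python
def variable_threshold(mse_tl, e_vl_1, e_vl_2, s):
    """Fixed threshold minus the scaled sum of the two whitener a posteriori errors.

    Assumption: the error magnitudes are used, so the result stays real
    even for complex-valued errors.
    """
    return mse_tl - s * (abs(e_vl_1) + abs(e_vl_2))


# a smaller a posteriori error leaves the threshold higher,
# which lets the equalizer leave the blind mode earlier
high = variable_threshold(1.0, 0.1, 0.1, s=0.5)
low = variable_threshold(1.0, 0.4, 0.4, s=0.5)
```

with s = 0 the function reduces to the fixed threshold, which is how the tlf setups are parameterized in the simulations.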
the above innovation of the blind mode threshold comes from the fact that the a posteriori errors of the wt-jem carry up-to-date information on the second-order statistics that is missing from $\mathrm{mse}_{b,n}$ (17). the a posteriori errors of the jem-vl algorithm are functions of the wt-jem outputs, which are almost free from the isi disturbance (outliers) coming from the channel amplitude characteristics. as mentioned in the previous section, jem-vl provides efficient compensation for frequency-selective channels. practically, by introducing the whitener’s a posteriori errors as a variable threshold term, we have created a switching control that directly reflects the recovery of both the second-order and the fourth-order statistics of the applied source data.

using the variable threshold, the switching control responds as follows: for a lower a posteriori error $\mathrm{mse}_{tlv}$ becomes higher, which shortens the blind equalization time and, hence, speeds up the equalizer convergence; conversely, for a higher a posteriori error $\mathrm{mse}_{tlv}$ becomes lower, which lengthens the blind acquisition time and slows the equalizer convergence. effectively, from the perspective of the mse estimation quality, $\mathrm{mse}_{tlv}$ becomes more robust against the dispersion of the magnitudes $|y_n|$ of the symbol estimates.

to avoid false equalizer switching through the operation modes, which could be caused by the non-stationarity of the mse data, the soft-dfe switching control implementation is based on multiple checking of the threshold level passage. according to the switching rule presented in fig. 4, the equalizer is allowed to switch from the blind to the soft-transition mode if and only if $\mathrm{mse}_{b,n}$ satisfies $\mathrm{mse}_{b,n} < \mathrm{mse}_{tlv}$ during $K$ equalizer update iterations, where $K$ is an integer larger than 1.
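the multiple-check rule can be sketched in python as follows. this is an illustrative sketch rather than the authors' code: the class and argument names are hypothetical, the mse is tracked with a generic exponentially weighted average of squared error magnitudes, and the k passages are assumed to be consecutive (the counter is reset whenever the estimate rises above the threshold).

```python
class SwitchMonitor:
    """Exponentially weighted mse estimate with a k-fold threshold check."""

    def __init__(self, forgetting=0.99, k_required=95, mse_init=1.0):
        self.forgetting = forgetting    # forgetting factor, slightly below 1
        self.k_required = k_required    # number of required passages (k > 1)
        self.mse = mse_init             # running mse estimate
        self.count = 0                  # passages observed in a row

    def update(self, error, threshold):
        """Feed one error sample; return True once switching is allowed."""
        lam = self.forgetting
        self.mse = lam * self.mse + (1.0 - lam) * abs(error) ** 2
        if self.mse < threshold:
            self.count += 1
        else:
            self.count = 0              # assumption: passages must be consecutive
        return self.count >= self.k_required


# usage: the threshold may change at every iteration (variable threshold)
monitor = SwitchMonitor(forgetting=0.5, k_required=2, mse_init=1.0)
decisions = [monitor.update(0.0, threshold=0.3) for _ in range(3)]
```

because the threshold is passed at every call, the same monitor works with either a fixed or a variable threshold.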
the same switching rule is valid for the soft-dfe switching from the soft-transition to the tracking operation mode, but it is less critical than the former from the perspective of convergence rate.

fig. 4 soft-dfe rule switching from blind to soft-transition mode

4. simulation results

the efficiency of the innovated structure-criterion switching control and its impact on the equalizer’s convergence rate are verified by the software simulator of the qam system presented in fig. 1. the simulations are carried out using 16- and 64-qam signals and a multipath channel with additive white gaussian noise determined by the signal-to-noise ratio (snr). the soft-dfe dimensions and parameters are selected aiming at the best compromise between the convergence rate and the equalization successfulness defined by esi. the frequency-selective mp-(a, c, e) channels, whose normalized amplitude characteristics are presented in fig. 5, are designed in a way to gradually increase the isi severity from mp-a to mp-e.

fig. 5 normalized attenuation characteristics of mp-(a, c, e) channels

the soft-dfe parameters are given as follows. the filter tapped-delay line span, in $T$ intervals, is 5 for the fbf for both qam signals and 23 and 24 for the fff, respectively, for 16- and 64-qam signals. the fbf is initialized with all-zero coefficient values, while the initialization of the fff is realized by two strategies: 1) double-spike initialization (ds) with two central reference taps $c_{1,r} = c_{2,r} = 0.707$ and 2) single-spike initialization (ss) with a single central reference tap $c_{1,r} = 1.0$. the adaptation steps for the gc, jem, cma and lms algorithms are selected in a way to optimize their efficiency through the corresponding operation modes. this is of particular importance for the gc, jem and cma algorithms, which divide the blind equalization task into several simpler subtasks.
for example, the gc uses two adaptation steps $\mu_G \in \{2^{-11}, 2^{-20}\}$ for both signals. the first step, applied at the very beginning of the blind mode, is much larger than the second one, aiming to prevent the wt and te equalizers from taking over the gain control function. the adaptation steps of the jem, cma and lms algorithms are selected in order to produce the best response of the fbf and fff filters through the three operation modes. depending on the 16- and 64-qam signals, they are selected as follows: 1) for the fbf $\{\mu_{bb,16} = 2^{-19}, \mu_{bb,64} = 2^{-22}\}$, $\{\mu_{bs,16} = 2^{-18}, \mu_{bs,64} = 2^{-21}\}$, $\{\mu_{bt,16} = 2^{-14}, \mu_{bt,64} = 2^{-13}\}$ and 2) for the fff $\{\mu_{fb,16} = 2^{-16}, \mu_{fb,64} = 2^{-21}\}$, $\{\mu_{fs,16} = 2^{-15}, \mu_{fs,64} = 2^{-20}\}$, $\{\mu_{ft,16} = 2^{-13}, \mu_{ft,64} = 2^{-16}\}$; the second subscripts of the adaptation steps, b, s and t, signify the blind, soft-transition and tracking modes, respectively. the leaky parameters are given by $\{m_0 = 40, l_d = 5, l_u = 40, M = 400, \gamma_{\max} = 2^{-11}\}$ for both signals.

the neuron slopes $\{\beta_w, \beta_d\}$ are selected according to the considerations given in [16], [20]. the slope $\beta_d$, which depends mostly on the given signal constellation, takes values 12 and 1.95, respectively, for 16- and 64-qam constellations. on the other hand, the slope $\beta_w$, together with the threshold parameters $s$ and $K$, is used as a tool to optimize the initial convergence rate of the equalizer, see table 1.

the comparison of the soft-dfe performance achieved by the fixed (tlf) and variable (tlv) switching controls is given in terms of the pdf histograms of the blind acquisition period time, the mse convergence and the equalization successfulness esi. the comparison tests are carried out for the ss and ds equalizer initialization methods, (16,64)-qam signals and the switching control parameters $\{\mathrm{mse}_{tf}/\mathrm{mse}_{tl}, s, K\}$ given in table 1.
the motivation to test the switching control for two initialization methods comes from the fact that the success and speed of convergence of fse-cma equalization are strongly affected by the coefficient initialization [14]. the presented pdf histograms and esi tests are obtained for 10000 independent monte carlo runs, and the mse convergence curves for 200 independent monte carlo runs.

table 1 system setups: switching control parameters

qam/soft-dfe    mse_tf/mse_tl    β_w    k      s
16qam-tlf       1.30 db          7.5    95     0
16qam-tlv       2.30 db          9      105    0.00145
64qam-tlf       8.02 db          2.4    95     0
64qam-tlv       8.61 db          2.8    105    0.00165

fig. 6 presents the pdf histograms of the blind acquisition period time obtained with the tlf and tlv switching controls for the 64-qam signal. the histogram obtained by the tlf control demonstrates a positive skewness caused by fse-cma kurtosis outliers, in contrast to the histograms obtained by the tlv control, which are much more symmetrical; the latter is evidently the effect of the a posteriori variable threshold term in (18). a more quantitative measure of the switching control impact on the blind acquisition time is provided by the mean and standard deviation (sd) statistics presented in table 2 for (16,64)-qam signals and the ss initialization method.

fig. 6 pdf histograms of blind acquisition period time for tlf and tlv thresholds and ss initialization: 64-qam, snr=30 db, a) mp-a, b) mp-c, c) mp-e

table 2 blind mode statistic, [t]: (16, 64)-qam, ss

signal    mean, std/channel    mp-a    mp-c    mp-e
16-qam    mean: tlf            3146    4264    3317
16-qam    std: tlf             759     1314    843
16-qam    mean: tlv            2778    2957    2684
16-qam    std: tlv             232     301     173
64-qam    mean: tlf            4776    8431    6103
64-qam    std: tlf             1893    2507    1795
64-qam    mean: tlv            4691    6422    4585
64-qam    std: tlv             1142    1049    793

the presented results show a considerable decrease of the mean and sd in the case of the tlv control.
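the size of this decrease for the 64-qam signal can be checked directly from the table 2 entries. the short python sketch below uses hypothetical variable names and assumes that "averaged over all channels" means comparing the channel-averaged values (equivalently, the sums over the three channels).

```python
# 64-qam rows of table 2, channels mp-a, mp-c, mp-e (values in T intervals)
tlf_mean = [4776, 8431, 6103]
tlv_mean = [4691, 6422, 4585]
tlf_std = [1893, 2507, 1795]
tlv_std = [1142, 1049, 793]


def reduction_percent(before, after):
    """Relative reduction of the channel-averaged value, in percent."""
    return 100.0 * (sum(before) - sum(after)) / sum(before)


mean_reduction = reduction_percent(tlf_mean, tlv_mean)  # about 18.7
std_reduction = reduction_percent(tlf_std, tlv_std)     # about 51.8
```

under this reading the table yields roughly 18.7% for the mean and 51.8% for the sd, in line with the figures quoted in the text.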
for example, for the tlv control and the 64-qam signal, the mean and sd values are, respectively, 18% and 51.8% smaller (averaged over all channels) than for the tlf control.

the impact of the operation mode switching control on the equalizer convergence rate is presented in figures 7 and 8. fig. 7 gives the convergence curves obtained for the tlf and tlv controls and the ss and ds initializations in the case of the 16-qam signal. as can be seen, the convergence rates achieved by the tlv control are significantly higher than those achieved by the tlf control for both ss and ds initialization, without sacrificing the residual mse; the best results are reached for the ss-tlv combination. similar results are achieved for the 64-qam signal, fig. 8. in this case, for the sake of figure clarity, only the convergence curves obtained by the tlv control for the ss and ds initializations are presented. it is worth noting that the equalizer converges faster with the tlv control because the blind mode time has been made shorter as a result of the improved switching control. these results also demonstrate the efficiency of the gc-wt amplitude equalizer, which was not sufficiently visible until the tlv control was applied.

fig. 7 comparison of mse convergence curves obtained using tlf and tlv controls and (ss, ds) initializations: 16-qam, snr=25 db, a) mp-a, b) mp-c, c) mp-e

fig. 8 comparison of mse convergence curves obtained using tlv control and (ss, ds) initializations: 64-qam, snr=30 db, mp-(a,c,e)

the results of the esi tests are given in table 3. the purpose of these tests is to prove that the new tlv control does not degrade the equalization successfulness; the results for the tlf and tlv methods are practically the same for the parameters selected in table 1.
this is of essential importance because we have used different switching control parameters ($\beta_w$, $s$, $K$), aiming to achieve the best convergence performance with both methods and, at the same time, to preserve the equalization efficiency.

table 3 equalization success index [%]

signal    esi/channel    mp-a     mp-c     mp-e
16-qam    ss-tf          99.99    99.80    100
16-qam    ss-tv          99.99    99.75    100
16-qam    ds-tf          99.97    99.91    99.16
16-qam    ds-tv          100      99.87    98.94
64-qam    ss-tlf         100      99.68    98.83
64-qam    ss-tlv         100      99.80    98.64
64-qam    ds-tlf         100      99.69    98.65
64-qam    ds-tlv         100      99.84    98.03

conclusions

our goal in this paper was to increase the convergence rate of the blind soft-dfe equalizer by improving its operation mode switching control. the performance of the online mse estimator monitoring the equalizer’s convergence state is enhanced by the innovated switching control, which combines the fixed threshold term with the a posteriori error of the all-pole amplitude equalizer coefficient updates. in this innovation, the robust second-order statistic of the a posteriori errors is employed to compensate for the undesirable effects of the outlier-sensitive kurtosis statistic of the fse-cma outputs. it is verified by different simulation setups that the simple mse estimation method combined with the variable up-to-date threshold information significantly reduces the blind mode operation time and, hence, greatly improves the effective equalizer convergence rate.

acknowledgement: the paper is a part of the research done within the project tr 32037, 2011-2018, the ministry of education, science and technological development of the republic of serbia. the authors thank the anonymous reviewers for their valuable suggestions and comments.

references

[1] v. r. krstić, n. bogdanović, “on structure-criterion switching control for self-optimized decision feedback equalizer”, in proceedings of conference papers icetran 2017, srbija, june 5-8, 2017.
[2] v. savaux, f. bader, j. palicot, “ofdm/oqam blind equalization using cna approach”, ieee trans.
signal processing, vol. 64, no. 9, pp. 2324-2333, 2016. [3] j. r. treichler, m. g. larimore and j. c. harp, “practical blind demodulators for high-order qam signals,” in proceedings of the ieee, vol. 86, no. 10, pp. 1907-1926, 1998. [4] z. ding, y. g. li, blind equalization and identification. signal processing and communication series, marcel dekker, 2001. [5] s.u.h. qureshi, "adaptive equalization," in proceedings of the ieee, vol. 73, pp.1349-1387, sept. 1985. [6] j. labat, o. macchi and c. laot, “adaptive decision feedback equalization: can you skip the training period?,” ieee trans. commun., vol. 46, no. 7, pp. 921-930, jul, 1998. [7] a. goupil and j. palicot, “an efficient blind decision feedback equalizer,” ieee commun. letters, vol. 14, no. 5, pp. 462-464, 2010. [8] m. t. m. silva, j. arenas-garcía, “a soft-switching blind equalization scheme via convex combination of adaptive filters,” ieee trans. signal processing, vol. 61, no. 5, pp. 1171-1182, march 1, 2013. [9] v. r. krstić. and m. l. dukić, “blind dfe with maximum-entropy feedback,” ieee signal processing letters, vol. 16, no 1, pp. 26-29, jan. 2009. [10] v. r. krstić, “fast start-up blind dfe equalizer,” pending patent rs, p-2017/0205, feb. 2017. [11] j. g. proakis, digital communications.3 rd ed. new york: mcgraw-hill, 1995. [12] y. h. kim, h. s. shamsunder, “adaptive algorithms for channel equalization with soft decision feedback,” ieee journal on selected areas in communications, vol. 16, no. 9, pp. 1660-1669, 1998. [13] d. n. godard, “self-recovering equalization and carrier tracking in two-dimensional data communication systems”, ieee trans. commun., 1980, vol. 18, no. 11, pp. 1867-1875, 1980. [14] c. r. johnson, jr. et al., “the core of fse-cma behavior theory”. in s. haykin (ed.), unsupervised adaptive filtering, vol. ii blind deconvolution, pp. 13-112. new york: john wiley & sons, 2000. [15] s. abrar, a. zerguine, a. k. 
nandi, “blind adaptive carrier phase recovery for qam signals,” digital signal processing, vol. 49, pp. 65-85, 2016.
[16] v. r. krstić, a. m. stevanović and b. lj. odadžić, “a variable leaky entropy-based whitening algorithm for blind decision feedback equalization”, wireless personal communications, vol. 95, issue 2, pp. 931-946, july 2017.
[17] m. kamenetsky and b. widrow, “a variable leaky lms adaptive algorithm”, in proceedings of the thirty-eighth asilomar conference on signal, systems and computers, nov. 2004, vol. 1, pp. 125-126.
[18] l. t. decarlo, “on the meaning and use of kurtosis,” psychological methods, vol. 2, no. 3, pp. 292-307, 1997.
[19] o. shalvi, e. weinstein, “new criteria for blind deconvolution of nonminimum phase systems (channels),” ieee trans. inf. theory, vol. 36, pp. 312-321, march 1990.
[20] v. r. krstić, m. l. dukić, “decision feedback blind equalizer with tap-leaky whitening for stable structure-criterion switching,” international journal of digital multimedia broadcasting, volume 2014, article id 987039, 10 pages, 2014.

facta universitatis series: electronics and energetics vol. 27, no 3, september 2014, pp. 411-424 doi: 10.2298/fuee1403411d

implementation of artificial neural networks based ai concepts to the smart grid

marko dimitrijević, miona andrejević stošović, jelena milojković, vančo litovski
faculty of electronic engineering, university of niš, serbia

abstract. ict and energy are two economic domains that have become among the most influential for the growth of modern society. at the same time, due to the exploitation of natural resources and the production of unwanted effects on the environment, they represent a kind of menace to the ecosystem and the human future. the implementation of measures to mitigate these unwanted effects established a new paradigm of production and distribution of electrical energy named smart grid.
it relies on many novelties that improve the production, distribution and consumption of electricity, among which one of the most important is ict. among the ict concepts implemented in the modern smart grid one recognizes artificial intelligence and, specifically, the artificial neural network. here, after reviewing the subject and setting the case, we report some of our newest results aimed at broadening the set of tools offered by ict to the smart grid. we describe our results in the prediction of electricity demand and in the characterization of new threats to the security of ict that may use the grid as a carrier of the attack. we use artificial neural networks (anns) as a tool in both subjects.

key words: smart grid, ict, artificial intelligence, ann, prediction, security.

1. introduction

in our recent studies we addressed the problem of the interaction of ict and the energy sector, including their specific interrelation through the subject of security [1, 2]. most of the claims reported were later confirmed in the literature as, for example, in [3, 4, 5, 6]. it is our intention here to report on some aspects of these interrelations and, via some new case studies, to demonstrate how much the modern energy distribution system may be supported by ict. in particular, we intend to emphasize the potential role of artificial intelligence in improving the implementation of the newly emerging concepts of production, consumption and distribution of electricity.

the ict industry plays a vital role in the global economy and is a major driver of growth and development [3]. several of the most transformative economic trends (e.g., social media, big data, multi-channel retail, etc.) involve the use of ict.

received january 31, 2014; received in revised form june 5, 2014
corresponding author: miona andrejević stošović, university of niš, faculty of electronic engineering, aleksandra medvedeva 14, 18000 niš (miona.andrejevic@elfak.ni.ac.rs)

in addition to its positive implications for economic growth, ict's greenhouse gas (ghg) abatement potential must also be considered [3]. the ict industry accounted for 1.9% of total global ghg emissions in 2011, which is significantly less than its overall contribution to gdp. nonetheless, this is a significant amount of emissions that the industry must address, especially as we expect even faster adoption of ict in the future. however, in the last several years there have been promising strides toward decreasing the growth rate of ict emissions.

early on, sustainable ict focused on green ict initiatives that minimize the ecological impact of the development, management, use, and disposal of computing resources. this is named the first wave of sustainable ict [7]. green ict tends to be product-oriented and mostly focused on reducing energy costs and carbon emissions for data centres and desktops. several studies have been reported on the energy footprint of computers and data centres [8, 9, 10]. as concerns about ict's impact on the environment have risen, these issues have become limiting factors in determining the feasibility of deploying new ict systems, even though processing power is widely available and affordable.

on the other side, the electric power sector went through revolutionary transformations that include deregulation, the use of alternative energy sources, and the introduction of ict.
at the distribution level, the new requirements call for the development of:
• distribution grids accessible to distributed generation (dg) and renewable energy sources (ress), either self-dispatched or dispatched by local distribution system operators,
• distribution grids enabling local energy demand management interacting with the users through smart metering systems, and
• distribution grids that benefit transmission dynamic control techniques and the overall level of power security, quality, reliability, and availability.

the key technology supposed to fulfil these requirements today is named smart grid. smart grids and smart power systems in the energy sector can have major impacts on improving energy distribution and optimizing energy usage [11].

defining the smart grid in a concise way is not an easy task, as the concept is relatively new and various alternative components build up a smart grid. some authors even argue that it is “too hard” to define the concept [12]. looking at different definitions reveals that the smart grid has been defined in different ways by different organizations and authors. here is one of them: “a ‘smart grid’ is a set of software and hardware tools that enable generators to route power more efficiently, reducing the need for excess capacity and allowing two-way, real time information exchange with their customers for real time demand side management (dsm). it improves efficiency, energy monitoring and data capture across the power generation and transmission and distribution network” [13].

the need for the implementation of ai within the smart grid has been recognized by the professional and scientific community [5, 14]. for example, the work in [15] surveys some of the most relevant applications of ann techniques to the field of energy systems.
these applications serve a wide variety of purposes, such as modeling solar energy heat-up response [16], prediction of the global solar irradiance [17], adaptive critic design [18], or even security issues as reviewed in [19]. the idea behind these applications is based on learning how system performance can be related to certain input values, for instance, how weather conditions (solar or wind) determine the energy output that can be expected [20].

in the past decades anns have emerged as a technology with great promise for identifying and modeling data patterns that are not easily discernible by traditional methods. a comprehensive review of ann use in forecasting may be found in [21]. among the many successful implementations we may mention [22, 23, 24]. applications of anns for security purposes were discussed in [5, 6].

putting it all together, at this moment one may state that ai concepts, and especially anns, may be implemented in the following aspects of the life of modern distributed energy resources.
• various forecasting tasks, like renewable energy forecasting, storage forecasting and demand forecasting, that need intelligent rules. we will address this issue later on.
• protection. being fault tolerant by nature, anns are most likely a very good means for localizing faults within a micro grid and, at the same time, for isolating it in case of a fault in the main grid.
• intelligent diagnosis of equipment in the micro grid. anns are a better option for diagnosing faults in electrical equipment for the following reasons:
  - they can interpolate from previous learning and give a more accurate response to unseen data, making them better at handling uncertainty.
  - they are fault tolerant, so they handle corrupt or missing data more effectively.
  - they are good non-linear function approximators by nature, making them better at equipment diagnostics.
  - they are more suitable for extracting the relationship between input and output in fault detection and diagnosis applications.
• demand side management. it appears that demand-side management technologies that simply rely on reacting to control or price signals will not be enough. rather, what is necessary are more sophisticated approaches that are truly adaptive to the state of the grid, that are able to learn the correct response given any particular situation, and that can look ahead and predict both supply and demand trends in the near future, in order to prepare for future reductions in available supply, or to make the most effective use of supply when it is available.
• intelligent data processing including data mining. the main challenge to be tackled in the smart grid comes from the vast amount of information involved in it. in contrast to traditional grids, in which the consumption metering information was only retrieved monthly, smart grids present a new scenario in which all the interconnected nodes are gathering information about many different matters, and not only consumption (i.e. real-time prices, peak loads, network status, power quality issues, etc.) [25]. in this sense, one of the main challenges for computational intelligence is how to intelligently manage such an amount of information so that conclusions and inferences can be drawn to support the decision making process.
• security. here we see the grid as a highly interconnected, vulnerable communication network exposed to all kinds of malicious cyber attacks such as eavesdropping, tampering and even jeopardizing the physical structure of the system.
the two case studies reported here are interrelated in that both use artificial neural networks to improve the performance of the grid: the first (prediction) may be seen as a basis for protecting the grid from overload, while the second is related to profiling the loads connected to the grid and protecting them from misuse. in addition, both solutions rely on the measured data generated by modern metering systems (ami/amr) [26, 27].

the paper is organized as follows. in the second section we give a brief review of anns and of the structures we use for interpolation and extrapolation. then, in the third section, the implementation of anns in load prediction, related to next-day peak-load forecasting, is given. note that the method implemented here is general in the sense that we have applied it to other types of load prediction, such as short-, medium-, and long-term. the implementation of the very same ann structures in the new eavesdropping method, related to profiling of the loads (in this case a computer) connected to the grid, is described in the fourth section.

2. a short review of the methods of ann implementation

we will first briefly introduce the feed-forward neural networks that will be used as a basic structure for prediction throughout this paper.

fig. 1 a fully connected feed-forward ann

the network is depicted in fig. 1. it has only one hidden layer, which has been proven sufficient for this kind of problem [28]. the indices in, h, and o in this figure stand for input, hidden, and output, respectively. for the set of weights $w(k, l)$ connecting the input and the hidden layer we have $k=1,2,\ldots,m_{in}$, $l=1,2,\ldots,m_h$, while for the set connecting the hidden and the output layer we have $k=1,2,\ldots,m_h$, $l=1,2,\ldots,m_o$. the threshold is here denoted as $\theta_{x,r}$, $r=1,2,\ldots,m_h$ or $m_o$, with $x$ standing for h or o, depending on the layer.
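as an illustration, the forward pass of such a single-hidden-layer network can be sketched in a few lines of python. this is a minimal sketch, not the authors' implementation: the function names, the nested-list weight layout w[k][l], and the convention of subtracting the threshold from the weighted sum are assumptions made here, while the logistic hidden layer and the linear output layer follow the description in this section.

```python
import math


def logistic(x):
    """Sigmoidal (logistic) activation used for the hidden neurons."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, w_in_h, theta_h, w_h_o, theta_o):
    """Forward pass of a fully connected net with one hidden layer.

    w_in_h[k][l]: weight from input k to hidden neuron l (assumed layout)
    w_h_o[k][l]:  weight from hidden neuron k to output neuron l
    theta_h, theta_o: thresholds, subtracted from the weighted sums
                      (the sign convention is an assumption)
    """
    hidden = []
    for l in range(len(theta_h)):
        s = sum(inputs[k] * w_in_h[k][l] for k in range(len(inputs)))
        hidden.append(logistic(s - theta_h[l]))

    outputs = []
    for l in range(len(theta_o)):
        s = sum(hidden[k] * w_h_o[k][l] for k in range(len(hidden)))
        outputs.append(s - theta_o[l])  # linear output neuron
    return outputs
```

the input neurons only distribute the signals, so no activation is applied to `inputs`.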
the neurons in the input layer simply distribute the signals, while those in the hidden layer are activated by a sigmoidal (logistic) function. finally, the neurons in the output layer are activated by a linear function. the learning algorithm used for training is a version of the steepest-descent minimization algorithm [29]. the initialization problem was solved according to [30]. the number of hidden neurons, $m_h$, is of main concern. to determine it we applied a procedure based on the approaches given in [28, 31, 32].

for prediction purposes we developed two structures [33]. the first one was named time controlled recurrent (tcr). it is depicted in fig. 2. the second was named feed-forward accommodated for prediction (ffap). its structure is depicted in fig. 3. later on, these two structures were further elaborated, as discussed in the next section. it is worth mentioning that, in our opinion, for deterministic forecasting one always needs at least two predictions that support each other. since no knowledge of the forecasting outcome is available, the second prediction is the only means to corroborate the first one. having in mind, however, that both predictions carry the same uncertainty, we decided to accept the average of the two as the best final prediction.

fig. 2 time controlled recurrent (tcr) ann

fig. 3 the feed-forward accommodated for prediction (ffap) structure

3. prediction of peak-load at suburban level

electric load prediction is essential for power generation and operation [34]. it is vital in many aspects, such as providing cost-effective generation, system security, and planning. among others, it enables scheduling fuel purchases, scheduling power generation, planning of energy transactions, and assessment of system safety [35].
load forecast errors imply high extra costs: if the load is underestimated, one has extra costs caused by damages due to lack of energy or by overloading system elements; if the load is overestimated, the network investment costs exceed the real needs, and the fuel stocks are overvalued, locking up capital investment. in a smart grid context, prediction allows for developing computationally efficient learning algorithms that can accurately predict both the prosumers' (producer/consumer) consumption and generation profiles (instead of only the usage profile for a consumer) as well as the price of electricity in real time in order to inform profitable trading decisions. given this, a number of researchers have suggested that more sophisticated tariffs, such as real-time pricing (rtp) or spot pricing (where the price per kwh of electricity consumed is different for each half-hour and is provided to the consumer a day, or a few hours, ahead of time), in conjunction with more sophisticated "agents" that can autonomously respond to these price signals, would avoid this [36]. consequently, the quality of load forecasts greatly influences economic planning in areas such as generation capacity, fuel purchasing, system security assessment, maintenance scheduling, and energy transmission [37].

the power load value is determined by several environmental and social factors. seasonal and daily profiles are the most apparent influences. temperature and air humidity are the primary parameters determining energy consumption in general, and especially in urban residential areas. working times, holidays, and weekends are characterized by specific load profiles. environmental disasters, sudden increases of large loads or outages, and important social events further complicate the load-time function.
all together, the load curve is a nonlinear function of many variables that map themselves into it in an unknown way. in what follows, our newest results in the application of artificial neural networks (anns) to the prediction of daily peak loads at suburban level are presented.

3.1. problem formulation

we took the data for the implementation of our method from the unite 1999 competition file [38]. the task was: given the peak values for the previous days, predict the peak-load value for the next day. according to studies of consumer behaviour, in general, one may expect the peak value to occur at about 19:00 hours. there are some exceptions, but these do not influence the general method we implement. looking at the peak value itself, one may recognize a regular periodicity with, unfortunately, some exceptions. fig. 4 represents the daily peak value for one month (april 1997) extracted from [38]. note how difficult it is to recognize the periodicity of the phenomenon.

fig. 4 the daily peak-value for one month (april 1997) extracted from [38]

the problem may be stated as follows. given the series $(t_k, f(t_k))$, $k=1,2,\ldots,n$, where $t_k$ is the time instant (namely the day in the calendar), $f(t_k)$ is the peak value on that day, and $k$ is the counter, the last known peak value is at the $n$-th day. our task is to predict the peak value at the $(n+1)$-st day. for the purpose of prediction in the field of electricity we developed two ann structures, named etcr and effap [39], which we implement simultaneously. the idea is the following: when predicting, one is making a step into the dark. if one wants to have any confidence in the prediction, one has to have at least two predictions that support each other. then, since both are of equal importance, instead of accepting one of them, the average is calculated and stated as the final result. a rudimentary description of the etcr and effap anns is given next.
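the averaging idea stated above can be written down directly. the following python sketch is illustrative only: the function names and the numerical values are hypothetical and are not taken from the data set.

```python
def combine_predictions(pred_a, pred_b):
    """Final prediction as the plain average of two equally trusted predictions."""
    return 0.5 * (pred_a + pred_b)


def relative_error_percent(predicted, actual):
    """Absolute prediction error relative to the actual value, in percent."""
    return 100.0 * abs(predicted - actual) / actual


# Hypothetical peak-load predictions from two different predictors (kW).
first_prediction = 620.0
second_prediction = 640.0
actual_peak = 600.0  # hypothetical measured value, known only afterwards

final_prediction = combine_predictions(first_prediction, second_prediction)
error = relative_error_percent(final_prediction, actual_peak)
print(f"final prediction: {final_prediction:.1f} kW, error: {error:.2f} %")
```

when the two predictions err in opposite directions the average lies closer to the actual value than either of them; when both err in the same direction, as in this made-up example, the average simply falls between the two errors.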
for the verification of the method we undertook the task of predicting the daily peak values in may 1997 and comparing them with the data given by the unite 1999 competition.

3.2. the etcr solution

the etcr ann structure tailored for the application at hand is depicted in fig. 5. the name stands for extended time controlled recurrent. it is a recurrent ann with two feedback loops. the first one feeds back the peak values of the most recent days, while the second feeds back the peak values from the two previous weeks for the same day of the week as the one to be predicted. in this way we implement two principles. first, we claim that only the most recent values influence the current value, so there is no need for a huge amount of useless data. second, one has to exploit the pseudo-periodic behaviour of the consumers, since the same days of the week have similar load profiles. the etcr is supposed to approximate the function:

$y_i = f(i, y_{i-1}, y_{i-2}, y_{i-3}, y_{i-4}, y_{i-7}, y_{i-14})$ (1)

where the samples are the daily peak values. as time progresses, $i$ increases by one.

fig. 5 etcr: extended time controlled recurrent according to (1)

fig. 6 the extended feed forward accommodated for prediction (effap) according to (2)

as a first test of the method we predicted the peak value for april 30, 1997, which according to the unite 1999 data was 609 kw. the resulting ann had 7 input terminals, 2 output terminals, and 5 neurons in the hidden layer. after applying a proper excitation we obtained the prediction y={625.3241}, as given in table 1.

3.3. the effap solution

the effap ann tailored for the application at hand is depicted in fig. 6. the name stands for extended feed forward accommodated for prediction. it is a feed-forward ann with three inputs, one of them being the time $i$, while the rest are the peak values from the previous weeks.
there are five outputs, each of them supposed to learn the same function but shifted in time by one day. the following set of functions approximates the phenomenon:

$\{y_{i+1}, y_i, y_{i-1}, y_{i-2}, y_{i-3}\} = \mathbf{f}(i, y_{i-6}, y_{i-13})$. (2)

of course, this network approximates the very same function as the etcr does, but in a different manner. for april 30th 1997, the effap ann obtained after training had 3 input neurons, 5 output neurons, and 5 neurons in the hidden layer. after proper excitation the following prediction was obtained: y={653.2675}. the result is again given in table 1.

table 1 prediction of the peak-value consumption at april 30th 1997 of the unite data

| no. | expected value | etcr | % | effap | % | average value of the prediction | % | hidden neurons (etcr) | hidden neurons (effap) |
|-----|----------------|----------|------|----------|------|----------|-------|---|---|
| 1 | 609 | 625.3241 | 2.68 | 653.2675 | 7.27 | 639.2958 | 4.975 | 5 | 5 |

3.4. overall solution

as stated above, the final solution to the prediction problem in our method is obtained by averaging the etcr and the effap predictions. it is shown in table 1, too, and it is encouraging. to get a complete picture of the capabilities of the method we made a prediction for every day in may 1997. our first partial results were published in [40], while here we give complete results for the whole month, as shown in fig. 7. these allow for a real evaluation of the properties of the method. by inspection of fig. 7 we conclude that the proposed method may be implemented for prediction of the peak load at suburban level. the discrepancies between the actual and the predicted values are lower than 17% even in the worst case. in 22 out of 30 days the error was lower than 10%, while in 12 out of 30 days the error was lower than 5%.

fig. 7 error of prediction (y-axis) as a function of the day in the month of may 1997 (x-axis)

4.
a very specific view to the security

within the research of the behaviour of computers from the power consumption point of view [10], different software packages were used in order to create the energy profile of the computer under different “loading” conditions. we noticed, however, that not only the power consumed but also the thd depended on the application running on the pc. table 2 therefore contains the harmonics generated by one personal computer (dell optiplex 980, intel core i7 cpu @ 2.8ghz, 4gb ram, 500gb hdd) under different working conditions. approximately 50 harmonics were observed in a sample (200ms, 10000 samples) of the grid current. since the even harmonics have incomparably smaller values than the odd ones, only the dc, the main, and the odd harmonics are presented in table 2. fig. 8 illustrates two columns of table 2.

table 2. odd harmonics extracted from one string measurement in eight different states of the workstation

| harm. no. | off (1) | idle (2) | video (3) | cpu arithmetic (4) | gpu rendering (5) | multimedia cpu (6) | physical disks (7) | file system benchmark (8) |
|-----------|---------|----------|-----------|--------------------|-------------------|--------------------|--------------------|---------------------------|
| dc | -0.55 | -0.84 | 1.3 | -0.52 | -0.68 | -1.3 | -0.23 | -0.51 |
| 1 | 89.7 | 400.26 | 475.4 | 785.73 | 747.73 | 394.33 | 381.54 | 411.72 |
| 3 | 3.05 | 47.9 | 54.03 | 34.6 | 35.84 | 47.79 | 48.05 | 47.73 |
| 5 | 8.55 | 23.18 | 23.52 | 28.7 | 28.42 | 22.83 | 23.53 | 24.14 |
| 7 | 8.94 | 11.41 | 12.3 | 17.43 | 16.77 | 9.74 | 6.96 | 9.61 |
| 9 | 3.08 | 9.19 | 7.7 | 10.12 | 9.26 | 9.17 | 8.63 | 9.5 |
| 11 | 8.76 | 6.17 | 7.24 | 12.27 | 11.13 | 6.12 | 5.36 | 5.53 |
| 13 | 2.77 | 1.4 | 1.73 | 6.01 | 5.81 | 1.99 | 2.49 | 2.96 |
| 15 | 6.28 | 9.81 | 12.19 | 5.98 | 6.84 | 9.32 | 9.94 | 8.92 |
| 17 | 4.81 | 3.66 | 5.1 | 8.91 | 9.9 | 5.6 | 3.76 | 3.71 |
| 19 | 0.69 | 4.16 | 5.05 | 5.74 | 5.68 | 3.3 | 5.75 | 7.31 |
| 21 | 0.92 | 7.39 | 6.52 | 4.89 | 5.12 | 6.65 | 5.55 | 5.29 |
| 23 | 0.62 | 5.17 | 7.15 | 6.06 | 7.19 | 5.55 | 4.56 | 4.3 |
| 25 | 0.53 | 4.12 | 6.2 | 5.86 | 4.63 | 4.6 | 5.2 | 4.76 |
| 27 | 0.94 | 5.18 | 8.31 | 2.29 | 1.28 | 4.2 | 3.07 | 6.35 |
| 29 | 0.62 | 6.61 | 6.35 | 2.94 | 4.3 | 5.85 | 4.93 | 6.26 |
| 31 | 0.54 | 4.89 | 3.64 | 2.54 | 3.61 | 4.98 | 3.96 | 5.16 |
| 33 | 1.08 | 7.58 | 5.23 | 4.48 | 3.67 | 7.84 | 8.2 | 7.34 |
| 35 | 0.47 | 3.98 | 2.72 | 1.71 | 1.59 | 4.27 | 4.17 | 2.94 |
| 37 | 0.45 | 2.61 | 2.09 | 0.51 | 0.93 | 2.98 | 3.19 | 2.2 |
| 39 | 0.58 | 3.9 | 2.83 | 2.94 | 3.55 | 3.97 | 4.7 | 2.81 |
| 41 | 0.54 | 1.29 | 0.97 | 1.26 | 0.56 | 1.54 | 0.96 | 1.11 |
| 43 | 0.24 | 1.28 | 0.46 | 1.24 | 0.67 | 1.39 | 1.24 | 1.82 |
| 45 | 0.27 | 1.91 | 0.85 | 1.44 | 1.79 | 2.2 | 1.93 | 1.77 |
| 47 | 0.39 | 0.94 | 0.98 | 0.34 | 0.48 | 0.55 | 0.9 | 1.03 |
| 49 | 0.21 | 0.36 | 0.53 | 1.95 | 1.78 | 0.7 | 1.34 | 0.95 |

fig. 8 measured odd harmonics in two cases: physical disc drive active and cpu loaded by arithmetic computations. the first harmonic is omitted for convenience

the question is: what does this table have to do with security? there are many security issues related to the grid. among them, the most vulnerable subsystem, from the ict point of view, is the advanced metering infrastructure (ami). while it could bring significant benefits, it is potentially subject to security violations such as tampering with software in the meters, eavesdropping on its communication links, or abusing the copious amount of private data the new meters are able to collect. in addition to securing market-sensitive data from competitors, information systems for the power grid need to defend against malicious attacks [41] that intend to harm the power grid as a whole. the more comprehensive an information system becomes, the greater the consequences of a successful attack, and thus the need for security measures increases. one of the ways of eavesdropping on a home, an office, or a company is monitoring the power consumption and creating an energy profile of the subject [42]. with this information, a large number of malicious actions can be undertaken, such as burglaries and other damaging security breaches. here we expose an additional way of eavesdropping, where the harmonic structure of the current drawn from the grid is the basis for information on the activities within a home or an office. the problem will be illustrated on the example given in table 2. here the pc takes the role of the whole system that is being supervised.
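harmonic vectors of this kind can, in principle, be obtained from a sampled current waveform by evaluating a discrete fourier sum at multiples of the mains frequency. the python sketch below is an illustration under stated assumptions rather than the measurement procedure used for table 2: the 50 hz fundamental, the function names, and the synthetic test signal are hypothetical, and the window is assumed to span a whole number of mains periods.

```python
import cmath
import math


def harmonic_amplitude(samples, sample_rate, f0, order):
    """Amplitude of the harmonic `order` * f0 via a direct Fourier sum.

    order == 0 returns the signed mean (dc) value. The result is exact
    only if the window covers an integer number of periods of f0.
    """
    n = len(samples)
    acc = 0j
    for i, x in enumerate(samples):
        acc += x * cmath.exp(-2j * math.pi * order * f0 * i / sample_rate)
    if order == 0:
        return (acc / n).real
    return 2.0 * abs(acc) / n


def odd_harmonic_vector(samples, sample_rate, f0=50.0, max_order=49):
    """Return [dc, 1st, 3rd, 5th, ..., max_order-th] amplitudes."""
    orders = [0, 1] + list(range(3, max_order + 1, 2))
    return [harmonic_amplitude(samples, sample_rate, f0, k) for k in orders]


# Synthetic (made-up) current: fundamental, third harmonic and a small dc
# offset, sampled for 200 ms at 10 kHz, i.e. ten periods of a 50 Hz signal.
fs = 10000.0
current = [
    400.0 * math.sin(2 * math.pi * 50.0 * i / fs)
    + 48.0 * math.sin(2 * math.pi * 150.0 * i / fs)
    - 0.5
    for i in range(2000)
]
vector = odd_harmonic_vector(current, fs)
print(len(vector), [round(v, 3) for v in vector[:4]])
```

with the default `max_order=49` the vector has 26 entries, the same number of rows as table 2; such a vector is what a classifier would receive as input.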
we will show next how one can precisely find the state of the computer based on measurements of the supply current drawn by its ac/dc converter from the grid. note that, in the example given in table 2, power factor correction was applied within the converter. while there are several ways to extract information about the state of the computer from table 2, here we use anns. an ann was trained to recognize which one of the sets of harmonics of table 2 is present at its input. its structure is depicted in fig. 9. for simplicity, for a given vector of harmonics, the corresponding output of the ann was forced to unity while the rest of the outputs were kept at zero. in other words, the ann was trained to recognize which software was running on the computer. full success was achieved, meaning that, after training, the ann classified perfectly.

fig. 9 artificial neural network that eavesdrops on the personal computer based on information on harmonics in its mains current

to make the problem harder, i.e. to introduce possible variations due to measurement errors, we transformed table 2 so that every entry was recalculated by the formula

$x_{new} = x \cdot [1 + (2 \cdot rnd - 1) \cdot 0.025]$, (3)

where $rnd$ is a pseudo-random number with uniform distribution within the [0,1] segment. in other words, a “noise” of amplitude (peak-to-peak) as large as 5% of the harmonic value was added as “measurement disturbance”. again, as can be seen from table 3, excellent classification was obtained.
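the perturbation in (3) is simple to reproduce. a minimal python sketch follows; the function name and the use of python's standard `random` module are choices made here for illustration, while the 0.025 factor (±2.5%, i.e. 5% peak-to-peak) comes from (3).

```python
import random


def add_measurement_noise(values, rel_half_width=0.025, rng=None):
    """Apply x_new = x * [1 + (2*rnd - 1) * rel_half_width] to every entry.

    rnd is drawn independently for each entry from a uniform distribution,
    so each value moves by at most +/- rel_half_width of its own size.
    """
    if rng is None:
        rng = random.Random()
    return [x * (1.0 + (2.0 * rng.random() - 1.0) * rel_half_width)
            for x in values]


# First rows of the "idle (2)" column of table 2: dc, 1st, 3rd, 5th harmonic.
idle = [-0.84, 400.26, 47.9, 23.18]
noisy_idle = add_measurement_noise(idle, rng=random.Random(0))
print(noisy_idle)
```

passing a seeded `random.Random` instance makes a run repeatable; the seed value 0 has no special meaning.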
table 3 responses of the ann to noisy input data (for each input vector, the first row is the ann's response and the second row is the target)

| input vector ↓ / ann's output → | off | idle | video | cpu arithmetic | gpu rendering | multimedia cpu | physical disks | file system benchmark |
|---|---|---|---|---|---|---|---|---|
| (1) response | 0.94189 | -0.00826428 | -4.98446e-05 | 0.0596502 | 0.00545632 | -2.68923e-05 | 0.00254522 | 0.00128351 |
| (1) target | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| (2) response | -0.100789 | 0.936809 | -6.30066e-05 | 0.107029 | -0.00390563 | -4.56815e-05 | 0.0353001 | 0.0301201 |
| (2) target | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| (3) response | 0.0747284 | -0.0347075 | 1.00742 | -0.0946782 | 0.0368009 | 6.60139e-06 | 0.0172143 | -0.00950488 |
| (3) target | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| (4) response | 0.0530374 | -0.00513355 | -3.01133e-05 | 0.94394 | 0.00599003 | 4.07148e-06 | -0.00314594 | 0.0039932 |
| (4) target | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
| (5) response | -0.0714551 | 0.141341 | 0.000249561 | 0.347383 | 0.694706 | 2.93044e-05 | -0.0165517 | -0.0935344 |
| (5) target | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
| (6) response | -0.0390391 | -0.068559 | -2.64327e-05 | 0.0464038 | -0.0182126 | 0.994595 | 0.0357881 | 0.0513166 |
| (6) target | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| (7) response | 0.0221675 | -0.0245939 | -7.75624e-06 | -0.0287134 | 0.0235965 | -8.00252e-07 | 1.01758 | -0.010466 |
| (7) target | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| (8) response | 0.0524894 | -0.0178626 | -6.26366e-05 | -0.0587603 | 0.0177179 | 1.40437e-06 | 0.00103932 | 1.00386 |
| (8) target | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |

finally, eight new sets of “harmonics” were created artificially by permutations within the rows of table 2, and the newly created columns were used as excitation to the ann. none succeeded in deceiving the network. to conclude, there are robust classification mechanisms whose implementation may give a malicious attacker, equipped with a sophisticated tool based on current monitoring, an opportunity to monitor every activity within a computer and, in general, within a data centre or similar. note that the spectrum of the current drawn by a household is not much more complicated than that of the computer, since the main consumers in the household are linear loads and do not generate additional harmonics. from that point of view, we consider our method applicable to a broader range of situations than just a computer.

5.
conclusion

the modern electricity distribution system is gradually evolving into a very large and very complex structure in which ict plays an increasingly important role. it is nowadays most frequently referred to as the smart grid. there is an almost unlimited number of possible applications of ict subsystems within the smart grid, and one cannot say that the smart grid is a fixed structure whose capabilities are finally set. a special contribution of ict to the smart grid is artificial intelligence, and particularly artificial neural networks. here we present our attempts to contribute to the development of the smart grid toward an advanced, reliable and secure system. the case studies reported are part of the same project, since the same methodology is implemented and they consider two important and interrelated aspects: the profiling of the load and the protection of the grid. in particular, we discussed some of the most recent results produced within the laboratory for electronic design automation at the university of niš, serbia, which are related to load prediction at suburban level and to a new way of cyber-attack on the ict connected to the grid. both results are based on our own measurement methodology and our own concepts of ann implementation. as for the load prediction, it is worth mentioning that the results reported are part of a set of implementations of our concept for short-term [43], medium-term [40], and long-term [44] prediction of electricity loads. when appropriate, e.g. for short-term prediction, real-time prediction was implemented [39]. the results related to profiling the computer as seen from the grid, however, are brand new and will be further elaborated and applied to more complex computer loads such as data centres or company networks.

acknowledgement: this research was partly funded by the ministry of education, science and technological development of the republic of serbia under contract no. tr32004.
references

[1] v. litovski, p. petković, “why the power grid needs cryptography?”, proc. of the symposium on industrial electronics indel 2008, banja luka, 06.11.-08.11., 2008, pp. 75-81. reprinted in: electronics, issn 1450-5843, vol. 13, no. 1, june 2009, pp. 30-36. [2] m. dimitrijević, j. milojković, s. bojanić, o. nieto-taladriz, and v. litovski, “ict and power: new challenges and solutions”, int. j. reasoning-based intelligent systems, vol. 5, no. 1, 2013, pp. 32-41. publisher: inderscience enterprises, issn: 1755-0556, e-issn: 1755-0564. [3] -, “gesi smarter 2020: the role of ict in driving a sustainable future”, the boston consulting group, http://gesi.org/smarter2020. [4] s. iyer, “cyber security for smart grid, cryptography, and privacy”, hindawi publishing corporation, int. j. of digital multimedia broadcasting, vol. 2011, article id 372020, 8 pages. [5] w. wang, and z. lu, “cyber security in the smart grid: survey and challenges”, computer networks, vol. 57, pp. 1344–1371, 2013. [6] f. aloul, a. r. al-ali, r. al-dalky, m. al-mardini, and w. el-hajj, “smart grid security: threats, vulnerabilities and solutions”, international journal of smart grid and clean energy, vol. 1, no. 1, pp. 1-6, 2012. [7] r. harmon, h. demirkan, “the next wave of sustainable it”, it professional, vol. 13, no. 1, pp. 19-25, jan./feb. 2011, doi:10.1109/mitp.2010.140. [8] -, “electricity consumption and efficiency trends in the enlarged european union”, institute for environment and sustainability, 2007, http://www.eubusiness.com/topics/energy/electricity-jrc.bk/ [9] a. p. bianzino, a. k. raju, d. rossi, “greening the internet: measuring web power consumption”, it pro, january/february 2011, published by the ieee computer society, pp. 48-53. [10] o. nieto, et al., “energy profile of a personal computer”, proceedings of the lvi conf.
of etran, zlatibor, serbia, june 2012, isbn 978-86-80509-67-9, proc. on a disc, paper el3.3-1-4. [11] r. adam, w. wintersteller, from distribution to contribution. commercializing the smart grid, booz & company, munich, 2008. [12] j. miller, “the smart grid – how do we get there?”, smart grid news, june 26, 2008. http://www.smartgridnews.com/ [13] -,“smart 2020: enabling the low carbon economy in the information age”, climate group, gesi 2008, www.theclimategroup.org/assets/resources/publications/smart2020 report.pdf. [14] d. ramchurn, p. vytelingum, a. rogers, a., and n. r. jennings, “putting the 'smarts' into the smart grid: a grand challenge for artificial intelligence”, communications of the acm , vol. 55, no. 4, april 2012. [15] s. kalogirou, k. metaxiotis, and a. mellit, “artificial intelligence techniques for modern energy applications”, intelligent information systems and knowledge management for energy: applications for decision support, usage, and environmental protection, igi global, pp. 1-39, 2010. [16] s. kalogirou, c. neocleous, and c. schizas, “artificial neural networks for modelling the starting up of a solar steam generator”, applied energy, vol. 60, pp. 89– 100, 1998. [17] p. l. zervas, h. sarimvies, j. a. palyvos, n. g. c. markatos, “model-based optimal control of a hybrid power generation system consisting of photovoltaic arrays and fuel cells”, journal of power source, vol. 181, pp. 327–338, 2008. [18] p. j. werbos, “approximate dynamic programming for real time control and neural modelling”. in white da and sofge da (eds.), handbook of intelligent control, van nostrand reinhold, new york, 1992, pp. 493-525. [19] y. mansour, e. vaahedi, m. a. el-sharkawi, “dynamic security contingency screening and ranking using neural networks”, ieee trans power syst., vol. 8, no. 4, pp. 942–950, july 1997. [20] d. riley, g. k. venayagamoorthy, “characterization and modeling of a grid connected photovoltaic system using a recurrent neural network”, in proc. 
ieee int. joint conf. neural networks, san jose, ca, july 31–aug. 5, 2011. [21] b. g. zhang, e. patuwo, and m. y. hu, “forecasting with artificial neural networks: the state of the art”, international journal of forecasting, vol. 14, no. 1, pp. 35-62, march 1998. [22] j. g. m. zade, and r. noori, “prediction of municipal solid waste generation by use of artificial neural network: a case study”, int. j. environmental research, vol. 2, no. 1, pp. 13-22, 2008. [23] s. canu, y. grandvalet, and x. ding, “one step ahead forecasting using multilayered perceptron”, working paper de l'universite de technologie de compiegne. [24] j. connor, and r. douglas martin, “recurrent neural networks and robust time series prediction”, ieee trans. on neural networks, vol. 5, no. 2, pp. 240-254, march 1994. [25] y. simmhan, s. aman, b. cao, m. giakkoupis, a. kumbhare, q. zhou, d. paul, c. fern, a. sharma, v. prasanna, “an informatics approach to demand response optimization in smart grids”, technical report, computer science dept., usc, 2011. [26] c. king, “advanced metering infrastructure (ami) overview of system features and capabilities”, emeter corporation, https://www.smartgrid.gov/sites/default/files/doc/files/overview_ami_system_features_capabilities_200405.pdf [27] m. dimitrijević, and v. litovski, “power factor and distortion measuring for small loads using usb acquisition module”, journal of circuits, systems, and computers, vol. 20, no. 5, pp. 867-880, august 2011. [28] t. masters, practical neural network recipes in c++, academic press, san diego, 1993. [29] z. zografski, “a novel machine learning algorithm and its use in modeling and simulation of dynamical systems”, proc. of 5th annual european computer conference, ieee compeuro'91, bologna, italy, pp. 860-864, 1991. [30] t. denoeux and r. lengelle, “initializing back propagation networks with prototypes”, neural networks (pergamon press), vol.
6, pp. 351-363, 1993. [31] g.-b. huang and h. a . babri, “upper bound on the number of hidden neurons in feedforward networks with arbitrary bounded nonlinear activation function”, ieee trans, on neural networks, vol. 9, pp. 224228, 1998. [32] e. b. baum and d. haussler, “what size net gives valid generalization”, neural computing, vol. 1, pp. 151-160, 1989. [33] j. milojković, and v. litovski, “comparison of some ann based forecasting methods implemented on short time series”, 9th symposium on neural network applications in electrical engineering, neurel-2008, pp. 179-179, belgrade, serbia, 2008. [34] h. m. al-hamadi, s. a. soliman, “short-term electric load forecasting based on kalman filtering algorithm with moving window weather and load model”, electric power systems research, vol. 68, no. 1, 2004, pp. 47-59. [35] s., tzafestas, and e., tzafestas, “computational intelligence techniques for short-term electric load forecasting”, journal of intelligent and robotic systems, vol. 31, no. 1-3, 2001, pp. 7-68. [36] f. schweppe, b. daryanian, and r. tabors, “algorithms for a spot price responding residential load controller”, power engineering review vol. 9, no. 5, pp. 49–50, 1989. [37] f. liu, r. d. findlay, q. song, “a neural network based short term electric load forecasting in ontario canada”, in int. conf. on computational intelligence for modelling control and automation, and int. conf. on intelligent agents, web technologies and internet commerce, (cimca-iawtic'06), 2006, pp. 119 – 125. [38] worldwide competition within the eunite network. (2001). [online] available: http://neuron.tuke.sk/ competition [39] j. milojković, v. litovski, “dynamic one step ahead prediction of electricity loads at suburban level”, proc. of the first ieee int. workshop on smart grid modeling and simulation – at ieee smartgridcomm 2011, sgms2011, brussels, october 2011, proc. on disc, paper no. 25. [40] j. milojković, v. 
litovski, “one day ahead peak electricity load prediction”, ix symposium industrial electronics, indel 2012, banja luka, november 2012, pp. 261-267. [41] f. cleveland, “iec tc57 security standards for the power system information infrastructure beyond simple encryption”, june 2007. iec tc57 wg15 security standards white paper ver. 11. http://www.xanthus-consulting.com/pages/publications.htm [42] m. andrejević stošović, m. dimitrijević, v. litovski, “computer security vulnerability seen from the electricity distribution grid side”, applied artificial intelligence, taylor & francis ltd., 2014, accepted for publication. [43] j. milojković, and v. litovski, “new ann models for short term forecasting of electricity loads”, proc. of the 7th eurosim congress on modelling and simulation vol.2: full papers (cd), czech technical university in prague, faculty of electrical engineering, dept. of computer science and engineering, prague, czech republic isbn 978-80-01-04589-3, september 2010. [44] j. milojković, v. litovski, o. nieto-taladriz, and s. bojanić, “forecasting based on short time series using anns and grey theory – some basic comparisons”, in proc. of the 11th int. work-conference on artificial neural networks, iwann 2011, june 2011, torremolinos-málaga (spain). j. cabestany, i. rojas, and g. joya (eds.): part i, lncs 6691, pp. 183–190, 2011, © springer-verlag berlin heidelberg 2011. issn: 0302-9743.

facta universitatis series: electronics and energetics vol. 35, no 4, december 2022, pp. 603-617
https://doi.org/10.2298/fuee2204603d
© 2022 by university of niš, serbia | creative commons license: cc by-nc-n
original scientific paper

chaotic seismic signal modeling based on noise and earthquake anomaly detection

leila dehbozorgi, reza akbari-hasanjani, reza sabbaghi-nadooshan
department of electrical engineering, central tehran branch, islamic azad university, tehran, iran

abstract.
since ancient times, people have tried to predict earthquakes using simple observations such as animal behavior. the prediction of the time and strength of an earthquake is of primary concern. in this study, chaotic signal modeling based on noise is used, and anomalies before an earthquake are detected using artificial neural networks (anns). artificial neural networks are efficient tools for solving complex problems such as prediction and identification. the effective features of the chaotic signal model are obtained considering noise and the detection of anomalies five minutes before an earthquake occurrence. the neuro-fuzzy classifier and the mlp neural network showed acceptable accuracies of 84.6491% and 82.8947%, respectively. the results demonstrate that the proposed method is an effective seismic signal model based on noise and anomaly detection before an earthquake.

key words: artificial neural networks, chaos, earthquake, entropy, prediction, seismic signal processing, wavelet transforms

1. introduction

earthquake prediction is a branch of seismology and should be distinguished from earthquake warning systems, which provide a real-time warning to regions that might be affected. the purpose of a chaotic signal model considering noise and the detection of anomalies before an earthquake is to warn of an impending major earthquake in order to reduce death and destruction. in the 1970s, scientists were optimistic that a practical method for predicting earthquakes would soon be found [1]. however, further devastating earthquakes occurred, causing destruction and loss of life exceeding 6,300 persons in the m7.2 1995 kobe earthquake in japan, 15,000 in the m7.4 1999 izmit earthquake in turkey, and over 30,000 in the m6.7 2003 bam earthquake in iran [2]. there are many common methods of detecting anomalies before an earthquake, which use artificial neural networks (anns), genetic programming (gp), and radial basis function networks.
artificial neural networks have applications in areas such as identification, prediction, and image processing. in ref. [3], a back-propagation neural network and newmark displacement analysis were used to examine the earthquake risk in the manjil-rudbar area damaged in 1990. in order to evaluate earthquake signals, it is better to use factual information than the null hypothesis.

received april 4, 2022; revised july 26, 2022; accepted august 10, 2022
corresponding author: reza sabbaghi-nadooshan, department of electrical engineering, central tehran branch, islamic azad university, tehran, iran
e-mail: r_sabbaghi@iauctb.ac.ir

ref. [4] considered a model for noisy signals and the detection of anomalies before an earthquake using anns and obtained acceptable results for the ghir station in iran. researchers have developed software for short-term earthquake prediction using pressure reduction and temperature rise, which resulted in 70.5% accurate forecasting in japan; the accuracy of this network is not optimal for prediction [5]. ref. [6] used location-related parameters in a neural network to predict earthquakes in iran. the researchers in ref. [7] used the deep learning model dlep, which uses explicit and implicit features, for earthquake prediction; there is no suitable time frame for earthquake prediction in that work. in ref. [8], a neural network is used to predict the p-wave arrival time of earthquakes in taiwan; the time frame for earthquake prediction is short. ref. [9] used a grnn neural network to predict earthquakes on the iranian plateau. ref. [22] examines the possibility of using the dlis algorithm to identify and reconstruct the location, size, and thickness distribution of several complex defects. ref. [23] used 8 mini-stations of the new region located in north sumatra and applied the svm model (one of the machine learning tools in digital signal processing) to distinguish seismic activities. the proposed model has acceptable accuracy, but more test data are needed to assess the network performance. ref. [24] used deep learning to predict earthquakes by investigating the p-wave, but the time frame for earthquake prediction is short. ref. [25] also used deep learning and a neural network to predict earthquakes; the prediction is made 3 seconds before the earthquake, which is a very short time for forecasting.

ref. [26] suggested the augmented linear mixing model (almm) method. that article focuses mostly on image processing, object recognition, and classification, not on signal processing. for this purpose, a dictionary is defined to model spectral variability. the dictionary has little similarity with the training data of a neural network, and the application of image processing methods to signal processing needs more investigation. ref. [27] proposed a model called fourier-based rotation-invariant feature boosting (frifb) to increase the speed of calculations and reduce complexity. in that method, the fourier transform is calculated in polar coordinates and the subsequent analyses are then performed. in this article, we have defined several extracted frequency features, which are used in almost the same way as in the above article but with some differences. for example, to extract newer and different features, we applied the signal divider to the fft and psd and then extracted statistical features for each part. this is explained in more detail in sections 2-3.

in ref. [10], two feature groups were compared to detect anomalies before an earthquake. those consisted of 54 and 87 features.
the accuracy values for the data classifier and the mlp neural network are equal to 60.6383% and 55.8511% for the feature matrices with dimensions of 54 and 87, with a total of 626 records. this method employed much more data than previous methods. it does not have the desired accuracy, but it leaves a time frame before the earthquake in which to predict it. most previous articles use a short time frame to predict earthquakes, and their features are few in number and related to geological properties. for this reason, it is impossible to make an accurate decision about the types of features that are effective in the occurrence of an earthquake. this study aims to determine the most desirable feature matrix for chaotic signal modeling and for detecting anomalies within 5 minutes before an earthquake.

in this study, a new class of effective features is developed for chaotic signal modeling before an earthquake using intelligent networks, with a more extensive database for generalization than in previous methods. a model for noisy signals and the detection of anomalies before an earthquake is then evaluated using neuro-fuzzy and mlp classifiers. the innovations of this article are the use of more data for training and testing the classifiers, the consideration of the 5 minutes before an earthquake for predicting it, the use of features different from those of previous articles, and the comparison of the performance of the neuro-fuzzy and mlp classifiers. most papers use only geological or frequency-related features. in this article, we examine different types of features and determine the effectiveness or ineffectiveness of each.

the rest of the paper is organized as follows: section 2 discusses the basic concepts of the features and the ann structure. section 3 proposes the design method and discusses the simulation results. section 4 concludes with the obtained results.

2.
data and methodology

chaotic signal modeling based on noise was employed with neuro-fuzzy and mlp classifiers, using a large amount of data and some new features. the seismic waves are processed to detect anomalies before an earthquake onset. the method is divided into six main stages (fig. 1):
(1) consider the earthquake onset to select an observation window and detect anomalies;
(2) slice the rest of the signal into two sections;
(3) high-pass filter the signals to reject baseline drift;
(4) extract features from the filtered signal;
(5) feed the feature vector to the intelligent networks;
(6) after training and testing the classifiers, select the effective features using the uta algorithm [11,12].

the selected signals were processed using a high-pass butterworth filter to remove baseline drift, with the cut-off frequency (fc) set at 0.04 hz [4]. there was not enough evidence showing how an earthquake relates to a known feature, so a mixture of time, time-scale, and chaos features was extracted, and the effective features were selected after achieving acceptable accuracy [4, 10]. the whole process of the algorithm is shown in fig. 1.

2.1. features

2.1.1. statistical features

the ten statistical features evaluated are mode, mean, variance, covariance, maximum, minimum, signal standard deviation, median, skewness (sk, the deviation of the distribution from symmetry) and kurtosis (k, the stretch factor). the ‘sk’ and ‘k’ features are given by eqs. (1) and (2), where $x_i$ and $\mu$ denote the signal and the mean of the signal, respectively, $E[\cdot]$ denotes the average over the data, and n is the number of data [13].

$$sk = \frac{E\left[(x_i - \mu)^3\right]}{\left(E\left[(x_i - \mu)^2\right]\right)^{3/2}} \tag{1}$$

$$k = \frac{E\left[(x_i - \mu)^4\right]}{\left(E\left[(x_i - \mu)^2\right]\right)^{2}} - 3 \tag{2}$$
[fig. 1 flowchart steps: entering signal; separating the earthquake signal from the main signal; removing 5 minutes before the earthquake; dividing the rest of the signal into two equal parts; filtering both signal sections to remove low frequencies; extracting the features for each section; neural network (output = 1?): yes, chaotic modeling based on noise and detection of anomalies before the earthquake occurs; no, no anomalies detected; uta algorithm; effective feature selection; finish]

fig. 1 flowchart of chaotic modeling based on noise and detecting anomalies before the earthquake

2.1.2. chaos features

chaotic systems are highly dependent on initial conditions. in other words, if two trajectories start very close to each other, they diverge from each other rapidly and exponentially if and only if their processes have chaotic behavior. the divergence of the two trajectories after a time period t is measured by the lyapunov exponent (λ), eq. (3), where $x_0$ is a point on a trajectory and $x_0 + \Delta x_0$ is a point near $x_0$ on a different trajectory; $\Delta x_0$ is the initial separation between the two points and approaches zero, and $\Delta x(x_0, t)$ is their separation after time t.

$$\lambda = \lim_{t \to \infty} \frac{1}{t} \ln \frac{|\Delta x(x_0, t)|}{|\Delta x_0|} \tag{3}$$

there are three states for the lyapunov exponent (λ):
(1) λ > 0: the system is chaotic.
(2) λ < 0: the system is not chaotic.
(3) λ = 0: the system reaches a steady-state condition [13].

2.1.3. signal divider

a signal divider is applied to classify the data between the maximum and minimum signal values. the range of the signal is divided into 16 equal classes, and the amount of data falling in each class is extracted as a feature.

2.1.4. entropy

entropy is a measure of the disorder of a system. the entropy h(x) of a discrete random variable x is evaluated as follows [14], where p(x) is the probability of occurrence of x.

$$H(X) = -\sum_{x} p(x) \log_2 p(x) \tag{4}$$

2.1.5.
discrete wavelet transform (dwt)

the wavelet transform can be seen as the projection of a signal onto a set of basis functions named wavelets. a wavelet transform is built on a mother wavelet function and has an excellent localization characteristic in the time-scale domain [15]. most of the energy of a wavelet function is concentrated in a short interval and is damped quickly. of the various types of wavelet functions, the daubechies wavelet is one of the most common. in wavelet transforms, the signal passes through an internal filter and is divided into a low-frequency (ca) and a high-frequency (cd) component. the dwt of a signal x[n] is defined by the approximation coefficients $W_\varphi[j_0,k]$ and the detail coefficients $W_\psi[j,k]$ as follows, where n = 0, 1, 2, …, M-1, k = 0, 1, 2, …, $2^j$-1 and j = 0, 1, 2, …, J-1, and M is the number of samples to be transformed using the wavelet function.

$$W_\varphi[j_0,k] = \frac{1}{\sqrt{M}} \sum_{n} x[n]\,\varphi_{j_0,k}[n] \tag{5}$$

$$W_\psi[j,k] = \frac{1}{\sqrt{M}} \sum_{n} x[n]\,\psi_{j,k}[n], \quad \text{for } j \ge j_0 \tag{6}$$

the basis functions $\varphi_{j,k}[n]$ and $\psi_{j,k}[n]$ are defined as follows, where $\varphi[n]$ is the scaling function and $\psi[n]$ is the wavelet function [4,16].

$$\varphi_{j,k}[n] = 2^{j/2}\,\varphi[2^{j} n - k] \tag{7}$$

$$\psi_{j,k}[n] = 2^{j/2}\,\psi[2^{j} n - k] \tag{8}$$

the daubechies 2 wavelet transform is implemented in five successive steps. at each step, the output array contains half as many samples as the input selected for that step. the statistical values of each step are used as features.

2.1.6. fast fourier transform (fft)

the fast fourier transform (fft) of an n-sample signal is calculated using eq. (9) [17]. the statistical features and the data classifier (signal divider) of the fft of the signal are evaluated as features.

$$X(k) = \sum_{n=0}^{N-1} x(n) \exp(-j 2 \pi k n / N), \quad k = 0, 1, 2, \ldots, N-1 \tag{9}$$

2.1.7. power spectral density (psd)

the power spectral density (psd) function shows the strength of the variations (energy) as a function of frequency. it shows at which frequencies variations are strong and at which
frequencies variations are weak. the energy within a specific frequency range is obtained by integrating the psd over that range. the psd is computed directly by calculating the autocorrelation function r(τ) and then transforming it. the results are given in the following formulas for a signal s(t).

$$p(t) = s^2(t) \tag{10}$$

$$S(f) = \int_{-\infty}^{+\infty} R(\tau) \exp(-j 2 \pi f \tau)\, d\tau = \mathcal{F}(R(\tau)) \tag{11}$$

the power of the signal in a frequency band can be calculated as:

$$P = \int_{f_1}^{f_2} S(f)\, df + \int_{-f_2}^{-f_1} S(f)\, df \tag{12}$$

afterward, statistical features of the signal’s psd were derived as psd features.

2.1.8. trajectory

a trajectory is a path followed by an object moving through space as a function of time. in the present study, a signal with n pieces of data is presumed. each part of the signal is depicted as {x(t1), x(t2), …, x(tn)}, where t1, t2, …, tn refer to the data stored in a time series [18]. first, the graph of x(n+1) against x(n) is taken as the signal trajectory and is then divided into 16 cells ("houses"). the number of data points falling in each cell, stored in a matrix, is a feature.

2.2. classification networks

2.2.1. multilayer perceptron (mlp) network

the multilayer perceptron (mlp) is a well-known feed-forward neural network that is commonly used for classification because of its good performance. generally, an mlp contains input and output layers and one or more hidden layers. after the structure of the network is formed, the neurons are connected by weighted links and are trained using a training algorithm (fig. 2) [19].

fig. 2 the structure of multi-layer perceptron network (input layer, first hidden layer, second hidden layer, output layer)

2.2.2. neuro-fuzzy classification networks

neuro-fuzzy systems combine two significant paradigms: fuzzy logic and neural networks [20].
fuzzy logic programming in matlab software uses conditional statements. the neural network consists of several nodes which are connected to each other by weights (fig. 3).

fig. 3 network representation of the fuzzy system [10] (three layers: gaussian membership functions $\mu = \exp[-(x_i - \bar{x}_i^{l})^2 / (\sigma_i^{l})^2]$, product nodes $z^l$, summation nodes a and b, and output f = a/b)

$$f(x) = \frac{\sum_{l=1}^{M} \bar{y}^{l} \left[ \prod_{i=1}^{n} \exp\left( -\left( \frac{x_i - \bar{x}_i^{l}}{\sigma_i^{l}} \right)^{2} \right) \right]}{\sum_{l=1}^{M} \left[ \prod_{i=1}^{n} \exp\left( -\left( \frac{x_i - \bar{x}_i^{l}}{\sigma_i^{l}} \right)^{2} \right) \right]} \tag{13}$$

here, the three parameters $\sigma_i^{l}$, $\bar{y}^{l}$ and $\bar{x}_i^{l}$ are defined in the learning phase and must be determined to design a neuro-fuzzy system, and M is the number of rules considered. the input x passes through a product gaussian operator to become $z^l$; the result of this stage then passes through the summation operator b and the weighted summation operator a. finally, the output f is calculated [20].

$$z^{l} = \prod_{i=1}^{n} \exp\left( -\left( \frac{x_i - \bar{x}_i^{l}}{\sigma_i^{l}} \right)^{2} \right) \tag{14}$$

$$b = \sum_{l=1}^{M} z^{l} \tag{15}$$

$$a = \sum_{l=1}^{M} \bar{y}^{l}(q)\, z^{l} \tag{16}$$

$$f = a / b \tag{17}$$

2.3. feature selection

in the uta algorithm, the average of one feature over all instances is calculated. the selected feature is then replaced by this mean value in all input vectors, and the trained network is tested with the new feature matrix. if the recognition accuracy of the system decreases, that feature is effective; if the result does not change or improves, that feature is considered ineffective (a noisy feature) and should be removed from the input vector [11].

3. analysis of results

the database contains 760 records of earthquakes of magnitude 5 to 7 on the richter scale from the international institute of earthquake engineering and seismology for 21 earthquake recording stations in iran (between 2004 and 2010). the sampling frequency is 50 hz (fig. 4). table 1 shows the date, time, geographical location, depth, and magnitude of each earthquake.
table 1 characteristics of 5 to 7 richter earthquakes recorded between 2004 and 2009
(date of occurrence: year, month, day; time of occurrence: hour, minute, second; magnitude and geographical characteristics: latitude, longitude, depth, magnitude)

year  month  day  hour  minute  second  latitude  longitude  depth  magnitude
2004  10     6    11    14      26.1    28.8      57.9       14.1   5.2
2004  10     7    12    54      56.1    28.4      57.2       15.9   5
2004  10     7    21    46      15.2    37.3      54.5       16.8   6.2
2004  10     16   10    4       33.9    33.5      45.7       18     5
2005  3      13   3     31      27.3    27.3      61.5       54.8   6.1
2005  5      1    18    58      38.8    30.8      56.9       14.2   5.1
2005  5      14   18    4       57.1    30.7      56.6       14.1   5.2
2005  6      19   4     46      4.5     33.1      58.2       15     5.2
2005  8      9    5     9       19.7    28.8      52.6       18     5
2005  11     27   16    30      39.1    27.0      55.7       14.1   5.2
2005  11     29   5     57      3       37.5      54.6       15     5
2005  12     26   23    15      51.1    32.1      49.1       32.9   5.2
2005  12     27   21    53      15      28.1      56.1       15     5.1
2006  2      18   11    3       31.5    30.7      55.8       14.1   5
2006  2      28   7     31      3.4     28.1      56.7       18     5.8
2006  3      25   7     28      57.3    27.5      55.8       15.8   5.5
2006  3      25   9     55      16      27.6      56.0       15.9   5.1
2006  3      25   10    0       37      27.4      55.7       15     5
2006  3      30   19    36      18      33.6      48.9       15     5.1
2006  3      31   1     17      2.3     33.6      48.9       14.1   6.1
2006  3      31   11    54      2.6     33.8      48.7       17.5   5.2
2006  6      28   21    2       9.2     26.8      55.9       10     5.6
2006  7      18   23    27      5.5     26.2      61.1       46     5
2006  11     5    20    6       40.2    37.4      48.8       14.1   5
2007  3      26   6     36      50      29.1      58.4       14.1   5
2007  6      18   14    29      49.4    34.5      50.8       17.3   5.6
2008  3      9    3     51      6.4     33.3      59.1       17.9   5
2008  8      27   21    52      39.9    32.3      47.3       32.5   5.6
2008  9      10   11    0       35.1    26.9      55.7       6.7    5.8
2008  10     25   20    17      16.9    26.6      54.8       14.2   5.1
2008  12     7    13    36      20.8    26.9      55.7       11     5.2
2008  12     9    15    9       27.4    27.0      55.8       15     5
2009  7      22   3     53      2.6     26.7      55.8       14.2   5.2
2009  10     4    21    50      49.6    31.8      49.4       15     5.1

fig. 4 distribution of stations in the iran broadband national center seismology [21]

for training, 70% of the data were randomly selected, and the remaining 30% of the data were used for testing in matlab r2017a software.
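as a rough illustration of the 70/30 split described above, the following python sketch shuffles record indices with a fixed seed and divides them. the function name, the seed, and the use of plain (non-stratified) shuffling are assumptions made for illustration; the paper only states that 70% of the 760 records were selected at random in matlab.

```python
import random


def split_records(n_records, train_fraction=0.7, seed=0):
    """Randomly split record indices into training and testing index lists.

    Hypothetical helper: the seed and the plain shuffle are illustrative
    choices, not details taken from the paper.
    """
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)       # reproducible shuffle
    n_train = round(train_fraction * n_records)
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = split_records(760)
print(len(train_idx), len(test_idx))
```

with 760 records this gives 532 training and 228 testing records. a test set of 228 records is consistent with the accuracies reported in this section, since 193/228 ≈ 84.6491%.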
initially, for anomaly detection, 380 records containing 20 minutes of signal after which an earthquake occurred were selected as the sig1 group, and 380 records of the same length as those of the sig1 group, with no earthquake in the five minutes following them, were selected as the sig2 group; the five minutes before the earthquake in each record were deleted (fig. 5). the first five minutes of sig2 and the last five minutes of sig1 were separated to extract features for the feature vector. a fourth-order high-pass butterworth filter was applied to remove the low frequencies (fc = 0.04 hz), and the signals were then normalized (fig. 6).

fig. 5 classification of seismic signals before the earthquake: no earthquake happens after sig2, and an earthquake happens 5 minutes after sig1

fig. 6 a) the original signal, b) the filtered signal

after filtering the 15000 samples, the statistical features are derived for each record using chaostest.m in matlab [4]. chaostest.m tests for the existence of a positive dominant lyapunov exponent λ and local lyapunov exponents. the second output parameter is h, which is the result of comparing λ and α. the value of p is the observed probability. another output is orders, which gives the triplet (l, m, q) that minimizes the schwarz information criterion to obtain the best coefficients and calculate λ. the confidence interval (ci) for λ is determined at level α (α is a fixed number with a default value of 0.05). the dwt is implemented for five steps, and eight statistical features (mode, mean, variance, covariance, maximum, minimum, median, and signal standard deviation) are saved for each step. the signal divider is evaluated for the fft, and eight statistical features are calculated for the fft and psd, respectively.
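the per-block features mentioned above (statistical values and the 16-class signal divider of section 2.1.3) can be sketched in plain python as follows. this is an illustrative sketch, not the authors' matlab code: the function names are hypothetical, population (1/n) moments are assumed for eqs. (1) and (2), and covariance is omitted because its second argument is not specified in the text.

```python
import statistics


def statistical_features(x):
    """Return simple statistical features of a sequence of numbers x."""
    n = len(x)
    mean = statistics.fmean(x)
    var = sum((v - mean) ** 2 for v in x) / n   # population variance (assumed 1/n)
    m3 = sum((v - mean) ** 3 for v in x) / n    # third central moment
    m4 = sum((v - mean) ** 4 for v in x) / n    # fourth central moment
    return {
        "mode": statistics.multimode(x)[0],
        "mean": mean,
        "variance": var,
        "maximum": max(x),
        "minimum": min(x),
        "std": var ** 0.5,
        "median": statistics.median(x),
        "sk": m3 / var ** 1.5 if var > 0 else 0.0,     # skewness, eq. (1)
        "k": m4 / var ** 2 - 3.0 if var > 0 else 0.0,  # excess kurtosis, eq. (2)
    }


def signal_divider(x, n_classes=16):
    """Count samples in n_classes equal-width classes between min(x) and max(x)."""
    lo, hi = min(x), max(x)
    counts = [0] * n_classes
    if hi == lo:                                # constant signal: single class
        counts[0] = len(x)
        return counts
    width = (hi - lo) / n_classes
    for v in x:
        # the maximum value is assigned to the last class
        idx = min(int((v - lo) / width), n_classes - 1)
        counts[idx] += 1
    return counts


signal = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
feats = statistical_features(signal)
print(feats["mean"], feats["median"], feats["sk"], feats["k"])
print(signal_divider(signal, n_classes=4))
```

the same two functions could be applied to the filtered signal, to the magnitude of its fft, or to its psd, giving one group of features per representation; how the authors ordered these groups inside the 260-value feature vector is not stated.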
moreover, the graph of x(n+1) against x(n) is taken as the signal trajectory and then divided into 16 cells ("houses"). the amount of data falling in each cell is saved as a feature. the input feature vector has 260 values per instance. the neuro-fuzzy classifier has 260 inputs, 14 neurons (rules), and one output. the classification threshold of the neuro-fuzzy classifier is 0.49. furthermore, the mlp neural network has 260 neurons in the input layer, two hidden layers, and an output layer consisting of two neurons. the neuro-fuzzy classifier and the mlp neural network were successfully trained in matlab, and the testing results are presented for both networks. the networks have one output; each output value uniquely represents one category (0: no earthquake; 1: earthquake). after 3503 training iterations, both classifiers were tested; the results indicated that the neuro-fuzzy classifier was better than the mlp network and could detect anomalies five minutes before an earthquake with an acceptable accuracy of 84.6491% (fig. 7; table 2).

fig. 7 difference between the output of neuro-fuzzy classifier and real output after 3503 epochs

table 2 neuro-fuzzy classifier’s performance compared with mlp before feature selection

classifier      neuro-fuzzy classifier   multilayer perceptron (mlp) network
accuracy        84.6491%                 81.1404%
sensitivity     71.93%                   80.70%
specificity     97.37%                   81.58%
average error   0.1237                   0.1691  0.1663

fig. 8 compares the performance of the neuro-fuzzy classifier and the mlp before feature selection. this figure shows that the neuro-fuzzy classifier produced better results for accuracy.

fig. 8 neuro-fuzzy classifier’s performance compared to mlp before feature selection (bar chart of accuracy %, mlp and neuro-fuzzy, axis from 79.00% to 85.00%)

after training and testing, the uta algorithm was implemented for feature selection and the ineffective features were deleted.
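the uta procedure of section 2.3 can be sketched as follows. this is a minimal illustration, assuming a hypothetical `predict` callable that stands in for the already trained neuro-fuzzy or mlp network; the helper names and the toy data are not from the paper.

```python
def accuracy(predict, X, y):
    """Fraction of rows in X whose predicted label equals the label in y."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def uta_select(predict, X, y):
    """Return indices of features judged effective by the UTA mean-replacement test.

    A feature is kept only if replacing it by its mean over all instances
    lowers the accuracy of the trained classifier; otherwise it is treated
    as ineffective (noisy). The classifier is not retrained inside the loop.
    """
    baseline = accuracy(predict, X, y)
    effective = []
    for j in range(len(X[0])):
        mean_j = sum(row[j] for row in X) / len(X)
        X_mod = [row[:j] + [mean_j] + row[j + 1:] for row in X]
        if accuracy(predict, X_mod, y) < baseline:
            effective.append(j)
    return effective


# Toy example: the stand-in classifier only looks at feature 0.
X = [[0.1, 5.0], [0.2, 1.0], [0.8, 5.0], [0.9, 1.0]]
y = [0, 0, 1, 1]


def predict(row):
    return int(row[0] > 0.5)


print(uta_select(predict, X, y))
```

in the toy example only feature 0 is reported as effective, because replacing feature 1 by its mean leaves every prediction unchanged. after such a selection the classifiers are retrained on the reduced feature set, as the text goes on to describe.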
this algorithm decreased the input vector dimensions to 150 for the mlp network and 29 for the neuro-fuzzy classifier. both classifiers were trained and tested again (table 3). table 4 shows some of the more effective features of both classifiers. the results show that frequency characteristics have priority for both classifiers and that the neuro-fuzzy classifier produced better results for accuracy and sensitivity.

table 3 neuro-fuzzy classifier’s performance compared with mlp after feature selection

classifier      neuro-fuzzy classifier   multilayer perceptron (mlp) network
accuracy        84.6491%                 82.8947%
sensitivity     74.56%                   71.05%
specificity     94.74%                   94.74%
average error   0.1512                   0.1782  0.1580

table 4 some of the more effective features after implementation of the uta algorithm for the neuro-fuzzy classifier and the mlp neural network

mlp neural network                  neuro-fuzzy classifier
mean of angle (fft)                 mean of angle (fft)
mean of angle of normalize (fft)    median of entropy
max of data                         covariance of ca (dwt)
mean of abs (fft)                   signal standard deviation of (psd)
max of ca (dwt)                     mean of ca (dwt)
mean of cd (dwt)                    trajectory

fig. 9 shows the accuracy of this study compared with other studies. the accuracy improvement relative to previous articles is obtained using the following formula:

$$\text{improvement}\ (\%) = \left| 1 - \frac{\text{present implementation result}}{\text{previous implementation result}} \right| \times 100 \tag{18}$$

accuracy:
improvement (%) (this study, [4]) = |1 − (84.6491/60.8491)| × 100 = 39.079%
improvement (%) (this study, [10]) = |1 − (84.6491/60.6383)| × 100 = 39.59%
improvement (%) (this study, [7]) = |1 − (84.6491/50)| × 100 = 69.2982%
improvement (%) (this study, [8]) = |1 − (84.6491/75)| × 100 = 12.8654%

it can be seen that the accuracy of the proposed design is higher than that of the previous articles. the improvement is even close to 70% (fig. 10). fig. 11 shows the results of the present study for the neuro-fuzzy classifier after the implementation of the new features. it shows that the neuro-fuzzy classifier performs better than the mlp network.

fig. 9 neuro-fuzzy classifier’s performance for this study (ts) compared with the other studies (bar values: other studies [8] 75; other studies [7] 50; previous research [4] 60.8696; previous research [10] 60.6383; this study 84.6491)

fig. 10 comparison of the accuracy improvement of the proposed design with previous articles (bar chart of improvement (accuracy) % for [4], [10], [7], [8], axis from 0.00% to 70.00%)

fig. 11 neuro-fuzzy classifier’s performance compared with mlp after feature selection

4. conclusion

in this article, the proposed method detects anomalies before an earthquake by using new features. one of the innovations of this article is the extraction of new features. we also considered a longer period of time than other articles for detecting the anomaly before the earthquake and then evaluated two types of classifiers. finally, we chose the best network and the most effective features. the proposed method provided a new matrix of features that was capable of chaotic signal modeling based on noise and detection of anomalies during the five minutes before the earthquake with an acceptable accuracy of 84.6491%. moreover, the results indicate that the uta algorithm decreased the input feature dimensions without loss of accuracy. the selected features demonstrated that chaotic signal modeling based on noise and detecting anomalies before an earthquake is very dependent on frequency features, followed by entropy, trajectory, chaotic and statistical features. future work would be to collect more earthquake data globally, add more frequency-dependent parameters to the feature vector, and use committee machines to increase the classification accuracy.
it is also possible to extract a new feature from a combination of two or three features, for example the combination of entropy, classification and frequency features, or other possible combinations.

references

[1] r. j. geller, d. d. jackson, y. y. kagan and f. mulargia, "earthquakes cannot be predicted", science, vol. 275, pp. 1616-1617, 1997. http://moho.ess.ucla.edu/~kagan/geller_et_al_1997.pdf
[2] s. uyeda, t. nagao, and m. kamogawa, "short-term earthquake prediction: current status of seismoelectromagnetics", tectonophysics, vol. 470, no. 3-4, pp. 205-213, 2009.
[3] a. m. rajabi, m. khodaparast, and m. mohammadi, "earthquake-induced landslide prediction using back-propagation type artificial neural network: case study in northern iran", natural hazards, vol. 110, no. 1, pp. 679-694, 2022.
[4] l. dehbozorgi, "case study of seismic signals for ghir station before the earthquake", bulletin of earthquake science and engineering, vol. 5, no. 4, pp. 131-143, 2019. http://www.bese.ir/article_240366.html?lang=en
[5] h. shiraishi, "developing and validating earthquake prediction software", international journal of engineering and techniques, vol. 8, pp. 63-69, 2022.
[6] m. yousefzadeh, s. a. hosseini, and m. farnaghi, "spatiotemporally explicit earthquake prediction using deep neural network", soil dynamics and earthquake engineering, vol. 144, p. 106663, 2021.
[7] r. li, x. lu, s. li, h. yang, j. qiu, and l. zhang, "dlep: a deep learning model for earthquake prediction", in proceedings of the 2020 international joint conference on neural networks (ijcnn), 2020, pp. 1-8.
[8] y.-j. chiang, t.-l. chin, and d.-y. chen, "neural network-based strong motion prediction for on-site earthquake early warning", sensors, vol. 22, no. 3, p. 704, 2022.
[9] s. yaghmaei-sabegh, "earthquake ground-motion duration estimation using general regression neural network", scientia iranica, vol. 25, no. 5, pp. 2425-2439, 2018.
[10] l. dehbozorgi and f. farokhi, "notice of retraction: effective feature selection for short-term earthquake prediction using neuro-fuzzy classifier", in proceedings of the 2010 second iita international conference on geoscience and remote sensing, 2010, vol. 2, pp. 165-169.
[11] j. utans, j. moody, s. rehfuss, and h. siegelmann, "input variable selection for neural networks: application to predicting the us business cycle", in proceedings of 1995 conference on computational intelligence for financial engineering (cifer), 1995, pp. 118-122.
[12] m. f. redondo and c. h. espinosa, "a comparison among feature selection methods based on trained networks", in proceedings of the neural networks for signal processing ix: proceedings of the 1999 ieee signal processing society workshop (cat. no. 98th8468), 1999, pp. 205-214.
[13] k. majumdar and m. h. myers, "amplitude suppression and chaos control in epileptic eeg signals", computational and mathematical methods in medicine, vol. 7, no. 1, pp. 53-66, 2006.
[14] s. byun et al., "entropy analysis of heart rate variability and its application to recognize major depressive disorder: a pilot study", technology and health care, vol. 27, no. s1, pp. 407-424, 2019.
[15] k. sui and h.-g. kim, "research on application of multimedia image processing technology based on wavelet transform", eurasip journal on image and video processing, vol. 2019, no. 1, pp. 1-9, 2019.
[16] a. qin, z. shang, j. tian, y. wang, t. zhang, and y. y. tang, "spectral–spatial graph convolutional networks for semisupervised hyperspectral image classification", ieee geoscience and remote sensing letters, vol. 16, no. 2, pp. 241-245, 2018.
[17] a. l. zheleznyakova, "physically-based method for real-time modelling of ship motion in irregular waves", ocean engineering, vol. 195, p. 106686, 2020.
[18] y. xue, p. j. ludovice, and m. a. grover, "dynamic coarse graining in complex system simulation", in proceedings of the 2011 american control conference, 2011, pp. 5031-5036.
[19] n. singh and r. khan, "speaker recognition and fast fourier transform", international journal, vol. 5, no. 7, 2015.
[20] r. tabbussum and a. q. dar, "performance evaluation of artificial intelligence paradigms - artificial neural networks, fuzzy logic, and adaptive neuro-fuzzy inference system for flood prediction", environmental science and pollution research, vol. 28, no. 20, pp. 25265-25282, 2021.
[21] international institute of earthquake engineering and seismology. [online]. available: http://www.iiees.ac.ir.
[22] j. tong, m. lin, x. wang, j. li, j. ren, l. liang, y. liu, "deep learning inversion with supervision: a rapid and cascaded imaging technique", ultrasonics, 122, 106686, 2022.
[23] m. sinambela, m. situmorang, k. tarigan, s. humaidi, m. sirait, "waveforms classification of northern sumatera earthquakes for new mini region stations using support vector machine", advanced science engineering information technology, vol. 11, no. 2, 2021.
[24] w. yanwei, l. xiaojun, w. zifa, et al., "deep learning for p-wave arrival picking in earthquake early warning", earthq. eng. eng., vol. 20, pp. 391-402, 2021.
[25] m. s. abdalzaher, m. s. soliman, s. m. el-hady, a. benslimane, et al., "a deep learning model for earthquake parameters observation in iot system-based earthquake early warning", ieee internet of things journal, vol. 9, no. 11, pp. 8412-8424, 2022.
[26] d. hong, n. yokoya, j. chanussot, x. x. zhu, "an augmented linear mixing model to address spectral variability for hyperspectral unmixing", ieee transactions on image processing, vol. 28, no. 4, 2019.
[27] x. wu, d. hong, j. chanussot, y. xu, r. tao, y. wang, "fourier-based rotation-invariant feature boosting: an efficient framework for geospatial object detection", ieee geoscience and remote sensing letters, vol. 17, no. 2, 2020.

facta universitatis
series: electronics and energetics vol. 27, no 4, december 2014, pp. 621-630
doi: 10.2298/fuee1404621p

the influence of technology and switching parameters on resistive switching behavior of pt/hfo2/tin mim structures

albena paskaleva 1, boris hudec 2, peter jančovič 2, karol fröhlich 2, dencho spassov 1
1 institute of solid state physics, bulgarian academy of sciences, sofia, bulgaria
2 institute of electrical engineering, slovak academy of sciences, bratislava, slovakia

abstract. resistive switching (rs) effects in pt/hfo2/tin metal-insulator-metal (mim) capacitors have been investigated in dependence on the tin bottom electrode engineering, deposition process, switching conditions and dielectric thickness. it is found that the rs ratio depends strongly on the amount of oxygen introduced on the tin surface during interface engineering. in some structures a full recovery of the conductive filament is observed within more than 100 switching cycles.
rs effects are discussed in terms of the different energy needed to dissociate o ions in structures with different tin electrode treatment.
key words: resistive switching; pt/hfo2/tin structures; interface engineering; atomic layer deposition.
received july 11, 2014; received in revised form july 21, 2014. corresponding author: albena paskaleva, institute of solid state physics, bulgarian academy of sciences, 72 tzarigradsko chaussee, 1784 sofia, bulgaria (e-mail: paskaleva@issp.bas.bg)
1. introduction
among the new types of non-volatile memories (nvm), resistive switching memories (rram) have attracted great interest because of their simple structure, long retention time, small size, fast switching speed and non-destructive readout [1-4]. resistive switching (rs) is a phenomenon in which the resistance of a material changes under application of an electric field or current. rs devices can be switched between a low resistive state (lrs) and a high resistive state (hrs) over many cycles. two kinds of rs effect have been recognized: unipolar, where the switching does not depend on the polarity of the applied voltage; and bipolar, where the set to lrs occurs in one polarity and the reset to hrs in the reversed polarity. bipolar switching is usually preferred because of better uniformity, faster switching speed and better control [5]. to obtain the rs effect an initial electroforming step is needed, which causes the initially insulating structure to change into a more conductive state. in fact, electroforming is a current-limited electric breakdown, which in its nature is a kind of “soft” or “arrested” breakdown [6]. the polarity of the forming process as well as the compliance current (cc) should be carefully optimized in order to bring the structure to a state where rs could be observed. the typical bipolar rs loop is presented in fig. 1. after the forming step the structure is in lrs (on state).
by sweeping the voltage (usually in the polarity opposite to the forming process), at a certain voltage, vreset, the structure goes to hrs (off state). this is the reset process. then, by sweeping the voltage in the reversed polarity, at a certain voltage, vset, the structure is brought back to lrs; this is the set process. the switching between on and off states is reversible and can be performed over many cycles. the rs effect has been observed in various transition metal oxides, but the origin of rs is still unclear. it is generally accepted that rs is due to the formation and rupture of nano-sized conductive filament(s) (cf) through the insulating layer, which give rise to the two stable resistive states (on and off states). formation of localized cfs has been observed in various metal-oxide resistive switching devices by conductive atomic force microscopy [7, 8]. thermal, electrical or ion-migration-induced mechanisms could control rs [9-11]. for example, there is evidence that oxygen vacancies play a crucial role in the formation and rupture of the conductive filaments in transition metal oxides. therefore, the effects depend strongly on the dielectric material and the method of its deposition. on the other hand, all the above mentioned mechanisms could be substantially influenced by the metal electrodes, and evidence for interface (i.e. metal/dielectric) effects is more frequently reported than for bulk switching effects. in other words, the rs phenomenon is not an intrinsic property of the oxide itself, but a property of both the oxide and the electrode(s)/oxide interface(s) [1,4,9]. metal-insulator-metal (mim) structures with hfo2 as the insulating film have been intensively studied recently due to their cmos compatibility and excellent switching properties [12-14]. tin and pt electrodes are often employed to provide bipolar rs. tin can be easily oxidized during the dielectric growth and acts as an oxygen reservoir due to its high affinity to oxygen, while pt serves as an inert electrode [3].
different modifications of the pt/hfo2/tin structure, by introducing different cap layers at the metal/dielectric interfaces or by doping hfo2 with different atoms, have been suggested in order to optimize rs behavior [3,12,15,16]. in this work, we extend our previous investigation [17] on rs effects in pt/hfo2/tin mim capacitors and shed more light on the influence of technology and measurement conditions on the rs properties of these structures. special attention is paid to the enhancement of rs properties by o3 treatment of the tin bottom electrode.
fig. 1 typical bipolar resistive switching loop presenting set and reset processes (axes: current vs. voltage; vreset and vset marked).
2. sample preparation
mim structures with an active hfo2 layer were used to investigate rs effects. thin hfo2 films (d = 5-13 nm) were grown by plasma or ozone assisted atomic layer deposition (ald) at 300 °c in beneq tfs 200 equipment using tetrakis(ethylmethylamino)hafnium as precursor. the tin bottom electrode (be) was reactively sputtered in ar/n2 plasma at a temperature of 200 °c. the thickness of the layer was 70 nm and its resistivity about 200 µΩ cm. pt top electrodes (te) (30 nm) were evaporated at room temperature through a shadow mask and capped by 30 nm thick au. the test mim structures are presented schematically in fig. 2. for some samples the tin be was subjected to a different number of o3 treatment cycles (5-40) before deposition of hfo2. in another set of samples hfo2 was deposited on top of very thin (1-1.5 nm) tio2, also prepared by ald, using titanium isopropoxide as a precursor. x-ray diffraction spectra revealed that hfo2 is amorphous [18].
fig. 2 experimental mim structures with hfo2 as an active layer and tin and pt as bottom and top electrodes, respectively.
3. results and discussion
3.1.
dependence of rs effect on ald process and bottom electrode engineering
figure 3 shows i-v curves and endurance characteristics of mim structures with hfo2 deposited by ozone assisted ald. the first result to be mentioned is that the o3 treatment of tin decreases the forming voltage vform, which is about -4.5 v for samples without o3 treatment (fig. 3a) and for 5 cy o3 samples, and decreases to about -2 v for the 20 cy o3 sample. in addition, the initial leakage current before forming increases by several orders of magnitude with increasing o3 treatment. it is also seen that the character of rs as well as the on/off ratio are strongly affected by the ozone treatment of the tin electrode. in samples without o3 treatment (fig. 3a) relatively weak abrupt rs is observed. the endurance characteristics measured during 100 switching cycles (fig. 3b) reveal that rs is not very stable: the on/off ratio is about 10 and it decreases progressively. in addition, both set and reset processes occur at very low voltage (< 1 v) (fig. 3a). although low vset and vreset are generally desirable to ensure low power consumption, they should be well resolved from the readout voltage, which is usually about 0.2 – 0.3 v. only 5 cy of o3 treatment (not shown) is enough to significantly change the rs behavior: a gradual reset followed by a weak abrupt reset process is observed; some very strong abrupt reset events have also been registered. the rs ratio is increased and is about 40-60. with increasing number of o3 treatment cycles (fig. 3c,d) this ratio is further increased and is significantly higher than 100 (fig. 3d). the typical reset process in this kind of samples is a gradual reset followed by a strong abrupt reset (fig. 3c). as is seen, vset = -1 to -2 v and vreset = 1 to 2 v, i.e. both are well resolved with respect to the read-out voltage and are low enough to ensure low power consumption.
all these observations indicate that o3 treatment introduces some structural changes which enhance the rs behavior of the structures. the most plausible hypothesis is that o3 treatment oxidizes the tin surface; a thin tion film is formed and the thickness of this layer increases with increasing number of o3 cycles. this layer could influence the initial current and vform in two ways: 1) it is a defect-rich layer which gives rise to defect-assisted transport mechanisms, hence to increased leakage current. it is very likely that this layer is conductive, i.e. it is always in lrs [19]; 2) this layer modifies the barrier at the hfo2/tin interface. most probably both factors play a role, but the decreased vform and increased leakage current with increasing o3 treatment give evidence that the defect-related mechanism is the dominant one. it is established in other works [20] that vform is linearly dependent on the thickness of the oxide film, hence thinner oxides are required to reduce vform. the present work shows that it is possible to reduce vform not only by reducing the oxide thickness, but also by modifying the oxide/metal interface.
fig. 3 resistive switching in: pt/hfo2/tin structures without o3 treatment of tin electrode a) typical i-v rs loops (i, a vs. top electrode voltage, v; electroforming and small abrupt reset marked), b) endurance characteristics (on and off current levels vs. number of readings) within 100 switching cycles, measured at vset = -1.6 v, vrset = 1.5 v, cciset = 0.5 ma; and pt/hfo2/tin structures with 10 cy o3 treatment of tin electrode c) typical i-v rs loops (gradual reset and strong abrupt reset marked), d) endurance characteristics, measured at vset = -2 v, vrset = 2 v, cciset = 0.5 ma.
hfo2 is deposited by ozone assisted ald.
fig. 4 resistive switching in: pt/hfo2/tin structures with 20 cy of o3 treatment of tin electrode, a) typical i-v rs loops (i, a vs. top electrode voltage, v; curve before forming and strong abrupt reset marked), b) endurance characteristics (i, a vs. number of readings), measured at vset = -2 v, vrset = 2 v, cciset = 0.1 ma; and pt/hfo2/tin structures with ultrathin tio2 deposited on tin c) typical i-v rs loops (curve before forming, forming and set marked), d) endurance characteristics, measured at vset = -8 v, vrset = 3 v, cciset = 1.1 ma. hfo2 is deposited by plasma assisted ald.
fig. 4a,b shows the rs characteristics of a structure with hfo2 deposited by plasma-enhanced ald and 20 cy of o3 treatment of tin. in this sample only strong abrupt switching events have been observed and an on/off ratio of nearly 1000 is achieved. unlike the typical rs phenomenon, where the off current is significantly higher than the current before the electroforming process, in this case the current resets almost to its initial level. as the forming process is in fact a controlled soft or arrested breakdown (bd), this result reveals that in this kind of structures it is possible to achieve a full recovery of bd. this phenomenon is also observed in samples with plasma-enhanced ald hfo2 and ultrathin tio2 deposited in-between tin and hfo2 (fig. 4c,d). in this case the current before forming is significantly lower compared to all previously discussed samples. in addition, a much higher vform, in the order of -6 v, is observed. these results indicate a better dielectric quality of hfo2/tio2 stacks.
as the difference from the sample presented in fig. 4a,b lies in the interfacial layer, these results support the assumption of a defect-rich tion layer formed during o3 treatment of tin. it should be noted that a full recovery of bd has been achieved within more than 100 switching cycles even in the structure with ultra-thin tio2 and, due to the very low initial leakage current, an extremely high rs ratio of about eight orders of magnitude has been measured (fig. 4d). it should be mentioned that the similar values of the off current after strong reset and the initial current before electroforming are due to a full annihilation of the filamentary path rather than to any area effect arising from the relatively large top electrode (10^-4 cm^2). the following observations support this conclusion: 1) the effect is observed in samples with quite different initial current values (10^-7 a at 1 v for the sample presented in fig. 4a, and <10^-11 a at 1 v for the sample presented in fig. 4c); 2) when the sample undergoes only a gradual reset and is not allowed to go to a strong abrupt reset (see the dashed line in fig. 4c), the off current is several orders of magnitude higher than the initial current (i.e. in this case only a partial annihilation of the conductive filament occurs); 3) vset following a strong abrupt reset process is similar to (and even slightly higher than) the forming voltage (fig. 4c). it should also be mentioned that the gradual reset is the preferred mode as it provides a more controllable and stable rs process. the extremely large rs ratio obtained in some of the structures could hardly be implemented in rs devices. the observed phenomenon, however, could give valuable information about the bd process in these structures and the possibility to control it. this finding could also be very useful for manufacturing of structures with increased immunity to bd, i.e.
structures in which multiple self-healing of bd could be obtained. the presented results reveal that there exist two kinds of reset processes: a gradual reset, which is attributed to gradual annihilation of o vacancies only in a narrow region at the tin electrode [21, 22], and a strong abrupt reset process, in which a substantial part of, and in some cases the whole, conductive filament is erased. the results give evidence that the extent of the strong reset depends strongly on the amount of oxygen introduced to the tin bottom electrode interface by o3 treatment: the more o3 cycles performed, the stronger the abrupt reset is. the most widely accepted model explaining the rs effect involves breaking of metal-oxygen bonds in the dielectric layer during the electroforming step, dissociation of o ions and formation of oxygen vacancies. as a result a localized path with increased conductivity (conductive filament) is formed between the electrodes. this filament is characterized by an increased density of o vacancies and is essentially metallic in nature [22]. the dissociation of o ions and formation of o vacancies are driven by the electric field and elevated temperature. it should be mentioned that the formation energies of o vacancies are quite different for different materials and even for the same material having different structure (e.g., amorphous or crystalline) [23], which in turn defines different rs behavior. the dissociated o ions diffuse out of the conductive filament and some of them are stored at the anode. the activation energy of o ion diffusion is 0.3 ev, while that of o vacancy diffusion is about 1.2 ev and 0.7 ev for single and double ionized vacancies, respectively, i.e. the diffusion of o ions is more effective [22]. this implies that the process following electroforming is diffusion of o ions rather than diffusion of o vacancies.
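to give a feeling for how large the difference between the quoted activation energies is, the following minimal python sketch compares the corresponding arrhenius factors exp(-ea/kt). it is only an illustration under stated assumptions: the pre-exponential factors of the three diffusion processes are assumed to be equal (the paper does not give them), a temperature of 300 k is assumed, and the function name relative_hop_rate is introduced here for illustration only.

```python
import math

K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def relative_hop_rate(ea_fast_ev, ea_slow_ev, temperature_k):
    """Ratio of two Arrhenius factors exp(-Ea/kT).

    Assumption: both processes share the same pre-exponential factor,
    so only the activation energies and the temperature matter.
    """
    if temperature_k <= 0:
        raise ValueError("temperature must be positive")
    kt = K_B_EV_PER_K * temperature_k
    return math.exp((ea_slow_ev - ea_fast_ev) / kt)


# Activation energies quoted in the text (eV)
EA_O_ION = 0.3        # o ion diffusion
EA_VAC_SINGLE = 1.2   # single ionized o vacancy
EA_VAC_DOUBLE = 0.7   # double ionized o vacancy

T_ASSUMED_K = 300.0   # assumed temperature, not taken from the paper

ratio_vs_double = relative_hop_rate(EA_O_ION, EA_VAC_DOUBLE, T_ASSUMED_K)
ratio_vs_single = relative_hop_rate(EA_O_ION, EA_VAC_SINGLE, T_ASSUMED_K)

print(f"o ion vs. double ionized vacancy: {ratio_vs_double:.2e}")
print(f"o ion vs. single ionized vacancy: {ratio_vs_single:.2e}")
```

under these assumptions the o ion factor exceeds the vacancy factors by many orders of magnitude. at the elevated local temperatures expected during switching the ratios become smaller but remain larger than one, which is consistent with the statement that o ion diffusion is the more effective process.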
in a reset process under opposite bias polarity, the o ions stored at the electrode can be moved back to the rs layer, where they annihilate some of the o vacancies, thus causing a rupture of the conductive filament. therefore, it is reasonable to think that the two kinds of reset processes stem from o ions dissociated from different bonds. we suggest that o ions released in hfo2 during the forming process are swept to the tin electrode and form weak bonds there. most likely these are the o ions which take part in the gradual reset process. oxygen introduced by o3 treatment is bound more strongly in tin by forming tion and/or tiox. more energy (i.e. a stronger electric field) is required to break these bonds and to release o ions, which under positive bias drift toward the anode and recombine with the o vacancies in the conductive filament. therefore, due to the easy oxidation of tin, this electrode serves as an oxygen reservoir, providing the o ions needed for the erasure of o vacancies in the conductive filament. o3 treatment of the tin electrode increases the amount of oxygen stored at the tin/hfo2 interface, thus enhancing the on/off ratio. as already discussed, the o3 treatment most likely results in partial oxidation of tin and formation of a defect-rich (very likely conductive) tion layer. in contrast, in the case of ultrathin tio2, a nearly stoichiometric layer is formed. therefore, stronger bonds are formed and more energy is needed to knock out an o ion from these bonds, hence the largest reset voltage is observed. once the field is high enough to break ti-o bonds, the amount of dissociated o ions is enough to annihilate the whole cf, thus resulting in full recovery of breakdown. next, a set voltage in the order of (or even higher than) vform is required to create the cf anew.
due to the strong thermodynamic ability of ti to extract oxygen from hfo2 the process of full recovery of breakdown could be repeated multiple times.
3.2. dependence of rs effect on switching conditions
the rs effects are strongly dependent on measurement conditions such as the vset and vreset voltages and the compliance current during the set process, cciset. a compliance current is usually needed to avoid too high a current flowing through the active layer, which may bring it to a hard breakdown. the stability of switching and the rs ratio can be varied in a wide range by changing the above mentioned parameters of the switching process. fig. 5 shows the difference in rs at different switching conditions for the sample presented in fig. 4b. the following parameters have been used to measure the rs shown in the inset of fig. 4b: vset = -2 v; vrset = 2 v; cciset = 0.1 ma. as is seen in fig. 5a, by decreasing vreset to 1.5 v the rs ratio decreases to below 10 and the effect is not very stable. on the other hand, the increase of cciset to 0.5 ma (fig. 5b) resulted in a more stable effect with well resolved on and off states. in other words, by optimization of switching parameters it is possible to control the rs effect and to obtain stable switching characteristics. it also opens up a way for manufacturing of multilevel rs devices [15, 16], i.e. devices in which the on/off ratio could be varied by changing the reset voltage or compliance current.
fig. 5 dependence of the rs effect in pt/hfo2/20 cy o3/tin sample on switching parameters (i, a vs. number of readings): a) decrease of vrset (vset = -2 v, vrset = 1.5 v, cciset = 0.1 ma), and b) increase of cciset (vset = -2 v, vrset = 2 v, cciset = 0.5 ma) with respect to measurement conditions in fig. 4b.
3.3. dependence of rs effect on oxide thickness
further, the influence of dielectric thickness on rs has been investigated.
in this set of samples 40 cy o3 treatment of tin has been performed and the thickness of the hfo2 layer has been varied between 5 and 9 nm. we used thinner layers because rram devices have to operate at low voltages, hence the thickness of the dielectric layer should be decreased to obtain acceptable values for vform, vset and vreset. we have chosen switching conditions resulting in a gradual reset process. as discussed above, this mode of reset provides a more controllable rs process. in fig. 6 the resistive switching in stacks with hfo2 thickness of 5.4 and 7.2 nm, respectively, is presented. the sample with a 9 nm thick hfo2 (not shown) exhibited similar behavior. as shown above, the on/off ratio depends strongly on measurement conditions. in order to study the influence of the oxide thickness itself and to disregard the influence of measurement conditions, the best results in terms of stable on/off ratio for each sample have been presented in fig. 6. all three samples show very stable rs for at least 100 switching cycles. moreover, in the thinnest sample this stable rs is obtained without applying compliance during the set process. in thicker samples rs without compliance current has also been observed, but it has not been stable and the rs ratio has varied in a wide range. it is seen (fig. 6) that the on/off ratio increases with decreasing hfo2 thickness and the largest ratio of about 100 is obtained for the thinnest (5.4 nm) sample. for 7.2 and 9 nm thick samples the obtained ratio is about 20 and 10, respectively. it should be noted that the difference in on/off ratio comes from the differences in on current, which increases with decreasing thickness. the off current is similar in all three samples. these results indicate a stronger low resistive state in thinner samples, which is assigned to the formation of a wider conductive filament that may be a consequence of the set process performed without compliance.
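the on/off ratios quoted throughout this section are obtained from the on and off current levels read after each set and reset. as a minimal sketch of how such a ratio could be extracted from an endurance record, the python snippet below takes the ratio of the geometric means of the two current levels. the geometric mean is a choice made here because the currents are spread over a logarithmic scale; it is not stated in the paper. the function name on_off_ratio and the current values are hypothetical and serve only as an illustration, they are not measured data.

```python
from statistics import geometric_mean


def on_off_ratio(i_on, i_off):
    """Ratio of the geometric-mean ON current to the geometric-mean OFF current.

    i_on, i_off: sequences of positive current readings (A), one per cycle.
    """
    if not i_on or not i_off:
        raise ValueError("both current lists must be non-empty")
    if min(i_on) <= 0 or min(i_off) <= 0:
        raise ValueError("currents must be positive")
    return geometric_mean(i_on) / geometric_mean(i_off)


# Hypothetical readings (A) at a fixed readout voltage, illustration only
i_on_example = [1.0e-4, 1.2e-4, 0.9e-4]
i_off_example = [1.0e-6, 1.1e-6, 0.95e-6]

ratio = on_off_ratio(i_on_example, i_off_example)
print(f"on/off ratio: {ratio:.1f}")
```

applied per sample, such a figure makes it easy to compare stacks of different thickness or different switching conditions on the same footing.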
fig. 6 stable rs in pt/hfo2/tin structures with 40 cy o3 treatment of tin and different hfo2 thickness: a) 5.4 nm and b) 7.2 nm (i-v loops, 50 rs cycles each; i, a vs. top electrode voltage, v).
4. conclusion
the results presented give evidence that by incorporation of oxygen at the hfo2/tin interface and the choice of deposition technique it is possible to enable and substantially enhance the resistive switching of the structures. the effects depend strongly on the amount of incorporated oxygen, how it is incorporated and what kinds of bonds it forms at the interface. the presence of the ultra-thin tion (tio2) film plays a crucial role in the rs process: it serves both as a reservoir for o ions released from the breaking of hf-o bonds during the electroforming process and as a source of additional oxygen, released from the dissociation of ti-o and/or ti-o-n bonds. all switching parameters (vform, vset, vreset, on/off ratio) could be substantially changed by the surface engineering, hence the amount of o and its bonding in the tio-based film, as well as the rs layer thickness and switching conditions, should be carefully optimized to obtain stable rs effects with controllable switching behavior. by proper engineering even a multiple full recovery of the cf (i.e. full healing of “arrested” breakdown) is possible.
acknowledgement: this work was supported by the apvv-0509-10, vega (no. 2/0138/14) and isspbk-05-14.
references
[1] r. waser, r. dittmann, c. staikov, and k. szot, “redox-based resistive switching memories – nanoionic mechanisms, prospects, and challenges”, advanced mater., vol. 21, pp. 2632-2663, 2009.
[2] h.y lee, p.s.
chen, et al., “low power and high speed bipolar switching with a thin reactive ti buffer layer in robust hfo2 based rram”, techn. digest – intern. electron dev. meeting, iedm, pp. 297-300, 2008.
[3] y. y. chen, l. goux, et al., “endurance/retention trade-off on hfo2\metal cap 1t1r bipolar rram”, ieee trans. electron dev., vol. 60, pp. 1114-1121, 2013.
[4] h.-s. p. wong, h.-y. lee, et al., “metal-oxide rram”, proc. of the ieee, vol. 100, pp. 1951-1970, 2012.
[5] y.h. do, j.s. kwak, and j. p. hong, “resistive switching characteristics of tio2 films with embedded co ultra thin layer”, j. semicond. technol. sci., vol. 8, pp. 80-84, 2008.
[6] s. lombardo, j. h. stathis, et al., “dielectric breakdown mechanisms in gate oxides”, j. appl. phys., vol. 98, art. no. 121301, 2005.
[7] b.j. choi, d.s. jeong, et al., “resistive switching mechanism of tio2 thin films grown by atomic-layer deposition”, j. appl. phys., vol. 98, art. no. 033715, 2005.
[8] j.y. son and y.-h. shin, “direct observation of conducting filaments on resistive switching of nio thin films”, appl. phys. lett., vol. 92, art. no. 222106, 2008.
[9] r. waser and m. aono, “nanoionics-based resistive switching memories”, nature mater., vol. 6, pp. 833-840, 2007.
[10] b. gao, s. yu, et al., “oxide-based rram switching mechanism: a new ion-transport-recombination model”, techn. digest – intern. electron dev. meeting, iedm, art. no. 4796751, 2008.
[11] u. russo, d. ielmini, et al., “conductive-filament switching analysis and self-accelerated thermal dissolution model for reset in nio-based rram”, techn. digest – intern. electron dev. meeting, iedm, pp. 775-778, 2007.
[12] l. goux, p. wang, et al., “roles and effects of tin and pt electrodes in resistive-switching hfo2 systems”, electrochem. solid-state lett., vol. 14, pp. h244-h246, 2011.
[13] c. walczyk, d. walczyk, et al., “impact of temperature on the resistive switching behavior of embedded hfo2-based rram devices”, ieee trans. electron dev., vol. 58, pp. 3124-3131, 2011.
[14] y. s. lin, f. zheng, et al., “resistive switching mechanisms relating to oxygen vacancies migration in both interfaces in ti/hfox/pt memory devices”, j. appl. phys., vol. 113, art. no. 064510, 2013.
http://www.scopus.com/record/display.url?eid=2-s2.0-50249141738&origin=resultslist&sort=plf-f&src=s&st1=russo%2cu&sid=7c788586405d93227c6be22122348547.i0qkgbijgqqlq4nw7dqz4a%3a780&sot=b&sdt=b&sl=152&s=firstauth%28russo%2cu%29+and+doctype%28cp%29+and+subjarea%28mult+or+ceng+or+chem+or+comp+or+eart+or+ener+or+engi+or+envi+or+mate+or+math+or+phys%29+and+pubyear+%3d+2007&relpos=0&relpos=0&citecnt=45&searchterm=firstauth%28russo%2cu%29+and+doctype%28cp%29+and+subjarea%28mult+or+ceng+or+chem+or+comp+or+eart+or+ener+or+engi+or+envi+or+mate+or+math+or+phys%29+and+pubyear+%3d+2007 http://www.scopus.com/source/sourceinfo.url?sourceid=26142&origin=resultslist http://www.scopus.com/record/display.url?eid=2-s2.0-79953783285&origin=resultslist&sort=plf-f&src=s&st1=goux%2cl&sid=b0dcbd200e0a786ad80ceaf84f9d6353.zqknzaysrvjozycdfiziq%3a20&sot=b&sdt=b&sl=36&s=firstauth%28goux%2cl%29+and+pubyear+%3d+2011&relpos=3&relpos=3&citecnt=8&searchterm=firstauth%28goux%2cl%29+and+pubyear+%3d+2011 http://www.scopus.com/record/display.url?eid=2-s2.0-79953783285&origin=resultslist&sort=plf-f&src=s&st1=goux%2cl&sid=b0dcbd200e0a786ad80ceaf84f9d6353.zqknzaysrvjozycdfiziq%3a20&sot=b&sdt=b&sl=36&s=firstauth%28goux%2cl%29+and+pubyear+%3d+2011&relpos=3&relpos=3&citecnt=8&searchterm=firstauth%28goux%2cl%29+and+pubyear+%3d+2011 http://www.scopus.com/record/display.url?eid=2-s2.0-80052097231&origin=resultslist&sort=plf-f&src=s&st1=walczyk&sid=b0dcbd200e0a786ad80ceaf84f9d6353.zqknzaysrvjozycdfiziq%3a70&sot=b&sdt=b&sl=37&s=firstauth%28walczyk%29+and+pubyear+%3d+2011&relpos=0&relpos=0&citecnt=17&searchterm=firstauth%28walczyk%29+and+pubyear+%3d+2011 http://www.scopus.com/record/display.url?eid=2-s2.0-80052097231&origin=resultslist&sort=plf-f&src=s&st1=walczyk&sid=b0dcbd200e0a786ad80ceaf84f9d6353.zqknzaysrvjozycdfiziq%3a70&sot=b&sdt=b&sl=37&s=firstauth%28walczyk%29+and+pubyear+%3d+2011&relpos=0&relpos=0&citecnt=17&searchterm=firstauth%28walczyk%29+and+pubyear+%3d+2011 
[15] a. markeev, a. chouprik, k. egorov, y. lebedinskii, a. zenkevich, and o. orlov, "multilevel resistive switching in ternary hfxal1-xoy oxide with graded al depth profile", microel. eng., vol. 109, pp. 342-345, 2013.
[16] s. yu, b. gao, z. fang, h. yu, j. kang, and h.-s. p. wong, "a neuromorphic visual system using rram synaptic devices with sub-pj energy and tolerance to variability: experimental characterization and large-scale modeling", techn. digest–intern. electron dev. meet., iedm, art. no. 6479018, 2012.
[17] a. paskaleva, k. fröhlich, b. hudec, and p. jančovič, "resistive switching effects in tin/hfo2/pt mim structures and their dependence on bottom electrode interface engineering", proc. 29th intern. conf. on microelectr. (miel 2014), pp. 285-288, 2014.
[18] k. fröhlich, p. jančovič, b. hudec, j. dérer, a. paskaleva, t. bertaud, and t. schroeder, "atomic layer deposition of thin oxide films for resistive switching", ecs transactions, vol. 58, pp. 163-170, 2013.
[19] g. tang, f. zeng, et al., "programmable complementary resistive switching behaviours of a plasma-oxidised titanium oxide nanolayer", nanoscale, vol. 5, pp. 422-428, 2013.
[20] c. walczyk, c. wenger, et al., "pulse-induced low-power resistive switching in hfo2 mim diodes for nonvolatile memory applications", j. appl. phys., vol. 105, 114103, 2009.
[21] g. bersuker, d. gilmer, et al., "metal oxide rram switching mechanism based on conductive filament microscopic properties", techn. digest–intern. electron dev. meet., iedm, art. no. 5703394, 2010.
[22] g. bersuker, d. gilmer, et al., "metal oxide resistive memory switching mechanism based on conductive filament properties", j. appl. phys., vol. 110, 124518, 2011.
[23] t.-j. chen and ch.-l. kuo, "oxygen vacancy formation and the induced defect states in hfo2 and hfsilicates – a first principles hybrid functional study", microel. reliab., vol. 54, pp. 1119-1124, 2014.
facta universitatis series: electronics and energetics vol. 29, no 3, september 2016, pp. 461-474 doi: 10.2298/fuee1603461d
designing an intelligent home media center
igor đurić 1, vanjica ratković-živanović 2, milica labus 1, dragana groj 1, nikola milanović 1
1 faculty of organizational sciences, university of belgrade, serbia
2 radio television of serbia, belgrade, serbia
abstract. this paper presents the design and implementation of a personal intelligent home media center. the primary goal was to increase the quality of life through the use of ambient intelligence in smart homes. the solution presented here uses a client-server architecture with network-attached storage for storing all multimedia content. sensors are used to detect a person's presence, and ambient intelligence techniques are used to recommend the most suitable multimedia content to end-users. the major advantages of this personal intelligent home media center are speed, intelligence, inexpensive components and scalability. for evaluation purposes, the implementation was done within one home media center.
key words: smart home, ambient intelligence, media center
1. introduction
ambient intelligence (hereinafter: ami) offers an opportunity to realize an old dream: the smart or intelligent home. people spend a lot of time in their homes, and both social and technological drivers are broadening the scope of activities undertaken at home.
advances in technology are ultimately improving the way the home environment can react and adapt to residents' needs. besides sleeping, rest and entertainment are the main functions performed at home. radio, tv and music records/cds have been the dominant forms of entertainment in the home environment for decades. with advances in technology, all this content became digital and a new form of multimedia entertainment (listening, watching, and interacting) was developed.
each home potentially has many devices that store different digital content (videos, music and photos) and media storage units (cds, dvds, etc.). each device has limited memory available, and the same content is often stored in many places. the idea behind a home media center is to integrate all these devices and units and to allow centralized storage, search, playback, internet streaming and often even recording of digital content. there is no universal plug-and-play solution, since each home has a different setup of devices.
received june 30, 2015; received in revised form november 10, 2015
corresponding author: igor đurić, faculty of organizational sciences, university of belgrade, jove ilića 154, 11000 belgrade, serbia (e-mail: igor@elab.rs)
in this paper we provide an overview of ambient intelligence applications in smart homes and of the technological challenges behind the implementation of intelligent home media centers. we also explore some of the existing smart or intelligent home media center solutions. in particular, we focus on the implementation of a broadly applicable and affordable intelligent home media center based on inexpensive hardware and open source software. our goal was to develop an intelligent home media center which can respond to the specific needs and moods of its users. our system is in the testing phase, and future research will rely on users' feedback.
2. literature review
2.1. ambient intelligence in smart homes
ambient intelligence is a new paradigm for an intelligent environment which uses information and communication technologies to become active, adaptive and responsive to people's presence and needs, thereby improving the quality of their lives [1]. there are various definitions of ami, but they all highlight the following features of the underlying technologies: sensitive, responsive, adaptive, transparent, ubiquitous and intelligent [2]. according to the european information society technology advisory group, ami is "a set of properties of an environment that we are in the process of creating" and should be treated as an "imagined concept" and not as a set of specific requirements [3]. ami focuses on users in their environment and emphasizes ease of use, user empowerment and support for human interactions in a seamless, unobtrusive and often invisible way [4].
ami is an emerging discipline today. not only is the necessary supporting technology available, but user demand has also reached the critical point needed for successful development. as cook, augusto and jakkula point out, ami incorporates aspects of context-aware computing 1, disappearing computers 2, and pervasive/ubiquitous 3 computing, and enriches them with artificial intelligence research in the fields of machine learning, agent-based software, and robotics [2].
ami applications have been developed in many fields such as smart homes, health monitoring and assistance, hospitals, transportation, emergency services, education, workplaces, etc. in this project we have focused on smart homes.
"smart home" is the term commonly used to define a residence that has appliances, lighting, heating, air conditioning, tvs, computers, entertainment audio and video systems, security, and camera systems that are capable of communicating with each other and which can be controlled remotely [5].
another definition, according to the smart homes association, is: "the integration of technology and services through home networking for a better quality of living" [6]. smart homes make life easier and more convenient. no matter where you are, the smart system will alert you if something is going wrong in the house. for example, a resident would not only be woken up by a fire alarm notification; the smart home would also unlock doors, dial the fire department and light the path to safety.
1 "context-aware computing is a style of computing in which situational and environmental information about people, places and things is used to anticipate immediate needs and proactively offer enriched, situation-aware and usable content, functions and experiences." – gartner it glossary (www.gartner.com/it-glossary/)
2 "the most profound technologies are those that disappear. they weave themselves into the fabric of everyday life until they are indistinguishable from it" [30]
3 "ubiquitous computing" was first defined by weiser [31]; ibm later called it "pervasive computing" [29]
ami is applied in many functional areas of the smart home: home automation, communication and socialization, resting, refreshing, entertainment, sport, working and learning [7].
home automation covers basic housing support functions, like heating, piping, ventilation, air-conditioning or hpac, lighting and electrical installations, but also home security functions and functions that increase autonomy and support independent living, especially for elderly residents. one example is the european ist amigo project, which developed a networked home system of heterogeneous devices and services from the following domains: personal computing, mobile computing, consumer electronics and home automation [8].
besides improving the quality of people's lives, home automation solutions address energy efficiency by providing comprehensive support for energy savings [9]. electricity bills go down when lights are automatically turned off as a person leaves the room, and rooms can be heated or cooled based on who is there at that moment. some devices can track how much energy each appliance is using and command it to use less.
communication and socialization functions are already well established in homes through the use of landline phones, internet, tv, mobile phones and other hand-held/hands-free devices. one of the further developments in this area will be dynamic networking, where ami technologies would seamlessly put people in contact based on comparable permanent patterns of interest or specific requests, with the use of context modeling techniques [10].
applications of ambient intelligence embedded in clocks, beds, lamps, windows, floors, ceilings, furniture, etc., can improve sleeping and other forms of relaxation in the home. important issues to address here are activity recognition and conflict resolution [11]. another ami trend is to use the time that is spent in the bathroom anyway for other functions; for example, a bathroom mirror could display a clock, news, weather, and advice on health improvement based on the person's weight [12].
entertainment and sport activities are not necessarily done at home, but ami technologies can again enrich this experience. for example, voice recognition could be combined with databases so that a resident can turn on the music by simply humming a few lines of a song [12]. one of the challenges is to strike the right balance between relatively passive enjoyment of multimedia entertainment and interactive engagement with it. regarding exercising at home, friedewald et al. envision that the future trend is to integrate physical exercise capacities into "ordinary" furniture placed in the living room, bedroom or even kitchen [7].
ami technologies promise tremendous benefits for elderly persons living alone. smart systems can notify residents when it is time to take their medicine, alert the hospital if a resident falls, track how much residents are eating, and automatically turn off the water before a tub overflows or turn off the oven if no one is present in the home. smart home systems provide an opportunity for adult children who live elsewhere to participate in the care of their aging parents [6].
an ami environment consists of sensors, controllers and intelligent agents. sensors gather data from the real world, based on which intelligent agents perceive the state of the environment and of the users. intelligent agents are systems that can decide what to do and then do it [13]. they reason about the gathered data using a variety of ami techniques, and act upon the environment using controllers. thus, sensing, reasoning, and acting are the main functional parts of ami algorithms [2].
there are wired and wireless sensors. they can be integrated into the environment or they can be attached to persons or items. an example of the latter case is rfid tags that can be coupled with an rfid reader to monitor the movement of the tagged objects. when analyzing sensor data, ami systems may employ a centralized or a distributed model [14]. recent research implies that wired and wireless distributed computing is a key means of accomplishing established ami goals [1].
reasoning is accomplished through modeling of user behavior, activity prediction and recognition, decision making, and spatial-temporal reasoning [15]. in the mavhome (managing an intelligent versatile home) smart home project, a data mining pre-processor identifies common sequential patterns in data, and then uses those patterns to build a hierarchical model of resident behavior [16].
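the idea of identifying common sequential patterns in sensor data can be illustrated with a minimal sketch. this is not the mavhome algorithm: it simply counts contiguous event sequences of a fixed length across several daily logs and keeps those that occur in enough of them. the event names, the function name `frequent_sequences` and the parameters `length` and `min_support` are assumptions made for illustration only.

```python
from collections import Counter


def frequent_sequences(logs, length=2, min_support=2):
    """Return contiguous event sequences seen in at least `min_support` logs.

    Illustrative simplification (hypothetical helper, not the MavHome method):
    each log is a list of event names in time order, and a sequence is counted
    at most once per log.
    """
    support = Counter()
    for log in logs:
        # Collect the distinct windows of `length` consecutive events in this log.
        windows = {tuple(log[i:i + length]) for i in range(len(log) - length + 1)}
        support.update(windows)
    return {seq: count for seq, count in support.items() if count >= min_support}


# Hypothetical sensor events recorded on three different days.
daily_logs = [
    ["bedroom_light_on", "kitchen_motion", "coffee_on", "tv_on"],
    ["bedroom_light_on", "kitchen_motion", "coffee_on"],
    ["kitchen_motion", "tv_on"],
]
patterns = frequent_sequences(daily_logs, length=2, min_support=2)
```

in this toy data the pairs ("bedroom_light_on", "kitchen_motion") and ("kitchen_motion", "coffee_on") each appear on two days, so they are the patterns a higher-level behavior model could be built from.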
luhr goes even further and uses video data to find sequential association rules in resident actions [17]. other examples of projects which adaptively control home environments by anticipating the location, routes and activities of their residents are the neural network house [18], the intelligent home [19] and the placelab [20].
ami systems execute actions through various controllers, robots and other intelligent and assistive devices. mobile robot assistants are already found in nursing homes [21], where they were developed to assist elderly individuals with mild cognitive and physical impairments, as well as to support nurses in their daily activities. the goal of many leading smart home projects associated with wearable/implantable monitoring systems and assistive robotics is to allow older people to live autonomously in a comfortable and secure environment [22].
an important challenge for ambient intelligence is how to make technology learn about people and their identity and apply such knowledge in varying contexts, while at the same time securing a sufficient degree of privacy and protection against misuse, so that people will trust the intelligent world that surrounds them. the future of ami depends on how successfully it addresses those important security and trust issues [23].
other areas of further research within the domain of smart homes are better use of resources, home security, appliance management, digital entertainment, energy management and assistive computing/health care, as well as smart environments to support elderly and disabled persons [6].
2.2. intelligent home media centers
various solutions for intelligent media players are available on the market, with numerous advantages as well as some challenges. in this paper we discuss some of the best media players, which were examined as a starting point in our development of a personal intelligent home media center.
one widely used open-source media player is kodi (formerly xbmc), developed by the xbmc foundation, a non-profit technology consortium. it is a media center for playing videos, music, pictures, games, and more. kodi runs on linux, os x, windows, ios and android, with a 10-foot user interface for use with televisions and remote controls. it allows users to play and view videos, music, podcasts, and other digital media files from local and network storage media and the internet. one of the challenges is that it needs a 3d-capable graphics hardware controller for rendering. an additional issue is that kodi's internal cross-platform video and audio players (dvdplayer and paplayer) cannot play any audio or video files that are protected/encrypted with drm (digital rights management) technologies for access control [24].
other examples of home media center solutions are the noontec n5 gigalink and the cisco-linksys media center extender with dvd.
noontec n5 gigalink is a reliable storage server which can back up a large number of multimedia files (hard drive required), such as digital pictures, music and movies. it supports upnp and dlna functions, so users can browse pictures, listen to their favorite music or watch 1080p high-definition movies on a high-definition tv, ps3, xbox360 or other players connected to the home network, which gives them the experience of a truly digital home. it supports mobile access from iphone, ipad, and android smartphones and tablets connected through the local area network. it can also act as a file server, ftp server, and samba server [25].
the linksys dma2200 connects the latest 1080p dvd players with windows media center and streams user‟s digital music, movies and photos to any tv in their home wirelessly. with elegant and easy navigation menu screens, users can play dvds, view family slide shows, browse music collection by cover art, listen to entire playlists or choose from a vast selection of internet radio stations from all over the world. the linksys media center extender and user‟s windows vista media center pc give a complete pvr solution-allowing them to watch, pause, rewind and record live tv (pc-embedded or optional tv tuner required). another approach is self-made personalized smart tv which uses client-server architecture together with ami to implement popular tv functions. the best three unique functions of this approach are recognizing user‟s gestures to control a tv, creating collaborative recommendations from social opinions and using environmental collaboration for enabling context-aware services [25]. and final approach, similar to our solution, is custom-made intelligent home media center such as intelligent multimedia service system (imss) [27] and ubiquitous-hybrid multimedia system (u-hms) [28]. imss is based on context awareness and ubiquitous computing. it provides multimedia interoperability among incompatible multimedia devices, device specific video encoding, copyright and license management. similarly, u-hms system offers multimedia interoperability among incompatible devices, transparent services, authentication method and security services. it is based on wireless sensor network (wsn), context awareness, and mpeg-21 dia/video transcoding. disadvantage of these two systems is that there are based on external services such as certificate authority (ca), digital object identifier (doi), multimedia/content management system (mms/cms) and license management system (lms). 3. 
design of an intelligent home media center

this chapter describes the development of an intelligent home media center together with the basic usage scenarios. all required components are listed with a short description and their required and desirable features. setting up an intelligent home media center requires the following hardware components:
• network-attached storage (in further text nas) for storing all data
• main server for sharing meta data between clients and for hosting the web server
• two microcomputers
• sensors
• power switches
• mobile devices (mobile phone, laptop)
• router
• output device (tv).

main communication between devices will go through the main server. one microcomputer will communicate directly with the nas storage when its multimedia content is played. all other communication will go through the main server. this approach makes it possible to store a history of all actions in the system.

fig. 1 design of an intelligent home media center

3.1. infrastructure overview

nas storage

nas storage will be used to store all multimedia content. this storage should be always on and accessible to all users inside the private network. for the outside world this storage should be invisible. the server used for nas storage should have a large hard drive for storing a lot of multimedia content. the processor does not need to be powerful and the amount of ram is not crucial. scheduled tasks will run on this server to scan the whole system and send information about newly available multimedia content to the main server. the file share on this server should be available at all times to all users in the network. this server does not need any kind of graphical user interface (gui) since only data will be passed between the nas storage and other devices.
an additional requirement is to have a file share which is accessible from different operating systems, both mobile and desktop.

main server

this server should have more processing power than the nas server. the amount of ram on this server should also be larger than on the nas server because it will communicate with a number of end-users and devices at once. on the other hand, hard drive space does not need to be large. this server will have a database with meta data about all multimedia content available in the system. it should be possible to share the content meta data inside the private network over the http protocol, and with outside networks over the http and email protocols. that requires an email server running on the main server. the main server should also have a web server installed, through which other devices will access the list of available multimedia content. in addition, the main server should work as a "main switch" which allows end-users to turn off any device in the system. the main server will also be able to automatically turn on any device in the system if needed. these features should also be accessible over the web through implemented web services. there is no need for a desktop gui on the main server, but terminal access over the ssh protocol is required.

microcomputers

with the microcomputer(s), end-users will browse multimedia content over the http protocol. software for playing multimedia content is required, as well as a graphics card which can handle high-quality video. processing power and amount of ram for these computers should be average. these computers should be able to work with sensors or readers which can recognize the presence of certain persons and notify the main server about it. the hard drive capacity of these computers can be very low. these computers must have a gui which can be controlled with an input device (such as a tv remote).
sensors

the system should contain sensors which can detect the presence of a certain person or device inside the network. these sensors should only detect the presence; the microcomputer should send the information about the detected person to the main server. the sensors should also be able to record additional variables inside an intelligent home media center, such as what the weather is, whether the light is on, whether the person is moving, etc. according to this requirement, for example, temperature sensors could be used to measure temperature, nfc tag readers to recognize persons, etc.

other devices

the system should also contain power switches, mobile devices, a router and one output device. power switches should be used to turn on the microcomputers. mobile devices will be used for controlling the system through a native application which communicates with the main server. this application will provide basic functionality for controlling the system and should be able to communicate with the main server from both private and public networks. the system must have a router which provides internet access for all computers inside the private network. this router must allow a static identifier to be set for every device. some services on the devices must also be visible from the outside world. an output device for the client must be present in the system. on the output device users will watch multimedia content and see the status of the system.

3.2. software communications and relations

communication between software components is illustrated in figure 2. a mysql database will be installed on the main server. only the main server should have access to this database. the main server should also host a web page together with an api visible to the outside world, and should allow communication over the ssh protocol.
the android application will communicate with the main server through the web page, the rest api and an ssh connection. the microcomputer with various sensors will communicate only with the main server and only via the rest api. an nfc tag reader will be connected to one microcomputer. the other microcomputer, with a bundle of sensors, will be connected to a tv and will communicate with both the samba share server on the nas and the web interface on the main server. the nas storage will have an ssh connection open to the outside world and a samba file share which makes all multimedia content available to all users of the intelligent home media center.

fig. 2 software communications and relations

3.3 usage scenario

the activity diagram for the usage scenario is presented in figure 3. when a person comes within range of the intelligent home media center, the microcomputer detects that through its sensors. the microcomputer contacts the main server to inform it about the presence of a person. the main server turns on all devices in the network. users of the intelligent home media center can control the center from mobile devices or through the microcomputer which is connected to the output device. through the microcomputer and output device the user can browse all multimedia content in the network. the system will also be able to suggest to the end-user what to watch or listen to. for this particular use case the concept of ambient intelligence should be implemented: the microcomputer will, based on the data received from sensors and the user profile, offer appropriate content.

fig. 3 activity diagram for the usage scenario

4. implementation

this chapter describes the implementation of an intelligent home media center in detail. implementation details for each component are discussed, together with best practices and approaches.

4.1.
nas storage

for the operating system on the nas storage we chose linux centos 7 with a minimal installation. a samba file share is installed on the server. besides the samba share, only the perl programming language and the apache server are installed. this server communicates with the main server over rest (representational state transfer). an xml settings file on this server stores all paths for multimedia content inside the system. cron jobs are used for scheduled scanning of the system: each day at 5:00 am a cron job runs a perl script. the perl script first reads the xml settings file and searches each of the paths for available content. for each movie, the perl script uses a rest client to get meta content from the open movie database api (omdb). when all data is collected, it is packed in json format and sent to the main server, again over rest. since this operation is resource-intensive, it is run when the nas does not have any other pending requests. if system usage is over 50%, a bash script creates a new cron job which starts the same perl script 30 minutes later. the apache server is used to serve images over the http protocol. since not all images are directly accessible on the server, another daily cron job creates symbolic links from the images folder to the root folder of the apache server. there is also a rest service which is used to start bash scripts. bash scripts are used for basic data manipulation and for shutting down the system. each time the nas server receives a request from the network, it informs the main server about it.

4.2. main server

the main server machine has more processing power than the one used for the nas server. the main server also does not require a gui, but it requires a stable and powerful operating system. again centos 7 with a minimal installation was chosen. on the main server, the apache web server is installed alongside a mysql database, an email client and a php rest server.
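the load-dependent rescheduling of the nas scan described in section 4.1 can be illustrated with a short sketch. the original system uses perl and bash; the python below is only an illustration of the decision rule, and the names `plan_scan` and `pack_metadata`, as well as the shape of the json payload, are hypothetical and not taken from the implementation.

```python
import json
from datetime import datetime, timedelta

USAGE_LIMIT = 50.0                   # percent; threshold stated in the text
RETRY_DELAY = timedelta(minutes=30)  # deferral interval stated in the text


def plan_scan(now, usage_percent):
    """Decide whether to scan now or defer the scan by 30 minutes.

    Returns ("run", now) when usage is at or below the limit,
    otherwise ("defer", now + 30 minutes).
    """
    if usage_percent > USAGE_LIMIT:
        return ("defer", now + RETRY_DELAY)
    return ("run", now)


def pack_metadata(items):
    """Serialize collected metadata records to JSON for the REST upload.

    The {"items": [...]} envelope is an assumed layout.
    """
    return json.dumps({"items": items}, sort_keys=True)


if __name__ == "__main__":
    print(plan_scan(datetime(2015, 6, 1, 5, 0), usage_percent=72.0))
    print(pack_metadata([{"title": "example movie", "path": "/media/movies"}]))
```

in a cron-based setup the "defer" branch would correspond to registering a one-off job for the returned time rather than sleeping inside the script.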
the php rest server is used to collect all requests from devices in the system. the main server receives meta data about multimedia content from the nas server. all received meta data is stored in a database. the main server can also receive a rest request for turning off a device in the system, sending content by email or displaying data. the main usage scenarios are:
• meta data about multimedia content is synchronized between the nas server and the main server database: when the nas server sends meta data about multimedia content, all meta data for content which is no longer on the nas server is removed from the main server database, and all new content is added to the database.
• when a request for turning off a device is received, the main server turns off the selected device through an ssh connection. turning off can also be scheduled over ssh.
• when a request for sharing meta data through email is received, the main server sends the meta data content via email to the desired addresses.
• when a request for displaying data is received, the main server collects data from the database and sends it in json format. this use case can occur only when the native mobile application is communicating with the server.

the apache web server hosts a web page which displays all multimedia meta data from the database. besides that, the web page also implements the following functionalities: a turn-off request for a selected device, the power status of all devices in the system, requests for putting devices to sleep, and a download request for desired content through a torrent client. one of the implemented web pages is the “what to do” page. when the user visits this page, the system asks if the user wants to watch images, play music or watch videos. based on a few simple questions (“how do you feel?”, “how much time do you have?”, “what would you like to do?”) and the user's responses, the system offers appropriate multimedia content to the user.
this small feature is based on the concept of ambient intelligence. each time the user is looking at some multimedia content, the microcomputer writes to a database the genre of the content, the weather, the time of day (so we can track what the user wants to do at what time) and the user's mood. based on the user's answers, the time of day, the weather and previous data, the system offers certain multimedia content to the user. the table with user profiles contains zero or more entries for each user. when the user has answered all the questions, data from the user profile database is checked. if there are no entries for the user, all content that fulfills the user's request is displayed. if there are entries in the user profile database, only entries younger than two months are selected (we assume that the user changes habits and expectations for multimedia content). if there are entries with the same weather and time of day as at the time the user answered the questions, they are used. otherwise all available entries from the user profile database are used. in the end a hash of the user's preferences is created. for example, if the user wants to watch a movie, the hash with the user's preferences contains the favorite genre, the average duration of the movies the user watched, the average imdb rating of the movies the user watched, etc. the preferences hash, together with the user's request, is used to search for appropriate content for the user. if the user wants to exercise and has less than half an hour available, the main server will offer p90x cardio training since this training lasts only 20 minutes and can be done indoors. an example of using the “what to do” feature on an android device is presented below:

fig. 4 android application for the intelligent home media center

4.3. micro computers

we chose two raspberry pi 2 model b microcomputers because of their price, performance and availability.
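the profile-based selection of the “what to do” feature described in section 4.2 can be sketched as follows. this is a minimal illustration under stated assumptions, not the authors' implementation: the record fields (`watched_at`, `weather`, `time_of_day`, `genre`, `runtime_min`, `imdb_rating`) and the function name `build_preferences` are hypothetical, "two months" is approximated as 60 days, and the case where a user has only old entries is treated like having no profile, which the text does not specify.

```python
from collections import Counter
from datetime import datetime, timedelta
from statistics import mean

RECENT = timedelta(days=60)  # "two months", approximated as 60 days


def build_preferences(entries, now, weather, time_of_day):
    """Build the preferences hash from a user's profile entries.

    Returns None when no usable entries exist, in which case the caller
    would show all content that fulfills the request.
    """
    # Keep only entries younger than two months.
    recent = [e for e in entries if now - e["watched_at"] <= RECENT]
    if not recent:
        return None

    # Prefer entries recorded under the same weather and time of day;
    # otherwise fall back to all recent entries.
    same_context = [
        e for e in recent
        if e["weather"] == weather and e["time_of_day"] == time_of_day
    ]
    used = same_context or recent

    genres = Counter(e["genre"] for e in used)
    return {
        "favorite_genre": genres.most_common(1)[0][0],
        "avg_runtime_min": mean(e["runtime_min"] for e in used),
        "avg_imdb_rating": mean(e["imdb_rating"] for e in used),
    }


if __name__ == "__main__":
    history = [
        {"watched_at": datetime(2015, 5, 20), "weather": "rain",
         "time_of_day": "evening", "genre": "drama",
         "runtime_min": 100, "imdb_rating": 7.0},
        {"watched_at": datetime(2015, 5, 25), "weather": "rain",
         "time_of_day": "evening", "genre": "drama",
         "runtime_min": 120, "imdb_rating": 8.0},
        {"watched_at": datetime(2015, 5, 28), "weather": "sun",
         "time_of_day": "morning", "genre": "comedy",
         "runtime_min": 90, "imdb_rating": 6.0},
        {"watched_at": datetime(2015, 1, 1), "weather": "rain",
         "time_of_day": "evening", "genre": "horror",
         "runtime_min": 95, "imdb_rating": 5.0},
    ]
    print(build_preferences(history, datetime(2015, 6, 1), "rain", "evening"))
```

the resulting dictionary plays the role of the preferences hash; matching it against the content catalogue together with the user's request is left out of the sketch.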
we have chosen the raspbian operating system because of its expandability and large user community. one microcomputer is connected to the output device. on this raspberry pi two additional components have been installed: the lib-cec library and the kodi media player. the lib-cec library has been installed to monitor events from the remote of the output device to which the raspberry pi microcomputer is connected. the keyboard arrow and enter keys have been mapped, together with a few other buttons, to allow browsing menus with the output device's remote and turning off the raspberry pi with the remote. the kodi media player is used to play movies and music from the nas storage. an nfc tag reader is connected to the other raspberry pi. the purpose of the nfc tag reader is to detect when certain persons are nearby, within the radius of the intelligent home media center. when a person's presence is detected, the system sends a rest request to the main server to turn on all devices. the nfc tag reader is placed at the door entrance. users have nfc tag bracelets and key chains. when entering through the door, the user should place the nfc tag near the reader so that the tag gets recognized.

4.4. mobile devices

we used a mobile phone with the android operating system. a native application has been developed to communicate with the main server over the rest api. this application can communicate with the main server even from outside the private network because port 80 on the main server is visible to the outside world.

4.5. output device

we have used an lg tv as the output device. our tv has hdmi-cec support (this feature is called simplink on lg tvs). it is important to use an hdmi cable with cec support, otherwise the tv's remote events will not be passed to the microcomputer.

4.6. components setup

the main server, nas server and router are in the same room. since a large amount of data is transferred through the nas server, the nas server is connected to the router via a lan cable. the main server is also connected to the router over a lan cable because it communicates frequently with the nas server.
all other devices in the system are connected to the router over wi-fi. since the microcomputers do not have built-in wi-fi receivers, small external wi-fi receivers are provided. the nfc card reader is connected to one of the microcomputers to detect the presence of certain tags nearby. the second microcomputer is connected to the tv. one or more mobile devices can be in the private network.

5. conclusion

in this paper we have explained in detail the design and implementation of a personal intelligent home media center. our solution has multiple advantages compared to a classic home media center, such as a dvd player. compared to a dvd player or a home theater, our solution is more comfortable to use because it allows users to control the system from multiple devices (such as a mobile phone, laptop, tv remote, etc.). also, most home theater systems support only a few multimedia content types, while a self-built solution offers the possibility to stream almost any content. compared to a smart tv, our solution is cheaper, has a richer set of features and is open for communication with any device over the rest server.

the intelligent home media center is developed to satisfy the personal needs of the end-user. the solution is very scalable and it is easy to add more components such as clients, servers and other devices. it is inexpensive to build, it is fast and it is intelligent. it uses sensors to detect a person's presence and concepts of ambient intelligence to recommend appropriate media to the user. it is easily maintainable, devices can be replaced or upgraded, and additional software features can be implemented.

there are also some disadvantages of the design and implementation of this personal intelligent home media center. it requires good knowledge of various information technologies. consequently, compared to other alternatives, it is not a plug & play solution.
there are many components in the system, and if one component is not configured correctly, the whole system will not work. we will try to address these disadvantages in our future work.

further development of our personal intelligent home media center will be focused on integration with digital interactive television. based on the recognized mood of the user and the user's individual profile, the system could automatically select tv content. this integration could be especially exploited in the interactive tv advertising landscape [32], [33]. in addition, further development will introduce additional security features, such as parental control and user privacy.

references
[1] c. benavente-peces, a. ahrens, j. filipe, “advances in technologies and techniques for ambient intelligence”, journal of ambient intelligence and humanized computing, 2014, vol. 5, no. 5, pp. 621-622.
[2] d. j. cook, j. c. augusto, v. r. jakkula, “ambient intelligence: technologies, applications, and opportunities”, pervasive and mobile computing, vol. 5, no. 4, pp. 277-298.
[3] k. ducatel, m. bogdanowicz, f. scapolo, ambient intelligence: from vision to reality. ist advisory group, 1–31. retrieved from http://scholar.google.com/scholar?hl=en&btng=search&q=intitle:ambient+intelligence+:+from+vision+to+reality#1
[4] k. ducatel, m. bogdanowicz, f. scapolo, j. leijten, j.-c. burgelman, istag scenarios for ambient intelligence in 2010. society, 58. retrieved from ftp://ftp.cordis.europa.eu/pub/ist/docs/istagscenarios2010.pdf
[5] internet source: http://www.smarthomeusa.com/smarthome, retrieved 2015
[6] j.-c. rosslin, k. tai-hoon, “applications, systems and methods in smart home technology: a review”, international journal of advanced science and technology, 2010, vol. 15, pp. 37-48.
[7] m. friedewald, o. da costa, y. punie, p. alahuhta, s. heinonen, “perspectives of ambient intelligence in the home environment”, telematics and informatics, 2005, vol. 22, no. 3, pp. 221-238.
[8] amigo: ambient intelligence for the networked home environment. (2008). retrieved from http://www.hitech-projects.com/euprojects/amigo.
[9] a. de paola, s. gaglio, g. lo re, m. ortolani, “sensor 9k: a testbed for designing and experimenting with wsn-based ambient intelligence applications”, pervasive and mobile computing, vol. 8, no. 3, pp. 448–466, 2012.
[10] a. sorici, g. picard, o. boissier, a. zimmermann, a. florea, “consert: applying semantic web technologies to context modeling in ambient intelligence”, computers & electrical engineering, vol. 44, pp. 280-306, 2012.
[11] f. sebbak, a. chibani, y. amirat, a. mokhtari, f. benhammadi, “an evidential fusion approach for activity recognition in ambient intelligence environments”, robotics and autonomous systems, vol. 61, no. 11, pp. 1235-1245, 2013.
[12] peterson, k. e. (2002). if high-tech it is your idea of paradise, welcome to valhalla.
[13] s. russell, p. norvig, “artificial intelligence: a modern approach”, international dental journal, vol. 60.
[14] i. f. akyildiz, w. su, y. sankarasubramaniam, e. cayirci, “a survey on sensor networks”, ieee communications magazine, vol. 40, no. 8, pp. 102-105, 2002.
[15] galton, a. (2000). qualitative spatial change. oxford university press.
[16] d. cook, m. youngblood, s. das, “a multi-agent approach to controlling a smart environment”, designing smart homes, pp. 165-182.
[17] s. lühr, g. west, s.
venkatesh, “recognition of emergent human behaviour in a smart home: a data mining approach”, pervasive and mobile computing, vol. 3, no. 2, pp. 95-116, 2007.
[18] m.c. mozer, “lessons from an adaptive home”, smart environments, pp. 271-294, 2005.
[19] v. lesser, m. atighetchi, b. benyo, b. horling, a. raja, r. vincent, s. x. q. zhang, “the intelligent home testbed”, environment, vol. 2, no. 15, 1999.
[20] s. s. intille, k. larson, j. s. beaudin, j. nawyn, e. munguia tapia, p. kaushik, “a living laboratory for the design and evaluation of ubiquitous computing technologies”, in proceedings of the chi ’05 extended abstracts on human factors in computing systems, 2005, pp. 1941-1944.
[21] j. pineau, m. montemerlo, m. pollack, n. roy, s. thrun, “towards robotic assistants in nursing homes: challenges and results”, robotics and autonomous systems, vol. 42, no. 3-4, pp. 271-281, 2003.
[22] m. chan, d. estève, c. escriba, e. campo, “a review of smart homes-present state and future challenges”, computer methods and programs in biomedicine, vol. 91, no. 1, pp. 55-81, 2008.
[23] m. friedewald, e. vildjiounaite, y. punie, d. wright, “privacy, identity and security in ambient intelligence: a scenario analysis”, telematics and informatics, vol. 24, no. 1, 2007.
[24] internet source: http://kodi.tv/about/, retrieved 2015
[25] internet source: http://www.digilifeonline.com.au/, retrieved 2015
[26] l. wei-po, c. kaoli, j.-y. huang, “a smart tv system with body-gesture control, tag-based rating and context-aware recommendation”, knowledge-based systems, vol. 56, pp. 167-178, 2014.
[27] j. park, h. park, s. lee, j. choi, d. lee, “intelligent multimedia service system based on context awareness in smart home”, context, pp. 1146–1152, 2005.
[28] j. h. park, s. lee, j. lim, l.t. yang, “u-hms: hybrid system for secure intelligent multimedia data services in ubi-home”, journal of intelligent manufacturing, vol. 20, no. 3, pp. 337-346, 2009.
[29] krill, p. (2000).
ibm research envisions pervasive computing.
[30] m. weiser, “the computer for the twenty-first century”, scientific american, vol. 265, no. 3, pp. 94–104, 1991.
[31] m. weiser, “hot topics-ubiquitous computing”, computer, vol. 26, no. 10, 1993.
[32] e. athanasiadis, s. mitropoulos, “a distributed platform for personalized advertising in digital interactive tv environments”, journal of systems and software, vol. 83, no. 8, pp. 1453-1469, 2010.
[33] g. lekakos, d. papakiriakopoulos, k. chorianopoulos, an integrated approach to interactive and personalized tv advertising. channels, 1-10, 2001.

facta universitatis series: electronics and energetics vol. 36, no 1, march 2023, pp. 17-29 https://doi.org/10.2298/fuee2301017k © 2023 by university of niš, serbia | creative commons license: cc by-nc-nd original scientific paper

the impact of finite dimensions on the sensing performance of terahertz metamaterial absorber

anja kovačević, milka potrebić, dejan tošić
university of belgrade, school of electrical engineering, belgrade, serbia

abstract. this paper investigates the impact of a finite number of unit cells on the sensing performance of a chosen thz metamaterial absorber. sensor models with the number of unit cells varying from 16 to infinite have been created using wipl-d software. the comparison shows that as the sensor’s size increases, its absorption response becomes more similar to that of an infinite sensor structure. a metamaterial absorber with 50 unit cells behaves similarly to the infinite absorber in terms of the corresponding frequency and amplitude shifts when an h9n2 virus sample of variable thickness is uniformly deposited on top of the sensor’s surface. an uneven distribution of the sample affects the sensor’s absorption response, which has been demonstrated on the example of a sensor with 50 unit cells.
key words: thz metamaterial absorber, finite dimensions, absorption response, h9n2 virus sample

1. introduction

various metamaterials have been artificially designed to manipulate electromagnetic (em) waves in a manner that enables their functional use in a wide range of device applications such as switches, modulators, filters and sensors [1]. basically, metamaterials are structures with periodic sub-wavelength metallic [1] or dielectric [2, 3] patterns that possess em properties not found in natural materials [2]. metallic metamaterial structures inherently have dissipation losses which can be used to enhance their absorption capabilities [1]. metamaterial absorbers (ma) are devices that can minimize the reflection and theoretically eliminate the transmission of the incident em wave [4]. they are typically designed as metal-dielectric-metal structures [1, 5–7], but other possible designs include dielectric grating-based structures [4], integrated microfluidic structures [8] and dielectric-metal structures [2]. mas can be used in solar power harvesting, material detection, thermal imaging and sensing [4].

received may 08, 2022; revised july 02, 2022; accepted july 16, 2022
corresponding author: milka potrebić, university of belgrade, school of electrical engineering, belgrade, serbia, e-mail: milka.p@mts.rs

mas that work in the terahertz (thz) domain are crucial for bio-sensing applications since the vibration resonances of biomolecules coincide with the thz range [9]. besides that, thz technology has several advantages relevant to the field of bio-sensing, such as its non-ionizing property and strong penetration capability [10]. sensors based on thz ma can be used to detect various virus subtypes with a wide range of particle sizes [11].
since a physically realizable sensor has finite dimensions and therefore its structure cannot be fully periodic, we wanted to investigate the impact of a finite number of unit cells on the sensing performance. first, we had to come up with a proper modelling technique for both the infinite and the finite sensor structure in wipl-d software. the whole modelling process, alongside the geometrical and material properties of the chosen thz ma, is described in section 2. in section 3, the results that describe the behavior of the modelled sensor structures with and without the sample are presented and thoroughly discussed, including the case when the sample is unevenly spread across the ma’s surface.

2. sensor design and modelling process

for the purpose of investigating the impact of finite dimensions on sensing performance, we have selected the quad-band metamaterial absorber presented in [5]. the chosen ma is a typical planar metal-dielectric-metal structure whose quad-band absorption is achieved by introducing a slight deformation to the traditional rectangular metallic resonator rather than using multiple single-band resonators of different sizes. although there are four resonant frequencies, we focus our analysis on the range of the first resonant frequency, which is below 1 thz, but the concept can be extended to higher frequencies.

2.1. unit cell and modelling of infinite sensor structure

the unit cell structure is composed of a metallic ground layer and a perforated metallic resonator separated by a lossy polyimide dielectric spacer. the dimensions of interest are given in figure 1. both metallic layers are made of gold, whose conductivity varies with frequency, but since the frequency range of interest is below 1 thz, the fixed value of 40.9 MS/m used in [5] is sufficient for obtaining good-quality results.
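with a fixed gold conductivity of 40.9 MS/m, the skin depth at the upper end of the frequency range can be estimated from the standard good-conductor formula δ = sqrt(2 / (ω μ0 σ)). the sketch below is a quick plausibility check, not part of the authors' workflow; it assumes non-magnetic gold (μ = μ0), and reading the 0.4 μm value listed in figure 1 as the metal layer thickness is also an assumption.

```python
import math

MU0 = 4e-7 * math.pi   # vacuum permeability, H/m
SIGMA_GOLD = 40.9e6    # S/m, fixed conductivity value quoted from [5]


def skin_depth(freq_hz, sigma=SIGMA_GOLD, mu=MU0):
    """Skin depth of a good conductor: sqrt(2 / (omega * mu * sigma))."""
    omega = 2.0 * math.pi * freq_hz
    return math.sqrt(2.0 / (omega * mu * sigma))


if __name__ == "__main__":
    for f_thz in (0.5, 1.0):
        delta_nm = skin_depth(f_thz * 1e12) * 1e9
        print(f"{f_thz} THz: skin depth ~ {delta_nm:.1f} nm")
```

at 1 thz this gives a skin depth of roughly 79 nm, several times smaller than a 0.4 μm metal layer, and the value grows only as the inverse square root of frequency towards lower frequencies.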
if the analysis is to be extended to higher frequencies, the variation of conductivity can be taken into account by using the drude model [12]. in addition, the ground layer is thicker than the skin depth in the whole frequency range of interest, which is essential for proper isolation between the substrate and the sensor itself.

fig. 1 thz ma unit cell with given dimensions (units: μm; dielectric: ɛ = 3(1+j0.05); dimension labels in the figure: 9.5, 25, 35, 2.5, 45.5, 10, 9, 0.4, 0.4; axes x, y, z)

metamaterials are composed of a large number of meta-atoms represented by unit cells. consequently, a proper model of an infinite ma structure implies creating an orthogonal lattice of unit cells through periodic repetition along the x- and y-axes, which can be achieved by using periodic boundary conditions (pbc). pbc are a set of boundary conditions applied for the analysis of infinite 2d em structures by using a single unit cell [13]. the modelling process of an infinite sensor structure in wipl-d software using the pbc option consists of three main steps. since the pbc option is only available in scatterer operation mode, the first step is to choose an adequate scatterer mode. the bistatic radar cross-section (rcs) mode is more suitable for this particular structure considering the fixed position of the field generator. the second step involves setting the values that define the unit cell in terms of the occupied space and spatial repetition. the x and y values correspond to the start and end coordinates of the unit cell in the xy-plane, while the z values are recommended to be set 10% higher than the cell size determined by its geometry [13]. ports 1 and 2 have been positioned at the top and at the bottom of the structure, respectively.
the last step consists of building the planar unit cell structure by defining plates and their domains, determined by the materials used, and finally specifying the source as a transverse electromagnetic (tem) plane wave normally incident on the sensor surface, together with the frequency range of interest. additionally, the quality of the planar structure model can be significantly improved by using imaging and edging.

2.2. modelling of finite sensor structure

in order to create a model of a finite sensor structure in wipl-d, the whole modelling process has to be done manually since the pbc option is no longer suitable, which makes it significantly more time-consuming. despite these difficulties, modelling the finite sensor has some significant advantages, such as the ability to analyze the impact of the end effects which are inevitably present in a physically realizable structure, and the possibility of modelling an uneven distribution of the sample across the sensor’s surface, which will be demonstrated in section 3. to fully investigate the impact of dimensions on sensor performance, we have created models for different numbers of unit cells (16, 50, 100 and 400). although the expected dimensions of a metamaterial biosensor device for experimental measurements are around 12 mm x 12 mm [14], which is equivalent to 24000 unit cells of the sensor observed in this paper, the thz source is usually focused on a much smaller area of the metamaterial sensor (approximately 1 mm² [15], which is equivalent to around 167 unit cells). in order to improve the efficiency of simulations, we have exploited the symmetry of the modelled structure and the excitation by using a symmetry plane in our models, which has cut the number of unknowns roughly in half without compromising the results of the numerical calculations. the simulation frequency range was set to the frequency range of the first resonant peak of the infinite structure.
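the two cell counts quoted above are mutually consistent, as a short calculation shows. the unit cell area below is inferred from the quoted figures (12 mm x 12 mm corresponding to 24000 cells) and is not stated explicitly in the text:

$$A_{\text{cell}} \approx \frac{12\,\text{mm} \times 12\,\text{mm}}{24000} = 6 \times 10^{-3}\,\text{mm}^2 = 6000\,\mu\text{m}^2, \qquad N_{\text{spot}} \approx \frac{1\,\text{mm}^2}{6000\,\mu\text{m}^2} \approx 167 .$$

the finite models with 100 and 400 cells therefore bracket the number of cells covered by a focused spot of about 1 mm².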
the example of modelling a sensor of finite dimensions is given for the structure made of 50 cells in figure 2.

fig. 2 modelling of the sensor with 50 unit cells (the pink plane represents the used symmetry plane)

the main difficulty that occurred during the modelling process was how to define the sensor ports adequately so that the results can be compared with the previously obtained results for an infinite structure. the main goal is to determine the scattering parameters of the sensor, which can be achieved by mimicking the process that is built into the pbc option. due to the existence of the ground plane (figure 1), the transmission coefficients s21 and s12 are practically brought down to zero. for that reason, in order to reduce the complexity of the analysis, we have observed only s11. by definition, |s11| is:

$|S_{11}| = \sqrt{P_{\mathrm{refl}} / P_{\mathrm{inc}}}$ (1)

where $P_{\mathrm{refl}}$ and $P_{\mathrm{inc}}$ are the powers of the reflected and incident wave, which had to be calculated in order to determine s11. it should be noted that definition (1) is valid only on condition that s12 is equal to zero, which has been fulfilled. to calculate these powers, we have simulated the near field distribution in a plane parallel to the sensor’s surface. the power of the wave can be calculated by using the complex poynting vector $\mathbf{S}$:

$P = \mathrm{Re}\left\{ \int_{S'} \mathbf{S} \cdot \mathrm{d}\mathbf{S}' \right\} = \mathrm{Re}\left\{ \int_{S'} (\mathbf{E} \times \mathbf{H}^{*}) \cdot \mathrm{d}\mathbf{S}' \right\}$ (2)

where $\mathbf{E}$ and $\mathbf{H}$ are the field vectors described by their x, y and z components and $\mathrm{d}\mathbf{S}' = \mathrm{d}S'\,\mathbf{i}_z$ is the vector of the infinitesimally small surface $\mathrm{d}S'$.
after arranging expression (2), it is necessary to discretize it, since the analysis has been conducted in a finite number of points $N = N_x \cdot N_y$:

$P = \mathrm{Re}\left\{ \Delta S' \sum_{N} (E_x H_y^{*} - E_y H_x^{*}) \right\}$ (3)

where $N_x$ and $N_y$ are the numbers of points along the x- and y-axes in which the near field distribution has been calculated and $\Delta S' = S' / N$ is an elementary surface of the observed surface $S'$. we have assumed that all the surfaces $\Delta S'$ are equal and small enough that the field distribution is approximately constant within them. in order to find the optimal N which supports this assumption, we have varied the total number of points in which the near field distribution is calculated from 441 to 10201. we have concluded that increasing the number of points does not lead to significant variations in the results. therefore, we have set the total number of points to 441 for structures of 16 and 50 cells, 1681 for 100 cells and 3721 for 400 cells. in order to get the near field distribution for the incident wave, we have created a separate model with a single wire that does not have a significant impact on the field. the incident wave only has $E_x$ and $H_y$ components, from here on marked as $E_{x0}$ and $H_{y0}$, thus simplifying power formula (3) to:

$P_{\mathrm{inc}} = -\mathrm{Re}\left\{ \Delta S' \sum_{N} E_{x0} H_{y0}^{*} \right\}$ (4)

the minus sign has been added because the incident wave enters the surface. for calculating the power of the reflected wave, we have used the field components imported from the sensor model, from which we have subtracted the field components of the incident wave to obtain the fields of the reflected wave $E_{ir}$ and $H_{ir}$ (i = x, y):

$P_{\mathrm{refl}} = \mathrm{Re}\left\{ \Delta S' \sum_{N} (E_{xr} H_{yr}^{*} - E_{yr} H_{xr}^{*}) \right\}$ (5)

finally, we have used (1) to determine |s11| for different frequencies from the operating range.
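the discretized power calculation in (3)-(5) can be sketched in a few lines of python. this is a minimal illustration under stated assumptions, not the authors' wipl-d post-processing: the function names, the dictionary layout of the field samples and the synthetic plane-wave data are hypothetical, and |s11| is taken as the square root of the power ratio so that |s11|² is the normalized reflected power.

```python
import numpy as np

FIELD_KEYS = ("ex", "ey", "hx", "hy")  # hypothetical layout of the sampled fields


def plane_power(fields, delta_s):
    """Discretized power through the plane: Re{dS' * sum(Ex Hy* - Ey Hx*)}."""
    integrand = (fields["ex"] * np.conj(fields["hy"])
                 - fields["ey"] * np.conj(fields["hx"]))
    return float(np.real(delta_s * np.sum(integrand)))


def s11_magnitude(total, incident, delta_s):
    """|S11| from total and incident near fields sampled on the same grid."""
    # Reflected field = total field minus incident field, component by component.
    reflected = {k: total[k] - incident[k] for k in FIELD_KEYS}
    p_inc = -plane_power(incident, delta_s)   # minus sign: the wave enters the surface
    p_refl = plane_power(reflected, delta_s)
    return float(np.sqrt(p_refl / p_inc))


# Synthetic check (assumed data): an incident plane wave travelling towards -z
# and a reflected wave with reflection coefficient 0.3 travelling towards +z.
ETA0 = 376.73                      # free-space wave impedance in ohms
shape = (21, 21)                   # 441 points, the smallest grid used in the text
zeros = np.zeros(shape, dtype=complex)
ones = np.ones(shape, dtype=complex)
refl_coeff = 0.3

incident = {"ex": ones, "ey": zeros, "hx": zeros, "hy": -ones / ETA0}
reflected_wave = {"ex": refl_coeff * ones, "ey": zeros,
                  "hx": zeros, "hy": refl_coeff * ones / ETA0}
total = {k: incident[k] + reflected_wave[k] for k in FIELD_KEYS}

s11 = s11_magnitude(total, incident, delta_s=1.0 / 441)
```

in this synthetic case the recovered |s11| equals the reflection coefficient used to build the fields, which is a convenient sanity check before applying the same routine to exported near-field data.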
considering that the selected sensor was designed as an ma, we have chosen the absorption as the reference parameter in our analysis. since the transmission through the structure is negligible due to the existence of the ground plane, the absorption of the chosen ma is fully defined through the reflection described by s11:

$A = 1 - |S_{11}|^{2}$ (6)

where $|S_{11}|^{2}$ is the normalized reflected power.

2.3. sample

to investigate the sensing capabilities of both infinite and finite sensor structures, we have chosen a sample of the h9n2 subtype of influenza a virus (iav). iavs are respiratory viruses with an rna genome and a serious possibility of causing human epidemics or pandemics [16]. the virus sample has been modelled as a continuous dielectric layer that completely covers the top of the ma structure. the complex permittivity of the sample has been determined by the frequency-dependent dispersive refractive index ($\tilde{n} = n + jk$) derived from the drude-lorentz model

$\tilde{n}^{2} = \varepsilon = 1.5 - \dfrac{\omega_p^{2}}{\omega^{2} - \omega_0^{2} + j\gamma\omega}$ (7)

where $\omega_p$ = 4 thz is the plasma frequency, $\omega_0$ = 2.8π thz is the resonant frequency and γ = 4 thz is the damping coefficient [17]. the value of $\tilde{n}$ calculated for a given frequency from the operating frequency range has been modified with coefficients a and b, retrieved by thz spectroscopy for an h9n2 sample of protein concentration 0.28 mg/ml, into the complex refractive index $a n + j b k$, where a = 1.2 and b = 1.4 [17]. finally, the complex permittivity required by wipl-d software was calculated by squaring the corresponding complex refractive index. the whole process was repeated for each frequency used in the simulation. it should be noted that the coefficients a and b, and therefore the calculated values of the complex permittivity of the sample, refer to the specific protein concentration and may vary if it is changed. therefore, samples of different concentrations can be treated as completely different sample types.
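the permittivity calculation described above can be sketched as follows. this is a hedged illustration only: it assumes the drude-lorentz form $\varepsilon = 1.5 - \omega_p^2 / (\omega^2 - \omega_0^2 + j\gamma\omega)$, it assumes that the quoted parameter values are angular frequencies in units of 10¹² rad/s, and it uses the principal branch of the complex square root. the function names are hypothetical and do not come from wipl-d or from [17].

```python
import numpy as np

# Drude-Lorentz parameters quoted in the text (assumed to be in 1e12 rad/s).
OMEGA_P = 4.0e12
OMEGA_0 = 2.8e12 * np.pi
GAMMA = 4.0e12


def drude_lorentz_eps(freq_hz):
    """Relative permittivity of the unscaled model at frequency freq_hz."""
    omega = 2.0 * np.pi * freq_hz
    return 1.5 - OMEGA_P ** 2 / (omega ** 2 - OMEGA_0 ** 2 + 1j * GAMMA * omega)


def sample_permittivity(freq_hz, a, b):
    """Permittivity of the sample after scaling n by a and k by b."""
    n_complex = np.sqrt(drude_lorentz_eps(freq_hz) + 0j)    # n + jk, principal root
    n_scaled = a * n_complex.real + 1j * b * n_complex.imag  # a*n + j*b*k
    return n_scaled ** 2                                      # permittivity = index squared


# Coefficients quoted for h9n2 at 0.28 mg/ml; 0.864 THz is the first resonance
# of the infinite structure without the sample.
eps_h9n2 = sample_permittivity(0.864e12, a=1.2, b=1.4)
```

with a = b = 1 the function returns the unscaled drude-lorentz permittivity, so the effect of the spectroscopy-derived coefficients can be isolated by comparing the two results at each simulated frequency.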
during the analysis, we have varied the thickness of the virus layer to examine the sensors’ behavior with different quantities of the deposited sample. the same analysis can be conducted for a different virus type by altering the coefficients a and b in addition to the parameters of the drude-lorentz model given in (7). for example, for iav subtypes h1n1 and h5n2, the drude-lorentz parameters remain the same as for h9n2, but the coefficients (a, b) have to be modified to (1, 1.4) and (1, 1), respectively [17].

3. results and discussion

the results for the selected thz ma are presented and discussed with the aim of investigating the effect that finite dimensions have on the sensor’s properties and sensing capabilities.

3.1. behavior without the sample

the absorption response of a finite sensor structure obtained in the frequency range of the first peak varies significantly with the number of unit cells (figure 3). as the number of cells increases, the peak width and its resonant frequency decrease while the prominence of the peak increases, resulting in a response that becomes more similar to that of an infinite sensor structure. all of the peaks show strong absorption, which can be attributed to the combination of two effects: the influence of the perforated metallic resonator and the fabry-pérot effect as a consequence of the multiple reflections between the metallic layers [18]. in order to further compare infinite and finite structures, the corresponding q-factors have been calculated and presented in table 1 alongside other parameters of interest such as the resonant frequency $f_{\mathrm{resonant}}$, the full-width at half-maximum (fwhm) and the maximal absorption value ($A_{\max}$). the values given in table 1 numerically confirm the conclusions made by observing figure 3. the structure with 16 cells does not have a sufficiently prominent resonant peak to determine the fwhm and q-factor.
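the peak parameters discussed above can be extracted from a sampled absorption curve as sketched below. this is an illustrative routine, not the authors' procedure: it assumes that the q-factor is the resonant frequency divided by the fwhm, that the half-maximum level is measured from zero absorption, and that linear interpolation between samples is accurate enough. the function names and the lorentzian test curve are hypothetical.

```python
import numpy as np


def _half_crossing(freq, absorption, start, step, half):
    """Walk from the peak in direction step until the curve drops below half."""
    i = start
    while 0 <= i + step < len(freq):
        j = i + step
        if absorption[j] < half:
            # Linear interpolation between sample i (>= half) and sample j (< half).
            t = (absorption[i] - half) / (absorption[i] - absorption[j])
            return freq[i] + t * (freq[j] - freq[i])
        i = j
    return None  # the curve never falls below half maximum on this side


def peak_metrics(freq, absorption):
    """Return (f_resonant, a_max, fwhm, q); fwhm and q are None if undefined."""
    freq = np.asarray(freq, dtype=float)
    absorption = np.asarray(absorption, dtype=float)
    i_peak = int(np.argmax(absorption))
    half = absorption[i_peak] / 2.0
    left = _half_crossing(freq, absorption, i_peak, -1, half)
    right = _half_crossing(freq, absorption, i_peak, +1, half)
    if left is None or right is None:
        return freq[i_peak], absorption[i_peak], None, None
    fwhm = right - left
    return freq[i_peak], absorption[i_peak], fwhm, freq[i_peak] / fwhm


# Synthetic lorentzian built from the infinite-structure values of table 1
# (peak at 0.864 thz, fwhm 0.127 thz, maximum 0.9741); shape is an assumption.
f = np.linspace(0.6, 1.1, 5001)
curve = 0.9741 / (1.0 + ((f - 0.864) / (0.127 / 2.0)) ** 2)
f_res, a_max, fwhm, q = peak_metrics(f, curve)
```

returning `None` when the curve never drops below half maximum mirrors the 16-cell case, where the peak is not prominent enough for the fwhm and q-factor to be defined.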
as the number of unit cells increases, the sensor’s performance in the frequency range of the first resonant peak improves, which can be seen through the increase of the q-factor. all of these observations lead to a very important conclusion: a finite sensor structure with a sufficient number of unit cells can potentially give results very close to those theoretically obtained using the infinite sensor model.

fig. 3 absorption response of the finite structure for different numbers of unit cells in comparison with the response of the infinite structure

table 1 numerical comparison of sensor models with different number of unit cells
number of unit cells   f_resonant [thz]   a_max    fwhm [thz]   q-factor
16                     0.883              0.983    /            /
50                     0.878              0.9742   0.169        5.2
100                    0.877              0.9662   0.166        5.2
400                    0.876              0.9663   0.147        6
infinite               0.864              0.9741   0.127        6.8

to gain a better insight into the underlying physical mechanism of the investigated sensor structures, we have calculated the distributions of electric and magnetic fields for both the infinite sensor and the finite sensor with 50 unit cells at their first resonant frequencies. the sensor with 50 unit cells has been chosen for further analysis since it has fewer unit cells than the other models with prominent peaks, which reduced the total modelling and simulation time. the results are presented in figure 4. figure 4 (a) shows that the electric field calculated in the parallel plane close to the sensor’s surface is mainly concentrated in the area around the resonator perforation. the electric field distribution is exactly the same for all the unit cells that compose the infinite sensor structure. on the contrary, figure 4 (b) shows that the field distribution on a unit cell of the finite sensor depends on its position in the structure.
this phenomenon is a direct consequence of the finite dimensions of the sensor and the end effect that occurs at the borders of the structure. figure 4 (c) and (d) show that the magnetic field distribution in the cross-section of both structures is fairly similar, as the field is mainly concentrated in the middle layer made of lossy dielectric. such confinement of the electromagnetic field is typical for metal-dielectric-metal structures, as shown in [8]. the field localization predominantly affects the sensing performance, as the placement of the sample should coincide with the strongest wave-matter interaction zone in order to achieve high sensitivity. therefore, inverting the placement of the substrate and the sample has been proposed in an effort to enhance the interaction between the thz wave and the sample. mas with integrated microfluidic channels based on this approach were built and tested with solutions of ethanol, glucose and bovine serum albumin (bsa) [8, 19]. however, it should be noted that, due to the technical difficulties of placing and removing samples, these sensors may not be the most suitable candidates for applications that require a large number of consecutive sensing tests and/or have samples that are not in fully liquid form.

fig. 4 distribution of electric field [v/m] for (a) infinite and (b) finite sensor model and magnetic field [ma/m] for (c) infinite and (d) finite sensor model at the first resonant frequency

3.2. behavior with the presence of sample

an example of the absorption response of both structures in the frequency range of the first peak for three different thicknesses (d) of h9n2 is given in figure 5. both structures show similar behavior in the presence of the virus sample, as the resonant peak shifts towards lower frequencies when the thickness of the sample layer is increased.
consequently, the resonant frequency shift can be used not only as an indicator of the virus presence in the sample, but also to determine the sample thickness. figure 5 also suggests that there is a certain limit to such detection because of the frequency shift saturation that the resonant peak undergoes when the sample thickness is increased beyond a certain extent. besides the frequency, the resonant peak amplitude also varies with the modification of sample properties, as shown in figure 5. the values of both frequency and amplitude shifts for different thicknesses of the sample deposited on top of both sensor structures are presented in table 2. it should be noted that, unlike the resonant frequency, which never grows when the thickness of the sample increases, the resonant peak amplitude sometimes grows and sometimes declines. in that sense, the values of the amplitude shifts given in table 2 are absolute values. table 2 shows that the resonant peak of the absorption response of the finite structure experiences larger frequency shifts and saturates faster compared to that of the infinite structure. additionally, the amplitude shifts are also larger for the finite structure. fig.
5 comparison between the absorption responses of the finite sensor made of 50 unit cells with different thicknesses of h9n2 sample (full line) and the results for the infinite model (dashed line)

table 2 frequency and amplitude shifts for different thicknesses of h9n2 sample deposited on top of the infinite and finite sensor structures
structure   thickness [µm]   f_resonant [thz]   a_max    frequency shift [ghz]   amplitude shift [x10^-4]
infinite    0                0.864              0.9741   0                       0
infinite    1                0.827              0.9736   37                      5
infinite    5                0.771              0.9844   93                      103
infinite    8                0.750              0.9828   114                     87
finite      0                0.878              0.9741   0                       0
finite      1                0.823              0.9268   55                      473
finite      5                0.761              0.9458   117                     283
finite      8                0.748              0.9840   130                     99

the previously conducted analysis refers to a uniform distribution of the virus sample across the sensors’ surface. in order to investigate the impact of an uneven sample distribution on the response, we have created several models with different sample distributions based on the model of the sensor with 50 unit cells. these models have been created by removing the sample from certain unit cells, thus creating “holes” in the sample layer. since the observed structure has 50 unit cells, there are $2^{50}$ different distributions that can be analyzed (each unit cell can be covered with the sample or not). taking into account the symmetry plane used in the modelling process shown in figure 2, the number of possible distributions decreases to $2^{25}$, which is still a considerable number to cover by analysis. in order to find representative distributions to include in our models, we have set three parameters that have an impact on the absorption response and that we wanted to characterize: the number of “holes”, the separation between them and their position in terms of the field distribution given in figure 4. let us first formalize the coordinates that describe the position of a “hole” in the sample placed on top of the sensor’s surface, as in figure 6.
the gray unit cells in figure 6 belong to the part of the structure that is obtained by using the symmetry plane. we can only choose the position of a “hole” from one of the white unit cells, and that choice will automatically place another “hole” on the symmetrical gray unit cell. for example, if the “hole” is placed on (2, 3), it will also inevitably be placed on (2, -3). having that in mind, in the following analysis we will only declare the position of the “hole” in the white part of the structure, and the position of the corresponding “hole” in the gray part will be implied. the number of “holes” will thus always be even.

fig. 6 the coordinates of the “holes” in the sample

first, the number of “holes” was set to two and their position and mutual distance were varied. the results presented in figure 7 indicate that placing two “holes” in the sample does lead to certain changes in the absorption response, such as small frequency and amplitude shifts and slight deformations of the resonant peak’s shape. both the amplitude and the frequency of the resonant peak increase when two “holes” are introduced in the sample. the maximum increase of both parameters is achieved in the case of two connected “holes” in the center of the structure ((3, 1) and its pair); the corresponding maximal frequency and amplitude shifts are 3 ghz and 0.0074. the changes of resonant frequency and amplitude are the smallest when the “holes” are further away from the center, whether the “holes” are connected ((5, 1) and its pair) or completely separated from each other ((4, 4), (2, 3) and their pairs). the differences between the absorption values for models with “holes” in the sample and the original model with uniform distribution indicate that the shape of the resonant peak is slightly altered by the introduction of two “holes”. fig.
7 absorption response for different positions of the “holes” in the 8 µm thick sample in the case of two “holes”

next, the number of “holes” was increased to six and three different distributions were observed. the obtained results are shown in figure 8. the changes in the absorption response are more pronounced than in the case of two “holes”. the maximal frequency shift of 8 ghz is achieved when there are six consecutive “holes” forming a 1x6 rectangular “hole” near the center of the structure ((2, 1 – 3) and their pairs). the peak amplitude for that case has the maximal decrease of 0.0221, which is about three times the absolute value of the corresponding shift for the two “holes”. in the other two cases, the peak amplitude is slightly increased, but significantly less than for the two “holes” in the sample.

fig. 8 absorption response for different positions of the “holes” in the 8 µm thick sample in the case of six “holes”

4. conclusion

we have thoroughly investigated the impact of finite dimensions on the sensing performance of the thz metamaterial absorber based on the typical planar metal-dielectric-metal structure. the results have shown that, as the number of unit cells increases, the absorption response approaches that of an infinite structure, which is numerically reflected in the decreased width of the resonant peak and the increased q-factor. the calculated electric field distribution has indicated that the field was mainly localized around the rectangular perforation regardless of the number of unit cells. unlike the infinite structure, the structure with a finite number of unit cells has shown a dependency of the field distribution on the position of the unit cell due to the presence of the end effect. the electromagnetic field was primarily confined in the lossy dielectric layer for both the infinite and the finite structure, which is typical for metal-dielectric-metal based structures.
the behavior of the infinite and finite sensors in the presence of the h9n2 virus sample was examined. first, the sample was evenly distributed across the sensors’ surfaces. the results have shown that the resonant peak of the finite structure experiences greater frequency shifts and saturates more quickly with the increase of the virus layer thickness in comparison with the infinite structure. finally, we investigated the effect of uneven sample distribution on the finite sensor structure by removing the sample from the top of certain unit cells. the analysis has shown that creating “holes” in the sample does lead to changes in the absorption response such as frequency and amplitude shifts and slight deformations of the resonant peak’s shape. the number of “holes” in the sample has been shown to be the parameter that contributes the most to these changes.

acknowledgment: this research was supported in part by the ministry of education, science and technological development of the republic of serbia, project no. 2022/200103, and by the innovation fund of the republic of serbia. the authors would like to acknowledge the contribution of the eu cost action ca18223.

references
[1] b. x. wang, w. q. huang and l. l. wang, "ultra-narrow terahertz perfect light absorber based on surface lattice resonance of a sandwich resonator for sensing applications", rsc advances, vol. 7, pp. 42956-42963, 2017.
[2] d. hu, t. meng, h. wang, y. ma and q. zhu, "ultra-narrow-band terahertz perfect metamaterial absorber for refractive index sensing application", results in phys., vol. 19, p. 103567, pp. 1-5, 2020.
[3] y. wang, d. zhu, z. cui, l. hou, l. lin, f. qu, x. liu and p. nie, "all-dielectric terahertz plasmonic metamaterial absorbers and high-sensitivity sensing", acs omega, vol. 4, pp. 18645-18652, 2019.
[4] f. yan, q. li, h. tian, z. wang and l. li, "ultrahigh q-factor dual-band terahertz perfect absorber with dielectric grating slit waveguide for sensing", j. phys. d: appl.
phys., vol. 53, p. 235103, pp. 1-9, 2020.
[5] q. xie, g. dong, b. wang and w. huang, "design of quad-band terahertz metamaterial absorber using a perforated rectangular resonator for sensing applications", nanoscale res. lett., vol. 13, p. 137, pp. 1-8, 2018.
[6] m. janneh, a. de marcellis, e. palange, a. t. tenggara and d. byun, "design of a metasurface-based dual-band terahertz perfect absorber with very high q-factors for sensing applications", optics commun., vol. 416, pp. 152-159, 2018.
[7] w. yin, z. shen, s. li, l. zhang and x. chen, "a three-dimensional dual-band terahertz perfect absorber as a highly sensitive sensor", front. phys., vol. 9, p. 665280, pp. 1-10, 2021.
[8] x. hu, g. xu, l. wen, h. wang, y. zhao, y. zhang, d. r. s. cumming and q. chen, "metamaterial absorber integrated microfluidic terahertz sensors", laser photonics rev., vol. 10, pp. 962-969, 2016.
[9] l. cong, s. tan, r. yahiaoui, f. yan, w. zhang and r. singh, "experimental demonstration of ultrasensitive sensing with terahertz metamaterial absorbers: a comparison with the metasurfaces", appl. phys. lett., vol. 106, p. 031107, pp. 1-7, 2015.
[10] a. kovačević, m. potrebić and d. tošić, "sensitivity analysis of possible thz virus detection using quad-band metamaterial sensor", in proceedings of the ieee 32nd international conference on microelectronics (miel), niš, serbia, 2021, pp. 107-110.
[11] n. akter, m. m. hasan and n. pala, "a review of thz technologies for rapid sensing and detection of viruses including sars-cov-2", mdpi biosensors, vol. 11, p. 349, pp. 1-21, 2021.
[12] n. shen, p. tassin, t. koschny and c. soukoulis, "comparison of gold- and graphene-based resonant nano-structures for terahertz metamaterials and an ultra-thin graphene-based modulator", phys. rev. b, vol. 90, no. 11, p. 115437, pp. 1-8, 2014.
[13] wipl-d pro 17, 3d electromagnetic solver, wipl-d d.o.o., belgrade, serbia, 2021. available online: http://www.wipl-d.com (accessed on 29 april 2022).
[14] g. wang, f. zhu, t. lang, j. liu, z. hong and j. qin, "all-metal terahertz metamaterial biosensor for protein detection", nanoscale res. lett., vol. 16, p. 109, pp. 1-10, 2021.
[15] s. j. park, s. h. cha, g. a. shin and y. h. ahn, "sensing viruses using terahertz nano-gap metamaterials", biomed. opt. express, vol. 8, pp. 3551-3558, 2017.
[16] b. dadonaite, b. gilbertson, m. l. knight, s. trifković, s. rockman, a. laederach, l. e. brown, e. fodor and d. l. v. bauer, "the structure of the influenza a virus genome", nat. microbiol., vol. 4, no. 11, pp. 1781-1789, 2019.
[17] m. amin, o. siddiqui, h. abutarboush, m. farhat and r. ramzan, "a thz graphene metasurface for polarization selective virus sensing", carbon, vol. 176, pp. 580-591, 2021.
[18] b. wang, a. sadeqi, r. ma, p. wang, w. tsujita, k. sadamoto, y. sawa, h. r. nejad, s. sonkusale, c. wang et al, "metamaterial absorber for thz polarimetric sensing", in proceedings of the spie, terahertz, rf, millimeter, and submillimeter-wave technology and applications xi, san francisco, ca, usa, 2018, vol. 10531, pp. 1-7.
[19] f. lan, f. luo, p. mazumder, z. yang, l. meng, z. bao, j. zhou, y. zhang, s. liang, z. shi et al, "dual-band refractometric terahertz biosensing with intense wave-matter-overlap microfluidic channel", biomed. opt. express, vol. 10, pp. 3789-3799, 2019.

facta universitatis series: electronics and energetics vol. 28, no 3, september 2015, pp. 423-437 doi: 10.2298/fuee1503423z

investigation of the effect of additional electrons originating from the ultraviolet radiation on the nitrogen memory effect

emilija n. živanović
university of niš, faculty of electronic engineering, niš, serbia

abstract. the influence of ultraviolet radiation on memory effect in nitrogen has been investigated.
the spectrum of the radiation which passes through the walls of the experimental sample was obtained by the spectrometer. a detailed comparison of experimental results of electrical breakdown time delay as a function of afterglow period with and without ultraviolet irradiation was performed. these studies were done for a product of gas pressure and inter-electrode distance at which both breakdown initiation mechanisms exist. the research has shown that ultraviolet radiation leads to a decrease in ion concentration in the early nitrogen afterglow due to recombination of nitrogen ions with electrons released from the tube walls and electrodes. meanwhile, it has been confirmed that this radiation has a negligible influence on the breakdown initiation in the late nitrogen afterglow, when a significant nitrogen atom concentration persists. when the concentration of nitrogen atoms decreases enough, the breakdown initiation is caused by cosmic rays, but uv photons have an important influence because of the rise of the electron yield.

key words: memory effect, electrical time delay, nitrogen, ultraviolet radiation

1. introduction

the electrical breakdown time delay in gases, also known as the delay response, is one of the most important characteristics of gas components. it is defined as the time interval from the application of a voltage sufficient to initiate electrical breakdown to the occurrence of the breakdown. furthermore, the investigation of electrical breakdown time delay can provide useful information about the physical processes that occur in the gas during the operation of electrical devices. the investigation of electrical breakdown time delay in gas can be performed as a function of different parameters [1, 2]. one of the most important parameters that influences the mean value of electrical breakdown time delay $\bar{t}_d$ is the afterglow period $\tau$. the dependence $\bar{t}_d = f(\tau)$ is usually known as the memory curve.
it has been used for qualitative and quantitative analysis of concentrations of positive ions and neutral active states remaining from the previous discharge as well as formed during the afterglow period [3], [4].

received july 18, 2014; received in revised form march 5, 2015. corresponding author: emilija živanović, university of niš, faculty of electronic engineering, aleksandra medvedeva 14, 18000 niš (e-mail: emilija.zivanovic@elfak.ni.ac.rs)

it has also enabled the estimation of recombination and de-excitation times of the mentioned particles due to their recombination on the tube walls, on the electrodes and in the gas. the particles that reach the cathode play the main role in the initiation of the subsequent breakdown. they induce secondary electron emission, and if the voltage applied to the electrodes is higher than the static breakdown voltage, secondary electrons created at the cathode can initiate the subsequent breakdown. a previous study showed that the auger neutralization process, in which positive ions participate, as well as the auger de-excitation process, for which metastable molecular states are responsible, play a dominant role in breakdown initiation in gases at low pressure. in nitrogen in particular, the process of surface-catalysed excitation could also be responsible for breakdown initiation. many parameters affect the behavior of the memory curve at low pressures. some of them are the inter-electrode distance, the material of the cathode and of the tube walls, the wall temperature, the voltage applied to the gas tube, the glow current and the glow time. most of these investigations have already been published.
however, the influence of additional electrons which originate from ultraviolet irradiation on the nitrogen memory effect, when the product of the gas pressure p and the inter-electrode distance d lies to the left of the minimum of the paschen curve (the dependence of the static breakdown voltage on the pd product), has not been sufficiently investigated. the practical importance of investigating the influence of this irradiation on the discharge in gas-filled electronic components, as well as in solid systems, is documented in many published papers [5-9]. accordingly, the aim of this paper is to examine the effect of ultraviolet radiation from a cadmium lamp on the nitrogen memory effect in the presence of both vacuum and gas breakdown mechanisms. since a different breakdown initiation mechanism is responsible for each area of the memory curve, depending on the length of the afterglow period, the influence of ultraviolet radiation on each of them has been investigated individually.

2. experiment

2.1. experimental sample

a cylindrical borosilicate glass (8245 schott technical glass) tube filled with nitrogen at a pressure of 6.6 mbar was used as the experimental sample; it is shown in fig. 1. its volume was about 1 l. it was connected in the circuit with one fixed and one movable iron electrode, so that the inter-electrode distance could be varied from the outside by a permanent magnet. this experimental sample allows the inter-electrode distance to be changed from 0.01 cm to 0.45 cm, while the gas pressure in the tube remains constant. the diameter of the spherical electrodes was 1 cm.

fig. 1 the shape of the used experimental sample (1 - fixed electrode, 2 - movable electrode, 3 - rotation shaft made of iron)
the tube had to be baked out and evacuated before the nitrogen was admitted, in a process similar to that used in the production of x-ray and other electron tubes. after that, the tube was filled with matheson research grade nitrogen at a pressure of 6.6 mbar with the claimed abundance of impurities of co<0.5 ppm, co2<0.5 ppm, o2<1 ppm, thc<0.2 ppm and h2o<1 ppm. before the time delay measurements were done, the cathode was conditioned by sputtering at a glow current of 0.5 ma for a few days. this value of discharge current was selected to avoid erosion of the cathode during conditioning. due to the stochastic nature of electrical time delay, each point in the memory curves represents the mean value of a hundred measured values. after the breakdown, the current in the tube was 0.5 ma during a glow time of 1 s. this time is sufficient to attain steady-state discharge conditions in our experiment. it should be noted that similar experiments were performed by another group of scientists [10], [11], in which the glow time was of the order of milliseconds. the gas tube used in those experiments had a cathode made of gold-plated copper; it should be emphasized that gold provides a relatively stable work function. the estimated values of the static breakdown voltage $U_s$ for this experimental sample for two values of inter-electrode distance, 0.01 cm and 0.1 cm, were 418 v and 386 v, respectively. the electrical breakdown time delay measurements were performed at a relative overvoltage $\Delta U / U_s = (U_w - U_s)/U_s = 50\%$ above the static breakdown voltage, where $U_w$ is the voltage applied to the tube electrodes.

2.2. experimental setup

electrical breakdown time delay measurements were performed with an electronic system whose block diagram is shown in fig. 2. from the architecture point of view, the electrical system consists of three major parts. those are the high voltage power supply, the analog subsystem and the digital subsystem.
the high voltage power supply has to operate in the range from 100 to 1000 volts within the desired power rating.

fig. 2 block diagram of system for electrical breakdown time delay measurement

the measured electrical breakdown time delay values can range from several microseconds to several minutes, while the values of the afterglow period set by the system can range from a couple of microseconds to several days. due to the statistical nature of time delay, it is necessary to perform a large number of measurements, since the measurement error of time delay mean values decreases as $1/\sqrt{n}$, where n is the total number of measurements. in this experiment, a hundred measurements were made for each value of the afterglow period. since the measurement cycle can be set up for an arbitrary number of different afterglow period values, the total number of measured time delay values per experiment can be extremely large. such a large amount of data has to be stored in real time and kept for further statistical analysis, which the digital subsystem enables. the electrical system allowed obtaining the dependence of the mean value of electrical time delay on the afterglow period, $\bar{t}_d = f(\tau)$, for two different values of inter-electrode distance, with and without the presence of uv radiation. before discussing the processes that predominantly affect the secondary electron emission during the afterglow, it is necessary to highlight the progress that was made in measuring the electrical time delay using the improved measurement system. this system enables a significant reduction in the shortest afterglow period for which the electrical breakdown time delay can be measured. the lowest value of the afterglow period for which the measurement of electrical breakdown time delay was possible was 3 μs.
thus, the afterglow period for which it was possible to measure the electrical breakdown time delay was reduced by three orders of magnitude using the improved system, which made it possible to track the decrease in the concentration of the charged and neutral active particles that passed from the discharge into the afterglow. further, this system has enabled sufficiently reliable monitoring of the recombination/de-excitation of positive ions and neutral active particles formed during and after the discharge, based on the process of secondary electron emission which they initiate. the detailed development of the electrical system used to measure the electrical breakdown time delay, and its electrical scheme, can be found in [12].

2.3. investigation of ultraviolet irradiation

a commercial cadmium lamp was used as the source of ultraviolet radiation. before recording the memory curves, a spectral analysis of the lamp light that passes through the glass walls of the tube was performed. a piece of borosilicate glass was placed between the cadmium lamp and the spectrometer in order to ascertain which wavelengths can pass through the walls of the gas-filled tube. this enabled measuring the light intensity influence in this process. the instrument used in the spectroscopic analysis was an avantes avaspec-3648 spectrometer [13], which has a usable range from 200 nm to 850 nm. the spectrometer has a diffraction grating with 600 lines/mm and a slit size of 10 µm, which gives a resolution of 0.32 nm between two neighbouring lines. the obtained emission spectrum is shown in fig. 3. as can be seen, the most intense lines in the spectrum are at wavelengths of 327 nm, 480 nm and 509 nm, and wavelengths from 327 nm upwards can pass through the glass walls. the physical process that occurs when ultraviolet light from the cadmium lamp falls on the cathode surface is the photoelectric effect.
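as a quick numerical aid, the line wavelengths quoted above can be converted into photon energies with the standard relation E = hc/λ. the python sketch below is only an illustration: the function name `photon_energy_ev` and the variable names are assumptions introduced here, and the wavelengths are the three strongest lines quoted in the text.

```python
# physical constants (exact si values)
PLANCK_J_S = 6.62607015e-34
LIGHT_SPEED_M_S = 299792458.0
ELEMENTARY_CHARGE_C = 1.602176634e-19


def photon_energy_ev(wavelength_nm: float) -> float:
    """photon energy E = h*c/lambda, returned in electronvolts."""
    if wavelength_nm <= 0.0:
        raise ValueError("wavelength must be positive")
    wavelength_m = wavelength_nm * 1e-9
    energy_j = PLANCK_J_S * LIGHT_SPEED_M_S / wavelength_m
    return energy_j / ELEMENTARY_CHARGE_C


# strongest lines of the cadmium lamp spectrum quoted in the text (nm)
strongest_lines_nm = [327.0, 480.0, 509.0]
line_energies_ev = {wl: photon_energy_ev(wl) for wl in strongest_lines_nm}

for wavelength_nm, energy_ev in line_energies_ev.items():
    print(f"{wavelength_nm:.0f} nm -> {energy_ev:.2f} ev")
```

shorter wavelengths carry more energy per photon, so of the three strongest lines only the 327 nm line approaches the few-ev range that is relevant for photoemission from a metal surface.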
specifically, according to einstein's relation for the photoelectric effect, provided that the freed electrons do not experience collisions inside the metal, i.e. that the photons deliver their energy to electrons at the cathode surface, the energy of the incident photons is equal to the sum of the work function of the cathode material and the maximum kinetic energy of the released electrons.

fig. 3 strongest observed lines in the emission spectrum of the cadmium lamp which pass through the borosilicate glass

therefore, by comparing the photon energies corresponding to the wavelengths of the observed spectral lines with the value of the iron work function, the existence of the photoelectric effect can be determined [14]. for the effect to occur, the photon energy from the lamp needs to be greater than or equal to the work function of iron, of which the electrodes are made. in the literature, the reported values of the work function of iron are inconsistent, varying between authors from 3.5 ev [15] to 4.3 ev [16]. it can be asked how electrons can be released from the cathode by the light of the cadmium lamp, since the work function of iron is slightly greater than the photon energy of the smallest wavelength of the spectrum. it should be noted that the iron electrodes are coated with a layer of iron oxide, which has a smaller work function than iron [15]. earlier qualitative energy dispersive spectroscopy analysis of the electrode surface confirmed that the iron oxide layer is stable and cannot be removed by sputtering during the discharges. because of that, the cadmium lamp lines at 327 nm, 341 nm, 346 nm and 360 nm should be taken into account, with corresponding energies of 3.79 ev, 3.64 ev, 3.59 ev and 3.45 ev.

3. results and discussion

the experimental results presented in this paper are a follow-up of the recent research presented in [17].
in the investigation whose results are presented below, the inter-electrode distance was taken as a parameter in tracking the behaviour of the memory curve. in this case the research was done in the presence of ultraviolet radiation, and the influence of additional electrons originating from a commercial cadmium lamp has been analysed. the memory curves presented in figs. 4 and 5 were obtained for the nitrogen-filled tube, with and without the source of ultraviolet radiation, for inter-electrode distances of 0.01 cm and 0.1 cm, respectively. the analysis of the obtained results enables a discussion of the influence of the ultraviolet radiation, as well as of the inter-electrode distance, on the memory curve behaviour. the type of breakdown initiation mechanism also influences the memory curve behaviour.

fig. 4 memory curves with and without presence of radiation from cadmium lamp for inter-electrode distance of 0.01 cm

fig. 5 memory curves with and without presence of radiation from cadmium lamp for inter-electrode distance of 0.1 cm

3.1. nitrogen memory effect

a special mechanism of breakdown initiation occurs at low pressure values, when the inter-electrode distance is less than the electron mean free path ($pd < 10^{-3}$ mbar cm [18]). under these conditions the breakdown appears in the so-called technical vacuum, and it is caused by an avalanche mechanism which creates free electrons and ions. in this case, the breakdown starts with processes at the electrodes, but these are significantly different from the γ processes of townsend's mechanism. there are several ways of vacuum breakdown initiation, but common to all of them is that evaporation of the electrode material forms a vapor cloud, in which the breakdown then develops by townsend's avalanche mechanism [19].
in order to form a cloud of vapor, the electrode surface must have a large number of micro-spikes at which enough energy is released to cause thermal instability. the thermal instability of the electrodes can be induced in three ways: by the emission mechanism (autoelectron emission), by micro-particles accelerated from the electrode material, or by the avalanche effect in the residual gas layer adsorbed on the electrode surface. as a criterion of whether the conditions are those of a technical vacuum or not, the dependence $U_s = f(pd)$ was monitored for the gas sample used in this experiment; it was published in [17]. the product of the inter-electrode distance of 0.01 cm and the nitrogen pressure in the tube of 6.6 mbar lies to the left of the minimum of the paschen curve. the minimum corresponds to the most favorable conditions for gas ionization. in this case the breakdown mechanism was not a pure vacuum one; for this value of the pd product, a combined effect of the vacuum and gas breakdown mechanisms exists. the presence of the vacuum breakdown mechanism leads to the existence of additional electrons in the inter-electrode gap, which are otherwise responsible for the secondary electron emission in the plateau region of the memory curve. it is known that the memory curve of nitrogen has three areas [1], which are caused by different mechanisms responsible for the initiation of electrical breakdown in the gas. the advanced system for measuring the time delay allows a detailed analysis of the plateau area of the memory curve (short-lived afterglow), which has been carried out here.
namely, if positive ions are present in the gas, applying a voltage to the tube electrodes drives them towards the cathode, where they eject electrons that initiate electrical breakdown in the process of secondary electron emission. the positive ions can be transferred from the discharge to the short-lived afterglow, but they persist in the gas only up to 10 ms. the discussion in [20] showed that the minimum of the emission of the 1+ system occurred 1-10 ms after the discharge had ceased. on the other hand, positive ions can be formed during the relaxation itself in the reactions of metastable molecules

$$\mathrm{N_2(A) + N_2(a') \rightarrow N_4^+ + e} \quad (\text{or } \mathrm{N_2 + N_2^+ + e}), \tag{r1}$$

$$\mathrm{N_2(a') + N_2(a') \rightarrow N_4^+ + e} \quad (\text{or } \mathrm{N_2 + N_2^+ + e}), \tag{r2}$$

with rate coefficient values of $3.2\cdot10^{-12}$ cm$^3$ s$^{-1}$ [21] and $5\cdot10^{-11}$ cm$^3$ s$^{-1}$ [22]. the n2(a) and n2(a') metastable molecules formed in the discharge recombine 1-10 ms after the discharge has ceased [23], [24], so that the number of reactions (r1) and (r2) decreases with time up to 10 ms. also, if electrons are present in the gas, positive ions can be formed by electron impact on nitrogen molecules as well. electrons can be found in the gas up to 1 ms after the discharge has ceased [25]. if a voltage is applied to the electrodes after a relaxation lasting τ ≤ 1 ms, the electrons perform ionizing collisions and thus encourage the process of secondary electron emission. since the creation of positive ions in reactions (r1) and (r2) dominates over electron impact even during the discharge [22], it can be concluded that the same holds outside the discharge, and that the contribution of electron impact to the process of secondary electron emission during the afterglow can be neglected. the calculations of the vibrational temperature during the discharge showed that the concentration of the excited vibrational levels v ≥ 10 amounted to 0.1% of the concentration of nitrogen molecules in the ground state [4].
a similar trend is predicted if the discharge current is low, which corresponds to the conditions of our measurements. therefore, it can be concluded that immediately after the end of the discharge the concentration of vibrationally excited nitrogen molecules (v ≥ 10) is high, and the following reactions of positive ion creation are possible [26]:

$$\mathrm{N_2(X, v \ge 24) + N_2(a') \rightarrow N_4^+ + e}, \tag{r3}$$

$$\mathrm{N_2(X, v \ge 29) + N_2(X, v \ge 29) \rightarrow N_4^+ + e}, \tag{r4}$$

$$\mathrm{N_2(X, v \ge 36) + N_2(A) \rightarrow N_4^+ + e}. \tag{r5}$$

after τ ≈ 10 ms the emission intensity of the 1+ system increases. this intensity passes through a maximum in the interval of 15-20 ms [20]. this maximum results from the growth of the positive ion concentration during the short-lived afterglow. from the above facts it can be seen that in this range the concentration of n2(a) and n2(a') metastable molecules passes through a maximum, which, due to the higher probability of reactions (r1) and (r2), leads to an increase of the positive ion concentration. however, considering that after 10 ms the positive ions and metastable molecules from the discharge have recombined or been de-excited, it can be concluded that other particles present in the gas are involved in the creation of the positive ions. it is believed that the formation of n2(a) and n2(a') metastable molecules in the short-lived afterglow for relaxation times τ > 10 ms follows from the reactions

$$\mathrm{N(^4S) + N_2(X, v \ge 39) \rightarrow N_2(A) + N(^2D)}, \tag{r6}$$

$$\mathrm{N(^4S) + N_2(X, v \ge 38) \rightarrow N_2(a') + N(^4S)}, \tag{r7}$$

involving $\mathrm{N(^4S)}$ atoms and highly vibrationally excited nitrogen molecules [26]. $\mathrm{N(^4S)}$ atoms retain 1% of the nitrogen molecule concentration for afterglow periods up to τ ≈ 10 ms - 1 s [27], which is more than other particles in the gas do. as for the highly vibrationally excited nitrogen molecules, the population of the highly vibrationally excited levels increases during the short-lived afterglow due to the "pumping up" effect.
the calculations of other authors [27] have shown that the concentration of n2(x, v > 25) molecules reaches its maximum for afterglow periods of the order of tens of milliseconds. this maximum coincides with the maximum of the 1+ system emission, so it can be concluded that, in the interval of 10-15 ms, the efficiency of reactions (r6) and (r7) is the largest in the short-lived afterglow. this leads to an increase in the concentration of n2(a) and n2(a') metastable molecules which, by participating in reactions (r3)-(r5), cause an increase of the positive ion concentration. the concentration of highly vibrationally excited nitrogen molecules n2(x, v > 25) rapidly decreases after reaching the maximum: for relaxation times of the order of 100 ms it is already about 50% lower than the maximum, whereas for τ ≈ 1 s it is negligible [27]. this decrease significantly reduces the probability of positive ion creation in the previously mentioned processes, so their significance in the process of secondary electron emission is smaller for relaxation times τ > 30 ms. bearing in mind the above discussion, the interval 3 μs < τ < 70 ms in figs. 4 and 5, which represents the area of the short-lived afterglow (shown separately in figs. 6 and 7), is investigated in detail. a slightly larger increase in the electrical breakdown time delay value in the interval 3 μs < τ < 1 ms is caused by the decrease of the concentrations of positive ions and of n2(a) and n2(a') metastable molecules transferred from the discharge to the short-lived afterglow. in this interval, the secondary electron emission is dominated by positive ions that are either transferred from the discharge or formed in the afterglow through reactions of n2(a) and n2(a') metastable molecules and highly vibrationally excited nitrogen molecules, also transferred from the discharge.
since in the interval 3 μs < τ < 1 ms the positive ion concentration is the largest of all afterglow periods for which the measurements were performed, the efficiency of the secondary electron emission is the largest in this interval. it is characterized by the lowest values of the electrical breakdown time delay on the memory curves of figs. 4 and 5. the end of the interval 3 μs < τ < 1 ms coincides with the minimum intensity of the 1+ system emission in the short-lived afterglow, when most of the positive ions, electrons and n2(a) and n2(a') metastable molecules transferred from the discharge have recombined or been de-excited. in the next part of the short-lived afterglow (1 ms < τ < 70 ms), positive ions are also responsible for the secondary electron emission, but almost entirely those created during the relaxation itself. at the beginning of this interval the population of the highly vibrationally excited levels of nitrogen molecules reaches a maximum, and the efficiency of reactions (r6) and (r7), in which n2(a) and n2(a') metastable molecules are formed, is then the highest. these molecules, through their mutual interaction in reactions (r1) and (r2), as well as through reactions with highly vibrationally excited nitrogen molecules (reactions (r3)-(r5)), cause an increase of the positive ion concentration and enhance the secondary electron emission. the renewed increase in the positive ion concentration caused in this way appears on the memory curves of figs. 4 and 5 as a significantly slower increase in the $\bar{t}_d$ value during the interval 1 ms < τ < 70 ms than during the interval 3 μs < τ < 1 ms. the electrical breakdown time delay increases further because, having reached its maximum at τ = 10-15 ms, the concentration of highly vibrationally excited nitrogen molecules begins to decline. over time, this reduces the efficiency of the reactions which produce the positive ions.
at the end of the interval 1 ms < τ < 70 ms, the concentration of positive ions decreases to a level at which the probability of secondary electron emission by their impact upon the cathode becomes very low, and neutral active particles then begin to dominate this process. since the efficiency of these particles in secondary electron emission is significantly lower than that of the positive ions, there is a sudden increase in the value of $\bar{t}_d$ for τ > 70 ms. at the same time, the value τ = 70 ms marks the end of the short-lived afterglow area in figs. 4 and 5. most reactions of nitrogen positive ion creation, both by electron impact ionization of neutral molecules during breakdown and discharge and by associative ionization processes involving metastable molecules and highly vibrationally excited molecules, are listed in [28]. it can be seen that $\mathrm{N^+}$, $\mathrm{N_2^+}$, $\mathrm{N_3^+}$ and $\mathrm{N_4^+}$ ions have a certain role in the process of secondary electron emission from the cathode in the early afterglow. according to the published data [28], the concentration of $\mathrm{N_2^+}$ ions is $5\cdot10^{10}$ cm$^{-3}$ after the discharge ceases, while the concentrations of the other ions are significantly lower. for this reason, it can be proposed that $\mathrm{N_2^+}$ ions formed in processes (r1) and (r2) have a dominant role in breakdown initiation in the early afterglow. the drift velocity of these ions can be estimated by the expression $v_d = e \lambda U_w / (m_j \bar{v} d)$ [29], where the mean free path is calculated as $\lambda = kT / (\sqrt{2}\,\pi D^2 p)$ ($k$ is the boltzmann constant, $T = 300\ \mathrm{K}$ is the gas temperature, $D$ is the ion diameter and $p$ is the gas pressure), $U_w$ is the voltage applied to the electrodes, $m_j$ is the ion mass, $d$ is the inter-electrode distance, $e$ is the elementary charge and $\bar{v}$ is the mean thermal velocity, estimated as $\bar{v} = \sqrt{2kT/m_j}$.
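the drift velocity estimate can be sketched numerically as follows. this is a minimal illustration under stated assumptions, not the author's calculation: the three formulas are read here as λ = kT/(√2·π·D²·p), v = √(2kT/m) and v_d = e·λ·U_w/(m·v·d); the ion mass of 28 u (for an n2+ ion) and the ion diameter of 3.1·10^-10 m are hypothetical inputs, and the 0.1 cm gap with its static breakdown voltage of 386 v is chosen as the example case. all function and variable names are illustrative.

```python
import math

# physical constants in si units
BOLTZMANN_J_PER_K = 1.380649e-23
ELEMENTARY_CHARGE_C = 1.602176634e-19
ATOMIC_MASS_UNIT_KG = 1.66053907e-27


def mean_free_path_m(temperature_k, ion_diameter_m, pressure_pa):
    """mean free path: lambda = k*T / (sqrt(2) * pi * D**2 * p)."""
    return BOLTZMANN_J_PER_K * temperature_k / (
        math.sqrt(2.0) * math.pi * ion_diameter_m ** 2 * pressure_pa
    )


def mean_thermal_velocity_m_s(temperature_k, ion_mass_kg):
    """mean thermal velocity: v = sqrt(2*k*T / m)."""
    return math.sqrt(2.0 * BOLTZMANN_J_PER_K * temperature_k / ion_mass_kg)


def drift_velocity_m_s(applied_voltage_v, gap_m, free_path_m,
                       ion_mass_kg, thermal_velocity_m_s):
    """drift velocity: v_d = e * lambda * U_w / (m * v * d)."""
    return (ELEMENTARY_CHARGE_C * free_path_m * applied_voltage_v
            / (ion_mass_kg * thermal_velocity_m_s * gap_m))


# values taken from the text
temperature_k = 300.0
pressure_pa = 660.0                           # 6.6 mbar
static_breakdown_v = 386.0                    # quoted for the 0.1 cm gap
gap_m = 1.0e-3                                # 0.1 cm
applied_voltage_v = 1.5 * static_breakdown_v  # 50% overvoltage

# assumptions introduced for this sketch (not given in the text)
ion_mass_kg = 28.0 * ATOMIC_MASS_UNIT_KG      # n2+ ion, electron mass neglected
ion_diameter_m = 3.1e-10                      # hypothetical ion diameter

free_path_m = mean_free_path_m(temperature_k, ion_diameter_m, pressure_pa)
thermal_velocity = mean_thermal_velocity_m_s(temperature_k, ion_mass_kg)
drift_velocity = drift_velocity_m_s(applied_voltage_v, gap_m, free_path_m,
                                    ion_mass_kg, thermal_velocity)

print(f"mean free path:        {free_path_m:.3e} m")
print(f"mean thermal velocity: {thermal_velocity:.1f} m/s")
print(f"drift velocity:        {drift_velocity:.3e} m/s")
```

because the drift velocity scales with U_w/d, the same sketch can be rerun for the 0.01 cm gap by changing `gap_m` and `static_breakdown_v`; the result is sensitive to the assumed ion diameter, which enters the mean free path quadratically.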
for the nitrogen-filled discharge tube at a pressure of 6.6 mbar, under the present experimental conditions, the mean free path is $1.47\cdot10^{-5}$ m and the mean thermal velocity is 421.73 m s$^{-1}$, while the drift velocity for an overvoltage of 50% is $6.9\cdot10^{4}$ m s$^{-1}$. it can be concluded that the drift velocity is higher than the thermal velocity. for these afterglow periods the electrical breakdown time delay is determined only by the total time necessary for the ion drift to the cathode and the release of a secondary electron from its surface. even in the case when the drift velocity is lower than the thermal velocity, the positive ions play the most important role in the process of secondary electron emission because of their drift motion toward the cathode under the influence of the field in the inter-electrode gap. not only the positive ions, but also metastable molecules and other neutral active particles can cause secondary electron emission at the cathode. however, when a voltage is applied to the electrodes, the diffusion time of these particles to the cathode is considerably longer than the drift time of the positive ions, so their direct contribution to this process is negligible as long as positive ions are present in the gas. the rapid growth of the electrical breakdown time delay at the end of the short-lived afterglow indicates a change in the mechanism that dominates the process of secondary electron emission. after τ = 15 ms, the concentrations of n2(a) and n2(a') metastable molecules and of highly vibrationally excited nitrogen molecules decrease, and consequently the number of positive ions formed in the above-mentioned reactions becomes lower.
because of that, when the concentration of positive ions becomes lower than the concentration of neutral particles in the gas which can also cause secondary electron emission, the shape of the memory curves changes, entering the region of rapid increase of the electrical breakdown time delay, the so-called long-lived afterglow. the efficiency of neutral active particles in causing secondary electron emission is much smaller than that of positive ions, so the value of $\bar{t}_d$ increases rapidly relative to the plateau of the memory curve. earlier investigations [1], [3], [24], [30] confirmed that nitrogen atoms in the ground state $\mathrm{N(^4S)}$, both remaining from the previous discharge and formed after the discharge has ceased, are the particles most responsible for secondary electron emission from the cathode in the late nitrogen afterglow. numerical models which followed the decrease of the nitrogen atom concentration based on re-association on the tube walls [3], [4], [24], combined with a model which assumes that secondary electron emission is caused by nitrogen atoms, showed good agreement with the experimentally obtained memory curves in the area of the sudden increase in $\bar{t}_d$. as the concentration of n atoms decreases, the probability of secondary electron emission falls, and the $\bar{t}_d$ value therefore increases. since their concentration decreases exponentially [4], as does the light emitted from the tube after the discharge, this downward trend appears linear on a logarithmic scale. it has been shown earlier [31], [32] that the long-lived lewis-rayleigh afterglow lasts up to several hours, and it has also been confirmed that the source of energy for the glow is the recombination of nitrogen atoms in the ground state.
it was concluded that $\mathrm{N(^4S)}$ atoms are present for a very long time in the afterglow and that their concentration decreases mostly by surface recombination on the tube walls [3]. in addition, they can also recombine in the gas and on the electrodes. the final product of their recombination is the n2(a) metastable state. it can be reached via the reaction

$$\mathrm{N(^4S) + N(^4S) + M \rightarrow N_2(B) + M}, \tag{r8}$$

where m is an atom at the cathode surface, followed by the spontaneous de-excitation of n2(b) molecules,

$$\mathrm{N_2(B) \rightarrow N_2(A)} + h\nu. \tag{r9}$$

the n2(a) metastable state formed in this way [33] transfers its energy to the cathode in a collision. as the work function of iron is lower than the n2(a) metastable state energy of 6.2 ev, this can induce the secondary electron emission that determines the value of the electrical breakdown time delay in the late nitrogen afterglow. this process of nitrogen atom recombination on the cathode surface is surface-catalyzed excitation [34]. it is a process of heterogeneous catalysis, significant for this research, in which the adsorbate and the substrate are in different phases, i.e. gaseous and solid. for heterogeneous catalysis to occur, at least one of the reactants needs to be adsorbed and modified into a form that has a high affinity for the reaction. in this way the secondary electrons for breakdown initiation in the late afterglow are produced. for τ ≈ $3\cdot10^{3}$ s and τ ≈ $7\cdot10^{3}$ s, the memory curves without additional radiation in figs. 4 and 5 reach saturation for inter-electrode distances of 0.01 cm and 0.1 cm, respectively, i.e. the mean value of the time delay changes only slightly with increasing afterglow time. it should be emphasized that in the saturation region of the memory curves the concentration of nitrogen atoms has decreased to such a low value that cosmic rays become responsible for breakdown initiation.
when the voltage applied to the tube is higher than the static breakdown voltage, electron-ion pairs formed in the gas can initiate the breakdown. it is also highly probable that electrons are released from the cathode by the impact of cosmic radiation. they form the avalanche which leads to breakdown, causing the electrical breakdown time delay to decrease. since the flux of cosmic rays during the experiment was approximately constant, the number of electron-ion pairs created per unit time is approximately constant, and so is the electron yield. because of that, the mean value of the time delay is constant for a given value of overvoltage. these conclusions are in agreement with the results shown in figs. 4 and 5. it is important to emphasize that cosmic rays are permanently present during the experiment. however, for shorter afterglow periods, when positive ions and a considerable concentration of $\mathrm{N(^4S)}$ atoms are present, the role of cosmic rays in breakdown initiation is negligible in comparison with the secondary electron emission initiated by these particles.

3.2. influence of ultraviolet radiation

the analysis of the obtained experimental results also enables a discussion of the influence of ultraviolet irradiation on the memory effect in nitrogen at a pressure of 6.6 mbar when the combined vacuum and gas breakdown initiation mechanism exists. it is clearly observed from figs. 4 and 5 that the ultraviolet radiation from the lamp influences the electrical breakdown time delay. in some regions of the memory curve the impact is slight, but noticeable. in the area of rapid growth of the mean value of the electrical breakdown time delay, the influence of ultraviolet radiation is negligible: the obtained values of $\bar{t}_d$ were only slightly lower.
the presence of additional electrons from uv irradiation causes the memory curves to reach saturation earlier and decreases the $\bar{t}_d$ values by about an order of magnitude. it should also be noted that some influence of ultraviolet radiation is seen in the plateau area of the memory curve. because of that, this part of the memory curve is presented separately in figs. 6 and 7 for both values of inter-electrode distance.

fig. 6 ion part of memory curves with and without presence of radiation from cadmium lamp for inter-electrode distance of 0.01 cm

fig. 7 ion part of memory curves with and without presence of radiation from cadmium lamp for inter-electrode distance of 0.1 cm

it can be seen from these figures that the plateau length is not changed by the ultraviolet irradiation. however, for both values of inter-electrode distance the plateau height increases. this increase is caused by the growth of the electron yield in the inter-electrode gap due to electrons liberated from the cathode by the ultraviolet irradiation, which was more pronounced in the presence of the combined vacuum and gas breakdown. in this case, it should be noted that the vacuum breakdown mechanism is responsible for the further growth of the electron yield. these additional electrons induce recombination of part of the positive ions formed from immediately after the end of the discharge until the end of the plateau. these ion-electron recombinations proceed through the following processes:

$$\mathrm{e + N_4^+ \rightarrow N_2(X) + N_2(X)}, \tag{r10}$$

$$\mathrm{e + N_2^+ \rightarrow N(^4S) + N(^4S)}, \tag{r11}$$

$$\mathrm{e + N_2^+ \rightarrow N(^4S) + N(^2D)}. \tag{r12}$$

the rate coefficients of these reactions are $2\cdot10^{-6}\,(300/T_e)^{0.5}$ cm$^3$ s$^{-1}$ for (r10) and $2\cdot10^{-7}\,(300/T_e)^{0.5}$ cm$^3$ s$^{-1}$ for processes (r11) and (r12) [35].
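the temperature dependence of these recombination rate coefficients can be illustrated with a short python sketch. it assumes the prefactors are read as 2·10^-6 cm^3 s^-1 for (r10) and 2·10^-7 cm^3 s^-1 for (r11) and (r12), i.e. with negative exponents; the electron temperatures used below are hypothetical example values, since the text does not quote T_e, and the function name is illustrative.

```python
def recombination_rate_cm3_s(prefactor_cm3_s: float,
                             electron_temperature_k: float) -> float:
    """rate coefficient k = prefactor * (300 / T_e)**0.5, in cm^3 s^-1."""
    if electron_temperature_k <= 0.0:
        raise ValueError("electron temperature must be positive")
    return prefactor_cm3_s * (300.0 / electron_temperature_k) ** 0.5


# prefactors as read from the text (negative exponents assumed)
PREFACTOR_N4_PLUS_CM3_S = 2.0e-6   # e + n4+, reaction (r10)
PREFACTOR_N2_PLUS_CM3_S = 2.0e-7   # e + n2+, reactions (r11) and (r12)

# hypothetical electron temperatures, chosen only for illustration
example_temperatures_k = [300.0, 1200.0, 3000.0]

for te_k in example_temperatures_k:
    k_n4 = recombination_rate_cm3_s(PREFACTOR_N4_PLUS_CM3_S, te_k)
    k_n2 = recombination_rate_cm3_s(PREFACTOR_N2_PLUS_CM3_S, te_k)
    print(f"T_e = {te_k:6.0f} k: k(r10) = {k_n4:.2e}, k(r11, r12) = {k_n2:.2e} cm^3/s")
```

both coefficients fall as the square root of the electron temperature rises, and their ratio stays fixed at ten, so under these assumptions recombination with n4+ ions is the faster channel at any electron temperature.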
as a result, when an operating voltage is applied to the electrodes, fewer ions per second arrive at the cathode surface when the gas tube is irradiated. the probability that an electron capable of initiating breakdown appears then decreases, causing a rise in the mean value of the electrical breakdown time delay.

4. conclusion

on the basis of the above considerations, the following brief conclusions can be given. fundamental research of the nitrogen discharge and afterglow is very important because of their various applications. an investigation of the influence of ultraviolet irradiation on the memory curve behaviour has also been presented. this effect is a consequence of the release of electrons from the cathode by light from the cadmium lamp. the spectral analysis of the lamp light which passes through the glass tube walls was performed using an avantes avaspec-3648 spectrometer. it was found that ultraviolet irradiation had a noticeable influence in the plateau and saturation regions of the memory curve, while the deviation of the time delay is insignificant in the region of its rapid increase. namely, in the early nitrogen afterglow ultraviolet irradiation increases the values of the time delay, due to the dominant effect of enhanced electron-ion recombination. in contrast, in the very late nitrogen afterglow the ultraviolet radiation decreases the time delay values because of the growth in the total electron yield. the obtained results have shown that the memory curves in the region of very long afterglow periods are very sensitive to ultraviolet radiation. because of that, strict control of environmental radiation during the measurement was necessary in order to reduce the errors in tracking the kinetics of positive ions and neutral active particles in the nitrogen afterglow. the most important processes related to the creation and quenching of positive ions and nitrogen atoms are mentioned.
in addition, it was shown that the additional electron yield caused by the vacuum breakdown initiation mechanism also has a dominant role and that it was responsible for the decrease of the $\bar{t}_d$ value. this phenomenon is more pronounced in the presence of the vacuum breakdown mechanism at the lower value of the inter-electrode gap. the ability to detect weak effects of ultraviolet radiation on the memory effect has reaffirmed that the time delay measurement technique used is very sensitive to changes of particle concentration in the gas. earlier, it was found [3] that this method can detect nitrogen atom concentrations of about $10^{8}$ cm$^{-3}$. this limit was determined by the level of natural charge production between the electrodes.

acknowledgement: this work has been supported by the ministry of education, science and technological development of republic of serbia under the contract no. 177007.

references

[1] m. m. pejović, e. n. živanović and m. m. pejović, "kinetics of ions and neutral active states in afterglow and their influence on the memory effect in nitrogen at low pressures", j. phys. d: appl. phys., vol. 37, pp. 200-210, 2004.
[2] n. t. nesić, m. m. pejović, m. m. pejović and e. n. živanović, "the influence of additional electrons on memory effect in nitrogen at low pressures", j. phys. d: appl. phys., vol. 44, p. 095203 (9pp), 2011.
[3] v. lj. marković, z. lj. petrović and m. m. pejović, "surface recombination of atoms in a nitrogen afterglow", j. chem. phys., vol. 100, pp. 8514-8521, 1994.
[4] n. nešić, g. ristić, j. karamarković and m. m. pejović, "modelling of time delay of electrical breakdown for nitrogen-filled tubes at pressures 6.6 and 13.3 mbar in the increase region of the memory curve", j. phys. d: appl. phys., vol. 41, p. 225205, 2008.
[5] k. bergmann, g. schriever, o. rosier, m. müller, w. neff, and r. lebert, "highly repetitive, extreme-ultraviolet radiation source based on a gas-discharge plasma", applied optics, vol.
38, pp. 5413-5417, 1999.
[6] j. g. kim, h. j. cho, s. k. park, s. h. lee, b. g. choi, j. y. an, y. i. cheon, y. h. jeon, t. ishigaki, k. kang and w. s. yoo, "investigation of unexpected residual effects of ultraviolet based measurements of sio2/si interface by photoluminescence", ecs solid state lett., vol. 3, pp. n11-n14, 2014.
[7] n. philip, b. n. saoudi, m. c. crevier, m. moisan, j. barbeau, j. pelletier, "the respective roles of uv photons and oxygen atoms in plasma sterilization at reduced gas pressure: the case of n2-o2 mixtures", ieee trans. on plasma sci., vol. 30, pp. 1429-1436, 2002.
[8] a. m. anpilov, e. m. barkhudarov, yu b. bark, yu v. zadiraka, m. christofi, yu n. kozlov, i. a. kossyi, v. a. kop'ev, v. p. silakov, m. i. taktakishvili and s. m. temchin, "electric discharge in water as a source of uv radiation, ozone and hydrogen peroxide", j. phys. d: appl. phys., vol. 34, pp. 993-999, 2001.
[9] xin miao zhao, j. c. diels, cai yi wang, j. m. elizondo, "femtosecond ultraviolet laser pulse induced lightning discharges in gases", ieee journal of quantum electronics, vol. 31, pp. 599-612, 2002.
[10] a. v. phelps, z. lj. petrović and b. m. jelenković, "oscillation of low-current electrical discharges between parallel-plane electrodes. iii. models", physical review e, vol. 47, pp. 2825-2838, 1993.
[11] z. lj. petrović and a. v. phelps, "temporal and constriction behavior of low-pressure, cathode-dominated argon discharges", physical review e, vol. 56, pp. 5920-5931, 1997.
[12] m. m. pejović and m. m. pejović, electrical breakdown of gases: measuring systems and experimental research, university of niš: faculty of electronic engineering, 2009, in serbian.
[13] avantes spectrometer avaspec 3648, datasheet. [online]. available at http://www.wacolab.com/avantes/spectrometers14.pdf.
[14] y. smirnov and n. yudin, nuclear physics, moscow: nauka, 1980.
[15] v. s. fomenko, emissionny svoystva materialov, spravochnik, kiev: naukova dumka, 1970, in russian.
[16] n. a.
ashcroft and n. d. mermin, solid state physics, new york: holt, riehart and winston, 1976. [17] e. n. živanović, "influence of combined gas and vacuum breakdown mechanisms on memory effect in nitrogen", vacuum, vol. 107, pp. 62-67, 2014. [18] j. m. meek and j. d. craggs, electrical breakdown of gases, new york: john wiley and sons inc., 1978. [19] a. pedersen, "on the electrical breakdown of gaseous dielectrics-an engineering approach", ieee trans. electr. insul., vol. 24, pp. 721-739, 1989. [20] d. blois, p. suppiot, m. bary, a. chapput, c. foissac, o. dessaux and p. goudmand, "the microwave source's influence on the vibrational energy carried by n2(x) in a nitrogen afterglow", j. phys. d: appl. phys., vol. 31, pp. 2521-2531, 1998. [21] b. f. gordiets, c. m. ferreira, m. j. pinheiro and a.
ricard, "self-consistent kinetic model of low-pressure n2h2 flowing discharges: ii. surface processes and densities of n, h, nh3 species", plasma sources sci. technol., vol. 7, pp. 363-378, 1998. [22] b. f. gordiets, c. m. ferreira, v. guerra, j. loureiro, j. nahorny, d. pagnon, m. touzeau and m. vialle, "kinetic model of a low pressure n2-o2 flowing discharge", ieee trans. plasma sci., vol. 23, pp. 750-68, 1995. [23] e. eslami, c. foissac, a. camparague, p. supiot and n. sadeghi, "vibrational and rotational distributions in n2(a) metastable plasma", in proceedings of the xvi europhysics conference on atomic and molecular physics of ionized gases (escampig) 5 th international conference on reactive plasmas (icrp) join meeting, grenoble, france 2002, european physical society, vol.1, p.57. [24] v. guerra, p. sa and j. loureiro, "kinetic modeling of low pressure nitrogen discharge of the postdischarge", eur. j. appl phys., vol. 28, pp. 125-152, 2004. [25] p. supiot, o. dessaux and p. goudmand, "spectroscopic analysis of the nitrogen short-lived afterglow induced at 433 mhz," j. phys. d: appl. phys., vol. 28, pp. 1826-1839, 1995. [26] a. a. matveyev and v. p. silakov, "theoretical study of the role of ultra-violet radiation of the nonequilibrium plasma in the dynamics of the microwave discharge in molecular nitrogen", plasma sources sci. technol., vol. 8, pp. 162-178, 1999. [27] p. sa, v. guerra, j. loureiro and n. sadeghi, "self-consistent kinetic model of short-lived afterglow in flowing nitrogen", j. phys. d: appl. phys., vol.37, pp. 221-231, 2004. [28] j. levaton, j. amorim, souza, d. franco and a. ricard, "kinetics of atoms, metastable, radiative and ionic species in the nitrogen pink afterglow", j. phys. d: appl. phys., vol. 35, pp. 689-699, 2002. [29] von engel a, ionized gases, oxford: clarendon, 1965. [30] z. lj. petrović, v. lj. marković, m. m. pejović and s. r. 
gocić, "memory effects in the afterglow: open questions on long-lived species and the role of surface processes", j. phys. d: appl. phys., vol. 34, pp. 1756-1768, 2001. [31] w. brennen and e. c. shane, "the nitrogen afterglow and the rate of recombination of nitrogen atoms in the presence of nitrogen, argon and helium", j. phys. chem., vol. 75, p. 1552, 1971. [32] j. berkowitz, w. a. chupka and g. b. kistiakowsky, "mass spectrometric study of the kinetics of nitrogen afterglow", j. chem. phys., vol. 25, p. 457, 1956. [33] g. cernogora, c. m. ferreira, l. hochard, m. touzeau and j. loureiro, "vibrational populations of n2(a 3 u + ) in a pure nitrogen glow discharge", j. phys. b: at. mol. phys., vol. 17, pp. 4429-4437, 1984. [34] g. g. manella, r. r. reeves and p. harteck, "surface catalyzed excitation with n and o", j. chem. phys., vol. 33, p. 636, 1960. [35] i. a. kossyi, a. y. kostinsky, a. a. matveyev and v. p. silakov, "kinetic scheme of the nonequilibrium discharge in nitrogen-oxygen mixture", plasma sources sci. technol., vol. 1, pp. 207-220, 1992.

facta universitatis series: electronics and energetics vol. 32, no. 1, march 2019, pp. 119-128 https://doi.org/10.2298/fuee1901119s

design of efficient coplanar 1-bit comparator circuit in qca technology

ahmadreza shiri, abdalhossein rezai, hamid mahmoodian
acecr institute of higher education, isfahan branch, isfahan 84175-443, iran

abstract. qca technology is an emerging and promising technology for implementation of digital circuits in nano-scale. the comparator circuits play an important role in digital circuits. in this work, a new and efficient coplanar 1-bit comparator circuit is proposed and evaluated in the qca technology. the designed coplanar 1-bit qca comparator circuit is constructed based on majority gate, xnor gate and inverter gate that are designed carefully. the functionality of the designed coplanar 1-bit qca comparator circuit is verified by using qcadesigner version 2.0.3.
the obtained results indicate that the designed 1-bit qca comparator circuit requires an area of 0.03 µm² and 38 qca cells. it also has a delay of 0.5 clock cycles. the comparison demonstrates that the designed qca comparator circuit improves on other qca comparator circuits in terms of effective area, cell count, delay and cost.

key words: comparator, quantum-dot cellular automata, high-performance design, coplanar circuit

received may 8, 2018; received in revised form september 14, 2018
corresponding author: abdalhossein rezai, acecr institute of higher education, isfahan branch, isfahan 84175-443, iran (e-mail: rezaie@acecr.ac.ir)

1. introduction

two important issues in vlsi design are scaling and reducing the computation time. the quantum-dot cellular automata (qca) technology is an emerging and promising technology for addressing these issues at the nano-scale [1]. the basic element in this technology is a square cell that has two free electrons in four dots [1-14]. the qca cell is the building block for constructing qca gates [1-14]. there are three basic gates in this technology: the inverter gate, the majority (m) gate, and the xor gate [3-4]. these gates are the building blocks for constructing logic circuits such as qca multiplexers [5, 7], qca full adders [1-3, 6, 8] and qca comparators [9-12, 15-18].

on the other hand, comparator circuits play an important role in digital circuits such as microcontrollers [6, 12, 15-18]. thus, the implementation of high-performance comparator circuits has received a great deal of attention, and a lot of effort [10-12, 15-18] has been invested in improving the performance of qca comparator circuits. das and de [10] have presented a 1-bit qca comparator that requires 0.343 µm² area and 319 qca cells. abdullah-al-shafi and bahar [11] have presented a 1-bit qca comparator, which requires 0.182 µm² area and 117 qca cells. sinha roy et al.
[12] have proposed two qca comparator circuits that require 40 and 37 qca cells and 0.032 and 0.028 µm² area, respectively. ghosh et al. [16] have presented a 1-bit qca comparator circuit that requires 0.06 µm² area and 73 qca cells. akter et al. [18] have presented a 1-bit qca comparator circuit, which requires 0.11 µm² area and 87 qca cells. bhoi et al. [17] have presented a 1-bit qca comparator circuit that requires 0.23 µm² area and 220 qca cells.

this study proposes an efficient coplanar 1-bit qca comparator circuit. the designed qca comparator is based on majority, xor and inverter gates. the functionality of the designed circuit is verified by using qcadesigner version 2.0.3. the simulation results show that the designed coplanar 1-bit qca comparator circuit provides improvements compared with other 1-bit qca comparator circuits in terms of cell count, area, delay time and cost. the rest of this study is organized as follows: section 2 provides a review of qca technology. in section 3, the designed qca comparator circuit is presented. the results and comparison of the designed qca comparator circuit are provided in section 4. the conclusion is presented in section 5.

2. background

2.1. qca cell

figure 1 shows the basic qca cell and its possible states. each qca cell can be considered as a square including four quantum dots and a pair of electrons [1, 5]. the electrons are located at diagonally opposite dots due to the coulomb interaction between them. each cell therefore has two stable configurations, whose polarizations are specified as -1 and +1. these polarizations denote the binary values of 0 and 1, respectively [1, 5].

fig. 1. the possible states for the qca cell [5]

2.2. qca gates

there are three fundamental gates in this technology: inverter, majority, and xor gates, which are used to construct the circuits in this technology [11, 13, 19]. figure 2 shows these qca gates [3, 13].
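the logical behaviour of the three fundamental gates can be sketched in a few lines of python. this is only a bit-level model, not a cell-level qca simulation: the function names are illustrative, the three-input xor is assumed to behave as a parity function, and the "and"/"or" helpers rely on the standard property that a majority gate with one input fixed at 0 or 1 reduces to a two-input "and" or "or".

```python
from itertools import product


def inverter(a: int) -> int:
    """Inverter gate: the output is the complement of the input bit."""
    return 1 - a


def majority(a: int, b: int, c: int) -> int:
    """Three-input majority gate: 1 when at least two inputs are 1."""
    return 1 if (a + b + c) >= 2 else 0


def xor3(a: int, b: int, c: int) -> int:
    """Three-input XOR, modelled here as the parity of the inputs (assumption)."""
    return a ^ b ^ c


def polarization(bit: int) -> int:
    """Binary 0 maps to cell polarization -1, binary 1 to +1."""
    return 2 * bit - 1


def and_gate(a: int, b: int) -> int:
    """Two-input AND obtained by fixing one majority input at logic 0."""
    return majority(a, b, 0)


def or_gate(a: int, b: int) -> int:
    """Two-input OR obtained by fixing one majority input at logic 1."""
    return majority(a, b, 1)


if __name__ == "__main__":
    # Print the truth table of the derived two-input gates.
    for a, b in product((0, 1), repeat=2):
        print(a, b, and_gate(a, b), or_gate(a, b), xor3(a, b, 0), inverter(a))
```

fixing the third input is what lets a single gate layout serve several logic functions, which is why the majority gate is treated as a basic building block.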
fig. 2. qca gates: (a) corner inverter, (b) robust inverter, (c) original majority gate (omg), (d) rotated majority gate (rmg), (e) xor gate [11, 3, 13, 19]

in figure 2(a) and figure 2(b), the output of each inverter is the inverted polarization of its input. figure 2(c) and figure 2(d) show two kinds of qca three-input majority gates. figure 2(e) shows the three-input qca xor gate [6, 13].

2.3. qca comparator

comparator circuits play an important role in digital circuits [9-12, 15-18]. a comparator compares its two inputs. suppose a and b are the two inputs of the comparator circuit; the outputs of this circuit are defined as follows [15]:

$$\text{output}(a<b) = \bar{a} \cdot b, \quad \text{output}(a=b) = \overline{a \oplus b}, \quad \text{output}(a>b) = a \cdot \bar{b} \tag{1}$$

for implementation of the comparator circuit in the qca technology, the product terms of equation (1) are reformulated with majority gates as follows [15]:

$$\text{output}(a<b) = m(\bar{a}, b, 0), \quad \text{output}(a>b) = m(a, \bar{b}, 0) \tag{2}$$

2.4. related works

das and de [10] have developed a qca comparator circuit by combining the functional properties of the feynman and tr gates, which is shown in figure 3.

fig. 3. the utilized qca comparator circuit in [10]

this qca comparator circuit requires 0.343 µm² area and 319 qca cells. abdullah-al-shafi and bahar [11] have developed a qca comparator circuit without wire-crossing, which is shown in figure 4.

fig. 4. the utilized qca comparator circuit in [11]

this qca comparator circuit requires 0.182 µm² area and 117 qca cells. sinha roy et al. [12] have developed a qca comparator circuit based on layered t "or" and layered t "and" gates, which is shown in figure 5.

fig. 5. the utilized qca comparator circuits in [12]

this multilayer qca comparator circuit requires 0.03 µm² area and 37 qca cells. ghosh et al. [16] have developed a qca comparator circuit, which is shown in figure 6.

fig. 6.
the utilized qca comparator circuit in [16]

this qca comparator circuit requires 0.06 µm² area and 73 qca cells. bhoi et al. [17] have developed a qca comparator circuit, which is shown in figure 7.

fig. 7. the utilized qca comparator circuit in [17]

this qca comparator circuit requires 0.23 µm² area and 220 qca cells. akter et al. [18] have developed a qca comparator circuit based on tr and feynman gates, which is shown in figure 8.

fig. 8. the utilized qca comparator circuit in [18]

this qca comparator requires 0.11 µm² area and 87 qca cells. although these qca comparator circuits are suitable, the performance of the comparator can be improved, as described in the next section.

3. the proposed qca comparator circuit

the proposed qca comparator circuit has two 1-bit inputs and three 1-bit outputs. the inputs are indicated by a and b, and the outputs are indicated by l(a, b), e(a, b), and g(a, b). the relations between the outputs and the inputs are defined as follows:

$$l(a, b) = \bar{a} \cdot b, \quad e(a, b) = \overline{a \oplus b}, \quad g(a, b) = a \cdot \bar{b} \tag{3}$$

as shown in equation (3), if the input a is less than the input b, the output l(a, b) is “1” and the other outputs are “0”. moreover, if the input a is greater than the input b, the output g(a, b) is “1” and the other outputs are “0”. otherwise, the inputs a and b are equal, the output e(a, b) is “1” and the other outputs are “0”. figure 9 shows the designed 1-bit qca comparator circuit.

fig. 9. the designed 1-bit qca comparator circuit: (a) block diagram, (b) layout

the designed 1-bit comparator consists of 2 original majority gates (fig. 2(c)), 3 inverter gates (fig. 2(a)) and an xor gate (fig. 2(e)). the majority gates in the developed 1-bit qca comparator circuit are used to implement logical "and" gates; therefore, one input of each of these majority gates is fixed at logic "0". the designed 1-bit qca comparator circuit requires 38 qca cells.

4.
simulation results and comparison

the designed 1-bit qca comparator circuit is simulated by using the qcadesigner tool version 2.0.3. the following parameters are used for the simulation: number of samples: 12800, radius of effect [nm]: 65, convergence tolerance: 0.001, relative permittivity: 12.9, clock low [j]: 3.8e-23, clock high [j]: 9.8e-22, clock shift: 0, and clock amplitude factor: 2. the other simulation parameters are left at their default values. figure 10 shows the simulation results of the designed 1-bit comparator circuit.

fig. 10. the results for the designed 1-bit comparator circuit

these results demonstrate that the outputs of the designed 1-bit comparator circuit are correctly obtained after a delay of 0.5 clock cycles. moreover, the designed 1-bit qca comparator circuit requires 0.03 µm² area and 38 qca cells. table 1 summarizes the simulation results of the designed 1-bit comparator circuit compared with the other 1-bit comparator circuits in [10-12, 16-18].

table 1 the comparison table for 1-bit qca comparator circuit

| reference | cell count | area (µm²) | time delay (clock cycle) | crossover | cost |
|---|---|---|---|---|---|
| [10] | 319 | 0.343 | 3 | multilayer | 1.029 |
| [11] | 117 | 0.182 | 1 | coplanar | 0.182 |
| [12] design 1 | 40 | 0.032 | 1 | multilayer | 0.032 |
| [12] design 2 | 37 | 0.028 | 1 | multilayer | 0.028 |
| [16] | 73 | 0.06 | 1 | coplanar | 0.060 |
| [18] | 87 | 0.11 | 0.50 | coplanar | 0.055 |
| [17] | 220 | 0.23 | 0.75 | coplanar | 0.172 |
| this paper | 38 | 0.030 | 0.50 | coplanar | 0.015 |

in this table, area and delay are given in µm² and clock cycles, respectively. moreover, the following equation is used to determine the cost value, based on [1, 5, 7]:

$$\text{cost} = \text{area} \times \text{delay} \tag{4}$$

as shown in table 1, the designed 1-bit comparator circuit has advantages in terms of cost and area compared to [10-12, 16-18].
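a short python sketch can be used to cross-check both the logic of the comparator and the cost metric of equation (4). it is a bit-level model under stated assumptions, not the qcadesigner simulation: the equality output is modelled as an inverted two-input xor, the names `comparator`, `cost` and `improvement` are illustrative, and the numeric rows are copied from table 1.

```python
from itertools import product


def majority(a: int, b: int, c: int) -> int:
    """Three-input majority gate: 1 when at least two inputs are 1."""
    return 1 if (a + b + c) >= 2 else 0


def comparator(a: int, b: int) -> tuple:
    """Return (l, e, g) for 1-bit inputs a and b."""
    less = majority(1 - a, b, 0)      # l(a, b): (NOT a) AND b via majority with a fixed 0
    equal = 1 - (a ^ b)               # e(a, b): inverted XOR (assumed form of the equality output)
    greater = majority(a, 1 - b, 0)   # g(a, b): a AND (NOT b) via majority with a fixed 0
    return less, equal, greater


# (cell count, area in um^2, delay in clock cycles), copied from table 1
TABLE_1 = {
    "[10]": (319, 0.343, 3.0),
    "[11]": (117, 0.182, 1.0),
    "[12] design 1": (40, 0.032, 1.0),
    "[12] design 2": (37, 0.028, 1.0),
    "[16]": (73, 0.06, 1.0),
    "[18]": (87, 0.11, 0.5),
    "[17]": (220, 0.23, 0.75),
    "this paper": (38, 0.030, 0.5),
}


def cost(area: float, delay: float) -> float:
    """Cost metric of equation (4): area multiplied by delay."""
    return area * delay


def improvement(reference: float, proposed: float) -> float:
    """Relative reduction of `proposed` with respect to `reference`, in percent."""
    return 100.0 * (reference - proposed) / reference


if __name__ == "__main__":
    for a, b in product((0, 1), repeat=2):
        print("a =", a, "b =", b, "(l, e, g) =", comparator(a, b))
    for name, (cells, area, delay) in TABLE_1.items():
        print(name, "cost =", round(cost(area, delay), 4))
```

for every input pair exactly one of the three outputs is 1, which is the property the simulated waveforms are expected to show.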
for example, the cell count, area, delay and cost of the designed 1-bit qca comparator circuit are improved compared to the 1-bit qca comparator circuit in [10] by about 88%, 91%, 83% and 98%, respectively. the only 1-bit qca comparator circuit that requires a slightly lower cell count and area than the designed qca comparator circuit is the one in [12] (design 2). however, this advantage results from the increased number of layers, not from the logic design. in addition, the delay time and cost of the proposed 1-bit qca comparator circuit are reduced by about 50% and 40% compared to the 1-bit qca comparator circuit in [12] (design 2).

5. conclusions

qca technology is a promising technology for implementation of digital circuits in nano-scale [1-6]. comparator circuits play an important role in digital circuits [9-12, 16-18]. in this study, an efficient 1-bit qca comparator circuit was proposed and evaluated. the designed 1-bit qca comparator circuit was constructed from carefully designed majority, xnor and inverter gates. the functionality of the designed 1-bit comparator circuit was verified by using qcadesigner version 2.0.3. the obtained results indicate that the designed 1-bit comparator circuit requires 0.03 µm² area and 38 qca cells. it also has a delay of 0.5 clock cycles. the results showed that the designed 1-bit comparator circuit provided improvements compared with the other 1-bit comparator circuits in [10-12, 16-18] in terms of cell count, effective area, and delay as well as cost.

references

[1] h. rashidi, a. rezai, “high-performance full adder architecture in quantum-dot cellular automata,” j. eng., vol. 2017, pp. 394–402, 2017.
[2] d. mokhtari, a. rezai, h. rashidi, f. rabiei, s. emadi, a. karimi, “design of novel efficient full adder circuit for quantum-dot cellular automata technology,” facta univ. series: electr. energy, vol. 31, no. 2, pp. 279-285, 2018.
[3] i. edrisi arani, a.
rezai, “novel circuit design of serial-parallel multiplier in quantum-dot cellular automata technology,” j. comput. electr., 2018.
[4] m. niknejad divshali, a. rezai, a. karimi, “towards multilayer qca siso shift register based on efficient d-ff circuits,” int. j. theor. phys., 2018.
[5] h. rashidi, a. rezai, “design of novel efficient multiplexer architecture for quantum-dot cellular automata,” j. nano electr. phys., vol. 9, no. 1, pp. 1-7, 2017.
[6] m. balali, a. rezai, h. balali, f. rabiei, s. emadid, “towards coplanar quantum-dot cellular automata adders based on efficient three-input xor gate,” result phys., vol. 7, pp. 1389-1395, 2017.
[7] h. rashidi, a. rezai, s. soltani, “high-performance multiplexer architecture for quantum-dot cellular automata,” j. comput. electr., vol. 15, pp. 968-981, 2016.
[8] m. balali, a. rezai, “design of low-complexity and high-speed coplanar four-bit ripple carry adder in qca technology,” int. j. theor. phys., pp. 1-13, 2018.
[9] d. bahrepour, “a novel full comparator design based on quantum-dot cellular automata,” int. j. inf. electr. eng., vol. 15, pp. 406-410, 2015.
[10] j. c. das, d. de, “reversible comparator design using quantum dot cellular automata,” iete j. res., vol. 62, pp. 323-330, 2016.
[11] m. d. abdullah-al-shafi, a. n. bahar, “optimized design and performance analysis of novel comparator and full adder in nanoscale,” cogent eng., vol. 3, 2016.
[12] s. sinha roy, c. mukherjee, s. panda, a. k. muchopadhyay, b. maji, “layered t comparator design using quantum-dot cellular automata,” ieee conf. dev. integ. circ. (devic), pp. 90-94, 2017.
[13] a. n. bahar, s. waheed, “a novel 3-input xor function implementation in quantum dot-cellular automata with energy dissipation analysis,” alexandria eng. j., in press, 2018.
[14] j. c. das, d. de, “novel low power reversible binary incrementer design using quantum-dot cellular automata,” microprocess microsyst., vol. 42, pp. 10-23, 2016.
[15] a.
sarker, md. badrul alam miah, “design of 1-bit comparator using 2 dot 1 electron quantum-dot cellular automata,” int. j. adv. comput. sci. appl., vol. 8, no. 3, pp. 481-485, 2017.
[16] b. ghosh, sh. gupta, s. kumari, “quantum dot cellular automata magnitude comparators,” ieee int. conf. electr. dev. solid state circ. (edssc), pp. 1-2, 2012.
[17] b. k. bhoi, n. k. misra, m. pradhan, “a universal reversible gate architecture for designing n-bit comparator structure in quantum-dot cellular automata,” int. j. grid distr. comput., vol. 10, no. 9, pp. 33-46, 2017.
[18] r. akter, n. islam, s. waheed, “implementation of reversible logic gate in quantum dot cellular automata,” int. j. comput. appl., vol. 109, pp. 41-44, 2017.
[19] m. balali, a. rezai, h. balali, f. rabiei, s. emadid, “a novel design of 5-input majority gate in quantum-dot cellular automata technology,” ieee symp. comput. appl. indust. electr. (iscaie), pp. 13-16, 2017.

facta universitatis series: electronics and energetics vol. 32, no. 3, september 2019, pp. 359-368 https://doi.org/10.2298/fuee1903359a
© 2019 by university of niš, serbia | creative commons license: cc by-nc-nd

an inexpensive anemometer using arduino board *

elson avallone, paulo césar mioralli, pablo sampaio gomes natividade, paulo henrique palota, josé ferreira da costa, jonas rafael antonio, sílvio aparecido verdério junior
federal institute of education, science and technology of sao paulo, catanduva-sp, brazil

abstract. accurate speed information is required for decision making in all studies involving wind speed, such as meteorology, wind turbines and agriculture. there are several types of anemometers, with medium and high costs, such as cup, hot-wire and pitot-tube anemometers, the hot-wire type being more sensitive and expensive than the others. the device developed in this work is a cup anemometer, which is easy to build.
the great advantage of this device is its low cost, approximately us$ 50.00, using simple materials that are easy to find in commercial stores. the reed-switch sensor is another advantage, as it does not require sophisticated programming, and so is the open arduino platform. the use of theoretical aerodynamic drag coefficients and the presented calculations resulted in values very close to those of a commercial anemometer. the coefficient of determination between the cup anemometer and the standard sensor of the meteorological research institute ipmet/brazil is r² = 0.9999, indicating a strong correlation between the instruments. as the reference anemometer (ipmet) has high embedded technology and the prototype is low cost, we conclude that the project has an attractive cost-benefit ratio for possible development and production, reaching the objective of this work.

key words: anemometer, low cost, airspeed measurement, arduino, open hardware.

received february 13, 2019; received in revised form april 25, 2019
corresponding author: elson avallone, federal institute of education, science and technology of sao paulo, catanduva-sp, brazil (e-mail: elson.avallone@ifsp.edu.br)
* an earlier version of this paper was presented at the 4th virtual international conference on science, technology and management in energy, energetics 2018, october 25-26, niš, serbia [1]

1. introduction

this work is an extended version of the paper presented in [1], where the operating principles of a low-cost anemometer were presented. this anemometer is part of the electronic meteorological station project [2], authorized by the campus director and developed by enaco-energy and related applications [3], catanduva-sp-brazil. the meteorological station will promote the development of other surveys and will also assist the city civil defense and the farmers of the micro-region in alerting the population, making decisions and keeping historical data.
there is a wide variety of anemometers, the most common being cup anemometers. they consist of cups attached to stems that are fixed to a central axis. other types include hot-wire anemometers, which measure the temperature change of an electrically heated wire caused by the passage of wind; pitot probes, which use the bernoulli principle; and propeller anemometers, which have a front propeller to determine wind speed. the first anemometer applying scientific principles was developed by [4]. in [5], the authors present historical fragments on the technological evolution of anemometers, and [6] presents the various types of anemometers and their applications. in [7], the authors show the importance of basic meteorological data for assessing the impacts of climate variability and predicting impacts on hydrology and agroecosystems using a high-resolution network, showing that the meteorological data set significantly increases the availability of climatic data in brazil. the authors of [8] present a bibliographical review of research involving wind speed and prediction for wind energy, which serves as a source of bibliographic consultation. the researchers in [9] proposed a low-cost ultrasonic planar anemometer with a very interesting price-performance ratio, obtained with the use of the arduino architecture, producing a simple, original and innovative system. in the solar collector surveys developed by [10], [11], [12] and [13], the authors used the same type of anemometer, with results identical to those presented in this work. as evacuated solar collectors were used, the wind speed had an insignificant influence on the results, because this kind of solar collector is insulated from the outside air by both glass and vacuum. therefore, the results were not used in the thermal efficiency calculations.
in meteorology studies, the anemometer is an important instrument for atmospheric analyses [14], [15], where the authors used the equations of angular motion and frequency to determine the rotation of the instrument. in [16], the authors discuss a study involving ultrasound transducers to measure wind speed at an approximate cost of us$ 150.00. the researchers in [17] studied the influence of lattice towers on cup anemometers, analyzing the distance between the tower and the anemometer that gives the least turbulence. to develop a rotational anemometer with conical cups, [18] used an electronic blocks system, which requires calibration to validate the results. another application of the anemometer is in the estimation and measurement of the efficiency of wind turbines [19]. the most common use of cup anemometers is in meteorology and agriculture, to optimize agricultural practices through more precise decision-making. the use of this equipment in brazilian agriculture is limited by its high cost [20]. despite the calibration, the instrument developed by [20] does not provide readings for speeds below 0.85 m/s. in [21], three types of anemometers were tested in food greenhouses, showing that this instrument is very important for the technological development of agriculture. instead of using a first-degree function in the anemometer calibration, [22] uses two harmonic constants, which represent the influence of the cup geometry. in addition, [23] refers to the use of anemometers in wind energy and uses the front and rear drag coefficients and the radius of rotation to determine the instrument constant. the wind energy sector depends on precise measurements of wind speed, so reliable instruments are necessary [24]. the device chosen for this work is the cup anemometer because it is simple and uses only a reed-switch sensor to measure the rotation of the device.
the objective of the work is to construct a low-cost cup anemometer with good efficiency. the device uses an electronic system based on the arduino platform with embedded software, because it is simple and easy to construct, besides being low cost and easily accessible.

2. materials and methods

the objective of the work is to construct a low-cost anemometer with good efficiency, so that any low-income farmer can install this equipment on his property. with this purpose, we looked for cheap materials that can be found in any hardware store or supermarket. the construction begins with the machining of an aluminum central shaft with a fillister for coupling a simple bearing. the other machined part is the central nylon support. three 70 mm aluminum confectionery cups were used. the three cups were bolted to three 5 mm diameter threaded rods with locking nuts and washers. the rods were coated with thermosetting plastic tubes for protection against the weather. the neodymium magnet is glued to the underside of the central nylon support, fig. 1.

fig. 1 anemometer mounted with threaded rods with the plastic coating, neodymium magnet, and nylon central support

the top view of the anemometer with the stems, the central bearing, and the hemispherical cups is shown in figure 2. as the cup anemometer was calibrated beside the standard sensor, the friction is implicit in the calibration. the radius "r = 0.11433 m" is the only factor dependent on the construction of the instrument; it was measured in millimeters using a caliper and the result was converted into meters. the cups depend exclusively on aerodynamic coefficients.

fig. 2 top view of anemometer with a radius of 0.11433 m

the reed-switch [25] is of the "normally open" type and, for appropriate operation, the spacing between the magnet and the sensor must be at most 3 millimeters.
when these two components cross, the "s1" sensor closes the reed-switch circuit, counting one pulse per revolution. the advantage of this circuit is its simplicity of construction and assembly, in addition to the low cost of production. the costs of the reed-switch sensor, the arduino board and the neodymium magnet are us$ 0.99, us$ 3.50 and us$ 1.19, respectively. the circuit with the components is shown in fig. 3. the arduino was connected to a computer, which is the main power source of the circuit; the cup anemometer is therefore powered through the usb input/output of the computer connected to the arduino. the standard ipmet-unesp anemometer has the same kind of power supply, that is, a connection to the usb input/output of the computer of that meteorological research institute. the pulses are counted and controlled by the arduino software [26], which was previously loaded into the microcontroller memory. the 10 kω resistor works as a pull-up, which ensures that the logic input settles at the expected logical level for the external device, i.e., the reed-switch.

fig. 3 electronic circuit with reed-switch sensor connection to arduino microcontroller

to calculate the airspeed, [27] suggests the calculation procedure described in equations (1), (2), (3), (4) and (5), which uses a dimensionless ratio between the frontal and rear drag coefficients, 1.42 and 0.3, respectively, taken from reference [28]. the dimensionless anemometer factor, also suggested by [27], is defined by the aerodynamic characteristics of the cups, extracted from equation (1), or by a relationship between the airspeed [m/s], the anemometer radius r = 0.11433 m and the angular speed [rad/s] defined in equation (4). the airspeed in [m/s] is determined by equation (5). the index used in these equations represents the number of pulses per revolution measured by the reed-switch sensor. the complete arduino programming can be found in the appendices of [30].
equation (3) is used to calculate the uncertainty of the sensor because it contains the radius, which is the variable of the anemometer. the pulse term refers to the number of pulses measured by the reed-switch sensor. this sensor is assumed not to fail; if it does, the cup anemometer does not work. differentiating equation (3) gives the partial derivatives of equation (6), and equation (7) is used to calculate the mean uncertainty of the cup anemometer. the mean anemometer uncertainty value is ⁄ . the standard sensor accuracy is 1%.

2.1. programming the arduino platform

the float variable "wind" of the arduino software is shown in fig. 4.

fig. 4 float variable “wind”

the wind speed is calculated in the arduino software from the constant 2.27, using equation (5), as shown in fig. 5. the symbol f represents the pulse frequency measured by the reed-switch sensor, in [hz].

fig. 5 part of arduino software.

the complete arduino software can be found in the appendices of [30] and the calibration equation is shown in figure 6. the calibration procedure of the cup anemometer for instrument certification was carried out at ipmet-unesp bauru/brazil [31] from 08:00 am to 5:33, with a measurement interval of 1 minute. this means that the sample size (number of measurements) was 727. the reference instrument was a propeller anemometer [32], generating a linear function and a coefficient of determination (r²), which were included in the graph of fig. 6. the anemometer airspeed (vcup) is inserted into the calibration equation, equation (8), to obtain the calibrated airspeed (vreal), both in [m/s]. equation (8) was used only to obtain a comparative check between the cup anemometer and the ipmet standard sensor, i.e., equations (1), (2), (3), (4) and (5) are sufficient for wind speed calculation, as defined by [23].
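the conversion from reed-switch pulses to wind speed can be sketched as follows. this is a minimal python model of the arithmetic, not the arduino code of [30]: the constant 2.27 and the radius 0.11433 m are taken from the text, one pulse per revolution is assumed as described for the reed-switch, and the function names and the counting window are illustrative. the last function assumes the usual kinematic relations, angular speed equal to 2πf and airspeed equal to factor × angular speed × radius, to show which dimensionless factor the constant 2.27 would imply.

```python
import math

RADIUS_M = 0.11433     # rotor radius reported in the text [m]
SPEED_PER_HZ = 2.27    # constant used in the arduino software [m/s per Hz]


def pulse_frequency(pulse_count: int, window_s: float, pulses_per_rev: int = 1) -> float:
    """Rotation frequency in Hz from pulses counted during `window_s` seconds."""
    return pulse_count / (pulses_per_rev * window_s)


def wind_speed(frequency_hz: float) -> float:
    """Wind speed in m/s as the constant 2.27 multiplied by the frequency."""
    return SPEED_PER_HZ * frequency_hz


def implied_anemometer_factor() -> float:
    """Factor K such that V = K * (2 * pi * f) * R reproduces V = 2.27 * f (assumed relation)."""
    return SPEED_PER_HZ / (2.0 * math.pi * RADIUS_M)


if __name__ == "__main__":
    f = pulse_frequency(pulse_count=10, window_s=5.0)  # hypothetical reading
    print("frequency [Hz]:", f)
    print("wind speed [m/s]:", wind_speed(f))
    print("implied factor:", round(implied_anemometer_factor(), 3))
```

counting pulses over a fixed window keeps the firmware simple; a longer window smooths gusts at the price of a slower response.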
in equation (8), the value 0.0021 is the point at which the trend line fitted to the point cloud intercepts the ordinate axis, as shown in fig. 6.
fig. 6 a calibration curve using cup anemometer and reference sensor of ipmet-unesp bauru
fig. 6 shows a positive linear correlation between the speeds of the cup anemometer and those of the reference anemometer. the variables increase and decrease proportionally, indicating a strong linear relationship. following the most widely accepted criteria, this calibration curve should cover the range from 0 to 8 m/s, since winds at the standard sensor site are predominantly between 0 and 8 m/s on normal days. an analysis at higher speeds would only be possible on stormy days. fig. 6 also shows that the coefficient of determination (r²) was 0.9999. this means that 99.99% of the total variation of the anemometer speed is explained by the regression line, confirming that the calibration of the cup anemometer against the ipmet standard was successful. a comparison between the speeds obtained with the standard ipmet sensor and the cup anemometer is shown in fig. 7. we verified that 91% of the cup anemometer measurements are above those of the standard anemometer and 9% are below. the differences are best seen in fig. 8. fig. 7 and fig. 8 together show that the two instruments give very similar results.
fig. 7 airspeed comparison curves between the cup anemometer and the reference sensor of the ipmet-unesp bauru
the percentage difference between the cup anemometer and the ipmet-unesp standard anemometer is shown in fig. 8. positive values indicate that the prototype reads above the standard, while negative values indicate that it reads below the standard. the largest differences from the standard are 2.29% above and 1.05% below, respectively.
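the quantities discussed above (regression line, coefficient of determination and percentage difference) can be computed as in the short python sketch below. it is an illustration under stated assumptions, not the authors' procedure: the percentage difference is assumed to be (vcup − vref)/vref × 100, so that positive values mean the prototype reads above the standard, and the sample values in the usage part are invented solely to exercise the functions.

```python
"""Least-squares line, R^2 and percentage difference (illustrative only)."""


def linear_fit(x, y):
    """Return (slope, intercept) of the ordinary least-squares line y = slope*x + intercept."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(x, y, slope, intercept):
    """Coefficient of determination of the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def percent_difference(v_cup, v_ref):
    """Assumed definition: positive when the prototype reads above the reference."""
    return 100.0 * (v_cup - v_ref) / v_ref


if __name__ == "__main__":
    # invented sample data, not measurements from the paper
    v_cup = [1.0, 2.0, 3.0, 4.0]
    v_ref = [2.1, 4.1, 6.1, 8.1]
    a, b = linear_fit(v_cup, v_ref)
    print(a, b, r_squared(v_cup, v_ref, a, b))
```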
these differences can be explained by the use of theoretical values for the drag coefficients, which were taken from [28]. more precise values of the drag coefficients could be obtained from an experimental wind tunnel analysis. another explanation is that the radius of 0.11433 m may vary slightly for one of the cups in relation to the center of rotation of the cup anemometer. this can change the linear velocity (see equation 3) depending on the direction of the wind. since the reference anemometer (ipmet) has high embedded technology and the prototype is low cost, we conclude that the project has an attractive cost-benefit ratio for possible development and production, reaching the objective of this work.
fig. 8 percentage difference between the cup anemometer and the reference anemometer (ipmet)
3. conclusion
the cup anemometer showed excellent agreement with the reference sensor. it is a good choice not only for small farmers, but also for the evaluation of wind turbines and especially for meteorological stations. the main attraction of this instrument is its low cost, with an approximate value of us$ 50.00, considering the machining time. simplicity of construction is another advantage of this instrument. future activities should also focus on the development of more accurate machining and assembly processes for the cup anemometer, and on the use of real drag coefficients obtained directly from a wind tunnel test. these procedures could further increase the precision of the sensor.
acknowledgment: to the federal institute of education, science and technology of sao paulo, catanduva-sp, brazil for the incentive.
to the meteorological research institute ipmet-unesp bauru/brazil for scientific support.
references
[1] e. avallone, p.c. mioralli, p. s. g. natividade, p. h. palota, j.f. da costa, j. r. antonio, s. a. v. junior, “low cost cup electronic anemometer”, in proceedings of the 4th virtual international conference on science, technology and management in energy, niš, serbia, vol 1, pp. 9–12.
[2] o severino junior, “construção de uma estação meteorológica eletrônica no câmpus catanduva-sp” federal institut of education, science and technology of são paulo, 23-nov-2018.
[3] p. c. mioralli, e. avallone, p. s. g. natividade, p. h. palota, j. f. costa, and s. a. verdério júnior, enaco energia e aplicações correlatas (energy and related applications).
[4] t r robinson, “on a new anemometer”, in proceedings of the royal irish academy (1836-1869), vol. 4, pp. 566–572, 1847.
[5] o f hansen and l kristensen, “fragments of the cup anemometer history”, windsensor, vol. 1, no 1, pp. 3, feb. 2005.
[6] m. a. varejão-silva, meteorologia e climatologia, vol. 1, 2 vols. recife pe brazil.
[7] a c xavier, c w king, and b r scanlon, “daily gridded meteorological variables in brazil (1980-2013)”, international journal of climatology, vol. 36, no 6, pp. 2644–2659, may 2016.
[8] m lei, l shiyan, j chuanwen, l hongling, and z yan, “a review on the forecasting of wind speed and generated power”, renewable and sustainable energy reviews, vol. 13, no 4, pp. 915–920, may 2009.
[9] p luca, a benedetto, b enrico, g francesco, m marco, and m tommaso, “integrated design and testing of an anemometer for autonomous sail drones”, journal of dynamic systems, measurement, and control, vol. 140, no 5, p. 055001, dec. 2017.
[10] e avallone, d g cunha, a padilha, and v l scalon, “electronic multiplex system using the arduino platform to control and record the data of the temperatures profiles in heat storage tank for solar collector”, int j energy environ eng, vol. 7, no 4, pp. 1–8, aug. 2016.
[11] e avallone, “estudo de um coletor solar, tipo tubo evacuado modificado, utilizando um concentrador cilíndrico parabólico cpc”, phd thesis, universidade estadual paulista “júlio de mesquita filho” unesp/feb, brazil, 2017.
[12] e avallone, a i sato, v l scalon, and a padilha, “analisys of thermal efficiency of a modified solar collector type evacuated tube”, reterm, vol. 13, no 1, pp. 3–8, jun. 2014.
[13] e avallone, “avaliação da eficiência térmica de um coletor solar tipo tubo evacuado modificado”, master thesis, universidade estadual paulista júlio de mesquita filho, campus de bauru, 2013.
[14] t ali, s nayeem, m o faruk, m shidujaman, and s m ferdous, “design & implementation of a linear ic based low cost anemometer for wind speed measurement”, presented at the international conference on informatics, electronics & vision, dhaka, bangladesh, 2012, pp. 99–102.
[15] c d dicenzo, b szabados, and n k sinha, “digital measurement of angular velocity for instrumentation and control”, transactions on industrial electronics and control instrumentation, vol. 23, no 1, feb. 1976.
[16] m p del valle, j a u castelan, y matsumoto, and r c mateos, “low cost ultrasonic anemometer”, in proceedings of the 2007 4th international conference on electrical and electronics engineering, 2007, pp. 213–216.
[17] r n farrugia and t sant, “modelling wind speeds for cup anemometers mounted on opposite sides of a lattice tower: a case study”, journal of wind engineering and industrial aerodynamics, vol. 115, pp. 173–183, apr. 2013.
[18] j t fasinmirin, p g oguntunde, k o ladipo, and l dalbianco, “development and calibration of a selfrecording cup anemometer for wind speed measurement”, african journal of environmental science and technology, vol. 5, no 3, 2011.
[19] a accetta, m pucci, g cirrincione, and m cirrincione, “on-line wind speed estimation in im wind generation systems by using adaptive direct and inverse modelling of the wind turbine”, in proceedings of the 2016 ieee energy conversion congress and exposition (ecce), 2016, pp. 1–8.
[20] c a sampaio, m n ullmann, and m camargo, “desenvolvimento e avaliacao de anemometro de copos de facil construcao e operacao”, revista de ciencias agroveterinarias, vol. 4, no 1, pp. 11–16, 2005.
[21] t l funk, “anemometry tools and procedures for greenhouse experiments”, phd thesis, university of illinois at urbana-champaign, illinois usa, 1994.
[22] s. pindado, j. cubas, and f. sorribes-palmer, “on the harmonic analysis of cup anemometer rotation speed: a principle to monitor performance and maintenance status of rotating meteorological sensors”, measurement, vol. 73, pp. 401–418, sep. 2015.
[23] s. pindado, j. perez-alvarez, and s sanches, “on cup anemometer rotor aerodynamics”, sensors, vol. 14, pp. 6198–6217, 2012.
[24] r. wagner, m. courtney, j. gottschall, and p. lindelöw-marsden, “accounting for the speed shear in wind turbine power performance measurement”, wind energy, vol. 14, no 8, pp. 993–1004, nov. 2011.
[25] super ultraminiature, “reed switch” 2012
[26] arduino, “arduino”, arduino, 2018. [online]. available at: https://www.arduino.cc/.
[27] s. pindado, j. cubas, and f. sorribes-palmer, “the cup anemometer, a fundamental meteorological instrument for the wind energy industry. research at the idr/upm institute”, sensors, vol. 14, pp. 21418–21452, 2014.
[28] b. r. munson, d. f. young, t. h. okiishi, and w. w. huebsch, fundamentals of fluid mechanics, 6th ed, vol. 1, 1 vols. u.s.a.: wiley, pp. 509-510, 2009.
[29] r e predolin, “desenvolvimento de um sistema de aquisição de dados usando plataforma aberta”, master’s thesis, universidade estadual paulista “júlio de mesquita filho” unesp/feb, brazil, 2017.
[30] e avallone, “estudo de um coletor solar, tipo tubo evacuado modificado, utilizando um concentrador cilíndrico parabólico cpc”, phd thesis, universidade estadual paulista “júlio de mesquita filho” unesp/feb, brazil, 2017.
[31] ipmet unesp, “ipmet unesp” [online] available at: https://www.ipmet.unesp.br/. [accessed: 09-jan-2019].
[32] young company, “wind monitor model 05103” r m young company 10853

facta universitatis series: electronics and energetics vol. 36, no 1, march 2023, pp. 53-75 https://doi.org/10.2298/fuee2301053m © 2023 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper
optimal power management of dgs and dstatcom using improved ali baba and the forty thieves optimizer
belkacem mahdad
department of electrical engineering, university of biskra, algeria
abstract. in this study, an improved ali baba and the forty thieves optimizer (iaft) is proposed, adapted and applied to enhance the technical performance of the radial distribution network (rdn). the standard aft is governed by two sensitive parameters that balance the exploration and exploitation stages. in the proposed variant, a modification based on sine and cosine functions is introduced to create a flexible balance between intensification and diversification during the search process. the proposed variant, named iaft, is applied to solve various single and combined objective functions, such as the reduction of total power losses (tpl), the minimization of total voltage deviation and the maximization of the loading capacity (lc), under fixed load and considering the random aspect of loads.
the exchange of active power is handled by integrating multiple distributed generation units based on photovoltaic systems (pv), while the optimal management of reactive power is achieved by installing multiple dstatcom devices. the efficiency and robustness of the proposed variant are validated on two rdn, the 33-bus and the 69-bus. the quality of the objective functions achieved and the statistical analysis, compared with results obtained using several recent metaheuristic methods, demonstrate that the proposed iaft is competitive in solving with accuracy various practical problems related to optimal power management of rdn.
key words: ali baba and the forty thieves optimizer, integration of distributed generation, rdn, dstatcom, power losses, loading capacity
received june 12, 2022; revised july 06, 2022; accepted july 24, 2022
corresponding author: mahdad belkacem department of electrical engineering, university of biskra, al e-mail: belkacem.mahdad@univ-biskra.dz
list of abbreviations
iaft: improved ali baba and the forty thieves
rdn: radial distribution network
tpl: total power losses
tvd: total voltage deviation
lc: loading capacity
pv: photovoltaic systems
dstatcom: distributed static compensator
sc: shunt compensator
dg: distributed generation
facts: flexible ac transmission systems
cb: capacitors bank
gwo: grey wolf optimizer
abc: artificial bee colony
aca: ant colony algorithm
iwho: improved wild horse optimization algorithm
sd: standard deviation
bsoa: backtracking search optimization algorithm
simbo-q: swine influenza model-based optimization with quarantine
hho: harris hawks optimization algorithms
moihho: multi-objective improved harris hawks optimization algorithms
ieo: improved equilibrium optimizer
pm: power management
lms: loading margin stability
fwa: fireworks algorithm
bfoa: bacterial foraging optimization algorithm
hsa: harmony search algorithm
tm: taguchi method
ga/pso: genetic algorithm/particle swarm optimization
wca: water cycle algorithm
tsa: tabu search algorithm
itsa: improved tabu search algorithm
egwa: enhanced grey wolf algorithm
mrfa: manta ray foraging algorithm
jfsa: jellyfish search algorithm
mc: margin capacity
tlbo: teaching-learning based optimization
qosimbo-q: quasi-oppositional swine influenza model-based optimization with quarantine
ihho: improved harris hawks optimization algorithms
fuzzy-ias: fuzzy and artificial immune system
1. introduction
for economic reasons, the radial distribution network (rdn) is operated with a simple topology; as a result, the quality of the energy delivered to consumers is strongly affected, which requires urgent measures and additional costs to meet the desired objectives. today, with the wide diffusion of renewable sources such as wind and photovoltaic (pv) energy, the rdn becomes more flexible to operate in terms of improving energy quality and reducing investment cost and emissions. however, the intermittent nature of this energy is the main drawback, since it affects the energy quality delivered to consumers.
recently, many smart management strategies based on the adaptation of novel metaheuristic methods have been proposed for integrating various types of renewable sources to improve the performance of modern rdn [1]. the power management strategies developed so far aim to find optimal solutions to the following technical and economic problems: what are the best locations and sizes of multiple types of dg units; how to find the best locations of conventional capacitor banks (cb) and shunt compensators (sc) based on facts devices [1]; how to optimally coordinate the active power of various types of dg units and the reactive power of shunt compensators [2] in order to optimize several objective functions individually and simultaneously; and finally how to design the optimal reconfiguration of rdn under normal and abnormal situations in the presence of multiple dgs and sc to reduce the total power loss (tpl), improve the total voltage deviation (tvd), reduce emissions, and reduce the total investment cost of modern rdn. a statistical review of the large number of metaheuristic methods introduced in the recent literature reveals that the success of the majority of these methods depends on the structure of the diversification and intensification mechanism. dynamic interaction between exploration and exploitation during the search process allows the algorithm to solve various complex optimization problems accurately [1, 2]. among the many strategies based on recent metaheuristic methods that have been applied successfully to improve the technical and economic performance of rdn, the authors in [3] proposed a hybrid technique combining an analytical method and metaheuristic optimization techniques to solve the optimal location of capacitor banks and improve the performance of various rdn. in [4] a water cycle algorithm is adapted and applied to solve the location and sizing of capacitor banks and dgs in rdn.
in [5] an efficient jellyfish search algorithm is successfully applied to the power management of rdn, including the location and coordination of shunt compensators based on facts devices and dgs, and the reconfiguration operation, to improve the power quality delivered to consumers through improved voltage deviation and reduced tpl. in [6], three novel metaheuristic methods, the grey wolf optimizer, the dragonfly and the moth–flame optimization algorithms, have been applied to solve the optimal location and sizing of multiple dgs and cb in rdn. in [7], a spring search algorithm is applied to solve the optimal integration of capacitor banks and various dgs; various objective functions have been treated to raise the rdn performance. in [8] a hybrid method combining the ga and the pso algorithm is proposed for the optimal setting and sizing of multiple dg units; the multi objective problems are transformed into a single objective function by employing fuzzy optimal theory. in [9], a combined technique based on a genetic algorithm and mathematical optimization is presented to improve the operating cost and reduce the tpl; the proposed hybrid method is validated on three test rdn (10-bus, 33-bus and 69-bus). in [10] the artificial bee colony (abc) method is investigated for the optimal location of dgs considering the operation cost and tpl in rdn. in [11] a probabilistic technique based on pso is proposed for the optimal allocation of dstatcom based facts devices in coordination with renewable sources such as wind turbines and solar photovoltaic (pv) to enhance the rdn. in [12] an approach based on the ant colony algorithm (aca) is presented for the optimal location of dgs to reduce tpl and improve the voltage profile of loads.
in [13] a novel quasi-oppositional chaotic harris hawk’s optimization (qochho) algorithm is adapted to solve the optimal siting and sizing of distributed generation (dg) installed in the 33-bus and the practical brazil 136-bus radial distribution network (rdn), considering different types of load models at three load levels. in [14], an improved wild horse optimization algorithm (iwho) is proposed to improve the reliability of various rdn test systems, the 33-bus, 69-bus and the 119-bus. in [15] a new branch-oriented approach based on circuit theory is presented for loss allocation in rdn considering different load models and dg units. in [16], an improved equilibrium optimizer (ieo) is designed for selecting the suitable location and the most effective size of dgs based on pv systems in practical rdn. owing to its robust characteristic and fast response in regulating the voltage magnitude, in particular in critical situations such as severe faults, the statcom device has also been investigated by researchers to improve the loadability of multi machine systems based on the imperialist competitive algorithm [17] and the cuckoo search algorithm [18]. in [19] the ant lion algorithm is applied for the optimal allocation and sizing of various dgs based on renewable sources. in [20], an efficient reactive power management strategy based on a modern metaheuristic algorithm is proposed for reducing the tpl in rdn. in [21], a novel opposition-based tuned-chaotic differential evolution technique is designed to improve the techno-economic aspect of the optimal placement of dgs in rdn. in [22], an enhanced equilibrium optimizer (eeo) is applied for optimal planning of pv-bes units in rdn considering time-varying demand. in [23], a parallel slime mould algorithm (psma) is proposed for optimal reconfiguration of rdn in coordination with dgs integration.
in [24], a hybrid genetic dragonfly algorithm (hgada) is proposed and applied for optimal allocation of dgs to improve the technical performance of rdn. in [25], a planning strategy based on an improved grey wolf optimizer (igwo) and loss sensitivity (ls) is proposed to improve the integration of dgs in rdn. in [26] an improved coyote optimization algorithm (icoa) is proposed for optimally installing solar photovoltaic sources in rdn. in [27], a single and multi objective technique based on an improved harris hawks optimizer (ihho) is applied for optimal location and sizing of multiple dgs. in [28], an improved meta-heuristic method is proposed to maximize the penetration level of multiple dgs in rdn. in [29], a novel hybrid technique is proposed to solve the multi objective problem related to the integration of multiple cbs and dgs in rdn. recently, the authors in [30] developed a novel optimizer named ali baba and the forty thieves (aft). the efficiency of this technique was validated on many unimodal and multimodal benchmark functions [30]. the results confirmed the particularity of this technique and its ability to solve complex optimization problems. to the best of our knowledge, this technique has not yet been applied to practical problems related to power system operation and control. moreover, the two critical parameters of the standard algorithm, which are responsible for balancing exploration and exploitation, are not general and depend on the problem to be solved. the main contributions of this paper compared with the existing literature are summarized as follows:
1. a novel variant based on aft is proposed and successfully applied to solve the power management of practical rdn.
2. the modification introduced in the standard aft algorithm makes the search mechanism more flexible and interactive in locating the global solution.
3. the active power of multiple dg units and the reactive power of multiple dstatcom devices are optimized in coordination to improve the performance of two standard rdn, the 33-bus and the 69-bus.
4. the results obtained, compared with many recent metaheuristic methods, demonstrate the efficiency of the proposed iaft in solving optimal power management of various rdn.
2. formulation of the energy management problem
the strategy of power management (pm) consists in improving the performance of modern rdn by optimizing, individually or simultaneously, several objective functions formulated as follows:
2.1. tpl improvement
the objective function associated with the minimization of tpl is expressed as follows:
$$obj_1 = TPL = \min\left(\sum_{k=1}^{nbr} P_{loss}(k,k+1)\right) \tag{1}$$
where the active and reactive power losses in the lines are given by the following expressions:
$$P_{loss}(k,k+1) = R_{k,k+1}\left(\frac{P_{k,k+1}^2 + Q_{k,k+1}^2}{|U_k|^2}\right) \tag{2}$$
$$Q_{loss}(k,k+1) = X_{k,k+1}\left(\frac{P_{k,k+1}^2 + Q_{k,k+1}^2}{|U_k|^2}\right) \tag{3}$$
2.2. improvement of loading capacity
the margin capacity (mc), also known as loading margin stability (lms), of the rdn reflects the capability of the network to deliver energy quality under severe situations such as faults and load growth. delivering power quality to consumers under such critical situations is a challenge for experts. in such situations, it is mandatory to dispatch optimally the reactive power delivered by the substation and the reactive power to be injected or absorbed by the distributed statcom devices. the lower the reactive power delivered by the main transformer, the higher the mc of the rdn. the objective function related to the mc is expressed as follows:
$$obj_2 = \max(MC) \tag{4}$$
where mc is the margin capacity of the rdn.
2.3. minimization of tvd
the mathematical expression associated with the minimization of the normalized tvd is formulated as follows:
$$obj_3 = \min(TVD) = \min\left(\sum_{i=1}^{npq} \left|V_{des} - V_i\right|\right) \tag{5}$$
where $V_{des}$ is the permissible voltage magnitude, $V_i$ is the voltage magnitude at load bus i, and npq is the number of load buses.
2.4. improvement of the tvd and the tpl
the tpl and the tvd may be two conflicting objective functions. for practical planning and operation of rdn, it is mandatory to find a balance between tpl and tvd to ensure efficient power quality. this multi objective problem may be solved using the following mathematical expression:
$$obj_4 = \min(TVD, TPL) = \min\left(\alpha\, TPL + (1-\alpha)\, TVD\right) \tag{6}$$
where α is a balancing coefficient introduced to find the compromise solution between tpl and tvd. the two weighting coefficients, α and (1 − α), lie in the range [0 1].
2.5. operation constraints management
2.5.1. active and reactive power balance
to ensure reliable operation of rdn under normal and abnormal conditions, the following equality constraints must hold:
$$P_{tr,slack} + \sum_{i=1}^{ndg} P_{dg,i} = \sum_{i=1}^{nl} P_{d,i} + \sum_{k=1}^{nbr} P_{loss,k} \tag{7}$$
$$Q_{tr,slack} + \sum_{i=1}^{ndg} Q_{dg,i} = \sum_{i=1}^{nl} Q_{d,i} + \sum_{k=1}^{nbr} Q_{loss,k} \tag{8}$$
2.5.2. security constraints
the security constraints consist of inequality constraints associated with the secure operation of all elements of the rdn.
▪ voltage constraint: the voltage magnitude is an important index of power quality. to satisfy consumers, the voltage magnitude must stay within its security limits.
$$V_i^{min} \le V_i \le V_i^{max}, \quad i = 1, 2, \ldots, nbus \tag{9}$$
▪ dg constraints: the active power delivered by the dg units, which is considered as a control variable, must be kept within specified security limits.
$$P_{dg,i}^{min} \le P_{dg,i} \le P_{dg,i}^{max}, \quad i = 1, 2, \ldots, ndg \tag{10}$$
▪ level of dg integration: due to the stochastic and intermittent nature of various types of dgs, the active power delivered by dgs such as pv and wind sources must be dispatched within its security range. the penetration level (μ) is introduced through the following operating inequality constraint:
$$\sum_{i=1}^{ndg} P_{dg,i} \le \mu \sum_{j=1}^{nl} P_{d,j} \tag{11}$$
where μ is the level of active power penetration in the rdn.
▪ dstatcom constraints: the dstatcom device must be operated within its admissible reactive power limits.
$$Q_{stc,i}^{min} \le Q_{stc,i} \le Q_{stc,i}^{max} \tag{12}$$
▪ line current transit in branches: the current in each line must not exceed its permissible value.
$$I_{li} \le I_{li}^{max}, \quad i = 1, \ldots, nl \tag{13}$$
3. modeling of dstatcom device
the dstatcom device is a shunt compensator from the facts family designed principally to regulate the voltage magnitude at a specified bus.
fig. 1 model of dstatcom device
compared to capacitor banks (cb) and svc devices, the dstatcom controller has a robust characteristic and can regulate the voltage magnitude in critical situations. the dstatcom devices regulate the voltages by injecting reactive power into the network or absorbing it from the network. fig. 1 shows the basic structure of the dstatcom device.
3. distributed generation
for practical installation, as shown in fig. 2, the dg units are classified into three categories:
category 1: this category includes all types of dg units which can only exchange active power with the network, such as the pv sources which have been intensively integrated in many practical electrical networks worldwide.
category 2: this category includes dg units which can exchange active power with the network and absorb reactive power.
wind-based renewable sources, which belong to this category, are also integrated in various electrical networks.
category 3: this category includes all the dg units which can exchange active power with the network and absorb or inject reactive power. these dgs are efficient since they allow simultaneous control of the active and reactive power exchanged with the network.
in this study an alternative solution is proposed to relieve the main drawback of the pv sources by installing shunt compensators based on facts devices, such as the dstatcom, to ensure flexible control of reactive power in coordination with the active power.
fig. 2 categories of dgs units: a) dgs with only active power control, b) dgs with active power control and only reactive power absorption, c) dgs with active and total reactive power control
4. ali baba and the forty thieves optimizer
the aft mimics human intelligence and interaction in finding the best sources of food, materials and treasures. the algorithm is inspired by the famous tale of ali baba and the forty thieves. the following points summarize the main strategy of the aft [30]:
▪ in the tale of ali baba, the thieves try to find the location of ali baba, so the thieves are the individuals in the search space (environment).
▪ the home of ali baba is the objective function to achieve.
▪ the location of ali baba is considered as the global solution.
▪ the forty thieves search as an interactive group; they travel from an initial location and try to find the best location, which is the house of ali baba.
▪ marjaneh is considered as an intelligent operator designed to deliver astute ways to protect ali baba.
4.1. modeling of aft optimizer
▪ initial positions of n individuals are generated randomly in a search space of dimension d.
bus i p b) dg +q bus i p c) dg +q bus i p a) pv pmin pmax pmax, qmax pmin pmin, qmin pmax, qmax voltage control optimal power manegement of dgs and dstatcom using improved ali baba and the fourty ... 61                 = n d nn d d xxx xxx xxx x .. ..... ..... .. .. 21 22 2 2 1 11 2 1 1 (14) i jx denotes the jth dimension of the ith thief (individual), ( ) i j j j x lb rand ub lb= + − (15) x i, is the position of the ith individual in the search space, ubj, lbj denotes the upper and lower bunds in the jth dimension, ▪ initialize randomly the wit level of marjaneh as follow:                 = n d nn d d mmm mmm mmm m .. ..... ..... .. .. 21 22 2 2 1 11 2 1 1 (16) fitness evaluation: the values of control variables are evaluated during search process based on each thief’s position using the following matrix form. (   ) (   ) (   )                = n d nn n d d xxxf xxxf xxxf f .. ..... ..... .. .. 21 22 2 2 12 11 2 1 11 (17) update locations of thieves: the new locations of thieves can be updated using the following expression: ( ) 1 1 2 3 4 ( ) ( ) sgn ( 0.5) 0.5, a ii i i i it it it it it it it it it x gbest td best y r td y m r rand r r pp +  = + − + − −     (18) where; i itx 1+ denotes the position of the ith thieve at iteration (it+1), i ity is the position of the ali baba at iteration it, tdit is the tracking distance of the thieves at iteration it, ppit is the perception potential of the thieves at iteration it, and ( )a i it m denotes the marjaneh’s intelligence level, the parameter a is defined as: [( 1) ( 1)]a n rand n= − − (19) 62 b. mahdad the tracking distance and the perception potential are formulated as follow: ( ) max1 0 it it etd it − = (20) 0 max0 log ( ) bit itit pp b= 21) update marjaneh astute plane using the following expressions: ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) i i i i it it it it i i i it it it x if f x f m m m if f x f m       =   (22) where, f (.) 
denotes the score of the fitness function.
the key steps of the standard aft algorithm [30]:
input the setting variables of aft: pop_size, iter_max, trial_max, dim, ubj, lbj
randomly generate the initial position, x, of all individuals (thieves) in the search space
initialize the best position (best_it^i) and the global best position (gbest_it) for all individuals
initialize the intelligence degree of marjaneh with respect to all individuals
evaluate the position of all individuals using the appropriate fitness function f(x)
set it ← 1
while (it < iter_max) do
    calculate td_it using eq. 20
    calculate pp_it using eq. 21
    for i = 1, 2, ..., n do
        if (rand ≥ 0.5) then
            if (rand > pp_it) then
                update x_{it+1}^i using eq. 18
            else
                update x_{it+1}^i using eq. 15
            end if
        else
            update x_{it+1}^i using eq. 18
        end if
    end for
    for i = 1, 2, ..., n do
        check the feasibility of the new position
        evaluate and update the new position of the individuals (thieves)
        update the solutions best_it^i and gbest_it
        update m_it^{a(i)} using eq. 22
    end for
    it = it + 1
end while
4.2. proposed variant
the main contribution of the proposed variant is its ability to ensure interaction between the exploration phase and the exploitation phase during the search process. the following modifications are introduced to improve the performance of the original algorithm:
the first modification: the standard aft is governed by various parameters that must be carefully identified and adjusted to reach the near global solution. among these parameters are the tracking distance (td) and the perception potential (pp). the following modeling expressions are suggested to create a flexible balance between diversification and intensification. fig. 3 shows the evolution of the proposed tracking distance (td) and perception potential (pp) during the search process.
fig. 3 evolution of the proposed tracking distance (td) and perception potential (pp) during search process
4.3. analysis methodology
the following steps summarize the iaft-based analysis methodology designed to solve various single and multi objective functions:
1. read the technical data of the rdn, such as the line data and load data.
2. select and specify the objective function.
3. introduce the initial parameters of the iaft, such as population size, maximum number of generations, and trials.
4. run the power flow tool to determine the initial state of the rdn in terms of total power loss, lowest voltage magnitude, and maximum current transit in lines.
5. select preliminary buses to install dgs and dstatcom devices based on the sensitivity power index.
6. run iaft to minimize the objective function.
7. save the best solution.
8. check the convergence condition based on gmax and tmax.
9. return the optimized solutions, such as the best active powers of the dgs, the reactive powers of the dstatcom devices, and the voltage profiles.
5. statistical results analysis
5.1. test 1: rdn 33-bus
this first rdn 33-bus consists of 32 lines and 33 buses; the active and reactive power of the loads to satisfy are 3.715 mw and 2.300 mvar, respectively [2, 31]. fig 4 shows the standard topology of the rdn 33-bus. the performance of the proposed optimizer tool, namely iaft, is demonstrated through the following test cases.
fig. 4 the topology of the rdn 33-bus
5.1.1. case 1: tpl improvement based dstatcom under normal condition
this test case shows the impact of integrating only three statcom devices on the performance of the rdn 33-bus. three efficient locations are considered, at buses 14, 24 and 30. the maximum size of each statcom device is 1 mvar. with the voltage limits of all pq buses in the range [0.95 1.05] p.u, the optimized tpl found using the proposed iaft is 126.5868 kw, and with voltage limits in the range [0.9 1] p.u, the optimal tpl achieved is 132.2102 kw.
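as a reading aid for the results that follow, the standard aft loop of section 4.1 can be sketched in python. this is a minimal sketch under stated assumptions, not the author's implementation: the decay forms and constants used for td and pp, the way ali baba's position y is refreshed, the random index that stands in for eq. 19, the clipping used as the feasibility check, and the reading of eq. 22 in which a lower objective value counts as better are all illustrative choices, and the name aft_minimize is hypothetical.

```python
import math
import random


def aft_minimize(f, lb, ub, pop_size=10, iter_max=40, seed=0):
    """Illustrative AFT-style minimiser; see the comments for assumptions."""
    rng = random.Random(seed)
    dim = len(lb)

    def random_point():
        # Eq. (15): uniform draw between the lower and upper bounds
        return [lb[j] + rng.random() * (ub[j] - lb[j]) for j in range(dim)]

    x = [random_point() for _ in range(pop_size)]    # thieves
    y = [p[:] for p in x]                            # Ali Baba positions (assumed start)
    m = [random_point() for _ in range(pop_size)]    # Marjaneh's wit levels
    m_fit = [f(p) for p in m]
    best = [p[:] for p in x]
    best_fit = [f(p) for p in x]
    g = min(range(pop_size), key=lambda i: best_fit[i])
    gbest, gbest_fit = best[g][:], best_fit[g]
    history = [gbest_fit]

    for it in range(1, iter_max + 1):
        frac = it / iter_max
        td = math.exp(-2.0 * frac ** 2)              # assumed decay for Eq. (20)
        pp = 0.1 * math.log(2.0 * frac ** 0.1)       # assumed growth for Eq. (21)
        picks = []
        for i in range(pop_size):
            a = rng.randrange(pop_size)              # stand-in for the index of Eq. (19)
            picks.append(a)
            if rng.random() >= 0.5 and rng.random() <= pp:
                x[i] = random_point()                # random relocation, Eq. (15)
            else:
                sign = 1.0 if rng.random() >= 0.5 else -1.0
                r1, r2 = rng.random(), rng.random()
                x[i] = [gbest[j] + sign * td * ((best[i][j] - y[i][j]) * r1
                                                + (y[i][j] - m[a][j]) * r2)
                        for j in range(dim)]         # Eq. (18)
        for i in range(pop_size):
            # Feasibility check by clipping to the bounds (assumption)
            x[i] = [min(max(v, lb[j]), ub[j]) for j, v in enumerate(x[i])]
            fit = f(x[i])
            a = picks[i]
            if fit <= m_fit[a]:                      # Eq. (22), lower value is better
                m[a], m_fit[a] = x[i][:], fit
            if fit < best_fit[i]:
                best[i], best_fit[i] = x[i][:], fit
            if fit < gbest_fit:
                gbest, gbest_fit = x[i][:], fit
            y[i] = x[i][:]                           # assumed Ali Baba refresh
        history.append(gbest_fit)
    return gbest, gbest_fit, history


# Toy objective for illustration only; a real study would call a power flow here
sphere = lambda v: sum(t * t for t in v)
print(aft_minimize(sphere, [-5.0, -5.0], [5.0, 5.0])[1])
```

in the paper's setting, f would wrap a power flow run that returns the tpl for a candidate set of dg and dstatcom sizes, which is outside the scope of this sketch.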
detailed optimized results for the decision variables of this case are shown in table 1. the results are compared with various metaheuristic methods such as psga, gsa, sa, ip, fpa, mfo, gwo, dfo, pso, and the water cycle algorithm (wca); the proposed iaft achieves the lowest tpl among them. the lowest voltage magnitude is 0.95 p.u reported at bus 18. the convergence behaviour of tpl is shown in fig. 5; only 5 trials are sufficient to locate the best solution.
fig. 5 convergence characteristic of tpl minimization: case 1
table 1 optimized decision variables based three statcom: case 1: rdn 33-bus
methods [3, 5] | limits of v (p.u) | location of scs | scs size (kvar) | min voltage (p.u) | tpl (kw) | tdv (p.u)
psga | [0.95 1.05] | 6, 28, 29 | 1200, 760, 200 | 0.9463 | 135.4 | -
gsa | [0.95 1.05] | 13, 15, 26 | 450, 800, 350 | 0.9672 | 134.5 | -
sa | [0.95 1.05] | 10, 14, 30 | 450, 900, 350 | 0.9591 | 151.75 | -
ip | [0.95 1.05] | 9, 29, 30 | 450, 800, 900 | 0.9501 | 171.78 | -
fpa | [0.95 1.05] | 6, 9, 30 | 250, 400, 950 | 0.9365 | 171.78 | -
mfo | [0.95 1.05] | 8, 13, 30 | 450, 300, 900 | 0.9400 | 134.0725 | -
gwo | [0.95 1.05] | 8, 13, 30 | 450, 300, 900 | 0.9400 (bus 18) | 134.0725 | -
dfo | [0.95 1.05] | 8, 13, 30 | 450, 300, 900 | 0.9400 (bus 18) | 134.0725 | -
pso | [0.95 1.05] | 8, 13, 30 | 450, 300, 900 | 0.9400 (bus 18) | 134.0725 | -
wca | [0.95 1.05] | 14, 24, 30 | 397.3, 451.1, 1000 | 0.951 (bus 18) | 130.912a | -
proposed iaft | [0.9 1] | 14, 24, 30 | 361.10, 547.20, 1043.7 | 0.9389 (bus 18) | 132.2102 | -
proposed iaft | [0.95 1.05] | 14, 24, 30 | 358.70, 541.90, 1036.3 | 0.9601 (bus 18) | 126.5868 | 0.5129
5.1.2. case 2: tpl improvement based three dgs units under normal condition
the main objective of this second test case is to show the impact of integrating only three dg units, without considering the reactive power support of shunt compensators based on dstatcom devices. the maximum size of each dg unit is 2 mw.
it is found that by integrating three dgs at buses 14, 24 and 30, the tpl is reduced to a competitive value of 70.6725 kw when the voltage magnitudes at all pq buses are kept in the range [0.95 1.05] p.u; with the voltage limits at pq buses in the range [0.9 1] p.u, the optimized tpl becomes 71.4572 kw. detailed optimized results of this case are shown in table 2. the obtained results are compared with various competitive metaheuristic methods such as fwa, bfoa, hsa, tm, ga/pso, pso, ga, and the water cycle algorithm (wca); the proposed iaft achieves a better solution with a competitive number of iterations and trials. the lowest voltage magnitude is 0.95 reported at bus 18. the convergence behaviours of tpl are shown in figs 6-7.
fig. 6 convergence characteristic of tpl minimization: case 2, v ∈ [0.9 1] p.u
fig. 7 convergence characteristic of tpl minimization: case 2, v ∈ [0.9 1] p.u
table 2 optimized decision variables based three dgs: case 2: rdn 33-bus
methods [3, 5, 27] | location of dgs | dgs size (kw) | min voltage (p.u) | tpl (kw) | tdv (p.u)
fwa | 14, 18, 32 | 589.70, 189.00, 1014.6 | 0.968 | 88.68 | -
bfoa | 17, 18, 33 | 633.00, 90.00, 947.00 | 0.964 | 98.3 | -
hsa | 17, 18, 33 | 572.4, 107.0, 1046.2 | 0.967 (bus 29) | 96.76 | -
tm | 15, 25, 33 | 587.6, 195.9, 783.0 | 0.958 (bus 30) | 91.305 | -
ga/pso | 11, 16, 32 | 925.0, 863.0, 1200 | 0.980 (bus 25) | 103.4 | -
pso | 8, 13, 32 | 1176.8, 981.60, 829.70 | 0.980 (bus 30) | 105.35 | -
ga | 11, 29, 30 | 1500.0, 422.8, 1070.0 | 0.981 (bus 25) | 106.3 | -
wca | 14, 24, 30 | 854.60, 1101.7, 1181.0 | 0.973 (bus 33) | 71.052 | -
lsf | 18, 33, 25 | 720, 810, 900 | - | 85.07 | -
fuzzy-ias | 32, 30, 31 | 2071, 1113.8, 150.3 | - | 117.36 | -
bsoa | 13, 28, 31 | 632, 486, 550 | - | 89.05 | -
bfoa | 14, 25, 30 | 779, 880, 1083 | - | 73.53 | -
tlbo | 10, 24, 31 | 824.6, 1031.1, 886.2 | - | 75.54 | -
qotlbo | 12, 24, 29 | 880.8, 1059.2, 1071.4 | - | 74.10 | -
simbo-q | 14, 24, 29 | 763.8, 1041.5, 1135.2 | - | 73.4 | -
qsimbo-q | 14, 24, 30 | 770.8, 1096.5, 1065.5 | - | 72.8 | -
hho | 14, 24, 30 | 745.69, 1022.69, 1135.78 | - | 72.98 | -
ihho | 14, 24, 30 | 775.54, 1080.83, 1066.69 | - | 72.79 | -
proposed iaft | 14, 24, 30 | 754.00, 1099.7, 1071.4 | 0.9687 (bus 33) | 71.4572 | 0.5872
proposed iaft | 14, 24, 30 | 748.90, 884.60, 1072.3 | 0.9771 (bus 33) | 70.6726 | 0.3224
5.1.3. case 3: tpl improvement based dgs units and dstatcom under normal condition
in this test case, three dgs and three dstatcom devices are integrated at buses 14, 24 and 30. the proposed iaft is designed to optimize the active powers of the dgs and the reactive powers of the dstatcom devices to be exchanged with the electric network. the optimal tpl achieved is 11.60 kw, which is a significant improvement over the last two cases and over the results achieved using many recent methods such as bfoa, wca, tsa, itsa, egwa, mrfa, and jfsa. details of the optimized control variables are given in table 3. the convergence characteristics for tpl minimization under two levels of dg penetration (76.72 % and 74.47 %) are shown in figs 8-9, respectively; the lowest voltage magnitude is reported at bus 8.
fig. 8 convergence behavior of tpl improvement considering 3 dgs and 3 dstatcom devices: penetration level = 76.72 %: rdn 33-bus
fig. 9 convergence behavior of tpl minimization considering 3 dgs and 3 dstatcom devices, penetration level = 74.47 %: rdn 33-bus
table 3 optimized decision variables based three statcom and three dgs units: case 3: rdn 33-bus
methods [4] | dgs penetration level % | limits of v (p.u) | location of dgs | dgs size (kw) | location of scs | scs size (kvar) | min voltage (p.u) | tpl (kw) | tdv (p.u)
bfoa | 42.98 | [0.95 1.05] | 17, 18, 33 | 542, 160, 895 | 18, 33, 30 | 163, 338, 541 | 0.9783 | 41.41 | -
wca | 68.61 | [0.95 1.05] | 25, 29, 11 | 973, 1040, 536 | 23, 30, 14 | 465, 565, 535 | 0.98 | 24.68 | -
tsa | 71.57 | [0.95 1.05] | 24, 30, 12 | 766, 917, 976 | 30, 11, 24 | 1060, 246, 566 | nr | 15.0 | -
itsa | 70.39 | [0.95 1.05] | 13, 25, 30 | 788, 742, 1085 | 7, 15, 30 | 603, 269, 834 | nr | 14.4 | -
egwa | 76.096 | [0.95 1.05] | 24, 14, 30 | 1094.96, 767.74, 964.22 | 25, 14, 30 | 388.75, 334.77, 1189.91 | 0.9924 | 12.7 | -
mrfa | 78.5 | [0.95 1.05] | 13, 24, 30 | 803, 1073, 1040 | 14, 24, 30 | 300, 600, 900 | 0.992 | 12.572 | -
jfsa | 77.6 | [0.95 1.05] | 14, 24, 30 | 748, 1079, 1056 | 14, 24, 30 | 300, 600, 900 | 0.992 | 12.40 | -
proposed iaft | 76.72 | [0.9 1] | 14, 24, 30 | 743.9, 1066.7, 1039.7 | 14, 24, 30 | 348.2, 510.8, 1014.6 | bus 8 | 11.60 | 0.1284
proposed iaft | 74.47 | [0.9 1] | 14, 24, 30 | 771.7, 999.6, 995.4 | 14, 24, 30 | 314.7, 610.5, 1076.5 | bus 8 | 12.01653 | 0.1287
mc: 1
5.1.4. case 4: improvement tpl and margin capacity based dgs and dstatcom devices
this test case is dedicated to improving the technical performance of the rdn under a critical situation at the loading margin stability. the tpl is optimized in coordination with the loading capacity (lc). for fair comparison with the third test case, three dstatcom and three dg units are integrated at three optimal locations (14-24-30). the optimized tpl and the mc achieved are 71.311 kw and 2.2 p.u, respectively, and the corresponding voltage deviation becomes 0.2889 p.u. the minimum voltage magnitude obtained is 0.9812 p.u reported at bus 8.
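the penetration levels listed in table 3 appear to equal the total dg active power divided by the total active load of the rdn 33-bus (3.715 mw, section 5.1). the short check below works under that assumption; the definition is inferred from the tabulated numbers rather than stated in the text, and penetration_level is a hypothetical helper name.

```python
def penetration_level(dg_sizes_kw, total_load_kw):
    """Return the DG penetration level in percent (assumed definition:
    total DG active power divided by total active load)."""
    return 100.0 * sum(dg_sizes_kw) / total_load_kw


LOAD_33_BUS_KW = 3715.0  # active load of the RDN 33-bus, from section 5.1

# DG sizes (kW) of the two proposed-IAFT rows of table 3
for sizes in ([743.9, 1066.7, 1039.7], [771.7, 999.6, 995.4]):
    print(round(penetration_level(sizes, LOAD_33_BUS_KW), 2))
# prints 76.72 and 74.47, matching the penetration column of table 3
```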
table 4 shows the values of the optimized decision variables, such as the active powers of the dg units and the reactive powers delivered by the statcom devices. the convergence behavior of tpl under loading margin stability is shown in fig. 10.
table 4 optimized decision variables obtained by optimizing the tpl and mc of the rdn 33-bus
methods | dgs penetration level % | limits of v (p.u) | location of dgs | dgs size (kw) | location of scs | scs size (kvar) | min voltage (p.u) | tpl (kw) | tdv (p.u)
proposed iaft | - | [0.9 1] | 14, 24, 30 | 1786.10, 1999.90, 1999.90 | 14, 24, 30 | 607.30, 1666.70, 2661.20 | 0.9812 (bus 8) | 71.3110 | 0.2889
mc: 2.2
fig. 10 convergence behaviour of tpl improvement under mc maximization considering 3 dgs and 3 dstatcom devices
5.2. test 2: rdn 69-bus
the proposed iaft is also validated on a medium rdn, the 69-bus. all data of this second test system are given in [2, 31]. this test system consists of 69 buses and 68 branches at 12.66 kv, and the total apparent power to supply to the loads is (3.8+j2.69) mva. the operating state of this test system at normal condition, without compensators and without dgs, is: a total power loss of 224.95 kw and a lowest voltage magnitude of 0.9092 p.u reported at bus 65. the one line representation of the rdn 69-bus is shown in fig 11.
fig. 11 topology of the rdn 69-bus
5.2.1. case 5
for fair comparison with other recent techniques, this test case focuses on minimizing the tpl at normal condition. three dgs and three statcom devices are optimally integrated at efficient locations (bus 11, bus 18 and bus 61). the sizes of the dgs and statcom devices are 2 mw and 1.5 mvar, respectively. the obtained optimized variables, such as the active powers of the dgs and the reactive powers of the three statcom devices, are recapitulated in table 5.
the best tpl achieved using iaft is 4.2693 kw, which is better than several recent techniques such as tsa, sma, cso, itsa, and jfsa. the convergence behaviour of the iaft for tpl minimization is shown in fig 12, and the voltage profile is shown in fig 13. the proposed variant iaft achieves the best solution quality of the compared methods, at a reduced time. for this test system, the population size is 10 and the maximum number of iterations is 40.
table 5 comparison of optimized decision variables obtained using iaft and other techniques: case 5: rdn 69-bus
methods [5] | dgs penetration level % | limits of v (p.u) | location of dgs | dgs size (kw) | location of scs | scs size (kvar) | min voltage (p.u) | tpl (kw)
tsa | 65.97 | [0.95 1.05] | 9, 16, 61 | 452, 555, 1500 | 21, 53, 61 | 299, 605, 1148 | - | 6.9
sma | 58.78 | [0.95 1.05] | 16, 30, 61 | 497, 112, 1625 | 2, 13, 61 | 708, 623, 1091 | - | 9.0053
cso | 67.42 | [0.95 1.05] | 17, 71, 67 | 535, 1728, 299 | 61, 67, 68 | 1367, 311, 323 | - | 7.5488
itsa | 60.05 | [0.95 1.05] | 10, 12, 61 | 291, 491, 1500 | 9, 23, 61 | 288, 292, 1149 | 0.9944 | 6.8012
jfsa | 67.05 | [0.95 1.05] | 11, 18, 61 | 495, 379, 1674 | 18, 51, 61 | 300, 300, 1200 | 0.994 | 4.6826
proposed iaft | 67.15 | [0.95 1.05] | 11, 18, 61 | 498.8, 379.3, 1673.9 | 11, 18, 61 | 365.7, 249.8, 1196.3 | 0.9943 (bus 50) | 4.2693
fig. 12 convergence behaviour of tpl improvement considering 3 dgs and 3 dstatcom devices: rdn 69-bus
fig. 13 voltage profile after integration of three dgs and three dstatcom: rdn 69-bus
5.3. statistical analysis
the performance of the proposed variant is assessed through a statistical analysis. the mean, the max and the standard deviation (sd) are three well-known statistical indexes widely used to identify the advantages and drawbacks of metaheuristic optimizers. for this analysis, runs of five trials and of ten trials are carried out.
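these indexes can be computed from the best tpl returned by each independent trial. the following is a minimal sketch using python's statistics module; the trial values are made-up placeholders rather than results from the paper, summarize_trials is a hypothetical helper name, and since the text does not say whether the sample or the population standard deviation is used, the population form is assumed.

```python
import statistics


def summarize_trials(tpl_per_trial):
    """Return min, mean, max and standard deviation of the per-trial best TPL."""
    return {
        "min": min(tpl_per_trial),
        "mean": statistics.fmean(tpl_per_trial),
        "max": max(tpl_per_trial),
        "sd": statistics.pstdev(tpl_per_trial),  # population SD (assumption)
    }


# Hypothetical per-trial results in kW, for illustration only
trials = [11.64, 11.66, 11.64, 11.70, 11.66]
print(summarize_trials(trials))
```

a small sd relative to the mean indicates that independent trials end near the same solution, which is the sense in which the index is used as a robustness measure.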
for all test cases, the maximum number of iterations is fixed to 40 and the population size is set to 10. the sd achieved for 10 trials is 4.3546e-6, which is remarkably better than the sd associated with a recent metaheuristic, namely jfsa (0.7146). fig. 14 shows the convergence characteristics of tpl for 10 trials, while the optimized tpl values for 10 trials and 5 trials are shown in figs. 15-16, respectively. it is evident that the global solution is reached within a reduced number of trials. table 6 gives the statistical values achieved by using the proposed variant iaft.
fig. 14 convergence behavior of tpl minimization for 10 trials; pop_size=10
fig. 15 values of optimized tpl for 10 trials: pop_size=10
table 6 robustness evaluation of optimized results: case 1: rdn 33-bus, scenario 1: 3 dstatcom and 3 dgs
methods | pop_size | max_it | limits of v (p.u) | min_tpl (kw) | mean_tpl (kw) | max_tpl (kw) | std | trials
base case | 316.2
jfsa | - | - | [0.95 1.05] | 12.4002 | 13.1092 | 15.1889 | 0.7146 | -
proposed iaft | 5 | 40 | [0.9 1] | 11.6366 | 12.0971 | 13.9252 | 0.001 | 5
proposed iaft | 10 | 40 | [0.9 1] | 11.63636 | 11.6386 | 11.64734 | 4.3546e-06 | 10
mc: 1
fig. 16 convergence behavior of tpl for 5 trials: pop_size=10
6. conclusion
in this study, a new variant, namely iaft, is successfully adapted and applied to solve with accuracy the optimal location and setting of multiple dg units and multiple shunt compensators based on dstatcom devices. to improve the technical performance of the rdn, two objective functions are optimized: the tpl and the loading margin capacity of the rdn. the tpl decreased to a competitive value of 11.6473 kw when considering both three dgs and three dstatcom devices. the loading capacity of the rdn 33-bus is optimized to 2.2 without affecting the operation constraints.
in addition, the particularity of the proposed strategy is demonstrated in optimizing the active and reactive powers of dgs and dstatcom devices considering the uncertainties in loads over 24 hours. the proposed iaft-based power management strategy gives better results in most cases, in terms of solution quality and convergence behavior, than many recent optimization algorithms. a statistical analysis demonstrated that for the rdn 33-bus only 5 trials are sufficient to locate the near global solution; as a result, the average execution time required is reduced to a competitive value. the proposed metaheuristic variant iaft may be considered a competitive optimizer tool for solving various power management problems of large rdns. in future work, the proposed iaft-based optimizer tool will be adapted to solve stochastic multi objective power management considering various types of facts devices and renewable sources.
references
[1] b. mahdad, "a novel tree seed algorithm for optimal reactive power planning and reconfiguration based statcom devices and pv sources", sn applied sciences, vol. 3, id. 336, 2021.
[2] b. mahdad, "novel adaptive sine cosine arithmetic optimization algorithm for optimal automation control of dg units and statcom devices", smart science, 2022.
[3] a. selim, s. kamel, f. jurado, "capacitors allocation in distribution systems using a hybrid formulation based on analytical and two metaheuristic optimization techniques", computers and electrical engineering, vol. 85, 106675, 2020.
[4] a. a. a. el-ela, r. a. el-sehiemy, and a. s. abbas, "optimal placement and sizing of distributed generation and capacitor banks in distribution systems using water cycle algorithm", ieee systems journal, pp. 1-8, 2018.
[5] a. m. shaheen, a. m. elsayed, a. r. ginidi, e. e.
elattar, "effective automation of distribution systems with joint integration of dgs/svcs considering reconfiguration capability by jellyfish search algorithm", ieee access, vol. 9, pp. 92053-92069, 2021.
[6] a. a. z. diab, h. rezk, "optimal sizing and placement of capacitors in radial distribution systems based on grey wolf, dragonfly and moth–flame optimization algorithms", iran j sci technol trans electr eng, vol. 43, pp. 77-96, 2019.
[7] m. dehghani, z. montazeri, o. p. malik, "optimal sizing and placement of capacitor banks and distributed generation in distribution systems using spring search algorithm", international journal of emerging electric power systems, id. 20190217, 2020.
[8] m. h. moradi, m. abedini, "a combination of genetic algorithm and particle swarm optimization for optimal distributed generation location and sizing in distribution systems with fuzzy optimal theory", international journal of green energy, vol. 9, pp. 641-660, 2012.
[9] f. e. riaño, j. f. cruz, o. d. montoya, h. r. chamorro and l. alvarado-barrios, "reduction of losses and operating costs in distribution networks using a genetic algorithm and mathematical optimization", electronics, vol. 10, no. 4, id. 419, p. 25, 2021.
[10] e. a. al-ammar, k. farzana, a. waqar, m. aamir, saifullah, a. u. haq, m. zahid, m. batool, "abc algorithm based optimal sizing and placement of dgs in distribution networks considering multiple objectives", ain shams engineering journal, vol. 12, pp. 697-708, 2021.
[11] s. rezaeian-marjani, s. galvani, v. talavat, m. farhadi-kangarlu, "optimal allocation of d-statcom in distribution networks including correlated renewable energy sources", electrical power and energy systems, vol. 122, id. 106178, 2020.
[12] a. a. ogunsina, m. o. petinrin, o. o. petinrin, e. n. offornedo, j. o. petinrin, g. o.
asaolu, "optimal distributed generation location and sizing for loss minimization and voltage profile optimization using ant colony algorithm", sn applied sciences, vol. 3, id. 248, 2021.
[13] k. balu, v. mukherjee, "a novel quasi-oppositional chaotic harris hawk's optimization algorithm for optimal siting and sizing of distributed generation in radial distribution system", neural processing letters, vol. 54, pp. 4051-4121, 2022.
[14] m. h. ali, s. kamel, m. h. hassan, m. tostado-véliz, h. m. zawbaa, "an improved wild horse optimization algorithm for reliability based optimal dg planning of radial distribution networks", energy reports, vol. 8, pp. 582-604, 2022.
[15] a. p. hota, s. mishra, "active power loss allocation in radial distribution networks with different load models and dgs", electric power systems research, vol. 205, id. 107764, 2022.
[16] t. t. nguyen, t. t. nguyen, m. q. duong, "an improved equilibrium optimizer for optimal placement of photovoltaic systems in radial distribution power networks", neural computing and applications, vol. 34, pp. 6119-6148, 2022.
[17] s. m. abd-elazim, e. s. ali, "imperialist competitive algorithm for optimal statcom design in a multimachine power system", electrical power and energy systems, vol. 76, pp. 136-146, 2016.
[18] s. m. abd-elazim, e. s. ali, "optimal location of statcom in multimachine power system for increasing loadability by cuckoo search algorithm", electrical power and energy systems, vol. 80, pp. 240-251, 2016.
[19] e. s. ali, s. m. abd elazim, a. y. abdelaziz, "optimal allocation and sizing of renewable distributed generation using ant lion optimization algorithm", electr eng, vol. 16, no. 1, pp. 445-458, 2016.
[20] t. t. nguyen, k. h. le, t. m. phan, and m. q. duong, "an effective reactive power compensation method and a modern metaheuristic algorithm for loss reduction in distribution power networks", hindawi, complexity, vol. 2021, id. 8346738, p. 21, 2021.
[21] s. kumar, k. k. mandal and n.
chakraborty, "a novel opposition-based tuned-chaotic differential evolution technique for technoeconomic analysis by optimal placement of distributed generation", engineering optimization, vol. 52, no. 2, pp. 303-324, 2019.
[22] a. eid, s. kamel & e. h. houssein, "an enhanced equilibrium optimizer for strategic planning of pv-bes units in radial distribution systems considering time-varying demand", neural computing and applications, vol. 34, pp. 17145–17173, 2022.
[23] h.-j. wang, j.-s. pan, t.-t. nguyen, s. weng, "distribution network reconfiguration with distributed generation based on parallel slime mould algorithm", energy, vol. 244, part b, id. 123011, 2022.
[24] g. v. n. lakshmi, a. jayalaxmi & v. veeramsetty, "optimal placement of distribution generation in radial distribution system using hybrid genetic dragonfly algorithm", technology and economics of smart grids and sustainable energy, vol. 6, id. 9, 2021.
[25] m. sodani, h. h. aly, t. a. little, "optimal planning of distributed generation using improved grey wolf optimizer and combined power loss sensitivity", in proceedings of the 2021 ieee canadian conference on electrical and computer engineering (ccece), 2021, id. 21380949.
[26] t. t. nguyen, t. d. pham, l. c. kien, and l. v. dai, "improved coyote optimization algorithm for optimally installing solar photovoltaic distribution generation units in radial distribution power systems", complexity, vol. 2020, p. 34, id. 1603802, 2020.
[27] a. selim, s. kamel, a. s. alghamdi, and f. jurado, "optimal placement of dgs in distribution system using an improved harris hawks optimizer based on single- and multi-objective approaches", ieee access, vol. 8, pp. 52815-52829, 2020.
[28] k. h. truong, p. nallagownden, i. elamvazuthi, d. n. vo, "an improved meta-heuristic method to maximize the penetration of distributed generation in radial distribution networks", neural computing and applications, vol. 32, no. 1, 2019.
[29] c. venkatesan, r. kannadasan, m. h.
alsharif, m.-k. kim, and j. nebhen, "a novel multiobjective hybrid technique for siting and sizing of distributed generation and capacitor banks in radial distribution systems", sustainability, vol. 13, no. 6, id. 3308, 2021.
[30] m. braik, m. h. ryalat, h. al-zoubi, "a novel meta-heuristic algorithm for solving numerical optimization problems: ali baba and the forty thieves", neural computing and applications, vol. 34, pp. 409-455, 2022.
[31] r. d. zimmerman, c. e. murillo-sanchez, r. j. thomas, "matpower: steady-state operations, planning and analysis tools for power systems research and education", ieee trans power syst, vol. 26, pp. 12-19, 2011.
facta universitatis series: electronics and energetics vol. 27, n o 1, march 2014, pp.
41-56, doi: 10.2298/fuee1401041w
topology, analysis, and cmos implementation of switched-capacitor dc-dc converters
oi-ying wong 1, hei wong 1, wing-shan tam 2, chi-wah kok 2
1 department of electronic engineering, city university of hong kong, tat chee avenue, kowloon, hong kong
2 canaan semiconductor ltd., fotan, nt, hong kong
abstract. this review highlights various design and realization aspects of three commonly used charge pump topologies, namely the linear, exponential, and fibonacci types of charge pumps. we outline the methods developed recently for analyzing the steady-state and dynamic performances of these circuits. some practical issues for the cmos implementation of these charge pump structures are critically discussed. finally, some conventional voltage regulation methods for maintaining a stable output under a large range of loading currents and supply voltage fluctuations are presented.
key words: switched-capacitor dc-dc converters, charge pump, steady-state analysis, dynamic analysis, voltage regulation
1. introduction
a switched-capacitor dc-dc converter (sc dc-dc converter, in short) is a voltage converter that realizes dc-to-dc voltage conversion using capacitors as the only energy storage elements. unlike conventional inductor-based dc-dc converters, sc dc-dc converters use no inductor, which gives them lower emi emission, a more compact size, and easier system integration. compared with low-dropout regulators (ldos), which can provide step-down conversion only, sc dc-dc converters have the advantage of being able to generate a voltage higher than the supply. however, the conversion efficiency of an sc dc-dc converter is usually poorer than that of inductor-based converters, and its silicon area is much larger than that of an ldo.
nevertheless, sc dc-dc converters have been widely used for voltage generation in flash memory systems [1]-[3] and lcd driver circuits, where dc voltages higher than the supply voltage are required [4], [5]. sc dc-dc converters are also used in energy harvesting systems and self-powered systems such as biomedical implant devices, rfid, and wireless sensor networks [6]-[11], where the available source voltages are too low to operate any electronic devices. the step-up capability and the ease of cmos implementation of sc converters can also help to minimize the power consumption of some electronic systems [12], [13].
received december 27, 2013
corresponding author: hei wong, department of electronic engineering, city university of hong kong, tat chee avenue, kowloon, hong kong (e-mail: eehwong@cityu.edu.hk)
an sc dc-dc converter consists of an output capacitor (a capacitor connected between the output node and ground) and some coupling capacitors, which are connected to different nodes in the circuit during the two system clock phases through power switches. (some other structures use more than two clock signals. these structures are more complex and their analysis is much more complicated. in this work, we focus on the two-phase configuration only.) once the clock signals are applied, the coupling capacitors in the converter are charged and discharged alternately during the charging and discharging phases. during these processes, energy is temporarily stored in the coupling capacitors and then transferred to other capacitors via the charge sharing nodes. in this way, energy is transferred from the input side to the output side via the coupling capacitors, and the desired power conversion is achieved. the output voltage of an sc dc-dc converter is governed by its switch-capacitor network, i.e. the basic topology.
a higher conversion ratio, defined as the ideal output voltage divided by the supply voltage, can be achieved by cascading several units of the basic topology. in this paper, we first review some commonly used topologies in section 2. because of charge sharing effects and the switching loss due to the finite "on"-resistance of the cmos switches, the performance of an sc converter is always poorer than the ideal one. in this connection, we look at some circuit analysis methods that take the non-ideal conditions into consideration, and we compare the performances of various sc converters using these methods. these ideas, together with some typical results, are presented in section 3. the process requirements for cmos realization also differ among the sc dc-dc converter topologies; the practical issues of cmos implementation are discussed in section 4. finally, the implementation of the voltage regulating building block, i.e., the output stage of the converter, is discussed briefly in section 5.
2. sc dc-dc converter topologies
an sc dc-dc converter can be constructed with several different topologies, and different conversion ratios can be achieved by cascading different numbers of stages n. the linear, fibonacci and exponential topologies, shown respectively in figs. 1, 2 and 3, are the most commonly used topologies for stepping up a supply voltage [14]-[16]. for the linear topology shown in fig. 1 [14], which is also known as the dickson charge pump, the voltage across the coupling capacitor in each stage is stepped up by a value equal to the supply voltage, vdd, during the clock phase φ1 or φ2. therefore, by cascading n repeating units, an output voltage equal to (n+1)vdd can be achieved in the ideal case, i.e. the conversion ratio m is equal to (n+1). for the fibonacci topology in fig.
2 [15], the coupling capacitor in the k-th stage is charged to f(k+1)vdd in φ1 and φ2 for odd and even k values, respectively. here f(x) is the x-th member of the fibonacci series 1, 1, 2, 3, 5, 8, 13, 21, ···. the conversion ratio of an n-stage fibonacci converter is given by f(n+2). for the exponential topology given in fig. 3 [16], the stepped-up voltage at the output of each stage becomes the input voltage of the next stage. hence the conversion ratio of an n-stage exponential converter is given by 2^n. comparing these three converter topologies, the fibonacci and the exponential topologies can achieve a higher conversion ratio with a smaller number of stages, and thus fewer components, whereas the linear topology has the advantage of a smaller voltage stress across the switches. note that the aforementioned topologies are not limited to step-up operation. by taking the input and output nodes of the step-up converter topologies as the output and input nodes, respectively, step-down operation is also possible. in step-down operation, the corresponding conversion ratios for the linear, fibonacci and exponential topologies are, respectively, 1/(n+1), 1/f(n+2) and 1/2^n.

fig. 1 topology of a dickson charge pump or the linear step-up sc dc-dc converter

fig. 2 topology of a fibonacci step-up sc dc-dc converter

fig. 3 an exponential step-up sc dc-dc converter topology

3. performance analysis

for a practical sc dc-dc converter, the performance varies with the design parameters and differs from the ideal one. the major design parameters include the number of stages (n), the supply voltage (vdd), the operation frequency (f), the unit value of the coupling capacitance (c), the "on"-resistance (ron), the clock duty cycle (d), the output capacitance (co), and the top- and bottom-plate parasitic factors (α and β), which are defined as the ratios of the parasitic capacitances at the top and bottom plates of a capacitor to its capacitance value. the mathematical relationships between these parameters and the performances of the converters have been analyzed in many previous publications. this section highlights these results.

3.1. output voltage

fig. 4 equivalent circuit of a sc dc-dc converter [19]

the output voltage of a sc dc-dc converter (vo) is always smaller than the ideal one suggested by the conversion ratio m, and it also depends on the loading current io. this behavior can be modelled by taking an equivalent output resistance into consideration (see fig. 4). the value of the resistance depends on the charging status or operation mode of the converter [17]-[20]. the charging status can be modelled with ψ = 1/(2fcron) [17], [18], [20]. when ψ > 1, the converter operates in the complete charge transfer mode. the charge transfer among the capacitors in the converter is complete, so that the current in each path of the converter drops close to zero at the end of each clock phase. the equivalent output resistance of the converter mainly depends on the 1/(fc) factor. when ψ < 1, the converter operates in the non-complete charge transfer mode. the charge transfer among the capacitors in the converter is far from complete, so that the current in each path of the converter is almost constant during each clock phase. the equivalent output resistance of the converter mainly depends on ron. when ψ ≈ 1, the converter operates between the non-complete and the complete charge transfer modes, i.e. in the partial charge transfer mode, in which the equivalent output resistance of the converter depends on both 1/(fc) and ron.
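as a minimal sketch of the classification above, the charging status ψ = 1/(2fcron) can be evaluated numerically. the function names and the tolerance band `tol` around ψ = 1 are illustrative assumptions and are not part of the cited models [17], [18], [20]; the example values (c = 100 nf, ron = 50 Ω) are those of the 4× converter discussed in section 4.

```python
def charging_status(f, c, r_on):
    """Return psi = 1 / (2 * f * C * R_on).

    f: clock frequency (Hz), c: coupling capacitance (F),
    r_on: switch on-resistance (ohm).
    """
    return 1.0 / (2.0 * f * c * r_on)


def operation_mode(psi, tol=0.25):
    """Classify the charge transfer mode from psi.

    tol is a hypothetical half-width of the band treated as psi ~ 1;
    the text only distinguishes psi > 1, psi < 1 and psi ~ 1.
    """
    if psi > 1.0 + tol:
        return "complete"
    if psi < 1.0 - tol:
        return "non-complete"
    return "partial"


if __name__ == "__main__":
    c, r_on = 100e-9, 50.0  # values quoted for the 4x converter in section 4
    for f in (25e3, 50e3, 100e3, 200e3):
        psi = charging_status(f, c, r_on)
        print(f"f = {f / 1e3:.0f} kHz: psi = {psi:.2f}, mode = {operation_mode(psi)}")
```

with these values the four frequencies give ψ = 4, 2, 1 and 0.5, which are the values quoted in section 4.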
theoretically, the converter operates at the slow switching limit (ssl) when ψ approaches infinity, and at the fast switching limit (fsl) when ψ approaches zero. models for the equivalent output resistances of the dickson charge pump at the ssl [21]-[25] and fsl [26], [27] are often reported. it was also suggested that the variation of the equivalent output resistance of the dickson charge pump across different operation modes can be modeled by the coth(x) function [28], [29]. the equivalent output resistances of the dickson and the fibonacci converters at the ssl, with the consideration of the parasitic capacitance factors, are compared in ref. [30]. a generalized method for finding the equivalent output resistances of different converter topologies at the ssl and fsl has been proposed [19]. for the ssl case, the equivalent output resistance of a given converter can be approximated by [19]:

$$ r_{ssl} = \sum_{k} \frac{(a_k)^2}{f\,c_k}, \qquad (1) $$

where ak is the charge multiplier of the k-th capacitor ck. this parameter models the amount of charge flowing into the component, normalized by the amount of charge flowing to the output. for the fsl case, the equivalent output resistance of a given converter topology can be approximated by [19]:

$$ r_{fsl} = 2 \sum_{k} r_{on,k}\,(a_k)^2, \qquad (2) $$

where ron,k is the "on"-resistance of the k-th switch and ak is the corresponding charge multiplier.

3.2. power efficiency

power efficiency is an important figure of merit of the dickson charge pump and is often analyzed or modelled by taking the parasitic losses into consideration. without considering the parasitic capacitances, finite switch "on"-resistances, and the switching losses for the gate-capacitances of the power switches and the clock drivers, the power efficiency of a converter can be simply determined by vo/(mvdd) [31].
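eqs. (1) and (2), the equivalent-circuit model of fig. 4 and the ideal efficiency vo/(mvdd) can be combined in a short numerical sketch. the relation vo = m·vdd − io·req is assumed here from the fig. 4 model (an ideal source m·vdd in series with req); the function names, the charge multipliers and the component values in the usage lines are hypothetical placeholders and are not taken from any particular topology in this paper.

```python
def r_ssl(cap_multipliers, capacitances, f):
    """Eq. (1): sum over capacitors of a_k**2 / (f * C_k)."""
    return sum(a * a / (f * c) for a, c in zip(cap_multipliers, capacitances))


def r_fsl(switch_multipliers, on_resistances):
    """Eq. (2): 2 * sum over switches of R_on,k * a_k**2."""
    return 2.0 * sum(r * a * a for a, r in zip(switch_multipliers, on_resistances))


def output_voltage(m, vdd, io, r_eq):
    """Assumed Fig. 4 model: ideal source m * vdd in series with r_eq."""
    return m * vdd - io * r_eq


def ideal_efficiency(vo, m, vdd):
    """Efficiency vo / (m * vdd), neglecting all parasitic and switching losses."""
    return vo / (m * vdd)


if __name__ == "__main__":
    # Hypothetical converter: one coupling capacitor and four switches,
    # all with charge multiplier 1 (illustrative values only).
    f, c, r_on = 25e3, 100e-9, 50.0
    req_ssl = r_ssl([1.0], [c], f)
    req_fsl = r_fsl([1.0] * 4, [r_on] * 4)
    vo = output_voltage(m=2, vdd=1.5, io=1e-3, r_eq=req_ssl)
    print(req_ssl, req_fsl, vo, ideal_efficiency(vo, m=2, vdd=1.5))
```

which of the two limits applies depends on ψ, as described in section 3.1; the sketch simply evaluates each limit separately.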
considering the switching losses of the parasitic capacitances, more accurate solutions were developed [21], [22], [32]. the power efficiency of some practical converters in terms of transistor parameters can also be found [33]; this model takes the parasitic resistances, gate capacitances and the top- and bottom-plate parasitic factors into consideration. in general, the power efficiency is given by:

$$ \eta_{eff} = \frac{p_{out}}{p_{out} + p_{r,eq} + p_{dyn}} \times 100\%, \qquad (3) $$

where pout is the output power, pr,eq is the power loss in the equivalent output resistance and pdyn is the total switching loss due to the gate capacitances, the top- and bottom-plate parasitic capacitances, and the clock driver. thus, pr,eq increases with the equivalent output resistance, and pdyn increases with the parasitic factors α and β, the clock frequency, the supply voltage, and the size of the power switches. an increase in either pdyn or pr,eq gives rise to a lower power efficiency. notice that the power efficiency of a converter varies with the loading current, and the maximum achievable power efficiency of a given design can only be determined for a given value of loading current and output voltage. by assuming that the conversion ratio and the equivalent output resistance are independent of the parasitic factors, the condition for maximum power efficiency can be determined accordingly [34]. it was found that as the parasitic factors increase, the maximum power efficiency occurs at a larger loading current.

3.3. output voltage ripple

the behavior of the output voltage ripple in a sc dc-dc converter can be understood as follows. consider a converter (see figs. 1 to 3, for example) with an output capacitor co and a loading current io. as the output capacitor is not charged by the converter during φ1, the loading current is supplied by the charge stored in the output capacitor only. hence the output voltage has the smallest ripple value of io/(2fco) when d = 0.5.
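the two ripple bounds of this subsection, the minimum io/(2fco) at d = 0.5 and the maximum io/(fco), are easy to evaluate when sizing co. the helper below is an illustrative sketch; its name and the example values (io = 1 ma, f = 100 khz, co = 1 μf) are assumptions and not values from the paper.

```python
def ripple_bounds(io, f, co):
    """Return (minimum, maximum) output voltage ripple in volts.

    minimum: io / (2 * f * co), for a duty cycle d = 0.5
    maximum: io / (f * co), when the charge io / f reaches the output
             capacitor instantaneously once per clock cycle
    """
    v_min = io / (2.0 * f * co)
    v_max = io / (f * co)
    return v_min, v_max


if __name__ == "__main__":
    # Hypothetical operating point, for illustration only.
    v_min, v_max = ripple_bounds(io=1e-3, f=100e3, co=1e-6)
    print(f"ripple between {v_min * 1e3:.1f} mV and {v_max * 1e3:.1f} mV")
```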
the largest value of the ripple is io/(fco), which corresponds to a total transferred charge of io/f. this occurs when the charge consumed by the load in a single clock cycle is transferred instantaneously to the output capacitor at the beginning of φ2 in each clock cycle. hence, when a converter is operated in the complete charge transfer mode, its output voltage ripple is more likely to be larger than the minimum value. on the other hand, the output voltage ripple is close to the minimum value when the converter is operated in the non-complete charge transfer mode. a detailed discussion on the output voltage ripple under different operation modes can be found in ref. [20].

3.4. start-up time

if the capacitors in figs. 1, 2 and 3 are not charged at the beginning, it takes several clock cycles for the coupling capacitors to transfer charge to the output capacitor co. that is, a converter takes a certain time to reach its steady output voltage. the duration of this transient period is known as the start-up time. during this period, the charges flowing into and out of the capacitors are not the same for some clock cycles. finding the start-up time requires some dynamic analyses of the converter. with the aid of dynamic analyses, closed-form solutions for the start-up times of some linear converters with different numbers of stages and parasitic factors were obtained [34]-[40]. a generalized method which can evaluate the start-up behavior of any form of converter topology was proposed [41]. it involves the formulation of a given converter topology into some matrices, from which the output voltage of a converter can be evaluated over time using the matrix equations. in the analyses, we assume: (a) the converter is operated at the ssl with capacitive loads only; (b) all the capacitors in the converter are initially uncharged; (c) the parasitic capacitance effects can be neglected. based on these assumptions, the number of clock cycles, m, for achieving 95% of the final output value can be determined. figure 5 plots the number of cycles as a function of the coupling-to-output capacitance ratio (defined by co/c) for conversion ratios of 8, 13, 16 and 21 [41]. it can be found that for a converter to have a short start-up time, the output capacitance should not be larger than twice the coupling capacitance.

fig. 5 plots of the required number of clock cycles for achieving 95% of final values for linear, fibonacci and the exponential converters as a function of coupling-to-output capacitance ratio for the case of the conversion ratio equal to: (a) 8; (b) 13; (c) 16; and (d) 21 [41]

3.5. performance comparison of different topologies

this section concludes with the performance comparison given in table 1, which lists the performances of the linear, fibonacci and exponential converters with the same conversion ratio of 8×. note that the equivalent output resistance of the linear converter at the fsl is smaller than those of the fibonacci and the exponential ones. in addition, the voltage stress across the transistors in the linear converter is smaller despite the larger number of cascaded stages. however, the linear converter requires a larger number of components to implement and has a longer start-up time. further detailed comparisons of the performances of these kinds of converters can be found in refs. [42], [43].
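the stage counts n listed in table 1 follow from the ideal step-up conversion ratios of section 2: n+1 for the linear, f(n+2) for the fibonacci and 2^n for the exponential topology. the sketch below, with illustrative function names that are not from the cited works, finds the smallest n that reaches a target conversion ratio.

```python
def fibonacci(x):
    """x-th member of the series 1, 1, 2, 3, 5, 8, ... (1-indexed)."""
    a, b = 1, 1
    for _ in range(x - 1):
        a, b = b, a + b
    return a


def conversion_ratio(topology, n):
    """Ideal step-up conversion ratio of an n-stage converter."""
    if topology == "linear":
        return n + 1
    if topology == "fibonacci":
        return fibonacci(n + 2)
    if topology == "exponential":
        return 2 ** n
    raise ValueError(f"unknown topology: {topology}")


def stages_needed(topology, m):
    """Smallest number of stages n whose ideal ratio is at least m."""
    n = 1
    while conversion_ratio(topology, n) < m:
        n += 1
    return n


if __name__ == "__main__":
    for name in ("linear", "fibonacci", "exponential"):
        print(name, stages_needed(name, 8))  # 7, 4 and 3 stages, as in table 1
```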
table 1 comparison of the linear, fibonacci and exponential topologies with m = 8

topology | n | no. of switches | no. of capacitors | max. blocking voltage (v) | req,out at ssl (1/(cf)) | req,out at fsl (ron) | start-up time (no. of clock cycles, m, for c/co = 1)
linear | 7 | 22 | 7 | 2vdd | 7 | 44 | 75
fibonacci | 4 | 13 | 4 | 5vdd | 7 | 52 | 36
exponential | 3 | 12 | 5 | 4vdd | 10 | 56 | 30

4. cmos implementation of sc dc-dc converters

the sc dc-dc converter topologies given in section 2 can be implemented in standard cmos technology by realizing the switches with n-type or p-type cmos switches (also called cts's). to minimize the reverse current and the output voltage drop, these switches should be biased in the cut-off region when they are turned off and in the triode region when they are turned on. the reverse current can be further reduced by applying non-overlapping clock signals [44]. the body effect, i.e. the non-zero source-to-bulk bias voltage, in these switches can lead to larger threshold voltages and make the output voltage saturate at a lower value [45]. hence, for a step-up sc dc-dc converter, we may encounter some unexpected voltage drop and power losses. in addition, as the node voltages in a step-up sc dc-dc converter are higher than vdd, several factors need to be considered. the issues that need to be taken care of are listed here:

• in the step-up voltage converters, the node voltages (and therefore the drain and the source voltages of the transistors) are higher than vdd. thus, a p-type cts can be turned on easily by applying 0v (or any voltage lower than its source/drain voltage by a threshold) at the gate. however, a gate voltage higher than vdd is usually required to shut down the cts completely, which may not be available in the circuit. on the other hand, an n-type cts can be readily shut down by applying 0v (or any voltage lower than its source/drain voltage plus a threshold) at the gate.
however, to turn on the cts completely, the gate voltage should be higher than vdd. the bodies (or the n-wells if the converter is implemented in an ordinary n-well process) of the p-type cts's are usually required to be biased at a voltage higher than vdd such that the p-n junctions in the transistors are always reverse biased. otherwise, substrate leakage current exists and the converter would have poor efficiency. on the other hand, if the bodies of the n-type cts's are biased at 0v (which is the usual case for circuits implemented in an ordinary n-well process), they will suffer from the body effect and the threshold voltages of the cts's will become larger. the node voltages in the linear topology increase linearly from the input side to the output, while those in the other topologies can rise exponentially; thus, it is easier to design an efficient converter using the linear topology.

• in the step-down voltage converters, the node voltages (and therefore the drain and the source voltages of the transistors) are between 0v and vdd. thus, the body and the gate terminals of both n- and p-type cts's can be biased properly without any difficulties. however, the use of n-type cts's can usually save more silicon area due to their higher transconductance. to pass a voltage in the range of 0v to vdd, transmission gates can be used.

fig. 6 illustration of the three gate-biasing techniques: (a) the dynamic biasing; (b) gate-boosting; and (c) the cross-coupled techniques [46]-[49]

fig. 7 illustration of the three body-biasing approaches: the (a) floating-well; (b) adaptive body-biasing; and the (c) body-source junction diode approaches [50]-[52]

to achieve higher power efficiency, several different gate- and body-biasing techniques have been proposed to control the cts's in the dickson charge pump.
figure 6 shows the dynamic biasing [46], gate boosting [47], [48] and the cross-coupled techniques [49] for gate biasing. figure 7 shows some techniques, including the floating-well [50], adaptive body-biasing [51], and the body-source junction diode approach [52], to alleviate the body effects of the cts. the advantages and disadvantages of these techniques have been discussed in detail in ref. [53]. in short, small output voltage drops can be found in the converters using the gate-boosting and the cross-coupled techniques, but the gate-boosting technique consumes larger dynamic power and the cross-coupled technique requires a costly triple-well process. advanced converter circuits are usually constructed by making use of more than one of these techniques [54], [55].

fig. 8 theoretical (both the ssl and fsl cases), simulated, and measured loading characteristics of the 4× cmos exponential converter proposed in ref. [57] with vdd = 1.5v, at four different frequencies: (a) 25 khz; (b) 50 khz; (c) 100 khz; and (d) 200 khz

figures 8 and 9 present the measurement, simulation, as well as the theoretical results on the loading characteristics and the power efficiencies of the cmos 4× exponential converter at four different frequencies ranging from 25 khz to 200 khz [56], [57]. all the switches in this converter are turned on and off properly with additional dynamic inverters. the circuits were designed for operation at vdd = 1.5v and c = 100 nf, assuming the ron of the switches to be 50 Ω. for f = 25 khz, 50 khz, 100 khz and 200 khz, the corresponding ψ values are equal to 4, 2, 1 and 0.5, respectively. thus, it is expected that the designed converter is more likely to operate at the ssl, with req,out given by eq. (1), for f = 25 khz and 50 khz. for f = 100 khz and 200 khz, it is expected to operate at the fsl, with req,out given by eq. (2). the simulation results shown in fig. 8 agree well with this conjecture. in fig.
8, the loading characteristics at 100 khz and 200 khz are more or less the same. this further indicates that the designed circuit works at the fsl when f = 100 khz and 200 khz, because the equivalent output resistance of a converter at the fsl should be independent of the operation frequency according to eq. (2). the difference between the theoretical and the measurement results in fig. 8(a)-(d) is attributed to the equivalent series resistance (esr) of the externally connected capacitors, the interconnections, and the parasitic capacitances at each node of the real circuit. this difference is less than 10% [57]. in fig. 9, it can be further observed that when the frequency is increased from 25 khz to 200 khz, the measured power efficiency drops from 80% to 40% when the loading current is small. this agrees well with eq. (3): the dynamic power loss, pdyn, dominates when the output power is small, and pdyn increases with the operation frequency [57].

fig. 9 theoretical and measured power efficiencies versus loading current of the 4× cmos exponential converter proposed in ref. [57] with vdd = 1.5 v, at four different frequencies: (a) 25 khz; (b) 50 khz; (c) 100 khz; and (d) 200 khz

5. regulations in sc dc-dc converters

as shown in fig. 8, the output voltages of the sc dc-dc converters drop as the loading currents increase. to maintain a constant output voltage under different loads, a regulation method using a closed-loop structure is required for the sc dc-dc converters. on the other hand, the supply voltage, such as that of the battery used in some portable devices, may also drop significantly during operation, which causes the output voltage of the dc-dc converter to drop continuously, as the output voltage of a sc dc-dc converter is directly proportional to the supply voltage. hence, we also need to maintain the output voltage over a wide supply voltage range.
these are the main tasks for the output stage of the converters. in this section, a review of some previously proposed regulation techniques is given.

5.1. regulation for loading current fluctuation

the pulse-skipping modulation (psm) technique was found to be well suited for sc dc-dc converter applications. fig. 10 illustrates the schematic of this method. by skipping some of the charge transferring periods for the output according to the loading condition, better regulation can be achieved. as shown in the figure, when the loading current decreases, the sensed output voltage, vfb, may become larger than the reference voltage, vref, and the clock signals controlling the converter are then shut down. the converter is in effect disconnected from the output and stops delivering power to it. when vfb drops below vref because the loading current discharges the output capacitor, the output capacitor is recharged by the converter again. in this way, the charge transferring period for the output (or the average power delivered to the output) can be adjusted such that a constant vo is kept under different io. a hysteresis comparator is used here, so ripples exist. this method has the advantages of fast response and good stability. moreover, the switching loss can be reduced and the power efficiency can be improved, especially when io is small. however, the variable operation frequency and the large output voltage ripple are the main drawbacks of this method [58]. alternatively, as illustrated in fig. 11, the output voltage can be regulated by adjusting the charging current of the coupling capacitors according to the loading condition. in this linear control method, some transistors are used as voltage-controlled current sources and are included in the charging paths of the converter. the error amplifier in the negative feedback loop then controls the current sources, and thus the charging currents to the coupling capacitors, according to the loading condition.
unlike the psm method, this control method produces a smaller output voltage ripple and uses a fixed operation frequency. unfortunately, the switching loss is comparatively large when the loading current is small, which leads to poorer power efficiency [31], [59]. other regulation methods proposed in the literature include frequency modulation and segmented output components [60]-[65]. in the frequency modulation method, the output is regulated by adjusting the operation frequency of the converter. it uses a voltage-controlled oscillator (vco) in the feedback loop [60], [61]. as the equivalent output resistance of a converter is related to the component sizing (see eqs. (1) and (2)), the output voltage of a sc dc-dc converter can also be regulated using some segmented devices, like the segmented capacitors [62], [63] and the segmented cts's [64], [65]. regulated converters using more than one of the above techniques can also be found in some reports. for example, bayer and schmeller [66] used both the linear control and the psm methods to regulate the output voltage. patounakis, li and shepard proposed a hybrid regulator that combines the sc dc-dc converter and the low dropout regulator (ldo) to achieve high efficiency and to reduce the voltage ripple [67].

fig. 10 illustration of the pulse-skipping modulation method used for charge pump voltage regulation

fig. 11 illustration of the linear control method used for charge pump voltage regulation

5.2. regulation on supply voltage variation

as mentioned, sc dc-dc converters are often used in systems with varying supply voltage. it was realized that regulating a sc dc-dc converter over a wide supply voltage range based on simple feedback loops, like the ones shown in figs. 10 and 11, often results in poor power efficiency, as the maximum achievable power efficiency of a sc dc-dc converter is given by vo/(mvdd).
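a minimal sketch of how the bound vo/(mvdd) rewards having several conversion ratios available: for a required vo and a given vdd, the most efficient feasible choice is the smallest m that still satisfies m·vdd ≥ vo. the function name and the ratio sets below (1/2 and 2/3) are hypothetical and chosen only for illustration; the 1.8 v output and the 3-5 v supply range are those of the example in this subsection.

```python
from fractions import Fraction


def best_ratio(vo, vdd, ratios):
    """Pick the smallest available ratio m with m * vdd >= vo.

    Returns (m, vo / (m * vdd)), i.e. the chosen ratio and the maximum
    achievable efficiency, or None if no ratio can reach vo.
    """
    feasible = [m for m in ratios if m * vdd >= vo]
    if not feasible:
        return None
    m = min(feasible)
    return m, vo / (m * vdd)


if __name__ == "__main__":
    vo = Fraction(9, 5)                        # 1.8 V target output
    single = [Fraction(2, 3)]                  # hypothetical fixed-ratio converter
    multi = [Fraction(1, 2), Fraction(2, 3)]   # hypothetical reconfigurable converter
    for vdd in (Fraction(3), Fraction(4), Fraction(5)):
        for label, ratios in (("fixed", single), ("reconfigurable", multi)):
            m, eff = best_ratio(vo, vdd, ratios)
            print(f"vdd = {float(vdd):.1f} V, {label}: m = {m}, "
                  f"max efficiency = {float(eff):.0%}")
```

exact fractions are used so that the feasibility test m·vdd ≥ vo is not affected by floating-point rounding.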
hence, to maintain a certain output voltage, the power efficiency will drop as the supply voltage increases if the switch-and-capacitor-network configuration is not altered (i.e. m is fixed). in this connection, reconfigurable sc dc-dc converters, in which the switch-and-capacitor-network configuration can be altered, i.e. m is a variable, were proposed. this method improves the power efficiency over a wide range of supply voltage. the idea can be demonstrated by considering the case in which a fixed output voltage of 1.8v is required and the supply voltage may vary in the range of 3-5 v. in this case, the theoretical maximum power efficiencies which can be achieved by some sc dc-dc converters with different available m values under different supply voltages are plotted in fig. 12. clearly, a higher overall power efficiency can be achieved with a converter having more available conversion ratios. fig. 13 illustrates the control of the reconfigurable sc dc-dc converter. additional circuits are added to determine the m value of the converter based on the vdd value, which minimizes the dropout voltage at different supply voltages [68]-[71]. more complicated techniques, such as the gain hopping technique [72]-[74], were proposed to determine the m value based on the loading condition as well, so as to further improve the overall power efficiency.

fig. 12 plot of the theoretical maximum power efficiencies at different vdd for a reconfigurable converter using different m values for constant vo of 1.8v

fig. 13 illustration of the control of a reconfigurable sc dc-dc converter

6. conclusion

an overview of two-phase switched capacitor dc-dc converters is given. characteristics, including equivalent output resistances, power efficiencies, voltage ripples, and start-up times, of three commonly-used topologies, i.e. the linear, fibonacci, and the exponential topologies, are compared.
some practical issues on the implementation of these converters using cmos technology are discussed. finally, revised voltage regulation schemes, being able to accommodate a wider range of loading current fluctuation, are proposed for high-efficiency voltage conversion.

acknowledgement: this work is supported by an applied research grant of city university of hong kong under project number 9667083.

references

[1] t. tanzawa, t. tanaka, k. takeuchi and h. nakamura, "circuit techniques for a 1.8-v only nand flash memory", ieee j. solid-state circuit, vol. 37, pp. 84-98, 2002. [2] j. m. baek, j. h. chun and k. w. kwon, "a power-efficient voltage up converter for embedded eeprom application", ieee trans. circuits syst. ii: express briefs, vol. 57, pp. 435-439, 2010. [3] n. derhacobian, s. c. hollmer, n. gilbert and m. n. kozicki, "power and energy perspectives of nonvolatile memory technologies", proc. ieee, vol. 98, pp. 283-29, 2010. [4] c.-h. wu and c.-l. chen, "multiphase charge pump generating positive and negative voltages for tft-lcd gate driving", in ieee int. symp. electronic design, test and appl., pp. 179-183, 2008. [5] f. su and w.-h. ki, "component-efficient multiphase switched-capacitor dc-dc converter with configurable conversion ratios for lcd driver applications", ieee trans. circuits syst. ii: express briefs, vol. 5, pp. 753-757, 2008. [6] w. zhao, k. choi, s. baumana, z. dilli, t. salter and m. peckerar, "a radio-frequency energy harvesting scheme for use in low power ad hoc distributed network", ieee trans. circuits syst. ii: express briefs, vol. 59, pp. 573-577, 2012. [7] t. salter, k. choi, m. peckerar, g. metze and n. goldsman, "rf energy scavenging system utilizing switched capacitor dc-dc converter", electron. lett., vol. 45, pp. 374-376, 2009. [8] k. eguchi, s. pongswatd, h. zhu, k. tirasesth, h. sasaki and t. inoue, "a multiple-input sc dc-dc converter with battery charge process", in int. conf.
intelligent networks and intelligent systems, pp. 697-700, 2009. [9] j. kim, j. m. kim and c. kim, "wide input range hybrid dc-dc conversion system for solar energy harvesting", electronics letters, vol. 48, pp. 39-40, 2012. [10] x. zhang, d. shang, f. xia, h. s. low and a. yakovlev, "a hybrid power delivery method for asynchronous loads in energy harvesting systems", in ieee int. new circuits and syst.conf., pp. 413-416, 2012. [11] m. r. sarker, s. h. m. ali, m. othman and m. s. islam, "designing a low voltage energy harvesting circuits for rectified storage voltage using vibrating piezoelectric", in ieee student conf. research and development, pp. 343-346, 2011. [12] v. gutnik and a. p. chandrakasan, "embedded power supply for low-power dsp", ieee trans. vlsi syst., vol. 5, pp. 425-435, 1997. [13] t. burd, t. pering, a. stratakos and r. brodersen, "a dynamic voltage scaled microprocessor system", ieee j. solid-state circuit, vol. 35, pp. 1571-1580, 2000. [14] j. f. dickson, "on-chip high-voltage generation in mnos integrated circuits using an improved voltage multiplier techniques", ieee j. solid-state circuit, vol. 11, pp. 374-378, 1976. [15] f. ueno, t. inoue, i. oota and i. harada, "emergency power supply for small computer systems", ieee int. symp. circuits and syst., pp. 1065-1068, 1991. [16] j. a. starzyk, y.-w. jan and f. qiu, "a dc-dc charge pump design based on voltage doublers", ieee trans. circuits syst. i: fundam. theory appl., vol. 48, pp. 350-359, 2001. [17] s. ben-yaakov, "on the influence of switch resistances on switched-capacitor converter losses", ieee trans. ind. electron, vol. 59, pp. 638-640, 2012. [18] s. ben-yaakov, "behavioral average modeling and equivalent circuit simulation of switched capacitors converters", ieee trans. power electron, vol. 27, pp. 632-636, 2012. [19] m. d. seeman and s. r. sanders, "analysis and optimization of switched-capacitor dc-dc converters", ieee trans. power electron, vol. 23, pp. 841-851, 2008. [20] w.-c. 
wu and r. m. bass, "analysis of charge pumps using charge balance", in ieee 31st annu. power electron. specialists conf., pp. 1491-1496, 2000. [21] p. favrat, p. deval and m. j. declereq, "a high-efficiency cmos voltage doubler", ieee j. solid-state circuit, vol. 33, pp. 410-416, 1998. [22] g. palumbo, d. pappalardo and m. gaibotti, "charge-pump circuits: power-consumption optimization", ieee trans. circuits syst. i: fundam. theory appl., vol. 49, pp. 1535-1542, 2002. [23] a. cabrini, l. gobbi and g. torelli, "theoretical and experimental analysis of dickson charge pump output resistance", in proc. int. symp. circuits syst., pp. 2749-2752, 2006. [24] j. s. witters, g. groeseneken and h. e. maes, "analysis and modeling of on-chip high-voltage generator circuits for use in eeprom circuits", ieee j. solid-state circuit, vol. 24, pp. 1372-1380, 1989. [25] c.-h. hu and l.-k. chang, "analysis and modeling of on-chip charge pump designs based on pumping gain increase circuits with a resistive load", ieee trans. power electron, vol. 23, pp. 2187-2194, 2008. [26] c.-c. wang and j. wu, "efficiency improvement in charge pump circuits", ieee j. solid-state circuit, vol. 32, pp. 852-860, 1997. [27] i. oota, n. hara and f. ueno, "a general method for deriving output resistances of serial fixed type switched-capacitor power supplies", in ieee int. symp. circuits and syst., pp. 503-506, 2000. [28] g. van steenwijk, k. hoen and h. wallinga, "analysis and design of a charge pump circuit for high output current applications", in ieee nineteenth european solid-state circuits conf., pp. 118-121, 1993. [29] j. w. kimball, p. t. krein and k. r. cahill, "modeling of capacitor impedance in switching converters", ieee power electron. lett., vol. 3, pp. 136-140, 2005. [30] y. allasasmeh and s. gregori, "a performance comparison of dickson and fibonacci charge pumps", in european conf.
facta universitatis series: electronics and energetics vol. 32, no. 3, september 2019, pp. 463-478 https://doi.org/10.2298/fuee1903463a © 2019 by university of niš, serbia | creative commons license: cc by-nc-nd
raac: a bandwidth estimation technique for admission control in manet
folayo aina 1, sufian yousef 1, opeyemi osanaiye 2
1 department of engineering and built-in-environment, anglia ruskin university, chelmsford, essex, united kingdom
2 department of telecommunication engineering, federal university of technology, minna, niger state, nigeria
abstract. the widespread use of wireless mobile networks has increased the demand for their applications. providing reliable qos in a wireless medium, especially in a mobile ad-hoc network (manet), is quite challenging and remains an ongoing research topic. one of the key issues of manet is its inability to accurately predict the needed and available resources so as to avoid interference with already transmitting traffic flows. in this work, we propose a resource allocation and admission control (raac) solution. raac is an admission control scheme that estimates the available bandwidth needed within a network, using a robust and accurate resource estimation technique.
simulation results obtained show that our proposed scheme for manet can efficiently estimate the available bandwidth and outperforms other existing approaches for admission control with bandwidth estimation.
key words: admission control, bandwidth, channel idle time, manet
received january 16, 2019; received in revised form march 21, 2019
corresponding author: opeyemi osanaiye, department of telecommunication engineering, federal university of technology, minna, niger state, nigeria (e-mail: opyosa001@myuct.ac.za)
1. introduction
in recent times, the need to support qos in manet has been rapidly increasing. tasks, especially real-time applications (e.g. multimedia data), require qos to enhance their communication. solutions have been proposed to support qos in wired networks; however, these solutions are not directly adaptable to wireless communication networks, and manet requires novel solutions. nodes must therefore cooperate with one another to guarantee effective routing as well as qos. this cooperation includes endpoint flow policing as well as admission control along the route, to prevent violation of the initially configured policy. the aim of deploying qos support is to provide guaranteed application support in terms of delay, jitter, throughput, bandwidth, etc. to ensure this, the mac layer takes responsibility for allocating resources at individual nodes, while the network layer must consider resources along the entire communication route. the support for qos in manet, when compared with its wired counterpart, is not trivial, due to its lack of infrastructure and its sharing of resources and medium [1] [2].
a mechanism that provides qos assurance is known as admission control. the aim of admission control is to decide whether to admit data sessions that can satisfy a given qos requirement without violating any previously made rules, or to reject them.
the main issue encountered during the implementation of an admission control mechanism revolves around retrieving information on the available network resources. the admission control protocol must be able to determine whether there are nodes that have the resources available to accommodate the intended traffic flow [3] [4].
in this work, we propose raac, which is used to estimate the available bandwidth in a network for admission control purposes. raac combines and improves the existing algorithms of measurement-based available bandwidth estimation and flow admission control (bandest) and cognitive passive estimation of available bandwidth (cpeab). we identify the key metrics that must be considered for our protocol to have a better performance. a mechanism that measures all these metrics to improve the network performance has been implemented using the opnet modeler simulation tool.
the rest of the paper is organized as follows: section 2 presents related works, while section 3 describes bandwidth estimation and admission control. in section 4, we present our proposed resource allocation and admission control (raac), while section 5 presents the experimental simulation. finally, section 6 concludes the paper.
2. related works
manets have in recent times become the wireless network of choice due to the numerous advantages they offer. in wired networks, the available bandwidth is measured using an active estimation technique [5]. this technique is not suitable for manet because it makes use of probe packets when measuring the available bandwidth between source and destination. if the number of source-destination pairs is large, many probe packets are sent, which in turn consumes a large amount of bandwidth.
yang and kravets [6] proposed a contention-aware flow admission control for ad-hoc networks (cacp). in cacp, flow admission control is performed based on estimating the available bandwidth.
the estimation is done using the wireless channel sensing mechanism by considering the back-off period. it is assumed that the back-off period is negligible even at saturation. cacp considers both intra-flow and inter-flow contention counts in a distributed manner. the drawbacks of cacp are its non-consideration of the effect of the mac layer on the available bandwidth, and its failure to consider the impact of mac layer overhead when the data traffic load is increased within a network.
sarr et al. [7] propose an available bandwidth-based flow admission control (abe) algorithm for wireless networks. estimation of the available bandwidth is done by using the wireless channel sensing mechanism. to achieve this, they considered virtual and physical carrier sensing, and the different types of wireless csma/ca mac layer interframe spacing. the authors argued that measuring the channel activities, considering the amount of time spent in physical and virtual carrier sensing with different interframe spaces, results in overestimating the available bandwidth. this is due to the non-synchronization between the sender and the receiver within an ad-hoc network (note that synchronization between the sender and receiver, as used in the context of this work, means that for communication to occur, the medium availability at the sender and at the receiver must coincide). the authors thereafter propose a mathematical model that considers the collision probability to estimate the actual available bandwidth and the future back-off overhead. the collision probability is derived from the number of hello messages received by a node over the number of hello packets expected to be received by the node in the previous measurement interval.
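as a minimal sketch of the hello-based estimate just described, the snippet below treats the share of expected hello packets that did not arrive as the collision probability. the function name, the clamping step and the exact form 1 - received/expected are illustrative assumptions; the text only states that the probability is derived from the received and expected counts.

```python
def hello_collision_probability(received: int, expected: int) -> float:
    """Estimate a collision probability from hello-packet counts.

    Assumption (not taken from the source): the probability is the fraction
    of expected hello packets that were NOT received in the previous
    measurement interval.
    """
    if expected <= 0:
        raise ValueError("expected must be a positive count")
    # Clamp so that duplicated or late hellos cannot yield a negative value.
    received = max(0, min(received, expected))
    return 1.0 - received / expected
```

for a hypothetical interval in which 10 hello packets were expected and 8 arrived, the call `hello_collision_probability(8, 10)` gives a value of roughly 0.2.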
the flow admission control algorithm makes use of one-hop and two-hop neighbour information to calculate the intra-flow contention, and the authors used 4 as the maximum intra-flow contention. to calculate the inter-flow contention, the minimum available bandwidth within the interference range is determined to decide on a flow's admission request. the drawbacks of this technique are: (i) if there is an increase in the data traffic load within a network, the only factor considered is the additional back-off overhead. other important factors, such as additional retransmission and contention window overheads, are ignored. (ii) the intra-flow contention count calculation does not always provide the right contention count and appears to be too simple, since it only considers the minimum available bandwidth within the interference range of a node. (iii) collision probability is calculated without considering hidden nodes or the unnecessary delay caused by exposed nodes.
an improved available bandwidth (iab) estimation has been proposed by zhao et al. [8]. this protocol estimates the available bandwidth of a given link for qos support in wireless ad-hoc networks. it considers the synchronization between the source and the destination node by differentiating the busyness caused by the transmitting and receiving nodes from that caused by the sensing node. furthermore, the work also improved the accuracy of estimating the overlapping probability of the idle time of two adjacent nodes. the drawbacks of this technique are similar to (i) and (ii) mentioned for [7].
cognitive passive estimation of available bandwidth (cpeab) was proposed in [9]. this protocol estimates the available bandwidth of a network in an overlapped wifi environment.
it considers the additional overhead caused by acknowledgement frames, which was not considered in either aac or abe, and therefore estimates the available bandwidth by measuring the proportion of waiting and back-off delay, packet collision probability, acknowledgement delay, and channel idle time. furthermore, cpeab considers hidden and exposed nodes to obtain a more accurate available bandwidth measurement. the drawback of this algorithm is that the intra-flow contention count calculation does not always provide the right contention count. additionally, retransmission and contention window overheads are ignored. to retrieve the available bandwidth by carrier sensing, a hello packet is broadcast to two-hop neighbours, which floods the network and increases the network overhead. lastly, the channel idle time dependency only differentiates between the busy and sense busy states and does not regard an empty queue as an idle channel time period. we define the busy state as a situation whereby a node is transmitting or receiving, while the sense busy state is defined as a situation whereby a node is sensing. any other time outside the sensing time means the node is in an idle state. the idle state means that the node is neither transmitting, receiving nor sensing any packet. for a channel to be idle, the channel does not necessarily have to be sensed idle by both the physical and virtual wireless carrier sensing mechanisms; however, the interface queue must be empty.
nam et al. [10] improved on the work of [7] by enhancing its algorithm to include the retransmission mechanism and back-off overhead. the drawback of this technique is that the contention window overhead is not considered when the data traffic load inside the network increases. also, the assumptions made in the mathematical model may not hold true in an actual network.
farooq et al.
[11] propose a proactive bandwidth estimation (pabe) for ieee 802.15.1-based networks. pabe is a measurement-based enhancement of the available bandwidth estimation method and flow admission control algorithm. instead of deploying a model to predict the collision and back-off, an empirical method for gathering data was used to predict any additional back-off overhead. besides, it uses the value of the expected future data traffic load, instead of the existing one, to predict additional overhead. the drawback of this algorithm is that, when the data traffic load within a network increases, the additional retransmission and contention window overheads are ignored. also, the computation of the intra-flow and inter-flow contention counts is inaccurate. lastly, to retrieve the available bandwidth by carrier sensing, a hello packet is broadcast to two-hop neighbours, which tends to flood the network and in turn increases the network overhead.
bandest, another algorithm proposed by farooq et al. [12], proactively considers the complete wireless 802.15.4 unslotted csma-ca mac layer overhead and considers the future load. additionally, it considers the estimation of intra-flow contention and estimates contention on non-relaying nodes. the additional mac layer overhead associated with the increase in data traffic load was considered, and an algorithm that deals with concurrent admission requests in fifo order was implemented. the drawback of bandest is that it has a higher overhead because it broadcasts to two hops. furthermore, bandest did not consider the channel idle time dependency together with the effect of hidden/exposed nodes on the accuracy of bandwidth estimation.
from the reviewed literature, the channel idle time dependency sensed by both the sender and the receiver has not been properly addressed, as most previous works did not factor it into their design.
this work therefore proposes a resource allocation and admission control (raac) mechanism that estimates the bandwidth for admission control based on some key factors.
3. factors to be considered for admission control implementation
in this section, we identify the key factors that are essential to implement admission control within a network. this provides the background against which the related works are evaluated.
3.1. channel idle time dependency
channel idle time dependency sensed by the sender and the receiver ensures an accurate estimation of the available bandwidth. this is achieved by differentiating a node's busy state from its sense busy state and by identifying the channel idleness that may be caused by an empty queue.
3.2. intra-flow interference
transmitted packets interfere with all nodes within the carrier sensing range of the transmitting host. on a multi-hop path, some forwarding nodes are located within the sensing range of one another; therefore, the same flow is transmitted several times in the same sensing region, thereby using the same shared channel. this circumstance is known as intra-flow contention. in [13], the contention count is defined as the number of nodes on the multi-hop path located within the carrier sensing range of the contending host.
3.3. collision with respect to hidden node and unnecessary delay from exposed nodes
in wireless networks, there is no possibility of detecting a collision in advance; therefore, once it happens, both colliding frames are emitted completely, thereby maximizing the loss in bandwidth. therefore, when estimating collision and unnecessary delay within the available bandwidth, consideration must be given to the impact of both the hidden and the exposed terminal nodes [9].
3.4. increased data traffic
increased data traffic inside the network leads to an increase in the csma/ca-based mac overhead with respect to back-off interval, number of retransmissions, acknowledgement packets and contention window size. when the data traffic load of a network is increased, the csma/ca-based mac layer overhead increases in turn; therefore, the available bandwidth estimation of the admission control algorithm needs to take note of the consumed bandwidth, such as the mac layer overhead corresponding to different values of the offered data load inside a network [14].
4. resource allocation and admission control (raac)
our proposed algorithm, raac, adopts a bandwidth estimation in which channel idle time dependency, intra-flow interference, collision with respect to hidden nodes and unnecessary delay due to exposed nodes, and, lastly, the increase in csma/ca-based mac overhead caused by increased data traffic inside a network are considered. raac is a novel, efficient and accurate resource allocation and admission control technique that estimates the available bandwidth for the admission controller to either accept or reject a session when an admission is requested. the process to achieve this can be divided into three parts, namely: measuring the channel idle time dependency, measuring the intra-flow contention, and resolving the issues of hidden nodes causing collision and exposed nodes leading to unnecessary delay.
4.1. measuring the channel idle time dependency
figure 1 depicts a wireless state transition diagram. a node in this transition diagram is said to be in a transmitting state only if it is currently emitting signals through its antenna. a node is said to be in a receiving state if there are nodes transmitting within its transmission range. a node is said to be in its sensing state if the medium is sensed busy but no frame is being received because the energy is below the receiving threshold.
a node is said to be in an idle state if it is not transmitting, receiving, or sensing any packet. differentiating the sense busy state from the busy state, and redefining the idle channel time of a station to include the time during which the mac queue is empty, allows for the synchronization of the sender and the receiver as well as proper available bandwidth estimation. the available bandwidth with respect to the channel idle time dependency is therefore:
[equation (1)]
where ti, tb, ts and te denote the time durations of the idle, busy, sense busy and empty queue states, respectively, in a measurement period t, and c is the maximum link capacity.
fig. 1 wireless radio transition diagram [15]
to further clarify this, the scenario in figure 2a was considered, where n1 is transmitting to n2. figure 2b shows the basic ieee 802.11 frame exchange sequence (at the top) and the channel state sensed by all the nodes. all the nodes that fall within the transmission range of n1 can successfully decode any packet from it. furthermore, the time at which it finishes transmitting the packet can also be determined. at this time, they are in the receiving state, which is a busy state. even though n1 is defined as idle in “interval a”, during this period the medium must be sensed idle by n1 and cannot be used by nodes within the carrier sensing range. to eliminate this inaccuracy, the coefficient k was adopted as used in [13], where:
[equation (2)]
k represents the proportion of the bandwidth consumed during the waiting and the back-off period. note that the back-off varies; therefore, we use its average value, written as $\overline{backoff}$. the average number of back-off slots decremented for a single frame can be represented as:
[equation (3)]
where cwmin represents the initial (or minimal) value of the contention window and cwmax represents the maximum value of the contention window, with cwmax = $2^n$.
m denotes the maximum number of retransmissions attempted (m ≥ n); x denotes the number of retransmissions suffered by a given frame, therefore:
[unnumbered equation]
p represents the conditional collision probability [16], which is the probability that a transmitted packet will collide. the following expression can be used to derive $\overline{backoff}$:
[equation (4)]
note that the effect of the packet collision probability (p) was included in the calculation of k.
fig. 2a wireless transmission scenario showing transmission range and carrier sensing range [15]
fig. 2b channel states sensed by nodes in scenario 2a [15]
4.2. measuring the intra-flow contention
determining the correct value of the intra-flow contention depends on the interference range of the nodes in a network. let us assume that the nodes within a two-hop distance can cause interference; therefore, the interference count on any node along the path forwarding the data depends mainly on the distance of the node from the source and the destination nodes. for a new admission control request to be granted, raac determines the actual intra-flow contention count at the source node, the intermediate nodes, and the destination node.
4.3. resolving issues of hidden and exposed nodes
looking at the ieee 802.11 frame exchange sequence in figure 3, interval iii is used for transmitting the data frame and is dependent on the frame size. moreover, according to [7], the size of a frame has a direct impact on the packet collision rate; however, the impact of hidden and exposed nodes was not considered by the authors.
fig. 3 frame exchange sequence in rts/cts mechanism [17]
therefore, using [18], the impact of a flow's hidden/exposed terminals can be calculated as:
[equation (5)]
where f_h denotes the total data flow of the hidden nodes and f_e denotes the total data flow of the exposed nodes.
to solve the issue of hidden nodes and exposed nodes, which may cause collision and unnecessary delay, the request to send and clear to send (rts/cts) mechanism is activated. in figure 3, interval ii shows the frame exchange sequence when the rts/cts mechanism is activated. interval ii therefore consists of rts and cts messages with two sifs (short interframe spaces) in between them. the overhead incurred by rts and cts is calculated as:
[equation (6)]
by considering the extra overhead that may be added when rts/cts is used, the available bandwidth estimation can be more precise.
scenario without hidden/exposed node: figure 4 depicts a topology without hidden/exposed nodes. the two nodes involved are located within each other’s transmission range. one of the nodes is sending traffic to the access point while the other node is estimating the available bandwidth.
fig. 4 scenario without hidden/exposed node [9]
scenario with hidden/exposed nodes: in figures 5 and 6, we consider a topology which is configured to have 1 hidden node and 1 exposed node.
fig. 5 exposed nodes [18]
fig. 6 hidden node [18]
figure 5 shows that node b and node c are in the same transmission range. when node b sends data to node a, node c will detect that the channel is busy and will not make any attempt to send data to node d, to avoid collision. the same applies vice versa. note that node b and node c are each other’s exposed nodes. in figure 6, node a is not in the transmission range of node c. whenever node a sends packets, node c detects that the channel is idle; if node c sends data at the same time, it will result in packet collision, i.e. the packets from nodes a and c will collide at node b, which will eventually result in transmission failure. note that node c is the hidden node of node a.
4.4. increased data traffic leads to an increase in csma/ca mac overhead
farooq and kunz [11] observed in their work that an increase in data traffic in the network results in an increase in the csma/ca mac overhead, due to the number of retransmissions and the back-off duration. therefore, for an available bandwidth estimation to be effective, there is a need to take note of the bandwidth consumed by the mac layer overhead corresponding to the different values of the data load offered inside a network. in [11], an experimental study was carried out to determine the ieee 802.15.4 unslotted csma/ca mac layer overhead (retransmission and back-off) with increased data load in the network. it was observed that an increase in data load leads to an increase in the average back-off as well as the retransmission overhead. therefore, it is essential to consider the back-off and retransmission overhead by taking note of the additional data load inside the network. if the anticipated data load within the interference range of a network exceeds 60 kbps, the extrapolation technique can be used to determine the additional back-off and retransmission overhead.
in order to estimate the additional mac layer overhead resulting from an increased data traffic load, the authors of [11] presented a method in section 2.1 of their work. here, the mac layer overhead is considered after determining the future data load (i.e., the current data traffic load within the interference range of a node is added to the product of the contention count and the new flow’s required bandwidth). the overhead associated with the method presented in [11] is that a lookup table is stored on nodes that returns the estimated mac overhead corresponding to a given value of the data load inside a network.
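a lookup of this kind can be sketched as follows. this is a minimal illustration rather than the method of [11]: the table values are hypothetical, and both the choice to interpolate linearly between the two nearest stored points and the choice to extend the first or last segment for loads outside the table are assumptions.

```python
from bisect import bisect_left


def estimate_mac_overhead(load_bps, table):
    """Return the MAC overhead (bps) for an offered load using a lookup table.

    table: list of (load_bps, overhead_bps) pairs sorted by load, with
    distinct loads and at least two entries.
    """
    if len(table) < 2:
        raise ValueError("need at least two data points")
    loads = [load for load, _ in table]
    i = bisect_left(loads, load_bps)
    if i < len(loads) and loads[i] == load_bps:
        return float(table[i][1])  # exact hit, no interpolation needed
    # Pick the segment around the load; outside the table, reuse the first
    # or last segment, which amounts to linear extrapolation.
    i = min(max(i, 1), len(table) - 1)
    (x0, y0), (x1, y1) = table[i - 1], table[i]
    return y0 + (y1 - y0) * (load_bps - x0) / (x1 - x0)


# Hypothetical table: offered load (bps) -> measured MAC overhead (bps).
TABLE = [(0, 0), (20_000, 1_000), (60_000, 5_000)]
```

with this hypothetical table, a load of 40 kbps falls between the last two entries and is interpolated, while a load of 80 kbps lies beyond the table and is extrapolated along the last segment.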
it is not possible to store the mac layer overhead in bps for every possible offered data load, but an algorithm can estimate the mac layer overhead for an offered data load not present in the lookup table by linear interpolation, using the two closest available data points. by applying equations (1) through (6), we derived an estimate of the available bandwidth for raac, which is:
[equation (7)]
where:
k = bandwidth consumed by waiting time and back-off
pc = packet collision probability
ack = acknowledgement
c = maximum link capacity
l = traffic load
r/c = rts/cts
ti = idle time of the wireless medium in a measured period t
5. simulation parameters
in this section, we use opnet modeler to simulate our design and evaluate the performance of raac. we deployed 100 nodes, randomly distributed in a 1200 x 1200 m area. the other network parameters were set accordingly, i.e. a link capacity of 54 mbps, a transmission range of 250 m and a carrier sensing range of 550 m were used. t is set to 1 s, and 6 sender and receiver nodes were randomly selected among the 100 nodes to carry the background traffic, while the rest of the nodes act either as relay nodes or remain idle. each simulation was run for 60 seconds and repeated 10 times. table 1 lists the parameters used for our simulation.
table 1 simulation parameters
parameter | value
number of nodes | 100
total network area | 1200 x 1200 m
link capacity | 54 mbps
packet size | 127 bytes
transmission range | 250 m
carrier sensing range | 550 m
number of sender-receiver | 6
t | 1 s
number of simulations (repetitions) | 10
simulation time | 60 s
difs | 28 ms
sifs | 10 ms
slot time | 9 ms
mac header size | 34 bytes
acknowledgement | 33 bytes
rts size | 20 bytes
cts size | 14 bytes
cwmin | 15
cwmax | 1023
traffic type | cbr
5.1. simulation model and evaluation of raac
similar to the work of [17], the scenario in figure 7 is used to evaluate raac. flow 1 (f1) on link (5,6) has a variable bandwidth and flow 2 (f2) on link (1,2) has a constant bandwidth of 600 kbps. the available bandwidth on link (3,4) is estimated for raac using equation (7). the link capacity is 54 mbps, and the source nodes, which are nodes 1 and 5, generate 1 kbyte traffic. the distance between nodes is 200 m.
fig. 7 simulated network topology [17]
as we estimate the available bandwidth every t (sample period) seconds, the choice of t has an impact on the available bandwidth estimation. we show the impact of this in the next section. to have a fair comparison, t has been chosen to be 1 second, just as in the works of [9], [12] and [11].
5.2. measuring the available bandwidth
to measure the available bandwidth on a given link (s, r) during simulation, we transmitted a flow f on the link (s, r). for each value obtained, the rate of the flow is increased incrementally. if one of the other existing flows in the network sees its rate decrease by more than 5%, the increase in the rate of the flow f (s, r) is stopped. the achieved rate of f (s, r) is considered as the available bandwidth on the link (s, r), i.e. the real bandwidth that can be achieved without degrading nearby flows.
5.3. simulation results
assessing raac: we compared the available bandwidth estimated by raac with the real available bandwidth, as shown in figure 8. our bandwidth estimation approach, raac, has been able to predict the available bandwidth regardless of the type of traffic flow. even though small estimation variations were recorded in some instances (see figure 8), the results obtained by our proposed raac are very close to the actual available bandwidth. for clarity, we present the average value of the real available bandwidth and the value obtained from our proposed raac (see table 2).
the measured and estimated bandwidths show how well raac has been able to estimate the measured available bandwidth.

fig. 8 available bandwidth estimation between raac and real available bandwidth

table 2 average available bandwidth measurement per traffic flow

| bandwidth estimation method | average value of traffic flow (bps) |
|---|---|
| real available bandwidth | 15757.12 |
| raac | 15844.42 |

assessing raac against cpeab, pabe, and bandest: here, we compare our proposed approach, raac, with the related past works cpeab, pabe and bandest, using the same scenario in section 4. the available bandwidth on link (3,4) for cpeab [9], pabe [12] and bandest [11] is estimated using equations 8, 9, and 10. our implementation of pabe and bandest adopted the mathematical model of estimation rather than the proactive method used by the authors. the mathematical method was used to ensure a fair comparison. the estimation of cpeab, on the other hand, was already presented by its authors as a mathematical model.

( ) ( ) ( ) (8)

( ) ( ) (9)

( ) ( ) (10)

where ts and tr are the idle times of the sender and the receiver, respectively, in the wireless medium. all other parameter definitions can be found in section 2. the result presented in figure 9 shows how raac outperforms the other protocols when estimating the available bandwidth between a sender-receiver pair of wireless nodes. this can be attributed to the assumption bandest makes about the overlapping idle channel period, which results in an overestimation of the available bandwidth. also, pabe and cpeab assume that the idle channel periods are independent, which results in an underestimation of the available bandwidth. raac considers the dependency between the idle channel occupancy of two adjacent nodes by differentiating the busy state from the sense busy state and from the idle state caused by an empty queue, to ensure a better and more accurate estimation.
raac uses the current estimated available bandwidth to predict the next period, just like other calculation-based approaches.

fig. 9 available bandwidth estimation

we have also plotted the estimated error statistics for each simulation, computed as in [17] and [9] and shown in equation 11:

[ ] (11)

fig. 10 error estimation ratio (percentage)

the results shown in figure 10 further support the graph presented in figure 9. they show that our proposed technique, raac, gives a better estimate of the available bandwidth than cpeab, bandest and pabe.

effectiveness of the estimated bandwidth: suppose the source node of a flow transmits an admission request message at 10, 20, 30, and 40 seconds. we consider that a wrong admission decision is made if a new flow is accepted that degrades the throughput of an already existing flow and/or of the newly admitted flow by more than 5%. an admission control algorithm also makes a wrong decision if it unnecessarily rejects a flow. neither pabe nor cpeab considered cases of wrong rejection of a flow; therefore, according to [11], the effectiveness (ɳ) is more comprehensive. one may argue that an unnecessary rejection of an admission request will not degrade the performance of a flow that has already been admitted, so that wrong acceptance of flows is worse than unnecessary flow rejection and only wrong admission should be counted as a bad admission decision. an alternative argument is that the available resources must be used efficiently; otherwise, sufficient resources may have to be deployed for flows with qos requirements to be satisfied during peak network utilization.
however, in most cases network resources are underutilized; therefore, for a comprehensive evaluation, equal importance is given to both types of wrong decision, such that:

ɳ = number of correct admission decisions / total number of admission requests

where ɳ represents the effectiveness. figure 11 shows the mean effectiveness over 10 repetitions, along with the 95% confidence interval. it shows that the mean effectiveness of raac is higher than that of cpeab, pabe and bandest, and that the difference is statistically significant. raac may also wrongly accept an admission request at some point due to factors such as corruption of bandwidth increment, broadcast messages due to interference, and a lost admission reject message in response to a bandwidth increment message. figure 11 also shows the mean effectiveness when admission control is not implemented. without admission control there is no control message overhead besides the routing messages; however, the flow is lower than with all the admission control protocols implemented in this work. if only a few flows are considered, there is no need to implement an admission control scheme, as all flows can be accommodated. this, however, is a rare case, especially since wireless networks are characterized by shared and low bandwidth. in conclusion, raac is more effective because of its low chance of false rejection. in pabe and cpeab, correct contention factors were not considered (see section 2.3 for correct contention count estimation), hence their effectiveness is very low.

fig. 11 different bandwidth effectiveness

table 3 shows the number of times each of the schemes considered makes an incorrect admission decision. it is observed that raac makes fewer wrong decisions than cpeab, pabe and bandest.
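The effectiveness metric ɳ defined above, which treats wrong accepts and wrong rejects as equally bad, can be computed as in the sketch below. The function name and the request counts in the usage line are hypothetical and are not taken from the reported results.

```python
def effectiveness(total_requests, wrong_accepts, wrong_rejects):
    """Effectiveness = correct admission decisions / total admission requests.

    Wrong accepts and wrong rejects are weighted equally, as in the text.
    """
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    wrong = wrong_accepts + wrong_rejects
    if wrong > total_requests:
        raise ValueError("more wrong decisions than requests")
    return (total_requests - wrong) / total_requests


# Hypothetical counts, for illustration only: 40 requests, 3 wrong accepts,
# 1 wrong reject, i.e. 36 correct decisions out of 40.
eta = effectiveness(total_requests=40, wrong_accepts=3, wrong_rejects=1)
```

Counting only wrong accepts as errors, as the first argument in the text suggests, would amount to passing `wrong_rejects=0`.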
therefore, raac is effective because it has a lower chance of falsely rejecting an admission request, since the algorithm is designed to account for all overhead generated by the network.

table 3 number of wrong admission decisions comparison (100 nodes)

| method | wrong accepts | wrong rejects |
|---|---|---|
| bandest | 18 | 3 |
| cpeab | 25 | 7 |
| pabe | 30 | 5 |
| no admission control | 58 | 0 |
| raac | 16 | 1 |

6. conclusion

in this work, we present a new approach to improve the accuracy of estimating the available bandwidth for admission control. factors that must be considered by a flow admission control algorithm have also been highlighted. we have proposed raac, a novel algorithm for manet that considers factors such as channel idle time dependency, intra-flow interference, collisions caused by hidden nodes, the impact of unnecessary delay due to exposed nodes, and lastly, the effect of an increase in data traffic inside a network. results obtained through simulation demonstrate that, by considering the factors highlighted, an effective available bandwidth-based admission control can be guaranteed. a comprehensive comparison has shown that raac provides a significant improvement over other related previous research work.

references
[1] s. chaudhari and biradar, “survey estimation techniques in communication networks”, wireless personal communication an international journal., vol 83, pp. 1425–1476, 2015.
[2] c. lal, v. laxmi, and m. gaur, “bandwidth-aware routing and admission control for efficient video streaming over manets”, springer science and business media., 2015.
[3] y. su, s. chan, and j. manton, “bandwidth allocation in wireless ad hoc networks: challenges and prospects”, ieee communication magazine, pp. 80-85, 2010.
[4] s.y. oh, g. marfia, m. gerla, “manet qos support without reservations”, journal of security and communication networks, vol. 4, no. 3, pp. 316–328, 2011.
[5] h. zhu and i.
chlamtac, “admission control and bandwidth reservation in multi-hop ad hoc networks,” computer networks, vol. 50, no. 11, pp. 1653–1674, 2005.
[6] y. yang and r. kravets, “contention-aware admission control for ad hoc networks,” ieee trans. mobile comp., vol. 4, aug. 2005, pp. 363–77.
[7] c. sarr, c. chaudet, et al. “bandwidth estimation for ieee 802.11-based ad hoc networks”, ieee transactions on mobile computing, vol. 7, no. 10, pp. 1228-1241, 2008.
[8] h. zhao, e. garcia-palacios, j. wei, & y. xi, “accurate available bandwidth estimation in ieee 802.11-based ad hoc networks”, computer communications, vol. 32, no. 6, pp. 1050–1057, 2009.
[9] s. tursunova, k. inoyatov, & y.-t. kim, “cognitive passive estimation of available bandwidth (cpeab) in overlapped ieee 802.11 wifi wlans” in proceedings of the ieee network operations and management symposium, 2010, pp. 448–454.
[10] n. van nam, i. guerin-lassous, v. moraru, and c. sarr, “retransmission-based available bandwidth estimation in ieee 802.11-based multihop wireless networks,” in proceedings of the 14th acm international conference on modeling, analysis, and simulation of wireless and mobile systems (mswim ’11), november 2011, pp. 377–384.
[11] m. farooq, & t. kunz, “bandest: measurement-based available bandwidth estimation and flow admission control algorithm for ieee802.15.4-based wireless multimedia networks”, international journal of distributed sensor networks, vol. 2015, 2015.
[12] m. farooq, & t. kunz, “proactive bandwidth estimation for ieee 802.15.4-based networks”, in proceedings of the 77th ieee vehicular technology conference (vtc ’13), 2013, pp. 1–5.
[13] r.e renesse, v. friderikos and h.aghvami, “cross-layer cooperation for accurate admission control decision in mobile ad-hoc networks”, iet communications, vol. 1, no. 4, pp. 577–586, 2007.
[14] a. paul, a. tachibana, and t.
hasegawa, “an enhanced available bandwidth estimation technique for an end-to-end network path”, ieee transaction on network and service management, vol. 13, no. 4, pp. 768–781, 2016.
[15] h. zhao, e. garcia-palacios, j. wei, & y. xi, “accurate available bandwidth estimation in ieee 802.11-based ad hoc networks”, computer communications, vol. 32, no. 6, pp. 1050–1057, 2009.
[16] a. nafaa, “provisioning of multimedia services in 802.11-based networks: facts and challenges”, ieee wireless communications, vol. 14, no. 5, pp. 106–112, 2007.
[17] h. j. park, & b.-h. roh, “accurate passive bandwidth estimation (apbe) in ieee 802.11 wireless lans”, in proceedings of the 5th international conference on ubiquitous information technologies and applications, 2010, pp. 1–4.
[18] ietf draft, routing algorithm based on the flow sensing parameter, draft-wei-manet-rafsp-00, july 2009.

facta universitatis series: electronics and energetics vol. 31, no 3, september 2018, pp. 461-485 https://doi.org/10.2298/fuee1803461a

efficient image compression and decompression algorithms for ocr systems

boban arizanović, vladan vučković
faculty of electronic engineering, computer department, niš, serbia

abstract. this paper presents efficient new image compression and decompression methods for document images, intended for use in the pre-processing stage of an ocr system designed for the needs of the “nikola tesla museum” in belgrade. the proposed image compression methods exploit the run-length encoding (rle) algorithm and an algorithm based on document character contour extraction, while an iterative scanline fill algorithm is used for image decompression. the image compression and decompression methods are compared with the jbig2 and jpeg2000 image compression standards. segmentation accuracy results for ground-truth documents are obtained in order to evaluate the proposed methods.
results show that the proposed methods outperform jbig2 compression in terms of time complexity, providing up to 25 times lower processing time at the expense of a worse compression ratio, and outperform the jpeg2000 image compression standard, providing up to a 4-fold improvement in compression ratio. finally, time complexity results show that the presented methods are sufficiently fast for a real time character segmentation system.

key words: image processing, image compression, image decompression, ocr, machine-typed documents, machine-printed documents

received december 22, 2017; received in revised form april 30, 2018
corresponding author: vladan vučković, faculty of electronic engineering, computer department, p.o. box 73, 18000 niš, serbia (e-mail: vladanvuckovic24@gmail.com)

1. introduction

character segmentation still presents a considerable challenge in image processing and other related computer science fields [1], and is a very important pre-processing stage in optical character recognition (ocr) systems [2-4]. character segmentation and character recognition [5-8] have been important subjects of research for many years [9]. outside the scope of ocr systems, much recent work deals with extracting characters from natural and other non-document images. taking the recent work into consideration, it is noticeable that the difficulty of character segmentation is usually underestimated compared to the process of character recognition [10,11]. related works may be classified into those that analyze the character segmentation approach in natural images [12-14], and others that deal with character segmentation in document images. the second group includes machine-printed documents [10,15-18], where the document structure and the shape of its elements are regular, and handwritten documents, where character segmentation is more challenging due to the irregular document structure [11,19-29].
old machine-typed documents are of particular significance because important historical documents are often in this form [8,30-32]. recent research includes all levels of character segmentation. many approaches for skew estimation, as a part of a skew correction process, are modifications of the hough transform [33-35], with some based on correlation functions or straight line fitting [36,37]. analyses on document image binarization parameters showed that the otsu method and other otsu-based methods give the best results on average [32]. a learning-based approach for finding the best binarization parameters was presented in [38]. document image compression and decompression methods can also be exploited in the pre-processing stage of the character segmentation system, in order to efficiently store the document images. a survey of image compression algorithms used in wireless multimedia sensor networks (wmsn) was presented in [39]. compression of large arabic textual images based on pattern segmentation is achieved using the approach proposed in [40]. genetic algorithm based on discrete wavelet transformation information for fractal image compression was presented in [41]. combination of the lapped transform and tucker decomposition, named as hyperspectral image compression, was proposed in [42]. a lossy image compression technique based on singular value decomposition (svd) and wavelet difference reduction (wdr) was proposed in [43]. taking the character segmentation into account, many methods have been proposed. a technique based on searching for connected regions in the spatial domain performed on a binary image was proposed in [44]. a character segmentation method based on gaussian low-pass filter and innovational laplace-like transform was proposed in [45]. segmentation process adapted for real time tasks is proposed in [46] and is based on the bayes theorem in order to exploit prior knowledge. 
a novel approach was proposed in [47] based on the usage of the contour curvature of letters for identifying the writer of ancient inscriptions and byzantine codices, without requiring learning algorithms or a database. diverse methods for segmentation of handwritten documents have been proposed [26]. some techniques exploit clustering in the process of segmentation [28,48]. a gabor filter for feature extraction and a fisher classifier for feature classification were exploited in [49]. to solve the problem of touching characters in handwritten documents, self-organizing maps, svm classifiers, and multi-layer perceptrons are used [21,27,50,51]. for natural images, tensor voting and the three-color bar code have been combined for segmentation [14,52]. this paper presents further improvements of the authors' character segmentation approach, which forms part of a real time ocr system for the needs of the “nikola tesla museum” in belgrade [53-56]. it presents pre-processing methods for document image compression and decompression, which take place after the image binarization. the proposed compression algorithms are based on the rle data compression algorithm and document character contour extraction, while the decompression algorithm exploits the scanline fill algorithm. together with skew estimation and correction [54], and the image filtering stage, which concludes with the image binarization process, this pre-processing is executed independently before the actual segmentation stage. this offers the opportunity to prepare document images for further processing later, to store document images in a compressed form, to use the improved character segmentation and recognition software independently from the pre-processing stage, and also to test the character segmentation and recognition stages independently.
the results show that the proposed image compression and decompression methods perform up to 25 times faster than jbig2 compression at the expense of a much lower compression ratio, and are better than jpeg2000 image compression in its lossless mode, giving an up to 4-fold improved compression ratio. compression quality proved to be unimportant with regard to character segmentation accuracy. additionally, the evaluated image compression and decompression methods proved to be quite efficient and suitable for use in a real time system. this paper is organized as follows: section 2 provides a description of the related works which deal with image compression, as well as a description of the previously proposed character segmentation approach. section 3 offers a theoretical foundation for the bi-level image compression standards and the jpeg/jpeg 2000 image compression standards used for comparison with the proposed algorithms, as well as a description of the rle algorithm and the scanline fill algorithm used in the proposed methods. section 4 provides the complete description of the proposed image compression and decompression methods. pseudo-codes for the proposed image compression and decompression methods are given in section 5, including a suggestion for the optimal implementation. in section 6, a large set of experimental results for the image compression methods, obtained on different pc machines, is provided. image compression and decompression results are analyzed in section 6 from the aspect of compression ratio and time complexity, including segmentation accuracy results for compressed document images. finally, a discussion of the extended real time character segmentation method, the results, and future work is given in section 7.

2. related works

this section gives more detailed descriptions of other image compression methods and of the authors' existing character segmentation approach.
a novel universal algorithm for lossless chain code compression with a new chain code binarization scheme was proposed in [57]. the compression method is based on the rle algorithm and the modified lz77 algorithm. compression consists of three modes: rle, lz77, and copy mode. the runs of 0-bits are compressed using rle, the simplified lz77 algorithm handles the repetitions within the bit stream, and copy mode is used if the aforementioned two methods are unsuccessful. on average, this method achieves better compression results than state-of-the-art methods. an image compression technique for video surveillance based on dictionary learning was presented in [58]. the main concept exploits the fact that the camera is stationary, which gives image samples a high level of similarity. the algorithm transforms images over sparsely tailored, over-complete dictionaries previously learned directly from image samples, and thus the image can be approximated with fewer coefficients. results show that this method outperforms jpeg and jpeg2000 in terms of both image compression quality and compression ratio. an image compression technique which combines the properties of predictive coding and discrete wavelet coding was proposed in [59]. to reduce inter-pixel redundancy, the image data values are pre-processed using predictive coding. the differences between the predicted and the original values are transformed using discrete wavelet coding. a nonlinear neural network predictor is used in the predictive coding system. results show that this method performs as well as jpeg2000. a multiplier-less, efficient and low complexity 8-point approximate discrete cosine transform (dct) for image compression was proposed in [60]. an efficient graphics processing unit (gpu) implementation of the presented dct is provided. it is shown to outperform other approximate dct transforms in jpeg-like image compression.
the image compression and decompression methods proposed in this research are intended for use in the authors' existing character segmentation approach. vučković and arizanović [53] proposed an efficient character segmentation method for machine-typed and machine-printed documents based on the usage of projection profiles. the method consists of pre-processing and segmentation logic. the pre-processing is focused on manual document skew correction [54], document image grayscale conversion (to perform the document image binarization), and noise reduction. this paper extends the pre-processing by adding image compression/decompression, to enable efficient and independent document image storage before the segmentation stage. the segmentation logic is semi-automatic and consists of line, word, and character segmentation. all segmentation levels use the modified projection profiles technique. a new method for segmentation of words into characters based on decision-making logic is the core of the segmentation logic. this method gradually eliminates the possibility of big segmentation errors by determining the number of characters in a word using the word width and the assumed average character width for a given document image. computational efficiency is achieved using the linear image representation, with further implementation optimization using pointer arithmetic and highly-optimized low level machine code. the provided results have shown that this novel method outperforms state-of-the-art techniques in terms of both time complexity and segmentation accuracy.

3. theoretical background

this section provides a theoretical foundation for the state-of-the-art image compression standards used for comparison with the proposed algorithms, including the description of the standard algorithms used in the proposed image compression and decompression methods.
the state-of-the-art theoretical background provided in this section covers the compression standards for bi-level images, which are especially suitable for document images, as well as the general jpeg and jpeg2000 image compression standards.

3.1. compression standards for bi-level images

bi-level images are represented using only 1 bit per pixel. this bit denotes a black or white color and has a value 0 or 1 depending on the color. for this reason, bi-level images are also referred to as black and white images. bi-level images usually contain a few specific types of elements such as text, halftone images, and line-art, which includes graphs, equations, logos, and other similar features. the first compression standards were designed for facsimile (fax) images. fax standards include group 3 (g3), group 4 (g4), and the jbig standard, which is the basis of the later developed jbig2 standard. the g3 standard includes the modified huffman (mh) coding, which combines the variable length codes of huffman coding with standard rle coding of the repetitive sequences, and the modified relative element address designate (read) coding, also called the modified read (mr) coding. the g4 standard uses the modified mr (mmr) coding, which, similarly to the g3 standard, has mh as its basic coder. the jbig compression standard recommended by the joint bi-level image experts group is a lossless compression standard used for binary images such as scanned text images, computer-generated text, fax transmissions, etc. this standard can work in three separate modes of operation: progressive, progressive-compatible sequential, and single-progression sequential. as for the coder, jbig uses arithmetic coding, exploiting the qm coder variant. context-based prediction is used in the encoding process.
in order to ensure a significantly higher compression ratio than the previously described compression standards, a modified lossy version of jbig has been proposed and named jbig2. although the jbig standard also supports lossy compression, the lossy compression quality it provides is very low. jbig2, on the other hand, provides both a higher compression ratio for lossless compression and lossy compression with a very high compression ratio. jbig2 supports three basic coding modes: generic, halftone, and text coding mode. the generic coding mode uses either the mmr or the mq variant of arithmetic coding. halftone coding is used for halftone images. the coding part is based on generic coding using a pattern, and produces a multi-level image as its output. the decoder obtains the halftone image using the multi-level image and the previously used pattern. finally, dictionary-based text coding is used for textual content. each representative textual symbol is first encoded using the generic coding and is stored in the dictionary together with its position. decoding is achieved in a straightforward way, using the dictionary. the difference between lossy and lossless text compression is in the pattern matching type. lossy compression uses hard pattern matching, and similar letters are coded with the same dictionary entry. soft pattern matching is used for lossless compression, where refinement coding is exploited in order to encode the necessary difference between the letter already stored in the dictionary and the current letter. different modes are used for different document regions. sometimes text regions are classified as generic regions in order to obtain better results. on average, jbig2 gives a 3-5 times higher compression ratio than the g4 compression standard and a 2-4 times higher compression ratio than the jbig standard.

3.2.
jpeg and jpeg 2000 image compression standards

jpeg and jpeg 2000 are well-known image compression standards, used here for comparison with the proposed methods and evaluation of their performance [61]. jpeg, which stands for joint photographic experts group, is a lossy image compression algorithm. the goal of the jpeg compression algorithm is to eliminate the high frequency color components of the image, which cannot be observed by the human eye. this way the original and compressed images usually look the same, but the compressed image is smaller in size. the jpeg image compression algorithm consists of a few steps:

1) image partitioning: the whole image is divided into blocks of size 8 × 8. the choice of the block size is an important part of this step. with blocks of bigger size, it may happen that there are blocks with big areas of similar color structure, but since blocks are observed as a whole, those similarities cannot be exploited to obtain better compression results. with smaller blocks, on the other hand, the processing would be much slower, because in the next step a discrete cosine transform (dct) needs to be performed on each block. for these reasons, a block size of 8 × 8 is taken as an appropriate choice.

2) dct: in this step, the dct is performed on each block matrix. the dct is similar to the discrete fourier transform (dft), since both transforms map a function to the frequency domain, except that the dct uses only the cosine function, without dealing with imaginary parts. after derivation of the dct term, it is noticeable that the values of the dct are half of the dft values:

$$f_m = 2 \sum_{j=0}^{N-1} f_j \cos\left(\frac{m\pi}{2}\left(j + \frac{1}{2}\right)\right) \tag{1}$$

where f_m is the dft coefficient.
the values of the dct can be defined as:

$$g_m = \sum_{j=0}^{N-1} f_j \cos\left(\frac{m\pi}{2}\left(j + \frac{1}{2}\right)\right) \tag{2}$$

if the dct values are represented as a matrix, the values with lower frequency are grouped in the upper left corner, while the higher frequency values, which are not visible to the human eye, are in the other parts of the matrix. the goal of the next step is to eliminate these high frequency values.

3) elimination of the high frequency values: for this task, it is necessary to multiply the dct matrix element-wise by an appropriate mask matrix. in case of 8 × 8 blocks, the mask matrix that eliminates the high frequency values has the following form:

[ 1 1 1 1 0 0 0 0
  1 1 1 0 0 0 0 0
  1 1 0 0 0 0 0 0
  1 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 ]

4) inverse dct: finally, the last step in this process is to apply the inverse dct to each obtained block and use the obtained values to form a new image of the same dimensions as the original image.

the jpeg 2000 image compression standard and its advantages over the jpeg compression standard are best described by comparing the two:

1) transformation type: while the previously described jpeg algorithm uses the 8 × 8 dct, jpeg 2000 uses the wavelet transform with lifting implementation. the wavelet transform provides better energy compaction and resolution scalability.

2) partitioning domain: while jpeg uses partitioning in the space domain, dividing the image into blocks and applying transformations to each block, jpeg 2000 performs partitioning in the wavelet domain. this eliminates the blocking artifacts which appear during partitioning in the space domain as a result of the block-wise application of the dct.
3) entropy coding: the jpeg algorithm encodes the dct coefficients one by one, while jpeg 2000 encodes the wavelet coefficients bitplane by bitplane. in case of jpeg, the resulting bitstream cannot be truncated, while in case of jpeg 2000 truncation is allowed, which enables bitstream scalability.

4) rate control: with the jpeg image compression algorithm, the compression ratio and the amount of distortion are determined by the quantization module. jpeg 2000 uses the quantization module only for conversion of the float wavelet transform coefficients to integer coefficients, and the bitstream assembly module is used to determine the compression ratio and the amount of distortion. this allows final bitstreams of a certain compression ratio to be easily converted to bitstreams of another compression ratio without repeating the entropy coding and transformation process.

3.3. run-length encoding (rle) algorithm

rle is a standard, widely used lossless data compression algorithm. the logic of this algorithm is to replace each run of a repeating pattern with a symbol which describes that pattern and a value which defines the number of consecutive repeats of that pattern in the given sequence. in the literature, the application of this algorithm to text compression is usually explained. the simplest example is the case where the pattern is a single character. suppose that the following sequence of characters is given:

ssaaacccasdaaeaaaaaaasdddadddddwwwa

in this case, the compressed text will be the following:

2s3a3c1a1s1d2a1e7a1s3d1a5d3w1a

the original text has 35 characters, while the compressed text has 30 characters. the gain in this concrete example is not significant, but in practice the algorithm can deal with a large amount of binary values and can be very efficient. the longer the runs of the same values in the sequence, the higher the compression ratio that will be achieved. in the general case, the pattern does not necessarily need to be a single character.
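The single-character case above can be reproduced with a short sketch. It is an illustration rather than the authors' implementation: the function names are hypothetical, and the decoder assumes that the encoded symbols are never digits, so that counts and symbols can be told apart without a delimiter.

```python
from itertools import groupby


def rle_encode(text):
    """Encode each run of identical characters as <run length><character>."""
    return "".join(f"{len(list(run))}{symbol}" for symbol, run in groupby(text))


def rle_decode(encoded):
    """Invert rle_encode; assumes the encoded symbols are never digits."""
    output = []
    count = ""
    for ch in encoded:
        if ch.isdigit():
            count += ch  # run lengths may span several digits
        else:
            output.append(ch * int(count))
            count = ""
    return "".join(output)


original = "ssaaacccasdaaeaaaaaaasdddadddddwwwa"
compressed = rle_encode(original)
restored = rle_decode(compressed)
```

Applied to the 35-character sequence from the text, the encoder yields the 30-character string given above, and decoding restores the original sequence.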
it could be a word or even a sentence. in such cases a delimiter must be placed between the compressed records of the individual pattern runs, because the length of each pattern is not known in advance and decompression would be impossible without this information. the pseudocode of the general rle algorithm is shown in the following listing.

run-length-encoding
input: sequence s.
output: array compressed.

    i := 0
    while i < length(s) do
        currentpattern := s[i]
        runlength := 0
        while i < length(s) and s[i] = currentpattern do
            runlength := runlength + 1
            i := i + 1
        end while
        compressed ← runlength
        compressed ← currentpattern
    end while
    return compressed

the sign ← represents the assignment operator for an array, i.e. the value is appended to the array.

3.4. scanline fill algorithm

the scanline fill algorithm belongs to the region filling algorithms. unlike flood-fill algorithms, which fill a contour by coloring connected pixels of the same color one pixel at a time, the scanline fill algorithm is defined at the geometric level and fills the contour in a horizontal or vertical direction, i.e. row by row or column by column. the scanline fill algorithm is illustrated in fig. 1.

fig. 1 illustration of the scanline fill algorithm

as shown in fig. 1, the algorithm starts from a seed pixel and searches to the left for the last pixel whose color is the same as the seed pixel color. once that pixel is found, the pixel row is processed from left to right until a pixel whose color differs from the seed pixel color, or the end of the scanline, is encountered. each pixel in the current row whose color is the same as the seed pixel color is changed to the fill color. in addition, the pixels above and below the current pixel are checked and pushed to the stack if their color is the same as the seed pixel color.
once a pixel above or below the current pixel has been pushed to the stack, the following pixels above or below are not considered until a new sequence of pixels with the seed pixel color is encountered. the next iteration starts when a new pixel is taken from the top of the stack. in effect, instead of pushing the coordinates of every individual pixel that needs to be processed, the algorithm pushes only the start coordinates of line segments. this ensures that each pixel is checked once and gives better time complexity than the flood fill algorithm, which is the main reason for using this algorithm for image decompression.

4. proposed algorithms

this section proposes new image compression and decompression methods for the pre-processing stage of the character segmentation system, used to compress and decompress document images before the segmentation stage. as already mentioned, the previously presented character segmentation approach is extended by adding an image compression/decompression part to the pre-processing stage. this step makes it possible to divide the character segmentation system into two independent parts. the first is the pre-processing part, which ends with document image compression and decompression, and the second is the document image segmentation. the most evident gain is that the two parts can be executed at different times. this is important for several reasons. document image compression and document image segmentation can be done on different machines. in addition, previously compressed/decompressed document images can be processed with different versions of the segmentation engine, which also allows efficient testing of that engine. finally, image compression allows efficient storage of document images, which can save significant disk space.
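for reference, the scanline fill procedure of section 3.4, on which the second proposed method relies for decompression, can be sketched in python as follows. this is only an illustrative sketch and not the authors' implementation: the function name `scanline_fill`, the flat row-major pixel list, and the restriction to 4-connected regions are assumptions made for the example.

```python
def scanline_fill(pixels, width, height, seed_x, seed_y, fill_color):
    """Fill the 4-connected region that contains the seed pixel.

    `pixels` is a flat list in row-major order (index = y * width + x);
    this layout is an assumption of the sketch. The list is changed in place.
    """
    target = pixels[seed_y * width + seed_x]
    if target == fill_color:
        return  # nothing to do, and it avoids an endless loop
    stack = [(seed_x, seed_y)]  # start coordinates of pending line segments
    while stack:
        x, y = stack.pop()
        row = y * width
        if pixels[row + x] != target:
            continue  # this segment was already filled from another seed
        # walk left to the first pixel of the run
        while x > 0 and pixels[row + x - 1] == target:
            x -= 1
        span_above = span_below = False
        # sweep right, filling the run and seeding the neighbouring rows
        while x < width and pixels[row + x] == target:
            pixels[row + x] = fill_color
            if y > 0:
                above = pixels[row - width + x] == target
                if above and not span_above:
                    stack.append((x, y - 1))  # one seed per new segment above
                span_above = above
            if y < height - 1:
                below = pixels[row + width + x] == target
                if below and not span_below:
                    stack.append((x, y + 1))  # one seed per new segment below
                span_below = below
            x += 1
```

only one seed is pushed per run of matching pixels in the row above or below, which corresponds to pushing the start coordinates of line segments rather than individual pixels.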
in this section two image compression and decompression methods are proposed. the first method is based entirely on the rle algorithm and can be used for both machine-typed and machine-printed documents. the second method uses document character contour extraction in combination with the scanline fill algorithm. it works with machine-printed documents, but its application to machine-typed documents is limited by the irregular structure of the characters caused by the low quality of such documents. both methods are presented in the following subsections.

4.1. image compression and decompression using rle

the first proposed method, used in the pre-processing stage of the character segmentation system, employs the rle algorithm for data compression. after binarization the image consists only of black and white pixels, which makes rle an excellent choice. the approach is general and can be used for all types of binarized images, but rle gives better compression results for document images than, for example, for natural images. the reason is that rle searches for runs of black and white pixels in each pixel scanline. in natural images these runs are short, while in document images the background consists of long runs of white pixels. the rle algorithm, including the coding format, is illustrated in fig. 2.

fig. 2 document image compression using rle algorithm: (a) compression format for white pixel runs, (b) compression format for black pixel runs, (c) example of pixel scanline, (d) compressed pixel scanline.

fig. 2 shows an example of image compression using the rle algorithm. the algorithm counts white and black pixels and stores the information about the pixel runs in the compressed file.
in total, 3 bytes are used to store the information about a white pixel run and 2 bytes for a black pixel run. in both cases the pixels are counted until the maximal value is reached. since 2 bytes are used for white pixels (white_run), this value is 2^16 = 65536, while for black pixels (black_run), which use 1 byte, it is 2^8 = 256. when this value is reached, the white_loops or black_loops byte is incremented and the white_run or black_run value is reset to 0. the counting then continues. since white pixels belong to the background and dominate in document images, 1 byte is not expected to be enough to store the number of consecutive white pixels. black pixel runs, on the other hand, are not expected to be long, since they represent document characters separated by spaces, so only 1 byte is used for this information. document image decompression is straightforward. the first byte is multiplied by 256 for black pixels or by 65536 for white pixels, and the value of the next byte (black pixels) or of the next 2 bytes (white pixels) is added. the result is the number of consecutive pixels of the same color in the current run. this process is repeated until the end of the compressed file.

4.2. image compression and decompression using document character contour extraction and scanline fill algorithm

the second proposed method for image compression and decompression combines an algorithm based on document character contour extraction with the scanline fill algorithm. the compression algorithm processes the document image in two dimensions, since the character contours need to be obtained from all four sides.
the document image is processed in the horizontal and vertical directions, and the distances between the black pixels that start and end the black runs are stored in the compressed file. for this purpose, 2 bytes can be used to store the distance between two black pixels. it should be mentioned that in both compression methods the number of bytes needed to store the white and black pixel runs depends primarily on the image dimensions. small images are expected to have short runs, while large images are expected to have long runs of pixels of the same color. therefore, 1 byte can be used for both white and black pixels in small images, while large images require 2 bytes. another important factor is the structure of the document image. if textual content dominates, the background areas are not large and 1 byte can suffice even in large images. if, on the other hand, the background dominates, 2 bytes would not be enough even in medium-sized images. the process of image decompression is more specific. after the offsets of the black pixels that form the character contours are obtained, the first step of decompression draws the contours to the output image. the second step uses the iterative scanline fill algorithm. the main idea is to scan the whole output image and fill the region that represents the background with a distinct background color, while the character contours are filled with black. this is achieved by executing the scanline fill algorithm repeatedly. after that, the background color is replaced with white and the binarized document image is obtained. the image decompression is illustrated in fig. 3.
it is assumed that the first pixel of a document image is a background pixel, so the scanline fill algorithm starts from the first pixel. the next step is to fill the closed contours, which are the contours of the document characters. to fill these contours, the algorithm searches for the next white pixel. when it is found, the colors of the previous two or three pixels are checked. it is assumed here that a contour edge is not wider than two pixels, although in general this need not be the case. if the previous two pixels are black and background-colored, or the previous three pixels are black, black, and background-colored, a character contour has been found and is filled with black using the scanline fill algorithm. this procedure is repeated until the end of the document image. the resulting image contains only the background color and black. the final step is to change the background color back to white, which yields the binarized document image.

fig. 3 document image decompression using scanline fill algorithm: (a) original image, (b) resulting image after character contours extraction, (c) resulting image after background filling, (d) resulting image after character contours filling, (e) final binarized image.

5. implementation

this section provides pseudocode for the proposed image compression and decompression methods. to achieve an efficient implementation, a linear image representation can be used, in which the image pixels are stored linearly in a one-dimensional array. this representation is efficient because memory is also organized linearly, so the image pixels are stored in successive memory locations, which provides the fastest possible access to the image elements.
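the linear image representation can be illustrated with a short python sketch. row-major order is assumed here, since the text does not state the storage order, and the helper names `to_linear`, `pixel_index`, `row_of`, and `column_of` are hypothetical.

```python
def to_linear(rows):
    """Flatten a list of equal-length pixel rows into one row-major list."""
    height = len(rows)
    width = len(rows[0]) if height else 0
    flat = [pixel for row in rows for pixel in row]
    return flat, width, height


def pixel_index(x, y, width):
    """Position of pixel (x, y) inside the flat row-major array."""
    return y * width + x


def row_of(flat, y, width):
    """Pixels of scanline y: one contiguous slice of the array."""
    return flat[y * width:(y + 1) * width]


def column_of(flat, x, width):
    """Pixels of column x: every width-th element starting at x."""
    return flat[x::width]


image = [[0, 0, 1],
         [1, 0, 0]]
flat, width, height = to_linear(image)
```

with this layout a scanline is a contiguous slice, so horizontal processing touches successive memory locations, while vertical processing steps through the array with a stride equal to the image width.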
the first presented method for image compression and decompression uses the rle data compression algorithm. the pseudocode of the image compression algorithm is shown below.

rle-image-compression
input: image f.
output: array compressed.

    currentcolor := white
    maxrun := 65536
    while not end of image f do
        runscount := calculate-run-length(f, currentcolor)
        factor := runscount div maxrun
        runlength := runscount mod maxrun
        compressed ← factor
        if currentcolor = black then
            compressed ← runlength
        else
            compressed ← runlength div 256
            compressed ← runlength mod 256
        endif
        replace-max-run(maxrun)
        replace-current-color(currentcolor)
    end while
    return compressed

this pseudocode represents a general application of the rle algorithm to the compression of binarized images. the algorithm provides lossless image compression and is not limited to document images but, as will be shown in the experimental section, it gives better compression results on document images. in order to provide optimal compression, 2 bytes are used to store the information about each black pixel run and 3 bytes are used for each white pixel run. the image decompression method is straightforward. the following listing shows the pseudocode of the decompression algorithm.

rle-image-decompression
input: image f, array compressed.

    currentcolor := white
    current := 0
    while current < length(compressed) do
        factor := compressed[current]
        if currentcolor = black then
            runlength := factor * 256 + compressed[current + 1]
            current := current + 2
        else
            runlength := factor * 65536 + compressed[current + 1] * 256 + compressed[current + 2]
            current := current + 3
        endif
        fill-run(f, runlength, currentcolor)
        replace-current-color(currentcolor)
    end while

the image decompression algorithm reads the pixel run information from the compressed file and regenerates the original image.
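the two listings above can be turned into a small runnable python sketch. it is an illustration under stated assumptions rather than the authors' implementation: the image is a flat list of 0 (white) and 1 (black) values, runs are counted over the whole flat array, multi-byte values are stored most significant byte first, and a run too long for one record is split by inserting a zero-length run of the other color (the text does not say how such runs are handled). the names `rle_compress` and `rle_decompress` are hypothetical.

```python
WHITE, BLACK = 0, 1
# each record is one "loops" byte followed by 2 (white) or 1 (black) run bytes
RUN_BYTES = {WHITE: 2, BLACK: 1}


def rle_compress(pixels):
    """Encode a flat sequence of WHITE/BLACK pixels; the first run is white.

    The input must contain only WHITE and BLACK values.
    """
    out = bytearray()
    color, i, n = WHITE, 0, len(pixels)
    while i < n:
        start = i
        while i < n and pixels[i] == color:
            i += 1
        run = i - start                    # may be 0, e.g. image starts black
        max_run = 256 ** RUN_BYTES[color]  # 65536 for white, 256 for black
        limit = 256 * max_run - 1          # largest run one record can hold
        if run > limit:                    # assumption: split over-long runs
            i = start + limit
            run = limit
        out.append(run // max_run)         # white_loops / black_loops byte
        out += (run % max_run).to_bytes(RUN_BYTES[color], "big")
        color = BLACK if color == WHITE else WHITE
    return bytes(out)


def rle_decompress(data):
    """Rebuild the flat pixel list from the records written by rle_compress."""
    pixels = []
    color, pos = WHITE, 0
    while pos < len(data):
        width = RUN_BYTES[color]
        loops = data[pos]
        run = int.from_bytes(data[pos + 1:pos + 1 + width], "big")
        pixels.extend([color] * (loops * 256 ** width + run))
        pos += 1 + width
        color = BLACK if color == WHITE else WHITE
    return pixels


scanline = [WHITE] * 300 + [BLACK] * 5 + [WHITE] * 70000 + [BLACK] * 2
compressed = rle_compress(scanline)
restored = rle_decompress(compressed)
```

in this example the run of 70000 white pixels is stored as white_loops = 1 and white_run = 4464, since 1 · 65536 + 4464 = 70000, and the 70307 input pixels are encoded in 10 bytes.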
since the compression is lossless, the decompressed image is identical to the original image, but the results in the experimental section will show that image compression quality is actually of little importance for segmentation accuracy. the decompression pseudocode in the previous listing corresponds to the compression pseudocode in the first listing with regard to the number of bytes used to represent the pixel runs. the second proposed method for document image compression and decompression uses the character contour based document image representation and the scanline fill algorithm. the contour image representation is used for image compression, while image decompression uses the iterative scanline fill algorithm. the image compression algorithm processes the image in two dimensions to obtain the distances between the black pixels that form the character contours in a document image. depending on the document image dimensions, 1 byte or 2 bytes can be used to store the distance between two black pixels that represent edges of document character contours. the following listing shows the pseudocode of the document image compression algorithm based on character contour extraction.

contour-image-compression
input: image f.
output: array compressed.

    for each pixel row in image f do
        currentcolor := white
        while not end of current pixel row do
            runscount := calculate-run-length(f, currentcolor)
            compressed ← runscount div 256
            compressed ← runscount mod 256
            replace-current-color(currentcolor)
        end while
    end for
    for each pixel column in image f do
        currentcolor := white
        while not end of current pixel column do
            runscount := calculate-run-length(f, currentcolor)
            compressed ← runscount div 256
            compressed ← runscount mod 256
            replace-current-color(currentcolor)
        end while
    end for
    return compressed

the previous pseudocode describes the image compression algorithm that uses 2d image analysis to obtain the pixels that represent the edges of the document character contours. in general, this algorithm can be used for all binarized images, but it is more effective on document images. the reason is that binarized document images have fewer details than, say, natural images. document images also have large areas of background color, which ensures a good compression ratio. with regard to compression quality, this algorithm gives worse results than the other methods, but this does not noticeably affect the segmentation accuracy, which justifies its use in the character segmentation system. the following listing shows the pseudocode of document image decompression based on the scanline fill algorithm.

scanline-fill-image-decompression
input: image f, array compressed.

    offset := 0
    while not end of row compression bytes do
        runlength := compressed[current] * 256
        runlength := runlength + compressed[current + 1]
        offset := offset + runlength
        set-pixel-color(f, offset, black)
    end while
    offset := 0
    while not end of column compression bytes do
        runlength := compressed[current] * 256
        runlength := runlength + compressed[current + 1]
        offset := offset + runlength
        set-pixel-color(f, offset, black)
    end while
    scanline-fill(0, blue, white)
    for each pixel in f do
        if get-pixel-color(f, current) = white then
            if get-pixel-color(f, current - 1, current - 2) = [black, blue] or get-pixel-color(f, current - 1, current - 2, current - 3) = [black, black, blue] then
                scanline-fill(current, black, white)
            endif
        endif
    end for
    scanline-fill(0, white, blue)

as mentioned above, the pseudocode for image decompression using the scanline fill algorithm can be applied to machine-printed documents. the efficiency of the decompression algorithm depends strongly on the efficiency of the scanline fill implementation, since the scanline fill algorithm must be executed multiple times. in the pseudocode in the previous listing, the two and three pixels preceding the current pixel are checked. in the general case this conditional statement can be changed, since character contour edges may be wider than the 1 or 2 pixels assumed here. with the linear image representation suggested at the start of this section, a real-time implementation of the proposed methods can be achieved, which makes them suitable for use in a real-time character segmentation system.

6. experiments

the proposed image compression and decompression methods, as a part of the character segmentation system, are tested on several pc machines.
the results are analyzed from different aspects in order to give a complete insight into the extended character segmentation approach and its capabilities. the image compression and decompression methods are evaluated with respect to image compression ratio, time complexity, and the segmentation accuracy obtained when a specific compression method is used. the image compression ratio is evaluated on a standard test set of images. the test set consists of six black and white images: baboon, barbara, cameraman, goldhill, lena, and peppers. each pixel intensity value is represented using 3 bytes, one byte each for the red, green, and blue component. to obtain comparative results, the jbig2 and jpeg2000 image compression standards are used, and in both cases the performance of the lossless mode is evaluated. the most important metric for evaluating the compression methods is the compression ratio. for images, the compression ratio is the ratio between the original image file size and the compressed image file size. a higher compression ratio means that the compression method is better with respect to this metric. the compression ratios for the standard set of images and the different compression methods are compared in table 1.

table 1 comparison of the compression ratio results for different compression methods

| image (image dimensions) | image file size (kb) | compression ratio, jpeg2000 (lossless) | compression ratio, jbig2 (lossless) | compression ratio, rle |
| --- | --- | --- | --- | --- |
| baboon (512x512) | 769 | 3.544 | 46.048 | 8.010 |
| barbara (512x512) | 769 | 5.961 | 113.255 | 9.859 |
| cameraman (256x256) | 193 | 3.642 | 82.833 | 12.867 |
| goldhill (512x512) | 769 | 6.303 | 111.288 | 19.718 |
| lena (512x512) | 769 | 7.539 | 156.939 | 21.971 |
| peppers (512x512) | 769 | 8.640 | 201.837 | 30.76 |

as table 1 shows, jbig2 compression gives much better compression ratio results than the proposed algorithm.
this comes from the sophisticated nature of the jbig2 algorithm, which is specialized for black and white images. compression ratio and time complexity are the two most important measures of the quality of a compression algorithm. although the proposed algorithms do not surpass the jbig2 compression ratio, the time complexity results shown later justify their use. additionally, table 1 shows that the rle based image compression method presented in this paper provides a higher compression ratio than the jpeg2000 image compression standard in its lossless mode. the rle based method is not limited to document images and can therefore also be used for non-document images, as is the case with the standard test images. the method makes it possible to store a large number of compressed document images efficiently, without occupying much disk space. the second presented compression method is limited to specific document images, and its performance is analyzed later in this section. the previous results represent a general analysis of the image compression methods. since the proposed image compression and decompression methods will be used in the character segmentation system, their performance on document images should also be analyzed. for this analysis, the image compression methods are tested on two document images. these are machine-printed documents, since the second method is limited to machine-printed documents, which have a regular structure from which character contours can be extracted correctly. compression ratio results for these document images and the different image compression methods are shown in table 2.
table 2 comparison of the image compression ratio for machine-printed document images for different image compression methods

| image dimensions | image file size (kb) | compression ratio, jpeg2000 (lossless) | compression ratio, jbig2 (lossless) | compression ratio, rle | compression ratio, contour extraction/scanline fill |
| --- | --- | --- | --- | --- | --- |
| 719x328 | 692 | 14.417 | 640.741 | 67.184 | 46.443 |
| 1266x924 | 3429 | 9.741 | 357.933 | 27.878 | 19.373 |

as expected, jbig2 compression again achieves a much better compression ratio. in this case the compression ratio is much higher than for the standard test images because jbig2 has a separate mode for compressing textual content, as explained in section 3. the proposed methods perform very well on document images compared with the jpeg2000 standard. overall, the rle based method gives the second best results, while the contour extraction method gives worse but still competitive results. it is important to mention that the jpeg2000 image compression standard performs worse than the rle algorithm because of its nature. the color transformation stage of the jpeg2000 algorithm generates two color channels which are compressed in this stage. the channel related to black and white image features is not compressed in the same way, so jpeg2000 gives worse results for black and white images and better compression ratios for full-color images with many color and contrast transitions. contour extraction based compression in combination with scanline fill decompression is lossy, since it cannot reconstruct the original image perfectly. although this could suggest that the method does not perform well enough, further analysis shows otherwise. to justify the use of the second proposed image compression method, the segmentation results for document images previously compressed and decompressed with the different methods are given in table 3.
table 3 comparison of the segmentation accuracy results for different image compression methods used in the pre-processing stage

| segmentation accuracy (%) | jbig2/jpeg2000 (lossless) | rle | contour extraction/scanline fill |
| --- | --- | --- | --- |
| line segmentation | 81.54 | 81.54 | 80.32 |
| word segmentation | 78.28 | 78.28 | 78.14 |
| character segmentation | 87.08 | 87.08 | 86.92 |

these results are obtained using the chosen ground-truth machine-printed documents. as expected, the results for jbig2, jpeg2000, and rle based compression are identical, since all three are lossless. the most important conclusion is that contour extraction based compression in combination with scanline fill decompression gives only slightly worse results than the other compression methods. the reason lies in the sensitivity of the evaluation metrics and in the specifics of the character segmentation technique. in general, this technique is not sensitive to small changes in the document image structure, so the segmentation accuracy is similar to that obtained with the lossless compression methods. fig. 4 compares the original binarized images with the images obtained after compression and decompression using the second proposed method.

fig. 4 comparison of the original images and images obtained after compression/decompression using the second method: (a) first original image, (b) first image after compression/decompression, (c) second original image, (d) second image after compression/decompression

the visual results in fig. 4 show that compression quality is sometimes irrelevant. although the second method is lossy and the worst of the analyzed methods with regard to compression quality, the differences between the original and final images are negligible for character segmentation. to demonstrate this, fig. 5 shows the images from fig. 4 after processing with the character segmentation algorithm.

fig. 5 comparison of the original and compressed images after being processed using the character segmentation algorithm: (a) first original processed image, (b) first image processed after compression/decompression, (c) second processed original image, (d) second image processed after compression/decompression

finally, a very important aspect of the image compression methods is time complexity. the methods are intended for use in a real-time character segmentation system, and their processing time should be appropriate for that. to provide reliable results, the proposed image compression and decompression methods are tested on several pc machines. the processing times of jbig2 compression and of compression/decompression performed with the proposed methods are compared in table 4.

table 4 comparison of the processing time (ms) for jbig2 compression and rle based compression and decompression method (amd athlon™ x4 840 quad core processor 3.1 ghz)

| image dimensions (pixels) | white pixels/black pixels (%) | jbig2 compression, best | jbig2 compression, average | rle compression, best | rle compression, average | rle decompression, best | rle decompression, average |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 719x328 | 93.09:6.91 | 9.36832 | 10.12405 | 0.26639 | 0.30149 | 0.23231 | 0.27278 |
| 1266x924 | 83.51:16.49 | 44.64481 | 49.82263 | 2.04907 | 2.59600 | 1.94450 | 3.05122 |
| 2632x3575 | 98.06:1.94 | 353.74121 | 382.56647 | 10.61898 | 15.07239 | 13.35769 | 21.82402 |
| 2640x3612 | 98.69:1.31 | 372.48256 | 405.33698 | 10.79404 | 16.99354 | 13.31302 | 21.95595 |

the time complexity comparison justifies the use of the proposed algorithms in the authors' character segmentation system. the numerical results in table 4 are obtained after 10000 executions of the implementations of the analyzed algorithms. table 4 gives the best and the average processing time. the rle based compression algorithm provides up to 25 times faster compression than the jbig2 compression standard.
the reason for this lies in the simplicity of the proposed compression method, as well as in the complexity of jbig2 compression. this advantage of the proposed compression method comes to the fore with the large number of documents that need to be processed in the “nikola tesla museum” in belgrade. to provide reliable time complexity results, processing times are given for a set of different pc machines. these results are shown in tables 5-14.

table 5 processing time (ms) for rle based compression and decompression method (amd athlon™ x4 840 quad core processor 3.1 ghz)

| image dimensions (pixels) | white pixels/black pixels (%) | rle compression, best | rle compression, average | rle decompression, best | rle decompression, average |
| --- | --- | --- | --- | --- | --- |
| 719x328 | 93.09:6.91 | 0.26639 | 0.30149 | 0.23231 | 0.27278 |
| 1266x924 | 83.51:16.49 | 2.04907 | 2.59600 | 1.94450 | 3.05122 |
| 2632x3575 | 98.06:1.94 | 10.61898 | 15.07239 | 13.35769 | 21.82402 |
| 2640x3612 | 98.69:1.31 | 10.79404 | 16.99354 | 13.31302 | 21.95595 |

table 6 processing time (ms) for rle based compression and decompression method (amd athlon™ 64 x2 dual core processor 5200+ 2.6 ghz)

| image dimensions (pixels) | white pixels/black pixels (%) | rle compression, best | rle compression, average | rle decompression, best | rle decompression, average |
| --- | --- | --- | --- | --- | --- |
| 719x328 | 93.09:6.91 | 0.59617 | 0.64098 | 0.45145 | 0.48502 |
| 1266x924 | 83.51:16.49 | 4.54918 | 4.98643 | 3.32389 | 3.93771 |
| 2632x3575 | 98.06:1.94 | 18.97615 | 22.70397 | 23.82398 | 27.92950 |
| 2640x3612 | 98.69:1.31 | 18.41798 | 35.85595 | 24.20699 | 42.99031 |

table 7 processing time (ms) for rle based compression and decompression method (intel® core™ i3-4150 cpu @ 3.50ghz)

| image dimensions (pixels) | white pixels/black pixels (%) | rle compression, best | rle compression, average | rle decompression, best | rle decompression, average |
| --- | --- | --- | --- | --- | --- |
| 719x328 | 93.09:6.91 | 0.22140 | 0.23022 | 0.14838 | 0.15094 |
| 1266x924 | 83.51:16.49 | 1.34219 | 1.39174 | 0.86508 | 0.90333 |
| 2632x3575 | 98.06:1.94 | 7.57190 | 7.81398 | 5.38810 | 5.69849 |
| 2640x3612 | 98.69:1.31 | 7.36194 | 7.66240 | 5.28898 | 5.67878 |

table 8 processing time (ms) for rle based compression and decompression method (intel® core™ i5-750 cpu @ 2.67ghz)

| image dimensions (pixels) | white pixels/black pixels (%) | rle compression, best | rle compression, average | rle decompression, best | rle decompression, average |
| --- | --- | --- | --- | --- | --- |
| 719x328 | 93.09:6.91 | 0.44596 | 0.45164 | 0.53324 | 0.53898 |
| 1266x924 | 83.51:16.49 | 2.92499 | 2.99345 | 2.82125 | 2.86606 |
| 2632x3575 | 98.06:1.94 | 15.47632 | 16.07258 | 21.01470 | 22.01499 |
| 2640x3612 | 98.69:1.31 | 15.18539 | 15.83180 | 21.34199 | 21.78855 |

table 9 processing time (ms) for rle based compression and decompression method (intel® core™ i7-920 cpu @ 2.67ghz)

| image dimensions (pixels) | white pixels/black pixels (%) | rle compression, best | rle compression, average | rle decompression, best | rle decompression, average |
| --- | --- | --- | --- | --- | --- |
| 719x328 | 93.09:6.91 | 0.42999 | 0.44189 | 0.31279 | 0.31529 |
| 1266x924 | 83.51:16.49 | 2.74480 | 2.84551 | 1.94092 | 1.95931 |
| 2632x3575 | 98.06:1.94 | 15.46358 | 15.60705 | 11.13561 | 11.33740 |
| 2640x3612 | 98.69:1.31 | 15.24224 | 15.38186 | 10.98920 | 11.21025 |

table 10 processing time (ms) for rle based compression and decompression method (intel® core™ i7-4700 mq processor 2.4 ghz)

| image dimensions (pixels) | white pixels/black pixels (%) | rle compression, best | rle compression, average | rle decompression, best | rle decompression, average |
| --- | --- | --- | --- | --- | --- |
| 719x328 | 93.09:6.91 | 0.32331 | 0.34429 | 0.21639 | 0.22497 |
| 1266x924 | 83.51:16.49 | 1.93728 | 2.05009 | 1.24961 | 1.31144 |
| 2632x3575 | 98.06:1.94 | 11.24646 | 11.64243 | 7.86414 | 8.18985 |
| 2640x3612 | 98.69:1.31 | 10.96036 | 11.31501 | 7.70805 | 8.34529 |

table 11 processing time (ms) for rle based compression and decompression method (intel® core™2 quad q9550 cpu @ 2.83ghz)

| image dimensions (pixels) | white pixels/black pixels (%) | rle compression, best | rle compression, average | rle decompression, best | rle decompression, average |
| --- | --- | --- | --- | --- | --- |
| 719x328 | 93.09:6.91 | 0.41861 | 0.42261 | 0.29742 | 0.29985 |
| 1266x924 | 83.51:16.49 | 2.61796 | 2.66822 | 1.86799 | 1.96971 |
| 2632x3575 | 98.06:1.94 | 14.93945 | 15.69766 | 13.77433 | 17.14845 |
| 2640x3612 | 98.69:1.31 | 14.78641 | 15.59181 | 13.69504 | 17.03617 |

table 12 processing time (ms) for rle based compression and decompression method (intel® core™ i5-3470 cpu @ 3.20ghz)

| image dimensions (pixels) | white pixels/black pixels (%) | rle compression, best | rle compression, average | rle decompression, best | rle decompression, average |
| --- | --- | --- | --- | --- | --- |
| 719x328 | 93.09:6.91 | 0.24246 | 0.25897 | 0.17094 | 0.18088 |
| 1266x924 | 83.51:16.49 | 1.42943 | 1.54486 | 1.00416 | 1.06919 |
| 2632x3575 | 98.06:1.94 | 9.77028 | 10.26213 | 6.09392 | 6.58797 |
| 2640x3612 | 98.69:1.31 | 9.61185 | 10.10193 | 5.96242 | 6.46359 |

table 13 processing time (ms) for rle based compression and decompression method (intel® celeron® e3400 cpu @ 2.60ghz)

| image dimensions (pixels) | white pixels/black pixels (%) | rle compression, best | rle compression, average | rle decompression, best | rle decompression, average |
| --- | --- | --- | --- | --- | --- |
| 719x328 | 93.09:6.91 | 0.45608 | 0.58744 | 0.34018 | 0.63256 |
| 1266x924 | 83.51:16.49 | 3.00126 | 3.56548 | 2.50430 | 3.85496 |
| 2632x3575 | 98.06:1.94 | 16.44169 | 20.02098 | 20.42599 | 27.38618 |
| 2640x3612 | 98.69:1.31 | 16.07075 | 19.90696 | 20.32023 | 27.60395 |

table 14 processing time (ms) for contour extraction based compression method and scanline fill decompression method (document image size of 719x328 pixels)

| pc machine specification | contour compression, best | contour compression, average | scanline fill decompression, best | scanline fill decompression, average |
| --- | --- | --- | --- | --- |
| amd athlon™ x4 840 quad core processor 3.1 ghz | 0.26672 | 0.29824 | 1.59406 | 2.06575 |
| amd athlon™ 64 x2 dual core processor 5200+ 2.6 ghz | 1.10936 | 1.24688 | 2.94814 | 6.71124 |
| intel® core™ i3-4150 cpu @ 3.50ghz | 0.50409 | 0.52009 | 1.16008 | 1.18575 |
| intel® core™ i5-750 cpu @ 2.67ghz | 1.05348 | 1.06509 | 1.94235 | 1.95962 |
| intel® core™ i7-920 cpu @ 2.67ghz | 1.13896 | 1.22269 | 1.93016 | 1.97598 |
| intel® core™ i7-4700 mq processor 2.4 ghz | 0.74069 | 0.75961 | 1.69607 | 1.72861 |
| intel® core™2 quad q9550 cpu @ 2.83ghz | 0.89366 | 0.90396 | 1.85968 | 1.87623 |
| intel® core™ i5-3470 cpu @ 3.20ghz | 0.53816 | 0.57638 | 1.19594 | 1.27253 |
| intel® celeron® e3400 cpu @ 2.60ghz | 0.98342 | 1.82706 | 2.36290 | 3.41929 |

the time complexity results shown in tables 5-14 are also obtained after 10000 executions of the implementations of the analyzed algorithms. the tables give the best and the average processing time. the results demonstrate the benefit of the linear image representation used in the character segmentation system. all processing times of the proposed methods are below 50 ms, which is a good result for real-time usage. the first method also achieves very good results for document images with a larger number of black pixels, which is characteristic of documents with predominantly textual content. the processing time of the second method is primarily affected by the number of closed contours, which represent character borders and need to be filled using the scanline fill algorithm. each time a closed contour needs to be filled, the scanline fill algorithm for region filling has to be executed. since the document image used to obtain the results in table 14 has large areas of the same color and a small number of characters, the second method proved to be very efficient.

7. conclusions

this paper presents an image compression/decompression stage of the authors' existing character segmentation approach. section 2 describes other image compression methods and the authors' character segmentation approach. section 3 provides a theoretical background for bi-level image compression standards and a theoretical comparison of the jpeg and jpeg 2000 image compression standards used for evaluation of the proposed methods, as well as a description of the rle data compression algorithm and the scanline fill algorithm exploited by the proposed image compression and decompression algorithms. in section 4 the image compression and decompression methods are presented.
The presented image compression and decompression methods are adapted to document images and use the RLE data compression algorithm and document character contour extraction for image compression, and the scanline fill algorithm for document image decompression. Decoupling the image compression/decompression stage makes it possible to prepare document images for later processing, to keep the document images in compressed form and save storage memory, to use the improved character segmentation and recognition software independently of the pre-processing stage, and to test the character segmentation and recognition stages independently. In Section 5, pseudocode and suggestions for optimal implementation of the proposed image compression and decompression algorithms are given. In Section 6 a large set of experimental results is provided for the image compression methods. The proposed image compression algorithms perform up to 25 times faster than JBIG2 image compression and achieve a compression ratio 4 times better than the JPEG 2000 image compression standard in its lossless mode, whereas their compression ratio is much worse than that of the JBIG2 compression standard. The results showed that image compression quality does not affect the segmentation accuracy. The proposed image compression and decompression methods also proved to be suitable for a real-time character segmentation system. The official evaluation of the complete system performance will be performed at the "Nikola Tesla Museum", since the approach is designed for the needs of the museum, namely for conversion of the original Nikola Tesla documents to electronic form. Future work will focus on improving the algorithms, particularly the segmentation logic, on further optimization and automation of the approach, and on its integration into the complete OCR system.
Acknowledgments: This paper is supported by the Ministry of Education, Science and Technological Development of the Republic of Serbia (Project III44006-10), Mathematical Institute of Serbian Academy of Science and Arts (SANU), Museum of Nikola Tesla (providing original typewritten documents of Nikola Tesla), and Pattern Recognition & Image Analysis Research Lab (PRImA) (providing ground-truth historical machine-printed documents).

FACTA UNIVERSITATIS
Series: Electronics and Energetics Vol. 32, No 3, September 2019, pp. 331-344
https://doi.org/10.2298/fuee1903331s
© 2019 by University of Niš, Serbia | Creative Commons License: CC BY-NC-ND

INFLUENCE OF CIRCUIT BREAKER REPLACEMENT ON POWER STATION RELIABILITY*

Dragan Stevanović 1, Aleksandar Janjić 2
1 Electric Power Industry of Serbia, Zaječar, Serbia
2 Faculty of Electronic Engineering, Niš, Serbia

Abstract. In this paper, a new methodology for determining circuit breaker (CB) replacement time is proposed. The methodology is based on statistical analysis of condition-monitoring data and on the impact on substation reliability. The influence of CB removal on substation reliability is presented together with the cost justification of such an investment. Using statistical data on 427 CBs gathered over the past 10 years, the Weibull probability distribution of contact resistance is determined for breakers on both overhead and underground feeders at voltage levels of 35 kV and 10 kV. Substation reliability is calculated using the minimal path and minimal cuts method. With this methodology, the influence of CB condition on substation reliability can be observed using real field data. An example calculation is shown for a 35/10 kV substation.
Substation reliability is calculated for 5 different scenarios of CB removal, together with their expenses. Finally, discounted investment costs for each action over a period of 5 years are calculated and tabulated. For this substation, the final results show that the best scenario is replacing the CBs on the power transformers.

Key words: circuit breaker, cost evaluation, substation, reliability, Weibull distribution.

Received January 31, 2019; received in revised form April 25, 2019
Corresponding author: Dragan Stevanović, Electric Power Industry of Serbia, 19000 Zaječar, Serbia (E-mail: dragan.stevanovic3@epsdistribucija.rs)
* An earlier version of this paper was presented at the 4th Virtual International Conference on Science, Technology and Management in Energy, eNergetics 2018, October 25-26, Niš, Serbia [1]

1. INTRODUCTION

A circuit breaker is a device used for switching the feeder power supply in any working mode (normal load, no load, short-circuit current, ...), and it is therefore a vital element of power system operation. A CB failure threatens the operation of other equipment, which directly affects the reliability of the whole substation. This is a good reason to look for a correlation between CB condition and substation reliability. To determine the economic effects of maintenance, overhaul or CB removal [2], [3], the remaining useful life (RUL) of the circuit breakers must be assessed [4], [5].

Remaining useful life is the time from the present moment to the moment at which the device fails [2]. It is a random variable which depends on various factors (device age, working conditions, and level of maintenance) [6]. If the failure time of the population follows the probability density function (pdf) f(t), then the population mean time to failure (MTTF) can be calculated by (1):

$$MTTF = \int_0^{\infty} t\, f(t)\, dt = \int_0^{\infty} R(t)\, dt \qquad (1)$$

where R(t) is the survival function at t.
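The two forms of the MTTF in (1) can be checked numerically. The sketch below is only an illustration under stated assumptions: it assumes a two-parameter Weibull lifetime (the distribution used later in the paper), takes η = 38.0 and β = 5.3 from the "all feeders" row of Table 1 purely as example inputs, and uses hypothetical helper names (`weibull_survival`, `mttf_from_survival`). It integrates R(t) with the trapezoidal rule and compares the result with the known closed form η·Γ(1 + 1/β) for the Weibull mean.

```python
import math


def weibull_survival(t, eta, beta):
    """Survival function R(t) = exp(-(t/eta)^beta) of a two-parameter Weibull."""
    return math.exp(-((t / eta) ** beta))


def mttf_from_survival(survival, upper, steps=100_000):
    """Approximate the integral of R(t) over [0, upper] with the trapezoidal rule.

    `upper` must be large enough that R(upper) is negligible.
    """
    h = upper / steps
    total = 0.5 * (survival(0.0) + survival(upper))
    for k in range(1, steps):
        total += survival(k * h)
    return total * h


# Assumed example parameters (years); illustration only.
eta, beta = 38.0, 5.3

closed_form = eta * math.gamma(1.0 + 1.0 / beta)  # analytic Weibull mean
numeric = mttf_from_survival(lambda t: weibull_survival(t, eta, beta), upper=200.0)

print(f"closed form: {closed_form:.4f} years, numeric: {numeric:.4f} years")
```

Integrating the survival function is usually the more convenient form in practice, because R(t) can be estimated directly from failure records without differentiating to obtain f(t).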
Let $X_t$ be the random variable of the RUL at time t. The probability density function (pdf) of $X_t$ conditional on $Y_t$ is then denoted $f(x_t \mid Y_t)$, where $Y_t$ is the history of operational information up to t. If $Y_t$ is not available, the estimate of $f(x_t \mid Y_t)$ is:

$$f(x_t \mid Y_t) = f(x_t) = \frac{f(t + x_t)}{R(t)} \qquad (2)$$

where $f(t + x_t)$ is the pdf of the life at $t + x_t$.

CB reliability analysis depends on the type of available data, which can be: contact resistance, commutation noise, erosion resistance [7], ultrasound detectors, transient earth voltage, infrared thermo scanning [8], CB control circuit data [9] and collected data on CB faults [10]. Depending on the type of collected data, RUL can be assessed with: knowledge-based models (fuzzy method [11]); life expectancy models (statistical method [6], [12]-[16]); artificial neural networks; and physical models [5].

Utilities, grid operators and industrial power consumers are facing unprecedented challenges. With increasingly aging infrastructure combined with cost-cutting pressures to operate in today's competitive environment, prioritizing investment has never been so important [17]. In [18] the reliability of different substation configurations is evaluated using the minimal cut-set method based on the criterion of continuity of service. In [19] the reliability indices of each failure event and the overall reliability indices of the ring bus substation and the double bus double breaker substation were calculated and quantitatively compared. A method that combines the modeling of failures and repairs as stochastic point processes with a sequential Monte Carlo simulation procedure for computing the reliability indices is presented in [20]. In [21] a comparison between the reliability of different substation constructions is shown. A number of methods have been used to determine the final substation indices, such as the Markov model and the minimal cut-set method based on the criterion of continuity of service.
The authors of [22] developed a Monte Carlo approach to solving a system with non-Markovian models. The most commonly used methods are fault tree analysis, event tree analysis, Monte Carlo simulation and state enumeration [21]. Because of the importance of reliability, some companies [17] have started to use software, algorithms and analysis techniques for reliability management services, to provide substation owners with the right insights to make optimal investments that improve system performance.

2. CB AGEING PROCESS

The main causes of CB deterioration are age, the number of operations under normal and fault conditions, and operational conditions such as temperature and contaminant content. The contact resistance is usually measured using Ohm's law. Since the interrupting chamber is a closed container, only the entry and exit conductors are accessible; the resistance R measured between these two points is the sum of all the contact resistances in series (fixed, make-break and sliding contacts). According to IEC 60694 [23], article 6.4.1, the test current should be as close as possible to the nominal current the interrupting chamber is designed for. If this is impossible, lower currents can be used, but not less than 50 A, to eliminate the galvanic effect that might affect the readings.

2.1. Data collecting

The analysis covers 42 35/10 kV substations and 427 circuit breakers mounted on 10 kV and 35 kV feeders. Measurements of static contact resistance, represented by the voltage drop on the contacts, were collected over the past 10 years, with the voltage drop measured every two years. Other collected data on the circuit breakers are: voltage level, feeder type, manufacturing year, number of fault trips, number of short-circuit current trips, number of customers, and annual consumption.
Depending on the CB's nominal current and nominal voltage, the allowed voltage drop ranges from 3.5 mV up to 14 mV [24]. The analyzed CBs have the following maximal voltage drop values: 35 kV CBs: 3.5-7 mV; 10 kV CBs: 7-14 mV. The manufacturer's manual [24] states that a CB must be completely overhauled after 10-12 years of service, or 5000 operations, or 6 short-circuit current interruptions. The measurement was done with a DC current of I = 100 A, measuring the voltage drop on every CB pole. Fig. 1 shows the voltage drop distribution among all currently available data, with values divided into 4 categories depending on the voltage drop level.

Fig. 1 Voltage drop distribution on analyzed circuit breakers (categories: I "<5 mV", II "5-10 mV", III "10-20 mV", IV ">20 mV")

2.2. Data analyzing

In the first step, the state of every CB is determined according to its voltage drop value. CBs with a voltage drop beyond the permissible value are set to the "failed" state (F), and those whose voltage drop is still below the allowed value are in the "suspension" state (S). For failed CBs, the precise year of reaching that condition is defined. According to the manufacturer's manual [24], the allowed voltage drop values depend on the CB's rated voltage and rated current, and the manufacturer allows them to exceed the permissible value by 25%. For that reason, the CBs are analyzed for two different criteria: 1) the maximal allowed voltage drop is as in the manufacturer's table; 2) the maximal allowed voltage drop is 25% greater than the recommended values.

The Weibull distribution is the most commonly used method for equipment failure, ageing and reliability analysis [25]. It can describe three types of equipment states (infant mortality, normal work, wear out) through the bathtub curve [26]. The Weibull cumulative distribution function represents the probability of failure in a given period of time (3). It is a two-parameter distribution, with scale parameter η and shape parameter β.
$$F(t) = 1 - e^{-\left(\frac{t}{\eta}\right)^{\beta}} \qquad (3)$$

The scale parameter η is the time at which 63.2% of the analyzed units have failed. The shape parameter β represents the failure rate behavior: its value tells whether failures are decreasing or increasing. β<1 indicates infant mortality, while β>1 indicates wear-out failures. A higher value of β indicates a greater rate of failure. Table 1 shows the Weibull parameters for the different criteria.

Table 1 Weibull parameters

| CB feeder type | η | β | Fail \ Suspens |
|---|---|---|---|
| Overhead +25% | 39.1 | 5.2 | 100 \ 87 |
| Overhead | 37.1 | 4.8 | 131 \ 56 |
| Underground +25% | 41.5 | 6.1 | 63 \ 169 |
| Underground | 38.1 | 6.1 | 97 \ 135 |
| 10 kV feeders +25% | 43.4 | 5.6 | 87 \ 224 |
| 10 kV feeders | 40.4 | 5.1 | 135 \ 176 |
| 35 kV feeders +25% | 35.2 | 5.6 | 79 \ 31 |
| 35 kV feeders | 33.8 | 5.6 | 96 \ 14 |
| All feeders +25% | 40.4 | 5.6 | 166 \ 255 |
| All feeders | 38.0 | 5.3 | 231 \ 190 |

Two conclusions can be drawn from the Weibull parameters. Underground feeders (for both voltage drop limit criteria) have the highest β, while overhead feeders have the lowest. As for the η parameter, 10 kV feeders (with the +25% voltage drop limit) have the highest value, i.e. the longest characteristic life, while 35 kV feeders have the lowest η. Both β and η are calculated for the whole CB population from the statistical data using the least squares method [27]. Using the Weibull distribution function with right-censored data (the case when some devices did not fail during the period of analysis), unreliability is calculated for all CB categories. Figures 2-5 show the unreliability distributions for the different criteria, and specific values of unreliability and failure rate for the CB ages in this example are shown in Table 2.

Fig. 2 Weibull unreliability distribution for CBs on overhead feeders

Fig. 3 Weibull unreliability distribution for CBs on underground feeders

Fig. 4 Weibull unreliability distribution for CBs on 10 kV overhead feeders

Fig. 5 Weibull unreliability distribution for CBs on 35 kV overhead feeders

Considering the age of the CBs in the example substation, the values of the reliability indices obtained in the previous analysis are shown in Table 2.

Table 2 CB reliability indices from Weibull analysis

| Feeder type | Age | Unreliability | Failure rate |
|---|---|---|---|
| Transformer 35 kV | 32 | 0.532 | 0.226 |
| Transformer 10 kV | 41 | 0.278 | 0.11 |
| Supply 35 kV | 32 | 0.532 | 0.226 |
| Load 10 kV | 41 | 0.612 | 0.11 |

3. SUBSTATION RELIABILITY ANALYSIS

In this example a 35/10 kV substation is used, which has two 8 MVA power transformers, two 35 kV supply feeders and ten 10 kV feeders. Functional blocks are defined and shown in Fig. 6. A functional block consists of elements which would all be out of supply if any one of them fails. An active failure is an event that causes the protection system to operate and isolate a failed component [18]. Active failure events refer to all failures that trigger the protective breakers adjacent to the failed component and thereby affect normally operating components where no failure occurred [19]. A minimal cut-set is a set of components such that, when all of them fail, the continuity of service is lost, but if any one of the components does not fail, the continuity remains [18].

Fig. 6 Functional graph

Using the functional blocks from Fig. 6, the functional graph can be created (Fig. 7). In this case it is considered that the 10 kV feeders can supply the same load (ring connection). Substation reliability is calculated with the minimal path and minimal cuts method [28].

Fig. 7 Functional blocks of substation

3.1. Minimal path method

A path is a serial connection of graph branches which connects the input and output nodes. A minimal path does not cross the same node more than once. The highest order of a minimal path is one less than the number of network nodes. In this case (Fig. 6), the number of nodes is m = 6 and the connection matrix C has dimension m x m, where the element $e_{ij}$ is the branch which connects nodes "i" and "j".

$$C = [e_{ij}]_{m \times m} \qquad (4)$$

Minimal paths of the first order do not exist here, because no single branch connects the input and output nodes. Minimal paths of the second order are obtained by multiplying (from the right side) the first row of matrix C by the whole matrix C. Minimal paths of the third order are obtained by multiplying the former result by the whole matrix C. The same process is carried out for minimal paths of higher orders. After the calculations, the minimal paths are:

III: FKT2, SGT1
IV: KAT1G, KT2BG, SAT2F, ST1BF
V: KAT1BF, SAT2BG

3.2. Minimal cuts method

A cut of a graph is a group of branches whose removal breaks the connection between the input and output nodes. A minimal cut is unique and does not include other cuts. The matrix of minimal paths P (5), of size m x n, where m is the number of minimal paths and n is the number of branches, has elements $e_{ij}$ which are equal to 1 if branch "j" is part of minimal path "i", and 0 otherwise. The minimal paths are taken in the following order: FKT2; SGT1; KAT1G; KT2BG; SAT2F; ST1BF; KAT1BF; SAT2BG, and the branches in the following order: K, S, A, T1, T2, B, F, G. For the graph from Fig. 6 the matrix of minimal paths is:

$$P = \begin{bmatrix} 1 & 0 & 0 & 0 & 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 \\ 1 & 0 & 1 & 1 & 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 & 1 & 1 & 0 & 1 \\ 0 & 1 & 1 & 0 & 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 & 1 & 1 & 0 \\ 1 & 0 & 1 & 1 & 0 & 1 & 1 & 0 \\ 0 & 1 & 1 & 0 & 1 & 1 & 0 & 1 \end{bmatrix} \qquad (5)$$

If all elements of one column are equal to 1, then that branch is a minimal cut of the first order. Minimal cuts of the second order are obtained by adding columns of matrix P (every column is added to each of the following columns). The addition follows the laws of Boolean algebra (1+1=1, 1+0=1, 0+1=1, 0+0=0).

As a result, the minimal cuts of the second and third order are:

II: K-S, T1-T2, F-G
III: K-A-T1, S-A-T2, T1-B-F, T2-B-G

The connection between the input and output nodes of the functional graph is broken when all branches that are part of a cut are broken. In other words, the connection is broken if all branches of at least one minimal cut are broken.
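The column-addition procedure described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names (`path_matrix`, `minimal_cuts`) are hypothetical, and the sketch simply enumerates branch combinations of increasing order, keeps those whose Boolean column sum is all ones, and skips any combination that already contains a lower-order cut. The path and branch lists are the ones given in the text.

```python
from itertools import combinations

# Branch order and minimal paths exactly as listed in the text.
BRANCHES = ["K", "S", "A", "T1", "T2", "B", "F", "G"]
MINIMAL_PATHS = [
    {"F", "K", "T2"},
    {"S", "G", "T1"},
    {"K", "A", "T1", "G"},
    {"K", "T2", "B", "G"},
    {"S", "A", "T2", "F"},
    {"S", "T1", "B", "F"},
    {"K", "A", "T1", "B", "F"},
    {"S", "A", "T2", "B", "G"},
]


def path_matrix(paths, branches):
    """Build matrix P: entry is 1 if the branch belongs to the path, else 0."""
    return [[1 if b in path else 0 for b in branches] for path in paths]


def minimal_cuts(paths, branches, max_order):
    """Return minimal cuts up to max_order as a list of frozensets of branch names."""
    matrix = path_matrix(paths, branches)
    cuts = []
    for order in range(1, max_order + 1):
        for cols in combinations(range(len(branches)), order):
            names = frozenset(branches[c] for c in cols)
            # A set that contains a lower-order cut is a cut, but not a minimal one.
            if any(cut <= names for cut in cuts):
                continue
            # Boolean sum of the chosen columns: a cut must interrupt every path.
            if all(any(row[c] for c in cols) for row in matrix):
                cuts.append(names)
    return cuts


cuts = minimal_cuts(MINIMAL_PATHS, BRANCHES, max_order=3)
for cut in cuts:
    print(len(cut), "-".join(sorted(cut)))
```

Exhaustive enumeration is adequate for a graph with eight branches; for much larger networks the number of combinations grows quickly and a dedicated cut-set algorithm would be preferable.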
With all the results obtained so far, the equivalent minimal paths graph of the substation can be made (Fig. 8).

Fig. 8 Equivalent minimal paths graph

4. DIFFERENT SCENARIOS OF CB REPLACEMENT

The analysis of CB replacement profitability and its influence on substation unreliability is shown through five actions: a baseline without replacement and four replacement scenarios. The unavailability of a new CB is taken to be U = 0.00000822 [21] (assuming that the failure frequency and the probability of failure remain unchanged).

List of actions:
I: no CB replacement
II: replacement of all CBs on 10 kV feeders
III: replacement of CBs on supply feeders
IV: replacement of CBs on power transformers
V: replacement of all CBs

The results of each action, depending on the time at which it is taken, are shown in Table 3 (and also in Fig. 9-10), while Table 4 shows how each action affects the power station unavailability and failure frequency.

Table 3 Unavailability results in different scenarios

| Year | Action I | Action II | Action III | Action IV | Action V |
|---|---|---|---|---|---|
| 1 | 0.50038 | 0.48357 | 0.47501 | 0.04219 | 1.40364e-05 |
| 2 | 0.62222 | 0.60417 | 0.59232 | 0.04796 | 1.40364e-05 |
| 3 | 0.72393 | 0.70440 | 0.69079 | 0.05269 | 1.40364e-05 |
| 4 | 0.78698 | 0.76531 | 0.75203 | 0.05664 | 1.40364e-05 |
| 5 | 0.83704 | 0.81432 | 0.80080 | 0.05897 | 1.40364e-05 |

Table 4 Power station unavailability regarding the year of taking action: results of different actions (%)

| Parameter | Action II | Action III | Action IV | Action V |
|---|---|---|---|---|
| Unavailability reduction (%) | 3.36 | 5.07 | 91.57 | 99.99 |
| Failure frequency | 8.73 | 6.59 | 80.26 | 95.59 |

Fig. 9 Power station unreliability in the following years for each action (columns from Table 3); axes: unreliability vs. time (years)

Fig. 10 Power station unreliability regarding the action taken (rows from Table 3); axes: unreliability vs. action

5. COST ESTIMATION OF ACTIONS

Assuming a CB replacement price of 5 000 $ for a 35 kV CB and 2 000 $ for a 10 kV feeder (including labor cost), with an average replacement time of 6 hours (covering decision making, transport and mounting), the cost of the different actions is presented in Table 5. The column "Maintenance" covers the regular maintenance costs of old circuit breakers which are not replaced, and the column "Replacement" contains replacement costs only. The values in the column "Sum" are the total costs in the year of investment (maintenance of CBs that are not replaced plus the cost of newly installed CBs).

Table 5 Cost of CB replacement

| Action | Description | Maintenance ($) | Replacement ($) | Sum ($) |
|---|---|---|---|---|
| I | No CB replacement, only costs of maintenance | 13 200 | 0 | 13 200 |
| II | Replacement of CBs on all 10 kV feeders | 2 400 | 38 800 | 41 200 |
| III | CB replacement on 35 kV supply feeders | 12 400 | 14 800 | 27 200 |
| IV | Replacement of CBs on power transformers | 11 600 | 23 600 | 35 200 |
| V | Replacement of all CBs in power station | 0 | 77 200 | 77 200 |

Considering the probability of CB failure due to its age and condition, the costs of unplanned failure are calculated and presented in Table 6.
table 6 variable costs per cb, considering probability of failure
cb (feeder)     year i      year ii     year iii    year iv     year v
trafo 35        3,936.80    4,639.80    5,143.00    5,424.20    5,624.00
trafo 10        1,223.20    1,368.40    1,579.60    1,760.00    1,966.80
kladovo i       4,528.80    4,861.80    5,261.40    5,838.60    6,119.80
cs carina       11,872.81   12,745.83   13,793.50   15,306.87   16,044.43
zelezara        11,873.07   12,747.18   13,798.03   15,318.13   16,068.36
cs jezero       11,884.33   12,794.19   13,932.46   15,627.93   16,657.18
radio stanica   11,872.80   12,745.80   13,793.40   15,306.60   16,043.80
other feeders   3,936.80    4,639.80    5,143.00    5,424.20    5,624.00

fig. 11 variable costs per cb [axes: cost ($) vs. year; series: transformer 35kv, transformer 10kv, 35kv supply feeder, 10kv feeder "cs carina", 10kv feeder "železara", 10kv feeder "cs jezero", rest 10kv feeders]

6. present cost value calculation
the present value is calculated by equation (6):

\[ c_{pv} = \frac{c_{fv}}{(1+i)^{n}} \tag{6} \]

where:
c_pv – present value
c_fv – future value
i – rate
n – time period
the following calculation is carried out for a rate of i = 0.09, and the future value (expected costs in the next 5 years) is calculated with equation (7):

\[ c_{fv} = c_{mn} + c_{inv} + c_{un} \tag{7} \]

c_mn – costs of planned maintenance (table 5)
c_inv – costs of new investments (table 5)
c_un – unpredictable costs due to cb’s failure (table 6)
example of present value calculation (fourth year, ii action):

\[ c_{pv} = \frac{c_{mn} + c_{inv} + c_{un}}{(1+i)^{n}} = \frac{2\,400 + 38\,800 + 40\,414}{(1+0.09)^{4}} = 57\,817.42\ \$ \tag{8} \]

using equations (6) and (7), present values of all actions in the following years can be calculated. results are shown in table 7.
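equations (6) to (8) can be checked with a few lines of python. this is a minimal sketch rather than the authors' tooling: the function names `future_value` and `present_value` are introduced here for illustration, and the input figures are those of the worked example (8).

```python
def future_value(c_mn, c_inv, c_un):
    """Equation (7): total expected cost before discounting."""
    return c_mn + c_inv + c_un


def present_value(c_fv, rate, years):
    """Equation (6): discount a future value over `years` periods."""
    return c_fv / (1.0 + rate) ** years


# Worked example (8): action II, fourth year, rate 0.09.
c_fv = future_value(c_mn=2_400, c_inv=38_800, c_un=40_414)
c_pv = present_value(c_fv, rate=0.09, years=4)
print(f"future value:  {c_fv:,.2f} $")
print(f"present value: {c_pv:,.2f} $")
```

the same two functions, applied year by year to each action, would give the "sum" and "discounted costs" rows of a table such as table 7.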
table 7 present value ($) of all actions in next 5 years
action   cost type          year 1       year 2       year 3       year 4       year 5
i        sum                103,701.41   113,811.80   123,343.39   133,350.33   139,178.97
i        discounted costs   95,138.91    95,793.12    95,243.73    94,468.74    90,456.78
ii       sum                70,897.60    74,956.40    78,613.20    81,614.00    83,802.80
ii       discounted costs   65,043.67    63,089.30    60,703.81    57,817.42    54,466.07
iii      sum                257,707.81   293,331.40   321,446.19   341,504.33   355,204.17
iii      discounted costs   236,429.18   246,891.17   248,215.44   241,930.28   230,858.34
iv       sum                115,381.41   123,795.40   131,898.19   140,981.93   145,997.37
iv       discounted costs   105,854.50   104,196.11   101,849.60   99,875.15    94,888.27
v        sum                77,200.00    77,200.00    77,200.00    77,200.00    77,200.00
v        discounted costs   70,825.69    64,977.70    59,612.56    54,690.43    50,174.70

total expected costs, considering fixed (table 5) and variable costs (table 6), are put together for every action, and their discounted value is calculated.

7. conclusion
the determination of cb replacement time is a complex procedure depending on various stochastic factors. substation reliability analysis can be used for determining the size of both asset replacement and new investments, as well as their financial justification. using statistical data of 427 cbs gathered in the past 10 years, the weibull probability distribution of contact resistance for breakers on both overhead and underground feeders and voltage levels of 35 kv and 10 kv proved to be the best fit. cb’s removal has been assessed by the risk assessment and the calculation of substation reliability improvement. the results show that the first candidates for replacement are those cb’s with the biggest influence on substation reliability. the obtained results show that the maximal increase in substation reliability with regard to invested money can be obtained with the replacement of cb’s on power transformers (action iv).
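the statement about action iv can be checked with a short calculation. the sketch below is an illustration and not the authors' procedure: it divides the unavailability reduction from table 4 by the replacement cost from table 5, and the ratio "percentage points of reduction per 1 000 $" is an assumed figure of merit introduced only for this example.

```python
# Unavailability reduction (%) from table 4 and replacement cost ($) from table 5.
ACTIONS = {
    "II": {"reduction_pct": 3.36, "replacement_usd": 38_800},
    "III": {"reduction_pct": 5.07, "replacement_usd": 14_800},
    "IV": {"reduction_pct": 91.57, "replacement_usd": 23_600},
    "V": {"reduction_pct": 99.99, "replacement_usd": 77_200},
}


def reduction_per_kusd(action):
    """Percentage points of unavailability reduction per 1 000 $ invested."""
    data = ACTIONS[action]
    return data["reduction_pct"] / (data["replacement_usd"] / 1_000)


# Rank the actions from most to least reduction per invested dollar.
ranking = sorted(ACTIONS, key=reduction_per_kusd, reverse=True)
for name in ranking:
    print(f"action {name}: {reduction_per_kusd(name):.2f} % per 1 000 $")
```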
in that case unavailability is decreased by ~92% with an investment of 23 600 $ for the replacement of 4 circuit breakers. the methodology used is easy to apply because all data are already available, and no extra investment or labor cost is needed to carry out the calculation.

references
[1] d. stevanović, a. janjić, “circuit breaker replacement strategy based on the substation risk assessment”, in proceedings of the 4th virtual international conference on science, technology and management in energy, energetics 2018, october 25-26, niš, serbia, pp. 248–253.
[2] y. hu, s. liu, h. lu, h. zhang, “remaining useful life assessment and its application in the decision for remanufacturing”, procedia cirp, vol. 15, pp. 212–217, 2014.
[3] h. picard, j. verstraten, m. hakkens, r. vervaet, “decision model for end of life management of switchgears”, in proceedings of the el. and instr. appl. in the petroleum & chem. ind., pcic europe 4th european conference, jun. 2007.
[4] c. okoh, r. roy, j. mehnen, l. redding, “overview of remaining useful life prediction techniques in through-life engineering services”, sciencedirect, procedia cirp, vol. 16, pp. 158–163, 2014.
[5] j. z. sikorska, m. hodkiewicz, l. ma, “prognostic modelling options for remaining useful life estimation by industry”, mech. syst. and sign. process., vol. 25, no. 5, pp. 1803–1836, jul. 2011.
[6] x. s. si, w. wang, c. h. hu, d. h. zhou, “remaining useful life estimation – a review on the statistical data driven approaches”, european journal of operational research, vol. 2013, no. 1, pp. 1–14, aug. 2011.
[7] m. braunović, v. v. konchits, n. k. myshkin, fundamentals of electrical contacts, crc press, 2006.
[8] a. h. a. bakar, h. a. illias, m. k. othman, h. mokhils, “identification of failure root causes using condition based monitoring data on a 33 kv switchgear”, el. power and energy systems, vol. 47, pp. 305–312, may 2013.
[9] s. natti, m.
kezunovic, “assessing circuit breaker performance using condition-based data and bayesian approach”, el. power systems research, vol. 81, no. 9, pp. 1796–1804, sep. 2011.
[10] a. janssen, d. makareinis, c. e. sölver, “international survey on circuit-breaker reliability data for substation and system studies”, ieee trans. power delivery, vol. 29, pp. 808–814, apr. 2014.
[11] p. sun, h. jiang, h. yu, x. huang, y. sun, x. wang, “reliability evaluation of high voltage circuit breaker based on ifahp and ga”, icaees, nov. 2015.
[12] j. f. boudreau, s. poirier, “end-of-life assessment of electric power equipment allowing for non-constant hazard rate – application to circuit breakers”, el. power and energy systems, vol. 62, pp. 556–561, nov. 2014.
[13] x. zhang, e. gockenbach, z. liu, h. chen, l. yang, “reliability estimation of high voltage sf6 circuit breakers by statistical analysis on the basis of the field data”, el. power system research, vol. 103, pp. 105–113, oct. 2013.
[14] l. jian, t. tianyuan, “ls-svm based substation circuit breaker maintenance scheduling optimization”, el. power and energy system, vol. 64, pp. 1251–1258, jan. 2015.
[15] g. balzer, f. heil, p. kirchesch, r. meister, c. neumann, “evaluation of failure data of hv circuit-breakers for condition based maintenance”, cigre, paris, report a3-305, 2004.
[16] t. m. lindquist, l. bertling, r. eriksson, “circuit breaker failure data and reliability modelling”, iet gen., transm. & distrib., vol. 2, no. 6, pp. 813–820, nov. 2008.
[17] abb switzerland ltd, substation reliability management services, 2016, www.abb.com/substationservice
[18] d. nack, reliability of substation configurations, iowa state university, 2005.
[19] s. h. kim, d. j. lee, substation reliability evaluation considering the failure events, iop conf. ser.: earth environ. sci. 159, 2018.
[20] c. j. zapata, a. alzate, m. a.
ríos, “reliability assessment of substations using stochastic point processes and monte carlo simulation”, ieee pes general meeting, july 2010.
[21] f. wang, “reliability evaluation of substations subject to protection failures”, delft university of technology, netherlands, july 2012.
[22] r. billington and g. lian, “monte carlo approach to substation reliability evaluation,” iee proceedings-c, vol. 140, no. 2, march 1993.
[23] common specifications for high-voltage switchgear and controlgear standards, international standard iec 60694, edition 2.2, 2002-01.
[24] operating manual for medium voltage medium oil circuit breakers for internal assembly, minel, serbia, 1984.
[25] t. suwanasri, m. t. hlaing, c. suwanasri, “failure rate analysis of power circuit breaker in high voltage substation”, gmsarn international journal, vol. 8, pp. 1–6, 2014.
[26] j. z. sikorska, m. hodkiewicz, l. ma, “prognostic modelling options for remaining useful life estimation by industry”, mech. syst. and sign. process., vol. 25, no. 5, pp. 1803–1836, jul. 2011.
[27] w. li, j. zhou, j. lu, w. yan, “a probabilistic analysis approach to making decision on retirement of aged equipment in transmission systems”, ieee trans. on power delivery, vol. 22, no. 3, pp. 1891–1896, july 2007.
[28] j. nahman, v. mijailović, “razvodna postrojenja”, beograd, 2005.

facta universitatis
series: electronics and energetics vol. 29, no. 3, september 2016, pp. 367–381
doi: 10.2298/fuee1603367j

from intelligent web of things to social web of things
nafaâ jabeur 1, hedi haddad 2
1 dept. of computer science, german university of technology in oman, oman
2 dept. of computer science, dhofar university, oman

abstract.
numerous challenges, including limited resources, random mobility, and lack of standardized communication protocols, are currently preventing a myriad of heterogeneous devices from interacting and providing web services within the context of the web of things (wot). we argue in this paper that these devices should be augmented with artificial intelligence techniques for an enhanced management of their resources and an easier construction of web applications integrating real world things (rwt). to this end, we present a new classification of the wot challenges and highlight the opportunities of embedding smartness into rwt. we also present our vision of intelligent wot by proposing a multiagent system-based architecture for intelligent web service composition. in addition, we discuss the shift of the wot toward a social wot (swot) and debate our ideas within two important scenarios, namely the intelligent vanet-wot and smart logistics.

key words: internet of things, web of things, multiagent systems, web service composition, social web of things, smart logistics

received august 30, 2015; received in revised form november 15, 2015
corresponding author: nafaâ jabeur, dept. of computer science, german university of technology in oman gutech, p.o. box 1816, pc 130, muscat, oman (e-mail: nafaa.jabeur@gutech.edu.om)
*an earlier version of this paper was presented at the international conference on recent advances in computer systems racs-2015, hail university, saudi arabia, 2015 [1].

1. introduction
continuous technological advances are bringing communication and computing technologies from large to small and tiny scales. for instance, a new range of small devices, including wireless sensor networks (wsns), is capable of acquiring and reporting data about a variety of spatial objects and events of interest, anytime and everywhere [2]. these devices have since been profiting from continual progress in the fields of networking capabilities, mobile and pervasive computing, and miniaturization. they are no longer considered as simple data collecting devices. their capabilities are, indeed, being augmented with processing and intelligent mechanisms to assess on their own their current situations and make the right decision at the right time. a new era bridging cyber and physical worlds has then emerged with the vision to insert smartness everywhere. this era is particularly marked by the recently emerged fields of cyber physical systems [3] and internet of things.

the internet of things (iot) could be defined as a global networking infrastructure that uses data capturing devices and communication resources to link virtual and physical objects [4]. it can, therefore, be perceived as an amalgamation of a variety of sensing, communication, and networking devices and systems in order to connect people and things with common interests. in this configuration, anybody can efficiently access the information of any object and any service, at any time and any place, regardless of the heterogeneity of communication protocols and devices [5]. the web of things (wot) is a subset of the iot where web standards are used to seamlessly integrate and connect physical objects and information resources [6]. the emerging development of wot is expected to offer solutions in a wide variety of domains, including transportation management, energy monitoring, logistics and supply chain management, military and rescue scenarios, and healthcare applications. this is expected to be facilitated by the increasing abundance of smart devices with web-enabled capabilities. the wot vision particularly aims to use web protocols and technologies to allow an easy building of web applications exploiting real world things (rwt).
however, due to the heterogeneity of their hardware/software specifications and capabilities, the nonhomogeneity of their data representations and quality, as well as their commonly nondeterministic mobility, rwt are facing serious problems in interoperating. these problems are more and more challenging because of the absence of widely accepted standards. with the continuous expansion of cyber and physical worlds toward each other as well as toward a social world, additional challenges concerning trust, privacy, and security are arising. as can be clearly seen, the challenges of the wot concern several levels and issues. we believe that autonomy, flexibility, and intelligence must be integrated into any approach addressing these challenges, and we argue that techniques from the artificial intelligence field would allow the creation of efficient candidate solutions. in this perspective, a few approaches have been proposed [7][8]. however, the integration of intelligence into rwt has not been clearly investigated. furthermore, a major success factor for the wot is the prevalence of web expertise. the internet networking infrastructure and the existing standards for data storage, visualization, and sharing are, indeed, pillars of the wot vision. nevertheless, these standards and techniques must be extended, revised, and/or revolutionized in order to meet the operational requirements of the rwt and allow them to integrate the web and mutually exchange web services. these services should be easy to publish, discover, compose, and execute. the traditional web service paradigm should then be enriched by promoting the web from both cyber and physical worlds [6]. because of their hardware and software limitations, it would be beneficial for the rwt to collaboratively provide services going beyond their individual capabilities.
we then argue that these rwt should organize themselves into clusters where web-enabled devices could act as proxies allowing other rwt to connect to the internet and share their services. as the issues of service composition and clustering within the context of wot have not been specifically investigated, we propose in this paper to address them as well as other challenges of the wot using a multiagent-based approach. in the remainder of this paper, section 2 highlights existing works that have addressed the issue of web service provision in the wot. section 3 presents our categorization of the wot challenges. section 4 addresses the issue of intelligent wot, where the need for intelligent techniques is emphasized and explained. section 5 brings hints about socializing the wot. section 6 focuses on the application of our ideas to two important scenarios, namely web of vehicular ad-hoc network and freight transportation.

2. related work
the main challenge of the iot, and therefore the wot, is to allow a myriad of rwt to interoperate and mutually “understand” each other. to facilitate this interoperability, several techniques, including universal plug and play (upnp), dlna, slp, and zeroconf, have been proposed [9]. each of these techniques has individually been successful in enabling devices to communicate with each other [7]. however, in addition to not being strictly standardized, some of them are inappropriate for resource-constrained devices due to their heavy protocols. thanks to the increasing integration of web-enabled capabilities, a large number of rwt are currently benefiting from the existing networking infrastructure of the internet. the wot is then providing these rwt with the service and application layer to interoperate over http [10]. other networking infrastructures like wi-fi and ethernet permit new opportunities to build additional applications and services [7].
furthermore, with the falling size of embedded systems and their growing hardware and software capabilities, it has become possible to integrate lightweight web servers into many appliances [11]. consequently, academia and the business sector are giving increasing attention to using the web as a platform for the creation of new applications that integrate rwt [10][12]. this trend has resulted in the increasing use of web services for the interoperability of rwt, particularly because of their proprietary and heterogeneous technologies [7]. the possible integration of heterogeneous rwt into the web leads to a more advanced perspective, where these things are abstracted into reusable web services, and not only viewed as simple web pages [6]. for instance, soap-based web services (ws-*) and restful apis allow rwt to offer their functionalities. restful web services are based on representational state transfer (rest) [13], which is lightweight, simple, loosely coupled, flexible, as well as easy to integrate into the web using the http application protocol [7]. although rest-based services are being incorporated into many wot applications, a more tightly coupled service paradigm like ws-* would be preferable where quality of service (qos) levels are strictly enforced (e.g., stock market and banking) [14]. recent developments have made it possible to embed tiny web servers into rwt (e.g., [15][16]), especially since these servers do not need to handle a large number of concurrent connections and requests. however, a lot of research and development effort remains necessary in order to properly manage the increasing volume of demands on these servers while efficiently using the limited resources of the corresponding rwt. in the current literature, the wot has not attracted the research and development attention that its value deserves.
we believe that this is due to its numerous challenges as well as the lack of maturity of the related processing and communication capabilities of rwt. we also believe that artificial intelligence techniques, which have proven their extraordinary performance in dealing with problems of highly dynamic, uncertain, and heterogeneous environments, could bring solutions to the problems of wot. some works have integrated such techniques within the context of iot (e.g., [17][18][19]). however, to the best of our knowledge this has not been the case for the wot. an interesting study was proposed by zhong et al. [8], where the authors have suggested a holistic intelligence methodology called wisdom wot (w2t) for realizing "the harmonious symbiosis of humans, computers, and things in the hyper world" [8]. the methodology principally aims to implement a closed cycle that goes from things to data, information, knowledge, wisdom, services, humans, and then back to things. this macro-level cycle is not embedded on the rwt, which are mostly considered as data collectors with networking facilities to connect to server providers.

3. challenges of the web of things
basically, building the wot concerns ways to design and implement scalable and industry-ready iot solutions on the web. as a subset of the iot, the wot shares many characteristics with wireless sensor networks (wsns), machine-to-machine (m2m), and ubiquitous computing technologies. furthermore, the wot integrates information and physical objects, necessitating new means to model and reason about a range of context types [20]. from a design perspective and compared to the traditional client-server architecture, the wot has a flat architecture that includes two main challenges: a) integrating the rwt into the web; and b) making the rwt provide web services capable of mutually interoperating and fusing into complex services [6].
from a general perspective, we classify the challenges of wot into five main categories: data preprocessing and storage, data analytics, service management, networking and communication, and security, privacy, and trust (figure 1).

data preprocessing and storage. the spatially distributed rwt are generally moving in space while collecting data, anytime, anywhere, and for a variety of purposes. as a result, they usually face problems in making appropriate use of their data. in this regard, the rwt have to identify which data is important to collect for the current situation and at which sampling frequency. the data collected should then be filtered and evaluated according to its semantics, the current context, as well as current and expected requirements. once data is cleaned and filtered, it should be stored according to appropriate representations, granularities, and quality. the abovementioned process could be performed with a convenient form of the commonly used extract, transform, load (etl) process, which is capable of merging data from different sources and creating specialized datasets for a variety of purposes.

data analytics. once data have been transformed and fed into local embedded databases, some analytics can start. to this end, some lightweight algorithms could be applied in order to perform a variety of operations, including data mining, semantics extraction, and data correlation identification. these algorithms may derive from genetic algorithms, support vector machines, decision trees, neural networks, and/or cluster analysis. they will basically be applied to small, focused data owned by each rwt. because of the limited storage and processing capabilities of rwts, some data analytics processing would go beyond their individual capabilities. to this end, a trusted, federating entity would be necessary to carry out the necessary processing within appropriate timeframes.
this entity could be an rwt endowed with extended capabilities or a remote server to which the participating rwts are registered. this entity has to collect data from individual rwts and aggregate them according to specific structures and requirements. because of the increasing number of sensing devices capable of acquiring huge amounts of data, anywhere, anytime, the resulting aggregated data tends to be huge. advanced data analytics algorithms could then be thoroughly performed, leading to the potential discovery of new relevant data correlations as well as hidden communication, behavioural, mobility, and processing patterns. furthermore, in addition to increasing the context-awareness while processing data, the trusted rwt executing data analytics could infer actionable information through business intelligence mechanisms. this information could particularly allow the concerned rwt to take more informed actions.

fig. 1 a proposed classification of the wot challenges [diagram: challenges of web of things, grouped into data preprocessing, data analytics, service management, networking and communication, and security, trust, privacy]

service management. the rwt can be directly integrated into the web (in the case they have ip addresses or they are ip-enabled when connected to the internet) and be, consequently, able to understand each other through standardized web languages. they can also be integrated indirectly into the web (e.g., sensor nodes in a wsn) for cost, energy and security considerations [6].
this is achieved through ip-enabled rwt proxies. in both cases, the rwt should allow other devices to interoperate with them and mutually benefit from their services, which requires the abstraction of the rwt into reusable web services [7]. one or both of the w3c web service paradigms (rest-compliant web services and arbitrary web services) can be adopted. the rwt services should be generated on-the-fly or at least within appropriate timeframes [7]. although some technologies (e.g., flyport: www.openpicus.com) and research initiatives (e.g., [15]) have successfully embedded tiny web servers on mobile devices, additional research and development efforts are still needed, particularly because of the physical constraints of rwt. furthermore, the services of rwt should be published in appropriate locations with convenient mechanisms for their discovery. in this regard, existing search engines and algorithms must be re-examined in order to allow an efficient and effective discovery of rwt services. because of the limited capabilities of rwt, service composition could be a challenging solution where a group of rwt collaboratively create complex services from their individual elementary services. although the mobility of rwt offers new opportunities for service composition, it also brings new challenges, basically because it does not guarantee a durable availability of service providers. in addition, the increasing capability of smart things to connect to the web is enabling additional flexibility and customization possibilities for end-users. following the tendency of web 2.0 participatory services, especially web mashups, users are currently capable of creating new applications where rwt (e.g., home appliances) are mixed with virtual services on the web [21]. this type of application is often referred to as a physical mashup [22].
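a physical mashup of this kind can be sketched in a few lines of python. the example is purely illustrative and makes several assumptions that are not in the paper: the resource paths, the thermostat and forecast values, and the helper names `get` and `heating_advice` are hypothetical, and the http layer is simulated by dictionary lookups so that the sketch stays self-contained. it combines one reading exposed by an rwt with one virtual web service, and reports unavailability when a provider cannot be reached.

```python
import json

# Hypothetical resource tables standing in for HTTP GET on REST-style URLs.
THING_RESOURCES = {
    "/things/thermostat/temperature": {"value": 18.5, "unit": "celsius"},
}
VIRTUAL_SERVICES = {
    "/weather/forecast": {"outdoor_low": -2.0, "unit": "celsius"},
}


def get(resources, path):
    """Simulated GET: return (status, JSON body) for a resource path."""
    if path not in resources:
        return 404, json.dumps({"error": "not found"})
    return 200, json.dumps(resources[path])


def heating_advice(target=21.0):
    """Mashup: combine a physical reading with a virtual forecast."""
    status_t, body_t = get(THING_RESOURCES, "/things/thermostat/temperature")
    status_w, body_w = get(VIRTUAL_SERVICES, "/weather/forecast")
    if status_t != 200 or status_w != 200:
        # A provider has become unavailable, e.g. the thing moved away.
        return {"available": False}
    indoor = json.loads(body_t)["value"]
    outdoor_low = json.loads(body_w)["outdoor_low"]
    return {
        "available": True,
        "heat_now": indoor < target,
        "frost_expected": outdoor_low < 0.0,
    }


print(heating_advice())
```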
a web mashup is a special application that integrates several web resources in order to generate a new service or application. this integration is mainly performed in an opportunistic manner for the sake of the end-user’s personal use and generally for non-critical applications [23]. in addition to serving short-term needs, mashups are usually created ad-hoc with well-known, lightweight web technologies (e.g., html, javascript). an example of a mashup could be an application that displays on google maps the location of all the pictures posted to flickr [21]. within the context of wot, rwt could be used by mashups in order to create new web services. to this end, these rwt must be easy to locate through the web. in addition, they must maintain the availability of their contributions to the new services.

networking and communication. during the last decades, several technologies and standards have been proposed for smart things' communication. the sporadic mobility of rwt makes communication difficult, especially in the context of indoor applications. with the huge variety of types and manufacturers of rwt, interoperability is a growing concern. for instance, the rwt should be able to understand each other by using well-defined communication protocols. since existing protocols, including upnp and jxta, have been neither standardized nor widely accepted for embedded devices in industry, embedded tiny web servers could be an option [6]. the unpredictable mobility of rwt intensifies the problems of their communication and increases the need for new lightweight protocols, where the identities, capabilities, and requirements of things are supported.

trust, privacy, and security. the issues of security, privacy, and trust are always fuelling intensive research works, especially within the context of large scale, open configurations in which specialized and non-specialized parties can participate anytime, anywhere.
this is also the case for wot, where rwt can exchange and share data/services without having a firm awareness of their mutual intentions and actions. the option of embedding tiny web servers on rwt adds further security challenges. the use of rest-based interfaces makes it possible to have secure interactions using https [24]. however, the erratic configuration of the wot and the lack of standards require new and revolutionary security mechanisms. the use of the social web as a platform to ensure the trust and privacy of things has been advocated [25] to control web-enabled things among trusted members on social web sites [7]. however, additional research and development works are still needed toward a successful, widespread use of the wot.

4. intelligent web of things
in this section, we propose a multiagent-based architecture in order to deal with the challenges of the wot. this architecture is expected to be embedded on rwt. we particularly focus on the issue of service composition.

4.1. need for intelligence
because of their limited capabilities, non-standardized communication protocols, unplanned mobility, and potentially their heterogeneous data formats, accuracy, and granularity, the spatially distributed rwt definitely need suitable mechanisms to take convenient actions at the right time, depending on their current capabilities and context. in this paper, we argue that the multiagent system paradigm (mas) could be appropriate for the wot, thanks to its proven flexibility, autonomy, and intelligence to solve complex problems within highly dynamic, constrained, and uncertain environments [26]. we believe that several well-established agent-based techniques could bring solutions to the deficiencies and challenges of rwt highlighted in section 3.

4.2. multiagent-based architecture
we propose, in this paper, to embed a mas into rwt in order to handle the wot challenges at different levels.
fig. 2 an embedded multiagent architecture for rwt [diagram components: raw data, data filtering agent, content generation agent, service repository, networking and communication agent, protocols and communication links, security, privacy, trust agent, service composition agent system, application]

our architecture (figure 2) contains four main modules: data filtering agent (dfa), content generation agent (cga), networking and communication agent (nca), and security, trust, and privacy agent (stpa). the dfa processes and analyzes the data collected by local sensing devices as well as data received from neighbouring devices. agent-based techniques for data filtering (e.g., [27]) and data mining (e.g., [28]) can be used. the cga will then be able to create elementary services which will be published later. if a given service is requested by a third party, the rwt should use appropriate communication protocols (e.g., 6lowpan, zigbee, wifi) as well as appropriate communication pathways to respond and convey the requested service. this task is achieved by the nca. the operation of the rwt is carried out according to specific security, trust, and privacy rules handled by the stpa. these rules will be updated and improved based on the accumulated experience and the envisioned wot application. our architecture also includes a dedicated agent-based system which will be used for service composition on demand (requested by peers) or when the rwt is willing to create a new mashup, integrating local, neighbouring, and remote services from trusted peers.

4.3. service composition
when some required services cannot be provided individually, rwt should have the option to collaboratively generate new contents beyond their individual capabilities. this collaboration is particularly needed for energy and safety reasons as well as shortage of resources due to rwt mobility.
in order to enable rwt collaboration, we propose to allow them to create clusters of things that we call circles of friends (cof). each cof will be composed of a group of rwt that will select each other based on their own preferences. although the creation of cofs is beyond the scope of this paper, we give a brief overview of how they are formed. initially, while publishing its services, any rwt also publishes its wish to belong to a cof with specific social and/or professional aims. interested rwt could then contact each other to make a new cof. one of the rwt is appointed as head of the cof (hcof). the hcof is responsible for selecting the appropriate rwt to provide the currently requested services and for making the necessary plans to generate complex services from elementary ones. in order to motivate rwt to join cofs so that complex services could be created more easily, smart things will be rewarded whenever they participate and provide services within these circles. this will consequently affect their reputation in the wot. a reward, and therefore a reputation, is also assigned to each cof in order to motivate rwt to be active and maintain their cofs.

fig. 3 (left) embedded multiagent system architecture for service composition, (right) belief-desire-intention architecture of rwt (labels recovered from the figure: translator agent, service generator agent, evaluator agent, executor agent, request, specifications, service, service repository, cof members; inputs (new communication), revision function, beliefs, option generation function, desires, action generation function, intentions, plan generation)

in order to carry out the tasks of a hcof, any given rwt with appropriate physical resources will include a service composition agent system (see figure 3) with the following agents: translator agent (ta), service generator agent (sga), evaluator agent (ea), and executor agent (xa) (figure 3, left).
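the circle-of-friends bookkeeping described above (membership, a reward per member, and a reward for the circle as a whole) can be illustrated with a minimal python sketch. everything in it is an assumption made for illustration: the class name `CircleOfFriends`, the method names, and the rule that each provided service adds a fixed amount to both the member's and the circle's reward are hypothetical, since the paper does not specify how rewards are computed.

```python
from dataclasses import dataclass, field


@dataclass
class CircleOfFriends:
    """Hypothetical bookkeeping for one circle of friends (CoF)."""
    aim: str                                      # social/professional aim of the circle
    head: str                                     # identifier of the HCoF
    rewards: dict = field(default_factory=dict)   # member id -> accumulated reward
    circle_reward: float = 0.0                    # reward assigned to the CoF itself

    def join(self, member_id: str) -> None:
        """Register a new member with a zero reward (no effect if already present)."""
        self.rewards.setdefault(member_id, 0.0)

    def record_service(self, member_id: str, amount: float = 1.0) -> None:
        """Reward a member, and the circle, for providing a service."""
        if member_id not in self.rewards:
            raise KeyError(f"{member_id} is not a member of this CoF")
        self.rewards[member_id] += amount
        self.circle_reward += amount

    def ranked_members(self) -> list:
        """Members ordered from highest to lowest reward (ties broken by id)."""
        return sorted(self.rewards, key=lambda m: (-self.rewards[m], m))


cof = CircleOfFriends(aim="traffic information", head="rwt_a")
for member in ("rwt_a", "rwt_b", "rwt_c"):
    cof.join(member)
cof.record_service("rwt_b")
cof.record_service("rwt_b")
cof.record_service("rwt_c")
print(cof.ranked_members())
```

a hcof could use such a ranking as one input when selecting providers; a real deployment would also have to account for trust and security levels, which this sketch leaves out.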
the ta will receive the requests for services from its corresponding cof and make the necessary translations between the external languages and communication formats and those internal to the hcof. if the request cannot be understood, the hcof can request the help of a member of the circle to make the necessary translations. once the request is translated, specifications are sent to the sga, which will consult the repository of the services currently provided by the cof as well as the currently active rwt and their rewards, trust, and security levels. elementary services will be assigned to individual rwt. however, for complex services, the sga plans and generates options for the ea, which will make the necessary assessments and select an appropriate service composition plan with one backup plan. the selected plan will then be executed by the concerned rwt and monitored by the xa.

4.4. mobile agents

a mobile agent is a software component capable of transporting itself from one location to another while performing delegated tasks. it is capable of interacting autonomously with foreign hosts while gathering information on behalf of its owner and delivering information and services based on its context-awareness knowledge [19]. because of the limited capabilities of rwt and the restrictions to access the web, mobile agents could play crucial roles in enabling the wot. for instance, any rwt can create a mobile agent, instruct it with specific tasks, and send it to neighbouring or distant rwt. the main goals of such an agent include reporting events, negotiating deals, as well as delivering, promoting, or attracting services to cut operating costs and discovering new partners or proxies. as the rwt contributing to the wot have heterogeneous capabilities, mobile agents should be lightweight to ensure an easy migration from one rwt to another.
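returning to the composition flow of section 4.3 (translate the request, generate options, evaluate them, keep one plan and one backup), the following python sketch shows one way the first three agents could be chained. it is only an illustration under stated assumptions: the request syntax (`"a + b"`), the repository layout (service name mapped to provider trust scores), the additive score, and all function names are hypothetical, and the executor agent (xa) is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    providers: tuple   # one RWT identifier per required elementary service
    score: float       # sum of provider trust scores (hypothetical metric)


def translate(request: str) -> list:
    """TA: turn an external request into a list of elementary service names."""
    return [name.strip() for name in request.split("+")]


def generate_options(services, repository):
    """SGA: enumerate candidate plans from the providers in the repository."""
    # repository maps service name -> {provider id: trust score}
    options = [()]
    for service in services:
        options = [opt + (p,) for opt in options for p in repository[service]]
    return [
        Plan(opt, sum(repository[s][p] for s, p in zip(services, opt)))
        for opt in options
    ]


def evaluate(options):
    """EA: select the best-scoring plan plus one backup plan (None if absent)."""
    ranked = sorted(options, key=lambda plan: plan.score, reverse=True)
    backup = ranked[1] if len(ranked) > 1 else None
    return ranked[0], backup


repository = {
    "speed": {"rwt_1": 0.9, "rwt_2": 0.6},
    "position": {"rwt_3": 0.8},
}
services = translate("speed + position")
selected, backup = evaluate(generate_options(services, repository))
print(selected.providers, backup.providers)
```

enumerating every combination is only workable for small circles; a hcof with limited resources would presumably prune options or use a heuristic instead.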
mobile agents should also abide by the requirements of hosts in terms of security (to avoid attacks), communication protocols, local resources use, and any local operating regulations. to this end, we need an efficient architecture for such agents. this architecture is explained in what follows.

4.5. belief desire intention (bdi) architecture

in order to allow the rwt to reason adequately about occurring events and the dynamic surrounding environment affecting their web access, we propose a belief-desire-intention (bdi) architecture [13] for every agent embedded in a rwt (figure 3, right). in this architecture, beliefs represent the local information that the agent has about itself and its rwt (e.g., its current operations, services, processing capabilities, battery lifetime when applicable, communication protocols, etc.) and the environment (neighbouring trusted and untrusted peers as well as their communication protocols and the services they are providing). beliefs could be true or false and are subject to change. the desires reflect the objectives or the situations that the agent would like to accomplish, while the intentions refer to the actions that the agent has chosen to do. the agent will always be listening to communications from neighbouring and remote peers with whom it has connections (e.g., belonging to the same cof). for any new communication received, a revision function is executed in order to update the current beliefs. based on these beliefs, an option generation function updates the desires of the agent. an action generation function is then applied to deliberate on the new intentions. a plan generation function is finally executed to schedule the actions of the agent and update the beliefs, desires, and intentions accordingly.

5.
socializing the web of things

several researchers have applied the idea of social networking to the iot, arguing that if the iot can be made to imitate the social behaviour of humans then those smart objects will be able to provide a better service than locally connected objects [29]. this results in a new idea called social internet of things (siot) [30]. siot applications can be a valuable resource in several areas, including domestics, business, automation and industrial manufacturing, logistics, and intelligent transportation of people and goods [31]. by analogy, we adopt in this paper the notion of social web of things (swot) where rwt use the social web as a platform to guarantee network navigability (effectively performing the discovery of objects and services), guarantee scalability as in human social networks, and establish appropriate levels of trustworthiness to improve the degree of interaction among things that are friends. the swot is also an open structure where rwt can seek help to find trusted peers for their web connection, particularly if they are not web-enabled. they can also find peers with similar objectives from which they can seek advice about the reputation and trustworthiness of other rwt, and with which they can share operating costs, jointly create services beyond their individual capabilities, mutually delegate tasks, etc. we therefore believe that it is important to adapt existing social theories to the wot context and prepare for an impending shift to an environment where social relations will exist between everything. this shift will also bring the swot to the social web of everything (swoe). in order to enable the swot, it is important to possess efficient tools that facilitate a seamless connection and cooperation among devices and users.
to this end, it is important to leverage modern paradigms like social networks and crowd-based applications, create a platform allowing the development of swot while enabling its relevant business-wise ecosystem, and create data analysis and recommendation techniques that fit the above paradigms and enable useful application creation. re-examining the concept of mashups and adapting them to the context of wot would be an asset.

6. applications

6.1. intelligent web of vehicles

advances in sensing and communication facilities are impelling the evolution of the conventional vehicular ad-hoc networking (vanet) activities to the cloud, thereby creating the emergent notion of internet of vehicles (iov) [32]. in the iov paradigm, each vehicle is potentially involved with heterogeneous devices, communication and networking technologies, service kinds, data formats/contents, accuracy/efficiency requirements, etc. in order to smoothly integrate and connect the rwt and information resources of the iov along with a seamless integration with the social context, we coin the term web of vehicles (wov) that particularly aims to leverage web protocols and technologies for vanet-related devices/objects, while facilitating rapid service generation and sharing. some of the devices on vehicles could be web-enabled and could therefore be endowed with embedded tiny web servers. these devices could play the role of proxies for other devices which cannot connect to the internet. to this end, they may provide them with restful apis for a direct web-based access.

within the context of wov, let us suppose that a commuter wants to reduce his travel time between two given locations. in order to avoid unexpected traffic jams and reduce stoppage time at road intersections, a speed sensor on the commuter vehicle continuously reports information to an onboard decision unit (similar decision units could be embedded in any of the rwt in the wov scenario).
this unit also receives data from distance and environmental sensors as well as information/services from the road infrastructure, vehicles, humans, and sensors in the vicinity. in addition to measuring the distance between the current vehicle and neighbouring objects (vehicles, road infrastructures, etc.), a distance sensor on the commuter’s vehicle could receive measurements from similar sensors on vehicles in the vicinity. these measurements should be cleaned and filtered by the distance sensor in order to assess the position of the vehicle with respect to its neighbouring objects from the side where the sensor is deployed. the sensor should also share useful information in a timely manner with other appropriate rwt on the road. for a better assessment of the situation, all distance sensors on the commuter’s vehicle will collect similar data and submit reports to the onboard decision unit. agent-based techniques (e.g., [27], [28]) could then be used for data filtering and mining purposes on any of the sensors/rwt. as road safety is a shared matter, on-road vehicles have to accommodate each other and mutually exchange contextual information and services on time. examples of services may include vehicle driving conditions (speeding, planned driving directions, alerts on the vehicle about critical situations, etc.), on-road events (traffic jams, accidents, etc.), and professional services (e.g., healthcare if the driver is a doctor or nurse, plumbing, etc.). the vehicles of the wov will create cofs. a cof does not necessarily consist of geographically collocated vehicles. for instance, some vehicles may share the same destination or the same social interests and therefore would like to maintain their cof, although they may be very far from each other because of traffic conditions. for each cof, one vehicle will be elected as hcof using an appropriate clustering technique [33]. this vehicle will maintain the list of services provided by each of the vehicles in the circle.
it can also request services on their behalf and enable them to socially connect with similar vehicles from other circles. the hcof should always stay tuned to the needs of the members of the circle, update their rewards, plan the composition of complex services, etc. to this end, all requests received by the hcof will be translated, when needed, into the internal language and formats by an onboard intelligent agent. service composition will be planned by a special agent based on the current offering, trust, and capabilities of the vehicles in the cof. since some vehicles would be competing to offer their services and increase their rewards, an evaluator agent will fairly and carefully check service composition plans before handing over the approved plan to an executor agent to monitor the required actions. rewards and trust levels will then be updated accordingly once this plan is achieved.

6.2. smart logistics

roughly, logistics is a part of the supply chain process where the forward and reverse flow and storage of goods, services, and related information are effectively and efficiently planned, implemented, and controlled between the point of origin and the point of consumption with the aim of meeting customers’ requirements [34]. the logistics industry is considered a key player currently benefiting from the revolution of iot [35]. for instance, large numbers of a variety of machines, vehicles, and people are daily packing, moving, and tracking millions of freights around the world within complex ecosystems known for their large operational scales and unpredictable spatio-temporal events. iot capabilities help to integrate a wide range of heterogeneous assets and allow them to interoperate in a timely fashion, while creating customized, dynamic, and automated services for their customers.
in order to reap increasing benefits in this context of falling device component prices, devices should be allowed to smoothly connect to the web. the wot is an ideal platform that would allow devices to cooperate in smart logistics scenarios. among the existing applications of logistics, we focus in what follows on the scenario of freight transportation. although it is already possible today to track and monitor a container in a freighter in the middle of the pacific and shipments in a cargo plane midflight, the iot and the wot are expected to provide the next generation of track and trace by making it faster, more accurate, more predictive, and more secure [35]. the spatially distributed sensing devices can be endowed with web-enabled capabilities to consult nearby and remote devices and request specific services of current interest. imagine that a given damaged container had been moved by some trucks before and another truck is going to move it this time. this latter truck may connect to the wot and request some details and recommendations about the best way to transport this container while avoiding problems caused by the already existing damage. since the different trucks could be located in distant regions, efficient communication mechanisms are needed. to meet the above goals, clear and standardized approaches are needed to allow a seamless interoperability for exchanging sensor information in heterogeneous environments. sensors should be able to establish trust relations with a circle of friends in order to overcome some privacy issues in the wot-powered supply chain. in order to clarify these ideas, let us suppose the scenario of figure 4 where rwt are embedded or deployed on several facilities, including ships, planes, containers, cranes, etc.

fig. 4 freight transport scenario (legend: cocof: container circle of friends; tcof: truck cof; scof: ship cof; ccof: crane cof; nodes shown: rwt_truck, rwt_container, rwt_ship, rwt_crane)

these rwt may have different processing, storage, and communication capabilities. because of the highly dynamic environment (e.g., facility movements) and sporadic spatio-temporal events (e.g., accident on the container yard, heavy rain, etc.), rwt have an interest in coordinating their efforts and particularly in benefiting from previous experiences of peers while currently performing similar tasks. to this end, rwt on trucks could create a truck circle of friends (tcof) and rwt on cranes could form a crane cof (ccof). similarly, we can talk about container cof (cocof), ship cof (scof), and plane cof (pcof). let us imagine that a container is being transported by a truck for shipment. an onboard master rwt (let’s call it mco_rwt: master container rwt) is assigned to the control of this container. the mco_rwt could request to join the tcof as a service consumer since its container is being transported by a truck. the mco_rwt also has to communicate with any rwt onboard its container. relevant information could be conveyed in a timely manner within the tcof, for example when goods inside the container have undergone some damage and more careful transportation is required. once the container is deposited in the shipment area, the mco_rwt has to confirm to the master rwt assigned to the crane (mc_rwt) its position as well as its local conditions and parameters. the mco_rwt will then unsubscribe from the tcof and subscribe to the ccof. although only one crane is generally responsible for shipping the container, other cranes could give recommendations based on the current conditions of the container as well as the ongoing environmental conditions. once on the ship, the mco_rwt may connect with other rwt during the marine transport (figure 5). our scenario could also be extended to the phase when the container is on the road.
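the hand-over of the mco_rwt from one circle to the next (tcof during transport, ccof during shipment) can be sketched in python as follows. this is a minimal illustration, not the authors' design: the `Registry` class, its method names, and the use of plain sets for membership are assumptions, and the confirmation exchange with the mc_rwt is not modelled.

```python
class Registry:
    """Hypothetical registry of circles of friends, keyed by circle name."""

    def __init__(self, circle_names):
        self.members = {name: set() for name in circle_names}

    def subscribe(self, circle, rwt_id):
        self.members[circle].add(rwt_id)

    def unsubscribe(self, circle, rwt_id):
        # discard() does nothing if the RWT is not a member
        self.members[circle].discard(rwt_id)

    def hand_over(self, rwt_id, old_circle, new_circle):
        """Move an RWT from the circle of one stage to that of the next stage."""
        self.unsubscribe(old_circle, rwt_id)
        self.subscribe(new_circle, rwt_id)

    def circles_of(self, rwt_id):
        """Sorted names of all circles the RWT currently belongs to."""
        return sorted(c for c, ids in self.members.items() if rwt_id in ids)


registry = Registry(["tcof", "ccof", "scof"])
registry.subscribe("tcof", "mco_rwt")           # container is transported by truck
registry.hand_over("mco_rwt", "tcof", "ccof")   # container deposited in shipment area
print(registry.circles_of("mco_rwt"))
```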
besides, as some goods in the container could travel by air, the same scenario could be extended to air transportation.

fig. 5 a multiagent system architecture for freight transport scenario (groups recovered from the figure: cocof with mco_rwt on container_1, container_2, container_n; tcof [during transport] with mt_rwt on truck_1, truck_2, truck_m; ccof [during shipment] with mc_rwt on crane_1, crane_2, crane_k; scof [on ship] with ms_rwt on ship_1, ship_3, ship_s)

7. conclusion

real world things (rwt) are currently capable of establishing connections to the web, either directly via ip-enabled capabilities or via proxies. however, because of their spatial distribution, heterogeneity, limited resources, and sporadic mobility, maintaining efficient, secure, and durable connections is not straightforward. we therefore presented in this paper some conceptual steps towards enabling the vision of intelligent wot (iwot) and social wot through the use of mas techniques. in order to show the potential of our ideas, we discussed two important application scenarios, namely the intelligent web of vehicles and smart logistics. several issues still need to be addressed in the future to fully implement our vision. in this regard, the mas-based architecture proposed for service composition needs to be refined, implemented, and tested experimentally. then it needs to be extended to address the other challenges presented in the paper, including data processing and storage, networking and communication, and trust, privacy, and security. we also believe that considerable research and development work is needed towards socializing the wot.

references

[1] n. jabeur, h. haddad, “towards an intelligent web of things”, in proceedings of the international conference on recent advances in computer systems racs-2015, hail university, saudi arabia, november 2015.
[2] n. jabeur, n. sahli, s.
zeadally, “abama: an agent-based architecture for mapping natural ecosystems onto wireless sensor networks”, invited paper, in proceedings of 9th international conference on future networks and communications (fnc-2014), elsevier procedia computer science, volume 34, canada, august 2014.
[3] r. rajkumar, i. lee, l. sha, j. stankovic, “cyber-physical systems: the next computing revolution”, in proceedings of the 47th design automation conference. acm, new york, usa, 2010, pp. 731-736.
[4] casagras, casagras final report: rfid and the inclusive model for the internet of things, 2009, pp. 10-12.
[5] z. pang, “technologies and architectures of the internet-of-things (iot) for health and well being”, kth royal institute of technology, 2013.
[6] d. zeng, s. guo, z. cheng, “the web of things: a survey (invited paper)”, j. communications, vol. 6, no. 6, pp. 424-438, 2011.
[7] s.s. mathew, y. atif, q.z. sheng, z. maamar, internet of things and inter-cooperative computational technologies for collective intelligence, n. bessis, f. xhafa, d. varvarigou, r. hill, m. li (eds.), 2013, pp. 1-23.
[8] n. zhong, j. ma, r. huang, j. liu, y. yao, y. zhang, j. chen, research challenges and perspectives on wisdom web of things, journal of supercomputing, springer, 2010.
[9] s. cheshire, d.h. steinberg, zero configuration networking, the definitive guide, o’reilly, 2005.
[10] d. raggett, the web of things: extending the web into the real world, sofsem 2010: theory and practice of computer science, jan 2010.
[11] b. ostermaier, m. kovatsch, and s. santini, “connecting things to the web using programmable low-power wifi modules”, in proceedings of 2nd international workshop on the web of things, 2011.
[12] d. guinard and v. trifa, “towards the web of things: web mashups for embedded devices”, in proceedings of the workshop mashups, enterprise mashups and lightweight composition on the web (mem’09), 2009.
[13] r. t. fielding, architectural styles and the design of network-based software architectures, ph.d. dissertation, 2000.
[14] c. pautasso, o. zimmermann, and f. leymann, “restful web services vs. 'big' web services: making the right architectural decision”, in proceedings of the 17th international conference on world wide web, ser. www ’08. new york, ny, usa: acm, pp. 805-814, 2008.
[15] s. duquennoy, g. grimaud, and j.j. vandewalle, “the web of things: interconnecting devices with high usability and performance”, in proceedings of the international conference on embedded software and systems (icess’09), 2009.
[16] z. shelby, “embedded web services”, ieee wireless communication magazine, vol. 17, no. 6, pp. 52-57, 2010.
[17] g. kortuem, f. kawsar, v. sundramoorthy, d. fitton, “smart objects as building blocks for the internet of things”, ieee internet computing, vol. 14, no. 1, pp. 44-51, 2010.
[18] a. m. mzahm, m. s. ahmad, y. alicia and c. tang, “agents of things (aot): an intelligent operational concept of the internet of things (iot)”, in proceedings of the 13th international conference on intelligent systems design and applications (isda 2013), pp. 159-164, 2013.
[19] a. m. mzahm, m. s. ahmad, a. y. c. tang, “enhancing the internet of things (iot) via the concept of agent of things (aot)”, journal of network and innovative computing, vol. 2, pp. 101-110, 2014.
[20] p. sawyer, a. pathak, n. bencomo, v. issarny, “how the web of things challenges requirements engineering”, in proceedings of the 3rd workshop on the web and requirements engineering at 12th international conference on web engineering icwe 2012, berlin, germany, july 2012.
[21] d. guinard, v. trifa, f. mattern, e. wilde, “from the internet of things to the web of things: resource-oriented architecture and best practices”, in d. uckelmann, m. harrison and f. michahelles (eds.), architecting the internet of things, pp. 97-129. springer berlin heidelberg, berlin, heidelberg, 2011.
[22] d. guinard, v. trifa, e. wilde, “a resource oriented architecture for the web of things”, in proceedings of ieee international conference on the internet of things (iot) 2010, tokyo, japan.
[23] j. yu, b. benatallah, f. casati, f. daniel, “understanding mashup development”, ieee internet comput., vol. 12, pp. 44-52, 2008.
[24] e. wilde, putting things to rest, ucb ischool report 2007-015, school of information, uc berkeley, 2007.
[25] d. guinard, m. fischer, and v. trifa, “sharing using social networks in a composable web of things”, in proceedings of the 1st ieee international workshop on the web of things (wot), germany, 2010.
[26] s. bandyopadhyay and e.j. coyle, “an energy efficient hierarchical clustering algorithm for wireless sensor networks”, proc. of infocom 2003, ieee societies, 2003, vol. 3, pp. 1713-1723.
[27] p. skocir, h. maracic, m. kusek, g. jezic, “data filtering in context-aware multi-agent system for machine-to-machine communication”, in g. jezic et al. (eds.), agent and multi-agent systems: technologies and applications, smart innovation, systems and technologies 38, 2015.
[28] k. a. albashiri, “an investigation into the issues of multi-agent data mining”, ph.d. dissertation, the university of liverpool, liverpool l69 3bx, united kingdom, 2010.
[29] x. hannan, n. sidhu, b. christianson, “guarantor and reputation based trust model for social internet of things”, in proceedings of the international wireless communications and mobile computing conference (iwcmc), 2015, pp. 600-605.
[30] l. atzori, a. iera, g. morabito and m. nitti, “the social internet of things (siot) when social networks meet the internet of things: concepts, architecture and network characterization”, computer networks, vol. 56, no. 16, pp. 3594-3608, 2012.
[31] l. atzori, a. iera and g. morabito, “the internet of things: a survey”, computer networks, vol. 54, no. 15, pp. 2787-2805, 2010.
[32] m. gerla, e-k. lee, g. pau, u.
lee, “internet of vehicles: from intelligent grid to autonomous cars and vehicular clouds”, in ieee world forum on internet of things (wf-iot), 2014, pp. 241-246.
[33] s. vodopivec, j. bester, a. kos, “a survey on clustering algorithms for vehicular ad-hoc networks”, in proceedings of the 35th international conference on telecommunications and signal processing (tsp), 2012, pp. 52-56.
[34] b. tilanus, information systems in logistics and transportation. elsevier science ltd., uk, 1997.
[35] dhl and cisco (2015), internet of things in logistics: a collaborative report by dhl and cisco on implications and use cases for the logistics industry, available at: http://www.dpdhl.com/content/dam/dpdhl/presse/pdf/2015/dhltrendreport_internet_of_things.pdf.

facta universitatis series: electronics and energetics vol. 29, no. 1, march 2016, pp. 11-34 doi: 10.2298/fuee1601011s

mems design simplification with virtual prototyping

renate sitte
griffith university, griffith sciences – ict, gold coast, australia

abstract. mems design requires a good understanding of interactions in complex processes and highly specialized interdisciplinary skills. traditional prototyping is neither easy nor cheap because it typically needs very expensive manufacturing facilities for its implementation. progress towards faster, cheaper prototyping has been achieved but it cannot be applied to mems fabrication in general. this paper analyzes the benefits of virtual prototyping for simplifying and aiding mems design and proposes the continuation of the mems animated graphic design aid (magda) project. its purpose is to simplify preliminary design stages and make mems design more accessible to a wider audience.

key words: mems, scientific visualization, vr-cad tools

1.
introduction and motivation

the purpose of this paper is to motivate making the design of mems more broadly accessible and to give a glimpse and overview of where to start for those who wish to venture into this area. the mems industry has become established since its early days, and many of the papers cited here are pioneering works that were subsequently adopted and laid the foundations of this industry. nevertheless, the production technology options for mems remain vast; there is no “one size fits all”. manufacturing challenges are more the result of a particular innovation of a specific mems than of the production process itself. this is also reflected in the research publications.

received september 19, 2015. corresponding author: renate sitte, griffith university, griffith sciences – ict, gold coast, australia (e-mail: r.sitte@griffith.edu.au)

one of the difficulties in mems design and innovation is that it requires highly specialized skills and a wide interdisciplinary background with experience in physics, advanced mathematical modelling (e.g. for microfluidics), chemistry, materials engineering, and manufacturing technology, to name a few. it requires such skills both for the technology and design of the mems itself and for the science and engineering understanding of the mems’ application niche. these required specializations and skills limit the potential for a broader industrial development. this is because development requires adequate tools with powerful modelling and simulation software to reduce the prototyping and optimization period. the introduction of cad packages was a critical step in the widespread development of vlsi devices and the reduction of the design and prototyping phase [1]. despite the demand, there is a lack of cad tools to aid in the development of mems devices.
there are several packages available and their benefit is supporting the mathematical modelling part, but for a realistic and useful application, they still require a strong interdisciplinary background. in computing, for example, the introduction of icons and the mouse in the early eighties made a huge impact and was a breakthrough in providing shortcuts for recurring tasks like file handling, starting programs, drawing, and visual output. this allowed users to focus more on using the computer than on typing commands for menial tasks. suddenly, it allowed a broader audience to use a computer. we need to be able to bring mems design to a less specialized audience. other engineering disciplines, such as mechanics or robotics, have found their way into early education and entertainment (edutainment). despite their ballooning ubiquity and breakthroughs, for example in biomedical applications, mems are not yet ready for edutainment, which undoubtedly has a favourable effect on a richer understanding of physical cause-and-effect and on shaping the mind in younger years. it will be many years before mems design can be simplified to the point of pick and place on a virtual prototyping (vp) computer screen, with the result seen functioning in 3d and 4d vp. mems can nowadays be made of a range of materials, not just silicon. those materials have different physical properties and behave differently in manufacture and use. therefore a virtual reality (vr) computer aided design (cad) software that can mimic functioning with physically correct results can be the meccano or lego toy for edutainment and discovery (acquiring an intuition) at earlier ages than postgraduates. our aim should be making the whole mems domain more popular. this could be done by bringing it to undergraduate or even final-year high school level with introductory courses and gradually adding more ambitious courses, in a similar way to how introductory mathematics courses are taught early on, shaping the mind.
to achieve that, we need simulation tools that are easy to use and to understand. the lengthy training time needed to handle the software tools and the time their calculations take are in themselves a discouragement. novices have neither the patience nor the maturity to wait for something for which they have neither background nor context. the bottleneck is no longer the computing power but having usable and curiosity-stimulating simulation tools for the uninitiated. our research has developed techniques suitable for virtual prototyping that reduce the calculation time without sacrificing physical correctness. our methods are suitable for initial design that can then be refined with conventional methods. they serve advanced researchers and novices alike. the paper is organized in the following way: overall, we progress throughout the following sections towards mems virtual prototyping. section 2 brings some background and context about the wide range of applications, product ramifications, and variety of problems as mems have evolved in just two decades. section 3 discusses existing tools for mems modelling and simulation and moves into existing cad systems. it explains some of the difficulties and complexities affecting reliability in mems modelling. section 4 looks at the importance of prototyping and its strong potential for innovation. section 5 discusses virtual prototyping as an important and flexible design tool that has not yet really found its way into mems design. section 6 gives a snapshot of our contribution, magda. it briefly explains our fast algorithms that make the difference for speedier virtual prototyping. the last section concludes the discussions with suggestions for future work.

2. background and context

this section looks into the multidisciplinary aspect of mems.
its main purpose is to motivate and provide context for an easier design phase and virtual prototyping (vp), and this is reflected in its literature review. owing to the diversity and amount of published mems material, this paper does not and cannot replace review papers. both mems and vp are extensive disciplines with their own specialized branches. the project of vp for mems is huge and ambitious, and requires specialized topics such as “physically based rendering” or “turbulent flow”, among many others. such topics require in-depth study on their own. the project also has to overcome the old dilemma that engineers are weak programmers and software developers are weak in science and engineering.

mems are minute devices that are in widespread use, for example in airbag triggers, inkjet print heads, and optical, medical and many other applications. with ever more new applications in the r&d phase, the mems industry is strong and growing, particularly in medical and optical applications. by their very nature, mems devices are microscopic and therefore difficult to observe in action. in the macroscopic world of our daily experience, inertia and gravity dominate the motion of objects. in contrast, in the microscopic domain of mems, adhesion and friction are the dominant forces. therefore, mems designers cannot rely on their intuition about how things behave. because of the different dominant forces, mems cannot simply be downscaled counterparts of larger mechanical machines; they require innovative designs and arrangements of their components, whose effects are often not fully understood. for example, a fluid pump with macroscopic dimensions would not function if it were downscaled to a miniature version with microscopic dimensions.

2.1. evolution and rise of mems

mems emerged in the late eighties and nineties with the downscaling of transistor structures into the submicron scale and the perfecting of microlithography patterning.
since those early days, mems have not been confined to the micron range; they can be several millimetres in size. the intention is nevertheless to keep them as small as possible: small means less material and therefore lower cost and more flexibility in their placement. mems materials are no longer limited to silicon; other materials, e.g. polymers or metals, are also used. mems appeared as a new opportunity in microelectronic manufacturing, because many of the fabrication steps and factory facilities of the semiconductor industry could also be used for mems fabrication. liga technology, developed at the fzk, germany [2], for micropatterning precise aspect ratio microstructures with steep trenches or walls, played an important role in patterning microstructures [3]. examples of achievements and benefits in aspect ratio precision with liga are micro-optical devices using filters with submicron-sized structures, waveguides and photonic crystals, or gold gears (luxury watch components) that fit so perfectly that they do not need lubrication [4]. the use of polymers opened a new opportunity for mems. one often speaks of mems as complex devices. however, the structural complexity and the functional complexity of mems [5] can be very different. they can be made of a few simple components that produce a sophisticated function (example: a movable mirror in an optical switch), or of several components in a complex arrangement that perform a simple function (example: a microfluidic pump). owing to their small size and electronic controllability, mems can be built into larger devices, often replacing hitherto large, heavy equipment (e.g. gyroscopes) or saving time and laboratory space in chemical analysis (e.g. lab on a chip).

2.2. impact in medicine

an increasingly important impact of mems is in the medical industry, where it has changed medical diagnostics and surgery in an evolution from microgrippers to endoscopy and robotic surgery.
this in turn has brought in new capabilities, e.g. ultrasonic surgery, microsurgery (for instance in eyes or on embryos), tactile feedback and, with it, keyhole surgery with all its associated benefits [6]. mems’ share in the medical industry alone has grown into a multi-billion dollar industry in less than twenty years. another successful niche for mems with remarkable advances is biophysical applications. for example, margesin et al. designed a mems for measuring the electrical activity and metabolic activity (ion concentration) in a network of neurons using ion-sensitive field-effect transistor (isfet) arrays [7]. a word of caution: microsystems and nanotechnology are often erroneously thought of as being the same. they are not; they operate at different scales of resolution. nanotechnology deals with the molecular and particle level and therefore uses different models; it has different challenges and different industrial potential. however, in microtechnology it is possible to produce nano-sized structures whenever necessary. rieth has written a good introduction to nanotechnology in a nutshell (suitable for advanced readers) [8].

2.3. training and specialization

when it comes to education, vr is nowadays a well-established option in undergraduate multimedia curricula in many tertiary institutions, with some institutes more specialized in vr than others. in contrast, the teaching of microsystems is usually deferred to at least masters level. this is due to the multidisciplinarity required to understand mems. institutes that are known for their excellence in the field also regularly offer specialized short courses. such short courses and summer schools provide an introduction to a specific topic; they are a valuable step towards postgraduate research. programs for short courses can easily be found through international professional organizations.
examples are the fsrm fondation suisse pour la recherche en microtechnique (swiss foundation for research in microtechnology), neuchatel [9], the imec interuniversity microelectronics centre, leuven [10], the ieee [11] and eurotraining [12]. while a researcher or student should stick to reputable, peer-reviewed literature, one must never forget that it is in industry that the results of research come to fruition. there is a wealth of real-life information in industry reports that researchers and students should take advantage of, albeit with some caution. such reports complement research findings by providing eye-opening context.

2.4. mems design

here we present a selection of issues arising in mems design, with the intention of setting the scene for virtual prototyping. mems design can be overwhelming because of the wide, almost infinite range of possible structures and the ways in which these structures work together to provide a useful function. mastromatteo and murari have proposed an architecture to address the diversity of mems by grouping them into the traditional categories moems, mems, lab on chip, rf mems and data storage mems. however, it is not always possible to allocate a mems strictly to just one of these categories, owing to their cross-category functionality [13]. grouping them helps to conceptualize, but it is not a strict definition or standard. spearing analyzes the scaling of mems size in the context of macro versus micro scaling. his work explains the relation between mass-volume scaling and volume-to-area scaling, and summarizes it in a table as guidance for possible scaling in mems design. the work shows how some of the most important effects of scale on mems design or performance cannot be attributed to a single physical factor. it also shows the need for fabrication processes that allow for dimensional tolerance, although this can limit the shapes that can be achieved.
as a consequence, distinct, more expensive materials may be used in mems whose cost would be prohibitive for larger devices [14]. materials play an important role, for example in flexing membranes in micropumps or in cantilever switches. senturia offers an introductory overview of this area, covering structures, processes and modelling [15]. another good source is pelesko and bernstein, who explain structures and device behaviours by motivating and developing understanding and intuition and then moving on to modelling and optimization [16].

2.5. mems evolving manufacturing alternatives

mems are 3d devices. the traditional functional distinction is between sensors and actuators [17]. they can be one single structure or the result of many components, but they all need some circuitry or interaction to control them. in silicon manufacturing, it would save huge fabrication costs if mems and the circuitry to control them could be manufactured together on the same wafer. unfortunately, this is rarely possible because processing steps that involve heat can damage previously completed structures. depending on the materials used, the structures of mems components are either removed from a solid material or built up in a deposition process. this is the area of microlithography and micromachining [18, 19]. there is a good range of introductory and advanced literature available, but publications hardly keep up with the fast advances in mems technology. one reason for publication delay is industrial non-disclosure. overviews of fabrication, materials, processes and micromachining are presented by g. fedder [20], and of design by subramanian et al. [21]. despite mature fabrication processes, new mems applications and innovations in their fabrication constantly present new hurdles that must be overcome. this is illustrated by the design and realization difficulties of a micromachined silicon nanopositioner with electrothermal positioning by zhu et al. [22].
another example is nanochannel fabrication using bond micromachining by j. haneveld [23]. normally, the fabrication of mems is not a simple process. if it is done in traditional silicon technology, it requires a semiconductor foundry capable of handling around 200 processing steps, and very expensive equipment. some processing steps that require very specialized equipment or processing facilities may be outsourced to other foundries. experimental implementations are fundamental for frontier research. owing to the high costs, research centres are often equipped only with whatever funding allows, and sometimes fabrication steps have to be carried out at industrial facilities. collaboration between research institutes and manufacturers that provide prototyping and fabrication services is a convenient way to overcome shortages for mutual benefit, sometimes even sponsoring research [24].

a vast range of technologies has been developed and there is more to come. nowadays mems are made not only of silicon but also of other materials, for example glass, photoresists or other polymers that can be patterned by laser, reactive ion etching (rie) or other technologies. for example, desbiens et al. found that, for prototyping, an excimer laser (uv laser) can be used for the removal of material (ablation) in mems micromachining of 3d structures in the range of approximately 1-5 μm. in a systematic study, they examined the interactions of repetition rate and mask dragging speed as parameters and measured the etch rate on samples of different materials: si, pzt and pyrex [25]. delille et al. have shown how photopatternable uv-sensitive adhesives can be used for patterning up to 1cm thickness. the benefit is that the process is low cost, requires no baking and does not even require a cleanroom. some of these polymers bond irreversibly to glass, and they can be compatible with living cells [26].
the ability to work in any room and under any light condition makes their findings suitable for educational purposes in mems fabrication. material deposition by 3d printing is becoming popular, but everything depends on the purpose of the mems; 3d printing is discussed further below. powering a mems requires some source of energy. for medical mems implants, or in situations where the mems application must be independent of a bulky battery, this means an additional difficulty. this has created a niche for technologies and materials that harvest energy to power a mems. iniewski et al. present a good introduction to such materials and technologies [27]. bermejo and castañer have studied driving mems electrostatic actuators with a direct photovoltaic (pv) source. the benefits are that the number of solar cells can be customized for specific mems switches, and that performance and reliability are improved [28].

3. tools for modelling and simulation

this section explains the evolution of, and need for, cad systems for mems. it also shows with examples the vast diversity of problems that appear and need to be addressed. mems design and fabrication require a range of modelling techniques at different stages. on the one hand, we have the mechanical and circuit modelling for the functioning of the mems. on the other hand, we have the fabrication design and experimentation with physical modelling to find the desired properties of the mems in question. for cad tools, we have to distinguish between the mathematical modelling and the visual images that provide information about what we are modelling. mathematical modelling (mm) is essential in device design, and mathematical models are also the underlying simulation tool for mems cad software. without them, there can be no serious outcome in mems design. mm occurs at different levels. napieralski et al. have produced an interesting work on the evolution of mems and modelling.
they demonstrate how advances in mems technology and in modelling methodologies not only depend on each other but even drag each other forward [29]. lyshevski provides a good introduction to the fundamentals of mathematical and physical modelling in the context of mems and nems structures [30]. these models are necessary to calculate the dynamic behaviour of those structures working together for a purpose or function. in this endeavour, further calculations are needed to solve the resulting differential equations. this is done numerically, using solvers such as matlab™ and, for mems in particular, finite element analysis (fea) calculations. comsol multiphysics™ is a popular and steadily growing environment for calculations using finite elements [31]. another one is ansys™. these are not the only ones; there are other multiphysics solvers. one important step in modelling mems is the order reduction of differential equation systems (in particular non-linear ones) and differential algebraic equation systems. greiner et al. have developed a method to reduce the order (dimension) of finite element models of second order systems, which appears to work well for linear conditions [32]. additional practice in the numerical and experimental evaluation of the mechanical properties of mems and nems is collated by frangi et al. in [33], which also contains an investigation by ananthasuresh of continuous parameterization, the problems that arise and ways of optimization. bechthold et al. have developed a methodology for model order reduction for a range of mems [34]. their method has the potential for an automated implementation. mems involving fluids have a substantial impact in medical applications. fluids play a special role in many microsystems because fluids behave differently in a microchannel than in a macroscopic space.
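as a rough illustration of why fluids behave differently at the microscale, the dimensionless reynolds number, re = (density × velocity × characteristic length) / dynamic viscosity, compares inertial with viscous forces; at low values viscous forces dominate and the flow stays laminar. the sketch below is not taken from the works cited here: the water properties, channel sizes, velocities and the approximate laminar limit of 2000 for pipe-like flow are assumed values chosen only for illustration, and the function names are hypothetical.

```python
"""Illustrative comparison of flow regimes in a microchannel and a macroscopic pipe."""

# Assumed properties of water at room temperature (illustrative values).
WATER_DENSITY = 1000.0     # kg/m^3
WATER_VISCOSITY = 1.0e-3   # Pa*s (dynamic viscosity)

# Assumed rough upper limit for laminar flow in a pipe-like channel.
LAMINAR_LIMIT = 2000.0


def reynolds_number(density, velocity, length, viscosity):
    """Return Re = density * velocity * length / viscosity (dimensionless)."""
    return density * velocity * length / viscosity


def flow_regime(reynolds, laminar_limit=LAMINAR_LIMIT):
    """Classify the flow very coarsely from its Reynolds number."""
    return "laminar" if reynolds < laminar_limit else "transitional or turbulent"


if __name__ == "__main__":
    # Hypothetical cases: (label, mean velocity in m/s, characteristic length in m).
    cases = [
        ("microchannel, 100 um wide, 1 mm/s", 1.0e-3, 100.0e-6),
        ("pipe, 10 cm diameter, 1 m/s", 1.0, 0.1),
    ]
    for label, velocity, length in cases:
        reynolds = reynolds_number(WATER_DENSITY, velocity, length, WATER_VISCOSITY)
        print(f"{label}: Re = {reynolds:.3g} ({flow_regime(reynolds)})")
```

with these assumed values the microchannel case lies far below the laminar limit, whereas the pipe case lies far above it, which is one reason why mixing and pumping concepts from the macroscopic world do not transfer directly to microfluidic mems.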
design considerations and microfluidic behaviour require special mathematical and modelling skills in a different physical domain. nguyen and wereley provide a good introduction to this domain and to microfluidics in mems [35].

3.1. reliability

reliability is defined as the time before failure. this quantity has been used for decades, but on closer observation it does not give any indication of why the device fails. this is aggravated in mems, which comprise not just the microelectrical circuitry but also the mechanical part that goes with it. microscopic structures function in different physical domains than macroscopic devices. it may not be easy to pinpoint the source of functional failures, because the dominant physical forces change gradually as the geometric dimensions increase or decrease. because the change is gradual, it may not be evident how much is due to, say, adhesion, capillarity or any other force, so it is appropriate to consider a more structured approach. to address this issue, we have developed a hierarchically structured reliability model that allows different failure weights to be given to different components. some components are more robust (or vulnerable) in their design than others. likewise, some materials are more robust (or vulnerable) than others, and some assembly or manufacturing processes are more difficult (vulnerable) than others. our model allows different combinations of options for design, materials and manufacture or assembly to be assessed and weighed a priori [36]. in a somewhat similar way, muratet et al. have focused on failure analysis, given that the vast variety of structures in mems represent different points of material weakness and/or design failure. to demonstrate this, they have developed a time-before-failure prediction model and illustrated the procedure by implementing a wobble electrostatic micromotor as an example.
they use failure testing (including failure criteria and conditions) and combine these observations with fea simulations (by including failures in the simulations), from which they identify risk conditions (deformation, stress) and derive the time-before-failure model [37].

3.2. cad systems

in the early days, a relatively small number of mems design software environments were available on the market. their application potential was largely restricted to modifications of existing library designs. dewey et al. used an analog hardware description language (vhdl-ams) for their project visual integrated-microelectromechanical vhdl-ams interactive design (vivid) [38]. often the tools were by-products of code written for the design of another specific project [39,40] and were difficult to use [41]. they appear as a collection of tools [42], sometimes limited to specific applications [43]. one of the first cad systems for mems was memcad, built at mit in the late nineties [44]. since then many more systems have been brought to the market; some of them have disappeared, while others have evolved with state-of-the-art facilities. few have the calculation of mems manufacturing parameters as their primary purpose, and those that do are more often than not beyond the reach of researchers because of their high cost [45, 46]. owing to the amount of calculation involved, the development of a cad tool is not an easy task, in particular for mems. this is caused by their multifaceted, multivariate nature. we are dealing with 3d mechanical devices with critical timings (4d) and acting forces, which may sense or perform chemical or spectral analysis or pattern recognition, all adding to the complexity, and at design time most of this is squeezed into multiple representations on a 2d screen display. to aid imagination and interpretation, schematic drawings have progressively grown into cad tools.
these in turn have diversified for specific application niches, with the purpose of being fed as a run specification program into a microlithography or micromachining tool, e.g. a laser. the reality is that design is usually a back-and-forth between simulations and the development of prototypes. it appears that a single streamlined workflow that goes from the cad drawing board to the fabrication of a prototype does not yet exist. in an early endeavour towards optimum design, gaddi et al. developed a framework for a top-down design approach based on ic design and electrical and mechanical parameters. their aim was a hierarchically mixed design environment, using fea for validation [47]. this is unusual, because fea is normally used for calculating optimal parameter combinations in a systematic set of simulations, not the other way round. this model appears to be limited to silicon technology. one very early cad tool was developed by dasigenis et al. their cad tool recycles a previous mems converter design, allows it to be updated to a new design and produces its new processing parameters [48]. this approach is devoid of any modern user- or menu-driven software or architecture facilities. it equates to building up a library from scratch each time another device is designed using a new technology or new materials. a classic computing simplification approach was chosen by bardohl et al., who used graphs (that is, the graph description of images) and transformed them into sets of reduced graphs [49]. it is questionable whether these transformations for reducing information handling are efficient or even practical in a mems cad application. in a biometric approach to manufacturing biomedical microdevices, hengsbach and díaz lantada have produced a multiscale biomedical microsystem for addressing the effect of surface texture on cell mobility.
the purpose was to fabricate multiple length scale geometries that allow interactions of implants with living tissues. they used a laser writer for the device structure (several mm) and a direct laser writer for finer, submicron-sized details. one problem that arose with the fine textures and microstructures was the cad file size, ranging from several hundred mb to some gb, which in turn increased the fabrication time excessively. the solution was to move from a descriptive geometry given as sets of layers to an algorithmic geometry, by mapping a grid of channels as fractal surface functions to a matrix. this reduced the fabrication time by more than one order of magnitude [50]. another problem that requires attention is that different physical or chemical phenomena must be simulated in the design of mems. that means several suites of solvers, and because they require surface or volume meshing for their calculations, they are slow and not suitable for interactive design. this is an inconvenience with all cad products for mems currently on the market, for example coventor™ [51], memspro™ [52], tanner™ mems design flow [53], intellisuite™ [54] and others. they differ in capabilities, interactive facilities, speed and popularity. some of them are sophisticated, but all require a good understanding of physics, mathematics and engineering, knowledge of mems design, and training time to use the cad package efficiently. computing power has not yet resolved the problem of speed, the very essence of rapid prototyping.

4. prototyping

for larger systems, industrial rapid prototyping has played a substantial role in the development of new articles. a prototype is a model of a device with emphasis either on replicating the functioning and scale (dimensions) of the intended device or on studying its production feasibility.
if the aim is to study the functioning, then neither the materials nor the production process need to be the same as those intended for regular fabrication. however, the closer to reality, the better the prototype. if the aim is to study the feasibility of a certain production process, then the intended production process must be replicated. typically this is a “no frills” approach aimed at simplification, be it a reduction in fabrication time, in the cost of materials or in the need for expensive equipment. prototyping is aimed at answering the questions “can it be done?” and “does it do what it is meant to do?”. the answer must come fast, before large sums are invested in production. this is why rapid prototyping has evolved over the years. even rapid prototyping requires, to some extent, a worked-out design. a prototype serves to eliminate design flaws or unnecessary costs early on. “no frills” means that the focus is on the functioning of a device for its intended purpose, the famous “fit for that purpose”. as an analogy, there is no point in modelling comfortable seats for an airplane that is unable to take off and fly. it must fly! that is its purpose. it was the search for faster, cheaper prototyping that enabled the evolution of mems from silicon manufacturing to other materials and processes. this happens when a mems prototype turns out to be so satisfactory that the initial experiment reaches production maturity almost instantly. it is triggered by the insight that the originally intended materials or production process can be replaced with the cheaper ones used in the prototype. this has led to an explosion of alternative mems materials and technologies, pushing innovations and applications further, from initially expensive devices to cheap single-use medical mems products. in what follows, we illustrate this with selected examples in a brief journey through time.
in the process of rapid prototyping, specific tools are sometimes required to be able to see small structures in mems. one such tool was “small spot” stereolithography, but it was insufficient for small structures and was being replaced by microstereolithography, which was not yet fully developed. bertsch et al. [55] conducted a comparative analysis of these different types of stereolithography and their suitability for mems prototyping. the resolution of conventional stereolithography was too limited for small mems structures. however, the later integral microstereolithography, with a resolution at least an order of magnitude better than small spot stereolithography, turned out to be particularly suitable for manufacturing complete layers with small 3d structures of 0.05 to 0.2mm, but without high aspect ratio. lin et al. used a thin layer of baked-on photoresist instead of a conventional mask on soda-lime glass substrates to produce microfluidic channels approximately 36 μm deep. a two-step baking process ensured good adhesion of the photoresist to the glass. this was then etched in an iterative progression of wet etching, with dipping and etching under ultrasonic agitation, which led to smooth etching results. the process was aimed at fast prototyping and mass production of microfluidic systems. after successful etching, the microfluidic channels were sealed with glass chips at 580 °c. the whole process was done in ten hours [56]. a similar technique using photoresist was developed by sampath et al. [57] to produce freely moving structures. the authors used a 20 μm layer of patterned photoresist (su-8) to form an insulating spacer layer on a silicon wafer. then they used wafer bonding to apply a 50 μm layer of crystal silicon on top of the insulating spacer. this was followed by patterning the crystal silicon layer with rie to produce the desired structures, in this case a spring and a piston.
the difficulty was to achieve a tight bond between the photoresist and the crystal silicon layer: the thickness of the photoresist is critical and must be produced precisely, but the photoresist is thermally sensitive and can crack in the silicon patterning steps. in mems manufacturing and prototyping we often see additive (building up in layers) and subtractive (removal of material) processing steps used to achieve the desired structures. these fabrication methods allow alternative materials, often polymers, and they do not need cleanrooms. they are faster, often cheaper alternatives to traditional silicon wafer processing. li et al. have adapted shape deposition manufacturing (sdm) to microfabrication by developing an ultrasonic-based micro powder-feeding mechanism for precise microdeposition of dry powder onto a substrate. the powder patterns were then sintered with a micro-sized laser beam to clad them onto the substrate [58]. khoury et al. used liquid phase photopolymerization for ultra rapid prototyping of structures that are suitable as masters for micromoulding microfluidic channels. the process is suitable for lab on a chip mems used in life science, where fluids in very small quantities are used, mixed, cultured, etc. and discarded. for the process, the authors used a multichannelled universal cartridge as master. the cartridge was filled with fluid photoresist, and unwanted parts were masked off before exposure to uv to harden the desired channel geometry. the remaining structure was then rinsed, leaving the desired channels open. the process is also suitable for the fast production of microfluidic devices without micromoulding [59]. high aspect ratio (structures with deep, narrow trenches with straight walls) is a specific niche in mems.
the processes that can be used for prototyping and producing mems that require high aspect ratio depend on the materials used and consequently on the intended application and life span of the mems. sarajlic et al. have used plasma processing with low pressure chemical vapour deposition (lpcvd) for high aspect ratio trenches and, after a few more processing steps, the “black silicon method” (bsm) for patterning, passivating and release with isotropic plasma etching [60]. the benefit is that bsm drastically simplified the processing by keeping everything in the same run, that is, in the same vacuum chamber. this made it suitable for rapid prototyping. in an endeavour to find alternative micromachining for producing a polymer-based capacitive micro accelerometer, yung et al. [61] used direct write laser ablation (removing material by laser sublimation). they have shown that this is a more convenient and suitable technology than traditional lithography methods. this is because traditional lithography as used in semiconductor manufacturing requires expensive equipment and expensive masks, which is justified for mass production but not for the smaller production runs of some mems. the cheaper laser ablation is ideal for non-mass-produced mems, and it is also much simpler because it allows other materials. abdelgawad et al. [62] have developed a cheap technology to produce actuators with 50-60 μm electrode separation that allow droplets of 1-12 μl of fluid to move, merge or split. they use digital microfluidics, electrowetting and electrophoresis to measure enzymatic activity (enzymatic assays). what makes their work so different is that they did most of this with very cheap resources, i.e. recycled circuit boards and compact disks (cds) for gold and metals. for electrode patterning they used an ink pen, and ink masking made with a razor blade, instead of expensive photolithography with uv exposure.
for the dielectric coating, they used cling wrap (plastic film used to cover food). for the protective hydrophobic treatment, they used a cheap car windshield protector instead of expensive, licensed teflon. they successfully realized their experimental work for prototyping. reading about this work is not just inspiring; it is also highly commendable for education. in striving for rapid prototyping of precise submicron and nanogaps, villarroya et al. have experimented with combining a focused ion beam (fib) with subsequent reactive ion etching using aluminium masking. their process achieved trenches 80nm wide and 11nm deep. the goal was to produce nanodots [63]. microfluidics presents another challenge to rapid prototyping. the quantities of fluid used in the end product are minuscule, and in biomedical applications they are used briefly and discarded. this requires large quantities of mems or nems to be produced, which makes conventional manufacturing in silicon unattractive because of its complicated and slow production and the expensive equipment of a foundry with a cleanroom. do et al. have developed a process using a cutter plotter (a “printer” that removes material) to scratch or cut through a polymer substrate, patterning the structures layer by layer with holes and trenches. the polymer sheets are then stacked one upon the other (like pancakes) and bonded. the arrays of overlapping holes form the containers for the fluids. this process achieved channels 20 μm wide and 30 μm deep in less than 30 minutes. it is fast and does not need cleanroom conditions [64]. another interesting case is printing mems onto paper. in a feasibility study, meiss et al. have developed a method and a special ink to print resistive sensors onto paper substrates using inkjet equipment. the technique can be used in the iterative development and complex model design of sensors for low-cost applications, such as medical disposables or consumer goods packaging [65].
speed is paramount in prototyping. 3d printing has had a substantial impact on rapid prototyping because it is a fast way of building up structures. by adding layer after layer, fine structures can be produced using materials like plastic and metal. lifton et al. researched 3d printing and compared it with other technologies. they found that for some mems that are currently silicon-based, the production time can be drastically reduced by replacing long-cycle prototyping and packaging loops with 3d printing. there is the potential to use 3d printing for electronic packaging of mems devices at the wafer stage. 3d printing is suitable for features larger than 1m, such as lab on a chip. however, 3d printing is not suitable for structural elements such as cantilevers and springs because the polymers used are incompatible with their desired functions [66].
5. virtual reality prototyping
virtual prototyping has been around for over a decade. it became possible with increased computing power and faster vr algorithms. its application in mems is rather limited, which is understandable given the complexity of manufacturing. cecil et al. presented comprehensive research on virtual prototyping [67]. their work was aimed at vp in general, not specifically at mems, but the explanations are equally relevant to mems vp. jiang et al. developed a proposal for a service driven mems cad design tool. contrary to traditional bottom up approaches, the authors argue for a top down approach. the project was aimed at designers with little knowledge of mems manufacturing process technologies and at the requirement to detach mems design from its fabrication. the authors produced a partial software prototype whose output consists of bond graphs [68]. schröpfer et al. went a step further and presented an overview of the different modelling levels in mems and the cad tools that are relevant to these modelling levels. it serves both mems and ic designers.
the authors also analyze the differences and benefits of their applied behavioural modelling compared with the two popular modelling approaches, fea and the boundary element method (bem). another important feature is the use of voxels (think of pixels in 3d) instead of pixels for their displays to facilitate 3d animations [69]. cecil et al. proposed a virtual reality-based environment for micro assembly (vrem) that is linked with the physical manufacturing. the software for the vr environment mimics and displays on screen the tools and movements from the point of view of an operator. an automated assembly sequence generator uses genetic algorithms to optimize assembly sequences. the outcomes from the virtual environment aim to produce a validated schedule for the fabrication of a mems, and the assembly instructions for the physical part (tools and autonomous robots) to be assembled by the available work cell resources. some examples of vrem have been developed as prototypes [70]. despite the potential of current computing power, virtual reality with animations is still non-existent or very limited for mems. in 2001 we initiated our mems animated graphic design aid (magda) project [71]. this project is summarized below, and specific examples are shown further down. magda aims to provide starting design templates for mems structures. these design templates give priority to those parameters to which the proper functioning of the mems is most sensitive. it aims to provide a library of typical designs that can be changed further, in a similar way as dewey et al. did in their project visual integrated-microelectromechanical vhdl-ams interactive design, vivid [38]. the difference is that our starting point is based on theoretical mems whose design templates summarize the features of typical classes of mems.
this provides a more general starting point with more freedom, while in vivid only those designs can be used that are available from the foundry cell libraries provided by the designers of the software. the motivation for our more generic template is to start with a “feasible” mems. we use chua’s notion of “local activity” [72] to step into the design of a mems whose internal complexity (hence its detailed mathematical modelling) can be deferred to a subsequent stage of fine-tuning. this is important to bring mems design closer to less specialized designers and novices, and still offer (albeit limited) understanding and learning of cause and effect in mems parameters.
6. the magda project
mems animated graphic design aid (magda) is our virtual prototyping project that aims at building a simulation environment to aid in the design of mems. the purpose of magda is to overcome the weaknesses of commercially available cad software. specifically, it aims to overcome the weakness in interface usability by simulating the functioning of mems interactively and by producing animated vr visualizations. it aims to contribute to the mems industry in a similar way as the introduction of cad packages, which was a critical step in the widespread development of vlsi. in its implementation, magda acts as a layer between the user and existing cad solvers currently used in mems design, with a capability for calculations of its own. figure 1 shows the basic organization of magda. magda is an ambitious project that was initiated in 2001. it has attracted postgraduate students and international exchange students for research and implementation. to illustrate why it is an ambitious project, we look at selecting, amongst the many collision detection algorithms, one that is suitable.
what adds appeal to it is the vr placing of different shapes in different conditions and arrangements, for example fitting a gear or a spring into a device being assembled. magda is about manufacturability, which affects the suitability of the models that can be used; it neither uses nor replaces existing commercial products or finite element calculations. the objective of magda is to calculate faster for interactive early design, narrowing down to a desired range of parameters and functionality towards a prototype. the result of magda can then be used for further fine tuning with finite elements or other mathematical models. such a system must have several major components that are interrelated with each other. for the virtual prototyping project, this is a huge task that must be broken down into smaller, more manageable parts. we do this by following the naturally given classification of actuators and sensors according to their operational principles. however, we cannot isolate the design of each class of sensor, because that would defeat the overall purpose of providing a design tool that offers flexibility and allows for innovation, perhaps across different technologies. this has been exemplified in the previous sections of this paper. therefore the virtual prototyping facility must be able to combine different technologies and at the same time work in concert with the different components of the mems to be designed. magda is not intended for virtual manufacturing; this is a different niche altogether. an exception is virtual etching, because the different types of etching affect the shape of material removal and consequently the shape of the object (straight or curved corners, edges and shapes). for the software development, matlab™ and c were used for the physical shape design drawing board, and vrml for the visualizations. the benefit of matlab is that it can be used on windows and unix os.
it is widely available, affordable, and it has good graphics facilities. for the control and interaction of the mems (systems modelling), simulink is suitable. an additional advantage of using matlab is that mems design engineers can link the virtual prototyping with their earlier calculations and results if they were done in matlab. however, one severe problem with matlab is that it is not a stable software. as we have regrettably experienced, it suffers from version changes and upgrades that are not backward compatible, sometimes rendering existing software useless.
fig. 1 magda organizational diagram.
6.1. visualizations and animations
in our magda vr visualizations, we use physically based rendering. most of the code is written in vrml. we also use transparency for flexibility and easier understanding of the devices in 3d visualizations. visualizations can be rotated for easier inspection from different angles. images on screens are two-dimensional arrays of pixels, sometimes representing 3d and moving structures. representing specific movements by a series of lights (pixels), some flashing in alternation with each other so that the whole series appears to move and change in a specific direction, is not trivial, because its outcome depends on a range of “by-effects” that affect the visual perception in either good or bad ways. a well known example is wheels (or gears) rotating “backwards” while the object to which they are attached moves forwards. one of the main purposes of magda is to show animations of a functioning mems in scaled observable “real time”. this can include components simultaneously moving at cycle times that can differ by orders of magnitude, for example a gear rotating, a cantilever flipping and a membrane bending. therefore, animations cannot be a simple zoom in time, because this would cause too much distortion between moving parts with different motion rhythms.
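the “backwards” wheel effect mentioned above can be made concrete with a small calculation. the following python sketch is only an illustration of temporal aliasing, not part of magda; the function name `apparent_rate` and the example rates are hypothetical, and it assumes that the viewer perceives the alias of the true rate that lies closest to zero.

```python
import math


def apparent_rate(true_rate_hz: float, frame_rate_hz: float) -> float:
    """Apparent rate (Hz) of a periodic motion shown at frame_rate_hz.

    A motion sampled at discrete frames cannot be distinguished from one
    whose rate differs by a whole multiple of the frame rate. Assumption:
    the viewer perceives the alias closest to zero. A negative result
    means the motion appears to run backwards.
    """
    nearest_multiple = math.floor(true_rate_hz / frame_rate_hz + 0.5)
    return true_rate_hz - nearest_multiple * frame_rate_hz


if __name__ == "__main__":
    # Hypothetical example: three periodic motions shown at 10 frames per second.
    for rate in (9.0, 10.0, 11.0):
        print(f"true {rate:5.1f} Hz -> apparent {apparent_rate(rate, 10.0):5.1f} Hz")
```

with a frame rate of 10 hz, a 9 hz motion appears to run backwards at 1 hz, a 10 hz motion appears frozen, and an 11 hz motion appears to creep forwards at 1 hz. parts whose cycle times differ by orders of magnitude therefore cannot all be rendered faithfully with one fixed time scaling.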
while observing the movement of one component in slow motion, another one could come to a standstill. we have to be aware that we are performing animated visualizations of simulations that must strictly map to their object’s physical behaviour without ever degrading to a cartoon. we have addressed this problem by simulated stroboscopic illumination with flexible fine tuning of its two virtual stroboscopic flash parameters: duration and interval. this is necessary to overcome specific undesirable visual side effects (jumpy or flickering images) and hardware influences such as pixel size and computing latency effects [73, 74], and to provide a smooth observable animation. if, for example, in a visual experiment the thickness of a micropump membrane as shown in figure 2 is changed, the two stroboscopic parameters can be reset by moving a virtual slider on the screen, to bring the new conditions again into a smooth, non-flickering animation. this makes magda different from other simulators.
fig. 2 interactive vr environment showing a micropump with flexing membrane, flow and user controls [73].
in what follows, some of magda’s research results and the difficulties they overcome are briefly presented. for an interactive system, fast response is paramount [71]. much of mems physical modelling is done with finite elements. despite substantially increasing computing power, they are still too slow for interactive modelling. there is another issue: the physical domain. mems can be microscopic or macroscopic. the boundary for the separation of the dominant physical forces (e.g. inertia and gravity vs. adhesion, capillarity etc.) is hazy to say the least.
this is crucial for the distinction between fluidic and microfluidic modelling, because the viscosity and channel materials affect the slip length, which in turn affects the reynolds number and depends on the pumping speed or flow rate and on physical characteristics of the channel such as size, hydrophilicity or hydrophobicity [75]. in addition, a novice mems designer would rarely be familiar with the rather specialised topic of navier-stokes equation systems for fluid modelling. in a systematic fea analysis, we simulated microfluidic flow by varying a set of parameters stepwise to find the distinction between laminar and turbulent flow [76]. such subtle details affect the mathematical modelling, hence the outcome, and this is important in the vast area of chemical and medical analysis. the interactive vp environment must be equipped with recommended model guidance (e.g. a pop up alerting the user to turn on a menu for specific parameter setting combinations).
6.2. fluid flow
in a microchannel, the fluid is flowing at very high velocity. this velocity is not uniform throughout the channel: the fluid flows at different rates in different regions. for example, in the centre of a square section microchannel with 152 µm sides, the flow has a velocity of 8.3e10 m/s, while towards the channel walls the velocity drops to 2/3 of the maximal velocity, and touching the walls it flows at only ¼ of that velocity. this velocity reduction is due to an electric friction with the walls of the microchannel, which pulls in the direction opposite to the flow and is induced by the high velocity of the fluid flow in the channel. in our research, we have been looking for valid replacements for finite element calculations because they are too slow. we have investigated new models for laminar and turbulent flow of microfluidics in a channel, for example how to model an inversion layer in a channel. we use a layer model for the different velocities as if they were distinct strata.
this is shown in figure 3. the layers next to the channel walls rub against them, producing friction, and to a lesser extent they slow down the adjacent layer, which in turn also exerts friction on the next layer, and so on. in the centre, the particles move at high speed because there is little or no friction anymore. in the outer layers, the particles move much more slowly due to the friction with the channel wall.
fig. 3 vr simulation of fluid entering the channel and formation of the bullet nose as it moves at different velocities (coloured layers). the vertical stripes of the flow are to distinguish the movement. to the lower right are user visualization controls (blue/green) [73].
our aim was to model the different layers of fluid as an electrical network. to do this we modelled the flow segmented into layers according to the pertinent models. we first used a continuum model (euler and navier-stokes) for incompressible flow (liquids). this was done by solving the navier-stokes equation, obtaining an analytical model for the circular and a numerical model for the rectangular channels. these were then used to model the layers as an electrical network in matlab simulink. the resistances of the layers are obtained from the velocity profile of the flow. compared with ansys, our electric network model for the circular microchannel gives percentage errors of up to 6.6%, and compared with the hagen-poiseuille equation, the error is below 5.22%. one must bear in mind that the error of ansys itself can reach up to 10%. this is a satisfactory result for a faster model that requires neither meshing nor lengthy iterative calculations [77, 78].
6.3. turbulences
turbulences are an important phenomenon in fluidic mems design; they may be desired (e.g. for mixing fluids) or undesired (e.g. for medical implant medication dispensers). turbulences have several phases in their existence: a beginning, a movement, and an end phase. they can move in rotational or undulating movements.
initially the velocities of the fluid can be rendered with larger patches of colour, while as the turbulence sets in, the patches become increasingly smaller. this is because a turbulent diffusion process is ongoing, but the diffusion is slow, following the swirls and eddies that characterize turbulent flow [79]. the strictly layered flow that occurs in non-turbulent fluid starts mixing, and some parts will move faster, some slower, across the channel. we have developed the cluster splitting method for displaying turbulences in a microchannel. our method is suitable for fast calculations and virtual reality visualizations in an interactive cad tool with a 2d display. instead of calculating and recalculating all the nodes in a mesh as in finite elements, our method takes advantage of redundancies. for graphic visualizations, we do not necessarily have to go down to the level of atoms or molecules. our objects of interest can be composed of macroscopic particles or clusters that interact in ways similar to smaller particles. moreover, by staying in the potential domain instead of the force domain, physical approximations can be made that simplify complex and lengthy calculations. we use the lennard-jones potential model, but instead of individual particles, we use clusters of particles [80]. we start at (t0), when the fluid is pumped with a given force into a nozzle and the microchannel, with larger clusters of particles (think of circular droplets) that move with equal speed and direction in the stream pumped through the channel. after a time (t1) we divide each cluster in half (t2), calculate, divide again (t3) and so on [81].
fig. 4 cluster splitting [81]. upper: model diagram (1:2 split); lower: simulation (1:4 split).
the total time is the sum of a well-known geometric progression. ideally, by just dividing each cluster into two we save 50% of calculations.
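the geometric progression behind cluster splitting can be illustrated with a few lines of python. this is a sketch under stated assumptions, not the implementation of [81]: the function names, the starting count of three clusters and the split ratios are hypothetical, and the cost of a time step is simply assumed to be proportional to the number of clusters present in that step. it does not attempt to reproduce the 50% figure quoted above.

```python
def clusters_per_step(initial_clusters: int, split_ratio: int, splits: int) -> list:
    """Number of clusters present at t0, t1, ..., after each 1:split_ratio split."""
    return [initial_clusters * split_ratio ** k for k in range(splits + 1)]


def total_cluster_updates(initial_clusters: int, split_ratio: int, splits: int) -> int:
    """Sum over all steps, using the closed form of the geometric series.

    Assumption: one update per cluster per step, so this sum stands in
    for the total calculation time.
    """
    if split_ratio == 1:
        return initial_clusters * (splits + 1)
    return initial_clusters * (split_ratio ** (splits + 1) - 1) // (split_ratio - 1)


if __name__ == "__main__":
    # Hypothetical run: 3 starting clusters, each split 1:2, three times.
    print(clusters_per_step(3, 2, 3))
    print(total_cluster_updates(3, 2, 3))
```

because the early steps involve only a few large clusters, most of the work is concentrated in the last and finest step, which is what makes a coarse-to-fine scheme attractive for interactive display.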
in reality the calculation takes slightly more because the calculation times have to be added in both cases, cluster splitting and fea [82], for comparison. in this method, calibration is required for different materials; this becomes part of the data library. for the example shown in figure 4, we used three layers and progressively reached finer cluster granularities that are well suited to show the bullet nosed fluid flow in the channel. our calculation of the channel used 6000 clusters in our worst-case dynamic simulation examples. the corresponding fea calculations used 90000 nodes for a static image.
6.4. flexing movements
another line of research aimed at developing faster models for magda concerned flexing membranes and cantilevers. normally these components are also calculated with finite elements. we derived faster models using splines. our parameters were material, thickness of the membrane and size (diameter, or length of the cantilever). these were fed into ansys, and the values obtained were then imported into matlab, where splines and quadratic polynomials were fitted to them. the equations describing the curves are then obtained, as well as the coefficients and errors of the structures. the process involves dividing the surface into three regions or segments of curvature. figure 5 shows the difference between the real flexed membrane and the calculated values at maximum deformation. the obtained errors are still within the errors of ansys. for the purpose of magda, our models can be repeated by systematic stepped analysis and then bundled and simplified into a more generic model with simple parameter input [83].
fig. 5 membrane flexing modelled as three different segments and using spline approximations (red: actual, blue: simulated).
6.5. virtual etching
again, after an in-depth comparison of available software techniques, we found that the main problem is that they use finite elements to calculate material removal. again, this is not suitable for interactive vp because with current hardware it is still too slow. etching performance is well known from integrated circuit processing, but it is not so predictable in mems because the shapes are more complex. underetching is not desired in ic technology, but it is crucial in shaping and releasing mems structures for free movement. the preparations for animated anisotropic etching, for both wet and dry etching, are relatively straightforward, but isotropic etching requires a more sophisticated approach.
fig. 6a etching square mask wire mesh obtained with marker string method, 2d view [84].
fig. 6b etching square mask wire mesh obtained with marker string method, 3d view [84].
fig. 6c etching square mask (marker string method) rendered, 2d view [84].
fig. 6d etching round mask (marker string method) rendered, 3d view [84].
for visual simulations of isotropic etching we use a marker/string method for the progressive mesh as a faster method suitable for interactive design [84]. the method is not widely known for etching but has been proposed for modelling other ic processing [85]. the model never took off due to a problem with swallowtail conditions that appear at corners. we have found a way of overcoming swallowtail conditions, and we are also able to simulate underetching. fig. 6a and 6b show the wire meshes obtained in the progress of etching using a square lithography mask calculated in 2d then rotated, and a square mask calculated in 3d, respectively. the method can be extended to larger material removal cad visualizations. this is a crucial step towards filling a long existing need in virtual prototyping. figures 6c and 6d show rendered images (using the wire meshes calculated earlier) for etching with a square and a round mask respectively. transparency is part of magda visualizations, to allow better perception of ongoing processes. our marker string method can be adapted for direct laser writing (dlw).
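the basic idea of a marker/string front can be sketched in python as follows. this is a generic textbook-style illustration in 2d, not the method of [84]: the function name `advance_front` and all parameter values are hypothetical, the etch rate is assumed to be the same in every direction (isotropic), and the sketch deliberately omits the removal of swallowtail loops at corners and any treatment of underetching.

```python
import math


def advance_front(markers, etch_rate, dt):
    """Move every marker of a closed 2d etch front outwards by etch_rate * dt.

    markers: list of (x, y) points ordered counter-clockwise (at least 3,
    with distinct neighbours). The local normal is estimated from the
    two neighbouring markers.
    """
    n = len(markers)
    step = etch_rate * dt
    advanced = []
    for i, (x, y) in enumerate(markers):
        prev_x, prev_y = markers[i - 1]
        next_x, next_y = markers[(i + 1) % n]
        # Central-difference tangent along the string of markers.
        tx, ty = next_x - prev_x, next_y - prev_y
        length = math.hypot(tx, ty)
        # Outward normal of a counter-clockwise curve: tangent rotated by -90 degrees.
        nx, ny = ty / length, -tx / length
        advanced.append((x + step * nx, y + step * ny))
    return advanced


if __name__ == "__main__":
    # Hypothetical round mask opening: 64 markers on a circle of radius 1.
    circle = [(math.cos(2 * math.pi * k / 64), math.sin(2 * math.pi * k / 64))
              for k in range(64)]
    etched = advance_front(circle, etch_rate=0.1, dt=5.0)
    print(math.hypot(*etched[0]))
```

for a round front every marker simply moves radially, so the radius grows by the etch rate times the time step. at the corners of a square mask the advanced markers can cross over each other, which is the swallowtail problem referred to above and the reason a practical implementation needs an additional loop-removal step.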
for the simpler anisotropic visual simulation, we use the etch rate together with data picked from a small database of materials, crystalline orientations and etchants. this is the input for the visualization, which is displayed progressively at simulated time intervals (typically 2 min). image transparency is used to make it possible to observe the progress of the concave well formed by etching using masks of basic geometric shape (square, round, rectangular). this process could be used for a round shaped mask only; other mask geometries will not produce a truthful visualization [86].
6.6. microassembly
in small mems, microassembly is integrated with their production by etching out structures and then underetching them for mobility. in mems sufficiently large to be handled under a magnifying device, microassembly is done with microgrippers, but there are other means, e.g. air, magnets, liquids, etc. in magda we simulate microassembly disregarding the nature of helping devices (i.e. microgrippers) or autonomous visual servoing; we do not simulate microgrippers or aiding devices. we mimic assembly simply by mouse movements and clicks to test the feasibility of assembly in our virtual environment. simulating assembly is important: it allows testing for conflicts or impediments in the assembly of a device before prototyping or production. for interactive vr prototyping the algorithms involved have to be fast and smooth. precision in collision detection is paramount for virtual microassembly. to this end, a comparison of the efficiency and suitability of collision detection algorithms was performed and a new, more suitable and more efficient algorithm was derived [87]. this algorithm exploits the essentially 2d nature (flat shapes) of typical mems components (which are often etched into a silicon wafer and then underetched and released).
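for flat shapes, the basic building block of a point-based check is a test of whether a point lies inside a convex polygon. the python sketch below shows this standard test and how it extends to a shape that has been divided into convex parts. it is an illustration only, not the algorithm of [87]; the function names and the l-shaped example are hypothetical, and vertices are assumed to be listed counter-clockwise.

```python
def cross(o, a, b):
    """z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def point_in_convex_polygon(point, polygon):
    """True if point is inside or on the boundary of a convex polygon.

    polygon: list of (x, y) vertices in counter-clockwise order. The point
    is inside when it lies on the left of (or on) every edge.
    """
    n = len(polygon)
    return all(cross(polygon[i], polygon[(i + 1) % n], point) >= 0
               for i in range(n))


def point_hits_shape(point, convex_parts):
    """True if point lies in any of the convex parts that make up a shape."""
    return any(point_in_convex_polygon(point, part) for part in convex_parts)


if __name__ == "__main__":
    # Hypothetical l-shaped component, divided into two convex rectangles.
    l_shape = [
        [(0, 0), (2, 0), (2, 1), (0, 1)],
        [(0, 1), (1, 1), (1, 2), (0, 2)],
    ]
    print(point_hits_shape((0.5, 1.5), l_shape))  # inside the upper arm
    print(point_hits_shape((1.5, 1.5), l_shape))  # in the notch of the l
```

each test costs one cross product per edge, so the number of convex parts that must be checked directly determines the cost of a collision query.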
in order to take advantage of a new point-based collision detection method, a convex hull is computed around the object, and using this convex hull, a series of concavities is derived. the shape itself and the derived concavities are then divided into a minimum number of convex shapes. a point-based collision check against convex shapes can then be applied in one of two ways: (a) by checking all the convex bodies that make up the solid portion of the object, or (b) by checking the convex hull and the concavities to rule out a possible collision. by using the method that requires the least number of checks, we can arrive at a result in the quickest manner possible. this modification gives the method a computational advantage over other popular existing methods for 2d (and 3d) collision detection.
6.7. design desk
a design drawing board was implemented in magda. a range of shapes and typical mems components can be picked and placed on the drawing board. this includes free hand drawing of a component. all components can be edited, e.g. the number and sizes of cogs in a gear or comb. fig. 7 shows some examples of the interface. this work was done by final year students from germany [88].
fig. 7 examples of the shapes available from the magda drawing desk menu. all shapes can be extruded into 3d shapes that can be placed individually or merged (intersected) with other shapes. the shapes can be associated with materials from a small database [88].
in its current state, the user interface and drawing board of magda are implemented with a good range of mems components, facilities and 3d functions including rotation and assembly. the moving parts (membranes, cantilevers, fluids) and consequently the functioning of mems as described earlier have been researched and published but not implemented in code. this is the sad consequence of disrupted research continuity, as happens when postgraduate students graduate and other key players retire altogether.
magda should be continued, but it needs a new owner, new postgraduate students and programmers. our team has done the groundwork and set the foundation, but this is just the tip of the iceberg. one option is to continue it as a wiki with global contribution, but this is dangerous and difficult to track for scientific correctness. interactive vr can do many miracles, not necessarily real, but a vp mems design simulator must stick to reality and manufacturability. the results must not become cartoons, but neither must they inhibit what could be done in the future, for example more research on a cheap mems technology with carbon nanotubes. a fast and easy virtual prototyping environment could help in finding manufacturable designs and cheaper technology. one must never discard a jules verne-like vision. to climb a mountain one has to take a first step. we have taken that first step. now it needs a next generation and the vision to keep on climbing further.
7. conclusions
mems design and fabrication are currently in the hands of a highly skilled, highly multidisciplinary privileged minority. to keep up with the trend of this fast expanding industry, we need to find ways to ensure that younger generations develop understanding of and intuition for mems, and to make it possible to satisfy the increasing need for innovation and new mems technologies in the coming decades. the aim of this paper is to motivate scholars to engage in this endeavour and contribute to researching fast algorithms suitable for interactive virtual reality design to ease mems understanding. this paper has also presented a progression from earlier research on mems towards alternative technologies, prototyping and the mems animated graphic design aid (magda). the contribution of our research is that we have demonstrated that there are alternative methods and faster calculations for the visualizations that do not compromise physical validity. magda is far from complete.
we have barely scratched the surface. it needs to be developed further by dedicated programmers to complement the research that we have initiated. this requires financing, implementation with extended user facilities, populating databases and beta testing by a commercial body in continuous cooperation with a dedicated research group.
references
[1] j. m. karam, b. courtois, h. boutamine, p. drake, a. poppe, v. szekely, m. rencz, k. hofmann, and m. glesner, “cad and foundries for microsystems”, in proceedings of the 34th conference on design automation (dac ’97), anaheim, ca, usa, 1997, pp. 674-679.
[2] v. saile, u. wallrabe, o. tabata, j. g. korvink, eds., liga and its applications, advanced micro & nanosystems, vol. 7, wiley-vch, 2008.
[3] h. fujita, “a decade of mems and its future”, in proceedings of mems '97, ieee tenth annual international workshop on micro electro mechanical systems, 26-30 jan 1997, pp. 1-7.
[4] w. bacher, v. saile, liga, “von der trenndüse zu zahnrädern für luxusuhren”, nachrichten – forschungszentrum karlsruhe, jahrg. 38, 1-2/2006, pp. 84-86.
[5] r. sitte, “about the predictability and complexity of complex systems”, in from system complexity to emergent properties, m.a. aziz-alaoui & cyrille bertelle (eds), springer series understanding complex systems, 2009, part i, pp. 23-48, isbn 978-3-642-02198-5.
[6] k.j. rebello, “applications of mems in surgery”, proceedings of the ieee, vol. 92, no. 1, january 2004, pp. 45-55.
[7] b. margesin, l. lorenzelli, “silicon based physical and biophysical microsystems: two case studies”, sensors and microsystems, pp. 41-50, 2008.
[8] m. rieth, nano engineering in science and technology – an introduction to the world of nano design, world scientific publishing, series on the foundations of natural science and technology, vol. 6, 2003, rep. 2006, isbn 981-238-073-6.
[9] http://www.fsrm.ch/ (aug. 2015)
[10] http://www2.imec.be/ (aug.
2015)
[11] https://www.ieee.org/index.html (aug. 2015)
[12] http://ecd.eurotraining.net/ (aug. 2015)
[13] u. mastromatteo and b. murari, “new architecture in designing microsystems”, in proceedings of the 7th italian conference sensors and microsystems, bologna, italy, 4-6 february 2002, pp. 94-98.
[14] s. m. spearing, acta materialia, vol. 48, issue 1, pp. 179-196, 1 january 2000.
[15] s. d. senturia, microsystem design, kluwer academic publishers, boston, 2001.
[16] j.a. pelesko, d.h. bernstein, modelling mems and nems, chapman & hall/crc, 2003.
[17] t. fukuda, w. menz, micro mechanical systems, principles and technology, elsevier, 1998.
[18] p. rai-choudhury (ed.), handbook of microlithography, micromachining and microfabrication, vol. 1, microlithography, spie optical engineering press, 1997.
[19] p. rai-choudhury (ed.), handbook of microlithography, micromachining and microfabrication, vol. 2, micromachining and microfabrication, spie optical engineering press, 1997.
[20] g.k. fedder, “mems fabrication”, in proceedings of the ieee international test conference (itc), 2003, pp. 691-698.
[21] k. subramanian, micro electro mechanical systems: a design approach, springer-verlag, 2010.
[22] y. zhu, a. bazaei, s.o.r. moheimani, m.r. yuce, “design, prototyping, modelling and control of a mems nanopositioning stage”, in proceedings of the ieee american control conference, san francisco, ca, usa, 2011, pp. 2278-2283.
[23] j. haneveld, “nanochannel fabrication and characteristic using bond micromachining”, phd thesis, university of twente, enschede, the netherlands, 2006.
[24] g. menozzi, “nexus & eurimus: two major initiatives to support r&d and strengthen european mems industry”, sensors and microsystems, pp. 13-29, 2002.
[25] j.-p. desbiens, p. masson, “arf excimer laser micromachining of pyrex, sic and pzt for rapid prototyping of mems components”, sensors and actuators a, vol. 136, pp. 554-563, 2007.
[26] r. delille, m.g. urdaneta, s.j.
moseley, e. smela, “benchtop polymer mems”, journal of microelectromechanical systems, vol. 15, no. 5, pp. 1108-1120, october 2006.
[27] k. iniewski, s. sriram, m. bhaskaran, energy harvesting with functional materials and microsystems, crc press, taylor & francis group, 2014.
[28] s. bermejo, l. castañer, “dynamics of mems electrostatic driving using a photovoltaic source”, sensors and actuators a: physical, vol. 121, issue 1, pp. 237-242, 31 may 2005.
[29] a. napieralski, m. napieralska, m. szermer, c. maj, “the evolution of mems and modelling methodologies”, the international journal for computation and mathematics in electrical and electronic engineering, emerald group publishing limited, vol. 31, issue 5, pp. 1458-1469.
[30] mems and nems – systems, devices and structures, s.e. lyshevski, ed., crc press, 2002.
[31] multiphysics modelling with finite elements methods, w.b.j. zimmermann & a. guran, eds., world scientific, series on stability, vibration and control of systems, series a, vol. 18, 2006, rep. 2007.
[32] a. greiner, j. lienemann, e. rudnyi, j. g. korvink, l. ferrario, m. zen, “automatic order reduction for finite element models”, sensors and microsystems, pp. 411-417, 2005.
[33] advances in multiphysics simulation and experimental testing of mems, a. frangi, c. cercignani, s. mukherjee, n. aluru, eds., computational and experimental methods in structures, vol. 2, imperial college press, 2008.
[34] t. bechtold, e.b. rudnyi, j.g. korvink, “automatic order reduction of thermo-electric model for micro-ignition unit”, international conference on simulation of semiconductor processes and devices, sispad (ieee cat. no. 02th8621), 2002, pp. 131-134.
[35] n.t. nguyen and s.t. wereley, integrated microsystems: fundamentals and applications of microfluidics (2nd edition), artech house, 2006.
[36] r.
sitte, “visualizing reliability in mems vr-cad tool”, journal of wscg, vol. 11, no. 3, 2003, pp. 433-439. [37] s. muratet, jy. fourniols, g. soto-romero, a. endemaño, a. marty, m. desmulliez “mems reliability modelling methodolog: application to wobble micromotor failure analysis”, microelectronics reliability, vol. 43, pp. 1945-1949, 2003. [38] a. dewey, v. srinivasan, e. icoz, “visual modeling and design of microelectromechanical system transducers”, microelectronics journal, vol. 32, issue 4, pp. 373-381, april 2001. [39] s. p. levitan, t. p. kurzweg, p. j. marchand, m. a. rempel, d. m. chiarulli, j. a. martinez, j. m. bridgen, c. fan, f. b. mccormick, “chatoyant, a computer-aided design tool for free-space optoelectronic systems”, applied optics, vol. 37, no. 26, pp. 6078-6092, september 1998 [40] www.coventor.com (sept. 2015) [41] www.cfdrc.com (sept. 2015) [42] www.ansys.com (sept. 2015) [43] d. reznik, s. brown, j. canny, “dynamic simulation as a design tool for a microactuator array”, proceedings ieee conference of robotics and automation (icra), albuquerque, nm, april 1997, pp. 1675-1680. [44] j. gilbert, “integrating cad tools for mems design”, ieee computer, vol. 31, issue 4, pp. 98-101, 1998. [45] www.memscap.com (sept. 2015) [46] www.intellisense.com (sept. 2015) [47] r. gaddi and j. iannacci, “hierarchical multi-domain mems simulation within an ic-design framework”, sensors and microsystems, pp. 461-466, 2004. [48] m. m. dasigenis, d. j. soudris, s. k. vasilopoulou, and a. t. thanailakis, “a cad tool for automatic generation of rns & qrns converters, microelectronics, microsystems and nanotechnology, pp. 297300, 2001. [49] r. bardohl, g. taentzer, m. minas, a. schürr, “application of graph transformation to visual languages”, handbook of graph grammars and computing by graph transformation, pp.105-180, 1999. [50] s. hengsbach, a. 
díaz lantada, “rapid prototyping of multi-scale biomedical microdevices by combining additive manufacturing technologies” biomed microdevices 16, pp. 617–627, 2014. [51] http://www.coventor.com/mems-solutions/ (september 2015) [52] http://www.softmems.com/mems_pro.html (september 2015) [53] http://tannereda.com/mems (september 2015) [54] http://www.intellisense.com/ (september 2015) [55] a. bertsch, p. bernhard, c. vogt, p. renaud, ,"rapid prototyping of small size objects", rapid prototyping journal, vol. 6, issue 4, pp. 259 – 266, 2000. [56] c.h. lin, g.b. lee, y.h. lin, g.l. chang “a fast prototyping process for fabrication of microfluidic systems on soda-lime glass” j. micromech. microeng. 11 pp. 726–732, 2001. http://griffith.summon.serialssolutions.com/2.0.0/link/0/elvhcxmwy2awntiz0eure1iskw0sddiskk2szy1nleehoqazpcqzjiuaamfkispibg-ixhmtymbkzrnluhrzdxh20iunzmrdxzbik4b9cgcncexmitgwapvlqrimcsa2qpjrkjcxphiymkqlmyaap5iypxpzpfock7_kvdnjbinc5gaatzcxjg http://www.sciencedirect.com/science/article/pii/s0924424705000749 http://www.sciencedirect.com/science/article/pii/s0924424705000749 http://www.sciencedirect.com/science/article/pii/s0924424705000749 http://www.sciencedirect.com/science/journal/09244247/121/1 http://www.worldscientific.com/series/cems http://www.coventor.com/ http://www.cfdrc.com/ http://www.ansys.com/ http://www.memscap.com/ http://www.intellisense.com/ http://www.coventor.com/mems-solutions/ http://www.softmems.com/mems_pro.html http://tannereda.com/mems http://www.intellisense.com/ mems design simplification with virtual prototyping 33 [57] s. k. sampath, l. st.clair, xingtao wu, d. v. ivanov, q. wang, c. ghosh, k. r. farmer. “rapid mems prototyping using su-8, wafer bonding and deep reactive ion etching” ieee proceedings of the fourteenth biennial university/government/industry microelectronics symposium virginia commonwealth university richmond virginia , 2001 (cat. no.01ch37197) [58] x. li, h. 
choi, y.yang, “micro rapid prototyping system for micro components”, thin solid films 420 –421, 515–523, 2002. [59] c. khoury, g.a. mensing , d.j. beebe “ultra rapid prototyping of microfluidic systems using liquid phase photopolymerization” lab on a chip, vol. 2, issue 1, pp 50–55, 2002. [60] e. sarajli´c, m.j. de boer, h. v. jansen, n. arnal, m. puech, g. krijnen, m. elwenspoek “advanced plasma processing combined with trench isolation technology for fabrication and fast prototyping of high aspect ratio mems in standard silicon wafers”, institute of physics publishing journal of micromechanics and microengineering, j. micromech. microeng. 14 pp. 570–575, 2004. [61] k.c. yung, s.m. mei and t.m. “yue rapid prototyping of polymer-based mems devices using uv yag laser”, j. micromech. microeng. 14, pp. 1682–1686, 2004 [62] m. abdelgawad, a.r.wheeler microfluidics and nanofluidics, springer verlag, 2007, 101007/s10404007-0190-3, [63] m. villarroya, n. barniol, c. martin, f. perez-murano, j. esteve, l. bruchhaus, r. jede, e. bourhis, j. gierak, “fabrication of nanogaps for mems prototyping using focused ion beam as a lithographic tool and reactive ion etching pattern”, microelectronic engineering 84, pp. 1215–1218, 2007. [64] j. do, j.y. zhang, c.m. klapperich, “maskless writing of microfluidics:rapid prototyping of 3d microfluidics using scratch ona polymer substrate”, robotics and computer-integrated manufacturing, vol. 27, issue 2, pp. 245–248, april 2011. [65] t. meiss, r.wertschützky and b.stoeber, “rapid prototyping of resistive mems sensing devices on paper substrates”, ieee 27th international conference on micro electro mechanical systems (mems), 2014, pp 536 – 539. [66] v. a. lifton, g. lifton, s. simon, “options for additive rapid prototyping methods (3d printing) in mems technology”, rapid prototyping journal, vol. 20, issue 5, pp. 403-412, 2014. [67] j. cecil, a. 
kanchanapiboon, “virtual engineering approaches in product and process design”, int j adv manuf technol 31, pp 846–856, 2007. [68] p. jiang, x.yan, y. liu, ,"service in e-design", journal of manufacturing technology management, vol. 18, issue 1, pp. 90 – 105, 2007. [69] g. schröpfer, g. lorenz, s. rouvillois, s. breit, “novel 3d modeling methods for virtual fabrication and eda compatible design of mems via parametric libraries, j. micromech. microeng. 20, 064003 (15pp), 2010. [70] j. cecil, j. jones, “vrem: an advanced virtual environment for micro assembly”, int j adv manuf technol 72, pp. 47–56, 2014. [71] r. sitte, “modeling mems manufacturability with virtual prototyping cad tools”, electronics and structures for mems ii, neil bergman, editor, proceedings of spie vol 4591, pp. 125-133, 2001. [72] leon.o. chua, cnn: a paradigm for complexity, ed. leon o. chua, world scientific series in nonlinear science, 1998, series a, vol. 31. [73] z. li, “analysis and design of virtual realityvisualization for a micro electro mechanical systems (mems) cad tool”, phd thesis, 2005, griffith university, australia. [74] z. li, r. sitte, “scaling for mems virtual prototyping: size and motion dynamics visualizations”, proceedings of the 13-th international conference in central europe on computer graphics, visualization and computer vision, plzen, czech rep. , 2005, pp. 37-40. [75] c.-w. choi, k. johan, a. westin, k.s. breuer: “to slip or not to slip – water flows in hydrophilic and hydrophobic microchannels”, in proceedings of imece 2002, new orleans, louisiana, usa, 2002, pp. 1-8. [76] r. sitte, j. westphal, “sensitivity to the onset of microfluidic slip length in a microchannel”, spie volume 6035: microelectronics: design, technology, and packaging ii, cds197, pp. 6035-60350y-1 6035-60350y-8, 2005 . [77] m. aumeerally, r. 
sitte “layered fluid model and flow simulation for microchannels using electrical networks”, journal of simulation modelling practice and theory (elsevier) 14, pp. 82-94, 2006. [78] m. aumeerally, "simulation and modelling of microfluidic mems devices for vr-cad" phd thesis, 2006, griffith university, australia. [79] m. lesieur, o. métais, p. compte, large-eddie simulations of turbulence, cambridge university press, 2005 34 r. sitte [80] r. sitte, “introductory physics based visual simulation”, proceedings of the c# and .net technologies, plzen (czech republic), 2003, v. skala (ed), pp. 63-69. [81] r. sitte, r. kovacs, “iterative cluster splitting for fast vr visualization s of turbulences in microfluids”, european simulation and modeling conference esm'2005, porto, october 2005, pp. 455-462. [82] w. b.j. zimmerman, multiphysics modelling with finite element analysis, 2006/2007 world scientific publishing co.pte.ltd, singapore. [83] k. tatur, r. sitte, “spline approximations of flexible deformations for fast dynamic vr visualizations”, proceedings of the european simulation and modeling conference esmc 2003, naples, 2003, pp. 309315. [84] r. sitte, j. cai, “visualizing dynamic etching in mems vr-cad tool” proc. of the 14-th international conference in central europe on computer graphics, visualization and computer vision, plzen, czech rep. 2006, pp. 343-350. [85] d. adalsteinsson, j.a. sethian, “a level set approach to a unified model for etching, deposition, and lithography i: algorithms and two-dimensional simulations” journal of computational physics, vol. 120, pp. 128-144, 1995. [86] a. singh-jhandi, r. sitte, “virtual etching and transparency aiding in mems design”, international conference on information technology and applications (icita2002), bathurst, nsw australia, november 2002. [87] david wilson, fast collision detection and orientation for virtual assembly of microsystems, b.hon. thesis, nov. 2006, griffith university, australia. [88] a. baer, m. 
facta universitatis
series: electronics and energetics vol. 32, no. 2, june 2019, pp. 287-302
https://doi.org/10.2298/fuee1902287s

prediction of the em signal delay in the ionosphere using neural model

zoran stanković, nebojša dončov
university of niš, faculty of electronic engineering

abstract. a neural model capable of accurately and efficiently predicting the propagation delay of an electromagnetic signal in the ionosphere is proposed in this paper. the model performs this prediction for a given geographic location in europe between 40° n and 70° n latitude and 10° w and 30° e longitude, according to the following parameters: the particular day of the year, the time of day and the frequency of the signal carrier. the architecture of the model consists of four multilayer perceptron (mlp) networks whose task is to estimate, for known values of these input parameters, the approximate concentration of free ions in the atmosphere along the signal propagation path above the geographic location of the receiver. based on the estimated ion concentration and the considered carrier frequency, the model calculates the time delay of signal propagation in the ionosphere. the developed neural model is applicable to the whole territory of the republic of serbia, for all four weather seasons in a period of low solar activity. results of using the proposed model to predict the time delay of the gps (global positioning system) signal in the area of the city of niš are provided in the paper.

key words: neural networks, neural model, ionosphere, total electron content, signal delay estimation, global positioning system

received october 28, 2018; received in revised form december 12, 2018
corresponding author: zoran stanković, university of niš, faculty of electronic engineering, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: zoran.stankovic@elfak.ni.ac.rs)

1. introduction

the concentration of ions is much higher in the ionosphere than in any other layer of the atmosphere and, as a result, the parameters of electromagnetic (em) signals propagating through the ionosphere can be significantly affected [1-4]. signals of modern satellite communication systems, such as satellite positioning, navigation, broadcasting, time service and remote sensing systems, propagate partially through the ionosphere. the following changes in the propagating signals of these systems can appear: a change of the trajectory and time delay of the signal, modification of the signal frequency, variation in the phase of the signal carrier and a change in the signal polarization [1,3]. these changes may influence the proper operation of the aforementioned wireless systems, and their space-time characterization can be of great importance for system design and exploitation. typical examples of the negative impact of these changes on the operation of satellite systems are: the variation of the signal propagation delay introduces an error in the determination of the user position on earth by satellite navigation systems (e.g. gps), the change in frequency due to the additional doppler effect in the ionosphere disturbs the correct operation of synthetic-aperture radar (sar) systems, and the faraday rotation of the wave polarization and the change of the wave shape of the em signal disturb the correct operation of broadcasting satellites, etc. [2,3].
the ion concentration in the ionosphere, as well as its distribution with altitude (the so-called height profile of the ionosphere), contributes strongly to the changes of the previously mentioned signal characteristics during propagation through the ionosphere [1,3]. therefore, to describe and predict these changes of satellite signal parameters, it is very important to know the distribution of the ion concentration along the signal propagation path through the ionosphere, which is represented by the total electron content (tec) [1,3-8]. tec is the total number of electrons integrated between two points along a tube of one square metre cross section that surrounds the path of the signal through the ionosphere. the distribution of the free ion concentration in the ionosphere, and therefore the tec value, depends on a number of spatial and time parameters, the most important of which are: the geographic location above which the height profile of the ionosphere is observed, the current weather season, the time of day and the intensity of solar activity within the 11-year solar cycle [1,3-8].

the introduction of global positioning systems made it possible to replace the complex and slow approaches of determining tec values based on classical vertical ionosphere sounding [1,3]. new methods are capable of determining the current tec value in the vertical height profile of the ionosphere above the receiver position by using a gps receiver and receiving the signal at two different frequencies. for this purpose, the ionospheric monitoring and prediction center (impc) [6] is very important today; it uses more than 160 global navigation satellite system (gnss) receivers, placed all around the world, for real-time tec measurements.
however, for the design and analysis of the operation of satellite communication systems, it is more important to know the variation of tec values and to predict them over a longer period of time (so-called tec forecasting) above a specific geographic location than to obtain the current tec value above a number of geographic locations that are defined and conditioned by the distribution of the measurement equipment. therefore, the development of models for tec forecasting is in the focus of today's research. by performing a large number of measurements over a longer period of time for specific locations within a monitored geographic area, and by classifying the measured results using statistical analysis and mathematical interpolation methods, it is possible to develop an empirical model of the ionosphere for the given geographic area that predicts tec values for a specific moment in time and a desired location inside the considered area [3,4,7,8]. this approach was demonstrated in [7] with the development of a tec forecasting model that uses the kriging interpolation technique, and also in [8], where such a model was developed by using the spherical harmonic expansion approach.

an alternative approach to the development of a tec forecasting model is based on artificial neural networks (anns), which are used to design the model of the ionosphere and to predict tec values. this approach offers a much simpler development of the ionosphere model and an easier implementation than an empirical model based on statistical methods and complex mathematical interpolations. once successfully trained, the neural model avoids, during exploitation, the manipulation of a large number of measurement data organized as tables, graphics or a matrix database.
therefore, for given values of the input parameters related to the space-time location of the receiving terminal, the neural model is capable of predicting the tec value in the vertical profile of the ionosphere in a very short time. in addition, thanks to its powerful interpolation and generalization capabilities, the neural model predicts tec values more accurately than classical statistical models in geographic areas with a sparse distribution of ionospheric probe stations. these characteristics of anns for modelling the ionosphere and predicting tec values were demonstrated in [12-18]. in [12], a regional tec model based on an ann was designed and tested using data sets collected by the brazilian gps network covering periods of low and high solar activity. in [13], a local neural model based on a multilayer perceptron (mlp) network was proposed for the prediction of tec values above an area in iran; the performance of that model was compared with polynomial fitting and kriging interpolation. a local wavelet neural network (wnn) model for tec prediction over azerbaijan was given in [14]. in [15], a neural model for tec prediction in the vertical profile of the ionosphere above parit raja, johor, malaysia, was presented for a period of low to medium solar activity. in [16] and [17], regional neural models based on mlp networks were suggested to predict tec values and to calculate the time delay [16] and the carrier phase advance [17] of the em signal in the ionosphere above the mediterranean area. a regional neural model for the prediction of tec values above china, realized by using a genetic algorithm-based neural network (ga-nn) and measured tec values from 43 permanent gps stations in china, was shown in [18].

this paper is a continuation of the research conducted in [16,17].
by modifying the approach to the design of the neural model, further developing the neural model architecture from [16,17] and using a new set of training samples that covers the territory of the republic of serbia, a new neural model is developed and proposed here. it predicts the tec value and calculates the time delay of an em signal propagating in the ionosphere for the whole territory of the republic of serbia and for all four weather seasons in a period of low solar activity. low solar activity is understood here as solar activity with average monthly values below 80 srfu (10.7 cm solar radio flux (f10.7) units), which lasts roughly 4 years and repeats every 11 years on average [5].

2. neural model of the ionospheric time delay of the em signal

for the development of the proposed neural model we use the measured tec values provided in [4]. most of the measured results from [4] were obtained by ionospheric probe stations in order to provide maps of the tec distribution above areas that cover a significant part of europe and the mediterranean. for the area of europe, the measurement of current tec values was performed over a period of 27 months. the authors of this paper chose the measured results from [4] for the following reasons. during the development of the neural model, the new period of low solar activity had not yet started, and therefore the acquisition of data from the impc server over a period of at least one year would be possible only in the coming period. the alternative was to use a database of measured results for low solar activity periods older than 11 years.
the only database that covers a solar minimum lasting longer than 1 year, whose measured results can be considered valid for the territory of the republic of serbia, and that was available to the authors during the model development is the database given in [4]. measured values were classified by geographic region and measured f10.7 solar flux, and averaged for all four weather seasons. during the measurements it was noticed that the tec value depends on the geographic location for which the height profile of the atmosphere is monitored, the current weather season and the time of day, as well as on the intensity of the solar flux, which is directly connected with the current period of solar activity. in addition, it was observed that approximately the same tec values were measured for locations lying on the same latitude when the measurements were taken on the same day and at the same time of day. therefore, the change of tec values with longitude can be incorporated into the variation of tec with local time. on the other hand, it was shown in [3] that the time delay of the signal depends on the tec values along the propagation path through the ionosphere and on the carrier frequency. taking this into account, the proposed neural model of the ionospheric time delay of the em signal has the following input variables: the date in short format (dd-mm), which defines the day and month and is used to select the appropriate season, the latitude of the receiver station ($l_a$), the local time within 24 hours ($h$) and the carrier frequency ($f$). the neural model is developed for a period of low solar activity, in which the fluctuation of the solar flux is not significant; therefore the dependence of the tec value on the solar flux is not considered here. an mlp network is used for the development of the neural ionospheric time delay (itd) model, so the shortened name of the model is mlp_itd.

fig. 1 architecture of the ionospheric time delay neural model (mlp_itd model)

the architecture of the mlp_itd model is shown in fig. 1. the model consists of four mlp_tec^(s) modules (s = 1, 2, 3, 4; each module corresponds to one season of the year), a season selection and smoothing (sss) module and a time delay calculation (tdc) module. each mlp_tec^(s) module predicts the tec value in the vertical profile of the ionosphere above the receiver for its season, the latitude of the receiver ($l_a$) and the moment in time ($h$). the transfer function of the s-th module therefore has the form $tec^{(s)} = f_{mlp\_tec^{(s)}}(l_a, h)$. based on the date, the sss module selects the corresponding outputs of the mlp_tec^(s) networks, uses these values to form the final tec value and forwards it to the module that determines the time delay in the ionosphere (tdc module). for the chosen tec value and carrier frequency, the tdc module calculates the time delay as

$$\Delta t = f_{tdc}(tec, f) = \frac{40.3}{c f^{2}}\, tec \tag{1}$$

where tec is expressed in units of 10^16 electrons/m^2, $c$ is the speed of light in m/s and $f$ is the frequency of the signal in hz. accordingly, the processing function of the mlp_itd model can be expressed as

$$\Delta t = f_{mlp\_itd}(s, l_a, h, f) = f_{tdc}\left(f_{mlp\_tec^{(s)}}(l_a, h), f\right) \tag{2}$$

2.1. architecture of the mlp_tec^(s) module

each mlp_tec^(s) module is realized by an mlp network with one hidden layer (fig. 2). the input layer of neurons is a buffer layer for the vector of input variables $\mathbf{x} = [l_a \; h]^T$, so it has only two neurons, i1 and i2. the hidden layer has a variable number of neurons n1, n2, ..., nH, where $H$ is the number of hidden neurons. the output layer has only one neuron o1, whose output is the tec value. the output of each input neuron is forwarded to the input of each neuron in the hidden layer, multiplied by the corresponding connection weight factor.
also, the outputs of all hidden neurons are sent to the input of the neuron in the output layer, again multiplied by the corresponding connection weight factors.

fig. 2 architecture of the mlp_tec^(s) module

the hyperbolic tangent sigmoid transfer function is chosen as the activation function of all neurons in the hidden layer:

$$f(u) = \frac{e^{u} - e^{-u}}{e^{u} + e^{-u}} \tag{3}$$

so that the processing function of the mlp network, and therefore of the mlp_tec^(s) module, has the form

$$tec^{(s)} = f_{mlp\_tec^{(s)}}(l_a, h) = \mathbf{W}_{(2)}^{(s)} f\left(\mathbf{W}_{(1)}^{(s)} \mathbf{x} + \mathbf{b}_{(1)}^{(s)}\right) + \mathbf{b}_{(2)}^{(s)} \tag{4}$$

where the input weight matrix is $\mathbf{W}_{(1)}^{(s)} = [w_{ij(1)}^{(s)}]_{H \times 2}$, the layer weight matrix is $\mathbf{W}_{(2)}^{(s)} = [w_{ij(2)}^{(s)}]_{1 \times H}$, the input bias vector is $\mathbf{b}_{(1)}^{(s)} = [b_{i(1)}^{(s)}]_{H \times 1}$ and the output bias vector is $\mathbf{b}_{(2)}^{(s)} = [b_{i(2)}^{(s)}]_{1 \times 1}$. element $w_{ij(1)}^{(s)}$ is the weight of the connection between the j-th neuron of the input layer and the i-th hidden neuron, $w_{ij(2)}^{(s)}$ is the weight of the connection between the j-th neuron of the hidden layer and the i-th output neuron, $b_{i(1)}^{(s)}$ is the bias of the i-th hidden neuron and $b_{i(2)}^{(s)}$ is the bias of the i-th output neuron. the general notation for an mlp network with one hidden layer of $H$ neurons is mlp-H, and the network chosen to provide the tec value for the s-th season and incorporated into the mlp_tec^(s) module is denoted mlp_tec^(s): mlp-H.

2.2. architecture of the sss module

the sss module identifies the current season from the particular day and month (the dd-mm input), chooses the appropriate outputs of the mlp_tec^(s) networks and forms the final tec value (fig. 1). at the same time, this module provides smooth transitions between neighbouring seasons and avoids rapid changes of tec values caused by switching between mlp_tec^(s) networks (fig. 3). the transition area between two neighbouring seasons spans 30 days (the last 15 days of the previous season and the first 15 days of the next season).
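the season selection with a 30-day transition area, followed by the delay calculation of eq. (1), can be sketched in python as below. this is a minimal illustration and not the authors' algorithm: the season start days in `SEASON_STARTS`, the linear blend weights and all function names are assumptions introduced here, and tec is converted from units of 10^16 electrons/m^2 to electrons/m^2 before eq. (1) is evaluated.

```python
# Sketch of an SSS-style blend and the TDC formula of eq. (1).
# Season dates, blend weights and names are assumptions, not taken from the paper.
C = 299_792_458.0   # speed of light [m/s]
TECU = 1.0e16       # one TEC unit = 10^16 electrons/m^2
YEAR = 365          # non-leap year, for simplicity
HALF_WINDOW = 15    # 15 days on each side of a season boundary

# Hypothetical boundaries: (day of year on which a season starts, season s),
# with s = 1 winter, 2 spring, 3 summer, 4 autumn.
SEASON_STARTS = [(80, 2), (172, 3), (266, 4), (355, 1)]


def season_weights(day_of_year):
    """Return {season: weight}; the weights sum to 1."""
    for start, nxt in SEASON_STARTS:
        # Signed circular distance from the boundary, in days.
        d = (day_of_year - start + YEAR // 2) % YEAR - YEAR // 2
        if -HALF_WINDOW <= d < HALF_WINDOW:
            prev = (nxt - 2) % 4 + 1          # season preceding nxt (1 -> 4)
            w_next = (d + HALF_WINDOW) / (2 * HALF_WINDOW)
            return {prev: 1.0 - w_next, nxt: w_next}
    # Outside every transition area: the season whose start passed most recently.
    _, current = min(SEASON_STARTS, key=lambda b: (day_of_year - b[0]) % YEAR)
    return {current: 1.0}


def blend_tec(day_of_year, tec_by_season):
    """Combine the four MLP_TEC outputs (in TEC units) into the final TEC."""
    weights = season_weights(day_of_year)
    return sum(w * tec_by_season[s] for s, w in weights.items())


def time_delay(tec_tecu, f_hz):
    """Eq. (1): delay in seconds for TEC in TEC units and frequency in Hz."""
    return 40.3 * tec_tecu * TECU / (C * f_hz ** 2)


# Boundary day between winter and spring: both modules contribute equally.
tec = blend_tec(80, {1: 10.0, 2: 20.0, 3: 30.0, 4: 15.0})
delay = time_delay(tec, 1575.42e6)  # GPS L1 carrier
```

with these assumptions, 10 tec units at the gps l1 frequency (1575.42 mhz) correspond to a delay of roughly 5.4 ns, and the blend weight of the next season grows linearly across the 30 days of the transition area.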
in this transition area a linear smoothing scheme is applied by using the algorithm shown in fig. 3.

fig. 3 season selection algorithm with smoothing scheme at the boundaries of the mlp_tec^(s) networks used in the sss (season selection and smoothing) module

3. training and testing of the mlp_itd model

for the training and testing of the mlp_itd model we used data from [4], i.e. measured tec values in the ionosphere above the part of europe between 40° n and 70° n latitude and 10° w and 30° e longitude, which includes the territory of the republic of serbia. training and testing of the mlp_itd model consist of separate and independent training of the mlp network of each mlp_tec^(s) module. to generate the training and test sets for the mlp networks, measured tec values from a one-year period of low solar activity, with an average solar activity of 75 srfu, were chosen. these results are shown as four plots with tec iso-contours (fig. 4a-4d) [4]. each plot shows the dependence of the tec value on the location above which the height profile of the ionosphere is observed and on the local time for one season (fig. 4a for winter, fig. 4b for spring, fig. 4c for summer and fig. 4d for autumn). for each season, the training and test sets were generated by sampling from the plot corresponding to that season. for the s-th season, the training set of the neural network is a set of triplets of the form $\mathcal{L}_s = \{(l_{a,i}, h_i, tec_{(ref)i}^{(s)}) \mid i = 1, \ldots, L_s\}$, where $tec_{(ref)i}^{(s)}$ is the target value of the network output for the i-th combination of input variables $l_{a,i}, h_i$, and $L_s$ is the total number of training samples for the s-th season. similarly, for the s-th season the test set is a set of triplets of the form $\mathcal{T}_s = \{(l_{a,i}^{t}, h_i^{t}, tec_{(ref)i}^{t(s)}) \mid i = 1, \ldots, T_s\}$, where the superscript t marks values used for testing rather than training, and $T_s$ is the number of test samples for the s-th season. each training and test triplet represents the coordinates of a point on an iso-contour plot. the training set was generated by sampling these points from the tec iso-contour plots at latitudes $l_a$ = 40°, 45°, 50°, 55°, 60°, 65°, 70°, while the test set was obtained by sampling these points from the tec iso-contours at latitudes $l_a$ = 43.4°, 52.5°, 62.5°.

fig. 4 contour plots of measured tec data vs. hours and latitude for (a) winter, (b) spring, (c) summer and (d) autumn (tec units are 10^16 electrons/m^2). measurements were performed in a period of low solar activity in which the average solar activity was 75 srfu [4].

as a result, training and test sets with the following numbers of samples per season were generated: $L_1$ = 109, $L_2$ = 117, $L_3$ = 132, $L_4$ = 112 and $T_1$ = 46, $T_2$ = 52, $T_3$ = 52, $T_4$ = 45. the main goal of training the mlp network of each mlp_tec^(s) module is to adjust the connection weight matrices $\mathbf{W}_{(1)}^{(s)}$ and $\mathbf{W}_{(2)}^{(s)}$ and the bias matrices $\mathbf{b}_{(1)}^{(s)}$ and $\mathbf{b}_{(2)}^{(s)}$ so that the mean square error of the network output $tec_i^{(s)}$ with respect to the target value $tec_{(ref)i}^{(s)}$, observed on the whole training set, is equal to or lower than the specified maximum training error $E_T$:

$$\frac{1}{L_s} \sum_{i=1}^{L_s} \left(tec_i^{(s)} - tec_{(ref)i}^{(s)}\right)^{2} \le E_T \tag{5}$$

the model was realized in the matlab environment, and the levenberg-marquardt training method was used with a target maximum training error $E_T = 10^{-4}$.
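the forward pass of one mlp_tec^(s) module (eq. 4) and the training criterion of eq. (5) can be sketched in python as follows. the sketch is an illustration only: the weights and samples are toy values invented here, no input scaling is applied, the function names are hypothetical, and the levenberg-marquardt optimisation itself is not reproduced.

```python
import math


def mlp_tec(la, h, w1, b1, w2, b2):
    """Eq. (4): one hidden tanh layer followed by a linear output neuron.

    w1: H rows of [weight for la, weight for h]; b1: H hidden biases;
    w2: H output weights; b2: output bias.
    """
    hidden = [math.tanh(row[0] * la + row[1] * h + b) for row, b in zip(w1, b1)]
    return sum(w * y for w, y in zip(w2, hidden)) + b2


def training_mse(samples, w1, b1, w2, b2):
    """Left-hand side of eq. (5) over triplets (la, h, tec_ref)."""
    errors = [(mlp_tec(la, h, w1, b1, w2, b2) - ref) ** 2 for la, h, ref in samples]
    return sum(errors) / len(errors)


E_T = 1e-4  # target maximum training error quoted in the text

# Toy network with H = 2 hidden neurons and two matching samples
# (invented for illustration; not the trained weights of the paper).
w1 = [[0.1, 0.0], [0.0, 0.2]]
b1 = [0.0, 0.0]
w2 = [1.0, -1.0]
b2 = 0.5
samples = [(0.0, 0.0, 0.5), (1.0, 0.0, 0.5 + math.tanh(0.1))]

# Training would stop once the mean square error drops to E_T or below.
training_done = training_mse(samples, w1, b1, w2, b2) <= E_T
```

in an actual training run the weights would be updated by the optimiser between evaluations of `training_mse`; here they are fixed so that the criterion is satisfied by construction.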
to quantify the success of the training of each mlp_tec^(s) network and its generalization ability, the trained networks were tested on the test sets and the following criteria were considered: the worst case error (wce)

$$wce^{(s)} = \frac{\max\limits_{i=1,\ldots,T_s} \left| tec_i^{t(s)} - tec_{(ref)i}^{t(s)} \right|}{\max\limits_{i=1,\ldots,T_s} \left(tec_{(ref)i}^{t(s)}\right) - \min\limits_{i=1,\ldots,T_s} \left(tec_{(ref)i}^{t(s)}\right)} \tag{6}$$

where $tec_i^{t(s)}$ is the output of the mlp_tec^(s) network for the i-th test sample, the average test error (ate)

$$ate^{(s)} = \frac{\frac{1}{T_s} \sum\limits_{i=1}^{T_s} \left| tec_i^{t(s)} - tec_{(ref)i}^{t(s)} \right|}{\max\limits_{i=1,\ldots,T_s} \left(tec_{(ref)i}^{t(s)}\right) - \min\limits_{i=1,\ldots,T_s} \left(tec_{(ref)i}^{t(s)}\right)} \tag{7}$$

and the pearson product-moment correlation coefficient ($r^{ppm}$)

$$r^{ppm(s)} = \frac{\sum\limits_{i=1}^{T_s} \left(tec_{(ref)i}^{t(s)} - \overline{tec}_{(ref)}^{t(s)}\right)\left(tec_i^{t(s)} - \overline{tec}^{t(s)}\right)}{\sqrt{\sum\limits_{i=1}^{T_s} \left(tec_{(ref)i}^{t(s)} - \overline{tec}_{(ref)}^{t(s)}\right)^{2} \sum\limits_{i=1}^{T_s} \left(tec_i^{t(s)} - \overline{tec}^{t(s)}\right)^{2}}} \tag{8}$$

where the average of the reference tec values of the test set and the average of the mlp_tec^(s) network outputs on the test set are defined as

$$\overline{tec}_{(ref)}^{t(s)} = \frac{1}{T_s} \sum_{i=1}^{T_s} tec_{(ref)i}^{t(s)}, \qquad \overline{tec}^{t(s)} = \frac{1}{T_s} \sum_{i=1}^{T_s} tec_i^{t(s)} \tag{9}$$

with the goal of obtaining an mlp_tec^(s) module with the highest possible accuracy for each season, different mlp-H networks (2 ≤ H ≤ 20) were trained and tested.

3.1. training and testing results of the mlp_tec^(1) networks (case s=1, winter)

testing results for the six mlp networks with the highest $r^{ppm}$ value, trained and tested for the case s = 1 (winter), are shown in table 1. the mlp-8 network is chosen for implementation in the mlp_tec^(1) module. the scattering diagram of the mlp_tec^(1): mlp-8 network on the test set is shown in fig. 5, where good agreement between the tec values provided by the mlp network and the reference values can be observed. the weight and bias values of the mlp_tec^(1): mlp-8 network obtained after training are given in table 2.
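the three test criteria of eqs. (6)-(8) can be computed as in the python sketch below, assuming that the network outputs and the reference values of one test set are available as plain lists. the function name and the sample numbers are hypothetical. the values are returned as fractions, so wce and ate would have to be multiplied by 100 to obtain percentages.

```python
import math


def evaluate_test_set(outputs, refs):
    """Return (WCE, ATE, r_ppm) for one test set, following eqs. (6)-(9)."""
    n = len(refs)
    span = max(refs) - min(refs)                      # range of reference TEC values
    abs_err = [abs(o - r) for o, r in zip(outputs, refs)]
    wce = max(abs_err) / span                         # eq. (6)
    ate = sum(abs_err) / (n * span)                   # eq. (7)
    mean_o = sum(outputs) / n                         # eq. (9), network outputs
    mean_r = sum(refs) / n                            # eq. (9), reference values
    cov = sum((r - mean_r) * (o - mean_o) for o, r in zip(outputs, refs))
    var_r = sum((r - mean_r) ** 2 for r in refs)
    var_o = sum((o - mean_o) ** 2 for o in outputs)
    r_ppm = cov / math.sqrt(var_r * var_o)            # eq. (8)
    return wce, ate, r_ppm


# Made-up reference values and network outputs, for illustration only.
refs = [10.0, 20.0, 30.0, 40.0]
outputs = [11.0, 19.0, 30.0, 42.0]
wce, ate, r_ppm = evaluate_test_set(outputs, refs)
```

normalising both errors by the range of the reference values makes the criteria comparable between seasons with different tec levels, which is what allows networks with different numbers of hidden neurons to be ranked per season.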
table 1 testing results for six mlp networks with the highest r ppm value (mlp_tec (1) , winter)

mlp net    wce [%]    ace [%]    r ppm
mlp-8      8.36       2.44       0.9948
mlp-14     6.38       2.78       0.9948
mlp-6      7.70       2.65       0.9935
mlp-9      8.34       2.59       0.9933
mlp-13     7.81       3.06       0.9929
mlp-5      9.33       3.01       0.9917

fig. 5 scattering diagram for mlp_tec (1) :mlp-8 network (winter, test set)

table 2 weight and bias values of the mlp_tec (1) :mlp-8 network: input layer - hidden layer connection ($w_{(1)}^{(1)}$, $b_{(1)}^{(1)}$) and hidden layer - output layer connection ($w_{(2)}^{(1)}$, $b_{(2)}^{(1)} = -0.6764$) [matrix entries not legible]

3.2. training and testing results of the mlp_tec (2) networks (case s=2, spring)

testing results for the six mlp networks with the highest r ppm value, trained and tested for the case s=2 (spring), are shown in table 3. the mlp-9 network is chosen to be implemented in the mlp_tec (2) module. the scattering diagram for the mlp_tec (2) :mlp-9 network on the test set is shown in fig. 6. again, a good agreement between the tec values provided by the mlp network and the referent values can be observed. the weight and bias values of the mlp_tec (2) :mlp-9 network, obtained after training, are given in table 4.

table 3 testing results for six mlp networks with the highest r ppm value (mlp_tec (2) , spring)

mlp net    wce [%]    ace [%]    r ppm
mlp-9      7.29       2.59       0.9940
mlp-7      7.93       2.58       0.9937
mlp-15     8.64       2.67       0.9937
mlp-13     8.32       2.38       0.9934
mlp-11     8.19       2.51       0.9933
mlp-4      7.70       2.71       0.9931

fig. 6 scattering diagram for mlp_tec (2) :mlp-9 network (spring, test set)

table 4 weight and bias values of the mlp_tec (2) :mlp-9 network: input layer - hidden layer connection ($w_{(1)}^{(2)}$, $b_{(1)}^{(2)}$) and hidden layer - output layer connection ($w_{(2)}^{(2)}$, $b_{(2)}^{(2)} = -0.9077$) [matrix entries not legible]

3.3. training and testing results of the mlp_tec (3) networks (case s=3, summer)

as in the previous two cases, for the case s=3 (summer) the trained networks are tested on the test set corresponding to this season. the testing results for the six mlp networks with the highest r ppm value are given in table 5 and, as a result, the mlp-13 network is selected to be used in the mlp_tec (3) module. the scattering diagram for the mlp_tec (3) :mlp-13 network on the test set is shown in fig. 7. as seen from fig. 7, the tec values provided by the mlp network are very close to the referent values. table 6 contains the weight and bias values of the mlp_tec (3) :mlp-13 network.

table 5 testing results for six mlp networks with the highest r ppm value (mlp_tec (3) , summer)

mlp net    wce [%]    ace [%]    r ppm
mlp-13     8.90       3.12       0.9868
mlp-12     11.66      3.68       0.9835
mlp-9      19.62      3.49       0.9833
mlp-7      11.23      3.71       0.9832
mlp-8      11.35      4.08       0.9794
mlp-6      13.48      3.87       0.9781

fig. 7 scattering diagram for mlp_tec (3) :mlp-13 network (summer, test set)

table 6 weight and bias values of the mlp_tec (3) :mlp-13 network: input layer - hidden layer connection ($w_{(1)}^{(3)}$, $b_{(1)}^{(3)}$) and hidden layer - output layer connection ($w_{(2)}^{(3)}$, $b_{(2)}^{(3)} = -11.8833$) [matrix entries not legible]

3.4. training and testing results of the mlp_tec (4) networks (case s=4, autumn)

the same training and testing procedures are conducted for the case s=4 (autumn). the testing results for the six mlp networks with the highest r ppm value, trained for this season, are presented in table 7. the mlp-8 network is chosen for implementation in the mlp_tec (4) module. the scattering diagram for the mlp_tec (4) :mlp-8 network on the test set (fig. 8) shows a good agreement between the tec values provided by the mlp network and the referent values. table 8 contains the weight and bias values of the mlp_tec (4) :mlp-8 network.

table 7 testing results for six mlp networks with the highest r ppm value (mlp_tec (4) , autumn)

mlp net    wce [%]    ace [%]    r ppm
mlp-8      8.25       2.42       0.9945
mlp-10     15.95      2.62       0.9923
mlp-11     12.11      2.66       0.9920
mlp-9      11.52      2.73       0.9915
mlp-18     9.11       3.07       0.9905
mlp-5      8.79       3.20       0.9905

fig. 8 scattering diagram for mlp_tec (4) :mlp-8 network (autumn, test set)

table 8 weight and bias values of the mlp_tec (4) :mlp-8 network: input layer - hidden layer connection ($w_{(1)}^{(4)}$, $b_{(1)}^{(4)}$) and hidden layer - output layer connection ($w_{(2)}^{(4)}$, $b_{(2)}^{(4)} = -0.7890$) [matrix entries not legible]

4. simulation results of the mlp_itd model in the area of the city of niš

the matlab environment was used to realize, train and test the mlp networks intended for the creation of the appropriate mlp_tec (s) modules. after that, in the same environment, we created the mlp_tec (s) modules together with the sss module and the tdc module which, based on the output of the selected mlp_tec (s) module, computes the time delay of the signal. as a final step, all functional parts were integrated into the mlp_itd model. this model is able to perform a 24-hour prediction of the time delay of an em signal in the ionosphere for the period of low solar activity over the part of europe that also covers the territory of the republic of serbia.

the proposed mlp_itd model is used to predict the gps signal delay above the area of the city of niš caused by the impact of the ionosphere along the signal trajectory for the period of low solar activity. signals on two frequencies belonging to the l-band and used by the gps service, f1 = 1575.42 mhz (l1) and f2 = 1227.60 mhz (l2), are observed. a comparison of the delay predicted by the mlp_itd model with the referent values, obtained from the measured tec values from [4], is shown in figs. 9.a, 9.b, 9.c and 9.d for the cases when the day for which the prediction is performed belongs to winter, spring, summer and autumn (march 20, may 20, july 20 and november 20 respectively). a good agreement between the results obtained by the mlp_itd model and the referent values can be observed.

fig. 9 prediction of gps signal delay obtained by using the mlp_itd model on the territory of the city of niš (latitude 43.3 n) during a 24 h period and comparison of the estimated values with the referent values obtained by measurement for one day in (a) winter: march 20, (b) spring: may 20, (c) summer: july 20 and (d) autumn: november 20. the prediction is valid for the period of low solar activity.

5. conclusion

due to its high concentration of ions, the ionosphere represents a specific em medium that can change the characteristics of the signals of satellite communication systems that propagate through this atmospheric layer. one of the most significant changes affecting the operation of these systems is a time delay of the signal that depends on the current concentration of free electrons (tec) along the propagation path through the ionosphere. the current tec value is a function of a number of parameters, the key ones being geographic location, local time, season and index of solar activity. this function is very complex, and much research is dedicated to the modelling of this dependence and the prediction of tec values. due to parallel data processing and fast input-output signal propagation, neural network modelling of tec forecasting is in the focus of current research. the neural models for tec forecasting in the literature are of a locally specific nature and cannot be applied directly to the territory of the republic of serbia.
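the dependence of the signal delay on tec noted above can be illustrated with the standard first-order ionospheric group-delay relation, delay = 40.3 · tec / (c · f²), with tec in electron/m². whether the tdc module of the mlp_itd model uses exactly this relation is an assumption here, and the tec value in the usage lines is hypothetical rather than a measurement from [4]; only the two gps frequencies are taken from the text.

```python
C_LIGHT = 299_792_458.0   # speed of light in vacuum [m/s]
K_ION = 40.3              # first-order ionospheric constant [m^3/s^2]
TECU = 1.0e16             # one tec unit [electron/m^2]

def ionospheric_delay_s(tec_el_per_m2, freq_hz):
    """First-order ionospheric group delay in seconds.

    tec_el_per_m2 : total electron content along the path [electron/m^2]
    freq_hz       : carrier frequency [Hz]
    """
    return K_ION * tec_el_per_m2 / (C_LIGHT * freq_hz ** 2)

# GPS carrier frequencies quoted in the text.
F_L1 = 1575.42e6
F_L2 = 1227.60e6

# Hypothetical tec value of 10 tec units, for illustration only.
tec = 10 * TECU
delay_l1 = ionospheric_delay_s(tec, F_L1)
delay_l2 = ionospheric_delay_s(tec, F_L2)
```

because the delay scales with 1/f², the l2 delay in this sketch is larger than the l1 delay by the factor (f1/f2)², independent of the tec value.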
in this paper, a neural model based on the mlp network is proposed for an efficient prediction of the tec value and the time delay of a satellite signal in the ionosphere, applicable to the whole territory of the republic of serbia in the period of low solar activity. the results of using this model for the prediction of satellite signal delay in the area of the city of niš, during 24 hours and for all four seasons, are in good agreement with the referent values obtained from measurements and therefore justify further research on the application of mlp networks to tec forecasting. future research will be directed towards further development and enhancement of the architecture of the mlp_itd model, aiming to achieve even better accuracy in the prediction of the tec value and the delay of the satellite signal in periods of low solar activity. the accuracy of the neural model will be improved through:

- finer incorporation of the variation of tec values within a season depending on the day of the year (achievable by including the day of the year as an additional input for the mlp_tec networks).
- finer incorporation of the longitudinal variation of tec values that will take into account, besides the local time effect as the largest contributor to longitudinal variations, the interactions and coupling between the lower, middle and upper atmosphere layers, as well as the fact that the geomagnetic latitude lines are not parallel to the geographic latitude lines.

in order to collect the data needed for the development of such an enhanced tec neural model, tec values will be acquired during 2019 by using the free impc service, which makes the results of gnss monitoring available on a daily basis. this time interval is chosen because 2019 will be the first year that completely belongs to the period of low solar activity.

acknowledgement: this paper is supported by the project tr-32024 of the ministry of education, science and technological development of the republic of serbia.

references
[1] m. dragović, antene i prostiranje radio talasa, beopres, beograd, 1996.
[2] t. pratt, c. w. bostian, j. e. allnutt, satellite communications, john wiley and sons, 2003.
[3] s. basu, j. buchau, f.j. rich, e.j. weber, e.c. field, j.l. heckscher, p.a. kossey, e.a. lewis, b.s. dandekar, l.f. mcnamara, e.w. cliver, g.h. millman, j. aarons, s. basu, j.a. klobuchar, s. basu, m.f. mendillo, ionospheric radio wave propagation, chapter 10, pp. (10-1)-(10-111), 1985.
[4] j. a. klobuchar and j. aarons, numerical models of total electron content over europe and the mediterranean and multi-station scintillation comparisons, international agard agardograph, 1973.
[5] k. f. tapping, "the 10.7 cm solar radio flux (f10.7)", space weather, vol. 11, pp. 394-406, 2013.
[6] ionosphere monitoring and prediction center (impc), deutsches zentrum für luft- und raumfahrt e.v. (dlr), german aerospace center, url: http://impc.dlr.de/.
[7] r. orus, m. hernandez-pajares, j.m. juan and j. sanz, "improvement of global ionospheric vtec maps by using kriging interpolation technique", journal of atmospheric and solar-terrestrial physics, vol. 67, no. 16, pp. 1598-1609, 2005.
[8] b.k. choi, w.k. lee, s.k. cho, j.u. park and p.h. park, "global gps ionospheric modelling using spherical harmonic expansion approach", journal of astronomy and space sciences, vol. 27, no. 7, pp. 359-366, 2010.
[9] s. haykin, neural networks, new york, ieee, 1994.
[10] q. j. zhang, k. c. gupta, neural networks for rf and microwave design, artech house, 2000.
[11] c. christodoulou, m. georgiopoulos, applications of neural networks in electromagnetics, artech house, 2001.
[12] r.f. leandro, m.c. santos, "a neural network approach for regional vertical total electron content modelling", j. studia geophys. geod., vol. 51, issue 2, pp. 279-292, 2007.
[13] m.r.g. razin, b. voosoghi and a. mohammadzadeh, "efficiency of artificial neural networks in map of total electron content over iran", acta geod. geophys., issue 3, pp. 1-15, 2015.
[14] m.r.g. razin, b. voosoghi, "wavelet neural networks using particle swarm optimization training in modeling regional ionospheric total electron content", j. atmos. sol. terr. phys., vol. 149, pp. 21-30, 2016.
[15] m. j. homam, "prediction of total electron content of the ionosphere using neural network", jurnal teknologi (sciences & engineering), vol. 78, no. 5-8, pp. 53-57, 2016.
[16] z. stanković, i. milovanović, j. jovanović, n. dončov, b. milovanović, "estimation of the em wave propagation delay in the ionosphere using artificial neural networks", presented at the yuinfo 2017 conference, march 12-15, kopaonik, serbia, 2017, only a short abstract printed.
[17] z. stankovic, i. milovanovic, n. doncov, m. sarevska and b. milovanovic, "estimation of the carrier phase advance of the em signal in the ionosphere using neural model", in proceedings of the 52nd international scientific conference on information, communication and energy systems and technologies, niš, serbia, june 28-30, 2017, pp. 211-215.
[18] r. song, x. zhang, c. zhou, j. liu and j. he, "predicting tec in china based on the neural networks optimized by genetic algorithm", advances in space research, vol. 62, no. 4, pp. 745-759, 2018.

instruction facta universitatis series: electronics and energetics vol. 28, no 4, december 2015, pp. 645-656 doi: 10.2298/fuee1504645l

comparison of different device concepts to increase the operating voltage of a trench isolated soi technology to above 900v

ralf lerner 1 , klaus schottmann 1 , siegfried hering 1 , andreas käberlein 2 , matthias fritzsch 3 , klaus schneider 3 , daniel beyer 3 , steffen heinz 2,3
1 x-fab semiconductor foundries ag; erfurt, germany
2 professur elektronische bauelemente der mikro- und nanotechnik; fakultät für elektrotechnik und informationstechnik; technische universität chemnitz; germany
3 electronic design chemnitz gmbh; chemnitz, germany

abstract.
for gate driver ics in three phase power applications, level shifters with more than 900v operating voltage are required. the voltage rating of an existing trench isolated soi process was extended with different device concepts: serial stacking of lower voltage devices was evaluated as an alternative approach to conventional quasi-vertical and charge compensated lateral devices, which need layout and material modifications. based on a sufficient 900v trench isolation, the different device concepts were tested with diodes and transistors. for the usage as level shifters the focus was to achieve the required breakdown voltages with minimum area.

key words: trench isolated soi, integrated high voltage devices, high voltage device concepts

received march 11, 2015; received in revised form june 16, 2015. corresponding author: ralf lerner, x-fab semiconductor foundries ag; erfurt, germany (e-mail: ralf.lerner@xfab.com)

1. introduction and motivation

a large part of globally generated electricity is used for mechanical drive applications. the increasing demand for energy savings, pushed also by legislative requirements [1], leads to a rising usage of inverters in many applications. for single phase 230v applications, intelligent motor drivers are available which allow an adjustment of rotational speed and power demand; for three phase applications this is seldom the case due to cost and size restrictions. for compact three phase drive applications like industrial fans and pumps, not only igbts and diodes but also high side capable gate driver ics with appropriate operating voltages are required. the three phase approach, however, offers a space reduction of the inverter due to the smaller dc-link and lower system currents compared to a single phase input. also new topologies and control algorithms
focus on the integration of the inverter into the motor, which was already presented in automotive applications [2], [3], [4], [5]. special needs have to be fulfilled by the asic-technology for usage in an inverter application. first of all the level shifter must be capable of handling voltages of up to 750v. for a stable design there should be some safety margin for transients. this leads to a maximum application voltage of about 900v. these voltages also apply for the insulation of the floating high-side circuit. there are two different types of level shifters commonly used. the transistor based shifter [6] mostly used at lower dc-link voltages has some advantages over inductive approaches because of its lower sensitivity to spurious magnetic emissions commonly found in power electronic systems. the area required for high voltage integrated devices is huge because of the isolation requirements and the peak power requirements in the moment of signal transmission and transients. the inductive shifter [7] overcomes the problems of transients and can be used for higher dc-link voltages, but the area required for the integrated air coils is even higher than the area of integrated high voltage devices. therefore some manufacturers stack the transformer on top of the driver or receiver die. another approach is a capacitive shifter utilizing two integrated capacitors in a differential manner. these capacitors are much smaller than high voltage transistors and can be fabricated in the metal interconnect system of high voltage cmos technologies with appropriate thickness of inter metal dielectrics, imd. the capacitance needed for proper operation of the shifter is much smaller than the capacitance needed in a discrete realization which is also more critical in proper isolation, lifetime, driving power and chip area. one major issue of this shifting approach is the susceptibility to voltage transients between the high side and low side circuits i.e. 
induced through switching of the active or passive power switches of the connected half bridge. during the occurrence of such transients no information can be transferred across the capacitors, and the high side circuit has to ignore such transient signals to prevent false switching of the high side power switch. as the impact of such transients increases with steeper switching edges (shorter rise and fall times) of the power switches, this approach is especially suitable for small power or high integration applications where a low emi signature, which is achieved with slower switching power devices, is preferred over very high efficiency. for proper driving of the power devices there is a need for mos transistors with voltage capabilities of 20v to 30v. another need is the interface to logic circuits, so the asic needs cmos devices with an operating voltage in the range of 3v to 5.5v for maximum compatibility with common microcontrollers and digital signal processors. furthermore, for close integration of power devices, driver and control electronics the asic must have an upper operating temperature of about 150°c. all the low voltage devices described above, needed to build an integrated level shifting device, are available in the xdh10 technology of x-fab in conjunction with a high voltage capable trench isolation [8]. this isolation allows the integration of the high side and low side circuits for one half bridge on a single die, which reduces the risk of isolation failures in the complete drive system and also reduces the size of the complete system. the bi-directionality together with the small temperature coefficient of the isolation leakage current makes trench isolated silicon on insulator, soi, processes a good choice to manufacture such gate driver ics [9], [10]. in this paper we present results to extend an existing 625v (operating conditions) trench isolated soi process to operating voltages above 900v.
this operating voltage requirement is necessary for the isolation between high side and ground logic as well as for the level shifters used for the communication between both. trench isolated soi allows the serial stacking of devices. switched emitter bjt/mosfet cascodes with a supply for the upper bjt element and a separate drive for the lower mosfet [11] are known, as well as cascodes consisting of an upper gan hv hemt in series with a lower n-channel low voltage mosfet with the hemt gate tied to ground [12]. a two chip 1200v half bridge gate driver with cascaded 600v devices is proposed in [13]. based on a working isolation scheme, stacked medium voltage diodes and transistors were investigated and compared with conventional quasi-vertical and charge compensated lateral high voltage devices. for area efficient level shifting functionality we investigated a single chip transistor cascode concept with a common gate connection.

2. the base process

a trench isolated bipolar-cmos-dmos, bcd, process on 55 µm thick soi wafers containing 5, 7 and 20v logic cmos transistors, medium and high voltage n-channel dmos and pmos transistors as well as bipolar and other analogue devices like resistors and capacitors [8] was the base for the 900v extension. isolation, soi thickness as well as doping concentrations are sufficient to achieve typical breakdown voltages above 700v. fig. 1 schematically shows the dielectric isolation topology, consisting of the vertical isolating trench, the trench adjacent doping layer (sinker), the buried oxide, box, isolating the device wafer from the handle wafer, and the highly doped buried layer above the box. inside the isolation tub the standard high voltage device, a 750v n-channel quasi-vertical dmos transistor, is shown.

fig. 1 quasi vertical n-channel enhancement dmos isolated with trench and buried oxide (labels: buried oxide, buried layer, device wafer, handle wafer, pwell, sinker, source, drain)

3. experiments and results

3.1. isolation

a prerequisite for a 900v gate driver process and the characterization of its high voltage devices is a working isolation with sufficient overhead. trenches, buried oxide, inter metal dielectrics (imd) as well as the surface topology need to withstand at least 1000v. the isolation capability was measured with current voltage characteristics between room temperature and 175°c. fig. 2 shows the different trench tub layouts investigated: single trench tubs with corner angles of 90°, single trench tubs with corner angles of 135° and double trench tubs with corner angles of 135°. the "high" potential was applied either to a pad inside the isolated tub or to the surrounding tub. any early parasitic breakdown mechanism must be avoided, both in vertical direction (from metal to metal layer or from metal to silicon) and in lateral direction, by sufficient metal, pad and probe spacings and appropriate pad topologies. as a criterion for isolation "breakdown" a leakage current of 0.1na was used. results of these measurements are shown in table 1. more details of these measurements and test structures can be found in [14]. for 90° corners 600v and 800v were measured, while for 135° corners 600v and 1020v were measured. for double trenches the measured voltages were above 1300v, i.e. above the maximum measurement range.

fig. 2 used trench tub layout geometries in top down view: 90° trench corner angle (left), 135° trench corner angle (center) and 135° double trench corner angle (right)

table 1 measured breakdown voltages of trench isolation test structures

trench structure    trench corner    tub voltage    breakdown voltage
single              90°              gnd            600v
single              90°              +              800v
single              135°             gnd            600v
single              135°             +              1020v
double              135°             gnd            >1300v
double              135°             +              >1300v

3.2. 900v vertical diodes

the breakdown voltage of the quasi-vertical high voltage devices, as shown in fig. 1, is mainly defined by the soi material which acts as drift region. extending the operating voltage of such devices requires a modification of the thickness and doping of the soi device wafer as well as a re-design of the field plate based edge termination region. simple circular diodes were used for the first layout and soi material related evaluation. with an internal pad to contact the central electrode ("source" in fig. 1, which would be the anode of a similar diode) blocking voltages of about 1150v were measured, fig. 3. for integrated high voltage devices, however, it would be very convenient to contact the inner electrode by a metal wiring instead of a pad-bond wire connection. this requires a crossing of the inner electrode metal wiring over the outer electrodes. fig. 3 also shows the blocking behavior of such a diode. the same diode layout but with a crossing wiring has a breakdown voltage of only 750v.

fig. 3 breakdown voltage of quasi-vertical diodes with internal pad and with crossing metal wiring connection to the inner electrode

3.3. 900v lateral diodes

for lateral diodes using a charge compensation mechanism, different layouts were designed using an additional deep pwell region and an additional shallow ndrift doping. the circular designs were based on 2d simulation, using either the anode or the cathode region as a symmetry axis, fig. 4, to create a 3d circular device. with the anode in the center, breakdown voltages of 1000 to 1100v were measured, while diodes with a similar cross-section but with the cathode in the center only reached about 100v, see fig. 5.

fig. 4 simulated 2d lateral diode structure, "a": anode field plate metal stack, "c": cathode field plate metal stack; for the 3d layout a rotation either around the anode (right side) or the cathode region (left side) was done (labels: oxide stack: field oxide and imds, n-doped device wafer, deep pwell, shallow ndrift)

fig. 5 measured blocking characteristics of lateral diodes, "anode center" i.e. rotation around anode, "cathode center" i.e. rotation around cathode

for these diodes the deep pwell doping as well as the ndrift doping (purple region in fig. 4 and the area above it, respectively) was varied by different implantation doses in the wafer process. fig. 6 shows the achieved breakdown voltages versus the two doping concentrations for two different layouts. the simulated implantation results were used as 100% starting values. maximum breakdown voltages above 1000v were measured at the simulated deep pwell doping but at a lower ndrift doping. depending on the layout, the breakdown voltages dropped to as low as 311v and 620v respectively for the implantation dose variants used.

fig. 6 measured blocking voltage maps for different ndrift and deep pwell doping conditions, layout "a" left, layout "b" right, 6 sites per wafer measured

3.4. stacked diodes

with a working dielectric isolation the soi material allows an alternative approach: the serial stacking of isolated medium voltage diodes. as shown in the inset of fig. 7, several isolated diodes in series were tested. either two identical 600v diodes or up to four identical 300v diodes were stacked in series. variants were also tested with and without the center of the serial diode stack being connected to the handle wafer, hw, of the soi substrate by a handle wafer contact, hwcnt.
[fig. 6 axis labels: ndrift dose 46%, 73%, 100%, 126%, 153%]

fig. 7 breakdown voltages versus device area for different diode stacks compared to conventional vertical and lateral diodes; numbers above symbols indicate the number of stacked diodes

the measured breakdown voltages of these diode stacks as well as of the best conventional vertical and lateral diodes versus the required device layout area are shown in fig. 7. totally isolated diode stacks, i.e. stacks with floating inner nodes between the stacked diodes, were measured with breakdown voltages of about 1200v: both four individual 300v diodes in series and two stacked 600v diodes lead to an overall breakdown voltage of 1200v. connecting the handle wafer "hw" electrically to a central potential also gives 600v for two stacked 300v diodes. forward voltages of 0.76v, 1.52v and 3.05v were measured for the single diode, the 2 diode and the 4 diode stacks respectively.

3.5. stacked transistors

a similar stacking approach was evaluated with two transistors in series, a bottom transistor t1 with its source connected to ground and a top transistor t2 with its drain connected to vdd, see fig. 8. the main problem of this transistor stacking is the high gate to source voltage of the top transistor t2 during the reverse blocking state, which is equal to the negative drain to source voltage of t1. therefore the total breakdown voltage target was set to >1000v and t1 should only have a medium breakdown voltage.

fig. 8 transistor stacking scheme (labels: t1, t2, hw, hwcnt, vdd)

for the medium voltage bottom transistor t1 of the stack a 400v, 0.013mm² dmos was available, but for the top transistor t2 it was necessary to develop a new device for 400v gate voltage.
the polysilicon gate electrodes of 0.2mm² large dmos transistors were replaced by metal gate electrodes, changing from a "normal" <100nm gate oxide to a much thicker gate oxide of >600nm with a metal 1 gate electrode, or even about 1000nm with a metal 2 gate electrode. these metal gate dmos transistors, as well as the related transistor stacks, have threshold voltages of about 2.5v for metal 1 gates and about 12v for metal 2 gates, see the transfer characteristics in fig. 9.

fig. 9 transfer characteristics of single metal gate transistors and transistor stacks (legend: m1 gate, m1 gate stack, m2 gate, m2 gate stack)

fig. 10 shows the blocking behavior of the metal gate transistors, with leakage currents below 100pa for both transistor types and breakdown voltages of 770v for the metal 1 gate devices but only about 720v and a rounded breakdown curve for the metal 2 gate device.

fig. 10 blocking characteristics of 700v metal gate transistors

measurements have shown working transistor stacks with leakage currents of about 100pa and breakdown voltages of 1050v for the stacks using the metal 2 gate top transistor and 1100v for the stacks using the metal 1 gate top transistor, see fig. 11. the total area of the stacked dmos transistors (t1+t2) is 0.215mm². it can also be observed in fig. 11 that neither type of transistor stack is influenced by the use of a handle wafer contact, i.e. no effect of an approximately 400v handle wafer potential could be seen. all stacks also show a sharp breakdown behavior.

fig. 11 blocking characteristics of stacked transistors, inset with stacking scheme

output characteristics of the transistor stacks are shown in fig. 12. normal output behavior up to drain voltages of 200v was measured with gate voltages ranging from 10v up to +15v.

fig. 12 output characteristics of stacked transistors (m1-gate, no handle wafer contact)

4. discussion

the observed dependence of the trench isolation capability on voltage polarity and corner geometry has already been reported for lower voltages [15], [16]. for 900v applications single trench isolation might be sufficient with 135° corner angles and correct polarity. double trench isolation structures give enough margin to measure (and operate) above 1000v. the double trench isolation capability of up to 1300v is the basic requirement for further device related high voltage measurements without any parasitic isolation failures. with the working isolation, conventional quasi-vertical diodes can be designed using adequate material parameters and layouts. but increasing the soi thickness also means increasing the trench depth, i.e. longer etch times and higher trench aspect ratios are necessary, with the associated technological problems. furthermore, the transistor layout also needs to be re-designed, especially the edge termination region which lowers the curvature of the electric field lines of the blocking pn-junction between anode and cathode close to the surface region. the measured breakdown voltages of vertical diodes of up to 1200v clearly show the functionality of the modified soi wafer material and the edge termination design. these breakdown voltages have been measured with an internal pad at the anode in the center of the circular diode. connecting the anode in the center of the diode by a metal 3 wire means that this wire has to cross over the circular cathode region at the diode edge. a certain part of the metal 3 cathode field plate (on cathode potential) has to be removed in the layout and replaced by an anode wiring in metal 3 (on anode potential) crossing the cathode region. this layout modification disturbs the electrical field distribution under the edge termination field plates and results in a reduction of the breakdown voltage by 400v.
as a second type of device, lateral diodes with breakdown voltages above 1000v were simulated, designed, processed and measured. with a correct rotational axis the 2d simulation can be transferred into 3d diode structures. since we were not able to perform 3d simulations on these large geometries, we can only speculate about the reason for the drastic breakdown voltage reduction when rotating around the cathode region. in that case the electric field in the drift region below the cathode is bent into a convex shape and therefore strong field crowding occurs. much larger radii would be necessary to compensate this effect and to avoid the early breakdown. the highest breakdown voltages were measured at the simulated deep pwell and close to the simulated ndrift doping conditions. compared to the 1200v vertical diodes, the 1000v lateral diodes have a much smaller area requirement. but this area improvement has to be paid for by two additional process layers for the deep pwell and ndrift regions and, even more critically, by the doping sensitivity of the breakdown voltage. up to 15v per one percent ndrift doping change and up to 30v per one percent deep pwell doping variation were observed. especially the shallow ndrift doping is expected to be very sensitive to e.g. variations of the implant oxide thickness. either a very robust and stable wafer processing or a larger safety margin in the breakdown voltage specification would be necessary for the lateral approach.

as a third approach, serially stacked diodes have been evaluated. with this stacking approach diode breakdown voltages of about 1200v can also be achieved, e.g. by stacking two 600v diodes or four 300v diodes in series. the breakdown voltages of the individual diodes add up arithmetically.
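the two quantitative statements above (series addition in a diode stack and the doping sensitivity of the lateral diode) can be put into numbers with a minimal python sketch. it assumes an ideal stack of identical diodes and a simple linear, additive worst-case estimate for the doping sensitivity; the function names are hypothetical, the 0.76v forward voltage is applied to both diode types only for illustration, and the ±2% doping deviation is an arbitrary example value, not a measured one.

```python
def series_stack(breakdown_v, forward_v, n):
    """Ideal series stack of n identical diodes.

    Both the breakdown voltage and the forward voltage scale with n.
    """
    return n * breakdown_v, n * forward_v


def worst_case_bv_shift(ndrift_dev_pct, pwell_dev_pct,
                        ndrift_v_per_pct=15.0, pwell_v_per_pct=30.0):
    """Upper-bound breakdown voltage shift of the lateral diode in volts.

    Assumes the two quoted sensitivities ("up to" 15 V and 30 V per percent)
    act linearly and add up; this is an illustrative simplification.
    """
    return (abs(ndrift_dev_pct) * ndrift_v_per_pct
            + abs(pwell_dev_pct) * pwell_v_per_pct)


# four 300 V diodes and two 600 V diodes, 0.76 V forward voltage each (assumed)
bv4, vf4 = series_stack(300.0, 0.76, 4)
bv2, vf2 = series_stack(600.0, 0.76, 2)

# hypothetical +/-2 % deviation of both implant doses
shift = worst_case_bv_shift(2.0, 2.0)

print(f"4 x 300 V stack: {bv4:.0f} V breakdown, {vf4:.2f} V forward")
print(f"2 x 600 V stack: {bv2:.0f} V breakdown, {vf2:.2f} V forward")
print(f"lateral diode, 2 % doping deviation: up to {shift:.0f} V shift")
```

the ideal forward voltage of the 4-diode stack obtained this way is close to the measured 3.05v, which is consistent with the arithmetic addition reported in the text.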
one big advantage of stacked medium voltage diodes is the reduced area consumption: only 50% of the area of a single 1200v vertical diode is necessary for a diode stack with two 600v diodes and less than 20% for a stack with four 300v diodes. compared to 1000v lateral diodes the stacked diode approach is still significantly smaller. another advantage of the stacking approach is the reduced effect of crossing wires. while for “real” 1200v devices a severe 400v reduction due to this crossing was observed, this effect is much smaller for the lower breakdown devices used in the stacking approach. but the diode stacking also has a drawback: not only the breakdown voltages add up, the forward voltages add up arithmetically as well. for diodes operating in forward mode this might be an issue.

while each diode stack consists of similar diodes (similar in terms of layout and electrical parameters), the transistor stacks use asymmetrical devices: to reduce the gate voltage requirements of the top transistor, a small, medium voltage bottom transistor was combined with a larger high voltage top transistor. in blocking mode an almost arithmetic addition of the breakdown voltages of the two transistors was observed as well. the necessary area of 0.215mm² for the stack is less than half of the expected 0.5mm² that a single vertical dmos with a similar breakdown voltage would require. the differences in breakdown voltage between the metal gate transistors are repeated in the stack: 1100v breakdown was only achieved with a metal 1 gate transistor as top transistor t2, whereas the metal 2 gate stacks only achieved 1050v.
with the change from a polysilicon gate electrode to a metal 1 or metal 2 gate electrode, the polysilicon layout elements in the dmos design, especially the gate and source field plates, were replaced by design elements in metal 1 and metal 2, respectively. with this change in the design the oxide between these field plates and the silicon becomes thicker, which changes the electric field distribution. the rounded breakdown characteristic of the metal 2 gate transistors also indicates a kind of reach-through rather than a pure avalanche breakdown mechanism for the metal 2 gate device. this would explain the lower breakdown voltage of the metal 2 gate transistors. in the metal 2 gate transistor stacks this pre-breakdown leakage current is suppressed by the bottom transistor in series, which is still operating in blocking mode and thus limits the current flow. only when both transistors are in breakdown mode can the current rise, and an avalanche-like current increase is observed in the characteristics.

under forward conditions the on-resistances of the two transistors in series of course also add up. increasing the transistor areas would decrease the total on-resistance at the cost of the area savings of the stacking approach. for driver applications this trade-off might be an issue, but for level shifting devices the on-resistance is only of minor importance.

5. summary and outlook

different high voltage device concepts (quasi-vertical, lateral charge compensated and serial stacking) have been evaluated and compared. with all three device topologies the required parameters for transistor based level shifters in three phase gate driver ics were achieved. the advantages of the quasi-vertical approach are the higher robustness against process tolerances and a lower mask count, while the lateral device concept requires less area.
the basic functionality of diode and transistor cascode approaches (stacking of existing high voltage devices) to achieve breakdown voltages far above those of the single devices was proven. compared to 900v single devices the cascode approach combines several advantages: highest area efficiency, smallest development effort, low mask count and a high robustness against process tolerances. the addition of diode forward voltages and transistor on-resistances is likely critical for driver applications or diode forward operation. but for the required level shifting functionality the isolation and transistor breakdown voltages as well as the transistor dc characteristics are very promising.

switching characteristics and reliability investigations are items of further work. especially asymmetrical switching, i.e. one transistor already slightly “on” while the second one is still in off-state, is a concern. a locally reduced breakdown voltage, e.g. below the crossing wires, will lead to a locally concentrated avalanche mechanism. therefore further edge termination layout improvement to avoid the breakdown voltage reduction due to crossing wires is a topic of further work.

acknowledgement: the paper is a part of the research done within the german federal ministry of education and research project “modelan 16n12068”.

references
[1] directive 2009/125/ec: design requirements for energy-related products.
[2] p. vas and w. dury, "electrical machines and drives: present and future", in proceedings of electrotechnical conference, melecon '96, may 1996, vol. 1, pp. 67-74.
[3] r.j. kerkman, g.s. skibinski, and d.w. schlegel, "ac drives: year 2000 (y2k) and beyond", in proceedings of ieee applied power electronics conference, apec '99, march 1999, vol. 1, pp. 28-39.
[4] y. tadros, j. ranneberg and u. shafer, "ring shaped motor-integrated electric drive for hybrid electric vehicles", in proceedings of european power electronics conference, epe’03, toulouse, france, september 2003.
[5] c. klumpner, f. blaabjerg, and p. thorgersen, "converter topologies with low passive components usage for the next generation of integrated motor drives", in proceedings of ieee power electronics specialist conference, pesc’03, vol. 2, pp. 568-573, june 2003.
[6] m. rossberg, b. vogler, r. herzer, "600v soi gate driver ic with advanced level shifter concepts for medium and high power applications", in proceedings of the european conference on power electronics and applications, pp. 1-8, 2007.
[7] m. munzer, w. ademmer, b. strzalkowski, k.t. kaschani, "insulated signal transfer in a half bridge driver ic based on coreless transformer technology", in proceedings of the fifth international conference on power electronics and drive systems, peds 2003, pp. 93-96, 2003.
[8] http://www.xfab.com/en/technology/soi/10-um-xdh10/, xdh10 data sheet
[9] f. udrea, d. garner, k. sheng, a. popescu, h. t. lim, w. i. milne, "soi power devices", electronics and communications journal, vol. 12, issue 1, feb. 2000, 27ff.
[10] d.m. garner, f. udrea, h.t. lim, g. ensell, a.e. popescu, k. sheng, w.i. milne, "silicon-on-insulator power integrated circuits", microelectronics journal, vol. 32, issues 5-6, may-june 2001, 517ff.
[11] p. schimel, "cascode configured gan switch enables faster switching frequencies and lower losses", electronic design, may 14, 2012, www.electronicdesign.com
[12] x. huang, z. liu, q. li, f. lee, "evaluation and application of 600v gan hemt in cascode structure", in proceedings of the applied power electronics conference and exposition (apec) 2013, long beach, ca, pp. 1279-1286.
[13] r. herzer, j. lehmann, m. rosberg, b. vogler, "igbt gate driver solutions for low and medium power applications", www.power-mag.com issue 6 2010 power electronics europe.
[14] r. lerner, k. schottmann, s. hering, a. käberlein, m. fritzsch, k. schneider, d. beyer, s. heinz, "using soi capabilities to increase breakdown voltages from < 600v to > 900v", in proceedings of the 12th international seminar on power semiconductors, isps’14, prague, august 2014, pp. 91-96.
[15] r. lerner, u. eckoldt, k. schottmann, s. heinz, k. erler, a. lange, g. ebest, "time dependent isolation capability of high voltage deep trench isolation", in proceedings of the 20th international symposium on power semiconductor devices & ics, ispsd '08, orlando, florida, may 18-22, pp. 205-208.
[16] r. lerner, u. eckoldt, a. hoelke, a. nevin, g. stoll, "optimized deep trench isolation for high voltage smart power process", in proceedings of the 17th international symposium on power semiconductor devices & ics, ispsd '05, santa barbara, california, may 23-26, pp. 135-138.

facta universitatis series: electronics and energetics vol. 32, no 2, june 2019, pp. 179-193 https://doi.org/10.2298/fuee1902179p

enhanced dielectric properties in la modified barium titanate ceramics

vesna paunović 1, zoran prijić 1, miloš đorđević 1, vojislav mitić 1,2

1 university of niš, faculty of electronic engineering, niš, serbia, 2 institute of technical sciences of sasa, belgrade, serbia

abstract. donor/acceptor (la/mn) doped batio3 ceramics, sintered at different temperatures, were studied regarding their microstructure and dielectric properties as well as the dielectric response in the ferroelectric/paraelectric regime. la3+ as donor was added in concentrations ranging from 0.1 to 5.0 at%, while the content of mn4+ as acceptor was 0.05 at% in all samples. the sintering temperatures of the codoped samples were 1290 and 1350°c. a reduction in grain size and a fine-grained microstructure with an average grain size from 0.5 to 2.0 μm were observed in low doped samples, whereas abnormal growth of individual grains took place in the 2 at% and 5 at% la doped specimens.
the dielectric properties of these samples were investigated as a function of frequency (100hz - 20 khz) and temperature (20-180°c). the measured results suggested that both dielectric constants of the ceramics (εr at room temperature and εrmax at the curie temperature) decreased as the concentration of la3+ increased. the dielectric permittivity was in the range of 944 to 3200. for samples doped with 0.1 at% la and sintered at 1350°c, the highest dielectric constant values at room temperature (εr = 3200) and at the curie temperature (εr = 5000) were measured. for all measured samples the dissipation factor was less than 0.09. with an increase in la content, the dielectric measurements exhibited a shift in the curie temperature (tc) towards lower temperatures. using the curie-weiss and the modified curie-weiss law, the curie constant c was calculated, as well as the parameter γ, which describes the deviation from the linear dependence of 1/εr on t above the phase transformation temperature. the calculated values for γ ranged from 1.01 to 1.43. these values indicate a sharp phase transformation in low doped samples and a diffuse phase transformation in highly la doped samples. the phase transition was reflected in the values of c, which started to decrease with increasing dopant content.

key words: batio3, dielectric constant, dissipation factor, curie temperature

received april 8, 2019 corresponding author: vesna paunović, university of nis, faculty of electronic engineering, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: vesna.paunovic@elfak.ni.ac.rs)

1. introduction

due to its high dielectric constant and low dielectric losses, barium titanate is an invaluable electroceramic material that has been widely used in multilayer ceramic capacitors (mlccs), temperature sensors, rf filter circuits, electro-optical components and piezoelectric transducer applications [1-3].
in order to be used for mlc applications, barium titanate ceramics must be electrical insulators and exhibit a high dielectric constant and small dielectric losses at room temperature. as over-current protection devices, temperature sensors and self-regulating heaters they need to be semiconducting at room temperature and make full use of the ptcr characteristics, which are associated with a sharp rise in resistivity when heated above the curie temperature [4-6].

batio3 is an insulator and a ferroelectric with a tetragonal perovskite structure and a high dielectric constant at room temperature. at the ferroelectric curie temperature, the crystal structure transforms from the tetragonal ferroelectric to the cubic paraelectric structure. the dielectric properties of batio3 ceramics can be controlled through processing parameters, synthesis method and sintering procedure. accordingly, it is necessary to prepare a homogeneous starting powder and a ceramic of high density with a uniform and fine grain microstructure. a suitable choice of dopant/additive and careful control of the composition during the sintering procedure are among the most important factors for modifying the electrical properties of batio3 ceramics [7-12]. ions with low valence and larger ionic radius (la3+, ca2+) tend to occupy ba2+ sites, while ions with smaller ionic radii and a valence of 5+ and higher (nb5+) favor the ti4+ sites [13-18]. incorporation of heterovalent ions into the perovskite lattice of barium titanate leads to significant changes of structure and microstructure and, furthermore, to changes of the dielectric and electrical properties.

the dielectric characteristics of doped ceramics, in addition to the type and concentration of the additive, are greatly influenced by other parameters such as microstructure, phase homogeneity, pore morphology and domain structure. during the phase transformation of batio3, the domain structure is formed.
the configuration and type of domains depend on the development of the microstructure during the sintering process. a homogeneous, fine-grained microstructure with a monodomain structure makes it possible to obtain ceramics with stable ferroelectric characteristics, i.e. ceramics in which the dielectric constant changes only slightly with temperature.

during the last few decades, batio3 ceramics doped with rare earth elements, and especially la, have been widely studied. as a dopant, la is one of the most commonly used materials [19-22]. la3+ behaves as a donor as it occupies the ba site in the perovskite lattice. this may raise the dielectric constant and further broaden the dielectric peak [23-25]. also, la as a donor decreases the grain size and shifts the curie temperature towards lower values. in la doped samples, the dielectric constant values are much higher than in pure batio3. the partial substitution of ba2+ ions with la3+ ions widens the temperature range in which the tetragonal phase, characterized by a small change in dielectric constant with temperature, is stable. also, it was found that dielectric losses are reduced by adding la to batio3 [26-28].

at low concentrations of la (less than 0.5 at%), la substitutes for ba2+ ions and solid solutions of the general formula ba(1-x)laxtio3 are formed. at higher additive concentrations, above 1.0 at%, ba2+ or ti4+ ions may be substituted, and the specific electrical resistance of the sample is very high, in the order of 10^10 Ω cm. the substitution of la3+ on the ba2+ sites requires the formation of negatively charged defects in order to preserve electroneutrality. the charge imbalance can be compensated by three different mechanisms: electrons (e′), which constitute the electronic compensation mechanism, or barium vacancies (vba″) and titanium vacancies (vti″″), which represent the ionic compensation mechanism [29-32].
for samples sintered in air, the main compensation mechanism is the ionic one, although there is disagreement as to whether this mechanism takes place through the creation of barium (vba″) or titanium (vti″″) vacancies. for low oxygen partial pressures and low doping levels, the compensation of donors is accomplished by electrons, while for high pressures the characteristic mechanism is ionic compensation.

the influence of mno2 on the electrical properties of doped batio3 has been widely investigated [33-35]. in an attempt to increase the reliability of the material, mn as an acceptor dopant is used to counteract the effects of oxygen vacancies. for ptc thermistors, mn is among the most effective acceptor type dopants; it segregates along the grain boundaries and enhances the resistance jump at the curie temperature. manganese replaces the ti4+ ion in the batio3 lattice as an acceptor with unstable valence, ranging from mn2+ and mn3+ to mn4+, depending on the partial pressure of oxygen. in a reducing atmosphere mn2+ is likely to be found, but in oxidizing conditions it converts into mn4+. in air-processed samples both mn2+ and mn4+ were found, with no traces of mn3+. the formation of donor-acceptor complexes such as 2[laba•]-[mnti″] prevents a valence change of mn2+ to mn3+ and has a beneficial effect on the reduction of the dissipation factor.

controlled incorporation of the donor la3+ in combination with the mn2+ acceptor makes it possible to create ceramics with a fine grain structure and an increased dielectric constant εr at room temperature and at the phase transformation temperature compared to la/batio3 ceramics [36-38]. also, the dielectric losses are lower in the la/mn codoped ceramics than in the undoped and la doped ceramics.

in this paper the influence of sintering temperature, donor concentration, the acceptor mn, and their relationship on the performance of the ceramics is discussed.
also, the permittivity response with temperature and frequency for specimens doped with various contents of la and sintered at different temperatures is analyzed.

2. experiments and methods

the samples of la/mn doped ceramics used in this investigation were obtained from commercial batio3 powder (elmic bt 100, rhone poulenc) with a particle size of 0.1 μm - 0.7 μm. the stoichiometric bao/tio2 ratio was 0.996 ±0.004. la2o3 (merck, darmstadt) was used as the donor dopant. the donor concentration was 0.1 at% - 5.0 at%. mno2 with a concentration of 0.05 at% was used as the acceptor in all cases. the powders were milled with al2o3 balls in a suspension of ethyl alcohol. the homogenization and milling time was 24h. the powders were then dried at 200°c for several hours and isostatically pressed at 120mpa into cylindrical tablets of 10 mm diameter (hydraulic press vpm veb thuringer industrieverg raunestein). the prepared tablets were sintered in a laboratory tube furnace (lenton thermal design ltd) at 1290°c and 1350°c in alumina ceramic boats. the sintering was conducted in air for 2h. the heating rate was 5°c/min up to 850°c and then 12°c/min up to the desired sintering temperature. the cooling rate was 10°c/min down to room temperature.

the archimedes method was used to measure the bulk density. a scanning electron microscope (jsm-5300) equipped with eds was used to investigate the microstructures of the samples obtained after sintering. the samples were covered with au electrodes to improve the conductivity during measurement. the capacitance and the loss tangent of the sintered samples were measured with an lcr meter (agilent 4284a) in the frequency range between 100hz and 20 khz. the relative dielectric constant was calculated from the measured capacitance. the temperature interval in which the dielectric constant was measured was from 20 to 180°c.
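the text does not state how the relative dielectric constant was obtained from the capacitance. a minimal python sketch, assuming the usual parallel-plate relation εr = c·d/(ε0·a) for a disc with electrodes on both faces, is given below. the 10 mm diameter is taken from the pressed tablets (shrinkage during sintering is ignored), while the 2 mm thickness and the function names are purely hypothetical illustration values.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m


def disc_area(diameter_m):
    """Electrode area of a circular disc in m^2."""
    return math.pi * (diameter_m / 2.0) ** 2


def relative_permittivity(capacitance_f, thickness_m, diameter_m):
    """Parallel-plate estimate: eps_r = C * d / (eps0 * A)."""
    return capacitance_f * thickness_m / (EPS0 * disc_area(diameter_m))


def parallel_plate_capacitance(eps_r, thickness_m, diameter_m):
    """Inverse relation, used here only to build a consistent example."""
    return eps_r * EPS0 * disc_area(diameter_m) / thickness_m


# hypothetical sample: 10 mm diameter, 2 mm thick (thickness is an assumption)
diameter = 10e-3
thickness = 2e-3

# capacitance such a disc would show for eps_r = 3200
cap = parallel_plate_capacitance(3200.0, thickness, diameter)
eps_back = relative_permittivity(cap, thickness, diameter)

print(f"capacitance: {cap * 1e9:.2f} nF, recovered eps_r: {eps_back:.0f}")
```

for these assumed dimensions a dielectric constant in the low thousands corresponds to a capacitance of roughly one nanofarad, which is well within the range of a standard lcr meter.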
the curie temperature (tc), the curie-weiss temperature (t0) and the curie constant (c), together with the critical exponent of nonlinearity (γ), were calculated using the curie-weiss and the modified curie-weiss law.

3. microstructure characteristics

the density of the la/mn-batio3 ceramics ranged from 72 to 91% of the theoretical density (td), depending on the sintering temperature and additive concentration. with an increase of the sintering temperature and a decrease of the lanthanum concentration, the density increases while the porosity decreases. the lowest densities (from 72% td for 5.0 la/bt to 75% td for 0.1 la/bt) were measured in samples sintered at 1290°c. the highest density, 91% td, was measured for the 0.1la/bt sample sintered at 1350°c. the lower densities characteristic of samples doped with higher concentrations of la may be due to the formation of a secondary la rich phase, la2ti2o7, which prevents diffusion during the initial sintering phase.

fig. 1 sem images of la/mn doped batio3, sintered at 1290°c: a) 0.1 at% la and b) 0.5 at% la.

fig. 1 shows surface sem images and grain size distributions of ceramics with 0.1 and 0.5 at% la sintered at 1290°c. all low la doped samples were found to have a dense microstructure and uniform grain size. the average grain size of the la-doped batio3 ceramics was from 1.0 to 2.0 μm for la concentrations of 0.1 at% and 0.5 at%. at a sintering temperature of 1350°c, the microstructure of the 0.1 and 0.5 at% la doped samples is very similar to the microstructure of the samples sintered at the lower sintering temperature (fig. 2). these samples had grain sizes from 0.5 to 1.5 μm and a slightly higher density.

fig. 2 sem images of la/mn doped batio3, sintered at 1350°c: a) 0.1 at% la and b) 0.5 at% la.

however, when the la content increased further to 1.0 at%, a slight increase in average grain size was observed.
for samples doped with 1.0 at%, 2.0 at% and 5.0 at% la and sintered at 1350°c, the microstructure is quite different from that of the samples sintered at the lower temperature and doped with low additive content. this is related to the formation of individual large grains of 3 to 10 μm, as shown in fig. 3.

fig. 3 sem micrographs of 2.0la/mn-doped batio3, sintered at 1350°c: a) fine-grained microstructure, b) abnormal growth of individual grains. sem images were taken from the same sample.

in relation to temperature, the sintering process of la doped batio3 systems can be divided into two regions, below and above the eutectic temperature. at 1350°c, liquid phase sintering with a non-homogeneous distribution of the liquid phase contributes to abnormal grain growth within the fine-grained structure. one specific microstructural feature noticed in samples sintered above the eutectic point is the appearance of a domain structure in individual abnormal grains (fig. 3b). the pronounced differences in the microstructure are due to the non-homogeneous distribution of la, which is confirmed by eds spectra taken from different locations on the same sample (fig. 4).

fig. 4 sem/eds images of 2.0 la/mn doped batio3: a) fine-grained structure rich in la, b) individual large grains with a domain structure.

the presence of la-rich regions indicates two possibilities: either a new phase, la2ti2o7, forms during sintering, or la has not been incorporated into the batio3 lattice. xrd analysis shows no second phase apart from the batio3 perovskite phase, which leads to the conclusion that free la is present in the sample (fig. 5). the la rich regions are associated with the fine-grained microstructure. the eds spectrum has shown that abnormal grains with a domain structure do not contain la (fig. 4b).
also, the eds analysis showed a homogeneous distribution of mn through the samples, since areas with an increased mn content were not detected.

fig. 5 xrd pattern of the 2.0 at% la-batio3 ceramics. no evidence of any secondary phase.

3. dielectric characteristics

the electrical resistivity measurements indicated that all la doped samples behave as electrical insulators and have an electrical resistivity greater than 10^8 Ω cm at room temperature. the dielectric properties, which depend on the microstructural characteristics and on the type and concentration of additives, were measured as a function of frequency and temperature. the frequency range was from 100hz to 20 khz and the temperature range from 20°c to 180°c. according to the measurement results, as can be seen in fig. 6, the dielectric constant generally decreases with increasing frequency. at high frequencies, dipoles in ceramic materials are incapable of following rapid electric field changes, resulting in a limited dipole response, which consequently leads to a lower dielectric constant. in all la doped samples, εr decreases from its initially higher value at low frequencies and becomes almost constant for frequencies above 3 khz.

fig. 6 frequency dependence of the dielectric constant for la/mn-batio3 ceramics sintered at a) 1290°c and b) 1350°c.

with an increase of sintering temperature, the porosity of the samples decreases and their density increases, thus increasing the value of the dielectric constant. the highest values of εr are therefore found in samples sintered at 1350°c. with an increase of additive concentration, the dielectric constant decreases, so that the maximum value of εr was measured for 0.1la/mn-batio3 samples sintered at 1350°c.
the value of the dielectric constant at 100hz and room temperature ranged from 944 for 5.0la/mn-batio3 to 1970 for 0.1la/mn-batio3 ceramics sintered at 1290°c, and from 1450 for 5.0la/mn-batio3 to 3200 for 0.1la/mn-batio3 samples sintered at 1350°c.

low dielectric losses are characteristic of all la/mn-batio3 doped samples. the highest values of tanδ at 100hz, as well as the largest changes with frequency, from 0.09 to 0.01, were measured for the 2.0 and 5.0 at% la doped samples at both sintering temperatures. the main characteristic of the loss tangent for all samples is that, after high values at low frequencies, tanδ decreases and becomes constant for frequencies above 3 khz, as shown in fig. 7.

fig. 7 frequency dependence of dielectric losses for la/mn-batio3 ceramics sintered at a) 1290°c and b) 1350°c.

the change of the dielectric constant with temperature clearly displays its dependence on the additive concentration and the microstructural characteristics. among the investigated samples, the highest dielectric constant at room temperature (εr = 3200) and at the curie temperature (εr = 5000), as well as the largest change in the dielectric constant with temperature, was shown by the 0.1la/mn-batio3 sample sintered at 1350°c, which is characterized by a fine-grained and uniform microstructure (fig. 8). for other additive concentrations, the dielectric constant values at the curie temperature are considerably lower. with an increase of the sintering temperature, the values of εr also increase.

fig. 8 temperature dependence of εr for la/mn-batio3 ceramics: a) tsin = 1290°c and b) tsin = 1350°c.

at all investigated temperatures the dielectric constant also decreases with increasing additive concentration. the lowest εr value was measured for the sample doped with 5.0 at% la and sintered at 1290°c.
lower dielectric constant values in high doped la/ mn ceramics can be due to lower ceramic density in samples with higher 188 v. paunović, z. prijić, m. đorđević, v. mitić content of additives or the presence of la-rich regions and the formation of individual large grains which obviously cause a decrease in dielectric permittivity. in general, the sharp phase transformation from the ferroelectric to the paraelectric phase at curie temperature was observed in the lower doped (0.1 and 0.5 at% la) batio3 samples. for samples doped with higher la concentration, the nearly flat and stable permittivity response of dielectric constant in the temperature range of 20° to 180°c were observed. from fig. 8 it can be seen that the curie temperature (tc) was shifted to a higher temperature with the increase of la content. it is because la 3+ entered ba site and tc increased with the incorporation of la 3+ . the curie temperature was in the range of 125°c for samples doped with 0.1at% la to 129c for samples doped with 5.0 at% la (table 1). fig. 9 reciprocal value of dielectric constant versus temperature for selected labatio3 samples. all investigated samples follow the curie-weiss law, r=c/(t-t0). based on the curieweiss law, by fitting curves 1/r vs. t, the curie-weiss temperature (t0) and the curie constant (c) are calculated, for all concentrations and sintering temperatures. the curves, dependence of the inverse value of the dielectric constant on the temperature for la/mn-batio3 doped ceramics (fig. 9), show a linear dependence 1/r vs.t, in the ferroelectric region. using the linear extrapolation 1/r vs.t is calculated the curie-weiss temperature t0. the curie constant was determined by fitting the plot of the reciprocal values of the permittivity in relation to the temperature, and represents the slope of this curve for value above the tc. curie's constant value depends largely on the grain size, additive concentration and the porosity of the doped specimens. 
since the increase in the additive concentration decreases the density of the samples and increases the grain size, it is to be expected that the highest value of the curie constant is found for samples doped with 0.1 at% la and sintered at 1350°c (c = 8.395 × 10^5 k). as the sintering temperature increases, the curie constant increases for all samples (fig. 10). it has also been observed that the change in the curie constant with the additive concentration is more pronounced for samples sintered at the higher temperature, where a sharp drop of c is observed for higher additive concentrations. the curie constant c and the curie-weiss temperature t0 values are given in table 1.

fig. 10 the dependence of the curie constant on the additive content for doped batio3 samples.

by linear fitting of the curves ln(1/εr − 1/εmax) vs. ln(t − tmax), the critical nonlinearity exponent (γ), which represents the slope of the curve, was calculated (fig. 11) [39].

fig. 11 ln(1/εr − 1/εrmax) versus ln(t − tmax) for selected batio3 samples. the critical exponent γ is determined from the slope of the curves.

the values of the critical nonlinearity exponent (γ) are in the range from 1 to 1.2 for samples with a lower concentration of la, and from 1.13 to 1.43 for heavily doped samples (fig. 12). the lowest values were observed in samples sintered at 1350°c and doped with 0.1 and 0.5 at% la (γ = 1.01), which is in accordance with the dielectric characteristics, as these samples show a sharp transition from the ferroelectric to the paraelectric region. for samples sintered at 1290°c these values are higher, especially for samples doped with 2.0 and 5.0 at% la. for these samples it is characteristic that, at the curie temperature, in addition to the structural transformation, there are other processes associated with defects at the grain boundary.
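the log-log fit used to obtain γ can be sketched as below. this is a hedged illustration rather than the authors' procedure: it assumes the modified curie-weiss form 1/εr − 1/εmax = (t − tmax)^γ / c′ that underlies the ln-ln plot, the function name `fit_critical_exponent` is hypothetical, and the data are synthetic.

```python
import numpy as np

def fit_critical_exponent(temps_c, eps_r, t_max, eps_max):
    """Slope of ln(1/eps_r - 1/eps_max) vs ln(T - T_max) gives gamma.

    Returns (gamma, c_prime), where c_prime = exp(-intercept).
    """
    temps_c = np.asarray(temps_c, dtype=float)
    eps_r = np.asarray(eps_r, dtype=float)
    above = temps_c > t_max                      # use only points above the peak
    x = np.log(temps_c[above] - t_max)
    y = np.log(1.0 / eps_r[above] - 1.0 / eps_max)
    gamma, intercept = np.polyfit(x, y, 1)
    return gamma, np.exp(-intercept)

# Synthetic data generated with assumed parameters (not measured values)
T_MAX, EPS_MAX, GAMMA_TRUE, C_PRIME_TRUE = 127.0, 1500.0, 1.28, 1.98e5
temps = np.linspace(130.0, 180.0, 12)
eps = 1.0 / (1.0 / EPS_MAX + (temps - T_MAX) ** GAMMA_TRUE / C_PRIME_TRUE)
gamma_fit, c_prime_fit = fit_critical_exponent(temps, eps, T_MAX, EPS_MAX)
print(f"gamma = {gamma_fit:.3f}, C' = {c_prime_fit:.3g}")
```

a value of γ close to 1 corresponds to a sharp, curie-weiss-like transition, while larger values indicate an increasingly diffuse transition.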
also, for these samples, the experimental results showed a diffuse phase transformation, which is in agreement with the obtained values of γ.

fig. 12 the critical exponent γ as a function of additive content for la doped samples.

table 1 dielectric properties for la/mn doped batio3

la (at%)   εr max   εr at 300 k   c (10^5 k)   tc (°c)   t0 (°c)   c′ (10^5 k)   γ
tsin = 1290°c
0.1        2630     1970          2.193        126       45        1.68          1.20
0.5        1750     1380          1.938        126       47        1.86          1.22
1.0        1500     1330          1.843        127       6         1.98          1.28
2.0        1230     1150          1.808        127       -20       3.22          1.30
5.0        1175     944           1.738        129       -22       4.04          1.43
tsin = 1350°c
0.1        5000     3200          8.395        125       58        1.39          1.01
0.5        2880     2530          8.21         126       -122      1.89          1.015
1.0        2750     2350          5.194        127       -65       4.23          1.015
2.0        1800     1730          4.447        128       -165      4.85          1.13
5.0        1430     1450          3.653        128       -455      5.01          1.40

ceramics sintered at 1350°c have the higher values of the curie-weiss-like constant (c′), which range from 1.39 × 10^5 k for 0.1 at% la to 5.01 × 10^5 k for 5.0 at% la doped samples. within a single series of samples, the value of c′ increases with increasing la concentration. the highest values of c′ are calculated for samples doped with 5.0 at% la (table 1).

4. conclusion

the experimental results revealed that the dielectric properties depend on the microstructural characteristics, the type and concentration of additives, and the sintering temperature. a fine-grained microstructure with grains 0.5–2.0 μm in size was obtained in low doped samples, whereas abnormal growth of individual grains took place in the higher doped specimens. batio3 ceramic samples doped with 0.1 at% la and sintered at 1350°c, with a high density and fine grain structure, showed the highest values of the dielectric constant at room temperature (εr = 3200) and at the curie temperature (εrmax = 5000). for all measured samples the dissipation factor was less than 9%.
the curie-weiss law characterizes the permittivity of both series of samples, and the curie constant decreases with increasing additive content. the highest value of the curie constant was observed in samples doped with 0.1 at% la and sintered at 1350°c (c = 8.395 × 10^5 k). the curie temperature of the doped samples is slightly lower than the curie temperature of the undoped ceramics, and it lies in the narrow range of 125–129°c. the calculated values of the critical exponent of nonlinearity γ ranged from 1.01 to 1.43, which is in accordance with the measured experimental data and the types of phase transition.

acknowledgement: this research is a part of the projects oi-172057 and tr-32026. the authors gratefully acknowledge the financial support of the serbian ministry of education, science and technological development for this work.

references

[1] h. kishi, n. kohzu, j. sugino, h. ohsato, y. iguchi, t. okuda, "the effect of rare-earth (la, sm, dy, ho and er) and mg on the microstructure in batio3", j. e. ceram. soc., vol. 19, pp. 1043–1046, 1999. [2] lj. zivkovic, v. paunovic, n. stamenkov, m. miljkovic, "the effect of secondary abnormal grain growth on the dielectric properties of la/mn co-doped batio3 ceramics", science of sintering, vol. 38, pp. 273–281, 2006. [3] m. vijatovic petrovic, j. bobic, t. ramoska, j. banys, b. stojanovic, "electrical properties of lanthanum doped barium titanate ceramics", materials characterization, vol. 62, pp. 1000–1006, 2011. [4] d.h. kuo, c.h. wang, w.p. tsai, "donor and acceptor cosubstituted batio3 for nonreducible multilayer ceramic capacitors", ceramics international, vol. 32, pp. 1–5, 2006. [5] j. qi, z. gui, y. wang, q. zhu, y. wu, l. li, "ptcr effect in batio3 ceramics modified by donor dopant", ceramic international, vol. 28, pp. 141–143, 2002. [6] m. wegmann, r. bronnimann, f. clemens, t. graule, "barium titanate-based ptcr thermistor fibers: processing and properties", sens. actuators a: phys., vol. 135, no. 2, pp.
394–404, 2007. [7] w. caia, c. fu, z. lin, x. deng, w. jiang, "influence of lanthanum on microstructure and dielectric properties of barium titanate ceramics by solid state reaction", advanced materials research, vol. 412, pp. 275–279, 2012. [8] e. brzozowski, m.s. castro, "conduction mechanism of barium titanate ceramics", ceramics international, vol. 26, pp. 265–269, 2000. 192 v. paunović, z. prijić, m. đorđević, v. mitić [9] a. ianculescu, z.v. mocanu, l.p. curecheriu, l. mitoseriu, l. padurariu, r. trusca, "dielectric and tunability properties of la-doped batio3 ceramics", journal of alloys and compounds, vol. 509, issue 41, pp. 10040– 10049, 2011. [10] a.k. yadav, c. gautam, "dielectric behavior of perovskite glass ceramics", j. mater sci: materials in electronics, vol. 25, pp. 5165–5187, 2014. [11] a.k. yadav, c. gautam, "a review on crystallisation behaviour of perovskite glass ceramics", advances in applied ceramics, vol. 113, no. 4, pp. 193–207, 2014. [12] m.s. alkathy, a. hezam, k.s.d. manoja, j. wang, c. cheng , k. byrappa , k.c. james raju, "effect of sintering temperature on structural, electrical, and ferroelectric properties of lanthanum and sodium cosubstituted barium titanate ceramics", journal of alloys and compounds, vol. 762, pp. 49–61, 2018. [13] v. paunović, v. mitić, z. prijić, lj. živković, "microstructure and dielectric properties of dy/mn doped batio3 ceramics", ceramic international, vol. 40, no. 3, pp. 4277–4284, 2014. [14] s. m. park, y. h. han, "dielectric relaxation of oxygen vacancies in dy-doped batio3", journal of the korean physical society, vol. 57, no. 3 pp. 458–463, 2010. [15] k.j. park, c.h. kim, y.j. yoon, s.m. song, "doping behaviors of dysprosium, yttrium and holmium in batio3 ceramics", j.e. ceram. soc., vol. 29, pp. 1735–1741, 2009. [16] s.m. bobade, d.d. gulwade, a.r. kulkarni, p.gopalan, "dielectric properties of aand b-site doped batio3 (i): laand al-doped solid solution", j. appl. phys, vol. 97, p. 074105, 2005. 
[17] v. paunović, v.v. mitić, lj. kocić, "dielectric characteristic of donor-acceptor modified batio3 ceramics", ceramics international, vol. 42, pp. 11692–11699, 2016. [18] d. gulwade, p. gopalan, "dielectric properties of aand b-site doped batio3: effect of la and ga", physica b, 404, pp. 1799–805, 2009. [19] v. paunović, v. mitić, m. marjanović, lj. kocić, "dielectric properties of la/mn codoped barium titanate ceramics", facta universitatis, series: electronics and energetics, vol. 29, no. 2, pp. 285–296, june 2016. [20] y.w. hu, p.p. yong, l.c. xiao, f.w. jin, "study of reoxidation in heavily la-doped barium titanate ceramics", j. phys.: conf. ser., vol. 152, p. 012040, 2009. [21] x.l. zhao, z.m. ma, z. xiao ,g. chen, " preparation and characterization on nano-sized barium titanate powder doped with lanthanum by sol-gel process", j. rare earths, vol. 24, pp. 82–85, 2006. [22] y. wang, k. miao, w. wang, y. qin, "fabrication of lanthanum doped batio3 fine-grained ceramics with a high dielectric constant and temperature-stable dielectric properties using hydro-phase method at atmospheric pressure", journal of the european ceramic society, vol. 37, pp. 2385–2390, 2017. [23] y. wang, b. cui, y. liu, x.t. zhao, z.y. hu, q.q. yan, t. wu, l.l. zhao, y.y. wang, "fabrication of submicron la2o3–coated batio3 particles and fine-grained ceramics with temperature-stable dielectric properties", scripta. mater., vol. 90-91, pp. 49–52, 2014. [24] m.ganguly, s.k. rout, t.p.sinha, "characterization and rietveld refinement of a-site deficient lanthanum doped barium titanate", j. alloy compd., vol. 579, pp. 473–484, 2013. [25] h. zu, q, fu, c. gao, t. chen, d. zhou, y. hu, z. zheng, w. luo, "effects of baco3 addition on the microstructure and electrical properties of la-doped barium titanate ceramics prepared by reduction-reoxidation method", j. europ. ceram. soc., vol. 38, pp. 113–118, 2018. [26] v. paunovic, lj. zivkovic, v. 
mitic, "influence of rare-earth additives (la, sm and dy) on the microstructure and dielectric properties of doped batio3 ceramics", science of sintering, vol. 42, pp. 69–79, 2010. [27] w. li, z. xu, r. chu, p. fu, "structure and dielectric behavior of la-doped batio3 ceramics", adv. mater. res., vol. 105–106, pp. 252–254, 2010. [28] c. a. stanciu, m. cernea, e. c.secu , g. aldica, p.ganea, r. trusca, "lanthanum influence on the structure, dielectric properties and luminescence of batio3 ceramics processed by spark plasma sintering technique", journal of alloys and compounds, vol. 706, pp. 538–545, 2017. [29] f.d. morrison, d.c. sinclair, a.r. west, "electrical and structural characteristics of lanthanum-doped barium titanate ceramics", j. appl. phys., vol. 86, pp. 6355–6366, 1999. [30] r. zhang, j.f. li, d. viehland, "effect of aliovalent substituents on the ferroelectric properties of modified barium titanate ceramics: relaxor ferroelectric behavior", j. am. ceram. soc., vol. 87, pp. 864–870, 2004. [31] f.d. morrison, a.m. coats, d.c.sinclair, a.r.west, "charge compensation mechanisms in la-doped batio3", j.europ. ceram. soc., vol. 6, no. 3, pp. 219–232, 2001. [32] f.d. morrison, d.c.sinclair, a.r.west, "doping mechanisms and electrical properties of la-doped batio3ceramics", int. j. inorg. mater., vol. 3, pp. 1205–1210, 2001. [33] j. jeong, y.h. han, "electrical properties of acceptor doped batio3", journal of electroceramics, vol. 13, no. 13, pp. 549–553, 2004. 
[34] h. yoon, c.a. randall, k.h. hur, "difference between resistance degradation of fixed valence acceptor (mg) and variable valence acceptor (mn)-doped batio3 ceramics", j. appl. phys., vol. 108, pp. 064101-9, 2010. [35] y.y. yeoh, h. jang, h.i. yoo, "defect structure and fermi-level pinning of batio3 co-doped with a variable-valence acceptor (mn) and a fixed-valence donor (y)", phys chem chem phys., vol. 14, no. 5, pp. 1642-8, 2012. [36] h. kishi, n. kohzu, y. iguchi, j. sugino, m. kato, h. ohasato, t. okuda, "occupation sites and dielectric properties of rare-earth and mn substituted batio3", j. europ. ceram. soc., vol. 21, pp. 1643–1647, 2001. [37] h. miao, m. dong, g. tan, y. pu, "doping effects of dy and mg on batio3 ceramics prepared by hydrothermal method", journal of electroceramics, vol. 16, pp. 297–300, 2006. [38] k. albertsen, d. hennings, o. steigelmann, "donor-acceptor charge complex formation in barium titanate ceramics: role of firing atmosphere", journal of electroceramics, vol. 2:3, pp. 193–198, 1998. [39] k. uchino, s. namura, "critical exponents of the dielectric constants in diffuse-phase transition crystals", ferroelectrics letters, vol. 44, pp. 55–61, 1982.

instruction facta universitatis series: electronics and energetics vol. 29, no. 3, september 2016, pp.
325–338 doi: 10.2298/fuee1603325c

design and technologies for implementing a smart educational building: case study

ionut cardei, borko furht, luis bradley
department of computer & electrical engineering and computer science, florida atlantic university, boca raton, florida, usa

abstract. in this paper we describe the design of an educational smart building and the innovative technologies that were implemented. in january of 2011, florida atlantic university opened its new leed platinum-certified “engineering east” building. the building was designed both as a model of how new technologies can drastically decrease the energy requirements of a large university building and as a “living laboratory” in which students and faculty can see how these systems work and interrelate. engineering faculty were involved in providing input to the builder in creating state-of-the-art engineering laboratories.

key words: smart building, living laboratory, sensors, leed-certified, power analysis

1. introduction

the development of smart buildings has gained importance, because innovative techniques can optimize the operation of a building in order to reduce expenses and save energy. recent research deals with various approaches and related technologies in designing smart buildings [1-6]. to support research, our educational smart building is outfitted with hundreds of different sensors that record everything from the temperature of the cold water entering and leaving the building, to the amount of electricity generated by the solar panels on the roof and walkways, to the level of co2 in a lab at a certain time. the paper provides an overview of the various innovative systems implemented in our new “smart, green building,” shown in figure 1, and highlights the various sensors and data available for analyzing these systems, both in real time and by storing the data and processing it with data mining routines.
several research projects conducted as part of the “living laboratory” are described.

received june 25, 2015
corresponding author: borko furht, department of computer & electrical engineering and computer science, florida atlantic university, boca raton, florida, usa (e-mail: bfurht@fau.edu)

1.1. what is leed

the leed (leadership in energy and environmental design) green building certification program is a voluntary, consensus-based rating system for buildings designed, constructed and operated for improved environmental and human health performance. leed addresses all building types and emphasizes state-of-the-art strategies in five areas:
1. sustainable site development
2. water savings
3. energy efficiency
4. materials and resources selection, and
5. indoor environmental quality.

points are attempted in each of these five areas in order to achieve a silver, gold, or platinum level of certification based on the type of project. when the building was completed in 2011, a minimum of 52 points was required to achieve platinum level certification. engineering east achieved 55 points. the leed scorecard for the building is given at: http://www.eng.fau.edu/pdf/green_bldg_scorecard071411.pdf.

fig. 1 leed platinum-certified engineering building at fau

2. main building subsystems and their design

innovative technologies were implemented in the following subsystems:
• hvac system (heating, ventilation & air conditioning),
• cloud computing system and its network control,
• power generation system and its control.

the mechanical equipment and sensors, installed on the 1st floor, are shown in figure 2.

fig.
2 mechanical equipment, pumps, piping, and sensors

the building is equipped with hundreds of sensors that measure and collect various parameters and display them in real time on the dashboard, which is accessible through a web-based application called devicewise, created by ils technology. the devicewise system periodically captures sensor data from the building’s electrical, computing, and air conditioning systems and stores the information in a database. the system then provides a web-based application that displays the summarized information in an energy dashboard accessible from any internet browser. devicewise also provides an api that allows other programs to access the data stored in the database for extraction and reporting outside of the energy dashboard. figure 3 shows the dashboard indicator with a menu of views. the selected view is the 1st floor temperature.

fig. 3 dashboard indicators: on the left is the menu of various views including hvac network, power, and water systems.

2.1. hvac system

unlike traditional cooling systems that use air conditioning systems or heat pumps to remove hot air from a building and replace it with cooled air, the system in this building does the opposite. since on most days in florida there is a need to cool buildings rather than heat them, a campus-wide chilled water system delivers cold water throughout the campus. our building uses an innovative technique to temper the chilled water to reduce humidity, and other systems to heat the building if needed. there are three chilled water tertiary pumps in the building to circulate the chilled water. two pumps operate in parallel continuously at 50% capacity. the third pump is normally turned off and acts as a back-up to the first two pumps. the active pumps are cycled on a periodic basis.
as the chilled water arrives in the building, some of it is first put through a heat exchanger that increases the temperature of the water by around 10 degrees fahrenheit. the rest of the water is then piped into the chilled beams in the building and sent to the roof to run through the coils of the air handler units. many sensors are installed throughout the system to ensure that all components are working properly. these sensors include the supply and return temperatures at different locations in the building, the output flow and status of the three pumps, and the differential and the status of the chilled water control valve, as shown in figure 4.

fig. 4 chilled water system and related sensors

the heating hot water system, used to heat the building in winter and dehumidify it in summer, comprises three different water circuits: a well water system, a source water system and a hot water system. the well water system pulls water from a well which maintains a constant water temperature of around 78 degrees f. this water is then run through a heat exchanger between the well water system and the source water system. depending upon the temperature of the source water, the heat exchanger either transfers heat energy from the source system to the well system or from the well system to the source system. the source water system runs the water through pipes linked into the computer server room cooling system and absorbs the heat energy from the it equipment. the water heated by the servers is passed through another heat exchanger to the hot water system on the other side. the hot water system then absorbs the heat energy from the source water system.
if additional hot water is required for dehumidification or heating above what is generated by the computer room servers, the source water may also run through one to three heat pumps that extract additional heat from the source water and transfer that heat energy to the hot water system. the resulting hot water is then sent through the piping system to each floor. there is one additional heat pump used as a back-up, and the order in which the heat pumps are engaged (first, second, third) is rotated on a weekly basis. sensors record the temperature of the well water, the temperature before and after the first heat exchanger, after absorbing the energy from the servers, and at other spots throughout the building, as shown in figure 5. sensors also record the status and operating statistics for the various pumps and heat pumps used in the system.

fig. 5 heating hot water system and related sensors

the high temperature chilled water component for air conditioning uses a heat exchanger to transfer heat energy from a separate water flow to the chilled water system. this results in the separate system maintaining its water temperature at approximately 10°f above the chilled water temperature (approximately 55°f). the increase in temperature is necessary to reduce the possibility of condensation when the water is run through pipes directly above certain locations. this water is then circulated in separate pipes throughout the building. the system relies on two pumps, with one pump running at a time and the other acting as a back-up. the pumps’ roles are swapped each week. the main sensors used by the high temperature chilled water system include the status and output of the two pumps, and the temperature of the water entering, after the initial heat exchange, and before the final heat exchange.

2.2. engineering server room and cloud computing system

the server room is set up using a “room within a room” (or “hot aisle”) configuration.
the room itself is a 600-square-foot open space cooled by fans blowing over the tempered cold water system. the inner room comprises 14 server racks, 4 computer room cooling units and an uninterruptible power conditioning system. the racks and equipment are installed so that they form an enclosed “hot aisle” in about 300 square feet of space.

fig. 6 (a) the server room hosting the private cloud computing system. (b) computer laboratory connected to the cloud computing system using thin clients.

the four computer room cooling units supply additional cooling using a gas-to-liquid refrigerant configuration. the units remove heat from the “hot aisle” and pass that heat energy through a heat exchanger to the building heating system. the resulting cooler air is blown into the outer room, further cooling that space. server and equipment fans then pull in this cooled air and blow warmer air back into the “hot aisle”. the server room is monitored by several types of sensors. one set of sensors captures power data from the four computer room cooling units, including the total amperage for the supply fans and compressors. several lan-connected temperature and humidity sensors are deployed throughout both the inner room and the outer room and provide real-time access to that information. another set of sensors monitors the lan traffic flow to and from the servers and the power used (in kilowatts) by the servers. the computer system is architected as a private cloud computing system. two computer laboratories and all computers in the building use cloud computing technology to run software and access data stored in the cloud computing system (fig. 6). the cloud computing system consists of 14 blade computers, and the network traffic is measured and controlled in real time, as illustrated in figure 7.

fig. 7 network traffic measurements: cumulative inbound transfer

2.3.
building power generation and control

the building generates approximately 4% of its power from three arrays of solar photovoltaic panels. one set of 96 panels is installed on the south-east facing roof of the davinci conference center. two other arrays are installed over the north-south (32 panels) and east-west (48 panels) walkways around the building (fig. 8). the panels are all rated at 282 watts each, which results in a maximum output from all 176 panels in direct sunlight of around 50 kw. the power generated by the solar arrays is routed through a central power conversion unit located in the server room, where the direct current from the arrays is converted to alternating current and added to the building’s power grid. the converter provides real-time reporting of the watts passed through to the grid.

fig. 8 solar photovoltaic cells for power generation

the overall utilization of electricity within the building is monitored by a set of sensors. the utilization is divided into the following categories so that the changes to each system can be analyzed:
• mechanical equipment: power used to run the air handler units, pumps, heat pumps, dampers, fans and any other equipment involved in the air conditioning of the building.
• lights: power used to light the building.
• receptacle: power consumed by any devices plugged into wall receptacles in the building.
• kitchen: power consumed by the equipment in the kitchen.
• power consumed by the uninterruptible power supply systems to charge back-up batteries.
• emergency power, and
• solar power generated by photovoltaic panels.

figure 9 illustrates the cumulative power usage by various subsystems, while figure 10 shows the real-time network measurements for separate chassis.

fig. 9 cumulative power usage by various subsystems

the lights in the building are controlled through a system of linked sensors and switches provided by encelium technologies.
the lighting system includes both occupied and unoccupied modes for the main hallway lighting, along with overrides based on building occupancy. it relies on sensors and switches installed throughout the building that determine room occupancy by motion detection and the required illumination based on ambient light. the switches allow room occupants to temporarily override normal room lighting.

fig. 10 real-time network measurements for separate chassis.

the system uses a light management application from encelium named polaris 3d to optimize the power required for room illumination. the system receives inputs from the various sensors and switches throughout the building, combines that information with configurable lighting parameters, then determines the best lighting for each room and each area. it also records lighting and occupancy information for further analysis and research.

3. living laboratory: research projects

3.1. alerting and monitoring system

as part of the nsf center project, we worked with aware technologies and their process data monitor (pdm) system [7], which is an alerting system that uses data mining techniques to categorize sensor data into similar clusters of information. once these clusters are identified, users can determine whether the clusters represent normal or abnormal running conditions for the sensor data used in creating the cluster. the system then keeps track of how often and when each cluster is computed, providing an analytical tool to assist in optimization. it also detects when a calculated cluster is outside of the normal operating parameters and reports those anomalies through emails. pdm uses a tool named xlreporter to extract information from building sensors and reformat the information into xml files that are then processed by the system.
we also developed a data warehouse that stores information from several different sensor systems, including devicewise, standalone wireless and wired sensors, pdm calculated clusters, and weather stations. the collected weather data comes from a link to the weatherbug api [8], which pulls meteorological information every 15 minutes, including temperature, humidity, wind speed and direction, air pressure, rain amount and light levels. the system provides a set of utility programs that extract the data from the sensors, store it in the fau “green” database, then summarize and export the information in a variety of different formats for use by other tools such as weka and excel. these systems have been used in several preliminary studies to help validate the data being collected and determine possible future research opportunities. these studies included:
• determining the correlation between photovoltaic energy generation and weather conditions,
• calculating energy flow between the different components of the air conditioning systems,
• categorizing cluster data from the pdm system, and
• determining room occupancy based on room co2 levels.

3.2. building power analysis

in this section we present a summary of the results we obtained from analyzing the performance of various building systems.

photovoltaic power system

we analyzed the efficiency of the solar panels by tracking the power generated over a period of one year. figure 11 shows the solar energy generated per day (in kwh, right y axis) in comparison with the total energy consumed by the building systems per day (kwh, left axis), excluding the energy used by it equipment in the data center.

fig. 11 photovoltaic energy generated and the total energy used by building systems, including mechanical, receptacles, kitchen, lighting, and standby power.
we noticed a high variation in the solar power generated; this is due mainly to the variable day-by-day cloud coverage. the period between march and may had very clear skies, with almost no rain and hardly any clouds. conversely, the beginning of 2013 was a period of high cloud coverage which, combined with the shorter daytime, reduced the solar energy generation. the total energy consumed by building systems (shown in orange) depends on building occupancy, with lower consumption during school breaks in march, june-august, for thanksgiving (end of november), and the winter break. a summary of the energy statistics is listed in table 1. the solar energy produced is on average 4.45% of the total energy used by building systems and has a high standard deviation, about one third of the mean. we built a predictive model with the weka tool for the solar power, having the time of day, outside temperature, light level %, humidity %, and rain as attributes. a reptree decision tree algorithm achieves the lowest relative absolute error among all alternatives, 19.26%, with a mean absolute error of 1.38 kw and a correlation coefficient of 93.9%. the solar power prediction error is attributed to the light level being measured at a meteorological station located 4 km from the building; passing clouds cause a variable delay in the measured light levels.

table 1 summary statistics for the solar energy produced and the total energy consumed during a 1-year period.

           photo energy per day (kwh)   total energy per day (kwh)   photo / total (%)
maximum    227.05                       3720.83                      7.60
minimum    16.53                        2575.42                      0.60
average    134.55                       3030.52                      4.45
stdev      45.84                        223.83                       1.54
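the headline figures quoted for the photovoltaic system can be cross-checked with a few lines of arithmetic. the sketch below uses only the panel counts, the 282 w rating and the table 1 averages stated in the text; the variable names are illustrative. note that it computes the ratio of the two yearly averages, which need not coincide exactly with the 4.45% in table 1 if that figure is an average of daily ratios (an assumption on our part).

```python
# Panel counts and rating as stated in the text
PANEL_COUNTS = {
    "conference center roof": 96,
    "north-south walkway": 32,
    "east-west walkway": 48,
}
PANEL_RATING_W = 282

# Peak (nameplate) output of all panels in direct sunlight, in kW
nameplate_kw = sum(PANEL_COUNTS.values()) * PANEL_RATING_W / 1000.0

# Daily averages and standard deviation from Table 1 (kWh per day)
AVG_SOLAR_KWH, STD_SOLAR_KWH = 134.55, 45.84
AVG_TOTAL_KWH = 3030.52

solar_share_pct = 100.0 * AVG_SOLAR_KWH / AVG_TOTAL_KWH   # ratio of the averages
coeff_of_variation = STD_SOLAR_KWH / AVG_SOLAR_KWH        # stdev relative to mean

print(f"nameplate capacity: {nameplate_kw:.1f} kW")
print(f"solar share of building energy: {solar_share_pct:.2f} %")
print(f"stdev / mean of daily solar energy: {coeff_of_variation:.2f}")
```

the three results agree with the statements in the text: roughly 50 kw of peak output, a solar share of about 4.4%, and a standard deviation of about one third of the mean.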
we built a predictive model for the solar power with the weka tool, using the time of day, outside temperature, light level (%), humidity (%), and rain as attributes. a reptree decision tree algorithm achieves the lowest relative absolute error among all alternatives, 19.26%, with a mean absolute error of 1.38 (kw) and a correlation coefficient of 93.9%. the solar power prediction error is caused by measuring the light level from a meteorological station located 4 km from the building; clouds passing over cause a variable delay in the measured light levels. a k-means clustering algorithm applied to the same photovoltaic data model yields the clusters seen in figure 12. the relation between the light value and the solar power generated is disturbed by the aforementioned measurement delay and by the orientations of the solar panels, which do not match the normal orientation of the light sensor.

fig. 12 dependence of the photovoltaic power (vertical axis) on the light level (horizontal axis). instances are color-coded according to classes determined by the k-means clustering algorithm.

prediction models for receptacle power

receptacle power in the building is used by anything plugged into a power outlet. it depends in part on the building occupancy, as office equipment (e.g. laptops) is a major consumer. figure 13 shows the dependence of the receptacle power (vertical axis, in the 16.6-26.4 kw interval) on the time of day (0-2400), as measured during september 2013. the power rises sharply after 8am, peaks at 1pm, then drops gradually after 4pm until 11pm. the points in brown represent measurements taken during the weekend, with a lower power value. the weekend peak is 19 kw, the weekday peak (on wednesday at 1pm) is 26.4 kw, while at night the power drops to 16.9 kw. the chart also partitions the measurements into 7 clusters determined by the k-means algorithm.
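the error figures quoted above for the reptree model (mean absolute error, relative absolute error, correlation coefficient) can be reproduced for any set of predictions with a few lines of python. the sketch below uses the common textbook definitions, with the relative absolute error normalized by the error of always predicting the mean of the actual values; that this matches weka's exact computation is an assumption, and the sample values are hypothetical.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average of |predicted - actual|."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


def relative_absolute_error(actual, predicted):
    """Total absolute error divided by that of a predictor that always
    returns the mean of the actual values (assumed definition)."""
    mean_actual = sum(actual) / len(actual)
    numerator = sum(abs(p - a) for a, p in zip(actual, predicted))
    denominator = sum(abs(a - mean_actual) for a in actual)
    return numerator / denominator


def correlation_coefficient(actual, predicted):
    """Pearson correlation between actual and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


# Hypothetical measured and predicted power values (kW).
actual = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.5, 2.0, 2.5, 4.0, 5.5]
print(mean_absolute_error(actual, predicted))
print(relative_absolute_error(actual, predicted))
print(correlation_coefficient(actual, predicted))
```

a relative absolute error below 100% means the model beats the constant-mean baseline, which is why it is a convenient figure for comparing algorithms on the same data.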
an m5 decision tree model computed with weka has a relative absolute error of 36.15%, a mean absolute error of 0.6837, and a correlation coefficient of 92.31%. for training and evaluation in all experiments we used 10-fold cross validation and searched for the lowest error.

fig. 13 weka screenshot showing the dependence of the receptacle power on the time of day (0 – 2400) and 7 clusters computed using the k-means algorithm.

data center power

the data center consists of two racks located in a “hot aisle” enclosure, with separate uninterruptible power supply units used by computing blades, discrete pcs, network attached storage, and networking equipment. the total power used by the data center includes that used by the four crac (computer room air conditioning) units that cool the air inside the hot aisle. the total data center power has grown in small increments due to additions and upgrades to equipment, from 277 kw in 08/2012 to 368 kw in 10/2013. the standard deviation of the total power during a day and during a week is about 1.1% of the average for the corresponding period. the hot water circuit that extracts heat from the hot aisle using heat pumps achieves a high temperature of 50°c and a maximum differential of 20°c, proving effective in reusing waste heat for other building systems.

4. conclusions

strong instrumentation in the new leed platinum engineering building opens up a multidimensional view of the inner workings of its hvac and power systems. a variety of sensors allow a detailed analysis of the building system performance and the data center power utilization. however, it is equally important to consider external variables, such as weather, building occupancy, and school schedule, to get a more accurate picture.

our analysis found that the solar power generated covers a maximum of about 7.6%, and 4.45% on average, of the total building-related energy consumption, and that it varies highly with cloud coverage. another interesting observation is that the total power used by the building (122 kw on average for 09/2013) is 2.79 times smaller than the power consumed by the data center (average of 341 kw). still, this figure does not include most of the power needed for cooling the building, as it is spent by the chilled water campus plant. in the future we will conduct more analysis aiming to estimate power savings from cooling in the data center by raising the hot aisle temperature.

acknowledgement: the paper is a part of the research done within the project funded by nsf industry/university cooperative research center for advanced knowledge enablement, 2009-2020.

references
[1] d. sciuto and a.a. nacci, “on how to design smart energy-efficient buildings,” in proceedings of the 12th ieee international conference on embedded and ubiquitous computing, milano, italy, 2014, pp. 205-208.
[2] “special section on intelligent buildings and home energy management in a smart grid environment,” ieee transactions on smart grid, vol. 3, no. 4, december 2012.
[3] o. evangelatos, k. samarasinghe, and j. rolim, “evaluating design approaches for smart building systems,” in proceedings of the 9th international conference on mobile adhoc and sensor systems, las vegas, nevada, 2012, pp. 1-7.
[4] y. sun, t-y. wu, g. zhao, and m. guizani, “efficient rule engine for smart building systems,” ieee transactions on computers, vol. 64, no. 6, pp. 1658-1669, june 2015.
[5] r. fantacci, t. pecorella, r. viti, c. carlini, and p. obino, “enabling technologies for smart building, what’s missing?”, in proceedings of the aeit conference, mondello, italy, 2013, pp. 1-5.
[6] s. tadokoro et al., “smart building technology,” ieee robotics & automation magazine, vol. 21, no. 2, pp. 18-20, 2014.
[7] aware technologies, process data monitor (pdm), http://awaretechnology.com/.
[8] weatherbug api, http://weather.weatherbug.com/desktop-weather/api-documents.html.

facta universitatis series: electronics and energetics vol. 31, no. 3, september 2018, pp. 487-500 https://doi.org/10.2298/fuee1803487a

electromagnetic analysis of single/multiple grounding rods *

vesna arnautovski-toseva, leonid grcev
ss. cyril and methodius university, faculty of electrical engineering and information technologies, skopje, macedonia

abstract. this paper presents electromagnetic modeling of multiple driven grounding rods in homogeneous/two-layer soil. the mathematical model is formulated as a mixed potential integral equation (mpie) on the basis of sommerfeld integrals. several configurations of multiple driven rods located in homogeneous or two-layer soil are analyzed. the authors focus on the calculation of the current density along the rods in a wide frequency range, from 100hz to 1mhz.

key words: electromagnetic theory, grounding rod, high frequencies, homogeneous soil, two-layer soil.

1. introduction

ground rods are the simplest and most commonly used means of earth termination of different electrical systems, providing a conducting connection, whether intentional or accidental, between an electrical circuit or equipment and the earth. their behavior at dc and at power frequency (50 or 60 hz) is well understood [3-4], but their high-frequency (hf) and transient performance is also of interest in different fields, such as lightning protection, power and telecommunication systems, power system transients, electromagnetic compatibility, etc. the safety criteria based on “a minimum rise in the potential” are taken from power systems analysis. in such cases the usual dc approximation leads to rather straightforward computations.
the simulation studies show that grounding systems, even the simplest ones, behave quite differently at low and high frequencies [5]. the survey of the literature shows that high frequency analysis of grounding systems is realized by using lumped circuit equivalents [6-7], the quasi-static method of images [8-9], a rigorous electromagnetic model [10-11], or hybrid approaches [12-13]. however, there is a lack of papers that treat the problem of multiple driven rods at high frequencies; only the dc case has been treated [3]. our objective in this paper is to give insight into the problem of the high frequency performance of multiple driven ground rods in homogeneous or two-layer soil with respect to the case of a single rod. the main interest is the current density along the rods of various configurations for which the dc behavior is known [4]. preliminary results of the authors’ research in this field are presented in [1]. the authors have recently presented a similar analysis of a single rod at high frequencies [2].

received february 19, 2018; received in revised form may 24, 2018
corresponding author: vesna arnautovski-toseva, faculty of electrical engineering and information technologies, skopje, macedonia (e-mail: atvesna@feit.ukim.edu.mk)
* an earlier version of this paper was presented at the 13th international conference on applied electromagnetics (пес 2017), august 31 - september 01, 2017, in niš, serbia [1]

2. electromagnetic model

the rigorous treatment of the air/two-layer soil interfaces in electromagnetic models is based on the exact solution for the field of a hertz dipole near a conducting half space/two-layer soil. this approach involves green’s functions formulated by sommerfeld integrals that need numerical integration. this model is confirmed as theoretically the most accurate, since it is based on minimum approximations. the detailed description of the mathematical model is given in our previous work [10, 11, 15].
in this analysis the electromagnetic model is extended to take into account the multiple-rod geometry and the corresponding excitations.

2.1. geometry of the problem

in fig. 1 we consider a grounding system consisting of $K$ identical parallel rods, each of length $l$ and radius $a$, penetrating homogeneous/two-layer soil. the upper layer (medium 1) is of finite depth $d$ and is characterized by permittivity $\varepsilon_1$, permeability $\mu_0$ and resistivity $\rho_1$. when the soil is homogeneous it is assumed that $d \to \infty$. in the case of two-layer soil, the bottom layer (medium 2) is characterized by permittivity $\varepsilon_2$, permeability $\mu_0$ and resistivity $\rho_2$. the air (medium 0) is characterized by permittivity $\varepsilon_0$ and permeability $\mu_0$. the corresponding rod lengths in the upper and bottom layers are $l_1$ and $l_2$, respectively. in the case of homogeneous soil $l_2 = 0$.

fig. 1 geometry of multiple rods in two-layer soil

2.2. mathematical model

a grounding rod may be considered in a circuit with an ideal harmonic current source of magnitude $I_s$, with one terminal connected to the ground electrodes and the other terminal connected to the remote earth, theoretically at infinite distance. the influence of the connecting leads is ignored. multiple-rod excitation is modeled by $K$ harmonic current sources of equal magnitude $I_s$, leading to a total excitation current of $K I_s$.

fig. 2 approximation of the current with triangle dipoles

following the thin-wire approximation, the physical model of the system of ground rods is based on fictitious segmentation into $N+1$ straight tubular segments, fig. 2. the segmentation is done in such a way that no segment penetrates through the boundary between the two soil layers. to solve for the current distribution along each rod, the method of moments is applied by using the thin-wire approximation [10].
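the segmentation rule stated above (no segment may cross the boundary between the two soil layers) can be sketched as follows. this is only an illustration of the rule, not the authors' meshing code: the function name, the uniform subdivision within each layer and the 1 m maximum segment length are assumptions; the rod extent (0.05 m to 10.05 m) and the interface depth (5 m) are the values used later in the numerical results.

```python
import math


def segment_rod(z_top, z_bottom, interface_depth, max_segment_length):
    """Return node depths along a vertical rod so that the layer interface,
    if it lies within the rod, always coincides with a node."""
    breakpoints = [z_top]
    if z_top < interface_depth < z_bottom:
        breakpoints.append(interface_depth)
    breakpoints.append(z_bottom)

    nodes = [z_top]
    for start, end in zip(breakpoints, breakpoints[1:]):
        # Small tolerance so that round-off does not create an extra segment.
        count = max(1, math.ceil((end - start) / max_segment_length - 1e-12))
        step = (end - start) / count
        nodes.extend(start + step * i for i in range(1, count))
        nodes.append(end)  # append the exact end point, not a rounded sum
    return nodes


# Two-layer case: rod from 0.05 m to 10.05 m, interface at d = 5 m.
nodes = segment_rod(0.05, 10.05, 5.0, 1.0)
print(nodes)

# Homogeneous case: d -> infinity, so no interface node is inserted.
print(segment_rod(0.05, 10.05, math.inf, 1.0))
```

subdividing each layer separately is what guarantees a node at $z = d$, which is where one triangular basis function is required to have its peak.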
following the galerkin formulation, the current distribution in the system of rods is approximated by $N$ overlapped triangular dipoles, each extending over two neighboring segments, $l_{i-1} + l_i$ ($i = 2, 3, \dots, N+1$). one of the triangular dipoles passes through the interface between the two soil layers, with its top point located exactly at the interface ($z = d$). the total excitation current $K I_s$ is approximated by $K$ additional triangular monopoles $s_k$ of length $l_k$, each of magnitude $I_s$, that are positioned at the top of each rod. the weighting functions are also triangular dipoles. the following matrix equation yields the current distribution along the grounding rods

$$
\begin{bmatrix}
z_{11} & z_{12} & \cdots & z_{1N} \\
z_{21} & z_{22} & \cdots & z_{2N} \\
\vdots & \vdots & \ddots & \vdots \\
z_{N1} & z_{N2} & \cdots & z_{NN}
\end{bmatrix}
\begin{bmatrix} I_1 \\ I_2 \\ \vdots \\ I_N \end{bmatrix}
= -
\begin{bmatrix}
z_{1s_1} I_{s_1} + z_{1s_2} I_{s_2} + \cdots + z_{1s_K} I_{s_K} \\
z_{2s_1} I_{s_1} + z_{2s_2} I_{s_2} + \cdots + z_{2s_K} I_{s_K} \\
\vdots \\
z_{Ns_1} I_{s_1} + z_{Ns_2} I_{s_2} + \cdots + z_{Ns_K} I_{s_K}
\end{bmatrix}
$$ (1)

where the column matrix $[I]$ contains the coefficients $I_n$ of the unknown currents; $[Z]$ is the generalized impedance matrix related to self and mutual impedances between all triangular dipoles, which represents all electromagnetic influences between all rods in the configuration; and $[Z_s I_s]$ is the excitation matrix, in which the corresponding multiple-rod excitations and their influences are taken into account. once the currents in the dipoles are computed, the current density along each rod in the configuration is easily estimated

$$
J_i = \frac{I_i - I_{i+1}}{l_i}
$$ (2)

the elements $z_{ij}$ and the elements $z_{is_k}$ correspond to mutual impedances between the source dipole $i$ and the observation dipole $j$ ($i, j = 1, 2, \dots, N$), and respectively between the source dipole $i$ ($i = 1, 2, \dots, N$) and each of the $K$ excitation monopoles $s_k$, as follows

$$
z_{ij} = \frac{U_{ij}}{I_i} = -\frac{1}{I_i}\int_{l_j} E_{zj}\,dz, \qquad
z_{is_k} = \frac{U_{is_k}}{I_i} = -\frac{1}{I_i}\int_{l_k} E_{zs_k}\,dz
$$
(3)

here $E_{zj}$ is the tangential electric field at the surface of the observation dipole $j$ over the length of a dipole segment $l_j$ due to the current $I_i$ in the dipole $i$. in this analysis, the mpie formulation for the electric field is used

$$
E_z = -j\omega A_z - \frac{\partial V}{\partial z}
$$ (4)

$$
A_z = \int_{l_i} G_{Azz}\, I_i \, dz', \qquad
V = \int_{l_i} G_{Vz}\, q_i \, dz', \qquad
q_i = -\frac{1}{j\omega}\frac{dI_i}{dz'}
$$ (5)

in (5) $G_{Azz}$ is the z-component of the dyadic green's function for the magnetic vector potential at the observation point $(x, y, z)$ due to a hertzian vertical electric dipole (ved) of unit strength at the source point $(x', y', z')$. respectively, $G_{Vz}$ is the corresponding scalar potential green's function due to a single point charge associated with the ved [14]. the exact spatial-domain green’s functions are formulated as sommerfeld integrals, which are solved by direct numerical integration

$$
G^{mn}_{Azz,Vz} = \frac{1}{2\pi}\int_0^\infty \tilde G^{mn}_{Azz,Vz}(k_\rho)\, J_0(k_\rho \rho)\, k_\rho \, dk_\rho
= \frac{1}{2\pi}\, S_0\!\left\{ \tilde G^{mn}_{Azz,Vz} \right\}
$$ (6)

where $J_0(k_\rho \rho)$ is the zero-order bessel function of the first kind, $k_\rho$ is the radial spectral variable and $\rho$ here denotes the horizontal distance between the source and observation points. the corresponding spectral domain green’s functions for the magnetic vector potential are given below [15]

$$
\begin{aligned}
\tilde G^{11}_{Azz} &= \frac{\mu_0}{2jk_{z1}}\left[ e^{-jk_{z1}|z-z'|} + A + B \right] \\
\tilde G^{12}_{Azz} &= \frac{\mu_0}{2jk_{z1}}\, T_{12}\, M\, e^{-jk_{z2}(z-d)} \left[ e^{-jk_{z1}(d-z')} + R_{10}\, e^{-jk_{z1}(d+z')} \right] \\
\tilde G^{21}_{Azz} &= \frac{\mu_0}{2jk_{z2}}\, M\, T_{21}\, e^{-jk_{z2}(z'-d)} \left[ e^{-jk_{z1}(d-z)} + R_{10}\, e^{-jk_{z1}(d+z)} \right] \\
\tilde G^{22}_{Azz} &= \frac{\mu_0}{2jk_{z2}}\left[ e^{-jk_{z2}|z-z'|} + M \left( R_{21} + R_{10}\, e^{-j2k_{z1}d} \right) e^{-jk_{z2}(z+z'-2d)} \right]
\end{aligned}
$$ (7)

$$
\begin{aligned}
A &= e^{-jk_{z1}(d-z)}\, R_{12} \left[ e^{-jk_{z1}(d-z')} + R_{10}\, e^{-jk_{z1}(d+z')} \right] M \\
B &= e^{-jk_{z1}z}\, R_{10} \left[ e^{-jk_{z1}z'} + R_{12}\, e^{-jk_{z1}(2d-z')} \right] M \\
M &= \frac{1}{1 - R_{12} R_{10}\, e^{-j2k_{z1}d}}
\end{aligned}
$$
(8)

in the above relations, $R_{21}$, $R_{10}$, $T_{21}$ and $T_{12}$ are the reflection and transmission coefficients of a tm wave incident on the two interfaces between the media

$$
\begin{aligned}
R_{21} &= \frac{\underline{\varepsilon}_1 k_{z2} - \underline{\varepsilon}_2 k_{z1}}{\underline{\varepsilon}_1 k_{z2} + \underline{\varepsilon}_2 k_{z1}}, \qquad
R_{10} = \frac{\varepsilon_0 k_{z1} - \underline{\varepsilon}_1 k_{z0}}{\varepsilon_0 k_{z1} + \underline{\varepsilon}_1 k_{z0}}, \qquad
R_{12} = -R_{21} \\
T_{21} &= \frac{2\,\underline{\varepsilon}_1 k_{z2}}{\underline{\varepsilon}_1 k_{z2} + \underline{\varepsilon}_2 k_{z1}}, \qquad
T_{12} = \frac{2\,\underline{\varepsilon}_2 k_{z1}}{\underline{\varepsilon}_1 k_{z2} + \underline{\varepsilon}_2 k_{z1}} \\
k_{zi} &= \sqrt{k_i^2 - k_\rho^2}, \qquad
k_0^2 = \omega^2 \mu_0 \varepsilon_0, \qquad
k_i^2 = \omega^2 \mu_0\, \underline{\varepsilon}_i, \qquad
\underline{\varepsilon}_i = \varepsilon_i - \frac{j}{\omega \rho_i}, \qquad i = 1, 2
\end{aligned}
$$ (9)

the corresponding spectral expressions for the scalar potential green's functions are derived by using the following relation, with $m$ the source layer and $n$ the observation layer

$$
\tilde G^{mn}_{V} = \frac{1}{\mu_0\, \underline{\varepsilon}_n\, k_{zm}^2}\, \frac{\partial^2 \tilde G^{mn}_{Azz}}{\partial z\, \partial z'}
$$ (10)

3. numerical results

in this section the frequency domain behavior of single/multiple grounding rods located in homogeneous/two-layer soil is analyzed. three grounding test configurations are assumed, as shown in fig. 3: single rod (r1), 5-rod configuration (r5), and 9-rod configuration (r9). the geometry of each rod is identical, characterized by a length of 10 m (extending from 0.05 m to 10.05 m) and a radius of 0.01 m. the outer dimensions of the “mesh” of the r5 and r9 configurations are: a) 20 m × 20 m (r5a and r9a); and b) 10 m × 10 m (r5b and r9b), as given in [3]. the main objective of this analysis is to compare the behavior of the given multiple-rod grounding structures at high frequencies with respect to the corresponding dc results.

fig. 3 test configurations: single rod (r1) and multiple rods (r5) and (r9)

in the case of two-layer soil the permittivity of both layers is $\varepsilon_1 = \varepsilon_2 = 10\varepsilon_0$. the resistivity of the upper layer with depth d = 5 m is fixed at $\rho_1$ = 100 Ωm, while the bottom layer resistivity is $\rho_2$ = 33.33 Ωm (reflection factor k = −0.5) or $\rho_2$ = 300 Ωm (reflection factor k = +0.5). the homogeneous soil corresponds to the reflection factor k = 0. the total excitation current in the corresponding analysis is $K \times 1$ ka, i.e. the excitation current applied to each rod is 1 ka.

3.1. homogeneous soil

in this section, all test configurations are analyzed in homogeneous soil.
the main objective is to investigate the influence of the number of rods in the r5 and r9 configurations, as well as how their mutual distance affects the current distribution. to this end, the current density along the central rod and the outer/corner rods is compared to the corresponding results obtained for the single rod r1. the analysis is performed in the frequency range from 100hz to 1mhz. fig. 4 shows the current density along the central and the outer/corner rods of a) the r5a and r5b configurations, and b) the r9a and r9b configurations at 100hz, together with the single rod r1 behavior. the current density along r1 is generally uniform, except at the bottom end of the rod, where a much higher current density is observed. the results show that in the multiple-rod configurations (r5 and r9) the current density in the central rod is lower in the upper part, and higher in the bottom part of the rod, as compared to the outer/corner rods. this effect is more pronounced for a smaller distance between the rods and for a larger number of rods in the grounding configuration (r9b). the results are in good agreement with the reference results obtained at dc [3]. fig. 5 a) and b) show the current density in the specific rods of the r5 and r9 configurations, respectively, at 100khz. again, the injected current is almost uniformly discharged along each rod, i.e. the grounding system performance is quasi-static up to 100khz. the differences in the current density in the outer/corner rods are lower as compared to the single rod. next, fig. 6 shows the current density along the central and outer/corner rods of the r5 and r9 configurations obtained at 1mhz, together with the single rod r1. the results show that each rod of the r5 and r9 configurations acts as an isolated rod, since the current density along each rod is identical and equal to the corresponding single rod behavior.
as may be observed, most of the injected current is discharged from the upper part of the rod. the results show that the distance between the rods has a reduced influence on the current density at higher frequencies. summarizing the results obtained for the analyzed configurations, it may be expected that when the distance between the rods is smaller, larger differences in the current densities would arise between the central rod and the outer rods. it may also be assumed that such differences would decrease at higher frequencies, so that the results converge to those given in fig. 6.

fig. 4 current densities along r1, r5 and r9 in homogeneous soil at 100hz: a) r1, r5a and r5b; b) r1, r9a and r9b (current density in ka/m versus rod length in m)

fig. 5 current densities along r1, r5 and r9 in homogeneous soil at 100khz: a) r1, r5a and r5b; b) r1, r9a and r9b (current density in ka/m versus rod length in m)

fig. 6 current densities along r1, r5 and r9 in homogeneous soil at 1mhz, k=0 (r1 and the central, side and corner rods of r5b and r9b; current density versus rod length in m)

3.2. two-layer soil

in this section, the analysis is focused on the r9 configuration, since the highest variations in the current density are observed in its central rod with respect to the outer/corner rods or the single rod case, as observed in the previous section. as may be seen in fig. 7 a) k=+0.5 and b) k=−0.5, respectively, significant differences in the current density along the ground rods are observed at 100hz. the current density is much higher in the partition of the rod located in the lower resistivity layer. also, the current density is practically uniform along the rod partition located in one layer. this result is in accordance with the rod dc behavior, when the injected current is practically uniformly discharged into the surrounding soil [3]. however, the differences in current density between the upper and the bottom partitions of the rod lead to a large jump that occurs at the interface between the two soil layers (at depth d). when k=+0.5, the current density along the central rod is lower along the upper rod partition, but much higher at the bottom rod partition, as compared to the outer/single rod. however, for k=−0.5 such differences in the current density are much less pronounced. in fig. 8, the corresponding results obtained at 100khz are shown. the results are generally similar to those shown in fig. 7. however, higher deviations, expressed by small peaks, lead to a larger discontinuity in the current density at the interface between the two soil layers.

fig. 7 current densities along r1, r9a and r9b in two-layer soil at 100hz: a) k=+0.5; b) k=−0.5 (current density in ka/m versus rod length in m)

fig. 8 current densities along r1, r9a and r9b in two-layer soil at 100khz: a) k=+0.5; b) k=−0.5 (current density in ka/m versus rod length in m)

as may be observed in fig. 9 and in fig. 10, the current density obtained at high frequencies, 1mhz and 10mhz respectively, differs significantly from the corresponding low frequency, quasi-static, behavior.
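the reflection factors k=+0.5 and k=−0.5 used for the two-layer cases are consistent with the usual two-layer soil definition $k = (\rho_2 - \rho_1)/(\rho_2 + \rho_1)$. the text does not state this formula, so it is an assumption here; the short check below (with a hypothetical function name) shows that it reproduces the quoted values from the stated resistivities.

```python
def reflection_factor(rho_upper, rho_lower):
    """Two-layer soil reflection factor (assumed definition):
    k = (rho_lower - rho_upper) / (rho_lower + rho_upper)."""
    return (rho_lower - rho_upper) / (rho_lower + rho_upper)


# Resistivities from the numerical results section, in ohm-metres.
for rho_lower in (33.33, 100.0, 300.0):
    print(rho_lower, round(reflection_factor(100.0, rho_lower), 3))
```

a negative factor therefore marks a bottom layer that is more conductive than the upper one, which is the case in which the current density jumps upward below the interface.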
fig. 9 current densities along r1, r9a and r9b in two-layer soil at 1mhz: a) k=+0.5; b) k=−0.5 (current density in ka/m versus rod length in m)

fig. 10 current densities along r1, r9a and r9b in two-layer soil at 10mhz (r1 and the central, side and corner rods of r9b for k=0, k=+0.5 and k=−0.5; current density in ka/m versus rod length in m)

at high frequencies the influence of the number of multiple rods and their mutual distance is practically negligible. the differences in the current density are mainly due to the two-layer soil parameters. at 1mhz, the jump in the current density that occurs between the upper and the bottom rod partitions is especially high when the bottom layer is less resistive. in this case, due to the high resistivity of the upper layer, most of the injected current is discharged from the bottom part of the rod, surrounded by the less resistive bottom layer. however, as the frequency increases this effect vanishes. as may be seen in fig. 10, the results obtained at 10mhz for all rods in the r9b configuration converge to the corresponding behavior of a single rod (r1).
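a rough way to see why the layering matters less as the frequency grows is to estimate how far a field penetrates into the upper layer. the sketch below computes the plane-wave skin depth in a lossy medium with the upper-layer parameters of section 3 ($\rho_1$ = 100 Ωm, relative permittivity 10). this is only a back-of-the-envelope estimate under a uniform plane-wave assumption; it is not part of the authors' model, and the function name is hypothetical.

```python
import math

MU0 = 4e-7 * math.pi          # permeability of free space (H/m)
EPS0 = 8.854187817e-12        # permittivity of free space (F/m)


def skin_depth_m(frequency_hz, resistivity_ohm_m, relative_permittivity):
    """Plane-wave skin depth (1/attenuation constant) in a lossy dielectric."""
    omega = 2.0 * math.pi * frequency_hz
    eps = relative_permittivity * EPS0
    sigma = 1.0 / resistivity_ohm_m
    loss_tangent = sigma / (omega * eps)
    alpha = omega * math.sqrt(MU0 * eps / 2.0) * math.sqrt(
        math.sqrt(1.0 + loss_tangent ** 2) - 1.0
    )
    return 1.0 / alpha


for f in (100.0, 100e3, 1e6, 10e6):
    print(f, skin_depth_m(f, 100.0, 10.0))
```

under these assumptions the estimate is roughly 500 m at 100hz, about 5 m at 1mhz and about 2 m at 10mhz, i.e. it shrinks from far larger than the 10 m rod to less than the 5 m depth of the upper layer, which is consistent with the trend described above.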
at high frequencies above 1mhz, the behavior of the multiple grounding rods is practically not affected by the number of rods in the configuration or by the distance between the rods, since each rod in the configuration acts as an isolated rod in homogeneous soil. here also, the bottom layer and the corresponding reflection factor have almost no influence on the current density, since all injected current is quickly discharged from the upper part of each rod into the upper soil layer. it may be expected that when the distance between the rods is smaller, larger differences in the current densities would arise at low frequencies, together with larger discontinuities at the interface between the two soil layers. again, it may be expected that such differences would decrease at higher frequencies, leading to results similar to those given in fig. 10.

4. conclusion

in this paper, the frequency domain behavior of single/multiple grounding rods in homogeneous/two-layer soil is analyzed. the results for the current density obtained for various multiple-rod configurations show that the performance of grounding rods at high frequencies differs significantly from their low frequency performance. in the case of homogeneous soil, the current in the central rod differs from the corresponding distribution in the outer/corner rods. at higher frequencies this effect vanishes, i.e. at 1mhz the current densities for all test cases converge to the distribution obtained for a single rod. in the case of two-layer soil, the current densities along the upper and the bottom rod partitions differ significantly and lead to a large jump in the current distribution at the interface between the two soil layers. this effect is noticeable at higher frequencies also, while the influence of other parameters, such as the number of rods and their distances, is negligible. at very high frequencies, such as 10mhz, the effect of the two distinct soil layers also vanishes.
the results obtained for the rods in all test configurations show that their performance is identical to the corresponding case of a single rod in homogeneous soil. in future work the authors will analyze in more detail the influence of the multiple-rod geometry on the current density and will extend their analysis to grounding impedance.

references
[1] v. arnautovski-toseva, l. grcev and s. cundeva, "high frequency behaviour of ground rods", in proc. of the 13th international conference on applied electromagnetics, nis, serbia, august 30 - sep. 01, 2017, pp. 1–4.
[2] v. arnautovski-toseva, l. grcev and k. el khamlichi drissi, "high frequency performance of ground rod", in proc. of the 17th ieee international conference on smart technologies, ohrid, macedonia, 6–8 july 2017, pp. 914–918.
[3] f. dawalibi, d. mukhedekar, "influence of ground rods on grounding grids", ieee trans. power app. syst, vol. 98, pp. 2089–2097, 1979.
[4] j. m. nahman, "digital calculation of earthing systems in nonuniform soil", arch. elektrotech., vol. 62, pp. 19–24, 1980.
[5] r. g. olsen, m. c. willis, "a comparison of exact and quasi-static methods for evaluating grounding systems at high frequencies", ieee trans. power del., vol. 11, pp. 1071–1081, 1996.
[6] c. t. mata, m. i. fernandez, v. a. rakov, m. a. uman, "emtp modeling of a triggered-lightning strike to the phase conductor of an overhead distribution line", ieee trans. power del., vol. 15, pp. 1175–1181, 2000.
[7] l. grcev, m. popov, "on high frequency circuit equivalents of a vertical ground rod", ieee trans. power del., pp. 1598–1603, 2005.
[8] t. takashima, t. nakae, r. ishibashi, "high frequency characteristics of impedances to ground and field distributions of ground electrodes", ieee trans. power app. syst, vol. 100, pp. 1893–1900, 1980.
[9] s. bourg, b. sacepe et al., "deep earth electrodes in highly resistive ground: frequency behaviour", in proc. of the ieee int. symp.
on electromagnetic compatibility, atlanta ga, usa, 14-18 aug. 1995, pp. 584–589.
[10] l. grcev, f. dawalibi, "an electromagnetic model for transients in grounding systems", ieee trans. power del., no. 4, pp. 1773–1781, 1990.
[11] l. grcev, v. arnautovski-toseva, "grounding systems modeling for high frequencies and transients: some fundamental considerations", in proc. ieee bologna power tech, bologna, italy, june 23-26, 2003.
[12] a. f. otero, j. cidras, j. l. del alamo, "frequency-dependent grounding system calculation by means of a conventional nodal analysis technique", ieee trans. power del., vol. 14, pp. 873–878, 1999.
[13] z-x li, w. chen, j-b fan, j. lu, "a novel mathematical modeling of grounding system buried in multilayer earth", ieee trans. power del., vol. 21, pp. 1267–1272, 2006.
[14] g. dural, m. i. aksun, "closed-form green's functions for general sources and stratified media", ieee trans. microwave theory techn., vol. 43, july 1995, pp. 1545–1552.
[15] v. arnautovski-toseva, l. grcev, "image and exact models of a vertical wire penetrating a two-layered earth", ieee trans. on electromagnetic compatibility, vol. 9, 2011, pp. 1–9.

facta universitatis series: electronics and energetics vol. 29, no. 3, september 2016, pp. 419-435 doi: 10.2298/fuee1603419m

using internet of things in monitoring and management of dams in serbia

rastko martać 1, nikola milivojević 1, vladimir milivojević 1, vukašin ćirović 1, dušan barać 2
1 institute for the development of water resources “jaroslav černi”, serbia
2 faculty of organizational sciences, university of belgrade, serbia

abstract. this paper discusses harnessing the internet of things in monitoring and managing dams in the republic of serbia. large dams are of major importance, primarily because of their use for electricity generation, but the risks associated with them must be carefully taken into account.
there is a need to consolidate information related to dam facilities in order to use it for dam management in the republic of serbia. an information system has been developed based on the existing systems, allowing utilization of intelligent network sensors. the aim of the paper is to describe possibilities of the internet of things application within a specific system for dam safety management. in order to facilitate the inclusion of a large number of intelligent sensors, a new data acquisition module for communication with sensors in the monitoring network is defined. the system should provide timely alerting in case security parameters deviate from the expected values.

key words: internet of things, cloud, dams, dam safety management, monitoring, serbia

1. introduction

most of the dams in serbia were built in the sixties and seventies of the 20th century. the risk to security increases with the age of the structure, which is why management and security of the facility have to be improved so that possible adverse situations can be considered in time [1]. it should be noted that these facilities are of vital importance for society, because they are used for electricity production and water supply. dams also provide water supply to cities and flood control, and can assist river navigation. many dams are multipurpose, providing more than one of the above benefits. damage to them, or their possible destruction, can have serious consequences for the environment. in order to provide support for the management of complex systems of hydro power plants, it is necessary to establish communication between metering systems and computer models. the complexity of the management of water resources is due to the conflicting demands of different users (hydropower, agriculture, etc.)
for limited resources, and this complexity increases in extreme weather conditions, such as droughts and floods, which are reflected in populated areas.

received july 10, 2015; received in revised form november 13, 2015. corresponding author: rastko martać, institute for the development of water resources "jaroslav černi", jaroslava černog 80, belgrade, serbia (e-mail: mrastko@gmail.com)

dam safety management is a long-term and continuous process that has to be continuously improved [2] [3] [4]. in this respect, procedures and processes of dam safety management must be improved in all aspects, both in terms of measuring equipment and in the management and use of data in the procedures for determining the safety of facilities. a modern system for dam safety management should be established, so that it primarily provides real-time operational monitoring of dam safety and enables operational conclusions on the status of dam safety practically on a daily basis. the whole concept of technical monitoring, with a posteriori reasoning after a few months, or even more than a year, loses much of its meaning and importance (the past practice was based on the preparation of periodic reports on the behavior of the dam). the modern concept of dam safety management should be based on a physically based and software-supported technical system [5] [6]. the physical foundation of this concept relates to the provision of data of importance to the safety of the dam and the accurate measurement of relevant physical quantities, which are to be tracked on the dam with installed equipment for technical monitoring. today's level of information and telecommunication infrastructure enables implementation of advanced systems for measuring, acquiring and archiving data.
these systems should be able to automatically collect monitoring data, to perform data validation and to archive the data securely, so as to provide users with data in a unified and efficient manner. with long-term monitoring of instrument operation, a database of measurements obtained from reliable instruments could be formed. implementation and use of iot on dams enables creation of databases of reliable instruments, which can give a more precise evaluation of dam safety. the internet of things (iot) is a network of physical objects in which electronics are incorporated, as well as software and sensors that allow users to obtain timely and accurate data through services for data exchange between manufacturers, users or other connected devices [7]. reliable data could enable users to react in the right way at the right time in case of critical situations or natural disasters, and in some cases to predict events. the aim of the paper is to describe possibilities of the internet of things application within a specific system for dam safety management. the idea is to improve the system of data collection with the implementation of cloud and wsn. all data processing would be moved to the cloud to free up computer resources. wsn would provide more reliable data.

2. literature review

2.1. dam safety management

observing the safety of dams is one of the important measures to ensure the safety of the dam [8]. this is an important and indispensable activity in the work and management of the dam. computer software plays a vital role in monitoring the safety of dams. many dam owners have developed information systems for the supervision of dam safety management to facilitate management and analysis of data. fujian electric power company in east china has 27 dams of different types: concrete, earth, arch and embankment dams. all these dams are located in remote rural areas, making it difficult to manage security information for all dams.
it is therefore important to develop an information system for remote control of the monitoring system, to collect and to transfer dam safety monitoring data so that all this information can be processed, analyzed and evaluated in order to reach an effective decision on the status of dam safety. such a remote information system was successfully developed jointly by all the participants in the business. it was applied to a group of dams of fujian electric power company, where the staff can use the system for analysis and evaluation of observation data. recently, an increase in damage to and failures of dams has been observed, due to aging, earthquakes and unusual changes in climate [9]. for these reasons, the safety of dams is gaining in importance every day in terms of disaster management at the national level. there are numerous organizations in the world that are responsible for dam safety, and some of them are: the international commission on large dams (icold), committee on dam safety and dam security (codss), association of state dam safety officials (asdso), the interagency committee on dam safety (icods), the national dam safety review board (ndsrb) and dam safety interest group (dsig). kwater (korea water resource corporation), which currently runs and manages 30 large dams, developed a system for dam safety (kdsms). this system is used for consistent and efficient management of dam safety. kdsms consists of data for a dam and reservoir, a hydrological information system, a management system for the area of control and data, a system of instruments and observations including the monitoring of earthquakes, a system for improving research and security, and the information system of the corporation. for effective control of the dam life cycle, it is very important to implement real-time diagnosis and a reasonable estimate of dam safety based on prototype observation [10].
the development of iewsds (intelligent early-warning systems of dam safety) is an important approach for the realization of this goal. huai-zhi su et al. observed the dam as a vital and intelligent system and constructed a bionic model of dam safety, which consists of a system of observations (nerve), central processing units (big brain), and tools for decision-making (the body). with the above-described model and system engineering, the authors have designed iewsds. an intelligent reasoning machine is the central processing unit of iewsds; it performs data analysis and applies the algorithms for diagnosis and assessment of the safety of dams. because of the persistent non-linear and dynamic characteristics, the system has adopted a combined model based on a wavelet network to approximate and predict the behavior of the dam. the security status of the dam changes dynamically, requiring qualitative and quantitative analysis of the change in behavior. huai-zhi su et al. propose an expanded method of assessment in the paper [10]. the application shows that the bionic model is feasible and suggests how the key technologies operate. such systems can provide technical support to improve dam safety management, prolonging the life of dams and avoiding accidents. disposal of tailings is of great importance for mining, because the processing of ores produces a large amount of tailings [11]. in the past few years there have been catastrophic accidents at tailings dams and tailings mines, which have caused enormous damage and great loss of human life. to improve the security of tailings dams, the tailings dam monitoring and pre-alarm system (tdmpas) was introduced. it is based on the use of iot and the cloud, with the ability to monitor the saturation line, the water level and the deformation of the dam in real time.
tdmpas helped engineers in the mines to monitor the dam 24/7 and automatically receive pre-alarm information from remote locations in any weather conditions. tdmpas was applied at several mines and showed that its application in monitoring the physical condition of the tailings dam was justified.

2.2. the application of iot

numerous works on iot application in monitoring systems have been published. for instance, the authors of [12] deal with a real-time localization system based on zigbee technology, intended to provide prompt support for safe management of dam construction sites. the system is based on tracking technology using wireless sensors and a set of servers that run software for processing the collected data, visually monitoring the condition of the site in real time and communicating remotely with other systems such as erp and crm. the low-power tracking technology is zigbee-based network hardware combined with fingerprinting software. the proposed real-time system for tracking employees was successfully implemented at the xiluodu arch dam construction site. implementation and development of the internet of things (iot) is closely connected with the construction of smart grids [13]. generally, using wireless communication and observation technologies, all electrical devices can be connected in iot, so that the smart grid becomes an interactive real-time electricity network. qiaoming zou et al. summarize the current state of that area and analyze the current structure and characteristics, as well as the key technologies that enable the implementation of the iot. the authors present a concrete analysis and discussion of the implementation of iot in asset management and in the automatic meter reading system of the smart grid, and give conclusions about the prospects of iot application in smart grids.
the operating state of tailings ponds, which are an important production area in the mine, directly affects the safety of people and property, as well as production at the mine [14]. by building a system for the security surveillance of tailings ponds using gis technology, we can not only manage the data and information on tailings scientifically and effectively, but also take full advantage of the computer's storage of massive data. the interactive operation of gis spatial query and analysis facilitates accurate and convenient search, management, alteration and statistics of data. by observing the height of the seepage line of the dam body, the water level in the pond, the dry beach index, and the deformation and deviation of the dam body, we can promptly obtain information such as the fluctuation of the water level, which is important for timely forecasting of the stability of the dam body, thus achieving safe management of tailings ponds, as well as early warning of danger. lately, much attention has been paid to climate change and to control and management of the environment, so the integrated information system (iis) is gaining in importance. the paper [15] presents a new iis that combines iot, cloud computing, geo-informatics (remote sensing rs, geographic information system gis, global positioning system gps) and e-science for monitoring and management of the living environment, with a case study of regional climate change and environmental impact. in order to collect data and other information in the perception layer, multiple sensors and web services have been utilized. both private and public networks were used to access and transport mass data and other information in the network layer. the result of this case study shows that there is a visible trend of increasing air temperature in xinjiang in the past 55 years and an apparently growing trend in rainfall since the early 1980s [15].
besides the correlation between environmental indicators and meteorological elements, the availability of water resources is a decisive factor in the terrestrial ecosystem in the area. the study shows that the iis greatly contributed to the research, not only in terms of data collection using iot, but also in the use of web services and applications that are based on the cloud platform and e-science, and that effective evaluation and monitoring can still be improved.

3. management and monitoring of large dam safety

in [16] and [17], the authors describe the current state of dams in the republic of serbia. over time, the sensors cease to operate or provide inaccurate values, so it is necessary to replace them or install new modern sensors. although the system of maintenance of dams in serbia is not up-to-date and fully equipped on all dams, the dams have not suffered serious disasters or major problems, primarily due to good design and high-quality work during their construction. however, despite the fact that so far there has not been any major damage to the individual structures that could have jeopardized their security and stability or reduced their functionality, we must keep in mind that, especially with aging dams, various problems can be expected to emerge. this has already been indicated by certain peculiarities, which will be described later. most dams have technical monitoring systems that are essential for monitoring and controlling the state of the facilities. these systems generally date from the time the facilities were built, and in the meantime have been neither significantly renewed nor further developed. often, those systems are technologically outdated, so failure of old instruments and missing data needed for monitoring dam safety are not uncommon.
the monitoring system of the dam thus becomes incomplete with respect to the type and frequency of monitoring. in the last few years, the reconstruction of monitoring systems (djerdap 1 [16], gruza [17]) has started. significant activities on these issues can be expected in the forthcoming period. it is necessary to establish a modern, functional and optimized system for technical monitoring of most of the remaining dams, in the form of an automatic telemetry acquisition system, which should allow continuous automatic measurement and recording of measurement data at a given time interval. in the future, the growing fund of collected data will create conditions for a more detailed analysis of the condition and behavior of the facilities during operation, which is an integral part of the concept of dam safety management. this would enable a precise definition of the behavior trend of the dam and should provide an opportunity for early detection of possible anomalies in the condition of the dam. this could be an example of a dam for which good initial conditions exist for the development and application of a modern dam safety control system. on many dams in serbia, the state of the monitoring system can be assessed as partially satisfactory. this means that, based on all available observation results, it is possible to assess the condition of the facilities, but it is necessary to take steps to improve the monitoring situation. the implementation of a new, up-to-date monitoring system can provide a more accurate assessment of the state of the system. in recent years, steps have been taken to improve the system of technical surveillance by implementing advanced information technologies and a software system for managing the security of the dam. because the first system is a prerequisite for the latter, phased development in the area of dam safety in serbia should be expected.
such a complete system has been applied on the rockfill dam prvonek, near vranje, while the realization of systems for the high concrete gravity dams "djerdap 1" and "djerdap 2" is in progress. an advanced system for technical measurement usually provides the following mechanisms: automatic acquisition, validation, archiving and access to all relevant data obtained in the system of technical surveillance. the core of this system is an information system for technical measurement, whose purpose is to provide technical support in the collection, management and processing of measurement data. the aim is to merge diverse data from the entire system of technical monitoring in one place, with a reliability score, and to make access to all data simple, interactive and fast. establishing a system for dam safety management implies the existence of an advanced system of technical monitoring, and thus of the information system of technical surveillance. relying on an advanced system of technical surveillance as a source of reliable data, it is possible to develop a set of statistical and physics-based mathematical models, as well as the accompanying mathematical apparatus for monitoring the state and analysis of dam safety. the established system of dam safety management is used for:
• tracking and monitoring the behavior, which consists of continuous monitoring, measurement and determination of the compliance of measured values with their expected values,
• checking of the dam safety, which may be initial, periodic and extraordinary, and refers to determining the condition of the facilities and the degree of their safety.
the activity of monitoring and tracking behavior relies on established statistical models based on measured inputs that can provide the expected value of a variable.
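a minimal python sketch of this monitoring step is given below. it is illustrative only: the single-regressor least-squares model, the tolerance value and the names `fit_line` and `check_measurement` are assumptions made for the example and are not taken from the system described in this paper.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class CheckResult:
    measured: float
    expected: float
    deviation: float
    alarm: bool


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares fit y = slope * x + intercept (one input only)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def check_measurement(measured: float, expected: float,
                      tolerance: float) -> CheckResult:
    """Raise the alarm flag when the deviation exceeds the permissible limit."""
    deviation = measured - expected
    return CheckResult(measured, expected, deviation, abs(deviation) > tolerance)


# Hypothetical history: reservoir level (m) versus crest displacement (mm).
levels = [100.0, 102.0, 104.0, 106.0]
displacements = [1.0, 1.2, 1.4, 1.6]
slope, intercept = fit_line(levels, displacements)

# Expected displacement for today's level, compared with today's reading.
expected_today = slope * 105.0 + intercept
result = check_measurement(measured=1.52, expected=expected_today, tolerance=0.1)
```

in a real system the model would use several inputs and the tolerance would be derived from the statistics of the model residuals; here both are fixed by hand to keep the example short.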
if the measured values deviate only within the expected permissible limits, it can be concluded that the system has undergone no major changes. in a modern automated system, this process runs daily and raises an alarm when the measurements indicate that the facility does not behave as expected. this alarm is a signal that a special safety check should be performed. checking the safety of dams is carried out to determine the condition of the facility and its degree of safety, by checking the behavior of the facility in a series of scenarios, i.e. situations that are relevant from the standpoint of dam safety. this check is done periodically, after the expiry of a defined period, or extraordinarily, when the system of technical surveillance and the statistical models have shown that the facility is behaving differently than expected. in the analysis of the state of complex real objects it is not possible to define a priori and completely the inhomogeneity and the actual characteristics of the material; on the other hand, a large number of different indicators of the state of the facility are measured. to determine the current state of all parameters, it is therefore necessary to establish mechanisms for the assimilation of real measurements. in practice, based on the measurements of relevant physical quantities, the physical parameters of the system (e.g. elastic moduli, filtration coefficient, etc.) are calibrated, so that the calculated quantities match the measured ones more closely. in this way, the zone where changes have occurred can be identified. only on the updated model is it possible to carry out safety analysis, and based on the analyses it can be decided which measures must be undertaken to improve the safety of the dam. dam safety management is reflected in the use of systems to support dam safety management, and it continues through the entire life cycle of the dam.

3.1. software system to support the dam safety management

the software system to support the dam safety management, shown in [18], was realized on the principles of service-oriented architecture (soa), which enables not only the use of data in real time, but also expandability and interconnection with other information systems. to create this system, commercially available technologies such as sql server databases, the .net framework, ado.net for database connections, and web services were used. the system architecture is shown in the following figure.

fig. 1 the structure of the software system for managing dam safety (adopted from [19])

the software system consists of the following components:
• interface with the system for technical monitoring
• a number of modules for statistical analysis
• numeric module for the simulation of surface leakage
• numeric module for stress-strain analysis
• numeric module for data assimilation
applications that are an integral part of the solution allow users to see current measurements (measurements in real time) as well as the estimation of the state of the dam at the time. for details see [19].

4. the acquisition module for communication with sensors in the monitoring network

dams have a lot of different instruments, such as rain gauges, water level gauges, flow meters, precipitation meters, etc. in order to improve the observation of dams it is necessary to bring these instruments into a single network and allow them to communicate with each other. due to the large number of different instruments, it is essential to enable communication between devices. this can be achieved with the help of sensorml and wireless sensor networks (wsns).

4.1. sensor model language

the goal is to make all types of devices discoverable and accessible using standard web services and schemas [20].
a standard xml encoding scheme can be used for metadata describing sensors, sensor platforms, sensor tasking interfaces, and sensor-derived data, provided that the connections can be layered over web and internet protocols. a sensor can enable direct communication by publishing xml descriptions of its control interface, which makes it possible to receive real-time or stored monitoring data, determine the sensor's location, identify the characteristics of its monitoring capabilities, and even request specific monitoring tasks. sensor web enablement (swe) standards are open standards built on open and universally accepted standards for the internet and web and for spatial location, and they are foundational standards for communicating with sensors, actuators and processors whose location matters [21]. they are a key enabler for the internet of things. the sensor model language (sensorml) 2.0 provides a standard encoding and supports the internet of things (iot) and web of things (wot) by providing the ability to describe a sensor (or other online processing component) and to provide a link to the real-time values coming from this component [20]. sensorml is a main component that provides the sensor information necessary for discovery, processing, and geo-registration of sensor monitoring. an example on the web page http://www.sensorml.com/sensorml-2.0/examples/iot simple.html describes a sensor with a simple data stream consisting of temperature. it is a combination of a simple sensor and iot. the data themselves can be accessed through the url [22]. accessing this url would return either the latest value(s) or open up an html stream of real-time values. the proposal of the authors of this paper is to use a web service that will access the sensor's data via the above-mentioned url. in addition to obtaining real data, the web service also transmits the real data and stores them in the central database.
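the proposed role of the web service (read the latest value from the sensor url, then store it centrally) can be sketched in python as follows. this is a sketch under stated assumptions, not the authors' implementation: the response format `timestamp,value`, the table layout, the function names and the example url are hypothetical, and sqlite stands in for the central database only to keep the example self-contained.

```python
import sqlite3
import urllib.request
from typing import Callable, Tuple


def fetch_text(url: str, timeout: float = 10.0) -> str:
    """Download the body of the sensor URL as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_reading(text: str) -> Tuple[str, float]:
    """Parse one 'timestamp,value' line (assumed response format)."""
    timestamp, value = text.strip().split(",")
    return timestamp, float(value)


def store_latest(conn: sqlite3.Connection, sensor_id: str, url: str,
                 fetch: Callable[[str], str] = fetch_text) -> Tuple[str, float]:
    """Fetch the latest value of one sensor and store it in the database."""
    timestamp, value = parse_reading(fetch(url))
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings "
        "(sensor_id TEXT, ts TEXT, value REAL)"
    )
    conn.execute(
        "INSERT INTO readings (sensor_id, ts, value) VALUES (?, ?, ?)",
        (sensor_id, timestamp, value),
    )
    conn.commit()
    return timestamp, value


# Usage with a stubbed fetch function, so that no network access is needed.
conn = sqlite3.connect(":memory:")
store_latest(
    conn,
    sensor_id="temperature-01",            # hypothetical sensor identifier
    url="http://example.invalid/latest",   # hypothetical sensor URL
    fetch=lambda _url: "2015-07-10T12:00:00Z,21.5\n",
)
rows = conn.execute("SELECT sensor_id, ts, value FROM readings").fetchall()
```

passing the fetch function as a parameter keeps the storage logic independent of the transport, so the same routine could serve any sensor type that exposes its values through a url.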
the end user calls the web service via the software that is described in the previous section. the web service can be used for all types of sensors.

4.2. wireless sensor network

the constant evolution of technologies, low-cost devices with embedded wireless transmitters, and low-power but powerful chipsets have led to the massive use and development of wireless sensor networks (wsns). a wsn can scale from tens to hundreds of nodes and seamlessly integrate with existing wired measurement and control systems [15]. the network cluster architecture, which takes advantage of multi-hop and clustering, is adopted to lower the energy consumption. a wireless sensor network consists of a number of smart nodes, gateways or sink nodes and a computer management center [23]. sensor data are shared among smart nodes and sent to a distributed or centralized system for analytics, which can be in the cloud or local [24]. wdsn (wsn applied on dams) is a self-organized wireless network with dynamic topology structures, which consists of sensor nodes and gateway nodes. the sensor nodes collect dam data about water level, shift, stress and leakage, temperature, rainfall, seepage and displacement in the dam sections, which are transferred to the database server through the gateway nodes. the sensor used in wdsn is different from the common one. it is an intelligent sensor which can not only perceive the variation of the tested physical value and output the corresponding change information, but also communicate with other sensors. the intelligent sensor has several parts, such as sensitive components, embedded processors, storage and power supplies. these smart sensors in the wdsn network are very important for measuring the reliability of the dam because at any moment it is possible to get information about the functionality of the device. the wdsn structure is shown in fig. 2.

fig. 2 wireless sensor network (wsn)

the whole network is divided into several clusters, each of which is a monitoring area. the wireless sensor nodes in each cluster can communicate with each other and transmit the data to the gateway through multiple hops. the gateways can also communicate with each other and transmit the data to the sink.

4.3. communication services

for automated measurements that are not part of an information system able to provide data to users outside the system, it is necessary to set up special services for communication with measuring systems. due to the specific requirements for the reliability of the measurement system, direct access to the data is not recommended. these services collect data locally and forward them for processing and validation. in the structure of the communication service, the module for data collection is directly connected to the measuring systems and has a central role [25]. the module collects data from various sources, translates them into a standard format and passes them to a service for processing and validation of data. depending on the number and types of measuring systems with which it needs to communicate, this module works with software components for acquisition, which are divided into: components for communication with passive sensors, components for communication with active sensors, components for communication with a passive data logger, components for communication with an active data logger, and components for the information contained in files. each of these components must implement the appropriate interface of the module for data collection. the number of components of one type is limited only by computing resources, while the number of component types is determined by the configuration of the measuring systems.
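the component structure described above can be sketched in python as follows. this is a hedged illustration rather than the interface used in [25]: the class names, the `read` method, the `timestamp,value` file format and the fields of the standard record are all assumptions introduced for the example, and only two of the five component types are shown.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class Measurement:
    """Standard record handed over to processing and validation (assumed layout)."""
    sensor_id: str
    timestamp: str
    value: float


class AcquisitionComponent(ABC):
    """Interface that every acquisition component has to implement."""

    @abstractmethod
    def read(self) -> Iterable[Measurement]:
        """Return the measurements currently available from the source."""


class FileComponent(AcquisitionComponent):
    """Component for information contained in files ('timestamp,value' lines)."""

    def __init__(self, sensor_id: str, lines: Iterable[str]) -> None:
        self.sensor_id = sensor_id
        self.lines = list(lines)

    def read(self) -> Iterable[Measurement]:
        for line in self.lines:
            timestamp, raw = line.strip().split(",")
            yield Measurement(self.sensor_id, timestamp, float(raw))


class PassiveSensorComponent(AcquisitionComponent):
    """Component for a passive sensor that is polled through a callable."""

    def __init__(self, sensor_id: str,
                 poll: Callable[[], Tuple[str, float]]) -> None:
        self.sensor_id = sensor_id
        self.poll = poll

    def read(self) -> Iterable[Measurement]:
        timestamp, value = self.poll()
        return [Measurement(self.sensor_id, timestamp, float(value))]


class DataCollectionModule:
    """Collects data from all registered components in one standard format."""

    def __init__(self) -> None:
        self.components: List[AcquisitionComponent] = []

    def register(self, component: AcquisitionComponent) -> None:
        self.components.append(component)

    def collect(self) -> List[Measurement]:
        collected: List[Measurement] = []
        for component in self.components:
            collected.extend(component.read())
        return collected


# Usage: only the component types present on a given dam are registered.
module = DataCollectionModule()
module.register(FileComponent("piezometer-01", ["2015-07-10T12:00,3.5",
                                                "2015-07-10T13:00,3.6"]))
module.register(PassiveSensorComponent("thermometer-01",
                                       lambda: ("2015-07-10T13:00", 21)))
measurements = module.collect()
```

because the collection module depends only on the common interface, a new component type can be added when the configuration of the measuring systems is extended, without changing the module itself.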
because the set of component types follows from the configuration, a concrete implementation of these services on a given structure does not have to contain all the component types, and components can be added if the configuration is extended.

fig. 3 communication services with measurement systems (adopted from [25])

5. detecting sensor failure and continuing further work

over the life cycle of the dam instruments, the risk of failure of individual sensors naturally increases, so the safety assessment of the dam has to be made without taking into account the measurements from failed sensors. consequently, in order to implement the iot in monitoring and dam safety management, it is necessary to implement an adaptive algorithm for detecting sensor failure. the algorithm should signal in time which measurements are missing, i.e. without which sensor the decision on the safety of the facility has been made. the algorithm for failure detection of sensors, suitable for use in iot, represents the connection between the adaptive system for modeling the behavior of the dam and the acquisition module for communication with sensors in the monitoring network. there are several different approaches to modeling the behavior of the dam. the earliest models were based on the application of statistical [26] and numerical [27] methods. the development of artificial intelligence has enabled the application of new techniques such as artificial neural networks [28], genetic algorithms [29] and adaptive neuro-fuzzy systems [30]. adaptive models that provide results in real time are the most suitable for application in the internet of things. one such model has been described in [18]. it is a hybrid system that combines statistical models and genetic algorithms, so it can model the expected behavior of the dam. the basis of this system is the linear regression model, which is sensitive to changes in the set of input parameters.
for this reason, an adaptive part based on genetic algorithms is added to the system. in accordance with the theory of genetic algorithms, linear regression modeling is treated as an optimization problem, where each regression model represents one entity within the population. based on the available measurements, the regressor generator creates a corresponding set of functions that can be applied. this means that there is always an alternative to the main model in case some information is not available, so that dam safety monitoring is not compromised. at the same time, in case of missing data, the module for communication with sensors makes it possible to identify the sensor in the network to which the missing information belongs, and the system should issue a timely alert about a partial malfunction of the sensor. when the entire set of data from a sensor is missing (all measurements that the sensor performs), the system announces a complete malfunction of the sensor. the results of the regression models are the parameters on the basis of which the current state of the dam is estimated and an alarm is raised in case the parameters deviate from the expected values. together with this information, it is also important to report the available measurements in the system and the state of the sensors because, as noted earlier, the regression model is formed on the measurements available in the system. incomplete measurements could jeopardize the credibility of the results obtained from the regression model. for this reason, the condition of the sensors is an important factor in making a correct decision about the real state of the dam. further development will be directed to the use of collected data in advanced numerical models (fem etc.) and the implementation of cloud computing.

5.1. fem

the finite element method (fem) is used for modeling the stress-deformation and filtration phenomena on the dam.
fem can form a physical model of the structure together with the surrounding rock mass. to make this model fit the real dam, it is necessary to reproduce a particular phenomenon that occurred at the dam during operation. based on the results of technical surveillance, the material parameters are calibrated, and fem then yields a realistic model of the dam, which serves to further monitor the behavior of the structure in order to anticipate undesirable situations during further operation [31] [32]. an example of an arch dam model is shown in fig. 4.
fig. 4 fem arch dam
to carry out safety analysis on a model of the present state, a numerical module for assimilation of measured data should be developed. on the basis of the data obtained from the information system for technical monitoring, this module should assimilate the measurements, i.e. determine updated values of the fem model parameters. the core of the module consists of the optimization algorithms required for the assimilation of measurements and automated communication with numerical modules. the up-to-date parameters of the individual material models that form the fem model describe the real state of the structure.
5.2. big data and cloud in remote sensing
the internet of things (iot) is a concept that includes all the objects around us as part of the internet. the coverage of the iot is very large and includes a variety of smart devices such as smart phones, digital cameras, smart rain gauges, outside temperature sensors and a variety of other types of sensors. when all these devices are interconnected, they provide much more intelligent processes and services that can be used in various areas. such a large number of devices and sensors on dams connected to the internet provides a multitude of services and produces a large amount of data (big data).
cloud computing is a model for on-demand access to a pool of configurable resources (budget, networks, servers, storage, applications, services, software, etc.), which can easily provide such infrastructure, applications and software. cloud-based platforms help us to connect to the things that surround us, so that they can be accessed from anywhere at any time. the cloud acts as a front-end for accessing the iot. applications which interact with devices such as sensors have special requirements: massive storage to record big data, substantial computing power for real-time data processing, and a high-speed internet connection to allow high data throughput [33].
6. proof of concept
the main goal of the practical work is dam safety. computers with limited resources need to be less burdened, i.e. the execution of operations should be relocated to the server. furthermore, it is necessary to increase the level of reliability of the dam safety system. this new system would be implemented on the prvonek dam, which is one of the most recently built dams in serbia and has modern sensors. fig. 5 describes the system implemented on the prvonek dam. the figure shows the data flow from the measuring instrument to the end-user. the measured data are temporarily stored in a data logger. every data logger has its own software for downloading data, which is installed on the acquisition server. the downloaded data are in csv (comma-separated values) format. the acquisition server sends the data to the central server. end-users use specific software for data analysis. the software installed on the end-user's computer uses the resources of that computer and not of the server. if the operations are complex, they may require a lot of computer resources.
fig. 5 current data flow on dam prvonek
the next figure (fig. 6) shows further project improvement.
all data from the acquisition servers are sent to the central server in the cloud. all data transformation and processing are performed in the cloud. this constitutes an etl (extract, transform and load) process. all operations use the server's resources. the end-user's computer works only with data prepared for reports and has much more free resources for other operations. all data are available to end-users 24/7, from any place at any time.
fig. 6 model for cloud and big data
the new system can be applied to all instruments: rain gauges, water level gauges, flow meters, precipitation meters, etc. it is useful for all types of dams. implementing the wsn architecture from fig. 2 will make the system of sensors more reliable and the data more accurate. the nodes within the wsn communicate with each other and send data about a malfunctioning sensor in real time through the gateway to the sink node, which forwards the data to the cloud server. the software for mathematical calculations generates a statistical curve of dam stability from the received data. this statistical curve has to stay within specific limits. the curve can be generated from the data of different instruments. the most frequently used instruments are the piezometer, coordinometer, clinometer and thermometer. the piezometer measures the level of ground and underground flows. the coordinometer measures dam movements. the clinometer measures the angle of movement, while the thermometer measures temperature. if an instrument malfunctions, a new formula that excludes that instrument is automatically generated by specific algorithms and provides approximately the same curve as if all instruments were in perfect working order. the end user launches the software for generating the curve. all received data are stored in a database on the cloud server.
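the fallback behaviour described above (report partial or complete sensor malfunction, then switch to a formula that excludes the failed instrument) can be sketched as follows. this is only a minimal illustration under stated assumptions: the sensor names, the helpers `sensor_status` and `pick_model`, and the fixed, ordered list of candidate instrument sets are hypothetical. the system in [18] generates its regression models with genetic algorithms, which is replaced here by a simple lookup.

```python
from typing import Dict, List, Optional, Sequence


def sensor_status(readings: Dict[str, Optional[float]]) -> str:
    """Classify one sensor from its latest readings (None marks a missing value)."""
    missing = [name for name, value in readings.items() if value is None]
    if not readings or len(missing) == len(readings):
        return "complete malfunction"   # no measurement arrived at all
    if missing:
        return "partial malfunction"    # some, but not all, measurements missing
    return "ok"


def pick_model(candidates: Sequence[Sequence[str]],
               status: Dict[str, str]) -> Optional[List[str]]:
    """Return the first candidate instrument set that uses only healthy sensors."""
    for instruments in candidates:
        if all(status.get(name) == "ok" for name in instruments):
            return list(instruments)
    return None  # no usable model: the safety assessment cannot be produced


# Hypothetical snapshot of the monitoring network (all names and values invented).
network = {
    "piezometer_1": {"level": 12.4},
    "clinometer_1": {"angle_x": 0.02, "angle_y": None},
    "thermometer_1": {"temperature": None},
}
status = {name: sensor_status(readings) for name, readings in network.items()}

# Candidate models, ordered from the main model to progressively reduced ones.
candidates = [
    ["piezometer_1", "clinometer_1", "thermometer_1"],
    ["piezometer_1", "clinometer_1"],
    ["piezometer_1"],
]
model = pick_model(candidates, status)                              # ['piezometer_1']
failed = sorted(name for name, s in status.items() if s != "ok")    # sensors to report
```

in this sketch the list `failed` is what would be reported alongside the curve, so that the end user knows without which sensors the assessment was made.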
the new system provides a database with more reliable data, which enables better analysis and reporting.
7. conclusion
in this paper the authors give an example of a possible application of the latest technologies, such as the internet of things, sensorml and wireless sensor networks, together with software for dam safety management. the combination of these technologies and software improves the functionality of dams. sensor technology, computer technology and network technology are advancing together while the demand grows for ways to connect information systems with the real world. linking diverse technologies in this fertile market environment, integrators are offering new solutions for plant security, industrial controls, meteorology, geophysical survey, flood monitoring, risk assessment, tracking, environmental monitoring, defense, logistics and many other applications [34]. the internet of things, as a currently trending technology, allows sensors to become intelligent by connecting them to the internet. this allows sensors to communicate with each other. the application of the iot in modern business significantly improves the operations of companies. the application of the iot on dams would provide more efficient recording of failed sensors, which would significantly reduce the probability of damage occurring. with the collected data about failed sensors, it is possible to build a database of instrument reliability, which directly reflects the reliability of dams. the combination of wsn, big data and cloud computing with the iot would greatly improve the operation of dams. all of these technologies produce a lot of data, which requires massive data storage. the cloud, a technology that is gaining momentum just like the iot, could allow storage of large amounts of data on the web. with cloud computing, end users could access the data anytime and anywhere.
all data processing would be done in the cloud, which would make the data collection system considerably faster and more reliable. using the latest technologies such as big data, cloud computing and the iot will improve the operation of dams in serbia and significantly reduce the chances of failure. serbia has good quality dams, so it is only necessary to start implementing new technologies in order to prevent potential failures. the implementation of the system for managing and monitoring dam safety, together with the implementation of new technology, reduces the risk of a major failure of the dam.
acknowledgement: the authors would like to thank the ministry of education, science and technological development, republic of serbia, for financial support under project number 174031.
references
[1] "association of state dam safety officials," april 2012. [online]. available: http://www.damsafety.org/media/documents/downloadabledocuments/livingwithdams_asdso2012.pdf.
[2] david s. bowles, loren r. anderson and terry f. glover, "the practice of dam safety risk assessment and management: its roots, its branches, and its fruit," 1998.
[3] david s. bowles, loren r. anderson, terry f. glover and sanjay s. chauhan, "dam safety decision-making: combining engineering assessments with risk information," 2003.
[4] charles r. farrar and keith worden, "an introduction to structural health monitoring," the royal society, 2007.
[5] shen zhen-zhong, chen yun-ping, wang cheng, li tao-fan and li ze-yuan, "development of real-time monitoring and early warning system of dam safety," vol. 3, 2010.
[6] jesung jeon, jongwook lee, donghoon shin and hangyu park, "development of dam safety management system," advances in engineering software, vol. 40, no. 8, p. 554–563, 2009.
[7] i. bojanova, "defining the internet of things," computing now, 16 march 2015. [online].
available: http://www.computer.org/web/sensing-iot/content?g=53926943&type=article&urltitle=defining-theinternet-of-things. [8] f. bao t., s. gu c. and y. zhang, "remote safety monitoring management information system for dam group," in 2nd international conference on structural health monitoring of intelligent infrastructure, shenzhen, 2006. [9] jeon jesung, lee jongwook, shin donghoon and hangyu park, "development of dam safety management system," advances in engineering software, vol. 40, no. 8, pp. 554-563, 2009. [10] h. z. su and z. p. wen, "intelligent early-warning system of dam safety," proceedings of 2005 international conference on machine learning and cybernetics, vols 1-9, pp. 1868-1877, 2005. [11] enji sun, xingkai zhang and zhongxue li, "the internet of things (iot) and cloud computing (cc) based tailings dam monitoring and pre-alarm system in mines," safety science, vol. 50, no. 4, pp. 811-815, 2012. [12] peng lin, junfeng guan and qingbin li, "a real-time zigbee-based location system in xiluodu arch dam," civil engineering, architecture and sustainable infrastructure ii, pts 1 and 2, vols. 438-439, pp. 1329-1333, 2013. [13] qiaoming zou, lijun qin and qiyan ma, "the application of the internet of things in the smart grid," materials science and information technology, pts 1-8, vols. 433-440, no. 2012, pp. 3388-3394, 2011. [14] yang yingxin, hou chunhua and han yanxia, "design of safety monitoring system of tailing pond based on gis technology," electronic information and electrical engineering, vol. 19, pp. 408-410, 2012. [15] shifeng fang, li da xu and yunqiang zhu, "an integrated system for regional environmental monitoring and management based on internet of things," ieee transactions on industrial informatics, vol. 10, no. 2, pp. 1596-1605, 2014. 
[16] dragan maksimović, tina savić-tomić and maja pavić, "uticaj inoviranog oskultacionog sistema na pouzdanost osmatranja, upravljanja i održavanja glavnog objekta he “djerdap 1”," in sdvb prvi kongres, bajina bašta, 2008. [17] ljubomir petrović and srdjan djurić, "osavremenjavanje sistema osmatranja na brani gruža," in sdvb prvi kongres, bajina bašta, 2008. [18] b. stojanović, m. milivojević, m. ivanović, n. milivojević and d. divac, "adaptive system for dam behavior modeling based on linear regression and genethic algorithms," advances in engineering software, vol. 65, pp. 182-190, 2013. [19] n. milivojević, n. grujović, d. divac, v. milivojević and r. martać, "information system for dam safety management," icist, vol. 1, pp. 56-60, 2014. [20] mike botts and lance mckee, "sensors online," 1 april 2013. [online]. available: http://www. sensorsmag.com/networking-communications/a-sensor-model-language-moving-sensor-data-internet-967. [21] "the ogc approves sensorml 2.0, advanced standard for internet of things," ogc, february 2014. [online]. available: http://www.opengeospatial.org/node/1971. [22] "internet of things simple sensor (sensorml 2.0 examples)," [online]. available: http://www.sensorml. com/sensorml-2.0/examples/iotsimple.html. [23] xinying miao, jinkui chu, linghan zhang and jing qiao, "development of wireless sensor network for dam monitoring," journal of information & computational science, vol. 6, no. 9, p. 1609–1616, 2012. [24] i. f. akyildiz, w. su, y. sankarasubramaniam and e. cayirci, "wireless sensor networks: a survey, computer networks," no. 38, p. 393–422, 2002. [25] razvoj sistema za podršku optimalnom održavanju visokih brana u srbiji (tr37013), 2011-2015. [26] r. ardito and g. cocchetti, "statistical approach to damage diagnostic of concrete dam by radar monitoring: formulation and pseudo-experimental test," engineering structures, vol. 28, no. 14, p. 2036–2045, 2006. 
[27] yifeng chen, ran hu, wenbo lu, dianqing li and chuangbing zhou, "modeling coupled processes of non-steady seepage flow and non-linear deformation for a concrete-faced rockfill dam," computers & structures, vol. 89, no. 13-14, p. 1333–1351, 2011.
[28] j. mata, "interpretation of concrete dam behaviour with artificial neural network and multiple linear regression models," engineering structures, vol. 33, no. 3, p. 903–910, 2011.
[29] chang xu, dongjie yue and chengfa deng, "hybrid ga/simpls as alternative regression model in dam deformation analysis," engineering applications of artificial intelligence, vol. 25, no. 3, p. 468–475, 2012.
[30] vesna ranković, nenad grujović, dejan divac, nikola milivojević and aleksandar novaković, "modelling of dam behaviour based on neuro-fuzzy identification," engineering structures, vol. 35, p. 107–113, 2012.
[31] d. divac, d. vučković and m. živković, "modeliranje filtracionih i naponsko-deformacionih procesa u interakciji akumulacionog jezera, brane i stenske mase, na primerima brane sv. petka u makedoniji i brane prvonek kod vranja," građevinski kalendar, 2004, pp. 9-57.
[32] m. kojić, r. slavković, m. živković and n. grujović, metod konačnih elemenata i (linearna analiza), kragujevac: mašinski fakultet u kragujevcu, 1998.
[33] prahlada b. b. rao, payal saluja, neetu n. sharma, ankit mittal and shivay veer sharma, "cloud computing for internet of things & sensing based applications," 2012 sixth international conference on sensing technology (icst), pp. 374-380, 2012.
[34] "sensor web enablement (swe)," ogc, [online]. available: http://www.opengeospatial.org/ogc/markets-technologies/swe.
[35] l. nachabe, m. girod-genet and b. el hassan, "unified data model for wireless sensor network," ieee sensors journal, vol. 15, no. 7, pp. 3657-3667, 2015.
[36] a. ghosh and s. k.
das, "coverage and connectivity issues in wireless sensor networks: a survey, pervasive and mobile computing," no. 2, p. 303–334, 2008.
[37] y. sang, h. shen, y. inoguchi, y. tan and n. xiong, "secure data aggregation in wireless sensor networks: a survey," p. 315–320, 2006.
[38] "ogc," [online]. available: http://www.opengeospatial.org/standards/sensorml.
facta universitatis series: electronics and energetics vol. 27, no. 4, december 2014, pp. 649 – 661
a double-differential-input / differential-output fully complementary and self-biased asynchronous cmos comparator
vladimir milovanović and horst zimmermann
institute of electrodynamics, microwave and circuit engineering, faculty of electrical engineering and information technology, vienna university of technology (tu wien), gußhausstraße 27, a-1040 wien, austria
abstract: a novel fully complementary and fully differential asynchronous cmos comparator architecture, which consists of a two-stage preamplifier cascaded with a latch, achieves a sub-100 ps propagation delay for input signal amplitudes of 50 mvpp and higher under a 1.1 v supply and 2.1 mw power consumption. the proposed voltage comparator topology features two differential pairs of inputs (four in total), thus increasing signal-to-noise ratio (snr) and noise immunity through rejection of the coupled noise components, reduced even-order harmonic distortion, and doubled output voltage swing. in addition to that, the comparator is truly self-biased via a negative feedback loop, thereby eliminating the need for a voltage reference and suppressing the influence of process, supply voltage and ambient temperature variations. the described analog comparator prototype occupies 0.001 mm2 in a purely digital 40 nm lp (low power) cmos process technology. all the above mentioned merits make it highly attractive for use as a building block in the implementation of leading-edge system-on-chip (soc) data transceivers and data converters.
keywords: comparator, preamplifier, latch, cmos, fully-differential, pvt variations, noise immunity, self-biasing, data converters, adc, transceivers.
manuscript received august 9, 2014; received in revised form october 9, 2014. doi: 10.2298/fuee1404649m
∗ an earlier version of this manuscript received the best oral paper award at the 29th international conference on microelectronics (miel 2014), belgrade, 12-14 may, 2014. [1]
corresponding author: vladimir milovanović, institute of electrodynamics, microwave and circuit engineering (emce), vienna university of technology (tu wien), gußhausstraße 25-29/e354, a-1040 vienna, österreich (e-mail: vladimir.milovanovic@tuwien.ac.at)
1 introduction
after amplifiers, comparators are perhaps the second most widely used analog electronic component.
analog comparators can be used to determine whether one input value is higher or lower than the other at specific time points (predefined by the clock signal), or to perform the comparison in an asynchronous manner, that is, to detect the time point at which the difference of the two input signals changes its sign. these two comparator types are usually classified as dynamic (clocked) and asynchronous (or open-loop) comparators, respectively. further, the compared signal may be any analog physical (i.e., electrical) quantity, such as current or voltage, but also charge or even time. this paper contributes to the field of the so-called asynchronous (non-clocked) analog voltage comparators. both asynchronous/open-loop [2] and dynamic/synchronous [3] comparator types are in widespread use in switched-mode power supplies as well as in present-day data conversion [4] and/or transmission circuits [5]. after all, a comparator itself is nothing else but a single-bit analog-to-digital converter (adc). often, comparators are the critical design components as, for example, a data converter's bandwidth and maximum (over-)sampling rate directly depend on the comparator's propagation delay. moreover, an analog-to-digital converter's resolution, expressed in terms of signal-to-noise and distortion ratio or effective number of bits, is largely influenced by the comparator's noise figure and its input-referred noise. finally, on the one hand, comparators should be high speed/low noise, while on the other, for use in battery-powered applications, they should consume as little power as possible. the basic idea behind high-speed analog voltage comparators is to combine the best aspects of a preamplifier, which has a negative exponential step response, with a latch, which exhibits a positive exponential rise. the preamplifier is used to build up the input voltage difference to a certain point where the latch takes over and brings the signal to a rail. both clocked and non-clocked comparators can exploit these speed-up principles. a block-level representation of a high-speed asynchronous comparator consisting of a preamplifier-latch cascade is given in fig. 1.
fig. 1. fully differential asynchronous voltage comparator that exploits a preamplifier-latch cascade to achieve fast decision making and thereby high operating speeds.
it is advantageous for high-speed asynchronous voltage comparators to utilize fully differential signaling, as it brings increased noise immunity through rejection of the coupled noise components, reduced even-order harmonic distortion, and doubled output voltage swing. besides using a differential output as in fig. 1, further noise performance benefits can be obtained from the comparator version of fig. 2, which features a preamplifier stage with two pairs of differential inputs (four in total). this article presents a high-speed asynchronous cmos voltage comparator implementation which exploits two differential pairs of inputs and is suitable for incorporation in cutting-edge systems on chip (socs).
fig. 2. fully differential high-speed preamplifier-latch asynchronous voltage comparator that features two pairs of differential inputs (four in total) on the preamplifier.
2 four-input asynchronous comparator topology
transistor-level and block-level schematics of the proposed complementary and fully differential self-biased asynchronous cmos voltage comparator that features two pairs of inputs are shown in fig. 3 and fig. 4, respectively.
the comparator comprises three fully differential self-biased cmos voltage amplifiers that share an identical circuit topology, and a cmos latch. the inputs of two amplifiers (four in total) act at the same time as the comparator inputs, while the biasing nodes and respective outputs of these two amplifiers are connected to each other in parallel, thus constituting the first preamplifier stage. the third amplifier is cascaded to the outputs of the first two, hence effectively forming the preamplifier's second stage. the amplifiers constructing the first preamplifying stage are mutually identical (corresponding transistor sizes of both are matched), but are different from the one serving as the second preamplifying stage (meaning, its transistor sizes are optimized independently). finally, the preamplifier is cascaded with a simple latch whose outputs are at the same time the comparator outputs.
fig. 3. transistor-level schematic of the proposed self-biased asynchronous cmos analog voltage comparator which features two pairs of differential inputs and differential output.
fig. 4. block-level schematic of the proposed self-biased asynchronous analog voltage comparator which features two pairs of differential inputs and differential output of fig. 3.
inputs of each of the three fully differential self-biased inverter-based cmos amplifiers [5, 6] are amplified through the push-pull inverters consisting of transistors $N^{xx}_{iout}$ and $P^{xx}_{iout}$, thus rendering the outputs of that particular amplifier. the cmos inverters at the inputs bring inherent advantages such as very high input impedance and nominally doubled transconductance. the biasing of each stage is accomplished through complementary transistor pairs $N^{xx}_{bias}$ and $P^{xx}_{bias}$, which are controlled by $V_{bias}$ and operate deep within the linear region. this potential is in turn stabilized through the negative feedback loop utilizing $N^{xx}_{ibias}$ and $P^{xx}_{ibias}$. namely, any variation in processing parameters or operating conditions (change of supply voltage or ambient temperature) that shifts $V_{bias}$ from its nominal value results in an instant attenuation of these deviations [7] to an extent proportional to the value of the loop gain. as the biasing transistors operate in the triode region, potentials $V_{down}$ and $V_{up}$ are very close to the negative and the positive supply rail, respectively. in such a configuration, self-biasing does not compromise the output voltage swing, which is nearly equal to the difference between the values of the two supply rails.
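as a rough illustration of the feedback argument above, assume a linearized single-loop model (an assumption made here, not taken from the paper) in which a disturbance that would shift the bias node by $\Delta V^{ol}_{bias}$ without feedback is acted on by a loop gain $T$. both symbols are introduced only for this sketch. standard negative-feedback theory then gives the residual shift

```latex
\Delta V^{cl}_{bias} = \frac{\Delta V^{ol}_{bias}}{1 + T}
```

so, with purely illustrative numbers, a 50 mv open-loop shift and $T = 20$ would leave a residual deviation of about 2.4 mv.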
resistors $R'$ and $R''$ serve to avoid the establishment of low-resistance paths through the $V'_{bias}$ and $V''_{bias}$ nodes, respectively, for high (by absolute value) input voltage differences. placed in the biasing part, the resistors have no impact on comparator performance except that they drastically reduce dissipation while mutually distant potentials are applied as comparator inputs. a problem of the same kind also occurs in the path through the $V'^{+}_{out}$ and $V'^{-}_{out}$ nodes, but it cannot be avoided using the resistor trick; instead, these metal lines must be made thicker in order to sustain higher current values. as already stated, the output of the last preamplifier stage is connected to the input of the latch stage. the latch itself is implemented as the cross-coupled connection of two cmos inverters (composed of transistors $N^{x}_{latch}$ and $P^{x}_{latch}$). the coupling between the preamplifier's output and the latch itself is done through inverters consisting of transistors $N^{x}_{inv}$ and $P^{x}_{inv}$. without transistors $N^{x}_{rail}$ and $P^{x}_{rail}$, the coupling inverters should be large/strong enough to be able to pull the latch out of the positive feedback saturation, but still small/weak enough not to firmly dictate the output voltage (because having a latch in that case is senseless). connecting these four field-effect transistors to the supply rails relaxes the last requirement and consequently increases the design's reliability and robustness. besides being fully complementary, the proposed asynchronous voltage comparator circuit with two pairs of inputs is also perfectly symmetrical with respect to the vertical and the horizontal axis in fig. 3 and fig. 4, respectively. this is the reason why the biasing transistors on each preamplifier stage are drawn separately.
symmetry has beneficial repercussions on the process of laying the circuit out, as one can naturally match paired devices and the propagation delay through separate circuit blocks.
fig. 5. combination of the preamplifier negative exponential step response (dashed line) with the positive exponential initial condition time response of the latch (dash-dotted line). at the optimum point $(t_x, V_x)$, which is at the same time the preamplifier-latch takeover point, the first derivatives of the two curves are the same. this minimizes the preamplifier-latch cascade propagation delay $t_{total} = t_{preamplifier} + t_{latch}$ and makes the combined output signal quicker, which implies fast decision making of the proposed asynchronous comparator.
3 circuit analysis of the comparator architecture
analysis of the proposed comparator topology can be accomplished by analyzing two of its subcomponents, namely the preamplifier and the latch.
3.1 preamplifier
if the voltage drops across the biasing transistors are neglected, that is, if $V_{down}$ and $V_{up}$ are approximately at the supply rails, then the small-signal differential gain of the comparator's preamplifier is just equal to the transfer function of the push-pull inverter and hence it can be written as
$$H_{preamplifier}(s) = \frac{V''^{+}_{out} - V''^{-}_{out}}{\left(V^{+}_{in1} - V^{-}_{in1}\right) - \left(V^{+}_{in2} - V^{-}_{in2}\right)}(s) = \frac{R'_o R''_o \left(s - g'_m/C'_{gd}\right)\left(s - g''_m/C''_{gd}\right)}{R'_o R''_o \zeta s^2 + \left[R'_o\left(C'_{gd} + C''_{gd}\left(1 + g''_m R''_o\right) + C_{i2o1}\right) + R''_o\left(C''_{gd} + C_L\right)\right] s + 1}, \tag{1}$$
where g′m = g ′ mn +g ′ mp and g ′′ m = g ′′ mn +g ′′ mp are the total transconductances of the first and the second preamplifier’s stage inverter, respectively, r′o and r′′o are the total resistances seen at the output of the first and at the output of the preamplifier’s second stage, c′ gd = c′ gdn+c ′ gdp and c ′′ gd = c′′ gdn+c ′′ gdp are the sums of the gate-drain capacitances of the nmos and pmos of the first and the second preamplifier’s stage, respectively. for simplicity reasons, ζ = cl ( c′gd + c ′′ gd + ci2o1 ) + c′′gd ( c′gd + ci2o1 ) is introduced, while ci2o1 is the total capacitance at the output of the first and the input of the second preamplifier stage and cl is the total load capacitance at the output of the preamplifier or at the input of the latch. it may be observed that the transfer function (1) in which s = σ + iω is the complex angular frequency, is of the second order with two real left complex half-plane poles. it also possesses two real high frequency right complex half-plane zeroes at frequencies z1 = g ′ m/c ′ gd and z2 = g ′′ m/c ′′ gd . the step response of the preamplifier can be predicted based on its transfer function. if the effect of the two high frequency zeroes, z1 and z2 is neglected, together with the dominant pole approximation, the system’s step 654 v. milovanović, h. zimmermann fully differential self-biased asynchronous cmos comparator 655 656 v. milovanović and h. zimmermann response may be written as v′′+out (t) − v ′′− out (t) = l −1 {hpreamplifier (s) /s} ≈ (2) ≈ gpreamplifier [( v+in1 − v − in1 ) − ( v+in2 − v − in2 )] [1 − κ exp (−t/τa)] u (t) , where gpreamplifier and τa are the preamplifier low frequency gain and time constant which is inversely proportional to the value of the dominant pole, κ is a constant dependent on coefficients of the polynomial found in the transfer function denominator, while u (t) and l−1 represent the heaviside step function and the inverse laplace transform operator, respectively. 
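the dominant-pole step response (2) can be sketched numerically. the short python example below is only an illustration: the function names and the values chosen for the gain, the input difference and the time constant are hypothetical and are not taken from the measurements in this paper. it evaluates (2) and inverts it to estimate when the preamplifier output reaches a given level, under the assumption that κ > 0.

```python
import math

def preamp_step_response(t, gain, v_in_diff, tau_a, kappa=1.0):
    """Dominant-pole step response of eq. (2); zero before the step at t = 0."""
    if t < 0.0:
        return 0.0
    return gain * v_in_diff * (1.0 - kappa * math.exp(-t / tau_a))

def time_to_reach(v_target, gain, v_in_diff, tau_a, kappa=1.0):
    """Invert eq. (2) to find when the preamplifier output reaches v_target."""
    v_final = gain * v_in_diff
    remaining = 1.0 - v_target / v_final  # fraction of the final value still missing
    if not 0.0 < remaining <= kappa:
        raise ValueError("v_target is outside the range covered by eq. (2)")
    return -tau_a * math.log(remaining / kappa)

if __name__ == "__main__":
    # Hypothetical values, chosen only to exercise the formulas.
    gain, v_in_diff, tau_a = 8.0, 0.05, 20e-12
    t_hit = time_to_reach(0.3, gain, v_in_diff, tau_a)
    print(f"output reaches 0.3 V after {t_hit * 1e12:.1f} ps")
    print(f"check: v(t_hit) = {preamp_step_response(t_hit, gain, v_in_diff, tau_a):.4f} V")
```

since the response only approaches its final value gain · v_in_diff asymptotically, a target at or above that value is rejected; this is the situation in which the preamplifier alone cannot reach the required level.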
3.2 latch

if the initial voltage applied to the latch output nodes (through the preamplifier-latch coupling inverters) at a specified time point $t'$ is $v^{+}_{out}(t') - v^{-}_{out}(t')$, then the response of the linearized latch to this initial condition (for $t \ge t'$ and $\Delta t = t - t'$) is an exponentially increasing [8] function of $\Delta t$ and can be written as

$$v^{+}_{out}(t) - v^{-}_{out}(t) = \exp\left(\Delta t/\tau_l\right)\left[v^{+}_{out}(t') - v^{-}_{out}(t')\right]. \tag{3}$$

the time constant of the cross-coupled cmos inverter latch is approximately $\tau_l \approx c/g_{ml}$, where $c$ is the total capacitance seen at the output of the latch (i.e., of the comparator), while $g_{ml} = g_{mnl} + g_{mpl}$ is the total transconductance of the latch complementary transistor pair. note that this is the typical temporal response of positive-feedback systems that have a single or a dominant real pole in the right half of the complex plane.

4 operating principles of the described comparator

as already stated in the introduction, the basic idea behind the presented comparator is to combine the best aspects of the preamplifier, which is characterized by the negative exponential step response (2), with those of the latch, which has the positive exponential response (3). the preamplifier builds up the voltage to a certain point, where the latch takes over and brings the signal to a rail.

these concepts are illustrated in fig. 5. in this figure, the preamplifier gain times the input voltage alone is not sufficient for the output to reach the rail. nevertheless, the preamplifier achieves an output value high enough to pull the latch out of one saturation state and trigger its positive feedback loop, which drives the comparator to the saturation state at the other supply rail, thus producing a firm logical level (high or low) at the output.

with the total propagation delay through the comparator being the sum of the propagation delays of its cascaded components, namely,

$$t_{total} = t_{preamplifier} + t_{latch}, \tag{4}$$

it is obvious that reducing the time constants of the separate comparator subcircuits ($\tau_a$ and $\tau_l$) is essential to increase its speed of operation. additionally, it can be proven that there exists an optimum preamplifier-latch takeover point $(t_x, v_x)$, located where the first derivatives of the preamplifier and the latch functions are equal. this was somewhat expected, and hence for high-speed applications the comparator should be optimized such that whichever subcomponent function has the larger first derivative is used for the corresponding part of the characteristic. apart from acceleration, another role of the latch block is to align the comparator's complementary output fall-time and rise-time edges.

[fig. 6: two on-chip structures. (a) comparator + output buffers: inputs in1+, in1−, in2+ and in2−, each with 50 Ω termination; the comparator is followed by a chain of inverters as output drivers (ron = 50 Ω), capable of driving the pad capacitance and 50 Ω measurement equipment; outputs out+ and out−. (b) output buffers only (for delay subtraction): inputs in+ and in− with 50 Ω termination; a dummy comparator (actually a shortcut) is followed by a chain of inverters as output drivers (ron = 50 Ω) identical to the ones that come after the comparator; outputs out+ and out−. delay(comparator) = delay(comparator + buffers) − delay(buffers).]

fig. 6. on-chip comparator structure with output buffers and the corresponding dummy comparator structure used for exact extraction of the comparator's propagation delay.
[fig. 7: oscilloscope traces versus time elapsed after the fixed moment in time t [ns] (ticks t + 1 to t + 9). panels: comparator in1+ & in1− [v] (50 mvpp), comparator in2+ & in2− [v] (50 mvpp), comparator out+ & out− [v] (0.55 vpp), buffers only in+ & in− [v] (1.1 vpp), buffers only out+ & out− [v] (0.55 vpp). stimulus: pseudorandom binary sequence 2^31 − 1, frequency f = 3.33 ghz. the difference between the two output traces is marked as the comparator delay tdelay.]

fig. 7. measured inputs and outputs of the on-chip structure containing the asynchronous voltage comparator featuring two pairs of differential inputs with output drivers and the corresponding on-chip dummy comparator structure containing the output drivers alone.

5 on-chip measurement setup for propagation delay

the output of the latch, which is at the same time the comparator output, has rail-to-rail swing and is hence designed to be followed by digital circuitry, which typically has a low input capacitance relative to a pad capacitance. to measure the comparator characteristics in a realistic configuration, a chain of several inverters, which drive the pad capacitance and the 50 Ω measurement equipment, follows each of the comparator outputs, as shown in fig. 6. both transistors in the last inverter are designed to have an on-resistance of ron = 50 Ω to avoid reflection, thus halving the output signal amplitude to vdd/2. for the same reason, all four inputs have 50 Ω on-chip termination to ground.

to enable indirect delay measurement of the comparator, the output drivers are also placed on chip on their own, as explained by fig. 6. special attention is paid so that the metal lines routed to and from the comparator (with the output drivers) and the output drivers alone are identical in every aspect. this enabled the use of identical printed circuit boards, identical coaxial cables and, finally, identical measurement equipment to drive and characterize both on-chip structures. thus, the delay of the comparator is obtained as the difference between the delay of the structure with the comparator plus output buffers and the delay of the dummy structure containing the buffers only. this subtraction eliminates the influence of coaxial cables, printed circuit board microstrip lines, on-chip metal lines, etc., which were identical for both measurements and therefore cancel out.

additionally, the output drivers are optimized for small propagation delay variation, the standard deviation of which is σ(delay) < 5 ps based on one thousand monte-carlo simulations and a sample of ten relative on-chip measurements. also, the comparator and the buffers have separate supply pads (i.e., analog and digital, respectively) to enable power consumption measurement of the comparator alone.

measured inputs and outputs of the on-chip characterization structures depicted in fig. 6, driven by a pseudorandom binary sequence signal with a frequency of 3.33 ghz, are shown in fig. 7 in the form of an oscilloscope screenshot. it can be observed that the structure containing buffers only is always driven with a rail-to-rail signal resembling the comparator outputs. the difference between the two outputs yields the comparator propagation delay.

fig. 8. oscilloscope display showing an eye pattern for the two comparator outputs that are connected to channels 1 and 2. the input pseudorandom sequence's frequency is 3.33 ghz.

[fig. 9: chip photomicrograph, 1.05 mm × 0.77 mm. marked blocks: output drivers, 11.96 × 25.4 µm2; comparator, 39.2 × 25.5 µm2. pads are labelled g, a, d, i and o.]

fig. 9. test chip photomicrograph. abbreviations: (g) ground, (a) analog supply, (d) digital supply, (i) input, (o) output. left: output buffers; right: four-input comparator.

6 measurement results of the proposed comparator

having in mind reasonable power consumption, the described comparator is optimized for speed and is fabricated in a standard 1p8m digital 40 nm low power multi-threshold cmos process technology shrunk to 90% (minimum transistor gate length 36 nm). to optimize latency and power, the exploited technology offers transistors with three different values of threshold voltage. threshold voltages for the low-vt transistor types, which are used in the design to minimize propagation delay, are around vtn/vtp ≈ 0.33 v/−0.28 v, while the nominal supply voltage for the given process is vdd = 1.1 v.

the propagation delay of the comparator with two pairs of inputs, measured in the manner described above, is lower than 100 ps for a 50 mvpp step applied at both of its differential inputs. the total power dissipation of the comparator under these circumstances equals 2.1 mw and is dominated by the preamplifier's static consumption, i.e., the dc current consumption accounts for the major part of the comparator's total power consumption. the measured eye diagram of the comparator at 3.33 ghz, which was the limit of the stimulus equipment, is shown in fig. 8; however, based on the propagation delay measurements, the eye opening should be present up to 10 ghz.

the test chip photomicrograph is given in fig. 9. the proposed four-input comparator design implementation occupies an area of 39.2 × 25.5 µm2.
7 conclusions

the article presents a prototype of a novel fully differential asynchronous comparator topology that features two pairs of inputs and is implemented in 40 nm lp cmos technology. the comparator consists of a preamplifier-latch cascade and is completely self-biased, thus overcoming the need for a reference circuit and reducing the influence of pvt variations. the comparator propagation delay is extracted using a subtractive method that exploits on-chip dummy output driver structures. measurements indicate that, depending on the actual input signal amplitude and common-mode, the comparator can operate at frequencies beyond 10 ghz under a dissipation of 2.1 mw. although both the comparator delay and its power consumption greatly depend on the input signal amplitude and common-mode value, this still places it among the fastest non-clocked comparators published to date. finally, the proposed comparator circuit is well suited for implementation in cutting-edge system-on-chip (soc) data transceivers and data converters.

acknowledgements

the authors would like to express their gratitude to lantiq a and austrian bmvit for their financial support of the fit-it project xplc via ffg.

references

[1] v. milovanović and h. zimmermann, “a two-differential-input / differential-output fully complementary self-biased open-loop analog voltage comparator in 40 nm low power cmos,” in proceedings of the 29th international conference on microelectronics (miel 2014), may 2014, pp. 355–358.
[2] t. sepke et al., “comparator-based switched-capacitor circuits for scaled cmos technologies,” in isscc dig. tech. papers, feb. 2006, pp. 812–821.
[3] d. schinkel et al., “a double-tail latch-type voltage sense amplifier with 18 ps setup+hold time,” in isscc dig. tech. papers, feb. 2007, pp. 314–315.
[4] v. srinivasan et al., “a 20 mw 61 db sndr (60 mhz bw) 1 b 3rd-order continuous-time delta-sigma modulator clocked at 6 ghz in 45 nm cmos,” in isscc dig. tech. papers, feb. 2012, pp. 812–821.
[5] c.-y. yang and s.-i. liu, “a one-wire approach for skew-compensating clock distribution based on bidirectional techniques,” ieee journal of solid-state circuits, vol. 36, no. 2, pp. 266–272, feb. 2001.
[6] m.-c. huang and s.-i. liu, “a fully differential comparator-based switched-capacitor ∆σ modulator,” ieee transactions on circuits and systems ii: express briefs, vol. 56, no. 5, pp. 369–373, may 2009.
[7] m. bazes, “two novel fully complementary self-biased cmos differential amplifiers,” ieee journal of solid-state circuits, vol. 26, no. 2, pp. 165–168, feb. 1991.
[8] b. j. mccarroll et al., “a high-speed cmos comparator for use in an adc,” ieee journal of solid-state circuits, vol. 23, no. 1, pp. 159–165, feb. 1988.
facta universitatis
series: electronics and energetics vol. 36, no 2, june 2023, pp. 189-208
https://doi.org/10.2298/fuee2302189d
© 2023 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper

design and implementation of fractional-order controller in delta domain

sujay kumar dolai1, arindam mondal2, prasanta sarkar3
1department of electrical engineering, dit, kolkata, west bengal, india
2department of electrical engineering, dr. bc roy engineering college, durgapur, west bengal, india
3department of electrical engineering, nitttr kolkata, west bengal, india

abstract. in this work, a fractional-order controller (foc) is designed in the discrete domain using delta operator parameterization. the foc is rationally approximated using continued fraction expansion (cfe) in the delta domain. whenever a continuous-time system is discretized, the choice of sampling time is the most critical parameter for obtaining accurate results. a very high sampling rate cannot be used with the conventional shift operator parameterization, whereas a delta operator parameterized discrete-time system circumvents the problems associated with the shift operator parameterization at a high sampling limit. in this work, a first-order plant with delay is controlled with an foc that is implemented in the discrete delta domain. the plant model is realized using matlab as well as in hardware. the fractional-order controller is tuned in the continuous domain and discretized in the delta domain to obtain the discrete delta foc. the continuous-time fractional-order operator (s±α) is directly discretized in the delta domain to get the overall foc in the discrete domain. the designed controller is implemented using matlab simulink and a dspace board, such that the dspace board acts as the hardware-implemented foc. the step response characteristics of the closed-loop system using the delta domain foc resemble those obtained with the continuous-time controller. this shows that, at a high sampling rate, the continuous-time and discrete-time results coincide rather than forming two separate cases. therefore, the analysis and design of an foc parameterized with the delta operator opens up a new area in the design and implementation of discrete focs, one that unifies continuous-time and discrete-time results. the discrete model performance characteristics are evaluated in software simulation using matlab, and the results are validated through the hardware implementation using dspace.

key words: continued fraction expansion, delta operator, dspace, fractional order controller

received august 01, 2022; revised october 20, 2022; accepted november 04, 2022
corresponding author: sujay kumar dolai, department of electrical engineering, dit, kolkata, west bengal, india, e-mail: dolaisujay@gmail.com

1. introduction

a fractional-order system (fos) is a system having a non-integer order differentiator and integrator. nowadays fos has become a vital research area not only in mathematics but also in system theory and control. according to the literature, most real-world systems are inherently of fractional order [1]–[3]. since the inception of fractional calculus in 1695, mathematicians have extended it and enabled its utilization in control theory [4]. over the last few decades, researchers have paid attention to modeling, analysis, simulation and the solution of differential equations in the fractional-order domain to deliver a clear concept of fos [5]–[7]. control engineers nowadays use fractional-order calculus as the basis of fractional-order controllers (foc).
for controlling a plant, the fractional-order controller has become an essential tool compared with the integer-order controller, and it is evident from the literature that the performance of the fractional-order controller is better than that of the integer-order controller [8]. the electrochemical process [9], dielectric polarization [10], viscoelastic materials [11], chaos electromagnetic fractional poles [12] and signal processing [13] are the primary areas where fractional-order calculus has been rigorously used over the last decade.

in the case of fos, the differentiator/integrator is symbolized by an irrational operator s±μ, where s is a complex quantity known as the laplace transform variable. for μ = ±1 the irrational operator becomes an integer-order operator s±1. the infinite-dimensional irrational operator s±μ is usually converted to a rational function either in the continuous domain (s-domain) or in a discrete domain (z or δ domain). to implement the fractional operators in the discrete domain, the discretization of the operator is of primary concern [14]. the most common discretization method is the tustin operator-based discretization method. a comparative study of the different discretization methods in the z-domain, covering the merits and demerits of each method, is summarized in [14].

for the realization of the fractional-order operator in the discrete domain, the sampling rate during discretization should be at least 6-10 times the system bandwidth, as suggested by shannon. but when the sampling rate is increased beyond a certain extent, the corresponding z-domain transfer function becomes numerically ill-conditioned and thereby fails to provide meaningful insights. digital controller design in the delta domain is better than the corresponding controller design using the shift operator [15]. the advantages of the delta operator parameterization are elaborated in [16], [17], particularly for cases where the discrete z-domain results fail at a high sampling rate. the delta operator has proven its potential in control theory [18], system identification [19], fault detection and network control [20], and kalman filter-based controller design for cyber-physical systems [21]. direct discretization from the continuous-time domain to the delta domain can make the procedure for fo controller design smoother, and methods for this have been proposed in [22], [23]. high-speed digital realization of the fractional-order operator is possible using the properties of the delta operator parameterization [24]. moreover, the delta operator parameterization has made it possible to understand both continuous-time and discrete-time systems in a unified framework.

for designing the fractional-order controller, several works in the literature (an231e04 data sheet., 2012), [25]–[27] discuss different realization techniques. the tuning of the controller parameters is a fundamental issue. several optimization techniques [28], [29] in the frequency domain [8], [30] are available. analog realizations of fractional-order pid controllers have been proposed in [31]–[34]. digital implementation of the foc for a boost converter using shift operator parameterization has been successfully done in [33]. digital implementation of fractional-order controllers using fpga via shift operator parameterization in the indirect discretization domain is presented in [35], [36]. in this paper, a ds1202 dspace board is the platform on which a real-time controller in the discrete delta domain is implemented. the performance of the proposed controller is studied using both simulations and digital hardware platforms, and a comparative study is carried out.
the contributions of this paper are manifold. in earlier work, fractional-order controllers were designed using different analog realization techniques. the discrete-time systems designed so far use shift operator parameterization, but shift operator parameterization fails to provide meaningful information at a high sampling rate, while the real-time implementation of the controller in the digital domain needs a very high sampling rate to get a better result. in this work, the fo controller design for an integer-order plant with dead time is done using the delta operator parameterization, and the hardware realization is made using dspace. at the fast-sampling limit, the discrete domain results resemble the continuous-time results, providing a unified method of foc design in the delta domain. a new direct discretization method for converting the continuous-time fractional-order operator into the discrete delta domain is utilized to obtain the rational transfer function in the delta domain for implementation on the dspace board. therefore, digital design and implementation of foc using delta operator parameterization with dspace is a newer concept and a new direction for further research.

the paper is organized as follows: the basics of the fractional-order system and controller are discussed in section 2. in section 3, the discretization of fractional-order operators using the delta operator is described. the digital realization of the fopid controller using the delta operator is demonstrated in section 4. in section 5, the implementation of the proposed controller in simulink and on the dspace board is discussed. finally, sections 6 and 7 are devoted to the result analysis and the conclusion, respectively.

2. fractional order system

2.1. fractional order calculus

in fractional calculus, non-integer order differentiation/integration is denoted by a fundamental operator ${}_{m}d_{\tau}^{\psi}$, where ψ specifies the order of the operation (differentiation or integration). this operator is known as the integro-differentiator operator and is mathematically represented as

$${}_{m}d_{\tau}^{\psi} = \begin{cases} \dfrac{d^{\psi}}{d\tau^{\psi}} & (\psi > 0) \\ 1 & (\psi = 0) \\ \displaystyle\int_{m}^{\tau} (d\tau)^{-\psi} & (\psi < 0) \end{cases} \tag{1}$$

there are two popular definitions of the integro-differentiator operator, the grünwald-letnikov (gl) and the riemann-liouville (rl) definitions. (2) and (3) describe the gl and rl definitions, respectively.

gl definition:

$${}_{m}d_{\tau}^{\psi}\,\phi(t) = \lim_{p \to 0} \frac{1}{p^{\psi}} \sum_{n=0}^{\left[\frac{t-m}{p}\right]} (-1)^{n} \binom{\psi}{n}\, \phi(t - np) \tag{2}$$

rl definition:

$${}_{m}d_{\tau}^{\psi}\,\phi(t) = \frac{1}{\Gamma(x - \psi)} \frac{d^{x}}{dt^{x}} \int_{m}^{t} \frac{\phi(p)}{(t - p)^{\psi - x + 1}}\, dp \tag{3}$$

where the value of ψ varies from (x − 1) to x and Γ represents euler's gamma function.

2.2. fractional order differential equation and transfer function

a fractional-order differential equation is used to describe the dynamics of a fractional-order system (fos). as with the classical integer-order system, the laplace transform of the fractional-order differential equation generates the transfer function of the fos. the mathematical equation of a fractional-order system is described by (4).

$$a_{n} d^{\psi_{n}} y(t) + a_{n-1} d^{\psi_{n-1}} y(t) + \dots + a_{0} d^{\psi_{0}} y(t) = b_{m} d^{r_{m}} u(t) + b_{m-1} d^{r_{m-1}} u(t) + \dots + b_{0} d^{r_{0}} u(t) \tag{4}$$

where $d^{\psi} \equiv {}_{0}d_{t}^{\psi}$ is the rl or caputo fractional derivative. the input and the output of the system are denoted by u(t) and y(t), respectively, $a_i$ (i = 0, ..., n) and $b_i$ (i = 0, ..., m) are constants, and $\psi_i$ (i = 0, ..., n) and $r_i$ (i = 0, ..., m) are arbitrary real numbers. in general, the values of $\psi_i$ and $r_j$ can be taken such that $\psi_n > \psi_{n-1} > \dots > \psi_0$ and $r_m > r_{m-1} > \dots > r_0$.
the laplace transform of (1) gives rise to the continuous-time relation in (5).

$$\mathcal{L}\left\{{}_{m}d_{\tau}^{\psi}\,\phi(t)\right\} = s^{\psi}\,\Phi(s) \tag{5}$$

following the caputo definition, the lower limit m of the fractional derivative is taken equal to 0, and the laplace transform of $\phi(t)$ is denoted by $\Phi(s)$. using the expression in (5), applying the laplace transform to both sides of (4) gives the transfer function of a system with y(t) as the output and u(t) as the input:

$$g(s) = \frac{y(s)}{u(s)} = \frac{b_{m} s^{r_{m}} + b_{m-1} s^{r_{m-1}} + \dots + b_{0} s^{r_{0}}}{a_{n} s^{\psi_{n}} + a_{n-1} s^{\psi_{n-1}} + \dots + a_{0} s^{\psi_{0}}} \tag{6}$$

where $u(s) = \mathcal{L}\{u(t)\}$ and $y(s) = \mathcal{L}\{y(t)\}$.

2.3. fractional order pid controller ($pi^{\lambda}d^{\mu}$)

the fractional-order pid controller performs better than the integer-order pid controller owing to its greater number of degrees of freedom. in the case of the fopid controller, the orders of the integrator and differentiator (λ > 0, μ > 0) are non-integer. therefore, by using fractional-order calculus for differentiation, integration and the laplace transform, the continuous-time transfer function of the fractional-order pid controller takes the following form:

$$g_{c}(s) = \frac{u(s)}{e(s)} = k_{p} + k_{i} s^{-\lambda} + k_{d} s^{\mu}, \quad (\lambda, \mu > 0) \tag{7}$$

where $u(s) = \mathcal{L}\{u(t)\}$ and $e(s) = \mathcal{L}\{e(t)\}$ are the output and the input of the controller, respectively. the integer-order pid controller is obtained by using λ = 1 and μ = 1 in (7). likewise, the pd controller is obtained if λ = 0 and $k_i$ = 0. it may be concluded that (7) is the generalized transfer function of the integer/fractional-order controller. the basic structure of the fopid controller is given in fig. 1.

fig. 1 fractional order $pi^{\lambda}d^{\mu}$ controller

3. direct discretization of fractional order integrator and differentiator using delta operator

3.1. relationship between s-domain and γ-domain

the shift operator parameterization is used to describe the discrete-time system.
the forward shift operator is usually denoted by q. the delta domain is the setting in which discrete-time systems are represented using the delta operator δ. the delta operator (δ) is nothing but a scaled and shifted version of the forward shift operator (q). the two operators are related as (Δ is the sampling time)

$$\delta = \frac{q - 1}{\Delta} \tag{8}$$

at a high sampling rate (Δ → 0), the following identity is obtained when the delta operator is applied to a differentiable signal y(t):

$$\lim_{\Delta \to 0} \frac{y(t + \Delta) - y(t)}{\Delta} = \frac{d}{dt} y(t) = \dot{y}(t) \tag{9}$$

as can be seen from (9), the continuous-time derivative is recovered from the delta-operated signal at the fast-sampling limit. the relationship between the frequency variable γ in the delta domain and the frequency variable z of the shift operator domain is given below:

$$\gamma = \frac{z - 1}{\Delta} \tag{10}$$

replacing $z = e^{s\Delta}$ in (10), the relationship between the frequency variables in continuous time and discrete delta time is obtained, as given by (11).

$$\gamma = \frac{e^{s\Delta} - 1}{\Delta}, \quad \text{or} \quad e^{s\Delta} = 1 + \Delta\gamma, \quad s = \frac{1}{\Delta} \ln(1 + \Delta\gamma) \tag{11}$$

equation (11) represents the direct relationship between the variables s and γ.

3.2. direct discretization of fractional order operator in delta domain

for the realization of an foc in the delta domain, discretization of the fractional-order operator ($s^{\alpha}$) in the delta domain plays the pivotal role.
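before continuing, the relations (8) to (11) can be checked numerically. the python sketch below is an illustration only: the function names, the test pole s and the test signal are assumptions made for this example and do not come from the paper. it maps a continuous-time pole to the delta domain with (11) and applies the delta operator of (8) to a sampled sine wave, to show that both approach their continuous-time counterparts as the sampling time Δ shrinks.

```python
import cmath
import math

def gamma_from_s(s, delta):
    """Map a continuous-time frequency s to the delta domain, eq. (11)."""
    return (cmath.exp(s * delta) - 1.0) / delta

def s_from_gamma(gamma, delta):
    """Inverse map of eq. (11): s = ln(1 + delta * gamma) / delta (principal branch)."""
    return cmath.log(1.0 + delta * gamma) / delta

def delta_operator(y, t, delta):
    """Delta operator of eq. (8) applied to a signal y at time t: (y(t + delta) - y(t)) / delta."""
    return (y(t + delta) - y(t)) / delta

if __name__ == "__main__":
    s = -2.0 + 3.0j  # hypothetical continuous-time pole
    for delta in (0.1, 0.01, 0.001):
        z = cmath.exp(s * delta)          # shift-domain pole, drifts towards 1
        g = gamma_from_s(s, delta)        # delta-domain pole, drifts towards s
        print(f"delta={delta}: z={z:.6f}, gamma={g:.6f}")
    # delta operator on sin(t) approaches cos(t), as stated by eq. (9)
    print(delta_operator(math.sin, 0.5, 1e-6), math.cos(0.5))
```

as Δ decreases, the shift-domain pole z approaches 1 regardless of s, while the delta-domain pole γ approaches s itself, which is the property the delta parameterization relies on at high sampling rates.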
from (11), the transformation of the fractional order operator into delta domain from continuous time domain can be re-established as:         +  = )1ln( 1 s (12) by using trapezoidal quadrature rule [37] and cfe, ln (1 + x) function can be successfully approximated to its closed form is as follow: 2 2 66 36 )1ln( xx xx x ++ + =+ (13) replacing x by  in (13), (11) can be rewritten as         ++ +        +  = 22 2 66 36 )1ln( 1   s (14) from (14), it is evident that at fast sampling rate ( → 0), s   meaning, the continuous and discrete delta domain becomes replicate to each other, thereby (14) gives a direct relationship between the two domains. equation (12) can be rewritten as: 2 2 2 6 3 6 6 s          +  =   +  +   (15) rational transfer function in delta domain corresponding to any fractional order operator can be realized using (15) through the direct discretization method as demonstrated in [23] in continuous-time system representation, fractional-order differentiator (fod) and fractional-order integrator (foi)are mathematically expressed as: )10()( = rssg r d (16) )10()( = − rssg r i (17) design and implementation of fractional-order controller in delta domain 195 continued fraction expansion (cfe) [38], [39] is used as a powerful tool that operates on the generating function to get a rational transfer function. the cfe approximation is mathematically formulated using (18)[39]. .....2 )3( 5 )2( 2 )2( 3 )1( 2 )1( 1 1)1( + − + + + − + + + − + +=+ pq pq pq pq pq qp p q (18) to obtain the standard form of cfe as given in (18), p is replaced by         −         ++ + 1 66 36 22 2   to get the result obtained by cfe in (15). here, (15) is used as the generating function for the integer order approximation of the fractional-order differentiator/integrator in the delta domain as mathematically represented by (19). 
$$G_{del}(\gamma) = \mathrm{cfe}\left\{\left(\frac{6\gamma + 3\Delta\gamma^2}{6 + 6\Delta\gamma + \Delta^2\gamma^2}\right)^{r}\right\} \tag{19}$$

in this work, third order approximations of the fod and foi are considered for realization and implementation. delta domain coefficients [23] for the third order approximation of $s^r$ are tabulated in table 1.

table 1 delta-domain coefficients for third-order approximation of $s^r$
6 5 4 3 2 3 (3 ) ( 1) (4096 26624 9472 201472 252944 331304 506955)dnum / r / r + / r + r + r r r + r +=
coefficient numerator
0h 6 3 5 2 7 4 3 (30720 454416 36096 838259 78360 4096 192000 506955) r + r r r + r r r + dnum
1h 2 3 5 6 4 3 ( 938460 1388142 723408 608640 76800 12288 12288 ) r + r + r r + r + r dnum
2h 2 2 2 2 3 2 5 2 2 4 3 ( 465120 195900 128640 15360 714105 57600 ) r r + r r + + r dnum
3h 3 2 3 4 3 3 ( 64320 7680 97950 )+ r + r + dnum
coefficient denominator
0i 7 6 5 4 3 2 3 (4096 30720 36096 192000 454416 78360 838259 506955) r + r + r r r + r + r + / dnum
1i 2 3 5 6 4 3 (938460 1388142 723408 608640 76800 12288 12288 ) + r + r r + r + r + r / dnum
2i 2 2 2 2 3 2 5 2 2 4 3 ( 465120 195900 128640 15360 714105 57600 ) + r + r r + r + + r / dnum
3i 3 2 3 4 3 3 ( 64320 7680 97950 )+ r + r + / dnum

from the coefficients of table 1, the 3rd order rational approximation of $s^r$ can be obtained as the 3rd order generalized transfer function given by (20).

$$G_d(s) = s^{r} \approx \left(\frac{6\gamma + 3\Delta\gamma^2}{6 + 6\Delta\gamma + \Delta^2\gamma^2}\right)^{r} \approx \frac{h_0\gamma^3 + h_1\gamma^2 + h_2\gamma + h_3}{i_0\gamma^3 + i_1\gamma^2 + i_2\gamma + i_3} \tag{20}$$

4. digital realization of fractional-order pid controller in the delta domain

the transfer function of the pid controller in continuous time is given by (7).
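returning briefly to the approximation behind the coefficients above, the closed form (13) and the limit $s \approx \gamma$ implied by (14) can be checked numerically. the following is a minimal sketch; the function names and the test values of $x$, $\gamma$ and $\Delta$ are assumptions chosen for illustration.

```python
import math

def ln1p_approx(x):
    """Closed-form approximation of ln(1 + x) from eq. (13)."""
    return (6.0 * x + 3.0 * x * x) / (6.0 + 6.0 * x + x * x)

def s_approx(gamma, delta):
    """Right-hand side of eq. (14): approximate s for a given gamma."""
    return ln1p_approx(delta * gamma) / delta

# accuracy of (13) against the library logarithm
for x in (0.01, 0.1, 0.5):
    print(x, math.log1p(x), ln1p_approx(x))

# s approaches gamma as the sampling time shrinks
gamma = 2.0
for delta in (0.1, 0.01, 0.001):
    print(delta, s_approx(gamma, delta))
```

the second loop makes the fast-sampling statement concrete: for a fixed $\gamma$, the value returned by `s_approx` moves towards $\gamma$ as $\Delta$ decreases.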
to realize the controller transfer functions in the delta domain, fractional order operators such as $s^{-\lambda}$ and $s^{\mu}$ are to be implemented in the delta domain using (20). the pid controller in the delta domain takes the form

$$C(\gamma) = K_p + K_i\left(\frac{6\gamma + 3\Delta\gamma^2}{6 + 6\Delta\gamma + \Delta^2\gamma^2}\right)^{-\lambda} + K_d\left(\frac{6\gamma + 3\Delta\gamma^2}{6 + 6\Delta\gamma + \Delta^2\gamma^2}\right)^{\mu} \tag{21}$$

in this work, the proposed foc designed in the delta domain is used to control a plant that is of first order with time delay [33]. the plant transfer function $G_p(s)$ is modeled through the first order padé approximation to obtain (22).

$$G_p(s) = \frac{k_p}{1 + sT}\, e^{-Ls} \approx \frac{k_p}{1 + sT}\left(\frac{1 - \frac{Ls}{2}}{1 + \frac{Ls}{2}}\right) \tag{22}$$

considering $T = 1$, $L = 0.1$, the plant becomes

$$G_p(s) = \frac{k_p}{1 + s}\, e^{-0.1s} \approx \frac{k_p}{1 + s}\left(\frac{1 - 0.05s}{1 + 0.05s}\right) \tag{23}$$

the fopid controller in the continuous-time domain is tuned using particle swarm optimization (pso) [33] for the plant given by (23), and the tuned parameters of the fopid controller are: proportional gain ($K_p$) = 0.7469, integral gain ($K_i$) = 0.874, derivative gain ($K_d$) = 0.0001, $\lambda$ = 1.2089 and $\mu$ = 0.0603. the fopid in the discrete delta domain takes the form shown in (24).

$$C(\gamma) = 0.7469 + 0.874\left(\frac{6\gamma + 3\Delta\gamma^2}{6 + 6\Delta\gamma + \Delta^2\gamma^2}\right)^{-1.2089} + 0.0001\left(\frac{6\gamma + 3\Delta\gamma^2}{6 + 6\Delta\gamma + \Delta^2\gamma^2}\right)^{0.0603} \tag{24}$$

the 3rd order rational approximation of the controller in the delta domain (the sampling time is taken as $\Delta = 0.001$ second) is obtained using (20) and expressed by (25).

3 2 8 14 3 2 8 13 5 3 8 2 13 18 3 2 9 14 9.514 0.0006938 1.293 7.102 ( ) 0.7469 0.0009524 4.021 3.909 9.031 1.558 4.316 3.148 0.0001543 4.106 2.911 e e c e e e e e e e e               − − − − − − − − − −  − − − − = +    − − −   + + + +   + + +     (25)

4.1.
realization of controller using df-ii method

in this work, the delta domain fopid controller is realized using the direct form ii (df-ii) realization method. the foc can be realized in iir form in the z-domain as follows:

$$F(z) = \frac{Y(z)}{X(z)} = \frac{b_0 + b_1 z^{-1} + b_2 z^{-2} + \cdots + b_m z^{-m}}{a_0 + a_1 z^{-1} + a_2 z^{-2} + \cdots + a_n z^{-n}} \tag{26}$$

the foc can be realized in iir form in the $\gamma$-domain as follows:

$$F(\gamma) = \frac{Y(\gamma)}{X(\gamma)} = \frac{m_0 + m_1\gamma^{-1} + m_2\gamma^{-2} + \cdots + m_m\gamma^{-m}}{n_0 + n_1\gamma^{-1} + n_2\gamma^{-2} + \cdots + n_n\gamma^{-n}} \tag{27}$$

the functional diagram of the delta df-ii realization method, corresponding to the governing iir equation (27), is depicted in fig. 2.

fig. 2 delta direct form ii realization structure

the unit delay block ($z^{-1}$) of the discrete z-domain is rebuilt in the discrete $\gamma$-domain using (10) to realize the foc in the delta domain. this can be called the delta direct form-ii (ddf-ii) realization. the block $\gamma^{-1}$ that takes the place of the unit delay in the $\gamma$-domain is represented by (28).

$$\gamma^{-1} = \frac{\Delta z^{-1}}{1 - z^{-1}} \tag{28}$$

4.1.1. delta direct form-ii realization of foi

the integrator part of (25) is considered for the ddf-ii realization. fig. 3 shows the ddf-ii realization of the integrator section.

fig. 3 delta direct form ii realization of fractional-order integrator section of fractional order controller

4.1.2. delta direct form-ii realization of fod

the differentiator part of (25) is considered for the ddf-ii realization. fig. 4 shows the ddf-ii realization of the differentiator section.

fig. 4 delta direct form ii realization of fractional-order differentiator section of fractional order controller

4.2.
implementation of digital controller designed in delta domain using dspace

data acquisition and control of the prototype system with a controller is accomplished using the ds1202 dspace microlabbox, which can be reprogrammed using matlab/simulink and dspace software. dspace provides a software package in which the real-time interface with the model-based input-output can be integrated with simulink and control desk. if a continuous system is to be controlled with a digital controller having a sampling time of $\Delta$, the functional diagram shown in fig. 5, which depicts the interfacing of the system and the controller, can be utilized. to pass the information from the sensor to the controller in dspace, an analog-to-digital converter (adc) is used, and a digital-to-analog converter (dac) is used to send the signal back.

fig. 5 real-time control structure

the selection of the sample time of the control program in dspace depends on the time constant of the physical system, which in turn is related to the dynamics of the system. the actual hardware setup for the experiment is shown in fig. 6, where the plant is designed in the continuous-time domain and the controller is designed in the delta domain (discrete-time domain) and implemented through the ds1202 dspace board.

fig. 6 actual photograph of the experimental setup

fig. 7 shows the analog realization of the fo plant [8] in the continuous-time domain. the parameters required to design the fo plant shown in fig. 7 are summarized in table 2.

table 2 component specifications for designing the fo plant

| elements | value |
|---|---|
| r1 | 40 kΩ |
| r2 | 10 kΩ |
| r3 | 500 Ω |
| c1, c2 | 15 nF |

fig. 7 analog realization of fractional order plant

fig. 8 shows the digital realization of the fopid controller designed using the delta operator, which is used to control the continuous-time plant in matlab/simulink. fig.
9 demonstrates the step response of the overall system, where the fopid controller using the delta operator is designed in matlab/simulink.

fig. 8 digital realization of fopid controller designed in the delta domain (kp = 0.25)

fig. 9 step response of the overall system with fopid controller designed in delta domain (kp = 0.25)

fig. 10 hardware implementation of the plant of first order with time delay

5. result analysis

in this work, delta operator parameterization is used to design the discrete fopid controller, which is realized by the delta direct form ii structure. the plant is considered to be of first order with time delay and is built to operate in real time. the designed delta fopid controller is implemented using the ds1202 dspace board, and the unit step responses of the overall system for variation of the dc gain kp are demonstrated in fig. 12 to fig. 17.

fig. 12 step response characteristics of the overall system with delta fopid controller in dspace (kp = 0.25, maximum overshoot percentage mp (%) = 1.4 and ts (ms) = 1.3)

fig. 13 step response characteristics of the overall system with delta fopid controller in dspace (kp = 0.5, maximum overshoot percentage mp (%) = 9.28 and ts (ms) = 1.5)

fig. 14 step response characteristics of the overall system with delta fopid controller in dspace (kp = 1, maximum overshoot percentage mp (%) = 14.53 and ts (ms) = 1.6)

fig. 15 step response characteristics of the overall system with delta fopid controller in dspace (kp = 2, maximum overshoot percentage mp (%) = 13.59 and ts (ms) = 1.2)

fig. 16 step response characteristics of the overall system with delta fopid controller in dspace (kp = 4, maximum overshoot percentage mp (%) = 7.15 and ts (ms) = 1.14)

fig.
17 step response characteristics of the overall system with delta fopid controller in dspace (kp = 8, maximum overshoot percentage mp (%) = 2.309 and ts (ms) = 0.96)

5.1. robustness analysis for the proposed controller

to study the robustness of the developed delta domain foc, the dc gain (kp) is varied and the responses of the closed loop system are measured. for each value of the dc gain (kp), the peak percentage overshoot and the settling time are measured; neither varies considerably with the dc gain. the iso-damping property of the fractional-order system is thus satisfied by the discrete foc designed in the delta domain. a comparative analysis of the time domain parameters for variation of the dc gain (kp) is summarized in table 3. the plots shown in fig. 12 to fig. 17 indicate that the closed loop system with the delta fopid controller realized using dspace is robust against process gain (k) variations and exhibits the iso-damping property.

5.2. sensitivity analysis of the system

a perturbation (± 20 % pu) is applied to the closed loop system containing the fractional order plant and the developed delta domain fopid using dspace, and the steady state response is noted. the output of the closed loop system with random variation of the step input is demonstrated in fig. 18. from fig. 18, it is clear that the steady state error becomes zero even though a sufficient perturbation is applied at the input side. this proves the system to be a robust one and sensitive to input variation.

fig. 18 steady state error of the closed loop system for a random perturbation

the foc designed in the continuous and discrete delta domains must be stable. the pole-zero plots of the designed controller in both domains are shown in fig. 19 and fig. 20. from fig. 19 and fig.
20 the stability of the realized controllers is ensured.

fig. 19 pole-zero plot of discrete delta ($\delta$) fopid controller

fig. 20 pole-zero plot of continuous time fopid controller

table 3 comparative analysis of the time domain parameters for variation of the dc gain 'kp'

| dc gain | parameter | s-domain realization | analog realization [33] | delta domain realization |
|---|---|---|---|---|
| kp = 0.25 | %mp | 11.2 | 4.11 | 1.4 |
| | ts (ms) | 0.86 | 0.54 | 1.1 |
| kp = 0.5 | %mp | 12.9 | 10.9 | 9.28 |
| | ts (ms) | 0.52 | 0.32 | 0.95 |
| kp = 1 | %mp | 14.23 | 14 | 14.53 |
| | ts (ms) | 0.29 | 0.2 | 0.74 |
| kp = 2 | %mp | 11.29 | 12.3 | 12.59 |
| | ts (ms) | 0.17 | 0.11 | 1.1 |
| kp = 4 | %mp | 7.3 | 7.9 | 7.1 |
| | ts (ms) | 0.07 | 0.052 | 1.14 |
| kp = 8 | %mp | 8.1 | 5.8 | 2.3 |
| | ts (ms) | 0.021 | 0.017 | 0.96 |

6. conclusion

in this paper, the design and implementation of a fractional order controller in the delta domain is presented. one of the essential properties of a fractional-order system is the iso-damping property. the fractional-order pid controller is designed in the delta domain from the corresponding continuous-time fopid controller transfer function by using the direct discretization method, and the delta fopid controller is then realized using the delta direct form-ii structure of filter realization. the ds1202 dspace board is used in this work to implement the controller through the matlab/simulink and control desk interface of the dspace board. this approach is devoid of the ill-conditioning that is inherent in shift operator parameterization. in this work, the sampling time ($\Delta$ = 0.001 sec) is taken very close to zero to obtain a discrete-time system with a very high sampling rate. the fopid controller designed in the delta domain gives response characteristics very close to the responses obtained from the analog realization of the fopid controller, which is designed in the s-domain.
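to make the delta direct form-ii structure of (27) and (28) concrete, the sketch below realizes each $\gamma^{-1}$ block as an accumulator scaled by the sampling time. it is a minimal illustration under stated assumptions, not the dspace implementation used in this work: the function name `delta_df2`, the coefficient ordering and the first-order test filter $1/(\gamma + 1)$ are all hypothetical choices.

```python
def delta_df2(m, n, x, delta):
    """Filter the samples x through F(gamma) written as in eq. (27).

    m and n hold the numerator and denominator coefficients of
    gamma^0, gamma^-1, gamma^-2, ...; n[0] must be nonzero.
    """
    order = max(len(m), len(n)) - 1
    m = list(m) + [0.0] * (order + 1 - len(m))
    n = list(n) + [0.0] * (order + 1 - len(n))
    v = [0.0] * order  # v[i]: output of the (i + 1)-th gamma^-1 block
    y = []
    for xk in x:
        # feedback sum, then feed-forward sum (direct form ii)
        w = (xk - sum(n[i + 1] * v[i] for i in range(order))) / n[0]
        y.append(m[0] * w + sum(m[i + 1] * v[i] for i in range(order)))
        # eq. (28): each gamma^-1 block is an accumulator scaled by delta
        prev = w
        for i in range(order):
            v[i], prev = v[i] + delta * prev, v[i]
    return y

delta = 0.001
step = [1.0] * 1001
# F(gamma) = 1 / (gamma + 1) = gamma^-1 / (1 + gamma^-1)
y = delta_df2([0.0, 1.0], [1.0, 1.0], step, delta)
print(y[1000], 1.0 - (1.0 - delta) ** 1000)
```

for this first-order example the recursion reduces to y[k+1] = y[k] + Δ(x[k] - y[k]), so the unit step response is 1 - (1 - Δ)^k, which the last line compares against the filter output.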
when the dc gain "kp" is varied over a specified range, the response characteristics of the overall system remain almost unaltered, meaning that the property of iso-damping is satisfied. from table 3, it is evident that the time response parameters obtained with the three methods of designing the fopid controller are very close to each other. the stability of the realized system is also verified through the pole and zero locations of the developed delta domain controller. the system response remains stable with a perturbation in the step input, as demonstrated in fig. 18. the results obtained using the delta parameterized discrete-time system resemble those obtained with the continuous-time system at a fast sampling rate, which makes the design a unified one and a viable alternative for discrete fractional order controller design and implementation.

references

[1] i. podlubny, fractional differential equations, elsevier, 1998.
[2] m. nakagawa and k. sorimachi, "basic characteristics of a fractance device", ieice trans. fundamentals electron., commun. comput. sci., vol. 75, pp. 1814-1819, dec. 1992.
[3] a. oustaloup, la dérivation non entière, hermes science publication, 1995.
[4] r. caponetto, g. dongola, l. fortuna and i. petráš, fractional order systems: modeling and control applications, world scientific, 2010.
[5] k. b. oldham and j. spanier, the fractional calculus: theory and applications of differentiation and integration to arbitrary order, elsevier science, 1974.
[6] i. podlubny, "fractional-order systems and piλdμ-controllers", ieee trans. automatic contr., vol. 44, no. 1, pp. 208-214, jan. 1999.
[7] k. s. miller and b. ross, an introduction to the fractional calculus and fractional differential equations, john wiley & sons, july 1993.
[8] y. q. chen, i. petráš and d. xue, "fractional order control - a tutorial", in proceedings of the 2009 american control conference, pp. 1397-1411, june 2009.
[9] h. h. sun, b. onaral and y. y.
tso, "application of the positive reality principle to metal electrode linear polarization phenomena", ieee trans. biomed. eng., vol. bme-31, pp. 664-674, oct. 1984.
[10] h. h. sun, a. a. abdelwahab and b. onaral, "linear approximation of transfer function with a pole of fractional power", ieee trans. automat. contr., vol. 29, pp. 441-444, may 1984.
[11] s. b. skaar, a. n. michel and r. a. miller, "stability of viscoelastic control systems", in proceedings of the 26th ieee conference on decision and control, vol. 26, pp. 1582-1587, july 1987.
[12] n. engheta, "fractional calculus and fractional paradigm in electromagnetic theory", in proceedings of the international conference on mathematical methods in electromagnetic theory (mmet 98) (cat. no.98ex114), vol. 1, pp. 43-49, june 1998.
[13] j. swarnakar, p. sarkar and l. j. singh, "a unified direct approach for discretizing fractional-order differentiator in delta-domain", int. j. model. simul. sci. comput., vol. 9, pp. 1850055:1-1850055:20, aug. 2018.
[14] j. a. t. machado, "analysis and design of fractional-order digital control systems", syst. anal. modelling simulation, vol. 27, pp. 107-122, 1997.
[15] r. h. middleton and g. c. goodwin, digital control and estimation: a unified approach, englewood cliffs, nj, prentice hall, 1990.
[16] a. khodabakhshian, v. j. gosbell and f. coowar, "discretization of power system transfer functions", ieee trans. power syst., vol. 9, no. 1, pp. 255-261, feb. 1994.
[17] g. c. goodwin, r. h. middleton and h. v. poor, "high-speed digital signal processing and control", in proceedings of the ieee, vol. 80, no. 2, pp. 240-259, feb. 1992.
[18] j. cortés-romero, a. luviano-juárez and h. j. sira-ramírez, "a delta operator approach for the discrete-time active disturbance rejection control on induction motors", math. probl. eng., vol. 2013, pp. 1-9, nov. 2013.
[19] s. ganguli, g. kaur and p. sarkar, "identification in the delta domain: a unified approach via gwocfa", soft. comput., vol. 24, no. 3, pp. 4791-4808, april 2020.
[20] y. zhao and d. zhang, "h∞ fault detection for uncertain delta operator systems with packet dropout and limited communication", in proceedings of the american control conference, 2017, pp. 4772-4777.
[21] j. gao, s. chai, m. shuai, b. zhang and l. cui, "detecting false data injection attack on cyber-physical system based on delta operator", in proceedings of the chinese control conference (ccc), 2018, pp. 5961-5966.
[22] j. swarnakar, p. sarkar and l. j. singh, "direct discretization method for realizing a class of fractional order system in delta domain - a unified approach", automatic control comput. sci., vol. 53, no. 2, pp. 127-139, june 2019.
[23] s. dolai, a. mondal and p. sarkar, "a new approach for direct discretization of fractional order operator in delta domain", fu: elec. energ., vol. 35, no. 3, pp. 313-331, sept. 2022.
[24] g. maione, "high-speed digital realizations of fractional operators in the delta domain", ieee trans. automat. contr., vol. 56, no. 3, pp. 697-702, march 2011.
[25] r. herrmann, fractional calculus: an introduction for physicists, singapore, world scientific publishing, 2011.
[26] j. zhong and l. li, "tuning fractional-order piλdμ controllers for a solid-core magnetic bearing system", ieee trans. control syst. technol., vol. 23, pp. 1648-1656, july 2015.
[27] c. a. monje, y. q. chen, b. m. vinagre, d. xue and v. feliu, fractional-order systems and control: fundamentals and applications, springer-verlag, london, 2010.
[28] b. saidi, m. amairi, s. najar and m. aoun, "bode shaping-based design methods of a fractional order pid controller for uncertain systems", nonlinear dyn., vol. 80, pp. 1817-1838, sept. 2015.
[29] r. duma, p. dobra and m. trusca, "embedded application of fractional order control", electron. lett., vol. 48, pp. 1526-1528, nov. 2012.
[30] t. n. l. vu and m. lee, "analytical design of fractional-order proportional-integral controllers for time-delay processes", isa trans., vol. 52, no. 5, pp. 583-591, sept. 2013.
[31] i. podlubny, i. petráš, b. m. vinagre, et al., "analogue realizations of fractional-order controllers", nonlinear dyn., vol. 29, pp. 281-296, july 2002.
[32] j. petrzela, r. sotner and m. guzan, "implementation of constant phase elements using low-q band-pass and band-reject filtering sections", in proceedings of the international conference on applied electronics (ae), pilsen, czech republic, 2016, pp. 205-210.
[33] c. muñiz-montero, l. v. garcía-jiménez, l. a. sánchez-gaspariano, c. sánchez-lópez, v. r. gonzález-díaz and e. tlelo-cuautle, "new alternatives for analog implementation of fractional-order integrators, differentiators and pid controllers based on integer-order integrators", nonlinear dyn., vol. 90, pp. 241-256, oct. 2017.
[34] b. m. vinagre, i. podlubny, a. hernandez and v. feliu, "some approximations of fractional order operators used in control theory and applications", j. fract. calc. appl. anal., pp. 231-248, jan. 2000.
[35] s. khubalkar, a. junghare, m. aware and s. das, "unique fractional calculus engineering laboratory for learning and research", int. j. electr. eng. education, vol. 57, no. 1, pp. 3-23, jan. 2020.
[36] m. s. monir, w. s. sayed, a. h. madian, a. g. radwan and l. a. said, "a unified fpga realization for fractional-order integrator and differentiator", electronics, vol. 11, no. 13, p. 2052, june 2022.
[37] k. s. khattri, "new close form approximations of ln (1 + x)", teaching of math., vol. 12, no. 1, pp. 7-14, dec. 2009.
[38] w. rui, s. qiuye, z. pinjia, g. yonghao, q. dehao and w. peng, "reduced-order transfer function model of the droop-controlled inverter via jordan continued-fraction expansion", ieee trans. energy conver., vol. 35, pp. 1585-1595, march 2020.
[39] y. chen, b. m. vinagre and i.
podlubny, "continued fraction expansion approaches to discretizing fractional order derivatives - an expository review", nonlinear dyn., vol. 38, no. 1, pp. 155-170, dec. 2004.

facta universitatis series: electronics and energetics vol. 28, no 2, june 2015, pp. 251-262 doi: 10.2298/fuee1502251s

complexity reduction of toffoli networks based on fdd

suzana stojković 1, milena stanković 1, claudio moraga 2,3
1 faculty of electronic engineering, university of niš, serbia
2 centre for soft computing, mieres, asturias, spain
3 tu dortmund university, dortmund, germany

abstract. synthesis of switching functions by toffoli gates has become a very important research topic in recent years, since toffoli gates are used in the synthesis of reversible circuits. early methods based on the truth-table representation of boolean functions are applicable to functions with a relatively small number of variables. later on, methods for synthesis by toffoli gates based on decision diagrams (bdds, fdds or okfdds) were introduced and applied to the synthesis of both reversible and irreversible functions. this paper presents a method for the reduction of the number of lines and gates in the toffoli gate realization of boolean functions based on their functional decision diagram (fdd) representation. experiments show that, when the proposed reduction is used, the realization of the given function based on an fdd will, on average, be smaller in terms of the number of lines and the number of gates than the realizations based on an okfdd, an optimal bdd, or an fdd using previously defined algorithms.

key words: switching functions synthesis, toffoli gates, binary decision diagrams, functional decision diagrams

1. introduction

toffoli gates have been used in the realizations of systems of reversible functions.
a reversible function is a bijective function, i.e., it is a system of n functions with n input variables in which each input is mapped into a unique output. it follows that a reversible function can be realized by a cascade of toffoli gates. reversible circuits have lower power dissipation (heat) than classical (irreversible) circuits. therefore, they are receiving increasing attention. however, most of the current methods for reversible synthesis (see, for example, [1], [2], [3] and the references therein) are limited by the complexity of computations and are applicable to functions of a relatively small number of variables. in all these approaches, the number of lines in the circuit is equal to the number of input and output variables.

received august 19, 2014; received in revised form january 19, 2015
corresponding author: suzana stojković, faculty of electronic engineering, university of niš, a. medvedeva 14, 18000 niš, serbia (e-mail: suzana.stojkovic@elfak.ni.ac.rs)

in [4-12], synthesis approaches that can cope with boolean functions of a large number of variables were proposed. all these methods are based on different types of decision diagrams. the methods in [4] and [5] use binary decision diagrams (bdds) to represent functions and derive reversible circuits by mapping the non-terminal nodes into toffoli cascades. papers [6], [7] and [8] discuss the methods for reversible synthesis based on functional decision diagrams (fdds) with positive davio nodes. in [9], a method for synthesis based on the kronecker functional decision diagrams (kfdds) was proposed. the advantage is obtained from the reduced number of nodes in kfdds compared to bdds and fdds, at the cost of more complex determination of these decision diagrams. in all the aforementioned methods, circuits with toffoli gates are obtained hierarchically, within a time and a memory linearly proportional to the decision diagram size.
in [10] and [11], a reversible synthesis based on lattice decision diagrams can be found. in [12], a method for the iterative synthesis of reversible cascades by analyzing properties of the function to be realized in the walsh-hadamard domain is proposed. the computation and the representation of the walsh-hadamard spectrum and the analysis are performed over bdds. the methods based on decision diagrams are applicable to the realizations of both reversible and irreversible systems of switching functions by toffoli gates. however, the main disadvantage of this approach is the large number of lines in the finally produced circuit, because in almost all cases, for the realization of a node in the decision diagram a new additional line is introduced. the problem of reducing the reversible realization of a switching function is discussed in many papers. some methods reduce the number of additional lines by increasing the number of gates (see for example the methods presented in [13]). other methods reduce the cost of the circuits by adding some additional lines (see for example [14]). papers [15] and [16] present the methods that reduce both parameters: the number of lines and the number of gates. the method from paper [15] is based on the bdd and okfdd with negated edges. the method proposed in [16] is based on the bdd representation of the switching function. in this paper, we propose a method for reducing both parameters of the circuit complexity: the number of lines and the number of gates based on the functional decision diagram (fdd) representation of the switching function instead of bdds and kfdds discussed in [13-16]. the reason for the change of the underlying data structure (fdds instead of bdds and kfdds) is that the implementation of a positive davio or negative davio node by toffoli gates is simpler than the implementation of the shannon node. the paper is organized as follows. 
in section 2, the basic definitions necessary for understanding the concepts presented in the next sections are given. section 3 describes the standard and optimal ways for the implementation of the fdd nodes by toffoli gates. experimental results that confirm the proposed method's advantage are shown in section 4. section 5 summarizes the results of the proposed method.

2. basic definitions

the algorithm for reversible synthesis that will be presented in this paper is based on the fdd representation of the switching functions. the results will be compared with realizations based on the bdd or kfdd. for this reason, this section introduces definitions of these types of decision diagrams. in the realizations of the functions, toffoli gates with one or more input lines are used, so toffoli gates are defined as well.

2.1. bdd, fdd and kfdd

for the definition of decision diagrams, the concept of a decision tree should be introduced. a binary decision tree (bdt) representing a boolean function f is the binary tree created by the recursive application of the shannon decomposition rule:

$$f = \bar{x}_i\, f(x_i = 0) \oplus x_i\, f(x_i = 1). \tag{1}$$

additionally, instead of the shannon decomposition, the function can be decomposed by using the positive (2) or the negative (3) davio decomposition:

$$f = f(x_i = 0) \oplus x_i \left( f(x_i = 0) \oplus f(x_i = 1) \right), \tag{2}$$

$$f = f(x_i = 1) \oplus \bar{x}_i \left( f(x_i = 0) \oplus f(x_i = 1) \right). \tag{3}$$

the resulting decision tree using the davio decompositions is named the functional decision tree (fdt). from the previous definition, it follows that in the functional decision tree (fdt) for the given function f, in each non-terminal node either the positive or negative davio decomposition can be used. in practice, the same decomposition is usually applied to all the nodes at the same level in the tree.
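the three decomposition rules (1)-(3) can be verified exhaustively on a small truth table. the sketch below does so for a three-variable majority function; the test function and all helper names are assumptions made for illustration only.

```python
from itertools import product

def f(x1, x2, x3):
    """Hypothetical test function: majority of three bits."""
    return (x1 & x2) | (x1 & x3) | (x2 & x3)

def cofactors(func, i, args):
    """Return (func with x_i = 0, func with x_i = 1) at the given point."""
    lo = list(args)
    hi = list(args)
    lo[i], hi[i] = 0, 1
    return func(*lo), func(*hi)

def shannon(func, i, args):
    f0, f1 = cofactors(func, i, args)
    return ((1 ^ args[i]) & f0) ^ (args[i] & f1)   # eq. (1)

def positive_davio(func, i, args):
    f0, f1 = cofactors(func, i, args)
    return f0 ^ (args[i] & (f0 ^ f1))              # eq. (2)

def negative_davio(func, i, args):
    f0, f1 = cofactors(func, i, args)
    return f1 ^ ((1 ^ args[i]) & (f0 ^ f1))        # eq. (3)

# every rule must reproduce f for every variable and every input point
ok = all(
    f(*p) == shannon(f, i, p) == positive_davio(f, i, p) == negative_davio(f, i, p)
    for p in product((0, 1), repeat=3)
    for i in range(3)
)
print(ok)
```

the check passes because each rule is an identity in the boolean ring; only the choice of sub-functions stored on the outgoing edges of a node differs between the three rules.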
in that case, a polarity vector defines the types of decompositions that are used in the levels. for example, in the fdt of a function with three variables, the polarity vector [101] means that at levels 1 and 3, the positive davio decomposition is used, and at level 2, the negative one.

example 1. figure 1 shows the bdt (a) and the fdt (b) of the function $f(x_1, x_2, x_3, x_4)$ = 1 2 1 2 3 4 x x x x x x   . in this fdt, the positive davio decomposition is used at all levels.

fig. 1 bdt (a) and fdt (b) of the function f in example 1.

a bdt is transformed into a bdd by using the following reduction rules:
1. share all the isomorphic sub-trees: if there are two terminal nodes with the same value, or two non-terminal nodes with isomorphic sub-trees, one of them is deleted. the edges pointing to the deleted node are directed to the remaining node.
2. eliminate all the redundant nodes: if both outgoing edges from a non-terminal node point to the same sub-tree, this node is redundant and it is deleted. its incoming edges are directed to the common sub-tree.

an fdt is transformed into an fdd by using reduction rule 1 above and the following reduction rules for suppression of the zeros ([17], [18]):
2.1. if the right outgoing edge from a positive davio node points to the 0-node (terminal node labeled by the value 0), the node is deleted. the edges pointing to the deleted node are directed to its left sub-tree.
2.2. if the left outgoing edge from a negative davio node points to the 0-node, the non-terminal node is deleted. the edges pointing to the deleted node are directed to its right sub-tree.

example 2. figure 2 shows the bdd (a) and the fdd (b) that are obtained by reduction of the bdt and fdt from example 1.

fig. 2 bdd (a) and fdd (b) of the function from example 1.

in the previous example, the fdd is more compact than the bdd for the considered function. but there are functions for which the bdd is more compact.
there are no methods to predict whether the bdd or the fdd will be smaller for the given function (see the examples in section 5). the more general type of decision tree is the kronecker functional decision tree (kfdt), in which at each level either the shannon, the positive davio, or the negative davio decomposition can be used. by the reduction of the kfdt, a kronecker functional decision diagram (kfdd) is obtained. since the realization of the shannon nodes by toffoli gates is usually more complex than the corresponding realization of the positive or negative davio nodes, the considerations in this paper are restricted to circuits derived from fdds.

2.2. toffoli gates

toffoli gates are used in the synthesis of reversible functions. accordingly, toffoli gates transform the set of input signals $(x, y_1, y_2, \ldots, y_{n-1})$ into the set of output signals $(x \oplus y_1 y_2 \cdots y_{n-1}, y_1, y_2, \ldots, y_{n-1})$. a special case is the toffoli gate with only one input and one output line, which generates the complement of the input signal. graphical symbols of the general toffoli gate and (by extension) a toffoli gate with one input and one output (that realizes the not operation) are shown in figure 3.

fig. 3 toffoli gate with n input and output lines (a) and with one input and output line (b).

3. optimal realization of positive and negative davio nodes by toffoli gates

we use the same method for the reversible synthesis as in the approaches based on the bdd and kfdd, discussed in [4], [5], [9]. as proposed in [9], each node in the fdd maps into a toffoli cascade as shown in table 1. the realizations of davio nodes proposed in [9] are referred to as "standard realizations" in the following text. it can be noticed that the nodes are always realized with additional lines.
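the gate definition of section 2.2 and the idea of a realization on an additional line can be modelled directly on bit values. the sketch below is an illustration under assumptions: the names `toffoli` and `pd_node_standard` are hypothetical, and the two-gate cascade shown for a general positive davio node (function f0 ⊕ xk f1 written onto an extra line initialised to 0) is one plausible reading of the standard realization, not a transcription of table 1.

```python
from itertools import product

def toffoli(x, controls):
    """Toffoli gate: the target x becomes x XOR (AND of all controls).

    With no controls the AND is 1, so the gate acts as NOT.
    The control lines are returned unchanged.
    """
    and_of_controls = 1
    for y in controls:
        and_of_controls &= y
    return x ^ and_of_controls, tuple(controls)

def pd_node_standard(xk, f0, f1):
    """Positive Davio node f = f0 XOR xk*f1 on an additional line."""
    line = 0                            # additional line, initialised to 0
    line, _ = toffoli(line, (f0,))      # line = f0
    line, _ = toffoli(line, (xk, f1))   # line = f0 XOR xk*f1
    return line

print(toffoli(1, ()))        # one-line gate acts as NOT
print(toffoli(0, (1, 1)))    # target flipped because both controls are 1
print(all(pd_node_standard(xk, f0, f1) == f0 ^ (xk & f1)
          for xk, f0, f1 in product((0, 1), repeat=3)))
```

because the inputs xk, f0 and f1 are left untouched, they remain available to other nodes, which is the reason the additional line is introduced.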
we observed that in some cases it is possible to have a simpler realization of the node, without additional lines, by the direct transformation of the inputs. however, if transformed, the initial input will be lost for future use. the additional lines are introduced to keep the input for future use. in the cases when the remainder function or an input variable is not used later, it is possible to select the implementation with a smaller number of lines and gates. table 2 shows the simplified realizations of some davio nodes, which can be applied when it is not necessary to keep some inputs of the node for future use. the possible reductions based on our approach are:
1. for the node type marked as number 2, the output function f is equal to $\bar{x}_k$, which can be realized with a single one-input toffoli gate. however, in this case xk will be lost for future use. if xk is not used later, this node can be realized without additional lines.
2. the node type marked as number 3 realizes the function f = f0 ⊕ xk. there are two possible ways for the reduced realization of this node. in the first case the line xk is transformed into the output function, and in the second case, the line f0. if one of them is not used later, the node can be realized without additional lines and with one gate (instead of 2, as proposed in the standard realization).
3. the node type marked as number 6 realizes the function f = f0 ⊕ xk f1. the standard realization of this node can be reduced if the residual function f0 is not used later. the reduced realization does not contain additional lines and contains one gate less than the standard realization.

table 1 standard realizations of positive and negative davio nodes by toffoli gates
(columns: no.; positive davio (pd) nodes: node type, implementation by toffoli gates; negative davio (nd) nodes: node type, implementation by toffoli gates. rows 1 to 7 are node and circuit diagrams.)

table 2 reduced realizations of positive and negative davio nodes by toffoli gates
(columns: node type; standard implementation; minimal implementation; condition of minimization. the implementations are circuit diagrams; the conditions are listed below.)
node type    condition of minimization
pd no. 2     if xk is not used later
nd no. 2     if $\bar{x}_k$ is not used later
pd no. 3     if xk is not used later (first variant); if f0 is not used later (second variant)
nd no. 3     if f1 is not used later (first variant); if $\bar{x}_k$ is not used later (second variant)
pd no. 6     it is the last usage of the line f0
nd no. 6     it is the last usage of the line f1

due to the previous observations summarized in table 2, in the realization of the nodes satisfying the conditions mentioned above, the number of lines and/or the number of gates can be reduced. based on this, we propose the following algorithm for the synthesis of not necessarily reversible functions by toffoli gates.

we assume that the function to be realized is specified by an fdd with positive and negative davio nodes. in each node the number of input edges must be saved. for each line in the realization, the number of future uses is also saved. in a special hash table (the realization table), the realized nodes and the appropriate output lines of their realizations are stored.
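the bookkeeping described above can be sketched in python for a single positive davio node of type 6, f = f0 ⊕ xk·f1. this is a minimal sketch under stated assumptions: the helper names (`toffoli`, `realize_pd_node`), the dictionary `uses` holding the number of future uses of each line, and the two-gate standard cascade are hypothetical and chosen only for illustration. the reduced realization overwrites the line f0 with one gate when this is the last usage of f0; otherwise an additional line is allocated.

```python
def toffoli(lines, target, controls):
    """Apply a Toffoli gate in place: target ^= AND of all control lines."""
    if all(lines[c] for c in controls):
        lines[target] ^= 1


def realize_pd_node(lines, uses, xk, f0, f1, out_name):
    """Realize f = f0 xor xk*f1 and return (output line name, gate count).

    lines maps line names to current values, uses maps line names to the
    number of future uses (assumed bookkeeping, see the lead-in).
    """
    if uses[f0] == 1:
        # Reduced realization: last usage of f0, so it may be overwritten.
        toffoli(lines, f0, [xk, f1])
        out, gates = f0, 1
    else:
        # Assumed standard realization: keep f0 and use an additional line.
        lines[out_name] = 0
        uses[out_name] = 0
        toffoli(lines, out_name, [f0])
        toffoli(lines, out_name, [xk, f1])
        out, gates = out_name, 2
    for name in (xk, f0, f1):
        uses[name] -= 1  # each input line has been used once more
    return out, gates


if __name__ == "__main__":
    lines = {"x": 1, "a": 1, "b": 1}
    uses = {"x": 1, "a": 1, "b": 1}
    print(realize_pd_node(lines, uses, "x", "a", "b", "f"), lines)
```

the design choice mirrors the condition column of table 2: the decision between the two realizations needs only the saved number of future uses, so it can be taken while the node is being realized.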
the algorithm to realize a multi-output function (given by the fdd) by toffoli gates consists of the following steps.

step 1. (initialization of the input lines) for each input variable xi (i ∈ [1, n]) do:
step 1.1. create the input line xi.
step 1.2. set the number of future uses of the line created in the previous step to the number of nodes at the corresponding level of the fdd.
step 1.3. if the i-th level contains negative davio nodes, then add a not toffoli gate on the created line.

step 2. (implementation of nodes) traverse the fdd in the depth-first manner and for each visited node q do:
step 2.1. if q exists in the realization table then get it and return the output line, else go to the next step.
step 2.2. if q is a terminal node then return the corresponding constant line, else go to the next step.
step 2.3. determine the type of node q.
step 2.4. get the input lines for the realization of node q (the input lines are the output lines of the non-terminal child nodes and the input line corresponding to the variable in node q).
step 2.5. if there is a reduced realization for the given node type and the number of future usages of the corresponding input line is equal to 1, then create the reduced implementation of node q; else create the standard implementation of node q.
step 2.7. decrement the future usage of each input line.
step 2.8. increase the future usage of the output line by the number of input edges of node q.
step 2.9. return the output line.

in comparison with the algorithms discussed in [12], [13], the advantages of the proposed algorithm can be summarized as follows:
- the algorithm reduces the number of lines and the number of nodes at the same time,
- the reduced implementation is generated directly, while in [12], [13] the unreduced implementation is generated first and the reduction is performed in a post-processing step.

example 3.
figure 4 shows the realizations of the function f in example 1 by the use of the proposed algorithm. figure 4 (b) corresponds to the realization based on the fdd with positive davio nodes. for comparison, the bdd-based realization is shown in figure 4 (a).

fig. 4 toffoli realization of the function f in example 1 based on the bdd (a) and based on the fdd (b).

for this function, the size (number of non-terminal nodes) of the bdd is 5 and the size of the fdd is 4. the realization based on the bdd contains 11 gates and 4 additional lines, while the fdd-based realization contains 4 gates and 2 additional lines. note that the achieved reduction of the network complexity is greater than the ratio of the sizes of these two dds.

4. experimental results

the proposed method for the fdd-based reversible synthesis was applied in the design of circuits realizing several mcnc benchmark irreversible functions and several reversible functions from the revlib [19]. the results are shown in tables 4 and 5, respectively.

in table 4, the results of the synthesis of the mcnc functions are compared with the results obtained when the synthesis is based on the optimal bdd and okfdd presented in [7]. recall that the optimal bdd assumes that optimization by variable reordering and complemented edges is performed. in table 4, the number of lines is denoted as l, and the number of gates as g. the sizes of the corresponding dds are denoted by s.

notice that in table 4, there are 5 cases where the realization based on fdds requires more lines but fewer gates than realizations using other decision diagrams. there are only two out of 16 benchmarks where the method fails to deliver an optimal solution.

table 4 bdd, kfdd and fdd based realizations of mcnc functions
function  in/out   bdd: s    l     g      kfdd: s   l     g      fdd: s    l     g
9sym      9/1      25        27    62     26        29    60     27        12    26
con1      7/2      16        16    32     13        15    25     15        13    23
misex1    8/7      36        35    104    32        33    87     44        31    63
rd53      5/3      17        13    34     13        15    30     13        10    14
rd73      7/3      31        25    73     21        25    52     21        14    24
rd84      8/2      42        33    103    29        32    70     29        20    33
sqrt8     8/4      35        30    76     29        29    63     43        29    56
squar5    5/8      35        28    81     26        27    62     32        24    31
xor5      5/1      6         6     8      5         6     6      5         5     4
apex4     9/19     909       547   2551   915       552   2647   1106      574   1628
bw        5/28     104       87    307    91        81    265    116       85    172
clip      9/5      87        66    228    85        66    185    152       82    217
ex1010    10/10    1080      670   2982   1062      658   2883   1494      761   2194
inc       7/9      73        53    187    70        56    196    99        66    142
5xp1      7/10     42        30    90     36        36    89     75        42    97
sao2      10/4     86        74    211    91        77    226    147       109   226

in table 5, the results of the synthesis of the revlib functions are compared with different previous algorithms for reversible synthesis based on dds. the columns in the table contain the following results:
1. bdd: the results when the synthesis is done based on the optimal bdd presented in [5],
2. hr bdd and er bdd: the results of the methods for reducing the number of lines in the realization based on the bdd that are presented in paper [13] (the hr bdd column contains the results of a heuristic method for reducing the number of lines and the er bdd column contains the results of the exact method for line reduction),
3. pdd: the results when the synthesis is done based on the fdd with positive davio nodes that is presented in papers [7] and [8].

in table 5, it can be seen that out of 27 benchmark functions, 23 require fewer lines and fewer gates when using the proposed method than when using the bdd. in three cases some additional lines are required, but fewer gates are used. only for the circuit alu_9 is an additional line required while the number of gates stays the same. as can be seen from the same table, both algorithms for post-processing reduction presented in [12] create many additional gates.
for 4 out of 10 functions, the realization by our algorithm requires fewer lines than realizations by a reduction in [12]. in 4 cases the same number of lines is generated and only in one case (for the circuit rd84) does our realization require more lines. but, the number of gates in our realization is on average 6.3 times smaller than the number of gates generated by a heuristic algorithm, and 1.84 times smaller than the number of gates that are generated by the exact algorithm.

table 5 bdd and fdd based realizations of revlib functions (l: lines, g: gates; "-" marks an empty cell)
function            in/out   bdd: l  g     hr bdd: l  g     er bdd: l  g     pdd: l  g     fdd: l  g
3_17_6              3/3      7       17    -          -     -          -     6       6     7       10
4_49_7              4/4      15      42    -          -     -          -     6       11    13      25
4mod5_8             4/1      7       8     6          10    6          10    6       5     5       7
aj-e11_81           4/4      15      38    -          -     -          -     -       -     12      22
alu_9               5/1      7       9     -          -     -          -     -       -     8       9
decod24_10          2/4      6       11    -          -     -          -     -       -     7       7
decod24-enable_32   3/4      9       14    -          -     -          -     -       -     9       9
ex-1_82             3/3      5       7     -          -     -          -     -       -     6       5
fredkin_3           3/3      5       6     -          -     -          -     -       -     4       4
graycode6_11        6/6      11      15    -          -     -          -     -       -     6       5
ham3_28             3/3      7       14    -          -     -          -     -       -     6       8
ham7_29             7/7      21      61    -          -     -          -     16      46    17      35
hwb5_13             5/5      28      88    25         146   26         89    22      71    26      57
hwb6_14             6/6      46      159   41         485   41         172   -       -     41      105
hwb7_15             7/7      76      278   65         586   66         288   -       -     66      186
hwb8_64             8/8      114     440   -          -     -          -     71      287   101     300
miller_5            3/3      6       16    -          -     -          -     -       -     8       12
mini-alu_84         4/2      10      20    9          95    9          19    -       -     9       14
mod5d2_17           5/5      11      20    -          -     -          -     -       -     8       13
one-two-three_27    3/3      9       16    -          -     -          -     -       -     9       10
peres_4             3/3      5       7     -          -     -          -     -       -     4       3
rd32_19             3/2      6       10    -          -     -          -     -       -     4       4
rd53_68             5/3      13      34    12         50    12         35    11      27    10      14
rd73_69             7/3      25      73    -          -     -          -     12      65    14      24
rd84_70             8/4      34      104   24         424   25         105   30      86    28      41
sym6_63             6/1      14      29    11         177   11         28    14      22    10      17
sym9_71             9/1      21      62    22         362   24         61    21      52    12      26

in comparison with the algorithm based on the fdd with positive davio nodes, our algorithm has better results for both criteria for 5 of 11 functions. in 3 cases our algorithm produces a smaller number of gates and a greater number of lines, and in 3 cases (for functions 3_17_6, 4_49_7 and hwb8_64) our algorithm produces a greater number of gates and lines. the advantage of our algorithm is that we use the fdd containing both types of nodes (positive davio and negative davio).
because of that, our results are better when the optimal polarity is not [111…1]. we optimized the fdd only by finding the optimal polarity (i.e., by finding the polarity that produces the minimal number of lines in the toffoli realization). for future work, the method should be improved by reordering the variables in the fdd.

5. conclusion

in this paper, we propose a method for the reduction of the number of lines and gates in the fdd-based synthesis of switching functions by toffoli gates. the motivation for this work is in the relationship between the davio decomposition rules, used in fdds, and the output of the toffoli gate. in the general case, an fdd node can be realized by one toffoli gate.

the proposed method was experimentally confirmed. our experiments show that for functions whose fdd is smaller than the bdd (and for functions whose bdd is slightly larger than its fdd), the fdd-based realization by toffoli gates requires a smaller number of lines and gates than the bdd-based realization. it can be seen in table 4 that for most benchmark functions the fdd-based realization is more compact than the kfdd-based realization, although the kfdd should be equal to or smaller than the fdd. this is due to the fact that the realizations of nodes with shannon decomposition are more complicated than the proposed realizations of the fdd nodes.

references
[1] d. maslov, d. w. miller and g. v. dueck, “techniques for the synthesis of reversible toffoli networks”, acm transaction on design automation of electronic systems, vol. 12, no. 4, pp. 42:1-42:20, 2007
[2] j. zhong and j. c. muzio, “improved implementation of a reed-muller spectra based reversible synthesis algorithm”, in: proceedings of ieee pacific rim conference on communication, computers and signal processing, pp. 202-205, 2007
[3] p. gupta, a. agrawal and n. k.
jha, “an algorithm for synthesis of reversible logic circuits”, ieee transaction on computer-aided design of integrated circuits and systems, vol. 25, no. 11, pp. 2317-2329, 2006
[4] r. wille and r. drechsler, “bdd-based synthesis of reversible logic for large functions”, in: proceedings of design automation conference, san francisco, 2009, pp. 270-275
[5] r. wille and r. drechsler, “effect of bdd optimization on synthesis of reversible and quantum logic”, electronic notes in theoretical computer science, vol. 253, no. 6, pp. 57-70, 2010
[6] k. takahashi, t. hirayama, “reversible logic synthesis from positive davio trees of logic functions”, in: proceedings of ieee tencon conference, singapore, 2009, pp. 1-4
[7] y. pang, j. lin, s. sultana, k. radecka, “a novel method of synthesizing reversible logic”, in: proceedings of ieee int. symposium on circuits and systems, rio de janeiro, 2011, pp. 2857-2860
[8] y. pang, s. wang, z. he, j. lin, s. sultana, k. radecka, “positive davio based synthesis algorithm for reversible logic”, in: proceedings of ieee int. conference on computer design, amherst, ma, 2011, pp. 212-218
[9] m. soeken, r. wille and r. drechsler, “hierarchical synthesis of reversible circuits using positive and negative davio decomposition”, in: proceedings of the design and test workshop, abu dhabi, 2010, pp. 143-148
[10] d. shah, m. perkowski, “synthesis of quantum arrays with low quantum cost from kronecker functional lattice diagrams”, in: proceedings of ieee congress on evolutionary computation, barcelona, 2010, pp. 1-7
[11] m. perkowski, m. lukac, d. shah, m. kameyama, “synthesis of quantum circuits in linear nearest neighbor model using positive davio lattices”, facta universitatis (niš), series: electronics and energetics, vol. 24, no. 1, pp. 71-87, 2011
[12] m. stankovic and s. stojkovic, “reversible synthesis in the walsh-hadamard domain”, in: lecture notes in computer science, vol. 6928, pp. 311-318, 2012
[13] r. wille, m. soeken and r.
drechsler, “reducing the number of lines in reversible circuits”, in: proceedings of the dac 2010, anaheim, california, 2010, pp. 647-652
[14] d. m. miller, r. wille and r. drechsler, “reducing reversible circuits cost by adding lines”, in: proceedings of international symposium on multi-valued logic, barcelona, spain, 2010, pp. 217-222
[15] e. schoenborn, k. datta, r. wille, i. sengupta, h. rahaman, r. drechsler, “optimizing dd-based synthesis of reversible circuits using negative control lines”, in: proceedings of the symposium on design and diagnostics of electronic circuits and systems, warsaw, poland, 2014, pp. 129-134
[16] m. krishna, a. chattopadhyay, “efficient reversible logic synthesis via isomorphic subgraph matching”, in: proceedings of the ieee international symposium on multiple-valued logic, bremen, germany, 2014, pp. 103-108
[17] s. minato, “zero-suppressed bdds for set manipulation in combinatorial problems”, in: proceedings of the 30th international conference on design automation, 1993, pp. 272-277
[18] a. mishchenko, “an introduction to zero-suppressed binary decision diagrams”, technical report 2001. (available at http://www.eecs.berkeley.edu/~alanmi/publications/2001/tech01_zdd.pdf )
[19] an online resource for reversible functions and circuits: http://revlib.org

facta universitatis series: electronics and energetics vol. 27, no. 1, march 2014, pp. 137-151 doi: 10.2298/fuee1401137b

evaluating system security using transaction level modelling

aisha bushager 1, mark zwolinski 2
1 department of information systems, college of information technology, university of bahrain, bahrain
2 electronics and computer science, university of southampton, southampton so17 1bj, uk

abstract. the design of secure systems requires the use of security analysis techniques.
security objectives have to be considered during the early stages of system development and design; an executable model will give the designer the advantage of exploring the vulnerabilities early, and therefore enhancing the system security. in this work we create an executable model of a smart card system using systemc with the transaction level modelling (tlm) extensions. the model includes the security protocols and transactions. the model is used to compare a number of authentication mechanisms with different probabilities of failure. in addition, a number of probable attacks, including theft of a private key and denial of service, were modelled to examine the vulnerabilities. the executable model shows that security protocols and transactions can be effectively simulated in order to design improvements to withstand different types of security attacks.

key words: security modelling, systemc, transaction level modelling, protocols, smart cards.

1. introduction

robust and secure system design requires the selection and implementation of a set of policies, procedures, architectures, technology, and personnel. however, there is no system that is 100% secure; there will always be a way to breach the system. the objective in security analysis is to identify the weak points. this requires modelling and simulation tools. we have used an executable model of a smart card system as an exemplar, including the security protocols and transactions, to allow examination of the security strengths and weaknesses by executing tests on the model. this paper extends work previously presented [1].

received january 12, 2014
corresponding author: mark zwolinski, electronics and computer science, university of southampton, southampton so17 1bj, uk (e-mail: mz@ecs.soton.ac.uk)

2. related work

security protocols are sets of rules designed to ensure particular security goals.
however, designing and implementing these protocols is difficult and they may fail against various attacks. to be able to effectively integrate the security protocols at early stages of development, modelling languages and techniques are used to better visualize the entire system. one such modelling tool is communicating sequential processes (csp), which is a process algebra that is used to describe and analyse security properties and protocols by providing a mathematical framework [2]. however, to be able to use csp, the designer must have specialized knowledge and training, which limits the usage of this method. gspml [3] is a visual security protocol modelling language. again, this language introduces notations and complex models that are targeted at security specialists. stereotypes and tags are used to create and present security requirements and assumptions; constraints may be attached, but they should be satisfied by modelling elements with the related stereotype [4]. the unified modelling language (uml) version 2.0 has been widely used to model security protocols [5]. for example, umlsec [4], [6] is an extension to uml for integrating security related information into uml specifications, by specifying security requirements through stereotypes, tagged values, and constraints [7]. an adversary can be created in umlsec to model possible threats to a system. umlsec was used to find possible vulnerabilities in the common electronic purse specifications (ceps) [4]; it was also used to define security permissions that enforce restrictions on the workflows of a system [8].

none of the above modelling languages provides an automatic transition from design to code implementation. a designer would like to have an executable model that allows better testing of the designed model and therefore bridges the gap between the design phase and the code implementation phase. in our work, an executable model is produced using systemc with the tlm extensions [9].
systemc has been used to produce a methodology to simulate security attacks on smart cards with fault injection [10] and it has also been used to create an environment for design verification of smart cards using security attack simulation [11]. in tlm, communication among computational components is modelled by channels and transaction requests are handled by calling interface functions of these channel models [12].

3. using uml to model smart card transactions

as an illustration of our methodology, we use a smart card system. because smart cards are used to store sensitive data such as pins, passwords, and keys, they are likely targets for criminal attacks. the main purpose of an attack is to get hold of this data. attackers might perform various numbers and types of attack on the smart card system.

3.1. overview of a smart card system

figure 1 is a use case diagram that gives an overview of the basic components and functions of any smart card system. the use case diagram is a behavioural uml diagram that presents the system functionality. in our system, the actors illustrated in the figure represent the main components of the system, which are the user, smart card, smart card reader, client, server, and database. the use cases represent the functions or services that take place while the system is operating. the focus of the analysis in this study will be on the functions of three main components, which are the user, the smart card, and the smart card reader.

fig. 1 overview of a smart card system

the system combines three security mechanisms and a smart card ("what the user has"). the mechanisms are: pin, biometrics, and pki. the first two mechanisms are responsible for user identification and verification: a pin is "what the user knows", and the biometrics are "who the user is". pki verifies the devices in the system.
when the user decides to use the smart card, the first step is to insert the smart card in the smart card reader. the smart card reader has a number of jobs: it has to verify and authenticate the user and smart card, commit transactions, and exchange and confirm the user details with the other system components. to be able to demonstrate the transactions of the system, another type of uml diagram has to be used (figure 2). the following sections describe the registration phase and the verification phase of the smart card system and the potential threats and attacks.

3.2. smart card registration system

to be able to demonstrate the transactions and message sequence between the smart card system objects, a sequence diagram is used (e.g. figure 2), which is a behavioural diagram that shows the interactions of system processes.

the user provides the required information along with the biometric evidence. the system then saves the user details in the smart card and captures the fingerprint, which is the biometric method used in the proposed design, and produces a template that is stored in the system and the smart card. then, the registration system requests a pin from the user to be used in future verification processes. the pin is stored in the smart card for future verification. finally, the smart card system requests a private key from the certificate authority (ca) to generate a digital signature [13]. the ca, on the other hand, requests user verification from the registration system and generates a pair of keys for the user. the ca also issues a digital certificate corresponding to the public key, and sends the private key to the smart card to generate a digital signature that combines the private key and the biometric template of the user.

fig. 2 registration phase in pin, biometrics (fingerprint), and pki smart card system

3.3.
smart card system verification

figure 3 shows the transactions that take place when the user uses the smart card in a security environment that combines pin, biometrics, and pki security methods. the sender first inserts the pin; the smart card reader extracts the stored pin from the smart card and starts the comparison process. if the match is successful the smart card reader will ask for another proof, which is the sender's fingerprint; otherwise, the transaction will be aborted after allowing the sender three attempts to enter the pin. the sender scans the finger through the smart card reader scanner; the reader will extract the sender's biometric feature and produce a template. the matching process will then take place and the result will decide whether the sender has permission to access the system or not. if the match is true, the smart card releases the sender's private key. next, the sender starts to send a message to the receiver; the message is going to be digitally signed with the sender's private key, and the system will request the receiver's public key from the ca to encrypt the message. the ca will send the digital certificate and the message will be encrypted using both the sender's private key and the receiver's public key; the digital envelope is then ready to be sent securely to the receiver. finally, the receiver will send a request to the ca to get the sender's public key to decrypt the message. again, using both the sender's public key and the receiver's private key, the receiver will be able to decrypt the message successfully.

fig. 3 verification processes in pin, biometrics (fingerprint), and pki smart card system

these security methods should achieve the security goals of confidentiality, integrity, authentication, and non-repudiation. however, each mechanism has its pros and cons.
for example, fingerprints have disadvantages: how can we know that the biometric provided is not subject to misuse? if the user is clever and powerful enough to fool the system with a false fingerprint, then the system will be breached and an intruder will have access to the real user's credentials and privileges. the pki method has its disadvantages as well. if one breach takes place during the transaction, the sender and the receiver can both suffer security loss.

3.4. smart card system threats

threats are the possible means by which a security policy may be breached [14]. a threat source can be any person, thing, event, or idea that poses danger to an asset within a system in terms of confidentiality, integrity, availability, or legitimate use. moreover, threats can be deliberate or accidental [14]. if deliberate, a threat can be categorized as passive, such as network sniffing, or active, such as negligence, errors, attempts to gain unauthorized access to the system, or changing the value of a particular transaction by malicious persons. therefore, possible threats on a smart card system include unauthorized system access, hacking and system intrusion, information leakage or theft, integrity violation (errors and omissions by insiders or outsiders), distributed denial of service, illegitimate use (dishonest or disgruntled insiders or outsiders), system penetration and tampering. threat sources have different motivations that may lead to various attacks on any government or business information system; therefore, the parties involved in the smart card system must be familiar with the human threat environments and their different motivations.

3.5. possible attacks on a smart card system

attacks may occur at every single stage of a product's lifecycle, starting from the development stage, the manufacturing stage, and ending up with actual usage.
attacks that take place at the development stage and the manufacturing stage of a smart card are most likely to be carried out by an insider [15]. attacks during the smart card use stage can be physical or logical [15]. physical attacks may manipulate the semiconductor itself and usually require equipment like microscopes, focused ion beams, etc. [16]. side-channel attacks consist of observing behaviour while the information is being processed and include timing analysis and power analysis [17]. in contrast, logical attacks, or so-called software attacks, do not attack the hardware properties directly; they are more focused on the communication and flow of information between the smart card and the terminal [15]. attackers can write malicious software that can be employed in a software attack on a smart card; for example, in smart cards that support java card it is possible to load and run software. examples of logical attacks are bug exploits, illegal bytecode, and attacks during pin comparison. other types of attacks take place during the authentication phase of the smart card system, where the user identity is authenticated using different types of authentication mechanisms like biometrics [18].

3.6. modelling attacks using umlsec

uml diagrams were used to express the smart card system protocol and processes, to represent the transactions that take place while messages are exchanged during the registration and verification processes, and to identify the areas that could be vulnerable to attacks. it is also essential to test the model against possible attacks. umlsec was used to model attacks, using stereotypes such as secrecy and secure information flow along with their tags and constraints. an adversary type in umlsec can have a function called threat that allows the adversary to commit delete, read, and insert attacks. nevertheless, the model is still static and not executable.
4. animating the model using systemc tlm

systemc was developed to support the need for a language that improves the overall productivity for designers in the electronic systems field [9]. it supports the development of complex systems by the design and verification of hardware system components at a high level of abstraction. the systemc library is open source and written in c++. in addition, it contains a lightweight kernel that schedules the processes. the systemc library provides concurrent and hierarchical modules, ports, channels, processes, and clocks. large designs are always broken down hierarchically to be able to manage complexity; structural decomposition of the simulated model in systemc is specified with modules. the module is the smallest container with state, behaviour, and structure for hierarchical connectivity [9].

within a module, we use a thread process, which is associated with its own thread of execution. once the thread starts executing it is in complete control of the simulation until it chooses to return control to the simulator. hence, the thread process is used to model sequential behaviour [9]. a thread can pass control back to the simulator in two ways: by exiting (return), in which case the thread is stopped permanently, or by calling wait, which suspends it. therefore, every thread contains an infinite loop and usually has at least one wait call.

the tlm library is built on top of systemc and allows abstract communications to be modelled in a structured manner. in tlm, communication between components is modelled by channels and transaction requests, which are implemented by calling interface functions of the channel models [12]. the initiator port and the target port are distinguished in tlm. an initiator is a module that creates new transactions and passes them on by calling a method of one of the core interfaces.
the target is a module that receives the transactions from the initiator. a system component can be an initiator, a target, or an interconnect. the interconnect module accesses a transaction but does not act as an initiator or a target for that transaction; for example, routers can be interconnect modules in a system. another important element in tlm is the generic payload, which allows data abstraction.

4.1. smart card system simulation

the executable model produced in our work shows the sequence of transactions that occur in the smart card system while the smart card is used; they correspond to the transactions in figure 3. hence, in the executable model, the smart card system objects and their related transactions, the lifelines in the uml diagram, are represented as objects – modules in systemc, and the arrows are represented as tlm transactions. the modules have two types of socket, an initiator socket that is responsible for sending the transactions and a target socket that is responsible for receiving the transactions; both sockets are defined in the module structure. the sender module communicates with the smart card module and the smart card reader module. an initiator socket from the sender to the smart card is created, along with another initiator socket to the smart card reader module, to allow the sender to send transactions to both modules. the initiator is responsible for calling the transport function to send the payload to the target socket. on the other hand, a target socket is created and then registered in the constructor; the target socket receives the payload from the transfer function for processing and response.

the next step is creating the threads that correspond to the processes taking place in each module, creating the payloads that are transferred from one module to the other, creating functions, and setting events and variables.
in the smart card executable model, the authentication methods used are pin and biometrics. the user, modelled as part of the sender module, enters the pin. if the pin is correct, the user enters the fingerprint. the number of attempts allowed for the sender is programmable. the executable model counts the number of attempts, and compares the inserted pin and fingerprint with the saved pin and fingerprint template in the smart card. there is also a time limit for inserting the pin and fingerprint; if it is exceeded, a timeout message appears. if the number of incorrect attempts exceeds the limit, the system blocks the smart card and saves the smart card id in the banned smart card list. errors in entering the correct pin vary: wrong digits, taking too long to insert the correct pin, or an attacker trying to insert the pin randomly. the same steps take place when entering a fingerprint. successful pin and fingerprint entry confirms that the sender is a legitimate user. therefore, when the sender passes the authentication step, the smart card releases the private key. then the transactions related to signing the message with the private and public keys take place, and finally the system sends the digitally signed message to the receiver. in reality, the user enters the pin and scans the fingerprint through an input device like a keypad, biometric scanner, or touch pad. however, our executable model can randomise the pin and fingerprint entries, and also randomise the entry time in seconds, so that it may fall within or beyond the allowed limit. a simple pseudo-random number generator is used for both. it is fast and gives the designer control over the randomisation, such as adjusting the ratios, changing the range of sample smart cards to be tested, and modifying the probabilities of failure.
an arbitrary ratio of successful pin and fingerprint is used; it can be modified to allow flexibility in testing different probabilities of failure. the executable module has the smart card system objects and their related transactions. the lifelines in the uml diagram are represented as objects, modules in systemc, and the arrows are represented as transactions using tlm. the transitions in the output correspond to the transaction number in the uml diagram in figure 3. obviously, the designer can observe the attempts to enter the right pin and biometric along with the required timing. this allows the testing of the effectiveness of the authentication methods used. by running the simulation on different numbers of smart cards with different probabilities of failure it is possible to evaluate the effectiveness of each authentication method.

4.2. testing the authentication methods

validation of the authentication methods in the smart card system is based on two proposed models. the first model uses a pin followed by a biometric authentication method, while the second model reverses the sequence. the main reason behind carrying out these correctness tests is to check that the simulation using the executable model is actually working. the purpose of these tests is to verify:

• the functionality/workability of the smart card simulation tool and the availability of test results;
• the reliability of the smart card simulation tool through simulation;
• the degree of flexibility in assigning thresholds and failure probabilities, which will assist in customising the simulation tool based on the industry and sector in which the smart card system will be used;
• the speed of testing, which allows users of the simulation tool to obtain results and manipulate thresholds with ease and flexibility.

the following tests have been performed:

1. pin followed by biometrics.
2. biometrics followed by pin.
for each of these tests, an arbitrary probability of failure has been assigned to each of the authentication methods. for example, the probability of failure for the pin is set at 15%, for the biometrics (fingerprint) it is set at 10%, and the time allowed for entering the correct pin and correct fingerprint is set at 10 seconds for each. the reason for assuming that the pin has a slightly higher probability of failure is that the pin authentication method is weaker than the biometrics and thus there is a higher probability of successful attacks and user errors and mistakes. the first test (pin followed by biometrics) used 100 to 3,000 smart cards. table 1 displays the results for the authentication method based on the scenarios of potential failure/error.

table 1 results from testing the pin followed by biometrics authentication method

remarks                               number of simulated smart cards
                                      100    500    1000   1500   2000   2500   3000
good pin decoded                      100    500    998    1490   1976   2464   2950
pin incorrect/re-enter correct pin    16     102    207    302    394    493    587
timeout error (pin)                   9      58     125    189    257    299    376
good bio decoded                      100    500    998    1490   1976   2464   2950
bio incorrect/re-enter correct bio    13     38     82     126    167    200    234
timeout error (bio)                   11     58     124    171    236    299    359

the results may be interpreted according to the industry and sector of use, which dictate the levels of acceptable thresholds and probabilities of failure. initially, when examining the relationship between the expected and observed results of failure attempts across all sample sizes we are able to confirm that it is a linear relationship and that observed failure attempts are always below the expected range. in a sample of 3,000 cards, pin failure attempts (587 incorrect entries plus 376 timeouts) total 963, over 30% of the sample size. this failure percentage alerts us to the vulnerability of the system. this entails a low level of acceptance of usage from both parties due to the increased risks represented by the use of this method.
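the kind of randomised test described above can be sketched in a few lines of python. this is only an illustrative stand-in for the systemc tlm model, not the authors' code: the function name `simulate_cards`, the limit of three attempts per method, and the lumping of wrong entries and timeouts into a single per-attempt failure probability are all assumptions; only the 15% and 10% failure probabilities come from the text.

```python
import random


def simulate_cards(n_cards, p_fail_pin=0.15, p_fail_bio=0.10,
                   max_attempts=3, seed=0):
    """Monte Carlo sketch of 'pin followed by biometrics' authentication.

    Assumptions (not from the paper): each attempt fails independently
    with the given probability, wrong entries and timeouts are lumped
    together, and a card is banned after max_attempts failures in a row.
    """
    rng = random.Random(seed)
    stats = {"authenticated": 0, "pin_failures": 0, "bio_failures": 0,
             "banned_pin": 0, "banned_bio": 0}

    def run_step(p_fail):
        """Return (passed, failed_attempts) for one authentication method."""
        failures = 0
        for _ in range(max_attempts):
            if rng.random() >= p_fail:
                return True, failures
            failures += 1
        return False, failures

    for _ in range(n_cards):
        ok, fails = run_step(p_fail_pin)
        stats["pin_failures"] += fails
        if not ok:
            stats["banned_pin"] += 1
            continue
        ok, fails = run_step(p_fail_bio)
        stats["bio_failures"] += fails
        if not ok:
            stats["banned_bio"] += 1
            continue
        stats["authenticated"] += 1
    return stats


if __name__ == "__main__":
    for n in (100, 500, 1000, 3000):
        print(n, simulate_cards(n))
```

swapping the two `run_step` calls gives the reversed order (biometrics followed by pin), and changing the probabilities or `max_attempts` mimics the threshold adjustments discussed in the text.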
having such a high degree of risk and vulnerability in the system will expose it to numerous additional threats from different sources. the results of the expected and observed pin and biometric failure attempts are listed in table 2 and recorded as percentage of the total sample size.

table 2 percentages of expected and observed pin followed by biometrics failure attempts

number of smart cards         100    500    1000   1500   2000   2500   3000
percentage observed (pin)     8      11     11     11     11     11     11
percentage expected (pin)     15     15     15     15     15     15     15
percentage observed (bio)     8      6      7      7      7      7      7
percentage expected (bio)     10     10     10     10     10     10     10

when comparing the observed pin failure attempts to the biometrics failure attempts, it is noted that the percentages are 11% and 7%, respectively. although the difference is relatively small, it indicates that the pin authentication method requires additional monitoring, particularly in avoiding risks of external threats that pose potential harm against the users and system confidentiality and privacy. furthermore, under the simulation of 1,000 smart cards, it is noted that two cards have been banned for reaching the maximum attempts of pin entry. however, as the sample size increases, the number of banned smart cards grows significantly as illustrated in figure 4.

fig. 4 smart cards banned in pin and biometrics proposed model

for example, when simulating 3,000 smart cards, about 50 of them were banned during the pin authentication step. on the other hand, for the biometrics authentication method, it is noted that no smart cards have been banned when using this method. this is a clear indication of the level of security that the use of biometric authentication provides when adopted by smart cards, particularly ones that store and have access to sensitive data. in the second test, the initial expectation is that the use of a biometrics authentication first will decrease the possibility of failure attempts and attacks.
this mechanism supports the security concept of using something you own (smart card), something you are (biometrics), and something you know (pin).

[fig. 4 plot: failure attempts versus number of smart cards (100 to 3000); series: max pin attempts/card banned, max fingerprint/card banned]

when using the biometrics authentication method before the pin, the number of banned smart cards is recorded at 7 and 2, respectively, for a sample size of 3,000 smart cards. this is low compared to when the pin is used prior to the biometrics, where the number of banned smart cards was 50 (pin) and 0 (biometrics) for a sample size of 3,000. given the benefits to the user and administrator, as well as the practicality of using the biometrics and pin authentication methods across most industries, it is recommended to adopt this method in the given order as it provides better security levels. in summary, the executable model developed using systemc tlm allowed the designer to test the proposed models that support a combination of authentication methods; by running simulations on different numbers of smart cards with different authentication methods and recording the results, the designer can examine the robustness of the proposed models in terms of enhancing security, specifically during the phase of authenticating the smart card system users. the simulation tool provided a quick, automated, and flexible environment to test the proposed models, in addition to allowing the designer to observe and modify the transactions whenever changes are required. testing the proposed model against physical and logical attacks while the smart card is in use has resulted in giving the attacker the chance to get hold of the user's private key, thereby violating a number of security properties like authentication, confidentiality, privacy, and integrity.
this in essence shows that the system is vulnerable to threats and successful attacks taking place. yet, to be able to reduce the probability of successful attacks, our approach allows the designer to modify the executable model to test against future attacks.

4.3. simulating attacks on smart card system

there are different types of attacks that have different probabilities of occurrence and different consequences for the smart card system and its users. each attack targets different areas of the system and has a specific goal; some attacks violate the smart card system authentication, privacy, and confidentiality, like attacks on pin or attacks on biometrics. other attacks violate the smart card system integrity, reliability, and even authentication, like invasive attacks, side channel attacks, etc. figure 5 is a uml sequence diagram that demonstrates the types of attacks that may occur in any smart card system, even though safeguards and controls like pin, biometrics, and pki are in place. the purple callouts represent the types of possible attacks that an attacker can carry out in that area precisely; in addition, the red callouts represent the attacks that are created in the executable model to test the system robustness. the executable model allows us to simulate an attack on the system. an attack on any part of the system is essentially another transaction inserted into the model. for example, to simulate an attack that allows the attacker to steal the private key released from the smart card object, which is coded as a state machine, an attacker is implemented as a class that can intrude into multiple modules in a thread-safe manner. thus, a transaction is effectively inserted into the model with one line of code at the appropriate point in the smart card module.

fig.
5 possible attacks on pin, biometrics (fingerprint), and pki smart card system

now, the model waits for transitions 1 to 8 to occur, and then the attacker interferes and attacks the system after transition 8 where the private key is released, figure 6.

    smartcard_reader_object: begin transition 8
    smartcard_reader_object: end transition 8
    sender_object: end transition 5
    attacker initialized, @104 s
    attacker stole the private key, @104 s
    smartcard_object: begin transition 9
    smartcard_object: end transition 9

fig. 6 simulated private key theft

in this example, the attacker has to conduct a physical or logical attack to be able to get hold of the private key. for example, the attacker can carry out a successful side channel attack, invasive attack, attack during pin comparison, or attack on biometrics. the executable model in this study does not simulate the physical or logical attack; it only assumes that a physical or logical attack has taken place. for that reason, it simulates an attack and creates an attacker class with features that allow the attacker to modify the transitions and as a result gain access to the user's secret information, specifically the private key.

another example of utilising the executable module in attack simulation is modelling another sort of attack, which is carried out on the key exchange operation. this time the attacker monitors the public keys exchanged between the users and the ca, and gets hold of the users' public keys. being able to interfere with the key exchange protocol opens a door for the attacker to carry out attacks that result in network disruption and loss of user trust, for example a man-in-the-middle attack [19] or a multi-protocol attack [20].
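the mechanism used for these attacks, an extra transaction inserted at a chosen point in the sequence, can be illustrated outside systemc. the following python sketch is hypothetical: `run_model`, `Attacker`, and the hook mechanism are invented for illustration and do not reproduce the tlm sockets or the authors' attacker class; the transition numbers and log wording only loosely mirror fig. 6.

```python
class Attacker:
    """Hypothetical attacker: records whatever secret it is handed."""

    def __init__(self):
        self.stolen = []

    def intercept(self, name, value, log):
        self.stolen.append((name, value))
        log.append(f"attacker stole the {name}")


def run_model(transitions, hooks=None):
    """Run (owner, number, payload) transitions in order and return a log.

    hooks maps a transition number to a callable invoked right after that
    transition ends; this is where an attack transaction is inserted.
    """
    hooks = hooks or {}
    log = []
    for owner, number, payload in transitions:
        log.append(f"{owner}: begin transition {number}")
        log.append(f"{owner}: end transition {number}")
        if number in hooks:
            hooks[number](payload, log)
    return log


if __name__ == "__main__":
    attacker = Attacker()
    # Assumed payloads: transition 8 is taken to release the private key.
    transitions = [
        ("smartcard_reader_object", 8, "PRIVATE-KEY"),
        ("smartcard_object", 9, "signed message"),
    ]
    # The attack is a single extra entry: a hook after transition 8.
    hooks = {8: lambda key, log: attacker.intercept("private key", key, log)}
    for line in run_model(transitions, hooks):
        print(line)
```

keeping the attack in a separate `hooks` table means the legitimate transaction list stays untouched, so the same model can be run with and without the attacker.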
this example focuses on modelling an attack that allows the attacker to interfere with the transactions exchanged between the user and the receiver and get hold of the data exchanged without either user knowing. by modelling the attack, it is possible to point out a gap in the protocol that allows an attacker to monitor the flow of data, interfere within the transactions, and get hold of the public keys exchanged, figure 7.

    smartcard_object: begin transition 13
    certificate_authority_object: begin transition 14
    certificate_authority_object: end transition 14
    attacker stole the receiver public key, @203 s
    smartcard_object: end transition 13
    smartcard_object: begin transition 15
    smartcard_object: end transition 15
    smartcard_object: begin transition 16
    smartcard_object: end transition 16
    receiver_object: begin transition 17
    certificate_authority_object: begin transition 18
    certificate_authority_object: end transition 18
    attacker stole the sender public key, @206 s
    receiver_object: end transition 17

fig. 7 simulated public key theft

a denial of service (dos) attack is simulated using the same model. the attack aims at violating the availability property of the system security. the dos attack will take place against the certificate authority server; the attacker attempts to exhaust the server, which will result in the server being unable to provide the services for legitimate users. the following is part of the dos attack simulation output:

as the output shows, the transactions of the smart card system are running normally; however, when the dos attack successfully takes place, the service is denied and the attacker gets hold of the users' public keys exchanged among the system objects. in addition, the subsequent transactions failed to occur because the certificate authority server is unavailable.
this attack shows that the availability property has been violated and the system users will not be able to use their smart cards until the certificate authority server recovers from the attack. individual dos requests are hard to distinguish from legitimate sign-in requests; the only difference lies in the frequency of sign-in attempts and their origin. a large number of sign-in attempts in rapid succession can be indicative of a dos attack. hence, smart card systems can be protected from dos attacks by identifying a high frequency of login attempts from a source and denying service to the source of such an attack. another effective way is to limit the number of login attempts a user is allowed at a time.

in summary, the executable model developed using systemc tlm allowed the designer to test the proposed models that support a combination of authentication methods; by running simulations on different numbers of smart cards with different authentication methods and recording the results, the designer can examine the robustness of the proposed models in terms of enhancing security, specifically during the phase of authenticating the smart card system users. the simulation tool provided a quick, automated, and flexible environment to test the proposed models, in addition to allowing the designer to observe and modify the transactions whenever changes are required. in addition, the systemc tlm executable model also allowed the designer to discover the weak points of the system and point out vulnerabilities; the successful attacks indicate that there are weaknesses in the security protocol. to be able to reduce the probability of successful attacks, the designer can modify the executable model to test against future attacks. in contrast with the uml diagram, the animation makes it possible to see the attack actually happening.
moreover, it is possible to make changes easily within the model and to try a number of attacks to test the system's robustness by simply inserting transactions into the uml diagram, and transforming them into transactions within the systemc tlm executable model.

5. conclusion

uml diagrams are an excellent way of modelling systems, along with their extensions; they have features that show the designer how things should work. however, uml does not allow the designer to see what happens if something goes wrong with the system. therefore, to be able to see things happening and reason about the system, simulation has to take place. systemc tlm was used to transform a static uml model into an executable model. the executable model provided the opportunity to see the transaction flow within the system objects in an animated manner. in addition, it allowed the simulation of attacks in different parts of the system. the model gives a clear view of the weaknesses in the security requirements, methods, and protocols used in the smart card system.

references

[1] a. bushager and m. zwolinski, "modelling smart card security protocols in systemc tlm", in: embedded and ubiquitous computing (euc), 2010 ieee/ifip 8th international conference on. 2010, pp. 637–643.
[2] s. schneider, "security properties and csp", in: proceedings ieee symposium on security and privacy, 1996, pp. 174–187.
[3] j. mcdermott, "visual security protocol modeling", in: proceedings of the 2005 workshop on new security paradigms, nspw '05. new york, ny, usa: acm. isbn 1-59593-317-4; 2005, pp. 97–109.
[4] j. jürjens, "umlsec: extending uml for secure systems development", in: uml 2002 – the unified modeling language. 2002, pp. 412–425.
[5] object management group, introduction to omg's unified modeling language tm (uml ®) 2005; url http://www.omg.org/gettingstarted/what is uml.htm.
[6] j.
jürjens, "modelling audit security for smart-card payment schemes with umlsec", in: proceedings of sec 2001 – 16th international conference on information security, 2001, pp. 93–108.
[7] j. jürjens, "using umlsec and goal-trees for secure systems development", in: proceedings of the 2002 acm symposium on applied computing. 2002, pp. 1026–1031.
[8] j. jürjens, j. schreck, and y. yu, "automated analysis of permission-based security using umlsec", in: fundamental approaches to software engineering, 11th international conference, fase 2008, budapest, hungary, march 29-april 6, 2008. proceedings. 2008, pp. 292–295.
[9] ieee standard systemc language reference manual. ieee std 1666-2005.
[10] k. rothbart, u. neffe, c. steger, r. weiss, e. rieger and a. muehlberger, "high level fault injection for attack simulation in smart cards", in: proceedings of asian test symposium 2004, pp. 118–121.
[11] k. rothbart, u. neffe, c. steger, r. weiss, e. rieger and a. muehlberger, "extended abstract: an environment for design verification of smart card systems using attack simulation in systemc", in: acm/ieee international conference on formal methods and models for co-design, 2005, pp. 253–254.
[12] l. cai and d. gajski, "transaction level modeling: an overview", in: proceedings of the 1st ieee/acm/ifip international conference on hardware/software codesign and system synthesis. codes+isss '03; new york, ny, usa: acm. isbn 1-58113-742-7; 2003, pp. 19–24.
[13] c. williams, "configuring enterprise public key infrastructures to permit integrated deployment of signature, encryption and access control systems", in: military communications conference, 2005. milcom 2005. ieee. 2005, pp. 2172–2175 vol. 4.
[14] r.j. anderson, security engineering: a guide to building dependable distributed systems. wiley publishing; 2 ed.; 2008. isbn 9780470068526.
[15] w.
rankl, "overview about attacks on smart cards", information security technical report 2003, vol. 8, pp. 67–84.
[16] k. markantonakis, m. tunstall, g. hancke, i. askoxylakis, and k. mayes, "attacking smart card systems: theory and practice", information security technical report 2009, vol. 14, pp. 46–56.
[17] k. baddam and m. zwolinski, "evaluation of dynamic voltage and frequency scaling as a differential power analysis countermeasure", in: vlsid '07: proceedings of the 20th international conference on vlsi design. washington, dc, usa: ieee computer society. isbn 0-7695-2762-0; 2007, pp. 854–862.
[18] x. leng, "smart card applications and security", information security technical report 2009, vol. 14, pp. 36–45.
[19] c. y. yang, c.c. lee and s.y. hsiao, "man-in-the-middle attack on the authentication of the user from the remote autonomous object". international journal of network security, 2005, pp. 81–83.
[20] a. m. johnston and p.s. gemmell, "authenticated key exchange provably secure against the man-in-the-middle attack". journal of cryptology, 2002, pp. 139–148.

instruction facta universitatis series: electronics and energetics vol. 27, no. 1, march 2014, pp. 113-127 doi: 10.2298/fuee1401113a

bandgap engineering of carbon allotropes

vijay k. arora
faculty of electrical engineering, universiti teknologi malaysia, utm skudai 81310
department of electrical engineering and physics, wilkes university, wilkes-barre, pa 18766, u. s. a.

abstract. starting from the graphene layer, the bandgap engineering of carbon nanotubes (cnts) and graphene nanoribbons (gnrs) is described by applying an appropriate boundary condition. the linear e-k relationship of graphene transforms to a parabolic one as the momentum vector in the tube direction is reduced to dimensions smaller than the inverse of the tube diameter of a cnt. a similar transition is noticeable for narrow width of a gnr. in this regime, effective mass and bandgap expressions are obtained.
a cnt or gnr displays a distinctly 1d character suitable for applications in quantum transport.

key words: bandgap engineering, graphene, carbon nanotube, graphene nanoribbon, neadf, carrier statistics

1. introduction

carbon allotropes have their basis in graphene, a single layer of graphite with carbon atoms arranged in a honeycomb lattice. graphene has many extraordinary electrical, mechanical, and thermal properties, such as high carrier mobility, ambipolar electrical field effect, tunable band gap, room temperature quantum hall effect, high elasticity, and superior thermal conductivity. it is projected to be a material of scientific legend, comparable only to penicillin as a panacea. there is a modern adage: silicon comes from geology and carbon comes from biology. cohesive band structure of graphene rolled into a cnt in a variety of chiral directions has recently been reported [1]. graphite, a stack of graphene layers, is found in pencils. as shown in fig. 1, formation of these allotropes originates from a graphene layer through various cutouts. a carbon nanotube (cnt) is a rolled-up sheet of graphene (also see fig. 2). a fullerene molecule is a "buckyball," a nanometer-size sphere. a graphene nanoribbon (gnr) is a cutout from a graphene sheet with a narrow width of high aspect ratio.

received january 10, 2014
corresponding author: vijay k. arora, wilkes university (e-mail: vijay.arora@wilkes.edu)

fig. 1 carbon allotropes arising from graphene sheet to form zero-dimensional (0d) buckyball, one-dimensional (1d) cnt, and three dimensional graphite (3d). each layer of two-dimensional (2d) graphite can be converted to a 1d gnr by making width smaller. copyright macmillan publishers limited [2], a. k. geim and k. s. novoselov, "the rise of graphene," nature materials, vol. 6, pp. 183-191, mar 2007

a carbon ($^{12}_{6}\mathrm{C}$) atom with 6 electrons has electronic configuration $1s^2\,2s^2\,2p^2$.
it is a tetravalent material with four of its electrons in shell 2 and still able to accommodate 4 more in 2p orbitals. however, carbon orbitals can hybridize because the s-orbital and p-orbitals of carbon's second electronic shell have very similar energies [3]. as a result, carbon can adapt to form chemical bonds with different geometries. three $sp^2$ orbitals form σ-bonds residing in the graphene plane of fig. 2. these are pretty strong bonds that demonstrate superior electronic properties. the fourth electron, with its π-bond, is a delocalized conduction electron. each atom contributes 1/3 of a π-electron to a hexagon. with 6 atoms forming corners of a hexagon, each hexagon contributes 2 π-electrons. the areal electronic density of π-electrons is $n_g = 2/A_h$, where $A_h = 3\sqrt{3}\,a_{cc}^2/2$ is the area of the hexagon. the areal density is $n_g = 3.82 \times 10^{19}\ \mathrm{m^{-2}}$ with c-c bond length $a_{cc} = 0.142$ nm. the intrinsic line density in a cnt is expected to be a function of diameter, $n_{cnt} = 1.2 \times 10^{11}\, d_t(\mathrm{nm})\ \mathrm{m^{-1}}$, when rolled into a cnt. similarly, the line density of a gnr is expected to be $n_{gnr} = 3.82 \times 10^{10}\, W(\mathrm{nm})\ \mathrm{m^{-1}}$.

fig. 2 the rolled up graphene sheet into a carbon nanotube

in recent years, there has been a transformation in the way quantum and ballistic transport is described [3]. equilibrium carrier statistics with large number of stochastic carriers is the basis of any transport. nonequilibrium arora's distribution function (neadf) [4] seamlessly transforms the stochastic carrier motion in equilibrium with no external influence into a streamlined one in a high-field-initiated extreme nonequilibrium for current to flow and get saturated. a new paradigm for characterization and performance evaluation of carbon allotropes is emerging from the application of neadf to graphene and its allotropes.

2.
bandgap engineering

the electronic quantum transport in a carbon nanotube (cnt) is sensitive to the precise arrangement of carbon atoms. there are two families of cnts: single-wall (swcnt) and multiple-wall (mwcnt). the diameter of a swcnt spans a range of 0.5 to 5 nm. the lengths can exceed several micrometers and can be as large as a cm. a mwcnt is a cluster of multiply nested or concentric swcnts. the focus here is on swcnt. depending on the chirality, rollup of fig. 2 can lead to either a semiconducting or metallic state. when the arrangement of carbon atoms is changed by mechanical stretching, a cnt is expected to change from semiconducting to metallic or vice versa. several unique properties result from the cylindrical shape and the carbon-carbon bonding geometry of a cnt. wong and akinwande [3] are vivacious in connecting physics and technology of a graphene nanolayer to that of a cnt with splendid outcomes. cnt band structure arising out of 6-fold dirac k-points with equivalency of k and k' points can lead to complex mathematics. however, once nearest neighbor tight binding (nntb) formalism is applied, the resulting dirac cone, as revealed in fig. 2, gives useful information for a variety of chirality directions. in fact, the k-points offer much simplicity for quantum transport applications. in the metallic state the e-k relation is linear. however, the fermi energy and associated velocity are different in the semiconducting state. intrinsic fermi energy $E_{Fo} = 0$ is applicable for undoped or uninduced carrier concentration. induction of carriers will move the fermi level into the conduction (n-type) or valence band (p-type).

the linear e-k relation can be written in terms of $k_t$, the momentum vector in the longitudinal direction of the tube, and $k_c$, the momentum vector in the direction of rollup.
the new description becomes

$$E = E_{Fo} \pm \hbar v_F |k| = E_{Fo} \pm \hbar v_F \sqrt{k_t^2 + k_c^2} \tag{1}$$

the k-point degeneracy is crucial to the profound understanding of the symmetries of graphene folding into a cnt in a given chiral direction. symmetry arguments indicate two distinct sets of k points satisfying the relationship $K' = -K$, confirming the opposite phase with the same energy. there are three k and three k' points, each k (or k') rotated from the other by $2\pi/3$. the zone degeneracy $g_K = 2$ is based on two distinct sets of k and k' points in addition to spin degeneracy $g_s = 2$. there are six equivalent k points, and each k point is shared by three hexagons; hence $g_K = 2$ is for graphene as well as for rolled-up cnt. the phase $k_c C_h$ of the propagating wave $e^{i k_c C_h}$ in the chiral direction results in rolled up cnt to satisfy the boundary condition

$$k_c C_h = \nu\,(2\pi/3) \tag{2}$$

where $\nu = (n - m) \bmod 3 = 0, 1, 2$ is the band index. $(n - m) \bmod 3$ is an abbreviated form of $(n - m)$ modulo 3, that is, the remainder of the euclidean division of $(n - m)$ by 3. the quantization condition transforms to $k_c = \nu\,(2/3d_t)$ when $C_h = \pi d_t$ is used for a cnt's circumference. using $k_c = \nu\,(2/3d_t)$ in eq. (1) yields the band structure as given by

$$E = E_{Fo} \pm \hbar v_F |k| = E_{Fo} \pm \hbar v_F \sqrt{k_t^2 + \left(\nu\,\frac{2}{3 d_t}\right)^2} \tag{3}$$

the equation is re-written to introduce the bandgap near $k_t = 0$

$$E = E_{Fo} \pm \hbar v_F\,\nu\,\frac{2}{3 d_t}\sqrt{1 + \left(\frac{3 d_t k_t}{2\nu}\right)^2} = E_{Fo} \pm \frac{E_g}{2}\sqrt{1 + \left(\frac{3 d_t k_t}{2\nu}\right)^2} \tag{4}$$

with the binomial expansion $(1 + x)^{1/2} \approx 1 + \tfrac{1}{2}x$ near the lowest point ($k_t \approx 0$) of the subband and keeping first order term transforms (3) to

$$E \approx E_{Fo} \pm \left(\frac{E_g}{2} + \frac{\hbar^2 k_t^2}{2 m_t^*}\right) \tag{5}$$

with

$$E_g = \nu\,\frac{4 \hbar v_F}{3 d_t} = \nu\,\frac{0.88\ \mathrm{eV \cdot nm}}{d_t} \quad \text{and} \quad \frac{m_t^*}{m_o} = \nu\,\frac{2\hbar}{3 d_t v_F m_o} = \nu\,\frac{0.077\ \mathrm{nm}}{d_t} \tag{6}$$

here $v_F = 3 a_{cc}\gamma / 2\hbar \approx 10^6$ m/s with $\gamma = 3.1$ eV the c-c bond strength.

fig.
3 e vs kt graph with chirality (10,4) for v = 0, (13,0) for v = 1, and (10,5) for v = 2, all with diameter dt  1.0 nm. solid line is used for exact formulation (eq. 3) and dash-dot line showing parabolic approximation (eq. 4) fig. 3 displays the ev  kt relationship for v = 0 metallic and v = 1(2) semiconducting (sc1(2)) states. v is the band index confined to these three values only as v = 3 is equivalent to v = 0 and pattern repeats itself in these three modes. the diameter is dt  1.0 nm for the chosen chirality directions: (10,4) for v = 0, (13,0) for v = 1, and (10,5) for v = 2. assuming that each of these configurations are equally likely, about 1/3 rd of the cnts are metallic and 2/3 semiconducting with bandgap that varies with chirality. as seen in fig. 3, the curvature near kt = 0 is parabolic making it possible to define the effective mass that also depends on chirality. fig. 4 shows the bandgap of cnts with different chiral configurations, covering metallic (v = 0) and two semiconducting (v = 1,2) states. as expected, the bandgap is zero for metallic state, in agreement with eq. (4). sc2 bandgap is twice as large as that of sc1. fig. 4 also shows chirality leading to v = 0, 1 or 2. the nntb bandgap is likewise shown. it also exhibits the wide band gap nature of sc2. fig. 4 calculated band gap as a function of cnt diameter showing an agreement with nntb calculation. chiral vectors are indicated against corresponding points gnr, as shown in fig. 5, are strips of graphene with ultra-thin width (<50 nm). the electronic states of gnrs largely depend on the edge structures. the precise values of the bandgaps are sensitive to the passivation of the carbon atoms at the edges of the nanoribbons. just like for a cnt, bandgap dependence on inverse width is preserved. 
fig. 5 armchair and zigzag graphene nanoribbons (gnr), where the edges look like an armchair and a zigzag, respectively

in momentum k-space, there are bonding and anti-bonding wavefunctions. in the absence of a magnetic field, forward ($+k$) and backward ($-k$) moving states have identical eigenenergies, as is well known both for parabolic semiconductors and for graphene with $E = \hbar v_{Fo}|k|$. this degeneracy occurs both at $K$ and $K'$ with $K' = -K$. the total phase change as one starts from one of the three k points, goes to the other two, and returns to the same one is $2\pi$. the angular spacing between k and k' in k-space is $(2\nu + 1)\pi/6$ where $\nu = 0, 1, 2$. $\nu = 3$ is equivalent to $\nu = 0$, repeating the pattern. this gives $k_W = (2\nu + 1)\pi/6W$ with $\nu = 0$, 1 and 2 for equivalent k or k' points, where $k_W$ is the momentum vector along the width and $k_L$ is along the length of the nanoribbon. the small width of gnrs can lead to quantum confinement of carriers that can be modeled analogously to the standing waves in a pipe open at both ends. the dispersion for a gnr is then given by

$$E = E_{Fo} \pm \hbar v_F |k| = E_{Fo} \pm \hbar v_F \sqrt{k_L^2 + \left[\frac{(2\nu + 1)\pi}{6W}\right]^2} \tag{7}$$

this leads to the bandgap equation

$$E = E_{Fo} \pm \frac{E_g}{2}\sqrt{1 + \left[\frac{6 W k_L}{(2\nu + 1)\pi}\right]^2} \tag{8}$$

in the parabolic approximation, as for a cnt, the bandgap and effective mass are given by

$$E_g = (2\nu + 1)\,\frac{\pi \hbar v_F}{3W} = (2\nu + 1)\,\frac{0.69\ \mathrm{eV\,nm}}{W} \tag{9}$$

and

$$\frac{m_L^*}{m_o} = (2\nu + 1)\,\frac{\pi\hbar}{6 W v_F m_o} = (2\nu + 1)\,\frac{0.06\ \mathrm{nm}}{W} \tag{10}$$

fig. 6 gives the bandgap as a function of width, with experimental values indicated that are spread between the $\nu = 1$ and $\nu = 2$ configurations. the gnr bandgap and associated transport framework are sketchy as compared to that of a cnt.

3.
carrier statistics

the theoretical development of electronic transport in a graphene nanostructure is complicated by the linear e-k relation with zero effective mass of a dirac fermion, as reviewed in a number of notable works [5-7]. ideal graphene is a monoatomic layer of carbon atoms arranged in a honeycomb lattice. the monoatomic layer makes graphene a perfect two-dimensional (2d) material. as a 2d nanolayer, a graphene sheet has some semblance to a metal-oxide-semiconductor field-effect transistor (mosfet). wu et al. [5] give an excellent comparison of the linear e-k (energy versus momentum) relation in a graphene nanolayer to the quadratic one in a nano-mosfet. the six-valley parabolic band structure in a mosfet, even though anisotropic, has a finite effective mass [8-10]. as graphene is a relatively new material with a variety of allotropes, the landscape of electronic structure and applications over the whole range of electric and magnetic fields is in its infancy [11].

fig. 6 the gnr bandgap as a function of width $W$. markers are experimental data (m. han et al., phys. rev. lett. 98, 206805 (2007))

the dirac cone described in eq. (1) shows a rise in energy with the magnitude of the momentum vector:

$$E = E_{Fo} \pm \hbar v_F |k| = E_{Fo} \pm \hbar v_F \sqrt{k_x^2 + k_y^2} \tag{11}$$

where $k$ is the momentum vector, which in circular coordinates has components $k_x = k\cos\theta$ and $k_y = k\sin\theta$, and $v_F = (1/\hbar)\,dE/dk$ is constant due to the linear rise of energy $E$ with momentum vector $k$. $\hbar v_F$ is the gradient of the e-k dispersion. the linear dispersion of the dirac cone is confirmed up to 0.6 ev [3, 12]. $v_F \approx 10^6$ m/s is the accepted value of the fermi velocity near the fermi energy that lies at the cone apex $E_F - E_{Fo} = 0$ for intrinsic graphene, with $E_{Fo} = 0$ as the reference level. the fermi velocity vectors are randomly oriented in the graphene sheet.

the deviation $E_F - E_{Fo}$ of the fermi energy from the dirac point defines the degeneracy of the fermi energy, which itself depends on the 2d carrier concentration $n_g$ as given by [4]

$$n_g = N_g\,\Im_1(\eta) \tag{12}$$

with

$$N_g = (2/\pi)\,(k_B T/\hbar v_F)^2 \tag{13}$$

$$\eta = (E_F - E_{Fo})/k_B T \tag{14}$$

$\Im_j(\eta)$ is the fermi-dirac integral (fdi) of order $j$ [13, 14], with $j = 1$ for graphene. the linear carrier density of a cnt is similarly described by

$$n_{CNT} = N_{CNT}\,\Im_{CNT}(\eta, e_g) \tag{15}$$

with $e_g = E_g/k_B T$ and $N_{CNT} = D_o k_B T$ the effective density of states, where $D_o = 4/\pi\hbar v_{Fo} = 1.93\ \mathrm{eV^{-1}\,nm^{-1}}$. $\Im_{CNT}(\eta, e_g)$ is the cnt integral that can be evaluated numerically [1]. equilibrium carrier statistics for a gnr are similarly obtainable.

in equilibrium, the velocity vectors are randomly oriented in the tubular direction, with half oriented in the positive x-direction and half in the negative x-direction for a tubular direction along the x-axis. this makes the vector sum of velocity vectors equal to zero, as expected. however, the average magnitude of the carrier motion is not zero at a finite temperature. the group average velocity of a carrier in essence informs the speed of a propagating signal. it is also a useful parameter giving information as velocity vectors are re-aligned in the direction of an electric field [15], as it sets the limit at the saturation velocity, which is the ultimate attainable velocity in any conductor. in ballistic transport, when electrons are injected from the contacts, the fermi velocity of the contacts plays a predominant role [16, 17]. it is often closely associated with the maximum frequency of the signal with which the information is transmitted by the drifting carriers. formally, the carrier group velocity is defined as

$$v(E) = \frac{1}{\hbar}\,\frac{dE}{dk} \tag{16}$$

the magnitude of the velocity of eq.
(16) can be related to the dos by rewriting it as

$$v(E) = \frac{1}{\hbar}\,\frac{dE}{dn}\,\frac{dn}{dk} = \frac{g_s}{2\pi\hbar\,D_{CNT}(E)} \tag{17}$$

where $dn/dk = g_s/2\pi$ (with $g_s = 2$ for spin degeneracy) in k-space and $D_{CNT}(E) = dn/dE$ is the dos for a single valley. when eq. (17) is multiplied by the dos and the fermi-dirac distribution function and divided by the electron concentration given by eq. (15), the magnitude of the velocity vector, the intrinsic velocity $v_i$, for a cnt is given by

$$v_i = v_{Fo}\,\frac{\Im_0(\eta)}{u_{CNT}}, \qquad u_{CNT} = n_{CNT}/N_{CNT} \tag{18}$$

the name intrinsic is given to this velocity as it is intrinsic to the sample, as compared to the drift velocity that is driven by an external field. similarly, the intrinsic velocity of a gnr follows the same pattern, with $n_{GNR} = N_{GNR}\,\Im_{GNR}(\eta, e_g)$ as in eq. (15).

fig. 7 the normalized intrinsic velocity $v_i/v_{Fo}$ as a function of the normalized carrier concentration $u_{CNT} = n_{CNT}/N_{CNT}$ for different chiralities. p denotes the parabolic approximation with effective mass

the intrinsic velocity and unidirectional velocity for arbitrary degeneracy are shown in fig. 7 for band index $\nu = 0, 1, 2$. the intrinsic velocity is not equal to the fermi velocity $v_{Fo} \approx 10^6$ m/s for semiconducting samples, approaching $v_{Fo}$, as expected, in strong degeneracy. however, for a metallic cnt, the intrinsic velocity is the intrinsic fermi velocity. the fermi velocity in the parabolic model is calculable. however, it has no physical meaning, as the parabolic approximation works only in the nondegenerate regime. nonequilibrium carrier statistics are challenging, considering the variety of approaches in the published literature with no convergence in sight. neadf is a natural extension of fermi-dirac statistics, with the electrochemical potential $E_F$ during the free flight of a carrier changing by $q\,\vec{\mathcal{E}}\cdot\vec{\ell}$, where $\vec{\mathcal{E}}$ is the electric field and $\ell$ is the mean free path (mfp) [15].
neadf is given by

$$f(E, \mathcal{E}) = \frac{1}{\exp\left(\dfrac{E - E_F + q\,\vec{\mathcal{E}}\cdot\vec{\ell}}{k_B T}\right) + 1} \tag{19}$$

neadf is the key to the transformation of stochastic velocity vectors into streamlined ones, giving an intrinsic velocity that is the fermi velocity in the metallic state but substantially below it in the semiconducting state. this intrinsic velocity $v_i$ is lowered by the onset of quantum emission. the nature of the quantum (photon or phonon) depends on the substrate. a sample of intrinsic velocity in a cnt is given in fig. 7. the intrinsic velocity is the average of the magnitude of the stochastic velocity with the fermi-dirac distribution that is obtained from eq. (19) when the electric field is zero ($\mathcal{E} = 0$). this is the limit on the drift velocity as stochastic vectors in equilibrium transform to streamlined unidirectional vectors. in the metallic state, the saturation velocity is limited to $v_{Fo}$. however, in a semiconducting state, it is substantially below $v_{Fo}$. the parabolic (p) approximation, although simple in its appearance, is not valid in the degenerate realm.

an equal number of electrons have directed velocity moments in and opposite to the electric field ($n_+ = n_- = n/2$) in equilibrium, where $n$ is the total concentration and $n_\pm/n$ is the fraction antiparallel (+) and parallel (−) to the electric field applied in the −x-direction. the electron concentration $n_+ \gg n_-$ opposing the electric field dominates in the presence of an electric field, as $n_+/n_- = \exp(q\mathcal{E}\ell/k_B T)$.
the fraction of electrons going in the +x-direction (opposite to an applied electric field /v le ) is / tanh( / )bn n q k t  e the drift velocity d v as a function of electric field then naturally follows as [15] tanh( / ) d u b v v q k t e (20) here vu is the unidirectional intrinsic velocity appropriate for twice the carrier concentration as electron cannot be accommodated in already filled state because of pauli exclusion principle. in nondegenerate domain, the distinction between vu and vi is not necessary. the quantum emission can also be accommodated to obtain saturation velocity tanh( / ) sat u q b v v k t  that is smaller than vu by quantum emission factor tanh( / ) q b k t approaching unity as energy of a quantum q >> kbt goes substantially beyond the thermal energy. in the other extreme, when emitted quanta are much smaller in energy, the bose-einstein statistics limits q / kbt  1. eq. (20) is strictly for nondegenerate statistics and that too for 1d nanostructures. different expressions are obtained for 2d and 3d nanostructure. however, use of tanh function unifies the current-voltage profile that can be usefully employed for characterization in the wake of failure of ohm's law. one way to preserve eq. (20) for degenerate statistics is to define the degeneracy temperature te for electrons. the low field carrier drift velocity is obtained when n+ /n is multiplied by vfo: 1 ( / ) [ ( ) / ( )] d fo fo o v n n v v          (21) fdi approximates to ( ) exp( )j    for all values of j for nondegenerate statistics. the degenerate mobility expression is obtained from (11) by retaining the factor in bracket giving / o fo b e q v k t  (22) with 1 2 3 2 ( ) ( ) e ud id t v t v        (23) where te is the degeneracy temperature signifying the higher energy of the degenerate electrons that is substantial higher than the thermal energy kbt. obviously te / t = 1 as expected. 
however, for strongly degenerate statistics $T_e/T = \eta$, giving $T_e = E_F/k_B$. the fermi energy is $E_F = n_{CNT}/D_o$ for a metallic degenerate cnt. $T_e$ is therefore equal to $T_e = n_{CNT}/k_B D_o$.

4. i-v characteristics

in graphene and cnts, because of the linear e-k relationship, the mobility expression is different from that of other semiconductors. a rudimentary analysis of the mobility in terms of the mfp is to change the mobility expression $\mu_o = q\tau/m^* = q\ell_o/m^* v$ by replacing $m^* v = \hbar k \rightarrow (E_F - E_{Fo})/v_F$. this gives a simple mobility expression that has been utilized in [4] in extracting the mfp. the expression obtained from this analogy is

$$\mu_o = \frac{q\,\ell_o v_F}{E_F - E_{Fo}} \tag{24}$$

the resistance can then be obtained from $R_o = \rho_2 L/W$, with $1/\rho_2 = (n_2\mu_{on} + p_2\mu_{op})\,q$, for graphene, with the replacement $n_2 \rightarrow n_g$ for the areal density, and from $R_o = \rho_1 L$, with $1/\rho_1 = (n_1\mu_{on} + p_1\mu_{op})\,q$ and $n_1 \rightarrow n_{CNT}$, for a cnt. the velocity response to a high electric field is discussed in [4]. neadf's transformation of equilibrium stochastic velocity vectors into a streamlined mode in extreme nonequilibrium leads to velocity saturation in a towering electric field. in a metallic cnt, the randomly oriented velocity vectors in equilibrium are of uniform fermi velocity $v_{Fo} = 1.0 \times 10^6$ m/s [1]. the saturation current $I_{sat} = n_{CNT}\,q\,v_{Fo}$ arises naturally from this saturation, where $n_{CNT} = 1.53 \times 10^8\ \mathrm{m^{-1}}$ is the linear carrier concentration along the length of the tube, consistent with the experimentally observed [18] $I_{sat} = 21\ \mu$A. $q$ is the electronic charge. the carrier statistics [1] give $E_F = 67.5$ mev, which is larger than the thermal energy for all temperatures considered ($T$ = 4, 100, and 200 k), making the applicable statistics strongly degenerate. the transition from ohmic to nonohmic saturated behavior initiates at the critical voltage $V_c = (k_B T/q)(L/\ell)$ for nondegenerate statistics with energy $k_B T$ and $V_c = (E_F/q)(L/\ell)$ for degenerate statistics with energy $E_F$.
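The critical voltage and the tanh law of eq. (20) can be combined into a short numerical sketch. Setting $\mathcal{E} = V/L$ in eq. (20) makes the tanh argument equal to $V/V_c$, and multiplying the drift velocity by the linear charge density gives a current that saturates at $I_{sat}$. The code below is an illustration under stated assumptions, not the author's implementation: the channel length of 1 μm and the mean free path of 70 nm are illustrative inputs, $E_F = 67.5$ mev and $I_{sat} = 21\ \mu$A are taken from the text, and the function names are hypothetical.

```python
import math

K_B_EV_PER_K = 8.617333e-5  # Boltzmann constant in eV/K


def critical_voltage(energy_ev, length_m, mfp_m):
    """V_c = (energy/q) * (L / mfp); with the energy in eV the result is in volts."""
    return energy_ev * length_m / mfp_m


def thermal_energy_ev(temperature_k):
    """k_B * T in eV, the relevant energy for nondegenerate statistics."""
    return K_B_EV_PER_K * temperature_k


def current(voltage_v, i_sat_a, v_c):
    """Current following the tanh law: I = I_sat * tanh(V / V_c)."""
    return i_sat_a * math.tanh(voltage_v / v_c)


if __name__ == "__main__":
    length_m, mfp_m = 1e-6, 70e-9   # illustrative channel length and mean free path
    i_sat = 21e-6                   # saturation current quoted in the text (A)
    v_c_deg = critical_voltage(0.0675, length_m, mfp_m)                  # degenerate, E_F
    v_c_nd = critical_voltage(thermal_energy_ev(200.0), length_m, mfp_m)  # nondegenerate, k_B T
    print("V_c (degenerate)    =", round(v_c_deg, 3), "V")
    print("V_c (nondegenerate) =", round(v_c_nd, 3), "V")
    for v in (0.1, 1.0, 5.0):
        print(v, "V ->", current(v, i_sat, v_c_deg), "A")
```

Well below $V_c$ the current is linear in voltage (ohmic), and well above $V_c$ it flattens at $I_{sat}$, which is the transition the paragraph describes.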
the mfp extracted from $R_o = 40$ kΩ is 70 nm, which gives the mobility [4] $\mu_o = q\ell_o v_F/E_F \approx 10{,}000\ \mathrm{cm^2/V\,s}$. the possibility of ballistic transport is minuscule given $L = 1\ \mu$m. ballistic transport in 2d systems is extensively discussed by arora and co-workers [17, 19], where it is shown that ballistic conduction substantially degrades the mobility in a 2d ballistic conductor with length smaller than the ballistic mfp. it may be tempting to apply the same formalism to a 1d nanowire or cnt. however, the surge in resistance in a 1d resistor contradicts the expected vanishing resistance of a ballistic conductor. a high-field resistance model [3] that employs the onset of phonon emission, consistent with the phonon-emission-limited mfp $\ell_Q$ of tan et al. [20], explains very well the saturation in a 2d gaas/algaas quantum well. the phonon-emission-limited mfp is generalized to any energy quantum by arora, tan, and gupta [4]. $\ell_Q$ is the distance that a carrier travels before gaining enough energy $q\mathcal{E}\ell_Q = E_Q$ to emit a quantum of energy $E_Q$, with the probability of emission given by bose-einstein statistics. $\ell_Q = E_Q/q\mathcal{E}$ is infinite in equilibrium, very large in a low electric field, and a limiting factor only in an extremely high electric field, where it becomes comparable to or smaller than the low-field mfp. that is why, in the published literature on cnts, it is considered a high-field mfp, distinct from the low-field scattering-limited mfp. it is $\ell_Q$ that was used by yao et al. [18] to interpret the linear rise $R/R_o = 1 + (V/V_c)$ in resistance with the applied voltage. here $R = V/I$ is the direct resistance. this direct resistance $R$ cannot replicate the incremental signal resistance $r = dV/dI$. therefore, the description of yao et al. [18] is deficient in not employing the distribution function and hence does not correctly attribute the source of current saturation, the transition point to current saturation, or the paradigm leading to the rise of direct and incremental resistance.

neadf has a recipe for nonohmic transport leading to current saturation consistent with velocity saturation. as stated earlier, $n_+ \gg n_-$ in the presence of an electric field, as $n_+/n_- = \exp(q\mathcal{E}\ell/k_B T)$. the fraction of electrons going in the direction opposite to an applied electric field $\mathcal{E} = V/L$ is then $n_+/n = \tanh(q\mathcal{E}\ell/k_B T)$. the current-voltage relation with the tanh function ($I = n_+ q v_{Fo}$) is a derivative of rigorous degenerate statistics [15], with $V_c = (E_F/q)(L/\ell)$ and the magnitude of the velocity vector equal to the fermi velocity $v_{Fo}$ for a metallic cnt. the current-voltage characteristics in a cnt are given by

$$I = I_{sat}\tanh(V/V_c) \tag{25}$$

fig. 8 is a plot of eq. (25) along with the experimental data of yao et al. [18]. also shown are the lines at temperature $T$ = 4, 100, and 200 k following the rigorous degenerate statistics [15]. the distinction between the direct $R = V/I$ and differential $r = dV/dI$ modes of resistance is crucial when the i-v relation is nonlinear. $R$ and $r$ are given by

$$\frac{R}{R_o} = \frac{V/V_c}{\tanh(V/V_c)} \tag{26}$$

$$\frac{r}{R_o} = \cosh^2(V/V_c) \tag{27}$$

this relationship is in direct contrast to $R/R_o = 1 + (V/V_c)$ with $V_c = I_{sat}R_o$ used by yao et al. [18], which can be obtained from eq. (26) by using the approximation $\tanh(x) \approx x/(1 + x)$. $I_o$ of yao et al. is the same as $I_{sat}$.

fig. 8 i-v characteristics of a cnt of length 1 μm. th stands for theoretical curves derived from degenerate statistics. tanh curves are a display of eq. (25)

as shown in fig. 9, the rise in $r/R_o$ is exponential compared to the linear rise in $R/R_o$. the potential divider rule between channel and contacts will make the lower-length resistor more resistive [21]. hence great care is needed to ascertain the critical voltage $V_c$ of the contact and channel regions.

fig. 9 $R$-$V$ characteristics of a cnt of length 1 μm.
markers and lines have the same legend as in fig. 8. the differential resistance $r$ (eq. 27) rises more sharply than the direct resistance $R$ (eq. 26)

fig. 8 makes it clear that both the direct slope $I/V$, giving the inverse resistance $R^{-1}$, and the incremental slope $dI/dV$, giving the inverse differential (incremental) resistance $r^{-1}$, decrease as the voltage is increased, ultimately reaching zero in the regime of saturation. however, in the work cited [4], the resistance is shown to decrease while the conductance increases with applied voltage. it may be noted that the incremental resistance increases almost exponentially, as indicated in fig. 9, for $V > V_c$, and hence the curves are limited to a mv range to indicate the superlinear surge of incremental resistance. the direct resistance does follow the linear rise with applied voltage. the following observations are made, consistent with the experimental data:

1. ohmic transport is valid as long as the applied voltage across the length of the channel is below its critical value ($V < V_c$).

2. the transition to the nonlinear regime occurs at the onset of the critical electric field, where the energy gained in a mean free path is comparable to the thermal energy for nondegenerate statistics and to the fermi energy for degenerate statistics [21, 22].

3. the resistance surge effect in ballistic channels corroborates well that observed by yao et al. [18], preceded by what was pointed out by greenberg and del alamo [23] in 1994. the surge in the contact region will change the distribution of voltage between the contacts and the channel. in this light, yao et al. [18] correctly conjectured the measured resistance to be a combination of the resistance due to the contacts and the scattering-limited resistance of the cnt channel. the application of neadf to cnts [1] not only gives a comprehensive overview of the metallic and semiconducting band structure of cnts, but also elucidates the rise of resistance due to the limit imposed on the drift velocity by the fermi velocity.

4.
onset of quantum emission lowers the saturation velocity. however, if the quantum is larger than the thermal energy, its effect on transport is negligible [22]. it is important to employ bose-einstein statistics [4] to phase in the possible presence of acoustic phonon emissions in addition to optical phonons, or for that matter photons, as transitions are induced by transfer to a higher quantum level induced by an electric field. the phonon emission, generalized to quantum emission with bose-einstein statistics, is effective in lowering the saturation velocity only if the energy of the quantum is higher than the thermal energy. quantum emission does not affect the ohmic mobility or, for that matter, the ohmic resistance.

5. conclusions

carbon nano allotropes offer a distinct advantage in meeting the expectations of the more-than-moore era. the paper reviews the complete landscape as graphene is rolled into a cnt or cut into a gnr. the rollover effect in terms of metallic (m) and semiconducting sc1 and sc2 states is distinctly new given the competing explanation for chiral and achiral cnts, making a metallic cnt distinct from a semiconducting one. the same can be said of gnrs, where there is a complete absence of a semiconducting state. the neadf is unique for high-field applications as it seamlessly makes a transition from the ohmic domain to the nonohmic domain. it is the nonohmic domain that is not interconnected in the published literature, necessitating the use of a hot-electron temperature. neadf clearly shows that a hot-electron temperature is not necessary in the description of high-field transport. the formalism presented connects the low-field and high-field regimes very well. the drive to reduce the size below the scattering-limited mfp is expected to eliminate scattering. this expectation goes against the experimental observation of a rapid rise in the resistance [13, 21, 23-25].
this scattering-free ballistic transport, as is known in the literature, gives a resistance quantum $h/2q^2 = 12.9$ kΩ. however, if the length of the cnt is larger than the scattering-limited mfp, the resistance will rise almost linearly. that is perhaps the reason that the experimental resistance of 40.0 kΩ observed by yao et al. [18] exceeds its ballistic value for a 1 μm resistor. in fact, greenberg and del alamo [23] have demonstrated that the resistance surge in the parasitic regions degrades the performance of an ingaas transistor. to sum up, explorations of new physical phenomena on this length scale require contributions from many different fields of science and engineering, including physics, chemistry, biology, materials science, and electrical engineering. however, quantum physics forms the backbone of understanding at the nanoscale in the biochemical sciences, where most applications of nanoensembles are apparent. this review exhibits new phenomena at the interface between the microscopic world of atoms and the macroscopic world of everyday experience that occur at the nanoscale. such studies will undoubtedly lead to further applications with enormous benefit to society.

acknowledgement: the paper is a part of the research done at the universiti teknologi malaysia (utm) under the utm distinguished visiting professor program and utm research university grant (gup) q.j130000.2623.04h32 of the ministry of education (moe).

references

[1] v. k. arora and a. bhattacharyya, "cohesive band structure of carbon nanotubes for applications in quantum transport," nanoscale, vol. 5, pp. 10927-10935, 2013.
[2] a. k. geim and k. s. novoselov, "the rise of graphene," nature materials, vol. 6, pp. 183-191, mar 2007.
[3] p. h. s. wong and d. akinwande, carbon nanotube and graphene device physics. cambridge: cambridge university press, 2011.
[4] v. k. arora, m. l. p. tan, and c. gupta, "high-field transport in a graphene nanolayer," journal of applied physics, vol. 112, p.
114330, 2012. [5] y. h. wu, t. yu, and z. x. shen, "two-dimensional carbon nanostructures: fundamental properties, synthesis, characterization, and potential applications," journal of applied physics, vol. 108, p. 071301, oct 1 2010. [6] a. h. castro neto, f. guinea, n. m. r. peres, k. s. novoselov, and a. k. geim, "the electronic properties of graphene," reviews of modern physics, vol. 81, pp. 109-162, jan-mar 2009. bandgap engineering of carbon allotropes 127 [7] k. s. novoselov, s. v. morozov, t. m. g. mohinddin, l. a. ponomarenko, d. c. elias, r. yang, i. i. barbolina, p. blake, t. j. booth, d. jiang, j. giesbers, e. w. hill, and a. k. geim, "electronic properties of graphene," physica status solidi b-basic solid state physics, vol. 244, pp. 4106-4111, nov 2007. [8] m. l. p. tan, v. k. arora, i. saad, m. taghi ahmadi, and r. ismail, "the drain velocity overshoot in an 80 nm metal-oxide-semiconductor field-effect transistor," journal of applied physics, vol. 105, p. 074503, 2009. [9] i. saad, m. l. p. tan, a. c. e. lee, r. ismail, and v. k. arora, "scattering-limited and ballistic transport in a nano-cmos circuit," microelectronics journal, vol. 40, pp. 581-583, mar 2009. [10] v. k. arora, m. l. p. tan, i. saad, and r. ismail, "ballistic quantum transport in a nanoscale metaloxide-semiconductor field effect transistor," applied physics letters, vol. 91, p. 103510, 2007. [11] v. e. dorgan, m. h. bae, and e. pop, "mobility and saturation velocity in graphene on sio(2)," applied physics letters, vol. 97, p. 082112, aug 2010. [12] i. gierz, c. riedl, u. starke, c. r. ast, and k. kern, "atomic hole doping of graphene," nano letters, vol. 8, pp. 4603-4607, dec 2008. [13] v. k. arora and m. l. p. tan, "high-field transport in graphene and carbon nanotubes," presented at the international conference on electron devices and solid state circuits 2013 (edssc2013), ieeexplore digital library, hong kong polytechnic university, 2013. [14] v. k. 
arora, nanoelectronics: quantum engineering of low-dimensional nanoensemble. wilkes-barre, pa: wilkes university, 2013. [15] v. k. arora, d. c. y. chek, m. l. p. tan, and a. m. hashim, "transition of equilibrium stochastic to unidirectional velocity vectors in a nanowire subjected to a towering electric field," journal of applied physics, vol. 108, pp. 114314-8, 2010. [16] v. k. arora, m. s. z. abidin, m. l. p. tan, and m. a. riyadi, "temperature-dependent ballistic transport in a channel with length below the scattering-limited mean free path," journal of applied physics, vol. 111, mar 1 2012. [17] v. k. arora, m. s. z. abidin, s. tembhurne, and m. a. riyadi, "concentration dependence of drift and magnetoresistance ballistic mobility in a scaled-down metal-oxide semiconductor field-effect transistor," appl. phys. lett., vol. 99, p. 063106, 2011. [18] z. yao, c. l. kane, and c. dekker, "high-field electrical transport in single-wall carbon nanotubes," physical review letters, vol. 84, pp. 2941-2944, 2000. [19] v. k. arora, "ballistic transport in nanoscale devices," presented at the mixdes 2012 : 19th international conference mixed design of integrated circuits and systems, wasaw, poland, 2012. [20] l. s. tan, s. j. chua, and v. k. arora, "velocity-field characteristics of selectively doped gaas/alxga1xas quantum-well heterostructures," physical review b, vol. 47, pp. 13868-13871, 1993. [21] m. l. p. tan, t. saxena, and v. arora, "resistance blow-up effect in micro-circuit engineering," solidstate electronics, vol. 54, pp. 1617-1624, dec 2010. [22] v. k. arora, "theory of scattering-limited and ballistic mobility and saturation velocity in lowdimensional nanostructures," current nanoscience, vol. 5, pp. 227-231, may 2009. [23] d. r. greenberg and j. a. d. alamo, "velocity saturation in the extrinsic device: a fundamental limit in hfet's," ieee trans. electron devices, vol. 41, pp. 1334-1339, 1994. [24] v. k. 
arora, "quantum transport in nanowires and nanographene," presented at the 28th international conference on microelectronics (miel2012), nis, serbia, 2012.
[25] t. saxena, d. c. y. chek, m. l. p. tan, and v. k. arora, "microcircuit modeling and simulation beyond ohm's law," ieee transactions on education, vol. 54, pp. 34-40, feb 2011.

facta universitatis series: electronics and energetics vol. 35, no 4, december 2022, pp. 619-634 https://doi.org/10.2298/fuee2204619l © 2022 by university of niš, serbia | creative commons license: cc by-nc-n original scientific paper

performance analysis and optimization of 10 nm tg n- and p-channel soi finfets for circuit applications

abdelaziz lazzaz1, khaled bousbahi2, mustapha ghamnia1
1laboratoire des sciences de la matière condensée (lsmc), département physique, université d’oran 1 ahmed ben bella, oran, algérie
2esgee d’oran, oran, algérie

abstract. this paper analyses the electrical characteristics of 10 nm tri-gate (tg) n- and p-channel silicon-on-insulator (soi) finfets with a hafnium oxide gate dielectric. the analysis has been performed through simulations using silvaco atlas tcad with the bohm quantum potential (bqp) algorithm. the influence of the geometrical parameters on the threshold voltage vth, the subthreshold swing (ss), the transconductance and the on/off current ratio, ion/ioff, is investigated. the two structures have been optimized for cmos inverter implementation. the simulation results show that the n-finfet and the p-finfet can reach a minimum ss value with fin heights of 15 nm and 9 nm, respectively. in addition, low threshold voltages of 0.61 v and 0.27 v for the n- and p-channel soi finfets, respectively, are obtained at a fin width of 7 nm.

key words: finfet, cmos, quantum effect, leakage current

1. introduction

nanoelectronics is a field of engineering technology which is used for controlling device properties at the nanometric scale.
to meet the increasing demands for high-performance and high-speed applications, transistors need to be aggressively scaled down. this requires major changes both in the development of new device structures and in the fabrication processes. when the channel length is less than 20 nm, short-channel effects (sces) become insurmountable and, consequently, the device performance degrades. multi-gate fets have successfully enabled cmos scaling and are considered to be the best alternative structures that can extend to the 10 nm technology node.

received april 8, 2022; revised june 21, 2022 and june 26, 2022; accepted june 26, 2022
corresponding author: abdelaziz lazzaz, laboratoire des sciences de la matière condensée (lsmc), département physique, université d’oran 1 ahmed ben bella, oran, algérie, e-mail: lazzaz.abdelaziz@gmail.com

most of the reported finfets are fabricated with a silicon channel and present several advantages, such as: i) reduced sces and a low leakage current, ii) superior electrostatic control through tri-gate structures, iii) reduced effect of substrate bias on the threshold voltage and excellent carrier transport properties, along with more aggressive channel length scaling possibilities [1]. conventional finfet technology has to face competition from other technology options because of its high access channel resistance due to its extremely thin body. to improve finfet performance, one must address the quantum confinement problem. hence, the use of the bqp algorithm, which is based on the bohm interpretation of quantum mechanics [8], may become more important. n.p. maity et al. in 2017 [22] explored the application of the promising high-k dielectric material hfo2 to mos devices. they observed that the tunneling current is inversely proportional to the dielectric constant of the oxide material. niladri pratap maity et al.
in 2016 [23] have developed an analytical model to evaluate the impacts of the hfo2 on the current density model with a comparison between the theoretical model and the experimental measurements. lazzaz et al. in 2022 [27] have demonstrated that quantum effects play a dominant role in nanostructures. they used the bqp method to fit experimental measurement of the ids-vds characteristics for 14 nm tg n-finfet. neha gupta et al. in 2020 [28] have explored the performance evaluation of high-k gate stack on the analog and rf figure of merits (foms) of 9 nm soi finfet. the results of their simulation confirm that the limitations of the transistor device such as sces, leakage current and parasitic capacitance have been reduced and pave the way for high-speed switching and rf application due to the use of high-k dielectric material with sio2 between gate and fin. anisur rahman et al. in 2018 [29] found that intel’s 10 nm technology achieved scaling benefits over its preceding 14 nm generation at matched or better transistor reliability. marupaka aditya et al. in 2021 [30] have confirmed that using high-k dielectric materials increase the on current and improve the device performance. sanghamitra das et al. in 2021 [31] have studied the effect of finfet geometric parameters (channel length and fin height) on the rf foms by using tcad simulations. their results confirm that decreasing the channel length or increasing the fin height improves the rf parameters. mostak ahmed et al. in 2021 [31] have simulated the electrical characteristics of a 3d tg n-channel soi finfet with a channel length of 5 nm using different gate dielectric materials. the results of their simulation confirm that high-k dielectric materials are the better option in the fabrication for future tg finfet devices. the above literature survey indicates the importance of using high-k dielectrics in finfet devices to reduce sces. 
in this paper, the transfer and the transconductance characteristics have been computed in order to find the electrical response of tg n- and p-channel soi finfets with 10 nm channel length. the bqp algorithm of the silvaco atlas tcad software has been used to simulate the i-v characteristics. the simulated devices have been optimized in terms of geometry to have optimal voltage transfer characteristics (vtc) for a cmos inverter [14], [15].

performance analysis and optimization of 10 nm tg n- and p-channel soi finfets for circuit applications 621

2. device structure

the hafnium-based oxide is extensively used because of its low leakage property and its high thermal stability with silicon [25]. the geometric parameters used in this simulation are represented in table 1 and the operating parameters of the two structures are presented in table 2.

table 1 geometric parameters
symbol | designation | value
l | channel length | 10 nm
ld, ls | drain, source length | 12 nm
eot | equivalent oxide thickness | 0.68 nm
vdd | supply voltage (v) | 0.75 v

table 2 operating parameters
symbol | designation | value
eg | gap energy | 1.12 ev
k(si) | dielectric constant of silicon | 11.9
k(hfo2) | dielectric constant of hafnium dioxide | 24
nch | channel doping concentration | 10^16 cm^-3
ns/d | source/drain doping concentration | 10^21 cm^-3
φg | gate work function | 4.85 ev

tg finfet technology is based on the following fin geometry: fin length (l), fin width, wfin, fin height, hfin, and oxide thickness, tox. the numerical resolution, which includes the gate work function and the choice of physical models, represents one of the two main steps in the silvaco atlas tools. the shockley-read-hall (srh) theory has been used. figure (1a) shows the top-view layout of tg soi finfet with 10 nm gate length, and figure (1b) illustrates the 3d schematic view of finfet. the gate oxide thickness is the same for all three sides of the fin. the hfin is considered as the distance between the top gate and the bottom gate oxides.
the wfin is represented as the distance between the front gate and the back gate. lg is the gate length and box is the buried oxide.

fig. 1 (a) top-view layout of tg soi finfet [21], (b) 3d schematic view of tg soi finfet

all simulations have been performed using the atlas and devedit 3d device simulators, and different operating parameters, such as the supply voltage, are extracted from the predictive technology model (ptm) [32].

3. drain current model of the tg finfet

the device electrostatics is governed by the 3-d poisson’s equation [5][19]:

$$\frac{d^2\varphi(x,y,z)}{dx^2}+\frac{d^2\varphi(x,y,z)}{dy^2}+\frac{d^2\varphi(x,y,z)}{dz^2}=\frac{q\,n(x,y,z)}{\varepsilon_{si}} \tag{1}$$

$\varphi$: electrostatic potential; $q$: electron charge; $\varepsilon_{si}$: silicon permittivity; $n(x,y,z)$: electron density.

quantum effects become more dominant and are difficult to control in the device. hence, in this study, one must consider them by selecting the appropriate model such as the bqp. the bqp model can also be used with the energy balance and hydrodynamic models, where the semi-classical potential is modified by the quantum potential in a similar way as for the continuity equations [20]. according to the bohm interpretation of quantum mechanics, the wave function can be represented in polar coordinates by the following expression [8]:

$$\psi = R\exp\left(\frac{iS}{\hbar}\right) \tag{2}$$

$R$: probability density per unit volume; $S$ has the dimension of an “action” (energy × time).

the schrödinger equation can be written as:

$$-\frac{\hbar^2}{2}\nabla\cdot\left(m^{-1}\nabla\left(R\exp\left(\frac{iS}{\hbar}\right)\right)\right)+V\,R\exp\left(\frac{iS}{\hbar}\right)=E\,R\exp\left(\frac{iS}{\hbar}\right) \tag{3}$$

$m^{-1}\nabla S$: the local velocity of the particle associated with the wave function. $E$ is the conserved total energy and $V$ is the potential energy [8].
the quantum potential is derived from the use of the bohm interpretation of quantum mechanics and it is described by the following equation [8][20]:

$$Q=-\frac{\hbar^2}{2}\,\frac{\nabla\cdot\left(m^{-1}\nabla R\right)}{R} \tag{4}$$

the threshold voltage expression in the case of a finfet structure can be defined by [18]:

$$V_{th}=V_{fb}+2\phi_b+\frac{\lambda_b}{C_{ox}}\sqrt{2qN_s\varepsilon_s\left(2\phi_b+V_{sb}\right)}-\lambda_d V_{ds} \tag{5}$$

$V_{fb}$: flat band voltage; $\phi_b$: body potential; $C_{ox}$: gate oxide capacitance; $q$: electron charge; $N_s$: doping concentration; $\varepsilon_s$: dielectric constant of the semiconductor; $V_{sb}$: the reverse bias between the source and the body; $\lambda_d$: drain-induced barrier lowering (dibl) coefficient; $\lambda_{ds}$: channel length modulation; $\lambda_b$: barrier variation coefficient.

the transconductance, gm, represents the drain current variation with respect to gate voltage. it is represented by the following equation [4][13]:

$$g_m=\frac{dI_d}{dV_{gs}} \tag{6}$$

ss is a major parameter for calculating the leakage current, and is calculated as [13]:

$$SS=\frac{dV_{gs}}{d\left(\log_{10}I_{ds}\right)} \tag{7}$$

the value of the dibl is [3]:

$$DIBL=\frac{\Delta V_{th}}{\Delta V_{ds}} \tag{8}$$

$V_{th}$: threshold voltage; $V_{ds}$: drain-source voltage.

4. results

finfet is considered to be a promising candidate for ultimate cmos device structure because it has robustness against sces and higher current drivability. soi finfets have shown several advantages over bulk finfets. soi finfet could suppress the leakage current between source and drain through the body below the channel fin, and it has low source/drain-to-substrate capacitance, thereby improving the speed characteristics. in this section, the effect of changing the fin width and the fin height is analyzed and investigated for finfet structures. the different electrical parameters which are derived in this simulation, such as leakage and ss, are compared with other published results [9][12][17] in order to validate our model.
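as a minimal sketch of how the figures of merit in eqs. (6)-(8) could be extracted from a simulated sweep, the following python functions apply finite differences to tabulated data. the function names, the synthetic sweep and the two threshold voltages passed to the dibl helper are hypothetical illustrations and are not taken from the paper; the paper itself obtains these quantities from silvaco atlas.

```python
import math

def transconductance(vgs, ids):
    """Eq. (6): g_m = dI_d/dV_gs, central differences (one-sided at the ends)."""
    n = len(vgs)
    gm = []
    for k in range(n):
        lo, hi = max(k - 1, 0), min(k + 1, n - 1)
        gm.append((ids[hi] - ids[lo]) / (vgs[hi] - vgs[lo]))
    return gm

def subthreshold_swing(vgs, ids):
    """Eq. (7): SS = dV_gs / d(log10 I_ds) in mV/dec; minimum over the sweep."""
    best = math.inf
    for k in range(len(vgs) - 1):
        dlog = math.log10(ids[k + 1]) - math.log10(ids[k])
        if dlog > 0:
            best = min(best, 1e3 * (vgs[k + 1] - vgs[k]) / dlog)
    return best

def dibl(vth_a, vth_b, vds_a, vds_b):
    """Eq. (8): magnitude of delta V_th / delta V_ds, in mV/V."""
    return 1e3 * abs((vth_b - vth_a) / (vds_b - vds_a))

# Synthetic (hypothetical) subthreshold sweep: ideal exponential, 80 mV/dec.
vgs = [0.01 * k for k in range(31)]              # 0 ... 0.30 V
ids = [1e-14 * 10 ** (v / 0.080) for v in vgs]   # amperes

ss = subthreshold_swing(vgs, ids)
gm = transconductance(vgs, ids)
print(f"SS = {ss:.2f} mV/dec, g_m at last point = {gm[-1]:.3e} A/V")
```

in practice the swing is usually read from the steepest part of the logarithmic transfer curve, which is why the helper returns the minimum over the sweep; this choice is an assumption of the sketch.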
the bqp model has better convergence properties in many situations and it can be calibrated against results from the schrödinger-poisson equation under conditions of negligible current flow.

4.1. simulation and analysis

srh theory accounts for the generation and recombination of charge carriers through electron and hole capture and emission at states within the energy gap. tcad software was used to simulate the structure and the characteristics of the tg finfet. figure 2 represents ids-vgs transfer characteristics of the n-channel finfet in linear scale with vds = 0.7 v, which is higher than the threshold voltage. the on current in this simulation is 4 μa when vgs = vdd. one can observe that the threshold voltage, vth, in this simulation is 0.62 v. the vth of the device is related to the position of the fermi level with respect to the sub-band energy levels. increasing the fin height will actually reduce the carrier quantum confinement, thereby reducing the sub-band energy.

fig. 2 transfer characteristics for n-channel finfet

to include quantum confinement in the computation of the id-vgs characteristics, bqp has been used. as a result, a correction of the value of drain current has been achieved. figure 3 represents the transfer characteristics in logarithmic scale. the gate voltage is swept from 0 to 0.75 v. we note that the leakage current is 10^-14 a when vgs = 0 v. the leakage current obtained in this simulation is less than that obtained by dhananjaya tripathy et al. [26].

fig. 3 transfer characteristics n-channel finfet in logarithmic scale

figure 4 represents the output characteristics for n-channel soi finfet. we note that the drain saturation current is 2×10^-6 a at vds = vdd = 0.75 v. we observe that the early effect is more pronounced. we find that an increase in vgs will result in higher channel conductivity. the output characteristics have been obtained for vgs = 0.7 v.
fig. 4 output characteristics of n-channel finfet

figure 5 represents the transconductance of n-channel finfet. the gate voltage is swept from 0 to 0.75 v. we note that the maximum value of the transconductance at vds = 0.7 v is 5×10^-5 a/v.

fig. 5 transconductance characteristics of n-channel finfet

the higher value of the transconductance can be attributed to the higher strain in the short channel device. the transconductance peak can be reduced by reducing the channel length. shorter gate length, lg, provides less resistance and lower surface-roughness scattering, which leads to a higher transconductance and mobility. the higher mobility is induced by the quasi-ballistic transport instead of mobility increase. table 3 represents the different performance parameters of n-channel soi finfet [6][11]:

table 3 performance parameters of n-channel finfet
parameter | value
ion | 4 µa
ioff | 10^-14 a
ion/ioff | 4×10^8
vth | 0.62 v
dibl | 21.05 mv/v
ss | 79.48 mv/dec

the values of ss and dibl indicate a performance comparable with the state of the art obtained in ptm 10 nm hp nmos and pmos, such as the value of ss which is 102.4 mv/dec and dibl which is 212 mv/v. but, we must optimize these values in order to have a good performance of the device [17]. the ss provides a good performance comparable to the one obtained by ajay kumar et al. [12], and the calculated performance ratio is better than that obtained by buryk et al. [9]. figure 6 represents the transfer characteristics of p-channel finfet in linear scale with vsd = 0.7 v. we note that the on current in this structure is 6.25×10^-5 a. the on current is measured at vsg = vdd = 0.75 v. the value of the threshold voltage in this simulation is 0.30 v. the performance ratio, ion/ioff, calculated in this simulation is higher than that calculated by a.s. opanasyuk et al. [9].

fig.
6 transfer characteristics of p-channel finfet

figure 7 represents the transfer characteristics of p-channel finfet in logarithmic scale. we note that the leakage current is 1.58×10^-8 a when vsg = 0 v.

fig. 7 transfer characteristics in logarithmic scale of p-channel finfet

figure 8 represents the output characteristic of p-channel finfet. we note that the drain saturation current is 5.5×10^-5 a.

fig. 8 output characteristics p-channel finfet

table 4 represents the different performance parameters:

table 4 performance parameters of p-channel finfet
parameter | value
ion | 5.5×10^-5 a
ioff | 2.51×10^-8 a
ion/ioff | 2.19×10^3
vth | 0.31 v
ss | 133.50 mv/dec

4.2. effect of the fin height

in this section, we investigate the impact of the fin height variation on id-vgs characteristics. quantum effects are included in the simulation. we think that the calculated optimized values of heights in n- and p-channel finfets can be used as geometric parameters in the ptm for the design of cmos inverters.

fig. 9 transfer characteristics with different fin heights for n-channel finfets

figure 9 represents the transfer characteristics of n-channel finfets with different height values, 9 nm, 11 nm, 13 nm and 15 nm. it is clear that as we increase the fin height, the on current increases from 2.5 μa up to 4 μa. the on current determines the driving capability of the device. the increase of fin height increases the inversion charge density and thereby increases the on current [24]. we note that the leakage current increases with the increase of the fin height because trap-assisted-tunneling is more important than direct tunneling.
table 5 represents the impact of fin height on the subthreshold swing and the threshold voltage of the simulated device:

table 5 impact of fin height of n-channel finfet
parameter | 9 nm | 11 nm | 13 nm | 15 nm
ss (mv/dec) | 73.86 | 78.57 | 79.76 | 83.75
vth (v) | 0.67 | 0.66 | 0.65 | 0.64

we note that the increase of fin height increases the subthreshold swing, and the threshold voltage decreases in the device. the increase of the subthreshold swing is due to the increase in the total capacitance, so we need to minimize the parasitic capacitance in order to reduce the power consumption. the decrease of the threshold voltage vth is due to the decrease of the fermi level. figure 10 represents the impact of fin height on the performance ratio (ion/ioff). we note that the increase of fin height up to 13 nm decreases the performance of the device. the performance ratio increases because of the decrease of leakage current [2].

fig. 10 the impact of fin height on the performance ratio of n-channel finfet

the suitable value of fin height for the simulated device is 15 nm because it shows a larger performance ratio equal to 4×10^9. figure 11 represents the impact of fin height variation on p-channel finfet transfer characteristics. we note that the on current increases with the increase of fin height. we note that the leakage current increases with increase of fin height because direct tunneling is more important than trap-assisted-tunneling.

fig. 11 impact of fin height on the transfer characteristics of tg p-channel finfet

table 6 displays the impact of fin height on subthreshold swing and vth in 10 nm p-channel finfet.

table 6 impact of fin height on p-channel finfet
parameter | 9 nm | 11 nm | 13 nm | 15 nm
ss (mv/dec) | 129.62 | 137.50 | 133.33 | 133.50
vth (v) | 0.35 | 0.33 | 0.32 | 0.31

the variation of subthreshold swing is due to the total capacitance of the device.
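the dependence of the subthreshold swing on capacitance mentioned above can be made explicit with the standard long-channel mosfet expression. this is a textbook relation used here as an assumption, not a model stated in the paper; $C_{dep}$ stands for the depletion (plus any parasitic) capacitance seen by the channel, $C_{ox}$ for the gate oxide capacitance, and $T = 300$ k is assumed for the numerical value:

```latex
SS = \ln(10)\,\frac{kT}{q}\left(1+\frac{C_{dep}}{C_{ox}}\right),
\qquad
\ln(10)\,\frac{kT}{q}\bigg|_{T=300\,\mathrm{K}}
\approx 2.303 \times 25.85\ \mathrm{mV} \approx 59.5\ \mathrm{mV/dec}
```

under this assumption, any increase of the ratio $C_{dep}/C_{ox}$ moves ss away from the limit of about 60 mv/dec, while a larger $C_{ox}$ (for instance through a high-k gate dielectric) brings it closer.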
the threshold voltage in p-channel finfet decreases with increasing fin height. figure 12 illustrates the impact of fin height on performance ratio of p-channel finfet. we note that the increase of fin height until hfin = 13 nm decreases the performance ratio, then the performance ratio increases when hfin > 13 nm because of the decrease of the leakage current [2].

fig. 12 impact of fin height on the performance ratio of p-channel finfet

the suitable value of this optimization is 9 nm because it allows a desirable performance ratio.

4.3. effect of fin width

in this section, we investigate the impact of fin width on the performance of n-channel finfet.

fig. 13 impact fin width on n-channel finfet

figure 13 shows the transfer characteristics of n-channel finfet for different values of fin width. the gate voltage is swept from 0 v to 0.75 v. we note that the on current increases from 4×10^-6 a up to 5.25×10^-6 a, then it falls down to 3.30×10^-6 a because the strain effect in the channel increases. the large fin width decreases the mobility and the inversion charge and results in a smaller drain current. the leakage current increases from 10^-14 a up to 3.16×10^-12 a when fin width is 9 nm. when fin width is greater than 9 nm, the leakage current decreases down to 1.58×10^-13 a because direct tunneling is more important than the trap-assisted-tunneling. the following table 7 represents different values of subthreshold swing and vth for different fin widths. we note that the subthreshold swing and the threshold voltage increase with increasing fin width. because of the quantum effects along the wfin direction, the channel electrons will populate the discrete sub-bands. the vth will increase because more gate-bias is required to populate electrons into the lowest sub band, which is significantly above the bottom of the conduction band by evth.
it should be underlined that a large fin width enlarges the total gate width; therefore, the gate and depletion capacitances increase and the subthreshold swing increases [16].

table 7 impact of fin width of n-channel finfet
parameter | 7 nm | 8 nm | 9 nm | 10 nm
ss (mv/dec) | 95.31 | 106.89 | 114.54 | 171.05
vth (v) | 0.61 | 0.62 | 0.63 | 0.65

figure 14 illustrates the performance ratio of n-channel finfet; we note that ion/ioff decreases down to 1.66×10^6 with wfin = 9 nm, then the performance ratio increases up to 4×10^8. the increase of the performance ratio is due to the decrease of leakage current [2].

fig. 14 impact of width fin on the performance n-channel finfet

transistor dimensions are scaled down in order to improve drive current and circuit speed, and the ratio ion/ioff is needed to exceed 10^6 [2]. the suitable value of fin width on n-channel finfet is 10 nm because it has the best performance ratio, 4×10^8.

fig. 15 impact of fin width on p-channel finfet

figure 15 represents the impact of fin width on the transfer characteristics of p-channel finfet; we note that the on current increases and the maximum value is 6.25×10^-6 a. the increase of fin width increases the leakage current in the device after wfin = 9 nm because direct tunneling current is more important than trap-assisted-tunneling. figure 16 represents the impact of fin width on performance ratio of p-channel finfet. we notice that the performance of the device decreases with increasing fin width up to 9 nm, then above this value, it starts to increase [2].

fig. 16 impact of fin width on the performance ratio of p-channel finfet

table 8 represents the different results of subthreshold swing and vth of p-channel finfet:

table 8 impact of fin width of p-channel finfet.
parameter | 7 nm | 8 nm | 9 nm | 10 nm
ss (mv/dec) | 168.75 | 175.01 | 132.01 | 137.25
vth (v) | 0.27 | 0.28 | 0.33 | 0.35

the increase of threshold voltage is due to the presence of the sub bands. the quantum confinement raises the conduction band edge, ec, to the lower order eigenvalues. this shift has a direct influence on the device threshold voltage because it requires more band bending (potential energy lowering) in order to create the inversion layer [10]. the variation of subthreshold swing is due to the total capacitance of the device [7]. the suitable value of fin width in p-channel finfet is 7 nm because it has a larger performance ratio of 4×10^2.

5. discussion and conclusion

throughout this study, we have shown that both n- and p-channel finfets have good performance ratios only when short-channel effects are minimized. we have also shown that the bqp algorithm is a good simulation tool for computing parameters that control the quantum effects. it also allows the calculation of the optimal geometrical parameters for optimal performance of devices that can be implemented in cmos circuits. the results show that in order to have a good threshold voltage, one needs to increase the fin height that allows the increase of the energy level of the sub-bands. to minimize the sces, the subthreshold swing must be around 60 mv/dec and the total capacitance must be decreased in both devices by using high-k oxides and wide thicknesses. the integration of tri-gate soi finfet provides new opportunities in achieving high performance in cmos technology. this requires the improvement of certain parameters such as leakage currents and the control of the threshold voltages. we have also shown that the device characteristics depend on fin widths and fin heights.
as a result, 10 nm finfet with hafnium dioxide using quantum confinement can be considered as a promising device for future cmos manufacturing process. our results can be used as spice parameters for ptm in cmos inverter design. acknowledgments: the author lazzaz wishes to thank professor ghibaudo of inp grenoble and professor pierpaolo palestri for their very helpful pieces of advice. references [1] b. yu, l. chang, s. ahmed, h. wang, s. bell, c. yang, d. kyser, "finfet scaling to 10 nm gate length", in digest of the ieee international electron devices meeting, 2002, pp. 251-254. [2] y. eng, l. hu, t. chang, s. hsu, c. chiou, t. wang, m. chang, "importance of ∆𝐷𝐼𝐵𝐿𝑆𝑆/(𝐼𝑜𝑛/ 𝐼𝑜𝑓𝑓) in evaluating the performance of n-channel bulk finfet devices", ieee electron devices society, vol. 6, pp. 207-213, january 2018. [3] m. lundstorm, "fundamentals of nano transistors (lessons from nanoscience: a lecture notes)", wspc, 2015, pp. 05-334 [4] j. baker, "cmos circuit design,layout and simulation, 3rd edition (ieee press series on microelectronic systems)". wiley-ieee press, 2010, pp. 187-188. [5] a. tsormpatzoglou, "characterization and modeling of modeling nanoscale multi-gate mosfets", ph.d dissertation, institut polytechnique de grenoble, grenoble, france, 2009. [6] n. collaert, "high mobility materials for cmos applications", woodhead publisching series in electonic and optical materials (1reed), ed. woodhead publishing, 2018, pp. 297-298. [7] y. chauhan, d. lu, s. venugopalan, s. khandelwal, j. duarte, n. paydavosi, c. hu, "finfet modeling for ic simulation and design": using the bsim cmg standard (1re éd), ed. academic press, pp. 131-135, 2015. [8] g. iannaccone, g. curatola, g. fiori, "effective bohm quantum potential for device simulators based on drift-diffusion and energy transport", simulation of semiconductor processes and devices, springer, vienna, pp. 275-278, 2004. [9] i. buryk, m. ivashchenko, a. golovnia, a. 
opanasyuk, "numerical simulation of fet transistors based on nanowire and fin technologies", ieee, pp. 257-259, 2020. [10] w. han, z. wang, "toward quantum finfet", springer, pp. 54-67, 2013. [11] a. zhang, j. mei, l. zhang, h. he, j. he, m. chan, "numerical study on dual material gate nanowire tunnel field-effect transistor", in proceedings of the 2012 ieee international conference on electron devices and solid state circuit (edssc), 2012, pp. 1-5. [12] a. kumar, s. saini, a. gupta, n. gupta, m. tripathi, r. chaujar, "sub-10 nm high-k dielectric soifinfet for highperformance low power applications", in proceedings of the 2020 6th international conference on signal processing and communication (icsc), 2020, pp. 310-314. [13] n. bourahla, a. bourahla, b. hadri, "comparative performance of the ultra-short channel technology for the dg-finfet characteristics using different high-k dielectric materials", indian journal of physics, pp. 1-8, 2020. [14] c. hu, "3d finfet and other sub-22 nm transistors", in proceedings of the 19th ieee international symposium on the physical and failure analysis of integrated circuits, 2012, pp. 1-5. [15] k. kuhn, "cmos scaling for the 22 nm node and beyond: device physics and technology". in proceedings of 2011 ieee international symposium on vlsi technology, systems and applications, pp. 1-2, 2011. [16] n. boukortt, b. hadri, s. patane, "investigation on tg n-finfet parameters by varying channel doping concentration and gate length. silicon", vol. 9, no. 6, pp. 885-893, 2017. [17] a. rassekh, m. fathipour, "a single-gate soi nanosheet junctionless transistor at 10-nm gate length: design guidelines and comparison with the conventional soi finfet", journal of computational electronics, vol. 19, no. 2, pp. 631-639, 2020. [18] o. bonnaud, "physique des solides, des semiconducteurs et dispositifs. université de rennes", vol. 1, p. 78, 2003. [19] k. ren, y. liang, c. 
huang, "compact physical models for algan/gan mis-finfet on threshold voltage and saturation current", ieee transactions on electron devices, vol. 65, no 4, pp. 1348-1354, 2018. [20] c. mohan, s. choudhary, b. prasad, "gate all around fet: an alternative of finfet for future technology nodes", international journal adv. res. sci. eng., vol. 6, no. 7, pp. 563-569, 2017. 634 a. lazzaz, k. bousbahi, m. ghamnia [21] a. lazzaz, k. bousbahi, m. ghamnia, "modeling and simulation of dg soi n finfet 10 nm using hafnium oxide", in proceedings of the ieee 21st international conference on nanotechnology (nano 2021), 2021, pp. 177-180. [22] n. maity, r. maity, s. baishya, "voltage and oxide thickness dependent tunneling current density and tunnel resistivity model: application to high-k material hfo2 based mos devices", superlattices and microstructures, vol. 111, pp. 628-641, 2017. [23] n. maity, r. maity, r. thapa, s. baishya, "a tunneling current density model for ultra thin hfo2 high-k dielectric material based mos devices", superlattices and microstructures, vol. 95, p. 24-32, 2016. [24] n. boukortt, "3-d simulation of nanoscale soi n-finfet at a gate length of 8 nm using atlas silvaco", transaction on electrical and electronic materials, vol. 16, pp. 156-161, june 2015. [25] n. bourahla, a. bourahla, b. hadri, "comparative performance of the ultra-short channel technology for the dg-finfet characteristics using different high-k dielectric materials", indian journal of physics, vol. 95, no. 10, pp. 1977-1984, 2021. [26] d. tripathy, d. acharya, p. rout, s. biswal, "influence of oxide thickness variation on analog and rf performances of soi finfet”, facta universitatis, series: electronics and energetics, vol. 35, no 1, p. 001-011, 2022. [27] a. lazzaz, k. bousbahi, m. ghamnia, "optimized mathematical model of experimental characteristics of 14 nm tg n finfet", micro and nanostructures, p. 207210, 2022. [28] n. gupta, a. 
kumar, "assessment of high-k gate stack on sub-10 nm soi-finfet for high-performance analog and rf applications perspective", ecs journal of solid-state science and technology, vol. 9, no. 12, p. 123009, 2020. [29] a. rahman, j. dacuna, p. nayak, pinakpani, "reliability studies of a 10 nm high-performance and lowpower cmos technology featuring 3rd generation finfet and 5th generation hk/mg", in proceedings of the ieee international reliability physics symposium (irps 2018), p. 6f. 4-1-6f. 4-6, 2018. [30] m. aditya, k. rao, srinivasa, k. sravani, "design, simulation and analysis of high-k gate dielectric finfield effect transistor", international journal of nano dimension, vol. 12, no 3, pp. 305-309, 2021. [31] m. ahmed, s. islam, d. al mamun, "numerical simulation of the electrical characteristics of nanoscale tg n-finfet with the variation of gate dielectric materials", international journal of semiconductor science & technology (ijsst), vol. 11, no. 3, 2021. [32] s. sinha, g. yeric, v. chandra, b. cline, y. cao, "exploring sub-20 nm finfet design with predictive technology models", in proceedings of the ieee dac design automation conference, 2012, pp. 283-288. instruction facta universitatis series: electronics and energetics vol. 30, n o 1, march 2017, pp. 81 91 doi: 10.2298/fuee1701081v on the numerical computation of cylindrical conductor internal impedance for complex arguments of large magnitude * slavko vujević, dino lovrić university of split, faculty of electrical engineering, mechanical engineering and naval architecture, split, croatia abstract. in this paper a numerical algorithm for computation of per-unit-length internal impedance of cylindrical conductors under complex arguments of large magnitude is presented. the presented algorithm either numerically solves the scaled exact formula for internal impedance or employs asymptotic approximations of modified bessel functions when applicable. 
the formulas presented can be used for computation of per-unit-length internal impedance of solid cylindrical conductors as well as tubular cylindrical conductors.

key words: internal impedance, modified bessel functions, large function arguments, scaling.

1. introduction

internal impedance per-unit-length (pul) or surface impedance of cylindrical conductors is required in analysis of numerous electromagnetic problems [1-5]. this pul internal impedance can be computed using various formulas which contain special functions such as bessel functions and modified bessel functions [6]. whatever formula is employed, the results are valid only for smaller function arguments, whereas for larger function arguments stability issues often occur. these issues are directly connected with computing special functions (bessel functions and modified bessel functions) under large parameters, which in some cases yield extremely large values and in some cases extremely low values. in addition, these extreme values are multiplied, divided, subtracted and added, which makes things considerably worse. in this paper an algorithm is presented which circumvents the mentioned issues by first scaling the employed formulas to avoid overflow/underflow issues and then solving the expressions for modified bessel functions in two ways: either by numerical integration or by using asymptotic approximations when applicable [7].

received february 25, 2016; received in revised form april 7, 2016
corresponding author: slavko vujević, university of split, faculty of electrical engineering, mechanical engineering and naval architecture, split, croatia (e-mail: vujevic@fesb.hr)
* an earlier version of this paper was presented at the 12th international conference on applied electromagnetics (пес 2015), august 31 - september 2, 2015, in niš, serbia [1].

82 s. vujević, d. lovrić

the formulas presented in the paper are applicable to solid and tubular cylindrical conductors.
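the overflow problem and the scaling idea described above can be illustrated with a minimal python sketch. it evaluates the modified bessel function of the first kind through its standard integral representation, with the factor exp(-z) folded into the integrand so that no large intermediate value is ever formed. the function name, the fixed-panel composite simpson rule (used here instead of an adaptive rule), the choice of exp(-z) as scaling factor and the sample argument are assumptions of this sketch and are not taken from the paper.

```python
import cmath
import math

def scaled_bessel_i(n, z, panels=20000):
    """Return exp(-z) * I_n(z) for integer n and Re(z) >= 0.

    Uses I_n(z) = (1/pi) * integral_0^pi exp(z*cos(t)) * cos(n*t) dt with the
    scaling folded into the integrand: the exponent z*(cos(t) - 1) never has
    a positive real part, so exp() cannot overflow.
    Composite Simpson rule with a fixed panel count (not adaptive).
    """
    if panels % 2:
        panels += 1  # Simpson's rule needs an even number of panels
    h = math.pi / panels

    def integrand(t):
        return cmath.exp(z * (math.cos(t) - 1.0)) * math.cos(n * t)

    total = integrand(0.0) + integrand(math.pi)
    for k in range(1, panels):
        total += (4.0 if k % 2 else 2.0) * integrand(k * h)
    return total * h / (3.0 * math.pi)

# Hypothetical argument of large magnitude on the ray arg(z) = pi/4.
# Its real part is about 848, so the unscaled factor exp(z) would exceed
# the double-precision range (exp overflows above roughly 709.78).
z = 1200.0 * cmath.exp(1j * math.pi / 4)
ratio = scaled_bessel_i(0, z) / scaled_bessel_i(1, z)  # equals I_0(z) / I_1(z)
print(ratio)
```

because both functions carry the same scaling factor, it cancels in the ratio, which is the kind of quotient that appears in internal impedance formulas for cylindrical conductors.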
all presented formulas are for a tubular cylindrical conductor, but by introducing the value zero for the internal radius of the tubular cylindrical conductor, the pul internal impedance of a solid cylindrical conductor can be obtained. this model for computing pul internal impedance of single-layer tubular conductors represents a basis for a more general model, currently in development, which will be able to compute pul internal impedance of a multilayered tubular conductor.

2. formula for computation of tubular cylindrical conductor internal impedance

computation of pul internal impedance of tubular cylindrical conductors (fig. 1), which takes the skin effect into account but ignores the proximity effect, can be performed using various formulas based on different special functions. it has been concluded in the previous work of the authors of this paper that, from the numerical stability standpoint, the most suitable formula for computation of pul internal impedance of tubular conductors is based on modified bessel functions of the first and second kind [7]:

$$\bar{Z}=\frac{\bar{\gamma}}{2\pi r_e\sigma}\cdot\frac{\bar{K}_1(\bar{\gamma}r_i)\,\bar{I}_0(\bar{\gamma}r_e)+\bar{K}_0(\bar{\gamma}r_e)\,\bar{I}_1(\bar{\gamma}r_i)}{\bar{K}_1(\bar{\gamma}r_i)\,\bar{I}_1(\bar{\gamma}r_e)-\bar{K}_1(\bar{\gamma}r_e)\,\bar{I}_1(\bar{\gamma}r_i)} \tag{1}$$

$$\bar{\gamma}=\sqrt{\omega\mu\sigma}\,\exp\left(j\frac{\pi}{4}\right)=\alpha\,(1+j) \tag{2}$$

where $\sigma$ is the electrical conductivity of the conductor material, $r_e$ is the external radius of the conductor, $r_i$ is the internal radius of the conductor, $\bar{I}_0$ and $\bar{I}_1$ are complex-valued modified bessel functions of the first kind of order zero and one, $\bar{K}_0$ and $\bar{K}_1$ are complex-valued modified bessel functions of the second kind of order zero and one (also called kelvin functions), $\bar{\gamma}$ is the complex wave propagation constant, $\alpha$ is the attenuation constant, $\mu$ is the permeability of the conductor material, $\omega$ is the circular frequency and $j$ is the imaginary unit.

fig.
1 cross-section of a tubular cylindrical conductor as it has been shown in [7] by rearranging formula (1) and scaling it by an appropriate factor, the following formula for pul internal impedance of tubular conductors can be obtained: numerical computation of cylindrical conductor internal impedance 83 0 0 0 1 1 1 1 1 1 1 ( ) ( ) exp[ 2 ( )] ( ) ( ) ( ) 2 ( ) ( ) ( ) exp[ 2 ( )] ( ) ( ) s s i e e is s s e i e s s s e e i e e is s i e k r k r r r i r i r i r z r i r k r k r r r i r i r                                           (3) where the scaled modified bessel functions are: ( ) exp( ) ( ) s n n i r r i r        (4) ( ) exp( ) ( ) s n n k r r k r       (5) modified bessel functions of the first kind are scaled down exp( )r  times whereas modified bessel functions of the second kind are scaled up exp( )r  times. in such a way quantities of similar magnitudes are obtained which consequently enables more stable computation. the computation of internal impedance z can be further simplified depending on the magnitude of ( ) e i r r   . numerical analysis has shown that for ( ) 19 e i r r    computation of z must be performed using (3) in order to maintain high accuracy. however, for larger magnitudes of ( ) e i r r   simplifications of formula (3) can be performed without loss of accuracy. the following relation presents these simplifications and their interval of applicability: 0 1 15 15 ( ) 19 ( ) 10 2 ( ) ( ) 10 2 s e e is e e e i e i r ; r r r i r z ; r r r                               (6) as can be seen from (3) and (6), it is imperative to compute scaled modified bessel functions of the first and second kind as accurately as possible. the proposed numerical procedure for achieving this is addressed in the following section of the paper. 3. 
computation of scaled modified bessel functions in the developed algorithm for function parameters α∙r ≤ 25 integral representation of scaled modified bessel functions of the first and second kind is used. integral representation of modified bessel functions is more suitable than the infinite sum representation because the scaling factors given in (4-5) can be easily included in the integral representation of modified bessel functions. this is not the case when using the infinite sum representation. integrals that occur in modified bessel functions of the first and second kind are solved numerically using adaptive simpson rule. on the other hand, for function parameters α∙r > 25 computation of scaled modified bessel functions of the first and second kind is performed using asymptotic approximations. through extensive numerical analysis it has been found that for function parameter values larger than 25, asymptotic approximations of modified bessel functions produce results of equal accuracy as the numerical solution of integral representation of modified bessel functions but in less computation time. 84 s. vujević, d. lovrić 3.1. 
computation of scaled modified bessel functions of the first kind for α∙r ≤ 25 modified bessel function of the first kind of order zero in its integral form can be expressed by the following equation [8]: / 2 0 0 0 2 ( ) cos( sin ) exp( ) ( ) s i r j r d r i r                      (7) further simplification of the previous expression and separation of real and imaginary parts yields the following relation for scaled modified bessel function of the first kind of order zero: / 2 0 0 / 2 0 1 ( ) [exp( ) cos exp( ) cos ] [exp( ) sin exp( ) sin ] s i r a a b b d j a a b b d                         (8) where a and b are given by: (sin 1)a r     (9) (sin 1)b r     (10) the separation of the real and imaginary parts is performed because these integrals are solved separately using adaptive simpson numerical integration. numerical integration yields highly accurate results because the separated functions are simple to integrate as can be seen from fig. 2 and fig. 3 which depict how the real and imaginary parts of equation (8) behave on the integration interval for various values of parameter α∙r. fig. 2 real part of scaled modified bessel function of the first kind of order zero for various values of parameter α∙r numerical computation of cylindrical conductor internal impedance 85 fig. 
3 imaginary part of scaled modified bessel function of the first kind of order zero for various values of parameter α∙r integral representation of modified bessel function of the first kind of order one can be expressed by the following equation [8]: / 2 1 0 1 2 ( ) sin( sin ) sin exp( ) ( ) s i r j j r d r i r                         (11) as before, by simplification of expression (11) and separation of real and imaginary parts, the following relation for scaled modified bessel function of the first kind of order one can be obtained: / 2 1 0 / 2 0 1 ( ) [exp( ) cos exp( ) cos ] sin [exp( ) sin exp( ) sin ] sin s i r a a b b d j a a b b d                             (12) two integrals present in equation (12) are again solved numerically using adaptive simpson rule. fig. 4 and fig. 5 depict how the real and imaginary parts of equation (12) behave on the integration interval for various values of parameter α∙r. 86 s. vujević, d. lovrić fig. 4 real part of scaled modified bessel function of the first kind of order one for various values of parameter α∙r fig. 5 imaginary part of scaled modified bessel function of the first kind of order one for various values of parameter α∙r 3.2. computation of scaled modified bessel functions of the first kind for α∙r > 25 asymptotic approximation of scaled modified bessel function of the first kind can be expressed by [8]: 2 2 1 1 [4 (2 1) ] 1 ( ) ~ 1 ( 1) ; 0, 1 ! 
(8 )2 m s m t n m m n t i r n m rr                                 (13) numerical computation of cylindrical conductor internal impedance 87 from the previous expression asymptotic approximations of scaled modified functions of the first kind of orders zero and one can easily be deduced: 0 1 1 ( ) ~ 1 ( )2 na s m m m c i r rr                 (14) 1 1 1 ( ) ~ 1 ( )2 na s m m m d i r rr                 (15) where:                r r r r r na 10000for3 10000300for5 300100for7 10050for9 5025for12 (16) 2 1 ( 1) [ (2 1) ] 8 ! m m m m t c t m          (17) 1 2 1 ( 1) [4 (2 1) ] 8 ! m m m m t d t m           (18) the expressions for cm and dm are deduced from (13) and are also used for asymptotic approximations of modified bessel functions of the second kind. values of na have been determined through numerical analysis. 3.3. computation of scaled modified bessel functions of the second kind for α∙r ≤ 25 integral present in the expression for the modified bessel function of the second kind of order zero has an upper integral limit that tends to infinity [8]. fortunately, the integral function rapidly tends to zero as the function argument increases so the infinite limit can be substituted with a finite limit tm0 without loss of accuracy: 0 0 0 0 ( ) exp( cosh ) exp( cosh ) mt k r r t dt r t dt                (19)          r tm 65 1cosh 1 0 (20) now the scaled modified bessel function of the second kind of order zero can be deduced from (19): 0 0 0 0 0 ( ) exp( ) cos exp( ) sin m mt t s k r d d dt j d d dt            (21) (cosh 1)d r t    (22) 88 s. vujević, d. lovrić the two integrals present in equation (21) are again solved numerically using adaptive simpson rule with high accuracy. fig. 6 and fig. 
7 depict how the real and imaginary parts of equation (21) behave on the integration interval for various values of parameter α∙r. fig. 6 real part of scaled modified bessel function of the second kind of order zero for various values of parameter α∙r fig. 7 imaginary part of scaled modified bessel function of the second kind of order zero for various values of parameter α∙r similarly as for the modified bessel function of second kind of order zero, the integral present in the expression for modified bessel function of second kind of order one [8] can be replaced with a finite limit tm1: 1 1 0 0 ( ) exp( cosh ) cosh exp( cosh ) cosh mt k r r t t dt r t t dt                  (23) 25.001  mm tt (24) numerical computation of cylindrical conductor internal impedance 89 simplification of expression (23) yields the following expression for scaled modified bessel function of the second kind of order one: 1 1 1 0 0 ( ) exp( ) cos cosh exp( ) sin cosh m mt t s k r d d t dt j d d t dt              (25) as before the two integrals present in equation (25) are solved numerically using adaptive simpson rule with high accuracy. fig. 8 and fig. 9 depict how the real and imaginary parts of equation (25) behave on the integration interval for various values of parameter α∙r. fig. 8 real part of scaled modified bessel function of the second kind of order one for various values of parameter α∙r fig. 9 imaginary part of scaled modified bessel function of the second kind of order one for various values of parameter α∙r 90 s. vujević, d. lovrić 3.4. computation of scaled modified bessel functions of the second kind for α∙r > 25 asymptotic approximation of scaled modified bessel functions of the second kind is given by the following expression [8]: 2 2 1 1 4 (2 1) ( ) ~ 1 ; 0, 1 2 ! 
(8 ) m s t n m m n t k r n r m r                                  (26) from the previous expression asymptotic approximations of scaled modified functions of the second kind of orders zero and one can be deduced: 0 1 ( 1) ( ) ~ 1 2 ( ) mna s m m m c k r r r                 (27) 1 1 ( 1) ( ) ~ 1 2 ( ) mna s m m m d k r r r                 (28) where na is given by (16) whereas the coefficients cm and dm are computed from (17) and (18). 4. numerical examples the presented model for computation of pul internal impedance of tubular conductors was implemented into a fortran program. in order to ascertain the accuracy of obtained results and numerical stability of the model itself, a comparison is made with matlab which is used to compute pul internal impedance using the initial formula (1). both fortran and matlab employ double precision computing. it is important to note here that by using a program package which can employ more decimal places higher robustness of results would be achieved but at the expense of execution time. in the numerical example magnitudes and phase angles of z for a thin tubular copper conductor (internal radius ri = 3.8 mm and external radius re = 4 mm) are computed. the results of the comparison are presented in table 1 and table 2. table 1 comparison of magnitudes of tubular cylindrical conductor internal impedance. α∙re z (ω) proposed matlab 10 -2 0.003643657122067 0.003643657122067 10 -1 0.003643657122745 0.003643657122745 10 0 0.003643663902873 0.003643663902873 10 1 0.003710702668820 0.003710702668820 10 2 0.025181394368712 0.025181394368712 10 3 0.251267138203603 nan 10 5 25.12049572965153 nan 10 10 2512043.292911872 nan 10 15 251204329284.9072 nan numerical computation of cylindrical conductor internal impedance 91 table 2 comparison of phase angles of tubular cylindrical conductor internal impedance. 
α∙re φ (°) proposed matlab 10 -2 9.30814638898·10 -6 9.30814663686·10 -6 10 -1 9.30814668336·10 -4 9.30814668521·10 -4 10 0 9.30813198101·10 -2 9.30813198100·10 -2 10 1 9.164530090507745 9.164530090507741 10 2 44.85885196305934 44.85885196305934 10 3 44.98566888986672 nan 10 5 44.99985675983501 nan 10 10 44.99999999856761 nan 10 15 45.00000000000000 nan as can be seen from the results in table 1 and table 2, when computing formula (1) using matlab an underflow/overflow stability issue occurs for larger function parameters. these numerical instabilities are a direct consequence of the denominator consisting of subtraction of two products. when these products become identical up to the last decimal place that the program package can compute, the denominator becomes equal to zero thus resulting in a not a number value. the proposed numerical procedure successfully circumvents these issues as can be seen form the results of the analysis. 5. conclusion in this paper an algorithm for computation of pul internal impedance of cylindrical conductor under large complex function arguments is presented. the high accuracy and stability of the algorithm was achieved by selecting a formula for pul internal impedance which does not lead to undefined values for relatively small function arguments and by scaling the modified bessel functions present in this formula by an appropriate scaling factor. the developed algorithm represents a basis for computation of pul internal impedance of multilayered tubular cylindrical conductors which is in development. references [1] s. vujević, d. lovrić, "on the numerical computation of cylindrical conductor internal impedance for complex arguments of large magnitude", in proceedings of the extended abstracts of the 12th international conference on applied electromagnetics (пес 2015), niš, serbia, 2015, pp. (p1_1) 1-4. [2] p. sarajčev, s. 
vujević, "grounding grid analysis: historical background and classification of methods", international review of electrical engineering, vol. 4, pp. 670-683, 2009. [3] h. w. dommel, "emtp theory book, 2nd edition", microtran power system analysis corporation, 1992. [4] f. p. dawalibi, r. d. southey, "analysis of electrical interference from power lines to gas pipelines part i: computation methods", ieee transactions on power delivery, vol. 4, no. 3, pp. 1840-1846, 1989. [5] j. moore, r. pizer, "moment methods in electromagnetics techniques and applications", john wiley and sons, 2007. [6] j. a. stratton, "electromagnetic theory", john wiley & sons, 2007. [7] s. vujević, d. lovrić, v. boras, "high-accurate numerical computation of internal impedance of cylindrical conductors for complex arguments of arbitrary magnitude", ieee transactions on electromagnetic compatibility, vol. 56, pp. 1431-1438, 2014. [8] m. abramowitz, i. a. stegun, "handbook of mathematical functions with formulas, graphs, and mathematical tables", dover publications, 1964. instruction facta universitatis series: electronics and energetics vol. 28, n o 4, december 2015, pp. 571 584 doi: 10.2298/fuee1504571j qrs complex detection based ecg signal artefact discrimination  borisav jovanović 1 , vančo litovski 1 , milan pavlović 2 1 university of niš, the faculty of electronic engineering, niš, serbia 2 university of niš, the faculty of medicine, niš, serbia abstract. a new algorithm dedicated to electrocardiograph telemetry devices is proposed which evaluates the quality of electrocardiogram signals acquired in unsupervised environments, raises the certainty of the produced diagnoses, and accelerates protective actions when necessary. the proposed algorithm is utilized in conditions when electrocardiogram signals are highly susceptible to artefacts. the algorithm is based on novel qrs detection method and is used in microprocessor-based telemetry devices with reduced computing power. 
the algorithm has been tuned on publicly available databases. the results of its use are also presented.

key words: ecg telemetry systems, ecg recordings quality

1. introduction

the quality of electrocardiogram (ecg) data influences the diagnosis results [1, 2]. the same holds for data acquired by ecg telemetry devices [3]. a potential limitation of telemetry devices is their immunity to measurement artefacts and their ability to measure discernible qrs and p waveforms in the presence of noise [4]. the very fact that the patient performs the measurement without supervision often has a detrimental impact on the quality of the acquired data, even though the patient receives training in how to use the device [4].

the ecg telemetry device itself should be able to distinguish ecg signals from artefacts. moreover, the device has to work autonomously and transmit data to doctors when a cardiac disorder occurs. one approach fulfilling these requirements is proposed in this paper. our intention is to present a novel method for ecg signal quality assessment which is applied in the design of an ecg telemetry device. it prevents the transmission of ecg data with insufficient signal quality and also enhances the detection of various cardiac disorders. the algorithm is tuned on a publicly available ecg recording database [5] and validated on clinical ecg data.

received september 15, 2014; received in revised form january 13, 2015
corresponding author: borisav jovanović, university of niš, the faculty of electronic engineering, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: borisav.jovanovic@elfak.ni.ac.rs)

2. related work

the ecg signals acquired by ecg devices are not immune to noise contamination. techniques used for noise elimination are numerous [6], [7], [8].
the technique proposed in [6] is considered inappropriate for implementation in telemetry devices because it requires multiple ecg leads for noise discrimination, whereas the structure of commercial telemetry devices is simplified: in order to be wearable, many devices measure only a single ecg lead. the technique proposed in [7] finds the locations of the qrs complex and other ecg signal waves to perform adaptive signal filtering. the technique proposed in [8] uses a light-emitting-diode-based sensor to measure the amount of skin stretching and, based on the sensor outputs, performs the filtering process.

when ecg signals are contaminated by large artefacts, filtering methods are not sufficiently successful in recovering the underlying ecg signals. for example, artefacts related to body movement lie in the frequency range of the electrocardiogram and often have shapes similar to qrs complexes. therefore, the signal and noise components cannot be easily discerned [9]. nor can the signal-to-noise ratio be calculated for ecg signals, because signal components interpreted as ecg signal in one application may be interpreted as noise in another [9]. additionally, as a consequence of aging, the amplitude of the r wave can decrease down to noise levels.

instead of suppressing artefacts, it is preferable to quantify the quality of an ecg signal. the assessment of signal quality is not completely solved, especially for telemetry devices with limited processing power [9]. some approaches use additional devices to identify the sections of the ecg signal containing artefacts. for example, the work in [10] proposes an accelerometer sensor for movement detection. other methods rely exclusively on ecg recordings. the paper [11] presents a method which reduces the number of false alarms in coronary care units. the method for noise assessment presented in [12] requires two ecg leads.
first, the algorithm calculates the locations of the qrs complexes in the signal and then, using neural networks, determines whether the signal is a qrs complex or an artefact. in the algorithm described in [13], the researchers estimate the level of deviation of the suspected qrs complex from an averaged qrs complex. recently, several algorithms dedicated to smartphone platforms have been proposed. one such method [14] is based on [13] and focuses on the assessment of ecg signal quality when the signal is measured in unsupervised environments. after a filter is applied to remove gross movement artefacts, the signal quality is estimated in the remaining ecg signal [14].

qrs complex detection is a basic step in almost every ecg analysis procedure. the performance of subsequent ecg analyses strongly depends on the robustness of the qrs detection [15]. therefore, ecg quality assessment based on qrs detection seems to be a practical approach that is suitable for most subsequent ecg analysis algorithms [9], and it has been adopted here as well.

qrs detectors have been thoroughly studied and many methods have been proposed. the qrs detection method was first described by nygards and sornmo [16] and subsequently updated by the pan and tompkins method [17]. open-source code [18] has been taken as the starting point for the new method for identification of qrs waves proposed in this paper. this algorithm is a modification of the pan and tompkins method [17]; it uses adaptive thresholds and has a scan-back procedure which looks back in time if no beats have been detected during a certain period.

3. the operation of telemetry devices

telemetry devices have emerged as a technology with great promise for identifying cardiac disorders that are not easily discernible by other ecg diagnostic devices.
compared to other ecg recorders, ecg telemetry devices have the following advantages:
- using wireless communications for instantaneous reporting of cardiac disorders
- having longer recording periods, long enough to capture arrhythmic episodes and pauses in heart rhythm
- having very small size in order to reduce the obtrusiveness of the recording process [4]

a novel algorithm for ecg signal quality assessment has been implemented in an ecg telemetry device. the device uses five conductive electrodes (fig. 1). the arm electrodes (r and l) are placed near the right and left shoulders (fig. 1). the leg electrodes (f and n) are placed on the lower abdomen, towards the legs. one additional precordial electrode [19] is placed at one of the anatomically referenced landmarks on the anterior chest (v1–v6) given in fig. 1. the precordial landmark v5 is mostly used since it provides the best sensitivity for the detection of myocardial ischemia [1]. the following standard leads are acquired: i, ii, iii, avr, avl, avf and one precordial lead, v5.

fig. 1 the data processing chain in an ecg telemetry device

the processing blocks of an ecg telemetry device are depicted in fig. 1. the device receives analog ecg signals from the electrodes through conductive patient cables. a brief description of similar amplifier circuits operating in an electrocardiograph dedicated to stress testing is given in [20], [21]. the analog signals are converted into digital form at a data rate of 500 samples per second and then processed by digital filters. after the filtering operation has been completed, the rr (r wave to r wave) intervals, representing the time periods between two consecutive r waves, are calculated for the detection of the following cardiac disorders:
- tachycardia (disorders having high heart-rate rhythm) [22],
- bradycardia (low heart-rate rhythm heart condition),
- pauses (abnormal delays between qrs waves),
- arrhythmia (irregular changes in heart-rate rhythm) [23].
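the rr-interval computation mentioned above can be sketched in python as follows. this is only an illustration: the function names are hypothetical, and the r-wave positions are assumed to be already available as sample indices at the 500 samples per second rate stated above. the text does not give the decision thresholds used for tachycardia, bradycardia or pauses, so none are assumed here.

```python
def rr_intervals_ms(r_peak_samples, fs=500):
    """Return the RR intervals in milliseconds.

    r_peak_samples: increasing sample indices of detected R waves (assumed input).
    fs: sampling rate in samples per second (500 is the rate quoted in the text).
    """
    return [
        (later - earlier) * 1000.0 / fs
        for earlier, later in zip(r_peak_samples, r_peak_samples[1:])
    ]


def heart_rate_bpm(rr_ms):
    """Convert one RR interval in milliseconds to beats per minute."""
    return 60000.0 / rr_ms


if __name__ == "__main__":
    peaks = [0, 500, 1000, 1400]  # hypothetical R-wave positions
    for rr in rr_intervals_ms(peaks):
        print(rr, heart_rate_bpm(rr))
```

a device would compare such heart-rate values, or the raw rr intervals, against its own clinically chosen limits to flag the disorders listed above.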
the algorithm for ecg signal noise level estimation, which is implemented within the telemetry device, determines whether an ecg recording has acceptable quality for data transmission. the noise level estimation algorithm, in conjunction with the qrs complex detection method, is described in detail in the following sections.

4. the algorithm for peak detection and noise level estimation

4.1. qrs detection algorithm

the digital samples of the ecg signal are processed at a sampling frequency of 250 samples per second. the operations dedicated to heart beat detection and noise level estimation can be divided into the following groups:
- ecg signal pre-processing,
- peak detection,
- noise level estimation,
- qrs wave detection.

a brief description of the operations of the ecg pre-processing block is given in figure 2. the pre-processing block starts with band-pass filtering. two finite impulse response (fir) filters are used:
- a low-pass filter with a cut-off frequency of 15 hz,
- a high-pass filter with a cut-off frequency of 1 hz.

fig. 2 qrs wave detection operations

the absolute value of the signal's first derivative is calculated and the resulting signal is processed by a moving average filter. the average value of the input signal is found over a 96 ms time window. the duration of 96 ms was chosen after a thorough analysis using ecg data from the database [5]. we initially used the longer interval of 150 ms, which is used in [17], but changed the value to 96 ms because we found that a 96 ms window enables better detection of qrs complexes when ventricular tachycardia is present in the ecg signal. as a result of the pre-processing operations, each qrs complex produces a knoll at the output of the moving average block. this output is denoted x[n] in figure 2.

the peak detection and noise level estimation blocks (figure 2) are novel and are described in detail in the following sections. the block denoted as qrs detection rules in fig. 2 is taken from [18]. it classifies the peaks found by the peak detection block (fig. 2) as qrs complexes or noise, using the peak height, the peak location (relative to the last found qrs complex) and adaptive thresholds. if a peak is greater than the threshold, it is considered an r wave; otherwise it is classified as noise. in addition, all peaks that precede or follow larger peaks by less than 200 ms are discarded. the outcome of the detection rules block can be used further for beat classification, such as the detection of arrhythmias.

4.2. peak detection algorithm

the output signal of the pre-processing block, the digital signal x[n], enters the peak detection block at a data rate of 250 samples per second. examples of an ecg signal ecg[n] and the corresponding x[n] are depicted on the top and middle panels of figure 3, respectively. as one can see from fig. 3, the qrs complexes in ecg[n] produce knolls in the signal x[n]. the knolls produced by p and t waves are considerably smaller than those created by r waves and can therefore be successfully distinguished. the peak detection algorithm generates two different output signals:
- peak[n], which is used by the qrs detection rules block for the identification of r waves. peak[n] is presented on the top panel of fig. 3.
- pulse[n], which is used by the noise level estimation block for artefact detection and is presented on the middle panel of fig. 3.

fig. 3 set of signals illustrating the peak detection and noise estimation algorithm. the top panel presents signals ecg[n] and peak[n], the middle panel x[n], oldmin[n], oldmax[n], ave[n] and pulse[n], the bottom panel signal state[n].

the algorithm for peak detection uses adaptive thresholds which are adjusted depending on the local minimum and maximum values of the signal x[n]. the local minimum and maximum values are named oldmin and oldmax, respectively, and are presented on the middle panel of fig. 3. oldmax is the largest value of the signal x[n] found during the last rr interval (fig. 3); oldmin is the minimum value of x[n] found in the same interval. the algorithm calculates an additional variable ave (also shown on the middle panel of fig. 3), which approximates the mean value of the signal x[n].

the operation of the peak detection algorithm can be described by a finite state machine (fsm) comprising three states denoted s0, s1 and s2. the state transition diagram of the fsm is illustrated in figure 4. in the example ecg signal depicted in figure 3, the fsm states are shown on the bottom panel by the signal state.

fig. 4 the finite state machine states for peak detection

a qrs complex is recognized (a pulse is generated on the signal peak[n]) when the state transition s1->s2 occurs (fig. 4). during the isoelectric time interval of an ecg signal [1], which precedes the qrs complex, the fsm resides in state s0. the state changes from s0 to s1 (fig. 4) when the rising edge of the signal x[n] is detected. this transition is possible when the condition given by eq. (1) is met:

$$\big( ((x[n] - x[n-4]) > \delta) \wedge ((x[n-4] - x[n-8]) > \delta) \big) \wedge (x[n] > d) \tag{1}$$

the condition given by eq. (1) consists of two parts. the first part calculates the slope of the rising edge of the signal x[n] and checks whether it is greater than the constant value δ. δ is determined as the slope of the signal x[n] produced by an input ecg signal ecg[n] having the lowest slew rate. the lowest slew rate for an ecg is estimated by dividing the minimum r peak amplitude (the amplitude lies within the range of 0.5 mv to 5 mv) by the maximum rise time of the qr interval (the rise time lies within the range of 17.5 ms to 52.5 ms) [24]. this gives a minimum slew rate of 0.5 mv / 52.5 ms = 0.0095 v/s. the second part of eq. (1) checks whether the amplitude of x[n] exceeds the threshold d.
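the transition test of eq. (1) can be sketched in python as follows. this is a minimal illustration rather than the authors' implementation: the function name is hypothetical, x is assumed to be a list of pre-processed samples, and combining the two slope tests with a logical and is an assumption made here. the constants at the end reproduce the slew-rate arithmetic quoted above.

```python
def rising_edge_detected(x, n, delta, d):
    """Sketch of the S0 -> S1 test of eq. (1) at sample index n.

    x: sequence of pre-processed samples x[n]
    delta: minimum slope over a 4-sample step
    d: amplitude threshold
    """
    if n < 8:
        return False  # not enough history for the two 4-sample differences
    recent_slope = x[n] - x[n - 4]
    earlier_slope = x[n - 4] - x[n - 8]
    # Assumption: both slope tests must hold, together with the amplitude test.
    return recent_slope > delta and earlier_slope > delta and x[n] > d


# Worked arithmetic from the text: smallest R amplitude over longest QR rise time.
MIN_R_AMPLITUDE_V = 0.5e-3      # 0.5 mV
MAX_QR_RISE_TIME_S = 52.5e-3    # 52.5 ms
MIN_SLEW_RATE_V_PER_S = MIN_R_AMPLITUDE_V / MAX_QR_RISE_TIME_S  # about 0.0095 V/s

if __name__ == "__main__":
    samples = [0, 0, 0, 0, 5, 10, 15, 20, 40]  # hypothetical rising knoll
    print(rising_edge_detected(samples, 8, delta=3, d=30))
    print(MIN_SLEW_RATE_V_PER_S)
```

with a higher threshold d, as used during the refractory period, the same slope would no longer trigger the transition.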
note that r peaks should be greater than 0.15 mv for qrs detection. the threshold value d is defined as a function of the variables ave, oldmin and oldmax and of the time elapsed since the last qrs complex was detected, as described in table 1. for physiological reasons, the time interval after a detected qrs complex is referred to as the refractory period. the refractory period is longer than 280 ms [1] and consists of the absolute and the relative refractory periods. the interval from the beginning of the qrs complex to the apex of the t wave is referred to as the absolute refractory period, and the last half of the t wave as the relative refractory period. the absolute refractory interval starts when the fsm state changes from s2 to s0 and lasts for 200 ms. during this period, d is set to its maximum value. the absolute refractory period is followed by a relative refractory period lasting at least 80 ms, within which the next qrs complex is more likely to occur. the d value used during the relative refractory period was determined empirically after a thorough analysis of ecg data from the database [5] and depends on the value of the variable ave. as the relative refractory interval elapses, d is decreased according to the expressions in table 1. because the algorithm is computationally optimized for execution on low-power microcontrollers, the multiplicative constants in table 1 are chosen so that the multiplications can be replaced by combinations of faster shift, add and subtract operations.
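a minimal python sketch of how the threshold d of table 1 could be evaluated with integer shift, add and subtract operations is given below. the function name and the use of non-negative integer signal values are assumptions made for illustration; right shifts truncate, so the results are exact only when the operands are multiples of 4 or 8.

```python
def threshold_d(timer_ms, ave, oldmin, oldmax):
    """Threshold d as a function of the time since the last QRS complex (table 1).

    All signal values are assumed to be non-negative integers.
    """
    if timer_ms < 200:
        return oldmax - (oldmax >> 2)       # 0.75 * oldmax
    if timer_ms < 240:
        return ave + (ave >> 2) + oldmin    # 1.25 * ave + oldmin
    if timer_ms < 280:
        return ave + oldmin                 # 1.0 * ave + oldmin
    return ave - (ave >> 3) + oldmin        # 0.875 * ave + oldmin


if __name__ == "__main__":
    for t in (100, 220, 260, 300):
        print(t, threshold_d(t, ave=80, oldmin=8, oldmax=400))
```

each branch uses at most one shift and two additions or subtractions, which is the kind of saving the choice of constants is meant to allow.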
table 1 the threshold d value, used in relation (1) for the transition from state s0 to s1

timer value [ms]    threshold d
[0, 200)            oldmax*0.75
[200, 240)          ave*1.25+oldmin
[240, 280)          ave+oldmin
[280, +∞)           ave*0.875+oldmin

the state s1 of the fsm covers the rising edge and the peak of the knoll produced by a qrs complex (fig. 3). when the amplitude of x[n] becomes smaller than the threshold given by relation (2), the fsm changes its state from s1 to s2 (fig. 4).

$$x[n] - oldmin < 0.75\,(max - oldmin) \tag{2}$$

the variable max in eq. (2) represents the local maximum of x[n], calculated during the state s1. at the moment of the transition from s1 to s2, the variable oldmax is updated with max (fig. 3). the pulse on the signal peak[n] is generated when the fsm state changes from s1 to s2 (fig. 3). peak[n] is later taken as an input by the detection rules block, which classifies detected peaks as either qrs waves or noise.

during the state s2 the falling edge of x[n] occurs. the state changes from s2 to s0 if relation (3) is met:

$$(x[n] - oldmin < 0.25\,(oldmax - oldmin)) \vee (x[n] < 2 \cdot oldmin) \tag{3}$$

the transition from s2 to s1 (fig. 4) is also possible; it is caused by artefacts in the ecg signal. the state s2 changes into s1 when relation (4) is fulfilled:

$$(x[n] - min > 0.25\,(oldmax - oldmin)) \wedge (x[n] - min > 10) \tag{4}$$

the variable min in eq. (4) stores the local minimum of the signal x[n], calculated during the state s2. ave is updated when the state changes from s2 to s0, so that it tracks the maximum of the signal x[n] given by oldmax:

$$ave = 0.875 \cdot ave + 0.125 \cdot oldmax \tag{5}$$

when the fsm is in s0, the value of ave is approximately decreased by half after every time period equal to 2 seconds.
this is achieved by reducing ave after every interval of 120 ms (30 ecg signal samples):

$$\mathrm{ave} = \mathrm{ave}\,(1 - 2^{-5}) = 0.96875\,\mathrm{ave} \qquad (6)$$

the multiplicative constant in equation (6) is chosen so that the multiplication can be replaced by shift, add and subtract operations. when the fsm changes from s2 to s0, oldmin is initialized with x[n]; at the transition from s2 to s1 the variables oldmin and max are initialized with the contents of the variables min and x[n], respectively (fig. 3).

4.3. noise level estimation algorithm

the part of the algorithm dedicated to noise estimation is described as follows. the signal pulse[n], which is used for ecg signal artefact detection, is updated at the fsm state transitions and is depicted on the middle panel of figure 3. the state transitions at which non-zero values of pulse[n] are generated, and their amplitudes, are defined in table 2. pulse[n] is set to the value of one of the variables min, oldmin or ave.

table 2 the fsm state transitions at which pulses are produced on pulse[n] and the corresponding values
state transition     amplitude
s0 -> s1             min
s1 -> s2             ave
s2 -> s1             oldmin, if (ave≤20)
s2 -> s1             min, if (ave>20)

the signal pulse[n] has to be normalized. because the algorithm is executed by a microcontroller with limited processing capabilities, which does not calculate with fixed-point numbers, the value pulse[n] is first multiplied by the constant c=32, which is arbitrarily selected, and the product is then divided by the signal ave. the normalized signal pulse2[n] is described by eq. (7).

$$\mathrm{pulse2}[n] = 32\,\mathrm{pulse}[n] / \mathrm{ave} \qquad (7)$$

the result of the noise level estimation block is the signal noisy_interval[n], which indicates whether the ecg signal quality is "acceptable" or "unacceptable". the block sets noisy_interval[n] to 1 when an interval is identified as noisy.
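the integer arithmetic behind eqs. (6) and (7) can be illustrated with a short sketch. it assumes integer-valued signals and truncating division, which the text does not state explicitly; the function names are hypothetical.

```python
C_SHIFT = 5  # 2**5 = 32, the constant c used for normalization


def decay_ave(ave: int) -> int:
    """One decay step of eq. (6): ave*(1 - 2**-5), using a shift and a subtraction."""
    return ave - (ave >> C_SHIFT)


def normalize_pulse(pulse: int, ave: int) -> int:
    """Eq. (7): scale pulse by 32 before dividing by ave.

    Multiplying first keeps resolution that an integer division of
    pulse by ave alone would lose (assumption: truncating division).
    """
    if ave <= 0:
        raise ValueError("ave must be positive")
    return (pulse << C_SHIFT) // ave
```

with ave=320 one decay step gives 310, which equals 0.96875*320; for other values the shifted term is truncated, so the result is only approximately 0.96875*ave.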
the borders of noisy intervals are estimated from the amplitudes of the non-zero values of pulse2[n]. a noisy interval begins when two consecutive non-zero values greater than the threshold value th are detected. the noisy interval ends when, in a sequence of five adjacent non-zero values of pulse2[n], no two consecutive values are greater than th. the th value is related to the value of the constant c and was determined after an extensive analysis of ecg data from the database [5]. the threshold th=12 is used for the identification of noisy intervals in which qrs complexes are often missed or falsely detected by the qrs detector. the artefacts within the detected intervals include motion and poor electrode contact related artefacts. with the lower value th=6 the sensitivity of the noise level estimation algorithm is increased, and intervals are classified as noisy even when artefacts do not influence the qrs detection but may distort the detection of p or t waves. the noise detected using th=6 originates mostly from 50 hz-related artefacts and electromyography signals. an example of ecg signals containing noisy segments is given in figure 5. the ecg signal ecg[n] is depicted on the top panel of figure 5. the middle panel presents the signal pulse2[n]. the signals noisy_interval[n] and state[n] are depicted on the bottom panel of figure 5.

fig. 5 the identification of noisy segments in ecg signals. the top panel presents ecg[n], the middle panel pulse2[n] and the bottom panel state[n] and noisy_interval[n].

fig. 6 noisy segments of ecg signals having ectopic beats. the top panel presents ecg[n], the middle panel pulse2[n] and the bottom panel the signals state[n] and noisy_interval[n].

another example of ecg signals which are corrupted by patient's movement is depicted on the top panel in figure 6.
the example shows the strength of the peak processing algorithm. in the example, regular beats are mixed with ectopic beats having significantly larger and wider qrs waves than regular qrs complexes. the middle panel presents the signal pulse2[n]. the signal noisy_interval[n] is shown on the bottom panel. despite the changes in signal morphology between normal and ectopic ecg beats, all qrs complexes are correctly identified, as can be observed from the signal state[n] shown on the bottom panel. moreover, the baseline drift does not influence the detection of qrs complexes.

4. the algorithm validation on public ecg record database

the algorithm's performance was validated on the physionet challenge database [5]. the database comprises one thousand 12-lead ecg recordings. the ecg signals are sampled at 500 hz, with a resolution of 16 bits per sample and 5 μv per bit. each ecg record in the database is 10 seconds long. the ecgs were recorded by nurses, technicians and other volunteers with different amounts of training. the recordings were then interpreted by a group consisting of three cardiologists, five ecg analysts, ten people without prior ecg reading experience and five persons with some previous ecg reading experience [25]. in order to estimate the quality of all ecg recordings in the set, each recording was presented to the annotators and scored. 775 of the records were considered acceptable, 223 records unacceptable and 2 records undefined. it is worth mentioning that records with artefacts were still labelled as acceptable if the annotators judged the record quality to be satisfactory for medical personnel to make an accurate diagnosis [25]. the described algorithm for qrs complex detection and noise quality estimation is implemented in program code written in matlab [26]. one thousand ecg recordings from the database [5] were analysed.
the noisy signal segments, obtained using different threshold values th, are detected for all twelve standard leads of an ecg recording. the following seven measures are extracted for each ecg lead:
1. the portion of "weak" signals in an ecg recording, i.e. signals whose r wave amplitudes are smaller than 0.15 mv. r waves with such low amplitudes are assumed to be undetectable.
2. the portion of flat line segments in an ecg recording.
3. the portion of segments in a recording corrupted by high vertical spikes, which disable the qrs detection in the intervals following the spikes.
4. the ratio of the duration of noisy segments to the duration of the ecg recording. the noisy segments, which include motion and poor electrode contact related artefacts, are determined by noisy_interval[n]=1, calculated using the threshold value th=12.
5. the ratio of the duration of noisy segments to the duration of the ecg recording, where the noisy segments are determined by noisy_interval[n]=1 calculated using th=6. the detected noise intervals include artefacts related to 50 hz interference and electromyography signals. the ecg signal segments already covered by the previous measure are excluded.
6. the portion of noisy segments of the ecg signal caused by noise sources including the previous two measures.
7. the rr interval variability, calculated as the square root of the mean of the squared differences between adjacent rr intervals, divided by the mean value of the rr intervals:

$$\mathrm{variability}\,(\%) = \frac{\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(rr_{i+1} - rr_{i}\right)^{2}}}{\frac{1}{n}\sum_{i=1}^{n} rr_{i}} \qquad (8)$$

an artificial neural network is used to learn the relationship between the calculated measures and the classification of ecg records as "acceptable" or "unacceptable". the total number of measures used in the training process for an ecg record is 84, consisting of seven distinct measures for each standard ecg lead.
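measure 7 above can be sketched in a few lines. this is a minimal illustration under stated assumptions: the function name is hypothetical, the rr intervals are given in any consistent unit, the mean of the squared differences is taken over all adjacent pairs, and the mean rr interval over all supplied intervals.

```python
import math


def rr_variability_percent(rr):
    """RR variability as in eq. (8): RMS of successive differences over mean RR, in percent."""
    if len(rr) < 2:
        raise ValueError("at least two rr intervals are required")
    # Differences between adjacent RR intervals.
    diffs = [later - earlier for earlier, later in zip(rr, rr[1:])]
    # Square root of the mean of the squared differences.
    rms_diff = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    mean_rr = sum(rr) / len(rr)
    return 100.0 * rms_diff / mean_rr
```

a perfectly regular rhythm, such as three intervals of 800 ms, gives a variability of zero, while alternating intervals of 800 ms and 900 ms give a value above 10 %.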
the calculated measures were fed into a multilayer perceptron (mlp) artificial neural network. a back-propagation neural network (bpnn) with a three-layer feed-forward structure was used. the first layer is the input layer, which has 84 neurons as inputs (seven inputs for each of the twelve standard ecg leads). the second layer, called the hidden layer, has 10 neurons. one hidden layer has been proven to be sufficient [27]. the number of hidden neurons was obtained by a procedure based on methods given in the literature [27], [28]. the output layer has only one neuron, providing a quality estimate of the ecg record ("acceptable" or "unacceptable"). in this study, the logistic function is used as the activation function for the hidden neurons. the weight and bias values in the bpnn are optimized using the levenberg-marquardt algorithm [29]. after the training process of the neural network had been completed, we used the same training set to test the effectiveness of the neural network. the neural network correctly identified 715 of the 1000 recordings as being "acceptable" and 208 recordings as being "unacceptable". let tp, tn, fp and fn denote true positives, true negatives, false positives and false negatives, respectively. tp counts correctly classified "unacceptable" records, tn correctly classified "acceptable" records, fp incorrectly classified "unacceptable" records and fn incorrectly classified "acceptable" records. the following results are obtained: tp=208, tn=715, fp=17 and fn=58. the results are characterized by the sensitivity, specificity and accuracy measures, described by the following equations:

$$\mathrm{sensitivity}\,(\%) = \frac{tp}{tp + fn} \qquad (9)$$

$$\mathrm{specificity}\,(\%) = \frac{tn}{tn + fp} \qquad (10)$$

$$\mathrm{accuracy}\,(\%) = \frac{tp + tn}{tp + tn + fp + fn} \qquad (11)$$

sensitivity is equal to 78.19%, specificity is 96.31% and classification accuracy is 92.48%.

5. discussion

the performance of several qrs detection algorithms, including the method from [18], is evaluated in [30] and [31]. the method has been developed and improved over a period of roughly 15 years, and it is stated that the performance of the classification software is as good as or better than the performance reported for other qrs detection algorithms [30]. the same algorithm is also used as the starting variant of the qrs detector in [31], where it has been adapted to operate in environments with high noise and frequent signal losses. the proposed algorithm significantly improves on the algorithm described in [18], especially when processing noisy ecg signals. the novel algorithm identifies qrs waves more efficiently than the algorithm described in [18], regardless of the qrs complex amplitude level, width and morphology. for example, the algorithm described in [18] suffered from false qrs detections when qrs waves are wide, which was confirmed by simulations and also evaluated on real clinical data. the novel algorithm does not produce false detections for the wide qrs complexes present in ectopic beats. furthermore, it discriminates well the p and t waves of the ecg signal, which may have amplitudes as high as r waves. the qrs detector [18] should be more robust when detecting signals with low amplitudes and frequent artefacts [31]; for a highly noisy signal it fails to detect the peak location accurately [30]. one of the contributions of the proposed method is that it recognizes qrs complexes better than the algorithm presented in [18], particularly when the qrs complexes have small amplitudes in the range from 0.15 mv to 0.5 mv. the algorithm can operate at rates of up to 250 heart beats per minute. moreover, the ecg signal segments with a large amount of artefacts are clearly identified by the noise level estimation block, which makes it possible to reject false arrhythmia and tachycardia detections.
one of the results of the evaluation of the proposed algorithm on clinical ecg data is presented in figure 7. the figure shows the detection of qrs complexes in the presence of ventricular tachycardia. the ecg signal, presented on the top panel, was acquired by an ecg telemetry device implementing the algorithm we propose. the qrs complexes are indicated by the signal state, shown on the bottom panel of figure 7. the signal noisy_interval[n] is equal to zero.

fig. 7 the detection of ventricular tachycardia (vt). the top panel presents the signal ecg[n], the bottom panel state[n] indicating qrs detections and noisy_interval[n]

several studies have been reported on the topic of ecg signal quality assessment [32], [33], [25]. these studies return a simple binary score for an ecg record, which does not quantify the amount of artefacts but only decides whether an ecg record is of acceptable quality. these ecg signal quality algorithms were validated on the physionet challenge database [5], comprising one thousand 12-lead ecg recordings. we have obtained an accuracy of 92.48%, which is comparable to the values found in [32], [33] and [25]. these accuracy values are obtained for the same input ecg dataset. for example, the method from [32] has an accuracy of 93.2%. the method from [33] is a variant of [22] and has an accuracy of 92.6%. the other studies reported accuracy values from 83.3% to 92.5% [25]. we want to emphasize the potential role of the proposed quality assessment algorithm in future ecg diagnostic devices. in particular, we assume that the algorithms we propose could improve the performance of ecg telemetry devices. additionally, the quality assessment method could be used in every other kind of ecg recorder to, for example, help inexperienced nurses and technicians to record high quality ecg records.
besides, in contrast to the algorithms [32], [33], which are computationally demanding and were originally conceived for smart phone applications, the method that we propose is computationally optimized for embedding in low-power microprocessor-based ecg diagnostic devices.

6. conclusion

the work presented in this paper demonstrates a framework for combining qrs detection and ecg signal quality assessment. such an approach exploits the covariant information of the noise and the relevant ecg data measured in unsupervised ecg signal acquisition environments. a more accurate false alarm reduction system has been developed, which is a must in novel wearable ecg telemetry devices. the results of this work clearly indicate that the ecg record signal quality can be estimated using a qrs complex detection method which is based on the detection of heart refractory time intervals and is additionally supported by artificial neural network techniques. it is shown that the algorithm has almost the same performance as the known state-of-the-art algorithms, with a considerable reduction in computational complexity. ecg telemetry devices can therefore be greatly improved by the use of the proposed ecg signal analysis routines.

references

[1] r. m. rangayyan, biomedical signal analysis – a case-study approach, wiley-ieee press, new york, 2002. [2] g. clifford, f. azuaje and p. mcsharry, advanced methods and tools for ecg data analysis, artech house, inc. norwood, ma, 2006. [3] a. müller, w. scharner, t. borchardt, w. och and h. korb, "reliability of an external loop recorder for automatic recognition and transtelephonic ecg transmission of atrial fibrillation", journal of telemedicine and telecare, vol. 15, no. 8, pp. 391-391, 2009. [4] s. lobodzinski and m. laks, "new devices for very long-term ecg monitoring", cardiology journal, vol. 19, no. 2, pp. 210–214, 2012.
[5] physionet 2011 physionet/computing in cardiology challenge 2011 http://www.physionet.org/challenge/2011 [6] f. la foresta, n. mammone and f. morabito, "neural nets", lecture notes in computer science, vol. 3931, pp. 78–82, berlin: springer, 2006. [7] n. v. thakor and y. s. zhu, "applications of adaptive filtering to ecg analysis: noise cancellation and arrhythmia detection", ieee trans. biomed. eng., vol. 38, issue 8, pp. 785–794, 1991. [8] y. liu and m. g. pecht, "reduction of skin stretch induced motion artefacts in electrocardiogram monitoring using adaptive filtering", in proceedings of 28th annu. int. conf. of the ieee engineering in medicine and biology conf., pp. 6045–6048, new york city, usa, 2006. [9] d. hayn, b. jammerbund and g. schreier, "qrs detection based ecg quality assessment", physiol. meas., vol. 33, no. 9, pp. 1449–1461, iop publishing, september 2012. [10] y. kishimoto, y. kutsuna and k. oguri, "detecting motion artefact ecg noise during sleeping by means of a tri-axis accelerometer", in proceedings of 29th annu. int. conf. of the ieee engineering in medicine and biology society, pp. 2669–2672, 2007. [11] j. allen and a. murray, "assessing ecg signal quality on a coronary care unit", physiol. meas., vol. 17, issue 4, pp. 249–258, 1996. [12] y. kigawa and k. oguri, "support vector machine based error filtering for holter electrocardiogram analysis", in proceedings of 27th annu. int. conf. of the ieee engineering in medicine and biology society, pp. 3872–3875, 2005. [13] q. li, r. g. mark and g. d. clifford, "robust heart rate estimation from multiple asynchronous noisy sources", physiol. meas., vol. 29, no. 1, pp. 15–32, 2008. [14] s. j. redmond, y. xie, d. chang, j. basilakis and n. h. lovell, "electrocardiogram signal quality measures for unsupervised telehealth environments", physiol. meas., vol. 33, no. 9, pp. 1517-1533, iop publishing, september 2012. [15] l. sornmo and p.
laguna, bioelectrical signal processing in cardiac and neurological applications, academic press series in biomedical engineering, new york: academic, 2005. [16] m. e. nygårds, l. sörnmo, "delineation of the qrs complex using the envelope of the ecg", journal of medical and biological engineering and computing, vol. 21, issue 5 , pp 538-547, 1983. [17] j. pan and w. j. tompkins 1985 "a real-time qrs detection algorithm", ieee trans. biomed. eng. vol. 32, pp. 30–36, 1985 [18] p. s. hamilton, "open source ecg analysis", in proceedings of computers in cardiology conf., sept. 2002, pp. 101 – 104. [19] d. dubin, "rapid interpretation of ekg‟s" hong kong: cover, 2000. [20] b. jovanović, m. damnjanović and m. pavlović,"12-channel pc-based electrocardiograph", electronics, vol. 10, no.2, university of banja luka, december, 2006, pp.44-48. [21] b. jovanović, m. damnjanović, "novel pc-based cardiac stress test system", in proceedings of lv conf. etran, banjavrućica, bosnia and herzegovina, 06.06.-09.06., 2011, el 2.4. [22] q. li, c. rajagopalan c and g. d. clifford, "ventricular fibrillation and tachycardia classification using a machine learning approach". ieee trans on biomed eng., 2013. [23] v. fuster, l. e. ryden, d. s. cannom, h. j. crijns, a. b. curtis, k. a. ellenbogen, j. l. halperin, j. y. le heuze, g. n. kay, j. e. lowe, s. bertil olsson, e. n. prystowsky, j. l.tamargo, s. wann "acc/aha/esc 2006 guidelines for the management of patients with atrial fibrillation", circulation, vol.8, no. 9, pp. 651-745, 2006. [24] d. prutchi and m. norris, design and development of medical electronic instrumentation: a practical perspective of the design, construction, and test of medical devices, john wiley & sons, inc., hoboken, new jersey, 2004. [25] i. silva, g. moody and l. celi, "improving the quality of ecgs collected using mobile phones: the physionet/computing in cardiology challenge 2011" int. conf. on computing in cardiology, pp. 273– 276, 2011. 
[26] matlab version 7.6.0.324 natick, massachusetts: the mathworks inc., 2008. [27] t. masters, practical neural network recipes in c++, academic press, san diego, 1993. [28] g.-b. huang and h. a. babri, "upper bound on the number of hidden neurons in feed-forward networks with arbitrary bounded nonlinear activation function", ieee trans. on neural networks, vol. 9, pp. 224-228, 1998. [29] litovski, v., zwolinski, m., "vlsi circuit simulation and optimization", chapman and hall, london, 1997. [30] s. mohan, g.v. kadambi, v.k. reddy, m.d. deshpande, "development of an industry standard qrs detection algorithm for automated ecg analysis", sastech journal, vol. 7, no. 1, april 2008. [31] j. oster, j. behar, r. calloca, q. li, q, li and g. clifford, "open source java-based ecg analysis software and android app for atrial fibrillation screening", in proc. of int. conf. on computing in cardiology, vol. 40, pp. 731-734, 2013. [32] h. xia, g. garcia, j. mcbride, a. sullivan, t. de bock, j. bains, d.wortham and x. zhao, "computer algorithms for evaluating the quality of ecgs in real time", in proc. of int. conf. on computing in cardiology, pp. 369–72, 2011. [33] g. clifford, d. lopez d, q. li and i. rezek, "signal quality indices and data fusion for determining acceptability of electrocardiograms collected in noisy ambulatory environments", in proc. of int. conf. on computing in cardiology, pp. 285–288, 2011.

instruction facta universitatis series: electronics and energetics vol. 29, n o 2, june 2016, pp. 177-191 doi: 10.2298/fuee1602177m
artifical neural networks in rf mems switch modelling
zlatica marinković 1 , vera marković 1 , tomislav ćirić 1 , larissa vietzorreck 2 , olivera pronić-rančić 1
1 university of niš, faculty of electronic engineering, niš, serbia
2 tu münchen, lehrstuhl für hochfrequenztechnik, münchen, germany

abstract.
the growing number of applications of rf mems switches in modern communication systems has created an increased need for accurate and efficient models of these switches. artificial neural networks have emerged as a fast and efficient modelling tool providing accuracy similar to that of standard commercial simulation packages. this paper gives an overview of the applications of artificial neural networks in modelling of rf mems switches, in particular of the capacitive shunt switches, proposed by the authors of the paper. models for the most important switch characteristics in the electrical and mechanical domains are considered, as well as inverse models aimed at determining the switch bridge dimensions for specified requirements on the switch characteristics.

key words: actuation voltage, artificial neural networks, resonant frequency, rf mems, switch

1. introduction

modern communication systems rely to a great extent on new high-performance rf and microwave devices and components that enable miniaturization, in line with the demand to integrate more and more functionalities while reducing the overall size of the system. rf mems (micro electro mechanical systems) are novel components which are able to meet these requirements [1]. rf mems components and devices exploit mechanically movable parts and thus enable a change of topology. one of the first examples, developed in 1995 [2, 3], was an electrostatically actuated rf mems shunt switch where the ground of the coplanar waveguide is connected by a very thin membrane. if a dc voltage is applied between ground and signal line, the membrane is pulled down by the electrostatic force and thus short-circuits the signal line. since these first developments, many different components based on mems switches have been introduced, like phase shifters, reconfigurable antennas, matching networks, switch matrices, tunable filters, etc.
[4-9].

received september 29, 2015
corresponding author: zlatica marinković, university of niš, faculty of electronic engineering, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: zlatica.marinkovic@elfak.ni.ac.rs)

rf mems switches allow multiband operation due to their ability to reconfigure their topology. they also have several advantages compared to their electronic counterparts, like pin diode or mesfet switches [10]-[12], such as low insertion loss, high isolation, small size, high linearity and excellent compatibility with microwave and mm-wave circuits. because of these significant advantages, rf mems switches are of growing interest for use in various communication systems, primarily in satellite and mobile communication systems. current research on rf mems switches is mostly concentrated on various new structures, new materials or processes in devices [13]-[16], while the optimization of mems devices has not been studied sufficiently. a standard approach to obtaining rf mems switch electrical characteristics is to use full-wave numerical methods in electromagnetic (em) simulators. however, as it is also necessary to determine mechanical characteristics, simulations in mechanical simulators should be included in the design process as well. although these methods provide the necessary accuracy, they are generally limited to a single analysis for a specific structure, and their computational overhead (running time, memory) becomes extensive when a number of simulations with different mesh properties are needed [17]. an alternative approach to modelling and designing rf mems devices is based on artificial neural networks (anns). anns can be considered a powerful fitting tool, i.e. they have the ability to learn the dependence between two sets of data and to generalize, which means to give a correct response to inputs not used in the learning process.
they give a response almost instantaneously, while retaining the accuracy of the standard em and mechanical simulators. owing to these abilities, anns have found many applications in different fields, among others in rf and microwaves. this paper is devoted to applications of anns for modelling and design of rf mems switches. as far as rf mems devices are concerned, anns have been applied as a modelling tool for about a decade [17-25]. they have mostly been applied for modelling the device membrane characteristics. several publications refer to neural modelling of rf mems switches [17, 20, 23-25]. in most of these applications, anns were exploited to model the dependence of the switch scattering (s-) parameters and/or the switch resonant frequency on the dimensions of the membrane and on frequency. almost all of them refer to switches which have a simple rectangular membrane. in this paper a capacitive switch with a more complex membrane is considered. the paper is organized as follows. after the introduction, a short description of neural networks is given in section ii. the capacitive rf mems switch modeled in this work is described in section iii. ann models of switch characteristics, as well as the corresponding numerical results and discussions, are presented in section iv. section v contains the description of rf mems switch inverse ann models, the modelling results and the discussion. finally, the main concluding remarks are given in section vi.

2. artificial neural networks

all neural models presented in this work are based on multilayer perceptron (mlp) neural networks. an mlp ann consists of basic processing elements (neurons) grouped into layers: an input layer, an output layer, and several hidden layers [26]. each neuron is connected to all neurons from the adjacent layers. neurons from the same layer are not mutually connected. each neuron is characterized by a transfer function and each connection is weighted.
the anns exploited in this work have a linear transfer function for the neurons of the input and output layers and a sigmoid transfer function for the hidden neurons. an ann learns the relationship between sets of input-output data (training sets) by adjusting the network connection weights and the thresholds of the activation functions. there are a number of algorithms for training anns. the most frequently used are the backpropagation algorithm and its modifications, such as the levenberg-marquardt algorithm [26], which is used in the present work. once trained, the network provides a fast response for various input vectors without changes in its structure and without additional optimizations. the most important feature of anns is their generalization ability, i.e., the ability to generate the correct response even for input parameter values not included in the training set. the generalization ability has qualified anns as an efficient tool for modelling in the field of rf and microwaves [26-36]. for example, anns can be used as an alternative to time-consuming electromagnetic simulations [26-28, 30] or as an alternative to the conventional modelling of microwave devices [25, 27, 30, 32, 35, 36]. in the present work, the accuracy of ann learning and generalization was tested by calculating the average test error (ate), the worst case error (wce) and the pearson product-moment correlation coefficient (r) [26]. since the optimal number of hidden neurons cannot be determined a priori, for each model developed in this work anns with different numbers of hidden neurons in one or two hidden layers were trained. the network with the best test results was chosen as the final model.
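the forward pass of such an mlp can be sketched as follows. this is a minimal illustration, not the authors' implementation: the function names and the example weights are hypothetical, the input layer passes its values on unchanged (linear transfer function), the hidden layers use a sigmoid and the output layer is linear, as described above.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic sigmoid used here as the hidden-neuron transfer function."""
    return 1.0 / (1.0 + math.exp(-z))


def mlp_forward(inputs, layers):
    """Evaluate an MLP given as a list of (weights, biases) pairs, one per layer.

    weights[j][i] connects input i of a layer to its neuron j. All layers
    except the last one use the sigmoid; the last (output) layer is linear.
    """
    activations = list(inputs)  # linear input layer: values pass through
    last = len(layers) - 1
    for index, (weights, biases) in enumerate(layers):
        sums = [
            sum(w * a for w, a in zip(row, activations)) + b
            for row, b in zip(weights, biases)
        ]
        activations = sums if index == last else [sigmoid(s) for s in sums]
    return activations


# Hypothetical 2-2-1 network with made-up weights, for illustration only.
layers_2_2_1 = [
    ([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0]),  # hidden layer: 2 neurons
    ([[1.0, 1.0]], [0.5]),                   # output layer: 1 neuron
]
output = mlp_forward([350.0, 75.0], layers_2_2_1)
```

training, for instance with the levenberg-marquardt algorithm, would adjust the entries of the weight and bias lists; the sketch only shows how a trained network maps an input vector to its output.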
when reporting the ann structure of the final models, the following notation is used in this paper: an ann denoted by n-h1-h2-m has n input neurons, h1 and h2 neurons in the first and second hidden layer, respectively, and m output neurons; an ann denoted by n-h1-m has n input and m output neurons and only one hidden layer with h1 neurons.

3. modeled device

the considered device is a cpw (coplanar waveguide) based rf mems capacitive shunt switch (see fig. 1) fabricated at fbk in trento in an 8-layer silicon micromachining process [37]. the signal line below the bridge is made of a thin aluminum layer. the dc actuation pads, made of polysilicon, are placed adjacent to the signal line. the bridge is a thin membrane connecting both sides of the ground. the inductance of the bridge and the fixed capacitance between the signal line and the bridge form a resonant circuit to ground, whose resonance frequency can be changed by varying the length of the fingered part, lf, close to the anchors, and the length of the solid part, ls. at series resonance the circuit acts as a short circuit to ground. in a certain frequency band around the resonance frequency the transmission of the signal is suppressed. the bridge can be closed by applying an actuation voltage of around 45 v. the actuation voltage is defined as the voltage applied to the dc pads at the instant when the bridge comes down and touches the cpw centerline, i.e. the pull-in voltage (vpi). it is strongly related to the switch features and mechanical/material properties, such as the dc pad size and location, the bridge spring constant and residual stress, the bridge shape and supports, etc. the finger parts (corresponding to lf) in fig. 1 are used to control vpi. if the finger parts are long compared to the other parts, the bridge becomes flexible and the switch is easily actuated by a low vpi.
however, this increases the risk of self-actuation or rf hold-down when the switch delivers a high rf power. conversely, with short finger parts, the switch needs a high vpi to be actuated. therefore, the bridge part lengths (lf, ls) should be carefully determined considering the delivered rf power and a feasible dc voltage supply [1].

fig. 1 top view of the realized switch (a) and schematic (b) of the cross-section with 8 layers in fbk technology [37]

4. ann models of switch characteristics

as mentioned in the introductory section, simulations of rf mems switch characteristics in standard em and mechanical simulators are time consuming, which is especially important when simulations have to be repeated during the design and optimization of the switch characteristics. similarly to the approaches presented in the literature, the authors of this paper have developed neural models of the switch electrical and mechanical characteristics, in particular neural models of the s-parameters, the resonant frequency and the actuation voltage, as shown in figs. 2 and 3. an rf mems switch is a symmetric reciprocal device, i.e., s22 = s11 and s12 = s21, therefore only the parameters s11 and s21 were modeled. the ann model of each modeled s-parameter consists of two anns, both having three inputs corresponding to the bridge lateral dimensions, ls and lf, and the frequency, f, whereas the outputs correspond to the magnitude and phase of the modeled parameter, |sij| and ∠sij, respectively. in order to train the anns, it is necessary to simulate the s-parameters for several bridge sizes (i.e. for different values of the bridge lateral dimensions) in a full-wave em simulator. a properly trained ann gives responses which are very close to the response of the full-wave em simulator but in a shorter time, as the ann response is almost instantaneous.
by using the developed model, analysis and optimization of the switch dimensions can be done much faster than in the standard way. as far as the resonant frequency is concerned, the ann model consists of one ann with two inputs corresponding to ls and lf, and one output corresponding to the resonant frequency (see fig. 2b). the data for the ann training consist of several resonant frequencies corresponding to different bridge sizes, and can be acquired by determining the resonant frequency in a full-wave em simulator, or by using the neural model of the parameter s21. like the above-mentioned model, this model enables a quick estimation of the switch resonant frequency and optimization of the dimensions to obtain the desired resonant frequency. the model of the switch actuation voltage has the same structure as the resonant frequency model: it has two inputs and one output, corresponding to the bridge lateral dimensions and the actuation voltage, respectively, as shown in fig. 3. as in the previous cases, the training data were obtained in a standard simulator able to calculate the switch mechanical properties. the gain in simulation time is the most significant in this case, as simulations in a commercial mechanical simulator took much more time than the simulations of the electrical parameters in a full-wave em simulator.

fig. 2 ann models of the switch electrical characteristics: (a) s-parameters; (b) resonant frequency

fig. 3 ann model of the switch actuation voltage

4.1. numerical results

all ann models described above were developed for the considered switch [38, 39]. for the development of the models of the s-parameters and resonant frequency, the s-parameters for several different combinations of the switch lateral dimensions were simulated in the full-wave em simulator ads momentum [40], and the corresponding resonant frequencies were determined.
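
one way to determine a resonant frequency from sampled s-parameters can be sketched as follows, assuming that the resonance is taken as the frequency at which |s21| is minimal (this matches the suppressed transmission around resonance described in section 3, but it is an assumption about how the values were actually extracted). the notch-shaped curve is a made-up stand-in for simulated or ann-generated data, the grid of 401 points from 0 to 40 ghz assumes a start at 0 ghz, and the function names are hypothetical.

```python
def frequency_grid(f_max_ghz=40.0, points=401):
    """Uniform frequency grid from 0 to f_max_ghz (start at 0 GHz is assumed)."""
    step = f_max_ghz / (points - 1)
    return [i * step for i in range(points)]


def toy_s21_db(f_ghz, f_res_ghz=12.0, width_ghz=1.5, depth_db=40.0):
    """Hypothetical notch-shaped |S21| in dB; NOT data from the paper."""
    return -depth_db / (1.0 + ((f_ghz - f_res_ghz) / width_ghz) ** 2)


def resonant_frequency(freqs_ghz, s21_db):
    """Return the frequency at which |S21| (in dB) reaches its minimum."""
    idx = min(range(len(freqs_ghz)), key=lambda i: s21_db[i])
    return freqs_ghz[idx]


freqs = frequency_grid()
s21 = [toy_s21_db(f) for f in freqs]
f_res = resonant_frequency(freqs, s21)
```

with a grid step of 0.1 ghz the estimate is limited to that resolution; interpolating around the minimum would be a natural refinement.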
the data referring to 23 differently sized bridges were used for the model development, whereas the data referring to 17 bridges different from the training ones were used for validation of the models. the s-parameters used for the model development were simulated at 401 frequency points up to 40 ghz. for each ann model, anns with different numbers of hidden neurons were trained, and the anns listed in table 1 were chosen as the final models.

table 1 final ann models for the switch electrical and mechanical characteristics with the number of training samples

parameter    ann model    number of training samples
|s11|        3-8-6-1      23 x 401
∠s11         3-10-10-1    23 x 401
|s22|        3-8-8-1      23 x 401
∠s22         3-10-10-1    23 x 401
fres         2-5-1        23
vpi          2-8-1        30

validation of the ann models has shown that they produce values which are very close to the values obtained by using the em simulator. as an illustration, fig. 4 shows the insertion loss (|s21| in db) and the return loss (|s11| in db) for the device having the bridge with lateral dimensions ls = 350 µm and lf = 75 µm. a very good agreement of the parameters generated by the developed ann models with the em simulations can be observed. this is especially important, as the data referring to this device were not included in the training set, showing that the anns achieved good generalization. as far as the resonant frequency is concerned, the maximum difference between the modeled and the reference values for the test devices is less than 1%, which can be considered very good. another illustration of the achieved accuracy of the resonant frequency ann model is the scatter plot given in fig. 5, showing very good agreement between the values obtained by the ann model and the reference values calculated in the em simulator for six considered test devices.
more details about the development and validation of the ann models of the electrical characteristics can be found in [38, 39].

fig. 4 insertion and return losses for the tested device (ls = 350 µm and lf = 75 µm) [plot: s11 (db) and s21 (db) versus f (ghz), ann compared with em simulator]

fig. 5 resonant frequency scatter plot for six test devices [plot: fres (ghz) from the em simulator versus fres (ghz) from the ann model]

the data used for training and validation of the neural model for the switch actuation voltage, shown in fig. 3, were obtained in the mechanical simulator comsol multiphysics [41]. in total, 39 data samples (pairs of lateral dimensions and the corresponding actuation voltages) were used, of which 30 for the ann training and 9 for the ann model validation. the best ann has one hidden layer with 8 neurons, as listed in table 1. the validation results shown in table 2 confirm that this model also has very good generalization abilities, as the maximum error for the test devices not used for the ann training is around or less than 1%, i.e., less than 0.5 v. more details about the development and validation of this model can be found in [42, 43].

table 2 actuation voltage for the test devices

ls (µm)   lf (µm)   vpi_target (v)   vpi_sim (v)   abs. error (v)   rel. error (%)
150       25        55.6             55.58         0.02             0.01
150       65        43               43.45         0.45             1.10
250       25        33.3             33.16         0.14             0.40
250       65        28.2             28.21         0.01             0.03
350       10        25.2             25.32         0.12             0.47
350       25        23.8             23.74         0.06             0.25
350       65        21.1             20.99         0.11             0.54
350       75        20.5             20.45         0.35             0.17
450       65        16.9             16.80         0.10             0.57

4.2. discussion

as already mentioned, the developed models of the rf mems switch characteristics give responses instantaneously.
having in mind that they give the responses with an accuracy close to that of the calculations in standard em and/or mechanical simulators, they are very convenient for further analyses and optimizations of the considered switch. the mathematical expressions describing the developed anns can be easily implemented within the standard simulators by means of blocks dealing with variables and expressions, or can be used separately in different (mathematical) software packages. as an example, optimization of the bridge lateral dimensions for given requirements on the s-parameters in a desired frequency band takes less than a second when performed by using the neural model implemented in the ads circuit simulator, which is significantly faster than the optimization in the full-wave simulator (ads momentum), which takes around 2 hours [39]. this advantage is even more evident in the case of the mechanical characteristics modelling. namely, the calculation of the switch actuation voltage versus the bridge lateral dimensions (plotted in fig. 6) takes a few seconds in the matlab environment by using the developed ann model, whereas the mechanical simulator requires several tens of minutes to determine the actuation voltage for a single combination of the bridge geometrical parameters. optimization of the switch bridge dimensions based on the ann model takes several seconds, unlike optimizations in the mechanical simulators, which last for hours.

fig. 6 actuation voltage calculated by using the ann model [42] [plot: vpi (v) versus ls (µm) and lf (µm)]

the developed ann models can be efficiently used to study the behaviour of the device when the bridge size is changed, either intentionally, with the aim of optimizing the device characteristics, or due to deviations of the dimensions in the device fabrication process.
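
a study of the second kind, where both dimensions deviate within the fabrication tolerance, can be sketched as a simple sweep around the nominal geometry. the direct model below is a made-up smooth function standing in for a trained ann (it does not reproduce the paper's model or its numbers), the function names are hypothetical, and the ±3 µm range with a 1 µm step follows the tolerance analysis reported in this work.

```python
def toy_fres_ghz(ls_um, lf_um):
    """Hypothetical direct model fres(ls, lf); a placeholder, NOT the trained ANN."""
    return 4000.0 / (ls_um + 0.5 * lf_um)


def max_deviation(direct_model, ls_um, lf_um, tol_um=3, step_um=1):
    """Largest change of the model output when ls and lf both vary within +/- tol_um.

    Returns (absolute deviation, relative deviation in percent).
    """
    nominal = direct_model(ls_um, lf_um)
    offsets = range(-tol_um, tol_um + 1, step_um)
    worst = max(
        abs(direct_model(ls_um + d_ls, lf_um + d_lf) - nominal)
        for d_ls in offsets
        for d_lf in offsets
    )
    return worst, 100.0 * worst / abs(nominal)


worst_ghz, worst_percent = max_deviation(toy_fres_ghz, 200, 20)
```

the sweep needs only 49 model evaluations per nominal geometry, which is negligible for an ann but would be costly if each point required a full simulation.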
the analyses done in [44] for the resonant frequency and in [45] for the actuation voltage show that when the dimension changes are within the fabrication tolerances (which are up to ±3 µm for the considered device) the changes in the actuation voltage and the resonant frequency can be considered acceptable. for instance, the maximum changes of the resonant frequency for several arbitrarily chosen devices when both dimensions were changed in the range ±3 µm, with a step of 1 µm, are shown in table 3 [44]. it can be seen that the maximum deviation of the resonant frequency is 1.5%, with a maximum absolute change of 0.24 ghz.

table 3 resonant frequency test results for simultaneous changes of ls and lf up to ±3 µm

ls (µm)   lf (µm)   max |Δfres| (ghz)   max |Δfres/fres| (%)
200       20        0.24                1.5
200       50        0.20                1.4
200       80        0.17                1.3
300       20        0.13                1.1
300       50        0.12                1.0
300       80        0.11                0.1
450       20        0.08                0.8
450       50        0.07                0.7
450       80        0.06                0.6

5. rf mems switch inverse ann models

as illustrated in the previous section, the developed neural models of the electrical or mechanical characteristics of rf mems switches can significantly speed up the analysis and design of these switches. however, the time needed for the optimization of the switch dimensions can be further reduced if inverse neural models of the switch characteristics versus dimensions are used. namely, it would be very useful to develop models that could predict both lateral dimensions of the switch bridge for a given resonant frequency and/or actuation voltage. however, this is not possible, as the inverse functions of the resonant frequency and actuation voltage dependence on the bridge dimensions are not unique, which means that several combinations of the lateral dimensions result in the same resonant frequency or actuation voltage. the authors of the paper therefore proposed inverse models where one of the dimensions is fixed, and the other is determined by an ann, as shown in fig. 7 [39, 43, 46, 47].
fig. 7 inverse ann models for the switch electrical (or mechanical) characteristics: (a) ls (b) lf

namely, the proposed inverse ann models of the switch electrical (or mechanical) characteristics consist of anns with two input neurons, one corresponding to the fixed lateral dimension (lf in fig. 7a and ls in fig. 7b) and the other to fres in the case of the electrical inverse model, or to vpi in the case of the mechanical inverse model, and one output neuron corresponding to the dimension being determined (ls in fig. 7a and lf in fig. 7b). however, during the design of an rf mems switch one may need to optimize the dimensions to meet the desired resonant frequency and actuation voltage simultaneously. that could be complex, as the em simulations and the simulations of the mechanical characteristics are performed in different software packages. therefore, the authors proposed inverse electro-mechanical models, which calculate one of the lateral dimensions given both the resonant frequency and the actuation voltage. for the same reasons as in the case of the separate electrical and mechanical inverse models, it is not possible to develop a model that would determine both dimensions at the same time. therefore, the exploited anns have three inputs and one output, as shown in fig. 8.

fig. 8 inverse electro-mechanical ann models: (a) ls (b) lf

for both types of inverse models, the separate electrical and mechanical models and the electro-mechanical model, the data for training the anns are obtained by calculating the resonant frequency and/or the actuation voltage for several combinations of the lateral dimensions. this can be done in standard simulators, or alternatively by the previously developed neural models aimed at calculating the resonant frequency and actuation voltage for the given dimensions (let us call them the direct models).
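
the generation of inverse training data from a direct model can be sketched as follows for the electrical inverse model of fig. 7a, whose inputs are (lf, fres) and whose target is ls. the direct model is a made-up placeholder for a trained ann or a simulator call, the dimension grids are illustrative assumptions rather than the values used in this work, and the function names are hypothetical.

```python
def toy_direct_fres(ls_um, lf_um):
    """Hypothetical direct model fres(ls, lf) in GHz; a placeholder, NOT the trained ANN."""
    return 4000.0 / (ls_um + 0.5 * lf_um)


def inverse_training_set(direct_model, ls_values, lf_values):
    """Build samples for the inverse model of fig. 7a.

    Each sample is ((lf, fres), ls): the fixed dimension and the direct-model
    response form the inputs, the dimension to be determined is the target.
    """
    samples = []
    for ls in ls_values:
        for lf in lf_values:
            fres = direct_model(ls, lf)
            samples.append(((lf, fres), ls))
    return samples


# Illustrative grids in micrometres (assumed, not taken from the paper).
ls_grid = range(150, 451, 100)
lf_grid = range(25, 76, 25)
training_samples = inverse_training_set(toy_direct_fres, ls_grid, lf_grid)
```

because the direct model is cheap to evaluate, the grids can be made much denser in regions where the inverse mapping is hard to learn, without any additional simulator runs.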
once the inverse models are trained, the desired dimension is determined directly, without optimization.

5.1. numerical results

the proposed inverse ann models were developed for the rf mems switch considered in this work. owing to the behaviour of the inverse characteristics of the considered device, it turned out that the data used for the development of the direct models of the resonant frequency and actuation voltage were not sufficient to train the inverse ann models with satisfactory accuracy, as the modelling error reached tens of percent in some parts of the input space [39, 43, 46]. therefore, to acquire more training data in these critical parts of the input space, the developed direct neural models were used for generating more training samples. the anns showing the best performance for each model are listed in table 4, together with the number of training samples. to illustrate the accuracy of the inverse modelling, fig. 9 compares the determined lf with its target value in the form of scatter plots. fig. 9a refers to the electrical inverse model and fig. 9b to the mechanical inverse model. it can be observed that the deviation of the lf value is within ±3 µm, indicating very good prediction abilities of the proposed models. similar results were obtained for the prediction of ls.

table 4 final ann models for the switch electrical and mechanical characteristics with the number of training and test samples

inverse model       ann model    number of training samples
electrical lf       2-15-15-1    814
electrical ls       2-15-15-1    814
mechanical lf       2-25-25-1    961
mechanical ls       2-4-6-1      961
electro.mech. lf    3-10-20-1    4131
electro.mech. ls    3-20-10-1    4131

fig. 9 inverse modelling of lf: (a) electrical inverse model, (b) mechanical inverse model [plots: lf (µm) from the inverse ann model versus target lf (µm)]

the inverse electro-mechanical models gave accuracy similar to that of the separate electrical and mechanical models, which can be seen from the following analysis, where the inverse electro-mechanical model for determining the fingered part length (shown in fig. 8b) is considered. the influence of the determined lf on the resonant frequency (desired value 12 ghz) and the actuation voltage (desired value 25 v) was calculated and is shown in tables 5 and 6, respectively [48]. namely, for ls values from 280 to 340 µm and the desired fres and vpi, the value of lf is calculated (lf_inv). further, the calculated lf value is used to determine the resonant frequency (table 5) or the actuation voltage (table 6) with the direct ann models for fres and vpi, respectively, and these values are compared with the desired values. the corresponding absolute errors (ae) and relative errors (re) are given in tables 5 and 6 as well. it can be seen that the relative errors do not exceed 2%, which can be considered good.

table 5 rf mems switch inverse modelling results: fres [48]

ls [µm]   fres [ghz]   vpi [v]   lf_inv [µm]   fres_dir [ghz]   fres ae [ghz]   fres re [%]
280       12           25        71.350        11.886           0.114           0.95
290       12           25        61.703        11.859           0.141           1.20
300       12           25        52.228        11.827           0.173           1.40
310       12           25        43.426        11.789           0.211           1.80
320       12           25        35.355        11.755           0.245           2.00
330       12           25        26.917        11.884           0.116           0.97
340       12           25        16.590        12.009           0.009           0.07

table 6 rf mems switch modelling results: vpi [48]

ls [µm]   fres [ghz]   vpi [v]   lf_inv [µm]   vpi_dir [v]   vpi ae [v]   vpi re [%]
280       12           25        71.350        25.169        0.169        0.68
290       12           25        61.703        25.143        0.143        0.57
300       12           25        52.228        25.120        0.120        0.48
310       12           25        43.426        25.082        0.082        0.33
320       12           25        35.355        25.056        0.056        0.22
330       12           25        26.917        25.129        0.129        0.52
340       12           25        16.590        25.380        0.380        1.50

5.2. discussion

the results shown above confirm the accuracy of the determination of the lateral dimensions of the bridge for given requirements on the resonant frequency and/or the actuation voltage. the deviation in the dimension prediction is of the order of the fabrication tolerances, which also confirms the accuracy of the modelling. the developed inverse models provide a very fast, straightforward calculation of the bridge dimensions. unlike the direct models, which are valid over the whole range of dimensions used for the ann model development, the inverse models are valid only for input combinations that are physically meaningful, even though they return a response for any inputs falling between the minimum and maximum input values used for training. this means that before an input combination is chosen for an inverse model, it should be checked whether the combination is physically meaningful. this can be efficiently checked from two-dimensional plots of input dimension versus resonant frequency (and/or actuation voltage, depending on the inverse model used), which can be plotted by using the direct ann models [49, 50]. another challenge in bridge dimension optimization is how to determine the bridge lateral dimensions when the total length of the bridge is given. since the desired dependence is not unique, as is the case for all the mentioned inverse models, such a model cannot be realized directly with anns. however, the developed ann-based direct and inverse models can be used as a solution. interested readers can find more details in [49-51].

6. conclusion

rf mems switches have seen increasing application in the field of microwave control; therefore, the design of circuits containing rf mems switches requires reliable models.
artificial neural networks have appeared as an efficient alternative to standard commercial full-wave em simulators and mechanical simulators providing similar accuracy but with significantly lower computational cost. this paper gives an overview of the neural models of capacitive shunt rf mems switches. despite the fact that the development takes a certain time, as it is necessary to obtain the training data by using the standard simulation methods and to train the ann models (a few minutes per a trained ann), efficiency and speed in giving response make the ann models very convenient for modelling and optimization of electrical and mechanical characteristics of rf mems switches. acknowledgement: the authors would like to thank fbk trento, thales alenia italy, cnr rome and university of perugia, italy for providing rf mems data. this work was funded by the bilateral serbian-german project "smart modeling and optimization of 3d structured rf components" supported by the daad foundation and serbian ministry of education, science and technological development. the work was also supported by the projects tr32052 and iii-43012 of the serbian ministry of education, science and technological development. references [1] g. m. rebeiz, rf mems theory, design, and technology. new york: wiley, 2003. [2] c.l. goldsmith, z. yao, s. eshelman, and d. denniston, "performance of low-loss rf mems capacitive switches," ieee microwave guided wave lett., vol. 8, pp. 269-271, august 1998. [3] g. m. rebeiz, j. b. muldavin, "rf mems switches and switch circuits," ieee microw. mag., vol. 2, no. 4, pp. 59-71, december 2001. [4] s. a. figur, e. meniconi, b. schoenlinner, u. prechtel, r. sorrentino, l. vietzorreck, v. ziegler, "design and characterization of a simplifed planar 16 x 8 rf mems switch matrix for a geostationary data relay", in proceedings of european microwave conference, 2012. [5] s. montori, e. chiuppesi, p. farinelli, l. marcaccioli, r. v. gatti, r. 
sorrentino, "w-band beamsteerable mems-based reflectarray", international journal of microwave and wireless technologies, vol. 3, no. 05, pp. 521-532, october 2011. [6] g. m. rebeiz, k. entesari, i. reines, s. j. park, m. a. el-tanani, a. grichener, a. r. brown, "tuning in to rf mems", ieee microw. mag., vol. 10, no. 6, pp. 55-72, june 2009. [7] m. daneshmand, r. r. mansour, "rf mems satellite switch matrices", ieee microw. mag., vol. 12, no. 5, pp. 92-109, may 2011. [8] i. jokić, m. frantlović, z. đurić, m. dukić, "rf mems/nems resonators for wireless communication systems and adsorption-desorption phase noise", facta universitatis – series electronics and energetics, vol. 28, no. 3, pp. 345-381, 2015. [9] a. napieralski, c. maj, m. szermer, p. zajac, w. zabierowski, m. napieralska, ł. starzak, m. zubert, r. kiełbik, p. amrozik, z. ciota, r. ritter, m. kamiński, r. kotas, p. marciniak, b. sakowicz, k. grabowski, w. sankowski, g. jabłoński, d. makowski, a. mielczarek, m. orlikowski, m. jankowski, p. perek, “recent research in vlsi, mems and power devices with practical application to the iter and dream projects”, facta universitatis – series electronics and energetics, vol. 27, no. 4, pp. 561-588, 2014. [10] m. lazic, m. skender, s. radosevic, “generating driving signals for three phases inverter by digital timing functions”, facta universitatis – series electronics and energetics, vol. 13, no. 3, pp. 353-364, 2000. [11] a. n. al-rabadi, “carbon nano tube (cnt) multiplexers for multiple-valued computing”, facta universitatis – series electronics and energetics, vol. 20, no. 2, pp. 175-186, 2007. [12] j. vobecký, “the current status of power semiconductors”, facta universitatis – series electronics and energetics, vol. 28, no. 2, pp. 193-203, 2015. [13] m. lamhamdi, p. pons, u. zaghloul, l. boudou, f. coccetti, j. guastavino, y. segui, g. papaioannou, r.
plana “voltage and temperature effect on dielectric charging for rf mems capacitive switches reliability investigation” microel. reliab., vol. 48 pp. 1248-1252, sept. 2008. [14] m. matmat, k. koukos, f. coccetti, t. idda, a. marty, c. escriba, j-y. fourniols, d. esteve, “life expectancy and characterization of capacitive rf mems switches”, microelectron. reliab., vol. 50, no. 9–11, pp. 1692-1696, 2010. [15] l. michalas, m. koutsoureli, e. papandreou, a. gantis, g. papaioannou “a mim capacitor study of dielectric charging for rf mems capacitive switches”, facta universitatis – series electronics and energetics, vol. 28, no. 1, pp. 113-122, 2015. [16] m. koutsoureli, l. michalas, g. papaioannou, “assessment of dielectric charging in micro-electro-mechanical system capacitive switches”, facta universitatis – series electronics and energetics, vol. 26, no. 3, pp. 239245, 2013. [17] y. lee, d. s. filipovic, "combined full-wave/ann based modelling of mems switches for rf and microwave applications", in proceedings of the ieee antennas and propagation society international symposium, 2005, pp. 85-88. [18] y. lee, y. park, f. niu, b. bachman, k. c. gupta, d. filipovic, "artificial neural network modelling of rf mems resonators", int. j. rf microw. c e, special issue: rf applications of mems and micromachining, vol. 14, no. 4, pp. 302–316, july 2004. [19] v. litovski, m. andrejevic, m. zwolinski, "behavioural modelling, simulation, test and diagnosis of mems using anns," in proceedings of the ieee international symposium on circuits and systems iscas 2005, 2005, pp. 5182 5185. [20] y. lee, d. s. filipovic, "ann based electromagnetic models for the design of rf mems switches", ieee microw. compon. lett,, vol. 15, no. 11, pp. 823-825, november 2005. [21] y. lee, y. park, f. niu, d. filipovic, "design and optimization of rf ics with embedded linear macromodels of multiport mems devices," int. j. rf microw c e, vol. 17, no. 2, pp. 196-209, march 2007. [22] g. h. yang, q. wu, j. 
h. fu, k. tang, j. x. he, "an efficient modelling technique for rf mems phase shifter based on rbf neural network," in proceedings of the international conference on microwave and millimeter wave technology icmmt 2008, 2008, pp. 475-478. [23] y. mafinejad, a. z. kouzani, k. mafinezhad, "determining rf mems switch parameter by neural networks", in proceedings of the ieee region 10 conference tencon 2009, 2009, pp. 1-5. [24] y. gong, f. zhao, h. xin, j. lin, q. bai, "simulation and optimal design for rf mems cantilevered beam switch", in proceedings of the international conference on future computer and communication fcc '09, 2009, pp. 84-87. [25] s. suganthi, k. murugesan, s. raghavan, "neural network based realization and circuit analysis of lateral rf mems series switch," in proceedings of the international conference on computer, communication and electrical technology icccet 2011, 2011, pp. 260 265. [26] q. j. zhang, k. c. gupta, neural networks for rf and microwave design, artech house, 2000. [27] c. christodoulou, m. gerogiopoulos, applications of neural networks in electromagnetics, artech house, 2000. [28] p. burrascano, s. fiori and m. mongiardo, "a rewiew of artificial neural network applications in microwave computer-aided design", int j rf microw c e, vol. 9, no. 3, pp. 158-174, 1999. [29] z. marinković, v. marković, "temperature dependent models of low-noise microwave transistors based on neural networks", int. j. rf microw. c e, vol. 15, no. 6, pp. 567-577, 2005. [30] z. marinković, g. crupi, a. caddemi, and v. marković, "comparison between analytical and neural approaches for multibias small signal modelling of microwave scaled fets", microw. opt.techn. lett., vol. 52, no. 10, pp. 2238-2244, 2010. [31] j. e. rayas-sanchez, "em-based optimization of microwave circuits using artificial neural networks: the state-of-the-art", ieee trans. microw. theory techn., vol. 52, no. 1, pp. 420–435, 2004. [32] h. kabir, y. cao, and q. 
zhang, “advances of neural network modelling methods for rf/microwave applications,” applied computational electromagnetics society journal, vol. 25, no. 5, pp. 423-432, 2010. [33] z. marinković, g. crupi, d. schreurs, a. caddemi, v. marković, "microwave finfet modelling based on artificial neural networks including lossy silicon substrate", microel. eng., vol. 88, no. 10, pp. 3158-3163, 2012. [34] m. agatonović, z. marinković, v. marković, "application of anns in evaluation of microwave pyramidal absorber performance", applied computational electromagnetics society journal, vol. 27, no. 4, pp. 326-333, 2012. [35] z. marinković, o. pronić-ranĉić, v. marković, "small-signal and noise modelling of class of hemts using knowledge-based artificial neural networks", int. j. rf microw. c e, vol. 23, no. 1, pp. 34-39, 2013. [36] z. marinković, n. ivković, o. pronić-ranĉić, v. marković, a. caddemi, "analysis and validation of neural approach for extraction of small-signal models of microwave transistors", microelectron. reliab., vol. 53, no. 3, pp. 414–419, march 2013. [37] s. di nardo, p. farinelli, f. giacomozzi, g. mannocchi, r. marcelli, b. margesin, p. mezzanotte, v. mulloni, p. russer, r. sorrentino, f. vitulli, l. vietzorreck, "broadband rf-mems based spdt", in proceedings of the european microwave conference, 2006. [38] z. marinković, t. kim, v. marković, m. milijić, o. pronić-ranĉić, l. vietzorreck, "rf mems modelling with artificial neural networks", in proceedings of the memswave 2013, 2013. [39] z. marinković, t. kim, v. marković, m. milijić, o.
pronić-ranĉić, l. vietzorreck, "artificial neural network based design of rf mems capacitive shunt switches", submitted to aces applied computational electromagnetics society journal [40] advanced design system 2009, agilent technologies [41] comsol multiphysics 4.3, comsol, inc. [42] zlatica marinković, ana aleksić, tomislav ćirić, olivera pronić-ranĉić, vera marković, tomislav ćirić, "analysis of rf mems capacitive switches by using neural model of actuation voltage", 2nd international conference on electrical, electronic and computing engineering (icetran 2015), silver lake, serbia, june 8-11, 2015, pp. mti2.3.1-5. [43] t. ćirić, z. marinković, t. kim, l. vietzorreck, o. pronić-ranĉić, m. milijić, v. marković, "ann approach for mechanical characteristics modelling of rf mems capacitive switches," submitted to journal of electrical engineering-elektrotechnicky casopis [44] z. marinković, t. ćirić, v. đorċević, o. pronić-ranĉić, t. kim, m. milijić, v. marković, l. vietzorreck, "ann approach for the analysis of the resonant frequency behavior of rf mems capacitive switches", in proceedings of the first international conference on electrical, electronic and computing engineering icetran 2014, 2014, pp. mti2.1.1-5 [45] t. ćirić, z. marinković, o. pronić-ranĉić, v. marković, l. vietzorreck , "ann approach for analysis of actuation voltage behavior of rf mems capacitive switches", in proceedings of the 12th international conference on advanced technologies, systems and services in telecommunications telsiks 2015, 2015. [46] z. marinković, t. ćirić, t. kim, l. vietzorreck, o. pronić-ranĉić, m. milijić, v. marković, "ann based inverse modelling of rf mems capacitive switches", in proceedings of the 11th conference on telecommunications in modern satellite, cable and broadcasting services telsiks 2013, 2013, pp. 366-369. [47] l. vietzorreck, m. milijić, z. marinković, t. kim, v. marković, o. 
pronić-ranĉić, "artificial neural networks for efficient rf mems modelling", in proceedings of the xxxi ursi general assembly and scientific symposium ursi gass, 2014, pp. 1-3. [48] t. ćirić, z. marinković, t. kim, l. vietzorreck, o. pronić-ranĉić, m. milijić, v. marković, "ann based inverse electro-mechanical modelling of rf mems capacitive switches", in proceedings of the xlix scientific conference on information, communication and energy systems and technologies icest 2014, 2014, pp. 127-130. [49] z. marinković, a. aleksić, o. pronić-ranĉić, v. marković, l. vietzorreck, "analysis of rf mems capacitive switches by using switch em ann models", accepted for telfor journal, in press [50] z. marinković, a. aleksić, t. ćirić, o. pronić-ranĉić, v. marković, l. vietzorreck, "inverse electromechanical ann model of rf mems capacitive switches applicability evaluation", in proceedings of the xlx scientific conference on information, communication and energy systems and technologies icest 2015, 2015. [51] z. marinković, a. aleksić, t. ćirić, o. pronić-ranĉić, v. marković, t. ćirić, "analysis of rf mems capacitive switches by using neural model of actuation voltage", in proceedings of the 2nd international conference on electrical, electronic and computing engineering icetran 2015, 2015, pp. mti2.3.1-5.

facta universitatis series: electronics and energetics vol. 29, no. 3, september 2016, pp. 475-487 doi: 10.2298/fuee1603475s

bridging the snmp gap: simple network monitoring the internet of things

mihajlo savić university of banja luka, faculty of electrical engineering, republic of srpska, bosnia and herzegovina

abstract. things that form the internet of things can vary in every imaginable aspect.
from the simplest devices with barely any processing and memory resources, through networking devices such as switches and routers that handle communication, to powerful servers that provide the needed back-end resources in cloud environments, all are needed for real-world implementations of the internet of things. monitoring of the network and server parts of the infrastructure is a well-known area with numerous approaches that enable efficient monitoring. the most prevalent technology is snmp, which forms part of the ip stack and is as such universally supported. on the other hand, the “things” domain is evolving very fast, with a number of competing technologies used for communication and monitoring. for small, constrained devices, the two most promising protocols are coap and mqtt. combined, they cover a wide range of communication needs of resource-constrained devices, from a simple messaging system to one that enables connection to the restful world. in this paper we present a possible solution to bridge the gap in monitoring by enabling snmp access to monitoring data obtained from constrained devices that cannot feasibly support snmp or are not intended to be used in such a manner.

key words: iot, monitoring, snmp, coap, mqtt

1. introduction

internet of things (iot) may mean many different things to many different people, but few would disagree that in order to achieve the full potential of a smart environment based on iot one needs to be able to fully monitor all of the things that make iot possible. although there is a wealth of monitoring products, as well as a comparable number of standards and platforms that go hand in hand with them, there is one standard that has been around for a long time, is implemented in almost all networking devices and is even a part of the set of protocols that enable modern networking to exist.
received june 30, 2015; received in revised form november 12, 2015
corresponding author: mihajlo savić, university of banja luka, faculty of electrical engineering, patre 5, 78000 banja luka, republic of srpska, bosnia and herzegovina (e-mail: badaboom@etfbl.net)

as is often the case, with age it gained robustness and reliability, but lost some of its appeal to newer generations and younger monitoring systems, though one would be hard pressed to find a monitoring product that does not support it. it is also important to note that monitoring is never easy, and in production, tried and true solutions have proven themselves worthy throughout history. to monitor the iot we need to monitor each and every device that makes it up or provides services to it, from the smallest and simplest single-function sensors to virtualized back-end services needed to transform raw data into usable information. currently, simple network management protocol (snmp) is the protocol that enables uniform monitoring of all parts of the iot infrastructure, save for the simplest of devices. as even those devices need to be monitored, this paper presents one possible snmp-based solution for end-to-end monitoring. the solution covers one possible use of snmp in monitoring iot infrastructures, enabling monitoring of iot devices alone as well as of larger heterogeneous infrastructures that can also contain complex iaas entities that provide services to iot devices.

2. simple network management protocol

simple network management protocol (snmp) is part of the internet protocol suite as defined by the internet engineering task force (ietf) [1], the organization in charge of defining the standards and protocols that provide the basis for the existence and exchange of data over the internet. snmp defines a set of standards for network management that include an application protocol, a database schema, and the definition of data sets.
relatively few of what we generally consider to be standards are in fact full standards, and this only adds weight to snmp and its use in the management and monitoring areas. snmp is commonly used in its default configuration, which consists of at least one computer or other device that has an administrative role (master) and a group of managed networked devices that are controlled by the master device. every managed device (slave) runs a software component called an agent that is in charge of communication with the master node. agents provide access to various system variables of the managed device (e.g. system identification, available resources, resource consumption, etc.), but also provide a mechanism to control the device by setting specified variables to desired values (e.g. bringing network interfaces up or down, changing their addresses, etc.). data transfer is typically done over user datagram protocol (udp), with default port number 161 on the agent side and 162 on the master side. communication can be initiated by the master through get operations for accessing the data and set operations for modifying the data, as well as by the managed device through trap or inform operations used to send data to the management node.

2.1. versions of snmp protocol

the snmp standard has so far been defined in three versions, described in the following text. snmp version 1 (snmpv1) was defined by rfc documents 1155, 1157 and 1215. although it has “historic” status today, it is still widely used, as it is supported by almost all network equipment manufacturers for nearly all networking devices. its security model leaves a lot to be desired, as it is based on so-called “community” strings that can be seen as a shared secret or access password. the biggest issue lies in the fact that all communication, including community strings, is performed in unencrypted form.
snmp version 2 (snmpv2) was defined by rfc documents 1441-1452 and introduced a host of improvements in the area of security, by utilizing a more complex security model, and in performance, by introducing the getbulk operation. snmp version 3 (snmpv3), as defined by rfc documents 3411-3418, is also known as std0062 and represents the official version of the standard recognized by ietf. older versions of the standard are considered historic or obsolete. the main improvement in this version is an advanced security model based on version v2. it is important to note that there is no compatibility between different versions of the snmp protocol, as the message format and the protocol itself were changed. possible scenarios for coexistence between different versions of the snmp protocol are described in rfc 2576 [20].

2.2. data organization

every network device accessible by snmp is defined by one or more management information bases (mib), a virtual database representing a hierarchically organized set of information available for a given device. a mib consists of managed objects (mib objects) that are uniquely identifiable in the mib hierarchy by a value named object identifier (oid). the mib tree has an unnamed root node that branches out into branches controlled by the organizations in charge of standards, which are further divided on lower levels of the hierarchy. a mib object consists of at least one instance, which can be seen as a variable. there are two types of mib objects: scalars (that define a single instance of the object) and tables (that define multiple linked instances that make up the mib table). one of the aims of the snmp standard is to solve the problem of differing data representations on various platforms, a task that was solved by the use of a subset of iso osi abstract syntax notation one (asn.1), the structure of management information (smi).
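the hierarchical organization of oids, and the difference between scalar and table instances, can be illustrated with a minimal in-memory sketch. this is not an snmp implementation and performs no network communication; the class `MiniMib` and its methods are hypothetical names introduced only for illustration, while the oids used are the well-known mib-ii `sysName` scalar and `ifDescr` table column.

```python
def parse_oid(text):
    """Turn a dotted OID string such as '1.3.6.1.2.1.1.5.0' into a tuple of ints."""
    return tuple(int(part) for part in text.strip(".").split("."))


class MiniMib:
    """Toy in-memory MIB (hypothetical helper): maps OID instances to values."""

    def __init__(self):
        self._values = {}

    def set(self, oid, value):
        self._values[parse_oid(oid)] = value

    def get(self, oid):
        # Returns None when the instance does not exist
        return self._values.get(parse_oid(oid))

    def walk(self, prefix):
        """Yield (oid, value) for every instance under a subtree, in OID order."""
        root = parse_oid(prefix)
        # Sorting tuples of ints gives the numeric, level-by-level OID ordering
        for oid in sorted(self._values):
            if oid[:len(root)] == root:
                yield ".".join(map(str, oid)), self._values[oid]


mib = MiniMib()
# Scalar object: exactly one instance, addressed with the suffix .0 (sysName.0)
mib.set("1.3.6.1.2.1.1.5.0", "sensor-gateway")
# Table object: one instance per row, the row index is appended to the column OID
# (ifDescr column of the interface table; rows 10, 2 and 1 inserted out of order)
mib.set("1.3.6.1.2.1.2.2.1.2.10", "wlan0")
mib.set("1.3.6.1.2.1.2.2.1.2.2", "eth0")
mib.set("1.3.6.1.2.1.2.2.1.2.1", "lo")

for oid, value in mib.walk("1.3.6.1.2.1.2.2.1.2"):
    print(oid, "=", value)
```

walking the `ifDescr` column returns the rows in index order 1, 2, 10, because oids are compared as sequences of integers rather than as strings.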
snmpv1 smi-specific data types can be either simple (integer, octet-string, oid) or application-wide (network address, counter, gauge, time tick, opaque, integer, unsigned integer).

2.3. extending the snmp functionality

as previously described, snmp allows for flexible management of networked devices, but is unfortunately limited to the functionality implemented in the agent component. if one wants to access additional data or enable new functionality, there are several approaches, the most used of which are: modification of the agent, use of external programs, and use of the agentx protocol. the most efficient, but also the most difficult to implement and least flexible, approach is to modify the source code of the agent in question so that it implements the required functions. if it is impossible or infeasible to modify the agent, or if there is a need for several agents on the same device, a solution can be obtained by using snmp proxy software. use of a proxy increases the complexity of the system, as the additional layer in the architecture must also fully support all relevant requirements (e.g. the proxy layer becomes a key component in the security aspect). an alternative solution is the use of external programs for access to the required data. the simplest solution is to execute the external program every time the need for specific data arises. this approach can severely degrade performance, as the program could be executed during any snmp operation. a better solution is parallel execution of both agent and external program, providing the means for communication between them. as this problem has been present since the early days of snmp, a process for a standardized solution was created in parallel with the development of various ad-hoc solutions. the result of this process is the agentx protocol [2], which is based on the master-slave principle within one or more devices.
this protocol is a continuation of the snmp-smux and snmp-dpi protocols, which were relegated to historic and experimental status. in 1995 ietf formed the snmp agent extensibility working group [3], which defined an extension framework [2] and the corresponding mib document [4]. these documents define the protocol, master agent, sub-agents, coding of all required data types, as well as the handling of all communication between parties.

3. internet of things and monitoring

when talking about iot and monitoring, there are two major protocols that cannot be overlooked: coap (constrained application protocol) and mqtt (message queuing telemetry transport). as per rfc 7252 that defines it, coap “is a specialized web transfer protocol for use with constrained nodes and constrained (e.g., low-power, lossy) networks” aimed at m2m (machine to machine) applications and is intended to be usable on devices with very limited processor, memory and networking resources [5]. it is udp based and employs an adapted subset of http optimized for m2m use cases, offering features not present in http but highly valuable in m2m environments, such as discovery, multicast support, and asynchronous message exchanges [5]. it was specifically designed to use very few processing resources in normal operation. from the request perspective, coap messages are very similar to http request methods, but are limited to get, post, put and delete messages that implement the corresponding http method functions. the core (constrained restful environments) link format described by rfc 6690 [6] defines a well-known entry point ("/.well-known/core") that enables a client to list the links hosted by the server, and as such can be used for discovery, resource collection, resource directory and similar needs. there is ongoing work on implementing coap on alternative transports such as tcp, p2p, websockets, zigbee and other network protocols, which would enable wider use of coap in iot scenarios.
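to make the discovery step more concrete, the following sketch parses a core link format payload of the kind a server could return for a get on "/.well-known/core". it is a simplified illustration under stated assumptions: the function `parse_link_format` is a hypothetical helper, the sample payload is invented for the example (modelled on the style of rfc 6690), no coap request is actually sent, and corner cases of the full grammar are not handled.

```python
import re

# One link: <target> followed by zero or more ;name or ;name=value attributes
LINK_RE = re.compile(r'<([^>]*)>((?:;[^,;=]+(?:=(?:"[^"]*"|[^,;]*))?)*)')
# One attribute: ;name, ;name=bare or ;name="quoted"
ATTR_RE = re.compile(r';([^,;=]+)(?:=(?:"([^"]*)"|([^,;]*)))?')


def parse_link_format(payload):
    """Return a list of (target, attributes) pairs from a link-format string."""
    links = []
    for match in LINK_RE.finditer(payload):
        target, raw_attrs = match.group(1), match.group(2)
        attrs = {}
        for name, quoted, bare in ATTR_RE.findall(raw_attrs):
            # Attributes without a value (flags) are stored as an empty string
            attrs[name] = quoted if quoted else bare
        links.append((target, attrs))
    return links


# Illustrative payload (assumption): two sensor resources with rt and if attributes
payload = ('</sensors/temp>;rt="temperature-c";if="sensor",'
           '</sensors/light>;rt="light-lux";if="sensor"')

for target, attrs in parse_link_format(payload):
    print(target, attrs)
```

a monitoring component could use such a list to decide which resources of a constrained device to poll or observe.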
mqtt as defined by oasis [7] is a lightweight, open and simple client-server publish/subscribe messaging transport protocol. like coap, it is aimed at m2m applications and resource-constrained devices. it runs over tcp/ip or other network protocols that provide ordered, lossless and bi-directional connections (for example the zigbee protocol [8]). there is a special version of mqtt aimed at sensor networks, named mqtt-sn, that enables use of mqtt in very unreliable networking conditions by severely resource-constrained devices via mqtt-sn forwarders and mqtt-sn gateways [9]. mqtt utilizes the publish-subscribe pattern, in which clients, here referred to as publishers, connect to servers (messaging brokers) and are able to send messages to selected topics with no need to specify the exact recipient of the message, in this context called the subscriber. messages are filtered by their attributes, chief of which is called the topic and is represented by a utf-8 string. topics can have a hierarchical organization in which different levels are separated by a forward slash. an example of such a topic is “building1/room007/rack02/server27/temperature”. as messaging is asynchronous, topics can exist even with no currently connected publishers or subscribers, which enables use in unreliable environments, as individual nodes can connect and disconnect as the need arises. this allows for considerable flexibility, as a subscriber can choose to listen only to messages in topics related to, for example, a certain room or building, or to listen to all messages related to temperature data in all rooms or buildings.

but iot does not consist of constrained devices only. fundamental to the proper functioning of any iot infrastructure is also the proper functioning of the interconnecting network as well as, most often, of the back-end services, running on any kind of server device.
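returning briefly to the mqtt topic hierarchy described in this section, the selective subscriptions mentioned there rely on the two wildcards defined in mqtt 3.1.1: "+" matches exactly one topic level and "#" matches all remaining levels. the sketch below shows how a broker-side match could be decided; `topic_matches` is a hypothetical helper written for this illustration, and the special handling of topics beginning with "$" is deliberately left out.

```python
def topic_matches(topic_filter, topic):
    """Return True if an MQTT topic matches a subscription filter.

    Simplified sketch: supports '+' (single level) and '#' (multi level),
    ignores the special rules for topics that start with '$'.
    """
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard is only valid as the last filter level and
            # matches the parent level as well as everything below it
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    # Without '#', filter and topic must have the same number of levels
    return len(filter_levels) == len(topic_levels)


topic = "building1/room007/rack02/server27/temperature"
print(topic_matches("building1/room007/#", topic))    # everything in one room
print(topic_matches("+/+/+/+/temperature", topic))    # temperature everywhere
print(topic_matches("building2/#", topic))            # a different building
```

the first two filters correspond to the two cases named in the text: all messages for a certain room, and all temperature messages regardless of building or room.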
further complicating things is the fact that both the networking and service components of modern architectures can be virtualized. this represents a problem especially for performance monitoring, as the nms traditionally has access to monitoring data inside the virtualized environment, while performance data of the actual physical device running the virtualization software is available only to the infrastructure provider. as for the networking component, outside of possible specialized hardware, for example mqtt-sn forwarders and similar, almost all networking devices support snmp for monitoring. devices that do not support it are usually unmanaged devices that provide no means for remote monitoring and are as such not suitable for use in the described circumstances. virtualized servers running back-end services are under the control of the infrastructure user and can easily be configured to support snmp monitoring if that is not already the case. as mentioned earlier, the real problem lies in the fact that the virtual machine that contains the service has no access to non-virtualized performance data of the physical host. the following example illustrates the issue. let’s assume that the server running our hourly data collection service is spending a large percentage of its time waiting for the database server to complete processing of new records. in a non-virtualized situation we could monitor the processes in the system and see that, for example, we are waiting for the storage system to complete writing to disk because another process, archiving of previous data in this example, is consuming the resource at the same time. this would give us enough information to solve the issue by rescheduling the offending process or decreasing its priority in order to ensure that data collection is completed properly.
but in a virtualized environment, if another virtual machine is consuming resources, we have no way of knowing it, as all the performance data suggests that nothing is consuming the resources and yet they are unavailable to our service. this is but one example that illustrates how any of the limited resources on the physical host (processor, memory, networking, storage, etc.) can be temporarily unavailable without our having any means to determine whether the issue lies with our code or with the wider environment. the fact that virtual machines can be migrated, without shutting down, from one host to another with different available resources further complicates the monitoring of the back-end.

3.1. use of snmp for monitoring the internet of things

we can divide the devices we want to monitor into three categories depending on their support for snmp. the first category consists of devices that support snmp and provide the needed monitoring data. the second category includes devices that support snmp but do not provide the needed data directly, while the third category is made up of devices that do not support snmp. for our needs, the second and third categories are essentially the same, as there is no simple way for our monitoring system to directly access the required data, whatever it may be. the first group mostly consists of devices providing network connectivity, as they were usually designed to be remotely managed and monitored by snmp. there is very little for us to do here, barring the cases where the supported version of snmp does not provide sufficient security (versions 1 and 2) or there are other reachability issues (vpn, nat, etc.). most of these issues can be solved by using snmp proxy services or a similar technique. physical infrastructure in a cloud environment can also fall into this category, provided that we are self-hosting the operation or have specific arrangements with the hosting provider.
there are four principal ways to gather data from devices that do not provide them in a form suitable for monitoring:
1. devices that support messaging or event notifications allow us to subscribe to relevant topics and queues, or implement listeners, and receive the data as it is generated by the device. this is the best approach, as all the data is current and the required resources are minimal, but it is limited by the support available on the monitored device.
2. polling at predefined intervals is simple and robust, and enables us to estimate the needed resources in advance. the downsides are possible monitoring of devices for which it is not required, the risk of stale data, or higher resource consumption if polling more frequently.
3. proxying data collection as requests are made. this provides for minimal resource usage, as we are collecting only the data that is needed when it is needed, but it introduces an unknown response delay in the system, as we have to wait for all required devices to respond; it makes estimates of resource usage difficult, as we are dealing with what are, for us, random requests (an example would be frequent monitoring of a slow-responding device by a large number of clients); and it makes aggregate data calculations almost impossible.
4. proxying with caching extends the previous approach by introducing a proxy-level cache that can reduce system load at the price of not returning current data to all requests and significantly increasing the complexity of the system.

the described approaches can be combined in a number of ways to create a hybrid solution tailored to one’s specific needs, again at the price of increasing the already significant level of complexity. as lindholm-ventola and silverajan have shown in [10], monitoring of constrained devices using coap can be done by using a coap to snmp proxy, with or without a database component, in principle corresponding to the third and fourth approaches described above.
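the trade-off of the fourth approach, lower load on the monitored device in exchange for possibly stale answers, can be seen in a minimal sketch of a proxy with a time-based cache. the class `CachingProxy`, its parameters and the 30 second lifetime are assumptions made for this illustration; the monitored device is simulated by a dictionary and the clock is replaced by a controllable value so that the behaviour is deterministic.

```python
import time


class CachingProxy:
    """Hypothetical proxy that forwards requests and caches answers for a while."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch          # function that queries the monitored device
        self._ttl = ttl_seconds      # how long a cached value is considered fresh
        self._clock = clock
        self._cache = {}             # key -> (time of retrieval, value)
        self.backend_requests = 0    # number of requests that reached the device

    def get(self, key):
        now = self._clock()
        cached = self._cache.get(key)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]         # served from cache, may be stale
        self.backend_requests += 1
        value = self._fetch(key)
        self._cache[key] = (now, value)
        return value


# Simulated device and a fake clock (both assumptions for the example)
readings = {"node1/temperature": 21.5}
fake_now = [0.0]
proxy = CachingProxy(fetch=readings.get, ttl_seconds=30, clock=lambda: fake_now[0])

first = proxy.get("node1/temperature")    # cache miss, device is queried
readings["node1/temperature"] = 22.0      # value on the device changes
second = proxy.get("node1/temperature")   # within lifetime, old value is served
fake_now[0] = 31.0
third = proxy.get("node1/temperature")    # lifetime expired, device queried again
print(first, second, third, proxy.backend_requests)
```

setting `ttl_seconds` to zero reduces this sketch to the third approach, where every client request reaches the device.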
in their work they conclude that further research is needed on the implementation of notifications in iot monitoring systems. of the four described ways to monitor devices in an iot environment, only the first provides for meaningful handling and generation of notifications. the remaining three approaches will either introduce a possibly significant delay, in the case of polling, or might completely miss the event if there were no requests to monitor the device. if a device supports messaging or can generate snmp notifications, we can process and respond to an event with minimal delay.

3.2. mqtt-snmp bridge

in order to enable snmp monitoring of mqtt and mqtt-sn devices, we need to implement a system that listens to messages generated by monitored devices, sends requests to monitored devices if needed, and transforms the collected data into a form suitable for serving to snmp clients. although it is possible to serve standalone snmp clients, most often setups like this are part of a larger monitoring infrastructure where the snmp clients are in fact nmss (network management systems). the architecture of such an iot-snmp bridge system is presented in figure 1. the system consists of: monitored devices supporting either mqtt or, in the case of severely constrained devices, the mqtt-sn protocol; mqtt-sn gateways and forwarders; an mqtt broker; the iotsnmp collector and server; and a varying number of snmp clients. mqtt-sn forwarders and gateways exist in configurations where there is a need to monitor mqtt-sn devices. mqtt-sn gateways can be, and usually are, part of the mqtt broker. the broker itself should be of the polyglot type, enabling simple use of different messaging protocols by other endpoints in the system. choice of a suitable broker also simplifies the infrastructure of the complete system, which will be described later in the text.
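the division of work between a collector and a server in such a bridge can be sketched as follows. this is only an illustration of the data flow and not the implementation described in this paper: the classes `Collector` and `SnmpView`, the way oids are assigned to topics, and the base oid (a placeholder under the private enterprises subtree) are all assumptions, and neither an mqtt client nor an snmp agent is actually started.

```python
class Collector:
    """Keeps the most recent payload seen for each MQTT topic."""

    def __init__(self):
        self.latest = {}

    def on_message(self, topic, payload):
        # In a deployment this would be the callback of an MQTT client
        # subscribed at the broker; here it is simply called directly.
        self.latest[topic] = payload


class SnmpView:
    """Exposes collected values under OIDs of a placeholder private subtree."""

    BASE_OID = "1.3.6.1.4.1.99999.1"  # placeholder number, for illustration only

    def __init__(self, collector):
        self._collector = collector
        self._oid_by_topic = {}
        self._topic_by_oid = {}

    def oid_for(self, topic):
        """Assign each topic a stable index on first use and return its OID."""
        if topic not in self._oid_by_topic:
            index = len(self._oid_by_topic) + 1
            oid = f"{self.BASE_OID}.{index}"
            self._oid_by_topic[topic] = oid
            self._topic_by_oid[oid] = topic
        return self._oid_by_topic[topic]

    def get(self, oid):
        """Answer a get request with the latest collected value, if any."""
        topic = self._topic_by_oid.get(oid)
        if topic is None:
            return None
        return self._collector.latest.get(topic)


collector = Collector()
view = SnmpView(collector)
temp_oid = view.oid_for("building1/room007/rack02/server27/temperature")
collector.on_message("building1/room007/rack02/server27/temperature", "23.4")
print(temp_oid, "=", view.get(temp_oid))
```

because the view only reads what the collector has stored, the two parts could run as separate instances, which is the property the separation of collector and server relies on.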
when it comes to collecting the data and serving snmp clients, it is possible to create a monolithic system in which both functions are centralized, but by separating the collector and the server we can easily scale the system or introduce additional load balancing and fault tolerance by employing multiple instances of the needed service. the broker infrastructure can also be made scalable and/or fault tolerant by employing a suitable broker such as apache activemq [11], which can function both in a classic clustered environment and in a so-called network of brokers that enables distributed queues and topics across a number of brokers.

fig. 1 overview of iot-snmp bridge

3.3. snmp monitoring of iaas

the development of the monitoring component for iaas in this paper is a continuation of work performed in the areas of grid computing and monitoring of distributed services, started in the see-grid-sci project [12], which resulted in the bbmgridsnmp system [13], and is heavily influenced by the solutions implemented there. the architecture of the cloudsnmp system is given in figure 2. data is collected from various iaas endpoints by listening to messages generated by the endpoints and sent through a queue server (broker), by listening to snmp notifications and performing snmp monitoring of physical devices that are part of the infrastructure, and by accessing the needed information through an iaas api specific to a given iaas implementation or through generalized and standardized interfaces such as those produced by dmtf cmwg (distributed management task force cloud management working group) [14], etsi (european telecommunications standards institute) [15], oasis camp tc (organization for the advancement of structured information standards cloud application management for platforms technical committee) [16] and ogf occi (open grid forum open cloud computing interface) [17]. the depicted queue server also supports at least one of the jms (java message service) [18] or amqp (advanced message queuing protocol) [19] protocols.
fig. 2 architecture of cloudsnmp (iaas-snmp bridge)

the overall architecture mirrors the one used for collecting and processing data from constrained devices, enabling unification of many of the components in this complex infrastructure. for example, it is possible to use the same brokers, connected in a load-balancing and fault-tolerant architecture, to handle messages from both constrained devices and iaas service endpoints. this also enables sharing of code between the iotsnmp and cloudsnmp collector and server components and further modularization of the code. there are two principal users of the served data: operator and client. operator access should allow full access to the real monitored data and should provide any information of interest to the operator. this can be achieved by designing and implementing a custom mib that contains tables whose rows represent monitored resources and enable the operator to easily access summary data for any required parameter. as the monitoring is already done, at least in part, using snmp, there are existing snmp servers with already configured access rules, so the simplest solution is to extend their functionality using the agentx protocol. client access has various restrictions imposed and enables the client to access only the data relevant to a specific virtual machine or set of virtual machines. this requirement mandates either the use of many instances of the snmp server, one for every monitored virtual machine, or some other mechanism that allows efficient access to the monitoring data. in order to give the client of the iaas infrastructure access to the data of the physical server hosting the monitored virtual machine, we employ snmp contexts. in simple terms, snmp contexts allow creation of multiple instances of a data structure, be it a full tree or some subset, and serving of the right instance to the client.
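the idea of serving the right instance of the same sub-tree depending on the context can be sketched with a small in-memory model. it assumes one context name per virtual machine and a lookup table from context to the physical host currently running that machine; the class `ContextServer`, the context and host names, and the placeholder oid are hypothetical and serve only to illustrate the principle, not the cloudsnmp implementation.

```python
class ContextServer:
    """Hypothetical single server that answers per-context requests."""

    def __init__(self):
        self._host_data = {}        # physical host -> {oid: value}
        self._host_of_context = {}  # context name -> physical host

    def update_host(self, host, data):
        """Store the latest data collected from one physical host."""
        self._host_data[host] = dict(data)

    def place(self, context, host):
        """Record on which physical host the VM behind `context` is running."""
        self._host_of_context[context] = host

    def get(self, context, oid):
        """Serve the instance of `oid` that belongs to the given context."""
        host = self._host_of_context.get(context)
        if host is None:
            return None             # unknown context: nothing is served
        return self._host_data.get(host, {}).get(oid)


CPU_LOAD_OID = "1.3.6.1.4.1.99999.2.1.0"  # placeholder OID, illustration only

server = ContextServer()
server.update_host("hostA", {CPU_LOAD_OID: 35})
server.update_host("hostB", {CPU_LOAD_OID: 80})

server.place("vm-42", "hostA")
before = server.get("vm-42", CPU_LOAD_OID)   # data of hostA
server.place("vm-42", "hostB")               # VM migrated to another host
after = server.get("vm-42", CPU_LOAD_OID)    # same context, data of hostB
print(before, after)
```

the client keeps using the same context and the same oid, while the server resolves which physical host's data stands behind it at the moment of the request.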
in our use, this enables one snmp server to perform the function of several servers, one for each context, without unnecessary duplication of resources. as the cloudsnmp server has the data from all physical virtualization servers in the infrastructure, by connecting a certain context value to a unique virtual machine, the client can be served data from the correct virtualization server even after migration to another server has taken place. the example in figure 3 presents data propagation for an snmp sub-tree providing processor, memory and basic storage statistics from the physical device to the cloudsnmp server, to be served to the infrastructure operator as an extension of existing snmp data by utilizing the agentx protocol, as well as to the client by using a custom snmp server that masks and transforms the data prior to replying to the client request. depending on the requirements of the system, it is possible to serve different versions of the data to clients, both to ensure that we are serving only the data that needs to be served and to avoid sudden changes in the configuration of the monitored device after migration. for example, it is possible to provide the following levels of data masking:
1. no masking: served data is identical to the data gathered from the physical server. this enables the best performance monitoring by the client, but also provides deep insight into the actual configuration of the infrastructure and can cause trouble for monitoring software, as it is possible for a server to suddenly gain or lose cpu cores, ram or networking interfaces.
2. normalize to virtual machine resource: data is normalized to the maximum resources that can be occupied by the monitored virtual machine. for example, if the virtual machine can utilize up to 8 cpu cores and the server has 16 cpu cores, the served data will be scaled to 8 cpu cores, even after migration to a different server with 64 cpu cores. this both limits the amount of information we publish to the client and provides consistent measurements, as the maximum values remain the same. the issue is that it is now possible to serve data that conflicts with data recorded within the virtual machine.
3. normalize to fixed value: every resource is normalized to a predefined fixed value and is seen as the proportion of the resource currently utilized. this hides almost all information from end users while still allowing limited performance monitoring and troubleshooting.

fig. 3 data propagation in cloudsnmp to operator and client

3.4. overview of security aspects

while iot promises a wealth of future possibilities, there are also some worrisome aspects that cannot go unmentioned, security being chief among them. due to the pervasive nature of iot and its access to sensitive information, any compromise can have potentially grave consequences. when discussing the security of the described system, we can divide it into several possible attack surfaces: snmp-based components, messaging components and iot components. when discussing the security of snmp, it is important to distinguish between different versions. versions 1 and 2c are prone to packet sniffing and other general attacks applicable to unencrypted communication. the only non-obsolete version of the protocol is version 3, which employs standard cryptographic features. due to the limits imposed by the stateless nature of the protocol, it can be attacked by brute-force and dictionary attacks. the modular architecture of snmp enables use of tls [21] and dtls [22] within the transport subsystem [23]. proper configuration and utilization are of paramount importance in order to provide a secure operating environment. messaging components allow the use of complex authentication and authorization mechanisms as well as encryption.
while this component and its security analysis lie outside the scope of this paper, it is worth noting that there have been a number of security vulnerabilities in various widely used ssl/tls libraries in the past few years, affecting systems ranging from simple embedded solutions to mobile devices and dedicated servers [24][25][26][27]. discussing security models of iot is complicated by the nature of iot and the fact that it covers everything from simple sensors to connected cars and vast industrial infrastructures. examples of security issues range from vulnerabilities in the widely used zigbee protocol [28] to vulnerabilities present in connected cars [29]. a concise overview is given by sadeghi, wachsmann and waidner in [30]. one of the benefits of the described monitoring system is the possibility to provide effective monitoring to users of the infrastructure while limiting possible attack surfaces to the exposed monitoring servers. it is also worth noting that this approach enables the system to function as a proxy that exposes secure snmp version 3 to the outside world even though the monitored devices might support only insecure versions of the protocol. in stark contrast to resource-constrained devices, these servers can possess ample hardware and software resources and are much better equipped to handle possible attacks, possibly through detection in cooperation with an ids (intrusion detection system) or mitigation when coupled with an ips (intrusion prevention system).

3.5. integration with existing systems

although it is possible to use specialized systems to gather, analyze and present monitoring data related to iot, most organizations already use some form of nms (network management system) that can be used for both management and monitoring of the infrastructure.
there exists a vast variety of monitoring systems, running on different platforms and utilizing different architectures, operational procedures and data collection methods. some representatives of popular nmss are nagios [31], zenoss [32], zabbix [33] and opennms [34]. one example of using zenoss in iot monitoring was given in [35]. mazhelis et al. have analyzed the possibilities of using the coap protocol for monitoring of iot infrastructure, as well as adapting the existing accounting and monitoring of authentication and authorization infrastructure services (amaais) project [36] for such use [37]. although nms products can differ significantly from each other, practically all of them support at least data gathering via snmp. this enables the previously described system to extend the reach of general-purpose network monitoring systems to the iot part of the infrastructure. depending on the exact purpose and system configuration, it is possible to serve either raw collected data or data derived through previously defined transformations. this can also be used to mitigate or solve some of the privacy issues of possibly sensitive data, as the data can be thoroughly filtered and modified to provide anonymization and/or aggregation. one example of a complex monitoring system in a heterogeneous and distributed computing infrastructure such as see-grid [12] was described in [13]. developing software for systems as diverse as iot infrastructures can be a daunting task. the sheer diversity of available devices and implementations makes for a very dynamic environment, often difficult to set up for testing purposes. while developing the system, the contiki [38] based cooja network simulator [39] can be used in place of physical devices. for testing mqtt and coap, as well as for stress testing the system, a simple load generator was developed in java utilizing the californium coap framework [40] and fusesource mqtt libraries [41].
a proof-of-concept snmp server was first created in java using the jax toolkit [42] utilizing the agentx protocol, but was rewritten in the python programming language [43] utilizing the pysnmp library [44].

4. conclusion

in this paper we presented one solution for end-to-end monitoring of iot devices, including severely constrained devices such as sensors, iaas installations, as well as the networking infrastructure that connects them together. on the constrained-device end of the spectrum, use of coap and mqtt was covered, while the networking infrastructure natively supports snmp, and four approaches to iaas and virtualization equipment data gathering were presented. integration into existing network management and monitoring systems enables a simpler transition to full utilization of iot infrastructures in practice. the often neglected harmonization of operational procedures in different domains can be significantly simplified by enabling a uniform view and/or control interface for the whole infrastructure. by limiting exposed attack surfaces to monitoring servers that are simpler to manage and secure, security of the complete system can be increased, while some of the privacy concerns of data gathering are alleviated through data transformation and anonymization prior to serving. the described solution provides non-blocking asynchronous data collection and scalable and fault-tolerant data processing and serving but, most importantly, it provides a uniform standards-based interface needed for reliable monitoring.

references

[1] “rfc 2571 an architecture for describing snmp management frameworks.” [online]. available: https://tools.ietf.org/html/rfc2571.
[2] “rfc 2741 agent extensibility (agentx) protocol version 1.” [online]. available: https://tools.ietf.org/html/rfc2741.
[3] “agent extensibility working group (agentx).” [online]. available: http://www.ietf.org/html.charters/agentx-charter.html.
[4] “rfc 2742 definitions of managed objects for extensible snmp agents.” [online]. available: https://tools.ietf.org/html/rfc2742. [5] “rfc 7252 the constrained application protocol (coap).” [online]. available: https://tools.ietf.org/ html/rfc7252. [6] “rfc 6690 constrained restful environments (core) link format.” [online]. available: https://tools. ietf.org/html/rfc6690. [7] “mqtt version 3.1.1.” [online]. available: http://docs.oasis-open.org/mqtt/mqtt/v3.1.1/mqtt-v3.1.1.html. [8] zigbee alliance. zigbee specification. technical report document 053474r06, version 1.0, 2005. [9] a. stanford-clark and h. linh truong, mqtt for sensor networks (mqtt-sn) protocol specification,. ibm, http://mqtt.org/new/wp-content/uploads/2009/06/mqtt-sn_spec_v1.2.pdf. [10] lindholm-ventola, hanna; silverajan bilhanan , “coap-snmp interworking iot scenarios,” tampere university of technology, department of pervasive computing. report 3, tampere, 2013. [11] “apache activemq.” [online]. available: http://activemq.apache.org/. [12] a. balaž, o. prnjat, d. vudragović, v. slavnić, i. liabotis, e. atanassov, b. jakimovski, m. savić, “development of grid e-infrastructure in south-eastern europe,” j of grid comp, vol. 9, no. 2, pp. 135-154, 2011. [13] m. savic, s. gajin, m. bozic, “snmp based grid infrastructure monitoring system,” in proceedings of the 34th international convention mipro, 2011, pp. 231-235. [14] d. davis, g. pilz, “cloud infrastructure management interface (cimi) model and restful http-based protocol,” technical report, distributed management task force (dmtf), 2012. [15] “etsi ict standards, gsm, tetra, nfv, gprs, 3gpp, its, umts, utran, m2m.” [online]. available: http://www.etsi.org/standards. [16] “oasis cloud application management for platforms (camp) technical committee | charter.” [online]. available: https://www.oasis-open.org/committees/camp/charter.php. [17] “open cloud computing interface” [online]. available: http://occi-wg.org/. 
[18] m. hapner, r. burridge, r. sharma, j. fialli, and k. stout, “java message service,” sun microsystems inc., santa clara, ca, 2002.
[19] “advanced message queuing protocol website.” [online]. available: http://www.amqp.org/.
[20] “rfc 2576 coexistence between version 1, version 2, and version 3 of the internet-standard network management framework.” [online]. available: https://tools.ietf.org/html/rfc2576.
[21] “the transport layer security (tls) protocol version 1.2.” [online]. available: https://tools.ietf.org/html/rfc5246.
[22] “datagram transport layer security.” [online]. available: https://tools.ietf.org/html/rfc4347.
[23] “transport layer security (tls) transport model for the simple network management protocol (snmp).” [online]. available: https://tools.ietf.org/html/rfc5953.
[24] “cve-2014-1266.” [online]. available: https://web.nvd.nist.gov/view/vuln/detail?vulnid=cve-2014-1266.
[25] “cve-2015-0282.” [online]. available: https://web.nvd.nist.gov/view/vuln/detail?vulnid=cve-2015-0282.
[26] “cve-2014-0160.” [online]. available: https://web.nvd.nist.gov/view/vuln/detail?vulnid=cve-2014-0160.
[27] “microsoft security advisory 3046015.” [online]. available: https://technet.microsoft.com/en-us/library/security/3046015.
[28] “zigbee exploited - the good, the bad and the ugly.” [online]. available: http://cognosec.com/zigbee_exploited_8f_ca9.pdf
[29] s. kamkar, “drive it like you hacked it.” [online]. available: http://samy.pl/defcon2015/2015-defcon.pdf
[30] a.-r. sadeghi, c. wachsmann, and m. waidner, “security and privacy challenges in industrial internet of things,” in proceedings of the 52nd annual design automation conference, 2015, p. 54.
[31] “nagios core. nagios open source project.,” nagios. [online]. available: https://www.nagios.org/.
[32] “zenoss,” zenoss. [online]. available: http://www.zenoss.com/.
[33] “zabbix: the enterprise-class open source network monitoring solution.” [online]. available: http://www.zabbix.com/. [34] “the opennms project.” [online]. available: http://www.opennms.org/. [35] u. gupta, “monitoring in iot enabled devices,” arxiv preprint arxiv:1507.03780, 2015. [36] o. mazhelis, m. waldburger, g. s. machado, b. stiller, and p. tyrväinen, “extending monitoring and accounting infrastructure towards constrained devices in internet-of-things applications”, technical paper, university of zurich, 2013. available: https://www.merlin.uzh.ch/contributiondocument/download/5076 [37] b. stiller, “accounting and monitoring of aai services.” switch journal, 2010(2):12–13,october 2010. [38] a. dunkels, b. grönvall, and t. voigt, “contiki-a lightweight and flexible operating system for tiny networked sensors,” in proceedings of the 29th annual ieee international conference on local computer networks, 2004, pp. 455-462. [39] f. osterlind, a. dunkels, j. eriksson, n. finne, and t. voigt, “cross-level sensor network simulation with cooja”, in proceedings of the 31st ieee conference on local computer networks, 2006, pp. 641-648. [40] “californium (cf) coap framework java coap implementation.” [online]. available: http://people.inf. ethz.ch/mkovatsc/californium.php. [41] “fusesource mqtt libraries.” [online]. available: https://github.com/fusesource/mqtt-client. [42] “jasmin: jax java agentx client toolkit.” [online]. available: https://www.ibr.cs.tu-bs.de/projects/ jasmin/jax.html. [43] g. vanrossum and f. l. drake, the python language reference. python software foundation, 2010. [44] “snmp library for python.” [online]. available: http://pysnmp.sourceforge.net/. instruction facta universitatis series: electronics and energetics vol. 28, n o 1, march 2015, pp. 
103-111 doi: 10.2298/fuee1501103g
thomas-fermi method for computing the electron spectrum and wave functions of highly doped quantum wires in n-si
volodymyr grimalsky, outmane oubram, svetlana koshevaya, christian castrejon-martinez
ciicap and the faculty of sciences on chemistry and engineering, autonomous university of state morelos (uaem), av. universidad 1001, 62209 cuernavaca, mor., mexico

abstract. the application of the thomas-fermi method to calculate the electron spectrum in quantum wells formed by highly doped n-si quantum wires is presented for finite temperatures, where many-body effects, like exchange, are taken into account. the electron potential energy is calculated first from a single equation. then the electron energy sub-levels and the wave functions within the potential well are simulated from the schrödinger equation. for axially symmetric wave functions the shooting method has been used. two methods have been applied to solve the schrödinger equation in the case of the anisotropic effective electron mass: the variation method and the iteration procedure for the eigenvectors of the hamiltonian matrix.

key words: quantum wires, thomas-fermi method, exchange, wave functions, variation method, inverse matrix iterations

1. introduction

investigations of the electron spectrum of low-dimensional and highly doped structures are central to many nanotechnology applications [1,2]. quantum devices based on silicon have recently been the subject of concentrated interest, both experimental and theoretical, including recent proposals on quantum computing [1-4]. the infrared transitions between the electron sub-levels within δ-doped quantum wells are promising for use in optoelectronics, especially for infrared modulators, detectors, and lasers [5,6]. the electron spectrum of δ-doped quantum wells can be calculated by solving the schrödinger equation jointly with the poisson equation (sp) [5,6].
received july 8, 2014; received in revised form november 22, 2014
corresponding author: volodymyr grimalsky, ciicap and the faculty of sciences on chemistry and engineering, autonomous university of state morelos (uaem), av. universidad 1001, 62209 cuernavaca, mor., mexico (e-mail: v_grim@yahoo.com)

there are several difficulties in simulating quantum structures in silicon, namely the anisotropy of the effective electron mass and the slow convergence of the sp method for an arbitrary initial approximation. the investigation of δ-doped quantum structures is possible with a simpler approach based on the statistical thomas-fermi (tf) method [6-8]. its advantage is the separation of the complex problem into sequential ones, where the wave functions are computed after the potential energy has been found. in the case of axially symmetric high doping, the electron potential energy depends on the radius only. a combined method can also be applied, where the final result of the tf simulations is used as the starting approximation for sp [9]. moreover, a comparison between the sp and tf methods shows that the simple tf method gives a good approximation for the electron energy sub-levels and the total electron concentration within δ-doped quantum wells [9]. in this paper the application of the tf method to calculate the electron spectrum in the quantum wells formed by highly doped n-si quantum wires is presented for finite temperatures t, and many-body effects, like exchange, are taken into account [5,7]. the electron potential energy and the total electron concentration are calculated from a single equation solved by the newton method. then the wave functions and the values of the electron energy sub-levels are computed from the schrödinger equation, where two possible orientations of electron valleys are considered.
the peculiarities of solving the schrödinger equation in the case of the anisotropic electron mass are pointed out.

2. basic equations

consider a single δ-doped electron quantum wire within n-si. below, atomic units are used for distances, $a_0^* = \varepsilon\hbar^2/(m_c e^2) \approx 0.52$ nm, and for energy, $ry^* = e^2/(2\varepsilon a_0^*) \approx 0.12$ ev, where $m_c = \nu^{2/3}(m_\perp^2 m_\parallel)^{1/3} \approx 1.06\,m_e \approx 10^{-27}$ g and $\nu = 6$ is the number of the lowest electron valleys in si. in the case of n-si the lowest valleys are lateral and the effective mass is anisotropic: $m_\parallel$, $m_\perp$. with non-dimensional variables the basic equation of the tf method for the δ-doped electron quantum wire is [8]:

$$\frac{1}{r}\frac{d}{dr}\left(r\frac{dv}{dr}\right) = -8\pi\left\{ n[v] - n_{d0}\left(2\exp\left(\frac{\mu - e_d - v}{t}\right) + 1\right)^{-1} - n_{1d}(r)\right\};\qquad \frac{dv}{dr}(r=0) = 0,\quad v(r\to\infty) = 0; \tag{1}$$

where

$$n[v] = \frac{t^{3/2}}{2\pi^2}\, f_{1/2}\!\left(\frac{\mu - v - v_x}{t}\right);\qquad f_{1/2}(\xi) = \int_0^\infty \frac{u^{1/2}\,du}{1+\exp(u-\xi)};$$
$$v_x \propto -n^{1/3}\tanh\!\left((n/n_c)^{1/2} - 1\right),\ n > n_c;\qquad v_x = 0,\ n \le n_c;$$
$$n_{1d}(r) = n_{1d0}\exp\!\left(-(r/r_0)^2\right);\qquad n_c = 10^{17}\ \mathrm{cm}^{-3}. \tag{2}$$

here $v(r)$ is the electrostatic electron energy, $n$ is the total electron concentration, $\mu$ is the fermi level; $n_{1d}$ and $n_{d0}$ are the 1d and 3d donor concentrations, respectively. $v_x$ is the many-body correction to the electron energy due to exchange [7]. eq. (1) is the poisson equation for the electrostatic electron energy, where the electron concentration $n[v]$ is calculated from the equilibrium statistical fermi distribution. note that at 1d donor concentrations $n_{1d0} \ge 10^{21}$ cm$^{-3}$ the exchange energy $v_x$ is comparable with the electrostatic one. the donor levels are assumed to be shallow and singly charged. the concentration of 1d donors is high, $n_{1d0} \ge 10^{20}$ cm$^{-3}$; they are fully ionized. the 1d doping is localized at distances $r \sim r_0 \approx 1 - 5$ nm.
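the electron concentration entering the poisson equation can be evaluated numerically, as in the minimal python sketch below. it assumes the standard form $n[v] = t^{3/2}/(2\pi^2)\,f_{1/2}((\mu - v - v_x)/t)$ that follows from the effective atomic units, a gaussian doping profile, and the exchange term passed in as a plain number; the function names, the fermi level argument mu and the quadrature settings are illustrative choices, not taken from the paper.

```python
import math


def fermi_half(xi: float, steps: int = 4000) -> float:
    """f_{1/2}(xi) = integral_0^inf u^{1/2} / (1 + exp(u - xi)) du.

    The substitution u = s**2 removes the square-root behaviour at u = 0.
    The integral is truncated where the integrand is negligible and
    evaluated with the midpoint rule.
    """
    s_max = math.sqrt(max(xi, 0.0) + 40.0)
    h = s_max / steps
    total = 0.0
    for k in range(steps):
        s = (k + 0.5) * h
        x = s * s - xi
        occupation = 0.0 if x > 700.0 else 1.0 / (1.0 + math.exp(x))
        total += 2.0 * s * s * occupation
    return total * h


def electron_density(v: float, mu: float, t: float, v_x: float = 0.0) -> float:
    """n[v] in effective atomic units; v_x is the exchange correction (assumed given)."""
    return t ** 1.5 / (2.0 * math.pi ** 2) * fermi_half((mu - v - v_x) / t)


def doping_profile(r: float, n_1d0: float, r_0: float) -> float:
    """Gaussian 1d doping profile n_1d(r) = n_1d0 * exp(-(r / r_0)**2)."""
    return n_1d0 * math.exp(-(r / r_0) ** 2)
```

in the non-degenerate limit the integral approaches $(\sqrt{\pi}/2)\exp(\xi)$ and in the degenerate limit $(2/3)\xi^{3/2}$, which gives two simple sanity checks for any implementation.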
the finite size $r_0$ of the highly doped region is considered, because the distance unit $a_0^*$ in silicon is comparable with the size of the lattice cell. moreover, the 1d doping cannot be approximated by the δ-function directly, because this approximation leads to a logarithmic singularity of the electron potential energy at $r \sim 0$. the results of simulations do not depend on the value of the critical electron concentration when $n_c \le 10^{18}$ cm$^{-3}$. the position of the fermi level $\mu$ has been obtained from the condition of total neutrality of the semiconductor [6]:

$$n[v = 0] = n_{d0}\left(2\exp\left(\frac{\mu - e_d}{t}\right) + 1\right)^{-1} \tag{3}$$

here $e_d$ is the donor energy with respect to the bottom of the conduction band. eq. (1) has been solved by the newton method [8]:

$$v_{s+1} = v_s + q\,\delta v;$$
$$\frac{1}{r}\frac{d}{dr}\left(r\frac{d\,\delta v}{dr}\right) + 8\pi\,\frac{\partial (n - n_d)}{\partial v}\,\delta v = -\frac{1}{r}\frac{d}{dr}\left(r\frac{dv_s}{dr}\right) - 8\pi\left\{n[v_s] - n_d(v_s) - n_{1d}(r)\right\};$$
$$\left.\frac{d\,\delta v}{dr}\right|_{r=0} = 0,\qquad \left.\delta v\right|_{r=R} = 0. \tag{4}$$

here $n_d(v)$ denotes the ionized volume donor term of eq. (1). note that in the derivative $(\partial n/\partial v)$ the exchange correction $v_x$ is not varied. in the boundary conditions the parameter $R$ is a sufficiently large radius. in eq. (4) the parameter $q \le 1$ is chosen to provide better convergence [9]. rapid convergence of the method has been demonstrated, even when the exchange energy is taken into account. after calculation of the electron potential energy $v(r)$, the energy sub-levels $e_j$, the wave functions (wf) $\psi_j(x,y)$ of the discrete spectrum of the well, and then the electron concentration $n$ in each electron sub-level within the quantum well have been computed from the following schrödinger equations:

$$\hat h_1\psi_j^{(1)} \equiv -\frac{m_c}{m_\perp}\left(\frac{\partial^2\psi_j^{(1)}}{\partial x^2} + \frac{\partial^2\psi_j^{(1)}}{\partial y^2}\right) + [v(r) + v_x]\,\psi_j^{(1)} = e_j^{(1)}\psi_j^{(1)}; \tag{5a}$$

$$\hat h_2\psi_j^{(2)} \equiv -\frac{m_c}{m_\perp}\frac{\partial^2\psi_j^{(2)}}{\partial x^2} - \frac{m_c}{m_\parallel}\frac{\partial^2\psi_j^{(2)}}{\partial y^2} + [v(r) + v_x]\,\psi_j^{(2)} = e_j^{(2)}\psi_j^{(2)}; \tag{5b}$$

$$\iint \psi_j^2\,dx\,dy = 1;\qquad r = (x^2 + y^2)^{1/2}.$$

wf can be chosen as real, because the hamiltonians are real. there are two different orientations of electron valleys in silicon, as seen from eqs. (5). namely, eq.
(5a) is for the isotropic case of the effective mass components, eq. (5b) is for the anisotropic case.

3. simulations of potential energy and electron concentration

the results of simulations of the electron potential energy $v(r)$ and the electron concentration $n(r)$ are presented in figs. 1, 2. for all cases the volume doping is $n_{d0} = 1\cdot10^{16}$ cm$^{-3}$, $r_0 = 2a_0^* \approx 1.04$ nm. previous simulations [10] demonstrated that the exchange correction is important for doping levels $n_{1d} \ge 10^{21}$ cm$^{-3}$. but the total electron potential energy $w = v + v_x$ and the electron concentration $n$ are practically the same as without this many-body correction. the potential energy depends on the temperature $t$, as seen in fig. 2. this is due to the partial ionization of the volume donors $n_{d0}$ at low temperatures, as seen from eq. (3). but the electron concentration $n$ does not depend on $t$; some difference occurs only at the periphery $r \gg r_0$.

fig. 1 part a) shows the dependence of the electron potential energy together with the exchange energy, $v + v_x$, on the radius $r$. part b) shows the dependence of the total electron concentration $n(r)$. the values of the maximum doping are $n_{1d0} = 3\cdot10^{21}$ cm$^{-3}$ (curve 1), $10^{21}$ cm$^{-3}$ (curve 2), $3\cdot10^{20}$ cm$^{-3}$ (curve 3), $5\cdot10^{19}$ cm$^{-3}$ (curve 4); $t = 300$ k, $r_0 = 2a_0^* \approx 1.04$ nm. the corresponding exchange energies $v_x$ are also presented in the upper part of part a).

fig. 2 part a) shows the dependence of the electron potential energy together with the exchange energy, $v + v_x$, on the radius $r$ for different temperatures $t$. part b) shows the dependence of the electron concentration for different temperatures; part c) is the same as b) in detail. curve 1 is for $t = 300$ k, 2 is for 200 k, 3 is for 150 k, 4 is for 100 k, 5 is for 50 k, 6 is for 20 k. the maximum doping is $n_{1d0} = 3\cdot10^{21}$ cm$^{-3}$.

4.
wave functions and energy sub-levels

after calculating the electron potential energy, it is possible to simulate the electron energy sub-levels in the quantum well and the corresponding wf. to compute wf for the isotropic case (5a), where the effective masses are the same, the shooting method has been applied [11]. the axially symmetric wf $\psi(r)$ for the isotropic case, eq. (5a), are presented in fig. 3, a. the maximum doping level is $n_{1d0} = 3\cdot10^{21}$ cm$^{-3}$, $t = 300$ k, as in fig. 1, a, curve 1. the dependence of the two lowest electron energy sub-levels $e_{1,2}$ on the maximum doping is presented in fig. 3, b, for $t = 300$ k. the dependence of the electron energy sub-levels on temperature is given in fig. 3, c. one can see that the difference $e_2 - e_1$ depends on the temperature $t$ there.

fig. 3 part a) shows the axially symmetric wf for the case of isotropic effective mass; part b) shows the dependence of the energy sub-levels on the maximum doping, $t = 300$ k; part c) shows the dependence of the energy sub-levels on the temperature $t$ for the maximum doping $n_{1d0} = 3\cdot10^{21}$ cm$^{-3}$.

the anisotropic case, eq. (5b), with different effective masses has been solved by two simple methods, which are realized in the cartesian coordinate frame xoy. the first one is the variation method [12]. namely, the problem of the minimization of the functional of the electron energy is considered:

$$e = \min_{\psi}\ \frac{\iint \psi\,\hat h_2\psi\,dx\,dy}{\iint \psi^2\,dx\,dy} \tag{6}$$

wf possesses different types of symmetry or antisymmetry in the plane xoy, due to the symmetry of the hamiltonian, eq. (5b). the probing functions for the symmetric case $\psi(\pm x, \pm y) = \psi(x,y)$ are chosen as:

$$\psi_1 = \exp\!\left(-\left(\frac{x}{x_{01}}\right)^2 - \left(\frac{y}{y_{01}}\right)^2\right);\qquad \psi_2 = \exp\!\left(-\left(\frac{x}{x_{02}}\right)^2 - \left(\frac{y}{y_{02}}\right)^2\right)\left(1 + a\left(\frac{x}{x_{02}}\right)^2 + b\left(\frac{y}{y_{02}}\right)^2\right); \tag{7}$$

in the case of antisymmetry with respect to x, $\psi(-x, \pm y) = -\psi(x,y)$, the probing functions are:

$$\psi_1 = x\exp\!\left(-\left(\frac{x}{x_{01}}\right)^2 - \left(\frac{y}{y_{01}}\right)^2\right);\qquad \psi_2 = x\exp\!\left(-\left(\frac{x}{x_{02}}\right)^2 - \left(\frac{y}{y_{02}}\right)^2\right)\left(1 + a\left(\frac{x}{x_{02}}\right)^2 + b\left(\frac{y}{y_{02}}\right)^2\right); \tag{8}$$

analogously, it is possible to write down the probing functions for other types of symmetry or antisymmetry, i.e. with the multipliers $y$ or $xy$. therefore, for the lowest wf there are two variation parameters $x_{01}$, $y_{01}$. for the second wf there are 3 independent variation parameters, because of the imposed orthogonality relation:

$$\iint \psi_1\psi_2\,dx\,dy = 0. \tag{9}$$

the second method is the search for the eigenvalues of the hamiltonian matrix by means of the iteration procedure [13,14]. for this purpose, wf has been expanded in a truncated fourier series. zero boundary conditions have been used at the periphery $x = \pm l_x$, $y = \pm l_y$, where the boundaries $l_x$, $l_y$ are chosen sufficiently large. namely, wf is represented by the vector, or the column of the coefficients of the expansion; the hamiltonian is represented by the matrix. then the following iteration procedure has been applied [13,14]:

$$\psi^{s+1} = (\hat h_2 - e_0)^{-1}\psi^{s};\qquad e^{s+1} = e_0 + \frac{(\psi^{s},\psi^{s})}{(\psi^{s+1},\psi^{s})} \tag{10}$$

here $(\psi^{s+1},\psi^{s})$ is the scalar product of vectors, $s$ is the iteration number, $e_0 < 0$ is the parameter that has been chosen from the condition of maximum convergence. usually, $e_0$ is close to the lowest energy sub-level computed from the variation method. it is important that the matrix inversion can be realized in a simple manner, because of the diagonal dominance of the shifted hamiltonian matrix $(\hat h_2 - e_0)$. when the second wf is searched, it should be orthogonal to the first wf $\psi_1$: $\psi^{s+1} \to \psi^{s+1} - \psi_1\,(\psi^{s+1},\psi_1)/(\psi_1,\psi_1)$. after each iteration it is better to normalize the vector: $(\psi^{s},\psi^{s}) = 1$. the initial values of the vectors $\psi^{s=0}$ can be chosen as those found earlier from the variation method.
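the iteration procedure described above is the standard inverse iteration with a shift. the python sketch below illustrates it on a small symmetric test matrix; the dense gaussian elimination, the function names and the 2x2 matrix in the usage example are assumptions made for illustration and stand in for the fourier-basis hamiltonian matrix, which is not reproduced here.

```python
def dot(a, b):
    """Scalar product of two real vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale v so that (v, v) = 1."""
    norm = dot(v, v) ** 0.5
    return [x / norm for x in v]


def project_out(v, ref):
    """Remove from v its component along ref (no-op when ref is None)."""
    if ref is None:
        return list(v)
    c = dot(v, ref) / dot(ref, ref)
    return [x - c * r for x, r in zip(v, ref)]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def inverse_iteration(h, e0, psi0, iters=60, keep_orthogonal_to=None):
    """Iterate psi <- (h - e0)^{-1} psi and estimate the eigenvalue nearest e0.

    keep_orthogonal_to: an already found eigenvector that is projected out
    after every step, so that the iteration converges to the next state.
    """
    n = len(h)
    shifted = [[h[i][j] - (e0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    psi = normalize(project_out(psi0, keep_orthogonal_to))
    e = e0
    for _ in range(iters):
        nxt = project_out(solve(shifted, psi), keep_orthogonal_to)
        e = e0 + dot(psi, psi) / dot(nxt, psi)
        psi = normalize(nxt)
    return e, psi


if __name__ == "__main__":
    h_test = [[2.0, 1.0], [1.0, 2.0]]  # hypothetical test matrix
    e_1, psi_1 = inverse_iteration(h_test, 0.5, [1.0, 0.0])
    e_2, psi_2 = inverse_iteration(h_test, 0.5, [1.0, 0.0], keep_orthogonal_to=psi_1)
    print(e_1, e_2)
```

the closer the shift is to the sought eigenvalue, the faster the convergence, which is why a variational estimate is a natural starting point for both the shift and the initial vector.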
the iterations with the direct hamiltonian matrix $(\hat h_2 - e_0)$ diverge and cannot be applied. the profiles of wf for the two lowest sub-levels are presented in fig. 4 for the temperature $t = 300$ k and the maximum doping $n_{1d0} = 3\cdot10^{21}$ cm$^{-3}$. the dependencies of the two lowest energy sub-levels in the quantum well on the maximum doping concentration for $t = 300$ k and on the temperature $t$ for the maximum doping $n_{1d0} = 3\cdot10^{21}$ cm$^{-3}$ are given in fig. 5 for symmetric wf.

fig. 4 wave functions computed for the case of anisotropic effective masses, $t = 300$ k, $n_{1d0} = 3\cdot10^{21}$ cm$^{-3}$. part a) is the first symmetric wave function; the left panel is computed from the matrix iteration method, the right panel is from the variation method. part b) is the same for the second symmetric wave function. parts c) and d), e) and f), g) and h) are the first and the second wave functions, respectively, computed from the variation method for different types of symmetry or antisymmetry.

fig. 5 the dependence of the two lowest energy sub-levels in the quantum well on the maximum doping concentration for $t = 300$ k (a) and on the temperature for $n_{1d0} = 3\cdot10^{21}$ cm$^{-3}$ (b). the case of anisotropic effective mass, eq. (5b), symmetric wf, is considered. the solid lines are the data obtained from the matrix iteration method, the dotted lines are those from the variation method.

the variation method yields accurate values of the energy sub-levels. for instance, at $t = 300$ k and $n_{1d0} = 3\cdot10^{21}$ cm$^{-3}$ the values of the energy for the symmetric case calculated from the iteration procedure are $e_1 = -4.09\ ry^*$, $e_2 = -1.94\ ry^*$. the same values computed from the variation method are $e_1 = -4.075\ ry^*$, $e_2 = -1.895\ ry^*$.
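the agreement between the two methods can be quantified directly from the quoted numbers. the short python sketch below computes the relative differences and converts the energies to ev using the approximate value $ry^* \approx 0.12$ ev from section 2; the variable names are illustrative only.

```python
RY_STAR_EV = 0.12  # effective rydberg in eV (approximate value quoted in section 2)

iteration = {"e1": -4.09, "e2": -1.94}    # matrix iteration method, in ry*
variation = {"e1": -4.075, "e2": -1.895}  # variation method, in ry*


def relative_difference(a: float, b: float) -> float:
    """Relative difference of b with respect to the reference value a."""
    return abs(a - b) / abs(a)


for level in ("e1", "e2"):
    diff = relative_difference(iteration[level], variation[level])
    print(
        f"{level}: {100.0 * diff:.2f} % difference, "
        f"{iteration[level] * RY_STAR_EV:.3f} eV (iteration), "
        f"{variation[level] * RY_STAR_EV:.3f} eV (variation)"
    )
```

the differences are about 0.4 % for the first sub-level and about 2.3 % for the second, and in both cases the variational value lies above the value from the iteration procedure.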
the profiles of wf calculated from the variation method are approximate; there is some difference at the periphery from those computed by the matrix iteration procedure. in the report [10] the electron spectrum was calculated by the shooting method applied in the polar coordinate system. the energy sub-levels coincide with the data presented here, but that numerical realization of the shooting method is more complicated. the difference of the lowest energy sub-levels $e_2 - e_1$ does not depend on the temperature for the anisotropic case, symmetric wf, see fig. 5, b. for the isotropic case, eq. (5a), this is not valid, see fig. 3, c. this can be explained by the higher values of the electron sub-levels $|e_{1,2}|$ for the anisotropic case. it is also possible to calculate wf more accurately by means of standard simulators based on finite element methods, like comsol multiphysics [15].

5. conclusions

the application of the tf method to the electron spectrum of quantum wires in n-si can be subdivided into two problems. the first one is the simulation of the electron potential energy from a simple ordinary differential equation. the iteration procedure demonstrates rapid convergence even when many-body effects, like exchange, are taken into account. then it is possible to solve the schrödinger equations for the wave functions and the energy sub-levels. because of the anisotropy of the effective electron mass in silicon, this problem is generally two-dimensional. two simple methods have been proposed. the variation method yields accurate values of the energy sub-levels, whereas the profiles of the electron wave functions are approximate at the periphery. the method based on the inverse matrix iterations is more accurate both for the eigenvalues and the eigenfunctions.

acknowledgement: the authors would like to thank sep-conacyt (mexico) for partial support of this work.

references
[1] d. w. drumm, a. budi, m. c. per, s. p. russo, and l. c. l. hollenberg, “ab initio calculation of valley splitting in monolayer δ-doped phosphorus in silicon”, nanoscale research lett., vol. 8, no 1, pp. 111121, jan. 2013.
[2] b. weber, s. mahapatra, h. ryu, s. lee, a. fuhrer, t. c. g. reusch, d. l. thompson, w. c. t. lee, g. klimeck, l. c. l. hollenberg, and m. y. simmons, “ohm’s law survives to the atomic scale”, science, vol. 335, no 6064, pp. 64-67, jan. 2012.
[3] f. j. ruess, w. pok, t. c. g. reusch, m. j. butcher, kuan eng j. goh, l. oberbeck, g. scappucci, a. r. hamilton, and m. y. simmons, “realization of atomically controlled dopant devices in silicon”, small, vol. 3, pp. 563-567, apr. 2007.
[4] d. k. ferry, s. m. goodnick, and jonathan bird, transport in nanostructures, cambridge: cambridge univ. press, 2009.
[5] l. ramdas ram-mohan, finite element and boundary element applications in quantum mechanics, oxford: oxford university press, 2002.
[6] y. fu and m. willander, physical models of semiconductor quantum devices, dordrecht: kluwer, 1999.
[7] l. m. gaggero-sager, “exchange and correlation via functional of thomas-fermi in delta-doped quantum wells”, modelling simul. mater. sci. eng., vol. 9, pp. 1-5, jan. 2001.
[8] v. grimalsky, l. m. gaggero-s., s. koshevaya, and a. garcia-b., “electron spectrum of δ-doped quantum wells by thomas-fermi method at finite temperatures”, in proc. 27th international conference on microelectronics (miel 2010), nis, serbia, 2010, pp. 119-122.
[9] c. castrejon-martinez, v. grimalsky, l. m. gaggero-sager, and s. koshevaya, “combined method for simulating electron spectrum of delta-doped quantum wells in n-si with many-body corrections”, progress in electromagnetics research m, vol. 31, pp. 215-229, aug. 2013.
[10] v. grimalsky, o. oubram, s. koshevaya, and c.
castrejon-m., “thomas-fermi method for computing the electron spectrum of highly doped quantum wires in n-si”, in proc. 29th international conference on microelectronics (miel 2014), belgrade, serbia, 2014, paper 049.
[11] v. a. ilyina and p. k. silaev, numerical methods for theoretical physicists, vol. 2, moscow: institute for computing research publ., 2004 (in russian).
[12] k. t. hecht, quantum mechanics, n.y.: springer, 2000.
[13] w. h. press, s. a. teukolsky, w. t. vetterling, and b. p. flannery, numerical recipes in fortran, cambridge: cambridge university press, 1997.
[14] j. stoer and r. bulirsch, introduction to numerical analysis, n.y.: springer, 2002.
[15] s. m. musa, ed., computational finite element methods in nanotechnology, boca raton, fl: crc press, 2013 (www.comsol.com).
facta universitatis series: electronics and energetics vol. 32, no 4, december 2019, pp. 529-538 https://doi.org/10.2298/fuee1904529k © 2019 by university of niš, serbia | creative commons license: cc by-nc-nd

classification of electricity consumers using artificial neural networks
dragana knežević, marija blagojević
university of kragujevac, faculty of technical sciences čačak, serbia

abstract. this paper explains the process of using neural networks, as one of numerous data mining techniques, for the classification of electricity consumers. the processed data comprised more than a million recordings of electricity consumption for 21,643 consumers over a period of four years and eight months. using a data subset (70% of the entire dataset), the network was trained to classify consumers according to the type of electric meter they possess (single-rate or dual-rate) and the zone they live in (city or village). the network input data in both cases included: consumer code, reading period from-to, current and previous meter reading for both low and high tariff, dual- and single-rate tariff consumption for that period and their total amount, as independent variables, whereas the network output comprised the dependent variable classes (zone or type of electric meter). the results show that a network created in this way can be trained well enough to achieve high precision when evaluated using the test dataset. using the available recordings of electricity consumption, the type of electric meter consumers possess and the zone they live in can be predicted with an accuracy of 77% and 82%, respectively. these findings can provide the basis for further research using other data mining techniques.

key words: data mining, neural network, classification, prediction, electricity, r programming.

1.
introduction

received january 17, 2019; received in revised form june 3, 2019
corresponding author: dragana knežević, faculty of technical sciences čačak, svetog save 65, 32000 čačak, serbia (e-mail: dknezevic28@gmail.com)

data mining has emerged as a result of attempts to find more effective ways of dealing with ever-growing amounts of stored data. the appropriate use of data can be highly beneficial, and data mining is primarily aimed at discovering patterns among seemingly unrelated data. depending on the problem to be solved, the volume of available data, and the format in which results should be reported, one of numerous data mining techniques can be selected. data mining is commonly considered a multidisciplinary field [1], which is confirmed by its applications in various areas. it is used for research purposes in economics, for testing security systems and discovering elements that affect their performance [2], for education-related purposes [3], for increasing it efficiency and solving the problems of storing huge datasets, as well as the problems of transferring, processing and analyzing data using the internet of things [4], and even for making predictions about the risk of falling off a bike [5] or about the result of a football match [6]. furthermore, data mining techniques can be used to analyze electricity consumption in order to design more efficient electric power distribution systems that meet specific consumers' needs [1], or to compare electricity consumption during different seasons, national holidays, etc. [7]. moreover, these techniques can be used for consumer clustering based on the amount of consumed electricity, taking into account weather conditions and other factors [8], as well as for short-term and long-term planning and predictions of electricity demand, but also for predictions of potential electricity theft and misuse [9, 10, 11, 12].
the authors of [13], for instance, developed a model for the detection of two types of illegal electricity consumption (entire or partial) using the combination of classification methods and the levenberg-marquardt method in smart grid. drawing upon the knowledge bases providing information on the existing medium voltage electricity consumers, the authors of [14] endeavored to identify typical characteristics and load profiles of consumers in order to ensure successful predictions and classification of new consumers, whereas the aim of [15] was to perform the classification and categorization of the existing consumers based on the characteristics of the electricity consumption. this paper focuses on the classification of electricity consumers based on two different criteria: the type of the electric meter they possess and the zone they live in. an extremely huge dataset was used for the purpose of the research, exceeding a million recordings, in order to ensure higher precision of the obtained results. such results may serve as a basis and inspiration for future research. one of the greatest data mining challenges is the selection of the appropriate algorithm. unlike other fields (e.g. statistics), data mining allows the simultaneous use of several different techniques, and the applied algorithms are tested using a data subset (commonly referred to as the test dataset) before selecting the one that yields the most reliable results. in this paper, the emphasis is placed on solving the problem of classification using neural networks as a type of supervised learning. the aim of the research is to find out whether neural networks can be used for the classification of electricity consumers based on several different criteria. the very fact that the given classes might not be clearly partitioned, i.e. the fact that they might be intertwined regarding the type of the meters and the zone where they are installed, makes the research all the more interesting. 
it also makes it more difficult for the algorithm itself to tell the classes apart.
2. methodology
the target dataset included information about electricity consumers on the territory of the city of užice over a period of four years and eight months, i.e. from january 2014 to august 2018. the consumers in the observed geographical area, i.e. the territory under the jurisdiction of the electricity distribution company of the city of užice (ed užice), can be classified into several basic categories. one classification may be performed on the grounds of the place where the meters are installed, i.e. whether they are in urban or rural areas, which is an interesting classification criterion. on the other hand, there are two types of electric meters, single-rate and dual-rate ones, and the above-mentioned zones usually differ with respect to these types, i.e. most single-rate meters are installed in rural areas, though this is not universally true. therefore, it might be interesting to perform such a classification. the target dataset comprised 1,048,575 readings for a total of 21,643 consumers. the analyzed data included: the consumer category, the zone where consumers live, the information whether they possess a single-rate or dual-rate meter, meter readings at the beginning and end of each month for the single rate or both rates, as well as the month (the billing period), and the total electricity consumption for both rates, expressed in kwh. this dataset had a special target variable, which became its class attribute. because the classes were determined by the user, this type of learning is called supervised learning [16], which implies that the training dataset is used to train a neural network, which is subsequently validated using the test dataset. the data were processed and the neural network created using the r programming language in the r studio environment.
after importing the dataset into the r studio working environment, and before running the neural network algorithm, it was necessary to preprocess the data. as the dataset contained different types of data (numeric data, strings, dates, etc.), the data were normalized, i.e. all values were transformed into numeric values in the 0-1 range, as shown in tables 2 and 3. preprocessing also involved separating the relevant data from those insignificant for the desired analysis, and it was followed by the division of the data into two groups, the training and test datasets. for the training of the neural network, 70% of the data (734,003 readings) were selected using random sampling, whereas the remaining data (314,572 readings) were used for the purpose of testing the model (the test dataset). after the removal of irrelevant and redundant attributes, and the selection of the classification variable, the neural network algorithm was run. the inputs passed to the network training function included: the dependent variable, independent variables, the target set (i.e. the data subset used to train the network), the selected algorithm, the number of neural network training repetitions, the decision whether the output would be printed and how, the threshold value, and the number of hidden layers, if any. while training a neural network, the results were printed and the network diagram, as shown in figure 1, was created, showing the input and output values, as well as hidden layers. moreover, the values of attribute weights in a layer (layers) can be seen.
fig. 1 diagram of trained neural network
all the data 'learnt' in the previous process (with the exception of the classification variable) were evaluated using the test dataset. a table containing the predicted and actual values was created, showing the network prediction accuracy.
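the preprocessing described above (scaling to the 0-1 range, then a 70/30 split) can be sketched as follows. this is a minimal python illustration rather than the authors' r code: the function names, the fixed seed and the use of simple min-max scaling are assumptions, and the ten sample values are the first monthly totals listed in table 2.

```python
import random

def min_max_normalize(column):
    """Scale a list of numbers linearly into the 0-1 range (assumed scheme)."""
    lo, hi = min(column), max(column)
    if hi == lo:                      # constant column: map every value to 0
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]

def train_test_split(rows, train_fraction=0.7, seed=0):
    """Shuffle the row indices and cut them into training and test subsets."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)      # hypothetical fixed seed
    cut = round(len(rows) * train_fraction)   # number of training rows
    train = [rows[i] for i in indices[:cut]]
    test = [rows[i] for i in indices[cut:]]
    return train, test

# first ten monthly totals (kWh) from table 2, used only as sample input
totals = [447, 404, 382, 399, 435, 391, 432, 366, 456, 455]
scaled = min_max_normalize(totals)
train, test = train_test_split(scaled)
print(len(train), len(test))
```

because the scaling here uses only these ten values, the numbers it produces differ from those in table 3, where the minimum and maximum are presumably taken over the whole dataset.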
finally, the confusion matrix was used to summarize the values of all correct and incorrect predictions [17, 18]. a typical confusion matrix is shown in table 1.
table 1 confusion matrix [17]
actual value | classified negative | classified positive
actual negative | tn | fp
actual positive | fn | tp
tp: true positive, the model correctly predicts the positive class (we predicted 'yes', and it is 'yes'),
tn: true negative, the model correctly predicts the negative class (we predicted 'no', and it is 'no'),
fp: false positive, the model incorrectly predicts the positive class (we predicted 'yes', but it is 'no'),
fn: false negative, the model incorrectly predicts the negative class (we predicted 'no', but it is 'yes') [16, 14].
given a confusion matrix, other measures such as accuracy, precision (p) and recall/sensitivity (r) can be calculated using equations 1, 2 and 3, respectively.
$$\text{accuracy} = \frac{tp + tn}{tp + tn + fp + fn} \quad (1)$$
$$p = \frac{tp}{tp + fp} \quad (2)$$
$$r = \frac{tp}{tp + fn} \quad (3)$$
although, theoretically speaking, there is no correlation between precision and recall, in practice, a high level of precision is almost always achieved at the expense of recall, and the maximum recall is achieved at the expense of precision. which of these two measures is more important mostly depends on the nature of the application. very often, when a single metric is needed for the comparison of different classifiers, the f-score (also known as f1-score) is used (4) [17]:
$$f = \frac{2\,p\,r}{p + r} \quad (4)$$
the f-score is the harmonic mean of precision (p) and recall (r). the harmonic mean of a pair of numbers strongly tends towards the lower one. therefore, in order to get a high f1-score, both p and r values must be high. there is another measure, known as the precision and recall breakeven point. the breakeven point is the point at which precision equals recall (p = r). it implies that different cases can be classified based on their probability of being positive. if this point cannot be found, it must be estimated using an interpolation method [17].
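equations 1 to 4 can be written as a short python sketch. the function name is hypothetical; the sample counts are those reported later in table 4, read with class '0' as the positive class, which is an assumption made here because it is the reading under which the values agree with table 6.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and F1 from the four confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)            # equation 1
    precision = tp / (tp + fp)                            # equation 2
    recall = tp / (tp + fn)                               # equation 3
    f1 = 2 * precision * recall / (precision + recall)    # equation 4
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# counts from table 4, assuming class '0' is treated as positive
m = classification_metrics(tp=77405, tn=167220, fp=42398, fn=27550)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

rounded to four decimals, these values match the first column of table 6 (0.7776, 0.6461, 0.7375 and 0.6888). the f1 value lies below the arithmetic mean of p and r, which illustrates how the harmonic mean is pulled towards the lower of the two.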
3. results and discussion
the paper presents the results of the analysis of two different examples of prediction. the research was carried out using the same dataset, but different dependent (class) variables. the aim of the first analysis was to build a neural network that would predict whether electricity consumers possess a single-tariff or dual-tariff electric meter, whereas the aim of the second one was to predict in which zone the consumers live.
3.1. classification of consumers according to type of meter they possess
in order to solve this problem, two classes of the dependent variable 'type_of_electric_meter' were defined. a single-tariff meter was labelled '0', and a dual-tariff meter was labelled '1'. the normalization of data was done at the beginning of the research. table 2 shows the data subset before, and table 3 shows the same subset after the normalization process.
table 2 data subset overview before normalization
consumer zone period_from period_to type_of_electrical_meter current_reading_lr previous_reading_lr dualrate_tariff(lr) previous_reading_hr current_reading_hr singlerate_tariff(hr) total_both_tariffs
potrosac_1 1 01.01.2014 31.01.2014 0 1 1 0 64567 65014 447 447
potrosac_1 1 01.02.2014 28.02.2014 0 1 1 0 65014 65418 404 404
potrosac_1 1 01.03.2014 31.03.2014 0 1 1 0 65418 65800 382 382
potrosac_1 1 01.04.2014 30.04.2014 0 1 1 0 65800 66199 399 399
potrosac_1 1 01.05.2014 31.05.2014 0 1 1 0 66199 66634 435 435
potrosac_1 1 01.06.2014 30.06.2014 0 1 1 0 66634 67025 391 391
potrosac_1 1 01.07.2014 31.07.2014 0 1 1 0 67025 67457 432 432
potrosac_1 1 01.08.2014 31.08.2014 0 1 1 0 67457 67823 366 366
potrosac_1 1 01.09.2014 30.09.2014 0 1 1 0 67823 68279 456 456
potrosac_1 1 01.10.2014 31.10.2014 0 1 1 0 68279 68734 455 455
potrosac_1 1 01.11.2014 30.11.2014 0 1 1 0 68734 69108 374 374
potrosac_1 1 01.12.2014 31.12.2014 0 1 1 0 69108 69565 457 457
potrosac_1 1 01.01.2015 31.01.2015 0 1 1 0 69565 70055 490 490
potrosac_1 1 01.02.2015 28.02.2015 0 1 1 0 70055 70422 367 367
in this example, the independent variables included: consumer, zone, period_from, period_to, current_reading_lr, previous_reading_lr, dualrate_tariff.lr, previous_reading_hr, current_reading_hr, singlerate_tariff.hr, total_both_tariffs. a little more than 70% of the data were used as the training dataset, and the remaining percentage served as the test dataset.
table 3 data subset overview after normalization
consumer zone period_from period_to type_of_electrical_meter current_reading_lr previous_reading_lr dualrate_tariff(lr) previous_reading_hr current_reading_hr singlerate_tariff(hr) total_both_tariffs
1 0 0 0 0.418182 0 0 0 0 0.14096 0.141649 0.014261
2 0 0 0.090909 0 0 0 0 0 0.141936 0.142529 0.012889
3 0 0 0.181818 0.509091 0 0 0 0 0.142818 0.143362 0.012187
4 0 0 0.272727 0.090909 0 0 0 0 0.143652 0.144231 0.012729
5 0 0 0.363636 0.6 0 0 0 0 0.144523 0.145179 0.013878
6 0 0 0.454545 0.181818 0 0 0 0 0.145473 0.146031 0.012474
7 0 0 0.545455 0.690909 0 0 0 0 0.146327 0.146972 0.013782
8 0 0 0.636364 0.781818 0 0 0 0 0.14727 0.147769 0.011677
9 0 0 0.727273 0.272727 0 0 0 0 0.148069 0.148763 0.014548
10 0 0 0.8 0.872727 0 0 0 0 0.149064 0.149754 0.014516
11 0 0 0.872727 0.345455 0 0 0 0 0.150058 0.150569 0.011932
12 0 0 0.945455 0.945455 0 0 0 0 0.150874 0.151565 0.01458
13 0 0 0.018182 0.436364 0 0 0 0 0.151872 0.152632 0.015632
14 0 0 0.109091 0.018182 0 0 0 0 0.152941 0.153432 0.011708
after the completion of the learning process, which was performed using the training dataset, the neural network was created, as shown in figure 2.
fig. 2 neural network diagram created using dependent variable 'type_of_electric_meter'
finally, it was necessary to test the performance of the model obtained using the training dataset and determine the neural network accuracy by comparing these results with the data from the test group. first, the dependent variable was removed from the test dataset and the predictor variable was determined. then the predicted values were compared with the actual ones, and ultimately, the confusion matrix was created, providing an overview of true, false, positive and negative results. the confusion matrix is shown in table 4.
table 4 confusion matrix
prediction | actual 0 | actual 1
0 | 77405 | 42398
1 | 27550 | 167220
based on the data provided in the confusion matrix, the accuracy of this specific model was calculated using equation 1, as well as its precision (p) using equation 2, sensitivity (r) using equation 3, and f1-score using equation 4. for the resulting values, please see table 6.
3.2. classification of consumers according to zone they live in
in the second example, the same dataset was used for the classification according to the zone where consumers live (urban zone: the class labeled '0', rural zone: the class labeled '1') in order to predict whether a consumer lives in an urban or in a rural zone. exactly 70% of the data were used as the training dataset, and the remaining 30% were used as the test dataset. again, the data were normalized and the neural network was created, as shown in figure 3.
fig. 3 neural network diagram created using dependent variable 'zone'
after creating the network, the results were compared with the data from the test group, and their agreement was reported using the confusion matrix, as shown in table 5.
table 5 confusion matrix
prediction | actual 0 | actual 1
0 | 163965 | 17154
1 | 39199 | 94255
based on these data, the accuracy of this specific model was calculated using equation 1, as well as its precision (p), sensitivity (r) and f1-score using equations 2, 3 and 4, respectively; the resulting values are listed in table 6. given the confusion matrix, other measures can be calculated in addition to the above-mentioned ones. this can also be done using ready-made software available on the internet. an example of such software is available at the following address: http://onlineconfusionmatrix.com/ [19]. the measures obtained using this software for the examples described in sections 3.1 and 3.2 are given in table 6.
table 6 confusion matrix measures
measure | value (classification: type of electrical meter) | value (classification: zone) | derivation
sensitivity / recall (r) | 0.7375 | 0.8071 | tpr = tp / (tp + fn)
specificity | 0.7977 | 0.8460 | spc = tn / (fp + tn)
precision (p) | 0.6461 | 0.9053 | ppv = tp / (tp + fp)
negative predictive value | 0.8586 | 0.7063 | npv = tn / (tn + fn)
false positive rate | 0.2023 | 0.1540 | fpr = fp / (fp + tn)
false discovery rate | 0.3539 | 0.0947 | fdr = fp / (fp + tp)
false negative rate | 0.2625 | 0.1929 | fnr = fn / (fn + tp)
accuracy | 0.7776 | 0.8209 | acc = (tp + tn) / (p + n)
f1 score | 0.6888 | 0.8534 | f1 = 2tp / (2tp + fp + fn)
matthews correlation coefficient | 0.5197 | 0.6320 | mcc = (tp*tn - fp*fn) / sqrt((tp+fp)*(tp+fn)*(tn+fp)*(tn+fn))
4. conclusion
according to the available literature, data classification via neural networks is one of the most commonly used techniques for processing huge datasets. this technique was used to obtain the results reported in this paper. the results can be further processed for different purposes. given the fact that a large dataset was dealt with, the results should provide a clear and unambiguous picture of the research.
in the first example, the classification of electricity consumers according to the type of the meter they possess was performed. the accuracy of the predicted data was 0.7776 (about 78%), and the precision was 0.6461 (about 65%), which is a satisfactory result. in the example relating to the predictions according to the zone consumers live in, high precision was achieved as well. the accuracy of 0.8209 (about 82%) and precision of 0.9053 (about 91%) are highly satisfactory. the high precision and favorable f1-scores indicate that the learning was successfully done, and it can be concluded that the algorithm can provide reliable results regarding the prediction of the zone consumers live in as well. while processing different examples using the same dataset, with the same dependent and independent variables, some adjustments of the basic network parameters, such as the number of hidden layers and the threshold, were performed, but it turned out that these parameters did not significantly affect the final values, and therefore the results are not reported herein. based on the obtained results, it can be concluded that this learning model can be used reliably enough to predict the type of the electric meter electricity consumers possess, as well as whether they live in an urban or rural area. this paper presents only a segment of the research and the results obtained, which will provide the basis for further research using other methods and algorithms, to be described in future papers.
acknowledgement: this study was supported by the serbian ministry of education and science, project iii 44006 and project iii 41007.
references
[1] u. ali, c. buccella and c. cecati, "households electricity consumption analysis with data mining techniques", department of information engineering, computer science and mathematics, university of l'aquila, italy, 2016.
[2] d. shi, j. guan, j. zurada and a. manikas, "a data-mining approach to identification of risk factors in safety management systems", journal of management information systems, vol. 34, no. 4, pp. 1054-1081, 2017.
[3] m. blagojević, "appliance of web mining in education", technics and informatics in education, čačak, 2010.
[4] s. shadroo and m. a. rahmani, "systematic survey of big data and data mining in internet of things", computer networks, 2018.
[5] g. prati, l. pietrantoni and f. fraboni, "using data mining techniques to predict the severity of bicycle crashes", accident analysis & prevention, elsevier ltd, 2017, pp. 44-54.
[6] m. carpita, m. sandri, a. simonetto and p. zuccolotto, "data mining applications with r", research center "data, methods and systems", department of economics and management of the university of brescia, italy, 2014.
[7] z. guo, k. zhou, x. zhang, s. yang and z. shao, "data mining based framework for exploring household electricity consumption patterns: a case study in china context", journal of cleaner production, elsevier ltd, 2018.
[8] r. rathod and r. d. garg, "regional electricity consumption analysis for consumers using data mining techniques and consumer meter reading data", international journal of electrical power & energy systems, vol. 78, pp. 368-374, 2016.
[9] s. k. barai, "data mining applications in transportation engineering", journal transport, 2003.
[10] c. da cunha, b. agard and a. kusiak, "data mining for improvement of product quality", international journal of production research, vol. 44, no. 18-19, pp. 4027-4041, 2006.
[11] c. djeraba, "data mining from multimedia", international journal parallel emergent distributed system, vol. 22, pp. 405-406, 2007.
[12] s. xiaogang, "data mining methods and models", the american statistician, vol. 62, no. 1, pp. 91, 2012.
[13] a. ghasemi and m. gitizadeh, "detection of illegal consumers using pattern classification approach combined with levenberg-marquardt method in smart grid", international journal of electrical power and energy systems, vol. 99, pp. 363-375, 2018.
[14] s. ramos, j. m. duarte, f. j. duarte and z. vale, "a data-mining-based methodology to support mv electricity customers' characterization", elsevier bv, 2015.
[15] z. jiang, r. lin and f. yang, "a hybrid machine learning model for electricity consumer categorization using smart meter data", energies, vol. 11, p. 2235, 2018.
[16] f. günther and s. fritsch, "neuralnet: training of neural networks", the r journal, vol. 2/1, 2010.
[17] b. liu, web data mining. exploring hyperlinks, contents, and usage data, second edition, springer, 2011.
[18] g. ciaburro and b. venkateswaran, "neural networks with r", packt publishing ltd, birmingham, 2017.
[19] software on web address: http://onlineconfusionmatrix.com/, accessed in: october 2018.

facta universitatis
series: electronics and energetics vol. 36, no 1, march 2023, pp. 103-119
https://doi.org/10.2298/fuee2301103b
© 2023 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper
design and implementation of digital controller in delta domain for buck converter
arka biswas1, arindam mondal2, prasanta sarkar3
1department of aerospace engineering, iit kharagpur, west bengal, india
2department of electrical engineering, dr bc roy engineering college, durgapur, west bengal, india
3department of electrical engineering, nitttr kolkata, west bengal, india
abstract. this paper presents the design and implementation of a discrete-time controller for a dc-dc buck converter in the complex delta domain.
whenever a continuous-time system is sampled at a very high rate to obtain a corresponding discrete-time system, the shift-operator parameterized discrete-time system fails to provide meaningful information. there is another discrete-time operator, called the delta operator. in a delta-operator parameterized discrete-time system, the discrete-time and continuous-time results can be obtained hand in hand at a very high sampling rate, rather than as two separate cases. this superior property of the delta operator is exploited in this paper to design the proposed controller in the discrete domain. the proportional plus integral (pi) controller designed in the delta domain is used to maintain the output voltage of the buck converter at the load end under varying load and varying supply voltage conditions. the controller is designed and implemented using the ds1202 dspace board. the output voltage of the buck converter is scaled and fed to the onboard analogue-to-digital converter of the ds1202. under the different disturbances, the error between the desired output voltage and the actual output voltage is measured, and the delta pi controller is used to manipulate the duty cycle of the converter. the pulse width modulation (pwm) signal with this duty cycle is generated using the ds1202 board and is applied to the gate of the metal oxide semiconductor field-effect transistor (mosfet) via a suitable driver, such that the output voltage of the buck converter remains at its desired value.
key words: buck converter, delta domain, digital controller, dspace board, pi controller
1. introduction
the sources of conventional energy are decreasing day by day and the supply-demand gap is therefore increasing. this leads to a growing demand for non-conventional sources of energy. the output of most renewable sources is a dc voltage, and it is not stabilized.
for stabilization and conversion from one dc voltage level to another, one of the most important power-electronic circuits, the dc-dc converter, is used [1]. maximum power point tracking is a very important area for the maximization of solar power and is done through an electronic circuit consisting of a dc-dc converter [2]. there are two types of dc-dc converters available, one is the buck converter, and another is the boost converter. for the reduction of the voltage level, the buck converter is used. the buck converter is widely used for dc motor drive control [3] and renewable systems [4], [5], [6], as it is one of the most interesting power electronics circuits, converting an unregulated dc input into a controlled dc output. whenever the supply voltage varies or the load is changed, the output voltage of the buck converter may change, which calls for a proper choice of controller [7]. in [8], a robust adaptive control (rac) approach using system identification methodologies is illustrated for controlling a buck converter by pwm (pulse width modulation) in the presence of input voltage as well as load variations. pid controllers are used for controlling the output voltage of the buck converter [9] in low-power applications such as powering leds. as the buck converter itself is a nonlinear system, the control of its voltage can be improved through the use of fractional-order pid controllers [10]. in [11], rct digital robust control is used to overcome the instability issues caused by the negative resistance effect of a constant power load.
received july 08, 2022; revised august 21, 2022; accepted september 01, 2022
corresponding author: arindam mondal, department of electrical engineering, dr bc roy engineering college, durgapur, west bengal, india (e-mail: arininstru@gmail.com)
digital controllers based on the nonlinear least squares optimization method can be used for controlling the high-frequency buck converter [12]. through this approach, the performance of the controller is optimized through the pole-zero cancellation (pzc) technique, and the adverse effects of the undesired poles on the buck converter power stage are drastically reduced. a derivative-free nelder-mead (n-m) simplex method for designing a digital controller for a buck converter operating at high frequency is described in [13] for the improvement of rise time and settling time. the proportional-integral (pi) controller gives zero steady-state error and, being the simplest of all the controllers, is generally used in different dc-dc converters [14]. digital controllers offer clear advantages over analog controllers and are therefore used for controlling the dc-dc buck converter. with a digital control strategy, the algorithm or program can be easily altered. the digital pid controller can improve the performance of the buck converter by varying the loop gain, cross-over frequency and phase margin [15]. the control algorithm developed through the shift operator parameterization shows deficiencies in high-frequency applications of the buck converter [16]. digital controller design using the delta domain is better than controller design using the shift operator, particularly when the sampling rate is very high. the advantages and applications of the delta operator in control theory are elaborated in [17], [18], and [19]. the delta operator has the useful property of giving results in the digital domain that are equivalent to those of the continuous 's' domain, particularly at high sampling frequency. the discrete 'z' transfer function approximation turns out to be very sensitive even to slight changes in the coefficient values, whereas the transfer function in the digital delta domain significantly improves the robustness of the estimate to parameter changes [20].
in [21], the delta operator is used to reduce the order of the model of a system, which helped to save some extra bits in a digital system. the superior property of the delta (δ) operator is used in fault detection and network control [22], and in kalman filter-based controller design for cyber-physical systems [23]. to check the packet losses in the sensor-to-controller link or the controller-to-actuator link, the delta operator is successfully applied to lyapunov-krasovskii functional design in the field of limited communication [24]. a delta domain-based pi controller is designed in [25] for indirect field-oriented control (ifoc) of an induction motor, and the superiority of the delta parameterised discrete-time system is demonstrated. at a very high sampling frequency, the continuous-time results are the obvious outcome of discrete-time measurements. the selection of the sampling frequency is very important during discretization. the sampling frequency must be at least 10 times the maximum frequency of the system to suitably reproduce the signal. for the design of a pi controller in the discrete shift operator parameterization, the sampling rate cannot be made high, as the design becomes numerically ill-conditioned at a very high sampling limit; therefore, for high-frequency digitally controlled switching converters, delta domain pi controllers are most suitable [26]. the pi controller is used instead of the pid controller in applications where the voltage exhibits only a small amount of ripple when the load changes from a lower to a higher value. in such cases the resulting drop in the output voltage remains small, and the same holds for the opposite load change; therefore, the pi controller alone is sufficient to keep the regulated process stable.
regulation using the pi controller is satisfactory, and the controller is widely used in industry because of its simple structure and cost-efficiency compared with the pid controller [27], [28]. for more precise results, a fractional-order controller can also be used instead of the traditional integer-order controller using the discrete delta operator [29]. for finding the parameters of a fractional-order controller in the delta domain, the alpha guided grey wolf optimization technique can be used [30]. the ds 1202 dspace board is a kind of embedded system where the controller can be designed and simulated using the simulink and dspace block sets. dspace has been successfully used for designing pid controllers for buck and boost converters [31], [32]. as the ds 1202 dspace board operates on the discrete-time platform, it can be used as a real-time controller for the buck converter in order to keep the output at the desired level irrespective of load and supply voltage variations. the hardware implementation of the buck converter, along with the controller formulated in the delta domain using the ds 1202 dspace board, is presented in this paper. the real-time analyses as well as the simulation results are obtained using matlab/simulink.
the significant contributions made in this paper are as given below: in earlier work, the digital controllers and discrete-time systems for the buck converter have been designed using shift operator parameterization, but shift operator parameterization fails to provide meaningful information at a high sampling rate. the real-time implementation of the controller in the digital domain needs a very high sampling rate to get a better result. this is the motivation to work on the implementation of a digital controller for the buck converter using the delta operator parameterization.
the most crucial part is that at the fast-sampling limit, the discrete-domain results resemble the continuous-time results in the delta operator parameterized system. moreover, the discrete-time pi controller for the buck converter, designed in the delta domain, is implemented using the ds1202 dspace board, which acts as the real-time controller with a built-in adc having a higher resolution than that of common microcontrollers. with the realized controller, the dc-dc buck converter provides a stable output voltage at the desired level. therefore, digital design and implementation of a pi controller for the buck converter using delta operator parameterization is a newer concept and a new direction for further research.
this paper is organized in the following way. the basics of the buck converter are discussed in section 2. in section 3, the control algorithm based on the delta operator for the dc-dc buck converter is described. the simulation and practical result analysis are illustrated in section 4. finally, section 5 is devoted to the conclusion.
2. buck converter
2.1. topology
the buck converter topology is used to step down the input voltage to a lower level. it consists of a power mosfet switch, a filter inductor l, a filter capacitor c, a freewheeling diode d and a resistive load rl. it operates either in continuous conduction mode or discontinuous conduction mode. figure 1(a) represents the topology of the buck converter under consideration.
2.2. operation
2.2.1. mode 1
when the gate pulse is applied to the mosfet, current flows through l, c, and rl, thus storing energy in the inductor. in this mode the diode remains reverse biased, the inductor current increases linearly and the load consumes energy from the source. fig. 1(b) shows the equivalent circuit for this mode, when the switch is on. the voltage and current equations during this mode are as follows:
$$e_l = l\,\frac{di}{dt} \quad (1)$$
where el is the inductor voltage.
let the inductor current increases from i1 to i2, the kirchhoff’s voltage equation is written as 2 1 0dc on i i e e l t  − − =     (1) where edc is the supply voltage, e0 is the output voltage and ton is the on-time of the switch. peak to peak ripple current through the inductor l is defined as: 2 1 i i i = − (2) equation (2) can be rewritten as 0dc on i e e l t  − = (3) 2.2.1. mode 2 when the switch is off, the energy stored previously in the inductor acts as a source and current flows through c, rl and d. in this mode, the diode is in forward biased and conducts. fig. 1(c) shows the equivalent circuit of mode 2 when the switch is off. during toff, the inductor current falls linearly from i2 to i1 and therefore the output voltage is expressed as 0 off i e l t  − = − (4) where toff, is the off-time of the switch. comparing i from (3) and (4) and rearranging the variables, (5) is obtained. design and implementation of digital controller in delta domain for buck converter 107 0 ( ) off on dc on e t t e t+ = (5) the time period is defined as on off t t t= + . therefore equation (5) can be rewritten as 0 dc on e t e t= (6) defining the duty ratio / on t t as . the output equation can be expressed as 0 dc e e= (7) fig. 1 (a) an ideal buck converter, (b) mode 1: switch on, (c) mode 2: switch off the buck converter design parameters and the values of the components are detailed in table 1 table 1 buck converter design parameter and values parameter with symbol value units input voltage (edc) 8 volt load resistance (rl) 100 kω load inductance (ll) 100 h series inductor (l) 100 h esr of inductor (rl) 10 mω output capacitor (c) 1000 f esr of capacitor (rc) 30 mω forward drop across diode (vd) 0.7 volt esr of diode when conducting (rd) 0.01 ω drain-source resistance of mosfet (rt) 8 mω operating frequency 5 khz 2.3. 
choice of sampling rate

although the nyquist sampling theorem requires a sampling frequency of only twice the maximum frequency contained in the signal, the rule of thumb is that the minimum sampling frequency should be 10 times the maximum frequency of the system. therefore, the sample time will be 1/10th of the time constant. the transfer function of a buck converter, with $V_o(s)$ as the output voltage and $d(s)$ as the duty cycle, is given below:

$$G_{buck}(s) = \frac{V_o(s)}{d(s)} = \frac{V_{in}/(LC)}{s^2 + \dfrac{s}{RC} + \dfrac{1}{LC}} \qquad (8)$$

the sampling time is related to the time constant; therefore, the time constant is calculated before the sampling time is decided. considering the values of r and c to be 100k and 1000uf respectively, the time constant comes out as 0.005 sec. applying the rule of thumb above, the sampling time can be taken as 0.0005 sec or less. for the controller design in the delta domain, the sampling time is taken as 0.00001 sec, both to study the behavior of the controller at a high sampling rate and to establish the philosophy of the proposed controller design in the delta domain.

3. the control algorithm based on delta-operator for dc-dc buck converter

3.1. delta operator

the d/dt operator in the continuous domain is well known for modelling any dynamic system. it is defined as

$$\frac{dx}{dt} = \lim_{h \to 0} \frac{x(t+h) - x(t)}{h} \qquad (9)$$

the need for an operator that resembles this d/dt operator structurally as well as functionally in the discrete domain led to the development of the delta operator ($\gamma$), which is defined as

$$\gamma\,x_n = \frac{x_{n+1} - x_n}{\Delta} \qquad (10)$$

where $\Delta$ is the sampling time. it is an incremental difference operator that acts as a signal differentiator rather than shifting the signal, as the shift operator does. it is a shifted and scaled version of the shift operator.
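the behaviour of definition (10) can be checked numerically. the following is a minimal python sketch, not the authors' implementation: it applies the forward-difference form of the delta operator to samples of sin(t) and compares the result with the exact derivative cos(t). the test signal, the function names and the intermediate sampling time of 0.01 s are assumptions chosen for illustration; the sampling times 0.5 s and 0.00001 s are the ones used later in the paper.

```python
import math


def delta_op(x, n, step):
    """Delta operator of eq. (10): (x[n+1] - x[n]) / step."""
    return (x[n + 1] - x[n]) / step


def max_derivative_error(step, t_end=1.0):
    """Largest gap between the delta operator applied to sin(t)
    and the true derivative cos(t) on the interval [0, t_end]."""
    count = int(round(t_end / step))
    samples = [math.sin(k * step) for k in range(count + 1)]
    return max(
        abs(delta_op(samples, k, step) - math.cos(k * step))
        for k in range(count)
    )


# hypothetical comparison: coarse, moderate and fast sampling
errors = {step: max_derivative_error(step) for step in (0.5, 0.01, 0.00001)}
for step, err in errors.items():
    print(f"sampling time {step:>8} s -> max error {err:.2e}")
```

the error shrinks roughly in proportion to the sampling time, which is the numerical counterpart of the convergence property discussed in this section.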
it can easily be shown that the response of the delta operator converges to that of the continuous-time d/dt operator as the sampling time tends to zero. this property can be understood by comparing the stable zones of the continuous, shift and delta operators in the frequency domain. in the frequency domain, the d/dt operator is expressed by the laplace operator s, whose stable zone is well known to be the entire left half of the s-plane. the shift operator is denoted by z and is related to the laplace operator s as

$$z = e^{s\Delta} \qquad (11)$$

examining the positions of the poles, it is seen that the stable zone for the shift operator lies within a circle of radius 1 centred at the origin. in the frequency domain, the delta operator is defined as

$$\gamma = \frac{z - 1}{\Delta} \qquad (12)$$

since it is a shifted and scaled version of the shift operator, the stable zone of the delta operator is also shifted and scaled. it lies in a circle of radius $1/\Delta$ centred at $(-1/\Delta, 0)$. fig. 2 shows the stable zones of the three domains. it can be observed that as the sampling time reduces, the stable zone of the delta operator tends to converge to the stable zone of the continuous domain. thus, the use of the delta operator provides a unified approach to model, design, analyse, and implement the digital control scheme.

fig. 2 (a) stability zone: s domain, (b) stability zone: z domain, (c) stability zone: $\gamma$-domain

3.2. the digital controller design based on delta-operator

3.2.1. pi controller design

to control the dc-dc buck converter, a proportional and integral (pi) controller and a pwm generator are used. the pwm signal is required for the on/off operation of the mosfet of the buck converter. pwm is one of the most popular control techniques for switching devices.
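as an illustration of how a duty ratio becomes a gate signal, the sketch below compares the duty ratio with a sawtooth carrier. it is only a software model under stated assumptions, not the dspace pwm block: the function name, the 1 MHz sample rate and the requirement that the sample rate be an integer multiple of the carrier frequency are all hypothetical. the 5 kHz carrier matches the operating frequency in table 1, and the 8 V input with a 50% duty ratio corresponds to the 4 V set point through equation (7).

```python
def pwm_samples(duty, carrier_hz=5_000, sample_hz=1_000_000, periods=10):
    """Return 0/1 gate samples; the gate is high while the sawtooth
    carrier position within each period is below the duty ratio.

    Assumes sample_hz is an integer multiple of carrier_hz.
    """
    per_period = sample_hz // carrier_hz        # samples in one pwm period
    on_count = round(duty * per_period)         # samples with the gate high
    return [1 if (k % per_period) < on_count else 0
            for k in range(periods * per_period)]


def average(values):
    return sum(values) / len(values)


gate_50 = pwm_samples(0.50)
gate_75 = pwm_samples(0.75)

# ideal (lossless) average output per eq. (7): duty ratio times input voltage
ideal_output = average(gate_50) * 8.0
print(average(gate_50), average(gate_75), ideal_output)
```

the average of the gate signal equals the commanded duty ratio, which is why the pi controller can regulate the output voltage by adjusting a single number.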
in this experiment, the pwm signal is generated digitally to trigger the mosfet of the circuit. the duty cycle of the pwm signal is controlled by the pi controller. the proposed pi controller is designed in the discrete delta domain and is simulated using matlab/simulink before being implemented through dspace. the mathematical equations for a pi controller in the continuous, shift and delta domains are as follows:

$$U(s) = \left(K_p + \frac{K_i}{s}\right) E(s) \qquad (13)$$

$$U(z) = \left(K_p + \frac{K_i}{1 - z^{-1}}\right) E(z) \qquad (14)$$

since no $\gamma^{-1}$ block is available in matlab, the $z^{-1}$ module is used to represent $\gamma^{-1}$. the relation can be derived from equation (12) as

$$\gamma^{-1} = \frac{\Delta\,z^{-1}}{1 - z^{-1}}$$

therefore,

$$U(\gamma) = \left(K_p + \frac{K_i \cdot \Delta \cdot z^{-1}}{1 - z^{-1}}\right) E(\gamma) \qquad (15)$$

where $K_p$ and $K_i$ are the proportional gain and integral gain respectively, e is the error and u denotes the control signal. the representation of $\gamma^{-1}$ in simulink is shown in fig. 3.

fig. 3 representation of $\gamma^{-1}$ in matlab simulink

the transfer function of the pi controller in the $\gamma$ domain can be obtained from equation (14) by using equation (12). the simulations of the pi controller in the three stated domains are given in fig. 4.

3.2.2. ziegler-nichols approach for tuning of pi controller

the ziegler-nichols approach is the best-known method for tuning industrial controllers [33] and is widely favored by process control engineers in practice [34]. in this work, the ziegler-nichols approach is used to find the pi controller parameters for the buck converter in the continuous-time domain. the integral gain (ki) of the controller is set to zero and the proportional gain is slowly increased until a sustained oscillation is observed. the value of the proportional gain (kp) at which the sustained oscillation occurs is called the critical gain and is denoted by kc. the frequency of the oscillations is measured and is called the critical frequency (fc).
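the delta-domain control law of equation (15) can be sketched in a few lines of python. this is a minimal illustration under assumptions, not the simulink/dspace realisation used in the paper: the function name and the closure-based state are hypothetical, and $\gamma^{-1}$ is realised with one unit delay as y[n] = y[n-1] + $\Delta$·e[n-1], following the relation derived from equation (12). the gains 0.22 and 0.01 and the sampling time 0.00001 s are the values reported in the paper.

```python
def make_delta_pi(kp, ki, step):
    """PI controller u = kp*e + ki*(gamma^-1 e), where
    gamma^-1 = step * z^-1 / (1 - z^-1), as in eq. (15)."""
    state = {"integral": 0.0, "prev_error": 0.0}

    def update(error):
        # gamma^-1 with a unit delay: y[n] = y[n-1] + step * e[n-1]
        state["integral"] += step * state["prev_error"]
        state["prev_error"] = error
        return kp * error + ki * state["integral"]

    return update


controller = make_delta_pi(kp=0.22, ki=0.01, step=0.00001)

# constant unit error: the integral term should grow like ki * t
outputs = [controller(1.0) for _ in range(1001)]
print(outputs[0], outputs[-1])
```

for a constant error the integrator state after n samples equals n·$\Delta$, the same value the continuous-time integral reaches at t = n·$\Delta$, which illustrates why the delta-domain controller tracks its continuous-time counterpart at small sampling times.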
the values of kp and ki are tabulated as per the guidelines of ziegler and nichols and are given in table 2.

table 2 setting of pi controller parameters using ziegler-nichols rule

| controller | kp | ki |
|---|---|---|
| pi | 0.45 kc | 1.2 fc |

the values of kp and ki are optimised following the guidelines of the ziegler-nichols chart. the optimised values of kp and ki obtained are 0.22 and 0.01 respectively. the continuous-time transfer function of the pi controller is given by (16).

$$G_{PI}(s) = 0.22 + \frac{0.01}{s} \qquad (16)$$

the corresponding $\gamma$-domain transfer function of the pi controller is expressed by (17), and the controller structure given in (17) is realized using matlab/simulink and the dspace board for the implementation of the pi controller in the delta domain.

$$G_{PI}(\gamma) = 0.22 + 0.005\,\Delta + 0.01\,\gamma^{-1} \qquad (17)$$

fig. 4 (a) pi controller in the continuous domain, (b) pi controller in the discrete z domain, (c) pi controller in the discrete $\gamma$ domain

3.2.3. mechanism for design of digital controller in delta domain

the complete mechanism for the design of the proposed pi controller for the buck converter in the discrete delta domain is illustrated with the flowchart shown in fig. 5.

fig. 5 flowchart describing the complete mechanism of controller design in the delta domain

4. simulation and practical results

fig. 6 shows the schematic diagram of the proposed work.

fig. 6 schematic diagram of the proposed method

4.1. simulation

the experiment has first been simulated using matlab/simulink with the simelectronics module. the simulation of closed-loop control of the dc-dc buck converter with an r load, using the continuous-time pi controller and the pi controller in the delta domain, is depicted in fig. 7.

fig. 7 simulink model for closed-loop control of dc-dc buck converter using (a) continuous time pi controller with r load, (b) delta domain pi controller with r load

4.2.
hardware implementation

in this work, the controller used for the control action is built with the dspace microlab board. in the year 2000, at bradley university, the dspace ds1102 was first used after a user's manual and a workstation based on this board had been developed. after that, a newer dspace ds1103 board was developed. in this experiment, the latest version, the dspace ds1202, has been used. the design and simulation of the controller are done on a pc using matlab simulink and the dspace block sets, the matlab-to-dsp interface libraries, the real-time interface to simulink, and real-time workshop. the output from the ds1202 includes the pwm signal that triggers the gate of the mosfet of the dc-dc buck converter.

the dspace ds1202 system used for the implementation of the control system is a mixed fpga/dsp digital controller with a powerful processor for floating-point computation. the key of the ds1202 is plugged into the pci slot of the host computer. after being developed using matlab/simulink, the control system is automatically processed and run on the ds1202. a graphical user interface (gui) has been built using dspace; it allows real-time evaluation of the control system. the "control desk" is used for multiple services. it provides the interface through which the controller model designed in simulink can be downloaded onto the dsp. various measurements, viz. the regulated output (voltage and current), the duty cycle of the pwm signal and the error fed to the controller, can be displayed on the instrument panel of the control desk. the primary purpose of the ds1202 is to act as an interface between the external hardware portion of the overall system and the simulation.
the ds1202 contains connectors for thirty-two (32) analog-to-digital inputs and sixteen (16) digital-to-analog outputs; there are forty-eight (48) other connectors that can be used for digital i/o, slave/dsp i/o, incremental encoder interfaces, can interface and serial interfaces. the adc used to feed back the output voltage has 16 bits, i.e., it represents values from 0 to 65535. therefore, the resolution of the adc is (8/65535) V = 0.122 mV. the converter regulates the voltage to 4 V, which would need to be represented by 32767.5. since fractional values cannot be represented because of the finite word length, the reference voltage is represented by 32767. the inherent error in the reference voltage is therefore 0.122/2 = 0.061 mV, and the adc resolution error can be as small as 0.122 mV. this is much smaller than the resolution error of the 8-bit adc normally included in a microcontroller. the pwm generator used here is a built-in pwm generator that takes the duty ratio as the control input and generates a pwm signal accordingly at a given frequency. the amplitude of the pwm signal can be set to 2.5 V, 3.5 V, or 5 V. fig. 8 shows the circuit implementation of the buck converter and fig. 9 shows the simulated delta-domain pi controller with dspace i/o blocks.

fig. 8 the circuit implementation of the dc-dc buck converter

fig. 9 discrete pi controller in the delta-domain with dspace rti blocks

4.3. result analysis

4.3.1. simulation result

fig. 10 shows the response of the continuous-time pi controller and the response of the delta-domain pi controller at low and high sampling rates. the resemblance between the continuous-time response and the response of the delta-domain pi controller at a high sampling rate is also shown here.

fig.
10 simulation result with 100 kΩ resistance: (a) continuous domain pi controller response, (b) delta domain pi controller response with a sampling time of 0.5 sec, (c) delta domain pi controller response with a sampling time of 0.00001 sec

fig. 11 shows the simulation result of the complete closed-loop system under different load conditions. at first, the load is taken to be purely resistive and varied over the range of 100 Ω to 100 kΩ. subsequently, a 100 μH inductor is added to test the behaviour of the system under inductive load. in each case, the output remains steady at the desired voltage of 4 V.

fig. 11 simulation of the system under different load conditions: (a) with 100 Ω resistance, (b) with 100 kΩ resistance, (c) with r-l load consisting of 100 kΩ resistance and 100 μH inductor

in each case, the current variation is shown with the variation of the load. the output voltage regulation of the buck converter under load variation is thus depicted in terms of current variation in fig. 11.

4.4. real-time experimental results

fig. 12 shows the complete hardware setup with the microlab dspace board, the designed buck converter, cpu and dso.

fig. 12 complete hardware setup

with 4 volts as the reference, the load is varied over the range from 100 Ω to 100 kΩ. as the load resistance varies, the output of the buck converter stays at almost 4 V, which is the desired set point. the variation of the output with the changes in load resistance is depicted in figure 13. the output of the buck converter at a load resistance of 100 Ω is shown in figure 13(a), whereas figure 13(b) shows the output voltage with a load resistance of 100 kΩ. it is therefore evident that the controller successfully regulates the output voltage of the buck converter under load variation.

fig.
13 regulation of output voltage for load resistance of (a) 100 Ω, (b) 100 kΩ

the system is also tested under variation of the source voltage, keeping the set point fixed at 4 V. in fig. 14, output voltage regulation for source voltages of 6 V, 8 V and 20 V is depicted. it is observed that at the different input voltage levels, the output of the buck converter is stable at the desired set point, which is 4.0 V in this experiment. thus, the controller keeps the output level fixed even when the source voltage changes.

fig. 14 regulation of output voltage with the variation of the source voltages of (a) 6 volt, (b) 8 volt, (c) 20 volt

pwm signals with different duty cycles are generated through the dspace board and are shown in fig. 15. the pwm signal varies with changes in the reference voltage.

fig. 15 pwm signal fed to the buck converter at (a) 75% duty cycle, (b) 50% duty cycle

5. conclusion

this paper deals with the development of a digital pi controller in the delta domain and its implementation on the ds1202 dspace board for a dc-dc buck converter. the mathematical analysis, simulations and experiments are conducted using a pi-compensated buck converter. it is found that the performance of the chosen dc-dc buck converter is satisfactory under variation of the supply voltage as well as the load. the output voltage changes by only 0.25% when the load and supply voltage are varied, as shown in fig. 11, fig. 13 and fig. 14 respectively. by varying the duty cycle of the pwm using the designed controller in dspace, the output voltage is adjusted proportionately, as shown in fig. 15. from the simulation result as given in fig.
10, it is found that when the sampling time ($\Delta$) is reduced to 0.00001 sec the desired result is obtained, and it is almost the same as the output obtained with the continuous-time controller. this result shows that in the fast-sampling limit the delta-parameterised system provides meaningful information. the mathematical derivations, simulations, and experiments performed in this paper lead to the conclusion that delta-operator parameterized discrete-time controllers exhibit certain numerical advantages and that, at a high sampling rate, the results of the controller designed in the delta domain are almost the same as those of the continuous-time controller. this leads to the development of a unified approach for digital controller design for the buck converter using the delta operator.

references

[1] n. mohan, t. m. undeland, & w. p. robbins, power electronics: converters, applications, and design, john wiley & sons, new york, 2002.
[2] g. dileep & s. n. singh, "selection of non-isolated dc-dc converters for solar photo voltaic system", renewable sustainable energy review, vol. 76, pp. 1230-1247, 2017.
[3] a. bhaumik, y. kumar, s. srivastava, & m. islam, "performance studies of a separately excited dc motor speed control fed by a buck converter using optimized piλdμ controller", in proceedings of the int. conf. circuit, power comput. technol. (iccpct), nagercoil, india, 2016.
[4] m. hassanalieragh, t. soyata, a. nadeau, & g. sharma, "ur-solarcap: an open-source intelligent autowakeup solar energy harvesting system for supercapacitor-based energy buffering", ieee access, vol. 4, pp. 542-557, 2016.
[5] q. xu, c. zhang, c. wen, & p. wang, "a novel composite nonlinear controller for stabilization of constant power load in dc microgrid", ieee trans. smart grid, vol. 10, no. 1, pp. 752-761, 2010.
[6] d. kumar, f. zare, & a.
ghosh, "dc microgrid technology: system architectures, ac grid interfaces, grounding schemes, power quality, communication networks, applications, and standardizations aspects", ieee access, vol. 5, pp. 12230-12256, 2017. [7] liu qingpeng, research on buck converter based on linear feedback control [d], northeast petroleum university, 2012. [8] m. ghamari, h. mollaee, f. khavari, "robust self-tuning regressive adaptive controller design for a dc–dc buck converter", measurement, vol. 174, 109071, 2021. [9] c. deekshitha & k. latha shenoy, "design and simulation of synchronous buck converter for led application", in proceedings of the 2nd ieee international conference on recent trends in electronics information & communication technology, bangalore, india, 2017, pp. 142-146. [10] z. yichen, x. hejin & l. deming, "feedback control of fractional piλdμ for dc/dc buck converters", in proceedings of the international conference on industrial informatics -computing technology, intelligent technology, industrial information integration, wuhan, china, 2017, pp. 219-222. [11] a. m. abdurraqueeb, a. a. al-shamma’a, a. alkhuhyali, a. m. noman, k.e. addoweesh, "rst digital robust control for dc/dc buck converter feeding constant power load", mathematics, vol. 10, id. 1782, p. 15, 2022. [12] g. abbas, j. gu, u. farooq, m. irfan abid, a. raza, m. asad, v. e. balas and m. e. balas, "optimized digital controllers for switching-mode dc-dc step-down converter", electronics, vol. 7, no. 12, id. 412, p. 25, 2018. [13] g. abbas, m. nazeer, v. balas, t.-c. lin, m. balas, m. asad, a. raza, m. shehzad, u. farooq and j. gu, "derivative-free direct search optimization method for enhancing performance of analytical design approach-based digital controller for switching regulator", energies, vol. 12, no. 11, id. 2183, p. 18, 2019. [14] k. r. kumar & s. 
jeevananthan, "design of sliding mode control for negative output elementary super lift luo converter operated in continuous conduction mode", in proceedings of the communication control and computing technologies (icccct), ramanthapuram, india, 2010, pp. 138-148.
[15] k. sharma & d. k. palwalia, "design of digital pid controller for voltage mode control of dc-dc converters", in proceedings of the international conference on microelectronic devices, circuits and systems (icmdcs), vellore, india, 2017.
[16] l. h. guang, w. bo & l. g. you, delta operator control and its robust control theory basis (m), national defence industry press, beijing, 2005.
[17] r. h. middleton & g. c. goodwin, digital control and estimation-a unified approach, prentice-hall, englewood cliffs, new jersey, 1990.
[18] r. h. middleton & g. c. goodwin, "improved finite wordlength characteristics in digital control using delta operators", ieee transactions on automatic control, vol. 31, no. 11, pp. 1015-1021, 1986.
[19] j. cortes-romero, a. luviana-juarez & h. sira-ramirez, "a delta operator approach for the discrete-time active disturbance rejection control on induction motors", mathematical problems in engineering, vol. 2013, id. 572026, p. 9, 2013.
[20] g. maione, "high-speed digital realisation of fractional operators in delta domain", ieee transactions on automatic control, vol. 56, no. 3, pp. 697-702, 2011.
[21] s. ganguli, g. kaur & p. sarkar, "a hybrid intelligent technique for model order reduction in the delta domain: a unified approach", springer nature, soft computing, vol. 23, pp. 4801-4814, 2018.
[22] y. zhao & d. zhang, "h∞ fault detection for uncertain delta operator systems with packet dropout and limited communication", in proceedings of the american control conference, seattle, wa, usa, 2017, pp. 4772-4777.
[23] j. gao, s. chai, m. shuai, b. zhang & l.
cui, "detecting false data injection attack on cyber-physical system based on delta operator", in proceedings of the 37th chinese control conference, wuhan, china, 2018.
[24] j. zhou, d. zhang, ieee access, multidisciplinary, vol. 7, id. 94448, 2019.
[25] a. mondal, p. sarkar, a. hazra, "a unified approach for pi controller design in delta domain for indirect field-oriented control of induction motor drive", journal of engineering research, vol. 8, no. 3, pp. 118-134, 2020.
[26] b. l. eidson, an experimental evaluation of delta operator in digital control, auburn, alabama, 2010.
[27] i. laoprom, s. tunyasrirut, "design of pi controller for voltage controller of four-phase interleaved boost converter using particle swarm optimization", journal of control science and engineering, id. 9515160, p. 13, 2020.
[28] k. s. rao, r. mishra, "comparative study of p, pi and pid controller for speed control of vsi-fed induction motor", international journal of engineering development and research, vol. 2, no. 2, pp. 2740-2744, 2014.
[29] l. a. quezada-téllez, l. franco-pérez, g. fernandez-anaya, "controlling chaos for a fractional-order discrete system", ieee open journal of circuit and systems, vol. 1, pp. 263-269, 2020.
[30] p. hu, s. chen, h. huang, g. zhang, l. liu, "improved alpha-guided grey wolf optimizer", ieee access, vol. 7, pp. 5421-5437, 2018.
[31] t. s. anandhi, k. muthukumar & s. p. natarajan, "dspace based implementation of pid controller for buck converter", in proceedings of the dspace user conference, 2012.
[32] v. r, r. g, k. k. b & a. k.
g, "dspace based 12/24v closed loop boost converter for low power applications", in proceedings of the international conference on computation of power, energy, information and communication, chennai, india, 2014, pp. 213-217.
[33] k. ogata, modern control system, university of minnesota, prentice hall, 1987.
[34] n. pillai, p. a. govender, "particle swarm optimization approach for model independent tuning of pid control loop", ieee africon, ieee catalog: 04ch37590c, 2007.

facta universitatis
series: electronics and energetics vol. 29, no. 4, december 2016, pp. 631-646
doi: 10.2298/fuee1604631j

multi-criteria assessment of the smart grid efficiency using the fuzzy analytic hierarchy process

aleksandar janjic 1, suzana savic 2, goran janackovic 2, miomir stankovic 2, lazar velimirovic 3
1 university of niš, faculty of electronic engineering, niš, serbia
2 university of niš, faculty of occupational safety, niš, serbia
3 mathematical institute of the serbian academy of sciences and arts, belgrade, serbia

abstract: in this paper, the key performance indicators related to smart grid efficiency, the key factor in implementing any energy management system, are analysed. the authors propose a multi-criteria fuzzy ahp methodology for determining overall smart grid efficiency. four criteria (technology, costs, user satisfaction, and environmental protection) and seven performances (according to eu and us initiatives for the analysis of benefits and effects of smart grid systems) are defined for the selection of the optimal smart grid project. the analysis shows that the dominant performances of the optimal smart grid project are efficiency, security and quality of supply. the methodology is illustrated on the choice of a smart grid development strategy for a medium-size power distribution company.
key words: smart grid, multi-criteria analysis, fuzzy analytical hierarchy process

received june 28, 2015; received in revised form august 25, 2015
corresponding author: lazar velimirovic, mathematical institute of the serbian academy of sciences and arts, kneza mihaila 36, 11001 belgrade, serbia (email: lazar.velimirovic@mi.sanu.ac.rs)

1. introduction

a smart grid is usually defined as an electrical grid that intelligently integrates the actions of all users connected to it (producers, consumers, and those who are both) with the purpose of efficiently producing electricity and delivering it sustainably, economically, and safely [1]. the smart grid promises a variety of efficiency gains for utilities, such as reducing distribution line losses through minimization of reactive power and more precise voltage control [2]. furthermore, the smart grid should enhance utilities' ability to monitor and measure the effectiveness of end-use energy-efficiency programs, and to better manage energy costs on the customer side, which is confirmed by the numerous projects and organizations that were initiated to facilitate the evolution of the smart grid [3], [4].

in the eu, the concept of smart grids was adopted in 2005 as an official document of the european commission through the european technology platform smart grids, and was more precisely defined in [5] and [6]. in early april 2010, the european commission issued a statement reiterating the need to improve the existing grids, listing the following as the main objectives [7]: increased use of renewable electricity sources, grid security, energy conservation and energy efficiency, and a deregulated energy market. therefore, the strategy for sustainable, competitive, and safe energy primarily implies: competitiveness, use of different energy sources, sustainability, innovation, and technological improvement [8].
the result of energy system development is reflected in energy performance, with quantifiable results pertaining to energy (e.g. energy efficiency, energy intensity, or specific energy consumption) and energy performance indicators as quantitative indexes of energy performance. energy efficiency is a way of managing and restraining the growth in energy consumption. the key energy performance indicators were defined in 2005 as a result of cooperation between several international organizations – global leaders in energy and environmental statistics and analysis: international atomic energy agency (iaea), united nations department of economic and social affairs (desa), international energy agency (iea), european environment agency (eea), and the directorate-general of the european commission for statistics – eurostat [9]. the key energy performance indicators include a set of 30 indicators: 4 social indicators, 16 economic indicators, and 10 environmental indicators. the values of the u.s. energy security risk index were determined based on the data for the period between 1970 and 2010, and predicted for the period between 2011 and 2035 [10]. the indicator values do not merely represent data but the basis for communication between stakeholders regarding sustainable energy use. each set of indicators (social, economic, or environmental) expresses specific aspects or impacts of energy production and use. the lack of systematic approach in the classification of these indicators is the main reason why the smart grids were evaluated on individual indicators only. the cyber security indicator has been explored in [11]-[13], while the cost/benefit assessment of a smart distribution system with intelligent electric vehicle charging has been analysed in [14], [15]. in the smart grid context, three main assessment frameworks based on key performance indicators (kpis) have been introduced. 
the ec task force for smart grids has introduced the characteristics of the ideal smart grid (services) and the outcomes of the implementation of the ideal smart grid (benefits) [16], [17]. the contribution of projects to the ideal smart grid is quantified in terms of benefits, via a set of kpis. the european electricity grid initiative has divided the ideal smart grid system into thematic areas (clusters) and is currently mapping smart grid projects into clusters [18]. in the usa, the ideal characteristics of the smart grid and a set of metrics to measure progress toward the ideal smart grid have been defined [19]: build metrics, which describe attributes that are built in support of a smart grid (e.g. the percentage of substations using automation), and value or impact metrics, which describe the value that may derive from achieving a smart grid (e.g. the percentage of energy consumed to generate electricity that is not lost, or the quantity of electricity delivered to consumers compared to the electricity generated, expressed as a percentage). however, because of the proliferation of these energy indicators, it is still very difficult for a decision maker to answer simple questions such as:

• among different smart grid projects, which alternative should be chosen?
• which alternative will be the most beneficial to different stakeholders?
• how can the efficiency of an already implemented smart grid project be monitored?

the contribution of this paper is the introduction of a multi-criteria approach to smart grid efficiency assessment. unlike the approach used in [1], the fuzzy ahp method is proposed, offering much more flexibility in the criteria selection and in the evaluation of both criteria and alternatives. furthermore, a new hierarchy of four criteria and seven performances is introduced in order to obtain a more consistent evaluation framework.
we show that the method is highly successful in the evaluation of alternatives in the presence of heterogeneous criteria. because of the main characteristic of the adopted smart grid evaluation framework, namely its complex hierarchical structure, we propose the fuzzy ahp methodology for project evaluation, structuring a decision into a hierarchy of criteria, sub-criteria and alternatives. by means of pair-wise comparisons of two (sub-)criteria or alternatives, it generates inconsistency ratios and weighting factors to prioritise the criteria and alternatives. after a brief overview of key performance indicators for smart grid evaluation, the fuzzy ahp methodology is presented. the methodology is illustrated on the choice of smart grid project deployment for one medium-size power distribution company.

2. smart grid assessment frameworks

the implementation of a smart grid helps to achieve strategic policy goals, such as the smooth integration of renewable energy sources, a more secure and sustainable electricity supply and full inclusion of consumers in the electricity market. smart grids help consumers to better understand their own energy use, which in turn allows them to identify energy saving opportunities. smart grid and advanced metering infrastructure (ami) systems could open up opportunities for energy management companies, hired by consumers, to use data from consumers' smart meters to identify opportunities for energy savings or to measure the success of energy savings measures after they are undertaken [20]. for utilities, a better understanding of the electrical grid's status at a second-by-second level allows the grid to be operated at much tighter tolerances, resulting in greater efficiency and reliability. steering the smart grid transition is a challenging, long-term task, which requires balancing energy policy goals, environmental constraints and market profitability.
in this perspective, a first approach in smart grid assessment is to evaluate to what extent smart grid projects contribute to progress toward the "ideal smart grid" and its expected outcomes (e.g. sustainability, efficiency, consumer inclusion), which are directly linked with the policy goals that have triggered the smart grid transition. this first approach is conducted via the definition of suitable metrics and key performance indicators. a second, complementary approach is to assess the profitability of smart grid solutions and investments through an appropriate multi-criteria decision analysis methodology.
2.1. key performance indicators
the progress of smart grid development can be measured by formulating a set of key performance indicators (kpis) and applying those to the electricity network. references [17]-[19] define the characteristics of the ideal smart grid, together with metrics to measure the progress and outcomes resulting from the implementation of smart grid projects. the ideal smart grid has been defined in terms of characteristics in the us and in terms of services in the european union. build/value metrics in the usa and benefits/kpis in europe are used to measure progress toward the ideal smart grid.
the ec smart grid task force has identified a list of benefits deriving from the implementation of a smart grid [16]:
- increased sustainability;
- adequate capacity of transmission and distribution grids for 'collecting' and bringing electricity to the consumers;
- adequate grid connection and access for all kinds of grid users;
- satisfactory levels of security and quality of supply;
- enhanced efficiency and better service in electricity supply and grid operation;
- effective support of transnational electricity markets by load flow control to alleviate loop flows and increased interconnection capacities;
- coordinated grid development through common european, regional and local grid planning to optimise transmission grid infrastructure;
- enhanced consumer awareness and participation in the market by new players;
- enable consumers to make informed decisions related to their energy to meet the eu energy efficiency targets;
- create a market mechanism for new energy services such as energy efficiency or energy consulting for customers;
- consumer bills are either reduced or upward pressure on them is mitigated.
each benefit is expressed via a set of kpis including both quantitative and qualitative indicators. for illustration, the first benefit, increased sustainability, is measured by three indicators: the quantified reduction of carbon emissions, the environmental impact of electricity grid infrastructure, and the quantified reduction of accidents and risks associated with generation technologies. the complete list of indicators can be found in [16]. the kpis can be applied to evaluate project results on smart grids as well. a clearly defined framework can specify where exactly the project contributed to a smart electricity grid. the mixture of quantitative and qualitative indicators is one of the major reasons for introducing multi-criteria decision analysis techniques.
another reason is the shortcoming of the cost benefit analysis, which is explained in the following section.
2.2. smart grid development assessment model
the implementation of the smart grid should be market-driven. another necessary approach in smart grid assessment is therefore to assess the costs, the benefits and the beneficiaries of different smart grid solutions. a comprehensive methodology for cost benefit analysis of smart grid projects has been defined in [21], while the european commission has adapted and expanded the doe/epri methodology to fit the european context [22]-[24]. however, the traditional cost benefit analysis approach does not capture all the effects involved in development policies, where intangible aspects are not secondary but dominant [25]. the main disadvantage of the cost benefit analysis is the translation of all effects into a common numerical scale and a single aggregate measure. it is crucially important to ensure that project proposals are evaluated against a common reference system, to integrate the outcome of the kpi and of the economic analysis and come up with an overall project evaluation. multiple criteria analysis therefore seems better suited than cost benefit analysis for measuring intangibles and soft impacts, since it uses more than one criterion and introduces qualitative aspects into the analysis. in order to get a thorough understanding of the status of smart grid development, the main smart criteria (they have to be specific, measurable, attainable, relevant and time-bound) can be defined.
starting from the eleven main benefits presented in the previous section, an adapted list of main criteria is defined in our approach, including:
- technology, covering all aspects of advanced services and new requirements imposed on the distribution and transmission network;
- costs;
- customer satisfaction, encompassing different options of customer choice, new energy services and market participation;
- environmental impact.
with the first set of benefits defined on the base level of efficiency assessment, these four main criteria establish a higher level of assessment. relations between the levels can be set up according to the degree of their interconnectedness. multi-criteria methods differ in the way the idea of multiple criteria is treated. each method shows its own properties with respect to the way of assessing criteria, the application and computation of weights, the mathematical algorithm utilised, the model to describe the system of preferences of the decision maker, and finally, the level of uncertainty embedded in the data set. because of the main characteristic of the adopted smart grid evaluation framework and its complex hierarchical structure, we proposed the fuzzy ahp methodology for the project evaluation, structuring a decision into a hierarchy of criteria, sub-criteria and alternatives. by means of pair-wise comparisons of two (sub-)criteria or alternatives, it generates inconsistency ratios and weighting factors to prioritise the criteria and alternatives. sensitivity analysis can be applied to test the robustness of the priorities. the main characteristics of this methodology are presented in the following section.
3. methodology
thomas l. saaty developed the original ahp in the late 1970s [26]. in this method, human judgments are represented as crisp values.
however, in many practical cases the human preference model is uncertain and decision makers cannot assign crisp values to the comparison judgments. in these cases the fuzzy ahp method is useful. the fuzzy ahp method is designed to improve decision support for uncertain valuations and priorities. in this method the data and preferences of experts are evaluated in a fuzzy set environment [27]. the use of fuzzy set theory allows the decision makers to incorporate unquantifiable information, incomplete information, non-obtainable information and partially ignorant facts into the decision model [28]. the basic notions of fuzzy arithmetic are given in the appendix. many authors have used the fuzzy ahp method for solving problems in different areas: multi-criteria problems involving qualitative data [29], [30]; water management [31]-[33]; evaluation of naval tactical missile systems [34]; hazardous waste management [35]; prioritization of human capital measurement indicators [36]; shipping asset management [37]; occupational safety management [38], [39]. in this paper the fuzzy ahp method is used for smart grid project ranking and selection, precisely because of the many uncertain and intangible benefits and criteria involved in smart grid projects.
the fuzzy ahp method involves the following steps: (1) the overall goal (objective) is identified and clearly defined; (2) the criteria, sub-criteria, and alternatives are identified; (3) the hierarchical structure is formed; (4) pair-wise comparison is made using the fuzzified saaty's evaluation scale; (5) the priority weighting vectors are evaluated; (6) the defuzzification and the final ranking of alternatives are performed. in this study, the fuzzy ahp method is applied to the ranking of smart grid projects, according to the following steps.
1. goal identification. the goal is to rank different smart grid projects.
2.
identification of criteria, sub-criteria, and alternatives. criteria for smart grid project selection are: technology, costs, user's satisfaction and environmental protection. sub-criteria are the project performances: sustainability, capacity of transmission and distribution grids for 'collecting' and bringing electricity to the consumers, possibility of grid connection and access for all kinds of grid users, security and quality of supply, efficiency and good service in electricity supply and grid operation, effective support of transnational projects and electricity markets, transparent information to consumers. finally, the smart grid projects are identified as alternatives.
3. hierarchical structure formation. the fuzzy ahp method presents a problem in the form of a hierarchy: the first level represents the goal; the second level considers relevant criteria (four identified criteria); the third level considers relevant sub-criteria (seven identified sub-criteria); and the fourth level defines smart grid projects.
4. pair-wise comparison. pairs of elements at each level are compared according to their relative contribution to the elements at the hierarchical level above, using the fuzzified saaty's scale, as shown in table 1.
table 1 crisp and fuzzified saaty's scale for pairwise comparisons [30].
| crisp values (x) | judgment description | fuzzy values |
|---|---|---|
| 1 | equal importance | (1, 1, 1+δ) |
| 3 | weak dominance | (3-δ, 3, 3+δ) |
| 5 | strong dominance | (5-δ, 5, 5+δ) |
| 7 | demonstrated dominance | (7-δ, 7, 7+δ) |
| 9 | absolute dominance | (9-δ, 9, 9) |
| 2, 4, 6, 8 | intermediate values | (x-1, x, x+1) |
in this paper fuzzification is implemented by triangular fuzzy numbers with a fuzzy distance δ = 2; on the boundaries, (1, 1, 3) is used for 1 and (7, 9, 9) is used for 9. a fuzzy distance of 2 is used for the odd values (3, 5, 7), and a fuzzy distance of 1 for the even values (2, 4, 6, 8), as recommended in [33], because the most consistent results can be expected.
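the fuzzification rule of table 1 can be sketched in a few lines of python. this is only an illustration: the function names `fuzzify` and `reciprocal` are hypothetical, and the reciprocal rule is the standard inverse of a triangular fuzzy number given in the appendix (equation (a.6)).

```python
def fuzzify(x, delta=2):
    """Map a crisp Saaty judgment x (1..9) to a triangular fuzzy number (a, b, c)."""
    if not 1 <= x <= 9:
        raise ValueError("Saaty judgments lie between 1 and 9")
    if x % 2 == 0:
        # intermediate values 2, 4, 6, 8 use a fuzzy distance of 1
        return (x - 1, x, x + 1)
    # odd values use the fuzzy distance delta, clipped to the scale bounds 1 and 9
    return (max(1, x - delta), x, min(9, x + delta))


def reciprocal(tfn):
    """Inverse of a triangular fuzzy number: (a, b, c)^-1 = (1/c, 1/b, 1/a)."""
    a, b, c = tfn
    return (1 / c, 1 / b, 1 / a)


if __name__ == "__main__":
    for x in range(1, 10):
        print(x, fuzzify(x), reciprocal(fuzzify(x)))
```

clipping with `max` and `min` reproduces the boundary cases (1, 1, 3) and (7, 9, 9) without treating them as special cases.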
pair-wise comparisons at each level, starting from the top of the hierarchy, are presented in the square matrix form $A = [a_{ij}]$, $i, j = 1, \dots, n$, where $a_{ij}$ is the fuzzy value of the relative importance of criterion/sub-criterion/alternative i over criterion/sub-criterion/alternative j, with $a_{ij} = 1$ for $i = j$ and $a_{ij} = 1/a_{ji}$ for $i \neq j$.
5. priority weights vectors evaluation. the ranking procedure starts with the determination of the criteria weighting vector:
$$W_C = (w_{c_1}, w_{c_2}, w_{c_3}, w_{c_4})^T.$$ (1)
elements of the criteria weighting vector, with respect to equation (a.7), are determined as:
$$w_{c_i} = \sum_{j=1}^{4} a_{ij} \otimes \left[\sum_{i=1}^{4}\sum_{j=1}^{4} a_{ij}\right]^{-1}, \quad i = 1, 2, 3, 4.$$ (2)
performance weighting vectors are defined by pair-wise comparison of performances according to every single criterion. the elements of these vectors, according to equation (a.7), are calculated as follows:
$$x_{ij} = \sum_{j=1}^{4} a_{ij} \otimes \left[\sum_{l=1}^{7}\sum_{j=1}^{4} a_{lj}\right]^{-1},$$ (3)
where $x_{ij}$ represents the fuzzy weight of the i-th performance with respect to the j-th criterion. final performance weights are derived through the aggregation of the weights at two consecutive levels, i.e. by multiplying performance weights by criteria weights:
$$W_{SC} = X \otimes W_C = (w_{sc_1}, w_{sc_2}, w_{sc_3}, w_{sc_4}, w_{sc_5}, w_{sc_6}, w_{sc_7})^T.$$ (4)
finally, the smart grid projects are compared according to the relevant performance. the weights of projects for an individual performance are determined according to equation (a.7), as follows:
$$y_{ij} = \sum_{j=1}^{7} a_{ij} \otimes \left[\sum_{l=1}^{3}\sum_{j=1}^{7} a_{lj}\right]^{-1},$$ (5)
where $y_{ij}$ represents the fuzzy weight of the i-th project with respect to the j-th performance. final smart grid project weights are obtained by multiplying the weights of the projects and the final performance weights:
$$W_A = Y \otimes W_{SC} = (w_{a_1}, w_{a_2}, w_{a_3})^T.$$ (6)
6. defuzzification and the final ranking of alternatives.
in this paper triangular fuzzy numbers are ranked by applying the total integral value method. this method is used for ranking smart grid projects according to a moderate and an optimistic attitude toward risk.
4. results and discussion
the proposed methodology is illustrated on the choice of the smart grid deployment strategy in a hypothetical power distribution company of medium size. the company is supposed to supply 50 000 consumers, and the list of alternatives with the description of proposed actions and appropriate indicators is given in table 2.
table 2 different development alternatives.
| no | description of the proposed action | performance indicator | alternative 1 | alternative 2 | alternative 3 |
|---|---|---|---|---|---|
| 1 | advanced meter installation | number of advanced meters installed | 20 000 | 10 000 | 5 000 |
| 2 | substation automation | percentage of substations applying automation technologies | 20% | 30% | 40% |
| 3 | introduction of dynamic line rating technology | number of lines operated under dynamic line ratings | 2 | 3 | 4 |
| | | percentage of kilometres of transmission circuits operated under dynamic line ratings | 15% | 20% | 15% |
| 4 | solar power plant connection | total installed power (MW) | 3 | 5 | 7 |
three alternatives are evaluated, encompassing four activities that introduce new technologies in the distribution network: replacement of old meters with remotely read meters; remote control and integration of substations into the scada system; dynamic line rating of transmission lines; construction of a new photovoltaic plant embedded in the distribution network. all activities are planned inside the same approximate budget of 5 000 000 € and the planners proposed three different development strategies. using the presented methodology, experts (in the field of smart grid technologies and multi-criteria decision-making) ranked three smart grid projects whose characteristics are presented in table 3.
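as a rough illustration of how steps 4 to 6 fit together, the python sketch below fuzzifies crisp judgments (table 1), computes the fuzzy synthetic extent of equation (a.7) and defuzzifies with the total integral value of equation (a.11). the input is the criteria comparison reported in table 4. two points are assumptions rather than statements of the paper: diagonal entries are kept crisp as (1, 1, 1), and the total integral values are normalized so that the final weights sum to one. all function and variable names are hypothetical.

```python
def fuzzify(x, delta=2):
    """Crisp Saaty judgment (1..9) -> triangular fuzzy number (a, b, c), as in table 1."""
    if x % 2 == 0:
        return (x - 1, x, x + 1)
    return (max(1, x - delta), x, min(9, x + delta))


def fuzzy_matrix(n, upper):
    """Build a reciprocal fuzzy comparison matrix from upper-triangle judgments."""
    m = [[(1.0, 1.0, 1.0)] * n for _ in range(n)]  # assumption: crisp diagonal
    for (i, j), x in upper.items():
        a, b, c = fuzzify(x)
        m[i][j] = (a, b, c)
        m[j][i] = (1 / c, 1 / b, 1 / a)            # inverse, equation (a.6)
    return m


def synthetic_extent(matrix):
    """Row sums multiplied by the inverse of the grand total, equation (a.7)."""
    rows = [tuple(sum(t[k] for t in row) for k in range(3)) for row in matrix]
    total = tuple(sum(r[k] for r in rows) for k in range(3))
    return [(a / total[2], b / total[1], c / total[0]) for a, b, c in rows]


def total_integral(tfn, lam):
    """Total integral value for optimism index lam, equation (a.11)."""
    a, b, c = tfn
    return 0.5 * (lam * c + b + (1 - lam) * a)


def final_weights(tfns, lam):
    """Defuzzify and normalize (normalization is an assumption of this sketch)."""
    values = [total_integral(t, lam) for t in tfns]
    s = sum(values)
    return [v / s for v in values]


# crisp judgments for c1..c4 (0-based indices), taken from table 4
CRITERIA_JUDGMENTS = {(0, 1): 3, (0, 2): 5, (0, 3): 5, (1, 2): 3, (1, 3): 3, (2, 3): 1}

fuzzy_weights = synthetic_extent(fuzzy_matrix(4, CRITERIA_JUDGMENTS))

if __name__ == "__main__":
    for lam in (0.5, 1.0):
        print(lam, [round(w, 4) for w in final_weights(fuzzy_weights, lam)])
```

under these two assumptions the sketch agrees, to the four decimals printed there, with the fuzzy weights and final weights of the criteria in table 4, which suggests that they match the calculation used in the paper.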
the proposed set of actions brings both qualitative and quantitative benefits. for instance, the larger number of advanced meters installed in the first alternative will strongly affect both adequate grid connection, because of the enhanced low voltage network management, and transparent information to consumers. the quantitative aggregated performance indicators for the different alternatives are calculated and presented in table 3.
table 3 quantitative aggregated performance indicators for different alternatives.
| no | performance indicator | alternative 1 | alternative 2 | alternative 3 |
|---|---|---|---|---|
| 1 | energy losses reduction [MWh/year] | 3000 | 8000 | 11000 |
| 2 | quantified reduction of carbon emissions (t) | 5 400 | 14 000 | 19 000 |
| 3 | probability of injuries reduction (in percentage) | 10 | 15 | 20 |
although the calculation of these parameters is outside the scope of this paper, the relation between the proposed actions and the expected results is straightforward. energy loss reduction results from the dynamic line rating, which enables more economic line loading, and from the connection of the photovoltaic plant (row 1). this renewable source reduces the carbon emissions according to the installed plant power (row 2). finally, the automation of substations reduces the probability of injuries during equipment manipulation (row 3). experts first performed pair-wise comparison of the following criteria: technology (c1), costs (c2), customer satisfaction (c3) and environmental protection (c4). the results of the comparison, fuzzy weights, final weights (fws) and ranks of criteria are shown in table 4.
table 4 the pairwise comparison, fuzzy weights, final weights and ranks of criteria.
| | c1 | c2 | c3 | c4 | fuzzy weights wci | fws (λ=0.5) | rank (λ=0.5) | fws (λ=1.0) | rank (λ=1.0) |
|---|---|---|---|---|---|---|---|---|---|
| c1 | 1 | 3 | 5 | 5 | (0.1967, 0.5303, 1.3141) | 0.5096 | 1 | 0.5023 | 1 |
| c2 | 1/3 | 1 | 3 | 3 | (0.0787, 0.2778, 0.7885) | 0.2819 | 2 | 0.2904 | 2 |
| c3 | 1/5 | 1/3 | 1 | 1 | (0.0576, 0.0960, 0.3504) | 0.1189 | 3 | 0.1216 | 3 |
| c4 | 1/5 | 1/3 | 1 | 1 | (0.0412, 0.0960, 0.2190) | 0.0896 | 4 | 0.0858 | 4 |
then the experts compared the following performance indicators in relation to every criterion: sustainability (sc1), capacity of transmission and distribution grids for 'collecting' and bringing electricity to the consumers (sc2), possibility of grid connection and access for all kinds of grid users (sc3), security and quality of supply (sc4), efficiency and good service in electricity supply and grid operation (sc5), effective support of transnational projects and electricity markets (sc6), transparent information to consumers (sc7). this step is necessary because economic, social and political conditions differ between distribution companies. as stated above, the pairwise comparison made by experts is based on both qualitative and quantitative indicators. for instance, the security criterion (sc4) can be supported by the reduction of injuries (table 3), while the market development criterion (sc6) is much more susceptible to subjective expert judgments. the results are presented in tables 5 to 8.
table 5 the pairwise comparison matrix of sub-criteria in relation to the technology.
| | sc1 | sc2 | sc3 | sc4 | sc5 | sc6 | sc7 | fuzzy weights xi1 |
|---|---|---|---|---|---|---|---|---|
| sc1 | 1 | 1/3 | 3 | 1/7 | 1/7 | 1/5 | 1/5 | (0.0179, 0.0489, 0.1307) |
| sc2 | 3 | 1 | 5 | 1/5 | 1/5 | 1/3 | 1/3 | (0.0376, 0.0981, 0.2539) |
| sc3 | 1/3 | 1/5 | 1 | 1/7 | 1/7 | 1/5 | 1/5 | (0.0122, 0.0216, 0.0551) |
| sc4 | 7 | 5 | 7 | 1 | 1 | 3 | 3 | (0.1125, 0.2631, 0.6320) |
| sc5 | 7 | 5 | 7 | 1 | 1 | 3 | 3 | (0.1081, 0.2631, 0.5996) |
| sc6 | 5 | 3 | 5 | 1/3 | 1/3 | 1 | 1 | (0.0622, 0.1526, 0.4051) |
| sc7 | 5 | 3 | 5 | 1/3 | 1/3 | 1 | 1 | (0.0578, 0.1526, 0.3727) |
table 6 the pairwise comparison matrix of sub-criteria in relation to the costs.
| | sc1 | sc2 | sc3 | sc4 | sc5 | sc6 | sc7 | fuzzy weights xi2 |
|---|---|---|---|---|---|---|---|---|
| sc1 | 1 | 1/5 | 3 | 1/5 | 1/5 | 1/3 | 5 | (0.0364, 0.0939, 0.2361) |
| sc2 | 5 | 1 | 7 | 3 | 3 | 5 | 7 | (0.1230, 0.2929, 0.6769) |
| sc3 | 1/3 | 1/7 | 1 | 1/5 | 1/5 | 1/3 | 3 | (0.0181, 0.0492, 0.1396) |
| sc4 | 5 | 1/3 | 5 | 1 | 1 | 3 | 7 | (0.0919, 0.2110, 0.5195) |
| sc5 | 5 | 1/3 | 5 | 1 | 1 | 3 | 7 | (0.0876, 0.2110, 0.4880) |
| sc6 | 3 | 1/5 | 3 | 1/3 | 1/3 | 1 | 5 | (0.0424, 0.1216, 0.3201) |
| sc7 | 1/5 | 1/7 | 1/3 | 1/7 | 1/7 | 1/5 | 1 | (0.0118, 0.0204, 0.0514) |
table 7 the pairwise comparison matrix of sub-criteria in relation to the customer satisfaction.
| | sc1 | sc2 | sc3 | sc4 | sc5 | sc6 | sc7 | fuzzy weights xi3 |
|---|---|---|---|---|---|---|---|---|
| sc1 | 1 | 1 | 1/7 | 1/7 | 1/3 | 1/3 | 1/5 | (0.0186, 0.0319, 0.1164) |
| sc2 | 1 | 1 | 1/7 | 1/7 | 1/3 | 1/3 | 1/5 | (0.0141, 0.0319, 0.0819) |
| sc3 | 7 | 7 | 1 | 1 | 5 | 3 | 3 | (0.1145, 0.2730, 0.6744) |
| sc4 | 7 | 7 | 1 | 1 | 5 | 3 | 3 | (0.1100, 0.2730, 0.6399) |
| sc5 | 3 | 3 | 1/5 | 1/5 | 1 | 1/3 | 1/5 | (0.0244, 0.0802, 0.2248) |
| sc6 | 3 | 3 | 1/3 | 1/3 | 3 | 1 | 1/3 | (0.0310, 0.1112, 0.3286) |
| sc7 | 5 | 5 | 1/3 | 1/3 | 5 | 3 | 1 | (0.0768, 0.1988, 0.5015) |
table 8 the pairwise comparison matrix of sub-criteria in relation to the environmental protection.
| | sc1 | sc2 | sc3 | sc4 | sc5 | sc6 | sc7 | fuzzy weights xi4 |
|---|---|---|---|---|---|---|---|---|
| sc1 | 1 | 3 | 3 | 5 | 5 | 3 | 7 | (0.1101, 0.3115, 0.8074) |
| sc2 | 1/3 | 1 | 1 | 3 | 3 | 1 | 5 | (0.0602, 0.1654, 0.5176) |
| sc3 | 1/3 | 1 | 1 | 3 | 3 | 1 | 5 | (0.0553, 0.1654, 0.4762) |
| sc4 | 1/5 | 1/3 | 1/3 | 1 | 1 | 1/3 | 1/5 | (0.0422, 0.0946, 0.2967) |
| sc5 | 1/5 | 1/3 | 1/3 | 1 | 1 | 1/3 | 3 | (0.0226, 0.0715, 0.2139) |
| sc6 | 1/3 | 1 | 1 | 3 | 3 | 1 | 5 | (0.0504, 0.1654, 0.4348) |
| sc7 | 1/7 | 1/5 | 1/5 | 5 | 1/3 | 1/5 | 1 | (0.0138, 0.0263, 0.0732) |
the final vector of fuzzy weights of the performance of the projects, according to equation (4) and tables 4-8, is:
$$W_{SC} = X \otimes W_C = [x_{ij}]_{7 \times 4} \otimes [w_{c_i}]_{4 \times 1} = [w_{sc_i}]_{7 \times 1} = \begin{bmatrix} (0.0120, 0.0850, 0.5756) \\ (0.0204, 0.1523, 1.0094) \\ (0.0127, 0.0672, 0.5231) \\ (0.0374, 0.2334, 1.5294) \\ (0.0305, 0.2127, 1.2984) \\ (0.0194, 0.4113, 0.9951) \\ (0.0173, 0.1082, 0.7221) \end{bmatrix}$$ (7)
finally, the three smart grid projects (project 1 [a1], project 2 [a2], and project 3 [a3]) are compared in relation to the performance presented in tables 3 and 4, as presented in table 9.
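the aggregation in equation (4) can be checked with a short python sketch that applies the fuzzy addition and multiplication of equations (a.4) and (a.5) to the rounded values printed in tables 4 to 8. only the first sub-criterion (sc1) is computed here; the helper names are hypothetical.

```python
def f_add(m1, m2):
    """Fuzzy addition of two triangular fuzzy numbers, equation (a.4)."""
    return tuple(p + q for p, q in zip(m1, m2))


def f_mul(m1, m2):
    """Fuzzy multiplication of two triangular fuzzy numbers, equation (a.5)."""
    return tuple(p * q for p, q in zip(m1, m2))


def aggregate(row, weights):
    """One element of W_SC: sum over criteria j of x_ij (x) w_cj."""
    total = (0.0, 0.0, 0.0)
    for x, w in zip(row, weights):
        total = f_add(total, f_mul(x, w))
    return total


# fuzzy criteria weights w_c1..w_c4 (table 4)
W_C = [
    (0.1967, 0.5303, 1.3141),
    (0.0787, 0.2778, 0.7885),
    (0.0576, 0.0960, 0.3504),
    (0.0412, 0.0960, 0.2190),
]

# fuzzy weights of sc1 under each criterion (first rows of tables 5 to 8)
X_SC1 = [
    (0.0179, 0.0489, 0.1307),
    (0.0364, 0.0939, 0.2361),
    (0.0186, 0.0319, 0.1164),
    (0.1101, 0.3115, 0.8074),
]

w_sc1 = aggregate(X_SC1, W_C)

if __name__ == "__main__":
    print(tuple(round(v, 4) for v in w_sc1))
```

because the inputs are the rounded table entries, the result can differ from the first row of equation (7) in the last printed digit. the same function applied to the other rows of tables 5 to 8 gives the remaining elements of the vector.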
table 9 the pairwise comparison of alternatives in relation to performance.
| sc | | a1 | a2 | a3 | fuzzy weights yij |
|---|---|---|---|---|---|
| sc1 | a1 | 1 | 1/3 | 1/5 | (0.0601, 0.1031, 0.2731) |
| | a2 | 3 | 1 | 1/3 | (0.0985, 0.2915, 0.8194) |
| | a3 | 5 | 3 | 1 | (0.2239, 0.6054, 1.5217) |
| sc2 | a1 | 1 | 1 | 1 | (0.2000, 0.3333, 1.0000) |
| | a2 | 1 | 1 | 1 | (0.1556, 0.3333, 0.7143) |
| | a3 | 1 | 1 | 1 | (0.1111, 0.3333, 0.4286) |
| sc3 | a1 | 1 | 3 | 5 | (0.2239, 0.6054, 1.5217) |
| | a2 | 1/3 | 1 | 3 | (0.0985, 0.2915, 0.8194) |
| | a3 | 1/5 | 1/3 | 1 | (0.0601, 0.1031, 0.2731) |
| sc4 | a1 | 1 | 1 | 1/3 | (0.1158, 0.2000, 0.7426) |
| | a2 | 1 | 1 | 1/3 | (0.0807, 0.2000, 0.4455) |
| | a3 | 3 | 3 | 1 | (0.1579, 0.6000, 1.6337) |
| sc5 | a1 | 1 | 1 | 1/3 | (0.1158, 0.2000, 0.7426) |
| | a2 | 1 | 1 | 1/3 | (0.0807, 0.2000, 0.4455) |
| | a3 | 3 | 3 | 1 | (0.1579, 0.6000, 1.6337) |
| sc6 | a1 | 1 | 1 | 1/3 | (0.0667, 0.1282, 0.4545) |
| | a2 | 1 | 1 | 1/3 | (0.1048, 0.3333, 1.0606) |
| | a3 | 3 | 3 | 1 | (0.1429, 0.5385, 1.6667) |
| sc7 | a1 | 1 | 1/5 | 1/5 | (0.0593, 0.0909, 0.1570) |
| | a2 | 5 | 1 | 1 | (0.2308, 0.4545, 1.0359) |
| | a3 | 5 | 1 | 1 | (0.2000, 0.4545, 0.8475) |
the final vector of fuzzy weights for smart grid projects, according to equation (6), is:
$$W_A = Y \otimes W_{SC} = [y_{ij}]_{3 \times 7} \otimes [w_{sc_i}]_{7 \times 1} = \begin{bmatrix} (0.0168, 0.2079, 4.5137) \\ (0.0136, 0.2445, 4.1205) \\ (0.0190, 0.4394, 7.4556) \end{bmatrix}$$ (8)
after the defuzzification of the final weight vectors of performance and projects, according to equation (a.11), performance and smart grid projects are ranked. ranking results are shown in table 10 (fws are final weights).
table 10 ranking of project performance and smart grid projects.
| | fws (λ=0.5) | rank (λ=0.5) | fws (λ=1.0) | rank (λ=1.0) |
|---|---|---|---|---|
| project performance | | | | |
| sustainability (sc1) | 0.0861 | 6 | 0.0863 | 6 |
| capacity of transmission and distribution grids for 'collecting' and bringing electricity to the consumers (sc2) | 0.1516 | 3 | 0.1518 | 3 |
| possibility of grid connection and access for all kinds of grid users (sc3) | 0.0761 | 7 | 0.0771 | 7 |
| security and quality of supply (sc4) | 0.2310 | 1 | 0.2303 | 1 |
| efficiency and good service in electricity supply and grid operation (sc5) | 0.1993 | 2 | 0.1974 | 2 |
| effective support of transnational projects and electricity markets (sc6) | 0.1473 | 4 | 0.1485 | 4 |
| transparent information to consumers (sc7) | 0.1086 | 5 | 0.1085 | 5 |
| smart grid projects | | | | |
| project 1 (a1) | 0.2669 | 2 | 0.2781 | 2 |
| project 2 (a2) | 0.2580 | 3 | 0.2570 | 3 |
| project 3 (a3) | 0.4661 | 1 | 0.4647 | 1 |
based on the previous results, we can conclude the following:
1. the most important criterion for the selection of a smart grid project (for this particular distribution company) is the selected technology, followed by the costs, the customer satisfaction and the environmental protection (table 4). advanced technology increases the efficiency and security of a high-performance energy supply, and thus increases user satisfaction and protects the environment.
2. in relation to the technology, the best-ranked performance is security and quality of supply; in relation to the costs, capacity of transmission and distribution grids for 'collecting' and bringing electricity to the consumers; in relation to the user satisfaction, possibility of grid connection and access for all kinds of grid users; and in relation to the environmental protection, sustainability.
3.
the final ranking of the project performance, based on all criteria, is:
- security and quality of supply
- efficiency and good service in electricity supply and grid operation
- capacity of transmission and distribution grids for 'collecting' and bringing electricity to the consumers
- effective support of transnational projects and electricity markets
- transparent information to consumers
- sustainability
- possibility of grid connection and access for all kinds of grid users.
the best-ranked performances (security and quality of supply, and efficiency and good service in electricity supply and grid operation) are supported by the advanced technology.
4. the final ranking of the alternatives (table 10) indicates that the a3 project has the highest rank, followed by the a1 project; the a2 project has the lowest priority. this means that project 3 should be selected for the implementation of the smart grid.
5. conclusion
in this paper, starting from a general set of smart grid performance indicators, a new assessment framework for the evaluation of smart grid efficiency has been established, as one of the main conditions for the successful implementation of any energy management program. using the fuzzy ahp methodology with four main criteria and seven sub-criteria derived from the adopted set of smart grid benefits, we proved that the method is highly successful in the evaluation of alternatives in the presence of heterogeneous criteria. this method allows the decision makers to incorporate unquantifiable information, incomplete information, non-obtainable information and partially ignorant facts into the decision model. the proposed methodology is illustrated on the choice of the right smart grid deployment strategy in a medium-size power distribution company.
the analysis shows that the dominant performances of the optimal smart grid project are the selected technology, followed by the costs, the customer satisfaction and the environmental protection. this methodology is applied to the general assessment of smart grid efficiency, while further research will be focused on particular aspects of the project implementation.
acknowledgement: this work was supported by the ministry of education, science and technological development of the republic of serbia under grant iii 42006 and grant iii 44006.
appendix
a.1. fuzzy set, triangular fuzzy number and fuzzy arithmetic
the mathematical basis of the fuzzy ahp method consists of fuzzy sets and fuzzy arithmetic. in [40] a fuzzy set A is defined by its degree of membership $\mu_A(x)$ over a universe of discourse X as:
$$\mu_A(x): X \to [0, 1].$$ (a.1)
a fuzzy number is a convex and normalized fuzzy set $A = \{(x, \mu_A(x))\}$, $x \in \mathbb{R}$. a triangular fuzzy number can be denoted as $M = (a, b, c)$, and its membership function is:
$$\mu_M(x) = \begin{cases} \dfrac{x - a}{b - a}, & x \in [a, b] \\ \dfrac{c - x}{c - b}, & x \in [b, c] \\ 0, & \text{otherwise} \end{cases}$$ (a.2)
where $a \le b \le c$, a and c stand for the lower and upper value of the support of M, respectively, and b is the modal value. when $a = b = c$, it is a "normal", crisp number. fuzzy arithmetic is based on zadeh's extension principle. if $f: X \to Y$ is a function, and A is a fuzzy set in X, then $f(A)$ is defined as:
$$\mu_{f(A)}(y) = \sup_{x \in X,\ f(x) = y} \mu_A(x),$$ (a.3)
where $y \in Y$. the main laws for operations on two triangular fuzzy numbers $M_1 = (a_1, b_1, c_1)$ and $M_2 = (a_2, b_2, c_2)$ are:
$$M_1 \oplus M_2 = (a_1, b_1, c_1) \oplus (a_2, b_2, c_2) = (a_1 + a_2, b_1 + b_2, c_1 + c_2),$$ (a.4)
$$M_1 \otimes M_2 = (a_1, b_1, c_1) \otimes (a_2, b_2, c_2) = (a_1 a_2, b_1 b_2, c_1 c_2), \quad a_1, a_2 \ge 0,$$ (a.5)
$$M_1^{-1} = (a_1, b_1, c_1)^{-1} = \left(\frac{1}{c_1}, \frac{1}{b_1}, \frac{1}{a_1}\right).$$ (a.6)
a.2.
fuzzy synthetic extent
the value of the fuzzy synthetic extent, according to chang's extent analysis method, is defined as [41]:
$$S_i = \sum_{j=1}^{m} M_{g_i}^{j} \otimes \left[\sum_{i=1}^{n}\sum_{j=1}^{m} M_{g_i}^{j}\right]^{-1}, \quad i = 1, 2, \dots, n,$$ (a.7)
where $M_{g_i}^{j}$ is a triangular fuzzy number representing the extent analysis value for decision element i with respect to goal j, and $\otimes$ is the fuzzy multiplication operator. the sums in equation (a.7) are determined using equations (a.4) and (a.6):
$$\sum_{j=1}^{m} M_{g_i}^{j} = \left(\sum_{j=1}^{m} a_j, \sum_{j=1}^{m} b_j, \sum_{j=1}^{m} c_j\right) = (a_i, b_i, c_i),$$ (a.8)
$$\sum_{i=1}^{n}\sum_{j=1}^{m} M_{g_i}^{j} = \left(\sum_{i=1}^{n} a_i, \sum_{i=1}^{n} b_i, \sum_{i=1}^{n} c_i\right),$$ (a.9)
$$\left[\sum_{i=1}^{n}\sum_{j=1}^{m} M_{g_i}^{j}\right]^{-1} = \left(\frac{1}{\sum_{i=1}^{n} c_i}, \frac{1}{\sum_{i=1}^{n} b_i}, \frac{1}{\sum_{i=1}^{n} a_i}\right).$$ (a.10)
a.3. total integral value method for defuzzification
for a given triangular fuzzy number $M = (a, b, c)$ the total integral value is defined as follows [38]:
$$I_T^{\lambda}(M) = 0.5\,(\lambda c + b + (1 - \lambda) a), \quad \lambda \in [0, 1],$$ (a.11)
where λ represents an optimism index. it describes the decision maker's attitude toward risk. the values 0, 0.5 and 1 are used to represent the pessimistic, moderate and optimistic views of the decision maker, respectively. if $I_T^{\lambda}(M_1) < I_T^{\lambda}(M_2)$, then $M_1 < M_2$; if $I_T^{\lambda}(M_1) = I_T^{\lambda}(M_2)$, then $M_1 = M_2$; if $I_T^{\lambda}(M_1) > I_T^{\lambda}(M_2)$, then $M_1 > M_2$.
references
[1] v. giordano, s. vitiello, j. vasiljevska, definition of an assessment framework for projects of common interest in the field of smart grids, jrc science and policy reports, 2014.
[2] d. tasic et al., "conception of low voltage network loss reduction based on integrated information", facta universitatis, series: electronics and energetics, vol. 24, no. 1, pp. 59-71, april 2011.
[3] european commission, r&d investment in the priority technologies of the set-plan, sec, 1296, 2009.
[4] the european electricity grid initiative (eegi), a joint tso-dso contribution to the european industrial initiative (eii) on electricity networks, 2009.
[5] european commission, toward smart power networks, lessons learned from european research fp5 projects, 2005.
[6] european commission, strategic research agenda for europe electricity networks of the future european technology platform, 2007.
[7] european commission, 2010. strategic deployment document for europe electricity networks of the future european technology platform. available from www.smartgrids.eu/documents/smartgrids_sdd_final_april2010.pdf
[8] european network for the security of control and real time systems, r&d and standardization road map, final deliverable 3.2, 2011.
[9] commission of european communities, green paper: a european strategy for sustainable, competitive and secure energy, brussels, 2006.
[10] institute for 21st century energy, index of u.s. energy security risk, assessing america's vulnerabilities in a global energy market, 2011.
[11] b. falahati, f. yong, w. lei, "reliability assessment of smart grid considering direct cyber-power interdependencies," ieee transactions on smart grid, vol. 3, no. 3, pp. 1515-1524, september 2012.
[12] b. falahati, f. yong, "reliability assessment of smart grids considering indirect cyber-power interdependencies," ieee transactions on smart grid, vol. 5, no. 4, pp. 1677-1685, july 2014.
[13] m. dimitrijevic et al., "implementation of artificial neural networks based ai concepts to the smart grids", facta universitatis, series: electronics and energetics, vol. 27, no. 3, pp. 411-424, september 2014.
[14] z. lin, l. furong, g. chenghong, h. zechun, b. le, "cost/benefit assessment of a smart distribution system with intelligent electric vehicle charging," ieee transactions on smart grid, vol. 5, no. 2, pp. 839-847, march 2014.
[15] a. janjic and l. z.
velimirovic, "optimal scheduling of utility electric vehicle fleet offering ancillary services", etri journal, vol. 37, no. 2, april 2015.
[16] european commission task force for smart grids, 2010a. "expert group 2: regulatory recommendations for data safety, data handling and data protection", available from http://ec.europa.eu/energy/gas_electricity/smartgrids/doc/expert_group2.pdf
[17] european commission task force for smart grids, 2010b. "expert group 3: roles and responsibilities", available from http://ec.europa.eu/energy/gas_electricity/smartgrids/doc/expert_group3.pdf
[18] european electricity grid initiative (eegi), roadmap 2010-18 and detailed implementation plan 2010-12, 2010. available from http://ec.europa.eu/energy/technology/initiatives/doc/grid_implementation_plan_final.pdf
[19] u.s. department of energy (doe), guidebook for arra sgdp/rdsi metrics and benefits, doe report, 2010. available from http://www.smartgrid.gov/sites/default/files/pdfs/sgdp_rdsi_metrics_benefits.pdf
[20] d. stevanovic, p. petkovic, "utility needs smarter power meters in order to reduce economic losses", facta universitatis, series: electronics and energetics, vol. 28, no. 3, pp. 407-421, september 2015.
[21] b. dupont, l. meeus, and r. belmans, "measuring the 'smartness' of the electricity grid", in proc. of the 7th international conference on the energy market (eem), pp. 1-6, 2010.
[22] epri (electric power research institute), "methodological approach for estimating the benefits and costs of smart grid demonstration projects", palo alto, ca: epri, 1020342, 2010.
[23] european commission, guidelines for conducting cost-benefit analysis of smart grid projects. reference report joint research centre, institute for energy and transport, 2012. available from http://ses.jrc.ec.europa.eu/
[24] european commission, guidelines for cost-benefit analysis of smart metering deployment. scientific and policy report joint research centre, institute for energy and transport, 2012.
available from http://ses.jrc.ec.europa.eu/ [25] p. beria, i. maltese, and i. mariotti, "multi-criteria versus cost benefit analysis: a comparative perspective in the assessment of sustainable mobility", eur. transp. res. rev., vol. 4, pp. 137-152, 2012. [26] t.l. saaty, the analytic hierarchy process. new york: mcgraw-hill, 1980. http://www.smartgrid.gov/sites/default/files/pdfs/sgdp_rdsi_metrics_benefits.pdf http://www.smartgrid.gov/sites/default/files/pdfs/sgdp_rdsi_metrics_benefits.pdf http://ieeexplore.ieee.org/search/searchresult.jsp?searchwithin=p_authors:.qt.dupont,%20b..qt.&searchwithin=p_author_ids:37528338800&newsearch=true http://ieeexplore.ieee.org/search/searchresult.jsp?searchwithin=p_authors:.qt.meeus,%20l..qt.&searchwithin=p_author_ids:37271271000&newsearch=true http://ieeexplore.ieee.org/search/searchresult.jsp?searchwithin=p_authors:.qt.belmans,%20r..qt.&searchwithin=p_author_ids:37274741400&newsearch=true http://ieeexplore.ieee.org/xpl/articledetails.jsp?tp=&arnumber=5558673&matchboolean%3dtrue%26searchfield%3dsearch_all%26querytext%3d%28%28%28p_title%3ameasuring%29+and+smartness%29+and+electricity%29 http://ieeexplore.ieee.org/xpl/mostrecentissue.jsp?punumber=5551851 http://ieeexplore.ieee.org/xpl/mostrecentissue.jsp?punumber=5551851 646 a. janjic, s. savic, g. janackovic, m. stankovic, l. velimirovic [27] o. duru, e. bulut, and s. yoshida, "regime switching fuzzy ahp model for choice-varying priorities problem and expert consistency prioritization: a cubic fuzzy-priority matrix design", expert systems with applications, vol. 39, pp. 4954-4964, 2012. [28] o. kulak, b. durmusoglu, and c. kahraman, "fuzzy multi-attribute equipment selection based on information axiom", journal of materials processing technology, vol. 169, pp. 337–345, 2005. [29] p. j. m. van laarhoven, w. pedrycz, "a fuzzy extension of saaty‟s priority theory", fuzzy sets and systems, vol. 11, pp. 229-241, 1983. [30] j. 
buckley, "fuzzy hierarchical analysis", fuzzy sets and systems, vol. 17, no. 3, pp. 233-247, 1985. [31] l. fatti, water research planning in south africa. in: b. golden et al. (eds) application of the analytic hierarchy process, springer: new york, pp. 122–137, 1989. [32] m. ridgley, "a multicriteria approach to allocating water under drought", resource management and optimization, vol. 92, pp. 112–132, 1993. [33] b. srdjevic and y. medeiros, "fuzzy ahp assessment of water management plans", water resources management, vol. 22, pp. 877-894, 2008. [34] c.h. cheng, "evaluating naval tactical missile systems by fuzzy ahp based on the grade value of membership function", european journal of operational research, vol. 96, pp. 343-350, 1996. [35] a.t. gumus, "evaluation of hazardous waste transportation forms by using a two step fuzzy ahp and topsis methodology", expert systems with applications, vol. 36, no. 2, pp. 4067-4074, 2009. [36] f. bozbura, a. beskese, and c. kahraman, "prioritization of human capital measurement indicators using fuzzy ahp", expert systems with applications, vol. 32, pp. 1100-1112, 2007. [37] e. bulut, o. duru, t. kececi, and s. yoshida, "use of consistency index, expert prioritization and direct numerical inputs for generic fuzzy-ahp modeling: a process model for shipping asset management", expert systems with applications, vol. 39, pp. 1911-1923, 2012. [38] m. dağdeviren and i. yüksel, "developing a fuzzy analytic hierarchy process (ahp) model for behaviorbased safety management", information sciences, vol. 178, no. 6, pp. 1717-1733, 2008. [39] g. janackovic, s. savic, and m. stankovic, "selection and ranking of occupational safety indicators based on fuzzy ahp: case study in road construction companies", south african journal of industrial engineering, vol. 24, no. 3, pp. 175-189, 2013. [40] l.a. zadeh, "fuzzy sets", information and control, vol. 8, pp. 338-353, 1965. [41] d.y. 
chang, "applications of the extent analysis method on fuzzy ahp", european journal of operational research, vol. 95, pp. 649–655, 1996. hybrid neural lumped element approach in modeling of rf mems switches facta universitatis series: electronics and energetics vol. 33, n o 1, march 2020, pp. 27-36 https://doi.org/10.2298/fuee2001027c hybrid neural lumped element approach in inverse modeling of rf mems switches  tomislav ćirić 1 , zlatica marinković 1 , rohan dhuri 2 , olivera pronić-rančić 1 , vera marković 1 1 university of niš, faculty of electronic engineering, niš, serbia 2 alten gmbh, munich, germany abstract. rf mems switches have been efficiently exploited in various applications in communication systems. as the dimensions of the switch bridge influence the switch behaviour, during the design of a switch it is necessary to perform inverse modeling, i.e. to determine the bridge dimensions to ensure the desired switch characteristics, such as the resonant frequency. in this paper a novel inverse modeling approach based on combination of artificial neural networks and a lumped element circuit model has been considered. this approach allows determination of the bridge fingered part length for the given resonant frequency and the bridge solid part length, generating at the same time values of the elements of the switch lumped element model. validity of the model is demonstrated by appropriate numerical examples. key words: artificial neural networks, inverse modeling, lumped element model, rf mems switch. 1. introduction radio-frequency micro-electro-mechanical systems (rf mems) components have been proven to be of a great importance for rf circuits and subsystems, as they possess characteristics that may surpass characteristics of conventional, purely electrical components. rf mems devices consist of moving sub-millimeter-sized parts that provide radio frequency functionality. 
they offer high linearity, low insertion loss and extremely good intermodulation performance. mems devices have the ability to sense, control and actuate on the micro scale, and to generate effects on the macro scale. owing to the variety and diversity of rf mems technology functionalities, they are widely applicable in the new generation of communication system components, such as switches and varactors (variable capacitors), resonators, complex networks, reconfigurable filters, phase shifters, impedance matching tuners and programmable step attenuators [1-9]. recently, rf mems technology has found applications in the internet of things (iot), internet of everything (ioe), tactile internet and 5g telecommunications [10-12].

received february 20, 2019; received in revised form september 10, 2019
corresponding author: tomislav ćirić, faculty of electronic engineering, university of niš, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: cirict@live.com)

the design of circuits containing rf mems switches requires repeated simulations and/or optimizations of the switch characteristics. therefore, there is a need for reliable rf mems models. switch electrical characteristics can be accurately determined in full-wave electromagnetic simulators [13-15]. however, as the simulation models are quite complex and the simulations consume a significant amount of time, a common way to overcome these problems is to use lumped models in circuit simulators [16, 17]. the lumped element models based on equivalent circuits are faster than the full-wave ones. however, if differently sized bridges are analyzed, the procedures for obtaining the equivalent circuit elements have to be repeated, which is a time-consuming process.
to make the lumped element model scalable with the dimensions, artificial neural networks (anns) were proposed to model the dependence of the lumped element model on the switch bridge dimensions [18].

the switch bridge dimensions determine the electromagnetic characteristics of the switch. therefore, during the design of a switch, it is necessary to determine the bridge dimensions that ensure the desired switch characteristics, such as the resonant frequency, i.e. to perform inverse modeling of the switch. the authors of this work earlier proposed black-box inverse modeling of rf mems capacitive switches, where the bridge lateral dimensions were determined for given electrical or mechanical characteristics of the switch [19-25]. in this work, the neural based inverse modeling approach is extended so that it provides not only the bridge dimensions but also the values of the corresponding lumped model elements, resulting in a lumped element model ready to be used for further simulations of the circuits containing the considered switch.

the paper is organized as follows: after the introduction, a description of the considered rf mems switch is given in section 2. the proposed modeling approach is described in section 3. details of the model development and validation and the most illustrative numerical results are given in section 4. section 5 contains the conclusions.

2. rf mems capacitive switches

mems are integrated devices consisting of micromechanical and electronic components. rf mems switches are micromechanical switches designed to operate at rf to mm-wave frequencies. they use mechanical movement of the bridge to achieve a closed or open circuit in the rf transmission line. rf mems switches are classified by the type of actuation, deflection axis, contact type, circuit configuration, and structure configuration. the considered device is a coplanar waveguide (cpw) based rf mems capacitive shunt switch (fig.
1(a)) designed at fondazione bruno kessler (fbk) in trento in an 8-layer silicon micromachining process [26-28]. the device is fabricated on a silicon substrate with silicon dioxide (sio2) as the insulator. the bridge is a thin gold (au) membrane connecting both sides of the ground plane, with defined lateral dimensions (length of the fingered part, lf, and length of the solid part, ls). the signal line is a thin aluminum layer placed below the bridge. the dc actuation pads, made of polysilicon, are placed on opposite sides of the signal line. when the actuation voltage is applied to the electrodes, the electrostatic force exceeds the mechanical restoring force and pulls the membrane down towards the ground plane, switching the circuit [26].

fig. 1 (a) top-view of the realized switch and schematic of the cross-section [26] and (b) equivalent circuit of the rf mems switch: rlc lumped element model

the inductance of the bridge and the fixed capacitance between the signal line and the bridge create a resonant circuit to the ground. the resonant frequency can be adjusted by varying the bridge lateral dimensions. at the series resonance, the circuit acts as a short circuit to the ground. in a certain frequency band around the resonant frequency the transmission of the signal is suppressed.

an rf mems switch can be represented by a simplified equivalent circuit model, as shown in fig. 1(b). it consists of the resistance r, the inductance l and the capacitance c. two coplanar waveguide lines, cpw1 and cpw2, are added with the aim of matching the obtained s-parameters with the s-parameters obtained by a full-wave analysis, having in mind that the reference planes for simulation and measurement are usually not defined directly at the membrane but at some distance from it [26]. the switch resonant frequency is

$$f_{res} = \frac{1}{2\pi\sqrt{lc}}. \qquad (1)$$

the switch capacitance in the membrane down-state, which is the case considered in this work, is calculated from the layout using the following expression [3]:

$$c = \frac{\varepsilon_0 \varepsilon_r a}{t_d}, \qquad (2)$$

where $\varepsilon_0$ is the permittivity of vacuum, $\varepsilon_r$ is the relative dielectric permittivity, $t_d$ is the distance between the two plates forming the capacitor and $a$ is the surface of the plates. the capacitance is constant, because it does not depend on the bridge lateral dimensions.

the other two elements, r and l, depend on the bridge lateral dimensions ls and lf. they can be obtained simultaneously by optimizations in a circuit simulator aimed at achieving the desired values of the equivalent circuit s-parameters. alternatively, the inductance can be determined from the given resonant frequency as

$$l = \frac{1}{4\pi^2 f_{res}^2 c}, \qquad (3)$$

and then only the resistance is to be obtained by optimization in a circuit simulator. it should be noted that once the capacitance is determined, the extraction of the resistance and the inductance should be repeated for each considered combination of bridge dimensions, which always requires new full-wave simulations to provide inputs for optimization. following the approach proposed in [26], the lumped element scalability with the bridge lateral dimensions can be introduced by means of artificial neural networks, as will be described in the next section.

3. proposed inverse modeling approach

to determine the switch lateral dimensions for the desired resonant frequency, and simultaneously to determine the corresponding equivalent circuit elements, a new inverse modeling approach is proposed in this work. the proposed approach is a hybrid approach combining neural modeling with a lumped element equivalent circuit. in other words, it is a combination of the black-box neural inverse modeling approach [19-21] and a modification of the scalable lumped element model proposed in [18].
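as a numerical check of the lumped element relations in section 2, the following is a minimal python sketch of eqs. (1)-(3), using the layout values and one test point quoted later in section 4 and table 2. several things here are assumptions and not part of the original work: the rounded value 8.85e-12 f/m for the vacuum permittivity (chosen because it reproduces the quoted 4.48695 pf), the function names, and the textbook shunt-impedance expression for s21, which ignores the cpw sections of the equivalent circuit.

```python
import math

EPS0 = 8.85e-12  # F/m, rounded; assumed because it reproduces the quoted 4.48695 pF


def plate_capacitance(eps_r, area_m2, gap_m):
    """Eq. (2): c = eps0 * eps_r * a / t_d."""
    return EPS0 * eps_r * area_m2 / gap_m


def inductance_from_resonance(f_res_hz, c_farad):
    """Eq. (3): l = 1 / (4 * pi^2 * f_res^2 * c)."""
    return 1.0 / (4.0 * math.pi ** 2 * f_res_hz ** 2 * c_farad)


def resonant_frequency(l_henry, c_farad):
    """Eq. (1): f_res = 1 / (2 * pi * sqrt(l * c))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_henry * c_farad))


def s21_db_shunt_rlc(f_hz, r_ohm, l_henry, c_farad, z0=50.0):
    """|S21| in dB of a series rlc branch shunting a matched line.

    Uses the textbook shunt-impedance result S21 = 2Z / (2Z + z0);
    the cpw sections of the equivalent circuit are ignored here.
    """
    w = 2.0 * math.pi * f_hz
    z = complex(r_ohm, w * l_henry - 1.0 / (w * c_farad))
    return 20.0 * math.log10(abs(2.0 * z / (2.0 * z + z0)))


# Down-state capacitance from the layout values quoted in section 4:
# eps_r = 3.9, a = 13000 um^2, t_d = 0.1 um.
C_DOWN = plate_capacitance(3.9, 13000e-12, 0.1e-6)

# Test point from table 2: f_res = 17.5 GHz, r = 764.92 mOhm.
L_TEST = inductance_from_resonance(17.5e9, C_DOWN)
R_TEST = 0.76492

# Sweep 5-40 GHz in 0.1 GHz steps and locate the transmission notch.
freqs_hz = [k * 1e8 for k in range(50, 401)]
s21_db = [s21_db_shunt_rlc(f, R_TEST, L_TEST, C_DOWN) for f in freqs_hz]
notch_hz = freqs_hz[s21_db.index(min(s21_db))]

if __name__ == "__main__":
    print(f"c = {C_DOWN * 1e12:.5f} pF, l = {L_TEST * 1e12:.3f} pH")
    print(f"notch at {notch_hz / 1e9:.1f} GHz, |S21| = {min(s21_db):.1f} dB")
```

in this simplified circuit the notch of the sweep falls on the resonant frequency used to compute the inductance, which is the property the inverse model relies on when eq. (3) replaces a second optimization.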
a schematic diagram of the proposed model is shown in fig. 2. the aim of the first ann (ann 1) is to determine the length of the fingered part lf for the desired resonant frequency [19, 20, 22]. as described in the previous work, because different combinations of the bridge solid and fingered part lengths may lead to the same resonant frequency value, it is not possible to use this approach to determine ls and lf simultaneously. instead, the length of the solid part is taken as an input of the inverse model, alongside the resonant frequency. the second ann (ann 2) is used for modeling the relationship between the resistance and the bridge lateral dimensions ls and lf. unlike the model considered in [18], where the dependence of the inductance on the dimensions is also modeled by the ann, here the resonant frequency is known, so the inductance can be calculated by using eq. 3, assuming that the capacitance, which is constant and does not depend on the bridge lateral dimensions, has been determined previously. therefore the value calculated by eq. 2 is directly assigned to the capacitor in the equivalent circuit.

fig. 2 proposed inverse modeling approach

the anns used are multilayered anns having one input layer, one output layer and one or more hidden layers [1]. both anns have two input neurons and one output neuron. the inputs of ann 1 correspond to the bridge solid part length ls and the resonant frequency fres, and the output corresponds to the bridge fingered part length lf. for the training and validation of ann 1 it is necessary to have a set of samples consisting of combinations of dimensions and the corresponding values of the resonant frequency.
that implies that s-parameter simulations in a full-wave simulator should be performed for each combination of the dimensions, with the resonant frequency determined as the frequency corresponding to the minimum value of the s21 magnitude. the ann 2 inputs correspond to the bridge lateral dimensions ls and lf, whereas the output corresponds to the equivalent circuit resistance r. the training samples consist of combinations of the two considered lateral dimensions and the corresponding resistances. values of the resistance used for training are determined by optimization in a circuit simulator for the previously calculated capacitance and inductance, as described in section 2. the flow chart describing the development of the proposed model is shown in fig. 3. the optimization goal is to match the response simulated by the equivalent circuit (the resonant frequency, i.e. all scattering parameters) to the response simulated in the full-wave simulator for the given combination of the dimensions.

the implementation of the anns in the equivalent circuit is done as follows. each ann is represented by a set of mathematical expressions describing the ann transfer function. the expressions corresponding to the developed anns are implemented by means of variable and equation blocks (var) on the equivalent circuit schematic. the inputs and outputs of a var block are the same as the inputs and outputs of the corresponding ann. the output of the var block corresponding to ann 1 is led to the input of the var block corresponding to ann 2, whose output is further assigned to the resistance of the equivalent circuit.

fig. 3 model development flow chart

the developed inverse model does not require additional simulations in the full-wave simulator or additional optimizations.
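the chaining of the two var blocks described above (ann 1 output feeding ann 2, with the inductance taken from eq. 3) can be sketched in python as follows. this is only an illustration under stated assumptions: `mlp_forward`, `inverse_model` and the two stub functions are hypothetical names, the tanh hidden activation with a linear output is assumed because the activation functions are not stated here, and the stubs merely replay one row of tables 1 and 2 in place of the trained networks, whose weights are not given.

```python
import math


def mlp_forward(x, layers):
    """Evaluate a fully connected network: tanh hidden layers, linear output.

    `layers` is a list of (weights, biases); weights[j][i] links input i to
    neuron j. The activation choice is an assumption made for this sketch.
    """
    a = list(x)
    for idx, (weights, biases) in enumerate(layers):
        z = [sum(w * v for w, v in zip(row, a)) + b
             for row, b in zip(weights, biases)]
        # Last layer stays linear, hidden layers use tanh.
        a = z if idx == len(layers) - 1 else [math.tanh(v) for v in z]
    return a


def inverse_model(ls_um, f_res_ghz, c_pf, ann1, ann2):
    """Chain ann 1 -> eq. (3) -> ann 2 and return (lf_um, l_ph, r_mohm)."""
    lf_um = ann1(ls_um, f_res_ghz)        # fingered part length from ann 1
    f_hz = f_res_ghz * 1e9
    c_f = c_pf * 1e-12
    l_ph = 1e12 / (4.0 * math.pi ** 2 * f_hz ** 2 * c_f)  # eq. (3), in pH
    r_mohm = ann2(ls_um, lf_um)           # resistance from ann 2
    return lf_um, l_ph, r_mohm


# Stand-ins for the trained networks: they only replay the table values
# for ls = 100 um and f_res = 17.5 GHz.
def ann1_stub(ls_um, f_res_ghz):
    return 73.6


def ann2_stub(ls_um, lf_um):
    return 764.92


lf_um, l_ph, r_mohm = inverse_model(100.0, 17.5, 4.48695, ann1_stub, ann2_stub)

if __name__ == "__main__":
    print(f"lf = {lf_um} um, l = {l_ph:.3f} pH, r = {r_mohm} mOhm")
```

passing the networks in as callables keeps the chain independent of how each ann is evaluated; a trained network could be plugged in by wrapping `mlp_forward` with its own weights and input scaling.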
for the desired resonant frequency and a given value of the bridge solid part length, by running the s-parameter simulations, it is possible to simultaneously calculate the length of the fingered part, determine the corresponding elements of the equivalent circuit and simulate the s-parameters over the desired frequency range. as all the operations are performed in the circuit simulator, the whole process is done within seconds, which is significantly faster than performing optimizations in a full-wave simulator to determine the dimensions and optimizations in a circuit simulator to determine the resistance.

boxes of the model development flow chart (fig. 3):
- simulate in an em full-wave simulator the s-parameters for several combinations of the lateral dimensions
- find the resonant frequency for each combination of the lateral dimensions
- build the ann 1 training and test sets (ls, fres, lf)
- for each combination of the lateral dimensions determine in a circuit simulator the resistance r
- calculate the capacitance c (eq. 2) and the inductance l (eq. 3)
- build the ann 2 training and test sets (ls, lf, r)
- train ann 1 (networks with different numbers of hidden neurons) and find the ann with the best accuracy
- train ann 2 (networks with different numbers of hidden neurons) and find the ann with the best accuracy
- create mathematical expressions describing ann 1 and ann 2
- implement the expressions on the equivalent circuit schematic

4. numerical results

the proposed inverse model was developed for the following ranges of the switch geometrical parameters: ls from 50 µm to 500 µm, and lf from 0 µm to 100 µm. to prepare the data for the model development, the equivalent circuit elements r, l and c were determined for several different combinations of the lateral dimensions ls and lf. the relative permittivity of silicon dioxide is 3.9 and the dimensions contributing to the capacitance value in the down-state are a = 13000 µm² and td = 0.1 µm.
therefore, by using eq. 2, the calculated capacitance in the down-state is 4.48695 pf. for each combination of ls and lf, first the s-parameters were determined by full-wave simulations in advanced design system (ads) momentum software [29] and the resonant frequency was determined as the frequency of the minimum of the s21 parameter magnitude. further, the combinations of the lateral dimensions and the resonant frequencies obtained by ads simulations were used for training ann 1. the available dataset was divided into the training set used for the development of the anns and the test set used for the model validation. anns with different numbers of hidden neurons in one or two hidden layers were trained, because the number of hidden neurons cannot be determined a priori. the networks with the best test results were chosen as the final model. in this paper, the following notation of anns is used: an ann denoted by n-h1-h2-m has n input neurons, h1 and h2 neurons in the first and second hidden layer, respectively, and m output neurons. table 1 lists the test results obtained by the best ann 1 (2-15-15-1) for input combinations whose values did not appear in the training set [19, 22].

table 1 rf mems switch inverse modeling results: lf

ls (µm)   fres (ghz)   lf (target) (µm)   lf (from ann 1) (µm)   lf abs. error (µm)   lf relative error (%)
75        22.78        25                 24.9                   0.1                  0.4
75        19.17        65                 65.4                   0.4                  0.6
75        17.92        85                 85.3                   0.3                  0.3
100       17.5         75                 73.6                   1.4                  1.9
200       13.13        85                 86.8                   1.8                  2.1
350       11.67        25                 23.4                   1.6                  6.4
350       10.83        65                 62.2                   2.8                  4.3
400       10           85                 87.4                   2.4                  2.9

the relative errors are in most cases less than 3%. moreover, in all cases the absolute difference between the predicted and expected values is less than 3 µm, which is already close to fabrication tolerances. more details about the development and validation of the mentioned inverse model can be found in [19, 22].
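the error columns of table 1 follow directly from the target and predicted lengths. the short python sketch below recomputes them; the variable and function names are illustrative and the value pairs are simply copied from the table.

```python
# (lf target, lf from ann 1) in micrometres, copied from table 1
PAIRS_UM = [(25, 24.9), (65, 65.4), (85, 85.3), (75, 73.6),
            (85, 86.8), (25, 23.4), (65, 62.2), (85, 87.4)]


def prediction_errors(target, predicted):
    """Return (absolute error, relative error in percent)."""
    abs_err = abs(predicted - target)
    return abs_err, 100.0 * abs_err / target


errors = [prediction_errors(t, p) for t, p in PAIRS_UM]
max_abs_um = max(a for a, _ in errors)           # worst absolute error
max_rel_pct = max(r for _, r in errors)          # worst relative error
rows_below_3pct = sum(1 for _, r in errors if r < 3.0)

if __name__ == "__main__":
    for (t, p), (a, r) in zip(PAIRS_UM, errors):
        print(f"target {t:>3} um, ann {p:>5} um, abs {a:.1f} um, rel {r:.1f} %")
```

with these values, six of the eight test rows have a relative error below 3%, and the largest absolute error stays under 3 µm, in line with the statements above.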
further, the resonant frequency and the capacitance were used to determine the inductance for each combination of ls and lf. the inductance is calculated by using eq. 3 and the results are presented in table 2. in the next step, the neural model for determining the resistance for the given dimensions (ann 2) was developed. the target resistance values were obtained by optimization of the resistance value for each considered combination of the dimensions. cpws of 50 Ω were used. among the trained anns with different numbers of hidden neurons, the best results were obtained by the ann with the structure 2-4-8-1. the resistances obtained by ann 2 for the eight test combinations not used for the network training are shown in table 2.

table 2 extracted equivalent circuit elements

ls (µm)   lf (from ann 1) (µm)   c (pf)    fres (ghz)   l (ph)   r (from ann 2) (mΩ)
75        24.9                   4.48695   22.78        10.879   638.05
75        65.4                   4.48695   19.17        15.363   739.45
75        85.3                   4.48695   17.92        17.581   763.26
100       73.6                   4.48695   17.5         18.435   764.92
200       86.8                   4.48695   13.13        32.748   857.68
350       23.4                   4.48695   11.67        41.455   908.49
350       62.2                   4.48695   10.83        48.135   946.07
400       87.4                   4.48695   10           56.457   977.75

to validate further the proposed hybrid inverse modeling approach, for the test combinations of the bridge dimensions, the calculated c, l and r were assigned to the corresponding equivalent circuit elements and used for the s-parameter simulation. the comparison of the rf mems switch s-parameters simulated by the equivalent circuit and the s-parameters determined by the ads momentum simulations shows a very good match. as an illustration, fig. 4 and fig. 5 show the insertion loss (|s21| in db) and the return loss (|s11| in db) for two devices with different lateral dimensions: the first with ls = 100 µm and lf = 75 µm, and the second with ls = 350 µm and lf = 25 µm.

fig. 4 s11 and s21 of rf mems switch for ls = 100 µm and lf = 75 µm (rlc model: red solid line, full-wave simulations: blue dashed line); panels: |s11| (db) and |s21| (db) versus f (ghz)

fig. 5 s11 and s21 of rf mems switch for ls = 350 µm and lf = 25 µm (rlc model: red solid line, full-wave simulations: blue dashed line); panels: |s11| (db) and |s21| (db) versus f (ghz)

as can be seen, in both cases the response of the equivalent circuit is almost identical to the reference response obtained by the full-wave simulations, confirming the accuracy of the proposed approach. the bridge with lateral dimensions ls = 350 µm and lf = 25 µm was chosen because it is the case where ann 1 exhibits the biggest deviation between modeled and reference values. even in that case, the circuit response is very close to the target values obtained by the full-wave simulations.

5. conclusion

in this paper, a new approach to rf mems capacitive switch inverse modeling has been proposed. it is a hybrid approach combining artificial neural networks and a lumped element equivalent circuit model. the inverse approach proposed earlier by the authors aimed only to determine switch dimensions for the given resonant frequency.
the inverse modeling approach proposed in this paper can be used to determine not only the length of the bridge fingered part needed to achieve the given resonant frequency for the given value of the bridge solid part length, but also the elements of the switch equivalent circuit. after the anns composing the model have been developed, determination of the bridge fingered part length and the elements of the equivalent circuit is done straightforwardly without additional optimizations, making the process of inverse modeling very time-efficient. according to the obtained results, the accuracy of the determination of the bridge fingered part length is within the fabrication tolerances. moreover, the s-parameters simulated by using the equivalent circuit elements obtained by this approach match well the s-parameters obtained by full-wave simulations, confirming the accuracy of the equivalent circuit parameter extraction.

acknowledgement: the work was supported by the projects tr-32052 and iii-43102 of the serbian ministry of education, science and technological development.

references

[1] q. j. zhang, k. c. gupta, neural networks for rf and microwave design, artech house, 2000.
[2] m. gad-el-hak, the mems handbook. florida: crc press, 2002.
[3] g. m. rebeiz, rf mems theory, design, and technology. new york: wiley, 2003.
[4] g. m. rebeiz, j. b. muldavin, "rf mems switches and switch circuits", ieee microw. mag., vol. 2, no. 4, pp. 59-71, december 2001.
[5] y. mafinejad, a. z. kouzani, k. mafinezhad, "determining rf mems switch parameter by neural networks", in proceedings of the ieee region 10 conference tencon 2009, 2009, pp. 1-5.
[6] l. michalas, m. koutsoureli, e. papandreou, a. gantis, g. papaioannou, "a mim capacitor study of dielectric charging for rf mems capacitive switches", facta universitatis, series: electronics and energetics, vol. 28, no. 1, pp.
113-122, 2015.
[7] m. koutsoureli, l. michalas, g. papaioannou, "assessment of dielectric charging in micro-electromechanical system capacitive switches", facta universitatis, series: electronics and energetics, vol. 26, no. 3, pp. 239-245, 2013.
[8] a. napieralski, c. maj, m. szermer, p. zajac, w. zabierowski, m. napieralska, ł. starzak, m. zubert, r. kiełbik, p. amrozik, z. ciota, r. ritter, m. kamiński, r. kotas, p. marciniak, b. sakowicz, k. grabowski, w. sankowski, g. jabłoński, d. makowski, a. mielczarek, m. orlikowski, m. jankowski, p. perek, "recent research in vlsi, mems and power devices with practical application to the iter and dream projects", facta universitatis, series: electronics and energetics, vol. 27, no. 4, pp. 561-588, 2014.
[9] i. jokić, m. frantlović, z. đurić, m. dukić, "rf mems/nems resonators for wireless communication systems and adsorption-desorption phase noise", facta universitatis, series: electronics and energetics, vol. 28, no. 3, pp. 345-381, 2015.
[10] j. iannacci and c. tschoban, "rf-mems for future mobile applications: experimental verification of a reconfigurable 8-bit power attenuator up to 110 ghz", journal of micromechanics and microengineering (iop-jmm), vol. 27, no. 4, pp. 1-11, apr. 2017.
[11] j. iannacci, "rf-mems technology as an enabler of 5g: low-loss ohmic switch tested up to 110 ghz", sensors and actuators a, vol. 279, pp. 624-629, 2018.
[12] m. donelli, j. iannacci, "exploitation of rf-mems switches for the design of broadband modulated scattering technique wireless sensors", ieee antennas and wireless propagation letters, vol. 18, no. 1, january 2019.
[13] e. hamad and a. omar, "an improved two-dimensional coupled electrostatic-mechanical model for rf mems switches", j. micromech. microeng., vol. 16, pp. 1424, 2006.
[14] l.
vietzorreck, "em modeling of rf mems", in proceedings of the 7th international conference on thermal, mechanical and multiphysics simulation and experiments in micro-electronics and microsystems, eurosime 2006, como, italy, april 24-26, 2006, pp. 1-4.
[15] z. j. guo, n. e. mcgruer and g. g. adams, "modeling, simulation and measurement of the dynamic performance of an ohmic contact, electrostatically actuated rf mems switch", j. micromech. microeng., vol. 17, pp. 1899-1909, 2007.
[16] j. iannacci, r. gaddi, a. gnudi, "experimental validation of mixed electromechanical and electromagnetic modeling of rf-mems devices within a standard ic simulation environment", journal of microelectromechanical systems, vol. 19, no. 3, pp. 526-537, 2010.
[17] http://www.coventor/mems-solutions/products/mems
[18] t. ćirić, r. dhuri, z. marinković, o. pronić-rančić, v. marković, l. vietzorreck, "neural based lumped element model of capacitive rf mems switches", frequenz, vol. 72, no. 11-12, november 2018.
[19] z. marinković, t. ćirić, t. kim, l. vietzorreck, o. pronić-rančić, m. milijić, v. marković, "ann based inverse modeling of rf mems capacitive switches", in proceedings of the 11th conference on telecommunications in modern satellite, cable and broadcasting services (telsiks 2013), serbia, october 16-19, 2013, pp. 366-369.
[20] z. marinković, v. marković, t. ćirić, l. vietzorreck, o. pronić-rančić, "artificial neural networks in rf mems switch modelling", facta universitatis, series: electronics and energetics, vol. 29, no. 2, pp. 177-191, 2016.
[21] t. ćirić, z. marinković, o. pronić-rančić, v. marković, l. vietzorreck, "ann approach for modeling of mechanical characteristics of rf mems capacitive switches - an overview", microwave review, vol. 23, no. 1, pp. 25-34, june 2017.
[22] z. marinković, t. kim, v. marković, m. milijić, o. pronić-rančić, t. ćirić, l.
vietzorreck, "artificial neural network based design of rf mems capacitive shunt switches", applied computational electromagnetics society (aces) journal, vol. 31, no. 7, pp. 756-764, july 2016.
[23] t. ćirić, z. marinković, t. kim, l. vietzorreck, o. pronić-rančić, m. milijić, v. marković, "ann based inverse electro-mechanical modeling of rf mems capacitive switches", in proceedings of the xlix scientific conference on information, communication and energy systems and technologies (icest 2014), niš, serbia, june 25-27, 2014, vol. 2, pp. 127-130.
[24] z. marinković, a. aleksić, t. ćirić, o. pronić-rančić, v. marković, l. vietzorreck, "inverse electromechanical ann model of rf mems capacitive switches-applicability evaluation", in proceedings of the xlx scientific conference on information, communication and energy systems and technologies (icest 2015), sofia, bulgaria, june 24-26, 2015, pp. 157-160.
[25] t. ćirić, z. marinković, m. milijić, o. pronić-rančić, v. marković, l. vietzorreck, "modeling of actuation voltage of rf mems capacitive switches based on rbf anns", in proceedings of the 13th symposium on neural networks and applications (neurel), belgrade, serbia, november 22-24, 2016, pp. 119-122.
[26] s. dinardo, p. farinelli, f. giacomozzi, g. mannocchi, r. marcelli, b. margesin, p. mezzanotte, v. mulloni, p. russer, r. sorrentino, f. vitulli, l. vietzorreck, "broadband rf-mems based spdt", in proceedings of the european microwave conference 2006, manchester, great britain, september 2006.
[27] f. giacomozzi, v. mulloni, s. colpo, j. iannacci, b. margesin, a. faes, "a flexible fabrication process for rf mems devices", romanian journal of information science and technology (romjist), vol. 14, no. 3, 2011.
[28] d. dubuc, k. grenier, j.
iannacci, "rf-mems for smart communication systems and future 5g applications", in smart sensors and mems - intelligent sensing devices and microsystems for industrial applications, 2nd edition, editors: s. nihtianov, a. luque, chapter 18, elsevier ltd., amsterdam, nl, pp. 499-539, march 2018.
[29] advanced design system 2009, santa rosa, ca: electronic design automation software system produced by keysight eesof eda.

facta universitatis
series: electronics and energetics vol. 29, no. 3, september 2016, pp. i-ii

guest editorial

nowadays, the internet has evolved into a platform that reshapes modern life and removes borders between real, social and cyber worlds. the internet of things (iot) is an emerging paradigm and a cutting edge technology that harnesses a network of embedded, interconnected objects (sensors, actuators, tags or mobile devices) in order to collect various types of information anytime and anywhere. these devices can be used for building different complex smart environments [1], such as smart homes [2][3], smart classrooms [4], smart offices [5], smart factories [6], smart cities [7], intelligent transportation systems [8], smart power grids [9] or smart e-government. furthermore, networks of devices are based on advanced internet standards. iot implies seamless integration of numerous types of devices into the existing internet infrastructure. smart environments can be customized according to users' needs and preferences, which are suitable for automating these environments. internet of things solutions often encompass integration with cloud-based systems and services [7]: infrastructure as a service (iaas), platform as a service (paas) and software as a service (saas).

the main subject of the special issue is the internet of things and its application in business, industry, research and academic community works.
this special issue aims to provide state-of-art and innovative papers on the design, implementation, and usage of intelligent iot and related technologies, such as: cloud computing, big data, pervasive computing, social computing, etc. the primary goal is to provide a variety of research and survey articles in the field of the internet of things and their application in different aspects of human activities. findings and discussion should foster potentials and capabilities of research, academic community, and industry as well. the first invited paper in this issue "design and technologies for implementing a smart educational building: case study" gives a design of an educational smart building at the florida atlantic university. the building was designed as a “living laboratory” so that students and faculty may actually see how iot for smart buildings works. furthermore, it represents a good example for designing and building smart buildings at other universities. the second invited paper "swarm intelligence based reliable and energy balance routing algorithm for wireless sensor network" deals with the aspects of energy efficient routing in wireless sensor networks and proposes a new algorithm for energy-balanced routing. the following two papers "an architectural design for cloud of things" and "from intelligent web of things to social web of things" discuss the architecture aspects of cloud and web for iot, and propose new models and applications. the next paper "smart outlier detection of wireless sensor network" deals with unreliability of data sets collected from wireless sensor networks, and proposes a technique to detect outliers among data collected by geographically distributed sensors. in the paper "a new telerehabilitation system based on internet of things" the authors propose a telerehabilitation system that uses wearable muscle sensor and microsoft kinect to create interactive personalized physical therapy that can be carried out at home. 
then, the authors of the paper "a platform for a smart learning environment" propose a solution for integration of e-learning and iot services within a smart learning environment. the paper "using internet of things in monitoring and management of dams in serbia" presents an example of iot application in dam safety management. authors of the paper "a hadoop-enabled sensor-oriented information system for knowledge discovery about target-of-interest" present a generic sensor-oriented information system based on hadoop ecosystem used for real-time situational awareness about the specific behavior of targets-of-interest. the following two papers "a smart home system based on sensor technology" and "designing an intelligent home media center" present applications of sensor technologies in smart homes. finally, the paper "bridging the snmp gap: simple network monitoring the internet of things" deals with the problem of network management in iot and smart environments. we would like to take the opportunity to thank the authors and reviewers for their endeavor. without their great efforts, this special issue could not have been made. we would also like to thank the editor-in-chief, professor ninoslav stojadinović, for the opportunity to edit this special issue and all his support throughout the editing process. acknowledgement: the editors are thankful to ministry of education, science and technological development, republic of serbia, project number 174031. references [1] l. atzori, a. iera, and g. morabito, "the internet of things: a survey", computer networks, vol. 54, pp. 2787-2805, 2010. [2] d. ding, r. a. cooper, p. f. pasquina, and l. fici-pasquina, "sensor technology for smart homes", maturitas, vol. 69, pp. 131-136, 2011. [3] l. c. desilva, c. morikawa, and i. m. petra, "state of the art of smart homes", engineering applications of artificial intelligence, vol. 25, pp. 1313-1321, 2012. [4] s. s. yau, s. k. s. gupta, e. k. s. gupta, f.
karim, s. i. ahamed, y. wang, and b. wang, "smart classroom: enhancing collaborative learning using pervasive computing technology", in proceedings of the asee 2003 annual conference and exposition, 2003, pp. 13633-13642. [5] c. le gal, j. martin, a. lux, j. l. crowley, "smartoffice: design of an intelligent environment", intelligent systems, vol. 16, no. 4, pp. 60-66, 2005. [6] m. brettel, n. friederichsen, m. keller, and marius rosenberg, "how virtualization, decentralization and network building change the manufacturing landscape: industry 4.0 perspective", international journal of mechanical, aerospace, industrial, mechatronic and manufacturing engineering , vol. 8, no.1, pp. 37-44, 2014. [7] j. jin, j. gubbi, s. marusic, and m. palaniswami, "an information framework for creating a smart city through internet of things", ieee internet of things journal, vol. 1, no. 2, pp. 112-121, 2014. [8] j. barceló, e. codina, j. casas, j. l. ferrer, d. garcía, "microscopic traffic simulation: a tool for the design, analysis and evaluation of intelligent transport systems", journal of intelligent and robotic systems, vol. 41, no. 2, pp. 173-203, 2005. [9] t. samada, s. kiliccote, "smart grid technologies and applications for the industrial sector", computers and chemical engineering, vol. 47, pp. 76-84, 2012. marijana despotović-zrakić university of belgrade, serbia zorica bogdanović university of belgrade, serbia huansheng ning university of science and technology beijing, china božidar radenković university of belgrade, serbia guest editors 10854 facta universitatis series: electronics and energetics vol. 36, no 1, march 2023, pp. 
77-89 https://doi.org/10.2298/fuee2301077j © 2023 by university of niš, serbia | creative commons license: cc by-nc-nd original scientific paper
frequency analysis of the typical impulse voltage and current waveshapes of test generators
vesna javor department of power engineering, faculty of electronic engineering, university of niš, serbia
abstract. frequency analysis of the impulse waveshapes of generators which are commonly used for testing of the equipment in high-voltage engineering is presented in this paper. some of the typical impulse waveshapes, such as 1.2/50 µs/µs, 10/350 µs/µs, 10/700 µs/µs, 10/1000 µs/µs, and 250/2500 µs/µs, are approximated by the double-exponential function (dexp) and by the terms of the multi-peaked analytically extended function (mp-aef). experimental setups for impulse signal generation are based on the desired outputs as given in the iec 60060-1 standard. damped oscillations are characteristic of the standardized 8/20 µs/µs waveshape. the positive part of the normalized sinc function with damped oscillations is also approximated by mp-aef terms. the corresponding frequency spectra of these aperiodic signals are obtained analytically by using piecewise fourier transform (pwft). this paper presents the procedure to obtain fourier transforms of the functions with multiple and sharp peaks typical for the impulse current and voltage test generators’ waveshapes.
key words: fourier transform, high-voltage technique, standard impulse waveshapes, test generators
1. introduction
for the testing of equipment in high-voltage technique the generators have to produce the defined waveshapes as given in the relevant standards, and the testing waveshapes have to be repeatable within tolerances which are also given in these standards. frequency analysis of such waveshapes is important for the study of their effects on the tested equipment.
it is also important for the choice and use of measuring instruments and their characteristics in frequency domain (their frequency response and bandwidth). such analysis is significant for the design of test generators [1].
received june 13, 2022; revised july 22, 2022; accepted august 08, 2022. corresponding author: vesna javor, department of power engineering, faculty of electronic engineering, university of niš, e-mail: vesna.javor@elfak.ni.ac.rs
fourier transform (ft) of aperiodic signals results in continuous spectra over frequency domain, whereas periodic signals have discrete spectra and their amplitudes at each frequency represent the strength of the signal at that frequency. however, impulse voltages and currents testing signals are also called energy signals as they have finite energy in time domain. they are characterized by energy spectral density which is proportional to the square of the signal integrated over the time domain. according to parseval’s relation, the same value is obtained if the square of its fourier transform amplitude is integrated over the frequency domain [2]. if an impulse function is approximated by e.g. double-exponential function (dexp) [3]-[4], it may be further replaced by the terms of multi-peaked analytically extended function (mp-aef) [5], and its fourier transform may be obtained analytically by using piecewise fourier transform (pwft) [6]. there are various applications of exponential functions for approximation of waveshapes and it is proved in this paper that mp-aef may be used equally without introducing any error.
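parseval’s relation mentioned above can be checked numerically on a double-exponential signal. the sketch below is only an illustration under stated assumptions: it takes x(t) = exp(-αt) - exp(-βt) for t ≥ 0 with the α and β values listed for 1.2/50 µs/µs in table 1, uses the elementary closed-form transform X(jω) = 1/(α + jω) - 1/(β + jω) instead of the pwft procedure of the paper, and all function names are hypothetical.

```python
import math

# assumed example values: DEXP parameters of the 1.2/50 us/us waveshape (table 1)
ALPHA = 14732.18   # 1/s
BETA = 2080312.7   # 1/s


def energy_time(alpha: float, beta: float) -> float:
    """Energy of x(t) = exp(-alpha*t) - exp(-beta*t), t >= 0, in closed form."""
    return 1.0 / (2.0 * alpha) - 2.0 / (alpha + beta) + 1.0 / (2.0 * beta)


def energy_frequency(alpha: float, beta: float, n: int = 200_000) -> float:
    """Integrate |X(f)|^2 over all frequencies f with the midpoint rule.

    |X(j*omega)|^2 = (beta - alpha)^2 / ((alpha^2 + omega^2) * (beta^2 + omega^2)).
    The substitution omega = alpha * tan(theta) maps [0, inf) onto [0, pi/2).
    """
    h = (math.pi / 2.0) / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * h
        total += 1.0 / (beta**2 + (alpha * math.tan(theta)) ** 2)
    # integral of |X|^2 over omega from 0 to infinity
    over_omega = (beta - alpha) ** 2 * total * h / alpha
    # both signs of f, and df = d(omega) / (2*pi)
    return 2.0 * over_omega / (2.0 * math.pi)


if __name__ == "__main__":
    print(energy_time(ALPHA, BETA), energy_frequency(ALPHA, BETA))
```

the two printed values should agree to several significant digits, which is what parseval’s relation states for an energy signal.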
the sequence in which the analysis is carried out is the following:
▪ the parameters of dexp are obtained so as to approximate the impulse waveshape by using least squares method (lsqm) [3]-[4], or by using marquardt method for least squares estimation of nonlinear mp-aef parameters (mlsm) [7];
▪ frequency spectrum is determined analytically by using pwft [6] applied to the mp-aef terms;
▪ frequency bandwidth of the signal is determined so as to use these results for further calculations or for the choice of measuring instruments, and possible peaks in the amplitude spectrum are analyzed so as to check the periodicity of the waveshape (if it has damped oscillations).
this procedure is applicable to impulse functions with multiple and sharp peaks which can be well approximated by the terms of mp-aef. some typical impulse waveshapes, such as 1.2/50 µs/µs, 10/350 µs/µs, 10/700 µs/µs, 10/1000 µs/µs, and 250/2500 µs/µs, are analyzed in this paper. the frequencies at which fourier transforms amplitudes of these signals decay to a certain percentage of the amplitude at the frequency f=0 are also given in this paper. this is important for the analysis of induced effects on the equipment under test and for any further calculations in the frequency domain. other waveshapes can be analyzed by the presented procedure, including 8/20 µs/µs waveshape [8] and oscillatory damped waveshapes of test generators, as they are suitable for approximation by the terms of mp-aef, but not suitable for approximation by dexp. to confirm the procedure on such functions, the results are given in this paper for the fourier transform of the sinc function approximated by a few terms of mp-aef.
2. typical impulse voltages and currents of the test generators
impulse waveshapes are usually defined by several parameters, such as the peak value um for the voltage or im for the current at the instant tm, and the two time parameters t1 and t2.
t1 is the front time, which is the time interval between the instant ta at the point a, when the signal is either 30% (fig. 1) or 10% of the peak value, and the instant tb at the point b, when the signal is 90% of the peak value. t2 is the time interval between the instant ta’, when the signal starts at the virtual origin o1, and the instant tk, when the signal is half of the peak value (fig. 2). p1 is the saddle point at t1 in the rising part (fig. 1), and p2 the saddle point at t2 in the decaying part (fig. 2).
fig. 1 rising part of the voltage 1.2/50 µs/µs normalized to the peak value versus time t
fig. 2 decaying part of the voltage 1.2/50 µs/µs normalized to the peak value u(t)/um versus time t
the iec 60060-1:2010 standard [9] gives general definitions and test requirements for the high-voltage test techniques. the standard voltage testing waveshape in high-voltage engineering is defined by t1/t2 = 1.2/50 µs/µs, as given in figs. 1 and 2. if approximated by dexp function, the parameters of some typical waveshapes are calculated using least squares method for parameters estimation in [3], as given in table 1, and in [4] for the switching impulse 250/2500 µs/µs approximated by dexp.
table 1 dexp parameters obtained by using the least squares method [3], [4]
waveshape t1/t2 | η | α (s^-1) | β (s^-1)
1.2/50 µs/µs | 0.95847 | 14732.18 | 2080312.7
10/350 µs/µs | 0.9511 | 2121.76 | 245303.6
10/700 µs/µs | 0.97423 | 1028.39 | 257923.7
10/1000 µs/µs | 0.98135 | 712.41 | 262026.6
250/2500 µs/µs | 0.9057971 | 316.95721 | 16003.329
dexp function approximating impulse voltage is given by
$$u(t) = U\left(e^{-\alpha t} - e^{-\beta t}\right) = \frac{U_m}{\eta}\left(e^{-\alpha t} - e^{-\beta t}\right)$$ (1)
for $U$ the voltage value that multiplied by the peak correction factor $\eta$ results in the peak voltage value $U_m$, whereas $\alpha$ and $\beta$ are the parameters of the dexp function.
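as a quick consistency check of table 1, the peak correction factor can be recomputed from α and β alone. the sketch below assumes only that η equals the maximum of exp(-αt) - exp(-βt), which follows from (1), and locates that maximum by setting the time derivative to zero; the dictionary and function names are hypothetical, and the numbers are copied from table 1.

```python
import math

# rows copied from table 1: (alpha in 1/s, beta in 1/s, tabulated eta)
TABLE_1 = {
    "1.2/50": (14732.18, 2080312.7, 0.95847),
    "10/350": (2121.76, 245303.6, 0.9511),
    "10/700": (1028.39, 257923.7, 0.97423),
    "10/1000": (712.41, 262026.6, 0.98135),
    "250/2500": (316.95721, 16003.329, 0.9057971),
}


def peak_time(alpha: float, beta: float) -> float:
    """Instant at which d/dt [exp(-alpha*t) - exp(-beta*t)] vanishes."""
    return math.log(beta / alpha) / (beta - alpha)


def peak_correction(alpha: float, beta: float) -> float:
    """Maximum of exp(-alpha*t) - exp(-beta*t), i.e. the factor eta in (1)."""
    t_m = peak_time(alpha, beta)
    return math.exp(-alpha * t_m) - math.exp(-beta * t_m)


if __name__ == "__main__":
    for name, (alpha, beta, eta_tab) in TABLE_1.items():
        print(name, peak_time(alpha, beta), peak_correction(alpha, beta), eta_tab)
```

the recomputed factors agree with the tabulated η to about three decimal places for all five waveshapes.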
the peak correction factor is the function of α and β,
$$\eta = e^{-\alpha t_m} - e^{-\beta t_m}$$ (2)
for $t_m$ the time instant of the maximum value $U_m$ and
$$t_m = \frac{1}{\beta - \alpha}\ln\frac{\beta}{\alpha}$$ (3)
the waveshape 10/350 µs/µs is the lightning current waveshape of the first positive stroke as given in the standard iec 62305 [10]. this function is very important because the positive first stroke has the highest specific energy among lightning discharge types. other impulse lightning discharge currents, such as 0.25/100 µs/µs for the subsequent negative stroke and 1/200 µs/µs for the first negative stroke, are not typical for test generators, but they are used for the analysis of induced voltages in electric power systems due to short rising times. the waveshapes of electrostatic discharge (esd) generators are given in [11], and important characteristics of impulse generators are listed in [12]. a simplified scheme of the circuit to produce an esd impulse for the testing of devices is given in fig. 3, for cd the distributed capacitance which exists between the generator and its surroundings, cd + cs of the typical value of 150 pF, rd of the typical value of 330 Ω, rc of the typical value between 50 MΩ and 100 MΩ, as given in [9].
fig. 3 simplified scheme of the circuit to produce esd impulse for testing according to [9]
experimental setups and realization of impulse generators are discussed in [13]-[15]. various realizations of impulse generators for testing in high voltage engineering are given in [16]-[18]. real time conditions are the reason for defining intervals for typical parameters of the waveshapes in relevant standards, so that repeatability of testing and experiments can be achieved and compliance proved with the standardized values.
3. numerical examples
3.1.
multi-peaked analytically extended function (mp-aef) and its fourier transform
the mp-aef function [5] term is given by
$$x(t) = c_1 + (c_2 - c_1)\left[\frac{t - t_b}{t_e - t_b}\exp\left(1 - \frac{t - t_b}{t_e - t_b}\right)\right]^{a} = c_1 + (c_2 - c_1)\left[(d_1 t + d_2)\exp(1 - d_1 t - d_2)\right]^{a}$$ (4)
for the parameter $a$ and coefficients $c_1$, $c_2$, $d_1 = (t_e - t_b)^{-1}$ and $d_2 = -t_b (t_e - t_b)^{-1} = -t_b d_1$. $c_1$ is the function value at the beginning $t_b$ of using approximation term, so that $x(t_b) = c_1$, and $c_2$ is the function value at the end $t_e$ of using approximation term, so that $x(t_e) = c_2$. the dexp function (1) may be replaced by four terms (4) using transformation
$$x(t) = X_m\left[\exp(-\alpha t) - \exp(-\beta t)\right] = X_m\left[(\alpha t + 1)\exp(-\alpha t) - \alpha t \exp(-1)\exp(1 - \alpha t) - (\beta t + 1)\exp(-\beta t) + \beta t \exp(-1)\exp(1 - \beta t)\right]$$ (5)
fourier transform of each term (4) is obtained analytically, for $c_1 \neq 0$, as
$$Y(p) = \frac{c_1}{p} + \frac{(c_2 - c_1)\exp(a + p\,d_2/d_1)}{d_1\,(a + p/d_1)^{a+1}}\,\Gamma[a+1, z_1, z_2]$$ (6)
for the gamma function defined by
$$\Gamma[a+1, z_1, z_2] = \int_{z_1}^{z_2} t^{a}\exp(-t)\,dt$$ (7)
with the arguments $a+1$, $z_1 = (d_1 t_1 + d_2)(a + p/d_1)$, and $z_2 = (d_1 t_2 + d_2)(a + p/d_1)$.
3.2. fourier transform of the impulse voltage 1.2/50 µs/µs waveshape
the approximation of any impulse function with one peak may be also done by just two mp-aef terms, so that the impulse voltage 1.2/50 µs/µs may be approximated by
$$u(t) = \begin{cases} u_1(t) = U_m\left[\dfrac{t}{t_m}\exp\left(1 - \dfrac{t}{t_m}\right)\right]^{a}, & 0 \le t \le t_m \\ u_2(t) = U_m\left[\dfrac{t}{t_m}\exp\left(1 - \dfrac{t}{t_m}\right)\right]^{b}, & t_m \le t \end{cases}$$ (8)
for $a$ and $b$ the parameters of the two mp-aef terms $u_1(t)$ and $u_2(t)$, respectively. the first term is the same as (4) for $c_1 = 0$, $c_2 = U_m$, $t_e = t_m$, $t_b = 0$, $d_1 = t_m^{-1}$ and $d_2 = 0$. this results in the piece-wise function (6) for $t_m = 1.9$ µs of the peak value and parameters a=4 and b=0.03126.
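the two-term form (8) with the quoted parameters is easy to evaluate directly. the following sketch is an illustration only, with hypothetical function names; it normalizes to the peak value and finds, by bisection on the decaying part, the instant at which the waveshape falls to half of the peak. that instant is measured here from t = 0 rather than from the virtual origin used in the standard definition of t2, so it is only a rough check against the nominal 50 µs.

```python
import math

T_M = 1.9e-6   # s, peak instant quoted in the text
A = 4.0        # exponent of the rising term u1(t)
B = 0.03126    # exponent of the decaying term u2(t)


def u_normalized(t: float) -> float:
    """u(t)/U_m from the two-term mp-aef form (8)."""
    if t <= 0.0:
        return 0.0
    s = t / T_M
    base = s * math.exp(1.0 - s)          # equals 1 at t = T_M, below 1 elsewhere
    return base**A if t <= T_M else base**B


def half_value_time(t_hi: float = 200e-6, tol: float = 1e-12) -> float:
    """Bisection on the decaying part for u(t)/U_m = 0.5."""
    lo, hi = T_M, t_hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if u_normalized(mid) > 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print("u(t_m)/U_m =", u_normalized(T_M))
    print("half-value instant [us] =", half_value_time() * 1e6)
```

with a=4 and b=0.03126 the half-value instant comes out close to 50 µs, consistent with the 1.2/50 µs/µs designation.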
rising part of the function u1(t) represents the rising part of u(t) and is given by blue line in fig. 4. decaying part of u2(t) represents the decaying part of u(t) and is given by red line in fig. 4.
fig. 4 impulse voltage waveshape 1.2/50 µs/µs normalized to the peak value versus time t, approximated by the two mp-aef terms u1(t)/um (blue line) and u2(t)/um (red line).
the result for the fourier transform of the function 1.2/50 µs/µs is given in fig. 5. it can be noticed that the amplitude of the fourier transform of that waveshape is approximately constant up to the frequency of 1 kHz. at the frequency f1 ≈ 4 kHz the amplitude is approximately half of the value at low frequencies. at the frequency f2 ≈ 200 kHz the amplitude is approximately 1% of the value at low frequencies.
fig. 5 amplitudes of the fourier transform of the impulse 1.2/50 µs/µs versus frequency f.
3.3. fourier transforms of 10/350 µs/µs, 10/700 µs/µs and 10/1000 µs/µs waveshapes
standard lightning current impulse i(t)/im of the first positive stroke is defined by 10/350 µs/µs and approximated by mp-aef terms. its fourier transform is presented in fig. 6. the amplitudes of the fourier transform are approximately constant up to the frequency of 100 Hz. at f1 ≈ 200 Hz the amplitude is approximately half of the value at low frequencies, whereas at f2 ≈ 10 kHz the amplitude is approximately 1% of the value at low frequencies. fourier transforms of the three typical impulse waveshapes are presented in fig. 7 and denoted by a) 10/350 µs/µs, b) 10/700 µs/µs, and c) 10/1000 µs/µs. these results show that the amplitudes of the fourier transforms are approximately constant up to the frequency of 100 Hz for all the three waveshapes.
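as a side check on the 1.2/50 µs/µs figures quoted in section 3.2, the amplitude spectrum of the dexp approximation can be evaluated in closed form. this is a sketch under assumptions and not the mp-aef/pwft computation used for fig. 5: it takes x(t) = exp(-αt) - exp(-βt) for t ≥ 0 with the table 1 parameters, for which |X(f)|/|X(0)| = αβ / sqrt((α² + ω²)(β² + ω²)) with ω = 2πf, and the function names are hypothetical.

```python
import math

# assumed DEXP parameters of the 1.2/50 us/us waveshape, taken from table 1
ALPHA = 14732.18   # 1/s
BETA = 2080312.7   # 1/s


def relative_amplitude(f: float) -> float:
    """|X(f)| / |X(0)| for x(t) = exp(-ALPHA*t) - exp(-BETA*t), t >= 0."""
    w = 2.0 * math.pi * f
    return ALPHA * BETA / math.sqrt((ALPHA**2 + w**2) * (BETA**2 + w**2))


def frequency_at(level: float, f_lo: float = 1.0, f_hi: float = 1e8) -> float:
    """Bisection on log-frequency for relative_amplitude(f) = level.

    The relative amplitude decreases monotonically with f, so one root exists
    between f_lo and f_hi for 0 < level < 1.
    """
    lo, hi = math.log(f_lo), math.log(f_hi)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if relative_amplitude(math.exp(mid)) > level:
            lo = mid
        else:
            hi = mid
    return math.exp(0.5 * (lo + hi))


if __name__ == "__main__":
    print("half amplitude at [Hz]:", frequency_at(0.5))
    print("1% amplitude at [Hz]:", frequency_at(0.01))
```

under these assumptions the half-amplitude and 1% frequencies land near 4 kHz and 200 kHz, in line with the values read from fig. 5.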
at f1a ≈ 200 Hz, f1b ≈ 300 Hz and f1c ≈ 600 Hz the amplitudes are approximately half of the value at low frequencies, whereas at f2a ≈ 10 kHz, f2b ≈ 15 kHz and f2c ≈ 25 kHz the amplitude is approximately 1% of the value at low frequencies. impulse 10/700 µs/µs presents an open-circuit voltage waveshape, whereas 10/1000 µs/µs may present either open-circuit voltage waveshape or short-circuit current waveshape of the impulse generator, but in this paper all the three waveshapes were analyzed together in fig. 7, so as to notice the influence of the decaying time on the frequency spectrum of the waveshape.
fig. 6 amplitudes of the fourier transform of the impulse 10/350 µs/µs versus frequency f
fig. 7 amplitude spectra of the impulse waveshapes: a) 10/350 µs/µs, b) 10/700 µs/µs, c) 10/1000 µs/µs versus frequency f
results presented in figs. 5, 6 and 7 may be used to estimate the frequency bandwidth of the measuring instruments used in testing of the equipment according to the desired accuracy and relevant standards. for the computation of the induced effects of such signals in frequency domain it is enough to take into account frequencies up to 1 MHz.
3.4. fourier transform of the switching voltage 250/2500 µs/µs waveshape
switching impulse waveshapes are slower in the rising part than the lightning impulse waveshapes, and have longer time duration in total. for the impulse t1/t2 = 250/2500 µs/µs the fourier transform is given in fig. 8. the amplitude of the fourier transform is approximately constant up to the frequency of 10 Hz. at f1 ≈ 90 Hz the amplitude is approximately half of the value at low frequencies, whereas at f2 ≈ 3 kHz the amplitude is about 1% of the value at low frequencies.
fig. 8 amplitudes of the fourier transform of the switching impulse 250/2500 µs/µs versus frequency f
3.5.
fourier transform of the damped oscillations waveshapes
standard 61000-4-12 impulse current, also denoted as ring wave, is given in fig. 9. pk1 denotes the first peak, pk2 the second, pk3 the third, and pk4 the fourth. only pk1 is specified for the current waveform. t1 is the rise time and T the period of oscillations.
fig. 9 standard 61000-4-12 [8] impulse current waveshape
generation of 8/20 µs/µs impulse current assumes damped oscillations with the rise time t1 = 8 µs ± 20% and the decaying time t2 = 20 µs ± 20% for the first peak pk1. the advantage of mp-aef over dexp is that it is suitable to approximate waveshapes with multiple peaks as given in fig. 9. sinc function is also an example of the damped oscillations waveshapes. normalized sinc function is defined, for $t \neq 0$, as
$$\mathrm{sinc}(t) = \frac{\sin(\pi t)}{\pi t}$$ (9)
this function is presented in fig. 10, for $t \in (0, 6]$ s. it is approximated by six terms of mp-aef given by (4), but for $d_{1i} = t_{mi}^{-1}$ and $d_{2i} = -t_{mi}^{-1}\sum_{k=0}^{i-1} t_{mk}$, so that the terms are given by
$$x_i(t) = c_{1i} + (c_{2i} - c_{1i})\left[\frac{t - \sum_{k=0}^{i-1} t_{mk}}{t_{mi}}\exp\left(1 - \frac{t - \sum_{k=0}^{i-1} t_{mk}}{t_{mi}}\right)\right]^{a}$$ (10)
for $t_{m0} = 0$. parameter a=3 for all the terms and other parameters are given in table 2. the complete procedure for obtaining function parameters is presented in [19]. amplitude spectrum i.e. modulus of the fourier transform of this function is presented in fig. 11 for f ∈ [0.01 Hz, 10 MHz], and in fig. 12 for f ∈ [0.1 Hz, 1 Hz], so as to notice the peak at f = 0.5 Hz due to T = 2 s (fig. 10). for the damped oscillations of the impulse current waveshape 8/20 µs/µs with the period about T = 330 µs, the peak in its fourier transform appears at about f = 3 kHz.
fig.
10 sinc function approximated by six mp-aef terms, for t ∈ (0, 6] s
table 2 parameters of the six mp-aef terms
term i | c1i | c2i | tmi
1 | 1 | 0.21722 | 1.431
2 | 0.21722 | 0.12837 | 1.028
3 | 0.12837 | -0.0913 | 1.012
4 | -0.0913 | 0.07091 | 1.006
5 | 0.07091 | -0.05797 | 1.004
6 | -0.05797 | 0.04903 | 1.003
fig. 11 amplitude spectrum of the function from fig. 10 versus frequency f ∈ [0.01 Hz, 10 MHz]
fig. 12 amplitude spectrum of the function from fig. 10 versus frequency f ∈ [0.1 Hz, 1 Hz]
4. conclusion
electrostatic discharge currents and also lightning discharge currents are of impulse waveshapes. induced voltages and currents in electric circuits due to fast changing external electromagnetic fields are of impulse waveshapes. fast transients in electric circuits due to switching operations are of impulse waveshapes. due to all these, the testing generators have to produce such waveshapes [20]-[22] in order to check the equipment according to the standards. fourier transforms of typical impulse waveshapes of test generators in high-voltage technique are obtained by using terms of mp-aef. the procedure is suitable for aperiodic functions because fourier transform of mp-aef terms is obtained analytically by using gamma functions. frequency analysis of the waveshapes 1.2/50 µs/µs, 10/350 µs/µs, 10/700 µs/µs, 10/1000 µs/µs, and 250/2500 µs/µs, shows the necessary bandwidths for frequency domain calculations and also for measurements. the comparison of the three functions with the same rising time 10/350 µs/µs, 10/700 µs/µs, and 10/1000 µs/µs is also given in this paper. the procedure is also applied to the oscillatory damped function in this paper as such waveshapes are important for testing of the equipment in high-voltage technique. in the future research, other oscillatory damped functions such as 8/20 µs/µs, 4/16 µs/µs and 5/320 µs/µs will be analyzed by using terms of mp-aef to approximate the waveshapes, and afterwards pwft is going to be used to obtain their frequency spectra.
acknowledgement: the paper is a part of the research done within the project no. 451-03-68/2022-14 of the faculty of electronic engineering in niš, supported by the ministry of education, science and technological development of the republic of serbia. frequency analysis of the typical impulse voltage and current waveshapes of test generators 89 references [1] i. f. gonos, n. leontides, f. v. topalis, i. a. stathopulos, "analysis and design of an impulse current generator", wseas transactions on circuits and systems, vol. 1, pp. 38-43, january 2002. [2] k. l. kaiser, electromagnetic compatibility handbook, crc press, 2015, chapter 12, pp. 165. [3] d. lovrić, s. vujević, t. modrić, "least squares estimation of double-exponential function parameters," in proceedings of the 11th int. conf. пес 2013, niš, serbia, sept. 2013, pp. 1-4. [4] k. schon, high impulse voltage and current measurement techniques, springer, 2013, chapter 2, pp. 5-9. [5] v. javor, "multi-peaked functions for representation of lightning channel-base currents", in proceedings of the iclp 31st int. conference on lightning protection iclp 2012, sep. 2-7, 2012, ove vienna, austria, sep. 2012. [6] v. javor, "piece-wise fourier transform of aperiodic functions," in proceedings of the 21st int. symp. infoteh-jahorina 2022, march 2022, pp. 1-4. [7] k. lundengård, m. rančić, v. javor, s. silvestrov, "estimation of parameters for the multipeaked aef current functions", in proceedings of the 16th applied stochastic models and data analysis international conference (asmda) and 4th demographics workshop, piraeus, greece, 30 june 4 july 2015, pp. 623-636. [8] iec 61000-4-12 standard, electromagnetic compatibility (emc) – part 4-12: testing and measurement techniques – ring wave immunity test, ed. 3.0, 2017, https://webstore.iec.ch/publication/29872 [9] iec 60060-1:2010 standard, high voltage test techniques part 1: general definitions and test requirements, ed. 
3.0, 2010, https://webstore.iec.ch/publication/300 [10] iec 62305-1 standard, protection against lightning part 1: general principles, ed. 2.0, 2010-2012. [11] emc – part 4-2: testing and measurement techniques – electrostatic discharge immunity test, iec int. standard 61000-4-2, ed. 2, 2008-2012. [12] http://pes-spdc.org/sites/default/files/impulse_generatorsaddedrev3.pdf [13] z. javid, k.-j. li, k. sun, and a. unbree, "cost effective design of high voltage impulse generator and modeling in matlab", j. electr. eng. technol., vol. 13, no. 3, pp. 1346-1354, 2018. [14] v. c. vita, g. p. fotis, and l. ekonomou "parameters’ optimization methods for the electrostatic discharge current equation", int. journal of energy, vol. 11, pp. 1-6, 2017. [15] g. p. fotis, f. e. asimakopoulou, i. f. gonos, and i. a. stathopulos, "applying genetic algorithms for the determination of the parameters of the electrostatic discharge current equation", meas. sci. technol., vol. 17, pp. 2819–2827, 2006. [16] m. abdel-salam, high-voltage engineering: theory and practice, rev. and expanded. crc press, 2018. [17] e. kuffel, w. s. zaengl, j. kuffel, high voltage engineering: fundamentals, newnes, 2000, pp. 48-74. [18] s. mehta, p. basak, k. anelis, a. paramane, "simulation of single and multistage impulse voltage generator using matlab simulink", in proceedings of the 2018 international conference on computing, power and communication technologies (gucon), ieee, 2018, pp. 641-646. [19] k. lundengård, m. rančić, v. javor, s. silvestrov, "electrostatic discharge currents representation using the analytically extended function with p peaks by interpolation on a d-optimal design", facta universitatis, series electronics and energetics, vol. 32, no. 1, pp. 25-49, 2019. [20] y. trotsenko, v. brzhezitsky, o. protsenko, y. haran, "simulation of impulse current generator for testing surge arresters using frequency dependent models", technology audit and production reserves, vol. 1, no. 1 (57), pp. 
25-29, 2021. [21] k. filik, g. karnas, g. masłowski, m. oleksy, r. oliwa, k. bulanda, "testing of conductive carbon fiber reinforced polymer composites using current impulses simulating lightning effects", energies, vol. 14, no. 7899, 2021. [22] p. ghosh, b. chakraborty, s. dalai, s. chatterjee, "simulation and real-time generation of nonstandard lightning impulse voltage waveforms", in proceedings of the 2022 ieee int. conf. on distributed computing and electrical circuits and electronics (icdcece), 2022, pp. 1-5.
facta universitatis series: electronics and energetics vol. 32, no 1, march 2019, pp. 25-49 https://doi.org/10.2298/fuee1901025l karl lundengård1, milica rančić1, vesna javor2, sergei silvestrov1 received december 21, 2018 corresponding author: karl lundengard division of applied mathematics, ukk, mälardalen university, högskoleplan 1, box 883, 721 23 västerås, sweden (e-mail: karl.lundengard@mdh.se) facta universitatis series: electronics and energetics vol. 28, no 4, december 2015, pp.
507-525 doi: 10.2298/fuee1504507s
horizontal current bipolar transistor (hcbt) – a low-cost, high-performance flexible bicmos technology for rf communication applications
tomislav suligoj1, marko koričić1, josip žilak1, hidenori mochizuki2, so-ichi morita2, katsumi shinomura2, hisaya imai2 1university of zagreb, faculty of electrical engineering and computing, department of electronics, micro- and nano-electronics laboratory, croatia 2asahi kasei microdevices co. 5-4960, nobeoka, miyazaki, 882-0031, japan
abstract. in an overview of horizontal current bipolar transistor (hcbt) technology, the state-of-the-art integrated silicon bipolar transistors are described which exhibit ft and fmax of 51 GHz and 61 GHz and ft·bvceo product of 173 GHz·V that are among the highest-performance implanted-base, silicon bipolar transistors. hcbt is integrated with cmos in a considerably lower-cost fabrication sequence as compared to standard vertical-current bipolar transistors with only 2 or 3 additional masks and fewer process steps. due to its specific structure, the charge sharing effect can be employed to increase bvceo without sacrificing ft and fmax. moreover, the electric field can be engineered just by manipulating the lithography masks achieving the high-voltage hcbts with breakdowns up to 36 V integrated in the same process flow with high-speed devices, i.e. at zero additional costs. double-balanced active mixer circuit is designed and fabricated in hcbt technology. the maximum iip3 of 17.7 dBm at mixer current of 9.2 mA and conversion gain of -5 dB are achieved.
key words: bicmos technology, bipolar transistors, horizontal current bipolar transistor, radio frequency integrated circuits, mixer, high-voltage bipolar transistors.
1. introduction
in the highly competitive wireless communication markets, the rf circuits and systems are fabricated in the technologies that are very cost-sensitive.
in order to minimize the fabrication costs, the sub-10 ghz applications can be processed by using the high-volume silicon technologies. it has been identified that the optimum solution might received march 9, 2015 corresponding author: tomislav suligoj university of zagreb, faculty of electrical engineering and computing, department of electronics, microand nano-electronics laboratory, croatia (e-mail: tom@zemris.fer.hr) electrostatic discharge currents representation using the analytically extended function with p peaks by interpolation on a d-optimal design 1division of applied mathematics, ukk, mälardalen university, västerås, sweden 2department of power engineering, faculty of electronic engineering, university of niš, niš, serbia abstract. in this paper the analytically extended function (aef) with p peaks is used for representation of the electrostatic discharge (esd) currents and lightning discharge currents. the tting to data is achieved by interpolation of certain data points. in order to minimize unstable behaviour, the exponents of the aef are chosen from a certain arithmetic sequence and the interpolated points are chosen according to a d-optimal design. the method is illustrated using several examples of currents taken from standards and measurements. key words: analytically extended function, electrostatic discharge (esd) current, lightning discharge current, d-optimal design. 2 k. lundengård, m. rančić, v. javor and s. silvestrov 1 introduction well-defined representation of real electrostatic discharge (esd) currents is needed in order to establish realistic requirements for esd generators used in testing of the equipment and devices, as well as to provide and improve the repeatability of tests. such representations should be able to approximate the esd currents waveshapes for various test levels, test set-ups and procedures, and also for various esd conditions such as approach speeds, types of electrodes, relative arc length, humidity, etc. 
a mathematical function is needed for computer simulation of esd phenomena, for verification of test generators and for improving the standard waveshape definition. functions previously proposed in the literature for modelling of esd currents are mostly linear combinations of exponential functions, gaussian functions, heidler functions or other functions; for a short review see, for example, [1]. the analytically extended function (aef) was initially proposed in [2] and has been successfully applied to lightning discharge modelling [3–13] using nonlinear least-squares curve fitting.

in this paper we analyse the applicability of the aef with p peaks to representation of esd currents by interpolation of data points chosen according to a d-optimal design. this is illustrated through examples from two applications. the first application is modelling of an esd commonly used in electrostatic discharge immunity testing, and the second is modelling of lightning discharges. for the esd immunity testing application we model the iec standard 61000-4-2 waveshape [14,15] and an experimentally measured esd current from [16]. for the lightning discharge application we model the iec 61312-1 standard waveshape [17, 18] and a more complex measured lightning discharge current from [19]. we also use the same method to approximate the measured derivative of a lightning discharge current from [20]. in both applications the basic properties of the current (or current derivative) are the same; these properties, and how they are modelled with the aef, are discussed in the next section.

2 modelling of esd currents using the aef

various mathematical expressions have been introduced in the literature that can be used for representation of the esd currents, either the iec 61000-4-2 standard one [15] or experimentally measured ones, e.g. [21].
these functions are to a certain extent in accordance with the requirements given in table 1. furthermore, they have to satisfy the following:
• the value of the esd current and its first derivative must be equal to zero at the moment t = 0, since neither the transient current nor the radiated field generated by the esd current can change abruptly at that moment.
• the esd current function must be time-integrable in order to allow numerical calculation of the esd radiated fields.

2.1 the analytically extended function (aef) with p peaks

a so-called analytically extended function (aef) with p peaks has been proposed and applied by the authors to lightning discharge current modelling in [9–11]. initial considerations on applying the function to esd currents have also been made in [1,5]. the aef consists of scaled and translated functions of the form $x(\beta; t) = (t e^{1-t})^{\beta}$ that the authors have previously referred to as power-exponential functions [10]. here we define the aef with p peaks as

$$i(t) = \sum_{k=1}^{q-1} i_{m_k} + i_{m_q} \sum_{k=1}^{n_q} \eta_{q,k}\, x_{q,k}(t), \qquad (1)$$

for $t_{m_{q-1}} \le t \le t_{m_q}$, $1 \le q \le p$, and

$$i(t) = \left( \sum_{k=1}^{p} i_{m_k} \right) \sum_{k=1}^{n_{p+1}} \eta_{p+1,k}\, x_{p+1,k}(t), \qquad (2)$$

for $t_{m_p} \le t$. the current value of the first peak is denoted by $i_{m_1}$, the difference between each pair of subsequent peaks by $i_{m_2}, i_{m_3}, \ldots, i_{m_p}$, and their corresponding times by $t_{m_1}, t_{m_2}, \ldots, t_{m_p}$. in each time interval q, with $1 \le q \le p + 1$, the number of terms is given by $n_q$, $0 < n_q \in \mathbb{Z}$. parameters $\eta_{q,k}$ are such that $\eta_{q,k} \in \mathbb{R}$ for $q = 1, 2, \ldots, p + 1$, $k = 1, 2, \ldots, n_q$ and $\sum_{k=1}^{n_q} \eta_{q,k} = 1$. furthermore $x_{q,k}(t)$, $1 \le q \le p + 1$, is given by

$$x_{q,k}(t) = \begin{cases} x\!\left( \beta_{q,k};\ \dfrac{t - t_{m_{q-1}}}{t_{m_q} - t_{m_{q-1}}} \right), & 1 \le q \le p, \\[2ex] x\!\left( \beta_{q,k};\ \dfrac{t}{t_{m_q}} \right), & q = p + 1. \end{cases} \qquad (3)$$
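as a minimal sketch of how equations (1)–(3) can be evaluated, the python function below computes an aef with p peaks. the function and argument names are illustrative and not taken from the paper. two assumptions are made and marked in the comments: the first interval starts at $t_{m_0} = 0$, and the decaying part is scaled by the last peak time $t_{m_p}$, which makes the function continuous at the final peak.

```python
import math


def power_exp(beta, t):
    """Power-exponential function x(beta; t) = (t * e^(1 - t))^beta."""
    return (t * math.exp(1.0 - t)) ** beta


def aef(t, peak_steps, peak_times, etas, betas):
    """Evaluate an AEF with p peaks at time t >= 0.

    peak_steps: [i_m1, ..., i_mp], first peak value, then peak-to-peak differences
    peak_times: [t_m1, ..., t_mp], increasing peak times
    etas, betas: p + 1 lists (one per interval) of weights and exponents;
                 the weights in each list are expected to sum to 1
    """
    p = len(peak_times)
    knots = [0.0] + list(peak_times)  # assumption: t_m0 = 0
    for q in range(1, p + 1):
        if t <= knots[q]:
            # rescale the interval [t_m(q-1), t_mq] to [0, 1], as in equation (3)
            s = (t - knots[q - 1]) / (knots[q] - knots[q - 1])
            shape = sum(e * power_exp(b, s) for e, b in zip(etas[q - 1], betas[q - 1]))
            return sum(peak_steps[: q - 1]) + peak_steps[q - 1] * shape
    # decaying part, equation (2); assumption: time is scaled by the last peak time
    s = t / knots[p]
    shape = sum(e * power_exp(b, s) for e, b in zip(etas[p], betas[p]))
    return sum(peak_steps) * shape


# hypothetical two-peak example: first peak 10 at t = 1, second peak 6 at t = 3
steps, times = [10.0, -4.0], [1.0, 3.0]
etas = [[1.0], [0.5, 0.5], [1.0]]
betas = [[2.0], [2.0, 3.0], [2.0]]
values = [aef(t, steps, times, etas, betas) for t in (0.0, 1.0, 3.0, 6.0)]
```

because every power-exponential term equals 1 when its argument is 1 and the weights in each interval sum to 1, the sketch returns the prescribed peak values at the peak times.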
remark 1. when previously applying the aef, see [9–11], all exponents (β-parameters) of the aef were set to $\beta^2 + 1$ in order to guarantee that the derivative of the aef is continuous. here this condition will be satisfied in a different manner.

since the aef is a linear combination of elementary functions, its derivative and integral can be found using standard methods. explicit formulae are given in [11, theorems 1-3].

previously, the authors have fitted aef functions to lightning discharge currents and esd currents using the marquardt least-squares method, but have noticed that the obtained result varies greatly depending on how the waveforms are sampled. this is problematic, especially since the methodology becomes computationally demanding when applied to large amounts of data. here we examine one way to reduce the amount of data needed while still obtaining as good an approximation as possible. the method examined here is based on d-optimality of a regression model. a d-optimal design is found by choosing sample points such that the determinant of the fisher information matrix of the model is maximized. for a standard linear regression model this is also equivalent, by the so-called kiefer-wolfowitz equivalence criterion, to g-optimality, which means that the maximum of the prediction variance is minimized. these are standard results in the theory of optimal design and a summary can be found, for example, in [22].

minimizing the prediction variance will in our case mean maximizing the robustness of the model. this does not guarantee a good approximation, but it increases the chances of the method working well with limited precision and noisy data, and thus improves the chances of finding a good approximation when one exists.
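to make the d-optimality criterion concrete, the following python sketch compares two candidate designs for a toy model that is not taken from the paper: the straight line y = a + b·x sampled at two points in [0, 1]. for this model the information matrix is $X^{\top} X$ with rows (1, x_i), and its determinant reduces to $n \sum x_i^2 - (\sum x_i)^2$. the function name and the candidate designs are illustrative assumptions.

```python
def information_determinant(points):
    """det(X^T X) for the straight-line model y = a + b*x sampled at `points`."""
    n = len(points)
    sum_x = sum(points)
    sum_xx = sum(x * x for x in points)
    return n * sum_xx - sum_x * sum_x


# two hypothetical candidate designs on [0, 1]
designs = {
    "end points": [0.0, 1.0],
    "interior points": [0.25, 0.75],
}

# the d-optimal candidate is the one with the largest determinant
best = max(designs, key=lambda name: information_determinant(designs[name]))
```

for two points the determinant equals $(x_2 - x_1)^2$, so spreading the points as far apart as possible is d-optimal in this toy case.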
3 d-optimal approximation for exponents given by a class of arithmetic sequences

it can be desirable to minimize the number of points used when constructing the approximation. one way of doing this is choosing the d-optimal sampling points. in this section we will only consider the case where in each interval the n exponents, $\beta_1, \ldots, \beta_n$, are chosen according to

$$\beta_m = \frac{k + m - 1}{c}, \quad m = 1, 2, \ldots, n,$$

where k is a non-negative integer and c a positive real number. note that in order to guarantee continuity of the aef derivative the condition k > c has to be satisfied. in each interval we want an approximation of the form

$$y(t) = \sum_{i=1}^{n} \eta_i\, t^{\beta_i} e^{\beta_i (1 - t)}$$

and by setting $z(t) = (t e^{1-t})^{1/c}$ we obtain

$$y(t) = \sum_{i=1}^{n} \eta_i\, z(t)^{k+i-1}.$$

if we have n sample points, $t_i$, $i = 1, \ldots, n$, then the fisher information matrix, m, of this system is $m = u^{\top} u$ where

$$u = \begin{bmatrix} z(t_1)^{k} & z(t_2)^{k} & \cdots & z(t_n)^{k} \\ z(t_1)^{k+1} & z(t_2)^{k+1} & \cdots & z(t_n)^{k+1} \\ \vdots & \vdots & \ddots & \vdots \\ z(t_1)^{k+n-1} & z(t_2)^{k+n-1} & \cdots & z(t_n)^{k+n-1} \end{bmatrix}.$$
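the matrix u above is a vandermonde matrix whose columns have been scaled by $z(t_j)^k$, so its determinant factorises as $\prod_j z(t_j)^k \prod_{i<j} (z(t_j) - z(t_i))$. the python sketch below builds u for a small hypothetical set of sample times and checks a numerically computed determinant against that product. the sample times, the values of k and c, and all function names are illustrative assumptions.

```python
import math


def z(t, c):
    """z(t) = (t * e^(1 - t))^(1/c)."""
    return (t * math.exp(1.0 - t)) ** (1.0 / c)


def design_matrix(ts, k, c):
    """Matrix u with entries u[row][col] = z(t_col)^(k + row)."""
    return [[z(t, c) ** (k + row) for t in ts] for row in range(len(ts))]


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for j in range(col, n):
                a[r][j] -= factor * a[col][j]
    return det


def closed_form(ts, k, c):
    """Product form: prod_j x_j^k * prod_{i<j} (x_j - x_i), with x_j = z(t_j)."""
    xs = [z(t, c) for t in ts]
    result = 1.0
    for x in xs:
        result *= x ** k
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            result *= xs[j] - xs[i]
    return result


ts, k, c = [0.2, 0.5, 1.0], 2, 1.5  # hypothetical sample times and parameters
det_u = determinant(design_matrix(ts, k, c))
det_m = det_u ** 2  # determinant of the information matrix m = u^T u
```

comparing `det_u` with `closed_form(ts, k, c)` gives a quick consistency check before searching for the sample times that maximise `det_m`.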
thus if we want to maximize $\det(m) = \det(u)^2$ it is sufficient to maximize or minimize the determinant $\det(u)$. set $z(t_i) = (t_i e^{1-t_i})^{1/c} = x_i$; then

$$u_n(t_1, \ldots, t_n) = \det(u) = \left( \prod_{i=1}^{n} x_i^{k} \right) \left( \prod_{1 \le i < j \le n} (x_j - x_i) \right).$$

[…]

3.2 d-optimal interpolation on the decaying part

finding the d-optimal points for the decaying part is more difficult than it is for the rising part. suppose we denote the largest value for time that can reasonably be used (for computational or experimental reasons) with $t_{max}$. this corresponds to some value $x_{max} = (t_{max} \exp(1 - t_{max}))^{1/c}$.
ideally we would want a theorem corresponding to theorem 1 over $[1, x_{max}]^n$ instead of $[0, 1]^n$. it is easy to see that if $x_i = 0$ or $x_i = 1$ for some $1 \le i \le n - 1$ then $w_n(k; x_1, \ldots, x_n) = 0$, and thus there must exist some local extreme point such that $0 < x_1 < x_2 < \ldots < x_{n-1} < 1$. this is no longer guaranteed when considering the volume $[1, x_{max}]^n$ instead. therefore we will instead extend theorem 1 to the volume $[0, x_{max}]^n$ and give an extra constraint on the parameter k that guarantees $1 < x_1 < x_2 < \ldots < x_{n-1} < x_n = x_{max}$.

theorem 2. let $y_1 < y_2 < \ldots < y_{n-1}$ be the roots of the jacobi polynomial $p_{n-1}^{(2k-1,0)}(1 - 2y)$. if k is chosen such that $1 < x_{max} \cdot y_1$ then the determinant $w_n(k; x_1, \ldots, x_n)$ given in theorem 1 is maximized or minimized on the cube $[1, x_{max}]^n$ (where $x_{max} > 1$) when $x_i = x_{max} \cdot y_i$ for $i = 1, \ldots, n-1$ and $x_n = x_{max}$, or some permutation thereof.

proof. this theorem follows from theorem 1 combined with the fact that $w_n(k; x_1, \ldots, x_n)$ is a homogeneous polynomial. since

$$w_n(k; b \cdot x_1, \ldots, b \cdot x_n) = b^{\,nk + \frac{n(n-1)}{2}} \cdot w_n(k; x_1, \ldots, x_n),$$

if $(x_1, \ldots, x_n)$ is an extreme point in $[0, 1]^n$ then $(b \cdot x_1, \ldots, b \cdot x_n)$ is an extreme point in $[0, b]^n$. thus by theorem 1 the points given by $x_i = x_{max} \cdot y_i$ will maximize or minimize $w_n(k; x_1, \ldots, x_n)$ on $[0, x_{max}]^n$.

remark 3. it is in many cases possible to ensure the condition $1 < x_{max} \cdot y_1$ without actually calculating the roots of $p_{n-1}^{(2k-1,0)}(1 - 2y)$. in the literature on orthogonal polynomials there are many expressions for upper and lower bounds of the roots of the jacobi polynomials. for instance, in [25] an upper bound on the largest root of a jacobi polynomial is given and can be, in our case, rewritten as

$$y_1 > 1 - \frac{3}{4k^2 + 2kn + n^2 - k - 2n + 1}$$

and thus

$$1 - \frac{3}{4k^2 + 2kn + n^2 - k - 2n + 1} > \frac{1}{x_{max}}$$

guarantees that $1 < x_{max} \cdot y_1$. if a more precise condition is needed there are expressions that give tighter bounds of the largest root of the jacobi polynomials, see [26].
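the sufficient condition of remark 3 is simple arithmetic, so it can be checked before any polynomial roots are computed. the python sketch below evaluates the bound and the resulting condition; the function names and the trial values of k, n and $x_{max}$ are illustrative assumptions.

```python
def root_lower_bound(k, n):
    """Lower bound on y_1 from remark 3: 1 - 3 / (4k^2 + 2kn + n^2 - k - 2n + 1)."""
    return 1.0 - 3.0 / (4 * k * k + 2 * k * n + n * n - k - 2 * n + 1)


def condition_holds(k, n, x_max):
    """True when the bound already guarantees 1 < x_max * y_1."""
    return root_lower_bound(k, n) > 1.0 / x_max


# hypothetical trial values
print(root_lower_bound(1, 2))       # bound for k = 1, n = 2
print(condition_holds(1, 3, 2.0))   # generous x_max
print(condition_holds(1, 3, 1.2))   # x_max close to 1
```

as a sanity check, for k = 1 and n = 2 the polynomial $p_1^{(1,0)}(1 - 2y)$ has the single root y = 2/3, which indeed lies above the bound 1 − 3/8 = 0.625. when the condition fails, the bound is merely inconclusive and the roots themselves would have to be inspected.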
we can now find the d-optimal t-values using the lower branch of the lambert w function as in equation (5),

$$t_i = -w_{-1}\left( -e^{-1} x_i^{c} \right),$$

where $x_i$ are the roots of the jacobi polynomial given in theorem 1. since $-\infty < w_{-1}(x) \le -1$ for $-e^{-1} \le x < 0$, this will always give $1 \le t_i < t_{max} = -w_{-1}(-e^{-1} x_{max}^{c})$, so $x_{max}$ is given by the highest feasible t.

remark 4. note that here, just like in the rising part, $t_n = t_p$, which means that we will interpolate to the final peak as well as p−1 points in the decaying part.

4 examples of models from applications and experiments

in this section some results of applying the described scheme to two different applications will be presented. the first application is modelling of esd currents commonly used in electrostatic discharge immunity testing, and the second is modelling of lightning discharge currents. the values of n and the peak times have been chosen manually, and k and c have been chosen by first fixing k and then numerically finding a c that gave a close approximation. for this purpose we used software for numerical computing [27], based on the interior reflective newton method described in [28, 29]. this is then repeated for k = 1, . . . , 10 and the best fitting set of parameters is chosen. note that this methodology uses all available data points to evaluate the fit but could probably be simplified further, for example by using a simpler method for choosing c given k, by using only a subset of the available points to assess accuracy or, with sufficient experimentation, by finding a suitable heuristic for choosing the appropriate value of k. since the waveforms are given as data rather than explicit functions, the d-optimal points have been calculated and then the closest available data points have been chosen. in these examples we did not require that the coefficients in the linear sums were positive.
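the lambert w step of section 3.2 can be sketched without a special-function library. if $x = (t e^{1-t})^{1/c}$ then $(-t) e^{-t} = -e^{-1} x^c$, so $-t$ is a lambert w value of $-e^{-1} x^c$, and the lower branch is the one with $t \ge 1$. the python function below finds the same t by bisection, using the fact that $t e^{1-t}$ decreases monotonically for $t \ge 1$. it assumes $0 < x^c \le 1$ so that a real solution exists; the function name and the search bound `t_hi` are illustrative assumptions. a library routine such as `scipy.special.lambertw` with `k=-1` could be used instead.

```python
import math


def t_from_x(x, c, t_hi=1.0e3):
    """Solve (t * e^(1 - t))^(1/c) = x for t >= 1 by bisection.

    Equivalent to t = -W_{-1}(-x^c / e) when 0 < x^c <= 1.
    """
    target = x ** c  # t * e^(1 - t) must equal x^c
    if not 0.0 < target <= 1.0:
        raise ValueError("x**c must lie in (0, 1] for a real solution")

    def f(t):
        return t * math.exp(1.0 - t) - target

    lo, hi = 1.0, t_hi  # f(lo) >= 0 and f decreases on [1, inf)
    if f(hi) > 0.0:
        raise ValueError("solution lies beyond t_hi; increase t_hi")
    for _ in range(200):  # interval halves each step; 200 steps exhaust double precision
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# round trip for a hypothetical point on the decaying part
t0, c = 2.5, 2.0
x0 = (t0 * math.exp(1.0 - t0)) ** (1.0 / c)
t_recovered = t_from_x(x0, c)
```

the round trip at the end recovers the original time from its x-value, which is the operation needed to turn d-optimal x-values into sampling times.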
4.1 modelling of esd currents

4.1.1 the iec 61000-4-2 standard current waveshape

esd generators used in testing of the equipment and devices should be able to reproduce the same esd current waveshape each time. this repeatability feature is ensured if the design is carried out in compliance with the requirements defined in the iec 61000-4-2 standard, [15]. among other relevant issues, the standard includes a graphical representation of the typical esd current, fig. 1, and also defines, for a given test level voltage, required values of the esd current's key parameters. these are listed in table 1 for the case of the contact discharge, where:
• ipeak is the esd current initial peak;
• tr is the rising time defined as the difference between time moments corresponding to 10% and 90% of the current peak ipeak, fig. 1;
• i30 and i60 are the esd current values calculated for time periods of 30 and 60 ns, respectively, starting from the time point corresponding to 10% of ipeak, fig. 1.

fig. 1: illustration of the iec 61000-4-2 standard esd current and its key parameters, [15].

table 1: iec 61000-4-2 standard esd current and its key parameters, [15].
voltage [kv]   ipeak [a]     tr [ns]      i30 [a]       i60 [a]
2              7.5 ± 15%     0.8 ± 25%    4.0 ± 30%     2.0 ± 30%
4              15.0 ± 15%    0.8 ± 25%    8.0 ± 30%     4.0 ± 30%
6              22.5 ± 15%    0.8 ± 25%    12.0 ± 30%    6.0 ± 30%
8              30.0 ± 15%    0.8 ± 25%    16.0 ± 30%    8.0 ± 30%

in this section we present the results of fitting a 2-peak aef to the standard esd current given in iec 61000-4-2. data points which are used in the optimization procedure are manually sampled from the graphically given standard [15] current function, fig. 1. the peak currents and corresponding times are also extracted, and the results of d-optimal interpolation with two peaks are illustrated, see fig. 2. the parameters are listed in table 3. in the illustrated examples a fairly good fit is found, but typically areas with steeply rising and decaying parts are somewhat more difficult to fit with good accuracy than the other parts of the waveform.

4.1.2 3-peaked aef representing measured current from esd

in this section we present the results of fitting a 3-peaked aef to a waveform from experimental measurements from [16].
the result is also compared to a common type of function used for modelling esd current, also from [16]. in figs. 3 and 4 the results of the interpolation of d-optimal points are shown together with the measured data, as well as a sum of two heidler functions that was fitted to the experimental data in [16]. this function is given by

$$i(t) = i_1 \frac{(t/\tau_1)^{n_h}}{1 + (t/\tau_1)^{n_h}}\, e^{-t/\tau_2} + i_2 \frac{(t/\tau_3)^{n_h}}{1 + (t/\tau_3)^{n_h}}\, e^{-t/\tau_4},$$

with i1 = 31.365 a, i2 = 6.854 a, nh = 4.036, τ1 = 1.226 ns, τ2 = 1.359 ns, τ3 = 3.982 ns, τ4 = 28.817 ns. note that this function does not reproduce the second local minimum but that all three aef functions can reproduce all local minima and maxima (to a modest degree of accuracy) when suitable values for the n, k and c parameters are chosen. in fig. 4 we can see that even small bumps in the rising part are successfully reproduced.

4.2 modelling of lightning discharge currents

4.2.1 iec 61312-1 standard current waveshape

in this section we use the scheme to represent the iec 61312-1 standard current waveshape as it is described in [18]. rather than being given graphically, as the iec 61000-4-2 standard current waveform, the shape is described using a heidler function,

$$i(t) = \frac{i_{peak}}{\eta}\, \frac{(t/T)^{n}}{1 + (t/T)^{n}}\, e^{-t/\tau} \qquad (8)$$

whose parameters are chosen according to table 2 (the time constant is written T here to distinguish it from the time variable t).

table 2: iec 61312-1 standard current waveshape and its key parameters, [17].
parameter                          first stroke   subsequent stroke
n                                  10             10
T                                  19.0 µs        0.454 µs
τ                                  485 µs         143 µs
η                                  0.930          0.993
ipeak (protection level i)         200 ka         50 ka
ipeak (protection level ii)        150 ka         37.5 ka
ipeak (protection level iii-iv)    100 ka         25 ka

in figs. 5 and 6 the results of fitting an aef by interpolating on a d-optimal design to the first stroke of a protection level i iec 61312-1 standard waveshape are shown. the parameters of the fitted aef are given in table 5.
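for reference, the heidler function of equation (8) is straightforward to evaluate. the python sketch below uses the first-stroke, protection level i values from table 2, converted to si units on the assumption that the table entries are in microseconds and kiloamperes. the function and variable names, and the sampling grid, are illustrative.

```python
import math


def heidler(t, i_peak, eta, big_t, tau, n):
    """Heidler function of equation (8); `big_t` is the time constant T."""
    ratio = (t / big_t) ** n
    return (i_peak / eta) * ratio / (1.0 + ratio) * math.exp(-t / tau)


# first stroke, protection level i (table 2), in amperes and seconds
FIRST_STROKE_LEVEL_I = dict(i_peak=200e3, eta=0.930, big_t=19.0e-6, tau=485e-6, n=10)

# sample the waveshape every 0.1 microseconds up to 100 microseconds
times = [j * 0.1e-6 for j in range(1, 1001)]
currents = [heidler(t, **FIRST_STROKE_LEVEL_I) for t in times]
peak = max(currents)
```

the correction factor η is what brings the maximum of the sampled waveshape close to the nominal ipeak, so comparing `peak` with 200 ka is a quick check that the parameters were entered in consistent units.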
in this case the waveshape can be reproduced fairly well, but the aef gives a relatively complicated expression compared to (8).

4.2.2 modelling a measured lightning discharge current

in this section we fit an aef function, both with free parameters (as in [6]) and using interpolation on a d-optimal design, to data extracted from [20] that comes from measurements of a lightning strike on mount säntis in switzerland [30].
we used a 13-peaked aef and the results are shown in figs. 7a, 7c and 7e. the curves are often similar enough that it is hard to spot the differences, so the residuals of the two models relative to the measured current are shown in figs. 7b, 7d and 7f. it can be seen that in most cases the aef with free parameters gives a closer fit, but the version interpolated on a d-optimal design is often comparable. parameters for the d-optimal fitting can be found in table 6.

4.2.3 modelling the lightning discharge current derivative

here we present some results of attempting to reproduce the derivative of the waveshape of the lightning discharge current using the aef interpolated on a d-optimal design. we also compare the result of this fitting scheme to the results in [13], where the parameters of the aef are chosen freely and fitted using the marquardt least-squares method (mlsm). the method for fitting an aef described in this paper is applied to the modelling of lightning current derivative signals measured at the cn tower [20]. the results of the fitting can be seen in fig. 8. from this figure it is clear that, in this case of several peaks and few terms in each interval, the two schemes for fitting the aef are often similar in quality, but sometimes the extra flexibility offered by letting all the exponents in the aef be chosen individually can give a significantly better fit. an example of this can be seen in fig. 8. a possible explanation in this case is that the scheme for d-optimal fitting needs many terms in order to have both small and large exponents.

in fig. 9 we examine how well the different fitting schemes model the current when they are integrated. here we can see that the free parameter version matches the numerically integrated measured values considerably better than the d-optimal fitting version.
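the comparison in fig. 9 depends on integrating derivative signals numerically. the text does not state which quadrature rule was used, so the following python sketch assumes a cumulative trapezoidal rule. the helper names and the synthetic cosine signal with a constant offset are illustrative assumptions and are not the measured cn tower data. the sketch only shows how a small, constant error in a fitted derivative can grow once the derivative is integrated.

```python
import math


def cumulative_trapezoid(t, y, initial=0.0):
    """Cumulative trapezoidal integral of samples y taken at times t."""
    if len(t) != len(y):
        raise ValueError("t and y must have the same length")
    total = initial
    out = [total]
    for j in range(1, len(t)):
        # area of one trapezoid between consecutive samples
        total += 0.5 * (y[j] + y[j - 1]) * (t[j] - t[j - 1])
        out.append(total)
    return out


def max_abs_deviation(a, b):
    """Largest pointwise absolute difference between two sampled curves."""
    return max(abs(x - y) for x, y in zip(a, b))


# Synthetic stand-in signals (assumption, not measured data):
# the "reference" derivative is cos(t), whose integral from 0 is sin(t);
# the "model" derivative carries a small constant offset of 0.01.
t = [j * 0.001 for j in range(3001)]
didt_reference = [math.cos(x) for x in t]
didt_model = [math.cos(x) + 0.01 for x in t]

i_reference = cumulative_trapezoid(t, didt_reference)
i_model = cumulative_trapezoid(t, didt_model)

derivative_error = max_abs_deviation(didt_model, didt_reference)
integrated_error = max_abs_deviation(i_model, i_reference)
print(derivative_error, integrated_error)
```

in this toy case the offset in the derivative accumulates linearly with time after integration, which is one simple way a fit that looks acceptable for the derivative can look worse for the integrated current.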
fig. 2: 2-peaked aef representing the iec 61000-4-2 standard esd current waveshape for 4 kV. parameters are given in table 3. (plot of i(t) versus t [s]; curves: iec 61000-4-2, 2-peaked aef; markers: peaks, interpolated points.)

table 3: parameters' values of aef with 2 peaks representing the iec 61000-4-2 standard waveshape.

local maxima and minima and corresponding times extracted from the iec 61000-4-2, [15]
imax1 = 15 [A]         imin1 = 7.1484 [A]     imax2 = 9.0921 [A]
tmax1 = 6.89 [ns]      tmin1 = 12.85 [ns]     tmax2 = 25.54 [ns]

parameters of interpolated aef shown in fig. 2
interval               n    k     c
0 ≤ t ≤ tmax1          3    1     0.01385
tmax1 ≤ t ≤ tmax2      3    4     2.025
tmax2 < t              5    10    2.395

fig. 3: 3-peaked aef interpolated to d-optimal points chosen from measured esd current from [16, fig. 3] compared with the sum of two heidler functions suggested in [16]. parameters are given in table 4. (plot of i(t) versus t [s]; curves: measured data, 3-peaked aef, two heidler function; markers: peaks, interpolated points.)

table 4: parameters' values of aef with 3 peaks representing measured esd.

local maxima and corresponding times extracted from [16, fig. 3]
imax1 = 7.37 [A]       imax2 = 5.02 [A]       imax3 = 3.82 [A]
tmax1 = 1.23 [ns]      tmax2 = 6.39 [ns]      tmax3 = 15.5 [ns]

parameters of interpolated aef shown in fig. 3
interval               n    k    c
0 ≤ t ≤ tmax1          5    5    0.05750
tmax1 ≤ t ≤ tmax2      3    1    0.4920
tmax2 ≤ t ≤ tmax3      4    2    0.5967
tmax3 < t              6    1    1.019

fig. 4: close-up of the rising part of a 3-peaked aef interpolated to d-optimal points chosen from measured esd current from [16, fig. 3]. parameters are given in table 4. (plot of i(t) versus t [s]; curves: measured data, 3-peaked aef, two heidler function; markers: peaks, interpolated points.)

table 5: parameters' values of aef representing the iec 61312-1 standard waveshape.

chosen peak time and current
tmax = 28.14 [µs]      i = 92.54 [kA]

parameters of interpolated aef shown in fig. 5
interval               n    k     c
0 ≤ t ≤ tmax           4    10    0.7565
tmax < t               5    1     41.82

fig. 5: aef with 1 peak fitted by interpolating d-optimal points sampled from the heidler function describing the iec 61312-1 waveshape given by (8). parameters are given in table 5. (plot of i(t) versus t [s]; curves: iec 61312-1, d-optimal aef; markers: peak, interpolated sample points.)

fig. 6: close-up of the rising part of the aef with 1 peak fitted by interpolating d-optimal points sampled from the heidler function describing the iec 61312-1 waveshape given by (8). parameters are given in table 5. (plot of i(t) versus t [s]; curves: iec 61312-1, d-optimal aef; markers: peak, interpolated points.)
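the two heidler-type reference curves that appear in figs. 3 to 6 can be evaluated directly from their closed forms. the following is a minimal python sketch, not the authors' code. the function names `heidler`, `iec_61312_1_first_stroke` and `two_heidler_esd` are hypothetical, the numerical values are copied from table 2 (first stroke) and from the parameters quoted from [16], and a correction factor of 1 is assumed for the two esd terms because none appears in that expression.

```python
import math


def heidler(t, i0, eta, tau_rise, tau_decay, n):
    """One Heidler term: (i0/eta) * x/(1+x) * exp(-t/tau_decay), x = (t/tau_rise)**n.

    Times are in seconds, current in the unit of i0. The term is taken to be
    zero for t <= 0 (assumption: the discharge starts at t = 0).
    """
    if t <= 0.0:
        return 0.0
    x = (t / tau_rise) ** n
    return (i0 / eta) * x / (1.0 + x) * math.exp(-t / tau_decay)


def iec_61312_1_first_stroke(t, i_peak=200e3):
    """Equation (8) with the first-stroke column of table 2.

    Default i_peak = 200 kA corresponds to protection level I; result in amperes.
    """
    return heidler(t, i_peak, 0.930, 19.0e-6, 485e-6, 10)


def two_heidler_esd(t):
    """Sum of two Heidler terms with the parameter values quoted from [16].

    eta = 1 is an assumption, since the quoted expression has no correction factor.
    """
    n_h = 4.036
    return (heidler(t, 31.365, 1.0, 1.226e-9, 1.359e-9, n_h)
            + heidler(t, 6.854, 1.0, 3.982e-9, 28.817e-9, n_h))


if __name__ == "__main__":
    # Sample the lightning waveshape every 0.1 µs up to 200 µs and report the maximum.
    samples = [iec_61312_1_first_stroke(j * 1e-7) for j in range(1, 2001)]
    print("approximate first-stroke peak [A]:", max(samples))
    # Evaluate the ESD expression at the first measured peak time from table 4.
    print("two-heidler esd current at 1.23 ns [A]:", two_heidler_esd(1.23e-9))
```

sampling such a function on a grid is also a simple way to obtain the points that an interpolation scheme would be applied to, although the choice of points in the paper follows a d-optimal design rather than a uniform grid.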
fig. 7: comparison of two aefs with 13 peaks and 2 terms in each linear combination fitted to measured lightning discharge current derivative from [19]. one is fitted by interpolation on d-optimal points and the other is fitted with free parameters using the mlsm method. parameters of the d-optimal version are given in table 6. (six panels, each plotting i(t) versus t [s]; panels (a), (c) and (e) show measured data, d-optimal aef, free parameter aef, peaks and interpolated sample points.)
(a) comparison of measured data and aef functions from t = −0.3437 µs to t = 888.1 µs.
(b) residuals when comparing the fitted function to the measured data from t = −0.3437 µs to t = 888.1 µs.
(c) comparison of measured data and aef functions from t = −0.3437 µs to t = 9.280 µs.
(d) residuals when comparing the fitted function to the measured data from t = −0.3437 µs to t = 9.280 µs.
(e) comparison of measured data and aef functions from t = −0.3437 µs to t = 5.116 µs.
(f) residuals when comparing the fitted function to the measured data from t = −0.3437 µs to t = 5.116 µs.

table 6: parameters' values of aef with 13 peaks representing measured data for a lightning discharge current from [30]. local maxima and corresponding times extracted from [19, figs. 6, 7 and 8] are denoted t and i and other parameters correspond to the fitted aef shown in figs. 7a, 7c and 7e.

peak times and currents            parameters of fitted aef
t [µs]           i [µs]            interval           n    k     c
t1 = 0.3998      i1 = 8.159        0 ≤ t ≤ t1         2    2     0.4773
t2 = 0.9468      i2 = 10.96        t1 ≤ t ≤ t2        2    10    2.148
t3 = 1.458       i3 = 11.14        t2 ≤ t ≤ t3        2    1     0.3964
t4 = 1.873       i4 = 10.26        t3 ≤ t ≤ t4        2    1     0.2210
t5 = 2.475       i5 = 10.07        t4 ≤ t ≤ t5        2    10    1.695
t6 = 2.904       i6 = 9.819        t5 ≤ t ≤ t6        2    1     0.4591
t7 = 3.533       i7 = 8.519        t6 ≤ t ≤ t7        2    1     0.3503
t8 = 3.985       i8 = 9.097        t7 ≤ t ≤ t8        2    10    3.716
t9 = 5.036       i9 = 8.485        t8 ≤ t ≤ t9        2    1     0.6963
t10 = 6.168      i10 = 8.310       t9 ≤ t ≤ t10       2    1     0.2954
t11 = 8.472      i11 = 8.413       t10 ≤ t ≤ t11      2    6     3.074
t12 = 20.48      i12 = 8.576       t11 ≤ t ≤ t12      2    1     0.2784
t13 = 137.5      i13 = 4.178       t12 ≤ t ≤ t13      2    1     0.6456
                                   t13 < t            4    1     0.3559

fig. 8: comparison of two aefs with 12 peaks and 2 terms in each linear combination fitted to measured lightning discharge current derivative from [20]. one is fitted by interpolation on d-optimal points and one is fitted with free parameters using the mlsm method. parameters are given in table 7. (plot of di/dt [kA/s] versus t [s]; curves: measured data, d-optimal 12-peaked aef, free parameter aef; markers: peaks, interpolated sample points.)

fig. 9: comparison of results of integrating the results shown in fig. 8. (plot of i [kA] versus t [s]; curves: measured data, d-optimal aef, free parameter aef.)

table 7: parameters' values of aef with 12 peaks representing measured data for a lightning discharge current derivative from [20]. chosen peak times are denoted t and i and other parameters correspond to the fitted aef shown in fig. 8.
peak times and currents            parameters of fitted aef
t [µs]            i [µs]           interval           n    k     c
t0 = −0.3437      i0 = 0           t0 ≤ t ≤ t1        2    10    0.06099
t1 = 0.9468       i1 = 36.65       t1 ≤ t ≤ t2        2    1     0.4506
t2 = 0.5001       i2 = −2.208      t2 ≤ t ≤ t3        3    1     0.04772
t3 = 0.9215       i3 = 6.89        t3 ≤ t ≤ t4        2    1     0.4502
t4 = 1.212        i4 = −7.322      t4 ≤ t ≤ t5        3    1     0.2590
t5 = 1.714        i5 = 3.402       t5 ≤ t ≤ t6        3    2     0.9067
t6 = 2.103        i6 = 1.319       t6 ≤ t ≤ t7        3    1     0.3333
t7 = 2.730        i7 = −1.844      t7 ≤ t ≤ t8        3    1     0.03732
t8 = 3.416        i8 = 16.08       t8 ≤ t ≤ t9        2    4     3.3793
t9 = 4.005        i9 = −5.787      t9 ≤ t ≤ t10       2    1     1.4912
t10 = 4.216       i10 = −0.1268    t10 ≤ t ≤ t11      2    2     0.09448
t11 = 4.875       i11 = 1.972      t11 ≤ t ≤ t12      2    6     2.288
t12 = 5.538       i12 = 1.683      t13 < t            3    1     0.001705

5 conclusion

in this work we examine a mathematical model for representation of various esd currents or their derivative and apply it to some realistic cases, either taken from standards, see sections 4.1.1 and 4.2.1, or measured data, see sections 4.1.2, 4.2.2 and 4.2.3. the model is based around the multi-peaked analytically extended function (aef), see section 2.1, which has been proposed and successfully applied to lightning current modelling in [6,9–11]. it matches common requirements of esd-type currents, such as the requirement that the function and its first derivative be equal to zero at the starting time. furthermore, the aef function is time-integrable, [11], which is necessary for numerical calculation of radiated fields originating from the esd current. we construct the model by restricting the exponents in the aef to an arithmetic sequence and then interpolating points of the function we wish to approximate, chosen according to a d-optimal design.
this makes the modelling less flexible than the case where all exponents can be chosen freely but gives a scheme for fitting the function that scales better to many data points than the mlsm fitting scheme used in [6,9–11]. the resulting methodology can give fairly accurate results even with a modest number of interpolated points but strategies for choosing some of the involved parameters should be further investigated. the decaying part of the waveforms are consistently difficult to fit and if the models are used in a context where significant error propagation appears a more flexible approach can be desirable. acknowledgments the authors would like to thank dr. pavlos katsivelis from the high voltage laboratory, school of electrical and computer engineering, national technical university of athens, greece, for providing measured esd current data. references [1] k. lundeng̊ard, m. rančić, v. javor, and s. silvestrov, ”multi-peaked analytically extended function representing electrostatic discharge (esd) currents,” in aip conference proceedings icnpaa, la rochelle, france, pp. 1–10, 2016. 46 k. lundengård, m. rancic, v. javor s. silvestrov electrostatic discharge currents representation using the analytically extended function... 47 22 k. lundengård, m. rančić, v. javor and s. silvestrov 5 conclusion in this work we examine a mathematical model for representation of various esd currents or their derivative and apply it to some realistic cases, either taken from standards, see section 4.1.1 and 4.2.1, or measured data, see sections 4.1.2, 4.2.2 and 4.2.3. the model is based around the multi-peaked analytically extended function (aef), see section 2.1, has been proposed and successfully applied to lightning current modelling in [6,9–11]. it matches common requirements of esd-type currents, such as stating that the function and its first derivative must be equal to zero at the starting time. 
furthermore, the aef function is time-integrable, [11], which is necessary for numerical calculation of radiated fields originating from the esd current. we construct the model by restricting the exponents in the aef to an arithmetic sequence and then interpolate points of the function we wish to approximate chosen according to a d-optimal design. this makes the modelling less flexible than the case where all exponents can be chosen freely but gives a scheme for fitting the function that scales better to many data points than the mlsm fitting scheme used in [6,9–11]. the resulting methodology can give fairly accurate results even with a modest number of interpolated points but strategies for choosing some of the involved parameters should be further investigated. the decaying part of the waveforms are consistently difficult to fit and if the models are used in a context where significant error propagation appears a more flexible approach can be desirable. acknowledgments the authors would like to thank dr. pavlos katsivelis from the high voltage laboratory, school of electrical and computer engineering, national technical university of athens, greece, for providing measured esd current data. references [1] k. lundeng̊ard, m. rančić, v. javor, and s. silvestrov, ”multi-peaked analytically extended function representing electrostatic discharge (esd) currents,” in aip conference proceedings icnpaa, la rochelle, france, pp. 1–10, 2016. electrostatic discharge currents representation using the analytically extended function...23 [2] v. javor, ”multi-peaked functions for representation of lightning channelbase currents,” 31st int. conference on lightning protection iclp 2012, september 2-7, 2012, proceedings of papers, vienna, austria, 2012. [3] v. javor, ”new function for representing iec 61000-4-2 standard electrostatic discharge current,” facta universitatis, series: electronics and energetics, vol. 27(4), pp. 509–520, 2014. [4] v. javor, k. lundeng̊ard, m. 
rančić, and s. silvestrov, ”measured electrostatic discharge currents modeling and simulation,” in proc. of telsiks 2015, nîs, serbia, pp. 209–212, 2015. [5] v. javor, ”new function for representing iec 61000-4-2 standard electrostatic discharge current,” (invited paper), facta universitatis, series electronics and energetics, vol. 27(4), pp. 509–520, university of ni, serbia, 2014. [6] v. javor, k. lundeng̊ard, m. rancic, s. silvestrov, ”analytical representation of measured lightning currents and its application to electromagnetic field estimation,” ieee transactions on electromagnetic compatibility, vol. 60(5), pp. 1415–1426, 2018. [7] k. lundeng̊ard, m. rančić, v. javor, and s. silvestrov, ”application of the marquardt least-squares method to the estimation of pulse function parameters,” in aip conference proceedings 1637, icnpaa, narvik, norway, 2014, pp. 637–646. [8] k. lundeng̊ard, m. rančić, v. javor, and s. silvestrov, ”estimation of pulse function parameters for approximating measured lightning currents using the marquardt least-squares method,” in conference proceedings, emc europe, gothenburg, sweden, 2014, pp. 571–576. [9] k. lundeng̊ard, m. rančić, v. javor, and s. silvestrov, ”an examination of the multi-peaked analytically extended function for approximation of lightning channel-base currents,” in proceedings of full papers, pes 2015, nǐs, serbia, electronic, arxiv:1604.06517 [physics.comp-ph], 2015. [10] k. lundeng̊ard, m. rančić, v. javor, and s. silvestrov, ”application of the multi-peaked analytically extended function to representation of some measured lightning currents,” serbian journal of electrical engineering, vol. 13(2), pp. 1–11, 2016. [11] k. lundeng̊ard, m. rančić, v. javor, and s. silvestrov, ”estimation of parameters for the multi-peaked aef current functions,” methodol. comp. appl. probab., pp. 1–15, 2016. [12] k. lundeng̊ard, m. rančić, v. javor, and s. 
facta universitatis series: electronics and energetics vol. 31, no 3, september 2018, pp. 343-366 https://doi.org/10.2298/fuee1803343s

a comparative study of reliability for finfet

saleh shaheen, gady golan, moshe azoulay, joseph bernstein
faculty of engineering, dept. of electrical engineering, ariel university, ariel, israel

abstract. the continuous downscaling of cmos technologies over the last few decades has resulted in higher integrated circuit (ic) density and performance. the emergence of finfet technology has brought with it the same reliability issues as standard cmos, with the addition of a new prominent degradation mechanism. the mechanisms known from previous cmos devices still exist, including bias temperature instability (bti), hot carrier degradation (hcd), electro-migration (em), and body effects.
a new and equally important reliability issue for finfet is self-heating, which is a crucial complication since the thermal time-constant is generally much longer than the transistor switching times. finfet technology is the newest technological paradigm to have emerged in the past decade, as downscaling reached beyond 20 nm, which also happens to be the estimated mean free path of electrons in silicon at room temperature. as such, the reliability physics of finfet was modified in order to fit the newly developed transistor technology. this paper highlights the roles and impacts of these various effects and aging mechanisms on finfet transistors compared to planar transistors, based on the physics-of-failure approach, with the aim of fitting a comprehensive aging model.

key words: finfet, reliability, cmos finfet, bti, hcd, electromigration, aging

received february 8, 2018
corresponding author: gady golan, faculty of engineering, dept. of electrical engineering, ariel university, ariel 40700, israel (e-mail: gadygolan@gmail.com)

1. introduction

1.1. cmos finfet transistors

a fin field-effect transistor (finfet) is a tri-gate transistor built on a substrate where the gate is placed on three sides of the channel or wrapped around the channel, forming a multi-gate structure. these devices have been given the generic name "finfets" because the source/drain region forms fins on the silicon surface. finfet devices exhibit significantly faster switching times and higher current density than mainstream planar transistor technology. the term finfet was coined in 2001 by the university of california, berkeley [1]. finfet devices are 3d transistors, where the current flows through a thin fin wrapped by a metal gate. in this structure, channel inversion is created in the three walls of the fin, which increases the gate control over the channel and reduces short channel effects [2]. finfets present lower threshold voltage variations than planar transistors due to their undoped or lightly doped channel, which reduces the impact of random dopant fluctuations (rdf). on the other hand, thinner fins also increase the series resistance of the source/drain region due to their small cross-sectional area, which limits the driving current of the finfet. this reduction in current affects the reliability, as discussed in the following sections.

1.2. reliability issues in deep sub-micron technologies

as previously mentioned, the semiconductor industry has witnessed remarkable growth and achievements in ic manufacturing through significant scaling of transistor dimensions. such scaling has not only made ics more compact and dense, but has also enhanced their performance without an increase in power consumption, as long as the chip area was kept constant. although gordon moore's 1965 vision that the complexity of an ic would approximately double every two years seemed to be a dream at that time, it has held true for over four decades. this resulted in the production of more complex circuits [3]. in 2018, there are more than a billion transistors on a single processor die. this high integration density has to be accompanied by serious efforts to increase ic reliability, since the failure of several transistors in a circuit can lead to a complete failure of the whole system. despite claims in the semiconductor industry in 1998 of hitting a "red brick" wall at the 100 nm technology node [4], leading-edge research and development is currently working towards transistors smaller than the 10 nm technology node [5].
with the continuous downscaling of device dimensions, variations in transistor parameters are increasing drastically and lead to unexpected reliability issues [6, 7]. these issues are essentially classified into "time-zero" variability issues [8], such as line-edge roughness (ler), random dopant fluctuations (rdfs), metal gate granularity and body thickness variation, which cause intra-die variations during the manufacturing process, and "time-dependent" variability issues, which are considered a major source of performance degradation of scaled devices over their lifetime, such as negative bias temperature instability (nbti) [9], hot carrier injection (hci) [10], time-dependent dielectric breakdown (tddb) [11], electro-migration, self-heating and body effects. these degradation mechanisms are caused by the formation of charged traps within the gate oxide layer due to the high electric field and temperature, which leads to a change in the device parameters (e.g. threshold voltage, carrier mobility, drain current) over time, depending on the operating conditions and the workload over the lifetime. these issues therefore degrade the reliability of scaled devices and may eventually lead to ic failure when the variations reach a certain limit. the impact of variability on scaled devices is evident in advanced technologies, where variation can reach up to 50% of vth [12], which strongly affects sram functionality and poses a major challenge for sram design. reliability of digital integrated circuits has become one of the critical challenges in deep sub-micron (dsm) semiconductor technologies. researchers are now studying these reliability issues at various levels, such as design, process, transistor, and circuit.
they also argue that these time-dependent mechanisms can best be described in terms of an ensemble of individual defects and their time-, voltage-, and temperature-dependent properties, which can be modeled and inserted into a circuit simulation, thus enabling reliability awareness at design time [13], as will be discussed in the following paragraphs. in fact, the simulation and analysis of aging effects at higher design levels are inherently difficult, since the degradation rate depends on operating conditions and workloads over the lifetime. these factors are often unknown during the design of a circuit; since a change in the workload applied to a circuit leads to a different amount of performance degradation, they pose serious challenges to the design of digital integrated circuits. in order to improve design predictability and support robust design, it is necessary to develop techniques that can efficiently predict the aging effects in existing and future technologies. in this paper, our objective is to provide a detailed and accurate study for predicting aging effects in finfet compared to planar transistors, due to several of the above-mentioned time-dependent phenomena: the nbti mechanism, hot carriers, self-heating, body effects and, finally, electromigration. the rest of this paper is organized as follows: section 2 gives the background and physical concepts behind nbti, hot carriers, self-heating and electro-migration. the mechanisms are explained, together with the ways of modeling them at the transistor and gate levels. a new multiple-temperature operational life (mtol) test model [14] is discussed as well. section 3 presents a comprehensive analysis of the main differences between finfet and planar transistors from the reliability perspective and presents a detailed reliability and aging analysis of finfet transistors.
section 4 presents a comparative discussion, putting it all together. finally, section 5 summarizes the work that has been done and proposes ideas for improving the reliability of future finfet devices.

2. physical mechanisms

2.1. multiple-temperature operational life and fit

reliability device simulators have become an integral part of the vlsi design process. these simulators successfully model the most significant physical failure mechanisms in modern electronic devices, such as negative bias temperature instability (nbti), electro-migration (em) and hot carrier injection (hci). these mechanisms are modeled throughout the circuit design process, so that the system will operate for a minimum expected useful life. modern chips are composed of tens or hundreds of millions of transistors. hence, chip-level reliability prediction methods are mostly statistical. today's chip-level reliability prediction tools model the failure probability of the chips at the end of life, when the known wearout mechanisms are expected to dominate. however, modern prediction tools do not predict the random, post burn-in failure rate that would be seen in the field [14-17]. chip and packaged system reliability is still measured in failure units, also known as failures in time (fit). the fit is a measure of the failure rate, λ, under a constant-rate (poisson) model. this model is time-independent, and the failure rate in fit is defined as the number of expected device failures per billion part hours. a fit is assigned to each component and multiplied by the number of devices in a system to approximate the expected system reliability. the semiconductor industry provides an expected fit for every product, based on operation within the specified conditions of voltage, frequency, heat dissipation and more.
hence, a system reliability model is a prediction of the expected mean time between failures (mtbf) for an entire system, obtained as the inverse of the sum of the fit rates of all components. a fit is defined in terms of an acceleration factor, af, as seen in equation 1 below:

$$\mathrm{FIT} = \frac{\#\mathrm{failures}}{\#\mathrm{tested} \times \mathrm{hours} \times AF} \times 10^{9} \qquad (1)$$

where #failures is the number of actual failures that occurred and #tested is the total number of units subjected to an accelerated test of duration hours. the acceleration factor, af, has to be provided by the manufacturer, since only they know the failure mechanisms that are being accelerated in the final high temperature operating life (htol) test. this factor is generally based on a company-proprietary variant of the mil-hdbk-217 approach for accelerated life testing. the real task of reliability modeling, therefore, is to choose an appropriate value for af based on the physics of the dominant failure mechanisms that would occur in the field for the device. the key innovation of the multiple-temperature operational life (mtol) method is its success in separating different failure mechanisms in devices in such a way that actual reliability predictions can be made for any user-defined operating conditions. this methodology stands in contrast to the common approach for assessing device reliability today, htol testing, which is based on the assumption that just one dominant failure mechanism is acting on the device. however, it is known that multiple failure mechanisms act on the device simultaneously. the mtol approach deals with this issue by predicting the reliability of electronic components from the combined fit of multiple failure mechanisms. degradation curves are generated for components exposed to accelerated testing at several different temperatures and core stress voltages.
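as an aside, the fit bookkeeping described above can be sketched in a few lines of python. this is only an illustration under stated assumptions: the function names are hypothetical, the numbers are made up, the fit expression is the commonly quoted form (failures divided by tested units, test hours and af, scaled to 10^9 device-hours), and the mtbf relation assumes a constant failure rate.

```python
# Illustrative sketch only: names and numbers are assumptions, not from the paper.

def fit_from_accelerated_test(failures, tested, hours, af):
    """FIT = expected failures per 1e9 device-hours at use conditions."""
    if tested <= 0 or hours <= 0 or af <= 0:
        raise ValueError("tested, hours and af must be positive")
    # Accelerated test hours are scaled by AF to equivalent use-condition hours.
    equivalent_device_hours = tested * hours * af
    return failures / equivalent_device_hours * 1e9


def system_fit(components):
    """Sum FIT over (fit_per_device, device_count) pairs."""
    return sum(fit * count for fit, count in components)


def mtbf_hours(fit):
    """MTBF in hours for a constant failure rate expressed in FIT."""
    return 1e9 / fit


if __name__ == "__main__":
    chip_fit = fit_from_accelerated_test(failures=1, tested=1000, hours=1000, af=100)
    board_fit = system_fit([(chip_fit, 2), (5.0, 4)])
    print(chip_fit, board_fit, mtbf_hours(board_fit))
```

the sketch treats af as an input, in line with the remark that the acceleration factor has to come from the manufacturer or from a physics-based model.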
recently published data [18] clearly reveal that different failure mechanisms act on the components in different regimes of operation, causing different mechanisms to dominate depending on the stress and the particular technology. a linear matrix solution allows the failure rates of the separate mechanisms to be combined linearly to calculate the actual reliability of the system, measured in fit, based on the physics of degradation at specific operating conditions. in this paper, we present the most significant physical failure mechanisms in modern electronic devices, such as negative bias temperature instability (nbti), electro-migration (em) and hot carrier injection (hci). we do not present an mtol analysis here, since this paper covers the theoretical aspects of physical reliability; an mtol analysis of finfet transistors may be carried out in the near future.

2.2. the physical mechanism behind bti

bias temperature instability (bti) is a time-dependent degradation mechanism. it has been known since 1966 [7], and a model for understanding its effects was first proposed in 1977. bti has emerged as a key reliability concern due to its increasingly negative impact on the performance of modern electronic devices. bti effects worsen as a transistor ages and lead to severe shifts of important transistor parameters. therefore, understanding the impact of bti degradation is of primary importance for current and near-future cmos technologies. the continuous mosfet miniaturization trends (i.e., aggressive oxide thickness scaling) have resulted in higher oxide fields and temperatures [19]. consequently, more charge traps are generated in the gate oxide. these traps capture some of the charge carriers responsible for the current flow between source and drain. this charge loss results in a narrower transistor channel.
this means that less current can flow through the device and, consequently, the device performance degrades. these effects manifest at the circuit level as increased circuit delays and, in turn, circuit timing errors. in order to restore the drain current to its pre-degradation level, a higher voltage bias needs to be applied to the gate; in other words, a higher voltage is needed before the transistor begins to conduct. this means that the threshold voltage (vth) increases significantly over time. the threshold voltage shift (∆vth) is accelerated by elevated temperature or supply voltage. it is a direct measure of the device degradation and is widely used in the literature to evaluate bti impact [20, 21]. the bti mechanism, as illustrated in figure 1, occurs in two phases. first, the transistor is in the stress phase when the voltage vgs is applied to the gate over a period of time. during this phase, charged traps are generated in the gate oxide layer and the transistor threshold voltage increases (degrades). second, when the stress voltage is removed, the transistor is in the recovery phase. during the recovery phase, trapped charges are released and the threshold voltage partially recovers towards its pre-stress level. when the input is dynamic, the transistor alternates between stress and recovery phases.

fig. 1 (a) threshold voltage shift due to bti aging, (b) two phases of bti.

the transistor does not fully recover; thus the amount of bti degradation depends on the stress history, which is reflected in the duty factor and determined by the device stress and relaxation periods.
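a minimal python sketch of the duty factor follows. it assumes that the duty factor is the fraction of time the device spends in the stress phase, which is our reading of the text rather than a definition given in it; the function names are hypothetical.

```python
# Illustrative sketch only: the duty-factor definition is an assumption.

def duty_factor(stress_time_s, recovery_time_s):
    """Fraction of one stress/recovery cycle spent under stress."""
    total = stress_time_s + recovery_time_s
    if total <= 0:
        raise ValueError("total cycle time must be positive")
    return stress_time_s / total


def duty_factor_from_samples(stressed_flags):
    """Fraction of equally spaced samples in which the device is stressed."""
    flags = list(stressed_flags)
    if not flags:
        raise ValueError("need at least one sample")
    return sum(1 for flag in flags if flag) / len(flags)


if __name__ == "__main__":
    # A clock-like signal alternates evenly between stress and recovery.
    print(duty_factor_from_samples([True, False] * 4))
    # A device stressed three quarters of the time is unbalanced.
    print(duty_factor(3.0, 1.0))
```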
devices in arithmetic units and memory circuits tend to present an unbalanced duty factor, while devices in clock circuitry are an example of a 50% duty factor. the impact of bti is observed in both nmos and pmos transistors, which are susceptible to positive bias temperature instability (pbti) and negative bias temperature instability (nbti), respectively. hard et al. [22] have analyzed the impact of bti in four scenarios: 1. nmos under negative gate bias, 2. nmos under positive bias, 3. pmos under negative bias and 4. pmos under positive bias. the study has clearly shown that pmos devices are more susceptible to bti, regardless of negative or positive bias. it also shows that pmos under negative bias presents the largest threshold voltage shift. this is unfortunate, since in digital circuits pmos devices are negatively biased. this is the reason why bti is often referred to as nbti, which is attributed to threshold voltage shifts in pmos devices only. therefore, most published studies [23, 24] focus only on the impact of nbti on circuit reliability. in the last decade, accurate modeling of bti has become a major concern for industry. several approaches have been proposed to understand the origin of this phenomenon and predict its impact, and there is good agreement that bti is caused by the generation of traps in the oxide layer when a bias is applied to the gate. in addition, the bti degradation impact can be directly measured as a shift in threshold voltage. the acceleration factor (af) of nbti is given in equation 2 [18]:

$$AF_{NBTI} = \exp\left(\gamma_{NBTI}\,(V_{stress} - V_{use})\right) \cdot \exp\left[\frac{E_a}{k}\left(\frac{1}{T_{use}} - \frac{1}{T_{stress}}\right)\right] \qquad (2)$$

where:
$T_{use}$: use temperature in kelvin
$T_{stress}$: life test stress temperature in kelvin
$k = 8.617 \times 10^{-5}$ ev/k: boltzmann's constant
$E_a$: activation energy
$V_{use}$: use voltage
$V_{stress}$: life test stress voltage
$\gamma_{NBTI}$: voltage acceleration factor for nbti

the next paragraph gives an overview of the models developed for the nbti mechanism at the transistor level and the reasons behind the choice of the model used to carry out the research. moreover, bti impacts at the gate level are also presented.

2.3. the physical mechanism behind hcd

a physical understanding of hcd and the respective models are briefly introduced. for semiconductors in thermal equilibrium, electrons and holes continually absorb and emit acoustic phonons (low-frequency lattice vibrations), resulting in an average energy gain of zero. such electrons have kinetic energies (e) that are normally slightly higher than the conduction band edge (ec) by an amount ktr (tr is room temperature). similarly, for holes, e is slightly less than the valence band edge (ev) by ktr. at low electric fields, the carrier mobility is field-independent and ktr is only 0.025 ev, which is small compared to the carrier energies corresponding to ec and ev. however, if the electric field is very high (for example, 100 kv/cm), the carriers gain more energy than they lose by scattering. such accelerated electrons have energies of ec + kte, where te is an effective temperature such that kte > ktr. with effective temperatures (~ec/kt) of tens of thousands of kelvin, these electrons are at the very top of the fermi distribution and are known as hot electrons. while a mosfet is in active operation, if the gate voltage is comparable to or lower than vds, the inversion layer is much stronger on the source side than on the drain side, and the voltage drop due to the channel current is concentrated on the drain side (if vd > vs).
the field near the drain can be so high that carriers gain enough energy between two scattering events to become hot carriers. the majority of these hot carriers simply continue toward the drain, but a small number of them gain enough energy to generate electrons and holes by impact ionization. in the n-channel mosfet (n-mosfet), the vast majority of the generated holes are collected by the substrate and give rise to the substrate current (isub), while the generated electrons enhance the drain current (id). photon emission might also occur during the generation of hot carriers at the drain. hot carriers with sufficient energy (calculated to be approximately 3.2 ev for electrons and 4.7 ev for holes [25]) can surmount the energy barrier at the si-sio2 interface and be injected into the oxide, giving rise to a small gate current (ig). some energetic injected carriers might break si-h or similar weak bonds in the oxide or at the si-sio2 interface. if the hot carrier injection lasts long enough, the trapped charge or generated defects will permanently modify the electric field at the si-sio2 interface, and hence the electrical characteristics of the mosfet.

hot carrier injection mechanisms

according to takeda [26], there are three main types of hot carrier injection modes:
1. channel hot electron (che) injection.
2. drain avalanche hot carrier (dahc) injection.
3. secondary generated hot electron (sghe) injection.

che injection is due to the escape of "lucky" electrons from the channel, causing a significant degradation of the oxide and the si-sio2 interface, especially at low temperature (77 k) [27]. on the other hand, dahc injection results in both electron and hole gate currents due to impact ionization, giving rise to the most severe degradation around room temperature.
sghe injection is due to minority carriers from secondary impact ionization or, more likely, bremsstrahlung radiation, and becomes a problem in ultra-small metal oxide semiconductor (mos) devices. fowler–nordheim tunneling and direct tunneling might also cause hot carrier injection. for deep sub-micrometer devices, it is important to account for the effects resulting from combinations of some or all of these injection processes.

channel hot electron (che) injection: che injection occurs when the gate voltage (vg) is comparable to the drain voltage (vd) (in an n-mosfet). the gate current (ig) rises as vg initially increases, reaches a peak when vg is roughly equal to the drain-source potential vd, and drops thereafter. there are two reasons for the increase in ig. first, the inversion charge in the channel increases, so that more electrons are present for injection into the oxide. second, the stronger influence of the vertical electric field in the oxide prevents electrons in the oxide from detrapping and drifting back into the channel. it has been reported [28] that if an n-channel mosfet is operating at vg = vd, the conditions are optimum for che injection of "lucky electrons". such electrons gain sufficient energy to surmount the si-sio2 barrier without suffering an energy-losing collision in the channel. in many cases, this gate current is responsible for device degradation as a result of carrier trapping. no gate current can be measured for vg to improve transistor performance [46], but can affect the interface quality. additionally, top corners can enhance local electric fields [47] and the vertical sidewalls require special integration schemes to ensure appropriate gate [48] and junction formation. perhaps most critically, the fin architecture limits heat dissipation from the channel, which can lead to increased local temperatures [49].
bti in tri-gate

despite the crystal orientation and corners, multiple authors have shown that bti in tri-gate devices is matched to that of planar devices [50-52]. for example, pmos nbti is matched to planar for both degradation and recovery. this indicates that the same basic modeling formulation can be extended from planar devices and applied to tri-gate. given a potential dependence on fin height and width, the bti variation could be expected to be influenced by the vertical tri-gate architecture. however, data from intel's 22nm technology indicate that the variability characteristics are matched between planar and tri-gate [50]. for example, the exponential distribution characteristic parameter for the average vth shift per trap is matched, which indicates that the behavior in tri-gate devices is identical to that of planar.

hot carrier in tri-gate

hot carrier degradation in tri-gate devices is known to be complicated by dependences on tri-gate features such as fin width, field profiles, and junction formation [53, 54]. further, the tri-gate architecture itself can increase degradation due to the increased likelihood that the gate will capture energetic channel carriers. as with planar devices, the damage is expected to be localized near the drain end of the channel, where fields are the highest. tri-gate devices can enhance this localization due to the improved control over short channel effects, which effectively means that the gate better controls the potential in the channel, leaving more of the drain-source voltage drop at the drain end of the channel. the effect of this localization is reflected in the degradation characteristics, where the ids degradation is measured at different applied biases after hot carrier stress. during this monitoring, a constant high gate bias is applied and the ids degradation is measured at different source or drain biases.
as the monitor drain voltage increases from low to high, the measured degradation decreases. this can be interpreted as a reduction of interaction between the channel electrons and gate oxide damage at the drain end of the channel as the depletion region increases. conversely, as the source voltage is increased, the measured degradation increases, which indicates that a greater fraction of the channel is contributing to the measured degradation as the source-side depletion region increases. as a result, the degradation impact on a circuit will depend on the bias conditions during operation. therefore, the aging formula needs to be extended to include a dependence on the playback bias:

fig. 2 the measured degradation after nmos hc stress is observed to be a strong function of the applied bias during playback. as the drain voltage increases, the measured degradation decreases. conversely, as the source voltage increases, the measured degradation increases. both effects are due to damage localization at the drain end of the channel. the data come from intel's 14nm tri-gate technology, and the applied gate voltage is constant for all biases.

a further complication of hot carrier degradation is the tendency for "minority carriers" to be injected into the gate, causing an increase in drive current rather than degradation. for example, a pmos device in a stress condition of high drain voltage but low gate voltage will have fields favoring the injection of electrons into the gate. this can cause a |vt| reduction and a drive current increase. a recent example of this behavior in tri-gate devices is shown in fig. 2, where a pmos device was stressed with drain-high and gate-low, and the drive current is seen to increase after stress [54]. as a consequence of this mechanism, a general aging model will need to include a term for this minority carrier injection, as indicated in equation 14.
here, the mechanisms are assumed to be independent and therefore additive, though the actual physical picture is likely far more complicated.

$$\%I_d = f_{majority}(\text{stress}, \text{ec}, \text{playback}) + f_{minority}(\,\cdot\,) \qquad (14)$$

the discrete trap behavior from hot carrier degradation is expected to have a similar influence on variation as bti. as such, the variation can be modeled with the same variation formulation as bti [55]. however, as discussed in ref. [53], hot carrier degradation depends on many features of the tri-gate device, so the variation magnitude may be affected as well. recent evaluations suggest that this may be the case, though additional research is needed in this area [54].

self-heat in tri-gate

one of the most important aging-related factors in the tri-gate device is the channel temperature during operation. as discussed in ref. [47], the channel temperature may be increased in tri-gate devices due to the reduced heat conduction to the substrate, as shown in fig. 3. the channel temperature is known to depend on the size and layout of the transistors. as a result, the total channel temperature must be calculated as the ambient temperature plus a self-heat function of both power dissipation and transistor layout.

fig. 3 tri-gate devices have a constricted path for heat conduction from the channel to the substrate. as a result, the local temperature in the channel can increase above that of planar devices. [49]

a critical complication from self-heating is that the thermal time-constant is generally much longer than transistor switching times in modern technologies [54]. as a result, the channel will heat up during a transistor switching event (the only time a transistor in digital operation dissipates significant power) and this generated self-heat will then persist long after the switching event itself.
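the persistence of self-heat after a switching event can be illustrated with a first-order thermal rc model. this lumped model is a simplification introduced here for illustration and is not taken from the paper; the function name, the thermal resistance, the time constant and the power waveform are all hypothetical values.

```python
# Illustrative sketch only: a lumped first-order thermal model with assumed
# parameter values, not the self-heat model used in the paper.
import math


def simulate_channel_temperature(power_w, dt_s, r_th_k_per_w, tau_s,
                                 t_ambient_k=300.0):
    """Channel temperature after each step of a piecewise-constant power waveform."""
    decay = math.exp(-dt_s / tau_s)
    rise_k = 0.0  # temperature rise above ambient
    temps = []
    for p in power_w:
        # Exact solution of d(rise)/dt = (p * r_th - rise) / tau over one step.
        rise_k = rise_k * decay + p * r_th_k_per_w * (1.0 - decay)
        temps.append(t_ambient_k + rise_k)
    return temps


if __name__ == "__main__":
    # 10 ns switching burst at 1 mW followed by 10 ns of idle time,
    # with a 100 ns thermal time constant (all values assumed).
    waveform = [1e-3] * 10 + [0.0] * 10
    temps = simulate_channel_temperature(waveform, dt_s=1e-9,
                                         r_th_k_per_w=1e4, tau_s=1e-7)
    print(temps[9], temps[-1])  # still above ambient after the burst ends
```

because the assumed time constant is ten times longer than the burst, the channel has barely begun to cool by the end of the idle period, so the temperature at any instant depends on the preceding activity.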
as a result, the temperature at any given time will depend on the activity history of the device, and the channel temperature needs to be expressed as equation 15 below:

$$T = f(\text{power}, \text{layout}, \text{activity}) \qquad (15)$$

this is of critical importance since it violates the quasi-static approximation on which the entire aging model is based. for a repeating waveform that switches with some regularity, an average temperature can be obtained for the entire waveform and a net aging for the waveform can be calculated. however, if the waveform is irregular, then the quasi-static approximation breaks down, and the aging will need to be evaluated explicitly at each point in the waveform.

a further modeling challenge from the channel self-heat effect arises from the empirical calibration standpoint. during standard dc hot carrier stress on a device, the channel temperature can increase far beyond conditions found during normal switching operation due to the total power dissipated. as a result, the extraction of the temperature dependence for the hot carrier model must take into account both the ambient and the self-heat temperature. furthermore, a particular concern is that the temperature dependence deviates from expected behavior at very high channel temperatures, which can lead to overly optimistic predictions of aging behavior at regular use conditions [55].

body effects in tri-gate

one of the advantages of tri-gate devices is the excellent gate control over the channel, which renders the device less susceptible to body-bias influences. for example, figures 4 and 5 show the sensitivity of nbti in pmos [50] and hc in nmos [56] to body bias, which only affects the aging at very high body voltages.
while the behavior observed is well beyond normal operating voltages, there are some stacked configurations where devices are placed on high voltage supplies, and therefore the body bias effect may need to be considered. in these cases, the aging model formulation will need to be extended to include a dependence on the body bias during stress. to accurately include this body bias effect, the underlying physics needs to be understood. the body bias can influence the aging either by modulating existing mechanisms through altered channel fields or by introducing a completely new mechanism such as substrate hot carrier injection. depending on the underlying mechanism, the body bias will need to be added to the model either as a completely separate additive component or as a modulation of the existing terms. it is beyond the scope of this work to detail the exact physics involved with these effects in tri-gate devices, and for the sake of notational simplicity in the equations shown here, the body bias is assumed to simply modulate existing mechanisms.

fig. 4 pmos bti in tri-gate devices is generally insensitive to body bias until biases are applied far outside normal operating voltages. [48]

fig. 5 nmos hc in tri-gate devices is relatively immune to body-bias effects, but above a high voltage threshold, the degradation can increase dramatically. [55]

a feature unique to tri-gate is the possibility of traps forming in the sub-fin oxide, especially after high body-bias stress. depending on polarity, these traps can either form a parasitic conduction path or pinch off sub-fin leakage. for simplicity, these effects can be included in the aging model through either the body bias aging term or the minority carrier injection term. however, the physics is likely more complicated, and a more precise model will need to include explicit terms and interactions.
electromigration in tri-gate

the overall em challenges due to technology scaling come from the widening gap, in current density limits for metal lines, between the design needs and the technology capability. the current limit needed by the circuit/chip design increases rapidly from technology node to node, while the metallization process struggles to maintain a constant current carrying capability for the metal lines without invoking major innovations. the major driving forces for technology scaling include enhancing the chip performance and reducing the cost per device. technology scaling includes physical scaling, material scaling, electrical scaling and new integration schemes.

physical scaling refers to dimensional shrink [57]. one obvious benefit of physical scaling is to allow smaller devices and denser designs (more devices per chip area). this is essential to packing in more functions and, more importantly, to obtaining more chips per wafer. technology scaling has a direct undesirable consequence, which is the increase of the overall process cost. to counter this processing cost increase, packing more functions per chip and producing more chips per wafer allows the cost per function and the cost per chip to keep decreasing, even though the cost per wafer may increase from technology node to node.

materials scaling refers to using materials which are more efficient or preferable for performance enhancement. one example from the front end of line (feol) is the replacement of sio2-based material with materials having a higher dielectric constant (k), such as hfo2-based material, for the gate dielectric. in the back end of line (beol), the latency from the interconnect rc effect has become a major contributor to the overall performance degradation. to lower the conductor resistance, al-based metallization has been replaced with cu-based metallization (higher electrical conductivity) since the 180nm technology node.
materials with lower k values have been introduced as inter- and intrametal dielectrics (ild) since the 130nm technology node to alleviate the interconnect capacitance effect.

electrical scaling refers to the reduction of operating voltage (vdd) and power. lowering the power at chip level has become a major goal for advanced applications. lower vdd can directly result in lower power, lower electric field, and lower current, which has major benefits for time dependent dielectric breakdown (tddb) and em reliability. however, due to device leakage concerns and the drive for faster speed (performance), vdd has not been scaled as fast as the dimensions.

integration scaling refers to new integration schemes or innovations to enhance performance and pack in more devices. this includes two very different integration aspects: (1) the interconnect fabrication/processing integration innovation driven by the scaling; and (2) chip or system level integration. the trend for chip and system level integration scaling is growing from 2-d to 3-d schemes. examples include adopting finfets in feol [58], and chip stacking with through-silicon vias (tsv) [59] for beol and packaging. however, most of these scaling aspects do not favor technology reliability; they bring various new reliability challenges.

from the circuit/chip design side, to keep the performance scaling following the so-called moore's law, on one hand, the circuits and chips need to be smaller in size, that is, aggressively shrinking in dimensions both horizontally and vertically. on the other hand, the operating voltage scales at a much slower rate. the current density flowing through a metal line may be computed from equation 16 below:

$$j = \frac{C \, V_{dd} \, f \, p}{w \, h} \qquad (16)$$

where: c stands for capacitance, w and h are the metal line width and height, vdd is the supply voltage to the devices, f is the clock frequency and p is the device switching factor. generally, w and h each scale by a factor of 0.7, so the current-conducting cross sectional area (w * h) reduces by about 50% for each technology node. for cu interconnects, from the 180nm node to the 10nm node, the cu cross sectional area of a minimum width metal line has reduced from 0.03 µm² to 0.0015 µm², a 95% reduction. on the other hand, the operating voltage (vdd) has only been scaled down from 1.8 v to 0.9 v, merely a 50% reduction. the net effect is a 10-fold increase in the vdd/(w h) ratio. scaling for performance also drives higher and higher clock frequency f and switching factor p. all these factors indicate that a higher and higher j is needed for circuit and chip design.

in addition to the physical scaling (dimensional shrink), the impact of material scaling on em reliability can also be significant. there are two aspects of this impact: the direct impact from the new material properties, and the indirect impact from the process integration changes needed to accommodate the new material properties. replacing al with cu gave a significant boost to the interconnect em capability [60], due to cu's higher melting point (1083 °c vs 660 °c for al) and higher em activation energy (0.9 ev for cu vs 0.8 ev for al). however, the subsequent aggressive dimensional shrink from technology scaling has led to rapid em performance degradation for cu interconnects. this is because of the decrease in the critical void volume needed to cause em failure and the increase in the cu drift velocity. the em failure time may be expressed as a function of the critical void size and the cu drift velocity [61, 62], as in equation 17:

$$t_{\text{fail}} = \frac{\Delta L_{\text{critical}}}{v_d} \qquad (17)$$

another important factor for increasing the interconnect current carrying capability is to take advantage of the short length benefit.
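the scaling figures quoted above for the 180nm and 10nm nodes can be checked with a few lines of arithmetic. the sketch below assumes that equation 16 has the form j = c·vdd·f·p/(w·h), which matches the symbol definitions given in the text; the function name `current_density` and the constant names are hypothetical, and c, f and p are held fixed purely to isolate the vdd/(w·h) term.

```python
def current_density(c, vdd, f, p, w, h):
    """Assumed form of equation 16: j = C * Vdd * f * p / (w * h)."""
    return c * vdd * f * p / (w * h)


# values quoted in the text for a minimum-width cu line
AREA_180NM_UM2 = 0.03     # cross sectional area at the 180nm node, in um^2
AREA_10NM_UM2 = 0.0015    # cross sectional area at the 10nm node, in um^2
VDD_180NM = 1.8           # supply voltage at the 180nm node, in volts
VDD_10NM = 0.9            # supply voltage at the 10nm node, in volts

area_reduction = 1.0 - AREA_10NM_UM2 / AREA_180NM_UM2
vdd_reduction = 1.0 - VDD_10NM / VDD_180NM
ratio_gain = (VDD_10NM / AREA_10NM_UM2) / (VDD_180NM / AREA_180NM_UM2)

# with c, f and p held at 1.0 (an assumption), j scales like vdd / (w * h);
# the area is passed as w with h = 1.0 since only the product matters here
j_180nm = current_density(1.0, VDD_180NM, 1.0, 1.0, AREA_180NM_UM2, 1.0)
j_10nm = current_density(1.0, VDD_10NM, 1.0, 1.0, AREA_10NM_UM2, 1.0)

print(f"area reduction:        {area_reduction:.0%}")
print(f"vdd reduction:         {vdd_reduction:.0%}")
print(f"vdd/(w*h) ratio gain:  {ratio_gain:.1f}x")
print(f"j gain at fixed c,f,p: {j_10nm / j_180nm:.1f}x")
```

any growth in f or p from node to node multiplies this factor further, which is why the required j rises faster than the area shrink alone suggests.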
from equation 18 below:

$$v_d = \frac{D}{kT}\left(Z^{*} e \rho j - \Omega \frac{\Delta\sigma}{\Delta L}\right) \qquad (18)$$

as the em process proceeds, a higher and higher backflow stress gradient (Ω∆σ/∆L) will build up at the anode and slow down the cu drift rate, v_d. when this backflow stress gradient becomes sufficiently high, it can completely balance the driving force (Z*eρj) and bring the net cu mass flow to zero. at this steady state, equation 19 can be written as:

$$(jL)_c = \frac{\Omega \, \Delta\sigma}{Z^{*} e \rho} \qquad (19)$$

(jL)c is called the threshold jL product [63]. in theory, if the jL in the interconnect is below the threshold product (jL)c, the interconnect should not suffer em damage. when the jL product is somewhat higher than (jL)c, em damage can occur, but with a longer time to fail, based on the following modified black equation 20 [64]:

$$MTTF = A \, (j - j_c)^{-n} \exp\!\left(\frac{E_a}{kT}\right) \qquad (20)$$

max stress( 50stress lifetime ) * ( use stress )+

where: mttf is the median time to fail (i.e. t50), a is a geometry and material related constant, and jc is the threshold current density. all the mentioned equations serve as the basis for the fact that shorter lines can have higher maximum allowed current densities. taking advantage of this feature, i.e. making short interconnections, has proven to be powerful in circuit design for solving some of the em challenges.

another advantage of utilizing the short length benefit for em is its low temperature sensitivity. if some devices are known to have high power, high frequency and high activity factors, the local interconnect temperature has the potential to be much higher than the nominal junction temperature due to joule heating. as shown in equation 20, the maximum allowed current density decreases exponentially with the interconnect temperature for the regular em process. a severe jmax de-rating may be needed for such circuits to account for the local temperature rise. using wider metal lines, which only increases the current linearly with line width, may not provide sufficient relief to compensate for the jmax de-rating.
under such circumstances, taking advantage of the short length benefit becomes essential to overcome the local high temperature issues. since (jL)c has very low sensitivity to temperature [65], as long as the jL is sufficiently below (jL)c, the em reliability will not decrease much with local joule heating.

breaking a long interconnect into short segments to take advantage of the short length benefit may not always be feasible due to spacing limitations and resistance sensitivity. to allow the backflow stress to build, physical barriers at both the cathode and the anode of the interconnect are required. there are alternative ways to establish pseudo-barriers that enable local backflow stress build-up and produce some short length effects [66, 67]. one example is to have blocking islands on top of a metal line, or to use multiple levels of metal lines with periodic vias or bar vias connecting them [68]. those blocks on the cu surface may not create a complete physical barrier for cu diffusion. however, since they have a direct metal-to-metal interface in the blocking islands, the cu diffusion along those local interface areas will be much slower, and some degree of backflow stress gradient will build up to create partial short length effects.

one of the major challenges for short length em benefit applications is the line length definition [66]. actual circuits are often more complicated than a simple metal line segment with vias at each end. they often have fingers, branches/wings, passive reservoirs, passing vias, dropdowns and width transitions. in such cases, determining the line length and (jL)c to be used for the short length benefit calculations becomes very challenging, not only for the eda tools but also for the design engineers. appropriate engineering judgment is often needed to resolve these issues. due to variability control and liner thinning, lower (jL)c values are expected for future technologies.
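the short length benefit discussed above can be sketched numerically. the example assumes the commonly used modified black form mttf = a·(j − jc)^(−n)·exp(ea/kt) with jc = (jL)c/L; the function names, the prefactor a, the exponent n and the threshold value (jL)c = 3000 a/cm are hypothetical placeholders, and only the 0.9 ev activation energy for cu is taken from the text.

```python
import math

K_B_EV = 8.617333262e-5  # boltzmann constant in eV/K


def is_below_threshold(j, length, jl_threshold):
    """True when the jL product is below the threshold product (jL)c."""
    return j * length < jl_threshold


def mttf_modified_black(j, length, jl_threshold, a, n, e_a, temp_k):
    """Assumed modified black form: MTTF = A * (j - jc)^(-n) * exp(Ea / kT).

    jc is taken as (jL)c / L; at or below jc the model predicts no em failure.
    """
    j_c = jl_threshold / length
    if j <= j_c:
        return math.inf
    return a * (j - j_c) ** (-n) * math.exp(e_a / (K_B_EV * temp_k))


# hypothetical illustration values (units: A/cm^2, cm, A/cm, eV, K)
J = 1.0e6          # line current density
JL_C = 3000.0      # assumed threshold product (jL)c
A = 1.0            # arbitrary prefactor, so mttf is in arbitrary units
N = 1.0            # assumed current density exponent
E_A = 0.9          # em activation energy for cu quoted in the text
SHORT = 10e-4      # 10 um line
MEDIUM = 50e-4     # 50 um line
LONG = 100e-4      # 100 um line

for name, length in (("short", SHORT), ("medium", MEDIUM), ("long", LONG)):
    immortal = is_below_threshold(J, length, JL_C)
    mttf = mttf_modified_black(J, length, JL_C, A, N, E_A, 378.0)
    print(f"{name:6s} jL = {J * length:8.0f} A/cm  immortal = {immortal}  mttf = {mttf:.3e}")
```

under these assumed numbers the shortest line sits below the threshold and is predicted not to fail, while shortening a line that is above the threshold still lengthens its predicted lifetime because jc grows as L shrinks.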
furthermore, the distribution complexity should be closely watched as well when applying short length benefits to circuit designs.

4. comparative discussion

the discussion so far has focused on the details of each independent mechanism and its influence on aging. in real circuits, devices will experience a variety of biases, temperatures, activity, and aging conditions, as illustrated in fig. 6. in this type of generalized waveform, interactions between the aging mechanisms become important. one of the more challenging modeling complications arising from these interactions is recovery. for example, bti recovery is sensitive to the applied bias. therefore, as a circuit switches to a low voltage state, either in analog operation or through dynamic voltage power management schemes, the transistors will undergo recovery differently.

fig. 6 devices in circuits may see a wide variety of biases with associated aging mechanisms. the persistent self-heat effect makes the aging sensitive to history, modulating the aging and recovery behavior in regions of the waveform that follow sections of high self-heat generation.

additionally, recovery is known to be sensitive to temperature, with more recovery occurring at higher temperatures [68]. modeling this effect is further complicated by the persistent self-heat behavior, so that within a general waveform the temperature during a recovery phase will depend on the immediate history of the waveform. one final challenge for recovery modeling is the interaction with minority carrier injection during hot carrier stress. these minority carriers can passivate bti traps, enhancing recovery. for example, in pmos nbti a hole trap may be passivated by an injected electron during a subsequent hot carrier stress phase.
capturing these recovery interactions in a model requires that the recovery term be updated to include a dependence on bias, temperature, activity, and minority carrier injection. putting all of these effects into a single equation depends on the physics involved, but a simplistic additive form is summarized below in equation 21:

( ( ) ( ) ) + ( ( ) ( ) ) + ( ( ) ( ) ) (21)

in this summarizing equation, a key feature is still missing, namely any interaction between bti and hot carriers. this is an area that still requires significant research, but the basic issue hinges on whether the traps associated with bti and hc are independent or have some interaction. for simplicity, the traps are often assumed to be separate and independent based on their location. however, recent work has shown that there is a strong interaction between the mechanisms in some cases, so that the independent approximation may lead to overly conservative predictions of total aging [69]. this interaction poses further challenges for aging modeling, since there may be a sequence dependence to the total aging, as indicated in fig. 7 [69]. as a result, the net aging after a general waveform will not necessarily match a quasi-static integration of the waveform; rather, a more detailed model is needed for the recovery of the degradation at each point in time.

the solutions to the em challenges due to technology scaling have relied on various innovations from all aspects, including process development, circuit design and chip/system integration. though these innovations have proven successful, they are projected to face more and more difficulties in future technologies. from the process development point of view, any innovative scheme to enhance the em performance will have to overcome the challenges of line/via electrical resistance, cu grain growth and variability.
the rapid resistance increase of the interconnect has become one of the bottlenecks and has diminished the performance gain from technology scaling. for advanced cu interconnects, historically, all process integration schemes that boost em reliability have come with a certain sacrifice in electrical resistance. to slow down the trend of resistance increase with scaling, new integration measures are needed not only to minimize the cu resistivity deterioration, but also to maximize the cu volume fraction in the trenches and vias. while the former faces the challenges of fundamental physics (size effect from electron diffraction), the latter has significant potential implications for reliability and manufacturability. a certain liner thickness in the trenches and vias has proven to be critical for good cu fill and slow cu mass flow along the sidewalls. a new liner deposition process is needed to overcome these challenges.

though technological solutions can and will be developed to meet the challenges discussed above, the real potential barrier ahead of technology scaling could be the economics, i.e. whether these technology solutions can still provide viable economic benefits. rather than developing costly technology solutions to cover "universal" applications, an approach that tailors the technology to specifically targeted applications and reliability requirements may have to be adopted. therefore, a co-optimization of process development along with circuit and chip design becomes essential. circuit and chip designers will need to actively participate in the technology definition and process window evaluations. on one hand, the circuit and chip design teams need to understand the process capability, take advantage of the process strengths and avoid the process weaknesses.
on the other hand, the process development team needs to know what the critical needs of the circuit and chip designs are and to optimize the process windows around those critical constructs.

5. conclusions and future work

conclusions

in scaled tri-gate devices, bti, recovery, hot carrier and em continue to play a significant role in aging. modeling these effects can be accomplished with a modeling framework similar to the methods established for planar devices. however, in addition to these primary aging mechanisms, additional phenomena need to be included in any aging model general enough to capture the arbitrary waveforms found in real circuit use. mechanisms such as self-heat, minority carrier injection, body bias effects, bti/hc interactions, and the dependence of recovery on bias, temperature and activity all need to be included in the model. in the most limiting cases, these effects will violate the fundamental quasi-static approximation of many simple aging models, which will require restructuring the basic model formulation.

fig. 7 aging after bti and hc is observed to depend on the sequence of stresses, indicating both an interaction between the mechanisms and a history dependence. [69]

technology scaling results in severe em challenges for advanced interconnects. to balance em reliability with performance, cost and cycle time, joint efforts are needed from all aspects, including process development, manufacturing, circuit design and chip integration. while innovative process integration schemes are essential to enhance the interconnect current carrying capability, robust circuit design and chip level budgeting are also important for making a product with high em reliability and optimized performance. nbti, hot carrier degradation, and em aging are the most critical challenges for the future of the semiconductor industry.
they all affect the overall reliability of nano-scaled circuits and can potentially cause system failures. being able to predict the degradation from all the mentioned mechanisms is crucial for the development and long-term success of novel transistor structures such as finfets. yet to be explored are:

1. evaluation of finfet performance degradation at higher levels of abstraction, such as processor level or system level. for that, new methodologies are required for predicting aging behavior, such as dynamic reliability management (drm) techniques for new finfet devices (7nm and below).
2. study of mitigation techniques for nbti aging effects in finfets. mitigation designates any method of limiting or controlling bti and its impact on innovative electronic devices. currently, a number of research projects are being conducted to develop fully automated schemes that are able to eliminate bti effects with little or no trade-off in terms of performance, power consumption or cost.
3. extension of the reliability-aware digital flow to cover statistical static timing analysis (ssta) to improve the accuracy of aging predictions. in reality, the distributions of logic gates in a circuit are correlated depending on their statistical properties. to obtain information on path timing degradation, statistical timing analysis techniques that handle this correlation should be incorporated.

acknowledgement: sponsored by us dept. of defense (onr and afosr).

references

[1] finfet modeling for ic simulation and design: using the bsim-cmg standard 114–117, april 19, 2001.
[2] s.-h. oh, d. monroe, j. m. hergenrother, "analytic description of short-channel effects in fully-depleted double-gate and cylindrical, surrounding-gate mosfets," ieee electron device letters, vol. 21, no. 9, pp. 445–447, 2000.
[3] g. e. moore, "cramming more components onto integrated circuits," proc. of electronics, vol. 38, pp. 114–117, april 19, 1965.
[4] s. thompson, p. packan, and m. bohr, "mos scaling: transistor challenges for the 21st century," intel technology journal, vol. 2, pp. 1–19, 1998.
[5] k. j. kuhn, "cmos scaling for the 22nm node and beyond: device physics and technology," in proceedings of the international symposium on vlsi technology, apr. 2011, pp. 1–2.
[6] k. bernstein, d. j. frank, a. e. gattiker, w. haensch, b. l. ji, s. r. nassif, e. j. nowak, d. j. pearson, and n. j. rohrer, "high-performance cmos variability in the 65-nm regime and beyond," ibm journal of research and development, vol. 50, pp. 433–449, jul-sep 2006.
[7] d. k. schroder and j. a. babcock, "negative bias temperature instability: road to cross in deep submicron silicon semiconductor manufacturing," journal of applied physics, vol. 94, pp. 1–18, jul. 2003.
[8] x. wang, b. cheng, a. r. brown, c. millar, j. b. kuang, s. nassif, and a. asenov, "statistical variability and reliability in nanoscale finfets," in proceedings of the ieee int. electron devices meeting (iedm), pp. 1–4, 2011.
[9] b. kaczer, t. grasser, p. j. roussel, j. franco, r. degraeve, l. ragnarsson, e. simoen, g. groeseneken, and h. reisinger, "origin of nbti variability in deeply scaled pfets," in proc. of the ieee irps, 2010, pp. 26–32.
[10] p. woerlee, p. damink, m. van dort, c. juffermans et al., "the impact of scaling on hot-carrier degradation and supply voltage of deep-submicron nmos transistors," in proceedings of the ieee int. electron devices meeting (iedm), 1991, pp. 537–540.
[11] y. lee, n. mielke, m. agostinelli, s. gupta, r. lu, and w. mcmahon, "prediction of logic product failure due to thin-gate oxide breakdown," in proceedings of the ieee irps, 2006, pp. 18–28.
[12] the international technology roadmap for semiconductors (itrs), 2009. http://public.itrs.net
[13] v. huard, f. cacho, y. mamy randriamihaja, and a. bravaix, "from defects creation to circuit reliability – a bottom-up approach," microelectron. eng., vol. 88, no. 7, pp. 1396–1407, jul. 2011.
[14] v. huard, f. cacho, y. mamy randriamihaja, a. bravaix, "from defects creation to circuit reliability – a bottom-up approach," microelectron. eng., vol. 88, pp. 1396–1407, 2011.
[15] jedec publication, failure mechanisms and models for semiconductor devices, jep-122g, october 2011.
[16] joseph b bernstein, gurfinkel moshe, li xiaojun, walters jörg, shapira yoram, talmor michael, "electronic circuit reliability modeling," microelectron reliab, vol. 46, pp. 1957–1979, 2006.
[17] rf drenick, "mathematical aspects of the reliability problem," j soc ind appl math, vol. 8, pp. 125–149, 1960.
[18] "reliability prediction with mtol" by joseph b. bernstein, alain bensoussan, emmanuel bender.
[19] y. miura and y. matukura, "investigation of silicon-silicon dioxide interface using mos structure," japanese journal of applied physics, vol. 5, pp. 180, 1966.
[20] s. khan, s. hamdioui, h. kükner, p. raghavan, and f. catthoor, "bti impact on logical gates in nano-scale cmos technology," in proceedings of the ieee 15th international symposium on design and diagnostics of electronic circuits systems (ddecs), 2012, pp. 348–353.
[21] h. kükner, p. weckx, p. raghavan, b. kaczer, f. catthoor, l. van der perre, r. lauwereins, and g. groeseneken, "impact of duty factor, stress stimuli, and gate drive strength on gate delay degradation with an atomistic trap-based bti model," in proceedings of the 15th euromicro conf. on dsd, 2012, pp. 1–7.
[22] v. huard, m. denais and c. parthasarathy, "nbti degradation: from physical mechanisms to modelling," microelectronics reliability, vol. 46, pp. 1–23, 2006.
[23] t. grasser, b. kaczer, w. goes, t. aichinger, p. hehenberger, and m. nelhiebel, "a two-stage model for negative bias temperature instability," in proceedings of the ieee irps, 2009, pp. 33–44.
[24] w. wang, s. yang, s. bhardwaj, s. vrudhula, t. liu, and y. cao, "the impact of nbti effect on combinational circuit: modeling, simulation, and analysis," ieee trans. on very large scale integration (vlsi) systems, vol. 18, no. 2, pp. 173–183, 2010.
[25] a. acovic, g. la rosa, and y.-c. sun, "a review of hot carrier degradation mechanisms in mosfets," microelectronics reliability, vol. 36, pp. 845–869, 1996.
[26] e. takeda, c. y. yang, and a. miura-hamada, hot-carrier effects in mos devices, ch. 2, pp. 49–58. academic press, 1995.
[27] m. song, k. p. macwilliams, and j. c. s. woo, "comparison of nmos and pmos hot carrier effects from 300 to 77 k," ieee transactions on electron devices, vol. 44, pp. 268–276, 1997.
[28] m. ohring, reliability and failure of electronic materials and devices, ch. 5, p. 259. academic press, 1998.
[29] d. g. pierce and p. g. brusius, "electromigration: a review," microelectron reliability, vol. 37, pp. 1053–1072, 1997.
[30] j. r. black, "mass transport of aluminum by momentum exchange with conducting electrons," in proceedings of the 6th annual international reliability physics symposium, pp. 148–159, 1967.
[31] r. lake and s. datta, "energy balance and heat exchange in mesoscopic systems," phys. rev. b, vol. 46, no. 8, pp. 4757–4763, 1992.
[32] u. lindefelt, "heat generation in semiconductor devices," j. appl. phys., vol. 75, no. 2, pp. 942–957, 1994.
[33] j. lai and a. majumdar, "concurrent thermal and electrical modeling of submicrometer silicon devices," j. appl. phys., vol. 79, no. 9, pp. 7353–7361, 1996.
[34] m. artaki and p. j. price, "hot phonon effects in silicon field-effect transistors," j. appl. phys., vol. 65, no. 3, pp. 1317–1320, 1989.
[35] p. lugli and s. m. goodnick, "nonequilibrium longitudinal-optical phonon effects in gaas-algaas quantum wells," phys. rev. lett., vol. 59, no. 6, pp. 716–719, 1987.
[36] s. amey et al, ―frequency and recovery effects in high-k ti egradation,‖ i ps 2009. pp. 1023-1027. [37] s. ramey, y. lu, i. meric, s. mudanai, s. novak, c. prasad, j. hicks. "aging model challenges in deeply scaled tri-gate technologies", in proceedings of the ieee international reliability workshop (iirw2015), 2015, pp. 56-62. https://www.researchgate.net/publication/234945384_negative_bias_temperature_instability_road_to_cross_in_deep_submicron_silicon_semiconductor_manufacturing_j_appl_phys_94_1?el=1_x_8&enrichid=rgreq-d53d2898-5e0d-4d8d-ba15-6a400a9455e7&enrichsource=y292zxjqywdlozi2otgxodcwmjtbuzoxnzy5nzyymtiwmtmwntzamtqxotiwntgyntq4mw== https://www.researchgate.net/publication/234945384_negative_bias_temperature_instability_road_to_cross_in_deep_submicron_silicon_semiconductor_manufacturing_j_appl_phys_94_1?el=1_x_8&enrichid=rgreq-d53d2898-5e0d-4d8d-ba15-6a400a9455e7&enrichsource=y292zxjqywdlozi2otgxodcwmjtbuzoxnzy5nzyymtiwmtmwntzamtqxotiwntgyntq4mw== https://www.researchgate.net/publication/234945384_negative_bias_temperature_instability_road_to_cross_in_deep_submicron_silicon_semiconductor_manufacturing_j_appl_phys_94_1?el=1_x_8&enrichid=rgreq-d53d2898-5e0d-4d8d-ba15-6a400a9455e7&enrichsource=y292zxjqywdlozi2otgxodcwmjtbuzoxnzy5nzyymtiwmtmwntzamtqxotiwntgyntq4mw== https://www.researchgate.net/publication/234945384_negative_bias_temperature_instability_road_to_cross_in_deep_submicron_silicon_semiconductor_manufacturing_j_appl_phys_94_1?el=1_x_8&enrichid=rgreq-d53d2898-5e0d-4d8d-ba15-6a400a9455e7&enrichsource=y292zxjqywdlozi2otgxodcwmjtbuzoxnzy5nzyymtiwmtmwntzamtqxotiwntgyntq4mw== https://www.researchgate.net/publication/241634466_statistical_variability_and_reliability_in_nanoscale_finfets?el=1_x_8&enrichid=rgreq-d53d2898-5e0d-4d8d-ba15-6a400a9455e7&enrichsource=y292zxjqywdlozi2otgxodcwmjtbuzoxnzy5nzyymtiwmtmwntzamtqxotiwntgyntq4mw== 
[38] T. Grasser et al., "The universality of NBTI relaxation and its implications for modeling and characterization," IRPS 2007, pp. 268–280.
[39] S. Pae et al., "Reliability characterization of 32nm high-k and metal gate logic transistor technology," IRPS 2010, pp. 3D2.1–3D2.6.
[40] A. Krishnan et al., "NBTI impact on transistor & circuit: models, mechanisms, & scaling effects," in Proceedings of the IEDM 2003, pp. 14.5.1–14.5.4.
[41] C. Hu, "Lucky-electron model of channel hot electron emission," in Proc. of the IEDM, 1979, pp. 22–25.
[42] B. Kaczer et al., "Origin of NBTI variability in deeply scaled pFETs," in Proceedings of the IRPS 2010, pp. 2A3.1–2A3.7.
[43] C. Prasad et al., "Bias temperature instability variation on SiON/poly, HK/MG and trigate architectures," in Proceedings of the IRPS 2014, pp. 6A.5.1–6A.5.7.
[44] P. Packan et al., "High performance hi-k + metal gate strain enhanced transistors on (110) silicon," in Proceedings of the IEDM 2008, pp. 1–4.
[45] G. Groeseneken et al., "Reliability issues in MuGFET nanodevices," in Proc. of IRPS 2008, pp. 52–60.
[46] J. Kim et al., "Effects of gate process on NBTI characteristics of TiN gate FinFET," in Proceedings of the IRPS 2012, pp. GD6.1–GD6.4.
[47] C. Prasad et al., "Self-heat reliability considerations on Intel's 22nm tri-gate technology," in Proceedings of the IRPS 2013, pp. 5D.1.1–5D.1.5.
[48] S. Ramey et al., "Intrinsic transistor reliability improvements from 22nm tri-gate technology," in Proceedings of the IRPS 2013, pp. 4C.5.1–4C.5.5.
[49] K. T. Lee et al., "Technology scaling on high-k & metal-gate FinFET BTI reliability," in Proceedings of the IRPS 2013, pp. 2D.1.1–2D.1.4.
[50] C. C. Wu et al., "High performance 22/20nm FinFET CMOS devices with advanced high-k/metal gate scheme," in Proceedings of the IEDM 2010, pp. 27.1.1–27.1.4.
[51] S. Ramey et al., "Transistor reliability variation correlation to threshold voltage," in Proceedings of the IRPS 2015, pp. 3B2.1–3B2.6.
[52] M. Cho et al., "Off-state stress degradation mechanism on advanced pMOSFETs," in Proceedings of the ICICDT 2015, pp. 1–4.
[53] B. Kaczer et al., "Origins and implications of increased channel hot carrier variability in nFinFETs," in Proceedings of the IRPS 2015, pp. 3B.5.1–3B.5.6.
[54] C. Xu et al., "Analytical thermal model for self-heating in advanced FinFET devices with implications for design and reliability," IEEE
[55] International Technology Roadmap for Semiconductors.
[56] D. Hisamoto, "Multi-gate FETs," in Proceedings of the IEEE Int. Electron Dev. Meet. (IEDM), Short Course, 2003.
[57] Baozhen Li, Cathryn Christiansen, Dinesh Badami, Chih-Chao Yang, "Electromigration challenges for advanced on-chip Cu interconnects," Microelectronics Reliability, vol. 54, no. 4, pp. 712–724, 2014.
[58] D. Edelstein et al., "Full copper wiring in a sub-0.25 µm CMOS ULSI technology," Technical Digest, IEEE Int. Electr. Dev. Meeting, 1997, pp. 773–6.
[59] I. A. Blech, "Electromigration in thin aluminum films on titanium nitride," J. Appl. Phys., vol. 47, pp. 1203–1208, 1976.
[60] B. Li et al., "Threshold electromigration failure time and its statistics for Cu interconnects," J. Appl. Phys., vol. 100, pp. 114516, 2006.
[61] C.-K. Hu et al., "Impact of Cu microstructure on electromigration reliability," in Proceedings of the IEEE Intern. Interconnect Tech. Conf. (IITC), 2007 [section 6.1].
[62] J. J. Clement, "Electromigration modeling for integrated circuit interconnect reliability analysis," Trans. Dev. Mater. Rel., vol. 1, pp. 33–42, 2001.
[63] C. Christiansen, B. Li, J. Gill, "Blech effect and lifetime projection for Cu/low-k interconnects," in Proceedings IEEE Intern. Interconnect Tech. Conf. (IITC), 2008, pp. 114–6.
[64] B. Li et al., "Short line electromigration characteristics and their applications for circuit design," in Proceedings of the IEEE Int. Rel. Phys. Symp. (IRPS), 2013, 3F2.
[65] C.-K. Hu et al., "Electromigration challenges for nanoscale Cu wiring," in Proceedings of the AIP Conf. 2009, 1143:3–11.
[66] S. Ramey et al., "BTI recovery in 22nm tri-gate technology," in Proceedings of the IRPS 2014, pp. XT2.1–XT2.6.
[67] F. Cacho et al., "HCI/BTI coupled model: the path for accurate and predictive reliability simulations," in Proceedings of the IRPS 2014, pp. 5D4.1–5D4.5.
[68] M. Song, K. P. MacWilliams, and J. C. S. Woo, "EM reliability," IEEE Transactions on Electron Devices, vol. 44, pp. 268–276, 1997.

FACTA UNIVERSITATIS Series: Electronics and Energetics Vol. 30, No 1, March 2017, pp. 93 - 106
DOI: 10.2298/FUEE1701093K

Comparison of Measured Performance and Theoretical Limits of GaAs Laser Power Converters under Monochromatic Light

Rok Kimovec, Marko Topič
University of Ljubljana, Faculty of Electrical Engineering, Ljubljana, Slovenia

Abstract. Evaluation of GaAs laser power converters (LPC) is reported in light of theoretical maximum limits calculated with the detailed balance method as proposed by Shockley and Queisser (SQ). Calculations were done for three different theoretical structures of LPCs homogeneously illuminated by monochromatic light. Effects of LPC thickness, central wavelength of the monochromatic light source and various irradiance levels are discussed. Reflection of incident light from the interface between air and GaAs is calculated, and countermeasures in the form of single and double layer antireflection coatings are theoretically studied. Measurements of a single junction, single segment GaAs LPC illuminated by monochromatic light with central wavelength λ0 = 808 nm are presented and compared with the theoretical maximum values. The conversion efficiency ηmeas = 54,4 % was measured for the GaAs LPC illuminated with a power density of monochromatic light Pillum = 14,3 W/cm² at the LPC casing temperature T = 302 K. For the same parameters the conversion efficiency ηSQ = 76,6 % was calculated, resulting in the utilization ratio ηmeas/ηSQ = 0,71. Measured Jsc and Voc achieve 88,5 % and 89,2 % of the theoretically calculated SQ limit values.

Key words: laser power converter, Shockley-Queisser limit, GaAs, monochromatic efficiency

1. Introduction
The Shockley-Queisser (SQ) limit [1]–[3] is a fundamental, widely adopted figure of merit used for evaluating efficiency limits of photovoltaic devices. It is based on the detailed balance method and assumes radiative recombination as the sole loss mechanism in a solar cell.
Calculations of the SQ limits have already been done under standard solar spectra (AM1.5, AM1.0 and AM0) and for the black body radiation spectrum. For purposes of power beaming, where a photovoltaic cell is illuminated by an artificial light source in order to transfer energy with no electrically conductive path, the SQ limit under monochromatic illumination [4] will be calculated, since those systems commonly employ laser diodes as a source of monochromatic illumination. Light energy irradiated from a laser diode is converted to electrical energy by GaAs laser power converters (LPC) optimized for monochromatic light sources at a specific wavelength. In practice, laser diodes with central wavelength λ0 = 800 – 850 nm are often utilized, due to their low price and good system efficiency when employing GaAs LPCs as optical-to-electrical energy converters. Currently, state-of-the-art GaAs LPCs achieve efficiencies greater than 56 % [5-6] when illuminated with monochromatic light with central wavelength λ0 between 810 – 820 nm and Pillum between 50-124 W/cm².

Received March 15, 2016; received in revised form June 13, 2016
Corresponding author: Rok Kimovec, University of Ljubljana, Faculty of Electrical Engineering, Tržaška cesta 25, 1000 Ljubljana, Slovenia (e-mail: rok.kimovec@fe.uni-lj.si)

In this paper we present theoretically calculated efficiency limits based on the detailed balance principle and compare them with a measured GaAs LPC. Conversion of optical energy to electrical energy will be presented with loss analysis for both the theoretical and the measured LPC.

2. Model
The SQ current density limit is calculated as the difference between the photogenerated current density and the loss of available current density due to radiative recombination (1):

$$J_{SQ} = J_{ph} - q R_r \tag{1}$$

Jph – photogenerated current density
q – elementary charge of electron
Rr – radiative recombination rate of electron-hole pairs (e – h)

2.1. Photogenerated current density
Photogenerated current density is calculated from the flux of e – h pairs generated by absorbed photons (2). In this paper we present the calculation of the SQ limit under monochromatic illumination applied to a GaAs photovoltaic cell.

$$J_{ph} = q \, \Phi_{e-h} \tag{2}$$

Φe–h – flux of photogenerated e – h pairs

Φe–h represents the number of e – h pairs generated in the absorber per unit time per unit area. It is calculated using the absorption coefficient α of GaAs as measured by [7], including the Urbach tail, with slope E0 below Eg and E' above Eg [8], fitted to the following equation (3) [9]:

$$\alpha(E_{ph}) = \begin{cases} \alpha_0 \exp\left(\dfrac{E_{ph} - E_g}{E_0}\right), & E_{ph} \le E_g \\[2ex] \alpha_0 \left(1 + \dfrac{E_{ph} - E_g}{E'}\right), & E_{ph} > E_g \end{cases} \tag{3}$$

Eph – energy of photons incident on LPC surface
Eg = 1,42 eV – band gap of GaAs
α0 = 8000 cm⁻¹
E0 = 6,7 meV
E' = 140 meV

From the known absorption coefficient α, absorptivity A was considered for three different hypothetical structures of GaAs LPCs of thickness L, shown in Fig. 1, as follows [3,4]:

a) Planar front surface with complete absorption on the back surface (4), representing a single pass of photons through the absorber.

$$A(E_{ph}, L) = 1 - e^{-\alpha(E_{ph}) L} \tag{4}$$

b) Planar front surface with a perfectly reflecting mirror on the back surface (5), representing a double pass of photons through the absorber.

$$A(E_{ph}, L) = 1 - e^{-2 \alpha(E_{ph}) L} \tag{5}$$

c) Random texture on the front surface with a perfectly reflecting mirror on the back surface (6), representing multiple passes of photons through the absorber.

$$A(E_{ph}, L) = \frac{4 n_{GaAs}^2 \, \alpha(E_{ph}) L}{1 + 4 n_{GaAs}^2 \, \alpha(E_{ph}) L} \tag{6}$$

nGaAs – refractive index of GaAs

All considered structures have a thickness dependence, denoted by L, and are assumed to be exposed to air. For the randomly textured structure, absorptivity also depends on the refractive index nGaAs of GaAs.

Fig. 1 Different theoretical structures of GaAs LPCs considered in calculations. Arrows show the light path through the GaAs absorber. R – reflection of light from the bottom surface

With known A, Jph can be calculated from the flux of photons incident on the LPC front surface.
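The absorption model and the three absorptivity cases can be sketched numerically. The snippet below is a minimal illustration, not the authors' code: it assumes the piecewise absorption fit with the listed parameters (α0, E0, E', Eg), a single-pass form 1 − exp(−αL), a double-pass form 1 − exp(−2αL), and the Lambertian light-trapping form 4n²αL/(1 + 4n²αL) for the textured case. The function names, the structure labels and the use of nGaAs = 3,7 (the value the paper quotes later for 808 nm) are assumptions made for this example.

```python
import math

EG = 1.42          # GaAs band gap, eV (from the text)
ALPHA0 = 8000.0    # absorption coefficient at the band edge, 1/cm (from the text)
E0 = 0.0067        # Urbach slope below Eg, eV (from the text)
E_PRIME = 0.140    # slope above Eg, eV (from the text)
N_GAAS = 3.7       # refractive index near 808 nm (assumed constant here)


def photon_energy_ev(wavelength_nm):
    """Photon energy in eV for a vacuum wavelength in nm (hc = 1239.84 eV nm)."""
    return 1239.841984 / wavelength_nm


def alpha(e_ph):
    """Absorption coefficient in 1/cm for photon energy e_ph in eV."""
    if e_ph <= EG:
        return ALPHA0 * math.exp((e_ph - EG) / E0)   # Urbach tail
    return ALPHA0 * (1.0 + (e_ph - EG) / E_PRIME)    # linear rise above Eg


def absorptivity(e_ph, thickness_cm, structure):
    """Absorptivity of the three hypothetical structures (labels are hypothetical)."""
    al = alpha(e_ph) * thickness_cm
    if structure == "single_pass":      # absorbing back surface
        return 1.0 - math.exp(-al)
    if structure == "double_pass":      # planar front, perfect back mirror
        return 1.0 - math.exp(-2.0 * al)
    if structure == "textured":         # random texture, perfect back mirror
        x = 4.0 * N_GAAS ** 2 * al
        return x / (1.0 + x)
    raise ValueError(f"unknown structure: {structure}")


e808 = photon_energy_ev(808.0)
results = {s: absorptivity(e808, 1e-4, s)            # 1 um = 1e-4 cm
           for s in ("single_pass", "double_pass", "textured")}
for name, a in results.items():
    print(f"{name:12s} A = {a:.3f}")
```

For a 1 µm absorber at 808 nm the ordering single pass < double pass < textured follows directly from the longer effective optical path in each case.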
Reflection of incident light from the front surface is not taken into account here, but is added and discussed later. The laser spectrum around the central wavelength λ0 was approximated with a Gaussian distribution as shown in equation (7).

[Equation (7): Gaussian laser spectrum around λ0]

The laser spectrum was weighted with the power density of incident light entering the front surface of the LPC, resulting in spectral irradiance (8), (10).

[Equations (8)–(10): spectral irradiance]

FWHM – full width at half maximum
λ0 – central wavelength
h – Planck's constant
c – speed of light

The flux of photons entering the front surface per unit energy (11) can be derived from the known spectral irradiance.

[Equation (11): photon flux per unit energy]

Integration of A multiplied by the photon flux per unit energy over the photon energies present in the spectral irradiance results in the flux of e – h pairs (12) generated by the flux of photons for defined laser parameters λ0, FWHM, laser power density and LPC thickness L.

[Equation (12): flux of e – h pairs]

Photogenerated current density can be calculated from the known flux of e – h pairs (13) as a function of incident photon energy and thickness of the LPC.

$$J_{ph}(E_{ph}, L) = q \, \Phi_{e-h}(E_{ph}, L) \tag{13}$$

2.2. Radiative recombination rate
The SQ limit assumes radiative recombination in thermal equilibrium as the sole loss mechanism present in the photovoltaic cell [1]. According to the detailed balance method used in the calculation of the SQ limit, all absorbed energy must be emitted for the system to be in equilibrium. Therefore the loss of energy due to thermal radiation is unavoidable. The derivation of Rr can be found in the literature [1], and the recombination current density can be written as (14):

[Equation (14): recombination current density]

where:

[Equation (15)]

k – Boltzmann's constant
h – Planck's constant
V – voltage across device at open circuit condition
T – device temperature

It is remarked that Rr in our case corresponds to an emission rate from the device surface (and not from the volume).
Consequently its unit is m⁻² s⁻¹ (instead of the more commonly used m⁻³ s⁻¹).

2.3. LPC model
Performance of the LPC is expressed with the same parameters as used in the evaluation of solar cells. Efficiency η, fill factor FF, open-circuit voltage Voc, short-circuit current density Jsc and available electrical power density at the maximum power point Pmax are derived from the current density – voltage dependency, J – V (16).

$$J(V) = J_{ph} - J_{rad}(V) \tag{16}$$

Maximum power density is calculated numerically as (17):

$$P_{max} = \max_V \left( J(V) \, V \right) \tag{17}$$

and Vmpp and Jmpp as (18):

$$V_{mpp} = V(P_{max}), \qquad J_{mpp} = J(V_{mpp}). \tag{18}$$

Conversion efficiency is calculated as (19):

$$\eta = \frac{P_{max}}{P_{illum}} \tag{19}$$

and fill factor as (20):

$$FF = \frac{P_{max}}{J_{sc} \, V_{oc}}. \tag{20}$$

Jsc is obtained as (21):

$$J_{sc} = J(V = 0) \tag{21}$$

and Voc is calculated as (22):

$$V_{oc} = V(J = 0). \tag{22}$$

3. Simulation results
All simulations of the SQ performance limit were done for the three theoretical LPC structures discussed above, with LPC thickness L = 1 µm at LPC temperature T = 300 K. The source of monochromatic illumination was assumed to be homogeneous across the LPC front surface, with illumination power density Pillum = 100 mW/cm² and a Gaussian spectral distribution around the central wavelength λ0 = 808 nm with FWHM = 5 nm. Simulation parameters different from those specified above are noted where necessary.

3.1. Effect of LPC absorber thickness on efficiency
As seen in Fig. 2, absorber layer thickness plays a significant role in the SQ efficiency limit for structures with thickness less than 3 µm. For thicker cells, there is less than 1 % difference between the best and the worst performing structure, and efficiency saturates at η = 68,4 % for all structures for the given simulation parameters.

Fig. 2 Absorber thickness effect on efficiency of LPC

A similarly strong rise with increasing absorber thickness can be seen for Jmpp, while values of Vmpp fall slightly (Fig. 3). Thickness is important to guarantee complete absorption of all photons, which results in increased Jmpp.
This is most notable in the structure with no reflection from the back surface, where only a single pass of light through the absorber occurs. The recombination rate of e – h pairs increases with increasing thickness, resulting in increased Jrad and decreased Vmpp. The product of Jmpp and Vmpp rises with absorber thickness, resulting in increasing Pmpp and efficiency, since the gain from increased absorption is much larger than the loss of voltage due to the increased recombination rate of e – h pairs.

Fig. 3 Absorber thickness effect on Jmpp and Vmpp of LPC

3.1. Effect of central wavelength of monochromatic light on efficiency of LPC
SQ efficiency for three different 1 µm thick theoretical GaAs LPC structures as a function of the central wavelength λ0 of monochromatic light is shown in Fig. 4. The maximum efficiency ηSQ = 72,3 % is achieved for the randomly textured LPC with perfect back mirror at λ0 = 872 nm, which corresponds to Eg = 1,42 eV of GaAs. The LPC with planar front surface and perfect back mirror achieves ηSQ = 65,0 % at λ0 = 808 nm, and the planar LPC with an absorbing back surface has ηSQ = 56,7 % at λ0 = 728 nm. It is clear that the LPC structure not only influences the absolute maximum of efficiency, but also shifts the peak of efficiency, marked with x in Fig. 4. Commercially available laser diodes with the best trade-off between price and output optical power, suitable for illumination of GaAs LPCs, emit light with a spectral peak at approximately λ0 = 808 nm, marked with a vertical line in Fig. 4.

Fig. 4 Effect of central wavelength λ0 of monochromatic source on SQ efficiency limit

3.2. Performance of LPC under high irradiance
LPCs are normally illuminated with high irradiance of monochromatic light, since efficiency increases with increasing illumination power density Pillum, calculated as [Equation: definition of Pillum].
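The rise of efficiency with illumination power density can be reproduced with a compact detailed-balance sketch. The code below is an illustration under stated assumptions, not the authors' implementation: it treats the laser as a single spectral line at 808 nm (the FWHM is ignored), uses the planar structure with a perfect back mirror, and takes the radiative saturation current from the textbook expression for emission into the front hemisphere in the Boltzmann approximation. That expression is assumed here rather than taken from the paper's recombination equations. All function and variable names are hypothetical.

```python
import math

Q = 1.602176634e-19   # elementary charge, C
H = 6.62607015e-34    # Planck constant, J s
C = 2.99792458e8      # speed of light, m/s
K = 1.380649e-23      # Boltzmann constant, J/K

EG, ALPHA0, E0, E_PRIME = 1.42, 8000.0, 0.0067, 0.140   # eV, 1/cm, eV, eV


def alpha(e_ev):
    """GaAs absorption coefficient in 1/cm (Urbach tail below Eg, linear above)."""
    if e_ev <= EG:
        return ALPHA0 * math.exp((e_ev - EG) / E0)
    return ALPHA0 * (1.0 + (e_ev - EG) / E_PRIME)


def absorptivity(e_ev, thickness_cm):
    """Planar front surface with perfect back mirror (double pass)."""
    return 1.0 - math.exp(-2.0 * alpha(e_ev) * thickness_cm)


def saturation_current(thickness_cm, temp_k, steps=4000):
    """Radiative saturation current density J0 in A/cm^2 (assumed textbook form)."""
    kt_ev = K * temp_k / Q
    e_lo, e_hi = EG - 0.3, EG + 1.0
    de = (e_hi - e_lo) / steps
    total = 0.0
    for i in range(steps + 1):
        e_ev = e_lo + i * de
        weight = 0.5 if i in (0, steps) else 1.0          # trapezoidal rule
        e_j = e_ev * Q
        # thermal photon flux per (m^2 s J) into the front hemisphere
        b = 2.0 * math.pi * e_j ** 2 / (H ** 3 * C ** 2) * math.exp(-e_ev / kt_ev)
        total += weight * absorptivity(e_ev, thickness_cm) * b
    return Q * total * (de * Q) * 1e-4                    # A/m^2 -> A/cm^2


def sq_efficiency(p_illum, wavelength_nm=808.0, thickness_cm=1e-4, temp_k=300.0):
    """Detailed-balance efficiency for p_illum in W/cm^2 (single-line spectrum)."""
    kt_ev = K * temp_k / Q
    e_ph = H * C / (wavelength_nm * 1e-9) / Q             # photon energy, eV
    j_ph = absorptivity(e_ph, thickness_cm) * p_illum / e_ph   # A/cm^2
    j0 = saturation_current(thickness_cm, temp_k)
    v_oc = kt_ev * math.log(j_ph / j0 + 1.0)
    p_max, n = 0.0, 20000
    for i in range(n + 1):                                # scan the J-V curve
        v = v_oc * i / n
        j = j_ph - j0 * (math.exp(v / kt_ev) - 1.0)
        p_max = max(p_max, j * v)
    return p_max / p_illum


efficiencies = {p: sq_efficiency(p) for p in (0.1, 1.0, 10.0, 100.0)}
for p, eta in efficiencies.items():
    print(f"P_illum = {p:6.1f} W/cm^2 -> eta = {100 * eta:.1f} %")
```

In this sketch each tenfold increase of Pillum raises Voc by roughly (kT/q) ln 10, about 60 mV at 300 K, while Jph scales linearly with Pillum, so the efficiency gain per decade is nearly constant. The absolute values depend on the assumptions listed above and need not coincide with the figures reported in the paper.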
All three structures show a logarithmic dependence of efficiency on Pillum, as shown in the semi-log plot in Fig. 5. For comparison, efficiencies of high-efficiency GaAs LPCs are also plotted in Fig. 5. The highest LPC efficiency known to the authors was achieved by Helmers et al. with η = 57,4 % at λ0 = 805 nm and Pillum = 124 W/cm² [6]. A GaAs LPC with similar efficiency η = 56,0 % at λ0 = 820 nm and Pillum = 56 W/cm² was reported by Andreev et al. [5]. Efficiency η = 52,8 % at λ0 = 810 nm and Pillum = 14 W/cm² was reported by Beaumont et al. [10]. Peña et al. developed an LPC with efficiency η = 45,4 % at λ0 = 808 nm and Pillum = 5 W/cm² [11]. For the same illumination parameters Shan et al. report efficiency η = 53,2 % [12]. Reported high-efficiency LPCs are marked with circles in Fig. 5.

Fig. 5 Influence of high irradiance on efficiency of LPCs. Efficiencies of state-of-the-art GaAs LPCs obtained from the literature are marked with circles.

3.3. Single and double layer AR coating for reduced front surface reflection
So far in the paper no reflection of incident light from the front surface was assumed in the calculations, so that all light reaches the absorption layer. In the real world, reflection at the interface between two media decreases the amount of light coupled into the photovoltaic structure. Reflection of light incident perpendicular to the surface is defined by the refractive indices of the media at the interface (23). In our case the interface consists of air and GaAs. Since the refractive index of GaAs, nGaAs, depends on photon energy [13], reflection R exhibits the same dependence. For photon energy Eph = 1,6 eV, representing monochromatic light with λ0 = 808 nm, nGaAs = 3,7 [13]. The refractive index of air is nair = 1,0 and is constant over a broad range of the light spectrum [14].
$$R = \left( \frac{n_{GaAs} - n_{air}}{n_{GaAs} + n_{air}} \right)^2 \tag{23}$$

The large difference between the refractive indices of GaAs and air leads to high reflection of light from the interface, and only 67,2 % of perpendicularly incident monochromatic light at λ0 = 808 nm is coupled into the absorption region of GaAs. Numerous schemes are deployed in order to reduce reflection, depending on the spectrum of incident light. For broadband white light, random texturing of the front surface reduces reflectivity over a broad wavelength range to a few percent [15]. Another approach, employed when using monochromatic light, is to use a thin film single layer antireflection (AR) coating with refractive index nAR (24) and with a thickness dAR (25) of a quarter wavelength of the incident light.

$$n_{AR} = \sqrt{n_{air} \, n_{GaAs}} \tag{24}$$

$$d_{AR} = \frac{\lambda_0}{4 \, n_{AR}} \tag{25}$$

When using monochromatic light, a single layer thin film AR coating can completely eliminate reflection, as seen in Fig. 6, while for a broad white light spectrum a single layer AR coating reduces reflectance to around 10 % [16].

Reflectance R for perpendicularly incident light as a function of thickness dAR and photon energy Eph for a single layer AR coating can be written as [17] (27):

[Equations (26)–(27): reflectance of a single layer AR coating]

λ0 – wavelength of monochromatic light in air

For monochromatic light with central wavelength λ0 = 808 nm, a 105,5 nm thick single layer AR coating with refractive index nAR = 1,92 reduces reflectance to zero, as seen in Fig. 6, resulting in all incident light being coupled into the absorption layer. Since a material with exactly this refractive index at the specified wavelength does not exist, it is informative to calculate reflection from the front surface when using already deployed AR coating materials. Fig. 6 shows reflection for three different AR coating materials deployed on GaAs as a function of their thickness. The best material regarding refractive index for AR coating on GaAs is silicon nitride (Si3N4) with refractive index 2,00 at 808 nm [18].
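The single layer AR relations can be checked with a few lines of code. The sketch below assumes the standard thin-film expression for a single non-absorbing layer at normal incidence, which may be written differently from the form used in the paper. It takes nair = 1,0 and the rounded value nGaAs = 3,7 from the text; the function names are hypothetical.

```python
import math

N_AIR = 1.0
N_GAAS = 3.7             # rounded value quoted in the text for 808 nm
WAVELENGTH_NM = 808.0


def bare_reflectance(n0, ns):
    """Normal-incidence reflectance of an uncoated interface."""
    return ((ns - n0) / (ns + n0)) ** 2


def quarter_wave_thickness(n_ar, wavelength_nm=WAVELENGTH_NM):
    """Layer thickness equal to a quarter of the wavelength inside the layer."""
    return wavelength_nm / (4.0 * n_ar)


def single_layer_reflectance(n_ar, d_nm, wavelength_nm=WAVELENGTH_NM,
                             n0=N_AIR, ns=N_GAAS):
    """Standard thin-film result for one lossless layer at normal incidence."""
    r1 = (n0 - n_ar) / (n0 + n_ar)          # air / coating interface
    r2 = (n_ar - ns) / (n_ar + ns)          # coating / GaAs interface
    delta = 2.0 * math.pi * n_ar * d_nm / wavelength_nm   # phase thickness
    c = math.cos(2.0 * delta)
    return ((r1 * r1 + r2 * r2 + 2.0 * r1 * r2 * c)
            / (1.0 + r1 * r1 * r2 * r2 + 2.0 * r1 * r2 * c))


r_bare = bare_reflectance(N_AIR, N_GAAS)
n_ideal = math.sqrt(N_AIR * N_GAAS)
d_ideal = quarter_wave_thickness(n_ideal)
r_ideal = single_layer_reflectance(n_ideal, d_ideal)

n_si3n4 = 2.00                               # Si3N4 at 808 nm, from the text
d_si3n4 = quarter_wave_thickness(n_si3n4)
r_si3n4 = single_layer_reflectance(n_si3n4, d_si3n4)

print(f"bare GaAs: R = {100 * r_bare:.1f} %")
print(f"ideal coating: n = {n_ideal:.3f}, d = {d_ideal:.1f} nm, R = {100 * r_ideal:.3f} %")
print(f"Si3N4: d = {d_si3n4:.1f} nm, R = {100 * r_si3n4:.2f} %")
```

Because nGaAs is rounded to 3,7 here, the results differ slightly from the values quoted in the text, which appear to rest on a slightly lower index. The quarter-wave thickness for Si3N4 does not depend on nGaAs and comes out at 101 nm.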
GaAs with a 101 nm thick layer of Si3N4 reflects around 0,1 % of the incident light. Another suitable AR coating material for GaAs is Al2O3 (alumina). The Al2O3/GaAs interface is widely studied [19], [20], since it has many uses in the semiconductor industry, such as the insulator layer in IGFET transistors [21], diode laser coatings [22] and AR coatings for high-efficiency solar cells [23]. A 114,8 nm thick layer of alumina on GaAs, with refractive index nAl2O3 = 1,76 [24] at 808 nm, results in a front surface reflection under 1 %. Another commonly used AR coating material for solar cells is SiO2 (silica), with refractive index nSiO2 = 1,45 at 808 nm [25]. Since the refractive index of silica is far from optimal for an AR coating on GaAs, around 7,4 % of the incident light is reflected in the best case.

Fig. 6 Influence of single layer AR coating on reflection from the GaAs/air interface

A single layer AR coating sufficiently reduces the reflection of monochromatic light at the GaAs/air interface, but it places strict requirements on the coating material, since an exactly specified refractive index is needed to achieve good results. It also performs well only at the design wavelength, so its performance decreases in real-world scenarios where the wavelength of the diode laser varies with manufacturing tolerances and operating temperature. To overcome these limits, a double layer AR coating can be deployed. For quarter-wavelength thicknesses of both layers in the double layer AR stack, R for perpendicularly incident light is given by [17] (28):

$$R = \left(\frac{n_{AR1}^{2}\, n_{GaAs} - n_{AR2}^{2}\, n_{air}}{n_{AR1}^{2}\, n_{GaAs} + n_{AR2}^{2}\, n_{air}}\right)^{2} \tag{28}$$

R is minimized when (29):

$$\frac{n_{AR2}}{n_{AR1}} = \sqrt{\frac{n_{GaAs}}{n_{air}}} \tag{29}$$

nAR1, nAR2 – refractive indices of thin layers one and two of the double layer AR coating

To minimize reflection at the air/GaAs interface for monochromatic light with λ0 = 808 nm, a ratio nAR2/nAR1 = 1,92 should be used. Well-suited AR coating materials that approach this ratio are MgF2 and TiO2.
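The quarter-wave relations (23)-(25) and (28)-(29) are easy to check numerically. The Python sketch below is an illustration only, not the authors' calculation: the function names are hypothetical, nGaAs = 3,7 and the coating indices are the rounded values quoted in the text for 808 nm (including nMgF2 = 1,37 [26] and nTiO2 = 2,52 [24]), and the double layer expression assumes that layer one faces air and layer two faces GaAs.

```python
import math

N_AIR = 1.0
N_GAAS = 3.7          # refractive index of GaAs at 808 nm, as quoted in the text
WAVELENGTH_NM = 808.0


def bare_reflectance(n_sub, n_inc=N_AIR):
    """Normal-incidence reflectance of an uncoated interface, Eq. (23)."""
    return ((n_sub - n_inc) / (n_sub + n_inc)) ** 2


def ideal_single_layer_index(n_sub, n_inc=N_AIR):
    """Coating index that cancels reflection at quarter-wave thickness, Eq. (24)."""
    return math.sqrt(n_inc * n_sub)


def quarter_wave_thickness(n_coat, wavelength_nm=WAVELENGTH_NM):
    """Quarter-wave thickness of a coating in nm, Eq. (25)."""
    return wavelength_nm / (4.0 * n_coat)


def single_layer_qw_reflectance(n_coat, n_sub=N_GAAS, n_inc=N_AIR):
    """Reflectance of one quarter-wave layer at the design wavelength."""
    return ((n_inc * n_sub - n_coat ** 2) / (n_inc * n_sub + n_coat ** 2)) ** 2


def double_layer_qw_reflectance(n1, n2, n_sub=N_GAAS, n_inc=N_AIR):
    """Reflectance of two quarter-wave layers (assumed: n1 faces air, n2 faces GaAs)."""
    a = n1 ** 2 * n_sub
    b = n2 ** 2 * n_inc
    return ((a - b) / (a + b)) ** 2


if __name__ == "__main__":
    print(f"bare GaAs/air interface: R = {100 * bare_reflectance(N_GAAS):.1f} %")
    print(f"ideal single layer index: n_AR = {ideal_single_layer_index(N_GAAS):.3f}")
    for name, n in [("Si3N4", 2.00), ("Al2O3", 1.76), ("SiO2", 1.45)]:
        d = quarter_wave_thickness(n)
        r = single_layer_qw_reflectance(n)
        print(f"{name}: d = {d:.1f} nm, R = {100 * r:.2f} %")
    r2 = double_layer_qw_reflectance(1.37, 2.52)  # MgF2 over TiO2
    print(f"MgF2/TiO2 quarter-wave stack: R = {100 * r2:.2f} %")
```

With these rounded inputs the sketch gives about 33 % reflectance for the bare interface and quarter-wave thicknesses of 101,0 nm for Si3N4 and 114,8 nm for Al2O3. The reflectances at quarter-wave thickness land close to the values quoted in the text; small differences in the last digit are expected because the exact refractive index data are not reproduced here.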
The refractive indices of MgF2 and TiO2 at 808 nm are nTiO2 = 2,52 [24] and nMgF2 = 1,37 [26], resulting in a ratio nAR2/nAR1 = 1,84. Fig. 7 shows the reflection of a double layer AR coating deployed on GaAs as a function of the thicknesses of MgF2 and TiO2. Reflection is reduced to zero for thicknesses dMgF2 = 72,2 nm and dTiO2 = 58,1 nm.

Fig. 7 Influence of double layer MgF2/TiO2 AR coating on reflection from the GaAs/air interface

4. Comparison of SQ efficiency limit with measured LPC efficiency

Following the theoretical calculations, measurements were performed on the GaAs LPC pictured in Fig. 8. A single-segment, single-junction circular GaAs LPC with radius 0,15 cm was fully illuminated with monochromatic light from a semiconductor laser with λ0 = 808 nm and total output power 1,06 W. Light from the laser diode is coupled into a multimode (MM) 105/125 µm, NA 0,22 fiber whose output is positioned perpendicular to the surface of the LPC, so that the whole area is illuminated and spillage of light is minimized. The profile of the impinging light is near Gaussian, resulting in uniform irradiance of the front surface. The area of illumination was 0,074 cm², resulting in Pillum = 14,3 W/cm². The LPC was mounted in a TO-39 casing that was socketed and mounted on a heatsink for efficient heat dissipation. The I-V curve of the illuminated LPC was measured with a Keithley 2602A. The scan through the whole I-V curve was completed in under one second in order to minimize heating of the LPC. The measured temperature of the TO-39 casing was 302 K. Measurement results, compared with the theoretical SQ limits for the same parameters, are given in Table 1.

Fig. 8 Picture of the measured GaAs LPC mounted on a TO-39 casing.
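The irradiance quoted above and the ratios listed in Table 1 can be cross-checked with a few lines of arithmetic. The Python sketch below is only a consistency check under stated assumptions: the numbers are copied from the measurement description and from Table 1, and the helper names are hypothetical.

```python
LASER_POWER_W = 1.06           # total laser output power from the text
ILLUMINATED_AREA_CM2 = 0.074   # illuminated area from the text

# (measured, SQ limit, ratio in % as printed) for each row of Table 1
TABLE_1 = {
    "eta [%]": (54.4, 76.6, 71.0),
    "Voc [V]": (1.16, 1.30, 89.2),
    "Jsc [A/cm2]": (8.24, 9.31, 88.5),
    "Vmpp [V]": (1.00, 1.20, 83.3),
    "Jmpp [A/cm2]": (7.78, 9.12, 85.3),
    "Pmax [W/cm2]": (7.78, 10.96, 71.0),
}


def irradiance(power_w, area_cm2):
    """Irradiance in W/cm2 for a given optical power and illuminated area."""
    return power_w / area_cm2


def efficiency_percent(pmax_w_cm2, p_illum_w_cm2):
    """Conversion efficiency in percent from output and input power densities."""
    return 100.0 * pmax_w_cm2 / p_illum_w_cm2


def ratio_percent(measured, limit):
    """Measured value as a percentage of the SQ limit value."""
    return 100.0 * measured / limit


if __name__ == "__main__":
    p_illum = irradiance(LASER_POWER_W, ILLUMINATED_AREA_CM2)
    print(f"P_illum = {p_illum:.1f} W/cm2")
    print(f"eta measured = {efficiency_percent(7.78, 14.3):.1f} %")
    print(f"eta SQ       = {efficiency_percent(10.96, 14.3):.1f} %")
    for name, (measured, limit, printed) in TABLE_1.items():
        print(f"{name}: ratio = {ratio_percent(measured, limit):.1f} % (table: {printed})")
```

Recomputing derived quantities from rounded table entries can differ in the last digit from values the authors obtained from unrounded data, so the check is meant to confirm consistency rather than to reproduce the table exactly.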
Table 1 Measured and simulated results for the LPC under monochromatic illumination at Pillum = 14,3 W/cm²

Parameter       GaAs LPC measured   GaAs LPC SQ   Ratio [%]
η [%]           54,4                76,6          71,0
FF [%]          82,3                90,3          91,1
Voc [V]         1,16                1,30          89,2
Jsc [A/cm²]     8,24                9,31          88,5
Vmpp [V]        1,00                1,20          83,3
Jmpp [A/cm²]    7,78                9,12          85,3
Pmax [W/cm²]    7,78                10,96         71,0

The measured I-V curve, normalized to the calculated SQ limit values Jsc_SQ and Voc_SQ [27] for the same parameters, is shown in Fig. 9. While Jsc and Voc of the fabricated LPC reach around 90 % of the theoretical values, Pmax at 71 % of the theoretical limit still needs to be optimized. The reason for the low measured Pmax is the power lost on the series resistance Rs, which, besides grid shading, is the dominant loss mechanism in manufactured single-junction, single-segment LPCs, as discussed in [28].

Fig. 9 Measured and simulated I-V curves of the GaAs LPC, normalized to Voc_SQ and Jsc_SQ. Measurements and calculations were done under monochromatic illumination at λ0 = 808 nm and Pillum = 14,3 W/cm².

5. Distribution of losses in LPC

Following the SQ limit, the conversion of light into electrical energy in an LPC can be divided into groups. The loss analysis for a randomly textured, L = 1 µm thick LPC with a perfect back mirror, taken as the best-case theoretical structure, at λ0 = 808 nm, Pillum = 14,3 W/cm², FWHM = 5 nm and T = 302 K is shown in the inner section of the pie chart in Fig. 10. 76,6 % of the light energy is converted to useful electrical energy. 13,9 % of the light energy cannot be converted to electrical energy because the voltage at the maximum power point, Vmpp, is lower than the bandgap voltage Vg. Radiative recombination of e-h pairs accounts for 2,0 % of the energy, which is emitted from the LPC, and 7,5 % is transformed into heat by thermal relaxation of carriers generated by photons with energy higher than the bandgap. Thermal losses could be minimized by using a monochromatic light source with its central wavelength at the efficiency peak seen in Fig. 4. The outer section of the pie chart in Fig. 10 shows the measured energy distribution in the LPC. Rs contributes to a significant drop of Vmpp, which increases the loss of useful energy due to Vmpp < Vg. Another 13,3 % of the energy is the sum of other electronic and optical losses.

Fig. 10 Distribution of energy conversion in the LPC at Pillum = 14,3 W/cm², λ0 = 808 nm and T = 302 K. The inner section of the pie chart shows energy conversion following the SQ limit, while the outer section shows the measured LPC.

6. Conclusion

Calculation of SQ limits for LPCs under monochromatic illumination is a method for evaluating the theoretically achievable limits of LPCs and comparing them with the measured results of manufactured devices. We provided insights into how LPC design, together with an appropriate light source, can be further optimized to achieve high system efficiency. Irradiance should be high, which leads to small LPC areas, and 80 % efficiency could theoretically be achieved at Pillum = 100 W/cm². Comparison between calculated and measured values shows that around 90 % of the theoretical values for Jsc and Voc can already be achieved, while the measured Pmax reaches 71 % of the theoretical limit calculated with the SQ method. Further work should include the effect of series resistance in the calculations, since it is a major loss mechanism in single-junction, single-segment LPCs.

Acknowledgement: The authors acknowledge Andreas W. Bett and Henning Helmers from Fraunhofer ISE for valuable discussion and for providing us with samples of LPCs. The authors acknowledge the financial support from the Slovenian Research Agency (program P2-0197). R. Kimovec thanks the Slovenian Research Agency for his PhD funding.

References

[1] W. Shockley and H. J. Queisser, "Detailed balance limit of efficiency of p-n junction solar cells," J. Appl. Phys., vol. 32, no. 3, pp. 510–519, Mar. 1961.
[2] M. Jošt and M.
Topič, "Efficiency limits in photovoltaics: case of single junction solar cells," Facta Universitatis, Series: Electronics and Energetics, vol. 27, no. 4, pp. 631–638, 2014.
[3] G. Létay and A. W. Bett, "EtaOpt – a program for calculating limiting efficiency and optimum bandgap structure for multi-bandgap solar cells and TPV cells," in Proc. of the 17th European Photovoltaic Solar Energy Conference, Munich, Germany, 2001, pp. 178–181.
[4] A. W. Bett, F. Dimroth, R. Lockenhoff, E. Oliva, and J. Schubert, "III-V solar cells under monochromatic illumination," in Proc. of the 33rd IEEE Photovoltaic Specialists Conference, 2008, pp. 362–366.
[5] V. Andreev, V. Khvostikov, V. Kalinovsky, V. Lantratov, V. Grilikhes, V. Rumyantsev, M. Shvarts, V. Fokanov, and A. Pavlov, "High current density GaAs and GaSb photovoltaic cells for laser power beaming," in Proc. of the 3rd World Conference on Photovoltaic Energy Conversion, 2003, vol. 1, pp. 761–764.
[6] H. Helmers, L. Wagner, C. E. Garza, et al., "Photovoltaic cells with increased voltage output for optical power supply of sensor electronics," in Proc. of the AMA Conferences 2015, 2015, pp. 519–524.
[7] M. D. Sturge, "Optical absorption of gallium arsenide between 0.6 and 2.75 eV," Phys. Rev., vol. 127, no. 3, pp. 768–773, Aug. 1962.
[8] F. Urbach, "The long-wavelength edge of photographic sensitivity and of the electronic absorption of solids," Phys. Rev., vol. 92, no. 5, p. 1324, Dec. 1953.
[9] O. D. Miller, E. Yablonovitch, and S. R. Kurtz, "Intense internal and external fluorescence as solar cells approach the Shockley-Queisser efficiency limit," arXiv preprint arXiv:1106.1603, 2011.
[10] B. Beaumont, J. C. Guillaume, M. F. Vilela, A. Saletes, and C. Verie, "High efficiency conversion of laser energy and its application to optical power transmission," in Conference Record of the Twenty-Second IEEE Photovoltaic Specialists Conference, 1991, vol. 2, pp. 1503–1507.
[11] R. Peña, C. Algora, and I. Antón, "GaAs multiple photovoltaic converters with an efficiency of 45% for monochromatic illumination," in Proc. of the 3rd World Conference on Photovoltaic Energy Conversion, 2003, vol. 1, pp. 228–231.
[12] T. Shan and X. Qi, "Design and optimization of GaAs photovoltaic converter for laser power beaming," Infrared Phys. Technol., vol. 71, pp. 144–150, Jul. 2015.
[13] D. E. Aspnes, S. M. Kelso, R. A. Logan, and R. Bhat, "Optical properties of AlxGa1−xAs," J. Appl. Phys., vol. 60, no. 2, pp. 754–767, Jul. 1986.
[14] P. E. Ciddor, "Refractive index of air: new equations for the visible and near infrared," Appl. Opt., vol. 35, no. 9, pp. 1566–1573, Mar. 1996.
[15] M.-J. Huang, C.-R. Yang, Y.-C. Chiou, and R.-T. Lee, "Fabrication of nanoporous antireflection surfaces on silicon," Sol. Energy Mater. Sol. Cells, vol. 92, no. 11, pp. 1352–1357, Nov. 2008.
[16] D. Bouhafs, A. Moussi, A. Chikouche, and J. M. Ruiz, "Design and simulation of antireflection coating systems for optoelectronic devices: application to silicon solar cells," Sol. Energy Mater. Sol. Cells, vol. 52, no. 1–2, pp. 79–93, Mar. 1998.
[17] D. A. Steck, Classical and Modern Optics, 1.5.1 ed. 2013.
[18] H. R. Philipp, "Optical properties of silicon nitride," J. Electrochem. Soc., vol. 120, no. 2, pp. 295–300, Feb. 1973.
[19] L. Hong-Liang, L. Yan-Bo, X. Min, D. Shi-Jin, S. Liang, Z. Wei, and W. Li-Kang, "Characterization of Al2O3 thin films on GaAs substrate grown by atomic layer deposition," Chin. Phys. Lett., vol. 23, no. 7, p. 1929, 2006.
[20] R. E. Sah, C. Tegenkamp, M. Baeumler, F. Bernhardt, R. Driad, M. Mikulla, and O. Ambacher, "Characterization of Al2O3/GaAs interfaces and thin films prepared by atomic layer deposition," J. Vac. Sci. Technol. B, vol. 31, no. 4, p. 04D111, Jul. 2013.
[21] W. S. Lee and J. G. Swanson, "Switching behaviour of Al2O3-n GaAs MISFETs," Electron. Lett., vol. 18, no. 24, pp. 1049–1051, Nov. 1982.
[22] P. V. Bhore, A. P. Shah, M. R. Gokhale, S. Ghosh, A. Bhattacharya, and B. M. Arora, "Effect of facet coatings on laser diode characteristics," Indian J. Eng. Mater. Sci., vol. 11, pp. 438–440, 2004.
[23] S. Abdul Hadi, T. Milakovich, M. T. Bulsara, S. Saylan, M. S. Dahlem, E. A. Fitzgerald, and A. Nayfeh, "Design optimization of single-layer antireflective coating for GaAs1−xPx/Si tandem cells with x = 0, 0.17, 0.29, and 0.37," IEEE J. Photovolt., vol. 5, no. 1, pp. 425–431, Jan. 2015.
[24] J. R. DeVore, "Refractive indices of rutile and sphalerite," J. Opt. Soc. Am., vol. 41, no. 6, pp. 416–417, Jun. 1951.
[25] I. H. Malitson, "Interspecimen comparison of the refractive index of fused silica," J. Opt. Soc. Am., vol. 55, no. 10, pp. 1205–1208, Oct. 1965.
[26] H. H. Li, "Refractive index of alkaline earth halides and its wavelength and temperature derivatives," J. Phys. Chem. Ref. Data, vol. 9, no. 1, pp. 161–290, Jan. 1980.
[27] R. M. Geisthardt, M. Topič, and J. R. Sites, "Status and potential of CdTe solar-cell efficiency," IEEE J. Photovolt., vol. 5, no. 4, pp. 1217–1221, Jul. 2015.
[28] E. Oliva, F. Dimroth, and A. W. Bett, "GaAs converters for high power densities of laser illumination," Prog. Photovolt. Res. Appl., vol. 16, no. 4, pp. 289–295, Jun. 2008.

Facta Universitatis
Series: Electronics and Energetics Vol. 29, No 3, September 2016, pp. 383-393
DOI: 10.2298/fuee1603383k

Smart outlier detection of wireless sensor network

Sahar Kamal¹, Rabie A. Ramadan², Fawzy El-Refai³

¹Department of Electronics and Electrical Communications, Higher Institute of Engineering, El-Shorouk Academy, El-Shorouk City, Egypt
²Computer Engineering Department, Cairo University, Egypt
³Department of System and Computer Engineering, El-Azhar University, Cairo, Egypt

Abstract. Data sets collected from wireless sensor networks (WSNs) are usually considered unreliable and subject to errors due to limited sensor capabilities and harsh environments, resulting in a subset of the sensor data called outlier data.
This paper proposes a technique to detect outlier data based on the spatial-temporal similarity among data collected by geographically distributed sensors. The proposed technique is able to identify an abnormal subset of the data collected by a sensor node as outlier data. Moreover, it is able to classify this abnormal observation as either an error data set or an event-affected set. Simulation results show that a high detection rate is achieved compared to conventional outlier detection techniques while preserving a low false positive alarm rate.

Key words: wireless sensor network, outlier detection, fuzzy logic, spatial and temporal similarity

Received August 30, 2015; received in revised form November 15, 2015
Corresponding author: Rabie A. Ramadan, Computer Engineering Department, Cairo University, Egypt (e-mail: rabie@rabieramadan.org)
*An earlier version of this paper was presented at the International Conference on Recent Advances in Computer Systems RACS-2015, Hail University, Saudi Arabia, 2015 [1].

1. Introduction

A wireless sensor network is considered a promising solution for monitoring and measuring natural physical phenomena such as temperature, humidity, earthquakes, pressure, light, voltage, etc. A typical WSN consists of a large number of very small sensors deployed over an area of interest. These sensors are equipped with power sources (batteries, solar cells), measurement units, processing units and a wireless Tx/Rx unit. Unfortunately, the data collected from sensor nodes are considered inaccurate and may even be unreliable due to measurement errors or noise superimposed on the received data packets [2]. Duplicated measurements or even missing values are not uncommon in the data set collected by a WSN. A subset of data that appears to be inconsistent with the whole data set from which it is collected is called an outlier.
An outlier can be defined, as in [2], as "a subset of observations which appear to be inconsistent with other dataset". Alternatively, outliers can be defined, as in [3], as "those measurements that are deviated from the consistence dataset". Either definition can be used to declare the outliers in a data set. Abrupt events such as sudden sensor failure, battery power depletion or even natural physical phenomena are also causes to which outlier data can be attributed. To boost the accuracy and reliability of the collected sensor data, outliers should be detected and, where possible, corrected. There are three sources of outliers: (1) errors and noise, (2) events and (3) malicious attacks, the last being related to network security [2]. Noise or error refers to a noise-related measurement or a data instance coming from a faulty sensor. Outliers caused by errors may occur frequently, while outliers caused by events tend to have a smaller probability of occurrence. Erroneous data normally appear as an arbitrary change and are extremely different from the rest of the data. Noisy and erroneous data should be eliminated or, if possible, corrected. Events, however, may arise from sudden changes in the real world, for example rainfall, forest fire, chemical spill, air pollution, etc. Removing event outliers from the data set leads to a loss of important hidden information about the events [4]. Outliers that are very close to random errors in size can only be determined through the application of outlier tests. Classifying an outlier as an event or an error is therefore an important matter. Many studies consider outliers and events as similar conditions, treating events as some sort of outliers.
Because there are spatial-temporal similarities between the measurements of neighboring nodes, an outlier can be classified as either an event or an error. This relies on the fact that erroneous observations tend to be unrelated, while event observations tend to be spatially correlated [5]. The main approaches to determining outliers can be grouped into statistics-based, nearest neighbor-based, cluster-based and artificial intelligence techniques. Newer approaches to outlier detection include artificial intelligence techniques such as neural networks and fuzzy logic. The latter was suggested in [6], where it is also applied to outlier detection in geodetic networks. The main aim of outlier detection in WSNs is to declare outliers with a high detection rate while keeping the resource consumption of the network low. Our work is based on the observation that, in most WSN applications, sensor measurements tend to be highly correlated for sensors that are geographically close to each other (spatial similarity), and also highly correlated over a period of time (temporal similarity) [5]. We take advantage of this spatial and temporal similarity in the sensor data. In this first study, we detect outliers in a univariate attribute of the WSN. The main contribution of this paper is the use of Euclidean distance and fuzzy logic to detect outliers in wireless sensor networks, with spatial and temporal similarity used to distinguish between errors and events. If the output of the fuzzy logic is above a prefixed threshold, the observation is considered an outlier. The model is tested on a real data set from the Grand-St-Bernard deployment [7] and implemented in MATLAB. The approach achieves a high detection rate while keeping a low false positive alarm rate and low computational complexity.
The rest of the paper is organized as follows. Section 2 gives the necessary background and related work on outlier detection. The proposed algorithm is presented in Section 3, along with the assumptions upon which it is built. Section 4 shows experimental results and the performance evaluation of the proposed technique using a realistic data set. Finally, the paper is concluded in Section 5.

2. Related work

Recently, much research has addressed outlier detection in WSNs to improve the reliability and quality of sensor measurements. These studies use different techniques to detect outliers, such as statistical-based, nearest neighbor-based, clustering-based, classification-based, and spectral decomposition-based approaches. In general, they can be divided into those that do not use spatial or temporal correlation of the data set, those that use spatial or temporal correlation only, and those that use both. In 2006, the authors of [8] used the spatial correlation that exists among neighboring sensor nodes to distinguish between outlying sensors and event boundaries. In this model, each node calculates the difference between its own measurement and the median of its neighbors' measurements. A node is then declared outlying when the absolute value of its measurement's deviation degree is greater than a preselected threshold. This technique suffers from a low detection rate because it ignores the temporal correlation between sensor readings. The model in [9] uses a cluster-based technique to identify global outliers. First, each node clusters its readings and reports cluster summaries, and then transmits the raw sensor readings to its cluster head. The cluster head collects cluster summaries from all of its nodes before sending them to the sink.
An outlier cluster is declared at the sink if the cluster's average inter-cluster distance is greater than a threshold value of the set of inter-cluster distances. However, this model is sensitive to the choice of the cluster width parameter. Additionally, it increases computational complexity when computing the distance between data instances. In [10] the authors use distance similarity to identify global outliers in a WSN. Each node uses distance similarity to identify local outliers and then broadcasts the abnormal data instances to all neighboring nodes for verification. This procedure is repeated until all neighboring nodes agree on the global outliers. The technique increases computational complexity and is not suited to large-scale networks. In 2007, the technique proposed in [11] used a one-class quarter-sphere approach to detect outliers in WSNs. It takes advantage of temporal correlation to identify local outliers at each node. A sensor measurement that lies outside the quarter sphere is considered an outlier. Each node transmits only brief information to its parent for global outlier classification. This technique suffers from a low detection rate because it ignores the spatial correlation between neighboring nodes. In 2008, the authors of [12] used a centered quarter-sphere support vector approach to detect local outliers in WSNs. This technique takes advantage of the spatial correlation in the sensor data of adjacent nodes to reduce the false alarm rate and to distinguish between events and errors, but it ignores temporal correlation and increases computational complexity. In 2009, the authors of [13] proposed an outlier detection technique for WSN data sets that takes advantage of the spatial-temporal correlation among sensor readings. In 2011, the authors of [14] proposed an outlier detection method for wireless sensor networks that distinguishes between events and errors.
This technique classifies sensor node data as local outliers, cluster outliers or network outliers. It treats network and cluster outliers as events and local outliers as errors. The algorithm suffers from high computational complexity. In 2012, the authors of [15] exploited temporal correlation only to detect outliers in WSNs. However, this technique suffers from some computational complexity. It differs from our approach in that ours combines spatial-temporal similarity with fuzzy logic to detect outliers and to identify errors and events, with a high detection rate and a relatively low false positive rate in comparison with the results in [15]. In 2013, the authors of [16] used temporal and spatial properties to identify outliers and to distinguish between events and errors, but with a low detection rate and false positive rate in comparison with our approach.

3. The proposed STODM technique

Sensor nodes are assumed to be densely deployed and synchronized in the WSN. A subset of sensors is considered to belong to the same cluster if they fall within radio transmission range of each other. At any time interval, each node reads a data vector Sij, where i is the time index of the data sample and j is the spatial ID of the node. The task of an outlier detection technique is to identify a subset Xi of each sensor set Si as outliers. A further advantage of a detection technique is the ability to classify a deviating data instance as an event or an error. In this section, the proposed approach is introduced in detail. Many outlier detection techniques have been developed; however, they do not take interesting events into account. On the other hand, several recent studies are interested only in events and do not consider erroneous data.
In this paper, a new distance-based approach that relies on spatial-temporal similarity, combined with a fuzzy logic-based approach, is proposed to classify outliers as error data or events. Our methodology consists of the following steps. In the first step, the spatial and temporal similarities are calculated, and each is used as an input (membership function) to the fuzzy logic to detect outliers at each node. The second step classifies each outlier as an event or an error.

3.1. Spatial-temporal similarity

In our proposed algorithm, spatial-temporal similarity is calculated in a two-step process. In the first step, the temporal similarity of a given sensor node's data set is calculated point by point and is given by the first-order difference |Si2 − Si1|. The absolute difference is compared with a pre-specified threshold, which is set according to the tolerance of the temperature sensor. A data point Si2 is considered similar to the other points if the absolute first-order difference does not exceed the threshold. Otherwise the point is dissimilar and may be an outlier. In the second step, spatial similarity is calculated from the distance between neighboring nodes. We use the Euclidean distance as the similarity measure between two points x and y that are in the same transmission range and close in time, calculated as in Eq. (1). Euclidean distance is a popular choice for univariate and multivariate continuous attributes [17]. The data instance at point x is considered similar to the data point at y if the Euclidean distance d(x, y) does not exceed a preselected threshold. The spatial link is defined as the number of neighbors to which a point is spatially similar, as in Eq. (2). The spatial similarity threshold is the mean distance between all data points that are close in time.

$$d(x, y) = \sqrt{\sum_{k} (x_k - y_k)^{2}} \tag{1}$$

$$\text{spatial link} = \sum_{j=1}^{n} s_j \tag{2}$$

where the sum in Eq. (1) runs over the attributes of the data instance, s_j equals 1 if the point is spatially similar to neighbor j and 0 otherwise, and n is the number of neighboring nodes.

3.2. Fuzzy logic model

Recently, many approaches have been tested on decision making theories. Among the artificial intelligence techniques used in outlier analysis are neural networks, support vector machines and fuzzy logic [18]. Our approach uses fuzzy logic to detect outliers in WSN data sets. Fuzzy logic is a logical model that provides a general idea about the decision process in the analysis of the data set. The fuzzy logic suggested in [19] is essentially an approach that allows transition values between conventional crisp values such as right/wrong, yes/no, high/low. The main purpose of the method is to bring certainty by assigning a membership degree to concepts that are hard to express or have an ambiguous meaning. A fuzzy logic system consists of three main parts: fuzzification, rule base and defuzzification. Firstly, fuzzification can be defined as a transfer between a definite (crisp) system and a fuzzy system; it describes a property of an object in terms of a certain fuzzy set. Objects can belong to the 'low', 'middle' and 'high' property classes through membership functions, and each object is assigned a membership degree between 0 and 1. This technique uses temporal and spatial similarity as the two inputs (membership functions) of the fuzzy system. These membership functions are chosen empirically and optimized using sample input/output data. The most common membership functions include triangular, trapezoidal, Gaussian and sigmoid shapes. As the membership functions represent the fuzzy set, the selection of their shape and form directly affects the decision process. Secondly, the rule base combines the membership functions from the fuzzifier with rule-handling operators such as 'if', 'and', 'although', 'if not', which are based on and stored in the database. The if-then rules connect an antecedent to a consequent (i.e. input to output). These rules are given weights based on their criticality [19].
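As an illustration of how the two similarity inputs and an if-then rule base fit together, the following Python sketch computes a spatial link and a temporal similarity score and passes them through a very small fuzzy system. It is not the authors' MATLAB implementation. The function names, the graded temporal score, the normalisation of both inputs to [0, 1], the triangular membership functions, the four example rules, and the weighted-average defuzzification are all assumptions chosen for brevity.

```python
import math


def euclidean(x, y):
    """Euclidean distance between two equal-length readings, Eq. (1)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def spatial_link(own, neighbours, threshold):
    """Number of neighbours whose reading lies within the threshold, Eq. (2)."""
    return sum(1 for nb in neighbours if euclidean(own, nb) <= threshold)


def temporal_similarity(previous, current, tolerance):
    """Assumed score in [0, 1]: 1 for no change, 0 once |diff| >= 2 * tolerance."""
    diff = abs(current - previous)
    return max(0.0, 1.0 - diff / (2.0 * tolerance))


def tri(x, left, peak, right):
    """Triangular membership; a flat shoulder when left == peak or peak == right."""
    if x <= left:
        return 1.0 if left == peak else 0.0
    if x >= right:
        return 1.0 if peak == right else 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def fuzzify(x):
    """Membership degrees of x in [0, 1] for the low / med / high sets (assumed shapes)."""
    return {
        "low": tri(x, 0.0, 0.0, 0.5),
        "med": tri(x, 0.0, 0.5, 1.0),
        "high": tri(x, 0.5, 1.0, 1.0),
    }


# (spatial link level, temporal similarity level) -> outlier level
RULES = [
    ("low", "low", "high"),
    ("low", "med", "high"),
    ("high", "high", "low"),
    ("med", "med", "med"),
]
# Assumed crisp centres of the output sets, used for weighted-average defuzzification
OUTPUT_CENTRE = {"low": 0.0, "med": 0.5, "high": 1.0}


def outlier_score(link_fraction, temporal_sim):
    """Fire each rule with min() as AND, then defuzzify by weighted average."""
    s = fuzzify(link_fraction)
    t = fuzzify(temporal_sim)
    num = den = 0.0
    for s_level, t_level, out_level in RULES:
        strength = min(s[s_level], t[t_level])
        num += strength * OUTPUT_CENTRE[out_level]
        den += strength
    return num / den if den > 0 else 0.5  # no rule fired: undecided


if __name__ == "__main__":
    neighbours = [(10.1,), (10.3,), (9.9,), (10.2,)]  # hypothetical readings in deg C
    own, previous = (14.0,), 10.0
    link = spatial_link(own, neighbours, threshold=0.5) / len(neighbours)
    sim = temporal_similarity(previous, own[0], tolerance=0.3)
    print(f"outlier score = {outlier_score(link, sim):.2f}")
```

A reading that matches none of its neighbours and jumps far beyond the sensor tolerance drives both inputs to the 'low' set, so the first rule fires fully and the score approaches 1. Comparing this score with a prefixed threshold then gives the outlier decision. A full Mamdani system with centroid defuzzification would follow the same structure with fuzzy output sets instead of crisp centres.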
With this approach, measurements can be classified according to their membership degrees, e.g.:
• If spatial link (low) and temporal similarity (low) then outlier (high)
• If spatial link (low) and temporal similarity (med) then outlier (high)
• If spatial link (high) and temporal similarity (high) then outlier (low)
• If spatial link (med) and temporal similarity (med) then outlier (med)

Thirdly, in the defuzzification unit, the rule results obtained from the rule-handling unit are evaluated and turned into definite (crisp) results [19]. An outlier is declared according to the rule results. Fig. 1 shows all three stages of fuzzy logic.

Fig. 1 The three stages of fuzzy logic

3.3. Outlier classification

The third step is to classify each outlier value as an error or an event. In this step, we aim to identify the source of the values labeled as outliers. There are two possibilities: either the outlier is due to an error, as a result of a low battery or network damage, or it is due to an event or phenomenon in the surrounding environment. Our idea is based on the following observation: "errors in the sensor data are likely to be spatially unrelated, while event measurements are likely to be spatially correlated". Moreover, data instances tend to be correlated in both time and space. Hence, we exploit this fact by using data from neighboring nodes to measure the spatial similarity and the time stamps of the readings to measure the temporal similarity. In other words, once a data instance has been declared an outlier in the previous step, the technique checks whether the neighboring nodes report similar or larger values. If they do, and those readings fall within the same time range, this indicates an interesting event in the physical world.
Otherwise, it is likely to be erroneous data. In our work, we assume that a sensor node x is a neighbor of another node y if x is within y's communication range, and vice versa.

4. Experimental results and performance evaluation

In this section, we investigate the effectiveness of our proposed approach when applied to the real data set from the Grand-St-Bernard wireless sensor network [7]. We compare the accuracy of our algorithm with another detection method, STGOD [15], which is based on the spatial-temporal correlation among neighboring nodes. We evaluate the accuracy and scalability of the proposed method against the STGOD method on a real data set.

4.1. Study area and data description

The proposed outlier detection technique described in Section 3 is applied to a realistic data set collected from 23 sensor nodes. These nodes are geographically distributed over the border between Switzerland and Italy and form two clusters. The small cluster, situated on the Italian border, contains the five sensor nodes from which our data set is obtained. Fig. 2 illustrates the geographical distribution of these nodes over the area in which they are deployed. The collected data represent temperature as the attribute of interest. Temperature values were measured over the period 06:00–14:00 on 30 September 2007. Fig. 3 plots the temperature measurements of all nodes in the small cluster (node25, node28, node29, node31, node32). The measurement tolerance of the deployed sensors is about ±0.3 °C.

Fig. 2 The small cluster (five nodes) of the Grand-St-Bernard deployment and the corresponding metric coordinates (E-N).

Fig. 3 Data measurements of each sensor node.

4.2. Results and performance evaluation

This section is devoted to evaluating the performance of the outlier detection technique proposed in Section 3. Two performance metrics are considered.
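Both metrics, the detection rate (correctly detected outliers over all outliers) and the false positive rate (normal points flagged as outliers over all normal points), reduce to simple counts over labeled data. The following Python sketch shows one way to compute them, assuming each data point carries a reference label and a detector decision as booleans (True = outlier). The function name is hypothetical.

```python
def detection_metrics(reference, detected):
    """Return (detection rate, false positive rate) as fractions.

    reference[i] is True if point i is an outlier according to the reference
    labelling; detected[i] is True if the detector flagged point i.
    """
    if len(reference) != len(detected):
        raise ValueError("label sequences must have equal length")
    true_pos = sum(1 for r, d in zip(reference, detected) if r and d)
    false_pos = sum(1 for r, d in zip(reference, detected) if not r and d)
    n_outliers = sum(1 for r in reference if r)
    n_normal = len(reference) - n_outliers
    # Undefined ratios (no outliers or no normal points) are reported as NaN
    dr = true_pos / n_outliers if n_outliers else float("nan")
    fpr = false_pos / n_normal if n_normal else float("nan")
    return dr, fpr


if __name__ == "__main__":
    reference = [True, True, False, False, False, False]
    detected = [True, False, True, False, False, False]
    dr, fpr = detection_metrics(reference, detected)
    print(f"DR = {100 * dr:.1f} %, FPR = {100 * fpr:.1f} %")
```

Because both ratios depend on the reference labels, the choice of labeling technique directly affects the reported values.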
the first is the detection rate (dr), defined as the ratio of the correctly detected outliers to the total number of outliers in a given data set. the second is the false positive alarm rate (fpr), defined as the ratio of normal data points incorrectly classified as outliers to the total number of normal data points. this section shows the outliers, the detection rate, and the false positive rate for each node. evaluating the performance of outlier detection requires a reference dataset. usually, labeling techniques are utilized to label sensor measurements and classify each data point as either normal or anomalous. the choice of the labeling technique strongly influences the evaluation of the outlier detection techniques. three labeling techniques are used in [15], i.e., running average-based, mahalanobis distance-based, and density-based; our research uses the first one, which fits the data set as in [15]. in this research two software components are applied, a statistical model and fuzzy logic in simulink, both implemented in matlab. fig. 4 and fig. 5 show the spatial temporal outliers in the univariate attribute (temperature) for node29 and node25: in node25 the detection rate is about 92% and the fpr is 10.4%, while in node29 the detection rate is 93.75% and the fpr is high at 18.33%.
fig. 4 spatial temporal outliers in node29 detected by (stodm)
fig. 5 spatial temporal outliers in node25 detected by (stodm)
as shown in fig. 6, fig. 7 and fig. 8, node28, node31, and node32 each have a high detection rate of 100%, with fpr of 9.16%, 10%, and 4.5%, respectively.
fig. 6 spatial temporal outliers in node28 detected by (stodm)
fig. 7 spatial temporal outliers in node31 detected by (stodm)
fig. 8 spatial temporal outliers in node32 detected by (stodm)
fig.
9 shows the result of the accuracy assessment for outliers detected by using the pattern approach. the highest detection rate (100%) is at nodes 28, 31 and 32, while the lowest detection rate (92%) is at node 25. the lowest fpr is at node 32 (4.5%), while the highest is at node 29 (18.33%).
fig. 9 accuracy of the detected outliers at different nodes
extensive experimentation on the collected data set shows that both the detection rate and the fpr increase when the threshold is decreased. using a fixed threshold for temporal similarity, and the mean euclidean distance over all nodes as the threshold for spatial similarity, yields an average detection rate of 97.15% and an fpr of 10.472%. the relatively high fpr is a result of misclassifications of some normal observations, while the high detection rate is a result of considering spatial temporal similarity. table 1 compares the proposed stodm with the tsod and stgod techniques, evaluated with the most frequently used data labeling technique, in terms of the detection rate and false positive alarm rate achieved by each algorithm. it can be observed that the proposed algorithm outperforms these techniques in terms of detection rate. both reference models are applied to the same data set as considered in our model. another advantage of the proposed technique is that it is able to distinguish between errors and events in a given data set obtained from the sensor node. the classification of the outlier source is reported in table 2.
table 1 comparison between our approach (stodm) and the tsod and stgod models of [15], with running-average labeling
method    dr (%)    fpr (%)
stodm     97.15     10.4
tsod      23.4      1.7
stgod     72.34     10.94
table 2 number of outliers and events detected at different nodes using stodm (our model)
node      no. of outliers    no. of events
node25    48                 5
node28    23                 4
node29    60                 5
node31    25                 5
node32    21                 4
5.
conclusions
the stodm algorithm proposed in this paper combines fuzzy logic theory and distance-based similarity to detect outliers, and is a new attempt in the area of outlier detection based on spatial temporal similarity. the proposed technique is able to identify normal and outlier data. moreover, errors and events are also distinguished. a high detection rate is achieved compared to conventional techniques while preserving a low false positive alarm rate and also reducing computational complexity, because it uses euclidean distance to calculate spatial similarity among neighboring nodes. for future work, we plan to build an algorithm to detect outliers in multiple attributes and to consider dependencies among the attributes of the sensor data as well as the spatial-temporal correlations that exist among the observations of neighboring sensor nodes.
references
[1] s. kamal, r. ramadan, f. el-refai, "smart outlier detection of wireless sensor network by fuzzy logic", in proceedings of the international conference on recent advances in computer systems racs-2015, hail university, saudi arabia, november 2015.
[2] y. zhang, m. nirvana, h. paul, "outlier detection techniques for wireless sensor networks: a survey", university of twente, p.o. box 217, 7500 ae enschede, the netherlands, 2010.
[3] v. chandola, a. banerjee, v. kumar, "outlier detection: a survey", technical report, university of minnesota, 2007.
[4] v. jha, o. veer singh y., "outlier detection techniques and cleaning of data for wireless sensor networks: a survey", international journal of computer science and technology, 2012.
[5] x. luo, m. dong, y. huang, "on distributed fault-tolerant detection in wireless sensor networks", ieee trans. computers, vol. 55, no. 1, pp. 58-70, 2006.
[6] h. konak, a. dilaver, e. ozturk, "the effects of observation plan and precision on the duration of outlier detection and fuzzy logic: a real network application", survey review, vol.
38, no. 298, pp. 331-341, 2005.
[7] sensor scope system. http://sensorscope.ep.ch/index.php/main page
[8] s. subramaniam, t. palpanas, d. papadopoulos, v. kalogeraki, d. gunopulos, "online outlier detection in sensor data using nonparametric models", in proceedings of vldb, seoul, korea, pp. 187-198, 2006.
[9] s. rajasegarar, c. leckie, m. palaniswami, j. c. bezdek, "distributed anomaly detection in wireless sensor networks", uk: ieee, iccs, pp. 12-16, 2006.
[10] j. branch, b. szymanski, c. giannella, r. wolf, "in-network outlier detection in wireless sensor networks", in proceedings of ieee icdcs, 2006.
[11] s. rajasegarar, c. leckie, m. palaniswami, j. c. bezdek, "quarter sphere based distributed anomaly detection in wireless sensor networks", in proceedings of ieee international conference on communications, pp. 3864-3869, 2007.
[12] y. zhang, n. meratnia, p.j.m. havinga, "an online outlier detection technique for wireless sensor networks", in proceedings of the third ieee european conference on smart sensing and context (eurossc), pp. 25-26, 2008.
[13] y. zhang, n. meratnia, p.j.m. havinga, "adaptive and online one-class support vector machine-based outlier detection techniques for wireless sensor networks", in proceedings of the ieee 23rd international conference on advanced information networking and applications workshops/symposia, pp. 990-995, 2009.
[14] m.s. mohamed, t. kavitha, "outlier detection using support vector machine in wireless sensor network real time data", int. j. soft comput. eng., vol. 1, no. 2, 2011.
[15] y. zhang, n.a.s. hamm, n. meratnia, a. stein, m. van de voort, p.j.m. havinga, "statistics-based outlier detection for wireless sensor networks", international journal of geographical information science, 2012.
[16] a. amidi, n.a.s. hamm, n.
meratnia, "wireless sensor networks and fusion of contextual information for weather outlier detection", international archives of the photogrammetry, remote sensing and spatial information sciences, vol. xl-1/w3, 2013.
[17] a. fawzy, h.m.o. mokhtar, o. hegazy, "outliers detection and classification in wireless sensor networks", egyptian informatics journal, vol. 14, pp. 157-164, 2013.
[18] s. syed, m.e. cannon, "fuzzy logic-based map matching algorithm for vehicle navigation system in urban canyons", in proceedings of the ion national technical meeting, san diego, ca, pp. 26-28, 2004.
[19] y. sisman, a. dilaver, s. bektas, "outlier detection in 3d coordinate transformation with fuzzy logic", acta montanistica slovaca, vol. 17, no. 1, pp. 1-8, 2012.
instruction facta universitatis series: electronics and energetics vol. 29, no 2, june 2016, pp. 309-323 doi: 10.2298/fuee1602309s
in-channel misrouting suppression technique for deflection-routed networks on chip
igor z. stojanovic, goran lj. djordjevic
faculty of electronic engineering, university of niš, serbia
abstract. deflection routing, where port-contentions in routers are resolved by intentionally misrouting some packets along unwanted directions instead of storing them, has been proposed as a promising approach for improving power and area efficiency of large-scale networks on chip (nocs). however, at high network load, when packets are misrouted more frequently, the cost and energy benefits of this simple routing scheme are offset by the performance degradation. to address this problem, we propose a technique that uses small in-channel buffers to capture some of the deflected packets before they take a misrouting hop. the captured packets are then looped-back to the routers where they suffered deflection and routed again.
to improve the efficiency of this in-channel misrouting suppression scheme, we also slightly modify the routing function of the deflection router by restricting the choice of productive directions for misrouted packets. evaluations on synthetic traffic patterns show that the proposed misrouting suppression mechanism yields an improvement of 36.2% in network saturation throughput when implemented into the conventional deflection-routed network.
key words: network-on-chip, multi-core, deflection routing, misrouting suppression.
received june 3, 2015; received in revised form august 3, 2015
corresponding author: igor.stojanovic@elfak.ni.ac.rs faculty of electronic engineering, university of niš, a. medvedeva 14, 18000 niš, serbia (e-mail: mita@iritel.com)
1. introduction
network-on-chip (noc) has been proposed as an efficient and scalable solution to the challenging on-chip interconnection problems in modern many-core systems on chip (socs). to accommodate the communication needs of tens or even hundreds of processing elements (pes) integrated on a single chip, this architecture employs dedicated routers interconnected by some form of network topology. nocs typically use wormhole routing with virtual channel (wormhole/vc) flow control to route data packets from the source to the destination pe. this flow control scheme enables deadlock avoidance, optimizes channel utilization, improves performance and provides quality of service [1, 2]. although wormhole/vc routing needs considerably less buffer storage than other traditional flow control schemes (e.g. virtual cut-through and store-and-forward), the in-router buffers are still a significant source of area and energy overhead. for a static random access memory (sram) buffer implementation, the input buffers can consume 46% of the total on-chip network power while occupying 17% of the total area [3].
to address the issue, several bufferless noc architectures have recently been proposed. in these architectures, in-router buffers are removed and contentions among packets are handled by employing the deflection routing [4-13]. with deflection routing, data packets are divided into flits (flow control units) which are then routed independently through the network and reassembled at their destination. flits arrive synchronously on the router’s input ports, and each flit is routed via the output port that offers the shortest path to its destination. when two incoming flits require the same output port, the router deflects one of the flits to an alternative output port (this is always possible as long as the router has as many outgoing as incoming ports). in this way, port contentions cause flits to be misrouted temporarily, in contrast with the wormhole/vc scheme where such flits must be buffered. deflection routing has several advantages over wormhole/vc scheme. first, since the number of incoming ports is equal to the number of outgoing ports, and flits move between routers synchronously, deadlock cannot occur. the adaptive nature of deflection routing also enables hot spots avoidance and provides fault-tolerance in the network [4]. this approach also eliminates the need for backward status links to implement flow control, and thus the design of the router is greatly simplified. finally, the deflection routing permits the use of as few as one flit-wide register per inter-router link, thereby realizing significant savings in hardware cost and power consumption over wormhole/vc nocs, which must provide ample buffers in each router. recent studies have shown that in the deflection-routed nocs, the power consumption is reduced by 20-40%, and the router area on die is reduced by 40-75% [6]. deflection routers target mainly low-latency operation at low network load [5]. 
under such load conditions, deflections are rare so that flits rapidly advance toward their destinations over shortest paths. on the other hand, under high load, frequent deflections might cause flits to deviate significantly from their shortest paths, leading to early saturation and poor energy efficiency. the issue of limited maximum throughput of deflection-routed networks has been addressed by several prior works. one line of research is aimed at improving the design of the router’s port allocator and switching (pas) stage. within this stage, input flits are first permuted and then passed to the router’s output ports so that as many flits as possible are directed toward their desired directions. the bless router uses a pas stage composed of a 4×4 crossbar switch controlled by an allocator unit that arbitrates the flits to output ports based on an oldest-first arbitration policy [6]. the full priority ordering of flits results in fewer deflections, but it incurs a long critical path delay, thus limiting router operation to low clock frequencies. the chipper router speeds up the critical path of the router by replacing the crossbar with a two-stage permutation network composed of four independently controlled 2×2 switch modules [7]. however, the simplicity of this design results in an increased deflection rate, and consequently lowers the maximum network throughput. another line of research deals with techniques for reducing the overhead of flit deflection. such misrouting suppression mechanisms try to prevent a deflected flit from taking a misrouting hop by temporarily holding the flit at its current route position. the minimally buffered deflection router (minbd) achieves misrouting suppression by means of a small side-buffer attached between the output and the input of the router’s pas stage [8].
at each clock cycle, the side-buffer can accept up to one of the deflected flits from the pas output, and resubmit that flit to the pas input at some later cycle. by preventing a fraction of deflected flits from leaving the router, this technique significantly improves the maximum network throughput. however, it also introduces contention between the buffered flits and the new flits waiting for injection, which can cause injection unfairness among routers in a highly loaded network. in our previous work, we proposed an in-channel misrouting suppression technique, referred to as the dual-mode channel, which uses a lightweight link-control mechanism to force deflected flits, when possible, to loop-back to their current routers instead of being misrouted [9]. this simple and effective method improves performance without compromising the injection fairness, but the obtained maximum network throughput is lower than that obtained with the side-buffering technique. in this paper, we further improve the misrouting suppression efficiency of the dual-mode channel by adding small buffers at both ends of the channel. these buffers temporarily store deflected flits that cannot be looped-back during the same clock cycle in which they enter the channel. also, we slightly modify the routing function of the baseline deflection router to remove the tendency of misrouted flits to take immediate reverse hops. this modification is motivated by our observation that such hops have an adverse effect on how often the channel is able to loop-back the deflected flits. when combined, the proposed mechanisms suppress more than 50% of misrouting hops, raising the maximum throughput by 36.2% with respect to the baseline deflection-routed network. the throughput improvement is 8.7% higher than with the side-buffering technique, and is achieved without compromising the injection fairness in the network.
the remainder of the paper is organized as follows. section 2 provides a background on deflection routing including the overview of two representative misrouting suppression techniques: the side-buffering and the dual-mode channel. section 3 presents the novel misrouting suppression scheme for deflection-routed nocs. in section 4, evaluation and results are presented. section 5 concludes this paper.
2. deflection-routed noc architecture overview
in this section, we first provide a generic model of deflection-routed noc architecture, which includes only the essential features reported in several previous proposals [5-13]. in particular, we consider a network of 2d mesh topology composed of non-pipelined (i.e. combinational) deflection routers connected by synchronous bidirectional communication channels. then we also discuss two existing techniques to improve the performance of the baseline deflection-routed network via misrouting suppression.
2.1. baseline 2d mesh deflection network
figure 1 illustrates the fundamental elements of a generic 2d mesh deflection-routed noc. the noc is constructed as a grid of routers where each router is connected by bidirectional communication channels only to its neighbors. each router is also connected to a local pe, which serves as a source and sink for data packets. before being injected to the router, packets are split into smaller flow control units, so called flits, and each flit is routed independently through the network. in the most basic form, the deflection router is a pure combinational logic module, which directs the incoming flits from the input ports to the proper output ports. the inter-router communication channel includes a pair of oppositely oriented flit-wide edge-triggered registers. since there are no in-router buffers, these so-called flit-registers are the only memory elements for storing flits in transit.
therefore, while traveling towards their destinations, flits are always on the move, hopping between the flit-registers and propagating through the routers.
fig. 1 2d mesh deflection-routed noc architecture
routers attempt to route each flit along a shortest path to its destination. a router forwards a flit through a productive output port in a productive direction if the distance between the current flit position and its destination decreases. in a 2d mesh network, when a flit reaches a router, there are at most two productive directions (i.e. output ports) to its destination. if the router is not able to grant the productive output port, the flit is deflected to any free but non-productive output port. deflection occurs within the internal router structure when multiple incoming flits contend for the same output port. on the other hand, the term misrouting refers to an external manifestation of the flit deflection. it corresponds to a transfer of a deflected flit over the inter-router channel one hop further in a non-productive direction. the cost of misrouting is two clock cycles since each non-productive hop must be compensated by one productive hop in the opposite direction. note that in the baseline deflection-routed network, every flit deflection leads to a flit misrouting.
fig. 2 architecture of baseline deflection router: a) internal structure, and b) pas based on permutation network
figure 2a shows the architecture of the deflection router with four pairs of input and output network ports (denoted as n for north, s for south, w for west and e for east) and a pair of inject and eject ports (denoted as pin and pout) which are connected to the local pe.
the router is composed of four consecutive stages: the routing stage (r), the eject stage (e), the inject stage (i), and the port allocation and switching stage (pas). through these stages, four internal flit-channels, c1, ..., c4, are established to guide flits from the set of input ports to the set of output ports. the routing stage associates a set of productive ports to each incoming flit. the routing function is based on offsets in the x and y dimensions between the current router and the flit’s destination router. the number of productive ports assigned to a flit can be: 0 (the flit is addressed to the local pe, i.e. both the x- and y-offset are zero), 1 (the flit is already at one of the axes of its final destination, i.e. either the x- or y-offset is zero) or 2 (both the x- and y-offset are different from zero). the eject stage randomly picks one of the locally-addressed flits (if any), and directs that flit to the local pe. the inject stage detects the presence of a free flit-channel and directs the new flit (generated by the local pe) to that channel. if the new flit is not injected into the network because all flit-channels are occupied, then that flit remains in the pe’s transmission queue and is resubmitted in the next clock cycle. the pas stage permutes and passes the flits from the flit-channels to the output network ports. here, we adopt the pas stage introduced in the chipper router [7], which consists of four two-input switch modules arranged into two stages (fig. 2b). each switch module is controlled by an arbitration logic which first decides the winner between two flits, and then sends the winning flit toward its productive output port. the losing flit is directed to the other output of the module. the winner between two input flits is determined according to the silver-flit arbitration policy [8]. in this arbitration scheme, a single randomly selected flit is designated as a silver flit, i.e. it is prioritized above the others. the silver flit always wins in arbitration.
the winner between any two non-silver flits is decided randomly.
2.2. misrouting suppression techniques
the term misrouting suppression refers to any technique for reducing the delay overhead incurred by flit deflection in deflection-routed networks [9]. these mechanisms cannot cancel flit deflection, which occurs within the pas stage of the router, but they can recognize a deflected flit and force it to temporarily stay at its current route position instead of making a non-productive hop. misrouting suppression can be implemented either within the deflection router or within the inter-router communication channel.
fig. 3 misrouting suppression techniques: a) in-router misrouting suppression with side-buffer, and b) in-channel misrouting suppression with dual-mode channel
side-buffering. side-buffering [8] is an in-router misrouting suppression technique which uses a small buffer memory (the so-called side buffer) attached to each router to buffer some deflected flits that otherwise would be misrouted. the side buffer can be implemented either as a single flit-register or as a small-size fifo (composed of several flit-registers). as shown in fig. 3a, the side buffer (sb) is attached to the deflection router via two additional stages: the buffer-eject stage (be) and the buffer-inject stage (bi). the be stage recognizes deflected flits at the output of the pas stage, and puts one of them into the side buffer if the side buffer is not full. this flit is picked randomly among the deflected flits. the buffered flit will be re-injected through the bi stage in some later clock cycle, when there is a free flit-channel after flit ejection. previous studies have shown that even adding the smallest side-buffer (1-flit in size) can reduce the misrouting rate by 50%, and can improve the maximum network throughput by 26% [8].
however, the studies have also shown that the performance improvement of this technique does not scale with increasing side-buffer size, because increasing the buffer size over 2 flits leads to only marginal performance gain. more importantly, as pointed out in [9], the presence of side buffers can cause an imbalance between the injection and ejection bandwidth available to pes in the areas of the network congested with in-transit traffic. this occurs because of the arrangement of stages within the side-buffered deflection router, which gives injection precedence to the flit residing in the side-buffer over the new flit waiting at the pe inject port. when the router is overloaded with in-transit flits, a free flit-channel appears rarely and is occupied by the buffered flit in most cases, leaving the new flit to wait for another chance.
dual-mode channel. the dual-mode channel is an in-channel misrouting suppression technique which prevents some non-productive network hops by forcing deflected flits, when possible, to loop-back to their current routers instead of being misrouted [9]. the datapath for this design is shown in fig. 3b. the approach is based on enhancing the inter-router communication channel with the capability to dynamically (i.e. on a cycle-by-cycle basis) switch between two modes of operation. if deflected flits are present on both ends of the channel, or one flit is deflected and the other one is absent, then the channel activates the loop-back mode (indicated by dotted lines in fig. 3b). in this mode, the flits are returned back to the corresponding input ports of their current routers. otherwise, the channel is configured in the straight-through mode (indicated by dashed lines), allowing both flits to make one network hop. with this scheme, a deflected flit will be misrouted only if there is a productively-routed flit on the opposite side of the channel. in all other cases, the deflected flit will stay at its current route position.
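the mode selection of the dual-mode channel can be illustrated with a minimal python sketch. this is only a behavioural model written for this text, not the authors' hardware design: the `Flit` record, the function name `dual_mode_step`, and the convention that an absent flit is `None` are assumptions introduced here. `fra` and `frb` stand for the flit-registers that feed routers a and b in the next cycle.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Flit:
    """Hypothetical flit record: only the routing status matters here."""
    ident: str
    deflected: bool  # True if deflected in the PAS stage, False if productively routed


def dual_mode_step(
    fa: Optional[Flit], fb: Optional[Flit]
) -> Tuple[str, Optional[Flit], Optional[Flit]]:
    """Return (mode, next_fra, next_frb) for one clock cycle.

    fa / fb are the flits leaving router a / router b into the channel
    (None means no flit). Loop-back is chosen when both flits are
    deflected, or when one is deflected and the other is absent.
    """
    a_defl = fa is not None and fa.deflected
    b_defl = fb is not None and fb.deflected
    loop_back = (a_defl and b_defl) or (a_defl and fb is None) or (b_defl and fa is None)
    if loop_back:
        # Each flit returns to the input port of its current router.
        return "loop-back", fa, fb
    # Otherwise both flits (if present) make one network hop.
    return "straight-through", fb, fa
```

in this sketch a deflected flit ends up in the opposite flit-register (i.e. is misrouted) only when the flit on the other side is productively routed, which matches the behaviour described above.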
it is important to note that the loop-back mechanism is transparent for productively-routed flits, which flow as in a network with conventional inter-router channels. our previous simulation results show that this simple in-channel misrouting suppression mechanism offers a 14.3% performance improvement in terms of maximum network throughput when implemented in the baseline deflection-routed noc [9]. the improvement is smaller than that of the side-buffering technique, but it is accomplished with lower implementation cost (i.e. there is no need for additional buffer memory) and without any modification to the underlying router microarchitecture. an important advantage of the dual-mode channel approach over side-buffering is that it preserves the injection fairness in the network.
3. misrouting suppression with in-channel buffering
the limited misrouting suppression efficiency of the dual-mode channel is a consequence of the fact that the channel cannot save a deflected flit from misrouting if a productively-routed flit is present on the opposite end of the channel. under high traffic, when the inter-router channels are almost fully utilized, the loop-back mode can only be activated when both ends of the channel are occupied by deflected flits, which occurs rarely. in this section we propose two techniques to mitigate the performance limitation of the dual-mode channel. the first one modifies the routing function of the baseline deflection router with the goal of increasing the frequency of simultaneous appearance of deflected flits at both sides of the channel. the second technique adds a small in-channel buffer memory for temporarily storing deflected flits that cannot be looped-back immediately.
3.1.
optimized routing function
according to the results of our simulation experiments with 2d mesh deflection networks under saturated load with a uniform random traffic pattern, a deflected flit appears at a router’s output port with the probability of δ = 0.3. assuming that flit deflections occur in neighboring routers independently, the probability that both sides of an inter-router channel are fed with deflected flits should be δ² = 0.09. however, the simulation results show that this probability is actually 0.05. that is, the loop-backs in dual-mode channels occur less frequently than would be expected. a closer examination of the patterns of inter-router communication reveals that the discrepancy between the expected and the measured loop-back probability is caused by the tendency of the misrouted flits to return back to the routers wherein they have suffered deflection during the previous clock cycle. suppose that a flit f is deflected in a router a and then misrouted to a router b over channel cab. upon arriving at router b, the flit f is assigned at most two productive ports. because flit f is misrouted, one of its productive ports must be the port through which it has just entered the router b. therefore, during the next clock cycle, there is a high chance that flit f will be returned back to the router a over the channel cab, but now as a productively-routed flit, thus forcing the straight-through configuration of the dual-mode channel. if it happens that router a sends a deflected flit to channel cab during the next clock cycle, that flit will be misrouted, too. thus, the net effect of such behavior is that the likelihood of flit misrouting depends on whether a flit sent by the same router over the same channel during the previous clock cycle was misrouted or not. in order to resolve this performance issue, we slightly modify the routing function of the baseline deflection router by restricting the choice of productive ports for misrouted flits.
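the restriction on productive ports, formalized as rule 1 immediately below, can be illustrated with a short python sketch. it is a simplified model under assumptions that are not taken from the paper: routers are addressed by (x, y) coordinates, port e leads toward larger x and port n toward larger y, and a flit that entered through input port t would leave toward the same neighbor through output port t. the function names `productive_ports` and `apply_rule1` are hypothetical.

```python
from typing import Set, Tuple

Coord = Tuple[int, int]


def productive_ports(cur: Coord, dst: Coord) -> Set[str]:
    """Productive output ports from x/y offsets (assumed orientation:
    E increases x, N increases y). Returns 0, 1 or 2 ports."""
    (cx, cy), (dx, dy) = cur, dst
    ports: Set[str] = set()
    if dx > cx:
        ports.add("E")
    elif dx < cx:
        ports.add("W")
    if dy > cy:
        ports.add("N")
    elif dy < cy:
        ports.add("S")
    return ports


def apply_rule1(productive: Set[str], in_port: str) -> Set[str]:
    """Restrict the productive set: if it has two ports, drop the port
    through which the flit entered; otherwise leave it unchanged."""
    if len(productive) == 2:
        return productive - {in_port}
    return set(productive)


# A flit at router (2, 2) heading to (0, 3) has productive ports W and N.
# If it arrived through port W (a misrouting hop from the west neighbor),
# only N remains, so it is not intentionally sent straight back.
restricted = apply_rule1(productive_ports((2, 2), (0, 3)), "W")
```

if the flit arrived by a productive hop, its input port is not in the productive set and the restriction changes nothing; if the input port is the only productive option, it is kept.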
in particular, we extend the routing function of the deflection router with the following rule:
rule 1: let flit f enter a router a through the input port t ∈ {n, s, e, w}, and let p ⊆ {n, s, e, w} be the set of productive output ports for flit f in router a. if the size of p is two, then remove t from p.
rule 1 only impacts the implementation of the routing stage of a deflection router (fig. 3a). it is applied after the incoming flits are assigned productive ports. if flit f has reached router a by a productive hop, then rule 1 has no effect on the routing decision regarding f because t cannot be in p. otherwise, if flit f has arrived at router a by a misrouting hop, then port t must be in p. in this case, port t will be preserved in p if t is the only productive option for f. otherwise, t is removed from p. without t in the set of its productive ports, flit f will not be intentionally returned back to the previous router, unless it is deflected within the pas stage of router a. it should be noted that rule 1 does not preclude the possibility that a misrouted flit will be returned back to the previous router; it only decreases the likelihood of such an event.
3.2. in-channel buffering
the main motivation for using the in-channel buffers is to decouple the operations of the two sides of the dual-mode channel by enabling each side to buffer incoming deflected flits which cannot be looped-back immediately. thus, instead of being misrouted to a neighboring router, the buffered deflected flit will be kept at its current route position until the condition for looping-back is met. when eventually looped-back to the router that has caused its deflection, the flit will get a new chance to continue traveling along a productive direction toward its destination. the datapath of the proposed inter-router channel with in-built flit-buffers is shown in fig. 4a. in comparison to the dual-mode channel (fig.
3(b)), the buffered channel contains two additional small-sized fifo sections placed in parallel with the direct loop-back paths. with the fifos included, the dual-mode channel gains several new options for handling the incoming and buffered flits. as indicated by dotted lines in fig. 4a, the buffered channel can carry out one or more of ten different flit-transfer actions during each clock cycle. the choice of actions depends on the routing statuses of the incoming flits as well as on the statuses of the two fifos. the first set of options transfers an incoming flit straight through to the flit-register on the opposite side of the channel. if the incoming flit is productively-routed, this action leads to a productive hop (actions labelled 1a/1b); otherwise, if the flit is deflected, the straight-through transfer causes a misrouting hop (2a/2b). the second set of options comprises those that keep an incoming flit on the same side of the channel. the flit loop-back action (3a/3b) allows an incoming flit to bypass the fifo and immediately reach the flit-register (fra/frb) on the same side of the channel. the incoming flit can also be buffered (4a/4b), and a buffered flit can be looped back (5a/5b). a c-like pseudo code describing the operation of the dual-mode channel with in-built buffers is shown in fig. 4b. consider the operation of the a-side part of the channel in more detail; the b-side part operates analogously. the a-side part of the channel can be configured in either the straight-through or the loop-back mode. the straight-through mode moves the opposite-side flit, fb, into the a-side flit-register, fra. in the loop-back mode, either the a-side incoming flit, fa, or the flit taken from fifoa is written into fra. the straight-through mode is prioritized over the loop-back mode, and occurs in two distinct situations: when the flit fb is productively-routed (1b), and when the flit fb is deflected and must be misrouted (2b).
the deflected flit fb is misrouted if there are no other options for handling that flit, i.e. the loop-back path of the b-side is blocked by a productively-routed flit fa and fifob is full. even if the a-side part of the channel is configured in the straight-through mode, a deflected flit fa can still be saved from misrouting by storing it in fifoa, provided that fifoa is not full (4a). if the a-side loop-back path is enabled, the flit-register fra receives either a flit from fifoa (5a), if fifoa is not empty, or the incoming flit fa (3a), if that flit is deflected and fifoa is empty. in the case of the buffered loop-back action (5a), the incoming flit fa, if deflected, is written into fifoa (4a). it should be noted that a situation where both incoming flits are misrouted is not possible with this scheme. the critical case is the one where both fifos are full and both incoming flits, fa and fb, are deflected. according to the algorithm, in this case both sides of the channel activate the buffered loop-back operation (5a/5b), which enables buffering of both flits (4a/4b) regardless of the current fifo statuses.

[fig. 4a: datapath of the buffered channel between router a (side a) and router b (side b), showing the incoming flits fa/fb, the flit-registers fra/frb, the buffers fifoa/fifob, and, for each side, the actions (1) productive hop, (2) misrouting hop, (3) flit loop-back, (4) buffering, (5) buffered loop-back]

[fig. 4b: pseudo code]

side a:
    if (fb.p || (fb.n && fa.p && fifob.full)) {
        fra ← fb;                      /* 1b/2b */
        if (fa.n && !fifoa.full) {
            fifoa ← fa;                /* 4a */
        }
    } else if (!fifoa.empty) {
        fra ← fifoa;                   /* 5a */
        if (fa.n) {
            fifoa ← fa;                /* 4a */
        }
    } else if (fa.n) {
        fra ← fa;                      /* 3a */
    }

side b:
    if (fa.p || (fa.n && fb.p && fifoa.full)) {
        frb ← fa;                      /* 1a/2a */
        if (fb.n && !fifob.full) {
            fifob ← fb;                /* 4b */
        }
    } else if (!fifob.empty) {
        frb ← fifob;                   /* 5b */
        if (fb.n) {
            fifob ← fb;                /* 4b */
        }
    } else if (fb.n) {
        frb ← fb;                      /* 3b */
    }

fig. 4 misrouting suppression with in-channel buffering: (a) datapath; (b) pseudo code. notice: f.p is true if flit f is productively-routed; f.n is true if flit f is deflected; sign "←" denotes a register transfer operation.
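to make the behavior of the pseudo code in fig. 4b easier to check, the following python sketch models one clock cycle of the buffered channel. it is only an illustration under stated assumptions: the names `Flit`, `channel_step` and `_side` are hypothetical, a missing flit is represented by `None`, and the fifo full/empty statuses are sampled at the start of the cycle so that both sides see the same values, which is how we read the register-transfer semantics of the figure.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Flit:
    name: str
    productive: bool  # True -> f.p (productively routed), False -> f.n (deflected)


def _is_p(f):
    return f is not None and f.productive


def _is_n(f):
    return f is not None and not f.productive


def _side(own, opp, fifo, own_full, opp_full, tag, opp_tag):
    """One side of the channel; returns (flit written to this side's register, actions)."""
    actions, reg = [], None
    if _is_p(opp) or (_is_n(opp) and _is_p(own) and opp_full):
        reg = opp                                   # straight-through: 1x or 2x
        actions.append(("1" if opp.productive else "2") + opp_tag)
        if _is_n(own) and not own_full:
            fifo.append(own)                        # 4x: buffer own deflected flit
            actions.append("4" + tag)
    elif fifo:
        reg = fifo.popleft()                        # 5x: buffered loop-back
        actions.append("5" + tag)
        if _is_n(own):
            fifo.append(own)                        # 4x: slot was just freed
            actions.append("4" + tag)
    elif _is_n(own):
        reg = own                                   # 3x: direct loop-back
        actions.append("3" + tag)
    return reg, actions


def channel_step(fa, fb, fifo_a, fifo_b, depth=1):
    """Advance the channel by one cycle; returns (fra, frb, list of action labels)."""
    a_full = len(fifo_a) >= depth                   # statuses sampled before any update
    b_full = len(fifo_b) >= depth
    fra, acts_a = _side(fa, fb, fifo_a, a_full, b_full, "a", "b")
    frb, acts_b = _side(fb, fa, fifo_b, b_full, a_full, "b", "a")
    return fra, frb, acts_a + acts_b
```

in this model, the critical case discussed above (both fifos full, both incoming flits deflected) yields the actions 5a, 4a, 5b, 4b and no misrouting hop, whereas a deflected fb facing a productively-routed fa and a full fifob yields 2b and 1a, i.e. fb is misrouted.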
in-channel buffering vs. side-buffering. the rationale for using in-channel buffering is similar to that for using side-buffering: to buffer some deflected flits that would otherwise be misrouted. in contrast to side-buffering, which picks and buffers deflected flits before they leave the router, the in-channel buffers store deflected flits that have entered the inter-router channel but cannot be looped back immediately. placing buffers within the channels instead of within the routers brings the following advantages. as opposed to side-buffering, which can accept up to one deflected flit per router per clock cycle, the buffered dual-mode channel can loop back or store up to two deflected flits in each clock cycle. in a 2d mesh network of dimension n × n, the number of routers is n² and the number of inter-router channels is 2n² − 2n. because the number of inter-router channels is almost two times greater than the number of routers, the opportunities to capture deflected flits are more frequent with in-channel buffering than with side-buffering. moreover, being stored outside the routers, the flits buffered in the in-channel fifos re-enter the routers via network ports, and consequently they do not block new flits generated by the pe from entering the router. in this way, the problem of injection unfairness is avoided. the minimum delay overhead of a deflected flit which is buffered in an in-channel fifo is two clock cycles: the first cycle is used for buffering, and the second one for looping back the buffered flit. although the delay overhead is the same as in the case of misrouting, in-channel buffering is still beneficial since the buffered flit does not occupy the resources of the neighboring router. 4.
performance evaluations

in order to evaluate the performance impact of the proposed misrouting suppression technique, we have developed a discrete-event, cycle-accurate simulator for modeling deflection-routed nocs using systemc [14]. it supports experiments with deflection nocs under various options, such as network topology and size, router/channel architecture, buffer parameters, and traffic modelling. the simulator provides output performance metrics, such as latency, throughput, transport delay, and deflection rate, for a given set of choices. the main building blocks of the simulator are: 1) processing element, 2) deflection router, and 3) inter-router channel (irc). the processing element block generates and injects flits into the network according to the user-specified configuration, including the traffic pattern and injection rate. it is also responsible for ejecting flits at the destination endpoints and collecting the appropriate statistics. the router block mimics the behavior of the generic non-pipelined deflection router described in section 2. it can be configured in the bufferless mode (i.e. without side-buffer) or the buffered mode (with a side-buffer of configurable size). the configuration options for the irc block are the following: conventional channel (a pair of oppositely oriented flit-registers), dual-mode channel (see fig. 3b), and buffered channel (see fig. 4). the simulation results presented in this section are obtained for a 2d mesh network of 8 × 8 nodes. the default buffer size in buffered architectures was set to 1 flit. each simulation run was started with a warm-up period of 1,000 cycles followed by a measurement period of 20,000 cycles.

4.1. performance under saturation load

the first set of evaluations was carried out in a saturation mode under a uniform random traffic pattern. in this mode, the transmission queue of each pe is assumed to be always nonempty.
under such overloaded conditions, each pe injects a new flit into the network in every clock cycle in which a free flit-channel is available in the router inject stage. the injected flits are destined randomly to other pes with equal probability. a summary of the results is given in table 1.

table 1 comparison of saturation performance of baseline deflection-routed noc architecture and architectures with misrouting suppression support

                        th      td       h        R       r       e
baseline                0.265   13.216   13.216   0.298   0.298   0
side-buffering          0.332   11.016    8.696   0.295   0.143   51.5%
dual-mode channel       0.303   11.555   10.889   0.298   0.240   19.36%
in-channel buffering    0.361   14.541    8.144   0.305   0.145   52.3%

(R: deflection rate; r: misrouting rate)

the details of the performance measures reported in table 1 are as follows. the saturation throughput (th) is defined as the average number of flits received per pe per clock cycle. it is the single most important network-level performance indicator: measured under saturation load, it gives the absolute limit that the throughput of a deflection-routed network can reach. the transport delay (td) is the time, measured in clock cycles, elapsed from the instant when the source pe injects a flit into the network to the instant when the destination pe receives it. both the saturation throughput and the transport delay are correlated with the average hop count (h), which is defined as the average number of hops (i.e. channel traversals) a flit takes from source to destination. the average hop count accounts for both productive and non-productive (i.e. misrouting) inter-router hops. in networks where deflected flits are misrouted more often, the average hop count is larger, and consequently the transport delay is longer and the throughput is lower. deflections occur within the routers due to the inability of the pas stage to grant productive ports to all incoming flits.
the tendency of the pas stage to produce deflections is measured with the deflection rate (R), which is defined as $R = n_d / n_r$, where $n_d$ is the total number of deflected flits, and $n_r$ is the total number of flits that are processed by the pas stages of all routers during the simulation. similarly, the misrouting rate (r) is defined as $r = n_m / n_r$, where $n_m$ is the total number of flits that are misrouted after deflection. the baseline deflection-routed network misroutes every deflected flit, hence $r = R$. with a misrouting suppression mechanism implemented, not all deflected flits are misrouted. the misrouting suppression efficiency is defined as $e = \frac{R - r}{R} \times 100\%$. the results in table 1 show that the implementation of misrouting suppression techniques brings a significant improvement in saturation throughput over the baseline architecture. the dual-mode channel, as the simplest misrouting suppression technique, raises the throughput by 14.3% over the baseline, while the improvement reaches 25.3% for the side-buffering technique. the highest throughput of 0.361 flits/cycle is achieved with in-channel buffering, which represents an improvement of 36.2% over the baseline. in the baseline architecture, a flit takes 13.2 inter-router hops on average to reach its destination. misrouting suppression techniques decrease the average hop count (and thus increase the throughput) by temporarily holding some of the deflected flits at their current route positions. in this way, in the network with dual-mode channels, the average hop count is reduced by 2.33 hops with respect to the baseline, while a reduction of 4.52 hops is achieved with side-buffering. as expected, the lowest average hop count of 8.14 hops is achieved with in-channel buffering, which represents a decrease of 5.07 inter-router hops per flit (or 38.4%) with respect to the baseline.
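as a cross-check of the figures quoted above, the derived quantities can be recomputed from the table 1 entries. the python sketch below is only illustrative; the dictionary layout and the function names are assumptions, and the numbers are the rounded values printed in table 1. recomputing the suppression efficiency from the rounded rates reproduces the printed e values only to within about 0.2 percentage points, presumably because the table was produced from unrounded simulation data.

```python
# values copied from table 1: th, td, h, R (deflection rate), r (misrouting rate)
TABLE1 = {
    "baseline":             dict(th=0.265, td=13.216, h=13.216, R=0.298, r=0.298),
    "side-buffering":       dict(th=0.332, td=11.016, h=8.696,  R=0.295, r=0.143),
    "dual-mode channel":    dict(th=0.303, td=11.555, h=10.889, R=0.298, r=0.240),
    "in-channel buffering": dict(th=0.361, td=14.541, h=8.144,  R=0.305, r=0.145),
}


def suppression_efficiency(R, r):
    """e = (R - r) / R * 100, in percent."""
    return (R - r) / R * 100.0


def throughput_gain(th, th_base):
    """Relative throughput improvement over the baseline, in percent."""
    return (th / th_base - 1.0) * 100.0


def hop_reduction(h, h_base):
    """Absolute reduction of the average hop count with respect to the baseline."""
    return h_base - h


base = TABLE1["baseline"]
for name, row in TABLE1.items():
    print(
        f"{name:22s} "
        f"gain={throughput_gain(row['th'], base['th']):6.2f}%  "
        f"hops saved={hop_reduction(row['h'], base['h']):5.2f}  "
        f"e={suppression_efficiency(row['R'], row['r']):6.2f}%"
    )
```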
in the baseline architecture, the transport delay equals the average hop count because each hop (either productive or misrouting) takes one clock cycle. in networks with misrouting suppression support, the transport delay incurred by a flit is the sum of two components: the hop count and the time the flit spends blocked by the misrouting suppression mechanism. for example, each time the dual-mode channel activates the loop-back mode, it adds one clock cycle to the transport delay of the looped-back flits. however, since the loop-back saves two hops, the total transport delay is lower than in the baseline noc. in contrast to the dual-mode channel, deflected flits captured by the side-buffering or in-channel buffering mechanism may stay buffered at their current route positions for several clock cycles before they get a chance to make the next inter-router hop. a closer examination of the simulation statistics revealed that flits, while traveling toward their destinations, spend 2.32 clock cycles in the side-buffers on average, which is low enough to provide a 16.6% lower total transport delay than in the baseline network. on the other hand, with in-channel buffering the average buffer delay is 4.85 clock cycles. the high buffer delay is the reason why the transport delay with in-channel buffering is larger than in the baseline architecture, despite a significant reduction in hop count. note that in-channel buffering achieves a high saturation throughput even with a high transport delay. this is because buffered flits waiting to be looped back do not block other flits that could otherwise make forward progress. note also that the transport delay can be reduced by limiting the time (i.e. the number of clock cycles) that flits are allowed to spend in in-channel buffers: when the time limit is reached, the buffered flit is forced to loop back, regardless of the routing status of the flit arriving from the opposite side of the channel.
however, an inevitable consequence of such a buffering policy would be a reduction of the network throughput due to lower utilization of the in-channel buffers. for this reason, we have excluded this design option from further consideration. the results in table 1 do not show a significant difference in deflection rates between the baseline and the noc architectures with misrouting suppression support. this is because the same pas stage (i.e. permutation network with silver flit arbitration policy) is used in all investigated noc configurations. on the other hand, the misrouting rate depends not only on how often flits are deflected, but also on how efficiently the misrouting suppression mechanism prevents deflected flits from making misrouting hops. the side-buffering technique reduces the misrouting rate by preventing some of the deflected flits from leaving the router. in this way, 51.5% of misrouting hops are prevented. the dual-mode channel uses the loop-back mode to return some of the deflected flits to their current routers. with this strategy, the dual-mode channel succeeds in preventing about 19.36% of all deflected flits from making misrouting hops without adding extra buffers. by adding buffers to the dual-mode channels and optimizing the routing function of the deflection router, the proposed in-channel buffering technique reaches a misrouting suppression efficiency which is slightly higher than that of the side-buffering technique.

4.2. injection fairness

as emphasized in section 3, the arrangement of stages within the side-buffered deflection router may create injection unfairness in the network, in the sense that some pes get to transmit more flits than others. this phenomenon can be best observed in fig. 5a, which shows the distribution of the injection rate (i.e. the number of flits injected by each pe per clock cycle) over all pes in the side-buffered deflection noc under saturated load with a uniform traffic pattern.
as can be seen, the injection rate differences between pes are significant: while corner pes can inject their flits at almost every cycle, the pes in the middle of the mesh get a chance to inject a flit on every tenth cycle. as shown in fig. 5b, in-channel buffering provides an almost uniform injection rate distribution under the same load conditions. this advantage arises because in-channel buffering is transparent to the deflection router, which treats each incoming flit equally, regardless of whether the flit is looped back by the in-channel misrouting suppression logic or comes from a neighboring router.

fig. 5 injection rate distribution under saturation load in deflection-routed 2d mesh nocs with misrouting suppression support: a) side-buffering; b) in-channel buffering

4.3. sensitivity to buffer size

the second set of simulations deals with the impact of buffer size on the effectiveness of the side-buffering and in-channel buffering techniques. as can be observed from table 2, although increasing the buffer size improves the throughput and the misrouting suppression efficiency under saturated traffic load, this improvement is relatively small and rapidly saturates. doubling the buffer size from 1 flit to 2 flits increases the saturation throughput by only 2.71% for side-buffering, and by 4.15% for the in-channel buffering technique. in addition to the higher hardware cost, the price paid for this throughput improvement is a 10% longer transport delay for side-buffering, and a 28% longer one for in-channel buffering. a further increase of the buffer size increases the saturation throughput only marginally, while the transport delay continues to increase steadily. these results suggest that buffers larger than 1 flit increase hardware complexity and waste power without significant performance benefit.
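the diminishing returns described above can be re-derived from the entries of table 2. the python sketch below is a minimal illustration; the data layout and the helper names `pct_change` and `marginal_gains` are assumptions, and the values are the rounded figures printed in table 2.

```python
# (buffer size in flits) -> (th, td), copied from table 2
SIDE = {1: (0.332, 11.016), 2: (0.341, 12.126), 3: (0.344, 13.476), 4: (0.346, 14.915)}
IN_CHANNEL = {1: (0.361, 14.541), 2: (0.376, 18.613), 3: (0.382, 22.899), 4: (0.386, 27.201)}


def pct_change(new, old):
    """Relative change from old to new, in percent."""
    return (new / old - 1.0) * 100.0


def marginal_gains(table, column):
    """Percent change of one column (0 = th, 1 = td) for each added flit of buffer."""
    sizes = sorted(table)
    return [pct_change(table[b][column], table[a][column])
            for a, b in zip(sizes, sizes[1:])]


for name, table in (("side-buffering", SIDE), ("in-channel buffering", IN_CHANNEL)):
    th_steps = ", ".join(f"{g:.2f}%" for g in marginal_gains(table, 0))
    td_steps = ", ".join(f"{g:.1f}%" for g in marginal_gains(table, 1))
    print(f"{name}: throughput steps {th_steps}; delay steps {td_steps}")
```

for both techniques the throughput step shrinks with every added flit of buffer space, while each step still adds a sizeable amount of transport delay.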
table 2 comparison of saturation performance of baseline deflection-routed noc architecture and architectures with misrouting suppression support

                 side-buffering               in-channel buffering
buffer size      th      td       e           th      td       e
1 flit           0.332   11.016   51.5%       0.361   14.541   52.3%
2 flits          0.341   12.126   57.2%       0.376   18.613   58.6%
3 flits          0.344   13.476   59.2%       0.382   22.899   61.2%
4 flits          0.346   14.915   60.0%       0.386   27.201   62.4%

4.4. latency analysis

finally, we evaluate the impact of the different misrouting suppression schemes on the latency performance of the deflection-routed network. latency is defined as the time (in clock cycles) from the moment a flit is generated at the source pe until it arrives at the destination pe, including the time the flit spends in the source pe's transmission queue. in these simulations, each pe generates flits following a poisson distribution with mean rate λ (λ is also called the average flit injection rate of the noc). generated flits remain in the pe's transmission queue until they are successfully injected into the network. for each network configuration, the flit injection rate is varied from zero to the point at which the first transmission queue in the network becomes saturated.

fig. 6 latency comparison of baseline deflection noc architecture and architectures with misrouting suppression support under uniform traffic pattern

figure 6 contains the load-latency graph under a uniform traffic pattern. as observed, at low injection rates, deflection-routed networks with misrouting suppression support experience almost the same average flit latency as the baseline deflection network. this is because the network is free from congestion. however, as the load in the network increases, the effect of the adopted misrouting suppression technique becomes more visible. the graph in fig. 6 shows that the proposed in-channel buffering technique significantly improves the routing performance by providing low-latency communication at higher injection rates.
as can be observed in fig. 6, for every deflection scheme except the side-buffering technique, the maximum injection rate achieved closely matches the saturation throughput reported in table 1. this is because these schemes provide injection fairness, so that all transmission queues in the network become saturated at approximately the same injection rate. on the other hand, in the side-buffered deflection network, the transmission queues of pes in the middle area of the network become saturated at a much lower injection rate than those of boundary pes, leading to early saturation.

5. conclusions

in this paper, a misrouting suppression technique for deflection-routed networks on chip was presented. the presented technique avoids misrouting hops by looping back deflected flits or capturing them in small in-channel buffers immediately after they appear at a router's output ports. the efficiency of the technique is further improved by modifying the routing function of the deflection router so as to prevent misrouted flits from taking immediate reverse hops. the simulation results show that the proposed scheme improves the performance of the baseline deflection-routed noc by 36.2% in terms of saturation throughput. the results also show that misrouting suppression with in-channel buffering offers higher saturation throughput than with in-router buffering (i.e. side-buffering), although with a penalty in terms of hardware cost. moreover, the performance improvement is achieved without incurring the injection unfairness among network nodes that characterizes the side-buffering approach.

acknowledgement: this work was partially supported by the serbian ministry of science and technological development project no. tr-33035.

references
[1] w. j. dally, "virtual-channel flow control", ieee trans. parallel distributed syst., 1992, vol. 3, no. 2, pp. 194-205.
[2] t. bjerregaard, s.
mahadevan, "a survey of research and practices of network-on-chip", acm comput. surv., vol. 38, no. 1, 2006.
[3] a. kumar, p. kundu, a. singh, l. s. peh and n. jha, "a 4.6 tbits/s 3.6 ghz single-cycle noc router with a novel switch allocator in 65 nm cmos", in proc. of 25th international conference on computer design, iccd, 2007, pp. 63-70.
[4] a. kohler and m. radetzki, "fault-tolerant architecture and deflection routing for degradable noc switches", in proc. of the 3rd ieee international symposium on networks-on-chip, 2009, pp. 22-31.
[5] g. michelogiannakis, d. sanchez, w. j. dally, c. kozyrakis, "evaluating bufferless flow control for on-chip networks", in proc. of the 4th acm/ieee int. symposium on networks-on-chip, 2010, pp. 9-16.
[6] t. moscibroda and o. mutlu, "a case for bufferless routing in on-chip networks", in proc. of the 36th annual international symposium on computer architecture, acm, new york, 2009, pp. 196-207.
[7] c. fallin, c. craik and o. mutlu, "chipper: a low-complexity bufferless deflection router", in proc. of the 17th international symposium on high performance computer architecture (hpca), 2011, pp. 144-155.
[8] c. fallin, g. nazario, x. yu, k. chang, r. ausavarungnirun and o. mutlu, "minbd: minimally-buffered deflection routing for energy-efficient interconnect", in proc. of the 6th ieee/acm international symposium on networks on chip, 2012, pp. 1-10.
[9] i. z. stojanovic, m. d. jovanovic and g. lj. djordjevic, "dual-mode inter-router communication channel for deflection-routed networks-on-chip", the journal of supercomputing, springer us, published online: march 2015.
[10] y. li, k. mei, y. liu, n. zheng, yi xu, "ldbr: low-deflection bufferless router for cost-sensitive network-on-chip design", microprocessors and microsystems, 2014, vol. 38, no. 7, pp. 669-680.
[11] m. hayenga, "scarab: a single cycle adaptive routing and bufferless network", in proc.
of the 42nd annual ieee/acm international symposium on microarchitecture (micro-42), 2009, pp. 244-254.
[12] j. jose, b. nayak, k. kumar and muyam m, "debar: deflection based adaptive router with minimal buffering", in proc. of the design, automation & test in europe conference & exhibition (date), 2013, pp. 1583-1588.
[13] c. feng, j. li, z. lu, a. jantsch, m. zhang, "evaluation of deflection routing on various noc topologies", in proc. of ieee 9th international conference on asic (asicon 2011), pp. 163-166.
[14] open systemc initiative. systemc v2.1 language reference manual, 2005. http://www.systemc.org/

facta universitatis series: electronics and energetics vol. 28, n o 1, march 2015, pp. 113-122 doi: 10.2298/fuee1501113m

a mim capacitor study of dielectric charging for rf mems capacitive switches

loukas michalas, matroni koutsoureli, eleni papandreou, anestis gantis, george papaioannou
solid state section, physics department, national kapodistrian university of athens, panepistimiopolis zografos, athens 15784, greece

abstract. mim capacitors are considered equally important devices for the assessment of dielectric charging in rf mems capacitive switches. besides the obvious similarities between the down state condition of rf mems and mim capacitors, there are also some important differences. the paper aims to introduce a novel approach to the study of dielectric charging in mems with the aid of mim capacitors by combining experimental results obtained by the application of dc, charging transient and kelvin probe techniques. the strengths and weaknesses are discussed in conjunction with experimental results obtained on sinx based mim capacitors and mems capacitive switches fabricated under the same conditions.

key words: rf mems, mim, dielectric charging, reliability

1.
introduction

micro electro mechanical system (mems) capacitive switches have received considerable research attention over the last two decades, mainly due to their potential use in rf applications such as filters and antennas [1]. however, although they offer several advantages over their conventional semiconductor counterparts, such as lower power consumption and high linearity, their commercialization is still hindered by reliability issues mainly related to dielectric charging.

received july 28, 2014; received in revised form september 24, 2014. corresponding author: loukas michalas, solid state section, physics department, national kapodistrian university of athens, panepistimiopolis zografos, athens 15784, greece (e-mail: lmichal@phys.uoa.gr)

regarding dielectric charging, three major modes have been identified to date. two of them are usually referred to as contactless charging and arise either from the redistribution of internal charges and/or dipole orientation [2-4], or from the injection of electrons emitted by asperities at the bottom of the metal bridge due to field emission [5,6]. such effects occur in the non-contact areas that are formed between the dielectric film and the bridge in the down state due to surface roughness, but also in the up state before device actuation. the third and most thoroughly studied charging mode occurs with the bridge in the down state and is therefore referred to as contacted charging. in this case charges are injected from the metal electrodes in the presence of an electric field in the range of MV/cm. the commonly used dielectric, low temperature deposited sinx, contains a large amount of electrically active defects and may store these injected charges for times longer than 10^4 s [7].
the stored charges are responsible for several undesirable effects, such as a shift of the c-v characteristics, narrowing of the pull-in/pull-out windows and degradation of the ratio c_down/c_up, and finally lead to device failure due to bridge stiction [8,9]. contact mode dielectric charging has been assessed with a variety of experimental techniques and by several research groups worldwide over the last years [10]. beyond the studies directly applicable to rf mems switches, equally important research takes place on metal insulator metal (mim) capacitors. dielectric charging in mim capacitors has been investigated with the aid of charge/discharge current transients (cct/dct) [11], thermally stimulated depolarization/polarization currents (tsdc/tspc) [12,13] and kelvin probe force microscopy (kpfm) [14]. in addition, combinations of the above techniques have also been implemented in order to clarify the importance of the mim structure [15] and of the dielectric film deposition conditions for the characteristics of dielectric charging [16-18]. indeed, the study of mim capacitors has advantages, since a mim capacitor resembles the ideal down state condition of mems switches from the macroscopic point of view, and a structure without movable parts is also considerably easier to fabricate. however, it should always be taken into consideration that, besides some obvious similarities, the two devices also have some important differences. therefore, the results obtained from mim capacitors cannot be straightforwardly transferred to mems and, to date, the knowledge obtained from the study of mim capacitors only constitutes the physical background for understanding mems results. the present paper introduces a novel approach to the study of dielectric charging in mems with the aid of mim capacitors by combining experimental results obtained by the application of dc, charging transient and kelvin probe (kp) techniques.
the study extends our previously published work in [19] and adds some knowledge on how information obtained when well accepted characterization techniques are applied to mim capacitors may be transferred to the study, understanding and/or prediction of dielectric charging in mems. the experiments were performed on sinx based mim capacitors and mems capacitive switches fabricated under the same conditions.

2. experimental

both the mim capacitors and the mems capacitive switches investigated in the present work were fabricated under the same conditions, in order to have identical dielectric materials that allow comparative conclusions to be drawn. a standard lithography process on a high resistivity silicon wafer was used. the dielectric film is 250 nm thick sinx deposited at 300 °C with the pecvd technique. the same metal electrodes (au/ti/au) were used in both cases, while in the case of mems the 2 μm thick bridge is suspended about 2 μm above the dielectric material. the dc and transient current measurements were recorded with the aid of a keithley k6517a electrometer that provided the applied bias and measured the current flow through the device. the capacitance voltage characteristics of the mems switches were monitored with a boonton 72b capacitance meter. all measurements were performed in a vacuum cryostat after 2 hours of annealing at 140 °C to remove humidity. current voltage characteristics, the charging current transient technique and the kelvin probe technique for the assessment of dielectric charging were applied to mim capacitors, while the charging and discharging effects in rf mems capacitive switches were assessed by monitoring the shift of the bias for minimum up state capacitance.
the latter technique is the equivalent of the kelvin probe for mems and is based on the principle that the bias at which the capacitance attains its minimum is the one corresponding to the minimum electrostatic force, independently of the charge uniformity and air gap distribution as well as of bridge deformation [8,20].

3. results and discussion

the charging and discharging processes in mems capacitive switches can be summarized as follows. during device actuation (down state), charges are injected from the metal bridge in the presence of an electric field, usually in the range of MV/cm. when the bridge pulls up, these stored charges are collected only through the bottom electrode. this process occurs in the presence of a low intrinsic field, in the range of a few kV/cm, generated by the injected charges. therefore, in order to assess the issue of dielectric charging, it is essential to bear in mind that the charging process is a high field process, while the discharging one is a low field process that occurs only through the bottom electrode.

3.1. conduction mechanism

a study of the dc characteristics of mim capacitors can provide information on the corresponding conduction mechanism under high and low field, i.e. during the charging and discharging process respectively. during the charging process charges are injected from the metal bridge into the insulating film by trap assisted tunneling (tat). these injected charges are redistributed in the presence of the high field. a commonly observed high field process in dielectric films is poole frenkel conductivity [21,22]. in this case the current density (J) as a function of the applied field (E) is expressed as

$$J = A E \exp\left[-\frac{\phi_B - b\sqrt{E}}{kT}\right] \qquad (1)$$

where A and b are constants, $\phi_B$ is the energy barrier for the process, associated with the defect characteristics, k is the boltzmann constant and T is the temperature. the straight line obtained for the high field regime in poole-frenkel plot presented in fig.
1 is a signature of the mechanism, confirming that the conduction mechanism responsible for the charge redistribution during charging is poole-frenkel. for applied field intensities below 500 kV/cm, which correspond to the data lying off the fitting curve, the poole-frenkel mechanism is no longer valid to describe the conductivity of the films.

fig. 1 poole-frenkel signature plot for high-field conductivity

fig. 2 non-linear fitting with eq. 2 on low-field experimental data confirms hopping conductivity as the corresponding mechanism

therefore, a different mechanism is expected to dominate in the low-field regime, i.e. for applied fields below 500 kV/cm, and thus during the discharging process. fig. 2 presents a fitting of eq. 2 to the experimental data. in the low-field regime charge transport arises by hopping, described by eq. 2, where α is a constant related to the material microstructure.

$$J = A \exp[\alpha E] \tag{2}$$

3.2. the charging process

the charging process was investigated in both mim capacitors and mems capacitive switches in order to extract comparative information. in the case of mim capacitors the charging process was investigated with the aid of the charging current transient (cct) technique. the density of charge injected and stored in the dielectric film upon application of a specific field is estimated as a function of charging time by monitoring the current decay. the total measured current through the device is the sum of the dielectric relaxation current and the steady-state leakage current of the dielectric film

$$J_{total}(t) = J_{relax}(t) + J_{leak} \tag{3}$$

a typical charging transient, ΔJ(t) = J_total(t) − J_leak, is presented in fig. 3. the stored charge is then calculated by the following integral.
$$\sigma(t) = \int_0^{t} \Delta J(t')\,dt' \tag{4}$$

by varying the upper integration limit it is possible to calculate the stored charge in the dielectric film as a function of stressing time for a specific applied field. for a field of 1 MV/cm, which corresponds to the condition applied to the mems switches, the stored charge is presented in fig. 4.

fig. 3 charging transient. ti denotes the different integration times for eq. 4

fig. 4 charge storage in mim capacitors as a function of charging time under an applied field of 1 MV/cm

in mems capacitive switches the stored charge after each successive contacting stress is calculated by monitoring the shift in the bias for minimum up-state capacitance (fig. 5) [23,24], as presented in detail in [20]. the stored charge (σ) is then calculated as

$$\sigma = \frac{\varepsilon_0 \varepsilon_r \, \Delta V_{min}}{d} \tag{5}$$

where d is the dielectric film thickness, ε0 and εr are the vacuum permittivity and the relative permittivity of the material, and ΔVmin is

$$\Delta V_{min} = V_{min}(t) - V_{min}(0) \tag{6}$$

fig. 5 up-state c-v curves recorded in mems switches after each successive stress step. the arrow indicates the shift direction and the increase of time.

fig. 6 mems up-state c-v minimum shift versus charging time. the shift is proportional to the stored charge

in our case the stored charge at the surface of the dielectric film after 5 minutes of stress under a field of 1 MV/cm has been calculated to be 30 nC/cm². comparing the results obtained by the charging current transient (cct) technique in mim and the kelvin probe technique for mems, and taking also into consideration similar reports [11-19], it is concluded that it is important to take into account that charging in mim capacitors occurs through a perfect contact, under a uniform electric field and without the air gaps that are always present in mems due to surface roughness.
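as an illustration of eqs. 4-6, the following python sketch integrates a background-subtracted current transient with the trapezoidal rule and converts a c-v minimum shift into a surface charge density. it is not the authors' analysis code: the function names, the relative permittivity of 7 assumed for sinx and the 1.21 V shift are hypothetical values chosen only to exercise the formulas.

```python
# Minimal sketch of eqs. 4-6. All names and numeric inputs are illustrative assumptions.

EPS0 = 8.854e-12  # vacuum permittivity in F/m


def stored_charge_from_transient(times_s, delta_j_a_per_m2):
    """Eq. 4: cumulative trapezoidal integral of delta_J(t).

    Returns sigma(t_i) in C/m^2 for every sample time t_i.
    """
    sigma = [0.0]
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        mean_j = 0.5 * (delta_j_a_per_m2[i] + delta_j_a_per_m2[i - 1])
        sigma.append(sigma[-1] + mean_j * dt)
    return sigma


def stored_charge_from_cv_shift(v_min_t, v_min_0, eps_r, thickness_m):
    """Eqs. 5-6: sigma = eps0 * eps_r * (Vmin(t) - Vmin(0)) / d, in C/m^2."""
    delta_v_min = v_min_t - v_min_0
    return EPS0 * eps_r * delta_v_min / thickness_m


def c_per_m2_to_nc_per_cm2(sigma_c_per_m2):
    """Unit conversion: 1 C/m^2 equals 1e5 nC/cm^2."""
    return sigma_c_per_m2 * 1e5


if __name__ == "__main__":
    # Hypothetical inputs: eps_r = 7 (assumed for SiNx), d = 250 nm, 1.21 V shift.
    sigma = stored_charge_from_cv_shift(1.21, 0.0, eps_r=7.0, thickness_m=250e-9)
    print(f"sigma = {c_per_m2_to_nc_per_cm2(sigma):.1f} nC/cm^2")
```

with the assumed permittivity and the 250 nm film thickness, a minimum shift of about 1.2 V corresponds to roughly 30 nC/cm², the order of magnitude quoted in the text.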
therefore even the application of the same field leads to important differences in the charge storage in the two devices. the charge stored in mems under the application of 1 MV/cm for 5 min was calculated to be 30 nC/cm², while in mim it exceeds 200 nC/cm².

3.3. the discharging process

when the actuation voltage is no longer applied to a mems switch, the bridge returns to the up-state position and the electric field is no longer applied across the dielectric film. this is the onset of the discharging process. the injected and stored charges start to move from the surface through the dielectric film to the bottom contact, where they are collected. this charge collection process, which is the most essential one for the device lifetime, occurs under a very low intrinsic field generated by the injected charges themselves. in the case of mim capacitors there are two types of measurements associated with the discharging of the dielectric film. the first one is the discharge current transient (dct) technique, which monitors the current arising from the collection of the injected charges when the capacitor is short-circuited. in this case the charges are collected by the injecting electrode and not through the opposite one as in the case of mems. therefore the dct technique is not suitable for providing valuable information on the discharging process in mems. the discharging process in this case is much faster and contains no information regarding the transport of charges across the dielectric film. however, the dct technique is a powerful tool to assess the material properties themselves. fig. 7 presents the discharge transient obtained from mim capacitors charged up to 30 nC/cm². the time constant obtained in this case is only 300 s. another approach for the assessment of the discharging process in mim capacitors is the application of the kelvin probe (kp) method.
this technique is based on monitoring the surface potential of a mim capacitor without contact (fig. 8), and thus the discharging through the bottom electrode. the kelvin probe method resembles quite realistically the discharging process in rf mems. indeed, the results obtained after charging the mim with the equivalent charge indicate that this technique can be used to transfer information from mim to mems. the time constant is of the same order, 10^4 s, while in both cases the discharging obeys the stretched exponential relaxation. it is worth mentioning that similar studies based on kelvin probe force microscopy (kpfm) on metal-free dielectric film surfaces have already been reported in the past [14]. however, there is a difference between the kpfm and kp methods. the first assesses the surface potential by minimizing the electrostatic force on the system tip. the second minimizes the current through a vibrating capacitor [25]. the presence of the mim top electrode ensures a uniform electric field arising from the charge at the surface of the dielectric film. in addition, the kpfm method has been used to assess the charge distribution of very small surfaces, and the delay caused by sweeping the kpfm tip across the surface does not allow real-time measurement of the charge dissipation. thus, although the kp method does not resemble the real mems discharge, it provides a more accurate determination of the discharge process normal to the film surface, which is not affected by lateral charge diffusion as shown in [24].

fig. 7 discharge current transient obtained in a mim capacitor charged up to 30 nC/cm²

fig.
8 the discharging process through the dielectric film as obtained in mim capacitors charged up to 30 nC/cm²

regarding the stretched exponential relaxation, the transient decay is expressed by the following equation

$$V(t) = V_0 \exp\left[-\left(\frac{t}{\tau}\right)^{\beta}\right] \tag{7}$$

where τ is the time constant of the discharging process, usually referred to as the relaxation time, and β is the stretched factor (0<β<1). this behavior is commonly observed in many systems with a significant degree of disorder, such as amorphous materials [26]. in mems switches the discharging process is quite accurately monitored by recording the position of the bias corresponding to the minimum up-state capacitance (fig. 9) [20]. in our case the mems switches were stressed under an applied field of 1 MV/cm for 5 min which, as mentioned in the previous sections, results in a charge storage of 30 nC/cm². the subsequent discharging process is presented in fig. 10. the discharging process is found to obey the stretched exponential relaxation mechanism with a time constant in the range of 10^4 s.

fig. 9 up-state c-v curves recorded in mems switches during the discharging process. the minimum gradually returns to the initial position, denoting film discharging. the arrow indicates the shift direction and the increase of time.

fig. 10 position of the minimum as a function of the discharging time. monitoring of the discharging process in mems

3.4. discussion

the analysis presented in the above sections confirms that the study of the electrical properties of mim capacitors provides valuable information for the study of dielectric charging in rf mems. the analysis of the dc characteristics reveals the mechanisms responsible for the charging and discharging processes.
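the stretched exponential of eq. 7 can be linearized as ln(-ln(V/V0)) = β ln t - β ln τ, so τ and β follow from a straight-line fit. the sketch below is a minimal illustration on synthetic data, assuming V0 is known. it is not the fitting procedure used by the authors, and all names are hypothetical.

```python
import math

# Sketch: estimate tau and beta of V(t) = V0 * exp(-(t/tau)**beta) by a
# least-squares line through (ln t, ln(-ln(V/V0))). V0 is assumed known.


def stretched_exp(t, v0, tau, beta):
    """Eq. 7 evaluated at time t."""
    return v0 * math.exp(-((t / tau) ** beta))


def fit_stretched_exp(times, values, v0):
    """Return (tau, beta) from the linearized form of eq. 7."""
    xs, ys = [], []
    for t, v in zip(times, values):
        ratio = v / v0
        # Only points with t > 0 and 0 < V/V0 < 1 can be linearized.
        if t > 0 and 0.0 < ratio < 1.0:
            xs.append(math.log(t))
            ys.append(math.log(-math.log(ratio)))
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two usable points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx                      # slope of the line
    intercept = mean_y - beta * mean_x    # equals -beta * ln(tau)
    tau = math.exp(-intercept / beta)
    return tau, beta


if __name__ == "__main__":
    # Synthetic, noise-free decay generated with assumed tau and beta.
    times = [10.0 ** k for k in range(1, 6)]
    data = [stretched_exp(t, 1.0, 3e4, 0.4) for t in times]
    print(fit_stretched_exp(times, data, v0=1.0))
```

on measured data the linearization amplifies noise near V/V0 close to 0 or 1, so a direct non-linear fit may be preferable there.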
in addition, taking into consideration the results extracted from both devices after charging up to the equivalent stored charge (table 1), we may conclude that the main drawback of using a mim capacitor for the study of mems arises from the differences in the top metal/dielectric contact.

table 1 summary of the results for the discharge in mems and mim (250 nm sin)

                                   mim              mems
stored charge (nC/cm²)             30               30
top surface potential relaxation   stretched exp.   stretched exp.
relaxation time (s)                5 × 10^4         3 × 10^4
exponent β                         0.9              0.4

the table summarizes the discharge relaxation time (τ) and the stretched factor (β) obtained by fitting the stretched exponential relaxation law to experimental data obtained in mim capacitors and mems switches. the absence of air gaps in the case of mim capacitors results in a uniform distribution of injected charges. therefore, even if the same charge density is injected, the resulting internal field is different in the two cases. in mems, the internal field is expected to be strongly non-uniform, resulting in locally enhanced densities. these fluctuations are responsible for the slight differences extracted from the discharging process. the uniform internal field that is formed in the case of mim results in slower discharging, while the non-uniformity in the case of mems is also expressed by the reduced value of the stretched exponential exponent β. at this point it is worth pointing out that the stretched exponential relaxation, although it appears to be a fitting tool without significant physical meaning, arises from the superposition of many single-exponential relaxation processes and is multiscale, in contrast to debye relaxation [27]. therefore, in the case of amorphous dielectric films for mems applications, the stretched factor β is used as an indicator of the complexity of the charge collection process [28] and is mainly determined by the charge distribution formed at the film surface by the charging process [14].
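to illustrate how the two parameter sets of table 1 shape the decay, the sketch below evaluates exp[-(t/τ)^β] for the mim and mems values. only τ and β come from table 1. the sampled times and the function names are arbitrary choices made for illustration.

```python
import math

# Sketch: remaining fraction V(t)/V0 for the two (tau, beta) pairs of table 1.
# The sample times below are arbitrary and not taken from the paper.

TABLE1 = {
    "mim": (5e4, 0.9),   # relaxation time in s, exponent beta
    "mems": (3e4, 0.4),
}


def remaining_fraction(t_s, tau_s, beta):
    """Stretched exponential decay normalized to its initial value."""
    return math.exp(-((t_s / tau_s) ** beta))


if __name__ == "__main__":
    for t in (1e3, 1e4, 1e5, 1e6):
        row = ", ".join(
            f"{name}: {remaining_fraction(t, tau, beta):.3e}"
            for name, (tau, beta) in TABLE1.items()
        )
        print(f"t = {t:.0e} s -> {row}")
```

with these parameters the mems curve (β = 0.4) falls faster than the mim curve at short times but retains a larger remaining fraction at long times, which is the longer tail expected from a smaller β.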
therefore, the study of mim may provide the expected magnitude of the discharging time constant for mems; however, different shapes of the discharging curves, mainly in the curves' tails at long times, should always be expected. in addition, the charging study in mim cannot predict any effect arising from non-uniform charge injection, such as the narrowing of the pull-in/out window [8,9].

4. conclusions

the paper presents an approach for the assessment of dielectric charging in rf mems capacitive switches with the aid of mim capacitors. the study involved dc analysis, charging current transients (cct) and the kelvin probe. the strengths and weaknesses of each technique are discussed. the dc study of mim provided valuable information on the mechanisms responsible for the charging and discharging processes, while the kelvin probe technique provided a quite realistic estimation of the discharging time. on the other hand, the perfect mim contact leads to enhanced and uniform charge injection; thus the effects related to non-uniform charging, arising mainly from the surface roughness of the dielectric film, cannot be predicted. therefore, minor differences in the case of the mems discharging process should not be considered unexpected. in conclusion, the appropriate use of mim capacitors constitutes a useful and important tool for the study of dielectric charging in rf mems.

acknowledgement: the authors would like to acknowledge that the present work has been supported by the projects i) fp7 eniac/espa-gr “microsystem based on wide band gap materials miniaturized and nanostructured rf-mems” nanocom under ga: 270701-2, eniac call 3 and ii) “nanostructured materials and rf-mems rfic/mmic technologies for highly adaptive and reliable rf systems” nanotec under ga 288531.

references

[1] g.m. rebeiz, rf mems: theory, design and technology, new york: j.
wiley & sons, 2003 [2] g.j. papaioannou, g. wang, d. bessas, j. papapolymerou “contactless dielectric charging mechanisms in rf-mems capacitive switches” 1 st europ. microw. integrat. circ. conf., manchester uk, pp. 513516, oct. 2006 [3] p. czarnecki, x. rottenberg, p. soussan, p. ekkels, p. muller, p. nolmans, w. de raedt, h.a.c.. tilmans, r. puers, l. marchand, i. de wolf “influence of the substrate on the lifetime of capacitive rf mems switches” in proc. 21 st ieee international conference on micro electro mechanical systems, mems 2008, pp. 172-175 jan. 2008 [4] m. koutsoureli, l. michalas, p. martins, e. papandreou, a. leuliet, s. bansropun, g. papaioannou, a. ziaei “properties of contactless and contacted charging in mems capacitive switches” microel. reliab., vol. 53, pp. 1655-1658, sep. 2013 [5] l. michalas, a. garg, a. venkattraman m. koutsoureli, a. alexeenko, d. peroulis and g. papaioannou “a study of field emission process in electrostatically actuated mems switches” microel. reliab., vol. 52, pp.2267-2271, sept. 2012. [6] l. michalas, m. koutsoureli, g. papaioannou “probing contactless injection dielectric charging in rf mems capacitive switches” iet electron. lett., vol. 50, no 10, pp. 766-768, may 2014. [7] m. koutsoureli, l. michalas and g. papaioannou “charge collection mechanism in mems capacitive switches” in proc. ieee int. reliab. phys. symp., pp. me2.1me 2.5, april 2012 [8] x. rottenberg, i. de wolf, b.k.j.c. nauwelaers, w. de raedt, h.a.c. tilmans “analytical model of the dc actuation of electrostatic mems devices with distributed dielectric charging and nonplanar electrodes” ieee j. microelectromech. syst., vol. 16, no.5, pp. 1243-1253, oct 2007. [9] w.m. van spengen., r. puers, r. mertens, i. de wolf “a comprehensive model to predict the charging and reliability of capacitive rf mems switches” j. micromech. microeng., vol. 14, pp. 514-521, 2004 [10] m.w. 
van spengen “capacitive rf mems switches dielectric charging and reliability : a critical review with recommendations” j. micromech. & microeng. vol. 22, pp. 074001, 2012 [11] m. lamhamdi, p. pons, u. zaghloul, l. boudou, f. coccetti, j. guastavino, y. segui, g. papaioannou, r. plana “voltage and temperature effect on dielectric charging for rf mems capacitive switches reliability investigation” microel. reliab., vol. 48 pp. 1248-1252, sept. 2008 [12] m. koutsoureli, e. papandreou, l. michalas, g. papaioannou “investigation of silicon nitride charging” microel. engineer., vol. 90 pp. 145-148, feb. 2012 [13] e. papandreou, g. papaioannou, t. lisec “ a correlation of capacitive rf mems reliability to aln dielectric film spontaneous polarization” int. j. microw. wirel.techn. vol. 1, pp. 43-47, feb. 2009
[14] u. zaghloul, g.j. papaioannou, h. wang, b. bhushan, f. cocceti, p. pons, r. plana “ nanoscale characterization of the dielectric charging phenomenon in pecvd silicon nitride thin films with varius interfacial structures based on kelvin probe force microscopy” nanotechnology, vol . 22, pp. 205708, 2011 [15] e. papandreou, m. lamhamdi, c.m. skoulikidou, p. pons, g. papaioannou, r. plana “structure dependent charging process in rf mems capacitive switches” microel. reliab., vol. 47 pp. 1812-1817, sept. 2007 [16] r. daigler, e. papandreou, m. koutsoureli, g. papaioannou, j. papapolymerou “effect of deposition conditions on charging processes in sinx: application to rf mems capacitive switches, microel. engineer., vol. 86 pp. 404-407, mar. 2009 [17] u. zaghloul, g. papaioannou, f. coccetti, p. pons, r. plana “dielectric charging in silicon nitride films for mems capacitive switches: effect of film thickness and deposition conditions, microel. reliab., vol. 49 pp. 1309-1314, sept. 2009 [18] m. koutsoureli, l.
michalas, a. gantis, g. papaioannou “a study of deposition conditions on charging properties of pecvd silicon nitride films for mems capacitive switches”microel. reliab., vol. 54 pp. 2159 – 2163, sept. 2014 [19] l. michalas, m. koutsoureli, e. papandreou, a. gantis, g. papaioannou “assessment of dielectric charging in rf mems capacitive switches with the aid of mim capacitors”in proc. of 29 th international conference on microelectronics (miel 2014), pp. 125-128, may 2014 [20] m. koutsoureli, g. papaioannou“ determination of bulk discharge currents in the dielectric film of mems capacitive switches” microel. reliab., vol. 51 pp. 1874-1877, sept. 2011. [21] r. ramprasad “ phenomenological theory to model leakage currents in metal-insulator-metal capacitor systems” phys. stat. sol.(b), vol. 239, no 1, pp. 59-70, 2003 [22] g. papaioannou, f. coccetti, r. plana “on the modeling of dielectric charging in rf-mems capacitive switches” in proc. top. meet. on silicon monolith. int. circ. in rf syst., pp. 108-111, jan. 2010 [23] j. wibbeler, g. pfeifera, m. hietschold “parasitic charging of dielectric surfaces in capacitive microelectromechanical systems (mems)” sensors and actuators a: physical, vol. 71, pp.74-80, is. 1– 2, nov. 1998 [24] r.w. herfst, p.g. steeneken, j. schmitz, a.j.g. mank, m. van gils “kelvin probe study of laterally inhomogeneous dielectric charging and charge diffusion in rf mems capacitive switches”. proc. ieee int. reliability physics symposium (irps) 2008, pp. 492-495,2008 [25] http://www.kelvinprobe.com/ [26] b. sturman, e. podivilov, m. gorkunov “ origin of stretched exponential relaxation for hoppingtransport models” phys, rev. lett., vol. 91, no 17, pp. 176602, 2003 [27] a.v. milovanov, k. rypdal, j.j. rasmussen “stretched exponential relaxation and ac universality in disordered dielectrics” phys. rev. b, vol. 76, pp. 104201, 2007 [28] g. papaioannou, m. n. exarchos, v. theonas, g. wang, j. 
papapolymerou “temperature study of dielectric polarization effects of capacitive rf mems switches” ieee trans. microw. theor. techniq., vol. 53, no 11, pp. 3467-3473

facta universitatis series: electronics and energetics vol. 36, no 1, march 2023, pp. 121-131 https://doi.org/10.2298/fuee2301121b © 2023 by university of niš, serbia | creative commons license: cc by-nc-nd

original scientific paper

performance of wearable circularly polarized antenna on different high frequency substrates for dual-band wireless applications

rama s. r. basupalli1, naresh k. darimireddy2, rajasekhar. nalanagula3, sujatha. mandala4
1bvrit (a), medak, telangana, india
2,3,4lendi institute of engineering & technology, ap, india

abstract. this paper studies the effect of different dielectric constants on the construction of a microstrip patch antenna deployed on jean's textile for military wireless applications. initially, the structure is designed with double l-shaped slits inserted on both sides of the patch, on an fr4 substrate with a dielectric constant of 4.4. the antenna dimensions are 40 × 25 mm², which is miniature compared to the wavelength (λ) at the desired operating frequency. the proposed antenna performance, in terms of simulated parameters such as gain in dbi, reflection coefficient (s11), directivity, and radiation efficiency, is evaluated with the cst mw em simulator. however, the conventional way of this design with fr4 may not be so reliable when it is designed on jean's substrate.
besides all the above parameters extracted from the simulator should hold a low value to implement a high-performance deployed wearable antenna. the paper's outcome shows the importance of simulations and measurements undertaken for the proposed antenna assuming both the dielectric constants of fr4 and jeans cloth material (with ℇr of 1.7). the main contribution of the antenna is to resonate at the frequencies of 3.17 ghz with circular polarization and 5.04 ghz with linear polarization. the antenna prototype is described, and its performance is validated using measurements. the proposed structure also provides a better enhancement in terms of 10-db impedance bandwidth, with an average gain of 5 dbi. key words: jean’s dielectric, fr4 substrate, sar, textile antenna, dual-band, circular polarization 1. introduction wearable antennas are becoming extremely popular due to their profound potential in multiple applications covering services such as military soldiers, firefighters, and paramedics. establishing a communication link from the wearable antenna to the base station camp would undoubtedly help the military soldier get rid of heavy-weight telecommunication equipment to received july 09, 2022; revised august 17, 2022, september 06, 2022 and october 09, 2022; accepted october 12, 2022 corresponding author: naresh k. darimireddy lendi institute of engineering & technology, ap, india e-mail: darn0005@uqar.ca 122 r. s. r. basupalli, n. k. darimireddy, r. nalaganula, s. mandala carry along with them [1]-[2]. several current and upcoming modern wireless modules either require or can exhibit a solution by implanting one or more antennas on a piece of textile or cloth or directly integrated in to personal accessories which cover shoes, glasses, buttons, and helmets [3]. specific operating and thermal conditions in which textile antennas designed impose with specific requirements explicitly listed would be included in the performance criteria of antenna experimentation. 
to be developed as wearable, an antenna must combine a suitable choice of materials, both for the conductive and non-conductive parts, with a standard adopted antenna configuration. the most straightforward approach is to combine, knit, or conductive yarns into a portion of clothing [4]-[5]. for a designated wearable antenna, the wearer's placement, stance, and movements detrimentally impact the input impedance and radiation pattern. wearable antennas are usually broadband to compensate for such casual variations [6]. several techniques have been applied to design single and multiband wearable antennas over the past years, and few include slit [7] or u or l-shaped slot-loaded configurations [8], ebg structure [9], and monopole/planar antennas [10]. although these wearable antennas exhibit single/dual-band characteristics, a limited bandwidth is exhibited at higher resonant modes. furuya et al. [11] proposed a wearable antenna with a wideband for digital tv reception in the frequency range of 470-700 mhz. however, antenna radiation efficiency needs to be sufficiently improved with the required -10db return loss and 3-db axial ratio bandwidth. earlier, several slits/slots were loaded along boundaries of the various patch configurations [12]-[13] fabricated on non-flexible substrates to generate orthogonal modes for circularly polarized (cp) radiation by adequately positioning the feed point. owing to the interest in obtaining dual bands and the circular polarization feature, the proposed antenna is discussed to resonate frequencies covering wireless military applications. this article designs a novel cp-based antenna with double l-shaped slits. dual lshaped slits are fixed on either side of the conductive surface of the textile. slits on either side of the wearable patch antenna provide bandwidth variation. furthermore, the design obtains circular polarization exciting two orthogonally polarized tm01 and tm10 modes by placing the exact location of the feed point. 2. 
wearable antenna design and simulations the propagation and loss properties at the desired frequency band(s) must be known for the candidate material before antenna design and fabrication. in addition, permittivity and loss tangent have to be characterized to choose textile dielectric material as a substrate with different constructions and thicknesses. effective permittivity and loss tangent can be extracted using the formula based on the resonant frequency [14]-[15]. 2.1. proposed antenna structure the structure of the proposed antenna with the dimensions of ls × ws is represented as shown in figure1 with the placement of narrow dual l shapes slits. a modified ground plane for the improvement of characteristics is shown in figure 1. by optimizing the dimensions of the wearable patch, it is observed that there is an improvement in impedance bandwidth and circular polarization feature. performance of wearable circularly polarized antenna on different high frequency... 123 fig. 1 (a) top and (b) bottom layers of the wearable antenna 2.2. design parameters the parametric study of both dielectric materials (fr4 and jeans) with microstrip line feed is analyzed in this section. a single l-shaped slit on the top layer of the conventional patch makes an additional resonant band possible and slightly enhances the bandwidth. moreover, an additional l-shaped slit introduces a circular polarization feature with the formation of orthogonal modes at the initial resonant band of 3.17 ghz. the dimensions (in mm) for the parameters are listed as follows: ls = 75, ws = 40, wp = 25, lp = 40, w1 = 15, w2 = 1, w3 = 3.95, w4 = 2.8, w5 = 2, l1 = 39, l2 = 26, l3 = 10, l4 = 20. the effect of both dielectric substrates fr4 (4.4) and jean’s cloth (1.7) are discussed to analyze various characteristics. the relevant critical parameter is the fabric's conductivity (σ), holding the units as siemens per meter (s/m) as part of the antenna design. 
equation (1) gives the relation between the conductivity (σ), the surface resistivity (ρs) and the thickness (t) of the fabric:

$$\sigma = \frac{1}{\rho_s \, t} \tag{1}$$

2.3. effect of different substrates with simulations

the proposed wearable antenna with dual l-shaped slits and a monopole ground plane improves gain and impedance bandwidth at the resonance bands of 3.17 ghz and 5.04 ghz. figure 2 shows that degenerate modes are formed at 3.17 ghz, leading to an axial ratio bandwidth of 8 mhz for the circular polarization feature. additionally, a higher impedance bandwidth is obtained at both resonant bands compared to fr4. the coverage area obtained with cp would improve short-range communication for military applications, which is why the prototype is implanted in textile materials. for this study, the characteristics of the wearable design are compared with those of the fr4 substrate. fig. 3 shows the comparison of the vswr values for both bands. the vswr is kept below 2 when evaluating the simulated impedance bandwidth and vswr responses. the cst microwave studio simulator is used to analyze the case studies of both dielectric constants for different thickness values. for this study, the thickness values are 1 mm and 0.8 mm. a conventional rectangular geometry with l-shaped slits is chosen for implantation on electro-textiles due to its simplicity. however, the thickness intensity of the yarns in the woven structures would be an ideal choice for the prototype design. a minor variation in the length and width of the conductive fabric used as the wearable antenna results in a slight variation in the antenna characteristics, as displayed in figure 2. in general, wearable material has a very low dielectric constant. therefore, the thickness and the substrate dielectric constant are vital in designing the wearable patch to extract efficient parameters.

fig.
2 s11 response of different dielectric constants (fr4 and jean's)

fig. 3 vswr response of different dielectric constants (fr4 and jean's)

table 1 summarizes the comparison of the simulated parameters for both substrates, along with the effect of the different thickness values considered. in addition, it lists the high impedance bandwidths of 1470 mhz (54%) and 60 mhz (1.2%) obtained with a thickness of 1 mm at 3.17 ghz and 5.04 ghz, respectively.

fig. 4 gain versus frequency response for both (fr4 and jean's) substrates

the results indicate that reducing the dielectric constant of the wearable substrate improves the antenna performance with the proposed structure. moreover, a moderate gain of 5 dbi is obtained with the wearable antenna, compared with the lower gain for the fr4 substrate at the same substrate thickness. circular polarization is also obtained at the operating frequency of 3.17 ghz. the gain response curve drawn for the jean's substrate shows gain values of 2 dbi and 3.2 dbi at 0.9 ghz and 3.2 ghz, respectively.

table 1 performance comparison of both the substrates (jean's and fr4) in terms of gain and impedance bandwidth

substrate thickness   frequency (ghz)   impedance bandwidth (mhz, %)   vswr   gain (dbi)
dielectric: fr4 (4.4)
t = 0.8 mm            1.82              100, 5.49                      1.86   2
                      4.17              30, 1.79                       1.79   1.8
t = 1 mm              1.81              270, 14.6                      1.6    1.6
                      2.72              200, 7.3                       1.8    2.5
                      4.06              60, 1.4                        1.5    3.68
dielectric: jean's cloth (1.7)
t = 0.8 mm            3.17              1420, 52.3                     1.31   3.5
                      5.07              30, 0.6                        1.56   4
t = 1 mm              3.17              1470, 54                       1.1    4.1
                      5.04              60, 1.2                        1.3    5

the simulated sar response for an operating frequency of 0.91 ghz is represented in figure 5, and it shows that the sar value denoted on the scale is moderate, at 18.2 w/kg. this sar value should be kept low since the textile antenna is worn on a conductive body.
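for reference, the local sar is commonly expressed as SAR = σ|E|²/ρ, where σ is the tissue conductivity (S/m), E is the rms electric field in the tissue (V/m) and ρ is the tissue mass density (kg/m³). the sketch below only evaluates this expression. the tissue values are hypothetical placeholders and are not taken from the simulation reported here.

```python
# Sketch: point SAR from the standard relation SAR = sigma * |E_rms|^2 / rho.
# The numeric inputs in the demo are hypothetical, not values from the paper.


def local_sar_w_per_kg(sigma_s_per_m, e_rms_v_per_m, density_kg_per_m3):
    """Return the local specific absorption rate in W/kg."""
    if density_kg_per_m3 <= 0:
        raise ValueError("density must be positive")
    return sigma_s_per_m * e_rms_v_per_m ** 2 / density_kg_per_m3


if __name__ == "__main__":
    # Placeholder tissue: 1 S/m, 1000 kg/m^3, exposed to a 50 V/m rms field.
    print(local_sar_w_per_kg(1.0, 50.0, 1000.0), "W/kg")
```

because the sar scales with the square of the field, halving the field inside the tissue reduces the local sar by a factor of four.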
material thickness, conductivity, operating frequency range and resonant behavior are carefully chosen to be distinct to better understand the resulting sar. sar is a measure of the power absorbed per unit mass (kg) in human body tissue.

fig. 5 simulated sar response of the antenna operating at 0.91 ghz

3. experimental results and discussion

the proposed novel textile antenna prototype with optimized dimensions is constructed and investigated experimentally. fig. 6(a) and (b) show the fabricated conventional and modified printed textile antennas, respectively. the radiator patch with dual l-slits significantly improves the impedance matching for the high resonance bands and maintains stable gain across the remaining bands. the experimental setup used to measure the radiation pattern and axial ratio of the proposed dual-band antenna is shown in fig. 7. simulated and measured return losses of the antenna design are shown in fig. 8. the impedance bandwidth values for the proposed antenna prototype are 1470 mhz and 60 mhz, respectively, for the two generated bands. the discrepancies between measured and simulated results are due to in-house manufacturing and soldering losses. the return loss of the proposed antenna module is measured using the keysight e5071c ena series network analyzer. fig. 9 presents the measured axial ratio of the designed antenna. the lower resonant band's 3-db axial ratio (ar) bandwidth is about 11 mhz (3.17 to 3.172 ghz). the 3-db ar bandwidth lies within (overlaps) the 10-db impedance bandwidth, which is desirable. although the measured response is not in close agreement with the simulated response due to tolerance issues and sar, the measured impedance bandwidths are about 60 mhz and 112 mhz, covering both resonant bands at 900 mhz and 3.2 ghz.

fig. 6 fabricated textile antenna prototypes: (a) front view and (b) rear view of proposed dual band antenna

fig. 7 experimental setup of proposed dual band antenna

fig. 8 measured and simulated return loss response of the proposed textile antenna

fig. 9 measured axial ratio against frequency of the proposed textile antenna

radiation patterns were plotted for both bands, and the co-polarization patterns are observed to dominate, indicating that the dual-band antenna is well suited to military applications. the measured radiation patterns for the jeans substrate are shown in fig. 10 for the operating frequency of 3.17 ghz and in fig. 11 for the operating frequency of 5.04 ghz. the pattern at 5.04 ghz is found to be unstable for the jeans substrate. this is due to the back radiation of the monopole structure on the ground plane.

fig. 10 far field radiation patterns in xoy plane and xoz plane at 3.17 ghz for jean’s cloth

fig. 11 far field radiation patterns in xoy plane and xoz plane at 5.04 ghz for jeans cloth

the dimensions, operating frequencies, return loss bandwidth, axial ratio bandwidth and gain are compared with those of existing designs in table 2 below. the wearable antenna proposed in this work has a small size and operates in dual-band mode with circular polarization at one band.

table 2 calculated bandwidth and gain parameters of proposed wearable antenna comparing with existing structures

| ref. | dimensions (mm2) | frequency (ghz) | 10-db rlbw (mhz) (%) | 3-db arbw (lp/cp) | gain (dbi) |
|---|---|---|---|---|---|
| [8] | 110×130 | 1.927, 2.45 | -6 db | lp, lp | 0, 0 |
| [9] | 120×120 | 2.45, 5.5 | 4%, 16% | lp, lp | 3, 2 |
| [11] | 240×125 | 6.2 | 48% | lp | |
| [14] | 80×80 | 10 | 20% | lp | 8.5 |
| [proposed] | 40×25 | 3.17, 5.04 | 60%, 1.2% | 11 mhz (cp), lp | 4.1, 5 |
| [measured] | 40×25 | 0.91, 3.2 | 2%, 10% | lp, lp | 2, 3.2 |

rlbw: return loss bandwidth, arbw: axial ratio bandwidth, lp: linearly polarized, cp: circularly polarized

4. conclusions

a novel double l-shaped slit textile antenna is analyzed and developed for military wireless applications. a comparative study is performed for the fr4 and jeans dielectrics, and the extracted parameters are tabulated. this paper mainly focuses on obtaining good cp radiation at the first band and lp at the second resonant band, as obtained from multiple iterations of the antenna. moderate gains of 4.1 dbi and 5 dbi are achieved at 3.17 ghz and 5.04 ghz, respectively. the results show a higher impedance bandwidth with the jean's wearable dielectric than with the conventional fr4 substrate. the proposed wearable antenna gives an ar of 0.2 db, indicating good quality cp (close to 0 db) for the resonant band. though this extended slit-based model executes tri-band cp, the radiation patterns are degraded. the top and bottom layered structures with line feeding indicate that the proposed prototype antenna could benefit the operating frequencies of military wireless applications. experimental evaluation of the latest wearable dielectrics with different substrate thicknesses, and of their significance, is left as future work.

acknowledgement: the authors would like to thank the advanced communications laboratory, bvrit(a), medak and also the central r & d cell, lendi iet (a) for providing the required facilities to execute the proposed work.

references

[1] a. kalis, t. antonakopoulos and v.
maklos, "a printed circuit switched array antenna for indoor communications", ieee transactions on consumer electronics, vol. 46, no. 3, pp. 531-538, aug. 2000.
[2] s. han and s. k. park, "performance analysis of wireless body area network in indoor off-body communication", ieee transactions on consumer electronics, vol. 57, no. 2, pp. 335-338, may 2011.
[3] j. park and j. chun, "dtv receivers using an adaptive switched beamformer with an online-calibration algorithm", ieee transactions on consumer electronics, vol. 56, no. 1, pp. 34-41, february 2010.
[4] c. ahn, b. ahn, s. kim and j. choi, "experimental outage capacity analysis for off-body wireless body area network channel with transmit diversity", ieee transactions on consumer electronics, vol. 58, no. 2, pp. 274-277, may 2012.
[5] b. zhang and f. yu, "lswd: localization scheme for wireless sensor networks using directional antenna", ieee transactions on consumer electronics, vol. 56, no. 4, pp. 2208-2216, november 2010.
[6] s. koskinen, l. pykäri and m. mäntysalo, "electrical performance characterization of an inkjet-printed flexible circuit in a mobile application", ieee transactions on components, packaging and manufacturing technology, vol. 3, no. 9, pp. 1604-1610, sept. 2013.
[7] b. zhang and f. yu, "lswd: localization scheme for wireless sensor networks using directional antenna", ieee transactions on consumer electronics, vol. 56, no. 4, pp. 2208-2216, november 2010.
[8] p. salonen et al., "dual-band wearable textile antenna", in proceedings of the ieee antennas and propagation society international symposium, 2004, vol. 1, pp. 463-466.
[9] z. shaozhen, r. langley, "dual-band wearable textile antenna on an ebg substrate", ieee transactions on antennas and propagation, vol. 57, no. 4, pp. 926-935, 2009.
[10] s. lingam, b. gupta, "development of textile antennas for body wearable applications and investigations on their performance under bent conditions", piers b, vol. 22, pp. 53-71, 2010.
[11] k. furuya et al., "wide band wearable antenna for dtv reception", in proceedings of the 2008 ieee antennas and propagation society international symposium, 2008, pp. 1-4.
[12] n. k. darimireddy, r. r. reddy, a. m. prasad, "asymmetric and symmetric modified bow-tie slotted circular patch antennas for circular polarization", etri journal, vol. 40, no. 5, pp. 561-569, 2018.
[13] n. k. darimireddy, r. r. reddy, a. m. prasad, "asymmetric triangular semi-elliptic slotted patch antennas for wireless applications", radioengineering, vol. 27, no. 1, p. 85, 2018.
[14] k. x. wang and h. wong, "a wideband millimeter-wave circularly polarized antenna with 3-d printed polarizer", ieee transactions on antennas and propagation, vol. 65, no. 3, pp. 1038-1046, march 2017.
[15] i. bouhassoune, et al., "optimization of uhf rfid five-slotted patch tag design using pso algorithm for biomedical sensing systems", int. j. environ. res. public health, vol. 17, id. 8593, 2020.

instruction facta universitatis series: electronics and energetics vol. 27, no. 3, september 2014, pp. 435-453
doi: 10.2298/fuee1403435n

wireless sensor node with low-power sensing

goran nikolić 1, mile stojčev 1, zoran stamenković 2, goran panić 2, branislav petrović 1
1 faculty of electronic engineering, university of niš, niš, serbia
2 ihp innovations for high performance microelectronics, frankfurt, germany

abstract. wireless sensor network consists of a large number of simple sensor nodes that collect information from external environment with sensors, then process the information, and communicate with other neighboring nodes in the network. usually, sensor nodes operate with exhaustible batteries unattended.
since manual replacement or recharging of the batteries is not an easy, desirable or always possible task, power consumption becomes a very important issue in the development of these networks. the total power consumption of a node is the result of all steps of its operation: sensing, data processing and radio transmission. most published papers assume that the sensing subsystem consumes significantly less energy than the radio block. however, this assumption does not hold in numerous applications, especially when the power consumption of the sensing activity is comparable to or larger than that of the radio. in that context, in this work we focus on the impact of the sensing hardware on the total power consumption of a sensor node. firstly, we describe the structure of the sensor node architecture, identify its key energy consumption sources, and introduce an energy model for the sensing subsystem as a building block of a node. secondly, with the aim of reducing energy consumption, we investigate the joint effectiveness of two common power-saving techniques in a specific sensor node: duty-cycling and power-gating. duty-cycling is effective at the system level. it is used for switching a node between active and sleep mode (with a duty-cycle factor of 1%, a reduction in dynamic energy consumption is achieved). power-gating is used at the circuit level with the goal of decreasing the power loss due to the leakage current (in our design, a reduction of the dynamic and static energy consumption of off-chip sensor elements, as constituents of the sensing hardware within a node, is achieved). compared to a sensor node architecture in which both energy saving techniques are omitted, the matlab simulation results suggest that, thanks to the combined use of duty-cycling and power-gating, a reduction of three orders of magnitude in the energy consumption of sensing activities can be achieved.
key words: wireless sensor networks, sensor elements, power consumption, duty-cycling, power-gating

received february 18, 2014; received in revised form may 29, 2014
corresponding author: goran nikolić, faculty of electronic engineering, aleksandra medvedeva 14, 18000 niš, serbia (goran.nikolic@elfak.ni.ac.rs)

1. introduction

wireless sensor networks, wsns, consist of a large number of sensor nodes, sns, deployed randomly (or in some specific places) within a restricted area. applications for wsns include consumer electronics, military target tracking, industrial monitoring, health monitoring, home environmental control, forest fire detection, greenhouse monitoring, etc. [1]. since sns are usually battery-powered devices and operate unattended for a relatively long period of time, maximizing the energy efficiency of the sn is critical [1], [2]. typically, this constraint is imposed by the limited capacity of the sn's battery [3]. to optimize the design of the sn, an accurate power consumption model, which allows a good forecast of battery lifetime, is needed.

in order to extend the lifetime of the sn, a wide variety of techniques for minimizing the sn's energy consumption have been proposed in literature [4], [5], [6]. some of them deal with saving energy at the mac (media access control) level [7], [8], [9]; others at the routing protocol level [10], [11], [12]; a third group with data dissemination, aggregation or fusion [13], [14]; a fourth with novel architectures that utilize optimized radio and digital parts [15], [16], [17]; and a fifth employs on-chip power gating in order to reduce the static power loss [18], [19]. to address the problem of power saving within a sn, two promising approaches based on dynamic voltage scaling [20], [21] and power gating [22], [23] are used.
the first represents a useful solution for high performance sns, while the second is effective in sns operating with a low duty-cycle, where the sns alternate between off and on states to minimize the energy consumption [22], [16], [17].

sns, as constituents of a wsn, are capable of performing computation-, communication- and sensing-oriented tasks. accurate prediction of the sn lifetime requires an accurate energy consumption model and an estimation of sensor activities. an energy model which accurately reveals the energy consumption of the sn is an extremely important part of protocol development, sensor node micro-architecture design (radio, microcontroller and sensing subsystem), battery capacity, and performance evaluation in wsns. there have been various attempts to model sn energy consumption. in [24] a model that includes mcu processing and radio transmission and receiving is considered. in [25] and [26] sensing activities including sensor sensing, sensor logging and actuation are omitted. in [23] a comprehensive energy model for wsn that takes into account all key energy consumption sources within a sn is described. by studying component energy consumption in different sn states, the authors in [27] present energy models of the sn core components. in [28] a combination of two complementary approaches intended to reduce the energy consumed by a sensor node is proposed: duty cycling (waking up a sensing board only for the time needed to acquire a new set of samples and powering it off immediately afterwards) and an adaptive sensing strategy (a computation-intensive approach which is able to dynamically adapt the sensor activity to the real dynamics of the process).

as reported in [4], [29], [30], radio on-time dominates the system power budget by an order of magnitude with respect to the other two operations (data processing and sensing) combined, even when the radio module operates at a low duty cycle (approximately from 1 to 2 %).
since data processing and sensing activities account for a small fraction of the power budget, the authors suggest that improving the sn's lifetime requires a significant reduction in communication activities. however, our current research shows that by using a more realistic power consumption model of the sensing subsystem, which clearly separates the power consumption of each sensor element, it is possible to derive clearer results which provide insight into which sensing elements are limiting the wsn performance. in other words, in this work we extract the impact of sensing hardware on the total power consumption and point to the fact that the contribution of the sensing subsystem to the total power consumption of the sn cannot be neglected, especially when wsns with medium- (or high-) energy consuming sensor elements are used. the main novelty presented in this paper is the joint use of two common power saving techniques (duty cycling and power gating) during the operation of a sensor node.

due to space constraints this paper concentrates only on the power consumption of the sensing subsystem. for discussions on wireless communications and data processing activities, readers can refer to [6], [27], [30], [31].

the rest of the paper is organized as follows. in section 2, the sensor node architecture is introduced and the operating functionalities of all constituents are identified. in addition, details on the connectivity of the sensor elements and on the power supply are given. section 3 concentrates on the sensor node energy profile. the justification for using two power saving techniques, duty-cycling at the system level and off-chip power-gating at the sensing subsystem level, is discussed, too. section 4 deals with power estimation. also, the energy profile during initialization and sensing activities is calculated. section 5 concludes the paper.

2. sensor node architecture

an overall hardware structure of a sn is presented in fig. 1.

fig. 1 overall block scheme of a sensor node.

the sn consists of several building blocks:

a) mcu – referred to as a processing subsystem, controls the operation of all constituents within the sn and performs data processing. the mcu includes a microcontroller and memory for local data processing. most existing processing subsystems employ microcontrollers, notably texas instruments' msp430, intel's strongarm, or atmel's avr. these microcontrollers enable some of their internal components to be turned off completely when they are idle or in sleep mode. cmos compatible memories, including static random-access memory, sram, and embedded dynamic random-access memory, dram, permit sns to perform more complex digital signal processing algorithms (collection, aggregation, and compression) and log more sensor data.

b) off-chip sensor elements (ocse) – called a sensing subsystem, implemented as a set of passive and active sensors (digital or analog) that convert input information from the external environment into electrical signals. in most applications, wireless sns are used for monitoring light, pressure, vibration, flow rates in pipelines, temperature, ventilation, electricity, etc. commonly, sensor elements generate voltage or current signals at their outputs. these signals are first amplified (conditioned) and then digitized with an analog-to-digital converter, adc, before the data are digitally processed, stored and transmitted.

c) radio block (rb) – implemented as a short range transceiver which provides wireless communication with the host or sns within a wsn.
the power consumption of a transceiver can be reduced at two levels: i) the circuit level, by developing more energy-efficient rf circuits (using weak inversion operation in the rf building blocks, rf-mems passive components, ultra-wideband transceivers which send narrow pulses of energy to transmit data), and ii) the system level, by optimizing rf communication (including shortening the communication distance, minimizing the amount of data sent over the rf link, using energy-efficient communication protocols, or powering down the transceiver during idle periods, i.e., using a duty-cycling concept). for more details about this topic see reference [30].

d) battery supply unit (bsu) – a part of the power subsystem acting as a controllable unit which individually switches on/off the power supply of each of the sn's building blocks. the bsu is responsible for providing the right supply voltage to each individual sn hardware component. a bulky battery is included in the bsu to power the sn's subsystems. the bsu is a very important building block of the sn intended to improve the wsn lifetime, and therefore numerous techniques based on the efficient exploitation of energy resources have been introduced with the aim of prolonging the wsn lifetime. for more details see [31].

as we have already mentioned, sns are currently powered by batteries. however, batteries have several disadvantages, including: i) the need to either replace or recharge them periodically; and ii) their large size and weight compared to the sn electronics. one promising solution to overcome these drawbacks is to harvest energy from the environment to either recharge a battery or even to directly power the sn. as presented in table 1, the energy harvesting circuits can be classified into two groups.

table 1 classifications of energy harvesting circuits

| energy source | type of energy |
|---|---|
| human | kinetic, thermal |
| environment | kinetic, thermal and radiation |

for more details see references [32], [33].
among the most popular harvesting circuits used in sns are those based on converting solar energy, as a radiation type of energy. the main advantages of using solar energy are as follows: i) it is excellent in remote or difficult-to-access locations; ii) it is a totally clean and renewable source; iii) it is suitable for supplying small current loads such as sns; and iv) in any country the use of solar energy like this is feasible throughout the entire territory.

depending on the specific application, sns may also include additional components like a location finding system to determine their position, a mobilizing unit to change their location, etc. more details about sn architectures and the functionalities of their building blocks can be found in [34]. different types of communication interfaces, such as parallel and serial buses, interconnect the aforementioned subsystems. among serial buses the most frequently used interconnects are spi (serial peripheral interface) and i2c (inter-integrated circuit). spi is the preferable design solution for high-speed communication, while i2c is preferable for low-speed communication. today's wireless sn is a simple device, and the components that make up its subsystems are commonplace off-the-shelf components usually located on a printed circuit board.

2.1. connecting sensor elements

within an sn architecture, sensor elements can be implemented as: a) on-chip constituents, typical for the future generation (advanced system-on-chip, soc design) of wireless sn designs, and b) off-chip constituents, where the sn is composed of discrete components, typical for currently common commercially available (off-the-shelf) wireless sn systems. the recent progress in ultra-low power circuit design is creating new opportunities in sn architectures with on-chip temperature and image sensor elements [35], [36].
important advances have been made to achieve millimeter-scale sns and standby power as low as 30 pw [37], or a microwatt successive approximation register adc (sar-adc) with a figure of merit down to 4.4 fj per conversion step [38], but many design challenges remain open. our design choice is based on the use of off-the-shelf components. such a solution implies that sensor elements are of the off-chip type, i.e., components externally connected to the adc (in our proposal the adc is a constituent of the mcu). in this paper, by involving adequate energy models, we will consider implementations of duty-cycling and power-gating techniques and investigate how to reduce the dynamic and static power when both power saving approaches are used.

2.2. power supply subsystems

in a sn, each subsystem/circuitry requires a different supply voltage for its operation. for example, in most common currently used designs, the mcu and other digital circuits can run at a supply voltage which ranges from 3 v to 1.8 v. analog components such as the rf transceiver and sensor elements, in order to provide correct operation and noise margins, require higher supply voltages which range from 1.2 v to 2.5 v. batteries (lithium 3.3 v−4.2 v) incorporated as power sources in sns are limited in their output voltage by their chemistries, and their voltages degrade with use. since battery voltages do not usually match the desired subsystem/circuit supply voltages, power conversion electronics, either a switching dc-to-dc converter or a linear low drop-out voltage regulator, is used. bearing in mind that the current consumption of a sn ranges from several tens of ma (in active mode) down to several µa (in sleep mode), the power electronics must be specifically designed for low-power operation. as a preferable solution, we propose a linear low drop-out voltage regulator for powering the sn subsystem.
in general, for powering low-power devices, such as a sn, the linear low drop-out voltage regulator performs better than a dc-to-dc converter (dc-to-dc converters are usually designed for high output power levels and do not efficiently convert the low level of power needed by sns [30]).

before we start describing the principle of operation of the bsu (see fig. 1), it is necessary to first explain the meaning of the following two terms: power gating and duty cycle. power gating is a technique used in integrated circuit design to reduce power consumption by shutting off the current to blocks of the circuit that are not in use [39]. a duty cycle is the percentage of one period in which a signal is active. a period is the time it takes for a signal to complete an on-and-off cycle [40]. in our case, the time intervals during which the sn is on or off are known as its active time interval, $t_{on}$, and inactive (sleep) time interval, $t_{off}$, respectively. accordingly, the duty cycle ($DC$) is defined as:

$$DC = \frac{t_{on}}{t_{on} + t_{off}} \tag{1}$$

the focus of our interest in this paper is the implementation of a power distribution system, as part of the bsu, which switches on/off both the sensor elements within the ocse and the transceiver (as a constituent of the rb) in a timely defined manner. by using a combination of duty-cycling, which relates to powering the sn at the system level, and power-gating, intended to power the sn at the sensor element level, a significant saving of dynamic and static power during sn operation can be achieved.

a global scheme of the bsu is presented in fig. 2. it consists of:

a) battery – acts as the main energy source for powering the sn's functional units;

b) dual-channel controllable ldo regulator – implemented as a single-input (in1) two-output (out1 and out2) linear low drop-out voltage regulator. by setting the control enable signals en1 and en2 to logic one/zero, the voltage at the outputs out1 and out2 can be switched on/off. at the output out1 voltage is always present, since en1 = {1}, while the voltage at the output out2 can be switched on/off by setting en2 = {1/0}, respectively. power-gating for the ocse is achieved by switching on/off the pin out2 (output of the ldo, see fig. 2);

c) controllable turn on/off load switches (clss) – each cls is implemented as a p-channel or n-channel mosfet transistor which can be individually switched on/off. in this manner power-gating at a local control level within the sensing subsystem is provided (i.e., the mcu can separately switch on/off the power supply voltage for each sensor element by setting the corresponding control line to logic one/zero).

fig. 2 power distribution system of sensor node (blocks: battery; dual-channel controllable ldo regulator with in, out1, out2, en1 and en2; controllable turn-on/off load switches cls1 to clsn feeding the analog or digital sensors se1 to sen; uninterruptible power supply line to the mcu; on/off power supply line to the rf block; global and individual power-gating enable lines from the mcu)

3. energy profile

the proposed wsn considered in this paper is composed of several sns deployed in a restricted area.
this system is primarily intended to monitor scalar values like acceleration, space orientation, and audio signals. in this type of application almost all of the mentioned sensor measurements do not need to be taken continuously, which implies that the environmental conditions can be sampled periodically. for example, taking one sample every two minutes could be adequate to monitor temperature, pressure, light, humidity, etc.

power management is an efficient way to conserve energy in a wsn. the crucial idea of power management is to dynamically make the sns inactive in order to reduce their energy consumption, i.e. to decide when a sn should go to the inactive state and how long it should stay there. most power management strategies proposed in literature [31], [41] assume that data acquisition (sensing activity) consumes significantly less energy than wireless data transmission [4]. however, in a large number of practical applications this assumption does not hold, especially when the power consumption of an active (not passive) sensor element is comparable to that of the communication subsystem. a similar problem was considered in references [42], [43].

in order to cope with this challenge in an effective way, we propose to implement the power management concept at two levels, the system level and the component level. at the first level, a duty cycle technique is used, by which we identify the idle and active time periods of the sn's constituents. at the second level, a power gating technique is used, by which unutilized sensor elements are switched off while the analyzed sensor element is switched on. in other words, our goal is that, during most of the time, the unnecessary power consumption of sensor elements due to a non-optimal configuration of hardware and software components is significantly reduced.

let us note that a sensor node as an electrical system is time invariant, i.e. the total energy consumption depends on its individual energy consumption components. having this in mind, in the sequel we will separately analyze the effects and benefits of implementing duty-cycling and power-gating techniques on energy consumption only for the sensing subsystem as a sn constituent.

3.1. duty cycling

duty cycling is a well-known technique for minimizing power consumption in wireless sns. the main idea behind it is clear: keep the hardware (sensing-, communication-, and some parts of the power- and processing-subsystems – see fig. 1) in a low power sleep state, except during instances when the hardware is needed. many realizations of the duty-cycling technique allow even the mcu to be put into a low power state for long time periods, while its internal or external clock tracks the time in order to trigger a later wake-up. the wake-up time is the time from activation of the interrupt signal (by a real time clock, rtc, circuit) to the beginning of an interrupt service routine. let us note that all activities which deal with the duty cycle operation (switching the transceiver, mcu, and low drop-out regulator into different power modes) are performed by the mcu under software control.

the total energy consumed by the sn, $E_T$, is the sum of the dynamic (active) energy, $E_D$, and the static (leakage) energy loss, $E_S$:

$$E_T = E_D + E_S = DC \cdot T_{lf} \cdot P_D + T_{lf} \cdot P_S \tag{2}$$

where $0 < DC < 1$, $T_{lf}$ is the lifetime of a sn, and $P_D$ and $P_S$ correspond to dynamic and static power, respectively. from eq. (2), the portion of energy lost due to the leakage is

$$\frac{E_S}{E_T} = \frac{1}{1 + DC \cdot \dfrac{P_D}{P_S}} \tag{3}$$

the ratio $P_D / P_S$ is technology dependent and is proportional to the mos transistor channel properties. similarly as in [18] (see footnote 1), taking the corresponding $P_D / P_S$ for three different cmos technologies, we have calculated the share of the energy loss in the total energy consumption in terms of the dc factor.
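as a quick numerical check of eqs. (2) and (3), the following python sketch evaluates the leakage share of the total energy for a given duty cycle. it is only an illustration under stated assumptions: the function names and the example duty cycles are hypothetical, the $P_D / P_S$ ratios are the rounded values the text assumes for the three cmos technologies in the discussion of fig. 3, and the code is not the authors' simulation code.

```python
def energy_components(duty_cycle, lifetime_s, p_dynamic_w, p_static_w):
    """Return (E_D, E_S) in joules according to eq. (2).

    E_D = DC * T_lf * P_D   (dynamic energy, only spent while active)
    E_S = T_lf * P_S        (static/leakage energy, spent all the time)
    """
    e_dynamic = duty_cycle * lifetime_s * p_dynamic_w
    e_static = lifetime_s * p_static_w
    return e_dynamic, e_static


def leakage_energy_fraction(duty_cycle, pd_over_ps):
    """Return E_S / E_T = 1 / (1 + DC * P_D / P_S), i.e. eq. (3)."""
    if not 0.0 < duty_cycle <= 1.0:
        raise ValueError("duty_cycle must lie in (0, 1]")
    if pd_over_ps <= 0.0:
        raise ValueError("pd_over_ps must be positive")
    return 1.0 / (1.0 + duty_cycle * pd_over_ps)


# Assumed P_D / P_S ratios per CMOS technology (rounded values from the text).
PD_OVER_PS = {"0.25 um": 1000.0, "0.18 um": 20.0, "0.13 um": 4.0}

if __name__ == "__main__":
    # Hypothetical duty cycles chosen only to illustrate the trend.
    for tech, ratio in PD_OVER_PS.items():
        shares = [leakage_energy_fraction(dc, ratio) for dc in (0.01, 0.1, 0.5)]
        print(tech, ["%.1f%%" % (100.0 * s) for s in shares])
```

the sketch makes the trend explicit: for a fixed duty cycle, a smaller $P_D / P_S$ ratio means that a larger share of the energy budget is lost to leakage, and lowering the duty cycle increases that share further.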
the obtained results are presented in fig. 3.

[fig. 3 energy loss due to leakage as a function of duty-cycle for different cmos technologies (0.25 μm, 0.18 μm and 0.13 μm cmos); horizontal axis: duty-cycle, vertical axis: $E_S / E_T$ [%]]

for digital components of the sn, similarly as in reference [44], we assume that $P_D / P_S$ is approximately 1000 for 0.25 μm technology, approximately 20 for 0.18 μm technology, and approximately 4 for 0.13 μm technology. by analyzing fig. 3 we can conclude the following:

1. with cmos technology scaling, the energy loss due to static power increases. in other words, the static power loss becomes comparable to the dynamic power loss (a high amount of power is lost due to the leakage currents of cmos circuitry [45]).
2. in standard applications the dc factor of the sn is low (on the order of 1%), which makes the total system power dominated by the standby power, i.e. static power losses.
3. theoretically, better energy efficiency (achieved by decreasing $E_D$) can be obtained by further decreasing the dc factor. however, in this case the influence of the clock system, as a component of the sn, on the overall time synchronization accuracy of the wsn becomes critical [46], [47]. namely, the impact of variations in environmental temperature on clock drift in highly duty-cycled wireless sns is emphasized [47].

footnote 1: for the sake of clarity, reference [18] defines the power consumption in the active state ($P_a = P_D + P_S$) and the power consumption in the inactive state ($P_i = P_S$).

3.2. power gating

with the aim of switching off the leakage currents of inactive sensor elements we decided to implement power gating, because as a design technique it is primarily used to reduce the overall static power loss in a circuit [48]. the efficiency of power gating depends on the activity profile of the sn's components.
by adapting an event-driven control mechanism we will first present the activity model of a sn at a general level (see fig. 4), and then in section 4 we will study energy consumption issues of the sensor element units (constituents of the sensing subsystem, see fig. 2) that switch on and off during the sensing period.

[fig. 4 activity profile of a sensor node: (a) events profile (wake-up events generated by the on-chip rtc with period t = 2 min); (b) duty cycle profile (sn_active state of duration ton and sn_sleep state of duration toff); (c) profile of sn activities during the sn_active state (initialization, sensing, data processing, communication; the initialization and sensing activities span tsen, see timing in fig. 6; for the power consumption of a single sensor element see fig. 5)]

as can be seen from fig. 4a), the rtc circuit, as a building block of the mcu, periodically generates an interrupt signal called wakeup. the period of wakeup is t, in our case t = 2 min. the appearance of the signal wakeup initiates the sn and it enters the sn_active state (see fig. 4b)). during the sn_active state (fig. 4c)) four sequential activities are performed: initialization, sensing, data processing, and communication. the initialization activity deals with restoring the content of the mcu registers to the preceding sn_active state and setting the peripherals (ldo regulator, controllable load switches, and transceiver, see fig. 2) into the corresponding operating mode. the sensing activity is responsible for information collection and analog-to-digital conversion. the energy consumption during this activity comes from multiple operations, including power-on (-off) switching of sensor elements, signal sampling, and analog-to-digital conversion/spi communication. if we assume that $n$ sensor elements are connected to the mcu (see fig.
2), then the total energy consumption of the sensing subsystem, $E_{ST}$, can be expressed as:

$$E_{ST} = N \cdot \sum_{i=1}^{n} \left( E_{FOi} + E_{OFi} + E_{Wi} + E_{Ci} \right) \qquad (4)$$

where:
- $E_{FOi}$ ($E_{OFi}$) is the one-time energy consumption of the opening (closing) sensor element operation, i.e. switching sensor element i from the off (on) to the on (off) state;
- $E_{Wi}$ is the energy consumption during the warm-up time period of sensor element i;
- $E_{Ci}$ is the energy consumption during the analog-to-digital conversion period;
- $N$ is the number of sn active states during the lifetime of the sn; and
- $n$ is the number of sensor elements in the sn.

the power consumption profile of a sn during a single sensing activity of a sensor element is sketched in fig. 5.

[fig. 5 power consumption profile of a single sensor element; axes: power versus time; power levels $P_{off}$ and $P_{on}$; intervals $t_{off \to on}$, $t_{wu}$, $t_{con}$ and $t_{on \to off}$ within one sensor period t]

notice: a time interval $t_{off \to on}$ ($t_{on \to off}$) includes the transient time of the controllable load switch clsi, i = 1,...,8, and the transient time of the sensor element sei.

as is marked in fig. 5, $t_{off \to on}$ ($t_{on \to off}$) corresponds to the time interval needed for switching the sensor element from the off (on) to the on (off) state, $t_{wu}$ to the warm-up time interval, and $t_{con}$ to the analog-to-digital conversion time interval. the basic constituents of most sensor elements are analog circuits (input and output amplifiers, active filters, etc.). in analog circuits, the power gates must be turned on long enough before the active system operation in order to allow the circuits to reach a stable dc state. this implies that both the sensor elements and their coupling with the source of stimulus cannot always respond instantly. namely, the sensor has a time-dependent characteristic, and a delay (latency) appears in representing the true value of a stimulus. in fig. 5 this delay corresponds to the warm-up time $t_{wu}$.
in essence, warm-up time is the time between applying power or an excitation signal to the sensor and the moment when the sensor can operate within its specified accuracy [48]. the warm-up time depends on the type of sensor. many sensors have a negligibly short warm-up time (in the range from 100 μs up to 1 ms), but those that operate in thermally or humidity controlled environments, such as a thermostat and a humidity sensor, may require from several hundred milliseconds up to seconds or minutes of warm-up time after powering-up only the sensor elements. from the aspect of energy consumption, a sensor with a shorter warm-up time causes a lower energy loss. let us assume that all sensors are homogeneous. this means that for every i, i = 1,...,n, the following is valid: $E_{Wi} = E_W$, $E_{Ci} = E_C$ and $E_{FOi} = E_{OFi} = E_S$. accordingly, eq. (4) now has the form

$$E_{ST} = N \cdot n \cdot (2E_S + E_W + E_C) \qquad (5)$$

the total energy consumed during warm-up time is

$$E_{TW} = N \cdot n \cdot E_W \qquad (6)$$

if we take that on average $E_S = 0.1 E_C$, then the portion of energy due to the warm-up is

$$\frac{E_{TW}}{E_{ST}} = \frac{N \cdot n \cdot E_W}{N \cdot n \cdot (2E_S + E_W + E_C)} = \frac{E_W}{1.2 E_C + E_W} = \frac{1}{1 + 1.2 \dfrac{E_C}{E_W}} \qquad (7)$$

if we further take $E_W = k \cdot E_C$, where k is a real number, eq. (7) reduces to $E_{TW} / E_{ST} = k / (k + 1.2)$, and the portion of energy loss, $E_{TW} / E_{ST}$, in terms of k is presented in table 2.

table 2 portion of energy lost in terms of k

| k | $E_{TW} / E_{ST}$ |
|---|---|
| 0 | 0 |
| 0.1 | 0.077 |
| 0.2 | 0.143 |
| 0.5 | 0.294 |
| 0.8 | 0.400 |
| 1 | 0.454 |
| 2 | 0.625 |
| 5 | 0.800 |
| 10 | 0.892 |
| 20 | 0.943 |
| 50 | 0.976 |
| 100 | 0.989 |
| 1000 | 0.999 |
| ∞ | 1 |

by analyzing the results presented in table 2 we can conclude that as the warm-up time increases the portion $E_{TW} / E_{ST}$ asymptotically approaches 1. this means that at the lower limit, $t_{wu} = 0$, the total energy loss is $E_{ST} = N \cdot n \cdot (2E_S + E_C)$, and at the upper limit, $t_{wu} \to \infty$, the total energy loss is $E_{ST} \approx N \cdot n \cdot E_W$, i.e. $E_W$ becomes dominant.
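the dependence in eq. (7) is easy to tabulate. the sketch below is a minimal python illustration, assuming $E_W = k \cdot E_C$ and $E_S = 0.1 E_C$ as in the text; the function name and the `es_over_ec` parameter are hypothetical helpers introduced only for this example.

```python
def warmup_fraction(k: float, es_over_ec: float = 0.1) -> float:
    """E_TW / E_ST from eq. (7), with E_W = k * E_C and E_S = es_over_ec * E_C."""
    if k < 0.0:
        raise ValueError("k must be non-negative")
    # Numerator: E_W / E_C = k. Denominator: (2 E_S + E_C + E_W) / E_C.
    return k / (2.0 * es_over_ec + 1.0 + k)


if __name__ == "__main__":
    for k in (0.0, 0.1, 0.2, 0.5, 0.8, 1.0, 2.0, 5.0, 10.0, 20.0, 50.0, 100.0, 1000.0):
        print(f"k = {k:7.1f} -> E_TW/E_ST = {warmup_fraction(k):.3f}")
```

the printed values are rounded to three decimals, so they may differ from the table 2 entries in the last digit; for k = 5 the formula gives about 0.806.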
in general, a better design solution concerning $E_{TW}$ is one in which tsw → 0, but in this case the sensor elements are active all the time. as a direct consequence of this approach the power consumption of the sensing subsystem will be high. to cope efficiently with this problem, the use of the power gating technique represents a good compromise. but in such a solution, the sensor warm-up time cannot be ignored when the sn's energy model is considered.

4. power estimation

in this article we continue our work [49], and present a complete energy consumption profile of the wireless sensor node during the initialization and sensing activities, only within the sn_active state. in our case, the sensing subsystem ocse (see fig. 2) is composed of eight sensor elements, se1,...,se8. sensor elements from se1 up to se7 are of analog type and drive the on-chip adc (a component of the mcu (msp430fr59xx)). these sensor elements are used for sensing temperature (lmt87), humidity (sht21s), acceleration (adxl377), ambient light (isl76671), position (ss345pt), motion (l3g3250a), and audio (microphone mp33ab01), respectively. the last sensor (t5400) is used for pressure measurement and it transfers data to the mcu via an spi interface. for more details about the electrical and timing specifications of the sensor elements see the farnell website [50]. the power supply voltage is out2 = 3 V (the vout2 output of the low-dropout dual-channel voltage regulator tlv716 [51]). each clsi is implemented as a p-channel mosfet transistor tps22908 [52] (see fig. 2). electrical and timing specifications (found in the device documentation and determined by direct measurements) and the energy consumption per sensor element (determined by calculation and direct measurements) are presented in table 3.

table 3 electrical and timing specifications, and calculated energy consumption per sensor element

| sensor element | sensor type | out2 [V] | se av. current [mA] | toff-on [ms] | efo [μJ] | twu [ms] | ew [μJ] | tcon* [ms] | ec [μJ] | ton-off [ms] | eof [μJ] |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | lmt87 | 3 | 0.0041 | 2.01 | 0.012 | - | n.a. | 0.0035 | 0.000043 | 0.005 | 0.000031 |
| 2 | sht21 | 3 | 0.1811 | 150.11 | 40.778 | 8000 | 4346.400 | 0.0035 | 0.001902 | 0.005 | 0.001358 |
| 3 | adxl377 | 3 | 0.3011 | 5.11 | 2.308 | - | n.a. | 0.0035 | 0.003162 | 0.005 | 0.002258 |
| 4 | isl76671 | 3 | 0.0361 | 0.205 | 0.011 | 0.350 | 0.038 | 0.0035 | 0.000379 | 0.005 | 0.000271 |
| 5 | ss345pt | 3 | 3.0011 | 0.11 | 0.495 | 0.0015 | 0.014 | 0.0035 | 0.031512 | 0.005 | 0.022508 |
| 6 | l3g3250a | 3 | 6.3011 | 0.11 | 1.040 | 0.3 | 5.671 | 0.0035 | 0.066162 | 0.005 | 0.047258 |
| 7 | mp33ab01 | 3 | 0.3011 | 0.11 | 0.050 | - | n.a. | 0.0035 | 0.003162 | 0.005 | 0.002258 |
| 8 | t5400 | 3 | 0.7911 | 2.61 | 3.097 | 10 | 23.733 | 16 | 37.9728 | 0.005 | 0.005933 |

notice: * the conversion time tcon is determined by the sar-adc, a constituent of the mcu; for 12-bit resolution it is 3.5 μs (identical for all sensor elements connected to the adc); n.a. stands for data not available from the catalog.

according to eq. (4) and the data presented in table 3, under the assumption that $N = 1$ and $n = 8$, the estimated energy consumption of our design during powering-up of the sensor elements (initial phase of the sensing activity) can be expressed as

$$E_{ST} = \sum_{i=1}^{8} \left( E_{FOi} + E_{OFi} + E_{Wi} + E_{Ci} \right) = 47.791 + 0.081875 + 4375.856 + 38.07912 = 4461.808\ \mu\text{J} \qquad (8)$$

let us note that this value corresponds to the worst case of energy consumption for all clsi and sei during the sensing activity (namely, after powering-up of the sensor elements this activity happens only once during the lifetime of the sensor node. it is typical for sensor element stabilization to environmental conditions. therefore, its impact on the power estimation can be neglected).

4.1. energy profile during initialization and sensing activities

in order to determine the total energy consumption during the active state of a sensor node it is necessary to take into account the energy loss of the other building blocks (mcu and bsu (see fig. 1 and 2)) during the time period tsen (see fig. 4).
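the sum in eq. (8) can be reproduced directly from the energy columns of table 3. the following python sketch is only a cross-check under stated assumptions: the tuple layout and function name are illustrative, and warm-up energies listed as n.a. are treated as zero, which is an assumption that happens to reproduce the total quoted in eq. (8).

```python
# (sensor, E_FO, E_W, E_C, E_OF) in microjoules, transcribed from table 3.
# Assumption: "n.a." warm-up energies are entered as 0.0.
TABLE3 = [
    ("lmt87",    0.012,  0.0,      0.000043, 0.000031),
    ("sht21",    40.778, 4346.400, 0.001902, 0.001358),
    ("adxl377",  2.308,  0.0,      0.003162, 0.002258),
    ("isl76671", 0.011,  0.038,    0.000379, 0.000271),
    ("ss345pt",  0.495,  0.014,    0.031512, 0.022508),
    ("l3g3250a", 1.040,  5.671,    0.066162, 0.047258),
    ("mp33ab01", 0.050,  0.0,      0.003162, 0.002258),
    ("t5400",    3.097,  23.733,   37.9728,  0.005933),
]


def sensing_energy_uj(rows, active_states: int = 1) -> float:
    """Eq. (4): N times the sum over sensor elements of E_FO + E_OF + E_W + E_C."""
    per_wakeup = sum(e_fo + e_of + e_w + e_c for _, e_fo, e_w, e_c, e_of in rows)
    return active_states * per_wakeup


if __name__ == "__main__":
    total = sensing_energy_uj(TABLE3)
    warmup = sum(row[2] for row in TABLE3)
    print(f"E_ST = {total:.3f} uJ, warm-up share = {100.0 * warmup / total:.1f} %")
```

with these inputs almost all of the energy comes from the warm-up of the humidity sensor, which is consistent with the discussion of warm-up time in section 3.2.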
a detailed timing diagram of the initialization and sensing activities is presented in fig. 6. as can be seen from fig. 6a), the initialization activity begins at tstart-on and ends at t1. the sensing activity covers the right part of fig. 6a), the time interval from t1 to t2, continues in fig. 6b), the time interval from t2 to t3, and ends with the left part of fig. 6c), the time interval from t3 to tend-on. the right part of fig. 6c) includes the data processing and communication activities, the time interval from tend-on to tstart-off, and the sn_sleep state, the time interval from tstart-off to tend-off. the duration of the time interval from tstart-on to tend-off is 2 min. table 4 gives the duration of every time sub-interval of the initialization and sensing activities (defined in fig. 6), together with the average current and energy consumption for each sub-interval. the total duration of the initialization and sensing activities is tsen = 191.036 ms and the corresponding energy consumption during this period is 278.31 μJ. timing diagrams and the power consumption profile during the initialization and sensing activities (obtained by matlab) are presented in fig. 7. subplot 1 (lower-left part of fig. 7) deals with the initialization activity and acquiring data from se1 and se2. subplot 2 (lower-right part of fig. 7) refers to acquiring data from se3 to se8.

[fig. 6 profile of power consumption during sensing activity: (a) initialization and start of sensing, from tstart-on through t1 to t2 (mcu wake-up, execution of the first instruction of the user program, ldo activation, activation of the load switches, temperature sensor s1 and humidity sensor s2); (b) sensing from t2 to t3 (accelerometer s3, ambient light sensor s4, position sensor s5 and motion sensor s6); (c) end of sensing from t3 to tend-on (microphone s7 and pressure sensor s8), data processing and communication from tend-on to tstart-off, and sn_sleep (lpm 3.5 operating mode) from tstart-off to tend-off. plotted signals: mcu current, ldo out1 and out2 quiescent currents, ldo out2 shutdown current, clsx quiescent and leakage currents, and sensor currents s1 to s8]

table 4 time interval duration, average current and energy consumption during initialization and sensing activities for each mcu and ocse sub-interval

| time interval | duration [ms] | average current [mA] | energy [μJ] |
|---|---|---|---|
| twum | 0.3500 | 0.290 | 0.1522500 |
| tin | 1.0000 | 0.495 | 1.4850000 |
| twul | 0.9000 | 0.495 | 0.6682500 |
| twls | 0.1600 | 0.570 | 0.1368000 |
| tws1 | 1.9000 | 0.590 | 1.6815000 |
| tadc | 0.0035 | 0.720 | 0.0075600 |
| tcmt | 0.0050 | 0.075 | 0.0005625 |
| tcmt | 0.1600 | 0.078 | 0.0187200 |
| tws2 | 150.000 | 0.729 | 164.02500 |
| tadc | 0.0035 | 0.877 | 0.0092085 |
| tcmt | 0.0050 | 0.075 | 0.0005625 |
| tcmt | 0.1600 | 0.235 | 0.0564000 |
| tws3 | 5.0000 | 0.869 | 6.5175000 |
| tadc | 0.0035 | 1.017 | 0.0106785 |
| tcmt | 0.0050 | 0.075 | 0.0005625 |
| tcmt | 0.1600 | 0.375 | 0.0900000 |
| tws4 | 0.4450 | 0.604 | 0.4031700 |
| tadc | 0.0035 | 0.752 | 0.0078960 |
| tcmt | 0.0050 | 0.075 | 0.0005625 |
| tcmt | 0.1600 | 0.110 | 0.0264000 |
| tws5 | 0.0015 | 3.569 | 0.00803025 |
| tadc | 0.0035 | 3.717 | 0.0390285 |
| tcmt | 0.0050 | 0.075 | 0.0056250 |
| tcmt | 0.1600 | 3.075 | 0.7380000 |
| tws6 | 1.0000 | 6.869 | 10.303500 |
| tadc | 0.0035 | 7.017 | 0.0736785 |
| tcmt | 0.0050 | 0.075 | 0.0056250 |
| tcmt | 0.1600 | 6.375 | 1.5300000 |
| tws7 | 3.0000 | 0.869 | 3.9105000 |
| tadc | 0.0035 | 1.017 | 0.0106785 |
| tcmt | 0.0050 | 0.075 | 0.0056250 |
| tcmt | 0.1600 | 0.078 | 0.0187200 |
| tws8 | 10.000 | 1.359 | 20.385000 |
| tspi | 16.000 | 1.373 | 65.904000 |
| tol | 0.1000 | 0.496 | 0.0744000 |

notice: twum is the wake-up time of the mcu; tin is the mcu initialization; twul is the out2 wake-up time of the ldo; twls is the wake-up time of the cls; twsx is the warm-up time of sensor x = {1, 2,..., 8}; tadcx is the conversion time, x = {1, 2,..., 7}; tcmt is the switching time, which includes tturn-off(lsx+sx) + tturn-on(lsx) and is therefore listed as two rows; tspi is the spi time; tol is the turn-off time of the ldo.

[fig. 7 diagrams of power consumption during initialization and sensing activities for mcu and ocse blocks]

in order to evaluate the performance of our design concerning energy reduction, we have compared the following two design solutions: a) the total energy consumption of the sensor subsystem, epg, during the initialization and sensing activities with the duty-cycling and power-gating techniques implemented (epg = 638 μJ); and b) the total energy consumption of the sensor subsystem, ewpg, without the duty-cycling and power-gating techniques (ewpg = 3.92 J). the estimated ratio is ewpg / epg = 6146. the obtained result justifies the use of both power saving techniques in the sensing subsystem of a wireless sensor node.

5. conclusion

wireless sensor nodes place sensor elements in the physical world in order to gather information. this activity consumes energy. due to the limited battery capacity, energy conservation becomes a goal. this paper attempts to provide a comprehensive insight into aspects of the energy consumption of a sensing subsystem within a sensor node architecture. in order to reduce the energy consumption of a sensor node in operation, we propose using a combination of two power saving techniques. the first one, called duty cycling, is used for power reduction at the system level, i.e. switching the sensor node architecture between the active and sleep states. the second one, referred to as power gating, is intended for switching the sensor elements (constituents of the sensing subsystem within the sn) on and off while acquiring information from the external environment. the obtained results, based on analysis and validation by matlab, show that on average a reduction in energy consumption of three orders of magnitude can be achieved when the two power saving techniques are implemented, with respect to the case when they are turned off.
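the totals quoted for table 4 and the reported energy ratio can be cross-checked with a few lines of python. this is a sketch under stated assumptions: the rows are transcribed from table 4 as printed, the unlabelled 0.16 ms rows are listed under tcmt, and the function names are illustrative.

```python
# (sub-interval, duration [ms], energy [uJ]) transcribed from table 4, in order.
# Assumption: unlabelled 0.16 ms rows belong to the preceding tcmt entry.
TABLE4 = [
    ("twum", 0.3500, 0.1522500), ("tin", 1.0000, 1.4850000),
    ("twul", 0.9000, 0.6682500), ("twls", 0.1600, 0.1368000),
    ("tws1", 1.9000, 1.6815000), ("tadc", 0.0035, 0.0075600),
    ("tcmt", 0.0050, 0.0005625), ("tcmt", 0.1600, 0.0187200),
    ("tws2", 150.000, 164.02500), ("tadc", 0.0035, 0.0092085),
    ("tcmt", 0.0050, 0.0005625), ("tcmt", 0.1600, 0.0564000),
    ("tws3", 5.0000, 6.5175000), ("tadc", 0.0035, 0.0106785),
    ("tcmt", 0.0050, 0.0005625), ("tcmt", 0.1600, 0.0900000),
    ("tws4", 0.4450, 0.4031700), ("tadc", 0.0035, 0.0078960),
    ("tcmt", 0.0050, 0.0005625), ("tcmt", 0.1600, 0.0264000),
    ("tws5", 0.0015, 0.00803025), ("tadc", 0.0035, 0.0390285),
    ("tcmt", 0.0050, 0.0056250), ("tcmt", 0.1600, 0.7380000),
    ("tws6", 1.0000, 10.303500), ("tadc", 0.0035, 0.0736785),
    ("tcmt", 0.0050, 0.0056250), ("tcmt", 0.1600, 1.5300000),
    ("tws7", 3.0000, 3.9105000), ("tadc", 0.0035, 0.0106785),
    ("tcmt", 0.0050, 0.0056250), ("tcmt", 0.1600, 0.0187200),
    ("tws8", 10.000, 20.385000), ("tspi", 16.000, 65.904000),
    ("tol", 0.1000, 0.0744000),
]


def total_duration_ms(rows) -> float:
    """Sum of all sub-interval durations, i.e. tsen in milliseconds."""
    return sum(duration for _, duration, _ in rows)


def total_energy_uj(rows) -> float:
    """Sum of all sub-interval energies in microjoules."""
    return sum(energy for _, _, energy in rows)


def reduction_ratio(e_without_j: float, e_with_uj: float) -> float:
    """Ratio of the energy without power saving [J] to the energy with it [uJ]."""
    return e_without_j / (e_with_uj * 1e-6)


if __name__ == "__main__":
    print(f"tsen = {total_duration_ms(TABLE4):.3f} ms")
    print(f"energy = {total_energy_uj(TABLE4):.2f} uJ")
    print(f"ewpg / epg = {reduction_ratio(3.92, 638.0):.0f}")
```

computed from the rounded values 3.92 J and 638 μJ, the ratio comes out at roughly 6.1 × 10^3, close to the reported 6146; the small difference is presumably due to rounding of the quoted energies.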
for the time period of two minutes the energy consumption when the two techniques are used is 638 μJ, compared to 3.92 J in the case when the duty-cycling and power-gating techniques are turned off. acknowledgement: this work was supported by the serbian ministry of education, science and technological development, project no. tr-32009 – “low-power reconfigurable fault-tolerant platforms”. references [1] i. f. akyildiz, and m. c. vuran, "wireless sensor networks", john wiley & sons ltd, 2010 [2] a.j. goldsmith, and s. b. wicker, "design challenges for energy constrained ad hoc wireless networks", ieee wireless communications, 2002, vol. 9, no. 4, (pp. 8-27) [3] g. pistoria, "battery operated devices and systems", elsevier bv., amsterdam, the netherlands, 2009 [4] v. raghunathan, s. ganerival, and m. srivastava, "emerging techniques for long lived wireless sensor networks", ieee communication magazine, 2006, vol. 41, no. 4, (pp. 130-141) [5] g. anastasi, m. conti, m. di francesco, and a. passarella, "energy conservation in wireless sensor networks: a survey", ad hoc networks, 2009, vol. 7, (pp. 537–568) [6] m. n. halgamuge, m. zukerman, and k. ramamohanarao, "an estimation of sensor energy consumption", progress in electromagnetics research b, 2009, vol. 12, (pp. 259-295) [7] w. ye, j. heidemann, and d. estrin, "an energy-efficient mac protocol for wireless sensor networks," proc. ieee infocom, new york (usa) 2002, (pp. 1567-1576). [8] m. al ameen, s.m. riazul islam, and k. kwak, "energy saving mechanisms for mac protocols in wireless sensor networks", hindawi publishing corporation international journal of distributed sensor networks, volume 2010 (2010), article id 163413, (pp. 1-16) [9] m. r. ahmad, e. dutkiewicz, and x. huang (2011), "a survey of low duty cycle mac protocols in wireless sensor networks", ch. 5, (pp. 69 – 90), in "emerging communications for wireless sensor networks", eds. a. foerster and a. foerster, pub. by intech, 2011, rijeka, croatia [10] j. n.
al-karaki and a. e. kamal, "routing techniques in wireless sensor networks: a survey", ieee wireless communications, 2004, vol. 11, no. 6, (pp. 6-28). [11] e. y. lin, "a comprehensive study of power-efficient rendezvous schemes for wireless sensor networks", phd thesis, university of california, berkeley, 2005 [12] e. a. lin, j. m. rabaey, and a. wolisz, "power-efficient rendez-vous schemes for dense wireless sensor networks", in proceeding of icc2004, paris, france, june 2004, vol. 7, (pp. 3769 – 3776) [13] m. hempstead, n. tripathi, p. mauro, g.-y. wei, and d. brooks, "an ultra low power system architecture for sensor network applications," proc. 32nd annual international symposium on computer architecture, madison (usa) 2005, (pp. 208-219). [14] a. boulis, s. ganeriwal, and m. srivastava, "aggregation in sensor networks: an energy accuracy trade-off", ad hoc networks, vol. 1, 2003, (pp. 317–331) [15] b. h. calhoun, d. c. daly, n. verma, d. finchelstein, d. d. wentzloff, a. wang, s.-h. cho, and a. p. chandrakasan, "design considerations for ultra-low energy wireless micro-sensor nodes," ieee trans. computers, 2005, vol. 54, no. 6, (pp. 727-740) [16] c. lynch, and f. o'reilly, "processor choice for wireless sensor networks", workshop on real-world wireless sensor networks, realwsn'05, stockholm, sweden, 20-21 june 2005, (pp. 1-5) [17] d. singh, "micro-controller for sensor networks", msc. th., department of computer science and engineering, indian institute of technology, kharagpur, india, may 2008 [18] g. panić, z. stamenković, and r. kraemer, "power gating in wireless sensor networks", wireless pervasive computing, 2008. iswpc 2008. 3rd international symposium on, santorini, greece, may 2008, (pp. 499-503) [19] h. jiang, m. marek-sadowska, and s. nassif, "benefits and costs of power-gating technique", proc. ieee int'l conf.
computer design: vlsi in computers and processors (iccd '05), san jose, ca, usa, 2-5. oct. 2005, (pp. 559-566) [20] t. burd, and r. brodersen, "energy efficient microprocessor design", kluwer academic publishers, norwell ma, usa, 2002 [21] n. weste and d. harris, "integrated circuit design", pearson education, boston, usa, 2011 [22] g. panić, d. dietterle, and z. stamenković, "architecture of a power-gated wireless sensor node" , proc. 11 th euromicro conference on digital system design, 2008, parma, italy, (pp. 844-849) [23] y. lee, g. chen, s. hanson, d. sylvester and d. blaauw, "ultra-low power circuit techniques for a new class of sub-mm 3 sensor nodes", custom integrated circuits conference (cicc), 2010 ieee, 1922 sept. 2010, san jose, ca, usa, (pp. 1 – 8) [24] w. heizelman, a. chadrakasan, and h. balakrishnan, "an application-specific protocol architecture for wireless micro-sensor networks", ieee trans. on wireless communications, vol. 1, no. 4, oct. 2002, (pp. 666-670) [25] j. zhu, and s. papavassilion, "on the energy-efficient organization and the lifetime of multi-hop sensor networks", ieee communication letters, vol. 7, no. 11, nov. 2003, (pp. 537-539) [26] m. mille and n. vaidya, "a mac protocol to reduce sensor network energy consumption using a wake-up radio", ieee trans on mobile computing, vol. 4, no. 3, may, 2005, (pp. 228-242) [27] h.y. zhou, d. luo, y. gao, and d. zuo, "modeling of node energy consumption for wireless sensor networks", wireless sensor networks, vol. 3, 2011, (pp. 18-23) [28] c. alippi, g. anastasi, m. di francesco, and m. roveri, "energy management in wireless sensor networks with energy-hungry sensors", ieee instrumentation and measurement magazine, vol.12, no. 2, april 2009, pp. 16-23 [29] p. dutta, d. culler and s. shenker, "procrastination might lead to a longer and more useful life", in proceedings of the acm sixth workshop on hot topics in networks (hotnets-vi), 2007, atlanta, georgia, usa, (pp. 1-7) [30] g. chen, s. 
hanson, d. blaauw, and d. silvester, "circuit design advances for wireless sensing applications", proceedings of the ieee, vol. 98, no. 11, november 2010, (pp. 1808-1826) [31] w. dargie, “dynamic power management in wireless sensor networks: state-of-the-art”, sensors journal, ieee, vol. 12, no. 5, 2012, (pp. 1518 1528) [32] l. mateu; and f. moll, "review of energy harvesting techniques and applications for microelectronics", proc. spie 5837, vlsi circuits and systems ii, seville, spain, may 09, 2005, (pp. 115 ); [33] s. beeby, and n. white, "energy harvesting for autonomous systems", artech house, norwood, ma usa, 2010 [34] m.a.m viera, c.n. coelho, d.c. da silva jr., j.m. mata, ”survey on wireless sensor network devices”, ieee conference emerging technologies and factory automation, lisbon, portugal, 16-19 sept. 2003, vol.1, (pp. 537-544) [35] a.l. aita, m. pertijs, k. makinwa, and j.h. hujsing, "a cmos smart temperature sensor with a batchcalibrated inaccuracy of ±0,25 0 c(3δ) from -70 0 c to 130 0 c" in proceedings of the ieee solid state circuits conference, san francisco, ca, usa, feb. 2009, (pp. 342-343,343a) [36] s. hanson and d. sylvester, "a 0.45-0.7 v sub-microwatt cmos image sensor for ultra-low power applications", in proceedings of the symposium on very large scale integration (vlsi) circuits, vol. 1, kyoto, japan, jun. 2009, (pp. 176-177) [37] s. hansen, m. seok, y.s. liu, z.y. fao, d. kim, y. lee, n. liu, d. sylvester, and d. blaauw, "a lowvoltage processor for sensing applications with picowatt standby mode", ieee journal of solid-state circuits, vol. 44, no.4 april 2009, (pp. 1145-1155) [38] n. verma, and a.f.chandrakasan, "an ultra-low energy 12-bit rate resolution scalable sar adc for wireless sensor nodes", ieee journal of solid-state circuits, vol. 42, no. 6, june 2007, (pp. 1196-1205) [39] m. kuorilehto, m. kohvakka, j. suhonen, p. hamalainen, m. hannikainen, and t. d. 
hamalainen, "ultra-low energy wireless sensor networks in practice: theory, realization and deployment", john wiley & sons ltd, 2007, chichester, uk [40] b. krishnamachari, "networking wireless sensors", cambridge university press 2005, cambridge, uk [41] f. juan, b. lian, and z. hongwei, "hierarchically coordinated power management for target tracking in wireless sensor networks", international journal of advanced robotic systems, feb. 2013, vol. 10, (pp. 1-14) [42] v. jeliĉić, "power management in wireless sensor networks with high-consuming sensors", technical project report, april 2011, university of zagreb, faculty of electrical engineering and computing, (pp. 1-9), av. february 2014 at http://www.ztel.fer.unizg.hr/_download/repository/vjelicic,kdi.pdf [43] h. joe, j. park, c. lim, d. woo, and h. kim, "instruction-level power estimator for sensor networks", etri journal, vol. 30, no. 1, february 2008, (pp. 47-58) [44] leibniz institute for high performance microelectronics – ihp, frankfurt (oder), germany, http://www.ihp-microelectronics.com [45] n.s. kim, t. austin, d baauw, t. mudge, k. flautner, j.s. hu, m.j. irwin, m. kandemir, and v. narayanan, "leakage current: moore's law meets static power", ieee computer, dec. 2003, vol. 36, no. 12, (pp. 68-75) [46] m. kosanovic, m. stojcev, "rpats – reliable power time synchronization protocol", microelectronics reliability, vol. 54, no. 1, 2014, (pp. 303-315) [47] t. schmid, r. shea, z. charbiwala, j. friedman, m. srivastava, and y. cho, "on the interaction of clocks, power, and synchronization in duty-cycled embedded sensor nodes", acm transactions on sensor network, vol. 7, no. 3, 2010, (pp. 1-19), article no. 24 [48] j. fraden, "handbook of modern sensors: physics, designs, and applications", fourth edition, springer new york, 2010 [49] g. nikolic, g. panic, z. stamenkovic, g. jovanovic and m.
stojcev, "implementation of external power-gating technique during sensing phase in wireless sensor networks", 29th international conference on microelectronics miel 2014, belgrade, serbia, 12-15 may 2014, accepted for presentation [50] online catalogue of www.farnell.com av. at january 2014 [51] texas instruments, low-dropout voltage regulator, av. at www.ti.com/lit/gpn/tlv716120275p, january 2014 [52] texas instruments, low ron load switch, av. at http://www.ti.com/lit/ds/symlink/tps22908.pdf, january 2014

facta universitatis series: electronics and energetics vol. 32, no 3, september 2019, pp. 449-461 https://doi.org/10.2298/fuee1903449b

frequency scanning antenna arrays with metamaterial based phased shifters

nikola bošković1, branka jokanović1, vera marković2
1institute of physics, university of belgrade, serbia
2faculty of electronic engineering, university of niš, serbia

abstract. this paper presents a simple design of linear series-fed frequency scanning antenna arrays with: (a) identical rectangular dipoles and (b) pentagonal dipoles having different impedances to provide enhanced side lobe suppression. phase shifters are designed as a metamaterial unit cell consisting of split-ring resonators coupled with the parallel microstrip line. variations of the shifter model are described and control of the phase is demonstrated. two antenna arrays are manufactured and measured.

key words: scanning antenna array, linear array, series feeding, pentagonal dipole, phase shifter, split-ring resonator.

1. introduction

antenna elements come in various forms in terms of technology, size, cost and radiation properties. nevertheless, a single antenna typically has an omnidirectional radiation pattern and low gain.
in many applications, there is a need for directional and high gain radiation, which can be generated by combining multiple antenna elements in different arrangements. one of the most notable problems in antenna arrays is the emergence of side lobes. they can be observed as radiation in unwanted directions, as a direct consequence of the configuration of the array elements. high levels of side lobes can make it hard to isolate desired signals and introduce uncertainty in determining the position of a specific object, which is especially important in radar applications. the two main factors which determine the sidelobe levels (slls) are the power distribution between the elements and the distance between them. typical demands are slls below -20 dB [1], relative to the main lobe. slls from -30 to -20 dB can typically be achieved solely by the power distribution, but for higher side lobe suppression, the distance between the elements must be considered as well. printed antennas are by far the most popular antennas for applications due to size and shape diversity, ease of fabrication and integration, low cost and high flexibility in resonant frequency, polarization, radiation pattern and impedance. they come in the forms of patches, dipoles, slots etc. their main drawback is a typically low power handling capability due to the low thermal conductivity of regularly used dielectrics. however, the development of various dielectrics with high thermal conductivity, similar to aluminum nitride (aln) ceramic, can overcome even this obstacle. other problems like surface waves, spurious radiation and losses can be controlled in different ways [2-10].

received november 2, 2018; received in revised form february 6, 2019
corresponding author: nikola bošković, institute of physics, university of belgrade, pregrevica 118, 11080 belgrade, serbia (e-mail: nikolab@ipb.ac.rs)
in order to achieve low slls in a frequency scanning antenna with a linear element arrangement, the appropriate power distribution can be extremely hard to implement because it can require a very high ratio of the impedances of the radiating elements. in addition, there is a need to maintain the desired distribution in a wide frequency band while avoiding main beam deformation, which can be caused by beam squinting due to the frequency change [11]. in this paper, experimental results for the array with regular dipoles are compared with the experimental results for the array with enhanced pentagonal dipoles. both models use the same shifter based on four srrs, the same dielectric and the same distance between elements, and both are designed to work at 10 GHz, which makes the comparison fair.

[fig. 1 antenna array feeds: (a) series feed and (b) corporate feed; φ marks the phase shifters]

2. printed frequency scanning antenna

with technological development, there is a great need for scanning antennas, which enable tracking of a specific position in space with great accuracy and resolution [12-15]. in the past, such solutions were predominantly based on a combination of an antenna array and a customized mechanical system for pointing the array in a specific direction. such systems have limited agility, require periodic maintenance of the mechanical parts, and, due to their high price, have limited usage. with the development of modern electronics, frequency scanning was introduced at a much lower price and with better performance and reliability. a basic frequency scanning antenna array consists of radiating elements in a specific spatial distribution and frequency-dependent elements between them, which introduce a different phase shift depending on the applied frequency; hence these elements can be called phase shifters.
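to make the role of the inter-element phase shift concrete, the following python sketch evaluates the textbook array factor of a uniform linear array and the resulting main-beam direction. it is a generic illustration and not the design procedure of this paper: equal element amplitudes, isotropic elements, a progressive phase of m·∆φ on element m, and the function and parameter names are all assumptions made for the example.

```python
import cmath
import math


def main_beam_angle_deg(spacing_wl: float, delta_phi_deg: float) -> float:
    """Main-beam angle from broadside for a progressive phase shift delta_phi.

    The beam points where 2*pi*d*sin(theta) + delta_phi = 0 (d in wavelengths).
    """
    s = -math.radians(delta_phi_deg) / (2.0 * math.pi * spacing_wl)
    if abs(s) > 1.0:
        raise ValueError("phase shift places the main beam outside visible space")
    return math.degrees(math.asin(s))


def array_factor_db(theta_deg: float, n_elements: int,
                    spacing_wl: float, delta_phi_deg: float) -> float:
    """Normalized array factor in dB for equal-amplitude elements."""
    psi = (2.0 * math.pi * spacing_wl * math.sin(math.radians(theta_deg))
           + math.radians(delta_phi_deg))
    af = sum(cmath.exp(1j * m * psi) for m in range(n_elements)) / n_elements
    return 20.0 * math.log10(max(abs(af), 1e-12))  # floor avoids log10(0)


if __name__ == "__main__":
    # Sweep of the phase shift, standing in for a sweep of frequency.
    for delta_phi in (-90.0, -45.0, 0.0, 45.0, 90.0):
        theta0 = main_beam_angle_deg(0.5, delta_phi)
        print(f"delta_phi = {delta_phi:6.1f} deg -> main beam at {theta0:6.1f} deg")
```

in a frequency scanning array the value of ∆φ is set by the frequency response of the phase shifter, so sweeping the frequency moves the beam in the same way as the sweep of `delta_phi` above. equal amplitudes are used here only for simplicity; a tapered power distribution would be needed to reach the low slls discussed in the introduction.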
a typical configuration is a linear series-fed array, which enables scanning in one plane. by combining multiple linear arrays in a planar array, we can obtain scanning in a second plane, although scanning in the second plane is typically achieved in a different manner than in the linear array. a frequency scanning antenna enables continuous coverage of the scanned spatial range. the angular resolution depends on the 3 db beamwidth of the main beam, and the scanning speed is determined by the frequency dependence of the phase shifter. for linear arrangements, an array of n radiating elements requires n-1 phase shifters, since for basic operation the phase shift ∆φ between two successive elements is constant, and a constant phase increment must be maintained from the first to the last element in the plane of scanning. depending on their structure, phase shifters can have significant losses and non-linearity, which can seriously degrade the array characteristics. in addition, in order to provide low slls, a suitable power distribution needs to be implemented, which can be challenging in a series-fed array (fig. 1a) because the phase shifters and radiating elements change their performance over the frequency range. because of that, fully corporate-fed arrays (fig. 1b) are occasionally used, but they require a much greater number of phase shifters, are significantly larger and have lower efficiency.

another commonly used type of electronic scanning is via switches. the principle is that every antenna element is connected to the power source via one of several available phase shifters. the shifter is selected by on/off switching, and the main beam is positioned in a certain direction. the whole system can work at a single frequency, but the number of available directions depends on the number of available phase shifts. similar basic principles of operation are used with the rotman lens [16-17], the butler matrix [18-19] and similar structures.
they typically have n inputs and n outputs, which are connected to the antenna elements. connecting the power source to different inputs generates different main beam positions. by combining one of these types of electronic scanning in one plane with frequency scanning in the other, scanning in two planes is enabled at the same time.

compared with an electronic scanning antenna, a frequency scanning antenna should be cheaper, easier to manufacture and easier to integrate with other components, with more stable characteristics in the working range and higher efficiency. a natural choice would be a printed antenna structure with antenna elements that have stable radiation and impedance characteristics in the working range. the printed pentagonal dipole is an excellent choice of radiating element: it satisfies the demand for a wider impedance bandwidth because it works at the second resonance, and it has stable radiation characteristics over a wide frequency band [20-21].

2.1. antenna array technology

the printed dipole can be naturally implemented in two different technologies. one is the coplanar stripline (cps) and the other is the symmetrical (balanced) microstrip line. cps is a balanced uniplanar transmission line consisting of two metallic conductor strips on a substrate, separated by a gap of a certain width. the cps line has no bottom metallization of the substrate for the ground; instead, a virtual ground is located at the symmetry plane between the two conductors. the balanced microstrip line is equivalent to the classical microstrip line and is represented by two identical parallel lines, one on each side of the dielectric. for a given substrate height and line width, the impedance of the balanced microstrip line is equal to double the impedance of a microstrip line with identical width and half the substrate height.
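the "double the impedance at half the height" rule can be tried out numerically. the sketch below is an illustration only: it uses a widely quoted quasi-static closed-form approximation for the classical microstrip line (zero strip thickness, no dispersion), which is not taken from this paper, and the line width of 0.78 mm is a hypothetical value chosen for the example.

```python
import math


def microstrip_z0(width, height, eps_r):
    """Approximate characteristic impedance (ohms) of a classical microstrip.

    Quasi-static closed-form approximation; strip thickness and dispersion
    are ignored, so the result is only a rough estimate.
    """
    u = width / height
    eps_eff = (eps_r + 1.0) / 2.0 + (eps_r - 1.0) / 2.0 / math.sqrt(1.0 + 12.0 / u)
    if u <= 1.0:
        return 60.0 / math.sqrt(eps_eff) * math.log(8.0 / u + u / 4.0)
    return 120.0 * math.pi / (
        math.sqrt(eps_eff) * (u + 1.393 + 0.667 * math.log(u + 1.444))
    )


def balanced_microstrip_z0(width, height, eps_r):
    """Balanced line: twice the microstrip impedance at half the substrate height.

    The virtual ground sits at mid-height, so each strip sees a microstrip
    with the same width and height / 2.
    """
    return 2.0 * microstrip_z0(width, height / 2.0, eps_r)


if __name__ == "__main__":
    # Illustrative values: eps_r = 2.17, 0.508 mm substrate, assumed 0.78 mm width.
    print(round(balanced_microstrip_z0(0.78e-3, 0.508e-3, 2.17), 1))
```

under these assumptions the balanced line comes out close to 100 ohms, while the same strip over a real ground plane at half the height would be close to 50 ohms.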
the cps line offers flexibility in the design of planar microwave and millimeter-wave circuits, especially in mounting solid-state devices in series or shunt without via holes. it exhibits low loss, small dispersion, small discontinuity parasitics, considerable insensitivity to substrate thickness and simple implementation of open and short circuits. the cps line has a typical impedance of around 200 ohms, which is much higher than the typical 50 ohms of a microstrip line. in a series-fed array, it is very important to have a high impedance ratio between the radiating elements and the feeding transmission line available in order to achieve the proper power distribution. since the balanced microstrip line can achieve a much lower impedance, it is a better choice for this type of array.

2.2. frequency scanning performance

the frequency bandwidth is a valuable and limited resource, and certain bands are restricted for specific use [22]. two important parameters of frequency scanning systems are range and angular resolution. the angular resolution of a beam scanning system is defined by the 3 db beamwidth of the antenna main lobe: two identical targets at the same distance are resolved in angle if they are separated by more than the antenna 3 db beamwidth. an antenna system with a fixed beam provides only range resolution. range resolution is the ability of an antenna system to distinguish between two or more targets on the same bearing but at different ranges. the pulse width is the primary factor in range resolution, and it is generally the inverse of the pulse bandwidth. the higher the available bandwidth, the greater the range precision that can be obtained. frequency scanning provides the angular resolution: a narrower 3 db beamwidth gives greater precision in determining the angular position of the target. the performance of a frequency scanning system is therefore a typical trade-off between angular and range resolution.

the simplest phase shifter is a basic transmission line.
its length is directly proportional to its phase shift contribution. any phase shifter can be approximated by a transmission line of a certain length. two parameters determine the overall position of the main beam during scanning for a simple antenna array: the distance between radiating elements (d) and the length of the transmission line (l), as shown in fig. 2. the dependence between the scanning angle θ and the relative frequency change ∆f / f0 is given as:

$$\sin\theta = \frac{l}{d}\,\frac{\Delta f}{f_0} = \frac{\Delta\varphi}{360^\circ}\,\frac{\lambda_0}{d}, \qquad \Delta f = f - f_0 \qquad (1)$$

where the beam is steered over the limits ±θ, f0 represents the central frequency at which the main beam is positioned broadside, λ0 is the free-space wavelength at the central frequency and ∆φ is the phase shift between two succeeding radiating elements (phase increment). if the distance between the elements is fixed at the typical value of 0.5λ0, the length l and the available frequency bandwidth determine the scanning properties.

fig. 2 frequency scanning antenna array with different positions of the main beam.

as can be seen, the same scanning angle (sector) can be obtained with different values of l and ∆f, by varying one of them while the other is fixed. in practical applications, the frequency bandwidth is specified and l is used to obtain a specific scanning angle. as a practical example, let us say that the available relative bandwidth is 20% and the required scanning is ±25°; then from (1) l would have to be around 4.2d, that is 2.1λ0 for the previously stated typical value of d. for 10% relative bandwidth, that value would be 4.2λ0. in both cases, the resulting phase shift would be around ±76 degrees. the relative bandwidth corresponds to 2∆f / f0 in (1), since the total scanning sector is 2θ.
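the worked example above can be reproduced with a few lines of code. this is a minimal sketch that assumes relation (1) has the form sin θ = (l/d)(∆f/f0) = (∆φ/360°)(λ0/d), and that ∆f/f0 is half of the quoted relative bandwidth because the beam swings to ±θ; the function names are hypothetical.

```python
import math


def line_length_over_d(theta_deg, rel_half_bandwidth):
    """Equivalent line length l, in units of d, needed to reach theta.

    rel_half_bandwidth is delta_f / f0, i.e. half of the total relative
    bandwidth when the beam scans symmetrically to +/- theta.
    """
    return math.sin(math.radians(theta_deg)) / rel_half_bandwidth


def phase_increment_deg(theta_deg, d_over_lambda0):
    """Phase shift between neighboring elements for a beam at theta."""
    return 360.0 * d_over_lambda0 * math.sin(math.radians(theta_deg))


if __name__ == "__main__":
    d_over_lambda0 = 0.5  # typical element spacing stated in the text
    for theta, rel_bw in ((25.0, 0.20), (25.0, 0.10)):
        l_over_d = line_length_over_d(theta, rel_bw / 2.0)
        print(
            f"bandwidth {rel_bw:.0%}: l = {l_over_d:.2f} d = "
            f"{l_over_d * d_over_lambda0:.2f} lambda0, "
            f"phase = {phase_increment_deg(theta, d_over_lambda0):.1f} deg"
        )
```

the same two functions can be applied to the measured shifter data quoted later in the text (for example a ±16° sector in 5% of relative bandwidth) to obtain the equivalent line length of a given phase shifter.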
from this, we can see that the frequency sensitivity of the phased array is directly proportional to the equivalent length of the phase shifter.

2.3. phase shifter performances

a transmission line, although simple, typically has a very slow phase change with frequency. for a narrow bandwidth and a large phase shift, it has to have substantial length. long transmission lines can have significant losses and give rise to spurious radiation. if they are placed in the same plane as the radiating elements, interaction through coupling might occur and severely degrade the radiation pattern. for these reasons, other structures that are better suited to the specific purpose are often employed as phase shifters. fig. 3a shows a phase shifter based on a metamaterial left-handed cell consisting of a pair of srrs (split ring resonators) in balanced microstrip technology, where one metal layer is on the top and a second, identical one is on the bottom side of the dielectric. in microstrip, it would be a single srr cell coupled with a transmission line with a via in the center in order to provide pass-band characteristics. such a shifter is used in [23], where it enabled a scanning sector of 32° in 5% of relative bandwidth. if we apply that as an angle θ = ±16° in (1), we can see that the phase shift would be around ±50 degrees and the required l would be around 5.5λ0. such a long line would take up significant space and would require special care in order to minimize its impact on the radiating elements. the substrate used in [23] is rogers 4003c with a dielectric constant of 3.55, a height of 1 mm and a loss tangent of 0.0027. the surface roughness of rogers 4003c is 2.8 microns.

fig.
3 phase shifters based on the left-handed unit cell: (a) srrs coupled with meander line, (b) two pairs of srrs, (c) s-parameters for (a), (d) s-parameters for (b), (e) equivalent circuit of the microstrip line loaded with srrs and grounded with via.

fig. 3b shows a shifter based on the four-srr left-handed cell [24]. two pairs of srrs in balanced microstrip technology are coupled with a transmission line in a similar manner to the previous one. the obtained characteristics are a scanning sector of 30° for 2.5% of relative bandwidth, which requires a phase shift of ±47° and an equivalent line length l of 10.35λ0. in [24] rogers 5880 is used, with a dielectric constant of 2.17, a height of 0.508 mm and a loss tangent of 0.001. the surface roughness of rogers 5880 is 0.3 microns, which is significantly smaller than that of rogers 4003c, which would mean that, at the same frequency, losses in the metal would be considerably larger for rogers 4003c. in both cases the impedance of the transmission line is 100 Ω, and the losses in the transmission line are 0.058 db/cm for rogers 4003c at 6 ghz [23] and 0.035 db/cm for rogers 5880 at 10 ghz [24]. from these two examples, we can see a significant advantage in applying different phase shifter structures to enhance the frequency-scanning characteristics of the antenna array. figs. 3c and 3d give the s-parameters of the corresponding shifters. fig. 3e shows the equivalent circuit of the shifters, which can be derived from [25]. the characteristics show that these shifters behave as pass-band filters; hence, the desired characteristics can be obtained by controlling their zeros and poles.

2.4. linear arrays with identical rectangular dipoles

a.
scanning antenna array at 6 ghz

the previously discussed shifters are used in the antenna array design. the radiating elements are simple identical rectangular dipoles. one half of each dipole is on the top layer (brown) and the other is on the bottom layer (yellow), fig. 4c. the structure is designed in balanced microstrip technology, so in order to connect it to a standard sma connector, a balanced-to-unbalanced transition (balun) is necessary. this is achieved via a triangular balun. the shifter from fig. 3a is used in the antenna design in [23]. from fig. 4a we can see that the antenna array achieves a scanning sector from 45° to 77°, a frequency sensitivity of 10.67°/100 mhz and a gain from 12.4 to 13.73 dbi. the dimensions of the rectangular dipoles are calculated so that they are resonant at a specific frequency with a specific resistance (z = 400 Ω + 0j). the position of the resonance is determined by the length of the dipole, and the value of the resistance is regulated by the dipole width. since there are two variables and two goals, this is more tuning than optimization. true optimization requires a certain degree of freedom, that is, more variables than goals, which is the case for the pentagonal dipoles.

b. scanning antenna array at 10 ghz

the shifter shown in fig. 3b is implemented at a higher frequency of 10 ghz. the produced prototype is shown in fig. 4e. the array is placed above the reflector plane at a distance d = 7.5 mm. the dipoles are designed to have an impedance of around 400 Ω at 10 ghz, with a distance between radiating dipoles of 0.5λ0, that is 15 mm at 10 ghz. in fig. 4g we can see an offset between the measured and simulated s11 parameter, because the sma connector is not precisely modeled and the interconnection between the balun structure and the connector can produce a discrepancy. nevertheless, the measured s11 parameter exhibits good matching in the working bandwidth from 10 to 10.3 ghz. from fig.
4b we can again see a slight offset between the measured and simulated radiation characteristics due to manufacturing imperfections. the measured characteristics show scanning from 105° to 130°, a gain variation from 12.1 to 12.9 dbi and a frequency sensitivity of 8.33°/100 mhz. as can be seen, these two shifters are designed to produce frequency scanning at different angles and scan rates, but both antenna arrays in these configurations display very high slls, since identical radiating elements are used in the array. in the first case, the slls are from -10 db to -7.5 db relative to the main beam, while in the second case their measured values are from -11.5 to -9 db relative to the main beam. high slls are usually the biggest issue with scanning antennas.

fig. 4 comparison of the antenna arrays with different phase shifters operating at 6 ghz and 10 ghz: (a) simulated radiation pattern for the antenna array with phase shifter shown in fig. 3a, (b) simulated and measured radiation pattern for the antenna array with phase shifter shown in fig. 3b at the central and edge frequencies, (c) model of the antenna array with shifter shown in fig. 3a, (d) model of the antenna array with shifter shown in fig. 3b, (e) antenna prototype with dimensions 146.2 mm x 35.75 mm, (f) measured radiation pattern, (g) measured and simulated s-parameters of the array from fig. 4e.
(plot axes: angle (deg), radiation pattern (dbi); legend: simulation, measurement)

2.5. linear array with pentagonal dipoles

in order to obtain higher side lobe suppression in the antenna array, an appropriate power distribution must be implemented. this problem is particularly challenging in the case of a linear scanning array with series feeding.
the typical configuration of a traveling-wave antenna array employs radiating elements of different impedances, so that as the wave travels through the array, each radiating element takes a portion of the available power, which depends on its impedance. at the end of the array, there is a termination that prevents the remaining power from returning into the array and causing an additional scanning beam in the opposite direction with respect to broadside. the shifters can have significant losses, which can considerably degrade the radiation characteristics, so their influence on the power distribution must be seriously considered. another important issue is that frequency scanning means that the antenna operates in a certain frequency band. the elements of the antenna array are frequency dependent and behave differently depending on the observed frequency. the power distribution is mostly implemented based on the ratio of the impedances of the transmission line and the radiating elements. a more appropriate approach is to observe the s-parameters of the multi-port network and thus directly observe the power distribution over the frequency range. in order to preserve the power distribution over the frequency range, all components should have a slow impedance change, which results in stable s-parameters. this can be accomplished using pentagonal printed dipoles as radiating elements and the shifter from fig. 3.
an approximation of the impedance values for a specific distribution can be calculated by:

$$z_j = \frac{z_{norm}}{\left(w_j(n,k)\right)^2 \cdot 10^{a_j/10}} \qquad (2)$$

where zj represents the impedance in ohms of the jth element of the array, with j = 1..n, and n is the number of elements of the array; aj represents the accumulated losses in the array at the jth element, which mostly originate from the phase shifters and radiation losses; znorm is a constant impedance whose value depends on the range between the minimum and maximum values available as the impedance of the radiating elements; and wj(n,k) is the weighting coefficient of the specific distribution for n elements and a sidelobe suppression level of k in db. the implementation of this approach in an array with right-handed shifters is shown in [26]. for the dolph-chebyshev distribution with n = 8, k = 21 and aj = 1.5(j-1), the impedance values are given in table 1.

table 1 impedance values (in ohms) for the array with the pentagonal dipoles.
z1      z2     z3     z4     z5     z6     z7     z8
1570.8  750.2  292.3  156.1  110.5  103.7  133.4  140

fig. 5 (a) measured and simulated s-parameters of the linear array with pentagonal dipoles, (b) simulated radiation pattern, (c) measured radiation pattern, (d) model of the array, (e) manufactured prototype with dimensions: 140 mm x 27 mm.

measured and simulated s-parameters are shown in fig. 5a. the measured s11 characteristic is better than the simulated one due to the additional losses. simulated and measured radiation characteristics are shown in figs. 5b and 5c, respectively. the power distribution used in the array is dolph-chebyshev, with the goal of achieving an sll suppression of 20 db with respect to the level of the main beam. in fig. 5b we can see that the goal is achieved and that the slls are below the desired level over the whole range.
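the relation between the impedances in table 1 and the amplitude weights can be checked numerically. the sketch below is not the authors' design code: it assumes that (2) has the form zj = znorm / (wj² · 10^(aj/10)) with aj = 1.5(j-1) expressed in db, and it inverts that relation to recover the relative weights implied by the tabulated impedances; the function names are hypothetical.

```python
import math

# Impedances from table 1, in ohms.
Z_TABLE = [1570.8, 750.2, 292.3, 156.1, 110.5, 103.7, 133.4, 140.0]


def accumulated_loss_db(j):
    """Accumulated loss a_j = 1.5 (j - 1), assumed to be in dB; j is 1-based."""
    return 1.5 * (j - 1)


def relative_weights(impedances):
    """Invert z_j = z_norm / (w_j^2 * 10^(a_j / 10)), up to the constant z_norm.

    Returns the weights normalized so that the largest one equals 1.
    """
    raw = [
        1.0 / math.sqrt(z * 10.0 ** (accumulated_loss_db(j) / 10.0))
        for j, z in enumerate(impedances, start=1)
    ]
    peak = max(raw)
    return [w / peak for w in raw]


def impedances_from_weights(weights, z_norm):
    """Forward form of the assumed relation (2)."""
    return [
        z_norm / (w ** 2 * 10.0 ** (accumulated_loss_db(j) / 10.0))
        for j, w in enumerate(weights, start=1)
    ]


if __name__ == "__main__":
    weights = relative_weights(Z_TABLE)
    print([round(w, 3) for w in weights])
```

under these assumptions the recovered weights are symmetric about the array center (roughly 0.53, 0.64, 0.87, 1.00 and their mirror image), as expected for a dolph-chebyshev taper, which supports the reading of (2) used here.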
the measured results show some degradation due to manufacturing errors and a slightly lower gain due to losses. the model and the manufactured prototype are shown in figs. 5d and 5e, respectively. a detailed comparison of the characteristics of the manufactured antennas is given in table 2. from it, we can see that by using pentagonal dipoles with different impedances, the main problem of printed scanning arrays can be resolved: a great improvement in slls is achieved. the trade-off for the sll improvement is a wider 3 db beamwidth and a somewhat lower antenna gain.

table 2 comparison of the measured characteristics of the antenna arrays with identical rectangular and different pentagonal dipoles.
dipoles                 rectangular              pentagonal
bandwidth               10.00 ghz – 10.30 ghz    9.98 ghz – 10.22 ghz
scanning angle          100° – 125° (25°)        100° – 122° (22°)
frequency sensitivity   83.333°/ghz              91.666°/ghz
3 db beamwidth          14.26° – 22.6°           21.2° – 29.2°
gain                    12.1 db – 12.9 db        10.4 db – 11.7 db
sll                     better than 7.5 db       better than 17 db

measurements were performed using an anritsu me7838a vector network analyzer [27] in a setup consisting of a calibration kit, two identical standard horn antennas, the device under test (dut), cables, and a positioner with a stepper motor controlled from a pc via an arduino mega 2560 board [28]. software communication with the arduino is done from matlab through the matlab support package for arduino hardware [29]. at the same time, software communication with the anritsu me7838a is done with the instrument control toolbox over lan using tcp/ip [30]. one horn antenna is used as the transmitting antenna during the whole measurement procedure, and the second one is used only at the beginning to determine the relative gain levels at the position of the dut. after the dut is placed on the positioner with the stepper motor, the whole process is automatic. the accuracy should be better than 0.5 db.

3. conclusion

in this paper, we have shown the use of phase shifters based on metamaterials.
shifters are analyzed and their performance is discussed. their use in frequency scanning arrays is shown. two prototypes are produced and presented. it is demonstrated that a combination of pentagonal dipoles with different impedances and metamaterial-based shifters can provide frequency scanning and sll control, making it a good choice for a cheap and highly accurate frequency scanning solution.

acknowledgement: this work was financed by the serbian ministry for education, science and technological development through the projects tr-32024 and iii-45016. the authors would like to thank the institute imtel, belgrade, for the prototype manufacturing and wipl-d, belgrade, for the use of software licenses.

references

[1] m. miljić, a. nešić, b. milovanović, “an investigation of side lobe suppression in integrated printed antenna structures with 3d reflectors,” facta universitatis, series: electronics and energetics, vol. 30, no. 3, pp. 391–402, september 2017.
[2] c. balanis, antenna theory: analysis and design, 3rd ed. hoboken, new jersey, united states, john wiley, 2005.
[3] t. a. milligan, modern antenna design, 2nd ed. hoboken, new jersey, united states, john wiley, 2005.
[4] w. s. t. rowe and r. b. waterhouse, “edge-fed patch antennas with reduced spurious radiation,” ieee trans. antennas propag., vol. 53, no. 5, pp. 1785–1790, may 2005.
[5] j. l. volakis, antenna engineering handbook, 4th ed. new york, united states, mcgraw-hill education, 2007.
[6] d. f. sievenpiper, l. zhang, r. f. jimenez broas, n. g. alexópolous, and e. yablonovitch, “high impedance electromagnetic surfaces with a forbidden frequency band,” ieee trans. microwave theory tech., vol. 47, no. 11, pp. 2059–2074, nov. 1999.
[7] c. s. lee, v. nalbandian, and f. schwering, “surface-mode suppression in a thick microstrip antenna by parasitic elements,” microwave and opt. technol. lett., vol. 8, pp. 145–147, feb. 1995.
[8] n. g. alexopoulos and d. r.
jackson, “fundamental superstrate (cover) effects on printed circuit antennas,” ieee trans. antennas propag., vol. 32, no. 8, pp. 807–816, aug. 1984.
[9] d. r. jackson, j. t. williams, a. k. bhattacharyya, r. l. smith, s. j. buchheit, and s. a. long, “microstrip patch designs that do not excite surface waves,” ieee trans. antennas propag., vol. 41, no. 8, pp. 1026–1037, aug. 1993.
[10] v. r. komanduri, d. r. jackson, j. t. williams, and a. r. mehrotra, “a general method for designing reduced surface wave microstrip antennas,” ieee trans. antennas propag., vol. 61, no. 6, pp. 2887–2894, march 2013.
[11] r. j. mailloux, phased array antenna handbook, 2nd ed. london: artech house antennas and propagation library, 2005.
[12] m. winfried, m. wetzel and m. menzel, “a novel direct imaging radar sensor with frequency scanned antenna,” in proceedings of the ieee mtt-s int. microwave symp. dig., 2003, vol. 3, pp. 1941–1944.
[13] y. alvarez lopez, c. garcia, c. vazquez, s. ver-hoeye, and f. las-heras, “frequency scanning based radar system,” prog. electromagn. res., vol. 132, pp. 275–296, 2012.
[14] y. alvarez, r. camblor, c. garcia, et al., “submillimeter-wave frequency scanning system for imaging applications,” ieee trans. antennas propag., vol. 61, no. 11, pp. 5689–5696, nov. 2013.
[15] m. a. tehrani, j. j. laurin, and y. savaria, “multiple targets direction-of-arrival estimation in frequency scanning array antennas,” iet radar, sonar and navigation, vol. 10, no. 3, pp. 624–631, march 2016.
[16] s. vashist, m. k. soni and p. k. singhal, “a review on the development of rotman lens antenna,” chinese journal of engineering, vol. 2014, article id 385385, 9 pages, 2014.
[17] w. zongxin, x. bo and y. fei, “a multibeam antenna array based on printed rotman lens,” international journal of antennas and propagation, vol. 2013, article id 179327, 6 pages, 2013.
[18] j. remez and r. carmon, “compact designs of waveguide butler matrices,” ieee antennas wireless propag. lett., vol. 5, pp.
27–31, march 2006.
[19] m. koubeissi, l. freytag, c. decroze, and t. monediere, “design of a cosecant-squared pattern antenna fed by a new butler matrix topology for base station at 42 ghz,” ieee antennas wireless propag. lett., vol. 7, pp. 354–357, 2008.
[20] m. ilić and n. bošković, “poređenje karakteristika štampanih bow-tie dipola sa dipolima petougaonog oblika,” in proceedings of the etran conference 2012, zlatibor, 11-14 june 2012.
[21] a. nešić, z. mičić, s. jovanović, i. radnović, d. nešić, “millimeter-wave printed antenna arrays for covering various sector widths,” ieee antennas and propagation magazine, vol. 49, no. 1, pp. 113–118, february 2007.
[22] https://www.itu.int/
[23] n. bošković, b. jokanović and a. nesić, “frekvencijski skeniran antenski niz sa srr faznim šifterima,” in proceedings of the etran conference 2013, zlatibor, 3-6 june 2013.
[24] n. boskovic, b. jokanovic and a. nesic, “frequency scanning antenna array with enhanced side lobe suppression,” metamaterials 2014, copenhagen, denmark, 25-30 august 2014.
[25] r. bojanic, v. milosevic, b. jokanovic, f. medina-mena and f. mesa, “enhanced modelling of split-ring resonators couplings in printed circuits,” ieee trans. microw. theory tech., vol. 62, no. 8, pp. 1605–1615, 2014.
[26] n. boskovic, b. jokanovic, and m. radovanovic, “printed frequency scanning antenna arrays with enhanced frequency sensitivity and sidelobe suppression,” ieee trans. antennas propag., vol. 65, no. 4, pp. 1757–1764, apr. 2017.
[27] “installation guide vectorstar me7838 series.” https://dl.cdn-anritsu.com/en-us/test-measurement/files/manuals/installation-guide/10410-00293f.pdf
[28] “arduino mega 2560 rev3.” https://store.arduino.cc/arduino-mega-2560-rev3
[29] “matlab support package for arduino hardware documentation.” https://www.mathworks.com/help/supportpkg/arduinoio/index.html
[30] “instrument control toolbox.” https://www.mathworks.com/products/instrument.html

facta universitatis
series: electronics and energetics vol. 32, no 1, march 2019, pp. 147-177
https://doi.org/10.2298/fuee1901147r

a novel priority based document image encryption with mixed chaotic systems using machine learning approach

revanna c r 1, keshavamurthy c 2
1 jain university, bangalore, and faculty of ece, government engineering college, ramanagara, karnataka, india
2 faculty of ece, sri revanasiddeshwara institute of technology, bangalore, karnataka, india

abstract. document images containing different types of information must be encrypted with different levels of security. in this paper, image classification is carried out based on feature extraction for color images. the k-nearest neighbor (k-nn) classification method is used to classify the query document against a trained set of features obtained from the document database. an optical character recognition (ocr) technique is used to check for the presence and location of text/numerals in the documents and to identify the document type. a priority level is assigned in accordance with the document type. document images with different priorities are encrypted with different multi-dimensional chaotic maps, and documents with different priority levels are diffused with different techniques. documents with the highest priority are encrypted with the highest level of security, while documents with lower priority levels are encrypted with lower security levels.
the proposed work was tested on different document types with a larger number of image features and a large trained database. the results reveal that encrypting a set of document pages according to their priorities achieves a high encryption speed and is more effective than a uniform method of encryption for all document types. the national institute of standards and technology (nist) statistical tests were also conducted to check the randomness of the sequence, and good randomness was achieved. the proposed work also ensures security against various statistical and differential attacks.

key words: ikeda, lorenz, chaotic, feature space, nist

received august 4, 2018; received in revised form december 18, 2018
corresponding author: revanna c r, research scholar at jain university, bangalore, and faculty of ece, government engineering college, ramanagara, 562159 karnataka, india (e-mail: revannacr2008@gmail.com)

1. introduction

a document is any piece of information in text, picture or both forms, on any medium, which serves as proof or evidence of a fact and which may be secret, private or public. advances in digital and web-based technology have led to document imaging (converting the physical form to electronic form) gaining prime importance and wide acceptance as the most reliable form of preserving data. in document management systems, one of the most important services is the sharing of documents in image form, which enables the smooth functioning of an organization in which documents are to be stored, retrieved, integrated, authenticated, distributed, collaborated on, searched, reproduced and transferred between members of a team. due to advances in internet technology, the distribution of documents containing confidential information over open networks and wired or wireless communication channels offers wide scope for an interceptor to attack and compromise the information.
to overcome this problem, there is a need to transform the intelligible information into an unintelligible form before distributing it over the channel. in all such cases the confidentiality, integrity, authenticity and non-repudiation of the information must be maintained. the cryptographic technique called encryption, which transforms the intelligible form of information into an unintelligible form, is used to provide this kind of security during document sharing. encryption is a ciphering technique which uses confusion and diffusion processes. classical encryption methods are not appropriate, because document images differ from electronic documents in characteristics such as highly voluminous data, strong correlation between neighboring pixels and a redundant nature. among the traditional block ciphers, the advanced encryption standard (aes) and rc5 (with its largest block size) encrypt 128-bit blocks, while the data encryption standard (des), triple des and blowfish encrypt only 64-bit blocks; md4 and md5, which are hash functions rather than ciphers, process 512-bit blocks. these algorithms are not appropriate for encrypting document images with larger block sizes. chaotic systems used for encryption and decryption allow a variable block size depending on the requirements, which suits images that are voluminous, highly correlated and highly redundant. document images are classified according to user-defined classes. encrypting document images of different classes with a single common encryption algorithm leads to varying use of system resources (such as encryption time, design complexity, computational complexity, etc.). basically, security is not required for all images; only the document images that demand it need to be encrypted. images differ from each other in their features. rather than encrypting document images of different types with a single method, they can be classified into different predefined types and encrypted using different methodologies.
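the idea of classifying a document image first and then choosing the encryption effort per class can be sketched as follows. this is only an illustration under stated assumptions, not the authors' pipeline: the three features (entropy, mean, variance), the synthetic training images, the class labels and the mapping from class to security tier are all hypothetical, and the k-nn vote uses a plain euclidean distance on unscaled features.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Entropy in bits of a sequence of 8-bit grayscale pixel values."""
    counts = Counter(pixels)
    total = len(pixels)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def features(pixels):
    """Feature vector: (entropy, normalized mean, normalized variance)."""
    mean = sum(pixels) / len(pixels)
    variance = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    return (shannon_entropy(pixels), mean / 255.0, variance / 255.0 ** 2)


def knn_classify(query, training, k=3):
    """Majority vote among the k training samples nearest to the query."""
    nearest = sorted(training, key=lambda item: math.dist(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Synthetic stand-ins for a trained feature database (hypothetical data).
TRAINING = [
    (features([255] * 90 + [0] * 10), "text"),
    (features([255] * 80 + [0] * 20), "text"),
    (features([255] * 95 + [0] * 5), "text"),
    (features(list(range(0, 256, 2))), "picture"),
    (features(list(range(0, 256, 4))), "picture"),
    (features(list(range(64, 192))), "picture"),
]

# Hypothetical mapping from document class to encryption effort.
SECURITY_TIER = {"text": "high", "picture": "low"}

if __name__ == "__main__":
    query = features([255] * 85 + [0] * 15)  # mostly white page with dark strokes
    label = knn_classify(query, TRAINING)
    print(label, SECURITY_TIER[label])
```

in a real system the features would be scaled and the training set would come from the document database; the sketch only shows how a class label, once obtained, can select a different encryption method for each document type.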
this may reduce the encryption time, provide a variable security level for each document type and make it difficult for a cryptanalyst to extract the key in a known-plaintext attack. to encrypt and decrypt document images, the confusion and diffusion system is used with chaotic systems. a chaotic system is a non-linear dynamical system which generates pseudorandom numbers/sequences based on its initial conditions and system parameters. chaotic sequences are aperiodic, random in appearance yet deterministic, and sensitive to initial conditions and system parameters. encrypting documents with different multi-dimensional chaotic maps may add security and increase the speed of encryption.
the proposed work classifies each document into one of a predefined set of classes and encrypts each class of documents with a different level of security to conserve system resources. it uses a machine learning approach to classify the documents, with a simple and efficient classifier operating on image features, along with the ocr technique to detect text in them. encryption is performed with confusion and diffusion processes using various multi-dimensional chaotic maps with different methodologies. document image classification enables fast retrieval of the image document to be encrypted when a large set of heterogeneous document images is present in the database. the predefined document image classes considered are:
1. complete text documents such as business letters, newsletters etc.
2. complete graphical picture documents such as photos, engineering drawings, pictures, diagrams etc.
3. documents with text embedded over the picture, such as certificates
4. medical image documents showing mri, x-ray images etc.
5. text labelled images
6. numerical value labelled images
7. picture images with captions
8. picture images with descriptive text, such as newspapers
9. landscape images
10. animal class of images
11. nature or scenic images
12. images of buildings etc.
the documents are classified based on the constitution of the document classes, the options available in document features and the chosen classifier algorithm. there are different methods to classify document images; the k-nn classifier is chosen here. based on image features such as entropy, mean, variance, mean square error, skew, correlation, histogram, ocr and energy of the documents, the k-nn classifier assigns priority values. the images with the least randomness are assigned the highest priority, while the images with the least ocr content are assigned the lowest priority. images with different priorities are subjected to different encryption methodologies. each method of encryption uses confusion and a different diffusion technique, with chaotic maps of different dimensions. the images with the highest priority are encrypted with the highest level of security compared to lower priority images. the images with lower priorities are subjected to algorithms which reduce the complexity and the encryption time.
2. literature survey
recognizing documents by their characteristic features is very important in document management systems. document analysis and recognition (dar) [1] consists of different processing steps, namely layout analysis, character recognition and analysis of structured images containing textual information, and its applications can be used for efficient document mining. the content based image retrieval (cbir) search technique makes use of characteristic features that can be deployed for query document retrieval. a medical image retrieval system [2] using cbir is implemented by extracting visual features.
the study of document image classification is very important; the choice among different document image classifiers depends on the problem, the use of training data and the choice of document features [3]. the document in this context is a single page type-set document, with a broad classification based on variations such as business letters, articles and printed newspapers. the role of the document classifier in a document retrieval system is very important. the strength of a document classifier is based on image-level features and structural or textual features. document indexing is very important in an industrial context, where a large number of documents are digitized every day and clustered into different classes. document clusters can be obtained using a clustering technique such as k-means, a well-known clustering technique [4] used for document clustering. the clustered data are classified using the content based image retrieval (cbir) search technique, which is based on feedback learning. genre identification [5] in documents such as technical papers, photos, slides and tables is also important in a document recognition system. document genre identification can be achieved with a machine learning approach that combines text based features and an svm. color document images can be classified using color histogram features [6]. a large image database containing pictures of landscapes, buildings, animals, etc. can be classified using this technique. classifying images on their color histogram features with the k-nn machine learning classifier results in an accuracy of 85%. textual information should be separated from non-text areas in document images. a block based segmentation technique [7] is introduced to separate the text and non-text regions. further, an optical character recognition (ocr) system is deployed to identify the density of text in image pictures.
the comparison of different pages in a document, called page similarity, is also important in document analysis and classification. page similarity can be measured with a visual saliency metric defined on the basis of textual parameters [8]. the obtained parameters are used to classify the documents with a k-nearest neighbor (k-nn) classifier [9]. ocr and a machine learning approach such as a decision tree [10] are combined to classify and index heterogeneous document images. the color co-occurrence matrix (ccm) [11] can be calculated for an efficient image retrieval system. the hue saturation value (hsv) representation is used for each pixel of the image, and the ccm is calculated using the relevant formulae. the ccm of the sample image is compared with those of the images in the database, and the resulting images are sorted by similarity. this method has the advantage of increased retrieval accuracy, as the documents are retrieved based on pixel information and color features. there are many document images which have text or numerals embedded over the picture. in such cases the presence of the text or numeral value is recognized using the ocr (optical character recognition) technique. the algorithm discussed in [12] extracts the lines and curves of which each letter is made, using feature recognition based ocr. in [13], four different ocr techniques are proposed, namely vector crossing, zoning, a combination of vector crossing and zoning, and template matching. the results obtained from these four methods are compared, and it is shown that the template matching technique yields the best accuracy of 99.5%, with an average time of 1.95 ms per character.
in the proposed work, once the query/sample document image is classified, it is subjected to the appropriate encryption algorithm. there are different encryption techniques. the confusion and diffusion processes involved in encryption make use of random numbers.
the confusion process scrambles the image pixels, and the diffusion process creates interdependency among the pixels. the random number generators are mainly chaotic maps. a mixed chaotic system uses two different chaotic maps in an encryption algorithm, one for confusion and the other for diffusion. in [14], a document image is secured against both statistical and dynamical threats using a mixed chaotic system in which the document image is confused using a 1d logistic map and diffused using a 2d henon map. the selective image encryption proposed in [15] yields npcr = 100% and uaci = 33%, which are close to the ideal values, by using different diffusion techniques in a mixed chaotic system; a 2d ikeda chaotic map and a 1d quadratic map are used for confusion and diffusion respectively. the authors of [16] qualitatively estimated the complexity of random sequences generated by different chaotic maps using statistical methods. the random number sequences generated for the proposed system using different chaotic maps are tested for randomness using the nist tests for all sixteen parameters [17].
from the literature, it is found that document management systems need to encrypt document images of different types with different levels of security, reduce the total encryption time and make cryptanalysis difficult. in almost all image encryption systems, to the best of our knowledge, the same encryption technique is used irrespective of the document type. here we propose a novel method for reducing the total encryption time, providing varying security levels and making cryptanalysis hard for a set of different types of document images. the proposed system is aimed at classifying the document images containing picture, text, text as caption, text embedded over picture, etc.
and assigning a priority value based on the type of information contained in them. the statistical features extracted from the document images are used for classification by a k-nn image classifier with a machine learning approach. the classified documents are assigned a pre-defined priority value and subjected to encryption. the priority value of the classified document determines the method of encryption. each priority is associated with a different encryption methodology, so that ciphered images with different security levels are obtained using different multi-dimensional chaotic maps. the same chaotic maps are used during encryption and decryption.
3. process flow
classifying documents by their characteristic features is very important in document management systems. in a large document database system, a document can be classified as a document containing only text, a document containing only a picture, a document containing text embedded over a picture, a labelled document, a document containing a picture with more text, a document containing a picture with a caption, a medical image document, a satellite image document, a natural image document, etc. in order to classify these documents, document mining has to be performed. typically, document mining involves two main processes, namely feature extraction and feature classification. feature extraction is a very important step in pattern (document pattern) classification, as it extracts the feature vector which can be used to distinguish one class of document from another. the obtained feature vectors are classified using a classifier. the classifier compares the query document features with the trained features and returns the name of the class to which the query document belongs. the process flow diagram is shown in figure 1.
fig. 1 the process flow diagram.
4. feature extraction
the color histogram features of an image are calculated in order to classify different documents.
extracting the color histogram of an image has an advantage over other techniques, since it requires less computation time and is more efficient.
histogram features
an image's histogram is a graphical representation plotting the number of pixels at each grey level against the grey levels (0 to 255). the histogram depicts the nature of the image. a uniform/flat histogram indicates that the image contains uncorrelated pixels (encrypted image). a non-uniform histogram indicates that the image contains correlated pixels (plain image). in a correlated image, a very sharp histogram indicates low contrast. similarly, in a plain image, a histogram that spreads over the entire range of grey levels with non-uniform peaks indicates high contrast. histogram features are considered statistical features, where the probability distribution of the different intensity levels is modeled using the histogram. these statistical features describe how the image intensity levels are spread. the histogram probability can be defined as
$$P(g) = \frac{N(g)}{M} \tag{1}$$
where $M$ represents the total number of pixels in the image and $N(g)$ is the number of pixels at grey level $g$. the value of the histogram probability $P(g)$ lies in the range from 0 to 1, and the sum of all probabilities $P(g)$ is equal to 1. different features are derived from the probability of pixels in the histogram: mean, variance, standard deviation, energy, entropy and skew.
mean
the mean is the average value calculated over all the pixels in an image, hence it indicates the brightness of the image. an image with a high mean value is brighter, whereas an image with a low mean value is darker.
the mean of an image with $K = 256$ intensity levels can be calculated as
$$\bar{g} = \sum_{g=0}^{K-1} g\,P(g) = \frac{1}{M}\sum_{r}\sum_{c} I(r,c) \tag{2}$$
where $I(r,c)$ is the intensity of the pixel at row $r$ and column $c$.
variance
variance is a measure of contrast in an image. a region with high contrast has high variance, whereas a region with low contrast has low variance. the variance can be calculated as
$$\sigma_g^2 = \sum_{g=0}^{K-1} (g - \bar{g})^2\,P(g) \tag{3}$$
standard deviation (sd)
the standard deviation is obtained by taking the square root of the variance. it can be mathematically expressed as
$$\sigma_g = \sqrt{\sigma_g^2} \tag{4}$$
skew
skew measures the asymmetry of the probability distribution of the pixels about its mean value. the value of the skew can be positive, negative or undefined. the skew is positive when the histogram has a tail to the right, and negative when it is left tailed. mathematically it can be expressed as
$$\mathrm{skew} = \frac{1}{\sigma_g^3}\sum_{g=0}^{K-1} (g - \bar{g})^3\,P(g) \tag{5}$$
energy
energy measures the distribution of intensity levels in an image. the value of energy lies between 0 and 1. the energy of an image is equal to 1 if all its pixels have the same value, and decreases as the pixel values are distributed among more intensity levels. typically, an image with more energy has a high compression ratio. mathematically the energy can be expressed as
$$\mathrm{energy} = \sum_{g=0}^{K-1} [P(g)]^2 \tag{6}$$
entropy
information entropy is a measure of randomness in the image. the randomness of the image is based on the probability of occurrence of the various gray levels in the image. an image in which all gray levels are equally probable has the highest entropy. mathematically the entropy can be expressed as
$$\mathrm{entropy} = -\sum_{g=0}^{K-1} P(g)\,\log_2 P(g) \tag{7}$$
correlation
images are characterized by high redundancy and significant correlation between adjacent pixels. correlation mainly measures the similarity between the textures of two images or within an image, and is hence used to find image redundancies. to do this, the horizontal, vertical and diagonal correlation coefficients are required to be calculated.
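the histogram features of equations (1) to (7) and the adjacent-pixel correlation can be sketched in python as follows. this is a minimal illustration rather than the authors' implementation: the function names `histogram_features` and `adjacent_correlation` are hypothetical, the image is assumed to be an 8-bit grey-level image held in plain python lists, and the correlation is assumed to be the usual pearson coefficient between each pixel and its neighbor.

```python
import math


def histogram_features(pixels, levels=256):
    """Histogram statistics of a flat sequence of integer grey levels in [0, levels)."""
    m = len(pixels)                          # total number of pixels, M
    counts = [0] * levels
    for g in pixels:
        counts[g] += 1                       # N(g)
    p = [n / m for n in counts]              # eq. (1): P(g) = N(g) / M

    mean = sum(g * p[g] for g in range(levels))                     # eq. (2)
    variance = sum((g - mean) ** 2 * p[g] for g in range(levels))   # eq. (3)
    sd = math.sqrt(variance)                                        # eq. (4)
    third = sum((g - mean) ** 3 * p[g] for g in range(levels))
    # Skew is undefined for a constant image (sd == 0); NaN marks that case.
    skew = third / sd ** 3 if sd > 0 else float("nan")              # eq. (5)
    energy = sum(q * q for q in p)                                  # eq. (6)
    entropy = -sum(q * math.log2(q) for q in p if q > 0)            # eq. (7)
    return {"mean": mean, "variance": variance, "sd": sd,
            "skew": skew, "energy": energy, "entropy": entropy}


def adjacent_correlation(image, dr=0, dc=1):
    """Pearson correlation between each pixel and its neighbour at offset (dr, dc).

    (0, 1) is horizontal, (1, 0) vertical and (1, 1) diagonal; `image` is a
    list of equally long rows and the offsets are assumed non-negative.
    """
    xs, ys = [], []
    for r in range(len(image) - dr):
        for c in range(len(image[0]) - dc):
            xs.append(image[r][c])
            ys.append(image[r + dr][c + dc])
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return float("nan")                  # constant image: coefficient undefined
    return cov / math.sqrt(vx * vy)
```

calling `histogram_features` on the flattened pixel list and `adjacent_correlation` with the three offsets gives one numerical feature vector per image, in the spirit of the rows of table 1.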
a smaller correlation coefficient between adjacent pixels indicates that the image is uncorrelated and contains more random pixels. for a plain image, the correlation of adjacent pixels is nearly equal to one, but for a text image the correlation coefficient is lower.
5. feature vectors and feature space
in machine learning and pattern recognition, an image can be represented by n-dimensional symbolic or numerical values called feature vectors, where n is the number of features. feature measurements may be numerical, symbolic or sometimes both. an example of a numerical feature is calculating the aforementioned statistical values such as mean, standard deviation, variance, energy, entropy etc. for an image and storing these values in a vector. table 1 shows the detailed feature extraction for different document image types. an example of a symbolic feature is assigning color symbols or tags like 'red', 'blue', 'green' or 'magenta' to an image and storing these tags in a vector. the feature vectors can be used to classify an image or an object. the n-dimensional vector space associated with these feature vectors is called the feature space. the feature space enables visualization of the feature vectors and of the relationships between them. it allows us to classify an unknown sample by comparing it with known samples using distance and similarity measures. dimensionality reduction techniques such as pca (principal component analysis) and lda (linear discriminant analysis) can be applied to reduce the dimension of the feature space.
6. training and testing
in a pattern recognition and machine learning approach, the system needs to be trained before it can recognize an unknown test sample. typically, the machine is trained by extracting the feature values of known data samples from different classes.
the recognition rate of the system can be increased by enlarging the training sets and using more features in each set. the proposed system consists of seven classes with ten sample images in each class. the aforementioned feature parameters are extracted in order to classify the query sample. testing is performed by giving an unknown sample to the trained machine for recognition. testing involves extracting the feature parameters and comparing them with the trained features for classification. the comparison can be achieved by performing either a distance or a similarity measure.
7. distance and similarity measure
the feature vectors of a test image or object are used for classification. classification is performed by comparing trained feature vectors with test feature vectors. basically there are two methods to compare feature vectors, namely the distance measure and the similarity measure. a shorter distance between two closely related vectors corresponds to a higher level of similarity. the distance measure technique calculates the difference between the two feature vectors. if the difference is large, the two vectors do not match, while a smaller distance indicates that they match. the most widely used distance metric is the euclidean distance. given two vectors $A = (a_1, \dots, a_n)$ and $B = (b_1, \dots, b_n)$, the euclidean distance is given by
$$d_E(A, B) = \sqrt{\sum_{i=1}^{n} (a_i - b_i)^2} \tag{8}$$
the similarity measure technique measures the similarity between two feature vectors by calculating the inner product between them. if the two vectors match closely, the similarity is high. the inner product of two feature vectors is given mathematically by
$$\langle A, B \rangle = \sum_{i=1}^{n} a_i b_i \tag{9}$$
the most common measure of similarity is the tanimoto metric, which can be written as
$$T(A, B) = \frac{\sum_{i=1}^{n} a_i b_i}{\sum_{i=1}^{n} a_i^2 + \sum_{i=1}^{n} b_i^2 - \sum_{i=1}^{n} a_i b_i} \tag{10}$$
the value of $T$ lies between 0 and 1. if $T$ equals 1, then the two feature vectors are 100% similar.
8. image classification algorithm
the statistical features such as histogram, mean, variance, standard deviation, energy, entropy and skew are calculated for known images and stored in a vector; this step is called learning, or training the machine. the features extracted for the query document are passed to the k-nn classifier to classify the given document.
9. priority assignment
the test feature vectors are compared with the trained feature vectors using a distance or similarity measure. the unidentified test sample is recognized as belonging to the class of the closest sample in the training set. the smallest value is selected if a distance measure is used, and the largest value if a similarity measure is used. this process is simple but less accurate. the accuracy can be increased by considering a group of close feature vectors instead of selecting just the single nearest training set sample. this is referred to as the k-nearest neighbor technique. the k best matching neighbors are selected to assign the unknown sample to a class. the value of k ranges from one to the total number of images in the training set. the recognition accuracy depends on the chosen k value. as the value of k increases, the neighbors considered range from closely matching to non-matching samples in the training set. the matching of test features with the feature space using the euclidean distance is shown in figure 2.
fig. 2 test features matching with feature space using euclidean distance.
table 1 image feature values for different document types.
document type | priority | mean | variance | sd | energy | entropy | skew | ocr with text | ocr with numeric | correlation
only text | 1 | 238.265 | 1603.20 | 40.0405 | 0.6564 | 2.512 | -2.8520 | 1524 | 84 | 0.4129
medical text label | 2 | 138.5 | 2885.41 | 53.716 | 0.0095 | 7.1807 | -0.675 | 65 | 1 | 0.8946
medical numeric label | 3 | 164.7 | 3497.7 | 59.142 | 0.0086 | 7.3961 | -0.6514 | 4 | 42 | 0.8573
text embedded over picture | 4 | 152.73 | 3865.44 | 63.1727 | 0.0091 | 7.2349 | -0.4492 | 159 | 15 | 0.9876
text label with more words (newspaper) | 5 | 200.6 | 4022.01 | 63.42 | 0.0169 | 6.835 | -1.1291 | 800 | 4 | 0.9763
document with less text (caption) | 6 | 149.82 | 3773.89 | 61.432 | 0.0112 | 7.1807 | -0.4895 | 40 | 12 | 0.9688
text label few words | 7 | 160.23 | 4679.6 | 68.408 | 0.0138 | 7.2595 | -0.2597 | 10 | 8 | 0.9756
labelled numeric document | 8 | 164.96 | 4959.53 | 70.424 | 0.0112 | 7.2349 | -0.2880 | 1 | 38 | 0.9811
no text (picture) | 9 | 180.22 | 2405.9 | 49.05 | 0.0077 | 7.2531 | -0.712 | 0 | 0 | 0.97351
satellite data | 10 | 100.59 | 4076.89 | 63.85 | 0.0069 | 7.5104 | -0.2208 | 0 | 0 | 0.9211
fig. 3 hierarchical tree diagram for documents with different priority levels
a pure text document, consisting mostly of text information, is assumed to be less random, with lower entropy and correlation and a high energy value, and is considered the highest priority document (1). a medical image containing textual information on the disease and patient, with lower values of mean, standard deviation and variance and an ocr text count within a threshold range, is assigned the second highest priority (2). a medical image containing numerical labelling, with an ocr numeric count within a threshold range, is given the third highest priority (3). for a textual description embedded over a picture, such as a certificate, the features entropy, energy and ocr with the position of the text are considered, and the document is assigned the fourth highest priority (4). for detailed textual information with a picture, such as a newspaper, the features moderate entropy, energy and a threshold ocr text count are considered, and the document is treated as the fifth highest priority (5).
for a picture with a small amount of text used as a caption, the features entropy, energy and ocr with moderate text and its position are considered, and the document is assigned the sixth priority (6). for a picture with a few text labels, the features entropy, ocr with minimum text and energy are considered, and the document is treated as the seventh priority (7). for a picture with numeric labelling, the features ocr numeric count and energy are considered, and the document is assigned the eighth priority (8). for a picture image, the features entropy, energy, standard deviation, variance, correlation and ocr text/numeric count are considered, and the document is assigned the ninth priority (9). for a satellite document, the features mean, energy and ocr text/numeric count are considered, and the document is assigned the tenth priority (10). according to the above predefined priorities, the k-nn algorithm assigns a priority to the classified document based on its feature values, as shown in figure 3.
10. chaotic maps
4-dimensional map
hyper chaotic lorenz map: the hyper lorenz map [18] is a 4-dimensional chaotic map represented by differential equations that show chaotic behavior for certain initial conditions. it is expressed mathematically by equations (11) to (14) [these equations are not recoverable from the source text]. the system exhibits chaotic behavior for the parameter values p = 10, c = 28 and b = 8/3.
3-dimensional map
lorenz map: the lorenz system is represented by differential equations that show chaotic behavior for certain parameters and initial conditions. mathematically it can be defined as
$$\dot{x} = s\,(y - x) \tag{15}$$
$$\dot{y} = x\,(r - z) - y \tag{16}$$
$$\dot{z} = xy - bz \tag{17}$$
the system exhibits chaotic behavior for the parameter values s = 10, r = 28 and b = 8/3.
2-dimensional map
henon map: among discrete time dynamic systems, the henon map [19] exhibits good chaotic behavior. it takes a point $(x_n, y_n)$ in the plane and maps it to a new point.
mathematically it can be formulated as
$$x_{n+1} = 1 - p\,x_n^2 + y_n \tag{18}$$
$$y_{n+1} = b\,x_n \tag{19}$$
the initial values $x_0 \in (0, 1)$ and $y_0 \in (0, 1)$ can be used as the key for the system, $(x_0, y_0)$. the henon map mainly depends on two parameters, $p$ and $b$; research results show that the map exhibits chaotic behavior for $p = 1.4$ and $b = 0.3$.
ikeda map: the 2d ikeda map is distinguished by its complicated chaotic behavior compared to other chaotic maps. it takes a point $(x_n, y_n)$ in the plane and maps it to a new point. mathematically it can be defined as
$$x_{n+1} = 1 + u\,(x_n \cos t_n - y_n \sin t_n) \tag{20}$$
$$y_{n+1} = u\,(x_n \sin t_n + y_n \cos t_n) \tag{21}$$
where
$$t_n = 0.4 - \frac{6}{1 + x_n^2 + y_n^2} \tag{22}$$
where $u$ is the system parameter and $(x_n, y_n)$ are the pairwise points. the system exhibits chaotic behavior when $u$ lies in the range [0.5, 0.95]. the map depends on three values, namely the parameter $u$ and the initial values $x_0$ and $y_0$.
two dimensional logistic map: the 2d logistic map is known for its complicated chaotic behavior compared to the one dimensional logistic map. it takes a point $(x_n, y_n)$ in the plane and maps it to a new point. mathematically it can be defined as
$$x_{n+1} = r\,(3y_n + 1)\,x_n\,(1 - x_n) \tag{23}$$
$$y_{n+1} = r\,(3x_{n+1} + 1)\,y_n\,(1 - y_n) \tag{24}$$
where $r$ is the system parameter and $(x_n, y_n)$ are the pairwise points. the map depends on three values, namely $r$, $x_0$ and $y_0$, whose corresponding initial values are r = 1.19, x0 = 0.8909 and y0 = 0.3342.
gingerbread man map: the gingerbread man map is a chaotic two-dimensional map. it is given by the piecewise linear transformation
$$x_{n+1} = 1 - y_n + |x_n| \tag{25}$$
$$y_{n+1} = x_n \tag{26}$$
[the initial values given for this map are not recoverable from the source text.]
1-dimensional map
logistic map: the basic one dimensional logistic map [20] can be formulated as
$$x_{n+1} = a\,x_n\,(1 - x_n) \tag{27}$$
where $x_n \in (0, 1)$. the parameter $a$ and the initial value $x_0$ can be used as the key for the system, $(a, x_0)$. research results indicate that the system is chaotic when $3.569 < a < 4.0$.
quadratic map: the one dimensional quadratic chaotic map is defined by equation (28). [equation (28), its initial condition and its chaotic parameter values are not recoverable from the source text; only the value 0.1 survives.]
bernoulli map: the bernoulli map is defined piecewise on two sub-intervals of [0, 1] separated by its control parameter [equation (29) is not recoverable from the source text]. the map exhibits chaotic behavior when this parameter equals 0.2709.
circle map: the circle map is given by equation (30) [the equation and its parameter values are not recoverable from the source text].
sine map: the sine map [21] is given by equation (31) [the equation and the parameter values for which it is chaotic are not recoverable from the source text].
11. proposed methodology
image classification
in the proposed work, document image pages containing only text, only a picture, text embedded over a picture, labelled documents, pictures with more text, pictures with captions, medical image documents, satellite image documents, natural image documents etc. are used as the input/test/query images. a document may contain a single page or more. the input sample color image is first subjected to the k-nn image classification algorithm. the image features are histogram, mean, variance, standard deviation, energy, entropy, skew and correlation. the image features are arranged in a space called the feature space. a larger number of images belonging to each class are stored in the database for better results. the feature space allows us to classify an unknown sample by comparing it with known samples using distance and similarity measures. the k-nn algorithm uses the euclidean distance measure to find the minimum difference between training and testing features, as shown in figure 4. in order to maximize the success rate for the sample image, a larger number of nearest neighbors is considered.
fig. 4 k-nn classification using sorted distance values for all different features
mathematical model for k-nn
for a given query instance $x_q$, the k-nn algorithm works as follows:
$$y_q = \arg\max_{c \in \{c_1, \dots, c_m\}} \sum_{x_i \in N_k(x_q)} E(y_i, c) \tag{32}$$
where $y_q$ is the predicted class for the query instance $x_q$, $c$ is the class number, $m$ is the number of classes present in the data and $y_i$ is the class of the neighbor $x_i$. $N_k(x_q)$ is the set of $k$ nearest neighbors of $x_q$, where
$$E(a, b) = \begin{cases} 1 & \text{if } a = b \\ 0 & \text{otherwise} \end{cases} \tag{33}$$
the neighbors are found using the euclidean distance $d(x_q, x_i) = \sqrt{\sum_j (x_{q,j} - x_{i,j})^2}$ between the query instance vector $x_q$ and the trained vector $x_i$.
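the voting rule of the k-nn model above, together with the euclidean distance and the tanimoto similarity measure, can be sketched in python as follows. the function names, the two-feature vectors (entropy, energy) and the class labels are hypothetical toy values chosen for illustration; they are not the training database of the paper.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equally long feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def tanimoto(a, b):
    """Tanimoto similarity: inner product over (|a|^2 + |b|^2 - inner product)."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sum(x * x for x in a) + sum(y * y for y in b) - dot)


def knn_classify(query, training, k=3):
    """Return the majority class among the k training vectors nearest to query.

    `training` is a list of (feature_vector, class_label) pairs.
    """
    neighbours = sorted(training, key=lambda item: euclidean(query, item[0]))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical (entropy, energy) feature vectors for two classes.
training_set = [
    ([2.5, 0.65], "text"),
    ([2.7, 0.60], "text"),
    ([7.2, 0.010], "picture"),
    ([7.4, 0.009], "picture"),
    ([7.3, 0.012], "picture"),
]
predicted = knn_classify([2.6, 0.62], training_set, k=3)
```

in practice the features would usually be scaled to comparable ranges before the distance is computed, since a feature with a large numerical range (such as variance in table 1) would otherwise dominate the euclidean distance; that scaling step is an assumption of this note and is not described in the source.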
the k-nn classification method is as follows:
1. calculate the different feature values for each of the images and store them in a space called the feature vector space.
2. find the euclidean distance for each feature between the sample image and the images in the database.
3. sort the euclidean distances for each feature, corresponding to the different images, in ascending order.
4. the first column (k=1) in figure 4 represents the best match between the sample image and the training images. the image which occurs most often in the first column is the one matching the sample image.
5. for more accuracy, consider k = n, that is, the n nearest neighbors for classification.
for figure 4 above, the classification of images into three different classes, with each class containing three images, is shown in figure 5. the classified image is then checked for text. the text primarily carries the confidential information in the image, and the amount of text contained in the image represents the density of confidential information. hence the presence of text and its amount have to be identified. the ocr method checks whether the image contains text. the amount of text may be a few words or more (thousands of words). the ocr also provides the location of the text in the image. based on the density of text and its location, the documents are classified further and each image is assigned a priority value, as shown in figure 3.
fig. 5 classification of images belonging into the three different classes with each class containing three or four images
encryption
encryption is the process of converting an intelligible form of information into an unintelligible form. the confusion and diffusion procedures are followed for the encryption. the confusion and diffusion techniques are used with the multidimensional chaotic maps.
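the confusion and diffusion idea can be sketched in python as below. this is a deliberately simplified illustration and not the scheme of table 2: it uses the 1d logistic map for both stages, derives the permutation by ranking the chaotic values (a common substitute for the modulus-based indexing described in table 2), works on single pixels, and omits the second level diffusion. the keystream conversion (multiply by $10^{15}$, remainder modulo 255) follows the first level diffusion step listed in table 2. the function names, the key layout and the initial values 0.41 and 0.3 are assumptions made for the example.

```python
def logistic_values(a, x0, n):
    """n successive iterates of the 1D logistic map x -> a * x * (1 - x)."""
    out = []
    x = x0
    for _ in range(n):
        x = a * x * (1.0 - x)
        out.append(x)
    return out


def chaotic_permutation(a, x0, n):
    """Permutation of range(n) obtained by ranking n chaotic values (confusion)."""
    values = logistic_values(a, x0, n)
    return sorted(range(n), key=lambda i: values[i])


def keystream(a, x0, n):
    """Integer keystream: scale each chaotic value by 10**15, reduce modulo 255."""
    return [int(x * 10 ** 15) % 255 for x in logistic_values(a, x0, n)]


def encrypt(pixels, key):
    """Confusion (permute pixel positions) followed by diffusion (XOR keystream)."""
    perm = chaotic_permutation(key["a"], key["x0_confusion"], len(pixels))
    confused = [pixels[i] for i in perm]
    stream = keystream(key["a"], key["x0_diffusion"], len(pixels))
    return [p ^ s for p, s in zip(confused, stream)]


def decrypt(cipher, key):
    """Inverse diffusion first, then inverse confusion, using the same key."""
    stream = keystream(key["a"], key["x0_diffusion"], len(cipher))
    confused = [c ^ s for c, s in zip(cipher, stream)]
    perm = chaotic_permutation(key["a"], key["x0_confusion"], len(cipher))
    plain = [0] * len(cipher)
    for position, source in enumerate(perm):
        plain[source] = confused[position]
    return plain


# Hypothetical key and an 8x8 test image flattened to 64 grey levels.
key = {"a": 3.9, "x0_confusion": 0.41, "x0_diffusion": 0.3}
plain_pixels = [(7 * i) % 256 for i in range(64)]
cipher_pixels = encrypt(plain_pixels, key)
recovered = decrypt(cipher_pixels, key)
```

because both stages are regenerated from the key alone, decryption only needs the key and the cipher image; the inverse stages are applied in the reverse order of encryption.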
The priority level of a document determines its level of security. The block size of the image, the chaotic maps used for confusion and diffusion, and the technique used for establishing interdependency between neighboring pixels all play an important role in increasing the level of security. Based on the required priority level of a document, the block size, the maps and the method of interdependency are chosen.

Confusion and diffusion

Confusion is the process of scrambling the pixels or blocks of the document image. Diffusion is the process of modifying the pixel values and establishing interdependency among neighboring pixels. The detailed confusion and diffusion processes are described in Table 2.

Decryption

Decryption is the process of extracting the plain image from a given cipher image; it is the reverse of encryption. For the given cipher image, the priority level is extracted from the key. Based on this priority level, the inverse second-level diffusion is performed first, and then the inverse first-level diffusion is performed using the corresponding map from the key. The resulting image is subjected to the inverse confusion process using the corresponding chaotic map from the key.

Table 2 Encryption of different documents with different priorities

| Priority (Pr) | Document image page type | Confusion map (parameters) | Block size (pixels) | First-level diffusion map | Key K |
|---|---|---|---|---|---|
| 1 | Only text | 4D hyper Lorenz map (x = 0.0000000000778899, y = 0.0000000000874533, z = 0.0000000000898447, w = 0.000000000098876, p = 10, c = 28, b = 8/3) | 1×1 | 1D logistic map (a = 3.9, x0 = ) | {pr, x, y, z, w, p, c, b, a, x0} |
| 2 | Text labelled medical image | 3D Lorenz map (x = 0.0000000000856672, y = 0.0000000000785563, z = 0.0000000000889732, s = 10, r = 28, b = 8/3) | 2×2 | 1D logistic map (a = 3.9, x0 = ) | {pr, x, y, z, s, r, b, a, x0} |
| 3 | Numeric labelled medical image | 2D Henon map (x = 0.6315477, y = 0.18906343, p = 1.4, b = 0.3) | 4×4 | 1D logistic map (a = 3.9, x0 = ) | {pr, x, p, b, a, x0} |
| 4 | Picture with more text | 2D Ikeda map (x = 0.1, y = 0.1) | 8×8 | 1D logistic map (a = 3.9, x0 = ) | {pr, x, y, , a, x0} |
| 5 | Text label with more description | 2D logistic map (x = , y = , r = 1.19) | 16×16 | 1D logistic map (a = 3.9, x0 = ) | {pr, x, y, r, a, x0} |
| 6 | Text with caption | 2D gingerbread man map (x0 = , y0 = 0.1) | 32×32 | 1D logistic map (a = 3.9, x0 = ) | {pr, x0, y0, a, x0} |
| 7 | Text label with few description | 1D quadratic map (x0 = , r = 1.95) | 64×64 | 1D logistic map (a = 3.9, x0 = ) | {pr, x0, r, a, x0} |
| 8 | Text label with numeric | 1D Bernoulli's map (x0 = 0.2709) | 128×128 | 1D logistic map (a = 3.9, x0 = ) | {pr, x0, a, x0} |
| 9 | Image with no text | Circle map (x = 0.1, d = 0.2, c = 0.5) | 256×256 | 1D logistic map (a = 3.9, x0 = ) | {pr, x, d, c, a, x0} |
| 10 | Satellite image document | Sine map (x0 = 0.7, p = 2.3) | 512×512 | 1D logistic map (a = 3.9, x0 = ) | {pr, x0, p, a, x0} |

In each row the key K carries the priority value pr together with the parameter values listed for the confusion map and for the 1D logistic map.

Encryption technique (applied for every priority with the confusion map and block size of its row):

Confusion
1. Divide the image of size 512×512 into equal blocks of the listed block size.
2. Generate a chaotic sequence with one element per block using the confusion map, convert the elements into integers and take the remainder (modulus) to obtain unique index values.
3. Permute the blocks according to the sequence generated in step 2.

First-level diffusion
1. Generate a chaotic sequence of size 512×512 using the 1D logistic map.
2. Convert the generated sequence into integers by multiplying it by a factor of 10^15 and taking the remainder modulo 255.
3. Modify the confused pixels by XORing them with the integer sequence obtained in step 2.

Second-level diffusion (by priority)
- Priority 1: (1) Traverse a 3×3 kernel over the entire image and calculate the mean. (2) XOR the obtained mean value with every element within the kernel [15].
- Priority 2: (1) Generate a Fibonacci series of length equal to the total number of image pixels. (2) XOR the generated series with the image pixels in both the forward and the reverse direction.
- Priority 3: (1) Generate two random sequences using the 1D logistic map, of lengths equal to the total number of rows and the total number of columns in the image. (2) XOR the obtained sequences with the pixels along both rows and columns. (3) Establish interdependency within the image by XORing each row/column with its previous row/column.
- Priority 4: (1) Scan the pixels of the image along a square wave path for both rows and columns. (2) Perform XOR between the pixels that lie on that path.
- Priority 5: (1) Scan the pixels of the image along a triangular wave path for both rows and columns. (2) Perform XOR between the pixels that lie on that path.
- Priority 6: (1) Scan the pixels of the image in raster scan order. (2) Perform XOR between the pixels that lie on that path.
- Priority 7: (1) Divide the image into four equal triangular units. (2) Perform XOR between every individual unit and the remaining three units.
- Priority 8: (1) Divide the image into four equal triangular units. (2) Perform XOR between every individual unit and the remaining three units. (3) Scan the pixels of the image in zig-zag order. (4) Perform XOR between the pixels that lie on that path.
- Priority 9: (1) Scan the pixels of the image from bottom to top. (2) Perform XOR between the pixels that lie on that path.
- Priority 10: (1) Scan the pixels of the image from right to left. (2) Perform XOR between the pixels that lie on that path.

12. Assessment parameters

To assess the quality of the encryption method, statistical and dynamic assessment parameters are calculated and compared with their ideal values. The statistical parameters include MSE, PSNR, correlation, entropy, key sensitivity, key space, SSIM and UIQ, whereas the dynamic assessment parameters include NPCR and UACI. The encryption time is determined for a set of documents in two cases: encrypted without classification using a single common encryption algorithm, and classified according to their type and encrypted using different algorithms with multidimensional chaotic maps and diffusion techniques. The complexity of the algorithm is varied by using different multidimensional chaotic maps.

Histogram

A histogram is a graphical representation of the number of pixels at each intensity level. The histogram of an encrypted image should be flat, that is, uniformly distributed over all pixel intensity levels. A flat histogram makes it difficult to understand or predict the plain image. It is also desirable that two cipher images obtained from the same plain image with a tiny change in the key value both have uniform histograms.
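The flatness of a cipher-image histogram can be quantified by the variance of its bin counts. The sketch below assumes that Eq. (34) is the pairwise histogram-variance measure commonly used in the image encryption literature; the function name and the 256-level default are illustrative choices, not taken from the source.

```python
import numpy as np

def histogram_variance(img, levels=256):
    """Pairwise histogram variance: (1/n^2) * sum_i sum_j (z_i - z_j)^2 / 2,
    where z_i is the number of pixels with grey value i (assumed form)."""
    z = np.bincount(img.ravel(), minlength=levels).astype(np.float64)
    diff = z[:, None] - z[None, :]          # all pairwise differences z_i - z_j
    return float(np.sum(0.5 * diff ** 2) / levels ** 2)
```

Written this way the measure coincides with the ordinary population variance of the bin counts, so a perfectly flat histogram gives 0 and larger values indicate a less uniform distribution.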
The variances of the histograms are determined and tabulated in Table 3 for the text document image. A smaller variance indicates higher uniformity. The variance of a histogram is calculated using

$$\mathrm{var}(Z) = \frac{1}{n^{2}} \sum_{i=1}^{n} \sum_{j=1}^{n} \frac{1}{2}\,(z_i - z_j)^{2} \qquad (34)$$

where $Z = \{z_1, z_2, \ldots, z_{256}\}$ is the vector of histogram values, and $z_i$ and $z_j$ are the numbers of pixels whose grey values equal $i$ and $j$, respectively.

Tests have been conducted to check whether the cipher images produced with different keys yield flat histograms with a uniform variance. For example, the variance is calculated for the histogram obtained for the highest-priority plain image using the key K1 (initial values 0.0000000000778899, 0.0000000000123654 and 0.00000000000657789, and b = 8/3, among other parameters), and for the histograms of the cipher images obtained with different keys derived by a 5% change in each of the initial parameters of K1. The results are tabulated in Table 3. The uniformity of the variance across keys derived by a uniform change in the different parameters shows that the ciphering technique is key-sensitive and resistant to statistical attacks.

Table 3 Histogram variance obtained for a 5% change in the initial parameters

| Key with parameters | | | | | |
|---|---|---|---|---|---|
| Text document image | 0.7365 | 0.7811 | 0.7565 | 0.7617 | 0.7250 |

Ciphering time

The proposed ciphering algorithm is implemented in MATLAB 2014 on an Intel Core i3 processor with 2 GB of RAM and a 500 GB hard disk. The encryption times for all types of document images, with and without priority assignment, are tabulated in Table 4. Documents containing different information are encrypted with different levels of complexity. Because the prioritized documents are encrypted with different block sizes, as indicated in Table 2, based on their image features, the total encryption time for a set of documents is lower.
On the other hand, when a set of documents is encrypted without assigning priorities, all the documents use a common algorithm with the same chaotic map, the same block size and the same method of diffusion, which leads to a longer encryption time. Encrypting the set of documents without assigning any priority took 4630.774422 seconds, whereas encrypting it with priorities took 658.443416 seconds. The encryption time versus priority is shown graphically in Figure 6. To further reduce the encryption time, high-dimensional chaotic maps that require less time to generate the required set of sequences are used. High-dimensional chaotic sequences are more aperiodic than those of lower-dimensional chaotic maps and thereby increase the security of the system. The encryption time for a Lena image of size 512 × 512 with the proposed method is compared with the time taken by other methods in Table 5, which shows that the proposed method takes significantly less time for encryption.

Table 4 Execution time obtained for different document images with and without priority

| Priority | Time (sec) with priority | Time (sec) without priority |
|---|---|---|
| First | 465.2658 | 465.2658 |
| Second | 170.970537 | 452.502153 |
| Third | 19.759614 | 444.861688 |
| Fourth | 2.228033 | 449.647004 |
| Fifth | 0.095617 | 447.941053 |
| Sixth | 0.035254 | 492.808793 |
| Seventh | 0.027551 | 461.581642 |
| Eighth | 0.023044 | 447.308014 |
| Ninth | 0.020298 | 512.699405 |
| Tenth | 0.017668 | 456.15887 |
| Total | 658.443416 | 4630.774422 |

Table 5 Encryption time comparison for different existing methods

| Image | Encryption time (seconds) |
|---|---|
| Proposed (Lena 512 × 512) | 0.020298 |
| [24] | 0.130 |
| [25] | 0.175 |
| [26] | 0.125 |

Fig. 6 Graphical comparison of the system resources used for encryption of document images with and without priority.

Mean squared error and peak signal to noise ratio

The peak signal to noise ratio (PSNR) is used to assess the encryption scheme. It represents the amount of noise present in the ciphered image. The PSNR is calculated using

$$\mathrm{PSNR} = 10 \log_{10}\!\left(\frac{255^{2}}{\mathrm{MSE}}\right) \qquad (35)$$

where

$$\mathrm{MSE} = \frac{1}{M \times N} \sum_{i=1}^{M} \sum_{j=1}^{N} \bigl(P(i,j) - C(i,j)\bigr)^{2} \qquad (36)$$

and $P(i,j)$ and $C(i,j)$ denote the pixel values of the plain and cipher images of size $M \times N$.

A PSNR above 30 dB is not advisable, as it is then possible to extract the original information from the ciphered image. The desirable value is less than 8 dB. The result is 4.772 dB, as shown in Table 6, which indicates the difficulty of reconstructing the plain image from the cipher image. The PSNR also represents the distortion of the plain image when subjected to encryption. It is observed that the MSE is large for a small change in the initial conditions. The larger the MSE, the better the algorithm.
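The MSE and PSNR between a plain image and its cipher image can be sketched as follows. This is a minimal example assuming 8-bit grey-scale images of equal size and the usual definition PSNR = 10·log10(255²/MSE); the function names are illustrative.

```python
import numpy as np

def mse(plain, cipher):
    """Mean squared error between two equal-sized grey-scale images."""
    diff = plain.astype(np.float64) - cipher.astype(np.float64)
    return float(np.mean(diff ** 2))

def psnr(plain, cipher, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(plain, cipher)
    if err == 0.0:
        return float("inf")
    return float(10.0 * np.log10(peak ** 2 / err))
```

As a plausibility check, an MSE of 2.17×10^4 (the first-priority value in Table 6) gives about 4.77 dB under this definition, in line with the tabulated 4.772 dB.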
A small MSE would enable an interceptor to visualize the original image. These parameters contribute to the confidentiality of the document.

Entropy analysis

The information entropy measures the randomness of the image. It is calculated as

$$E_t(m) = \sum_{i=0}^{2^{8}-1} p(m_i) \times \log_{2}\!\left(\frac{1}{p(m_i)}\right) \qquad (37)$$

where $p(m_i)$ is the probability of occurrence of the grey level $m_i$. For a grey-scale image with $2^{8}$ states of information, if all 256 states appear with the same probability, the entropy is equal to 8. The experimental results yield an entropy very close to 8, as shown in Table 6. This is possible only when all pixel values in the cipher image are equally probable, which indicates that the pixel distribution of the cipher image is random. When all pixel values in the ciphered image appear with equal probability, it is very difficult to predict the original image by statistical analysis. This parameter contributes to the unpredictability and degree of uncertainty of the document.

Correlation analysis

Correlation is a measure of the relationship between the plain and cipher images; it checks whether they are similar. When a plain image is encrypted, the pixel positions are interchanged and the pixel values are modified. Hence the pattern of the ciphered image differs from that of the plain image. This can be measured by calculating the covariance between sets of pixels in different directions in both the plain and cipher images. For a set of pixels, a correlation coefficient of 1 indicates that the images are similar and a coefficient of 0 indicates that they are dissimilar.
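In practice this coefficient is usually estimated from pairs of adjacent pixels sampled in a given direction. The following is a minimal sketch under that assumption; the function name, the random sampling with a fixed seed and the default of 2000 pairs are illustrative choices rather than the authors' procedure.

```python
import numpy as np

def adjacent_correlation(img, direction="horizontal", n_pairs=2000, seed=0):
    """Correlation coefficient between randomly sampled adjacent pixel pairs."""
    offsets = {"horizontal": (0, 1), "vertical": (1, 0), "diagonal": (1, 1)}
    dy, dx = offsets[direction]
    h, w = img.shape
    rng = np.random.default_rng(seed)       # hypothetical fixed seed
    ys = rng.integers(0, h - dy, size=n_pairs)
    xs = rng.integers(0, w - dx, size=n_pairs)
    x = img[ys, xs].astype(np.float64)              # first pixel of each pair
    y = img[ys + dy, xs + dx].astype(np.float64)    # its neighbour
    cov = np.mean((x - x.mean()) * (y - y.mean()))
    return float(cov / np.sqrt(x.var() * y.var()))
```

A smooth plain image gives values close to 1 in every direction, whereas a well-encrypted image should give values close to 0.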
The correlation coefficient is calculated as

$$r_{xy} = \frac{\mathrm{cov}(x, y)}{\sqrt{V(x)}\,\sqrt{V(y)}} \qquad (38)$$

where $\mathrm{cov}(x, y)$ is the covariance between $x$ and $y$, which can be formulated as

$$\mathrm{cov}(x, y) = \frac{1}{N} \sum_{i=1}^{N} \bigl(x_i - E(x)\bigr)\bigl(y_i - E(y)\bigr) \qquad (39)$$

where $x$ and $y$ are the values of two adjacent pixels in the image, $V(x)$ is the variance of the variable $x$,

$$V(x) = \frac{1}{N} \sum_{i=1}^{N} \bigl(x_i - E(x)\bigr)^{2} \qquad (40)$$

$$E(x) = \frac{1}{N} \sum_{i=1}^{N} x_i \qquad (41)$$

A sample of about 2000 pixels is taken in the horizontal, vertical and diagonal directions, and the results are listed in Table 6. The results indicate that there is a large randomness in the cipher image and a very high dissimilarity between the plain and ciphered images. This reduces the redundancy in the cipher image as far as possible.

Table 6 Encryption results obtained for different document images with different priority

| Priority | PSNR | MSE | Hori | Verti | Diag | Entropy | NPCR | UACI | SSIM | UIQ |
|---|---|---|---|---|---|---|---|---|---|---|
| First | 4.772 | 2.17×10^4 | -0.02619 | -0.0308 | -0.00996 | 7.9993 | 100 | 33.465 | -7.75×10^-4 | -7.74×10^-4 |
| Second | 7.958 | 9.127×10^3 | -0.0119 | -0.0175 | -0.0044 | 7.9992 | 99.609 | 33.556 | 7.00×10^-4 | 6.99×10^-4 |
| Third | 7.044 | 1.28×10^4 | -0.0057 | 0.01109 | -0.0214 | 7.9992 | 99.616 | 33.285 | 7.701×10^-4 | 7.692×10^-4 |
| Fourth | 6.409 | 1.48×10^4 | -0.0055 | -0.0171 | 0.00442 | 7.9993 | 99.59 | 33.49 | 0.001033 | 0.001032 |
| Fifth | 6.593 | 8.417×10^3 | -0.022 | 0.00597 | -0.0157 | 7.9993 | 99.57 | 32.92 | -1.6699×10^-4 | -1.682×10^-4 |
| Sixth | 7.931 | 1.014×10^4 | 0.0194 | -0.004 | 0.0089 | 7.9992 | 99.56 | 32.89 | -1.023×10^-4 | -1.034×10^-4 |
| Seventh | 8.368 | 9.4663×10^3 | -0.005 | -0.0153 | 0.012 | 7.9993 | 99.43 | 31.93 | -3.911×10^-4 | -3.921×10^-4 |
| Eighth | 9.515 | 7.27×10^3 | -0.0154 | 0.0043 | -0.0024 | 7.9992 | 99.334 | 30.45 | 7.107×10^-4 | 7.093×10^-4 |
| Ninth | 9.342 | 7.50×10^3 | 0.016 | 0.0121 | 0.0209 | 7.9993 | 99.24 | 30.19 | -0.001339 | -0.001335 |
| Tenth | 9.367 | 6.83×10^3 | 0.0117 | 0.02184 | 0.0266 | 7.9992 | 99.21 | 30.08 | -0.001633 | -0.001635 |

Key sensitivity and key space analysis

Key sensitivity is a test of how many pixels change their values for a tiny change in the original encryption key.
For a good encryption scheme, two keys that differ in one bit produce significantly different ciphers. The key value depends on the initial parameters and the control parameters of the maps used for encryption. Likewise, a one-bit change in the key at the receiving end produces a completely different plain image. The key space represents the total number of different key values for the given precision. It should be large enough to make it difficult for an intruder to find the correct key and reconstruct the plain image; a successful brute-force attack should take years. The keys used for encrypting each class of document image differ from those of the other classes. The key space depends on the key value, which in turn depends on the initial parameters used, the precision of the real pseudorandom numbers being generated and the number of iterations for which the algorithm is repeated for ciphering. For example, in the encryption of the highest-priority image, the key consists of the 4D hyperchaotic Lorenz map with its initial conditions and step size, and the 1D logistic map with its initial condition and control parameter r. The precision of the generated random number sequences is 10^-15 [22]. The size of the key space follows from this precision and the number of initial conditions in the key. The key space should be larger than 2^100 [23] to withstand brute-force attacks. The results obtained for the proposed scheme make it clear that the algorithm is resistant to brute-force attacks.

NPCR and UACI

NPCR and UACI are the assessment parameters for differential attacks on the cipher image. NPCR denotes the number of pixels change rate, that is, the rate at which pixels change their values. For a single-bit change, a changed pixel is recorded as 1 and an unchanged pixel as 0.
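Both measures can be computed from two cipher images of the same plain image, obtained before and after a one-bit change. The sketch below is an illustration only: NPCR is taken as the percentage of differing pixels, and UACI is assumed to have its usual form, the mean absolute difference between the two cipher images normalized by 255, since the source gives no explicit UACI formula. The function name is hypothetical.

```python
import numpy as np

def npcr_uaci(c1, c2):
    """Return (NPCR, UACI) in percent for two equal-sized 8-bit cipher images."""
    a = c1.astype(np.int16)                 # widen to avoid uint8 wrap-around
    b = c2.astype(np.int16)
    npcr = 100.0 * np.mean(a != b)          # share of pixels that changed
    uaci = 100.0 * np.mean(np.abs(a - b) / 255.0)   # assumed UACI definition
    return float(npcr), float(uaci)
```

Two identical images give (0, 0), while an all-black and an all-white image give (100, 100), which brackets the range of both measures.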
The UACI denotes the unified average change intensity, that is, the average change in the intensity values of the pixels of the ciphered images obtained when a one-bit change is made in the plain image. Ideally the NPCR should be 100% and the UACI should be 33.465%. The NPCR and UACI represent the strength of the encryption. The results obtained for the proposed scheme make it clear that the algorithm is resistant to differential attacks. The results are tabulated in Table 6. The NPCR is calculated as

$$\mathrm{NPCR} = \frac{\sum_{i,j} D(i,j)}{M \times N} \times 100 \qquad (42)$$

where $M$ and $N$ are the width and height of the image, and $D(i,j)$ is defined as

$$D(i,j) = \begin{cases} 0, & C_1(i,j) = C_2(i,j) \\ 1, & C_1(i,j) \neq C_2(i,j) \end{cases} \qquad (43)$$

where $C_1(i,j)$ is the grey value of the cipher image and $C_2(i,j)$ is the grey value of the new cipher image.

Universal image quality index (UIQ)

Let $x = \{x_i \mid i = 1, 2, \ldots, N\}$ and $y = \{y_i \mid i = 1, 2, \ldots, N\}$ be the original and cipher images respectively. The quality index is then defined as

$$Q = \frac{4\,\sigma_{xy}\,\bar{x}\,\bar{y}}{(\sigma_x^{2} + \sigma_y^{2})\,(\bar{x}^{2} + \bar{y}^{2})} \qquad (44)$$

where

$$\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i \qquad (45)$$

$$\bar{y} = \frac{1}{N} \sum_{i=1}^{N} y_i \qquad (46)$$

$$\sigma_x^{2} = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^{2} \qquad (47)$$

$$\sigma_y^{2} = \frac{1}{N-1} \sum_{i=1}^{N} (y_i - \bar{y})^{2} \qquad (48)$$

$$\sigma_{xy} = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y}) \qquad (49)$$

The ideal value for UIQ is -1. The results obtained for the different priority documents are tabulated in Table 6.

Structural similarity index measure (SSIM)

The SSIM is a quality metric for images. It measures the similarity between an image and a reference image, the reference being the uncompressed or noise-free image. The metric compares the contrast, luminance and structural information of equal-sized grey-level images. Its value lies between -1 and 1: a value of 1 means that the two images are exactly the same, whereas -1 means that they are dissimilar. Mathematically it is given by

$$\mathrm{SSIM}(x, y) = \frac{(2\,\bar{x}\,\bar{y} + C_1)\,(2\,\sigma_{xy} + C_2)}{(\bar{x}^{2} + \bar{y}^{2} + C_1)\,(\sigma_x^{2} + \sigma_y^{2} + C_2)} \qquad (50)$$

where $C_1$ and $C_2$ are constants. The SSIM values obtained for the different priority images are tabulated in Table 6.

Comparison of security levels

The training database contains 10 different classes of document images obtained from internet sources.
Each class contains a set of 7 images exhibiting all the attributes, giving a total of 70 images for training the system, as shown in Figure 7. For each image, the different features are extracted and stored in a MATLAB '.mat' file. This file is loaded back when a test image is queried to recognize the document type. The encryption times of different existing methods are compared in Table 5. The results obtained with the proposed method of encryption for a set of documents of size 512×512 are listed in Table 6. For the highest-priority document, the number of pixels change rate and the unified average change intensity are equal to the ideal values in the proposed method. This indicates that the proposed method of encryption is resistant to dynamical attacks. The entropy of the proposed method is greater than that of the existing methods, which indicates that the proposed method yields more randomness and hence that the leakage of information is negligible. The results of the proposed method for the other document types with different block sizes are given in Table 6. The encryption results for different document image types for 4×4 pixels are compared in Table 7. The comparison of the proposed 4×4 pixels encryption results with traditional block cipher techniques is given in Table 8. The cipher images obtained when the different document images are encrypted with priority, together with their histograms and their correlation in the horizontal, vertical and diagonal directions, are shown in Figure 8.
Table 7 Comparison of encryption results obtained for different document images

| Type | Hori | Verti | Diag | NPCR | UACI | Entropy | PSNR | MSE | UIQ | SSIM |
|---|---|---|---|---|---|---|---|---|---|---|
| Proposed (4 × 4 pixels) | -0.0262 | -0.031 | -0.009 | 100 | 33.465 | 7.9993 | 4.772 | 2.17×10^4 | -7.74×10^-4 | -7.75×10^-4 |
| [24] | 0.0603 | -0.0692 | 0.0487 | 99.66 | 33.49 | NA | NA | NA | NA | NA |
| [25] | 0.0079 | 0.0038 | 0.007 | 99.545 | 33.42 | 7.997 | NA | NA | NA | NA |
| [26] | -0.0702 | -0.0782 | 0.0039 | NA | NA | NA | NA | NA | NA | NA |
| [27] | 0.0004 | 0.0006 | NA | 99.61 | 33.47 | 7.997 | NA | NA | NA | NA |

Table 8 Comparison of proposed encryption results with traditional block cipher techniques

| Type | Hori | Verti | Diag | NPCR | UACI | Entropy | PSNR | MSE | UIQ | SSIM |
|---|---|---|---|---|---|---|---|---|---|---|
| Proposed (4 × 4 pixels) | -0.0057 | 0.01109 | -0.0214 | 99.616 | 33.285 | 7.9992 | 7.044 | 1.28×10^4 | -7.692×10^-4 | -7.701×10^-4 |
| [28] | 0.0208 | 0.0021 | 0.00012 | NA | NA | 7.9999 | NA | NA | NA | NA |
| [29] | -0.0269 | 0.01096 | 0.0510 | 99.605 | 33.441 | 7.999437 | NA | NA | NA | NA |
| [30] | 0.030768 | 0.019044 | 0.010293 | 99.6058 | 33.4648 | 7.99769 | NA | NA | NA | NA |

Fig. 7 Total number of different document images used for the training database.

Fig. 8 Different document images encrypted with different priority, and histograms of the cipher images with their correlation in three different directions, namely horizontal, vertical and diagonal.
Table 9 NIST test results (p-values) for the sequences generated by chaotic maps of different dimensions

| Statistical analysis | 4D chaotic map | 3D chaotic map | 2D chaotic map | 1D chaotic map | Status |
|---|---|---|---|---|---|
| Mono bit frequency test | 0.929568022278 | 0.573775403633 | 0.426325543384 | 0.511663282696 | Success |
| Block frequency test | 0.914496908439 | 0.774138391129 | 0.220220646602 | 0.364406710571 | Success |
| Run test | 0.035318327624 | 0.038169651362 | 0.036460364749 | 0.098222077056 | Success |
| Longest run ones | 0.749822008129 | 0.968200174367 | 0.200825122695 | 0.877325270843 | Success |
| Binary matrix rank test | 0.498218033841 | 0.291891444943 | 0.085200615631 | 0.058955719226 | Success |
| Spectral test | 0.935353907956 | 0.491297124216 | 0.655518578368 | 0.528110313792 | Success |
| Non-overlapping template matching test | 0.999252260332 | 0.623258576742 | 0.412836786198 | 0.929710883665 | Success |
| Overlapping template matching test | 0.853100388979 | 0.270571775094 | 0.488415602427 | 0.786016922447 | Success |
| Universal statistic test | 0.061368829139 | 0.319352527457 | 0.294836833019 | 0.579319917432 | Success |
| Linear complexity test | 0.808840178441 | 0.423178207016 | 0.919679104285 | 0.959466390424 | Success |
| Serial test | 0.999998513560 | 0.999999999999 | 1.000000000000 | 1.000000000000 | Success |
| Approx. entropy test | 1.000000000000 | 0.999869306864 | 0.977601055158 | 0.543961589891 | Success |
| Cumulative sums test forward | 0.984155397448 | 0.961531188055 | 0.536609751712 | 0.831316404987 | Success |
| Cumulative sums test reverse | 0.997700313205 | 0.629222570292 | 0.547547656132 | 0.301119661875 | Success |
| Random excursion test | 0.471203771279 | 0.737055710199 | 0.762173877605 | 0.921218168648 | Success |
| Random excursions variant test | 0.842342484558 | 0.980593202196 | 0.625123538855 | 0.423310463480 | Success |

NIST test analysis

There are different complexity measurement techniques for measuring the randomness of a given chaotic sequence. In this paper a NIST (National Institute of Standards and Technology) analysis has been conducted to quantitatively estimate the complexity of chaotic maps of different dimensions (4D, 3D, 2D and 1D).
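As an illustration of the kind of test in the NIST SP 800-22 suite, the frequency (monobit) test can be sketched as below. It maps the bits to ±1, sums them, and converts the normalized sum into a p-value with the complementary error function. The function name is illustrative, and the sketch is not the tool used to produce Table 9.

```python
import math

def monobit_p_value(bits):
    """Frequency (monobit) test: p-value for a sequence of 0/1 bits."""
    n = len(bits)
    s = sum(1 if b else -1 for b in bits)    # +1 for each one, -1 for each zero
    s_obs = abs(s) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2.0))
```

A sequence passes at the usual significance level when the p-value is at least 0.01; a perfectly balanced sequence gives a p-value of 1, and a heavily biased one gives a p-value close to 0.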
The complexity of the proposed scheme can be assessed using NIST Special Publication 800-22 (SP 800-22) [17]. SP 800-22 comprises 16 different statistical tests [16]:

1. Mono bit test
2. Frequency test within a block
3. Runs test
4. Longest run of ones test
5. Binary matrix rank test
6. Spectral test
7. Non-overlapping template matching test
8. Overlapping template matching test
9. Universal statistical test
10. Lempel-Ziv compression test
11. Linear complexity test
12. Serial test
13. Approximate entropy test
14. Cumulative sums test
15. Random excursion test
16. Random excursion variant test

For each of these tests, a p-value is calculated from a binary sequence generated by the multidimensional chaotic maps (4D, 3D, 2D and 1D). Each p-value indicates whether the produced sequence is random or not. A p-value equal to 1 indicates perfect randomness. If p lies in the range 0.01 to 1, the test indicates that the produced sequence is random. The randomness of the sequence generated by the proposed algorithm can be evaluated by converting the encrypted pixels pi into bits pib. The NIST results in Table 9 show that the scheme withstands statistical attacks, and hence the proposed method is feasible for cryptographic applications.

13. Conclusion

The proposed work classifies the sample document images, automatically assigns them a priority value based on the type of the document and its features, and encrypts each document with a different level of security. The performance of the proposed method is evaluated by subjecting different types of sample document images to the classifier and then to encryption. Using a larger number of image features and a few neighbors enables the images to be classified correctly and efficiently. The level of security is shown in Table 6 for the different document classes in terms of the entropy, the mean square error, PSNR, correlation, variance, NPCR, UACI, SSIM and UIQ. The results are obtained for a set of documents of different types. The documents are classified correctly and the correct priorities are assigned. The documents encrypted with the highest priority have the highest noise and randomness, the ideal pixel change rate and the ideal unified average change intensity, with better correlation among the neighboring pixels than the documents encrypted with lower priorities. Table 6 shows that the proposed method is equipped to resist statistical and differential attacks with variable security levels. Encrypting a set of document images without priorities takes more time than encrypting the documents with priorities, as shown in Table 4. Thus classifying the images with priorities before encrypting them saves system resources while providing the required security against attacks. The NIST test, which checks the randomness of the chaotic sequences, indicated randomness in all cases, as shown in Table 9. The use of chaotic maps of different dimensions also contributes to the variable security levels and encryption times. Depending on the priority value, a different second-level diffusion technique is used; the more complex methods for establishing interdependency between pixels involve more mathematical operations. The cipher images obtained for different input document images differ from one another and are non-linear. This makes it difficult for a cryptanalyst to recover the original image without the proper key. Hence the proposed method encrypts different types of documents with variable security levels and encryption times.

References

[1] S. Marinai, Introduction to document analysis and recognition. Springer-Verlag Berlin Heidelberg, 90, pp. 120, 2008.
[2] A. Kumar, F.
nette, k. klein, m. fulham and j. kim, “a visual analytics approach using the exploration of multi-dimensional feature spaces for content-based medical image retrieval”, ieee journal of biomedical and health informatics, pp. 168-2194, 2013. [3] n. chen, d. blostein, a survey of document image classification: problem statement, classifier architecture and performance evaluation. springer-ijdar, 2006. [4] o. augereau, n. journet, j-p. domenger, “document images indexing with relevance feedback: an application to industrial context”, in proceedings of the international conference on document analysis and recognition, ieee computer society, 2011, pp. 1190-1194. [5] f. chen, a. girgensohn, m. cooper, y. lu and g. filby, “genre identification for office document search and browsing”, springer-ijdar, 2012, pp. 167-182. [6] s. sergyan, “color histogram features based image classification in content-based image retrieval systems”, in proceedings of the 6th international symposium on applied machine intelligence and informatics, 2008. [7] f. esposito, d. malerba and f. a. lisi, “machine learning for intelligent processing of printed documents”, journal of intelligent information systems, vol. 14, pp. 175-198, 2000. [8] v. eglin, s. bres, l.-rfv, i. de lyon, “document page similarity based on layout visual saliency: application to query by example and document classification”, in proceedings of the 7th international conference on document analysis and recognition (icdar-2003), ieee computer society, 2003. [9] a. schenker, m. last, h. bunke and a. kandel, “classification of web documents using a graph model”, in proceedings of the 7th international conference on document analysis and recognition (icdar-2003), ieee computer society, 2003. [10] e. appiani, f. cesarini, a.m. colla, m. diligenti, m. gori, s. marinai and g. soda, “automatic document classification and indexing in high-volume applications”, springer-verlag (ijdar), pp. 69-83, 2001.
[11] ms. k. arthi and mr. j. vijayaraghavan, “content based image retrieval algorithm using color models”, international journal of advanced research in computer and communication engineering, vol. 2, issue 3, 2013. [12] s. shastry, g. gunasheela, t. dutt, d. s. vinay and s. r. rupanagudi, ““i”-a novel algorithm for optical character recognition (ocr)”, ieee, pp. 389-393, 2013. [13] a. farhat, a. al-zawqari, a. al-qahatni, o. hommos, f. bensaali, a. amira and x. zhai, “ocr based feature extraction and template matching algorithms for qatari number plate”, ieee, 2016. [14] c. r. revanna and dr. c keshavamurthy, “a secure document image encryption using mixed chaotic system”, international journal of computer science and information security (ijcsis), vol. 15, no. 3, pp. 263-270, 2017. [15] c. r. revanna and dr. c keshavamurthy, “a new selective document image encryption using gmm-em and mixed chaotic system”, international journal of applied engineering research, vol. 12, pp. 8854-8865, 2017. [16] h. liu, and x. wang, “color image encryption using spatial bit-level permutation and high-dimension chaotic system”, optical communication, vol. 284, pp. 3895-3903, 2011. [17] a. melo, p. bezerra, and a. ablem, et al. “priority qoe: a tool for improving the qoe in video streaming”, intelligent multimedia technologies for networking applications: techniques and tools, chapter 11. [18] n. shaikh, s. chapaneri and d. jayaswal, “hyper chaotic color image cryptosystem”, in proceedings of the ieee international conference on advances in computer application, 2016, pp. 239-243. [19] v. praneeth kumar reddy and a. annis fathima, “a cost effective approach for securing medical x-ray images using chebyshev map”, in proceedings of the ieee 5th international conference on recent trends in information technology, 2016. [20] m. dridi, m. ali hajjaji, b. bouallegue and a.
mtibaa, “cryptography of medical images based on a combination between chaotic and neural network”, iet, image processing, pp. 1-10, 2016. [21] b. awdun and g. li, “the color image encryption technology based on dna encoding and sine chaos”, in proceedings of the ieee international conference on smart city and system engineering, pp. 539-544, 2016. [22] ieee computer society. (1985). ieee standard for binary floating-point arithmetic, ansi/ieee standard, august 1985, p. 754. [23] g. alvarez, and s.j. li, “some basic cryptographic requirements for chaos-based cryptosystem”, int. j. bifurcation chaos, vol. 16, no. 8, pp. 2129-2151, 2006. [24] g. ye, “a block image encryption algorithm based on wave transmission and chaotic systems”, springer nonlinear dyn., vol. 75, pp. 417-427, 2014. [25] x. wang, l. liu, y. zhang, “a novel chaotic block image encryption algorithm based on dynamic random growth technique”, elsevier optics and laser in engineering, vol. 66, pp. 10-18, 2015. [26] z. yu, z. z. yang et al., “a chaos-based image encryption algorithm using wavelet transform”, in proceedings of the ieee conference, pp. 217-222, 2010. [27] d. e. goumidi, f. hachouf, “hybrid chaos based image encryption approach using block and stream ciphers”, in proceedings of the ieee international workshop on system signal processing and their applications, 2013, pp. 139-144. [28] s. m. wadi and n. zainal, high definition image encryption algorithm based on aes modification. springer science business media new york, pp. 811-829, 2014. [29] y. zhang, x. li and w. hou, “a fast image encryption scheme based on aes”, in proceedings of the 2nd international conference on image, vision and computing, 2017, pp. 624-628. [30] y. zhang, “test and verification of aes used for image encryption”, springer-verlag gmbh germany, 2018. instruction facta universitatis series: electronics and energetics vol. 29, no. 1, march 2016, pp.
49-60 doi: 10.2298/fuee1601049m effects of pulsed negative bias temperature stressing in p-channel power vdmosfets ivica manić¹, danijel danković¹, vojkan davidović¹, aneta prijić¹, snežana đorić-veljković², snežana golubović¹, zoran prijić¹, ninoslav stojadinović¹ ¹university of niš, faculty of electronic engineering, niš, serbia ²university of niš, faculty of civil engineering and architecture, niš, serbia abstract. our recent research on the effects of pulsed bias nbt stressing in p-channel power vdmosfets is reviewed in this paper. the reduced degradation normally observed under pulsed stress bias conditions is discussed in terms of the dynamic recovery effects, which are further assessed by varying the duty cycle ratio and frequency of the pulsed stress voltage. the results are analyzed in terms of the effects on device lifetime as well. the tendency of stress-induced degradation to decrease with lowering the duty cycle and/or increasing the frequency of the pulsed stress voltage, which leads to an increase in device lifetime, is explained in terms of enhanced dynamic recovery effects. key words: vdmosfet, nbti, pulsed bias stress, threshold voltage, lifetime 1. introduction negative bias temperature instability (nbti) has been widely recognized as one of the crucial reliability issues in state-of-the-art cmos technology. specifically, p-channel devices exposed to stress with negative gate bias at increased temperatures are susceptible to threshold voltage shift due to complex physical mechanisms involving generation of bulk oxide charge and interface traps [1]-[6]. the magnitude of the observed shift strongly depends on stress parameters, such as the gate voltage, temperature, and stress time. most of the recent nbti studies have been done on devices with very thin (less than a few nanometres) gate oxide films, including sio2, sion or high-k [1]-[6].
however, there is still high interest in ultra-thick oxides owing to the widespread use of mos technologies for the realisation of power devices. the electric fields and temperatures typical for nbt stress can be approached during the routine operation of power mosfets in many applications [7], so the investigation of nbti in these devices, which may have gate oxide thickness ranging from several tens to 100 nm or even more, is of importance as well. received october 14, 2015. corresponding author: ivica manić, university of niš, faculty of electronic engineering, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: ivica.manic@elfak.ni.ac.rs). owing to its superior switching characteristics, which enable operation in the megahertz frequency range, the power vdmosfet (vertical double diffused mosfet) is an attractive device for application in high-frequency switching power supplies, home appliances and automotive, industrial and military electronics [8], [9]. in these applications the gate bias applied during operation switches between the “high” and “low” voltage levels, thus creating pulsed stress conditions. earlier investigations have shown that the pulsed (also referred to as ac) nbt stress creates less significant degradation than the static (dc) stress owing to a dynamic recovery effect (a part of the degradation created by the preceding stress voltage pulse is neutralized and/or annealed during the fraction of the period corresponding to the “low” level of the pulsed stress voltage and has to be restored upon arrival of the next voltage pulse) [10]-[14]. accordingly, the lifetime predictions based on static nbt stress [15], [16], where the transistor was continuously kept “on”, might be wrong, and it is thus important to estimate device lifetime under the pulsed nbt stress conditions. this paper provides a review of our recent research on the effects of pulsed bias nbt stressing in p-channel power vdmosfets [17]-[20].
the dynamic recovery effects are assessed by varying the duty cycle and frequency of the pulsed stress voltage, and the results are further analyzed in terms of the effects on device lifetime. 2. results and discussion devices used in this study were commercial p-channel power vdmosfets irf9520 with current/voltage ratings of 6.8 a/100 v, encapsulated in to-220 plastic cases [21]. the devices were built in standard si-gate technology with gate oxide thickness of 100 nm, and had the initial threshold voltage, vt0, about 3.6 v. owing to the thick gate oxide, accelerated nbt stressing of these devices requires gate stress voltage amplitudes even over -40 v, which exceed capabilities of commonly used signal voltage sources [22], [23]. for that reason we have developed a specific stress and measurement system suitable for nbti testing in power mos devices, which includes an external amplifier between the stress voltage source unit and the device under test (dut) [24]. the system actually includes high voltage stress circuit and the low voltage measurement circuit, which are separated by two software-controlled switches. gate stress voltage is supplied from tektronix afg3102 source unit acting either as a dc source or a pulse generator, for the static and pulsed nbt stressing, respectively, while the drain and source terminals are grounded. the measurements of transfer i-v characteristics of dut are done by providing the sweeping gate bias from an agilent 6645a source unit, while a source-measure unit from agilent 4156c semiconductor parameter analyzer is used for drain biasing and drain current measurements. all the instrumentation and the temperature inside the heraeus hep2 chamber are computer controlled over ieee-488 gpib bus. 
this setup provides a complete measurement of the i-v characteristic, with gate voltage swept from 4.75 v to 2 v in 50 mv steps, in about 235 ms (including the time required to switch the circuits from stress to measurement and back), which practically means that the dut remains unstressed for at least 235 ms during each interim measurement [24]. we first performed a preliminary experiment in which two sets of devices were stressed for 36 hours under static and pulsed nbt stress conditions, respectively. for the static nbt stress, negative dc voltages with magnitudes in the range 35-45 v were applied to the gate, whereas the drain and source terminals were grounded. for the pulsed bias stress, negative gate voltage pulses (with base level of 0 v, frequency f = 10 khz, and duty cycle dtc = 50%) of the same magnitudes were used instead. stressing under both static and pulsed conditions was performed at temperatures ranging from 125 to 175 °c. typical subthreshold transfer i-v characteristics of p-channel power vdmosfets subjected to the static and pulsed nbt stressing are shown in fig. 1(a) and (b), respectively. the transfer i-v characteristics were measured at a drain voltage of 100 mv, so the devices were kept in the linear region of operation. during the 36 hour stress, a total of 39 interim measurements were done according to a specified timeline, but for simplicity only the initial (before stress) and final (after stress) characteristics measured at room temperature (27 °c) and stress temperature (175 °c) are shown. as can be seen, the characteristics are shifted along the vgs axis towards higher vgs voltages as a consequence of the stress-induced build-up of oxide-trapped charge.
at the same time, the slope of the characteristics slightly decreases, indicating that interface traps and/or near-interface oxide traps, known as border traps or switching oxide traps, are generated as well. it can also be seen that the shifts caused by the pulsed nbt stressing are smaller than those found in the case of static nbt stressing. fig. 1 transfer i-v characteristics of p-channel power vdmosfets measured before and after (a) static and (b) pulsed (f = 10 khz, dtc = 50%) nbt stressing. in line with the observed shifts of the transfer characteristics along the voltage axis, nbt stressing was found to cause significant threshold voltage shifts (δvt). threshold voltage values were calculated from the measured i-v characteristics using the second derivative method [25]. two characteristic sets of data (for different stress voltages at 175 °c, and for different temperatures at a stress voltage magnitude of 45 v) for the stress-induced threshold voltage shifts found during the static and pulsed nbt stressing of irf9520 p-channel vdmosfets are shown in fig. 2. as can be seen, nbt stressing under both static and pulsed bias conditions caused significant threshold voltage shifts, which were more pronounced at higher voltages and/or temperatures. in addition, the pulsed voltage stressing generally caused lower shifts than static stressing performed at the same temperature with equal stress voltage magnitude. the lower shifts observed in the case of the pulsed nbt stress can be explained by two factors associated with the nature of pulsed stressing itself. the first factor is assessed by taking into account that “stress time” in fig. 2 refers to the total time, which includes fractions of the periods corresponding to both “high” and “low” levels of the pulsed gate voltage applied.
however, the devices are actually stressed only during the fraction of the period corresponding to the “high” voltage level (on-time), so the actual or net stress time is significantly shorter (and the resulting stress-induced threshold voltage shifts appear both slower and lower) in the case of pulsed stress than in the case of static stress. the other factor could be a partial recovery of threshold voltage during the period fractions corresponding to the “low” level of the pulsed stress voltage (off-time), which also contributes to the smaller shifts observed in the case of pulsed bias stress. the partially recovered degradation is restored again on arrival of each new stress voltage pulse, so the phenomenon is referred to as dynamic recovery [26]. fig. 2 threshold voltage shifts in p-channel power vdmosfets during the static and pulsed (f = 10 khz, dtc = 50%) nbt stressing. to evaluate dynamic recovery effects during the pulsed bias stressing it is necessary to eliminate the first factor mentioned above, which is commonly done by plotting the threshold voltage shift as a function of the net stress time rather than the total time. accordingly, fig. 3 shows the threshold voltage shifts versus the net stress time, where the net stress time in the case of pulsed stressing was calculated by multiplying the total stress time by the duty cycle ratio (50% for the pulsed stress in this case). these results show that, for corresponding values of the net stress time, the stress-induced threshold voltage shifts under the static stress remain much higher (approximately three times) than in the case of pulsed nbt stress. this clearly indicates that the δvt time dependencies in figs. 2 and 3 have been affected by the partial recovery of threshold voltage during the period fractions corresponding to the “low” level of the pulsed stress voltage. fig.
3 threshold voltage shifts in p-channel vdmosfets during the static and pulsed (f = 10 khz, dtc = 50%) nbt stressing vs. the net stress time. further insight into the dynamic recovery effect was obtained by varying the duty cycle ratio and frequency of the pulsed voltage used for device stressing. the results of stressing with pulses of three different duty cycles (75%, 50%, and 25%) at 10 khz and those of static stress are shown in fig. 4, where the net stress time in the cases of pulsed stressing was calculated by multiplying the total stress time by the corresponding duty cycle value for each specific case. the overall net stress time was 6 h in all cases, and all devices were stressed with the same gate voltage magnitude (45 v) at 175 °c. as can be seen, the nbt stress-induced threshold voltage shifts are most significant in the case of the static stress and clearly decrease with reducing duty cycle in the cases of the pulsed bias stress. this is a clear indication that dynamic recovery effects become more pronounced when a pulsed gate voltage with a lower duty cycle ratio is applied. fig. 4 threshold voltage shifts in p-channel vdmosfets vs. net stress time at various duty cycles (nbt stress: vg = -45 v, t = 175 °c, f = 10 khz). however, it should be noted that the frequency remains constant, so variations in duty cycle change the ratio between the pulse and no-pulse fractions of the period: a lower duty cycle actually means shorter pulses and longer breaks between the pulses, which further means shorter stress time and longer recovery time during each period of the pulsed stress voltage applied. accordingly, there is less time to create degradation during a single period and more time for recovery, so the overall resulting degradation found after stressing for equal net stress times tends to decrease with reducing the duty cycle.
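the bookkeeping behind these comparisons (net stress time equal to the total stress time multiplied by the duty cycle, and the split of each period into a stress fraction and a recovery fraction) can be sketched as follows. this is only an illustrative python sketch under the definitions given above; the function and variable names are hypothetical and are not taken from the authors' measurement software.

```python
def pulse_timing(total_stress_time_s, duty_cycle, frequency_hz):
    """Return (net stress time, on-time per period, off-time per period) in seconds.

    duty_cycle is given as a fraction (0.5 for 50%); all names are illustrative.
    """
    if not 0.0 < duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be in the interval (0, 1]")
    if frequency_hz <= 0.0:
        raise ValueError("frequency_hz must be positive")
    period_s = 1.0 / frequency_hz
    on_time_s = duty_cycle * period_s      # "high" level: the device is stressed
    off_time_s = period_s - on_time_s      # "low" level: the device can recover
    net_stress_time_s = duty_cycle * total_stress_time_s
    return net_stress_time_s, on_time_s, off_time_s


def total_time_for_net(net_stress_time_s, duty_cycle):
    """Total stress time needed to accumulate a given net stress time."""
    if not 0.0 < duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be in the interval (0, 1]")
    return net_stress_time_s / duty_cycle
```

for example, accumulating the 6 h net stress time used here at a duty cycle of 25% requires 24 h of total stress time, whereas at a fixed duty cycle a change of frequency shortens or lengthens the on-time and off-time together and leaves the net stress time unchanged.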
therefore, it can be speculated that overall degradation tends to decrease with duty cycle reduction because of two combined effects: one is the creation of less degradation because the pulses are getting shorter, while the other is the enhanced dynamic recovery because the period fractions between two pulses are getting longer. threshold voltage shifts observed in devices stressed with pulses of three different frequencies (1, 10 and 100 khz), in comparison with those obtained by static stress, are shown in fig. 5. all devices were stressed with the same gate voltage magnitude (45 v) at 175 °c, and the overall net stress time was again 6 h in all cases. the duty cycle was kept at 50% for the pulsed stressing at all frequencies, so the net stress time in these cases was equal to half of the total stress time. again, the stress-induced threshold voltage shifts are most significant in the case of the static stress, and it is interesting to note that they clearly decrease with increasing frequency in the cases of the pulsed bias stress. so, the dynamic recovery effects seem to become more pronounced with increasing frequency of the gate voltage applied, even though the change of frequency at constant duty cycle does not affect the ratio between the pulse and no-pulse fractions of the period at all. however, the increase in frequency means that the pulses themselves and the fractions of the period between the pulses simultaneously become shorter, which further means that there is less time to create degradation and also less time for recovery during each period of the pulsed voltage applied. fig. 5 threshold voltage shifts in p-channel vdmosfets vs. net stress time at various frequencies (nbt stress: vg = -45 v, t = 175 °c, dtc = 50%). accordingly, one may expect the resulting degradation would be nearly independent of frequency, as reported in [27], [28], but in our case degradation apparently decreases with
increasing the frequency, as reported more recently in [29]. advanced measurement techniques for nbti characterization have been developed rather recently, which might be the reason for the inconsistency of the data reported here and in [29] with those found in less recent publications [27], [28]. a possible explanation for why the degradation decreases with increasing frequency could be as follows. the pulses at low frequencies are long enough to allow for the creation of a rather significant amount of the slow and/or non-recoverable component of degradation, which is hardly removed in the fraction of the period between the pulses. the amount of this component decreases at higher frequencies, while that of the fast component increases, and the latter is more easily removed even though the fraction of the period between the pulses becomes shorter. as a result, the dynamic recovery effects become more pronounced and overall degradation tends to decrease with increasing frequency. nbti can seriously limit device lifetime, so one of the main goals of our nbti studies was to estimate the normal operation lifetime of the investigated p-channel power vdmosfets by using the experimental data obtained under the accelerated nbt stress conditions. considering the above differences in the observed effects between the static and pulsed nbt stressing, the predictions based on the results of static nbt stressing [15], [16] may underestimate the lifetime, so the proper approach is to assess the lifetime under the pulsed nbt stress conditions, which are closer to those met by devices in real applications. our experimental devices were power transistors, so we could assume the maximum normal bias and temperature to be, for example, vg = -20 v and t = 100 °c.
any of several device electrical parameters, such as threshold voltage, transconductance, or drain current, can be used as a degradation monitor for lifetime estimation [30], [31]. threshold voltage is widely accepted as a well-suited parameter, so in our studies we have used the experimental results for the nbt stress-induced δvt to estimate the device lifetime in practical operation. the procedure of lifetime estimation consists of two steps: experimental values of the lifetime are extracted first, and these values are then used for extrapolation to normal conditions. experimental lifetime is defined as the stress time required for the stress-induced δvt to reach some predetermined value, which is called the failure criterion (fc). in our case we defined fc as a threshold voltage shift of 50 mv. the extraction of experimental lifetime values from our data for stress voltage vg = -45 v at different temperatures is illustrated in fig. 6. as can be seen, lifetime values extracted from the static nbt stress data (τ1 to τ3) are significantly shorter than those extracted from the corresponding pulsed stress data (τ4 to τ6). there are several well-established models for extrapolation along the voltage or electric field axis, such as the “vg”, “1/vg” and “power-law” models [32], [33], which are based on corresponding degradation models for the threshold voltage shifts. the “1/vg” and other models for extrapolation along the voltage or electric field axis can be used to estimate the device lifetime and ten year operation voltage (the maximum operation voltage providing 10 years of device operation without failure) only at the temperatures applied during accelerated stressing, which are generally higher than the actual temperatures found in normal device operation. estimates obtained by these models are very useful as worst case expectations, but it would be even more useful to have lifetime estimates for normal operation temperatures.
fig. 6 extraction of experimental lifetime from the threshold voltage shift time dependencies recorded during the nbt stressing with gate voltage vg = -45 v at different temperatures. trying to resolve this issue, we have proposed extrapolation along the temperature axis [34]. a model for this extrapolation can be derived from any of several degradation models for nbt stress-induced threshold voltage shifts, which all include the arrhenius temperature acceleration factor, and can be expressed as [34]: $\tau = A \exp(B/T)$, (1) where τ is the lifetime, T is the temperature, and A and B are the fitting parameters taken from the initial degradation model. the above expression has the same mathematical form as the one describing the standard “1/vg” model, so the proposed model was called the “1/t” model. the “1/t” model requires experimental lifetime values extracted from the data for nbt stressing performed with the same voltage magnitude at several different temperatures, such as those plotted in fig. 6. the lifetime estimation by means of this model is illustrated in fig. 7. only extrapolation to t = 100 °c (which seems rather realistic for operation of power devices) is shown, but the procedure can be used to estimate the lifetime by extrapolation to any other reasonable operation temperature. in analogy with the “1/vg” model, the extrapolation procedure employed by the “1/t” model additionally allows us to estimate a new reliability parameter, which is called the “ten year operation temperature”, t10y, and is defined as the maximum temperature that allows 10 years of device operation with stress-induced δvt below fc. as can be seen in fig. 7, the “1/t” model yields significant differences between the effects of static and pulsed nbt stress. the lifetime at 100 °c is more than two orders of magnitude higher under the pulsed bias conditions (lifetime p) than under the static ones (lifetime s).
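the two-step procedure described above (extraction of experimental lifetimes at the failure criterion, followed by extrapolation with the “1/t” model) can be sketched in python as follows. this is a minimal sketch under stated assumptions and not the authors' implementation: the function names are hypothetical, the crossing of the failure criterion is located by log-log interpolation between neighbouring measurement points (an assumption; the source does not say how the crossing is read off), the shifts are passed as positive magnitudes in volts, and the temperature in eq. (1) is taken to be the absolute temperature in kelvin, as is usual for an arrhenius factor.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def extract_lifetime(times, shifts, fc=0.050):
    """Stress time at which the threshold voltage shift magnitude reaches fc.

    times and shifts are increasing, positive sequences (seconds, volts).
    Log-log interpolation is an assumption of this sketch.
    Returns None if fc is not reached within the measured data.
    """
    if shifts[0] >= fc:
        return times[0]
    points = list(zip(times, shifts))
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if v0 < fc <= v1:
            frac = (math.log(fc) - math.log(v0)) / (math.log(v1) - math.log(v0))
            return math.exp(math.log(t0) + frac * (math.log(t1) - math.log(t0)))
    return None


def fit_one_over_t(temps_k, lifetimes):
    """Least-squares fit of ln(tau) = ln(A) + B / T; returns (A, B)."""
    x = [1.0 / t for t in temps_k]
    y = [math.log(tau) for tau in lifetimes]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = math.exp(y_mean - b * x_mean)
    return a, b


def lifetime_at(a, b, temp_k):
    """Lifetime predicted by eq. (1) at temperature temp_k (kelvin)."""
    return a * math.exp(b / temp_k)


def ten_year_temperature(a, b, target_s=10 * SECONDS_PER_YEAR):
    """Temperature (kelvin) at which eq. (1) gives a lifetime of target_s."""
    return b / math.log(target_s / a)
```

in use, extract_lifetime would be applied to each δvt time dependence recorded at a given stress temperature, the resulting lifetimes would be passed to fit_one_over_t, and lifetime_at(a, b, 373.15) would give the extrapolated lifetime at 100 °c; a separate fit is needed for the static and for the pulsed data.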
also, the ten year operation temperature is about 25 °c higher in the case of exposure to pulsed voltage stressing (t10y^p) than in the case of static stressing (t10y^s). these observations are completely in line with those obtained by means of the “1/vg” model [18]. an expectation based on the above results is that the use of lower duty cycle gate voltage pulses for switching (which of course has to conform to the specific requirements of the circuit the investigated devices are to be used in) could lead to a longer device lifetime. fig. 7 extrapolation to normal operation temperature by means of the “1/t” model to estimate the lifetime and ten year operation temperature in p-channel power vdmosfets subjected to static and pulsed nbt stressing. this expectation is confirmed by the data in fig. 8, which shows the nbt stress-induced threshold voltage shifts in devices subjected to the static and pulsed stressing under different duty cycles at f = 10 khz. as can be seen, the pulsed nbt stress caused significant threshold voltage shifts, which were more pronounced at higher duty cycles, but still lower than those caused by the static stress. the figure additionally illustrates that the stress time required for the stress-induced threshold voltage shift to reach the predetermined value of fc increases with decreasing duty cycle, so the experimental values of device lifetime increase with decreasing duty cycle as well (τs < τ75% < τ50% < τ25%), and the actual lifetime under normal operation conditions is expected to increase likewise. fig. 8 threshold voltage shifts in p-channel power vdmosfets subjected to the pulsed nbt stress at different duty cycles, f = 10 khz, with corresponding experimental lifetime values indicated. this can be explained in terms of the mechanisms responsible for nbt stress-induced degradation.
threshold voltage shifts related to nbti are known to originate from the underlying build-up of oxide-trapped charge and interface traps due to stress-initiated electrochemical processes involving oxide and interface defects, holes, and various species associated with the presence of hydrogen as a common impurity in mos devices [1]-[6], [12]-[14], [35]-[37]. in the case of pulsed voltage applied to the gate, devices are sequentially subjected to stress and no-stress conditions, where the actual stress time depends on pulse frequency and duty cycle. the actual stress time decreases with decreasing duty cycle, so the resulting degradation associated with stress-induced generation of oxide-trapped charge and interface traps must decrease as well, whereas the device lifetime consequently increases. 3. conclusion the results of our recent research on the effects of pulsed bias nbt stressing in p-channel power vdmosfets have been reviewed in this paper. the reduced degradation normally observed under pulsed stress bias conditions has been discussed in terms of the dynamic recovery effects, which were further assessed by varying the duty cycle ratio and frequency of the pulsed stress voltage. the stress-induced degradation was shown to decrease with reducing the duty cycle and increasing the frequency of the pulsed stress voltage. the results were analyzed in terms of the effects on device lifetime as well. the tendency of the stress-induced degradation to decrease with lowering the duty cycle and/or increasing the frequency, which results in an increase in device lifetime, was explained in terms of enhanced dynamic recovery effects. acknowledgement: this work has been supported by the ministry of education, science and technological development of the republic of serbia, under the projects oi-171026 and tr32026, and in part by ei pcb factory, niš, serbia. references [1] d.k. schroder, j.a.
babcock, “negative bias temperature instability: road to cross in deep submicron silicon semiconductor manufacturing”, j. appl. phys., vol. 94, pp. 1-18, 2003. [2] j.h. stathis, s. zafar, “the negative bias temperature instability in mos devices: a review”, microelectron. reliab., vol. 46, pp. 270-286, 2006. [3] s. ogawa, m. shimaya, n. shiono, “interface-trap generation at ultrathin sio2 (4-6 nm)-si interfaces during negative-bias temperature aging”, j. appl. phys., vol. 77, pp. 1137-1148, 1995. [4] v. huard, m. denais, c. parthasarathy, “nbti degradation: from physical mechanisms to modeling”, microelectron. reliab., vol. 46, pp. 1-23, 2006. [5] m.a. alam, s.a. mahapatra, “a comprehensive model of pmos nbti degradation”, microelectron. reliab., vol. 45, pp. 71-81, 2005. [6] s. mahapatra, n. goel, s. desai, s. gupta, b. jose, s. mukhopadhyay, k. joshi, a. jain, a.e. islam, m.a. alam, “a comparative study of different physics-based nbti models”, ieee trans. electron devices, vol. 60, no. 3, pp. 901-916, 2013. [7] s. gamerith, m. pölzl, “negative bias temperature stress in low voltage p-channel dmos transistors and role of nitrogen”, microelectron. reliab., vol. 42, pp. 1439-1443, 2002. [8] b.j. baliga, fundamentals of semiconductor power devices, new york: springer, 2008. [9] v. benda, j. gowar, d.a. grant, power semiconductor devices, new york: john wiley, 1999. [10] m.a. alam, “a critical examination of the mechanisms of dynamic nbti for pmosfets”, in iedm techn. dig., 2003, pp. 345-348. [11] r. fernández, b. kaczer, a. nackaerts, s. demuynck, r. rodríguez, m. nafria, g. groeseneken, “ac nbti studied in the 1 hz – 2 ghz range on dedicated on-chip cmos circuits”, in iedm techn. dig., 2006, pp. 1-4. [12] s. mahapatra, a. e. islam, s. deora, v. d. maheta, k. joshi, a. jain, m. a.
alam, “a critical reevaluation of the usefulness of r-d framework in predicting nbti stress and recovery”, in proc. international reliability physics symposium (irps), 2011, pp. 614-623. [13] h. reisinger, t. grasser, k. ermisch, h. nielen, w. gustin, c. schlünder, “understanding and modeling ac bti”, in proc. international reliability physics symposium (irps), 2011, pp. 597-603. [14] s. desai, s. mukhopadhyay, n. goel, n. nanaware, b. jose, k. joshi, s. mahapatra, “a comprehensive ac/dc nbti model: stress, recovery, frequency, duty cycle and process dependence”, in proc. international reliability physics symposium (irps), 2013, pp. xt.2.1 – xt.2.11. [15] d. danković, i. manić, s. djorić-veljković, v. davidović, s. golubović, n. stojadinović, “nbt stressinduced degradation and lifetime estimation in p-channel power vdmosfets”, microelectron. reliab. vol. 46, pp. 1828-1833, 2006. [16] n. stojadinović, d. danković, i. manić, v. davidović, s. djorić-veljković, s. golubović, “impact of negative bias temperature instabilities on lifetime in p-channel power vdmosfets”, in proc. telsiks 2007 conf., 2007, pp. 275-282. [17] n. stojadinović, d. danković, i. manić, a. prijić, v. davidović, s. djorić-veljković, s. golubović, z. prijić, “threshold voltage instabilities in p-channel power vdmosfets under pulsed nbt stress”, microelectron. reliab. vol. 50, pp. 1278-1282, 2010. [18] i. manić, d. danković, a. prijić, v. davidović, s. djorić-veljković, s. golubović, z. prijić, n. stojadinović, “nbti related degradation and lifetime estimation in p-channel power vdmosfets under the static and pulsed nbt stress conditions”, microelectron. reliab. vol. 51, pp. 1540-1543, 2011. [19] d. danković, i. manić, v. davidović, a. prijić, s. djorić-veljković, s. golubović, z. prijić, n. stojadinović, “lifetime estimation in nbt stressed p-channel power vdmosfets”, facta universitatis, series: automatic control and robotics, vol. 11, pp. 15-23, 2012. [20] i. manić, d. danković, a. prijić, z. 
prijić, n. stojadinović, “measurement of nbti degradation in pchannel power vdmosfets”, informacije midem, journal of microelectronics, electronic components and materials. vol. 44, pp. 280-287, 2014. [21] “irf9520n,” data sheet, international rectifier, [online]. available: http://www.irf.com/productinfo/datasheets/data/irf9520npbf.pdf [22] tektronix, inc., “high amplitude arbitrary/function generator simplifies measurement in automotive, semiconductor, scientific and industrial applications,” application note, [online]. available: http://www.tektronix.com/afg3000, 2008. [23] agilent technologies inc., “agilent 4156c precision semiconductor parameter analyzer“, data sheet, p.5. [online]. available: http://www.agilent.com, 2009. [24] a. prijić, d. danković, lj. vračar, i. manić, z. prijić, n. stojadinović, “a method for negative bias instability (nbti) measurements on power vdmos transistors”, measure. sci. and technol., vol.23, 085003 (8 pp.), 2012. [25] a. ortiz-conde, f.-j. garcıa sanchez, j.j. liou, a. cerdeira, m. estrada, y. yue, “a review of recent mosfet threshold voltage extraction methods”, microelectron. reliab., vol. 42, pp. 583–596, 2002. [26] m.a. alam, “a critical examination of the mechanisms of dynamic nbti for pmosfets“, in technical digest of the iedm 2003, usa, pp. 345348, 2003. [27] g. chen, m.f. li, c.h. ang, j.z. zheng, d.l. kwong, “dynamic nbti of p-mos transistors and its impact on mosfet scaling”, ieee electron. dev. lett., vol. 42, pp. 734–736, 2002. [28] m.f. li, g. chen g, c. shen, x.p. wang, h.y. yu, y.c yeo et al., “dynamic bias temperature instability in ultrathin sio2 and hfo2 metal-oxide-semiconductor field effect transistor and its impact on device lifetime”, jpn. j. appl. phys., vol. 43(11b), pp. 78077814, 2004. [29] t. nigam, “pulse-stress dependence of nbti degradation and its impact on circuits”, ieee trans. device mater. reliab., vol. 9, pp. 72–78, 2008. [30] c. schlunder, r. brederlow, b. ankele, w. gustin, k. goser, r. 
thewes, “effects of inhomogeneous negative bias temperature stress on p-channel mosfets of analogue and rf circuits”, microelectron. reliab., vol. 45, pp. 39-46, 2005. http://www.sciencedirect.com/science/article/pii/s0026271404001714?_rdoc=6&_fmt=high&_origin=browse&_srch=doc-info(%23toc%235751%232005%23999549998%23531072%23fla%23display%23volume)&_docanchor=&_ct=25&_reflink=y&_zone=rslt_list_item&md5=c7ff3b6c50da9e9ebaf551454c7d6b7e http://www.sciencedirect.com/science/article/pii/s0026271404001714?_rdoc=6&_fmt=high&_origin=browse&_srch=doc-info(%23toc%235751%232005%23999549998%23531072%23fla%23display%23volume)&_docanchor=&_ct=25&_reflink=y&_zone=rslt_list_item&md5=c7ff3b6c50da9e9ebaf551454c7d6b7e 60 i. manić, d. danković, v. davidović, et al. [31] s.s. tan, t.p. chen, c.h. ang, l. chan, “mechanism of nitrogen-enhanced negative bias temperature instability in pmosfet”, microelectron. reliab., vol. 45, pp. 19-30, 2005. [32] m. ershov, s. saxena, s. minehane, p. clifton, m. redford, r. lindley, h. karbasi, s. graves, s. winters, “degradation dynamics, recovery, and characterization of negative bias temperature instability”, microelectron. reliab., vol. 45, pp. 99-105, 2005. [33] h. aono, e. murakami, k. okuyama, a. nishida, m. minami, y. ooji, k. kubota, “modeling of nbti saturation effect and its impact on electric field dependence of the lifetime”, microelectron. reliab., vol. 45, pp. 1109-1114, 2005. [34] d. danković, i. manić, v. davidović, s. djorić-veljković, s. golubović, n. stojadinović, “new approach in estimating the lifetime in nbt stressed p-channel power vdmosfets”, in proc. miel 2008 conference, 2008, pp. 599-602. [35] n. stojadinović, d. danković, s. djorić-veljković, v. davidović, i. manić, s. golubović, “negative bias temperature instability mechanisms in p-channel power vdmosfets”, microelectron. reliab., vol. 45, pp. 13431348, 2005. [36] d. danković, i. manić, v. davidović, s. djorić-veljković, s. golubović, n. 
stojadinović, “negative bias temperature instabilities in sequentially stressed and annealed p-channel power vdmosfets”, microelectron. reliab., vol. 47, pp. 14001405, 2007. [37] i. manić, d. danković, s. djorić-veljković, v. davidović, s. golubović, n. stojadinović, “effects of low gate bias annealing in nbt stressed p-channel power vdmosfets”, microelectron. reliab., vol. 49, pp. 10031007, 2009. http://www.sciencedirect.com/science/article/pii/s0026271404001696?_rdoc=4&_fmt=high&_origin=browse&_srch=doc-info(%23toc%235751%232005%23999549998%23531072%23fla%23display%23volume)&_docanchor=&_ct=25&_reflink=y&_zone=rslt_list_item&md5=33b3b8e1247f1f16344621033801b279 http://www.sciencedirect.com/science/article/pii/s0026271404001696?_rdoc=4&_fmt=high&_origin=browse&_srch=doc-info(%23toc%235751%232005%23999549998%23531072%23fla%23display%23volume)&_docanchor=&_ct=25&_reflink=y&_zone=rslt_list_item&md5=33b3b8e1247f1f16344621033801b279 http://www.sciencedirect.com/science/article/pii/s0026271404001775?_rdoc=12&_fmt=high&_origin=browse&_srch=doc-info(%23toc%235751%232005%23999549998%23531072%23fla%23display%23volume)&_docanchor=&_ct=25&_reflink=y&_zone=rslt_list_item&md5=95603b756d572c59e2cf386213e9325f http://www.sciencedirect.com/science/article/pii/s0026271404001775?_rdoc=12&_fmt=high&_origin=browse&_srch=doc-info(%23toc%235751%232005%23999549998%23531072%23fla%23display%23volume)&_docanchor=&_ct=25&_reflink=y&_zone=rslt_list_item&md5=95603b756d572c59e2cf386213e9325f http://www.sciencedirect.com/science/article/pii/s0026271404005220?_rdoc=8&_fmt=high&_origin=browse&_srch=doc-info(%23toc%235751%232005%23999549992%23598770%23fla%23display%23volume)&_docanchor=&_ct=33&_reflink=y&_zone=rslt_list_item&md5=421f1400fc4e003ddbefc1485a9f22b0 
instruction facta universitatis series: electronics and energetics vol. 28, no 3, september 2015, pp. 439–456 doi: 10.2298/fuee1503439m
an architecture for pervasive healthcare system based on the ip multimedia subsystem and body sensor network
vanja mišković 1, djordje babić 2
1 faculty of information technology, slobomir p university, doboj, rs, bosnia and herzegovina
2 school of computing, union university belgrade, serbia
received september 22, 2014; received in revised form december 24, 2014
corresponding author: vanja mišković, faculty of information technology, slobomir p university, 74000 doboj, rs, bosnia and herzegovina (e-mail: vanja.elcic@gmail.com)
abstract. one of the most promising applications of sensor networks is mobile health monitoring. the key concept of new generation networks (ngn) is the ip multimedia subsystem (ims). the possibility of using mobile devices as gateways between sensor networks and ims has led to the development of integrated solutions such as the one proposed in this paper. the event-based sip for instant messaging and presence leveraging extensions (simple) architecture is considered the best solution for ims based mobile health monitoring. this paper also describes the use of the session initiation protocol (sip) to communicate with the ims core, whereby data are transmitted within the body of sip messages. thus there is no need for an additional transport protocol. presence information data format (pidf) is used as the data format, and data privacy is controlled by the xml configuration access protocol (xcap), which also provides the ability to manage groups of patients.
key words: body sensor networks (bsns), context-awareness, ip multimedia subsystem (ims), pervasive healthcare.
1. introduction
the various applications of monitoring patients can be divided among the following categories: prevention, healthcare maintenance and examinations, home care [1]-[4], intelligent hospital [5], [6]; incident detection, emergency intervention [7]; and pervasive healthcare applications [8]-[13]. the subject of this paper is pervasive healthcare. it is “healthcare to anyone, anytime, and anywhere by removing locational, time and other restraints while increasing both the coverage and the quality of healthcare” [13]. nowadays, most of these applications, particularly pervasive ones, use body sensor networks (bsns). a bsn consists of a set of wearable or implanted sensors, which monitor vital signs or movements of the human body [14]. modern context-aware applications that enhance user interaction and interpersonal communication [15] use bsns as sources of raw data. this kind of intelligent application is essential for monitoring human health, where proper assessment of the health condition is extremely important. a mobile device gathers these data by polling the sensors, and it either acts as a gateway to the central server or processes the data locally. there have been a significant number of research projects in this area. in [4], a wearable computing platform called mithril has been proposed, with sensors that can continuously monitor the users' vital signs, motor functions, social interaction, sleep quality, and other health indicators. mithril is used to study human behavior and to recognize different behaviors in order to create a context-aware interface with the computer.
a project called ubiquitous monitoring environment for wearable and implantable sensors (ubimon) [8] aims to provide a continuous and undisturbed health monitoring system. the system consists of five main components: bsn nodes, local processing unit (lpu), central server (management), patient database and workstation (monitoring). wireless sensors that can be used here include an ecg sensor, an spo2 sensor (oximeter), an acceleration sensor, etc. in addition, a compact flash card was developed for a personal digital assistant (pda), from which all collected sensor data are transmitted through a wifi/gprs network for long-term storage and trend analysis. further examples of similar projects are mobihealth [9] and healthservice24 [10], with the difference that the data are not processed at the lpu but are forwarded to a remote server where the processing is done. basuma [11] is also a similar project, but it suggests the use of intelligent sensors in a mesh sensor network. in [12], an ad-hoc sensor network for transferring vital signs to health care providers has been presented as a result of the codeblue project. there, the use of an adaptive spanning-tree multi-hop routing algorithm has been explored. activepal is a commercial example of a system that is used to visualize data from ambient assisted living services (aals). it provides a visual representation of data about the activities of the patient during the day. principles of context awareness are also studied in the activepal, tunstall's adlife [2], and quietcare [3] systems. the pervasive healthcare system presented here consists of three parts: the bsn, data transmission from the bsn to observers, and intelligence for context awareness. this contribution presents a novel architecture for a pervasive healthcare system which is based on the ip multimedia subsystem (ims) for data transmission from sensors to the central unit. thus, we focus here on the data transmission part.
the architecture fully corresponds to the existing standard and supported functions of ims without requiring additional server components. the raw data are collected using a bsn which consists of the desired sensors. the collected data are transferred to the central server using session initiation protocol (sip) messages. these messages are a basic feature of the sip extensions known as sip instant messaging and presence leveraging extensions (simple). simple defines an event-based architecture based on asynchronous sip publish/notify messages. it is also possible to protect data privacy using the xml configuration access protocol (xcap). presence information data format (pidf) is chosen for data publishing. the presented architecture is effective because it is based on existing devices and standards which are supported by leading manufacturers and operators. furthermore, we have developed an application based on the adopted system architecture. the application has been successfully tested for several use cases. the outline of this paper is as follows. section 2 discusses common sensor network topologies and current sensor technologies in terms of their application to patient monitoring. section 3 is devoted to an analysis of how the simple framework, as a part of the ip multimedia subsystem (ims) architecture, can be used for data transfer in the proposed system. section 4 explains the basics of a context-aware system and describes how to enhance the given solution with context-awareness. in section 5, we illustrate several use cases of the developed application, and we give the content of the relevant messages. finally, section 6 gives the main conclusions.
2. the analysis of body sensor networks with respect to patient monitoring
the requirements that need to be satisfied by the application significantly affect the topology and sensor network technology.
designers of sensor networks face often conflicting technical challenges in meeting unique performance requirements. the ubimon project used a star topology [8]. the star topology involves a centralized architecture, where the intelligence of the system is concentrated in the central hub, which is superior to the peripheral sensors in terms of resources such as cpu, memory and batteries. the star topology is a common choice because of its simplicity when the scenario does not require direct communication between sensors. on the other hand, the basuma project is an example where the robust mesh topology is used, which relies on a distributed system with peer-to-peer connections, without a central controller [11]. as a result of the mesh topology, if one component breaks down the remaining parts of the system can still work. this approach is desirable when sensors need to communicate with each other and exchange data, perform some lightweight processing and then send the data to a gateway. of course, this topology is more complex than the star topology and requires smart sensors that consume more energy.
fig. 1 illustrations of star (left) and mesh topology (right) for bsn.
in order to facilitate direct communication between sensors and also to reduce the complexity of the mesh network, a compromise solution exists in the form of the cluster-tree topology. if a tree consists of a coordinator node and its associated children nodes, with tree height equal to one, then the cluster-tree topology simplifies to the star topology. there is always a possibility to further spread the tree and to include not only body area sensors but also ambient sensors. greater complexity is required only of those sensors that provide access to the network. the routing is simple because every node knows its children nodes and its superior node. the cluster-tree topology always has a bottleneck in the root node.
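the routing property described above can be illustrated with a minimal python sketch. the `Node` class, the device names and the helper functions are hypothetical and are not taken from the described application; the sketch only shows that each node needs to know just its superior node and its children nodes, and that a tree of height one reduces to the star topology.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    """One device in the cluster tree; class and fields are illustrative only."""
    name: str
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)

    def add_child(self, name: str) -> "Node":
        """Attach a new child and record this node as its superior node."""
        child = Node(name, parent=self)
        self.children.append(child)
        return child


def route_to_root(node: Node) -> List[str]:
    """Follow parent links upward; no routing table is needed."""
    hops = [node.name]
    while node.parent is not None:
        node = node.parent
        hops.append(node.name)
    return hops


def height(node: Node) -> int:
    """Number of edges on the longest path from this node down to a leaf."""
    if not node.children:
        return 0
    return 1 + max(height(child) for child in node.children)


# Root node: the mobile device acting as coordinator and gateway (assumed names).
phone = Node("phone")
phone.add_child("ecg")
phone.add_child("spo2")
# A second level: an ambient router with its own sensor.
room = phone.add_child("room-router")
thermo = room.add_child("room-temperature")

# A tree of height one is exactly the star topology.
star = Node("hub")
star.add_child("ecg")
star.add_child("accelerometer")
```

in this sketch every upstream route ends at the root node, which is where the bottleneck of the cluster-tree topology arises.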
this bottleneck is not critical when the root node is a mobile device that provides the connection to the ims and hosts context-aware applications, since its processing, memory and energy resources are sufficient for reliable transmission of the data collected from sensors. because of its simplicity, low energy consumption, and simple implementation using a mobile device as the coordinator node, we chose the cluster-tree topology as a convenient architecture for the bsn.
fig. 2 illustration of cluster-tree topology.
a number of sensors on the market can be used effectively for patient monitoring in the proposed system. these sensors use either the bluetooth 4.0 low energy standard or the zigbee specification. bluetooth is a good option for building a bsn because of its increasing data rates, low energy consumption and extended battery life. it supports the star topology. devices that have two protocol stacks are called gateways; they collect data from sensors and send them to the storage. for the purposes of medical applications, the health device profile (hdp) has been developed [16]. the bluetooth sig specified the ieee 11073-20601 protocol for exchanging data between the hdp layer and the ieee 11073-104xx device specification standard. ieee 11073-20601 defines a protocol for exchanging data, and ieee 11073-104xx defines the format of data, including size and encoding. hdp defines two types of devices: sink and source. the source is a small device that plays the role of a transmitter of medical data. the sink is a functionally rich device that serves as a receiver of medical data, such as a smartphone. zigbee is another candidate technology for building a bsn. zigbee provides a specification for a protocol stack. the first two layers, the physical and mac layers, are taken from the ieee 802.15.4 standard. zigbee builds on this ieee 802.15.4 foundation by adding the network and application layers to support multi-hop networks.
zigbee multi-hop routing allows wireless environmental sensors scattered all over the house to connect with the user's bsn. the zigbee standard distinguishes three types of devices: zigbee pan coordinator, routers and end devices. in this manner, the zigbee standard enables the cluster-tree topology. the zigbee health care (hc) profile takes most of its medical device attributes from the ieee 11073 device specializations family of standards defined for the communication of medical devices [17]. application scenarios that are supported by the hc profile are: continuous monitoring of the patient, patient's episodic monitoring, alarming scenario, assistance to the elderly and ill, monitoring of sports activities, etc. the proposed bsn architecture is based on the cluster-tree topology shown in fig. 2, as already explained above. the basic idea is to base the system on commercial devices that are already available on the market. as explained above, these devices use either bluetooth or zigbee technology. however, at the moment, devices supporting bluetooth are dominant on the market. the developed application, which is illustrated in section 5, is focused mainly on the data transmission between a gateway device and registered watchers using ims. therefore, the application simulates the bsn and the collection of sensor data.
3. architecture for data transmission
as explained above, we decided to use the cluster-tree topology, which has one root node that communicates with the bsn. the root node can be a typical mobile device (smartphone, tablet) operating under android, ios, windows or symbian os. in this contribution, we present the system architecture and principles of a pervasive health monitoring application which we have developed for symbian os using the java me environment.
however, the system architecture and other principles are applicable to android, windows and ios as well. a typical mobile device can provide the connection to the ims and host an application that collects data from sensors according to the patient's health profile and sends them to the server or watcher. our intention is to use the existing set of data protocols supported by most operators. therefore, ims serves as the architectural framework for data transfer from the root node to the server and watcher. as defined in [18], ims is a global, access-independent and standard-based ip connectivity and service control architecture that enables various types of multimedia services for end-users using common internet-based protocols. it is based on the session initiation protocol (sip) [19]. ims primarily deals with issues of heterogeneity of access technologies, addressing schemes, authentication, authorization and accounting (aaa), qos, security and managing device mobility. with these characteristics, ims is a highly scalable framework for carrying the information derived from body sensor networks. the simple [20] specification provides a simple and complete event-based architecture without the need for additional application server modules. the basic idea presented here is that sensor data are sent as presence data, and the simple architecture transports presence data in the body of sip notify/publish messages.
3.1. complete architecture of the proposed network
figure 3 shows the complete architecture of the proposed network for the transport of data collected from sensors. the network architecture is very similar to an ims network, as we rely on its services. here, the role of the s-cscf is to carry out the process of authentication, authorization and accounting. it is also responsible for evaluating the initial filter criteria in order to route the sip requests to the specific application server, in our case to the presence server.
the presence service is responsible for managing presence information and publish/subscribe/notify sip requests. any information which belongs to the presence server is stored on the xcap server as xml documents. by using xcap it is possible to change, add and delete elements (nodes) in these xml documents via http methods, as explained below. each type of document has its own folder with xml documents structured in tree form with the following leaves:
presence-rules: they contain the access rights to information which is posted by some presentity; in other words, those are rules which state who has and who does not have permission to access presentity data.
group list management: it contains lists of friends who belong to the same group.
resource list server: it represents references to the presence data from the same group of friends.
from fig. 3, it is possible to see that the sip protocol is used for the connection between the mobile application and the p-cscf. besides sip, there is also a direct xcap connection between the mobile application and the xcap server [21]. however, this connection is established only if the user is successfully registered to the ims network. the generic bootstrapping architecture enables the user to perform the authorization process and to establish a secure connection with any application server in the ims network. further in this contribution, we explain in detail how ims entities and services are used for monitoring patients.
3.2. descriptions of the ims entities used in the proposed system
in rough outline, the ims architecture has three main layers [18]. the first layer is the user plane, also called the transport layer, which consists of the radio access network. in the middle of the architecture, there is the control layer. finally, at the top of the architecture is the service layer, as illustrated in fig. 4. with the service layer separated from the control layer, the service provider may be a third party. other components depicted in fig. 4 are described below.
call session control function (cscf) is a sip server and the basic entity of the ims architecture. most of the sip signaling is processed through the cscf. the functionalities that the cscf offers are divided into three clusters: p-cscf (proxy-cscf), i-cscf (interrogating-cscf) and s-cscf (serving-cscf). p-cscf is the first point of access to the ims architecture for each user. all sip signaling traffic to and from the user equipment (ue) goes through the p-cscf. i-cscf is included in the sip registration process, as it assigns the corresponding s-cscf server to the user. s-cscf is the core of the ims architecture and provides the logic to invoke and manage the application servers and, if necessary, to deliver the required services. home subscriber server (hss) is a secure database. it stores subscriber profiles and manages user identities and status (presence as well as location). s-cscf, i-cscf and application servers have the right to access this information. the application server is a sip entity which hosts and implements services. it can have multiple modes, depending on the type of service that is implemented.
fig. 3 ims architecture of the proposed network.
the ims system users and terminals are uniquely identified. users can have multiple profiles to identify the services they want to use. these identities can be public or private. ims uses the authentication and key agreement (aka) protocol for authentication, which is based on http digest access authentication as defined in the ietf [rfc 2617] document. aka is a challenge-response based mechanism that uses symmetric cryptography, which enables both sides to authenticate and authorize each other.
3.3. subscriber registration process
the ims registration process begins when the user terminal has to find a p-cscf address, i.e., the ims network access point. a dhcp/dns mechanism is commonly used for this purpose.
the user sends a sip register message to the corresponding p-cscf. the p-cscf does not know which s-cscf server is responsible for serving the user. therefore, the p-cscf asks the i-cscf for the address of the corresponding s-cscf. the i-cscf gets this information from the hss, the unified database of user profiles, by using the diameter protocol [22], and returns the information to the p-cscf. after that, the p-cscf forwards the user's sip register message to the discovered s-cscf server. the s-cscf takes the user profile from the hss database using the diameter protocol. then the s-cscf answers the user with a challenge carried in a sip unauthorized message with status code 401. the user authenticates the network and sends a response to the challenge of the s-cscf server. the s-cscf authenticates the user and retrieves data about the user's services from the hss. at the end of this process, the s-cscf sends a confirmation of successful registration to the user in a sip 200 ok message. the p-cscf and the user terminal can agree on compression mechanisms, as well as on using the ipsec protocol for greater security of their communication.
3.4. session initiation protocol for instant messaging and presence leveraging extensions (simple)
sip's main purpose is to establish, modify and end multimedia sessions [19]. the standard implementation of sip defines six different methods of sip requests, which are shown in table 1. many of the sip response codes are inspired by the http protocol. sip codes are divided into six classes, identified by the first digit of the code, as shown in table 2.
table 1 sip request methods
method      description
invite      session setup
ack         acknowledgment of final response
bye         session termination
cancel      pending session cancellation
register    registration of a user's uri
options     query of options and capabilities
recently, sip has been extended to support the instant messaging and presence services of ims through the sip for instant messaging and presence leveraging extensions (simple) standard [20]. simple makes it easy to expand our system and to integrate it with other presence-aware services and applications in order to offer a richer user experience. however, the simple framework cannot offer audio or video services. these types of services require additional application servers. the presence service is built on top of the event-based sip architecture. it enables subscribers to know who is available, busy or free of obligations, what the capabilities of their terminals are, and similar status information. the presence service is one of the standard ims services.
table 2 standardized sip response codes
class    description
1xx      provisional or informational
2xx      success
3xx      redirection
4xx      client error
5xx      server error
6xx      global failure
the presence architecture defines several possible roles/entities for each user, as shown in fig. 5. an entity that provides presence information of subscribers is called a presentity, which is short for “presence entity”. the presentity can have several devices connected to it; these devices are identified as presence user agents (puas). the presentity communicates directly with the presence agents (pas). the pa manages the state of various subscribers' events. the pa also collects information from the pua through publish transactions, creating a model of the current state of the presentity. it also informs sip entities called watchers about new publications through sip notify transactions. the observer (also called the watcher) is an entity that requests information about the presentity from the pa.
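the roles described above can be sketched in a few lines of python. this is a simplified in-memory model, not the sip signaling itself: the `PresenceAgent` class, its method names and the example uris are assumptions made for illustration, and the presence document is reduced to a plain string.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# A watcher is modelled as a callback taking (presentity URI, presence document).
Notify = Callable[[str, str], None]


class PresenceAgent:
    """Toy presence agent: stores the latest state and notifies watchers."""

    def __init__(self) -> None:
        self._state: Dict[str, str] = {}
        self._watchers: Dict[str, List[Notify]] = defaultdict(list)

    def subscribe(self, presentity: str, notify: Notify) -> None:
        """Register a watcher; send the current state at once if one exists."""
        self._watchers[presentity].append(notify)
        if presentity in self._state:
            notify(presentity, self._state[presentity])

    def publish(self, presentity: str, document: str) -> None:
        """Called by a presence user agent; every watcher is notified."""
        self._state[presentity] = document
        for notify in self._watchers[presentity]:
            notify(presentity, document)


received = []
pa = PresenceAgent()
pa.subscribe("sip:patient@example.org",
             lambda who, doc: received.append((who, doc)))
pa.publish("sip:patient@example.org", "heart rate: 72 bpm")
pa.publish("sip:other@example.org", "heart rate: 80 bpm")  # no watcher yet
```

after these calls `received` holds only the publication of the watched presentity; a watcher that subscribes later to `sip:other@example.org` would receive the stored state immediately.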
this process is done by using subscribe requests. fig. 4 a simple overview of the ims architecture an architecture for pervasive healthcare system based on the ip multimedia subsystem... 447 all subscribe/notify transactions, shown in fig 6, have sip event header field that identifies a particular event whether it is a request for information or notification [rfc 3856]. this is the user registration scenario used in our application. it also defines „presence‟ event package identified by the value of this header field. subscribe may be a short-time operation when it is used to take the current state of presentity information, or it can last for a longer time period to enable asynchronous receiving of notification whenever the published data is changed. therefore, the watcher must periodically renew subscribe before the validity period expires. current information and duration of subscribe request is sent by the pa to the watcher by using the notify request. fig. 6 the use of subscribe and notify sip messages. fig. 5 entities of presence architecture. 448 v. mišković, d. babić pua sends publish request in order to publish new data, as show in fig 7. the message should include header field identified as expires that is set to the duration of request. the message also contains new pua‟s presence data whose data format is defined in content-type header field. every new successful publish request obtains a unique identifier from pa, in the form of entity-tag value specified in the sip-etag header field. these tags are unique for each user agent. this means that two instances of published presence information can have the same entity tag until they do not belong to the same user agent. if the publication is authorized and successful, the expires header field of answer will tell about the validity of the publication and also the sip-etag will also be specified. 3.5. 
presence information data format

the data format used to transfer the presence data is called presence information data format (pidf), and its extension is called rich presence extension for pidf (rpid) [23], [24]. pidf defines a basic format for representing presence information from a presentity. this format defines a textual note, an indication of availability (open or closed) and a uniform resource identifier (uri) for communication. rpid adds richer presence information about persons, services, and devices. examples of data supported by the rpid format are: what the person is doing, a grouping identifier for a service, when a service or device was last used, the type of place a person is in, the person's mood, the time zone he/she is located in, the type of service he/she offers, an icon reflecting the presentity's status, etc. figure 8 illustrates the watcher's overview of the data collected about a patient in our application (note that the sensors are simulated by a simple data generator, so the data shown in fig. 8 cannot be interpreted as medical data).

fig. 7 publishing data collected from sensors.

note that a pidf document and its extension can be used in two different contexts, namely, by the presentity to publish its presence status and by the presence server to notify a certain set of watchers. the presence server may compose, translate, or filter the published presence data before delivering customized presence information to the watcher. for example, it may perform some of the following operations: merge presence information from multiple presence user agents; remove whole elements; translate values in elements; or remove information from elements. mechanisms that filter calls and other communications to the presentity can also subscribe to the presence information in the same way as a regular watcher.
in this way, they can generate automated rules, such as scripts, that govern the actual communications behavior of the presentity. the details are described in the data model document. since rpid is a pidf xml document, it also uses the content type application/pidf+xml.

4. context-awareness

the sensor data allow us to create a complete picture of the user's current situation. the bsn and the implemented special healthcare profile gather data, which are forwarded to the gateway node. with the simple architecture, the patient's data pass from the gateway into the presence server. our architecture addresses the problem of safe access to these data, because it uses secure protocols such as xcap. the xcap server supports grouping of patients and doctors in corresponding lists. in this way, the health care providers have access to the health conditions of all patients in their lists. at the same time, the patients can see the list of their care providers together with their access rights. both the patient and the doctor use a mobile application, so they can access the service anywhere and at any time.

fig. 8 presented data collected from sensors at watcher application.

however, the data in the presence server can give us a true picture of the condition of the patients only with proper reasoning. for example, a heart rate of about 75 beats per minute is normal. when the heart rate is greater than 100 beats per minute, it could be said that the heart is functioning abnormally (tachycardia). this information alone is not sufficient to obtain a complete picture and to conclude whether the situation is alarming or not. it is necessary to have information about the current activity of the person wearing the sensors, because during running and other hard physical activities the heart rate is much higher. an accelerometer can give us information about the acceleration of bodies in motion.
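the reasoning described above can be sketched in a few lines of python. this is only a minimal illustration, not the context-aware subsystem itself: the limit of 100 beats per minute comes from the example above, while the `Reading` type, the `assess` function, the result labels, and the 1.5 g activity threshold are hypothetical names and values chosen for the sketch.

```python
from dataclasses import dataclass

# Limit taken from the example in the text (beats per minute).
HEART_RATE_LIMIT_BPM = 100
# Hypothetical threshold (in g) separating rest from hard physical activity.
ACTIVITY_THRESHOLD_G = 1.5


@dataclass
class Reading:
    heart_rate_bpm: float
    acceleration_g: float  # magnitude reported by the accelerometer


def assess(reading: Reading) -> str:
    """Combine heart rate with activity context before raising an alarm."""
    if reading.heart_rate_bpm <= HEART_RATE_LIMIT_BPM:
        return "normal"
    if reading.acceleration_g >= ACTIVITY_THRESHOLD_G:
        # High heart rate, but the wearer is moving: explained by activity.
        return "elevated-during-activity"
    # High heart rate while the wearer is at rest.
    return "alarm"
```

the point of the sketch is that the same heart rate leads to different conclusions depending on the accelerometer reading, which is why a single sensor value is not enough for a decision.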
as we can see, there is a need for a context-aware subsystem which will analyze the collected data and make decisions. the common architecture of a context-aware system is shown in fig. 9 [25]. in general, the sensor layer detects and registers new sources of context information, which can come from physical, logical or software sensors. for example, gps is a physical sensor, current browser activity is reported by a software sensor, and a combination of sensor information obtained by inference is a logical one. the raw data retrieval layer provides interfaces that return sensor information through abstract functions. pre-processing/inferencing refers to the quantification and extraction of raw data and the handling of conflicting, ambiguous and indefinite information, which raises the level of complexity. this pre-processing/inferencing can be done partly before the data are stored on the central server, or after it. the storage and management of data are usually separated from the application layer. the storage offers a public interface for synchronous or asynchronous access to information.

fig. 9 layers of common context-aware system.

we do not need to make any changes to the proposed architecture of the pervasive healthcare system if the pre-processing/inferencing or learning algorithms are the responsibility of the application provider. an application server would be added in the service layer, and it would be able to access the presence data. the key point is that context data are stored for a specific application provider, and the application provider determines the rules to be respected in the process of reasoning over the set of context data. also, mobile phones today have enough processing and memory resources that simple data pre-processing can be carried out on them before the data are sent to the system. the implementation of context awareness in our system is planned for the future.
5 application use cases

as mentioned above, we have developed a mobile application based on the proposed system architecture. the system has three main parts: the bsn with cluster tree topology, the data transmission part based on ims, and context-aware reasoning. the data transmission part is fully implemented in the proposed mobile application. meanwhile, the bsn is simulated and the sensor data are generated. the implementation of context awareness is left for the second stage of our research project. the mobile application has the following use cases:
1. the registration; subscription/unsubscription to the list of the sip watchers is implicitly added after the registration process, because every patient wants to have access to the list of watchers.
2. the sensors' data are simulated. the presence data are published automatically.
3. subscription/unsubscription to the list of the sip presentities (patients) and review of the permitted and currently reported data;
4. determining access rights of the watchers (doctors, friends, family);
5. review of personal information that is currently published, with the ability to manually publish/unpublish it;
6. the deregistration.
each patient can also be a watcher of his friends/relatives. in the following, the use cases are described in words, with code fragments, system sequence diagrams, and message examples. section 3.2 explains the ims entities used in the proposed system: ims core, presence server, xcap server. any information which belongs to the presence server is stored on the xcap server as an xml document. the presence server gets data directly from xcap; the two are integrated and together called the presence group manager (pgm). the application on a presence user agent (pua) needs to open two ports for two separate connections with the system. the first one is for common synchronous sip communication, such as register/200 ok, subscribe/200 ok, publish/200 ok.
the second one is for asynchronous sip simple communication based on events. examples are: presentity (un)publish / watcher notify, watcher (un)subscribe / presentity notify. xcap is based on http and can also run over secured https. this http connection is a direct connection between the pua application and the xcap server and allows manipulation of presence-rules documents. after successful regular sip registration, the user is automatically subscribed to the watcher list in order to check the list of assigned watchers and to change their access rights. the process is shown in fig. 10. if the subscription is approved, the pgm sends a sip notify message with the full watcher list to the asynchronous port, and a sip 200 ok message to the synchronous port as acknowledgment of the sip subscribe message. after receiving the sip notify, the application sends a sip 200 ok from the port chosen as asynchronous, acknowledging that this port will be used for receiving information about watcher updates.

in use cases two and five, the same sip messages are used for sending data, regardless of whether the transmission is automatic or manual. an example of a sip publish message is given below. this message is sent by the ua from its sip port to the ims core, and further from the ims core to the pgm. in the opposite direction, a sip 200 ok message is sent as acknowledgment that the data have been received successfully. the data format is application/pidf+xml, as shown by the content-type header field. other elements describing the service, devices or user are defined within rpid. important header fields of the sip publish message are shown in the following example:

    publish sip:alice@ericsson.com sip/2.0
    max-forwards: 70
    event: presence
    cseq: 312 publish
    expires: 3600
    content-length: 645
    ...
    content-type: application/pidf+xml
    ...

the following is the sip publish message body with sensor data:

    ids:1;values:5,1,4;
    2014-06-05t23:02:43.59z

fig. 10 sip registration and subscription to the presence watcher list.
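as a rough illustration of the publish message described in this section, the following python sketch assembles the header fields listed above around a pidf body. several points are assumptions made for the sketch and are not taken from the application: carrying the simulated sensor string in a pidf note element, the tuple id `sensors`, and the helper names `build_pidf` and `build_publish`. the header fields elided by '...' in the example (such as via, from, to and call-id) are left out here as well, so the result is not a complete sip request.

```python
import xml.etree.ElementTree as ET

PIDF_NS = "urn:ietf:params:xml:ns:pidf"  # PIDF namespace (RFC 3863)


def build_pidf(entity: str, note: str, timestamp: str) -> str:
    """Build a minimal PIDF document with one tuple (illustrative layout)."""
    ET.register_namespace("", PIDF_NS)  # serialize PIDF as the default namespace
    presence = ET.Element(f"{{{PIDF_NS}}}presence", {"entity": entity})
    tup = ET.SubElement(presence, f"{{{PIDF_NS}}}tuple", {"id": "sensors"})
    status = ET.SubElement(tup, f"{{{PIDF_NS}}}status")
    ET.SubElement(status, f"{{{PIDF_NS}}}basic").text = "open"
    # Assumption: the simulated sensor values travel in the textual note.
    ET.SubElement(tup, f"{{{PIDF_NS}}}note").text = note
    ET.SubElement(tup, f"{{{PIDF_NS}}}timestamp").text = timestamp
    return ET.tostring(presence, encoding="unicode")


def build_publish(user: str, cseq: int, expires: int, body: str) -> str:
    """Assemble only the header fields shown in the text, plus the body."""
    payload = body.encode("utf-8")
    headers = [
        f"PUBLISH sip:{user} SIP/2.0",
        "Max-Forwards: 70",
        "Event: presence",
        f"CSeq: {cseq} PUBLISH",
        f"Expires: {expires}",
        "Content-Type: application/pidf+xml",
        f"Content-Length: {len(payload)}",  # length in bytes, not characters
    ]
    return "\r\n".join(headers) + "\r\n\r\n" + body


if __name__ == "__main__":
    pidf = build_pidf("sip:alice@ericsson.com", "ids:1;values:5,1,4;",
                      "2014-06-05T23:02:43.59Z")
    print(build_publish("alice@ericsson.com", 312, 3600, pidf))
```

computing content-length from the encoded body rather than hard-coding it keeps the header consistent when the sensor values change.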
sip unpublish is a sip publish message with the following differences in header fields: content-length is 0, expires is set to 0 and sip-if-match is equal to the sip-etag of the published data. the following message shows important header fields of the sip (un)publish message:

    publish sip:alice@ericsson.com sip/2.0
    max-forwards: 70
    event: presence
    cseq: 317 publish
    expires: 0
    content-length: 0
    sip-if-match: 0005
    ...

in use case number three, the subscription to the sip presentity list has the same sequence of messages as the subscription to the watcher list, but the sip subscribe message has different header fields, such as:

    event: presence
    supported: eventlist
    accept: multipart/related, application/rlmi+xml, application/cpim-pidf+xml, application/pidf+xml

fig. 11 events presence.winfo and presence

all changes of lists are done through the xcap protocol, which is based on http and its methods: delete, get, put. the http connection is established directly between the pua and the xcap server. the system sequence diagram in fig. 11 illustrates the presence.winfo and presence events. xcap is also used by the presentity. it is possible to add (put) or remove (delete) members of the lists, and the presentity can change the access rights of the watchers. after subscribing, the watcher waits for confirmation. the presentity can block, polite-block or allow the watcher. the action block tells the server to reject the subscription, placing it in the 'terminated' state. the action polite-block tells the server to place the subscription into the 'active' state, and to produce a presence document that indicates that the presentity is unavailable. the action allow tells the server to place the subscription into the 'active' state. only if the watcher is allowed by the presentity will the published presence data trigger a notify message.
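the three authorization actions above can be summarized as a small lookup, sketched here in python. the state names and the meaning of each action follow the text; the `Action` enumeration, the `OUTCOMES` table and the `handle_subscription` function are hypothetical names for illustration and do not reproduce the presence server's implementation.

```python
from enum import Enum


class Action(Enum):
    BLOCK = "block"
    POLITE_BLOCK = "polite-block"
    ALLOW = "allow"


# For each action: (subscription state, whether published data reach the watcher).
OUTCOMES = {
    Action.BLOCK: ("terminated", False),     # subscription rejected
    Action.POLITE_BLOCK: ("active", False),  # watcher sees "unavailable" instead
    Action.ALLOW: ("active", True),          # published data trigger notify
}


def handle_subscription(action: Action):
    """Return the subscription state and whether real presence data are sent."""
    return OUTCOMES[action]
```

keeping the state and the data-delivery flag separate makes the difference between polite-block and allow explicit: both leave the subscription active, but only allow exposes the published presence data.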
6 conclusions

a network architecture based on ims has been proposed, and we have developed an application for data transmission between patients and watchers. this solution can serve many purposes and represents a platform for the development of a wide range of applications. we have shown the main elements and use cases of the mobile application developed according to the proposed model. the next phase of our research will result in a mobile application for patient monitoring on different platforms. we will also study applicable context-aware algorithms in order to build a complete system.

acknowledgement: this work has been partially supported by the serbian ministry of education and science under technology development project tr32023 - "performance optimization of energy-efficient computer and communications systems."

appendix a
acronym   definition
aaa       authentication authorization and accounting
aals      ambient assisted living services
ack       acknowledgement
aka       authentication and key agreement
basuma    body area system for ubiquitous multimedia applications
bsn       body sensor network
cpu       central processing unit
cscf      call session control function
ecg       electrocardiogram
gprs      general packet radio service
gps       global positioning system
hc        health care
hdp       health device profile
hss       home subscriber server
http      hypertext transfer protocol
i-cscf    interrogating-cscf
ieee      institute of electrical and electronics engineers
ietf      internet engineering task force
ims       ip multimedia subsystem
ip        internet protocol
ipsec     internet protocol security
lpu       local processing unit
mac       media access control
me        micro edition (java)
mit       massachusetts institute of technology
ngn       next generation network
os        operating system
pa        presence agent
pan       personal area network
p-cscf    proxy-cscf
pda       personal digital assistant
pidf      presence information data format
pgm       presence group manager
pua       presence user agent
qos       quality of service
rfc       request for comments
rpid      rich presence extension for pidf
s-cscf    serving-cscf
sig       special interest group
simple    sip instant messaging and presence leveraging extensions
sip       session initiation protocol
spo2      peripheral capillary oxygen saturation
ubimon    ubiquitous monitoring environment for wearable and implantable sensors
ue        user equipment
uri       uniform resource identifier
wi-fi     wireless fidelity
xcap      xml configuration access protocol
xml       extensible markup language

references
[1] j. a. fraile, j. bajo and j. m. corchado, "context-aware and home care: improving the quality of life for patients living at home", actas de los talleres de las jornadas de ingeniería del software y bases de datos, vol. 3, no. 6, 2009.
[2] tunstall, tunstall healthcare (uk) ltd, [online]. available: http://www.tunstall.co.uk (current february 2012)
[3] quietcare, intel-ge care innovations™, [online]. available: http://www.careinnovations.com/products/quietcare/default.aspx (current july 2011)
[4] project mithril at mit university, [online]. available: http://www.media.mit.edu/wearables/mithril/index.html (current october 2004)
[5] j. e. bardram, "applications of context-aware computing in hospital work: examples and design principles", in proc. of the acm symposium on applied computing, new york, 2004, pp. 1574-1579.
[6] s. mitchell, m. d. spiteri, j. bates, g. coulouris, "context-aware multimedia computing in the intelligent hospital", in proc. of the 9th workshop on acm sigops european workshop, pp. 13-18, 2000.
[7] p. hu, j. indulska, r. robinson, "an autonomic context management system for pervasive computing", in proc. of the 6th annual ieee international conference on pervasive computing and communications, march 2008.
[8] project ubimon at imperial college london, [online]. available: http://www.doc.ic.ac.uk/vip/ubimon/home/index.html (current january 2005)
[9] mobihealth funded by the european commission, [online]. available: http://www.mobihealth.org (current may 2004)
[10] healthservice24, ericsson enterprise ab - project coordinator, [online]. available: http://www.healthservice24.com (current december 2006)
[11] basuma funded by the german federal ministry for economics and labor (bmwa), [online]. available: http://www.basuma.de (current january 2006)
[12] project codeblue at harvard university, [online]. available: http://fiji.eecs.harvard.edu/codeblue (current march 2007)
[13] u. varshney, "pervasive healthcare and wireless health monitoring", journal: mobile networks and applications, springer-verlag, new york, vol. 12, pp. 113-127, march 2007.
[14] m. carmen domingo, "a context-aware service architecture for the integration of body sensor networks and social networks through the ip multimedia subsystem", ieee communications magazine, vol. 50, pp. 102-108, january 2011.
[15] f. toutain, a. bouabdallah, r. zemek, c. daloz, "interpersonal context-aware communication services", ieee communications magazine, vol. 50, pp. 68-74, january 2011.
[16] r. latuske, ars software gmbh, "bluetooth health device profile (hdp)", münchen, 2009. [online]. available: http://www.ars2000.com/bluetooth_hdp.pdf (current january 2012)
[17] zigbee alliance, "zigbee wireless sensor applications for health, wellness and fitness", march 2009. [online].
available: https://docs.zigbee.org/zigbee-docs/dcn/09-4962.pdf (current february 2012)
[18] m. poikselkä, g. mayer, "the ims: ip multimedia concepts and services (3rd ed.)", john wiley & sons, 2009.
[19] j. rosenberg, h. schulzrinne, g. camarillo, a. johnston, j. peterson, m. handley, e. schooler, "session initiation protocol (sip)", rfc 3261, [online]. available: http://www.ietf.org/rfc/rfc3261.txt (current june 2002)
[20] m. day, j. rosenberg, h. sugano, "a model for presence and instant messaging", rfc 2778, [online]. available: http://www.ietf.org/rfc/rfc2778.txt (current february 2000)
[21] j. rosenberg, "the extensible markup language (xml) configuration access protocol (xcap)", rfc 4825, [online]. available: http://www.rfc-editor.org/rfc/rfc4825.txt (current may 2007)
[22] p. calhoun, j. loughney, e. guttman, g. zorn, j. arkko, "diameter base protocol", rfc 3588, [online]. available: http://tools.ietf.org/html/rfc3588 (current september 2003)
[23] h. sugano, s. fujimoto, g. klyne, a. bateman, w. carr, j. peterson, "presence information data format (pidf)", rfc 3863, [online]. available: http://www.faqs.org/rfcs/rfc3863.html (current august 2004)
[24] h. schulzrinne, v. gurbani, p. kyzivat, j. rosenberg, "rpid: rich presence extensions to the presence information data format (pidf)", rfc 4480, [online]. available: http://tools.ietf.org/html/rfc4480 (current july 2006)
[25] p. gutheim, "an ontology-based context inference service for mobile applications in next-generation network", ieee communications magazine, vol. 50, pp. 60-66, january 2011.

instruction facta universitatis series: electronics and energetics vol. 29, no 3, september 2016, pp. 437-450 doi: 10.2298/fuee1603437l
a hadoop-enabled sensor-oriented information system for knowledge discovery about target-of-interest
yu liang, chao wu
department of computer science and engineering, university of tennessee at chattanooga, usa
abstract.
to obtain real-time situational awareness about the specific behavior of targets-of-interest from large-scale sensory data sets, this paper presents a generic sensor-oriented information system based on the hadoop ecosystem, denoted sois-hadoop for simplicity. robotic heterogeneous sensor nodes connected by a wireless sensor network are used to track things-of-interest. the hadoop ecosystem enables highly scalable and fault-tolerant acquisition, fusion, storage, retrieval, and processing of sensory data. in addition, sois-hadoop employs temporally and spatially dependent mathematical models to formulate the expected behavior of targets-of-interest, against which the observed behavior of targets can be analyzed and evaluated. the mechanism of sois-hadoop is presented and validated in detail using two real-world sensor-oriented information processing and analysis problems as examples.
key words: sensor-oriented information system, hadoop ecosystem, target of interest, wireless sensor network, mathematical model.

1. introduction

an information system is generally a computer-centric system that integrates data acquisition, processing and analysis, storage and communication, interpretation, knowledge discovery, etc. [1-4]. the sensor-oriented information systems addressed in this work aim to obtain a panoramic, timely, trusted understanding of the observed behavior of targets-of-interest (toi) [3][5-10] by exploiting networked sensor assets, which consist of a large number of autonomous, heterogeneous, and multi-layer sensor nodes. deriving real-time situational awareness about toi from sensory data of high volume, high generation velocity and wide variety is extremely computationally intensive, labor-intensive and highly unreliable. for this reason, the addressed sensor-oriented information system is built on the hadoop ecosystem [8][11], a commonly used big-data platform.
received july 1, 2015; received in revised form november 16, 2015
corresponding author: yu liang, department of computer science and engineering, university of tennessee at chattanooga, tennessee, 37403, usa (e-mail: yu-liang@utc.edu)

this paper proposes a generic sensor-oriented information system based on a hadoop cluster (sois-hadoop) [11-14] to monitor and analyze the specific behavior of a target-of-interest (toi) according to persistent surveillance sensory data. the addressed sois-hadoop has the following features: (1) employing temporally and spatially dependent mathematical models [3][9][15][16] to formulate the expected behavior of targets-of-interest, based on which the observed behavior of toi can be evaluated or the future behavior of toi can be predicted; (2) tracking toi by deploying networked autonomous sensor nodes, which are tuned using collective control and self-optimization to achieve optimal, reliable, and energy-efficient observations; (3) using the hadoop ecosystem to handle the acquisition, fusion, storage, management and mining of large-scale real-time/historical sensory data. the major topics of this paper include: (1) a generic hardware and software infrastructure, and the implementation flowchart of sois-hadoop; (2) the application of sois-hadoop to real-world sensor-enabled engineering problems such as predictive analysis of the aggregation of carp, and anomaly detection in traffic flow; (3) mathematical modeling of the expected behavior of toi, where both microscopic and macroscopic strategies are addressed.
the remainder of the paper is organized as follows: section 2 discusses the hardware infrastructure of the sois-hadoop system; section 3 discusses the software framework of the sois-hadoop system; section 4 briefly introduces the flowchart of the system; section 5 uses two representative sensor-oriented information analysis problems as benchmarks to demonstrate the mechanism and implementation of sois-hadoop; section 6 introduces the macro-cell strategy, a divide-and-conquer method, to handle large-scale application problems with high scalability; section 7 summarizes the effort.

2. hardware architecture of hadoop xen clusters

the overarching goal of this interdisciplinary project is to enable robust intelligent systems which can operate autonomously for long periods of time. this ability requires that all system components are seamlessly integrated. figure 1 shows the hardware configuration of the hadoop-based sensory data processing and analysis system. the proposed system consists of three hardware modules: (1) data acquisition and pre-processing based on a mobile computing platform (e.g., iphone, laptop, etc.). sensory data may be acquired by multiple sensors at the same time. pre-processing means translating stream sensory data such as video data into semi-structured data such as xml (www.w3.org). (2) data storage and management server based on a hadoop cluster, which is built using multiple inter-connected xen virtual machines. (3) data analysis and visualization client, which mainly simulates the temporally and spatially dependent mathematical model.
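the pre-processing step in module (1), which turns stream sensory data into semi-structured xml, can be illustrated with a short python sketch. the element and attribute names (`observation`, `position`, `velocity`) and the function `observation_to_xml` are hypothetical choices for this illustration; the paper does not specify the xml schema that sois-hadoop uses.

```python
import xml.etree.ElementTree as ET


def observation_to_xml(sensor_id: str, timestamp: str,
                       x: float, y: float, vx: float, vy: float) -> str:
    """Serialize one tracked-target observation as a small XML record.

    The schema is illustrative only; the actual format is not given in the text.
    """
    obs = ET.Element("observation", {"sensor": sensor_id, "time": timestamp})
    ET.SubElement(obs, "position", {"x": str(x), "y": str(y)})
    ET.SubElement(obs, "velocity", {"vx": str(vx), "vy": str(vy)})
    return ET.tostring(obs, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical sample values for a single observation.
    print(observation_to_xml("node-07", "2015-07-01T12:00:00Z",
                             12.5, 3.0, -0.4, 1.1))
```

records of this kind are self-describing, so observations from heterogeneous sensors can be stored without first agreeing on a fixed table layout.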
to fully exploit the sensor assets, the following issues need to be investigated: (1) the design and development of provably correct, decentralized algorithms for finding and localizing multiple mobile tois using networked sensor nodes [17], and for synchronizing sensor streams; (2) achieving longevity and energy efficiency by developing energy-efficient motion planning algorithms, studying system-level energy trade-offs and optimization, and improving system lifetime through energy harvesting; (3) mobility- and energy-aware communication protocols for robotic networks; and (4) data analysis algorithms to discover tois' mobility patterns at multiple scales and how these patterns correlate with environmental parameters.

fig. 1 hardware configuration of sois-hadoop (dashed lines indicate internet links).

as illustrated in figure 1, a multitude of collaborative mobile sensor nodes are deployed to detect, discriminate, localize, and track targets of interest (tois). each sensor node is managed by a raspberry pi single-board computer (www.raspberrypi.org), which is equipped with the robot operating system (www.ros.org), a 3g/4g cdma cellular gateway that provides the sensor node with internet connectivity through a radio transceiver, a solar panel, a gps-aided inertial navigation system [18], and an electro-optical sensor used to capture the movement of tois. the robot operating system provides standard operating system services such as low-level device control, implementation of commonly used functionality, inter-process message passing, and package management.

fig. 2 (a) monitoring invasive fish with a robotic boat (univ.
of minnesota); (b)-(c) tracking and analysis of the movement of vehicles according to low-resolution video data acquired by a uav flying over the city

figures 2(a)-(c) illustrate mobile sensor nodes [19] deployed to track carp in lakes and to monitor traffic status over a city, respectively. long-term operation of sensor nodes necessitates a long-term energy source. in this work, the energy consumption [20] of the wireless sensor network [21] is minimized from the point of view of mobility, communication, and solar harvesting. first, accurate trajectory estimation of the moving toi greatly optimizes the motion of the sensor nodes. second, wireless communication is the largest energy consumer on the robotic platform; the ieee 802.15.4 wireless personal area network protocol (standards.ieee.org) is used in our work, and the topology of the wireless sensor network, the synchronization technique, the wsn routing algorithm, the transmission protocol, and long path-loss models [22] may all determine the energy efficiency of the wsn. third, the robotic vehicle for toi tracking is equipped with a solar panel coupled to a solar charger and a deep-cycle rechargeable battery.

[fig. 1 labels: sensor network for data acquisition and pre-processing; cellular gateway; hadoop cluster (fusion, storage, management) on xen virtual machines; data analysis and visualization client (hosted by a multi-processor high-performance parallel computer system)]

a hadoop ecosystem built on a xen linux virtualization (www.xenproject.org) cluster provides a highly scalable and fault-tolerant platform for acquiring, fusing, storing and analyzing huge amounts of sensory data in a distributed computing environment. the data analysis and visualization client mainly handles the simulation of the temporally and spatially dependent mathematical model, the most computationally intensive operation in the implementation of sois-hadoop.
in this work, the data analysis and visualization client is hosted by a multi-processor and multi-core parallel computing machine. the message passing interface (mpi) parallel programming paradigm is used to implement the simulation of the temporally and spatially dependent mathematical model [8][23].

3. software infrastructure of sois-hadoop

figure 3 shows the software infrastructure of sois-hadoop. because the addressed information system is driven by huge-scale heterogeneous sensory data [3], highly scalable, robust, and relatively accurate computational methods are investigated in the implementation of sois-hadoop. the accessory information system (e.g., geographic information, weather conditions, and historical data, which are in xml format in our implementation) and persistent surveillance sensory data constitute the two major inputs of sois-hadoop. the hadoop ecosystem [14] is employed as the main engine of sois-hadoop: flume (flume.apache.org) acquires, aggregates, pre-processes, and then forwards the sensory data to the hadoop distributed file system (hdfs); sqoop (sqoop.apache.org) provides an interface between the accessory information and hdfs; built on top of hdfs, hbase (hbase.apache.org) provides real-time, random access to the data; hbase provides a nosql database [7] in key-value format, which supports highly scalable, concurrent, and fault-tolerant storage of the structured or semi-structured data that appear in sois-hadoop because it does not need to categorize and parse the sensory data into a fixed format; hive (hive.apache.org) facilitates querying and managing large datasets; r-connector and mahout (mahout.apache.org) are employed in the mining and statistical analysis of sensory and accessory data. as one of our major contributions, the analytics module extracts the features of the target-of-interest from the data using a temporally and spatially dependent mathematical model.
the classification and clustering module uses machine learning to assess events according to the features obtained by the analytics module.

fig. 3 infrastructure for sois-hadoop.

4. a generic flowchart for sensor-oriented information analysis system

fig. 4 shows a generic flowchart of the sensor-oriented information analysis system [12]. the geographic information module defines the geometric configuration of the scene. the mathematical model of the expected behavior of the target-of-interest [6] is the main focus of this paper; for example, an agent-based mathematical model is used to anticipate the evolution of the observed behavior of a carp school. persistent surveillance sensory data are acquired directly from sensors. in addition, acquisition, pre-processing, storage and retrieval are all implemented on the hadoop ecosystem [11], including apache flume, mahout, hive, r-connector, etc. fig. 4 also illustrates that the implementation of the sensor-oriented information analysis system consists of the following two threads: (1) formulating the mathematical model (e.g., spatial- and temporal-dependent partial differential equations) of the expected behavior of the thing-of-interest (toi) using historical sensory data [5][6][13][24]; (2) processing and integrating the observed sensory data. situational awareness is derived from these two threads. the situational awareness then in turn guides the self-optimization of data collection and the cooperative control [25] of the sensor nodes, so as to generate a more accurate understanding of external events and to achieve longevity through energy-efficient motion planning algorithms. the mathematical model of the expected behavior of the target-of-interest is formulated according to the geographic information and the historical records of targets-of-interest.
the core calculations corresponding to mathematical modeling include (1) statistical analysis of the historical sensory data stored in the hadoop distributed file system (hdfs), and (2) numerical solution of the mathematical simulation (e.g., using the finite element method [9][14][16][23] to solve the temporal- and spatial-dependent partial differential equations).

fig. 4 a generic flowchart of the sensor-oriented information analysis system.

processing of persistent surveillance video data includes the following operations: (1) acquisition of video data; (2) segmentation, which extracts the pixels of targets-of-interest (toi) from the background; (3) isolation of tois from noise or other moving objects; (4) translation of the optical behavior features (i.e., the velocity and position of moving tois within the sensor coordinate system) of detected pedestrians into their actual geographical features (i.e., the velocity and position of moving targets within the geographic coordinate system); (5) documentation, which posts the output in a format suitable for post-processing and includes position, velocity, and track. the implementation of both threads is highly computation- and storage-intensive.

5. data analytics based on mathematical model

in the context of sensor-oriented analysis, data analytics aims to disclose the specific behavior of the target-of-interest (toi) from sensory data [9][10][16]. for example, detecting speeding or wrong-way vehicles in road surveillance video is a significant task; a description of the expected (or normal) traffic flow is therefore needed so that abnormal vehicles can be identified. the expected behavior (i.e., normal behavior) of the toi is commonly modeled using a macroscopic or a microscopic method. the microscopic method, also called the agent-based method, provides a detailed formulation of the behavior of the toi but suffers from prohibitive computational cost and accumulated numerical error.
the macroscopic method generally uses time- and space-dependent partial differential equations (pde) to formulate the expected behavior of the toi. in this paper, microscopic and macroscopic methods are used to simulate the aggregation of carp [26-33] and vehicle traffic flow [34-36], respectively.

5.1. aggregation of carp

since being introduced to the u.s. in the 1970s for weed and parasite control in aquatic farms [37-39], asian carp (including bighead carp, black carp, grass carp, and silver carp) have gradually established breeding populations in the mississippi river region [39-41]. asian carp are causing serious damage to the area's fresh-water ecosystem [38][40][42]. to provide constructive information for controlling asian carp populations [26][43], the addressed sois-hadoop is customized and employed to predict the collective behavior of asian carp (figure 5(a)). in this work, an agent-based mathematical model (a microscopic method) is presented to formulate the aggregation of asian carp. based on statistical analysis of empirical sensory data, the pair-wise interaction is defined using modified van der waals forces [44], where the potential function $u_{ij}$ between carp i and carp j is defined by formula (1):

[formula (1): piecewise definition of $u_{ij}(\|r_{ij}\|)$ with exponents m and n over the zones bounded by $r_s$, $r_h$, and $r_k$; the branch expressions are not legible in the source] (1)

where $r_{ij} = x_i - x_j$, $m > n$, and $\sigma = (n/m)^{1/(m-n)}\, r_s$ (such that $u_{ij}(\|r_{ij}\|) = 0$ when $r_s \le \|r_{ij}\| \le r_h$).
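the role of the scaling factor $(n/m)^{1/(m-n)}\, r_s$ can be checked numerically. the following is a minimal sketch, assuming that the short-range branch of the pair potential has the generic (m, n) van der waals (mie) form $\varepsilon[(\sigma/r)^m - (\sigma/r)^n]$; that form, the names `sigma` and `eps`, and the sample radius are assumptions made for illustration and are not taken from formula (1). with this scaling the pair force vanishes exactly at $r = r_s$, which is where a flat zone of constant potential can start.

```python
# sketch under stated assumptions: generic Mie (m, n) pair potential,
# not the paper's piecewise formula (1).

def sigma_from_rs(r_s, m=12, n=6):
    """Length scale sigma chosen so that the Mie force is zero at r = r_s."""
    return (n / m) ** (1.0 / (m - n)) * r_s

def mie_potential(r, sigma, eps=1.0, m=12, n=6):
    """u(r) = eps * [(sigma/r)^m - (sigma/r)^n]; eps is a hypothetical prefactor."""
    return eps * ((sigma / r) ** m - (sigma / r) ** n)

def mie_force(r, sigma, eps=1.0, m=12, n=6):
    """f(r) = -du/dr; positive values push the pair apart."""
    return eps * (m * sigma ** m / r ** (m + 1) - n * sigma ** n / r ** (n + 1))

r_s = 2.0                      # hypothetical inner radius of the stable zone
sigma = sigma_from_rs(r_s)     # uses m = 12, n = 6
force_at_rs = mie_force(r_s, sigma)
```

under these assumptions the force is repulsive for r below r_s and attractive above it, so r_s is the equilibrium separation of the unmodified potential.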
from formula (1), it follows that the resulting force function is:

[formula (2): piecewise definition of $f_{ij}(\|r_{ij}\|) = -\partial u_{ij} / \partial \|r_{ij}\|$ over the same zones, with $f_{ij} = 0$ for $r_s \le \|r_{ij}\| \le r_h$; the remaining branch expressions are not legible in the source] (2)

compared to alternative models such as [33][45][46], the addressed model can efficiently formulate the "aggregation" of a carp school.

fig. 5 (a) aggregation of carp; (b) interaction zones between neighboring carp.

$r_s$, $r_h$, and $r_k$ are illustrated in fig. 5(b); $\|r_{ij}\|$ indicates the distance between two neighboring carp. $\varepsilon$ is a constant coefficient derived from empirical data. it should be remarked that the moving orientation, water flow velocity, and blind zone are not considered in formula (1).

fig. 6 (a) inter-carp potential energy; (b) inter-carp force (m = 12; n = 6)

fig. 6(a) shows the potential energy incurred by the pair-wise interaction between two neighboring carp. fig. 6(b) shows the resulting inter-carp force. it can be observed that the inter-carp potential energy has a stable zone (or parallel zone), within which the potential energy is essentially constant, so that neighboring carp can cruise without influencing each other.

fig. 7 (a) interaction between neighboring carp; (b) value of $\omega_{ij}$ for varying $\theta_{max}$ (denoting the visible zone)

according to ichthyology [31-32], the interaction between carp is assumed to depend on the blind zone (figures 5(b) and 7(a)). as illustrated in figure 7(a), $v_i$ is the velocity of the i-th carp. $\theta_{max}$ is the maximal perceptible angle, with $0 \le \theta_{max} \le \pi$. $\theta_{ij}$ indicates the angle between $v_i$ and $r_{ij}$, and is defined by the following formula:

$$\theta_{ij} = \arccos\left(\frac{v_i \cdot r_{ij}}{\|v_i\|\,\|r_{ij}\|}\right) \quad (3)$$
given the blind angle $\theta_{max}$, the inter-carp potential is defined as:

$$u^{*}_{ij} = \omega_{ij}\, u_{ij} \quad (4)$$

where

[formula (5): definition of the weight $\omega_{ij}$ as an exponential function of $\theta_{ij}$ and $\theta_{max}$; the exact expression is not legible in the source] (5)

as a consequence, the inter-carp interaction force is determined by the following formula:

$$f^{*}_{ij} = -\frac{\partial u^{*}_{ij}}{\partial \|r_{ij}\|} = \omega_{ij}\, f_{ij} \quad (6)$$

where $f_{ij}$ is defined in equation (2). based on the above agent-based mathematical model for carp schooling, the future status of a carp school can be predicted from the currently observed sensory data. furthermore, based on the preliminary simulation results, the motivations of fish aggregation, such as foraging advantages, reproductive advantages, predator avoidance, or hydrodynamic efficiency, can be disclosed.

fig. 8 snapshots of the simulation of carp aggregation and the corresponding standard deviation of kinetic energy (the size of the fish school is 50): (a) initial stage; (b) aggregation stage

figure 8 demonstrates the aggregation process of a fish school of size 50. it shows that carp gradually gather due to the pairwise interaction between neighboring carp; in addition, the standard deviation of the kinetic energy of the carp can be used to measure the aggregation status of the school, namely a carp school in aggregation has a smaller standard deviation of kinetic energy.

5.2. vehicle traffic analysis

traffic flow analysis plays a significant role in civil engineering, transportation management, and homeland security [34]. due to rapid urbanization and modern industrialization, traffic congestion has become an intolerable issue in today's world. modeling and simulation of traffic flow provide an efficient way to understand traffic congestion and to identify corresponding remedies. mathematical models for traffic flow are categorized into microscopic (or agent-based) and macroscopic strategies.
macroscopic models study traffic from an average (or continuum) perspective, while microscopic models study the motion of individual vehicles. a macroscopic model uses temporal- and spatial-dependent partial differential equations (generally hyperbolic partial differential equations) to formulate the expected traffic flow. representative macroscopic models for traffic flow are the lighthill-whitham-richards model [35], the aw-rascle model [47], and the zhang model [36]. none of the above models can efficiently and accurately formulate complicated road scenarios such as nozzles, merging, diverging, and roundabouts. different from the above models, the proposed work defines the governing equations for traffic flow using the following partial differential equations:

$$\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho v) = 0 \quad (7.1)$$

[equations (7.2) and (7.3): only fragments survive in the source; (7.2) involves $v_{max}$, $a(x)$, and $\log a(x)$, and (7.3) is a momentum-type equation involving $dv/dt$, $p$, $\nabla^2 v$, and $g$]

where $\rho(x,t)$ is the number of vehicles per unit length, $v(x,t)$ is the expected velocity of the vehicle, $v_{max}$ is the speed limit, and $a(x)$ is the cross-section width (or bandwidth) of the road. equation (7.1) is derived from conservation of mass. equation (7.2) ensures that traffic flow slows down at a nozzle and keeps constant speed at a fork (illustrated in figure 9).

fig. 9 traffic flow through (a) nozzle; (b) fork

as illustrated in figure 10(a), this work acquires the citywide traffic status using an electro-optical sensor array mounted on an unmanned aerial vehicle (uav). figure 10(b) shows the expected traffic velocity field resulting from the solution of the governing equations. the boundary conditions and the coefficients of the governing equations are obtained from empirical traffic data. using the expected traffic flow as a reference, the observed vehicles can be measured and evaluated. fig.
10 (a) traffic status acquired using uav-mounted electro-optical sensor array; (b) expected traffic flow derived from empirical data and equation (7).

6. macro-cell strategy

real-world problems generally involve a large scene such as a metropolitan city or a huge lake. as a result, a sois-hadoop system should be scalable so as to solve large-scale problems.

fig. 11 two partition strategies: (a) euler formulation of a transportation network; (b) lagrange formulation of a lake.

in this work, a macro-cell strategy, which partitions the global physics domain (or scene) into multiple overlapping or non-overlapping elements (cells) and then manipulates them independently [9][10][16][48][49], is employed to enhance the capability of the sois-hadoop framework to handle large-scale problems. as illustrated in fig. 11, the physics domain (or scene) of interest can be discretized using the euler formulation or the lagrange formulation [49]. inter-cell communications occur only between neighboring cells and are triggered only when anomalous crowd behavior is observed and detected. cells of particular interest are analyzed in detail using the modeling and simulation strategy (which is relatively costly computationally). table 1 lists sample feature values from the macro-cell-oriented carp aggregation analysis [46]. through sufficient training, the addressed system can employ the known cellular features to predict the likelihood of aggregation through appropriate machine learning methods [50] such as logistic regression, neural networks, hidden markov models (hmm), and bayesian learning.

table 1 cell-by-cell analytics of carp aggregation (the lake is divided into 1000 cells).

cell id    fish density    total kinetic energy    std (kinetic energy)    entropy    aggregation occurs?
1          25.6            77                      10.2                    10         yes
2          12.7            89                      30.98                   40         no
3          7.9             101                     105                     90
…          …               …                       …                       …          …
10         14              25                      133                     30

7.
conclusion

a pilot sois-hadoop system has been set up and applied to a variety of real-world problems such as the prediction of carp aggregation [16] and vehicle traffic analysis [6][9][24]. some preliminary yet promising outcomes have been achieved. in the near future, we intend to make progress in the following directions: (1) broaden the application of the proposed sensor-oriented information analysis system to, for example, the simulation of the spread of epidemic diseases [16], anomalous pedestrian detection [8][15], and structural health monitoring; (2) develop scalable numerical methods for the mathematical modeling of the sensor-oriented information analysis system: time integration methods for the solution of the governing equations, domain decomposition in the finite element method, and polynomial preconditioning; (3) optimize the exploitation of sensory data using dimensionality reduction (e.g., pca) [50]; (4) optimize the cooperative control of sensor assets so as to obtain optimal observation and high energy efficiency; (5) employ more advanced and accurate mathematical models to formulate the expected behavior of the toi. for example, stochastic analysis can be introduced to formulate the uncertainty of the sois-hadoop framework, and multi-scale modeling can be used to seamlessly merge the microscopic and macroscopic descriptions of the toi.

acknowledgement: this work is jointly sponsored by the national science foundation (nsf) with proposal number 1240734 (“a design proposal for the center of cyber sensor networks for human and environmental applications”) and 1111542 (“ri: large: collaborative research: a robotic network for locating and removing invasive carp from inland lakes”). the authors would like to thank dr. kimberly kendrick from university of nevada -las vegas (nevada, usa), dr. xiaofang wei from central state university (ohio, usa), mr. darrell barker and ms.
olga mendoza-schrock of the sensors directorate in air force research laboratory (ohio, usa) for their support and guidance of this work.

references

[1] o. bott, m. marschollek, k.-h. wolf, and r. haux, "towards new scopes: sensor-enhanced regional health information systems-part 1: architectural challenges", meth. inform. med., vol. 46, pp. 476-483, 2007. [2] j. v. c. schneider, information systems today: managing in the digital world. prentice hall, 2015. [3] f. zhao, j. shin, and j. reich, "information-driven dynamic sensor collaboration", ieee signal process. mag., vol. 19, pp. 61-72, 2002. [4] w. m. ulrich, legacy systems: transformation strategies. prentice hall, 2002. [5] s. fernandes, y. liang, s. sritharan, x. wei, and r. kandiah, "real time detection of improvised explosive devices using hyperspectral image analysis", in proceedings of the 2010 ieee national aerospace and electronics conference (naecon 2010). 2010. [6] s. fernandes and y. liang, "chipping and segmentation of target of interest from low-resolution electrooptical data", in proceedings of the spie defense, security, and sensing. 2013, pp. 87440r-87440r-8. [7] k. grolinger, w. a. higashino, a. tiwari, and m. a. capretz, "data management in cloud environments: nosql and newsql data stores", j. cloud. comput. adv. syst. appl., vol. 2, p. 22, 2013. [8] y. liang, w. melvin, s. fernandes, m. henderson, s. i. sritharan, and d. barker, "a crowd motion analysis framework based on analog heat-transfer model", american journal of science and engineering, vol. 2, pp. 33-43, 2013. [9] y. liang, m. henderson, s. fernandes, and j. sanderson, "vehicle tracking and analysis within a city", in proceedings of the spie defense, security, and sensing. 2013, pp. 87510f-87510f-15. [10] y. liang, m. szularz, and l. t. yang, "finite-element-wise domain decomposition iterative solvers with polynomial preconditioning", math.
comput. model., vol. 58, pp. 421-437, 2013. [11] a. s. foundation. (2014). hadoop releases. available: http://www.apache.org/ [12] y. liang and c. wu, "a sensor-oriented information system based on hadoop cluster", in proceedings on the international conference on internet computing (icomp). 2014, p. 1. [13] y. liang and c. wu, "an agent-based mathematical model about carp aggregation", in proceedings of the spie sensing technology+ applications. 2015, pp. 94860q-94860q-11. [14] r. c. taylor, "an overview of the hadoop/mapreduce/hbase framework and its current applications in bioinformatics", bmc bioinformatics, vol. 11, p. s1, 2010. [15] y. liang, w. melvin, s. i. sritharan, s. fernandes, and d. barker, "cma-ht: a crowd motion analysis framework based on heat-transfer analog model", in proceedings of the spie defense, security, and sensing. 2012, pp. 84020j-84020j-13. [16] y. liang, z. shi, s. i. sritharan, and h. wan, "simulation of the spread of epidemic disease using persistent surveillance data", in proceeding of comsol 2010, boston, 2010. [17] k. langendoen and n. reijers, "distributed localization in wireless sensor networks: a quantitative comparison", comput. netw., vol. 43, pp. 499-518, 2003. [18] h. qi and j. b. moore, "direct kalman filtering approach for gps/ins integration", ieee trans. aerosp. electron. syst., vol. 38, pp. 687-693, 2002. [19] g. a. bekey, autonomous robots: from biological inspiration to implementation and control. mit press, 2005. [20] g. anastasi, m. conti, m. di francesco, and a. passarella, "energy conservation in wireless sensor networks: a survey", ad hoc networks, vol. 7, pp. 537-568, 2009. [21] t. s. rappaport, wireless communications: principles and practice. vol. 2: prentice hall, 2002. [22] j. n. al-karaki and a. e. kamal, "routing techniques in wireless sensor networks: a survey", ieee wireless commun., vol. 11, pp. 6-28, 2004. [23] y. liang, j. weston, and m. 
szularz, "generalized least-squares polynomial preconditioners for symmetric indefinite linear equations", parallel. comput., vol. 28, pp. 323-341, 2002. [24] j. sanderson and y. liang, "no-reference image quality measurement for low-resolution images", in spie defense, security, and sensing. 2013, pp. 874404-874404-17. [25] w. ren and r. w. beard, distributed consensus in multi-vehicle cooperative control. springer, 2008. [26] s. camazine, self-organization in biological systems. princeton university press, 2003. [27] i. d. couzin, j. krause, r. james, g. d. ruxton, and n. r. franks, "collective memory and spatial sorting in animal groups", j. theor. biol., vol. 218, pp. 1-11, 2002. [28] a. deutsch and s. dormann, cellular automaton modeling of biological pattern formation. 2005. [29] a. huth and c. wissel, "the simulation of the movement of fish schools", j. theor. biol., vol. 156, pp. 365-385, 1992. [30] p. b. johnsen and a. d. hasler, "winter aggregations of carp (cyprinus carpio) as revealed by ultrasonic tracking", trans. am. fish. soc., vol. 106, pp. 556-559, 1977. [31] r. jullien and r. botet, aggregation and fractal aggregates. world scientific pub co inc, 1987. [32] s. stöcker, "models for tuna school formation", math. biosci., vol. 156, pp. 167-190, 1999. [33] t. vicsek, a. czirók, e. ben-jacob, i. cohen, and o. shochet, "novel type of phase transition in a system of self-driven particles", phys. rev. lett., vol. 75, p. 1226, 1995. [34] m. bando, k. hasebe, a. nakayama, a. shibata, and y. sugiyama, "dynamical model of traffic congestion and numerical simulation", phys. rev. e: stat., nonlinear, soft matter phys., vol. 51, p. 1035, 1995. [35] m. j. lighthill and g. b. whitham, "on kinematic waves. ii. a theory of traffic flow on long crowded roads", in proceedings of the royal society of london a: mathematical, physical and engineering sciences. 1955, pp. 317-345. [36] h. m.
zhang, "a mathematical theory of traffic hysteresis", transport. res. b-meth., vol. 33, pp. 1-23, 1999. [37] a. c. r. c. committee, "asian carp control strategy framework", 2013. [38] e. h. buck, h. f. upton, c. v. stern, and j. e. nicols, "asian carp and the great lakes region", 2010. [39] j. rosenfeld, "assessing the habitat requirements of stream fishes: an overview and evaluation of different approaches", trans. am. fish. soc., vol. 132, pp. 953-968, 2003. [40] r. naylor, s. williams, and d. strong, "aquaculture-a gateway for exotic species", science (wash.), vol. 294, pp. 1655-1656, 2001. [41] r. goldburg, m. s. elliott, r. naylor, and p. o. commission, marine aquaculture in the united states: environmental impacts and policy options. pew oceans commission, 2001. [42] t. m. koel, k. s. irons, and e. n. ratcliff, "asian carp invasion of the upper mississippi river system", us department of the interior, us geological survey, upper midwest environmental sciences center, 2000. [43] p. bajer, c. chizinski, and p. sorensen, "using the judas technique to locate and remove wintertime aggregations of invasive common carp", fish. manage. ecol., vol. 18, pp. 497-505, 2011. [44] a. r. leach, molecular modelling: principles and applications. prentice hall, 2001. [45] a. czirók, m. vicsek, and t. vicsek, "collective motion of organisms in three dimensions", physica. a., vol. 264, pp. 299-304, 1999. [46] c. w. reynolds, "flocks, herds and schools: a distributed behavioral model", in acm siggraph computer graphics. 1987, pp. 25-34. [47] a. aw and m. rascle, "resurrection of" second order" models of traffic flow", siam journal on applied mathematics, vol. 60, pp. 916-938, 2000. [48] y. liang, the use of parallel polynomial preconditioners: in the solution of systems of linear equations. lap lambert academic publishing, 2013. [49] a. toselli and o. widlund, domain decomposition methods: algorithms and theory. vol. 3: springer, 2005. [50] s. 
theodoridis, machine learning: a bayesian and optimization perspective. academic press, 2015.

facta universitatis series: electronics and energetics vol. 32, n o 1, march 2019, pp. 129-145 https://doi.org/10.2298/fuee1901129v

fpga implementation of modified elliptic curve digital signature algorithm

kamalakannan venkataraman, tamilselvan sadasivam
department of electronics and communication engineering, pondicherry engineering college, pillaichavady, puducherry, india

abstract. with the rapid deployment of internet-of-things (iot) devices, security issues related to the data transmitted between the devices increase. the integrity of perceptual layer devices is therefore of utmost importance for securing the information being transmitted between the devices. in a secured information system, digital signature generation and verification processes are entirely different from data encryption and decryption processes. digital signatures are rapidly emerging in response to problems of data integrity, and play a crucial role in the authentication process by enabling the sender to attach a signature to the encrypted message. it is beneficial to select an algorithm that behaves favorably on the target devices; the keccak-f [1600] algorithm is well suited to devices with area and cost constraints. in this paper, implementations of the original elliptic curve digital signature algorithm and its variants are considered and evaluated in terms of security level and computational cost. the concepts of signature generation and verification in the modified ecdsa scheme are similar to those of the original ecdsa scheme. the computational cost of the modified ecdsa is reduced by removing the inverse operation in the key generation and signing phases, and the problem of signatures being forged is resolved using the hidden generator point concept. hence the modified ecdsa is more secure, with less computational cost, when implemented on fpga using verilog hdl.
therefore, this algorithm can be applied to devices connected in the perceptual layer of the iot.

key words: internet of things, elliptic curve cryptography, elliptic curve digital signature algorithm, secured hash algorithm, keccak

received june 20, 2018; received in revised form september 26, 2018
corresponding author: kamalakannan venkataraman, department of electronics and communication engineering, pondicherry engineering college, pillaichavady, puducherry, india (e-mail: vkamalakannan@pec.edu)

1. introduction

the internet of things (iot) represents a network of independent devices interconnected globally. in the iot, enormous amounts of information have to be communicated, stored, processed, and analyzed securely. securing this information is one of the fundamental challenges in the iot. many iot products consist of inexpensive components with limited memory and computational resources. such devices might be unable to support the computationally intensive cryptographic functions of asymmetric cryptography. even if designers consider the privacy implications of unencrypted data, they have limited options for encryption on such hardware platforms. therefore the designers have to create their own security protocols. cryptography is the branch of cryptology dealing with the design of algorithms for encryption and decryption, intended to ensure the secrecy and/or authenticity of messages [1]. in 1985 neal koblitz and victor s. miller suggested basing public key cryptography on the algebraic structure of elliptic curves (ec) over finite fields. the digital signature algorithm (dsa) was proposed in august 1991 by the u.s. national institute of standards and technology (nist) and was specified in a u.s. government federal information processing standard (fips) 186 called the digital signature standard (dss).
in 1992 scott vanstone proposed the elliptic curve digital signature algorithm (ecdsa) in response to nist's request for public comment on the dss proposal. in 1998 it was accepted as the iso 14888-3 standard by the international standards organization (iso), and in 1999 as the ansi x9.62 standard by the american national standards institute (ansi). in 2000 it was accepted as the ieee 1363-2000 standard by the institute of electrical and electronics engineers (ieee) and as the fips 186-2 standard under the federal information processing standards (fips). elliptic curve cryptography (ecc) is a public key cryptosystem that has a mathematical advantage over rivest shamir adleman (rsa), as solving the elliptic curve discrete logarithm problem (ecdlp) requires fully exponential time. the ecdlp is defined over the points of the elliptic curve (ec). a digital signature is an authentication mechanism that enables the sender to attach a signature to a message; it comprises digital signature generation and digital signature verification processes [2]. the integrity of the message is guaranteed because the digital signature detects modification of the data by unauthorized users and also authenticates the identity of the signatory. the ecdsa is an elliptic curve variant of the dsa, and it gives cryptographically strong digital signatures because of the ecdlp. here keccak-f [1600], recognized by nist as the new secure hash algorithm-3 (sha-3), is used in the digital signature generation and verification processes [3]. the complexity of digital signature generation and verification is also analyzed with respect to an attacker attempting to forge the signature. in this paper the original ecdsa and its variants are analyzed and evaluated in terms of implementation area, security level, and speed of execution. based on this analysis, a modified ecdsa method is designed and implemented on fpga for iot devices.
this section gives a brief introduction to the paper. section 2 and section 3 give an overview of elliptic curve cryptography and of the secure hash algorithm-3 keccak. section 4 gives a detailed description of the original ecdsa scheme, its security proof, and a possible attack on the original ecdsa scheme. section 5 describes a modified ecdsa suitable for a signer with limited computation capability and a method to solve the forging problem using an initialization and authorization stage. section 6 provides a comparison of ecdsa schemes, whereas section 7 explains the implementation and synthesis of ecdsa. the result analysis of ecdsa and its variants is given in section 8, with conclusions drawn in section 9, followed by references.

2. elliptic curve cryptography

elliptic curves have been studied by mathematicians for centuries and therefore have a very rich history. ecc is the foremost choice among public key schemes due to its smaller key size [4], [13]. the key length of ecc is considerably shorter than that of rsa at the same level of security. generally, elliptic curves are defined over two kinds of finite fields: fields of odd characteristic $f_p$, where p is a large prime number, and fields of characteristic two $f_{2^m}$, where $2^m$ is a large binary value. when the distinction is not important, both are denoted $f_q$, where $q = p$ or $q = 2^m$. an ec is the set of solutions (x, y) to the weierstrass equation. an elliptic curve e over a field k is defined by eq. (1) as

$$e(k): y^2 + a_1 xy + a_3 y = x^3 + a_2 x^2 + a_4 x + a_6 \quad (1)$$

where the coefficients $a_1, a_2, a_3, a_4, a_6 \in k$. the curve e is nonsingular (smooth), and hence an elliptic curve, if and only if the discriminant of e, $\Delta_e$, is nonzero. assuming that the characteristic of k is neither 2 nor 3, the weierstrass equation can be transformed into the short weierstrass form $y^2 = x^3 + ax + b$, where $a, b \in k$. the discriminant of the short weierstrass curve is given in eq.
(2) as

$$\Delta = -(4a^3 + 27b^2) \quad (2)$$

3. secure hash algorithm-3

keccak has a different structure from other hash functions. secure hash algorithm-3 keccak was selected because in 2004 sha-1 was found to be weak, and the threat carried over to sha-2 as well. successful collision attacks have been reported on sha-0 and sha-1; collisions undermine the purpose of a hash function, which is to ensure information integrity. sha-2 is currently still considered safe, but because it shares a similar structure with its predecessor sha-1, doubts about its safety stimulated the scientific community to search for a more robust and secure successor, which became sha-3. the keccak architecture, shown in fig. 1, consists of preprocessing and the sponge construction [5].

fig. 1 high-level view on keccak

in the pre-processing stage the message is split into blocks with the necessary padding. the sponge construction has an absorbing (or input) phase and a squeezing (or output) phase, as shown in fig. 2.

fig. 2 absorbing and squeezing phases of the sponge construction

in the absorbing phase the data blocks are fed to the algorithm for processing. in the squeezing phase the processed data is squeezed out according to the configured output length. the function keccak-f is used in both phases. it reads the input blocks $x_i$ and generates the output blocks $y_j$, allowing arbitrary-length outputs $y_0 \cdots y_u$. the security level of keccak is configured through several parameters related to the input and output sizes. the parameter b is the width of the state and depends on the exponent l, i.e., $b = r + c = 25 \cdot 2^l$, where l = 0, 1, ..., 6, so that b ∈ {25, 50, 100, 200, 400, 800, 1600}; r is the bit rate and c is called the capacity.
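the parameter relations above can be illustrated with a short sketch. it enumerates the seven state widths, splits one width into bit rate and capacity, and uses the shake128 function from python's standard `hashlib` module (a keccak-based sponge with extendable output) to show the arbitrary-length squeezing phase. the pair r = 1088, c = 512 is the split standardized for sha3-256; the helper name `capacity` and the sample message are assumptions made for illustration.

```python
import hashlib

# state widths b = 25 * 2**l for l = 0..6
WIDTHS = [25 * 2 ** l for l in range(7)]

def capacity(b, r):
    """Capacity c for state width b and bit rate r, from b = r + c."""
    if not 0 < r < b:
        raise ValueError("bit rate must satisfy 0 < r < b")
    return b - r

# sha3-256 runs on keccak-f[1600] with r = 1088, hence c = 512
c_sha3_256 = capacity(1600, 1088)

# squeezing: the same absorbed message can yield outputs of any length,
# and a longer output starts with the shorter one
message = b"iot sensor reading"            # hypothetical input
short_out = hashlib.shake_128(message).digest(32)
long_out = hashlib.shake_128(message).digest(64)
```

a larger capacity raises the security level at the cost of a smaller bit rate, i.e., fewer message bits absorbed per call of keccak-f.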
the function keccak-f, referred to as the keccak-f permutation, is the main part of the hash algorithm and is used in both the absorbing phase and the squeezing phase. the keccak-f structure is shown in fig. 3. there are $n_r$ rounds in the function, and each round takes an input of b bits. the parameter l determines the number of rounds, as specified in eq. (3):

$$n_r = 12 + 2l \quad (3)$$

fig. 3 internal structure of function keccak

the number of rounds required for each state width is provided in table 1. any instance of the keccak sponge function family makes use of one of the seven keccak-f permutations, denoted keccak-f[b], where b ∈ {25, 50, 100, 200, 400, 800, 1600} is the width of the permutation.

table 1 number of rounds within keccak-f

state width b [bits]    25    50    100    200    400    800    1600
number of rounds nr     12    14    16     18     20     22     24

these keccak-f permutations are iterated constructions consisting of a sequence of almost identical rounds. the number of rounds $n_r$ depends on the permutation width and is given by $n_r = 12 + 2\ell$, where $2^{\ell} = b/25$. this gives 24 rounds for keccak-f [1600]. thus, referring to table 1, sha-3 keccak repeats 24 rounds, and each round consists of five steps applied in sequence to the entire state (indices are taken modulo 5):

step 1: theta (θ) step
this function consists of three equations involving simple xor and bitwise cyclic shift operations.

$$c[x] = s[x,0] \oplus s[x,1] \oplus s[x,2] \oplus s[x,3] \oplus s[x,4] \quad (4)$$
$$d[x] = c[x-1] \oplus \mathrm{rot}(c[x+1], 1) \quad (5)$$
$$s[x,y] = s[x,y] \oplus d[x] \quad (6)$$

the theta step xors the lanes of the input state matrix column by column in eq. (4), combines the results in eq. (5), and xors them back into the state in eq. (6).

step 2: rho (ρ) and pi (π) steps

$$b[y, 2x+3y] = \mathrm{rot}(s[x,y], r[x,y]) \quad (7)$$

here the rho (ρ) and pi (π) steps together compute a 5x5 array “b”: each of the 25 lanes of the state array is circularly rotated by a fixed offset $r[x,y]$ and written to a permuted position, as in eq. (7).
 step 3 step [ ] [ ] ( [ ] [ ]) (8) in this step operation on the lanes, the d array obtained in the previous steps is manipulated and the results are replaced in the state array “s” illustrated in the eq. (8).  step 4 step in the iota ( step specified in eq. (9) the xor operation is performed for rc round constant specific for each of the 24 rounds of keccak-f[1600] with the lane at location [0, 0] of the new state matrix “s”. [ ] [ ] [ ] (9) 4. elliptic curve digital signature algorithm in 1992 scott vanstone proposed ecdsa because nist requested public comment related to dss proposal [6]. in 1998 it was accepted as iso 14888-3 standard by international standards organization (iso). the ecdsa is an elliptic curve variant of 134 v. kamalakannan, s. tamilselvan the dsa and because of ecdlp generates a cryptographically strong digital signatures. the integrity plays a critical role to safeguard data inside the network as shown in fig 4. sender bob generates a signature to be added with the message before transmission. at the other end receiver alice verifies the signature, in order to receive the message [7]. fig. 4 digital signature process ecdsa has been established as an efficient algorithm against cyberattacks and are characterized by their speed to generate and verify the signature. ecdsa consists of 3 phases: key generation, signature generation and signature verification. these three phases are explained in the following sub-sections. 4.1. ecdsa key generation to generate a public and private key sender performs the following steps step 1: select a random integer da ∈ [1, p-1] step 2: computes the public key qa = dag. 4.2. ecdsa signature generation using the sender‟s private key da and public key qa step 1: select an integer k ∈ [1, p − 1] step 2: compute h = hash (m) = sha-3 (m) step 3: calculate kg= (x1, y1) step 4: compute r = x1 (mod p), if r = 0, go to step 2 step 5: compute s = k -1 (h + da r) (mod p). 
if s = 0, go to step 1.
the signature pair generated is (r, s).

4.3. ecdsa signature verification
using the public key qa and the sender's signature (r, s):
step 1: verify that r and s ∈ [1, p − 1]; if not, the signature is invalid
step 2: compute h = hash(m) = sha-3(m)
step 3: compute w = s^{-1} (mod p)
step 4: compute u1 = hw (mod p) and u2 = rw (mod p)
step 5: compute (x2, y2) = u1 g + u2 qa
step 6: compute v = x2 (mod p)

4.4. proof of ecdsa scheme
step 1: s = k^{-1}(h + da r) (mod p); on rearranging,
step 2: k = s^{-1}(h + da r)
step 3: kg = s^{-1}(h + da r) g = (x1, y1)
step 4: kg = s^{-1} h g + s^{-1} da r g
step 5: kg = w h g + r w qa, where w = s^{-1} (mod p) and qa = da g
step 6: kg = u1 g + u2 qa = (x2, y2), where u1 = hw (mod p) and u2 = rw (mod p)
therefore lhs = kg = (x1, y1) with r = x1 (mod p), and rhs = u1 g + u2 qa = (x2, y2) with v = x2 (mod p); hence v = r. the signature is valid if v = r, and invalid otherwise.

in this algorithm, if the same value k is used for signing every message, the secret key can be found by an intruder. this is explained in the following example, where the same secret k is applied to two different messages m1 and m2. in this process two signatures (r, s1) and (r, s2) are generated from eq. (10) and eq. (11) as

$s_1 = k^{-1}(h_1 + d_a r)$ (10)
$s_2 = k^{-1}(h_2 + d_a r)$ (11)

where h1 = sha-3(m1) and h2 = sha-3(m2). knowing s1 and s2, it is possible to find the secret k using eq. (12):

$k = (h_1 - h_2)/(s_1 - s_2)$ (12)

which follows from the equation $k s_1 - k s_2 = h_1 + d_a r - h_2 - d_a r$. thus, knowing k, r, s and h, it is possible to find da by eq. (13):

$d_a = (k s - h)/r$ (13)

hence a different k should be used for signing different messages; otherwise the private key da can be recovered by the intruder.
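the key-recovery argument of eqs. (10) to (13) can be followed numerically. the sketch below is a toy illustration under stated assumptions: the modulus p = 19, the private key, the repeated value k, the common r and the two hash values are small made-up integers, chosen only so that the arithmetic can be checked by hand; r is simply taken as given rather than derived from a curve point.

```python
# toy parameters (all hypothetical, chosen for hand-checkable arithmetic)
p = 19          # prime modulus used for the signature arithmetic
d_a = 5         # sender's private key (unknown to the intruder)
k = 10          # per-message secret, wrongly reused for both messages
r = 7           # common first signature component (same k gives same r)
h1, h2 = 11, 4  # stand-ins for sha-3(m1) and sha-3(m2)

def inv(x):
    """modular inverse of x modulo the prime p (python 3.8+)."""
    return pow(x % p, -1, p)

def sign(h):
    """eqs. (10)/(11): s = k^-1 (h + d_a r) mod p."""
    return inv(k) * (h + d_a * r) % p

s1, s2 = sign(h1), sign(h2)

# intruder's view: only r, s1, s2, h1 and h2 are known
k_rec = (h1 - h2) * inv(s1 - s2) % p    # eq. (12)
d_rec = (k_rec * s1 - h1) * inv(r) % p  # eq. (13)

print(s1, s2, k_rec, d_rec)  # prints: 16 2 10 5
```

the recovered values match the secret k and the private key d_a, which is why a fresh k is required for every message.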
the ecdsa is modified to solve the above problem by using the inverse operation only in the verification phase; no inverse operation is needed in the key generation and signing phases. the scheme is discussed in the following sub-sections and uses the same key pair generation algorithm.

4.5. ecdsa scheme 2 signature generation
using the sender's private key da and public key qa:
step 1: compute h = hash(m) = sha-3(m)
step 2: select a random integer k from [1, p − 1]
step 3: compute kg = (x1, y1)
step 4: compute r = x1 (mod p); if r = 0, go to step 2
step 5: compute s = (kh + (r xor h) da) g (mod p); if s = 0, go to step 2
the signature pair generated is (r, s).

4.6. ecdsa scheme 2 signature verification
using the public key qa and the sender's signature (r, s):
step 1: verify that r and s are integers in [1, p − 1]; if not, the signature is invalid
step 2: compute h = hash(m) = sha-3(m)
step 3: compute w = h^{-1} (mod p)
step 4: compute u = (r xor h) (mod p)
step 5: compute (x2, y2) = w(s − u qa)
step 6: the signature is valid if v = x2 (mod p) = r, invalid otherwise

4.7. proof of ecdsa scheme 2
step 1: s = (kh + (r xor h) da) g = (kh + u da) g = khg + u da g
step 2: sw = khwg + uw da g
step 3: sw = kg + uw qa, where w = h^{-1} (mod p) and qa = da g
step 4: kg = sw − uw qa = w(s − u qa)
therefore lhs = kg = (x1, y1) with r = x1 (mod p), and rhs = w(s − u qa) = (x2, y2) with v = x2 (mod p); hence v = r.

in ecdsa scheme 2, an intruder can forge the signature by knowing the public parameters (g, n, p, qs) and transmit wrong information to the receiver. the receiver receives the signature and verifies it to authenticate the sender's signature.
this is explained as follows. suppose an intruder 't' forges the signature for a false message 'm' knowing only the public parameters (g, n, p, qs), in the following steps:
step 1: the message 'm' is to appear signed by the sender, whose private key is da and whose public key is qs = da g
step 2: calculate h = hash(m) = sha-3(m)
step 3: select a random integer kt from [1, p − 1]
step 4: compute kt g = (xt, yt)
step 5: calculate rt = xt (mod p); if rt = 0, go to step 3
step 6: calculate st = kt h g + (rt xor h) qs (mod p); if st = 0, go to step 3
thus the signature pair (rt, st) is transmitted with the false message 'm'.

the receiver obtains what appears to be the sender's signature pair with the false message 'm' and verifies the authenticity of the signature (rt, st) using the public parameters (g, n, p, qs) by performing the following steps:
step 1: verify that rt and st are integers in [1, p − 1]; if not, the signature is invalid
step 2: calculate h = hash(m) = sha-3(m)
step 3: calculate w = h^{-1} (mod p)
step 4: calculate u = (rt xor h) (mod p)
step 5: calculate (xt, yt) = w(st − u qs)
step 6: the signature is valid if vt = xt (mod p) = rt, invalid otherwise
if the forged signature is validated, the intruder can successfully send false information; hence such digital signature schemes are not secure. to solve this drawback, the public parameters being shared are reduced.

5. modified elliptic curve digital signature algorithm

comparing the original ecdsa and its variants, it is found that the original ecdsa is vulnerable to attack if the same key is used for different messages. scheme 2 is useful for a verifier with limited computing resources, as there is no inverse calculation in the key generation and signing phases, but anyone can use a legitimate user's public key to forge the signature of any information.
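the forgery against scheme 2 can be reproduced on a toy curve. the sketch below is an illustration under stated assumptions, not the authors' implementation: it uses the small textbook curve y^2 = x^3 + 2x + 2 over gf(17) with base point (5, 1) of prime order 19, treats the signature component s as a curve point, performs scalar arithmetic modulo the point order (the role played by p in the text), and replaces sha-3(m) by a fixed small integer. all function and variable names are hypothetical.

```python
# toy curve y^2 = x^3 + 2x + 2 over gf(17); base point G has prime order 19
F, A = 17, 2
G, N = (5, 1), 19

def add(p, q):
    """add two curve points; None stands for the point at infinity."""
    if p is None:
        return q
    if q is None:
        return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2 and (y1 + y2) % F == 0:
        return None
    if p == q:
        lam = (3 * x1 * x1 + A) * pow(2 * y1, -1, F) % F
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, F) % F
    x3 = (lam * lam - x1 - x2) % F
    return (x3, (lam * (x1 - x3) - y1) % F)

def neg(p):
    """negate a curve point."""
    return None if p is None else (p[0], -p[1] % F)

def mul(k, p):
    """double-and-add scalar multiplication k*p."""
    k %= N
    acc = None
    while k:
        if k & 1:
            acc = add(acc, p)
        p = add(p, p)
        k >>= 1
    return acc

d_a = 5            # sender's private key, never used by the intruder
q_s = mul(d_a, G)  # sender's public key
h = 4              # stand-in for sha-3 of the false message

# intruder: build (r_t, s_t) from public data only
k_t = 3
r_t = mul(k_t, G)[0] % N
s_t = add(mul(k_t * h, G), mul(r_t ^ h, q_s))

# receiver: scheme 2 verification
w = pow(h, -1, N)
u = (r_t ^ h) % N
x2, y2 = mul(w, add(s_t, neg(mul(u, q_s))))
forged_accepted = (x2 % N == r_t)
print(forged_accepted)
```

the check passes without the private key ever being used, because the term (r xor h) da g in the signing equation equals (r xor h) qs and can therefore be computed from the public key alone.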
thus, in the modified ecdsa scheme, the hidden generator point concept is applied to authenticate the encrypted message communicated between the devices connected in the perceptual layer of iot. the normal ecdsa is configured with points on the elliptic curve: a generator point 'g' is selected, made publicly available and distributed over the network by the certificate authority (ca) [11]. in this scheme, the requirement of a ca makes it difficult to implement security. the information shared by the ca can be breached by intruders, making the network susceptible to the mim attack [12]. hence, to eliminate this exposure and to secure the network against mim attacks, it is suggested to maintain the security of each communication session between two nodes without a common public generator point. a generator point is therefore shared only between the devices being connected to communicate. this concept is implemented in the ecdsa in two stages: the initialization stage and the authorization stage.

5.1 initialization stage

let us consider the two nodes in the wsn represented in fig. 5. it is assumed that both nodes, i.e. the sender and the receiver, individually select their generator points, gs and gr, in addition to their private keys, ks and kr. the inverses of the private keys, ks^{-1} and kr^{-1}, are also computed. once the inverses of the private keys are computed, the sender generates its public key psa using eq. (14), whereas the receiver generates its public key pra using eq. (15):

$p_{sa} = k_s^{-1} g_s$ (14)
$p_{ra} = k_r^{-1} g_r$ (15)

the public keys psa and pra are exchanged between the sender and the receiver, and each side multiplies the key it receives by the inverse of its own private key. the resultant key transmitted to the receiver is specified in eq. (16), and the resultant key received by the sender is specified in eq. (17):

$p_{sb} = p_{ra} k_s^{-1} = k_r^{-1} g_r k_s^{-1}$ (16)
$p_{rb} = p_{sa} k_r^{-1} = k_s^{-1} g_s k_r^{-1}$ (17)

fig. 5 computational process for generator point

these received keys are multiplied again by the sender and the receiver to generate psc and prc, as specified in eq. (18) and eq. (19):

$p_{sc} = p_{rb} k_s = k_s^{-1} g_s k_r^{-1} k_s = g_s k_r^{-1}$ (18)
$p_{rc} = p_{sb} k_r = k_r^{-1} g_r k_s^{-1} k_r = g_r k_s^{-1}$ (19)

when psc and prc are received by the sender and the receiver, they are multiplied by ks and kr to obtain gr and gs. the sender computes the receiver's generator point in eq. (20) as

$p_{rc} k_s = k_s^{-1} g_r k_s = g_r$ (20)

the receiver computes the sender's generator point in eq. (21) as

$p_{sc} k_r = k_r^{-1} g_s k_r = g_s$ (21)

these generator points gs and gr are added to generate a common generator point for the sender and the receiver, given in eq. (22) as

$g = g_s + g_r$ (22)

hence the sender and the receiver exchange information using the common generator point 'g', computing p, 2p, ..., kp.

5.2 authorization stage

let us consider two nodes in the wsn. the public and private keys of the transmitter are ps and ks, whereas those of the receiver are pr and kr. the key has to be generated by the process shown in fig. 6 for every session of transmission between the sender and the receiver. thus authorization has to be provided for each transmission.

fig. 6 computational process for key

the public keys psr and prs are exchanged after multiplication by the private keys. the key transmitted to the receiver is specified in eq. (23), and the key received by the sender is specified in eq. (24):

$p_{sr} = p_r k_s$ (23)
$p_{rs} = p_s k_r$ (24)

the keys of the sender and the receiver are multiplied again to generate ksr and krs, given in eq. (25) and eq. (26) as

$k_{sr} = p_{sr} p_{rs} = p_r k_s p_s k_r$ (25)
$k_{rs} = p_{rs} p_{sr} = p_r k_s p_s k_r$ (26)

when psr and prs have been received, the keys ksr and krs are generated by the sender and the receiver individually; they are equal and are thus commonly referred to as the key 'k' in the implementation of ecdsa.
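the generator point exchange of the initialization stage, eqs. (14) to (22), can be traced on a toy curve. the sketch below is an illustration under stated assumptions rather than the authors' implementation: it uses the small textbook curve y^2 = x^3 + 2x + 2 over gf(17) with a base point of prime order 19, takes inverses of the private keys modulo that order, and picks the private generator points and keys arbitrarily. all names are hypothetical.

```python
# toy curve y^2 = x^3 + 2x + 2 over gf(17); base point G has prime order 19
F, A = 17, 2
G, N = (5, 1), 19

def add(p, q):
    """add two curve points; None stands for the point at infinity."""
    if p is None:
        return q
    if q is None:
        return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2 and (y1 + y2) % F == 0:
        return None
    if p == q:
        lam = (3 * x1 * x1 + A) * pow(2 * y1, -1, F) % F
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, F) % F
    x3 = (lam * lam - x1 - x2) % F
    return (x3, (lam * (x1 - x3) - y1) % F)

def mul(k, p):
    """double-and-add scalar multiplication k*p."""
    k %= N
    acc = None
    while k:
        if k & 1:
            acc = add(acc, p)
        p = add(p, p)
        k >>= 1
    return acc

def inv(k):
    """inverse of a private key modulo the point order."""
    return pow(k, -1, N)

g_s, g_r = mul(3, G), mul(4, G)  # hypothetical private generator points
k_s, k_r = 5, 7                  # hypothetical private keys

p_sa = mul(inv(k_s), g_s)        # eq. (14), sender
p_ra = mul(inv(k_r), g_r)        # eq. (15), receiver
p_sb = mul(inv(k_s), p_ra)       # eq. (16), sender
p_rb = mul(inv(k_r), p_sa)       # eq. (17), receiver
p_sc = mul(k_s, p_rb)            # eq. (18): equals k_r^-1 g_s
p_rc = mul(k_r, p_sb)            # eq. (19): equals k_s^-1 g_r
g_r_at_sender = mul(k_s, p_rc)   # eq. (20)
g_s_at_receiver = mul(k_r, p_sc) # eq. (21)

g_sender = add(g_s, g_r_at_sender)      # eq. (22) at the sender
g_receiver = add(g_s_at_receiver, g_r)  # eq. (22) at the receiver
print(g_sender, g_receiver)
```

both sides end with the same point g although neither gs nor gr was ever transmitted in the clear, which is the property the hidden generator point concept relies on.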
the sender and the receiver in the wsn have individual generator points, gs and gr, with their unique private keys, ks and kr. after initializing the key generation process, both devices exchange the generator points gs and gr and generate a common generator point by the initialization process explained in subsection 5.1. hence the sender and the receiver exchange information using the common generator point 'g', computing p, 2p, 3p, ..., kp. the sensor nodes must securely share a key before encryption. the shared secret key is generated and refreshed between the sender and the receiver. the public keys of the sender and the receiver, ps and pr, are exchanged using the dhke process, and a key is generated by the method explained in subsection 5.2. using the generator point 'g' and the key 'k', scalar multiplication is performed to compute kg, given in eq. (27), which is applied in the signature generation and signature verification processes.

$kg = (x_1, y_1)$ (27)

from the initialization and authorization stages, the values of k and g are known. the scheme is discussed in the following steps.

5.3. modified ecdsa signature generation
to generate the signature for message m, the signer uses the values of k and g and performs the following steps:
step 1: calculate h = hash(m) = sha-3(m)
step 2: compute kg = (x1, y1)
step 3: compute r = x1 (mod p)
step 4: compute s = (k + (r xnor h)) g (mod p)
the signature pair thus generated is (r, s).

5.4. modified ecdsa signature verification
the verifier verifies the signature for message m using k and g from the initialization and authorization stages by performing the following steps:
step 1: verify that s is an integer in [1, p − 1]; if not, the signature is invalid
step 2: compute kg = (x1, y1)
step 3: compute r = x1 (mod p)
step 4: compute u = (r xnor h) (mod p)
step 5: compute (x2, y2) = s − ug
step 6: the signature is valid if v = x2 (mod p) = r, invalid otherwise

5.5 proof of modified ecdsa scheme
the signature sent by the sender to the receiver is (r, s), and s can be generated only by the sender because of its private key.
step 1: s = (k + (r xnor h)) g
step 2: s = (k + u) g, where u = (r xnor h)
step 3: s = kg + ug
step 4: s − ug = kg = (x2, y2)
therefore lhs = kg = (x1, y1) with r = x1 (mod p), and rhs = s − ug = (x2, y2) with v = x2 (mod p); hence v = r.

the improved ecdsa scheme reduces the computational cost while keeping the same security as the original ecdsa. it is suitable for users who have limited computing capacity.

6. comparison of elliptic curve digital signature algorithm

the original ecdsa and the proposed ecdsa are compared in table 2. the original ecdsa involves inverse operations in signature generation and signature verification and is hence more complex, as it needs more point multiplication operations. in the improved scheme, the initialization stage and the authorization stage are introduced to share the values of k and g between the sender and the receiver, thus reducing the computational cost, as no inverse operations are required for signature generation and signature verification, while keeping the same security as the original ecdsa.

table 2 comparison of ecdsa variants
algorithm | signature generation | signature verification | attack | inverse in key generation | inverse in signing | inverse in verification
original ecdsa | s = k^{-1}(h + da r) | u1 = h s^{-1}, u2 = r s^{-1}, u1 g + u2 qa | vulnerable | no | yes | yes
proposed ecdsa | s = (k + (r xnor h)) g | u = (r xnor h), s − ug | not vulnerable | no | no | no

7. implementation and synthesis of elliptic curve digital signature algorithm

the original ecdsa signature generation and signature verification were realized in verilog hdl, and simulation was carried out using the isim simulation tool available in xilinx 14.3 to verify functional correctness.
the rtl block schematic of the ecdsa signature generation is illustrated in fig. 7 and that of the ecdsa signature verification in fig. 8. the ecdsa signature generation and signature verification were synthesized, and the device utilization summary, timing summary and memory utilization are tabulated in table 3. the hardware implementation of ecdsa signature generation was performed on a virtex-5 xc5vlx50t-1ff1136 fpga development board by xilinx to evaluate area and speed. it was found that the ecdsa signature generation operated at a maximum frequency of 13.180 mhz, whereas the ecdsa signature verification operated at a maximum frequency of 13.210 mhz.

fig. 7 rtl block schematic of ecdsa signature generation

fig. 8 rtl block schematic of ecdsa signature verification

table 3 synthesis summary for ecdsa
parameters | signature generation | signature verification
slice registers | 6701 | 6790
slice luts | 16370 | 22734
lut-ff pairs | 4884 | 4996
bonded iobs | 226 | 226
real time | 2627.00 secs | 890.00 secs
cpu time | 2626.99 secs | 890.19 secs
maximum frequency | 13.180 mhz | 13.210 mhz

the modified ecdsa signature generation and signature verification were realized in verilog hdl, and the simulation was carried out using the isim simulation tool in xilinx to verify functional correctness. the rtl block schematic of the modified ecdsa signature generation is illustrated in fig. 9, and that of the modified ecdsa signature verification in fig. 10.

fig. 9 rtl block schematic of modified ecdsa signature generation

fig. 10 rtl block schematic of modified ecdsa signature verification

the modified ecdsa signature generation and signature verification were synthesized, and the device utilization summary, timing summary and memory utilization are tabulated in table 4.
table 4 synthesis summary for modified ecdsa
parameters | signature generation | signature verification
slice registers | 198 | 454
slice luts | 7853 | 16387
lut-ff pairs | 152 | 346
bonded iobs | 41 | 34
real time | 924.00 secs | 910.00 secs
cpu time | 923.69 secs | 910.09 secs
maximum frequency | 13.469 mhz | 13.156 mhz

8. result analysis of elliptic curve digital signature algorithm

the ecdsa and its variant were synthesized and analyzed using the xilinx tool. table 5 and table 6 list the values obtained after synthesizing the original ecdsa and the modified ecdsa scheme for signature generation and signature verification. the comparison covers the maximum frequency and the number of slice luts.

table 5 comparison of synthesis results of ecdsa signature generation
parameters | original ecdsa | proposed ecdsa
number of slice luts | 16370 | 7853
max. frequency (mhz) | 13.180 | 13.469

table 6 comparison of synthesis results of ecdsa signature verification
parameters | original ecdsa | proposed ecdsa
number of slice luts | 22734 | 16387
max. frequency (mhz) | 13.210 | 13.156

the outcomes obtained show that the modified ecdsa scheme is better suited to resource-constrained devices. on the virtex-5 (xc5vlx50t-1ff1136) fpga board, the modified scheme achieves a maximum frequency of 13.469 mhz for signature generation, against 13.180 mhz for the original ecdsa, and 13.156 mhz for signature verification, which is close to the 13.210 mhz of the original. based on design metrics such as frequency (mhz) and area (slice luts), the modified ecdsa requires fewer slice luts in the fpga device than the existing scheme for both operations, at a comparable maximum frequency.

9. conclusion

the elliptic curve digital signature algorithm (ecdsa) is one of the primitives of elliptic curve cryptography (ecc). sha-3 algorithms like keccak provide better security and prove beneficial wherever security constraints have to be met.
here a variant of keccak-f[1600] having five steps (θ step, ρ and π steps, χ step and ι step) repeated 24 times was applied to generate the hashed output. generally, modular inversion is computed using montgomery's method, which involves gcd operations. the gcd operation requires a large number of arithmetic operations. the computational cost therefore increases when implemented on an fpga, as the number of operations increases. from the analysis, it is found that ecdsa is vulnerable to the mim attack when the same key is applied to all messages. at the same time, if the computational cost is reduced, there is a chance of the signature being forged by an intruder. therefore, the modified ecdsa scheme keeps the mathematical structure and the security of the original ecdsa scheme, but reduces the computational cost by removing the inverse operation from the key generation and signing phases. this scheme also solves the problems related to signature forging due to the available public parameters (g, n, p, qs). these goals are achieved by using the hidden generator concept. hence this scheme offers more security at a lower computational cost and can therefore be implemented in the perceptual layer devices of iot, i.e. the ecdsa can be applied to secure the information communicated by devices such as wsns, rfids, etc., which have limited memory and computational capacity. since fpgas are used as end products, the design of ecdsa is fine-tuned for fpga implementation. the work can be extended by considering advanced fpgas in which parallelism can be exploited in the architecture to reduce the delay in asymmetric cryptography.

references
[1] n. koblitz, a. j. menezes, and s. a. vanstone, “the state of elliptic curve cryptography”, designs, codes and cryptography, vol. 19, issue 2-3, pp. 173-193, 2000.
[2] v. miller, “use of elliptic curves in cryptography”, advances in cryptology - crypto '85, lncs 218, springer verlag, 1986, pp. 417-426.
[3] g. provelengios, p. kitsos, n.
sklavos, and c. koulamas, “fpga-based design approaches of keccak hash function”, in proceedings of the 15th euromicro conference, 2012, pp. 648-653.
[4] d. manel, o. raouf, h. ramzi and a. mtibaa, “hash function and digital signature based on elliptic curve”, in proceedings of the 14th international conference on sciences and techniques of automatic control & computer engineering sta'2013, sousse, tunisia, december 20-22, 2013, pp. 388-392.
[5] k. latif, m. m. rao, a. aziz, and a. mahboob, “efficient hardware implementations and hardware performance evaluation of sha-3 finalists”, in proceedings of the 3rd sha-3 candidate conference, march 2012.
[6] s. p. raj, a. p. renold, “an enhanced elliptic curve algorithm for secured data transmission in wireless sensor network”, in proceedings of the global conference on communication technologies (gcct 2015), pp. 891-896.
[7] a. khalique, k. singh, s. sood, “implementation of elliptic curve digital signature algorithm”, international journal of computer applications, vol. 2, no. 2, pp. 21-27, may 2010.
[8] e. wajih, b. noura, m. mohsen and t. rached, “low power elliptic curve digital signature design for constrained devices”, international journal of security (ijs), vol. 6, no. 2, pp. 1-14, april 2012.
[9] g. sarath, d. c. jinwala and s. patel, “a survey on elliptic curve digital signature algorithm and its variants”, computer science & information technology (cs & it) - cscp, 2014, pp. 121-136.
[10] a. i. ali, h. p. isitc, “comparison and evaluation of digital signature schemes employed in ndn network”, international journal of embedded systems and applications (ijesa), vol. 5, no. 2, pp. 15-29, june 2015.
[11] h. junru, “the improved elliptic curve digital signature algorithm”, in proceedings of the international conference on electronic and mechanical engineering and information technology (emeit), 2011, pp. 257-259.
[12] b. panjwani, d. c.
mehta, “hardware-software co-design of elliptic curve digital signature algorithm over binary fields”, in proceedings of the international conference on advances in computing, communications and informatics (icacci), 2015, pp. 1101-1106.
[13] x. zhang, s. ma, w. shi, and d. han, “implementation of elliptic curve digital signature algorithm on iris nodes”, in proceedings of the international conference on estimation, detection and information fusion (icedif 2015), pp. 403-406.

facta universitatis
series: electronics and energetics vol. 33, no. 2, june 2020, pp. 243-259
https://doi.org/10.2298/fuee2002243l
© 2020 by university of niš, serbia | creative commons license: cc by-nc-nd

a novel method for the codecs' performance analysis in mobile telephony systems

aleksandar lebl, dragan mitić, vladimir matić, mladen mileusnić, žarko markov
iritel a.d., belgrade, serbia

abstract. this paper presents a novel method of expressing the quality of service in a mobile telecommunication system when its performance depends on several factors, including the characteristics of the applied codecs (voice quality and data flow rate) and the traffic serving capabilities. the influence of these factors is unified in one variable, the quality of service measure. the proposed method is especially applicable when two-dimensional systems are analyzed, for example when two codecs with different flow rates and different achievable connection quality are used in a system. as an example, we studied a system with full-rate or mixed full-rate and half-rate codec implementation, depending on the offered traffic. the system performances, mean data flow and mean connection quality as functions of offered traffic, are presented graphically and also expressed quantitatively by the novel quality of service measure.
systems with different numbers of available traffic channels may be compared on the basis of this novel evaluation value, such that the system with the highest value is the most suitable one for the concrete situation. in this way mobile system design is greatly simplified. the developed model is generally applicable to defining mobile telephony systems, but in this paper we studied its implementation for the global system for mobile communications.

key words: quality of service measure, full-rate codec, half-rate codec, e-model

received june 19, 2019; received in revised form august 19, 2019
corresponding author: aleksandar lebl, iritel a.d., belgrade, batajnički put 23, 11000 belgrade, serbia (e-mail: lebl@iritel.com)

1. introduction

mobile telephony systems are characterized by a number of parameters. these parameters include, but are not limited to, base station emission power, offered and served traffic, and achieved data-flow rate. different codec types are applied to satisfy traffic demands, each one with its specific voice quality level. when these characteristics are analyzed, usually only one of them is calculated or simulated. the behaviour of the others is then only qualitatively estimated, without relating it to the analyzed one. that is why it is difficult to estimate whether the improvement of one characteristic is outweighed by the degradation of some other parameter. few contributions express the aggregate influence of two or more factors on some system characteristic by a unified measure (unit). one such estimation may be found in [1], where the classical equipment impairment factor ie of the coder, defined in [2], is modified according to the influence of the necessary codec bandwidth and processing time, forming a unified estimation called the modified impairment factor ib.
instead of analyzing the joint influence of several elements important for connection quality, the majority of contributions are directed towards improving the original voice quality model [3], [4], or towards comparing practically obtained characteristics of single voice over internet protocol (voip) and mobile telephony codecs with theoretical characteristics [5]. the various testing approaches that consider only voice connection quality are emphasized in [6]. the analysis may be performed on a per voice sample basis or on a per call basis, when it is important to include the effect of recency and the effect of the speech sample with the worst signal quality. the statistical tests applicable to voice quality determination are also analysed in [6]. in some cases it is important to extend the analysis of voice and data transmission so that the results are representative of different networks (global system for mobile communications (gsm), universal mobile telecommunication systems (umts), code division multiple access (cdma)) or application classes (emergency, business and personal). a methodology that aggregates data collected by drive tests simulating the user area distribution, together with the obtained results, is shown in [7]. the methodology may also be applied to compare various network providers in some area according to how well they satisfy a set of analyzed key performance indicators (kpi) [8]. a very frequently used approach for presenting the influence of two factors on a third variable is three-dimensional (3d) graphics. an example concerning voice coder implementation may be found in [9]. however, such graphics, although illustrative, have the drawback that values are often harder to read from them than from two-dimensional (2d) graphics, and that it is impossible to present the influence of more than two factors on the analyzed variable.
a unified numerical measure, according to the results from [1] or from this paper, makes it simple to estimate whether the analyzed characteristic has a satisfactory value. when designing a telephony system such as a mobile telephony system, several elements must be considered to determine its quality of service (qos). the first element is the offered traffic, which may be served with a pre-defined loss probability. the second element is the achieved mean voice connection quality, as well as the quality of each separate realized connection. the third element is the bandwidth necessary to serve the offered traffic. the available frequency spectrum in mobile telephony is very limited compared to the necessary channel capacity. that is why a number of low bit-rate codecs have been developed for implementation in mobile telephony. however, these codecs have lower voice connection quality than higher bit-rate codecs (in mobile telephony based on gsm technology, full-rate (fr, gsm 06.10) is one of the latter). the service principle is to use higher bit-rate codecs in all connections at lower traffic loads (usually fr in systems of the second generation (2g), i.e. gsm). if the offered traffic exceeds a predefined level, so that the requested traffic loss might not be achieved, it is necessary to start using lower bit-rate codecs (usually the half-rate (hr, gsm 06.20) codec in gsm systems) in all or in a part of the realized connections. besides the limitation in available frequency spectrum, the second element of restriction is the number of dedicated traffic channels, i.e. the system hardware performance. in mobile telephony, it is possible to increase the system channel capacity by up to a factor of two by using the hr codec. in this case, each traffic channel may carry two connections coded by the half-rate codec.
a system with these characteristics is analyzed in [10]. a variety of different parameters may be analysed together to obtain different types of system performance estimations. reference [11] is a comprehensive survey of the elements characterizing fifth-generation (5g) mobile systems. many of these elements, besides those specific to the quality of the applied codecs, may be combined to obtain a number of different evaluations, and [11] gives an idea of the set of factors that can be combined. the most important elements which may be combined are energy consumption, transmission delay, data throughput and the applied frequency spectrum bandwidth. generally speaking, 5g wireless technology is based upon a modified fourth generation (4g), which at present faces many problems in meeting its performance goals. 5g wireless technology helps to solve the problems of poor coverage, bad interconnectivity, poor quality of service and flexibility [12]. 5g systems bring significantly increased data transfer speed, spectrum bandwidth, spectral efficiency and so on, compared to 4g systems [13]. this paper describes a novel method for comprehensive analysis of different codecs applied in mobile telephony systems. it unifies different codec characteristics (such as achievable voice quality, data throughput and traffic serving capability) into one variable, which may be compared for various codecs. the main goal of the research is to simplify mobile system design (selection of the number of traffic channels) so as to achieve the best system performance while considering several kpis. in this paper, the analysis is limited primarily to codecs implemented in gsm systems. the numbers of traffic channels for which the results are presented are specific to gsm systems with 1, 2, 3 or 4 frequency carriers.
such analyses, unifying different mobile telephony codec characteristics into one variable, are rarely implemented and, if applied, they unify a lower number of codec characteristics. section 2 of the paper describes the e-model, which we implemented for voice quality estimation as the first element included in our unified variable. the values of data-flow rate, as the second element of this evaluation, are also cited at the end of that section. the traffic model corresponding to the gsm mobile telephony system with two different applied codecs is developed in section 3 as the third element in the analysis. section 4 describes how the value of our novel estimation is determined and presents the obtained results. section 5 is a brief survey of how the model could be implemented in mobile systems of the third generation (3g) and the fourth generation (4g). finally, section 6 is the conclusion of the paper.

2. voice connection quality estimation

there are several subjective and objective models intended to express the voice connection quality. the analysis in this paper is based on the e-model. the e-model is a computational, objective model. it joins the influence of various factors into one unique quality measure, the rating factor r [2]. the value of r is connected with the mean opinion score (mos), which is between 1 and 5. the values of mos and r are connected by the formula and the corresponding fig. b.2 from [2]. the main purpose of the e-model is to express voice connection quality in a number of different kinds of voice connection systems. among them, quality estimation of internet packet connections is probably the most frequent one. according to [14], mos is used as the measure of voice quality in mobile telephone connections.
taking into account that mos and r are mutually dependent variables, the e-model may be implemented as a measure for the estimation of the quality of a mobile telephony connection. according to the e-model, the connection-rating factor is [2]:

$$ r = 94 - i_{e\text{-}eff} \qquad (1) $$

where r0=94 is the maximum practically possible connection-rating factor. the effective equipment impairment factor ie-eff includes impairments caused by the codec implementation and the influence of random signal loss. in (1) we do not consider the factors which appear in the original equation from [2]: the combination of impairments which occur simultaneously with the voice signal (is), the influence of delay (id) and the advantage factor (a), because these factors are not related to the implemented codec. the value of ie-eff is calculated from the equation [2]:

$$ i_{e\text{-}eff} = i_e + (95 - i_e)\cdot\frac{p_{pl}}{\dfrac{p_{pl}}{burstr} + b_{pl}} \qquad (2) $$

where:
 ie is the equipment impairment factor when there is no packet loss;
 ppl is the transmitted signal loss probability (in percent);
 burstr is the burst ratio: the quotient of the average length of the lost signal parts in the real transmitted signal and their average length when signal parts are lost randomly;
 bpl is the robustness factor, which is specific for each coder type.

in this paper, we are limited to cases when there is no signal loss (ppl=0). such a situation is not very probable in mobile telephony systems, but the obtained results do not lose generality. it means that ie-eff=ie from (2), i.e. r=94-ie from (1) in our analysis. according to the available literature, we have chosen the values iefr=20 for the fr codec and iehr=23 for the hr codec [15]. however, a worse, pessimistic value iefr=26 may also be found in some references [16], [17]. the reason for our choice is that the value of r should be higher for the fr codec than for the hr codec, because the data-flow is greater in the case of the fr codec.
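equations (1) and (2) can be sketched in a few lines of python. this is only an illustration under stated assumptions: the function names are ours, the default values burstr=1 and bpl=1 are placeholders with no meaning for a particular codec (they do not matter when ppl=0), and the r-to-mos mapping is the one commonly quoted from annex b of [2], which should be checked against the recommendation before use.

```python
def effective_impairment(ie, ppl=0.0, burstr=1.0, bpl=1.0):
    """ie-eff from equation (2); ppl is the loss probability in percent.

    burstr=1.0 and bpl=1.0 are placeholder defaults (assumption), not codec data.
    """
    return ie + (95.0 - ie) * ppl / (ppl / burstr + bpl)


def rating_factor(ie, ppl=0.0, burstr=1.0, bpl=1.0):
    """connection-rating factor r from equation (1), with r0 = 94."""
    return 94.0 - effective_impairment(ie, ppl, burstr, bpl)


def mos_from_r(r):
    """r-to-mos mapping as commonly quoted from annex b of itu-t g.107 (assumption)."""
    if r <= 0.0:
        return 1.0
    if r >= 100.0:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7.0e-6


if __name__ == "__main__":
    for name, ie in (("fr", 20.0), ("hr", 23.0)):
        r = rating_factor(ie)  # ppl = 0, as assumed in the paper
        print(name, r, round(mos_from_r(r), 2))
```

with ppl=0 the sketch gives r=74 for the fr codec and r=71 for the hr codec, which are the two values that enter the mean rating factor later in the analysis.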
besides, it is explained in [5] that, according to measurements, even the value iehr=23 is pessimistic for the hr codec. the data-flow rate for fr and hr is also important in our analysis. it is dffr=13kbit/s for the fr codec and dfhr=5.6kbit/s for the hr codec. besides this, the enhanced full-rate codec (efr) may also be applied in gsm systems. its data-flow is dfefr=12.2kbit/s.

3. implemented traffic model

the traffic model of the analyzed system is presented in fig. 1. the total number of traffic channels is n and each channel, if busy, may contain one fr connection or one or two hr connections. each system state is modelled by the two-dimensional variable {nf,nh}, where nf and nh are the numbers of instantaneously realized fr and hr connections, respectively. the threshold number of channels when hr connections begin to be established is k (≤n). as a consequence, the total number of available traffic channels is equal to

$$ n_c = k + 2\cdot(n - k) \qquad (3) $$

the value of k is chosen on the basis of the request that the traffic loss is lower than some limiting value; in our analysis it is ploss=1% or ploss=2%. the intensity of new request generation is λ, and the probability that the user who generates the request may establish a half-rate connection is πh. with up-to-date technology it may be supposed that the majority of mobile stations are able to realize an hr connection. so we suppose that the value of πh is in the range between 0.8 and 1. the call duration of both connection types is a random variable with exponential distribution and mean duration tp=1/μ. in the period when both kinds of requests are generated, the intensity of full-rate request generation is λ·(1-πh), while the intensity of half-rate request generation is λ·πh. in the state {nf,nh} a full-rate connection is finished with the intensity nf·μ and a half-rate connection with the intensity nh·μ.
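as a small worked example of equation (3) and of the split of the request intensity, the following python sketch may help. the function names are illustrative assumptions; only the formulas come from the text above.

```python
def connection_capacity(n, k):
    """nc from equation (3): k channels carry one connection each,
    the remaining n - k channels may carry two hr connections each."""
    if not 0 <= k <= n:
        raise ValueError("the threshold k must satisfy 0 <= k <= n")
    return k + 2 * (n - k)


def request_intensities(lam, pi_h):
    """split the total request intensity into its fr and hr parts
    for the period when both kinds of requests are generated."""
    return lam * (1.0 - pi_h), lam * pi_h


if __name__ == "__main__":
    print(connection_capacity(6, 3))      # n = 6 channels, threshold k = 3
    print(request_intensities(10.0, 0.8))  # hypothetical lam, pi_h = 0.8
```

for n=6 and k=3 the capacity is 3+2·3=9 simultaneous connections; for k=n no hr connection is used and the capacity equals the number of channels.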
if, during the traffic process, two one-half time slots appear (i.e. two channels with only one half-rate call per channel), these two calls are gathered in one completely busy channel, while the other channel becomes completely idle (complete re-packing, [18]). figure 1 presents the model of a system with the possibility of hr connection realization. the number of traffic channels is n and the threshold when hr connection realization starts is k=3. the value of πh is 0<πh<1. this is a two-dimensional markov birth-death traffic model. the solution may not be obtained in closed form, because kolmogorov’s criterion for system reversibility is not satisfied [19]. kolmogorov’s criterion is satisfied in the special case of high traffic load when it is necessary to have k=0 to achieve satisfactory traffic loss. in that case the system remains two-dimensional and it is possible to realize circular flow between any set of four system states in both the clockwise and the counter-clockwise direction. for the system in fig. 1 this condition is not satisfied for states with fewer than 3 fr calls, because there is no circular flow in the counter-clockwise direction. there are two more special cases in which the system is simpler to analyze than the system from fig. 1. the first one is for a low traffic load, when there is no need to implement hr connections, but only fr. in this case, we obtain a one-dimensional traffic model with n+1 system states (the states are {0,0}, {1,0}, ..., {n,0}, i.e. it is always nh=0). the second special case is for a higher traffic load where k=0, while at the same time πh=1. under these conditions we also obtain a one-dimensional traffic model, with 2·n+1 system states, where only hr connections are realized (the states are {0,0}, {0,1}, ..., {0,2·n}, i.e. it is always nf=0).

[figure 1: state-transition diagram of the two-dimensional model; states {nf,nh} from {0,0} to {n,0} and {0,2n}; transition intensities λ, (1-πh)·λ, πh·λ and multiples of μ]

fig. 1 birth-death model of a system with fr and hr connection realization possibility

in the general case presented in fig. 1 the solution is obtained by solving the system of equations [10]. the equations in the stationary case are obtained by equating the total transition intensity out of some system state {nf,nh} with the total transition intensity into the same state. this is expressed by the equation

$$ p(n_f,n_h)\cdot\big(\lambda(n_f,n_h)+\mu(n_f,n_h)\big) = p(n_f-1,n_h)\cdot\lambda(n_f-1,n_h) + p(n_f,n_h-1)\cdot\lambda(n_f,n_h-1) + p(n_f+1,n_h)\cdot\mu(n_f+1,n_h) + p(n_f,n_h+1)\cdot\mu(n_f,n_h+1) \qquad (4) $$

where p(x,y) is the probability of state {x,y}, λ(x,y) is the intensity of new call generation in the state {x,y} and μ(x,y) is the intensity of call termination in the same state {x,y}.
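for the one-dimensional special case in which only fr connections are realized (states {0,0}, {1,0}, ..., {n,0}), balance equations of the kind shown in (4) have a well-known closed-form solution. the python sketch below is an illustration under the assumptions of poisson request arrivals and exponential call duration; the function names are ours. it computes the state probabilities of this special case and checks that the probability of the last state equals the loss probability from the standard erlang b recursion. it does not reproduce the authors' c simulation program.

```python
from math import factorial


def state_probabilities(offered_traffic, n):
    """stationary probabilities of states {0,0}, ..., {n,0} when only fr calls exist.

    each weight a**i / i! follows from the one-dimensional balance equations
    with a = lambda / mu; the weights are then normalized to sum to 1.
    """
    weights = [offered_traffic ** i / factorial(i) for i in range(n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def erlang_b(offered_traffic, n):
    """loss probability of the same system, computed by the erlang b recursion."""
    b = 1.0
    for m in range(1, n + 1):
        b = offered_traffic * b / (m + offered_traffic * b)
    return b


if __name__ == "__main__":
    # n = 6 channels, offered traffic of 1.5 e and 2 e
    for a in (1.5, 2.0):
        p = state_probabilities(a, 6)
        print(a, round(p[-1], 4), round(erlang_b(a, 6), 4))
```

for n=6 the loss is below 1% at 1.5 e and between 1% and 2% at 2 e, so in this sketch a system with 6 channels could serve 2 e with fr connections only if a 2% loss is allowed, but not if the limit is 1%.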
in order to solve the system of equations successfully, it is also necessary to insert the condition

$$ \sum_{\substack{n_f \ge 0,\; n_h \ge 0 \\ 2\cdot n_f + n_h \le 2\cdot n}} p(n_f,n_h) = 1 \qquad (5) $$

instead of one of the equations expressed by (4). besides solving the system of equations, the system from fig. 1 may also be analyzed by a simulation program. such a program is our original development, realized in the c programming language and executed on a commercial personal computer. the program is an improved version of the simulation program described and verified in [10] by comparing the values of state probabilities to the values obtained by solving the system of equations for a system with a relatively low number of channels.

4. method of analysis and the results

the main condition for performing the analysis presented in this paper is to define the connection scenario which allows an optimum relation among three variables: (1) offered traffic, (2) traffic loss and (3) as low a percentage of hr connections as possible (i.e. connection quality). the first step in the analysis is to determine the components of served traffic (total traffic as, traffic of fr connections asf and traffic of hr connections ash), as well as the loss probability of these three traffic components (ploss, plossf, plossh), for each value of predefined traffic, while changing the threshold number of channels k from 0 to n. these values of served traffic and traffic loss are obtained in the simulation process. in our analysis we predefined the probability πh=0.8. at the end of the simulation the components of offered traffic (total offered traffic ao, offered traffic of fr connections aof and offered traffic of hr connections aoh) are calculated as:

$$ a_o = \frac{a_s}{1 - p_{loss}} \qquad (6) $$

$$ a_{of} = \frac{a_{sf}}{1 - p_{lossf}} \qquad (7) $$

$$ a_{oh} = \frac{a_{sh}}{1 - p_{lossh}} \qquad (8) $$

the goal of this analysis is to determine the maximum values of as, asf and ash for which ploss is lower than the predefined value (1% or 2% in our simulation), and the corresponding value of k. the second step in the analysis is the calculation of the mean rating factor r for the values of asf and ash determined in step 1. this value is

$$ r_{mean} = \frac{(94 - i_{efr})\cdot a_{sf} + (94 - i_{ehr})\cdot a_{sh}}{a_{sf} + a_{sh}} \qquad (9) $$

the third step is the calculation of the mean data-flow value for the same values of asf and ash as in step 2. the implemented formula is

$$ df_{mean} = \frac{df_{fr}\cdot a_{sf} + df_{hr}\cdot a_{sh}}{a_{sf} + a_{sh}} \qquad (10) $$

and, finally, the fourth step is the calculation of our novel quality of service measure (qosm). its value is

$$ qosm = \frac{100\cdot a_o\cdot r_{mean}}{df_{mean}\cdot n} \qquad (11) $$

such an expression is formed because a system has better characteristics when it may serve higher traffic with higher achieved voice quality, but with lower data-flow. the division by the number of traffic channels is introduced in order to be able to compare the performance of systems with different numbers of traffic channels, meaning that this variable is expressed per one channel. this novel variable contains two factors related to the performance of the applied codec (data-flow rate and connection-rating factor). the third component is related to a more general element, which is specified for each telecommunication system independently of the implemented codec type: the offered, i.e. served, traffic. in this way the defined variable is not just related to codec properties, but is more comprehensive, describing the service in a system. table 1 presents the value of the threshold number of traffic channels (k) when the implementation of the hr codec starts. the value of k is determined as a function of the value of offered traffic (ao), which is defined at the beginning of the simulation for the systems with 6, 14, 22 and 30 traffic channels (n). these values of n are characteristic for the systems with 1, 2, 3 and 4 frequency carriers, respectively.
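steps two to four of the method, i.e. equations (9), (10) and (11), can be sketched in python as follows. this is a minimal illustration, not the authors' program: the function names are ours, the served-traffic values in the usage example are hypothetical, and the data-flow is expressed in bit/s, which is our assumption based on the unit e·s/bit·channel used for qosm in the figures.

```python
DF_FR = 13000.0  # fr data-flow in bit/s (13 kbit/s)
DF_HR = 5600.0   # hr data-flow in bit/s (5.6 kbit/s)
IE_FR = 20.0     # equipment impairment factor chosen for fr
IE_HR = 23.0     # equipment impairment factor chosen for hr


def mean_rating(a_sf, a_sh, ie_fr=IE_FR, ie_hr=IE_HR):
    """equation (9): traffic-weighted mean of r for fr and hr connections."""
    return ((94.0 - ie_fr) * a_sf + (94.0 - ie_hr) * a_sh) / (a_sf + a_sh)


def mean_data_flow(a_sf, a_sh, df_fr=DF_FR, df_hr=DF_HR):
    """equation (10): traffic-weighted mean data-flow in bit/s."""
    return (df_fr * a_sf + df_hr * a_sh) / (a_sf + a_sh)


def qosm(a_o, r_mean, df_mean, n):
    """equation (11): quality of service measure per traffic channel."""
    return 100.0 * a_o * r_mean / (df_mean * n)


if __name__ == "__main__":
    # hypothetical case: 6 channels, 1.5 e offered, all connections fr
    r = mean_rating(a_sf=1.5, a_sh=0.0)
    df = mean_data_flow(a_sf=1.5, a_sh=0.0)
    print(r, df, round(qosm(1.5, r, df, 6), 3))
```

in the all-fr case the sketch gives rmean=74 and dfmean=13 kbit/s, and a qosm of about 0.14 for 6 channels at 1.5 e; replacing part of the fr traffic by hr traffic lowers rmean slightly but lowers dfmean much more, which raises qosm.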
the values of k are presented for offered traffic loss ploss=1% and ploss=2%. the designations >1% and >2% (i.e. ploss>1% and ploss>2%) in table 1 mean that the predefined value of offered traffic loss may not be achieved regardless of k. as an example, let us consider the system with ao=17e and desired ploss=2%. according to table 1, it is not possible to achieve ploss=2% with systems with 6 and 14 traffic channels. when there are 22 traffic channels, it is necessary to have mixed fr and hr connections, and the threshold number of channels when hr connections start to be realized is k=20. in the systems with n=30 traffic channels all connections may be fr (k=30).

table 1 the threshold number of channels (k) as a function of predefined offered traffic (ao) and total number of channels (n)

ao(e)   ploss = 1%                  ploss = 2%
        n=6   n=14  n=22  n=30      n=6   n=14  n=22  n=30
1.5     6     14    22    30        6     14    22    30
2       5     14    22    30        6     14    22    30
2.5     4     14    22    30        5     14    22    30
3       3     14    22    30        4     14    22    30
3.5     2     14    22    30        4     14    22    30
4       1     14    22    30        3     14    22    30
4.5     >1%   14    22    30        2     14    22    30
5       >1%   14    22    30        >2%   14    22    30
6       >1%   14    22    30        >2%   14    22    30
7       >1%   14    22    30        >2%   14    22    30
8       >1%   13    22    30        >2%   14    22    30
9       >1%   12    22    30        >2%   13    22    30
10      >1%   11    22    30        >2%   12    22    30
11      >1%   10    22    30        >2%   11    22    30
12      >1%   9     22    30        >2%   11    22    30
13      >1%   8     22    30        >2%   10    22    30
14      >1%   6     21    30        >2%   9     22    30
15      >1%   >1%   20    30        >2%   7     21    30
16      >1%   >1%   20    30        >2%   >2%   21    30
17      >1%   >1%   19    30        >2%   >2%   20    30
18      >1%   >1%   19    30        >2%   >2%   20    30
19      >1%   >1%   18    30        >2%   >2%   19    30
20      >1%   >1%   18    30        >2%   >2%   19    30
21      >1%   >1%   17    29        >2%   >2%   18    30
22      >1%   >1%   16    28        >2%   >2%   18    29
23      >1%   >1%   15    28        >2%   >2%   17    29
24      >1%   >1%   14    27        >2%   >2%   16    28
25      >1%   >1%   12    27        >2%   >2%   16    28
26      >1%   >1%   >1%   26        >2%   >2%   15    27
27      >1%   >1%   >1%   26        >2%   >2%   13    27
28      >1%   >1%   >1%   25        >2%   >2%   >2%   27
29      >1%   >1%   >1%   25        >2%   >2%   >2%   26
30      >1%   >1%   >1%   25        >2%   >2%   >2%   26
31      >1%   >1%   >1%   24        >2%   >2%   >2%   26
32      >1%   >1%   >1%   24        >2%   >2%   >2%   25
33      >1%   >1%   >1%   23        >2%   >2%   >2%   25
34      >1%   >1%   >1%   22        >2%   >2%   >2%   24
35      >1%   >1%   >1%   21        >2%   >2%   >2%   24
36      >1%   >1%   >1%   20        >2%   >2%   >2%   23
37      >1%   >1%   >1%   17        >2%   >2%   >2%   23
38      >1%   >1%   >1%   >1%       >2%   >2%   >2%   22
39      >1%   >1%   >1%   >1%       >2%   >2%   >2%   20

figure 2 presents the relation between ao, dfmean and rmean for a system with 6 traffic channels when traffic loss is limited to 2%. the threshold number of channels is determined according to the data in table 1.
the values of dfmean and rmean are determined on the basis of expressions (10) and (9), respectively, using asf and ash obtained in the simulation process. after that, the values of qosm are presented in fig. 3 according to (11) for the same values of ao, dfmean and rmean as the ones used for the graphic in fig. 2. the results are presented both for traffic loss less than 1% and less than 2%.

[figure 2: 3d plot of rmean (bands from 70-71 to 74-75) versus ao (e), from 1.5 to 4.5, and dfmean (kbit/s), from 13 down to 8.02]

fig. 2 mean connection-rating factor rmean as a function of offered traffic ao and mean data-flow dfmean for a system with 6 traffic channels and probability of traffic loss ploss<2%

[figure 3: qosm (e·s/bit·channel), from 0 to 0.7, versus ao (e), from 1.5 to 4.5; curves for ploss=1% and ploss=2%]

fig. 3 qosm as a function of offered traffic ao for a system with 6 traffic channels

figures 4 and 5 present the same variables as fig. 2 and fig. 3, but for 14 traffic channels. after that, similar analysis results are presented in fig. 6 and fig. 7 for the system with 22 traffic channels and in fig. 8 and fig. 9 for the system with 30 traffic channels.

[figure 4: 3d plot of rmean (bands from 70-71 to 74-75) versus ao (e), from 7 to 15, and dfmean (kbit/s), from 13 down to 7.58]

fig. 4 mean connection-rating factor rmean as a function of offered traffic ao and mean data-flow dfmean for a system with 14 traffic channels and probability of traffic loss ploss<2%

[figure 5: qosm (e·s/bit·channel), from 0 to 1.2, versus ao (e), from 7 to 15; curves for ploss=1% and ploss=2%]

fig. 5 qosm as a function of offered traffic ao for a system with 14 traffic channels

[figure 6: 3d plot of rmean (bands from 70-71 to 74-75) versus ao (e), from 13 to 27, and dfmean (kbit/s), from 13 down to 7.42]

fig. 6 mean connection-rating factor rmean as a function of offered traffic ao and mean data-flow dfmean for a system with 22 traffic channels and probability of traffic loss ploss<2%

[figure 7: qosm (e·s/bit·channel), from 0.2 to 1.2, versus ao (e), from 13 to 27; curves for ploss=1% and ploss=2%]

fig. 7 qosm as a function of offered traffic ao for a system with 22 traffic channels

[figure 8: 3d plot of rmean (bands from 70-71 to 74-75) versus ao (e), from 20 to 38, and dfmean (kbit/s), from 13 down to 7.9]

fig. 8 mean connection-rating factor rmean as a function of offered traffic ao and mean data-flow dfmean for a system with 30 traffic channels and probability of traffic loss ploss<2%

[figure 9: qosm (e·s/bit·channel), from 0 to 1.4, versus ao (e), from 20 to 40; curves for ploss=1% and ploss=2%]

fig. 9 qosm as a function of offered traffic ao for a system with 30 traffic channels

[figure 10: qosm (e·s/bit·channel), from 0 to 1.2, versus ao (e), from 0 to 40; curves for 6, 14, 22 and 30 channels]

fig. 10 qosm as a function of offered traffic ao for systems with 6, 14, 22 and 30 traffic channels and probability of traffic loss ploss<2%, supposing that qosm=0 if ploss=2% may not be achieved

figure 10 illustrates the comparative results of qosm for systems with 6, 14, 22 and 30 channels when ploss<2%. the situation when ploss>2% is treated as a failure to satisfy the specified characteristics, which is why the value of qosm in that case suddenly drops to 0. the system dimension (i.e. the number of traffic channels) is selected to achieve the highest value of qosm. for example, if ao=6e, we have to choose the system with 14 channels, because it has a higher qosm than the other 3 systems.

5. model implementation in 3g and 4g systems

the model developed in this paper is also applicable to the analysis of 3g and 4g systems. the calculation procedure is presented here for gsm (2g) systems and their fr and hr codecs. in order to discuss the applicability of the model in 3g and 4g systems, it is first necessary to perform a brief survey of the codecs which are implemented in 3g and 4g systems. the adaptive multi-rate (amr) codec is usually applied in 3g and 4g systems when the standard telephony frequency bandwidth 300hz-3400hz is applied [20]. here we are focused on the amr codec, as it is comparable to the fr and hr codecs which we considered in the previous sections. at this moment we do not include amr wb (adaptive multi-rate wideband), evrc (enhanced variable rate codec) and other codec types intended for wideband signal coding (up to 7khz or more), nor other narrow-band signal codecs which are used less than amr (for example, ilbc (internet low bit-rate codec)). the amr codec is based on the implementation of the fr and hr codecs, but with a voice bit-rate (codec mode) that is adaptable according to signal transmission conditions [20]. there are 8 different codec modes and their bit-rates are between 4.75kbit/s and 12.2kbit/s.
after selecting the corresponding codec mode, the coded signal is transmitted in an fr or hr channel (channel mode). it may be said that in 3g systems the voice signal is transmitted “traditionally”, as in gsm systems. when the codec bit-rate is decreased, more bits are spared for error protection. on the basis of this consideration, it is possible to conclude that equation (10) remains valid, with the same values of data-flow, when the amr codec in 3g systems is considered, as fr and hr channels are still implemented. equation (9) is also valid, but the concrete values of ie are changed and they depend on the voice bit-rate. these values are between 19 when the voice bit-rate is 4.75kbit/s and 3 when the voice bit-rate is 12.2kbit/s [21], as calculated on the basis of the presented connection-rating factor. they are improved compared to the values already emphasized for implementation in gsm systems. the values of ie for the amr codec are not always explicitly defined. for example, in [22] these values are different in various performed tests and the best results approach the values from [21]. the values of ie in [22] may be calculated on the basis of the presented values of mean opinion score (mos). the part of our analysis related to traffic demands in the model structure does not depend on codec modifications. there are still two channel types and equations (6)-(8) are valid. the value of πh additionally depends on signal transmission conditions, as these conditions influence the choice of the corresponding voice bit-rate. finally, the main equation (11) may also be applied for the calculation of the qosm factor. the technology of signal transmission is somewhat different in the case of 4g systems. the applied codecs are predominantly the same as in 3g systems [23], meaning that the amr codec is important for our analysis. however, the applied frame structures are different, as they are based on voip technology [24].
more precisely, it is voice over long term evolution (volte) [25]. as a consequence, we may say that the part of our model related to voice quality, i.e. equation (9), remains the same as for 3g systems. the considerations dealing with data-flow rate (equation (10)) are beyond the scope of this paper and will be studied in the future. overall, the detailed analysis of the presented model in 3g and 4g systems will be the subject of our future development. besides the analysis for the amr codec, this analysis should also incorporate the adaptive multi-rate wideband (amr wb) codec, as it is emphasized in [25] that its implementation is mandatory in volte systems.

6. conclusion

in this paper, we analyzed the performance of systems with mixed traffic realization, where two different traffic components are defined by the implementation of two codec types. the components of traffic (i.e. the two applied codecs) differ in connection quality. up to the traffic threshold, which is determined according to the allowed traffic loss, only the better quality codec is implemented for connection realization. above it, for higher traffic, both codec types are applied. an example of such a system in mobile telephony, which is analyzed in this paper, is a system where the full-rate (fr) codec has a better quality and the half-rate (hr) codec has a lower quality. the three main quality elements of such a system, offered traffic (ao), mean connection quality (rmean) and mean data-flow (dfmean), are mutually dependent and their relation is demonstrated by a 3d graph. after this first contribution, the second, more important contribution is the definition of a novel qosm, which allows us to express the cumulative influence of the three previously cited system performance elements by one value. this is a novel approach, which may also be implemented in other situations when it is necessary to analyze the common influence of several factors.
the scope of the analysis is to facilitate the design and mutual comparison of mobile systems. the variable qosm is defined in such a way that the number of necessary traffic channels is selected easily on the basis of the highest qosm value from the graph. according to its definition and in the real physical sense, qosm is increased when ao and rmean are increased, but also when dfmean is decreased. on the basis of the graphs from this paper, the benefits of better system utilization, when the offered and served traffic are increased while at the same time the necessary data-flow is decreased, significantly outweigh the effects of the simultaneous connection quality degradation. there are two directions of our future activities. the first one is connected with the development already performed. the model presented in this paper is universal for implementation in mobile telephony. in our analysis we are limited to gsm systems. we showed by a brief survey that the method is applicable to 3g and 4g systems. model adaptation for such an application will be the subject of future development. the second direction of our activities is the further improvement of the model by involving new elements. according to the considerations in the introductory section of this paper, the important elements may include energy consumption, transmission delay and the applied frequency spectrum bandwidth.

acknowledgement: the paper is realized in the framework of the projects tr32051 and tr32007, which are cofinanced by the ministry of education, science and technological development of the republic of serbia.

references

[1] i. vidaković and t. šuh, “proposition of new criteria for estimation of voice coders”, tehnika, vol. 58, no. 6, pp. 1–5, 2009, in serbian. [2] itu-t, recommendation g.107, “the e-model, a computational model for use in transmission planning”, series g: transmission systems and media, digital systems and networks, 2015. [3] h.
assem, “assessing and improving the vvoip call quality”, master of science thesis, hamilton institute, national university of ireland maynooth, 2013. [4] t. daengsi and p. wuttidittachotti, “qoe modeling: a simplified e-model enhancement using subjective mos estimation model”, in proceedings of the conference icufn2015, sapporo, japan, 2015. [5] s. möller, “assessment and prediction of speech quality in telecommunications”, springer-science + business media, b. v., isbn 978-1-4419-4989-9, 2000. [6] o. nipp, m. kuhn, a. wittneben and t. schweinhuber, “speech quality evaluation and benchmarking in cellular mobile networks”, in proceedings of the ieee 2007 mobile and wireless communications summit, budapest, hungary, 1-5 july 2007, pp. 1–5. [7] c. e. otero, i. kostanic, l. d. otero, s. l. meredith, “characterization of user-perceived quality of service (qos) in mobile devices using network pairwise comparisons”, international journal of wireless & mobile networks (ijwmn), vol. 2, no. 3, pp. 141–153, 2010. [8] r. kadioğlu, y. dalveren, a. kara, “quality of service assessment: a case study on performance benchmarking of cellular network operators in turkey”, turkish journal of electrical engineering & computer science, vol. 23, pp. 548–559, 2015. [9] a. lebl, d. mitić, p. petrović, v. matić, m. mileusnić and ž. markov, “the application of equal quality characteristics ‘delay-echo-packet loss’ to internet voice connection planning”, in proceedings of the 15th international symposium infoteh jahorina 2016, 16-18.iii 2016, pp. 284–289. [10] d. mitić, a. lebl, m. mileusnić, b. trenkić and ž. markov, “traffic simulation of gsm cells with half-rate connection realization possibility”, journal of electrical engineering, vol. 67, no. 2, pp. 95–102, 2016. [11] m. a. habibi, w. nasimi, b. han and h. d.
schotten, “a comprehensive survey of ran architecture toward 5g mobile communication system”, ieee access, vol. 7, pp. 70371–70421, 2019. [12] b. g. gopal , p. g. kuppusamy, “a comparative study on 4g and 5g technology for wireless applications”, iosr journal of electronics and communication engineering (iosr-jece), vol.10, issue 6, pp. 67–72, 2015. [13] r. a. aljiznawi, n. h. alkhazaali, s. qasim jabbar and d. j. kadhim, “quality of service (qos) for 5g networks”, international journal of future computer and communication, vol. 6, no. 1, pp. 27–30, 2017. [14] agilent technologies, „optimizing your gsm network today and tomorrow, using drive testing to estimate downlink speech quality”, application note 1325, july 2001. [15] itu-t recommendation g.113, “series g: transmission systems and media, digital systems and networks: transmission impairments due to speech processing”, november 2007. [16] itu-t, sg12 – d.106, „estimates of ie and bpl parameters for a range of codec types”, telchemy incorporated, january 2003. [17] a. kovac, m. halas, m. orgon and m. voznak, “emodel mos estimate improvement through jitter buffer packet loss modelling”, advances in electrical and electronic engineering, vol. 9, no. 5, pp. 233–242, special issue, 2011. [18] e. m. m. winands, j. wieland and b. sanders, “dynamic half-rate connections in gsm”, aeü international journal of electronics and communications, vol. 60, no. 7, pp. 504–512, july 2006. [19] w. b. iversen, “teletraffic engineering and network planning”, technical university of denmark, dtu course 34340, 2015.4. [20] t. koistinen, “voice coding in 3g networks”, ip telephony protocols, architectures and issues, helsinki university of technology networking laboratory, report 2/2001, pp. 39–46. [21] voice codecs, https://www.gl.com/voice-codecs.html. [22] a. rämö and h. 
toukomaa, “on comparing speech quality of various narrow- and wideband speech codecs”, in proceedings of the eighth international symposium on signal processing and its applications, 28-31 august 2005, sydney, australia, pp. 603–606. [23] j. abichandani, j. baenke, m. s. irizarry, n. saxena, p. vyas, s. prasad, s. mada and y. z. tafesse, “a comparative study of voice quality and coverage for voice over long term evolution calls using different codec mode-sets”, ieee access, vol. 5, june 2017, pp. 10315–10322. [24] s. malisuwan, d. milindavanij and w. kaewphanuekrungsi, “quality of service (qos) and quality of experience (qoe) of the 4g lte perspective”, international journal of future computer and communication, vol. 5, no. 3, june 2016, pp. 158–162. [25] d. h. nguyen, “enhancing and improving voice transmission quality over lte network: challenges and solutions”, doctorat en co-accréditation télécom sudparis – institut mines-télécom et l’université pierre et marie curie – paris 6, 24 february 2017.

industrial wsn as a tool for remote on-line monitoring

facta universitatis series: electronics and energetics vol. 30, no. 1, march 2017, pp. 107–119 doi: 10.2298/fuee1701107n

industrial wireless sensor networks as a tool for remote on-line management of power transformers' heating and cooling process

aleksandar nikolić 1, nataša nešković 2, radoslav antić 1, ana anastasijević 2
1 university of belgrade, electrical engineering institute nikola tesla, serbia
2 university of belgrade, school of electrical engineering, serbia

abstract. an industrial wireless sensor network used for supervising the cooling system of a high power transformer is presented in the paper. because the thermal power plant where the industrial prototype is installed is a very noisy environment, many problems had to be solved in order to obtain high reliability and accuracy of the system.
the results presented in the paper were obtained at a real thermal power plant, where the presented wireless sensor network based on-line monitoring system is used for continuous management of power transformers’ heating and control of their cooling systems. results obtained during long-term operation of the system confirm its stability, accuracy and the improvement it brings to power plant operation.

key words: wireless sensor networks, power transformers, power plants, management, on-line monitoring, remote control

received april 10, 2016; received in revised form may 30, 2016
corresponding author: aleksandar nikolić, electrical engineering institute nikola tesla, university of belgrade, 8a koste glavinića, serbia (e-mail: anikolic@ieent.org)

1. introduction

global processes of liberalization and deregulation of the energy sector have established new technical and technological requirements for research and development centers all over the world. the imperative requirements are increased energy efficiency, reliability and availability of energy resources. in the field of testing and diagnostics, individual measurements are being replaced by integrated models. time-planned maintenance is being replaced by condition-based maintenance that takes risk assessment into account using information technologies (databases, intranet, internet). the modern energy market requires the integration of several scientific disciplines and technologies: energetics, electronics, informatics, metrology, standardization, management. significant savings could be achieved by preventing malfunctions and breakdowns through the introduction of on-line diagnostics and a condition-based maintenance strategy. on-line monitoring gives timely information about the process and supports further decisions about process operation [1]. this type of diagnostics has become very important in large power plants and for their key equipment, especially high-power transformers [2].
generator power transformers are the largest units in power plants, since their capacity can reach 1400 mva. nowadays, two approaches to thermal management of such transformers are used. the expensive solution is based on optical sensors mounted in the transformer windings during manufacturing or repair. the other solution relies on a mathematical model of the transformer and calculates the highest temperature in the transformer (i.e. the hot-spot temperature) while measuring only the transformer top-oil temperature and load current [3]. the solution presented in this paper is based on calculation of the hot-spot temperature using the measured temperature of the transformer oil at the top of the housing, the ambient temperature and the load current. real-time calculation of the hot-spot temperature and transformer cooling control are implemented in an industrial-type programmable controller. the temperature sensors are industrial pt100 sensors mounted on the pipes in which the transformer oil circulates. since the system also controls transformer cooling, it is wise to put one sensor at the input of the cooling device and another at its output. in that case, besides cooling control, more information about the operation of the cooling device can be obtained, such as a fan malfunction if the temperatures at the input and output are nearly the same. studies show that up to 90% of actionable process and environmental data remains uncollected. wired monitoring systems are expensive and unrealistic in challenging physical environments, and manual monitoring has proven to be simply cost-prohibitive [4]. during the analysis prior to implementation of one such system in a power plant, it was found that cabling would be very difficult and expensive and could lead to a very complicated and unsuitable system. for that reason, the application of a wireless solution was investigated. the reliability of wireless networks is set by the quality of the radio link between the central access point and each endpoint [4].
as the simplest and most reliable solution, a wireless sensor network based system is proposed, and the results of its implementation and its performance in real conditions are presented in the paper.

2. temperature monitoring of power transformers

in a transformer, heat is spread predominantly by convection. determination of the temperature distribution in a transformer, as the dominant loading factor, is a very complicated task. temperature differs among the functional parts of the transformer (winding, core, tank and oil) and also varies within the volume of each part. the maximum temperature occurring in any part of the winding insulation system is called the "hot-spot temperature". this parameter represents the thermal limitation of loading of the transformer. for on-line temperature monitoring it is suitable to calculate the hot-spot temperature using a differential equation with the load factor and ambient temperature as time variables [1]. the real-time algorithm is developed using the differential-equation concept from the iec 60076-7 standard, since it is suitable for on-line monitoring [3]. the load factor and ambient temperature are time-dependent variables and there is no limit regarding the loading profile. if temperature rises are calculated using exponential functions, the expression for the hot-spot temperature is given in equation (1):

$$\theta_h(t) = \theta_a + \Delta\theta_{oi} + \left\{ \Delta\theta_{or} \left[ \frac{1 + R K^{2}}{1 + R} \right]^{x} - \Delta\theta_{oi} \right\} f_1(t) + \Delta\theta_{hi} + \left\{ H g_r K^{y} - \Delta\theta_{hi} \right\} f_2(t) \qquad (1)$$

where $\Delta\theta_{hi}$ is the hot-spot temperature rise (over top oil) at the start, $f_1(t)$ is the function describing the top-oil temperature increase and $f_2(t)$ is the function describing the hot-spot temperature increase relative to the top-oil temperature.

2.1. importance of power transformer on-line monitoring

as mentioned in the introduction, power transformers in power plants, especially generator power transformers, are among the most important units in the power system.
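returning to the hot-spot expression of equation (1), the calculation can be sketched in python as follows. this is a minimal illustration under stated assumptions, not the controller code used in the plant: the text does not define $f_1(t)$ and $f_2(t)$, so the exponential forms below are the ones we recall from iec 60076-7 for a load step increase, the names `ThermalParams`, `f1`, `f2` and `hot_spot` are hypothetical, and any numerical values supplied by a caller are placeholders rather than data of the monitored transformers.

```python
import math
from dataclasses import dataclass


@dataclass
class ThermalParams:
    """Transformer constants used in equation (1); values are supplied by the caller."""
    d_theta_or: float  # top-oil temperature rise at rated losses, K
    R: float           # ratio of load losses to no-load losses at rated current
    H: float           # hot-spot factor
    g_r: float         # average winding-to-oil gradient at rated current, K
    x: float           # oil exponent
    y: float           # winding exponent
    k11: float         # thermal model constant
    k21: float         # thermal model constant
    k22: float         # thermal model constant
    tau_o: float       # oil time constant, min
    tau_w: float       # winding time constant, min


def f1(t: float, p: ThermalParams) -> float:
    """Relative top-oil temperature increase (assumed IEC 60076-7 form)."""
    return 1.0 - math.exp(-t / (p.k11 * p.tau_o))


def f2(t: float, p: ThermalParams) -> float:
    """Relative hot-spot-to-top-oil gradient increase (assumed IEC 60076-7 form)."""
    return (p.k21 * (1.0 - math.exp(-t / (p.k22 * p.tau_w)))
            - (p.k21 - 1.0) * (1.0 - math.exp(-t / (p.tau_o / p.k22))))


def hot_spot(t: float, K: float, theta_a: float,
             d_theta_oi: float, d_theta_hi: float, p: ThermalParams) -> float:
    """Hot-spot temperature after t minutes at constant load factor K, per equation (1)."""
    top_oil_final = p.d_theta_or * ((1.0 + p.R * K ** 2) / (1.0 + p.R)) ** p.x
    gradient_final = p.H * p.g_r * K ** p.y
    return (theta_a
            + d_theta_oi + (top_oil_final - d_theta_oi) * f1(t, p)
            + d_theta_hi + (gradient_final - d_theta_hi) * f2(t, p))
```

two limiting cases give a quick sanity check: at t = 0 the result reduces to the ambient temperature plus the two initial rises, and for rated load (K = 1) and long times it approaches the ambient temperature plus the rated top-oil rise plus H·g_r.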
maintenance of these transformers is complicated and expensive, and nowadays one has to wait more than 2-3 years for the production of a new generator transformer. monitoring and supervising systems for these transformers are very useful and necessary in order to improve efficiency and reliability and to reduce the risk and cost of unexpected failure [2]. monitoring the transformer throughout its service life can provide accurate failure analyses while extending the life of the assets. actual conditions drive maintenance and repair and make it possible to estimate additional operational costs. saving relevant data for further analysis and building up historical data is valuable for improving on-line diagnostics and creating decision-making expert systems.

2.2. on-line diagnostics models

the analysis of the failure modes of the various components leads to a review of the inspection and maintenance procedures of power transformers. on-line diagnostic condition assessment addresses common failure modes by means of:
• multiple sensors,
• multiple on-line models,
• automatic and continuous recording of all parameters,
• trend and limit alarms.

on-line models are focused on the main tank of the transformer [3]. these models rely on various sensors installed on the transformer and in the substation, combined with other manually entered parameters. this data is then fed into industry-standard, accepted models, which calculate the various outputs.

2.2.1. load current model

the load current model accepts measurements of the winding current(s) at its input and provides trending and alarms for the particular load current at its output (fig. 1).

2.2.2. winding temperature model

the winding temperature model accepts top-oil and ambient temperature measurements and the measurement of two or three winding currents at its inputs. additionally, some fixed parameters have to be entered manually as model inputs: rated hot-spot temperature (hst) rise, rated load current and winding characteristics.
on its output it provides trending and alarms of the hot-spot temperature for each transformer winding (fig. 2).

fig. 1 load current model
[block diagram. sensors: r winding current; s and t winding currents (optional). rules: measurement on one or three phases; load current is measured every second and averaged over one minute; the average value is used for the top-oil temperature calculation; the highest value is used for the hot-spot temperature calculation. output: minute average current on each winding; average current of phases a, b, c; maximum current of phases a, b, c; display and trending; warnings and alarms.]

fig. 2 winding temperature model
[block diagram. sensors: top-oil temperature; ambient temperature; r winding current; s and t winding currents (optional). fixed parameters: rated hst rise, rated load current, winding characteristics. rules: continuously computes the winding hottest-spot temperature on each winding; calculations are done using proven algorithms from ieee and iec loading guides; additional fine-tuning of the algorithm is based on transformer manufacturer data. output: hottest-spot temperature for each winding; display and trending; warnings and alarms.]

2.2.3. cooling control model

the cooling control model accepts the top-oil temperature measurement and the measurement of two or three winding currents at its inputs. optionally, a signal about the cooling stage status can also be introduced. additionally, some fixed parameters have to be entered manually as model inputs: top-oil temperature set point, hot-spot temperature set point and load current set point. on its output it provides cooling stage on/off control and status, display and trending, and status alarms (fig. 3).

fig. 3 cooling control model
[block diagram. sensors: top-oil temperature; r winding current; s and t winding currents (optional). fixed parameters: load current set point, cooling stage status, top-oil temperature set point, hot-spot temperature set point. rules: the cooling system can be initiated from either the top-oil temperature, the load current or the winding hot-spot temperature; cooling control can detect discrepancies and raise an alarm in the case of cooling malfunction. output: stage(s) on/off control; control vs. status discrepancy alarm; display and trending.]

2.2.4. cooling efficiency model

the cooling efficiency model accepts the top-oil temperature measurement and the measurement of two or three winding currents at its inputs. optionally, a signal about the cooling stage status can also be introduced. additionally, some fixed parameters have to be entered manually as model inputs: rated top-oil temperature rise, top-oil time constant, ratio of load losses over no-load losses and oil exponent. on its output it provides information about the top-oil temperature discrepancy, a warning about deficiency of the cooling system, and display and trending information (fig. 4).

fig. 4 cooling efficiency model
[block diagram. sensors: top-oil temperature; ambient temperature; r winding current. fixed parameters: ratio of load losses over no-load losses, cooling stage status, rated top-oil temperature rise, top-oil time constant, oil exponent. rules: the theoretical top-oil temperature is calculated according to ieee methods; a warning is initiated if the top-oil temperature is too high, indicating malfunction of the cooling system. output: top-oil temperature discrepancy; cooling deficiency warning; display and trending.]

3. industrial communication systems

industrial communication networks are often required to provide tight performance figures in terms of both real time and determinism.
this is a consequence of the application fields in which they are typically employed, such as motion control, factory automation, manufacturing, and networked control systems [5]. until recently, industrial networks were mostly wired. several network solutions were defined, from serial communication known as rs-485 to digital industrial automation protocols like the hart communications protocol (highway addressable remote transducer protocol), fieldbus and profibus (process field bus). subsequently, at the end of the 1990s, field networks based on the well-known ethernet technology started to be introduced. they are characterized by strong performance figures in that they are able to provide high transmission rates (typically up to 100 mb/s), very limited and predictable transfer times, high determinism, and low jitter. one of the extensions is ethercat (ethernet for control automation technology), an open high-performance ethernet-based fieldbus system.

3.1. wireless communications in industry

recently, wireless networks started being considered an interesting solution for communication at the device level as well. among the first applications was the wireless control of cranes in warehouses, where proprietary radios achieved flexible control of moving devices. during the past decade, standardized radio technologies like wireless lan (ieee 802.11), wireless hart (ieee 802.15.4) and bluetooth technology (ieee 802.15.1) have become the dominating technologies for industrial use. no single wireless technology offers all the features and strengths that fit the various industrial application requirements, so standardized wireless technologies, such as wireless lan, bluetooth and wireless hart (as well as a number of proprietary technologies), are all used in practice [6].

3.2.
wireless sensor networks principle

a wireless sensor network (wsn) is a wireless network consisting of distributed autonomous devices using sensors to cooperatively monitor physical or environmental conditions, such as temperature, sound, vibration, pressure, motion or pollutants, at different locations. in addition to one or more sensors, each node in a sensor network is typically equipped with a radio transceiver or other wireless communications device, a small microcontroller, and an energy source, usually a battery. although battery supply is one of the characteristic features of wsn sensors, in industrial applications it is wiser to use an external, stable dc supply. wireless sensor networks (wsns) have quickly become an area of great research interest for both industry and academia. nowadays, the enormous potential of this technology can be easily seen, along with its inherent difficulties. in fact, the massachusetts institute of technology recently classified wsns as one of the 10 emerging technologies that will change the world [7]. sensor nodes are connected wirelessly to the gateway in the center, which performs data acquisition and analysis [8]. the connection of a wireless sensor network to other networks, usually ethernet, is realized via a communication module that acts as the gateway. at the top of the wireless sensor network hierarchy (where the gateway function has to be realized) it is possible to use a communication module with a programmable controller and memory to perform the complete processing of data collected from the sensor nodes and, possibly, control and management [9]. a simple example of a data acquisition system based on wsn is shown in fig. 5. fig.
5 example of data acquisition system with wireless data transfer based on wsn

the number of sensor nodes in a wsn can be further increased if some of the sensor modules are configured as mesh routers. in that case, besides transferring data from the sensors directly connected to it, such a module also relays data from other sensor modules in its proximity. the wsn node then routes data packets along their path from the source to the destination [10], as shown in fig. 5. this provides an easy way to expand an existing wsn, both functionally and in terms of spatial coverage [10], [11].

4. realization of wsn based monitoring system

the proposed data acquisition and control system for temperature monitoring of six power transformers and cooling control of two generator power transformers is based on a wireless sensor network. according to the available literature, there is only one similar application in a nuclear power plant in the usa [12], and a few applications in power plants in china [13], [14]. the measuring points in this system are mostly located on the transformers themselves, with a maximum distance of less than 50 m for each transformer. the distance from the transformers to the control room ranges from 120 m to 200 m. in that case, developing a sensor network with communication links such as gprs/gsm is not viable because the consumer pays monthly charges for connectivity. finally, a wireless sensor network is defined using zigbee communication between the nodes at the transformers and the receivers in the control room. the low-power zigbee communication protocol is based on the ieee 802.15.4 standard and uses the free 2.4 ghz ism band [10]. this makes it viable to read a large number of nodes and justifies the implementation and operation costs compared to the benefits. the ieee 802.15.4 standard defines two layers, the mac layer and the physical layer (phy), and uses three license-free frequency bands.
these license-free bands have a total of 27 channels: 16 channels at 2.4 ghz with data rates of 250 kbps, 10 channels at 902 to 928 mhz with data rates of 40 kbps, and one channel at 868 to 870 mhz with a data rate of 20 kbps. however, only the 2.4 ghz band operates worldwide; the others are regional bands. the 868–870 mhz band operates in europe, while the 902–928 mhz band operates in north america, australia, and other countries [7]. for the proposed system in the thermal power plant, the chosen wsn sensors are of an industrial type that operates in the 2.4 ghz band, since it is globally available and license-free. the choice is not based on spectral content, although according to recommendation itu-r p.372 industrial noise of any type disappears at 900 mhz [15]. in a thermal power plant and its surroundings there is usually no large number of systems operating in the 2.4 ghz frequency band that could cause interference and endanger the stability of the installed wsn. the other reason is that the 2.4 ghz ism band has the highest data rate, 250 kbps (compared with 40 kbps at 915 mhz and 20 kbps at 868 mhz), and supports 16 channels (10 at 915 mhz and only 1 at 868 mhz). these wsn devices are designed to monitor assets or environments in outdoor or harsh settings and hard-to-reach places. they can operate in the industrial temperature range (-40 °c to 70 °c), which was also confirmed in the case of the proposed system. in addition, a variety of international safety, electromagnetic compatibility, and environmental certifications and ratings are available for these devices. finally, the selected equipment conforming to the ieee 802.15.4 standard allows development of an independent system that requires neither significant investment in infrastructure nor the monthly payments of a mobile-network-based system.
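the band figures quoted above can be collected in a small lookup table, which makes the comparison behind the choice of the 2.4 ghz band explicit. this is only an illustrative sketch: the dictionary `BANDS` and its keys are hypothetical names, and the numbers are simply those stated in the text.

```python
# IEEE 802.15.4 license-free bands as quoted in the text above.
BANDS = {
    "868 MHz": {"channels": 1, "rate_kbps": 20, "region": "europe"},
    "915 MHz": {"channels": 10, "rate_kbps": 40,
                "region": "north america, australia, other countries"},
    "2.4 GHz": {"channels": 16, "rate_kbps": 250, "region": "worldwide"},
}

# Total number of channels over all three bands.
total_channels = sum(band["channels"] for band in BANDS.values())

# Band with the highest data rate.
fastest_band = max(BANDS, key=lambda name: BANDS[name]["rate_kbps"])

if __name__ == "__main__":
    print(f"total channels: {total_channels}")
    print(f"highest data rate: {fastest_band} "
          f"({BANDS[fastest_band]['rate_kbps']} kbps, {BANDS[fastest_band]['region']})")
```

the per-band channel counts add up to the 27 channels mentioned in the text, and the 2.4 ghz band comes out ahead on both data rate and channel count.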
although sensors in a wsn can be battery supplied and operate for several months on standard aa batteries, in the presented system the wsn nodes and gateways are supplied from an external 24 v dc source using industrial-grade supplies with isolated input/output. the line voltage for these supplies is taken from the plant’s secured (uninterruptible) supply. batteries were used to supply the nodes only during installation, since at that time the particular transformer and its whole supply were switched off for the maintenance procedure. the sensor network consists of several measuring nodes per transformer. one analog-input node measures the transformer current on one of its analog inputs and controls 4 cooling units via digital outputs. the temperature measuring nodes are connected to pt100 sensors that measure the top-oil temperature, the input and output temperatures of the cooling units and the ambient temperature. the receivers (one per transformer) are placed in the plant control room at points from which the transformer can be seen, in order to avoid degraded signal reception. one receiver is equipped with a microcontroller and memory, so it is used for deployment of the real-time algorithm. both receivers communicate with each other and with the supervising computer mounted in the control room via ethernet. the positions of the sensor nodes and wsn gateways are shown in fig. 6 using an aerial view of the power plant. labels on a white background denote power transformers, where 5t and 3t are generator transformers, 25t and 23t are their corresponding self-consumption transformers, respectively, while 1t and 2t are common group transformers. the locations of the wsn nodes are marked with yellow circles, while the locations of the wsn gateways are marked with yellow squares. the application that runs on the supervising pc communicates with the receivers via the modbus tcp/ip protocol, displays all data needed by the operators and stores the values in a mysql database.
in order to provide remote supervision and application modifications, the whole system is connected to the internet via the power plant lan. in fig. 7 a part of the realized system is shown for a 100 mva generator transformer. the photo is taken from the control room where the main receiver with the real-time application is mounted. on the right side of fig. 7 an open industrial enclosure with ip65 protection is shown. the equipment installed in the enclosure is designated as follows: wsn nodes – 1, wsn antennas – 2, isolated 24 v dc power supply – 3, measurement transmitter for translating ma signals into mv – 4, relays for switching fans and oil pumps and reading the switching status – 5, 230 v ac socket for supplying instruments and tools during maintenance – 6.

fig. 6 disposition of wsn nodes and gateways in thermal power plant

fig. 7 industrial wsn based generator power transformer thermal management system

5. experimental verifications

during the analysis of signal reception from the nodes at the receiver, it was found that several precautions should be taken in complex industrial environments such as a thermal power plant. each node should be placed in such a way that it can be seen from the place where the receiver is mounted. this is important to avoid signal loss at obstacles. since there is other equipment around the transformer, including large cooling units, it is better to put all measurement nodes for one power transformer and its auxiliary consumption transformer in a single enclosure, as shown in fig. 7. for the auxiliary transformers, the signals are simply added to the existing wsn nodes in the case of transformer 25t and its main transformer 5t. for the other transformer, 23t, the situation is slightly different, because transformer 23t is not placed just behind its main transformer 3t, as 25t is behind 5t.
transformer 23t is located on the opposite side of the plant's main road from 3t. for that reason, an additional wsn node is used for 23t. since that node is not in completely clear view from the control room building, some modifications were made to the wsn. one of the wsn nodes used for transformer 3t (mounted in the enclosure near 3t), the one closer to transformer 23t, is reconfigured as a mesh router. the wsn node near 23t is clearly “seen” by this mesh router node, which resends both its own measured data and the data obtained from the wsn node near transformer 23t. the wsn modules work with a constant transmitter output power of 10 dbm (10 mw), and the receiver sensitivity is -102 dbm. operation of the wireless sensor network is analyzed by monitoring the change in signal level at the receiver input of each wsn module over a longer period of time (results for one segment are shown in fig. 8). the observed signal dynamics are in the range of 9 db to 45 db, depending on the relative positions of the communicating wsn modules. the wireless sensor network is realized in outdoor conditions (outside buildings) in the complex propagation environment of the thermal power plant te kostolac a. this thermal power plant consists of several buildings that act as good reflective surfaces, so a large number of reflected components (besides the direct wave) arrive at the receiver input. this results in great instability of the signal level at the receiver inputs (fig. 8).

fig. 8 signal level change at the inputs of wsn module receivers, sampled every 1 min

after testing, the final locations for both the wsn nodes at the transformers and the receivers (gateways) were found. the main criterion is to keep the signal level (link quality) above 30%. this assures stable operation of the whole system under various influences, such as disturbances, power disruptions, different weather conditions, etc.
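the transmitter power and receiver sensitivity quoted above allow a rough link-budget estimate for the longest node-to-control-room distance mentioned in section 4 (200 m). the sketch below is an idealized free-space calculation and not a model of the plant: it assumes 0 dbi antennas, a nominal carrier of 2.4 ghz and no multipath, and the function names `fspl_db` and `link_margin_db` are hypothetical. in the reflective environment described in the text, the real margin will fluctuate around, and can fall well below, this free-space figure.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def fspl_db(distance_m: float, freq_hz: float) -> float:
    """Free-space path loss in dB between isotropic antennas."""
    return 20.0 * math.log10(4.0 * math.pi * distance_m * freq_hz / SPEED_OF_LIGHT)


def link_margin_db(tx_power_dbm: float, rx_sensitivity_dbm: float,
                   distance_m: float, freq_hz: float,
                   antenna_gains_db: float = 0.0) -> float:
    """Margin between the received level and the receiver sensitivity, in dB."""
    received_dbm = tx_power_dbm + antenna_gains_db - fspl_db(distance_m, freq_hz)
    return received_dbm - rx_sensitivity_dbm


if __name__ == "__main__":
    # 10 dBm and -102 dBm are from the text; 200 m and 2.4 GHz are the
    # longest quoted distance and the nominal band frequency.
    loss = fspl_db(200.0, 2.4e9)
    margin = link_margin_db(10.0, -102.0, 200.0, 2.4e9)
    print(f"free-space path loss: {loss:.1f} dB")
    print(f"free-space link margin: {margin:.1f} dB")
```

under these assumptions the free-space loss at 200 m is roughly 86 db, which leaves a margin of about 26 db. since the observed signal dynamics reach 45 db, such a margin can be consumed entirely by fading, which is consistent with the emphasis the text places on node placement and line of sight.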
although it is not strictly necessary for a node to be seen by the gateway, we have shown by testing that, for stable operation in such an environment, it is better to provide optical visibility between them, even over short distances (less than 100 m). existing wireless networks should be clearly identified and separated, especially wlans, since their signals have a higher level than those in zigbee networks. some measurements should therefore be made beforehand to find the most suitable channel for zigbee communication. the proposed system has been developed and has been under test in one thermal power plant for almost one year. it has passed all tests without loss of communication or other malfunctions under different plant operating regimes and ambient conditions. it should be noted that during the first installations in summer the temperature was over 40 °c, while in winter it was below -25 °c. in both cases, the system worked without interruption. this is confirmed by the values stored in the sql database saved on a local hard disk of the pc mounted in the control room. the results of the wsn transformer thermal monitoring system are obtained through the application installed on the panel pc in the control room of the thermal power plant. the application receives updates from the real-time gateway every ten seconds and stores the data in the sql database. the main screen of the data acquisition application, with clearly separated parts that relate to a particular transformer (5t, 25t, 3t, 23t, 1t and 2t), is shown in fig. 9. in order to explain the meaning of some parts of the screen, arrows point to characteristic values for transformer 5t: 1 – load current, 2 – top-oil temperature, 3 – temperature at the entrance and exit of the cooling group, 4 – calculated hot-spot temperature, 5 – ambient temperature, 6 – cooling group status (green square – group is activated, gray square – group is deactivated).

fig. 9 signal level change at the inputs of wsn module receivers sampled on 1min

from fig.
9 it can be seen that all fields for transformer 23t are grayed out. this is because transformer 23t was under maintenance, out of operation, when the screen was captured from the application.

a further merit of the proposed system is that the application is reachable from outside the plant through the internet via a protected vpn channel. the system can thus be monitored remotely, and the application can even be updated and replaced without the need to visit the plant. this significantly simplifies system maintenance and reduces additional costs [16].

6. conclusions

an industrial wireless sensor network system for thermal management of high-power transformers is presented in the paper. the significance of the implemented on-line temperature monitoring system lies in a completely new solution based on wireless sensor networks. this is a unique solution that was developed because of potential problems with the installation of the cables required for the same purpose. prior to installation and during operation of the system, a detailed analysis of signal quality was carried out on each wireless connection in the network individually (i.e., for each pair of sensor nodes). after the third upgrade of the system, operation of the wireless sensor network can be remotely monitored, since data about signal quality is recorded in a database on a computer in the control room. low-consumption wsn modules can work for several months without changing batteries (standard aa type), which gives the system an additional level of protection in case of failure of the auxiliary power supply in the plant. also, the system allows easy upgrades by deploying and configuring new wsn modules. remote control via the internet adds flexibility to the entire system, allowing the time needed for analysis and testing of transformers to be significantly shortened.
remote changes to the software are also possible; for security reasons these are performed in communication and cooperation with the operators in the plant. since the efficiency of decision-making and execution is the priority in the case of accidents in power plants, the significance of the realized system is clear. results obtained from the real industrial plant confirm the proposed wireless network configuration, even during very high and very low ambient temperatures.

acknowledgement: results presented in the paper are part of the innovation project “possibilities for wireless sensor networks application in smart grid power systems”, granted by the serbian ministry of education, science and technological development, no. 451-03-2802/2013-16/79, 2014.

references
[1] b. sparling, “transformer monitoring and diagnostics”, in proceedings of the ieee power engineering society 1999 winter meeting, 31 jan–4 feb 1999.
[2] b. flynn, “case studies regarding the integration of monitoring & diagnostic equipment on aging transformers with communications for scada and maintenance”, in proceedings of the distributech 2008 conference and exhibition, tampa fl, usa, january 22–24, 2008.
[3] j. li, t. jiang, s. grzybowski, “hot spot temperature models based on top-oil temperature for oil immersed transformers”, in proceedings of the ieee conference on electrical insulation and dielectric phenomena, 2009. ceidp '09, 18–21 october 2009, pp. 55.
[4] d. laurence, “wireless sensor networks”, awe international, issue 15, june 2008.
[5] l. seno, f. tramarin, s. vitturi, “performance of industrial communication systems”, ieee industrial electronics magazine, pp. 27–37, june 2012.
[6] m. anderson, “a look at wireless technologies for industrial applications”, industrial ethernet book, issue 71, pp. 8–13, 2012.
[7] a.-b.
garcía-hernando et al., problem solving for wireless sensor networks, springer-verlag london limited, 2008.
[8] i.f. akyildiz, w. su, y. sankarasubramaniam, e. cayirci, “wireless sensor networks: a survey”, computer networks, vol. 38, no. 4, 2002, pp. 393–422.
[9] l. doherty, k.s.j. pister, l. el ghaoui, “convex position estimation in wireless sensor networks”, in proceedings of the infocom 2001, twentieth annual joint conference of the ieee computer and communications societies, anchorage, usa, 2001, pp. 1655–1663.
[10] j. n. al-karaki, a. e. kamal, “routing techniques in wireless sensor networks: a survey”, ieee transactions on wireless communication, vol. 11, issue 6, pp. 6–28, 2004.
[11] k. holger, a. willing, protocols and architectures for wireless sensor networks, west sussex: john wiley and sons ltd, uk, 2007.
[12] r. lin, z. wang, y. sun, “wireless sensor networks solutions for real time monitoring of nuclear power plant”, in proceedings of the fifth world congress on intelligent control and automation, wcica 2004, 2004, pp. 3663–3667.
[13] s. z. huang, x. z. zhao, “application of wireless sensor networks on power plants monitoring”, applied mechanics and materials, pp. 762–766, 2013.
[14] t. li, m. fei, “vibration monitoring of auxiliaries in power plants based on ar (p) model using wireless sensor networks”, communications in computer and information science, vol. 98, pp. 213–222, 2010.
[15] international telecommunication union, radio communication sector, recommendation itu-r p.372-12, radio noise, p series radiowave propagation, 07/2015.
[16] a. nikolic, “remote supervising and decision support for on-line monitoring systems in power plants”, plenary lecture, in proceedings of the 2nd international conference on intelligent control, modelling and systems engineering (icms '14), cambridge, ma, usa, january 29–31, 2014, pp. 14.

facta universitatis series: electronics and energetics vol. 28, no. 1, march 2015, pp.
133–142 doi: 10.2298/fuee1501133s an improved nanoscale transmission line model of microtubule: the effect of nonlinearity on the propagation of electrical signals dalibor l. sekulić, miljko v. satarić university of novi sad, faculty of technical sciences, novi sad, serbia abstract. how microtubules, the cytoskeletal nanotubes, handle and process electrical signals is still an unsolved puzzle. these bio-macromolecules have highly charged surfaces that enable them to conduct electric signals. in the context of the electrodynamic properties of the microtubule, the paper proposes an improved electrical model for divalent ions (ca²⁺ and mg²⁺) based on the cylindrical structure of the microtubule with nano-pores in its wall. relying on our earlier ideas, we represent this protein-based nanotube with the surrounding ions as a biomolecular nonlinear transmission line with corresponding nanoscale electric elements in it. one of the key aspects is the nonlinearity of the associated capacitance due to the effect of shrinking/stretching and oscillation of the c-terminal tails. accordingly, a characteristic voltage equation of the electrical model of the microtubule and the influence of capacitance nonlinearity on the propagation of electrical pulses are numerically analyzed here. key words: microtubule, electrodynamic properties, nonlinear transmission line, voltage equation, soliton wave 1. introduction biological systems at the nanoscale are rich in electrical activity. besides the well-known pathways of biochemical regulation, there exist additional pathways of biological communication governed by electrical signals [1]. recently, an in silico demonstration has shown that the microtubules (mts), essential cellular biopolymers, are a source of electromagnetic fields in the form of electric pulses, playing an important role in intracellular signaling and information processing [2].
mts are self-assembling protein mesostructures made up of electrically polar tubulin heterodimer subunits, the α- and β-tubulin monomers. each dimer, 8 nm in length, possesses two highly flexible c-terminal tail regions, one on each monomer, which can extend up to 4.5 nm from the surface, see fig. 1a. in vivo, an mt is generally made up of 13 parallel, loosely connected chains, or protofilaments, formed by polymerization of the tubulin heterodimer molecules so as to result in a helical structure. cryo-electron microscopic analyses indicate that a single mt resembles a hollow tube with external and internal diameters of 25 nm and 15 nm, respectively, see fig. 1b, while its length typically varies in the range from a few tens of nanometers to several micrometers. due to the shape and spacing of the tubulin heterodimers, the mt possesses small, nanometer-size pores between the outer environment and the inner lumen of the mt. two different types of these nano-pores exist within the mt's wall. received august 9, 2014; received in revised form november 14, 2014. corresponding author: dalibor l. sekulić, university of novi sad, faculty of technical sciences, novi sad, serbia (e-mail: dalsek@yahoo.com). fig. 1 a) the topology of the tubulin heterodimer with dimensions of the c-terminal tail regions b) cross-sectional view of the microtubule c) microtubule tube-like structure with marked protofilament. mt functionalization plays a critical role in making the jump from in vivo intracellular transport to in vitro nanoscale device applications [3]. the potential of this protein structure for application in novel bio-inspired nanoelectronic components was recently demonstrated [4].
in addition, the formation of mts by polymerization of αβ-tubulin heterodimers, which can be controlled by physical (temperature) and chemical (ph, concentration of protein and ions) factors, can result in the production of closely or widely spaced mts, centers, sheets, rings and other structures [5], thus facilitating the future fabrication of nanowires, nodes and networks for bionanotechnological applications. interestingly, mts are self-assembling protein-based nanotubes with mechanical behavior similar to that of carbon nanotubes despite their very different chemical composition: proteins and non-covalent interactions in the case of mts; carbon and covalent bonds in the case of carbon nanotubes [6]. in view of the key role of mts in intracellular electrodynamic signaling and information processing, the objective of this study is to provide a framework for the analysis of the propagation of electric pulses along this biological transmission line. more specifically, for this purpose we improve our previous electrical model [7], based on the polyelectrolyte concept applied to the molecular structure and geometry of these nanotubes. the initial motivation for modeling the mt as an electrical transmission line was the experimental evidence [8] which revealed that these biopolymers are good conductors of electrical signals at the nano-level. here, we focus on the specific importance of this mechanism for the intracellular transport of divalent ions, in the first place ca²⁺ and mg²⁺ as the most important ions in the human body. particular attention is paid to the role of the nonlinearity parameter that arises from the structure and dynamics of the c-terminal tails. 2. electrodynamic properties and characterization of mt the conductance of mts is the result of their electrostatic and structural properties.
on the basis of detailed molecular dynamics computations, it was recently demonstrated that the outer surface of the tubulin heterodimer is mostly negatively charged, with a particularly large negative charge located on the c-terminal tail regions [9]. as a result, the overall surface of the mt is predominantly negatively charged, as one can see from fig. 2, while the ends of this nanotube possess different amounts of net charge. accordingly, it can be readily concluded that the mt supports an intrinsic electric field [5]. the average linear charge density of a single mt is estimated to be 52.2 electronic charges (e) per dimer, or approximately 85 e/nm [10]. also, a recent experimental observation showed that the mt conductivity in a solution is on the order of 10 nS [8], indicating a high level of ionic conductivity along this biopolymer. fig. 2 charge distribution on the surface of microtubule (red indicates negative charges, blue positive charges and white neutral regions) [9]. in the context of the above-mentioned findings, it is possible to consider the mt as a one-dimensional polymer that behaves as a highly charged polyelectrolyte [7, 10]. in a solution, positive ions are attracted to the negatively charged surface of the mt, while negative ions are repelled, creating a positive condensed ionic cloud (cic) localized around the mt's landscape. in the presence of an applied voltage, the loosely held cic is free to migrate along the mt, creating an ionic flow and an associated electrical current [11]. in accordance with manning's condensation theory of polyelectrolytes [12], negative ions in solution are repelled to a distance called the bjerrum length (l_b), which is defined by the balance between the electrostatic attraction energy of the ions and the pertaining thermal energy:

$$ l_b = \frac{z e^2}{4\pi \varepsilon_r \varepsilon_0 k_b t}. \tag{1} $$

particular emphasis is placed on divalent ions (z = 2) such as ca²⁺ and mg²⁺, which are primarily attracted by the mt's surface. their flows around mts play fundamental roles in auditory processes as well as in the action of the cardiac muscles. for these important ions in the human body, the width of such a "depleted layer" sandwiched between two charged regions is l_b = 1.34 nm at the physiological temperature t = 310 K. the other parameters used in the estimation of the bjerrum length are the elementary electric charge e = 1.6×10⁻¹⁹ C, the dielectric constant of cytosol ε_r = 80, the permittivity of vacuum ε_0 = 8.85×10⁻¹² F/m and boltzmann's constant k_b = 1.38×10⁻²³ J/K. based on manning's approach, the thickness λ of the cic around a rod-like filamentous polyion of radius r is given by the following expression, where l_d is the debye screening length:

$$ \lambda = a\,(r\,l_d)^{1/2}, \qquad l_d = (8\pi l_b n)^{-1/2}. \tag{2} $$

fig. 3 schematic illustration of a tubulin dimer with its condensed ionic cloud (cic), depleted layer and the geometry of the "coaxial cable" with the dimensions. a is the cross-section area of the cic around the tubulin dimer. using the above equation, the corresponding values for the tubulin heterodimer (td) and the c-terminal tails (ct) are λ_td = 2.96 nm and λ_ct = 0.92 nm, respectively, for the equilibrium ionic concentration n = 1.5×10²³ m⁻³ and parameter a = 0.5. the estimated depleted layer between the cic and the repelled negative ions (anions) plays the role of a dielectric medium between the charged plates of a "coaxial cable", see fig. 3. it provides the resistive and capacitive components for the behavior of the tubulin heterodimers that make up the mt. 2.1. electrical parameters of the mt model bearing in mind the cic around the mt's cylinder, this protein-based nanotube may act as a biological "electrical wire" which can be modeled as a nonlinear transmission line [7, 11].
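as a numerical check of the two lengths introduced in section 2, the following sketch evaluates the bjerrum length for divalent ions and the cic thickness around a tubulin heterodimer from the constants quoted in the text. it is only an illustration under stated assumptions: the cic thickness is taken in the form λ = a·(r·l_d)^(1/2) with l_d = (8π·l_b·n)^(-1/2), and the dimer radius r_td = 2.5 nm is not given explicitly but is inferred here from 5.46 nm − 2.96 nm. the function names are hypothetical.

```python
import math

# constants as quoted in the text (si units)
E_CHARGE = 1.6e-19   # elementary charge, C
EPS_0 = 8.85e-12     # permittivity of vacuum, F/m
EPS_R = 80.0         # dielectric constant of cytosol
K_B = 1.38e-23       # boltzmann's constant, J/K
T_PHYS = 310.0       # physiological temperature, K


def bjerrum_length(z):
    """Bjerrum length of eq. (1) for ions of valence z, in metres."""
    return z * E_CHARGE**2 / (4 * math.pi * EPS_0 * EPS_R * K_B * T_PHYS)


def cic_thickness(radius, l_b, n, a=0.5):
    """Assumed form of eq. (2): a * sqrt(radius * l_d), where
    l_d = (8 * pi * l_b * n) ** -0.5 is the debye length."""
    l_d = (8 * math.pi * l_b * n) ** -0.5
    return a * math.sqrt(radius * l_d)


l_b = bjerrum_length(z=2)
# r_td = 2.5 nm is an inferred value (5.46 nm - 2.96 nm), not quoted directly
lambda_td = cic_thickness(radius=2.5e-9, l_b=l_b, n=1.5e23)
print(f"l_b = {l_b * 1e9:.3f} nm, lambda_td = {lambda_td * 1e9:.3f} nm")
```

with these inputs the script gives values that agree with the quoted l_b = 1.34 nm and λ_td = 2.96 nm to within rounding; the same assumed formula is not checked here against the quoted λ_ct.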
in this section, we describe the nature and estimate the values of the nanoscale electrical components that make up the transmission line model mimicking the behavior of the mt in solution. due to the symmetry of this biopolymer, it is plausible to consider just one of the thirteen protofilaments of the mt and to introduce the so-called elementary electric unit, which is a single tubulin dimer with its two c-terminal tails and two different types of nano-pores in the mt's wall. in order to estimate the total static capacitance of the elementary electric unit for the case of divalent ions, we first consider the contribution of a tubulin heterodimer as a half-cylindrical capacitor, which has the value

$$ c_{td} = \frac{\pi \varepsilon_0 \varepsilon_r l}{\ln\left(1 + l_b / r\right)} = 0.75 \times 10^{-16}\ \mathrm{F}. \tag{3} $$

here l = 8 nm is the length of the tubulin dimer, and the parameter r = r_td + λ_td = 5.46 nm represents the outer radius of the respective cic, see fig. 3, while the other parameters have already been introduced. in a similar way, viewing an extended c-terminal tail as a smaller cylinder with r = r_ct + λ_ct = 1.42 nm, one obtains its capacitance as follows:

$$ c_{ct} = \frac{2\pi \varepsilon_0 \varepsilon_r l_{eff}}{\ln\left(1 + l_b / r\right)} = 0.14 \times 10^{-16}\ \mathrm{F}, \tag{4} $$

where l_eff = l_ct − λ_td = 1.54 nm is the effective length of the part of the c-terminal tail not plunged in λ_td. since each tubulin heterodimer has two c-terminal tails, and keeping in mind that the capacitances of the tubulin dimer and the c-terminal tails are in parallel arrangement [13], the total maximal static capacitance of the elementary electric unit is readily estimated as the simple sum of the components determined above:

$$ c_0 = c_{td} + 2 c_{ct} = 1.03 \times 10^{-16}\ \mathrm{F}. \tag{5} $$

it is evident that the capacitance in our model represents the charge distribution in the region from the cic to approximately one l_b away, perpendicular to the surface of the mt. since the ions in this layer are assumed to be condensed, the charge of this capacitor will vary in a nonlinear way with voltage as follows:

$$ q_n = c_0 \left[ 1 - \gamma\omega (t - t_0) - b v_n \right] v_n. \tag{6} $$

the non-stationary term γω(t − t_0) reflects the fact that the presence of extra ions injected within the ionic cloud affects the local concentration during a time comparable with the characteristic charging time of the localized capacitor of the c-terminal tail. the nonlinear parameter b represents the change of capacitance of the elementary electric unit with an increasing concentration of condensed cations due to the flexibility of the c-terminal tails (their shrinking/stretching). expression (6) was derived and explained in detail in the previous version of the electrical model of the mt [7]. in order to estimate the conductance of the nano-pores, we rely on detailed atomic-scale in silico calculations using 3d brownian dynamics [14]. in accordance with these numerical results, the conductance of both nano-pores is estimated to be g = 10.7 nS, obtained as the sum of two quite different components, reflecting the difference between the two types of nano-pores [7, 13]. using the same numerical calculation, the resistance of the parallel flow of ions along the elementary electric unit is r = 6.2×10⁷ Ω, neglecting the ionic current which leaks through the depleted layer. finally, in the earlier simplified version of the electrical model of the mt [11], we found that the corresponding inductance of the elementary electric unit is small enough that its role can be safely ignored in this improved model. 3. electrical model of mt as biomolecular nonlinear transmission line on the basis of the above estimates for the components of the elementary electric unit of the mt, we are now in a position to establish the corresponding electrical model. the electric circuit that simulates a single protofilament of the mt is presented in fig. 4. fig.
4 a scheme describing the corresponding electrical model of mt. in order to derive the characteristic voltage equation of the model, we begin by introducing the discrete potential in one section of the mt, where kirchhoff’s laws for the currents and voltages read as follows:

$$ i_n - i_{n+1} = \frac{\partial q_n}{\partial t} + g v_n, \qquad v_{n-1} - v_n = r\, i_n. \tag{7} $$

we then introduce a new function u_n(x, t) unifying the voltage and its accompanying current along the mt as follows:

$$ u_n(x, t) = z^{1/2} i_n = z^{-1/2} v_n, \tag{8} $$

where z is the characteristic impedance of the elementary electric unit corresponding to the characteristic frequency ω = 2π/(rc_0) .1 1 2/1 2 2 22 1 2 0 2 0                      g c gc g rz   (9) this expression for z is much more accurate than the one used previously in [7]. assuming a large number of tubulin heterodimers along the protofilament of the mt, the next step is to go over to the continuum approximation with respect to the space variable x; as a result, we get the following voltage equation of the established model of the mt   3 / 23 0 0 0 03 1 0 3 2 3 6 3( ) 0. zc s z bc su u u u zc γω u t t zg z r zc γω u                               (10) here, the first term is the dispersive one, arising from the diffusive spreading of ions within the pulse. the second one, with negative sign (zc_0 s < 2), resembles the corresponding time-dependent term in fick’s diffusion law. the third term reflects the non-stationary change of capacitance due to the ratchet mechanism. the fourth term represents the nonlinearity that competes with dispersion. the key point here is that the last term accounts for the competition between the energy expenditure due to ohmic losses and the energy supply by the non-local ratchet mechanism. this term can be discarded as being very small due to the balance of the above competition.
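before the continuum limit is taken, kirchhoff's laws for one section of the ladder, combined with the nonlinear charge law of the elementary capacitor, already form a set of coupled ordinary differential equations for the node voltages v_n. the sketch below steps these equations in time with a plain explicit euler scheme. it is a simplified illustration, not the finite-difference scheme behind the results of the paper: the non-stationary term is dropped (so that q_n = c_0(1 − b·v_n)·v_n), time is measured in units of r·c_0, both ends of the chain are grounded, and the initial pulse of 0.1 V and all function names are hypothetical choices. in this reduced form the ladder is purely dissipative, so the pulse only spreads and decays; the example shows how the circuit equations are discretized, not the soliton-like solution of the voltage equation.

```python
# values quoted in the text
G = 10.7e-9   # conductance of the nano-pores, S
R = 6.2e7     # resistance of one elementary electric unit, ohm
G_R = G * R   # dimensionless leakage g*r
B = 0.05      # capacitance nonlinearity, 1/V


def euler_step(v, dt, g_r=G_R, b=B):
    """One explicit euler step of the discrete ladder equations.

    r*c0*(1 - 2*b*v_n) dv_n/dt = (v_{n-1} - v_n) - (v_n - v_{n+1}) - g*r*v_n,
    with time expressed in units of r*c0 and grounded ends (assumption).
    """
    n = len(v)
    new_v = list(v)
    for k in range(n):
        left = v[k - 1] if k > 0 else 0.0
        right = v[k + 1] if k < n - 1 else 0.0
        net_current = (left - v[k]) - (v[k] - right) - g_r * v[k]
        diff_cap = 1.0 - 2.0 * b * v[k]   # dq/dv divided by c0
        new_v[k] = v[k] + dt * net_current / diff_cap
    return new_v


def simulate(n_units=51, steps=100, dt=0.1, v_peak=0.1):
    """Inject a single-node voltage pulse and let it evolve."""
    v = [0.0] * n_units
    v[n_units // 2] = v_peak
    for _ in range(steps):
        v = euler_step(v, dt)
    return v


final = simulate()
print(f"g*r = {G_R:.4f}, peak voltage after 100 steps = {max(final):.3e} V")
```

the time step dt = 0.1 is chosen well below the stability limit of the explicit scheme for these parameters; a larger chain or a different boundary condition would only require changing the arguments of `simulate`.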
in the characteristic voltage equation (10), the characteristic charging (discharging) time of the elementary capacitor through the elementary resistance is given by t = rc_0, while the dimensionless variables of space ξ, time τ and velocity s are defined as follows: , v, v v , , 0 0 t l s t t s l x   (11) where the parameter v_0 = l/t represents the characteristic cut-off velocity, and l is the length of one tubulin heterodimer. 3.1. numerical analysis of the voltage equation since one of the key aspects of the electrical model of the mt as a transmission line is the nonlinearity of the capacitance of the elementary electric unit, the characteristic voltage equation and the effect of capacitance nonlinearity on the propagation of ionic pulses along the mt are numerically analyzed. the numerical simulations are based on a finite-difference time-domain method applied to the differential equation (10) governing the voltages of the proposed nonlinear circuit [15]. the circuit simulator is implemented in the matlab software package. fig. 5 shows the numerical results of the voltage equation u(ξ, τ) for the estimated values of the electric components of the established model and different values of the capacitance nonlinearity. fig. 5 numerical solution of the voltage equation u(ξ, τ) for the set of estimated electric components of the established model and a) capacitance nonlinearity of 0.05 V⁻¹, b) lower value of capacitance nonlinearity of 0.01 V⁻¹ and c) higher value of capacitance nonlinearity of 0.1 V⁻¹. for a reasonable value of the nonlinearity parameter of 0.05 V⁻¹, the pulse appears in the form of a soliton-like wave exhibiting a slowly decaying amplitude and an almost constant velocity of propagation. this wave suffers energy losses due to ohmic resistance, but it preserves its stable localized form. using the values of the time and space units, the average velocity of the localized voltage pulse can be estimated from the graph shown in fig.
5a as follows: , s cm 3 1000 280 v ct     t l   (12) where l is length of elementary electric unit that is one tubulin heterodimer. the parameter tct represents a swing period of c–terminal tail which is given by the following expression [10] an improved nanoscale transmission line model of microtubule 141 s.1072.0π6 7 1 0 ct      rzzg zc t (13) the obtained propagation velocity of voltage pulse is the order of several cm/s that is close to experimental findings [8] and it depends on the electrical papameters of established circuit model of mt. it is also possible to estimate that the pulse width is of the order of 10 elementary units. this fact supports the validity of continuum approximation. the range of this soliton–like localized pulse is about 280 × l ≈ 2.24 μm, which is of the order of cell's diameter. in the case of lower value of nonlinearity parameter of 0.01 v -1 presented in fig. 9b, the amplitude of pulse decreases significantly faster so that over about 350 elementary units it becomes negligible. the graphic shows the deceleration of the soliton–like pulse along its path. for higher value of capacitance nonlinearity of 0.1 v -1 , the voltage pulse exhibits not only a higher localization, but also a slower decay of its amplitude and greater robustness compared to the previous two cases. also, fig. 5c indicates that the propagation velocity is lower than the average velocity of 3 cm/s for the case of nonlinearity parameter of 0.05 v -1 . overall, the performed numerical analysis of the voltage equation demonstrates that the role of capacitance nonlinearity is of decisive importance for the stability and localized character of ca 2+ and mg 2+ pulses along mts. 4. discussion and conclusion in this paper, we have developed an improved electrical model of mt that provides qualitative framework for the observations of propagation of ca 2+ and mg 2+ signals along this charged polar biopolymer. 
in summary, the geometric symmetry of the mt allows one protofilament to be viewed approximately as a "coaxial cable" whose "charged plates" are composed of the counterions of the solution, with a depleted layer of water molecules in between. this made it possible to treat the series of tubulin heterodimers with their c-terminal tails as a series of identical elementary electric units with estimated capacitance and resistance, including the conductance of the two different types of nano-pores existing in the mt's wall. the background for the proposed model is the molecular structure of the mt and its polyelectrolyte properties. the numerical analysis of the characteristic voltage equation of the model showed that the nonlinearity of the capacitance, arising from the structure and dynamics of the c-terminal tails, is an essential parameter for the character of the propagation of ca²⁺ and mg²⁺ signals along the mt. for a reasonable value of the nonlinearity parameter of 0.05 V⁻¹, the numerical results indicated stable and localized soliton-like pulses which propagate with a velocity close to experimental findings and cover a distance of the order of the cell's size despite ohmic dissipation. the velocities of these pulses are of the order of several cm/s, depending on the amount of injected ions and on the valence of the ions. under physiological conditions, this effect is much faster than the propagation of ions by pure diffusion (a few μm/s). interestingly, if we compare the estimated velocity with the drift velocity of electrons in a typical semiconductor, we get the same order of magnitude at a current density of the order of A/mm². however, the velocity of establishing a local field around the protein biopolymers is much lower, because the ions are much more massive than electrons. finally, we believe that our predictions may have important consequences for the understanding of the mt's ability to conduct electrical signals, which may affect the neuronal computational capabilities, among other known functions. also, our findings could encourage experimentalists to conduct more subtle assays in an attempt to elucidate this important aspect of cellular activities. on the other hand, the much-needed experimental validation of the presented results would undoubtedly lead to exciting new opportunities in the development of biological nanoscale electronic applications. acknowledgement: this research was financially supported by the ministry of education, science and technological development of the republic of serbia through projects no. iii43008 and oi171009. references [1] j. zimmerman, r. parameswaran and b. tian, "nanoscale semiconductor devices as new biomaterials", biomater. sci., vol. 2, pp. 619–626, may 2014. [2] d. havelka, m. cifra and o. kucera, "multi-mode electro-mechanical vibrations of a microtubule: in silico demonstration of electric pulse moving along a microtubule", appl. phys. lett., vol. 104, pp. 243702.1–4, june 2014. [3] j. l. malcos and w. o. hancock, "engineering tubulin: microtubule functionalization approaches for nanoscale device applications", appl. microbiol. biotechnol., vol. 90, pp. 1–10, april 2011. [4] s. sahu, s. ghosh, k. hirata, d. fujita and a. bandyopadhyay, "multi-level memory-switching properties of a single brain microtubule", appl. phys. lett., vol. 102, pp. 123701.1–4, march 2013. [5] j. a. tuszynski, t. j. a. craddock and e. j. carpenter, "bio-ferroelectricity at the nanoscale", j. comput. theor. nanosci., vol. 5, pp. 2022–2032, october 2008. [6] f. pampaloni and e. l. florin, "microtubule architecture: inspiration for novel carbon nanotube-based biomimetic materials", trends biotechnol., vol. 26, pp. 302–310, june 2008. [7] d. l. sekulic, b. m. sataric, j. a. tuszynski and m. v.
sataric, "nonlinear ionic pulses along microtubules", eur. phys. j. e., vol. 34, art. no. 49, may 2011. [8] a. priel, a. j. ramos, j. a. tuszynski and h. f. cantiello, "a biopolymer transistor: electrical amplification by microtubules", biophys. j., vol. 90, pp. 4639–4643, june 2006. [9] n. a. baker, d. sept, s. joseph, m. j. holst and j. a. mccammon, "electrostatics of nanosystems: application to microtubules and the ribosome", proc. nat. acad. sci. usa, vol. 98, pp. 10037–10041, august 2001. [10] d. l. sekulic and m. v. sataric, "an improved electrical model of microtubule as biomolecular nonlinear transmission line", in proceedings of the 29th international conference on microelectronics - miel 2014, belgrade, 2014, pp. 111–114. [11] m. v. sataric, d. sekulic and m. zivanov, "solitonic ionic currents along microtubules", j. comput. theor. nanosci., vol. 7, pp. 2281–2290, november 2010. [12] g. s. manning, "the molecular theory of polyelectrolyte solutions with applications to the electrostatic properties of polynucleotides", q. rev. biophys., vol. 11, pp. 179–246, may 1978. [13] d. l. sekulic and m. v. sataric, "microtubule as nanobioelectronic nonlinear circuit", serbian journal of electrical engineering, vol. 9, pp. 107–119, february 2012. [14] h. freedman, v. rezania, a. priel, e. carpenter, s. y. noskov and j. a. tuszynski, "model of ionic currents through microtubule nanopores and the lumen", phys. rev. e, vol. 81, pp. 051912.1–11, may 2010. [15] d. l. sekulic, m. v. sataric, m. b. zivanov and j. s. bajic, "soliton-like pulses along electrical nonlinear transmission line", elektronika ir elektrotechnika, vol. 121, pp. 53–58, may 2012. facta universitatis series: electronics and energetics vol. 29, no. 4, december 2016, pp.
647–651 doi: 10.2298/fuee1604647l transient voltage suppressor based on diode-triggered low-voltage silicon controlled rectifier xiang li, shurong dong, zhihui yu, jie zeng, weihuai wang institute of photonics and microelectronics, department of information sciences and electronic engineering, zhejiang university, hangzhou, china abstract. transient voltage suppressors (tvs) have been widely used for esd protection of electronic systems. a good tvs is usually costly, as it needs some special processes and extra masking layers for fabrication. a novel tvs design based on the standard cmos process would be much more attractive. this work proposes a new tvs device using a cmos-compatible diode-triggered silicon controlled rectifier (dlvtscr) as the core device. due to the availability of multiple trigger mechanisms and the dual current paths for bypassing the esd current, the newly proposed device is able to sink an esd current of over 10 A. in addition, the holding voltage is raised to 6.83 V and the trigger voltage is lowered to 10.8 V, which is well suited for most portable device applications. key words: tvs, esd, lvtscr 1. introduction the integrated circuits (ics) used in modern mobile electronic devices are faster, more powerful, consume less power, and are much smaller than ever before. however, they are more vulnerable to reliability issues, not only due to the small device size and the use of ultrathin gate oxide, but also because in their applications the devices are more frequently exposed to electrostatic discharge events produced by frequent human interfacing and by the repeated plugging and unplugging of usb devices and hdmi ports. on-chip protection is now of vital importance for system reliability. however, the conventional protection scheme is not only costly and bulky, but also degrades the system performance [1-5].
transient voltage suppressor (tvs) diodes have long been used to provide highly robust system-level esd protection [6-8]. under normal operating conditions, the tvs diode remains in a high impedance state. during a transient discharge event, the tvs breaks down electrically and provides a low impedance shunt path to bypass the transient current. a good tvs protection circuit must be able to divert the transient current and to clamp the transient voltage below the threshold value before the protected ic fails. received june 30, 2015; received in revised form march 12, 2016. corresponding author: shurong dong, institute of photonics and microelectronics, department of information sciences and electronic engineering, zhejiang university, hangzhou, china (email: dongshurong@zju.edu.cn). a tvs structure includes a core device and some steer diodes. the clamping voltage usually depends on the core device, while the steer diodes divert the esd current to the core device and reduce the overall capacitance of the structure. however, to obtain a good tvs diode, some special processes, such as deep trench isolation or additional masking layers, are required. this work attempts to develop a cmos-compatible tvs device. fig. 1 shows the conventional tvs based on a zener diode (a) and the newly proposed tvs structure. fig. 1 conventional tvs structure based on zener diode and the proposed tvs structure based on the standard cmos process. to protect the interface of data line communications, a good tvs device must possess some special features. first, a low working voltage is crucial for safeguarding submicron integrated circuits. the maximum reverse working voltage, vrwm, which is the largest allowable dc voltage that can maintain the tvs in the non-conducting state, is the key parameter for specifying the working voltage.
when the transient voltage exceeds vrwm, the tvs turns on quickly and a low impedance path is established to divert the transient current. hence, a low working voltage is essential for clamping the transient voltage to a level well below the threshold value. second, the equivalent capacitance of the tvs should be low enough to preserve signal integrity at the high-speed interface. if the capacitance of the tvs diodes is too high, it loads the circuit excessively, and signal distortion and data errors will result. 2. structure and performance the schematic equivalent circuit and the cross-sectional view of the diode-triggered lvtscr structure are shown in fig. 2. the lvtscr, which uses the gated p-well structure, has been widely used as an esd protection device because of its suitable holding voltage and low trigger voltage. however, the gate structure also plays an important role in the reliability of the device. by adding an extra diode connecting the anode and cathode of the conventional lvtscr, the structure can be triggered more easily by an esd event. when an esd event occurs, the drain-substrate pn junction of the ggnmos is first driven into avalanche breakdown, and the voltage drop across the diode increases as the avalanche current increases. meanwhile, the electrons in the n+ region (the one between the n-well and the p substrate) diffuse into the n-well. when the voltage drop across the diode rises above 0.7 V, the bipolar transistor (q1) is turned on. this in turn switches on the scr, owing to the positive feedback between the transistors q1 and q2. this device has been taped out in a 0.18 μm cmos process. to study the characteristics of this new structure, transmission line pulse (tlp) measurements using pulses with a rise time of 10 ns and a pulse width of 100 ns were conducted. fig.
3 shows the comparison of tlp characteristics for the conventional and the diode-triggered lvtscr. compared with the conventional lvtscr, the new diode-triggered lvtscr structure exhibits a low parasitic resistance (calculated by dv/di), because the current conduction path in the newly proposed structure is now formed by the ggnmos together with the scr. as shown in fig. 3, the trigger voltage decreases from 8.94 V to 7.82 V, whereas the holding voltage increases from 2.01 V to 3.21 V. in addition, the failure current, it2, also increases from 3.17 A to 4.05 A because of the availability of two current conduction paths. fig. 2 cross-sectional view (a) and the schematic equivalent circuit (b) of the diode-triggered lvtscr. fig. 3 tlp results of the conventional lvtscr and the diode-triggered lvtscr. as shown in fig. 4, when the value of d (the distance between the drain-side edge of the gate and the drain contact, see fig. 2(a)) increases from 0.85 μm to 2.35 μm, it2 increases from 2.05 A to 3.1 A. when the drain contact is close to the poly gate (when d is small), the heat produced at the drain junction spreads isotropically to the contact metal and results in a lower failure current level [8]. hence, a larger separation between the contact and the poly gate helps to increase the failure current level. on the other hand, this device behaves like a diode when a reverse voltage is applied to it. after investigating the standalone dlvtscr, a tvs using the dlvtscr as the core device was realized, and the tlp test result is shown in fig. 5. taking i/o1 as an example, when an esd strike is applied from i/o1 to gnd, the esd current is released through the steer diode d1 and the dlvtscr to gnd. as shown in fig. 5, the tvs structure presents a higher holding voltage of about 6.83 V and an acceptable trigger voltage of about 10.8 V.
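to put the tlp comparison of the conventional and the diode-triggered lvtscr in relative terms, the following small sketch computes the percentage change of each parameter from the values read from fig. 3 and quoted above. only the quoted numbers are used; the helper name and the dictionary layout are hypothetical.

```python
def relative_change(before, after):
    """Percentage change from `before` to `after`."""
    return (after - before) / before * 100.0


# (conventional lvtscr, diode-triggered lvtscr), values quoted in the text
tlp_values = {
    "trigger voltage (V)": (8.94, 7.82),
    "holding voltage (V)": (2.01, 3.21),
    "failure current it2 (A)": (3.17, 4.05),
}

changes = {name: relative_change(*pair) for name, pair in tlp_values.items()}
for name, percent in changes.items():
    print(f"{name}: {percent:+.1f} %")
```

in relative terms, the trigger voltage drops by roughly one eighth, while the holding voltage rises by more than half and the failure current by more than a quarter.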
these values should be acceptable for esd protection of high-speed digital interfaces such as usb2.0, hdmi, avi ports, etc., in portable equipment.
fig. 4 tlp characteristics of the diode-triggered lvtscr as a function of the device spacing (d) between the gate and the drain contact of the ggnmos of the lvtscr.
fig. 5 comparison of tlp characteristics of a standalone dlvtscr and a tvs device embedded with a dlvtscr.
3. conclusion
this paper attempts to incorporate a diode-triggered low-voltage silicon controlled rectifier into a tvs. the results show that the larger the distance between the gate edge and the drain contact, the higher the esd failure current (it2) that can be obtained. the tvs structure was further verified in a standard cmos process, and good robustness was obtained. this structure can be used for system-level esd protection of high-speed digital interfaces such as usb2.0, hdmi, avi ports, and so on.
acknowledgement: this work was supported by the national natural science foundation of china (no. 61171038, 61204124). the authors thank the innovation platform for micro/nano device and system integration and cyrus tang centre for sensor materials and applications at zhejiang university.
references
[1] c. ito, k. banerjee, r.w. dutton, "analysis and design of distributed esd protection circuits for high-speed mixed-signal and rf ics", ieee trans. electron devices, vol. 49, pp. 1444-1454, 2002.
[2] j.-h. ko, k.-y. kim, j.-s. jeon, c.-h. jeon, c.-s. kim, k.-t. lee, h.-g. kim, "system-level esd on-chip protection for mobile display driver ic", in proc. of the sympo. of electrical overstress/electrostatic discharge, 2011, pp. 1-8.
[3] a. jahanzeb, l. lou, c. duvvury, c. torres, s. morrison, "tlp characterization for testing system level esd performance", in proc. of the sympo. electrical overstress/electrostatic discharge, 2010, pp. 1-8.
[4] k. shrier, t. truong, and j.
felps, "transmission line pulse test methods, test techniques and characterization of low capacitance voltage suppression device for system level electrostatic discharge compliance", in proc. of the sympo. electrical overstress/electrostatic discharge, 2004. pp.1-10. [5] h. gossner, w. simbürger, m. stecher, "system esd robustness by co-design of on-chip and on-board protection measures", microelectronics reliability, vol. 50, no. 9-11, pp.1359–1366, 2010. [6] m. hove, t. o. sanya, a. j. snyders, i. r jandrell and h .c. ferreira, "the effect of type of transient voltage suppressor on the signal response of a coupling circuit for power line communications", africon, 2011 [7] s. s. choi, d. h. cho, k. h. shim, "development of transient voltage suppressor device with abrupt junctions embedded by epitaxial growth technology", electron. mater. lett., vol. 5, pp. 59-62, jun 2009. [8] r. n. rountree and c. l. hutchins, "nmos protection circuitry," ieee trans. electron devices, vol. 32, pp. 910-917, 1985. instruction facta universitatis series:electronics and energetics vol. 27, n o 1, march 2014, pp. 13 23 doi: 10.2298/fuee1401013i review of advanced igbt compact models dedicated to circuit simulation  petar igić 1 , nebojša janković 2 1 electronic system design centre, college of engineering, swansea university, singleton park, swansea sa2 8pp, united kingdom 2 department of microelectronics, faculty of electronics engineering, university of niš, serbia abstract. the paper aims to review the research area of the igbt compact modelling and to introduce different device models. the models are separated in two groups, one that solves ambipolar diffusion equation (ade) and one that does not. both types of compact models have been successfully used in the past for power electronic circuit design. key words: igbt, compact, model, power, inverter, circuit 1. 
introduction
insulated gate bipolar transistors (igbts) are the devices of choice in modern power converter systems targeting medium- to high-voltage and high-current applications, such as hybrid or electric vehicles [1], [2]. during the early design stages of the power circuitry, one can treat an igbt as a binary on-off switch, thus achieving very fast simulation of the converter operation. however, this modelling approach cannot be used to analyse some key aspects of device and converter performance, such as heat dissipation, a very important design parameter especially during operation at high switching frequencies [3]-[5]. consequently, it does not lead to robust equipment design, as it provides no information regarding switch failure mechanisms. to overcome these issues, one could develop and use igbt models based on the full internal physics of the device. these are typically 2d or 3d finite-element (fe) models developed and run in commercially available simulation tools. this modelling approach provides designers with detailed knowledge of the igbt devices, but it requires very long simulation times, and it is computationally prohibitive for complex circuits containing multiple power devices and many switching events [6]. the compact modelling approach is placed between these two extremes [7]-[27]. compact models are models of power devices dedicated to circuit simulation that have lower complexity, yet are fully physically based and very accurate. this physical modelling approach can be based on certain mathematical simplifications of the fundamental semiconductor charge transport equations, for example [9], [20], [21], [24].
received december 18, 2013. corresponding author: petar igic, electronic system design centre, college of engineering, swansea university, singleton park, swansea sa2 8pp, united kingdom (e-mail: p.igic@swansea.ac.uk)
in order to develop an igbt model that correctly describes its static and dynamic behaviour, the main challenge is to incorporate conductivity modulation and non-quasistatic charge storage effects into the device model [22], [23]. the absence of an industry-accepted igbt model and the pronounced industry need for a more accurate igbt compact model have triggered very intensive research in this area for more than a decade. the distinct challenge in developing an igbt compact model for circuit simulation lies in the fact that the model must satisfy conflicting requirements. it needs to provide high quantitative accuracy, short cpu time, and model parameters that are physical yet easy to determine. as a result, different igbt compact models have been developed and presented in the literature, some of them suitable for long-time inverter simulations (minutes) [15]. the aim of this paper is to review the research area of igbt compact modelling and to introduce different models, such as igbt models based on solutions of the ambipolar diffusion equation (ade) [18]-[24] and the ones that do not solve the ade, typically physics-based sub-circuit models [14], [16], [17], [25]-[27].
2. ade solution based models
to describe the igbt's static and dynamic behaviour, the incorporation of conductivity modulation and non-quasistatic charge storage effects into the device model is vital [7], [22]. when the excess carrier density exceeds the igbt's n-base doping level by several orders of magnitude within the carrier storage region, the assumption that the excess electron concentration, n, and the excess hole concentration, p, are equal is valid [21]. then, the carrier transport is determined by the ambipolar diffusion equation (ade):
$$D \frac{\partial^2 p(x,t)}{\partial x^2} = \frac{p(x,t)}{\tau} + \frac{\partial p(x,t)}{\partial t} \qquad (1)$$
where $D$ represents the ambipolar diffusion constant and $\tau$ stands for the ambipolar carrier lifetime.
the boundary conditions for the above equation are determined by the currents at the left ($x_l$) and right ($x_r$) ends of the carrier storage region. at the left end of the carrier storage region, the electron and hole currents are given by:
$$I_{electron}(x_l) = I_{nl} = A n q \mu_n E(x_l) + A q D_n \left.\frac{\partial n}{\partial x}\right|_{x_l} \qquad (2)$$
$$I_{hole}(x_l) = I_{pl} = A p q \mu_p E(x_l) - A q D_p \left.\frac{\partial p}{\partial x}\right|_{x_l} \qquad (3)$$
in the above equations, $q$ stands for the elementary charge, $A$ is the cross-sectional area of the carrier storage region, $D_n$ and $D_p$ stand for the electron and hole diffusion constants respectively, $\mu_n$ represents the electron mobility, $\mu_p$ is the mobility of the holes, and $E$ stands for the electric field. dividing equation (2) by $D_n$ and equation (3) by $D_p$ and then subtracting (3) from (2) (under the conditions $n \approx p$ and $\mu_n/D_n = \mu_p/D_p$) gives a derivative boundary condition on $p$ for the ade at the left end of the carrier storage region [18]-[22]:
$$\left.\frac{\partial p}{\partial x}\right|_{x_l} = \frac{1}{2qA}\left(\frac{I_{nl}}{D_n} - \frac{I_{pl}}{D_p}\right) \qquad (4)$$
a similar expression is obtained for the right end of the carrier storage region:
$$\left.\frac{\partial p}{\partial x}\right|_{x_r} = \frac{1}{2qA}\left(\frac{I_{nr}}{D_n} - \frac{I_{pr}}{D_p}\right) \qquad (5)$$
where $I_{nr}$ and $I_{pr}$ represent the electron and hole currents respectively at the cathode end (see fig. 1).
fig. 1 schematic representation of the npt igbt structure (a) and the bipolar part of the structure (b)
2.1. exponential solution based models
an exponential approximation based solution for this equation has been developed. to model the plasma carrier distribution, a set of exponential shape functions is used [21]. these shape functions are found to model the shape of the plasma correctly, without oscillations in the internal distribution. the slopes of the carrier distribution at the boundaries are also physically correct.
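as an illustration of how the derivative boundary conditions fix the plasma profile, the following minimal python sketch treats the steady state only. setting the time derivative in the ade to zero leaves D p'' = p/τ, whose general solution is a sum of two exponentials with decay length sqrt(D·τ); the two coefficients follow from the slopes imposed at the left and right ends. the function names, the units (cm, s, A) and all numerical values are assumptions chosen for illustration and are not taken from the reviewed models.

```python
import math

Q = 1.602176634e-19  # elementary charge [C]


def boundary_slope(i_n, i_p, d_n, d_p, area):
    """Slope dp/dx at a plasma boundary from the electron and hole currents,
    following the derivative boundary condition (I_n/D_n - I_p/D_p) / (2 q A)."""
    return (i_n / d_n - i_p / d_p) / (2.0 * Q * area)


def steady_state_profile(slope_l, slope_r, width, d_amb, tau):
    """Steady-state solution p(x) = a*exp(x/L) + b*exp(-x/L) on 0 <= x <= width.

    slope_l, slope_r: dp/dx imposed at x = 0 and x = width.
    d_amb, tau: ambipolar diffusion constant and carrier lifetime.
    Returns the coefficients a, b and a callable p(x).
    """
    diff_len = math.sqrt(d_amb * tau)      # ambipolar diffusion length L
    w = width / diff_len
    denom = 2.0 * math.sinh(w)
    a = diff_len * (slope_r - slope_l * math.exp(-w)) / denom
    b = diff_len * (slope_r - slope_l * math.exp(w)) / denom

    def p(x):
        return a * math.exp(x / diff_len) + b * math.exp(-x / diff_len)

    return a, b, p


# Hypothetical example values (illustrative only): 1 cm^2 device,
# 100 um wide storage region, D = 18 cm^2/s, tau = 1 us.
s_left = boundary_slope(i_n=1.0, i_p=10.0, d_n=35.0, d_p=12.0, area=1.0)
s_right = boundary_slope(i_n=10.0, i_p=1.0, d_n=35.0, d_p=12.0, area=1.0)
a, b, p = steady_state_profile(s_left, s_right, width=0.01, d_amb=18.0, tau=1e-6)
print("carrier density at the left boundary  [cm^-3]:", p(0.0))
print("carrier density at the right boundary [cm^-3]:", p(0.01))
```

with a negative slope at the left end and a positive slope at the right end, both coefficients come out positive and the profile sags in the middle of the storage region, which is the catenary-like shape expected for forward conduction.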
in steady-state forward bias operation the plasma carrier concentration has a distribution of catenary form, requiring just two exponential basis functions [21]:
$$p(x) = A e^{x/L} + B e^{-x/L} \qquad (6)$$
where $L$ is the diffusion length. in transient operation, more complex profiles can be approximated using a number of exponential basis functions with a range of decay length parameters shorter than the steady-state one. the models reported in [21], [22] use up to seven exponential basis functions to model the plasma distribution during transient operation. to implement the model and make it functional, one needs to determine the forward junction voltage between the p+ emitter and the n-base (see fig. 1), the ohmic voltage drop across the plasma region, the depletion voltage, the depletion capacitance, and the depletion current at the anode end. the depletion current is a small extra current component that exists under high-speed transient conditions, as described in [21], [22]. this model has been used successfully to predict the switching characteristics of different commercially available igbt devices; one example is given in fig. 2.
fig. 2 igbt turn-off characteristic, experimental results vs. compact model (anode voltage va [v] and anode current ia [a] versus turn-off time [μs])
the downsides of this modelling approach are model complexity [20], a large number of model parameters [19], [22], difficult implementation in circuit simulators such as pspice, and long simulation time. many modern igbt devices include a localized lifetime control (llc) region in order to reduce the current tail during device turn-off and increase the operating frequency [11]. the model discussed above cannot directly include an llc region; to include this feature, it needs some alterations. if, for example, an llc region is inserted between the p-emitter and the arbitrary dashed line shown in fig.
1b, the plasma carrier distribution will need to be modelled separately across the two regions (each having a different lifetime) to the left and right of the arbitrary dashed line. this would introduce another set of equations needed to describe the boundary between the llc region and the rest of the n-base region, thus making the model even more complex.
2.2. fourier series solution based models
a fourier series solution based model has been developed and described in [24]. it is based on research results showing that the diffusion equation can be solved by means of an electrical analogy [23]. the plasma carrier concentration is represented as a sum of fourier series components in space:
$$p(x,t) = p_0(t) + \sum_{k=1}^{\infty} p_k(t) \cos\left(\frac{k\pi (x - x_1)}{x_2 - x_1}\right) \qquad (7)$$
where $k$ represents the harmonic number and $x_1$, $x_2$ are the positions of the plasma boundaries. the set of equations obtained from (7) can be represented in the form of two rc lines corresponding to the even and odd values of $k$. the rc lines are driven by currents defined by the boundary conditions, as described in [24].
fig. 3 analogue solution to the ade [24]
fig. 3 shows the analogue solution to the ade with fixed or mobile boundaries. in fig. 3, qs represents the total stored carrier charge, p0,…,pk stand for the fourier series coefficients, w is the width of the n-base region, xl and xr are the positions of the left and right boundaries of the plasma region (see fig. 1b), pxl and pxr correspond to the excess carrier concentration at xl and xr respectively, and the currents ipl,r and inl,r are as depicted in fig. 1a. the model can be implemented in any general-purpose simulation software that supports non-linear elements and variable parameters [24].
3. physics-based sub-circuit compact models
a common feature of all sub-circuit compact models is that they do not try to solve the ambipolar diffusion equation in order to reproduce measurement data or predict device characteristics.
recently, the hisim-igbt compact model has been developed and presented [25]-[27]. the model is based on the consistency of the potential distribution within the igbt device, considering in great detail the mosfet surface potentials and the bjt junction potentials, as described in [25], [27]. the model was originally developed for the trench igbt device. the hisim-igbt equivalent circuit is shown in fig. 4. the igbt's mosfet part is described with a conventional model, and the main model development effort has been put into extending the bjt shown in fig. 4, since the igbt output current is governed by bipolar transistor theory. in this model, the igbt characteristics are determined by three parameters: the trench-bottom mosfet gate charge, qtb, the base resistance, rb, and the nqs igbt base charge model, qb.
fig. 4 hisim-igbt equivalent circuit [27]
another popular igbt model has been presented by jankovic et al. in [14], [16], [17]. in [14], a physics-based igbt sub-circuit model that successfully includes the effects of localised lifetime control (llc) on device electrical performance has been described. in particular, the model describes non-punch-through igbts with different locations of the llc region. in what follows, the model implementation in spice is described, with attention to the modifications made to include the lifetime control effects. the equivalent sub-circuit of the igbt model implemented in spice is shown in fig. 5.
fig. 5 llc igbt equivalent circuit
the sub-circuit includes an n-channel mosfet, a wide-base pnp bipolar transistor (bjt), the voltage-dependent base resistor rbb, the p-n junction capacitances, cbc and cbe, the gate-source overlap capacitance cgs, and the drain-gate overlap capacitance cgd (the gate-overlap capacitor cox in series with the gate-induced depletion capacitance cd).
the n-channel mosfet part of the igbt is modelled using a spice level 5 model. the pnp bjt has low efficiency and its operation is fully affected by the llc technique. fig. 6a shows the schematic of the pnp bjt circuit model, consisting of two voltage-controlled current sources (ie and ic) and the junction capacitances cbe and ccb. the current sources mirror the input/output currents of the separately developed sub-circuit shown in fig. 6b. the carrier transport through the emitter and the base quasi-neutral regions (qnrs) is described with two equivalent lossy transmission lines (tls) consisting of identical rc-cells, shown in fig. 6c. the input voltage generators f(ube) and f(ucb) perform the voltage transformations {exp(ube/vt) − 1} and {exp(ucb/vt) − 1}, respectively. the rc-cell elements are the non-linear conductance, capacitance, resistance and load impedance, denoted as gk, ck, rk and zl in fig. 6. their values are calculated by the formulas (8) given in [21]:
[equation (8): expressions for gk, ck, rk and zl as functions of the parameters c1-c9 and the node voltages; not legible in the source]
where u2k-1, u2k+1 and u2k are the input, the output and the middle node voltage, respectively, in the k-th rc-cell.
fig. 6 llc igbt model details (a), (b), (c)
the parameters c1-c9 are related to the physical and technology parameters of the emitter or the base qnrs as:
[equation (9): definitions of c1-c9; not legible in the source]
where nd is the doping concentration, μ0 is the doping-dependent low-field mobility, vsat represents the drift saturation velocity, τ0 stands for the low-injection-level doping-dependent minority carrier lifetime, nie is the effective intrinsic carrier concentration incorporating band-gap-narrowing effects, and cn, cp are the auger recombination constants. the parameter Δw is the physical width of a single rc-cell, which is obtained by dividing the zero-bias qnr width w by the chosen number n of rc-cells (Δw = w/n). in [14], the emitter qnr is represented with three rc-cells. since the llc substantially decreases τ0 in a particular device area, it follows from eqs. (8) and (9) that the rc-cells of the controlled recombination region must have a different gk element. this is illustrated in fig. 6c, where the rc-cell of the controlled region is shown separately with a different conductance. note that the bjt shown in fig. 6b with the first (from left to right) rc-cell shaded corresponds to the location of the llc region within the igbt. the model, as described above, has been used successfully for the prediction of inverter circuit power losses [16] and also to investigate igbt tail current characteristics at different temperatures, as shown in fig. 7.
fig. 7 simulated and measured anode tail current and anode voltage of a pt igbt during device turn-off at 25 °c, 75 °c and 125 °c
4. electro-thermal (et) modelling strategy
a thermal compact model of an igbt is as important as its electrical counterpart for accurately predicting circuit performance [3]-[5].
the work presented in [3] describes an et modelling strategy that has been widely accepted by the compact modelling research community and successfully applied since. it can be described as follows. by adding an extra node, the thermal node, to the electrical compact model of the igbt device, an electro-thermal (et) model can be formulated. this thermal node carries the junction temperature of the device, tj, and it represents the connection between the active device and the rest of the circuit thermal network [3]. this is schematically represented in fig. 8. a structure diagram of the et compact device model, showing the interaction between the thermal and electrical networks through the electrical and thermal nodes, is given in fig. 9. as can be seen from fig. 9, the instantaneous value of the device temperature estimated by the thermal network is used to calculate the temperature-dependent model parameters and silicon properties. these temperature-dependent values are then used by the et compact device model to calculate the instantaneous electrical characteristics as well as the instantaneous dissipated power. finally, the dissipated power is used as an input by the thermal network, and the device electrical characteristics are passed to the electrical network.
fig. 8 igbt et compact model: electrical contacts (g, a, k) and thermal node (tj)
fig. 9 structure diagram of the et compact model: interaction with the electrical and thermal networks
the thermal parts of the compact models are represented by a thermal rc network through an electrical analogy [4]: thermal resistance is represented by an electrical resistance, thermal capacitance by an electrical capacitance, and dissipated power by a current source [29]. either foster or cauer rc networks can be used for this purpose [28], [29].
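the electro-thermal loop described above can be sketched in a few lines of python. the sketch below is only an illustration under stated assumptions: it integrates a ladder of thermal resistances with node-to-ground capacitances (the cauer form) using explicit euler steps, and at every step it asks a user-supplied function for the dissipated power at the current junction temperature. the function name simulate_cauer, the stage values and the linear power law stand in for the electrical compact model and are hypothetical.

```python
def simulate_cauer(stages, power_of_tj, t_ambient, dt, t_end):
    """Explicit-Euler integration of a Cauer thermal ladder.

    stages: list of (R [K/W], C [J/K]); node 0 is the junction, R of
            stage i connects node i to node i+1, and the last R connects
            to ambient. Each C goes from its node to thermal ground.
    power_of_tj: callable giving the dissipated power [W] for a junction
            temperature; a stand-in for the electrical compact model.
    Returns the list of node temperatures at t_end.
    """
    n = len(stages)
    temps = [t_ambient] * n
    for _ in range(int(round(t_end / dt))):
        power = power_of_tj(temps[0])          # electrical -> thermal network
        new_temps = temps[:]
        for i, (r, c) in enumerate(stages):
            if i == 0:
                heat_in = power                # current source at the junction
            else:
                heat_in = (temps[i - 1] - temps[i]) / stages[i - 1][0]
            t_next = temps[i + 1] if i + 1 < n else t_ambient
            heat_out = (temps[i] - t_next) / r
            new_temps[i] = temps[i] + dt * (heat_in - heat_out) / c
        temps = new_temps                      # thermal -> electrical network
    return temps


# Hypothetical three-stage network and a dissipation that rises mildly
# with junction temperature (illustrative numbers only).
example_stages = [(0.1, 0.01), (0.2, 0.1), (0.3, 1.0)]
final = simulate_cauer(
    example_stages,
    power_of_tj=lambda tj: 100.0 * (1.0 + 0.002 * (tj - 25.0)),
    t_ambient=25.0,
    dt=1e-4,
    t_end=5.0,
)
print("junction temperature after 5 s [deg C]:", final[0])
```

the explicit scheme is chosen for readability; the time step must stay well below the smallest r·c product of the ladder, and a circuit simulator would normally solve the same network implicitly together with the electrical equations.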
since the foster network is not directly suitable for heat-flow path identification (because of its node-to-node heat capacitances), the cauer rc network is the preferred choice for thermal device characterisation. the cauer network includes only node-to-ground capacitances and it represents a discretised image of the real heat-flow structure. the network elements can be determined by a deconvolution method that extracts the rc thermal network parameters from the thermal transient response of the device to a step-function excitation [28]. namely, an abrupt dissipation step is applied to the chip and the rise of the chip temperature as a function of time is determined. either an experimental method or a 3d finite element model can be employed to obtain these thermal transient responses [29].
5. conclusions
the research area of igbt compact modelling has been reviewed and different device models have been introduced. the models can be separated into two groups, those that solve the ambipolar diffusion equation (ade) and those that do not. the models based on the ade solution are arguably more physically based, but they are more complex to include in a standard circuit simulator, need longer cpu time, and might have convergence problems when simulating circuits with a larger number of igbts. both types of compact models have been successfully used in the past for power electronic circuit design.
references
[1] r.s. chokhawala, j. catt and b.r. pelly, "gate drive considerations for igbt modules", ieee transactions on industry applications, vol. 31, pp. 603-611, 1995.
[2] p. palmer and a.n. githiari, "the series connection of igbt's with active voltage sharing", ieee transactions on power electronics, vol. 12, pp. 637-644, 1997.
[3] a.r. hefner and d.l. blackburn, "thermal component models for electrothermal network simulation", ieee transaction on components, packaging and manufacturing technology, vol. 17-a, pp. 413-424, 1994.
[4] v. szekely, a. poppe, a. pahi, a.
csendes, g. hajas and m. rencz, "electro-thermal and logi-thermal simulations of vlsi designs", ieee transactions on vlsi systems, vol. 5, pp. 258-269, 1997.
[5] h. vinke and c.j. clemens, "compact models for accurate thermal characterisation of electronic parts", ieee transaction on components, packaging and manufacturing technology, vol. 20-a, pp. 411-419, 1997.
[6] s. wunsche, c. class, p. swartz and f. winkler, "electro-thermal circuit simulation using simulator coupling", ieee transactions on vlsi systems, vol. 5, pp. 277-282, 1997.
[7] a.r. hefner, "a dynamic electro-thermal model for the igbt", ieee transactions on industry applications, vol. 30, pp. 394-405, 1994.
[8] p. turkes and j. sigg, "electro-thermal simulation of power electronic systems", microelectronic journal, vol. 29, pp. 785-790, 1998.
[9] r. kraus and h.j. mattausch, "status and trends of power semiconductor device models for circuit simulation", ieee transactions on power electronics, vol. 13, pp. 452-465, 1998.
[10] a. ramamurthy, s. sawant and b.j. baliga, "modeling the [dv/dt] of the igbt during inductive turn off", ieee transactions on power electronics, vol. 14, pp. 601-606, 1999.
[11] e. napoli, a.g.m. strollo, p. spirito, "numerical analysis of local lifetime control for high-speed low-loss p-i-n diode design", ieee transactions on power electronics, vol. 14, pp. 615-621, 1999.
[12] a. ammous, s. ghedira, b. allard, h. morel, d. renault, "choosing a thermal model for electrothermal simulation of power semiconductor devices", ieee transactions on power electronics, vol. 14, pp. 300-307, 1999.
[13] c.m. tan and k.-j. tseng, "using power diode models for circuit simulations a comprehensive review", ieee transactions on industrial electronics, vol. 46, pp. 637-645, 1999.
[14] n. jankovic, p. igic and n.
sakurai, "compact model of the igbt with localized lifetime control dedicated to power circuit simulations", solid state electronics, vol. 54, pp. 268 – 274, 2010. [15] p. igic and z. zhou, "high-speed electro-thermal modelling of a three-phase igbt inverter power module", international journal of electronics, vol. 97, pp. 195 – 205, 2010. [16] n. jankovic, z. zhou, s. batcup and petar igic, "an advanced physics-based sub-circuit model of pt igbt", international journal of electronics,vol.96, pp. 767 – 779, 2009. [17] n. jankovic, t. pesic and p igic: "all injection level power pin diode model including temperature dependence", solid-state electronics, vol. 51, pp. 719-725, 2007. [18] a.j. forsyth, s.y. yang, p.a. mawby, p. igic, "measurement and modelling of power electronic devices at cryogenic temperatures", ieee proc. on circuits, devices and systems, vol. 153, pp. 407 – 415, 2006. [19] p. igic, p.a. mawby and m.s. towers, "physically based 2d compact model for power bipolar devices", international journal of numerical modelling – electronic networks, devices and fields, vol. 17, pp. 397-405, 2004. [20] p. igic, p.a. mawby and m.s. towers, "a 2d physically based compact model for advanced power bipolar devices", elsevier's microelectronics journal, vol.35, pp. 591-594, 2004. [21] p. igic, p.a. mawby, m.s. towers and s. batcup, "a new physically based pin diode compact model for circuit modelling applications",iee proc. on circ., devices and sys.,vol.149, pp. 257-263, 2002. [22] p. igic, p.a. mawby, m.s. towers, w. jamal and s. batcup, "investigation of the power dissipation during igbt turn-off using a new physics-based igbt compact model", microelectronics and reliability,vol.42, pp. 1045-1052, 2002. [23] p. gillet, m. kallala, j-l. massol and p. leturcq, "analogue solution of the ambipolar diffusion equation", c.r. acad. sc. paris, t. 321, serie ii-b, pp. 53-59, 1995. [24] p. leturcq, j-l. debrie and m.o. 
berraies, "a distributed model of igbts for circuit simulation", in the proc. of epe'97, pp. 1.494-1.501, 1997. [25] m. miyake, a. ohashi, m. yokomichi, h. masuoka, t. kajiwara, n. sadachika, u. feldmann, h.j. mattausch, m. miura-mattausch, t. kojima, t. shoji andy. nishibe, "a consistently potential distribution oriented compact igbt model", in the proc. of power electronics specialists conference, pp. 998-1003, 2008. [26] d. navarro, t. sano and y. furui, "a sequential model parameter extraction technique for physicsbased igbt compact model", ieee trans. on electron devices, vol. 60, pp.580-586, 2013. [27] m. miyake, a. ohashi, m. yokomichi, h. masuoka, t. kajiwara, n. sadachika, u. feldmann, h.j. mattausch, d navarro, u. feldmann, t. kojima, t. ogawa and t. ueta, "hisim-igbt: a compact si-igbt model for power electronic circuit design", ieee trans. on electron devices, vol. 60, pp. 571-579, 2013. [28] v. szekely, "identification of rc network by deconvolution: chances and limits", ieee trans. on fundamental theory and applications, vol. 45, pp. 244-258, 1998. [29] p. igic, p.a. mawby, m.s. towers and s. batcup, "thermal model of power semiconductor devices for electro-thermal circuit simulations", in proc. 23 rd ieee international conference on microelectronics (miel 2002), nis, yugoslavia, vol. 1, pp. 171-174, 2002. 
facta universitatis, series: electronics and energetics, vol. 27, no 4, december 2014, pp. 663-664
call for papers
special issue: internet of things
guest editors
- marijana despotović-zrakić, faculty of organizational sciences, university of belgrade, maja@elab.rs
- zorica bogdanović, faculty of organizational sciences, university of belgrade, zorica@elab.rs
- huansheng ning, school of computer and communication engineering, university of science and technology beijing, ninghuansheng@ustb.edu.cn
- božidar radenković, faculty of organizational sciences, university of belgrade, boza@elab.rs
scope
nowadays, the internet has evolved into a platform that reshapes modern life and removes borders between the real, social and cyber worlds. the internet of things (iot) is an emerging paradigm and a cutting-edge technology that harnesses a network of embedded, interconnected objects (sensors, actuators, tags or mobile devices) in order to collect various types of information at any time and anywhere. these devices can be used for building different complex smart environments, such as smart homes, smart classrooms, smart offices, smart factories, smart cities, smart power grids or smart e-government. further, the networks of devices are based on advanced internet standards. iot implies seamless integration of numerous types of devices into the existing internet infrastructure.
smart environments can be customized and automated according to users' needs and preferences. the main subject of the special issue is the internet of things and its application in business, industry, research and the academic community. this special issue aims to provide state-of-the-art and innovative papers on the design, implementation, and usage of intelligent iot and related technologies, such as cloud computing, big data, pervasive computing, social computing, etc. the primary goal is to provide a variety of research and survey articles in the field of the internet of things and their application in different aspects of human activities. findings and discussion should foster the potentials and capabilities of research, the academic community, and industry as well. the target audience of this special issue includes professionals and researchers working in the field of information and communication technologies and their applications in business, industry, science and education. this special issue looks for quality papers that present research results on the iot related to pervasive scenarios.
recommended topics include, but are not limited to, the following:
1. smart environments:
- architectural design of smart environment
- technologies for implementing smart environment
- design and technologies for implementing smart environment (smart home, classroom, office, building, parking, smart grid etc.)
2. internet of things technologies and protocols:
- iot and ipv6
- iot and coap
- mqtt
- m2m communication
- wireless standards and protocols for iot
3. infrastructure for the internet of things:
- wireless sensor networks
- smart grids
- iot and cloud computing
- iaas, paas and saas for internet of things
- security and privacy of cloud and iot applications
- iot, big data and data analytics
- future infrastructure for the internet of things
4.
applied internet of things:  iot infrastructure for business performance  iot infrastructure for industry  iot in energetics  iot in science and education  iot in e-health  iot in agriculture  iot in ecology and environment 5. trends in internet of things  the internet of everything  ambient intelligence  event-driven architecture  security challenges in iot  internet of nano things submission guidelines authors are invited to submit original research contributions by following the detailed instructions given in the “information for authors” at http://casopisi.junis.ni.ac.rs/index.php/ fuelectenerg/about. authors should explicitly state that the paper is submitted to the “special issue on internet of things”. questions about the special issue should be directed to the guest editors. important dates june 30, 2015: manuscript submission august 15, 2015: notification of the first review september 15, 2015: revised manuscript submission october 15, 2015: notification of the second review november 15, 2015: final manuscript submission spring 2016: expected publication. instruction facta universitatis series: electronics and energetics vol. 28, n o 2, june 2015, pp. 287 296 doi: 10.2298/fuee1502287k performance analysis of a flexible polyimide based device for displacement sensing  milica g. kisić 1 , nelu v. blaž 1 , kalman b. babković 1 , andrea marić 1 , goran j. radosavljević 2 , ljiljana d. živanov 1 , mirjana s. damnjanović 1 1 faculty of technical sciences, university of novi sad, novi sad, serbia 2 institute for sensor and actuator systems, vienna university of technology, vienna, austria abstract. in this work, two variations of the displacement sensor, based on the heterogeneous integration process of traditional fabrication technologies pcb (printed circuit board) and ltcc (low temperature co-fired technology) with a flexible polyimide foil are presented. 
the proposed sensor uses the coil as an essential part, a spacer and a polyimide foil as a flexible membrane with a piece of ferrite attached to it. with the displacement of the polyimide foil, the ferrite gets closer to the coil, causing an increase in its inductance and a decrease of the resonant frequency of the system (coil, ferrite and antenna). simulation results showed that sensors with equal outer dimensions but different internal structures exhibit different performances. two prototypes of the sensor with different ferrite dimensions are designed, fabricated and characterized. finally, their performances are compared.

key words: displacement measurement, wireless sensor, heterogeneous integration

received september 18, 2014; received in revised form january 5, 2015
corresponding author: milica g. kisić, faculty of technical sciences, university of novi sad, trg dositeja obradovića 6, 21000 novi sad, serbia (e-mail: mkisic@uns.ac.rs)

1. introduction

displacement measurement is of profound importance in a wide range of applications, such as industrial systems, portable electronic devices, robots, biomedical devices, intelligent instruments, performance evaluation, etc. the high demand for displacement sensors is due to their application for measuring the position and movement of objects, for nondestructive evaluation of deformation, alignment and calibration of position, as well as for measuring other physical quantities which can first be converted into movement, such as pressure, force, acceleration, etc. different types of displacement measuring methods and sensors have been developed: eddy-current, optical, resistive, capacitive, inductive, etc. eddy current sensors are resistant to dirt, dust, humidity, oil or dielectric material in the measuring gap and have been proven reliable in a wide range of temperatures. however, non-contact eddy current sensors have one problem, inhomogeneity (electrical run out), that affects their accuracy and applications [1]. optical sensors, as well as michelson interferometers, are among the popular displacement measurement methods because of their high precision [2, 3]. these sensors are very sensitive to environmental perturbations and suffer from deterioration of the signal to noise ratio, disappearance of the desired signal and deviations in the optical path length. interferometers which are based on the fringe counting method have high resolution and stability, but their precision is dependent on the wavelength of the light [4]. they are bulky and quite expensive due to their sophisticated structure, and their working performance is easily affected by the environment. displacement sensors with giant magnetoresistance (gmr) elements have their position resolution limited by an excessively low signal to noise ratio [5]. capacitive sensors are the best regarding accuracy/resolution and the applicability for small targets. they show a fringe effect around the edge of the patterned electrode and drift in the signal caused by various parameters, such as thermal effect, stage coupling, random wave noise, external electric waves, etc., which are hard to control [6-9]. in order to improve signal conditioning and achieve effective noise reduction for better accuracy and resolution, complex electronic circuits are needed [10-12]. inductive displacement sensors with complex structures of two sensor elements are presented in [13]. piezoresistive sensors are an important and widely utilized class of devices in mechanical sensing. displacement or force sensors with sidewall embedded piezoresistors, piezoresistive microcantilevers and self-detection onboard electronic systems have been designed and manufactured [14, 15]. a piezoresistive cantilever based on both the lateral and the vertical bending for two-dimensional detection is presented in [16].
in [17], a sensor based on a sealed cavity made with a sacrificial layer is proposed, in which a deposited polysilicon nanofilm acts as piezoresistors on the diaphragm. there is an increasingly wide interest in polymeric foils and their application in the field of sensor technology. various types of sensors applying polymeric foils can be realized for different purposes and unique possibilities. flexible polymeric foils offer a lot of advantages in making sensors: they are very low-cost, thin, large area, lightweight, flexible, conformable, transparent, wearable, foldable, stretchable and produced on a large scale. polymer substrates are very flexible and can be bent to a very small radius of curvature. most previous sensors are silicon-based and therefore too rigid for flexible, bendable applications or for covering continuous, contoured, conformal surfaces. in order to achieve mechanical sensors which can fulfill such applications and can sustain sudden impact or large deformation, flexible substrates can be used. in order to develop strain sensors and multiplexed arrays with good performances, but lightweight construction, mechanical flexibility and robustness, sensors which combine silicon sensing elements and thin plastic substrates have been developed [18]. different types of sensors for mechanical sensing with flexible substrates of various polymer-based materials, such as polyester, parylene, polyimide (pi) or polydimethylsiloxane (pdms), were proposed [19-25]. the presented sensors are used for sensing touch, bending, the interface pressure between an implanted cuff and nerve tissue, and normal and shear loads, with applications in robotics, medicine and industry. an overall mechanical flexibility, elasticity and biocompatibility of the sensor are obtained by integrating polymeric substrates.
one of the simplest sensor structures, having a coil and a ferrite object in close proximity, has been reported for monitoring changing environmental parameters such as pressure, displacement and force [26-28]. in our previous paper [29], an inductive displacement sensor in discrete technology was presented. traditional fabrication techniques, pcb (printed circuit board) and ltcc (low temperature co-fired ceramic), are combined with a polyimide foil to create a displacement sensing structure. the presented sensor uses non-contact wireless displacement measurement, so it does not require smooth, good-quality contacts, vias and metal lines on the coil, which can deform the coil structure and introduce parasitic elements that would have to be eliminated from the measured parameters. commercially available polyimide is used as a membrane, so the complex fabrication process that includes photolithography, spinning, curing and etching for membrane fabrication is avoided. the sensor utilizes the variation of the coil's inductance, and accordingly the shift of the resonant frequency of the antenna-sensor system, to detect the desired parameter, displacement. in this work, two variations of the sensor with different ferrite dimensions are designed and examined, in order to investigate their performance. such wireless displacement sensors applying the heterogeneous integration process are fabricated and tested. finally, their performances are compared.

2. design of displacement sensor with polyimide membrane

the integrated displacement sensor consists of a coil, a small cylindrical ferrite plate, a suitable spacer and finally a flexible membrane. the exploded 3d view of the sensor, its cross section and an antenna coil are presented in fig. 1. the manufacturing process of the sensor consists of two stages, the fabrication of the components and the packing process.
pcb technology is used for the coil fabrication because pcb circuits are planar, easy to mount, reliable and cheap. the coil is designed as a square spiral type with outer dimensions of 19 mm, 25 turns, and conductive line width and distance between the lines both equal to 150 µm. as the membrane, a polyimide foil of 125 µm thickness and young's modulus of 3 GPa [30] is used. the polyimide substrate shows elastic-plastic behavior and tends to creep, so it may exhibit parametric drift, but it can take large strains before fracture and has an elastic modulus smaller by nearly a factor of 70 compared with silicon and metal foils [31, 32]. for the polyimide foil used, a considerably smaller load is needed for the same deflection compared with other membranes. the ferrite disk consists of 12 layers of ltcc ferrite tape (esl 40012, thickness of green tape: ~70 µm) sintered at 1100 °C in order to achieve the highest permeability [33]. the thickness of the ferrite disk after sintering is 0.66 mm. several pcb fr4 plates of different types and thicknesses (with an overall thickness of 2.6 mm and a milled hole of a 16 mm radius in the center) are used as a spacer, in order to provide a spacing of 1.2 mm between the ferrite and the coil (since the thickness of the membrane, the ferrite and the glue is 1.4 mm).

fig. 1 a) exploded 3d view of the sensor and b) cross section of the sensor and the antenna coil

3. calculation of a sensing mechanism

the resonant frequency of the coil can be expressed as

$f_0 = \frac{1}{2\pi\sqrt{L_s C_s}}$, (1)

where $L_s$ and $C_s$ are the inductance and capacitance of the sensor coil, respectively. it can be seen that any variation in the coil inductance changes the resonant frequency. displacement can be detected in a simple and precise way without additional mechanical contact on the sensor. wireless measurement of the sensor consists of reading out the changes of the resonant frequency of the sensor-antenna system.
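the dependence in eq. (1) can be illustrated with a short numerical sketch. the component values below (10 µH and 1 pF) are hypothetical and chosen only for illustration; they are not parameters of the fabricated sensor, and the function name is likewise an assumption.

```python
import math


def resonant_frequency(inductance_h: float, capacitance_f: float) -> float:
    """Return f0 = 1 / (2*pi*sqrt(L*C)) in hertz, as in eq. (1)."""
    if inductance_h <= 0 or capacitance_f <= 0:
        raise ValueError("inductance and capacitance must be positive")
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


# Hypothetical illustration values (NOT taken from the paper).
L_ASSUMED = 10e-6   # coil inductance in henry
C_ASSUMED = 1e-12   # coil capacitance in farad

f_before = resonant_frequency(L_ASSUMED, C_ASSUMED)
# A ferrite approaching the coil raises the inductance; assume +10 % here.
f_after = resonant_frequency(1.10 * L_ASSUMED, C_ASSUMED)

print(f"f0 before: {f_before / 1e6:.2f} MHz")
print(f"f0 after : {f_after / 1e6:.2f} MHz")
```

since $f_0 \propto 1/\sqrt{L_s}$, a small relative increase in inductance lowers the resonant frequency by roughly half that relative amount, which is the effect the wireless readout detects.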
measurements are done using an external surrounding coil-antenna. the equivalent circuit model of the sensor-antenna system is presented in fig. 2. the impedance of the sensor-antenna system can be determined as [34]

$Z = R_a + j\omega L_a + \frac{(\omega M)^2}{R_s + j\left(\omega L_s - \frac{1}{\omega C_s}\right)}$, (2)

where $L_a$ and $R_a$ represent the inductance and resistance of the antenna, respectively, $R_s$ is the resistance of the sensor coil, and $M$ is the mutual inductance between the antenna coil and the sensor coil. the magnitude of the impedance's phase dip at the resonant angular frequency $\omega_0 = 2\pi f_0$ is

$\Delta\varphi \approx \tan^{-1}\left(\frac{k^2 \omega_0 L_s}{R_s}\right)$, (3)

where $k$ is the coupling coefficient of the antenna coil and the sensor coil. using wheeler's method [35], the inductance of the sensor coil is calculated as $L_s$ = 7.8 µH. a planar magnetic structure in close proximity to the coil yields a high enhancement of the inductance values. in order to investigate in which manner the dimensions of the ferrite influence the sensor coil inductance, simulations of the inductance for different ferrite dimensions and distances from the coil are performed using cst (computer simulation technology) microwave studio [36]. the simulated inductance of the coil for different ferrite radii (r = 6 mm and r = 9.5 mm) and distances between the ferrite and the coil, d, is presented in fig. 3. as can be seen, a larger ferrite yields a higher inductance value and a higher inductance change with distance (2.07 µH) compared to the smaller ferrite (0.66 µH), as a consequence of the greater interception of the magnetic field between the coil and the ferrite.

fig. 2 the equivalent circuit model of the sensor-antenna system

fig. 3 simulated inductance changes for different ferrite radii and distances between the ferrite and the coil

4. experimental results and discussion

the measurement setup is shown in fig. 4.
the fabricated sensor is mounted and sealed on a test fixture and an antenna coil is placed around the sensor. an mts (manual translation stage) is positioned exactly above the sensor membrane to precisely control the polyimide membrane deflection by direct contact with the sensor. the antenna is connected to the impedance analyzer hp4191a and the amplitude of the impedance and the phase dip of the system (antenna-sensor) are measured.

fig. 4 a) deflection of the sensor's membrane by application of the mts, b) the setup for the wireless measurements of the displacement sensors and c) the photograph of the mounted and fixed system connected to the impedance analyzer to test the functionality of the displacement sensor

the measured phase shift without any displacement of the membrane for two different ferrite dimensions is shown in fig. 5. to wirelessly measure the resonant frequency of the sensor, the resonant frequency of the antenna should be far enough from the sensor's resonant frequency. the sensor with the smaller ferrite has a smaller inductance value, and consequently the resonant frequency of system i (sensor coil, smaller ferrite r = 6 mm, antenna) is higher than the resonant frequency of system ii (sensor coil, larger ferrite r = 9.5 mm, antenna) (71.7 MHz vs. 64.7 MHz, respectively). a square spiral winding is used as the antenna, with a resonant frequency of 145 MHz, which is sufficiently far away from the measured resonant frequencies of both sensor systems.

fig. 5 measured phase of the impedance of the antenna and the systems with two different ferrites

fig. 6 wirelessly monitored changes of the amplitude and the phase dip of the impedance of the system i (smaller ferrite) for different displacements (in µm)

fig. 7 wirelessly monitored changes of the amplitude and the phase dip of the impedance of the system ii (larger ferrite) for different displacements (in µm)

the measured data for the two investigated systems are shown in figs 6 and 7. resonant frequencies are determined from the minimum point of the phase dip for displacement variations up to 1.2 mm. the higher the displacement, the closer the ferrite core is to the sensor coil, causing an increase in the inductance and consequently a decrease in the resonant frequency. resonant frequency characteristics of both systems are shown in fig. 8. in the measurable displacement range, the change of the resonant frequency of system ii (with the larger ferrite) is 10.5 MHz and the sensitivity is 8.75 kHz/µm. compared to system ii, the measured change of the resonant frequency and the sensitivity of system i (with the smaller ferrite) are smaller (3.5 MHz and 2.92 kHz/µm, respectively). an increase in the overlapping area between the coil and the ferrite results in a greater change in the inductance, and hence a greater decrease in the system's resonant frequency.

fig. 8 resonant frequency characteristics of systems with two different ferrite radii

5. conclusion

in our previous work [29], the application of polyimide foil as a membrane in a displacement sensor was investigated. the principle of operation of the presented sensor is based on the deflection of a membrane, which brings the ferrite closer to the sensor coil, and the subsequent measurement of the phase dip of the sensor-antenna system. using sensor structures with the same outer dimensions but larger ferrites yields a 3 times greater change in the resonant frequency (10.5 MHz vs. 3.5 MHz) and sensitivity (8.75 kHz/µm vs. 2.92 kHz/µm) of the system for the same amount of displacement.
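the quoted sensitivities follow directly from dividing the measured frequency change by the 1.2 mm displacement range. the short check below reproduces that arithmetic; the function and variable names are illustrative assumptions, and the sketch assumes the sensitivity is taken as an average over the full range.

```python
def sensitivity_khz_per_um(delta_f_mhz: float, displacement_mm: float) -> float:
    """Average sensitivity = frequency change / displacement, in kHz per µm."""
    delta_f_khz = delta_f_mhz * 1e3        # MHz -> kHz
    displacement_um = displacement_mm * 1e3  # mm -> µm
    return delta_f_khz / displacement_um


DISPLACEMENT_RANGE_MM = 1.2  # measurable displacement range from the text

s_system_ii = sensitivity_khz_per_um(10.5, DISPLACEMENT_RANGE_MM)  # larger ferrite
s_system_i = sensitivity_khz_per_um(3.5, DISPLACEMENT_RANGE_MM)    # smaller ferrite

print(f"system ii: {s_system_ii:.2f} kHz/µm")
print(f"system i : {s_system_i:.2f} kHz/µm")
print(f"ratio    : {s_system_ii / s_system_i:.1f}")
```

the ratio of the two sensitivities equals the ratio of the frequency changes because both systems are evaluated over the same displacement range.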
therefore, in order to enhance the performance of the sensor, the dimensions of the coil and the ferrite have to be appropriately selected. future work will be directed towards optimization of the coil layouts and the development of new designs with different technologies in order to fabricate sensors with the coil fabricated on the membrane.

acknowledgement. this work was supported in part by the ministry of education, science and technological development, republic of serbia, under project numbers tr-32016 and iii45021.

references

[1] g. y. tian, z. x. zhao and r. w. baines, "the research of inhomogeneity in eddy current sensors", sensors and actuators a, vol. 69, pp. 148-151, 1998.
[2] d. hofstetter, h. p. zappe and r. dandliker, "optical displacement measurement with gaas/algaas-based monolithically integrated michelson interferometers", journal of lightwave technology, vol. 15, no. 4, pp. 663-670, april 1997.
[3] s. j. lee, y. melikhov, c. m. park, h. hauser and d. c. jiles, "analysis of a remote magneto-optic linear displacement sensor using a jones matrix approach", ieee transactions on magnetics, vol. 42, no. 10, pp. 3273-3275, october 2006.
[4] a. bergamin, g. cavagnero and g. mana, "a displacement and angle interferometer with subatomic resolution", review of scientific instruments, vol. 64, pp. 3076-3081, august 1993.
[5] m. m. miller, p. lubitz, g. a. prinz, j. j. krebs and a. s. edelstein, "development of a high precision absolute linear displacement sensor utilizing gmr spin-valves", ieee transactions on magnetics, vol. 33, no. 5, pp. 3388-3390, september 1997.
[6] a. arkadan, s. subramaniam, sivanesan and o. douedari, "design optimization of a capacitive transducer for displacement measurement", ieee transactions on magnetics, vol. 35, pp. 1869-1872, may 1999.
[7] m. hirasawa, m. nakamura and m. kanno, "optimum form of capacitive transducer for displacement measurement", ieee transactions on instrumentation and measurement, vol. 3, pp. 276-280, december 1984.
[8] m. kim and w. moon, "a new linear encoder-like capacitive displacement sensor", measurement, vol. 39, pp. 481-489, july 2006.
[9] m. kim, w. moon, e. yoon and k. r. lee, "a new capacitive displacement sensor with high accuracy and long-range", sensors and actuators a, vol. 130-131, pp. 135-141, august 2006.
[10] d. kang, w. lee and w. moon, "a technique for drift compensation of an area-varying capacitive displacement sensor for nano-metrology", in proceedings of the eurosensors xxiv, procedia engineering 5, september 5-8, 2010, linz, austria, pp. 412-415.
[11] f. zhu, j. w. spronck and w. c. heerens, "a simple capacitive displacement sensor", sensors and actuators a, vol. 25, issue 27, pp. 265-269, 1991.
[12] g. gao, y. wang, s. yan and y. han, "design of capacitive displacement sensor and measuring algorithm based on modulated differential pulse width", journal of chemical and pharmaceutical research, vol. 6, pp. 704-711, 2014.
[13] m. s. damnjanovic, l. d. zivanov, l. f. nagy, s. m. djuric and b. n. biberdzic, "a novel approach to extending the linearity range of displacement inductive sensor", ieee transactions on magnetics, vol. 44, no. 11, pp. 4123-4126, november 2008.
[14] v. stavrov, e. tomerov, g. stavreva, c. hardalov and a. shulev, "lateral displacement mems sensor", in proceedings of the eurosensors xxiv, procedia engineering, vol. 00, september 5-8, 2010, linz, austria, 2009, pp. 1-4.
[15] f. mathieu, d. saya, c. bergaud and l. nicu, "parallel detection of si-based microcantilevers resonant frequencies using piezoresistive signals downmixing scheme", ieee sensors journal, vol. 7, no. 2, pp. 172-178, february 2007.
[16] t. c. duc, j. f. creemer and p. m. sarro, "piezoresistive cantilever beam for force sensing in two dimensions", ieee sensors journal, vol. 7, no. 1, pp. 96-106, january 2007.
[17] x. yu, y. tang, h. zhang, t. li and w. wang, "design of high-sensitivity cantilever and its monolithic integration with cmos circuits", ieee sensors journal, vol. 7, no. 4, pp. 489-495, april 2007.
[18] s. m. won, h.-s. kim, n. lu, d.-g. kim, c. d. solar, t. duenas, a. ameen and j. a. rogers, "piezoresistive strain sensors and multiplexed arrays using assemblies of single-crystalline silicon nanoribbons on plastic substrates", ieee transactions on electron devices, vol. 58, no. 11, pp. 4074-4078, november 2011.
[19] m-y cheng, x-h huang, c-w ma and y-j yang, "a flexible capacitive tactile sensing array with floating electrodes", journal of micromechanics and microengineering, vol. 19, pp. 1-10, 2009.
[20] c-chu chiang, c-c k. lin and m-s ju, "an implantable capacitive pressure sensor for biomedical applications", sensors and actuators a, vol. 134, pp. 382-388, 2007.
[21] h-j kwon, j-h kim and w-c choi, "development of a flexible three-axial tactile sensor array for a robotic finger", microsystem technologies, vol. 17, pp. 1721-1726, 2011.
[22] j. engel, j. chen and chang liu, "development of polyimide flexible tactile sensor skin", journal of micromechanics and microengineering, vol. 13, pp. 359-366, 2003.
[23] y-j yang, m-y cheng and x-h huang, "fabrication method of a flexible capacitive pressure sensor", us patent 8,250,926 b2, august 28, 2012.
[24] h-w jang, "flexible display device having touch and bending sensing function", us patent 2014/0204285 a1, july 24, 2014.
[25] k-h shin, c-y moon and y-j kim, "flexible device, flexible pressure sensor", us patent 8,107,248 b2, january 31, 2012.
[26] a. baldi, w. choi and b. ziaie, "a self-resonant frequency-modulated micromachined passive pressure transensor", ieee sensors journal, vol. 3, pp. 728-733, december 2003.
[27] n. misron, l. q. ying, r. n. firdaus, n. abdullah, n. f. mailah and hiroyuki wakiwaka, "effect of inductive coil shape on sensing performance of linear displacement sensor using thin inductive coil and pattern guide", sensors, vol. 11, pp. 10522-10533, november 2011.
[28] m. g. kisic, n. v. blaz, k. b. babkovic, a. m. maric, g. j. radosavljevic, l. d. zivanov and m. s. damnjanovic, "passive wireless sensor for force measurements", ieee transactions on magnetics, vol. 51, issue 1, doi 10.1109/tmag.2014.2359334, in press.
[29] m. g. kisić, n. v. blaž, b. dakić, a. marić, g. j. radosavljević, lj. d. živanov and m. s. damnjanović, "a flexible polyimide based device for displacement sensing", in proceedings of the 29th international conference on microelectronics (miel 2014), belgrade, serbia, 12-15 may, 2014, pp. 1-4.
[30] gts flexible materials ltd, available at: http://www.gtsflexible.co.uk.
[31] n. lobontiu and e. garcia, mechanics of microelectromechanical systems. springer science & business media, 2005, chapter 6, pp. 364.
[32] a. k. kaw, mechanics of composite materials. second edition, crc press, technology & engineering, november 2005, chapter 1, pp. 6.
[33] n. blaž, a. marić, i. atassi, g. radosavljević, lj. živanov, h. homolka and w. smetana, "complex permeability changes of ferritic ltcc samples with variation of sintering temperatures", ieee transactions on magnetics, vol. 48, pp. 1563-1566, 2012.
[34] o. akar, t. akin and k. najafi, "a wireless batch sealed absolute capacitive pressure sensor", sensors and actuators a, vol. 95, pp. 29-38, 2001.
[35] s. s. mohan, m. m. hershenson, s. p. boyd and t. h. lee, "simple accurate expressions for planar spiral inductances", ieee journal of solid-state circuits, vol. 34, pp. 1419-1424, 1999.
[36] cst microwave studio suite, computer simulation technology, www.cst.com.

11059 facta universitatis series: electronics and energetics vol. 36, no 2, june 2023, pp.
209-226 https://doi.org/10.2298/fuee2302209g © 2023 by university of niš, serbia | creative commons license: cc by-nc-nd original scientific paper

a reliable routing mechanism with energy-efficient node selection for data transmission using a genetic algorithm in wireless sensor network

sateesh gudla1, 2, nageswara rao kuda3
1dept. of computer science and engineering, jntuk, kakinada, ap, india
2dept. of computer science and engineering, lendi institute of engineering and technology(a), jntu kakinada, vizianagaram, ap, india
3dept. of computer science and systems engineering, auce(a), andhra university, ap, india

abstract. energy-efficient and reliable data routing is critical in wireless sensor network (wsn) application scenarios. due to oscillations in wireless links in adverse environmental conditions, sensed data may not be sent to a sink node. as a result of wireless connectivity fluctuations, packet loss may occur. retransmission-based approaches are used to improve reliable data delivery, but these approaches need a high quantity of data transfers for reliable data collection. energy usage and packet delivery delays increase as a result of an increase in data transmissions. an energy-efficient data collection approach based on a genetic algorithm is suggested in this paper to determine the most energy-efficient and reliable data routing in wireless sensor networks. the proposed algorithm reduced the number of data transmissions, energy consumption, and delay in network packet delivery, while increasing the network lifetime. furthermore, simulation results demonstrated the efficacy of the proposed method, considering the parameters energy consumption, network lifetime, number of data transmissions, and average delivery delay.

key words: genetic algorithm, energy-efficient routing path, data transmissions, lifetime, wireless sensor networks

1.
introduction

wireless sensor networks (wsns) are networks of many static or mobile sensors that use self-organization and multi-hop communication. wsns use cooperative sensing, collecting, computation, and transmission to cover the data of sensing objects in a specific region and then deliver it to the sink node or base station.

received august 29, 2022; revised november 26, 2022, december 02, 2022 and december 24, 2022; accepted january 18, 2023
corresponding author: sateesh gudla, department of computer science and engineering, jntuk, kakinada, ap, india, e-mail: sateesh.research@gmail.com

wsn emerged as a significant paradigm for the rapid collection of data because of its widespread use in time-critical applications such as tsunami warnings, chemical assault detection, forest fire detection, and infiltration detection in military surveillance [1, 2, 3], etc. considering the significance of the data acquired, many of these applications require reliable information quickly. a sensor network's primary purpose is information dissemination, but this cannot take place if essential data is lost due to unforeseen node failure or the unpredictable nature of a wireless communication channel. hence, achieving reliable data delivery is a challenging issue. on the other hand, the wireless sensor node is engineered on tiny circuits, and the sensors are low-cost and power-restricted [4]; energy is a critical resource since it affects the lifetime of individual sensor nodes and of the entire network. hence, current wsn routing techniques focus on discovering an efficient route to the sink to diminish energy usage and prolong the network lifetime. it is therefore challenging to deliver energy-efficient, reliable, low-latency data routing in wsns. reliable data delivery can be obtained by reducing packet drops in data communication.
a packet could be lost for several reasons, including lousy connectivity, overflowing buffers, or a node running out of energy. retransmission and redundancy are two standard methods used in wsns to ensure network reliability. network energy usage and packet delivery delays are exacerbated by retransmission-based techniques because the number of transmissions increases [5]. since a bit lost within a packet can be retrieved using the coding scheme, the research community needs to pay more attention to employing redundancy to achieve reliability in wsns. if packets could be repaired to recover any lost or incorrect bits, the transmission overhead created by retransmitting an entire packet would be significantly reduced. however, selection of any node as a successor node in the routing path plays a vital role in designing an energy-efficient routing protocol. the packet dropping rate may be reduced by selecting a resourceful (which has the highest residual energy, free available buffer, good link quality, and less distance) node as the successor node. the current work aims to design energy-efficient and reliable data-gathering algorithms in wsns while adhering to strict data delivery time constraints by considering parameters such as residual energy, buffer capacity, link quality, and distance between nodes. in recent years, there has been considerable research to find the solution to reliable communication between sensor nodes while managing their energy consumption [6]. the primary keys in this area are addressed by familiar optimization techniques inspired by swarm intelligence, fuzzy logic system, heuristic search algorithms, reinforced learning approach, and genetic algorithms (ga). genetic algorithm (ga) is an evolutionary heuristic, stochastic optimization algorithm that learns about its universe by analyzing data to eliminate wrong solutions and increase the number of excellent ones. 
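the idea of favouring good candidate solutions over poor ones is commonly realized in genetic algorithms through fitness-proportionate (roulette-wheel) selection. the following is a minimal sketch of that generic operator only, not the authors' implementation; the function name `roulette_select` and the example fitness values are hypothetical.

```python
import random
from typing import Sequence


def roulette_select(fitness: Sequence[float], rng: random.Random) -> int:
    """Pick an index with probability proportional to its (non-negative) fitness."""
    if any(f < 0 for f in fitness):
        raise ValueError("fitness values must be non-negative")
    total = sum(fitness)
    if total <= 0:
        raise ValueError("at least one fitness value must be positive")
    pick = rng.uniform(0.0, total)  # spin the wheel
    running = 0.0
    for index, value in enumerate(fitness):
        running += value
        if value > 0 and pick <= running:
            return index
    # Guard against floating-point round-off: fall back to the last usable slot.
    return max(i for i, f in enumerate(fitness) if f > 0)


# Hypothetical fitness values for four candidate routes (illustration only).
example_fitness = [0.1, 0.4, 0.2, 0.3]
rng = random.Random(42)  # fixed seed so repeated runs give the same picks
parents = [roulette_select(example_fitness, rng) for _ in range(5)]
print(parents)
```

candidates with higher fitness occupy a larger slice of the wheel and are therefore chosen more often, while weaker candidates still keep a small chance of selection, which helps preserve diversity in the population.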
it discovers near-optimal solutions quickly and heuristically for either massive or substantially smaller populations [7]. the literature states that ga has provided optimized solutions for node placement, network coverage, clustering, data aggregation, routing, etc., in wsns. hence, ga is considered in the present research for developing a heuristic approach-based efficient and reliable routing mechanism. the novelty of the proposed research lies in developing a heuristic approach-based energy-efficient and reliable routing mechanism using a genetic algorithm that reduces packet drops by selecting a resourceful node as a successor node in the route. the authors propose a genetic algorithm with a roulette wheel selection process, value encoding, and fitness function evaluation using the quality of the link. the network parameters considered to state the resourcefulness of a node in the evaluation of the proposed algorithm are residual energy, the available buffer of the node, link quality, and distance. the above approach demonstrated a decrease in retransmissions, packet delivery delays, and energy consumption, and an increase in the network lifetime. the rest of this document is structured as follows. section 2 discusses related work and motivation. the proposed mechanism is described in section 3. the chromosome representation is presented in section 3.2.1, the initialization of the population is shown in section 3.2.2, the fitness function is defined in section 3.2.3, section 3.2.4 covers the selection of chromosomes, the crossover and mutation are presented in sections 3.2.5 and 3.2.6, respectively, and the repairing of the chromosomes is explained in section 3.2.7. performance is measured using simulations in section 4. finally, in section 5, we summarise our findings. 2.
related work and motivation
this part of the manuscript provides an overview of existing works on network lifetime improvement and on energy-efficient and reliable routing algorithms. celimuge wu et al. [5] proposed a redundancy-based technique for reliable and rapid data collection in sensor networks to achieve high reliability and minimal end-to-end delay. the protocol employs a network-coding-based strategy to increase packet redundancy when a link is unreliable or when there is a strict end-to-end latency requirement, and it automatically adapts the redundancy level to the application requirements and the link failure rate. fatma h. elfouly et al. [6] used swarm intelligence to propose a routing scheme that extends the lifetime of wsns while reducing energy consumption per node. in addition, this model accounts for data reliability by guaranteeing that the sensed data arrive at the sink node consistently. finally, considering the buffer size reduces packet loss and the energy spent on retransmitting packets. mahmood et al. [8] surveyed several reliability schemes that use retransmission and redundancy techniques to recover lost data through either hop-by-hop or end-to-end procedures. they analyzed these schemes by looking into the optimal mix of techniques, methods, and desired reliability levels to propose an efficient mechanism for resource-constrained wsns. bhardwaj et al. [9] discussed the factors affecting the lifetime of wsns. they determined that one of the most critical issues in wireless sensor networks is how long a network can survive when each node has a limited energy supply. using an evolutionary genetic algorithm, bhatia et al. [10] suggested an approach termed gada-leach, which seeks to enhance cluster head (ch) selection in the conventional leach routing protocol for wsns and facilitates communication between the ch and the base station (bs) by introducing a relay node. according to wang et al.
[11], a multipath routing technique for wireless sensor networks uses genetic algorithms to boost the network's fault tolerance and reduce energy consumption. they used only the distance between nodes to compute the genetic algorithm's fitness function. muruganantham et al. [12, 13] presented a simulation-based analysis of classic and genetic-algorithm-based routing strategies to examine the performance of a wireless sensor network. the genetic approach was shown to enhance the wsn's lifespan in the presence of faulty nodes. mujtaba romoozi et al. [14] explored strategies for node positioning that reduce power consumption without compromising coverage. they proposed a genetic-algorithm-based node positioning scheme for wireless sensor networks to optimize energy consumption and extend the network's lifetime. t. abirami et al. [15] used a genetic algorithm to improve the network lifetime by creating spanning trees for data aggregation that are stable and economical in power consumption. hasanien ali talib et al. [16] presented a honey-bee optimization with a genetic algorithm approach to develop a system for sharing data among individual nodes in a one-to-one network setup. ioana apetroaei et al. [17] applied genetic algorithms to routing protocols in wsns. b. baranidharan et al. [18] presented a new clustering technique called the genetic algorithm based energy-efficient clustering hierarchy (gaech) to improve the first-node-dead (fnd), half-nodes-dead (hnd), and last-node-dead (lnd) rounds. their fitness function is computed by considering the most critical aspects of a cluster. ajay khunteta et al. [19] designed an approach using a genetic algorithm with the leach protocol for cluster head selection in wsns to reduce energy consumption. trong-the nguyen et al. [20] proposed an approach based on a genetic algorithm with self-configuration chromosomes for cluster formation in a sensor network. m. k. somesula et al.
[21] established contact duration-aware cooperative cache placement using a genetic algorithm-based heuristic search technique for practical scenarios to improve the hit and acceleration ratios.
table 1 summary of existing work vs proposed work
reference | algorithms / techniques used | parameters used | comparison of the proposed work with related work
proposed work | genetic algorithm with roulette wheel selection, value encoding, and route quality as the fitness function. | available buffer, residual energy, link quality, and distance. | ga-based route construction with resourceful successor nodes in the path by considering available buffer, residual energy, link quality, and distance. the proposed work significantly decreased packet drops, improved the network's lifetime, reduced energy consumption and packet delivery delay, and decreased the number of transmissions.
[5] | network coding assisted redundancy, with redundancy improved based on link quality. | link quality. | a network coding-based approach to improve packet redundancy when a link is unreliable or there is a strict end-to-end delay requirement. it considers only link quality to update redundancy levels, but available buffer, residual energy, and distance are also important parameters to be considered to improve the network further by reducing the packet dropping rate.
[11] | genetic algorithm where the fitness function is determined using distance. | distance between nodes. | multipath routing uses a genetic algorithm with only distance as a parameter. available buffer, residual energy, and link quality, which play a vital role in reducing packet drops, are not considered in routing decisions.
[26] | fuzzy approach and the a-star algorithm. | residual energy, traffic load, hop count. | a-star routing approach based on fuzzy logic to evaluate node weight using residual energy, traffic load, and hop count. here the node's available buffer is not considered, which may lead to packet dropping.
park et al. [22] introduced a scalable architecture that efficiently achieves reliability in downstream data delivery by considering the unique characteristics of wireless sensor networks, to ensure reliable data transfer from sensing devices to base nodes. le et al. [23], with the help of a statistical reliability metric, suggested an energy-efficient and reliable transport protocol (ertp) to minimize the number of retransmissions in wsns. ertp ensures that enough data packets are delivered to the sink. d. jiang et al. [24] described a multi-constraint routing strategy for smart city applications that uses load balancing to achieve significant energy efficiency. lee et al. [25] investigated the upper bound on the network lifespan of cluster-based wireless sensor networks by addressing the energy hole problem with spatial correlation in networked clusters. alshawi et al. [26] suggested a new routing strategy for wsns that combines the fuzzy approach and the a-star algorithm to increase the network's lifetime. the concept seeks to find the best route from the source to the destination, prioritizing routes with the most available energy, the fewest hops, and the lowest traffic load. the redundancy-based technique for reliable and rapid data collection in sensor networks [5] employed a network-coding-based strategy to increase packet redundancy when a link is unreliable or when there is a strict end-to-end latency requirement. the ga-based technique [11] used only distance as a parameter to choose which routing paths to explore. however, when selecting the most efficient routes for data transmission, it is crucial to consider the link quality, remaining energy, and available buffer, as these are the primary factors that cause packet loss. when packet loss decreases, fewer transmissions are needed, and the network's lifetime increases.
in this work, an energy-efficient and reliable data routing strategy based on a genetic algorithm is proposed, using the remaining energy, the available node buffer, link quality, and distance as parameters. the significance of this study lies in using a genetic algorithm with a roulette wheel selection process, value encoding, and fitness function evaluation based on the quality of the link. the resulting route selects a resourceful node as the successor node in the routing path, which reduces the number of end-to-end data transmissions, energy consumption, and packet delivery delay and improves the network lifetime.
2.1. motivation
accurate data collection (packet-level reliability) under strictly enforced end-to-end delay requirements is a substantial difficulty in several time-critical applications of wireless sensor networks, such as tsunami warning, chemical attack detection, forest fire detection, and intrusion detection in military surveillance. hence, apart from improving the lifetime of the wsn, reliable data should be delivered within the time bounds. retransmission and redundancy are two standard methods used in wsns to ensure network reliability. although the retransmission approach achieves reliable data transfer, it consumes more energy and is therefore poorly suited to wsns. the research community has placed less focus on using redundancy to achieve reliability in wsns, even though in redundancy-based reliability mechanisms a bit lost within a packet can be recovered by adopting some form of coding scheme. the transmission overhead generated by retransmitting a whole packet would be drastically reduced if packets could be repaired to restore any lost or malformed bits. to meet these challenges, we need to build a route whose nodes are resourceful enough to mitigate packet drops. the primary causes of packet drops are bad connectivity, overflowing buffers, or a node running out of energy. the genetic algorithm (ga) is well known in stochastic optimization because it employs the idea of natural evolution to develop optimization solutions. thus, a genetic algorithm is suitable for representing and resolving many complex problems. ga has also been used in wireless sensor networks to improve the positioning of nodes, network coverage, the organization of clusters, and the collection and aggregation of data. hence, in this paper, an energy-efficient and reliable data routing solution using a genetic algorithm is proposed to tackle the stated challenges in wsns. the proposed technique improves the network's lifetime and minimizes the number of data transmissions while achieving reliable data delivery within strict packet delivery delay bounds. the proposed mechanism also considers the available buffer, residual energy, and link quality while selecting the data routing paths. the primary objectives of this study are (a) to design a wsn data routing method based on a genetic algorithm that is both energy-efficient and reliable and (b) to evaluate the proposed algorithm's performance using simulation results.
3. proposed mechanism
the proposed research work develops an energy-efficient and reliable routing mechanism using the genetic algorithm with a roulette wheel selection process and value encoding, considering the sensor node's residual energy, available buffer, link quality, and distance. this section of the manuscript describes the proposed work and the associated methodology: the genetic algorithm-based routing scheme, chromosome representation, initialization of population, fitness function, selection, recombination (crossover), mutation, and repair.
3.1.
network model
as indicated in fig. 1, a wsn is modelled as a graph g(v, e), where v is the set of vertices representing the sensor nodes of the wsn, and e is the set of edges representing its wireless communication links. each node in fig. 1 is shown with its associated residual energy 'e' and buffer availability 'b'. using multi-hop communication, a sensor node senses the data and forwards it to the sink node. retransmissions are repeated until all the packets arrive at the sink node.
fig. 1 network diagram
3.2. genetic algorithm-based routing scheme
the genetic algorithm (ga) is a stochastic optimization tool based on natural evolution [21,32,33,35]. ga was found to be suitable for parallel optimization [29]. ga is an iterative method in which each round is referred to as a generation. ga is a population-based method that maintains a set of candidate individuals. ga draws individuals randomly from a specified population, and these individuals are encoded into genetic form. every chromosome can be viewed as a single string or an array of genes, each covering a piece of the solution. alleles are the different values a gene can take. the evolution of encoded individuals is accomplished by repeating the steps below until the termination conditions are met.
▪ the fitness function identifies the fittest members, and these most qualified individuals are chosen as parents for the following generation.
▪ the selected parents are subjected to genetic operators (crossover and mutation) to generate new offspring from the existing ones. the chromosomes for the following generation are chosen after the population's crossover and mutation steps.
some of this generation's worst performers may be replaced by the best performers from the prior generation to ensure that the current generation is at least as fit as the preceding one. this is referred to as elitism. this process continues until the algorithm's halting condition is fulfilled. based on this survival-of-the-fittest concept, algorithm 1 presents a genetic method for energy-efficient routing in wsns.
algorithm-1: genetic algorithm for routing
input: 'p(n)': size of population; 'cp': crossover-probability; 'mp': mutation-probability; 'g': number of iterations
output: routing path from sensor node to sink node
for all nodes n ∈ N do
    t = 0;
    generate the initial population by randomly initializing p(n);
    repair the randomly generated population p(n);
    evaluate the fitness of each individual in the population p(n);
    store the best solutions of p(n) in old b(m);
    while t < g do
        choose individuals from p(n) as parent chromosomes (i.e., selection);
        perform crossover on the selected individuals, based on cp, to produce new offspring (i.e., recombination);
        perform mutation on the new offspring based on mp;
        repair the individual chromosomes produced after mutation;
        evaluate the fitness of each individual in the new population;
        store the individuals of p(n) with the best fitness in new b(m);
        if fit(old b(m)) > fit(new b(m)) then
            new b(m) = old b(m);
        end if
        old b(m) = new b(m);
        find the individual with the worst fitness value in p(n) and replace it with new b(m);
        t = t + 1;
    end while
end for
ga aims to discover a strategy for gathering data in a wsn that maximizes the network's lifetime, given a set of intermediate nodes and a base station with their coordinates. every data collection period is referred to as a round, and the lifetime is measured as the number of rounds completed before the first intermediate node runs out of energy.
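the control flow of algorithm 1 can be sketched in python as follows. this is a minimal illustration, not the authors' implementation: the function name `evolve`, the operator signatures, and the toy bit-string operators (`tournament`, `one_point`, `flip_bit`) are assumptions introduced here, and tournament selection merely stands in for the selection step. following the comparison in algorithm 1, a higher fitness value is treated as better, and the best individual of the previous generation overwrites the worst individual of the new one whenever the new generation is less fit (elitism).

```python
import random


def evolve(population, fitness, select, crossover, mutate, repair,
           generations, cp, mp, rng):
    """Generational GA with elitism; higher fitness is treated as better."""
    population = [repair(ind) for ind in population]
    best = max(population, key=fitness)            # plays the role of old b(m)
    history = [fitness(best)]
    for _ in range(generations):
        offspring = []
        while len(offspring) < len(population):
            a = select(population, fitness, rng)
            b = select(population, fitness, rng)
            if rng.random() < cp:                  # crossover is gated by cp
                a, b = crossover(a, b, rng)
            for child in (a, b):
                if rng.random() < mp:              # mutation is gated by mp
                    child = mutate(child, rng)
                offspring.append(repair(child))
        population = offspring[:len(population)]
        new_best = max(population, key=fitness)    # plays the role of new b(m)
        if fitness(best) > fitness(new_best):
            # elitism: the previous best replaces the current worst individual
            worst = min(range(len(population)),
                        key=lambda i: fitness(population[i]))
            population[worst] = best
            new_best = best
        best = new_best
        history.append(fitness(best))
    return best, history


# Toy operators on bit lists (hypothetical, for demonstration only).
def tournament(pop, fitness, rng):
    a, b = rng.choice(pop), rng.choice(pop)
    return a if fitness(a) >= fitness(b) else b


def one_point(a, b, rng):
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]


def flip_bit(ind, rng):
    i = rng.randrange(len(ind))
    return ind[:i] + [1 - ind[i]] + ind[i + 1:]


def demo(seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(12)] for _ in range(10)]
    return evolve(pop, sum, tournament, one_point, flip_bit, lambda x: x,
                  generations=30, cp=0.8, mp=0.1, rng=rng)
```

calling `demo()` returns the best bit string found together with the best fitness recorded after each generation; because of the elitism step, this recorded value never decreases.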
we further assume that each relay node has the same initial energy.
3.2.1. chromosome representation
in the initial population, every chromosome represents a feasible solution. a chromosome is a sequence of positive integers indicating the ids of the sensor nodes in a route from the source to the destination [28]. the structure of the initial routing path depends on the position of the intermediate nodes, which can forward the content from the source to the destination (fig. 2). the gene at the first locus always holds the source node. the chromosome length is variable, but it cannot exceed the number of nodes in the network, because a route can never contain more nodes than that. based on the network topology information (routing table), the chromosome (path) encodes the problem by listing node ids from the source to the destination. the gene at the first locus encodes the source node, and the gene at each subsequent locus is picked, randomly or heuristically, from the nodes linked to the node encoded by the preceding gene's allele. a selected node is removed from the topological information database to prevent it from being selected again. this procedure is repeated until the destination node has been reached. it is worth noting that the encoding is valid only if each step of the route is carried out over an actual network link. to represent a chromosome, we used the ids of the nodes in the routing sequence from source to destination, encoded as shown in fig. 2. the literature [7] describes different encoding schemes, such as binary encoding, octal encoding, hexadecimal encoding, permutation encoding, value encoding, and tree encoding. in the proposed work, the chromosome is a sequence of the ids of the nodes in the route from the source node to the destination node. hence, value encoding is used to avoid complexity and further conversions.
fig.
2 encoding scheme
consider an example of a chromosome from source to destination. the chromosome is the sequence of nodes along the formed path (s1 − s4 − s8 − s11 − sink), as shown in fig. 1. the chromosome length is the number of nodes (genes) in the constructed route.
3.2.2. initialization of population
population initialization in a genetic algorithm involves two choices: the procedure used to initialize the population and the population size. it was long believed that, to develop sensible solutions, the population size needs to grow significantly with the complexity of the problem. however, recent research has demonstrated that satisfactory results can be produced with substantially smaller populations. in short, a large population is beneficial, but it comes at a high cost in terms of memory and time [35, 36]. determining an appropriate population size is therefore critical for effectiveness. furthermore, the initial population can be generated by random or heuristic initialization. with heuristic initialization, the population may lack variety, so the search examines only a small portion of the solution space and may never reach the global optimum. hence, random initialization is used in this study to construct the initial population using the encoding approach described in section 3.2.1.
3.2.3. fitness function
the fitness function interprets each chromosome in terms of its physical representation and evaluates its fitness based on the traits desired in the solution. the fitness function must estimate the quality of the chromosomes in the population accurately, so its definition is crucial. in our work, the fitness function is the number of transmissions (including data, ack, and retransmissions) needed to deliver a packet to the sink node, and it is defined as follows.
$$N = \left(1 + \sum_{x=1}^{nr} g_x\right) + \left(1 + \sum_{x=1}^{nr} g_x\right) \cdot p \qquad (1)$$
where $g_x = (1 - p^{h})^{x}$, 'N' signifies the total number of transfers (including ack and data transfer), 'nr' represents the largest number of retransmissions ('nr' = 8), 'p' represents the probability of a successful link transfer, 'h' represents the number of hops between the source and destination, and 'g_x' represents the probability that a node will not get an ack for the x-th data transmission. the first part of equation 1 gives the number of data transfers, and the second part gives the number of ack transfers per data packet.
3.2.4. selection
the selection (reproduction) operation aims to increase the population's average quality by increasing the chance that high-quality chromosomes are carried over to the subsequent generation [35,38]. as a result of selection, the exploration is focused on promising regions of the solution space.
algorithm-2: selection process
compute the selection probability of each individual using eq. (2);
for all generations g ∈ G do
    ci = 0; /* chromosome index */
    p(r) = 0; /* accumulated probability of the roulette */
    u = random(0,1); /* one spin of the wheel */
    while p(r) < u do
        ci = ci + 1;
        p(r) = p(r) + ps(ci);
    end while
    selected individual = ci;
end for
selection schemes are characterized by their selection pressure, the ratio of the probability of selecting the best chromosome in a population to the probability of selecting an average chromosome. under high selection pressure, the population converges quickly, although genetic diversity is inevitably sacrificed. various approaches for selection are available in the literature [45]: proportionate selection (roulette wheel selection), stochastic selection, tournament selection, and truncation selection. the roulette wheel approach is a widely used selection method for genetic algorithms.
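the two quantities used in this section can be checked numerically with a short python sketch. it rests on stated assumptions rather than on the authors' code: eq. (1) is read as N = (1 + Σ g_x) + (1 + Σ g_x)·p with g_x = (1 − p^h)^x and the sum running over x = 1, ..., nr, and the selection probability of eq. (2) is read as an individual's fitness divided by the total fitness of the population. the function names are hypothetical. the random number is drawn once per call, so each call corresponds to a single spin of the wheel.

```python
import random


def expected_transmissions(p, h, nr=8):
    """Total transfers N under the assumed reading of eq. (1)."""
    g_sum = sum((1.0 - p ** h) ** x for x in range(1, nr + 1))
    data_transfers = 1.0 + g_sum          # first part: data transfers
    ack_transfers = (1.0 + g_sum) * p     # second part: ack transfers
    return data_transfers + ack_transfers


def selection_probabilities(fitness_values):
    """Fitness-proportionate probabilities (assumed reading of eq. (2))."""
    total = sum(fitness_values)
    return [f / total for f in fitness_values]


def roulette_select(probabilities, rng):
    """Return the index of the individual hit by one spin of the wheel."""
    u = rng.random()                      # a single draw per selection
    cumulative = 0.0
    for index, ps in enumerate(probabilities):
        cumulative += ps
        if u < cumulative:
            return index
    return len(probabilities) - 1         # guard against round-off
```

under this reading, a perfect link (p = 1) yields two transfers per packet (one data transfer and one ack), while a dead link (p = 0) yields 1 + nr = 9 data transfers and no ack, which matches the verbal description of the two parts of the equation.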
it is simple to understand and to implement correctly, and it is strongly biased toward selecting the fittest individuals. hence, the proposed work uses the roulette wheel approach to choose parent individuals. here, individuals are picked according to a selection probability derived from the fitness function. the selection probability of an individual j is defined as:
$$p_s(j) = \frac{fit(j)}{\sum_{k \in n_{pop}} fit(k)} \qquad (2)$$
where 'fit(j)' represents the fitness value of individual j and the sum runs over all individuals in the population 'n_pop'.
3.2.5. recombination (crossover)
crossover (recombination) blends the genetic information of existing chromosomes (the parents) to produce new chromosomes (the children). fig. 3 shows the crossover process. in the routing problem, recombination is defined as swapping partial routes between two individuals so that each child produced by the crossover represents a single route.
fig. 3 crossover example
as a result, the one-point crossover is a suitable method for the proposed ga. one partial route connects the source node to a relay node, and the other partial route connects the relay node to the destination node. recombining two dominant parents picked through the selection process (algorithm 2) increases the likelihood of generating children with dominant characteristics. the one-point crossover used in the proposed ga differs from the traditional one-point recombination scheme. since the proposed problem deals with routing, the solution must be a sequence of nodes that forms a path from the source to the target node. hence, the two chromosomes being recombined must share at least one common node (gene) other than the source and destination nodes. however, the common node need not occupy the same position in both chromosomes. algorithm 3 shows the steps of the crossover operation.
3.2.6.
mutation
the mutation operation either flips a randomly chosen gene, based on the mutation probability, or modifies it explicitly, and it introduces a slight perturbation. fig. 4 shows the mutation. mutation helps the search reach the global optimum by escaping local optima. in the suggested scheme, flipping a gene may produce a broken or incomplete path. hence, when a random node is picked, we identify the nodes that are connected to both the node before and the node after the chosen node. the chosen node is replaced with one of these common nodes. algorithm 4 specifies the steps involved in the mutation process.
fig. 4 mutation example
algorithm-4: mutation process
if mp > random[0,1] then
    randomly choose a gene ci[m] (i.e., a node in the routing path);
    identify the previous and next genes of ci[m] (i.e., ci[m−1] and ci[m+1]) in the chromosome;
    choose a node randomly from the list of nodes that are common to ci[m−1] and ci[m+1] and replace ci[m] with it;
end if
algorithm-3: crossover process
input: chromosomes of variable length d, e.g., ci = {g1, g2, ···, gd}; d: length of the chromosome
output: cm, cn: new chromosomes after crossover
for all chromosomes do
    pick two chromosomes (ci, cj) from the given population;
    cps = the set of crossing-point pairs found by calling common(ci, cj);
    if length(cps) ≥ 1 then
        if cp > random[0,1] then
            ex = a pair chosen randomly from cps;
            produce two new routes by exchanging the partial routes of ci and cj (i.e., cm = (ci[1] to ci[ex[1]]) + (cj[ex[2]+1] to cj[d]) and cn = (cj[1] to cj[ex[2]]) + (ci[ex[1]+1] to ci[d]));
        end if
    end if
end for
3.2.7. repairing
recombination could produce infeasible individuals that violate the proposed buffer and residual energy availability constraints. crossover operations may even generate loops. each chromosome produced after crossover and mutation should be feasible.
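algorithms 3 and 4 can be illustrated on routes stored as python lists of node ids. the sketch below is an illustration under stated assumptions rather than the authors' code: the names `crossing_points`, `crossover`, and `mutate`, the zero-based indexing, the `NEIGHBOURS` adjacency map, and the small example topology (node 0 standing for the sink) are all hypothetical. as an extra safeguard that is not part of algorithm 4, the mutation skips candidate nodes that already appear in the route, so it cannot introduce a loop.

```python
import random

# Hypothetical topology: node id -> set of directly linked node ids.
NEIGHBOURS = {
    1: {4, 5},
    4: {1, 8},
    5: {1, 8},
    8: {4, 5, 10, 11},
    10: {8, 0},
    11: {8, 0},
    0: {10, 11},
}


def crossing_points(ci, cj):
    """Index pairs (a, b) with ci[a] == cj[b], excluding source and sink."""
    return [(a, b)
            for a in range(1, len(ci) - 1)
            for b in range(1, len(cj) - 1)
            if ci[a] == cj[b]]


def crossover(ci, cj, cp, rng):
    """Swap the partial routes of two parents behind a shared relay node."""
    pairs = crossing_points(ci, cj)
    if not pairs or rng.random() >= cp:
        return list(ci), list(cj)          # parents pass through unchanged
    a, b = rng.choice(pairs)
    cm = ci[:a + 1] + cj[b + 1:]
    cn = cj[:b + 1] + ci[a + 1:]
    return cm, cn


def mutate(route, neighbours, mp, rng):
    """Replace one interior node by a common neighbour of its two sides."""
    if len(route) < 3 or rng.random() >= mp:
        return list(route)
    m = rng.randrange(1, len(route) - 1)   # never the source or the sink
    common = neighbours[route[m - 1]] & neighbours[route[m + 1]]
    common -= set(route)                   # assumption: avoid creating loops
    if not common:
        return list(route)
    mutated = list(route)
    mutated[m] = rng.choice(sorted(common))
    return mutated
```

for the parents [1, 4, 8, 11, 0] and [1, 5, 8, 10, 0], the only shared relay is node 8, so a crossover with cp = 1 produces [1, 4, 8, 10, 0] and [1, 5, 8, 11, 0]. both children are still valid routes, because every consecutive pair of nodes remains linked in the example topology.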
hence, violations should be repaired to make infeasible chromosomes feasible. first, the nodes involved in a looping path should be removed to eliminate loops in the routing path. second, each node should have an available buffer (ab) and residual energy (re) greater than the given thresholds; otherwise, including such nodes would result in a lossy path or dropped packets. therefore, each gene is checked in the repairing process to determine whether its 'ab' and 're' are sufficient. if a gene does not meet the criteria, the gene (node) at position 'g' is replaced with one of the nodes that are common to its previous and next nodes and that have an available buffer greater than τab and residual energy greater than τre. the repairing process is presented in algorithm 5.
algorithm-5: repairing process
for all chromosomes do
    if the chromosome is infeasible (i.e., loops in the routing path) then
        identify the nodes that form a cycle;
        remove the cycle;
    end if
    for g ∈ ci do
        if (g_ab < τab) or (g_re < τre) then
            sabcomm = nodes that are common to the previous and next nodes of the node in position g and have available buffer greater than τab and residual energy greater than τre;
            replace g with a node randomly chosen from sabcomm;
        end if
    end for
end for
4. evaluation of performance
the proposed data collection method has been simulated in network simulator-3 (ns3) [46] and compared with the retransmission-based strategy and ga-based multipath routing in terms of energy consumption, number of transmissions, packet delivery delay, and network lifetime. to ensure reliable data transfer, each packet is retransmitted in the simulation until it reaches the sink node. in this paper, a data collection round is one in which each node senses a packet and sends it to the sink node.
4.1.
simulation setup
in the simulation, the number of nodes in the network ranges from 50 to 500, the initial energy of a node is 25 kj, and the buffer capacity is 2.5 kb. the link quality is assigned a value between 0 and 1. table 2 describes the simulation parameters.
table 2 parameters for the simulation
parameter | value
number of nodes | 50 to 500 nodes
transmission range | 40 meters
node's initial energy | 25 kj
packet size | 960 bits
es = α3 | α3 = 50 × 10^-9 joules/bit
er = α12 | α12 = 0.787 × 10^-6 joules/bit
et = α11 + α2 d^n | α11 = 0.937 × 10^-6 joules/bit; α2 = 10 × 10^-12 joules/bit/meter^2; d = 85 meters
4.2. results and discussions
to analyze the network, the metrics considered are energy consumption, average packet delivery delay, the number of retransmissions, and network lifetime in terms of the first node dying round and the half nodes dying round. the packet delivery delay, also known as latency, refers to the time a data packet takes to travel from one node to another. the average packet delivery delay for the three approaches is shown in fig. 5. fig. 5 shows that, compared to the other two techniques, the retransmission-based strategy has the highest latency because it needs more retransmissions. the distance-based ga algorithm considers only distance as the parameter for finding the route, whereas our proposed work also considers link quality, residual energy, and available buffer when finding the data routing paths.
fig. 5 performance comparison between the average packet delivery delay
fig. 6 compares the number of transmissions among the distance-based ga algorithm, the retransmission-based approach, and the proposed energy-efficient and reliable ga-based data gathering mechanism. the number of retransmissions in the network increases in proportion to the packet loss rate. each dropped packet is retransmitted in the simulation to achieve high data delivery reliability.
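as a worked example of the energy model listed in table 2, the following python sketch computes the energy needed to sense, transmit, and receive one 960-bit packet. it assumes that es, et, and er denote the per-bit sensing, transmission, and reception energies, and that the path-loss exponent n equals 2 (suggested by the unit of α2, but not stated explicitly in the table); the constant and function names are hypothetical.

```python
# Constants taken from table 2 (joules per bit unless noted otherwise).
ALPHA_3 = 50e-9        # es = alpha3
ALPHA_12 = 0.787e-6    # er = alpha12
ALPHA_11 = 0.937e-6    # distance-independent part of et
ALPHA_2 = 10e-12       # joules / bit / meter^2, distance-dependent part of et
PACKET_BITS = 960
DISTANCE_M = 85.0


def sense_energy(bits):
    """Energy spent on the es term for the given number of bits."""
    return bits * ALPHA_3


def receive_energy(bits):
    """Energy spent on the er term for the given number of bits."""
    return bits * ALPHA_12


def transmit_energy(bits, distance_m, path_loss_exponent=2):
    """Energy spent on the et term; the exponent n = 2 is an assumption."""
    per_bit = ALPHA_11 + ALPHA_2 * distance_m ** path_loss_exponent
    return bits * per_bit


if __name__ == "__main__":
    print(sense_energy(PACKET_BITS))
    print(receive_energy(PACKET_BITS))
    print(transmit_energy(PACKET_BITS, DISTANCE_M))
```

with these values, one packet costs about 0.048 mj to sense, 0.756 mj to receive, and 0.969 mj to transmit over 85 meters, so every avoided retransmission saves roughly 1.7 mj per hop (transmit plus receive) under the stated assumptions.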
as seen in fig. 6, our approach requires fewer retransmissions than the other two mechanisms. our mechanism identifies resourceful data routing paths to reduce total data transmissions. the fitness function of the proposed ga-based approach considers the link quality while choosing the routing paths. hence, the proposed algorithm selects routing paths with good link quality, thereby lowering the packet dropping rate and reducing the number of retransmissions. the proposed mechanism also considers the residual energy and available buffer parameters in the genetic algorithm.
fig. 6 performance comparison between the number of data and ack retransmissions
energy consumption is the energy needed to send data from one node to another (transmission energy). fig. 7 shows the network's energy consumption for the proposed technique, the retransmission method, and the distance-based ga mechanism. the number of retransmissions is higher in the retransmission-based approach than in our proposed mechanism, so the network consumes more energy under the retransmission-based approach. the distance-based ga mechanism considers only distance as a parameter while constructing a route. in contrast to the retransmission method and the distance-based ga mechanism, the proposed mechanism uses less energy: it considers link quality, which reduces the number of transmissions and, in turn, energy consumption.
fig. 7 performance comparison between network energy consumption
different researchers define network lifetime differently. here, the lifetime of a network is measured in terms of how long it takes for some predetermined fraction of sensors to die from lack of energy, such as the first node dying round and the half nodes dying round. fig. 8 and fig.
9 show the network lifetime in terms of the first dead node and half of the nodes dead, respectively. a half-node-dead round is the data collection round in which half of the total number of nodes have died. the proposed approach improves network lifetime compared to the retransmission method and the distance-based ga algorithm.
fig. 8 performance comparison between first node dead round
the proposed technique provides a near-optimal data routing path in terms of available energy, available buffer, and link quality. thus, the packet loss rate is minimized. fig. 8 and fig. 9 show that as the node count increases, the network lifetime decreases due to increased network energy consumption. in comparison to existing techniques, the proposed mechanism exhibits a considerably longer lifetime.
fig. 9 performance comparison of half nodes dead round
table 3 shows that the proposed method improves on the retransmission-based strategy, the distance-based genetic algorithm, and the a-star-based approach. this is because the proposed work develops energy-efficient and reliable data delivery routing using a genetic algorithm with a roulette wheel selection approach and value encoding of chromosomes, considering parameters such as the node's current energy state, distance, link quality, and available buffer. compared with the related works in table 3, the packet delivery delay is reduced, the network's lifetime is extended, and the energy consumption is reduced.
table 3 comparison of results with existing works
reference | number of nodes | average packet delivery delay (seconds) | first node died round | half number of nodes died round
[5] | 50 | 0.24 | 15 | 750
[11] | 50 | 0.12 | 19 | 880
[26] | 50 | 0.13 | 16 | 860
proposed work | 50 | 0.11 | 20 | 890
[5] | 200 | 0.49 | 8 | 670
[11] | 200 | 0.34 | 10 | 760
[26] | 200 | 0.22 | 11 | 755
proposed work | 200 | 0.21 | 12 | 780
[5] | 300 | 0.67 | 5 | 620
[11] | 300 | 0.58 | 7.5 | 690
[26] | 300 | 0.44 | 7 | 700
proposed work | 300 | 0.41 | 8 | 710
[5] | 400 | 0.89 | 3 | 560
[11] | 400 | 0.79 | 5 | 620
[26] | 400 | 0.68 | 5 | 634
proposed work | 400 | 0.63 | 6 | 650
5. conclusion
there are two primary challenges in wsns: power efficiency and reliable data transmission. in this study, a genetic algorithm-based data collection strategy is developed to provide robust and reliable data delivery in wsns while extending network lifetime. the genetic algorithm with a roulette wheel selection approach and value encoding determines the most energy-efficient and reliable routes by considering the node's remaining energy, the link quality, the node's free available buffer, and distance as the key factors that influence packet loss. simulation results show that the proposed method reduces network energy consumption by 40 percent. in addition, the proposed technique significantly increases the network's lifetime and reduces the packet delivery delay. possible future extensions of the study include considering node mobility and energy harvesting.
references
[1] i. f. akyildiz, w. su, y. sankarasubramaniam and e. cayirci, "wireless sensor networks: a survey", comput. netw., vol. 38, no. 2, pp. 393-422, march 2002.
[2] i. f. akyildiz and i. h. kasimoglu, "wireless sensor and actor networks: research challenges", ad hoc netw., vol. 2, no. 4, pp. 351-367, oct. 2004.
[3] t. rault, a. bouabdallah and y. challal, "energy efficiency in wireless sensor networks: a top-down survey", comput. netw., vol. 67, pp. 104-122, april 2014.
[4] b. singh and d. k.
lobiyal, "an energy-efficient adaptive clustering algorithm with load balancing for wireless sensor network", int. j. sensor networks, vol. 12, no. 1, pp. 37-52, july 2012. [5] c. wu, y. ji, j. xu, s. ohzahata and t. kato, "coded packets over lossy links: a redundancy-based mechanism for reliable and fast data collection in sensor networks", comput. netw., vol. 70, pp. 179-191, sept. 2014. [6] f. h. elfouly, r. a. ramadan, m. i. mahmoud and m. i. dessouky, "swarm intelligence based reliable and energy balance routing algorithm for wireless sensor network", fu: elec. energ., vol. 29, no. 3, pp. 339-355, sept. 2016. [7] u. mehboob, j. qadir, s. ali and a. vasilakos, "genetic algorithms in wireless networking: techniques, applications, and issues", soft computing, vol. 20, no. 6, pp. 2467-2501, june 2016. [8] m. a. mahmood, w. k. g. seah and i. welch, "reliability in wireless sensor networks: a survey and challenges ahead", comput. netw., vol. 79, pp. 166-187, march 2015. [9] m. bhardwaj, t. garnett and a. p. chandrakasan, "upper bounds on the lifetime of wireless sensor networks", in proceedings of the ieee international conference on communications (icc), 2001, pp. 785-790. [10] t. bhatia, s. kansal, s. goel and a. verma, "a genetic algorithm-based distance-aware routing protocol for wireless sensor networks", comput. electr. eng., vol. 56, pp. 441-455, nov. 2016. [11] s. wang, "multipath routing based on genetic algorithm in wireless sensor networks", hindawi math. prob. eng., vol. 2021, pp. 1-6, june 2021. [12] n. muruganantham and h. el-ocla, "genetic algorithm-based routing performance enhancement in wireless sensor networks", in proceedings of the ieee 3rd international conference on communication and information systems (iccis), 2018, pp. 79-82. [13] n. muruganantham and h. el-ocla, "routing using genetic algorithm in a wireless sensor network", wirel. pers.
commun., vol.111., pp. 2703-2732, jan. 2020. [14] m. romoozi and h. ebrahimpour-komleh, "a positioning method in wireless sensor networks using genetic algorithms", in proceedings of the international conference on medical physics and biomedical engineering, 2012, pp. 1042-1049. [15] t. abirami and p. priakanth, "energy efficient wireless sensor network using genetic algorithm based association rules", int. j. comput. appl., vol. 91, no. 10, april 2014. [16] h. a. talib, r. alothman and m. k. farhan, "optimization approach to optimal power efficient based on cluster top option in wireless sensor networks", turkish j. comput. math. education, vol. 12, no. 4, pp. 970-979, april 2021. [17] i. apetroaei, i.-a. oprea, b.-e. proca and l. gheorghe, "genetic algorithms applied in routing protocols for wireless sensor networks", in proceedings of the 10th roedunet international conference, 2011, pp. 1-6. [18] b. baranidharan and b. santhi, "gaech: genetic algorithm based energy efficient clustering hierarchy in wireless sensor networks", hindawi j. sensors, vol. 2015, pp. 1-8, aug. 2015. [19] a. khunteta and a. bajpai, "genetic algorithm with leach protocol for cluster head selection in wireless sensor networks", ictact j. commun. technol., vol. 11, no. 2, pp. 2182-2186, june 2020. [20] t.-t. nguyen, c.-s. shieh, m.-f. horng and t.-k. dao, "a genetic algorithm with self-configuration chromosome for the optimization of wireless sensor networks", in proceedings of the 12th international conference on advances in mobile computing and multimedia, 2014, pp. 413-418. [21] m. k. somesula, r. r. rout and d. somayajulu,"contact duration-aware cooperative cache placement using genetic algorithm for mobile edge networks", comput. netw., vol. 193, april 2021. [22] s. j. park, r. vedantham, r. sivakumar and i. f. akyildiz,"garuda: achieving effective reliability for downstream communication in wireless sensor networks", ieee trans. mobile comput., vol. 7, no. 2, pp. 214-230, feb. 2008. 
[23] t. le, w. hu, p. corke and s. jha, "ertp: energy efficient and reliable transport protocol for data streaming in wireless sensor networks", comput. commun., vol. 32, pp. 1154-1171, jan. 2009. [24] d. jiang, p. zhang, z. lv and h. song, "energy-efficient multi-constraint routing algorithm with load balancing for smart city applications", ieee internet of things j., vol. 3, no. 6, pp. 1437-1447, sept. 2016. [25] s. lee and h. s. lee, "analysis of network lifetime in cluster-based sensor networks", ieee commun. letters, vol. 14, no. 10, pp. 900-902, oct. 2010. [26] i. s. alshawi, l. yan, w. pan and b. luo, "lifetime enhancement in wireless sensor networks using fuzzy approach and a-star algorithm", ieee sensors j., vol. 12, no. 10, pp. 3010-3018, oct. 2012. [27] s. k. a. imon, a. khan, m. d. francesco and s. k. das, "energy-efficient randomized switching for maximizing lifetime in tree-based wireless sensor networks", ieee/acm trans. netw., vol. 23, no. 5, pp. 1401-1415, oct. 2015. [28] c. w. ahn and r. s. ramakrishna, "a genetic algorithm for shortest path routing problem and the sizing of populations", ieee trans. evolutionary comput., vol. 6, no. 6, pp. 566-579, dec. 2002. [29] p. lin, q. song and a. jamalipour, "multidimensional cooperative caching in comp-integrated ultradense cellular networks", ieee trans. wirel. commun., vol. 19, no. 3, pp. 1977-1989, dec. 2019. [30] w. shen, t. zhang, f. barac and m. gidlund, "priority-mac: a priority-enhanced mac protocol for critical traffic in industrial wireless sensor and actuator networks", ieee trans. industr. inform., vol. 10, no. 1, pp. 824-835, feb. 2014. [31] n. alsindi and k. pahlavan, node localization: wireless sensor networks: a networking perspective, john wiley & sons, 2009, chapter 8. [32] j.
patel and h. el-ocla, "energy efficient routing protocol in sensor networks using genetic algorithm", mdpi sensors, vol. 21, no. 21, p. 7060, oct. 2021. [33] m. shokouhifar and a. hassanzadeh, "an energy efficient routing protocol in wireless sensor networks using genetic algorithm", adv. environ. biol., vol. 8, no. 21, pp. 86-93, oct. 2014. [34] y. liu, a. liu, y. li, z. li, y. june choi, h. sekiya and j. li, "apmd: a fast data transmission protocol with reliability guarantee for pervasive sensing data communication", pervasive mob. comput., vol. 41, pp. 413-435, 2017. [35] k. sastry, d. goldberg and g. kendall, genetic algorithms in search methodologies, springer, 2005, chapter-4, pp. 97-125. [36] u. dohare, d. k. lobiyal and s. kumar, "energy balanced model for lifetime maximization in randomly distributed wireless sensor networks", wirel. pers. commun., vol. 78, no. 1, pp. 407-428, april 2014. [37] b. singh and d. k. lobiyal, "an energy-efficient adaptive clustering algorithm with load balancing for wireless sensor network", int. j. sensor networks, vol. 12, no. 1, pp. 37-52, july 2012. [38] d. e. goldberg, genetic algorithms in search, optimization, and machine learning, addison-wesley publishing, october 1989. [39] s. gudla and n. r. kuda, "learning automata-based energy efficient and reliable data delivery routing mechanism in wireless sensor networks", j. king saud univ. – comput. inform. sci., vol. 34, no. 8, pp. 5759-5765, april 2021. [40] a. rastogi and s. rai, "a novel protocol for the stable period and lifetime enhancement in wsn", int. j. inform. technol., vol. 13, pp. 777-783, jan. 2021. [41] d. k. sharma, d. kukreja, s. bagga et al., "gauss-sigmoid based clustering routing protocol for wireless sensor networks", int. j. inform. technol., vol. 13, pp. 2569-2577, nov. 2019. [42] f. ullah, m. zahid khan, m. faisal, h. u. rehman, s. abbas and f. s. 
mubarek, "an energy-efficient and reliable routing scheme to enhance the stability period in wireless body area networks", comput. commun., vol. 165, no. 1, pp. 20-32, jan. 2021. [43] d. deepakraj and k. raja, "markov-chain based optimization algorithm for efficient routing in wireless sensor networks", int. j. inform. technol., vol. 13, pp. 897-904, march 2021. [44] j. agarkhed, v. kadrolli and s. patil, "fuzzy based multi-level multi-constraint multi-path reliable routing in a wireless sensor network", int. j. inform. technol., vol. 12, pp. 1133-1146, june 2020. [45] r. champlin, "selection methods of genetic algorithms", student scholarship computer science, 2018. available at: https://digitalcommons.olivet.edu/csis_stsc/8 (accessed: 2022-01-02). [46] network simulator 3. available at: https://www.nsnam.org (accessed: 2022-01-02).

facta universitatis series: electronics and energetics vol. 27, no. 1, march 2014, pp. 1-11 doi: 10.2298/fuee1401001c

microstructural impact on electromigration: a tcad study

hajdin ceric 1,2, roberto lacerda de orio 2, wolfhard h. zisser 1,2, siegfried selberherr 2

1 christian doppler laboratory for reliability issues in microelectronics at the institute for microelectronics, tu wien, austria
2 institute for microelectronics, tu wien, gußhausstraße 27–29, a-1040 wien, austria

abstract. current electromigration models used for simulation and analysis of interconnect reliability lack the appropriate description of metal microstructure and consequently have a very limited predictive capability. therefore, the main objective of our work was obtaining more sophisticated electromigration models. the problem is addressed through a combination of different levels of atomistic modeling and already available continuum level macroscopic models.
a novel method for an ab initio calculation of the effective valence for electromigration is presented and its application on the analysis of em behavior is demonstrated. additionally, a simple analytical model for the early electromigration lifetime is obtained. we have shown that its application gives a reasonable estimate for the early electromigration failures including the effect of microstructure.

keywords: electromigration, interconnect, reliability, physical modeling, simulation

received december 16, 2013
corresponding author: hajdin ceric, christian doppler laboratory for reliability issues in microelectronics at the institute for microelectronics, tu wien, austria (e-mail: ceric@iue.tuwien.ac.at)

1. introduction

electromigration (em) experiments indicate that the copper interconnect lifetime decreases with every new interconnect generation. in particular, fast diffusivity paths cause a significant variation in the interconnect performance and em degradation [1]. in order to produce more reliable interconnects, the fast diffusivity paths must be addressed when introducing new designs and materials. the em lifetime depends on a variation of material properties at the microscopic and atomistic levels. microscopic properties are grain boundaries and grains with their crystal orientation [2]. atomistic properties are configurations of atoms at the grain boundaries, at the interfaces to the surrounding layers, and at the cross-section between grain boundaries and interfaces. modern technology computer-aided design (tcad) tools, in order to meet the challenges of contemporary interconnects, must cover two major areas: physically based continuum-level modeling and first-principle/atomistic-level modeling. we present a computationally efficient ab initio method for calculation of the effective valence for em and the atomistic em force.
the results of these ab initio calculations are applied for parameterization of a continuum-level model [7] and for simulation of the impact of the copper microstructure on the em behavior. additionally, an application of the kinetic monte carlo method in combination with the ab initio method for em analysis is demonstrated. results of ab initio and atomistic calculations are also used for the derivation of a compact model for early em failures in copper dual-damascene m1/via structures. the model is based on the combination of a complete void nucleation model together with a simple mechanism of slit void growth under the via. it is demonstrated that the early em lifetime is well described by a simple analytical expression, from which its statistical distribution can be obtained. moreover, it is shown that the simulation results provide a reasonable estimate for the em lifetimes.

2. theoretical background

2.1. electronic density based calculation of effective valence

generally, the effective valence is a tensor field ($\bar{Z}$), which defines a linear relationship between the em force ($\vec{F}$) and an external electric field ($\vec{E}$):

$$\vec{F}(\vec{R}) = e\,\bar{Z}(\vec{R})\,\vec{E} \qquad (1)$$

for the calculation of the effective valence several methods have been proposed, all of them being based on the computation of electron scattering states [3]. density functional theory (dft), in connection with the augmented plane wave (apw) method [4] or the korringa-kohn-rostoker (kkr) method [5], has been established as the most powerful method for the determination of scattering states; however, it requires a demanding computational scheme. the cumbersome representation of scattering wave functions with many parameters is a heavy burden on stability and accuracy of subsequent numerical steps. in this work we introduce a more robust and efficient method to calculate the effective valence, which relies only on the electron density $\rho_{\vec{k}}(\vec{r})$.
the basic idea is given in the following equation for the tensor components:

$$Z_{ij}(\vec{R}) = \frac{\Omega}{4\pi^{3}} \iiint d^{3}k\; \delta(E_F - E_{\vec{k}})\, \tau(\vec{k})\, [\vec{v}(\vec{k}) \cdot \hat{x}_i] \iiint d^{3}r\; \rho_{\vec{k}}(\vec{r})\, [\nabla_{\vec{R}} V(\vec{R} - \vec{r}) \cdot \hat{x}_j] \qquad (2)$$

$V$ is the interaction potential between an electron and the migrating atom, $\tau(\vec{k})$ is the relaxation time due to scattering by phonons, $\vec{v}(\vec{k})$ is the electron group velocity, and $\Omega$ is the volume of a unit cell. the first integration is over the k-space and the second over the volume of the crystal. for the calculation of the electron density the dft tool vasp [6] is used. an example of a vasp calculation is given in fig. 1. the electron density alone provides a qualitative explanation for the fact that the effective valence is higher in the bulk than in the grain boundaries. similar analyses can be performed for atomic structures of different copper/insulator interfaces. higher electron densities lead to higher effective valences, as can be seen from (2) [7]. for an accurate electron density calculation it is necessary to know the exact positions of the atoms in the structure.

fig. 1 portion of the bulk copper crystal. the electron density is represented in two orthogonal planes. it varies from higher values (circle regions around atoms) closer to the atomic nucleus to lower in the inter-atomic space

2.2. kinetic monte carlo simulation of electromigration

to utilize results of quantum mechanical calculations for kinetic monte carlo simulations an average driving force along the diffusion jump path must be calculated. in general, the microscopic force-field depends on the position of the defect along the diffusion jump-path. the average of the microscopic force over the j-th diffusion jump path between locations $\vec{r}_j$ and $\vec{r}_{j+1}$ [3] is

$$\bar{F}_j = \frac{1}{|\vec{r}_{j+1} - \vec{r}_j|} \int_{\vec{r}_j}^{\vec{r}_{j+1}} \vec{F}(\vec{r}) \cdot d\vec{r} \qquad (3)$$

the change in diffusion barrier height is equal to the net work by the microscopic force as the defect is moved from the initial to final sites over the entire jump path.
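the link between the microscopic force and the net work along a jump path can be illustrated numerically. the following python sketch is not the authors' implementation: it assumes the jump path is available as a list of sampled positions, approximates the line integral of the force by the trapezoidal rule on that piecewise-linear path, and divides by the path length to obtain a path-averaged force. the function names and the uniform test force field are hypothetical.

```python
import math


def net_work_along_path(points, force):
    """Trapezoidal estimate of the line integral of force . dr along a polyline.

    points: list of (x, y, z) positions sampled along the jump path
    force:  callable mapping a position to a force vector (fx, fy, fz)
    """
    work = 0.0
    for p0, p1 in zip(points, points[1:]):
        step = [b - a for a, b in zip(p0, p1)]
        f0, f1 = force(p0), force(p1)
        # Mean of the force at both segment ends, dotted with the step vector.
        work += sum(0.5 * (a + b) * d for a, b, d in zip(f0, f1, step))
    return work


def path_length(points):
    """Total length of the piecewise-linear path."""
    return sum(math.dist(p0, p1) for p0, p1 in zip(points, points[1:]))


def average_force_along_path(points, force):
    """Net work divided by path length: the force averaged along the path."""
    return net_work_along_path(points, force) / path_length(points)
```

for a uniform force parallel to a straight path the average equals the force magnitude, and for a path perpendicular to the force the net work vanishes, which is a quick sanity check for any sampled force field.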
the rates of defect jumps were calculated using the harmonic approximation to transition state theory (tst) [9]. in this approximation the transition rate is given by

$$\Gamma = \nu_0 \exp\left(-\frac{E_m}{k_B T}\right) \qquad (4)$$

where $E_m$ is the migration energy (barrier) defined as the difference in energy between the transition state and the initial state, and $\nu_0$ is an attempt frequency [10]. for each defect site α the residence time is calculated as [11]

$$\tau_\alpha = \frac{1}{\sum_{i=1}^{N_\alpha} \Gamma_{\alpha i}} \qquad (5)$$

where $N_\alpha$ is the number of possible jump sites from the site α. a single point defect is created at an arbitrary site, the clock is set to zero, and the defect is released to walk through the system. at each step, the jump direction is decided by a random number according to the local jump probabilities

$$P_{\alpha i} = \Gamma_{\alpha i}\, \tau_\alpha \qquad (6)$$

the jump is implemented by updating the coordinates of the defect. by repeating the described random walk procedure for millions of defects, their concentration dependence on the effective valence tensor and the external field is calculated.

2.3. compact model for lifetime estimation

in order to calculate the mechanical stress in a three-dimensional copper dual damascene interconnect structure, a complex physically based model including the em equation, the electro-thermal equation, and the mechanical equations has to be solved [7]. korhonen et al. [14] proposed a simple one-dimensional model, where the solution for the stress at the cathode of a semi-infinite line is given by

$$\sigma(t) = \frac{2 e Z^{*} \rho j}{\Omega} \sqrt{\frac{D_a B \Omega}{\pi k_B T}}\, \sqrt{t} = a \sqrt{t} \qquad (7)$$

where $D_a$ is the effective atomic diffusivity and $B$ is the effective modulus, which depends on the metal and the surrounding materials. void formation occurs as soon as the mechanical stress reaches a critical magnitude at a site of weak adhesion, typically at the copper/capping layer interface [15], [16].
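the square-root stress build-up at the cathode can be evaluated directly. the python sketch below is a hedged illustration, not the authors' code: it assumes the usual closed form of korhonen's semi-infinite line solution, σ(t) = (2 e Z* ρ j / Ω) sqrt(D_a B Ω t / (π k T)), and lumps everything except sqrt(t) into a prefactor a. the function and argument names are hypothetical, and any numerical values passed in are placeholders rather than parameters from this study.

```python
import math

K_B = 1.380649e-23     # Boltzmann constant, J/K
Q_E = 1.602176634e-19  # elementary charge, C


def korhonen_prefactor(z_eff, resistivity, current_density,
                       atomic_volume, diffusivity, modulus, temperature):
    """Prefactor a (Pa per sqrt(s)) in sigma(t) = a * sqrt(t), all inputs in SI units.

    Assumed form: a = (2 e |Z*| rho j / Omega) * sqrt(D_a B Omega / (pi k T)).
    """
    # Driving term e |Z*| rho j / Omega, in Pa/m.
    g = Q_E * abs(z_eff) * resistivity * current_density / atomic_volume
    # Stress "diffusivity" D_a B Omega / (k T), in m^2/s.
    kappa = diffusivity * modulus * atomic_volume / (K_B * temperature)
    return 2.0 * g * math.sqrt(kappa / math.pi)


def cathode_stress(a, t):
    """Stress at the cathode end after time t (seconds) for a given prefactor a."""
    return a * math.sqrt(t)
```

in this form the stress scales linearly with the current density and with the square root of time, so quadrupling the stress time doubles the stress at the cathode.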
thus, the void nucleation time is determined by the condition $\sigma(t_n) = \sigma_c$, which applied to (7) yields

$$t_n = \left(\frac{\sigma_c}{a}\right)^{2} \qquad (8)$$

where $\sigma_c$ is the critical stress. the solution given by (8) is a good approximation to the more complete solution obtained by solving a full physical model [7], [13] numerically, as will be shown later. it should be pointed out that (8) is valid as long as the stress remains significantly smaller than the stress magnitude at the steady state condition, which holds true for the void formation phase.

fig. 2 early failure mode: slit void growth under the via

2.4. void growth

for a copper dual-damascene m1/via structure with downstream electron flow, em failure analyses [11] indicate that the early failures are caused by slit voids located under the via, as shown in fig. 2. since the void is very thin and does not grow through the line height, void growth can be described by a one-dimensional process, so that the void length is given by

$$l_v(t) = v_d\, t \qquad (9)$$

where $v_d$ is the drift velocity of the right edge of the void. the atomic flux into the right edge of the void is governed by the diffusivity of the copper/barrier layer interface $D_i$, while the outgoing flux is governed by the surface diffusivity $D_s$. since $D_s \gg D_i$, using the nernst-einstein equation one can write [17]

$$v_d = \frac{D_s}{k_B T}\, e Z^{*} \rho j \qquad (10)$$

the em failure occurs when the void spans the via size, $l_v = L_{via}$, so that the void growth time contribution to the em lifetime is given by

$$t_g = \frac{L_{via}\, k_B T}{D_s\, e Z^{*} \rho j} \qquad (11)$$

3. results and discussion

the ab initio method described above is applied for the calculation of the effective valence inside grain boundaries and the calculated value is used to parameterize our continuum-level model [7]. prior to carrying out the ab initio calculation it is necessary to construct grain boundaries with exact positions of atoms. for this purpose an in-house molecular dynamic (md) simulator with a many-atom interatomic potential based on effective-medium theory [8] is used.
the total energy of the system is expressed as

$$E_{tot} = \sum_{i=1}^{N} F(n_i) + \frac{1}{2} \sum_{i=1}^{N} \sum_{j \neq i} V(r_{ij}) \qquad (12)$$

for an n-atom system, where $V(r_{ij})$ describes a pair potential and $F(n_i)$ describes the energy due to the electron density. an example of the construction of grain boundaries by means of md simulation is presented in fig. 3.

fig. 3 formation of grain boundaries (circled regions)

ab initio calculations of the effective valence in copper grain boundaries have provided a value 75% lower than in the bulk for 4.3 ev fermi energy (cf. fig. 4), which is in good agreement with the results of sorbello [3]. along with the determination of the effective valence, ab initio calculations predict a lowering of the energy barrier for atomistic transport. knowing the influence of the em force on the diffusional barrier we utilize kinetic monte carlo [9] simulations for em, which provide a closer look into the distribution of atoms in the presence of em for a specific atomistic configuration. the dependence of the atomic concentration on the angle between the em force and the jump direction is displayed in fig. 5. the em intensity clearly decreases from θ = 0°, where the em force acts in the fast diffusivity path direction, to a minimum for θ = 90°, where the em force is orthogonal to this direction. ab initio calculations serve as a basis for a proper consideration of fast diffusivity paths and microstructure in the comprehensive physically based model [7]. the solution of such a model is indeed rather complex and a detailed description of the numerical approach can be found in [13].

fig. 4 average distribution of the effective valence near a grain boundary. the external electric field is oriented parallel to the grain boundary

fig. 5 concentration difference at four different angles (θ) between the em force and the atom migration paths

fig.
6 shows the mechanical stress close to the via at the cathode end of a simulated line. a high stress develops adjacent to the via, where there is a line of intersection between the copper, the capping layer, and the barrier layer. for a copper dual-damascene m1/via structure with downstream electron flow, this is the typical site for void formation and growth leading to early em failures. since em failure has a statistical character, in order to obtain a distribution of void nucleation times several lines with different microstructures were simulated. in particular, the mechanical stress under the via was monitored for a total of twenty lines, from where the resulting stress build-up for five different structures is shown in fig. 7. we have observed that the time evolution of the stress curves can be divided into two main parts. in the first one the stress increases linearly with time, while in the second part it increases with the square root of time, as shown in fig. 8 for a typical stress curve. it should be pointed out that kirchheim [18] derived a linear stress increase from a one-dimensional version of a full physical model [7] under the condition that the stress is sufficiently low. in turn, korhonen et al. [14] obtained a square root stress increase, as given by (7), from the solution of a simplified model for em stress build-up. thus, the stress build-up obtained from our numerical simulations with a rather complete model and for fully three-dimensional structures can be conveniently described by simple analytical solutions. since void nucleation is expected to occur at high stress magnitudes, the second part of the stress curve shown in fig. 8 is fitted by the square root model given in (7), where $a$ is used as the fitting parameter. by fitting the stress curves of all simulated structures, the distribution of the parameter $a$ is determined, as shown in fig. 9.
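the fit of the square-root part of a stress curve has a closed-form least-squares solution when a is the only free parameter. the python sketch below is a hypothetical illustration, not the authors' fitting code: minimizing the sum of (σ_i − a sqrt(t_i))² over the samples gives a = Σ σ_i sqrt(t_i) / Σ t_i. it also evaluates the nucleation time that follows from the condition σ(t_n) = σ_c, that is t_n = (σ_c / a)². the function names are made up, and selecting which samples belong to the square-root regime is left to the caller.

```python
import math


def fit_sqrt_prefactor(times, stresses):
    """Least-squares estimate of a in sigma = a * sqrt(t).

    Setting the derivative of sum((s - a*sqrt(t))**2) to zero gives
    a = sum(s * sqrt(t)) / sum(t).
    """
    numerator = sum(s * math.sqrt(t) for t, s in zip(times, stresses))
    denominator = sum(times)
    return numerator / denominator


def nucleation_time(critical_stress, a):
    """Time at which a * sqrt(t) reaches the critical stress."""
    return (critical_stress / a) ** 2
```

repeating the fit for every simulated line yields one value of a per microstructure, and mapping each value through `nucleation_time` gives the corresponding sample of nucleation times; the units of the result follow from the units used for the stress and time samples.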
the parameter $a$ is well described by lognormal statistics, where the mean and the standard deviation are mpa/s^{1/2} and , respectively. once $a$ is known, the void formation time is obtained from (8). since the distribution of $a$ is also determined, we are able to obtain the statistical distribution of the void formation times, shown in fig. 10. due to the lognormal statistics of $a$, the void formation time $t_n$ also follows a lognormal distribution, where the mean and standard deviation are h and . it should be pointed out that filippi et al. [12] estimated a nucleation time of approximately 5 h, which lies within the range predicted by the simulations.

fig. 6 hydrostatic stress distribution (in mpa). high stress develops at the copper/capping/barrier layer intersection adjacent to the via

the void growth time is determined by (11), which is a function of the surface diffusivity. choi et al. [17] obtained activation energy for surface diffusivity of ev on clean copper surfaces. it is expected that their measurement delivers a more precise copper surface diffusivity than the typical ones obtained on oxidized surfaces [17] and, therefore, we have used their estimate in our simulations. furthermore, we have assumed that the activation energy follows a normal distribution [19]. as a consequence, both the surface diffusivity and the void growth time are lognormally distributed. the mean and the standard deviation of the void growth time distribution are h and , respectively. the void formation and the void growth times are of about the same order of magnitude, as shown in fig. 10, which highlights the importance of considering both contributions for the early em lifetime estimation under accelerated test conditions.

fig. 7 stress build-up at the copper/capping/barrier layer intersection for lines with different microstructures

fig.
8 fitting of a numerical solution using a linear and a square root model

as the void nucleation and the void growth times are known, the early em lifetime is given by the combination of (8) and (11),

$$t_f = t_n + t_g = \left(\frac{\sigma_c}{a}\right)^{2} + \frac{L_{via}\, k_B T}{D_s\, e Z^{*} \rho j} \qquad (13)$$

the distributions of the em lifetimes are shown in fig. 10, together with the experimental results obtained from filippi et al. [12]. the lognormal mean and standard deviation of the simulated lifetimes are h and . we can see that the simulation results provide a reasonable description of the early em lifetimes. a major advantage of (13) is that it is a simple analytical formula which is more rigorously related to the physical mechanisms active during the early em failure development than black's equation. a critical issue arises, however, with regard to the estimation of the parameter $a$. this parameter is affected by several factors, like diffusion coefficients, effective valence, mechanical moduli, microstructure, and more, so that it cannot be defined in a closed form in full physical modeling [7], [13]. nevertheless, we have observed that it can be related to korhonen's solution. in this way, it can be directly described by an analytical expression and connected to physical parameters according to (7).

fig. 9 distribution of the square root model fitting parameter. the line represents a lognormal fit

the relative difference between the simulated and experimental lifetimes for the same failure percentile varies between 15% and 20%, as shown in fig. 11. the difference is smaller for shorter lifetimes, since the proposed slit void growth model is more accurate for very early failures, where the void volumes are smaller. such an error magnitude is reasonable, given the required assumptions for the parameters and considering the simplicity of the model.

fig. 10 early em lifetime distribution

fig.
11 error between the simulation and the experimental results

4. conclusion

our work demonstrates a novel approach for the calculation of the em force on an atomistic level and its application to continuum-level modeling. the consideration of the accurate effective valence in grain boundaries allows a realistic simulation of em behavior. the presented combination of atomistic force calculations with a kinetic monte carlo simulation enables sophisticated analyses of vacancy dynamics. a compact model for estimation of the early em lifetimes in m1/via structures of copper dual-damascene interconnects was developed. the model was derived through the combination of a complete model for void nucleation together with a simple slit void growth mechanism under the via. given the simplifications and assumptions made for the simulations, a reasonable approximation to experimental early em failures has been obtained.

acknowledgment: this work was partly supported by the austrian science fund fwf, project p23296-n13.

references

[1] z.-s. choi, r. mönig, and c. v. thompson, "dependence of the electromigration flux on the crystallographic orientations of different grains in polycrystalline copper interconnects," appl. phys. lett., vol. 90, p. 241913, 2007. [2] e. zschech and p. r. besser, "microstructure characterization of metal interconnects and barrier layers: status and future," proc. interconnect technol. conf., pp. 233-330, 2000. [3] r. s. sorbello, "microscopic driving forces for electromigration," in materials reliability issues in microelectronics, edited by j. r. lloyd, f. g. yost, and p. s. ho, vol. 225, pp. 3-10, 1996. [4] r. p. gupta, "theory of electromigration in noble and transition metals," phys. rev. b, vol. 25, pp. 5118-5196, 1982. [5] d. n. bly and p. j. rous, "theoretical study of the electromigration wind force for adatom migration at metal surfaces," phys. rev. b, vol. 53, p. 13909, 1996. [6] g.
kresse and j. furthmüller, "efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set," phys. rev. b, vol. 54, p. 11169, 1996. [7] h. ceric, r. l. de orio, j. cervenka, and s. selberherr, "a comprehensive tcad approach for assessing electromigration reliability of modern interconnects," ieee trans. dev. mat. rel., vol. 9, pp. 9, 2009. [8] k. w. jacobsen, j. k. norskov, and m. j. puska, "interatomic interactions in the effective-medium theory," phys. rev. b, vol. 35, p. 7423, 1987. [9] r. sorensen, y. mishin, and a. f. voter, "diffusion mechanisms in cu grain boundaries," phys. rev. b, vol. 62, p. 3658, 2000. [10] m. gall, c. capasso, d. jawarani, r. hernandez, h. kawasaki, and p. s. ho, "statistical analysis of early failures in electromigration," j. appl. phys., vol. 90, no. 2, pp. 732-740, 2001. [11] a. s. oates and m. h. lin, "electromigration failure distribution of cu/low-k dual-damascene vias: impact of the critical current density and a new reliability extrapolation methodology," ieee trans. device mater. rel., vol. 9, no. 2, pp. 244-254, 2009. [12] r. g. filippi, p.-c. wang, a. brendler, p. s. mclaughlin, j. poulin, b. redder, and j. r. lloyd, "the effect of a threshold failure time and bimodal behavior on the electromigration lifetime of copper interconnects," proc. intl. reliability physics symp., pp. 444-451, 2009. [13] r. l. de orio, dissertation, technische universität wien, (2010). [online]. available: http://www.iue.tuwien.ac.at/phd/orio/ [14] m. a. korhonen, p. borgesen, k. n. tu, and c.-y. li, "stress evolution due to electromigration in confined metal lines," j. appl. phys., vol. 73, no. 8, pp. 3790-3799, 1993. [15] r. j. gleixner, b. m. clemens, and w. d. nix, "void nucleation in passivated interconnect lines: effects of site geometries, interfaces, and interface flaws," j. mater. res., vol. 12, pp. 2081-2090, 1997. [16] m. w. lane, e. g. liniger, and j. r. lloyd, "relationship between interfacial adhesion and electromigration in cu metallization," j. appl. phys., vol. 93, no. 3, pp. 1417-1421, 2003. [17] z. s. choi, r. mönig, and c. v. thompson, "activation energy and prefactor for surface electromigration and void drift in cu interconnects," j. appl. phys., vol. 102, p. 083509, 2007. [18] r. kirchheim, "stress and electromigration in al-lines of integrated circuits," acta metall. mater., vol. 40, no. 2, pp. 309-323, 1992. [19] l. doyen, x. federspiel, l. arnaud, f. terrier, y. wouters, and v. girault, "electromigration multistress pattern technique for copper drift velocity and black's parameters extraction," proc. intl. integrated reliability workshop, pp. 74-78, 2007.

facta universitatis series: electronics and energetics vol. 31, no. 3, september 2018, pp. 329-342 https://doi.org/10.2298/fuee1803329r

development of an iot system for students' stress management

branka rodić-trmčić 1, aleksandra labus 2, zorica bogdanović 2, marijana despotović-zrakić 2, božidar radenković 2

1 medical college of applied studies in belgrade, serbia
2 faculty of organizational science, university of belgrade, serbia

abstract. this paper shows the development of an iot system for students' stress management. the iot system is developed in an open architecture and is an integral part of the educational ecosystem. the system is composed of two elements: one that enables measurement of vital parameters for identifying stress in students, and another for stress control. the system for stress control consists of a mobile health application featuring relaxation content. such a system should minimize excitement and help reduce future stress. the iot system for stress management was evaluated in a real environment, during students' thesis defense at the faculty of organizational sciences, university of belgrade.
the results show that time spent using the mobile health application with relaxing content can reduce students’ physiological arousal and excitement during thesis defense.

key words: internet of things, wearable computing, stress management, education, students

received november 23, 2017
corresponding author: božidar radenković, faculty of organizational science, university of belgrade, jovana ilića 154, 11010 belgrade, serbia (e-mail: boza@elab.rs)

1. introduction

the internet of things (hereinafter: iot) is the paradigm of the modern world in which people and devices are linked and communicate with each other. the human dimension of the internet of things leads to its role in the healthcare sector, where it can change health behavior for the better. to date, different technological solutions have been developed with the aim of improving healthcare and conducting preventive actions [1] in various areas, including the field of stress management. stressful situations can generate arousal and anxiety. when a person is aroused, the body is under stress. stress activates the sympathetic nervous system, and its activation causes different reactions in the human body, such as the production of sweat, a heart rate increase and muscle tension. although stress plays a positive role in performance, too much stress or repeated stress can have negative effects. stress has been recognized as one of the leading problems in healthcare and has a high impact on people’s health. long-term repetitions of stress manifestations in any population can be used as a predictor of other health conditions and disorders. research shows that the most pronounced sources of stress are in the domain of self-cognition and school life [2]. there is a frequent occurrence of stress in students during studies.
it can be caused by changes in habits, separation from family members, or excitement in taking an exam [3]. many students face a variety of stressful situations during an exam, which can negatively affect the result of the exam. at the same time, poor results frequently do not mean less intelligence or student knowledge [4]. to cope with stressful moments and prevent a repetition of stress, it is important for students to manage situations that can produce negative arousal. at the same time, providing teachers with real-time information about students’ arousal is an important part of stress prevention. such biofeedback enables teachers to adapt classes or the exam method and thus create an environment that is more favorable for the student. stress management in this domain refers to using different technologies to measure and control a person’s level of stress. the presence of stress can be identified by measuring different vital parameters of a user’s body with various iot and wearable systems. commercial iot systems do not provide open apis and can hardly be programmed or customized, which makes them hard to integrate with other e-education services. in order to make this integration possible, we have chosen an approach based on open architecture. an open architecture ensures that the developed iot system for stress management becomes an integral part of an educational ecosystem. in addition, an important part of the developed system for stress management is a mobile health application with relaxation content that should minimize the students’ excitement and have an impact on reducing future stress. the pilot project was evaluated in a real environment, at the faculty of organizational sciences, university of belgrade, during students’ thesis defense. besides students' stress management, this work is also motivated by an educational aspect.
the educational aspect means advocating that students be provided with the fundamentals of iot technologies, applications, and devices. further, the emphasis is on introducing the application potential of iot, so that students are able to use and further develop the iot system. the rest of the paper is organized as follows: section 2 summarizes previous work in the field of stress management. section 3 is dedicated to the presentation of the development and implementation of the iot system for stress management. section 4 discusses the research methods for experiment setup and data collection. experimental results and discussion are provided in section 5. section 6 concludes the paper.

2. related work and motivation

in recent years, iot technologies have been widely applied in many projects in various health areas. technologies based on iot, like mobile technologies and wearable computing, are frequently used in the area of self-measuring of health metrics. those measurements are used for health promotion, well-being, and stress management, where they can detect changes in vital parameters and play a role in the improvement of human behavior. different kinds of wearables and sensors, used individually and in combination, are designed for tracking vital parameters that indicate stress or arousal symptoms, such as heart rate, galvanic skin response (gsr), blood pressure, and others. heart rate is one of the health status indicators and a manifestation of the presence of arousal. heart rate sensors are often used in sport [5][6] and fitness [7] equipment. in everyday life, a heart rate sensor can improve identification of stress indicators and help to provide stress management [8] [9]. smart phones play a big part in stress management. some devices have embedded sensors for detecting and monitoring behavior indicative of stress or depression [10]. others have implemented biofeedback, i.e. therapeutic tools for stress and anxiety [11] [12]. ahtinen proposed a mobile application for stress management that contains four intervention modules for mental wellness training [13]. the study, which recruited 15 participants from a university, has shown significant improvements in stress and life satisfaction in respondents. one commercial solution for stress management is biosync technology. the patient’s heart rate, galvanic skin response, and movements are measured. data are automatically processed to determine the patient’s stress level, and in addition, the patient is provided with preventive measures to have a better life [14]. a number of papers have looked into measuring the psychological state and physical reactions in students [15] and their academic success, as well as calculating their correlation [16] [17], or the improvement of students’ mental health [8]. the research carried out by shen, wang & shen [18] used psychological signals to predict emotions. they investigated the presence of different emotions in the studying process and proposed a sensible e-learning model. the data were acquired using three sensors: a skin conductivity sensor measuring electrodermal activity, a photoplethysmograph sensor measuring blood pressure, and an eeg sensor measuring brain activity. measurements were taken over several weeks on one subject in a natural environment, as close as possible to the everyday environment. in the study [19], a heart rate sensor was implemented together with a skin conductivity sensor, accelerometer and temperature sensor in a natural context: public appearance of phd students in front of an audience, where significant variations in the values of measured vital parameters were observed. feedback to the speaker was implemented in the form of a talk assistant that sends information about heart rate to trigger relaxation.
in this research, we want to design and implement a system to identify the psychophysiological signals indicating stress during students’ thesis defense. most of the research uses different kinds of devices to recognize stress in students, but few studies deal with stress management in a real environment [20] [21]. our goal is to provide a solution that would enable monitoring of bodily manifestations of arousal and help students to manage their stress during thesis defense. a mobile application with relaxation content should minimize the arousal in students and have an impact on reducing stress during thesis defense. the influence of the mobile application on stress and anxiety levels, and on changes in students’ behavior in certain contexts, will be analyzed.

3. an iot system for stress management

using sensors and smart phones to monitor health status usually implies: collecting data from the sensors; providing support to the user through a display with the measured values; sharing the information; ensuring low power consumption, wearability, precision, longevity and reliability of devices. the iot system architecture for stress measurement consists of three parts: a wearable system for monitoring vital parameters, a mobile health application featuring relaxation content, and a cloud platform. in addition, we have developed services for connecting the components, hosts, and users (see fig. 1). the wearable system requires components such as a heart rate sensor, a microprocessor to process obtained data, and a wireless internet connection that allows the participants to move freely during their activities. the second component, the mobile application featuring relaxation content, can be installed on any android device that has a wireless connection to the internet.
the cloud platform collects sensor data from the wearable system, analyzes the collected data, and tracks browsing history from the mobile application.

fig. 1 components of the iot system for stress management

3.1. intelligent devices

one of the main challenges is designing an iot system that is wearable, low-power, reliable and precise. commercial solutions often satisfy the above characteristics. one such solution is xiaomi mi band (shown in fig. 2), which is completely wearable. the device possesses a bluetooth module that enables communication with the appropriate mobile application. like most commercial solutions, it comes with no open apis and, thus, it is not programmable or customizable. accordingly, there is no possibility to gather sensor data and store it in a database for real-time monitoring, further analytics, biofeedback, or integration with other educational services.

fig. 2 commercial wearable wristband xiaomi mi band 2

the wearable system developed in this pilot project is not as completely wearable as commercial solutions, but it provides open apis that enable adjustment and integration with other systems in accordance with our needs. fig. 3 shows the components of the wearable system for heart rate measurement.

fig. 3 schematic view of components of a wearable system for heart rate measurement

in the implementation, we used pulse sensor amped, a plug-and-play heart rate sensor for arduino. amplification and noise cancellation circuitry is added to the hardware of the sensor [22]. it works with either 3v or 5v lilypad arduino. a heart rate sensor records the user’s heartbeats, and heart rate is an important parameter in evaluating arousal or exposure to stress. the measurement technology is usually based on two beams of light of different wavelengths that are focused on the human nail tip. the measured signal can then be obtained by a photosensitive element.
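to make the step from a photosensitive element's signal to a heart rate value concrete, the following is a minimal python sketch, not the firmware or software used in this project. it assumes the sensor output is available as a list of evenly spaced samples, detects beats as upward crossings of a fixed threshold, and converts the mean interval between beats to beats per minute. the function names, the 500 hz sample rate, the threshold value and the synthetic sine-wave pulse are all hypothetical choices made for illustration.

```python
import math


def detect_beats(samples, sample_rate_hz, threshold):
    """Return beat times in seconds, taken at upward threshold crossings."""
    beat_times = []
    for i in range(1, len(samples)):
        # A beat is counted when the signal rises through the threshold.
        if samples[i - 1] < threshold <= samples[i]:
            beat_times.append(i / sample_rate_hz)
    return beat_times


def beats_per_minute(beat_times):
    """Average heart rate computed from the mean interval between beats."""
    if len(beat_times) < 2:
        raise ValueError("need at least two beats to estimate heart rate")
    intervals = [later - earlier
                 for earlier, later in zip(beat_times, beat_times[1:])]
    mean_interval_s = sum(intervals) / len(intervals)
    return 60.0 / mean_interval_s


# Assumed test input: a clean 1.2 Hz pulse (72 beats per minute),
# sampled at an assumed 500 Hz for 10 seconds.
SAMPLE_RATE_HZ = 500
signal = [math.sin(2 * math.pi * 1.2 * n / SAMPLE_RATE_HZ)
          for n in range(10 * SAMPLE_RATE_HZ)]

beats = detect_beats(signal, SAMPLE_RATE_HZ, threshold=0.5)
bpm = beats_per_minute(beats)
print(f"estimated heart rate: {bpm:.1f} bpm")
```

a real pulse waveform is noisier than this synthetic signal, so a practical detector would likely need filtering or an adaptive threshold; the fixed threshold here only keeps the sketch short.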
the heart rate sensor is connected to an arduino lily pad (atmega32u4), which in turn is connected to a raspberry pi microcomputer. a usb cable was used to connect the raspberry pi and the arduino lily pad. the system is packaged in a plastic box with two bracelets that make the device easy to wear. the heartbeat sensor is attached to the fingertip, which enables more reliable measurement without excess sensor movement (see fig. 4).

fig. 4 position of wearable device during project evaluation

the software was implemented using the python programming language. in the created prototype, the communication layer consists of communication and networking connection with the wi-fi access integrated into devices with the support of required software [23]. obtained sensor data are transmitted to the cloud using the raspberry pi with a wi-fi module (802.11b/g/n). the wi-fi module and the mobile phone with the health application are connected to a local secured wi-fi network. the mobile application connects to web services on the remote cloud through mobile wi-fi access with low power consumption, so that the user can carry the device for a longer period of time. the power supply is provided via a power bank battery attached to the plastic box of the device. the charged battery provides about 2 hours of continuous measurement.

3.2. mobile health application

the android application with relaxation content was created in the android studio 1.2.1 programming environment, using the java programming language. contents that could have a relaxing effect were implemented in the app: funny sports scenes, beautiful nature photos, relaxing natural sounds [24] [25]. the content was taken from youtube. the content watched by users is recorded on the cloud platform. fig. 5 shows some screens of the application content.

fig. 5 respondent relaxing with mobile health application

3.3. cloud platform

the cloud platform is mainly responsible for data classification and storage. the raspberry pi reads each piece of data from the arduino processor through the serial port. if there are no errors, the raspberry pi sends the data to the microsoft azure iot hub platform. raw data are analyzed using a microsoft azure stream analytics processing job. the result of the analytics process is a stream of aggregated data, which is stored in an mssql database on the cloud.

4. research methods

the aim of this evaluation is to identify the psychological signals indicating arousal during students’ thesis defense. in addition, we examine and compare the presence of arousal in respondents who used the mobile health application with relaxation content (experimental group) and in respondents who did not use it (control group). statistical significance should indicate whether stress during thesis defense can be managed with relaxing content delivered through the mobile application.

4.1. instruments

for the purpose of evaluation, a general, non-anonymous questionnaire was created. it consisted of demographic questions and it was filled out by students before the first phase of testing. in order to evaluate anxiety level before testing, after the test, and after relaxation following the completed test, spielberger’s [26] test anxiety inventory (stai) test is used. the test is a self-report instrument for measuring anxiety and it consists of 20 questions. the answers were provided in the form of a 4-point likert scale, namely: 4 – very much so, 3 – moderately, 2 – a little, 1 – not at all, and respondents rated the extent to which each statement is true for them. the instrument referred to the student’s state at the given time, not to an earlier period. the students’ stai test scores were categorized into low (20 to 40), moderate (41 to 50) and high (51 to 80) [17].

4.2. participants

the evaluation sample consists of students of the faculty of organizational sciences, university of belgrade. the participants were students of the master studies of the department of e-business. all participants had previously attended theoretical and practical courses in e-business, internet technologies, mobile technologies, and iot, so they were familiar with sensors and usage of wearable devices. in total, 26 students successfully finished the evaluation of the pilot project. all of the participants were informed about the testing and they signed a consent form. the participants were 23 to 30 years old. there is evidence that health-risk behaviors (smoking cigarettes, lack of physical activity, etc.) influence higher levels of stress in students [27]. in the sample, most of the respondents were physically active regularly or at least occasionally (a couple of times a week, at least half an hour) and most were non-smokers. about half of the students had a job. table 1 shows descriptive statistics of the sample.

table 1 descriptive statistics of the sample

characteristic      gradation      frequency   percentage (%)
sex                 male           10          38
                    female         16          62
smoker              yes            2           8
                    no             24          92
physical activity   regularly      12          46
                    occasionally   12          46
                    never          2           8
employed            yes            14          54
                    no             12          46

4.3. experimental design

each experiment session lasted approximately 45 minutes per participant. each respondent was situated in a pleasant classroom and connected to the wearable system after arrival. respondents wore the wearable device on the left hand and the pulse sensor was placed on the index finger. there were no sensors on the right hand, so writing was possible. at the beginning, respondents filled out a general questionnaire and an stai test, and signed a consent to participate in testing. the sample was divided into two groups: experimental and control.
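the stai scoring that respondents' answers go through (20 items on a 4-point scale, with the low, moderate and high bands given in section 4.1) can be illustrated with a short python sketch. this is an assumption-laden illustration rather than the scoring procedure used in the study: the function names are hypothetical, the total is taken as a plain sum of the 20 ratings, and any reverse-keyed items are assumed to have been recoded beforehand.

```python
def stai_total(answers):
    """Sum 20 item ratings, each from 1 (not at all) to 4 (very much so)."""
    if len(answers) != 20:
        raise ValueError("the inventory has 20 items")
    if any(answer not in (1, 2, 3, 4) for answer in answers):
        raise ValueError("each answer must be 1, 2, 3 or 4")
    return sum(answers)


def anxiety_band(score):
    """Map a total score to the bands used in the paper:
    low (20 to 40), moderate (41 to 50), high (51 to 80)."""
    if not 20 <= score <= 80:
        raise ValueError("total score must lie between 20 and 80")
    if score <= 40:
        return "low"
    if score <= 50:
        return "moderate"
    return "high"


# Hypothetical respondent: ten items rated 2 and ten items rated 3.
example_answers = [2] * 10 + [3] * 10
example_score = stai_total(example_answers)
print(example_score, anxiety_band(example_score))
```

with 20 items rated from 1 to 4, the total necessarily falls between 20 and 80, which matches the outer limits of the three bands.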
after completing the questionnaires, respondents from the experimental group were given a tablet with a pre-installed android application with relaxation content and short instructions on how to use it. the respondent had 15 minutes to relax and use the application content. during the use of the mobile application, the duration of use and the browsed content were tracked and saved in the cloud. also, the obtained values of vital parameters for the respondent were recorded. the respondents from the control group followed the same procedure, except that they relaxed without the content of the mobile application. the scheme of the experimental protocol is shown in fig. 6.

fig. 6 protocol and testing phases

4.4. statistical analysis

data obtained from the heart rate sensor were averaged for each of the three phases individually. thus, we got three values of heart rate for each respondent. for further analyses, totaled averaged measures on the pre-test were then subtracted from totaled averaged measures on the post-test, determining the difference between the two measures. data analyses were performed using spss (v20.0). in all analyses, results were considered statistically significant at the p ≤ 0.05 level or the p ≤ 0.001 level. in the first step of the analysis, we tested differences between arousal occurrences in students’ heart rate through three phases: pre-test, test and post-test phase. differences between two phases of measurement, regardless of group, were tested with the wilcoxon signed-rank test. in the second step, we tested statistical differences in heart rate difference between the control and experimental groups through phases. for that purpose, we used student's t-test. normality was tested with the shapiro-wilk test, and equality of variances with levene’s test. in case normality or equality of variances was not met, data were tested with the mann-whitney u test [28].

5. results and discussion

5.1. arousal through testing phases

we have investigated whether there was a statistical difference in students’ heart rate between the pre-test phase and the test phase in which they defended their thesis. the results have shown that heart rate in the test phase is significantly higher (z=-2.66, p<0.05) than in the pre-test phase. this is an expected result, as a defense is a stressful event. it was expected that after increased arousal during the pre-test and test phases, respondents’ heart rate would become lower in the post-test phase, as the stressful event was finished. there is strong statistical significance that measured values of heart rate in the post-test phase are lower than heart rate values in the pre-test phase (z=-3.92, p<0.001) as well as in the test phase (z=-4.01, p<0.001). the post-test phase can be considered the most calming phase from the beginning of the test. figure 7 shows the distribution of differences of averaged heart rate between the pre-test and post-test phases in the experimental and control groups. the mean difference (shown as a vertical line in a box) is larger in magnitude in the control group (-9.46) than in the experimental group (-5.92).

fig. 7 distribution of differences in measured averaged heart rate values between pre-test and post-test phase

5.2. stress management in experimental and control group through testing phases

there were statistical differences (t(24) = -3.72, p<0.05) in arousal, measured by subtraction of heart rate between the pre-test and test phases, between the control and the experimental groups, regarding the use of the mobile health application with relaxation content in the pre-test phase. according to the results, the arousal of the experimental group was decreased, probably because they were treated with relaxation content during the pre-test phase. also, there is a strong statistical difference (mann–whitney u=17.5, p<0.05) in heart rate during the test and post-test phases between the control and experimental groups. the distribution of averaged heart rate values between the two phases is shown in fig. 8. the experimental group’s values do not deviate from test to post-test phase as much as they deviate in the control group (shown as the difference on the figure). the greater deviation, shown as difference, points to higher arousal in the test phase.

fig. 8 distribution of averaged heart rate values during test and post-test phase and difference between them in experimental and control group

since students used the mobile health application with relaxation content in the post-test phase, it could be an explanation for the significant difference in arousal between the two groups. students’ heart rate values were significantly lower in the post-test phase than during the pre-test phase, but there is no statistical significance in differences between the two phases among groups. regardless of the relaxation content on the mobile health application, students’ arousal was decreased in the post-test phase. table 2 presents the participants’ average time spent on different contents of the mobile health application with relaxation content. before and after defense, participants spent the longest total time watching fun sports clips; the longest average continuous time was also spent on fun sports clips, followed by listening to relaxing music.

table 2 distribution of time spent at different contents of the mobile health application with relaxation content

content       total time spent on content (hh:mm:ss)   average time on content (hh:mm:ss)
fun sport     05:46:18                                 00:06:56
relax music   04:10:21                                 00:04:49
relax photo   03:02:31                                 00:04:21

5.3. the results of the stai test

table 3 shows participants distribution by stai test points in both experimental and control groups.
table 3 participants distribution by stai test results

test phase/group      anxiety gradation
                      low   moderate   high
pre-test – total      20    5          1
  experimental        12    1          0
  control             8     4          0
test – total          22    3          1
  experimental        12    1          0
  control             10    2          1
post-test – total     25    1          0
  experimental        12    1          0
  control             13    0          0

almost 20% of students reported moderate anxiety in the pre-test phase, i.e. the moment before thesis defense. after the test, only 11.5% of respondents reported moderate anxiety, and after relaxation following the defense, in the post-test phase, just one respondent (3.8%) had moderate anxiety. high anxiety was present in the pre-test and test phases in 3.8% of the sample, from the control group. in the pre-test and test phases there were more respondents with moderate and high anxiety in the control group than in the experimental group. there is a statistically significant change between phases: stai post-test scores are lower than stai pre-test scores (z=-3.902, p<0.001). also, stai post-test scores are lower than stai test scores (z=-3.137, p<0.05), but there is no significant change in scores between stai pre-test and stai test scores. figure 9 shows the distribution of stai scores in the pre-test, test and post-test phases. the median of stai scores (shown as a horizontal line in boxes) and the mean (x mark in the box) are the highest in the pre-test phase and the lowest in the post-test phase.

fig. 9 distribution of scores of stai test through three phases

there was no correlation between stai test results and the level of heart rate through phases. low anxiety in participants during thesis defense could be explained either by overly subjective answers on the stai test, because the test was not anonymous, or by the fact that the students were well prepared and relaxed with their professors.

6. conclusion

the solution proposed in this paper demonstrates one of the ways of integrating the concepts of electronic health, mobile health, internet of things and wearable computing. the implemented iot system allows monitoring of respondents’ vital parameters. a mobile health application with relaxation content can have an impact on reducing arousal in respondents during the defense of the thesis. the proposed solution is applicable in education environments. unlike the conventional educational system, the proposed solution enables biofeedback. information about students’ arousal is sent to professors, who can adapt the exam flow accordingly. in addition, the iot solution proposed in this paper can be successfully applied for stress monitoring in different life situations. the values obtained in the research indicate that the use of a mobile health application with relaxation content had a significant effect on decreasing students’ arousal before and during thesis defense. there is no evidence that the application can help students to calm down after they defend their thesis.

the major contributions of this paper can be summarized as follows:
• the way of the realization of stress measurement and stress control through the iot concept and mobile health in an education environment.
• development of the iot infrastructure for stress measurement and control.
• introducing a new smart healthcare service into the education system, providing biofeedback to subjects involved in the education process.

the main limitation of the study is the small and homogeneous sample. also, there is no evidence whether the students were feeling stressed before the test, in both the control and the experimental group, which could have an effect on their baseline heart rate. in addition, future work should include students from different departments where there is a different attitude towards professors.
acknowledgement: the authors are thankful to ministry of education, science and technological development, republic of serbia, grant 174031. references [1] b. rodic-trmcic, a. labus, s. mitrovic, v. buha and g. stanojevic, “internet of things in e-health: an application of wearables in prevention and well-being,” in emerging trends and applications of the internet of things, igi global, 2017, pp. 191-197. [2] q. li, y. xue, l. zhao, j. jia and l. feng, “analyzing and identifying teens stressful periods and stressor events from a microblog,” ieee journal of biomedical and health informatics, vol. pp, no. 99, pp. 1-1, 2016. [3] n. sohail, “stress and academic performance among medical students,” journal of the college of physicians and surgeons pakistan, vol. 23, no. 1, pp. 67-71, 2013. [4] m. s. ali and m. n. mohsin, “test anxiety inventory (tai): factor analysis and psychometric properties,” iosr journal of humanities and social science (iosr-jhss), vol. 8, no. 1, pp. 73-81, 2013. [5] y. fu and j. liu, “system design for wearable blood oxygen saturation and pulse measurement device,” in proceedings of the 6th international conference on applied human factors and ergonomics (ahfe 2015) and the affiliated conferences, ahfe 2015, 2015. [6] suunto, “suunto foot pod mini,” 01 november 2015. [online]. available: http://www.suunto.com. [7] basis science, 02 november 2015. [online]. available: http://www.mybasis.com/. [8] a. millings, j. morris, a. rowe, s. easton, j. k. martin, d. majoe and c. mohr, “can the effectiveness of an online stress management program be augmented by wearable sensor technology?,” internet interventions, vol. 2, no. 3, pp. 330-339, 2015. [9] a. parnandi and r. gutierrez-osuna, “physiological modalities for relaxation skill transfer in biofeedback games,” ieee journal of biomedical and health informatics, vol. 21, no. 2, pp. 361-371, 2017. [10] s. saeb, m. zhang, c. karr, s. schueller, m. corden, k. kording and d. 
mohr, “mobile phone sensor correlates of depressive symptom severity in daily-life behavior: an exploratory study,” j med internet res, vol. 17, no. 7, p. e175, 2015. [11] h. al osman, h. dong and a. el saddik, “ubiquitous biofeedback serious game for stress management,” ieee access, pp. 1274-1286, 2016. [12] m. a. zafar, b. ahmed and r. gutierrez-osuna, “playing with and without biofeedback,” in proceedings of the ieee 5th international conference on serious games and applications for health (segah), 2017, perth, 2017. [13] a. ahtinen, e. mattila, p. välkkynen, k. kaipainen, t. vanhala, m. ermes, e. sairanen, t. myllymäki and r. lappalainen, “mobile mental wellness training for stress management: feasibility and design implications based on a one-month field study,” jmir mhealth uhealth, vol. 1, no. 2, p. e11, 2013. [14] s. falan, “wearable technology: examples from sweden,” 2016. [online]. available: http://salusdigital.net/ wearable-technology-examples-sweden/. [15] f. arriba-pérez, m. caeiro-rodríguez and j. m. santos-gago, “towards the use of commercial wrist wearables in education,” in proceedings of the 4th experiment@international conference (exp.at'17), 2017, faro, 2017. [16] t. dragon, i. arroyo, b. p. woolf , w. burleson, r. kaliouby and h. eydgahi, “viewing student affect and learning through classroom observation and physical sensors,” in intelligent tutoring systems, berlin, springer berlin heidelberg, 2008, pp. 29-39. [17] l. t. ping, k. subramaniam and s. krishnaswamy, “test anxiety: state, trait and relationship with exam,” malaysian journal of medical sciences, vol. 15, no. 2, pp. 18-23, 2008. [18] l. shen, m. wang and r. shen, “affective e-learning: using “emotional” data to improve learning in pervasive learning environment,” educational technology & society, vol. 12, no. 2, pp. 176-189, 2009. [19] m. kusserow, o. amft and g.
tröster, “monitoring stress arousal in the wild,” pervasive computing, ieee, vol. 12, no. 2, pp. 28-37, 2013. [20] r. kocielnik, m. pechenizkiy and n. sidorova, “stress analytics in education,” in proceedings of the 5th international conference on educational data mining, chania, greece, 2012. [21] a. joshi, r. kiran and a. n. sah, “an experimental analysis to monitor and manage stress among engineering students using galvanic skin response meter,” work, vol. 56, no. 3, pp. 409-420, 2017. [22] world famous electronics llc., “pulse sensor amped,” 2017. [online]. available: https://pulsesensor.com/. [23] m. chen, y. zhang, y. li, m. m. hassan and a. alamri, “aiwac: affective interaction through wearable computing and cloud technology,” ieee wireless communications, vol. 22, no. 1, pp. 20-27, 2015. [24] j. j. alvarsson, s. wiens and m. e. nilsson, “stress recovery during exposure to nature sound and environmental noise,” int j environ res public health, vol. 7, no. 3, p. 1036–1046, 2010. [25] w.-c. wang, “a study of the type and characteristics of relaxing music for college students,” in proceedings of meetings on acoustics, providence, rhode, 2014. [26] d. c. spielberger, “test anxiety inventory: preliminary professional manual,” consulting psychologists press, 1980. [27] m. y. kwan, k. p. arbour-nicitopoulos, e. duku and g. faulkner, “patterns of multiple health risk– behaviours in university students and their association with mental health: application of latent class analysis,” health promot chronic dis prev can, vol. 36, no. 8, p. 163–170, 2016. [28] e. saperova and d. dimitriev, “effects of smoking on heart rate variability in students (545.6),” the faseb journal, vol. 28, no. 1, pp. 545-6, 2014.

facta universitatis series: electronics and energetics vol. 36, no 1, march 2023, pp.
91-101 https://doi.org/10.2298/fuee2301091d © 2023 by university of niš, serbia | creative commons license: cc by-nc-nd original scientific paper

efficiency and radiative recombination rate enhancement in gan/algan multi-quantum well-based electron blocking layer free uv-led for improved luminescence

samadrita das1, trupti r. lenka1, fazal a. talukdar1, ravi t. velpula2, hieu p. t. nguyen2
1department of electronics and communication engineering, national institute of technology silchar, assam, 788010, india
2department of electrical and computer engineering, new jersey institute of technology, newark, new jersey, 07102, usa

abstract. in this paper, an electron blocking layer (ebl) free gan/algan light emitting diode (led) is designed using atlas tcad with graded composition in the quantum barriers of the active region. the device has a gan buffer layer incorporated in a c-plane for better carrier transport and low efficiency droop. the proposed led has triangular quantum barriers with aluminium composition graded from 20% to ~2%, whereas the conventional led has square barriers. the resulting structures exhibit significantly reduced electron leakage and improved hole injection into the active region, thus generating higher radiative recombination. the simulation results show the highest internal quantum efficiency (iqe) of 48.4%, a significant rise compared to the conventional led. the designed ebl free led with graded quantum barrier structure achieves a substantially reduced efficiency droop of ~7.72% at 60 ma. our study shows that the proposed structure has improved radiative recombination by ~136.7%, reduced electron leakage, and enhanced optical power by ~8.084% at 60 ma injected current as compared to the conventional gan/algan ebl led structure.

key words: ultra-violet (uv), light emitting diode (led), gallium nitride (gan), internal quantum efficiency (iqe), multi-quantum well (mqw), quantum barrier (qb), electron blocking layer (ebl)

1.
introduction

received june 28, 2022; revised july 14, 2022, and july 26, 2022; accepted july 26, 2022. corresponding author: samadrita das, department of electronics and communication engineering, national institute of technology silchar, assam, india, e-mail: samadrita_rs@ece.nits.ac.in

ultra-violet light emitting diodes (leds) are of immense importance because of their potential applications and have attracted considerable attention in optical communication, pharmaceutical appliances, water and air purification, and many more. gallium nitride (gan), a promising material for generating uv luminescence over a wide range of the spectrum, has attracted many researchers’ attention [1]–[4]. the iii-nitride family spans band gaps from 0.7 ev (inn) through 3.4 ev (gan) to 6.2 ev (aln), and the band gap of gan can be widened by introducing aluminum (al) to form an algan alloy [5], [6]. moreover, gan is an environmentally friendly material with good biocompatibility and low manufacturing cost [7]–[11]. gan is used to generate high-efficiency short-wavelength luminescence and to fabricate semiconductor devices such as leds and low-threshold laser diodes (lds) [12], [13]. in the last few years, research has focused on the optimization of gan-based led structure design [14]–[17]. such optimization improves the symmetry of carrier transport, carrier injection, and carrier confinement in the quantum wells, which in turn enhances the radiative recombination rate and raises the internal quantum efficiency (iqe) [18]–[20]. electron overflow remains a critical issue, because it degrades iqe and causes efficiency droop at high injection current [22].
although the electron blocking layer (ebl) introduced between the p-region and active region can suppress the electron overflow [23], the hole injection efficiency is also strongly affected because of positive polarization sheet charges formed at the heterointerface of the last quantum barrier (qb) and the ebl [24]. additionally, due to the high magnesium (mg) activation energy in a high al content ebl, efficient p-doping is quite difficult [25]. to mitigate these problems, in this paper we use an ebl free multi-quantum well (mqw) uv-led operating at ~354.6 nm wavelength, which eliminates the formation of positive polarization sheet charges and shows significantly enhanced hole injection and reduced electron leakage. we present a distinctive qb design in the gan/algan mqw, with graded composition across the entire barrier along the [0001] axis. as a result, the performance of the proposed structure is remarkably improved compared to the conventional uv-led structure using an ebl and square quantum barriers. the design of the led structure and its numerical simulation framework are presented in section 2, followed by results and analysis in section 3. finally, the conclusion is drawn in section 4.

2. device structure and numerical simulation framework

in this study, the above-mentioned led structures are numerically studied using the computer-aided simulation tool silvaco atlas tcad, which is designed to analyze and optimize leds based on wurtzite semiconductor compounds [26]. a gan/algan led with a peak wavelength of ~354.6 nm is presented in fig. 1.
the basic device structure, taken as the conventional led (ledi), is built on a sapphire substrate with a thickness of 80 µm, followed by an undoped gan buffer layer of thickness 1.2 µm, an n-doped algan coating layer (doping concentration: 1×10^20 cm^-3, width: 1.8 µm, al content: 18%), four pairs of gan (3 nm)/algan (7 nm) mqw, a p-doped algan layer as ebl [27] (doping concentration: 2×10^18 cm^-3, width: 20 nm, al content: 20%), a p-doped algan coating layer (doping concentration: 1×10^18 cm^-3, width: 180 nm, al content: 15%) and finally a p-doped gan contact layer (doping concentration: 2.5×10^18 cm^-3, width: 80 nm). the square quantum barriers have a uniform al composition of 20%.

fig. 1 schematic diagrams of (a) ledi, the conventional gan/algan mqw led with square qbs, and (b) ledii with triangular barriers

to improve performance, the square barriers of ledi are replaced by barriers with graded composition. on the n-side of each qb, the al composition is fixed at 20% (al0.2ga0.8n), while on the p-side the al composition is defined by the variable x, which is in the range 0 ≤ x ≤ 20%. the al composition in each qb gradually reduces from 20% to x (alxga1-xn, 0 ≤ x ≤ 0.2) along the [0001] axis from the n-side to the p-side. the calculations use carrier mobilities of 90 (electrons) and 15 (holes) cm^2 v^-1 s^-1, and the operating temperature is set to 300 k. the device with graded triangular barriers (x = 0.02 for reference) is denoted ledii. the final proposed ebl free uv-led structure with graded triangular barriers is denoted lediii, which is the target design of this paper. in lediii, the al composition in each qb is increased to 25% on the n-side, while on the p-side the value of x is in the range 0 ≤ x ≤ 25%. the energy band gaps of the gan and algan used in the simulations are taken as 3.42 ev and 6.28 ev respectively.
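the grading described above can be sketched numerically. the short python example below samples the al mole fraction across one 7 nm barrier and evaluates the band gap with the bowing expression that the paper gives as equation (2) in section 3.2, using the values quoted there (6.2 ev for aln, 3.42 ev for gan, bowing term 1.3 ev). the linear grading profile, the function names and the number of sample points are assumptions made for illustration; the paper only states that the composition gradually reduces from the n-side to the p-side, and this sketch is not the atlas tcad setup used by the authors.

```python
# Illustrative sketch only: assumed linear grading, hypothetical helper names.
EG_ALN = 6.2   # eV, AlN band gap quoted with equation (2)
EG_GAN = 3.42  # eV, GaN band gap quoted with equation (2)
BOWING = 1.3   # eV, bowing term in equation (2)


def algan_band_gap(x):
    """Band gap of Al(x)Ga(1-x)N in eV, following equation (2)."""
    return EG_ALN * x + EG_GAN * (1.0 - x) - BOWING * x * (1.0 - x)


def graded_barrier_profile(x_n=0.20, x_p=0.02, width_nm=7.0, points=8):
    """Return (position_nm, al_fraction, band_gap_eV) samples across one barrier.

    The Al fraction is assumed to fall linearly from x_n (n-side) to x_p (p-side).
    """
    profile = []
    for i in range(points):
        t = i / (points - 1)            # 0 at the n-side, 1 at the p-side
        x = x_n + (x_p - x_n) * t       # assumed linear grading
        profile.append((width_nm * t, x, algan_band_gap(x)))
    return profile


if __name__ == "__main__":
    for z, x, eg in graded_barrier_profile():
        print(f"z = {z:4.1f} nm   x = {x:.3f}   Eg = {eg:.3f} eV")
```

with these assumptions the band gap falls monotonically from the n-side to the p-side of the barrier, which is the trend the paper relies on when discussing the effective barrier heights.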
the respective radiative recombination coefficients (copt) are 2×10^-10 and 1.1×10^-10 cm^3/s. the lattice constant of gan is 0.3189 nm. the auger coefficient and carrier lifetime have their default values of 1×10^-34 cm^6/s and 1×10^-9 s respectively.

3. results and analysis

3.1. internal quantum efficiency

the iqes of the device for varying values of x in alxga1-xn in the graded qbs, as a function of injection current, are displayed in fig. 2. as shown, the efficiency of ledi is the lowest compared to the cases with graded qbs (0 ≤ x ≤ 0.2). as the band gap of alxga1-xn decreases from the n-side to the p-side, the iqe at the same injection current rises markedly. for a clearer view, the efficiencies at 60 ma as a function of x are displayed in the inset of fig. 2. when x is reduced from 0.2 to ~0.02, the efficiency increases from 34.79% to 45.68% and then decreases slightly as x approaches 0. this is because when the al composition is decreased to 0.02, the band gap of alxga1-xn decreases, which in turn increases the effective barrier heights for holes and electrons. ledii achieves a 31.3% rise in efficiency at 60 ma compared to ledi. fig. 3 shows that the optimized ebl free device (lediii) has the highest iqe of 48.4% at 60 ma. lediii has 39.12% and 5.95% higher iqe than ledi and ledii respectively. the efficiency droop is reduced from 13.87% (ledi) to 10.22% (ledii) and further to 7.72% (lediii) at the same current of 60 ma according to the equation given below:

$$\text{efficiency droop} = \frac{\mathrm{IQE}_{\max} - \mathrm{IQE}_{60\,\mathrm{mA}}}{\mathrm{IQE}_{\max}} \times 100\% \qquad (1)$$

where iqe_max is the peak iqe and iqe_60mA is the iqe at 60 ma. this result establishes that the ebl free device with triangular barriers contributes to the enhancement of iqe and the decrease of efficiency droop. in order to validate our device model and parameters, the iqe is compared with the closest available experimental result [28] as shown in fig. 2.

fig. 2 internal quantum efficiency vs. injected current with varying values of x.
inset: values of iqes as a function of x.

fig. 3 internal quantum efficiency vs. injected current for all leds. inset: the efficiency droop for each led at 60 ma current

3.2. energy band diagrams

fig. 4 shows the calculated energy band diagrams for ledi, ledii and lediii at 60 ma injected current. the simulated results in fig. 4(a)-(b) show how the energy band diagrams change when the band gap of every qb is altered from uniform to graded composition. the band diagram of ledi has a triangular shape because of the internal polarization field and forward bias [29]. the energy band gap (eg) of alxga1-xn can be calculated as

$$E_g(\mathrm{Al}_x\mathrm{Ga}_{1-x}\mathrm{N}) = E_g(\mathrm{AlN})\,x + E_g(\mathrm{GaN})\,(1 - x) - 1.3\,x\,(1 - x) \qquad (2)$$

where eg(aln) = 6.2 ev is the band gap energy of aln and eg(gan) = 3.42 ev is the band gap energy of gan [30].

fig. 4 energy band diagram of the active region of gan/algan mqw for (a) ledi, (b) ledii and (c) lediii

this formula shows that the band gap of alxga1-xn decreases with a decrease in the al composition. hence the band gap of every qb reduces while moving from the n-side to the p-side, thus influencing the effective barrier heights for electrons as well as holes. the formation of the hole depletion region due to the positive polarization sheet charges at the lqb/ebl interface lessens the hole injection efficiency in ledii [24]. this problem can be overcome by removing the ebl from ledii. фcn and фebl are the effective conduction band barrier heights (cbbh) at the corresponding barrier (n) and the ebl respectively. as displayed in fig. 4(b), the values of фcn for all qbs i.e. фc1-фc5 are 370.8 mev, 408.2 mev, 442.2 mev, 367 mev and фebl is 468.9 mev for ledii, which is much higher than фebl for ledi (257.1 mev). the фcn values in the proposed ebl-free lediii i.e.
фc1-фc5 are 460.2 mev, 618.3 mev, 505.2 mev and 632.1 mev, respectively. the higher and progressively increased фcn in lediii effectively confines the electrons in the active region and resists electron overflow into the p-region. this leads to significantly reduced non-radiative recombination in the p-region and enhances hole injection into the active region.

3.3. carrier concentration

the electron and hole concentration distributions in the mqw of the various leds are displayed in fig. 5 to further explain the performance improvement in lediii. ledi shows a hole concentration of 10.4×10^18 cm^-3 in the first quantum well from the n-side to the p-side, which is much lower than in ledii (16.9×10^18 cm^-3). these results indicate that the graded qb led has superior hole transport, which raises the hole concentration. the distribution of electrons in the graded qb led is more uniform than in ledi, which may be attributed to the superior transport of holes [31]. the electron leakage in lediii is notably mitigated and lower than in ledii, which blocks the undesired recombination of electrons with incoming holes in the p-region. consequently, lediii has higher electron (22.6×10^18 cm^-3) and hole (23.2×10^18 cm^-3) concentrations throughout the active region than ledii.

fig. 5 carrier concentration of (a) ledi, (b) ledii and (c) lediii

3.4. radiative recombination

the distribution of radiative recombination in the active region at an injection current of 60 ma is simulated and illustrated in fig. 6. the radiative recombination distribution in ledii is more uniform than in ledi. in ledi, radiative recombination in the first qw has a recombination rate of 2.38×10^28 cm^-3 s^-1. this is probably due to the poor spatial overlap between the hole and electron distributions [32].
the electrostatic field in the mqws of lediii is lower than in ledii, which supports the spatial overlap of electron-hole wave functions and improves the radiative recombination process [33]. thus, the recombination rate of lediii is increased by ~136.7% compared to ledii. as shown in fig. 5, most electrons still accumulate in the first well, while the hole concentration in the last well is lower than that in the previous wells. however, in the conventional led, both holes and electrons are concentrated in the wells close to p-gan, hence the radiative recombination is extremely effective at that location. these outcomes suggest that, in order to diminish the droop behaviour of the led without degrading total recombination, more attention has to be given to the spatial distribution of holes and electrons [34].

fig. 6 radiative recombination rate of all led samples

3.5. power

fig. 7 illustrates the luminous power vs. current for ledi and ledii. the light output increases as the value of x decreases because the graded qb benefits from superior electron confinement and larger hole injection efficiency. these superior optical properties are also attributed to the decrease in the polarization field in the mqw [35]. this improved power means that more carriers recombine in the qws of the graded qb led, thus effectively improving the light efficiency of the gan/algan led [36]. furthermore, as shown in fig. 8, the output power of lediii is remarkably increased to 18.075 mw from 16.723 mw (ledi) at 60 ma current injection, i.e. ~8.084% enhancement. the normalised power spectral density of the three devices is displayed in fig. 9. conventional ledi has a stronger quantum-confined stark effect (qcse) induced by the spontaneous and piezoelectric fields in the mqw layers, which shows an obvious screening effect and band-filling effect. this results in a blue-shift in ledi. from fig.
9, lediii shows a red shift of ~5 nm because of the negligible presence of qcse.

fig. 7 behaviour of luminous power versus forward current with varying values of x. inset: clearer view of power at 60 ma current

fig. 8 luminous power as a function of injected current for all leds

fig. 9 room temperature el spectra of all the led devices vs. wavelength

4. conclusion

to summarize, an ebl free uv-led with gan/algan mqws and specially designed graded qbs is numerically simulated. after reducing the band gap of algan along the [0001] axis from the n-side towards the p-side in every qb, the efficiency of the device improves. the upgraded led with x = 0.02 (ledii) achieves an iqe of ~45.68% at 60 ma, which is 31.3% more than the conventional one (ledi) with square barriers. this improvement is attributed to the modified energy band diagrams in the graded qbs. moreover, we have numerically demonstrated an ebl free uv-led with graded qb structure and observed that it can effectively suppress electron overflow and support enhanced hole injection into the led active region as compared to the conventional led. the hole transport in the mqws is notably intensified at a current of 60 ma, which is beneficial for droop reduction. the efficiency droop is decreased from 13.87% in the conventional led to only 10.22% in the graded qb led and further to 7.72% in the proposed ebl free led. the proposed led has an 8.084% increase in luminous power at an injection current of 60 ma as compared to the conventional led and a 39.12% rise in efficiency. we believe that the el performance of leds based on gan materials can be further improved through elaborate device design and careful consideration of the varying carrier transport characteristics of gan based leds, which show different conduction-to-valence band-offset ratios in their mqw structures.

acknowledgement: this work is one of the outcomes of dst-serb, govt.
of india sponsored matrics project no mtr/2021/000370 which is duly acknowledged for support.

references [1] s. das, t. r. lenka, f. a. talukdar and r. t. velpula, "carrier transport and radiative recombination rate enhancement in gan/algan multiple quantum well uv-led using band engineering for light technology", in proceedings of the 2nd international conference on micro and nanoelectronics devices, circuits and systems, mndcs 2022, pp. 1–11. [2] m. usman, u. mushtaq, m. munsif, a. r. anwar and m. kamran, "enhancement of the optoelectronic performance of p-down multiquantum well n-gan light-emitting diodes", phys. scr., vol. 94, no. 10, p. 105808, 2019. [3] h. tao, s. xu, j. zhang, p. li, z. lin and y. hao, "numerical investigation on the enhanced performance of n-polar algan-based ultraviolet light-emitting diodes with superlattice p-type doping", ieee trans. electron devices, vol. 66, no. 1, pp. 478-484, 2019. [4] s. das et al., "effects of polarized-induced doping and graded composition in an advanced multiple quantum well ingan/gan uv-led for enhanced light technology", eng. res. express, vol. 4, no. 1, p. 015030, 2022. [5] y. nagasawa and a. hirano, "a review of algan-based deep-ultraviolet light-emitting diodes on sapphire", appl. sci., vol. 8, no. 8, p. 1264, 2018. [6] m. usman et al., "zigzag-shaped quantum well engineering of green light-emitting diode", superlattices microstruct., vol. 132, p. 106164, 2019. [7] g. kim, j. h. kim, e. h. park, d. kang and b.-g. park, "extraction of recombination coefficients and internal quantum efficiency of gan-based light emitting diodes considering effective volume of active region", opt. express, vol. 22, no. 2, p. 1235, 2014. [8] h. hu, s. zhou, x. liu, y. gao, c. gui and s. liu, "effects of gan/algan/sputtered aln nucleation layers on performance of gan-based ultraviolet light-emitting diodes", sci. rep., vol. 7, p. 44627, 2017. [9] s. zhou, x.
liu, h. yan, z. chen, y. liu and s. liu, "highly efficient gan-based high-power flip-chip light-emitting diodes", opt. express, vol. 27, no. 12, pp. a669–a692, 2019. [10] x. zhao, b. tang, l. gong, j. bai, j. ping and s. zhou, "rational construction of staggered ingan quantum wells for efficient yellow light-emitting diodes", appl. phys. lett., vol. 118, no. 18, p. 182102, 2021. [11] s. zhou et al., "numerical and experimental investigation of gan-based flip-chip light-emitting diodes with highly reflective ag/tiw and ito/dbr ohmic contacts", opt. express, vol. 25, no. 22, p. 26615, 2017. [12] y. meng et al., "growth and characterization of amber light-emitting diodes with dual-wavelength ingan/gan multiple-quantum-well structures", mater. res. express, vol. 6, no. 8, p. 0850c8, 2019. [13] c. h. wang et al., "efficiency droop alleviation in ingan/gan light-emitting diodes by graded-thickness multiple quantum wells", appl. phys. lett., vol. 97, no. 18, p. 181101, 2010. [14] z. lin, x. chen, y. zhu, x. chen, l. huang and g. li, "influence of thickness of p-ingan layer on the device physics and material qualities of gan-based leds with p-gan/ingan heterojunction", ieee trans. electron devices, vol. 65, no. 12, pp. 5373–5380, 2018. [15] m. usman, a. r. anwar, m. munsif, s. malik and n. u. islam, "analytical analysis of internal quantum efficiency with polarization fields in gan-based light-emitting diodes", superlattices microstruct., vol. 135, p. 106271, 2019. [16] h. hu et al., "boosted ultraviolet electroluminescence of ingan/algan quantum structures grown on high-index contrast patterned sapphire with silica array", nano energy, vol. 69, p. 104427, 2020. [17] x. fan, s. xu, h. tao, r. peng, j. du, y. zhao, j. zhang, j. zhang and y. hao, "improved performance of gan-based ultraviolet leds with the stair-like si-doping n-gan structure", mdpi, vol. 11, no. 10, p. 1203, 2021. [18] m. h. kim et al., "origin of efficiency droop in gan-based light-emitting diodes", appl. 
phys. lett., vol. 91, no. 18, pp. 1-4, 2007. [19] j. h. park et al., "enhanced overall efficiency of gainn-based light-emitting diodes with reduced efficiency droop by al-composition-graded algan/gan superlattice electron blocking layer", appl. phys. lett., vol. 103, no. 6, 2013. [20] c. sheng xia, z. m. simon li, w. lu, z. hua zhang, y. sheng and l. wen cheng, "droop improvement in blue ingan/gan multiple quantum well light-emitting diodes with indium graded last barrier", appl. phys. lett., vol. 99, no. 23, p. 233501, 2011. [21] s. das, t. r. lenka, f. a. talukdar, r. t. velpula and h. p. t. nguyen, "carrier transport and radiative recombination rate enhancement in gan/algan multiple quantum well uv-led using band engineering for light technology", in: lenka, t.r., misra, d., fu, l. (eds) micro and nanoelectronics devices, circuits and systems. lecture notes in electrical engineering, vol. 904. springer, singapore. pp. 187-198. [22] j. cho, e. f. schubert and j. k. kim, "efficiency droop in light-emitting diodes: challenges and counter measures", laser photonics rev., vol. 7, no. 3, pp. 408-421, 2013. [23] h. hirayama et al., "222-282 nm algan and inalgan-based deep-uv leds fabricated on high-quality aln on sapphire", phys. status solidi appl. mater. sci., vol. 206, no. 6, pp. 1176–1182, 2009. [24] c. chu et al., "on the origin of enhanced hole injection for algan-based deep ultraviolet light-emitting diodes with aln insertion layer in p-electron blocking layer", opt. express, vol. 27, no. 12, p. a620, 2019. [25] m. l. nakarmi, n. nepal, j. y. lin and h. x. jiang, "photoluminescence studies of impurity transitions in mg-doped algan alloys", appl. phys. lett., vol. 94, no. 9, pp. 1–5, 2009. [26] s. clara, “silvaco user’s manual device simulation software,” no. october, 2004, [online]. available: www.silvaco.com. [27] b.-c.
lin et al., "hole injection and electron overflow improvement in ingan/gan light-emitting diodes by a tapered algan electron blocking layer", opt. express, vol. 22, no. 1, p. 463, 2014. [28] j. li et al., "carrier transport improvement in zno/mgzno multiple-quantum-well ultraviolet lightemitting diodes by energy band modification on mgzno barriers", opt. commun., vol. 459, 2020. [29] k. mehta et al., "theory and design of electron blocking layers for iii-n-based laser diodes by numerical simulation", ieee j. quantum electron., vol. 54, no. 6, pp. 1–11, 2018. [30] h. hirayama, s. fujikawa and n. kamata, "recent progress in algan-based deep-uv leds", electron. commun. japan, vol. 98, no. 5, pp. 1–8, 2015. [31] r. charash et al., "carrier distribution in ingan/gan tricolor multiple quantum well light emitting diodes", appl. phys. lett., vol. 95, no. 15, pp. 2007-2010, 2009. [32] s. zhou, j. lv, y. wu, y. zhang, c. zheng and s. liu, "reverse leakage current characteristics of ingan/gan multiple quantum well ultraviolet/blue/green light-emitting diodes", jpn. j. appl. phys., vol. 57, no. 5, p. 051003, 2018. [33] y. a. yin, n. wang, g. fan and y. zhang, "investigation of algan-based deep-ultraviolet light-emitting diodes with composition-varying algan multilayer barriers", superlattices microstruct., vol. 76, pp. 149155, 2014. [34] j. chang et al., "algan-based multiple quantum well deep ultraviolet light-emitting diodes with polarization doping", ieee photonics j., vol. 8, no. 1, pp. 1–7, 2016. [35] t. y. wang et al., "algan-based deep ultraviolet light emitting diodes with magnesium delta-doped algan last barrier", appl. phys. lett., vol. 117, no. 25, p. 251101, 2020. [36] h. li, c. j. chang, s. y. kuo, h. c. wu, h. huang and t. c. lu, "improved performance of near uv gan-based light emitting diodes with asymmetric triangular multiple quantum wells", ieee j. quantum electron., vol. 55, no. 1, pp. 1-4, 2019. 
facta universitatis series: electronics and energetics vol. 27, no 3, september 2014, pp. 467-477 doi: 10.2298/fuee1403467p

automatic prosody generation in a text-to-speech system for hebrew

branislav popović1, dragan knežević1, milan sečujski1, darko pekar2
1faculty of technical sciences, university of novi sad, serbia
2alfanum – speech technologies, novi sad, serbia

abstract. the paper presents the module for automatic prosody generation within a system for automatic synthesis of high-quality speech based on arbitrary text in hebrew. the high quality of synthesis is due to the high accuracy of automatic prosody generation, enabling the introduction of elements of natural sentence prosody of hebrew. automatic morphological annotation of text is based on the application of an expert algorithm relying on transformational rules. syntactic-prosodic parsing is also rule based, while the generation of the acoustic representation of prosodic features is based on classification and regression trees. a tree structure generated during the training phase enables accurate prediction of the acoustic representatives of prosody, namely, durations of phonetic segments as well as temporal evolution of fundamental frequency and energy. such an approach to automatic prosody generation has led to an improvement in the quality of synthesized speech, as confirmed by listening tests.

key words: speech synthesis, speech processing, natural language processing, classification and regression trees

1. introduction

explicit modeling of prosodic features of synthesized speech, as well as prediction of values of certain parameters of a model based on explicit morphological, phonetic, syntactic and other relevant rules, is considered to be a relatively poor solution in practice. this is due to the enormous number of factors that need to be considered, as well as their mutual influence, which is too complicated to be closely examined on reasonably large speech corpora [1].
on the other hand, inadequately determined prosodic features impair the naturalness, and in some cases even the intelligibility of synthesized speech, significantly narrowing the field of its application. as the use of machine learning methods eliminates the need for explicit modeling of prosody, they have been widely adopted as a solution for automatic prosody generation within text-to-speech systems.

received february 25, 2014; received in revised form may 21, 2014. corresponding author: branislav popović, university of novi sad, faculty of technical sciences, trg dositeja obradovića 6, 21000 novi sad, serbia (e-mail: bpopovic@uns.ac.rs)

furthermore, they can also provide information about the mutual influence of specific linguistic factors (e.g. masking), which is of great interest to the linguistic community. in this paper, automatic training and subsequent prediction of prosodic features are carried out according to the methodology of classification and regression trees (cart) [2]. the idea of this methodology is to generate a tree structure through the process of automatic training based on a speech corpus of sufficient size. such training should identify the most relevant factors that influence the prosodic features of speech and their acoustic representatives: phone durations as well as temporal evolution of fundamental frequency and energy. the speech corpus is marked for phone boundaries as well as relevant prosodic events, such as types and levels of boundaries between adjacent intonation units, as well as levels of emphasis. using regression trees trained on the speech corpus annotated in this way, the quality of synthesized speech is significantly improved compared to the quality obtained by conventional methods for prosody prediction in text-to-speech [3], [4], [5]. the paper is organized as follows.
section 2 presents the particularities of the hebrew language, as it is well known that the properties of the target language significantly affect the development of a system for automatic speech synthesis (most notably the automatic prosody generation module). section 3 defines the procedure of automatic part-of-speech (pos) tagging and additional morphological annotation of input text. in section 4, prosody generation and synthesis are presented. section 5 presents the experimental results. in section 6, several conclusions are given.

2. language particularities

the hebrew language, one of the most widely spoken semitic languages today, has a range of properties which drastically affect the design of a speech synthesis system. firstly, from the orthographical point of view, it belongs to the group of so-called abjad languages, where each symbol commonly stands for a consonant [6]. however, vowels can be indicated by (1) the use of "weak consonants" serving as vowel letters (for example, the letter vav indicates that the preceding vowel is either /o/ or /u/, yodh indicates an /i/, whereas aleph indicates an /a/), or (2) by using a set of diacritical symbols called niqqud. another thing that should be borne in mind is that abjad languages, including hebrew, suffer from very loose spelling rules. this means that for a number of words there can be more than one acceptable spelling, which is a very serious source of ambiguity. namely, the revival of the hebrew language in the late 19th century has left many unresolved issues [7]. hebrew speakers were almost all native speakers of european languages and thus accustomed to the latin alphabet, which has led to the development of two parallel spelling systems: the first, where vowel indicators are used according to the historic rules, and the second, where vowel indicators are used excessively. it should also be noted that even today, a vast majority of speakers commonly make spelling errors.
therefore, if one aims at the design of a text-to-speech system which should be able to handle arbitrary texts, spelling errors have to be accepted as a part of the standard inventory. spelling errors are thus another source of ambiguity in hebrew, and are something that the design of a practically applicable speech synthesizer cannot dismiss. the hebrew alphabet has 22 letters, five of which have different forms when they are used at the end of a word. modern israeli hebrew has 5 vowel phonemes. however, the meaning of a word is carried not only by its phonological content, but also by its stress, and it is not uncommon to find pairs of words containing the same string of phonemes, but pronounced differently, the only difference being the stress. from the point of view of morphology, it should be noted that hebrew exhibits a pattern of stems consisting typically of consonantal roots from which nouns, adjectives, and verbs are formed in various ways. hebrew uses a range of very productive prefixes and a multitude of suffixes, dramatically increasing the number of possible morphological interpretations of each surface word form in the text. the syntactic structure of the sentence and the word ordering in hebrew can be considered as relatively flexible. although particular choices in word ordering can indicate specific literary styles or genres, one commonly encounters sentences where several orders of words can be considered equivalent. this is another source of difficulty for automatic morphological annotation of text. 3. 
morphological annotation after the text is preprocessed in order to locate sentence boundaries and reveal elements such as abbreviations, dates, punctuation, special characters, web addresses etc., it is submitted to automatic morphological annotation, aimed at assigning part-of-speech tags as well as some additional morphological information that may be of interest to any subsequent phase of automatic prosody generation. the morphological analysis begins by assigning an empty array of "readings" to every surface word form (token) in a sentence. the term "reading" denotes a morphological interpretation of this token together with its phonological representation, i.e. a particular inflected form of a word, together with the corresponding lemma, values of part-of-speech and corresponding morphological categories, its pronunciation as well as position and type of stress. in general, it is possible to derive several hundreds of morphological forms from a single lemma in hebrew. ideally, the lexicon should contain entries representing each and every possible surface word form. an evaluation score will be assigned to each of the readings of a word token during the evaluation process, in order to select the reading which is most likely to be correct. the aim of morphologic analysis is, thus, to distinguish between the available readings and thus assign a correct vocalization and stress pattern to each word, which is of utmost importance for the naturalness of synthesized speech. the novel approach to morphologic analysis described in this paper is outlined in fig. 1 and uses a combination of active and passive methods [8]. the passive method presumes the selection of appropriate lexemes, by using the hebrew lexicon, the lexicon of foreign words in hebrew transcription and finally, the lexicon of frequent foreign words in latin transcription. 
the active method involves an automatic morphological analysis of the input text string, as well as generation of appropriate readings by using a complex expert algorithm relying on a set of transformational rules. the use of the active method reduces the initialization time as well as the number of inflected morphological forms in the lexicon by two orders of magnitude, enabling the use of the software component within real-time applications. on the other hand, the passive methodology reduces the error rate.
fig. 1 morphological annotation of input text
transformational rules in the form of complex tree structures are applied iteratively. branches are generated by using appropriate sets of morphological rules. word analysis is carried out morpheme by morpheme. every word is processed according to its left and right context. the aim is to correctly identify the surface form as a particular inflected form of a particular lemma. currently, the system supports more than 30 part-of-speech classes with more than 3000 corresponding morphological categories. the algorithm for the evaluation of particular readings, in order to select the most likely one, consists of a set of disambiguation tools, divided into individual scoring procedures. the scoring of syntactic structures assigns syntactic indexes to words using predefined statistical algorithms, aiming at establishing the similarity between the syntactic structure of input sentence and the predefined syntactic structures. the algorithm is coupled with an accurate comparison mechanism that allows the use of existing structures in order to project on unfamiliar ones. a syntactic score indicates the level of compatibility of a certain reading to the previously tagged syntactic environment. the scoring of semantic structures uses an analogous method, with only one difference: the structures represent semantic relations instead of syntactic ones. 
the index used is built over semantic attributes. the challenge in this process, besides building the most convenient set of indexes, is to determine the collection of a minimal number of morphological descriptors (tags) covering at the same time the maximum number of words. proximity scoring is the most efficient of the scoring processes. there are three types of proximity rules. generic-to-generic rules assign a relationship between linguistic items of non-specific identity, such as "there is a high probability that a verb in past tense of semantic category moving will be adjacent to a copula"; the attributes that can be used in composing these rules may be of grammatical and/or semantic nature. specific-to-generic rules attach a generic rule to a specific word, e.g. "a verb in passive mood is likely to be followed by the word by". specific-to-specific rules attach two specific words, e.g. "tel is likely to be followed by aviv". the effect of proximity scoring is clearly limited only to the words and entities for which proximity rules have been defined. full-niqqud scoring is a type of scoring unique to hebrew. it determines how close a certain reading of a word is to the most commonly used spelling version. due to the previously mentioned lack of a unique spelling standard, such a scoring procedure has to be taken into account as well. another scoring procedure used is frequency scoring, i.e. scoring readings according to their frequency in standard texts. although such a procedure is highly inaccurate on its own (it commonly serves as a baseline for establishing the performance of more sophisticated morphological annotation techniques), it can serve as an efficient tie-breaker, i.e. it can be used in cases where other scoring procedures have assigned approximately equal scores to multiple readings. 
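the tie-breaking role of frequency scoring described above can be sketched as follows. this is a minimal illustration and not the authors' implementation: the `Reading` structure, the `tolerance` parameter and the numeric scores are all hypothetical, and the way the individual scoring procedures are combined into a single score is assumed rather than taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """One candidate morphological interpretation of a token (hypothetical structure)."""
    label: str
    score: float      # assumed combined score from the primary scoring procedures
    frequency: float  # assumed relative frequency of the reading in reference texts


def select_reading(readings, tolerance=0.05):
    """Pick the best-scoring reading; use frequency only to break near-ties."""
    if not readings:
        raise ValueError("token has no readings")
    best = max(r.score for r in readings)
    # Readings whose score lies within `tolerance` of the best one count as tied.
    tied = [r for r in readings if best - r.score <= tolerance]
    return max(tied, key=lambda r: r.frequency)


candidates = [
    Reading("noun", score=0.81, frequency=0.10),
    Reading("verb", score=0.80, frequency=0.60),
    Reading("adjective", score=0.40, frequency=0.90),
]
chosen = select_reading(candidates)
print(chosen.label)
```

in this toy case the noun and verb readings are treated as tied, so the more frequent one is chosen, while the adjective reading is never considered despite its high frequency, since its primary score is far from the best.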
every reading is also additionally evaluated in view of its context. context scores are obtained in compliance with the previously selected set of tags for the left context, as well as the set of tags for all possible readings in the right context. this is probably the most complex among all the applied scoring procedures. table 1 illustrates the effectiveness of the described scoring procedures, in terms of the overall accuracy of the automatic annotation process (selection of the correct reading), on the corpus of 3093 sentences (55046 words).
table 1 the overall accuracy
scoring type status syntactic on on on on semantic on on on on proximity on on on full niqqud on on frequency on on context on on on acc. [%] 92.3 85.9 44.7 45.1 32.1 46.9 99.3 99.4 99.6
table 2 presents the correlation matrix among the different scoring procedures. a high correlation between proximity, context and full-niqqud score can be noted. although such an analysis of the correlation between different scoring procedures is not immediately aimed at the improvement of the quality of synthetic speech, it can give an insight into the directions of the future development of the scoring system. at the same time, high correlation between particular scoring procedures, besides giving a linguistic insight into the problem, confirms the validity of the algorithms.
table 2 the correlation matrix
scoring type   syntactic  semantic  proximity  full niqqud  context
syntactic      1          0.062     0.238      0.224        0.239
semantic       0.062      1         0.356      0.364        0.342
proximity      0.238      0.356     1          0.945        0.982
full niqqud    0.224      0.364     0.945      1            0.929
context        0.239      0.342     0.982      0.929        1
fig. 2 evaluation scores and manually selected readings
evaluation scores for an example sentence are presented in fig. 2. the sentence is given in the top right corner, and the readings with the highest scores (highlighted) match the actual correct readings. 
features recovered by automatic morphological annotation (primarily vocalization and stress pattern) constitute the symbolic representation of the prosody of a given input sentence. this representation will be used as an input to the cart prosody generator, which will, in turn, produce a corresponding sequence of values of fundamental frequency and energy, as well as phone durations. 4. prosody generation and synthesis as has been mentioned before, it is well known that fully expert systems used for modeling of prosodic features are not of great practical use within speech synthesizers, mostly due to the large number of factors that influence prosody as well as their mutual effects, which are too complex to be sufficiently analyzed on speech corpora of reasonable size. speaker inconsistence represents an additional problem. even a single speaker can be expected to pronounce the same sentence differently on different occasions, each of the resulting utterances being equally acceptable to the listener. for all these reasons, the prediction of prosodic features is performed using machine learning, namely the methodology of classification and regression trees (cart) [9]. the basic principle of cart prosody prediction will be shown on an example of predicting the durations of phonetic segments (phones). the initial and the most important step is to identify the features to be used for training. this step has some basis in expert knowledge but the rest of the procedure is completely automatic. the set of features considered to be relevant for the phone duration includes phonemic identity, primary and secondary stress (with values: stressed, unstressed; applicable to vowels only), position within the syllable and position within the intonation boundaries (expressed as number of syllables), but many others as well. 
the durations of phones and relevant features are known for the training set and this set is thus the basis for prediction of duration for all other phoneme instances.
fig. 3 the first 3 levels of the regression tree used for estimation of phone duration
the tree branching is performed as follows. all the possible yes/no questions based on the selected features (e.g. "is the phone stressed?", "is the distance to the nearest phrase break more than 3 syllables?" etc.) are evaluated for each phone instance in the training set. every question splits the starting n phoneme instances ("root" node) into two distinct subsets ("child" nodes) based on the answer (yes or no), and every question generally splits the set differently. the most relevant question is the one that reduces the total diversity (in terms of duration) of both "child" nodes to the greatest possible degree. at this point, the initial node is split into two "child" nodes based on the most relevant question (e.g. "is the phone stressed?"), and the procedure is recursively repeated for every descendant node, until the tree is fully branched. every terminal node ("leaf" node) is assigned a value: the average duration of all instances assigned to that node. the final tree usually contains multiple phoneme instances assigned to each "leaf" node. although the branching procedure is computationally very demanding, the final use of the tree is exceptionally simple and fast. during the synthesis phase, the instance of the phone with known answers to all the relevant yes/no questions is propagated through the tree, from the root node to one of the leaf nodes. the exact path to the leaf node and the final node itself depend on the answers to yes/no questions. the estimated phone duration is the one assigned to the "leaf" node during the training phase (average duration for all the instances assigned to that node). as an illustration, fig. 
3 shows the first 3 levels of the regression tree for the prediction of phone duration. the number within the node indicates the occupancy, i.e. the number of phone instances within the node. the module for automatic prediction of prosodic features of the synthesized speech based on the regression trees for the hebrew language is trained on the speech database which consists of approximately 4 hours of speech from one professional speaker (the same database is used for synthesis). the database is annotated for phone boundaries and phonological content, which corresponds to the phonological inventory of modern israeli hebrew. some phones are split into subphones (such as occlusions and explosions of stops and fricatives). stress is also marked (primary and secondary). for the purposes of cart training, the database is marked for a number of prosodic events including types and levels of intonational phrase boundaries (up, down; none, weak, medium, strong, very strong) as well as levels of emphasis (very weak, weak, neutral, strong, very strong). regression trees are trained for duration, energy, the value of f0 and its derivative, log ratio of f0 values at 1/4 and 3/4 of the duration of a vowel, as well as log ratio of f0 values between two successive vowels (measured at 3/4 of the duration of the first vowel and 1/4 of the duration of the second one). energy and durations are directly obtained, while the final f0 curve is derived from the outputs of the 4 f0-related trees. a total of 600 different criteria (yes/no questions) are taken into account during the process of regression trees branching. these criteria are defined based on the phonetic context, type of phoneme, phoneme position within a word, the corresponding word’s position within the sentence, etc. a number of compound criteria are also used (e.g. "is the phone a vowel and stressed?"). 
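the choice of the most relevant question at a single node, as described in the branching procedure above, can be sketched as follows. this is a minimal illustration under stated assumptions: "diversity" is taken here to be the sum of squared deviations of duration from the node mean (a common regression-tree impurity, not confirmed by the paper), and the feature names, questions and durations are invented toy data.

```python
def diversity(durations):
    """Sum of squared deviations from the mean duration (assumed impurity measure)."""
    if not durations:
        return 0.0
    mean = sum(durations) / len(durations)
    return sum((d - mean) ** 2 for d in durations)


def best_question(instances, questions):
    """Return (question, total child diversity) for the most relevant yes/no question.

    instances: list of (features, duration) pairs
    questions: dict mapping question text to a predicate over the features
    """
    best_name, best_total = None, float("inf")
    for name, predicate in questions.items():
        yes = [d for f, d in instances if predicate(f)]
        no = [d for f, d in instances if not predicate(f)]
        if not yes or not no:
            continue  # the question does not actually split this node
        total = diversity(yes) + diversity(no)
        if total < best_total:
            best_name, best_total = name, total
    return best_name, best_total


# Toy training instances: (features, phone duration in ms); values are illustrative only.
instances = [
    ({"stressed": True, "is_vowel": True}, 120.0),
    ({"stressed": True, "is_vowel": True}, 110.0),
    ({"stressed": False, "is_vowel": True}, 70.0),
    ({"stressed": False, "is_vowel": False}, 60.0),
    ({"stressed": False, "is_vowel": False}, 65.0),
]
questions = {
    "is the phone stressed?": lambda f: f["stressed"],
    "is the phone a vowel?": lambda f: f["is_vowel"],
}
name, total = best_question(instances, questions)
print(name, total)
```

a full tree would apply `best_question` recursively to each child node until a stopping criterion is met, and would store the mean duration of the remaining instances in each leaf.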
in this case, with a training corpus of approximately 4 hours of speech, the maximum number of levels in the trees was 11. however, it should be pointed out that this value is, in general, greatly dependent on the criterion used for stopping the branching procedure (e.g., the number of instances in the node is less than some predefined threshold, or branching reduces the impurity of the node by less than some predefined threshold). after the trees have been built, at synthesis time, the expert systems analyze the input text and attempt to recover the correct reading for each word in it. by doing so, they recover the symbolic representation of the desired prosody for the input text, including the positions of stressed syllables as well as types and levels of intonational phrase boundaries and levels of emphasis for each word. these features exactly correspond to the features used in cart questions, and will be used for “passing” each phoneme of the input sentence down the tree, thus providing the acoustic representation of the desired prosody. after the acoustic representatives of prosody have been generated, segments used for speech signal synthesis are selected. the basic unit on which the segment selector operates is a half-phone. half-phones that are selected as candidates to be used for concatenation are assigned concatenation and target costs. a trellis structure is formed and the viterbi algorithm is used to find the optimal path (half-phone sequence) through the trellis, i.e. the one with the minimal accumulated cost. the cost assignment is performed based on multiple criteria, which can be classified into two basic groups: target criteria and concatenation criteria. 
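the search for the minimal accumulated cost through the trellis can be sketched as follows. this is a generic viterbi-style dynamic programme, not the authors' code: the unit representation (database index, f0 value), both cost functions and all numbers are hypothetical, chosen only so that units adjacent in the database concatenate at zero cost.

```python
def viterbi_select(candidates, target_cost, concat_cost):
    """Return (minimal accumulated cost, best unit sequence) through the trellis.

    candidates: one list of candidate units per position in the utterance
    target_cost(t, unit): mismatch between a unit and the required prosody at position t
    concat_cost(prev, unit): cost of joining two units
    """
    # best holds, for every candidate at the current position, the cheapest path ending there.
    best = [(target_cost(0, c), [c]) for c in candidates[0]]
    for t in range(1, len(candidates)):
        new_best = []
        for c in candidates[t]:
            cost, path = min(
                ((prev_cost + concat_cost(prev_path[-1], c), prev_path)
                 for prev_cost, prev_path in best),
                key=lambda item: item[0],
            )
            new_best.append((cost + target_cost(t, c), path + [c]))
        best = new_best
    return min(best, key=lambda item: item[0])


# Hypothetical units: (index in the speech database, f0 in Hz).
candidates = [
    [(10, 100.0), (50, 120.0)],
    [(11, 130.0), (70, 118.0)],
    [(12, 125.0), (71, 110.0)],
]
required_f0 = [120.0, 120.0, 110.0]


def target_cost(t, unit):
    # Assumed target cost: scaled f0 mismatch only.
    return abs(unit[1] - required_f0[t]) / 100.0


def concat_cost(prev, unit):
    # Zero cost for units that are adjacent and in order in the database, else a fixed penalty.
    return 0.0 if unit[0] == prev[0] + 1 else 1.0


cost, path = viterbi_select(candidates, target_cost, concat_cost)
print(cost, [u[0] for u in path])
```

in this toy trellis the contiguous database sequence 10, 11, 12 is selected even though other candidates match the required f0 more closely, since it avoids every concatenation penalty.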
the target criteria determine the mismatch between the acoustic features of the candidate half-phone and the required prosodic features, and express it through target cost, which is thus the measure of the unsuitability of the phonetic segment for being used in actual synthesis. the features taken into account for target cost are duration, f0 and its derivative, as well as energy. on the other hand, the concatenation criteria determine the cost of concatenating any two half-phones [10]. the quality of the synthesized speech greatly depends on the frequency of concatenation points, as well as the audibility of each of them. the concatenation cost, assigned to any ordered pair of half-phones, is defined as the measure of their acoustic mismatch at concatenation points and thus their incompatibility for being concatenated. for pairs of half-phones which are adjacent and in the same order as in the speech database this cost is equal to zero, which means that such pairs of segments will, whenever possible, be selected for concatenation. in other words, the basic units for synthesis are, in fact, not limited to half-phones, but can include strings of half-phones of unlimited length. in practice, the strings of half-phones selected for concatenation are mostly between 3 and 5 half-phones long. the speech signal synthesis module performs signal concatenation. this module is based on the time-domain pitch synchronous overlap and add (td-psola) algorithm, as implemented previously in [11]. the outputs of the prosody generator module and the segment selection module are used as inputs for the concatenation module. since it is very unlikely to have segments that ideally match the prosody requirements, it is usually necessary to additionally adjust the selected segments as regards their durations, f0 and/or energy. 5. 
the quality of speech it should be noted that there are several independent sources of the differences between the prosody of synthesized speech and the prosody of natural human speech. besides the intrinsic variability of speech prosody (the fact that no speaker will pronounce the same utterance twice in the same way, and that a wide range of the values of prosodic parameters can be considered acceptable), there are two major factors that affect the accuracy of synthetic prosody. firstly, any error in morphologic annotation (and thus stress assignment) or the assignment of some other prosodic event such as phrase break or emphasis will lead to an error at the input of cart based prosody predictor. this would inevitably result in audible prosodic errors. on the other hand, even in cases when the input to cart is quite accurate, the output still may be of inferior quality due to corpus tagging errors (largely eliminated through manual inspection), data sparsity (insufficient training corpus size), inadequately estimated feature set or simply the intrinsic inability of the cart technique to adequately cover all the peculiarities of spoken language. the errors introduced by cart are most often less audible, and the final outcome is an intonation contour characteristic of accurate, albeit somewhat emotionless speech. the evaluation of the proposed automatic prosody generation module was carried out through the perceptual evaluation of the quality of synthesis. within the listening tests, 10 listeners (native speakers with no background in speech processing, text-to-speech synthesis or speech prosody) rated the tts system performance in terms of naturalness of synthesized speech on a scale from 1 (unnatural, robotic speech) to 5 (speech with apparently natural prosodic features). 
the listeners were presented with examples of synthesized speech using either the proposed cart-based generator or its previous version based on an expert system implementing explicit rules governing prosodic features. the utterances (a total of 20) were not marked, and their ordering was varied. the average score given to the cart-based system was 3.9, as opposed to 3.5 given to the rule-based version (the corresponding standard deviations were 0.39 and 0.41 respectively). figure 4 shows a comparison of three fundamental frequency contours for the sentence 'הודעה זאת מושמעת באמצעות מערכת הקראה אוטומטית', corresponding to the utterance as rendered by the native speaker (blue), reference system [5] (grey) and proposed system (green). the three contours have been manually time-aligned to the utterance as rendered by the human speaker (indicated by the waveform and the phonemic labelling). it can be observed that the intonation curve as generated by the reference system seems quite regular, unlike the curves corresponding to the native speaker and the proposed system, which seem to exhibit more variation. furthermore, it can be seen that a much greater percentage of frames in the speech signal generated by the reference system were identified as voiced, in comparison to the other two signals. this is related to the characteristic buzziness present in the speech signal generated by the reference system, which (together with a rather monotonous intonation) was one of the major drawbacks of the reference system as reported by the listeners. however, most listeners also reported that the intonation contours of both synthesizers are adequately related to the positions of stressed syllables.
6. conclusion
by using the expert system in combination with cart the quality of synthesized speech is considerably increased. 
based on the results of the listening tests, the system described in the paper provided much more natural-sounding speech when compared to the previous version of the system, in which the prosody was estimated using the expert system. an additional benefit of automated prosody generation is in the fact that such an automated system can be adapted to different dialects of the hebrew language much more easily and in much less time than the expert system. namely, covering a different dialect of hebrew would require that a new speech corpus be recorded and tagged, and that the automatic training procedure be repeated, which is still widely considered to be far simpler than discovering new sets of expert rules related to prosody. the quality of synthesized speech could be further improved by widening the set of relevant questions as well as by improving the segment selection and signal concatenation modules.
fig. 4 fundamental frequency contours for an example sentence, corresponding to the native speaker (blue), reference system [5] (grey), and proposed system (green).
acknowledgement: this research work has been supported by the ministry of education, science and technological development of the republic of serbia, and it has been realized as a part of the research project tr 32035.
references
[1] j.p.h. van santen, "contextual effects on vowel duration", speech commun., 1992, vol. 11, no. 6, pp. 513-546.
[2] m. sečujski, n. jakovljević and d. pekar, "automatic prosody generation for serbo-croatian speech synthesis based on regression trees", in proceedings of the 12th annual conference of the international speech communication association, 2011, florence, italy, pp. 3157-3160.
[3] ö. öztürk and t. çiloğlu, "segmental duration modelling in turkish", in proceedings of the 9th international conference on text, speech and dialogue, brno, czech republic, lect. notes comput. sc., springer, 2006, vol. 
4188, pp. 669-676. [4] a. lazaridis, p. zervas, n. fakotakis and g. kokkinakis, "a cart approach for duration modeling of greek phonemes", in proceedings of the 12th international conference on speech and computer, 2007, moscow, russia, pp. 287-292. [5] d. kamir, n. soreq and y. neeman, "a comprehensive nlp system for modern standard arabic and modern hebrew", in proceedings of semitic’02, the acl-02 workshop on computational approaches to semitic languages, 2002, acl, stroudsburg, pa, usa, pp 1-9. [6] n. chomsky, morphophonemics in modern hebrew. routledge, 2012. [7] j. fellman, "concerning the "revival" of the hebrew language", anthropol. linguist., may 1973, vol. 15, no. 5, pp. 250-257. [8] b. popović, m. seĉujski, v. delić, m. janev and i. stanković, "automatic morphological annotation in a text-to-speech system for hebrew", in proceedings of the 15th international conference on speech and computer, pilsen, czech republic, lect. notes comput. sc., springer, 2013, vol. 8113, pp. 319-326. [9] l. breiman, j.h. friedman, c.j. stone and r.a. olsen, classification and regression trees. chapman & hall/crc, boca raton, london, new york, washington d.c., 1984. [10] a. black and n. campbell, "optimising selection of units from speech databases for concatenative synthesis", in proceedings of the 4th european conference on speech communication and technology, 1995, madrid, spain, pp. 581-584. [11] v. delić, m. seĉujski, n. jakovljević, m. janev, r. obradović and d. pekar, "speech technologies for serbian and kindred south slavic languages", adv. speech recognition, chapter 9, 2010. facta universitatis series: electronics and energetics vol. 31, n o 4, december 2018, pp. 613-626 https://doi.org/10.2298/fuee1804613k calibration of ac induction magnetometer  branko koprivica, marko šućurović, alenka milovanović university of kragujevac, faculty of technical sciences ĉaĉak, ĉaĉak, serbia abstract. 
the aim of this paper is to describe a procedure and experimental setup for calibration of an ac induction magnetometer. the paper presents an overview of the previous research and results of measurement of magnetic flux density inside a large-diameter multilayer solenoid. this solenoid is the magnetising coil of the magnetometer. the paper also describes a system of five smaller coils of the magnetometer which are placed inside the large solenoid. three small coils are pickup coils, accompanied by two compensation coils, of which one is an empty coil for magnetic field measurement. the experimental results of calibration of this coil system are presented, together with a discussion of all the results. key words: induction magnetometer, calibration, measurement uncertainty, hall sensor, labview.
1. introduction
iron loss in induction motors can reach up to 20% of the total losses [2]. large efforts were made to improve production of the electrical steel and to reduce losses. amorphous materials have also been used because of their lower losses. measurement of their magnetic characteristics has become of great importance in order to obtain reliable data on the power loss of these materials. ac induction magnetometers have proved to be a powerful tool for characterisation of ferromagnetic materials [3, 4]. an induction magnetometer uses a long solenoid for magnetisation of the sample [3, 4]. a pickup coil is placed inside the long solenoid and used for measurement of the magnetic flux density in the sample of ferromagnetic material. the lateral dimension of the sample is not large, usually of the order of several millimetres (diameter of wire or bar and width of strips), while its length usually amounts to several centimetres and may be up to 10-15 cm. another pickup coil may also be placed inside the long solenoid, without the sample, and used for measurement of the magnetic field. 
received february 25, 2018; received in revised form july 5, 2018. corresponding author: branko koprivica, university of kragujevac, faculty of technical sciences, svetog save 65, 32000 čačak, serbia (e-mail: branko.koprivica@ftn.kg.ac.rs). * an earlier version of this paper was presented at the 13th international conference on applied electromagnetics (пес 2017), august 31 to september 01, 2017, in niš, serbia [1].
in general, it is easy to control the time waveform of the magnetic field created by the long solenoid. therefore, the magnetic field has the desired shape, while the shape of the magnetic flux density depends on the material response. when the magnetometer is used along with a personal computer, as a digital measurement setup, it enables very complex experiments to be performed, such as those for measurement of first-order reversal curves [3, 4]. this measurement method is based on two basic laws of magnetism: ampere’s law for the magnetising coil and faraday’s law for the pickup coil. in the case when a long solenoid with small diameter is used as the magnetising coil, according to ampere’s law, the magnetic field is homogeneous and has longitudinal direction and constant amplitude [4]. however, this is not easy to achieve in practice and in many cases the magnetic field is homogeneous only in the middle zone of the solenoid. this limits the length of the pickup coil. because of this inhomogeneity, the magnetic field cannot always be accurately calculated according to ampere’s law, using the measured current of the solenoid. faraday’s law applied to the pickup coil may also provide an inaccurate result if the pickup coil is placed in the zone where the magnetic field is not homogeneous. therefore, the whole system needs to be calibrated. in this paper, a calibration procedure will be described and performed on the coil systems of one old ac induction magnetometer ferrotester 2738/s-3. 
as an initial step in the calibration of the magnetometer, the homogeneity of the magnetic field inside a large-diameter multilayer solenoid (large solenoid) has been investigated in the prior research [1]. it has been found that the homogeneity zone covers only one third of the solenoid (its central part). the variation of the magnetic field in this zone was less than 1 % of its maximum value. the inner coil system of this magnetometer contains five smaller coils: one coil for measurement of the magnetic field (empty coil) and two pairs of coils in mutual opposition for measurement of the magnetic flux density and magnetisation. the calibration of this coil system, along with the calibration of the magnetising solenoid, will be presented in this paper. the calibration of the magnetometer has been performed using a pc based measurement setup. a hall sensor has been used for measurement of the magnetic flux density in the homogeneity zone of the large solenoid. the supply voltage and the electric current of the large solenoid have also been measured. five voltages from the system of pickup coils have been measured: one on the empty coil, two on coils in opposition and two compensated voltages. measurements have been performed using the ni usb 6009 data acquisition card and an application created in labview software. the ratio of the magnetic flux density maximum and the electric current maximum gives the calibration constant of the large solenoid. the calibration constant of each pickup coil has been calculated from the corresponding voltage maximum. this paper gives a detailed description of the ac induction magnetometer, all information on the measuring equipment and calibration procedure, as well as the results obtained during the calibration. it also gives a detailed calculation of the measurement uncertainty, as well as a discussion of the results obtained, and explains how to calculate the magnetic field, the magnetic flux density and the magnetisation using the obtained calibration constants. 
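the definition of the calibration constant of the large solenoid given above (flux density maximum divided by current maximum) can be sketched as follows. this is only an illustration on synthetic data: the function name, the use of absolute peak values, the sampling grid and the amplitudes (5 mt and 2 a) are assumptions, not values or code from the paper.

```python
import math


def calibration_constant(flux_density, current):
    """Ratio of peak flux density (T) to peak current (A), in T/A.

    Peaks are taken as maxima of absolute values, which is an assumption here.
    """
    b_max = max(abs(b) for b in flux_density)
    i_max = max(abs(i) for i in current)
    if i_max == 0:
        raise ValueError("current waveform is identically zero")
    return b_max / i_max


# Synthetic one-period waveforms sampled at 1000 points (illustrative values only).
n = 1000
current = [2.0 * math.sin(2 * math.pi * k / n) for k in range(n)]            # A
flux_density = [5.0e-3 * math.sin(2 * math.pi * k / n) for k in range(n)]    # T

k_solenoid = calibration_constant(flux_density, current)
print(k_solenoid)  # T/A

# With the constant known, flux density follows from a measured current sample.
b_estimate = k_solenoid * 1.2  # T, for a hypothetical current of 1.2 A
```

with real data the waveforms would come from the hall sensor and the current measurement, and noise near the peaks would usually call for averaging over several periods before taking the maxima.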
moreover, some practical comments on measurements at various frequencies of the magnetising current are given in the paper.

2. ac induction magnetometer

a photo of the ac induction magnetometer is given in fig. 1. it was part of the equipment of the ferrotester 2738/s-3. it has a 360 mm long magnetising coil (large solenoid) of inner diameter 2r0=65 mm, fig. 1a. its number of turns is unknown (it is not given in the user manual). it also has a system of five pickup coils, each 100 mm long with an inner diameter of 15 mm, fig. 1b. their number of turns is also unknown. a cross-section of the magnetometer is presented in fig. 2a and the electrical scheme of connections of the pickup coils is given in fig. 2b.

fig. 1 photos of the induction magnetometer: a) large solenoid, b) pickup coils inside large solenoid

the five pickup coils are placed inside the large solenoid l0 in its central part (z∈(−5, 5) in fig. 3) at the same distance r1=20 mm from its axis (x=0 in fig. 3). as presented in [1], this is a zone with a homogeneous magnetic field in which the variation of the magnetic field intensity is less than 1 % of its maximum. since the whole system has axial symmetry, the pickup coils are exposed to the same magnetic field. coil l1 is used for measurement of the magnetic field and its interior should not contain samples of ferromagnetic material. coils l2 and l′2 are two identical coils wound in opposite directions and connected in mutual opposition, as shown in fig. 2b. coils l3 and l′3 are likewise identical, wound in opposite directions and connected in mutual opposition (fig. 2b). a small difference between coil systems 2 and 3 has been observed. a sample of the ferromagnetic material under test can be placed in any of these four pickup coils and the magnetic flux density can be measured.
a resistor (45 kΩ) is connected in series with the two coils in opposition to reduce the electric current in the coils, fig. 2b.

fig. 2 induction magnetometer: a) cross-section, b) electrical scheme of connections of pickup coils

fig. 3 total magnetic flux density distribution inside large solenoid

fig. 2a also shows the position of the hall sensor (hs) used for direct measurement of the magnetic flux density inside the large solenoid. this sensor (type ss49e) is inserted in the vertical gap of a cylindrical plastic holder (perpendicular to its axis). together with the holder, the sensor is placed inside the coil l1 in the middle of its length, perpendicular to the longitudinal axis of the coil and perpendicular to the magnetic field. a photo of the sensor and the 3d printed plastic holder is presented in fig. 4.

fig. 4 hall sensor and its position inside plastic holder

usually, the magnetic field h generated by a long solenoid is calculated from the measured electric current i using expression (1) [5], according to ampere's law:

$$h = \frac{n}{l}\, i = \frac{n}{l}\,\frac{u}{r}, \tag{1}$$

where n is the number of turns of the large solenoid l0, l is its length, r is the resistor connected in series with the solenoid and u is the voltage measured at the ends of r. however, this cannot be used since n is unknown for l0. in general, expression (1) cannot be used with good accuracy in the case of a large-diameter multilayer solenoid l0. if a ferromagnetic sample is placed inside a pickup coil, according to faraday's law, the voltage ul induced in the pickup coil is equal to [4]:

$$u_l = -n_p \frac{\mathrm{d}\varphi}{\mathrm{d}t} = -\mu_0 n_p \left( s_s \frac{\mathrm{d}m}{\mathrm{d}t} + s_p \frac{\mathrm{d}h}{\mathrm{d}t} \right), \tag{2}$$

where μ0 is the permeability of vacuum, np is the number of turns of the pickup coil, sp is the cross-section area of the pickup coil, ss is the cross-section area of the sample, φ is the magnetic flux and m is the magnetisation.
the voltage measured at the non-common ends of the pickup coils in mutual opposition is equal to the first term in expression (2). if no sample is placed inside the pickup coil in mutual opposition, the voltage induced in that pickup coil is equal to the second term in expression (2). if no sample is placed inside the pickup coils, the induced voltages are equal and the resulting voltage is equal to zero or very close to zero. according to the iec standard for the epstein frame [6], the resulting voltage should be smaller than 0.1 % of the individual voltages. expression (2) cannot be applied to the pickup coils of the described magnetometer since these coils are multilayer coils and their number of turns is unknown.

3. calibration procedure and results

the calibration of the induction magnetometer is performed in three steps:
1. calibration of the hall sensor,
2. calibration of the large solenoid and
3. calibration of the pickup coils.
the calibration is necessary because the numbers of turns of all coils are unknown for the induction magnetometer used. the large solenoid generates a homogeneous magnetic field only in the central zone. even if this is not the case, it is always better to perform the calibration and to compare the obtained results with calculations. all measurements are performed with controlled sinusoidal excitation voltage and current, at the frequency of 50 Hz. the ni usb 6009 data acquisition card is used in all measurements [7]. three simple labview applications are developed, one for each step of the calibration. in all measurements averaging is used to reduce the noise [8].

3.1. calibration of hall sensor

the hall sensor ss49e has a linear output voltage in the range of magnetic flux density from −100 mT to 100 mT [9]. therefore, it is suitable for measurement of the magnetic flux density generated by the large solenoid l0. however, it needs to be calibrated in order to determine its sensitivity.
a calibration is performed using a long solenoid (l=340 mm) with a small diameter (25 mm). a varnished copper wire 1.8 mm thick is used for winding the n=190 turns of this solenoid. the electric current i(t) of the solenoid is measured using a shunt resistor (20 A, 75 mV). the magnetic flux density is calculated as follows:

$$b(t) = \mu_0 \frac{n\, i(t)}{l}. \tag{3}$$

the sensor characteristic uhmax=f(bmax) is obtained from the maximum of the voltage uh measured at the output of the sensor and the maximum of the magnetic flux density calculated using expression (3). measurements are performed in the increasing and the decreasing direction in order to examine the linearity of the characteristic. the sensitivity of the sensor is calculated from the slope of the obtained characteristic. characteristics of the sensor (blue lines) obtained from two measurements for maximal magnetic flux densities of 10 mT (green squares) and 20 mT (red circles) are presented in fig. 5.

fig. 5 calibration of hall sensor: measurement results and linear fit (uhmax [V] versus bmax [mT]). fit results for the equation y = a + bx:
measurement | slope (b) | stand. deviation | adj. r-square
20 mT | 0.02991 | 2.57e-4 | 0.99957
10 mT | 0.03026 | 1.25e-4 | 0.99889

the figure also presents a table with the calculated slopes, standard errors and adjusted r-squares. for a given dataset (xi, yi), i=1, 2, …, n, the standard deviation (error) ε of the slope and the adjusted r-square of the linear model y=a+bx can be calculated as [10]:

$$\varepsilon = \sqrt{\frac{\dfrac{1}{n-2}\sum_{i=1}^{n}\left(y_i-(a+bx_i)\right)^2}{\sum_{i=1}^{n}\left(x_i-\bar{x}\right)^2}}, \tag{4}$$

$$\bar{r}^2 = 1-\frac{\dfrac{1}{n-2}\sum_{i=1}^{n}\left(y_i-(a+bx_i)\right)^2}{\dfrac{1}{n-1}\sum_{i=1}^{n}\left(y_i-\bar{y}\right)^2};\quad \bar{y}=\frac{1}{n}\sum_{i=1}^{n}y_i, \tag{5}$$

with the mean value of xi defined analogously to that of yi. according to fig. 5, the sensitivity of the hall sensor is equal to s=0.03 V/mT. it is interesting to notice that both measurements in fig. 5 show a dispersion of results. this effect has been examined more thoroughly.
it has been found that this behaviour comes from the heating and cooling of the shunt resistor during the measurement of the electric current. for lower values of the supply current (up to 10 A, rms) this effect is not very pronounced (green squares) and it can be neglected. therefore, it can be concluded that the sensor output is linear with a constant sensitivity. however, the second measurement (red circles) shows significant dispersion of the results and the calculated slope is lower by 1.16 % than in the first measurement. such a difference can be comparable to, or even higher than, the overall measurement uncertainty of the experiment (discussed in detail in section 4). therefore, attention needs to be paid to this effect and its influence on the calculated results.

3.2. calibration of large solenoid

the calibration of the large solenoid is performed with the calibrated hall sensor. at this step, the dependence of the magnetic flux density generated by the large solenoid on its electric current is examined. the final result of the calibration is the bmax=f(imax) characteristic of the large solenoid. the electric current of the large solenoid is measured using a shunt resistor. the magnetic flux density is calculated by dividing the measured output voltage of the hall sensor by the sensitivity s obtained in the previous step. the hall sensor is placed in the middle of the large solenoid, so that the sensor surface is perpendicular to its longitudinal axis and the magnetic field. measurements are performed with increasing and decreasing electric current in order to examine the linearity of the characteristic. the result of the calibration of the large solenoid is presented in fig. 6. according to the linear fit of the measured results, the slope of the b=f(i) characteristic of the large solenoid is around 19.77 mT/A.
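as a rough cross-check of this slope, the sketch below applies an ordinary least-squares fit to the six averaged (imax, bmax) points listed in table 1 of this paper. it is only an illustration under stated assumptions: the raw data behind fig. 6 are not available here, so the fitted slope is expected to be close to, but not identical with, 19.77 mT/A. the function name linear_fit and the variable names are hypothetical, and the slope error and adjusted r-square follow the usual least-squares definitions.

```python
import math

# Averaged calibration points taken from Table 1 of this paper.
i_max = [0.195, 0.390, 0.586, 0.782, 0.979, 1.176]        # A
b_max = [3.855, 7.718, 11.596, 15.455, 19.321, 23.161]    # mT


def linear_fit(x, y):
    """Least-squares fit y = a + b*x; returns a, b, slope std. error, adjusted R^2."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_mean - b * x_mean
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))  # residual sum of squares
    tss = sum((yi - y_mean) ** 2 for yi in y)                    # total sum of squares
    slope_err = math.sqrt(rss / (n - 2) / sxx)
    adj_r2 = 1.0 - (rss / (n - 2)) / (tss / (n - 1))
    return a, b, slope_err, adj_r2


intercept, slope, slope_err, adj_r2 = linear_fit(i_max, b_max)
print(f"slope = {slope:.3f} mT/A, std. error = {slope_err:.4f}, adj. R^2 = {adj_r2:.5f}")
```

fitting with a free intercept is a design choice made here for consistency with the model y = a + bx; a fit forced through the origin would give a slightly different slope.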
the linearity of this characteristic confirms the conclusion from the calibration of the hall sensor that the dispersion of measurement results comes from the variation of the temperature of the shunt resistor. the obtained slope can be used in further measurements to calculate the magnetic flux density generated by the large solenoid from the measured current.

fig. 6 calibration of large solenoid: measurement results and linear fit (bmax [mT] versus imax [A]; measured increasing, measured decreasing, linear fit). fit results for the equation y = a + bx: slope (b) 19.768, stand. deviation 0.0244, adj. r-square 0.999

3.3. calibration of pickup coils

in the third step of the calibration, the voltages induced in the pickup coils l1, l′2 and l′3 (fig. 2b) and the voltages at the ends of the mutual pickup coils l2-l′2 and l3-l′3 are measured in order to calculate calibration constants for all pickup coils. all voltages have been measured with respect to the ground terminal of the data acquisition card. additionally, the voltage ul0 supplied to the large solenoid, as well as the electric current i and the magnetic flux density b (hall sensor), are measured. thus, all quantities of interest for the calibration of the magnetometer are measured simultaneously. measurements have been performed at six different magnetising currents up to around imax=1.2 A. each signal (its time waveform) has been measured 1600 times and the averaged signal has been calculated. this reduces the noise in all signals to negligible levels [8]. because this is the most important calibration step, it has been repeated five times. finally, the mean value of all measurements has been calculated. signals measured at imax=0.39 A are presented in fig. 7. it can be noticed that the signals that represent the magnetising current and the magnetic flux density are in phase, while the signal ul1 from the pickup coil l1 lags by π/2.
signals ul′2 and ul′3 from pickup coils l′2 and l′3 are opposite in phase with signal ul1. the maximum of the voltage induced in a pickup coil can be derived as:

$$u_{l\max} = \omega k_l b_{\max} = \mu_0 \omega k_l h_{\max}, \tag{6}$$

where kl is a constant proportional to the product of the number of turns and the cross-section area of the pickup coil and ω is the angular frequency. since the numbers of turns are unknown for the pickup coils of the calibrated magnetometer, this product can be calculated from the measured maxima of the induced voltage and the magnetic flux density. thus, the calibration constant of the pickup coil can be obtained as kl=ulmax/(ωbmax).

fig. 7 calibration of pickup coils: measured signals (supply voltage ul0 [V], current i [A] and magnetic flux density b [mT] versus t [s]; pickup coil voltages ul1, ul′2, ul′3 [V] versus t [s])

table 1 presents the results of the calibration of the pickup coils obtained for different magnetising currents. it contains the maxima of the supply voltage and the electric current of the large solenoid, the maximum of the magnetic flux density measured with the hall sensor, the maximum of the voltage induced in the pickup coil l1 and the calculated calibration constants for all pickup coils. the last row of table 1 gives the averaged value of the ratio of the maxima of the magnetic flux density and the magnetising current, which is the calibration constant of the large solenoid l0 (step two), and the averaged values of the calibration constants of all pickup coils. the obtained calibration constant for the large solenoid is in accordance with the result given in fig. 6.

table 1 results of pickup coils calibration
no. | ul0max [V] | imax [A] | bmax [mT] | bmax/imax [mT/A] | ul1max [V] | kl1 [m²] | kl′2=kl2 [m²] | kl′3=kl3 [m²]
1. | 31.65 | 0.195 | 3.855 | 19.738 | 1.267 | 1.046 | 1.225 | 1.170
2. | 63.41 | 0.390 | 7.718 | 19.780 | 2.540 | 1.048 | 1.226 | 1.171
3. | 95.22 | 0.586 | 11.596 | 19.801 | 3.815 | 1.047 | 1.225 | 1.170
4. | 127.14 | 0.782 | 15.455 | 19.767 | 5.088 | 1.048 | 1.226 | 1.171
5. | 159.22 | 0.979 | 19.321 | 19.738 | 6.368 | 1.049 | 1.228 | 1.172
6. | 191.37 | 1.176 | 23.161 | 19.685 | 7.641 | 1.050 | 1.229 | 1.174
average | | | | 19.751 | | 1.048 | 1.227 | 1.171

4. discussion of results

an important part of the calibration procedure is the estimation of the uncertainty of the performed measurements [11]. the calibration of the hall sensor is found to be complex and influenced by many variables. also, the uncertainty of this calibration influences the uncertainty of the other two calibrations. therefore, it is analysed thoroughly in this section. in order to achieve the lowest possible uncertainty, the calibration of the hall sensor is repeated with another data acquisition card (ni 9205), which has adjustable voltage ranges and better absolute accuracy than the ni usb 6009. for the measurements made at a magnetic flux density of 10 mT, the voltage range for measurement of the hall sensor voltage is set to 5 V and the voltage range for measurement of the voltage at the ends of the shunt resistor is set to 200 mV. the hall sensor sensitivity is calculated as:

$$s = \frac{u_{h\max} - 2.5}{b_{\max}} = \frac{(u_{h\max} - 2.5)\, l\, r}{\mu_0 n_1 u_{\max}}, \tag{7}$$

where uhmax=2.799 V is the maximum of the hall sensor voltage, 2.5 V is the quiescent output voltage of the hall sensor, bmax is the maximum of the magnetic flux density, l=340 mm is the length of the solenoid, r=0.00375 Ω is the shunt resistance, μ0=4π·10⁻⁷ H/m is the magnetic permeability of vacuum (air), n1=190 is the number of turns of the solenoid and umax=0.05369 V is the maximum of the shunt voltage.
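the arithmetic behind expression (7) can be sketched as follows, using only the values quoted above. the variable names are hypothetical, and because uhmax is quoted to three decimals, the computed sensitivity is expected to land close to, but slightly off, the reported 29.77 V/T.

```python
import math

MU_0 = 4 * math.pi * 1e-7    # H/m, permeability of vacuum

# Values quoted in the text for the measurement at 10 mT.
u_h_max = 2.799        # V, maximum of the Hall sensor output voltage
u_quiescent = 2.5      # V, quiescent output voltage of the Hall sensor
length = 0.340         # m, length of the calibration solenoid
r_shunt = 0.00375      # ohm, shunt resistance (75 mV at 20 A)
n_turns = 190          # number of turns of the calibration solenoid
u_shunt_max = 0.05369  # V, maximum of the shunt voltage

i_max = u_shunt_max / r_shunt                   # A, solenoid current maximum
b_max = MU_0 * n_turns * i_max / length         # T, expression (3)
sensitivity = (u_h_max - u_quiescent) / b_max   # V/T, expression (7)

print(f"b_max = {b_max * 1e3:.3f} mT, s = {sensitivity:.2f} V/T")
```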
the combined uncertainty of the sensitivity s is calculated from the sensitivity coefficients, which are obtained as partial derivatives of the sensitivity s expressed by (7), and the absolute uncertainties of each independent variable in (7) [11, 12], as:

$$u_{c,s} = \sqrt{\sum_{i}\left(\frac{\partial s}{\partial x_i}\right)^2 u_{b,x_i}^2} = \sqrt{\sum_{i} u_{x_i}^2}, \tag{8}$$

where xi∈{uhmax, l, r, n1, umax}, uxi=(∂s/∂xi)·ub,xi and ub,xi is a type b standard uncertainty, evaluated from the absolute uncertainty $U_{x_i}$ of xi as $u_{b,x_i} = U_{x_i}/\sqrt{3}$ for a rectangular distribution or $u_{b,x_i} = U_{x_i}/1.960$ for a normal distribution with the confidence level of 95 %. sensitivity coefficients are calculated using the values of uhmax, l, r, n1, μ0 and umax. their values are given in table 2. absolute uncertainties of voltages are calculated according to the specification for the ni 9205 given by the manufacturer [13], taking into account three components of error: error of full scale, error of reading and noise error. the absolute uncertainty of the length of the solenoid is taken as one half of the measuring unit. the absolute uncertainty of the resistance is given by the manufacturer as 0.5 % of its rated value. the absolute uncertainty of the number of turns is equal to one turn. all values are given in table 2. a rectangular distribution is assumed for voltages and resistance, with the coefficient of division √3, and a normal distribution is assumed in the case of length and number of turns, with the coefficient of division 1.96 (confidence level of 95 % at infinite degrees of freedom). the resulting combined uncertainty has an absolute value of 0.174 V/T and a relative value of 0.58 %. therefore, the standard uncertainty is 0.58 %. moreover, a coverage factor k=2 can be used for calculation of the expanded uncertainty. in such a case, the confidence level is around 95 % and the expanded uncertainty is 0.35 V/T or 1.17 %. finally, the result of the measurement of the hall sensor sensitivity can be reported as:

$$s = 29.77\ \mathrm{V/T} \pm 0.35\ \mathrm{V/T}. \tag{9}$$

table 2 type b uncertainty for hall sensor calibration at 10 mT
variable | absolute uncertainty | sensitivity coefficient | distribution | absolute standard uncertainty ub [V/T] | relative uncertainty [%]
uhmax | 0.002138 V | 99.46 1/T | rectangular | 0.12277 | 0.412
l | 0.5·10⁻³ m | 87.555 V/(T·m) | normal (95 %) | 0.02233 | 0.075
r | 0.01875·10⁻³ Ω | 7938.32 V/(T·Ω) | rectangular | 0.08593 | 0.289
n1 | 1 | −0.1567 V/T | normal (95 %) | 0.07994 | 0.268
umax | 0.08953·10⁻³ V | −554.455 1/T | rectangular | 0.02866 | 0.096

the main contribution to the measurement uncertainty comes from the measurement of the output voltage of the hall sensor. the reason is the relatively high value of the measured voltage, as well as the high measuring range. the absolute uncertainty ub,uhmax depends on both values. the calculated uncertainty refers only to the measurements performed at 10 mT. the whole calculation described needs to be repeated to obtain the uncertainty for other values of magnetic flux density. the results of the calibration can be used in further measurements with the magnetometer to shorten the time needed for measurement and calculation. they can be used in different ways. first, the calibration constant of the large solenoid can be used for calculation of the time waveform of the magnetic field generated by the large solenoid from the measured electric current i(t), as h(t)=19.77·i(t)/μ0 (with the constant 19.77 mT/A converted to T/A). as a consequence, the hall sensor may be excluded from the measurement setup. additionally, the magnetic field can also be calculated from the integral of the measured voltage ul1(t) of the pickup coil l1, which in the case of sinusoidal excitation current reduces to $h(t) = -u_{l1}(t - T/4)/(\mu_0 \omega k_{l1})$, with $\omega = 2\pi/T$ and T the period of the excitation. thus, the shunt resistor for current measurement can also be excluded from the measurement setup. calibration constants of the other four pickup coils can be used in calculations of the time waveforms of the magnetic flux density b and the magnetisation m of the ferromagnetic sample.
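the type b budget of table 2 can be cross-checked with a few lines of code. the sketch below is only an illustration: it takes the absolute uncertainties and sensitivity coefficients exactly as listed in table 2, divides by √3 (rectangular) or 1.96 (normal, 95 %), and combines the contributions in quadrature as in expression (8). the dictionary layout and variable names are hypothetical.

```python
import math

SQRT3 = math.sqrt(3.0)   # divisor for a rectangular distribution
NORMAL_95 = 1.96         # divisor for a normal distribution, 95 % confidence level

# variable: (absolute uncertainty, sensitivity coefficient, divisor), from Table 2
budget = {
    "u_hmax": (0.002138, 99.46, SQRT3),
    "l": (0.5e-3, 87.555, NORMAL_95),
    "r": (0.01875e-3, 7938.32, SQRT3),
    "n1": (1.0, -0.1567, NORMAL_95),
    "u_max": (0.08953e-3, -554.455, SQRT3),
}
s = 29.77  # V/T, reported Hall sensor sensitivity

# Contribution of each variable to the uncertainty of s, in V/T.
contributions = {
    name: abs(coeff) * u_abs / divisor
    for name, (u_abs, coeff, divisor) in budget.items()
}
u_combined = math.sqrt(sum(u ** 2 for u in contributions.values()))
u_expanded = 2.0 * u_combined  # coverage factor k = 2

for name, u in contributions.items():
    print(f"{name:7s} {u:.5f} V/T  ({100 * u / s:.3f} %)")
print(f"combined: {u_combined:.3f} V/T, expanded (k=2): {u_expanded:.2f} V/T")
```

keeping the budget in one table-like structure makes it straightforward to repeat the calculation for other values of magnetic flux density by swapping in the corresponding inputs.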
similar to expression (2), in the case when a ferromagnetic sample is placed inside the pickup coil l2, the following expression can be used:

$$u_{l2} = -\mu_0 k_{l2} \left( \frac{s_s}{s_{l2}} \frac{\mathrm{d}m}{\mathrm{d}t} + \frac{\mathrm{d}h}{\mathrm{d}t} \right), \tag{10}$$

where ul2 is the measured voltage of the pickup coil l2, sl2=πrp² is the cross-section area and rp=7.5 mm is the inner radius of the pickup coil. the magnetisation of the sample can be obtained by integration of this voltage and by substituting the magnetic field expressed through the voltage ul1, as:

$$m = \frac{s_{l2}}{s_s} \left( -\frac{1}{\mu_0 k_{l2}} \int_0^t u_{l2}\, \mathrm{d}t - h \right). \tag{11}$$

on the other hand, the voltage u2 between points 2 and 4 (fig. 2b) can be obtained using only the magnetisation:

$$u_2 = -\mu_0 k_{l2} \frac{s_s}{s_{l2}} \frac{\mathrm{d}m}{\mathrm{d}t}. \tag{12}$$

the magnetisation of the sample is obtained by the integration of (12), as:

$$m = -\frac{s_{l2}}{\mu_0 k_{l2} s_s} \int_0^t u_2\, \mathrm{d}t. \tag{13}$$

the magnetic flux density of the sample can be calculated using (11) or (13) and the previously calculated h(t), according to the well-known relation:

$$b = \mu_0 (m + h). \tag{14}$$

similar expressions can be used for the other pickup coils in the case when the ferromagnetic sample is placed inside these coils. however, the above analysis may lead to measurement errors caused by two effects. the first one is related to the shape of the excitation current. as soon as the ferromagnetic sample is placed inside a pickup coil, for the same excitation voltage, the magnetising current will change by some amount. this change may be large and the shape of the current may be significantly distorted from sinusoidal. the level of influence depends on the material characteristics. this effect can be overcome by computer-based digital feedback, as has been discussed thoroughly in the literature [11]. the second effect may also appear after insertion of the ferromagnetic sample inside a pickup coil.
the magnetic field inside the pickup coil with the ferromagnetic sample will be distorted, as well as the surrounding magnetic field. this distortion might reach other pickup coils and disrupt the air flux compensation, which would no longer be as effective as it was for the empty coils. this problem can be solved numerically in such a way that the compensating voltage is calculated using the measured magnetic field [11]. this voltage needs to be subtracted from the measured voltage induced in the pickup coil, for example ul2 in (10). consequently, the magnetic field needs to be measured accurately to obtain effective air flux compensation. for these reasons, the analysis given by equations (10) to (14) needs to be validated through numerous experiments in which different magnetic materials should be used. also, the dimensions of the magnetic samples need to be varied, as well as the frequency of the excitation current and its shape (sinusoidal, triangular and other). the main purpose of the calibrated magnetometer is to obtain instantly the hysteresis loop of a ferromagnetic sample from the calculated time waveforms of the magnetic flux density (or magnetisation) and the magnetic field. it can be used for parallel comparison of the hysteresis loops of up to four samples. samples can be made from different materials with the same dimensions or from one material with different dimensions. if all samples are made from one material and have the same dimensions, one sample can be used as a reference sample while the others can be checked against the reference and classified according to predefined criteria. the magnetometer can operate with different frequencies of the magnetising current and with different shapes of its time waveform. it should be taken into account that the voltage induced in the pickup coils should not exceed the voltage range of the data acquisition card (usually 10 V). the induced voltage increases with increasing frequency of the magnetising current.
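how quickly the induced voltage approaches the card range can be estimated from expression (6). the sketch below assumes sinusoidal excitation, an empty pickup coil l1 with the averaged constant kl1=1.048 m² from table 1, a flux density maximum of 23.161 mT held constant while the frequency changes, and a frequency-independent kl1; these are simplifying assumptions for illustration, and the function names are hypothetical.

```python
import math


def induced_voltage_max(frequency_hz, k_l, b_max):
    """Maximum of the induced voltage, u_lmax = omega * k_l * b_max (expression (6))."""
    return 2.0 * math.pi * frequency_hz * k_l * b_max


def max_frequency(u_range, k_l, b_max):
    """Frequency at which the induced voltage maximum reaches the given voltage range."""
    return u_range / (2.0 * math.pi * k_l * b_max)


K_L1 = 1.048        # m^2, averaged calibration constant of coil L1 (Table 1)
B_MAX = 23.161e-3   # T, largest flux density maximum in Table 1
U_RANGE = 10.0      # V, typical range of the data acquisition card

u_50 = induced_voltage_max(50.0, K_L1, B_MAX)
f_limit = max_frequency(U_RANGE, K_L1, B_MAX)
print(f"u_l1max at 50 Hz: {u_50:.2f} V, 10 V reached near {f_limit:.1f} Hz")
```

under these assumptions the 50 Hz value agrees with the measured ul1max in the last row of table 1 to within about 0.2 %, and the limit frequency indicates where a voltage divider or a lower excitation level becomes necessary.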
sometimes, it is necessary to use voltage dividers to keep the induced voltage in the desired range. at very low frequencies (below 1 Hz), the amplitude of the induced voltage is relatively small, which results in a deterioration of the signal-to-noise ratio. in such cases, averaging of the measured signals is useful [8].

5. conclusion

the paper gives a brief description of the ac induction magnetometer, its construction and working principle, and emphasises its rising importance in complex measurements with ferromagnetic materials. the paper describes a calibration procedure for the ac induction magnetometer. it has been performed on the coil systems of the ferrotester 2738/s-3. initially, measurements have been performed on the large-diameter multilayer solenoid in order to investigate the homogeneity of the generated magnetic field. it has been found that the variation of the magnetic field in the homogeneity zone was less than 1 % of its maximum value. this homogeneity zone covers one third of the solenoid volume (its central part). the calibration constant of the large solenoid has been determined from measurements. further, using numerous measurements, the calibration constants for all investigated pickup coils have been determined. a detailed calculation of the measurement uncertainty of the hall sensor sensitivity is also presented in the paper. it has been found that the relative expanded uncertainty amounts to 1.17 % at a magnetic flux density of 10 mT. the possible ways to use the calibrated magnetometer for future measurements have been described. the method for determining the time waveforms of the magnetic field and the magnetic flux density (or magnetisation) from the measured voltages induced in the pickup coils and the calibration constants has been presented. a hysteresis loop of a ferromagnetic sample can be easily obtained from these waveforms.
the magnetometer can be used for simultaneous testing of four samples of one material or of different materials. two side effects that can produce errors in such measurements were discussed. some practical notes on measurements with the calibrated magnetometer at different frequencies of the magnetising current have also been given in the paper.

acknowledgement: this paper has been supported by scientific project tr 33016, financed by the ministry of education, science and technological development of the republic of serbia.

references
[1] b. koprivica, m. šućurović, n. jevtić, a. milovanović, "measurement of magnetic flux density of large-diameter multilayer solenoid", in proceedings of the 13th international conference on applied electromagnetics (пес 2017), niš, serbia, 2017, p. o5-2.
[2] h. gavrila, v. manescu (paltanea), g. paltanea, g. scutaru, i. peter, "new trends in energy efficient electrical machines", procedia engineering, vol. 181, pp. 568-574, 2017.
[3] f. béron, g. soares, k. r. pirota, "first-order reversal curves acquired by a high precision ac induction magnetometer", review of scientific instruments, vol. 82, no. 6, p. 063904, june 2011.
[4] m. rivas, p. gorria, c. muñoz-gómez, j.c. martínez-garcía, "quasistatic ac forc measurements for soft magnetic materials and their differential interpretation", ieee transactions on magnetics, vol. 53, no. 11, p. 2003606, nov. 2017.
[5] k.l. kaiser, electromagnetic compatibility handbook, crc press, new york, usa, 2004.
[6] iec 60404-2 edition 3.1, "magnetic materials – part 2: methods of measurement of the magnetic properties of electrical steel strip and sheet by means of an epstein frame", iec, geneva, switzerland, june 2008.
[7] national instruments, "ni 6009 low-cost, bus-powered multifunction daq for usb", austin, usa, 2014.
[8] s. zurek, t. kutrowski, a.j. moses, p. anderson, "measurements at very low flux density and power frequencies", journal of electrical engineering, vol. 59, no. 7/s, pp. 7-10, 2008.
[9] ss49e linear hall effect sensor. available at: https://www.addicore.com/ss49e-linear-hall-sensorp/ad316.htm.
[10] https://www.originlab.com/doc/origin-help/lr-algorithm#adj._r-square.
[11] s. zurek, characterisation of soft magnetic materials under rotational magnetisation, crc press, london, uk, 2018.
[12] jcgm 100, evaluation of measurement data, guide to the expression of uncertainty in measurement, joint committee for guides in metrology, 2008.
[13] national instruments, ni 9205, datasheet, 2015.

facta universitatis series: electronics and energetics vol. 32, no 3, september 2019, pp. 369-385
https://doi.org/10.2298/fuee1903369a
© 2019 by university of niš, serbia | creative commons license: cc by-nc-nd

the impact of the larger number of non-linear consumers on the quality of electricity *

enver agić 1, damir šljivac 2, bakir agić 3
1 court expert electrical engineers, tuzla, bosnia and herzegovina
2 electrical faculty, the university j.j. strossmayer of osijek, osijek, croatia
3 faculty of electrical engineering, the university of tuzla, tuzla, bosnia and herzegovina

abstract. this paper explains theoretically the formation of higher harmonic components in the electricity network, their causes, their consequences for consumers and the ways of eliminating them. the role of the transformer in the dyg connection is explained on a concrete example. for this example, the waveforms of the primary (R) phase current at the 10 kV voltage level, and of the secondary (r) phase current and the neutral conductor current at the 0.4 kV voltage level, are determined. the harmonic content is determined up to the 15th harmonic, together with the effective values of these currents (phases R and r). thd is calculated for the current of the primary (R) phase and of the secondary (r) phase.
in this paper, a three-phase filter is dimensioned to eliminate the largest harmonic component of the primary (R) phase current on the 10 kV side of the transformer. the waveform, the corresponding harmonic content of the current and the thd of the primary (R) phase are determined. additional measures are proposed to reduce the thd. another parallel filter is realized to eliminate the second-largest harmonic component of the primary (R) phase current, and the thd of the primary (R) phase is again compared with the previous cases. the total simulation time is tstop = 0.1 s. all of the above simulations are realized in the matlab / psb program package and the simulation models are displayed.

key words: electricity network, higher harmonics, load, transformer coupling

1. introduction

the term quality of electricity came into extensive use in the mid-1980s, and it deserves the attention of both suppliers and consumers. depending on the point of view, there are different definitions of the quality of electricity. the problem of the quality of electricity is related to the end customer [1], i.e. the consumers of electrical energy.

received february 21, 2019; received in revised form july 5, 2019
corresponding author: enver agić, court expert electrical engineers, ui. a rbih 19/iv/15, tuzla, bosnia and herzegovina (e-mail: agabiem@bih.net.ba)
* an earlier version of this paper was presented at the 4th virtual international conference on science, technology and management in energy, energetics 2018, october 25-26, niš, serbia [1]

the concept of quality is becoming increasingly prominent as electric consumers become very dependent on the quality of power, since they are increasingly based on electronic or microprocessor components that are very sensitive to power supply disruptions.
also, quality has gained additional relevance with the liberalization of the electricity market: when electric energy becomes a commodity like any other, it must satisfy a certain quality that is defined by the consumers. consumers, as well as electricity producers, must meet the appropriate standards regarding the quality of electricity. normal switching operations with capacitor banks for power factor correction, switching lightly loaded transformers or overhead lines on or off, atmospheric discharges, etc. lead to transients that have a significant impact on the quality of electricity. the quality of electricity is also degraded by an increasing number of non-linear consumers [2] that generate current harmonics, resulting in voltage deformations. these non-linear consumers are themselves increasingly vulnerable to voltage deformations. in today's conditions of complex production processes accompanied by a large number of electronic and automatic regulation and control elements, any error in the functioning of a particular system component necessarily leads to very significant economic consequences. in the late fifties and during the sixties, fast semiconductor components (thyristors and power bipolar transistors) developed rapidly. semiconductor power electronic converters appeared and completely displaced those with vacuum elements. most power electronic converters are those that connect to the ac mains (rectifiers, line-commutated inverters, ac voltage regulators, cycloconverters). due to their switching nature, they represent non-linear consumers for the network and cause distortion of the waveforms of the current and voltage [3]. mathematical analysis of distorted waveforms using fourier series shows that these distorted forms can be represented by a series of sine functions of different frequencies.
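the fourier decomposition described above can be illustrated with a small numerical sketch. the example below is not the matlab / psb model used in this paper: it builds a hypothetical distorted current from a fundamental plus assumed 3rd and 5th harmonics (amplitudes 0.3 and 0.2 of the fundamental are illustrative values only), recovers the harmonic amplitudes up to the 15th with a direct discrete fourier sum, and computes the total harmonic distortion (thd) as the ratio of the rms of the higher harmonics to the fundamental.

```python
import math


def harmonic_amplitudes(samples, max_order):
    """Amplitudes of harmonics 1..max_order from one period of uniformly sampled data."""
    n = len(samples)
    amplitudes = []
    for k in range(1, max_order + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(samples))
        im = sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(samples))
        amplitudes.append(2.0 * math.hypot(re, im) / n)
    return amplitudes


def thd(amplitudes):
    """Total harmonic distortion relative to the fundamental (first entry)."""
    fundamental = amplitudes[0]
    return math.sqrt(sum(a ** 2 for a in amplitudes[1:])) / fundamental


N = 256  # samples per fundamental period
# Hypothetical distorted current: fundamental plus 3rd and 5th harmonics.
current = [
    math.sin(2 * math.pi * i / N)
    + 0.3 * math.sin(3 * 2 * math.pi * i / N)
    + 0.2 * math.sin(5 * 2 * math.pi * i / N)
    for i in range(N)
]

amps = harmonic_amplitudes(current, 15)
thd_value = thd(amps)
print(f"I1 = {amps[0]:.3f}, I3 = {amps[2]:.3f}, I5 = {amps[4]:.3f}, THD = {100 * thd_value:.2f} %")
```

the direct sum is used instead of an fft library only to keep the sketch dependency-free; for long records an fft would be the usual choice.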
these frequencies are an integer of the basic (dominant) frequency of the analyzed signal and are called higher harmonics [4]. the interest in analyzing the quality of electricity lately has been steadily increasing because:  electrical and electronic equipment are becoming more susceptible to voltage disturbances;  electrical and electronic equipment are increasingly generating voltage disruptions;  the quality of electricity is of particular importance in the conditions of the deregulated market; and  by developing modern measuring devices, today the quality of electricity can easily be measured and memorized. the two main categories of problems in the analysis are: a) disorders: transits decay and increase voltage power supply interruptions b) stationary variations: voltage regulation harmonic distortion voltage flickers influence on the quality of electric energy of higher harmonics as consumer consumption 371 2. model of solving problems of higher harmonics the problem of the quality of electricity is dealt with by experts and scientists from many countries. by raising and expanding the application of electricity and the technology of the operation of the electric energy system, there is a need for standardization of certain solutions for the improvement of quality. the main goal in this period was to provide a reliable supply for consumers, which were linear in nature. therefore, for many years, the definition of quality was very simple quality is equal to reliability. however, with the introduction of electronics into banks, business institutions, industrial plants and households, or the emergence of microelectronic circuits and the information revolution in the second half of the twentieth century, the concept of quality has gained a wider meaning. a huge number of sensitive consumers are connected to the network, which require high power quality and voltage. 
on the other hand, there is a category of non-linear consumers (energy converters), which intensely deforms the waveform of the consumed current. the need for standardization, ie limiting the level of interference, becomes essential for consuming electrical energy. the initial standards recommendations related to voltage quality were adopted at the end of the 1960s (the 1967 recommendation and the standard for the limiting of harmonics in britain and the ussr), and during the seventies and early eighties, by twenty countries [6]. the problem of standardization of the quality of electric energy, ie limiting the influence of energy converters on the environment, and especially the phenomenon of "pollution" of the network with higher harmonics, is dealt by several international organizations. the most important are the international electrotechnical commission iec, as well as the european committee for electrotechnical standardization cenelec. in addition to these international organizations, which are in charge of issuing standards, a number of international organizations are considering this problem as their professional interest, the most influential being the institute of electrical and electronics engineers – ieee, and the international conference on large electric networks cigre in figure 1, the example where the three-phase part of electrical energy network is presented, in which non-linear load of a group of pcs is powered through a transformer connection. program package matlab/simulink/powe system blockset (psb) is used for modeling, simulation and analization of the dinamic energy system. fig. 1 the analyzed part of the network 372 e. agić, d.šljivac, b.agić in figure 2, the equivalent nonlinear load model for each phase is presented. fig. 
2 an equivalent non-linear load model (pc)

the model parameters are:

network:
- voltage level: un = 10 kv (line voltage)
- phase angle of the first phase: 0
- network parameters: rs = 2.5 ohm; ls = 0.05 h

transformer (linear model):
- nominal power: sn = 50 kva
- transformer ratio: 10 / 0.4 kv
- resistance and reactance of the primary and secondary windings: rp = rs = 0.0025 p.u.; xp = xs = 0.06 p.u.
- magnetization branch parameters: rm = 500 p.u.; xm = 500 p.u.

nonlinear load parameters (pc):
- r = 0.5 ohm; l = 0.004 h
- ropt = 100 ohm; ball = 150 uf

internal diode parameters:
- resistance ron (ohm): 0.01
- inductance lon (h): 1*10^-6
- forward voltage vf (v): 0.8
- initial current ic (a): 0
- snubber resistance rs (ohm): 10
- snubber capacitance cs (f): 0.01*10^-6

the work:

1. gives a brief theoretical explanation of the causes of higher harmonic components in the electric power network, their consequences for consumers and the ways of eliminating them. the role of the transformer in the dyg connection is explained and a concrete example is given.

2. for the part of the electrical network from figure 1, the waveform of the primary (r) phase current at the 10 kv voltage level and the current of the neutral conductor at the 0.4 kv voltage level will be determined. the harmonic content will be determined up to the 15th harmonic, as well as the effective (rms) value of all these currents (phases r and r). also, the thd of the primary (r) and secondary (r) phase currents will be calculated.

3. the three-phase filter will be set to eliminate the maximum harmonic component of the primary (r) phase current on the 10 kv side of the transformer. the waveform, the corresponding harmonic content of the primary (r) phase current and the thd will be determined.

4. additional measures will be proposed to reduce the thd of primary phase current.
another parallel filter will be realized for elimination of the second one by size harmonic component of the primary(r) phase current. the comparison of thd will be made in cases under 3 and 4 . 5. for the total duration of the simulation, the used time is tstop = 0.1 sec. 6. the mentioned simulations will be realized with the matlab / psb type program package. the corresponding simulation models are displayed. 3. the causes of the formation of higher harmonics more harmonics in the network are caused and generated by non-linear consumers in the electric power network, which injects more harmonic components of the current into the system. this component through the impedance of the system results in distortion of the supply voltage, that affects the reduced reliability and shortening of the life of electrical equipment [6]. praxis is interesting to the higher ranks of the order from 0 to 100. higher harmonics causes are:  transformers, due to non-linear φ and characteristics of the iron core,  semiconductor electronic converters that due to their switching nature represent non-linear consumers for the network and cause distortion of the waveform of the current and voltage,  electric furnaces,  discharge lamps,  saturated electrical machines. the power system components and consumers are designed for sinusoidal forms of voltage and current, and any appearance of higher harmonics brings negative effects. 
some of the side effects are:
- the occurrence of series and parallel resonance in the network, resulting in increased voltage and current,
- impact on capacitor banks, which causes an increase in losses,
- influence on the protection elements, which leads to unwanted operation of the protective devices or fuse overheating,
- impact on the accuracy of standard measuring instruments,
- additional losses in electrical machines (e.g. overheating, heating of cables, and the like),
- interference with telecommunication (tt) signals (higher harmonics from power lines are transmitted by electromagnetic interference to tt cables, thus creating noise and disturbances in telecommunications),
- influence on transformers.

3.1. reduction of negative impacts

in order to minimize the negative effects of higher harmonics, the following measures are necessary:

reduce the intensity of harmonic currents
- by installing chokes in series with non-linear consumers,
- by using transformers in the yd or dd connection, since the delta (d) winding traps the odd harmonics divisible by three,
- by dimensioning neutral conductors 2:1 in relation to the phase conductors, or installing two neutral conductors of the original size,
- by installing 12-pulse converters.

installation of filters
- passive filters, which are divided into ordinary filters of the first, second and third order,
- active filters, which have the ability to generate and control non-linear currents.

resonant frequency change

system resonance can occur when there are capacitor banks for compensation of reactive energy in the system or at the consumers. since the resonant frequency of the system, including the capacitor banks, is often close to the frequency of the characteristic harmonics of non-linear consumers, there are undesirable effects.
the change in the resonance frequency of the system is realized:
- by changing the size of the capacitor,
- by adding a series choke,
- by moving the capacitor to another location,
- or by disconnecting the capacitor.

3.2. the role of the transformer in the dyg connection

when we look at the magnetization component i(t) in figure 1, we see that it is not a sinusoid but a distorted (peaked) periodic function whose zero crossings and maxima occur at the same instants as those of the induction b(t). fourier analysis of such a function gives the amplitudes of the higher harmonics:
- first harmonic: 100%
- third harmonic: 24.5%
- fifth harmonic: 3.43%

fig. 3 hysteresis transformer curve

figure 3 shows the first, third and fifth harmonics. the third harmonic (which is a negative sinusoid) has an important influence. in the first third of the half-period, the negative values of i3 decrease the positive values of i1. in the second third of the half-period, the positive values of i3 are added to the positive values of i1, and in this way the peaked form of i(t) occurs. with a neutral conductor, each single-phase transformer draws all higher harmonics in addition to the fundamental harmonic i1. the magnetization current of the first harmonic is:

$$i_{1a} = I_{1am} \sin(\omega t)$$
$$i_{1b} = I_{1bm} \sin(\omega t - 120^\circ)$$
$$i_{1c} = I_{1cm} \sin(\omega t - 240^\circ)$$

where the index m marks the maximum value of the current. without a neutral conductor, the magnetization current is sinusoidal, because the third harmonics of the current have no path through which to close. the flux will then contain the first and third harmonics, so it will be non-sinusoidal. because the air between the upper and lower yoke has a high magnetic resistance, the amplitude of the third harmonic of the flux will be considerably lower.
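The behaviour of the third harmonic described above follows from the fact that, in a balanced three-phase set, the phase shift of the n-th harmonic is n times the shift of the fundamental. A minimal numerical sketch of this, assuming perfectly balanced phases, a 50 Hz network and the relative amplitudes quoted above (100 %, 24.5 %, 3.43 %); the function names and the positive sign chosen for every harmonic are illustrative assumptions, not part of the paper's model.

```python
import numpy as np

F1 = 50.0  # fundamental frequency in Hz (assumed 50 Hz network)
AMPLITUDES = {1: 1.0, 3: 0.245, 5: 0.0343}  # relative amplitudes quoted in the text


def phase_component(t, order, shift_deg):
    """n-th harmonic of one phase; the phase shift scales with the harmonic order."""
    omega_t = 2.0 * np.pi * F1 * t
    return AMPLITUDES[order] * np.sin(order * (omega_t - np.deg2rad(shift_deg)))


def three_phase_sum(t, order):
    """Sum of the n-th harmonic over phases a, b, c (shifts 0, 120, 240 degrees)."""
    return sum(phase_component(t, order, s) for s in (0.0, 120.0, 240.0))


t = np.linspace(0.0, 0.02, 2001)  # one fundamental period
peak_sum = {n: float(np.max(np.abs(three_phase_sum(t, n)))) for n in AMPLITUDES}

for n, value in peak_sum.items():
    print(f"harmonic {n}: peak of the three-phase sum = {value:.4f}")
```

In this sketch the first and fifth harmonics cancel across the three phases, while the third harmonics are in phase and add up. This is why they need a neutral conductor or a delta winding in which to circulate.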
in the case when the transformer is connected in the dyg connection, the third harmonic component and all its multiples are closed in the delta winding, but as a result we have additional heating of the winding.

fig. 4 first, third and fifth harmonics

4. use of matlab / psb in the solution of problems

a simulation was performed for the part of the electrical network from figure 1 with the help of the matlab/psb computer program [7]. the observation time was tstop = 0.1 s and the harmonic currents were evaluated up to the fifth period [8]. in figure 6, the waveform of the primary (r) phase current on the 10 kv side is shown, and it is clearly seen that, due to the influence of higher harmonics, the current does not have a sinusoidal shape. rectifiers are the most commonly used power electronic converters and one of the main sources of higher harmonics [9]. the switching operation continuously changes which diodes of the rectifier conduct, so the rectifier current waveform is composed of segments and is far from sinusoidal. the flow of the non-sinusoidal current causes a voltage drop across the impedance of the network, which leads to the distortion of the fundamental sinusoidal voltage, while on the consumer side the voltage waveform consists of parts of the sinusoid and, in addition to the dc component, also contains alternating components, the higher (even) harmonics [10]. most personal computers use single-phase rectifiers in the power supply section, whose role is to generate a stable dc voltage with a simple and reliable design. for these reasons, the most commonly used ones are diode rectifiers with a filter capacitor on the dc side or, more recently, dc switching power supplies, which also have a capacitor at the output. in case of need for another voltage level or a more stable voltage, a linear or switching power supply is connected behind the capacitor.
however, in both cases, this rectifier distorts the network current, and partly the voltage. due to the charging of capacitor in periods when the network voltage is higher than direct-way, there is a distortion of the current, that is, the voltage at the condenser. the paper analyzes the distortion of the wave shape of the current, as a quality factor, when supplying consumers that generate more harmonic components of the current. influence on the quality of electric energy of higher harmonics as consumer consumption 377 it is possible to filter the source harmonics by using filters and analyze all relevant design parameters [11]. in our case, we have a low voltage installation that is connected via the transformer 10 /0.4 kv to the conventional electricity distribution network 3x10 kv. the load we have is a non-linear set of pcs and is balanced at each stage. the non-linear load for each phase generates more harmonics, whose composition is approximately defined as in = i1/n where i1 is the current of the basic and in-n harmonic. the wave forms of the primary and secondary phase are given in the figures as follows: fig. 5 voltage waveform of primary(r) phase on the 10 kv side fig. 6 the wave form of the irt phase on the 10 kv side 378 e. agić, d.šljivac, b.agić fig. 7 wave phase current irz at 0.4 kv side fig. 8 wave waveform of phase urz at 0.4 kv voltage level fig. 9 the current shape of the in-neutral conductor at 0.4 kv voltage influence on the quality of electric energy of higher harmonics as consumer consumption 379 fig. 10 harmonic current content irt on 10 kv side without filter / not installed fig. 11 harmonic current content irz at 0.4 kv without filter /not installed in the power spectrum irt the odd harmonics are dominant, there are no even harmonics. measurements of the current of the computer have shown that the shape of the current is very distorted due to non-linear consumer, which generates more odd harmonics (electronic equipment) [12]. 
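The idealized harmonic composition in = i1/n stated above can be turned into a total harmonic distortion figure directly. The sketch below assumes only odd harmonics from the 3rd to the 15th (the range the paper says it evaluates) and uses the common definition of current THD as the RMS of all harmonics above the fundamental divided by the fundamental. It illustrates that idealized rule only and is not the paper's MATLAB/PSB network model, so its result should not be expected to match the simulated values.

```python
import math


def thd_percent(harmonic_currents, fundamental_current):
    """Current THD in percent: RMS of the harmonics above the fundamental over the fundamental."""
    distortion = math.sqrt(sum(i ** 2 for i in harmonic_currents))
    return 100.0 * distortion / fundamental_current


I1 = 1.0  # fundamental current, normalized (assumption)
odd_orders = range(3, 16, 2)  # odd harmonics 3, 5, ..., 15
harmonics = [I1 / n for n in odd_orders]  # idealized rule in = i1/n

thd = thd_percent(harmonics, I1)
print(f"THD for in = i1/n, odd orders 3..15: {thd:.1f} %")
```

Normalizing the fundamental to 1 keeps the example independent of the actual load current; only the ratios in/i1 enter the result.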
thdi is calculated according to the formula below:

$$\mathrm{THD}_I\,[\%] = \sqrt{\frac{\sum_{n \ge 2} I_n^2}{I_1^2}} \cdot 100\,\%$$

for the primary (r) phase current (on the 10 kv side), thdi = 268%. for the secondary (r) phase current (on the 0.4 kv side), thdi = 279%.

5. results and solutions to the problem

higher harmonics are constantly present in the network in a higher or lower percentage, but at some point they can become a problem. this happens if the source of harmonics is too large, if the path of the harmonic currents is too long, that is, if the circuit is large, or if the response of the system is such that it leads to amplification of harmonics (resonance). in order to reduce or eliminate the harmonic problem, there are several basic solutions:
1. reducing the intensity of harmonic currents,
2. setting the filters,
3. changing the resonant frequency of the system.

the methods of reducing the intensity of harmonic currents usually involve changing the mode of operation of the drives that generate harmonics. such an approach is difficult to implement in practice, as it can affect the entire production process, and it is only possible at the design stage. however, something can be done by choosing a suitable transformer connection. a delta-connected winding blocks the further flow of all harmonics that are multiples of 3. with the introduction of a phase shift of 30 degrees, by connecting the transformer secondaries in star and in delta, the effect of a 12-pulse rectifier is obtained, that is, the 5th and 7th harmonics [14] are eliminated. connecting a non-linear consumer to high-power outputs also reduces the effects of harmonics. the goal of setting the filters is to provide a low impedance for the harmonics of the current and thus prevent their spread into the network. therefore, the filters are most often placed in parallel to the capacitor bank and consist of a capacitor with an additional inductor (choke).
the resonant frequency of the filter is calculated always to be slightly below the frequency of the lowest dominant harmonic. this ensures that the filter works properly in case of oscillations of the capacitor parameters due to temperature,etc., and also to avoid the anti-resonant frequency approaching the frequency of the harmonics. the use of serial filters is more rarely applied, and their goal is to represent a high impedance for current harmonics, thus blocking their expansion into the network. more details about the types and methods of filter design can be found in j. arrillaga, d. bradley, p. bodger: "power system harmonics", john wiley & sons, chichester, 1985. these types of filters are called passive, as opposed to the newer ones active. active filters are in fact energy electronic converters, which are programmed to compensate higher harmonics. with such a filter, the "pure" sinusoidal current of the grid is provided. the more complex configurations enable the complete discovery of all disorders that affect the quality of electricity [15]. changing the resonant frequency of the system is necessary when there are condensing batteries for compensation of reactive energy in the system or with consumers. their resonant frequency is often close to the frequency of characteristic harmonics, and there are undesirable negative occurrences. changing the size of the capacitor, adding a serial influence on the quality of electric energy of higher harmonics as consumer consumption 381 impedance, moving the capacitor to another bus (location), or simply completely disposing the capacitor (with a pay-off cost without compensation) are the measures that can be resolved. for example, in the literature j.toth iii, d.velazquez: "benefits of an automated on-line harmonic measurement system", ieee transaction on industry application ,. one such experience is described [16]. 
resolving the problem of resonance in the network, which led to the fugging of condenser battery fuses, inaccurate instrumentation, frequent engine failure and communication problems, was found in moving the resonant frequency beyond the range of characteristic harmonics (5 th and 7 th ) by reducing the number of capacitors in batteries. as a specific way of dealing with harmonics, it is an administrative, or economic approach, where special contracts and tariffs discourage the excessive generation of harmonics. for example, in france, a special contract called "emeraude" was developed, which is a set of technical regulations and obligations for electricity distribution, as well as for consumers, in order to ensure the proper quality. after a year of application, the results released by the edf show that at mid-voltage, 99% of consumers have complied with the contract, but that compensation is paid $ 150,000, mostly due to short interruptions. in 1995, a new text of the contract came out, which allowed a considerably smaller number of interruptions, and in 1998 its audit was carried out, which included problems related to voltage failures. more recent research goes not only to determine the state, and to provide a prediction, but also to answer the question of what the cost of degradation of quality (harmonic losses) is, ie what and how to collect additional charges from consumers for the operation of non-linear drives[17]. there is a particularly interesting question of tariffing higher harmonics and how this can be done. the results of the survey show that 46% of the distributions intend to additionally charge the generation of harmonics and flickers, 40% to charge harmonics over apparent power (kva), and other time of use, that is, the time of "pollution". in addition to this, the question of measuring higher harmonics and the way of expressing their influence is not cleared up. 
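The tuning rule described earlier in this section (a shunt filter whose resonant frequency lies slightly below the harmonic it targets) can be checked numerically. The sketch below uses the R, L and C values that the paper lists for its two filters in the conclusion and assumes each filter is a simple series R-L-C branch. That topology is an assumption made here for illustration, since the paper gives only the element values.

```python
import math


def resonant_frequency_hz(inductance_h, capacitance_f):
    """Resonant frequency of a series LC branch: f0 = 1 / (2*pi*sqrt(L*C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


def branch_impedance_ohm(resistance, inductance_h, capacitance_f, freq_hz):
    """Magnitude of the impedance of a series R-L-C branch at freq_hz."""
    omega = 2.0 * math.pi * freq_hz
    reactance = omega * inductance_h - 1.0 / (omega * capacitance_f)
    return abs(complex(resistance, reactance))


# Element values as listed in the paper; the series topology is an assumption.
FILTERS = {
    "first": (0.1, 0.156, 20e-6),
    "second": (0.1, 0.156, 8.94e-6),
}

for name, (r, l, c) in FILTERS.items():
    f0 = resonant_frequency_hz(l, c)
    z_res = branch_impedance_ohm(r, l, c, f0)
    z_50 = branch_impedance_ohm(r, l, c, 50.0)
    print(f"{name}: f0 = {f0:.1f} Hz, |Z(f0)| = {z_res:.3f} ohm, |Z(50 Hz)| = {z_50:.1f} ohm")
```

Under this assumption the two branches resonate at roughly 90 Hz and 135 Hz. At resonance the branch impedance falls to the series resistance, which gives the harmonic current the low-impedance path the text describes, while the impedance at 50 Hz stays high.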
this section shows the possibility of filtering the source higher harmonics using filters, as well as analyzing all relevant design parameters. in our case, we have a low-voltage installation that is connected via the 10/0.4 kv transformer to the classical 3x10 kv electricity distribution network. the load is a non-linear set of personal computers (pcs) that is balanced across the phases. the non-linear load of each phase generates higher harmonics, whose composition is approximately defined as in = i1/n, where i1 is the current of the fundamental and in is the current of the n-th harmonic. the dimensioned three-phase filter is set to eliminate the maximum harmonic component of the primary (r) phase current on the 10 kv side of the transformer [18]. also, in this part of the paper, the waveform and corresponding harmonic content of phase r are shown, and thdi is calculated for the phase r current (10 kv side). the simulation was performed with the help of the matlab / psb computer program. the observation time was tstop = 0.1 s.

fig. 12 primary (r) phase current waveform on the 10 kv side (built-in filter)

fig. 13 frequency spectrum of primary (r) phase current on the 10 kv side (with built-in filter)

the value of thdi for the primary (r) phase current is:

$$\mathrm{THD}_I\,[\%] = \sqrt{\frac{\sum_{n \ge 2} I_n^2}{I_1^2}} \cdot 100\,\%$$

thdi = 0.239 %

the figures show the waveform of the primary (r) phase current and the spectral composition of the same current. it turned out that this filter has done a good job of eliminating the third and all other odd harmonics, so that the total thdi of the primary (r) phase current is about 0.239 %, which is quite satisfactory, given the regulations in this field.

6. conclusion

the components of the energy system, as well as the consumers that connect to it, are designed for sinusoidal voltage and current waveforms. any appearance of higher harmonics brings negative effects, the most important are: 1.
the appearance of resonance in the network, 2. influence on condensing batteries, 3. influence on elements for protection, 4. impact on the accuracy of standard measuring instruments, 5. additional losses in electrical machines, 6. interference with telecommunication signals. business buildings with a large number of administrative and other service workers who can not imagine their work without computers, laser and/or matrix printers, scanners, photocopiers, small telephone exchanges, fluorescent, halogen or energy-saving lamps and other similar small sources of higher harmonics, represent a recognizable consumer group, which must be given special attention. within these buildings there are often concentrated large groups of small pcs (either as computers in individual departments, or as special computer centers). this group includes faculties and university campuses, which have their own computer centers, as well as a large number of computers in laboratories and cabinets. in order to further reduce the thdi of the current of primary (r) phase, another parallel three-phase filter is set up to eliminate the maximum harmonic component of the current on the 10 kv side of the transformer. also, in this part of the paper, the waveform and the corresponding harmonic content (of the primary phase current) are shown [19]. in this case, the thdi (10 kv side) is calculated. with the help of the matlab/psb computer program, a simulation was performed. the observation time was tstop = 0.1s.. with this filter we completely reduce the second and third harmonic components [20]. the second filter is installed in parallel with the first one that we installed earlier. the capacity of the second filter is 2.25 times smaller than the first because of the frequency of the second amount, 150 hz. the parameters of both filters are: first: r=0.1 ohm, l=0.156 h, c=20e-06 f second: r=0.1 ohm, l=0.156 h, c=8,94e-06 f fig. 
14 the waveform of the primary phase (r) current on the 10 kv side (two filters)

fig. 15 spectrum of the primary phase (r) current on the 10 kv side (2 filters installed)

the value of thdi for the primary phase (r) current is:

$$\mathrm{THD}_I\,[\%] = \sqrt{\frac{\sum_{n \ge 2} I_n^2}{I_1^2}} \cdot 100\,\%$$

thdi = 0.0023 %

comparison of thdi in either case: from the analysis we can see that thdi is the smallest when both filters are turned on for the elimination of the 2nd and 3rd current harmonics, but then the current of the primary (r) phase has the highest value, i.e. 36 a. the thdi of the primary (r) phase current at the 10 kv voltage level is higher when one filter is installed than when two filters are installed (in parallel).

in the following period, the authors will perform practical measurements of the above quantities and compare the matlab simulation with concrete values measured in the particular power grid. metrel mi 2192 (power quality analyzer), a portable multifunctional instrument for measuring and analyzing parameters of three-phase systems, will be used for measurement purposes. the instrument has been produced in accordance with the following european standards:
- safety (en 61010-1)
- electromagnetic compatibility (en 50081-1 and en 61000-6-1)
- measurements in accordance with the european standard en 50160

in this specific case the device enables:
- real-time monitoring, recording and parameter analysis in a three-phase system.
- large selection of measuring functions: voltage, current, power (w, var and va), power factor, energy, oscilloscope, harmonics analysis, statistical analysis, anomalies.
- in recording mode, the measured quantities are stored in memory for later analysis.
- special recording mode for waveform recording, with various trigger options.
influence on the quality of electric energy of higher harmonics as consumer consumption 385  special recording modes for monitoring the quality of the observed system: periodic, waveform, transient, fast-logging, en 50160  calculation of the minimum, mean and maximum values of the measured sizes, with various predefined report forms  oscilloscopic display of waveforms, in real time and when analyzing recorded data  analysis of harmonic distortion to 63 harmonics during operation and later on recorded data  monitoring and analysis of energy opportunities  rs232 connector for connecting to a computer  windows software for analyzing recorded data and instrument management references [1] e. agić, d. šljivac, b. agić, “the influence of nonlinear background on the quality of electricity”, in proceedings of the 4th virtual international conference on science, technology and management in energy, “energetics”, pp. 171-178, 2018. [2] a. tokić chapters 1-3: introduction, definitions, transition cefes-3: "quality of el. energy”, pp. 3. [3] a. tokić chapters 1-3: introduction, definitions, transition cefes-3: "quality of el. energy”, pp. 4. [4] r. dugan, m. mcgranaghan, w. beaty: “electric power system quality”, mcgraw hill, new york, 1996, pp. 34-36. [5] v.katic – chapters 4-6: standards, harmonics, fail voltage cefes "quality of electrical energy", pp. 66. [6] r. dugan, m. mcgranaghan, w. beaty: “electric power system quality”, mcgraw hill, new york, 1996. [7] v. katić chapters 4-6: standards, accordions, voltage disruptions cefesnovember 2005, pp. 8. [8] a. tokić ''electricity quality“ cefes november 2005, pp. 23. [9] matlab simulink power system blockset 7 software package for simulation and analysis of ee dynamic systems within the matlab tool. [10] e. agić power quality–cefes, 2006 [11] wg on modelling and analysis of system transients using digital programs. 
“modelling and analysis guidelines for slow transients – part iii: ''the study of ferroresonance”, ieee trans. on power delivery, vol. 15, no. 1, pp. 255-265, jan. 2000. [12] k. miličević, ferroresonance: “systems, analysis and modeling”, wiley encyclopedia of electrical and electronics engineering, pp. 1-8,2014. [13] a. vinkovic and r. mihalic, “a currentbased model of the static synchronous series compensator (sssc) for newton– raphson power flow,” electr. power syst. res., vol. 78, no. 10, pp. 1806–1813, okt 2008. [14] “simpowersystems”, user’s guide, mathworks, 2013. [15] v. katić – chapter 4-6-: standards, harmonics, voltage disruptions cefes "electricity quality“ pp. 44. [16] d. graovac, v. katić, a. rufer: “power quality compensation using universal power quality conditioning system”, ieee power engineering review, usa, vol.20, no.12, pp.58-60, dec.2000, [17] g. carpinelli at all, “probabilistic evaluation of the economical damage due to harmonic losses in industrial energy system”, ieee transaction on power delivery, vol.11, no.2, pp.1021-1030, apr.1996. [18] w. piasecki, m. florkowski, m. fulczyk, p. mahonen, w. nowak, “mitigating ferroresonance in voltage transformers in ungrounded mv networks”, ieee trans. on power delivery, vol. 22, no. 4, pp. 23622369, 2007. [19] wg c-5, “mathematical models for current, voltage, and coupling capacitor voltage transformers”, ieee trans. on power delivery, vol. 15, no. 1, pp. 62-72, jan. 2000. [20] n. janssens, th. van craenenbroeck, d. van dommelen f., van de meulebroeke,“direct calculation of the stability domains of three-phase ferroresonance in isolated neutral networks with groundedneutral voltage transformers”, ieee trans. on power delivery, vol. 11, no. 3, pp. 1546-1553, jul. 1996. 10771 facta universitatis series: electronics and energetics vol. 36, no 1, march 2023, pp. 
31-42 https://doi.org/10.2298/fuee2301031r © 2023 by university of niš, serbia | creative commons license: cc by-nc-nd original scientific paper machine learning assisted optimization and its application to hybrid dielectric resonator antenna design pinku ranjan1, harshit gupta1, swati yadav2, anand sharma3 1abv-indian institute of information technology and management (iiitm), gwalior, madhya pradesh, india 2department of electronics & communication engineering, college of engineering roorkee(coer), roorkee, uttrakhand, india 3department of electronics & communication engineering, motilal nehru national institute of technology allahabad, india abstract. machine learning assisted optimization (mlao) has become very important for improving the antenna design process because it consumes much less time than the traditional methods. these models' accountability can be checked by the accuracy metrics, which tell about the correctness of the predicted result. machine learning (ml) methods, such as gaussian process regression, artificial neural networks (anns), and support vector machine (svm), are used to simulate the antenna model to predict the reflection coefficient faster. this paper presents the optimization of hybrid dielectric resonator antenna (dra) using machine learning models. several regression models are applied to the dataset for optimization, and the best results are obtained using a random forest regression model with the accuracy of 97%. additionally, the effectiveness of machine learning based antenna design is demonstrated through comparison with conventional design methods. key words: dielectric resonator antenna, machine learning, gaussian process regression, anns, svm 1. introduction antenna design optimization is a topic that has received a lot of attention in previous few years. 
that is because conventional antenna design methodologies are exhaustive and offer no guarantee of producing effective results, given the complexity of the fabrication and implementation requirements of the latest antennas [1]-[3]. although design automation via optimization complements conventional approaches to antenna design, the optimization of antenna designs still has many problems [3]-[5]. the significant issues concern the efficiency and optimization capability of the available techniques when addressing a wide range of antenna design problems, considering the growing specifications of current antennas. the methods presented in this paper can have an effect on the future development of antennas for a wide range of applications.

received may 18, 2022; revised july 27, 2022; accepted august 31, 2022 corresponding author: pinku ranjan abv-indian institute of information technology and management (iiitm), gwalior, madhya pradesh, india e-mail: pinkuranjan@iiitm.ac.in

at frequencies in the microwave range, measuring current (i) and voltage (v) becomes very difficult [6]-[8]. at higher frequencies, current and voltage are not measured; it is preferred to measure power. at higher microwave frequencies it is also hard to realize short-circuit and open-circuit terminations for ac signals over a broad bandwidth. to overcome this problem, s-parameters are used in the microwave range. s-parameters are stated in terms of incident and reflected traveling waves and are easy to use in the analysis. they can be measured directly with network analyzers, and the acceptance of such parameters has been growing rapidly [8]. s-parameters form a complex matrix that shows the reflection/transmission characteristics (phase/amplitude) in the frequency domain. various quantities are associated with these parameters, such as frequency bandwidth, return loss and radiation pattern [9]-[11].
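The link between the reflection coefficient and quantities such as return loss can be made concrete with the standard one-port relations. The sketch below assumes a single complex S11 value at one frequency; the sample value is hypothetical and is not taken from the antenna studied in the paper.

```python
import math


def return_loss_db(s11):
    """Return loss in dB from a complex reflection coefficient: RL = -20*log10(|S11|)."""
    magnitude = abs(s11)
    if magnitude == 0.0:
        return math.inf  # perfectly matched port, nothing is reflected
    return -20.0 * math.log10(magnitude)


def vswr(s11):
    """Voltage standing wave ratio: (1 + |S11|) / (1 - |S11|)."""
    magnitude = abs(s11)
    if magnitude >= 1.0:
        return math.inf  # total reflection
    return (1.0 + magnitude) / (1.0 - magnitude)


def reflected_power_fraction(s11):
    """Fraction of the incident power that is reflected: |S11|^2."""
    return abs(s11) ** 2


s11_example = complex(0.08, -0.06)  # hypothetical value with |S11| = 0.1
print(f"return loss: {return_loss_db(s11_example):.1f} dB")
print(f"vswr: {vswr(s11_example):.3f}")
print(f"reflected power: {100.0 * reflected_power_fraction(s11_example):.1f} %")
```

An impedance bandwidth is then commonly read off as the frequency range over which the return loss stays above a chosen threshold, 10 dB being a typical choice.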
some researchers have addressed this issue and predicted antenna performance using various ml techniques in the open literature. sharma et al. suggested using lasso (least absolute shrinkage and selection operator), ann, and knn approaches to optimize a t-shaped monopole; to train the model at two separate frequency bands, 2.5 ghz and 5.5 ghz, 450 data points were collected [12]. gao et al. optimized the yagi-uda, shaped printed antenna, and dual-mode printed antenna using the gaussian process and support vector machines [13]. j. p. jacobs suggested using ml techniques based on gaussian process regression to optimize a u-shaped slot-loaded microstrip antenna that operates in both the 2.75 ghz and 3.75 ghz frequency bands [14].

this paper deals with hybrid antennas, which combine passive and active architectures, and with their optimization using different machine learning techniques. hybrid antennas are widely used, with important applications including radar display control systems for managing self-driving cars and automated equipment control systems that use radar signal inputs.

2. background

2.1. artificial neural network

anns were introduced to the em field and microwave engineering during the 1990s [15]. artificial neural networks have found applications in antennas, radar circuit design, remote sensing, measurement problems and various other fields. neural networks are intended to model the manner in which the human brain performs a specific task. a general definition describes a neural network as a massively parallel processor. in the late twentieth century, anns were first applied to mimo antennas. ann was used to transform the design parameters, including the dielectric constant and the antenna's dimensions. more recently, anns with many hidden layers, generally referred to as deep neural networks (dnns) or deep learning, have been introduced to solve antenna parameter and optimization problems.
the output in ann can be predicted as follows:

$$y = x_{\text{input}} \cdot \text{weights} + b\;(\text{bias}) \tag{1}$$

$$y_{\text{final}} = (x_1 w_1) + (x_2 w_2) + \dots + (x_n w_n) \tag{2}$$

2.2. support vector machine

the svm can handle classification as well as regression problems. in regression problems, the input space of the svm is mapped onto a high-dimensional space called the feature space, where regression with linear functions can be performed accurately [16]. in the antenna design field, the svm was introduced as an alternative to anns because of its better generalization capability. in practical applications, the training datasets generated by full-wave em simulations are of finite size, which causes overfitting in certain artificial neural network applications. in addition, svm needs fewer training patterns to give precise results, which speeds up the training procedure.

2.3. gaussian process regression

recently, gpr has received broad attention in the area of em design, including antenna design. unlike the other two ml techniques, gpr can report the uncertainty of the predicted results at new design points, which helps the designer explore global optima when only a few training points are available. gpr was introduced to model antenna responses, including the reflection coefficient, gain performance and crosstalk level, for three different antenna models [17].

2.4. antenna architecture

for this study a hybrid dielectric resonator antenna is used. in the hybrid structure, every antenna is designed to radiate in its own separate band. the hybrid resonator can offer ultra-wideband operation if the different bands are sufficiently close. ultra-wideband operation can also be obtained in hybrid resonators by applying the bandwidth-improvement techniques used for dras and other antennas. fig. 1 displays the structural layout of the dra antenna, and table 1 shows the dimensions of the proposed hybrid cp radiator. in this radiator, the ring-shaped ceramic material is excited through a dual linked circular ring-shaped aperture.

table 1 dimensions of proposed hybrid cp radiator

symbol     dimension (mm)     symbol     dimension (mm)
ls = ws    50.0               h          13.0
t2         4.0                lf         31.0
hs         1.6                wf         3.0
r1         13.5               d3         10.0
r2         2.0                d4         4.0
t1         2.0                l1         12.0

fig. 1 structural layout of dra antenna: a) dual linked circular ring aperture; b) panoramic view

fig. 2 shows the fabricated prototype of the proposed cp antenna. alumina (relative permittivity = 9.8; dielectric loss tangent = 0.002) is used to make the ring-shaped ceramic. the alumina ring is fixed over an fr-4 substrate (relative permittivity = 4.4; dielectric loss tangent = 0.02) with the assistance of a quick fix. the thickness of the proposed antenna is 1.6 mm. the dual linked circular ring-shaped aperture and the t-shaped microstrip lines have been etched on the substrate.

fig. 2 fabricated prototype of proposed cp antenna: (a) bottom view; (b) top view

2.5. s-parameter

s-parameters describe the relationship between the input and output of an electrical system in terms of its ports. for example, for a system with two ports named port 1 and port 2, the power transferred from port 1 to port 2 is called s21, and s12 is the power transferred from port 2 to port 1. for antennas, s11 describes the amount of power reflected from the antenna and is called its reflection coefficient. if s11 = 0 db, all of the power is reflected from the antenna and nothing is radiated. if s11 = -10 db, then when 3 db of power is delivered to the antenna, -7 db is the power that is reflected back. the remaining power is accepted by the antenna. this accepted power is either radiated or consumed as losses inside the antenna itself.
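the decibel arithmetic in the s11 example above can be checked with a short sketch. the function names below are hypothetical helpers introduced only for illustration; the conversion itself is the usual power-ratio relation, fraction = 10^(s11/10).

```python
def reflected_fraction(s11_db: float) -> float:
    """Fraction of the incident power that is reflected, for S11 given in dB."""
    return 10 ** (s11_db / 10)


def accepted_fraction(s11_db: float) -> float:
    """Fraction of the incident power accepted by the antenna."""
    return 1.0 - reflected_fraction(s11_db)


def reflected_power_db(incident_db: float, s11_db: float) -> float:
    """Reflected power level in dB: levels add on a logarithmic scale."""
    return incident_db + s11_db


if __name__ == "__main__":
    for s11 in (0.0, -10.0, -20.0):
        print(s11, reflected_fraction(s11), accepted_fraction(s11))
    # the example from the text: 3 dB incident power with S11 = -10 dB
    print(reflected_power_db(3.0, -10.0))
```

with s11 = -10 db roughly one tenth of the incident power is reflected, so about 90% is accepted by the antenna, which is why -10 db is a common matching threshold.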
since antennas are usually designed for low loss, ideally most of the power delivered to the antenna is radiated. fig. 3 shows the simulated and measured s11 parameter of the reference antenna.

fig. 3 variation of s-parameter with frequency

the reflection coefficient is a parameter that expresses how much of a wave is reflected by an impedance discontinuity in the transmission medium. it is equal to the ratio of the amplitude of the reflected wave to that of the incident wave, with each expressed as a phasor. for instance, it is used in optics to measure the proportion of light reflected from a surface with a different index of refraction, such as glass, and in an electrical transmission line to compute how much of the electromagnetic wave is reflected by an impedance discontinuity.

3. implementation and results

this section describes the steps of the methodology in more detail, along with the actual implementation.

3.1. data collection

the data was first collected with the hfss software by altering various parameters of the hybrid dra antenna. the data exported from this software includes antenna-related parameters such as height and frequency, as well as the corresponding s11 values. in the collected dataset the frequency varied from 2.00 ghz to 5.00 ghz, whereas the height varied discretely from 5 mm to 15 mm. the dataset contained the value of s11 for every pair of height and frequency.

3.2. data preprocessing and sampling

as can be seen from the sample dataset in fig. 6, the exported dataset contained some random, unrelated entries that needed to be removed. the exported dataset was also not in a proper format, so some rearranging of columns and rows was required.
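the cleaning step described above can be sketched as follows. this is only an illustration: the paper reports using a bash script, the raw export below is invented sample text (not the authors' hfss file), and the column names are assumptions. the sketch keeps only rows in which every field is numeric, which drops headers, notes and blank lines.

```python
import csv
import io

# hypothetical raw export (not the authors' file): the first column is the
# frequency in GHz, the other columns hold S11 in dB for three heights
RAW_EXPORT = """\
Freq [GHz],h=5mm,h=10mm,h=15mm
exported by solver
2.00,-3.1,-4.2,-5.0
,,,
3.50,-12.4,-18.9,-9.7
5.00,-6.3,n/a,-7.7
"""


def clean_rows(text, n_columns=4):
    """Keep only rows in which every field parses as a number."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        try:
            values = [float(field) for field in record]
        except ValueError:
            continue  # header, notes, blank or partly non-numeric rows
        if len(values) == n_columns:
            rows.append(values)
    return rows


def to_csv(rows, header):
    """Serialize the cleaned rows back to CSV text with a clean header."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return out.getvalue()


cleaned = clean_rows(RAW_EXPORT)
print(to_csv(cleaned, ["frequency_ghz", "s11_h5", "s11_h10", "s11_h15"]))
```

dropping a whole row when a single field is malformed is a deliberately strict choice; a real pipeline might instead keep the row and mark the missing value.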
in this step, we mostly performed such operations on the exported dataset, and the result was finally dumped into a csv file via a bash script. this csv file serves as the input to our machine learning models.

to build any machine learning model from a dataset, the first step is the sampling of data. a sampling procedure was applied to the csv dataset to separate it into subsets, each with its own purpose. it is normally expected that the more data is available to construct a model, the better the outcome. typically, the dataset is split into a training set, a cross-validation set and a test set. as the name indicates, the training set is used to train the learning algorithm. fig. 4 depicts the stages of sampling performed on the dataset. the cross-validation set is used for the validation and optimization of the model. cross-validation ensures that the model has extracted the proper patterns from the data and has not picked up too much noise; here the number of folds is 5. the model cannot be checked on the training set, because the results would be overly optimistic, as the model is built from that set. to see how well the learning algorithm performs on unknown data, we used the test set.

fig. 4 sampling over the dataset

3.3. building ml model

building any ml model starts with loading the csv dataset into the python code used for ml modeling. after that, different machine learning models are applied to the dataset as required. the google colab platform is used for training and running the various ml models on the dataset.

3.4. unpacking of data

the dataset of height, frequency and s11 obtained after preprocessing and filtering is supplied to our model implementation as a .csv file. to read the dataset, a small python program (fig. 5) loads the files into a data frame for object serialization and fast, easy access. fig. 6 shows how the data is stored in the csv file.

fig. 5 python code for reading csv

fig. 6 sample data as read from csv

3.5. preparing final data for input

as can be seen in fig. 6, the data read from the csv file contains s11 values for the corresponding pairs of frequency and height (from 5 to 15). a data frame is then prepared that holds all three values (frequency, height and s11) in one row, so that it can be used as the actual data for our models, i.e., with features and responses clearly defined. the preparation of the final data frame is shown in fig. 7.

fig. 7 preparing final data frame

3.6. data visualization

the final data frame created in the previous step was then visualized to gain insight into the data, as shown in fig. 8, and to decide which models can be leveraged for such a dataset. frequency and height are treated as independent variables, and s11 as the dependent variable. the relation among these three variables is shown in fig. 9.

fig. 8 data visualization

fig. 9 3d visualization of dataset

3.7. multiple regression

the first model applied to the dataset was multiple regression. it was chosen because the dependent variable (s11) has to be predicted from more than one independent variable. in linear regression, the relationship between the dependent variable y and the independent variables x1, ..., xp is given by:

$$y = f(x) + \epsilon \tag{3}$$

since there are multiple independent variables (height and frequency) and one dependent variable (s11) to predict, the multiple regression model is given by:

$$f(x) = \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p \tag{4}$$

where f(x) is the predicted dependent variable, β0, ..., βp are the coefficients of the model, and x1, ..., xp are the independent variables; the coefficients are chosen to give the best prediction of the dependent variable from the set of independent variables.

fig. 10 multiple regression predictions versus actual s11

this multiple regression model did not fit the dataset well, and we obtained an accuracy of only 23% on the test dataset, as can be seen in fig. 10.

3.8. polynomial regression

polynomial regression is a type of linear regression in which the relationship between the independent variable x and the dependent variable y is modeled as an nth-degree polynomial. it fits a nonlinear relationship between the value of x and the corresponding conditional mean of y, denoted e(y|x). polynomial regression is used for several reasons:

1. all curvilinear relationships include polynomial terms.
2. inspection of residuals. if a linear model is fitted to curved data, a scatter plot of the residuals (y-axis) against the predictor (x-axis) shows many positive residuals in the center; a linear model is therefore not suitable in such cases.
3. multiple linear regression analysis assumes that the independent variables are independent of one another. in a polynomial regression model, this assumption is not satisfied.

$$y = a + b_1 x + b_2 x^2 + \dots + b_n x^n + \epsilon \tag{5}$$

here y is the dependent variable, x is the independent variable, a is the y-intercept and ϵ is the error term. the response value y is computed using the least-squares technique. another important point is that polynomial regression is very sensitive to outliers: the presence of even a few outliers in the data of a nonlinear analysis can change the results drastically.

fig. 11 polynomial regression predictions vs actual s11

starting with a 2nd-degree polynomial, polynomial regression did not achieve the desired result. accuracy was observed to increase as the degree was increased; with this model the accuracy reached up to 62%. the actual values and the values predicted by polynomial regression are displayed in fig. 11.

3.9. random forest regression

random forest is a supervised learning algorithm. the "forest" it builds is an ensemble of decision trees [18]. it is an ensemble method based on bagging. the classifier works as follows:

1. given a dataset d, the classifier first creates k bootstrap samples of d, each denoted di.
2. each di has the same number of tuples as d and is sampled with replacement from d.
3. because sampling is done with replacement, some of the original tuples of d may not be included in di, while others may occur more than once.

the classifier then builds a decision tree based on each di. as a result, a "forest" consisting of k decision trees is formed. to classify an unknown tuple x, each tree returns its class prediction, which counts as one vote. the final class of x is the one that receives the most votes. the working of random forest regression is portrayed in fig. 12.

fig. 12 random forest regression

this project uses the gini index for tree induction. for d, the gini index is computed as:

$$\text{gini}(d) = 1 - \sum_{i=1}^{m} p_i^2 \tag{6}$$

where pi is the probability that a tuple in d belongs to class ci. the gini index measures the impurity of d; the lower the index value, the better d was partitioned.

fig. 13 random forest regression predictions vs actual s11

random forest regression was tried with different numbers of estimators, and after multiple training runs the results were very good for the given dataset, with an accuracy of 97%. this model also gave the best fit of all the models tried on the dataset, as can be seen in fig. 13.

4. conclusion

this work is implemented in python. on the analyzed dataset, random forest regression gave the highest accuracy, 97%, and polynomial regression gave 62%. the multiple regression model was not a good fit for the dataset, giving an accuracy of only 23% on the test dataset, the worst result among all the models used in this paper. machine learning is seen to be a good option for optimizing antenna parameters and for predicting the variation of s11 with different values of height and frequency. it saves much of the time and material that is wasted in traditional design. with these ml models, predictions close to the real values (obtained from hfss) were made. the ml models are further supported by experimental findings. the proposed antenna operates from 2 to 5 ghz. the optimized design validates its suitability for application in hybrid dra by exhibiting stable radiation characteristics within the operating frequency range.

references

[1] q. wu, y. cao, h. wang and w. hong, "machine-learning-assisted optimization and its application to antenna designs: opportunities and challenges", china commun., vol. 17, pp. 152-164, 2020.
[2] g. k. uyanik and n. guler, "a study on multiple linear regression analysis", in proceedings of the 4th international conference on new horizons in education, 2013, pp. 1-6.
[3] k. c.
lee, "application of neural network and its extension of derivative to scattering from a nonlinearly loaded antenna", ieee trans. antennas propag., vol. 55, pp. 990-993, 2007.
[4] k. c. lee and t. n. lin, "application of neural network to analyses of nonlinearly loaded antenna arrays including mutual coupling effects", ieee trans. antennas propag., vol. 53, pp. 1126-1132, 2005.
[5] y. rahmat-samii, j. m. kovitz and h. rajagopalan, "nature-inspired optimization techniques in communication antenna designs", proc. ieee, vol. 100, pp. 2132-2144, 2012.
[6] w.-q. wang, h. shao and j. cai, "mimo antenna array design with polynomial factorization", int. j. antennas propag., vol. 2013, p. 358413, 2013.
[7] a. sharma, g. das, s. gupta and r. k. gangwar, "quad-band quad-sense circularly polarized dielectric resonator antenna for gps/cnss/wlan/wimax applications", ieee antennas propag. mag., vol. 60, pp. 57-65, 2018.
[8] a. gupta and r. k. gangwar, "hybrid rectangular dielectric resonator antenna for multiband applications", iete tech. rev., vol. 37, pp. 83-90, 2020.
[9] a. sharma, p. ranjan and sikandar, "dual band ring shaped dielectric resonator based radiator with left and right handed sense circularly polarized features", iete tech. rev., vol. 38, pp. 511-519, 2020.
[10] a. k. dwivedi, a. sharma and p. ranjan, "dual-band modified rectangular shaped dielectric resonator antenna with diversified polarization feature", int. j. circuit theory appl., vol. 49, pp. 3434-3442, 2021.
[11] t. suryakanthi, "evaluating the impact of gini index and information gain on classification using decision tree classifier algorithm", int. j. adv. comput. sci. appl., vol. 11, pp. 612-619, 2020.
[12] y. sharma, h. h. zhang and h. xin, "machine learning techniques for optimizing design of double t-shaped monopole antenna", ieee trans. antennas propag., vol. 68, pp. 5658-5663, 2020.
[13] j. gao, y. tian and x. chen, "antenna optimization based on co-training algorithm of gaussian process and support vector machine", ieee access, vol. 8, pp. 211380-211390, 2020.
[14] j. p. jacobs, "efficient resonant frequency modeling for dual-band microstrip antennas by gaussian process regression", ieee antennas wirel. propag. lett., vol. 14, pp. 337-341, 2014.
[15] p. burrascano, s. fiori and m. mongiardo, "a review of artificial neural networks applications in microwave computer-aided design", int. j. rf microw. c. e., invited article, vol. 9, pp. 158-174, 1999.
[16] g. min and y. feng, "calculation of the characteristic impedance of tem horn antenna using support vector machine", in proceedings of the international conference on microwave and millimeter wave technology, 2010, pp. 895-897.
[17] j. gao, y. tian, x. zheng, x. chen and m. mrugalski, "resonant frequency modeling of microwave antennas using gaussian process based on semisupervised learning", complexity, vol. 2020, p. 3485469, 2020.
[18] p. ranjan, a. maurya, h. gupta, s. yadav and a. sharma, "ultra-wideband cpw fed band-notched monopole antenna optimization using machine learning", prog. electromagn. res. m, vol. 108, pp. 27-39, 2022.

facta universitatis series: electronics and energetics vol. 27, no 4, december 2014, pp. 521-542

lozenge tiling constrained codes

bane vasić1 and anantha raman krishnan2
1department of electrical and computer engineering, university of arizona, tucson, az, 85721, usa
2western digital corporation, irvine, ca 92612, usa

abstract: while the field of one-dimensional constrained codes is mature, with theoretical as well as practical aspects of code- and decoder-design being well-established, such a theoretical treatment of its two-dimensional (2d) counterpart is still unavailable.
research has been conducted on a few exemplar 2d constraints, e.g., the hard triangle model, run-length limited constraints on the square lattice, and 2d checkerboard constraints. excluding these results, 2d constrained systems remain largely uncharacterized mathematically, with only loose bounds on capacities available. in this paper we present a lozenge constraint on a regular triangular lattice and derive shannon noiseless capacity bounds. to estimate the capacity of lozenge tiling we make use of the bijection between the counting of lozenge tilings and the counting of boxed plane partitions.

keywords: 2d constrained codes, 2d constraints, lozenge tiling, colored tiling.

manuscript received august 9, 2014
corresponding author: bane vasić, department of electrical and computer engineering, university of arizona, tucson, az, 85721, usa (e-mail: vasic@ece.arizona.edu)
doi: 10.2298/fuee1404521v

1 introduction

real-world communications channels, in contrast to channels of theoretical interest, often suffer from physical and systemic limitations that constrain the nature of data that may be transmitted through them. one often-quoted example is the magnetic recording system, where information is translated to a direction of magnetization of the recording medium. magnetic recording systems place constraints on the duration between two consecutive transitions of magnetization direction [1]. transitions too close to each other degrade the signal-to-noise ratio (snr) during readback.
on the other hand, transitions too far apart make the timing-recovery sub-systems, which typically rely on these transitions to derive symbol-level timing information, vulnerable to failure. a solution to satisfy channel constraints is to use constrained codes (or modulation codes), i.e., codes that transform user information to symbol-sequences that satisfy the constraints.

one dimensional (1d) constraints have been well-studied. the mathematical framework for determining the shannon noiseless capacity of 1d constraints (the growth rate of the number of sequences satisfying a given constraint) has been comprehensively laid out. further, the development of 1d constrained codes and decoders is mature (refer to [1] for a review on this topic).

in contrast, analysis and code development for two-dimensional (2d) constraint systems have been less successful. initial work on 2d constraints was done by ashley and marcus [2], who considered 2d low-pass filtering codes for eliminating bit-patterns with large high-frequency components. subsequently, considerable research was conducted on numerous classes of 2d constraints, e.g., [3–9]. in spite of these efforts, these constrained systems remain largely uncharacterized mathematically, with only loose bounds on capacities existing. with the exception of a few cases, the channel capacity of 2d constrained channels is unknown (see [10]), and there is no systematic procedure to design encoders and decoders.

nonetheless, the need for two-dimensional constrained codes has come to the fore in light of the recent work in two-dimensional magnetic recording (tdmr). in [11], it was shown that constrained coding which restricts the occurrence of certain 2d patterns greatly reduces system complexity and improves the detection performance. also, there exist media models for tdmr that rely on polyomino tiling [12, 13], which are closely related to the 2d constrained systems.
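to make the 1d framework recalled above concrete, the following sketch computes the shannon noiseless capacity of a 1d runlength-limited (d, k) constraint as log2 of the largest eigenvalue of the adjacency matrix of its constraint graph, which is the standard result for 1d constrained systems. the function names and the power-iteration routine are choices made for this illustration and are not taken from the paper.

```python
import math


def rll_adjacency(d, k=None):
    """Adjacency matrix of the (d, k) runlength graph.

    State s = number of 0s written since the last 1. With k=None (k = infinity)
    the state is capped at d, since any longer run of 0s is equivalent.
    """
    n = (d if k is None else k) + 1
    a = [[0] * n for _ in range(n)]
    for s in range(n):
        if s + 1 < n:
            a[s][s + 1] += 1   # write a 0: run of zeros grows
        elif k is None:
            a[s][s] += 1       # unlimited zeros: stay in the last state
        if s >= d:
            a[s][0] += 1       # write a 1: allowed after at least d zeros
    return a


def spectral_radius(a, iters=2000):
    """Largest eigenvalue of a nonnegative irreducible matrix by power iteration.

    Iterating with (A + I) instead of A avoids oscillation on periodic graphs;
    the shift of 1 is removed at the end.
    """
    n = len(a)
    v = [1.0] * n
    lam = 1.0
    for _ in range(iters):
        w = [v[i] + sum(a[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = max(w)
        v = [x / lam for x in w]
    return lam - 1.0


def rll_capacity(d, k=None):
    """Capacity in bits per symbol of the 1d (d, k) runlength constraint."""
    return math.log2(spectral_radius(rll_adjacency(d, k)))


if __name__ == "__main__":
    print(round(rll_capacity(1), 4))      # (1, infinity) constraint
    print(round(rll_capacity(2, 7), 4))   # (2, 7) constraint
```

no comparably simple procedure is available in two dimensions, which is the gap the paper addresses.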
in this paper, we consider a class of low-pass constraints on the triangular lattice termed segregation constraints, and its special case, the no isolated bit (nib) constraint. we provide an upper bound on the shannon noiseless capacity for this constraint. the key novelty of our approach is to view our 2d constraint as a colored tiling of a plane. thus, the estimation of the number of permissible colored tilings leads to upper bounds on the shannon noiseless capacity of 2d channels represented by this constraint. in addition, the tools used for analysis also lead to a framework for encoders and decoders for the constraint.

the rest of the paper is organized as follows. section 2 gives the necessary background and motivation of the problem. in section 3, we define lozenge codes and establish their connection with boxed plane partitions, and in section 4 we derive bounds on capacity. section 5 introduces encoding and decoding schemes for lozenge codes, and section 6 concludes the paper.

2 preliminaries

a two-dimensional constrained encoder may be visualized as a mapping from an unconstrained binary sequence into a colored tiling. a tiling of the plane is a collection of plane figures that fills the plane with no overlaps and no gaps. the plane figures used as building blocks for tilings are called tiles. two tiles are said to be neighbors if they share an edge (side), and to be touching if they share only a single vertex. a regular tiling uses congruent regular polygons as tiles. there are only three regular tilings: triangular, square, and hexagonal. figure 1 illustrates each of these tilings. a tiling of a plane figure or finite region can be defined analogously. a tiling is said to be colored or labelled if each of its tiles is assigned a color/symbol from a finite set of colors/symbols. a colored tiling is also referred to as a pattern or a configuration.
a binary coloring employs black and white tiles, while in general an m-ary coloring colors each tile with one of m > 1 colors. consequent to the definitions above, a two-dimensional constraint can be expressed as a restriction on a coloring (or labeling) of tiles in a regular tiling. the most famous example is the hard-hexagon constraint [14], a planar hexagonal lattice with nearest-neighbor exclusion, which is used as a gas model in statistical mechanics. the hard-hexagon constraint allows only those (binary) colorings in which black hexagons are isolated, i.e., have all white neighbors. the hard-triangle [3] and hard-square constraints are defined similarly. a two-dimensional runlength (d, k) constraint is a restriction on the separation between black tiles, so that the number of white tiles between two black tiles in any direction is at least d and at most k (0 ≤ d < k). indeed, it can be seen that the hard-hexagon, hard-triangle and hard-square constraints are instances of the (d, k) constraints. in particular, they are (d, k) = (1, ∞) runlength constraints on their respective lattices. a constraint for which a coloring depends on both neighboring and touching tiles is referred to as a checkerboard constraint [9]. one example of a checkerboard constraint is a square tiling in which black squares are not permitted to share a vertex. as mentioned previously, both runlength and checkerboard constraints have been an active area of research, but progress has been slow because of the inherent hardness of two-dimensional tiling problems.

3 lozenge codes

consider a regular tiling of a plane. a (d, k) segregation constraint limits the number of neighboring tiles (neighborhood size) of the same color to be no less than d + 1 and no more than k + 1. a 2d bit pattern is said to satisfy a no isolated bit (nib) constraint if every bit has at least one bit of the same polarity adjacent to it. thus the nib is a (d, k) = (1, ∞) segregation constraint.
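as a minimal sketch of the nib definition just given, the check below tests a finite patch of the triangular tiling. the indexing is an assumption made for this illustration: triangles are stored in a rectangular array, triangle (r, c) points up when r + c is even, and tiles outside the patch are simply ignored, so boundary tiles have fewer than three neighbors.

```python
def triangle_neighbors(r, c, rows, cols):
    """Edge-sharing neighbors of triangle (r, c) inside a rows x cols patch.

    Assumed convention: (r, c) points up when (r + c) is even, so it shares
    its base with the triangle below; a down-pointing triangle shares its
    base with the triangle above. Left and right neighbors are in the same row.
    """
    candidates = [(r, c - 1), (r, c + 1)]
    candidates.append((r + 1, c) if (r + c) % 2 == 0 else (r - 1, c))
    return [(i, j) for i, j in candidates if 0 <= i < rows and 0 <= j < cols]


def satisfies_nib(grid):
    """True if every tile has at least one edge neighbor of the same color."""
    rows, cols = len(grid), len(grid[0])
    return all(
        any(grid[i][j] == grid[r][c]
            for i, j in triangle_neighbors(r, c, rows, cols))
        for r in range(rows)
        for c in range(cols)
    )


if __name__ == "__main__":
    ok = [[0, 0, 1, 1],
          [0, 0, 1, 1]]
    bad = [[0, 1, 0, 0],   # the single 1 has no neighbor of its own color
           [0, 0, 0, 0]]
    print(satisfies_nib(ok), satisfies_nib(bad))
```

the same function works for any finite color alphabet, since it only compares labels for equality.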
note that the parameters d and k in runlength and segregation constraints have different meanings. in rectangular latices runlength constraint can be converted 522 b. vasic, a. r. krishnan lozenge tiling constrained codes 523 522 a. r. krishnanan and b. vasić magnetization direction [1]. transitions too close to each other degrades the signal-to-noise ratio (snr) during readback. on the other hand, transitions too far apart make vulnerable to failure the timing-recovery sub-systems which typically rely on these transitions to derive symbol-level timing information. a solution to satisfy channel constraints is to use constrained codes (or modulation codes) – codes that transform user information to symbol-sequences that satisfy the constraints. one dimensional (1d) constraints have been well-studied. the mathematical framework for determining the shannon noiseless capacity of 1d constraints – the growth rate of the number of sequences satisfying a given constraint – has been comprehensively laid out. further, development of 1d constrained codes and decoders is mature (refer [1] for a review on this topic). in contrast, analysis and code development for two-dimensional (2d) constraint systems have been less successful. initial work on 2d constraints was done by ashley and marcus [2] who considered 2d low-pass filtering codes for eliminating bit-patterns with large highfrequency components. subsequently, considerable research was conducted on numerous classes of 2d constraints, e.g., [3–9]. in spite of these efforts, these constrained systems remain largely uncharacterized mathematically with only loose bounds of capacities existing. with an exception of a few cases, the channel capacity of 2d constrained channels is unknown (see [10]), and there is no systematic procedure to design encoders and decoders. nonetheless, the need two-dimensional constrained codes has come to the fore in light of the recent work in two-dimensional magnetic recording (tdmr). 
in [11], it was shown that constrained coding which restricts the occurrence of certain 2d patterns greatly reduces system complexity and improves the detection performance. also, there exist media models for tdmr that rely on polyomino tiling [12, 13], which are closely related to the 2d constrained systems. in this paper, we consider a class of low-pass constraint on the triangular lattice termed segregation constraint and it special case, no isolated bit (nib) constraint. we provide an upper bound on the shannon noiseless capacity for this constraint. the key novelty of the our approach is to view our 2d constraint as a colored tiling of a plane. thus, the estimation of the number of permissible colored tilings leads to upper bounds on shannon noiseless capacity of 2d channels represented by this constraint. in addition, the tools used for analysis also lead to a framework for encoders and decoders for the constraint. the rest of the paper is organized as follows. section 2 gives the necessary background and motivation of the problem. in section 3, we define lozenge codes and establish their connection with boxed plane partitions, and in section 4 we derive bounds on capacity. section 5 introduces encoding and decoding schemes for lozenge codes, and section 6 concludes the paper. detection and coding for tdmr channel 523 2 preliminaries a two-dimensional constrained encoder may be visualized as a mapping from an unconstrained binary sequence into a colored tiling. a tiling of the plane is a collection of plane figures that fills the plane with no overlaps and no gaps. the plane figures used as building blocks for tilings are called tiles. two tiles are said to be neigbors if they share an edge (side), and to be touching if they share only a single vertex. a regular tiling uses congruent regular polygons as tiles. there are only three regular tilings: triangular, square, and hexagonal. figure 1 illustrates each of these tilings. 
a tiling of a plane figure or finite region can be defined analogously. a tiling is said to be colored or labelled if each of its tiles is assigned a color/symbol from a finite set of colors/symbols. a colored tiling is also referred to as a pattern or a configuration. a binary coloring employs black and white tiles, while in general, in an m-ary coloring, each tile is colored by one of m > 1 colors.

consequent to the definitions above, a two-dimensional constraint can be expressed as a restriction on a coloring (or labeling) of tiles in a regular tiling. the most famous example is the hard-hexagon constraint [14]: a planar hexagonal lattice with nearest-neighbor exclusion, used as a gas model in statistical mechanics. the hard-hexagon constraint allows only those (binary) colorings in which black hexagons are isolated, i.e., have all white neighbors. the hard-triangle [3] and hard-square constraints are defined similarly. a two-dimensional runlength (d, k) constraint is a restriction on the separation between black tiles, so that the number of white tiles between two black tiles in any direction is at least d and at most k (0 ≤ d < k). indeed, the hard-hexagon, hard-triangle and hard-square constraints are instances of the (d, k) constraints. in particular, they are (d, k) = (1, ∞) runlength constraints on their respective lattices. a constraint for which a coloring depends on both neighboring and touching tiles is referred to as a checkerboard constraint [9]. one example of a checkerboard constraint is a square tiling in which black squares are not permitted to share a vertex. as mentioned previously, both runlength and checkerboard constraints have been an active area of research, but progress has been slow because of the inherent hardness of two-dimensional tiling problems.

3 lozenge codes

consider a regular tiling of a plane.
a (d, k) segregation constraint is a coloring restriction that limits the number of neighboring tiles of the same color (the neighborhood size) to be no less than d + 1 and no more than k + 1. a 2d bit pattern is said to satisfy the no isolated bit (nib) constraint if every bit has at least one bit of the same polarity adjacent to it. thus the nib constraint is a (d, k) = (1, ∞) segregation constraint. note that the parameters d and k in runlength and segregation constraints have different meanings. in rectangular lattices, a runlength constraint can be converted to a segregation constraint by precoding [1], while for the other two lattices in fig. 1 this is not the case. in other words, segregation constraints are not in bijection with simple runlength (d, k) constraints. using fig. 1, the nib constraint can be explained as follows: for any given tile (for example, the tile marked gray in the figure), if none of its neighboring tiles (tiles marked black) is of the same color, then the nib constraint is said to be violated. our recent experimental results show that if input bits are mapped to bit patterns satisfying the nib constraint, then the probability of rewriting a grain is reduced [12, 15]. the goal of this paper is to establish a bound on the shannon noiseless capacity and to design encoders and decoders for the nib constraint on the triangular lattice.

fig. 1. regular two-dimensional tilings: (a) triangular, (b) square, and (c) hexagonal tiling. in the triangular, square and hexagonal tilings, each tile has three, four and six neighbors, respectively. the tiles shaded black are the neighbors that share an edge with the tile shaded gray.

to describe lozenge codes, we start by considering a hexagon h with sides of lengths a, b, c, a, b, c and angles of 2π/3, subdivided into equilateral triangles of unit side by lines parallel to the hexagon sides.
we will henceforth refer to such equi-angular triangulated hexagons as (a, b, c) hexagons. figure 2(a) shows such a (2, 3, 4) hexagon. as mentioned earlier, two triangles are neighbors if they share a side.

fig. 2. lozenge tilings in a triangular lattice: (a) a (2, 3, 4) hexagon h embedded in a triangular lattice. (b) colored tiling of a (2, 3, 4) hexagon.

fig. 3. the prototype tiles used in the tiling of hexagons.

we are interested in coloring the triangles using m > 1 different colors in such a way that no isolated triangle is colored differently from its neighbors. in this paper we focus on the m = 2 case. we require a segregation of at least two neighboring triangles of the same color. fig. 2(b) shows one such coloring of the (2, 3, 4) hexagon shown in fig. 2(a). two neighboring triangles form a rhombus with sides of unit length and internal angles π/3 and 2π/3. such a rhombus is known as a lozenge. a lozenge created in this way may have three different orientations, shown in fig. 3. we refer to the prototiles shown in fig. 3(a), 3(b) and 3(c) as type-a, type-b and type-c lozenges, respectively. the last two prototiles are referred to as vertical lozenges. now, if each lozenge is given a single color, chosen independently of the other lozenges, then for any triangle there is at least one neighboring triangle with the same color, i.e., (d, k) = (1, ∞).

the constrained coding problem can now be formulated as follows: given a hexagonal region of a triangular lattice, one would like to find a one-to-one mapping from a set of integers representing the user's binary data (this bijection is trivial) to a set of nib-permissible colored tilings. in addition, due to complexity constraints, it is desirable that the mapping be described by an algorithm which operates locally on a small lattice region. we propose a mapping which is a composition of two mappings: tiling and coloring.
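as a small illustration of the nib constraint on the triangular lattice, the following python sketch checks whether a binary coloring of triangles leaves any triangle isolated. the row/column indexing of triangles, the rule that a triangle points up when (row + column) is even, the function names, and the treatment of boundary triangles (neighbors outside the region are simply ignored) are assumptions made for this example and are not taken from the paper.

```python
from typing import Iterator, List, Tuple


def neighbors(r: int, c: int, rows: int, cols: int) -> Iterator[Tuple[int, int]]:
    """Yield the edge-sharing neighbors of triangle (r, c) inside a rows x cols patch.

    Assumed layout (hypothetical, for illustration only): each row is a strip of
    alternating up/down triangles, and a triangle points up when (r + c) is even.
    """
    candidates = [(r, c - 1), (r, c + 1)]
    # An up triangle shares its base with the triangle below it,
    # a down triangle shares its top edge with the triangle above it.
    candidates.append((r + 1, c) if (r + c) % 2 == 0 else (r - 1, c))
    for rr, cc in candidates:
        if 0 <= rr < rows and 0 <= cc < cols:
            yield rr, cc


def satisfies_nib(grid: List[List[int]]) -> bool:
    """Return True if every triangle has at least one neighbor of the same color."""
    rows, cols = len(grid), len(grid[0])
    return all(
        any(grid[rr][cc] == grid[r][c] for rr, cc in neighbors(r, c, rows, cols))
        for r in range(rows)
        for c in range(cols)
    )
```

under these assumptions, coloring both triangles of every lozenge with one color makes the check pass by construction, since the lozenge partner of each triangle is an edge-sharing neighbor of the same color.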
now the nib coloring can be separated into two steps, as illustrated in fig. 4. first, the hexagon (fig. 4(a)) is tiled with (uncolored) lozenges to obtain the lozenge tiling in fig. 4(b), and then the lozenges are colored to obtain the colored tiling in fig. 4(c). it is important to note that the reader does not have knowledge of the tiling and retrieves the encoded bits solely from the colored pattern. thus, not all colorings can guarantee retrievability, and only those colorings that do are of interest. dividing the hard coloring problem into two manageable operations results in an unavoidable rate penalty; this penalty is the price paid for tractability. as we show in the following sections, it is now possible to use elegant combinatorial methods to design encoders and decoders. moreover, we can establish bounds on capacities, i.e., achievable density. we start by deriving the capacity bounds.

fig. 4. colored lozenge tilings in a triangular lattice: (a) a hexagon with side-length of 6 embedded in a triangular lattice. (b) an uncolored lozenge tiling of the hexagon. (c) a colored lozenge tiling of the hexagon.
4 bounds on capacity

for the nib constraint defined above, we can define the shannon noiseless capacity using an (n, n, n) hexagon as follows:

$$C = \lim_{n \to \infty} \frac{\log_2 M(n,n,n)}{\log_2 2^{6n^2}} \tag{1}$$

$$\phantom{C} = \lim_{n \to \infty} \frac{\log_2 M(n,n,n)}{6n^2}. \tag{2}$$

note that since the number of triangles constituting the (n, n, n) hexagon is $6n^2$, the denominator in eqn. 1 is the logarithm of the total number of unconstrained colorings of the constituent triangles. the numerator is the logarithm of $M(n,n,n)$, the number of colorings of the constituent triangles that satisfy the nib constraint. similarly, if $N(n,n,n)$ denotes the number of uncolored patterns (tilings), the asymptotic growth rate of $N(n,n,n)$, which we call the tiling capacity $C_T$, is defined as

$$C_T = \lim_{n \to \infty} \frac{\log_2 N(n,n,n)}{\log_2 2^{6n^2}}. \tag{3}$$

finally, we define a coloring capacity $C_C$ based on the number of distinct patterns, i.e., the number $K(n,n,n)$ of ways of distinctly coloring a lozenge tiling so that the constraints are satisfied. it is defined as follows:

$$C_C = \lim_{n \to \infty} \frac{\log_2 K(n,n,n)}{\log_2 2^{6n^2}}. \tag{4}$$

we are interested in calculating the number of colored lozenge tilings of a given lattice. this yields bounds on the capacity of the nib constraint because the capacity $C_{(1,\infty)}$ of the nib constraint for the triangular lattice model can be bounded by the tiling density and the coloring density, $C_T$ and $C_C$ respectively, as

$$C_{(1,\infty)} \ge C_T + C_C. \tag{5}$$

to estimate the capacity of lozenge tilings we make use of the bijection between two enumeration problems: the counting of lozenge tilings and the counting of boxed plane partitions. we then use the work of macmahon on the number of boxed plane partitions [16] to determine $C_T$. before this, we briefly discuss the problem of enumerating boxed plane partitions. the use of colored tilings to estimate capacity is attractive because the theory of counting domino tilings is well explored [17, 18].
early work in counting domino tilings includes that of kasteleyn [19], who calculated the number of domino tilings of a square lattice. another is the work of macmahon [16], in which the number of lozenge tilings of a hexagon embedded in a triangular lattice was calculated. desreux and remila [20] gave optimal algorithms for the generation of domino tilings and lozenge tilings.

4.1 a method based on boxed plane partitions

a plane partition π is a collection of non-negative integers $\pi_{x,y}$ indexed by positive integers x, y such that (a) only a finite number of the $\pi_{x,y}$ are non-zero, and (b) for all x, y, $\pi_{x,y} \ge \pi_{x,y+1}$ and $\pi_{x,y} \ge \pi_{x+1,y}$. the plane partition is said to fit inside a box of dimension a × b × c if $\pi_{x,y} \le c$ for all x, y and $\pi_{x,y} = 0$ whenever x > a or y > b. such partitions are called boxed plane partitions. a more intuitive way of visualizing a boxed plane partition is to construct the young's solid diagram corresponding to it. for instance, consider the following partition boxed within a box of dimension 2 × 3 × 4:

$$\pi = \begin{pmatrix} 4 & 2 & 2 \\ 2 & 1 & 1 \end{pmatrix}$$

the dimension of the matrix is 2 × 3, with all entries ≤ 4. to build its corresponding young's solid diagram, we first consider a box of dimension 2 × 3 × 4. to one of the vertices, we assign the cartesian coordinate (0, 0, 0). to the opposite vertex we assign the coordinate (2, 3, 4). for each x ≤ 2, y ≤ 3, we start at (x − 1, y − 1, 0) and stack $\pi_{x,y}$ cubes of unit side length. this construction fits inside the box.

fig. 5. tiling corresponding to the boxed plane partition π.

the tiling given in fig. 5 corresponds to the boxed plane partition π given above.
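the definition and the young's solid diagram construction above can be checked mechanically. the python sketch below tests whether a matrix is a plane partition fitting in an a × b × c box, lists the unit cubes of its young's solid diagram, and counts all boxed plane partitions of a small box by brute force. the function names are hypothetical, indices are 0-based (so the stack for entry (x, y) starts at (x, y, 0)), and the check assumes entries that are non-increasing along rows and columns, as in the example π.

```python
from itertools import product
from typing import List, Sequence, Tuple


def is_boxed_plane_partition(pi: Sequence[Sequence[int]], a: int, b: int, c: int) -> bool:
    """Check that pi is an a x b matrix with entries in 0..c, non-increasing along rows and columns."""
    if len(pi) != a or any(len(row) != b for row in pi):
        return False
    for x in range(a):
        for y in range(b):
            v = pi[x][y]
            if not 0 <= v <= c:
                return False
            if y + 1 < b and v < pi[x][y + 1]:
                return False
            if x + 1 < a and v < pi[x + 1][y]:
                return False
    return True


def unit_cubes(pi: Sequence[Sequence[int]]) -> List[Tuple[int, int, int]]:
    """Return the lower corner (x, y, z) of every unit cube in the young's solid diagram."""
    return [(x, y, z) for x, row in enumerate(pi) for y, h in enumerate(row) for z in range(h)]


def count_boxed_plane_partitions(a: int, b: int, c: int) -> int:
    """Brute-force count over all (c + 1)^(a*b) matrices; only practical for tiny boxes."""
    count = 0
    for flat in product(range(c + 1), repeat=a * b):
        matrix = [list(flat[i * b:(i + 1) * b]) for i in range(a)]
        if is_boxed_plane_partition(matrix, a, b, c):
            count += 1
    return count


pi = [[4, 2, 2], [2, 1, 1]]
print(is_boxed_plane_partition(pi, 2, 3, 4))  # the example partition from the text
print(len(unit_cubes(pi)))                    # total number of stacked cubes
```

the brute-force counter is exponential in a·b and is meant only as a cross-check on very small boxes.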
the young's solid diagram construction is also apparent if the tiled (2, 3, 4) hexagon is visualized as a 2 × 3 × 4 box. for ease of visualization, the top face of the topmost box is colored black. the relationship between lozenge tilings and boxed plane partitions is given by theorem 1 in [18].

lemma 1 the number of boxed plane partitions fitting inside an a × b × c box is equal to the number of lozenge tilings of an (a, b, c) hexagon.

proof: the proof is based on macmahon's formula [16].

4.2 bounds on $C_T$

by lemma 1, the number of lozenge tilings of any (a, b, c) hexagon is equal to the number of boxed plane partitions fitting inside an a × b × c box. macmahon [16] found the number of such partitions, $N(a,b,c)$, to be

$$N(a,b,c) = \prod_{i=1}^{a} \frac{(c+i)_b}{(i)_b},$$

where $(i)_n := i(i+1)(i+2)\cdots(i+n-1)$ is the rising factorial. for an equilateral hexagon, we were able to find the capacity of tiling (without coloring) in closed form. it is given by theorem 1.

theorem 1 the capacity of the lozenge tiling for an equilateral hexagon is $C_T = \frac{3}{4}\log_2 3 - 1$.

proof: the proof follows from lemma 1 and is given in appendix 7.

note that closed-form solutions for the capacity, such as the one given by the above theorem, are quite rare in problems involving 2d constraints.

4.3 bounds on $C_C$

the capacity of the coloring problem, $C_C$, can be estimated by counting the number of ways the lozenges constituting a lozenge tiling can be colored. given a lozenge tiling of an (n, n, n) hexagon h, the coloring of the lozenges cannot be done arbitrarily. to explain this, we consider a colored lozenge tiling of the (2, 1, 1) hexagon, which is shown in figure 6. figure 6(a) shows the colored lozenge tiling without any tile boundaries. in the absence of the tiling information, the decoder would not be able to decode this tiling correctly. figures 6(b) and 6(c) show two possible lozenge tilings to which it can be decoded.
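as a numerical aside on the product formula and theorem 1 of section 4.2, the following python sketch evaluates macmahon's formula exactly and compares the normalized growth rate log2 N(n, n, n) / (6n²) with the closed form (3/4) log2 3 − 1 ≈ 0.1887. the function names are hypothetical, and the values of n used for the comparison are arbitrary.

```python
from fractions import Fraction
from math import log2


def rising(i: int, n: int) -> int:
    """Rising factorial (i)_n = i (i + 1) ... (i + n - 1)."""
    out = 1
    for k in range(n):
        out *= i + k
    return out


def macmahon(a: int, b: int, c: int) -> int:
    """Number of plane partitions in an a x b x c box via the product formula."""
    total = Fraction(1)
    for i in range(1, a + 1):
        total *= Fraction(rising(c + i, b), rising(i, b))
    # The product is always an integer; exact rational arithmetic makes that checkable.
    assert total.denominator == 1
    return int(total)


def tiling_rate(n: int) -> float:
    """Finite-n estimate log2 N(n, n, n) / (6 n^2) of the tiling capacity."""
    return log2(macmahon(n, n, n)) / (6 * n * n)


closed_form = 0.75 * log2(3) - 1
print(macmahon(2, 3, 4))  # 490 tilings of the (2, 3, 4) hexagon
for n in (1, 2, 10, 40):
    print(n, tiling_rate(n), closed_form)
```

the finite-n rates approach the closed form from below as n grows, which is consistent with theorem 1 but is of course not a proof of it.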
this means that in the absence of knowledge of the tiling pattern, the colored tiling in figure 6(a) is not uniquely decodable. in particular, the two tilings of the (1, 1, 1) hexagon formed at the bottom of the larger hexagon (marked with bold edges) cannot be identified uniquely. we denote the (1, 1, 1) hexagons formed in fig. 6(b) and fig. 6(c) as type-1 and type-2 hexagons, respectively. to distinguish the two hexagons, we apply the following rule: whenever a type-1 hexagon is encountered, the vertical lozenges of different orientation constituting it, i.e., its type-b and type-c lozenges, are colored differently.

fig. 6. (a) a colored lozenge tiling of a (2, 1, 1) hexagon. in the absence of tiling information, the colored tiling in (a) can be decoded to either of the colored tilings shown in (b) and (c).

one such coloring scheme is shown in fig. 7, where type-b prototiles are always colored black and type-c prototiles are always colored white. with this coloring scheme, no information is stored in the vertical lozenges of the type-1 hexagon. we refer to this coloring scheme as fixed-color vertical lozenges (fcvl) coloring. the fcvl coloring constraint reduces the number of ways of coloring each tiling.
this reduction depends on the number of type-1 hexagons formed in a tiling. the capacity can be bounded by calculating the number of type-1 hexagons formed for every tiling.

fig. 7. the coloring scheme adopted for type-1 hexagon.

theorem 2 for the coloring of a lozenge tiling of an (n, n, n) hexagon h, the capacity $C_C \ge \frac{1}{3}$.

proof: the proof is based on the equivalence between tilings and boxed plane partitions and on lemma 1. it is given in appendix 8.

note that the above bound on $C_C$ may be strengthened at the expense of a much harder combinatorial argument that considers the colors of the tiles.

5 encoding and decoding lozenge codes

we refer to the assignment of a distinct element from the set of integers to an element of the set of all possible colored tilings as the encoding of colored tilings. again, we divide the problem into two separate problems: the encoding of tilings and the coloring of tiles. to facilitate this, we assume that the fcvl coloring constraint is imposed. the first step of the encoding process is the encoding of tilings, for which we seek a bijective assignment between a set of integers and the set of all tilings. the second step is the coloring of the tiling. the second step is trivial since, for the fcvl coloring, the $n^2$ horizontal prototiles, which make up one third of all prototiles in an (n, n, n) hexagon, can be colored independently. thus we focus on the first step of the encoding algorithm.

before discussing the proposed encoding algorithm, we give the intuition behind our approach. we begin by recalling that there is a unique boxed plane partition corresponding to any lozenge tiling.
consider an (a, b, c) hexagon tiled with equilateral triangles of unit side length. an alternative way to unambiguously specify a boxed plane partition is to traverse paths covering only vertical prototiles, as illustrated in fig. 8(d) for the case of the (3, 3, 3) hexagon. we refer to these paths as routings. each section of a path corresponds to one type-b or type-c lozenge and is drawn as a line segment of "northeast" or "southeast" orientation in fig. 8(d). a collection of n such paths determines the tiling. note, however, that the n paths in a collection cannot be chosen arbitrarily. this is simply because we do not allow a box at any level to "levitate" without being supported by the boxes beneath it. we describe this requirement through combinatorial objects known as mountain ranges, illustrated in fig. 8(c). each path corresponds to one mountain range, but "lower" paths cannot have mountain ranges higher than "upper" paths. we formalize this concept through lexicographical ordering and ranking of mountain ranges. a collection of n nondecreasing mountain ranges represents an instance of a tiling. the set of all such collections represents the set of all tilings. to perform encoding and decoding, we need to enumerate mountain ranges and collections of mountain ranges. thus the encoding procedure involves converting an integer to a routing on a graph, which is represented by a set of constant-weight sequences, and then converting the routing to a tiling. in the following subsections we rigorously introduce the combinatorial objects used in the above intuitive explanation. before discussing the details of the proposed algorithm, we describe the problem of enumerating the routings on a graph.

5.1 routings on a graph

recall that the idea is to use the bijection between tilings and routings on a graph. consider an (a, b, c) hexagon h tiled with equilateral triangles of unit side length.
for this tiling, an associated graph g can be constructed as follows. the vertices of g are the mid-points of the vertical sides in the tiling of h. two vertices of g are connected if their corresponding sides lie on adjacent triangles. fig. 8 shows a (3, 3, 3) hexagon (fig. 8(a)) and its corresponding graph (fig. 8(b)). the leftmost nodes of g are designated as sources and the rightmost nodes as drains. notice that the number of sources is equal to the number of drains. traversing from top to bottom, the sources are denoted $s_1, s_2, \ldots, s_c$ and the drains $t_1, t_2, \ldots, t_c$. a routing on the graph g is a set of c shortest non-intersecting paths, one from $s_i$ to $t_i$ for each i.
the routings on the graph $G$ are associated with the lozenge tilings of $H$ by the following theorem.

fig. 8. (a) a (3,3,3) hexagon $H$ tiled with equilateral triangles of unit side. (b) the graph $G$ associated with $H$. (c) example of a routing on the graph associated with a (3,3,3) hexagon. (d) the corresponding tiling on the hexagon.

theorem 3 the lozenge tilings of $H$ correspond bijectively to the routings on the associated graph $G$.

proof: to prove this we make use of theorem 1 in [21].

figures 8(c) and 8(d) show an example of the relationship between a lozenge tiling of $H$ and a given lattice routing on $G$. fig. 8(c) shows a routing on a graph associated with a $(3, 3, 3)$ hexagon (marked in bold). fig. 8(d) shows the corresponding tiling on the hexagon. the routing overlaid on the hexagon illustrates how the tiling can be generated from the routing.

it can be seen that each of the $c$ paths of the routing has $a + b$ moves, with $b$ "northeast (ne) moves" and $a$ "southeast (se) moves" between nodes. by mapping the ne move to "1" and the se move to "0", each routing can be represented as an ordered collection of constant weight sequences. thus the problem reduces to the coding of constant weight sequences.

we now describe the encoding. let $R$ be the ordered set of routings on the graph $G$. the $i$th element of $R$, $R_i$, is an ordered collection of $c$ constant weight sequences of length $a + b$ and weight $b$. we denote the ordered set of all sequences with length $n$ and weight $k$ as $D(n,k)$. hence, each element of $R_i$ is a member of $D(a+b,b)$. the encoding process consists of assigning to the number $i$, $0 \le i \le |R| - 1$, the $i$th element of $R$, $R_i$. in order to perform this operation without physically storing $R$, algorithms which directly map an integer $i$ to a distinct ordered set in $R$ have been designed. to explain this, we consider the following: let $R_i = \{ d_i,\ i = 1, 2, \ldots, c \mid d_i \in D(a+b,b) \}$. there exists an ordering of the elements of $D(a+b,b)$ such that, for all $i, j$ with $1 \le i, j \le c$, if $i < j$ then $\mathrm{rank}(d_i) \le \mathrm{rank}(d_j)$, where $\mathrm{rank}(\cdot)$ denotes the rank of an element in the ordered set. hence, instead of storing individual sequences, $R_i$ can be represented using the ranks of the $d_i$ as $R_i = \{ \mathrm{rank}(d_i),\ i = 1, 2, \ldots, c \mid d_i \in D(a+b,b) \}$ (we will later show that this can be further simplified by the use of non-decreasing strings; this is dealt with in detail in section 5.3). we refer to this as the rank-based representation. we now describe the reverse lexicographical ordering of constant weight sequences, which is one such ordering of sequences.

5.2 reverse lexicographical ordering of constant weight sequences

in this section, methods to obtain a bijective mapping between members of $D(n,k)$ and integers are developed. first, we note that the size of $D(n,k)$, $|D(n,k)|$, is $\binom{n}{k}$. also, note that each $n$-bit sequence has a decimal equivalent between 0 and $2^n - 1$. let the decimal equivalent be denoted $\mathrm{de}(\cdot)$. the lexicographical ordering of the sequences is the arrangement of all the sequences in ascending order of their decimal equivalent. the reverse lexicographical ordering is defined analogously, with the sequences in descending order of their decimal equivalent. in our description, we use the reverse lexicographical ordering. table 1 shows the reverse lexicographical ordering for $D(4,2)$. the first column is the rank, which gives the position of the sequence in the reverse lexicographically ordered set. again, we use $\mathrm{rank}(\cdot)$ to denote the rank of a particular element. it is easily seen that if $\mathrm{de}(a) \ge \mathrm{de}(b)$, then $\mathrm{rank}(a) \le \mathrm{rank}(b)$.

table 1. ordering of elements in $D(4,2)$.

rank   string   decimal equivalent
0      1100     12
1      1010     10
2      1001     9
3      0110     6
4      0101     5
5      0011     3

next, we consider the transformation of the members of $D(n,k)$ to paths in a cartesian plane. we refer to these paths as mountain ranges (similar to the catalan mountain ranges [22]). let $b$ be a member of $D(n,k)$. the mountain range $\mathrm{mountain}(b)$ corresponding to $b$ can be calculated as in algorithm 1.

algorithm 1 the mountain range algorithm
  start from (x = 0, y = 0)
  for i = 1 to n do
    if b(i) = 1 then
      move to (x + 1, y + 1)
    else
      move to (x + 1, y − 1)
    end if
  end for
  output the mountain range mountain(b)

the mountain ranges can also be represented by the $n$-tuple containing the y coordinates of the points traced by the sequence, denoted by $y$. figure 9 shows the mountain range corresponding to the sequence 100101, a member of $D(6,3)$.

fig. 9. mountain range corresponding to the string 100101, mountain(100101).

we now establish a relation between the lexicographical ranking and the containment of mountain ranges within one another. for any $a, b \in D(n,k)$, $\mathrm{mountain}(a)$ is said to be contained within $\mathrm{mountain}(b)$ (denoted $\mathrm{mountain}(a) \subset \mathrm{mountain}(b)$) if each element of the $n$-tuple corresponding to $a$, $y_a$, is never more than the corresponding element of the $n$-tuple corresponding to $b$, $y_b$. for instance, the mountain range mountain(1010) is contained in mountain(1100). the following theorem relates the lexicographical ranking to containment.

theorem 4 for any $a, b \in D(n,k)$, if $\mathrm{mountain}(a) \subset \mathrm{mountain}(b)$, then $\mathrm{rank}(a) \ge \mathrm{rank}(b)$.

proof: the proof is given in appendix 9.

the converse is not true. one counter-example in $D(4,2)$ is 1001 and 0110. it is easily seen that an ordered set $\{ d_i,\ i = 1, 2, \ldots, c \mid d_i \in D(a+b,b) \}$ such that $\mathrm{mountain}(d_{i+1}) \subset \mathrm{mountain}(d_i)$ for all $i$ is a member of $R$ and hence corresponds to a valid tiling. by theorem 4, this implies that the rank-based representation of $R_i$ is a string of integers in which the integers are non-decreasing.

we now address ranking and unranking of members. ranking, denoted by rank, refers to the process of mapping a member of an ordered set to a distinct integer. unranking is its inverse operation.
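the definitions of this subsection are small enough to check by direct enumeration. the following python sketch builds $D(4,2)$ in reverse lexicographical order, traces mountain ranges as in algorithm 1, and tests containment. the function names are hypothetical and chosen only for illustration; this is a brute-force check under those assumptions, not the encoder itself.

```python
from itertools import combinations


def constant_weight_sequences(n, k):
    """All n-bit strings with exactly k ones, sorted by descending decimal
    equivalent (reverse lexicographical order), so list index == rank."""
    seqs = []
    for ones in combinations(range(n), k):
        bits = ["0"] * n
        for pos in ones:
            bits[pos] = "1"
        seqs.append("".join(bits))
    return sorted(seqs, key=lambda s: int(s, 2), reverse=True)


def mountain(b):
    """y-coordinates traced by Algorithm 1: +1 for a '1', -1 for a '0'."""
    y, ys = 0, []
    for bit in b:
        y += 1 if bit == "1" else -1
        ys.append(y)
    return tuple(ys)


def contained(a, b):
    """True if mountain(a) never rises above mountain(b)."""
    return all(ya <= yb for ya, yb in zip(mountain(a), mountain(b)))


D42 = constant_weight_sequences(4, 2)
rank = {s: r for r, s in enumerate(D42)}

for s in D42:
    print(rank[s], s, int(s, 2), mountain(s))

# Theorem 4 on D(4,2): containment implies rank(a) >= rank(b).
print(all(rank[a] >= rank[b] for a in D42 for b in D42 if contained(a, b)))

# The converse fails: 1001 and 0110 are ordered by rank but neither
# mountain range is contained in the other.
print(contained("1001", "0110"), contained("0110", "1001"))
```

the printed rows reproduce the three columns of table 1, with the y-tuple of each sequence appended.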
we use ranking and unranking algorithms on the reverse lexicographically ordered set $D(n,k)$ for encoding data to tilings. ranking and unranking algorithms for this set are available [23, 24]. we have designed algorithms using methods similar to [22].

fig. 10. ranking of string 110010. vertex labels denote the number of possible sequences from the corresponding node to a drain. the edge labels on the mountain range, shown as a bold path, correspond to intermediate steps of rank computation.

to perform the ranking, first the number of paths passing through every node $(x, y)$ in $G$, $N_{(x,y)}$, is calculated. the rank is calculated by starting with an initial estimate of the rank (= 0) and then refining the estimate as the sequence is parsed bit by bit. from any node $(x, y)$, if the next bit is 0, the corresponding point in the cartesian plane is $(x + 1, y - 1)$. this implies that all sequences which continue through the node $(x + 1, y + 1)$ have a lower rank. thus, given that the current estimate is $e$, the rank of this sequence is at least $e + N_{(x+1,y+1)}$. hence the estimate can be refined by adding this number to the current estimate. this process is continued until all the bits are parsed. the final estimate is the rank of the sequence. figure 10 shows the flow of the algorithm for the sequence 110010, a member of $D(6,3)$. the number of sequences passing through every node is marked next to the node. the refinement of the rank at every bit is shown in the figure. at the last bit, the rank is calculated as 2. a similar approach can be used to unrank an integer.
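the bit-by-bit refinement described above can be sketched without storing the node labels of fig. 10. the python sketch below assumes that the number of sequences continuing from a node equals a binomial coefficient (the number of ways to place the remaining ones in the remaining positions), which replaces the precomputed table $N_{(x,y)}$. the names `rank_cw` and `unrank_cw` are hypothetical; this is one possible realization, not necessarily the algorithm of [22-24].

```python
from math import comb


def rank_cw(bits):
    """Rank of a constant-weight bit string in reverse lexicographical order."""
    n = len(bits)
    ones_left = bits.count("1")
    rank = 0
    for i, bit in enumerate(bits):
        if bit == "1":
            ones_left -= 1
        elif ones_left > 0:
            # Every sequence with the same prefix and a '1' in this position
            # ranks lower; it still places ones_left - 1 ones in the
            # n - i - 1 later positions.
            rank += comb(n - i - 1, ones_left - 1)
    return rank


def unrank_cw(rank, n, k):
    """Inverse of rank_cw for sequences of length n and weight k."""
    bits = []
    ones_left = k
    for i in range(n):
        # Number of sequences that share the prefix and put a '1' here.
        lower = comb(n - i - 1, ones_left - 1) if ones_left > 0 else 0
        if rank < lower:
            bits.append("1")
            ones_left -= 1
        else:
            bits.append("0")
            rank -= lower
    return "".join(bits)


print(rank_cw("110010"))      # the example of fig. 10
print(unrank_cw(2, 6, 3))
```

for the string 110010 the two zeros that precede the last one each add 1 to the estimate, which gives the rank 2 quoted in the text.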
through the rest of this document, the ranking and unranking on this set will be denoted $R_D$ and $R_D^{-1}$, respectively.

it was earlier stated that $R_i$ can be represented as $R_i = \{ \mathrm{rank}(d_i),\ i = 1, 2, \ldots, c \}$. we note that the rank of each $d_i$ is between 0 and $\binom{a+b}{b} - 1$. also, we noted that for all $d_i, d_j$ such that $i < j$, $\mathrm{rank}(d_i) \le \mathrm{rank}(d_j)$. now, we consider the rank-based representation, which is the ordered set $\{ \mathrm{rank}(d_i),\ i = 1, 2, \ldots, c \mid d_i \in D(a+b,b) \}$. after shifting the ranks by one, this set can be expressed as a non-decreasing string of length $c$ with members of the set $\{ 1, 2, \ldots, \binom{a+b}{b} \}$. we can further reduce the complexity of the representation of elements of $R$ if the non-decreasing strings can be ordered. section 5.3 deals with this.

5.3 lexicographical ordering of non-decreasing strings

this section deals with the lexicographical ordering of the set of non-decreasing strings of length $n$ consisting of members of the set $\mathcal{N} = \{1, 2, 3, \ldots, N\}$. this set is denoted $S_N^n$ and is defined as the set of all strings $d$ of length $n$ such that for all $i$, $1 \le i \le n-1$, $d(i), d(i+1) \in \mathcal{N}$ and $d(i) \le d(i+1)$. the set $S_N^n$ is said to be lexicographically ordered if for any $d_1, d_2 \in S_N^n$, $d_1(i) \le d_2(i)\ \forall i \Leftrightarrow \mathrm{rank}(d_1) \le \mathrm{rank}(d_2)$, where, as before, the rank of a member is the position of the member in the ordered set. table 2 shows the set $S_2^3$ along with the rank of each element.

table 2. ordering of elements in $S_2^3$.

rank   string
0      111
1      112
2      122
3      222

to calculate the size of $S_N^n$ we analyze the construction of strings of length $k$ using strings of length $k - 1$ for $k < n$. consider a number $i$ such that $1 < i \le N$. the number of strings of length $k$ ending in $i$, $N_k^i$, is equal to the number of strings of length $k - 1$ ending in digits less than or equal to $i$.
that is,

$$N_k^i = \sum_{j=1}^{i} N_{k-1}^j = \sum_{j=1}^{i-1} N_{k-1}^j + N_{k-1}^i = N_k^{i-1} + N_{k-1}^i \qquad (N_1^i = 1,\ N_k^1 = 1). \tag{6}$$

this yields the recursive relation

$$N_k^i = N_k^{i-1} + N_{k-1}^i \qquad (N_1^i = 1,\ N_k^1 = 1). \tag{7}$$

the generating function for this recursion is given by $B_i(x) = \frac{x}{(1-x)^i}$. the coefficient of $x^k$ in $B_i(x)$ gives the total number of sequences of length $k$ ending in $i$. since only the members of $S_N^n$ can be used to make sequences of length $n + 1$ ending in $N$, we have $|S_N^n| = N_{n+1}^N$. by using the generating function, this is evaluated as $\binom{N+n-1}{n}$.

the ranking algorithm for the lexicographical ordering maps a sequence in $S_N^n$ to an integer between 0 and $\binom{N+n-1}{n} - 1$. as in the previous case, the ranking algorithm starts with an initial estimate, $e_0 = \binom{N+n-1}{n} - 1$, and refines its estimate as the string is parsed digit by digit. given the estimate of the rank at the $(b-1)$th digit, $e_{b-1}$, and that the $b$th digit is $d(b)$, the following conclusion can be drawn about the rank: the rank is less than or equal to $e_{b-1} - \binom{N'+b'-1}{b'}$, where $N' = N - d(b)$ and $b' = n - b + 1$. $N'$ gives the total number of members of $\mathcal{N}$ greater than $d(b)$ and $b'$ gives the total number of available digits including the current digit.
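the size formula and the update rule above can be checked numerically. the python sketch below applies the refinement $e_b = e_{b-1} - \binom{N'+b'-1}{b'}$ at every digit and compares the final estimate with the position of the string in a brute-force lexicographic listing. the function names are hypothetical, and the comparison relies on `itertools.combinations_with_replacement` emitting non-decreasing tuples in lexicographic order for a sorted input.

```python
from itertools import combinations_with_replacement
from math import comb


def count_nondecreasing(N, n):
    """|S_N^n|: number of non-decreasing strings of length n over {1, ..., N}."""
    return comb(N + n - 1, n)


def rank_nondecreasing(d, N):
    """Lexicographic rank of the non-decreasing string d (a tuple of digits)."""
    n = len(d)
    estimate = count_nondecreasing(N, n) - 1      # e_0
    for b, digit in enumerate(d, start=1):
        n_larger = N - digit                      # N': symbols larger than d(b)
        digits_left = n - b + 1                   # b': open digits incl. this one
        # Strings that agree with d before position b and use only larger
        # symbols from position b onward all rank above d.
        estimate -= comb(n_larger + digits_left - 1, digits_left)
    return estimate


# S_2^3 from table 2, listed by brute force.
for r, s in enumerate(combinations_with_replacement(range(1, 3), 3)):
    print(r, "".join(map(str, s)), rank_nondecreasing(s, 2))
```

`math.comb` returns 0 when the lower argument exceeds the upper one, so no special case is needed when no larger symbols remain ($N' = 0$).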
the second term in the refined estimate is the total number of strings of length $b'$ that can be generated using the members of $\mathcal{N}$ larger than $d(b)$. the strings having these substrings will have a higher ranking. table 3 shows the progression of the algorithm for the string 112, a member of the set $S_2^3$. at each step, the strings which are eliminated are shown in parentheses.

table 3. progression of the ranking algorithm for an example.

b   d(b)   e_b
0   -      3
1   1      3 − 1 = 2   (222)
2   1      2 − 1 = 1   (122)
3   2      1

the unranking of an integer can be achieved by using a similar approach. through the rest of this document, the ranking and unranking on this set will be denoted $R_S$ and $R_S^{-1}$, respectively.

5.4 encoding by routing

given the lattice $H$ of dimension $(a, b, c)$, the set of valid routings, $R$, on $G$ consists of ordered sets of size $c$ consisting of members of the set $D(a+b,b)$. also, the ordered sets will be in non-decreasing order of rank. hence, these will correspond to a non-decreasing string of length $c$ with members from the set $\{1, 2, 3, \ldots, |D(a+b,b)|\}$. this set is $S_{|D(a+b,b)|}^{c}$. hence, we have established a transitive relationship between $R$ and $S_{|D(a+b,b)|}^{c}$. however, not all the sequences in this set correspond to valid tilings (as the converse of theorem 4 is not true). hence, we generate a look-up table (lut) to map the integers in $[0, |R| - 1]$ to the ranks of the members of $S_{|D(a+b,b)|}^{c}$ corresponding to valid tilings. the encoding process is given by algorithm 2. the ranking algorithms, along with the lut, can be used to map a tiling to an integer in $[0, |R| - 1]$.

algorithm 2 the encoding algorithm
  use the look-up table (lut) to find the rank of the corresponding string in $S_{|D(a+b,b)|}^{c}$
  use $R_S^{-1}$ to find the string $s_i$ in the set $S_{|D(a+b,b)|}^{c}$
  (each digit in $s_i$, $s_i(j)$, corresponds to the rank of a sequence in $D(a+b,b)$)
  for all $s_i(j)$ do
    use $R_D^{-1}$ to find the sequence corresponding to the rank $s_i(j)$
  end for
  assemble the sequences to find the ordered set in $R$

6 conclusion

in this paper we first introduced a class of two-dimensional constraints we call segregation constraints. in contrast to the isolation runlength constraints considered in the literature, we limit the minimal number of neighboring or touching tiles of the same color and restrict the shape of equally-colored regions. for most two-dimensional recording systems such constraints are more natural than isolation constraints. the reason is that in high density recording systems recorded pattern features (areas of the same physical properties) are smaller than what can be reliably distinguished by a reading device. for a particular constraint on a triangular lattice, called the no-isolated-bit constraint, we introduced a method for encoding and decoding. future work includes studying the effect of lozenge constraints on tdmr detector performance, in the spirit of what we have done for the square lattice [25].

acknowledgment

this work was supported by the national science foundation under grants ccf-0963726 and ccf-1314147. the work of a. r. krishnan was performed when he was with the department of electrical and computer engineering, university of arizona, tucson, az, usa. the authors would like to thank seyed mehrdad khatami for his help and fruitful discussions.

7 appendix

proof of theorem 1: in this appendix we prove theorem 1. the macmahon formula [16] gives the number $N(a, b, c)$ of restricted plane partitions in the form

$$N(a, b, c) = \prod_{i=1}^{a} \prod_{j=1}^{b} \prod_{k=1}^{c} \frac{i + j + k - 1}{i + j + k - 2}.$$

it can be rewritten as

$$N(a, b, c) = \prod_{i=1}^{a} \frac{(c + i)_b}{(i)_b}.$$
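as a sanity check on the macmahon product, and on the routing picture of section 5, the count can be evaluated exactly for small hexagons and compared with a brute-force count of nested path collections. the python sketch below is illustrative only: the function names are hypothetical, and the brute-force routine assumes (as discussed in section 5.2) that a valid routing corresponds to $c$ sequences of $D(a+b,b)$ whose mountain ranges are successively contained in one another.

```python
from fractions import Fraction
from itertools import combinations, product


def macmahon(a, b, c):
    """Number of plane partitions in an a x b x c box (triple product form)."""
    total = Fraction(1)
    for i in range(1, a + 1):
        for j in range(1, b + 1):
            for k in range(1, c + 1):
                total *= Fraction(i + j + k - 1, i + j + k - 2)
    assert total.denominator == 1   # the product is always an integer
    return int(total)


def count_nested_routings(a, b, c):
    """Brute force: ordered collections of c paths from D(a+b, b) in which
    each path lies weakly below the previous one."""
    n = a + b
    paths = []
    for ones in combinations(range(n), b):
        # Running count of ones; comparing these prefix counts is equivalent
        # to comparing mountain-range heights, since y = 2 * ones - steps.
        seen, counts = 0, []
        for pos in range(n):
            seen += pos in ones
            counts.append(seen)
        paths.append(tuple(counts))

    def below(p, q):
        return all(x <= y for x, y in zip(p, q))

    return sum(
        all(below(chain[i + 1], chain[i]) for i in range(c - 1))
        for chain in product(paths, repeat=c)
    )


for n in (1, 2, 3):
    print(n, macmahon(n, n, n), count_nested_routings(n, n, n))
```

for the (2, 2, 2) hexagon this also illustrates why a look-up table is needed in section 5.4: there are $\binom{6+2-1}{2} = 21$ non-decreasing rank strings of length 2 over the 6 members of $D(4,2)$, but only 20 of them are nested, the exception being the pair 1001, 0110.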
here $(i)_n := i(i + 1)(i + 2) \cdots (i + n - 1)$ is the rising factorial. since $(i)_n = (i + n - 1)!/(i - 1)!$, we have

$$\prod_{i=1}^{a} \frac{(c + i)_b}{(i)_b} = \prod_{i=1}^{a} \frac{(c + i + b - 1)!\,(i - 1)!}{(c + i - 1)!\,(i + b - 1)!}.$$

each product of factorials in this expression is of the form $\prod_{i=1}^{a} (d + i - 1)!$ and can be written as

$$\prod_{i=1}^{a} (d + i - 1)! = \frac{\prod_{i=1}^{d+a-1} i!}{\prod_{i=1}^{d-1} i!}.$$

the superfactorials appearing here can be expressed as $\prod_{i=1}^{n-1} i! = G(n + 1)$. therefore we obtain

$$N(a, b, c) = \frac{G(a + b + c + 1)\,G(a + 1)\,G(b + 1)\,G(c + 1)}{G(a + b + 1)\,G(a + c + 1)\,G(b + c + 1)} \tag{8}$$

where in eq. (8) $G$ denotes the barnes g-function [26], defined by

$$G(d + 1) = (2\pi)^{d/2}\, e^{-\frac{1}{2}\left(d(d+1) + \gamma d^2\right)} \prod_{i=1}^{+\infty} \left( \left(1 + \frac{d}{i}\right)^{i} e^{-d + d^2/(2i)} \right).$$

for a regular hexagon with $a = b = c = n$ we have

$$N(n, n, n) = \frac{G(3n + 1)\,(G(n + 1))^3}{(G(2n + 1))^3}. \tag{9}$$

using the asymptotic expansion of $G(z + 1)$,

$$\ln G(z + 1) \sim z^2 \left( \tfrac{1}{2} \ln z - \tfrac{3}{4} \right) + \tfrac{1}{2} \ln(2\pi)\, z - \tfrac{1}{12} \ln z + \zeta'(-1) + O\!\left(\tfrac{1}{z}\right),$$

we obtain

$$\ln N(n, n, n) = \ln G(3n + 1) + 3 \ln G(n + 1) - 3 \ln G(2n + 1) = n^2 \left( \tfrac{9}{2} \ln 3 - 6 \ln 2 \right) + O(\ln n),$$

where the terms linear in $n$ cancel. by changing the base of the logarithm we obtain

$$\log_2 N(n, n, n) \sim n^2 \left( \tfrac{9}{2} \log_2 3 - 6 \right). \tag{10}$$

the area of the hexagon with sides $a$, $b$ and $c$ is $A(a, b, c) = 2(ab + ac + bc)\frac{\sqrt{3}}{4}$. for a regular hexagon $A(n, n, n) = \frac{3\sqrt{3}}{2} n^2$, so that finally the density is

$$D = \lim_{n \to \infty} \frac{\log_2 N(n, n, n)}{A(n, n, n)} = \lim_{n \to \infty} \frac{n^2 \left( \frac{9}{2} \log_2 3 - 6 \right)}{\frac{3\sqrt{3}}{2} n^2} = \sqrt{3} \left( \log_2 3 - \tfrac{4}{3} \right).$$

8 proof of theorem 2

consider the equivalence between tilings and boxed plane partitions. it can be seen that each box with 3 visible faces in the young's solid diagram of an $(a, b, c)$ hexagon corresponds to a type-1 hexagon. to find the maximum number of type-1 hexagons in $H$, the young's solid diagram corresponding to an $n \times n \times n$ box with the maximum number of visible boxes is constructed. this can be constructed as follows: start at $(x, y) = (0, 0)$ of the $n \times n \times n$ box and stack $n$ boxes. now, at $(x, y) = (1, 0)$, stack $n - 1$ boxes.
step in the x direction, reducing the number of boxes stacked by 1 at each step. this is continued till $(x, y) = (n - 1, 0)$, where 1 box is stacked. next, start at $(x, y) = (0, 1)$ and stack $n - 1$ boxes. continue the same process till $(x, y) = (n - 2, 1)$ is reached, where 1 box is stacked. this process is continued till $(x, y) = (0, n - 1)$ is reached, where 1 box is stacked. this corresponds to the young's solid diagram with the maximum number of visible boxes. the boxed plane partition corresponding to this young's diagram is given by:

$$\pi = \begin{pmatrix} n & n-1 & \cdots & 3 & 2 & 1 \\ n-1 & n-2 & \cdots & 2 & 1 & 0 \\ n-2 & n-3 & \cdots & 1 & 0 & 0 \\ \vdots & \vdots & & & & \vdots \\ 1 & 0 & \cdots & & & 0 \end{pmatrix}$$

as an example, figure 11 shows the lozenge tiling of a $(4, 4, 4)$ hexagon with the maximum number of type-1 hexagons. alternatively, it can be visualized as the young's solid diagram corresponding to a $4 \times 4 \times 4$ box.

fig. 11. the lozenge tiling of a (4, 4, 4) hexagon with the maximum number of type-1 hexagons. it can also be visualized as the young's solid diagram of a 4×4×4 box with the maximum number of visible boxes.

the number of visible boxes is equal to the number of non-zero entries in $\pi$. this is equal to $\frac{n(n+1)}{2}$, which is also the number of type-1 hexagons formed in $H$. the number of lozenges where no information is stored is twice the number of such hexagons. by subtracting this from the total number of lozenges in $H$, the number of bits that can be stored in this region can be found. the total number of lozenges is half the number of equilateral triangles tiling $H$. by calculating the areas of $H$ and a unit triangle, the number of triangles is calculated as $6n^2$. hence, the number of lozenges is $3n^2$. since for any lozenge tiling of $H$ the number of lozenges storing no information is always less than or equal to $n(n + 1)$, we have a lower bound on the capacity $C_c$ as follows:

$$C_c \ge \lim_{n \to \infty} \frac{3n^2 - n(n + 1)}{6n^2} = \frac{1}{3} \tag{11}$$

9 proof of theorem 4

if $\mathrm{mountain}(a) \subset \mathrm{mountain}(b)$, then $y_a(i) \le y_b(i)$ for all $i$. it is enough to prove that if $y_a(i) \le y_b(i)$ for all $i = 1, 2, \ldots, n$, then $\mathrm{de}(a) \le \mathrm{de}(b)$. in order to do this, reconstruct the sequences $a$ and $b$ from $y_a$ and $y_b$. let $t$, $1 \le t \le n$, be such that $y_a(i) = y_b(i)$ for all $i < t$ and $y_a(t) \ne y_b(t)$.
this means that the t-th term of a is 0 and that of b is 1 (to satisfy the inequality in the terms of $y_a$ and $y_b$). hence, de(a) ≤ de(b) (inequality occurs if they differ in one or more bits and equality occurs if they do not differ). hence rank(a) ≥ rank(b).
references
[1] b. h. marcus, p. h. siegel, and j. k. wolf, “finite-state modulation codes for data storage,” ieee journal on selected areas in communications, vol. 10, no. 1, pp. 5–37, 1992.
[2] j. j. ashley and b. h. marcus, “two-dimensional low-pass filtering codes,” ieee transactions on communications, vol. 46, pp. 724–727, jun. 1998.
[3] z. nagy and k. zeger, “capacity bounds for the hard-triangle model,” proceedings of international symposium on information theory 2004, p. 162, 2004.
[4] i. demirkan and j. k. wolf, “block codes for the hard-square model,” ieee transactions on information theory, vol. 51, no. 8, p. 2836, aug. 2005.
[5] a. kato and k. zeger, “partial characterization of the positive capacity region of the two-dimensional asymmetric run length constrained channels,” ieee transactions on information theory, vol. 46, no. 7, p. 2666, nov. 2000.
[6] a. kato and k. zeger, “on the capacity of two-dimensional run-length constrained channels,” ieee transactions on information theory, vol. 45, no. 5, p. 1527, july 1999.
[7] z. nagy and k. zeger, “bit-stuffing algorithms and analysis for run-length constrained channels in two and three dimensions,” ieee transactions on information theory, vol. 50, no. 12, p. 3146, dec. 2004.
[8] p. siegel and j. wolf, “bit-stuffing bounds on the capacity of 2-dimensional constrained arrays,” in international symposium on information theory, august 16–21, 1998, p. 323.
[9] z. nagy and k. zeger, “asymptotic capacity of two-dimensional channels with checkerboard constraints,” ieee transactions on information theory, vol. 49, no. 9, p. 2115, 2003.
[10] t. etzion and k. g. paterson, “zero/positive capacities of two-dimensional runlength constrained arrays,” in proc. 2001 ieee intl. symp. on inform. theory, 2004, p. 269.
[11] s. khatami and b.
vasic, “generalized belief propagation detector for tdmr microcell model,” ieee transactions on magnetics, vol. 45, no. 10, p. submitted, october 2012.
[12] a. krishnan, r. radhakrishnan, b. vasic, a. kavcic, w. ryan, and f. erden, “2-d magnetic recording: read channel modeling and detection,” ieee trans. on magn., vol. 45, no. 10, pp. 3830–3836, oct. 2009.
[13] l. pan, w. e. ryan, r. wood, and b. vasic, “coding and detection for rectangular-grain tdmr models,” ieee trans. magn., vol. 47, no. 6, pp. 1705–1711, jun. 2011.
[14] r. j. baxter, “hard hexagons: exact solution,” journal of physics a: mathematical and general, vol. 13, no. 3, p. l61, 1980. [online]. available: http://stacks.iop.org/0305-4470/13/i=3/a=007
[15] a. krishnan, r. radhakrishnan, and b. vasić, “read channel modeling for detection in two-dimensional magnetic recording systems,” ieee transactions on magnetics, vol. 45, no. 10, pp. 3679–3682, october 2009.
[16] p. a. macmahon, combinatory analysis (volume ii), reprint. new york: chelsea publishing company, 1990.
[17] h. cohn, m. larsen, and j. propp, “the shape of a typical boxed plane partition,” new york journal of mathematics, vol. 4, p. 137, 1998. [online]. available: http://www.citebase.org/abstract?id=oai:arxiv.org:math/9801059
[18] g. david and c. tomei, “the problem of calissons,” american mathematical monthly, vol. 96, no. 5, pp. 429–431, 1989.
[19] p. w. kasteleyn, “the statistics of dimers on a lattice: the number of dimer arrangements on a quadratic lattice,” physica, vol. 12, pp. 1209–1225, 1961.
[20] s. desreux and e. remila, “an optimal algorithm to generate tilings,” journal of discrete algorithms, vol. 4, no. 1, pp. 168–180, 2006.
[21] m. luby, d. randall, and a. sinclair, “markov chain algorithms for planar lattice structures (extended abstract),” ieee symposium on foundations of computer science, pp. 150–159, 1995. [online].
available: citeseer.ist.psu.edu/luby95markov.html
[22] d. l. kreher and d. r. stinson, combinatorial algorithms: generation, enumeration and search. crc press, 1999, pp. 97–101.
[23] b. ryabko, “fast enumeration of combinatorial objects,” math. and applications, vol. 10, p. n2, 1998. [online]. available: http://www.citebase.org/abstract?id=oai:arxiv.org:cs/0601069
[24] t. m. cover, “enumerative source coding,” ieee transactions on information theory, vol. it-19, no. 1, p. 73, jan. 1973.
[25] m. khatami and b. vasic, “constrained coding and detection for tdmr using generalized belief propagation,” in proc. ieee int. comm. conf., sydney, australia, jun. 10–14, 2014.
[26] e. w. barnes, “the theory of the g-function,” quart. j. pure appl. math, vol. 31, pp. 264–314, 1900.

facta universitatis series: electronics and energetics vol. 36, no 2, june 2023, pp. 253-266
https://doi.org/10.2298/fuee2302253m
© 2023 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper
the fredkin gate in reversible and quantum environments
claudio moraga1, fatima z. hadjam2
1technical university of dortmund, dortmund, germany
2university of djillali liabes, sidi bel abbes, algeria
abstract. reversible computing circuits are characterized by low power consumption and their proximity to circuits for quantum computing. the fredkin gate was one of the earliest proposed controlled reversible circuits; it was, however, soon superseded by the not, cnot and toffoli gates, which constitute a flexible, functionally complete set and can also realize the fredkin gate as a building block. in quantum computing circuits, the fredkin gate (under the name controlled swap) plays an important role regarding the superposition of states. the present paper studies extensions of the fredkin gate in terms of mixed polarity in the reversible domain and an application in quantum computing.
keywords: fredkin gate, reversible circuits, quantum computing circuits
1. introduction
the earliest contributions to the development of reversible computing circuits may be traced back to e. fredkin [6] and t. toffoli [26], who introduced the first controlled reversible gates. a reversible gate realizes a bijection. therefore, it does not lose information: if the outputs are known, then the inputs may be precisely recovered. the realization of reversible circuits as fanout-free and feedback-free cascades of reversible gates was stimulated by r. landauer's theorem [9], stating that erasing or deleting information in a circuit produces heat dissipation. moreover, c. bennett [2] showed that a computer could work with low power dissipation if all its circuits were reversible. since the early times most work on the synthesis of reversible circuits has been based on the set of gates {not, cnot, toffoli}, known as nct, which is functionally complete. the symbols and functionality of these gates in a common environment of two (possibly) controlling lines and a target line are shown in fig. 1.
received october 05, 2022; revised january 13, 2023; accepted january 18, 2023
corresponding author: claudio moraga, technical university of dortmund, 44221 dortmund, germany, e-mail: claudio.moraga@tu-dortmund.de
received september 29, 2022; revised december 11, 2022; accepted january 06, 2023
fig. 1 symbols and functionality for the reversible gates not, controlled not and toffoli
the not gate (represented with an exor symbol) is not controlled and acts directly on its target line. in the cnot and toffoli gates, control signals and target signals are distinguished. the not component acts on the target signal whenever the control signals have the value 1. black dots identify which signal or signals are controlling the not component.
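the statement that a reversible gate realizes a bijection can be illustrated by enumerating all input patterns. the following python sketch is only an illustration under stated assumptions: the line assignment (c1 and c2 as possible controls, c3 as target) and all function names are hypothetical choices, not taken from fig. 1.

```python
from itertools import product

def not_gate(c1, c2, c3):
    """Uncontrolled NOT acting on the target line c3."""
    return c1, c2, c3 ^ 1

def cnot(c1, c2, c3):
    """Controlled NOT: c3 is inverted whenever the control c2 equals 1."""
    return c1, c2, c3 ^ c2

def toffoli(c1, c2, c3):
    """Toffoli gate: c3 is inverted whenever both controls c1 and c2 equal 1."""
    return c1, c2, c3 ^ (c1 & c2)

def and_overwrite(c1, c2, c3):
    """Irreversible counterexample: the target is overwritten with c1 AND c2."""
    return c1, c2, c1 & c2

def is_bijection(gate):
    """A 3-line gate is reversible iff its 8 input patterns give 8 distinct outputs."""
    outputs = {gate(*bits) for bits in product((0, 1), repeat=3)}
    return len(outputs) == 8

if __name__ == "__main__":
    for gate in (not_gate, cnot, toffoli, and_overwrite):
        print(gate.__name__, is_bijection(gate))
```

each of the three nct gates in this sketch is its own inverse, so applying a gate twice recovers the inputs, whereas `and_overwrite` loses the original value of the target line.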
the design of minimal boolean circuits is known to be np-complete, and the design of minimal reversible circuits is np-hard [20]. this has led to the development of different heuristics to synthesize reversible circuits [4], [17], [21], [24], [25], and also to postprocessing strategies to improve the minimization of circuits [19], [23]. in the last 25 years there have been important developments that have contributed to the improvement of the synthesis methods. among the most relevant from the hardware point of view are the increasing speed of computers and the increasing size of memories. this has opened the possibility of including search [12], evolutionary algorithms [7], [8], [10] or sat solvers [17] for the synthesis of reversible/quantum circuits. on the software side, the development of specialized efficient libraries may be mentioned. at the level of gates, both the use of the value 0 for control signals identified by “white dots” [23], [14], frequently referred to as “mixed polarity”, and the use of disjoint control signals [13], [15] may be mentioned. in what follows, the “generalization” of fredkin gates in the reversible domain and a relevant application in the quantum domain will be analyzed. it may be mentioned that in [5] the term “generalized fredkin gate” is used, referring to fredkin gates with multiple control lines.
2. the reversible domain
it is not known whether the fredkin gate ever had its own representation symbol (other than a box with three inputs and three outputs). possibly a first symbolic representation was introduced in [11], which was later replaced by the symbol used in circuits for quantum computing, as in e.g. [5]. in the literature this gate frequently appears as a building block made of a toffoli gate and cnot gates, as shown in fig. 2 (left). in what follows, this building block will frequently be called simply fredkin “gate” and will be used to illustrate the effects of mixed polarity.
at the output side, variables will carry a prime (apostrophe) sign. (complemented variables, on the other hand, will have a bar over their names, as frequently used in switching theory.) in the barenco et al. based quantum model [1], fig. 2 (right), a white box represents a v gate, whose functionality equals the square root of not, and the box with a diagonal represents the adjoint of v.
fig. 2 representation of the fredkin gate as an nct building block with positive control (left) and its barenco et al. based quantum model (right)
the functionality of the building block representing the fredkin gate is given by:
$$\begin{aligned}
c_3' &= c_3 \oplus c_1(c_2 \oplus c_3) = c_3 \oplus c_1 c_2 \oplus c_1 c_3 = c_1 c_2 \oplus \bar{c}_1 c_3,\\
c_2' &= c_2 \oplus c_3 \oplus c_3' = c_2 \oplus c_3 \oplus c_1 c_2 \oplus \bar{c}_1 c_3 = \bar{c}_1 c_2 \oplus c_1 c_3,\\
c_1' &= c_1.
\end{aligned} \qquad (1)$$
equations (1) may be expressed in the following summary:
$$\begin{array}{c|cc} & c_1 = 0 & c_1 = 1 \\ \hline c_2' & c_2 & c_3 \\ c_3' & c_3 & c_2 \end{array}$$
the tableau expressed in words means that whenever c1 has the value 0, the fredkin gate behaves as an identity, whereas when c1 has the value 1, the target signals c2 and c3 are exchanged. this means that the fredkin gate behaves as a “controlled swap”, although this name is hardly used in the community working on reversible circuits.
an important property of the fredkin gate is its completeness. in (1), let c2 = 1. then:
$$c_2' = \bar{c}_1 \oplus c_1 c_3 = 1 \oplus c_1 \oplus c_1 c_3 = 1 \oplus c_1 \bar{c}_3 = \bar{c}_1 \vee c_3 = c_1 \rightarrow c_3.$$
on the other hand, for some x,
$$c_1 \rightarrow (x \rightarrow 0) = c_1 \rightarrow (\bar{x} \vee 0) = c_1 \rightarrow \bar{x} = \bar{c}_1 \vee \bar{x} = \overline{c_1 x} = \mathrm{nand}(c_1, x).$$
since nand is functionally complete, so is the fredkin gate.
a different formal representation of the fredkin gate, appropriate to determine e.g. the performance of (simulated) circuits [28], is that of a transfer matrix.
$$\begin{pmatrix}
1&0&0&0&0&0&0&0\\
0&1&0&0&0&0&0&0\\
0&0&1&0&0&0&0&0\\
0&0&0&1&0&0&0&0\\
0&0&0&0&1&0&0&0\\
0&0&0&0&0&0&1&0\\
0&0&0&0&0&1&0&0\\
0&0&0&0&0&0&0&1
\end{pmatrix}\cdot
\overset{c_1\,c_2\,c_3}{\begin{pmatrix}0&0&0\\0&0&1\\0&1&0\\0&1&1\\1&0&0\\1&0&1\\1&1&0\\1&1&1\end{pmatrix}}
=\overset{c_1'\,c_2'\,c_3'}{\begin{pmatrix}0&0&0\\0&0&1\\0&1&0\\0&1&1\\1&0&0\\1&1&0\\1&0&1\\1&1&1\end{pmatrix}}. \qquad (2)$$
equation (2) shows the fredkin matrix as blockdiag($I_4$, swap), where $I_4$ denotes the 4⨯4 identity matrix. it becomes clear that the product of the fredkin matrix with a matrix of all possible inputs preserves c1, c2 and c3 when c1 = 0, since the $I_4$ submatrix is active, and gives c1’c2’c3’ = c1 c3 c2 when c1 = 1, since then the swap submatrix is active.
if a white dot is placed on the c1 line of the original fredkin gate, it is fairly obvious that this will change the polarity of the control. the gate will become active when c1 = 0 and it will remain inhibited, behaving as an identity, when c1 = 1. the circuit and the quantum model of a fredkin gate which is active when the control signal c1 has the value 0 are shown in fig. 3, where the quantum model uses the same two-qubit gates as in fig. 2, but in a different order.
fig. 3 a fredkin gate with negative control and its quantum model
since the quantum cost of a reversible gate is obtained as the gate count of the barenco et al. quantum model of the gate [22], [27], it becomes apparent that fredkin gates with positive or negative control have the same quantum cost of 7. a set of new behaviors is obtained if white dots are introduced in the lines c2 or c3, indicating that a signal is effective if it has the value 0. a first pair of equivalent variations is shown in fig. 4.
fig. 4 fredkin gate equivalent variations
the functionality of the building block at the left of fig. 4 is given by:
$$\begin{aligned}
c_3' &= c_3 \oplus c_1(\bar{c}_3 \oplus c_2) = c_3 \oplus c_1\bar{c}_3 \oplus c_1 c_2 = 1 \oplus \bar{c}_3 \oplus c_1\bar{c}_3 \oplus c_1 c_2 = 1 \oplus \bar{c}_1\bar{c}_3 \oplus c_1 c_2,\\
c_2' &= (c_3 \oplus c_2) \oplus c_3' = 1 \oplus c_3 \oplus \bar{c}_1\bar{c}_3 \oplus c_2 \oplus c_1 c_2 = \bar{c}_1 c_2 \oplus c_1\bar{c}_3,\\
c_1' &= c_1.
\end{aligned} \qquad (3)$$
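equations (1), the transfer matrix (2) and the completeness argument can be checked by exhaustive simulation. the following python sketch is a minimal illustration, assuming the cnot-toffoli-cnot order implied by equations (1); all function names are hypothetical.

```python
from itertools import product

def fredkin_nct(c1, c2, c3):
    """CNOT - Toffoli - CNOT cascade following eq. (1)."""
    c2 ^= c3            # first CNOT: c2 <- c2 xor c3
    c3 ^= c1 & c2       # Toffoli: c3' = c3 xor c1 (c2 xor c3)
    c2 ^= c3            # second CNOT: c2' = (c2 xor c3) xor c3'
    return c1, c2, c3

def controlled_swap(c1, c2, c3):
    """Reference behaviour: exchange c2 and c3 only when c1 = 1."""
    return (c1, c3, c2) if c1 == 1 else (c1, c2, c3)

def transfer_matrix(gate):
    """8x8 0/1 matrix with a 1 in row 'output number', column 'input number'."""
    m = [[0] * 8 for _ in range(8)]
    for col, bits in enumerate(product((0, 1), repeat=3)):
        o1, o2, o3 = gate(*bits)
        m[4 * o1 + 2 * o2 + o3][col] = 1
    return m

def nand_from_fredkin(c1, x):
    """NAND built from two Fredkin gates with constant inputs, as in the text."""
    not_x = fredkin_nct(x, 1, 0)[1]        # c2' = x -> 0 = not x
    return fredkin_nct(c1, 1, not_x)[1]    # c2' = c1 -> not x = NAND(c1, x)

# blockdiag(I4, SWAP) written out explicitly, as in eq. (2)
BLOCKDIAG_I4_SWAP = [
    [1, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1],
]

if __name__ == "__main__":
    for bits in product((0, 1), repeat=3):
        print(bits, "->", fredkin_nct(*bits))
```

building the matrix from the simulated gate, instead of typing it in, keeps the check independent of the layout of equation (2).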
equations (3) may be summarized as follows:
$$\begin{array}{c|cc} & c_1 = 0 & c_1 = 1 \\ \hline c_2' & c_2 & \bar{c}_3 \\ c_3' & c_3 & \bar{c}_2 \end{array}$$
in words: if c1 = 0 the building block behaves as an identity and if c1 = 1 both targets will be exchanged and complemented. it is straightforward to show that the gate at the right of fig. 4 has the same functionality. further variations, and where applicable their equivalent nct reversible circuits, are analyzed below.
fig. 5 second fredkin gate variation
the functionality of the building block of fig. 5 is given by:
$$\begin{aligned}
c_3' &= c_3 \oplus c_1(c_2 \oplus \bar{c}_3) = 1 \oplus \bar{c}_3 \oplus c_1 c_2 \oplus c_1\bar{c}_3 = c_1 c_2 \oplus \bar{c}_1\bar{c}_3 \oplus 1,\\
c_2' &= c_2 \oplus \bar{c}_3 \oplus c_3' = c_2 \oplus c_1 c_2 \oplus \bar{c}_3 \oplus \bar{c}_1\bar{c}_3 \oplus 1 = \bar{c}_1 c_2 \oplus c_1\bar{c}_3 \oplus 1,\\
c_1' &= c_1.
\end{aligned} \qquad (4)$$
equations (4) may be summarized as follows:
$$\begin{array}{c|cc} & c_1 = 0 & c_1 = 1 \\ \hline c_2' & \bar{c}_2 & c_3 \\ c_3' & c_3 & \bar{c}_2 \end{array}$$
in words: if c1 = 0 then c2 will be complemented and c3 will be preserved. if c1 = 1 then c3 and the complement of c2 will be swapped. a third variation is shown in fig. 6.
fig. 6 third variation of the fredkin gate
the functionality of the building block of fig. 6 is given by:
$$\begin{aligned}
c_3' &= c_3 \oplus c_1(c_2 \oplus c_3) = c_1 c_2 \oplus c_1 c_3 \oplus c_3 = c_1 c_2 \oplus \bar{c}_1 c_3,\\
c_2' &= (c_2 \oplus c_3) \oplus (c_3' \oplus 1) = c_2 \oplus c_3 \oplus c_1 c_2 \oplus \bar{c}_1 c_3 \oplus 1 = \bar{c}_1 c_2 \oplus c_1 c_3 \oplus 1,\\
c_1' &= c_1.
\end{aligned} \qquad (5)$$
equations (5) may be summarized as follows:
$$\begin{array}{c|cc} & c_1 = 0 & c_1 = 1 \\ \hline c_2' & \bar{c}_2 & \bar{c}_3 \\ c_3' & c_3 & c_2 \end{array}$$
in words: if c1 = 0 then c3 will be preserved, but c2 will be complemented. if c1 = 1 then the signal c3 will be complemented and swapped with c2.
from all former variations it follows that a variation comprising two white dots at the bottom and a white dot at the center of the middle line will have the same functionality as the original fredkin gate. this is illustrated in fig. 7.
fig. 7 equivalent fredkin gates
for the same reasons, the following variations on the fredkin gate are equivalent, as illustrated in fig. 8.
fig. 8 further equivalence of fredkin variations
an additional way of introducing variations on the fredkin gate consists of replacing the classical toffoli gate with a disjunct controlled toffoli gate [13], [15], possibly called “or-toffoli”. for this gate, upside-down triangles (black or white) are used instead of dots to identify the effectivity of driving control signals with value 1 and 0, respectively. upside-down triangles are used based on their similarity with “∨”, the disjunction symbol in mathematical logic. a different variation, based on the or-toffoli gate, is shown in fig. 9 and may be called “or-fredkin”. notice that the quantum model based on [1] does not use an adjoint v gate, but otherwise it uses the same gates as in the quantum model of the classical fredkin gate. therefore, it has the same quantum cost, 7.
fig. 9 the or-fredkin gate and its quantum model
the functionality of the building block of fig. 9 (left) is given by:
$$\begin{aligned}
c_3' &= c_3 \oplus (c_1 \vee (c_2 \oplus c_3)) = c_3 \oplus c_1 \oplus c_2 \oplus c_3 \oplus c_1(c_2 \oplus c_3)\\
&= c_1 \oplus c_2 \oplus c_1(c_2 \oplus c_3) = c_1 \oplus c_2 \oplus c_1 c_2 \oplus c_1 c_3 = \bar{c}_1 c_2 \oplus c_1\bar{c}_3,\\
c_2' &= c_3 \oplus c_2 \oplus c_3' = c_3 \oplus c_2 \oplus \bar{c}_1 c_2 \oplus c_1\bar{c}_3 = 1 \oplus \bar{c}_1\bar{c}_3 \oplus c_1 c_2,\\
c_1' &= c_1.
\end{aligned} \qquad (6)$$
equations (6) may be summarized as follows:
$$\begin{array}{c|cc} & c_1 = 0 & c_1 = 1 \\ \hline c_2' & c_3 & \bar{c}_2 \\ c_3' & c_2 & \bar{c}_3 \end{array}$$
an equivalent nct circuit may be obtained, as shown in fig. 10, where the two gates in the middle have the functionality of the or-toffoli gate.
fig. 10 an nct equivalent circuit for the or-fredkin gate
another variation is illustrated in fig. 11, where for an or-toffoli gate, a white dot at the input side is introduced.
fig. 11 a simple variation of the or-fredkin gate
the functionality of the building block of fig. 11 is given by:
$$\begin{aligned}
c_3' &= c_3 \oplus (c_1 \vee (c_2 \oplus \bar{c}_3)) = c_3 \oplus (c_1 \oplus (c_2 \oplus \bar{c}_3) \oplus c_1(c_2 \oplus \bar{c}_3))\\
&= c_3 \oplus c_1 \oplus c_2 \oplus \bar{c}_3 \oplus c_1(c_2 \oplus \bar{c}_3) = 1 \oplus c_2 \oplus c_1(c_2 \oplus c_3) = 1 \oplus \bar{c}_1 c_2 \oplus c_1 c_3,\\
c_2' &= c_2 \oplus \bar{c}_3 \oplus c_3' = c_2 \oplus \bar{c}_3 \oplus 1 \oplus \bar{c}_1 c_2 \oplus c_1 c_3 = c_1 c_2 \oplus \bar{c}_1 c_3,\\
c_1' &= c_1.
\end{aligned} \qquad (7)$$
equations (7) may be summarized as follows:
$$\begin{array}{c|cc} & c_1 = 0 & c_1 = 1 \\ \hline c_2' & c_3 & c_2 \\ c_3' & \bar{c}_2 & \bar{c}_3 \end{array}$$
in words: if c1 = 0 then c2 will be complemented and swapped with c3, whereas if c1 = 1 there will be no swapping, but c3 will be complemented. fig. 12 shows an equivalent nct circuit for the modified fredkin gate of fig. 11.
fig. 12 equivalent (more complex) nct circuit for the gate of fig. 11
two equivalent variations are shown in fig. 13, with a different distribution of the white dots/triangle.
fig. 13 equivalent variations of the or-fredkin gate
the functionality of the building block at the left of fig. 13 is given by:
$$\begin{aligned}
c_3' &= c_3 \oplus (c_1 \vee (c_2 \oplus c_3)) = c_1 \oplus c_2 \oplus c_1 c_2 \oplus c_1 c_3 = \bar{c}_1 c_2 \oplus c_1\bar{c}_3,\\
c_2' &= c_2 \oplus c_3 \oplus c_3' \oplus 1 = c_2 \oplus \bar{c}_3 \oplus \bar{c}_1 c_2 \oplus c_1\bar{c}_3 = c_1 c_2 \oplus \bar{c}_1\bar{c}_3,\\
c_1' &= c_1.
\end{aligned} \qquad (8)$$
equations (8) may be summarized as follows:
$$\begin{array}{c|cc} & c_1 = 0 & c_1 = 1 \\ \hline c_2' & \bar{c}_3 & c_2 \\ c_3' & c_2 & \bar{c}_3 \end{array}$$
in words: c3 will be complemented and if c1 = 0 then it will be swapped with c2. if c1 = 1 then no swapping takes place. it is simple to show that the circuit shown at the right of fig. 13 has the same functionality. an nct circuit equivalent to the variation is shown in fig. 14. it may be seen that its quantum cost [22], [27] and depth are higher by 1 with respect to the variations in fig. 13.
fig. 14 equivalent nct circuit of the or-fredkin variation of fig. 13
another variation is possible, with both bottom dots white, as shown in fig. 15.
fig. 15 another or-fredkin variation
the functionality of the building block of fig. 15 is given by:
$$\begin{aligned}
c_3' &= c_3 \oplus (c_1 \vee (c_2 \oplus \bar{c}_3)) = c_3 \oplus c_1 \oplus (c_2 \oplus \bar{c}_3) \oplus c_1(c_2 \oplus \bar{c}_3)\\
&= 1 \oplus c_1 \oplus c_2 \oplus c_1 c_2 \oplus c_1\bar{c}_3 = \bar{c}_1 c_2 \oplus c_1 c_3 \oplus 1 = \bar{c}_1\bar{c}_2 \oplus c_1\bar{c}_3,\\
c_2' &= (c_2 \oplus \bar{c}_3) \oplus c_3' \oplus 1 = c_2 \oplus \bar{c}_3 \oplus \bar{c}_1 c_2 \oplus c_1 c_3 = c_1 c_2 \oplus \bar{c}_1 c_3 \oplus 1 = c_1\bar{c}_2 \oplus \bar{c}_1\bar{c}_3,\\
c_1' &= c_1.
\end{aligned} \qquad (9)$$
equations (9) may be summarized as follows:
$$\begin{array}{c|cc} & c_1 = 0 & c_1 = 1 \\ \hline c_2' & \bar{c}_3 & \bar{c}_2 \\ c_3' & \bar{c}_2 & \bar{c}_3 \end{array}$$
in words: if c1 = 0 then c2 and c3 will be complemented and swapped, whereas if c1 = 1, c2 and c3 will be just complemented.
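the summaries of equations (3) to (9) can be cross-checked by brute force over the eight input patterns. the python sketch below is only an illustration: each function evaluates the first (circuit-level) expression of the corresponding equation, the expected behaviour for equations (3) to (5) is taken from their descriptions in words, and all names are hypothetical.

```python
from itertools import product

def inv(x):
    """Complement of a single bit."""
    return x ^ 1

# Each function returns (c2', c3') from the circuit-level form of the equation.
def eq3(c1, c2, c3):
    c3p = c3 ^ (c1 & (inv(c3) ^ c2))
    return (c3 ^ c2) ^ c3p, c3p

def eq4(c1, c2, c3):
    c3p = c3 ^ (c1 & (c2 ^ inv(c3)))
    return c2 ^ inv(c3) ^ c3p, c3p

def eq5(c1, c2, c3):
    c3p = c3 ^ (c1 & (c2 ^ c3))
    return (c2 ^ c3) ^ (c3p ^ 1), c3p

def eq6(c1, c2, c3):
    c3p = c3 ^ (c1 | (c2 ^ c3))             # OR-Toffoli: disjunctive control
    return c3 ^ c2 ^ c3p, c3p

def eq7(c1, c2, c3):
    c3p = c3 ^ (c1 | (c2 ^ inv(c3)))
    return c2 ^ inv(c3) ^ c3p, c3p

def eq8(c1, c2, c3):
    c3p = c3 ^ (c1 | (c2 ^ c3))
    return c2 ^ c3 ^ c3p ^ 1, c3p

def eq9(c1, c2, c3):
    c3p = c3 ^ (c1 | (c2 ^ inv(c3)))
    return (c2 ^ inv(c3)) ^ c3p ^ 1, c3p

# Summary tables: expected (c2', c3') for c1 = 0 (first entry) and c1 = 1 (second entry).
TABLES = {
    eq3: (lambda c2, c3: (c2, c3),           lambda c2, c3: (inv(c3), inv(c2))),
    eq4: (lambda c2, c3: (inv(c2), c3),      lambda c2, c3: (c3, inv(c2))),
    eq5: (lambda c2, c3: (inv(c2), c3),      lambda c2, c3: (inv(c3), c2)),
    eq6: (lambda c2, c3: (c3, c2),           lambda c2, c3: (inv(c2), inv(c3))),
    eq7: (lambda c2, c3: (c3, inv(c2)),      lambda c2, c3: (c2, inv(c3))),
    eq8: (lambda c2, c3: (inv(c3), c2),      lambda c2, c3: (c2, inv(c3))),
    eq9: (lambda c2, c3: (inv(c3), inv(c2)), lambda c2, c3: (inv(c2), inv(c3))),
}

def tables_hold():
    """True if every circuit expression matches its summary table on all 8 inputs."""
    return all(
        circuit(c1, c2, c3) == TABLES[circuit][c1](c2, c3)
        for circuit in TABLES
        for c1, c2, c3 in product((0, 1), repeat=3)
    )

def is_reversible(circuit):
    """True if (c1, c2', c3') takes 8 distinct values, i.e. the block is a bijection."""
    outputs = {(c1,) + circuit(c1, c2, c3) for c1, c2, c3 in product((0, 1), repeat=3)}
    return len(outputs) == 8

if __name__ == "__main__":
    print(tables_hold(), all(is_reversible(c) for c in TABLES))
```

keeping the circuit form and the table form as separate entries makes a mismatch between an equation and its summary show up as a failed comparison rather than being hidden inside one formula.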
equivalent circuits not using or-fredkin variations are shown in fig. 16. it is easy to see that these equivalent nct circuits are more complex than the or-fredkin variation.
fig. 16 equivalent nct circuits for the or-fredkin variation of fig. 15
another variation is shown in fig. 17, where in analogy to the white dot, a white triangle is introduced, meaning that the corresponding control signal will be complemented before calculating the disjunction.
fig. 17 a mixed-polarity or-fredkin variation
the functionality of the building block of fig. 17 is given by:
$$\begin{aligned}
c_3' &= c_3 \oplus (c_1 \vee (c_3 \oplus c_2 \oplus 1)) = c_3 \oplus c_1 \oplus c_3 \oplus \bar{c}_2 \oplus c_1(\bar{c}_3 \oplus c_2) = c_1(\bar{c}_3 \oplus \bar{c}_2) \oplus \bar{c}_2 = c_1\bar{c}_3 \oplus \bar{c}_1\bar{c}_2,\\
c_2' &= c_2 \oplus c_3 \oplus c_3' = c_2 \oplus c_3 \oplus c_1\bar{c}_3 \oplus \bar{c}_1\bar{c}_2 = \bar{c}_1\bar{c}_3 \oplus c_1\bar{c}_2,\\
c_1' &= c_1.
\end{aligned} \qquad (10)$$
equations (10) may be summarized as follows:
$$\begin{array}{c|cc} & c_1 = 0 & c_1 = 1 \\ \hline c_2' & \bar{c}_3 & \bar{c}_2 \\ c_3' & \bar{c}_2 & \bar{c}_3 \end{array}$$
it becomes apparent that equations (9) and (10) are equal. this means that the corresponding or-fredkin variations are equivalent. moreover, it may be noticed that the distribution of “white elements” in these variations is the same as the distribution of white dots shown in fig. 4 for variations of the classical fredkin gate. an additional or-fredkin variation is shown in fig. 18.
fig. 18 or-fredkin variation
the functionality of the circuit is:
$$\begin{aligned}
c_3' &= c_3 \oplus (c_1 \vee (c_2 \oplus \bar{c}_3)) = c_3 \oplus c_1 \oplus c_2 \oplus \bar{c}_3 \oplus c_1(c_2 \oplus \bar{c}_3) = 1 \oplus c_1 \oplus c_2 \oplus c_1 c_2 \oplus c_1\bar{c}_3 = c_1 c_3 \oplus \bar{c}_1 c_2 \oplus 1,\\
c_2' &= c_2 \oplus c_3 \oplus c_3' \oplus 1 = c_2 \oplus c_3 \oplus c_1 c_3 \oplus \bar{c}_1 c_2 = c_1 c_2 \oplus \bar{c}_1 c_3,\\
c_1' &= c_1.
\end{aligned} \qquad (11)$$
equations (11) may be summarized as follows:
$$\begin{array}{c|cc} & c_1 = 0 & c_1 = 1 \\ \hline c_2' & c_3 & c_2 \\ c_3' & \bar{c}_2 & \bar{c}_3 \end{array}$$
it may be seen that eqs. (7) and (11) are equal. therefore, the corresponding fredkin variations are equivalent. they have the same distribution of white elements as the variations shown in fig. 8. the comparison is shown in fig. 19.
fig. 19 pairs of equivalent fredkin variations
3. the quantum domain
in the domain of circuits for quantum computing, the fredkin gate is not known under this name, but as “controlled swap”, possibly because in the case of quantum circuits there is a very simple symbol for the swap of two “qubits” (= quantum bits). (see fig. 19). without knowing whether in some “quantum technology” the controlled swap is also realized as in the reversible domain, i.e. cnot-toffoli-cnot, no variations as presented in the former section will be discussed. however, some changes in the surroundings of the controlled swap may be considered. a relevant example of an effective use of the controlled swap was introduced in [3] to efficiently determine whether two qubits are equal or have an inner product with absolute value ≥ a given threshold in [0, 1]. (see fig. 20). a detailed analysis follows.
in the dirac notation [16], let $|0\rangle$ and $|1\rangle$ be the basis states of the working hilbert space [16]. let $|\chi\rangle$ denote the state of a control qubit and let $H$ denote the hadamard gate $\frac{1}{\sqrt{2}}\begin{pmatrix}1 & 1\\ 1 & -1\end{pmatrix}$. if $|\chi\rangle = |0\rangle = [\,1\ 0\,]^{T}$ (in the vector notation), then:
$$H|0\rangle = \frac{1}{\sqrt{2}}\begin{pmatrix}1 & 1\\ 1 & -1\end{pmatrix}\begin{pmatrix}1\\ 0\end{pmatrix} = \frac{1}{\sqrt{2}}\begin{pmatrix}1\\ 1\end{pmatrix} = \frac{1}{\sqrt{2}}\left(|0\rangle + |1\rangle\right). \qquad (12)$$
this represents a superposition of states and a quantum circuit will work in both states simultaneously, which is one of the main characteristics of circuits for quantum computing.
fig. 20 circuit to compare two qubits; in the dotted box, the symbol for the controlled swap
at the output side, the circuit of fig. 20 produces
$$(H \otimes I_4)(\text{c swap})(H \otimes I_4)\,|0\rangle|\varphi\rangle|\psi\rangle, \qquad (13)$$
where $I_4$ represents the 4⨯4 identity matrix.
lemma 1: the output of the circuit of fig. 20 is given by:
$$\tfrac{1}{2}\left[\,|0\rangle\left(|\varphi\rangle|\psi\rangle + |\psi\rangle|\varphi\rangle\right) + |1\rangle\left(|\varphi\rangle|\psi\rangle - |\psi\rangle|\varphi\rangle\right)\right]. \qquad (14)$$
proof: considering the most general case, let $|\varphi\rangle = \alpha_1|0\rangle + \beta_1|1\rangle$, with $|\alpha_1|^2 + |\beta_1|^2 = 1$, and $|\psi\rangle = \alpha_2|0\rangle + \beta_2|1\rangle$, with $|\alpha_2|^2 + |\beta_2|^2 = 1$. recall that
$$\text{swap} = \begin{pmatrix}1&0&0&0\\0&0&1&0\\0&1&0&0\\0&0&0&1\end{pmatrix}$$
and let $Q = (H \otimes I_4)(\text{c swap})(H \otimes I_4)$.
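let $S$ stand for swap. then
$$Q = \frac{1}{\sqrt{2}}\begin{pmatrix}I_4 & I_4\\ I_4 & -I_4\end{pmatrix}\cdot\begin{pmatrix}I_4 & 0_4\\ 0_4 & S\end{pmatrix}\cdot\frac{1}{\sqrt{2}}\begin{pmatrix}I_4 & I_4\\ I_4 & -I_4\end{pmatrix} = \frac{1}{2}\begin{pmatrix}I_4 & S\\ I_4 & -S\end{pmatrix}\cdot\begin{pmatrix}I_4 & I_4\\ I_4 & -I_4\end{pmatrix} = \frac{1}{2}\begin{pmatrix}I_4 + S & I_4 - S\\ I_4 - S & I_4 + S\end{pmatrix}. \qquad (15)$$
moreover,
$$|\varphi\rangle|\psi\rangle = [\,\alpha_1\ \beta_1\,]^{T} \otimes [\,\alpha_2\ \beta_2\,]^{T} = [\,\alpha_1\alpha_2\ \ \alpha_1\beta_2\ \ \beta_1\alpha_2\ \ \beta_1\beta_2\,]^{T} \qquad (16)$$
$$|\psi\rangle|\varphi\rangle = [\,\alpha_2\ \beta_2\,]^{T} \otimes [\,\alpha_1\ \beta_1\,]^{T} = [\,\alpha_2\alpha_1\ \ \alpha_2\beta_1\ \ \beta_2\alpha_1\ \ \beta_2\beta_1\,]^{T} \qquad (17)$$
therefore,
$$|\varphi\rangle|\psi\rangle + |\psi\rangle|\varphi\rangle = [\,(\alpha_1\alpha_2 + \alpha_2\alpha_1)\ \ (\alpha_1\beta_2 + \alpha_2\beta_1)\ \ (\beta_1\alpha_2 + \beta_2\alpha_1)\ \ (\beta_1\beta_2 + \beta_2\beta_1)\,]^{T} \qquad (18)$$
$$|\varphi\rangle|\psi\rangle - |\psi\rangle|\varphi\rangle = [\,(\alpha_1\alpha_2 - \alpha_2\alpha_1)\ \ (\alpha_1\beta_2 - \alpha_2\beta_1)\ \ (\beta_1\alpha_2 - \beta_2\alpha_1)\ \ (\beta_1\beta_2 - \beta_2\beta_1)\,]^{T} \qquad (19)$$
since $\alpha$ and $\beta$ are possibly complex values, the products $\alpha_1\alpha_2$ and $\beta_1\beta_2$ are commutative. therefore, the first and last components of the vector in (18) equal $2\alpha_1\alpha_2$ and $2\beta_1\beta_2$ respectively, and the first and last components of the vector in (19) equal 0. therefore,
$$|\varphi\rangle|\psi\rangle + |\psi\rangle|\varphi\rangle = [\,2\alpha_1\alpha_2\ \ (\alpha_1\beta_2 + \beta_1\alpha_2)\ \ (\alpha_1\beta_2 + \beta_1\alpha_2)\ \ 2\beta_1\beta_2\,]^{T} \qquad (20)$$
$$|\varphi\rangle|\psi\rangle - |\psi\rangle|\varphi\rangle = [\,0\ \ (\alpha_1\beta_2 - \beta_1\alpha_2)\ \ (\beta_1\alpha_2 - \alpha_1\beta_2)\ \ 0\,]^{T} \qquad (21)$$
to calculate $Q\,|0\rangle|\varphi\rangle|\psi\rangle$ the explicit expression for (15) will be needed, where “T” will be used to represent –1 and preserve the format of the matrix.
$$Q\,|0\rangle|\varphi\rangle|\psi\rangle = \frac{1}{2}\begin{pmatrix}
2&0&0&0&0&0&0&0\\
0&1&1&0&0&1&\mathrm{T}&0\\
0&1&1&0&0&\mathrm{T}&1&0\\
0&0&0&2&0&0&0&0\\
0&0&0&0&2&0&0&0\\
0&1&\mathrm{T}&0&0&1&1&0\\
0&\mathrm{T}&1&0&0&1&1&0\\
0&0&0&0&0&0&0&2
\end{pmatrix}\cdot\begin{pmatrix}\alpha_1\alpha_2\\ \alpha_1\beta_2\\ \beta_1\alpha_2\\ \beta_1\beta_2\\ 0\\ 0\\ 0\\ 0\end{pmatrix} = \frac{1}{2}\begin{pmatrix}2\alpha_1\alpha_2\\ \alpha_1\beta_2 + \beta_1\alpha_2\\ \alpha_1\beta_2 + \beta_1\alpha_2\\ 2\beta_1\beta_2\\ 0\\ \alpha_1\beta_2 - \beta_1\alpha_2\\ -\alpha_1\beta_2 + \beta_1\alpha_2\\ 0\end{pmatrix}. \qquad (22)$$
the resulting vector in (22) may be additively divided into two vectors as follows:
$$\frac{1}{2}\begin{pmatrix}2\alpha_1\alpha_2\\ \alpha_1\beta_2 + \beta_1\alpha_2\\ \alpha_1\beta_2 + \beta_1\alpha_2\\ 2\beta_1\beta_2\\ 0\\ \alpha_1\beta_2 - \beta_1\alpha_2\\ -\alpha_1\beta_2 + \beta_1\alpha_2\\ 0\end{pmatrix} = \frac{1}{2}\left(\begin{pmatrix}2\alpha_1\alpha_2\\ \alpha_1\beta_2 + \beta_1\alpha_2\\ \alpha_1\beta_2 + \beta_1\alpha_2\\ 2\beta_1\beta_2\\ 0\\ 0\\ 0\\ 0\end{pmatrix} + \begin{pmatrix}0\\ 0\\ 0\\ 0\\ 0\\ \alpha_1\beta_2 - \beta_1\alpha_2\\ -\alpha_1\beta_2 + \beta_1\alpha_2\\ 0\end{pmatrix}\right). \qquad (23)$$
from eq. (23), with (20) and (21), it follows that the first vector of (23) equals $\tfrac{1}{2}|0\rangle\left(|\varphi\rangle|\psi\rangle + |\psi\rangle|\varphi\rangle\right)$ and the second vector of (23) equals $\tfrac{1}{2}|1\rangle\left(|\varphi\rangle|\psi\rangle - |\psi\rangle|\varphi\rangle\right)$. this ends the proof that
$$Q\,|0\rangle|\varphi\rangle|\psi\rangle = \tfrac{1}{2}\left[\,|0\rangle\left(|\varphi\rangle|\psi\rangle + |\psi\rangle|\varphi\rangle\right) + |1\rangle\left(|\varphi\rangle|\psi\rangle - |\psi\rangle|\varphi\rangle\right)\right]. \qquad \square$$
notice that if $|\varphi\rangle = |\psi\rangle$ then $|\varphi\rangle|\psi\rangle = |\psi\rangle|\varphi\rangle$ and $|\varphi\rangle|\psi\rangle - |\psi\rangle|\varphi\rangle = 0$.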
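lemma 1 can also be checked numerically by building the 8⨯8 matrix of the circuit and applying it to a product state. the following python sketch is only an illustration under stated assumptions: it uses plain lists and complex numbers instead of a linear algebra library, and all names (including the example amplitudes) are hypothetical.

```python
from math import sqrt

def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def matmul(a, b):
    """Plain matrix product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

H = [[1 / sqrt(2), 1 / sqrt(2)], [1 / sqrt(2), -1 / sqrt(2)]]
SWAP = [[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]
# Controlled swap as blockdiag(I4, SWAP)
CSWAP = [row + [0] * 4 for row in identity(4)] + [[0] * 4 + row for row in SWAP]
H_I4 = kron(H, identity(4))
Q = matmul(matmul(H_I4, CSWAP), H_I4)

def column(amplitudes):
    """Turn a list of amplitudes into a column vector (list of one-element rows)."""
    return [[a] for a in amplitudes]

def swap_test(control, phi, psi):
    """Amplitudes of Q applied to the product state |control>|phi>|psi>."""
    state = kron(kron(column(control), column(phi)), column(psi))
    return [row[0] for row in matmul(Q, state)]

def prob_control_one(phi, psi):
    """Probability of measuring |1> on the control qubit when it starts in |0>."""
    out = swap_test([1, 0], phi, psi)
    return sum(abs(x) ** 2 for x in out[4:])

def lemma1_prediction(phi, psi):
    """Right-hand side of eq. (14) as a list of 8 amplitudes."""
    v = [a * b for a in phi for b in psi]      # |phi>|psi>
    w = [b * a for b in psi for a in phi]      # |psi>|phi>
    return [(x + y) / 2 for x, y in zip(v, w)] + [(x - y) / 2 for x, y in zip(v, w)]

if __name__ == "__main__":
    phi = [0.6, 0.8]
    psi = [1 / sqrt(2), 1j / sqrt(2)]
    print(prob_control_one(phi, phi), prob_control_one(phi, psi))
```

for identical inputs the antisymmetric part of the output vanishes, which is the situation considered next in the text.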
this means that in this case, $|1\rangle$ would be measured with probability 0, whereas $|0\rangle$ would be measured with a non-zero probability. the equality of two state vectors may be obtained in one step, whereas a classical algorithm would require two comparisons.
a possible “variation” may consider $|\chi\rangle = |1\rangle = [\,0\ 1\,]^{T}$ (in the vector notation); then
$$H|1\rangle = \frac{1}{\sqrt{2}}\begin{pmatrix}1 & 1\\ 1 & -1\end{pmatrix}\begin{pmatrix}0\\ 1\end{pmatrix} = \frac{1}{\sqrt{2}}\begin{pmatrix}1\\ -1\end{pmatrix} = \frac{1}{\sqrt{2}}\left(|0\rangle - |1\rangle\right). \qquad (24)$$
lemma 2: if in the circuit of fig. 20 $|\chi\rangle$ is set to $|1\rangle$, then
$$Q\,|1\rangle|\varphi\rangle|\psi\rangle = \tfrac{1}{2}\left[\,|0\rangle\left(|\varphi\rangle|\psi\rangle - |\psi\rangle|\varphi\rangle\right) + |1\rangle\left(|\varphi\rangle|\psi\rangle + |\psi\rangle|\varphi\rangle\right)\right].$$
proof: at the output side the circuit with $|\chi\rangle = |1\rangle$ now gives
$$(H \otimes I_4)(\text{c swap})(H \otimes I_4)\,|1\rangle|\varphi\rangle|\psi\rangle = Q\,|1\rangle|\varphi\rangle|\psi\rangle = Q\left([\,0\ 1\,]^{T} \otimes [\,\alpha_1\alpha_2\ \ \alpha_1\beta_2\ \ \beta_1\alpha_2\ \ \beta_1\beta_2\,]^{T}\right)$$
$$= \frac{1}{2}\begin{pmatrix}
2&0&0&0&0&0&0&0\\
0&1&1&0&0&1&\mathrm{T}&0\\
0&1&1&0&0&\mathrm{T}&1&0\\
0&0&0&2&0&0&0&0\\
0&0&0&0&2&0&0&0\\
0&1&\mathrm{T}&0&0&1&1&0\\
0&\mathrm{T}&1&0&0&1&1&0\\
0&0&0&0&0&0&0&2
\end{pmatrix}\cdot\begin{pmatrix}0\\ 0\\ 0\\ 0\\ \alpha_1\alpha_2\\ \alpha_1\beta_2\\ \beta_1\alpha_2\\ \beta_1\beta_2\end{pmatrix} = \frac{1}{2}\begin{pmatrix}0\\ \alpha_1\beta_2 - \beta_1\alpha_2\\ -\alpha_1\beta_2 + \beta_1\alpha_2\\ 0\\ 2\alpha_1\alpha_2\\ \alpha_1\beta_2 + \beta_1\alpha_2\\ \alpha_1\beta_2 + \beta_1\alpha_2\\ 2\beta_1\beta_2\end{pmatrix}. \qquad (25)$$
as in the former case (recall eq. (23)), the final vector of eq. (25) may be split into two components associated to $|0\rangle$ and $|1\rangle$ respectively, leading to:
$$Q\,|1\rangle|\varphi\rangle|\psi\rangle = \tfrac{1}{2}\left[\,|0\rangle\left(|\varphi\rangle|\psi\rangle - |\psi\rangle|\varphi\rangle\right) + |1\rangle\left(|\varphi\rangle|\psi\rangle + |\psi\rangle|\varphi\rangle\right)\right]. \qquad (26)$$
in this case, if $|\varphi\rangle = |\psi\rangle$, $|0\rangle$ would be measured with probability 0.
a (pseudo) variation may be introduced if a signal, not the controlled swap, is modified. recall that the pauli x matrix [18] equals $\begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix}$ and behaves as a quantum inverter. it is fairly obvious that if pauli x gates are included, as shown in fig. 21, then the complement of $|\varphi\rangle$ will be compared with $|\psi\rangle$, which is equivalent to comparing $|\varphi\rangle$ with the complement of $|\psi\rangle$.
fig. 21. modified swap test circuit to compare one state with the complement of another.
the modified swap test may be expressed as:
$$(H \otimes X \otimes I_2)(\text{c swap})(H \otimes X \otimes I_2)\,|0\rangle|\varphi\rangle|\psi\rangle. \qquad (27)$$
the proof of effectiveness, i.e. measuring $|1\rangle$ with probability 0, follows the same steps as in the former first case.
4. conclusions
variations on the fredkin gate, based on mixed polarity, have been analyzed in the reversible domain. the or-fredkin gate was introduced, and in all the variations shown its circuits had a lower complexity (quantum cost [22], [27], i.e. number of elementary gates on two qubits in the quantum model, and depth) than equivalent classical nct circuits. a wide range of functionalities of the fredkin gate under mixed polarity were shown, thus adding flexibility to the design of reversible circuits. some equivalent variations were found and associated patterns of distribution of the white elements could be detected. in the quantum domain, an application of the controlled swap to efficiently test whether two states are equivalent was given, with a step by step calculation of its behaviour, and one possible extension of the test circuit was shown.
references
[1] a. barenco, c. h. bennett, r. cleve, d. p. di vincenzo, n. margolus, p. shor, t. sleator, j. a. smolin, and h. weinfurter, "elementary gates for quantum computation", phys. rev. a, vol. 52, pp. 3457-3467, 1995.
[2] c. bennett, "logical reversibility of computation", ibm j. res. develop., vol. 17, pp. 525-532, 1973.
[3] h. buhrman, r. cleve, j. watrous and r. de wolf, "quantum fingerprinting", phys. rev. lett., vol. 87, no. 16, p. 167902-1-4, 2001.
[4] c. s. cheng and a. k. singh, "heuristic synthesis of reversible logic – a comparative study", theoretical appl. electr. eng., vol. 12, no. 3, pp. 210-225, 2014.
[5] o. dovhamuk and v. deibuk, "cmos simulation of mixed-polarity generalized fredkin gates", in proceedings of the 12th international conference on advanced computer information technologies (acit), ieee press, 2022.
[6] e. fredkin and t. toffoli, "conservative logic", int. jr. theor. phys., vol. 21, no. 3/4, pp. 219-253, 1982.
[7] f. z. hadjam and c. moraga, "rimep2. evolutionary design of reversible digital circuits", acm j. emerg. technol. comput. syst., vol. 11, no.
3, pp. 27:1-27:23, 2014.
[8] f. z. hadjam and c. moraga, "a hierarchical distributed linear evolutionary system for the synthesis of 4-bit reversible circuits", in r. seising and h. allende-cid (eds.), studies in fuzziness and soft computing 349, pp. 233-249, springer, 2017.
[9] r. landauer, "irreversibility and heat generation in the computing process", ibm j. res. develop., vol. 5, pp. 183-191, 1961.
[10] m. lukac, m. a. perkowski, h. goi, m. pivtoraiko, ch. h. yu, k. chung, h. jeech, b.-g. kim and y. d. kim, "evolutionary approach to quantum and reversible circuits synthesis", artif. intell. rev., vol. 20, no. 3-4, pp. 361-417, 2003.
[11] d. maslov, g. w. dueck and d. m. miller, "synthesis of fredkin-toffoli reversible networks", ieee trans. very large scale integ. (vlsi) syst., vol. 13, no. 6, pp. 765-769, 2005.
[12] m. d. miller and g. w. dueck, "search-based transformation synthesis for 3-valued reversible circuits", in i. lanese and m. rawski (eds.), reversible computation, lncs 12227, pp. 218-236, springer, 2020.
[13] c. moraga, "hybrid gf(2)-boolean expressions for quantum computing circuits", in a. de vos and r. wille (eds.), rc 2011, lncs 7165, pp. 54-63, springer, 2012.
[14] c. moraga, "using negated control signals in quantum computing circuits", fu elec. energ., vol. 24, no. 3, pp. 423-435, 2011.
[15] c. moraga, "or-toffoli and or-peres reversible gates", in s. yamashita and t. yokoyama (eds.), reversible computation, lncs 12805, pp. 266-273, springer, 2021.
[16] m. nielsen and i. chuang, quantum computation and quantum information. cambridge univ. press, uk, 2000.
[17] ph. niemann, l. müller and r. drechsler, "finding optimal implementations of non-native cnot gates using sat", in s. yamashita and t. yokoyama (eds.), reversible computation, lncs 12805, pp. 242-255, springer, 2021.
[18] w. pauli, handbuch der physik, chapter 24, springer, berlin, 1933.
[19] m. rahman and g. w.
dueck, "an algorithm to find quantum templates", in proceedings of the ieee congress on evolutionary computing, ieee press, 2012, pp. 623-629.
[20] i. rahul, b. loff and i. c. oliveira, "np-hardness of circuit minimization for multi-output functions", in proceedings of the 35th computational complexity conference (ccc), 2020, pp. 22:1–22:36.
[21] m. saeedi and i. l. markov, "synthesis and optimization of reversible circuits – a survey", acm comput. surveys, vol. 45, no. 2, pp. 1-34, 2013.
[22] z. sasanian and d. m. miller, "ncv realization of mct gates with mixed control", in proceedings of the ieee pacific rim conference on communications, computers and signal processing (pacrim), 2011, pp. 567-571.
[23] m. soeken and m. k. thomsen, "white dots do matter: rewriting reversible logic circuits", in g. w. dueck and d. m. miller (eds.), reversible computation, lncs 7948, pp. 196-208, springer, 2013.
[24] m. soeken, g. w. dueck and m. d. miller, "a fast symbolic transformation-based algorithm for reversible logic synthesis", in s. devitt and i. lanese (eds.), reversible computation, lncs 9720, pp. 307-321, springer, 2016.
[25] s. stojković, m. m. stanković and c. moraga, "complexity reduction of toffoli networks based on fdd", fu: elec. energ., vol. 28, no. 2, pp. 251-262, 2015.
[26] t. toffoli, "reversible computing", in j. w. baker and j. van leeuwen (eds.), alp 1980, lncs 84, pp. 632-644, springer, 1980.
[27] r. wille, m. saeedi and r. drechsler, "synthesis of reversible functions beyond gate count and quantum cost", 2010, pp. 1-7.
[28] a. zulehner and r. wille, "simulation and design of quantum circuits", in i. ulidowski, i. lanese, u. p. schulz and c. ferreira (eds.), reversible computation: extending horizons of computing, lncs 12070, pp. 60-82, springer open, 2020.

facta universitatis series: electronics and energetics vol. 29, no. 4, december 2016, pp.
489-507 doi: 10.2298/fuee1604489b

coreless open-loop current transducers based on hall effect sensor csa-1v

marjan blagojević 1, uglješa jovanović 2, igor jovanović 2, dragan mančić 2, radivoje s. popović 3
1 irc sentronis ad, niš, serbia
2 university of niš, faculty of electronic engineering, niš, serbia
3 epfl swiss federal institute of technology, lausanne, and senis ag, zug, switzerland

abstract. the paper provides an overview of coreless open-loop current transducers based on the hall effect sensor csa-1v. depending on the implementation method and current range, the presented transducers are divided into four groups. the transducers are capable of measuring ac and dc currents ranging from several tens of milliamperes up to several hundreds of amperes. methods for resolving issues with the skin effect and stray magnetic fields are also presented, including the experimental test results. some of these methods are novel and have never been presented in the literature.

key words: current measurement, current transducer, hall effect sensor, csa-1v

1. introduction

the hall effect refers to the voltage that appears across a conducting material when an electric current flowing through the conductor is influenced by a magnetic field [1], [2]. the hall effect is illustrated in fig. 1, where the current i (flowing through the hall sensor in the direction shown in fig. 1) is deflected due to the magnetic flux density b, and thereby generates the voltage vh.

fig. 1 operation principle of a hall effect sensor.

received october 6, 2015
corresponding author: marjan blagojević, irc sentronis ad, niš, serbia (email: marjan@sentronis.rs)

the equation which describes the output voltage of the hall effect sensor is:

$$ v_h = k_h \cdot i \cdot b \tag{1} $$

where kh is the coefficient which defines the sensitivity of the sensor.
thanks to the advantages they provide, current transducers based on hall effect sensors are used in various applications [2]. hall effect sensors are suitable for current measurement due to their small size, low price, good linearity, galvanic isolation, high bandwidth, good accuracy and the ability to measure dc current rather than only ac current [3], [4]. they can be employed to measure currents ranging from several microamperes up to several thousands of amperes. the distribution of current density in a conductor with a rectangular cross section and the equivalent schematic of this conductor are shown in fig. 2, where the color of each resistor in the schematic matches the corresponding area of the rectangular conductor. due to the skin effect, the higher the frequency, the less current flows through the resistor r4 and the more through the resistor r1, i.e. less current flows through the middle of the conductor and more near the outer edges [5], [6]. current redistribution in a rectangular conductor is an important factor in current measurement applications.

fig. 2 distribution of current density and equivalent schematic of a flat conductor.

the skin effect within massive rectangular conductors may become noticeable at very low frequencies, in the order of several tens of hz. the redistribution of current density results in a redistribution of the measured magnetic flux density, which deteriorates the frequency and phase responses of current transducers. the phase response of a current transducer is very important in applications for electric energy measurement (for instance, good current transformers have a phase shift of less than 1°). immunity to stray magnetic fields is an important feature of current transducers based on hall effect sensors because such fields can cause false readings and measurement errors. this paper provides an overview of current transducers based on the hall effect sensor csa-1v, divided into four groups.
the first group works in a similar way to pickup coils and measures current in a pcb traced conductor or in a wire. the second group is based on miniature bus bars and can measure currents up to several tens of amperes. thanks to the magnetic field increase obtained using multi-turn coils, the third group can measure very low currents in the range of milliamperes. the fourth group is based on bus bars and is designed for high current applications. the major features of the transducers, the issues that arise with current measurement and the methods to overcome them are also presented in this paper. special attention was paid to methods for resolving issues with the skin effect and stray magnetic fields. some of the presented solutions for frequency response improvement are novel and, to the knowledge of the authors, have never been presented in the literature.

2. hall device csa-1v

csa-1v is an integrated hall effect single-axis magnetic field sensor designed for non-contact measurement of electric current. the device is manufactured using a standard cmos technology with an additional ferromagnetic layer patented by sentron, called the integrated magnetic concentrator (imc) [7], [8], and it incorporates the spinning current technique. thanks to that, compared to conventional hall effect sensors, the csa-1v provides a magnetic gain contributing to a greater magnetic sensitivity, a lower magnetic offset and a lower magnetic noise [9], [10]. the device is packed in a standard soic-8 case (see fig. 3) which provides good isolation (up to 600 v) for applications with the current conductor traced on a printed circuit board (pcb) [9].

fig. 3 direction of the sensitivity vector and location of the sensing element [10].

the sensing element of the csa-1v is located approximately 0.3 mm below the top surface of the soic-8 case, as illustrated in fig. 3.
a consequence of the uncontrollability of the imc process is that csa-1vs will usually not meet the specifications rated in the datasheet [9]. for this reason, a calibration procedure using a certain number of specifically designated memory cells is introduced during the manufacturing process [11]. the calibration memory cells are manufactured in "zener zapping" technology and can be programmed only once [12], [13]. the calibration procedure of csa-1vs is presented in detail in papers [11], [13].

3. pickup hall effect current transducers

a current transducer operating on a principle similar to that of a pickup coil can be realized using hall effect sensors. in these transducers, instead of a pickup coil, the csa-1v is employed to sense the magnetic field generated by a current carrying conductor and convert it to a voltage proportional to that field. this can be done by employing the csa-1v to measure current either in an adjacent wire or in a pcb traced conductor below the csa-1v, as shown in fig. 4 [10].

fig. 4 shape and direction of magnetic field from two different conductor types [10].

the csa-1v differential output voltage for a current carrying circular conductor (wire) located on top of the sensor can be approximated with the following equation [10]:

$$ v_{outdiff} \approx \frac{0.060 \cdot i}{d + 0.3} \tag{2} $$

where d is the distance between the csa-1v top surface and the center of the wire given in millimeters (see fig. 4) and i is the current applied in the wire. the application of the csa-1v measuring current in a current carrying wire is shown in fig. 5. if placed too close to the csa-1v, a wire carrying a high current can saturate the csa-1v. therefore, the limits for electrical and magnetic saturation must be taken into account.

fig. 5 application of the csa-1v measuring current in a current carrying wire.
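as a rough illustration of how eq. (2) can be used when choosing the wire position, the sketch below reads the equation as v = 0.060·i/(d + 0.3). the units are an assumption (output in volts for i in amperes and d in millimeters), the function name is hypothetical, and the distances in the loop are illustrative values only, not measured data.

```python
def wire_output_voltage(current_a: float, distance_mm: float) -> float:
    """Approximate CSA-1V differential output for a round wire, per Eq. (2).

    Assumed units (not stated next to the equation): result in volts,
    current in amperes, distance in millimetres.
    """
    if distance_mm < 0:
        raise ValueError("distance_mm must be non-negative")
    # The 0.3 mm term matches the depth of the sensing element below the
    # package top surface quoted in Section 2.
    return 0.060 * current_a / (distance_mm + 0.3)


if __name__ == "__main__":
    # Hypothetical wire positions, chosen only to show the 1/(d + 0.3) trend.
    for d in (0.7, 1.7, 2.7):
        mv_per_a = 1000.0 * wire_output_voltage(1.0, d)
        print(f"d = {d:.1f} mm -> about {mv_per_a:.1f} mV/A")
```

the sketch does not model saturation, so the electrical and magnetic limits mentioned above still have to be checked separately against the datasheet.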
the csa-1v differential output voltage for a flat pcb conductor traced directly below the csa-1v can be approximated with the following equation [10]:

$$ v_{outdiff} \approx 40 \cdot i \tag{3} $$

where i is the current applied in the pcb traced conductor, which is assumed to be roughly 3.2 mm wide. the sizing of the pcb trace needs to take into account the current handling capability and the total power dissipation. for this reason, the pcb trace needs to be thick enough and wide enough to handle the designated nominal current continuously. using a single pcb traced conductor, currents up to 10 a can be measured. the applications of the csa-1v measuring current in the pcb traced conductor (see fig. 6) are presented in paper [14], while the thermal analysis performed using a thermal imaging camera is presented in paper [15].

fig. 6 applications of the csa-1v measuring current in the pcb trace [14], [15].

the applications shown in fig. 6 are implemented in a photovoltaic power plant for dc current measurement of photovoltaic modules [14].

3.1. transducer with the magnetic shield

the csa-1v can detect any surrounding stray magnetic field in the direction of sensitivity (across the chip), which may cause interference and degrade the measurement accuracy. the solution to this issue is to shield the csa-1v by mounting a small (roughly 1 cm² × 0.5 mm) ferromagnetic plate on the side of the pcb opposite to the one on which the csa-1v is soldered, as shown in fig. 7 [10]. the plate can be made out of mu-metal since it has high permeability at low field strengths and a low remanence field.

fig. 7 shielding the csa-1v from stray fields [10].

the ferromagnetic shield has a double effect:
1. it concentrates the flux around the trace, thus shortening the field lines that go through the air almost by half.
in this way the magnetic resistance is reduced by about half, which ultimately contributes to higher induction and a greater output signal (by 30%–50%).
2. it serves as the concentrator for stray fields, at the same time deflecting them from the csa-1v as shown in fig. 7.

3.2. anti-differential configuration of hall effect sensors

the measurement error produced by stray fields can also be minimized by implementing two hall effect sensors in the anti-differential configuration shown in fig. 8.

fig. 8 anti-differential configuration of hall effect sensors [10].

implementation of this method cancels common mode magnetic fields while the output signal is doubled [10], as per the following equations:

$$ u_1 = s \cdot (b + b_s) \tag{4} $$

$$ u_2 = s \cdot (-b + b_s) \tag{5} $$

$$ u_1 - u_2 = 2 \cdot s \cdot b \tag{6} $$

where b is the measured magnetic field, bs is the common mode magnetic field and s is the sensor sensitivity. this method works perfectly with homogeneous stray fields. since the field gradient decreases as a function of distance, nonhomogeneous stray fields whose sources are relatively distant from the transducer can be considered homogeneous. in this way, the useful signal is doubled while the noise is $\sqrt{2}$ times greater, i.e. the signal to noise ratio is $\sqrt{2}$ times better. application of the anti-differential configuration of the csa-1vs on a massive oval conductor is shown in fig. 9.

fig. 9 application of anti-differential configuration on the massive oval conductor.

4. miniature bus bar current transducers

currents greater than 10 a can be measured using the csa-1v by conducting the current through a properly shaped copper miniature bus bar (mbb) placed above the csa-1v, as illustrated in fig. 10.

fig. 10 copper mbb placed above the csa-1v.

the sizing of the mbb and its distance from the csa-1v depend on the desired current handling capability.
the closer the mbb is to the csa-1v, the more accurate the readings will be, but the limits of electrical and magnetic saturation need to be taken into account. the approximate csa-1v differential output voltage can be obtained by the following equation:

$$ v_{outdiff} \approx \frac{40 \cdot i}{d + 0.3} \tag{7} $$

where d is the distance between the mbb center and the csa-1v top surface given in millimeters and i is the current applied in the mbb. the method illustrated in fig. 10 can easily be implemented by soldering an mbb onto a pcb above the csa-1v, as shown in fig. 11 [10].

fig. 11 application of a mbb and a pcb trace [10].

when noncircular mbbs are employed in the application illustrated in fig. 11, it is necessary to take into account the frequency dependence of the transducer's sensitivity, because the skin effect forces high frequency current to flow along the outer edges of the mbb, thus changing the magnetic flux density at the site of the csa-1v. consequently, the frequency response deteriorates. the solution to this issue is to split a rectangular mbb into two parallel branches by drilling a hole in the middle of the mbb, as illustrated in fig. 12. in this way, since the current flows through the branches, the skin effect is minimized.

fig. 12 rectangular mbb without and with the hole in the middle.

it should be noted that a hollow mbb (see fig. 12) must be thicker than an otherwise identical mbb without a hole in order to handle the same current intensity. to properly demonstrate the difference between transducers with a circular mbb, a rectangular mbb and a rectangular hollow mbb, it is necessary to analyze their frequency responses. to this end, three transducers were realized, one with each type of mbb. the transducer with the circular mbb is shown in fig. 13.

fig. 13 transducer with the circular mbb.
the transducer with the rectangular mbb, capable of handling currents up to 50 a, is shown in fig. 14.

fig. 14 transducer with the rectangular mbb.

the transducer with the rectangular hollow mbb is shown in fig. 15. the mbb is identical to the one employed in the transducer shown in fig. 14, with the only difference being the hole.

fig. 15 transducer with the rectangular hollow mbb.

the frequency responses of all three transducers are shown in fig. 16. the sensitivity of the transducer with the circular mbb for dc current is s = 34 mv/a, the sensitivity of the transducer with the rectangular mbb for dc current is s = 35 mv/a, while the sensitivity of the transducer with the rectangular hollow mbb for dc current is s = 28.36 mv/a. as can be seen from fig. 16, the frequency response of the transducer with the circular mbb (see fig. 13) has the 3 db sensitivity attenuation (relative sensitivity equal to 0.7) at 100 khz, which corresponds to the frequency response of the csa-1v sensor itself [9]. for the transducer with the rectangular mbb (see fig. 14), the 3 db sensitivity attenuation is at around 80 khz. however, for the transducer with the hollow rectangular mbb (see fig. 15), the 3 db sensitivity attenuation is at around 100 khz, just like for the transducer with the circular mbb. based on the measurements shown in fig. 16, the benefit of the hollow rectangular mbb is evident.

fig. 16 frequency responses for all three transducers.

it is possible for an mbb carrying a high frequency ac current to be at a much higher potential relative to the ground of the csa-1v. this can lead to capacitive coupling between the mbb and the csa-1v. to avoid this, it is necessary to place an electrostatic shield between the mbb and the csa-1v. figure 17 shows the electrostatic shield implemented in the transducer with the rectangular hollow mbb.
the electrostatic shield is mounted over the top surface of the csa-1v and soldered to the ground pad on the pcb. the electrostatic shield reduces the sensitivity; to minimize this effect, it is necessary to employ an electrostatic shield with the shape shown in fig. 17.

fig. 17 transducer with the rectangular hollow mbb and the electrostatic shield.

the frequency response of the transducer with the rectangular hollow mbb and the electrostatic shield is shown in fig. 18.

fig. 18 frequency response of the transducer with the rectangular hollow mbb and the electrostatic shield.

by comparing the frequency responses of the transducer with and without the electrostatic shield (fig. 16 and fig. 18), a slight sensitivity decrease is evident.

4.1. mbb transducer with the magnetic shield

to protect the csa-1v from stray magnetic fields it is possible to employ the magnetic shield shown in fig. 19. the shield material must be selected carefully in order not to affect the transducer frequency response and linearity [16]. compared to the magnetic resistance of air, the magnetic resistance of the ferromagnetic shield is practically equal to zero. this means that the magnetic resistance of the magnetic circuit is reduced by a factor of two, i.e. the sensitivity is increased by a factor of two.

fig. 19 magnetic shield structure and transducer with the magnetic shield.

a side effect of the magnetic shield is that it may have hysteresis and a remanence magnetization, which can cause offset. the solution to this issue is to insert a layer of vitrovac beneath the magnetic shield. vitrovac absorbs the field produced by the remanence magnetization. the frequency response of the transducer with the magnetic shield (see fig. 19) is shown in fig. 20. the sensitivity for dc current is s = 57.8 mv/a.

fig. 20 frequency response of the transducer with the magnetic shield.
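the factor-of-two estimate for the magnetic shield and the "0.7" figure quoted for the 3 db point can be cross-checked against the reported dc sensitivities with a few lines of arithmetic. the sketch below is only such a check: the text does not state which mbb the shielded transducer uses, so the ratio is computed against all three unshielded values, and the function and variable names are hypothetical.

```python
def db_to_amplitude_ratio(db: float) -> float:
    """Convert a level in decibels to an amplitude (voltage) ratio."""
    return 10.0 ** (db / 20.0)


# DC sensitivities reported in the text, in mV/A.
UNSHIELDED_MV_PER_A = {
    "circular": 34.0,
    "rectangular": 35.0,
    "rectangular hollow": 28.36,
}
SHIELDED_MV_PER_A = 57.8


def shield_gain(unshielded_mv_per_a: float) -> float:
    """Ratio of the shielded sensitivity to a given unshielded sensitivity."""
    return SHIELDED_MV_PER_A / unshielded_mv_per_a


if __name__ == "__main__":
    print(f"-3 dB -> relative sensitivity {db_to_amplitude_ratio(-3.0):.3f}")
    for name, s in UNSHIELDED_MV_PER_A.items():
        # Which unshielded MBB is the right baseline is an assumption.
        print(f"{name}: shielded/unshielded = {shield_gain(s):.2f}")
```

if the shielded transducer is built on the rectangular hollow mbb, the ratio comes out close to two, in line with the magnetic resistance argument above; against the other two mbbs it is somewhat lower.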
implementing the magnetic shield does not affect the frequency response, which can easily be seen by comparing the frequency responses shown in fig. 16 and fig. 20.

5. current transducers based on a bobbin coil

another method to develop low current transducers based on the csa-1v is to increase the magnetic field around the csa-1v using a multi-turn coil (see fig. 21). in this way even currents in the order of several tens of milliamperes can be accurately measured. during the assembly, the csa-1v is mounted in the center of a bobbin so that the sensing element inside the csa-1v is in the middle of the bobbin, at equal distance from the top and bottom bobbin edges, as shown in fig. 21.

fig. 21 multi-turn coil and placement of the csa-1v inside the bobbin.

transducer sensitivity is dependent on the coil size and the number of turns. increased sensitivity and immunity to stray fields can be gained by shielding the coil. the bobbin provides very high dielectric isolation, making this a suitable solution for high voltage power supplies with relatively low currents. the output should be scaled to obtain the maximum voltage for the highest current to be measured in order to obtain the best accuracy and resolution. based on this method, transducers capable of measuring currents ranging from 250 ma to 10 a are produced. the structure of these transducers is the same (see fig. 22), with the only difference being the type of implemented coil. depending on the current range there are three types of coil implemented in the transducer:
1. for 250 ma current with 10 v/a sensitivity using 250 turns of awg34 wire;
2. for 2.5 a current with 1 v/a sensitivity using 24 turns of awg24 wire;
3. for 10 a current with 0.25 v/a sensitivity using 6 turns of awg18 wire.

fig. 22 transducer structure: 1. shields; 2. duct tape; 3. foil; 4. bobbin; 5. csa-1v.

the components shown in fig. 22 are fitted in a cubic box and properly sealed.
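the three coil variants listed above can be compared numerically. the sketch below takes the current ranges, sensitivities and turn counts from the list and derives two quantities that are not stated in the text: the full-scale output (range times sensitivity) and the sensitivity per turn. the names are hypothetical, and the per-turn figure assumes that the quoted sensitivity scales linearly with the number of turns, which the text does not confirm.

```python
# (nominal current in A, sensitivity in V/A, number of turns), from the list above.
COILS = [
    (0.25, 10.0, 250),
    (2.5, 1.0, 24),
    (10.0, 0.25, 6),
]


def full_scale_output_v(nominal_current_a: float, sensitivity_v_per_a: float) -> float:
    """Output voltage at the nominal (full-scale) current."""
    return nominal_current_a * sensitivity_v_per_a


def sensitivity_per_turn_mv(sensitivity_v_per_a: float, turns: int) -> float:
    """Sensitivity divided by the turn count, in mV/A per turn."""
    return 1000.0 * sensitivity_v_per_a / turns


if __name__ == "__main__":
    for current_a, sens, turns in COILS:
        print(
            f"{current_a:g} A range: "
            f"full scale {full_scale_output_v(current_a, sens):.2f} V, "
            f"{sensitivity_per_turn_mv(sens, turns):.1f} mV/A per turn"
        )
```

all three variants reach the same full-scale voltage, which is consistent with the remark that the output should be scaled to the maximum voltage for the highest current to be measured.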
a photo of the realized transducer is shown in fig. 23.

fig. 23 photo of the realized transducer.

the transducer can be adjusted to output either a bipolar or a unipolar voltage. the transfer functions for both output types are shown in fig. 24.

fig. 24 transfer function for bipolar and unipolar output.

when a transducer is inserted in a primary circuit its resistance plays an important role because it acts as an insertion resistance and can create an undesired voltage drop. for this reason, it is important to keep the transducer resistance as low as possible. the resistances of the realized transducers are 6 ω for 0.25 a, 0.06 ω for 2.5 a and 0.006 ω for 10 a.

6. bus bar current transducers

currents ranging up to a few thousand amperes can be measured in a similar way to that presented in the previous two methods. here, instead of employing a pcb trace or an mbb, the idea is to conduct the current through an electrolytic copper bus bar and to fit the csa-1v in the middle of the bus bar to measure the current. rather than employing only one csa-1v, effective cancellation of stray fields without magnetic cores or shielding can be achieved by employing two csa-1vs. for this reason, the bus bar transducer is realized using the anti-differential configuration of two csa-1vs shown in figs. 8 and 9. a photo of the realized bus bar transducer is shown in fig. 25.

fig. 25 copper bus bar.

as stated above, the skin effect within massive rectangular conductors such as the bus bar shown in fig. 25 can be manifested at very low frequencies, in the order of several tens of hz. the skin effect has a major impact in rectangular bus bars [17], [18], with one of the major issues being a redistribution of the magnetic flux density [19]. for this reason, it is necessary to evaluate the transducer under dc and ac current.
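to give a feeling for why the skin effect can already matter at a few tens of hz, the sketch below evaluates the standard textbook skin depth of a good conductor, δ = sqrt(ρ / (π f μ0 μr)). this formula and the copper resistivity used (1.68e-8 ω·m at room temperature) are not taken from the paper; they are assumptions introduced here, and the result is only an order-of-magnitude guide because the rectangular bus bar geometry is not modelled.

```python
import math

MU_0 = 4.0e-7 * math.pi      # vacuum permeability, H/m
RHO_COPPER = 1.68e-8         # assumed copper resistivity at room temperature, ohm*m


def skin_depth_m(frequency_hz: float,
                 resistivity_ohm_m: float = RHO_COPPER,
                 relative_permeability: float = 1.0) -> float:
    """Textbook skin depth of a good conductor, in metres."""
    if frequency_hz <= 0:
        raise ValueError("frequency_hz must be positive")
    return math.sqrt(
        resistivity_ohm_m / (math.pi * frequency_hz * MU_0 * relative_permeability)
    )


if __name__ == "__main__":
    for f in (20.0, 50.0, 250.0, 1000.0):
        print(f"{f:6.0f} Hz -> skin depth about {1000.0 * skin_depth_m(f):.1f} mm")
```

at power-line frequencies the computed depth is in the millimeter-to-centimeter range, which is comparable to the cross-section dimensions that massive bus bars can have; this is in line with the observation that current redistribution starts at very low frequencies.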
frequency and phase measurements are conducted using the dc current source with modulation from 1 hz to 250 hz and using the power ac current source. the measurement results are shown in fig. 26. the blue curve is obtained using the dc source while the red curve is obtained using the ac source. the frequency ranges of the two current sources partly overlap.

fig. 26 frequency and phase responses of the bus bar transducer.

based on these measurements, it is evident that the skin effect becomes significant for frequencies higher than 20 hz. therefore, it is unnecessary to use the dc current source, and every subsequent measurement is performed using the ac current source. on the phase response graph (see fig. 26), the blue curve is obtained using the dc current source, the red curve is obtained using the ac current source, while the green curve represents the phase response of the csa-1v, which has a dominant role at high frequencies. the transducer phase response at low frequencies is influenced by the bus bar and its surroundings. the issue with the frequency dependence of the transducer sensitivity can be resolved by implementing at least one of the following methods or a combination of them:
1. unsymmetrical placement of the csa-1vs with regard to the bus bar;
2. application of a magnetic filter;
3. application of an electronic filter;
4. cutting out notches in a bus bar in order to produce a restrictive region.

6.1. unsymmetrical placement of the csa-1vs

to evaluate the effect of the position of the csa-1vs on the bus bar, a series of measurements is performed in which both csa-1vs are placed at the same distance from the middle of the bus bar, as illustrated in fig. 27.

fig. 27 csa-1v positions on the bus bar.

since the skin effect forces current to flow along the outer edges of the bus bar, the idea is to find a suitable position for mounting the csa-1vs, at which the field changes originating from current redistribution are the smallest. the measurement results of this experiment are shown in fig. 28.

fig. 28 frequency and phase responses of the bus bar transducer with unsymmetrical placement of the csa-1vs.

based on these measurements, the ideal position to mount the csa-1vs is where the frequency response is the flattest.

6.2. magnetic filter

the frequency response can be improved using a passive method based on mounting a massive flat conductor above the bus bar and the csa-1v. eddy currents induced in this conductor will cancel the primary magnetic field. consequently, the magnetic field lines will bypass the conductor. instead, they will concentrate between the bus bar and the conductor mounted above the csa-1v. moreover, the current distribution in the bus bar with the conductor mounted above will not be the same as in the case without the conductor, i.e. the current density in the bus bar will be higher on the side closer to the conductor. to obtain a flat frequency response, the copper magnetic filter is employed in the way shown in fig. 29.

fig. 29 application of the magnetic filter on the transducer.

the frequency and phase responses of the bus bar transducer with the magnetic filter (see fig. 29) are shown in fig. 30.

fig. 30 frequency and phase responses of the bus bar transducer with the magnetic filter.

based on the measurements shown in fig. 30, it is evident that the magnetic filter reduces the sensitivity drop caused by the skin effect, i.e. it increases sensitivity. as the sensitivity drop caused by the skin effect is roughly 40%, it is obvious that the magnetic filter reduces the initial impact of the skin effect by 10%.
the magnetic filter also improves the phase response.

6.3. electronic filter

fig. 31 shows the electrical schematic of the bus bar transducer. summation of the outputs from the two csa-1vs is performed using a differential amplifier ad623 with unity gain.

fig. 31 schematic of the bus bar transducer.

the idea of employing an electronic filter to obtain a flatter transducer frequency response is to connect a resistor and a capacitor in series in place of the gain defining resistor rg. the resistor is selected so that the amplifier gain compensates the output signal decrease caused by the skin effect. the capacitor is selected so that its impedance begins to decrease when the skin effect begins to have an impact, meaning that its impedance is practically zero when practically the entire current flows along the outer edges of the bus bar. on this basis, a 220 kω resistor and a 2.2 nf capacitor are selected. the frequency and phase responses of the bus bar transducer with the electronic filter are shown in fig. 32.

fig. 32 frequency and phase responses of the bus bar transducer with the electronic filter.

based on these measurements it is evident that the electronic filter reduces the sensitivity drop, at the same time improving the phase response.

6.4. bus bar with the restrictive region

by having notches cut out in a bus bar (see fig. 33), a nearly circular cross section of the restrictive region is obtained. for conductors with a circular cross section, redistribution of the current density does not affect the distribution of the magnetic field around the conductor. in this way the lateral skin effect is minimized.

fig. 33 bus bar with the restrictive region [20].

since the ac current flows through the restrictive region of the bus bar, the magnetic flux density around the restrictive region is greater than around the rest of the bus bar.
in addition to this, the combination of the anti-differential configuration of hall effect sensors and a notched bus bar provides better immunity to stray magnetic fields, mainly because the hall effect sensors are close to each other. however, it should be noted that having the notches cut out may cause overheating at the restrictive region [20].

6.5. optimized bus bar current transducer

in order to improve the frequency response, i.e. to obtain a flat frequency response, an optimized bus bar current transducer combining the first three of the previously presented methods is realized. the electronic filter is composed of a 330 kω resistor and a 2.2 nf capacitor, the csa-1vs are mounted 5 mm away from the middle of the bus bar and the magnetic filter is applied. overall, the obtained amplitude error is less than 1%, as shown in fig. 34.

fig. 34 frequency and phase responses of the optimized bus bar transducer.

the effect of the applied methods can be easily spotted on the frequency response in fig. 34, because they result in a 55% better frequency response compared to the transducer without compensation.

6.6. braid bus bar

another way to minimize the skin effect is to employ a braid bus bar, such as the one shown in fig. 35, rather than a plain bus bar. the application of a braid bus bar, consisting of thin insulated wires, results in a spatial averaging of the current density so that the distribution of the magnetic field around the conductor is not frequency dependent.

fig. 35 braid bus bar.

the disadvantage of this solution is that it is not easy to achieve a rigid attachment between a flexible braid and a hall effect sensor. movement of a hall effect sensor relative to a braid bus bar results in a sensitivity change. therefore, if necessary, this issue must be properly addressed.

6.7. current transducer with magnetic shielded conductor

the skin effect in rectangular bus bars can be minimized or even eliminated with partial shielding of the bus bar. the idea is to fit ferromagnetic plates, shaped like the letter "c", on the side edges of a bus bar as shown in fig. 36.

fig. 36 partial shielding of the bus bar [21].

fig. 36 illustrates the current density distribution in a bus bar without (left bus bar) and with a partial magnetic shield (right bus bar). this method is presented in patent [21] and discussed in paper [22]. optimization of the size and shape of the ferromagnetic shields can result in a significantly better transducer frequency response while keeping the dimensions of the bus bar the same. minimization of the skin effect reduces heating of the bus bar. magnetic structures presented in [1], [2] also minimize ac resistance, which can be useful for some applications.

7. conclusion

this paper reviews several types of coreless open-loop current transducers based on the hall effect sensor csa-1v, capable of measuring ac and dc currents ranging from several tens of milliamperes up to several hundreds of amperes. during the development of each transducer special attention was paid to solving problems related to the frequency response. in addition, attention was paid not to disrupt the linearity and to achieve satisfactory immunity to stray magnetic fields. another goal of this paper is to expand the scope of use of the realized transducers by providing useful guidelines for designers faced with the challenges of current measurement using hall effect sensors. the first experiments were related to the mbb transducers suitable for current measurements up to several tens of amperes. with mbbs the skin effect becomes noticeable at frequencies greater than 10 khz. the issue with the skin effect has been overcome by drilling a hole in the bus bar.
the issue with stray magnetic fields has been overcome by implementing a ferromagnetic shield and an anti-differential configuration of two csa-1vs. the second experiments were related to the transducers based on massive copper bus bars with cross sections which can handle currents up to several hundreds of amperes. these solutions employ different ways of positioning the csa-1vs relative to the bus bar, the application of a magnetic filter and the application of an electronic filter. some of the presented solutions for frequency response improvement are novel and have not previously been described in the literature. acknowledgement: the research presented in this paper is financed by the ministry of education, science and technological development of the republic of serbia under the projects tr32057 and tr33035. references [1] r. s. popović, hall effect devices. institute of physics publishing, bristol and philadelphia, 2004. [2] honeywell inc., "hall effect sensing and application," micro switch sensing and control, 2002. [3] d. r. popović, s. dimitrijević, m. blagojević, p. kejik, e. schurig, r. s. popović, "three-axis teslameter with integrated hall probe free from the planar hall effect," in proc. of the instrumentation and measurement technology conference, sorrento, italy, no. 6384, pp. 24-27, 2006. [4] d. r. popović, s. dimitrijević, m. blagojević, p. kejik, e. schurig, r. s. popović, "three-axis teslameter with integrated hall probe," ieee transactions on instrumentation and measurement, vol. 56, issue 4, pp. 1396-1402, 2007. [5] d. m. veličković, s. r. aleksić, "a numerical procedure for solving skin effect integral equation in thin strip conductors," facta universitatis, series: electronics and energetics, vol. 14, no. 2, pp. 253-270, 2001. [6] m. greconici, g. madescu, m. mot, "skin effect analysis in a free space conductor," facta universitatis, series: electronics and energetics, vol. 23, no. 2, pp.
207-215, 2010. [7] r. s. popović, z. randjelović, d. manić, "integrated hall-effect magnetic sensors," sensors and actuators a: physical, vol. 91, pp. 46-50, 2001. [8] r. s. popović, p. m. drljača, p. kejik, "cmos magnetic sensors with integrated ferromagnetic parts," sensors and actuators a: physical, vol. 129, pp. 94-99, 2006. [9] sentron, csa-1v datasheet, 2008. [10] sentron, "current sensing with the csa-1v," application note, 2008. [11] m. blagojević, d. mančić, "programator strujnih i 2d magnetnih senzora," in proc. of the infoteh-jahorina 2007, jahorina, bosnia and herzegovina, no. e-vi-10, 2007, in serbian. [12] m. blagojević, s. dimitrijević, "programiranje strujnih senzora csa-1v i statisticka analiza," in proc. of the xiii conference yu info 2007, kopaonik, serbia, pp. 11-14, 2007, in serbian. [13] m. blagojević, m. radmanović, "uređaj za kalibraciju strujnih senzora," in proc. of the infoteh-jahorina 2007, jahorina, bosnia and herzegovina, no. e-vi-11, 2007, in serbian. [14] z. petrušić, i. jovanović, lj. vračar, d. mančić, m. blagojević, "a wirelles solution of measurement-control system for photovoltaic application," in proc. of the unitech’10 international scientific conference, gabrovo, bulgaria, vol. 1, pp. 114-122, 2010. [15] m. blagojević, z. petrušić, d. mančić, m. radmanović, "termička analiza strujne sonde bazirane na senzoru csa-1v," in proc. of the xiii međunarodni simpozijum energetska elektronika ee 2005, novi sad, serbia, no. t4-4.8, pp. 1-5, 2005, in serbian. [16] p. ripka, "current sensors using magnetic materials," journal of optoelectronics and advanced materials, vol. 6, no. 2, pp. 587-592, 2004. [17] v. belevitch, "the lateral skin effect in a flat conductor," philips technical rev. 32, pp. 221-231, 1971. [18] j. zhou, a. m. lewis, "thin-skin electromagnetic fields around a rectangular conductor bar," journal of physics d: applied physics, vol. 27, pp. 419-425, 1994. [19] i. popa, a.-i.
dolan, "numerical modeling of dc busbar contacts," facta universitatis, series: electronics and energetics, vol. 24, no. 2, pp. 209-219, 2011. [20] m. blagojević, d. mančić, i. jovanović, z. petrušić, "current ampacity of bus-bar with neck for application in current transducers," in proc. of the unitech’10 international scientific conference, gabrovo, bulgaria, vol. 1, pp. 123-127, 2010. [21] j. s. gallina, m. brand, "magnetic structure for minimizing ac resistance in planar rectangular conductors," patent us6105236a. [22] t. mizuno, s. enoki, t. suzuki, t. asahina, m. noda, h. shinagawa, "reduction in eddy current loss in conductor using magnetoplated wire," ieej transactions on fundamentals and materials, vol. 127, no. 10, pp. 611-620, 2007. instruction facta universitatis series: electronics and energetics vol. 29, no 3, september 2016, pp. 395-405 doi: 10.2298/fuee1603395v a new telerehabilitation system based on internet of things sanja vukićević 1, zoran stamenković 2, san murugesan 3, zorica bogdanović 1, božidar radenković 1 1 faculty of organizational science, university of belgrade, serbia 2 ihp, frankfurt (oder), germany 3 brite professional services and western sydney university, australia abstract. internet of things (iot) applied in healthcare systems has huge potential to improve patients' quality of life. representing a network of devices embedded with electronics and sensors, iot enables constant monitoring of vital body functions and tracking of a person's physical activities, and aids rehabilitation physical therapy. such an iot-based system would allow a standalone recovery process, minimizing the need for dedicated medical personnel, and could be used in both hospital and home conditions. in this paper, we present a telerehabilitation system that uses a wearable muscle sensor and microsoft kinect to create interactive personalized physical therapy that can be carried out at home.
early experiments and results of pilot implementation validate the feasibility and effectiveness of the proposed iot-enabled telerehabilitation system. key words: telerehabilitation, muscle sensor, kinect, wearable sensor, telemedicine, physical therapy received june 30, 2015; received in revised form october 27, 2015 corresponding author: sanja vukićević, faculty of organizational science, university of belgrade, jove ilića 154, 11000, belgrade, serbia (e-mail: sandzii@gmail.com) 1. introduction internet of things (iot) is a contemporary technology with the potential to alter or replace various methods of classical medicine [1] and improve healthcare. the advantage of measuring physical parameters using iot devices instead of conventional ones is that the connected intelligent iot devices can carry out measurements independently, and carry out a specific action based on the measurement results. also, the results of measurements are available via the internet and can be recorded in electronic form, enabling medical personnel to monitor the patient’s state from any location at any time. the most common application of iot in healthcare is in wellness, using devices for measuring daily activities such as walking, running or riding a bicycle. telerehabilitation is recognized as a necessary form of treatment for numerous neurological, neuromusculoskeletal, cardiovascular and other conditions [2][3][4]. the number of people requiring telerehabilitation is rising. for example, according to the world health organization report [5], 5 million people survive stroke each year and half of them remain with hemiparesis (weakness of one side of the body). medical and rehabilitation institutions are usually limited in space and personnel, so patients are forced to continue practicing physical therapy at home.
expenses for traveling to rehabilitation centres for daily therapy are not insignificant for disabled persons, which contributes to the need for telerehabilitation. in this article, we present the design of a telerehabilitation system based on iot, which enables effective physical therapy to be carried out remotely, gives competent medical personnel insight into the recovery process from a remote location, and provides interaction of therapists with the patient via communication technologies. special attention is given to fostering the patient's motivation to repeat the same group of exercises daily through serious games. 2. related work and motivation application of iot in healthcare spans a few different areas: physiological monitoring, ambient assisted living and well-being solutions. however, there is a lack of research and experiments on iot usage in assisting and measuring the performance of physical therapy. iot in rehabilitation therapy should ensure a wealth of information that can produce actions based on defined algorithms [6]. different kinds of sensors designed for healthcare, like a muscle sensor that measures muscle activation via electrical potential and devices specialized for skeleton detection and tracking, are used in this rehabilitation model of physical therapy for obtaining feedback on the correctness of the performed exercises [7][8]. readings from sensors are also used for the creation of future exercises and adaptation of interactive physical therapy to the patient's needs [9]. an industrial example of a physiological monitoring system is bodyguardian by preventice, based on a band-like sensor patch placed on the patient's body. the sensor is powered by batteries, which enables mobility of the patient, and is connected to a smartphone device. the smartphone transmits data to a cloud-based health platform which further delivers data and alerts medical personnel.
cloud is a logical choice for such a system, as it does not burden patients with configuration of the telerehabilitation system [10]. another example of an iot device, developed for monitoring vital functions like heart and pulse rate, oxygen saturation, blood pressure, and skin temperature, is visi mobile by sotera wireless inc. visi mobile communicates with the e-health system using 802.11 with wpa2-psk security, which protects the wireless communication channel. ambient-assisted living represents a technical system for supporting elderly people in their daily routine to allow an independent and safe lifestyle. sensors in those systems are wearable (for example, accelerometer or gyroscope) or fixed (proximity) and they gather data in order to monitor patient activities or detect a fall in the patient's living environment [11]. one of the biggest iot growth areas is measuring individual health metrics and wellbeing through self-tracking wearable gadgets. the use of wearable sensors, together with suitable applications running on smartphone devices, enables people to track their daily activities (steps walked, running performance, calories burned, exercises performed, etc.), providing suggestions for enhancing their lifestyle. combining all three groups of application of iot in healthcare, it is possible to create a model of telerehabilitation designed for physical therapy. physiological measurements of interest in rehabilitation include heart rate, respiratory rate, blood pressure and blood oxygen saturation. parameters extracted from such measurements can provide indicators of patients' health status. in physical therapy, however, the focus is more on measuring and stimulating muscle activity using muscle sensors. a muscle sensor connected to an arduino microcontroller presents a low-cost, low-power solution for gathering electromyography (hereinafter: emg) data of skeletal muscle.
emg is traditionally used for medical research and diagnosis of neuromuscular disorders. repeating the same exercises in a long-term therapy may lead to saturation and skipping therapy. it is therefore important to constantly maintain the motivation of the patient. a serious game is a type of game designed for a special purpose in health, education, defence, engineering, and other industries [12]. although serious games should be entertaining, their main purpose is to train or educate users. recent research [13] shows that the cognitive and motor activity required by video games engages the user’s attention. in addition, users are focused on playing the game, which helps them forget that they are performing therapy. microsoft kinect was recognized as a low-price and clinically practical body sensing device to be applied in rehabilitation [14]. kinect can track a body part and can also reproduce the 3d space with the player in front of it, which enables the creation of virtual reality games. therefore, kinect is the basis of most interactive game-based rehabilitation solutions. physical therapy exercises are performed while playing games, which aim to facilitate the implementation of therapy [15][16]. there are several clinically tested solutions of physical therapy using the kinect sensor [17][18]. in the magic mirror neurorehabilitation clinical trial [19], kinect positively influenced the process of rehabilitation. in [20] five rehabilitation games using kinect were evaluated, also with a positive outcome. one great example of the application of kinect in rehabilitation is the virtualrehab solution, developed by the spanish company virtualware, which consists of a web based control centre administrator software platform and several games designed for kinect (http://www.virtualrehab.info). the control centre is used by therapists to prepare a plan of exercises and to monitor and assess the progress of therapy. 3.
telerehabilitation system architecture as a substitute for physical therapy conducted in medical institutions, telerehabilitation therapy should include the same scope of exercises, but without the physical presence of a physiotherapist. in order to lead a patient through a therapy session, the system must have a virtual assistant in the form of a web based application and a set of games tailored specifically for the patient. the telerehabilitation system architecture based on iot is configured in two segments: a home based segment and a cloud based software as a service segment (see fig. 1). the home based segment requires components such as the kinect body tracking sensor and the muscle sensor to be installed and set up in the patient's living environment. also, the patient must have a personal computer with an internet connection in order to receive therapy sessions and to send data collected by sensors. fig. 1 components of telerehabilitation system (diagram labels: home based environment; patient; kinect; muscle sensor; wired to laptop; patient's laptop; cloud saas rehab platform; virtual cloud servers; collect data from a user; data storage; analytics; medical rehabilitation center; physiatrist; physiatrist's smartphone/tablet). the software as a service cloud based segment serves several purposes: it provides software for telerehabilitation therapy, collects feedback after performed therapy for each patient, analyses collected data and represents it in comparative and progressive form, and allows physiatrists to follow the patient's condition remotely and manage further therapies. 3.1. intelligent sensors and actuators the role of sensors and actuators in telerehabilitation is twofold: to diagnose the patient's physical abilities based on measurements and to use the read values for adapting and tailoring the rehabilitation game to the patient's mobility.
the sensor applied in the pilot implementation of this model is the muscle sensor v3, an electromyography sensor for microcontroller applications with three electrodes, connected to an arduino microcontroller. using the muscle sensor, it is possible to measure muscle activation via the electrical potential (emg) by placing electrodes in three positions: in the middle of the muscle, at the end of the muscle and on a bony part near the muscle. before placing the electrodes it is necessary to clean the skin using alcohol. this step is mandatory in order to provide a better grip of the electrodes and reduce the electrical resistance of the skin. proper placement of emg electrodes is crucial for accurate measurement of muscle contraction. unfortunately, if there is more body fat over a muscle, the emg signal will be weaker and difficult to record. the motivation for using the muscle sensor in pilot telerehabilitation of physical therapy is the need to strengthen muscles and also to measure the progress of strengthening muscles, depending on the type of exercises. for example, if the patient is required to alternately contract and relax the muscle, they will experience it as an effort, compared to a situation when they are performing the same actions unconsciously while playing a game. in the second case, the patient will probably perform more repetitions of muscle contractions and relaxations because they are unaware of those actions. based on the above, the use of the muscle sensor in rehabilitation games should lead to improvements in the patient's muscle structure. fig. 2 example of reading data from the muscle sensor with electrodes connected to the microcontroller arduino uno. relaxed biceps places blue slider to 0 (left). contracted biceps places blue slider to a specific value measured by the sensor (right) figure 2 shows the connection of the muscle sensor with the arduino uno microcontroller.
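the slider behaviour in fig. 2 can be illustrated with a short python sketch. it is not the authors' software: it assumes that the sensor output is read on an analog input of the arduino uno (10-bit converter, default 5 v reference) and that the raw count has already been received by the computer. the function names and the 1.0 v contraction threshold are hypothetical.

```python
ADC_MAX = 1023   # Arduino Uno analog inputs return 10-bit counts (0..1023)
V_REF = 5.0      # ASSUMED: default 5 V analog reference


def adc_to_volts(raw):
    """Convert a raw 10-bit ADC count to volts."""
    if not 0 <= raw <= ADC_MAX:
        raise ValueError("raw ADC count out of range")
    return raw * V_REF / ADC_MAX


def is_contracted(raw, threshold_v=1.0):
    """Return True when the reading exceeds a HYPOTHETICAL contraction threshold.

    The threshold has to be tuned per patient and per muscle; 1.0 V is only a
    placeholder for illustration.
    """
    return adc_to_volts(raw) >= threshold_v


# A relaxed muscle reads close to 0 V, a contracted one reads higher.
for raw in (0, 150, 512, 1023):
    print(raw, round(adc_to_volts(raw), 2), is_contracted(raw))
```

keeping the threshold as a parameter reflects the remark that the measured values depend on the physical condition of the patient.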
the sensor requires a 9 v supply, and since arduino uno can only provide an operating voltage of 5 v, the muscle sensor must be powered by two 9 v batteries. in this setup, arduino uno is connected to a computer using a serial connection, but it is preferable to switch to a wireless connection using the arduino wifi shield. arduino would use the 802.11 b/g/n protocol for communication with the application on the laptop computer and, as a result, the patient would not be limited in space. in the setup shown in fig. 2, the muscle sensor is placed on the biceps, and values representing muscle contractions are displayed on the monitor. when the biceps muscle is relaxed, the blue slider (third rectangle on the left side of the figure) shows a voltage of 0 v. when the biceps contracts, the blue slider (third rectangle on the right side of the figure) shows a voltage higher than 0 v. the voltage values depend on the physical condition and muscle function. 3.2 body tracking sensor the system was implemented using the kinect body tracking sensor, which consists of an rgb camera, an infrared (ir) camera and an ir projector. the rgb camera is a standard colour camera. the ir projector emits infrared rays which bounce off objects and return to the ir camera, allowing the distance between kinect and the objects to be measured [21]. this feature is useful in the creation of therapy when the patient has to position the hand in front of or next to his/her body. together, the three components (rgb camera, ir camera and ir projector) allow the creation of 3d images. with the depth stream, it is possible to estimate human motion in real-time. however, the acquired depth data can be quite noisy and the image can contain pixels with no depth because of multiple reflections. to cope with that, denoising is mandatory. for further information about denoising, refer to paper [22], which presents a new data-driven denoising technique. kinect can distinguish parts of the body and it can determine the position and orientation of the body. validity and reproducibility are important characteristics of this device [23], which make it applicable in telerehabilitation applications. in addition to objects, kinect can detect sound. this feature is not sufficiently exploited, although kinect can determine the source of sound very accurately. fig. 3 home based equipment of telerehabilitation system, schematic view (diagram labels: standalone pc; kinect; patient's laptop; arduino microcontroller; connected with cable to analog input of microcontroller; rest ws; cable adapter for kinect; muscle sensor; wireless 802.11; wlan). fig. 3 represents a schematic view of the equipment required for the home based segment of telerehabilitation and shows the type of communication protocols and device connection. 3.3 software as a service nowadays, cloud-based data storage, computation, software, platform and computing infrastructure are widely used for many different applications. using content and services from the cloud eliminates the time and costs of buying hardware and installing and maintaining software. with cloud infrastructure, health monitoring systems become low-cost, platform-independent, and rapidly deployable. applications deployed via the cloud can be easily updated without forcing a patient to install any software on their devices, thus making system maintenance quick and cost effective. in fig. 4 we propose the concept of a telerehabilitation platform based on the software as a service cloud model containing four services: telerehabilitation application for patient, setup and analytics application for therapist, game session manager and processor for sensor information, database for persisting of rehabilitation information and web and application servers for running the above described services.
we envisage integration of the telerehabilitation system with a medical information system, using information from the patient's electronic health record (ehr) for precise diagnosis. fig. 4 proposed concept of cloud based software as a service rehabilitation platform (diagram labels: patient; therapist; medical is; ehr; diagnosis; adapted rehab game; setup & preview; rehab system; rehab db; setup and analytic software for therapist; rehabilitation software; saas rehab system; sensor information processor; game session manager). in this system, kinect is used in rehabilitation therapy to determine the patient’s mobility and limitations before the beginning of therapy. by testing the patient's limitations in movements (for example, the height to which he/she can raise the affected hand or move it to the right, left, or bend it) and the actual speed of movement, a set of parameters is obtained, upon which the system may recommend a list of games. during therapy, body position is very important, and it is detected and recorded by kinect, because the patient may, for example, tilt to the right if they find it too hard to lift their left arm. after the patient finishes the session, raw data gathered from the muscle sensor and kinect are sent to the cloud saas application server. the received data are filtered and transformed into meaningful information, linked to the patient and stored in cloud data storage. a new session is then provided to the user until they decide to finish the therapy. data stored for each performed session can be used for various purposes and benefits. analytic software for therapists may present diagrams of the patient's performance and progress. the great amount of gathered data gives an opportunity for medical data mining and opens the door to a vast source of medical data analyses.
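the "filtered and transformed into meaningful information" step is not specified in detail. as one possible reading, the python sketch below smooths a sequence of emg voltages with a trailing moving average, counts contractions as upward threshold crossings and builds a per-session summary linked to a patient identifier. all names, the window length and the threshold are assumptions made for illustration, not part of the described platform.

```python
def moving_average(samples, window=5):
    """Smooth a sequence with a trailing moving average of the given window."""
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    for i in range(len(samples)):
        chunk = samples[max(0, i - window + 1):i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def count_contractions(envelope, threshold):
    """Count upward crossings of the threshold (one per contraction)."""
    count = 0
    above = False
    for value in envelope:
        if value >= threshold and not above:
            count += 1
        above = value >= threshold
    return count


def summarize_session(patient_id, emg_volts, threshold=1.0, window=5):
    """Build a HYPOTHETICAL per-session record linked to the patient."""
    envelope = moving_average(emg_volts, window)
    return {
        "patient_id": patient_id,
        "samples": len(emg_volts),
        "peak_v": max(envelope) if envelope else 0.0,
        "contractions": count_contractions(envelope, threshold),
    }


# Synthetic example data: two bursts of muscle activity.
readings = [0, 0, 3, 3, 3, 0, 0, 0, 0, 0, 3, 3, 3, 0, 0]
print(summarize_session("patient-001", readings, threshold=1.0, window=3))
```

a record of this kind is small enough to store per session and can later feed the progress diagrams mentioned above.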
finding patterns in the impact of a certain exercise on the recovery of a lost physical function, together with classification and prediction, will create a knowledge base able to recommend a set of sessions to any new or existing patient. in order to promote the physical and psychological condition of the patient, training sessions should be designed for a specific type of disability. based on the stroke statistics report [24], conducted in the united kingdom, 77% of post-stroke patients have upper limb disabilities, and 72% have lower limb disabilities. according to these findings, the first trial game (fig. 5) is designed for practicing motor skills and coordination of the stroke affected hand, especially elbow and shoulder. fig. 5 serious game for hand and elbow rehabilitation consisting of virtual box with green barrier and twenty balls the trial game is designed to contain only essential elements, without details that could distract the patient. the elements of the game are a virtual box filled with virtual balls located at one side of the box and an adjustable barrier placed in the middle of the box, separating the box into two parts. the patient's movements are tracked using the kinect body sensor and their task is to take a virtual ball by placing their palm at the ball position and to drag the ball to the opposite side of the box, over the virtual barrier. the barrier height is adjustable in order to match the patient's capabilities. if a patient is able to contract any muscle of the stroke affected hand, muscle sensor usage in the game should be encouraged because it is expected to increase muscle strength and endurance. in a trial game with a virtual box, emg muscle signals can be used for grabbing the ball when the palm covers it and for releasing the ball after passing the barrier. 4.
patient trials and results in order to test the telerehabilitation system in the patient's environment and the patient's reaction to the new type of therapy, we set up home based equipment containing kinect, arduino uno and the muscle sensor in the patient's home environment. the patient was requested to play an interactive telerehabilitation game with a virtual box, and the results of playing were recorded on the patient's computer and uploaded to the remote computer. the goal of this trial was to test one part of the proposed telerehabilitation system: the interactive telerehabilitation game. the interactive serious game has been tested on a single patient during a one month pilot trial. the patient is a 60 year old male, who sustained a right hemisphere stroke a year before trial testing and as a consequence has hemiparesis of the left side of the body. the mobility of his left hand is very low and the goal is to increase it. one week prior to the start of the rehabilitation trial, the patient was introduced to the basic rules of playing the virtual telerehabilitation game with a virtual box. the patient played the game using the stroke affected hand over a three week period, five to six days per week, one hour per day. unfortunately, the patient was unable to close the fist and therefore contract the biceps, so the readings from the muscle sensor are omitted. the virtual ball is considered captured when the patient holds the hand over the ball position for several seconds, and the ball is considered released when the patient’s hand passes the barrier and a half of the box after the barrier. the results obtained after the three week telerehabilitation period was completed are shown in fig. 6. that figure shows that transferring balls from the left to the right side lasts longer, which confirms that the patient focuses more slowly on the left, stroke affected side.
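the capture and release rule used in the trial (hold the hand over the ball for several seconds, release after passing the barrier and half of the box behind it) can be sketched as a small state machine. the python code below is an illustration only: the class name, the two boolean inputs and the 2 s default hold time are assumptions, since the paper only says "several seconds" and does not describe the game code.

```python
from dataclasses import dataclass


@dataclass
class BallTracker:
    """Dwell-based capture and position-based release of one virtual ball."""
    hold_seconds: float = 2.0   # ASSUMED value; the paper says "several seconds"
    held: bool = False
    _hover_time: float = 0.0

    def update(self, hand_over_ball, hand_past_release_line, dt):
        """Advance the tracker by dt seconds and return the current event.

        hand_over_ball: True while the tracked hand covers the ball position.
        hand_past_release_line: True once the hand has passed the barrier and
            half of the box behind it.
        """
        if not self.held:
            if hand_over_ball:
                self._hover_time += dt
                if self._hover_time >= self.hold_seconds:
                    self.held = True
                    return "captured"
            else:
                self._hover_time = 0.0   # dwell must be continuous
            return "idle"
        if hand_past_release_line:
            self.held = False
            self._hover_time = 0.0
            return "released"
        return "carrying"


# Example with 0.5 s frames and a 1 s hold time.
tracker = BallTracker(hold_seconds=1.0)
frames = [(True, False), (True, False), (False, False), (False, True)]
print([tracker.update(over, past, 0.5) for over, past in frames])
```

the same structure would accept an emg-based grab signal in place of the dwell timer for patients who are able to contract the muscle.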
comparing the measurements in the first five days and the last five days of the session, the duration of the session was reduced by 27% when moving twenty balls from right to left, and by 15% when moving them from left to right. fig. 6 progress in playing the game between the first and the last day of telerehabilitation home based trial 5. discussion the results of the pilot trial of a three week telerehabilitation session using a serious game showed noticeable improvements in rehabilitation. after the trial, the patient showed an increase of concentration, faster reflexes, higher mobility of the affected hand and better focus on the left side when reading. the trial reveals that compared with the in-clinic rehabilitation process, the telerehabilitation process offers several benefits. in-clinic rehabilitation is by its nature repetitive and command based, which may reduce patient motivation. serious game telerehabilitation tends to demand movements based on purpose (pick the object, move the object, clean the surface, etc.) and tries to motivate the patient to achieve a better score in every game iteration. traditional rehabilitation requires one therapist per patient and both have to be in the same place. in a telerehabilitation therapy session, one therapist can lead several patients, a session can be designed in advance for each patient and the therapist and patient can be miles apart. thus, the telerehabilitation model reduces travelling costs and reduces the time a therapist spends preparing a single patient's therapy, compared to the time when the therapist works with one patient, showing exercises and waiting for the patient to complete them. the distribution of therapists over the territory is usually uneven, with higher concentration in urban regions and city centres, and there is a lack of skilled therapists in rural and remote locations.
it is exactly here that the telerehabilitation model can contribute most, by providing an opportunity for each patient to be treated equally well. however, the telerehabilitation model presented here is highly dependent on internet accessibility and the availability of the kinect body sensor, muscle sensor and personal computers. also, patients or their caregivers should have basic computer knowledge in order to set up the telerehabilitation equipment. the validity and reliability of the kinect body sensor have already been tested and confirmed [23][25]. kinect detects the body skeleton automatically, but it requires at least two square meters of clear space. compared to kinect, the muscle sensor is not that simple to calibrate and to set up properly. the emg signal is usually very weak, which may require repeated measurements. to improve the emg signal stability, the muscle sensor should be placed on a large muscle. the second potential problem regarding the muscle sensor is noise. interaction between the electrolytes in the skin and the metal of the detection surfaces of the electrode can produce noise. noise can be reduced by employing conductive electrolytes to improve the contact with the skin and also by removing dead skin from the surface. 6. conclusion the telerehabilitation model described in this paper allows for a faster recovery of patients who have survived a stroke. the kinect device is used as a sensor for detection and tracking of body movements. the muscle sensor records the muscle strength. the cloud architecture model enables building a stable, scalable, reliable, cost effective and easy to use telerehabilitation system. this telerehabilitation system eliminates the need for mandatory presence of the therapist and enables a patient to perform post-stroke rehabilitation therapy at home, reducing the cost of treatment. the therapist has access to the patient’s virtual records and may check his/her activities and remotely guide the therapy at any moment.
this model, developed for research and experiment purposes, serves as a foundation for creating a product that can be widely used in post-stroke telerehabilitation and evaluation of the recovery degree. acknowledgement: the authors would like to thank the ministry of education, science and technological development, republic of serbia, for financial support under project number 174031. references [1] y. jog, a. sharma, k. mhatre and a. abhishek, "internet of things as a solution enabler in health sector", international journal of bio-science & bio-technology, vol. 7, no. 2, pp. 9-24, 2015. [2] j. langan, k. delave, l. phillips, p. pangilinan and s.h. brown, "home-based telerehabilitation shows improved upper limb function in adults with chronic stroke: a pilot study", journal of rehabilitation medicine, vol. 45, no. 2, pp. 217-220, 2013. [3] l.r. tindall and r.a. huebner, "the impact of an application of telerehabilitation technology on caregiver burden", international journal of telerehabilitation, vol. 1, no. 1, pp. 3-8, 2009. [4] l. piron, a. turolla, p. tonin, f. piccione, l. lain and m. dam, "satisfaction with care in post-stroke patients undergoing a telerehabilitation programme at home", journal of telemedicine and telecare, vol. 14, no. 5, pp. 257-260, 2008. [5] the world health report 2002: “reducing risk, promoting healthy life”, http://www.who.int/whr/2002/en/. [6] m.c. domingo, "an overview of the internet of things for people with disabilities", journal of network and computer applications, vol. 35, no. 2, pp. 584-596, 2012. [7] p. pharow, b. blobel, p. ruotsalainen, f. petersen and a. hovsto, "portable devices, sensors and networks: wireless personalized ehealth services", medical informatics in a united and healthy europe, pp. 1012-1016, 2009. [8] b. lange, c.y. chang, e. suma, b. newman, a.s. rizzo and m.
bolas, "development and evaluation of low cost game-based balance rehabilitation tool using the microsoft kinect sensor", in proceedings of the annual international conference of the ieee engineering in medicine and biology society (embc), 2011, pp. 1831-1834.
[9] l. geurts, v. vanden abeele, j. husson, f. windey, m. van overveldt, j.h. annema and s. desmet, "digital games for physical therapy: fulfilling the need for calibration and adaptation", in proceedings of the 5th international conference on tangible, embedded, and embodied interaction (tei '11), acm new york, 2011, pp. 117-124.
[10] m. hoda, h. dong, d. ahmed and a. e. saddik, "cloud-based rehabilitation exergames system", in proceedings of the ieee international conference on multimedia and expo workshops (icmew), 2014, pp. 1-6.
[11] a. dohr, r. modre-opsrian, m. drobics, d. hayn and g. schreier, "the internet of things for ambient assisted living", in proceedings of the 7th international conference on information technology: new generations (itng), las vegas, 2010, pp. 804-809.
[12] t. susi, m. johannesson and p. backlund, "serious games: an overview", technical report, university of skövde, skövde, sweden, 2007.
[13] b. m. alcover, a. jaume-i-capó, j. varona, p. martinez-bueso and a. m. chiong, "use of serious games for motivational balance rehabilitation of cerebral palsy patients", in proceedings of the 13th international acm sigaccess conference on computers and accessibility, new york, 2011, pp. 297-298.
[14] s. c. yeh, w. y. hwang, t. c. huang, w. k. liu, y. t. chen and y. p. hung, "a study for the application of body sensing in assisted rehabilitation training", in proceedings of the international symposium on computer, consumer and control (is3c), 2012, pp. 922-925.
[15] m. f. levin, p. l. weiss and e. a. keshner, "emergence of virtual reality as a tool for upper limb rehabilitation", physical therapy, vol. 95, no. 3, pp. 415-425, march 2015.
[16] s.
vukićević, "telerehabilitation model of physical therapy using kinect and embedded systems", in proceedings of the 5th international conference on information society and technology, kopaonik, 2015, pp. 214-218.
[17] h. m. hondori and m. khademi, "a review on technical and clinical impact of microsoft kinect on physical therapy and rehabilitation", journal of medical engineering, vol. 2014, pp. 1-16, 2014.
[18] d. webster and o. celik, "systematic review of kinect applications in elderly care and stroke rehabilitation", journal of neuroengineering and rehabilitation, vol. 11, no. 1, 108, pp. 1-24, 2014.
[19] o. erazo, j. pino, r. pino and c. fernandez, "magic mirror for neurorehabilitation of people with upper limb dysfunction using kinect", in proceedings of the 47th hawaii international conference on system sciences (hicss), 2014, pp. 2607-2615.
[20] c. m. tseng, c. l. lai, d. erdenetsogt and y. f. chen, "a microsoft kinect based virtual rehabilitation system", in proceedings of the international symposium on computer, consumer and control (is3c), 2014, pp. 934-937.
[21] z. zhang, "microsoft kinect sensor and its effect", ieee multimedia, vol. 19, no. 2, pp. 4-10, 2012.
[22] y. feng, m. ji, j. xiao, x. yang, j. j. zhang, y. zhuang and x. li, "mining spatial-temporal patterns and structural sparsity for human motion data denoising", ieee trans. cybernetics, vol. 99, pp. 1-14, 2014.
[23] b. bonnechere, b. jansen, p. salvia, h. bouzahouene, l. omelina, f. moiseev, f. and s. jan, "validity and reliability of the kinect within functional assessment activities: comparison with standard stereophotogrammetry", gait and posture, vol. 39, no. 1, pp. 593-598, 2014.
[24] state of the nation january 2015: "stroke statistics", https://www.stroke.org.uk/sites/default/files/stroke_statistics_2015.pdf
[25] r.a. clark, y.h. pua, k. fortin, c. ritchie, k.e. webster, l. denehy and a.l.
bryant, "validity of the microsoft kinect for assessment of postural control", gait & posture, vol. 36, no. 3, pp. 372-377, 2012.

facta universitatis series: electronics and energetics vol. 35, no 4, december 2022, pp. 571-585
https://doi.org/10.2298/fuee2204571s
© 2022 by university of niš, serbia | creative commons license: cc by-nc-n
original scientific paper

fast doa estimation of the signal received by textile wearable antenna array based on ann model*

zoran stanković, olivera pronić-rančić, nebojša dončov
university of niš, faculty of electronic engineering, niš, serbia

abstract. an mlp_doa module intended for fast doa estimation, which is an integral part of the smart twaa doa subsystem, is proposed. a multilayer perceptron network is used to create the mlp_doa module, which provides the radio gateway location in the azimuthal plane at its output when the spatial correlation matrix, obtained by receiving the radio gateway signal with a two-element textile wearable antenna array, is applied at its input. the mlp_doa network is trained while monitoring its generalization capabilities on a validation set of samples. the accuracy of the proposed modeling approach is compared with the classical approach to mlp_doa module training previously developed by the authors. the presented ann model is also compared with the root music algorithm in terms of accuracy and program execution time.

key words: ann, mlp, doa, twaa, root music

1. introduction

wearable wireless systems play an integral role in fifth generation (5g) networks, which operate with higher bit rates, lower latency, and lower outage probabilities in smaller microcells and picocells covering broader areas than 4g or older technologies. in addition, beam reconfigurability and beamforming are expected to facilitate spectral and energy efficiency at both the mobile device and base station levels.
besides mobile communications, wearable wireless systems find numerous applications in areas such as health care, security, ambient assisted living, sports, etc. [2]-[7]. wearable antennas are among the most important elements of wearable wireless systems [8]-[15]. they are usually integrated within the clothing by any of the current state-of-the-art fabrication methods (fabric-based embroidered antennas, polymer-embedded antennas, microfluidic antennas with injection alloys, inkjet printing, screen printing and photolithography, 3d-printed antennas, etc.) [9]. depending on the type of application, it is vital to choose a suitable antenna form, as a one-design-fits-all approach often does not meet all requirements.

received march 24, 2022; revised may 15, 2022; accepted june 5, 2022
corresponding author: zoran stanković, university of niš, faculty of electronic engineering, aleksandra medvedeva 14, 18000 niš, serbia, e-mail: zoran.stankovic@elfak.ni.ac.rs
* an earlier version of this paper was presented at the 15th international conference on advanced technologies, systems and services in telecommunications (telsiks 2021), october 20-22, 2021, in niš, serbia [1]

in health care monitoring (hcm), wireless technology enables a significant reduction in the cost of health services, while at the same time providing the necessary quality of service. a combination of biosensors placed on the patient's body and antennas integrated into garments, which transmit and receive signals to and from the remote wireless monitoring point, allows patients to receive the needed assistance while continuing to live in their own homes [15]. a single textile wearable antenna with an omnidirectional radiation pattern is commonly used in health care monitoring systems. it avoids fluctuations of the signal level between the antenna and a radio gateway (rg) of the hcm system due to wearer movements.
however, the range of a single antenna is significantly reduced both outdoors and indoors due to its small gain. classic antenna arrays provide significantly higher gain but have narrow and spatially invariant radiation patterns and therefore cannot overcome the problem of signal fluctuations caused by the movement of the antenna wearer. textile wearable antenna arrays (twaa) with adaptive beamforming (smart twaa), on the other hand, ensure that the main lobe of the antenna array radiation pattern is always directed towards the rg [15]. direction-of-arrival (doa) estimation of the rg signal is a crucial factor in adaptive beamforming [16]. usually, doa estimation requires intensive matrix calculations, as it is based on super-resolution algorithms such as music, esprit, and their modifications. their real-time implementation therefore requires powerful hardware platforms [17]-[20], which makes them unsuitable for the small mobile platforms used to realise smart twaa. on the other hand, artificial neural networks (anns) for doa estimation do not require complex matrix calculations and can be easily implemented on modest mobile hardware platforms [21]-[26]. furthermore, our previous research showed that they have approximately the same modelling accuracy as super-resolution algorithms, but significantly higher calculation speed [1], [24]-[26].

this paper is a continuation of the research presented in [1], where the basic version of the doa module based on the multilayer perceptron (mlp) network (mlp_doa module) was proposed. that module is an integral part of the smart twaa doa subsystem with two textile antennas, which performs fast doa estimation of the rg signals and determines the rg location in the azimuthal plane.
the research conducted within this paper relates to further development and improvement of the mlp_doa module, as well as to the examination of its performance in a working environment with a wide range of signal-to-noise ratio changes. the classical approach to mlp_doa module training applied in [1] did not include mechanisms for controlling the achieved generalization capabilities of the mlp network. in this paper, the network is trained while monitoring its generalization capabilities on a validation set of samples, which prevents overfitting. the proposed ann approach to doa estimation of the rg signal is compared with the classical approach based on the root music algorithm in terms of accuracy and program execution time.

the paper is organized as follows. after the introduction, a brief description of the proposed smart twaa doa subsystem is given in section 2. the architecture, training, and testing of the mlp_doa network are presented in section 3. the most illustrative numerical results are presented in section 4, and concluding remarks are given in section 5. to facilitate interpretation of the material presented in these sections, the list of acronyms used is given in table 1.

table 1 list of used acronyms
a term replaced by its acronym (acronym)
health care monitoring (hcm)
radio gateway (rg)
textile wearable antenna arrays (twaa)
direction-of-arrival (doa)
artificial neural networks (anns)
multilayer perceptron (mlp)
mean square error (mse)
maximum validation failures (mvf)
worst case error (wce)
average test error (ate)
pearson product moment correlation coefficient (rppm)
signal-to-noise ratio (snr)

2. smart twaa doa subsystem

the architecture of the smart twaa doa subsystem is shown in fig. 1. it consists of a two-element twaa, narrowband filters, a/d converters, an fpga module and a doa module.
the distance between the antenna elements is $d = c/(2f)$, where $c$ is the speed of light. the twaa, filters and a/d converters perform sampling of the rg signal at frequency $f$. based on the samples provided by the twaa, the fpga module calculates the spatial correlation matrix ($\mathbf{C}$). this matrix is then sent to the input of the doa module, which determines the azimuth position of the radio gateway ($\theta$). anns are proposed for the realization of the doa module (ann based doa module).

fig. 1 architecture of the smart twaa doa subsystem [1]

in the absence of antenna noise, the vector of signals induced on a twaa whose elements have an omnidirectional radiation pattern in the azimuth plane is $\mathbf{x}_s(t) = [x_{s1}(t)\ x_{s2}(t)]$, where $x_{s1}(t)$ and $x_{s2}(t)$ are the signals induced on the first and second antenna element, respectively. accordingly, the correlation matrix of the signals induced on the elements can be expressed as [16]

$$\mathbf{C}_s = E[\mathbf{x}_s(t)\,\mathbf{x}_s^H(t)] = p\,\mathbf{s}\mathbf{s}^H = \begin{bmatrix} p & p\,e^{-j\beta d\sin\theta} \\ p\,e^{j\beta d\sin\theta} & p \end{bmatrix} \tag{1}$$

where $E[\cdot]$ denotes the expectation operator, $\mathbf{s} = [1\ \ e^{j\beta d\sin\theta}]^T$ is the steering vector, $\beta$ is the phase constant ($\beta = 2\pi/\lambda$), and $p$ is the power of the signal induced on one omnidirectional antenna element. in the initial state, when the twaa wearer does not move and the textile is not deformed, the gains of the antenna elements are equal, $g(\theta) = g_1(\theta) = g_2(\theta)$. in the general case, the twaa wearer moves and the textile deforms, which changes the orientation of the antenna elements and their effective apertures. the gains of the antenna elements in the direction of the rg therefore change over time and, in general, can have different values at the same moment

$$g_1 = g_1(\theta, t) \neq g_2 = g_2(\theta, t), \quad \text{for most values of } t \tag{2}$$

here, we assume that creasing of the textile does not lead to a significant change in the distance between the antenna elements, i.e., this change can be neglected.
therefore, the equation (1), defining the correlation matrix of the signals received by the mobile twaa, must be modified as follows

$$\mathbf{C}_s = \begin{bmatrix} g_1 p & \sqrt{g_1 g_2}\,p\,e^{-j\beta d\sin\theta} \\ \sqrt{g_1 g_2}\,p\,e^{j\beta d\sin\theta} & g_2 p \end{bmatrix}. \tag{3}$$

when the antenna noise is present and there is no external rg signal, the noise vector induced on the antenna elements can be represented as $\mathbf{n}(t) = [n_1(t)\ n_2(t)]$, where $n_1(t)$ and $n_2(t)$ are random noise components on the first and second antenna element, respectively. for uncorrelated noise, e.g., white gaussian noise, the noise correlation matrix is obtained as

$$\mathbf{C}_n = E[\mathbf{n}(t)\,\mathbf{n}^H(t)] = \begin{bmatrix} \sigma_n^2 & 0 \\ 0 & \sigma_n^2 \end{bmatrix}. \tag{4}$$

the spatial correlation matrix at the twaa output, $\mathbf{C}$, can be obtained as a superposition of the correlation matrix of signals and the noise correlation matrix,

$$\mathbf{C} = E[\mathbf{x}(t)\,\mathbf{x}^H(t)] = \mathbf{C}_s + \mathbf{C}_n = \begin{bmatrix} g_1 p + \sigma_n^2 & \sqrt{g_1 g_2}\,p\,e^{-j\beta d\sin\theta} \\ \sqrt{g_1 g_2}\,p\,e^{j\beta d\sin\theta} & g_2 p + \sigma_n^2 \end{bmatrix} \tag{5}$$

where $\mathbf{x}(t) = \mathbf{x}_s(t) + \mathbf{n}(t)$ is the vector at the twaa output. the signal-to-noise ratio (snr) is defined with respect to the power of the signal received by the first element of the antenna array,

$$snr = \frac{g_1 p}{\sigma_n^2}. \tag{6}$$

therefore, equation (5) can be expressed as follows

$$\mathbf{C} = \begin{bmatrix} g_1 p + \dfrac{g_1 p}{snr} & \sqrt{g_1 g_2}\,p\,e^{-j\beta d\sin\theta} \\ \sqrt{g_1 g_2}\,p\,e^{j\beta d\sin\theta} & g_2 p + \dfrac{g_1 p}{snr} \end{bmatrix}. \tag{7}$$

normalization of the matrix $\mathbf{C}$ does not lead to a change in the results obtained by the music algorithm for doa estimation [25].
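the construction in eqs. (3)-(7) can be checked numerically. the following is a minimal python sketch under stated assumptions, not the authors' matlab implementation: the function name `spatial_correlation` and the parameter `d_wavelengths` (element spacing divided by the wavelength, so that βd = 2π·d_wavelengths) are introduced here only for illustration.

```python
import cmath
import math


def spatial_correlation(theta_deg, g1, g2, p, snr_db, d_wavelengths=0.5):
    """Return the 2x2 matrix of eq. (5) as nested lists of complex numbers.

    theta_deg      azimuth of the radio gateway in degrees
    g1, g2         power gains of the two antenna elements (linear)
    p              signal power on an omnidirectional element (linear)
    snr_db         signal-to-noise ratio of eq. (6), in dB
    d_wavelengths  element spacing in wavelengths (assumed name)
    """
    snr = 10 ** (snr_db / 10)
    sigma2 = g1 * p / snr  # noise variance from eq. (6)
    # beta * d * sin(theta) with beta = 2*pi/lambda
    phase = 2 * math.pi * d_wavelengths * math.sin(math.radians(theta_deg))
    cross = math.sqrt(g1 * g2) * p
    return [
        [g1 * p + sigma2, cross * cmath.exp(-1j * phase)],
        [cross * cmath.exp(1j * phase), g2 * p + sigma2],
    ]


if __name__ == "__main__":
    for row in spatial_correlation(30.0, 1.0, 4.0, 2.0, 20.0):
        print(row)
```

the off-diagonal elements are complex conjugates of each other, so the matrix is hermitian, and only the phase of the off-diagonal element depends on the angle θ.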
normalizing the matrix $\mathbf{C}$ with respect to the element $c_{11}$ yields a normalized matrix, $\mathbf{C}'$, that is invariant to the signal strength $p$; determining it does not require knowing the gains of both antennas, only their ratio $g_2/g_1$,

$$\mathbf{C}' = \begin{bmatrix} 1 & \sqrt{\dfrac{g_2}{g_1}}\,\dfrac{snr}{1+snr}\,e^{-j\beta d\sin\theta} \\ \sqrt{\dfrac{g_2}{g_1}}\,\dfrac{snr}{1+snr}\,e^{j\beta d\sin\theta} & \dfrac{1 + \dfrac{g_2}{g_1}\,snr}{1+snr} \end{bmatrix} \tag{8}$$

with the introduction of the variable $g$ (root gain ratio), $g = \sqrt{g_2/g_1}$, and the distance between antenna elements expressed in wavelengths, $d$, eq. (8) becomes

$$\mathbf{C}' = \begin{bmatrix} 1 & g\,\dfrac{snr}{1+snr}\,e^{-j2\pi d\sin\theta} \\ g\,\dfrac{snr}{1+snr}\,e^{j2\pi d\sin\theta} & \dfrac{1 + g^2\,snr}{1+snr} \end{bmatrix} \tag{9}$$

in a real scenario, the twaa wearer moves and the textile is crumpled, so it is exceedingly difficult to determine the parameters $g$ and $snr$ at each time point. the angle $\theta$ is also unknown, so the spatial correlation matrix cannot be determined directly from the above formula. instead, the spatial correlation matrix is estimated from a large number of twaa output samples taken in a short time interval (twaa snapshots) using fast a/d converters, with the matrix elements calculated on the fpga module using the approximate formula

$$\mathbf{C} \approx \frac{1}{N_s}\sum_{s=1}^{N_s}\mathbf{x}_s\mathbf{x}_s^H, \tag{10}$$

where $\mathbf{x}_s$ is the sample of the $s$-th snapshot at the twaa output and $N_s$ is the number of snapshots. an example of a measuring point and the laboratory equipment needed to obtain the elements of a correlation matrix by measurement is presented in [26].

3. ann based doa module

the ann based doa module consists of a single mlp neural network (mlp_doa) that estimates the angle of arrival of the rg signal at the twaa from the signal information contained in the spatial correlation matrix. this can be represented as

$$\theta = f_{mlp\_doa}(\mathbf{C}). \tag{11}$$

the first row of the normalized spatial correlation matrix, without the autocorrelation element, is sufficient for estimating the angular positions of em radiation sources [1], [25]. the real and imaginary parts of the elements in the first row, excluding the autocorrelation element, are fed separately to the neurons in the input layer of the mlp network. this gives a model that is more suitable for implementation and training than one that takes the complex values of these elements at the input of the mlp network [23]. accordingly, for the two-element twaa, eq. (11) can be written in the form

$$\theta = f_{mlp\_doa}(\mathbf{c}) = f_{mlp\_doa}(\mathrm{re}\{c'_{12}\}, \mathrm{im}\{c'_{12}\}), \tag{12}$$

where $\mathbf{c}$ is the vector of input variables of the mlp neural network ($\mathbf{c} = [\mathrm{re}\{c'_{12}\}\ \ \mathrm{im}\{c'_{12}\}]$).

3.1. architecture of mlp_doa network

the architecture of the mlp_doa network is shown in fig. 2. it consists of a total of $L$ layers of neurons: one input layer, one output layer, and $L-2$ hidden layers between them.

fig. 2 architecture of mlp_doa network.

the signal propagation from the input to the output of the mlp network and the corresponding transfer function of the mlp_doa network (eq. 12) can be described by the output vectors of each network layer. the input layer is a buffer layer and, according to eq. (12), has two neurons. thus, the output vector of the input layer is $\mathbf{y}_1 = \mathbf{c} = [\mathrm{re}\{c'_{12}\}\ \ \mathrm{im}\{c'_{12}\}]$. the output vector of the $l$-th layer (except the input layer) can be expressed as

$$\mathbf{y}_l = f_l(\mathbf{w}_l\mathbf{y}_{l-1} + \mathbf{b}_l), \quad l = 2, 3, \ldots, L \tag{13}$$

where $\mathbf{y}_{l-1}$ represents the output of the $(l-1)$-th layer. in eq.
(13), $\mathbf{w}_l$ is the connection weight matrix between the $(l-1)$-th and the $l$-th layer, whose element $w^l_{i,j}$ represents the connection weight between the $j$-th neuron of the $(l-1)$-th layer and the $i$-th neuron of the $l$-th layer; $\mathbf{b}_l$ is the vector of biases of the $l$-th layer, whose element $b^l_i$ represents the bias of the $i$-th neuron of the $l$-th layer; and $f_l(\cdot)$ is the activation function of the $l$-th layer neurons. the hyperbolic tangent sigmoid transfer function was used as the activation function of the hidden layers

$$f_l(u) = \frac{e^{u} - e^{-u}}{e^{u} + e^{-u}}, \quad l = 2, 3, \ldots, L-1. \tag{14}$$

the output layer has one neuron with the linear activation function $f_L(u) = u$, and its output is given as

$$\theta = y_L = f_L(\mathbf{w}_L\mathbf{y}_{L-1} + b_L) = \mathbf{w}_L\mathbf{y}_{L-1} + b_L. \tag{15}$$

the weight matrices $\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_L$ and the bias vectors $\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_L$ form the set $W$ of trainable parameters of the mlp network. the values of the elements of this set are adjusted during network training so that the mapping expressed by eq. (12) is realized with the desired accuracy. the general architecture of this mlp_doa neural network is represented by the notation mlp$H$-$N_1$-…-$N_i$-…-$N_H$, where $H$ is the total number of hidden layers in the mlp architecture ($H = L-2$) and $N_i$ is the total number of neurons in the $i$-th hidden layer.

3.2. training and testing of mlp_doa network

mlp_doa network training is performed on a set of training samples $P = \{(\mathbf{c}_1, \theta_1^d), (\mathbf{c}_2, \theta_2^d), \ldots, (\mathbf{c}_s, \theta_s^d), \ldots, (\mathbf{c}_{N_P}, \theta_{N_P}^d)\}$, where $\theta_s^d$ is the desired value of the network output when the sample $\mathbf{c}_s$ is applied to its input and $N_P$ is the total number of training samples. to monitor the achieved degree of network generalization, a validation set $V$, containing a total of $N_V$ samples of the same format as the samples of the training set $P$, is used.
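the forward pass of eqs. (13)-(15) can be sketched in a few lines of python. this is an illustrative sketch only, assuming weights and biases are stored as plain nested lists; the function name `mlp_forward` and the storage layout are assumptions introduced here and are not taken from the paper.

```python
import math


def mlp_forward(c, weights, biases):
    """Evaluate the MLP of eqs. (13)-(15) for one input sample.

    c        input vector [re{c'12}, im{c'12}]
    weights  list of weight matrices (lists of rows), one per non-input layer
    biases   list of bias vectors, one per non-input layer
    """
    y = list(c)  # the input layer is a buffer: y_1 = c
    last = len(weights) - 1
    for k, (w, b) in enumerate(zip(weights, biases)):
        # u = w_l * y_(l-1) + b_l, as in eq. (13)
        u = [sum(wij * yj for wij, yj in zip(row, y)) + bi
             for row, bi in zip(w, b)]
        # hidden layers use tanh (eq. 14); the output layer is linear (eq. 15)
        y = u if k == last else [math.tanh(ui) for ui in u]
    return y[0]


if __name__ == "__main__":
    # toy network with one hidden layer of two neurons (hypothetical values)
    weights = [[[1.0, 0.0], [0.0, 1.0]], [[2.0, -1.0]]]
    biases = [[0.0, 0.0], [0.5]]
    print(mlp_forward([0.3, -0.2], weights, biases))
```

`math.tanh` is mathematically identical to the expression in eq. (14), so it is used directly instead of spelling out the exponentials.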
during network training, the samples from the training set are applied to the network input and the weights and biases in the set $W$ are changed iteratively according to the chosen training algorithm. the goal is to minimize the mean square error (mse) of the network output relative to the desired output values. as far as network performance on the training set is concerned, training is stopped either when the target mse on the training set ($E_{Ptarget}$) is reached or when the maximum number of iterations, $N_{Imax}$, is reached. during training, the mse of the network output on the validation set, $E_V(W)$, is also monitored; when it reaches its minimum value ($E_{V\min}$), training is stopped even if the above termination conditions are not met. once $E_{V\min}$ is achieved, any further training of the mlp_doa network leads to overfitting and deterioration of its generalization abilities. in other words, the problem of finding the optimal stopping point of the neural network training comes down to finding the values of the network weights and biases in the set $W$ for which the network has the minimum mean square error on the validation set (eq. 16).

$$E_{V\min} = \min_{W} E_V(W) = \min_{W}\frac{1}{2N_V}\sum_{s=1}^{N_V}\left(\theta(\mathbf{c}_s, W) - \theta_s^d\right)^2 \tag{16}$$

if, during the iterative training of the mlp_doa network, the error on the validation set begins to grow after a period of continuous decline and keeps growing over the next mvf (maximum validation failures) successive iterations, the minimum error is considered to have been reached and training is stopped. the mvf value is set before the start of network training.
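the three stopping conditions described above can be sketched as follows. this is one possible reading of the rule, written as a minimal python sketch: a counter of consecutive iterations without improvement of the validation error is compared with mvf. the function name `early_stopping` and the idea of passing precomputed (training mse, validation mse) pairs are assumptions made for illustration; in a real training loop the errors would be computed after each weight update.

```python
def early_stopping(errors, ep_target=1e-6, ni_max=1000, mvf=20):
    """Decide where training stops.

    errors  iterable of (train_mse, val_mse) pairs, one per iteration
    Returns (best_iter, stop_iter, reason), where best_iter is the
    iteration with the lowest validation error seen so far.
    """
    best_val = float("inf")
    best_iter = -1
    failures = 0
    it = -1
    for it, (e_train, e_val) in enumerate(errors):
        if e_val < best_val:
            # new validation minimum: remember it and reset the counter
            best_val, best_iter, failures = e_val, it, 0
        else:
            failures += 1
        if e_train <= ep_target:
            return best_iter, it, "target"
        if failures >= mvf:
            return best_iter, it, "validation"
        if it + 1 >= ni_max:
            return best_iter, it, "max_iter"
    return best_iter, it, "exhausted"


if __name__ == "__main__":
    val = [5, 4, 3, 2, 1, 2, 3, 4, 5, 6]
    print(early_stopping([(1.0, v) for v in val], mvf=3))
```

with the toy sequence in the usage example, the validation minimum is at iteration 4 and the third consecutive failure occurs at iteration 7, so the weights saved at iteration 4 would be the ones to keep.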
in the example of mlp_doa network training presented in this paper, the test set intended for checking the generalization performance of the trained network was also used as the validation set in the training process ($V = T$). each sample used for neural network training or testing was obtained by establishing an inverse doa mapping according to eq. (9) and averaging a large number of consecutive twaa snapshots according to eq. (10). the training and test sets contain ordered triplets of the format ($\mathrm{re}\{c'_{12}(\theta^d\,[°], g\,[\mathrm{db}], snr\,[\mathrm{db}])\}$, $\mathrm{im}\{c'_{12}(\theta^d\,[°], g\,[\mathrm{db}], snr\,[\mathrm{db}])\}$, $\theta^d\,[°]$), where the samples are generated for different values of the angle $\theta^d$ and the parameter $g$. the mlp_doa network training set is generated by a uniform distribution of the variables $\theta^d$ and $g$ as

$$P^{(snr)} = \left\{ \left(\mathrm{re}\{c'_{12}(\theta^d, g, snr)\},\ \mathrm{im}\{c'_{12}(\theta^d, g, snr)\},\ \theta^d\right) \;\middle|\; \theta^d \in [\theta^d_{\min} : \theta^d_{step} : \theta^d_{\max}],\ g \in [g_{\min} : g_{step} : g_{\max}] \right\} \tag{17}$$

where $\theta^d_{\min}$, $\theta^d_{step}$ and $\theta^d_{\max}$ are the minimum value, step, and maximum value of the angle $\theta^d$, respectively, and $g_{\min}$, $g_{step}$ and $g_{\max}$ are the minimum value, step, and maximum value of the parameter $g$ in the training set, respectively. in order to assess the quality of network training and the generalization of the trained network, and to make the final choice of the mlp_doa network architecture used to implement the doa module, each trained network was tested on a test set that does not contain samples used in the training process.
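how one such sample might be produced by averaging snapshots (eq. 10) and normalizing by $c_{11}$ can be sketched as below. this is a hedged illustration, not the authors' simulation: it assumes $g_1 p = 1$, a unit-modulus source with random phase, circular complex gaussian noise, and linear (not db) values of `g` and `snr`. the names `estimate_c12` and `angle_from_c12` are hypothetical, and the closed-form phase inversion in `angle_from_c12` is only a sanity check on the sample, not the mlp_doa model.

```python
import cmath
import math
import random


def estimate_c12(theta_deg, g, snr, n_snapshots=300, d=0.5, seed=0):
    """Estimate the normalized element c'12 from simulated snapshots."""
    rng = random.Random(seed)
    phi = 2 * math.pi * d * math.sin(math.radians(theta_deg))
    sigma = math.sqrt(1.0 / snr / 2.0)  # std of each real/imag noise part
    c11 = 0.0
    c12 = 0j
    for _ in range(n_snapshots):
        a = cmath.exp(1j * rng.uniform(0.0, 2 * math.pi))  # source sample
        n1 = complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        n2 = complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        x1 = a + n1
        x2 = g * a * cmath.exp(1j * phi) + n2
        c11 += abs(x1) ** 2
        c12 += x1 * x2.conjugate()
    return c12 / c11  # the 1/n_snapshots factors cancel in the ratio


def angle_from_c12(c12, d=0.5):
    """Recover the angle in degrees from the phase of c'12 (sanity check)."""
    x = -cmath.phase(c12) / (2 * math.pi * d)
    return math.degrees(math.asin(max(-1.0, min(1.0, x))))


if __name__ == "__main__":
    c12 = estimate_c12(30.0, 1.0, 100.0)
    print(c12.real, c12.imag, angle_from_c12(c12))
```

the pair (`c12.real`, `c12.imag`) together with the desired angle forms one triplet of the kind described above; for high snr the estimate approaches the value $g\,\frac{snr}{1+snr}\,e^{-j2\pi d\sin\theta}$ given by eq. (9).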
similar to the training set, the test set was generated by a uniform distribution of the variables $\theta^d$ and $g$ as

$$T^{(snr)} = \left\{ \left(\mathrm{re}\{c'_{12}(\theta^d, g, snr)\},\ \mathrm{im}\{c'_{12}(\theta^d, g, snr)\},\ \theta^d\right) \;\middle|\; \theta^d \in [\theta^{dt}_{\min} : \theta^{dt}_{step} : \theta^{dt}_{\max}],\ g \in [g^t_{\min} : g^t_{step} : g^t_{\max}] \right\} \tag{18}$$

where $\theta^{dt}_{\min}$, $\theta^{dt}_{step}$ and $\theta^{dt}_{\max}$ are the minimum value, step, and the maximum value of the angle $\theta^d$ in the test set, respectively, and $g^t_{\min}$, $g^t_{step}$ and $g^t_{\max}$ are the minimum value, step, and the maximum value of the parameter $g$ in the test set, respectively. the following metrics were used in the neural network testing process: worst case error (wce), average test error (ate) and pearson product moment (ppm) correlation coefficient ($r^{ppm}$) [22]. worst case error is calculated as

$$wce = \max_{s=1,\ldots,N_T}\frac{\left|\theta(\mathbf{c}_s, W) - \theta_s^d\right|}{\theta^d_{\max} - \theta^d_{\min}}, \tag{19}$$

where $N_T$ is the total number of test set samples, $\theta(\mathbf{c}_s, W)$ is the output of the mlp_doa network when the sample $\mathbf{c}_s$ is brought to its input, and $\theta^d_{\max}$ and $\theta^d_{\min}$ are the maximum and minimum desired values of the angle $\theta$ in the test set, respectively. average test error is calculated as

$$ate = \frac{1}{N_T}\sum_{s=1}^{N_T}\frac{\left|\theta(\mathbf{c}_s, W) - \theta_s^d\right|}{\theta^d_{\max} - \theta^d_{\min}}. \tag{20}$$

ppm correlation coefficient is calculated as

$$r^{ppm} = \frac{\sum_{s=1}^{N_T}\left(\theta(\mathbf{c}_s, W) - \bar{\theta}\right)\left(\theta_s^d - \bar{\theta}^d\right)}{\sqrt{\sum_{s=1}^{N_T}\left(\theta(\mathbf{c}_s, W) - \bar{\theta}\right)^2 \sum_{s=1}^{N_T}\left(\theta_s^d - \bar{\theta}^d\right)^2}}, \tag{21}$$

where $\bar{\theta} = \frac{1}{N_T}\sum_{s=1}^{N_T}\theta(\mathbf{c}_s, W)$ represents the average value of the neural network output and $\bar{\theta}^d = \frac{1}{N_T}\sum_{s=1}^{N_T}\theta_s^d$ represents the average value of the expected output values.

4. modeling results

simulation of twaa doa subsystem operation, generation of training and testing samples, as well as development and testing of mlp_doa modules were performed in matlab environment.
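the test metrics of eqs. (19)-(21) can be sketched in plain python as follows. this is an illustrative sketch rather than the authors' matlab code; the function name `doa_metrics` is an assumption, the errors are taken in absolute value, and the results are returned as fractions of the angle range (multiply by 100 if percentages are wanted).

```python
import math


def doa_metrics(theta_est, theta_des):
    """Return (wce, ate, r_ppm) for estimated and desired angles."""
    n = len(theta_des)
    span = max(theta_des) - min(theta_des)  # theta_max^d - theta_min^d
    abs_err = [abs(e - d) for e, d in zip(theta_est, theta_des)]
    wce = max(abs_err) / span               # eq. (19)
    ate = sum(abs_err) / (n * span)         # eq. (20)
    mean_e = sum(theta_est) / n
    mean_d = sum(theta_des) / n
    num = sum((e - mean_e) * (d - mean_d)
              for e, d in zip(theta_est, theta_des))
    den = math.sqrt(sum((e - mean_e) ** 2 for e in theta_est)
                    * sum((d - mean_d) ** 2 for d in theta_des))
    return wce, ate, num / den              # eq. (21)


if __name__ == "__main__":
    # three hypothetical test samples
    print(doa_metrics([-57.0, 0.0, 66.0], [-60.0, 0.0, 60.0]))
```

in the toy example the largest error is 6° over a 120° range, so the worst case error is 0.05 and the average test error is 0.025.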
the reference computer configuration used to implement the doa module and to run all simulations was an intel core i7-9700f cpu @ 3 ghz with 16 gb ram. the following modeling scenario was considered: the rg has a radiation power of 1 w (0 dbw) and its distance from the twaa is 100 m. the twaa wearer moves in the azimuth plane and the wearer's position relative to the rg changes from -60° to +60°. as the wearer moves, the textile creases and the root gain ratio changes from -10 to 10 db. the antenna elements are at a constant distance $d = 0.5$ (in wavelengths) and the number of snapshots is $N_s = 300$. for the development and testing of the mlp_doa module, training and test sets are formed using eqs. (17) and (18). the training set, $P^{(20)}$, is formed for snr = 20 db, and test sets are formed for the following signal-to-noise ratio values: snr ∈ {20 db, 15 db, 10 db, 5 db, 0 db, -5 db} (denoted as $T^{(20)}$, $T^{(15)}$, $T^{(10)}$, $T^{(5)}$, $T^{(0)}$, $T^{(-5)}$). the following parameter values in eq. (17) were used to generate the training set: $\theta^d_{\min} = -60°$, $\theta^d_{step} = 0.5°$, $\theta^d_{\max} = 60°$, $g_{\min} = -10$ db, $g_{step} = 1$ db and $g_{\max} = 10$ db. in this way, a training set containing 5061 samples was generated. the following parameter values in eq. (18) were used to generate the test set: $\theta^{dt}_{\min} = -60°$, $\theta^{dt}_{step} = 0.7°$, $\theta^{dt}_{\max} = 60°$, $g^t_{\min} = -10$ db, $g^t_{step} = 1.3$ db, and $g^t_{\max} = 10$ db. in this way, 2752 samples were generated for each test set.

the development phase of the doa module includes training and testing a number of different mlp_doa networks and selecting the mlp network with the best test characteristics for the implementation of the doa module. during this phase, it is assumed that the antenna environment is almost ideal in terms of noise, with a signal-to-noise ratio of snr = 20 db. therefore, the sets $P^{(20)}$ and $T^{(20)}$ were used to train and test different mlp_doa networks.
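the sample counts quoted above follow directly from the grid definitions in eqs. (17) and (18). the short python sketch below reproduces them, assuming the [min : step : max] notation has matlab-like colon semantics (the last grid point never exceeds max); the helper name `grid` is introduced here for illustration.

```python
import math


def grid(start, step, stop):
    """Points start, start+step, ... that do not exceed stop."""
    # small tolerance guards against floating-point round-off in the division
    n = int(math.floor((stop - start) / step + 1e-9)) + 1
    return [start + i * step for i in range(n)]


if __name__ == "__main__":
    train_angles = grid(-60.0, 0.5, 60.0)   # 241 angles
    train_gains = grid(-10.0, 1.0, 10.0)    # 21 values of g
    test_angles = grid(-60.0, 0.7, 60.0)    # 172 angles
    test_gains = grid(-10.0, 1.3, 10.0)     # 16 values of g
    print(len(train_angles) * len(train_gains))  # 5061
    print(len(test_angles) * len(test_gains))    # 2752
```

the training grid has 241 × 21 = 5061 points and each test grid has 172 × 16 = 2752 points, which matches the set sizes stated in the text.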
for the implementation of the mlp_doa module, mlp architectures with two hidden layers ($H = 2$) and a variable number of neurons in them were considered. a number of different mlp networks having n1 ≤ 8 neurons and n2 ≤ 22 neurons were trained and tested. the levenberg-marquardt algorithm [22] was chosen to train the mlp_doa networks while tracking the achieved degree of network generalization on the validation set. during the training of the mlp_doa networks, the $T^{(20)}$ test set was used as the validation set. the following values of the training parameters were selected: $E_{Ptarget} = 10^{-6}$, $N_{Imax} = 1000$ and mvf = 20. all trained mlp_doa networks were tested with the $T^{(20)}$ test set. worst case error (wce), average test error (ate) and correlation coefficient ($r^{ppm}$) were monitored during the test procedure in order to find the mlp_doa network capable of providing the angle of arrival of the rg signal at the twaa with the best accuracy. the eight mlp_doa networks with the best test statistics are shown in table 2. it can be seen that the mlp2-18-16 neural network has the lowest values of wce and ate and the highest value of $r^{ppm}$. therefore, this neural network was chosen for the implementation of the mlp_doa module. the test statistics obtained by the presented modelling approach are significantly better than the corresponding ones presented in [1], where the selected mlp_doa module (mlp2-10-5) had the following statistics: wce=2.7949, ate=0.3699 and rppm=0.9998546. this shows that the approach to training and to selecting the appropriate mlp network architecture presented here significantly improves the accuracy of doa estimation compared to the classical approach to mlp_doa module training presented in [1]. the scattering diagram of the selected mlp2-18-16 neural network is shown in fig. 3. in this case, a very high accuracy of doa estimation can be observed.
since the mlp network of mlp_doa module was trained and tested in almost ideal noise conditions (snr=20 db), it was necessary to test the mlp_doa module in case of an environment with increased noise in order to investigate the impact of noise on its accuracy. therefore, the mlp_doa module was tested in a noisy environment with a snr of 15 db, 10 db, 5 db, 0 db, and -5 db using the $T^{(15)}$, $T^{(10)}$, $T^{(5)}$, $T^{(0)}$, and $T^{(-5)}$ test sets, respectively. in order to compare the accuracy of the proposed ann approach in doa estimation of the rg signals with the classical approach based on super-resolution algorithms, the implementation of doa module with the root music algorithm was performed (root music doa module). testing of the root music doa module was performed under the same conditions and with the same test sets as in the case of the mlp_doa module.

table 2 testing results for mlp_anns with the best test statistics
mlp_doa network    wce (%)    ate (%)    r ppm
mlp2-18-16         0.3466     0.0344     0.9999993
mlp2-14-11         0.3479     0.0432     0.9999980
mlp2-12-12         0.3735     0.0495     0.9999975
mlp2-15-11         0.3971     0.0596     0.9999964
mlp2-17-11         0.4368     0.0627     0.9999958
mlp2-22-10         0.4685     0.0488     0.9999974
mlp2-14-12         0.4837     0.0439     0.9999978
mlp2-14-11         0.5366     0.0571     0.9999965

fig. 3 scattering diagram of mlp2-18-16 neural network (snr = 20 db)

based on the test results, the accuracy of both modules was examined and compared for different snr values. the values of the worst case errors, average test errors and correlation coefficients obtained by the mlp_doa module and by the root music doa module versus signal-to-noise ratio are shown in figs. 4-6. it is evident that both modules have very high accuracy in the case of low noise environment (snr=20 db, 15 db, 10 db). with increasing noise, i.e., decreasing snr, there is a decrease in the accuracy of both modules, which becomes significant for snr values less than 5 db.
however, in the case of increased noise, the proposed mlp_doa module achieves better results.

fig. 4 worst case error versus snr obtained by the mlp_doa module and by the root music doa module

fig. 5 average test error versus snr obtained by the mlp_doa module and by the root music doa module

fig. 6 ppm correlation coefficient obtained by the mlp_doa module and by the root music doa module

the scattering diagrams of both modules in the case of extremely high noise, snr = -5 db, are shown in figs. 7 and 8. fig. 7 shows the scattering diagram of the mlp_doa module. in this case, the following test statistics were obtained: wce = 79.9515, ate = 5.5533 and r^ppm = 0.9175. fig. 8 shows the scattering diagram of the root music doa module. in this case, the following test statistics were obtained: wce = 116.0627, ate = 6.5200 and r^ppm = 0.8376. comparing the scattering diagrams of both modules, similar conclusions can be drawn as in the previous case. both modules show significant deviation of the output values from the referent (desired) ones for a large number of samples. however, the scattering of the mlp_doa module is smaller than that of the root music module, so the mlp_doa module shows less accuracy reduction in conditions of intense noise than the root music module.

fig. 7 scattering diagram obtained by the mlp_doa module in conditions with high noise level (snr = -5 db; solid line: line of ideal value matching; dashed lines: boundaries of the scattering area)

in addition, the average program execution time, measured on the test set with 2752 samples, is 0.008054 seconds for the mlp_doa module and 0.366337 seconds for the root music doa module (table 3). the mlp based doa module therefore performs doa estimation significantly faster than the root music doa module (approximately 45 times faster).

fig. 8 scattering diagram obtained by the root music doa module in conditions with high noise level (snr = -5 db; solid line: line of ideal value matching; dashed lines: boundaries of the scattering area)

table 3 comparison of doa estimation speed of the mlp_doa module and the root music module measured on the test set (intel core i7-9700f cpu @ 3 ghz, 16 gb ram)

doa module               run time @ 2752 samples (s)
mlp_doa module           0.008054
root music doa module    0.366337

5. conclusion

an improved mlp_doa module for fast doa estimation of the rg signal arrival angle on a two-element textile wearable antenna array has been proposed. the multilayer perceptron network, which was used to create this module, learned to accurately determine the position of the radio gateway in the azimuth plane from the spatial correlation matrix obtained by sampling the rg signal at the twaa. since the classical approach to mlp_doa module training did not include mechanisms to control the achieved generalization capabilities of the mlp network, in this paper the training of the mlp network was performed by monitoring the generalization capabilities on a validation set of samples. the obtained mlp_doa module has an extremely high accuracy of doa estimation in low noise conditions, i.e., better modelling accuracy was achieved compared to the results obtained by the classical approach to training the mlp_doa module. in addition, the proposed module was compared with the root music algorithm in terms of accuracy and program execution time. the selected mlp_doa module was shown to have approximately the same accuracy as the root music doa module in low noise conditions and less degradation of accuracy in a very noisy environment. besides, the mlp_doa module performs doa estimation approximately 45 times faster than the root music doa module.
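as a quick check of the timing figures in table 3, the short python sketch below recomputes the speed-up factor and the average run time per sample. the constant and function names are illustrative and are not taken from the paper.

```python
# Run times from table 3, measured on the 2752-sample test set (seconds).
N_SAMPLES = 2752
T_MLP = 0.008054          # mlp_doa module
T_ROOT_MUSIC = 0.366337   # root music doa module


def speedup(t_slow: float, t_fast: float) -> float:
    """Ratio of the slower run time to the faster one."""
    return t_slow / t_fast


def per_sample_us(total_s: float, n_samples: int) -> float:
    """Average run time per sample, in microseconds."""
    return total_s / n_samples * 1e6


if __name__ == "__main__":
    print(f"speed-up: {speedup(T_ROOT_MUSIC, T_MLP):.1f}x")
    print(f"mlp_doa: {per_sample_us(T_MLP, N_SAMPLES):.2f} us per sample")
    print(f"root music: {per_sample_us(T_ROOT_MUSIC, N_SAMPLES):.2f} us per sample")
```

the ratio comes out slightly above 45, which agrees with the rounded factor quoted in the text.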
creasing of textiles can cause the center frequencies of the antenna elements of the twaa to shift, as well as change the distance between the antenna elements. this changes the phase difference of the signals received by the antennas regardless of any change in the angular position of the rg. this effect limits the accuracy of the mlp_doa module. therefore, further research will be aimed at increasing the accuracy of the mlp_doa module by developing methods to reduce this effect. one of the methods that will be applied is training the mlp_doa network with samples of rg signals emitted at two different frequencies. also, during further research, an mlp_doa module for twaa with more than two antenna elements will be developed.

acknowledgement: this work was supported by the ministry of education, science and technological development of republic of serbia (grant no. 451-03-9/2021-14/200102).
facta universitatis
series: electronics and energetics vol. 29, no 3, september 2016, pp. 339-355
doi: 10.2298/fuee1603339e

swarm intelligence based reliable and energy balance routing algorithm for wireless sensor network

fatma h. elfouly 1, rabie a. ramadan 2, mohamed i. mahmoud 3, moawad i. dessouky 4

1 department of electronics and electrical communications, higher institute of engineering, el-shorouk academy, el-shorouk city, egypt
2 computer engineering department, cairo university, egypt
3 department of control engineering and industrial electronics, faculty of electronic engineering, menoufia university, menouf, egypt
4 department of electronics and electrical communications, faculty of electronic engineering, menoufia university, menouf, egypt

abstract. energy is an extremely crucial resource for wireless sensor networks (wsns). many routing techniques have been proposed for finding the minimum energy routing paths with a view to extending the network lifetime. however, this might lead to an unbalanced distribution of energy among sensor nodes, resulting in the energy hole problem. therefore, designing energy-balanced routing techniques is a challenging area of research in wsns. moreover, dynamic and harsh environments pose great challenges to the reliability of wsns. to achieve reliable wireless communication within a wsn, it is essential to have a reliable routing protocol. furthermore, due to the limited memory resources of sensor nodes, full utilization of such resources with less buffer overflow remains one of the main considerations when designing a routing protocol for wsns. consequently, this paper proposes a routing scheme that uses swarm intelligence to achieve both minimum energy consumption and balanced energy consumption among sensor nodes for wsn lifetime extension. in addition, data reliability is considered in our model, so that the sensed data can reach the sink node in a more reliable way.
finally, buffer space is considered to reduce the packet loss and energy consumption due to the retransmission of the same packets. through simulation, the performance of the proposed algorithm is compared with previous work such as ebrp, aco, tadr, seb, and clb-routing.

key words: wsns; swarm intelligence; ant colony system (acs); energy balancing; reliability

received november 12, 2015
corresponding author: rabie a. ramadan, computer engineering department, cairo university, egypt (e-mail: rabie@rabieramadan.org)
*an earlier version of this paper was presented at the international conference on recent advances in computer systems racs-2015, hail university, saudi arabia, 2015 [1].

1. introduction

a wireless sensor network (wsn) is a wireless network consisting of a large number of small, inexpensive, battery-operated sensor nodes. such nodes monitor physical or environmental conditions such as temperature and humidity, perform simple computation, and communicate via wireless multi-hop transmission to report the collected data to the sink node [2]. however, the nodes in a wsn have severe resource limitations in energy, bandwidth, and storage. energy is an extremely crucial resource because it determines not only the lifetime of the sensor nodes but the network lifetime as well [3]. in wsns, communication has been recognized as the major source of energy consumption and costs significantly more than computation [3][4]. consequently, most of the existing routing techniques in wsns attempt to find the shortest path to the sink to minimize energy consumption. the result is highly unbalanced energy consumption, which causes energy holes around the sink and a significant reduction in network lifetime. therefore, designing energy-balanced routing techniques plays a crucial role in wsns [5][6].

reliable data transmission is one of the most essential issues in wsns [7][8][9].
the loss of important information due to unexpected node failure or the dynamic nature of the wireless communication link [10] prevents the sensor network from achieving its primary purpose, which is data transfer. hence, routing techniques should give priority to reliable transmission. at the same time, it is critical to reduce packet loss in wsns, which improves the network throughput and energy efficiency. due to memory constraints on sensor nodes, buffering a large number of packets is impossible. the resulting buffer overflow may cause information loss and additional energy consumption due to the retransmission of the same packets, and such retransmission limits the network's lifetime and efficiency. consequently, buffer space must be considered when designing routing protocols for wsns [11].

in the last two decades, optimization techniques inspired by swarm intelligence have gained much popularity [12]. they mimic the swarm behaviour of social insects such as ants and bees, as well as the behaviour of other animal societies such as bird flocks and fish schools [12]. swarm intelligent systems are robust, scalable, and adaptable, and can efficiently solve complex problems, such as finding the shortest path, through simple behaviour [13]. the ant colony system (acs) is considered one of the most important swarm intelligence techniques; it can provide approximate solutions to optimization problems in a reasonable amount of computation time [12]. acs [14] is inspired by the food searching behaviour of real ants, which can be utilized to find the shortest path in wsns. unlike other routing approaches [15], the ant colony optimization meta-heuristic proposed in the literature for wsns is based only on local information of sensor nodes [16].

the problems of balancing energy consumption among sensor nodes and of reliable communication have received significant attention in recent years [17][18][19][20][21][22].
our contributions in this paper focus on: 1) reducing energy consumption for wsn lifetime extension, 2) balancing energy consumption among sensor nodes so as to maintain and balance the residual energy on sensor nodes, 3) enhancing data reliability so that the sensed data can reach the sink node in a more reliable way, 4) taking into consideration buffer space on sensor nodes to reduce dropped packets, which in turn conserves energy, and 5) introducing swarm intelligence as a heuristic algorithm for energy reduction and reliability as well as load balancing and minimizing the probability of buffer overflow.

the rest of this paper is organized as follows: section 2 introduces a brief summary of the related work. section 3 introduces the problem description. then, section 4 describes the swarm based approach. section 5 provides the simulation results. finally, section 6 concludes the paper.

2. related work

this section focuses only on the work most closely related to the proposal of this paper. it starts by explaining the work presented in [5][23][24][25][26], which is the most closely related to our proposed approach, followed by the differences from our proposal.

an energy-balanced routing protocol (ebrp) was proposed in [5]. the ebrp algorithm borrows the concept of potential in physics to construct a mixed virtual potential field in terms of depth, energy density, and residual energy. the depth field is used to route packets toward the sink node. the energy density field balances energy consumption by driving packets through the dense energy area. finally, the residual energy field protects nodes with relatively low residual energy from dying.

an improved ant colony optimization routing (aco) for wsns was proposed in [23]. in this algorithm, an enhanced ant colony is used to optimize the node power consumption and prolong the network lifetime.
the improved aco approach enhances an aco-based approach in which the probability of selecting the next hop neighbour is determined using two heuristic functions. the first is related to the quantity of pheromone, which is inversely proportional to the hop count, and the second depends on the residual energy of the neighbour nodes. the improvement in [23] adds more accuracy to this choice, especially when probabilities are equal, a case in which the node otherwise chooses the next hop randomly. such a random choice might be wrong and cause data loss in an uncovered area, or make packets travel a long path to the sink. many nodes then lose power due to bad choices, delivery is delayed, and the network lifetime may be reduced. the improved aco approach adds new heuristic information to distinguish the best neighbour and avoid the use of wrong nodes. the new heuristic information is related to the energy of a neighbour node that has the sink in its collection field. such a neighbour node has a greater chance of being chosen, because packets sent to it will definitely reach the sink node. however, only energy and pheromone are considered in the probabilistic rule when the sink is not in the neighbour node's field.

meanwhile, the analysis of the improved aco algorithm [23] and ebrp [5] shows that some issues are not considered, which are reflected as drawbacks. the first is network reliability; as discussed above, neglecting it might increase packet loss and packet retransmissions, which affects the network efficiency. the second is the queue buffer size, which directly impacts network throughput and lifetime. the third is node load: nodes with heavy load and low residual energy should be prevented from being selected as a next hop in order to achieve energy balance across the whole network and relieve the energy hole problem. consequently, taking only residual energy into consideration, as in [5][23], is not sufficient to achieve balanced energy usage in the network.
a traffic aware dynamic routing (tadr) algorithm was proposed in [24] to route packets around congested areas and scatter excessive packets along multiple paths consisting of idle or unloaded nodes. in this algorithm, a hybrid potential field is constructed in terms of depth and the normalized queue length. the depth field creates a backbone to forward packets toward the sink. the queue length field is used to prevent packets from going to a possible congestion area. however, the tadr algorithm does not consider two critical issues, which is a drawback. the first is energy balancing; as described above, neglecting it might lead to unbalanced energy consumption in the network, which causes energy holes around the sink and a significant reduction in network lifetime. the second is network reliability, which is one of the key issues in wsns due to the high dynamics, limited resources, and unstable channel conditions. neglecting it might deteriorate the network performance, as mentioned above.

a simple cross-layer balancing routing (clb-routing) was proposed in [25]; it enhances the lifetime of wsns by balancing the energy consumption in the forwarding task. the clb-routing protocol is a bottom-up approach, where the network layer uses information given by the mac layer in the choice of the next hop. the algorithm proposed in [25] operates in two phases. the first is initialization, where the sink node broadcasts a route request message containing a cost variable initialized to zero. each node receiving this message updates the cost field according to its residual energy and the energy required for communication between that node and the sender of the route request, and then broadcasts it. the second phase is data transmission, where the mac layer informs the network layer about all the overheard communications of the neighbouring nodes. with this information, a node can know how many times each forwarding node has routed data.
according to this information, and to effectively balance the energy consumption of sensor nodes, a node chooses its next hop among the less-used ones. this choice is not random; it follows a probability that accounts for residual energy, energy of communication, and the number of times that each forwarding node has routed data. clb-routing takes important issues into account but lacks others, such as network reliability and buffer size, which eventually affects the network throughput and lifetime as described above.

a swarm intelligence based energy balance routing scheme (seb) was proposed in [26]. it utilizes swarm intelligence to maintain and balance residual energy on sensor nodes for wsn lifetime extension. the seb algorithm balances residual energy on sensor nodes as evenly as possible according to their weights. the node weight is related to the number of its neighbour nodes that may select it to relay their messages. the probability of selecting the next hop neighbour node is calculated according to residual energy, distance to the sink, weight of nodes, and the environment pheromone, which is related to path quality. nevertheless, seb has some drawbacks, since some issues are not considered. the first is the packet buffer capacity of sensor nodes. as described above, neglecting it might increase packet loss and packet retransmission, which inevitably affects the network efficiency. the second is the dynamic behaviour of the wireless link quality over time and space: the path quality is determined as a function of hop count, which can easily lead to the use of low-quality links and result in unreliable routes [27]. finally, the weight of nodes in seb is calculated on the assumption that the environment events are distributed uniformly. this might be inefficient when the environment events are distributed non-uniformly.
the swarm algorithm proposed in this paper considers the end-to-end reliability of a multi-hop route based on the packet reception rate (prr), which is one of the most commonly used reliability metrics [28]. in this model, the work analyzes the reliability of the whole path from the next hop node to the sink, and then chooses the relay node with the best prr, which improves the end-to-end reliability of a multi-hop route. moreover, the proposed algorithm can balance energy consumption among sensor nodes as evenly as possible through a new effective function of the nodes' residual energy and weight. a new weight definition is also proposed in this algorithm to achieve balanced energy consumption for both uniform and non-uniform event distributions in the environment. in addition, it can effectively alleviate buffer overflow by integrating the normalized buffer space into the routing choice. consequently, the local information in the proposed swarm solution refers to each neighbour's residual energy, weight, normalized buffer space, transmission distance, and pheromone. furthermore, a new pheromone update operator is designed to integrate energy, path length, and path quality into the routing choice.

3. problem description

consider a static multi-hop wsn deployed in the sensing field. in this model, we aim to achieve a reliable routing algorithm taking into consideration the nodes' energy consumption, energy balancing among sensor nodes, and the nodes' buffer space. the wireless sensor network can be modelled as a random geometric graph, g(v,l), where v denotes the set of sensor nodes, which are distributed randomly in the square monitoring field, and l represents the set of all communication links (i, j), where i, j ∈ v. link (i, j) exists if and only if nodes i and j are within radio range of each other.
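the network model above can be sketched in python as follows. this is a minimal illustration using only the standard library; the function names, the field side, the radio range and the seed are illustrative choices and are not values taken from the paper.

```python
import math
import random


def random_geometric_graph(n_nodes, side, radio_range, seed=0):
    """Place n_nodes uniformly at random in a side x side square and link
    every pair of nodes whose Euclidean distance is within radio_range."""
    rng = random.Random(seed)
    pos = [(rng.uniform(0.0, side), rng.uniform(0.0, side))
           for _ in range(n_nodes)]
    links = set()
    for i in range(n_nodes):
        for j in range(i + 1, n_nodes):
            # link (i, j) exists iff i and j are within radio range
            if math.dist(pos[i], pos[j]) <= radio_range:
                links.add((i, j))
    return pos, links


def neighbours(i, links):
    """The set neb_i: every node that shares a link with node i."""
    return {b if a == i else a for (a, b) in links if i in (a, b)}


if __name__ == "__main__":
    pos, links = random_geometric_graph(n_nodes=20, side=100.0, radio_range=30.0)
    print(len(links), "links; neighbours of node 0:", sorted(neighbours(0, links)))
```

links are stored once as ordered pairs (i, j) with i < j, so the graph is undirected and the neighbour relation is symmetric.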
the events in the environment are detected by some sensor nodes, which are called source nodes. we assume that the mac layer provides a link quality estimation service, e.g., the prr information on each link [29], so that each node is aware of the prr values to its one-hop neighbours. the information regarding the presence of the detected events at each source node should be reported to the sink node. since wsns are usually based on multi-hop transmission, the source nodes send their data to the sink through intermediate sensor nodes, which act as relay nodes. the path chosen from each source node to the sink should be the best path that satisfies the following constraints: 1) low communication cost, 2) reliability greater than or equal to a target value, 3) sensor nodes on the path should have the maximum value, compared with their neighbours, of a newly proposed function of residual energy and weight, in order to balance energy consumption among sensor nodes, and 4) sensor nodes should have a buffer space greater than or equal to the message size, in order to reduce packet loss and the energy consumed by retransmitting the same packets after buffer overflow. to simplify the description of the problem and its formulation, the notations used to model the problem are given in table 1.

table 1 our model notations (given parameters)

notation    description
s           the set of all sensor nodes that are in the sensing or sensing-relaying state
r           the set of all sensor nodes that are in the relaying state, except the sink node
prr         the set of packet reception ratios prr(i,j) associated with link (i, j)
wq          constant value less than or equal to 1
re_j        the residual energy of each sensor node j, j ∈ neb_i, neb_i ⊆ s ∪ r
se(i,j)     the energy required for a single hop transmission from i to j, (i, j) ∈ l
mes_i       the number of messages at node i, i ∈ s ∪ r
w_j         the weight of a neighbour j, j ∈ neb_i, neb_i ⊆ s ∪ r
ewr_j       the residual energy to weight ratio for each neighbour node j, j ∈ neb_i, neb_i ⊆ s ∪ r
enc_j(t)    the ratio between residual energy and initial energy for each neighbour node j at time t, j ∈ neb_i, neb_i ⊆ s ∪ r ∪ {sink}
pz          the packet size
bs_j(t)     buffer space in node j at time t, j ∈ neb_i, neb_i ⊆ s ∪ r ∪ {sink}
bm_j(t)     the normalized buffer space of node j at time t, j ∈ neb_i, neb_i ⊆ s ∪ r ∪ {sink}
nre_j       the ratio between re_j and se(i,j) for each neighbour node j, j ∈ neb_i, neb_i ⊆ s ∪ r
neb_i       the set of neighbours of node i, i ∈ s ∪ r, neb_i ⊆ s ∪ r ∪ {sink}

4. swarm based approach

this section describes the details of the proposed swarm technique for energy balanced and reliable routing in wsns. it covers the different parts of the proposed scheme, including the routing scheme, local heuristic information computation, pheromone computation, and neighbour node selection. the proposed swarm solution is composed of two phases. in the first phase, a set of forward ants is placed in the source nodes and moves through neighbouring relay nodes until the sink node is reached. in this algorithm, residual energy, weight, normalized buffer space, transmission distance, and pheromone are considered when calculating the probability of transferring a packet to the next hop neighbour.
at each node i, a forward ant k selects the next hop node j, j ∈ neb_i, randomly with a probability p_r^k(i,j), which is determined as follows:

$$ p_r^k(i,j) = \frac{[\tau_{ij}(t)]^{\alpha}\,[\eta_{ij}(t)]^{\beta}\,[\psi_{ij}(t)]^{\gamma}\,[\varepsilon_{ij}(t)]^{\lambda}\,[\delta_{ij}(t)]^{\phi}}{\sum_{l \in \mathrm{neb}_i} [\tau_{il}(t)]^{\alpha}\,[\eta_{il}(t)]^{\beta}\,[\psi_{il}(t)]^{\gamma}\,[\varepsilon_{il}(t)]^{\lambda}\,[\delta_{il}(t)]^{\phi}} \tag{1} $$

where τij(t) is the pheromone value on the link (i,j) at time t; ηij(t), ψij(t), εij(t), and δij(t) are the heuristic information of link (i,j) for node j; and α, β, γ, λ, and ϕ are the weight factors that control the pheromone value and the heuristic information parameters, respectively. when forward ant k reaches the sink node, it is transformed into a backward ant and the second phase starts. the backward ant starts from the sink node and moves towards its source node along the same path in the opposite direction, depositing an increment of pheromone on that path.

4.1. problem formulation

due to the use of multi-hop routing, the information about the detected events at each source node should be transmitted as messages to the sink node through intermediate nodes or relay nodes. in order to achieve energy balanced routing, a node with heavy weight and low residual energy should be prevented from being selected as a next hop. so, the proposed algorithm considers a model in which the sensor node's residual energy and weight are used, through a newly proposed function, when choosing the relay node. the weight of a neighbour j at time t is computed by equation (2):

$$ \mathrm{we}_j(t) = \begin{cases} \sum_{i \in \mathrm{neb}_j} \mathrm{mes}_i(t) & \text{if } hc_i \ge hc_j \\ 0 & \text{otherwise} \end{cases} \tag{2} $$

because the events detected in the monitored environment are distributed non-uniformly, the node weight is defined as the total number of messages at those neighbour nodes which may choose it to relay their messages. equation (2) means that packets are not allowed to be transmitted backward to neighbours with a higher hop count.
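equations (1) and (2) can be illustrated with the following python sketch. it assumes that the pheromone and the four heuristic values of each neighbour are already available as plain numbers; the function names, the dictionary layout, the default exponents and the uniform fallback for an all-zero score are hypothetical choices made only for illustration. the hop-count test in `node_weight` follows the reading that a neighbour i may relay through j only when the hop count of j is not higher than that of i.

```python
import random


def transition_probabilities(neigh, alpha=1.0, beta=1.0, gamma=1.0,
                             lam=1.0, phi=1.0):
    """Equation (1): probability of choosing each neighbour j of node i.

    neigh maps j -> (pheromone, h1, h2, h3, h4), where h1..h4 are the four
    heuristic values of link (i, j). The exponent defaults are placeholders.
    """
    if not neigh:
        raise ValueError("node has no neighbours")
    exponents = (alpha, beta, gamma, lam, phi)
    score = {}
    for j, values in neigh.items():
        s = 1.0
        for value, exponent in zip(values, exponents):
            s *= value ** exponent
        score[j] = s
    total = sum(score.values())
    if total == 0.0:
        # Assumption: fall back to a uniform choice when every score is zero.
        return {j: 1.0 / len(score) for j in score}
    return {j: s / total for j, s in score.items()}


def select_next_hop(neigh, rng=random, **exponents):
    """Draw the next hop at random according to equation (1)."""
    probs = transition_probabilities(neigh, **exponents)
    nodes = list(probs)
    return rng.choices(nodes, weights=[probs[j] for j in nodes], k=1)[0]


def node_weight(j, neb, mes, hc):
    """Equation (2): messages held by neighbours of j that may relay via j."""
    return sum(mes[i] for i in neb[j] if hc[i] >= hc[j])
```

raising an exponent increases the influence of the corresponding factor; for example, doubling `alpha` makes the choice follow the pheromone more closely than the heuristics.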
the hop-count restriction in equation (2) ensures that packets are forwarded closer to the sink and prevents loops from forming. in addition, the new function that combines residual energy and weight for each node j at time t, ewr_j(t), is defined by equation (3):

[equation (3): piecewise definition of ewr_j(t) in terms of nre_j(t), we_j(t) and exp(enc_j(t)), with separate cases depending on how nre_j(t) compares with we_j(t) and a final case for we_j(t) = 0]

because messages are relayed over multiple hops, each relay node needs to hold incoming data packets in a buffer during the processing time required for the previous ones. since sensor nodes have limited memory, it is impossible to buffer a large number of packets. consequently, the buffer of the relay node may overflow, resulting in the loss of important packets and more energy consumption due to the retransmission of the same packets [30]. for efficient use of the available buffer, we consider a model in which the probability of buffer overflow is minimized as much as possible by integrating the normalized buffer space into the routing choice. the normalized buffer space is defined as the ratio between the buffer space and the packet size. it expresses the number of packets that a sensor node can receive at a certain time without its buffer overflowing. the normalized buffer space of node j at time t is defined as follows:

$$ \mathrm{bm}_j(t) = \begin{cases} \dfrac{\mathrm{bs}_j(t)}{pz} & \text{if } \mathrm{bs}_j(t) \ge pz \\ 0 & \text{otherwise} \end{cases} \tag{4} $$

4.2. calculation of local heuristic information

in order to maintain a higher and balanced residual energy on sensor nodes, the proposed relation between residual energy and weight is used as heuristic information when selecting the next hop neighbour node; it is denoted by ηij(t).
( ) ( ) ( ) i j ij l l neb ewr t t ewr t     (5) according to this rule, the node with the greater value of ij will have a higher residual energy compared to its weight and a much better opportunity to be chosen as a next hop. since energy conservation is an essential issue in wsn, selecting the nodes with minimum hop count is required to minimize energy consumption and conserve much more energy as possible. therefore, the hop count from neighbour node j to the sink node is used as heuristic information which is denoted by ij(t). ( ) 1 ( ) ( ) 1 i i j ij i j l neb hc hc t hc hc        (6) a neighbour node that has a greater value of ij(t) is closer to the sink than the others and will be more likely to be chosen as next hop. in order to avoid or reduce packet loss due to buffer overflow which in turn improve the overall network performance, it is critical to send packets to the sensor node with more buffer space or less traffic load. therefore, bmj(t) can be used as heuristic information which denoted by ij(t) ( ) ( ) 1 ( ) i j ij l l neb bm t t bm t      (7) this rule enables decision making according to the buffer apace on the neighbour nodes, meaning that if a node has a greater value of ij(t) then it has a much better opportunity to be chosen as next hop. due to the dynamic behaviour of the wireless link quality over time and space, it is essential to use the current packet reception ratio of link (i,j), prrij as heuristic information to improve the network throughput. it is denoted by ij(t) ( ) i ij ij lj l neb prr t prr     (8) where, the greater value of ij(t) indicates that the link (i,j) more reliable than others. thus the neighbour node j will have more chance to be chosen as next hop. swarm intelligence based reliable and energy balance routing algorithm for wireless sensor network 347 4.3. 
pheromone calculation in this algorithm, pheromone concentration is affected by the combination between energy, path length, and path quality in a new effective form. this may improve network reliability, reduce energy consumption, and achieve more balanced transmission among the nodes. let’s begin with the calculation of the path quality, qp, which related to the prr by equation (9). p p q prr (9) where, prrp, represents the packet reception ratio of the path p. due to the use of multi-hop routing, the prrp can be computed by the prr of each hop on the path p as follow: ( , ) p p ij i j n prr prr    (10) where, np is the set of edges on the path p (hop count). in this model, all nodes have the same fixed transmission range. so, the number of hops in the path p is considered as the path length, lp as follow: p p l n (11) by estimating the length of each possible path for the same source node, the best path length lpbest is recorded at the sink. then, the relative length of path p can be determined as follows: rlp = lpbest / lp = npbest / np (12) the increasing density of pheromone on the path p is defined as follows: ij = (rlp  prr w1) w2  (e p min) w3 / n 2 p (13) where e p min is the minimum residual energy of nodes visited by ant k and the parameters w1, w2, and w3 determine the relative influence of the energy, path length, and path quality. the sink node constructs the value of pheromone update operator, ij and sent it back as a backward ant to its source node along the reverse path. whenever a node i receives a backward ant k coming from neighbouring node j, it updates its pheromone concentration according to the following rule: ijijij tt   )1()1()( (14) where,   (0,1) is the evaporation constant that determines the evaporation rate of the pheromone [26]. 348 f. elfouly, r. ramadan, m. mahmoud, m. dessouky 5. 
performance evaluation

in this section, several experiments are conducted to evaluate the performance and validate the effectiveness of our proposal. the section first describes the performance metrics, then the simulation environment, and finally the simulation results.

5.1. performance metrics

for a comprehensive performance evaluation, the following quantitative metrics are used.

1. network lifetime [5]. it is defined as the time duration from the beginning of the network operation until the first node exhausts its battery.

2. energy imbalance factor (eif) [5]. it quantifies the energy balance characteristic of the routing protocol and is defined formally as the standard variance of the residual energy of all nodes:

$$ \mathrm{eif} = \frac{1}{n} \sum_{i=1}^{n} \left( re_i - re_{avg} \right)^2 $$

where n is the total number of sensor nodes, re_i is the residual energy of node i, and re_avg is the average residual energy of all nodes.

3. throughput ratio (tr) [25]. this metric is defined as:

$$ \mathrm{tr} = \frac{\text{number of packets received by the sink}}{\text{number of packets sent by source nodes}} $$

4. average end-to-end delay (seconds) [30]. it is defined as the average time a packet takes to travel from the source node to the sink node. this includes propagation, transmission, queuing, and processing delay. the processing delay can be ignored because of the fast processing speed [31].

5.2. simulation environment

in this paper, the simulation environment consists of 80 sensor nodes deployed randomly in a field of 1000 m x 1000 m. the sink node and the sensor nodes are stationary after being deployed in the field. furthermore, the sink node is located at (1000, 500) m. all the later experiments are done for both homogeneous and heterogeneous node energy distributions on a custom matlab simulator. data traffic is generated according to a poisson process with mean parameter ζ. in addition, we choose a harsh wireless channel model, which includes shadowing and deep fading effects, as well as the noise [32].
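the metrics of section 5.1 can be sketched in a few lines of python. this is only an illustrative sketch, not the authors' matlab simulator: the function names and the `energy_trace` input format (a time-ordered list of `(time, residual_energies)` pairs) are assumptions made for the example.

```python
from statistics import fmean


def energy_imbalance_factor(residual_energy):
    """eif: mean squared deviation of the residual energy from its average."""
    values = list(residual_energy)
    avg = fmean(values)
    return fmean([(re_i - avg) ** 2 for re_i in values])


def throughput_ratio(received_by_sink, sent_by_sources):
    """tr: packets received by the sink divided by packets sent by the sources."""
    if sent_by_sources <= 0:
        raise ValueError("no packets were sent")
    return received_by_sink / sent_by_sources


def network_lifetime(energy_trace):
    """first time stamp at which any node has exhausted its battery.

    energy_trace is a hypothetical, time-ordered list of
    (time, [residual energy of every node]) pairs.
    returns None if no node dies within the trace.
    """
    for time_stamp, energies in energy_trace:
        if min(energies) <= 0:
            return time_stamp
    return None
```

a lower eif means the residual energy is spread more evenly, which is the sense in which the energy balance of the protocols is compared later in the section.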
in this simulation, the chipcon cc2420 radio transceiver is taken into consideration [1]. the simulation parameters are listed in table 2. in the later experiments, we use the combination (α = 2, β = 2, γ = 1, λ = 1, and ϕ = 12); the evaluation results show that this combination is the best for all experiments.

table 2 simulation environment parameters

parameter                      value
network size                   1000 m × 1000 m
number of nodes                80
number of sink nodes           1
node placement                 random uniform
packet size                    64 bytes
frequency                      2400 MHz
transmission power             -5 dBm
maximum transmission range     223 m
channel model                  log-normal shadow
path loss exponent             6
shadow fading variance         6
noise power                    -145 dBm
reference distance             3 m

5.3. simulation results

to verify the feasibility and effectiveness of our proposal, its performance is compared, in terms of network lifetime, energy imbalance factor, throughput ratio, and average end-to-end delay, with the protocols proposed in [5][23][24][25][26] for homogeneous and heterogeneous networks. we implemented all of the algorithms in [5][23][24][25][26].

5.3.1. network lifetime evaluation for homogeneous and heterogeneous networks

in this experiment, the performance of the proposed swarm approach is evaluated in terms of network lifetime for both homogeneous and heterogeneous networks compared to ebrp [5], aco proposed in [23], tadr [24], clr-routing [25], and seb [26] under different traffic rates σ. the initial energy on each sensor node is 125 mJ for the homogeneous network, while it is chosen randomly between 100 and 125 mJ for the heterogeneous network. fig. 1 and fig. 2 show the variation of network lifetime with respect to the traffic rate σ for homogeneous and heterogeneous networks respectively. from the figures it can be seen that as the value of σ increases, the network lifetime decreases.
since the network traffic increases with σ, the relay load of the nodes increases linearly, which is the main reason behind the decrease in lifetime. however, the figures clearly show that our swarm algorithm significantly enhances the network lifetime compared with the others for both homogeneous and heterogeneous networks. this means that our swarm algorithm balances the network energy consumption more effectively than the others.

fig. 1 network lifetime vs. traffic rate σ for homogeneous network
fig. 2 network lifetime vs. traffic rate σ for heterogeneous network

5.3.2. network reliability evaluation for homogeneous and heterogeneous networks

in this experiment, the performance of the proposed swarm approach is evaluated in terms of tr for both homogeneous and heterogeneous networks compared to ebrp [5], aco proposed in [23], tadr [24], clr-routing [25], and seb [26] under different traffic rates σ. the initial energy on each sensor node is 125 mJ for the homogeneous network, while it is chosen randomly between 100 and 125 mJ for the heterogeneous network. the tr against the traffic rate σ for homogeneous and heterogeneous networks is depicted in fig. 3 and fig. 4 respectively. clearly, our swarm algorithm achieves the highest tr compared to the others. this is because it forwards the data packets toward the sink in a more reliable way and alleviates possible buffer overflow.

fig. 3 network throughput vs. traffic rate σ for homogeneous network
fig. 4 network throughput vs. traffic rate σ for heterogeneous network

5.3.3.
energy balancing evaluation for homogeneous and heterogeneous networks

in this experiment, the performance of the proposed swarm approach is evaluated in terms of energy balance for both homogeneous and heterogeneous networks compared to ebrp [5], aco proposed in [23], tadr [24], clr-routing [25], and seb [26]. the initial energy on each sensor node is 125 mJ for the homogeneous network, while it is chosen randomly between 100 and 125 mJ for the heterogeneous network. in this set of experiments, the traffic rate σ is assumed to equal 5. the eif was calculated during the running time to assess the network's balance efficiency. fig. 5 and fig. 6 present the variation of eif over simulation time for homogeneous and heterogeneous networks respectively. as shown in the figures, eif increases with running time. the increase in eif is due to the heavier use of the sink node's neighbours compared with the other nodes, which reduces the average residual energy. however, according to the results in fig. 5 and fig. 6, the eif of our swarm algorithm is the lowest among all the algorithms. this means that in our swarm algorithm the energy of each node in the network stays close to the average energy, in contrast to the others. that is to say, our swarm algorithm balances residual energy among sensor nodes efficiently.

fig. 5 eif vs. simulation time for homogeneous network
fig. 6 eif vs. simulation time for heterogeneous network

5.3.4. average end-to-end delay evaluation for homogeneous and heterogeneous networks

in this experiment, the performance of the proposed swarm approach is evaluated in terms of end-to-end delay for both homogeneous and heterogeneous networks compared to ebrp [5], aco proposed in [23], tadr [24], clr-routing [25], and seb [26] under different traffic rates σ.
the initial energy on each sensor node is 125 mJ for the homogeneous network, while it is chosen randomly between 100 and 125 mJ for the heterogeneous network. fig. 7 and fig. 8 show the average end-to-end delay under different traffic rates σ for homogeneous and heterogeneous networks respectively. from the results, it is observed that the end-to-end delay increases as the traffic rate increases. a higher traffic rate causes more queuing delay, which raises the end-to-end delay. however, it is clear that our swarm approach gives the lowest end-to-end delay compared with the others. this is because our swarm approach forwards the data packets toward the sink in a more reliable way and alleviates possible buffer overflow, which decreases packet loss and retransmissions and hence the end-to-end delay.

fig. 7 average end-to-end delay vs. traffic rate σ for homogeneous network
fig. 8 average end-to-end delay vs. traffic rate σ for heterogeneous network

6. conclusions

in this work we presented an efficient routing algorithm that uses swarm intelligence for wsns. the proposed approach not only reduces the energy consumption but also balances it among sensor nodes to extend the wsn lifetime. at the same time, the sensed data are delivered to the sink with the highest possible reliability and minimum buffer overflow. the performance of the proposed method was evaluated and analyzed through simulation and compared with previous related works, namely ebrp, aco, tadr, seb, and clr-routing. simulation results showed that our approach is robust, achieves a longer lifetime, and gives a lower end-to-end delay than the previous works for both homogeneous and heterogeneous networks.

references

[1] f. elfouly, r. ramadan, m. mahmoud, m.
dessouky, “swarm intelligence based reliable and energy balance routing algorithm for wireless sensor network”, in proceedings of the international conference on recent advances in computer systems racs-2015, hail university, saudi arabia, november 2015 [2] “micaz wireless module.” [online]. available http://www.cmt-gmbh.de/micaz.pdf. [3] h. m. ammari, “challenges and opportunities of connected k covered wireless sensor networks-from sensor deployment to data gathering s,” springer, 2009 [4] g.j. pottie and w.j. kaiser, “ wireless integrated network sensors,” communications of acm, vol. 43, no. 5, pp. 51-58, 2000. [5] f. ren, j. zhang, t. he, c. lin, and s. k. das, “ebrp: energy-balanced routing protocol for data gathering in wireless sensor networks,” ieee trans. on parallel and distributed systems, vol. 22, no. 12, december 2011. [6] x. liu, “a transmission scheme for wireless sensor networks using ant colony optimization with unconventional characteristics,” ieee communications letters, vol. 18, no. 7, pp. 1214-1217, 2014. [7] g. campobello, a. leonardi, and s. palazzo, “improving energy saving and reliability in wireless sensor networks using a simple crt-based packet-forwarding solution,” ieee/acm transactions on networking, vol. 20, no. 1, pp. 191–205, 2012. [8] a. zonouz, l. xing, v. vokkarane, and y. sun, “reliability-oriented single-path routing protocols in wireless sensor networks,” ieee sensors journal, vol 14, no. 11, pp 4059-4068, june 2014. [9] j. niu, l. cheng, y. gu, l. shu, and s. das, “r3e: reliable reactive routing enhancement for wireless sensor networks,” ieee transactions on industrial informatics, vol. 10, no. 1, pp. 784–794, 2014. [10] a. m. kamal, c. j. bleakley, and s. dobson, “failure detection in wireless sensor networks: a sequence-based dynamic approach,” acm transaction on sensor networks (tosn), vol. 10, 2014. [11] f. viani, p. rocca, m. benedetti, g. oliveri, and a. 
massa, “electromagnetic passive localization and tracking of moving targets in a wsn-infrastructured environment,” inverse problems, vol. 26, no. 074003, pp. 1-15, 2010. [12] ch. blum, d. merkle, “swarm intelligence introduction and applications,” natural computing series, springer, berline, 2008. [13] r. r. mccune and g. r. madey, “control of artifial swarms with dddas,” in proceedings of the 14th international conference on computational science (iccs), elsevier, vol. 29, pp. 1171-1181, 2014. [14] a. r. sardar, m. singh, r. r. sahoo, k. majumder, j. k. sing, and s. k. sarkar, “an efficient ant colony based routing algorithm for better quality of services in manet,” ict and critical infrastructure: in proceedings of the 48th annual convention of computer society of india-vol i, advances in intelligent systems and computing, springer lncs, vol. 248, pp. 233-240, 2014. [15] p. rocca, m. benedetti, m. donelli, d. franceschini, and a. massa, “evolutionary optimization as applied to inverse problems,”, inverse problems 25th year special issue of inverse problems, invited topical review, vol. 25, pp. 1-41, dec. 2009. [16] m gunes, u sorges, i bouazzi, “ara-the ant-colony based routing algorithm for manets,” international workshop on ad hoc networking, pp. 79-85, 2002. [17] d. zhang, g. li, and k. zheng, “an energy-balanced routing method based on forward-aware factor for wireless sensor network”, ieee trans. on industrial informatics, vol. pp, no. 99, 2013, pp.1. [18] w. jianguo, w. zhongsheng, s. fei, and s. guohua, “research on routing algorithm for wireless sensor network based on energy balance”, in proceedings of the industrial control and electronics engineering (icicee '12), 2012, pp. 295-298. [19] a. m. s. almshreqi, b. f. a. rasid, a. ismail, and p. varahram, “an improved routing mechanism using bio-inspired for energy balancing in wireless sensor networks”, in proceedings of the information network (icoin '12), 2012, pp. 150-153. 
[20] k. yu, m. gidlund, j. akerberg, and m. bjorkman, “reliable rss-based routing protocol for industrial wireless sensor networks”, in proceedings of the 38th annual conference of the ieee industrial electronics society (iecon), canada, october, 2012. [21] j. niu, l. cheng, y. gu, l. shu, s.k. das, “r3e: reliable reactive routing enhancement for wireless sensor networks”, ieee trans. on industrial informatics, vol.pp, no.99, 2013, pp.1. [22] d. sahin, s. bulbul, v.c. gungor, t. kocak, “reliable routing in wireless sensor networks for smart grid environments”, in proceedings of the 20th ieee conf. on signal processing and communications applications (siu), 2012, pp. 1-4. [23] a. el ghazi, b. ahiod, and a.
ouaarab, “improved ant colony optimization routing protocol for wireless sensor networks,” in p. g. noubir and m. raynal (eds.): netys 2014, pp. 246-256, springer, heidelberg, 2014. [24] f. ren, s. k. das, and c. lin, “traffic-aware dynamic routing to alleviate congestion in wireless sensor networks,” ieee transactions on parallel and distributed systems, vol. 22, no. 9, september 2011. [25] s. yaessad, l. bouallouche, and d. aissani, “a cross-layer routing protocol for balancing energy consumption in wireless sensor networks“ wireless pers. commun., springer, 2014. [26] d. qian, h. chen, w. wu, and l. cheng, “swarm intelligence based energy balance routing for wireless sensor networks”, in proceedings of the 2nd international symposium on intelligent information technology application, vol. 2, pp.811-815, 2008. [27] x. baoshu, and w. hui, “a reliability transmission routing metric algorithm for wireless sensor network”, in proceedings of the ieee international conference e-health networking, digital ecosystems and technologies (edt), vol.1, pp.454 – 457, 2010. [28] s. b. kootkar, “reliable sensor networks”, m.s. thesis, dept. comp. eng., tu delft univ., delft, netherlands, 2008. [29] l. cheng, j. nia, j. cao, s. k. das, and y. gu, “qos aware geographic opportunistic routing in wireless sensor networks”, ieee trans. on parallel and distributed systems, 2014. [30] g. s. sharvani, n. k. cauvery, t. m. rangaswamy, “different types of swarm intelligence algorithm for routing,” in proceedings of the ieee international conference on recent technologies in communication and computing (artcom), kottyam, kerala, india, pp.604 – 609, 2009. [31] v. k. verma, s. singh, and n. p. pathak, “analysis of scalability for aodv routing protocol in wireless sensor networks,” optik—international journal for light and electron optics, vol. 125, no. 2, pp. 748– 750, 2014. [32] d. jian, “cloud model and ant colony optimization based qos routing algorithm for wireless sensor networks,” y. 
wu (ed.): international conference on wtcs 2009, aisc 116, pp. 179–187, springer, heidelberg, 2012.

facta universitatis series: electronics and energetics vol. 33, no. 2, june 2020, pp. 227-241 https://doi.org/10.2298/fuee2002227a
© 2020 by university of niš, serbia | creative commons license: cc by-nc-nd

a fuzzy based parametric monitoring and control algorithm for distinctive loads to enhance the stability in rural islanded microgrids

fawad azeem, ghous bakhsh narejo
department of electronic engineering, ned university of engineering and technology karachi, pakistan

abstract. effective monitoring and control of isolated rural microgrids in the developing world is challenging. modern communication and monitoring are difficult to maintain in such communities because the areas are hard to reach, lack modern facilities, and have no skilled manpower. implementing a microgrid in such areas using intermittent renewable sources and limited storage is therefore challenging. uncontrolled load consumption leads to system-wide outages due to prolonged storage utilization in peak hours, referred to here as battery storage stress hours (bssh). this research studies and analyzes the behavior of a parametric load monitoring and control algorithm that can control the distinctive loads of the microgrid during bssh. in the proposed algorithm, the residential loads are distinctively controlled using three locally available parameters: the state of charge of the storage, solar irradiation, and ambient temperature. in other words, natural parameter variations are used as a monitoring tool for load control.
the fuzzy controller decides whether to activate or deactivate a load based on the variation ranges of the three parameters. the simulation and experimental results show that effective load control is possible using only locally available parameters.

key words: microgrid, stability and control, algorithms, renewable energy, fuzzy logic, load

received may 9, 2019; received in revised form august 13, 2019
corresponding author: fawad azeem, department of electronic engineering, ned university of engineering and technology karachi, pakistan (e-mail: fawad.azeem1@gmail.com)

1. introduction

there has been exponential growth in power system installations and in the enhancement, rehabilitation and upgrading of transmission systems for reliable power supply. this development in power systems has not only strived to fulfill the energy demand of the users but has also provided a platform for researchers to explore and address the issues of modern power systems. despite all the research and actual enhancements, some rural areas in developing countries remain neglected, and more than a billion people in the world will remain without electricity by 2030 [1]. these unconnected rural areas are far from the developed cities, and their inhabitants cannot afford expensive modes of power generation such as diesel generators to fulfill their power needs. nevertheless, recent research shows that microgrids, which are low-voltage power systems, are a potential candidate to resolve the issue. a microgrid is a low-voltage transmission system with its own generation, control and storage that supplies power to remote locations where the mainstream network is unreachable [2]. microgrids control the voltage and power within the system using a centralized controller equipped with energy management algorithms [10-15], load control [18] and demand response [10], [18].
these control schemes are one of the integral parts of the islanded microgrid that provides stable power to the end users while ensuring efficient power dispatch, optimized storage utilization and real-time load control. several control algorithms have been introduced for power balanced demand and supply match. such control schemes utilize monitoring [16], robust communication protocols [18], [25], internet for forecasts [18], [27] for feedback monitoring and balanced control between the loads, generation and storage. burgio et al. [4] proposed the central load controlling algorithm using behavior tree model approach to control the house-hold load. the central controller acts as a compact box to take optimized decisions based on different parameters. however, strong communication is used to get parameter information and forecasting, etc. su sheng et al. [5] developed a dynamic programming model for economically utilizing the 3 power sources that are solar pv, battery storage and diesel generators. burmester et al. [6] presented a centralized load controlling mechanism with the objective to economically purchase power from the grid. the mechanism is based on a thermostat controller where loads are shifted to the solar pv source while ensuring maximum power being produced by the solar pv. the central controller utilizes the maximum power point tracking scheme for instantaneous load shifting to the cheaper source of energy hence reducing the power purchase from the main grid network. thanh lich et al. [7] presented a decentralized droop control scheme to monitor the voltage and current deviation for the power balance between the load and distributed generation sources. similarly, mashood nasir et al. [8] provided a decentralized i-v droop control technique that takes the information of the solar dc bus and state of the charge (soc) of the storage for the coordinated power-sharing among the contributing grids. annette et al. 
[9] presented a bi-level power sharing scheme between multiple microgrids using bi-level converters. however, load control is not discussed in detail. mohammad ali fotouhi et al. [10] presented a smart home energy management algorithm to schedule the controllable loads programmed on smart meters. bartosz soltowski et al. [11] presented smart residential load management that utilizes predefined consumption profiles of particular households, which act as the reference energy to be supplied to or received from each house. the results are encouraging, but the authors tested the system only in simulation. r. k. chauhan et al. [12] developed a load sharing control technique between the battery storage and the distributed generation source. the system was tested on distributed as well as lumped loads. the dc microgrid storage follows its own ideal charging/discharging characteristic for power sharing; however, a reliable public utility source is assumed in the operation. ashray manur [13] proposed a novel energy management system for residential microgrids that intelligently controls the power dispatch from the battery to the load using multi-layer communication, cloud computing and the internet of things. garells et al. [14] presented a novel idea of generating power and sharing it with neighbouring houses in case of excess generation, using networked and wireless communication infrastructure. in such cases, the consumer also becomes a prosumer. each household has its own generation source and also acts as a load when less power is available. j. gouveia et al. [15] presented an energy management algorithm to control the load during autonomous operation. the objective is to maximize the autonomous operation. the load management algorithm relies on the forecasting result.
the main objective of the research is to maximize the autonomous operational hours of the microgrid when the grid power is not available. 1.1 problem statement islanded residential microgrids operation during the night hours needs a comprehensive energy management algorithm that takes care of overall energy balance due to limited storage and intermittent distributed generation as a prime power source. mostly, residential loads are under operation during the night hours and less storage availability leads to the system wide outages. real-time monitoring is the heart of the energy management system responsible for effective energy management, load management, feedback control and demand and supply match. different communication layers, protocol, wireless communication, serves as a feedback controlling tool to isolated microgrids systems [16], [17], [25]. on the other hand, internet, weather forecast, market trends have been used for the generation, load forecasting and for economic operation [25]. however, the area under study has a major setback of easy access to all the discussed facilities and modern monitoring and communication tools [25]. if these are made available up to some extent to these areas, then its operation, maintenance and effective utilization is questionable due to less educated areas and unskilled manpower. 1.2 proposed residential load management scheme a remote area in punjab pakistan has been taken as a case study. the interviews from the residents were taken to find out the related loads, total number of houses and the expected total number of loads in one house as shown in table 1. climate conditions of the area, educational level and availability of the technical and communication facilities were also checked. 
the research paper focuses on developing real-time monitoring and control of residential loads using only the available intrinsic parameters of the microgrid's integrated components, namely storage, solar power generation and temperature. the objective is to practically test the performance of the microgrid in terms of its operation during the bssh. the feedback monitoring is based on the states of intrinsic parameters such as solar irradiation, light intensity, temperature and the state of charge of the storage. a fuzzy logic controller takes the parametric states as real-time inputs and, based on the designed rules, sets the operation of the different distinctive loads within a single household. the fuzzy controller is a distributed controller installed at every home. fig. 1 shows the block diagram of the distributed controller. the salient features of the proposed algorithm are:

1. distributed controller approach: a failure of the controller in one home does not affect the overall grid
2. real-time monitoring based only on the intrinsic parameters, which reduces the need for communication and networks and relies only on installed sensors
3. energy conservation and energy efficiency
4. the distributed controller is flexible in operation: the ranges of the fuzzy input membership functions can be set to suit the residents. for example, the activation and deactivation values for the fan loads can be set according to the comfort and weather conditions of the area.

table 1 expected size of the microgrid under study based on an interview

load              power (watts)   number in single house   total number of houses   total consumption (kw)
fan               150             2                        30                       9.0
lights            10              3                        30                       0.9
water pump        1000            1                        30                       30.0
washing machine   150             1                        30                       4.4

fig. 1 proposed microgrid algorithm block diagram

the fuzzy logic controller is responsible for the operational duration of each household load based on the developed fuzzy rules and the set parametric ranges.
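the totals in table 1 appear to follow from multiplying the rated power of a load by the number of units per house and the number of houses. the short sketch below reproduces that arithmetic; the formula is an assumption inferred from the table, and the names `LOADS`, `HOUSES` and `total_kw` are hypothetical.

```python
# rated power (watts) and number of units per house, as listed in table 1
LOADS = {
    "fan": (150, 2),
    "lights": (10, 3),
    "water pump": (1000, 1),
    "washing machine": (150, 1),
}
HOUSES = 30  # total number of houses in the case study


def total_kw(power_w, units_per_house, houses=HOUSES):
    """assumed sizing rule: power x units per house x houses, in kilowatts."""
    return power_w * units_per_house * houses / 1000.0


totals = {name: total_kw(power, units) for name, (power, units) in LOADS.items()}
for name, kw in totals.items():
    print(f"{name}: {kw:.1f} kw")
```

under this assumed rule the fan, lights and water pump rows match table 1, while the washing machine row evaluates to 4.5 kw rather than the 4.4 kw listed.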
the fuzzy controller makes real-time decisions based on the condition of the microgrid. in case of better parametric conditions, the operation of the distinctive loads increases, on the other hand in case of less soc and solar irradiations availability, the operation of the loads reduces, for instance, while monitoring the operation of the fan, the ambient temperature, soc of the storage and solar irradiations are monitored and based on the real-time values the duration of the fan operation is decided. if soc is low and less solar irradiations are available the operation of the fans will be reduced. the same applies for the case of other loads. table 2 shows the loads and associated membership functions for fuzzy ranges and rule base. in this way, the distinctive controller makes it possible to rely only on the available parameters of the grid to make intelligent decisions. the complete flow diagram of the distinctive load controller is shown in fig. 2. a fuzzy based parametric monitoring and control algorithm for distinctive loads to enhance the stability... 231 fig. 2 distinctive load controller flow diagram table 2 membership function assigned to each load for fuzzy rule base 2. mathematical model the mathematical for fuzzy decision parameters such as soc estimation, conversion of solar irradiation to light intensity, conversion of solar irradiations into solar pv, residential loads and consumption of residential loads are given in this section. 2.1. residential load washing machines pwm, motors pmotor that work as pumps for irrigation purpose and also for household use, lights pl and fans pf equation 1 shows the load of total residential loads ( ) (∑ ∑ ∑ ∑ ) (1) load input membership functions soc ambient temperatures solar irradiations fans   lights   motor pump   extra load   232 f. azeem, g. b. narejo 2.2. residential load consumption model consumption model for electricity device for monthly consumption is given in equation 2 and 3 [24]. 
equation 2 is used to calculate the monthly power consumption, where the duration of the power consumption is taken in seconds. the density of the usage is calculated using equation 3. equation 1 can be used to calculate all the residential loads being utilized in the islanded microgrid.
∑ ( ( )) ( ( )) (2)
( ) ( ) { ( ) ( ) ( ) (3)

2.2.1. state of the charge
the soc of the centralized storage can be calculated as done in [20], using equations 4 to 7.
( ) ( ) ( ) ( ) ( ( ) ) (4)
the battery dispatch at a particular instant of time, pb(t), can be written as given in equation 5:
( ) ( ) ( ) ( ) (5)
the soc during discharging can be calculated as
( ) ( )( ) ( ( )( ) ( ) ) (6)
simplifying equation 4 gives
( ) ( )( ) ( ) (7)

2.2.2. solar irradiation
the solar irradiation is used to calculate the light intensity and the solar photovoltaic output (eq. 8). the light intensity identifies the available light, which governs the operation of the lighting loads. equation 8 is used in [20] to convert the solar irradiation to solar pv:
( ) ( ) ( ) (8)

2.2.3. mathematical representation of fuzzy logic controller
equations 9, 10 and 11 are the membership functions for the fuzzy controller. they determine the range of each membership function for the operation of the distinctive loads. triangular fuzzy membership functions with fuzzy sets are generated as shown in figs. 3 to 5 [21]. the xi elements are divided into two operational functions, high and low. the reason is the consistent climatic conditions of the area under study, which mostly has hot weather with a very short winter. only two seasons, summer and winter, affect the area, and the temperature is either very hot or very cold, so for simplicity only two ranges, low and high, are considered; these comfortably cover the decision making of the microgrid.
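the two-range (low/high) triangular membership functions described above can be sketched in python. this is a minimal illustration under stated assumptions, not the authors' implementation: the function names and every breakpoint (20/35/50 °c for high temperature, 200/800/1400 w/m² for high solar irradiation) are hypothetical values chosen for the example, and the minimum operator is assumed for combining the two inputs of a fan rule.

```python
def triangular(x, a, b, c):
    """Triangular membership: 0 at the feet a and c, 1 at the peak b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def high_temperature(temp_c):
    # Hypothetical breakpoints in degrees Celsius, for illustration only.
    return triangular(temp_c, 20.0, 35.0, 50.0)


def high_irradiation(si_w_m2):
    # Hypothetical breakpoints in W/m^2, for illustration only.
    return triangular(si_w_m2, 200.0, 800.0, 1400.0)


def fan_rule_strength(temp_c, si_w_m2):
    """Firing strength of 'IF temperature is high AND irradiation is high THEN fan on'.

    The AND is taken as the minimum of the two memberships (an assumption here).
    """
    return min(high_temperature(temp_c), high_irradiation(si_w_m2))


if __name__ == "__main__":
    for temp, si in [(35, 800), (27.5, 800), (15, 800)]:
        print(temp, si, fan_rule_strength(temp, si))
```

a full controller would add the low sets, the soc input and a defuzzification step; the sketch only shows how one input pair maps to a rule strength between 0 and 1.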
equation 9 gives the fuzzy input membership function of the state of the charge.
*∑ ( ) + (9)
here, is the state of the charge membership function with element. similarly, equations 10 and 11 give the membership function of temperature.
*∑ ( ) + (10)
*∑ ( ) + (11)

fig. 3 input membership function of solar irradiations for fans and lights (panels a and b)
fig. 4 (a) input membership function of soc for the solar pump; (b) input membership function of soc for the extra load
fig. 5 (a) input membership function of si for the extra load; (b) input membership function of si for the water pump

3. simulation
to analyze the effectiveness of load operation in response to variations in the intrinsic parameters, a fuzzy simulation was performed in the matlab toolbox using a mamdani fis. fig. 6 (a) shows the operation of the fans. the solar irradiation membership function represents the available solar power and the temperature membership function represents customer comfort. the operation of the fan depends on the temperature and the available solar irradiation. if the temperature remains high and high solar power is available, the fan operates; with low solar power and low temperature, fan operation is reduced or turned off. similarly, fig. 6 (b) shows the operation of the lights.

fig. 6 fuzzy surface plots of load operation in response to variation in the membership functions: (a) fans, (b) lights, (c) extra load and (d) water pump

the membership functions that control the lighting loads include the si; temperature, however, has the least dominance in controlling the lighting loads. in the same way, fig.
6 (c) shows the activation and deactivation of the extra load in response to variations in the solar irradiation and the state of the charge of the storage. washing machines and similar loads are responsive, in the sense that their operation can be shifted to other hours when grid conditions are unfavourable. the fuzzy rules for the washing machine have therefore been set with flexible ranges of membership functions, as shown in fig. 5. these flexible ranges help save power and run such loads under better soc and solar power conditions. fig. 6 (c) shows that the washing machine remains inactive under low soc and solar irradiation conditions and is activated only under better soc and si conditions. fig. 6 (d) shows the behavior of the water pump load in response to variation in the si and soc membership functions. as for the extra load, the fuzzy rules and specified ranges are set so as to provide maximum power to the lighting and fan loads during the night time, which helps save more battery for the bssh hours. fig. 6 (d) shows that the water pump operates during the day time only when maximum power is available. however, as the water pump is required for irrigation as well as domestic use, its fuzzy rules and ranges are designed with slightly more flexible values than the extra load ranges, so the pump is activated more often than the extra loads.

4. experimentation
lab-scale experiments were conducted on the testbed shown in fig. 7. the controller with embedded fuzzy logic control was implemented on an atmega 2560 [22], [23] in c++. inductive and resistive loads were used to mimic the behavior of the loads commonly used in the villages, such as washing machines, water pumps, lights and fans.
the loads are connected to the 60 ah storage, which is charged by the installed solar panel. the loads are activated and deactivated based on commands from the fuzzy controllers. the loads are actuated using arrays of relays installed with them. table 3 shows the sensors used during the experimentation. to validate the robustness of the algorithm, a single-day and a five-day analysis were carried out. the single-day analysis was done under good solar irradiation, whereas the five-day analysis comprises sunny as well as cloudy days.

table 3 sensors used in the testbed
name of intrinsic parameter | name of the sensor
solar irradiation for pv power | pyranometer
temperature measurement | ds18b20 sensor
state of the charge measurement | acs712 sensor

fig. 7 microgrid testbed

4.1. single day analysis
the behavior of the different loads was analyzed during real-time testing. fig. 8 shows the operation of the fans over the 24-hour period. the fans started at 9:30, after the temperature rose above 20 °c, and remained on until 23:00, after which the temperature started decreasing. the total operation time of the fans was approximately 50% of the day. in the same way, fig. 9 shows the operation of the lighting loads. the lights remained off while the solar irradiation was high, since high solar irradiation indicates that the light intensity is sufficient and no lights are required at that instant. the lighting load remained inactive from 8:30 to 17:30, that is, for 37.5% of the day. this also supports energy conservation, as the lights remain off during the sunny hours and start operating automatically during the night hours, which can significantly reduce energy wastage. fig. 10 (a) shows the pump operation.
the pump remained on for 3 hours, from 12:00 to 15:00, when the solar irradiation is at its peak. the other membership function is the soc of the storage. fig. 10 (b) shows the operation of the pump together with the soc availability. even though soc was available, pump operation was restricted in order to keep the soc available for the bssh. this helps not only to reduce energy wastage but also to reduce the stress on bssh. the pump load operated for 12.5% of the day. in the same way, the extra load (fig. 11) operated only when the available solar power was higher during the day time. the extra load also operated for 12.5% of the day.

fig. 8 fan operation during the 24-hour time
fig. 9 lighting operation during the 24-hour time
fig. 10 (a) water pump operation during the 24-hour time; (b) operation of water pump with soc during the 24-hour time
fig. 11 extra load operation during the 24-hour time

4.2. five days analysis
during the five-day analysis, variation in the solar irradiation was observed; day 4 was the cloudiest day, with distorted solar irradiation trends, as can be seen in fig. 12.

fig. 12 solar irradiations during the 5 days time

this also affects the state of the charge, which on day 4 remains lower than on the other days with good solar irradiation. the lower availability of solar irradiation directly affects the overall operating duration of the different loads. as shown in fig. 13, the fans operated for a shorter duration on day 4 than on the other days (1, 2, 3, and 5).
on the other hand, the situation with the lighting loads is better: there is no large effect on the operation of the lighting load, as it is small and the soc available on day 4 can easily sustain it thanks to the proactive deactivation or reduced operation of the big loads. fig. 14 shows the operation of the lighting load. the water pump operation was also reduced so as to preserve soc for the night hours, as can be seen in fig. 15.

fig. 13 fan operation during the 5 days time
fig. 14 lights operation during the 5 days time
fig. 15 water pump operation during the 5 days time

it can be seen in fig. 16 that the extra load did not operate on day 4, because of the low availability of solar irradiation, so as to save the soc of the storage. the five-day analysis shows that even on the worst day, day 4, the soc lasts until the next day starts, although the operating hours were reduced. the operation nevertheless remained successful owing to the proposed load control.

fig. 16 extra load operation during the 5 days time

5. conclusion
in this article a novel distinctive load controlling algorithm has been developed, simulated and tested. the objective of this research is to practically analyze and verify the impact of locally measured microgrid parameters for controlling the residential loads with less communication and outside information while maintaining the residents' comfort. the results show better resiliency of the microgrid during peak hours. the work can be extended in several ways. an economic analysis of the strategy can be done in future. the decision parameters of the distinctive load control can be extended, for example with voltage and system frequency, to increase the accuracy of the proposed control system.
further, reduced illumination of a lighting load is a possibility for further enhancing the control of lighting loads.

references
[1] p. alstone, d. gershenson, d.m. kammen, “decentralized energy systems for clean electricity access”, nature climate change, vol. 5, no. 4, p. 305, 2015.
[2] g.b. narejo, f. azeem, m.y. ammar, “a survey of control strategies for implementation of optimized and reliable operation of renewable energy based microgrids in islanded mode”, vol. 2015, 2015.
[3] d. burmester, “a review of nanogrid topologies and technologies”, renewable and sustainable energy reviews, vol. 67, pp. 760–775, 2017.
[4] a. burgio, “a compact nanogrid for home applications with a behaviour-tree-based central controller”, applied energy, vol. 225, pp. 14–26, 2018.
[5] s. sheng, “optimal power flow management in a photovoltaic nanogrid with batteries”, in proceedings of the energy conversion congress and exposition (ecce), 2015, vol. 2015.
[6] d. burmester, r. rayudu, w. seah, “use of maximum power point tracking signal for instantaneous management of thermostatically controlled loads in a dc nanogrid”, ieee transactions on smart grid, 2017.
[7] t.l. nguyen, g. griepentrog, “a self-sustained and flexible decentralized control strategy for dc nanogrids in remote areas/islands”, 2017.
[8] m. nasir, “a decentralized control architecture applied to dc nanogrid clusters for rural electrification in developing regions”, ieee transactions on power electronics, 2018.
[9] a. werth, n. kitamura, k. tanaka, “conceptual study for open energy systems: distributed energy network using interconnected dc nanogrids”, ieee transactions on smart grid, vol. 6, no. 4, pp. 1621–1630, 2015.
[10] m.a.f. ghazvini, j. soares, o. abrishambaf, r. castro, z. vale, “demand response implementation in smart households”, energy and buildings, vol. 143, pp. 129–148, 2017.
[11] b. soltowski, s. strachan, o. anaya-lara, d. frame, m.
dolan, “using smart power management control to maximize energy utilization and reliability within a microgrid of interconnected solar home systems”, in proceedings of the 7th annual ieee global humanitarian technologies conference, ghtc 2017, pp. 1–7.
[12] r.k. chauhan, b.s. rajpurohit, f.m. gonzalez-longatt, s.n. singh, “intelligent energy management system for pv-battery-based microgrids in future dc homes”, international journal of emerging electric power systems, vol. 17, no. 3, pp. 339–350, 2016.
[13] a. manur, l. shaver, a. sivit, g. venkataramanan, s. subbarao, “mem: energy management system for low voltage dc microgrids”, in proceedings of the ieee international conference on sustainable green buildings and communities (sgbc), 2016, pp. 1–6.
[14] m. graellssobré, “cooperative energy management for a cluster of household’s prosumers”, ieee transactions on consumer electronics, vol. 62, no. 3, pp. 235–242, 2016.
[15] j. gouveia, c. gouveia, j. rodrigues, r. bessa, a.g. madureira, r. pinto, c.l. moreira, j.p. lopes, “microgrid energy balance management for emergency operation”, in proceedings of the ieee powertech ieee manchester, 2017, pp. 1–6.
[16] d. akinyele, j. belikov, y. levron, “challenges of microgrids in remote communities: a steep model application”, energies, vol. 11, no. 2, p. 432, 2018.
[17] k. salemink, d. strijker, g. bosworth, “rural development in the digital age: a systematic literature review on unequal ict availability, adoption, and use in rural areas”, journal of rural studies, vol. 54, pp. 360–371, 2017.
[18] f. azeem, g.b. narejo, u.a. shah, “integration of renewable distributed generation with storage and demand side load management in rural islanded microgrid”, energy efficiency, pp. 1–19, 2018.
[19] eun-kyu lee, et al., “design and implementation of a microgrid energy management system”, sustainability, vol. 8, no. 11, p.
1143, 2016.
[20] t.r. ayodele, a.s.o. ogunjuyigbe, k.o. akpeji, o.o. akinola, “prioritized rule based load management technique for residential building powered by pv/battery system”, engineering science and technology, an international journal, vol. 20, no. 3, pp. 859–873, 2017.
[21] a. adeli, n. mehdi, “a fuzzy expert system for heart disease diagnosis”, in proceedings of international multi conference of engineers and computer scientists, hong kong, 2010, vol. 1.
[22] s.t. cady, a.d. domínguez-garcía, c.n. hadjicostis, “a distributed generation control architecture for islanded ac microgrids”, ieee transactions on control systems technology, vol. 23, no. 5, pp. 1717–1735, 2015.
[23] a. mezouari, r. elgouri, m. alareqi, k. mateur, h. dahou, l. hlou, “a new photovoltaic blocks mutualization system for micro-grids using an arduino board and labview”, international journal of power electronics and drive systems (ijpeds), vol. 9, no. 1, pp. 98–104, 2018.
[24] f. issi, k. orhan, “the determination of load profiles and power consumptions of home appliances”, energies, vol. 11, no. 3, p. 607, 2018.
[25] t. r. a. n. tuan, et al., “control algorithms for micro-grid local controllers”, 2018.
[26] a. m. tripathi, a. kumar singh, a. kumar, “information and communication technology for rural development”, international journal on computer science and engineering, vol. 4, no. 5, p. 824, 2012.
[27] c. gamarra, j. m. guerrero, “computational optimization techniques applied to microgrids planning: a review”, renewable and sustainable energy reviews, vol. 48, pp. 413–424, 2015.

facta universitatis series: electronics and energetics vol. 32, no. 1, march 2019, pp. 105-118
https://doi.org/10.2298/fuee1901105k
parallel overloaded cdma crossbar for network on chip
ashok kumar k, dananjayan p
department of ece, pondicherry engineering college, puducherry, india
abstract.
high-performance network on chip (noc) designs have recently adopted the code division multiple access (cdma) technique because of its fixed communication delay, reduced area utilisation and low power consumption. the cdma system uses walsh based spreading codes, which improve the bandwidth efficiency. however, this approach is not effective when the number of nodes in the system increases. overloaded cdma (ocdma) has been presented for such large network systems. in this paper, the ocdma crossbar is modified and advanced with parallel encoding and decoding operation using orthogonal gold codes to improve the speed of the crossbar, thereby obtaining high performance in the noc switch. a modified crossbar with extra processing elements is used to enhance the performance of the noc based system on chip (soc). this work is simulated with the xilinx tool and implemented on a virtex-6 (xc6vlx760) field programmable gate array (fpga) device. the proposed work is implemented for four, eight and sixteen ports with the deterministic x-y routing algorithm in a 3 × 3 noc design with mesh topology. the noc switch shows a 9.79% improvement in delay and a 20.76% improvement in power consumption compared with existing cdma nocs for an 8-bit data packet.

key words: cdma, gold code, noc, arbiter, fifo buffer, fpga.

1. introduction
as end user requirements have increased, integrated circuits have scaled down over the past few decades. according to the itrs [1], communication issues have evolved due to the down scaling of technology. existing communication protocols such as bus, shared bus and point to point technology, which achieve high performance in chip multiprocessors (cmp), have an inherent drawback when sharing resources [2]. hence these protocols do not meet the performance requirements of system on chip (soc) [3]. network on chip (noc) is a scalable communication paradigm which provides high performance in cmp with the aid of parallel processors.
however, when the number of processors increases, the design of the noc becomes complicated, which affects the communication latency, area occupancy and power consumption. a conventional noc switch has five input ports, five output ports (four directional and one local) and a crossbar with a control module (arbiter).

received april 24, 2018; received in revised form july 17, 2018
corresponding author: ashok kumar k, department of ece, pondicherry engineering college, puducherry, india (e-mail: kashok483@gmail.com)

the four bi-directional ports are connected to the neighboring switches for transfer of data between the source and destination [4]. the local port serves the processing element (pe) and is responsible for communication between the port and the crossbar. this paper proposes a new method for noc with fixed latency, reduced system cost and reduced power consumption. recently, cdma has been used to transfer data between the input and output in the noc switch [5]. fig. 1 depicts the structure of the cdma noc switch. the noc switch has neither a fixed design nor a standard protocol and hence can be designed flexibly to meet the user requirements. the proposed method is implemented for a noc switch using a cdma crossbar with 2-d mesh topology, and a 3 × 3 noc is designed with the deterministic x-y routing algorithm. the routing of data first searches along the x-direction of the destination router and then proceeds in the y-direction. depending on the destination availability, the distance between the source and destination switch is calculated and transferred to the neighboring switches. since each port has a fifo buffer, store and forward packet switching [6] is used for the proposed work. the crossbar is the key module of the noc switch, as it affects the switch performance and provides multiple access for the data packets. the primary multiple access technique, time division multiple access (tdma), is simple but not efficient for cmp.
in tdma only one port sends a data packet at a time, leaving the other ports to wait until it releases the physical link, thereby increasing the packet latency. in space division multiple access (sdma), a dedicated path is created between the ports. cdma is another traditional multiple access technique, in which the spreading code enables sharing of the medium. this method provides error-free data in cmp and reduces the multiple access interference (mai) by appropriately selecting spreading code sequences with low cross correlation. the performance of cdma depends on its spreading code, and hence choosing the sequence is crucial. recently, overloaded cdma has emerged as the most suitable medium sharing technique for cmp; it increases the performance of the classical cdma crossbar by making more spreading codes available. most cdma systems use walsh codes, but these codes are suitable only for noc systems with few processors. a walsh code generator provides $N$ sequences, out of which only $N-1$ sequences can be used for spreading. orthogonal gold codes, on the other hand, are well suited to noc systems with more pes.

fig. 1 noc switch architecture with cdma crossbar of n input and n output ports

the rest of the paper is organized as follows. section 2 discusses the related work on cdma interconnects. section 3 describes the classical cdma operation with mathematical expressions. section 4 presents the generation of orthogonal gold codes. section 5 presents the noc router with parallel ocdma encoder and decoder. section 6 shows the implementation of the ocdma system, and finally, the conclusion is presented in section 7.

2. related work
recently, the cdma technique has been favored for the crossbar of the noc switch because of its fixed latency and reduced system cost. kim et al. [7] proposed and implemented a walsh based cdma crossbar. this walsh based cdma gave suitable results for the noc switch in terms of throughput and latency.
a star-mesh based noc switch is suggested to control large systems, in which seven resources are connected to each local switch and each local switch is linked to the central switch. wang et al. [8] proposed a cdma technique for both synchronous and asynchronous systems, such as the globally asynchronous locally synchronous (gals) scheme. a 6-node noc was simulated and the results were compared with a ptp noc. kim et al. [9] advanced the source synchronous cdma interconnect (sscdma-i), thereby reducing the system overhead compared with the tdma bus. nikolic et al. [10] presented two types of bus wrappers for cdma based shared bus architecture: a master wrapper with an arbiter module and a slave wrapper with peripheral modules. the transaction delay is reduced by bundling the different connections as single, two and four to reduce the parallel lines. halak et al. [11] initiated dynamic assignment of spreading codes for cdma users and developed a novel cdma protocol (d protocol) for dynamic assignment. two architectures were proposed for cdma: a serial implementation, where the data chips from all users are arithmetically summed according to their bit position, and a parallel implementation, where the data bits are transferred in parallel in the same cycle. the serial and parallel schemes were compared with traditional cdma, mesh based noc and the tdma bus, and the clock frequency was found to improve for the parallel cdma implementation. wang et al. [12] preferred standard basis (sb) codes in place of walsh based codes. the sb method replicates the tdma technique, as each spreading code consists of a single chip of one with the remaining chips zero. this method further decreases the latency and maximizes the throughput of the noc. ahmed et al. [13] presented the overloaded cdma crossbar interconnect to improve the performance of noc.
two types of overloaded cdma interconnect (oci) have been suggested, tdma-overloaded cdma interconnect (t-oci) and parallel-overloaded cdma interconnect (p-oci), and compared with the bus wrappers of [10] and the parallel cdma implementation of [11]. combining p-oci and t-oci improves the speed of the cdma crossbar, but the overall system becomes more complicated in terms of area utilization. to improve the results of ocdma, this paper proposes an encoder and decoder operated in parallel. by advancing the existing work, this paper provides better results for noc with a cdma crossbar. to the best of our knowledge, this paper is the first to investigate an ocdma crossbar with orthogonal gold codes.

3. classical cdma
as cdma provides the same bandwidth for all users, it is more popular than tdma or sdma. among the various spread spectrum techniques in the literature, direct sequence spread spectrum (dsss) is the dominant method for multiple access. dsss-cdma is a multiplexing method that uses unique high-frequency spreading codes. two types of spreading codes are used for cdma, orthogonal and non-orthogonal. the orthogonal walsh-hadamard code is frequently used in cdma systems, as its cross-correlation is zero and its impulse autocorrelation is unity. pn sequences, gold codes and kasami codes are some of the non-orthogonal spreading codes in use. these codes are used to encode and decode the original data and to protect it from interference in cdma. in the cdma encoder, each input data signal is applied to a modulator with a unique spreading code, and the modulated signals are added arithmetically before transmission. the encoded signal is transmitted through the channel and received in the decoder module. the decoder demodulates the encoded signal with the same unique spreading code.
the encoded multi-bit sum is accumulated either in the positive accumulator register (if the spreading code bit is 0) or in the negative accumulator register (if the spreading code bit is 1), and the accumulated values are sent to the comparison module. after comparison, if the positive accumulator value is higher, the transmitted data bit is 1; otherwise it is 0. unique spreading codes are assigned to each user to avoid multiple access interference. the different spreading code protocols are reviewed and analyzed in [8]. among these protocols, the transmitter based protocol (t protocol) gives better performance by assigning unique spreading codes in the cdma system. table 1 describes the conventional cdma operation with suitable notation.

table 1 definition of notations
notation | description
$d_{ij}$ | data bit of jth sender for ith code sequence
$c_{ij}$ | orthogonal code ith sequence for jth sender
$e_{ij}$ | encoded chip value of ith code and jth sender
$S_i$ | arithmetic sum of ith code sequence
$PR_i$ | positive register value for ith value of code sequence
$NR_k$ | negative register value for kth value of code sequence
$N$ | number of code sequences

the data sent by each sender is xored with its unique code to generate the chips. these chips are added arithmetically to obtain the multi-bit sum, which is sent to the decoder for reconstruction of the original data. the encoding process is expressed as

$$e_{ij} = d_{ij} \oplus c_{ij} \quad (1)$$
$$S_i = \sum_{j=1}^{N} e_{ij} \quad (2)$$

where $\oplus$ means the xor operation. the decoding process is expressed as

$$PR_i = S_i \ \text{if } c_{ij} = 0 \quad \text{and} \quad NR_k = S_k \ \text{if } c_{kj} = 1 \quad (3)$$
$$PAC = \sum_i PR_i \quad \text{and} \quad NAC = \sum_k NR_k \quad (4)$$

where pr and nr are the positive and negative registers, and pac and nac are the positive and negative registers with an accumulator.

problem statement
cdma, through its multiple access technology, enables a number of transmitters to transfer data simultaneously to a number of receivers. for efficient data transfer, spreading sequences must satisfy the orthogonal, balance and run-length properties.
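the classical encoding and decoding steps of section 3 (xor with the code, arithmetic sum, positive and negative accumulation, comparison) can be sketched in python. this is a minimal illustration under stated assumptions, not the authors' hardware design: the helper names `walsh_codes`, `cdma_encode` and `cdma_decode` are hypothetical, and the codes come from the standard sylvester construction of the walsh-hadamard matrix written with 0/1 chips. the all-zero first row is skipped, because with that code every chip would fall into the positive accumulator and the comparison could not separate the senders.

```python
def walsh_codes(n):
    """Return the n x n Walsh-Hadamard matrix as rows of 0/1 chips (n a power of two)."""
    codes = [[0]]
    while len(codes) < n:
        # Sylvester step: [H H; H ~H] in 0/1 notation.
        codes = [row + row for row in codes] + \
                [row + [1 - chip for chip in row] for row in codes]
    return codes


def cdma_encode(data_bits, codes):
    """XOR each sender's bit with its code, then add the chips per chip position."""
    length = len(codes[0])
    return [sum(bit ^ code[i] for bit, code in zip(data_bits, codes))
            for i in range(length)]


def cdma_decode(multi_sum, code):
    """Accumulate sums where the code chip is 0 (positive) or 1 (negative), then compare."""
    positive = sum(s for s, chip in zip(multi_sum, code) if chip == 0)
    negative = sum(s for s, chip in zip(multi_sum, code) if chip == 1)
    return 1 if positive > negative else 0


if __name__ == "__main__":
    codes = walsh_codes(8)[1:]          # 7 usable codes of 8 chips each
    data = [1, 0, 1, 1, 0, 0, 1]        # one data bit per sender
    multi_sum = cdma_encode(data, codes)
    recovered = [cdma_decode(multi_sum, code) for code in codes]
    print(multi_sum, recovered)
```

with 8-chip codes the sketch serves 7 senders, one fewer than the code length, which is the same loss the text describes for the 16-bit walsh code.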
although an orthogonal sequence such as the walsh code improves bandwidth utilization, its code utilization is low, the cross correlation between some shifts is not zero and the total delay for generating the code is not fixed. a 16-node cdma needs a 32-bit walsh code, as the 16-bit walsh code provides only 15 orthogonal sequences, which leads to wastage of code sequences in the system [12]. the proposed work provides a suitable solution to this problem. the contributions of the proposed work are as follows:
(i) implementing cdma with orthogonal gold codes to increase the code utilisation and to reduce the mai.
(ii) modifying and advancing a novel approach to the ocdma system and adding extra pes to the router to improve the performance of noc [13].
(iii) simulating the proposed design in xilinx software for synthesis and comparing the results with existing work.

4. generation of spreading code for cdma system
as described above, the walsh code is not suitable for high data rate systems. therefore, the orthogonal gold code is implemented instead. gold codes are generated through proper selection of the pn-sequences used as initial values for linear feedback shift registers (lfsr) [14]. fig. 2 describes the generation of gold codes with proper m-sequences. gold sequence generator gives codes with a n bit sequence. experiments show that orthogonal gold codes are obtained by affixing '0' to the non-orthogonal gold sequence. further, no code sequences are wasted in the gold code set and the sequences are utilized efficiently. hence, orthogonal gold sequences are suitable for noc systems with a large number of nodes (ports) whose data width is 16, 32 or 64 bits. with regard to ber, however, orthogonal gold codes provide performance similar to that of walsh codes.

fig. 2 generation of gold code sequence with correct selection of pn-sequences

5. noc router with parallel ocdma crossbar
each pe of noc is connected with network interface (ni) i.e.
either a transmit ni or a receive ni. the input port consists of a fifo and a finite state machine (fsm) controller, which directs the data packets based on the fifo [16]. the data is divided into packets before being transferred from the transmit ni to the fifo. the size of the fifo is decided by the width of the data packet. the distributed round robin arbiter provides grants for the packets that are ready to be transmitted from the fifo. the arbiter selects the input port and output port based on the fifo memories. the transmit ni asserts a request to the arbiter; then, depending on the fifo memory, the arbiter provides the grants in round robin fashion. hence only one data packet is sent to the cdma system, which avoids conflicts between data packets during transmission. from the ni, the data packets are sent to the parallel to serial converter (pts) module, which provides the data serially to the cdma encoder. the serialized data is then encoded with the spreading code and the chips are summed arithmetically to form the multi-bit sum. this multi-bit sum is sent to the decoder module, where the original data bits are reconstructed by the decoder logic, and the serialized data are forwarded to the serial to parallel converter (stp) module. the stp converts the data bits back into data packets, which are sent to the receive ni of the output port. finally, the receive ni sends the data packets to the fifo port. store and forward packet switching suits the proposed noc system, as the ports use fifos. the ocdma replaces the crossbar of the noc switch configuration. the deterministic x-y routing protocol is used for data transfer from the source noc switch to the destination switch, as it is straightforward, suits the 2-d mesh design and is free from deadlock. the control block, which contains the arbiter, assigns the spreading sequences and grants data transaction permission to the winning ports.
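the deterministic x-y routing used between switches (travel along the x-direction first, then along the y-direction) can be sketched in python. the coordinate convention, the function name `xy_route` and the 3 × 3 example are assumptions made for illustration; the text does not specify how the switches are addressed.

```python
def xy_route(src, dst):
    """Return the list of (x, y) switches visited from src to dst in a 2-d mesh.

    The packet first moves along x until the destination column is reached,
    then along y. The path is fully determined by src and dst.
    """
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


if __name__ == "__main__":
    # Corner-to-corner route in a 3 x 3 mesh with coordinates 0..2 (assumed addressing).
    print(xy_route((0, 0), (2, 2)))
```

because every packet finishes its x movement before turning, a route never turns from y back to x, which is the usual argument for why x-y routing on a mesh is deadlock free.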
the concept of the overloaded cdma system is implemented in wireless communication networks to increase the number of transmitting/receiving ports without increasing the system complexity [13]. the difference between ocdma technology and standard cdma lies in the code length, i.e. l>n1, where l is the code length for ocdma and n1 is the code length for classical cdma. ocdma facilitates multi-bit port transmission with minimal changes to the traditional cdma system. hence, the ocdma system needs a long sequence generator such as a gold code generator. the proposed work is implemented for a noc switch with cdma without increasing the system complexity, with fixed latency and limited system cost. to improve the bandwidth and reduce the area overhead, an extra pe is connected to each noc switch, which reduces the number of switches required and also reduces the per-pe area overhead. note that increasing the number of pes per noc switch increases the communication requests, thereby increasing the inter-communication links [15]. modification of the standard cdma system is required to achieve these objectives. the total encoding or decoding process of cdma depends on the spreading code length, which equals the number of clock cycles for one data transaction. the completion of a single transaction requires n clock cycles, which are also synchronized with the counter.

fig. 3 n input and n output ports of noc switch with ocdma crossbar

building blocks of ocdma crossbar

the ocdma crossbar consists of three main components: (i) encoder, (ii) decoder and (iii) control block; these modules are shown in fig. 3 along with the components of the noc. the control block mainly controls the data transmission in terms of selecting the proper input port, assigning the code sequence and running the counter that measures the clock cycles.

1) encoder module

the operation of the encoding process is the same as in conventional cdma, but the data is encoded bit-wise in a parallel manner.
the multi-bit sum of data is transferred to the decoder module in parallel [11]; therefore, one clock cycle is sufficient to complete the encoding of one bit from the nodes. the data chips from the ports are xored and added simultaneously; hence the proposed encoder needs fewer clock cycles to complete the encoding process than standard cdma. fig. 4 shows the parallel encoding method for ocdma with orthogonal gold code. the nodes send the data bits serially to the encoder block, and the multi-bit sum is then sent in parallel to the decoder block. the cdma requires a total of 24 bits to transfer original data of 8 bits, because the multi-bit sum of each bit requires 3 bits when it is added arithmetically.

fig. 4 parallel process of encoding in ocdma crossbar

2) decoder module

fig. 5 describes the parallel decoding process of ocdma. the parallel multi-bit sum is received by the decoder module through the channel, and the encoded sum value first reaches the de-multiplexer stage. the encoded data bit is sent to the positive register (if the spreading code bit is zero) or to the negative register (if the spreading code bit is one), and then the values of both registers are accumulated. finally, the positive accumulated value and the negative accumulated value are sent to the comparison module. the original data bit is 1 if the pac is high; otherwise the original data bit is 0. these registers are usually of length n/2 because of the balance property of the orthogonal spreading code; both registers therefore have the same length, which is half of the spreading code length. the decoding process is executed in parallel for each spreading code of the multi-bit sum.

fig. 5 parallel process of decoder architecture of ocdma crossbar

3) control block

at the initial stage of data transfer, the control block provides spreading code sequences for the transmitter and then the transmitter transfers the code to the receiver.
the arbiter eliminates congestion and provides the grant signal to the input ports for transferring data to the crossbar in round robin fashion [15]. the counter within the arbiter module initializes the spreading sequences for all the senders. the control block sends handshake signals to verify the codes of the corresponding encoder and decoder. the code pool assigns a unique spreading code to each transmitter when it receives a request from the arbiter module. fig. 6 describes the encoding and decoding process with 8-bit orthogonal gold codes. each sender sends its data bits serially, and orthogonal codes are assigned to each sender by the gold code generator. the data bit is xored with the code bit in parallel and the encoded first bit of each sender is sent to the decoder section. the first bit of the code for each sender is zero, but the multi-bit sum of the first encoded data is four after xoring with the data bits. this process continues for each sender and the multi-bit sum is calculated for each encoded bit. the multi-bit sum is sent to the accumulators depending on the code bit value. for decoding of the first bit, the positive accumulator register is larger than the negative accumulator register, hence the data bit is reconstructed as zero. the process continues until the system obtains the 8-bit original data.

6. implementation

the simulation and synthesis results are presented in terms of area, delay and power consumption for the parallel ocdma crossbar of a noc switch with 2 pes. the proposed work is simulated in xilinx software and implemented on a virtex-6 (xc6vlx760) fpga. the noc switch is implemented using different ocdma crossbars with spreading code lengths n = {4, 8, 16}, and a comparison with existing noc switches is also provided. the parameters used for the simulation of the noc switch are tabulated in table 2.
table 2 simulation parameters

simulation parameter    values
topology                2d mesh
arbiter                 distributed round robin
switching               store and forward
routing algorithm       minimal adaptive
crossbar                ocdma
data packet length      8 bit
buffer                  yes
simulator               riviera-pro
traffic scenario        uniform random
traffic distribution    poisson

fig. 6 transmission and reception of ocdma with 8 orthogonal gold codes

the performance metrics considered are area utilization (slice registers, slice luts and lut-ff pairs), maximum clock frequency (delay) and power consumption (dynamic power). the encoder and decoder of ocdma are implemented individually and applied to the crossbar of the noc switch. fig. 7 shows the implementation results for 4, 8 and 16 nodes of the ocdma crossbar for the noc switch. from fig. 7 (a), it is evident that the area utilization increases with an increasing number of bits because the noc switch requires a larger architecture for transmission and reception of the data packet. fig. 7 (b) shows that the maximum clock frequency decreases as the data packets increase because of the converters (stp and pts) present in the ni. the power consumption of this noc switch increases with its data packet because the transition activity is higher when the data passes through the stp/pts block; hence the dynamic power consumption also increases, as shown in fig. 7 (c). the throughput is calculated according to (5), where nc is the number of required clock cycles, nbpp is the number of bits in a packet, npe is the number of received packets at the pe and tc is the clock period for complete data transmission. from fig. 7 (d), it is inferred that the throughput increases with increasing data because more pes receive data packets within the specified clock period.
to improve the bandwidth and reduce the area overhead, an extra pe is connected to each noc switch, which reduces the number of switches required and also reduces the per-pe area overhead.

fig. 7 (a-d) implementation results in terms of area utilization, maximum clock frequency, power consumption and throughput for noc switch with ocdma crossbar of 4, 8, 16 nodes

table 3 shows the comparison results for the different 8-bit cdma crossbars of the noc switch in terms of area utilization (lut-ff pairs), delay (ns) and power consumption (mw), implemented on a virtex-6 fpga device. the proposed work provides better results than wb-cdma [7], sb-cdma [12] and ocdma [13] as the encoding and decoding processes are executed in parallel. this parallel ocdma crossbar switch requires less area because of the orthogonal gold codes used for spreading and the elimination of the selector (multiplexer) and the additional non-orthogonal sequence generator of ocdma [13]. even though the number of spreading codes is larger, the area overhead of parallel ocdma is lower than that of ocdma [13] because of pe clustering, which reduces the number of switches required for complete data transfer from source port to destination port.

table 3 comparison of parallel ocdma with existing cdma crossbar of 8-bit data packet per switch

cdma (8-node)     area (no. of lut-ff)    delay (ns)    power consumption (mw)
wb-cdma [7]       782                     2.82          17.53
sb-cdma [12]      684                     2.71          15.21
ocdma [13]        692                     2.96          11.46
parallel ocdma    663                     2.673         9.08

fig. 8 (a-d) shows the comparison of parallel ocdma with ocdma [13] for different numbers of nodes. from the figure, it is inferred that the performance of parallel ocdma is improved compared to the existing work because of efficient code utilization and pe clustering. this crossbar switch shows a 9.79% improvement in delay over ocdma [13] with minor modifications to the simple cdma operation.
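relative improvements of this kind can be recomputed directly from the rounded 8-node entries of table 3. the sketch below is only an illustration of that arithmetic; the helper name `relative_improvement` is hypothetical, and the percentages quoted in the text may have been derived from unrounded or different data, so small differences from the recomputed values are possible.

```python
def relative_improvement(baseline, proposed):
    """Percentage reduction of `proposed` with respect to `baseline`."""
    return 100.0 * (baseline - proposed) / baseline


# 8-node values copied from table 3: (area in lut-ff pairs, delay in ns, power in mw)
ocdma_ref = {"area": 692, "delay": 2.96, "power": 11.46}      # ocdma [13]
parallel = {"area": 663, "delay": 2.673, "power": 9.08}       # parallel ocdma

if __name__ == "__main__":
    for metric in ("area", "delay", "power"):
        gain = relative_improvement(ocdma_ref[metric], parallel[metric])
        print(f"{metric}: {gain:.2f}% lower than ocdma [13]")
```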
the major power consumption is due to the buffers in the bi-directional ports. in the proposed method, however, the encoder and decoder modules are placed in the ni of the noc, hence the buffers are not operated while the data packets are being encoded and decoded. consequently, a 20.76% improvement in power consumption over ocdma [13] is obtained.

fig. 8 (a-d) comparison of ocdma [13] with parallel ocdma for different numbers of nodes in terms of area utilization, clock frequency, power consumption and throughput

the parallel ocdma noc switch with 2 pes is extended to a 3 × 3 mesh based noc system, which consists of 9 bi-directional routers and 18 pes. for analyzing the packet latency and throughput, the mesh based noc is simulated on riviera-pro for windows. a number of experiments are conducted with the uniform-random traffic pattern to observe these performance metrics. the data packet latency (in clock cycles) of the noc with 2 pes is obtained and compared with that of the noc switch with a single pe, as shown in fig. 9 (a). from this figure, it is evident that the proposed work shows reduced packet latency because of the parallel processing of the encoder and decoder for transmission of the data packet. the throughput of the noc switch with 2 pes is also obtained and compared with that of the noc switch with a single pe. from the figure, it is understood that as the number of pes increases, the latency and throughput performance improve.

fig. 9 simulation results for network latency with injection load (a) and throughput with injection load (b) in uniform-random traffic pattern

7. conclusion

this paper proposed an overloaded cdma crossbar for noc with parallel encoding and decoding, in which walsh codes are replaced by orthogonal gold codes. the parallel encoder and decoder transfer the data in the same clock cycle, hence the performance of the proposed ocdma crossbar is increased.
the results are improved with respect to latency, area usage and power consumption when compared with the existing cdma crossbars. the parallel ocdma crossbar switch showed a 9.79% decrease in delay and a 20.76% improvement in power consumption compared with ocdma [13]. in future work, the noc switch will be presented with different fault routing algorithms for handling permanent and transient faults.

references

[1] international technology roadmap for semiconductors 2012 (www.itrs.net).
[2] m. c. chiang, g. s. sohi, “evaluating design choices for shared bus multiprocessors in a throughput oriented environment,” ieee transactions on computers, vol. 41, no. 3, pp. 297-317, march 1992.
[3] d. sigüenza-tortosa, t. ahonen, and j. nurmi, “issues in the development of a practical noc: the proteo concept,” integration, the vlsi journal, vol. 38, no. 1, pp. 95-105, october 2004.
[4] t. bjerregaard and s. mahadevan, “a survey of research and practices of network-on-chip,” acm computing surveys, vol. 38, no. 1, pp. 1-50, march 2006.
[5] s. a. hosseini, o. javidbakht, p. pad, and f. marvasti, “a review on synchronous cdma systems: optimum overloaded codes, channel capacity, and power control,” eurasip journal of wireless communications networking, vol. 1, pp. 1-22, december 2011.
[6] l. benini and d. bertozzi, “xpipes: a network-on-chip architecture for gigascale systems-on-chip,” ieee circuits and systems magazine, vol. 4, no. 2, pp. 18-31, september 2005.
[7] d. kim, m. kim, and g. e. sobelman, “cdma-based network-on-chip architecture,” in proceedings of the ieee asia-pacific conference circuits systems, december 2004, pp. 137-140.
[8] x. wang, t. ahonen, and j. nurmi, “applying cdma technique to network-on-chip,” ieee transactions on very large scale integration systems, vol. 15, no. 10, pp. 1091-1100, october 2007.
[9] j. kim, i. verbauwhede, and m.-c. f.
chang, “design of an interconnect architecture and signaling technology for parallelism in communication,” ieee transactions on very large scale integration systems, vol. 15, no. 8, pp. 881-894, august 2007.
[10] t. nikolic, m. stojcev, and g. djordjevic, “cdma bus-based on-chip interconnect infrastructure,” microelectronics reliability, vol. 49, no. 4, pp. 448-459, april 2009.
[11] b. halak, t. ma, and x. wei, “a dynamic cdma network for multicore systems,” microelectronics journal, vol. 45, no. 4, pp. 424-434, april 2014.
[12] j. wang, z. lu and y. li, “a new cdma encoding/decoding method for on-chip communication network,” ieee transactions on very large scale integration systems, vol. 24, no. 4, pp. 1607-1611, april 2016.
[13] k. e. ahmed, r. rizk and m. m. farag, “overloaded cdma crossbar for network on chip,” ieee transactions on very large scale integration systems, vol. 25, no. 6, pp. 1842-1855, january 2017.
[14] l. hanzo and t. keller, “ofdm and mc-cdma: a primer,” john wiley & sons, ltd., isbn: 0470-03007-0, 2006.
[15] r. kumar and a. gordon-ross, “macs: a highly customizable low-latency communication architecture,” ieee transactions on parallel and distributed systems, vol. 27, no. 1, pp. 237-249, january 2016.
[16] a. k. k, p. d., “a survey for silicon on chip communication”, indian journal of science and technology, vol. 10, no. 1, january 2017.

facta universitatis series: electronics and energetics vol. 28, no 1, march 2015, pp. 143-151 doi: 10.2298/fuee1501143g

microwave annealing, a promising step in the roll-to-roll processing of organic electronics

koen gilissen, jeroen stryckers, wouter moons, jean manca, wim deferme

institute for materials research (imo-imomec) – engineering materials and applications, hasselt university, wetenschapspark 1, 3590 diepenbeek, belgium

abstract. in recent years, organic printable electronics has gained more and more attention.
the development and characterization of new printing techniques and functional inks is vital to accomplish solution processable, large area organic electronic devices, e.g. organic photovoltaics (opv) and organic light-emitting diodes (oleds). in this study a systematic comparison is made between hotplate annealing and microwave annealing of (screen) printed poly(3,4-ethylenedioxythiophene):poly(styrenesulfonate) (pedot:pss) layers. pedot:pss films treated with both techniques were characterized and compared by their thin film morphology, their electronic properties and their annealing time. no difference in thin film morphology or final sheet resistance was observed between the microwave annealed and the hotplate annealed samples. moreover, the annealing time is decreased by up to a factor of 6. these results show that microwave annealing is a feasible fast annealing technique for pedot:pss thin films and can therefore reduce the total processing time of organic and pedot:pss based electronic applications.

key words: microwave annealing, printed electronics, organic electronics, pedot:pss, screen-printing.

1. introduction

the conjugated polymer pedot:pss has been around for more than two decades [1]. although its electronic properties are still under investigation [2], pedot:pss is already used in a variety of organic based electro-optical applications, e.g. organic photovoltaics (opv) [3] and organic light-emitting devices (oleds) [4], and in other applications, e.g. antistatic coatings [5] and anode material for capacitors [6]. this wide applicability is due to its high optical transparency, low resistivity, its ability to be processed from solution and its environmental stability. a thin film of pedot:pss is an integral part of an oled device stack [7] and has several purposes. it is deposited on top of a transparent electrode, most commonly an indium tin oxide (ito) thin film.
the pedot:pss layer received august 14, 2014; received in revised form october 27, 2014 corresponding author: koen gilissen institute for materials research (imo-imomec) – engineering materials and applications, hasselt university, wetenschapspark 1, 3590 diepenbeek, belgium (e-mail: koen.gilissen@uhasselt.be) 144 k. gilissen, j. stryckers, w. moons, j. manca, w. deferme prevents the diffusion of oxygen and indium originating from the ito layer [8], [9] and modifies the surface wetting properties [10]. due to its electronic properties it lower the energy barrier for the injection of holes and block electrons from reaching the holeinjecting contact preventing surface recombination [8], [11]. oled device studies have shown that the incorporation of a pedot:pss layer increases the external quantum efficiency and increase the lifetime [12]. as aqueous dispersion, pedot:pss can be processed from solution by either spin-coating or other printing or coating techniques. to remove solvents and obtain a high conductivity a post thermal treatment is needed [13]. in a laboratory environment this post-deposition thermal treatment is performed via conventional thermal conduction methods e.g.: hotplate or oven. these conventional thermal conduction methods do not scale to industrial sized, high throughput production process. the annealing time is limited due to the onset of thermal degradation at temperatures above 200 °c [11][14]. in this paper we investigate the microwave annealing of pedot:pss thin films [15] to overcome these limitations. furthermore a comparison is made between hot plate annealing and microwave annealing of thin pedot:pss films coated with the screen printing technique. based on our results a significant reduction in annealing time is achieved using a microwave annealing system. 2. experimental the pedot:pss agfa orgacon el-p3145 was screen printed using an 1 by 1 inch patterned screen on cleaned glass slides. 
the cleaning procedure for all substrates consisted of an ultrasonic bath in soap solution for 30 minutes followed by an ultrasonic bath in milliq water for 10 minutes. after this the substrates were exposed to an ultrasonic bath in acetone for 10 minutes and, as a last step, boiled in isopropanol for 10 minutes. after the cleaning procedure, prior to printing, all substrates were treated with uv ozone to improve the surface free energy of the substrates, resulting in a better wetting behavior. in-situ current measurements were performed in a 4-wire sensing configuration using a keithley 2400 source meter under a 10 v dc bias. in order to ensure good contact with the wet films, silver contacts were applied to the 4 corners of the 1 by 1 inch substrate prior to screen printing the pedot:pss. the final film thickness was measured using a dektak 3 st profilometer. the sheet resistance of the pedot:pss film was measured using the van der pauw technique [16]. fig. 1 illustrates the configuration of the microwave annealing system. the electromagnetic waves (2.45 ghz) are generated by the microwave source, which is powered by a tunable microwave power generator. this tunable microwave power generator enables the system to vary the generated electromagnetic power from 50 to 1000 watt. as the generated electromagnetic waves are coupled into the waveguide they pass the water cooled microwave circulator, i.e. reflection load, which protects the microwave source from being damaged by reflected electromagnetic waves. the electromagnetic waves propagating through the waveguide are stirred to a multimode electromagnetic field to prevent standing waves. by preventing standing (electromagnetic) waves in the waveguide and in the applicator area, which is located at the end of the waveguide, the probability of local hotspots is drastically reduced.
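for the van der pauw measurement mentioned above, the sheet resistance follows from two four-point resistances through the van der pauw relation exp(-π·r_a/r_s) + exp(-π·r_b/r_s) = 1, which generally has to be solved numerically. the sketch below shows one way to do this by bisection; it is an illustration only, not the authors' evaluation routine, and the names `sheet_resistance`, `r_a` and `r_b` as well as the input values are hypothetical.

```python
import math


def sheet_resistance(r_a, r_b, iterations=200):
    """Solve exp(-pi*r_a/r_s) + exp(-pi*r_b/r_s) = 1 for r_s by bisection.

    r_a and r_b are the two four-point resistances (in ohm) measured along the
    two perpendicular contact configurations of the sample.
    """
    def residual(r_s):
        return math.exp(-math.pi * r_a / r_s) + math.exp(-math.pi * r_b / r_s) - 1.0

    # The residual grows monotonically with r_s, and the root lies between the
    # symmetric-sample solutions pi*r/ln(2) for the smaller and larger resistance.
    low = math.pi * min(r_a, r_b) / math.log(2.0)
    high = math.pi * max(r_a, r_b) / math.log(2.0)
    for _ in range(iterations):
        mid = 0.5 * (low + high)
        if residual(mid) < 0.0:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


if __name__ == "__main__":
    # illustrative resistances only, not measured data from this work
    print(sheet_resistance(80.0, 120.0), "ohm per square")
```

for a symmetric sample (r_a = r_b = r) the relation reduces to the closed form r_s = π·r/ln 2, which is a convenient sanity check for the solver.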
the formation of local hotspots in the applicator area could potentially damage our samples and would cause non-uniform heating of the samples present at the applicator. the part of the electromagnetic energy which is not transferred to the samples is absorbed by the dummy load or reflected back towards the start of the waveguide.

fig. 1 schematic overview of the microwave annealing system.

in order to meet the requirement of a multimode uniform electromagnetic field, rotating mode stirrers were placed inside the waveguide. to ensure the isolated effect of both mode stirrers, the physical orientation of the mode stirrers is shifted at least 90° with respect to each other. furthermore, the rotation speed of each mode stirrer is chosen as a prime number to ensure that the rotation speeds are not a multiple of each other, and the directions of rotation are opposite. to obtain the correct behavior of the mode stirrers, their shape and material were determined experimentally. the selected group of materials was limited to dielectrics to prevent reflections from the mode stirrers. to further determine the mode stirrer material, the stirrer materials were varied systematically and the attenuation and phase shift caused by each material were carefully measured at 3 different orientations in the waveguide. the test samples, with equal dimensions, were placed at the entrance of the waveguide: inline centered, inline against the waveguide wall and transversal centered. from these measurements it was found that the material macor, a glass ceramic, caused an adequate differential phase shift of 26.8°. to further optimize the stirrer design, different shapes, e.g. cylindrical and square, and mutual configurations, e.g. the spacing between stirrer blocks, were systematically varied. from these results a square-shaped stirrer, designed as shown in fig. 2, was selected.

fig.
2 (left) 2d cad drawing of the stirrer design, units: mm (right) 3d render of the stirrer design.

when the stirrer is inline (0°) it introduces a phase shift of 168°, when the stirrer is rotated by 45° it introduces a differential phase shift of 195° and when it is positioned transversally in the waveguide (90°) it introduces a phase shift of 271°. by introducing 2 stirrers in the waveguide, a uniform multimode electromagnetic field is obtained, ideal to test microwave annealing as an alternative for hotplate annealing.

3. results and discussion

the annealing time of a pedot:pss film depends mainly on the amount of wet pedot:pss solution, the specific heat of the solution and the annealing temperature. to gain insight into the annealing time of these pedot:pss films, in-situ current measurements were performed. these measurements were performed by applying a 10 v bias to the screen printed pedot:pss thin films while annealing the samples on a conventional hotplate. as shown in fig. 3, a clear distinction is observed when annealing the samples at different temperatures on a hotplate. while the solvents and additives evaporate, the current increases rapidly and then stabilizes. the maximum value of the current is independent of the annealing temperature, but the time to reach this value is temperature specific: the time to reach the maximum current decreases with increasing temperature. the time to reach the maximum current will serve as a reference to compare the annealing time of hotplate and microwave annealed samples.

fig. 3 in situ current measurement while annealing of screen printed pedot:pss film on a hotplate at various temperatures.

fig. 4 average times to reach a stable current of pedot:pss films on a hotplate.

fig.
4 shows the average times to reach the stable current on a conventional hotplate; as the temperature increases, the time to stable current decreases and becomes less dispersed. when repeating this experiment in our microwave annealing system, as depicted in fig. 5, a similar trend is observed when varying the emitted power of the microwave. it is also clear that the stable current is reached much faster than in the hotplate experiment. these results suggest that the microwave could potentially be a much faster technique than the hotplate for annealing these pedot:pss films without change of the morphological and electrical properties.

fig. 5 average times to reach a stable current of pedot:pss films on our microwave annealing system.

to be able to confirm these presumptions, the morphology of the annealed pedot:pss films is compared on a scanning electron microscope (sem), as shown in fig. 6. from these sem images it is clear that there is no distinct structural or morphological difference between the hotplate annealed samples at 130 °c and 200 °c, column a and b. more importantly, there is also no evidence of a change in morphology between the hotplate annealed samples and the microwave annealed samples, column a/b and column c respectively.

fig. 6 sem images of annealed pedot:pss films. column a: hotplate annealed at 130 °c for 600 s. column b: hotplate annealed at 200 °c for 600 s. column c: microwave annealed at 200 w for 70 s.

the thin film morphology and chemical structure are the main film properties that influence the electronic properties of the pedot:pss films. the well-known onset of thermal degradation at 200 °c for conventional thermal convection and radiation [14] has not been correlated to microwave power. the electromagnetic energy transfer from the microwave source to the thin wet film is generally accepted to consist of two main mechanisms, i.e.
ionic conduction and dipolar polarization. the absorbed microwave energy is converted into kinetic energy by the movement of molecules [17]. the electronic properties of both hotplate and microwave annealed films were evaluated by measuring the in-plane sheet resistance using the van der pauw method [16]. fig. 7 shows the in-plane sheet resistance of the hotplate annealed pedot:pss films. these films were annealed in air for 600 s and cooled down to room temperature before measuring the sheet resistance. the results show that the sheet resistance increases with increasing annealing temperature. this effect can be attributed to the hygroscopic nature of pedot:pss films: as these films are annealed in atmospheric conditions, they take up oxygen and water vapor from their surroundings [11], [14].

fig. 7 the in-plane sheet resistance as function of hotplate annealing temperature.

fig. 8 the in-plane sheet resistance as function of the microwave power.

fig. 8 shows the in-plane sheet resistance of the microwave annealed films at various microwave powers. these measurements were performed after the films had cooled down to room temperature, and the annealing time was based on the time to reach a stable current. a similar trend is observed as with the hotplate annealed films: with increasing microwave power the in-plane sheet resistance increases. the pedot:pss films annealed in our microwave system at 150 w for 100 s reached a sheet resistance similar to that of the hotplate annealed samples at 125 °c for 251 s. the microwave annealing step is thus 2.5 times faster than the conventional hotplate annealing step, without interfering with the electronic properties of the conjugated polymer pedot:pss.

4.
conclusion

we studied the effects of microwave annealing as a post-deposition treatment on pedot:pss thin films in comparison to conventional hotplate annealing. we investigated the effects of these treatments on the morphological and electronic properties of pedot:pss thin films to determine the applicability of the microwave technique. in-situ current measurements suggest that the microwave annealing technique is 2.5 times faster than the hotplate technique. morphological investigations show no difference in morphology between hotplate annealed and microwave annealed samples. from the investigation of the electronic properties of these annealed pedot:pss films, we have observed that the sheet resistance increases rapidly when the power of the microwave is increased above 150 w. using powers lower than or equal to 150 w yields comparable results in terms of sheet resistance for both techniques. furthermore, the measured annealing time decreases by a factor of 2.5, showing that microwave annealing is a feasible post-deposition thermal treatment for pedot:pss thin films.

acknowledgement: the author would like to acknowledge the financial contribution from the cornet project poleot (iwt-tetra-120629), the interreg projects organext and solarflare and the soppom-project.

references

[1] f. jonas and l. schrader, “conductive modifications of polymers with polypyrroles and polythiophenes,” synth. met., vol. 43, pp. 831–836, 1991.
[2] k. van de ruit, i. katsouras, d. bollen, t. van mol, r. a. j. janssen, d. m. de leeuw, and m. kemerink, “the curious out-of-plane conductivity of pedot:pss,” adv. funct. mater., vol. 23, no. 46, pp. 5787–5793, dec. 2013.
[3] c. deibel and v. dyakonov, “polymer–fullerene bulk heterojunction solar cells,” reports prog. phys., vol. 73, no. 9, p. 39, sep. 2010.
[4] a. j. heeger, “nobel lecture: semiconducting and metallic polymers: generation of polymeric materials,” rev. mod. phys., vol. 73, no. july, pp. 681–700, 2001.
[5] f. jonas and j. t.
morrison, “3,4-polyethylenedioxythiophene (pedt): conductive coatings technical applications and properties,” synth. met., vol. 85, pp. 1397–1398, 1997.
[6] b. l. groenendaal, f. jonas, d. freitag, h. pielartzik, and j. r. reynolds, “poly(3,4-ethylenedioxythiophene) and its derivatives: past, present, and future,” adv. mater., vol. 12, no. 7, pp. 481–494, 2000.
[7] j. h. cook, h. a. al-attar, and a. p. monkman, “effect of pedot–pss resistivity and work function on pled performance,” org. electron., vol. 15, no. 1, pp. 245–250, jan. 2014.
[8] g. greczynski, t. kugler, m. keil, w. osikowicz, m. fahlman, and w. r. salaneck, “photoelectron spectroscopy of thin films of pedot–pss conjugated polymer blend: a mini-review and some new results,” j. electron spectros. relat. phenomena, vol. 121, no. 1–3, pp. 1–17, dec. 2001.
[9] t. p. nguyen and s. a. de vos, “an investigation into the effect of chemical and thermal treatments on the structural changes of poly(3,4-ethylenedioxythiophene)/polystyrenesulfonate and consequences on its use on indium tin oxide substrates,” appl. surf. sci., vol. 221, no. 1–4, pp. 330–339, jan. 2004.
[10] j. huang, p. f. miller, j. c. de mello, a. j. de mello, and d. d. c. bradley, “influence of thermal treatment on the conductivity and morphology of pedot/pss films,” synth. met., vol. 139, no. 3, pp. 569–572, oct. 2003.
[11] y. kim, a. ballantyne, j. nelson, and d. bradley, “effects of thickness and thermal annealing of the pedot:pss layer on the performance of polymer solar cells,” org. electron., vol. 10, no. 1, pp. 205–209, feb. 2009.
[12] g. greczynski, t. kugler, and w. r. salaneck, “characterization of the pedot-pss system by means of x-ray and ultraviolet photoelectron spectroscopy,” thin solid films, vol. 354, no. 1–2, pp. 129–135, oct. 1999.
[13] j. huang, p. f. miller, j. s. wilson, a. j. de mello, j. c. de mello, and d. d. c.
bradley, “investigation of the effects of doping and post-deposition treatments on the conductivity, morphology, and work function of poly(3,4-ethylenedioxythiophene)/poly(styrene sulfonate) films,” adv. funct. mater., vol. 15, no. 2, pp. 290–296, feb. 2005. [14] e. vitoratos, s. sakkopoulos, e. dalas, n. paliatsas, d. karageorgopoulos, f. petraki, s. kennou, and s. choulis, “thermal degradation mechanisms of pedot:pss,” org. electron., vol. 10, no. 1, pp. 61–66, feb. 2009. [15] k. gilissen, w. moons, j. manca, and w. deferme, “microwave annealing as fast alternative for hotplate annealing of poly(3,4-ethylenedioxythiophene): poly(styrenesulfonate),” 29th int. conf. microelectron. proc. miel 2014, pp. 219–222, may 2014. [16] l. j. van der pauw, “a method of measuring specific resistivity and hall effect of discs of arbitrary shape,” philips res. reports, vol. 13, no. 1, pp. 1–9, 1958. [17] e. t. thostenson and t.-w. chou, “microwave processing: fundamentals and applications,” compos. part a appl. sci. manuf., vol. 30, no. 9, pp. 1055–1071, sep. 1999. instruction facta universitatis series: electronics and energetics vol. 27, n o 1, march 2014, pp. i i editorial as emphasized in the editorial for the first in the series of the anniversary issues, over the past quarter of century facta universitatis: series electronics and energetics has become one of the most widely read and cited journals in the field in the region of the west balkans. unfortunately, the journal has yet not gained worldwide recognition, and it will be the main goal, consistent with our coverage and focused aims, in the near future. in order to meet this goal we will strive to attract best submissions and publish best papers from a very broad geographic area, thus making facta universitatis: series electronics and energetics a truly international journal. we will also insist that all published papers are of high quality and practical value, thus leading to their worldwide citation, i.e. 
to the journal's placement onto the sci list. whilst insisting that all published papers are of high quality and practical value, we wish to avoid creating a situation where the journal publishes by quantity rather than quality, and that is the reason why we have already started with rigorous refereeing of all submitted papers. however, recent submissions have surpassed our expectations in quantity, quality and practical value, so we are pleased to announce that the journal will be published quarterly starting this year. this issue, the second in the series of the anniversary issues, is a collection of 9 invited papers by well-known experts in their specific areas, most of them members of our editorial team, who present and discuss the state-of-the-art issues of practical interest in the field. as a new editor-in-chief, i, along with our editorial team, promise to continue developing and improving facta universitatis: series electronics and energetics in order to keep it at the forefront of science and technology. ninoslav stojadinović editor-in-chief
instruction facta universitatis series: electronics and energetics vol. 32, n o 4, december 2019, pp. 513-528 https://doi.org/10.2298/fuee1904513m © 2019 by university of niš, serbia | creative commons license: cc by-nc-nd designing method for integrated battery chargers in electrical vehicles aleksandar milic, slobodan vukosavic university of belgrade, dept. of electrical engineering, serbia abstract. electrical vehicles often make use of multi-phase induction motors. at the same time, the vehicles have an on-board charger, the power electronics device that converts the ac power from the mains and charges the traction battery. the traction inverter can be integrated with the charger, reducing in this way the component count, weight and cost, while the windings of the ac motor can be used as the inductors required to complete the charger topology, thus saving on passive components, iron and copper.
the integrated charger performances depend on the configuration of the stator windings as well as on the topology of the power converter. the objective in charging mode is reaching a high efficiency while keeping the charging-mode electromagnetic torque at zero. in traction mode, the goals include the efficiency and the torque-per-amps ratio. in order to compare and distinguish between the available topologies and configurations, the paper starts with the analysis of the magnetic field in the air-gap of the electric machine in both charging and traction modes. based upon that, a novel algorithm is proposed which determines the space-time distribution of the air-gap field, eventually deriving all the relevant pulsating and revolving components of the magnetic field, thus providing the grounds for studying the losses, efficiency and torque pulsations in both charging and traction modes. key words: induction machines, machines, battery chargers, electric vehicles.
nomenclature
parameter | designator (unit)
number of stator slots | $N_s = 36$
grid currents | $i_n$ [A]; phase number n = 1, 2 … 6
stator currents | $i_{mn}$ [A]; phase number n = 1, 2 … 9
angular frequency | $\omega$ [rad/s]
stationary-frame currents | $i_\alpha$, $i_\beta$ [A]
additional current components | $i_x$, $i_y$, $i_{x1}$, $i_{y1}$, $i_{x2}$, $i_{y2}$ [A]
current amplitude per slot | $I = 1$ A
phase shift | $\varphi$ [rad]
frequency | $f = 50$ Hz
time-discretization | $\Delta t = 1$ ms
harmonic order | $\nu$ = 1, 3, 5 … 19
mmf harmonic components | $F_{\nu s}$ [A]
spatial displacement of the individual phase magnetic axis | $\delta$ [rad]
received january 8, 2019; received in revised form march 12, 2019 corresponding author: aleksandar milic university of belgrade, dept. of electrical engineering, 11000 belgrade, serbia (e-mail: milic.aleksandar@etf.rs)
1. introduction
the use of vehicles with internal combustion engines (icm) in urban areas has adverse effects on the local environment, and it also contributes to the emission of pollutants and greenhouse gases.
although the first electric vehicles (ev) were introduced in the 19th century, their use subsided after the introduction of the low-cost ford model t. in an attempt to reduce and control emissions and their harmful effects on humans and the environment, electrical vehicles are regaining attention. in rural and open areas, internal combustion engines [1] with spark-plug-controlled compression ignition of gasoline could become a new solution, more acceptable for the environment than electrical vehicles that run on electrical energy obtained from the power utilities. there are opinions [2] that the greenhouse gas emissions from power utilities, in conjunction with emissions from the factories that manufacture ev, their batteries and other key parts, could match and, in some cases, even exceed the emissions of icm vehicles. this opinion contradicts many other reports. in urban, densely populated areas, a vast number of icm vehicles concentrates emissions and pollutes the air. therefore, there is every reason to replace icm-driven vehicles with ev in all urban areas and other places where local environmental conditions are of particular importance. in general, the electrification of transportation has the potential of reducing the use of fossil fuels, but the greenhouse gas emissions largely depend on the primary source mix for the electricity generation [3]. the vehicles that run on electricity require the equipment for the battery charging. in order to improve efficiency, regenerate the kinetic energy and enable the use in urban areas, most modern icm-vehicles are hybrid, and they also include an electrical machine, power electronics devices and the means for the energy accumulation. this emphasizes the significance of designing light-weight, low-cost integrated battery chargers and optimizing their configuration and topology so as to maximize the performances in both charging and traction mode.
conventional chargers comprise their switching stages and passive l and c components separate from the traction motor and the traction converter. their weight and size add to the weight and size of the traction converter and the traction motor. since the traction and charging do not take place at the same time, devices used in traction mode can be used for charging and vice versa. in integrated chargers, the passive filtering components of the charger are replaced by the stator winding, while the traction converter can be put to use as the switched-mode controller of the charging currents. in this manner, the overall weight and size of passives and switching devices is considerably reduced, making the integrated chargers the preferred choice [4]-[8]. the chargers could be installed on-board the ev, which simplifies the stationary part of the charging system. off-board systems provide fast charging, ease of installation and galvanic isolation [4], but their connections have electrical and mechanical features that are closely related to the ev and its battery. on the other hand, on-board systems require just an ac socket and the cable connection to the mains, paving the way to a more general and widespread use [5] in all the cases where the cable connection is acceptable. thus, it is of interest to study integrated-charger topologies which reduce the weight, space and cost of the on-board equipment [6]-[8]. while in charging mode, it is necessary to maintain sinusoidal line-current waveforms and unity power factor in order to maximize the power for the rated current [9]. in the topology shown in fig. 1, the stator phase windings of the traction motor are connected to a three-phase grid and used as the input line filter of the integrated charger. both the inverter and the dc/dc converter are bidirectional. as the charging and traction modes do not take place simultaneously, the equipment shown in fig.
1 serves both purposes. while the system is in charging mode, the charging currents can create a revolving magnetic field and an electromagnetic torque that could stir the rotor into motion. any motion and consequential electromotive forces are undesired and harmful, as they interfere with the charging process, increase the losses and jeopardize safety. the problem of the rotor movements can be solved by adopting mechanical brakes or using an appropriate clutch [10], [11]. both approaches require additional mechanical parts that require maintenance and increase the weight and size of the motor. even in cases where the rotor is blocked by mechanical brakes, revolving magnetic fields caused by the charging currents would create additional losses and reduce the efficiency. in cases where the braking is accomplished by electrical means [12], it comes at the cost of increased losses within the motor. while in some cases the revolving fields do not exist in charging mode, others may require additional power switches and/or re-configuration of the windings while in charging mode [13]-[15]. an advantage of multiphase machines is the possibility to obtain zero-revolving-field conditions during the charging with no additional switches and without the reconnection of the windings. the operation of integrated chargers with multiphase machines depends on the power converter topology and also on the configuration of the stator winding and the magnetic circuit. the impact of design parameters and choices on the system performances has to be studied in order to distinguish the optimum converter-machine match, the one that meets the requirements in both charging and traction modes. commonly used design methods rely on finite-element models and tools [16]. in most cases, design and optimization of electrical machines [17], [18] includes iteration and/or search procedures.
whether driven by meta-heuristic or similar algorithms, most procedures include a considerable number of successive design attempts. using fem tools in each attempt is impractical, and there is a need to devise a simple and quick approach to computing the key performances from the attempted design parameters and choices.
fig. 1 integrated charger with battery dc/dc converter, 6-phase inverter and 6-phase asymmetrical machine. notice that the 6-phase power supply requires a set of conveniently arranged transformers.
in this paper, a novel approach to deriving the key performances of the machine-converter system is proposed. the proposed approach is considerably faster as it avoids laborious finite-element analysis. the design is expressed in terms of the stator-winding parameters, parameters of the magnetic circuit and the main features of the power converter. the input data is used to derive the principal pulsating and revolving air-gap fields, which are used thereafter to compare the losses and pulsating torques of attempted machine-converter designs. the core of the proposed approach is the algorithm which evaluates the impact of the winding configuration and the converter topology on the space-time distribution of the air-gap magnetic field in integrated chargers. the analytical approach to obtaining the air-gap field distribution is given in section 2. a novel algorithm that distinguishes the relevant pulsating and revolving fields is developed in section 3. in section 4, the proposed solution is verified by performing simulations of several cases with 6- and 9-phase machines in both traction and charging modes. conclusions are given in section 5. all variables are given in the nomenclature.
2. analytical considerations
in fig. 2, an integrated charger connects to the 3-phase grid. with multiphase machines with 3q phases, each phase of the mains connects to q phases of the stator winding.
the stator winding configuration determines the topology of the static power converter that is used both for the charging mode and for the traction mode. whenever possible, the configuration is supposed to be conceived to zero-out the revolving fields in the case of the battery charging. a specific transformation matrix for the phase currents is required for each configuration of the stator winding [19]. the transformation matrix of [19] provides the usual torque-producing currents id and iq, but also additional current components which do not produce the torque. such additional components do not exist in the case of a conventional 3-phase machine. the pairs of current components such as (id, iq) and (ix, iy) can be represented as complex numbers, also called space vectors, that can be represented in a two-dimensional plane. the currents that do not contribute to the torque represent a degree of freedom that could be used to optimize other design criteria, such as the operation of the system in charging mode. the transformation matrix of [19] is applied to a 5-phase machine, thus resulting in d, q, x, and y components that can be represented by vectors in two different planes, d-q and x-y. the former are the key to torque generation, while the latter provide the degree of freedom required for a torque-free battery charging process. the example given in fig. 1 presents an asymmetric six-phase machine used within an integrated charger. magnetic axes of individual phases are spatially shifted by 30º. even and odd phases can be grouped in two triplets. by the proper connection, these triplets can be connected in two distinct stars-of-windings. in order to develop the electromagnetic torque and achieve the normal traction functionality, it is necessary to supply the stator winding by the set of currents given in (1), where the phase shifts $\varphi_1$ to $\varphi_6$ have to be adjusted so as to provide the revolving air-gap field of a constant amplitude.
$i_{1,2,3,4,5,6} = I \sin(\omega t - \varphi_{1,2,3,4,5,6})$ (1)
the phase transposition illustrated in fig. 1 is given in (2). employing the decoupling transformation matrix for an asymmetrical six-phase machine from [21], the results given in (3) are obtained.
$i_{m2} = i_3, \quad i_{m3} = i_2, \quad i_{m4} = i_5, \quad i_{m5} = i_4$ (2)
$i_x = \sqrt{6}\, I \cos(\omega t), \quad i_y = \sqrt{6}\, I \sin(\omega t), \quad i_\alpha = 0, \quad i_\beta = 0$ (3)
with the stationary-frame current components α-β equal to zero, the system in the given conditions does not develop any revolving magnetic field. thus, the charging relies on the x-y current components, having the angular frequency ω and the amplitude $\sqrt{6}\, I$. the system illustrated in fig. 2 can be analyzed and explained in a similar way. this system comprises a nine-phase machine where the machine phases are conveniently grouped in three stars. while in the charging mode, each star connection is attached to one of the three input phases. unlike the setup of fig. 1, the setup of fig. 2 can be connected to the mains without any additional transformers. according to the analysis of [22], the mains currents i1, i2 and i3 (fig. 2) are given in (4), while the winding currents while charging are given in (5).
$i_1 = I \sin(\omega t), \quad i_2 = I \sin(\omega t - 2\pi/3), \quad i_3 = I \sin(\omega t - 4\pi/3)$ (4)
$i_{m1,2,3} = i_1/3, \quad i_{m4,5,6} = i_2/3, \quad i_{m7,8,9} = i_3/3$ (5)
$i_\alpha = 0, \quad i_\beta = 0, \quad i_{x2} = i_{y2} = i_{x3} = i_{y3} = 0$, with $i_{x1}$ and $i_{y1}$ as the only non-zero components, both sinusoidal at the angular frequency $\omega$ (6)
fig. 2 connections of the integrated charger with asymmetrical 9-phase machine
according to [22], the currents of (5) can be transformed and decomposed in the α-β and x-y pairs given in (6). with zero α-β components, the revolving field in the air-gap is equal to zero, while the currents x1-y1 are responsible for the charging. the nine-phase topology of fig. 2 connects to the mains without any additional components and with no re-wiring of the circuit.
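the zero α-β result stated for the nine-phase charger can be checked numerically. the following python sketch is an illustration under stated assumptions and is not the transformation of [22]: it assumes that the three three-phase sets are displaced by 20° from each other, that the winding currents follow (4) and (5), and that one plane of a decoupling transformation is obtained by projecting the nine currents on cos(kδ) and sin(kδ) with the scaling sqrt(2/9). the names `axis_angles`, `winding_currents` and `plane_components` are hypothetical.

```python
import numpy as np

OMEGA = 2 * np.pi * 50.0   # mains angular frequency [rad/s]
I_AMPL = 1.0               # current amplitude [A]


def axis_angles():
    """Assumed magnetic-axis angles: three sets shifted by 20 deg, 120 deg inside a set."""
    set_offsets = np.deg2rad([0.0, 20.0, 40.0])
    in_set = np.deg2rad([0.0, 120.0, 240.0])
    return (set_offsets[:, None] + in_set[None, :]).ravel()


def winding_currents(t):
    """Nine winding currents: each mains current of (4) splits equally over one set, as in (5)."""
    shifts = np.array([0.0, 2 * np.pi / 3, 4 * np.pi / 3])
    mains = I_AMPL * np.sin(OMEGA * t[:, None] - shifts[None, :])
    return np.repeat(mains / 3.0, 3, axis=1)


def plane_components(currents, order):
    """Project the currents on cos(order*delta) and sin(order*delta) (assumed scaling sqrt(2/9))."""
    delta = axis_angles()
    scale = np.sqrt(2.0 / 9.0)
    return scale * currents @ np.cos(order * delta), scale * currents @ np.sin(order * delta)


t = np.arange(20) * 1e-3                      # one 20 ms period in 1 ms steps
i_alpha, i_beta = plane_components(winding_currents(t), 1)
i_c3, i_s3 = plane_components(winding_currents(t), 3)
```

under these assumptions the fundamental plane (order 1) is zero at every instant, because the three equal currents of each set sit on axes 120° apart, while the plane built from 3δ carries a sinusoidal component. which plane of [22] this corresponds to is not established here.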
getting the integrated charger with the asymmetric 6-phase machine [21] into charging mode is more involved, but, on the other hand, the 6-phase topology has the advantage of having a smaller component count.
3. analysis of the air-gap field
the analysis given in this section derives the space harmonics of the air-gap field for the integrated chargers given in figs. 1 and 2. the former uses a six-phase source derived from the mains by means of a dedicated transformer [21], while the latter connects the mains to the three star connections of the nine-phase winding [22]. the objective of the analysis is to consider the charging process and derive any and all the revolving and pulsating air-gap fields. it is of interest to derive the approach which distinguishes the field components for all the relevant stator-winding configurations and all the relevant topologies of the power converter. the ultimate goal is to improve the design in order to maximize the performances of the system in both charging and traction modes. extraction of the pulsating and revolving fields cannot be performed by means of the conventional tools, such as the fourier transformation. on the other hand, the problem complexity includes non-sinusoidal winding and magnetomotive-force functions, which contribute to a great deal of space-time components of the air-gap field, including rather high orders. an accurate distribution of the air-gap field can be obtained by fem tools, but this approach is laborious and incompatible with the needs, as it requires considerable time and effort, and it can hardly be used in iterative loops.
therefore, the approach outlined hereafter adopts the following reasonable assumptions: (1) the main part of the magnetic-voltage-drop is experienced in the air-gap, and (2) while the air-gap circumference can be divided in ns parts, ns being the number of the stator slots, the magnetomotive-force (mmf) can be obtained by integrating the field strength under each stator tooth, thus neglecting the non-homogeneous field distribution caused by the specific form of the slots and teeth of the magnetic circuit. in addition to these, it is also assumed that magnetic saturation is not pronounced. the latter assumption on linearity simplifies the subsequent calculations: it is sufficient to calculate the mmf for only one amplitude of the sinusoidal stator currents, assuming that the mmf values for other amplitudes would change proportionally. in further considerations, the current amplitude is scaled down to 1 A per slot. to obtain further time savings, it is assumed that 50 Hz waveforms can be represented by a time-discretization of 1 ms. the proposed procedure envisages the following steps: (i) given the winding configuration and converter topology, the time-space samples of the corresponding mmf are calculated at specific instants for each of the discrete locations along the circumference; (ii) an analytical function is devised comprising a reasonable number of low-order harmonics, suited to fit the mmf as the function of time and space, said function comprising the set of parameters (amplitudes and shifts) yet to be determined; (iii) the time-space samples of the mmf obtained in the first step are used to calculate the parameters of the function devised in the second step.
in order to obtain the best fit in terms of the sum of the squared residues (errors), the adopted approach includes moore-penrose inversion of a rectangular matrix [23], [24] wherein the number of matrix rows (equations) exceeds the number of columns (parameters) by far; (iv) finally, all the revolving and pulsating magnetic fields are obtained from the function devised in the second step and provided with the parameters obtained in the third step. the waveforms given in figs. 3 and 4 correspond to the mmf obtained with the two sample designs. the waveforms are given for t = 5 ms, t = 10 ms, t = 15 ms, and t = 20 ms, with the angle (space) given on the horizontal axes. at this point, it is necessary to attempt the corresponding analytical expressions. without loss of generality, it can be assumed that the position θ = 0 is chosen such that the target function is an odd function, and that the order of relevant harmonics ends with ν = 19. in this case, the mmf distribution obtained from one phase is given in (7), where the amplitudes $F_{1s}, F_{3s}, \dots, F_{19s}$ have yet to be defined.
$F(\theta) = F_{1s}\sin\theta + F_{3s}\sin 3\theta + F_{5s}\sin 5\theta + F_{7s}\sin 7\theta + \dots + F_{19s}\sin 19\theta$ (7)
while (7) represents the distribution of the mmf along the circumference for one of the phases, other phases have a similar formula. yet, due to the spatial shifts between magnetic axes of individual phases, other phases may require functions F(θ) which include both sine and cosine components with corresponding amplitudes. denoting the spatial displacement of the magnetic axis of an individual phase by δ and the phase shift by φ, the corresponding mmf expression is given in (8). the expression is derived assuming that all the phase currents have the same amplitude $I$.
$F(\theta,t) = I\sin(\omega t - \varphi)\,[F_{1s}\sin(\theta-\delta) + F_{1c}\cos(\theta-\delta) + F_{3s}\sin 3(\theta-\delta) + F_{3c}\cos 3(\theta-\delta) + \dots + F_{19s}\sin 19(\theta-\delta) + F_{19c}\cos 19(\theta-\delta)]$ (8)
the obtained results can be used to derive the analytical expression that describes the space-time distribution of the magnetomotive force for the integrated charger with the 6-phase machine, shown in fig. 1. the corresponding expression derived for the charging mode is given in (9). due to the specific configuration of the stator windings and the spatial displacement of magnetic axes, the expression does not comprise any even harmonics, nor harmonics whose order is a multiple of three. due to the phase transposition, the considered six-phase asymmetrical stator winding does not generate harmonics of the 11th and 13th order, nor does it produce any fundamental component.
$F^{eq}_{6asy}(\theta,t) = F_{5c}\sin(\omega t - 5\theta) + F_{5s}\cos(\omega t - 5\theta) - F_{7c}\sin(\omega t + 7\theta) + F_{7s}\cos(\omega t + 7\theta) + F_{17c}\sin(\omega t - 17\theta) + F_{17s}\cos(\omega t - 17\theta) - F_{19c}\sin(\omega t + 19\theta) + F_{19s}\cos(\omega t + 19\theta)$ (9)
when the 9-phase machine of fig. 2 is used in charging mode, each of the 3-phase mains voltages is fed into the star connection of one of the three groups that comprise three phase windings each. the three stars with three phase windings each create the stator winding system of the 9-phase machine. the charging current in said windings creates a pulsating magnetic field [5]. the resultant mmf which comprises the contribution of all the phases in charging mode is given in (10), and it has a predominantly pulsating nature. from (10), the only harmonics of the pulsating field are of order 3, 9 and 15, while the harmonics of the order 1, 5, 7, 11, 13, 17 and 19 are absent.
$$\begin{aligned} F^{eq}_{9asy}(\theta,t) ={} & F_{3c}[\sin(\omega t - 3\theta) + \sin(\omega t + 3\theta)] + F_{3c}[\sin(\omega t - 3\theta - 2\pi/3) + \sin(\omega t - 3\theta - \pi/3)] \\ & + F_{3s}[\cos(\omega t - 3\theta) - \cos(\omega t + 3\theta)] + F_{3s}[\cos(\omega t - 3\theta - 2\pi/3) + \cos(\omega t - 3\theta - \pi/3)] \\ & + F_{9c}[\sin(\omega t - 9\theta) + \sin(\omega t + 9\theta)] + F_{9s}[\cos(\omega t - 9\theta) - \cos(\omega t + 9\theta)] \\ & + F_{9c}[\sin(\omega t - 9\theta + \pi/3) + \sin(\omega t + 9\theta + \pi/3)] + F_{9s}[\cos(\omega t - 9\theta + \pi/3) - \cos(\omega t + 9\theta + \pi/3)] \\ & + F_{9c}[\sin(\omega t - 9\theta + 2\pi/3) + \sin(\omega t + 9\theta + 2\pi/3)] + F_{9s}[\cos(\omega t - 9\theta + 2\pi/3) - \cos(\omega t + 9\theta + 2\pi/3)] \\ & + F_{15c}[\sin(\omega t - 15\theta) + \sin(\omega t + 15\theta)] + F_{15c}[\sin(\omega t + 15\theta - 2\pi/3) + \sin(\omega t + 15\theta - \pi/3)] \\ & + F_{15s}[\cos(\omega t - 15\theta) - \cos(\omega t + 15\theta)] + F_{15s}[-\cos(\omega t + 15\theta - 2\pi/3) - \cos(\omega t + 15\theta - \pi/3)] \end{aligned} \qquad (10)$$
the analytical expressions obtained for the cases illustrated in figs. 1 and 2 are used in the subsequent considerations. for each of the considered systems, the functional approximation can take into account the space-time distribution of the field in both the traction mode, when the electrical machine operates with prevalently revolving magnetic fields and generates the moving torque, and in charging mode, where the revolving fields are not desired, while the amplitude of the remaining pulsating fields has to be reduced in order to curb the losses. in deriving the functional approximation and parameter-fitting of the relevant functions, the proposed procedure describes pulsating fields as the sum of two revolving fields having the same speed and amplitude but different directions (11). in addition to pulsating fields, the air-gap field comprises a set of revolving fields that run at different speeds in the direct or inverse sense. all the field components produce losses, but only the revolving fields are responsible for the torque generation.
$F_\nu(\theta,t) = F_\nu[\sin(\omega t - \nu\theta) + \sin(\omega t + \nu\theta)]$ (11)
the analytical representations of the magnetomotive force in (9) and (10) depict the air-gap field dependence on space (that is, position along the circumference) and time. expression (9) and fig. 3 correspond to the six-phase asymmetrical machine. expression (10) and fig. 4 correspond to the nine-phase machine. the field in fig. 4 and (10) comprises several revolving and several pulsating components. it is also observed that the revolving-field content is considerably lower in the nine-phase machine.
fig. 3 spatial distribution of the air-gap field in an asymmetrical six-phase machine used within the integrated charger of fig. 1. the waveforms are taken at 4 distinct instants of time.
fig. 4 spatial distribution of the air-gap field in an asymmetrical nine-phase machine used within the integrated charger of fig. 2. the waveforms are captured at 4 distinct instants of time.
the next step of the proposed procedure consists in deriving the coefficients that correspond to the amplitudes of the specific field components in (9) and (10). the relevant input data points for such a procedure are the time-samples of the mmf taken within one period of the excitation current (that is, one period of the mains) for each of the preselected discrete locations along the circumference. based on the previous considerations, the time-samples are taken every 1 ms within the period of 20 ms. in a machine with 36 slots, there are 36 locations where the mmf is calculated. in this way, there is a total of 720 equations that express the mmf values at specific instants and locations. in addition to the specific time (tx), space (x) and the readily available mmf = f(tx,x) obtained in the previous step, each of the 720 equations also comprises the desired parameters, that is, the coefficients that correspond to the amplitudes of the specific field components in (9) and (10). based on the previous considerations, the functional approximation of the air-gap field in a 6-phase machine requires 8 coefficients. the number of equations (720) exceeds the number of unknown parameters by far. therefore, it is not possible to find a parameter set that would satisfy all the equations exactly, as any and all of them may have a smaller or larger residual error. when using the moore-penrose pseudoinverse method to solve the system where the number of equations exceeds by far the number of parameters, the sum of the squared residues is brought to the minimum. therefore, said approach can be used to obtain a simple-to-use and sufficiently precise functional approximation of the air-gap field. notice in (9) and (10) that the number of spatial harmonics taken into account is rather arbitrary, and it should be based on previous experience. it is of interest to increase the number of harmonics in the analytical expressions, yet, on the other hand, this increases the number of coefficients, increases the calculation complexity, gives rise to numerical errors caused by the finite wordlength and also makes the practical use of the final outcome more involved when it comes to designing and optimizing integrated charger topologies. in order to get an insight into the impact of the number of spatial harmonics on the consequential errors, the process is repeated several times with the top harmonic being the 11th, 13th, 15th, 17th and 19th.
the evaluation is performed in the following steps: (i) first, the mmf values are calculated in 20 time x 36 space = 720 individual points, (ii) after assembly of the 720 equations, the relevant parameters are calculated by means of pseudo-inversion, (iii) the values of the mmf are calculated from the functional approximation; (iv) in each point, the residual errors are obtained as the difference between the actual and approximation values, and these are used in (12); (v) the error $\Delta F$ is obtained as the ratio $\Delta^2 / MMF_{wf}^2$ of the summed squared residues to the summed squared mmf values, as illustrated in (12), where the subscripts wf and fa denote the actual waveform and the functional approximation.
$\Delta F = \dfrac{\sum_{i=1}^{20}\sum_{j=1}^{36}\left[MMF_{wf}(i,j) - MMF_{fa}(i,j)\right]^2}{\sum_{i=1}^{20}\sum_{j=1}^{36} MMF_{wf}^2(i,j)}$ (12)
table 1 residual errors (12) obtained after deriving the relevant parameters by moore-penrose pseudo-inversion
number of phases | top harmonic 19th | 17th | 15th | 13th | 11th
6 | 8.53·10^-13 | 9.52·10^-13 | 15.62 | 15.62 | 15.62
9 | 7.41·10^-13 | 7.41·10^-13 | 7.41·10^-13 | 6.78 | 6.78
the results are shown in table 1, and they display the error (12) obtained for the 6-phase and 9-phase topology, in cases where the highest-order harmonic sweeps from the 11th up to the 19th. notice in table 1 that the error (12) is relative; the value of 1 corresponds to the case where the rms value of the error equals the rms value of the actual field. therefore, the values that exceed 1 are indicators that the attempted functional approximation is fundamentally wrong. in the case of a 6-phase machine, the error maintains a considerable value until the 17th harmonic is taken into account. this circumstance is attributed to the existence of a significant 17th harmonic which contributes to considerable errors when not taken into account. further addition of the 19th harmonic into the approximation formula does not have any significant impact. in the case of a 9-phase machine, the error (12) remains high until the 15th harmonic is included into the functional approximation.
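the effect of truncating the harmonic list can be illustrated with a short python sketch. it is a sketch under stated assumptions and does not reproduce the values of table 1: the error is computed as the ratio of summed squares described for (12), the mmf waveform is synthetic (a 5th and a 17th harmonic with made-up amplitudes), both directions of rotation are fitted for each harmonic order, and the names `relative_error` and `fit_error` are hypothetical.

```python
import numpy as np

OMEGA = 2 * np.pi * 50.0   # mains angular frequency [rad/s]


def relative_error(mmf_wf, mmf_fa):
    """Summed squared residues divided by summed squared waveform values."""
    mmf_wf = np.asarray(mmf_wf, dtype=float)
    mmf_fa = np.asarray(mmf_fa, dtype=float)
    return float(np.sum((mmf_wf - mmf_fa) ** 2) / np.sum(mmf_wf ** 2))


def fit_error(mmf_wf, t, theta, orders):
    """Fit the listed harmonic orders by pseudo-inversion and return the error."""
    cols = []
    for nu in orders:
        for sign in (-1.0, 1.0):            # both directions of rotation
            arg = OMEGA * t + sign * nu * theta
            cols.extend([np.sin(arg), np.cos(arg)])
    A = np.column_stack(cols)
    coeffs = np.linalg.pinv(A, rcond=1e-10) @ mmf_wf
    return relative_error(mmf_wf, A @ coeffs)


# 20 instants x 36 locations = 720 sample points
t_grid, th_grid = np.meshgrid(
    np.arange(20) * 1e-3, np.arange(36) * 2 * np.pi / 36, indexing="ij"
)
t, theta = t_grid.ravel(), th_grid.ravel()

# synthetic waveform: unit 5th harmonic plus a 17th harmonic of amplitude 0.5
mmf_wf = np.sin(OMEGA * t - 5 * theta) + 0.5 * np.sin(OMEGA * t - 17 * theta)

err_top13 = fit_error(mmf_wf, t, theta, orders=[5, 7, 11, 13])
err_top17 = fit_error(mmf_wf, t, theta, orders=[5, 7, 11, 13, 17])
```

with the 17th harmonic left out of the fit, the error of this synthetic case equals the share of the waveform energy carried by that harmonic, 0.25 / 1.25 = 0.2; once the 17th harmonic is included, the error drops to the level of rounding noise.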
the error behaviour of the 9-phase machine indicates that its air-gap field has a considerable 15th harmonic, while the contributions of the 17th and 19th harmonics are negligible. it is reasonable to assume that taking into account space-time harmonics in excess of the 19th remains computationally feasible. with a reasonable number of slots within the stator magnetic circuit (which is expected to remain far below 100), the proposed 4-step process remains unharmed by the finite-wordlength errors and other issues related to calculation with very large matrices. thus, the 4-step process outlined and demonstrated in this section provides the design tool for distinguishing and grading the stator winding configurations and the power converter topologies of integrated chargers. some very important key features of the system can be envisaged from the content of the pulsating and revolving air-gap field, made readily available by the proposed approach.
4. applying the proposed method in designing integrated chargers
the proposed approach is applied in evaluating the pulsating and revolving fields of four distinct designs, having different stator winding configurations and different numbers of phases in the static power converter. considered are (i) the 6-phase integrated charger where the stator winding configuration is symmetrical, (ii) the 6-phase integrated charger where the stator winding configuration is asymmetrical, (iii) the 9-phase integrated charger where the stator winding configuration is symmetrical, (iv) the 9-phase integrated charger where the stator winding configuration is asymmetrical. in all the considered cases, the number of stator slots is equal to 36, and it is assumed that the winding has a single layer. the functional approximation includes all the relevant space-time harmonics up to and including the harmonic of the 19th order. the obtained results are summarized in table 2 for both the charging mode and the traction mode.
with the 6-phase asymmetrical machine running in charging mode, the relevant harmonics are the 5th, 7th, 17th and 19th, and all of them produce revolving fields. as a consequence, the electromagnetic torque assumes a non-zero value that depends on the magnitude of the relevant harmonic fields. when considering the 6-phase symmetrical machine in charging mode, the harmonic content is more populated, and it includes all the odd high order space-harmonics up to the 19th, excluding only the 3rd, 9th and 15th. what is particularly harmful is the presence of a considerable first harmonic, having a large and unacceptable contribution to the electromagnetic torque. in the considered case, the traction motor would need a mechanical brake while running in charging mode. the change of the magnetic field along the stator circumference is obtained with the symmetrical 6-phase machine and plotted in fig. 5. the considerable harmonic content contributes to a great deal of losses caused by harmonic fields. when a 9-phase machine is used within the integrated charger running in charging mode, the distribution of the mmf along the circumference is given in fig. 6 for four consecutive instants of time. the proposed functional approximation ends with the 19th harmonic, and based upon these assumptions, the outcome of table 2 includes the 3rd and the 15th harmonic different from zero (dfz). yet, when considering the error (12), the resulting value is exceptionally large, clearly indicating that there is a need to include more spatial harmonics. following a more thorough examination, the finding is that the corresponding torque components caused by the charging currents in the 9-phase machine used within the integrated charger are very small due to rather low values of revolving fields.
the results obtained with the 9-phase machine with asymmetrical stator windings used in charging mode are shown in table 2, and they comprise the 3rd, 9th and the 15th harmonic with a relatively low amplitude. most of the components produce pulsating fields, and the revolving field content is negligible when compared to the revolving field of the first harmonic in the traction mode. thus, the average value of the electromagnetic torque obtained in charging mode is negligible, as well as the relevant torque pulsations. in the second part of table 2, the results correspond to the traction mode of all the considered configurations. the torque-generating currents have to be defined in the way prescribed by (1), excluding the intermediate phase transpositions and re-connections. with the asymmetrical 6-phase machine, the field comprises only the 1st, 5th and 13th harmonic, wherein the values of the 5th and the 13th harmonic are very small when compared to the fundamental. in the symmetrical 6-phase machine, the fundamental component is accompanied by all the odd harmonics except for the 3rd, 9th and the 15th harmonic. the torque generation of 9-phase machines with asymmetrical and symmetrical stator winding is based on the approach given in [22], [25] and [26]. in addition to the fundamental component, both symmetrical and asymmetrical 9-phase machines produce only the 17th and 19th harmonics, their values being considerably smaller than the fundamental. the previous analysis, results and discussion outline the application of the proposed 4-step approach in predicting the charging-mode and the traction-mode performances of the integrated charger with multiphase machines. the system losses and electromagnetic torque are estimated from the amplitudes of the pulsating and the revolving fields produced in the air-gap of multi-phase electrical machines.
out of the considered 4 design samples, the best results are obtained with the nine-phase setup employing asymmetrical stator winding. the practical use of the proposed design tool stems from its ability to obtain a quick and precise estimate of the impact of several design parameters and features on the system performances in traction and charging modes. the considered design parameters include the number of phases of the electrical motor, the number of branches in the static power converter, the number of layers and the implementation details of the stator winding, the winding connections and transpositions, as well as the selection of either symmetrical or asymmetrical winding dispositions. a drawback of the proposed algorithm is the need to repeat the pseudo-inverse-based parameter setting of the target functional approximation for each configuration and for each operating regime. in spite of that, the related complexity and efforts are considerably reduced when compared to the approach which relies on fem tools in each of the design phases.

fig. 5 the resultant mmf within the symmetrical 6-phase machine inside the charger described in [25]

fig. 6 the resultant mmf within the symmetrical nine-phase machine inside the charger described in [26]

comparison of the proposed algorithm to other solutions that make use of [24] was not feasible. the vast majority of design approaches [19-22, 25-26] rely on some basic analytical considerations and fem tools. compared to the evidence given in [21], [22], [25], [26], as well as in the references listed in review [20], the results of figs. 3-6 and those of table 2 provide valid conclusions and credible design guidelines.

table 2 the harmonic content of magnetomotive force obtained from the stator winding with 36 slots and with scaled 1 ampere-conductor located in each slot. results are obtained for both charging and traction modes.
the pulsating and revolving fields are resolved in the text which refers to this table

mode      type     v=1     v=3    v=5     v=7    v=11   v=13   v=15   v=17   v=19
charge    6-asym   0       0      1.80    0.97   0      0      0      0.16   0.162
charge    6-sym    5.67    0      0.90    0.48   0.06   0.05   0      0.08   0.081
charge    9-asym   0       0.41   0       0      0      0      0.03   0      0
charge    9-sym    0       dfz    0       0      0      0      dfz    0      0
traction  6-asym   11.35   0      0.12    0      0      0.10   0      0      0
traction  6-sym    11.35   0      1.803   0.97   0.12   0.10   0      0.16   0.0162
traction  9-asym   11.43   0      0       0      0      0      0      0.04   0.044
traction  9-sym    11.43   0      0       0      0      0      0      0.04   0.044

the results presented in this section are not supported by experimental evidence. therefore, with the adopted assumptions and approximations, the presented results could be better than in practical application. experimental comparison of a considerable number of machine-converter configurations was not feasible at this point. at the same time, the analysis and the design of the proposed procedures provide sufficient ground for the claim that the machine configuration with superior predicted performances would outperform the others in practical application as well. therefore, notwithstanding the absence of experimental evidence, the proposed design method is capable of deriving the best machine-converter pair for the desired integrated charger.

5. conclusions

the paper focuses on studying the air-gap field in multi-phase induction machines that are used in electrical vehicles with integrated chargers. it is found that the configuration of the stator windings as well as the topology of the integrated power converter largely affect the distribution of the air-gap field in both charging and traction modes, determining in this way the amplitudes of the revolving and pulsating magnetic fields. said fields are proven to have the key impact on the losses, efficiency, pulsating torques and the torque-per-amps ratio of the integrated traction-charger system.
based upon the analysis, a novel algorithm is proposed which determines the space-time distribution of the air-gap field, eventually deriving all the relevant performances, thus providing a practical and useful design tool. the proposed approach is tested on four characteristic design examples, and it proved efficient in studying the impact of the converter topology, number of phases and the configuration of the stator winding, providing at the same time the indications for further improvements of integrated battery chargers.

references
[1] g. kalghatgi, "fuel/engine interactions", isbn: 9780768080438, sae, 2014.
[2] j. g. zivin, m. kotchen and e. mansur, "spatial and temporal heterogeneity of marginal emissions: implications for electric cars and other electricity-shifting policies", j. econ. behavior & org., vol. 107, pp. 248–268, 2014.
[3] a. elgowainy, j. han, m. mahalik, l. poch, a. rousseau, a. vyas and m. wang, "well-to-wheels analysis of energy use and greenhouse gas emissions of plug-in hybrid electric vehicles", argonne national laboratory: argonne il, 2010.
[4] s. haghbin, k. khan, s. lundmark, m. alaküla, o. carlson, m. leksell, and o. wallmark, "integrated chargers for ev's and phev's: examples and new solutions", in proceedings of the 19th international conference on electrical machines. rome, 2010, pp. 1–6.
[5] l. solero, "nonconventional on-board charger for electric vehicle propulsion batteries", ieee trans. veh. technol., vol. 50, no. 1, pp. 144–149, jan. 2001.
[6] m. grenier, m. h. aghdam and t. thiringer, "design of on-board charger for plug-in hybrid electric vehicle", in proceedings of the international conference on power electronics, machine and drives, 2010, pp. 1–6.
[7] m. yilmaz, p.t. krein, "review of battery charger topologies, charging power levels, and infrastructure for plug-in electric and hybrid vehicles", ieee trans. power electron., vol. 28, no. 5, pp. 2151–2169, 2013.
[8] w. e.
rippel, "integrated traction inverter and battery charger apparatus", u.s. patent 4 920 475, apr. 1990. [9] g. pellegrino, e. armando and p. guglielmi, "an integral battery charger with power factor correction for electric scooter", ieee trans. power electron., vol. 25, no. 3, pp. 751–759, mar. 2010. [10] f. lacressonniere and b. cassoret, "converter used as a battery charger and a motor speed controller in an industrial truck", in proceedings of the epe, 2005, pp. 1–7. [11] s. haghbin, s. lundmark and m. alakula, "grid-connected integrated battery chargers in vehicle applications: review and new solution", ieee trans. ind. electron., vol. 60, no. 2, pp. 459–473, 2013. [12] m. hinkkanen and j. luomi, "braking scheme for vector-controlled induction motor drives equipped with diode rectifier without braking resistor", in proceedings of the 14th industry applications conference. kowloon, hong kong: 2005, vol. 2, pp. 1066–1072. [13] a. g. cocconi, "combined motor drive and battery charger system", u.s. patent 5 341 075, aug. 1994. [14] s. lacroix, e. laboure and m. hilairet, "an integrated fast battery charger for electric vehicle", in proceedings of the ieee vehicle and power propulsion conference, sep. 2010, pp. 1–6. [15] s. haghbin, s. lundmark, m. alakula, and o. carlson, "an isolated high-power integrated charger in electrified vehicle applications", ieee trans. veh. technol., vol. 60, no. 9, pp. 4115–4126, nov. 2011. [16] j. k. kamoun, n. b. hadj, m. chabchoub, r. neji and m. ghariani, "an induction motor fem-based comparative study: analysis of two topologies", in proceedings of the 8th international conference and exhibition on ecological vehicles and renewable energies (ever). monte carlo, 2013, pp. 1–5. [17] đ. lekić and s. vukosavić, "finite element design of rotor permanent magnet flux switching machine with arbitrary slot, pole and phase combinations", electronics, vol. 22, no. 2, pp. 93–104, january 2019. [18] s. stipetić, w. miebach and d. 
žarko, "optimization in design of electric machines: methodology and workflow", in proceedings of the international acemp-optimelectromotion 2015 joint conference. side, turkey, 2-4 sep. 2015, pp. 1–8.
[19] e. levi, m. jones, s. vukosavic, et al., "a novel concept of a multiphase, multimotor vector controlled drive system supplied from a single voltage source inverter", ieee trans. power electron., vol. 19, no. 2, pp. 320–335, 2004.
[20] e. levi, r. bojoi, f. profumo, et al., "multiphase induction motor drives – a technology status review", iet electric power appl., vol. 1, no. 4, pp. 489–516, 2007.
[21] i. subotic, e. levi, m. jones, and d. graovac, "an integrated battery charger for evs based on an asymmetrical six-phase machine", in proceedings of the ieee industrial electronics society conference. vienna, austria, 2013, pp. 7242–7247.
[22] i. subotic, n. bodo, e. levi, and m. jones, "onboard integrated battery charger for evs using an asymmetrical nine-phase machine", ieee trans. on industrial electronics, vol. 62, no. 5, pp. 3285–3295, 2015.
[23] r. penrose, "a generalized inverse for matrices", proceedings of the cambridge philosophical society, vol. 51, pp. 406–413, 1955.
[24] r. penrose, "on best approximation solution of linear matrix equations", proceedings of the cambridge philosophical society, vol. 52, pp. 17–19, 1956.
[25] i. subotic, e. levi, m. jones, and d.
graovac, "on-board integrated battery chargers for electric vehicles using nine-phase machines", in proceedings of the ieee international electric machines and drives conference. chicago, il, 2013, pp. 239–246. [26] i. subotic, n. bodo, e. levi, et al.: "isolated chargers for evs incorporating six-phase machines", ieee trans. ind. electron., vol. 63, no. 1, pp. 653-664, 2016. facta universitatis series: electronics and energetics vol. 32, n o 4, december 2019, pp. 539-554 https://doi.org/10.2298/fuee1904539p © 2019 by university of niš, serbia | creative commons license: cc by-nc-nd on the node ordering of progressive polynomial approximation for the sensor linearization  aneta prijić, aleksandar ilić, zoran prijić, emilija živanović, branislav randjelović university of niš, faculty of electronic engineering, niš, serbia abstract. many sensors exhibit nonlinear dependence between their input and output variables and specific techniques are often applied for the linearization of their transfer characteristics. some of them include additional analog circuits, while the others are based on different numerical procedures. one commonly used software solution is progressive polynomial approximation. this method for sensor transfer function linearization shows strong dependence on the order of selected nodes in the linearization vector. there are several modifications of this method which enhance its effectiveness but require extensive computational time. this paper proposes the methodology that shows improvement over progressive polynomial approximation without additional increase of complexity. it concerns the order of linearization nodes in linearization vector. the optimal order of nodes is determined on the basis of sensor transfer function concavity. the proposed methodology is compared to the previously reported methods on a set of analytical functions. 
it is then implemented in the temperature measurement system using a set of thermistors with negative temperature coefficients. it is shown that its implementation in the low-cost microcontrollers integrated into the nodes of reconfigurable sensor networks is justified.

key words: sensor linearization, progressive polynomial approximation, reconfigurable sensor networks, ntc thermistor

received february 5, 2019; received in revised form july 1, 2019
corresponding author: aneta prijić, faculty of electronic engineering, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: aneta.prijic@elfak.ni.ac.rs)

1. introduction

transfer functions of sensors used in measurement systems usually do not have linear dependence between input and output variables. in addition, transfer functions often change with time. for these reasons, measurement systems based on the sensors exhibit various errors such as offset, gain, hysteresis, cross-sensitivity, drift, and non-linearity [1], [2]. in order to achieve reliable measurement, these errors should be compensated. one approach is to use additional analog circuits to condition the sensor output signal [3], [4], [5]. however, analog compensation is not always appropriate for sensors integrated into reconfigurable sensor networks [1], [6], [7]. a more flexible solution is to convert the sensor output into the digital domain, where various numerical linearization methods may be applied in the form of compensation algorithms. these methods rely on a set of correction functions applied on a so-called linearization vector composed of linearization nodes. the effectiveness of the linearization method is evaluated on the basis of the number of nodes required to reduce non-linearity below a specific value, computation time, and implementation complexity. the simplest linearization method is based on a look-up table (lut), which is also the fastest one.
however, to obtain a high accuracy of the estimated input value, a high number of linearization nodes should be implemented in the lut, making it memory consuming. to reduce memory requirement, a sparse lut can be combined with an interpolation method [8]. a simple method is piece-wise linear interpolation which connects each two adjacent lut values with an appropriate linear function. this type of interpolation can be also used for linearization of the whole transfer function. for n linearization nodes, sensors inverse transfer function will be represented by n−1 first order polynomials [9]. the main disadvantage of this method is a high number of nodes required for the linearization of highly nonlinear functions. this implies either a large memory requirement or a slow response time of the system [8]. more advanced methods are lagrange, newton [10] and spline [11] interpolations. however, lagrange interpolation suffers from overfitting effect for polynomials of higher degree and it is generally not applicable to highly nonlinear sensors [10], [12]. the newton method is more flexible and efficient when additional linearization nodes are introduced, but it is primarily applied for equidistant nodes [13]. on the other hand, spline interpolation is effective, but it comes at high implementation costs [14]. more effective and more commonly used methods are progressive polynomial approximation (ppa) [15], [16] and linearization methods based on the artificial neural networks (ann) [17], [18]. the effectiveness of the ppa method, besides the linearization nodes number, depends on the node ordering in the linearization vector. results obtained using the same compensation algorithm, but with the different order of nodes (permutation) [19], may vary between almost perfect in some cases, to even increased non-linearity in the other. in the case of the ann method, effectiveness depends on the neural network topology and the time needed for its training. 
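the sparse lut combined with piece-wise linear interpolation can be illustrated with a short python sketch. it assumes a monotonic transfer function, so that the inverse lookup is unique; the helper names `make_lut` and `estimate_input` and the quadratic example sensor in the usage note are hypothetical and serve only as an illustration.

```python
from bisect import bisect_right

def make_lut(xs, ys):
    """Build an inverse look-up table sorted by sensor output y.

    xs, ys : input and output values of the n linearization nodes.
    Assumes a monotonic transfer function (assumption of this sketch).
    """
    pairs = sorted(zip(ys, xs))
    lut_y = [p[0] for p in pairs]
    lut_x = [p[1] for p in pairs]
    return lut_y, lut_x

def estimate_input(y, lut_y, lut_x):
    """Estimate sensor input from output y using one of the n-1 linear segments."""
    i = bisect_right(lut_y, y) - 1            # segment whose left node is <= y
    i = max(0, min(i, len(lut_y) - 2))        # clamp so out-of-range values extrapolate
    y0, y1 = lut_y[i], lut_y[i + 1]
    x0, x1 = lut_x[i], lut_x[i + 1]
    return x0 + (x1 - x0) * (y - y0) / (y1 - y0)
```

for a sensor modelled as y = x² with five equidistant nodes on [0, 1], `estimate_input` returns the exact input at each node and a straight-line estimate between nodes, which is where the residual non-linearity of this method comes from.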
this paper proposes the methodology that improves the accuracy of the ppa while keeping its simplicity. theoretical background, a summary of ppa, and an overview of its modifications are presented in section 2. section 3 contains an analysis of the ppa method effectiveness considering different permutations of the linearization vector for four different functions. two of these functions are convex and two concave. this is done in order to elaborate on the idea that the optimal order of nodes in the linearization vector is dependent on the transfer function shape. in such a sense, an extensive computation time needed to accomplish the desired linearity by analysis of all permutations may be avoided. experimental support of the presented numerical results is given in section 4, using negative temperature coefficient (ntc) thermistors as sensors.

2. linearization methods

2.1. underlying theory

the transfer function of sensors is usually expressed as y = f(x), where x is sensor input and y is sensor output [20]. the linearization method calculates the desired output value using the workflow depicted in fig. 1. sensor output is first digitized, using an analog-to-digital converter (adc), and then the linearization algorithm is applied. the obtained output value should vary linearly with the sensor input, i.e. it should equal k·x + n, where k is the gain and n (usually 0) is the offset of the desired transfer function [1]. all operations are performed by a microcontroller, which is a part of the reconfigurable sensor node.

fig. 1 workflow of a sensor linearization process

there are two types of linearization methods, as illustrated in fig. 2. the first involves an estimation of the sensor transfer function and subsequent numerical determination of the sensor inverse transfer function, which is used to obtain the estimated input value. the linearized output value is then obtained from the estimated input.
methods of the second type modify the sensor output y using a correction function, which yields the linearized output directly. the estimated input value is, in this case, calculated from the linearized output.

fig. 2 linearization of a sensor output for two distinct methods.

the linearization node is a pair of two values: input x and corresponding output y. values are determined experimentally by applying a known stimulus at the sensor input and measuring the value of its output. input values are usually chosen equidistantly, starting at the minimal input that a sensor can detect, and ending at the full-scale. nodes are then ordered to form the linearization vector. linearization coefficients, implemented into the correction function, are determined using the linearization vector, and then stored in the memory of a microcontroller. on each measurement, these coefficients are used to calculate the linear output value of the sensor. the effectiveness of linearization is evaluated using relative non-linearity [21], defined in (1) in terms of the full-scale input and the maximum deviation of the real input from the ideal transfer characteristic. due to the nature of the linearization algorithm, non-linearity will be equal to zero for any of the linearization nodes if a quantization error introduced by ad conversion is neglected. therefore, additional measurements need to be performed to form a set of ne evaluation nodes and calculate the maximum deviation in (2) as the largest absolute deviation over i = 1, …, ne, evaluated from the applied input value and the corresponding output value at each evaluation node.

2.2. progressive polynomial approximation (ppa)

calculation of the correction function in ppa is a successive process [15]. for each linearization node, a new correction function is defined. therefore, the i-th function corrects the non-linearity around the i-th node, while keeping the corrections introduced by the previously defined functions.
the final correction function is the one defined for the last node. in order to calculate the correction function, the linear output value ti at each node should be defined by (3) for i = 1, . . . , n. these values are then used to calculate linearization coefficients for correction functions. the first correction function is defined by (4), where the linearization coefficient is calculated for the first linearization node according to (5). thus, the first correction function adds this coefficient to the sensor output, eliminating the offset (if present). the correction function at the i-th node is defined by the product form (6) for i = 2, . . . , n, and the corresponding linearization coefficient is calculated from (7) for i = 2, . . . , n. these correction functions eliminate the gain error and successively minimize the transfer function non-linearity. since the final correction function includes all the linearization nodes, it will output the desired value for any of these nodes, while between them the linearized output will deviate from the ideal one to some extent.

2.3. modifications of the ppa method

in ppa, the first two nodes in the linearization vector are chosen from the ends of a sensor range, thus eliminating the offset and gain errors [15]. then, each new node is added halfway between the previous two. when the linearization vector ordered in that way is used, the achieved results are not optimal, but large non-linearity is avoided. note that nodes are not necessarily equidistant. the main advantage of ppa lies in its simplicity, so it does not require intensive time-consuming operations. this makes it particularly suitable for the implementation in reconfigurable sensor networks. improved progressive polynomial approximation (imppa) is the method based on permutations of nodes in the initial linearization vector. each permutation is tested in order to find the best one [19].
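the successive construction of section 2.2 and the testing of node orderings can be sketched in python. the recursion below is the commonly cited form of ppa, h_i(y) = h_{i-1}(y) + a_i · ∏_{j<i} (h_j(y) − t_j) with h_0(y) = y, and is an assumption of this sketch rather than a transcription of (3)-(7); the function names, the normalized targets t = x and the restricted permutation search (first two nodes fixed at the range ends) are likewise illustrative.

```python
from itertools import permutations

def ppa_apply(y, coeffs, targets):
    """Evaluate the correction functions at sensor output y.

    Returns (h_m(y), product of (h_j(y) - t_j) for j = 1..m), m = len(coeffs).
    """
    h, prod = y, 1.0
    for a, t in zip(coeffs, targets):
        h = h + a * prod          # h_i = h_{i-1} + a_i * prod_{j<i}(h_j - t_j)
        prod *= (h - t)           # extend the product with the new factor
    return h, prod

def ppa_fit(ys, ts):
    """Compute coefficients a_i so that the final function maps every y_i to t_i."""
    coeffs = []
    for y_i, t_i in zip(ys, ts):
        h_prev, prod = ppa_apply(y_i, coeffs, ts)   # uses only earlier coefficients
        coeffs.append((t_i - h_prev) / prod)
    return coeffs

def max_deviation(f, order, x_nodes, x_eval):
    """Largest |linearized output - ideal| over evaluation nodes for one ordering."""
    ys = [f(x_nodes[i]) for i in order]
    ts = [x_nodes[i] for i in order]                # ideal output t = x (k = 1, n = 0)
    try:
        coeffs = ppa_fit(ys, ts)
    except ZeroDivisionError:                       # degenerate ordering, reject it
        return float("inf")
    return max(abs(ppa_apply(f(x), coeffs, ts)[0] - x) for x in x_eval)

def best_order(f, x_nodes, x_eval):
    """Test orderings with the range ends fixed as first and second node."""
    last = len(x_nodes) - 1
    candidates = [(0, last) + p for p in permutations(range(1, last))]
    return min(candidates, key=lambda o: max_deviation(f, o, x_nodes, x_eval))
```

with five nodes the search covers 3! = 6 orderings; the factorial growth of this count with the number of nodes is the computational cost that motivates looking for an ordering rule instead of an exhaustive test.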
to reduce the number of arithmetic operations, imppa does not test all possible permutations of the initial vector. rather, it fixes the first node from the beginning of the sensor range as the first, the node from the end of the range as the second, and then permutes the remaining ones. the effectiveness of each permutation is determined by the non-linearity obtained at nodes which are inserted between the nodes of the linearization vector. a major drawback of this method is increased implementation costs in terms of the complexity and computation time. it should be noted that this method finds the optimal permutation of the given linearization vector for equidistant nodes. the method can be further improved if linearization vector with non-equidistant nodes is used. a probability density function is used to improve ppa, as presented in [22], [23]. this approach proposes an accumulation of linearization nodes in the part of a transfer function that will be used most commonly during a sensor lifetime. this can significantly improve measurement system accuracy in some cases, but the problem of the further ordering of the selected nodes still remains unsolved. a different linearization method inspired by ppa, called modified progressive polynomial approximation (mppa), is addressed in [24]. methodology for selection of the nodes which does not consider their order in the linearization vector is introduced. the larger set of equidistant nodes is formed first. the linearization vector is not predefined, but it is populated at each step by the node from the original set at which current linearized function deviates most from the linear one. consequently, selected nodes in the linearization vector are not equidistant. 3. proposed methodology proposed modification of the ppa method concerns the order of nodes in the linearization vector to obtain the desired transfer function linearity without increasing the algorithm complexity. 
several analytic expressions commonly used to model sensors transfer functions are analyzed. in order to make a comparison of the results, transfer functions are normalized before the linearization methodology was applied. both, input (argument) and output (function value) are normalized to range [0, 1]. if x m is sensor input and y m is sensor output, normalized input and output values are calculated using the following equations: , (8) , (9) 544 a. prijić, a. ilić, z. prijić, e. živanović, b. randjelović where and are minimum and full range input values, while and are the corresponding output values. the initial linearization vector ti is formed as a set of n equidistant nodes starting from the beginning of the sensors range. it is expressed as: [ ], (10) or, using shorthand notation: [ ]. permutations of the linearization vector are denoted as , i = 1, 2, . . . , n − 1!. 3.1. convex functions the ppa linearization methodology is applied to an exponential function: ⁄ (11) where p is parameter used to adjust its non-linearity (0>a). the solution for the potential distribution of the electrode is based on conventional nodal analysis which is represented in the following matrix equations 1 [ ] [ ][ ] [ ] ([ ] [ ][ ] [ ] [ ] [ ]) s t t i y v y q g q a z a     (1) where [y] is the admittance matrix, [z] is the impedance matrix, [g] is the conductance matrix, [q] is a matrix containing relations between nodal and segment potentials and [is] is the current source vector. equation (1) results with the potential distribution [v] (of every node) along the conductor. the conductor is excited by a harmonic current source at the first node and the impedance to ground is calculated as a ratio of the conductor’s first node voltage and current. the presence of air-earth interface is taken into account by implementing the quasi-dynamic image theory [9]. 
for the purpose of this research the cbm method is implemented in the classical form [5], and in the hybrid form [8]. more details of the extraction of impedances and conductances from maxwell's equations and the approximated relation between nodal and segment potentials can be found in [7]. the validity of the cbm method applied on complex grounding networks has been thoroughly investigated by the authors [7], [10]. one such network is the e. balaidos ii substation grounding (located close to the city of vigo, spain). the analyzed network has dimensions 80 x 60 m and is constructed with 107 horizontal copper bars of 1.28 cm diameter buried at 80 cm depth and a set of 67 vertical copper clad steel rods of 1.4 cm diameter and 2.5 m length. a 3d view of the grounding grid and the profile on the surface of the ground used for calculations is presented in fig. 2. a small portion of the verification results is presented in fig. 3.

fig. 2 geometry of the balaidos grounding network and the position of the calculation profile.

the verification process included results for the potential along the profile when the injected current frequency is low (50 hz) and higher (100 khz) [10]. the current was injected in a corner of the grounding grid (point o1 from fig. 2). the resistivity of the soil was ρ=50 Ωm. the obtained results are compared graphically in fig. 3.

fig. 3 potential along the profile.

2.2. lumped circuit (r-l-c) approach

the second method compared in this paper is the lumped circuit (r-l-c) approach. the main purpose of this method is to replace the grounding conductor with an equivalent of its input impedance, or impedance to remote neutral ground. at low frequencies, the input impedance is represented by a single resistor and at high frequencies by a lumped r-l-c circuit, fig. 4.

fig. 4 low- and high-frequency equivalent lumped circuit of the grounding conductor.
the expressions used for the circuit parameters of a vertical grounding rod are taken from reference [11].

\[ R = \frac{1}{2\pi\sigma l}\log\frac{2l}{a}\ (\Omega), \qquad C = \frac{2\pi\varepsilon l}{\log(2l/a)}\ (\mathrm{F}), \qquad L = \frac{\mu_0 l}{2\pi}\log\frac{2l}{a}\ (\mathrm{H}) \tag{2} \]

2.3. tl model

the third method being compared in this paper is the tl or distributed circuit parameters method, fig. 5. this method assumes transverse-electromagnetic (tem) propagation on a perfect infinite conductor in a homogeneous medium and neglects the effects of the earth-air interface.

fig. 5 discrete approximation of the distributed circuit representation of the grounding conductor.

the distributed circuit parameters (per length) are obtained from the lumped circuit parameters (2) in the following manner

\[ R' = R\,l\ (\Omega\,\mathrm{m}), \qquad G' = \frac{1}{R'}, \qquad C' = \frac{C}{l}\ (\mathrm{F/m}), \qquad L' = \frac{L}{l}\ (\mathrm{H/m}) \tag{3} \]

fig. 5 presents a discrete approximation of the distributed-parameter circuit, where each segment of the grounding conductor is represented by an r-l-c section. identical parameters are used for each section. the impedance to ground of the grounding electrode is, in fact, the input impedance of the transmission line open at the lower end [12].

\[ Z = Z_0 \coth(\gamma l), \qquad Z_0 = \sqrt{\frac{j\omega L'}{G' + j\omega C'}}, \qquad \gamma = \sqrt{j\omega L'\,(G' + j\omega C')} \tag{4} \]

2.4. emf approach

the emf method is used to investigate the validity range of the methods presented in the previous subsections. the reference results are obtained by rigorous method of moments calculations of the mixed potential integral equation [8] implemented on the identical system. for the perfect conductor from fig.
1, the above-mentioned equation in the frequency domain expresses the z component (tangential) of the electric field as

$$E_z = -j\omega \int_l G_A\, I(z')\, dz' + \frac{1}{j\omega}\frac{d}{dz}\int_l G_V\, \frac{dI(z')}{dz'}\, dz' \tag{5}$$

the longitudinal current i(z') is then expanded as a linear combination

$$I(z') = \sum_{n=1}^{N} F_n I_n \tag{6}$$

where $I_n$ are the unknown current values on every segment and $F_n$ are triangular basis functions. the following matrix equation yields the current distribution [5], [13]:

$$[Z][I] = [-Z_s I_s] \tag{7}$$

where the array [i] represents the currents to be determined, [z] is the impedance matrix of mutual impedances between each of the current elements, [–zsis] represents the energization array, and is is the injected (source) current. the elements of the matrix [z] are calculated between the observation segment m and the source segment n, as

$$z_{nm} = \frac{V_{nm}}{I_n} = -\frac{1}{I_n}\int_{l_m} E_{nz}\, dl \tag{8}$$

more details of the mpie solution can be found in [13].

3. results

in the first part of this section, the curves of the impedance to ground obtained by the different methods are compared.

fig. 6 impedance to ground of a 3 m long vertical grounding rod, panels a) and b).

fig. 6 a) and b) present the impedance to ground of a 3 m long vertical grounding conductor calculated with the cbm method, compared to curves obtained by the other methods from the literature [6]. the radius of the analyzed conductor is 1.25 cm. the value of soil resistivity (ρ = 1/σ) is a) 30 Ωm and b) 300 Ωm. the length of the conductor is increased 10 times in fig. 7. as shown in the figures, the cbm method is implemented in two ways: first without time propagation, and second with time propagation effects included in the equivalent circuit parameters. it is visible from figs.
6 and 7 that there is an excellent agreement between the results obtained by the cbm method including time propagation and the reference emf method. the differences between all the other impedance curves and the above-mentioned method are significantly larger, especially at high frequencies. the cbm method takes into account not only the self characteristics of each segment but also the coupling among different segments. that is the main reason for the high precision of cbm compared with other, less accurate circuit methods, some of which are tested in this paper.

the second part of the section provides a parametric analysis of the grounding impedance of a horizontal conductor buried in imperfect ground, in terms of soil resistivity and conductor length. classical and enhanced cbm are compared in the latter case.

fig. 7 impedance to ground of a 30 m long vertical grounding rod, panels a) and b).

fig. 8 impedance to ground of a 3 m long horizontal grounding conductor, panels a) and b).

the dependency of the grounding impedance on the specific soil resistivity is shown in the following figures. fig. 8 a) and b) present the magnitude and phase of the grounding impedance for a 3 m long horizontal grounding conductor, and fig. 9 a) and b) present the same quantities for a 30 m long horizontal conductor. the radius of the analyzed conductor is 1.25 cm, the depth of burial is 80 cm and the relative permittivity of soil is set to εr = 10.

fig. 9 impedance to ground of a 30 m long horizontal grounding conductor, panels a) and b).

it may be observed from figs. 8 and 9 that, as could be expected, the impedance is significantly higher for grounding conductors placed in highly resistive soil.
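the lumped and tl expressions of sections 2.2 and 2.3 are simple enough to evaluate directly, which makes the dependence on soil resistivity easy to check. the sketch below is only an illustration under stated assumptions: it assumes the log(2l/a) form of the rod parameters in (2), reuses the rod data of fig. 6 (l = 3 m, a = 1.25 cm, ρ = 30 Ωm and 300 Ωm), and borrows εr = 10 from the horizontal conductor study. the function names are hypothetical and are not taken from the paper.

```python
import cmath
import math

MU0 = 4e-7 * math.pi   # vacuum permeability (H/m)
EPS0 = 8.854e-12       # vacuum permittivity (F/m)


def lumped_parameters(length, radius, rho, eps_r):
    """Lumped R, L, C of a vertical rod (assumed log(2l/a) form of eq. (2))."""
    a1 = math.log(2.0 * length / radius)
    r = rho / (2.0 * math.pi * length) * a1
    c = 2.0 * math.pi * eps_r * EPS0 * length / a1
    l = MU0 * length / (2.0 * math.pi) * a1
    return r, l, c


def tl_impedance(freq, length, radius, rho, eps_r):
    """Input impedance of the line open at the lower end, eqs. (3) and (4)."""
    r, l, c = lumped_parameters(length, radius, rho, eps_r)
    g_pl = 1.0 / (r * length)   # G' = 1/R' with R' = R*l
    c_pl = c / length           # C' = C/l
    l_pl = l / length           # L' = L/l
    w = 2.0 * math.pi * freq
    series = 1j * w * l_pl
    shunt = g_pl + 1j * w * c_pl
    z0 = cmath.sqrt(series / shunt)
    gamma = cmath.sqrt(series * shunt)
    # coth(x) = 1 / tanh(x)
    return z0 / cmath.tanh(gamma * length)
```

at low frequency the tl impedance collapses to the lumped resistance r, and r scales linearly with ρ, which is consistent with the higher impedance observed for highly resistive soil. this sketch does not include the segment coupling that the cbm method accounts for.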
the third part of this section provides a parametric analysis of the grounding impedance of a horizontal conductor buried in imperfect ground, in terms of soil resistivity and relative permittivity. classical and enhanced cbm are implemented for horizontal and vertical grounding conductors. fig. 10 a) and b) present the magnitude of the grounding impedance for a horizontal grounding conductor of 30 m and 300 m length.

fig. 10 impedance to ground of a horizontal grounding conductor: a) l = 30 m, b) l = 300 m.

fig. 11 a) and b) present the magnitude of the grounding impedance for a 30 m and a 300 m long vertical grounding conductor.

fig. 11 impedance to ground of a vertical grounding conductor: a) l = 30 m, b) l = 300 m.

4. conclusions

it may be concluded from the presented research that the cbm method is a solid choice, in terms of accuracy, for determining the grounding impedance of buried single conductors. even the classical form of cbm, which does not include time propagation of em fields, provides much more accurate results than the other investigated methods when compared to the reference results. the presented figures show a very high agreement between the reference results and those obtained with the enhanced cbm method (hem). the higher accuracy of this method compared to other circuit methods is mainly due to the coupling of segments that cbm (and hem) take into account. the high precision of the cbm method, together with the relative simplicity and high speed of its application, shows that it may be one of the best choices in the analysis of grounding systems.

references

[1] a. kuhar and l. grcev, "calculating the impedance of grounding electrodes using circuit equivalents", in proc. of the 12th international conference on applied electromagnetics пес 2015, niš, serbia, 2015.
[2] r. velazquez and d.
mukhedkar, ”analytical modelling of grounding electrodes transient behavior”, ieee trans. power apparatus and systems, vol. 103, pp. 1314-1322, 1984. [3] f. napolitano, m. paolone, a. borghetti, c. a. nucci, f. rachidi, v. a. rakov, j. schoene and m. a. uman, “interaction between grounding systems and nearby lightning for the calculation of overvoltages in overhead distribution lines”, ieee trondheim powertech, 2011. [4] p. yutthagowith, a. ametani: “application of a hybrid electromagnetic circuit method to lightning surge analysis”, ieee trondheim powertech, 2011. [5] a. f otero, j. cidras and j. l. alemo, ”frequency-dependent grounding system calculation by means of a conventional nodal analysis technique” ieee transactions on power delivery, vol. 14, no. 3, pp. 873878, 1999. [6] l. grcev and m. popov, ”on high-frequency circuit equivalents of a vertical ground rod”, ieee transactions on power delivery, vol. 20, no. 2, pp. 1598-1603, 2005. [7] r. jankoski, a. kuhar and l. grcev, “application of the electric circuit approach in the analysis of grounding conductors”, in proc. of the 5th international symposium on applied electromagnetics – saem'2014, skopje, macedonia, 2014. [8] s. visacro, f.h. silveira, “evaluation of current distribution along the lightning discharge channel by a hybrid electromagnetic model”, journal of electrostatics, vol. 60, pp. 111-120, 2004. [9] v. arnautovski-toseva, l. grcev, “electromagnetic analysis of horizontal wire in two-layered soil”, journal of computational and applied mathematics, vol. 168, no. 1-2, pp. 21-29, 2004. [10] a. kuhar, l. ololoska-gagoska and l. grcev, “numerical analysis of complex grounding systems using circuit based method”, in proc. of the xii international conference – etai, ohrid, macedonia, 2015. [11] r. rudenberg, electrical shock waves in power systems, cambridge, ma: harvard univ. press, 1968. [12] s. bourg, b. sacepe and t. 
debu, "deep earth electrodes in highly resistive ground: frequency behavior", in proc. of the ieee int. symp. electromagnetic compatibility, 1995, pp. 584-589.
[13] l. grcev, b. markovski, v. arnautovski-toseva, a. kuhar, k. el khamlichi drissi, k. kerroum, "modeling of horizontal grounding electrodes for lightning studies", in proc. of the european electromagnetics, euroem 2012, toulouse, france, july 2-6, 2012.

facta universitatis series: electronics and energetics vol. 33, no 1, march 2020, pp. 119-131 https://doi.org/10.2298/fuee2001119m © 2020 by university of niš, serbia | creative commons license: cc by-nc-nd

improving time integration scheme for fet analysis of power system angle stability

marin mandić, ivica jurić-grgić, nedjeljka grulović-plavljanić
university of split, faculty of electrical engineering, mechanical engineering and naval architecture, split, croatia

abstract. this paper presents an improved algorithm for numerical analysis of power system angle stability, achieved by improving the time integration used when forming a local system of equations for power system finite elements (fe). the previously developed local system of equations of power system angle stability was obtained using the generalized trapezoidal rule (ϑ method). the improvement in accuracy is obtained by using heun's method. numerical solutions obtained using heun's method and using the generalized trapezoidal rule are compared to the electromagnetic transients program (emtp). it is shown that heun's method yields results with much higher accuracy than those obtained by the generalized trapezoidal rule.

key words: heun's method, finite element technique, angle stability, time domain analysis.

1. introduction

the assessment of angle or transient stability of an electrical system plays a fundamental role, as it allows the classification of contingencies and provides indications for the design and planning of power systems.
transient stability, also known as large-disturbance rotor angle stability, concerns the ability of the system to withstand critical disturbances such as three-phase short circuits. as reported in [1-7], it is very difficult to analyze the slow electromechanical and the fast electromagnetic phenomena simultaneously. this paper presents an improved algorithm for numerical analysis of power system angle stability. the complex numerical analysis is performed simultaneously in both the time domain and the frequency domain. the developed numerical model is based on a finite element technique (fet) procedure and time-varying phasors. in contrast to existing transient stability programs (tsp), which are limited to first swing stability analysis, the developed numerical model can analyze large-disturbance rotor angle stability. the basis of the developed numerical model for analysis of power system angle stability is the application of fet to the power system, so that the considered system is divided into smaller parts, which are treated as separate finite elements (fe) with a certain number of local nodes during the entire time-domain simulation.

received august 28, 2019; received in revised form october 2, 2019
corresponding author: marin mandić, university of split, faculty of electrical engineering, mechanical engineering and naval architecture, split, croatia (e-mail: mmandi00@fesb.hr)

fundamentals of fet analysis of power system angle stability have been described in [8], where the generalized trapezoidal rule (ϑ method) is used for temporal integration when forming the fe equations system. the use of the generalized trapezoidal rule for the time integration when forming the fe equations system can sometimes cause numerical oscillations or numerical diffusion of results. numerical oscillations and/or numerical damping of results are the consequences of the choice of the numerical integration parameter (ϑ).
in principle, its choice is based on numerical experiments. the influence of ϑ on the stability of one-step time integration methods for first-order initial value problems is discussed in [9-10], where the stability behaviour is investigated by examining the free response of the fe model. it was found that methods for which 0.5 ≤ ϑ ≤ 1 are unconditionally stable.

the aim of this article is to improve the algorithm for numerical analysis of power system angle stability by improving the time integration scheme used when forming the fe equations system. instead of using the generalized trapezoidal rule when forming the fe matrix, the accuracy of the numerical solution can be improved by using for the time integration one of the variants of the second-order runge-kutta method, also known as heun's method [11]. heun's method is sometimes also referred to as the improved euler's method. heun's method is suitable because, as a single-step method, it connects variables only at the beginning and at the end of the time interval and thus preserves the simplicity of the numerical procedure. it is a single-step predictor-corrector method that improves the estimate of the slope over the time interval by using the average of the slopes at the beginning and at the end of the time interval. the desire for calculation speed must be balanced against the required accuracy and stability. higher accuracy could be obtained with a multi-stage runge-kutta method, but this would make the computation time-consuming. theoretically, one can develop an infinite number of time schemes and their combinations, but the idea here was to improve the accuracy of the fet model computation for power system angle stability analysis with a one-step method that is not time-consuming. numerical solutions obtained using heun's method and using the generalized trapezoidal rule with time integration parameter ϑ = 0.66 are compared to the electromagnetic transients program (emtp).
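the predictor-corrector idea described above can be illustrated on a scalar test problem. the following is a minimal sketch, not the fe formulation of the paper: `heun_step` applies the predictor and the averaged-slope corrector to a generic first-order equation dy/dt = f(t, y), and `theta_step_linear` applies the generalized trapezoidal rule to the linear test equation dy/dt = λy. both function names and the test equation are assumptions made for illustration.

```python
import math


def heun_step(f, t, y, h):
    """One step of Heun's method for dy/dt = f(t, y)."""
    slope_start = f(t, y)                  # slope at the beginning of the interval
    y_pred = y + h * slope_start           # predictor (explicit Euler)
    slope_end = f(t + h, y_pred)           # predicted slope at the end of the interval
    return y + 0.5 * h * (slope_start + slope_end)   # corrector (average slope)


def theta_step_linear(lam, y, h, theta):
    """One step of the generalized trapezoidal rule for dy/dt = lam * y."""
    return y * (1.0 + (1.0 - theta) * h * lam) / (1.0 - theta * h * lam)


def one_step_errors(lam=-1.0, h=0.1, theta=0.66):
    """Absolute one-step errors of both schemes for y(0) = 1."""
    exact = math.exp(lam * h)
    err_heun = abs(heun_step(lambda t, y: lam * y, 0.0, 1.0, h) - exact)
    err_theta = abs(theta_step_linear(lam, 1.0, h, theta) - exact)
    return err_heun, err_theta
```

for this particular test equation and step size, the one-step error of heun's method is smaller than that of the generalized trapezoidal rule with ϑ = 0.66. this says nothing by itself about the full power system model, where the comparison is made against emtp.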
it has been shown that heun's method yields results with much higher accuracy than those obtained by the generalized trapezoidal rule.

2. fet analysis – heun's method

fundamentals of fet analysis of power system angle stability have been described in [8], where the generalized trapezoidal rule (also known as the ϑ method) is used for temporal integration when forming the fe equations system. in the following text, it is shown how to derive a local system of equations for power system finite elements by using heun's method for temporal integration instead of the generalized trapezoidal rule. numerical models of the turbine governor and of the excitation system have been included in the synchronous generator fe model, where the numerical models have been improved by using heun's method [11] for the time integration. finite elements of the utility equivalent source, three-phase transformers and transmission lines are defined by the system of algebraic equations already described in [8].

2.1. fe synchronous generator – heun's method

according to [12-13], the salient pole synchronous generator subtransient phasor model is defined by the following system of algebraic and differential equations, where the time-domain swing equation (6) has also been included:

$$E''_q - U_q = r I_q - x''_d I_d \tag{1}$$
$$E''_d - U_d = r I_d + x''_q I_q \tag{2}$$
$$T'_{do}\frac{dE'_o}{dt} = E_f + (x_d - x'_d) I_d - E'_o \tag{3}$$
$$T''_{do}\frac{dE''_q}{dt} = E'_o + (x'_d - x''_d) I_d - E''_q \tag{4}$$
$$T''_{qo}\frac{dE''_d}{dt} = -(x_q - x''_q) I_q - E''_d \tag{5}$$
$$M_i\frac{d\omega}{dt} = P_m - P_e \tag{6}$$
$$\frac{d\delta}{dt} = \Delta\omega = \omega - \omega_n \tag{7}$$
$$P_e = \frac{E''_o U}{x''_d}\sin\delta + \frac{U^2}{2}\left(\frac{1}{x''_q} - \frac{1}{x''_d}\right)\sin 2\delta \tag{8}$$

where $E'_o$, $E''_q$, $E''_d$ are transient and subtransient generator voltages. the variable $E_f$ represents the generator field voltage, while the variables $U_q$, $U_d$, $I_q$ and $I_d$ are the axial components of the terminal voltage $U$ and of the stator current $I$.
in order to perform the time integration according to heun's method, it is necessary to put relations (3-7) into the following form:

$$\frac{dE'_o}{dt} = f_1(E_f, I_d, E'_o) \tag{9}$$
$$\frac{dE''_q}{dt} = f_2(E'_o, I_d, E''_q) \tag{10}$$
$$\frac{dE''_d}{dt} = f_3(I_q, E''_d) \tag{11}$$
$$\frac{d(\Delta\omega)}{dt} = f_4(P_m, P_e) \tag{12}$$
$$\frac{d\delta}{dt} = f_5(\Delta\omega) \tag{13}$$

where:

$$f_1(E_f, I_d, E'_o) = \frac{E_f}{T'_{do}} + \frac{(x_d - x'_d) I_d}{T'_{do}} - \frac{E'_o}{T'_{do}} \tag{14}$$
$$f_2(E'_o, I_d, E''_q) = \frac{E'_o}{T''_{do}} + \frac{(x'_d - x''_d) I_d}{T''_{do}} - \frac{E''_q}{T''_{do}} \tag{15}$$
$$f_3(I_q, E''_d) = -\frac{(x_q - x''_q) I_q}{T''_{qo}} - \frac{E''_d}{T''_{qo}} \tag{16}$$
$$f_4(P_m, P_e) = \frac{P_m}{M_i} - \frac{P_e}{M_i} \tag{17}$$
$$f_5(\Delta\omega) = \Delta\omega \tag{18}$$

according to heun's method, the state of the variables at the end of the time interval is defined by the following corrector equations:

$$E'^{+}_o = E'_o + \frac{\Delta t}{2} f_1(E_f, I_d, E'_o) + \frac{\Delta t}{2} f_1(E^{+}_f, I^{+}_d, E'^{p}_o) \tag{19}$$
$$E''^{+}_q = E''_q + \frac{\Delta t}{2} f_2(E'_o, I_d, E''_q) + \frac{\Delta t}{2} f_2(E'^{+}_o, I^{+}_d, E''^{p}_q) \tag{20}$$
$$E''^{+}_d = E''_d + \frac{\Delta t}{2} f_3(I_q, E''_d) + \frac{\Delta t}{2} f_3(I^{+}_q, E''^{p}_d) \tag{21}$$
$$\Delta\omega^{+} = \Delta\omega + \frac{\Delta t}{2} f_4(P_m, P_e) + \frac{\Delta t}{2} f_4(P^{+}_m, P^{+}_e) \tag{22}$$
$$\delta^{+} = \delta + \frac{\Delta t}{2} f_5(\Delta\omega) + \frac{\Delta t}{2} f_5(\Delta\omega^{+}) \tag{23}$$

according to (14-18), the predicted slopes $f_1(E^{+}_f, I^{+}_d, E'^{p}_o)$, $f_2(E'^{+}_o, I^{+}_d, E''^{p}_q)$, $f_3(I^{+}_q, E''^{p}_d)$, $f_4(P^{+}_m, P^{+}_e)$, $f_5(\Delta\omega^{+})$ at the end of the time interval are:

$$f_1(E^{+}_f, I^{+}_d, E'^{p}_o) = \frac{E^{+}_f}{T'_{do}} + \frac{(x_d - x'_d) I^{+}_d}{T'_{do}} - \frac{E'^{p}_o}{T'_{do}} \tag{24}$$
$$f_2(E'^{+}_o, I^{+}_d, E''^{p}_q) = \frac{E'^{+}_o}{T''_{do}} + \frac{(x'_d - x''_d) I^{+}_d}{T''_{do}} - \frac{E''^{p}_q}{T''_{do}} \tag{25}$$
$$f_3(I^{+}_q, E''^{p}_d) = -\frac{(x_q - x''_q) I^{+}_q}{T''_{qo}} - \frac{E''^{p}_d}{T''_{qo}} \tag{26}$$
$$f_4(P^{+}_m, P^{+}_e) = \frac{P^{+}_m}{M_i} - \frac{P^{+}_e}{M_i} \tag{27}$$
$$f_5(\Delta\omega^{+}) = \Delta\omega^{+} \tag{28}$$

where $E'^{p}_o$, $E''^{p}_q$, $E''^{p}_d$ are predicted
transient and subtransient voltage values at the end of the time interval defined by the following predictor equations:

$$E'^{p}_o = E'_o + \Delta t\, f_1(E_f, I_d, E'_o) \tag{29}$$
$$E''^{p}_q = E''_q + \Delta t\, f_2(E'_o, I_d, E''_q) \tag{30}$$
$$E''^{p}_d = E''_d + \Delta t\, f_3(I_q, E''_d) \tag{31}$$

where $f_1(E_f, I_d, E'_o)$, $f_2(E'_o, I_d, E''_q)$, $f_3(I_q, E''_d)$ are defined by equations (14-16). the time integration of equations (3-7) using the generalized trapezoidal rule (ϑ method) can be found in [8]. applying heun's method [11] to equations (3-7), which includes combining (19-31) and moving the variables at the end of the time interval to the left-hand side and the variables at the beginning of the time interval to the right-hand side, gives the following system of algebraic equations:

$$E'^{+}_o - \frac{\Delta t}{2T'_{do}} E^{+}_f - \frac{\Delta t}{2T'_{do}}(x_d - x'_d) I^{+}_d = \left(1 - \frac{\Delta t}{T'_{do}} + \frac{(\Delta t)^2}{2(T'_{do})^2}\right) E'_o + \left(\frac{\Delta t}{2T'_{do}} - \frac{(\Delta t)^2}{2(T'_{do})^2}\right)\left[E_f + (x_d - x'_d) I_d\right] \tag{32}$$

$$E''^{+}_q - \frac{\Delta t}{2T''_{do}} E'^{+}_o - \frac{\Delta t}{2T''_{do}}(x'_d - x''_d) I^{+}_d = \left(1 - \frac{\Delta t}{T''_{do}} + \frac{(\Delta t)^2}{2(T''_{do})^2}\right) E''_q + \left(\frac{\Delta t}{2T''_{do}} - \frac{(\Delta t)^2}{2(T''_{do})^2}\right)\left[E'_o + (x'_d - x''_d) I_d\right] \tag{33}$$

$$E''^{+}_d + \frac{\Delta t}{2T''_{qo}}(x_q - x''_q) I^{+}_q = \left(1 - \frac{\Delta t}{T''_{qo}} + \frac{(\Delta t)^2}{2(T''_{qo})^2}\right) E''_d - \left(\frac{\Delta t}{2T''_{qo}} - \frac{(\Delta t)^2}{2(T''_{qo})^2}\right)(x_q - x''_q) I_q \tag{34}$$

$$\Delta\omega^{+} - \frac{\Delta t}{2M_i} P^{+}_m + \frac{\Delta t}{2M_i} P^{+}_e = \Delta\omega + \frac{\Delta t}{2M_i} P_m - \frac{\Delta t}{2M_i} P_e \tag{35}$$

$$\delta^{+} - \frac{\Delta t}{2}\Delta\omega^{+} = \delta + \frac{\Delta t}{2}\Delta\omega \tag{36}$$

124 m. mandić, i. jurić-grgić, n.
grulović-plavljanić the variables in equations (32-36) marked by “+” denote variables at the end of the time interval, while variables without mark denote variables at the beginning of the time interval. according to [8], synchronous generator can be defined as fe with three local nodes where fe local system has been defined by the system of algebraic equation (8) and equations (32-37). 1 1 { } [ ] { } [ ] { } g e g e o i z z e         (37) dynamic phasors of local matrix as well as dynamic phasors of vectors in equation (37) are defined as follows: 0 0 [ ] 0 0 0 0 e e e e z z z z             1 1 12 / 3 2 / 3 01 02 03 0 0 0 { } { } { } j j jt t o e e e e e e e e e e                  1 2 3 { } { } t g      1 2 3 { } { } t g i i i i where: [ cos ( )] j e d q d z r j x e x x          2 2 = (e ) +(e ) o d q e   2 2 i= ( ) +( ) d q i i  =  +  angle between stator current phasor i and „q‟ axis power angle 2.2. 
model of hydraulic digital turbine governor – heun’s method according to [14], numerical model of digital turbine governor is defined by the following system of algebraic and differential equations: s ref r rp = p δω p  (38) s1 sa sb scp = p + p +p (39) sa p sp k p  (40) 1g p q    (41) _v open v v_closeq q q  (42) min maxq q q  (43) sc s sc d d dp dp p t = k dt dt   (44) improving time integration scheme for fet analysis of power system angle stability 125 v g v g p s1 dq t q t t = p dt    (45) v dq = q dt (46) sb i s dp =k p dt  (47) 2 2 g g r r dp dq p t =δ t dt dt       (48) ( ) m m 11 w 11 23 11 13 21 w dp dq p a t = a δ a a a a t dt dt           (49) where:  rotor angular frequency n nominal angular frequency pm, pe mechanical and electrical power kp pid controller proportional gain ki pid controller integral gain kd pid controller derivative gain td pid controller derivative time constant pref referent value of mechanical power tp pilot time constant tr transient droop time constant tw water inertia time constant tg gate time constant qv_open gate max opening speed qv_close gate min closing speed qmax maximal gate position qminminimal gate position  permanent droop coefficient a11, a13, a21 turbine coefficients a23 turbine gain the time integration of above equations using the generalized trapezoidal rule (  method) can be found in [8]. using the same procedure shown in chapter 2.1. for time integration of equations (44 49), that is application of heun‟s method, the following system of algebraic equations is obtained: 2 2 2 2 )(2 )( )(2 )( 2 222 d sc d snd d sc d snd d sc d snd scsc t tp t tpk t tp t pkt t tp t pkt pp                      (50)       snsnss p t p t pp 22 (51) 126 m. mandić, i. jurić-grgić, n. 
grulović-plavljanić 2 2 2 2 111 )(2 )( 2 )( 2222 p v pg s p v pg s p v pg s vv t tq tt tp t tq tt pt t tq tt pt qq                     (52)       vv q t q t qq 22 (53)       sisisbsb pk t pk t pp 22 (54) 2 2 2 2 2 2 22 2 )( 2 )( 2 222 r g r v r g v r g vgg t tp t tq t pt q t t tp q t pp                       (55) 23 23 11 13 21 11 11 11 23 23 11 13 21 11 11 11 2 2 2 23 23 11 13 21 2 2 2 11 11 ( ) 2 2 2 ( ) 2 2 2 ( ) ( ) ( ) ( ) 2 ( ) 2 2 v m m m w w v m w w v m w w q t a t q a a a a p t p p a t a a t q a t t a a a a q p t a t a a t q a t t q a a a a p t a t a t                                                           2 11 ( ) w a t  (56) the numerical model of turbine governor is defined by the system of algebraic equations (38-43) and equations (50-56). 2.3. model of excitation system – heun’s method according to [15], numerical model of excitation system is defined by the following system of algebraic and differential equations: 0f c ref t f a e v v v v k    (57) in maxrm r rv v v  (58) 1t r t t r r dv k v v dt t t     (59) 1 ar r c a a kdv v v dt t t     (60) 1f e f r e e de k u v dt t t     (61) f f f f f dv de v t k dt dt    (62) where: improving time integration scheme for fet analysis of power system angle stability 127 ka, kr, kf, ke regulator amplification factors ta, tr, tf, te time constants vt, vf, vr , vc regulator signals ef field voltage ef 0 initial field voltage vt phase voltage vref referent phase voltage the time integration of above equations using the generalized trapezoidal rule (ϑ method) can be found in [8]. 
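the excitation system blocks above have the structure of first-order lags, so the algebraic update produced by heun's method can be written in closed form. the sketch below assumes a generic block of the form T dy/dt = K u − y (a reading of the structure of equation (59), not the paper's exact notation) and checks that the two-stage predictor-corrector evaluation and the expanded algebraic form give the same result. the function names are hypothetical, and the numerical values Kr = 1 and Tr = 0.02 s are taken from the test example only as sample data.

```python
def lag_slope(u, y, gain, tau):
    """Right-hand side of the first-order lag: dy/dt = (gain*u - y) / tau."""
    return (gain * u - y) / tau


def heun_lag_two_stage(y, u_start, u_end, h, gain, tau):
    """Heun's method written as explicit predictor and corrector stages."""
    slope_start = lag_slope(u_start, y, gain, tau)
    y_pred = y + h * slope_start                      # predictor
    slope_end = lag_slope(u_end, y_pred, gain, tau)   # slope at end of interval
    return y + 0.5 * h * (slope_start + slope_end)    # corrector


def heun_lag_closed_form(y, u_start, u_end, h, gain, tau):
    """Same update after eliminating the predictor algebraically."""
    r = h / tau
    return (y * (1.0 - r + 0.5 * r * r)
            + 0.5 * gain * r * (u_start + u_end)
            - 0.5 * gain * r * r * u_start)
```

the closed form shows the pattern that also appears in the generator equations: the end-of-interval input enters only through the term with Δt/(2T), while all terms with (Δt)² involve quantities known at the beginning of the interval.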
using the same procedure shown in chapter 2.1 for time integration of equations (59 62), that is application of heun‟s method, the following system of algebraic equations is obtained: 2 2 2 2 2 )( 2 )( 2222 r t r tr r t r tr r t r tr tt t tv t tvk t tv t vkt t tv t vkt vv                     (63) 2 2 2 2 2 )( 2 )( 2222 a r a ca a r a ca a r a ca rr t tv t tvk t tv t vkt t tv t vkt vv                     (64) 2 2 2 2 2 )( 2 )( 2222 e f e re e f e re e f e re ff t te t tvk t te t vkt t te t vkt ee                     (65) 2 2 2 2 2 )( 2 )( 1 2 2 1 22 1 f f f r e e f e f f f f r e e f e f f f f r e e f e f ff t tv t tv t k e t k t vt t v t k e t kt t tv t v t k e t kt vv                                        (66) the numerical model of excitation system is defined by the system of algebraic equations (57-58) and equations (63-66). 3. test example in order to verify a new numerical procedure, angle stability of the power system, shown in figure 1, has been analysed. at the beginning of simulation, the generator 1 and generator 4 are in operating mode with p=108 mw (0.9 pu) and q =0 mvar (0 pu) while generator 2 and generator 3 are in operating mode with p=84 mw (0.7 pu) and q =52.3 mvar (0.436 pu). global system of equations is obtained by assembling local system of equations of each fe according to the standard assembling procedure [9-10]. at the moment t=0.8 s, the three-phase fault with clearing fault time of 200 ms has been set at the busbar „d‟ causing synchronous generators rotor angle (δ) oscillations and variation of mechanical powers (pm) due to turbine governor control. 128 m. mandić, i. jurić-grgić, n. grulović-plavljanić the generators data are: 120 mva n s  , 108 mw n p  , 52.3 mvarnq  , 14.4 kvnu  , 4811 a n i  , 690 a fo i  , 1192 afni  . 
available transient and subtransient generator time constants and reactances are: xd = 1.01 pu, xq = 0.666 pu, x''q = 0.287 pu, x'd = 0.421 pu, x''d = 0.268 pu, x = 0.198 pu, r = 0.00236 pu, t'd = 3.4 s, t'd0 = 7.5 s, t''d = 0.0554 s, t''q = 0.0876 s, tmg = 8.63 s. the turbine governor time constants, permanent and temporary droop coefficients, maximal and minimal gate opening speed, maximal and minimal gate position, amplification factors and turbine coefficients are: tp = 0.02 s, tr = 4.5 s, tw = 2.9 s, tg = 0.5 s, qv_open = 0.15, qv_close = 0.08, qmax = 0.8, qmin = 0.1, permanent droop coefficient 0.04, δ = 0.378, kp = 15, ki = 0.8, kd = 1.5, td = 5 s, a11 = 0.5, a13 = 1, a21 = 1.5, a23 = 1. the excitation system time constants and amplification factors are: tr = 0.02 s, ta = 0.001 s, te = 0.1 s, tf = 0.1 s, kr = 1, ka = 50, ke = 1, kf = 0.001.

fig. 1 single line diagram of the electric power system

the reactances of the step-up transformers and of the transmission lines have been taken as 0.1 pu. to demonstrate that heun's method is more accurate than the ϑ method, numerical solutions obtained using heun's method and using the generalized trapezoidal rule with time integration parameter ϑ = 0.66 are compared to the electromagnetic transients program (emtp), as shown in figures (2-5). the results below, showing rotor angles (δ) and mechanical powers (pm) of generators 1 and 2, demonstrate that heun's method yields results with much higher accuracy than those obtained by the generalized trapezoidal rule.

fig. 2 generator '1' rotor angle (δ) during angle stability simulation

fig. 3 generator '2' rotor angle (δ) during angle stability simulation

since different values of ϑ yield different time-stepping schemes, all these schemes vary in accuracy.
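the effect of ϑ on an oscillatory response can be made concrete with the per-step amplification factor of a scheme applied to an undamped mode dy/dt = jωy, a common idealization of a rotor angle swing. the sketch below is an illustration under assumptions that are not taken from the paper: a single mode at 1 hz and a time step of 10 ms. a factor below 1 means that the scheme removes amplitude that the exact solution would keep (numerical damping).

```python
import math


def theta_amplification(omega, h, theta):
    """|y+/y| per step of the generalized trapezoidal rule for dy/dt = j*omega*y."""
    z = 1j * omega * h
    return abs((1.0 + (1.0 - theta) * z) / (1.0 - theta * z))


def heun_amplification(omega, h):
    """|y+/y| per step of Heun's method for dy/dt = j*omega*y."""
    z = 1j * omega * h
    return abs(1.0 + z + 0.5 * z * z)


def amplitude_after(factor, steps):
    """Relative amplitude left after a number of equal steps."""
    return factor ** steps


OMEGA = 2.0 * math.pi * 1.0   # assumed 1 Hz swing mode (rad/s)
STEP = 0.01                   # assumed time step (s)
```

with these assumed values, ϑ = 0.66 shrinks the oscillation by roughly 6 % per simulated second, ϑ = 0.5 leaves the amplitude unchanged, and heun's method changes it only by a few parts per million per step (a very slight growth, as expected for an explicit second-order scheme on an undamped mode).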
as can be seen from figures (2-5), numerical solutions obtained by heun's method are highly accurate, while the generalized trapezoidal rule with ϑ = 0.66 causes numerical damping of results. furthermore, the generalized trapezoidal rule with ϑ = 0.5 leads to numerical oscillations. numerical oscillations and/or numerical damping of results are the consequences of the choice of the numerical integration parameter (ϑ). in principle, its choice is based on numerical experiments. the influence of ϑ on the stability of one-step time integration methods for first-order initial value problems is discussed in [9-10], where the stability behaviour is investigated by examining the free response of the fe model. it was found that methods for which 0.5 ≤ ϑ ≤ 1 are unconditionally stable.

fig. 4 generator '1' mechanical power (pm) during angle stability simulation

fig. 5 generator '2' mechanical power (pm) during angle stability simulation

4. conclusion

in this paper an improved time integration scheme for numerical analysis of power system angle stability using fet was presented. the improvement in accuracy was achieved by using heun's method for the time integration scheme when forming the fe equations system. the proposed numerical method has been tested and compared to results obtained by the ϑ method and by the emtp program. it has been shown that heun's method for time integration gives results with higher accuracy than the previously used generalized trapezoidal rule (ϑ method). in some cases where the generalized trapezoidal rule is used, the choice of the numerical integration parameter (ϑ) generates numerical oscillations and/or numerical damping of results. with the proposed approach, these unwanted consequences are avoided.

references

[1] z. de souza, l.
bil, "unified computational tool for transient and long-term stability studies", iet gener transm distrib, vol. 3, no. 2, pp. 173-181, 2009.
[2] s. henschel, analysis of electromagnetic and electromechanical power system transients with dynamic phasors, ph.d. dissertation, university of british columbia, 1999.
[3] v. venkatasubramanian, "tools for dynamic analysis of the general large power system using time-varying phasors", international journal of electrical power and energy systems, vol. 16, no. 6, pp. 365-376, 1994.
[4] t. odun-ayo, m.l. crow, "an analysis of power system transient stability using stochastic energy functions", international transactions on electrical energy systems, vol. 23, no. 2, pp. 151-165, 2013.
[5] r. adapa, j. reeve, "a new approach to dynamic analysis of ac networks incorporating detailed modeling of dc systems. ii. application to interaction of dc and weak ac systems", ieee transactions on power delivery, vol. 3, no. 4, pp. 2012-2019, 1988.
[6] m.r. zarate, c.t. van, m. federico, j. conejo antonio, "securing transient stability using time-domain simulations within an optimal power flow", ieee trans power system., vol. 25, no. 1, pp. 243-253, 2010.
[7] b.y. bagde, b.s. umre, k.r. dhenuvakonda, "an efficient transient stability-constrained optimal power flow using biogeography-based algorithm", international transactions on electrical energy systems, vol. 28, no. 1, e2467, 2018.
[8] i. jurić-grgić, n. grulović-plavljanić, m. dabro, "an analysis of power system transient stability using finite element technique", international transactions on electrical energy systems, vol. 29, no. 1, e2647, 2019.
[9] o.c. zienkiewicz, r.l. taylor, the finite element method, mcgraw-hill: vol 1, london, uk, 1989.
[10] o.c. zienkiewicz, k. morgan, finite elements and approximation, john wiley & sons: new york, usa, 1983.
[11] s.c. chapra, r.c. canale, numerical methods for engineers (7th edition), mcgraw-hill: new york, 2015.
[12] j. arrillaga, c.p.
arnold, b.j. harker, computer modelling of electrical power systems. john wiley and sons: new york, usa, 1983.
[13] j. arrillaga, c.p. arnold, computer analysis of power systems. john wiley and sons: new york, usa, 1990.
[14] m. dabro, i. jurić-grgić, r. lucić, "emtp-rv model of hydraulic digital governor", international review on modelling and simulations, vol. 4, no. 6, pp. 1-5, 2011.
[15] ieee standard 421.5-2016: recommended practice for excitation system models for power system stability studies, 2016.

facta universitatis series: electronics and energetics vol. 33, no 2, june 2020, pp. 289-302 https://doi.org/10.2298/fuee2002289a © 2020 by university of niš, serbia | creative commons license: cc by-nc-nd

design and electromagnetic modeling of integrated lc filter in a buck converter

bensaci ahmed 1, guettaf yacine 2, djekidel rabah 1, hamid m-h-a 3
1 lacosere laboratory, amar telidji university, laghouat, algeria
2 laboratory of instrumentation and advanced materials, university center nour bachir of el bayadh, algeria
3 university of lorraine, cran, umr 7039, france

abstract. this paper presents the design and electromagnetic modeling of a new structure of integrated low-pass lc filter in a buck converter. this micro-filter consists of a planar circular coil placed between two mn-zn ferrite substrates. mn-zn ferrite has been chosen because of its high permeability and permittivity. in this micro-filter the substrates act not only as a magnetic core but also as a capacitor. the electromagnetic and electrical behavior of the integrated filter is modeled by simulating the equivalent electrical circuit of the dimensioned filter with the software psim 9.0. the different electromagnetic phenomena that appear during the operation of the filter are visualized in 3d using the finite element method.

key words: converter dc-dc, comsol 3.5a, lc filter, modeling, mn-zn ferrite, integration, psim 9.0

1.
introduction

the integration of passive components for dc-dc converter applications is still a challenging area of research. the integration of planar lc passive components such as inductors and capacitors has been applied for many years [1]. the idea of integrating passive components into the converter's structure was suggested in the early 1990s by the power electronics group of van wyk and ferreira from the rand laboratory for energy and university of south africa. the main objective is to obtain multifunctional and compact integrated modules using conductive, dielectric, and magnetic materials with different properties, such as high conductivity, high permittivity, and high permeability. in the present work, we have chosen mn-zn ferrite because of its high permeability and permittivity. in this filter, the mn-zn substrates act not only as a magnetic core but also as a capacitor. in order to reduce the conduction losses in the part of the ferrite used as a capacitor.

received september 26, 2019; received in revised form december 29, 2019
corresponding author: djekidel rabah, electrical engineering department, university of amar telidji of laghouat, bp 37g route of ghardaïa, laghouat 03000, algeria (e-mail: rabah03dz@live.fr)

290 b. ahmed, g.yacine, d. rabah, h. m-h-adnane

they are associated with a wide range of substrates, including pcbs and standard ceramic. this technology was used only for low power ranges (a few watts) and for operating frequencies up to several ghz [2,3]. the filter studied in this work is based on this technology, but it is designed for a low power range and a lower operating frequency: a few watts and 100 khz. in this paper, we present the design of an integrated output lc filter dedicated to dc-dc converters, based on the use of ferrite both as a magnetic circuit and as a dielectric substrate for the capacitive effect. these two properties allow the design of a hybrid lc component operating as a low-pass filter.
taking into account the electric and magnetic characteristics of the selected materials, the proposed structure is composed of a planar micro-coil with circular turns sandwiched between two ferrite layers. the outer surfaces of the ferrites are metalized to form ground planes. the operation of the micro-filter is simulated using circuit-type software (psim) to evaluate the performance of this model. finally, we visualize the different electromagnetic phenomena that appear during the operation of the micro-filter, namely the distribution of the electric potential and the density of the magnetic field lines.

2. presentation of the buck converter

the converter is the starting point of the design of the micro-filter. we have chosen a buck converter (fig. 1). the micro-filter we want to integrate into the buck converter is thus dimensioned for this type of application: input voltage vin = 5 v, output voltage vout = 3.3 v, output power pout = 10 w, operating frequency f = 100 khz. the buck converter circuit is shown in fig. 1. the inductor and the capacitor of the micro-filter to be realized are two of the intrinsic elements of this converter [4-9].

fig. 1 output lc low-pass filter to be used in a buck converter

3. lc filter structure

the micro-filter consists of a planar circular spiral coil sandwiched between two ferrite substrates (figs. 2 and 3). this structure shows both inductance and distributed capacitance between the coil and the ground planes, as shown in fig. 2(b). it acts as an electromagnetically integrated lc low-pass filter; each element of the equivalent circuit model (fig. 2(b)) has to be calculated from simulations (inductors) and measurements (capacitors) [7-9]. the spiral winding is divided into 2n semicircular cells connected in series. each semicircle is characterized by an equivalent self-inductance, a mutual inductance, a series resistance and an inter-turn impedance.
the values of the inductances and mutual inductances are obtained by an analytic method; the inter-turn and volume capacitances are calculated from the mn-zn ferrite modeling results [9].

4. calculation of the geometrical and electrical parameters of planar inductor

by taking into account the selected electrical and magnetic characteristics, we evaluate the volume of the magnetic core. this enables us to define the section on which we will place the electrical circuit of the planar spiral inductor; we then determine the dimensions of this circuit in order to meet the specifications of the converter in terms of magnetic energy storage and losses in the materials [6-9].

fig. 2 (a) exploded view of the studied low-pass filter, (b) equivalent circuit model

fig. 3 lower mn-zn substrate (20 half-turns)

each planar spiral inductor is defined geometrically by several parameters (fig. 4), such as the number of turns n, the conductor width w, the conductor thickness t, the spacing between conductors s, the conductor length l, the outer diameter dout and the inner diameter din [10-12].

fig. 4 geometrical parameters of a planar spiral integrated inductor

the mathematical equations used to determine the parameters of an integrated circular spiral inductor are presented in detail in references [13-15]. the essential parameters are gathered in table 1.

table 1 results of the dimensioning
geometrical parameters | results of dimensioning
inductance: l (µH) | 1.85
number of turns: n | 10
volume of the magnetic cores (m^3) | 615.25×10^-9
length of the conductor: l (mm) | 583.7
width: w (mm) | 0.13
thickness: t (mm) | 0.0932
spacing inter-turn: s (mm) | 0.398
external diameter: dout (mm) | 19.5
internal diameter: din (mm) | 9.75

5. filter circuit model

5.1. modelling of mn-zn ferrite

the electrical properties of ferrites depend on their polycrystalline structure. the grain and the insulating grain boundary are the two main components that determine the variation of resistivity and permittivity [16]. the equivalent circuit is shown in fig. 5. rg and rgb are the grain and the grain boundary resistances respectively [7-9].

fig. 5 equivalent electrical circuit of mn-zn ferrite

measurements to extract the values of the three components of fig. 5 were performed at the laplace laboratory (paul sabatier university in toulouse, france) [13]. the results are shown in table 2.

table 2 fitting parameters of mn-zn at t=20°c (frequency range: 10 hz - 0.7 mhz)
ρg (Ω·m) | ρgb (Ω·m) | εr
0.173 | 34.8 | 6.5×10^4

5.2. filter model

the spiral inductor between the ferrite substrates is circular. since the spiral inductor is a distributed structure, our proposed model is a distributed model. several works have already shown that a turn may be represented by a lumped model, as reported in fig. 6 [7-9]. in order to get better accuracy, the spiral of n turns has been broken into 2n half-turns. the lumped model corresponding to each half-turn is represented in fig. 6. this choice was preferred because in some of our studied lc filter designs the inter-turn distances vary from the input to the output of the filter. the overall model consists of 2n circuits connected in series to form a distributed model. the inter-turn impedances between adjacent turns have to be considered, especially when a high-permittivity substrate is used [10-15]. they are distributed at the front and the end of the inter-turns, leading to 2(n-1) inter-turn impedances. to illustrate the principle of this modeling, let us consider an lc filter with 2 turns. the resulting filter circuit model is presented in fig. 7.
the mn-zn substrate has a high permittivity, so we did not take into account the capacitance corresponding to the air gap between two neighboring half-turns. consequently, the inter-turn capacitances of fig. 7 are only those due to the mn-zn substrate [7].

fig. 6 half-turn lumped model

fig. 7 (a) 2-turn filter and (b) its associated circuit model

5.3. determination of the electrical model parameters

5.3.1. coil resistances

the resistance of each half-turn has been estimated by using the following classical relationship [7-9]:

$$ R_s(i) = \rho_{cu} \, \frac{l_{avg}(i)}{S} \qquad (1) $$

with i = 1 to n, where $\rho_{cu}$ is the conductor resistivity (here copper), and $l_{avg}(i)$ and $S$ are respectively the mean conductor length of each half-turn and the conductor cross-section. both follow from the geometrical parameters $d_{out}$, $w$, $s$ and $t$ of table 1 through equation (2).

5.3.2. substrate impedances

both the resistances and the capacitances of the ferrite substrates are calculated by using the fitting parameters of table 2 [4, 5]. these relations are:

$$ R_g = \rho_{gMnZn} \, \frac{h}{A} \qquad (3) $$

$$ R_{gb} = \rho_{gbMnZn} \, \frac{h}{A} \qquad (4) $$

$$ C_{gb} = \varepsilon_0 \, \varepsilon_{rMnZn} \, \frac{A}{h} \qquad (5) $$

where $h$ is the height of the ferrite between the half-turn and the ground plane and $A$ is the area of the ferrite below the half-turn.

5.3.3. inter-turns impedances

the calculation of the inter-half-turn parameters is not straightforward. the co-planar electrodes of the tested sample have to be transformed into parallel-plate electrodes by using the theory of conformal transformations developed in [17-19] (fig. 8). new constants are derived from the transformation, as shown in the following equation [19, 20]:

$$ k_r = \frac{2K(k)}{K(k')} \quad \text{and} \quad k_c = \frac{K(k')}{2K(k)} \qquad (6) $$

where $K(k)$ is the complete elliptic integral of the first kind with modulus $k$, and $K(k')$ is the complete elliptic integral of the first kind taken in the complementary modulus $k' = \sqrt{1-k^2}$ [17-19]. the ratio of $K(k)$ to $K(k')$ can be calculated from the following equation:

$$ \frac{K(k)}{K(k')} = \begin{cases} \dfrac{\pi}{\ln\left[ 2\,\dfrac{1+\sqrt{k'}}{1-\sqrt{k'}} \right]}, & 0 \le k \le \dfrac{1}{\sqrt{2}} \\[2ex] \dfrac{1}{\pi}\,\ln\left[ 2\,\dfrac{1+\sqrt{k}}{1-\sqrt{k}} \right], & \dfrac{1}{\sqrt{2}} \le k \le 1 \end{cases} \qquad (7) $$

fig. 8 geometries of (a) two planar conductors on the upper half plane, (b) transformed conductors on the schwarz-christoffel rectangular region, (c) rectangle in w-plane (the lower part of the t-plane)

so, we can calculate $R'_g$, $R'_{gb}$ and $C'_{gb}$ of the inter-half-turn by using the following relationships:

$$ R'_g(i) = \rho_{gMnZn} \cdot \frac{2K(k)}{K(k')} \cdot \frac{1}{l_{tts}(i)}, \qquad R'_{gb}(i) = \rho_{gbMnZn} \cdot \frac{2K(k)}{K(k')} \cdot \frac{1}{l_{tts}(i)}, \qquad C'_{gb}(i) = \varepsilon_0 \, \varepsilon_{rMnZn} \cdot \frac{K(k')}{2K(k)} \cdot \frac{1}{l_{tts}(i)} \qquad (8) $$

where $l_{tts}(i)$ is the mean length of the inter-half-turn, obtained from $d_{out}$, $w$ and $s$ for each index i through equation (9). all the parameters of the filter can now be computed.

6. results of the electrical parameters of the integrated filter

the calculated electrical parameters of the equivalent electrical circuit of the integrated filter are shown in table 3.
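the substrate relations (3) to (5), r = ρ·h/a and c = ε0·εr·a/h, and the closed-form ratio k(k)/k(k') of equation (7) can be sketched numerically as follows. this is only an illustrative sketch, not the authors' implementation: the function names are hypothetical, the values of h and a in the usage lines are assumed for illustration, and the arithmetic-geometric-mean routine is a standard technique added here only to cross-check the closed-form approximation.

```python
import math

EPS0 = 8.854e-12  # vacuum permittivity in F/m


def substrate_elements(rho_g, rho_gb, eps_r, h, area):
    """Grain resistance, grain-boundary resistance and capacitance of the
    ferrite volume under one half-turn: R = rho * h / A, C = eps0 * eps_r * A / h."""
    r_g = rho_g * h / area
    r_gb = rho_gb * h / area
    c_gb = EPS0 * eps_r * area / h
    return r_g, r_gb, c_gb


def elliptic_k(k, iterations=40):
    """Complete elliptic integral of the first kind K(k) for 0 <= k < 1,
    computed as pi / (2 * AGM(1, sqrt(1 - k^2)))."""
    a, b = 1.0, math.sqrt(1.0 - k * k)
    for _ in range(iterations):
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return math.pi / (2.0 * a)


def k_ratio_approx(k):
    """Closed-form approximation of K(k) / K(k') with k' = sqrt(1 - k^2)."""
    if not 0.0 < k < 1.0:
        raise ValueError("modulus k must lie strictly between 0 and 1")
    if k <= 1.0 / math.sqrt(2.0):
        root = math.sqrt(math.sqrt(1.0 - k * k))  # sqrt(k')
        return math.pi / math.log(2.0 * (1.0 + root) / (1.0 - root))
    root = math.sqrt(k)
    return math.log(2.0 * (1.0 + root) / (1.0 - root)) / math.pi


if __name__ == "__main__":
    # table 2 fitting parameters; h and area are assumed, illustrative values
    r_g, r_gb, c_gb = substrate_elements(0.173, 34.8, 6.5e4, h=1.5e-3, area=1.0e-5)
    print(r_g, r_gb, c_gb)
    k = 0.5  # assumed modulus, it depends on the coplanar geometry
    ratio = k_ratio_approx(k)
    print("k_r =", 2.0 * ratio, "k_c =", 1.0 / (2.0 * ratio))
```

the two constants of the conformal transformation then follow directly from the ratio, k_r = 2·k(k)/k(k') and k_c = k(k')/(2·k(k)), as in the last two lines of the usage example.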
table 3 electrical parameters results
(rs and ls: resistance and inductance of each half-turn; rg, rgb, cgb: substrate impedances; rinsg, rinsgb, cinsgb: inter-turns impedances; lt: total length)
n | rs (Ω) | ls (µH) | rg (Ω) | rgb (kΩ) | cgb (µF) | rinsg (Ω) | rinsgb (Ω) | cinsgb (µF) | lt (m)
1 | 0.0189 | 0.019 | 25.896 | 5.2122 | 0.0038 | 2.8386 | 571.3285 | 6.029 | 0.0771
2 | 0.0370 | 0.074 | 13.277 | 2.6725 | 0.0075 | 1.4584 | 293.5440 | 3.097 | 0.1503
3 | 0.0539 | 0.167 | 9.099 | 1.8315 | 0.0109 | 1.0005 | 201.3681 | 2.125 | 0.2194
4 | 0.0699 | 0.297 | 7.024 | 1.4138 | 0.0141 | 0.7729 | 155.5582 | 1.641 | 0.2842
5 | 0.0847 | 0.464 | 5.790 | 1.1654 | 0.0172 | 0.6375 | 128.3047 | 1.354 | 0.3448
6 | 0.0986 | 0.669 | 4.976 | 1.0017 | 0.0200 | 0.5482 | 110.3451 | 1.164 | 0.4011
7 | 0.1114 | 0.910 | 4.404 | 0.8864 | 0.0226 | 0.4855 | 97.7130 | 1.031 | 0.4532
8 | 0.1232 | 1.189 | 3.983 | 0.8017 | 0.0250 | 0.4393 | 88.4277 | 0.933 | 0.5011
9 | 0.1339 | 1.504 | 3.663 | 0.7374 | 0.0271 | 0.4044 | 81.3911 | 0.859 | 0.5448
10 | 0.1436 | 1.857 | 3.416 | 0.6876 | 0.0291 | 0.3773 | 75.9470 | 0.801 | 0.5843

we performed simulations in order to test the operation of the equivalent electrical circuit of the integrated micro-filter. the simulation was performed using psim 9.0. in this simulation, the circuit of fig. 7(b) contains an equivalent electrical circuit of the micro-filter. figs. 9 and 10 show the waveforms of the output and input voltage and current of the micro-filter. the output voltage is 4.414 v instead of 5 v. this is due to the resistive losses in the conductor and to the magnetic core losses. in steady state, the current is 2.20 a, corresponding to 8.7 w output power instead of 10 w.

fig. 9 waveforms of the output and input current of the micro-filter

fig.
10 waveforms of the output and input voltages and currents of the micro-filter

the measured values of the output and input voltages and currents of the micro-filter are given in tables 4 and 5:

table 4 measured values of the output and input currents at the time instant t = 6.78e-005 s
time (s) | iinput (a) | ioutput (a)
6.78e-005 | 2.232 | 2.207

table 5 measured values of the output and input voltages at the time instant t = 6.78e-005 s
time (s) | vinput (v) | voutput (v)
6.78e-005 | 4.999 | 4.414

7. electromagnetic modeling of the lc filter

in this section, we present the different electromagnetic phenomena that appear during the operation of the filter in a buck converter, in order to see the distribution of the magnetic flux density and the electric field. this simulation was performed using comsol 3.5a.

7.1. simulation parameters under comsol 3.5a

to get closer to reality, and based on the dimensioning results of table 1, the geometry of the micro-filter model is designed in 3d, as shown in fig. 11.

fig. 11 structure of the simulated micro-filter with the mesh of the field of study

in table 6 we group together the simulation parameters used to define the mesh: number of nodes and type of elements. we have also noted the simulation time.

table 6 simulation parameters
number of mesh elements, spiral | number of mesh elements, substrate | type of elements | execution time (s) | core volume (m^3)
7766 | 40486 | tetrahedral | 2032.372 | 669.60×10^-9

7.2. simulation results

figs. 12, 13 and 14 show the distribution of the electric potential in the lc filter. we note in fig. 12 that the distribution of the potential is consistent with the specifications. indeed, for an injected current of 2 a, the potential takes its greatest value at the input of the micro-filter (≈ 3.5 v), then decreases until reaching the value of 2 v.

fig.
12 distribution of the electric potential in the micro filter

fig. 13 shows, in the half-turns of the micro filter located in the right half of the figure, potential drops generated by the different losses in the ferrite (substrate). these losses decrease as one approaches the output of the micro filter. indeed, in this zone, the parasitic effects are almost non-existent.

fig. 13 variation of the electric potential in the micro filter

fig. 14 shows the distribution of the magnetic field lines in the micro filter. these field lines are concentrated at the input of the micro filter because of the high value of the current in this region. on the other hand, the majority of these field lines are confined in the substrate because of the high permeability of the ferrite.

fig. 14 distribution of the magnetic field lines of the micro filter

figs. 15 (a, b, and c) show a good vertical propagation of the field in the material. since the working frequency is not high enough, the eddy currents are not yet large enough to modify the distribution of the field lines.

fig. 15 distribution of the magnetic flux density in the micro filter: (a) different directions of the field lines, (b) magnetic flux on the upper part of the micro filter, (c) magnetic flux on the lower part of the micro filter

fig. 16 gives an overview of the distribution of the magnetic flux density in the micro filter. because the current in the conductor is larger in the right half of the micro coil, the flux density is somewhat higher there.

fig. 16 variation of the magnetic flux density in the micro filter

8.
conclusions

in this paper, we have presented the dimensioning, the modeling and the simulation of a micro filter. first, we calculated the geometrical and electrical parameters of the planar inductor. then, we proposed a new micro filter structure consisting of a planar spiral coil sandwiched between two ferrite substrates. this structure shows both distributed capacitance and inductance between the spiral and the ground planes. next, we used mn-zn ceramic substrates for both their high permeability and permittivity. mn-zn ferrite is already widely used as a core material; the goal of this study was to show that it may also be used as a capacitor in the integrated lc filter of a dc-dc power converter. next, by using the simulation software psim 9.0, we compared the waveforms of the converter output voltages. finally, by using comsol 3.5a, we visualized the distribution of the field lines of the micro filter in order to check that they do not overflow and thus do not disturb the other elements of the converter. we also visualized the magnetic flux density and the electric potential. we conclude that the dimensioning results presented in this paper are of interest.

references
[1] ph. artillan, m. brunet, d. bourrier, j-p. laur, n. mauran, "integrated lc filter on silicon for dc-dc converter applications", ieee transactions on power electronics, vol. 26, no. 8, pp. 2319–2325, 2011.
[2] j. d. van-wyk, f. c. lee, z. liang, r. chen, s. wang, b. lu, "integrating active, passive and emi-filter functions in power electronics systems: a case study of some technologies", ieee transactions on power electronics, vol. 20, no. 3, pp. 523–536, may 2005.
[3] m. ali, e. labouré, f. costa, b. revol, "design of a hybrid integrated emc filter for a dc-dc power converter", ieee transactions on power electronics, vol. 27, no. 11, pp. 4380–4390, nov 2012.
[4] m. rabia, h. azzedine, t. lebey, "modeling and dimensioning of a planar inductor for a monolithic integration", in proceedings of the asia-pacific power and energy engineering conference ieee, 2011, pp. 1–9. [5] m. derkaoui, a. hamid, t. lebey, r. melati, "design and modeling of an integrated micro-transformer in a flyback converter", telkomnika journal, vol.11, no. 4, pp. 669–682, 2013. [6] y. guettaf, a. flitti, a. bensaci, h. kharbouch, m. rizouga, a. hamid, "simulation of the operation of a dc–dc converter containing an inductor of planar type", electrical engineering journal, vol. 100, no. 2, pp. 953–969, june 2018. [7] h. h. nien, t. j. liang, j. f. chen , s. k. changchien, "study of the electrical and magnetic properties of mnzn ferrite by equivalent electrical elements", in proceedings of the second international conference on innovative computing information and control ieee, sept 2007. [8] q. y. yan, r. j. gambino, s. sampath, "plasma-sprayed mnzn ferrites with insulated fine grains and increased resistivity for high-frequency applications", ieee transactions on magnetic, vol. 40, no. 5, pp. 3346–3351, 2004. [9] h. h. nien, j. f. chen, s. k. changchien, h.w. shieh, "implementation of low loss mn-zn ferrite cores for power electronics applications", in proceedings of the ieee power india conference, 2006. [10] r. melati, a. hamid, t. lebey, m. derkaoui, "design of a new electrical model of a ferromagnetic planar inductor for its integration in a micro-converter", mathematical and computer modelling, vol. 40, no. 1-2, pp. 200–227, jan 2013. [11] s.s. mohan, m. del mar hershenson, s.p. boyd, t.h. lee, "simple accurate expressions for planar spiral inductances", ieee journal of solid-state circuits, vol. 34, no. 10, pp. 1419–1424, oct 1999. [12] y. benhadda, a. hamid, t. lebey, "thermal modeling of an integrated circular inductor", journal of nanoand electronic physics, vol. 9, no. 1, pp. 01004 (5pp), 2017. [13] a. bensaci, a. hamid, a. flitti, t. 
lebey, v. bley, f. z. medjaoui, "design of a new electrical model of integrated lc filter in dc–dc converter", journal of low power electronics, vol. 12, no. 1, pp. 34– 44, 2015. [14] i. kowase, t. sato, k. yamasawa, y. miura, "a planar inductor using mn–zn ferrite/polyimide composite thick film for low-voltage and large-current dc–dc converter", ieee transactions on magnetics, vol. 41, no. 10, pp. 3991–3993, oct 2005. [15] b. rejaei, "mixed-potential volume integral-equation approach for circular spiral inductors", ieee transactions on microwave theory and techniques, vol. 52, no. 8, pp. 1820–1829, aug 2004. [16] y.katayama, s. sugahara, h. nakazawa, e. masaharu, "high-power-density mhz switching monolithic dc-dc converter with thin-film inductor", ieee power electronics specialists conference, vol. 3, pp.1485–1490, jun 2000. [17] s. gevorgian, h. berg, h. jacobsson, t. lewin , "basic parameters of coplanar-strip waveguides on multilayer dielectric/semiconductor substrate, part1: high permittivity superstrates", ieee microwave magazine, vol.4, no. 2, pp. 60–70, jun 2003. [18] j. d. van-wyk, f. c. lee, z. liang, r. chen, s. wang, b. lu , " integrating active, passive and emifilter functions in power electronic system: a case study of some technologies", ieee transactions on power electronics, vol. 20, no. 3, pp. 523–536, may 2005. [19] c. j. nassar, c. a. kosik-williams, d. dawson-elli, r. johnbowman, "thin film cmos on glass: the behavior of enhancement mode pmosfets from cutoff through accumulation", ieee transactions on electron devices, vol. 56, no. 9, pp. 1974–1979, sept. 2009. [20] s. gevorgian, h. berg, "line capacitance and impedance of coplanar-strip waveguides on substrates with multiple dielectric layers", in proceedings of the 1st european microwave conference, september 2001, pp. 153–156. 
facta universitatis series: electronics and energetics vol. 36, no 2, june 2023, pp. 227-238 https://doi.org/10.2298/fuee2302227a © 2023 by university of niš, serbia | creative commons license: cc by-nc-nd original scientific paper

savonius micro wind turbine: a theoretical analysis *

elson avallone1, paulo henrique palota1, paulo césar mioralli1, pablo sampaio gomes natividade1, jonas rafael antonio1, josé ferreira da costa1, sílvio aparecido verdério junior2
1federal institute of education, science and technology of sao paulo, catanduva-sp, brazil
2federal institute of education, science and technology of sao paulo, araraquara-sp, brazil

abstract. currently, the energy sector is the main source of carbon dioxide emissions into the atmosphere. therefore, to reverse this scenario, it is necessary to expand the use of renewable energy sources, such as wind energy. the search for improved efficiency in wind turbines that work with low-speed winds makes the savonius turbine an advantageous option because of its low construction cost. this study aims to theoretically analyze a single model of vertical axis wind micro turbine using artificial wind. the wind power for 2 stages in this project was 0.063 w, as the power variation in relation to rotation is not linear. another important factor to consider is that the overlap ratio of 30% contributes to a power reduction. using the mathematical models, some results were theoretically analyzed for the savonius turbine with central axis. the literature indicates that the most efficient turbine is a two-stage turbine with helical blades and without a central axis.

key words: wind energy, mini wind turbines, savonius

1.
introduction

this work is an extended version of the paper presented in [1], where the operating principles of a low-cost anemometer were presented. the world energy sector is the main contributor to the increase of carbon dioxide in the earth's atmosphere; in 2007, it accounted for 25% of the total greenhouse gas emissions, due to the burning of coal, natural gas and oil. thus, in order to favor economic growth, it is necessary to invest in alternative energy sources aimed at sustainable development, such as renewable energy [2]. renewable energies are those that are replenished naturally, that is, they are inexhaustible sources such as solar, tidal, geothermal and wind energy.

received september 16, 2022; revised october 25, 2022 and november 09, 2022; accepted november 25, 2022
corresponding author: elson avallone, federal institute of education, science and technology of sao paulo, catanduva-sp, brazil (e-mail: elson.avallone@ifsp.edu.br)
* an earlier version of this paper was presented at the 7th virtual international conference on science, technology and management in energy, energetics 2021, december 16-17, belgrade, serbia [1].

228 e. avallone, p. h. palota, p. c. mioralli, et al.

according to [3], the production in the european union grew by 5.1% per year between 2007 and 2017, with wind being the second most produced energy source. wind energy conversion transforms the movement of air into useful energy, converting mechanical energy into electricity, typically by means of vertical and horizontal axis wind turbines [4]. this energy has been used for many centuries, for example in ocean navigation, grain milling and water pumping [4]. persian windmills were used to transform wind energy into mechanical energy and were basically assembled on vertical axes [5]. the first applications in europe took place in the netherlands, with the same function of grinding grain, later spreading to the rest of the european continent, to countries such as france, germany, belgium and denmark.
however, it was in holland that windmills took on the function of pumping water, with the change to the horizontal axis [5]. in comparison with europe, the exploitation of wind energy in brazil began in the 1990s. this development took place through a mapping of the country's wind potential using sensors and special computers, which identified the first locations, such as the states of ceará and pernambuco in northeastern brazil [6], [7] and [8]. the heating up of the brazilian market only occurred in 2004, with the creation of the incentive program for alternative sources of electric energy (proinfa), which encouraged wind farm projects. even with this incentive, the real growth occurred between 2009 and 2011, with the reduction of wind turbine prices and easier connection to the electricity grid [9]. brazil is among the 10 countries that most exploit wind energy, ranking sixth with 3% of the world's installed capacity [10]. fig. 1 presents the growth of the installed capacity of wind energy in brazil and its participation in the national energy generation until the year 2020.

fig. 1 wind energy growth in brazil [9]

1.1. types of wind turbines

wind turbines are divided into horizontal axis wind turbines (hawts) and vertical axis wind turbines (vawts) [11], [12] and [13]. the hawts are more widely used nowadays because they are more efficient than vertical axis turbines. however, the vawts have proven to be viable options due to their low production cost, independence from wind direction and wide applicability. rotary axis independence makes vawts work independently of wind direction [14]. the two main models of vertical axis turbines are savonius and darrieus [15]. in the 1930s, the darrieus turbine was developed, operating on the principle of lift and drag from a wing. its efficiency is similar to that of horizontal turbines, due to the presence of airfoils.
when the moving air hits the blades, fixed to the ends of the deflector plates, a low pressure zone is generated. as the blade is fixed, the force of the wind causes the rotational movement of the set [15]. a variation of this type of turbine is the h type, with straight or helical blades; however, this variation of the darrieus turbine has a torque deficiency and requires a starter motor [15-16]. mechanical systems produce constant and artificial exhaust winds, thus producing a constant rotation in the vawts, which provides a uniform generation of electrical energy [18]. one of the reasons vawts are not that expensive to build is that they do not need a yaw mechanism. this makes them ideal for small-scale applications in remote areas with electricity shortages. their shells do not require a mechanism to change their angle, as they work with any wind direction. vawts are less noisy than hawts, which facilitates their application in urban environments; in addition, their reduced size provides greater safety for wildlife in rural areas [19]. the savonius turbine was developed and patented in 1929 in the united states of america and finland by the finn sigurd j. savonius. later it became one of the most widespread and well-known radial drag turbines in the world [18-19]. the savonius turbine works by the aerodynamic principle of drag; it has no airfoils and is formed by two opposite half channels supported by a vertical axis [15]. its movement is based on the difference in the drag force acting on the concave and convex parts of its shells [19]. for this equipment to have better efficiency, it is necessary to determine the aspect ratio (α1), which is the ratio between the height and the diameter of the blades; the most recommended value is around 4.0 [22]. according to menet [12], the savonius turbine, compared to other wind turbines, has greater resistance to fatigue and mechanical stress.
aerodynamically, the savonius is simpler to design and build, which greatly reduces its cost compared to the airfoil blade designs of other vawts and hawts [19]. experimental studies show that the savonius performs well at low wind speeds and that the two-blade version performs better than the three-blade version, because more drag is wasted in the three-blade versions [23].

1.2. betz's law

betz's law (1926) states that “the maximum energy utilization of a wind turbine is 59.3%, and that even if a mechanical system is ideal, it is still possible to extract at most about 40% of the kinetic energy of the winds” [13] and [24]. according to betz's law, the maximum power coefficient, cp, of a wind turbine is about 59%; despite the efforts of hawt and vawt designers to improve their turbines, it has been difficult to reach the betz limit ([24], page 27).

1.3. objectives

the low production cost, low noise and small size of the savonius turbine make it suitable for urban environments, because of its low noise, and also for rural environments, because of its small size [19]. therefore, seeking to expand this approach, the aim of this work is to theoretically analyze a two-stage savonius micro turbine with central axis for future application in small equipment, in addition to producing a literature review of micro turbines.

2. savonius

the sizing of the wind micro turbines was based on a prototype developed at the federal institute of science and technology of são paulo, catanduva campus, brazil, which was reduced to a 1:10 scale compared with an existing project of the institution. the prototypes, printed on a 3d printer, have two stages, each 90 mm high, as increasing the number of stages increases inertia and reduces dependence on the wind direction for the start of rotation.
on the other hand, an excess of stages would decrease the aspect ratio and the static torques [22-23]. three “end plates”, or deflector plates, with a diameter of 55 mm and a thickness of 2 mm were used. wind tunnel tests using 5 models of savonius turbines found that the "end plates", which form an angle with the shells, provide improvements in efficiency [27]. thus, the greater the number of deflector plates, the greater the number of fins, which detain the air, increasing the total drag force by up to 36% [28]. all turbines used have 2 blades, which increases the rotational speed but also reduces efficiency [15]. two models use conventional (straight) blades, which have an efficiency of approximately 10% if installed in a single stage; however, when the stages are overlapped the efficiency increases to 13%. two other models have helical-shaped blades, with an efficiency close to 18% in a single stage [29], which were not an object of this work. savonius wind turbines are designed with the central axis completely blocking the air passage through the cavity of thickness $e$ (fig. 2). this generates better efficiency, while the turbine without the central shaft produces greater stability [30]. in the theoretical analysis of the savonius rotor (fig. 2), the equations of aerodynamic power ($P_a$) and mechanical torque ($M$) are used [12]. the smallest value of $D_f$ is 10% greater than the value of $D$ [31].
fig. 2 rotor dimensions: $D_f = 0.055$ m, $D = 0.04889$ m, $d = 0.03$ m, $H = 0.186014$ m, $e = 0.009$ m
savonius micro wind turbine: a theoretical analysis 231
2.1. aerodynamic power (pa)
the aerodynamic power ($P_a$) is determined using equation (1), derived from the bernoulli equation, where $C_p$ corresponds to the aerodynamic power coefficient, $\rho$ to the air density, $A_p$ to the projected area of the rotor and $V$ to the air speed [12].
$$P_a = C_p \cdot \frac{1}{2} \cdot \rho \cdot A_p \cdot V^3 \qquad (1)$$
the projected area ($A_p$) is the product of the diameter ($D$) and the height of the rotor ($H$), that is, $A_p = D \cdot H$.
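as a quick numerical check of equation (1), the following minimal python sketch evaluates the aerodynamic power of one stage. the function name `aerodynamic_power` is hypothetical, and the inputs are the values quoted with table 1 of this work ($C_p = 0.043$, $\rho = 1.23$ kg/m³, $A_p = 0.009447$ m², $V = 5$ m/s); it is an illustration under those assumptions, not the authors' calculation tool.

```python
def aerodynamic_power(cp, rho, ap, v):
    """Equation (1): Pa = Cp * 1/2 * rho * Ap * V**3, returned in watts."""
    return cp * 0.5 * rho * ap * v ** 3


# Input data quoted with table 1 (assumed here to apply to a single stage).
pa_one_stage = aerodynamic_power(cp=0.043, rho=1.23, ap=0.009447, v=5.0)

print(f"Pa (one stage) = {pa_one_stage:.3f} W")
```

because the power grows with the cube of the air speed, small errors in $V$ dominate the uncertainty of the result, which is why the speed is passed explicitly rather than fixed inside the function.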
the number of blades affects the aerodynamic performance of the wind turbine, in terms of $\lambda$ and $C_p$, as well as weight, cost, fatigue life and structural dynamics [24], [32] and [33]. the speed coefficient $\lambda$ determines how fast the wind turbine will rotate. this coefficient depends on the specific wind turbine design, in particular on the drag coefficient of the rotor [34, p. 510] and on the number of shells. a high $\lambda$ value can generate mechanical stress, noise and low energy absorption. therefore, it is important that wind turbines are designed to operate in a range of $\lambda$, which relates angular velocity to wind speed, that extracts as much energy as possible from the airflow [35]. in fig. 3 it can be seen that the savonius turbine works best with low $\lambda$. the savonius has a field of application similar to that of dutch mills and multi-blade axial turbines, but the advantage of the savonius is that it requires less structural material.
fig. 3 characteristic curves of $C_p$ as a function of $\lambda$ for wind turbines [20] and [36]
it can be observed in fig. 4 that the savonius turbine has a higher torque coefficient than the other turbines, with the exception of the dutch mill [20].
fig. 4 $C_t$ characteristic curves as a function of $\lambda$ for wind turbines [20] and [36]
the values of the aerodynamic power coefficient ($C_p$) and the torque coefficient ($C_m$) are obtained graphically from fig. 5, also studied by [37]. the value of $\lambda$ is obtained through the relation $\lambda = V_{tang}/V$, where $V_{tang} = \omega \cdot D/2$. the power coefficient has its maximum value when $\lambda \approx 1$.
fig. 5 values of $C_p$ and $C_m$ as a function of $\lambda$ [12-13]
2.2. torsion torque (m)
the torsion torque is defined by equation (2) [12].
$$M = C_m \cdot \frac{1}{4} \cdot \rho \cdot D \cdot A_p \cdot V^2 \qquad (2)$$
the aspect ratio is an important characteristic of rotor efficiency and is defined as $\alpha_1 = H/D$. the best rotor power coefficient is obtained for $\alpha_1 \approx 4.0$ [39].
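the relations of this section can be sketched in a few lines of python. the function names are hypothetical, and the numbers are the rotor dimensions of fig. 2 together with the values of $C_m$, $\rho$, $A_p$, $V$ and $\omega$ listed with table 1; the sketch only evaluates $\lambda = \omega D / (2V)$, equation (2) and $\alpha_1 = H/D$ as written above and makes no claim about how the authors obtained their table.

```python
import math


def speed_coefficient(omega, d_rotor, v):
    """lambda = V_tang / V, with V_tang = omega * D / 2."""
    return (omega * d_rotor / 2.0) / v


def torsion_torque(cm, rho, d_rotor, ap, v):
    """Equation (2): M = Cm * 1/4 * rho * D * Ap * V**2, in N.m."""
    return cm * 0.25 * rho * d_rotor * ap * v ** 2


def aspect_ratio(h_rotor, d_rotor):
    """alpha_1 = H / D (dimensionless)."""
    return h_rotor / d_rotor


# Values taken from fig. 2 and from the data listed with table 1.
D, H = 0.04889, 0.186014        # rotor diameter and height [m]
omega, v_air = 23.47, 5.0       # angular speed [rad/s], air speed [m/s]

lam = speed_coefficient(omega, D, v_air)
torque = torsion_torque(cm=0.378, rho=1.23, d_rotor=D, ap=0.009447, v=v_air)
alpha1 = aspect_ratio(H, D)
rpm = omega * 60.0 / (2.0 * math.pi)   # rad/s converted to rev/min

print(f"lambda = {lam:.3f}, M = {torque:.5f} N.m, "
      f"alpha1 = {alpha1:.2f}, n = {rpm:.1f} rpm")
```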
the overlap ratio¹ is calculated by $\beta = e/d$, where the best efficiency is obtained between 20 and 30% [16], [30], [31] and [37]. the value of $\beta$ used to calculate the rotor was 30%, as recommended by the authors tahani, kothe and fujisawa [16], [30], [31] and [37]. the aerodynamic power coefficient relates the aerodynamic power to the power available in the wind, $C_p = P_a/P_v$ [41]. the generator's theoretical electric current ($I_g$), for a voltage of 12 v, is defined by equation (3), where $\omega$ corresponds to the angular speed [31].
$$I_g = -0.0024 \cdot \omega^2 + 0.4138 \cdot \omega + 7.6 \qquad (3)$$
3. results
the turbines were designed in autodesk inventor software and printed on a 400 x 400 x 400 mm 3d printer (fig. 7 and fig. 8) with polylactic acid (pla) filament, which was chosen for better printing and for leaving fewer material residues and burrs. the stages were built separately and provided with fittings to enable coupling. figs. 6, 7, 8 and 9 were based on the characteristics of fig. 2, with dimensions $D_f = 0.055$ m, $D = 0.04889$ m, $d = 0.03$ m, $H = 0.186014$ m and $e = 0.009$ m. the calculations were performed with equations (1) and (2). the theoretical results for the two-stage savonius rotors applied to this work, obtained through equations (1), (2) and (3), are presented in table 1.
fig. 6 assembly of the structure of the savonius turbine with its two stages, straight blades and shaft, developed by authors
fig. 7 and fig. 8 show the savonius turbine with the central shaft, dismantled and assembled, printed on a 3d printer.
1 the overlap ratio $\beta$ in savonius turbines is the ratio between the overlap of the blades ($e$) and their chord length ($d$). the overlap is beneficial for the wind turbine due to the increase in pressure caused in the concave region of the return blade. however, it also generates a pressure reduction in the concave region of the advance blade [25].
fig. 7 savonius turbine printed in 3d, developed by authors
fig. 8 savonius turbine printed in 3d and assembled, developed by authors
fig. 9 assembly of the structure of the savonius turbine with its two stages, straight blades and no shaft, developed by authors
to generate the results of table 1, the following input data were considered: specific mass $\rho_{air} = 1.23$ kg/m³, dynamic air viscosity $\mu_{air} = 0.000015$ kg/(m·s), projected area $A_p = 0.009447$ m², aspect ratio $\alpha_1 \approx 4.0$ and peripheral velocity $V_p = 5$ m/s.
table 1 theoretical results for the savonius rotor
name               value
𝑅𝑒                 20788.22
𝑉𝑎𝑖𝑟 [m/s]         5
𝑉𝑝𝑒𝑟𝑖𝑝ℎ [m/s]      5
𝜔 [rad/s]          23.47
𝜆                  0.12
𝐶𝑝                 0.043
𝐶𝑚                 0.378
𝑃𝐸 [w]             0.031
𝑃𝐸 (x2) [w]        0.063
𝑀 [n.m]            0.001
𝑛 [rpm]            224.11
𝛽 [%]              30
𝐼𝑔 [a]             0.79
table 1 presents the theoretical results obtained from the equations presented by menet [12]. the wind power for 2 stages in this project was 0.063 w, as the power variation in relation to rotation is not linear. another important factor to consider is that the overlap ratio is very large, which causes a reduction in power, as already noted in the footnote of section 2.2. the other results in table 1 are derived from the dimensions of the designed savonius rotor.
4. conclusion
the present work sought to expand the application of vertical rotor wind turbines driven by artificial winds through the computer-assisted design and analysis of a two-blade savonius micro wind turbine with an axial shaft. using the literature review and the theoretical results, the mechanical torque and the aerodynamic power were obtained and analyzed for the proposed savonius turbine configuration. the analysis carried out in this work is fundamentally theoretical; to determine the viability of these types of turbines, other analyses are needed, such as the wind energy recovered by experimental models. several research fields need deeper study.
a proposal to install a savonius micro turbine printed on a 3d printer and coupled to a micro electric generator may yield satisfactory results in future studies. in this way, it will be possible to compare theoretical results with experimental data through electronic measurements using a datalogger system. another alternative for future work would be to apply the turbine along bus routes to large centers in order to capture wind energy from the passage of buses and transform it into electrical energy. it is also possible to monitor the capture of this energy through an intelligent mobile system, for example the one developed for monitoring microclimatic parameters [42]. the literature review is an important tool in this work. from it, we concluded that the savonius model with greater efficiency is the one with two stages without a central axis. through further experimental analysis, it will also be possible to verify whether the two-blade rotor with no shaft is more efficient than the one with a shaft, as indicated by the current literature review.
acknowledgement: to the federal institute of education, science and technology of são paulo for the constant encouragement.
nomenclature
symbol    name                                      units
𝐴𝑃        rotor projected area                      [m²]
𝐶𝑃        aerodynamic power coefficient             [dimensionless]
𝐶𝑚        torque coefficient                        [dimensionless]
𝐷𝑓        end plate diameter                        [m]
𝐼𝑔        generator current                         [a]
𝑃𝐴        aerodynamic power                         [w]
𝑃𝑉        wind power                                [w]
𝑉𝑎𝑖𝑟      air speed                                 [m/s]
𝑉𝑝𝑒𝑟𝑖𝑓    peripheral speed                          [m/s]
𝑉𝑡𝑎𝑛𝑔     tangential speed                          [m/s]
𝛼1        aspect ratio                              [dimensionless]
𝐷         rotor diameter                            [m]
𝐻         rotor height                              [m]
𝑀         torsion moment                            [n.m]
𝑅𝑒        reynolds number                           [dimensionless]
𝑉         air speed                                 [m/s]
𝑑         diameter of rotor half cylinder           [m]
𝑒         spacing between the two half cylinders    [m]
𝑛         rotation                                  [rpm]
𝛽         overlap ratio                             [dimensionless]
𝜆         speed coefficient                         [dimensionless]
𝜌         air density                               [kg/m³]
𝝎         angular speed                             [rad/s]
references
[1] e. b.
da s cabral et al., “theoretical analysis of vertical micro wind turbines”, in proceedings of the 7th virtual international conference on science technology and management in energy, niš, serbia, 2022, pp. 407-411. [2] h. b. o. arruda, "mapeamento das emissões de gases de efeito estufa em uma empresa do setor energético", conexoes, vol. 12, no. 3, p. 108, 2018. [3] eurostat, "renewable energy on the rise", 2019. https://ec.europa.eu/eurostat/web/products-eurostat-news//ddn-20220126-1 (accesed 31 of august 2022). [4] u. lisboa, energia eólica: kit científico de energias renováveis, vol. 1, 1 vols. lisboa-portugal: museu de ciências da universidade de lisboa, 2009. savonius micro wind turbine: a theoretical analysis 237 [5] r. f. m. santos, "detecção de mudança da característica de produção de parques eólicos", master thesis, porto, porto portugal, 2008. [online]. available at: https://repositorio-aberto.up.pt/bitstream/10216/ 58768/2/texto%20integral.pdf [6] g. de a. nunes and a. a. magalhães, "energia eólica no brasil: uma alternativa inteligente frente às demandas elétricas atuais", bolsista de valor: revista de divulgação do projeto universidade petrobras e if fluminense, vol. 1, p. 163-167, 2010. [7] b. marcele medeiros monteiro and v. s. de q. varella, "fontes de energia renováveis", rio de janeiro. [online]. available at: http://www.solar.coppe.ufrj.br/eolica/eol_txt.htm [8] o. c. amarante, j. zack and m. brower, "atlas do potencial eólico brasileiro”, brazil, 2021. [online]. available at: http://www.cresesb.cepel.br/publicacoes/download/atlas_eolico/atlas%20do%20potencial%20eolico%20brasileiro. pdf [9] m. a. b. mroz, atlas eólico do estado de são paulo, vol. 1, 1 vols. são paulo: governo do estado de são paulosecretaria da energia, 2012. [online]. available at: https://dadosenergeticos.energia.sp.gov.br/portalcev2/intranet/ bibliovirtual/renovaveis/atlas_eolico.pdf [10] gwec, "global wind report”, global wind energy council, brussels, belgium, 2019. [online]. 
available at: https://gwec.net/wp-content/uploads/2020/08/annual-wind-report_2019_digital_final_2r.pdf [11] c. d. ôlo, "projecto de uma turbina savonius com utilização de componentes em fim-de-vida”, master thesis, faculdade de ciência e tecnologia universidade nova de lisboa, lisboa-portual, 2012. [online]. available at: https://run.unl.pt/bitstream/10362/8876/1/olo_2012.pdf [12] j.-l. menet, "a double-step savonius rotor for local production of electricity: a design study", renew. energy, vol. 29, no. 11, pp. 1843-1862, sept. 2004. [13] a. m. biadgo, a. simonovic, d. komarov and s. stupar, "numerical and analytical investigation of vertical axis wind turbine", fme, vol. 41, no. 1, pp. 49-58, 2013. [14] d. m. prabowoputra, a. r. prabowo, a. bahatmaka and s. hadi, "analytical review of material criteria as supporting factors in horizontal axis wind turbines: effect to structural responses", procedia struct. integ., vol. 27, pp. 155-162, 2020. [15] m. t. tolmasquim, energia renovável: hidráulica, biomassa, eólica, solar, oceânic, 1o ed, vol. 1, 1 vols. rio de janeiro: empresa de pesquisa energética (epe), 2016. [online]. available at: https://www.epe.gov.br/sites-pt/publicacoes-dados-abertos/publicacoes/publicacoesarquivos/publicacao172/energia%20renov%c3%a1vel%20-%20online%2016maio2016.pdf [16] m. tahani, a. rabbani, a. kasaeian, m. mehrpooya and m. mirhosseini, "design and numerical investigation of savonius wind turbine with discharge flow directing capability", energy, vol. 130, pp. 327-338, july 2017. [17] s. b. garcia, g. c. da s. simioni and j. a. v. alé, "aspectos de desenvolvimento da turbina eólica de eixo vertical", in proceedings of the conem 2016 iv congresso nacional de engenharia mecânica, recife pe brazil, 2006, vol. 1. [online]. [18] a. fazlizan, w. chong, s. yip, w. hew and s. poh, "design and experimental analysis of an exhaust air energy recovery wind turbine generator", energies, vol. 8, no. 7, pp. 6566-6584, june 2015. [19] e. 
aymane, "savonius vertical wind turbine: design, simulation and phisical testing", specialization monograph, al akhawayn university, marroc, 2017. [online]. available at: http://www.aui.ma/ssecapstone-repository/pdf/spring-2017/savonius%20vertical%20wind%20turbine%20%20design%20simulation%20and%20physical%20testing.pdf [20] j. v. akwa, "análise aerodinâmica de turbinas eólicas savonius empregando dinâmica dos fluidos computacional", master thesis, universidade federal do rio grande do sul, porto alegre, rs, 2010. [online]. available at: https://lume.ufrgs.br/bitstream/handle/10183/26532/000756688.pdf?sequence=1&isallowed=y [21] s. j. savonius, "wind rotor us1766765a", us1766765a [online]. available at: https://patentimages.storage. googleapis.com/b4/a5/b0/c9503e83cfe1c0/us1766765.pdf [22] u. k. saha, s. thotla and d. maity, "optimum design configuration of savonius rotor through wind tunnel experiments", j. wind eng. ind. aerodyn., vol. 96, no. 8-9, pp. 1359-1375, aug. 2008. [23] m. h. ali, "experimental comparison study for savonius wind turbine of two & three blades at low wind speed", int. j. modern eng. res., vol. 3, no. 5, pp. 2978-2986, oct. 2013. [24] h. kana, aerodynamics of wind turbines, university of london, london, uk, 2011. [25] l. s. bianchin, d. beck and d. j. seidel, "influência do número de estágios no torque estático da turbina eólica savonius", revista thema, vol. 17, no. 2, pp. 309-317, june 2020. [26] f. wenehenubun, a. saputra and h. sutanto, "an experimental study on the performance of savonius wind turbines related with the number of blades", energy procedia, vol. 68, pp. 297-304, apr. 2015. [27] t. ogawa and h. yoshida, "the effects of a deflecting plate and rotor end plates on performances of savonius-type wind turbine", bulletin of jsme, vol. 29, no. 253, pp. 2115-2121, 1986. 238 e. avallone, p. h. palota, p. c. mioralli, et al. [28] i. s. utomo, d. d. d. p. tjahjana and s. 
hadi, "experimental studies of savonius wind turbines with variations sizes and fin numbers towards performance", in proceedings of the 1st international conference and exhibition on powder technology indonesia (icepti), jatinangor, indonesia, 2018, p. 030041. [29] e. s. caser and g. da m. paiva, "projeto aerodinâmico de uma turbina eólica de eivo vertical (teev) para ambientes urbanos", degree project, universidade federal do espírito santo, vitória-es, 2016. [online]. available at: https://mecanica.ufes.br/sites/engenhariamecanica.ufes.br/files/field/anexo/2._pg_final__eduardo_caser_giuseppe_paiva.pdf [30] l. b. kothe, "estudo comparativo experimental e numérico sobre o desempenho de turbinas savonius helicoidal e de duplo-estágio", master thesis, universidade federal do rio grande do sul, porto alegre, rs, 2016. [online]. available at: https://lume.ufrgs.br/bitstream/handle/10183/141901/000993090.pdf?sequence=1&isallowed=y [31] n. fujisawa, "on the torque mechanism of savonius rotors", j. wind eng. ind. aerodyn., vol. 40, no. 3, pp. 277-292, 1992. [32] m. zemamou, m. aggour and a. toumi, "review of savonius wind turbine design and performance", energy procedia, vol. 141, pp. 383-388, dec. 2017. [33] k. r. abdelaziz, m. a. a. nawar, a. ramadan, y. a. attai and m. h. mohamed, "performance improvement of a savonius turbine by using auxiliary blades", energy, vol. 244, pp. 122575, apr. 2022. [34] b. munson, d. f. young, t. h. okiishi and w. w. huebsch, fundamental of fluid mechanics. estados unidos da américa: john wiley & sons, 2009. [35] k. horikiri, "aerodynamics of wind turbines”, ph.d. thesis, queen mary, london, uk, 2011. [online]. available at: http://qmro.qmul.ac.uk/jspui/handle/123456789/1881 [36] j. v. akwa, h. a. vielmo and a. p. petry, "a review on the performance of savonius wind turbines”, renew. sust. energy rev., vol. 16, no. 5, pp. 3054-3064, june 2012. [37] l. b. kothe, s. v. möller and a. p. 
petry, "numerical and experimental study of a helical savonius wind turbine and a comparison with a two-stage savonius turbine", renew. energy, vol. 148, pp. 627-638, apr. 2020. [38] j.-l. menet and a. leiper, "prévision des performances aérodynamiques d’un nouveau type d’éolienne à axe vertical dérivée du rotor savonius", in proceedings of the 17ème congrès français de mécanique, france, sept. 2005, vol. 1, pp. 1-6. [39] i. ushiyama and h. nagai, "optimum design configurations and performance of savonius rotors", wind eng., vol. 12, no. 1, pp. 59-75, 1988. [40] b. g. newman, "measurement on a savonius rotor with variable gap", in wind energy, achievements and potential, symposium sherbrook, canada, 1974, pp. 115-136. [41] j. a. schetz and a. e. fuhs, handbook of fluid dynamics and fluid machinery: fundamentals of fluid dynamics. hoboken, nj, usa: john wiley & sons, inc., 1996. [42] d. danković and m. djordjević, "a review of real time smart systems developed at university of nis", fu: elec. energ., vol. 33, no. 4, pp. 669-686, 2020. instruction facta universitatis series: electronics and energetics vol. 29, n o 2, june 2016, pp. 159 175 doi: 10.2298/fuee1602159a characterization of nonlinear loads in power distribution grid  miona andrejević stošović 1 , marko dimitrijević 1 , slobodan bojanić 2 , octavio nieto-taladriz 2 , vančo litovski 1 1 university of niš, faculty of electronic engineering, niš, serbia 2 escuela técnica superior de ingenieros de telecomunicación, universidad politecnica de madrid, madrid, spain abstract. electronic devices are complex circuits, consisting of analog, switching, and digital subsystems that require direct current (dc) for polarization. since they are connected to the mains delivering alternating current (ac), however, ac-to-dc converters are to be introduced between the mains and the electronics to be fed. 
a converter is an electric circuit containing several subsystems, the most important being the switch-mode power supply, which draws power from the mains in pulses and is therefore highly nonlinear. that happens, in reduced amplitude, even when the electronics to be fed is switched off. the process of ac-to-dc conversion is not restricted to feeding electronic equipment only. it is more and more frequently encountered in modern smart-grid facilities, which gives rise to the importance of the studies referred to hereafter. the converter can be studied (theoretically or by measurements) as a two-port network with reactive and nonlinear port impedances. characterization is performed after determining the port electrical quantities, which are voltages and currents. based on these data, power and power quality parameters (power factor and total harmonic distortion) may be extracted. when nonlinear loads are present, one should introduce new ways of thinking into the considerations due to the existence of harmonics and related power components. in that way the power factor can be generalized to the total or true power factor, where the apparent power involved in its calculation includes all harmonic components. after introducing a wide range of definitions used in contemporary literature, we describe our measurement set-up as both a hardware and a software solution. the results reported unequivocally confirm the importance of characterizing small nonlinear loads connected to the grid, bearing in mind that their number is rising with no saturation in sight in the near or even far future.
key words: smart grid, nonlinear loads, load characterization, power factor, harmonic distortions
received september 29, 2015
corresponding author: miona andrejević stošović, university of niš, faculty of electronic engineering, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: miona.andrejevic@elfak.ni.ac.rs)
160 m.andrejević-stošović, m. dimitrijević, s. bojanić, o. nieto-taladriz, v. litovski
1.
introduction
with the advent of modern diversified sources of electrical energy, the issue of power quality becomes both more ambiguous and more complicated. we will first address the new aspects that are coming to the fore thanks to the new ways of producing electrical energy, which are becoming more and more popular, and thanks to the emergence of a new paradigm known as the smart grid, which relies on the mutual interaction of electrical power systems and electronic systems for its proper functionality [1]. nowadays we are witnessing changes in energy demand and use, which in fact means “new” load characteristics and trends that change the nature of the aggregate utility consumption. all of that is mostly due to the electronic devices that have become ubiquitous. it is presumed that the overall household consumption of electronic appliances will rise at a rate of 6% per year, reaching 29% of the total household consumption in the year 2030. at the same time, household consumption is expected to reach 40% of the overall electricity demand. the immense rise of office consumption due to the enormous number of computers in use is also to be added. the same holds for educational, administrative, health, transport, and other public services. one may get the picture by multiplying the average consumption of a desktop computer (about 120 w) by the average number of hours per day when the computer is on (about 7) and by the number of computers (billion(s)?). electronic loads are strongly related to power quality because of the ac/dc converters they use, which in general draw current from the grid in bursts. the current-voltage relationship of these loads, seen from the grid side, is nonlinear, hence the name nonlinear loads. in fact, while keeping the voltage waveform almost sinusoidal, they inject pulses into the current, chopping it into a seemingly arbitrary waveform and, consequently, producing harmonic distortions.
having all this in mind, the means for characterizing a load from the nonlinearity point of view become one of the indispensable tools for quality evaluation of the smart grid. the problem is further complicated when different power generation technologies and resources are combined. new subsystems in power production, transport, and consumption emerge, named micro-grids, and the overall system is supposed to become a smart grid. for example, due to the rising number of different kinds of electricity sources, even the frequency of the grid voltage may be considered as “unknown”, which calls for algorithms and software implemented in real time to extract the frequency value [2] and, based on that, to compute the amplitudes of the harmonics [3, 4, 5]. due to the nonlinearities, however, measurement of power factor and distortion usually requires dedicated equipment. for example, a classical ammeter will return incorrect results when used to measure the ac current drawn by a nonlinear load and then to calculate the power factor. a true rms multimeter must be used to measure the actual rms currents and voltages and the apparent power. to measure the real power or the reactive power, a wattmeter designed to work properly with non-sinusoidal currents must also be used. contemporary methods and algorithms for spectrum analysis are presented in this paper. the basic definitions of parameters describing nonlinear loads are introduced. alternative definitions of reactive power and their calculation methods are also elaborated. in our previous research we first developed a tool for efficient measurements that allows proper and complete characterization of nonlinear loads [6, 7].
namely, we found that the tools for characterization of modern loads available on the market most frequently lack at least one of the following properties: low price, ability to implement complex data processing algorithms (versatility), ability to store and statistically analyze the measured data, and ability to communicate with the environment no matter how distant it is. all these were achieved by the system reported in [6, 7], and the measurement results demonstrated here were obtained with these tools. next, we applied these tools to the characterization of small loads. the results obtained, as reported in [8] and [9] for example, were in some cases surprisingly different from what was expected. that holds for the power components other than the active power and for the abundance of harmonics. in [10] and [11] we demonstrated that, based on the mains current and with proper data processing, one may deduce the activities within a computer despite the complex signal transformation between the mains and the components of the computer via the power supply chain. moreover, one may recognize the software running within the computer. such information is distributed via the grid. here we will for the first time summarize the theoretical background of all computations that need to be performed for complete characterization of small loads. then, we will demonstrate our new results in the application of the theory and the measurement tools to a set of nonlinear loads. the definitions used in modern characterization of the mains current, voltage, and power, which are implemented by our system, will be listed in the second section, so that the main attention can be devoted to the set of measured results and their analysis, which will be given next. the paper is organized as follows. first, a short description of the measurement experiment will be given.
to preserve conciseness, we will mainly refer to our previous work for this purpose.
2. parameter definitions
although power quality is a relatively ambiguous concept, limited mostly to conversations among utility engineers and physicists, it may become a residential issue as well, as electronic appliances take over the home.
2.1. linear loads with sinusoidal stimuli
a sinusoidal voltage source
$$v(t) = \sqrt{2}\, V_{\mathrm{rms}} \sin(\omega t) \qquad (1)$$
supplying a linear load will produce a sinusoidal current
$$i(t) = \sqrt{2}\, I_{\mathrm{rms}} \sin(\omega t - \varphi) \qquad (2)$$
where $V_{\mathrm{rms}}$ is the rms value of the voltage, $I_{\mathrm{rms}}$ is the rms value of the current, $\omega$ is the angular frequency, $\varphi$ is the phase angle and $t$ is the time. the instantaneous power is
$$p(t) = v(t) \cdot i(t) \qquad (3)$$
and it can be represented as
$$p(t) = 2 V_{\mathrm{rms}} I_{\mathrm{rms}} \sin(\omega t) \sin(\omega t - \varphi) = p_p + p_q. \qquad (4)$$
using trigonometric transformations, we can write:
$$p_p = V_{\mathrm{rms}} I_{\mathrm{rms}} \cos\varphi \,(1 - \cos(2\omega t)) = P\,(1 - \cos(2\omega t)) \qquad (5)$$
and
$$p_q = -V_{\mathrm{rms}} I_{\mathrm{rms}} \sin\varphi \,\sin(2\omega t) = -Q \sin(2\omega t) \qquad (6)$$
where
$$P = V_{\mathrm{rms}} I_{\mathrm{rms}} \cos\varphi, \qquad Q = V_{\mathrm{rms}} I_{\mathrm{rms}} \sin\varphi \qquad (7)$$
represent the real ($P$) and reactive ($Q$) power. it can be easily shown that the real power is the average of the instantaneous power over a cycle:
$$P = \frac{1}{T} \int_{t_0}^{t_0+T} v(t)\, i(t)\, dt \qquad (8)$$
where $t_0$ is an arbitrary time (constant) after equilibrium, and $T$ is the period (20 ms in the european and 1/60 s in the american system, respectively). the reactive power $Q$ is the amplitude of the oscillating instantaneous power $p_q$. the apparent power is the product of the root mean square value of the current and the root mean square value of the voltage:
$$S = V_{\mathrm{rms}} I_{\mathrm{rms}} \qquad (9)$$
or:
$$S = \sqrt{P^2 + Q^2}. \qquad (10)$$
the power factor is simply defined as the ratio of real power to apparent power [12, 3]:
$$TPF = P / S. \qquad (11)$$
for the pure sinusoidal case, using (7), (10) and (11), we can calculate:
$$TPF = \cos\varphi. \qquad (12)$$
2.2.
nonlinear loads
when there is a nonlinear load in the system, it operates under non-sinusoidal conditions, and well-known parameters such as the power factor, defined as the cosine of the phase difference, do not describe the system properly. in that case, traditional power system quantities such as effective value, power (active, reactive, apparent), and power factor need to be numerically calculated from sampled voltage and current sequences by performing the dft, the fft or the goertzel algorithm [3]. the rms value of a periodic physical quantity $x$ (voltage or current) is calculated according to the well-known formula [13, 14]:
$$X_{\mathrm{rms}} = \sqrt{\frac{1}{T} \int_{t_0}^{t_0+T} \left(x(t)\right)^2 dt} \qquad (13)$$
where $x(t)$ represents the time evolution, $T$ is the period and $t_0$ is an arbitrary time. any periodic physical quantity $x(t)$ has a fourier representation:
$$x(t) = a_0 + \sum_{k=1}^{\infty} \left(a_k \cos(k\omega t) + b_k \sin(k\omega t)\right) \qquad (14)$$
or
$$x(t) = c_0 + \sum_{k=1}^{\infty} c_k \cos(k\omega t - \theta_k) \qquad (15)$$
where $c_0 = a_0$ represents the dc component, $c_k = \sqrt{a_k^2 + b_k^2}$ the magnitude of the $k$-th harmonic, $\theta_k = \arctan(b_k/a_k)$ the phase of the $k$-th harmonic and $\omega = 2\pi/T$ the angular frequency. the fourier coefficients $a_k$, $b_k$ are:
$$a_0 = \frac{1}{T} \int_{-T/2}^{T/2} x(t)\, dt, \qquad a_k = \frac{2}{T} \int_{-T/2}^{T/2} x(t) \cos\!\left(\frac{2\pi k t}{T}\right) dt \qquad (16)$$
and
$$b_k = \frac{2}{T} \int_{-T/2}^{T/2} x(t) \sin\!\left(\frac{2\pi k t}{T}\right) dt. \qquad (17)$$
the rms value of the $k$-th harmonic is
$$X_{k,\mathrm{rms}} = c_k / \sqrt{2}. \qquad (18)$$
we can calculate the total rms value
$$X_{\mathrm{rms}} = \sqrt{\sum_{k=1}^{M} X_{k,\mathrm{rms}}^2} = \sqrt{X_{1,\mathrm{rms}}^2 + X_{H,\mathrm{rms}}^2} \qquad (19)$$
where $M$ is the highest order harmonic taken into calculation. index “1” denotes the first or fundamental harmonic, and index “H” denotes the contribution of the higher harmonics. equations (13) to (19) need to be rewritten for voltage and current. in practice, we operate with sampled values, and the integrals (16) and (17) are transformed into finite sums.
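a minimal python sketch of that last step is given below: the integrals (16) and (17) are replaced by finite sums over one period of uniformly spaced samples, and (18) and (19) are then applied. the function names and the synthetic test signal (a fundamental of amplitude 10 plus a third harmonic of amplitude 2) are assumptions made for illustration only, and the direct sums stand in for the dft, fft or goertzel implementations mentioned in the text.

```python
import math


def fourier_coefficients(samples, k):
    """Finite-sum versions of (16) and (17) for harmonic k >= 1.

    `samples` must cover exactly one period with uniform spacing.
    """
    n_samples = len(samples)
    a_k = 2.0 / n_samples * sum(
        x * math.cos(2.0 * math.pi * k * n / n_samples)
        for n, x in enumerate(samples))
    b_k = 2.0 / n_samples * sum(
        x * math.sin(2.0 * math.pi * k * n / n_samples)
        for n, x in enumerate(samples))
    return a_k, b_k


def harmonic_rms(samples, k):
    """Equation (18): rms of the k-th harmonic, c_k / sqrt(2)."""
    a_k, b_k = fourier_coefficients(samples, k)
    return math.hypot(a_k, b_k) / math.sqrt(2.0)


def total_rms(samples, m):
    """Equation (19): total rms from harmonics 1..m (dc component excluded)."""
    return math.sqrt(sum(harmonic_rms(samples, k) ** 2 for k in range(1, m + 1)))


# Synthetic one-period test signal (hypothetical values).
N = 256
signal = [10.0 * math.sin(2.0 * math.pi * n / N)
          + 2.0 * math.sin(2.0 * math.pi * 3 * n / N + 0.5)
          for n in range(N)]

x1_rms = harmonic_rms(signal, 1)
x3_rms = harmonic_rms(signal, 3)
x_rms = total_rms(signal, m=5)
# Sampled counterpart of (13), for comparison with (19).
x_rms_direct = math.sqrt(sum(x * x for x in signal) / N)

print(x1_rms, x3_rms, x_rms, x_rms_direct)
```

for this signal the two routes to the total rms value agree because the signal has no dc component and no harmonics above the fifth; with real measurements the choice of $M$ and the alignment of the sampling window with the period both affect the result.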
for a single-phase system, where $k$ is the harmonic number, $\varphi_k$ the phase difference between voltage and current of the $k$-th harmonic and $M$ the highest harmonic, the total active power is given by:
$$P = \sum_{k=1}^{M} I_{k,\mathrm{rms}} V_{k,\mathrm{rms}} \cos\varphi_k = P_1 + P_H. \qquad (20)$$
the first addend in the sum (20), denoted by $P_1$, is the fundamental active power. the rest of the sum, denoted by $P_H$, is the harmonic active power [13]. in the literature, there exist a number of definitions of reactive power for non-sinusoidal conditions that serve to characterize nonlinear loads and measure the degree of the loads' nonlinearity [14]. as a more general term, the non-active power $N$ was introduced. each definition has some advantages over the others but, although there is a tendency to generalize, there is no generally accepted definition. the most common definition of reactive power is budeanu's definition [15], given by the following expression for a single-phase circuit:
$$Q_B = \sum_{k=1}^{M} I_{k,\mathrm{rms}} V_{k,\mathrm{rms}} \sin\varphi_k. \qquad (21)$$
budeanu proposed that apparent power consists of two orthogonal components, active power (20) and non-active power, which is divided into reactive power (21) and distortion power:
$$D = \sqrt{U^2 - P^2 - Q_B^2}. \qquad (22)$$
it should be noted that the actual contribution of harmonic frequencies to active and reactive power is small (usually less than 3% of the total active or reactive power). the major contribution of higher harmonics to the power comes as distortion power. the apparent power, for non-sinusoidal conditions conventionally denoted as $U$, can be written:
$$U^2 = I_{\mathrm{rms}}^2 V_{\mathrm{rms}}^2 = I_{1,\mathrm{rms}}^2 V_{1,\mathrm{rms}}^2 + V_{1,\mathrm{rms}}^2 I_{H,\mathrm{rms}}^2 + V_{H,\mathrm{rms}}^2 I_{1,\mathrm{rms}}^2 + V_{H,\mathrm{rms}}^2 I_{H,\mathrm{rms}}^2 = S_1^2 + D_I^2 + D_V^2 + S_H^2 \qquad (23)$$
where $S_1$ represents the fundamental apparent power, $D_V$ the voltage distortion power, $D_I$ the current distortion power and $S_H$ the harmonic apparent power.
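the quantities (20) to (23) can be evaluated directly once the rms values and phase differences of the harmonics are known. the following python sketch does this for a hypothetical three-harmonic load; the function name `power_components` and the numerical values are assumptions chosen for illustration, not measured data from this work.

```python
import math


def power_components(v_rms, i_rms, phi):
    """Evaluate (20)-(23) from per-harmonic rms values and phase differences.

    Index 0 of each list is the fundamental; the remaining entries are the
    higher harmonics up to order M.
    """
    p = sum(v * i * math.cos(f) for v, i, f in zip(v_rms, i_rms, phi))    # (20)
    q_b = sum(v * i * math.sin(f) for v, i, f in zip(v_rms, i_rms, phi))  # (21)

    v1, i1 = v_rms[0], i_rms[0]
    v_h = math.sqrt(sum(v * v for v in v_rms[1:]))   # rms of higher harmonics
    i_h = math.sqrt(sum(i * i for i in i_rms[1:]))

    u = math.hypot(v1, v_h) * math.hypot(i1, i_h)    # U = Vrms * Irms
    d = math.sqrt(max(u * u - p * p - q_b * q_b, 0.0))                    # (22)

    return {
        "P": p, "Q_B": q_b, "U": u, "D": d,
        # Terms on the right-hand side of (23).
        "S_1": v1 * i1, "D_I": v1 * i_h, "D_V": v_h * i1, "S_H": v_h * i_h,
    }


# Hypothetical load: fundamental plus two higher harmonics.
result = power_components(
    v_rms=[230.0, 4.0, 2.0],     # volts
    i_rms=[1.0, 0.7, 0.4],       # amperes
    phi=[0.2, 1.1, -0.6],        # radians
)

for name, value in result.items():
    print(f"{name:>4} = {value:9.3f}")
```

the `max(..., 0.0)` guard only protects the square root against small negative values caused by floating-point rounding; mathematically the argument of (22) is never negative.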
$S_1$ and $S_H$ are

$$S_1^2 = P_1^2 + Q_1^2, \qquad S_H^2 = P_H^2 + Q_H^2 + D_H^2 \tag{24}$$

where $D_H$ represents the harmonic distortion power. The total apparent power, denoted U, is

$$U = \sqrt{P^2 + Q^2 + D^2} = I_{\mathrm{rms}} V_{\mathrm{rms}}. \tag{25}$$

We can also define the non-active power N by the equation

$$N = \sqrt{Q^2 + D^2} \tag{26}$$

and the phasor power S, defined in the same way as apparent power for sinusoidal conditions (10). For sinusoidal conditions, apparent power and phasor power are equal, and (25) reduces to (10).

The total harmonic distortions, THD, are calculated from the following formulas [12, 13]:

$$THD_I = \frac{I_{H,\mathrm{rms}}}{I_{1,\mathrm{rms}}} = \frac{\sqrt{\sum_{j=2}^{M} I_{j,\mathrm{rms}}^2}}{I_{1,\mathrm{rms}}} = \sqrt{\frac{I_{\mathrm{rms}}^2}{I_{1,\mathrm{rms}}^2} - 1} \tag{27}$$

and

$$THD_V = \frac{V_{H,\mathrm{rms}}}{V_{1,\mathrm{rms}}} = \frac{\sqrt{\sum_{k=2}^{M} V_{k,\mathrm{rms}}^2}}{V_{1,\mathrm{rms}}} = \sqrt{\frac{V_{\mathrm{rms}}^2}{V_{1,\mathrm{rms}}^2} - 1} \tag{28}$$

where $I_j$, $V_k$, j, k = 1, 2, …, M stand for the harmonics of the current and voltage. It can be shown that:

$$D_I = V_{1,\mathrm{rms}} I_{H,\mathrm{rms}} = S_1\,THD_I, \qquad D_V = V_{H,\mathrm{rms}} I_{1,\mathrm{rms}} = S_1\,THD_V, \qquad S_H = V_{H,\mathrm{rms}} I_{H,\mathrm{rms}} = S_1\,THD_I\,THD_V. \tag{29}$$

The fundamental power factor, or displacement power factor, is given by the following formula:

$$PF_1 = \cos\varphi_1 = \frac{P_1}{S_1}. \tag{30}$$

The total power factor TPF [12, 13], defined by equation (12), taking into calculation (11) and (23), is

$$TPF = \frac{P}{U} = \frac{P_1 + P_H}{\sqrt{S_1^2 + D_I^2 + D_V^2 + S_H^2}} \tag{31}$$

and, substituting (29) and (30):

$$TPF = \frac{\left(1 + \dfrac{P_H}{P_1}\right)\cos\varphi_1}{\sqrt{1 + THD_I^2 + THD_V^2 + (THD_I\,THD_V)^2}}. \tag{32}$$

The total power factor can be represented as the product of the distortion power factor DPF and the displacement power factor $PF_1$, i.e. $\cos\varphi_1$:

$$TPF = DPF \cdot \cos\varphi_1. \tag{33}$$

Therefore, the distortion power factor is [12, 13]

$$DPF = \frac{1 + \dfrac{P_H}{P_1}}{\sqrt{1 + THD_I^2 + THD_V^2 + (THD_I\,THD_V)^2}}. \tag{34}$$

In real circuits, $P_H \ll P_1$ and the voltage is almost sinusoidal ($THD_V < 5\%$), leading to a simpler equation for TPF [12, 13]:

$$TPF \approx \frac{\cos\varphi_1}{\sqrt{1 + THD_I^2}}. \tag{35}$$

2.3.
Other definitions of reactive power

Budeanu's definition

The most common definition of reactive power is Budeanu's definition [16], given by the following expression for a single-phase circuit, as mentioned earlier in the text:

$$Q_B = \sum_{k=1}^{M} I_{k,\mathrm{rms}} V_{k,\mathrm{rms}} \sin\varphi_k. \tag{36}$$

Budeanu proposed that apparent power consists of two orthogonal components, active power and non-active power, the latter being divided into reactive power (36) and distortion power:

$$D_B = \sqrt{U^2 - P^2 - Q_B^2}. \tag{37}$$

IEEE Std 1459-2010 proposes that reactive power be calculated as:

$$Q_{\mathrm{IEEE}} = \sqrt{\sum_{k=1}^{M} I_{k,\mathrm{rms}}^2 V_{k,\mathrm{rms}}^2 \sin^2\varphi_k}. \tag{38}$$

Equation (38) eliminates the situation where the value of the total reactive power Q is less than the value of the fundamental component.

Kimbark's definition

Similar to Budeanu's definition, Kimbark [17] proposed that apparent power consists of two orthogonal components: active power, defined as the average power, and non-active power. The non-active power is separated into two components, reactive and distortion power. The first is calculated by the equation

$$Q_K = I_{1,\mathrm{rms}} V_{1,\mathrm{rms}} \sin\varphi_1. \tag{39}$$

It depends only on the fundamental harmonic. The distortion power is defined as the non-active power of the higher harmonics:

$$D_K = \sqrt{U^2 - P^2 - Q_K^2}. \tag{40}$$

Sharon's definition

This definition [18] introduces two quantities, the reactive apparent power $S_Q$ and the complementary apparent power $S_C$, defined as:

$$S_Q = V_{\mathrm{rms}} \sqrt{\sum_{k=1}^{M} I_{k,\mathrm{rms}}^2 \sin^2\varphi_k} \tag{41}$$

and

$$S_C = \sqrt{U^2 - P^2 - S_Q^2} \tag{42}$$

where U is the apparent power (S in (9)) and P is the active power (8).

Fryze's definition

Fryze's definition [19] assumes that the instantaneous current is separated into two components, named active and reactive currents.
The active current is calculated as

$$i_a(t) = \frac{P}{V_{\mathrm{rms}}^2}\, v(t) \tag{43}$$

and the reactive current as:

$$i_r(t) = i(t) - i_a(t). \tag{44}$$

Active and reactive powers are

$$P = V_{\mathrm{rms}} I_a, \qquad Q_F = V_{\mathrm{rms}} I_r \tag{45}$$

where $I_a$ and $I_r$ represent the RMS values of the instantaneous active and reactive currents.

Kusters and Moore's power definitions

The Kusters-Moore definition [20] presents two different reactive power parameters, the inductive reactive power:

$$Q_L = V_{\mathrm{rms}}\, \frac{\displaystyle\sum_{k=1}^{M} \frac{1}{k}\, V_{k,\mathrm{rms}} I_{k,\mathrm{rms}} \sin\varphi_k}{\sqrt{\displaystyle\sum_{k=1}^{M} \frac{V_{k,\mathrm{rms}}^2}{k^2}}} \tag{46}$$

and the capacitive reactive power:

$$Q_C = V_{\mathrm{rms}}\, \frac{\displaystyle\sum_{k=1}^{M} k\, V_{k,\mathrm{rms}} I_{k,\mathrm{rms}} \sin\varphi_k}{\sqrt{\displaystyle\sum_{k=1}^{M} k^2 V_{k,\mathrm{rms}}^2}}. \tag{47}$$

There are other power decompositions not considered in this paper: the Shepard-Zakikhani [21], Depenbrock [22] and Czarnecki decompositions [23, 24]. A more comprehensive comparison of reactive power definitions, obtained by means of simulation, can be found in [25].

3. Measurement system

In order to establish a comprehensive picture of the properties of a given load, one needs to perform a complete analysis of the current and voltage waveforms at its terminals. In that way the fundamental and the higher harmonics of both the current and the voltage may be found. More frequently, however, indicators related to the power are sought in order to characterize the load quantitatively. Namely, a linear resistive load has voltage and current in phase and consumes only real power. Any other load deviates from this characterization, and one wants to know the extent of the deviation, expressed by as many indicators as necessary to get a complete picture. All of these were implemented in our measuring system, which is briefly described next. The solution, described in full detail in [6, 7], is based on a real-time system for nonlinear load analysis.
The system is based on the virtual instrumentation paradigm while keeping the main advantage of legacy instruments, determinism in measurement. The system consists of three subsystems: the acquisition subsystem, a real-time application for parameter calculations, and a virtual instrument for additional analysis and data manipulation (Fig. 1).

Fig. 1 The system architecture

The acquisition subsystem, Fig. 2, is implemented using a field-programmable gate array (a PXI chassis equipped with a PXI-7813R FPGA card with a Virtex-II FPGA) in control of data acquisition [26]. Acquisition is performed using NI 9225 [27] and NI 9227 [28] C Series acquisition modules connected to the PXI-7813R FPGA card [26]. The A/D resolution is 24-bit, with a 50 kSa/s sampling rate and a dynamic range of ±300 V for voltages and ±5 A for currents. The FPGA provides timing, triggering control, and channel synchronization, maintaining high speed, hardware reliability, and strict determinism. The FPGA code is implemented in the LabVIEW development environment. The function of the FPGA circuit is acquisition control.

Fig. 2 Connection diagram of acquisition subsystem

The software component is implemented in two stages, executing on a real-time operating system (Phar Lap RTOS, [29, 30]) and a general-purpose operating system (GPOS). The described system enables real-time calculation of a number of parameters that characterize nonlinear loads, which is impossible using classical instruments. The measured quantities are calculated from the current and voltage waveforms according to the IEEE 1459-2000 and IEEE 1459-2010 standards [12, 13]. The real-time application (Fig. 3) calculates power and power quality parameters deterministically and saves the calculated values to local storage. The application is executed on the real-time operating system.

Fig. 3 Part of real-time application in G code, alternative reactive power calculations

The virtual instrument, implemented in the National Instruments LabVIEW [30, 31] environment, is used for additional analysis and data manipulation and represents the user interface of the described system. It runs on a general-purpose operating system, physically apart from the rest of the system. Communication is achieved over TCP/IP. Parameters and values obtained by means of acquisition and calculation are presented numerically and graphically (Fig. 4).

Fig. 4 Virtual instrument provides measurements of various parameters

4. Measurement results

We have performed measurements on various small loads. The parameters obtained may be used for decision making of various kinds, such as verification of compliance with standards or categorization within quality frames. As small loads we here consider various devices: CFL and LED lamps, power supply devices, and battery chargers of personal communication and computing devices. These devices are ubiquitous and in everyday use, so their cumulative effect on the power distribution grid is not negligible [32], [33]. Various parameters that characterize nonlinearity, efficiency and quality are measured and calculated.

Table 1 shows measured results obtained on small loads such as various compact fluorescent lamps (CFL, 7 W to 20 W), incandescent lamps (100 W and 60 W), two low-power 1 W indoor LED (light emitting diode) lamps, a prototype of a 34 W street LED lamp and a CRT computer monitor for reference. The compact fluorescent lamp is a good example of a nonlinear load [34]. It brings a reduction in total energy consumption (about 20%, compared to an incandescent lamp of equivalent luminosity), but with harmonic currents and increased harmonic loss on the distribution transformer.
Measurements show that CFL lamps have good correction of the displacement power factor, but significant distortion leading to a low total power factor (Table 1). CFLs are equipped with power supply units which conduct current only during a very small part of the fundamental period, so the current drawn from the grid has the shape of a short impulse.

Table 1 CFL and LED lamps

| Type | Nominal power (W) | Frequency (Hz) | V_rms (V) | I_rms (mA) | Active power (W) | I_DC (mA) | Voltage THD (%) | Current THD (%) | Current crest | Voltage crest | DPF (%) | cos(φ) | TPF (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Incandescent | 100 | 50.03 | 230.20 | 421.66 | 97.02 | 0.62 | 3.11 | 3.05 | 1.52 | 1.47 | 99.95 | 1.00 | 99.95 |
| CFL bulb | 20 | 50.03 | 231.49 | 134.87 | 18.64 | 0.24 | 2.58 | 112.17 | 3.38 | 1.41 | 66.55 | 0.90 | 59.70 |
| CFL tube | 20 | 49.95 | 231.20 | 145.89 | 19.66 | 0.25 | 2.84 | 114.01 | 4.33 | 1.44 | 65.94 | 0.88 | 58.28 |
| CFL bulb | 15 | 49.99 | 231.47 | 92.16 | 12.60 | 0.13 | 2.82 | 115.52 | 3.52 | 1.41 | 65.45 | 0.90 | 59.08 |
| Incandescent | 60 | 49.97 | 231.15 | 257.88 | 59.58 | 0.42 | 2.87 | 2.84 | 1.57 | 1.41 | 99.96 | 1.00 | 99.96 |
| CFL spot | 7 | 49.97 | 232.48 | 50.86 | 7.23 | 0.19 | 2.81 | 104.24 | 3.24 | 1.40 | 69.23 | 0.88 | 61.20 |
| CFL bulb | 7 | 50.06 | 230.95 | 52.46 | 7.21 | 0.28 | 2.83 | 112.26 | 3.42 | 1.40 | 66.51 | 0.90 | 59.54 |
| CFL bulb | 9 | 50.01 | 233.20 | 60.54 | 8.25 | 0.11 | 2.87 | 116.93 | 3.60 | 1.39 | 64.99 | 0.90 | 58.44 |
| CFL tube | 11 | 50.01 | 233.17 | 84.34 | 11.66 | 0.16 | 2.79 | 112.27 | 3.37 | 1.45 | 66.51 | 0.89 | 59.28 |
| CFL tube | 18 | 50.01 | 221.32 | 135.56 | 18.40 | 0.38 | 2.82 | 107.35 | 4.52 | 1.45 | 68.16 | 0.90 | 61.32 |
| CFL tube | 11 | 50.01 | 221.14 | 115.00 | 14.06 | 0.16 | 3.01 | 119.30 | 4.06 | 1.46 | 64.24 | 0.86 | 55.41 |
| CFL helix | 11 | 50.00 | 221.83 | 76.73 | 10.23 | 0.25 | 2.96 | 109.26 | 4.90 | 1.47 | 67.51 | 0.89 | 60.09 |
| CFL bulb | 9 | 49.99 | 232.52 | 70.06 | 9.70 | 0.19 | 2.84 | 110.87 | 3.52 | 1.43 | 66.98 | 0.89 | 59.53 |
| CFL helix | 18 | 50.01 | 221.46 | 138.68 | 19.01 | 0.35 | 2.89 | 105.56 | 3.94 | 1.43 | 68.77 | 0.90 | 61.71 |
| CFL helix | 20 | 50.03 | 231.19 | 156.43 | 21.02 | 0.20 | 2.79 | 111.36 | 3.91 | 1.44 | 66.82 | 0.87 | 58.13 |
| CFL tube | 15 | 50.01 | 221.00 | 105.09 | 13.96 | 0.29 | 3.16 | 112.13 | 4.46 | 1.40 | 66.56 | 0.90 | 60.11 |
| LED white | 1 | 50.00 | 217.24 | 14.96 | 0.35 | 0.09 | 2.36 | 21.14 | 1.72 | 1.38 | 97.84 | 0.11 | 10.79 |
| LED cold w. | 1 | 49.94 | 217.33 | 14.95 | 0.35 | 0.08 | 2.36 | 21.14 | 1.72 | 1.38 | 97.84 | 0.11 | 10.79 |
| LED street | 34 | 49.99 | 216.63 | 246.12 | 32.87 | 0.05 | 2.53 | 102.98 | 3.28 | 1.38 | 69.66 | 0.89 | 61.66 |
| CRT | - | 50.03 | 232.63 | 475.86 | 107.46 | 1.60 | 2.93 | 13.24 | 1.65 | 1.49 | 99.14 | 0.98 | 97.69 |

Characterization of nonlinear loads can be accomplished by analyzing reactive and distortion power. Table 2 shows reactive power and distortion power values, calculated using alternative definitions, for compact fluorescent lamps, two incandescent lamps and indoor LED lamps. The following values are displayed: active power (P), apparent power (S), non-active power (N), Budeanu's reactive power (Q_B), Budeanu's distortion power (D_B), Fryze's reactive power (Q_F), the IEEE Std 1459-2010 proposed definition of reactive power (Q_IEEE), Sharon's apparent power (S_Q), Kimbark's reactive power (Q_K), and Kusters-Moore's capacitive (Q_C) and inductive (Q_L) reactive power.

Comparison of Budeanu's reactive and distortion power suggests that all examined CFL and LED lamps are nonlinear loads (D_B > Q_B). Reactive power calculated from Fryze's definition (45) is equal to the non-active power, $N = \sqrt{S^2 - P^2}$. Kimbark's equation (39) for reactive power, which takes only the fundamental harmonic into account, deviates by approximately ±3% from Budeanu's formula (Q_B). This suggests that the actual contribution of harmonic frequencies to reactive power is small, less than 3% of the total reactive power. The IEEE proposed definition always provides a value of the total reactive power greater than the value of the fundamental component.

Table 2 CFL and LED lamps
| No. | Type | Power | P (W) | U (VA) | N (VAR) | Q_B (VAR) | D_B (VAR) | Q_F (VAR) | Q_IEEE (VAR) | S_Q (VAR) | Q_K (VAR) | Q_C (VAR) | Q_L (VAR) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | CFL rod | | 11.56 | 17.84 | 13.58 | -6.16 | 12.10 | 13.58 | 6.16 | 10.24 | -6.16 | -4.43 | -6.11 |
| 2 | CFL bulb E27 | 20 | 17.14 | 27.72 | 21.78 | -8.43 | 20.08 | 21.78 | 8.43 | 14.48 | -8.43 | -6.46 | -8.37 |
| 3 | CFL tube E27 | 20 | 16.77 | 28.46 | 23.00 | -8.44 | 21.39 | 23.00 | 8.45 | 14.55 | -8.45 | -6.07 | -8.39 |
| 4 | CFL bulb E27 | 15 | 11.59 | 18.91 | 14.94 | -5.31 | 13.97 | 14.94 | 5.32 | 9.22 | -5.32 | -4.00 | -5.28 |
| 5 | Inc E27 | 100 | 86.77 | 86.78 | 0.80 | -0.50 | 0.63 | 0.80 | 0.50 | 0.56 | -0.50 | -0.36 | -0.49 |
| 6 | CFL spot E14 | 7 | 5.87 | 9.32 | 7.25 | -2.83 | 6.67 | 7.25 | 2.81 | 4.23 | -2.81 | -2.17 | -2.80 |
| 7 | CFL bulb E27 | 7 | 6.16 | 9.86 | 7.71 | -2.64 | 7.24 | 7.71 | 2.65 | 4.83 | -2.65 | -2.03 | -2.63 |
| 8 | CFL bulb E14 | 9 | 6.46 | 10.78 | 8.63 | -2.72 | 8.19 | 8.63 | 2.72 | 5.45 | -2.72 | -2.08 | -2.70 |
| 9 | CFL tube E14 | 11 | 9.89 | 16.11 | 12.72 | -4.71 | 11.82 | 12.72 | 4.69 | 7.89 | -4.69 | -3.61 | -4.66 |
| 10 | CFL tube E27 | 18 | 17.10 | 28.86 | 23.24 | -8.73 | 21.54 | 23.24 | 8.75 | 13.27 | -8.75 | -6.64 | -8.68 |
| 11 | CFL tube E27 | 11 | 10.63 | 17.67 | 14.12 | -5.83 | 12.85 | 14.12 | 5.83 | 8.85 | -5.83 | -4.41 | -5.79 |
| 12 | CFL helix E27 | 11 | 9.58 | 16.27 | 13.16 | -4.93 | 12.20 | 13.16 | 4.95 | 8.75 | -4.95 | -3.68 | -4.90 |
| 13 | Inc E14 | 60 | 55.06 | 55.06 | 0.61 | -0.37 | 0.49 | 0.61 | 0.37 | 0.37 | -0.37 | -0.27 | -0.37 |
| 14 | CFL helix E27 | 18 | 17.21 | 28.87 | 23.18 | -8.82 | 21.43 | 23.18 | 8.83 | 15.55 | -8.82 | -6.77 | -8.76 |
| 15 | CFL helix E27 | 20 | 18.41 | 30.68 | 24.54 | -9.95 | 22.43 | 24.54 | 9.93 | 16.14 | -9.93 | -7.56 | -9.86 |
| 16 | CFL tube E27 | 15 | 12.66 | 21.97 | 17.95 | -6.32 | 16.80 | 17.95 | 6.33 | 11.63 | -6.33 | -4.80 | -6.28 |
| 17 | Spot E27 | 15 | 16.92 | 34.24 | 29.77 | -3.88 | 29.52 | 29.77 | 4.14 | 20.01 | -4.13 | -1.98 | -4.06 |
| 18 | Spot E27 | 10 | 13.23 | 26.33 | 22.76 | -2.97 | 22.56 | 22.76 | 3.17 | 15.45 | -3.17 | -1.51 | -3.12 |
| 19 | Bulb W E27 | 8 | 10.00 | 19.53 | 16.77 | -2.81 | 16.54 | 16.77 | 2.94 | 11.52 | -2.93 | -1.74 | -2.89 |
| 20 | Bulb W E27 | 6 | 8.51 | 9.45 | 4.11 | 0.08 | 4.11 | 4.11 | 0.07 | 3.29 | 0.07 | 0.08 | 0.07 |
| 21 | Bulb E27 | 6 | 8.69 | 9.58 | 4.04 | 0.09 | 4.04 | 4.04 | 0.08 | 3.28 | 0.08 | 0.08 | 0.08 |
| 22 | Bulb E27 | 3 | 4.07 | 7.70 | 6.54 | -0.84 | 6.48 | 6.54 | 0.90 | 4.35 | -0.90 | -0.45 | -0.88 |
| 23 | RGB E27 | 3 | 1.92 | 3.17 | 2.52 | 0.01 | 2.52 | 2.52 | 0.01 | 1.39 | 0.00 | 0.05 | 0.00 |
| 24 | Spot E14 | 3 | 4.00 | 8.05 | 6.99 | -0.98 | 6.92 | 6.99 | 1.04 | 4.86 | -1.04 | -0.52 | -1.02 |

Further, personal devices containing rechargeable batteries, such as a tablet computer, mobile phone, laptop computer and cordless telephone, are analyzed with regard to operating conditions. Measured results are presented in Table 3. The working conditions are standby (device turned off and battery not charging), working and charging (device turned on and battery charging) and charging only (device turned off and battery charging). A standalone battery charger is also tested. The following values are measured and shown in the table: voltage RMS (V), current RMS (I), frequency (f), cosine of the first-harmonic phase difference (cos φ1), TPF, total power factor (%), DPF, distortion power factor (%), THDV, voltage total harmonic distortion (%), THDI, current total harmonic distortion (%), active power (P), Budeanu's reactive power (Q_B), apparent power (U), distortion power (D), non-active power (N), phasor power (S), first-harmonic active power (P1) and higher-harmonics active power (PH).

Next we pay some attention to the results given in Table 3. Let us first look at the distortion of the current (THDI). As can be seen, even in the best cases the THDI is larger than 20%. There is a case, a mobile phone battery charger while charging, where the THDI is 154.51%, which means that the harmonics exceed the fundamental by a large margin. Note that this is not an isolated case; one may observe several THDIs of similar value. To summarize, THDI exposes the nonlinear character of all small loads, some of which are extremely nonlinear, producing harmonics larger than the fundamental.

Table 3 Personal devices in different working conditions

| No. | Device description | V (V) | I (mA) | f (Hz) |
|---|---|---|---|---|
| 1 | Charger 230 V 1.7 A, 2xAAA NiCd battery charging, 850 mAh | 236.06 | 9.89 | 50.02 |
| 2 | Tablet computer turned on. Li-polymer 8220 mAh battery charging | 235.70 | 80.92 | 49.98 |
| 3 | Tablet computer turned off. Li-polymer 8220 mAh battery charging | 236.59 | 61.65 | 49.99 |
| 4 | Tablet computer turned off. Charger 230 V/2 A connected. Not charging | 236.51 | 1.70 | 50.00 |
| 5 | Mobile phone charger connected. Not charging. 230 V/0.2 A | 236.62 | 1.33 | 9.99 |
| 6 | Mobile phone turned on. Li-ion 1230 mAh battery charging | 235.65 | 53.72 | 49.98 |
| 7 | Mobile phone turned off. Li-ion 1230 mAh battery charging | 236.09 | 48.05 | 50.01 |
| 8 | Laptop comp. (type 1) turned on. Charger 230 V, 1.7 A connected, not charging | 233.49 | 22.99 | 50.01 |
| 9 | Laptop comp. (type 1) turned on. Li-ion 2200 mAh battery charging | 232.81 | 231.39 | 50.00 |
| 10 | Laptop comp. (type 1) turned off. Li-ion 2200 mAh battery charging | 233.52 | 106.52 | 49.99 |
| 11 | Laptop comp. (type 2) turned on. Charger 230 V 1.5 A connected, not charging | 233.07 | 15.71 | 49.99 |
| 12 | Laptop computer (type 2) turned on. Li-ion 4400 mAh battery charging | 232.05 | 436.60 | 49.97 |
| 13 | Cordless telephone base charger 230 V/40 mA disconnected | 232.77 | 21.05 | 49.97 |
| 14 | Cordless telephone base. 2xAAA NiCd 550 mAh battery not charging | 233.68 | 21.71 | 50.00 |
| 15 | Cordless telephone base. 2xAAA NiCd 550 mAh battery charging | 233.55 | 25.60 | 49.99 |

| No. | TPF (%) | DPF (%) | THDV (%) | THDI (%) | P (W) | Q_B (VAR) | U (VA) | D (VAR) | N (VAR) | S (VAR) | P1 (W) | PH (W) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 32.93 | 70.81 | 1.70 | 94.47 | 0.77 | 1.77 | 2.33 | 1.62 | 2.20 | 1.68 | 0.78 | -0.02 |
| 2 | 57.36 | 58.15 | 1.73 | 137.76 | 10.94 | -1.74 | 19.07 | 15.53 | 15.62 | 11.08 | 11.08 | -0.14 |
| 3 | 55.12 | 55.54 | 1.70 | 146.23 | 8.04 | -0.93 | 14.59 | 12.13 | 12.17 | 8.09 | 8.17 | -0.12 |
| 4 | 21.43 | 79.20 | 1.67 | 114.80 | 0.09 | 0.18 | 0.40 | 0.35 | 0.39 | 0.20 | 0.05 | 0.00 |
| 5 | 12.64 | 101.35 | 1.69 | 59.01 | 0.04 | 0.17 | 0.31 | 0.26 | 0.31 | 0.18 | 0.02 | 0.00 |
| 6 | 52.73 | 53.66 | 1.71 | 154.51 | 6.67 | -1.18 | 12.66 | 10.69 | 10.76 | 6.78 | 6.73 | -0.05 |
| 7 | 51.18 | 51.98 | 1.77 | 161.72 | 5.81 | -0.96 | 11.34 | 9.70 | 9.75 | 5.88 | 5.87 | -0.06 |
| 8 | 7.00 | 95.18 | 1.78 | 29.07 | 0.38 | 1.38 | 5.37 | 1.61 | 5.36 | 5.12 | 0.38 | -0.01 |
| 9 | 53.67 | 54.76 | 2.00 | 147.11 | 28.91 | -6.10 | 53.87 | 45.04 | 45.45 | 29.55 | 29.65 | -0.71 |
| 10 | 47.51 | 50.62 | 1.92 | 164.35 | 11.82 | -4.64 | 24.87 | 21.39 | 21.89 | 12.70 | 12.18 | -0.28 |
| 11 | 12.69 | 99.22 | 1.94 | 40.82 | 0.46 | 1.46 | 3.66 | 1.42 | 3.63 | 3.37 | 0.43 | 0.00 |
| 12 | 96.74 | 97.30 | 1.83 | 20.90 | 98.01 | -10.67 | 101.31 | 23.32 | 25.65 | 98.59 | 97.86 | 0.02 |
| 13 | 23.50 | 90.76 | 1.80 | 43.70 | 1.15 | 4.33 | 4.90 | 1.97 | 4.76 | 4.48 | 1.16 | -0.01 |
| 14 | 47.31 | 92.64 | 1.78 | 36.64 | 2.40 | 4.09 | 5.07 | 1.81 | 4.47 | 4.74 | 2.43 | -0.01 |
| 15 | 70.29 | 92.99 | 1.82 | 37.24 | 4.20 | 3.66 | 5.98 | 2.16 | 4.25 | 5.57 | 4.23 | -0.02 |

The next very important and also interesting set of data is related to the power factor. In the early days it was known as the cos φ of the load, since only linear loads were considered, which were supposed to have a reactive component introducing a phase shift between the voltage and the current. The total power factor (TPF) encompasses the whole picture, including the distortions of both the voltage and the current and their mutual phase shift. As can be seen from Table 3, there is only one case where the TPF approaches unity, which is supposed to be its ideal value.
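The relation between the columns of Table 3 can be cross-checked with a short sketch. It assumes the simplifications that lead to (35), namely that the harmonic active power and the voltage distortion are negligible, so that (34) reduces to DPF ≈ 1/√(1 + THD_I²), and it recovers cos φ1 from (33). The function names are hypothetical; the input numbers are rows 6 and 12 of Table 3.

```python
import math


def dpf_approx(thd_i_percent):
    """Approximate DPF (%) from current THD (%), i.e. (34) with P_H = 0 and THD_V = 0."""
    thd = thd_i_percent / 100.0
    return 100.0 / math.sqrt(1.0 + thd * thd)


def displacement_factor(tpf_percent, dpf_percent):
    """cos(phi_1) implied by (33): TPF = DPF * cos(phi_1)."""
    return tpf_percent / dpf_percent


# (THD_I, measured DPF, measured TPF), all in %, copied from Table 3.
rows = {
    6: (154.51, 53.66, 52.73),   # mobile phone turned on, battery charging
    12: (20.90, 97.30, 96.74),   # laptop computer (type 2), battery charging
}

for number, (thd_i, dpf_measured, tpf_measured) in rows.items():
    estimate = dpf_approx(thd_i)
    cos_phi1 = displacement_factor(tpf_measured, dpf_measured)
    print(f"row {number}: DPF approx {estimate:.2f} %, "
          f"measured {dpf_measured:.2f} %, cos(phi1) {cos_phi1:.3f}")
```

For both rows the estimate lands within about one percentage point of the measured DPF, which is consistent with the low TPF of the chargers being caused mainly by current distortion rather than by phase shift. The approximation is not expected to hold for the standby rows, where the measured powers are close to zero.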
In many of the cases the value of TPF is smaller than 50%, meaning that the active power is less than half of the total power drawn from the mains which, as we could see from the previous paragraph, is mainly due to the distortions. In general, since most of the chargers are considered to be of small power (see column P1 in Table 3), no power factor correction is built in, so that significant losses are allowed. That, to repeat once more, would not be a problem if the number of such devices, attached to the mains all the time, were not in the range of billions.

The next column, the distortion power factor (DPF), represents the percentage of power taken by the harmonics. As we can see, except for a small number of cases where the harmonics are approximately at the level of half of the total power, in most cases they take as much power as the fundamental. Note that the harmonics are unwanted not only because of efficiency problems. In fact, in the long term, the presence of harmonics on the grid can cause:

- increased electrical consumption,
- added wear and tear on motors and other equipment,
- greater maintenance costs,
- upstream and downstream power-quality problems,
- utility penalties for causing problems on the power grid,
- overheating in transformers, and similar.

A similar conclusion may be drawn by comparing the distortion power (D) and the power of the first (fundamental) harmonic (P1). There are only three cases where the second is larger than the first.

To summarize the data from Table 3, one may say that an electronic load to the grid which in fact represents the power supply of a telecommunication or IT device is a small but highly nonlinear load. In many cases the TPF of such a load favors everything but the active power to be delivered to the device.

5. Conclusion

Due to the changes in the nature of the electrical loads to the grid, new aspects of the characterization of these loads are emerging.
These are related mainly to the nonlinearities of modern electronic loads and to the subsystems used for conversion from DC to AC and vice versa, which is becoming unavoidable in modern production and distribution systems. To qualify and quantify the properties of modern electrical power systems, new tools have to be developed that are able to cope with the new properties of the signals arising at the grid-to-load and grid-to-power-producing-facility interfaces. That holds both for the theoretical algorithms for computation and for the measurement equipment itself.

In these proceedings we present our results in the development and implementation of a measurement system for small loads, which are becoming ubiquitous and consequently of big concern for the quality of the delivered electrical energy. We also present the measurement results for a broad set of electronic loads, revealing many secrets hidden behind the prejudice that these loads are small and unimportant. Our hardware and software solutions may be characterized as advanced, accurate and versatile while at the same time low in price, which makes them very attractive for practical use, be it in laboratory or in field conditions.

Acknowledgement: This research was partly funded by the Ministry of Education and Science of Republic of Serbia under contract No TR32004.

References

[1] L. Freeman, "The changing nature of loads and the impact on electric utilities", Tech Advantage Expo Electronics Exhibition and Conference 2009, New Orleans, USA, Feb. 2009, www.techadvantage.org/2009conferencehandouts/2e_freeman.pdf.
[2] V. Terzija, V. Stanojević, "STLS algorithm for power-quality indices estimation", IEEE Transactions on Power Delivery, vol. 24, no. 2, pp. 544-552, April 2008.
[3] G. Goertzel, "An algorithm for the evaluation of finite trigonometric series", The American Mathematical Monthly, no. 1, vol. 65, pp. 34-35, January 1958.
[4] S. Vukosavić, "Detection and suppression of parasitic DC voltages in 400 V AC grids", Facta Universitatis, Series: Electronics and Energetics, vol. 28, no 4, pp. 527-540, December 2015.
[5] L. Korunović, M. Rašić, N. Floranović, V. Aleksić, "Load modelling at low voltage using continuous measurements", Facta Universitatis, Series: Electronics and Energetics, vol. 27, no. 3, pp. 455-465, September 2014.
[6] M. Dimitrijević, V. Litovski, "Power factor and distortion measuring for small loads using USB acquisition module", Journal of Circuits, Systems, and Computers, vol. 20, no. 5, pp. 867-880, August 2011.
[7] M. Dimitrijević, "Electronic system for polyphase nonlinear load analysis based on FPGA", PhD thesis, Niš, 2012 (in Serbian).
[8] M. Dimitrijević, and V. Litovski, "Quantitative analysis of reactive power definitions for small nonlinear loads", in Proc. of the 4th Small Systems Simulation Symposium, Niš, Serbia, 2012, pp. 150-154.
[9] M. Dimitrijević, and V. Litovski, "Real-time virtual instrument for polyphase nonlinear loads analysis", in Proc. of the IX Int. Symp. on Industrial Electronics, INDEL 2012, Banja Luka, B&H, November 2012, pp. 136-141.
[10] M. Andrejević Stošović, M. Dimitrijević, and V. Litovski, "Computer security vulnerability as concerns the electricity distribution grid", Applied Artificial Intelligence, vol. 28, pp. 323-336, 2014.
[11] M. Dimitrijević, M. Andrejević Stošović, J. Milojković, V. Litovski, "Implementation of artificial neural networks based AI concepts to the smart grid", Facta Universitatis, Series: Electronics and Energetics, vol. 27, no. 3, pp. 411-424, September 2014.
[12] -, "IEEE trial-use standard definitions for the measurement of electric power quantities under sinusoidal, non-sinusoidal, balanced, or unbalanced conditions", IEEE Power Engineering Society, IEEE Std. 1459-2000, 30. January 2000.
[13] IEEE Power Engineering Society: IEEE trial-use standard definitions for the measurement of electric power quantities under sinusoidal, nonsinusoidal, balanced, or unbalanced conditions. IEEE Std. 1459-2010, 2. February 2010.
[14] L. S. Czarnecki, "Harmonics and power phenomena", Encyclopedia of Electrical and Electronics Engineering, J. Wiley and Sons, 1999.
[15] A. E. Emanuel, "Power definitions and the physical mechanism of power flow", J. Wiley and Sons, 2010.
[16] C. I. Budeanu, "Reactive and fictitious powers." Rumanian National Institute, no. 2, 1927.
[17] E. W. Kimbark, "Direct current transmission", J. Wiley and Sons, 1971.
[18] D. Sharon, "Reactive power definition and power-factor improvement in nonlinear systems." In Proc. of Inst. Electric Engineers, vol. 120, pp. 704-706, 1973.
[19] S. Fryze, et al., "Elektrischen Stromkreisen mit nichtsinusförmigem Verlauf von Strom und Spannung." Elektrotechnische Zeitschrift, no. 53, vol. 25, pp. 596-599, 1932.
[20] N. L. Kusters, W. J. M. Moore, "On the definition of reactive power under nonsinusoidal conditions." IEEE Trans. Power Apparatus Systems, no. 99, vol. 5, pp. 1845-1854, 1980.
[21] W. Shepard, P. Zakikhani, "Power factor correction in nonsinusoidal systems by the use of capacitance", Journal of Physics D: Applied Physics, no. 6, pp. 1850-1861, 1973.
[22] M. W. Depenbrock, E. T. G. Blindleistung, Fachtagung Blindleistung. Aachen, 1979.
[23] L. S. Czarnecki, "Powers in nonsinusoidal networks: their interpretation, analysis and measurement", IEEE Trans. Instrumental Measurements, no. 39, vol. 2, pp. 340-345, 1990.
[24] L. S. Czarnecki, "Physical reasons of currents RMS value increase in power systems with nonsinusoidal voltage", IEEE Trans. in Power Delivery, no. 8, vol. 1, pp. 437-447, 1993.
[25] M. E. Balci, M. H. Hocaoglu, "Quantitative comparison of power decompositions", Electric Power Systems Research, no. 78, pp. 318-329, 2008.
[26] -, "NI PXI-7813R R Series digital RIO with Virtex-II 3M gate FPGA." National Instruments.
[27] -, "NI 9225 operating instructions and specifications." National Instruments.
[28] -, "NI 9227 operating instructions and specifications", National Instruments.
[29] C. Jarvis, C., K. Kinsella, P. Timpanaro, "Phar Lap ETS™ - an industrial-strength RTOS white paper."
[30] National Instruments: "LabVIEW Real-Time." National Instruments web page. http://sine.ni.com/nips/cds/view/p/lang/en/nid/2381.
[31] National Instruments, "LabVIEW system design software."
[32] D. Stevanović, P. Petković, "Smarter power meters reduce economic losses at utility grid", Facta Universitatis, Series: Electronics and Energetics, vol. 28, no 3, pp. 407-421, September 2015.
[33] S. Puzović, B. M. Koprivica, A. Milovanović, M. Đekić, "Analysis of measurement error in direct and transformer-operated measurement systems for electric energy and maximum power measurement", Facta Universitatis, Series: Electronics and Energetics, vol. 27, no. 3, pp. 389-398, September 2014.
[34] M. Etezadi-Amoli, T. Sr. Florence, "Power factor and harmonic distortion characteristics of energy efficient lamps", IEEE Transactions on Power Delivery, no. 4, pp. 1965-1969, 1989.

Facta Universitatis
Series: Electronics and Energetics Vol. 35, No 2, June 2022, pp. 155-186
https://doi.org/10.2298/FUEE2202155N
© 2022 by University of Niš, Serbia | Creative Commons License: CC BY-NC-ND

Review paper

Fifty years of microprocessor evolution: from single CPU to multicore and manycore systems

Goran Nikolić, Bojan Dimitrijević, Tatjana Nikolić, Mile Stojčev
University of Niš, Faculty of Electronic Engineering, Niš, Serbia

Abstract. Nowadays microprocessors are among the most complex electronic systems that man has ever designed.
One small silicon chip can contain the complete processor, a large memory and the logic needed to connect it to the input-output devices. The performance of today's processors implemented on a single chip surpasses the performance of a room-sized supercomputer from just 50 years ago, which cost over $10 million [1]. Even the embedded processors found in everyday devices such as mobile phones are far more powerful than computer developers once imagined. The main components of a modern microprocessor are a number of general-purpose cores, a graphics processing unit, a shared cache, a memory and input-output interface and a network on a chip to interconnect all these components [2]. The speed of the microprocessor is determined by its clock frequency and cannot exceed a certain limit. Namely, as the frequency increases, the power dissipation increases too, and consequently the amount of heating becomes critical. So, silicon manufacturers decided to design a new processor architecture, called multicore processors [3]. With the aim of increasing performance and efficiency, these multiple cores execute multiple instructions simultaneously. In this way, the amount of parallel computing, or parallelism, is increased [4]. In spite of the mentioned advantages, numerous challenges must be addressed carefully when more cores and parallelism are used. This paper presents a review of microprocessor microarchitectures, discussing their generations over the past 50 years. Then, it describes the currently used implementations of the microarchitecture of modern microprocessors, pointing out the specifics of parallel computing in heterogeneous microprocessor systems. To use the possibilities of multicore technology efficiently, software applications must be multithreaded. The program execution must be distributed among the cores of a multicore processor so that they can operate simultaneously.
To use multithreading, it is imperative for the programmer to understand the basic principles of parallel computing and parallel hardware. Finally, the paper provides details on how to implement hardware parallelism in multicore systems.

Key words: microprocessor, pipelining, superscalar, multicore, multithreading

Received April 13, 2022
Corresponding author: Goran Nikolić, University of Niš, Faculty of Electronic Engineering, 18106 Niš, Aleksandra Medvedeva 14, Serbia
E-mail: goran.nikolic@elfak.ni.ac.rs

1. Introduction

A microprocessor (a processor implemented in a single chip) is one of the most inventive technological innovations in electronics since the discovery of the transistor in 1948. This amazing device has brought many innovations to the field of digital electronics and has become a part of people's everyday life. The microprocessor is the central processing unit (CPU) and an essential component of the computer [5]. Nowadays, it is a silicon chip composed of millions up to billions of transistors and other electronic components. The CPU can execute several hundred million to billions of instructions per second. A microprocessor is preprogrammed to execute software in conjunction with memory and special-purpose chips. It accepts digital data as input and processes it according to the instructions stored in the memory [6]. The microprocessor performs numerous functions, including data storage, interaction with input-output devices, time-critical execution and others. Applications of microprocessors range from very complex process controllers to simple devices and even toys. Therefore, it is necessary for every electronics engineer to have a solid knowledge of microprocessors. This article discusses the types of microprocessors and their 50-year evolution. The evolution of microprocessors throughout history has been turbulent.
the first microprocessor, the intel 4004, was designed by intel in 1971. it was composed of about 2,300 transistors, was clocked at 740 khz and delivered 92,000 instructions per second while dissipating around 0.5 watts. after that, a new microprocessor with significant performance improvements over the previous ones was launched almost every year. the growth in performance was exponential, of the order of 50% per year, resulting in a cumulative growth of over three orders of magnitude over a two-decade period [7]. these improvements have been driven by advances in the semiconductor manufacturing process and innovations in processor architecture [8]. multicore processing has posed new challenges for both hardware designers and application developers. parallel applications place new demands on the processing system. although a multicore architecture designed for a specific target problem gives excellent results, it should be borne in mind that the main goal in computer system design should be to provide the ability to handle different types of problems efficiently. however, a single "one size fits all" architecture that is able to solve all challenges effectively has not been found so far, and many are convinced that it never will be [9]. this article presents a review of the microarchitecture of contemporary microprocessors. the discussion starts with 50 years of microprocessor history and its generations. then, it describes the currently used microarchitecture implementations of modern microprocessors. at the end, it points to the specifics of parallel computing in heterogeneous microprocessor systems. this article is intended for an advanced course on computer architecture, suitable for graduate students or senior undergraduates in computer and electrical engineering. it can also be useful for practitioners in industry working in the area of microprocessor design. 2.
definition of microprocessor
the central processing unit (cpu), also known as a processor or microprocessor, is the controlling unit of a micro-computer, contained in a small chip. the cpu is often referred to as the brain and heart of all computer (digital) systems and is responsible for doing all the work. it performs every single action a computer does and executes programs. in essence, the cpu performs arithmetic and logic operations in its arithmetic logic unit (alu) and communicates with the other input/output devices and auxiliary storage units connected to it.
fifty years of microprocessor evolution: from single cpu to multicore and manycore systems 157
in modern computers, the cpu is contained on an integrated circuit chip in which several functions are combined [10]. in general, all cpus, whether single-chip microprocessors or multichip implementations, run programs by performing the following steps:
1. read an instruction and decode it
2. find any associated data that is needed to process the instruction
3. process the instruction
4. write the results out
the instruction cycle is repeated continuously until the power is turned off. a microprocessor is built using the following three basic circuit blocks [11]: 1. registers, 2. alu, and 3. control unit (cu). registers can exist in two forms, either as an array of static memory elements such as flip-flops, or as a portion of a random access memory (ram), which may be of the dynamic or static type. the alu usually provides, at the minimum, facilities for addition, subtraction, logical or, logical and, complementation, and shift operations. the cu regulates and integrates computer operations. it selects and retrieves instructions from main memory in the appropriate order and interprets them to activate the other functional building blocks of the system at the appropriate time, so that they perform their proper operations. 3.
generation and microprocessor history
on december 23rd, 1947, the transistor was invented at bell laboratories, and the integrated circuit was invented in 1958 at texas instruments. in 1971, intel (integrated electronics) introduced the first microprocessor. the evolution of the cpu can be divided into five generations [12], whose characteristics are discussed below.
1st generation: the first-generation microprocessors were introduced in 1971-1972, when intel launched the first microprocessor, the 4004, running at a clock speed of 740 khz. other microprocessors that belong to this generation are the rockwell international pps-4, intel 8008, and national semiconductor imp-16. instruction processing in these cpus was serial. namely, the instruction phases (fetch, decode and execute) were performed sequentially. when the current instruction finished, the cpu updated the instruction pointer and fetched the next instruction in the program sequence, and so on for each instruction in turn.
2nd generation: this was the period from 1973 to 1978, in which very efficient 8-bit microprocessors were implemented, such as the motorola 6800 and 6801, intel 8085, and zilog's z80, which were among the most popular ones. the second generation is characterized by overlapped fetch, decode, and execute phases. while the first instruction is processed in the execution unit, the second instruction is decoded and the third instruction is fetched. compared to the first generation, the use of new semiconductor technologies for chip manufacture was a novelty. the gains were a significant increase in instruction execution speed and in chip density.
3rd generation: the third-generation microprocessors were introduced in 1978, as represented by intel's 8086 and the zilog z8000. from 1979 to 1980, the intel 8086/80186/80286 and motorola 68000 and 68010 were developed. processors of this generation were 16-bit, four times faster than the previous generation, and had performance comparable to minicomputers [13], [14], [15]. the development of a proprietary microprocessor architecture based on the manufacturer's own instruction set computer (isc) was a novelty of this generation.
4th generation: the development of 32-bit microprocessors, during the period from 1981 to 1995, characterizes the fourth generation. typical products were the intel 80386 and motorola's 68020/68030. microprocessors of this generation are characterized by higher chip density, up to a million transistors. high-end microprocessors of the time, such as motorola's 88100 and intel's 80960ca, could issue and retire more than one instruction per clock cycle [16], [17].
5th generation: from 1995 until now, this generation has been characterized by 64-bit processors that have high performance and run at high speeds. typical representatives are pentium, celeron, and dual- and quad-core processors that use superscalar processing, with chip designs exceeding 10 million transistors. 64-bit processors became mainstream in the 2000s. microprocessor speeds were limited by power dissipation. in order to avoid expensive cooling systems, manufacturers were forced to use parallel computing in the form of multi-core and many-core processors.
thus, the microprocessor has evolved through all these generations, and the fifth-generation microprocessors represent an advancement in specifications. some of the fifth-generation processors and their specifications will be briefly discussed in the text that follows.
4. classification of processors
processors can be classified along several orthogonal dimensions. here we will point briefly to some of the most commonly used.
the first classification is based on microarchitecture specifics, the second on the market segment, the third on the type of processing, etc. in this article, we will focus on the first classification scheme. for more details on this topic, readers can consult reference [10].
4.1. classification of microarchitecture specifics
in general, we distinguish the following classifications:
4.1.1. pipelined vs non-pipelined processors
a non-pipelined processor executes only a single instruction at a given time. the start of the next instruction is delayed until the current one ends, not because of hazards but unconditionally. the cpu scheduler chooses the next instruction from the pool of waiting instructions when the processor is free. pipelining is a technique in which multiple instructions are overlapped during execution. the pipelined processor is divided into several processing stages (segments). the stages are connected to each other in the form of a pipe structure. each stage consists of an input register and a combinational circuit. the role of the register is to hold data, and the role of the combinational circuit is to process it. the combinational circuit outputs the processed data to the input register of the next segment. the pipeline technique is divided into two categories: a) arithmetic pipelines, which are mainly used for floating-point operations, multiplication of fixed-point numbers, etc.; b) instruction pipelines, in which instructions are executed by overlapping the fetch, decode and execute phases. the pipeline technique increases instruction level parallelism (ilp) and is used by all processors nowadays [18].
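as a rough illustration of the stage structure described above, the following python sketch models each stage as an input register followed by a combinational function, and advances all registers once per clock. the stage functions (`fetch`, `decode`, `execute`) and the toy instruction format are assumptions made for this example and are not taken from the paper; hazards are not modelled.

```python
# minimal sketch of a 3-stage instruction pipeline (fetch, decode, execute).
# each stage = input register + combinational function; all names are illustrative.

def fetch(pc, program):
    # combinational part of the fetch stage: read the instruction text
    return program[pc]

def decode(text):
    # split "op a b" into an operation name and two integer operands
    op, a, b = text.split()
    return (op, int(a), int(b))

def execute(decoded):
    op, a, b = decoded
    return a + b if op == "add" else a - b

def run_pipeline(program):
    """Return (results, cycles) for a toy pipeline without hazards."""
    decode_reg = None   # input register of the decode stage
    execute_reg = None  # input register of the execute stage
    results = []
    pc = 0
    cycles = 0
    while pc < len(program) or decode_reg is not None or execute_reg is not None:
        cycles += 1
        # compute all stage outputs from the current register contents
        executed = execute(execute_reg) if execute_reg is not None else None
        decoded = decode(decode_reg) if decode_reg is not None else None
        fetched = fetch(pc, program) if pc < len(program) else None
        # clock edge: every register latches the output of the previous stage
        if executed is not None:
            results.append(executed)
        execute_reg = decoded
        decode_reg = fetched
        if fetched is not None:
            pc += 1
    return results, cycles

program = ["add 1 2", "sub 9 4", "add 10 20", "add 0 7"]
results, cycles = run_pipeline(program)
print(results, cycles)
```

in this model four instructions leave a three-stage pipeline after 3 + 4 - 1 = 6 cycles, whereas executing the same three phases strictly one instruction at a time would take 12 cycles.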
table 1 difference between pipelining and non-pipelining systems
pipelining system | non-pipelining system
multiple instructions are overlapped during execution | the phases of fetching, decoding, execution and writing to memory are merged into a single unit (step)
several instructions are executed at the same time | only one instruction is executed at a time
the cpu scheduler design determines efficiency | the efficiency is not dependent on the cpu scheduler
execution time is shorter (fewer cycles) | execution takes more time (a greater number of cycles)
although pipelining increases the overall system performance, there are several factors that cause conflicts and degrade performance. among the most important factors are the following:
1. timing variations: the processing time of instructions is not the same, because different instructions may require different operands (constants, registers, memory). accordingly, pipeline stages do not always consume the same amount of time.
2. data hazards: the problem arises when several instructions are partially executed in the pipeline and two or more of them refer to the same data. in that case, it must be ensured that the next instruction stalls until the current instruction has finished processing that data, because otherwise an incorrect result will occur.
3. branching: the next instruction is fetched during the execution of the current one. however, if the current instruction is a conditional branch, the next instruction will not be known until the current one completes data processing and determines the branch outcome.
4. interrupts: interrupts have an impact on the execution of instructions by inserting unwanted instructions into the current instruction stream.
5. data dependency: this problem occurs when the result of the previous instruction is not yet available but is already needed as data for the current instruction.
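the data hazard and data dependency cases above can be illustrated with a small check for read-after-write conflicts between neighbouring instructions. the instruction tuple layout `(dest, src1, src2)`, the function names and the default one-instruction hazard window are assumptions of this sketch, not the authors' method; real pipelines also use forwarding, which is not modelled here.

```python
# sketch: detect read-after-write (raw) hazards in a straight-line instruction list.
# each instruction is (dest, src1, src2) with register names as strings (illustrative format).

def raw_hazards(instructions, window=1):
    """Return (producer_index, consumer_index, register) for every raw hazard
    where the consumer follows the producer by at most `window` instructions."""
    hazards = []
    for i, (dest, _, _) in enumerate(instructions):
        for j in range(i + 1, min(i + 1 + window, len(instructions))):
            _, src1, src2 = instructions[j]
            if dest in (src1, src2):
                hazards.append((i, j, dest))
    return hazards

def count_stalls(instructions, window=1):
    # deliberately simple model: one stall cycle per detected hazard, no forwarding
    return len(raw_hazards(instructions, window))

program = [
    ("r1", "r2", "r3"),  # r1 = r2 op r3
    ("r4", "r1", "r5"),  # needs r1 from the previous instruction -> hazard
    ("r6", "r7", "r8"),  # independent of its predecessor
    ("r9", "r6", "r4"),  # needs r6 from the previous instruction -> hazard
]
print(raw_hazards(program))
print(count_stalls(program))
```

widening `window` models a deeper pipeline in which a result stays unavailable for more than one following instruction.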
the main advantages of pipelining are a higher clock frequency and increased system throughput. however, this technique also has disadvantages, primarily the greater complexity of the design and the increased latency of individual instructions.
4.1.2. in-order vs out-of-order processors
a processor that executes instructions sequentially usually uses resources inefficiently, resulting in poor performance. two approaches can be used to improve processor performance. the first is to execute different sub-steps of consecutive instructions simultaneously, or even to execute whole instructions simultaneously. the second is out-of-order instruction execution, in which instructions are executed in an order different from the original one [1], [19]. the instruction order is determined by the compiler, but it is not necessary to execute instructions in that order. they may be: a) issued in order and completed in order; b) issued in order, but completed out of order; c) issued out of order, but completed in order; and d) issued out of order and completed out of order [1]. first- and second-generation microprocessors process instructions in order. an in-order processor performs the following steps:
1. it retrieves instructions from the program memory.
2. if the input operands are available in the register file, it dispatches the instruction to the execution unit.
3. if the input operands are not available during the current clock cycle, the processor waits for them. this case occurs when the processor retrieves data from slow memory. this implies that instructions are statically scheduled.
4. the instruction is then executed by the appropriate execution unit.
5. after that, the result is written back into the destination register.
out-of-order execution is an approach used in third-, fourth-, and fifth-generation microprocessors.
this approach significantly reduces latency when executing instructions. its distinguishing feature is that the processor executes instructions in the order in which their data or operands become available, not in the original order generated by the compiler. in this way, the processor avoids wait states, because during the execution of the current instruction it obtains the operands for the next instruction. for example, let i1 and i2 be two instructions, where i1 is the first and i2 is the second. in out-of-order execution, the processor may execute i2 before i1 is completed. this feature improves cpu performance, as it allows execution with less latency. the steps performed by an out-of-order processor are as follows:
1. it retrieves instructions from the program memory.
2. instructions are sent to an instruction queue (also called an instruction buffer).
3. an instruction waits in the queue until its input operands are available, and leaves the queue when they are. this implies that instructions are dynamically scheduled.
4. the instruction is sent to the appropriate execution unit for execution.
5. the results are then queued.
6. once all previous instructions have written their results back to the register file, the current result is written back to the destination register.
the main goal of out-of-order instruction execution is to increase the amount of ilp. note, however, that the hardware complexity of out-of-order processors is significantly higher than that of in-order ones.
5. scalar vs superscalar processors
a scalar processor is one in which instructions are executed in a pipeline, as presented in fig. 1a), but only a single instruction can be fetched or decoded in a single cycle. a superscalar processor, on the other hand, can have multiple parallel instruction pipelines [20], [21]. a 2-way superscalar processor (see fig.
1b)) can fetch two instructions per cycle and supports two parallel pipelines.
fig. 1 scalar processor (a), superscalar processor (b)
the terms "scalar" or "superscalar" are not to be confused with "single-core/multi-core". scalars are single-core processors, while superscalars may be either single- or multi-core [22]. the key point is that scalars cannot perform more than one operation (i.e., carry out more than one instruction) per clock cycle, but superscalars can perform up to two instructions in some cases.
this means that if a cpu has three cores, one of them an old scalar processor, and an application uses all three cores, the scalar core will be at most half as fast as it would be if it were 2-way superscalar. the main thing to remember is that certain instruction sets are better suited to certain optimizations. superscalars can execute basic operations such as add and load on separate registers simultaneously, whereas a scalar processor has to complete one operation before moving on to the next. for example, a scalar processor may be able to run multiple threads, but they all share the same single pipeline and must take turns. superscalars can provide higher performance because independent instructions can be dispatched to separate execution units in the same cycle. the main question in superscalar processing is how many instructions can be issued per cycle. if a processor can issue k instructions per cycle, it is called a k-degree superscalar processor. for a superscalar processor to take full advantage of parallelism, k instructions must be executable in parallel. so, the key idea of a superscalar processor is to exploit more instruction level parallelism (ilp) [18]. the implementation of superscalar processing requires special hardware (see fig. 2 for more details). the width of the data path increases with the degree of superscalar processing. for instance, if a 2-degree superscalar processor is used and the instruction size is 32 bits, then 64 bits are fetched from the instruction memory at a time and 2 instruction registers are required.
fig. 2 comparison between a scalar and a superscalar processor. notice: the superscalar processor implements one pipeline dedicated to memory access and one pipeline for arithmetic operations.
the main feature of superscalar processors is the ability to issue more than one instruction in each cycle (usually up to 8 instructions).
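to make the notion of a k-degree superscalar processor more concrete, the sketch below groups a sequential instruction list into issue packets of at most k instructions, closing a packet as soon as an instruction depends on a result produced inside it. the greedy in-order policy and the `(dest, src1, src2)` instruction format are simplifying assumptions chosen for illustration only.

```python
# sketch: greedy in-order issue for a k-degree superscalar processor.
# instructions are (dest, src1, src2) register-name tuples (illustrative format).

def issue_packets(instructions, k):
    """Group instructions into per-cycle packets of up to k independent instructions."""
    packets = []
    current = []
    written = set()  # registers written by instructions already in the current packet
    for dest, src1, src2 in instructions:
        depends = src1 in written or src2 in written or dest in written
        if len(current) == k or depends:
            # the packet is full, or this instruction conflicts with it: start a new cycle
            packets.append(current)
            current, written = [], set()
        current.append((dest, src1, src2))
        written.add(dest)
    if current:
        packets.append(current)
    return packets

program = [
    ("r1", "r2", "r3"),
    ("r4", "r5", "r6"),  # independent of the first -> same cycle when k >= 2
    ("r7", "r1", "r4"),  # needs r1 and r4 -> next cycle
    ("r8", "r7", "r2"),  # needs r7 -> next cycle
    ("r9", "r2", "r3"),  # independent of r8 -> can share its cycle
]
for k in (1, 2, 4):
    print(k, len(issue_packets(program, k)))
```

for this short program, going from k = 2 to k = 4 does not reduce the number of issue cycles, because the dependencies between the instructions, not the issue width, limit the available parallelism.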
note that instructions can be reordered to make better use of the processor architecture. in order to reduce data dependency in superscalar processing, more complex parallel hardware is necessary. hardware parallelism ensures the availability of more resources and is one of the ways to exploit parallelism. an alternative way is to increase ilp by transforming the source code using an optimizing compiler. typical commercial superscalar processors are the ibm rs/6000, dec 21064, mips r4000, powerpc, pentium, etc. very-long-instruction-word (vliw) processors are a variant of superscalar processors because they can process multiple instructions in all pipeline stages [23]. the vliw processor has the following features: (a) it is an in-order processor; (b) the binary code defines which instructions will be executed in parallel. the size of the vliw instruction word can be hundreds of bits. the compiler forms the layout of the vliw instruction by compacting the instruction words of the source program. the processor must have a sufficient number of hardware resources to execute all the operations specified in a vliw word simultaneously. for instance, as shown in fig. 3, one vliw instruction word is compacted to contain an l/s operation, fp addition, fp multiplication, branch, and integer alu operation.
fig. 3 a) vliw instruction word; b) vliw processor
all functional units (shown in figure 3 b)) are implemented according to the vliw instruction word (given in figure 3 a)). a large register file is shared by all functional units in the processor. the parallelism in instructions and data flow is specified at compile time. trace scheduling is used for handling branch instructions. it is based on the prediction of branch decisions at compile time, where the prediction relies on heuristic methods.
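the compaction step performed by a vliw compiler can be sketched as filling the fixed slots of an instruction word, one slot per functional unit. the slot names follow the units listed for fig. 3, while the operation format, the `nop` filler and the simple first-fit policy are assumptions of this illustration; the sketch also assumes that the operations passed to it are already known to be independent, so no dependency analysis is performed.

```python
# sketch: pack independent operations into vliw instruction words, one slot per unit.
SLOTS = ("load_store", "fp_add", "fp_mul", "branch", "int_alu")

def pack_vliw(operations):
    """Pack (unit, text) operations, assumed mutually independent, into vliw words.
    A new word is started whenever the required slot is already occupied."""
    words = []
    current = {}
    for unit, text in operations:
        if unit not in SLOTS:
            raise ValueError(f"unknown functional unit: {unit}")
        if unit in current:
            words.append(current)
            current = {}
        current[unit] = text
    if current:
        words.append(current)
    # unused slots are filled with explicit no-operations
    return [tuple(word.get(slot, "nop") for slot in SLOTS) for word in words]

ops = [
    ("load_store", "ld f1, 0(r1)"),
    ("fp_add", "fadd f2, f3, f4"),
    ("int_alu", "add r2, r2, 8"),
    ("fp_add", "fadd f5, f6, f7"),  # fp adder slot already used -> new word
    ("branch", "bne r2, r3, loop"),
]
for word in pack_vliw(ops):
    print(word)
```

the `nop` entries in the printed words show how slots that cannot be filled still occupy space in every instruction word.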
table 2 gives a comparison between vliw and superscalar processors from the aspect of ilp implementation. in conclusion, when we compare vliw and superscalar processors, we can say that vliw differs from a superscalar machine in the following: a) the instruction decoding process is simpler; b) ilp is higher but code density is lower; and c) object-code compatibility with a larger family of nonparallel machines is lower.
table 2 instruction-level parallelism: vliw vs superscalar
superscalar vliw: instruction scheduling mechanism is implemented with complex hardware; more functional units are needed; instruction code word is larger; complex compiler is needed; out-of-order execution; ▪ there is logic that checks the dependencies between parallel instructions and checks the hazards of the working functional units; if a compiler that performs efficient code optimization is not implemented, then more effort is needed to create executable code; longer execution time and higher power consumption are potential consequences; hardware is simpler due to the use of predicted execution to avoid branching; more efficient execution of pipeline-dependent code
a simple hardware structure and instruction set are the crucial advantages of the vliw architecture. the vliw processor is suitable for scientific applications where the program behavior is more predictable. super-pipelining is an alternative to the superscalar approach for increasing performance. in this approach, pipeline stages are segmented into n distinct non-overlapping parts, each of which can execute in 1/n of a clock cycle, i.e., super-pipelining is based on dividing the stages of a pipeline into sub-stages and thus increasing the number of instructions that are active in the pipeline at a given moment. by dividing each stage into two, the cycle period τ is reduced to half, τ/2, so that at maximum capacity a result is produced every τ/2 s (see fig. 4).
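the cycle counts discussed for fig. 4 can be reproduced with a simple timing model. the sketch assumes a 5-stage pipeline with no stalls (the stage count is inferred from the figure of 10 cycles for 6 instructions and is not stated explicitly in the text); the function names are illustrative.

```python
import math

def base_cycles(n, stages):
    # one instruction enters per cycle; the last one needs `stages` cycles to drain
    return stages + (n - 1)

def superpipelined_cycles(n, stages, degree):
    # each stage is split into `degree` sub-stages, so a new instruction
    # starts every 1/degree of a base clock cycle
    return stages + (n - 1) / degree

def superscalar_cycles(n, stages, degree):
    # `degree` instructions start together in every base clock cycle
    return stages + math.ceil(n / degree) - 1

n, stages = 6, 5
base = base_cycles(n, stages)
sp = superpipelined_cycles(n, stages, 2)
ss = superscalar_cycles(n, stages, 2)
print(base, sp, ss)                  # cycle counts in base clock cycles
print(1 - sp / base, 1 - ss / base)  # relative reduction in execution time
print(base / sp, base / ss)          # speedup factors
```

under these assumptions the model gives 10, 7.5 and 7 cycles for the base, super-pipelined and superscalar organizations, respectively.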
for a given architecture and the corresponding instruction set there is an optimal number of pipeline stages; increasing the number of stages beyond this limit reduces the overall performance [24]. by analyzing fig. 4 we can observe the following:
1. base pipeline: i) issues one instruction per clock cycle; ii) can perform one pipeline stage per clock cycle; iii) several instructions are executing concurrently; iv) only one instruction is in its execution stage at any one time; and v) the total time to execute 6 instructions is 10 cycles.
2. super-pipelined implementation: i) capable of performing two pipeline stages per clock cycle; ii) each stage is split into two non-overlapping parts, each executing in half a clock cycle; iii) the total time to execute 6 instructions is 7.5 cycles; and iv) the execution time is reduced by 1 − 7.5/10 = 25% (a speedup of 10/7.5 ≈ 1.33).
3. superscalar implementation: i) capable of executing two instances of each stage in parallel; ii) the total time to execute 6 instructions is 7 cycles; and iii) the execution time is reduced by 1 − 7/10 = 30% (a speedup of 10/7 ≈ 1.43).
fig. 4 comparison of superscalar and super-pipeline approaches
from fig. 4 we can notice that: a) the super-pipelined and the superscalar implementations have the same number of instructions executing at the same time; b) however, the super-pipelined processor falls behind the superscalar processor; and c) parallelism enables greater performance. so, the superscalar architecture is the better solution for further improving speed.
6. vector processor
a vector is an ordered set of scalar data items of the same type, which can be floating-point numbers, integers, or logical values. vector processing is arithmetic or logical computation applied to vectors, whereas in scalar processing only one data item or one pair of data items is processed at a time. therefore, vector processing is faster than scalar processing.
when scalar code is converted to vector form, the process is called vectorization. a vector processor is a special accelerator building block designed to handle vector computations [25], [26]. there are the following types of vector instructions:
a) vector-vector instructions: vector operands are fetched from vector registers and, after processing, the generated results are stored in another vector register. these instructions are marked with the following function mappings: $p_1: v_1 \rightarrow v_2$ and $p_2: v_1 \times v_2 \rightarrow v_3$. for example, the p1 type denotes a vector square root, and the p2 type the addition (or multiplication) of two vectors.
b) vector-scalar instructions: scalar and vector operands are fetched and the result is stored in a vector register. these instructions are denoted by the following function mapping: $p_3: s \times v_1 \rightarrow v_2$, where s is the scalar item. for example, the p3 type denotes vector-scalar subtraction or division.
c) vector-reduction instructions: this type of instruction is used when operations on vectors are reduced to scalar items as the result. these instructions are represented by the following function mappings: $p_4: v_1 \rightarrow s_1$ and $p_5: v_1 \times v_2 \rightarrow s_2$. for example, the p4 type corresponds to finding the maximum, the minimum or the sum of all the elements of a vector, while p5 is used for the dot product of two vectors.
d) vector-memory instructions: this type of instruction is used when vector operations with memory m are performed. these instructions are marked with the following function mappings: $p_6: m_1 \rightarrow v_1$ and $p_7: v_1 \rightarrow m_2$. for example, the p6 type corresponds to a vector load and p7 to a vector store operation.
typical examples of vector operations are the following:
1. $v_2 \leftarrow \overline{v_1}$ ; complement all elements
2. $s \leftarrow v_1$ ; min, max, sum
3. $v_3 \leftarrow v_2 \circ v_1$ ; vector addition, multiplication, division
4. $v_2 \leftarrow v_1 \circ s$ ; multiply or add a scalar to a vector
5. $s \leftarrow v_2 \cdot v_1$ ; calculate an element of a matrix
vector processing with pipelining: due to the repetition of the same computation on different operands, vector processing is very suitable for pipelining. a vector processor performs better when the vectors are longer, but long vectors make storing and manipulating vectors more difficult.
efficiency of vector processing over scalar processing: as already mentioned, a sequential computer processes a vector item by item. therefore, to process a vector of length n on a sequential computer, the operation must be divided into n scalar steps that are executed one by one. for example, consider the addition of two vectors of length 1000: $c = a + b$. the sequential computer implements this operation with 1000 add instructions in the following way: $c[1] = a[1] + b[1]$, $c[2] = a[2] + b[2]$, ..., $c[1000] = a[1000] + b[1000]$. a vector processor does not divide the vectors into 1000 add statements to perform the identical operation, because it has a set of vector instructions that allow the operation to be specified in a single vector instruction as: $c(1:1000) = a(1:1000) + b(1:1000)$. the comparative execution of the addition instruction by a scalar and a vector processor is presented in fig. 5.
fig. 5 scalar vs vector operations execution
thus, the main advantage of vector processing over scalar processing is the elimination of the overhead caused by loop control.
properties of vector instructions:
a) a single instruction implies many operations, which reduces the number of instruction fetches and decodes.
b) the operations are independent of each other: i) simple design; ii) multiple operations can be run in parallel.
c) data hazards have to be checked once per vector instruction and not for each element operation.
d) control hazards are reduced because there are fewer branches.
e) the memory access pattern is known.
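the saving in instruction fetches and loop control can be made concrete with a small count. the sketch below adds two vectors once with an element-by-element loop and once in strips of a maximum vector length; the value 64 for that length and the counting of one instruction per strip are assumptions for illustration, and python lists merely stand in for vector registers.

```python
# sketch: count the add instructions needed by a scalar loop and by a strip-mined vector add.

def scalar_add(a, b):
    """Element-by-element addition; returns (result, number of scalar add instructions)."""
    c = []
    instructions = 0
    for x, y in zip(a, b):
        c.append(x + y)
        instructions += 1  # one scalar add per element
    return c, instructions

def vector_add(a, b, max_vector_length=64):
    """Strip-mined addition; one vector add handles up to max_vector_length elements."""
    c = []
    instructions = 0
    for start in range(0, len(a), max_vector_length):
        stop = start + max_vector_length
        c.extend(x + y for x, y in zip(a[start:stop], b[start:stop]))
        instructions += 1  # one vector add per strip
    return c, instructions

a = list(range(1000))
b = [2 * x for x in a]
c_scalar, n_scalar = scalar_add(a, b)
c_vector, n_vector = vector_add(a, b)
print(c_scalar == c_vector, n_scalar, n_vector)
```

with a hypothetical vector length of 1000 or more, the whole addition would be expressed by a single vector instruction, as in the example in the text.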
nowadays, a large number of microprocessors contain a set of instructions that manipulate relatively small vectors (e.g., up to 8 single-precision fp elements in the intel avx extensions [27]). these instructions are often referred to as simd (single instruction, multiple data) instructions. table 3 shows the comparative properties (advantages vs disadvantages) of vector processors.
table 3 comparative properties of vector processors
advantages of vector processors:
▪ instruction bandwidth is lower: fetch and decode phases are reduced
▪ main memory addressing is easier: load/store units use known patterns for memory access
▪ memory wastage is eliminated: no cache misses, latency only occurs during vector loading
▪ control hazards logic is simple: loop-related control hazards are eliminated
▪ scalable platform: a larger number of hardware resources increases performance
▪ code size is reduced: n operations are described by a single instruction
disadvantages of vector processors:
▪ works (only) if parallelism is regular (data/simd parallelism)
▪ very inefficient if parallelism is irregular
▪ memory (bandwidth) can easily become a bottleneck, especially if: a) the compute/memory operation balance is not maintained; b) data is not mapped appropriately to memory banks
vector processing applications include problems that can be efficiently formulated in terms of vectors, such as: a) long-range weather forecasting; b) petroleum exploration; c) seismic data analysis; d) medical diagnosis; e) aerodynamics and space flight simulations; f) artificial intelligence and expert systems; g) mapping the human genome; and h) image processing.
7. multicore processors
nowadays, large uniprocessors no longer scale in performance, because conventional superscalar techniques for instruction issue allow only a limited amount of parallelism to be extracted from the instruction flow.
in addition, it is not possible to further increase the clock speed, because the power dissipation would become prohibitive. for more than thirty years (the period 1972-2003, often called the era of intensive microarchitecture processor design), a variety of modifications were made to achieve one of two goals: 1) increasing the number of instructions that can be issued per cycle; and 2) increasing the clock frequency faster than moore's law and dennard's rule would normally allow [28]. pipelining and super-pipelining of individual instruction execution into a sequence of stages has allowed designers to increase clock rates. superscalar processors were designed to execute multiple instructions from an instruction stream on each cycle. they function by dynamically examining sets of instructions from the stream to find those that are candidates for parallel execution on each cycle. these can often be executed out of order with respect to the original sequence. this concept is referred to as instruction-level parallelism (ilp). typical instruction streams have only a limited amount of usable parallelism among instructions [1], [29], so superscalar processors that can issue more than about four instructions per cycle achieve very little additional benefit on most applications. today, advances in processor core development have slowed dramatically because of a simple physical limit: power dissipation. in modern pipelined and superscalar processors, typical high-end power exceeds 100 w. in order to bypass the mentioned design constraints, processor manufacturers are now switching to a new microprocessor design paradigm: multicore (also called chip multiprocessor, or cmp for short) and many-core.
a multi-core processor is a single computing component with two or more independent processing units, called "cores", each made up of computation units and caches [30]. cores are functional units that read and execute program instructions. multiple cores can run multiple instructions (ordinary cpu instructions) at the same time, increasing overall speed for programs suitable for parallel computing. coupling multiple cores on a single chip should achieve the performance of a single faster processor. the individual cores of a multi-core processor need not run as fast as the highest performing single-core processors, but in general they improve overall performance by executing more tasks in parallel [31]. the increase in performance can be seen by considering the way single-core and multi-core processors execute programs. a single-core processor that runs multiple programs assigns a time slice to each program and runs them one after another. if one of the processes lasts longer, then all the other processes start to lag behind. with a multi-core processor, however, tasks that can run in parallel are each executed by a separate core at the same time. this improves performance. depending on the application requirements, multi-core processors can be implemented in different ways: as a group of heterogeneous cores, a group of homogeneous cores, or a combination of both. in a homogeneous core architecture, all cores in the processor are identical [32], and in order to improve overall performance they break down a computationally intensive application into less intensive parts and run them in parallel [4]. significant advantages of a homogeneous multi-core processor are reduced design complexity, reusability, and reduced verification effort [33]. heterogeneous architectures, on the other hand, consist of dedicated application-specific processor cores that run different kinds of applications [34].
cores in multi-core systems, as in single-processor systems, can implement architectures such as vliw, superscalar, vector, or multithreading. multicore processors are used in many application domains, such as general purpose, embedded, multimedia, network, digital signal processing (dsp) and graphics (gpu). they can be harnessed as complex cores that address computationally intensive applications, or as remedial cores that deal with less computationally intensive applications [24]. software algorithms and their implementations greatly influence the performance improvement obtained by using multi-core processors. in particular, possible gains are limited by the fraction of the software that can run in parallel simultaneously on multiple cores. at best, parallel problems can achieve acceleration factors close to the number of cores, or even more if the problem is split finely enough to fit in the local core cache(s), because the number of accesses to the much slower main system memory is then reduced. however, most applications do not achieve such speedups unless programmers make the effort to restructure the whole problem. currently, software parallelization is a significant ongoing research topic [4]. a comparison between single-core and multi-core processors is given in table 4.

table 4 comparison of single-core processor and multi-core processor
parameter | single-core processor | multi-core processor
number of cores on a die | single | multiple
instruction execution | single instruction is executed at a time | multiple instructions are executed by using multiple cores
gain | speed up every program | speed up the programs intended for multi-core processor
performance | depends on the clock frequency | depends on the clock frequency, number of cores and program
examples | 80386, 80486, amd 29000, amd k6, pentium i, ii, iii etc. | core-2-duo, athlon 64 x2, i3, i5, i7 etc.

7.1. multicore topologies
in what follows we point out four types of multicore topologies: symmetric (or homogeneous), asymmetric, dynamic, and composed (alternatively referred to as "fused" or "heterogeneous") [20], [35].
the symmetric multicore topology is composed of multiple copies of the same core that function at the same frequency and voltage. in this topology, resources such as the power and the area budget are evenly distributed over all cores. figure 6a) presents a symmetric multicore processor in which each block is a basic core equivalent (bce) and contains l1 and l2 caches as constituents. the l3 cache and on-chip network are not shown.
the asymmetric multicore topology is composed of one large monolithic core and a number of identical small cores. this topology uses the large high-performance core to execute the serial part of the code, and uses the small cores together with the large core to exploit the parallel part of the code. figures 6b) and 6c) present asymmetric multicore processors with: b) one complex core and 12 bces; c) two complex cores and 8 bces.
the dynamic multicore topology is a modification of the asymmetric topology. parallel parts of the code are executed by the small cores while the large core is off, and the serial part of the code is executed only on the large core while the small cores are inoperative. figures 6d) and 6e) present dynamic multicore processors with: d) 16 bces or one large core; e) four cores and frequency scaling using the power budget of 8 bces (currently one core is at full core thermal design point (tdp), two cores are at 0.5 core tdp, and one core is switched off).
fig.
6 a) symmetric multicore processor; b) and c) asymmetric multicore processors; d) and e) dynamic multicore processors; f) heterogeneous multicore
the heterogeneous (composed) multicore topology is composed of a set of small cores that are logically combined to assemble a large high-performance core for serial code execution. in either the serial or the parallel case, the large core or the small cores are used exclusively. figure 6f) presents a heterogeneous multicore with a large core, four bces, and two accelerators or co-processors each of types a, b, d and e.
some of the limits of multicore processors are the following [36]:
1. present-day cmps are designed to exploit both instruction-level parallelism (ilp) and thread-level parallelism (tlp). in such solutions, the number of processors and the complexity of each processor are fixed at design time.
2. performance improvement achieved mainly by increasing the number of cores does not always lead to an effective design solution due to: a) the dark silicon problem (not all cores can be powered at the same time); and b) declining yield in tlp.
nowadays, multicore processors are everywhere and single-threaded programs are no longer an option. in essence, we moved from single core to multicore not because the software community was ready for concurrency but because the hardware community could not afford to neglect the power issue. today, most personal electronic devices contain multicore processors. therefore, in order to take advantage of multiple cores on such machines, creating parallel programs is crucial to achieving high performance and enabling large-scale data processing. in addition to multicore technology (mainly realized as shared memory systems) [37], parallel computing can take the form of distributed systems.
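the observation that the gain from adding cores is limited by the fraction of the software that can run in parallel is commonly quantified with amdahl's law, which the text does not state explicitly. the following is a minimal sketch under the assumption of a fixed parallel fraction p and n identical cores; the function name `amdahl_speedup` is illustrative only.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup under Amdahl's law: 1 / ((1 - p) + p / n).

    Assumption: the workload splits into a strictly serial part (1 - p)
    and a perfectly parallel part p, with no communication overhead.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # With 90% parallel code the speedup saturates below 10x,
    # no matter how many cores are added.
    for n in (1, 2, 4, 8, 16, 1024):
        print(n, round(amdahl_speedup(0.9, n), 2))
```

the model ignores cache effects, so it cannot reproduce the better-than-linear speedups that may occur when each core's share of the problem fits in its local cache.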
unlike multicore shared memory systems, distributed systems can solve problems that do not fit in the memory of a single machine. in contrast to multicores with shared memory, communication and data replication in distributed systems cause high additional overheads. compared to distributed memory systems, multicores with shared memory are more efficient for programs that fit in memory. this efficiency is reflected in reduced hardware, cost, and power consumption [3]. today’s multicore cpus use most of their transistors on processing logic and cache memory. during operation most of the power is consumed by non-computational units. an alternative strategy is to use heterogeneous architectures, i.e. multi-core architectures in combination with accelerator cores. accelerators are specialized hardware cores designed with fewer transistors, operating at lower frequencies than traditional cpus, and enabling increased system performance.

8. multithreaded processors
as the hardware complexity and capabilities of modern processors have increased, so have the demands for higher performance. this has required a corresponding increase in the efficiency with which cpu resources are used. the main idea is that the time during which the processor would otherwise be idle, waiting to perform certain tasks, is used to perform other activities. to achieve this goal, software designers introduced operating system support for running pieces of programs called threads. threads are small tasks that can run independently [38]. during execution, each thread gets its own time slice. as a consequence, the processor time is efficiently utilized. fig. 7 shows multithreading execution on a single processor and on a two-way superscalar processor.
fig. 7 multithreading in a cpu: (a) single processor running a single thread. (b) single processor running several threads. (c) two-way superscalar processor running a single thread.
(d) two-way superscalar processor running multiple (two) threads.

8.1. difference between multitasking, multiprocessing and multithreading
several threads make up one process (task) and share access to processor resources. this operating system concept, known as multithreading, allows one thread to run while another is waiting for an event. contemporary commercially available pc machines and servers, mainly based on intel or amd processors, that run microsoft windows, support multithreading [28]. each process (task) provides the resources needed to execute a program. the process is assigned a virtual address space, executable code, open handles to system objects, a security context, a unique process identifier, environment variables, a priority class, working set sizes, and at least one thread of execution. a thread is the entity within a process that can be scheduled for execution. all threads that are part of a process share its previously mentioned resources. in addition, each thread maintains exception handlers, a scheduling priority, thread-local storage, a thread identifier, and structures that the system uses to preserve the context of the thread. the thread context consists of a set of machine registers, a kernel stack, an environment block, and a user thread stack. each thread is characterized by: 1) thread id; 2) register state, including pc and sp; 3) stack; 4) signal mask; 5) priority; and 6) thread-private memory. threads share the instructions and data of the process to which they belong. all threads in the process can see changes made to the shared data by any thread. threads in the same process can interact with each other without involving the operating environment. multitasking is a mode of operation where the cpu performs multiple tasks at the same time (see fig. 8).
it is characterized by the cpu switching between multiple tasks so that users can interact with each program. unlike in multithreading, in multitasking the processes use separate memory and resources. in multitasking, cpu switching between tasks is relatively fast.
fig. 8 multitasking operating system for single processor
multithreading is an operating mode in which many threads are active during process execution. in this manner, higher computing power is achieved. in multithreading (see fig. 9), the cpu executes many threads that are part of a process at a time. the threads of a process share the same memory and resources. a property of multithreading is that two or more threads can run concurrently. therefore, multithreading is also referred to as concurrency [39].
fig. 9 multithreading system for single processor
the difference between multitasking and multithreading [40] is presented in table 5.

table 5 difference between multitasking and multithreading
no. | multitasking | multithreading
1. | cpu performs many user tasks | within a process many threads are created
2. | cpu switching among the tasks | cpu switching among the threads
3. | processes use separate memory | threads of a process share the same memory
4. | multiprocessing can be involved | multiprocessing cannot be involved
5. | cpu executes many tasks at a time | cpu executes many threads at a time
6. | each process has separate resources | threads of a process share the same resources
7. | multitasking is slower | multithreading is faster
8. | termination of process is longer | termination of thread is shorter

a computer system composed of two or more processors is called a multiprocessing system (see fig. 10). in this way, the computing speed of the system is increased. in such systems, each processor has its own registers and main memory. the division of processes and resources among processors is done dynamically.
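the memory distinction drawn above (threads of one process share memory, whereas separate processes do not) can be sketched as follows. this is a minimal illustration and not taken from the paper: the helper names `thread_demo` and `process_demo` are hypothetical, and the second process is simply a fresh python interpreter started through the standard `subprocess` module.

```python
import subprocess
import sys
import threading


def thread_demo() -> int:
    """A thread updates an object owned by the process that created it."""
    shared = {"value": 0}

    def bump() -> None:
        # Runs in the same address space, so it mutates the same dict.
        shared["value"] += 1

    worker = threading.Thread(target=bump)
    worker.start()
    worker.join()
    return shared["value"]


def process_demo() -> tuple[int, int]:
    """A separate process works on its own copy; the parent's data is untouched."""
    shared = {"value": 0}
    # The child builds and updates its *own* dict in its own address space.
    child_code = "shared = {'value': 0}; shared['value'] += 1; print(shared['value'])"
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        capture_output=True,
        text=True,
        check=True,
    )
    return shared["value"], int(result.stdout)


if __name__ == "__main__":
    print("value seen after thread ran:", thread_demo())
    print("(parent value, child value):", process_demo())
```

to let processes exchange data, an explicit inter-process communication mechanism (pipes, sockets, shared-memory segments) is needed, which is the time overhead that the text attributes to multiprocessing.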
the main characteristics of multiprocessing are the following: i) the organization of memory determines the type of multiprocessing; ii) system reliability is improved; and iii) decomposing programs into tasks that can execute in parallel leads to a performance increase. advantages of multiprocessing are the following: a) more activity can be performed in a shorter time; b) code is simple; c) the system is composed of multiple cpus and cores; d) synchronization is simplified; e) child processes are interruptible/killable; and f) it is cost-efficient because processors share resources. disadvantages of multiprocessing are: a) inter-process communication involves time overhead; and b) larger memory is needed. the main characteristics of multithreading are the following: j) each thread is executed in parallel with the others; and jj) program performance is increased since threads share the same memory area.
fig. 10 multiprocessing system
advantages of multithreading are the following: a) the address space is shared by all threads; b) a lower amount of memory is needed; c) cost-efficient and fast communication between threads; d) fast context switching; e) suitable for input/output-oriented applications; and f) switching time between two threads is short. disadvantages of multithreading are the following: a) threads are not interruptible/killable; b) manual synchronization is often necessary; and c) program code is harder to understand, and testing and debugging are harder due to race conditions. both multiprocessing and multithreading as operating modes increase computing power [41]. a multiprocessing system is composed of multiple processors, whereas multithreading comprises multiple threads of a single process.
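the need for manual synchronization listed among the disadvantages of multithreading can be illustrated with a shared counter. this is a hedged sketch rather than the authors' example: the function name `count_with_lock` is hypothetical, and a standard `threading.Lock` is assumed as the synchronization primitive.

```python
import threading


def count_with_lock(num_threads: int, increments: int) -> int:
    """Several threads increment one shared counter under a lock."""
    total = 0
    lock = threading.Lock()

    def worker() -> None:
        nonlocal total
        for _ in range(increments):
            # `total += 1` is a read-modify-write sequence; holding the lock
            # ensures no other thread interleaves between the read and the write.
            with lock:
                total += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return total


if __name__ == "__main__":
    print(count_with_lock(num_threads=4, increments=10_000))
```

without the lock the final value is not guaranteed, because updates from different threads may overwrite each other; such timing-dependent behaviour is the kind of race condition that makes multithreaded code hard to test and debug.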
table 6 multiprocessing vs multithreading
multiprocessing | multithreading
multiple cpus increase computing power | multiple threads of a single process increase computing power
multiple processes are executed concurrently | multiple threads of a single process are executed concurrently

9. multithreading: execution model
about a decade after software architects introduced threads, hardware architects designed multithreaded processors, which can run more than one thread on some of their cores at the same time. a multithreaded architecture is one in which a single processor has the ability to follow multiple streams of execution without the aid of software context switches. for a conventional processor to stop executing one thread and start executing instructions from another thread, special software is required. the role of this software is to transfer the state of the running thread to memory (usually to stack memory) and then load the state of the selected other thread into the processor. this process usually requires hundreds (or thousands) of cycles, especially if the operating system is involved. a multithreaded architecture, on the other hand, can access the state of multiple threads in, or near, the processor core. this allows the multithreaded architecture to switch quickly between threads, and potentially to use processor resources more efficiently and effectively [42] (see fig. 11 for illustration).
fig. 11 multithreaded pipeline example
multicore and multithreading can be used simultaneously because they are two orthogonal concepts. for instance, the intel core i7 processor has multiple cores, and each core is two-way multithreaded [43]. when multiple threads are executed simultaneously, those threads use mostly different hardware resources in a multicore, while they share most of the hardware resources in a multithreaded processor.
in order to achieve this, a multithreaded architecture must be able to store the state of multiple threads in hardware. this storage is referred to as hardware contexts, where the number of supported hardware contexts defines the level of multithreading (the number of threads that can share the processor without software intervention). the state of a thread is primarily composed of the program counter (pc), the contents of general-purpose registers, and special purpose and program status registers. it does not include memory (because that remains in place), or dynamic state that can be rebuilt or retained between thread invocations (branch predictor, cache, or tlb contents).

9.1. instructions issue
multithreaded processors are divided into two groups depending on how many threads can issue instructions in a given cycle. when instructions can be issued only from a single thread in a given cycle, explicit multithreading is used. in that case, the following two main techniques can be applied [44] (see fig. 12): i) coarse-grain multithreading (cgmt) or blocked multithreading (bmt); and ii) fine-grain multithreading (fgmt) or interleaved multithreading (imt). when instructions can be issued from multiple threads in a given cycle, simultaneous multithreading (smt) is used.
fig. 12 explicit multithreading
coarse-grain multithreading, also called blocked multithreading or switch-on-event multithreading, has multiple hardware contexts associated with each processor core [45]. the instructions of a thread are executed successively, but when an event occurs that may cause latency, a context switch is performed. instructions of one thread continue to be executed until a long-latency event occurs, such as a branch or a cache miss (see fig. 13). when such a delay is encountered, the processor switches to another thread, which is likewise executed until a long delay occurs. this process is constantly repeated.
this strategy makes it possible to hide long delays, but does not hide shorter delays for which the cost of switching is higher than the cost of tolerating the delay. a hardware context is the program counter, register file, and other data required to enable a software thread to execute on a core. a coarse-grain multithreaded processor operates similarly to a software time-shared system, but with hardware support for fast context switching, allowing it to switch within a small number of cycles (e.g., less than 10) rather than thousands or tens of thousands.
fig. 13 coarse-grain multithreading
fine-grain multithreading, also called interleaved multithreading, also has multiple hardware contexts associated with each core, but can switch between them with no additional delay. an instruction of another thread is fetched and entered into the execution pipeline at each cycle, and therefore the processor can execute an instruction or instructions from a different thread in each cycle. unlike a coarse-grain multithreaded processor, then, a fine-grain multithreaded processor has instructions from different threads active in the processor at once, within different pipeline stages [46]. but within a single pipeline stage (or, given our particular definitions, within the issue stage) only one thread is represented. in this approach, the cpu executes one instruction of each thread in succession (one after the other) before going back, in a circular way, to execute the next instruction of the first thread. during execution, the cpu skips any thread that is stalled, i.e. waiting for a long-latency event to complete. in this manner, the processor is kept busy because the pipeline is almost always full. such a processor has a significantly more complex hardware structure because it needs a separate copy of the register file and program counter for each thread.
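the round-robin issue policy described above (one instruction per cycle, taken from the next thread in circular order, skipping stalled threads) can be sketched as a small simulation. this is an illustrative model under stated assumptions rather than a description of any real processor: the operation names `"alu"` and `"load"`, the parameter `load_latency`, and the function name `fgmt_schedule` are all hypothetical, and a load is assumed to block its own thread for a fixed number of cycles.

```python
from collections import deque
from typing import Optional


def fgmt_schedule(
    threads: dict[str, list[str]], load_latency: int = 3
) -> list[tuple[int, Optional[tuple[str, str]]]]:
    """Return a per-cycle trace of (cycle, (thread, op)) or (cycle, None) for a bubble."""
    queues = {name: deque(ops) for name, ops in threads.items()}
    ready_at = {name: 0 for name in threads}  # first cycle a thread may issue again
    order = list(threads)
    trace = []
    cycle = 0
    next_index = 0  # where the circular scan starts in this cycle

    while any(queues.values()):
        issued = None
        for offset in range(len(order)):
            idx = (next_index + offset) % len(order)
            name = order[idx]
            # Skip threads that have finished or are still stalled.
            if queues[name] and ready_at[name] <= cycle:
                op = queues[name].popleft()
                if op == "load":
                    # Assumed model: the thread cannot issue for load_latency cycles.
                    ready_at[name] = cycle + 1 + load_latency
                issued = (name, op)
                next_index = (idx + 1) % len(order)
                break
        trace.append((cycle, issued))  # None marks an empty issue slot
        cycle += 1
    return trace


if __name__ == "__main__":
    program = {"t0": ["load", "alu"], "t1": ["alu", "alu"]}
    for entry in fgmt_schedule(program):
        print(entry)
```

in this run thread t1 fills two of the three cycles during which t0 waits for its load, and one empty slot remains because t1 runs out of work; with more ready threads the slot would be filled as well.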
since the next instruction of a thread is fed into the pipeline only after the retirement of the previous instruction of this thread, control and data dependencies between instructions do not occur in fgmt. the pipeline is simple and potentially very fast because there is no need for complex hardware hazard detection. in addition, the context switching time between threads is zero cycles. memory latency is compensated by not scheduling a thread until its memory access is completed. in this model, the number of cpu pipeline stages determines the number of threads that can be executed. the processing power available to one thread is limited by the interleaving of instructions from other threads.
in simultaneous multithreading (smt), instructions are simultaneously issued from multiple threads to the execution units of a superscalar cpu. in this way, the wide issue of a superscalar processor is combined with the hardware resources of a multiple-context approach. the cpu can issue multiple instructions from multiple threads each cycle. in this way, both unused cycles in the case of latencies and unused issue slots within one cycle can be filled by instructions of alternative threads. on the one hand, tlp arises from multithreaded parallel programs or from multiple independent programs in a multiprogramming workload. on the other hand, ilp arises within the execution of individual threads. an smt processor achieves better throughput and speedup than a single-threaded superscalar processor for multithreaded workloads because it efficiently uses both coarse- and fine-grain parallelism, at the cost of a more complex hardware architecture. smt, cgmt and fgmt are approaches which are often used in risc or vliw processors [39], [47]. the intel pentium 4 has implemented smt since 2002, starting from the 3.06 ghz model. intel calls its smt technique hyper-threading.
other processors that use smt are alpha axp 21464, ibm power5, and intel nehalem i7 [48]. one simple smt architecture is presented in fig. 14.
fig. 14 smt processor architecture
much like pipelining, a superscalar architecture (presented in fig. 14) extends very naturally to support multiple threads of instructions. a multithreaded superscalar processor executes instructions from multiple threads. each thread executes its logical instruction stream and uses separate registers, etc., but shares most of the available physical resources. the additional hardware required to support multiple thread execution is minor, but performance is significantly improved.
short remarks related to explicit multithreading: coarse-grain multithreaded processors directly execute one thread at a time, but can switch contexts relatively quickly, in a matter of a few cycles. this allows them to switch to the execution of a new thread to hide long latencies (such as memory accesses), but they are less effective at hiding short latencies. fine-grain multithreaded processors can context switch every cycle with no delay. this allows them to hide even short latencies by interleaving instructions from different threads while one thread is stalled. however, this processor cannot hide single-cycle latencies. a simultaneous multithreaded processor can issue instructions from multiple threads in the same cycle, allowing it to fill out the full issue width of the processor, even when one thread does not have sufficient ilp to use the entire issue bandwidth.
for illustration purposes only, the different approaches that are possible with single-issue (scalar) processors and with multiple-issue processors are given in fig. 15 and fig. 16, respectively [37]. fig. 17 presents two cases of issuing from multiple threads in a cycle [38].
fig.
15 different approaches possible with single-issue (scalar) processors: a) single-threaded scalar, b) interleaved multithreading scalar, c) blocked multithreading scalar
fig. 16 different approaches possible with multiple-issue processors: (a) single-threaded four-wide superscalar, (b) interleaved multithreading four-wide superscalar, (c) blocked multithreading four-wide superscalar. notice: vertical waste corresponds to the darker marked boxes, while horizontal waste corresponds to the lighter marked boxes
fig. 17 issuing from multiple threads in a cycle: a) simultaneous multithreading, b) chip multiprocessor

10. parallel vs serial computing
what is serial computing: computer software is conventionally created for serial execution, where the algorithm divides the problem into smaller parts, i.e. instructions. these instructions are then executed serially, one by one, on the cpu of the computer [18]. after the current instruction completes, the next one begins. so, in short, serial (sequential) computing is characterized as follows (see fig. 18):
▪ a problem is broken into a discrete series of instructions
▪ instructions are executed sequentially one after another
▪ instructions are executed on a single processor
▪ only one instruction may execute at any moment in time.
fig. 18 serial computing generic example
what is parallel computing: in contrast to the serial approach, parallelism can be defined as an approach that divides big problems into smaller ones, which are then solved simultaneously by multiple processors. the terms parallelism and concurrency are often confused. parallelism means that two or more program sequences are executed independently of each other by a number of processors, while in concurrent execution there are dependencies between program sequences, so that the execution of one program sequence must wait for the execution of another to continue. not all parallel processing is concurrent. for example, bit-level parallelism is not concurrent. as can be seen from fig. 19, to solve a computational problem parallel computing involves the simultaneous usage of multiple computing resources [49]:
▪ a problem is decomposed into several parts that can be solved concurrently
▪ each part is further decomposed into a series of instructions
▪ instructions from each part execute simultaneously on different processors
▪ a general control/coordination mechanism is implemented.
concurrency vs parallelism: the following example shows how concurrency and parallelism work. as shown in fig. 20, there are two cores and two tasks. in the concurrent approach, each core executes both tasks by switching between them over time. in contrast, the parallel approach does not switch between tasks, but instead executes them in parallel over time [50]. a simple example of concurrent processing is any user-interactive program, such as a text editor. in such a program, there can be some io operations that waste cpu cycles. while a file is being saved or printed, the user can concurrently type. the main thread launches several threads for typing, saving, and similar activities concurrently. they may run in the same time period; however, they are not necessarily running in parallel.
types of parallelism: in essence, parallelism can be implemented at two levels, hardware and software [51].
fig. 19 parallel computing generic example
parallelism at hardware level is built into the machine architecture and hardware multiplicity, so it is also known as machine parallelism. this type of parallelism is a function of the cost and performance trade-off.
it also displays resource utilization patterns of simultaneously executable operations and indicates the peak performance of processor resources. it is characterized by the number of instruction issues per machine cycle [1].
fig. 20 concurrent vs parallel execution
in summary, we distinguish the following types of hardware parallelism:
1. parallelism at uniprocessor level, implemented as: i) pipelining, super-pipelining; ii) superscalar, vliw etc.
2. parallelism implemented with simd instructions, vector processors, gpus
3. parallelism at multiprocessor level: j) symmetric shared-memory multiprocessors; jj) distributed-memory multiprocessors; jjj) chip-multiprocessors a.k.a. multi-cores; ivj) multicomputers a.k.a. clusters.
parallelism at software level is exploited by the concurrent execution of machine language instructions in a program. this type of parallelism is a function of the algorithm, programming style and compiler optimization. it also displays patterns of simultaneously executable operations using the program flow graph [52]. we distinguish the following two types of software parallelism:
1. control parallelism: two or more operations are performed simultaneously;
2. data parallelism: the same operation is performed over many data elements by many processors simultaneously.
data level parallelism (dlp) arises from executing essentially the same code on a large number of objects [53], while control level parallelism (clp) arises from executing different threads of control concurrently [54]. parallelism at software level can be implemented at the instruction, task, data or transaction level [55].

10.1. implementations of the most common types of parallelism
bit-level parallelism: this type of parallelism (implemented at hardware level) is based on increasing, e.g. doubling, the processor word size. it provides faster execution of arithmetic operations on large numbers.
for instance, an 8-bit cpu needs two cycles to execute a 16-bit addition, whereas a 16-bit processor needs just one cycle for the same operation. this level of parallelism is also used in 64-bit processors.
instruction-level parallelism (ilp): this type of parallelism (implemented at hardware level) exploits the potential overlap between instructions in a program. in most cases, ilp is implemented in each processor’s hardware as: i) instruction pipelining; ii) superscalar processing; iii) out-of-order execution; and iv) speculative execution/branch prediction. most processors use a combination of the aforementioned ilp techniques to achieve higher performance. very long instruction word (vliw) processors use specialized compilers to achieve static ilp at the software level. compilers prepare parallel instruction streams for vliw processors so that they take full advantage of a number of execution units organized in multiple pipelines.
task/thread-level parallelism (tlp): this type of high-level parallel computing (implemented at software level) is based on partitioning the application into distinct tasks or threads that can then be executed simultaneously. threads are executed on different computing units and can work on independent data or share data. until recently, programming was done sequentially, with a single thread representing the entire application. today, it is necessary to use the new paradigm of multithreaded programming in order to take full advantage of the available multicore processors. in that sense, modern operating systems (oss) provide scheduling of different processes on different cores. however, in the case of complex applications such as those in bioinformatics, the os cannot efficiently distribute the computational load of each process to the available cores. in order to improve performance, these applications need to be redeveloped to achieve thread-level parallelism.
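partitioning an application into independent tasks, as described for tlp, can be sketched with python's standard `concurrent.futures` module. the example below is an assumption-laden illustration, not the authors' method: the per-sequence task (`gc_fraction`, the share of g and c bases in a dna string) is a hypothetical stand-in for the bioinformatics workloads mentioned in the text.

```python
from concurrent.futures import ThreadPoolExecutor


def gc_fraction(sequence: str) -> float:
    """Fraction of bases in `sequence` that are G or C (0.0 for an empty string)."""
    if not sequence:
        return 0.0
    gc_count = sum(1 for base in sequence if base in "GC")
    return gc_count / len(sequence)


def gc_fractions_parallel(sequences: list[str], max_workers: int = 4) -> list[float]:
    """Process each sequence as an independent task on a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order even though tasks may finish out of order.
        return list(pool.map(gc_fraction, sequences))


if __name__ == "__main__":
    print(gc_fractions_parallel(["GGCC", "ATAT", "GATC", ""]))
```

the sketch shows the decomposition only. in the standard cpython interpreter, threads running pure python code are serialized by the global interpreter lock, so for cpu-bound work a `ProcessPoolExecutor` from the same module is the usual substitute.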
data parallelism: this form of high-level parallelism partitions data across the available computing units. data is assigned to cores that independently execute the same task code on each fragment of data. therefore, this type of parallelism requires advanced code development skills and can only be applied to specific problems. computer graphics is an important area of application of high-level data parallelism. the design of graphic processor units (gpus) enables efficient execution of every graphics processing task. first, each frame is divided into regions, and then, based on the command, hundreds of processor units perform the task independently on each data region. dlp architectures have had many incarnations over the decades [49]: a) old vector processors (cray processors: cray-1, cray-2, …, cray x1); b) simd extensions (intel sse and avx units, alpha tarantula (never released)); c) old massively parallel computers (connection machines, maspar machines); and d) modern gpus (nvidia, amd, qualcomm, ...). in general, dlp focuses on throughput rather than latency.

fig. 21 gives a classification scheme of parallel computer architectures based on the type of instruction processing.

fig. 21 classification of parallel architectures based on type of instruction processing

the difference among the three major categories ilp, tlp and dlp that are nowadays mainly used in computer systems to exploit parallelism is sketched in fig. 22 and can be summarized as follows [56]:
▪ instruction-level parallelism (ilp) – multiple instructions from one instruction stream are executed simultaneously;
▪ thread-level parallelism (tlp) – multiple instruction streams are executed simultaneously;
▪ vector data parallelism (vdp) – the same operation is performed simultaneously on arrays of elements.

there are many reasons to use parallel computing [4].
first of all, the real world is dynamic in nature, i.e. many things happen at the same time but in different places, so the resulting data is very large and difficult to manage, and requires more dynamic simulation and modeling. it is parallel computing that ensures the concurrency and organization of complex, large datasets and their management while saving money and time. it also provides efficient use of hardware resources and real-time system implementation.

fig. 22 differences in execution among ilp, tlp and vdp

parallel computing finds application in many areas of science and engineering, in databases and data mining, in real-time system simulations, as well as in advanced graphics, augmented reality and virtual reality.

in addition to a number of advantages, there are some limitations of parallel computing. the main problem is the difficulty of achieving communication and synchronization between multiple subtasks and processes. also, algorithms or programs must exhibit low coupling and high cohesion and must be amenable to parallel execution. developers must be experts and technically skilled in order to be able to effectively code a program based on parallelism.

11. conclusion

nowadays, the microprocessor represents one of the most complex applications of the transistor, with well over 10 billion transistors in the most powerful microprocessors. in fact, throughout its 50 years of evolution, the microprocessor has always used the technology of the day. the drive to continually increase performance has led to rapid technological improvements that have made it possible to build more complex microprocessors. advances in semiconductor fabrication processes, computer architecture and organization, as well as cmos ic vlsi design methodologies, were all needed to create today’s microprocessor.
the development of microprocessors since 1971 has been aimed at (a) improving architecture, (b) improving instruction set, (c) increasing speeds, (d) simplifying power requirements [57], [58] and (e) embedding more and more memory space and i/o facilities in the same chip (using single chip computers).

this paper first discusses fifty years of microprocessor history and its generations. then it describes the benefits of switching from a non-pipelined processor to a single core pipelined processor, and of switching from a single core pipelined and superscalar processor to a multicore pipelined and superscalar processor. finally, it presents the design of a multicore processor. the transition from single-core to multi-core is inevitable because past techniques for accelerating processor architectures that do not modify the basic von neumann computer model, such as pipelining and superscalar execution, encounter strong limits.

the question is, why have multi-core machines become so widespread in the last decade? according to moore's law, the density of transistors doubles approximately every 18 months [59], and according to dennard's scaling, the power density of transistors is constant [60]. this has historically corresponded to an increase in the clock speed of single core machines of approximately 30% per year since the mid-1970s [61]. however, since the mid-2000s, dennard scaling has no longer been maintained due to physical hardware limitations, and therefore, there has been a need for new mechanisms. to improve performance, hardware vendors have focused on developing processors with multiple cores [62]. as a result, the microprocessor industry is moving towards multicore architectures. however, the full potential of these architectures will not be exploited until the software industry fully accepts parallel programming.
multiprocessor programming is much more complex than programming single processor machines and requires an understanding of new algorithms, computational principles, and software tools. only a small number of developers currently master these skills. there are many techniques that can be used to facilitate the transition to multicore processors, but to take full advantage of the potential offered by such systems, some form of parallel programming will always be needed [4]. multicore technology has become ubiquitous today, appearing in most personal computers and even mobile phones [63], so writing parallel programs is crucial to achieving scalable performance and enabling large-scale data processing. in addition, to take full advantage of multicore technology, software applications must be multithreaded. it must be possible to distribute the total work among the execution units of a multicore processor in such a way that they can execute at the same time. in order to consider multithreading in more detail, it is first necessary to understand parallel hardware and parallel computing [44]. finally, this paper provides some details on how to implement hardware parallelism in multicore systems.

acknowledgement: this work was supported by the serbian ministry of education and science, project no tr-32009 – "low power reconfigurable fault-tolerant platforms".

references
[1] d. patterson and j. hennessy, computer architecture: a quantitative approach, 6th ed., morgan kaufmann, 2017.
[2] j.-l. baer, microprocessor architecture: from simple pipelines to chip multiprocessors, cambridge university press, 2009.
[3] y. solihin, fundamentals of parallel multicore architecture, chapman & hall/crc, 2015.
[4] r. kuhn and d. padua, parallel processing, 1980 to 2020, morgan & claypool, 2021.
[5] m. stojčev, microprocessor architectures i part, in serbian, elektronski fakultet niš, 2004.
[6] b. parhami, computer architecture: from microprocessors to supercomputers, oxford university press, 2005.
[7] semiconductor industry association, international technology roadmap for semiconductors (itrs), 2013 edition, 2013.
[8] k. olukotun, l. hammond, and j. laudon, chip multiprocessor architecture: techniques to improve throughput and latency, morgan & claypool, 2007.
[9] r. v. mehta, k. r. bhatt and v. v. dwivedi, "multicore processor challenges – design aspects", j. emerg. technol. innov. res. (jetir), vol. 8, no. 5, pp. c171-c174, may 2021.
[10] a. gonzalez, f. latorre and g. magklis, processor microarchitecture: an implementation perspective, morgan & claypool, 2011.
[11] m. stojčev and p. krtolica, computer systems: principle of digital systems, in serbian, elektronski fakultet niš i prirodno-matematički fakultet niš, 2005.
[12] "microprocessor chronology", av. at https://en.wikipedia.org/wiki/microprocessor_chronology, last access 28.03.2022.
[13] m. stojčev, contemporary 16-bit microprocessors, vol. i, in serbian, naučna knjiga, beograd, 1988.
[14] m. stojčev, contemporary 16-bit microprocessors, vol. ii, in serbian, naučna knjiga, beograd, 1988.
[15] m. stojčev, contemporary 16-bit microprocessors, vol. iii, in serbian, naučna knjiga, beograd, 1988.
[16] m. stojčev, risc, cisc and dsp processors, in serbian, elektronski fakultet niš, 1997.
[17] m. stojčev, branislav petrović, architectures and programming microcomputer systems based on processor family 80x86, in serbian, elektronski fakultet niš, 1999.
[18] d. patterson and j. hennessy, computer organization and design: the hardware/software interface, 5th ed., morgan kaufmann, 2014.
[19] y. etsion, "computer architecture out-of-order execution", av. at https://iis-people.ee.ethz.ch/~gmichi/asocd/addinfo/out-of-order_execution.pdf, last access 28.03.2022.
[20] "superscalar processors", av. at https://www.cambridge.org/core/terms, last access 28.03.2022.
[21] m. stojčev and t. nikolić, pipeline processing and scalar risc processor, in serbian, elektronski fakultet niš, 2012.
[22] m. stojčev and t. nikolić, superscalar and vliw processors, in serbian, elektronski fakultet niš, 2012.
[23] philips semiconductors, introduction to vliw computer architecture, av. at https://www.isi.edu/~youngcho/cse560m/vliw.pdf, last access 28.03.2022.
[24] n. p. jouppi and d. w. wall, "available instruction level parallelism for superscalar and superpipelined machines", wrl research report 89/7, av. at https://www.hpl.hp.com/techreports/compaq-dec/wrl-89-7.pdf, last access 28.03.2022.
[25] c. e. kozyrakis and d. a. patterson, "scalable vector processors for embedded system", ieee micro, vol. 23, no. 6, pp. 36–45, nov.-dec. 2003.
[26] e. aldakheel, g. chandrasekaran and a. kshemkalyani, "vector processors", av. at https://www.cs.uic.edu/~ajayk/c566/vectorprocessors.pdf, last access 29.03.2022.
[27] c. lomont, "introduction to intel® advanced vector extensions", av. at https://hpc.llnl.gov/sites/default/files/intelavxintro.pdf, last access 29.03.2022.
[28] m. stojčev, e. milovanović and t. nikolić, multiprocessor systems on chip, in serbian, elektronski fakultet niš, 2012.
[29] j. l. lo and s. j. eggers, "improving balanced scheduling with compiler optimizations that increase instruction-level parallelism", av. at https://homes.cs.washington.edu/~eggers/research/bsopt.pdf, last access 29.03.2022.
[30] s. akhter and j. roberts, multi-core programming, intel press, 2006.
[31] g. koch, "intel’s road to multi-core chip architecture", av. at http://www.intel.com/cd/ids/developer/asmo-na/eng/220997.htm
[32] g. koch, "transitioning to multi-core architecture", av. at http://www.intel.com/cd/ids/developer/asmo-na/eng/recent%20/221170.htm, last access 29.03.2022.
[33] m. brorsson, "multi-core and many-core processor architectures", chapter 2 in programming many-core chips, ed. a. vajda, springer, 2011.
[34] m. zahran, heterogeneous computing: hardware and software perspectives, acm books #26, 2019.
[35] m. mitić, m. stojčev and z. stamenković, "an overview of soc buses", in embedded systems handbook, digital systems and applications, ed. v. oklobdzija, chapter 7, 7.17.16, crc press, boca raton, 2008.
[36] j. rehman, "advantages and disadvantages of multi-core processors", av. at https://www.itrelease.com/2020/07/advantages-and-disadvantages-of-multi-core-processors/, last access 29.03.2022.
[37] j. shun, shared-memory parallelism can be simple, fast, and scalable, morgan & claypool pub., 2017.
[38] t. ungerer, b. rogic and j. silc, "multithreaded processors", comput j., vol. 45, no. 3, pp. 320–348, 2002.
[39] a. silberschatz, g. gagne and p. b. galvin, "multithreaded programming", chapter 4 in operating system concepts, 8th ed., john wiley, 2009.
[40] differencebetween.com, "difference between multithreading and multitasking", av. at https://www.differencebetween.com/difference-between-multithreading-and-vs-multitasking/, last access 29.03.2022.
[41] techdifferences, "difference between multitasking and multithreading in os", av. at https://techdifferences.com/difference-between-multitasking-and-multithreading-in-os.html, last access 29.03.2022.
[42] tutorialspoint, "multi-threading models", av. at https://www.tutorialspoint.com/multi-threading-models, last access 22.03.2022.
[43] wikipedia, "list of intel core i7 processors", av. at https://en.wikipedia.org/wiki/list_of_intel_core_i7_processors, last access 29.03.2022.
[44] m. nemirovsky and d. m. tullsen, multithreading architecture, morgan & claypool, 2013.
[45] o. mutlu, "computer architecture: multithreading", av. at https://rmd.ac.in/dept/ece/supporting_online_%20materials/5/cao/unit5.pdf, last access 22.03.2022.
[46] n. manjikian, "implementation of hardware multithreading in a pipelined processor", in proceedings of the ieee north-east workshop on circuits and systems, 2006, pp. 145–148.
[47] p. manadhata and v. sekar, "simultaneous multithreading", av. at https://www.cs.cmu.edu/afs/cs/academic/class/15740-f03/www/lectures/smt.pdf, last access 29.03.2022.
[48] intel, "products formerly nehalem ep", av. at https://ark.intel.com/content/www/us/en/ark/products/codename/54499/products-formerly-nehalem-ep.html, last access 29.03.2022.
[49] k. hwang and z. xu, scalable parallel computing: technology, architecture, programming, mcgraw-hill, 1998.
[50] d. malkhi, concurrency: the works of leslie lamport, acm books #29, 2019.
[51] cs4/msc parallel architectures, "lect. 2: types of parallelism", av. at https://www.inf.ed.ac.uk/teaching/courses/pa/notes/lecture02-types.pdf, last access 29.03.2022.
[52] chapter 3: "understanding parallelism", av. at https://courses.cs.washington.edu/courses/cse590o/06au/lnlch-3-4.pdf, last access 29.03.2022.
[53] j. owens, "data level parallelism", av. at https://www.ece.ucdavis.edu/~jowens/171/lectures/dlp3.pdf, last access 29.03.2022.
[54] a. a. freitas, s. h. lavington, "data parallelism, control parallelism, and related issues", in mining very large databases with parallel processing, springer, 2000.
[55] e. i. milovanović, t. r. nikolić, m. k. stojčev and i. ž. milovanović, "multi-functional systolic array with reconfigurable micro-power processing elements", microelectron. reliab., vol. 49, no. 7, pp. 813–820, july 2009.
[56] c. severance and k. dowd, high performance computing, connexions, rice university, houston, texas, 2012.
[57] g. nikolić, m. stojčev, z. stamenković, g. panić and b. petrović, "wireless sensor node with low-power sensing", facta univ. ser.: elec. energ., vol. 27, no 3, pp. 435–453, sept. 2014.
[58] t. nikolić, m. stojčev, g. nikolić and g. jovanović, "energy harvesting techniques in wireless sensor networks", facta univ. ser.: aut. cont. rob., vol. 17, no. 2, pp. 117-142, dec. 2018.
[59] g. e. moore, "cramming more components onto integrated circuits", electronics, vol. 38, no. 8, pp. 114–117, april 1965.
[60] r. h. dennard, f. h. gaensslen and k. mai, "design of ion-implanted mosfet’s with very small physical dimensions", ieee j. solid-state circuits, vol. 9, no. 5, pp. 256–268, oct. 1974.
[61] s. naffziger, j. warnock and h. knapp, "when processors hit the power wall", in proceedings of the ieee international solid-state circuits conference (isscc), 2005, pp. 16–17.
[62] s. borkar and a. a. chien, "the future of microprocessors", commun. acm, vol. 54, no. 5, pp. 67–77, may 2011.
[63] m. d. hill and m. r. marty, "amdahl’s law in the multicore era", ieee comput. mag., vol. 41, no. 7, pp. 33–38, july 2008.

instruction facta universitatis series: electronics and energetics vol. 27, no 1, march 2014, pp. 25-39 doi: 10.2298/fuee1401025p

determination of actual reduction factor of hv and mv cable lines passing through urban and suburban areas

ljubivoje m.
popović school of electrical engineering, beograd, serbia

abstract: the paper presents a method for determining the ground fault current distribution in cases where feeding cable lines pass through urban and/or suburban areas, or where many relevant data are uncertain or completely unknown. the problem arises because many of the surrounding urban installations are situated under the surface of the ground and cannot be visually determined or verified. on the basis of on-site measurements, the developed method compensates for all deficiencies in the relevant data about the metal installations exposed to the fluctuating magnetic field appearing along and around a power line during an unbalanced fault. the presented analytical procedure is based on the fact that certain measurable quantities cumulatively involve the inductive effects of all surrounding metal installations, known and unknown. the quantitative analysis performed points to the significance of taking into account the existence of surrounding metal installations.

key words: substation, grounding system, ground fault current, inductive influence

1. introduction

the fault current during an earth fault in a power network has at least two alternative paths for returning to the source which feeds the fault. because of that, each ground fault current has at least two fractions. one of them is injected into the surrounding earth through the grounding system of a supplied substation, while the other returns to the source through the neutral conductor(s) of the feeding line [1]. the first one produces all potentials and potential differences (touch and step voltages) relevant for the safety conditions on the grounding system of a supplied substation, while the other causes the thermal stress on the neutral conductor(s) of the feeding line.
thus, the correct estimation of the ground fault current distribution is of prime importance for the correct design of the grounding system of a supplied substation and the correct selection of the feeding line neutral conductor(s).

received december 24, 2013
corresponding author: ljubivoje m. popović, school of electrical engineering, beograd, serbia (e-mail: ljubivoje@beotel.net)

with the aim of defining the influence of the available return paths on the ground fault current distribution, a special parameter of the feeding line has been introduced in the professional literature, including technical standards [2]. this parameter is called the reduction factor of the feeding line and is defined as the ratio of the part of the ground fault current returning through the earth to the total ground fault current. here it is assumed that the grounding impedances at both line ends are negligible (e.g. [2]). under this assumption the fault current(s) in the line neutral conductor(s) is a consequence solely of the inductive coupling between this/these conductor(s) and the phase conductor through which the total ground fault current passes.

with the aim of solving the problem of determining the ground fault current distribution, extensive and continuous research work has been carried out over at least the last five decades. the first methods developed for solving this problem relate to the case when the feeding line is constructed as an overhead line, e.g. [3]-[5]. somewhat later, papers were published that consider this problem in special cases, i.e. when a feeding line appears as a longitudinal combination of one cable and one overhead section, e.g. [6], [7]. also, several methods have been developed for solving the problem in cases where, because of high local soil resistivity and/or a high short-circuit level, special measures (e.g. bare copper conductor laid in the same trench as the cable returning line, counterpoises, etc.)
aimed at reducing the part of the ground fault current returning exclusively through the earth are considered indispensable [8]-[10]. then, the papers [11], [12] present the method developed for determining the ground fault current distribution when a feeding cable line is constructed of three single-core cables. the method makes it possible to take into account the participation of all three metal sheaths in the ground fault current distribution for a fault at any place along the line.

the common characteristic of the mentioned methods is that the value of the reduction factor depends only on the design/constructive characteristics of a feeding line and on the characteristics of the surrounding earth as a conductive medium. thus, none of the mentioned methods yields a solution of this problem when it appears in urban and suburban conditions, where many other metal installations participate in returning the ground fault current to the power system. in such surroundings each power cable line represents a very complex electrical circuit with many conductively and inductively coupled elements having uncertain or completely unknown parameters.

the problem of determining the influence of the metal installations surrounding a feeding cable line on the ground fault current distribution through the grounding system of a supplied substation was considered and solved for the first time in [14]. the solution is achieved by substituting all surrounding metal installations by one conductor that is equivalent from the standpoint of the ground fault current distribution, has a cylindrical form, and is placed around and along each considered cable line. somewhat later, the achieved solution was extended to include overhead distribution lines [15]. the investigation results presented in [14], [15] show that the part of the ground fault current flowing through the earth, in typically urban environments, is three to five times smaller than had been considered earlier.
the developed method enables a new insight into the whole grounding problem of urban hv/mv substations and dramatically changes our perception of the magnitude of this problem. the problem can now be viewed within a realistic framework and solved in each concrete case without any redundant expenditure.

the developed method simultaneously makes it possible to solve another problem of current engineering practice that has also not been solved in the past. this is the problem of determining the feeding line series impedance without ignoring the presence of the surrounding metal installations. namely, on the basis of the imagined physical appearance of the introduced equivalent conductor, it is not difficult to see that this conductor acts as an additional neutral conductor of each distribution line passing through urban and suburban areas and, accordingly, improves its transfer characteristics [16].

this paper presents the theoretical foundations of the mentioned method in a more transparent and complete manner and introduces a certain improvement in the derivation of the computational part of the method. this improvement enables a more direct and easier determination of the actual part of the ground fault current that is dissipated through the grounding system of the supplied substation into the surrounding earth and returns to the power system only through the earth.

2. problem description

the increasing sizes of modern distribution networks, as well as the higher operating and short-circuit currents of these networks, have been matched by over-spreading networks of earth return circuits (different pipelines, different cable and overhead line neutrals, etc.) close to hv and mv distribution lines.
the spatial arrangement of all these installations, determined mainly by the layout of city streets, and the small mutual distances result in inductive and, in the vicinity of substations, conductive coupling of the different network types. the use of common routes (mainly street pavements) for positioning various supply networks (electricity, water, gas, oil, telecommunications, etc.) unavoidably leads to mutual interaction between them, which has to be determined in many concrete cases. the whole problem has at least three different aspects important for current engineering practice. they are:
• the influence of the surrounding metal installations on the fault current dissipated into the surrounding earth through the grounding system of a supplied substation [14], [15],
• the influence of the surrounding metal installations on the transfer characteristics of power lines passing through urban and suburban areas [16], and
• the inductive influence of an hv power line on each of the surrounding metal installations considered separately.

one of the main parameters for estimating these mutual interactions, the soil resistivity of the surrounding area, cannot be determined exactly. although there are several methods for measuring soil resistivity [1], none of them can be practically applied in urban conditions. the reason is that the surface of urban areas is already covered/occupied by buildings, streets, pavements and many other permanently constructed objects, while under the ground surface many known and unknown metal installations already exist.

many urban metal installations with different basic functions are usually situated in a relatively small space, such as sheaths of different types of cable lines, neutral conductors of the low voltage network, steel water pipes, building foundations, etc. some of them are not in direct contact with the earth, while the others are in effective and continuous contact with the earth.
they are interconnected, and their spatial arrangement is different in each concrete case and varies along any of the distribution lines. also, most of them are laid under the street pavements, and many relevant data about them cannot be recorded, visually determined, or verified.

the grounding system of a distribution hv/mv substation consists of the substation grounding electrode and many outgoing mv cable lines acting as external grounding electrodes, and/or conductive connections with the grounding systems of the supplied mv/lv substations [17]. the spontaneously formed grounding system involves a large urban area around an hv/mv substation. such a grounding system includes, through the terra-neutral (tn) grounding system in the low-voltage (lv) network and consumer installations, many known and unknown metallic installations typical of an urban area. as a consequence, the outgoing cable lines simultaneously become conductive connections with the metal installations laid along the same street(s) as the feeding line.

thus, it is not difficult to notice that under unbalanced operating conditions two essentially different currents flow out of the power system. one of them is dissipated into the surrounding earth through the grounding system of the supplied substation, while the other is induced in the metal installations surrounding the feeding line. as an illustration, these two mutually separated fault currents, for a ground fault in the substation supplied by a three-phase line, are presented in fig. 1.

fig. 1 current fractions passing through the elements of the grounding systems

the notation used has the following meaning: A – supply substation, F – ground fault place (supplied substation), $I_f$ – ground fault current, $I_i$ – ground fault current component induced in the metal installations surrounding the feeding line, and $I_e$ – ground fault current component injected into the earth.
both of the presented ground fault current fractions leave the power system through the elements of the grounding system of the supplied substation, F. the current $I_e$ is injected into the earth, while the other, $I_i$, circulates only through the surrounding metal installations, which are intended for other purposes. because of that, all potentially dangerous and harmful influences of power cable lines and supplied substations on their environment emanate from these two ground fault current components. accordingly, determination of these two currents is of prime importance for estimating the safety conditions within, and in the vicinity of, a supplied substation and the inductive influence of a feeding line on the neighboring parallel circuits (pipeline, telecommunication line, etc.).

since the splitting into these two fractions occurs along many external grounding electrodes and under the surface of the ground, none of these components can be separated and determined by direct or indirect measurements [14], [15]. each of the metal return paths, together with earth as the common return path, forms an electrical circuit, while all together these paths form a large number of conductively and inductively coupled circuits. since these circuits can be represented by the corresponding system of equations and since the analytical expressions necessary for the self and mutual impedances of metal conductors are known [2], it can be said that the considered problem has in principle been solvable for a long time. in practice, however, this is not possible because of the many difficulties and limitations in collecting the data necessary for the calculations. thus, the problem can be defined as follows: how to find a method that compensates for the lack of a huge number of relevant data?

3. experimental investigation

the experimental investigations of the influence of metal installations, typical of urban areas, on the value of the feeding line reduction factor were performed on a cable line that supplies, in series, two substations of the 110 kV distribution network in belgrade, serbia [14]. the length of the line to the closer of the two supplied substations, measured from the supply substation, is 2320 m, while the feeding line length to the more distant substation is 6590 m. the line is realized with xhlp cables having mutually identical design parameters, laid in a triangular formation over the entire line length. the cross-bonding technique necessary for the reduction of circulating currents was not applied. in the areas through which the line passes, the specific soil resistivity is estimated on the basis of the main geological characteristics of the area involved, which is the only possible way of doing this in urban areas. the roughly estimated equivalent soil resistivity of the entire area is within the range from 30 to 50 Ωm. the line section between the supply and the transit (nearer) substation passes through an area with a lower degree of urbanization compared with the rest of the line.

the phase conductors are made of aluminum with a cross-section of 1000 mm², while the metallic sheaths are made of copper strands with a total cross-section of 95 mm² and a mean diameter of 91 mm. the main elements of the grounding system in both supplied substations are the station building foundation and the 44 mv outgoing cable lines realized with cables with uncovered metal sheaths. the grounding impedances of the supply substation and of both of the supplied substations, determined by standard measurements, were found to be between 0.02 and 0.03 Ω. since these impedances are very small compared to the other parameters influencing the measured values of the reduction factor, they are completely disregarded.
the described line is used to obtain experimental results for two different feeding cable lines, one 2320 m and the other 6590 m long. the longer line is obtained as a continuous cable line along its entire length by disconnecting its metal sheaths from the grounding electrode of the transit substation. the necessary measurements are performed using the test circuit shown schematically in fig. 2.

fig. 2 measurement circuit. the notation has the following meaning: a and b – substations connected through the tested cable line, ua – auxiliary voltage source, it – test current, is – current induced in the cable sheath, ic – total current induced in the surrounding metal installations, ie – current injected into the earth through the grounding system of substation b, za (zb) – impedance of the grounding system of substation a (b), and g – remote ground.

the influence of the surrounding metal installations has been observed by measuring the self impedance of one of the line phase conductors and by using the known analytical expression for this impedance, which is, according to e.g. [2], given by

$$ z'_{ph} = r'_{ph} + \frac{\omega\mu_0}{8} + j\,\frac{\omega\mu_0}{2\pi}\left(\ln\frac{\delta}{r_{ph}} + \frac{\mu_r}{4}\right), \quad \Omega/\mathrm{km} \tag{1} $$

where r'ph – phase conductor resistance per unit length, Ω/km, rph – outer radius of the phase conductor, m, ω – angular frequency, ω = 2πf, μ0 – magnetic permeability of vacuum, 4π∙10⁻⁷ vs/am, and μr – relative magnetic permeability (the prime (') denotes values per unit length). the equivalent earth penetration depth δ is determined by the following expression

$$ \delta = 658\sqrt{\frac{\rho}{f}}, \quad \mathrm{m} \tag{2} $$

where ρ – equivalent soil resistivity along the cable line in Ωm, and f – test circuit frequency in hz. here, it should be mentioned that these expressions are based on carson's theory of the current return path through the earth.
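as a quick numerical illustration of expression (2), δ = 658·sqrt(ρ/f), the following minimal python sketch evaluates the equivalent earth penetration depth for the two ends of the estimated soil resistivity range (30 and 50 Ωm). the 50 hz frequency and the function name are assumptions made for illustration only; the text does not state the test frequency used.

```python
import math


def penetration_depth(rho_ohm_m: float, f_hz: float) -> float:
    """Equivalent earth penetration depth in metres, expression (2)."""
    if rho_ohm_m <= 0 or f_hz <= 0:
        raise ValueError("resistivity and frequency must be positive")
    return 658.0 * math.sqrt(rho_ohm_m / f_hz)


F_ASSUMED = 50.0  # Hz, assumed power frequency (not stated in the text)

for rho in (30.0, 50.0):  # ohm*m, estimated range of equivalent soil resistivity
    delta = penetration_depth(rho, F_ASSUMED)
    print(f"rho = {rho:4.1f} ohm*m -> delta = {delta:6.1f} m")
```

the depth grows only with the square root of ρ/f, so even a rough resistivity estimate changes δ moderately.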
they have been derived under the assumptions that the power line is laid in homogeneous soil whose resistivity equals the equivalent resistivity of the normally heterogeneous (multilayer, with each layer having a different resistivity) soil, and that there are no other metal installations in the vicinity of the line. however, these assumptions do not correspond to the actual situation described here. since the considered cable lines pass through areas covered by a spontaneously formed network of different underground metal installations, the measured quantities are affected by the conductive and inductive couplings of all surrounding metal installations, known and unknown. thus, measurements can give only an apparent value of the self impedance of the line phase conductor. by simulating a single-phase ground fault in the supplied substation and by disconnecting all three cable sheaths from the grounding electrode at one of the line ends (fig. 2), the following values of the apparent phase conductor self impedance were obtained:
– zpha = (0.1819 + j1.0243) Ω, for the 2320 m line, and
– zpha = (0.5167 + j2.6260) Ω, for the 6590 m line.
the values of the line reduction factor obtained solely on the basis of the measured currents in the cable sheaths were the following:
– r = 0.0473 – j0.1565, for the shorter line, and
– r = 0.0637 – j0.1724, for the longer line.
once the apparent values of the impedance z'pha are determined, the only unknown parameters in expressions (1) and (2), ρ and δ, can be obtained from these equations. however, in the considered cases these parameters involve, besides the local earth characteristics, the constructive characteristics and mutual spatial disposition of all other available return paths. accordingly, they should be defined in a somewhat different way.
under the new definition, the parameter ρ represents the apparent equivalent soil resistivity, while the parameter δ represents the equivalent penetration depth of all return paths, because both involve the conductive and inductive influences of all return paths. the new notation expressing this meaning is ρa and δa. the corresponding values of the newly defined parameters are:
– δa = 20.9 m, or ρa = 0.052 Ωm, for the shorter line, and
– δa = 10.88 m, or ρa = 0.0137 Ωm, for the longer line.
the obvious difference between the two values can be explained by the fact that the shorter line passes through an area with a lower degree of urbanization, i.e. along a smaller number of surrounding metal installations. since the design data of the tested lines, as well as the newly defined parameters ρa or δa, are known, the analytical expression for the reduction factor in the case of single-core cables laid in a triangular formation can be tested. according to [13], this expression has the following mathematical form

$$ r = \frac{r'_s}{r'_s + \dfrac{3\omega\mu_0}{8} + j\,\dfrac{3\omega\mu_0}{2\pi}\ln\dfrac{\delta_a}{\sqrt[3]{r_s d^2}}} \tag{3} $$

where r's – metal sheath resistance per unit length, Ω/km, d – distance between two adjacent cables (or the diameter of the single-core cable) in the case of a triangular formation, and rs – mean radius of the cable sheath. applying this expression to the considered cable lines gives:
– r = 0.0471 – j0.1537, for the shorter line, and
– r = 0.0589 – j0.1693, for the longer line.
the results obtained by the applied expression and by the measurements are in good agreement, practically identical. in this way experimental proof has been obtained for the accuracy of the given analytical expression, but under the unrealistic assumption that the surrounding metal installations do not exist.
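the behaviour of expression (3) can be explored numerically. the python sketch below reads (3) as r = r's / (r's + 3ωμ0/8 + j(3ωμ0/2π)·ln(δa/∛(rs·d²))) and evaluates it for the two apparent penetration depths quoted above and for the depth that follows from (2) with ρ = 30 Ωm. the sheath resistance (0.19 Ω/km), the cable spacing (d = 0.1 m), the 50 hz frequency and all names are assumed values for illustration, since they are not given in the text; only the sheath radius (half of the 91 mm diameter) comes from the line data. the results are therefore indicative and are not expected to reproduce the quoted figures exactly.

```python
import math

MU0 = 4 * math.pi * 1e-7  # Vs/Am, magnetic permeability of vacuum


def reduction_factor_triangular(r_s_per_km, r_sheath_m, d_m, delta_m, f_hz=50.0):
    """Reduction factor of three single-core cables in triangular formation.

    Sketch of expression (3); the factor 1000 converts omega*mu0
    from ohm/m to ohm/km so that it matches r_s_per_km.
    """
    k = 2 * math.pi * f_hz * MU0 * 1000.0            # omega*mu0 in ohm/km
    gmr = (r_sheath_m * d_m ** 2) ** (1.0 / 3.0)      # cube root of rs*d^2, m
    denom = (r_s_per_km + 3 * k / 8
             + 1j * 3 * k / (2 * math.pi) * math.log(delta_m / gmr))
    return r_s_per_km / denom


R_S = 0.19        # ohm/km, ASSUMED sheath resistance (hypothetical value)
D = 0.10          # m, ASSUMED spacing of adjacent cables (hypothetical value)
R_SHEATH = 0.0455  # m, half of the 91 mm sheath diameter given in the text

for label, delta in (("delta_a = 10.88 m", 10.88),
                     ("delta_a = 20.9 m", 20.9),
                     ("rho = 30 ohm*m", 658.0 * math.sqrt(30.0 / 50.0))):
    r = reduction_factor_triangular(R_S, R_SHEATH, D, delta)
    print(f"{label:18s} r = {r.real:.4f} {r.imag:+.4f}j  |r| = {abs(r):.4f}")
```

a smaller penetration depth gives a larger |r|, which is the trend discussed in the text for the apparent soil parameters.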
based on the experimental results, the following can also be noted. the apparent equivalent soil resistivity obtained in this way is drastically lower than the realistic soil resistivity (between 30 and 50 Ωm). the corresponding values of the reduction factor are 52.1% and 69.6% higher than the value obtained with the approximately estimated equivalent specific soil resistivity, r(ρ = 30 Ωm) = 0.0204 – j0.1037. this can be explained by the fact that the presence of nearby underground metal installations reduces not only the fault current flowing through the earth, but also the currents flowing through the cable line sheaths. to obtain a more complete insight into the influence of the metal installations surrounding a feeding line, the currents through the cable sheaths were also investigated experimentally. first, the ground fault current distribution was observed when only the sheath of the cable carrying the ground fault current was connected to both grounding electrodes at the line ends. then one more cable sheath was connected to both grounding electrodes at the line ends. finally, normal operating conditions were observed, i.e. all three metal sheaths were grounded at the line ends. the measurement results show that successively increasing the number of grounded metal sheaths (fig. 2) reduces not only the current flowing through the earth, but also the relative participation of each of the already connected sheaths in reducing the fault current through the earth. for the sheath of the cable with the simulated ground fault, this relative participation drops from 60.66% to 46.31% and then from 46.31% to 37.49%, respectively. these results are helpful for understanding the influence of the surrounding metal installations.
since the reduction factor is defined as the ratio ie/if, obtaining its actual value requires that the presence of the surrounding metal installations be taken into account.

4. method development

on the basis of the preceding analysis, it is clear that during a ground fault each power line passing through urban and/or suburban areas represents a very complex electrical circuit. this circuit consists of a large number of mutually conductively and inductively coupled electrical circuits with a common return path through the earth. if the line phase conductors are excluded, the number of these circuits equals the number of line neutral conductors plus the number of surrounding metal installations. if the cable line presented in fig. 2 is considered again, but now with all metal sheaths grounded at both line ends, and if it is assumed that the total number of surrounding metal installations, including the neutral conductor, is equal to an arbitrary number N, then this line can be represented by the equivalent circuit shown in fig. 3. an arbitrary current, in, in fig. 3 induces in another arbitrary (m-th) circuit a voltage, umn, determined by

$$ u_{mn} = z_{mn}\, i_n \tag{4} $$

where zmn – mutual impedance between two arbitrary (n-th and m-th) surrounding metal installations (circuits), fig. 3. it is well known that distribution substations are located in areas occupied by many underground metal installations, acting as perfect grounding electrodes [17]. thus, grounding impedances za and zb can be neglected (za ≈ 0 and zb ≈ 0). because of that, the fault currents appearing in the cable line metal sheaths are a consequence solely of the inductive influence of the ground fault current in the faulted phase conductor [15]. this is in accordance with the reduction factor definition mentioned above.
also, for the further considerations it should be mentioned that the current directions shown in fig. 3 are chosen arbitrarily. on the basis of the equivalent circuit presented in fig. 3 it is possible to write a system of (N + 1) equations and, for known values of ua and all self and mutual impedances, to determine each of the currents in the given equivalent circuit. unfortunately, because of the previously mentioned practical difficulties and limitations, the parameters of the surrounding metal installations needed to determine these impedances have to be treated as unknown quantities. thus, solving the problem of determining the current circulating through the earth, ie, and the reduction factor of the considered line requires a completely new approach. the problem is solved in [14] by measuring currents it and i1 (fig. 2) and by substituting all surrounding metal installations by one equivalent conductor, imagined as a cylinder placed around and along the entire feeding line. here, to simplify the necessary calculation procedure, it will be assumed that this conductor also includes the two sheaths of the remaining single-core cables, through which the ground fault current does not circulate.

fig. 3 complete line equivalent circuit. the notation has the following meaning: ua – auxiliary voltage source, un0, un1, un2, …, unN – voltages induced in an arbitrary (n-th) circuit by the current in each of the surrounding circuits (metal installations), it – test current through the phase conductor of the considered line, i1 – current through the sheath of the cable carrying the current it, i2, i3 – currents through the metal sheaths of the other two single-core cables, i4, i5, i6, …, in, …, iN – currents induced in the individual surrounding metal installations, z1, z2, z3 – self impedances of the cable metal sheaths, and z4, z5, z6, …, zN – self impedances of the individual surrounding metal installations.
under this assumption the considered cable line is transformed into a physical model whose cross-section is shown in fig. 4.

fig. 4 cross section of the introduced line model

for the equivalent conductor imagined in this manner, the corresponding equivalent circuit of the entire cable line is shown in fig. 5.

fig. 5 equivalent circuit of the introduced line model. the notation has the following meaning: u0c, u1c – voltages that current ic induces in the phase conductor and its metal sheath, uc0, uc1 – voltages that currents it and i1 induce in the equivalent conductor.

the relevant parameters of the assumed equivalent conductor will be determined under the condition that currents it and i1 in fig. 3 remain unchanged after the equivalent conductor is introduced, figs. 4 and 5. by using the equivalent circuit presented in fig. 5 this condition can be expressed by the following system of equations

$$ \begin{aligned} z_{01}\, i_t - z_1\, i_1 - z_{1c}\, i_c &= 0 \\ z_{0c}\, i_t - z_{1c}\, i_1 - z_c\, i_c &= 0 \end{aligned} \tag{5} $$

where zc – self impedance of the equivalent conductor, and z1c – mutual impedance between the equivalent conductor and the cable sheath. impedances z1 and z01 are determined, according to e.g. [14], by the following expressions:

$$ z_1 = r'_s + \frac{\omega\mu_0}{8} + j\,\frac{\omega\mu_0}{2\pi}\ln\frac{\delta}{r_s}, \quad \Omega/\mathrm{km} \tag{6} $$

$$ z_{01} = \frac{\omega\mu_0}{8} + j\,\frac{\omega\mu_0}{2\pi}\ln\frac{\delta}{r_s}, \quad \Omega/\mathrm{km} \tag{7} $$

for the adopted physical form of the equivalent conductor, the analytical expressions for impedances zc, z0c, and z1c are

$$ z_c = r'_c + \frac{\omega\mu_0}{8} + j\,\frac{\omega\mu_0}{2\pi}\ln\frac{\delta}{r_c}, \quad \Omega/\mathrm{km} \tag{8} $$

$$ z_{0c} = z_{1c} = \frac{\omega\mu_0}{8} + j\,\frac{\omega\mu_0}{2\pi}\ln\frac{\delta}{r_c}, \quad \Omega/\mathrm{km} \tag{9} $$

where rc – mean radius of the cylinder representing the equivalent conductor, and r'c – equivalent conductor resistance per unit length, Ω/km. since currents it and i1 are known quantities, obtained by measurements (fig.
2), the condition given by (5) can be reduced to the following simpler form

$$ \frac{z_{01} z_c - z_{1c} z_{0c}}{z_1 z_c - z_{1c}^2} = \frac{i_1}{i_t} \tag{10} $$

in this expression the only unknown quantities, according to (6), (7), (8), and (9), are r'c and rc. since (10) is a relationship between complex quantities, it can be written as the following system of two equations

$$ \begin{aligned} \mathrm{Re}\{(z_{01} z_c - z_{1c} z_{0c})\, i_t\} &= \mathrm{Re}\{(z_1 z_c - z_{1c}^2)\, i_1\} \\ \mathrm{Im}\{(z_{01} z_c - z_{1c} z_{0c})\, i_t\} &= \mathrm{Im}\{(z_1 z_c - z_{1c}^2)\, i_1\} \end{aligned} \tag{11} $$

after the relevant parameters of the equivalent conductor (r'c and rc) have been determined, relations (8) and (9) can be used to obtain zc, z0c and z1c, and then ic and ie. then, in accordance with the equivalent circuit in fig. 5 and the reduction factor definition, the feeding line reduction factor is determined by

$$ r = \frac{i_e}{i_t} = 1 - \frac{z_{01}(z_c - z_{1c}) + z_{0c}(z_1 - z_{1c})}{z_1 z_c - z_{1c}^2} \tag{12} $$

or, according to (8) and (9), in a somewhat more compact form

$$ r = 1 - \frac{z_{0c}(z_1 - z_{1c}) + r'_c\, z_{01}}{z_1 z_c - z_{1c}^2} \tag{13} $$

the condition defined by (5) means at the same time:

$$ u_{0c} = z_{0c}\, i_c = \sum_{n=2}^{N} z_{0n}\, i_n \tag{14} $$

$$ u_{1c} = z_{1c}\, i_c = \sum_{n=2}^{N} z_{1n}\, i_n \tag{15} $$

on the basis of (14) and (15) it is clear that the introduced equivalent conductor substitutes many known and unknown relevant parameters. data on the actual reduction factor of a distribution line, or on the actual ground fault current distribution in a supplied substation, are usually needed before the line has been constructed. in these cases, the previously defined measurements can be performed on a provisional cable line laid on the soil surface along the planned route of the future line. this can be done with any single-core cable suitable for the purpose, under different practically possible conditions, using the calculation procedure described here.
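the chain from the impedances (6) to (9) to the reduction factor (12) and (13) can be checked with a short numerical sketch. the python code below is an illustration under stated assumptions, not the procedure of [14]: it assumes that i1 and ic are counted opposite to it, so that ie = it − i1 − ic, it reads (12) as r = 1 − [z01(zc − z1c) + z0c(z1 − z1c)]/(z1·zc − z1c²), and it uses hypothetical values for r's, r'c, rc and a 50 hz frequency. in the actual method r'c and rc follow from the measured ratio i1/it through (10) and (11); here they are simply chosen.

```python
import math

MU0 = 4 * math.pi * 1e-7  # Vs/Am


def earth_return_term(delta_m, radius_m, f_hz=50.0):
    """omega*mu0/8 + j*(omega*mu0/2pi)*ln(delta/radius), in ohm/km."""
    k = 2 * math.pi * f_hz * MU0 * 1000.0  # omega*mu0 converted to ohm/km
    return k / 8 + 1j * k / (2 * math.pi) * math.log(delta_m / radius_m)


def equivalent_conductor_model(r_s, r_sheath_m, r_c, r_cyl_m, delta_m, f_hz=50.0):
    """Return (i1/it, ic/it, r from (12), r from (13)) for the line model."""
    z01 = earth_return_term(delta_m, r_sheath_m, f_hz)       # (7)
    z1 = r_s + z01                                           # (6)
    z0c = z1c = earth_return_term(delta_m, r_cyl_m, f_hz)    # (9)
    zc = r_c + z1c                                           # (8)
    det = z1 * zc - z1c ** 2
    i1_it = (z01 * zc - z1c * z0c) / det                     # (10)
    ic_it = (z1 * z0c - z1c * z01) / det                     # second unknown of (5)
    r12 = 1 - (z01 * (zc - z1c) + z0c * (z1 - z1c)) / det    # (12)
    r13 = 1 - (z0c * (z1 - z1c) + r_c * z01) / det           # (13)
    return i1_it, ic_it, r12, r13


# All numerical inputs below are ASSUMED for illustration (hypothetical values),
# except the sheath radius, which is half of the 91 mm diameter in the text.
delta = 658.0 * math.sqrt(30.0 / 50.0)
i1_it, ic_it, r12, r13 = equivalent_conductor_model(
    r_s=0.19, r_sheath_m=0.0455, r_c=0.05, r_cyl_m=1.0, delta_m=delta)

print(f"i1/it = {i1_it:.4f}, ic/it = {ic_it:.4f}")
print(f"r (12) = {r12:.4f}, r (13) = {r13:.4f}, |r| = {abs(r12):.4f}")
```

with these inputs the two forms of the reduction factor agree to rounding, and both equal 1 − i1/it − ic/it, which is the consistency that the derivation relies on.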
since the metal installations in urban areas are mainly below the soil surface, the considered inductive influence will be slightly smaller [16]. according to the developed analytical procedure, the presented method enables determination of the reduction factor of any type of cable line and takes into account all relevant factors and parameters, including even those whose contribution is negligibly small. this means that the method gives a correct solution for every practical situation, including extremely complex ones. some inaccuracy can appear only as a consequence of the inductive influence of nearby power distribution lines. this influence can be efficiently avoided by using a test current of somewhat higher frequency, which can easily be discriminated from the omnipresent power frequency (e.g. [15]). the introduced error is small and gives final results that are slightly on the safe side. although there are several methods for measuring soil resistivity (e.g. [1]), none of them is applicable in urban conditions (e.g. [15]). the reason is that the surfaces of urban areas are already occupied by buildings, streets, pavements, and many other permanently constructed objects, while below the ground surface many known and unknown metal installations already exist. thus, one is forced to adopt an approximate value of the equivalent soil resistivity, based on the main geological characteristics of the relevant area, and to use it in the calculation part of the developed method. here, the favorable circumstance is that the self and mutual impedances are, according to the given equations, only slightly dependent on the equivalent soil resistivity, so that more accurate data about this parameter have no practical significance. it is sufficient to know that, within the range of actually possible values, the lowest one should be preferred because it gives final results that are slightly on the safe side.

5. quantitative analysis

using the previously described method, the reduction factor that takes into account the influence of the surrounding metal installations is, for the considered lines:
– r = –0.0267 – j0.0245, or |r| ≈ 0.036, for the shorter line, and
– r = –0.0225 – j0.0170, or |r| ≈ 0.029, for the longer line.
its absolute values are 65.9% and 72.6% lower than the value obtained without taking the surrounding metal installations into account (|r| ≈ 0.105). if it is assumed that the cables in the considered cases are laid in a flat formation at a distance of d = 0.5 m, one obtains r = –0.0275 – j0.0297, or |r| = 0.0405, for the shorter line and r = –0.0232 – j0.0202, or |r| = 0.0308, for the longer line. compared with the previously given measurement results, the differences are even greater: the obtained values are lower by 78.0% and 84.2%, respectively. obviously, disregarding the influence of the metal installations surrounding a feeding line, as well as determining the reduction factor only by measuring the currents through the cable sheaths, gives results that are excessively conservative. bearing in mind that this also means a reduction by the same ratio of all potentials appearing on the grounding systems of the supplied substations, one can conclude that the results of this analysis throw a completely new light on the grounding problem of the supplied substations. also, given the similarity of urban conditions all over the world, this conclusion can be treated as generally valid for the safety conditions of distribution substations supplied by cable lines. certainly, greater economic effects can be expected in cases where, because of a high soil resistivity and/or a high short-circuit current level, special measures (e.g. a bare copper conductor laid in the same trench as the cable feeding line, counterpoises, etc.) were considered necessary.
besides, one can expect elimination of the strict requirement for the application of expensive mv cables acting as grounding conductors (cables with an uncovered sheath), as was the case with the mv network of beograd. the only difficulty arises from the fact that the actual ground fault current distribution depends on the metal installations laid in the area through which the feeding line passes. in practice this means that, to obtain the actual ground fault current distribution, each particular distribution line should be considered separately.

6. conclusions

the presented method makes it possible to take into account the favorable influence of urban metal installations surrounding hv and mv distribution cable lines on the ground fault current distribution in the supplied substation(s). since cable lines are applied almost without exception in urban areas, and since the effect of the surrounding metal installations is considerable, the presented method can be used as a foundation for the revision of the current version of the international technical standard [2].

references
[1] ieee guide for safety in ac substation grounding, ieee std. 80, 2000.
[2] short-circuit currents in three-phase a.c. systems – part 3: currents during two separate simultaneous line-to-earth short circuits and partial short-circuit currents flowing through earth, iec int. std. 60909-3, ed. ii, 2003.
[3] j. endrenyi, "analysis of transmission tower potentials during ground faults", ieee trans. power app. syst., vol. pas-86, no. 10, pp. 1274-1283, october 1967.
[4] f. dawalibi and g. niles, "measurements and computations of fault current distribution on overhead transmission lines", ieee trans. power app. syst., vol. pas-103, no. 3, pp. 553-560, march 1984.
[5] lj. m.
popović, "practical method for evaluating ground fault current distribution in station, towers and ground wire", ieee transactions on power delivery, vol. 13, no. 1, january 1998, pp. 123-129.
[6] s. t. sobral, j. o. barbosa and v. s. costa, "grounding potential rise characteristics of urban step-down substations fed by power cables – a practical example", ieee transactions on power delivery, vol. 3, no. 2, pp. 1564-1572, apr. 1988.
[7] s. mangione, "a simple method for evaluating ground-fault current transfer at the transition station of a combined overhead-cable line", ieee transactions on power delivery, vol. 23, no. 3, pp. 1413-1418, july 2008.
[8] j. fortin, h. g. sarmiento, d. mukhedkar, "field measurements of ground fault distribution at lg-2, quebec", ieee transactions on power delivery, vol. pwrd-1, no. 3, pp. 48-60, 1986.
[9] lj. m. popović, "efficient reduction of fault current through the grounding grid of substation supplied by cable line", ieee transactions on power delivery, vol. 15, no. 2, april 2000, pp. 556-561.
[10] a. campoccia and g. zizzo, "a study of the use of bare buried conductors in an extended interconnected earthing systems inside a mv network", 18th international conference on electricity distribution, cired, turin, 6-9 june 2005.
[11] lj. m. popović, "practical method for evaluating ground fault current distribution in station supplied by an unhomogeneous line", ieee transactions on power delivery, vol. 12, no. 2, april 1997, pp. 722-727.
[12] lj. m. popović, "determination of the reduction factor for feeding cable lines consisting of three single-core cables", ieee transactions on power delivery, vol. 18, no. 3, july 2003, pp. 736-744.
[13] lj. m. popović, "improved analytical expressions for the determination of the reduction factor of the feeding line consisting of three single-core cables", european transactions on electrical power, 2008, 18, pp. 190-203.
[14] lj. m.
popović, "influence of the metal installations surrounding the feeding cable line on the ground fault current distribution in supplied substations", ieee trans. on power delivery, vol. 23, no. 4, october 2008, pp. 2583-2590.
[15] lj. m. popović, "testing and evaluating grounding systems for substations located in urban areas", iet generation, transmission & distribution, vol. 5, no. 2, february 2011, pp. 231-238.
[16] lj. m. popović, "transfer characteristics of electric-power lines passing through urban and suburban areas", international journal of electrical power & energy systems, ijepes, vol. 56, march 2014, pp. 151-158.
[17] lj. popovic, "comparative analysis of grounding systems formed by mv cable lines with either uninsulated or insulated metal sheath(s)", electric power system research, vol. 81, no. 2, february 2011, pp. 393 – 39.

facta universitatis series: electronics and energetics vol. 34, no 2, june 2021, pp. 281-290 https://doi.org/10.2298/fuee2102281s © 2021 by university of niš, serbia | creative commons license: cc by-nc-nd original scientific paper

a compact lowpass/dual-band bandpass filter for microwave application

priyanshu sadwal, amit bage
department of electronics & communication engineering, national institute of technology, hamirpur, india, 177005

abstract. a combined lowpass and dual-band bandpass filter with improved selectivity is presented in this manuscript. two circular split ring resonators, connected at the upper side, provide the lowpass filter characteristic. to obtain a combined lowpass and dual-passband response, two l-shape stubs are incorporated into the lowpass filter structure. the parametric analysis of the proposed filter has been carried out using cst microwave studio. the numerical results show a 3-db cutoff frequency of 1.22 ghz for the lowpass band, and first and second passband resonant frequencies of 2.44 ghz and 3.7 ghz, respectively.
the proposed filter is compact, with an overall size of 32.9 mm × 15 mm.

key words: microstrip, lowpass filter (lpf), bandpass filter (bpf), multiband bandpass filter

1. introduction

in modern microwave and millimeter wave communication systems, microstrip-based filters play an important role because they pass certain frequencies and reject others while offering compactness, cost effectiveness, light weight, sharp rejection, low insertion loss, high selectivity, etc. modern communication systems require multiband filters that combine low insertion loss with compact size and light weight [1-2]. there are many techniques for the design and development of multiband filters [3]. the concept of the ring resonator was introduced by wolff and knoppik in 1971 [4]. the ring resonator is still used for multiband filters because of its high quality factor and compactness, and subsequent studies have been presented by many authors [5-6]. grounded slotted structures are also used to design multiband filters; they allow back-to-back and front-to-front configurations that give dual-band bandpass characteristics with compact size [7]. a dual-band bandpass filter can also be realized using a stepped impedance resonator (sir) and a shunted stub resonator [8-9]. multilayer dielectric techniques are also used to realize dual-passband filters, but the method and the fabrication process are complex. in [10], coupled resonator pairs have been used to design a passband filter with the help of low temperature co-fired ceramic (ltcc) technology. in the aforementioned literature, the researchers concentrated on the design and development of multiband filters only.

received october 15, 2020; received in revised form december 12, 2020
corresponding author: amit bage, national institute of technology hamirpur, himachal pradesh, pin no. 177005, india (e-mail: bageism@gmail.com)
in many applications, such as mixers and hybrid fiber coaxial supporting systems, lowpass filters are required together with bandpass filters [11-12]. in mixers, lowpass and bandpass filters are used to separate the intermediate frequency (if) and the local oscillator (lo) frequency. in 2016 [13], a combination of a lowpass and a single bandpass filter was presented using stub-loaded transmission lines and stepped-impedance stubs. the impedances and lengths of the transmission lines are used to control the resonance frequencies, while shunt stubs are used to improve the stopband attenuation. a lowpass and bandpass filter can also be realized using a defected ground structure (dgs) and complementary split ring resonators (csrr). the csrrs provide the lpf with a sharp cutoff and a wide stopband, while a gap added in series with the transmission line provides the passband [14]. a lumped element resonator structure is also used to design a lowpass and dual-band bandpass filter [15]. dual-band bandpass filters with lpf characteristics have been designed using different techniques, such as dual mode resonators [16], multi-armed open loop resonators with direct feed [17], hairpin slots [18], etc. in 2019 [19], a compact lowpass and dual-band bandpass filter using a folded coupled line of length λ/5 and an open-circuited l-shape strip was presented. that filter offers controllable transmission zeros together with controllable center frequencies and passband bandwidths. in this paper, a compact lowpass and dual-band bandpass filter is designed using a combination of two circular split ring resonators, which are connected at the upper side, and an l-shape stub. the numerical analysis has been carried out using cst microwave studio. the proposed filter has the advantage of independent control of the resonance frequencies as well as of the transmission zero locations. the organization of the manuscript is as follows. in the first section, the lowpass filter geometry and analysis are presented.
in the second section, the design of the lowpass and dual-passband filter is discussed. the third section covers the final filter design, results and discussion. the last section presents the conclusion.

2. lowpass filter geometry

the geometry of the microstrip lowpass filter is shown in fig. 1. the resonator is printed on a rogers rt/duroid 6010 substrate with a loss tangent of 0.0023, a dielectric constant of 10.2 and a thickness of 1.27 mm. the structure has two sections: (i) a 50 Ω microstrip transmission line, and (ii) a double split ring resonator, which are connected through a strip. the corresponding frequency response of the filter drawn in fig. 1 is shown in fig. 2, which reveals a lowpass frequency response.

fig. 1 schematic diagram of the lowpass filter prototype with dimensions l1 = 6.54, to = go = g1 = 0.3, r01 = 5, r02 = 4.7, ri1 = 4.4, ri2 = 4.1, w0 = 1.2 (all dimensions are in mm).

fig. 2 simulated s-parameters of the lowpass filter

in order to obtain a particular cutoff frequency, a parametric analysis has been carried out. first, ri2 is varied with all other parameters fixed; the result is shown in fig. 3, where the radius ri2 is varied from 2.9 to 4.1 mm. for a radius of 4.1 mm the 3-db cutoff frequency is 2.15 ghz, with improved return loss. the figure also shows a passband response at higher frequencies.

fig. 3 simulation results for the variation in s11 with respect to radius ri2

fig. 4 shows the variation of r01 from 5 to 6.2 mm. acceptable performance is achieved for an outer ring radius of 5 mm. the selectivity of the lowpass filter is also improved at these dimensional values.

fig. 4 simulation results for variation in s11 with respect to radius r01

3. lowpass and bandpass filter design and analysis

it has been observed from fig. 4 that during the design of the lpf a passband is also obtained at higher frequency, but its performance is very poor. in order to improve the passband, a quarter-wavelength l-shape stub has been incorporated on both sides of the joined split ring resonator. the length of the stubs is chosen so as to obtain an acceptable bandpass response. the stub increases the overall circuit size of the device. it is therefore preferable to combine both the lowpass and bandpass filter structures on the same length to avoid a large circuit size. the l-shape stubs added to the structure of fig. 1 are shown in fig. 5.

fig. 5 schematic diagram of the lowpass and bandpass filter prototype. the stub dimensions are: l2 = 6.58 mm, s1 = 1.95 mm and g1 = 0.3 mm.

fig. 6 shows the frequency response of the resulting filter, which exhibits a lowpass characteristic together with a dual-band bandpass characteristic. the 3-db cutoff frequency of the lowpass filter is 1.34 ghz, the first passband lies between 2.05 and 2.8 ghz with a bandwidth of 0.75 ghz, and the second passband lies between 2.92 and 4.62 ghz with a bandwidth of 1.7 ghz. the resonant frequencies of the first and second passbands are 2.39 and 3.67 ghz, respectively.

fig. 6 simulated s-parameters of the proposed lowpass filter

4. proposed filter design, result and discussion

the performance of the filter shown in fig. 6 is not good enough for microwave applications. in order to improve the filter performance, two identical resonator structures have been connected through a transmission line at an optimized distance of l3 = 1.71 mm. this value was obtained from the parametric analysis shown in fig. 8, which reveals that at 1.71 mm the performance of the proposed filter is acceptable.
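the quarter-wavelength stub length mentioned in section 3 can be estimated to first order with the common closed-form approximation for the effective permittivity of a microstrip line, εeff = (εr + 1)/2 + (εr − 1)/2 · (1 + 12h/w)^(−1/2). the python sketch below is a rough check, not the authors' design procedure: the choice of this formula, the 1.2 mm strip width (borrowed from w0, since the stub width is not given in the text) and the function names are assumptions. the substrate data (εr = 10.2, h = 1.27 mm) are those stated for the rt/duroid 6010 laminate.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def eps_eff(eps_r: float, w_mm: float, h_mm: float) -> float:
    """Quasi-static effective permittivity of a microstrip line (approximation)."""
    return (eps_r + 1) / 2 + (eps_r - 1) / 2 / math.sqrt(1 + 12 * h_mm / w_mm)


def quarter_wave_mm(f_hz: float, eps_r: float, w_mm: float, h_mm: float) -> float:
    """Quarter of the guided wavelength in millimetres at frequency f_hz."""
    lambda_g = C0 / (f_hz * math.sqrt(eps_eff(eps_r, w_mm, h_mm)))
    return lambda_g / 4 * 1e3


EPS_R, H_MM = 10.2, 1.27  # substrate data from the text
W_MM = 1.2                # ASSUMED stub width (taken equal to w0)

print(f"eps_eff = {eps_eff(EPS_R, W_MM, H_MM):.3f}")
for f_ghz in (2.44, 3.7):  # passband resonant frequencies quoted in the abstract
    print(f"{f_ghz:4.2f} GHz -> quarter wavelength = "
          f"{quarter_wave_mm(f_ghz * 1e9, EPS_R, W_MM, H_MM):.2f} mm")
```

such an estimate only gives a starting length; bends, open-end effects and coupling to the rings shift the resonance, which is why the stub lengths are tuned by parametric simulation.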
the proposed lowpass dual-band bandpass filter has also been modeled on the same rt/duroid 6010 substrate (dielectric constant ɛr = 10.2, loss tangent tanδ = 0.0023). the final lp-dbf is shown in fig. 7.

fig. 7 schematic diagram of lowpass and bandpass filter prototype

fig. 8 variation of s11 parameters with length l3

to refine the filter characteristics, a parametric analysis of the stub length l2 has been carried out. the corresponding s11 and s21 characteristics are plotted in fig. 9 and fig. 10. the parametric analysis of fig. 9 shows that the lowpass band and the first passband remain unchanged while the second passband varies. it is also observed that the resonant frequency shifts downwards when l2 is increased. the return loss also improves when the length increases. the figure also shows that the return loss is 20 db.

fig. 9 variation of s11 parameters with length l2

fig. 10 variation of s21 parameters with length l2

the return loss variation gives the optimum result for the operating frequencies in the second passband, which can be used for the intended applications. the stub length s1 = 1.95 mm, used in the l-shaped bent stub, is kept constant. a pole can be seen in the first passband, which improves the filter performance. the selectivity of the first passband also increases with the stub length l2. the variation of the s21 parameter is shown in fig. 10. increasing the length l2 of the stubs gives a sharper transition at the 3-db cutoff point of the lowpass band, which improves the out-of-band performance. the 3-db cutoff point is marginally reduced as l2 increases from 5.68 mm to 6.88 mm. the first stopband response also improves as l2 is increased.

fig. 11 simulation results for the s1 length optimization
the length s1 of the proposed filter has been varied from 1 to 5 mm and the corresponding 3-db bandwidth and transmission poles are shown in fig. 11. the figure shows that the transmission pole appearing in this band shifts only slightly towards the origin, from 1.09 ghz to 1.05 ghz, as s1 increases. the 3-db cutoff frequency of the lowpass response has also been examined as a function of the stub length s1. it shifts slightly towards the origin with increasing length, from 1.24 ghz to 1.18 ghz. for the lowpass band there is therefore no significant change in the position of the transmission poles or in the cutoff frequency.

based on the optimization of the different parameters using cst microwave studio, the final dimensions of the proposed filter have been obtained and are listed in table 1. the final results are highly promising and meet all the requirements of microstrip filter applications.

table 1 complete dimensions of the proposed filter

parameter               value (mm)
l1 = l1`                6.54
to = go = g1 = g2       0.3
to` = go` = g1` = g2`   0.3
r01 = r01`              5
r02 = r02`              4.7
ri1 = ri1`              4.4
w0                      1.2
l2 = l2`                6.58
s1 = s1`                1.95
l3 = l3`                1.71
ri2 = ri2`              4.1
size of the filter      32.9 × 15

the frequency response of the proposed filter has been simulated and is shown in fig. 12.

fig. 12 frequency response of the proposed filter

the figure shows a lowpass response with two passbands, improved selectivity and multiple transmission zeros. the lowpass 3-db cutoff frequency is 1.22 ghz, the first and second passband resonant frequencies are 2.44 ghz and 3.7 ghz, respectively, with transmission zeroes at 1.53, 1.57 and 2.78 ghz.
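The physical size of 32.9 mm × 15 mm can be expressed in wavelengths once a reference frequency and a permittivity are chosen. The sketch below is a minimal illustration under stated assumptions: the text does not say which reference frequency or permittivity is used for the normalisation, so the 1.22 ghz lowpass cutoff is assumed, and two common choices are compared (the substrate permittivity, and the quasi-static effective permittivity of the 1.2 mm wide line from the usual textbook closed form). The function names are hypothetical.

```python
import math

C0_MM_GHZ = 299.792458  # speed of light expressed in mm * GHz


def eps_eff_microstrip(eps_r, w_mm, h_mm):
    """Textbook quasi-static approximation of the microstrip effective permittivity."""
    return (eps_r + 1) / 2 + (eps_r - 1) / 2 / math.sqrt(1 + 12 * h_mm / w_mm)


def wavelength_mm(f_ghz, eps):
    """Wavelength in mm at f_ghz in a medium of relative permittivity eps."""
    return C0_MM_GHZ / (f_ghz * math.sqrt(eps))


# Values taken from the text; the reference frequency is an assumption.
EPS_R, H_MM, W0_MM = 10.2, 1.27, 1.2
F_REF_GHZ = 1.22
SIZE_MM = (32.9, 15.0)

for label, eps in (("substrate eps_r", EPS_R),
                   ("effective eps", eps_eff_microstrip(EPS_R, W0_MM, H_MM))):
    lam = wavelength_mm(F_REF_GHZ, eps)
    print(f"{label}: eps = {eps:.3f}, wavelength = {lam:.1f} mm, "
          f"size = {SIZE_MM[0] / lam:.3f} x {SIZE_MM[1] / lam:.3f} wavelengths")
```

Under these assumptions the substrate-permittivity normalisation gives values close to the 0.42 × 0.19 wavelengths quoted in table 2, while the effective-permittivity normalisation gives smaller values. This does not establish which normalisation the authors used.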
a return loss better than 10 db in the lowpass band and better than 20 db in the passbands has been achieved. a comparison between the proposed filter and some of the reported filters is summarized in table 2. the table shows that the proposed filter has lower insertion loss and that its total circuit size is 0.42 λg × 0.19 λg, which is compact.

table 2 comparison of the proposed filter with few other filters.

reference no.   lowpass 3-db cutoff frequency (ghz)   bandpass cutoff frequency (ghz)   insertion loss (db)   circuit size (λg × λg)
11              1                                     2.4/5.8                           0.8/2.1/2.5           0.31×0.18
13              1                                     5.2/                              0.8/0.5/              0.175×0.15
16              4.5                                   9.1/                              0.3/0.4/
19              1.05                                  1.9/3.0                           0.3/0.9/0.8           0.13×0.06
this work       1.22                                  2.44/3.7                          0.15/0.5/0.45         0.42×0.19

5. conclusion

a lowpass dual-band bandpass filter based on split ring resonators with inserted quarter-wavelength stubs has been designed. the filter provides high selectivity for the second passband as well as a good lowpass frequency response. the lowpass 3-db cutoff frequency is 1.22 ghz, the first and second passband resonant frequencies are 2.44 ghz and 3.7 ghz, respectively, with transmission zeroes at 1.53, 1.57 and 2.78 ghz. the filter is simple and compact, with a size of only 32.9 mm × 15 mm (0.42 λg × 0.19 λg).

acknowledgment: the authors would like to thank dr. lakhindar murmu (assistant professor, iiit, naya raipur, india) for the help and support. we would also like to thank iiit naya raipur for providing resources to complete this research.

references
[1] d. m. pozar, microwave engineering, new york, wiley, 1998.
[2] x. h. wang, k. j. chen and b. z. wang, "compact broadband dual-band bandpass filters using slotted ground structures", prog. electromagnet. res., vol. 82, pp. 151–166, april 2008.
[3] j. t. kuo, t. h. yeh, and c. c. yeh, "design of microstrip bandpass filters with a dual-passband response", ieee trans. microw. theory tech., vol. 53, no. 4, pp. 1331–1336, april 2005.
[4] i. wolff, and n.
knoppik, "microstrip ring resonator and dispersion measurements on microstrip lines", electron. lett., vol. 7, pp. 779–781, december 1971.
[5] f. c. chen, z. h. chu, and q. x. tu, "design of compact dual-band bandpass filter using short stub loaded resonator", microwave opt. technol. lett., vol. 51, no. 4, pp. 959–963, february 2009.
[6] l. murmu and s. das, "a dual-band bandpass filter for 2.4 ghz bluetooth and 5.2 ghz w-lan applications", prog. electromagnet. res. lett., vol. 53, pp. 65–70, june 2015.
[7] x. h. wang, k. j. chen and b. z. wang, "compact broadband dual-band bandpass filters using slotted ground structures", prog. electromagnet. res., vol. 82, pp. 151–166, april 2008.
[8] y. c. chiou, "transmission zero design graph for dual-mode dual-band filter with periodic stepped impedance ring resonator", prog. electromagnet. res., vol. 108, pp. 23–36, september 2010.
[9] v. radonić and v. c. bengin, "compact left-handed dual-band filters based on shunted stub resonators", fu elec. energ., vol. 32, no 4, pp. 571-579, december 2019.
[10] k. wang, l. zhu, s.-w. wong et al., "balanced dual-band bpf with intrinsic common-mode suppression on double-layer substrate", electron. lett., vol. 51, no. 9, pp. 705–707, april 2015.
[11] f. c. chen, j. m. qiu, h. t. hu, q. x. chu and m. j. lancaster, "design of microstrip lowpass-bandpass triplexer with high isolation", ieee microw. compon. lett., vol. 25, no. 12, pp. 805–807, december 2015.
[12] m. h. capstick, "microstrip low pass-bandpass diplexer topology", electron. lett., vol. 35, no. 22, pp. 1958–1960, october 1999.
[13] f. c. chen and q. shao, "low insertion loss microstrip dual-band lowpass-bandpass filter with controllable passband frequencies", microw. opt. technol. lett., vol. 58, no. 8, pp. 1858–1861, may 2016.
[14] e. ghofran, a. omair, f. s. mahmoud, and a. s. a. zayed, "lowpass and bandpass filter designs based on dgs with complementary split ring resonators", aces j., vol. 26, no.
11, november 2011.
[15] y. k. singh and a. chakrabarty, "miniaturized dual-mode circular patch bandpass filters with wide harmonic separation", ieee microw. and wirel. compon. lett., vol. 18, no. 9, pp. 584–586, september 2008.
[16] m. awida, a. balalem, a. safwat, h. e. hennawy and a. omar, "combined low-pass and bandpass filter response using microstrip dual-mode resonators", in proceedings of the ieee mtt-s international microwave symposium digest, san francisco, ca, usa, june 2006, pp. 701–704.
[17] m. awida, a. boutejdar, a. safwat, h. e. hennawy and a. omar, "multi-bandpass filters using multi-armed open loop resonators with direct feed", in proceedings of the ieee mtt-s international microwave symposium digest, san francisco, ca, usa, june 2006, pp. 913–916.
[18] a. boutejdar and a. omar, "design of microstrip bandpass and lowpass filters using coupling matrix method and a new hairpin defected ground structure", microw. opt. technol. lett., vol. 50, no. 11, pp. 2898-2901, august 2008.
[19] d. k. choudhary and r. k. chaudhary, "compact lowpass and dual-band bandpass filter with controllable transmission zero/center frequencies/passband bandwidth", ieee trans. circuits syst. ii: express briefs, vol. 67, no. 6, pp. 1044-1048, june 2020.

facta universitatis series: electronics and energetics vol. 32, no 4, december 2019, pp. 555-569
https://doi.org/10.2298/fuee1904555v
© 2019 by university of niš, serbia | creative commons license: cc by-nc-nd

the influence of conductive passive parts on the magnetic flux density produced by overhead power lines

slavko vujević, tonći modrić
university of split, faculty of electrical engineering, mechanical engineering and naval architecture, split, croatia

abstract. there has been apprehension about the possible adverse health effects resulting from exposure to power frequency magnetic field, especially in the overhead power lines vicinity.
research work on the biological effects of magnetic field has been substantial in recent decades. various international regulations and safety guidelines, aimed at the protection of human beings, have been issued. numerous measurements have been performed and different numerical algorithms for computation of the magnetic field, based on the biot-savart law, have been developed. in this paper, a previously developed 3d quasistatic numerical algorithm for computation of the magnetic field (i.e. magnetic flux density) produced by overhead power lines has been improved in such a way that cylindrical segments of passive conductors are also taken into account. these segments of passive conductors form the conductive passive contours, which can be natural or equivalent, and they substitute conductive passive parts of the overhead power lines and towers. although their influence on the magnetic flux density distribution and on the total effective values of magnetic flux density is small, it is quantified in a numerical example, based on the theoretical background developed and presented in this paper.

key words: cylindrical segments of passive conductors, magnetic flux density, self and mutual potential coefficients, self and mutual impedances

1. introduction

extremely high-voltage overhead power lines are among the most significant sources of extremely low-frequency (elf) electric and magnetic fields. these fields change very slowly over time and are therefore considered quasistatic [1, 2]. hence, electric and magnetic fields are computed separately.
potential long-term health effects of exposure arising from the power distribution system including overhead power lines, due to their proximity to residential areas and field levels they emit, have been extensively studied over the last few decades [3–6]. epidemiological studies have suggested that long-term low-level exposure to elf (50-60 hz) magnetic field might be associated with an increased risk of childhood leukaemia. however, there is no firm evidence of such adverse health effects related to prolonged exposure to elf magnetic field. the international commission on non-ionizing radiation protection (icnirp) recommends exposure limits related to short-term effects, called basic restrictions, in its guidelines [7]. the exposure limits outside the body, called reference levels, are set to 1000 μt for occupational exposure and to 200 μt for general public exposure for 50 hz magnetic flux density.

received february 22, 2019; received in revised form april 23, 2019
corresponding author: tonći modrić, university of split, faculty of electrical engineering, mechanical engineering and naval architecture, rudjera boškovica 32, 21000 split, croatia (e-mail: tmodric@fesb.hr)

the assessment of human exposure to the magnetic field originating from overhead power lines is based on measurements or computations. measurements are performed in accordance with international standards [8–10]. various methods for computation of elf magnetic field, such as method of moments (mom), finite-difference time-domain method (fdtd), finite element method (fem), charge simulation method (csm) and surface charge simulation method (scsm), are well known [11]. the magnetic field of the overhead power lines is computed using the biot-savart law. in simplified two-dimensional (2d) numerical algorithms [12–14] overhead power line conductors are infinitely long straight thin-wire horizontal lines parallel to the flat earth’s surface.
the number of line sources equals the number of overhead power line phase conductors and shield wires, and the contribution of each of them is taken into account. in three-dimensional (3d) numerical algorithms [15–19] the catenary form of the overhead power line conductors can be taken into account and therefore more accurate computation results can be obtained.

the basis of this paper is a previously developed 3d algorithm for computation of the magnetic field (i.e. magnetic flux density) produced by overhead power lines. the catenary conductors are approximated by a set of straight thin-wire cylindrical conductor segments. moreover, the cylindrical segments of passive conductors are also taken into account using closed current contours, which substitute conductive passive parts of the overhead power lines and towers. because of the currents induced in them, they may influence the magnetic flux density distribution. as anticipated, the influence is not as pronounced as in computation of the electric field intensity, especially in close vicinity of the towers and other passive parts that strongly distort the electric field. therefore, many researchers ignore their effect. nevertheless, a theoretical background for taking into account conductive passive parts and their effect on the magnetic flux density distribution is developed herein, and this influence, which has not been quantified so far, is quantified. expressions for self and mutual potential coefficients of cylindrical conductor segments are given. equations for self and mutual impedances per unit length of the conductive passive contours are derived and included in the system of linear equations for computation of currents in natural and equivalent conductive passive contours. finally, the sum of the contributions of cylindrical segments of active and passive conductors is taken into account for computation of the magnetic flux density.
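The segment-wise summation described above can be illustrated with the closed-form biot-savart field of a straight thin-wire segment. The following is a minimal sketch, not the authors' algorithm: it assumes free space (no earth, no image currents, no passive contours), rms current phasors, and hypothetical function names.

```python
import math

MU0 = 4e-7 * math.pi  # permeability of free space (H/m)


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def segment_flux_density(start, end, current, point):
    """Flux density phasor (Bx, By, Bz) in tesla at `point` due to a straight
    thin-wire segment from `start` to `end` carrying the phasor `current` (A)."""
    seg = _sub(end, start)
    r1 = _sub(point, start)
    r2 = _sub(point, end)
    c = _cross(seg, r1)
    c2 = _dot(c, c)
    if c2 == 0.0:  # field point on the segment axis: no contribution
        return (0j, 0j, 0j)
    n1 = math.sqrt(_dot(r1, r1))
    n2 = math.sqrt(_dot(r2, r2))
    scale = MU0 * current / (4 * math.pi * c2) * (_dot(seg, r1) / n1 - _dot(seg, r2) / n2)
    return (scale * c[0], scale * c[1], scale * c[2])


def total_rms_flux_density(segments, point):
    """Sum the contributions of (start, end, current) segments and return
    sqrt(|Bx|^2 + |By|^2 + |Bz|^2), an rms value if the currents are rms phasors."""
    bx = by = bz = 0j
    for start, end, current in segments:
        dbx, dby, dbz = segment_flux_density(start, end, current, point)
        bx, by, bz = bx + dbx, by + dby, bz + dbz
    return math.sqrt(abs(bx) ** 2 + abs(by) ** 2 + abs(bz) ** 2)


# A very long segment 10 m above the field point approaches the 2d line-source
# value mu0 * I / (2 * pi * r).
long_wire = [((0.0, -1e5, 10.0), (0.0, 1e5, 10.0), 1000.0)]
print(total_rms_flux_density(long_wire, (0.0, 0.0, 0.0)))
```

A catenary would be represented by a list of such segments joined end to end. The long-wire example recovers the 2d line-source limit of about 20 μt for 1000 a at a distance of 10 m.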
in the numerical example, two different cases are observed, the first where conductive passive parts (cpps) are neglected and the second where they are taken into account. the obtained results of the magnetic flux density distribution are shown and compared with available results from the literature.

2. conductive passive parts of the overhead power lines

the magnetic flux density distribution at an arbitrary field point t (x, y, z) in the air of a two-layer medium can be computed using the well-known biot-savart law. one of the advanced 3d numerical algorithms for this computation, whose accuracy is comparable to that of the hifreq computation module of the cdegs software package, is presented in detail in [19]. in addition to the cylindrical segments of active conductors with known currents, the cylindrical segments of passive conductors can also be taken into account. currents are induced in these cylindrical segments of passive conductors (i.e. current contours) and they therefore influence the magnetic flux density distribution.

conductive passive parts of the overhead power lines and towers can be described using closed current contours, approximated by a set of cylindrical conductor segments. the contours can be natural (fig. 1) or equivalent; the latter substitute parts of conductive passive surfaces. examples of conductive passive surfaces are overhead power line towers, fences or any other conductive passive parts in high-voltage substations, which can be modelled using straight thin-wire cylindrical conductor segments or using subparametric spatial 2d finite elements, as in [20]. hence, a network model of conductive passive surfaces is used herein. the cylindrical conductor segments that form the conductive passive contours are oriented from the start to the end point of the segment. the unit vector $\vec{s}_0$ is assigned to each cylindrical segment.

fig.
1 closed passive contour approximated with 5 cylindrical conductor segments

2.1. currents of the conductive passive contours

the system of linear equations for computation of currents in conductive passive contours, written in matrix form, can be expressed as follows:

$$\begin{bmatrix} Z^{kk}_{1,1} & \cdots & Z^{kk}_{1,nk} \\ \vdots & \ddots & \vdots \\ Z^{kk}_{nk,1} & \cdots & Z^{kk}_{nk,nk} \end{bmatrix} \begin{bmatrix} I^{k}_{1} \\ \vdots \\ I^{k}_{nk} \end{bmatrix} = -\begin{bmatrix} Z^{ks}_{1,1} & \cdots & Z^{ks}_{1,ns} \\ \vdots & \ddots & \vdots \\ Z^{ks}_{nk,1} & \cdots & Z^{ks}_{nk,ns} \end{bmatrix} \begin{bmatrix} I^{s}_{1} \\ \vdots \\ I^{s}_{ns} \end{bmatrix} \qquad (1)$$

where $nk$ is the total number of conductive passive contours, $ns$ is the total number of active cylindrical conductor segments, $I^{k}_{ik}$ is the phasor of the ik-th conductive passive contour current, $I^{s}_{k}$ is the phasor of the k-th active cylindrical conductor segment current, $Z^{kk}_{ik,jk}$ is the mutual impedance of the ik-th and jk-th conductive passive contour, and $Z^{ks}_{ik,k}$ is the mutual impedance of the ik-th conductive passive contour and the k-th cylindrical segment of active conductor.

self impedance of the ik-th conductive passive contour is described by:

$$Z^{kk}_{ik,ik} = \sum_{is=1}^{ns^{ik}} \left( Z^{u,ik}_{1,is}\,\ell^{ik}_{is} + j\omega L^{ik,ik}_{is,is} \right) + 2 j\omega \sum_{is=1}^{ns^{ik}-1} \sum_{js=is+1}^{ns^{ik}} L^{ik,ik}_{is,js} \qquad (2)$$

where $ns^{ik}$ is the total number of segments of the ik-th conductive passive contour, $Z^{u,ik}_{1,is}$ is the internal impedance per unit length of the is-th cylindrical conductor segment of the ik-th conductive passive contour, $\ell^{ik}_{is}$ is the length of the is-th cylindrical conductor segment of the ik-th conductive passive contour, $L^{ik,ik}_{is,is}$ is the external inductance of the is-th cylindrical conductor segment of the ik-th conductive passive contour, $L^{ik,ik}_{is,js}$ is the mutual inductance of the is-th and js-th cylindrical conductor segments of the ik-th conductive passive contour, $\omega$ is the circular frequency and $j$ is the imaginary unit.
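The assembly of a contour self impedance by eq. (2) and the solution of system (1) can be sketched in a few lines. This is a minimal illustration, not the authors' program: all function names are hypothetical, the inputs (internal impedances, lengths and inductances) are assumed to be already available, and the right-hand side of (1) is taken with a minus sign on the assumption that the voltages around each closed passive contour sum to zero.

```python
def contour_self_impedance(z_int, length, ind, omega):
    """Eq. (2) for one contour.
    z_int[i]  : internal impedance per unit length of segment i (ohm/m)
    length[i] : length of segment i (m)
    ind[i][j] : external self (i == j) or mutual (i != j) inductance (H)
    omega     : circular frequency (rad/s)"""
    ns = len(length)
    z = sum(z_int[i] * length[i] + 1j * omega * ind[i][i] for i in range(ns))
    z += 2j * omega * sum(ind[i][j] for i in range(ns - 1) for j in range(i + 1, ns))
    return z


def solve_linear(a, b):
    """Gaussian elimination with partial pivoting for a small complex system a x = b."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0j] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def passive_contour_currents(z_kk, z_ks, i_s):
    """Solve [Z^kk]{I^k} = -[Z^ks]{I^s} for the contour currents
    (the minus sign is an assumption, see the lead-in)."""
    rhs = [-sum(z_ks[ik][k] * i_s[k] for k in range(len(i_s)))
           for ik in range(len(z_kk))]
    return solve_linear(z_kk, rhs)


# Toy numbers only: one passive contour coupled to two active segments.
print(passive_contour_currents([[2 + 1j]], [[1j, 0.5j]], [100.0, -50.0]))
```

In a full computation the entries of `z_kk` would come from eqs. (2) and (8) and those of `z_ks` from eq. (9); a dense solver is adequate because the number of passive contours is small (40 in the numerical example).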
the internal impedance per unit length of the cylindrical conductor segment of the natural conductive passive contour is described by the following expression [21, 22]:

$$Z^{u}_{1} = \frac{k}{2\pi r_0 \sigma_v}\,\frac{J_0(k r_0)}{J_1(k r_0)} \qquad (3)$$

where $k$ is the complex wave number, $r_0$ is the radius of the cylindrical conductor segment, $\sigma_v$ is the electrical conductivity of the cylindrical conductor segment, $J_0(k r_0)$ is the complex bessel function of the first kind of order zero and $J_1(k r_0)$ is the complex bessel function of the first kind of order one.

the complex wave number $k$ is defined by the following equation:

$$k = \sqrt{2\pi f \mu_v \sigma_v}\,\exp(-j\pi/4) \qquad (4)$$

where $\mu_v$ is the magnetic permeability of the cylindrical conductor segment and $f$ is the time-harmonic current frequency.

the complex bessel function of the first kind of order $n \in \mathbb{N}$ can be written as [23, 24]:

$$J_n(k r_0) = \sum_{m=0}^{\infty} \frac{(-1)^m}{m!\,(n+m)!} \left( \frac{k r_0}{2} \right)^{n+2m} \qquad (5)$$

a conductive passive surface can be replaced by a contour formed by a set of equivalent cylindrical conductor segments. the radius of these segments is equal to [23]:

$$r_v = \frac{d}{2} = \frac{1}{2\sqrt{\pi f \mu_v \sigma_v}} \qquad (6)$$

where $d$ is the skin depth of the wave into the conductive surface.

the internal impedance per unit length of these conductor segments can be described using the following expression:

$$Z^{u}_{1} = \frac{E_v}{2\pi r_v H_v} = \frac{Z_v}{2\pi r_v} = (1 + j)\,\frac{\omega \mu_p}{2\pi} \qquad (7)$$

where $Z_v$ is the wave impedance of the medium from which the conductive passive surface is made, $E_v$ is the phasor of electric field intensity on the surface of the conductor, $H_v$ is the phasor of magnetic field intensity on the surface of the conductor and $\mu_p$ is the magnetic permeability of the surface.
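Expressions (3) to (7) can be evaluated directly with complex arithmetic. The sketch below is a minimal illustration under stated assumptions: the wave number is taken as sqrt(2πfμσ)·exp(−jπ/4), the series (5) is truncated after a fixed number of terms (adequate only for moderate |k r0|), and the wave impedance of a good conductor, (1 + j)·sqrt(πfμ/σ), is assumed for the equivalent segments. All function names are hypothetical.

```python
import cmath
import math

MU0 = 4e-7 * math.pi  # permeability of free space (H/m)


def bessel_j(n, x, terms=60):
    """Truncated power series (5) for the Bessel function J_n of complex argument x."""
    total = 0j
    for m in range(terms):
        coeff = (-1) ** m / (math.factorial(m) * math.factorial(n + m))
        total += coeff * (x / 2) ** (n + 2 * m)
    return total


def wave_number(f, mu, sigma):
    """Eq. (4), with the exp(-j*pi/4) sign convention assumed in the lead-in."""
    return math.sqrt(2 * math.pi * f * mu * sigma) * cmath.exp(-1j * math.pi / 4)


def internal_impedance_natural(f, mu, sigma, r0):
    """Eq. (3): internal impedance per unit length of a solid round segment (ohm/m)."""
    k = wave_number(f, mu, sigma)
    return k / (2 * math.pi * r0 * sigma) * bessel_j(0, k * r0) / bessel_j(1, k * r0)


def equivalent_radius(f, mu, sigma):
    """Eq. (6): radius of an equivalent segment, i.e. half the skin depth (m)."""
    return 1.0 / (2.0 * math.sqrt(math.pi * f * mu * sigma))


def internal_impedance_equivalent(f, mu):
    """Last form of eq. (7): (1 + j) * omega * mu / (2 * pi), in ohm/m."""
    return (1 + 1j) * 2 * math.pi * f * mu / (2 * math.pi)


# Low-frequency check on a thin copper-like wire (illustrative values only).
f, sigma, r0 = 50.0, 5.8e7, 1e-3
z = internal_impedance_natural(f, MU0, sigma, r0)
print(z.real, 1.0 / (sigma * math.pi * r0 ** 2))        # resistance vs. dc value
print(z.imag / (2 * math.pi * f), MU0 / (8 * math.pi))  # internal inductance vs. mu0/(8*pi)
```

For a thin wire at 50 hz the real part approaches the dc resistance and the imaginary part corresponds to the classical internal inductance μ/(8π) per unit length, which is a convenient sanity check of the reconstructed formulas. For large |k r0| the truncated series loses accuracy through cancellation, and a library Bessel routine would be preferable.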
mutual impedance of the ik-th and the jk-th conductive passive contour is described by the following equation:

$$Z^{kk}_{ik,jk} = j\omega \sum_{is=1}^{ns^{ik}} \sum_{js=1}^{ns^{jk}} L^{ik,jk}_{is,js} = Z^{kk}_{jk,ik} \qquad (8)$$

where $ns^{jk}$ is the total number of segments of the jk-th conductive passive contour and $L^{ik,jk}_{is,js}$ is the mutual inductance of the is-th cylindrical conductor segment of the ik-th conductive passive contour and the js-th cylindrical conductor segment of the jk-th conductive passive contour.

mutual impedance of the ik-th conductive passive contour and the k-th cylindrical segment of active conductor is defined by the following expression:

$$Z^{ks}_{ik,k} = j\omega \sum_{is=1}^{ns^{ik}} L^{ik}_{is,k} \qquad (9)$$

where $L^{ik}_{is,k}$ is the mutual inductance of the is-th cylindrical conductor segment of the ik-th conductive passive contour and the k-th cylindrical segment of active conductor.

2.2. self and mutual inductances of the cylindrical conductor segments

self inductance of the cylindrical conductor segment is described by the following expression:

$$L_{is,is} = \mu_0\,\varepsilon_0\;pss^{un}_{is,is} \qquad (10)$$

where $pss^{un}_{is,is}$ is the self potential coefficient of the is-th cylindrical conductor segment in homogeneous unbounded dielectric medium with permittivity $\varepsilon_0$, which can be computed as described in detail in section 3.

mutual inductance of the is-th and js-th cylindrical conductor segment, each of which can be a cylindrical segment of active conductor or part of a conductive passive contour, is described using the following expression:

$$L_{is,js} = \mu_0\,\varepsilon_0 \left( \vec{s}^{\,is}_{0} \cdot \vec{s}^{\,js}_{0} \right) pss^{un}_{is,js} = L_{js,is} \qquad (11)$$

where $\vec{s}^{\,is}_{0}$ is the unit vector of the is-th cylindrical conductor segment, $\vec{s}^{\,js}_{0}$ is the unit vector of the js-th cylindrical conductor segment and $pss^{un}_{is,js}$ are the mutual potential coefficients of the is-th and js-th cylindrical conductor segments in homogeneous unbounded dielectric medium with permittivity $\varepsilon_0$, which can be computed as described in detail in section 3.

3.
self and mutual potential coefficients of the cylindrical conductor segments

self potential coefficients of the cylindrical conductor segments in homogeneous unbounded dielectric medium are described by the following expression:

$$pss^{un}_{ii} = \frac{P(\ell_i, r_0)}{4\pi\varepsilon_0} \qquad (12)$$

where the auxiliary function $P$ is defined [25] as:

$$P(\ell, v) = 2\ell \ln\frac{\ell + \sqrt{\ell^2 + v^2}}{v} - 2\left( \sqrt{\ell^2 + v^2} - v \right) \qquad (13)$$

according to [26], two cylindrical conductor segments can be parallel or nonparallel. two parallel cylindrical conductor segments, i-th and j-th, in homogeneous unbounded dielectric medium are shown in fig. 2. the i-th cylindrical conductor segment with endpoints t1 ($u_1$, $v_i$) and t2 ($u_2$, $v_i$) is observed in the local coordinate system (u, v) of the j-th cylindrical conductor segment.

fig. 2 two parallel cylindrical conductor segments in homogeneous unbounded dielectric medium

mutual potential coefficients of two parallel cylindrical conductor segments, i-th and j-th segment, in homogeneous unbounded dielectric medium can be obtained from the following expression:

$$pss^{un}_{ij} = \frac{C_1 + C_2 - C_3 - C_4}{4\pi\varepsilon_0} \qquad (14)$$

where the auxiliary function $C_k$ (k = 1, 2, 3, 4) is described by:

$$C_k = w_k \ln\left( w_k + \sqrt{w_k^2 + v_i^2} \right) - \sqrt{w_k^2 + v_i^2} \qquad (15)$$

$$w_1 = u_2 + \ell_j/2 \qquad (16)$$

$$w_2 = u_1 - \ell_j/2 \qquad (17)$$

$$w_3 = u_1 + \ell_j/2 \qquad (18)$$

$$w_4 = u_2 - \ell_j/2 \qquad (19)$$

in the case of two nonparallel cylindrical conductor segments, there is always one and only one pair of parallel planes in which these segments lie (fig. 3). in a limiting case, these two planes overlap and then the nonparallel segments lie on intersecting straight lines.

fig.
3 two nonparallel cylindrical conductor segments in homogeneous unbounded dielectric medium

mutual potential coefficients of two nonparallel cylindrical conductor segments in homogeneous unbounded dielectric medium, defined by using the galerkin-bubnov method, are described by:

$$pss^{un}_{ij} = \frac{1}{4\pi\varepsilon_0} \int_{\xi_1}^{\xi_2} \int_{\eta_1}^{\eta_2} \frac{d\xi\,d\eta}{r_{ij}} \qquad (20)$$

where the mutual distance between points on the axes of the segments is equal to:

$$r_{ij} = \sqrt{\xi^2 + \eta^2 - 2\,\xi\,\eta\cos\theta + d^2} \qquad (21)$$

where $d$ is the distance between the parallel planes in which the segments lie, $\theta$ is the angle between the lines on which the segments lie, $\xi$ is the distance between the observed point on the axis of the i-th segment and the start point o1, and $\eta$ is the distance between the observed point on the axis of the j-th segment and the start point o2.

after the double integration in (20) is carried out, the following expression, known as cejtlin’s formula [25], can be obtained:

$$pss^{un}_{ij} = \frac{A(\xi_1, \eta_1) + A(\xi_2, \eta_2) - A(\xi_1, \eta_2) - A(\xi_2, \eta_1)}{4\pi\varepsilon_0} \qquad (22)$$

where

$$A(\xi, \eta) = \xi \ln\left( r_{ij} + \eta - \xi\cos\theta \right) + \eta \ln\left( r_{ij} + \xi - \eta\cos\theta \right) + \frac{2d}{\sin\theta}\,\tan^{-1}\left( \frac{r_{ij} + \xi + \eta}{d}\,\tan\frac{\theta}{2} \right) \qquad (23)$$

self potential coefficients of the i-th cylindrical conductor segment, with linear charge density $\lambda_i$ approximated by a constant, in the air of the two-layer medium (fig. 4) can be computed using the well-known image method:

$$pss_{ii} = pss^{un}_{ii} + k_r\,pss^{un}_{sii} \qquad (24)$$

where $pss^{un}_{sii}$ is the mutual potential coefficient of the i-th cylindrical conductor segment and its image in homogeneous unbounded dielectric medium with permittivity $\varepsilon_0$, whereas $k_r$ is the reflection coefficient derived for a point current source [23, 27, 28], which can be approximated to high accuracy by $k_r = -1$ for power line frequencies as a consequence of the assumption that the earth’s conductivity is infinite.

fig.
4 cylindrical conductor segment in the air of the two-layer medium and its image

if the i-th and j-th cylindrical conductor segments are in the air of the observed two-layer medium (fig. 5), their mutual potential coefficient can be expressed by the image method:

$$pss_{ij} = pss^{un}_{ij} + k_r\,pss^{un}_{sij} \qquad (25)$$

where $pss^{un}_{sij}$ is the mutual potential coefficient of the i-th cylindrical conductor segment and the image of the j-th cylindrical conductor segment in homogeneous unbounded dielectric medium with permittivity $\varepsilon_0$.

fig. 5 two cylindrical conductor segments and the image of one of them

4. numerical example

in order to estimate the influence of conductive passive parts on the magnetic flux density distribution, a computer program was developed in which these passive parts can be considered on the basis of the presented theory. in the numerical example, two spans between three identical towers of a typical 400 kv overhead power line are observed, each carrying three phases with two conductors in the bundle per phase and two shield wires (fig. 6). detailed input data concerning the tower geometry, the maximum and minimum heights of all conductors and sags, the radii of all phase conductors and shield wires, the length of the overhead power line span, as well as electrical input data are given in [20]. the maximum allowed currents for the cross section of the chosen phase conductors and symmetrical operating conditions have been assumed. two different cases are observed. in the first case, only phase conductors and shield wires (16 catenaries, each approximated using 60 thin-wire cylindrical segments of active and passive conductors) are taken into account, whereas conductive passive parts (cpps) are neglected.
in the second case, in addition to the aforementioned catenaries, a central tower is approximated using 68 thin-wire cylindrical segments of passive conductors and 40 conductive passive contours are taken into account. the electrical conductivity of the cylindrical conductor segments $\sigma_v$ is equal to 7 MS/m, while the relative magnetic permeability of the cylindrical conductor segments is equal to 500. computation of the magnetic flux density distribution is carried out at a height of 1 m above the earth’s surface in the close vicinity of the central tower, along the observed x- and y-axes, in a total of 500 points.

fig. 6 simplified representation of the overhead power line

figures 7–10 present computed effective (rms) values of the total magnetic flux density and its components along the x-axis, whereas figures 11–14 present computed effective (rms) values of the total magnetic flux density and its components along the y-axis for the aforementioned two cases. maximum absolute deviations of the computed total magnetic flux density distribution along the x- and y-axes between the two cases in the chosen example are equal to 0.15 % and 0.89 %, respectively. as expected from the well-known parameters affecting the magnetic flux density distribution, these deviations due to conductive passive parts are small. nevertheless, they have not been quantified so far. the maximum computed value of the magnetic flux density in the close vicinity of the central tower (fig. 7), equal to 16.84 μt, and the maximum computed value under the midspan of the overhead power line, equal to 31.83 μt, are both substantially lower than the exposure limits given in [7].

fig. 7 distribution of the total magnetic flux density along x-axis

fig. 8 magnetic flux density x-component along x-axis

fig. 9 magnetic flux density y-component along x-axis

fig.
10 magnetic flux density z-component along x-axis

fig. 11 distribution of the total magnetic flux density along y-axis

fig. 12 magnetic flux density x-component along y-axis

fig. 13 magnetic flux density y-component along y-axis

fig. 14 magnetic flux density z-component along y-axis

in order to verify the accuracy of the presented algorithm, the magnetic flux density results computed by efc-400ep software [29] at several points are compared with the results obtained by the numerical algorithm given herein (fig. 15). the two sets of results agree to within 5 % (table 1). detailed input data of the 400 kv overhead power line are given in [29].

fig. 15 comparison of computed magnetic flux density results obtained by presented algorithm with results computed by efc-400ep software

table 1 shows the percent errors (p.e.) of the magnetic flux density results obtained by the presented algorithm with respect to the results computed by efc-400ep software, at the chosen points given in fig. 15, along the observed x-axis.

table 1 percent errors of magnetic flux density results obtained by presented algorithm with respect to results computed by efc-400ep software

x (m)    p.e. (%)
0        3.276
2.5      3.033
5.0      2.727
7.5      0.995
10.0     0.389
12.5     2.021
15.0     2.541
17.5     3.317
20.0     1.588
22.5     3.503
25.0     4.807

5. conclusion

in this paper, a 3d quasistatic numerical model for taking into account conductive passive parts of the overhead power lines and their effect on the computation of the magnetic flux density distribution is presented. the catenary conductors of the overhead power line span are approximated by a set of straight thin-wire cylindrical conductor segments. besides cylindrical segments of active conductors, the cylindrical segments of passive conductors are also taken into account using closed current contours, which can be natural or equivalent.
these conductive passive parts have a small influence on the magnetic flux density distribution, which has been quantified herein. this is primarily due to the extremely low frequency of the magnetic flux density produced by overhead power lines. an originally developed theoretical background is described in detail, including expressions for the self and mutual potential coefficients of cylindrical conductor segments and expressions for the self and mutual impedances per unit length of the conductive passive contours. the influence of conductive passive parts on the magnetic flux density is shown and quantified in the chosen numerical example of a typical 400 kV overhead power line. references [1] r. g. olsen and p. s. wong, "characteristics of low frequency electric and magnetic fields in the vicinity of electric power lines", ieee transactions on power delivery, vol. 7, no. 4, pp. 2046–2055. [2] r. fitzpatrick, maxwell’s equations and the principles of electromagnetism. infinity science press llc, hingham, 2008. [3] n. wertheimer and e. leeper, "electrical wiring configurations and childhood cancer", american journal of epidemiology, vol. 109, no. 3, pp. 273–284, 1979. [4] j. c. teepen and j. a. van dijck, "impact of high electromagnetic field levels on childhood leukemia incidence", international journal of cancer, vol. 131, no. 4, pp. 769–778, 2012. [5] international agency for research on cancer, monographs on the evaluation of carcinogenic risks to humans, non-ionizing radiation, part 1: static and extremely low-frequency (elf) electric and magnetic fields, vol. 80, iarcpress, lyon, france, 2002. [6] world health organization, extremely low frequency fields, environmental health criteria monograph no. 238, who press, geneva, 2007. [7] international commission on non-ionizing radiation protection, guidelines for limiting exposure to time-varying electric and magnetic fields (1 hz to 100 khz), health physics, vol. 99, no. 6, pp. 818–836, 2010. 
[8] ieee standard procedures for measurement of power frequency electric and magnetic fields from ac power lines, ieee standard 644-1994. doi: 10.1109/ieeestd.1995.122621. [9] measurement of dc magnetic, ac magnetic and ac electric fields from 1 hz to 100 khz with regard to exposure of human beings – part 1: requirements for measuring instruments, iec std 61786–1, 2013. [10] measurement of dc magnetic, ac magnetic and ac electric fields from 1 hz to 100 khz with regard to exposure of human beings – part 2: basic standard for measurements, iec standard 61786–2, 2014. [11] p. zhou, numerical analysis of electromagnetic fields. springer-verlag, berlin heidelberg, 1993. [12] c. garrido, a. f. otero and j. cidrás, "low-frequency magnetic fields from electrical appliances and power lines", ieee transactions on power delivery, vol. 18, no. 4, pp. 1310–1319, 2003. [13] g. filippopoulos, d. tsanakas, "analytical calculation of the magnetic field produced by electric power lines", ieee transactions on power delivery, vol. 20, no. 2, pp. 1474–1482, 2005. [14] f. moro and r. turri, "fast analytical computation of power-line magnetic fields by complex vector method", ieee transactions on power delivery, vol. 23, no. 2, pp. 1042–1048, 2008. [15] g. lucca, "magnetic field produced by power lines with complex geometry", european transactions on electrical power, vol. 21, no. 1, pp. 52–58, 2011. [16] a. z. el dein, "magnetic-field calculation under ehv transmission lines for more realistic cases", ieee transactions on power delivery, vol. 24, no. 4, pp. 2214–2222, 2009. [17] j. c. salari, a. mpalantinos and j. i. silva, "comparative analysis of 2and 3-d methods for computing electric and magnetic fields generated by overhead transmission lines", ieee transactions on power delivery, vol. 24, no. 1, pp. 338–344, 2009. [18] b. 
zemljaric, "calculation of the connected magnetic and electric fields around an overhead-line tower for an estimation of their influence on maintenance personnel", ieee transactions on power delivery, vol. 26, no. 1, pp. 467–474, 2011. [19] t. modrić, s. vujević and d. lovrić, "3d computation of the power lines magnetic field", progress in electromagnetics research m, vol. 41, pp. 1–9, 2015. [20] t. modrić and s. vujević, "computation of the electric field in the vicinity of overhead power line towers", electric power systems research, vol. 135, pp. 68–76, 2016. [21] s. vujević, v. boras and p. sarajčev, "a novel algorithm for internal impedance computation of solid and tubular cylindrical conductors", international review of electrical engineering, vol. 4, no. 6, pp. 1418–1425, 2009. [22] d. lovrić, v. boras and s. vujević, "accuracy of approximate formulas for internal impedance of tubular cylindrical conductors for large parameters", progress in electromagnetics research m, vol. 16, pp. 177–184, 2011. [23] j. a. stratton, electromagnetic theory. mcgraw-hill book company, new york and london, 1941. [24] m. r. spiegel, s. lipschutz and j. liu, mathematical handbook of formulas and tables, fourth ed., mcgraw-hill education, new york, 2012. [25] l. r. neiman and p. l. kalantarov, theoretical fundamentals of electrical engineering, part 3: theory of electromagnetic field, gosenergoizdat, moscow, leningrad, 1959 (in russian). [26] w. h. mccrea, analytical geometry of three dimensions. dover publications, new york, 2006. [27] s. vujević and p. sarajčev, "potential distribution for a harmonic current point source in horizontally stratified multilayer medium", compel: the international journal for computation and mathematics in electrical and electronic engineering, vol. 27, no. 3, pp. 624–637, 2008. [28] t. takashima, t. nakae and r. 
ishibashi, "high frequency characteristics of impedances to ground and field distributions of ground electrodes", ieee transactions on power apparatus and systems, vol. pas100, no. 4, pp. 1893–1900, 1981. [29] s. carsimamovic, z. bajramovic, m. rascic, m. veledar, e. aganovic and a. carsimamovic, "experimental results of elf electric and magnetic fields of electric power systems in bosnia and herzegovina", in proceedings of eurocon 2011, international conference on computer as a tool, lisbon, portugal, 2011, pp. 1–4. facta universitatis series: electronics and energetics vol. 32, n o 3, september 2019, pp. i-iii © 2019 by university of niš, serbia | creative commons license: cc by-nc-nd guest editorial energetics conference has been held annually since 2015, and it has been organized by research and development center alfatec in cooperation with mathematical institute of the serbian academy of sciences and arts and complex systems research centre cosrec. the conference co-organizers are the following respectable institutions: academy of sciences and arts of the republika srpska, faculty of mining and geology, university of belgrade, and faculty of technical sciences, uklo university st. climent ohridski. the conference is organized under the patronage of the ministry of education, science and technological development of the republic of serbia. 
the aim of the energetics conference is to provide an opportunity for researchers to exchange and discuss their work in a variety of areas, including:
• energy management
• energy modeling, planning and policies
• energy efficiency and conservation
• energy pricing policies
• new technologies for energy saving
• energy and climate change
• sustainable energy technologies
• renewable energy and alternative fuels
• computational methods in energy economics
• energy economics
the papers appearing in this issue are from energetics 2018, the 4th virtual international conference on science, technology and management in energy, held on october 25-26. there are six refereed contributed papers in this special issue. preliminary versions were presented at the energetics 2018 conference and published in its proceedings. the papers included here are fully refereed, revised and extended versions.
circuit breaker replacement strategy based on the substation risk assessment, by dragan stevanović and aleksandar janjić, develops a methodology based on real field data. using statistical data for 427 circuit breakers, the weibull probability distribution of contact resistance for circuit breakers is determined. the authors use this methodology to calculate substation reliability and circuit breaker removal costs.
cloud-based scada systems: cyber security considerations and future challenges, by mirjana d. stojanović, slavica v. boštjančič rakas and jasna d. marković petrović, describes cloud-based scada systems and focuses on cloud service selection as well as on the analysis of the benefits and risks of cloud-based scada applications. the authors address security threats in the cloud environment and present the challenges in security provisioning regarding security solutions, risk management and the test environment.
low cost cup electronic anemometer, by elson avallone, paulo césar mioralli, pablo sampaio gomes natividade, paulo henrique palota and josé ferreira da costa, describes the cup anemometer, an easy-to-build and low-cost device that is a good choice for small farmers, for the evaluation of wind turbines and, especially, for meteorological stations. the reed switch sensor is a further advantage, as it does not require sophisticated programming, as is the open arduino platform. the sensor was developed as part of the project of a meteorological station to monitor the microclimate of the city of catanduva-sp, brazil.
transformative and disruptive role of local direct current power networks in power and transportation sectors, by prahaladh paniyil, rajendra singh, amir asif, vishwas powar, guneet bedi and john kimsey, discusses the best possible energy solution for a smart community. the authors focus on decentralized power generation, storage and distribution through photovoltaics and lithium batteries. the paper covers the need for local direct current (dc) power and provides an example of local dc power in the surface transport sector.
calculation of losses in the distribution grid based on big data, by lazar sladojević, aleksandar janjić and marko ćirković, provides a new approach to the calculation of losses in electrical distribution. this is done by analyzing the data available from the distribution grid operator. the data set used is available in the serbian distribution grid operator’s report for the year 2017.
the influence of nonlinear background on the quality of electricity, by enver agić, damir šljivac and bakir agić, discusses the power distribution network load and analyzes the three-phase part of the electrical network in which the ygyg transformer connects the nonlinear circuits of a set of personal computers (pcs). the load is balanced at each stage.
we are truly grateful to all the authors for their contributions to this special issue. 
we acknowledge the important contribution of the energetics 2018 program committee members, listed below, for their valuable comments and reviews on the contributed papers. we also express our sincere gratitude to prof. ninoslav d. stojadinovic, editor-inchief, and dr. danijel m. dankovic, technical secretary, facta universitatis: electronics and energetics series, for their support on this special issue. we sincerely hope that publication of these results will stimulate continued research in the field of energy. dr. lazar z. velimirovic mathematical institute sasa kneza mihaila 36 11001 belgrade serbia energetics 2019 program committee chair: prof. dr. aleksandar janjić, faculty of electronic engineering, serbia members: prof. dr. zoran stajić, faculty of electronic engineering, serbia prof. dr. detelin markov, faculty of power engineering and power machines, bulgaria prof. dr. marko serafimov, faculty of mechanical engineering, macedonia prof. dr. mileta janjić, faculty of mechanical engineering, montenegro prof. dr. miomir stanković, faculty of occupational safety, serbia prof. dr. enver agić, public enterprise electric utility of bosnia and herzegovina, bosnia and herzegovina prof. dr. niko majdandžić, faculty of mechanical engineering, croatia prof. dr. serkan abbasoglu, cyprus international university, turkey dr. lazar velimirović, mathematical institute of the serbian academy of sciences and arts, serbia prof. dr. bojan srđević, faculty of agriculture, serbia prof. dr. abdelhak djoudi, national polytechnic school, algeria prof. dr. suzana savić, faculty of occupational safety, serbia prof. dr. zdravko milovanović, faculty of mechanical engineering, bosnia and herzegovina prof. dr. miloš jelić, research and development center alfatec, serbia prof. dr. zoran markov, faculty of mechanical engineering, macedonia prof. dr. krsto miljanović, agromediterranean faculty, bosnia and herzegovina prof. dr. 
krum todorov, faculty of power engineering and power machines, bulgaria prof. dr. zoran jovanović, faculty of electronic engineering, serbia prof. dr. dragoljub mirjanić, academy of sciences and arts of republic of srpska, bosnia and herzegovina prof. dr. zoran gligorić, faculty of mining and geology, serbia prof. dr. ljubiša papić, faculty of technical sciences cacak, serbia prof. dr. goran janaćković, faculty of occupational safety, serbia dr. wassila issaadi, faculty of technology, university of bejaia, algeria prof. dr. roddy lollchund, university of mauritius, republic of mauritius
facta universitatis series: electronics and energetics vol. 29, no. 4, december 2016, pp. 689–700 doi: 10.2298/fuee1604689d
a novel analytical method for the selective multiplierless linear-phase 2d fir filter function
jelena r. djordjević-kozarov, vlastimir d. pavlović university of niš, faculty of electronic engineering, niš, serbia
abstract. in this paper, a novel analytical method is proposed for a new class of selective linear-phase two-dimensional (2d) finite impulse response (fir) filter functions, generated by applying a new modified 2d christoffel–darboux formula for the classical orthogonal chebyshev polynomials of the first and the second kind. the fundamental research proposed in this paper is illustrated by examples of 2d fir filters and by a comparison with a new class of multiplierless linear-phase 2d fir filter functions given in the literature.
key words: 2d fir filter function, multiplierless, linear-phase, frequency response analysis, chebyshev polynomials, hilbert transform
1. introduction
successful applications of powerful orthogonal polynomials in filter theory are well known and are described in [1]. many problems in various scientific and technical areas have been solved by applying the classical christoffel-darboux formula for all classical orthogonal polynomials. 
the new class of explicit filter functions for continuous signals, generated by the classical christoffel-darboux formula for the classical jacobi orthogonal polynomials, is given in detail in [2]. the design of linear-phase fir filters for defined specifications is discussed in [3]. the grid density requirement for the design of fir filters, with a useful design rule, is presented in detail in [4]. in [5] the authors present the relationship between the accuracy and the frequency grid density in 2-d filter designs, and propose a new formula for determining the frequency grid spacing. a one-dimensional half-band linear-phase fir filter design approach is used efficiently in the realization of a 2d linear-phase fir filter in [6]. the paper [7] describes an approach for the successful design of 2d fir filters with multipliers. moreover, 2d fir filters with nonstandard specifications are designed using the transformation technique in [8, 9]. these designs are based on the transformation of one-dimensional fir filters, as well as on direct application of approximation techniques in two dimensions.
received october 13, 2015; received in revised form april 16, 2016
corresponding author: jelena r. djordjević-kozarov, university of niš, faculty of electronic engineering, aleksandra medvedeva 14, serbia (email: jelena.djordjevic-kozarov@elfak.ni.ac.rs)
a simple recurrence formula for computing the impulse response coefficients of the sinc^n fir filter, consisting of a cascade of n sinc filters, each of length m, is presented in [11]. the initial consideration for the synthesis of the 2d fir filter functions is given in a short paper [12], where a christoffel-darboux formula for four orthogonal polynomials on two equal finite intervals is proposed for generating filter functions. 
in [13], an analytical method for the synthesis of multiplierless linear-phase 1d and 2d fir filter functions in explicit form, using chebyshev orthogonal polynomials of the first kind, is described in detail. in [14], an analytical method is described for the synthesis of multiplierless linear-phase 2d fir filter functions in a compact form that can have the effect of a hilbert transformer in the z2 domain. a novel analytical method for a new class of linear-phase 2d fir filter functions with the full effect of a hilbert transformer in the z1 and z2 domains is proposed in [15]. the main motivation for this research is the extremal property of the christoffel-darboux sum for the classical orthogonal polynomials, which provides new results in the continuous domain, the 1d z domain and the 2d z domain. these results are written in explicit form and contribute substantially to filter theory. it should be emphasized that the multiplierless solutions are new solutions worthy of attention. there is no coefficient quantization effect, because all the filter coefficients are equal in modulus. this paper presents a further generalization of the previous research [4] in two dimensions. the proposed solution acts as a filter function in the z1 domain and as a hilbert transformer in the z2 domain and, together with the solution from [14], constitutes a whole. an analytical method based on the christoffel-darboux formula for the classical orthogonal chebyshev polynomials of the first and the second kind is proposed in this paper in explicit form in the continuous domain. the new class of linear-phase 2d fir digital filters, generated by the proposed modified formula and by direct mapping from the continuous domain into the 2d z domain, is also given. examples of the efficient design of the new class of selective linear-phase 2d fir filter functions are given as an illustration. 2. 
review of the 2d fir filter function
a multiplierless linear-phase 2d fir filter function with two free real parameters is considered in this paper. a linear-phase 2d fir filter of order (m × m) is defined by
$$H(m, z_1, z_2) = k_1 \sum_{r=0}^{m} \sum_{k=0}^{m} h(m, r, k)\, z_1^{-r} z_2^{-k}$$ (1)
where m is the desired order of the filter, $k_1$ is a real constant and $h(m, r, k)$ are the impulse response coefficients, which are real numbers. the squared filter frequency response, in absolute units, can be presented by
$$|H(m, z_1, z_2)|^2 = H(m, z_1, z_2)\, H(m, z_1^{-1}, z_2^{-1}), \quad \text{for } z_1 = e^{i\omega_1},\ z_2 = e^{j\omega_2}$$ (2)
or
$$|H(m, e^{i\omega_1}, e^{j\omega_2})|^2 = H(m, e^{i\omega_1}, e^{j\omega_2})\, H(m, e^{-i\omega_1}, e^{-j\omega_2})$$ (3)
alternatively, in dB, the filter frequency response can be presented by
$$a\,(\mathrm{dB}) = 20 \log_{10} |H(m, e^{i\omega_1}, e^{j\omega_2})|$$ (4)
3. new class of two-dimensional linear phase multiplierless fir filter functions
directly applying the formula proposed by eq. (a.8), the new class of non-causal two-dimensional symmetric fir filter functions can be obtained as
$$H(m, z_1, z_2) = k_1 \sum_{r=1}^{m} \left[ \frac{T_r\!\left(\frac{z_1 + z_1^{-1}}{2}\right) \frac{z_1 - z_1^{-1}}{2i}\, U_{r+1}\!\left(\frac{z_1 + z_1^{-1}}{2}\right)}{h_2(r)\, h_1(r)} \right] \left[ \frac{T_r\!\left(\frac{z_2 + z_2^{-1}}{2}\right) \frac{z_2 - z_2^{-1}}{2j}\, U_{r+1}\!\left(\frac{z_2 + z_2^{-1}}{2}\right)}{h_2(r)\, h_1(r)} \right]$$ (5)
or
$$H(m, z_1, z_2) = i j\, k_1 \sum_{r=1}^{m} \left[ (z_1^{2r} - z_1^{-2r})(z_2^{2r} - z_2^{-2r}) \right]$$ (6)
multiplying eq. (6) by the factor $z_1^{-2m} z_2^{-2m}$, the filter function can be generated as
$$H(m, z_1, z_2) = i j\, k_1 \sum_{r=1}^{m} \left[ (z_1^{-2m+2r} - z_1^{-2m-2r})(z_2^{-2m+2r} - z_2^{-2m-2r}) \right]$$ (7)
it is obvious from eq. (7) that the linear-phase fir filter contains no multipliers and has only adders. 
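the multiplierless property can be checked numerically. the following is a minimal sketch, not the authors' implementation: it assumes that eq. (7) reads $H(m, z_1, z_2) \propto \sum_{r=1}^{m} (z_1^{-2m+2r} - z_1^{-2m-2r})(z_2^{-2m+2r} - z_2^{-2m-2r})$, omits the constant prefactor, and uses the hypothetical helper names `coefficients` and `response`. under that assumption every nonzero coefficient is +1 or -1, so only additions and subtractions are needed.

```python
import numpy as np


def coefficients(m):
    """Coefficient array c[a, b] multiplying z1**(-a) * z2**(-b).

    Assumed reading of eq. (7), constant prefactor omitted:
    sum over r of (z1^(-2m+2r) - z1^(-2m-2r)) * (z2^(-2m+2r) - z2^(-2m-2r)).
    """
    size = 4 * m + 1
    c = np.zeros((size, size), dtype=int)
    for r in range(1, m + 1):
        lo, hi = 2 * m - 2 * r, 2 * m + 2 * r  # delays of the two terms
        c[lo, lo] += 1
        c[hi, hi] += 1
        c[lo, hi] -= 1
        c[hi, lo] -= 1
    return c


def response(c, w1, w2):
    """Evaluate the polynomial on the unit circle at (w1, w2)."""
    a = np.arange(c.shape[0])
    return np.exp(-1j * w1 * a) @ c @ np.exp(-1j * w2 * a)


m = 6
c = coefficients(m)
nonzero = c[c != 0]
print(sorted(set(nonzero.tolist())))  # [-1, 1]
print(nonzero.size)                   # 24, i.e. 4 * m

# Magnitude check against 4 * |sum_r sin(2 r w1) sin(2 r w2)|.
w1, w2 = 0.3, 0.7
r = np.arange(1, m + 1)
expected = 4 * abs(np.sum(np.sin(2 * r * w1) * np.sin(2 * r * w2)))
print(np.isclose(abs(response(c, w1, w2)), expected))  # True
```

the factor 4 in the last check comes from $(z^{2r} - z^{-2r}) = 2i \sin(2r\omega)$ on the unit circle, applied once per variable.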
the frequency response, $H(m, e^{i\omega_1}, e^{j\omega_2})$, can be defined as
$$H(m, e^{i\omega_1}, e^{j\omega_2}) = e^{-i(2m\omega_1 - \pi/2)}\, e^{-j(2m\omega_2 - \pi/2)} \sum_{r=0}^{m} \sin(2r\omega_1) \sin(2r\omega_2)$$ (8)
and the magnitude characteristic is defined as
$$|H(m, e^{i\omega_1}, e^{j\omega_2})| = \left| \sum_{r=0}^{m} \sin(2r\omega_1) \sin(2r\omega_2) \right|$$ (9)
and the amplitude characteristic, $a(m, \omega_1, \omega_2)$, is defined as
$$a(m, \omega_1, \omega_2) = \sum_{r=0}^{m} \sin(2r\omega_1) \sin(2r\omega_2)$$ (10)
the linear-phase function, $\varphi(m, \omega_1, \omega_2)$, can be defined as
$$\varphi(m, \omega_1, \omega_2) = e^{-i(2m\omega_1 - \pi/2)}\, e^{-j(2m\omega_2 - \pi/2)}$$ (11)
a filter function of k = 2r cascaded identical blocks can be written as $(H(m, z_1, z_2))^{2r}$. if the filter function $H(m, 2r, z_1, z_2)$ is formed as a product of three functions of successive orders m - 1, m and m + 1, then the filter function is given by:
$$H(m, 2r, z_1, z_2) = \left[ H(m-1, z_1, z_2)\, H(m, z_1, z_2)\, H(m+1, z_1, z_2) \right]^{2r}$$ (12)
4. examples of the new class of two-dimensional linear phase fir filter functions
the proposed design algorithm for the original 2d fir filter function is limited with respect to the filter dimension (6mr × 6mr) and with respect to the values of the two free integer parameters 2r and m. this means that the possible forms of the linear-phase characteristic are limited. in table 1, the form of the linear-phase characteristic and the type of the filter function are given for some low values of 2r and m. when 2r is an even number, the proposed filter function has filter properties in both domains. 
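the amplitude characteristic and the cascaded form can be evaluated on a frequency grid as follows. this is a hedged sketch under stated assumptions: eq. (10) is read as $a(m, \omega_1, \omega_2) = \sum_{r} \sin(2r\omega_1)\sin(2r\omega_2)$, eq. (12) as the product of the blocks of orders m - 1, m and m + 1 raised to the power 2r, and the result is normalized to its grid maximum before conversion to dB. the names `amplitude`, `cascade_db` and `two_r` are illustrative only; `two_r` stands for the cascade parameter 2r, to keep it apart from the summation index r.

```python
import numpy as np


def amplitude(m, w1, w2):
    """Assumed eq. (10): sum over r = 1..m of sin(2 r w1) * sin(2 r w2).

    The r = 0 term is identically zero, so it is skipped.
    """
    w1 = np.asarray(w1, dtype=float)[..., None]
    w2 = np.asarray(w2, dtype=float)[..., None]
    r = np.arange(1, m + 1)
    return np.sum(np.sin(2 * r * w1) * np.sin(2 * r * w2), axis=-1)


def cascade_db(m, two_r, w1, w2):
    """Assumed eq. (12) on the unit circle, normalized, in dB."""
    a = amplitude(m - 1, w1, w2) * amplitude(m, w1, w2) * amplitude(m + 1, w1, w2)
    mag = np.abs(a) ** two_r
    mag = mag / mag.max()
    # Floor the magnitude so that exact zeros do not produce -inf.
    return 20 * np.log10(np.maximum(mag, 1e-300))


w = np.linspace(0.0, np.pi, 201)
W1, W2 = np.meshgrid(w, w, indexing="ij")
db = cascade_db(6, 4, W1, W2)       # m = 6, 2r = 4
passband = db >= -0.28              # grid points within 0.28 dB of the peak
print(db.shape, passband.sum())
```

counting the grid points in `passband` relative to the grid size gives a rough estimate of a normalized pass-band area; the grid resolution here is an arbitrary choice and would need refining for quantitative use.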
table 1 explicit form of the linear-phase characteristics of the proposed fir filter for some low values of the free integer parameters 2r and m
2r   m    $\varphi(m, 2r, \omega_1, \omega_2) = e^{-i(4mr\omega_1)}\, e^{-j(4mr\omega_2)}$    type
4    4    $\varphi(4, 4, \omega_1, \omega_2) = e^{-i(32\omega_1)}\, e^{-j(32\omega_2)}$    z1 filter, z2 filter
4    5    $\varphi(5, 4, \omega_1, \omega_2) = e^{-i(40\omega_1)}\, e^{-j(40\omega_2)}$    z1 filter, z2 filter
4    6    $\varphi(6, 4, \omega_1, \omega_2) = e^{-i(48\omega_1)}\, e^{-j(48\omega_2)}$    z1 filter, z2 filter
4    7    $\varphi(7, 4, \omega_1, \omega_2) = e^{-i(56\omega_1)}\, e^{-j(56\omega_2)}$    z1 filter, z2 filter
4    8    $\varphi(8, 4, \omega_1, \omega_2) = e^{-i(64\omega_1)}\, e^{-j(64\omega_2)}$    z1 filter, z2 filter
4    9    $\varphi(9, 4, \omega_1, \omega_2) = e^{-i(72\omega_1)}\, e^{-j(72\omega_2)}$    z1 filter, z2 filter
4    10   $\varphi(10, 4, \omega_1, \omega_2) = e^{-i(80\omega_1)}\, e^{-j(80\omega_2)}$    z1 filter, z2 filter
4    11   $\varphi(11, 4, \omega_1, \omega_2) = e^{-i(88\omega_1)}\, e^{-j(88\omega_2)}$    z1 filter, z2 filter
using the standard technique, the amplitude, magnitude and phase characteristics are obtained from eq. (7) for the numerical values 2r = 4 and m = 6, and detailed characteristics of the filter function $H(m, 2r, z_1, z_2)$ are given in the following figures.
fig. 1 a) 3d plot of the normalized amplitude characteristic of the proposed 2d fir filter for 2r = 4 and m = 6; b) zoomed view of panel a)
examples of the pass-band and stop-band characteristics of the considered fir filter function for given values of attenuation, $a(m, 2r, \omega_1, \omega_2)$, are shown in fig. 2.
fig. 2 2d contour plot of normalized magnitude characteristics: a) shape of the pass-band with attenuation of 0.28 dB for 2r = 4 and m = 6; b) shape of the stop-band with attenuation of 100 dB for 2r = 4 and m = 6
fig. 3 shows the initial part of the phase characteristic of the considered linear-phase multiplierless 2d fir filter function for the same values of the free integer parameters, i.e., 2r = 4 and m = 6.
fig. 3 3-d plot of the phase characteristic of proposed 2d fir filter for 2r = 4 and m = 6
5. comparison
in this part of the paper, the proposed solution for the multiplierless linear-phase 2d fir filter functions is compared with the solution described in [13]. for the same values of the free parameters m and 2r, and the same value of the constant group delay, we compare the amplitude response characteristics and the cut-off frequencies of the pass-band and of the stop-band of the filters. for the value of the free integer parameter 2r = 4, the filter from [13] and the filter proposed in this paper both have the filter property in the z1 and in the z2 domain. for the same value m = 6, we compare the amplitude response characteristics, as well as the cut-off frequencies of the pass-band with a defined attenuation of 0.28 dB and the cut-off frequencies of the stop-band with an attenuation of 100 dB. the two compared filter functions have the same values of the free integer parameters, and thus the same form of the linear-phase characteristic, which is given in compact explicit form by
$$\varphi(6, 4, \omega_1, \omega_2) = e^{-i(48\omega_1)}\, e^{-j(48\omega_2)}$$ (13)
fig. 4 shows the amplitude response characteristics of the 2d fir filter from [13] and of the proposed solution, respectively, for the free parameters 2r = 4 and m = 6.
fig. 4 3d contour plot of amplitude response characteristics, review of the comparison between: a) solution from [13], and b) the proposed filter from eq. (10)
fig. 5 shows zoomed views of the pass-bands of the filters with attenuation of 0.28 dB for the solution from [13] and for the proposed solution, respectively, for the parameters 2r = 4 and m = 6.
fig. 5 2d contour plot of normalized magnitude characteristics, shape of the pass-band with attenuation of 0.28 dB; review of the comparison between: a) solution from [13], and b) the proposed filter from eq. (10)
fig. 6 shows the stop-bands of the filters with attenuation of 100 dB for the solution generated by the expressions from [13] and for the proposed solution, respectively.
fig. 6 2d contour plot of normalized magnitude characteristics, shape of the stop-band with attenuation of 100 dB; review of the comparison between: a) solution from [13], and b) the proposed filter from eq. (10)
table 2 and table 3 give the surface areas of the pass-band and the stop-band, respectively, of the considered 2d fir filter function for given values of maximal attenuation, compared with the 2d fir filter function given in [13]. the results in table 3 are given in percent (%) of the total area of the amplitude characteristic.
table 2 normalized surface area of pass-band for proposed 2d fir filter function for given values of maximal attenuation compared with 2d fir filter function given in [13]
2r   m   $a_{pass}$ (dB)   pass-band area, proposed filter function   pass-band area, filter function from [13]
4    6   0.28              $24.32842191 \cdot 10^{-4}$                $0.67792993 \cdot 10^{-4}$
table 3 normalized surface area of stop-band for proposed 2d fir filter function for given values of maximal attenuation compared with 2d fir filter function given in [13]
2r   m   $a_{stop}$ (dB)   stop-band area, proposed filter function (%)   stop-band area, filter function from [13] (%)
4    6   100               21.405425                                      34.789375
6. 
conclusion
this paper presents an original approach to the synthesis of multiplierless linear-phase 2d symmetric fir digital filter functions, bringing significant improvements to filter theory. the new christoffel-darboux formula for the classical orthogonal chebyshev polynomials of the first and the second kind is proposed in the appendix of this paper. the presented formula can be used for solving extremely complicated problems of linear-phase 2d filter design with high selectivity and high order. the transition from the continuous domain into the 2d z domain is presented. this new formula can be directly applied in generating 2d filter functions. all parasitic effects, such as the gibbs phenomenon, are suppressed and there is no need for multipliers. filters designed by the proposed method can be applied in various areas, such as telecommunications, medicine, pharmacy, seismology, general localization and diagnostics, where they can be of special interest. examples of the 3d frequency responses and the corresponding 2d contour plots of the proposed linear-phase 2d fir filter are also presented. these examples illustrate the advantages of the proposed approach and an efficient way of designing ultra-selective filters. the difference between the research described in [13] and the proposed new classes of filter functions, for even and odd values of the free parameter 2r, is as follows. the formulae in the z domain proposed in this paper and in [13] are modeled on the extremal properties of the christoffel-darboux sum for the continuous classical orthogonal polynomials [3]-[5]. the proposed multiplierless filter functions do not suffer from the quantization of filter coefficients; these solutions remain very useful for real-time operation and require a minimum area in integrated implementation. the undesirable gibbs phenomenon, present in analog and in 1d digital filters, has been completely eliminated by the proposed solution. in many practical applications that require minimum dissipation of the dc power supply, this solution realizes circuits without multipliers and without quasi multipliers.
acknowledgement: the paper is a part of the research done within the project no. 32023, funded by the ministry of science of the republic of serbia.
references [1] m. abramowitz, i. stegun, handbook on mathematical function, national bureau of standards, applied mathematics series, usa, 1964. [2] v. d. pavlović, “new class of filter functions generated directly by the modified christoffel–darboux formula for classical orthonormal jacobi polynomials”, international journal of circuit theory and applications, john wiley & sons, vol. 40, pp. 1059–1073, 2013. [3] s. n. hazra, m. s. reddy, “design of circularly symmetric low–pass two–dimensional fir digital filters using transformation”, ieee transactions on circuits and systems, vol. 33, no. 10, pp. 1022–1026, 1986. [4] r. h. yang, y. c. lim, “grid density for design of one- and two-dimensional fir filters”, electronics letters, vol. 27, no. 22, pp. 2053–2055, 1991. [5] s. h. low, y. c. lim, “frequency grid density for the design of 2-d fir filters”, electronics letters, vol. 32, no. 16, pp. 1460–1461, 1996. [6] a. klouche-djedid, s. s. lawson, “simple design and realisation of linear phase 2d fir filters with diamond frequency support”, electron. letters, vol. 35, no. 14, pp. 1148–1150, 1999. [7] n. vijayakumar, k. m. m. prabhu, “two-dimensional fir compaction filter design”, iee proc., vis. image process, vol. 148, no. 3, pp. 173–181, 2001. [8] v. l. narayana murthy, a. makur, “design of some 2-d filters through the transformation technique”, iee proc., vis. image process, vol. 143, no. 3, pp. 184–190, 1996. [9] b. g. mertzios, a. n. 
venetsanopoulos, “fast block implementation of two-dimensional fir digital filters via the walsh–hadamard decomposition”, international journal of electronics, vol. 68, no. 6, pp. 991-1004, 1990. [10] s. k. mitra, digital signal processing. – the mcgraw–hill companies, new york, usa, 1998. [11] s. c. dutta roy, “impulse response of sinc n fir filters”, ieee transactions on circuits and systems, vol. 53, no. 3, pp. 217–219, 2006. [12] d. g. ćirić, v. d. pavlović, “linear phase two-dimensional fir digital filter functions generated by applying christoffel-darboux formula for orthonormal polynomials”, elektronika ir elektrotechnika, vol. 4, pp. 39-42, 2012. [13] v.d. pavlović, n. doncov, d.g. ćirić, “1d and 2d economical fir filters generated by chebyshev polynomials of the first kind”, int. journal of electr., vol. 100, no. 11, pp. 1592-1619, 2013. [14] j.r. djordjevic-kozarov, v.d. pavlovic, “an analytical method for the multiplierless 2d fir filter functions and hilbert transform in z2 domain”, ieee trans. on circuits and systemsii: express briefs, vol. 60, no. 8, pp. 527-531, 2013. [15] v.d. pavlovic, j.r. djordjevic-kozarov, “ultra-selective spike multiplierless linear-phase twodimensional fir filter function with full hilbert transform effect”, iet circuits devices syst., vol. 8, no. 6, pp. 532–542, 2014. appendix 1. 
Proposed mathematical background

If $U_{r+1}(x)$ and $U_{r+1}(y)$ are two sets of the orthogonal Chebyshev polynomials of the second kind [5], where $x$ and $y$ are real variables and $r$ is the order of the continuous non-periodical polynomials on the finite intervals $-1 \le x \le 1$ and $-1 \le y \le 1$, respectively, with regard to the nonnegative continuous weight functions $w_1(x)$ and $w_2(y)$, defined as

\[ w_1(x) = \sqrt{1 - x^2} \tag{A.1} \]

and

\[ w_2(y) = \frac{1}{\sqrt{1 - y^2}} \tag{A.2} \]

then, for the orthogonal Chebyshev polynomials of the first kind, $T_r(y)$, and the second kind, $U_{r+1}(x)$, the following relations are valid:

\[ \int_{-1}^{1} w_1(x)\, U_{m+1}(x)\, U_{k+1}(x)\, dx = 0; \quad m \ne k; \quad m, k = 0, 1, 2, 3, \ldots \tag{A.3} \]

and

\[ \int_{-1}^{1} w_2(y)\, T_m(y)\, T_k(y)\, dy = 0; \quad m \ne k; \quad m, k = 0, 1, 2, 3, \ldots \tag{A.4} \]

For the polynomial $T_r(y)$, the $r$-th order norm $h_2(r)$ is

\[ h_2(r) = \int_{-1}^{1} w_2(y)\, \bigl(T_r(y)\bigr)^2\, dy = \begin{cases} \pi, & r = 0 \\ \pi/2, & r = 1, 2, 3, \ldots \end{cases} \tag{A.5} \]

and for the polynomial $U_{m+1}(x)$, the $m$-th order norm $h_1(m)$ is

\[ h_1(m) = \int_{-1}^{1} w_1(x)\, \bigl(U_{m+1}(x)\bigr)^2\, dx = \frac{\pi}{2}, \quad m = 1, 2, 3, \ldots \tag{A.6} \]

A novel analytical method for the linear-phase two-dimensional symmetric FIR digital filter functions, generated by applying the modified Christoffel-Darboux formula with alternating sign, is proposed. The components of that sum are determined by multiplying the orthogonal classical Chebyshev polynomials of the first and the second kind, with $x$ and $y$ as real continuous variables, $U_{r+1}(x)$, $U_{r+1}(y)$ and $T_r(x)$, $T_r(y)$, $r = 1, 2, \ldots, n$ (where $n$ is the order of the continuous orthogonal polynomials), on the same finite interval $[-1, 1]$, in the following explicit form of the orthogonal components: or 1 1 1 1 2 1 2 ( ) ( ) ( ) ( ) ( , ) sin ( ) sin ( ) ( ) ( ) ( ) ( ) n r r r r n r t x u x t y u y x y x y h r h r h r h r (a.7) i. e.
2 1 1 1 2 ( , ) sin ( ) sin ( ) ( ) ( ) ( ) ( ) n n r r r r r x y x y t x u x t y u y              (a.8) using the standard technique, the eq. (a.8) can be mapped into the new domains, analogue, s, and digital, z, [3, 4, 17-19]. thus, in the z1 domain for example, the following relations are always valid: 1 1 1 1 1 1 22 2 2 2 2 ( cos ) cos ( ) ( ) / 2 ( ) / 2 ( cos ) cos ( ) ( ) / 2 ( ) / 2 j k j k k k k j kj k k k k t x k e e z z t y k e e z z                        (a.9) where tk (x = cos 1) and tk (y = cos 2) are the orthogonal continuous chebyshev polynomials of the first kind, and 1 1 1 1 1 1 1 2 2 2 1 2 2 2 sin ( ) ( ) sin ( ) ( ) /(2 ) ( ) /(2 ) sin ( ) ( y ) sin ( ) ( ) /(2 ) ( ) /(2 ) i k i k k k k j k j k k k k u x k e e i z z i u k e e j z z j                         (a.10) where uk+1 (x = cos 1) and uk+1 (y = cos 2) are the orthogonal continuous chebyshev polynomials of the second kind. as we can see, for odd k, e.g. k = 9, following eq. (a.12) and eq. (a.13) the orthogonal chebyshev polynomials tk (y) and uk+1 (x), respectively, becomes 700 j. r. djordjević-kozarov, v. d. pavlović 9 9 2 2 9 9 9 2 2 2 ( ) cos ( 9 ) ( ) / 2 ( ) / 2 j j t y e e z z           (a.11) and 9 9 1 1 9 9 1 10 1 1 1 sin ( ) ( ) sin ( 9 ) ( ) / 2 ( ) / 2 i i u x e e z z            (a.12) instruction facta universitatis series: electronics and energetics vol. 29, n o 1, march 2016, pp. 113 126 doi: 10.2298/fuee1601113v 3-d stereoscopic modeling of the tesla’s long island  vladan vučković, sanja spasić faculty of electronic engineering, university of niš, serbia abstract. this paper presents in detail the methods for realization of the basic software infrastructure for the conversion of 3-d animation of tesla’s laboratory in long island to modern stereoscopic 3-d formats. 
modeling of tesla’s lab is done in cooperation with the nikola tesla museum in belgrade on a project entitled “computer simulation and modeling of the original patents of nikola tesla” approved by the ministry of education and science of the republic of serbia. in recent years, there has been a revolution in the field of 3-d technology, so it is clear that this will be the strategic direction of the progressing of television, cinema screenings and presentations in the future. using modern technology for generating and conversion to stereoscopic 3-d format, the authors show in detail the procedure that was used in the realization of this segment of the project. the complete improved 3d developing pipeline from the original photograph to the stereoscopic 3d real-time model is also presented. the novelty in the phase of semiautomatic materialization of the wire models is also described. key words: computer simulation, 3-d modeling, nikola tesla’s long island laboratory. 1. introduction three-dimensional (3-d) [1], [2] or stereoscopic 3-d (s3-d) is the presentation which improves the illusion of the depth of perception. it creates the impression of the third dimension. it is accompanied by 3-d technology based on all logical principles. however, the feature that makes this technology extraordinary is the very process of the picture presentation. 3-d or stereoscopic picture is connected to the ways in which eyes and brain create the impression of the existence of the third dimension. two very similar pictures reach the brain creating a particular disparity. this disparity actually represents 3-d, the third dimension projected by our brain as it receives two different perspectives of the same thing. this very effect is the goal that modern 3-d technology tries to accomplish through distribution of different sequences towards both the left and the right human eye. this kind of technology is very fast spread and acquired by the users. 
Received December 26, 2014; received in revised form September 15, 2015
Corresponding author: Vladan Vučković, Faculty of Electronic Engineering, University of Niš, Aleksandra Medvedeva 14, 18000 Niš, Serbia (e-mail: vladanvuckovic24@gmail.com)

Since new pictures, cartoons and animated films are increasingly produced in these formats, it is important for this high-quality material to be accompanied by appropriate players that fully show its quality. Some of the most commonly used media players that support the presentation of stereoscopic material, in most cases films, are: Stereoscopic Player, NVIDIA 3-D Vision Video Player, Ultra Studio 3-D, p2gstereostage, firmware version 3.50, etc.

Since 3-D technology is being adopted very rapidly, the formats that guide this technology are being developed rapidly too. Stereoscopic formats belong to the group of standards for coding audio and video signals of the third dimension. These new formats are more complex and require more powerful devices for processing, but they are more efficient: a better video quality can be achieved for the same data rate. One of the formats on the market that stands out from the whole range is JPEG2000. It can easily be converted into the MXF format, of which it is a constituent part.

This paper has the following structure. First, we describe the standard stereoscopic formats in use today. Our 3D application is based on these formats. Then, we describe the complex procedure of generating the 3D model of Nikola Tesla's Long Island lab, from the original photographs to the stereoscopic model. We introduce an original algorithm for semiautomatic pattern re-changing. This technique helps to improve and accelerate this segment of the project.
The novelty of our work falls into two categories. First, we developed an original and complex 3D production pipeline, from the original picture to the stereoscopic 3D model, with these basic steps:
1. Manual 3D modeling based on original pictures.
2. Assembling the complete wire model. Long Island contains over 100 separate 3D models.
3. Materialization of the wire construction using our innovated procedures (Section 6).
4. Generating the JPEG2000 stereoscopic 3-D model.
Secondly, many of these steps are innovated or improved. The main development line relies heavily on manual 3D design (especially in the phase of basic 3D modeling from original pictures), but improvements can be made in some phases. In this paper we present the improvements in the materialization phase. The complete C++ source code and application are available from the authors on request.

2. JPEG2000 FORMAT

JPEG2000 [3], [4] is a newer system of image coding. It was created by the Joint Photographic Experts Group as a supplement to the JPEG format. Its architecture is useful for many different applications, including distribution of Internet images, security systems, digital photography and medical records. The main characteristics of the JPEG2000 standard are:
• lossless compression: it compresses pictures without loss, preserving picture quality while decreasing storage space,
• support for all resolutions, color depths, numbers of components and frame rates,
• higher subjective and objective image quality, especially at lower transmission speeds,
• ROI (region of interest) coding,
• low sensitivity to bit errors,
• low sensitivity to errors in the transmission channel,
• an image format that enables protection of intellectual property over the image.
Apart from the characteristics listed above, it should be mentioned that JPEG2000 provides a precision of 1-127 bit/rate.
The components may have variable precision or rate factors. Compression may be lossy or lossless. Restoration of the image may be progressive by quality or by resolution. JPEG2000 provides a visual improvement of about 20% compared to JPEG. It also offers better protection of the image for the same signal-to-noise ratio, exceeding the JPEG standard by almost a factor of two. Users may notice these advantages of JPEG2000.

This effect is achieved by using wavelet compression, which has many advantages over the discrete cosine transform (DCT) used by the JPEG format. The DCT-based scheme divides the image into blocks of 8x8 pixels. In this process, blocks are compressed individually, without taking the neighboring blocks into consideration. This is a weakness of JPEG compression: only the most important information of each block is retained, so only the most important parts of the image are transferred. As a result, certain parts of the picture are lost and image quality drops. Wavelet compression, in contrast, converts the image into a series of wavelets that can be stored more efficiently, and the transmission is done pixel by pixel. In this way, images with high and low contrast are transferred with the same quality.

To transfer an image, the transform is first applied to the original image data. The transform coefficients are then quantized and entropy coded. The decoder simply reverses the operations of the encoder. Unlike other coding schemes, JPEG 2000 can be both lossy and lossless, depending on the wavelet transform and the quantization method used. In JPEG 2000 compression the image is divided into non-overlapping rectangular blocks (tiles). These blocks are compressed independently, as if they were completely independent parts of the image.
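The tiling and the lossless wavelet step described above can be illustrated with a minimal Python sketch. It is not the JPEG2000 codec: the standard uses specific 5/3 and 9/7 wavelet filters, whereas this sketch swaps in a one-level integer Haar transform because it is the shortest transform that is exactly reversible. The function names `split_into_tiles`, `haar_forward` and `haar_inverse` are hypothetical and chosen only for illustration.

```python
def split_into_tiles(width, height, tile_size):
    """Return (x0, y0, x1, y1) rectangles of non-overlapping tiles covering the image."""
    tiles = []
    for y0 in range(0, height, tile_size):
        for x0 in range(0, width, tile_size):
            # Tiles on the right and bottom edges may be smaller than tile_size.
            tiles.append((x0, y0, min(x0 + tile_size, width), min(y0 + tile_size, height)))
    return tiles


def haar_forward(samples):
    """One level of an integer Haar transform on an even-length list.

    Assumption: Haar is used here instead of the JPEG2000 filters.
    low  = rounded-down average of each pair (low-resolution version)
    high = difference of each pair (detail needed for reconstruction)
    """
    pairs = list(zip(samples[0::2], samples[1::2]))
    low = [(a + b) // 2 for a, b in pairs]
    high = [a - b for a, b in pairs]
    return low, high


def haar_inverse(low, high):
    """Rebuild the original samples exactly from the low-pass and high-pass sets."""
    out = []
    for s, d in zip(low, high):
        a = s + (d + 1) // 2
        b = a - d
        out.extend([a, b])
    return out


if __name__ == "__main__":
    row = [10, 12, 8, 8, 5, 2, 2, 5]
    low, high = haar_forward(row)
    print(low, high)
    print(haar_inverse(low, high) == row)
```

The low-pass list alone gives a half-length approximation of the row; applying `haar_forward` again to that list corresponds to repeating the decomposition on the low-resolution band.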
All operations (wavelet transform, quantization and entropy coding) are performed independently for each block, as shown in Fig. 1.

Fig. 1 Compression blocks.

Each of these blocks is decomposed by the DWT into a number of parts (sub-bands). The one-dimensional filters are applied in both directions, and the procedure is then repeated on the low-resolution block. When the one-dimensional DWT is applied, the samples are divided into a set of low-pass and a set of high-pass samples. The low-pass samples represent a small low-resolution version of the original. The high-pass samples represent the residual detail of the original, which is needed for perfect reconstruction of the original set from the low-pass set.

The question that arises now is how to add an audio track to material recorded in the JPEG2000 format without losing quality. For this kind of conversion it is best to use the MXF format, which offers a very acceptable way of converting for such purposes. JPEG2000 codes only the picture, with no possibility of embedding audio. Conversion into MXF makes it possible to include four channels of uncompressed audio and the video frames with a precise time code, without loss in the compressed video. This kind of conversion can be performed in Sony Vegas Pro 11.0.

3. COMPRESSION IN SONY VEGAS PRO 11.0

Sony Vegas Pro 11.0 is a professional software solution for nonlinear video editing (editing performed on a computer). The basic purpose of the program is editing (selecting shots, cutting and assembling) and rendering of the material. This version of Sony Vegas is of particular interest because, apart from other functions, it makes it possible to render the material in some of the stereoscopic formats.
When it comes to stereoscopic video and computer graphics used in presentations, objects can be rendered and presented in a very high resolution. 3-D scenes generated in this way may contain depth information and can be rendered with increased quality. Apart from this, the program offers different viewing modes to suit different preferences and types of glasses. Here we show how a 3-D stereoscopic recording is set up and exported in Vegas Pro 11.0.

Rendering as a 3-D stereoscopic record begins once the import and editing of the video are finished. Before importing any material into the program, it is important to configure the video settings so that all the material is uniform in quality and resolution. This is especially important when working with material of various resolutions and formats. The settings can be reached in one of two ways: through the File/Properties dialog, or through the shortcut on the toolbar. When all the settings have been made, the material can be imported for processing and editing. After editing, the new record can be rendered as a stereoscopic video by following the path File/Render As. A dialog with many settings then appears, and the most important setting here is the format.

MXF is a container format for professional digital video and audio media defined by a set of SMPTE (Society of Motion Picture and Television Engineers) standards. Among other things, this container supports the JPEG2000 format. It offers a choice of formats for various systems (PAL, NTSC and HD) with stereoscopic presentation.

Fig. 2 The basic structure of MXF format.
The dialog automatically offers the resolution that least distorts the existing video, because the settings were made in the Properties dialog at the beginning. However, the parameters of this format can be changed if desired. The recommendation is to pick a format of the HD system, because HD provides a picture of higher quality and sharpness than an ordinary SD picture.

4. THE STRUCTURE OF MXF FORMAT

The purpose of the MXF format is to put multimedia data in a file. It enables package transfer and efficient exchange in real time. The MXF specification is also carefully devised to ensure efficient data storage on different media as well as reliable transport through communication links. As shown in Fig. 2, the MXF format consists of three parts: file header, file body and file footer.

The header provides information about the file as a whole. This part is designed to be as small as possible so that it can be easily isolated and sent to a microprocessor for analysis. The body consists of a wrapper that holds video, audio and program data, such as text. All these body parts are recorded separately within a single envelope, so that during playback every video frame is accompanied by its audio and data. This arrangement is also known as interleaved media files. The body can be based on several different types of material, including uncompressed audio and video. The footer closes the file. It can contain information that is not available at the time the header is written, and in some forms it can be omitted. An important role in the footer is played by metadata that clearly define the end of the file. As can be seen, great attention is paid to metadata.

Fig. 3 Data transfer through MXF format.
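The header/body/footer arrangement described above can be sketched as a small Python data structure. This is only an illustration of the idea of interleaving and of a footer that records what was unknown when the header was written. It does not reproduce the real MXF byte layout, and the names `ContainerFile`, `interleave` and `build_file` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ContainerFile:
    """Hypothetical three-part container: header, interleaved body, footer."""
    header: dict
    body: list = field(default_factory=list)
    footer: dict = field(default_factory=dict)


def interleave(video_frames, audio_blocks, data_items):
    """Store each video frame next to the audio and data that belong to it."""
    body = []
    for frame, audio, data in zip(video_frames, audio_blocks, data_items):
        body.extend([("video", frame), ("audio", audio), ("data", data)])
    return body


def build_file(title, video_frames, audio_blocks, data_items):
    """Write a small header first, then the body, then a footer with late information."""
    container = ContainerFile(header={"title": title})
    container.body = interleave(video_frames, audio_blocks, data_items)
    # The frame count is only known after the body has been written,
    # so in this sketch it is recorded in the footer.
    container.footer = {"frame_count": len(container.body) // 3}
    return container


if __name__ == "__main__":
    f = build_file("demo", ["v0", "v1"], ["a0", "a1"], ["t0", "t1"])
    print(f.header, f.body[:3], f.footer)
```

Keeping the header small and moving late information to the footer matches the text: a reader can inspect the header quickly, while a writer does not need to know the total length in advance.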
In general, metadata have many functions:
• describing the content used in the media industry (broadcasting, film and music),
• supporting application-based use (recording, creation and libraries),
• covering a wide range of business transactions, presenting information and labeling.
Metadata can be divided into three broad categories:
• structural metadata (the set of information that defines the core of the structure),
• descriptive metadata (the set of information that describes the content),
• dark metadata (metadata that are unknown until the moment of their processing, usually private metadata).

Television transmits ready-made streaming video and audio. This is logical, because a viewer expects a scene in real time with video and audio in parallel. Computers, on the other hand, exchange data by file transfer. The two therefore work separately, which is why it is important to integrate them. MXF is designed to work with both kinds of file simultaneously.

Stream files are mostly of an open type, without a predefined start and end. The transfer of data is usually synchronized, but there can also be an asynchronous transfer with specified maximum and minimum data rates.

Transfer files are used for transferable media. They rely on a packet-based reliable network in which the transfer of data segments is itself reliable. Data synchronization is performed, and the file formats are often structured to allow access to the core data.

Figure 3 shows data transfer between the stream and transfer files through the MXF format. During rendering, the two separate channels are split and recorded as two separate files in order to obtain better quality; when the video is played in a player, these two compatible files are merged into one.
this process is done to lessen the space needed because the image in jpeg2000 format is compressed and the audio signal is linear. apart from this, this process ensures protection against unauthorized use. when rendering is finished it is advised to present this kind of video recording in some digital cinema. 5. modeling of tesla’s long island laboratory the project computer simulation and 3-d modeling of the original patents of nikola tesla that was later continued through the project advanced techniques in modeling and simulation of the original patents of nikola tesla began in 2009 at the faculty of electronic engineering. our team was able to create software that is used in the presentation systems in the nikola tesla museum, and now visitors have the opportunity to see original animations of tesla’s inventions. the aim of the project is multiple. basically, it is a digital and detailed 3-d modeling of the original patents of nikola tesla, which are the part of the museum’s archives (figures 4-8.) [5], [6], [7], [8]. fig. 4 long island facility original photograph. 3-d stereoscopic modeling of the tesla’s long island 119 fig. 5 tesla’s tower original blueprint. using 3-d models, further objectives are rendering, animation, simulation and visualization in real time and space shaded by wire (3-d stereo) model (figures 6 and 7). characters and other models were also modeled and textured in autodesk maya, but exported in motion builder (mb). this program runs the character animation using forward-inverse kinematics. nature environment was made in udk with terrain editor and speed tree. we used original photographs from the museum of nikola tesla, and other materials available in this regard. 
Here are the key elements that were modeled in the first phase of the project:
• Laboratory: generator, transformers (power and high frequency), arrays of measuring devices on the shelves and low cupboards next to the main wall, large cabinets, small motors and generators, tables and chairs, Tesla's desk, the assistant's desk, Tesla's boat, other smaller components and devices (Figure 7).
• Boiler room: main steam generator, boiler, auxiliary generator, steam engine, regulators, gauges, details, furniture, stairs, railings, switches, levers, arrays of valves.
• Dynamo room (with storage for coal).
• Workshop: lathes, small engines and transmission belts, workbenches with clamps, chairs, tools, materials, boxes, reels of wire and other details in the workshop; in addition, complete galleries, stairs and railings.

Fig. 6 Tesla's Long Island laboratory original photograph (courtesy of the Nikola Tesla Museum).

Fig. 7 Tesla's Long Island laboratory workshop interior (model).

Fig. 8 Tesla's laboratory exterior (model).

Until the end of 2013, we worked on the following sub-project activities: refinement of the models of Long Island (railways, carriages, turbines in the boiler room). Dynamics is already incorporated in the 3-D model of Long Island (Figure 8). We also implemented some standard simulation techniques to run the models and engines [7], [8].

6. DETAILS OF MODELING SEQUENCE

Development work on a model as complex as Long Island requires a large team of collaborators and a precisely defined development procedure. First of all, we need to possess good-quality original images, which are the basis of modeling. Cooperation with the Nikola Tesla Museum was indispensable, bearing in mind that the museum keeps Tesla's original legacy. The basic material from which we developed the models includes six original photographs of the interior of the building on Long Island, photographs of the transmitting tower, and the original plans and drawings of the tower. In the next phase, we analyzed the structure of the building and the tower and completed the modeling of these major facilities. After that, we gradually developed models of the devices inside the building on the basis of the existing photographs. At this stage, permanent consultations among engineers and designers were necessary. The wire structure of the 3D models is essential: it allows objects to be arranged within the building and their proper dimensions and relationships to be checked (Figures 9a-9c). This part of the procedure is very complex and time-consuming. Each detail of the 3D model is manually defined [8].

Fig. 9a Tesla's Long Island wire model.

Fig. 9b Tesla's Long Island wire model (structure).

Fig. 9c Tesla's laboratory wire model (main room).

The next phase is materialization of the objects, using expert knowledge and research related to the objects in the model. This job is mostly manual, but in the course of the project we focused our research on semi-automatic systems for materialization that use predefined material structures from our original database. Our application has an option to replace the texture. After the segmentation and region merging are performed, the material can be changed in selected segments. The basic structure is a database of textures. In the initial stage the database has to be populated; textures can later be added as needed. Textures can be added manually by choosing from the menu, or automatically based on the color histogram [9], [10]. All functions working with textures are contained in the class AddEditTextureForm.
The histogram determination function (the application is written entirely in C++) calculates and displays the color histogram of a given texture. This function simply counts the pixel values in a given (x, y) rectangle, separately for each RGB channel. The red, green and blue channels are processed with three identical procedures, one per channel. The pixel values are in the range [0, 255]. The histogram reveals the color structure that prevails in the texture. Based on this, a texture can later be chosen for a specific region. In the preview window, it is possible to set the type and name of the texture. The application offers several types of basic materials. There is no practical limit on the number of textures in the database. Different sources can be used for textures, including scanned high-resolution photographs. Some of the predefined basic textures in common use are: 'wood', 'stone', 'sand', 'water', 'clouds', 'metal', 'brick'. After the attributes are selected and set, textures are handled by a group of texture management functions.

When the texture is defined in the database, we continue with the process of changing textures. It is important to note that when a texture is selected, a blend factor must be selected as well. This factor determines how the texture is impressed on, or integrated with, the region. Since we want to preserve the original image with its original lighting and shadows while keeping the texture, the blend factor determines how much of the shadows and light from the original image is applied to the texture. The main idea implemented in this phase is the use of two separate image sources: one based on the original image, and the other based on a synthetic (automatically generated) improved texture image. The fine tuning between these sources is done manually.
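The per-channel histogram and the blend factor described above can be sketched in a few lines. The authors' application is written in C++ and its code is not shown in the paper, so the following Python sketch is only an illustration. The function names are hypothetical, and the blending rule (scaling the texture by the brightness of the original pixel) is an assumption, since the paper does not give the formula.

```python
def channel_histograms(pixels, x0, y0, x1, y1):
    """Count red, green and blue values (0..255) inside the rectangle [x0, x1) x [y0, y1).

    pixels is a list of rows; each row is a list of (r, g, b) tuples.
    """
    hist = {"r": [0] * 256, "g": [0] * 256, "b": [0] * 256}
    for y in range(y0, y1):
        for x in range(x0, x1):
            r, g, b = pixels[y][x]
            hist["r"][r] += 1
            hist["g"][g] += 1
            hist["b"][b] += 1
    return hist


def blend_pixel(original, texture, blend):
    """Apply the light and shadow of the original pixel to the texture pixel.

    Assumed rule: blend = 0 keeps the pure texture, blend = 1 scales the
    texture fully by the brightness of the original pixel.
    """
    shade = sum(original) / (3 * 255)          # brightness in 0.0 .. 1.0
    factor = (1.0 - blend) + blend * shade
    return tuple(int(round(channel * factor)) for channel in texture)


if __name__ == "__main__":
    image = [[(255, 0, 0), (255, 0, 0)],
             [(0, 0, 255), (10, 20, 30)]]
    hist = channel_histograms(image, 0, 0, 2, 2)
    print(hist["r"][255], hist["b"][255])
    print(blend_pixel((0, 0, 0), (200, 100, 50), 0.5))
```

With this rule, a region in deep shadow darkens the new texture in proportion to the blend factor, which is one simple way to keep the original lighting visible after the material is replaced.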
Finally, the complete materialization is done in a semi-automatic manner, as a combination of manual and automatic materialization. Info data can be attached to a defined region, and this info card is stored with each region. The card shows a histogram of the distribution of the blue, red and green colors and information about the region, such as the number of pixels belonging to the region and its size. The size of a region is defined by the smallest rectangle that can be drawn around the region, expressed as the height and width of the rectangle in pixels. The card also holds a description of the region and of the material assigned to it. Changes that have been entered are stored with the region and will be visible the next time the info card of the region is opened. In this way, the system remembers the process and details at this stage of materialization and applies them in future processing. We can say that our approach has elements of automatic (machine) learning.

Fig. 10 Rendering of the main room after materialization.

After the materialization phase (Figure 10), we proceed to set up the lighting. Virtual light sources are placed manually. The attributes of the light sources, diffuse color and structure are determined experimentally. This part of the project requires intensive use of computers, keeping in mind that each iteration must be verified by rendering the scene again. For instance, there are 3 virtual light sources inside the main room (Figures 11a and 11b).

Fig. 11a Light application in main room model.

After the light application in monochromatic mode, full-scale rendering with the lights on is performed.

Fig. 11b Rendering lights in main room.

The final rendering is then done on the complete model of Long Island. This job takes several hours on the cluster that we use (about 30 physical Intel i5 or i7 cores). The final phase is passing through the scene.
In doing so, we can turn on stereo rendering and generate a stereo view of Long Island in the formats (JPEG2000) described in the first sections of this paper. This part can be done only by using modern NVIDIA GTX video cards, which perform the rendering in hardware with a large number of parallel processors.

Fig. 12 Complete functional 3D model in normal or stereo view.

Observation of the stereo display of the 3D models of Long Island, and a dynamic walk through the model using ordinary anaglyph or special stereo glasses, is the final phase of the project (Figure 12).

7. CONCLUSION

This paper presents details of the stereo conversion techniques from Tesla's Long Island 3-D model and animation to modern stereo 3-D formats. We presented the detailed software infrastructure, from the original construction to the real-time stereo model. There are many steps in this procedure, and the main ones have been described. The main pipeline consists of the following steps:
1. Manual 3D modeling of the basic model directly from the original pictures (only six old photographs for the Long Island model), model by model.
2. Assembling and tuning of the complete wire model.
3. Materialization of the wire construction using original semi-automatic procedures (described in Section 6).
4. Finalizing the model and generating the JPEG2000 stereoscopic 3-D model using the capabilities of NVIDIA GPUs and real-time 3D software.
Finally, our Long Island model is one of the most complex and detailed models made to date. To our knowledge, the real-time application is the first of its kind. This work is a part of our work on the interdisciplinary project III44006, realized at the Faculty of Electronic Engineering in cooperation with the Ministry of Science and Technology and the Nikola Tesla Museum in Belgrade [6], [8].
we believe that the realization of the project is very important for both institutions, especially for the nikola tesla museum in belgrade, which has the special status of an institution of national importance, so the results of the project are highly important. the project had a number of promotions, seminars, and media coverage. to crown the success, there was a joint presentation with the nikola tesla museum at the world exhibition in china in 2010 at the central stand of the republic of serbia. a 3-d movie made especially for this occasion was seen by some 200,000 visitors.

acknowledgement: this work is supported by the serbian ministry of education and science (project iii44006-10).

references

[1] m. m. woolfson, introduction to computer simulation, oup oxford.
[2] a. hartmann, practical guide to computer simulations, world scientific publishing; pap/cdr edition (21 jun 2009).
[3] m. w. marcellin, m. j. gormish, a. bilgin, m. p. boliek, "an overview of jpeg-2000", in proceedings of the ieee data compression conference, dcc 2000, 28-30 mar 2000.
[4] d. s. taubman, m. w. marcellin, "jpeg2000: standard for interactive imaging", proceedings of the ieee, vol. 90, issue 8, pp. 1336-1357.
[5] v. vučković, n. stojanović, "the complete 3-d modeling and real-time simulation of the tesla's boat", international journal of emerging sciences ijes, vol. 1, no. 4, pp. 535-544, december 2011.
[6] v. vučković, "3-d modeling and simulation of the tesla's wireless controlled boat" (invited lecture and paper), in proceedings of the 7th international symposium nikola tesla, belgrade, pp. 37-44, november 23, 2011.
[7] v. vučković, n. stojanović, "mathematical 3d modeling and real-time simulation of the tesla's wireless controlled vehicle boat", the scientific journal facta universitatis, series electronics and energetics, vol. 24, no. 2, niš, pp. 257-270, august 2011.
[8] v. vučković, "virtual models of tesla patents", in proceedings of the xvii telecommunications forum telfor 2009, cd rom proceedings, belgrade, 24-26 november 2009, pp. 1335-1338.
[9] z. zhang, "determining the epipolar geometry and its uncertainty: a review", international journal of computer vision, vol. 27, no. 2, 1998, pp. 161-198.
[10] bo peng, lei zhang, automatic image segmentation by dynamic region merging, department of computing, the hong kong polytechnic university, hong kong.
https://www.cs.auckland.ac.nz/courses/compsci773s1c/resources/ijcv-review.pdf

facta universitatis
series: electronics and energetics vol. 29, no 4, december 2016, pp. 701-720
doi: 10.2298/fuee1604701a

anas n. al-rabadi1,2

received november 28, 2015; received in revised form april 13, 2016
corresponding author: anas n. al-rabadi
electrical engineering department, philadelphia university, jordan & computer engineering department, the university of jordan, amman-jordan (email: alrabadi@yahoo.com)

facta universitatis
series: electronics and energetics vol. 28, no 4, december 2015, pp. 507-525
doi: 10.2298/fuee1504507s

horizontal current bipolar transistor (hcbt) – a low-cost, high-performance flexible bicmos technology for rf communication applications

tomislav suligoj1, marko koričić1, josip žilak1, hidenori mochizuki2, so-ichi morita2, katsumi shinomura2, hisaya imai2

1university of zagreb, faculty of electrical engineering and computing, department of electronics, micro- and nano-electronics laboratory, croatia
2asahi kasei microdevices co. 5-4960, nobeoka, miyazaki, 882-0031, japan

abstract. this overview of horizontal current bipolar transistor (hcbt) technology describes state-of-the-art integrated silicon bipolar transistors that exhibit $f_T$ and $f_{max}$ of 51 GHz and 61 GHz and an $f_T \cdot BV_{CEO}$ product of 173 GHz·V, which places them among the highest-performance implanted-base silicon bipolar transistors.
hcbt is integrated with cmos in a considerably lower-cost fabrication sequence than standard vertical-current bipolar transistors, with only 2 or 3 additional masks and fewer process steps. due to its specific structure, the charge sharing effect can be employed to increase $BV_{CEO}$ without sacrificing $f_T$ and $f_{max}$. moreover, the electric field can be engineered just by manipulating the lithography masks, which yields high-voltage hcbts with breakdown voltages up to 36 V integrated in the same process flow as the high-speed devices, i.e. at zero additional cost. a double-balanced active mixer circuit is designed and fabricated in hcbt technology. a maximum iip3 of 17.7 dBm at a mixer current of 9.2 mA and a conversion gain of -5 dB are achieved.

key words: bicmos technology, bipolar transistors, horizontal current bipolar transistor, radio frequency integrated circuits, mixer, high-voltage bipolar transistors.

1. introduction

in the highly competitive wireless communication markets, rf circuits and systems are fabricated in technologies that are very cost-sensitive. in order to minimize the fabrication costs, sub-10 GHz applications can be processed by using high-volume silicon technologies. it has been identified that the optimum solution might

received march 9, 2015
corresponding author: tomislav suligoj
university of zagreb, faculty of electrical engineering and computing, department of electronics, micro- and nano-electronics laboratory, croatia (e-mail: tom@zemris.fer.hr)

facta universitatis, series: electronics and energetics vol. 22, no. 4, december 2016, 100-118 udk: doi:

multi-valued galois shannon davio trees and their complexity

anas n. al-rabadi

abstract: the idea of shannon-davio (s/d) trees for binary logic is a general concept that has found applications in sum-of-product (sop) minimization and in the generation of new diagrams and canonical forms.
extended s/d trees are used to generate forms that include minimum galois field sum-of-products (gfsop) forms. since galois fields of quaternary radix have many applications, and gf(4) in particular is considered an important extension of gf(2), the extension of the s/d trees to gf(4) is presented here. a general formula to calculate the number of inclusive forms (ifs) per variable order for an arbitrary galois field radix and an arbitrary number of variables is derived. a new fast method to count the number of ifs for an arbitrary galois radix and functions of two variables is also introduced: the $IF_{n,2}$ triangles. the results introduced in this work can be useful for the creation of an efficient gfsop minimizer for galois logic and in other applications such as reversible logic synthesis.

keywords: complexity, galois field sum-of-product (gfsop), galois forms, inclusive forms, multi-valued logic, quaternary logic, shannon-davio (s/d) trees.

1 introduction

spectral transforms play an important role in the synthesis, analysis, testing, classification, formal verification and simulation of logic circuits and systems. dyadic families of discrete transforms (reed-muller and green-sasao hierarchy, walsh, arithmetic, adding and haar transforms) and their generalizations to p-adic (multi-valued) transforms have found fruitful use in digital system design [1, 2, 6-35].
reed-muller-like spectral transforms [2-6, 12-14, 16-18, 21, 25, 27, 29, 33, 35] have found a variety of useful applications in minimizing exclusive sum-of-products (esop) and galois field sop (gfsop) expressions, creation of new forms, binary decision diagrams, spectral decision diagrams, and regular structures, besides their well-known uses in digital communications, digital signal processing, digital image processing and fault detection and testing [1-7, 12, 13, 15-19, 21-25, 27, 29, 32, 35]. the method of generating the new families of multi-valued shannon and davio spectral transforms is based on the fundamental multi-valued shannon and davio expansions, respectively.

the remainder of this paper is organized as follows: basic definitions of the fundamental binary expansions and their multi-valued extensions are given in section 2. section 3 presents the quaternary galois shannon-davio (s/d) trees. the number of s/d inclusive forms and the new $IF_{n,2}$ triangles are introduced in section 4. conclusions and future work are presented in section 5.

manuscript received july 15, 2015; revised
electrical engineering department, philadelphia university, jordan & computer engineering department, the university of jordan, amman-jordan. e-mail: alrabadi@yahoo.com
this research was performed during sabbatical leave in 2015-2016 granted by the university of jordan and spent at philadelphia university.

2 basic shannon and davio decompositions

this section presents the necessary mathematical background and the fundamental formalisms of the work that will be introduced and further developed in the following sections. normal canonical forms play an important role in the design of logic circuits, which includes synthesis, testing and optimization [2, 8, 13, 15, 17, 18, 21, 23, 25, 27, 29, 32, 35-37]. the main algebraic structure used in this work for developing the canonical normal forms is the galois field (gf), which is a fundamental algebraic structure in the theory of algebras [2, 8, 17, 18, 21, 32]. galois fields have proven highly efficient in various applications of logic synthesis, such as design for test and error correction codes, and even in the proof of the well-known fermat's last theorem.
the importance of gf for logic synthesis results from the fact that every finite field is isomorphic to a galois field. in general, the attractive properties of gf-based circuits, such as their high testability, are due to the fact that the gf operators exhibit the cyclic group property, also called the latin square property. this can be explained, for example, using the gf(4) (quaternary) operators shown in figures 1(a) and 1(b): in any row and column of the addition table in figure 1(a) the elements are all different, and they appear in a different order in each row and column. another cyclic group can be observed in the multiplication table: if the zero elements are removed from the multiplication table in figure 1(b), then the remaining elements form a cyclic group. in binary, for example, the gf(2) addition operator exor has the cyclic group property.

\[
\text{(a)}\quad
\begin{array}{c|cccc}
+ & 0 & 1 & 2 & 3 \\ \hline
0 & 0 & 1 & 2 & 3 \\
1 & 1 & 0 & 3 & 2 \\
2 & 2 & 3 & 0 & 1 \\
3 & 3 & 2 & 1 & 0
\end{array}
\qquad
\text{(b)}\quad
\begin{array}{c|cccc}
\ast & 0 & 1 & 2 & 3 \\ \hline
0 & 0 & 0 & 0 & 0 \\
1 & 0 & 1 & 2 & 3 \\
2 & 0 & 2 & 3 & 1 \\
3 & 0 & 3 & 1 & 2
\end{array}
\]

fig. 1: gf(4) addition and multiplication tables.

reed-muller based normal forms have been classified using the green-sasao hierarchy. the green-sasao hierarchy of families of canonical forms and corresponding decision diagrams is based on three generic expansions: the shannon, positive davio and negative davio expansions.
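returning to the gf(4) tables of figure 1, the latin square property can be checked mechanically. the following python sketch copies the two tables and tests that every row and column of the addition table, and of the multiplication table with its zero row and column removed, is a permutation of the same symbols. it also lists the successive powers of the element 2. the helper name `is_latin_square` and the variable names are illustrative assumptions, not part of the paper.

```python
# gf(4) operation tables copied from figure 1 (elements are labelled 0..3)
GF4_ADD = [
    [0, 1, 2, 3],
    [1, 0, 3, 2],
    [2, 3, 0, 1],
    [3, 2, 1, 0],
]
GF4_MUL = [
    [0, 0, 0, 0],
    [0, 1, 2, 3],
    [0, 2, 3, 1],
    [0, 3, 1, 2],
]


def is_latin_square(table):
    """True if every row and every column is a permutation of the same symbols."""
    symbols = set(table[0])
    rows_ok = all(len(row) == len(symbols) and set(row) == symbols for row in table)
    cols_ok = all(set(col) == symbols for col in zip(*table))
    return rows_ok and cols_ok


# multiplication table with the zero row and the zero column removed
nonzero_mul = [row[1:] for row in GF4_MUL[1:]]

# successive powers of the element 2: 2, 2*2, 2*2*2
powers_of_2 = [2]
for _ in range(2):
    powers_of_2.append(GF4_MUL[powers_of_2[-1]][2])

print(is_latin_square(GF4_ADD))      # True
print(is_latin_square(nonzero_mul))  # True
print(powers_of_2)                   # [2, 3, 1]
```

the last line shows that the powers of 2 run through all three non-zero elements, which is the cyclic behaviour of the multiplication table described above.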
the corresponding shannon, positive davio and negative davio expansions are given as follows [2, 32]:

\begin{align}
f(x_1,x_2,\ldots,x_n) &= \bar{x}_1 \cdot f_0(x_1,x_2,\ldots,x_n) \oplus x_1 \cdot f_1(x_1,x_2,\ldots,x_n)
= \begin{bmatrix} \bar{x}_1 & x_1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} f_0 \\ f_1 \end{bmatrix}, \tag{1} \\
f(x_1,x_2,\ldots,x_n) &= 1 \cdot f_0(x_1,x_2,\ldots,x_n) \oplus x_1 \cdot f_2(x_1,x_2,\ldots,x_n)
= \begin{bmatrix} 1 & x_1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}
\begin{bmatrix} f_0 \\ f_1 \end{bmatrix}, \tag{2} \\
f(x_1,x_2,\ldots,x_n) &= 1 \cdot f_1(x_1,x_2,\ldots,x_n) \oplus \bar{x}_1 \cdot f_2(x_1,x_2,\ldots,x_n)
= \begin{bmatrix} 1 & \bar{x}_1 \end{bmatrix}
\begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix}
\begin{bmatrix} f_0 \\ f_1 \end{bmatrix}, \tag{3}
\end{align}

where $f_0(x_1,x_2,\ldots,x_n) = f(0,x_2,\ldots,x_n) = f_0$ is the negative cofactor of variable $x_1$, $f_1(x_1,x_2,\ldots,x_n) = f(1,x_2,\ldots,x_n) = f_1$ is the positive cofactor of variable $x_1$, and $f_2(x_1,x_2,\ldots,x_n) = f(0,x_2,\ldots,x_n) \oplus f(1,x_2,\ldots,x_n) = f_0 \oplus f_1$.

an arbitrary n-variable function $f(x_1,x_2,\ldots,x_n)$ can be represented using the positive polarity reed-muller (pprm) expansion as follows:

\[
f(x_1,x_2,\ldots,x_n) = a_0 \oplus a_1x_1 \oplus a_2x_2 \oplus \cdots \oplus a_nx_n \oplus a_{12}x_1x_2 \oplus a_{13}x_1x_3 \oplus \cdots \oplus a_{n-1,n}x_{n-1}x_n \oplus \cdots \oplus a_{12\ldots n}x_1x_2\cdots x_n. \tag{4}
\]

for each function $f$, the coefficients $a_i$ in equation (4) are determined uniquely, so pprm is a canonical form. if we use either only the positive literal or only the negative literal for each variable in equation (4), we obtain the fixed polarity reed-muller (fprm) form. there are $2^n$ possible combinations of polarities and as many fprms for any given logic function [2, 32]. if we freely choose the polarity of each literal in equation (4), we obtain the generalized reed-muller (grm) form. in grms, contrary to fprms, the same variable can appear in both positive and negative polarities. there are $n2^{(n-1)}$ literals in equation (4), so there are $2^{n2^{(n-1)}}$ polarities for an n-variable function and as many grms [32]. each of the polarities determines a unique set of coefficients, and thus each grm is a canonical representation of a function. two other types of expansions result from the flattening of certain binary trees that produce kronecker (kro) forms and pseudo kronecker (pkro) forms for the shannon, positive davio and negative davio expansions.
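returning to the binary expansions of equations (1)-(4), the following python sketch checks equations (1)-(3) exhaustively for one sample function and computes the pprm coefficients of equation (4) with the usual gf(2) butterfly transform. the three-input majority function, the helper names and the truth-vector ordering (first variable as the most significant index bit) are assumptions chosen for illustration only.

```python
from itertools import product


def check_expansions(f, n):
    """Check equations (1)-(3) for variable x1 of an n-input boolean function f."""
    for x1, *rest in product((0, 1), repeat=n):
        f0 = f(0, *rest)              # negative cofactor
        f1 = f(1, *rest)              # positive cofactor
        f2 = f0 ^ f1                  # boolean difference f0 xor f1
        not_x1 = 1 - x1
        shannon = (not_x1 & f0) ^ (x1 & f1)
        positive_davio = f0 ^ (x1 & f2)
        negative_davio = f1 ^ (not_x1 & f2)
        if not (f(x1, *rest) == shannon == positive_davio == negative_davio):
            return False
    return True


def pprm_coefficients(truth):
    """Coefficients of equation (4) from a truth vector of length 2**n."""
    coeffs = list(truth)
    n = len(coeffs).bit_length() - 1
    for bit in range(n):
        step = 2 ** bit
        for i in range(len(coeffs)):
            if i & step:
                # entries with this bit set absorb the entry with the bit cleared
                coeffs[i] ^= coeffs[i ^ step]
    return coeffs


# sample function (an assumption, not taken from the paper)
def majority(a, b, c):
    return (a & b) | (a & c) | (b & c)


truth = [majority(a, b, c) for a, b, c in product((0, 1), repeat=3)]
coeffs = pprm_coefficients(truth)

print(check_expansions(majority, 3))  # True
print(coeffs)                         # [0, 0, 0, 1, 0, 1, 1, 0]
```

the non-zero coefficients sit at indices 3, 5 and 6 (binary 011, 101, 110), so the pprm of the sample function is bc ⊕ ac ⊕ ab.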
there are $3^n$ and at most $3^{2^n-1}$ different kros and pkros, respectively [32]. a good selection among the various permutations of shannon and davio expansions used as internal nodes in decision trees (dts) and decision diagrams (dds) results in dts and dds that represent the corresponding logic functions with smaller sizes, in terms of both the total number of hierarchical levels used and the total number of internal nodes needed. minimizing the size of the dd that represents a logic function speeds up the manipulation of logic functions that use the dd as a data structure and reduces the memory space used during such manipulations. one can observe that by going from pprm to grm forms, fewer constraints are imposed on the canonical forms, due to the enlarged set of polarities that one can choose from. the gain of more freedom (fewer constraints) on the polarity of the canonical expansions provides the advantage of obtaining exclusive-sum-of-product (esop) expressions with fewer terms and literals; consequently, expressing boolean functions in esop form produces, on average, smaller expressions than sum-of-product (sop) expressions.

in general, a literal can be defined as any function of a single variable [2, 18, 32]. basis functions in the general case of multi-valued expansions are constructed using literals. galois field sop expansions can be performed on a variety of literals. for example, one can use, among others: the k-reduced post literal (k-rpl) to produce k-rpl gfsop, the post literal to produce pl gfsop, the window literal to produce wl gfsop, the generalized (post) literal to produce gl gfsop, or the universal literal to produce ul gfsop.
figure 2 demonstrates the set-theoretic relationships between the various literals, where the shaded reduced post literal is the type of literal that will be used throughout this paper. one may note that the rpl in the discrete domain is analogous to the delta function in the continuous domain.

fig. 2: inclusion relationship of various types of literals (labels in the figure: reduced post literal, post literal, window literal, generalized (post) literal, universal literal).

example 1. figure 3 demonstrates several literal types, where one proceeds from the simplest rpl literal in figure 3(a) to the more complex wl literal in figure 3(c). for the rpl in figure 3(a), a value k is produced by the literal when the value of the variable is equal to a specific state; in this particular example, a value k = 1 is generated by the 1-rpl when the value of variable x is equal to a certain state (here this state is equal to one). figure 3(b) shows the pl, where the value generated by the literal at a specific state is equal to the maximum value (i.e., radix) of that logic, and the wl in figure 3(c) generates a value equal to the radix for a "window" of specific states.

fig. 3: an example of different types of literals over an arbitrary five-radix logic: (a) 1-reduced post literal (1-rpl), (b) post literal (pl), and (c) window literal (wl).

since k-rpl gfsop is as simple as pl and is simpler to implement than the other kinds of literals, we will perform all of the gfsop expansions utilizing the corresponding 1-reduced post literal gfsop. consequently, let us define the 1-rpl as [2, 32]:

\[
{}^{i}x = 1 \ \text{iff}\ x = i, \ \text{else}\ {}^{i}x = 0. \tag{5}
\]

for example, $\{{}^{0}x, {}^{1}x, {}^{2}x\}$ are the zero, first and second polarities of the 1-reduced post literal, respectively. also, let us define the ternary shifts over the variable x, $\{x, x', x''\}$, as the zero, first and second shifts of the variable x respectively (i.e., $x = x + 0$, $x' = x + 1$ and $x'' = x + 2$, respectively), where x can take any value in the set {0,1,2}. we chose to represent the 1-reduced post literals in terms of shifts and powers, among others, because of the ease of implementing powers of shifted variables in hardware, such as in universal logic modules (ulms) [2].

analogously to the binary and ternary cases, the quaternary shannon expansion over gf(4) for a function of a single variable is:

\[
f = {}^{0}x\, f_0 + {}^{1}x\, f_1 + {}^{2}x\, f_2 + {}^{3}x\, f_3, \tag{6}
\]

where $f_0$ is the cofactor of f with respect to variable x of value 0, $f_1$ is the cofactor of f with respect to variable x of value 1, $f_2$ is the cofactor of f with respect to variable x of value 2, and $f_3$ is the cofactor of f with respect to variable x of value 3.

example 2. let $f(x_1,x_2) = x_1''\,x_2 + x_2'''\,x_1$. by using figure 1, the quaternary truth vector in the variable order $\{x_1,x_2\}$ is $F = [0,3,1,2,2,1,3,0,3,0,2,1,1,2,0,3]^T$. utilizing equation (6), one obtains the following gf(4) shannon expansion for f:

\begin{align*}
f ={}& 2 \cdot {}^{0}x_1\,{}^{1}x_2 + 3 \cdot {}^{0}x_1\,{}^{2}x_2 + {}^{0}x_1\,{}^{3}x_2 + 3 \cdot {}^{1}x_1\,{}^{0}x_2 + {}^{1}x_1\,{}^{1}x_2 + 2 \cdot {}^{1}x_1\,{}^{3}x_2 \\
&+ {}^{2}x_1\,{}^{0}x_2 + 3 \cdot {}^{2}x_1\,{}^{1}x_2 + 2 \cdot {}^{2}x_1\,{}^{2}x_2 + 2 \cdot {}^{3}x_1\,{}^{0}x_2 + {}^{3}x_1\,{}^{2}x_2 + 3 \cdot {}^{3}x_1\,{}^{3}x_2.
\end{align*}

using the axioms of gf(4) that are manifested in the operators shown in figure 1, the 1-rpl defined in equation (5) is related to the shifts of variables over gf(4) in terms of powers [2-5] as follows:
\begin{align}
{}^{0}x &= x^{3} + 1, \tag{7} \\
{}^{0}x &= x' + (x')^{2} + (x')^{3}, \tag{8} \\
{}^{0}x &= 3(x'') + 2(x'')^{2} + (x'')^{3}, \tag{9} \\
{}^{0}x &= 2(x''') + 3(x''')^{2} + (x''')^{3}, \tag{10} \\
{}^{1}x &= x + (x)^{2} + (x)^{3}, \tag{11} \\
{}^{1}x &= (x')^{3} + 1, \tag{12} \\
{}^{1}x &= 2(x'') + 3(x'')^{2} + (x'')^{3}, \tag{13} \\
{}^{1}x &= 3(x''') + 2(x''')^{2} + (x''')^{3}, \tag{14} \\
{}^{2}x &= 3(x) + 2(x)^{2} + (x)^{3}, \tag{15} \\
{}^{2}x &= 2(x') + 3(x')^{2} + (x')^{3}, \tag{16} \\
{}^{2}x &= (x'')^{3} + 1, \tag{17} \\
{}^{2}x &= x''' + (x''')^{2} + (x''')^{3}, \tag{18} \\
{}^{3}x &= 2(x) + 3(x)^{2} + (x)^{3}, \tag{19} \\
{}^{3}x &= 3(x') + 2(x')^{2} + (x')^{3}, \tag{20} \\
{}^{3}x &= x'' + (x'')^{2} + (x'')^{3}, \tag{21} \\
{}^{3}x &= (x''')^{3} + 1, \tag{22}
\end{align}

where $\{{}^{0}x, {}^{1}x, {}^{2}x, {}^{3}x\}$ are the zero, first, second and third polarities of the 1-rpl, respectively. also, $\{x, x', x'', x'''\}$ are the zero, first, second and third shifts (inversions) of the variable x respectively, and the variable x can take any value in the set {0,1,2,3}. analogously to the ternary case, we chose to represent the 1-rpl in terms of shifts and powers, among others, because of the ease of implementing powers of shifted variables in hardware. after substituting equations (7)-(22) into equation (6), and after rearranging and reducing terms according to the gf(4) operations in figure 1, one obtains:

\begin{align}
f &= 1 \cdot f_0 + x\,(f_1 + 3f_2 + 2f_3) + (x)^{2}(f_1 + 2f_2 + 3f_3) + (x)^{3}(f_0 + f_1 + f_2 + f_3), \tag{23} \\
f &= 1 \cdot f_1 + (x')(f_0 + 2f_2 + 3f_3) + (x')^{2}(f_0 + 3f_2 + 2f_3) + (x')^{3}(f_0 + f_1 + f_2 + f_3), \tag{24} \\
f &= 1 \cdot f_2 + (x'')(3f_0 + 2f_1 + f_3) + (x'')^{2}(2f_0 + 3f_1 + f_3) + (x'')^{3}(f_0 + f_1 + f_2 + f_3), \tag{25} \\
f &= 1 \cdot f_3 + (x''')(f_2 + 3f_1 + 2f_0) + (x''')^{2}(f_2 + 2f_1 + 3f_0) + (x''')^{3}(f_0 + f_1 + f_2 + f_3). \tag{26}
\end{align}

equations (6) and (23)-(26) are the 1-rpl quaternary shannon (s) and davio $\{d_0, d_1, d_2, d_3\}$ expansions for a single variable, respectively. these equations can be re-written in the following matrix-based convolution-like forms, respectively:
\begin{align}
f &= \begin{bmatrix} {}^{0}x & {}^{1}x & {}^{2}x & {}^{3}x \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} f_0 \\ f_1 \\ f_2 \\ f_3 \end{bmatrix}, \tag{27} \\
f &= \begin{bmatrix} 1 & x & x^{2} & x^{3} \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 3 & 2 \\ 0 & 1 & 2 & 3 \\ 1 & 1 & 1 & 1 \end{bmatrix}
\begin{bmatrix} f_0 \\ f_1 \\ f_2 \\ f_3 \end{bmatrix}, \tag{28} \\
f &= \begin{bmatrix} 1 & x' & (x')^{2} & (x')^{3} \end{bmatrix}
\begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 2 & 3 \\ 1 & 0 & 3 & 2 \\ 1 & 1 & 1 & 1 \end{bmatrix}
\begin{bmatrix} f_0 \\ f_1 \\ f_2 \\ f_3 \end{bmatrix}, \tag{29} \\
f &= \begin{bmatrix} 1 & x'' & (x'')^{2} & (x'')^{3} \end{bmatrix}
\begin{bmatrix} 0 & 0 & 1 & 0 \\ 3 & 2 & 0 & 1 \\ 2 & 3 & 0 & 1 \\ 1 & 1 & 1 & 1 \end{bmatrix}
\begin{bmatrix} f_0 \\ f_1 \\ f_2 \\ f_3 \end{bmatrix}, \tag{30} \\
f &= \begin{bmatrix} 1 & x''' & (x''')^{2} & (x''')^{3} \end{bmatrix}
\begin{bmatrix} 0 & 0 & 0 & 1 \\ 2 & 3 & 1 & 0 \\ 3 & 2 & 1 & 0 \\ 1 & 1 & 1 & 1 \end{bmatrix}
\begin{bmatrix} f_0 \\ f_1 \\ f_2 \\ f_3 \end{bmatrix}. \tag{31}
\end{align}

one can observe that equations (27)-(31) are expansions for a single variable. yet, these canonical expressions can be generated for an arbitrary number of variables n using the kronecker (tensor) product. this can be expressed formally in the following discrete convolution-like forms for the shannon (s) and davio ($d_0$, $d_1$, $d_2$ and $d_3$) expressions, respectively [2, 32]:

\begin{align}
f &= \bigotimes_{i=1}^{n} \begin{bmatrix} {}^{0}x_i & {}^{1}x_i & {}^{2}x_i & {}^{3}x_i \end{bmatrix} \bigotimes_{i=1}^{n} [S]\,[\vec{F}], \tag{32} \\
f &= \bigotimes_{i=1}^{n} \begin{bmatrix} 1 & x_i & x_i^{2} & x_i^{3} \end{bmatrix} \bigotimes_{i=1}^{n} [D_0]\,[\vec{F}], \tag{33} \\
f &= \bigotimes_{i=1}^{n} \begin{bmatrix} 1 & x_i' & (x_i')^{2} & (x_i')^{3} \end{bmatrix} \bigotimes_{i=1}^{n} [D_1]\,[\vec{F}], \tag{34} \\
f &= \bigotimes_{i=1}^{n} \begin{bmatrix} 1 & x_i'' & (x_i'')^{2} & (x_i'')^{3} \end{bmatrix} \bigotimes_{i=1}^{n} [D_2]\,[\vec{F}], \tag{35} \\
f &= \bigotimes_{i=1}^{n} \begin{bmatrix} 1 & x_i''' & (x_i''')^{2} & (x_i''')^{3} \end{bmatrix} \bigotimes_{i=1}^{n} [D_3]\,[\vec{F}]. \tag{36}
\end{align}

the following section utilizes the presented gf(4) spectral-based functional decompositions for the synthesis of decision trees. in this spectral interpretation of decision trees, different decision trees can be defined by using different decomposition forms, which are specified by the corresponding transform matrices and multi-valued literals.
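the single-variable davio matrices and their kronecker extension can be exercised numerically. the python sketch below is an illustration under stated assumptions, not the author's implementation: it takes the gf(4) tables of figure 1, reads the shifts as $x' = x + 1$, $x'' = x + 2$, $x''' = x + 3$ over gf(4) (the reading implied by equations (12), (17) and (22)), checks the four davio matrices against every possible cofactor vector, and then applies $[D_0] \otimes [D_0]$ to the function of example 2. all helper names are hypothetical.

```python
from itertools import product

ADD = [[0, 1, 2, 3], [1, 0, 3, 2], [2, 3, 0, 1], [3, 2, 1, 0]]  # figure 1(a)
MUL = [[0, 0, 0, 0], [0, 1, 2, 3], [0, 2, 3, 1], [0, 3, 1, 2]]  # figure 1(b)

# davio transform matrices; row r multiplies (shifted x)**r
DAVIO = {
    0: [[1, 0, 0, 0], [0, 1, 3, 2], [0, 1, 2, 3], [1, 1, 1, 1]],
    1: [[0, 1, 0, 0], [1, 0, 2, 3], [1, 0, 3, 2], [1, 1, 1, 1]],
    2: [[0, 0, 1, 0], [3, 2, 0, 1], [2, 3, 0, 1], [1, 1, 1, 1]],
    3: [[0, 0, 0, 1], [2, 3, 1, 0], [3, 2, 1, 0], [1, 1, 1, 1]],
}


def gf_sum(values):
    total = 0
    for v in values:
        total = ADD[total][v]
    return total


def power(x, e):
    result = 1
    for _ in range(e):
        result = MUL[result][x]
    return result


def mat_vec(matrix, vector):
    return [gf_sum(MUL[a][b] for a, b in zip(row, vector)) for row in matrix]


def kron(a, b):
    """Kronecker product of two matrices with gf(4) entries."""
    return [[MUL[x][y] for x in ra for y in rb] for ra in a for rb in b]


def davio_eval(shift, coeffs, x):
    y = ADD[x][shift]  # shifted variable x + shift
    return gf_sum(MUL[c][power(y, r)] for r, c in enumerate(coeffs))


# every cofactor vector is reproduced by every davio expansion
single_ok = all(
    davio_eval(s, mat_vec(DAVIO[s], list(F)), x) == F[x]
    for s in range(4)
    for F in product(range(4), repeat=4)
    for x in range(4)
)


def example2(x1, x2):
    # f = x1'' * x2 + x2''' * x1
    return ADD[MUL[ADD[x1][2]][x2]][MUL[ADD[x2][3]][x1]]


F_x2_fastest = [example2(x1, x2) for x1 in range(4) for x2 in range(4)]
F_x1_fastest = [example2(x1, x2) for x2 in range(4) for x1 in range(4)]
coeffs = mat_vec(kron(DAVIO[0], DAVIO[0]), F_x2_fastest)

print(single_ok)     # True
print(F_x1_fastest)  # [0, 3, 1, 2, 2, 1, 3, 0, 3, 0, 2, 1, 1, 2, 0, 3]
print(coeffs)        # [0, 2, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
```

under these assumptions the truth vector quoted in example 2 is reproduced when $x_1$ is taken as the fastest-varying index. the two non-zero spectral coefficients correspond to $f = 2x_2 + 3x_1$, which is what the example function reduces to in gf(4), since the two $x_1x_2$ terms cancel.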
the utilized decision trees can therefore be viewed as graphical representations of functional expressions, where different trees produce different functional expressions. by counting the number of possible different trees, which are derived by assigning different decomposition rules to their nodes, we can count the number of possible functional expressions.

3 quaternary shannon-davio (s/d) trees

the basic s, $d_0$, $d_1$, $d_2$ and $d_3$ quaternary expansions (i.e., flattened forms) introduced previously in equations (32)-(36) can be represented in quaternary dts (qudts) and the corresponding varieties of reduced quaternary dds (rqudds) according to the corresponding reduction rules. for one variable (i.e., one level of the dt), figures 4(a)-4(e) represent the expansion nodes for $\{s, d_0, d_1, d_2, d_3\}$, respectively, and the notation in figure 4(f) means that the generalized variable, written here as $\hat{x}$ to distinguish it from x, corresponds to the four possible shifts of the variable x as:

\[
\hat{x} \in \{x, x', x'', x'''\}, \ \text{over gf(4)}. \tag{37}
\]

fig. 4: quaternary decision nodes: (a) shannon in eq. (32), (b) davio0 in eq. (33), (c) davio1 in eq. (34), (d) davio2 in eq. (35), (e) davio3 in eq. (36), and (f) generalized quaternary davio defined in eq. (37).

utilizing the two nodes defined for quaternary shannon in figure 4(a) and quaternary generalized davio in figure 4(f), and analogously to the binary and ternary cases, one can obtain the quaternary shannon-davio (s/d) trees for two variables (cf. figure 5), where the general family called inclusive forms (ifs) is obtained as the flattened expressions generated by these s/d trees. for example, the corresponding s/d trees for ifs of two variables can be generated for variable order {a,b} and for variable order {b,a} as well.
the number of these s/d trees per variable order is $2^{(4+1)} = 32$, where the number of qifs per s/d tree will be derived later in section 4 in two different ways: the first method uses the general formula for an arbitrary number of variables over gf(4), and the second method uses the general formula for any radix. the number of all possible forms is important because it can be used as an upper-bound parameter in a search heuristic that searches for a minimum gfsop expression using the corresponding s/d trees. example 3 illustrates some of the quaternary s/d trees and some of the quaternary trees they produce. the numbers on top of the s/d trees in figures 5(a) and 6(a) are the total numbers of qifs (i.e., the total numbers of quaternary trees) that are generated.

example 3. utilizing the notation in equation (37), we obtain, for the s/d trees in figures 5(a) and 6(a), the corresponding trees in figures 5(b)-5(c) and figures 6(b)-6(c), respectively. from the quaternary s/d trees shown in figures 5 and 6, by taking any s/d tree, multiplying each of the two-level cofactors (which are in the qudt leaves) by the corresponding path in that qudt, and then summing all the resulting cubes (terms or products) over gf(4), one obtains the flattened if form for the function f as a certain gfsop expression (expansion). for each qudt in figures 5(a) and 6(a), there are as many if forms obtained for the function f as the number of all possible permutations of the polarities of the variables in the second-level branches of that qudt.
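the tree counts quoted above can be reproduced by brute-force enumeration. the python sketch below assigns a shannon (s) or generalized davio (d) node to each of the five node positions of a two-variable quaternary tree and multiplies the number of forms contributed by each davio node. the per-node count used in `forms_of_davio_node` is an assumption taken from the weights that appear in theorem 2 of section 4, and the function and variable names are hypothetical.

```python
from itertools import product

RADIX = 4    # gf(4)
LEVELS = 2   # two variables, hence two tree levels


def forms_of_davio_node(level):
    """Forms of one generalized davio node at a level (1 = root, LEVELS = bottom).

    Assumed from theorem 2: RADIX ** ((RADIX - 1) * RADIX ** (LEVELS - level)).
    """
    return RADIX ** ((RADIX - 1) * RADIX ** (LEVELS - level))


# one root node followed by RADIX second-level nodes
node_levels = [1] + [2] * RADIX


def forms_of_tree(assignment):
    """Inclusive forms generated by one s/d tree; shannon nodes contribute a factor 1."""
    total = 1
    for kind, level in zip(assignment, node_levels):
        if kind == "D":
            total *= forms_of_davio_node(level)
    return total


trees = list(product("SD", repeat=len(node_levels)))
counts = {tree: forms_of_tree(tree) for tree in trees}

print(len(trees))                          # 32
print(counts[("S", "S", "S", "D", "D")])   # 4096
print(counts[("S", "D", "D", "D", "S")])   # 262144
print(sum(counts.values()))                # 299483809210625
```

the second and third values match the numbers shown on top of the trees in figures 5(a) and 6(a) (a shannon root with two, respectively three, second-level davio nodes), and the last value is the total over all 32 trees of one variable order.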
4 count of the number of s/d inclusive forms over gf($p^k$) and the new $IF_{n,2}$ triangles

this section provides the counts of the numbers of inclusive forms, which are the flattened expressions generated by the corresponding s/d trees. these counts can be used as numerical parameters for upper bounds in search heuristics that search for minimum gfsop expressions.

theorem 1. for gf(3) and n variables, the total number of ternary ifs (tifs) per variable order is:

\[
\#\mathrm{TIFs} = \sum_{k_1=0}^{3^{n-1}} \sum_{k_2=0}^{3^{n-2}} \sum_{k_3=0}^{3^{n-3}} \cdots \sum_{k_n=0}^{3^{0}}
\left\{ \left[ \frac{\left(3^{n-1}\right)!}{\left(3^{n-1}-k_1\right)!\,k_1!}\,
\frac{\left(3^{n-2}\right)!}{\left(3^{n-2}-k_2\right)!\,k_2!}\,
\frac{\left(3^{n-3}\right)!}{\left(3^{n-3}-k_3\right)!\,k_3!} \cdots
\frac{\left(3^{0}\right)!}{\left(3^{0}-k_n\right)!\,k_n!} \right]
\left[ \left(3^{2(3)^{0}}\right)^{k_1} \left(3^{2(3)^{1}}\right)^{k_2} \left(3^{2(3)^{2}}\right)^{k_3} \cdots \left(3^{2(3)^{n-1}}\right)^{k_n} \right] \right\}. \tag{38}
\]

proof. the following is the derivation of equation (38) to calculate #tifs per variable order. the total number of nodes for any gf(3) tree with n levels (n variables) equals:

\[
\sum_{k=0}^{n-1} (3)^{k}. \tag{39}
\]

for any s-type node there is only one type of node, as each of the branches can take only a single value. yet, for a d-type node there are n possible types of nodes, where n is the number of variables, which is equal to the number of levels. the highest possible number of forms for the d-type node occurs when the d-type node is in the first (highest) level, and the lowest possible number of forms occurs when the d-type node is in the n-th (lowest) level. therefore, for a certain number m of s-type nodes, the following equation describes the number of d-type nodes for n variables:

\[
\#S = m \;\Rightarrow\; \#D = \left[ \sum_{k=0}^{n-1} (3)^{k} - m \right]. \tag{40}
\]

fig. 5: examples of s/d trees: (a) quaternary s/d tree for two variables of order {a,b} with three shannon nodes and two generalized davio nodes (number on top of the tree: N = 4,096), and (b)-(c) some of the quaternary trees that it generates.

fig. 6: examples of s/d trees: (a) quaternary s/d tree for two variables of order {b,a} with two shannon nodes and three generalized davio nodes (number on top of the tree: N = 262,144), and (b)-(c) some of the quaternary trees that it generates.

it can be shown that for gf(3) (i.e., a ternary decision tree (tdt)) and n levels (n variables), the general formulas that count the number of d-type nodes, and the number of all possible forms for the d-type node, in level k of the n-level tdt are:

\begin{align}
\#D_k &= (3)^{(k-1)}, \tag{41} \\
|D_k|_{\text{per node}} &= (3)^{2(3)^{(n-k)}}, \tag{42}
\end{align}

where $\#D_k$ is the number of d-type nodes in level k, and $|D_k|$ is the number of all possible forms for the d-type node in level k. let us define an s/d tree category to be the set of s/d trees that have in common the same number of s-type nodes and the same number of d-type nodes within the same variable order.
also, define:

\begin{align}
\Psi &\equiv \text{number of variable orders}, \tag{43} \\
\Omega &\equiv \text{number of s/d tree categories per variable order}, \tag{44} \\
\phi &\equiv \text{number of s/d trees per category}, \tag{45} \\
\Phi &\equiv \text{number of tifs per variable order}. \tag{46}
\end{align}

from equations (39)-(42), and using some elementary counting rules, we can derive by mathematical induction the following general formulas, with n being the number of variables:

\begin{align}
\Psi &= n!, \tag{47} \\
\Omega &= \sum_{k=0}^{n-1} (3)^{k} + 1, \tag{48} \\
\phi &= \frac{\left[\sum_{k=0}^{n-1} (3)^{k}\right]!}{\left[\sum_{k=0}^{n-1} (3)^{k} - k\right]!\,k!}, \quad \text{where } k = 0,1,2,3,\ldots,\sum_{k=0}^{n-1} (3)^{k}, \tag{49} \\
\Phi &= \sum_{k_1=0}^{3^{n-1}} \sum_{k_2=0}^{3^{n-2}} \sum_{k_3=0}^{3^{n-3}} \cdots \sum_{k_n=0}^{3^{0}}
\left\{ \left[ \frac{\left(3^{n-1}\right)!}{\left(3^{n-1}-k_1\right)!\,k_1!}\,
\frac{\left(3^{n-2}\right)!}{\left(3^{n-2}-k_2\right)!\,k_2!}\,
\frac{\left(3^{n-3}\right)!}{\left(3^{n-3}-k_3\right)!\,k_3!} \cdots
\frac{\left(3^{0}\right)!}{\left(3^{0}-k_n\right)!\,k_n!} \right]
\left[ \left(3^{2(3)^{0}}\right)^{k_1} \left(3^{2(3)^{1}}\right)^{k_2} \left(3^{2(3)^{2}}\right)^{k_3} \cdots \left(3^{2(3)^{(n-1)}}\right)^{k_n} \right] \right\}. \tag{50}
\end{align}

from equations (47)-(50), it can be noticed that the total number of tifs for all variable orders is equal to [n!][#tifs per order].

example 4. for a number of variables equal to two (n = 2), $\Phi$ reduces to:

\begin{align*}
\Phi &= \sum_{k_1=0}^{3^{1}} \sum_{k_2=0}^{1} \left\{ \frac{\left(3^{1}\right)!}{\left(3^{1}-k_1\right)!\,k_1!}\, \frac{\left(3^{0}\right)!}{\left(3^{0}-k_2\right)!\,k_2!} \left(3^{2(3)^{0}}\right)^{k_1} \left(3^{2(3)^{1}}\right)^{k_2} \right\} \\
&= \Phi|_{k_1=0,k_2=0} + \Phi|_{k_1=1,k_2=0} + \Phi|_{k_1=2,k_2=0} + \Phi|_{k_1=3,k_2=0} + \Phi|_{k_1=0,k_2=1} + \Phi|_{k_1=1,k_2=1} + \Phi|_{k_1=2,k_2=1} + \Phi|_{k_1=3,k_2=1} \\
&= \Phi_{00} + \Phi_{10} + \Phi_{20} + \Phi_{30} + \Phi_{01} + \Phi_{11} + \Phi_{21} + \Phi_{31} \\
&= 1 + 27 + 243 + 729 + 729 + 19683 + 177147 + 531441 = 730{,}000.
\end{align*}

utilizing the multi-valued map representation, there are $N^{\#\text{minterms}}$ different functions for N-valued input-output logic. therefore, for ternary logic, there are $3^{9} = 19{,}683$ different ternary functions of two variables, and 730,000 ternary inclusive forms generated by the s/d trees. thus, on average, every function of two variables can be realized in approximately 37 ways.
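the nested sum of equation (50) is easy to evaluate directly. the python sketch below does so with the radix left as a parameter, which is an assumption that anticipates the quaternary and general-radix formulas stated later in this section. it also evaluates the same count through the binomial theorem, since each inner sum has the form $\sum_k \binom{m}{k} w^k = (1+w)^m$. the function names are hypothetical.

```python
from itertools import product
from math import comb, prod


def count_inclusive_forms(radix, n):
    """Direct evaluation of the nested sum of equation (50) for a given radix."""
    # index j = 1..n: k_j runs over 0..radix**(n-j), and every unit of k_j
    # contributes the weight radix**((radix-1) * radix**(j-1))
    nodes = [radix ** (n - j) for j in range(1, n + 1)]
    weights = [radix ** ((radix - 1) * radix ** (j - 1)) for j in range(1, n + 1)]
    total = 0
    for ks in product(*(range(m + 1) for m in nodes)):
        total += prod(comb(m, k) * w ** k for m, k, w in zip(nodes, ks, weights))
    return total


def closed_form(radix, n):
    """Same count, using sum_k C(m, k) * w**k == (1 + w)**m for each index."""
    return prod(
        (1 + radix ** ((radix - 1) * radix ** (j - 1))) ** (radix ** (n - j))
        for j in range(1, n + 1)
    )


ternary = count_inclusive_forms(3, 2)
quaternary = count_inclusive_forms(4, 2)

print(ternary)                    # 730000
print(quaternary)                 # 299483809210625
print(round(ternary / 3 ** 9))    # 37
print(closed_form(3, 2) == ternary and closed_form(4, 2) == quaternary)  # True
```

for two ternary variables the closed form reads $(1+9)^3 (1+729) = 730{,}000$, which agrees with the term-by-term sum of example 4.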
theorem 2. for gf(4) and n variables, the total number of quaternary ifs (qifs) per variable order is:

\[
\#\mathrm{QIFs} = \Phi = \sum_{k_1=0}^{4^{n-1}} \sum_{k_2=0}^{4^{n-2}} \cdots \sum_{k_n=0}^{4^{0}}
\left\{ \frac{\left(4^{n-1}\right)!}{\left(4^{n-1}-k_1\right)!\,k_1!}\,
\frac{\left(4^{n-2}\right)!}{\left(4^{n-2}-k_2\right)!\,k_2!} \cdots
\frac{\left(4^{0}\right)!}{\left(4^{0}-k_n\right)!\,k_n!}
\left(4^{3(4)^{0}}\right)^{k_1} \left(4^{3(4)^{1}}\right)^{k_2} \left(4^{3(4)^{2}}\right)^{k_3} \cdots \left(4^{3(4)^{(n-1)}}\right)^{k_n} \right\}. \tag{51}
\]

proof. a general proof that includes gf(4) as a special case will be provided later in this section.

the extension of the concept of s/d trees to higher radices of galois fields (i.e., higher than four) is a systematic and direct process that follows the same method developed for the ternary and quaternary cases. the following example demonstrates the count of qifs using theorem 2.

example 5. for a number of variables equal to two (n = 2), equation (51) reduces to:

\begin{align*}
\Phi &= \sum_{k_1=0}^{4^{1}} \sum_{k_2=0}^{4^{0}} \left\{ \frac{\left(4^{1}\right)!}{\left(4^{1}-k_1\right)!\,k_1!}\, \frac{\left(4^{0}\right)!}{\left(4^{0}-k_2\right)!\,k_2!} \left(4^{3(4)^{0}}\right)^{k_1} \left(4^{3(4)^{1}}\right)^{k_2} \right\} \\
&= \Phi|_{k_1=0,k_2=0} + \Phi|_{k_1=1,k_2=0} + \Phi|_{k_1=2,k_2=0} + \Phi|_{k_1=3,k_2=0} + \Phi|_{k_1=4,k_2=0} \\
&\quad + \Phi|_{k_1=0,k_2=1} + \Phi|_{k_1=1,k_2=1} + \Phi|_{k_1=2,k_2=1} + \Phi|_{k_1=3,k_2=1} + \Phi|_{k_1=4,k_2=1} \\
&= \Phi_{00} + \Phi_{10} + \Phi_{20} + \Phi_{30} + \Phi_{40} + \Phi_{01} + \Phi_{11} + \Phi_{21} + \Phi_{31} + \Phi_{41} \\
&= 1 + 256 + 24{,}576 + 1{,}048{,}576 + 16{,}777{,}216 + 16{,}777{,}216 + 4{,}294{,}967{,}296 + 412{,}316{,}860{,}416 \\
&\quad + 1.75921860444 \times 10^{13} + 2.81474976711 \times 10^{14} \\
&= 2.99483809211 \times 10^{14}.
\end{align*}

utilizing the multi-valued map representation, we can easily prove that there are $4^{16} = 4{,}294{,}967{,}296$ quaternary functions of two variables, and $2.99483809211 \times 10^{14}$ quaternary inclusive forms generated by the s/d trees. thus, on average, every function of two variables can be synthesized (realized) in approximately 69,729 ways. this high number of realizations means that most functions of two variables are realized with fewer than five expansions, and all functions with at most five expansions.

4.1 general formula to compute the number of ifs for an arbitrary variable number and arbitrary galois radix gf($p^k$)

although the s/d trees and inclusive forms that were developed are for gf(4), the same concept can be directly and systematically extended to the case of galois fields of radix N and n variables. theorem 3 provides the total number of ifs per variable order for n variables (i.e., n decision tree levels) and radix N of any arbitrary algebraic field, including gf($p^k$), where p is a prime number and k is a natural number ≥ 1.
The generality of Theorem 3 comes from the fact that algebraic structures specify the type of operations (e.g., addition and multiplication) in the functional expansions but do not specify the counts, which are an intrinsic property of the tree structure and are independent of the algebraic operations performed. Thus, Theorem 3 is valid, among others, for Galois fields of arbitrary radix.

Theorem 3. The total number of inclusive forms for $n$ variables and $N$-radix Galois field logic is equal to:

$$\#\text{NIFs} = \Phi_{N,n} = \sum_{k_1=0}^{N^{n-1}} \sum_{k_2=0}^{N^{n-2}} \cdots \sum_{k_n=0}^{N^{0}} \left\{ \frac{\left(N^{n-1}\right)!}{\left(N^{n-1}-k_1\right)!\,k_1!}\, \frac{\left(N^{n-2}\right)!}{\left(N^{n-2}-k_2\right)!\,k_2!} \cdots \frac{\left(N^{0}\right)!}{\left(N^{0}-k_n\right)!\,k_n!} \left(N^{(N-1)(N)^{0}}\right)^{k_1} \left(N^{(N-1)(N)^{1}}\right)^{k_2} \left(N^{(N-1)(N)^{2}}\right)^{k_3} \cdots \left(N^{(N-1)(N)^{n-1}}\right)^{k_n} \right\}. \tag{52}$$

Proof. The following is the derivation of the general equation (52) to calculate the number of IFs per variable order. The total number of nodes for any GF($N$) tree with $n$ levels (i.e., $n$ variables) equals:

$$\sum_{k=0}^{n-1} N^{k}. \tag{53}$$

For any S-type (i.e., Shannon-type) node there is only one type of node, since each branch of a Shannon node can take only a single value. Yet, for a D-type (i.e., Davio-type) node there are $n$ possible types of nodes, where $n$ is the number of variables, which is equal to the number of levels. The highest possible number of forms for the D-type node occurs when the Davio node is in the first (highest) level, and the lowest possible number of forms occurs when the Davio node is in the $n$-th (lowest) level. Therefore, for a given number $m$ of S-type nodes, the following formula gives the number of D-type nodes for $n$ variables:

$$\#S = m \;\Rightarrow\; \#D = \sum_{k=0}^{n-1} N^{k} - m. \tag{54}$$

It can be shown that for GF($N$) ($N$-ary decision tree with $n$ levels, i.e., $n$ variables), the general formulas that count the number of D-type nodes, and the number of all possible forms for the D-type node in level $k$ (where $k$ is less than or equal to the total number of levels $n$) are, respectively:

$$\#D_k = N^{k-1}, \tag{55}$$
$$|D_k| = N^{(N-1)(N)^{(n-k)}}, \tag{56}$$

where $\#D_k$ is the number of D-type nodes in level $k$ and $|D_k|$ is the number of all possible forms (per node) for the D-type node in level $k$. Let us define the S/D tree category to be the S/D trees that have in common the same number of S-type nodes and the same number of D-type nodes within the same variable order. Let us define the following entities for radix-$N$ Galois field and $n$ variables (i.e., $n$ decision tree levels):

$$\Psi_{N,n} \equiv \text{number of variable orders}, \tag{57}$$
$$\Omega_{N,n} \equiv \text{number of S/D tree categories per variable order}, \tag{58}$$
$$\phi_{N,n} \equiv \text{number of S/D trees per category}, \tag{59}$$
$$\Phi_{N,n} \equiv \text{number of IFs per variable order}. \tag{60}$$

From the previous equations, and using elementary counting rules, one can derive by mathematical induction the following general formulas, with $n$ the number of variables and $N$ the field radix:

$$\Psi_{N,n} = n!, \tag{61}$$
$$\Omega_{N,n} = \sum_{k=0}^{n-1} N^{k} + 1, \tag{62}$$
$$\phi_{N,n} = \frac{\left[\sum_{k=0}^{n-1} N^{k}\right]!}{\left[\sum_{k=0}^{n-1} N^{k} - k\right]!\,k!}, \quad \text{where } k = 0, 1, 2, 3, \ldots, n-1, \tag{63}$$
$$\Phi_{N,n} = \sum_{k_1=0}^{N^{n-1}} \sum_{k_2=0}^{N^{n-2}} \cdots \sum_{k_n=0}^{N^{0}} \left\{ \frac{\left(N^{n-1}\right)!}{\left(N^{n-1}-k_1\right)!\,k_1!}\, \frac{\left(N^{n-2}\right)!}{\left(N^{n-2}-k_2\right)!\,k_2!} \cdots \frac{\left(N^{0}\right)!}{\left(N^{0}-k_n\right)!\,k_n!} \left(N^{(N-1)(N)^{0}}\right)^{k_1} \left(N^{(N-1)(N)^{1}}\right)^{k_2} \cdots \left(N^{(N-1)(N)^{(n-1)}}\right)^{k_n} \right\}. \tag{64}$$

One can note that the formula in equation (52), used to obtain the total number of inclusive forms for $n$ variables and Galois field radix $N$, is a very general formula that includes the ternary case in equation (38) and the quaternary case in equation (51) as special cases.
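The nested sums of equation (52) can be checked numerically for small cases. The following is a minimal sketch, not the author's implementation: the function name `count_inclusive_forms` and its parameter names are illustrative assumptions, and the code simply enumerates every index tuple $(k_1, \ldots, k_n)$, which is only practical for small $N$ and $n$.

```python
from itertools import product
from math import comb


def count_inclusive_forms(radix: int, num_vars: int) -> int:
    """Evaluate the nested sums of equation (52) term by term.

    `radix` plays the role of N and `num_vars` the role of n.
    Both names are illustrative, not taken from the paper.
    """
    # Index k_j (j = 1..n) runs from 0 to N^(n-j).
    upper_limits = [radix ** (num_vars - j) for j in range(1, num_vars + 1)]
    # Factor raised to the power k_j: N^((N-1) * N^(j-1)).
    form_counts = [radix ** ((radix - 1) * radix ** (j - 1))
                   for j in range(1, num_vars + 1)]

    total = 0
    for ks in product(*(range(u + 1) for u in upper_limits)):
        term = 1
        for k, upper, forms in zip(ks, upper_limits, form_counts):
            # Binomial coefficient times the per-node form count to the k-th power.
            term *= comb(upper, k) * forms ** k
        total += term
    return total


if __name__ == "__main__":
    print(count_inclusive_forms(3, 2))  # ternary, two variables (Example 4)
    print(count_inclusive_forms(4, 2))  # quaternary, two variables (Example 5)
```

Because Python integers have arbitrary precision, the quaternary count is obtained exactly rather than in the rounded scientific notation used in the examples.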
Numerical counting results obtained from equation (52) can be used in search heuristics as numerical bounds that could be incorporated into an efficient search of S/D trees in order to obtain minimal GFSOP forms for specific multi-valued logic functions. Since the search for minimal forms is already a difficult problem in two-valued logic (for example, using binary S/D trees), especially when the number of variables is large, the search for minimal GFSOP forms in multi-valued Galois logic will be very difficult. Thus, further numerical evaluations have to be conducted in order to estimate the usefulness of the numerical bounds obtained from equation (52) in such extended multi-valued search heuristics.

Example 6. The number of QIFs over GF(4) for two variables (i.e., $n = 2$ and $N = 4$) is:

$$\begin{aligned}
\Phi_{4,2} &= \sum_{k_1=0}^{4^{2-1}} \sum_{k_2=0}^{4^{2-2}} \left\{ \frac{\left(4^{2-1}\right)!}{\left(4^{2-1}-k_1\right)!\,k_1!}\, \frac{\left(4^{2-2}\right)!}{\left(4^{2-2}-k_2\right)!\,k_2!} \left(4^{(4-1)(4)^{0}}\right)^{k_1} \left(4^{(4-1)(4)^{1}}\right)^{k_2} \right\} \\
&= \Phi_{00}|_{4,2} + \Phi_{10}|_{4,2} + \Phi_{20}|_{4,2} + \Phi_{30}|_{4,2} + \Phi_{40}|_{4,2} + \Phi_{01}|_{4,2} + \Phi_{11}|_{4,2} + \Phi_{21}|_{4,2} + \Phi_{31}|_{4,2} + \Phi_{41}|_{4,2} \\
&= 1 + 256 + 24{,}576 + 1{,}048{,}576 + 16{,}777{,}216 + 16{,}777{,}216 + 4{,}294{,}967{,}296 + 412{,}316{,}860{,}416 + 1.75921860444 \times 10^{13} + 2.81474976711 \times 10^{14} \\
&= 2.99483809211 \times 10^{14}.
\end{aligned}$$

Corollary 1. From equation (52), the count of IFs for $n$ variables and radix two ($N = 2$) is:

$$\prod_{k=0}^{n-1} \left(1 + 2^{2^{\,n-k-1}}\right)^{2^{k}} = \sum_{k_1=0}^{2^{n-1}} \sum_{k_2=0}^{2^{n-2}} \cdots \sum_{k_n=0}^{2^{0}} \left\{ \frac{\left(2^{n-1}\right)!}{\left(2^{n-1}-k_1\right)!\,k_1!}\, \frac{\left(2^{n-2}\right)!}{\left(2^{n-2}-k_2\right)!\,k_2!} \cdots \frac{\left(2^{0}\right)!}{\left(2^{0}-k_n\right)!\,k_n!} \left(2^{(2-1)(2)^{0}}\right)^{k_1} \left(2^{(2-1)(2)^{1}}\right)^{k_2} \cdots \left(2^{(2-1)(2)^{(n-1)}}\right)^{k_n} \right\}. \tag{65}$$

As previously mentioned, this enumeration can be useful as a terminating point of minimization algorithms for multi-valued functions. Yet, as shown, the number of combinations is so large that restriction to some particular cases of functional expressions can be more feasible.
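As a cross-check (a short derivation added here, not part of the original proof), the nested sums in equation (52) separate, because each index $k_j$ appears in exactly one binomial coefficient and one power. Applying the binomial theorem to each index in turn suggests the following product form, of which the left-hand side of equation (65) is the case $N = 2$ with $k = n - j$:

```latex
\Phi_{N,n}
  = \prod_{j=1}^{n} \sum_{k_j=0}^{N^{n-j}} \binom{N^{n-j}}{k_j}
      \left(N^{(N-1)N^{j-1}}\right)^{k_j}
  = \prod_{j=1}^{n} \left(1 + N^{(N-1)N^{j-1}}\right)^{N^{n-j}} .

% Check against Example 6 (N = 4, n = 2):
\Phi_{4,2} = \left(1 + 4^{3}\right)^{4}\left(1 + 4^{12}\right)
           = 65^{4} \cdot 16{,}777{,}217
           = 299{,}483{,}809{,}210{,}625 \approx 2.99483809211 \times 10^{14}.

% Check against the binary case (N = 2, n = 2):
\Phi_{2,2} = \left(1 + 2^{1}\right)^{2}\left(1 + 2^{2}\right) = 9 \cdot 5 = 45 .
```

In this form the count requires only $n$ factors, rather than one term per index tuple.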
The following section introduces a fast method to calculate the number of IFs for an arbitrary Galois field logic for functions with two variables.

4.2 The IF$_{N,2}$ triangles: fast count calculations of IFs for GF($p^k$) and two-variable functions

The count of the number of IFs can be important in many applications, especially in providing upper numerical bounds for an efficient search of a minimum GFSOP. Calculating the number of inclusive forms can be very time consuming due to the time required to perform the mathematical operations in the general equation (52). This is why a fast method to generate the number of IFs is needed. Functions with two variables find an important application in universal logic modules (ULMs) for pairs of control variables that generalize Shannon and Davio expansion modules [2]. Two-variable functions are also attractive in logic synthesis, since many functional decomposition methods produce two control inputs for primitive cells in a standard-cell library, such as a multiplexer with two address lines. For these reasons, Theorem 4 provides a fast computational method to calculate the number of IFs over an arbitrary radix of Galois field GF($p^k$) for two-variable functions (i.e., $n = 2$).

Theorem 4. The following IF$_{N,2}$ triangles provide a fast computational method to calculate the number of IFs over an arbitrary radix $N$ of Galois field GF($p^k$) for two-variable functions ($n = 2$).

(a)
$$\begin{array}{c}
1\ \ 2\ \ 1\ \ 1\ \ 2\ \ 1 \\
1\ \ 3\ \ 3\ \ 1\ \ 1\ \ 3\ \ 3\ \ 1 \\
1\ \ 4\ \ 6\ \ 4\ \ 1\ \ 1\ \ 4\ \ 6\ \ 4\ \ 1 \\
1\ \ 5\ \ 10\ \ 10\ \ 5\ \ 1\ \ 1\ \ 5\ \ 10\ \ 10\ \ 5\ \ 1 \\
1\ \ 6\ \ 15\ \ 20\ \ 15\ \ 6\ \ 1\ \ 1\ \ 6\ \ 15\ \ 20\ \ 15\ \ 6\ \ 1 \\
1\ \ 7\ \ 21\ \ 35\ \ 35\ \ 21\ \ 7\ \ 1\ \ 1\ \ 7\ \ 21\ \ 35\ \ 35\ \ 21\ \ 7\ \ 1
\end{array}$$

(b)
$$\begin{array}{c}
2^{0}\ \ 2^{1}\ \ 2^{2}\ \ 2^{2}\ \ 2^{3}\ \ 2^{4} \\
3^{0}\ \ 3^{2}\ \ 3^{4}\ \ 3^{6}\ \ 3^{6}\ \ 3^{8}\ \ 3^{10}\ \ 3^{12} \\
4^{0}\ \ 4^{3}\ \ 4^{6}\ \ 4^{9}\ \ 4^{12}\ \ 4^{12}\ \ 4^{15}\ \ 4^{18}\ \ 4^{21}\ \ 4^{24} \\
5^{0}\ \ 5^{4}\ \ 5^{8}\ \ 5^{12}\ \ 5^{16}\ \ 5^{20}\ \ 5^{20}\ \ 5^{24}\ \ 5^{28}\ \ 5^{32}\ \ 5^{36}\ \ 5^{40} \\
N^{0(N-1)}\ \ N^{1(N-1)}\ \ N^{2(N-1)}\ \ N^{3(N-1)}\ \cdots\ N^{(N-1)(N-1)}\ \ N^{N(N-1)}\ \ N^{N(N-1)}\ \ N^{(N+1)(N-1)}\ \ N^{(N+2)(N-1)}\ \cdots\ N^{2N(N-1)}
\end{array}$$

Fig. 7: The IF$_{N,2}$ triangles: (a) triangle of coefficients, and (b) triangle of values for fast calculation of the number of inclusive forms for arbitrary radix Galois field and functions of two input variables.

Proof. The proof follows directly from mathematical induction on the number of IFs over GF($p^k$) for two-variable functions. This is deduced from the general equation (52); if the IF$_{N,2}$ triangles are valid for $N = q$ then they will also be valid for $N = q + 1$, for $N = p^k$ where $p$ is a prime number and $k \geq 1$.

These triangles are important because the count complexity using equation (52) for high dimensions is very high, and thus it becomes difficult for a computer to compute the counts for more than five variables in a reasonable amount of time. Consequently, the IF$_{N,2}$ triangles provide an alternative numerical and geometrical pattern of computing. It can be observed that the IF$_{N,2}$ triangle of coefficients closely resembles the well-known Pascal triangle: if one omits the first two rows of the Pascal triangle and duplicates each row into another horizontally adjacent row, the IF$_{N,2}$ triangle of coefficients is obtained.
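The construction of one row of the triangles in Fig. 7 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the author's algorithm: the function names `if_triangle_row` and `count_ifs_two_variables` are hypothetical, the coefficient row is taken to be the Pascal row of order $N$ written twice, and the exponents are taken to advance in steps of $(N-1)$ with the middle exponent $N(N-1)$ repeated, as read off from Fig. 7(b).

```python
from math import comb


def if_triangle_row(radix: int) -> tuple[list[int], list[int]]:
    """Return (coefficients, values) for the radix-N row of the IF_{N,2} triangles."""
    pascal_row = [comb(radix, i) for i in range(radix + 1)]
    coefficients = pascal_row + pascal_row            # Pascal row duplicated side by side
    step = radix - 1                                  # difference in powers per element
    exponents = [i * step for i in range(radix + 1)]  # 0 .. N(N-1)
    exponents += [(radix + i) * step for i in range(radix + 1)]  # N(N-1) .. 2N(N-1)
    values = [radix ** e for e in exponents]
    return coefficients, values


def count_ifs_two_variables(radix: int) -> int:
    """Sum coefficient * value across the row to obtain the two-variable IF count."""
    coefficients, values = if_triangle_row(radix)
    return sum(c * v for c, v in zip(coefficients, values))


if __name__ == "__main__":
    for radix in (2, 3, 4):
        print(radix, count_ifs_two_variables(radix))
```

Each row has $2(N+1)$ entries, so the cost grows linearly with the radix, in contrast to enumerating every index tuple of equation (52).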
This observation helps in creating algorithms that generate the IF$_{N,2}$ triangle of coefficients, since many efficient algorithms exist to generate the Pascal triangle.

Example 7. Utilizing the IF$_{N,2}$ triangles from Figure 7, one calculates the following numbers of inclusive forms for GF(2), GF(3) and GF(4) for two variables, where the results are identical to those obtained previously:

$$\Phi_{2,2} = 1 \cdot 2^{0} + 2 \cdot 2^{1} + 1 \cdot 2^{2} + 1 \cdot 2^{2} + 2 \cdot 2^{3} + 1 \cdot 2^{4} = 1 + 4 + 4 + 4 + 16 + 16 = 45.$$
$$\Phi_{3,2} = 1 \cdot 3^{0} + 3 \cdot 3^{2} + 3 \cdot 3^{4} + 1 \cdot 3^{6} + 1 \cdot 3^{6} + 3 \cdot 3^{8} + 3 \cdot 3^{10} + 1 \cdot 3^{12} = 730{,}000.$$
$$\Phi_{4,2} = 1 \cdot 4^{0} + 4 \cdot 4^{3} + 6 \cdot 4^{6} + 4 \cdot 4^{9} + 1 \cdot 4^{12} + 1 \cdot 4^{12} + 4 \cdot 4^{15} + 6 \cdot 4^{18} + 4 \cdot 4^{21} + 1 \cdot 4^{24} = 2.99483809211 \times 10^{14}.$$

The IF$_{N,2}$ triangles, with $N$ the Galois radix as in Theorem 4, possess the following interesting properties:
1. The number of positions (elements) in each row of the triangles in Figure 7 is even, starting from six.
2. The sum of elements in each row in Figure 7(a) equals the number of S/D trees per variable order.
3. The triangle in Figure 7(a) possesses even symmetry around an imaginary vertical axis in the middle.
4. The minimum number of columns required to generate the whole triangle in Figure 7(a) is equal to three due to even symmetry: one wing, one column neighboring the middle column, and one middle column.
5. The triangle in Figure 7(a) can be generated by the process of "shift diagonally and add diagonally" (SDAAD): shift the left wing diagonally from west to southeast and add two numbers diagonally from east to southwest, and shift the right wing diagonally from east to southwest and add two numbers diagonally from west to southeast.
6. The difference in powers in the triangle in Figure 7(b) per row element is $(N - 1)$.
7. The first number in each row of the triangle in Figure 7(b) is $N^{0}$ and the last number per row is $N^{2N(N-1)}$.
8. The middle two numbers in each row of the triangle in Figure 7(b) are always equal to $N^{N(N-1)}$.

5 Conclusions and future work

Trees for generalized Shannon-Davio (S/D) expansions over quaternary Galois radix are presented, and the corresponding count for the number of inclusive forms (IFs) per variable order for arbitrary Galois radix and arbitrary number of variables is introduced. Also, the IF$_{N,2}$ triangles are introduced as a new fast computational method to count the number of IFs for an arbitrary Galois radix and functions of two variables. Since the Galois field of quaternary radix has some interesting properties, including its implementation utilizing the well-established two-valued logic synthesis methods, the extension of the S/D trees to GF(4) is presented. In addition, the form of S/D trees is a general concept that can be used in applications for the generation of new diagrams and canonical forms, and in sum-of-product (SOP) minimization, where S/D trees can be utilized for generating forms that include minimum Galois field sum-of-products (GFSOP) circuits for binary and m-ary radices.

Future work will investigate using other complex types of literals, such as the presented Post literal (PL) and window literal (WL), to expand upon and consequently construct the corresponding new S/D trees. The utilization of the results from this research to create an efficient GFSOP minimizer for synthesis applications within the spaces of classical and reversible logic will also be conducted.

References
[1] S. B. Akers, "Binary decision diagrams," IEEE Trans. Comp., vol. C-27, no. 6, pp. 509-516, June 1978.
[2] A. N. Al-Rabadi, Reversible Logic Synthesis: From Fundamentals to Quantum Computing, Springer-Verlag, 2004.
[3] A. N. Al-Rabadi, "Reversible fast permutation transforms for quantum circuit synthesis," Proc. IEEE Int. Symposium on Multiple-Valued Logic (ISMVL), Toronto, 2004, pp. 81-86.
[4] A. N. Al-Rabadi, "Quantum circuit synthesis using classes of GF(3) reversible fast spectral transforms," Proc. IEEE Int. Symposium on Multiple-Valued Logic (ISMVL), Toronto, 2004, pp. 87-93.
[5] A. N. Al-Rabadi, "Quantum logic circuit design of many-valued Galois reversible expansions and fast transforms," J. Circuits, Systems, and Computers, World Scientific, Singapore, vol. 16, no. 5, pp. 641-671, 2007.
[6] A. N. Al-Rabadi, "Representations, operations, and applications of switching circuits in the reversible and quantum spaces," Facta Universitatis (FU) Electronics and Energetics, vol. 20, no. 3, pp. 507-539, 2007.
[7] R. E. Bryant, "Graph-based algorithms for Boolean functions manipulation," IEEE Trans. on Comp., vol. C-35, no. 8, pp. 667-691, 1986.
[8] M. Cohn, Switching Function Canonical Form over Integer Fields, Ph.D. dissertation, Harvard University, 1960.
[9] R. Drechsler, A. Sarabi, M. Theobald, B. Becker, and M. A. Perkowski, "Efficient representation and manipulation of switching functions based on ordered Kronecker functional decision diagrams," Proc. DAC, 1994, pp. 415-419.
[10] M. Escobar and F. Somenzi, "Synthesis of AND/EXOR expressions via satisfiability," Proc. Reed-Muller, 1995, pp. 80-87.
[11] B. J. Falkowski and S. Rahardja, "Classification and properties of fast linearly independent logic transformations," IEEE Trans. on Circuits and Systems-II, vol. 44, no. 8, pp. 646-655, August 1997.
[12] B. Falkowski and L.-S. Lim, "Gray scale image compression based on multiple-valued input binary functions, Walsh and Reed-Muller spectra," Proc. ISMVL, 2000, pp. 279-284.
[13] H. Fujiwara, Logic Testing and Design for Testability, MIT Press, 1985.
[14] D. H. Green, "Families of Reed-Muller canonical forms," Int. J. of Electronics, no. 2, pp. 259-280, 1991.
[15] S. Hassoun, T. Sasao, and R. Brayton (editors), Logic Synthesis and Verification, Kluwer Acad. Publishers, 2001.
[16] M. Helliwell and M. A. Perkowski, "A fast algorithm to minimize multi-output mixed-polarity generalized Reed-Muller forms," Proc. Design Automation Conference, 1988, pp. 427-432.
[17] S. L. Hurst, D. M. Miller, and J. C. Muzio, Spectral Techniques in Digital Logic, Academic Press Inc., 1985.
[18] M. G. Karpovski, Finite Orthogonal Series in the Design of Digital Devices, Wiley, 1976.
[19] C. Y. Lee, "Representation of switching circuits by binary decision diagrams," Bell Syst. Tech. J., vol. 38, pp. 985-999, 1959.
[20] C. Moraga, "Ternary spectral logic," Proc. ISMVL, pp. 7-12, 1977.
[21] J. C. Muzio and T. Wesselkamper, Multiple-Valued Switching Theory, Adam-Hilger, 1985.
[22] D. K. Pradhan, "Universal test sets for multiple fault detection in AND-EXOR arrays," IEEE Trans. Comp., vol. 27, pp. 181-187, 1978.
[23] D. K. Pradhan, Fault-Tolerant Computing: Theory and Techniques, vol. I, Prentice-Hall, 1987.
[24] S. M. Reddy, "Easily testable realizations of logic functions," IEEE Trans. Comp., C-21, pp. 1183-1188, 1972.
[25] T. Sasao (editor), Logic Synthesis and Optimization, Kluwer Academic Publishers, 1993.
[26] T. Sasao, "EXMIN2: a simplified algorithm for exclusive-OR-sum-of-products expressions for multiple-valued input two-valued output functions," IEEE Trans. Computer Aided Design, vol. 12, no. 5, pp. 621-632, 1993.
[27] T. Sasao and M. Fujita (editors), Representations of Discrete Functions, Kluwer Academic Publishers, 1996.
[28] T. Sasao, "Easily testable realizations for generalized Reed-Muller expressions," IEEE Trans. Comp., vol. 46, pp. 709-716, 1997.
[29] T. Sasao, Switching Theory for Logic Synthesis, Kluwer Academic Publishers, 1999.
[30] N. Song and M. Perkowski, "Minimization of exclusive sum of products expressions for multi-output multiple-valued input incompletely specified functions," IEEE Trans. Computer Aided Design, vol. 15, no. 4, pp. 385-395, 1996.
[31] R. S. Stanković, "Functional decision diagrams for multiple-valued functions," Proc. ISMVL, 1995, pp. 284-289.
[32] R. S. Stanković, Spectral Transform Decision Diagrams in Simple Questions and Simple Answers, Nauka, 1998.
[33] R. S. Stanković, C. Moraga, and J. T. Astola, "Reed-Muller expressions in the previous decade," Proc. Reed-Muller, Starkville, 2001, pp. 7-26.
[34] B. Steinbach and A. Mishchenko, "A new approach to exact ESOP minimization," Proc. Reed-Muller, Starkville, 2001, pp. 66-81.
[35] S. N. Yanushkevich, Logic Differential Calculus in Multi-Valued Logic Design, Technical Univ. Szczecin, 1998.
[36] I. Zhegalkin, "On the techniques of calculating sentences in symbolic logic," Math. Sb., vol. 34, pp. 9-28, 1927.
[37] I. Zhegalkin, "Arithmetic representations for symbolic logic," Math. Sb., vol. 35, pp. 311-377, 1928.

Acknowledgement: This research was performed during sabbatical leave in 2015-2016 granted from the University of Jordan and spent at Philadelphia University.

Facta Universitatis, Series: Electronics and Energetics, Vol. 34, No 3, September 2021, pp.
323-332 https://doi.org/10.2298/fuee2103323s © 2021 by University of Niš, Serbia | Creative Commons License: CC BY-NC-ND
Original scientific paper

Analytical study of effect of energy band parameters and lattice temperature on conduction band offset in AlN/Ga2O3 HEMT *

Rajan Singh1, Trupti Ranjan Lenka1, Hieu Pham Trung Nguyen2
1Department of Electronics and Communication Engineering, National Institute of Technology Silchar, Assam, India
2Department of Electrical and Computer Engineering, New Jersey Institute of Technology, Newark, New Jersey, 07102, USA

Abstract. Apart from other factors, the conduction band offset (CBO) set by band alignment largely affects the two-dimensional electron gas (2DEG) density $n_s$ in wide-bandgap semiconductor based high electron mobility transistors (HEMTs). In assessing the various performance metrics of HEMTs, a rational estimate of the CBO and of the maximum achievable 2DEG density is critical. Here, we present an analytical study of the effect of different energy band parameters (energy bandgap and electron affinity of the heterostructure constituents) and of lattice temperature on the CBO and the estimated 2DEG density in the triangular quantum well. It is found that, at thermal equilibrium, $n_s$ increases linearly with $\Delta E_c$ at a fixed Schottky barrier potential, but decreases linearly with increasing gate-metal work function even at fixed $\Delta E_c$, due to increased Schottky barrier heights. Furthermore, it is observed that poor thermal conductivity leads to higher lattice temperature, which results in a lower energy bandgap and hence affects $\Delta E_c$ and $n_s$ at higher output currents.

Key words: 2DEG density, conduction band offset (CBO), heterojunction, HEMT, lattice temperature, barrier, buffer, Ga2O3

1. Introduction

Due to its suitable material properties and the availability of high-quality and cost-effective native substrates, gallium oxide (Ga2O3) is being extensively investigated for power electronics applications [1].
Currently, this domain of high-power and high-frequency devices is dominated by wide-bandgap semiconductors like silicon carbide (SiC) and gallium nitride (GaN). Unique features of GaN-based HEMTs (excellent carrier confinement in the form of a 2DEG, high carrier mobility, and large breakdown voltage) have made them among the most useful devices for high-power and high-frequency applications [2-4]. Despite some key challenges on the substrate side, GaN technology has nevertheless survived beyond the expected life cycle of a typical semiconductor technology [5]. Recently, researchers across the globe have started to look towards ultra-wide bandgap (UWB) semiconductors (Ga2O3, AlN, and diamond) for high-voltage applications [6]. Among these UWB semiconductors, Ga2O3 has emerged as a leading choice for future power electronics devices, on the back of preliminary results that are encouraging enough to prove its capability to supplement existing SiC/GaN technologies. It is worth noting here that, apart from its lower bulk electron mobility, 150-200 cm²/V·s [5-6], Ga2O3 has very low thermal conductivity, 0.13-0.27 W/cm·K [5].

Received February 10, 2021; received in revised form May 06, 2021
Corresponding author: Rajan Singh, Department of Electronics and Communication Engineering, National Institute of Technology Silchar, Assam, India. E-mail: rajan_rs@ece.nits.ac.in
* An earlier version of this paper was presented at the International Conference on Micro/Nanoelectronics Devices, Circuits, and Systems (MNDCS-2021), January 29-31, 2021, India [1].

The following equation relates the electron affinity of a semiconductor to the lattice temperature ($T_L$), as given in [7]:

$$\chi(T_L) = \chi(300) - \mathrm{CHI.EG.TDEP} \times \left(E_g(T_L) - E_g(300)\right) \tag{1}$$

where the parameter CHI.EG.TDEP is a ratio in the range 0 to 1, with a default value of 0.5 (used here), which specifies the fraction of the change in the bandgap due to the temperature change.
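Equation (1) can be evaluated directly once the bandgap at the two temperatures is known. The sketch below is illustrative only: the function name `electron_affinity` and its argument names are assumptions, and the numbers in the usage lines are placeholders chosen for the example, not material parameters from the paper.

```python
def electron_affinity(chi_300: float, eg_tl: float, eg_300: float,
                      chi_eg_tdep: float = 0.5) -> float:
    """Equation (1): electron affinity (eV) at lattice temperature T_L.

    chi_300     : electron affinity at 300 K (eV)
    eg_tl       : energy bandgap at T_L (eV)
    eg_300      : energy bandgap at 300 K (eV)
    chi_eg_tdep : fraction of the bandgap change applied to the affinity (0 to 1)
    """
    if not 0.0 <= chi_eg_tdep <= 1.0:
        raise ValueError("chi_eg_tdep must lie in [0, 1]")
    return chi_300 - chi_eg_tdep * (eg_tl - eg_300)


if __name__ == "__main__":
    # Placeholder values for illustration only (not taken from the paper).
    chi = electron_affinity(chi_300=4.0, eg_tl=4.7, eg_300=4.8)
    print(f"chi(T_L) = {chi:.3f} eV")
```

With the default ratio of 0.5, half of any bandgap narrowing at elevated temperature appears as an increase in electron affinity, which is how a temperature change feeds into the conduction band offset.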
for β-ga2o3, the energy bandgap at tl is given by the varshni equation [8]:

$$E_g(T_L) = E_g(300) - \frac{\alpha T_L^2}{T_L + \beta} \qquad (2)$$

where the fitting parameters α and β for β-ga2o3 and aln are taken from [9] and [10], respectively, and extrapolated to higher temperatures. the different bandgaps of the two materials forming a heterojunction create band discontinuities, and the band offset parameters, namely the conduction and valence band offsets (δec and δev), have a large impact on charge transport in the heterostructure [7]. higher 2deg densities are anticipated for large δec values [11]; besides the bandgaps, δec also depends on the electron affinities of the heterojunction materials. considering the importance of the 2deg density in the operation of hemts, various physics-based analytical models for ns [12-18] are available, mostly for algan/gan hemts. in this work, we primarily investigate, through tcad simulations, the estimation of ns (i) for different values of alignment (the fraction of the bandgap difference assigned to δec) at a fixed schottky barrier height, and (ii) for schottky barrier heights that increase with the metal work function at fixed δec. the study also accounts for heat flow in the device, since the poor thermal conductivity of ga2o3 and the high currents in power devices result in a high lattice temperature.

table 1 symbols used and meaning [1]

| symbol | physical meaning | symbol | physical meaning |
|---|---|---|---|
| χ | electron affinity | ε | static dielectric permittivity |
| eg | energy bandgap | D | density of states |
| ϕb | schottky barrier height | ef | position of fermi level |
| ϕm | metal work function | d | thickness of barrier layer |
| q | electron charge | qv0 | built-in potential |
| ns | electron density in the 2deg | vth | thermal voltage |
| δec | conduction band offset (cbo) | nc | conduction band density |
| δev | valence band offset (vbo) | nv | valence band density |

fig. 1 energy band diagram of a typical heterojunction showing band offset [1].
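the temperature dependences in equations (1) and (2) can be sketched numerically as follows. this is a minimal illustration, not the atlas implementation: the varshni coefficients `ALPHA` and `BETA` are hypothetical placeholders (the fitted values of [9] and [10] are not reproduced here), and the bandgap is referenced to its 300 k value so that eg(300) is recovered exactly at tl = 300 k, which is an assumption about how equation (2) is meant to be applied.

```python
# Sketch of equations (1) and (2): temperature-dependent bandgap and electron affinity.
# ALPHA and BETA are hypothetical placeholders, NOT the fitted parameters of [9], [10].

EG_300 = 4.9        # eV, beta-Ga2O3 bandgap at 300 K (table 2)
CHI_300 = 3.15      # eV, beta-Ga2O3 electron affinity at 300 K (table 2)
CHI_EG_TDEP = 0.5   # fraction of the bandgap change applied to the affinity (value used in the text)
ALPHA = 1.0e-3      # eV/K, hypothetical Varshni coefficient
BETA = 500.0        # K, hypothetical Varshni coefficient


def varshni_term(t_l, alpha=ALPHA, beta=BETA):
    """Varshni temperature term alpha*T^2/(T + beta), in eV."""
    return alpha * t_l**2 / (t_l + beta)


def bandgap(t_l, eg_300=EG_300, alpha=ALPHA, beta=BETA):
    """Bandgap at lattice temperature t_l, referenced to its 300 K value (assumption)."""
    return eg_300 - (varshni_term(t_l, alpha, beta) - varshni_term(300.0, alpha, beta))


def electron_affinity(t_l, chi_300=CHI_300, ratio=CHI_EG_TDEP):
    """Equation (1): the affinity shifts by a fraction of the bandgap change."""
    return chi_300 - ratio * (bandgap(t_l) - bandgap(300.0))


if __name__ == "__main__":
    for t in (300.0, 500.0, 800.0, 1063.0):
        print(f"T = {t:6.1f} K  Eg = {bandgap(t):.3f} eV  chi = {electron_affinity(t):.3f} eV")
```

with any positive coefficients the bandgap shrinks and the affinity grows as the lattice temperature rises, which is the qualitative trend implied by equation (1); the absolute numbers printed here carry no physical meaning because of the placeholder coefficients.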
the energy band diagram of the abrupt aln/β-ga2o3 heterojunction, with a β-ga2o3 buffer layer (eg1, χ1) and an aln barrier layer (eg2, χ2), where eg2 > eg1 and χ2 < χ1, is shown in figure 1. the developed model is used to optimize ns considering the band parameters of the barrier and buffer layer materials reported for experimental ga2o3 hemts. using the charge control equation [12], the 2deg charge density ns can be related to the conduction band offset δec as:

$$n_s = \frac{\varepsilon}{q d} \left[ V_g - \phi_b + V_{pb} - E_f + \Delta E_c \right] \qquad (3)$$

where vpb is the barrier layer pinch-off voltage and vg is the applied gate voltage. the other symbols and their physical meanings are listed in table 1. the device under study is the aln/β-ga2o3 hemt, as experimental measurements of the band offset parameters δev and δec at iii-nitride (gan and aln)/β-ga2o3 heterostructures are readily available [19, 20]. additionally, as the in-plane lattice mismatch between the (-201) β-ga2o3 and (0002) aln planes is as small as 2.4 % [20], aln/β-ga2o3 is anticipated to be a potential candidate for future high-power applications.

2. 2deg charge density and device model description

in the triangular quantum well, the 2deg charge density ns is related to the fermi level ef and the two sub-bands e0 and e1 through fermi-dirac statistics, as given by [12]:

$$n_s = D V_{th} \left[ \ln\left\{ e^{(E_f - E_0)/V_{th}} + 1 \right\} + \ln\left\{ e^{(E_f - E_1)/V_{th}} + 1 \right\} \right] \qquad (4)$$

where the first energy level is e0 = γ0 ns^(2/3) and the second is e1 = γ1 ns^(2/3). in the case of complete ionization of the barrier layer, equation (3) can be written as:

$$n_s = \frac{\varepsilon}{q d} \left\{ V_{go} - E_f \right\} \qquad (5)$$

where vgo = vg – voff. the 2deg density models developed so far for algan/gan hemts explain the behavior of ns with respect to vg; they assume that only the first sub-band e0 lies below ef for vg > voff. here, a more simplified expression of the fermi level (in volts) in terms of ns is obtained under steady-state conditions.
$$E_f = \frac{n_s}{2D} + \frac{E_0 + E_1}{2} - \frac{V_{th}}{2} \qquad (6)$$

the above equation is obtained using the approximation ln(1 + x) ≈ x for x ≪ 1, after some mathematical manipulation. next, e0 and e1 can be substituted to obtain the following explicit relation between ef and ns:

$$E_f = \frac{n_s}{2D} + \left( \frac{\gamma_0 + \gamma_1}{2} \right) n_s^{2/3} - \frac{V_{th}}{2} \qquad (7)$$

furthermore, higher charge confinement in the triangular well is anticipated for a larger energy difference between ef and e0 [21]. it is worth mentioning that only some fraction, typically 60 or 80 % and very rarely up to 100 % [22], of the bandgap difference between the heterostructure materials appears as cbo. therefore, a careful determination of the cbo (δec) is important when estimating ns. in this work, to estimate the confined charge density in the quantum well, the position of e0 relative to ef is analyzed under varying band alignment and varying schottky barrier height at thermal equilibrium. this is illustrated in figure 2.

the device model analyzed here comprises an aln barrier on a β-ga2o3 buffer layer, with thicknesses of 10 and 50 nm, respectively. the layer sequence and device cross-section are shown in figure 3. the source and drain contacts are considered ohmic, while the gate contact is of schottky type. silicon nitride (si3n4) is used for surface passivation and to limit the parasitic capacitances, as mentioned in [23]. the material parameters used in the simulation of the aln/β-ga2o3 hemt are listed in table 2; those for β-ga2o3 are taken from [24, 25] and those for aln from [7].

fig. 2 the relative position of e0 and ef to cbo for fixed schottky potential [1].

fig. 3 schematic diagram showing layer sequence of aln/β-ga2o3 hemt; dashed line below aln barrier represents 2deg charges [1].

table 2 material parameters of β-ga2o3 and aln used in different calculations of tcad simulations, taken from [7, 24-25].
| symbol | β-ga2o3 | aln |
|---|---|---|
| χ (ev) | 3.15 | 1.4 |
| eg (ev) | 4.9 | 6.1 |
| nc (cm^-3) | 3.6 × 10^18 | 4.42 × 10^18 |
| nv (cm^-3) | 2.86 × 10^20 | 6.76 × 10^18 |
| ni (cm^-3) | 2.23 × 10^-22 | 1.51 × 10^-33 |
| ε | 10.2 | 8.5 |

3. results and discussion

the device under test (figure 3) is simulated in atlas tcad to estimate the cbo and the 2deg density, both under steady-state conditions and at different bias voltages with heat flow enabled in the device. both investigations use the alignment-based rule, in which δec is set as a fraction of δeg, because of the significant difference between the band offsets estimated from standard electron affinity values and those measured experimentally.

3.1. at steady state condition

3.1.1. fixed schottky barrier height

earlier, various high-performance aln schottky barrier diodes were demonstrated [26-28], and barrier heights ranging from 1.6 to 2.3 ev between aln and different metals were measured [26]. here, ti and au aln schottky contacts with a barrier height of 1.6 ev are used to estimate the cbo and analyze the 2deg density under three different degrees of alignment: 60, 80, and 100 %. as the conduction band offset δec increases from 0.65 to 1.15 ev, the 2deg density increases, since a higher conduction band alignment boosts carrier confinement, as illustrated in figure 4.

fig. 4 estimation of cbo keeping fixed barrier height of 1.6 ev under 60, 80, and 100% alignment of bandgap difference.

3.1.2. fixed alignment of 60%

a moderate alignment of 60 % of the bandgap difference between aln and β-ga2o3 (δeg = 1.24 ev) is assigned here as the cbo. three different metals (ti, ni, and au) on aln, with barrier heights of 1.6, 1.8, and 2.3 ev, are considered to estimate δec and consequently the 2deg density; the results are shown in figure 5. although a fixed fraction of the bandgap difference (0.6 × 1.24 = 0.744 ev) is assigned to the conduction band discontinuity, the estimated value of δec is slightly less than the assigned value.
this may be attributed to surface and/or interface states at the aln/β-ga2o3 boundary [26].

fig. 5 estimation of δec and ns under three increasing schottky barrier heights for fixed alignment of 60 %; decreasing values of ns (5.0, 4.9, and 4.7) × 10^13 cm^-2 are estimated.

3.2. heat-flow simulation

the poor thermal conductivity of ga2o3 and the higher output currents in ga2o3-based power devices commonly result in a high lattice temperature in the absence of device-level thermal management. here, after enabling the relevant model in the simulation, the maximum lattice temperature under the gate area is evaluated at different bias voltages. the subsequent effects on electron affinity and energy bandgap are also estimated. the reduced energy bandgap, accompanied by increased affinity values, results in a higher bandgap difference at the heterointerface. the maximum lattice temperature at elevated currents is shown in figure 6, and the resulting energy bandgap and electron affinity at different lattice temperatures are shown in figure 7.

fig. 6 maximum lattice temperature; extracted from atlas (left) at higher drain current, and plotted versus corresponding drain voltage at zero gate voltage (right).

fig. 7 energy band gap and electron affinity as a function of maximum lattice temperature.

3.2.1. effect on conduction band offset

as the lattice temperature increases, the energy bandgap of β-ga2o3 shrinks as per equation (2), as shown in figure 7. the maximum bandgap difference available between aln and β-ga2o3, corresponding to the maximum lattice temperature of 1063 k at vds = 15 v (vgs = 0 v), is given as

$$\Delta E_g = \left( E_g^{AlN} - E_g^{Ga_2O_3} \right) \approx 5.8 - 3.2 \approx 2.6\ \mathrm{eV}$$

here, it is evident that a larger change in the ga2o3 energy bandgap leads to a 46 % higher bandgap difference between aln and ga2o3 compared to its value of 1.24 ev at 300 k, and consequently higher values of cbo result.
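the alignment rule used in this section amounts to simple arithmetic, sketched below for both operating points. the bandgap differences (1.24 ev at 300 k and 2.6 ev at 1063 k) and the simulated δec values are taken from the text and from table 3; the function and variable names are illustrative only.

```python
# Alignment rule: the CBO assigned to the simulation is a fixed fraction
# of the bandgap difference between barrier and buffer.

def assigned_cbo(delta_eg_ev, alignment_percent):
    """Return the CBO (eV) assigned for a given alignment percentage."""
    return delta_eg_ev * alignment_percent / 100.0


# Bandgap differences quoted in the text (eV).
DELTA_EG = {"300 K": 1.24, "1063 K": 2.6}

# Simulated CBO values reported in table 3 (eV), keyed by alignment percentage.
SIMULATED_CBO = {
    "300 K": {60: 0.65, 80: 0.9, 100: 1.15},
    "1063 K": {60: 1.47, 80: 2.0, 100: 2.5},
}

if __name__ == "__main__":
    for point, delta_eg in DELTA_EG.items():
        for pct, simulated in SIMULATED_CBO[point].items():
            assigned = assigned_cbo(delta_eg, pct)
            print(f"{point:>6}  {pct:3d} %  assigned {assigned:.3f} eV  simulated {simulated:.2f} eV")
```

in every case the simulated δec lies slightly below the assigned value, in line with the observation made for the 60 % alignment case.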
further, corresponding to 80 % alignment of δeg (δec = 0.8 × 2.6 = 2.08 ev), the 2deg density ns is estimated for the three barrier heights and is shown in figure 8.

fig. 8 2deg estimation at the fixed alignment of 80 % for different schottky barrier heights. 2deg density decreases with higher schottky barrier heights.

the important inferences from the results presented above are summarized in table 3. it is found that a higher degree of cbo results in increased 2deg density, both at steady state and at higher bias voltages. however, the 2deg density in the latter scenario is relatively low compared to the former. this is attributed to the enhanced electron-phonon interaction with increasing lattice temperature. additionally, the confined carrier density decreases with increasing schottky barrier height in both cases. this can be due to the presence of interface charges and defect states at the aln/β-ga2o3 boundary.

table 3 estimated values of δec and ns under fixed schottky barrier and fixed alignment. steady state: tl = 300 k, eg(β-ga2o3) = 4.9 ev, eg(aln) = 6.1 ev. at vds/vgs = 15/0 v: tl = 1063 k, eg(β-ga2o3) = 3.2 ev, eg(aln) = 5.8 ev.

| alignment (%) / schottky height (ev) | δec (ev), 300 k | ns (× 10^13 cm^-2), 300 k | δec (ev), 1063 k | ns (× 10^13 cm^-2), 1063 k |
|---|---|---|---|---|
| 60 / 1.6 | 0.65 | 5.0 | 1.47 | 4.6 |
| 80 / 1.6 | 0.9 | 5.12 | 2.0 | 4.8 |
| 100 / 1.6 | 1.15 | 5.23 | 2.5 | 5.0 |
| 80 / 1.6 | 0.9 | 5.12 | 2.0 | 4.8 |
| 80 / 1.8 | 0.9 | 5.0 | 2.0 | 4.7 |
| 80 / 2.3 | 0.9 | 4.8 | 2.0 | 4.5 |

4. conclusion

to summarize, the effect of the conduction band offset, set by the energy bandgap difference, on the 2deg density in the aln/β-ga2o3 hemt is studied analytically. an analytical expression of the fermi level is derived, from which it is concluded that the relative position of ef and e0 largely affects the 2deg density. the alignment-based rule, with the cbo taken as a fraction of δeg, is found to agree better with values measured in experimental devices. by varying band alignment
and schottky barrier heights, the resultant effects on δec and 2deg density are estimated. it is found that, apart from its dependence on the conduction band offset, ns is also affected by the schottky barrier height. it is also shown that poor thermal conductivity leads to a higher lattice temperature, which results in larger δeg and cbo but yields a relatively lower 2deg density than in the steady-state condition. in steady state, for a fixed schottky barrier height of ϕb = 1.6 ev (ti/aln), the 2deg density increases from 5.0 × 10^13 to 5.23 × 10^13 cm^-2 when the alignment changes from 60 to 100 %, and from 4.6 × 10^13 to 5.0 × 10^13 cm^-2 at vds/vgs = 15/0 v. on the other hand, even at fixed δec, the 2deg density decreases from 5.12 × 10^13 to 4.8 × 10^13 cm^-2 when ϕb increases from 1.6 to 2.3 ev in steady state, and from 4.8 × 10^13 to 4.5 × 10^13 cm^-2 at a lattice temperature of 1063 k corresponding to vds/vgs = 15/0 v. these conclusions can help to assess the limitations of β-ga2o3 hemt performance, which critically depends on a careful estimation of the 2deg density.

references

[1] r. singh, t. r. lenka, s. a. ahsan, and h. p. t. nguyen, “analytical study of conduction band discontinuity supported 2deg density in aln/ga2o3 hemt,” in proceedings of the international conference on micro/nanoelectronics devices, circuits, and systems (mndcs-2021), 29-31 jan. 2021. http://mndcs.nits.ac.in/
[2] u. k. mishra, l. shen, t. e. kazior, y. f. wu, “gan-based rf power devices and amplifiers,” proceedings of the ieee, vol. 96, no. 2, pp. 287–305, jan. 2008.
[3] p. parikh, y. wu, m. moore, p. chavarkar, u. mishra, r. neidhard, et al., “high linearity, robust, algan-gan hemts for lna and receiver ics,” in proceedings of the ieee lester eastman conf. high perform. devices, aug. 2002, pp. 415–421.
[4] p. kordos, a. alam, j. betko, p. p. chow, m. heuken, p. javorka, et al., “material and device issues of gan-based hemts,” in proceedings of the 8th ieee int. symp. high perform. electron devices microw.
optoelectron. appl., nov. 2002, pp. 61–66.
[5] s. j. pearton, f. ren, m. tadjer, and j. kim, “perspective: ga2o3 for ultra-high power rectifiers and mosfets,” journal of applied physics, vol. 124, no. 22, p. 220901, dec. 2018.
[6] e. ahmadi and y. oshima, “materials issues and devices of α- and β-ga2o3,” journal of applied physics, vol. 126, p. 160901, oct. 2019.
[7] atlas, device simulator, “atlas user’s manual,” silvaco international software, santa clara, ca, usa, 2016.
[8] y. p. varshni, “temperature dependence of the energy gap in semiconductors,” physica, vol. 34, no. 1, pp. 149–154, 1967.
[9] s. rafique, l. han, s. mou, and h. zhao, “temperature and doping concentration dependence of the energy band gap in β-ga2o3 thin films grown on sapphire,” optical materials express, vol. 7, no. 10, p. 3561, oct. 2017.
[10] k. b. nam, j. li, j. y. lin, and h. x. jiang, “optical properties of aln and gan in elevated temperatures,” applied physics letters, vol. 85, no. 16, oct. 2004.
[11] y. zhang, et al., “demonstration of high mobility and quantum transport in modulation-doped β-(alxga1-x)2o3/ga2o3 heterostructures,” applied physics letters, vol. 112, no. 17, p. 173502, apr. 2018.
[12] s. kola, j. m. golio, and g. n. maracas, “an analytical expression for fermi level versus sheet carrier concentration for hemt modeling,” ieee electron device lett., vol. 9, no. 3, pp. 136–138, mar. 1988.
[13] x. cheng, m. li, and y. wang, “an analytical model for current voltage characteristics of algan/gan hemts in presence of self-heating effect,” solid state electron., vol. 54, no. 1, pp. 42–47, jan. 2010.
[14] s. khandelwal, n. goyal, and t. a. fjeldly, “a physics based analytical model for 2deg charge density in algan/gan hemt devices,” ieee trans. electron devices, vol. 58, no. 10, pp. 3622–3625, oct. 2011.
[15] x. cheng and y. wang, “a surface-potential-based compact model for algan/gan modfets,” ieee trans. electron devices, vol. 58, no. 2, pp. 448–454, feb. 2011.
[16] t. r.
lenka, and a. k. panda, “effect of structural parameters on 2deg density and c ~ v characteristics of alxga1-xn/aln/gan-based hemt,” indian journal of pure & applied physics, vol. 49, pp. 416-422, jun. 2011.
[17] s. khandelwal and t. a. fjeldly, “a physics based compact model for i–v and c–v characteristics in algan/gan hemt devices,” solid state electron., vol. 76, pp. 60–66, oct. 2012.
[18] f. m. yigletu, s. khandelwal, t. a. fjeldly, and b. iniguez, “compact charge-based physical models for current and capacitances in algan/gan hemts,” ieee trans. electron devices, vol. 60, no. 11, pp. 3746–3752, nov. 2013.
[19] w. wei, et al., “valence band offset of β-ga2o3/wurtzite gan heterostructure measured by x-ray photoelectron spectroscopy,” nanoscale research letters, vol. 7, p. 562, dec. 2012.
[20] h. sun, et al., “valence and conduction band offsets of β-ga2o3/aln heterojunction,” applied physics letters, vol. 111, no. 16, p. 162105, oct. 2017.
[21] y. k. verma, v. mishra, s. k. gupta, “a physics based analytical model for mgzno/zno hemt,” journal of circuits, systems, and computers, jan. 2019.
[22] h. sun et al., “nearly-zero valence band and large conduction band offset at baln/gan heterointerface for optical and power device application,” applied surface science, vol. 458, pp. 949–953, jul. 2018.
[23] r. singh, t. r. lenka, r. t. velpula, b. jain, h. q. t. bui, h. p. t. nguyen, “a novel β-ga2o3 hemt with ft of 166 ghz and x-band pout of 2.91 w/mm,” int. j. numer model el., e2794, 2020.
[24] a. mock, r. korlacki, c. briley, v. darakchieva, b. monemar, y. kumagai, k. goto, m. higashiwaki, m. schubert, phys. rev. b condens. matter, vol. 96, no. 24, p. 245205, 2017.
[25] z. zhang, e. farzana, a. r. arehart, and s. a. ringel, “deep level defects throughout the bandgap of (010) β-ga2o3 detected by optically and thermally stimulated defect spectroscopy,” applied physics letters, vol. 108, p. 052105, feb. 2016.
[26] p. reddy, i. bryan, z.
bryan, j. tweedie, r. kirste, r. collazo, and z. sitar, “schottky contact formation on polar and nonpolar aln,” journal of applied physics, vol. 116, no. 19, p. 194503, nov. 2014.
[27] y. irokawa, e. villora, and k. shimamura, “schottky barrier diodes on aln free-standing substrates,” japanese journal of applied physics, vol. 51, no. 4r, p. 040206, mar. 2012.
[28] t. kinoshita, t. nagashima, t. obata, s. takashima, r. yamamoto, r. togashi, y. kumagai, r. schlesser, r. collazo, a. koukitu, and z. sitar, “fabrication of vertical schottky barrier diodes on n-type freestanding aln substrates grown by hydride vapor phase epitaxy,” appl. phys. exp., vol. 8, no. 6, p. 061003, may 2015.

facta universitatis series: electronics and energetics vol. 31, no. 1, march 2018, pp. 41-50 https://doi.org/10.2298/fuee1801041p

investigating the steady state mode of power inverters for induction heating

evgeniy popov, nikolay hinov

faculty of electronic engineering and technologies, department of power electronics, technical university of sofia, sofia, bulgaria

abstract. this article formulates a new unified interpretation of the analysis of electromagnetic processes in autonomous (usually resonant) inverters whose power circuits have a serial rlc configuration, either with or without free-wheeling diodes. the investigation starts by clarifying the parameters of the inverter circuit, reducing the fourth-order power network to a second-order one in normalized form. on this basis, novel concise relationships between the most important internal inverter parameters are given. a matlab program calculates and displays the frequency characteristics of both types of inverters and simulates their steady state. the results from the characteristics and from the simulation confirm each other in many ways. they were also confirmed experimentally.
the whole processed information helps better understanding and organizing intelligent design, measurement and control of the inverters for technological applications (induction heating). key words: modeling, electromagnetic processes, frequency characteristics, rlc inverters, steady state, unified interpretation. 1. introduction the voltage fed rlc inverter (fig. 1), its dual counterpart the current fed rlc inverter and the serial rlc inverter without free–wheeling diodes (fig. 2) cover a very wide range of practical autonomous inverter circuits generally applied in electronic technology [1]-[4]. in this case important and complex mutually connected problems are the accurate design of the power circuit, appropriate adjustment between the inverter and the load and adequate control providing stable, reliable and efficient operation of the converter when wide variations of the load are expected [5]-[7]. the mathematical relationships between the parameters of the mentioned inverters are rather complicated [5], [8]-[10]. it has been proved that all the quantities in these second order received september 15, 2016; received in revised form september 29, 2017 corresponding author: nikolay hinov faculty of electronic engineering and technologies, department of power electronics, technical university of sofia, sofia, bulgaria (e-mail: hinov@tu-sofia.bg) 42 e. popov, n. hinov topologies depend on two variables the ratio between the controlling angular frequency and the generalized angular frequency (frequency coefficient) in n /     on one hand, and the ratio between the damping coefficient and the generalized angular frequency i n n /      of the power circuit, on the other. at the same time engineering practice in the area of the power electronics needs fast and accurate means for simultaneous solution of the already stated problems. vt1 vt4 u d vd1 vd4 vt2 vt3 vd2 vd3 r i l i ci i u i d u i fig. 
1 voltage fed rlc inverter when, for oscillatory mode i i ir 2 l / c of the inverter circuit (fig 1, fig. 2) coefficient is c 1 , for over damped mode ( i i ir 2 l / c ) c 1  and for critical mode ( i i i r 2 l / c ) c 0 . the parameters that determine the development of the electromagnetic processes in the steady state in the power inverter circuits are 2 2 cos ( .; );1 ( ) (4 sin cos ) i i i i i i i b n n osc over damped critical c b b           (1) for resonant inverters (oscillatory mode) the coefficient of hesitation is 0 1/ [1 exp( / )] 1/ [1 exp( )] i k n         (2) the frequency coefficient is 2 2 2 2 ( .; ); ( ) cos(4 sin cos ) i i ii i i i n n osc over damped critical bc b b           (3) vs3 vs2 c i rii u u d vs1 vs4 l i1 i d l i2 or l i =l i1 +l i2 b fig. 2 a serial rlc inverter without free–wheeling diodes the steady state mode of power inverters for electro technology applications 43 2. the inverter analysis in the steady state mode in contrast to the approaches used in [10]-[15] provided herein is a unified approach to the analysis, which alleviates the description of the behavior of the power circuit. a constant ci reflecting the type of the inverter is introduced, having values ci = +1 for the rlc inverter with free wheeling diodes (fig. 1) or ci = 1 for the rlc inverter without free wheeling diodes (fig. 2). the following designations are applied: s f (x) sin x , c f (x) cos x , i i r / (2l )  , 2 0 i 1/ (l ci)     for oscillatory mode; sf (x) sinh x , cf (x) cosh x , i ir / (2l )  , 2 i i 1/ (l c )   for over damped mode; s f (x) x , c f (x) 1 , i i r / (2l )  ,   for critical mode. 
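the three operating modes and the unified functions f_s and f_c designated above can be sketched as follows. this is a minimal illustration under the standard series-rlc definitions (damping coefficient r/(2l), critical resistance 2·sqrt(l/c)), which are assumed here rather than quoted from the paper; the function name `rlc_mode` and the numerical values are hypothetical.

```python
import math


def rlc_mode(r, l, c):
    """Classify a series RLC circuit and return (mode, delta, omega, f_s, f_c).

    delta is the damping coefficient R/(2L). omega is the angular frequency
    associated with the mode; it is returned as None in the critical case,
    where this sketch does not define it. f_s and f_c are the unified
    "sine-like" and "cosine-like" functions: sin/cos (oscillatory),
    sinh/cosh (overdamped), or x/1 (critical).
    """
    delta = r / (2.0 * l)
    r_crit = 2.0 * math.sqrt(l / c)  # critical resistance 2*sqrt(L/C)
    if math.isclose(r, r_crit, rel_tol=1e-12):
        return "critical", delta, None, (lambda x: x), (lambda x: 1.0)
    if r < r_crit:
        omega = math.sqrt(1.0 / (l * c) - delta**2)
        return "oscillatory", delta, omega, math.sin, math.cos
    omega = math.sqrt(delta**2 - 1.0 / (l * c))
    return "overdamped", delta, omega, math.sinh, math.cosh


if __name__ == "__main__":
    ind, cap = 1e-3, 1e-6  # arbitrary illustrative values (H, F)
    for res in (10.0, 2.0 * math.sqrt(ind / cap), 100.0):
        mode, delta, omega, f_s, f_c = rlc_mode(res, ind, cap)
        print(mode, delta, omega, f_s(0.5), f_c(0.5))
```

returning the pair (f_s, f_c) together with the mode lets the same expressions for current and capacitor voltage be evaluated in all three cases, which is the point of the unified notation.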
then the inverter current and the voltage across the capacitor ci can be written in the following manner 0 0 ( ) ( ) ( ) t td s i c s u u i e f t c i e f t f t l              (4) 0 0 ( ) ( ) ( ) ( ) t t d d c s i s i u u u u e f t f t c e f t c               (5) the angle 2 0 / / n          corresponding to the half period is determined from the controlling angular frequency 2 f  and the generalized frequency  . the parameters of the inverter circuit (fig. 1, fig. 2) can be determined taking into account the initial conditions for the steady state 0 2 (0) . ( ) (0) ( ) i i c i u u       (6) and 1 ( ) 0i   (7) then the determination of the parameters follows. the parameter 0 0 / ( ) d a i l u u   is 2 2 . 2 2 ( ) . ( ) . . ( ) s n i c i s f a e c f c n f         (8) for the inverter in fig. 1 only the angle 1  is determined from 1 1 1 1 ( ) / ( ) 1 . ( ) / ( ) s c s c f f a n f f        (9) the generalized coefficient of hesitation is 2. 2 2 2 1 1 [( . . ) ( ) ( )] n i i s c k e c ca n c a n f f            (10) 44 e. popov, n. hinov it should be underlined that when calculating a and k for discontinuous inverter current mode for fig. 2 0 2   must be equal to  in the already given expressions (8) and (10). the initial capacitor voltage is 0 0 ' 2. 1 d u u k u    (11) the maximal voltage across the capacitor i c m u for fig. 2 is also given by (11). for fig. 1 it is given by 1 ' 2( ) 1m k m k d u u k u     (12) the expression for the coefficient 1 k is the same as (10), but the variable 2  is exchanged with 1  ( 2 1   ). the normalized inverter current and capacitor voltage are respectively   ( ) '( ) 2 (1 . . ) ( ) . . ( ) n i s i c d i l i ke c a n f c a f u             (13) 2( ) '( ) 1 2 ( . . . . 
) ( ) ( ) n i i s c d u u ke c c a n c a n f f u                 (14) the average value of the input current (all values are normalized) is 0 2 0 00 1 1 2(2 1) ' '( ) .d d d i l k i i d u n c              (15) the rms value of the inverter current is ' / ' / (2. ) d d i i l u i n     (16) the output characteristic (fig. 1 and fig. 2) is '/ ' d och i i (17) the input characteristic is 1/ ( . ' ) d ich n i   (18) the characteristic of the coefficient of nonlinear distortion (klir – factor) of the inverter current is 2 2 (1) (1) [%] 100. ' ' / 'kf i i i  (19), where i'(m) (m = 1, 3, 5, 7...) is the m th harmonic component of the inverter current. (the mathematical expressions for calculation of the harmonic components in the different circuits and modes of operation are disparate, and rather complicated. they have been found and published in earlier authors’ publications.) from that point on the analysis of the power inverter may be continued without problems in a normalized or in a non normalized form. the steady state mode of power inverters for electro technology applications 45 3. an example with a serial resonant inverter without free wheeling diodes a half bridge circuit (fig. 3) is under study but it can be easily converted into the bridge one. the power losses in the inverter are neglected. the commutation of the power semiconductor devices is instantaneous. a parallel equivalent circuit represents the induction heater. the quality factor of the load circuit is sufficiently high that the voltage across the load has a close to the sine wave shape. a matlab program processes all the mathematical information describing the steady state operation of the inverter in the allowed frequency range. the frequency characteristics of the inverter are obtained and graphically displayed in fig. 4. 
these particular characteristics correspond to a practically implemented inverter (pt1-1002400) with the following data: ud = 500 v; lk = 45 µh; ck = 84 µf; rlr = 0.4739599 Ω; flr = 2083 hz; llr = 8.8717 µh; cl = 657.88 µf; n1 = 0; n2 = 1; f = 1600 - 2600 hz. the first graphic shows the average input current id [a] (solid line). the second graphic shows the maximal device voltage uvsm (solid line). other parameters of the circuit can also be calculated and displayed.

fig. 3 a practical serial resonant inverter without free-wheeling diodes

fig. 4 the inverter frequency characteristics (top: the average input current (a) versus frequency (hz); bottom: the maximal thyristor voltage (v) versus frequency (hz); data cursors at 2196 hz: 200.3 a and 1005 v)

the stability and efficiency of the inverter can be studied from the characteristics. in general there is a minimum of the input current and power, load voltage, serial capacitor voltage and device voltage around the resonance of the load circuit. if the parameters of the induction heater vary during the induction heating process, it is advisable to maintain an almost constant input current (power) at a slight capacitive detuning of the load circuit rl, ll, cl, where the operation is stable, by acting on the controlling frequency. the calculated slope of the frequency characteristic of the input current helps in determining the parameters of the closed-loop automatic control system.

fig. 5 the detailed simulation results at f = 2195 hz (top: the inverter current (a) versus time (s); bottom: the device voltage (v) versus time (s); data cursor at 0.000386 s: 1045 v)

the steady state is simulated by a matlab program based on the method described in [16] for direct determination of the steady state mode (in this case for the discontinuous inverter current mode), which is experimentally confirmed. the results from the frequency characteristics, from the simulation and from experiments are in good agreement according to table 1. that confirms the correctness of the whole study. the detailed simulation results of the power inverter are graphically displayed (fig. 5) for f = 2195 hz. the first diagram shows the inverter current ick [a] (solid line). the second diagram shows the voltage across a device uvs1 [v] (solid line). the simulation results for the same circuit at f = 694 hz are given in fig. 6. they display an interesting case: an atypical but possible operation at an inductive detuning of the load, dominated by the third harmonic component of the inverter current (a case that is difficult to study analytically).

table 1 results for fig. 3

| f [hz] | 694 | 1823 | 1823 | 2018 | 2018 | 2018 | 2083 |
|---|---|---|---|---|---|---|---|
| study | sim. | fr.ch. | sim. | fr.ch. | exper. | sim. | fr.ch. |
| load | ind. | ind. | ind. | ind. | ind. | ind. | res. |
| id [a] | 169.5 | 400.7 | 388.7 | 200 | 200 | 196.5 | 178.2 |
| ul [v] | 200.4 | 304.4 | 303.5 | 215.4 | 217 | 215.8 | 201.7 |
| tq.c. [µs] | 604 | 113.6 | 125.9 | 61 | 65 | 68.3 | 63.8 |
| uckm [v] | 1456 | 1308 | 1270 | 590 | 585 | 580 | 509 |
| uvsm [v] | 1927 | 1342 | 1342 | 866 | 870 | 875 | 863 |

| f [hz] | 2083 | 2195 (recom.) | 2195 (recom.) | 2195 (recom.) | 2394 | 2394 |
|---|---|---|---|---|---|---|
| study | sim. | fr.ch. | exper. | sim. | fr.ch. | sim. |
| load | res. | cap. | cap. | cap. | cap. | cap. |
| id [a] | 180.1 | 199.8 | 200 | 208.2 | 400.6 | 416.3 |
| ul [v] | 206.6 | 211.9 | 218 | 222.1 | 302.2 | 314.1 |
| tq.c. [µs] | 72.5 | 82.9 | 85 | 89.9 | 97.3 | 99.6 |
| uckm [v] | 515 | 542 | 560 | 565 | 996 | 1034 |
| uvsm [v] | 886 | 1003 | 1030 | 1045 | 1636 | 1689 |

4. an example with a serial resonant inverter with free wheeling diodes

a real circuit is given in fig. 7. the inverter frequency characteristics are graphically displayed in fig. 8.
they correspond to an inverter (pt2-50-4000) with: ud=500 v; lk=0.3 mh; ck=4 f; rlr=4 , flr=4000 hz, coslr=0.24254, n1=0, n2=1, f=3000 – 4800 hz. the graphics show: the average input current id [a] (solid line); the load voltage ullm [v] (solid line). for each frequency there is a check whether the shape of the load voltage is close to the sine wave. the increase of the controlling frequency leads to increase of the input current (power), load voltage, peak serial capacitor voltage and rms value of the inverter current and to decrease of the circuit turn-off time. but around the load resonant frequency the character of most functions is opposite (inflexed points) and the changes of parameters are not so large. therefore, if the load rl,ll varies during the induction heating, it is advisable to maintain a resonance of the load circuit rl,ll,cl by influencing the controlling frequency. the results from the frequency characteristics, from the simulation of the steady state mode and from experiments are compared in table 2. they are in good agreement confirming the correctness of the results. the steady state results for f=4000 hz are shown in fig. 8. they are: ick [a] (solid line), ul [v] (solid line). 48 e. popov, n. hinov 0 0.5 1 1.5 2 2.5 3 x 10 -3 -4000 -2000 0 2000 4000 time (s) t h e i n v e rt e r c u rr e n t (a ) 0 0.5 1 1.5 2 2.5 3 x 10 -3 -2000 -1000 0 1000 2000 x: 0.00129 y: 1926 time (s) t h e d e v ic e v o lt a g e ( v ) fig. 6 the detailed simulation results for f 694 hz u d r l c k l k i vd1 vd4 vs1 vs4 vs2 vs3 vd2 vd3 i d l l c l  0  0 fig. 7 a real resonant inverter with free–wheeling diodes 3000 3200 3400 3600 3800 4000 4200 4400 4600 4800 0 50 100 150 200 x: 4000 y: 74.92 frequency (hz) t h e a v e ra g e i n p u t c u rr e n t (a ) 3000 3200 3400 3600 3800 4000 4200 4400 4600 4800 0 200 400 600 800 x: 4000 y: 385.7 frequency (hz) t h e l o a d v o lt a g e ( v ) fig. 
8 the inverter frequency characteristics (upper panel: the average input current (a) versus frequency (hz), marked point x = 4000, y = 74.92; lower panel: the load voltage (v) versus frequency (hz), marked point x = 4000, y = 385.7)

table 2 results for fig. 8
f [hz] | 3200 | 3200 | 3600 | 3600 | 4000 (recom.) | 4000 (recom.) | 4000 (recom.)
par. | fr.ch. | sim. | fr.ch. | sim. | fr.ch. | exper. | sim.
id, a | 16.74 | 16.62 | 88.53 | 88.39 | 74.92 | 75 | 74.34
vl, v | 182.3 | 182.3 | 420.2 | 420.5 | 385.7 | 387 | 385.6
tq.c, µs | 69.83 | 70.31 | 35.98 | 36.88 | 22.33 | 22 | 23.19
vckm, v | 1683 | 1700 | 2129 | 2125 | 1320 | 1325 | 1320
ick, a | 94.20 | 94.58 | 137.7 | 137.8 | 96.77 | 97 | 96.75

f [hz] | 4400 | 4400 | 4800 | 4800
par. | fr.ch. | sim. | fr.ch. | sim.
id, a | 75.81 | 75.45 | 182.0 | 181.7
vl, v | 388.7 | 388.5 | 603.1 | 602.8
tq.c, µs | 31.13 | 31.31 | 24.28 | 24.42
vcsm, v | 1548 | 1551 | 3120 | 3121
is, a | 122 | 122 | 268 | 268

fig. 9 the detailed simulation results at f = 4000 hz (upper panel: the inverter current (a) versus time (0.000001 s), marked point x = 288.8, y = 145.8; lower panel: the load voltage versus time (0.000001 s), marked point x = 291, y = 546.1)

5. conclusions

serial rlc inverters with or without free wheeling diodes for induction heating are investigated. a novel unified interpretation of the electromagnetic processes is applied, based on previously calculated inverter network parameters in a normalized form. the frequency characteristics, the steady-state simulation parameters and the experimental results are obtained and confirm one another, which supports the correctness of the whole study. the requirements for the control system are defined, which is useful for research based on offline simulation, hardware-in-the-loop and rapid prototyping. the main contribution of the work is an approach to the analysis of series rlc dc/ac converters for induction heating. this allows the creation of engineering design methods that are mathematically simple yet sufficiently precise for engineering practice.

references
[1] b. l. dokić, b.
blanuša, power electronics: converters and regulators, third edition, springer international publishing, switzerland, 2015, isbn 978-3-319-09401-4.
[2] e. i. berkovitch, g. v. ivenskiy, yu. s. yoffe, a. t. matchak, v. v. morgun, higher frequency thyristor converters for electro technological units, st. petersburg, energoatomizdat, (1973), 1983 (in russian).
[3] m. k. kazimierczuk and d. czarkowski, resonant power converters, 2nd edition, ieee press and john wiley & sons, new york, ny, 2011, pp. 1-595, isbn 978-0-470-90538-8.
[4] n. mohan, t. m. undeland, w. p. robbins, power electronics: converters, applications, and design, 3rd edition, john wiley & sons, 2003.
[5] a. dominguez, a. otin, l. a. barragan, j. i. artigas, d. navarro, and i. urriza, "frequency-to-output-power transfer function measurement of a resonant inverter for domestic induction heating applications", in proceedings of the ieee annual conference of the industrial electronics society iecon13, vienna, austria, 2013, pp. 5032-5037.
[6] e. popov, analysis, modeling and design of converter units (computer-aided design of power electronic circuits), technical university printing house, sofia, 2005 (in bulg.), chapters 2-3, pp. 59-80.
[7] o. lucía, p. maussion, e. dede, and j. m. burdío, "induction heating technology and its applications: past developments, current technology, and future challenges", ieee transactions on industrial electronics, vol. 61, pp. 2509-2520, may 2014.
[8] a. dominguez, l. barragan, j. artigas, a. otin, i. urriza, d. navarro, "reduced-order models of series resonant inverters in induction heating applications", ieee transactions on power electronics, vol. 32, no. 3, pp. 2300-2311, march 2017.
[9] a. dominguez, l. a. barragan, j. i. artigas, a. otin, i. urriza, and d.
navarro, "reduced-order model of a half-bridge series resonant inverter for power control in domestic induction heating applications", in proceedings of the ieee international conference on industrial technology (icit), 2015, pp. 2542-2547.
[10] y. yin, r. zane, r. erickson, and j. glaser, "direct modelling of envelope dynamics in resonant inverters", electronics letters, vol. 40, pp. 834-836, 2004.
[11] d. maksimovic, a. m. stankovic, v. j. thottuvelil, and g. c. verghese, "modeling and simulation of power electronic converters", proceedings of the ieee, vol. 89, pp. 898-912, 2001.
[12] f. h. dupont, c. rech, r. gules, and j. r. pinheiro, "reduced-order model and control approach for the boost converter with a voltage multiplier cell", ieee transactions on power electronics, vol. 28, pp. 3395-3404, 2013.
[13] j. sun and h. grotstollen, "averaged modeling and analysis of resonant converters", in proceedings of the 24th annual ieee power electronics specialists conference pesc '93, 1993, pp. 707-713.
[14] s. tian, f. c. lee, and q. li, "a simplified equivalent circuit model of series resonant converter", ieee transactions on power electronics, vol. 31, pp. 3922-3931, 2016.
[15] y. yin, r. zane, r. erickson, and j. glaser, "direct modeling of envelope dynamics in resonant inverters", in proceedings of the 34th annual ieee power electronics specialists conference pesc '03, 2003, vol. 3, pp. 1313-1318.
[16] e. i. popov, "direct determination of the steady-state mode in autonomous inverters", scientific journal electrical engineering and electronics e+e, sofia, bulgaria (in bulg.), vol. 11-12, 2004.

facta universitatis series: electronics and energetics vol. 28, no. 4, december 2015, pp. 625-636 doi: 10.2298/fuee1504625m

analysis of a square coaxial line with anisotropic substrates by strong fem formulation

žaklina j. mančić 1 , vladimir v.
petrović 2

1 university of niš, faculty of electronic engineering of niš, serbia
2 tes electronic solutions gmbh, stuttgart, germany

abstract. in this paper the concept of the strong finite element method (fem) formulation is explained first. next, a brief review is given of the strong basis functions used for quasi-static analysis of transmission lines with a piecewise homogeneous anisotropic medium. as numerical examples, effective relative permittivities of square coaxial lines with two anisotropic layers, or with one isotropic and one anisotropic layer, are calculated using the galerkin version of the strong fem formulation. high accuracy of the method is demonstrated for layer thicknesses ranging from 0 to 100% of the transmission line height. it is also shown that, for the half-filled line, the effective relative permittivity computed by fem is practically equal to the one obtained by a simple formula.

key words: strong fem formulation, quasi-static analysis, anisotropic dielectric, square coaxial line.

1. introduction

the finite element method is one of the most frequently used, multidisciplinary methods for calculating electromagnetic (em) fields. fem belongs to the group of numerical methods used for approximate solution of boundary value problems in mathematical physics (partial differential equations of order two or higher, with given boundary conditions). almost all of the available literature uses the weak fem formulation [1]–[3]. the weak formulation is based on basis functions that are not in the domain of the original differential operator (usually of the second order). furthermore, low-order approximations (i.e. of the first order) are used most often. on the other hand, only a few published papers deal with the strong fem formulation. the first studies in this research area [4]–[6] have shown that the strong formulation may have certain advantages over the weak formulation.
these advantages are reflected in conceptual simplicity, a simpler and more natural inclusion of boundary conditions and an inherently higher order of approximation (the lowest order being three). a convenient choice of basis functions satisfies both boundary conditions and results in a much smaller number of unknowns for the same approximation order.

received december 19, 2014; received in revised form may 4, 2015
corresponding author: žaklina j. mančić, faculty of electrical engineering, university of niš, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: zjmancic@gmail.com)

for the numerical examples in this paper, square coaxial lines with anisotropic layers are chosen, as a special case of shielded planar lines. shielded planar transmission lines with anisotropic dielectrics have been the subject of previous research, in which their propagation characteristics were calculated by various numerical and analytical methods. in [7] rectangular coaxial lines with homogeneous anisotropic dielectric were analyzed and their capacitance was calculated using an expanded charge simulation method and affine transformations. in [8] an analytical technique (the spectral domain technique in a discrete fourier variable) is presented for quasi-static analysis of rectangular lines with homogeneous and inhomogeneous anisotropic dielectric. in [9] several method classes (quasi-static, dynamic, empirical; analytical, numerical) and a number of methods (method of moments, finite differences method, transmission-line matrix technique, modified wiener-hopf method, fourier series techniques, method of lines) were discussed and applied to the calculation of the propagation characteristics of several typical planar structures. that paper, however, does not mention fem. in [10] the weak fem formulation is applied to the analysis of square coaxial lines with inhomogeneous anisotropic dielectric.
in [4], [6] and [11] the strong fem formulation for an anisotropic medium is presented and applied to the analysis of shielded transmission lines with homogeneous anisotropic dielectric. its high accuracy is demonstrated by comparison with results obtained by other numerical methods and by commercial software. this paper aims to generalize the strong fem formulation from an anisotropic homogeneous medium to an anisotropic inhomogeneous (piecewise homogeneous) medium and to apply the method to the calculation of the effective permittivity of such square coaxial lines. to the best of the authors' knowledge, there are no published papers on the strong fem formulation applied to those structures. as an example of a simple geometry, square coaxial lines are not only an excellent benchmark for numerical methods, but are also advantageous for practical measurements of constitutive parameters of anisotropic materials [12].

2. strong and weak form of boundary value problem, strong and weak formulation

let the computational domain $\Omega$ be filled with an anisotropic (possibly inhomogeneous) medium of parameters $\boldsymbol{\varepsilon}$ and $\beta$ and bounded by $\Gamma = \Gamma_1 \cup \Gamma_2$. in $\Omega$ and on $\Gamma$, let the following differential equation and boundary conditions be given:

$$-\nabla\cdot(\boldsymbol{\varepsilon}\nabla f) + \beta f = g, \tag{1}$$

$$f = V_0 \quad \text{on } \Gamma_1, \tag{2}$$

$$\mathbf{n}\cdot(\boldsymbol{\varepsilon}\nabla f) = A_{n0} \quad \text{on } \Gamma_2. \tag{3}$$

in (1) f denotes the unknown function, the function g represents sources, i.e. excitations, and $V_0$ and $A_{n0}$ are known values on the boundary. if the domain is spatial (3-d), it is bounded by surfaces. if it is a surface (2-d), it is bounded by lines (contours). by introducing the vector $\mathbf{A} = \boldsymbol{\varepsilon}\,\mathrm{grad} f$, differential equation (1) is written in the form $-\mathrm{div}\,\mathbf{A} + \beta f = g$. in a 3-d case and for a diagonal tensor $\boldsymbol{\varepsilon} = \mathrm{diag}[\varepsilon_{xx}, \varepsilon_{yy}, \varepsilon_{zz}]$, equation (1) can be written in the cartesian coordinate system in a compact operator form,

$$Lf = g, \tag{4}$$

where the operator L is defined by

$$L = -\frac{\partial}{\partial x}\Bigl(\varepsilon_{xx}\frac{\partial}{\partial x}\Bigr) - \frac{\partial}{\partial y}\Bigl(\varepsilon_{yy}\frac{\partial}{\partial y}\Bigr) - \frac{\partial}{\partial z}\Bigl(\varepsilon_{zz}\frac{\partial}{\partial z}\Bigr) + \beta.$$
(5)

expression (2) represents the dirichlet boundary condition, whereas expression (3) represents the neumann boundary condition. if the parameters of the medium change abruptly on the surface (in 3-d problems) or on the contour (in 2-d problems) that separates media 1 and 2, both the dirichlet and the neumann boundary conditions must be satisfied on the boundary between the media, i.e. continuity of both the function f, $f_1 = f_2$, and its generalized first derivative, $\mathbf{n}\cdot\mathbf{A}_1 = \mathbf{n}\cdot\mathbf{A}_2$ ($\mathbf{n}\cdot(\boldsymbol{\varepsilon}_1\nabla f_1) = \mathbf{n}\cdot(\boldsymbol{\varepsilon}_2\nabla f_2)$), where $\mathbf{n}$ is the unit vector normal to the boundary and directed into medium 1. in the literature [1]–[3], the problem defined by (1)–(3), which contains a differential equation with boundary conditions, is referred to as the strong form of a given boundary value problem. in the given case, the strong form contains a second derivative. the solution of equations (1)–(3) is sought in the form of an approximation function, usually a polynomial with initially unknown coefficients. in general, mathematical functions can exhibit different orders of continuity: $C^0$ denotes continuity of the function itself, $C^1$ denotes continuity of both the function and its first derivative(s). in general, $C^m$ continuity denotes continuity of the function and its derivatives up to the m-th order. when the problem is described by a system of partial differential equations, the strong formulation requires that the approximation function be in the domain of the original differential operator, i.e. of the operator in the strong form, throughout the entire domain $\Omega$ [13]. this means continuity of both the function and its derivatives up to the order one less than that of the original differential operator [3]. the strong form requires the strong formulation. in case of the approximation function in equation (1), or alternatively eq.
(4), we introduce the concept of a generalized $C^1$ continuity, as the continuity of the expression $\varepsilon_{xx}(\partial f/\partial x)$ with respect to x, $\varepsilon_{yy}(\partial f/\partial y)$ with respect to y and $\varepsilon_{zz}(\partial f/\partial z)$ with respect to z, so that $\nabla\cdot(\boldsymbol{\varepsilon}\nabla f)$ is regular. this also implies that both boundary conditions (continuity of f and of $\mathbf{n}\cdot\mathbf{A}$) must be satisfied on every boundary inside $\Omega$, including boundaries between finite elements, if the problem is solved by fem. the approximation function is represented as a sum of basis functions multiplied by initially unknown coefficients. a sufficient condition for the approximation function to be a generalized $C^1$ function, i.e. for the formulation to be strong, is that the basis functions are generalized $C^1$ functions. such functions are called strong basis functions for a given problem, both in a homogeneous (constant $\boldsymbol{\varepsilon}$ and $\beta$) and in an inhomogeneous ($\boldsymbol{\varepsilon}(x,y,z)$, $\beta(x,y,z)$) medium [4],[6].

in order to make problems easier to solve, the conditions of the strong form are commonly weakened through integration, which results in a wider class of possible approximate solutions. an arbitrary test (weighting) function w and the integral of the weighted residual of equation (1),

$$\int_\Omega w\,\bigl(-\nabla\cdot(\boldsymbol{\varepsilon}\nabla f) + \beta f - g\bigr)\,\mathrm{d}\Omega = 0, \tag{6}$$

are introduced. the first term inside the integral (the term that contains the second derivative) can be transformed by the use of the gauss-ostrogradski theorem [1],[14],

$$\int_\Omega w\,\nabla\cdot(\boldsymbol{\varepsilon}\nabla f)\,\mathrm{d}\Omega = \int_\Omega w\,\mathrm{div}\,\mathbf{A}\,\mathrm{d}\Omega = \int_\Omega \bigl(\mathrm{div}(w\mathbf{A}) - \mathbf{A}\cdot\nabla w\bigr)\,\mathrm{d}\Omega = \oint_\Gamma w\,\mathbf{A}\cdot\mathrm{d}\boldsymbol{\Gamma} - \int_\Omega \mathbf{A}\cdot\nabla w\,\mathrm{d}\Omega = \oint_\Gamma w\,(\boldsymbol{\varepsilon}\nabla f)\cdot\mathrm{d}\boldsymbol{\Gamma} - \int_\Omega (\boldsymbol{\varepsilon}\nabla f)\cdot(\nabla w)\,\mathrm{d}\Omega, \tag{7}$$

so that equation (6) can now be written in the form

$$-\oint_\Gamma w\,(\boldsymbol{\varepsilon}\nabla f)\cdot\mathrm{d}\boldsymbol{\Gamma} + \int_\Omega \bigl((\boldsymbol{\varepsilon}\nabla f)\cdot(\nabla w) + \beta w f - w g\bigr)\,\mathrm{d}\Omega = 0, \tag{8}$$

where only first derivatives exist. equation (8) represents the weak form. instead of the original operator L, which requires $C^1$ continuity, the weak form uses an extended operator (not explicitly expressed) that requires only $C^0$ continuity. this means that on interelement boundaries only $C^0$ continuity is required. such an approximation is the weak formulation.
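the following is a minimal 1-d sketch of what the weak form permits, not the authors' code. it assumes a hypothetical model problem, $-u'' = 1$ on [0, 1] with $u(0) = u(1) = 0$, and uses piecewise-linear ($C^0$) hat functions as both basis and test functions; the function name `solve_weak_1d` is introduced only for this illustration.

```python
import numpy as np

def solve_weak_1d(n_el=4):
    """Galerkin solution of -u'' = 1 on [0, 1] with u(0) = u(1) = 0.

    Hat (piecewise-linear, C0) functions serve as basis and test functions,
    so only first derivatives appear in the integrals (weak form).
    """
    h = 1.0 / n_el
    n_nodes = n_el + 1
    K = np.zeros((n_nodes, n_nodes))
    G = np.zeros(n_nodes)
    for e in range(n_el):
        idx = [e, e + 1]
        # element integrals of f_i' f_j' and of f_i * g (with g = 1)
        K[np.ix_(idx, idx)] += np.array([[1.0, -1.0], [-1.0, 1.0]]) / h
        G[idx] += h / 2.0
    inner = np.arange(1, n_nodes - 1)      # Dirichlet nodes are eliminated
    u = np.zeros(n_nodes)
    u[inner] = np.linalg.solve(K[np.ix_(inner, inner)], G[inner])
    x = np.linspace(0.0, 1.0, n_nodes)
    slopes = np.diff(u) / h                # constant du/dx inside each element
    jumps = np.diff(slopes)                # discontinuity of du/dx at interior nodes
    return x, u, jumps

x, u, jumps = solve_weak_1d(4)
print("nodal values:", u)
print("derivative jumps at interior nodes:", jumps)
```

in this sketch the nodal values coincide with the exact solution x(1 - x)/2, while the first derivative jumps by -h at every interior node: the approximation is continuous but its derivative is not, which is all that a $C^0$ formulation enforces.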
in weak formulations the approximation functions have continuity one order lower than in the strong formulation. in the weak formulation, the boundary condition for the normal component of the vector $\mathbf{A}$ is not satisfied exactly. it is satisfied in an approximate sense, through the weighted residuals. this introduces artificial charges at interelement boundaries. details can be found in [13]. both strong and weak formulations can be applied in the weak form.

3. basic fem methodology for a 2-d case and an anisotropic dielectric

let now the domain $\Omega$ be two-dimensional, e.g., uniform with respect to the z-axis, filled with a linear, anisotropic dielectric without free charges, in which the distribution of the electrostatic potential, V(x,y), is the unknown function. let the dielectric be discontinuously inhomogeneous. in this case $\beta = 0$ and the differential equation for the potential V is

$$\mathrm{div}_s(\boldsymbol{\varepsilon}\,\mathrm{grad}_s V) = 0, \tag{9}$$

where $\mathrm{div}_s$ and $\mathrm{grad}_s$ are the surface divergence and surface gradient, respectively. in 2-d problems, and for the particular choice of coordinate axes along the crystallographic axes, $\boldsymbol{\varepsilon} = \mathrm{diag}[\varepsilon_{xx}, \varepsilon_{yy}]$ [9]. in this paper we consider only such (diagonal) permittivity tensors. the computational domain is divided into M finite elements. let the elements be rectangular and homogeneous, so that the lines of discontinuity of $\boldsymbol{\varepsilon}$ coincide with interelement boundaries. the exact solution V(x,y) is replaced by an approximate solution expressed as a linear combination of basis functions with unknown coefficients,

$$V \approx f = \sum_{j=1}^{N} a_j f_j.$$
following the galerkin procedure [1], the system of linear algebraic equations for the unknown coefficients is obtained,

$$[K_{ij}][a_j] = [G_i], \quad i, j = 1, \ldots, N, \tag{10}$$

$$K_{ij} = \int_S (\mathrm{grad}\, f_i)\cdot(\boldsymbol{\varepsilon}\,\mathrm{grad}\, f_j)\,\mathrm{d}S, \qquad G_i = \int_{\Gamma_2} f_i\, D_{n0}\,\mathrm{d}l, \tag{11}$$

where $D_{n0} = \varepsilon_{xx,yy}(\partial V/\partial n)$ is a known normal component of the electric induction vector $\mathbf{D}$ on the contour $\Gamma_2$, with indices xx and yy corresponding to the orientation of the unit normal $\mathbf{n}$, i and j are global indices of basis functions and S is the union of all the finite element surfaces, $S = \bigcup_{e=1}^{M} S_e$. the approximate solution f is obtained by solving the system of equations (10) [5]. the matrix of this system, $[K_{ij}]$, is a sparse matrix. such systems are usually solved by specialized computer routines for sparse systems (in this paper we used [15]).

4. strong basis functions for anisotropic medium

unlike conventional node-based functions [1]–[3], we will use basis functions that are not node-based and are of the strong type. 2-d basis functions for the strong formulation are obtained by multiplying pairs of 1-d basis functions for the strong formulation [16],

$$f_k(u) = \frac{1}{4}\begin{cases} (u-1)^2 (u+2), & k = 1, \\ \dfrac{l_e}{2}\,(u-1)^2 (u+1), & k = 2, \\ (u-1)^2 (u+1)^{k-1}, & k = 3, \ldots, n-1, \\ (u+1)^2 (2-u), & k = n, \\ \dfrac{l_e}{2}\,(u+1)^2 (u-1), & k = n+1. \end{cases} \tag{12}$$

in the 2-d strong formulation all basis functions are products of two 1-d basis functions of the two orthogonal coordinates u and v. thus, continuity of the function's first derivative ($C^1$ continuity) on all the boundaries between elements is automatically satisfied. the basis functions in this case (except the first two, for j = 1, j = 2, and the last two, for j = n, j = n + 1) are polynomials of different orders and as such they are linearly independent. the order of the remaining four polynomials is 3, but it is easy to show that they are linearly independent, i.e. none of them is a linear combination of the others.
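the end-point properties behind this statement can be checked numerically. the sketch below is only an illustration under stated assumptions: it takes the 1-d functions in a hermite-type form on the local coordinate u in [-1, 1], with $l_e$ assumed to be the element length and x = $l_e$(u + 1)/2; these polynomial forms are a reading of (12) and should be checked against [16]. the names `strong_basis_1d` and `end_data` are hypothetical helpers.

```python
import numpy as np
from numpy.polynomial import Polynomial

def strong_basis_1d(n, l_e=2.0):
    """1-d basis polynomials f_1 ... f_{n+1} on the local coordinate u in [-1, 1].

    n is the approximation order (n >= 3), l_e the assumed element length.
    The polynomial forms are an assumed reading of the source formula.
    """
    u = Polynomial([0.0, 1.0])
    f = [0.25 * (u - 1) ** 2 * (u + 2),               # k = 1: value 1 at u = -1
         0.125 * l_e * (u - 1) ** 2 * (u + 1)]        # k = 2: d/dx = 1 at u = -1
    f += [0.25 * (u - 1) ** 2 * (u + 1) ** (k - 1)    # k = 3 ... n-1: zero value and
          for k in range(3, n)]                       # zero slope at both ends
    f += [0.25 * (u + 1) ** 2 * (2 - u),              # k = n: value 1 at u = +1
          0.125 * l_e * (u + 1) ** 2 * (u - 1)]       # k = n+1: d/dx = 1 at u = +1
    return f

def end_data(f, l_e):
    """Return [f(-1), f(+1), df/dx(-1), df/dx(+1)] for x = l_e * (u + 1) / 2."""
    d = f.deriv()
    s = 2.0 / l_e                                     # du/dx
    return [f(-1.0), f(1.0), s * d(-1.0), s * d(1.0)]

l_e = 0.5
table = np.array([end_data(f, l_e) for f in strong_basis_1d(3, l_e)])
print(np.round(table, 12))
```

for n = 3 each of the four cubic functions switches on exactly one of the four end quantities (value or derivative at either end), so the 4 x 4 table of end data is a permutation of the identity matrix, which is one way to see that the four cubics are linearly independent.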
in the strong formulation of 2-d problems the minimal order of the basis functions is 3. a complete set of 2-d strong basis functions consists of singlets (basis functions defined over a single finite element), doublets (basis functions defined over two adjacent elements) and quadruplets (basis functions defined over a maximum of four adjacent elements that have a common node) [5], [17]. fig. 1 shows one singlet and two kinds of doublets in the x-direction, respectively. analogously, there are two kinds of doublets in the y-direction. the d1-x doublet provides $C^0$ continuity, whereas the d2-x doublet provides $C^1$ continuity. fig. 2 shows four kinds of quadruplets, where q1 provides $C^0$ continuity, whereas q2-x and q2-y provide $C^1$ continuity in the x and y direction, respectively.

fig. 1 (a) singlet, (b) d1-x doublet, (c) d2-x doublet

fig. 2 quadruplets: (a) q1, (b) q2-x, (c) q2-y, (d) q3

two-dimensional strong basis functions for anisotropic media can be formed as [6]

$$f_j(u,v) = f_{k,l}(u,v) = F^e_{k,l}\, f_k(u)\, f_l(v), \tag{13}$$

where $F^e_{k,l}$ is a constant factor defined within the e-th element, which provides continuity of $D_n$ across the interelement boundaries (while retaining continuity of the potential V over the boundaries). this factor is required only if k and/or l are equal to 2 or n + 1. the generalized continuity of strong basis functions follows from this condition. for a piecewise homogeneous anisotropic dielectric, a complete set of strong basis functions can be applied under certain conditions. we consider here a case where the dielectric is homogeneous in the x-direction and piecewise homogeneous in the y-direction (fig. 3). let us find which basis functions (singlets, doublets and quadruplets) can be accepted in this case. all the functions that have zero derivative at their boundaries can be accepted automatically, as they do not participate in the neumann boundary condition.
these are all the singlets, doublets d1-x and d1-y and the quadruplet q1. doublet d2-x is defined in the homogeneous region, so no modification is needed; it is directly accepted into the set of basis functions. doublet d2-y spans two homogeneous elements of different $\boldsymbol{\varepsilon}$. in order to be accepted into the collection of strong basis functions, it must provide continuity of $D_n$ between the two elements. one way to establish this is to multiply it by the factor $F^e_{k,l} = 1/\varepsilon^e_{ryy}$, which is different for the two elements [6]. (relative permittivity is used here for simplicity.) for the quadruplet q2-x, the sufficient condition $\varepsilon^1_{rxx}/\varepsilon^2_{rxx} = \varepsilon^3_{rxx}/\varepsilon^4_{rxx}$ is automatically satisfied, as both fractions are equal to 1, and no additional factor is needed. for the quadruplet q2-y, the sufficient condition $\varepsilon^1_{ryy}/\varepsilon^3_{ryy} = \varepsilon^2_{ryy}/\varepsilon^4_{ryy}$ is also automatically satisfied. however, now the factors are needed, as the direction of the derivative is across the boundary between two different media. the simplest choice is $F^e_{k,l} = 1/\varepsilon^e_{rxx}$ for all the four elements in fig. 3. finally, for the quadruplet q3 the sufficient condition is $(\varepsilon^1_{rxx}\varepsilon^2_{ryy})/(\varepsilon^2_{rxx}\varepsilon^1_{ryy}) = (\varepsilon^3_{rxx}\varepsilon^4_{ryy})/(\varepsilon^4_{rxx}\varepsilon^3_{ryy})$. as both fractions are equal to one, this condition is satisfied. then, the simplest choice is $F^e_{k,l} = 1/\varepsilon^e_{ryy}$. with these additional factors all the singlets, doublets and quadruplets are approved to participate in the set of strong basis functions. in the next step (enforcing dirichlet boundary conditions on the conductor surfaces), some of those doublets and quadruplets are simply omitted from the set. the remaining basis functions enter the galerkin procedure to determine the unknown coefficients.

fig.
3 two-layer anisotropic medium divided into four finite elements: $e_1$, $e_2$, $e_3$ and $e_4$

in this way it is shown that when the boundary lines between domains of different parameters are straight, mutually parallel lines, the complete set of strong basis functions can be applied.

5. numerical results

applying the galerkin version of fem [4]–[6],[10],[18] with the complete set of strong basis functions, we have calculated the effective relative permittivity, $\varepsilon_{re}$, of the square coaxial lines shown in the inset of fig. 4. we verified the results by comparing those obtained by the strong fem formulation with those obtained by the weak fem formulation from [10], by the other available software femm [19] and, in special cases, with simple analytical formulas. fig. 4 shows the dependence of $\varepsilon_{re}$ on the relative height of a dielectric layer, h/b, above which is air, for three anisotropic dielectrics. fig. 5 shows the same dependence for the case where both layers are dielectrics; in three cases both dielectrics are anisotropic and in one case a combination of isotropic and anisotropic dielectrics is applied.

fig. 4 effective relative permittivity as a function of the dielectric relative height, h/b, for three different anisotropic dielectrics over which is air, obtained by using strong fem formulation, weak fem formulation [10] and the commercial software femm [19], for b/a = 3

fig.
5 effective relative permittivity $\varepsilon_{re}$ as a function of the dielectric relative height, h/b, for three different cases of double-layered anisotropic dielectrics and one example of isotropic-anisotropic dielectrics, for b/a = 3

these dependencies are shown for the following anisotropic dielectrics*: boron nitride, bn ($\varepsilon_{rxx}$ = 3.4, $\varepsilon_{ryy}$ = 5.12), sapphire ($\varepsilon_{rxx}$ = 9.4, $\varepsilon_{ryy}$ = 11.6), epsilam 10 ($\varepsilon_{rxx}$ = 13, $\varepsilon_{ryy}$ = 10.3), α-quartz ($\varepsilon_{rxx}$ = 4.52, $\varepsilon_{ryy}$ = 4.637) and ptfe glass ($\varepsilon_{rxx}$ = 2.15, $\varepsilon_{ryy}$ = 2.34). these dielectrics are uniaxial crystals, for which cutting the layers perpendicularly to the axis of symmetry (optical axis, here the y-axis) results in a diagonal permittivity tensor [9]. the relative height of the dielectric substrate, h/b, is varied from 0 to 1. excellent mutual agreement of all the groups of results can be observed on both diagrams.

another comparison can also be made. if the line is completely filled with an anisotropic dielectric, the effective relative permittivity is $\bar{\varepsilon}_{re} \approx (\varepsilon_{rxx} + \varepsilon_{ryy})/2$. here, the bar over $\varepsilon_{re}$ denotes both that it is an average value and that it is obtained by the formula and not by fem. applying this expression, the effective relative permittivity is $\bar{\varepsilon}_{re}$ = 11.65 for epsilam 10, $\bar{\varepsilon}_{re}$ = 4.26 for bn, $\bar{\varepsilon}_{re}$ = 10.5 for sapphire, $\bar{\varepsilon}_{re}$ = 4.5787 for α-quartz and $\bar{\varepsilon}_{re}$ = 2.245 for ptfe glass. differences between the values obtained by the strong fem formulation and by the formula, although both groups of values are approximate, were found to be less than 0.5% (see results from fig. 5, h/b = 0.0, 1.0). for the half-filled line (h/b = 0.5) with two anisotropic dielectrics (or with one isotropic and one anisotropic dielectric), the approximate formula

$$\bar{\varepsilon}_{re} \approx \frac{\varepsilon_{rxx1} + \varepsilon_{ryy1} + \varepsilon_{rxx2} + \varepsilon_{ryy2}}{4} \tag{14}$$

can be used.
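as a quick arithmetic check of the two averaging formulas, the short sketch below evaluates them for the permittivity values quoted above. the dictionary and function names are introduced only for this illustration; the entry "isotropic 13" stands for an isotropic dielectric with relative permittivity 13, as used in one of the half-filled cases.

```python
# relative permittivities (eps_rxx, eps_ryy) quoted in the text
MATERIALS = {
    "bn": (3.4, 5.12),
    "sapphire": (9.4, 11.6),
    "epsilam 10": (13.0, 10.3),
    "ptfe glass": (2.15, 2.34),
    "isotropic 13": (13.0, 13.0),
}

def eps_re_full(name):
    """Average formula for a line completely filled with one dielectric."""
    eps_xx, eps_yy = MATERIALS[name]
    return (eps_xx + eps_yy) / 2.0

def eps_re_half(name1, name2):
    """Approximate formula (14) for a line half-filled with each of two dielectrics."""
    return (sum(MATERIALS[name1]) + sum(MATERIALS[name2])) / 4.0

for name in ("epsilam 10", "bn", "sapphire", "ptfe glass"):
    print(name, round(eps_re_full(name), 4))

pairs = [("sapphire", "epsilam 10"), ("bn", "sapphire"), ("epsilam 10", "isotropic 13")]
for pair in pairs:
    print(pair, round(eps_re_half(*pair), 4))
```

the three half-filled pairs are the combinations that the paper compares with fem results; the formula values obtained this way are 11.075, 7.38 and 12.325.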
table 1 shows the relative permittivity of the square coaxial transmission line half-filled with two anisotropic dielectrics, $\varepsilon_{re}^{\mathrm{strong\,fem}}$ (h/b = 0.5), compared with $\bar{\varepsilon}_{re}$ calculated from the above formulas. excellent agreement can be observed. (derivations of the above formulas are given in appendix iii.)

table 1 relative permittivity of the square coaxial transmission line half-filled with two anisotropic dielectrics or one anisotropic and one isotropic dielectric
dielectrics | sapphire / epsilam 10 | bn / sapphire | epsilam 10 / isotropic dielectric, $\varepsilon_r$ = 13
$\varepsilon_{re}^{\mathrm{strong\,fem}}$ (h/b = 0.5) | 11.0786 | 7.4004 | 12.3249919
$\bar{\varepsilon}_{re}$ | 11.075 | 7.38 | 12.325
difference (%) | 0.033 | 0.276 | 0.0008

the numbers of unknowns required for an accuracy better than 0.5% can also be compared. for the weak fem formulation the number of (rectangular) mesh elements was 512 and the number of unknowns was 4416 [10]. for the strong fem formulation in this paper and the same mesh of 512 elements the number of unknowns is 1472. the order of basis functions was nx = ny = 3 for both formulations. for the low-order weak formulation [18] the number of (triangular) mesh elements was between 5895 and 6019 and the number of unknowns between 3097 and 3177.

* permittivity values taken from [9] and [20].

6. conclusion

in this paper the strong fem formulation for a piecewise homogeneous anisotropic medium with a diagonal permittivity tensor was defined for closed quasi-static 2-d problems. it is shown that for such a medium the complete set of strong basis functions (singlets, doublets and quadruplets) can be used. the square coaxial transmission line has been analyzed by the presented method. the results obtained for the effective relative permittivity of the line partly or completely filled with anisotropic dielectric have shown that the strong fem formulation of the third order is exceptionally accurate.
calculated values are found to be in excellent agreement (better than 0.5%) with those obtained by the other available software and by simple approximate formulas. for the same accuracy, the number of unknowns for the strong formulation was less than one half of the number of unknowns required for the weak formulation. the practical scope of the method is the analysis of all closed (shielded) planar transmission lines with anisotropic dielectric whose permittivity tensor is diagonal. perspectives of the method are its generalizations to 3-d and open problems.

acknowledgement: the paper is a part of the research done within the project tr-32052 supported by the serbian ministry of science.

appendix i. possibility of extension of the method to a non-diagonal permittivity tensor

the presented strong fem formulation is based on basis functions that automatically satisfy both boundary conditions (for $E_t$ and $D_n$) at interelement boundaries, provided that permittivity discontinuities coincide with those boundaries. let us examine, for example, the doublet d1-x (fig. 1b) in this regard. this doublet should provide both a nonzero function value and a zero normal component of the field at the interelement boundary. for non-diagonal permittivity tensors of the two elements forming a doublet, $\boldsymbol{\varepsilon}^e = [\varepsilon^e_{xx}\ \varepsilon^e_{xy};\ \varepsilon^e_{yx}\ \varepsilon^e_{yy}]$, e = 1, 2, continuity of the function along the boundary gives $E_{1t} = E_{2t}$ and the doublet property gives $E_{1n} = E_{2n} = 0$. next, $D_{1n} = \varepsilon^1_{xx}E_{1n} + \varepsilon^1_{xy}E_{1t}$ and $D_{2n} = \varepsilon^2_{xx}E_{2n} + \varepsilon^2_{xy}E_{2t}$. as, in general, $\varepsilon^1_{xy} \neq \varepsilon^2_{xy}$, it follows that $D_{1n} \neq D_{2n}$, so in this case the doublet does not satisfy the boundary condition for $D_n$. multiplying the parts of the doublet inside each of the two elements by different factors in order to satisfy the boundary condition for $D_n$ would ruin the boundary condition for $E_t$. the same reasoning is valid for the quadruplet q1 (fig. 2a). without doublets d1-x, d1-y and quadruplet q1, approximation is not possible, as, e.g.
the function value will be forced to zero in nodes. thus, the presented method is not applicable in this case. this, however, is not a significant shortcoming, as anisotropic substrates are in practice often cut perpendicularly to their optical axis, which results in a diagonal permittivity tensor.

appendix ii. possibility of extension of the method to arbitrary shaped 2-d geometries

the presented strong fem formulation is based on basis functions that automatically satisfy both boundary conditions at interelement boundaries which, by themselves, should coincide with u or v coordinate lines. also, the permittivity tensor must be diagonal with respect to the applied coordinate system. thus, any deformation (transformation (x, y) = f(u, v)) of originally straight u-v coordinate lines is possible if it (1) preserves mutual orthogonality of the lines, (2) provides that discontinuities of the dielectric coincide with either u- or v-lines and (3) preserves the diagonal property of $\boldsymbol{\varepsilon}$ ($\boldsymbol{\varepsilon} = \mathrm{diag}[\varepsilon_{uu}, \varepsilon_{vv}]$). e.g., for a coaxial line with isotropic dielectric and dielectric discontinuities either along the radial or along the angular coordinate of a polar coordinate system, the u-v mesh (and the corresponding finite elements) that coincides with polar coordinate lines enables the strong fem formulation. for anisotropic dielectric, however, the diagonal property of $\boldsymbol{\varepsilon}$ in curved u-v coordinates is very unlikely to hold in practice, so the presented fem formulation is practically not possible for curved geometries with anisotropic dielectric.

appendix iii. on the explicit formulas for effective permittivity used in this paper

the distribution of the electrostatic potential, V(x,y), inside a tem line with homogeneous isotropic dielectric is independent of its permittivity, as it is the solution of the laplace equation, $\Delta V = 0$.
for a square coaxial line, due to symmetry, $E_n = 0$ also holds along the x- and y-axes (according to the coordinate system shown in the inset of fig. 4), where $E_n$ is the component normal to the axis. therefore, if this line is half-filled with isotropic dielectrics, the potential distribution inside each of the two dielectrics is the same as if the whole line were filled with that dielectric and is independent of the permittivities, so its capacitance per unit length is $C' = (C'_1 + C'_2)/2$, where $C'_{1,2}$ are the values for the cases when the line is completely filled with dielectric 1 or 2. from this it follows that the effective relative permittivity is $\varepsilon_{re} = (\varepsilon_{r1} + \varepsilon_{r2})/2$.

for a tem line with homogeneous anisotropic dielectric with a diagonal permittivity tensor, the distribution V(x,y) is the solution of the equation $\varepsilon_{rxx}\,\partial^2 V/\partial x^2 + \varepsilon_{ryy}\,\partial^2 V/\partial y^2 = 0$, so it depends on $\varepsilon_{rxx}$ and $\varepsilon_{ryy}$ (more precisely, on their ratio). in the case of the square coaxial line, $E_n = 0$ again holds along the x- and y-axes due to symmetry, but we note that V(x,0) and V(0,y) now depend on $\varepsilon_{rxx}$ and $\varepsilon_{ryy}$. for a given ratio b/a, the capacitance per unit length of the line is $C' = f(\varepsilon_{rxx}, \varepsilon_{ryy}) = f(\varepsilon_{ryy}, \varepsilon_{rxx})$. after the change of variables $p = \varepsilon_{rxx} + \varepsilon_{ryy}$, $q = \varepsilon_{rxx} - \varepsilon_{ryy}$, this transforms to $C' = g(p,q) = g(p,-q)$. in the special case of the isotropic dielectric (q = 0), $g(p,0) = \varepsilon_{rxx}C'_0 = pC'_0/2$, where $C'_0$ is the capacitance per unit length of the air-filled line. since g is even in q, its taylor series around q = 0 for arbitrary p has no linear term in q, so up to linear terms $g(p,q) \approx g(p,0)$. thus, for small $|\varepsilon_{rxx} - \varepsilon_{ryy}|$, $\varepsilon_{re} = g(p,q)/C'_0 \approx p/2 = (\varepsilon_{rxx} + \varepsilon_{ryy})/2$.

for the same line, but half-filled with two anisotropic dielectrics with diagonal permittivity tensors, the potential distribution is not simply composed of the two distributions from the two cases when the line is completely filled with one or the other dielectric, because V(x,0) and V(0,y) (and thus $E_t$ on the dielectric boundary) depend on the elements of the two permittivity tensors. for the capacitance per unit length of such a line, the formula $C' \approx (C'_1 + C'_2)/2$ is now an approximation for small $|\varepsilon_{rxx1} - \varepsilon_{ryy1}|$ and $|\varepsilon_{rxx2} - \varepsilon_{ryy2}|$. after the substitutions $C'_{1,2} = \varepsilon_{re1,2}C'_0$ and $\varepsilon_{re1,2} \approx (\varepsilon_{rxx1,2} + \varepsilon_{ryy1,2})/2$, the approximate formula $\varepsilon_{re} \approx (\varepsilon_{rxx1} + \varepsilon_{ryy1} + \varepsilon_{rxx2} + \varepsilon_{ryy2})/4$ is obtained. this formula is exact in the limiting case $\varepsilon_{rxx1} = \varepsilon_{ryy1}$ and $\varepsilon_{rxx2} = \varepsilon_{ryy2}$ (isotropic dielectrics).

references
[1] p. p. silvester, r. l. ferrari, finite elements for electrical engineers, cambridge university press, cambridge, 1983.
[2] m. n. o. sadiku, numerical techniques in electromagnetics with matlab, crc press, taylor & francis group, london, new york, 2009.
[3] m. s. palma, t. k. sarkar, e. g. castillo, t. roy and a. đorđević, iterative and self-adaptive finite-elements in electromagnetic modeling, artech house, 1998.
[4] v. v. petrović and ž. j. mančić, "strong fem formulation for quasi-static analysis of shielded planar transmission lines in anisotropic media", eccsc 2010, belgrade.
[5] ž. j. mančić and v. v. petrović, "strong and weak fem formulations of higher order for quasi-static analysis of shielded planar transmission lines", microwave and optical technology letters (motl), vol. 53, no. 5, pp. 1114-1119, may 2011.
[6] ž. j. mančić and v. v. petrović, "strong fem formulation for quasi-static analysis of shielded striplines in anisotropic homogeneous dielectric", microwave and optical technology letters (motl), vol. 54, no. 4, pp. 1001-1006, april 2012.
[7] h. shibata, s. minakawa and r. terakado, "a numerical calculation of the capacitance for the rectangular coaxial line with offset inner conductor having an anisotropic dielectric", ieee trans. on mtt, vol. mtt-31, no. 5, pp. 385-391, may 1983.
[8] s. k. koul, "an analytical method for the capacitance of the rectangular inhomogeneous coaxial line having anisotropic dielectrics", ieee trans. on mtt, vol. mtt-32, no. 8, pp. 937-941, august 1984.
[9] n. g. alexopoulos, "integrated-circuit structures on anisotropic substrates", ieee trans.
on mtt, vol. mtt-33, no. 10, pp. 847-881, october 1985.
[10] ž. j. mančić and v. v. petrović, "analysis of square coaxial line with anisotropic dielectric by finite element method", telfor journal, vol. 3, no. 2, pp. 125-127, 2011.
[11] ž. j. mančić and v. v. petrović, "strong fem calculation of the influence of the conductor's position on quasi-static parameters of the shielded stripline with anisotropic dielectric", in proceedings of the icest conference, niš, 2011, isbn 978-86-6125031-6, pp. 191-194.
[12] n. j. damaskos, b. j. kelsall and j. e. powell, "square coaxial lines and materials measurements", microwave journal, vol. 55, no. 2, february 2012.
[13] b. m. kolundzija and v. v. petrovic, "power conservation in method of moments and finite-element method for radiation problems", ieee trans. on antennas and propagation, vol. 53, no. 8, pp. 2728-2737, august 2005.
[14] a. r. djordjević, elektromagnetika, akademska misao, beograd, 2008.
[15] http://iri.columbia.edu/~ines/naren/fortrannumericalrecepies/linbcg.for
[16] v. v. petrovic and b. d. popovic, "optimal fem solutions of one dimensional em problems", int. j. of numerical modelling, vol. 14, no. 1, pp. 49-68, jan-feb 2001.
[17] ž. j. mančić and v. v. petrović, "strong fem solution for the square coaxial line", in proceedings of the telsiks conference, niš, 2009, pp. 343-346.
[18] ž. j. mančić and v. v. petrović, "strong fem formulation for 2d quasi-static problems and application to transmission lines", in proceedings of the 22nd telecommunications forum telfor, belgrade, 2014, isbn 978-1-4799-6190-0, ieee catalog number sfp 1498p-cdr, pp. 749-756.
[19] http://www.femm.info/wiki/homepage, http://www.femm.info/archives/bin/femm42bin_x64.exe
[20] y. k. awasthi, h. singh, a. kumar, p. singh, a. k. verma, "accurate cad-model analysis of multilayer microstrip line on anisotropic substrate", j. of infrared, millimeter, and terahertz waves, vol. 31, no. 3, pp. 259-270, 2010.
facta universitatis series: electronics and energetics vol. 30, no 4, december 2017, pp. 549-556 doi: 10.2298/fuee1704549s
high-bandwidth buffer amplifier for liquid crystal display applications
saeed sadoni, abdalhossein rezai acecr institute of higher education, isfahan branch, isfahan, iran
abstract. in this paper, a novel high-bandwidth and low-power buffer amplifier is presented for liquid crystal display applications. this buffer amplifier consists of a folded cascode differential amplifier at the input and a class-ab amplifier at the output, which are designed carefully. the proposed buffer amplifier utilizes a high-performance feedback circuit to increase the bandwidth. it also utilizes a comparator circuit to avoid wasting power. the designed circuit has been simulated in 180 nm technology using hspice 2008.3. the simulation results show that the bandwidth, power consumption and power supply of the designed circuit are 1.14 mhz, 1.64 mw and 1.8 v, respectively.
key words: buffer amplifier, liquid crystal displays, low-power amplifier, high-bandwidth amplifier
1. introduction
the increasing development of electronics has made daily life almost impossible without electronic devices such as cell phones, televisions, and tablets. growing user demand has driven scientific progress in this field [1, 2]. the display is a common element of these devices and contributes greatly to their progress. progress in this field attracts users and the market mainly because display devices interface directly with the users. improvements in this field can be categorized in two groups: (a) improvement in the circuit components, and (b) improvement in the hardware of displays. buffer amplifiers are a main part of the screens. they directly affect power consumption, settling time, speed, bandwidth and other parameters [3-5].
developments in integrated circuit technology, particularly in circuits related to displays, indicate a need for high-quality, high-speed circuits that operate from a low supply voltage with low power consumption [3-5]. recently, technology and consumer preference have shifted toward portable devices such as cell phones and tablets. in this process, many attempts have been made to improve the circuits, with objectives that generally focus on parameters such as the slew rate, voltage swing, bandwidth, maximum load current and low power consumption [3-5].
received november 4, 2016; received in revised form february 20, 2017 corresponding author: abdalhossein rezai acecr institute of higher education, isfahan branch, isfahan 84175-443, iran (e-mail: rezaie@acecr.ac.ir)
there are many attempts to increase the performance of displays, such as [6-10]. in [6], the designed circuit works well in terms of charging and discharging speed of large capacitors, but it has low bandwidth. in [7] and [8], the designed circuits have a good bandwidth, but they cannot charge large capacitors well. in [9], the developed circuit works well in terms of charging and discharging speed for relatively large capacitors, but it has a relatively high power consumption. the developed circuit in [10] works well in terms of charging and discharging speed of large capacitors. its power consumption is also relatively good, but it has low bandwidth. this paper presents a novel and efficient buffer amplifier. the proposed circuit consists of four parts, which are designed carefully. the proposed circuit has been simulated using hspice. the simulation results show that the proposed circuit provides an improvement in terms of bandwidth, slew rate and power consumption compared to other buffer amplifiers. the remainder of this paper is organized as follows: a general description of the proposed buffer amplifier circuit is given in section 2.
circuit design techniques are discussed in section 3. in section 4, the simulation results of the proposed buffer amplifier are compared with other buffer amplifiers. finally, section 5 concludes this paper.
2. buffer amplifier
buffer amplifiers generally consist of several parts: an input amplifier, an intermediate amplifier, a feedback network and an output stage that drives capacitive loads [11]. as presented in figure 1, the first block is a differential amplifier. the folded cascode amplifier is commonly used in this stage because its differential inputs and outputs suppress noise, and it has a better swing than the telescopic amplifier. in the next block, the class-ab output stage, the output circuit in [11] is utilized. since two comparators are placed before the output transistors, the output efficiency is considerably increased and static power dissipation is avoided. figure 2 shows a schematic of the output circuit [11].
fig. 1 a simplified buffer circuit schematic (blocks: input amp., feedback network, out stage)
fig. 2 the circuit of the output stage
in the third part, a feedback circuit is used. as shown in (1), this feedback network increases the bandwidth of the buffer amplifier. figure 3 shows a schematic of the feedback network [8].
$bw_{new} = bw_{old}\,(1 + ab)$ (1)
fig. 3 the feedback network schematic
3. circuit design
to implement the proposed buffer amplifier circuit, the design begins with the folded cascode differential amplifier. then, the common-mode feedback circuit is added. moreover, the constant-gm circuit and the level shifter are designed and added. the circuit is designed with the gm/id method, as shown in [11], using the graph of gm/id versus vov [11].
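relation (1) of section 2 is the usual gain-bandwidth trade-off of negative feedback. as a sketch only, assume a model that is not stated in the text: a single-pole open-loop gain with dc value $a$ and pole $\omega_{old}$, and a frequency-independent feedback factor $b$. under these assumptions the closed-loop pole moves out by the factor $(1 + ab)$:

```latex
\begin{align}
a(s) &= \frac{a}{1 + s/\omega_{old}}, \\
a_{cl}(s) &= \frac{a(s)}{1 + a(s)\,b}
           = \frac{a}{1 + ab + s/\omega_{old}}
           = \frac{a}{1 + ab}\cdot\frac{1}{1 + \dfrac{s}{\omega_{old}\,(1 + ab)}}, \\
\omega_{new} &= \omega_{old}\,(1 + ab)
\quad\Longleftrightarrow\quad bw_{new} = bw_{old}\,(1 + ab).
\end{align}
```

in this simplified model the dc gain falls by the same factor $(1 + ab)$, so the gain-bandwidth product is unchanged; a multi-pole amplifier such as the proposed one would follow this only approximately.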
in the gm/id design methodology, the circuit is designed using a curve that depends on the utilized technology, which makes the choice of appropriate transistor dimensions more efficient. this curve is independent of the transistor dimensions and applies to all operating regions of the transistors. in addition, it can be used to size all transistors in the utilized technology. it should be noted that this ratio is directly related to the carrier mobility in the transistors. because the n-type and p-type transistors differ in this respect, vov must be chosen differently for each type when designing the swing for the utilized technology. figure 4 shows gm/id versus vov according to [11] for n-type transistors.
fig. 4 the simulation results for gm/id versus vov for n-type transistors
in this circuit, the swing voltage is 1.2 v. with this limitation, the overdrive voltages are selected as shown in (2) and (3):
vovn = 0.13 v (2)
vovp = 0.17 v (3)
the p-type transistors and the feedback circuit are designed in the same way. in the feedback circuit design, the branch currents must be kept low so that the power consumption of the entire circuit does not increase. the feedback circuit is shown in figure 5.
fig. 5 the utilized circuit for the feedback in the proposed buffer amplifier
the general circuit of the proposed buffer amplifier without the output stage is shown in figure 6. it utilizes the two circuits shown in figures 2 and 3. it should be noted that figure 2 depicts the output power stage. to ensure that the driving devices mo1 and mo2 stay off during static operation and to save power, the dc currents of mc1 and mc4 are designed to be slightly lower than the nominal currents of mc2 and mc3, respectively.
the above specification is fulfilled by the following design conditions:
[equations (4) and (5): conditions on the (w/l) ratios of mc1 and mc2, and of mc4 and mc3, relative to the n- and p-type bias transistors; the expressions are illegible in the extracted text]
fig. 6 the general circuit of the proposed buffer amplifier
4. simulation results
the designed circuit shown in figure 6 is simulated in 180 nm technology using hspice 2008.3 with a 1.8 v power supply. the simulation results for the slew rate and bandwidth are shown in fig. 7 and fig. 8, respectively.
fig. 7 the simulation results for the slew rate of the proposed circuit
fig. 8 the simulation results for the bandwidth of the proposed circuit
table 1 summarizes the simulation results of the proposed buffer amplifier in hspice in comparison with other buffer amplifiers [6-10]. based on our simulation results, which are shown in table 1, the developed circuit provides an improvement in comparison with [6-10] in terms of bandwidth, power consumption and slew rate.
table 1 the comparative table for amplifiers
reference | power consumption (mw) | maximum capacitive load (pf) | cmos technology (nm) | supply voltage (v) | bandwidth (mhz) | slew rate (v/µs)
[6] | 1 | 680 | 600 | 4 | 0.1 | -
[7] | - | 15 | 350 | 2 | 2.3 | -
[8] | - | 200 | 500 | 3.3 | - | -
[9] | - | 20000 | 130 | - | 1 | 0.083
[10] | 6.47 | 1000 | 500 | 5 | 0.503 | 5.7
this paper | 1.64 | 1000 | 180 | 1.8 | 1.14 | 378
5. conclusion
low-power and high-bandwidth design are trends in circuit design [4-5]. in this paper, a novel buffer amplifier has been proposed for liquid crystal display applications. the proposed buffer amplifier is composed of four parts, which have been designed carefully. the proposed circuit has been simulated using hspice. the simulation results showed that the proposed circuit provides an improvement in terms of bandwidth, power consumption and slew rate compared to [6-10].
therefore, the proposed buffer amplifier architecture has a huge potential to be an efficient architecture for hardware implementation of buffer amplifiers.
references
[1] a. nikolic, n. neskovic, r. antic, a. anastasijevic, "industrial wireless sensor networks as a tool for remote online management of power transformers heating and cooling process", facta universitatis, series: electronics and energetics, vol. 30, pp. 107-119, 2017.
[2] r. khalilian, a. rezai, e. abedini, "an efficient method to improve wban security", adv. sci. tech. let., vol. 64, pp. 43-46, 2014.
[3] h. m. obaid, s. m. chaudhry, "design of linear high drive class ab differential amplifier in 90 nmos technology", natl. acad. sci. lett., vol. 38, pp. 489-492, 2015.
[4] s. soltany, a. rezai, "a novel low power and low voltage bulk-input four-quadrant analog multiplier in voltage mode", int. j. multi. ubiq. eng., 2016, 16, pp. 159-168.
[5] s. soltany, a. rezai, "a new low power four quadrant analog multiplier", adv. sci. tech. let., vol. 106, pp. 33-36, 2015.
[6] c. w. lu, c. l. lee, "a low-power high speed class-ab buffer amplifier for flat panel display application", ieee trans. very large integ. (vlsi) syst., vol. 10, pp. 163-168, 2002.
[7] g. grasso, s. pennisi, "high drive and linear cmos class ab pseudo-differential amplifier", ieee trans. cir. syst.: express briefs, vol. 54, pp. 112-116, 2007.
[8] p. r. surkanti, p. m. furth, "converting a three stage pseudo-class ab amplifier to a true class ab amplifier", ieee trans. cir. syst.: express briefs, vol. 59, pp. 229-233, 2012.
[9] p. liao, p. leo, b. zhang, zh. li, "single capacitor with current amplifier compensation for ultra large capacitive load three stage amplifier", microelectr. j., vol. 44, pp. 712-717, 2013.
[10] a. d. grasso, d. marano, f. e. alfaro, a. martin, g. palumbo, s. pennisi, "self-biased dual path push pull output buffer amplifier for lcd column drivers", ieee trans. cir. syst.: regular papers, vol. 61, pp. 663-670, 2014.
[11] f. silveira, d. flandre, p. jespers, "a gm/id based methodology for the design of cmos analog circuits and its application to the synthesis of a silicon on insulator micro power ota", ieee j. solid state cir., vol. 31, pp. 1314-1319, 1996.
facta universitatis series: electronics and energetics vol. 36, no 2, june 2023, pp. 285-298 https://doi.org/10.2298/fuee2302285r © 2023 by university of niš, serbia | creative commons license: cc by-nc-nd original scientific paper
a study of binary decision diagram characteristics of bent boolean functions
miloš radmanović faculty of electrical engineering, university of niš, niš, serbia
abstract. bent boolean functions exist only for an even number of variables; moreover, they are unbalanced. therefore, they are used in coding theory and in many areas of computer science. the general form of bent functions is still unknown. one way of representing boolean functions is with a reduced ordered binary decision diagram (robdd). the strength of robdds is that they can represent boolean function data with a high level of redundancy in a compact form, as long as the data is encoded in such a way that the redundancy is exposed. this paper investigates characteristics of bent functions with a focus on their robdd parameters. a decision diagram experimental framework has been used for the implementation of a program for calculation of the robdd parameters. the results presented in this paper are intended to be used to create methods for the construction of bent functions using a robdd as a data structure from which the bent functions can be discovered.
key words: coding theory, boolean functions, bent functions, binary decision diagram
1. introduction
bent functions were introduced by rothaus in 1976 as boolean functions with maximal nonlinearity [1]. this feature has led to the fact that they are used in many areas of computer science, but their most significant application is in coding theory [2].
however, their application is quite limited because the general form of these functions is still unknown [3]. as the number of variables in a function increases, bent functions become extremely rare in the set of all possible functions. therefore, the construction of these functions is computationally demanding. also, there is no formal method for enumeration, generalization, construction, or classification of all bent functions for a given number of variables. for this reason, researching the characteristics of these functions is of great interest to the scientific community. some characteristics of bent functions are known. it is well known that all walsh spectral coefficients of bent functions, whose values are mapped onto {-1, 1}, have the same absolute value $2^{n/2}$, where n is the number of variables of the function. they have the maximum possible value of nonlinearity, equal to $2^{n-1} - 2^{n/2-1}$, and they only exist for an even number of variables.
received november 01, 2022; revised december 30, 2022 and march 14, 2023; accepted april 10, 2023 corresponding author: miloš radmanović faculty of electrical engineering, university of niš, aleksandra medvedeva 14, 18104 niš, serbia e-mail: milos.radmanovic@elfak.ni.ac.rs
it is known that the algebraic degree of bent functions is at most n/2 for n ≥ 4 [3]. it is also well known that from given bent functions, new bent functions with the same or a greater number of variables can be constructed [4]. extensive work on bent functions has been done, and various interesting results have been obtained on the construction and classification of specific subsets of bent functions [3], [5]. as the number of variables increases, it is very time-consuming to discover a bent function. the number of bent functions is small compared to the complete set of all boolean functions and therefore, their detection within some defined boundaries is very time-consuming.
testing of bentness across all possible functions, using all known characteristics, requires a lot of processing time even for small numbers of variables (n > 4). consequently, the complete set of n-variable bent boolean functions is known only for n ≤ 4 [3]. the general number of bent functions is an open problem. note that the number of bent functions increases rapidly with increasing n. there are 8 bent functions in 2 variables, 896 bent functions in 4 variables, 5,425,430,528 bent functions in 6 variables, and 99,270,589,265,934,370,305,785,861,242,880 bent functions in 8 variables [5]. for example, a complete enumeration of bent functions with 8 variables (which make up approximately $8.57 \times 10^{-44}$ percent of all functions), using maximal algebraic degree, took approximately 50 personal computers running for 3 months [6]. for this reason, identifying new characteristics can help in more efficient detection of bent functions. one way to represent a boolean function is with a binary decision diagram (bdd). bdds were originally introduced as an efficient support to the procedures and operations required to solve a given problem in synthesis and verification of logic circuits [7]. they are popular data structures widely used in various other areas where manipulation and computation with boolean functions is required. a bdd is a representation of a boolean expression using a rooted directed acyclic graph that consists of terminal nodes (with constant values 0 or 1) and non-terminal nodes (marked with variables). a reduced ordered binary decision diagram, which is a widely used data structure in practice, is a bdd with a particular variable order in which redundant nodes are removed and isomorphic subtrees are shared. robdds are derived by the reduction of the corresponding binary decision tree (bdt). robdds provide a compact representation allowing one to process large boolean functions efficiently in terms of space and time [7].
the most important characteristic of the robdd is the size of the graph representation, or number of non-terminal nodes. this parameter is critical since the memory requirement during the construction of an robdd is directly proportional to the size [2]. besides the size, the following basic parameters are most often considered: the number of paths, width, and the average path length. there is a direct correspondence between these characteristics and the basic characteristics of a logic network derived from robdds. for example, the size of an robdd corresponds to the number of elementary modules in the corresponding realization of a logic network [1]. the path-related characteristics directly correspond to the interconnection complexity of the logic realization. therefore, this paper proposes researching the basic robdd characteristics of bent functions. previously, shafer [9] analyzed the robdd characteristics of disjoint quadratic bent functions, symmetric bent functions, and homogeneous bent functions of 6 variables. specifically, disjoint quadratic bent functions were found to have size 2n − 2 for functions of n variables, symmetric bent functions have size 4n − 8, and all homogeneous bent functions of 6 variables were shown to be p-equivalent. two functions are p-equivalent iff those two functions have identical bdds for distinct variable orderings [9]. however, in this paper the complete set of bent functions with 4 variables is analyzed. for bent boolean functions with 6 or 8 variables, only appropriate subsets of bent functions are analyzed, and not only the size is included, but also the number of paths, width, and the average path length. a decision diagram experimental framework has been used for implementation of a program for the calculation of robdd characteristics. for discovery of bent functions, it uses maximal algebraic degree as the search space boundary.
also, it uses the implementation of the discovery of bent functions using reed-muller (rm) subsets, which is described in [10]. experimental results show interesting robdd characteristics of bent functions. for each robdd characteristic, the range of values is determined. additionally, this paper also investigates the same robdd characteristics of non-bent functions with n variables having hamming weight equal to $2^{n-1} \pm 2^{n/2-1}$. these additional characteristics confirm that experimental results can be used to create methods for discovering bent functions using robdd. this paper is organized as follows: section 2 shortly introduces the theoretical background about bent functions and their discovery in the rm domain. section 3 discusses robdds and their characteristics. the experimental results are shown and discussed in section 4. the closing section 5 summarizes the results of the research reported in this paper.
2. background theory
if a function f and its walsh spectrum $s_{f,w}$, in matrix notation, are represented by the vectors $\mathbf{f} = [f_0, f_1, \ldots, f_{2^n-1}]^T$ and $\mathbf{s}_{f,w} = [s_0, s_1, s_2, \ldots, s_{2^n-1}]^T$, respectively, the walsh transform is defined by the walsh matrix w(n) [11]:
$\mathbf{s}_{f,w} = w(n)\,(-1)^{\mathbf{f}}$ (1)
$f(x_1, \ldots, x_n) = 2^{-n}\, x(n)\, \mathbf{s}_{f,w}$ (2)
where
$x(n) = \bigotimes_{i=1}^{n} \begin{bmatrix} 1 & 1 - 2x_i \end{bmatrix}$, $(-1)^{\mathbf{f}} = [(-1)^{f_0}, (-1)^{f_1}, \ldots, (-1)^{f_{2^n-1}}]^T$ (3)
and
$w(n) = \bigotimes_{i=1}^{n} w(1)$, $w(1) = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$ (4)
where w(1) represents the basic walsh transform matrix and $\otimes$ is the kronecker product. the nonlinearity of an n-variable boolean function f is the (hamming) distance of f from the set of all n-variable affine functions [3]. boolean functions achieving maximal nonlinearity are called bent functions. every bent function has a hamming weight (number of times it takes the value 1) of $2^{n-1} \pm 2^{n/2-1}$. for example, bent functions with 4 variables have hamming weight of 6 or 10, with 6 variables 28 or 36, and with 8 variables 120 or 136.
a boolean function f in (1,−1) encoding is bent if all walsh spectral coefficients $s_{f,w}$ have the same absolute value $2^{n/2}$. the fast transform algorithm can be used to compute the coefficients in the walsh spectrum. this algorithm is composed of "butterfly" operations which are repeated and have a structure derived from the basic transform matrices [11]. the recursive definition of the walsh transform matrix, expressed in eq. (4), is fundamental for the definition of the fast walsh transform algorithm, which is similar to a fast fourier transform (fft) algorithm. the computation of the fast transform algorithm consists of the repeated application of the same "butterfly" operations determined by the basic transform matrices [10]. figure 1 shows the "butterfly" operation for the walsh transform matrix. the "butterfly" operations are performed in each step over a different subset of data. figure 1 also shows the flow graph of the fast walsh transform algorithm of the cooley-tukey type for computation of the walsh spectrum of a 2-variable boolean function f given by the truth vector $\mathbf{f} = [f(0), f(1), f(2), f(3)]^T$. this algorithm is highly exploited for testing of bentness across all possible boolean functions in some defined space for their discovery.
fig. 1 the flow graph of the fast walsh transform for a 2-variable boolean function
for example, bentness testing for a function of 4 variables $f(x_1, x_2, x_3, x_4)$, given by the truth vector $\mathbf{f} = [1,0,1,1,0,1,0,0,0,1,0,0,0,1,0,0]^T$, with (1,−1) encoding, can be calculated as shown in figure 2. it should be noticed that the absolute values of all walsh coefficients are equal to 4. a positive polarity reed-muller form comprises the exclusive-or of and product terms, where each variable appears uncomplemented.
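the bentness test of the 4-variable example above can be sketched in a few lines. this is a minimal illustration, assuming python and the hypothetical function names `fast_walsh` and `is_bent`; it is not the implementation used in the paper. the butterflies follow the basic matrix w(1) of eq. (4), and the truth vector is mapped to (1,−1) encoding before the transform.

```python
def fast_walsh(values):
    """Fast Walsh transform (Hadamard ordering) built from 2-point butterflies."""
    s = list(values)
    h = 1
    while h < len(s):
        for start in range(0, len(s), 2 * h):
            for i in range(start, start + h):
                a, b = s[i], s[i + h]
                # butterfly derived from W(1) = [[1, 1], [1, -1]]
                s[i], s[i + h] = a + b, a - b
        h *= 2
    return s


def is_bent(truth_vector):
    """True if all Walsh coefficients of the (1,-1)-encoded function have magnitude 2^(n/2)."""
    n = len(truth_vector).bit_length() - 1
    if n < 2 or n % 2 or len(truth_vector) != 1 << n:
        return False  # bent functions exist only for an even number of variables
    encoded = [1 - 2 * bit for bit in truth_vector]  # 0 -> 1, 1 -> -1
    return all(abs(c) == 1 << (n // 2) for c in fast_walsh(encoded))


# truth vector of the 4-variable example used with figure 2
F = [1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0]
print(is_bent(F))             # bent: every coefficient has magnitude 4
print(is_bent([0, 1, 1, 0]))  # x1 xor x2 is linear, hence not bent
```

for the example vector every coefficient has magnitude $4 = 2^{4/2}$, which is the condition stated in the text.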
any boolean function f can be represented by the positive polarity rm form in matrix notation defined as [11]:
$\mathbf{s}_{f,rm} = r(n)\,\mathbf{f}$ (5)
$f(x_1, \ldots, x_n) = x(n)\,\mathbf{s}_{f,rm}$ (6)
where
$x(n) = \bigotimes_{i=1}^{n} \begin{bmatrix} 1 & x_i \end{bmatrix}$ (7)
and
$r(n) = \bigotimes_{i=1}^{n} r(1)$, $r(1) = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}$, $r(1) = (r(1))^{-1}$, $r(n) = (r(n))^{-1}$ (8)
where addition and multiplication are modulo 2, r(n) is the positive reed-muller transform matrix of order n, and r(1) is the basic positive reed-muller transform matrix.
fig. 2 example of the bentness testing for a 4-variable boolean function in the (1,−1) encoding using the fast walsh transform
the elements of $\mathbf{s}_{f,rm} = [a_0, a_1, a_2, a_{12}, a_3, a_{13}, a_{23}, a_{123}, \ldots, a_{12 \ldots n}]$ are the coefficients in the positive polarity reed-muller (pprm) expression of any boolean function [11]:
$f(x) = a_0 \oplus \sum_{i=1}^{n} a_i x_i \oplus \sum_{1 \le i < j \le n} a_{ij} x_i x_j \oplus \cdots \oplus a_{12 \ldots n}\, x_1 x_2 \cdots x_n$ (9)
where $\sum$ denotes modulo 2 summation. the algebraic degree, or the order of nonlinearity, of a boolean function f is the maximum number of variables in a product term with non-zero coefficient $a_k$, where k is a subset of {1,2,3,...,n}. when k is an empty set, the coefficient is denoted as $a_0$ and is called the zero-order coefficient. coefficients of order 1 are $a_1, a_2, \ldots, a_n$, coefficients of order 2 are $a_{12}, a_{13}, \ldots, a_{(n-1)n}$, and the coefficient of order n is $a_{12 \ldots n}$. the number of all coefficients of order i is $\binom{n}{i}$. the pprm coefficients are divided into order groupings according to the number of ones in the binary representation of their index in the spectrum.
the algebraic degree of bent functions is at most n/2 for n ≥ 4 [8]. thus, the maximal number of non-zero pprm coefficients of a bent function is $\sum_{i=0}^{n/2} \binom{n}{i}$. since the order of bent functions is limited, the number of non-zero pprm coefficients is also limited and the positions of the coefficients in the pprm spectrum are restricted.
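the pprm spectrum, the algebraic degree and the bound on the number of non-zero coefficients can be illustrated with a short sketch. it assumes python and hypothetical function names (`reed_muller`, `algebraic_degree`, `max_nonzero_pprm`); the in-place butterfly realizes the basic matrix r(1) of eq. (8) over gf(2), and the order of a coefficient is taken as the number of ones in the binary representation of its index, as described above.

```python
from math import comb


def reed_muller(vector):
    """Fast positive-polarity Reed-Muller transform over GF(2); it is its own inverse."""
    s = list(vector)
    h = 1
    while h < len(s):
        for start in range(0, len(s), 2 * h):
            for i in range(start, start + h):
                s[i + h] ^= s[i]  # butterfly derived from R(1) = [[1, 0], [1, 1]]
        h *= 2
    return s


def algebraic_degree(truth_vector):
    """Largest order (number of ones in the index) among non-zero PPRM coefficients."""
    spectrum = reed_muller(truth_vector)
    return max((bin(k).count("1") for k, a in enumerate(spectrum) if a), default=0)


def max_nonzero_pprm(n):
    """Number of PPRM positions of order at most n/2 (bound for bent functions, n >= 4)."""
    return sum(comb(n, i) for i in range(n // 2 + 1))


# truth vector of the 4-variable bentness example
F = [1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0]
spectrum = reed_muller(F)
print(spectrum)
print(algebraic_degree(F), max_nonzero_pprm(4))
```

applying `reed_muller` twice returns the original truth vector, which reflects the self-inverse property $r(n) = (r(n))^{-1}$.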
these restrictions are the main reasons that discovery is possible, since they certainly reduce the possible search space for discovery in the reed-muller domain. as the boolean function size increases, the possible search space increases too. for the pprm transform, we need an inverse transform to get back from the reed-muller domain. since the reed-muller transform matrix r(n) is a self-inverse matrix over gf(2), the forward and inverse transform are given by the same matrix. figure 3 shows the flow graph of the fast inverse reed-muller transform algorithm of the cooley-tukey type for computation of the boolean function f with the truth vector $\mathbf{f} = [f(0), f(1), f(2), f(3)]^T$ from a pprm spectrum. this algorithm is highly exploited for discovery of bent functions across all possible boolean functions in the rm domain.
fig. 3 the flow graph of the fast inverse rm transform for a 2-variable boolean function
for example, discovery of the bent function of 4 variables $f(x_1, x_2, x_3, x_4)$, with truth vector $\mathbf{f} = [1,0,1,1,0,1,0,0,0,1,0,0,0,1,0,0]^T$, using the fast inverse rm transform of its $\mathbf{s}_{f,rm} = [1,1,0,1,1,0,0,0,1,0,0,0,1,0,0,0]$, is shown in figure 4. the bentness testing for this truth vector, with (1,−1) encoding, is performed as shown in figure 2. black dots on the flow graph on the left side in figure 4 indicate 11 possible positions for non-zero pprm coefficients. the number of these possible positions is calculated according to the following formula:
$\sum_{i=0}^{4/2} \binom{4}{i} = \binom{4}{0} + \binom{4}{1} + \binom{4}{2} = 1 + 4 + 6 = 11$ (10)
fig. 4 example of the discovery of the bent function of 4 variables, using the fast inverse rm transform
this means that there is one possible position for a non-zero coefficient of the 0-th order, 4 possible positions for non-zero coefficients of the 1-st order and 6 possible positions for non-zero coefficients of the 2-nd order. the pprm expression for the function from figure 4 is:
$f(x_1, x_2, x_3, x_4) = 1 \oplus x_1 \oplus x_2 \oplus x_4 \oplus x_1 x_2 \oplus x_3 x_4$ (11)
fast transform algorithms are highly exploited for discovering bent functions in the rm domain.
3. robdd
a bdd is a directed acyclic graph that contains non-terminal nodes, two terminal nodes, and edges. an robdd is a reduced bdd for which the nodes at the same level are labelled with the same variable [12]. the reduction is performed by sharing the isomorphic subtrees and removing the redundant data in the bdt using the appropriately defined reduction rules [6]. non-terminal nodes are labeled with variables $x_i$ and have two outgoing edges. outgoing edges are labeled '0' and '1' according to the values of the variable $x_i$. terminal nodes contain the function values '0' and '1'. the truth table entry of a boolean function labels edges from the root node to the corresponding terminal node. an example of the robdd representation for the function defined by the truth vector $\mathbf{f} = [0,1,1,1,1,0,0,0,1,0,0,0,1,0,0,0]^T$ using ordering $(x_1, x_2, x_3, x_4)$ is shown in fig. 5. for characteristics of the robdd, the following basic parameters are most often considered: the size, the number of paths, width, and the average path length (apl) [13]. the efficiency of the robdd representation in the above example is that it represents a truth vector with a high level of redundancy in a compact form using non-terminal nodes, as long as the data is encoded in such a way that the redundancy is exposed. in a robdd for logic function f, the size of the robdd is the number of non-terminal nodes needed to represent the robdd.
in the memory representation of the robdd, each non-terminal node requires an index and two pointers to the succeeding nodes.

fig. 5 the robdd representation of the function defined by $f = [0111\ 1000\ 1000\ 1000]^t$

the width of an robdd is defined as the maximum number of nodes per level. the delay in a logic network is directly proportional to the number of levels of the robdd, which together with the width determines the surface area of the logic network [7]. a path in an robdd is a sequence of nodes connected by edges leading from the root node to a terminal node. the number of paths is the total number of distinct paths to any of the terminals. the number of paths influences the robdd complexity. a minimized disjoint sum-of-products representation can be extracted directly from an robdd and leads to a small logic network [8]. the path length is the number of non-terminal nodes on the path. the apl is the arithmetic mean of the lengths of all possible bdd paths [13]. minimizing the apl reduces the evaluation time of the logic network. the characteristics of the robdd representation shown in fig. 5 are:

size(robdd(f)) = 6
#paths(robdd(f)) = 9
width(robdd(f)) = 2
apl(robdd(f)) = 3.66666 (12)

4. experimental results

this section presents the basic robdd characteristics of bent functions with the initial order of variables as shown in fig. 5. in these experiments, the order of the variables in the robdd was not changed. the complete set of all bent functions is analyzed only for functions of 4 variables. because finding bent functions of 6 and 8 variables is very time-consuming, the complete set of all bent functions is not analyzed for these cases. for bent functions of 6 and 8 variables, robdd characteristics were analyzed on a set of 1 million and 10,000 bent functions, respectively.
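for n = 4 the complete set of bent functions is small enough to enumerate directly in the rm domain. the following python sketch is an illustration under stated assumptions, not the implementation used in the paper: it restricts the pprm spectrum to the 11 positions of order at most 2 counted in equation (10), applies the fast rm transform (which is its own inverse over gf(2)) to obtain the truth vector, and tests bentness with a fast walsh transform of the (1,−1)-encoded vector. the function names are hypothetical.

```python
def rm_transform(vec):
    """Fast Reed-Muller (PPRM) transform over GF(2); it is self-inverse."""
    a = list(vec)
    h = 1
    while h < len(a):
        for start in range(0, len(a), 2 * h):
            for j in range(start, start + h):
                a[j + h] ^= a[j]       # butterfly: (u, v) -> (u, u xor v)
        h *= 2
    return a


def walsh_spectrum(truth):
    """Walsh spectrum of the (1,-1)-encoded truth vector."""
    a = [1 - 2 * bit for bit in truth]
    h = 1
    while h < len(a):
        for start in range(0, len(a), 2 * h):
            for j in range(start, start + h):
                a[j], a[j + h] = a[j] + a[j + h], a[j] - a[j + h]
        h *= 2
    return a


def is_bent(truth):
    """A function of n variables is bent if all |W| equal 2**(n/2)."""
    n = len(truth).bit_length() - 1
    return all(abs(w) == 2 ** (n // 2) for w in walsh_spectrum(truth))


def discover_bent(n=4):
    """Enumerate PPRM spectra of order <= n/2 and keep the bent functions."""
    positions = [i for i in range(2 ** n) if bin(i).count("1") <= n // 2]
    found = []
    for mask in range(2 ** len(positions)):
        spectrum = [0] * (2 ** n)
        for k, pos in enumerate(positions):
            spectrum[pos] = (mask >> k) & 1
        truth = rm_transform(spectrum)  # inverse transform = forward transform
        if is_bent(truth):
            found.append(truth)
    return positions, found


positions, found = discover_bent(4)
print(len(positions), len(found))
```

with these assumptions the search visits $2^{11} = 2048$ candidate spectra instead of all $2^{16}$ functions. the same exhaustive loop is not practical for 6 or 8 variables, which is consistent with the sampling approach used in the experiments.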
the following robdd characteristics are included in the analysis: the size (the number of nodes), the number of paths, the width, and the average path length. for the discovery of bent functions of 4 variables, the maximal algebraic degree is used as the search space boundary. the discovery of bent functions of 6 and 8 variables uses the rm subsets described in [9]. this implementation performs the discovery of a single random bent function. additionally, these experiments show the number of non-bent functions of n variables having hamming weight equal to $2^{n-1} \pm 2^{n/2-1}$. only functions with this predefined hamming weight are presented, because they represent the search space when creating a potential method for discovering bent functions using robdd characteristics. the program for the analysis of bent and specific non-bent functions was created using a decision diagram experimental framework. it is implemented as an extension of an existing bdd package written in the c++ programming language. the bdd package follows all basic recommendations for programming bdd packages (unique table, operation table, garbage collector, swapping levels, etc.) [14], [15]. the experiments were performed on a pc pentium iv running at 3.66 ghz with 8 gb of ram. the bdd package performs operations using shared bdds, but in these experiments only shared bdds with one output were used. the size of the unique table and the operation table was limited to 262,139 entries. garbage collection was activated when available memory ran low. for each robdd characteristic (parameter), the tables in this section present the total number of bent and specific non-bent functions that have that characteristic. table 1 shows the number of bent functions of 4 variables with respect to the robdd size, number of paths, width, and apl.
it can be noticed that almost 70% of all bent functions of 4 variables have size 7 or 8, about 63% have 9, 10 or 11 paths, about 58% have robdd width 2, and about 40% have robdd apl 3.33333 or 3.5. this table also shows the number of all non-bent functions of 4 variables having hamming weight equal to 6 or 10 with respect to the robdd size, width, number of paths, and apl. it is evident that, on average, about 20 functions with hamming weight 6 or 10 have to be checked to find one that is bent. table 2 shows the number of bent functions of 6 variables on a sample of 1 million functions with respect to the robdd size, number of paths, width, and apl. the entire set of bent functions of 6 variables was not tested because of the long time it takes to discover these functions. in this table, about 70% of the sampled bent functions of 6 variables have size 15, 16, or 17, about 50% have 22, 23, 24 or 25 paths, about 73% have robdd width 2, and about 11% have robdd apl 5.00007 or 5.125. similarly, this table also shows the number of non-bent functions of 6 variables having hamming weight equal to 28 or 36, collected while 1,000,000 bent functions were discovered, with respect to the robdd size, number of paths, width, and apl. it can be noticed that the number of non-bent functions follows the number of bent functions and is about 40 times larger. table 3 shows the number of bent functions of 8 variables on a sample of 10,000 functions with respect to the robdd size, number of paths, width, and apl. the number of tested functions is reduced to 10,000 because of the very long computation time required for the discovery of these functions. in this table, about 60% of the sampled bent functions of 8 variables have size 22, 23, or 24, and about 70% have 30, 31, or 32 paths.
about 90% of the sampled bent functions have robdd width 2, and about 50% have robdd average path length 6.12903 or 6.13333. this table also shows the number of non-bent functions of 8 variables having hamming weight equal to 120 or 136, collected while 10,000 bent functions were discovered. it can be noticed that the number of non-bent functions follows the number of bent functions and is about 80 times larger. looking at all three tables for functions of 4, 6 and 8 variables, it is easy to determine formulas for the characteristic values shared by the largest number of functions. the largest number of n-variable bent functions have size $4n - 8$. for the other robdd characteristics, no general rule can be determined for the most frequent value. it is interesting, however, that for bent functions of 4 and 6 variables the most frequent apl follows the formula $0.8333333\,n$.

table 1 the number of all bent and all non-bent functions of 4 variables having hamming weight equal to 6 or 10 with respect to the robdd parameters.

size    #f (bent)    #f (non-bent)
4    7    169
5    44    667
6    153    2001
7    308    4079
8    318    5296
9    66    2630

width    #f (bent)    #f (non-bent)
1    132    1780
2    520    8531
3    216    4113
4    28    442

#paths    #f (bent)    #f (non-bent)
5    3    90
6    17    272
7    52    728
8    117    1471
9    193    2398
10    210    2985
11    157    3074
12    95    2294
13    42    1186
14    9    323
15    1    33

apl    #f (bent)    #f (non-bent)
2.4    3    61
2.66667    12    123
2.83333    5    92
2.85714    12    176
3    38    572
3.125    78    907
3.14286    2    30
3.22222    8    430
3.25    39    484
3.33333    172    1740
3.4    42    1468
3.44444    13    227
3.5    168    1529
3.54545    86    2405
3.63636    71    669
3.66667    86    2143
3.75    9    152
3.76923    42    1182
3.85714    9    323

table 2 the number of bent and non-bent functions of 6 variables having hamming weight equal to 28 or 36 with respect to the robdd parameters.
size    #f (bent)    #f (non-bent)
8    3    142
9    49    2168
10    369    15845
11    2078    84473
12    9182    271672
13    33303    1005621
14    94595    2559252
15    195267    7037689
16    273064    10165891
17    225355    8587546
18    121807    5018437
19    37694    1889635
20    6794    506954
21    440    35829

width    #f (bent)    #f (non-bent)
2    269858    8434452
3    590964    22425672
4    133894    4584176
5    5284    155319

#paths    #f (bent)    #f (non-bent)
11    3    194
12    26    1593
13    120    5926
14    397    18922
15    1586    68573
16    3643    151345
17    7840    302133
18    20466    772155
19    37087    1363662
20    61213    2197944
21    83862    3018102
22    116633    4201566
23    135603    5127539
24    136380    6056333
25    117712    5578164
26    97858    4783127
27    80974    4205942
28    46092    2589411
29    26284    1881356
30    16647    1204724
31    8095    712410
32    1467    166751
33    12    1478

apl    #f (bent)    #f (non-bent)
3.90909    3    122
3.91667    12    568
4.07692    24    799
4.08333    6    188
4.15385    32    1420
4.16667    8    521
4.21429    48    2278
4.23077    12    499
4.25    48    2568
4.26667    284    10891
4.28577    178    6995
4.30786    45    1591
4.3125    70    4575
4.33333    449    20481
4.35714    36    1665
4.38463    7    302
4.4    284    9895
4.41176    200    9678
4.42857    72    2889
4.4375    1254    59642
4.46667    282    9612
4.47059    96    4576
4.50019    823    47612
4.52941    660    28902
4.53333    14    569
4.55556    656    30187
4.56383    900    35671
4.57143    2    85
4.58916    3783    165761
4.60025    221    8647
4.61111    106    4854
4.62533    220    12587
4.63158    1266    52908
4.64706    1362    49762
4.65    6022    158712
4.66667    10588    458972
4.68421    7271    276134
4.6875    36    1589
4.7    1436    57125
4.7059    1276    45476
4.72228    6569    245564
4.73684    1609    47933
4.75    2227    78231
4.7619    7652    256443
4.76471    90    3554
4.77778    1040    39677
4.78947    19927    819556
4.80001    8330    122
4.80952    868    44599
4.8125    13    568
4.81818    4922    228955
4.82353    103    5601
4.83333    1128    57546
4.84214    4359    178221
4.85    5518    316647
4.85729    9472    376566
4.86431    12929    564982
4.88235    234    9677
4.89596    2046    74766
4.90063    34213    1956687
4.90478    15467    602886
4.90909    2191    80885
4.91304    35656    2012624
4.91667    18849    604202
4.9447    147    6457
4.95023    2078    74702
4.95238    25216    870164
4.95424    398    19465
4.95456    45164    1972354
4.95652    11110    575503
4.95833    639    26774
5.00007    68074    3089332
5.04167    5929    258466
5.04348    42467    1704665
5.04545    29139    956027
5.04762    172    5702
5.05    52    1795
5.0527    96    3899
5.07692    13845    507879
5.08    36418    1422644
5.08333    10918    403354
5.08696    6941    353002
5.09117    13059    680223
5.09532    574    31066
5.10649    8    346
5.10714    5499    285302
5.11111    39390    1896601
5.11538    23450    873321
5.12    3655    154667
5.125    45599    1564998
5.13043    31476    1502337
5.13636    234    13121
5.14815    3456    165209
5.15067    39    1466
5.15385    19933    698680
5.16    32140    1570665
5.16667    6967    257702
5.17241    2640    98321
5.17391    2862    97023
5.17857    20647    860785
5.18182    84    3677
5.18518    14512    798544
5.19231    757    34577
5.2    18960    870556
5.20833    13085    561017
5.21429    2091    79045
5.21739    24    885
5.22222    8080    387122
5.23077    22424    874322
5.23333    1849    70554
5.2381    9    349
5.24    4806    156855
5.24138    11841    590446
5.25    9424    519445
5.25806    5351    254023
5.25926    1928    85266
5.26087    38    1671
5.26667    9674    385661
5.26923    5140    202677
5.27586    2173    75331
5.28    4278    165006
5.28571    2793    95661
5.29167    24    1164
5.2963    12014    485664
5.30769    2616    106447
5.31034    3726    185433
5.31818    1    43
5.32143    3222    135886

table 3 the number of bent and non-bent functions of 8 variables having hamming weight equal to 120 or 136 with respect to the robdd parameters.

size    #f (bent)    #f (non-bent)
16    9    1040
17    37    4074
18    110    12061
19    196    19866
20    438    41860
21    960    91649
22    1598    137048
23    2006    160217
24    2518    219748
25    1340    121521
26    740    67832
27    40    3821
28    8    836

width    #f (bent)    #f (non-bent)
2    9110    712085
3    890    83305

#paths    #f (bent)    #f (non-bent)
23    15    1661
24    98    9720
25    167    16465
26    405    34536
27    261    21638
28    527    41872
29    925    73051
30    1597    124734
31    3899    331211
32    1446    125449
33    624    63760
34    36    4603

apl    #f (bent)    #f (non-bent)
5.80769    300    40801
5.83333    30    2161
5.84    120    9212
5.91304    15    1408
5.91667    60    6304
5.92593    88    6602
5.96    16    1282
5.96154    44    3240
6    24    1960
6.0303    540    52486
6.03333    48    3595
6.03448    36    3342
6.03571    114    9354
6.03704    16    1750
6.03846    8    705
6.04    22    1886
6.04167    8    711
6.0625    216    17381
6.06452    54    3865
6.07143    24    1938
6.07407    44    3982
6.07692    10    764
6.09375    720    52980
6.10345    372    28586
6.11538    8    706
6.11765    36    3296
6.12    4    341
6.12903    3600    269403
6.13333    1329    124259
6.14286    136    9576
6.14815    38    2902
6.15152    36    2920
6.15385    22    2114
6.15625    18    1390
6.16    5    84
6.16667    72    5751
6.17241    402    28489
6.18182    48    4112
6.2    40    2971
6.21875    492    40242
6.22222    68    5509
6.22581    233    22063
6.23077    13    1071
6.24138    64    5356
6.25    215    23555
6.25806    12    860
6.26667    102    7354
6.32143    14    1166
6.33333    13    1385
6.34483    51    4902

5. conclusions and future work

one efficient way to represent boolean functions is with a reduced ordered binary decision diagram. the strength of robdds is that they can represent boolean function data with a high level of redundancy in a compact form. the quality of compactness is expressed by the basic robdd parameters or characteristics: the size, the number of paths, the width, and the average path length. using these characteristics, bent functions can be analyzed to determine their properties better. this paper investigates the characteristics of bent functions with a focus on their basic robdd parameters. a decision diagram experimental framework has been used to implement a program for the calculation of these parameters. the complete set of all bent functions is analyzed for functions of 4 variables. because the discovery of bent functions of 6 and 8 variables is very time-consuming, robdd characteristics were analyzed on a set of 1 million and 10,000 bent functions, respectively. so that these experimental results can be used in future research, this paper also investigates the robdd characteristics of non-bent functions of n variables having hamming weight equal to $2^{n-1} \pm 2^{n/2-1}$, with a focus on the same parameters.
the complete set of all non-bent functions of 4 variables is analyzed. the set of non-bent functions of 6 variables is analyzed on a sample collected while 1 million bent functions were discovered, and the set of non-bent functions of 8 variables on a sample collected while 10,000 bent functions were discovered. from the experimental results, it is evident that for bent functions of 4 variables there is a small set of values of robdd characteristics that most bent functions have. for example, about 70% of these functions have robdd size 7 or 8, 63% have 9, 10 or 11 paths, 58% have width 2, and 40% have average path length 3.33333 or 3.5. for bent functions of 6 variables, the values of the robdd characteristics shared by the largest number of bent functions can be determined again. the same applies to bent functions of 8 variables. it was also determined that the largest number of n-variable bent functions has size $4n - 8$, and that the most frequent average path length of bent functions of 4 and 6 variables is very close to $0.8333333\,n$. unfortunately, it was not possible to confirm the same average path length formula for bent functions of 8 variables. perhaps the reason for this is the small set of functions that was tested. from the experimental results for non-bent functions, it is evident that they follow the characteristics of bent functions. non-bent functions outnumber bent functions by a factor of about 20 for functions of 4 variables, about 40 for functions of 6 variables, and about 80 for functions of 8 variables. these values also represent the search space when creating a potential method for discovering bent functions using robdd characteristics. the results presented in this paper are intended to be used to create methods for the construction of bent functions using the robdd as a data structure from which the bent functions can be discovered.
research in this direction can reduce the time for discovering random bent functions. in addition, the results in this work represent new boundaries within which bent functions can be detected. it was shown that a large percentage of bent functions of 4, 6 and 8 variables fall within a very small range of the robdd characteristics tested in this paper. also, based on individual robdd characteristics, new subsets of bent functions can be defined. bent function discovery can be performed within these robdd subsets that have a predefined hamming weight. future work will address the study of pairs or larger groups of robdd parameters of bent functions. it can also be extended to additional robdd parameters of bent functions, as well as to the study of the characteristics of not only binary decision diagrams, but also other types of diagrams, such as functional decision diagrams, algebraic decision diagrams, kronecker decision diagrams, pseudo-kronecker decision diagrams, etc. [7], [8], [11].

references

[1] o. rothaus, "on bent functions", j. comb. theory ser. a, vol. 20, pp. 300-305, 1976.
[2] o. logachev, a. salnikov and v. yashchenko, boolean functions in coding theory and cryptography, american mathematical society, 2012.
[3] s. mesnager, bent functions, fundamentals and results, springer international publishing, 2016.
[4] n. tokareva, bent functions, results and applications to cryptography, academic press, 2015.
[5] m. stanković, c. moraga and r. stanković, "an improved spectral classification of boolean functions based on an extended set of invariant operations", fu: elect. energ., vol. 31, no. 2, pp. 189-205, 2018.
[6] p. langevin and g. leander, "counting all bent functions in dimension eight 99270589265934370305785861242880", designs, codes and cryptography, vol. 59, pp. 193-201, 2011.
[7] t. sasao and m. fujita, representations of discrete functions, kluwer academic publishers, boston, 1996.
[8] r. drechsler and b. becker, binary decision diagrams: theory and implementation, springer us, 2013.
[9] n. schafer, "the characteristics of the binary decision diagrams of bent functions", m.s. thesis, naval postgraduate school, monterey, ca, september 2009.
[10] m. radmanović, "efficient discovery of bent function using reed-muller subsets", in proceedings of the 55th int. scientific conference on information, communication and energy systems and technologies (icest 2020), pp. 7-10, 2020.
[11] m. g. karpovsky, r. s. stanković and j. t. astola, spectral logic and its applications for the design of digital devices, wiley, 2008.
[12] m. thornton, r. drechsler and d. miller, spectral techniques in vlsi cad, springer us, 2012.
[13] s. nagayama, a. mishchenko, t. sasao and j. t. butler, "minimization of average path length in bdds by variable reordering", in proceedings of the international workshop on logic and synthesis, 2003, pp. 207-213.
[14] k. brace, r. rudell and r. bryant, "efficient implementation of a bdd package", in proceedings of the 27th acm/ieee design automation conference, 1990, pp. 40-45.
[15] f. somenzi, "efficient manipulation of decision diagrams", software tools for technology transfer, vol. 3, no. 2, pp. 171-181, 2001.

facta universitatis series: electronics and energetics vol. 28, no. 1, march 2015, pp. 1-15 doi: 10.2298/fuee1501001b

igbts working in the ndr region of their i-v characteristics

riteshkumar bhojani 1, thomas basler 1, josef lutz 1, roland jakob 2
1 department of power electronics and electromagnetic compatibility, technische universität chemnitz, germany
2 ge energy power conversion berlin, germany

abstract. this paper presents detailed work on high voltage igbts using simulations and experiments. the current-voltage characteristics were measured up to the breakthrough point in the forward bias operating region at two different temperatures for a 50 a/4.5 kv rated igbt chip.
the experimentally measured data were in good agreement with the simulation results. it was also shown that the igbts are able to clamp high collector-emitter voltages even when a low gate turn-off resistor is combined with a high parasitic inductance. uniform 4-cell and 8-cell igbt models were created in the tcad device simulator to conduct the investigation. the resulting filamentation behaviour during short-circuit turn-off is briefly reviewed using isothermal as well as thermal simulations and semiconductor approaches for the development of filaments. the current filament inside the active cells of the igbt is considered to be one of the possible destruction mechanisms leading to device failure.

key words: igbt, i-v characteristic, voltage clamping, short-circuit, filamentation

1. introduction

igbts are among the most important power semiconductor devices in low, medium and high power applications ranging from a few hundred volts to several thousand volts. this device offers excellent switching behaviour, easy gate drivability, a wide safe operating area, snubber-less operation and robust turn-off capability. in addition, the capability to limit the short-circuit current is one of the superior properties of igbts [3, 4]. the basic physics of igbts is explained well in the literature [1-4]. in the present work, the investigated igbt chips are taken from a high voltage igbt press-pack device which consists of 42 single igbt chips. each chip is rated at 50 a for a 50 hz half-sine waveform at 75 °c and has a blocking capability of 4.5 kv. the purpose of this paper is to give a comprehensible explanation of the device behaviour from static up to dynamic characteristics. simulations are performed here to investigate internal effects of the igbts. the igbts presented here have a planar cell structure and a field-stop layer at the collector side.

received october 15, 2014
corresponding author: riteshkumar bhojani, department of power electronics and electromagnetic compatibility, technische universität chemnitz, reichenhainer str. 70, room h109, d-09126 chemnitz, germany (e-mail: riteshkumar.bhojani@etit.tu-chemnitz.de)

2. igbt complete i-v characteristics up to breakdown point

to understand the device behaviour, the static characteristics of the igbt at different gate voltages have to be well understood. the technique to measure igbt static characteristics non-destructively was explained in [5]. the complete static characteristics were measured using two different measurement setups. the first setup uses the "tektronix 371b curve tracer" up to the maximum power of 3 kw. the measurement points in the desaturation and breakthrough area at the breakthrough branch were taken using a single pulse short-circuit (sc) type 1 measurement setup given in fig. 1(a) [5].

fig. 1 (a) sc 1 test circuit (b) sc example pulse for static characteristic measurement vge = 15 v, vdc = 3.5 kv, lpar = 3.9 μh, rg,on = 44 Ω, rg,off = 220 Ω, t = 400 k

a protection igbt (sigbt) was used to turn off the short-circuit in case of a dut (device under test) failure. several measurement points of the desaturation area were taken during the static short-circuit phase. high parasitic inductances (lpar) up to 14 μh in combination with low rg,off were used to produce high overvoltages during sc turn-off. thereby, the breakdown point of the output characteristic can be attained for a short time interval. the courses of vce, ic, vge and ig, the read-out times of the different measurement points and the energy loss during the short-circuit measurement pulse are displayed in fig. 1(b). the applied collector-emitter voltage during this measurement was 3.5 kv at a temperature of 400 k.
the gate-emitter voltage (vge) and the collector-emitter voltage (vce) were measured very close to the chip to avoid parasitic influences. the charging of the igbt capacitances and the self-heating during the short-circuit pulse have to be considered to get exact points of the i-v characteristics. since the width of the short-circuit pulse is reduced to small values, the losses generated up to the measurement point are low. only a very small increase in the chip temperature is calculated, owing to the large base width of the high voltage igbt. the temperature change $\Delta t_{sc}$ during the short-circuit turn-off was calculated using equation (1), assuming a homogeneous temperature distribution throughout the chip [5]:

$\Delta t_{sc} = \frac{w_{sc}}{c_{th}} = \frac{w_{sc}}{c_{th,si} \cdot \rho \cdot d \cdot a}$ (1)

$\Delta t_{sc} = \frac{0.7\ \mathrm{J}}{788\ \mathrm{J\,kg^{-1}\,K^{-1}} \cdot 0.00234\ \mathrm{kg\,cm^{-3}} \cdot 0.128\ \mathrm{cm^{3}}} \approx 3\ \mathrm{K}$

in equation (1), $w_{sc}$ is the energy loss during the single pulse short-circuit, $c_{th}$ is the thermal capacitance, $c_{th,si}$ is the lattice heat capacity of silicon, $\rho$ is the density of silicon, $d$ is the igbt thickness and $a$ is the igbt area. the calculated temperature rise for the sc was added to the measurement temperature of the static curve tracer measurements [5]. the measured and simulated i-v characteristics of the igbts are shown in fig. 2(a) and fig. 2(b) at the two temperatures 300 k and 400 k, respectively. the graphs exhibit good agreement between measured and simulated results. at higher currents, the bipolar current gain increases due to the increase in injection efficiency and base transport factor. this is the reason for the increase in igbt saturation current at higher collector-emitter voltages. the current equation of the igbt and its relation to the bipolar current gain are given below [1-4]:

$i_{c,sat} = \frac{k}{2} (v_{ge} - v_{th})^2 \cdot \frac{1}{1 - \alpha_{pnp}}$ (2)

$k = \mu_n \cdot c_{ox} \cdot \frac{w}{l}$ (3)

$\alpha_{pnp} = \gamma_e \cdot \alpha_t$ (4)

$\gamma_e = \frac{j_p}{j_p + j_n}$ (5)

$\alpha_t = \frac{1}{\cosh\left( \frac{w_{eff}}{l_{p,nb}} \right)}$ (6)

fig. 2 measured (left-side) and simulated (right-side) igbt static characteristics (a) t = 300 k (b) t = 400 k

in equation (2), $k$ is the channel conductivity and $\alpha_{pnp}$ is the bipolar current gain, given by equations (3) and (4). $v_{ge}$ is the gate-emitter voltage, $v_{th}$ is the threshold voltage, $\mu_n$ is the electron mobility, $c_{ox}$ is the oxide capacitance, $w$ and $l$ are the channel width and length, $\gamma_e$ is the injection efficiency, $\alpha_t$ is the base transport factor, $j_p$ is the hole current density, $j_n$ is the electron current density, $w_{eff}$ is the non-depleted width of the n base and $l_{p,nb}$ is the diffusion length for holes in the collector-side buffer region. the higher the applied battery voltage, the more holes are injected from the collector side. if vce increases, the space charge region expands. consequently, the effective base width, i.e. the non-depleted width of the base region under the applied collector-emitter voltage, is reduced. therefore, the injection efficiency and the base transport factor increase with higher vce, which results in an increased bipolar current gain. hence, the igbt collector current grows with increasing collector-emitter voltage. moreover, the reduced collector current at higher temperatures for gate voltages significantly higher than the threshold voltage results from the strong temperature dependence of the mobility. for gate voltages slightly higher than the threshold voltage, the saturation current can also increase with temperature due to the strong reduction of vth. these operating points are below the so-called "temperature compensation point" (tcp) [18]. when the gate voltage is lower than the threshold voltage of the igbt, the mos channel cannot supply electrons to conduct current. as the gate voltage comes close to the threshold voltage, a very small current flows because electrons start to accumulate below the gate oxide.
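the dependence of the saturation current on the effective base width described by equations (2), (4) and (6) can be illustrated numerically. the python sketch below is only a qualitative illustration: all parameter values are hypothetical and are not taken from the paper, and the injection efficiency is held constant although the text notes that it also rises with vce. the function names are assumptions as well.

```python
import math


def base_transport_factor(w_eff, l_p):
    """Equation (6): alpha_t = 1 / cosh(w_eff / l_p)."""
    return 1.0 / math.cosh(w_eff / l_p)


def saturation_current(v_ge, v_th, k, gamma_e, w_eff, l_p):
    """Equations (2) and (4): k/2 * (v_ge - v_th)**2 / (1 - alpha_pnp)."""
    alpha_pnp = gamma_e * base_transport_factor(w_eff, l_p)
    return 0.5 * k * (v_ge - v_th) ** 2 / (1.0 - alpha_pnp)


# Hypothetical illustration values (NOT from the paper):
# k in A/V^2, voltages in V, lengths in cm.
params = dict(v_ge=15.0, v_th=6.0, k=1.0, gamma_e=0.5, l_p=10e-4)

# A higher v_ce widens the space charge region and shrinks w_eff.
for w_eff in (30e-4, 20e-4, 10e-4):
    alpha_t = base_transport_factor(w_eff, params["l_p"])
    i_sat = saturation_current(w_eff=w_eff, **params)
    print(f"w_eff = {w_eff:.0e} cm  alpha_t = {alpha_t:.3f}  i_c,sat = {i_sat:.1f} A")
```

with any such parameter set, a smaller non-depleted base width gives a larger base transport factor and therefore a larger saturation current, which is the trend described in the text.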
the saturation current strongly depends on the thickness of the gate oxide layer below the gate contact. a small reduction in gate oxide thickness influences the channel conductivity $k$, which is given by equation (3). thus, the gate oxide thickness is one of the crucial parameters for designing the igbt simulation model. on the breakdown characteristic, the measurement results show that for vge > vth the igbt is able to block about 4.2 kv at 300 k. this breakdown voltage is significantly lower than the breakdown voltage at vge = 0, which is 5.5 kv as shown in fig. 2(a). at 400 k, the measured breakdown voltage is slightly lower than the simulated breakdown voltage. in practice, during the short-circuit turn-off event with low rg,off, it is likely that the operating point reaches the ndr (negative differential resistance) branch of the igbt static characteristics. this can trigger destructive mechanisms in the igbt. one of the possible destruction mechanisms is filamentation due to the rapid turn-off of the igbts during short-circuit type 1. the filamentation phenomenon is explained later using simulation results.

3. self-clamping at short-circuit turn-off of high voltage igbts

igbts have the potential to block high overvoltages during fast switching or short-circuit. in this part of the work, investigations were carried out on a single igbt. the measured igbts were able to clamp the overvoltages induced during the short-circuit turn-off event. a high vce can occur during fast short-circuit turn-off and self turn-off [7], and it can be clamped by the igbt itself. a similar clamping mechanism has already been described for the igbt overcurrent turn-off in [9] and for the switching self clamping mode (sscm) [7-9]. however, for a low gate turn-off resistor, the igbt turn-off may become critical and unstable, which leads to device destruction. here, the gate turn-off resistor (rg,off) controls the speed and the overvoltage during the turn-off process.

fig. 3 igbt destruction during short-circuit turn-off (initial failure: freewheeling diode failed during double pulse test with high turn-on inductance, sc type 2 for igbt) vdc = 3 kv, lpar = 7.75 μh, rg,off = 300 Ω, t = 400 k

fig. 3 shows the waveforms of the short-circuit type 2 turn-off, where at point 1 the freewheeling diode fails first during its reverse recovery and the igbt runs into short-circuit. in the measurement, the igbt current exceeds the measurement range of the rogowski coil used. at point 3, the igbt is automatically turned off by the gate drive unit. as a result, a large overvoltage is induced which reaches the post-avalanche branch of the static characteristic, and the igbt is destroyed, see point 4. the values used for the parasitic inductance, gate turn-off resistor and temperature are given in the caption of fig. 3. the destruction point on the chip is clearly visible from the emitter side as well as from the collector side [14]. a crack formed during the destruction and propagates towards the junction termination. a detailed explanation of the different short-circuit types is given in [10-12]. fig. 4(a) and (b) show the measured and simulated clamping mechanisms of the igbts during single pulse short-circuit turn-off. to compare measurement with simulation, different values for the gate turn-off resistor were taken. a high overvoltage is induced by the high parasitic inductance during the falling dic/dt. for small rg,off, dic/dt is limited by avalanche generation to

$\frac{di_c}{dt} = -\frac{v_{clamp} - v_{dc}}{l_{par}}$ (7)

which is defined for the sscm mode and holds for measurements as well as simulations [9]. in contrast to overcurrent turn-off, the charge carrier density is quite low in the short-circuit case due to the high applied electric field.

fig. 4 (a) measured sc 1 behaviour vdc = 3 kv, lpar = 7.3 μh, t = 300 k (b) simulated sc 1 behaviour vdc = 3 kv, lpar = 10 μh, t = 300 k

measurements and simulations were done using different gate turn-off resistors. the clamping mechanism is shown with the help of simulations using a half-cell igbt model in fig. 4(b). half-cell, 4-cell and 8-cell igbt models are proposed in this work to inspect the physical behaviour. they were fitted to match the behaviour of the real igbt. the measurements clearly show a course through the ndr region of the static curve during the phase of rapidly increasing vce at fast turn-off. at this point, the igbt turn-off behaviour relies completely on the course of the applied gate voltage, which will be explained in the next section. the ndr branch of the igbts is related to the bipolar current gain explained in the previous section. it has a strong influence on the i-v characteristics. by increasing the buffer depth of the igbt, it is possible to adjust the bipolar current gain and emitter efficiency, which results in a high blocking capability. that means the ndr branch shifts towards higher voltages for, e.g., deeper field-stop regions. there are other possible ways to adjust the bipolar current gain. a more detailed explanation is given in the literature [13, 14]. from this clamping behaviour alone, it is difficult to identify the physical reasons behind the destruction of the igbts. therefore, isothermal and thermal simulations of 4-cell and 8-cell igbt models were carried out to investigate the cause of the igbt destruction.

4. filamentation in igbts during short-circuit turn-off

simulation allows complex analysis of the internal behaviour and gives access to several physical parameters such as electric field expansion, electron and hole density, temperature distribution, current density and much more.
precise knowledge of these physical quantities can explain the possible reasons for device destruction. in this section, isothermal simulations of the 4-cell and 8-cell igbt models are presented. it is not possible to observe filamentation behaviour with a half-cell igbt model. therefore, the igbt model must contain more than one full cell structure for current filaments to form. the simulation time also has to be considered when simulating structures with a large number of igbt cells. filaments have been found before in igbts under various conditions and at different places inside the device [15-17]. a single-pulse short-circuit type 1 turn-off with a pulse length of 20 µs was used. at 20 µs, the igbt was turned off with a high current slope dic/dt. a very steep fall of the current is more likely to impose voltage and current stress on the igbt during turn-off. the values of the parasitic inductance, the applied battery voltage and the gate turn-off resistor were adjusted to achieve fast switching. under these circumstances, the igbt has to operate on the breakdown branch. while the current falls, the collector-emitter voltage of the igbt rises rapidly. fig. 5(a) and (b) show the simulated short-circuit type 1 switching waveforms. collector current, collector-emitter voltage and gate voltage are plotted as functions of time. as long as the gate voltage is higher than the threshold voltage, the igbt can conduct a large saturation current while the large battery voltage is applied across it during short circuit. these points can be stable operating points on the linear region of the igbt static characteristics. on the other hand, at some point the gate voltage falls below the threshold voltage. when that happens, the mos channel below the gate oxide closes and the igbt stops conducting through this channel. 
this is where the ndr (negative differential resistance) branch starts to influence the device behaviour. from now on, the igbt is “self-controlled”, and the ndr branch has a stronger impact on the switching behaviour for fast and highly inductive turn-off.

fig. 5 simulation of sc turn-off (a) 4-cell model (black lines) and 8-cell model (red dashed lines), (b) magnified picture; vdc = 3.5 kv, lpar = 10 μh, rg,off = 30 Ω, t = 300 k

the igbt operating point is already on the ndr branch because of the large voltage overshoot. this ndr phenomenon may be destructive. the operating point of the igbt reaches this branch only if the gate voltage falls below the threshold voltage of the given igbt, compare the static i-v characteristics shown above. in this case it is hard for the igbt to survive. one of the possible failure pictures was given in fig. 3. the behaviour of the current filaments during short-circuit turn-off is illustrated for the 4-cell and 8-cell models in fig. 6(a) and (b), respectively. at the beginning of the short-circuit turn-off event, the gate signal is set to zero to turn off the igbt. the igbt current starts to fall while the gate voltage approaches 0 v. starting with the 4-cell model at 19.5 µs, the igbt still conducts current through its fully opened n-channel. the base of the igbt is flooded with charge carriers. electrons flow from emitter to collector and holes in the opposite direction. the electron and hole densities during the igbt conducting state are significantly lower than 1 × 10^15 cm^-3. the current density in each cell at the p-well is about 1 ka/cm^2, which is quite high. since the gate voltage is 12.6 v, the igbt channel is still conducting. the generated overvoltage at this point is about 870 v. 
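as a rough cross-check of these numbers, the sketch below applies the inductive relation behind equation (7), overvoltage = lpar · |dic/dt|, to the values quoted for the 4-cell simulation (lpar = 10 μh from the caption of fig. 5, overvoltage of about 870 v, battery voltage 3.5 kv). the function and variable names are illustrative only and do not come from the paper; the result is an estimate of the current slope, not a simulated value.

```python
def current_slope(v_clamp, v_dc, l_par):
    """Magnitude of di_C/dt in A/s from the relation behind equation (7).

    v_clamp: collector-emitter voltage during clamping (V)
    v_dc:    dc-link (battery) voltage (V)
    l_par:   parasitic inductance (H)
    """
    return (v_clamp - v_dc) / l_par


# Values quoted in the text for the 4-cell model at 19.5 us
V_DC = 3500.0        # battery voltage in V
OVERVOLTAGE = 870.0  # generated overvoltage in V
L_PAR = 10e-6        # parasitic inductance in H

v_ce = V_DC + OVERVOLTAGE            # voltage blocked by the junction
slope = current_slope(v_ce, V_DC, L_PAR)

print(f"v_ce      = {v_ce:.0f} V")
print(f"|di_C/dt| = {slope / 1e6:.0f} A/us")
```

under these assumptions the collector current falls at roughly 87 a/µs; a larger lpar or a steeper current fall raises the overvoltage in proportion.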
the voltage overshoot is superimposed on the battery voltage of 3.5 kv and is blocked by the reverse-biased junction at the emitter side. this overvoltage results from the high parasitic inductance, as can be seen from equation (7). that means the igbt is operating on the breakdown branch of the static characteristics. nevertheless, no destruction phenomenon is initiated at this point. it belongs to the stable operating phase during short-circuit turn-off. just 100 ns later, when the gate voltage has dropped from 12.6 v to 10.6 v, the operating point shifts on the static characteristics from the 12.6 v gate voltage branch to the 10.6 v branch. this marks the start of the ndr branch of the igbt output characteristics. since only a small current now flows through the channel, the remaining current is forced to flow directly through the p-well. the peak electric field along the reverse-biased junction at 19.6 µs exceeds 250 kv/cm. this high electric field generates a large number of electron-hole pairs by avalanche generation at this junction. the current is then carried by the remaining charge carriers in the igbt base region together with the avalanche-generated electron-hole pairs. this time point is very important for a physical understanding of the igbt destruction. it can be a stable operating point only if all igbt cells carry the same current; otherwise it is the starting point of current filamentation. at 19.7 µs, the channel is partially closed because of the further reduction in gate voltage. here, a small current is still flowing through the channel, which is clearly visible in fig. 6 at the 2nd and 4th cells of the 4-cell igbt model. at this time point, the gate voltage, collector current and collector-emitter voltage are 10.1 v, 277 a and 4335 v, respectively. a small kink can be seen in the magnified voltage waveform of the 4-cell igbt inside the blue circle of fig. 5(b). as can be seen from fig. 
6, the current filament is already initiated at the first and third cells from the left-hand side of the igbt structure. the current density along the third cell doubles at this time point because of the avalanche-generated electron-hole pairs. subsequently, the current filament starts to dominate in a single cell. this means that the whole current starts to flow through a single cell because of its reduced resistance, which leads to current crowding. eventually, at 20 µs only a single filament remains in one of the active cells in the simulation results of the 4-cell igbt model.

fig. 6 filamentation in the igbt during isothermal simulations (a) 4-cell model and (b) 8-cell model; vdc = 3.5 kv, lpar = 10 μh, rg,off = 30 Ω, t = 300 k

the investigation was also carried out for an 8-cell igbt model, again simulated at room temperature, and similar behaviour was observed. out of eight cells, current filaments initially form in four apparently random active cells of the igbt at 19.90 µs. the intensity of these current filaments differs from cell to cell. later, the igbt current tends to flow through one single cell. as in the 4-cell igbt model, the current filament converges, here on the sixth cell at 22.1 µs. in practice, a very high localized current density increases the temperature inside the igbt, which cannot be seen in an isothermal simulation. hence, temperature should be included in the device simulation. to examine the influence of temperature on filamentation, a thermodynamic simulation was performed using the 4-cell igbt model. a similar current filament effect was found in this simulation. the starting temperature and the pulse width were set to 300 k and 10 µs, respectively. this thermodynamic simulation gives a more realistic picture of the igbt behaviour in a real application. fig. 
7(a) and (b) show the thermodynamic switching behaviour and the corresponding behaviour of the current filament. to correlate the switching behaviour with the internal igbt behaviour during turn-off, eight different time points are used to identify the physical reasons for the igbt failure. as previously explained, the gate voltage plays a major role during the short-circuit turn-off. the behaviour of the current filament is shown in fig. 7(b) at eight different points of the switching waveform. at 10.5 µs, the collector-emitter voltage is about 5 kv and the gate voltage is 13.6 v. this large overvoltage is induced by the falling dic/dt. the igbt is operating on the avalanche branch because of the high overvoltage. the first three time points belong to the stable operating points on the igbt static characteristic. at the next time step, 10.75 µs, when the gate voltage has dropped to 9.6 v, the avalanche branch becomes ndr-like and filamentation occurs. as in the isothermal simulations, the current filaments start to converge into a single active cell. eventually, the collector current reaches zero. a movement of the current filament was observed between 11.50 µs and 11.75 µs. the jumping of the filament from one cell to another can be related to the temperature compensation point (tcp) of the igbt transfer characteristics: high temperatures are reached at the position of the filament, so the ionization rates are lowered and the avalanche generation decreases. the current then tends to flow in a cooler region. thus, the current instability has been analyzed on the igbt transfer characteristics below the tcp [18]. the maximum temperature observed during the short-circuit turn-off was about 480 k at 11.50 µs. the maximum simulated electric field at this point exceeded 270 kv/cm. the high localized current density and the high temperature in a single igbt cell are the likely reason for the observed device destruction. 
during the simulations, the anomalous behaviour of the forming filaments was investigated under different starting conditions such as dc-link voltage, parasitic inductance and temperature. to improve the short-circuit turn-off ruggedness, the gate turn-off resistor must be chosen carefully. a trade-off between short-circuit turn-off losses and stable filament behaviour must be found. in practice, a high resistance is often used for the short-circuit turn-off (e.g. a soft-turn-off resistor). a reduced parasitic inductance additionally lowers the generated overvoltage. furthermore, active clamping can be used to recharge the gate and thus to overcome the ndr region. adjusting the bipolar current gain is one possibility to suppress the ndr influence directly in the semiconductor [6].

fig. 7 thermal simulation of 4-cell igbt model (a) switching characteristics, vdc = 3.5 kv, lpar = 10 μh, rg,off = 30 Ω, starting temperature t = 300 k, and (b) filamentation behaviour

5. summary

the investigation shows that it was possible to measure the i-v characteristics of the igbt up to the breakdown branch non-destructively. important facts regarding igbt failure can be derived from these measured output characteristics. the results show that the igbt is able to operate on the avalanche branch. the failure can be clearly related to the ndr branch of the i-v characteristics. a stable operating point is possible on the avalanche branch of the igbt during fast turn-off. the destructive event occurs if the igbt operating points lie on the ndr branch during the short-circuit turn-off event. igbts are able to clamp the large overvoltage generated by the steep fall of the collector current. this clamping mechanism turned out to be unstable and led the igbt towards destruction. 
the internal behaviour of the igbt was analyzed with the help of single-pulse short-circuit type 1 simulations. the observed igbt failure is connected to inhomogeneous current conduction through the igbt structure. the complete current converged into a single cell. a very high localized current density and a high temperature were observed in the simulation. the results indicate that the avalanche branch and the negative differential resistance branch belong to the same branch of the static characteristics. as long as the igbt still conducts current through its mos channel during short circuit, no destruction of the device was found. as soon as the mos channel closes, the operating points move onto the ndr branch and the igbt approaches destruction. the stability of the short-circuit turn-off can be improved by adjusting the bipolar current gain. a high gate turn-off resistor can suppress the filamentation, but it increases the short-circuit turn-off losses. a trade-off has to be found.

references

[1] j. lutz, h. schlangenotto, u. scheuermann and r. de doncker, semiconductor power devices: physics, characteristics, reliability. springer, 2011, chapter 10, pp. 315-340. doi: 10.1007/978-3-642-11125-9 [2] j. tihanyi, "mos-leistungsschalter", etg-fachtagung bad nauheim, fachbericht nr. 23, vde-verlag, pp. 71-78, mai 1988. [3] g. miller, j. sack, "a new concept for a non punch through igbt with mosfet like switching characteristics", proceedings of the pesc '89, vol. 1, pp. 21-25, june 1989. doi: 10.1109/pesc.1989.48468 [4] t. laska et al, "short circuit properties of trench/field stop igbts design aspects for a superior robustness", proceedings of the 15th international symposium on power semiconductor devices & ics, cambridge, pp. 152-155, april 2003. doi: 10.1109/ispsd.2003.1225252 [5] t. basler, r. bhojani, j. lutz and r. 
jakob, "measurement of a complete hv igbt iv characteristic up to breakdown point", 15th european conference on power electronics and applications (epe), pp. 1-9, september 2013. doi: 10.1109/epe.2013.6634454 [6] t. basler, r. bhojani, j. lutz and r. jakob, "dynamic self-clamping at short-circuit turn-off of high-voltage igbts", proceedings of the 25th international symposium on power semiconductor devices & ics, japan, pp. 277-280, may 2013. doi: 10.1109/ispsd.2013.6694440 [7] t. basler, j. lutz, t. brückner and r. jakob, "igbt self-turn-off under short-circuit condition", international seminar on power semiconductor, prague, pp. 235-242, september 2013. [8] url: https://www.bibliothek.tu-chemnitz.de/uni_biblio/frontdoor.php?source_opus=7514 [9] t. basler, j. lutz, r. jakob and t. brückner, "the influence of asymmetries on the parallel connection of igbt chips under short-circuit condition", 14th european conference on power electronics and applications (epe), pp. 1-8, september 2011. [10] url: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6020357 [11] m. rahimo, a. kopta, s. eicher, u. schlapbach, and s. linder, "switching-self-clamping-mode “sscm“, a breakthrough in soa performance for high voltage igbts and diodes", proceedings of the 16th international symposium on power semiconductor devices & ics, japan, pp. 437-440, may 2004. doi: 10.1109/ispsd.2004.1332970 [12] h. g. eckel and l. sack, "experimental investigation on the behaviour of igbt at short-circuit during the on-state", 20th international conference on industrial electronics, control and instrumentation (iecon), vol. 1, pp. 118-123, september 1994. doi: 10.1109/iecon.1994.397762 [13] lutz j, döbler r, mari j, menzel m: “short circuit iii in high power igbts” proceedings epe, barcelona (2009) url: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5279057 [14] j. lutz and t. 
basler: "short-circuit ruggedness of high-voltage igbts", 28th international conference on microelectronics (miel), pp. 243-250, may 2012. doi: 10.1109/miel.2012.6222845 [15] t. basler, "short-circuit ruggedness of high-voltage igbts", phd thesis, technische universität chemnitz, february 2014. [16] url: http://www.qucosa.de/fileadmin/data/qucosa/documents/14710/dissertation_thomas_basler.pdf [17] r. bhojani, "simulation of high-voltage igbts in short-circuit and avalanche mode", master thesis, technische universität chemnitz, april 2013. [18] p. rose, d. silber, a. porst and f. pfirsch, "investigations on the stability of dynamic avalanche in igbts", proceedings of the 14th international symposium on power semiconductor devices & ics, new mexico, pp. 165-168, june 2002. doi: 10.1109/ispsd.2002.1016197 [19] t. raker, h. felsl, f. niedernostheide, f. pfirsch, and h. schulze, "limits of strongly punch through designed igbts", proceedings of the 23rd international symposium on power semiconductor devices & ics, u.s.a., pp. 100-103, may 2011. doi: 10.1109/ispsd.2011.5890800 [20] z. chen, k. nakamura and t. terashima, "ltp(ii)-cstbt tm (iii) for high voltage application with ultra robust turn-off capability utilizing novel edge termination design", proceedings of the 24th international symposium on power semiconductor devices & ics, belgium, pp. 25-28, june 2012. doi: 10.1109/ispsd.2012.6229014 [21] d. dibra, m. stecher, s. decker, a. lindemann, j. lutz and c. kadow, "on the origin of thermal runaway in a trench power mosfet", ieee trans. on electron devices, vol. 58, pp. 3477–3483, october 2011. 
doi: 10.1109/ted.2011.2160867

facta universitatis series: electronics and energetics vol. 30, no 3, september 2017, pp. 383-390 doi: 10.2298/fuee1703383v

exploration towards electrostatic integrity for sige on insulator (sg-oi) on junctionless channel transistor (jlct)

b vandana, jitendra kumar das, b shivaval patro, sushanta kumar mohapatra

school of electronics engineering, kiit university, bhubaneswar, odisha, india

abstract. with a view to reducing the electric field and avoiding source/drain engineering, this work explores the strain effect in a junctionless channel transistor. to achieve a scaled ioff while maintaining ion, the sg-oi jlct device is proposed. the study discusses the higher switching action obtained with mole fraction x = 0.25. the dependence on ϕm and nd is responsible for maintaining a constant current throughout the analysis.

key words: sg-oi jlct, soi jlt, drift diffusion carrier mobility, on-off currents.

1. introduction

the si-based semiconductor industry started in 1959 and still continues to follow moore’s law. scaling has introduced different leakage currents in the conventional metal oxide semiconductor (mos) field effect transistor (fet), which degrade the device performance. this poses a challenge to device engineers. a brief description of the various leakage currents and the issues at scaled channel length is given in [1]. to overcome these issues, approaches such as high-κ gate oxide engineering, spacer engineering, new materials and new structural designs have been reported [2], [3], and new device structures have been developed accordingly. 
new design architectures such as silicon on insulator (soi) [4], double gate mosfet (dgmosfet) [5], [6], tri gate mosfet (tmosfet) [7], gate all around (gaa-mosfet) [8], fin-fet [9], [10] etc. have been described. apart from these, a device structure based on lilienfeld’s first transistor architecture [11] has been identified, followed by various structural design approaches such as the trigate architecture with no doping gradients given in [12]. accordingly, vertical gate stack soi and bulk planar junctionless transistors are reported in [13]–[15].

received september 29, 2016; received in revised form january 23, 2017 corresponding author: b vandana school of electronics engineering, kiit university, bhubaneswar, odisha, india (e-mail: vandana.rao20@gmail.com)

junctionless nanowire (jn) transistors have a uniform, heavy doping profile of 10^19 to 10^20 cm^-3 within the si device layer. the jn is essentially a normally-on resistor that does not require any metallurgical junctions at the channel edges. depending on the type of transistor, n+ or p+ doping is used along the s/d and channel regions. this approach is simple and can be fabricated with standard cmos technology [16], [17]. the physics behind the architecture is given for lg < 20 nm. for an n-type device, the n+ doping concentration generates a high electric field along the vertical direction, which makes the channel fully depleted below vth with vgs = 0 v, while above vth the field drops to zero. because of these merits, the jlt in its different structures is preferable for suppressing short channel effects under scaling. this paper proposes a sige on insulator (sg-oi) [18]–[21] junctionless channel transistor. a uniformly nd-doped sige layer is used to evaluate the device performance with respect to ioff. the parameters listed in table 1 are used to investigate the electrostatic integrity of the device. the obtained results are compared with the soi-jlt and the conventional mosfet. 
fig. 2 shows that our simulation model is in good agreement with [13]. keeping the inherent features of the jlt, a si1-xgex material with mole fraction x = 0.25 is used along the s/d and channel regions with uniformly high nd. in the conduction mechanism of the jlt, the difference ϕm − ϕs (the jlt conducts for a work function above 5 ev) leads to a positive shift in vth, and the bands become flat at vfb, which then opens a path for conduction at positive vgs. the channel depletes completely at zero vgs. at high nd the mobility degrades perpendicular to the channel, while a low electric field enhances the mobility. following this introduction, section ii discusses the device structure, the physics behind the device and the models activated for the simulations. section iii describes the study of the electrostatic integrity of the sg-oi jlct. section iv gives the conclusion and remarks on the proposed device.

2. silicon germanium on insulator junction-less channel transistor (sg-oi jlct)

the schematic diagram of the sg-oi jlct is shown in fig. 1. the architecture has no metallurgical junction in the lateral direction, hence the name jlct. the device has been designed according to the features and specifications listed in table 1, and the parameter specifications are taken from [4], [13], [14].

fig. 1 cross sectional view of sg-oi jlct

junctionless transistors are devices with no doping gradients across the source, channel and drain edges. usually, the device layer is doped with a high doping density. the gate metal has a high work function ϕm of 5.1 ev. sio2 is used for isolation. the spacers are made of high-κ hfo2 [22], [23]. a fully depleted sige layer is grown epitaxially on an insulator (fd sg-oi jlct), forming a conducting path across the device layer. strain-induced effects occur when a sige layer is grown epitaxially on a thin silicon substrate. 
[24], [25] here, the device performance is studied in a simple way by considering a relaxed sige layer and changing the mole fraction value. the model used for the simulation is the default drift-diffusion carrier transport model. the mobility model depends on the doping concentration and includes high-field saturation and transverse field dependence. as sige is a compound material, a mole-fraction-dependent effective intrinsic density band gap narrowing model for sige is used for the device. the structure is assumed to be abrupt and is simulated at room temperature. the drift-diffusion equations are solved self-consistently. because of the high nd along the lateral direction, the oldslotboom band gap narrowing model and the shockley-read-hall (srh) recombination mechanism are taken into account. the model calculates the intrinsic carriers for the silicon material. it then improves the carrier mobility under high field saturation. all simulations are carried out using the sentaurus tcad 2d simulator [26], [27].

fig. 2 comparison of id,lin with respect to vgs for sg-oi jlct at vds = 1 v and [13], with simulation lg = 20 nm, ϕm = 5.1 ev, nd = 1.5e19 cm^-3

table 1 parameters required for simulation [28], [13].

parameters                          sg-oi jlct         conventional mosfet
sige layer (tsi) for sg-oi jlct     5 nm               5 nm
donor doping (nd)                   1.5x10^19 cm^-3    10^18 cm^-3
eot of gate dielectric (tox)        1 nm               1 nm
gate work function (ϕm)             5.1 ev             4.6 ev
well doping (na)                    5x10^18 cm^-3      10^15 cm^-3
drain supply voltage (vdd)          0.05 v, 1 v        0.05 v, 1 v
channel length (lg)                 20 nm              20 nm

3. results and discussions

this section presents the results of the simulations carried out with vds = 0.05 v for id,lin and vds = 1 v for id,sat, with vgs = 1 v. the paper mainly deals with the electrostatic integrity (ei) parameter, which determines the dibl and the short channel effect (sce) given in equations (2) and (3). 
the ei gives a qualitative measure of the control of the channel by the gate. in short-channel devices, the control of the channel by the gate is affected by the electric field lines running from source to drain. since the device is fully depleted, the ei of the sg-oi structure is given by equation (1); most of the electric field lines propagate through the box before reaching the channel, which can reduce the sce. this, however, has the drawback of increasing the junction capacitance and the body effect [30]. firstly, fig. 4 indicates that our model is well suited to reducing ioff to 10^-13 a while ion is maintained at 10^-6 a, which is then compared with [13]. a comparative analysis is shown for the conventional mosfet, the soi jlt and the sg-oi jlct. the main intention of the analysis is to scale ioff. the measures for scaling ioff are [28]: (1) a thin channel region, (2) high-κ spacers, which improve ioff, and (3) a temperature- and doping-dependent channel.

$$ EI = \left(1 + \frac{t_{si}^2}{L_{el}^2}\right)\frac{t_{ox}}{L_{el}}\,\frac{t_{si} + t_{box}}{L_{el}} \qquad (1) $$

$$ DIBL = 0.80\,\frac{\varepsilon_{si}}{\varepsilon_{ox}}\,EI\,V_{ds} \qquad (2) $$

$$ SCE = 0.64\,\frac{\varepsilon_{si}}{\varepsilon_{ox}}\,EI\,V_{bi} \qquad (3) $$

fig. 3 comparative analysis of id,lin with respect to vgs for soi jlct and sg-oi jlct, with ϕm = 5.1 ev, nd = 1.5e19 cm^-3 and lg = 20 nm

fig. 3 compares id,lin as a function of vgs for the soi jlct and the sg-oi jlct. ion is improved in the case of the sg-oi jlct, while ioff is better in the soi jlct. this shows that at x = 0.25 the values are similar to those given in [13]. as sige is a compound material, its band gap can be varied from 0.6 to 1.1 ev. this variation of the bandgap is obtained by tuning the mole fraction value (x = 0.25, x = 0.5 and x = 0.75). at low x the si content is high; therefore the device acquires the si material characteristics although the channel is made of sige material. 
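as an illustration of how equations (1) to (3) are used, the following sketch evaluates them numerically. it reads equation (1) as ei = (1 + tsi²/lel²)(tox/lel)((tsi + tbox)/lel) and takes tsi, tox and vds from table 1. the electrical channel length lel (set equal to lg), the box thickness tbox, the built-in potential vbi and the permittivities are assumptions made for this example only; none of them is specified in the paper, so the printed numbers are indicative and are not results of the reported tcad simulations.

```python
EPS_SI = 11.7  # relative permittivity of silicon, assumed for the channel layer
EPS_OX = 3.9   # relative permittivity of SiO2


def electrostatic_integrity(t_si, t_ox, t_box, l_el):
    """Equation (1) for a fully depleted device; all lengths in the same unit."""
    return (1.0 + t_si**2 / l_el**2) * (t_ox / l_el) * ((t_si + t_box) / l_el)


def dibl(ei, v_ds):
    """Equation (2): drain-induced barrier lowering in volts."""
    return 0.80 * (EPS_SI / EPS_OX) * ei * v_ds


def sce(ei, v_bi):
    """Equation (3): short-channel-effect threshold shift in volts."""
    return 0.64 * (EPS_SI / EPS_OX) * ei * v_bi


# Values taken from table 1
T_SI_NM = 5.0   # sige layer thickness
T_OX_NM = 1.0   # equivalent oxide thickness
V_DS = 1.0      # drain voltage in saturation

# Hypothetical values chosen for illustration (not given in the paper)
L_EL_NM = 20.0   # electrical channel length, assumed equal to lg
T_BOX_NM = 10.0  # buried oxide thickness
V_BI = 1.0       # built-in potential term

ei = electrostatic_integrity(T_SI_NM, T_OX_NM, T_BOX_NM, L_EL_NM)
print(f"EI   = {ei:.4f}")
print(f"DIBL = {1e3 * dibl(ei, V_DS):.1f} mV")
print(f"SCE  = {1e3 * sce(ei, V_BI):.1f} mV")
```

in this form, thinning tsi or tox, or lengthening lel, lowers ei and with it both dibl and sce, which is the qualitative trend the ei concept is meant to capture.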
the band gap of si is 1.1 ev, and for sige at x = 0.25 the bandgap is still close to 1.1 ev. for a mole fraction of x = 0.75 the bandgap is close to 0.6 ev, hence the channel behaves according to the ge material properties, as shown in fig. 4. [31] reports advances in the electrical performance of ge mosfets, including the switching activity and the mobility enhancement compared with a si mosfet [32]. fig. 5 shows the dibl as a function of ion for both the sg-oi jlct and the soi jlct. at x = 0.25, ion is improved in the case of the sg-oi jlct while the dibl remains equal for both devices; at x = 0.75, ion is lower and the dibl is very high, which is not acceptable. to improve ion and dibl for x = 0.75, a proper tuning of nd and the work function is suggested.

fig. 4 energy with respect to distance “x” along the channel for sg-oi jlct, for si1-xgex channel (x = 0.25, 0.5, 0.75), vdsat = 0.7 v and tsi = 5 nm

fig. 5 dibl with respect to ion for both sg-oi jlct and soi jlct; vd,lin = 0.05 v, vd,sat = 1 v and x = 0.25, 0.5, 0.75

fig. 6 shows the impact of ϕm on ion and ioff; as the jlt operates at ϕm > 5 ev, the device performance is shown for this range. it is clear that for ϕm > 5.2 ev ion and ioff start degrading. although the device takes on the si material properties, the ge concentration in the sige channel affects the electric field at low vth, which results in an ioff improvement. in [14], a jlt with si channel has ϕm = 5.5 ev. in the proposed work, fig. 3 compares id,lin as a function of vgs, plotted with x = 0.25 for the sg-oi jlct. as the value of x increases, a drastic degradation of the device performance takes place, as shown in fig. 7 in terms of the ion/ioff ratio.

fig. 6 impact of metal work function ϕm on ion and ioff of the sg-oi jlct with x = 0.25, nd = 1.5e19 cm^-3, eot = 1 nm and lg = 20 nm

fig. 
7 ion/ioff of the sg-oi jlct with different x composition (x = 0.25, 0.5, 0.75), ϕm = 5.1 ev, nd = 1.5e19 cm^-3, eot = 1 nm and lg = 20 nm

4. conclusion

the paper investigates an improvement in ioff while maintaining ion at 10^-6 a. the conduction mechanism of the sg-oi jlct based on the concept of relaxed sige on insulator is explained. the id,lin and id,sat values at x = 0.25 and ϕm = 5.1 ev are used to estimate ioff. from the above results, the sg-oi jlct performs better at x = 0.25 when the drift-diffusion carrier mobility model with high field saturation and the srh mechanism are activated in the sentaurus tcad 2d simulator. on this basis, the electrostatic integrity of the device can then be evaluated.

references

[1] k. roy, s. mukhopadhyay, and h. mahmoodi-meimand, “leakage current mechanisms and leakage reduction techniques in deep-submicrometer cmos circuits,” proc. ieee, vol. 91, no. 2, pp. 305–327, 2003. [2] m. t. bohr, r. s. chau, t. ghani, and k. mistry, “the high-k solution: microprocessors entering production this year are the result of the biggest transistor redesign in 40 years,” ieee spectr., vol. 44, no. 10, pp. 23–29, 2007. [3] s. das and s. kundu, “simulation to study the effect of oxide thickness and high-κ dielectric on drain-induced barrier lowering in n-type mosfet,” ieee trans. nanotechnol., vol. 12, no. 6, pp. 945–947, 2013. [4] j.-p. colinge, “soi materials,” in silicon-on-insulator technology: materials to vlsi, springer, 1997, pp. 7–65. [5] f. balestra, s. cristoloveanu, m. benachir, j. brini, and t. elewa, “double-gate silicon-on-insulator transistor with volume inversion: a new device with greatly enhanced performance,” ieee electron device lett., vol. 8, no. 9, pp. 410–412, 1987. [6] s. k. mohapatra, k. p. pradhan, and p. k. sahu, “some device design considerations to enhance the performance of dg-mosfets,” trans. electr. electron. mater., vol. 
14, no. 6, pp. 291–294, 2013. [7] m. g. c. de andrade, j. a. martino, m. aoulaiche, n. collaert, e. simoen, and c. claeys, “behavior of triple-gate bulk finfets with and without dtmos operation,” solid. state. electron., vol. 71, pp. 63– 68, 2012. [8] j.-p. colinge, m. h. gao, a. romano-rodriguez, h. maes, and c. claeys, “silicon-on-insulator’gate-allaround device’,” in technical digest. of the international electron devices meeting, 1990. iedm’90., 1990, pp. 595–598. [9] b. ho, x. sun, c. shin, and t.-j. k. liu, “design optimization of multigate bulk mosfets,” ieee trans. electron devices, vol. 60, no. 1, pp. 28–33, 2013. [10] e. a. cartier, b. j. greene, d. guo, g. wang, y. wang, and k. k. h. wong, “finfet structure and method to adjust threshold voltage in a finfet structure.” google patents, 2015. [11] j. e. lilienfeld, “method and apparatus for controlling electric currents,” 1925. [12] j.-p. colinge, i. ferain, a. kranti, c.-w. lee, n. d. akhavan, p. razavi, r. yan, and r. yu, “junctionless nanowire transistor: complementary metal-oxide-semiconductor without junctions,” sci. adv. mater., vol. 3, no. 3, pp. 477–482, 2011. [13] j.-p. colinge, c.-w. lee, a. afzalian, n. d. akhavan, r. yan, i. ferain, p. razavi, b. o’neill, a. blake, m. white, a.-m. kelleher, b. mccarthy, and r. murphy, “nanowire transistors without junctions,” nat. nanotechnol., vol. 5, no. 3, pp. 225–229, 2010. [14] a. kranti, r. yan, c. w. lee, i. ferain, r. yu, n. d. akhavan, p. razavi, and j. p. colinge, “junctionless nanowire transistor (jnt): properties and design guidelines,” in proc. of the essderc, 2010, pp. 357–360. [15] s. gundapaneni, s. ganguly, and a. kottantharayil, “bulk planar junctionless transistor (bpjlt): an attractive device alternative for scaling,” ieee electron device lett., vol. 32, no. 3, pp. 261–263, 2011. [16] c.-w. lee, i. ferain, a. afzalian, r. yan, n. d. akhavan, p. razavi, and j.-p. colinge, “performance estimation of junctionless multigate transistors,” solid. 
state. electron., vol. 54, no. 2, pp. 97–103, 2010. [17] c.-w. lee, a. afzalian, n. d. akhavan, r. yan, i. ferain, and j.-p. colinge, “junctionless multigate field-effect transistor,” appl. phys. lett., vol. 94, no. 5, p. 53511, 2009. [18] t. irisawa, t. numata, e. toyoda, n. hirashita, t. tezuka, n. sugiyama, and s. takagi, “physical understanding of strain-induced modulation of gate oxide reliability in mosfets,” ieee trans. electron devices, vol. 55, no. 11, pp. 3159–3166, 2008. [19] m. a. hopcroft, w. d. nix, and t. w. kenny, “what is the young’s modulus of silicon?,” j. microelectromechanical syst., vol. 19, no. 2, pp. 229–238, 2010. [20] k. p. pradhan, p. k. sahu, d. singh, l. artola, and s. k. mohapatra, “reliability analysis of charge plasma based double material gate oxide (dmgo) sige-on-insulator (sgoi) mosfet,” superlattices microstruct., vol. 85, pp. 149–155, 2015. [21] c. k. maiti and g. a. armstrong, applications of silicon-germanium heterostructure devices. crc press, 2001. [22] w. j. zhu, t. tamagawa, m. gibson, t. furukawa, and t. p. ma, “effect of al inclusion in hfo2 on the physical and electrical properties of the dielectrics,” ieee electron device lett., vol. 23, no. 11, pp. 649–651, 2002. [23] m. wu, y. i. alivov, and h. morkoc, “high-κ dielectrics and advanced channel concepts for si mosfet,” j. mater. sci. mater. electron., vol. 19, no. 10, pp. 915–951, 2008. [24] p. goyal, “design and simulation of strained-si/strained-sige dual channel hetero-structure mosfets,” 2007. [25] y. sun, s. e. thompson, and t. nishida, “physics of strain effects in semiconductors and metal-oxide-semiconductor field-effect transistors,” j. appl. phys., vol. 101, no. 10, p. 104503, 2007. [26] http://www.synopsys.com/, “sentaurus tcad user’s manual,” in synopsys sentaurus device, 2012. [27] l-2016.03, “sentaurus tm device user,” september, 2014. [28] s. gundapaneni, m. bajaj, r. k. pandey, k. v. r. m.
murali, s. ganguly, and a. kottantharayil, “effect of band-to-band tunneling on junctionless transistors,” ieee trans. electron devices, vol. 59, no. 4, pp. 1023–1029, 2012. [29] “the international technology roadmap for semiconductors,” 2015. [30] j. p. colinge, “the new generation of soi mosfets,” rom. j. inf. sci. technol, vol. 11, no. 1, pp. 3–15, 2008. [31] d. p. brunco, b. de jaeger, g. eneman, j. mitard, g. hellings, a. satta, v. terzieva, l. souriau, f. e. leys, g. pourtois, and others, “germanium mosfet devices: advances in materials understanding, process development, and electrical performance,” j. electrochem. soc., vol. 155, no. 7, pp. h552-h561, 2008. [32] s. c. martin, l. m. hitt, and j. j. rosenberg, “p-channel germanium mosfets with high channel mobility,” ieee electron device lett., vol. 10, no. 7, pp. 325–326, 1989.

facta universitatis series: electronics and energetics vol. 32, no. 2, june 2019, pp. 195-210 https://doi.org/10.2298/fuee1902195d
a smart weather station based on sensor technology
miloš đorđević, danijel danković
faculty of electronic engineering, university of niš, serbia

abstract. this paper presents a new approach to using technology in a practical and meaningful manner within a smart weather station system. the system is primarily intended for use in agriculture and at meteorological stations, but its application is not limited to these. observing weather parameters plays an important role in human life, so observing, collecting and storing information about the temporal dynamics of weather changes is very important. the primary goal is to design a low-cost smart system that stores data obtained by measuring various physical parameters of the atmosphere without human involvement. the realized system uses internet of things technology to store the measured results and allows the user to access them anytime and anywhere.
in this research, internet of things technology is used for storing the measured data, because it is an advanced and efficient solution for connecting things to the internet and for linking the entire world of things in a network. the proposed smart weather station system is based on the following steps: direct environment sensing, measuring and storing data, and then allowing the user to customize the settings. this paper presents the design and implementation of a practical smart weather station system, which can be further extended. the system is based on a group of embedded sensors, a peripheral interface controller (pic) microcontroller as the core, a server system, and wireless internet access through a global system for mobile communications (gsm) module with general packet radio service (gprs).
key words: smart weather station, internet of things, pic microcontroller, sensor technology, general packet radio service (gprs)
received may 29, 2018; received in revised form september 3, 2018
corresponding author: miloš đorđević, university of niš, faculty of electronic engineering, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: milos.djordjevic@elfak.ni.ac.rs)

1. introduction
as internet of things technology has developed and become more affordable, more and more smart devices are based on it. the internet of things is a network of physical objects that incorporate electronics, software, microcontrollers and sensors, which allow users to obtain timely and accurate data through services for data exchange between users or other connected devices. the main concept behind the internet of things (iot) technology used in research such as this is to connect various electronic devices, such as a smart weather station, through a network, retrieve the data from these devices (sensors), which can be distributed in any fashion, and upload them to a cloud service where the gathered information can be analyzed and processed. a smart weather station is a device that can observe, measure and store meteorological/ambient parameters without human involvement. smart weather stations can be used at meteorological stations for observing various physical parameters of the atmosphere, and in agriculture they can help improve crop productivity. in mines, smart weather stations can be used to detect, besides the basic meteorological parameters, dangerous concentrations of methane (ch4), carbon dioxide, lpg, etc., and thus help save miners' lives.
technological development offers easier ways to realize better devices than in the past. it also offers new sensors with improved precision and response times; in addition, such sensors allow electronic devices to be miniaturized and their power consumption to be reduced. the use of sensors in internet of things devices ensures a seamless connection between the electronic devices and the physical world [1]. the atmosphere is a complicated system consisting of many parameters that influence one another. a small change in one of the parameters can have a large influence on quality of life. for that reason, it is difficult to predict atmospheric parameters and the weather as a whole. devices such as the smart weather station can facilitate the prediction of atmospheric changes and their environmental impacts. this device is designed for meteorological measurements, observations and data storage wherever measurement and monitoring of ambient parameters are necessary, such as production halls, mines, hospitals, laboratories, libraries, museums, botanical gardens, etc.
this system includes wireless communication technology and internet of things technology, as in [2-4]. previous research was based on storing data on media that cannot be accessed during the measurement, so the end user had no insight into the current results. the smart weather station based on internet of things technology periodically observes and measures the weather conditions and then updates the database in the cloud or on a web server. to send data to the cloud, the realized smart weather station uses general packet radio service, so the device can be used in inaccessible places, unlike wi-fi modules, which need a nearby access point with internet access. once the data are uploaded to the cloud or database, the end user can follow the weather conditions at a particular place from anywhere in the world. the development of this system aims to meet the needs of the user by employing networked low-power sensors sensitive to the environment, so it can be applied to make life easier. besides a dc power supply and batteries, the realized device offers the possibility of solar power, which is an advantage when atmospheric parameters must be measured in locations that are difficult to access. with solar power, the number of measurement points per hour depends on the surface of the solar panel: as the panel surface increases, so does the number of measurements that can be taken in an hour. the device has a wide range of applications based on the meteorological/ambient parameters it measures:
• temperature [°c] or [°f],
• humidity [%],
• atmospheric pressure [mbar],
• altitude [m],
• wind speed (anemometer) [m/s] or [km/h],
• lighting [lx],
• detection and measurement of gas (lpg) [%].
all measurements are accompanied by the time and date of measurement. the time and date information is stored with the data on the server and is available when downloading the results.
there are a large number of modular devices for measuring and acquiring atmospheric parameters on the market, but there are no devices that combine all modules for measuring/observing both ambient and atmospheric parameters.

2. theoretical background
the smart weather station is essentially a data acquisition system that can remotely collect information on meteorological/ambient parameters and store it in the cloud or in a database on a web server [5]. data acquisition systems such as the smart weather station are based on internet of things technology. this weather station is called “smart” in contrast to so-called non-smart weather stations, which do not use internet of things technology. non-smart weather stations store measurement results only on locally wired media, such as a secure digital (sd) memory card, flash memory, eeprom memory, etc. [6]. the areas of application include, e.g., smart industry, where the development of intelligent production systems and connected production sites is often discussed under the heading of industry 4.0 [7]. the main elements of the smart weather station are:
1. network for communication – wired (cable, local area network (lan)) or wireless.
2. intelligent control – a microcontroller that manages the system.
3. embedded sensors – components used to observe and measure meteorological/ambient parameters.
there are many different implementations of smart weather stations, which differ in the way the communication and the storage media are realized. most implementations use wireless technologies for communication between the sensor part and the main unit. the main problem is how to realize a smart weather station that is as cheap as possible and stores the measured data as safely as possible. a further question is how to make a weather station that helps people automate regular daily activities.
for example, it should store data so that the end user can easily access them, read the measurement results from anywhere, manipulate the stored data, etc. based on these needs, smart weather stations have been developed using different technologies:
1. smart weather station based on a custom microcontroller and a mobile application. the weather station uses bluetooth for communication with the mobile application. the limitation of such a system is the bluetooth range.
2. smart weather station based on a custom microcontroller and a computer. the weather station uses radio frequency (rf) for communication between the computer and the weather station. the computer is the entry point for communication between the user and the smart weather station, and it is connected to the microcontroller by wire.
3. smart weather station based on nodemcu with the esp8266 wi-fi module and a cloud or web server database. wi-fi is used for communication between the weather station and the database. the limitation of such a system is that the esp8266 needs a wi-fi access point with internet access.
4. smart weather station based on a custom microcontroller and a cloud or web server database. the weather station uses the global system for mobile communications with general packet radio service for communication with the cloud or web server database. the advantage of this system is that it needs only mobile network coverage rather than a local internet access point, so it can be used in locations that are difficult to access.
most smart weather stations are based on wireless internet communication and data storage in the cloud or in a web server database using wi-fi modules. smart weather stations can also be based on wireless internet communication and data storage in the cloud or in a web server database using general packet radio service.
smart weather stations offer similar functionality to the end user. that functionality is based on:
1. integration with smart appliances.
2. integration with embedded sensors for observing and measuring meteorological/ambient parameters.
3. access to data from anywhere.
4. remote control of the smart weather station.
the authors of [8] analyzed a smart weather monitor based on the nodemcu microcontroller and the internet of things. this system consists of two main parts. the first is the nodemcu, the heart of the device, which provides the internet of things platform. the sensors connected to this microcontroller are the dht11, used to measure temperature and humidity, and a light dependent resistor (ldr) as the light intensity sensor. the second part is the thingspeak internet of things platform, which stores the data sent by the esp module. nodemcu is developed by the esp8266 open-source community. it has an operating system called xtos. the cpu is the esp8266 (lx106). it has 128 kbytes of built-in memory and a storage capacity of 4 mbytes. it includes firmware that runs on the esp8266 wi-fi soc from espressif systems and hardware based on the esp module. the dht11 sensor is used to measure temperature and relative humidity. it includes a resistive-type humidity measurement component and an ntc temperature measurement component. its temperature accuracy is lower than that of the digital calibrated sensor used in the smart weather station described in this paper. a light dependent resistor is used as the light intensity sensor. this analog sensor is not calibrated, and its measurements are less precise than those of the digital sensors used in the smart weather station described in this paper. the nodemcu needs access to the internet to send data to the cloud, which is another limitation of that work. in another study, the authors of [9] analyzed an arduino-based smart weather station using wi-fi internet technology.
the system implemented in [9] consists of a microcontroller (atmega328) as the main processing unit for the entire system. the sensors used are the lm35 temperature sensor, a light dependent resistor as the light intensity sensor, a microphone as a sound sensor for detecting the decibel level, and a carbon monoxide (co) sensor. the lm35 is an analog sensor with lower accuracy than the digital sensor used in this smart weather station. also, it measures temperature only above 0°c (from 0°c to 100°c), whereas the digital sensor used in this weather station has an operating range from -40°c to +85°c. to send data to the cloud, the authors used the esp8266 wi-fi module. the limitation of this system is that the esp8266 needs access to the internet to send data to the cloud. in a study of a temperature and humidity monitoring system for agriculture [10], the authors analyzed a weather monitoring system based on raspberry pi and thingspeak. they used a raspberry pi 3 model b, which has a 1.2 ghz 64-bit quad-core armv8 cpu and 1 gb of ram. it also has 40 gpio pins, a full hdmi port, 4 usb ports, an ethernet port, 802.11n wireless lan connectivity, etc. the operating voltage of the system is 5 v. for measuring temperature and humidity, the authors used the dht11 sensor with an 8-bit microcontroller. the dht11 measures temperature in the range from 0ºc to 55ºc and relative humidity in the range from 20% to 90%. compared with the bme280 sensor used in this paper, the dht11 provides a narrower range of temperature and humidity measurement. the authors used a wired local area network internet connection to send the measured temperature and humidity to thingspeak. the authors of [11] and [12] analyzed a smart weather station based on raspberry pi and a mobile application. in the proposed system architecture, an arm11 processor is the central core. a single-core 32-bit arm11 processor is used. it has 512 mb of ram.
this board has on-board usb ports, an ethernet port, an hdmi port, etc. it is easy to give this board an internet connection through the ethernet port. the operating voltage of this board is 5 v. for temperature and pressure measurement, the authors use the bmp180 sensor, which has lower accuracy than the bme280, a sensor that is additionally equipped with a humidity sensor. the authors upload data to thingspeak over ethernet, using the local area network (lan) module on the raspberry pi. a general packet radio service module has an advantage over ethernet because it uses wireless communication to upload data to the thingspeak database. in a study [13] on the development of a solar-powered weather station, the authors used an msp430 microcontroller and a zigbee wireless transmitter for communication. the zigbee wireless module is used for communication between the weather station and the monitoring part. in that study, the authors use a light dependent resistor for detecting the solar radiation level, a temperature and humidity sensor, and a small circuit designed for rain level detection. the disadvantage of this system is that the authors do not store the measured data on any medium, but only monitor the measured values on an lcd display.
recent developments in the internet of things have made it possible to collect data in situ. the user can access these data anywhere in the world, at any time, through the internet. internet of things refers to giving objects a representation in the digital realm by giving them a unique id and connecting them in a network [14]. in other words, these things are connected to the internet and are able to transfer data automatically without relying on human interaction, which makes this a “machine to machine” (m2m) interaction [15]. an m2m system may generally be seen as a wireless sensor network in which the sensor nodes are embedded systems referred to as m2m terminals.
embedded software running inside an m2m terminal should manage concurrent tasks efficiently and reliably within limited hardware resources and under real-time constraints [16]. thingspeak is an application programming interface (api) and web service for the internet of things. the thingspeak api is an open source interface which listens to incoming data, timestamps it, and outputs it for both human users (through visual graphs) and machines (through easily parseable code) [17]. thingspeak belongs to cloud computing as a platform as a service (paas), including the operating system, programming-language execution environment, database, and web server. thingspeak allows applications to be built around data collected by sensors. it offers near real-time data collection, data processing, and simple visualizations for its users (see fig. 1). data are stored in so-called channels, which provide the user with a list of features.
fig. 1 thingspeak principle of work [15].
once a user creates a channel, data can be published by accessing the thingspeak api with a write key, a randomly generated unique alphanumeric string used for authentication. a read key, on the other hand, is used to access channel data when the channel is set to keep user data private (the default setting). user channels can also be made public, in which case no read key is required. each thingspeak channel can have up to 8 fields to store measured meteorological/ambient parameters such as temperature, humidity, pressure, etc. all entries from the smart device are stored with a unique identifier and a date and time stamp. the user can also import existing data into the channel from a comma-separated values (csv) file.

3. the working principle
this paper presents a model of a smart weather station based on a pic microcontroller and a cloud platform. the system is designed to be scalable and easy to set up and extend.
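the channel update described in section 2 (a write key plus up to eight numbered fields) is the request that the station ultimately has to issue, so a minimal sketch of it may help. the paper does not give the request format; the endpoint below and the `api_key`/`fieldN` parameter names are assumed here from the public thingspeak rest api, and `build_update_url` is a hypothetical helper, not part of the authors' firmware.

```python
from urllib.parse import urlencode

# Assumed endpoint of the ThingSpeak "update channel feed" request.
UPDATE_ENDPOINT = "http://api.thingspeak.com/update"
MAX_FIELDS = 8  # a channel offers at most eight data fields


def build_update_url(write_key, readings):
    """Return the update URL for a dict {field number: value}."""
    if not readings:
        raise ValueError("at least one field value is required")
    params = {"api_key": write_key}
    for field in sorted(readings):
        if not 1 <= field <= MAX_FIELDS:
            raise ValueError(f"field {field} is outside 1..{MAX_FIELDS}")
        params[f"field{field}"] = readings[field]
    return f"{UPDATE_ENDPOINT}?{urlencode(params)}"


# Field numbers follow the chart captions of the results section:
# 1 = humidity, 3 = pressure, 5 = temperature. The key is a dummy value.
url = build_update_url("ABCDEFGH12345678", {5: 23.4, 1: 55.2, 3: 990})
print(url)
```

keeping the url construction separate from the transport makes it possible to check the request on a pc before it is handed to the gprs module.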
the system is based on a powerful pic microcontroller, which manages the whole system. it includes embedded sensors for observing and measuring the environment wherever that is necessary, and a gprs module, which uploads the data to the cloud platform.

3.1. design of solution
the weather station consists of six segments, shown in fig. 2. the power supply serves all other blocks. the pic18f45k22 microcontroller [18], which is the core of the entire device, manages the sensor block, which performs the meteorological measurements and observations. the gsm block, realized using the sim800l module [19], is also controlled by this microcontroller.
fig. 2 block schematic of smart weather station.
the sensor part of the weather station consists of the following sensors:
• bme280 [20]:
   – temperature [ºc] or [ºf],
   – humidity [%],
   – atmospheric pressure [mbar],
   – altitude [m].
the bosch sensortec bme280 sensor measures temperature in the range of -40ºc to +85ºc, relative air humidity from 0% to 100% rh, and air pressure in the range of 300 to 1100 hpa. based on the measured atmospheric pressure, it is also possible to calculate the altitude at which the measurement is performed. the supply voltage of this sensor is in the range of 1.71 v to 3.6 v. the sensor is mounted on the same printed circuit board as a voltage regulator (from 5 v to 3.3 v), so the board is powered from 5 v. the startup time of the bme280 is less than 2 ms. the resolution of the humidity measurement is fixed at 16-bit adc output, and the absolute accuracy tolerance is ±3% rh. the resolution of the pressure data depends on the infinite impulse response (iir) filter and the oversampling setting: when the iir filter is enabled, the pressure resolution is 20 bit; when it is disabled, the resolution is 16 bit plus an increment that depends on the oversampling setting (e.g. 18 bit). the absolute accuracy tolerance is ±1 hpa (in the operating temperature range from 0ºc to 65ºc). similarly, when the iir filter is enabled, the temperature resolution is 20 bit; when it is disabled, it is 16 bit plus the oversampling-dependent increment. the absolute accuracy tolerance is ±0.5ºc at an operating temperature of 25ºc and ±1ºc in the operating temperature range from 25ºc to 65ºc.
• anemometer [21]:
   – wind speed [m/s] or [km/h].
the anemometer is realized using a revolutions-per-minute (rpm) sensor. the wind speed is calculated from the detected rotation time, the length of the blades, the circumference of the circle they describe, the number of blades and the weight of the blades.
• bh1750 [22]:
   – lighting [lx].
the rohm semiconductor bh1750 is an ambient light sensor with a supply voltage from 2.4 v to 3.6 v. the sensor is placed on the same board as a voltage regulator, so it can be powered from 5 v. it detects a wide range at high resolution, from 1 to 65535 lx. by using the function that compensates for the influence of the optical window, it is possible to detect from a minimum of 0.11 lx up to 100000 lx. the operating temperature ranges from -40ºc to 85ºc.
• mq-2 [23]:
   – detection and measurement of gas (lpg) [%].
the mq-2 sensor measures combustible gas (lpg, hydrogen, propane and methane) in the range from 300 to 10000 ppm. the operating voltage ranges from 4.9 v to 5.1 v, and the heater consumption is ≤900 mw.
there is also a real time clock (rtc) module, the ds1307 [24], which is used to set how long the measurements take.

3.2. implementation (hardware and software)
the realization of the measurement system can be seen in the electrical scheme of the smart weather station shown in fig. 3. the microcontroller used in the weather station operates at a frequency of 8 mhz, using an external oscillator.
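the anemometer calculation mentioned in section 3.1 is not given as a formula in the paper, so the following is only a minimal sketch of one common approach: the tangential speed of the rotor is derived from the counted rotations and the circle circumference, and a calibration factor maps it to wind speed. the names `pulses_per_rev` and `calibration` are hypothetical, and the blade weight mentioned in the text is assumed to be absorbed into the calibration factor.

```python
import math


def wind_speed_ms(pulses, interval_s, arm_length_m,
                  pulses_per_rev=1, calibration=1.0):
    """Estimate wind speed in m/s from pulses counted in one interval."""
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    revolutions = pulses / pulses_per_rev
    circumference = 2 * math.pi * arm_length_m  # path of the blade tip
    # Tangential speed scaled by an empirically determined factor.
    return calibration * circumference * revolutions / interval_s


def ms_to_kmh(speed_ms):
    """Convert m/s to km/h, the second unit offered by the station."""
    return speed_ms * 3.6


# 10 rotations in 2 s with a 0.1 m arm and no calibration correction.
speed = wind_speed_ms(pulses=10, interval_s=2.0, arm_length_m=0.1)
print(round(speed, 3), "m/s =", round(ms_to_kmh(speed), 2), "km/h")
```

in practice the calibration factor would be obtained by comparing the readings with a reference anemometer.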
the pic18f45k22 microcontroller can also be clocked from its high-frequency internal oscillator, whose highest frequency is 16 mhz, and the 4x frequency multiplier (pllen) raises this to 64 mhz. an external oscillator can likewise be used at an operating frequency of up to 64 mhz with the pllen multiplier. master clear (mclr) is disabled in software, but a 10 kω pull-up resistor is still fitted in the realization, in order to keep the option of using the mclr pin in hardware. this microcontroller has 2 usart (universal synchronous asynchronous receiver transmitter) modules [25]. the microcontroller can thus manage hardware modules that communicate over the i2c or spi bus, as well as over the rx/tx serial lines.
fig. 3 electrical scheme of weather station.
the bme280 (bosch sensortec) sensor is used to measure air humidity, temperature and atmospheric pressure, and the altitude is also determined from the atmospheric pressure. the pic18f45k22 microcontroller communicates with this sensor via the i2c bus [26]. the i2c bus is implemented in software on pins rb.1 and rb.2 of port b of the microcontroller. the bme280 barometric sensor provides a wide range of measurements:
• air humidity from 0 to 100% rh,
• temperature from -40 °c to +85 °c and
• atmospheric pressure from 300 to 1100 hpa (mbar).
the anemometer, a wind speed sensor, is realized using a rotation speed sensor. this sensor communicates with the microcontroller through pin ra.0 of port a. calibration was necessary in order to obtain the wind speed value from this sensor. illumination is measured by the bh1750 sensor (rohm semiconductor), which communicates with the microcontroller via the software-implemented i2c bus.
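the 16-bit value read from the bh1750 over i2c has to be converted to lux. the paper does not show this step; the sketch below assumes the conversion given in the rohm datasheet (count divided by 1.2, scaled by the measurement-time register, and halved in the second high-resolution mode). the function name and its parameters are illustrative only.

```python
DEFAULT_MTREG = 69  # power-on value of the measurement-time register


def bh1750_lux(raw_count, mtreg=DEFAULT_MTREG, mode2=False):
    """Convert a raw 16-bit BH1750 count to illuminance in lux."""
    if not 0 <= raw_count <= 0xFFFF:
        raise ValueError("raw count must fit in 16 bits")
    if not 31 <= mtreg <= 254:
        raise ValueError("mtreg must be within 31..254")
    lux = raw_count / 1.2 * (DEFAULT_MTREG / mtreg)
    # The second high-resolution mode halves the lux value of each count.
    return lux / 2 if mode2 else lux


print(bh1750_lux(0xFFFF))                       # full scale, default settings
print(bh1750_lux(1, mtreg=254, mode2=True))     # smallest step, about 0.11 lx
```

under these assumptions a saturated reading with default settings corresponds to about 54612 lx, which would be consistent with the largest light intensity reported in the results section.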
the bh1750 provides a wide, high-resolution measurement range (0-65535 lx). an mq-2 sensor is used to detect lpg. this sensor provides both a digital and an analog output. in this case the analog output is used, connected to the ra.1 pin of the microcontroller. on port a, the microcontroller has an a/d converter, which is used to determine the percentage of lpg in the environment. the mq-2 sensor also has wide application, i.e. it can detect several different substances, such as lpg (liquefied petroleum gas), ch4 (methane), co (carbon monoxide), alcohol, smoke and propane.
the ds1307 is an integrated circuit used as a real time clock (rtc). it works at a frequency of 32.768 khz, using an external oscillator, and communicates with the microcontroller over the software-implemented i2c bus. the ds1307 can be used as a clock and as a calendar. time counting starts from 00:00:00 in the format hh:mm:ss. the time is adjusted using the hour and minute keys, connected to the ra.2 (hours) and ra.5 (minutes) pins of the microcontroller. these keys are also used to adjust the operation of the weather station, together with one additional button on the rb.2 pin of the microcontroller, which selects the weather station mode setting.
the sim800l gsm/gprs module sends the data to thingspeak in the cloud. it also sends an sms to the user with the information that the measurement has been completed. this module communicates with the microcontroller over the rx/tx uart serial interface using at commands [27]. the rx/tx uart is implemented on the first uart module of the microcontroller, on pins rc.6 (tx) and rc.7 (rx). to interact with the user while the weather station is operating, a 2x16 character lcd display is used [28].
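the paper states that the sim800l is driven with at commands but does not list them. as a hedged illustration, the sequence below follows the usual bearer and http commands of the simcom at command set for one http get request; the apn value is a placeholder, and waiting for the module's responses, timeouts and error handling are deliberately left out.

```python
def gprs_http_get_commands(apn, url):
    """Return the AT command strings for one HTTP GET over GPRS."""
    return [
        'AT+SAPBR=3,1,"Contype","GPRS"',   # select GPRS as bearer type
        f'AT+SAPBR=3,1,"APN","{apn}"',     # operator access point name
        'AT+SAPBR=1,1',                    # open the bearer
        'AT+HTTPINIT',                     # start the HTTP service
        'AT+HTTPPARA="CID",1',             # bind HTTP to bearer profile 1
        f'AT+HTTPPARA="URL","{url}"',      # target address with the data
        'AT+HTTPACTION=0',                 # 0 = GET
        'AT+HTTPREAD',                     # read the server response
        'AT+HTTPTERM',                     # stop the HTTP service
        'AT+SAPBR=0,1',                    # close the bearer
    ]


# "internet" is a placeholder APN; the URL is a dummy example.
for command in gprs_http_get_commands("internet", "http://example.com/update?x=1"):
    print(command)
```

on the microcontroller each string would be written to the uart followed by a carriage return, and the reply would be checked before the next command is sent.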
the device can be powered in several ways:
• from a dc power supply (5 v – 12 v),
• from solar power, through the built-in solar cell, where the number of measurement points in one hour depends on the dimensions of the solar panel,
• from batteries (rechargeable batteries can be used and charged on the spot using solar power).
when the device starts, a message with an accompanying internet of things animation and the name of the device appears on the lcd. next, the user selects the language that will be used for further adjustment and operation. then follows the time setting, which consists of:
• setting the current time when the measurement starts;
• setting the time when the user wants the smart weather station to complete the measurement.
once the measurement start and stop times are set, the user enters the api key of the channel in which the measurement results will be stored. the api key consists of 16 characters, a combination of letters (a-z) and digits (0-9). after that, the phone number to which the sms (short message service) will be sent is entered. finally, all settings are shown for checking, i.e. the user is informed about the entered api key of the channel in which the measurement results will be stored, and at the end of the setup a countdown begins before the start of the measurement.
each task in the measurement is defined as shown in fig. 4, which presents the software in the form of an algorithm. the main thread is responsible for starting the other threads. it also sends diagnostic messages to the lcd and receives simple commands from the keyboard that affect the application’s execution flow (e.g. stop, restart). the main thread contains all the settings necessary for the proper operation of the smart weather station (e.g. the a/d converter settings, the register values for the sensors that work over the i2c bus, and the interrupt routines).
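the setup step above fixes the form of the api key (16 characters, letters and digits), which can be checked before the countdown starts. the paper does not describe such a check, so the following is only a sketch; `is_valid_api_key` is a hypothetical helper, and accepting both upper- and lower-case letters is an assumption.

```python
import string

API_KEY_LENGTH = 16
# Assumption: letters of either case plus digits are accepted.
ALLOWED_CHARS = set(string.ascii_letters + string.digits)


def is_valid_api_key(key):
    """Check the length and character set of an entered channel key."""
    return (len(key) == API_KEY_LENGTH
            and all(char in ALLOWED_CHARS for char in key))


print(is_valid_api_key("ABCDEFGH12345678"))  # well-formed dummy key
print(is_valid_api_key("ABC-123"))           # too short, contains '-'
```

a check of this kind only catches typing errors; whether the key belongs to an existing channel is known only after the first upload.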
the main thread also holds the measuring time settings (start/stop), which are sent to the measurement thread.
fig. 4 basic algorithm of the embedded software of smart weather station.
the measurement thread receives the settings from the main thread before the measurement starts. this thread acquires the measurement results from the sensors and stores them in variables (a separate variable is defined for each measured parameter) through the transfer thread. the transfer thread transfers data from the sensors to the microcontroller (from the a/d converter and the i2c bus). when the sensor data have been transferred to the main thread, they must be processed before sending. all integer data must be converted to the string data type and then passed, together with the defined api key, to the send thread. in the send thread, the converted data and the api key are concatenated into one string. the gsm/gprs module must be configured with at commands to work as a gprs (wireless internet) module in order to send data to the thingspeak database in the cloud. data from the smart weather station are sent and received via simple hypertext transfer protocol (http) requests, much like opening a web page and filling out a form. this communication takes place in plain text, json or xml. if the weather station is used in places such as deep mines, where there is no gsm/gprs signal, local storage such as a secure digital memory card or flash memory can be used as an alternative.

3.3. usage scenario
during the measurement, the user can see the remaining measurement time on the lcd, i.e. how much is left until the end of the measurement, in the form of a progress bar accompanied by a percentage. the measurement interval is 5 minutes (adjustable).
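the progress display described above can be sketched for one row of the 2x16 character lcd. the layout (bar on the left, percentage on the right) and the characters used are assumptions for illustration; the paper does not specify them.

```python
def progress_line(elapsed_s, total_s, width=16):
    """Format one LCD row: a bar followed by the completed percentage."""
    if total_s <= 0:
        raise ValueError("total duration must be positive")
    fraction = min(max(elapsed_s / total_s, 0.0), 1.0)
    label = f"{int(fraction * 100):3d}%"      # always four characters
    bar_width = width - len(label) - 1        # one space before the label
    filled = int(fraction * bar_width)
    return "#" * filled + "-" * (bar_width - filled) + " " + label


# Halfway through a one-hour measurement.
print(progress_line(1800, 3600))
```

clamping the fraction keeps the row at exactly 16 characters even if the clock runs slightly past the configured stop time.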
after each measurement cycle, the results must be converted from the integer type into the string data type so that they can be forwarded to the internet through the gsm/gprs module and placed in the database. at the end of the measurement, a message about the completed measurement is printed on the lcd display and the device sends a text message using the gsm module. the user receives an sms about the completed measurement on the mobile phone whose number was entered during the weather station setup. the message text depends on the previously selected device language. in this way, the user does not have to stay next to the device while it is working, since the information about the completed measurement arrives by sms. the prototype of the smart weather station was built on a protoboard in laboratory conditions (see fig. 5). the prototype was realized using the ready for pic development environment developed by the company mikroelektronika [29].
fig. 5 the prototype of the smart weather station.
4. results
the meteorological/ambient parameters were measured with the prototype of the smart weather station during 7 days (from the 8th of july to the 14th of july, 2018). when the measurement started (8th of july), there was a rainy period, which can be seen from the measured relative humidity of the air in fig. 6.
fig. 6 humidity (field 1 chart) and lpg detection (field 2 chart) during measurements.
the highest relative humidity values were recorded at that time, when the relative humidity did not fall below 50% during the day. the rainy period was followed by a period of sunshine, and the relative humidity of the air changed accordingly, ranging from 80% on the 10th of july to 18% on the 11th of july.
the highest concentration of lpg gas was recorded between 15:00 and 21:00 on july 11th, with values ranging from 23% to 42% (lunch and dinner preparation period). during the part of the measurement period with precipitation, the temperature and the light intensity also decreased (see fig. 7).
fig. 7 results of temperature (field 5 chart) and light intensity (field 6 chart).
during the rainy period, the temperature changed in accordance with the change in relative humidity, so the temperature values were the lowest during the rainy period (from 18°C to 28°C). the highest temperature values were measured on july 11th, when the temperature ranged from 19°C to 37°C. cloudy weather influenced the light intensity, so the lowest values were recorded on july 8th (from 246 lx to 26267 lx). the highest values were recorded in the period after the precipitation (from 7567 lx to 54612 lx), excluding measurements during the night, when the light intensity was 0 lx. atmospheric pressure also changed according to weather conditions, so the highest change in atmospheric pressure was recorded during the rainy period (985 to 992 mbar), on the basis of which the period of sunny weather could be observed. during the sunny weather, the air pressure ranged from 987 mbar to 995 mbar (see fig. 8). the altitude was determined on the basis of the air pressure.
fig. 8 changing of air pressure (field 3 chart) during measurements.
the strongest gusts of wind were recorded during the rainy period, when wind speeds ranged from 5 to 47 km/h. after the precipitation, the wind weakened, and wind speeds in the range of 2 to 32 km/h were recorded. all measurements were made at the location indicated on the map (latitude 43.320304, longitude 21.925792) shown in fig. 9.
fig. 9 wind speed (field 7 chart) results and measurement location.
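the text states that the altitude was obtained from the air pressure but does not give the conversion. a minimal sketch of one common approach is shown below, assuming the widely used international barometric formula and a standard sea-level reference pressure of 1013.25 mbar. both are assumptions made for illustration, and the function name is hypothetical; the station firmware may use a different reference or method.

```python
def altitude_from_pressure_m(pressure_mbar: float,
                             sea_level_mbar: float = 1013.25) -> float:
    """Estimate altitude in metres from absolute pressure.

    Uses the international barometric formula
        h = 44330 * (1 - (p / p0) ** (1 / 5.255)),
    where p0 is the assumed sea-level pressure. Weather-driven changes
    in p0 shift the result, so this is only a rough estimate.
    """
    if pressure_mbar <= 0 or sea_level_mbar <= 0:
        raise ValueError("pressures must be positive")
    return 44330.0 * (1.0 - (pressure_mbar / sea_level_mbar) ** (1.0 / 5.255))


if __name__ == "__main__":
    # pressure limits reported in the results (985 to 995 mbar)
    for p in (985.0, 995.0):
        print(p, "mbar ->", round(altitude_from_pressure_m(p)), "m")
```

because the measured pressure moved by about 10 mbar with the weather, the estimate from a fixed reference pressure moves by several tens of metres at the same location, which is why such a value is usually treated as approximate.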
during the measurement, the user can access the measurement results through the database stored on the user channel on the thingspeak platform. the user can also change the way the results are displayed, from the color of the values and the background to the form of the plot (lines, bars, steps, etc.). it is also possible to choose how many points are displayed on the graphs for better visibility of the results. if the measurements are made over a period of several hours or days, it is possible to choose whether the results are displayed for one or more hours, or for one or more days.
5. conclusion
further development of the device is planned, in the form of simultaneous sending of data from several smart weather stations to the same channel. in this way, the user would have an insight into the change of measured parameters at multiple locations. it is also possible to add new sensors to measure other meteorological/ambient parameters, such as a rain gauge, a wind direction sensor, a geiger counter radiation detector, etc. the development of an android application is planned, so that the end user can use a smartphone to monitor the measurements and any changes in the measured parameters. in the era of internet of things technologies, more and more new smart devices are being developed that use the cloud as a storage resource. the thingspeak platform is one part of internet of things technology. thingspeak is an internet of things platform which allows the user to collect and store sensor data in the cloud and develop internet of things applications. the thingspeak platform provides applications that allow the user to analyze and visualize data, and then act on the data.
for analyzing and visualizing the data stored in a channel, thingspeak uses the power of matlab. one of the devices that bases its work on cloud computing technology is the smart weather station described in this paper. apart from the possibility of storing data in the cloud or in a database on a web server, the smart weather station offers the possibility of being solar powered. in this way, it is possible to measure atmospheric parameters in locations that are difficult to access. the device can also be operated in four different languages.
acknowledgment: the authors would like to thank the ministry of education, science and technological development, republic of serbia, for financial support, project numbers tr33035, oi171026 and tr32026.
references
[1] a. k. sikder, g. petracca, h. aksu, t. jaeger and a. s. uluagac, “a survey on sensor-based threats to internet-of-things (iot) devices and applications”, https://arxiv.org/pdf/1802.02041.pdf. accessed: 08.08.2018.
[2] n. jabeur and h. haddad, “from intelligent web of things to social web of things”, facta universitatis, series: electronics and energetics, vol. 29, no. 3, pp. 367–381, september 2016.
[3] m. kosanović and s. stošović, “the concept for the “smart home” controlled by a smartwatch”, facta universitatis, series: electronics and energetics, vol. 31, no. 3, pp. 389–400, september 2018.
[4] b. rodić-trmčić, a. labus, z. bogdanović, m. despotović-zrakić and b. radenković, “development of an iot system for student’s stress management”, facta universitatis, series: electronics and energetics, vol. 31, no. 3, pp. 329–342, september 2018.
[5] b. mihai, “about the smart weather station”, acta universitatis cibiniensis – technical series, vol. lxviii, no. 3, pp. 26–29, 2016.
[6] m. b. waghmare and p. n. chatur, “temperature and humidity analysis using data logger of data acquisition system: an approach”, international journal of emerging technology and advanced engineering, vol. 2, no. 1, pp.
102–106, january 2012.
[7] f. wortmann and k. flüchter, “internet of things – technology and value added”, springer fachmedien wiesbaden, bus inf syst eng, vol. 57, no. 3, pp. 221–224, march 2015.
[8] s. mallikarjun, s. chandra shekar reddy, g. nagalaxmi, r. gayathri, “iot based remote monitoring of weather parameters for solar, wind applications”, international journal of engineering research in electrical and electronic engineering (ijereee), vol. 2, no. 2, pp. 69–72, february 2018.
[9] b. srinivas rao, k. srinivasa rao and n. ome, “internet of things (iot) based weather monitor system”, international journal of advanced research in computer and communication engineering, vol. 5, no. 9, pp. 312–319, september 2016.
[10] akash and a. birwal, “iot-based temperature and humidity monitoring system for agriculture”, international journal of innovative research in science, engineering and technology, vol. 6, no. 7, pp. 12756–12761, july 2017.
[11] s. d. shewale and s. n. gaikwad, “an iot based real-time weather monitoring system using raspberry pi”, international journal of advanced research in electrical, electronics and instrumentation engineering, vol. 6, no. 6, pp. 4242–4249, june 2017.
[12] k.n.v. satyanarayana, s. r. n. reddy, k. n. v. suresh varma and p. kanaka raju, “mobile app & iot based smart weather station”, international journal of electronics, communication and instrumentation engineering research and development (ijecierd), vol. 7, no. 4, pp. 1–8, august 2017.
[13] a. ghosh, a. srivastava, a. patidar, c. sandeep and s. prince, “solar powered weather station and rain detector”, in proceedings of the 2013 texas instruments india educator’s conference, 2013, pp. 131–134.
[14] k. ashton, “that ‘internet of things’ thing”, rfid journal, vol. 22, pp. 97–114, 2009.
[15] what is internet of things (iot)? – definition from whatis.com: 2013. http://whatis.techtarget.com/definition/internet-ofthings.
accessed: 07.05.2018.
[16] a. prijić, z. prijić, d. vučković and a. stanimirović, “aadl modeling of m2m terminal”, in proceedings of the microelectronics conference (miel 2010), 16-19 may 2010, pp. 373–376.
[17] m. a. gomez maureira, d. oldenhof and l. teernstra, “thingspeak – an api and web service for the internet of things”, 2014.
[18] pic18f45k22: http://www.microchip.com/wwwproducts/en/pic18f45k22. accessed: 07.05.2018.
[19] gsm/gprs sim800l: http://simcom.ee/documents/sim800/sim800_hardware%20design_v1.08.pdf. accessed: 08.05.2018.
[20] bme280 sensor bosch sensortec: https://cdn-shop.adafruit.com/datasheets/bst-bme280_ds001-10.pdf. accessed: 08.05.2018.
[21] anemometer – introduction to air velocity measurement: https://www.omega.com/prodinfo/anemometers.html. accessed: 08.05.2018.
[22] bh1750fvi: sensor ics – mouser electronics: http://rohmfs.rohm.com/en/products/databook/datasheet/ic/sensor/light/bh1721fvc-e.pdf. accessed: 08.05.2018.
[23] mq-2 semiconductor sensor for combustible gas pololu: https://www.pololu.com/file/download/mq2.pdf?file_id=0j309. accessed: 08.05.2018.
[24] ds1307 – part number search – maxim integrated: https://datasheets.maximintegrated.com/en/ds/ds1307.pdf. accessed: 08.05.2018.
[25] uart universal asynchronous receiver and transmitter: http://es.elfak.ni.ac.rs/pld/materijal/uart.pdf. accessed: 10.05.2018.
[26] i2c manual – nxp semiconductors: https://www.nxp.com/docs/en/application-note/an10216.pdf. accessed: 10.05.2018.
[27] sim800 series_at command manual v1.09 – simcom a company of sim tech: https://ns-electric.com/files/datasheets/sim800_atcommands.pdf. accessed: 10.05.2018.
[28] hd44780u (lcd ii) (dot matrix liquid crystal display controller/driver) hitachi: https://www.sparkfun.com/datasheets/lcd/hd44780.pdf. accessed: 10.05.2018.
[29] ready for pic: https://shop.mikroe.com/ready-pic. accessed: 10.05.2018.
instruction facta universitatis series: electronics and energetics vol. 29, no 1, march 2016, pp. 89-100 doi: 10.2298/fuee1601089z
inkjet printed resistive strain gages on flexible substrates
čedo žlebič 1, ljiljana živanov 1, aleksandar menićanin 2, nelu blaž 1, mirjana damnjanović 1
1 faculty of technical sciences, university of novi sad, novi sad, serbia
2 institute for multidisciplinary research, university of belgrade, belgrade, serbia
abstract. in this paper, resistive strain gages designed and fabricated in inkjet printing technology with three different silver nanoparticle inks are presented. the inks have different ag content (15, 20 or 25 wt%) and solvents (water type or organic type). strain gages were printed on a 50 µm thick polyimide and a 140 µm thick pet-based substrate with different printer types (professional and desktop). all printed sensors have the same size (17 mm × 5 mm). to determine the change of resistance due to bending of the steel beam, tensile tests were performed up to 1500 microstrains.
through repeated cycles of loading and unloading of the steel beam, the gauge factor and the stability of the response of the strain gages were measured. resistance change was measured with a keithley sourcemeter 2410. an in-house software tool was developed for acquisition of the measured data. measured gauge factors of the sensors are in the range between 1.07 and 2.03 (depending on the ink, substrate and printer used). results of this research indicate that strain gages with good gf can be produced even with low-cost equipment, such as the desktop printer epson c88+ and a pet-based substrate.
key words: resistive strain gage, inkjet printing, silver nanoparticles, flexible substrate
received september 18, 2014; received in revised form july 24, 2015
corresponding author: čedo žlebič, faculty of technical sciences, university of novi sad, trg dositeja obradovića 6, 21000 novi sad, serbia (e-mail: cedoz@uns.ac.rs)
1. introduction
strain sensors are among the most critical devices required for structural health monitoring, damage detection, condition-based maintenance and failure prevention. although some promising technologies are emerging on the market, about 50 % of all strain sensors still rely on the strain gage principle. strain gages provide benefits such as low price, simple measurement circuits and easy configuration. strain sensors can be fabricated in different technologies. in [1], pt and nicr strain sensors with cu interconnection lines were fabricated on polyimide sheets using a dc magnetron sputtering system and a base pressure in the 10⁻⁷ torr range. sensors and interconnections were photolithographically patterned, using either a lift-off process (with a negative photoresist) for pt and nicr, or chemical etching (and a positive photoresist) for cu interconnection lines. these pt thin film sensors have a gauge factor gf of 1.7. in [2], aerosol-jet printing was applied to fabricate strain sensors.
using the maskless fine feature deposition characteristics of this printing technology and a pre-cure protocol, strain sensors were successfully printed onto carbon fiber prepregs, enabling the fabrication of composites with intrinsic sensing capabilities. the measured gf of these sensors was 2.2 ± 0.06. strain sensing architectures, such as smart flexible sensors adapted to textile structures, are able to measure their strain deformations. the optimization of the sensors, in terms of dimensions, geometry, preparation process, and filler concentration, has led to a sensitive, reliable strain gage, which can be easily deposited on any flexible substrate, such as a textile fabric [3]. the advantages of inkjet printing technology are the high speed of the process, the efficient use of ink materials, patterning capability, and the fact that thin films can be printed on flexible substrates at low cost. results obtained from the characterization survey showed coherence between the expected trend and the experimental behavior, and have encouraged future efforts towards the use of inkjet printing technology for the rapid prototyping of strain gages and other sensing architectures [4]. in our previous research, we fabricated and compared resistive and capacitive strain gage sensors [5]. both were fabricated on a polyimide substrate using inkjet printing technology. however, the capacitive sensors proved to be ineffective for measuring strain on a metallic specimen, due to parasitic capacitance. the aim of this work was to investigate the influence of three different nanoparticle inks and flexible substrates on the characteristics of single-element gages. some improvements in design were introduced in order to provide better stability of the structures. the response was measured over a period of up to 30 minutes. gages can be integrated into light-weight structures for monitoring purposes.
the focus of this investigation was on inkjet silver inks and flexible substrates for sensing tensile and compressive strain on the steel surface.
2. strain gages fabrication
the printing of inks, especially those containing silver nanoparticles, has been found to be a crucial tool for direct patterning of electrically conductive interconnections in electronic devices [6-8]. a drawback for the widespread application of printing processes is the limited availability of suitable printable materials. the formulation of a suitable ink is a critical phase, as the performance or quality of the printing process strongly depends on the ink. a low tendency for sedimentation and properly adjusted behaviour of the liquid carrier matrix are essential for a reliable printing process [9]. the first two series of tested strain gages were fabricated in one layer. they were printed with the dimatix dmp3000 printer using a cartridge with 10 pl nozzle volume. printing was performed in a horizontal configuration instead of a vertical one. as the printing head moves along the horizontal axis, ink drops onto the substrate along the same axis, which leads to better printing results. a similar printing solution is presented in [10]. samples were printed in 1016 dpi resolution, on a 50 µm thick polyimide substrate, apical gts av [11]. key properties of this substrate are shown in table 1. the third series of strain gages was fabricated in one and two layers using an epson stylus c88+ desktop printer with 180 nozzles and an ink droplet size as small as 3 pl. gages were printed in 2000 dpi resolution on a 140 µm thick pet-based substrate (novacentrix novele™ ij-220) [12]. this series of strain gages presents an example of a low-cost sensor manufacturing process using low-cost equipment. the short development time is an additional advantage. pet-based substrate properties are shown in table 2.
table 1. specifications of apical 200 av polyimide substrate.
property                                     value
nominal thickness (µm)                       50.8
tensile modulus (GPa)                        2.8
tensile strength (MPa)                       293
elongation (%)                               104
coefficient of thermal expansion (ppm/°C)    32
yield (m²/kg)                                55

table 2. specifications of novele™ ij-220 pet-based substrate.
property                  value
basis weight (g/m²)       175±10
caliper (µm)              140±12
smoothness bekk (sec.)    >1000
stiffness (mN·m)          0.5±0.3

the first series of tested strain gages was printed in the previously mentioned resolution (1016 dpi), which corresponds to 25 µm drop spacing and proved to be an optimal solution in terms of avoiding ink spillage and achieving uniform structures. the amplitude of the driving waveform was 23 V and the frequency was 2 kHz. the ink u5603, made by sun chemical corporation [13], consists of silver nanoparticles (20 wt% of silver) capped with a polymer coating that keeps the particles in a colloidal suspension. the manufacturer’s technical specifications state that the ink has a specific resistivity in the range of 5-30 µΩ·cm. after printing, the structures were first left to dry for 30 minutes at room temperature, to avoid rapid evaporation during sintering. the samples were then put in an oven and sintered for 45 minutes at 240°C. the second series of tested strain gages was printed with the water-based silver nanoparticle ink js-b25hv, produced by novacentrix, with 25 wt% of silver [14]. this is an electrically conductive ink, with 2.8 µΩ·cm film resistivity, designed to produce circuits on porous and non-porous substrates including inkjet papers, pet, polyimide, and glass. js-b25hv ink is specially formulated for compatibility and stability with dimatix print heads. drop spacing was kept the same (25 µm), but the amplitude of the driving waveform was 30 V and the printing frequency was 1 kHz. the samples were sintered for 30 minutes at 270°C, as recommended by the manufacturer.
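the link between print resolution and drop spacing quoted above follows directly from the definition of dots per inch. the short sketch below works this out for both printers; the function name is hypothetical, and the value for the desktop printer is only the nominal pitch implied by its 2000 dpi setting, not a figure reported in the paper.

```python
MICROMETRES_PER_INCH = 25400.0  # 1 inch = 25.4 mm


def drop_spacing_um(resolution_dpi: float) -> float:
    """Nominal centre-to-centre drop spacing (in micrometres) for a given dpi."""
    if resolution_dpi <= 0:
        raise ValueError("resolution must be positive")
    return MICROMETRES_PER_INCH / resolution_dpi


if __name__ == "__main__":
    print(drop_spacing_um(1016))  # dimatix setting, 25 µm as stated in the text
    print(drop_spacing_um(2000))  # nominal pitch for the 2000 dpi setting
```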
the third series of strain gages was printed with the water-based silver nanoparticle ink js-b15p, produced by novacentrix, with 15 wt% of silver [14]. the ink has 4.5 µΩ·cm film resistivity, and it is designed to produce circuits on porous substrates such as paper and novele™ (a coated pet). samples printed in one layer were sintered for 30 minutes at 100°C, while the samples printed in two layers were sintered for 60 minutes, also at 100°C. detailed physical properties of the inks used are shown in table 3.

table 3 typical physical properties of silver inks [13], [14].
                              ink u5603    ink js-b25hv    ink js-b15p
silver content (wt %)         20           25              15
resistivity (µΩ·cm)           5-30         2.8             4.5
viscosity (cP)                10-13        8               4
surface tension (dynes/cm)    27-31        30-32           30

the design of the printed strain gages in this paper is improved compared to the design presented in our previous research [15], by increasing the length of the end loop. the proposed length of the strain gage end loop is five times longer than the grid track width. since the creep behavior depends on parameters such as gage material, adhesive thickness, cantilever material and design of the strain gage, additional measurements are necessary for our inkjet fabricated gages to determine which ratio of end loop length to track width is the most appropriate, so that the creep behavior is reduced to a minimum. the design of the contact pads is also changed, and they are placed inside the overall sensor area (fig. 1). they have a taper section to adjust the current density distribution gradually. a photograph of the sensor was taken with the 3.0 megapixel moticam 2300 camera on a wafer probe station. geometrical parameters of the strain sensors are presented in table 4.

table 4 geometrical parameters of the strain gages.
dimensions
end track width    0.677 mm
end loop width     0.979 mm
track width        0.205 mm
track spacing      0.217 mm
track length       9.574 mm
number of turns    n=8

fig. 1 representative sample of strain gages with geometrical dimensions

3. measuring principle
the terms stress and strain ε (1 µε = 10⁻⁶ m/m) are used to describe deformations of solid materials. when the strain is not too large, most solid materials behave like linear springs, and the displacement is proportional to the applied force. if the same force is applied to a thicker piece of solid material, the spring is stiffer and the displacement is smaller. this leads to a relation between force and displacement that depends on the dimensions of the material [16]. the resistive strain gage is a physically simple device, which can be applied in a straightforward manner for elementary measurements of surface strains [17]. these devices provide a suitable way to test new materials for their strain sensitivity by bonding the gages to a beam with a known strain behavior. the strain of the beam increases with the distance from the point of the applied force. the maximal deflection δmax for elastic deformations of the used steel beam 67sicr5 [18] is approximately 50 mm (which is greater than the maximal deflection of 15 mm applied in this research). the tested strain gages were mounted close to the fixed end, where the strain ε has the greatest value and equals:

$$\varepsilon = \frac{6\,(l_{beam} - z)\,h}{4\,l_{beam}^{3}}\,\delta, \qquad (1)$$

where δ is the deflection, h is the cantilever thickness, z is the distance from the fixed end of the cantilever to the middle of the strain gage, and l_beam is the length of the cantilever, as shown in fig. 2. deflection was controlled with a screw mechanism at the free end of the cantilever, where two turns bend the cantilever by exactly 1 mm. deflection is measured with the digital sliding caliper kern ip54. the relative resistance change is equal to:

$$\Delta R / R = GF \cdot \varepsilon, \qquad (2)$$

where GF is the gauge factor of the material and R is the initial resistance of the gage.
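equations (1) and (2) can be combined into a short numerical sketch. it assumes that eq. (1) is the standard relation for an end-loaded cantilever, ε = 6(l_beam − z)hδ / (4 l_beam³), and it estimates gf as the least-squares slope through the origin of ΔR/R versus ε. the fitting method, the function names and the beam dimensions in the example are assumptions made for illustration; the paper does not report the beam geometry or how the slope was fitted.

```python
from typing import Sequence


def surface_strain(deflection_m: float, thickness_m: float,
                   z_m: float, beam_length_m: float) -> float:
    """Surface strain at distance z from the fixed end, following eq. (1)."""
    if beam_length_m <= 0:
        raise ValueError("beam length must be positive")
    return (6.0 * (beam_length_m - z_m) * thickness_m * deflection_m
            / (4.0 * beam_length_m ** 3))


def gauge_factor(strains: Sequence[float],
                 rel_resistance_changes: Sequence[float]) -> float:
    """Slope of dR/R versus strain through the origin, following eq. (2)."""
    if len(strains) != len(rel_resistance_changes) or not strains:
        raise ValueError("need two non-empty sequences of equal length")
    num = sum(e * r for e, r in zip(strains, rel_resistance_changes))
    den = sum(e * e for e in strains)
    if den == 0.0:
        raise ValueError("all strains are zero")
    return num / den


if __name__ == "__main__":
    # hypothetical beam: 200 mm long, 2 mm thick, gage centred 10 mm from clamp
    deflections_mm = [0.0, 5.0, 10.0, 15.0]  # deflection steps used in the paper
    strains = [surface_strain(d / 1000.0, 0.002, 0.010, 0.200)
               for d in deflections_mm]
    # synthetic readings generated with gf = 2.0, only to exercise the fit
    readings = [2.0 * e for e in strains]
    print([round(e * 1e6) for e in strains], "microstrain")
    print("fitted gf:", gauge_factor(strains, readings))
```

a fit through the origin is a natural choice here because eq. (2) has no offset term; if measured data show an offset at zero deflection, an ordinary linear fit with an intercept may be more appropriate.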
the gages were bonded with a two-component epoxy adhesive on the top side of the steel beam, for measuring tensile strain (denoted with “1”), and on the bottom side of the steel beam, for measuring compressive strain (denoted with “2”), as shown in fig. 2. in order to test the strain gages quickly and simply, a keithley 2410 source meter was used both for measurement and as a current source (with an excitation current of 1 mA). the control software tool “ksm 2410 rc” was developed for acquisition of the measured data. it was written using national instruments labview software. the measuring principle is shown in fig. 3.
fig. 2 gages placement for measuring tensile (gage denoted with “1”) and compressive (gage denoted with “2”) strain.
fig. 3 principle of data acquisition using keithley source meter 2410 and control software tool (source meter, serial communication (rs-232), data acquisition).
in our previous work [8], we have presented a bridge as an alternative for measuring small resistance changes accurately. the sensor placement, as shown in fig. 5, enables the best response when the load is applied, as well as temperature compensation [19]. the tested gages were connected in a full-bridge wheatstone circuit, where the differential output voltage can be approximated as

$$V_{out} \approx GF \cdot \varepsilon \cdot R \cdot I_{set}, \qquad (3)$$

where I_set is the set excitation current and R is the initial resistance, ideally the same for all the resistors. since the output of the full wheatstone bridge is a differential voltage, an instrumentation amplifier is used. for low-noise signal acquisition, the ina122pa instrumentation amplifier was used due to its high amplification and low offset voltage [20]. to suppress the supply ripple and high-frequency interference, low-pass active filters made with the lm224 quad operational amplifier were used, as shown in fig. 6.
fig. 5 cantilever with four placed strain gages (1, 2, 3 and 4) connected in a full-bridge wheatstone circuit.
fig. 6 block diagram of the developed signal conditioning circuit.
4. results and discussion
measurements were performed on twelve gage samples. the presented gf results are average values over the tested strain gages. the microstrain ranges shown represent the ranges where the gages have an approximately linear characteristic.
4.1. first series of printed strain gages (ink u5603, polyimide substrate, dimatix printer)
the measured relative resistance changes ΔR/R of the first series of printed strain gages, obtained using the keithley source meter controlled by the developed software tool, are shown in fig. 7. the average resistance of the tested strain gages was 140 Ω. as can be seen, the average gf when the beam is loaded is 1.07. when the beam is unloaded, the average gf is 1.03. the results are in accordance with our previous measurements presented in [5], where the measurements were performed with a wheatstone bridge (gf was 1.09 when the beam was loaded and 1.01 when the beam was unloaded). in [10], sunchemical ink was also used for sensor fabrication, and the obtained gf was around 0.35.
fig. 7 relative resistance change ΔR/R as a function of the applied microstrain for first series of printed strain gages for the loading (black line) and unloading (red line) of the beam.
4.2. second series of printed strain gages (ink js-b25hv, polyimide substrate, dimatix printer)
the measurement results of tensile (positive) strain for the second series of tested strain gages when the beam is loaded and unloaded are shown in fig. 8. the measurement results show that when the beam is loaded, the average gf is 2.03, and when the beam is unloaded, the average gf is 1.96.
as it can be seen, there is a significant increase of gauge factor as compared to the first series of tested strain gages, with a presence of small values of hysteresis. possible reason for that is because the ratio of strain induced changes in atomic structure of silver nanoparticles ink js-b25hv to the strain producing them is better than it is for the first ink. average resistance value of tested sensors was 142 ω. in order to investigate compressive (negative) strain, measuring cables were connected to the cu wires of gage, which was bonded at the bottom side of steel beam. steel beam was upturned so that the compressive strain was measured also with the upper gage. as it can be seen in fig. 9, average gauge factor is lower than when measuring tensile strain, and it is gf=1.59. tested gages have linear characteristics up to approximately 1400 microstrains. fig. 9 relative resistance change δr/r as a function of the applied microstrain for compressive strain for second series of tested strain gages. in fig. 10, it is presented a comparison between stability of the response of one representative sample strain gage (from second series of tested gages) and commercial sensor fig. 8 relative resistance change δr/r as a function of the applied microstrain for second series of printed strain gages for the loading (red line) and unloading (black line) of the beam. inkjet printed resistive strain gages on flexible substrates 97 cea-06-125un-350 produced by micro-measurements (vishay precision group) [21] under constant deflection of the beam for 30 minutes. measurements were performed for four deflection steps (0 mm, 5 mm, 10 mm and 15 mm). as expected, the lowest ripple of strain sensors has been recorded for deflection of 0 mm. as it can be seen in fig. 10, commercial strain sensor has better resistance stability under different beam deflection, because commercial sensors have excellent encapsulation and sensitive grid is made of constantan. fig. 
10 time response of one sample strain gage from second series of tested gages (denoted r sensor) and commercial sensor cea-06-125un-350 by micro-measurements (denoted r commercial) under constant deflection of the steel beam. 4.3. third series of printed strain gages (ink js-b15p, pet-based substrate, epson printer) in fig. 11, it is shown sensitivity of gages fabricated in one layer, on pet-based substrate with epson stylus c88+ printer using js-b15p nanoparticle silver ink. the average electrical resistance of tested strain gages was 104 ω. gf is slightly higher when beam is loading (~1.94) than for unloading the beam (~1.85). it is expected that, in cycling between a loaded and unloaded condition, there is a some degree of hysteresis. fig. 11 relative resistance change δr/r as a function of the applied mechanical deformation for third series of tested strain gages (one printed layer). 98 ĉ. žlebiĉ, a. menićanin, n. blaž, lj. živanov, m. damnjanović in [4], the realization process uses the epson desktop printer, “metalon js-b15p” ink and pet substrate as printing base. sensors have estimated gf of 1.6 for one printing layer, which is smaller than gf obtained with strain gages shown in this work. fig. 12 relative resistance change δr/r as a function of the applied mechanical deformation for third series of tested strain gages (two printed layers). the strain gage printed in two layers has smaller gf than gage printed in one layer, but hysteresis is manifestly smaller. measured gf is approximately equal for tensile and compressive strain (~1.58). average strain gages resistance printed on pet-based substrate in one layer was 104 ω, while for two layers strain gages was 61 ω. fig. 13 time response under constant deflection of the steel beam for one printed layer (r1 layer) and two printed layers (r2 layers) strain gages. as it can be seen in fig. 
13, strain gages printed with the epson printer on pet-based substrate are not as stable as strain gages fabricated with the dimatix printer on polyimide substrate. stability measurements were performed for 15 minutes, since after that time the readings of the gages became unstable: the strain gage starts to display resistance values which correspond to higher beam strains. strain gages with two printed layers are more stable than gages with one layer, due to their thicker and more uniform line structure. besides the presented measurements, the resistance change of the printed gages was also measured over a two-month interval in laboratory conditions. it was observed that, at 25-26 °C and 55 % relative humidity, the sensor resistance decreased from its initial value by 1-1.5 % after one month, by 2-2.5 % after two months and by 10 % after three months. future improvements of the presented resistive strain sensor will focus on encapsulation and on the investigation of possible wireless applications with an lc circuit.

5. conclusion

the choice of substrate material plays an important role in the fabrication of strain gages, since the strain of the measured object is transmitted through the substrate. strain gages were developed on polyimide and pet-based substrates. strain gages mounted on the surface of a metallic test specimen respond only to the strains that occur at the surface of the test specimen. as such, the results from strain gauge measurements must be analyzed to determine the state of stress occurring at the strain gauge locations. the gf of resistive strain gages printed with three different silver nanoparticle inks onto polyimide and pet-based substrates has been successfully measured and compared. 
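the gauge factor compared throughout this work is the ratio of the relative resistance change ΔR/R to the applied strain. as a minimal sketch of how such a value could be extracted from a loading curve, the python snippet below fits a straight line through the origin to (strain, ΔR/R) pairs. the function name and the sample readings are hypothetical and serve only as an illustration; they are not measured data from this work, and the authors' actual fitting procedure is not stated.

```python
def gauge_factor(strain_microstrain, delta_r_over_r):
    """Least-squares slope (through the origin) of dR/R versus strain.

    strain_microstrain: applied strain values in microstrain
    delta_r_over_r:     corresponding relative resistance changes (dimensionless)
    """
    strain = [e * 1e-6 for e in strain_microstrain]  # microstrain -> strain
    numerator = sum(e * r for e, r in zip(strain, delta_r_over_r))
    denominator = sum(e * e for e in strain)
    if denominator == 0:
        raise ValueError("at least one non-zero strain value is required")
    return numerator / denominator


# hypothetical loading curve, NOT data from the paper
strain_ue = [0, 200, 400, 600, 800, 1000]
dr_r = [0.0, 0.0004, 0.0008, 0.0012, 0.0016, 0.0020]

gf = gauge_factor(strain_ue, dr_r)
print(f"estimated gauge factor: {gf:.2f}")
```

fitting through the origin reflects the assumption that zero strain gives zero resistance change; separate fits for the loading and unloading branches would expose the hysteresis discussed above.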
based on the measured strain sensitivity, the second series of tested resistive strain gages has the highest strain sensitivity, with an average gf of about 2.03, while the corresponding value for the first series was around 1.07, and for the third series gf was about 1.94 (one layer) and 1.58 (two layers). the first series of tested gages (printed with ink u5603, based on a concentrated dispersion of silver nanoparticles in an organic ethanol-ethylene glycol solvent mixture, on polyimide substrate with the dimatix printer) has a smaller gf than the commercial strain gages, probably because the silver nanoparticles exhibit some negative piezoresistive behavior, which damps the positive resistance change. the second series of tested gages (printed with water-based ink js-b25hv on polyimide substrate with the dimatix printer) has much better strain sensitivity, but the stability of its resistance response under constant deflection of the steel beam is not as good as that of the commercial sensors. the third series of tested gages (printed with water-based ink js-b15p on pet-based substrate with the epson desktop printer) also has a higher gf than the first series. the most linear strain sensitivity function is achieved for two printed layers on pet-based substrate.

the results of this research indicate that strain gages with good gf can be produced even with low-cost equipment, such as the desktop printer epson c88+ and a pet-based substrate. it can also be concluded that inkjet printing technology (with various inks, substrates and printers) is suitable for prototyping and development of strain and force sensors, and could be extended to other sensor types through the development of new nanoparticle inks, geometrical printing designs and encapsulation of the fabricated sensors.

acknowledgement: this research was supported by the ministry of education, science and technological development, republic of serbia, project number tr-32016.

references
[1] d. lichtenwalner, a. hydrick, a. kingon, "flexible thin film temperature and strain sensor array utilizing a novel sensing concept", sensors and actuators a: physical, vol. 135, no. 2, pp. 593-597, 2007.
[2] d. zhao, t. liu, m. zhang, r. liang, b. wang, "fabrication and characterization of aerosol-jet printed strain sensors for multifunctional composite structures", smart materials and structures, vol. 21, no. 11, pp. 115008, 2012.
[3] c. cochrane, v. koncar, m. lewandowski, c. dufour, "design and development of a flexible strain sensor for textile structures based on a conductive polymer composite", sensors, vol. 7, no. 4, pp. 473-492, april 2007.
[4] b. ando, s. baglio, "all-inkjet printed strain sensors", ieee sensors journal, vol. 13, pp. 4874-4879, december 2013.
[5] c. zlebic, n. ivanisevic, m. kisic, n. blaz, a. menicanin, lj. zivanov, m. damnjanovic, "comparison of resistive and capacitive strain gauge sensors printed on polyimide substrate using ink-jet printing technology", proc. of 29th ieee international conference-miel, belgrade, serbia, 2014, pp. 141-144.
[6] t. h. j. van osch, j. perelaer, a. de laat, u. schubert, "inkjet printing of narrow conductive tracks on untreated polymeric substrates", adv. mater., vol. 20, pp. 343-345, january 2008.
[7] k. j. lee, b. h. jun, t. h. kim, j. joung, "direct synthesis and inkjetting of silver nanocrystals toward printed electronics", nanotechnology, vol. 17, no. 9, pp. 2424-2428, april 2006.
[8] d. kim, s. jeong, b. k. park, j. moon, "direct writing of silver conductive patterns: improvement of film morphology and conductance by controlling solvent compositions", appl. phys. lett., vol. 89, pp. 264101-264101-3, december 2006.
[9] m. maiwald, c. werner, v. zöllmer, m. busse, "inktelligent printing® for sensorial applications", sensor review, vol. 30, pp. 19-23, 2010.
[10] v. correia, c. caparros, c. casellas, l. francesch, j. rocha, s. 
lanceros-mendez, "development of inkjet printed strain sensors", smart materials and structures, vol. 22, no. 10, pp. 105028, 2013.
[11] gts flexible materials, [online] available: http://www.gts-flexible.com/about-gts/apical/
[12] novacentrix novele ij-220, [online] available: http://store.novacentrix.com/novele_ij_220_p/910-0070-02.htm
[13] sun chemical, [online] available: http://www.sunchemical.com
[14] novacentrix, [online] available: http://www.novacentrix.com/products/metalon-inks/silver
[15] c. zlebic, m. kisic, n. blaz, a. menicanin, s. kojic, lj. zivanov, m. damnjanovic, "ink-jet printed strain sensor on polyimide substrate", 36th international spring seminar on electronics technology, alba iulia, romania, 2013, pp. 409-414.
[16] webster j. g., the measurement, instrumentation, and sensors: handbook, crc press, 1999, chapter 22, pp. 571-589.
[17] hannah r. l., reed s. e., strain gage user's handbook, springer, cambridge university press, 1992, chapter 1, pp. 1-79.
[18] matbase, [online] available: http://www.matbase.com/
[19] r. s. figliola, d. e. beasley, theory and design for mechanical measurements, john wiley and sons, 5th edition, 2010, chapter 11, pp. 466-503.
[20] burr-brown corp., "single supply instrumentation amplifier ina122", 1997, [online] available: http://www.datasheetcatalog.org
[21] micro-measurements (vishay precision group), [online] available: http://www.vishaypg.com/docs/11224/125un.pdf

facta universitatis series: electronics and energetics vol. 31, no 1, march 2018, pp. 
51 61 https://doi.org/10.2298/fuee1801051j feedback linearization for decoupled position/stiffness control of bidirectional antagonistic drives  kosta jovanović 1 , branko lukić 1 , veljko potkonjak 2 1 laboratory for robotics etf robotics, school of electrical engineering, university of belgrade, serbia 2 school of information technologies, metropolitan university, belgrade, serbia abstract. to ensure safe human-robot interaction impedance robot control has arisen as one of the key challenges in robotics. this paper elaborates control of bidirectional antagonistic drives – qbmove maker pro. due to its mechanical structure, both position and stiffness of bidirectional antagonistic drives could be controlled independently. to that end, we applied feedback linearization. feedback linearization based approach initially decouples systems in two linear single-input-single-output subsystems: position subsystem and stiffness subsystem. the paper elaborates preconditions for feedback linearization and its implementation. the paper presents simulation results that prove the concept but points out application issues due to the complex mechanical structure of the bidirectional antagonistic drives. key words: bidirectional antagonistic drives, variable stiffness actuators, pullerfollower control, stiffness control. 1. introduction this paper presents a further elaboration of the approach for stiffness control of classical antagonistic drives in robotics [1] 1 to bidirectional antagonistic drives. the long term desire of scientists to design and build a faithful copy of a human being finally coincides with the latest efforts of in-house service robotics how to design a robot which fully matches the house environment. because humans shape their living environment to fully meet their comfort and necessities, home robots have to be built to fit such areas and therefore they must move and behave in the same manner as humans. 
received november 16, 2016; received in revised form may 4, 2017
corresponding author: kosta jovanović, laboratory for robotics etf robotics, school of electrical engineering, university of belgrade, 11000 belgrade, serbia (e-mail: kostaj@etf.rs)
* initial research related to this paper received the best paper award at the 3rd international conference on electrical, electronic and computing engineering (icetran '16), section robotics and flexible automation [1].

therefore, there are a number of current research projects with the ultimate goal of creating musculoskeletal (so-called anthropomimetic) robots [2], [3]. the best known among them are the japanese robot kenshiro [4] and eccerobot, the anthropomimetic robot of a european consortium [5]. following the anthropomimetic approach, the key issues are human-like actuators and their control. the design of an anthropomimetic actuator has to follow guidelines set by its human paragon: it should be tendon driven and compliant (a variable stiffness actuator, vsa, with changeable compliance), and it therefore has to be driven by at least two motors in order to control both position and compliance (the opposite of stiffness). the control of such drives, which are inevitably multivariable and non-linear, has to be reliable, safe and robust. this paper presents one instance of a bio-inspired robotic drive of changeable stiffness, the bidirectional qbmove maker pro, and an approach to controlling such a drive, initially based on our work on the puller-follower approach [6]. a brief overview of bidirectional antagonistic joints in robotics, as well as of our target one, is given in section 2. our group at the robotics laboratory of the school of electrical engineering, university of belgrade, pays special attention to the control of novel bio-inspired robot actuators in general, and to the control of bidirectional antagonistic drives as one of the instances available in the laboratory. 
generalized puller-follower approach based on feedback linearization to the control of qbmove maker pro is introduced in section 3. the validity of the proposed control algorithm is proven via simulation in section 4. section 5 brings conclusions about a prospective application of the proposed methodology, gives tips for future work and points out the already tested alternative approaches for stiffness control of bidirectional antagonistic drives. 2. bidirectional antagonistic drives qbmove maker pro a subgroup of vsas that mimics biological paragon of mammals is antagonistic actuators. although classical antagonistic actuation is the prime example of a fully biologically inspired actuation, lately, the engineers turned to bidirectional antagonistic actuation as a big step towards real antagonistic actuation. the most significant advantage of bidirectional antagonistic actuation is bidirectional torque achieved by two antagonistically coupled motors. namely, both motors could either pull or push, contrary to classical antagonistic, tendon driven actuators and human muscles. therefore, slacking of the tendons is not possible, and controllability of such drives is ensured. pioneering works in antagonistic actuation exploited intrinsic compliance of hydraulic and pneumatic actuators as antagonistically coupled drives. therefore, the first widely known implementation of antagonistic drives were: the utah/m.i.t. dexterous hand [7], mckibben pneumatic artificial muscles in antagonistic arrangements [8] such as work of tondu et al. [9] or boblan et al. [10], biped walking robots with antagonistically actuated joints at waseda university [11], or european pneumatic biped lucy build at vrije university of brussels [12]. in parallel, electric drives have been gradually developed and prevailed in antagonistic drives due to control issues when pneumatic actuators are employed [13]. to achieve variable stiffness, non-linear tendon transmission has to be designed [14]. 
the non-linear transmission could be obtained either by placing non-linear elastic elements ([15] and [16]) or by placing linear elastic elements with a controlled system dedicated to shaping the non-linearity in the transmission. the latter approach was employed by migliore [17], hurst [18], and tonietti [19]. in this research we opted for the first approach.

although vsa is a topic of increasing importance for safe human-robot interaction, only a limited number of vsas are available on the market, due to high costs and complex mechanical design. with the idea of bringing an instance of such a compliant actuator to a broad audience of researchers and academia, the natural motion initiative [20] developed the qbmove maker series of actuators. their latest prototype, qbmove maker pro, is a low-cost, 3d-printed bidirectional spring antagonistic actuator which is affordable and has all the features of a bidirectional antagonistic vsa. all parts of the actuator are off-the-shelf and can either be purchased from the natural motion initiative, or their models can be downloaded from the internet free of charge. furthermore, all software dedicated to real-time control of qbmove maker pro is open source [21].

a prototype of the qbmove maker pro actuator and its functional scheme are depicted in fig. 1. in this arrangement, both motors can contribute to the overall shaft torque symmetrically. this is the basic difference compared to the traditional antagonistic structure, where each motor can contribute in only one direction due to the pulling constraint. the joint shaft and the motors are coupled via non-linear springs. the non-linear force-deflection characteristic is of fundamental importance, since it enables variable stiffness of the joint, which depends on the spring pretensions [17]. experiments which confirm this non-linear coupling are given in [22].

fig. 
1 qbmove maker pro: prototype (left), functional scheme (right) a mathematical model of qbmove maker pro actuator is given by equations (1) (7). non-linearity in force-deflection characteristics causes that relatively small displacement of motors positions and/or output shaft induces a significant change in stiffness for high stiffness values. equation (1) describes joint/shaft dynamics, equation (2)-(3) stands for motor dynamics. resulting driving torques are given by (4)-(7). ( ) ̈ ( ̇) ̇ ̇ ( ) ( ) (1) ̈ ̇ ( ) (2) ̈ ̇ ( ) (3) ( ) ( ) ( ) (4) 54 k. jovanović, b. lukić, v. potkonjak ( ) ( ( )) (5) ( ) ( ( )) (6) ( ) ( ( )) ( ( )) (7) actuator dynamics is specified by shaft inertia ( ), velocity related terms (centrifugal and coriolis) ( ̇) ̇, viscous damping , gravity load ( ), and overall actuator torque ( ) as a sum of both bidirectional antagonistic tendon/drive torques ( ) and ( ). the bidirectional antagonistic drives are assumed to be symmetric with inertia – and damping term – . note that non-linearity in the transmission given by (5) and (6) is a prerequisite for variable stiffness of qbmove maker pro actuator. since both drives influence actuator position as well as actuator stiffness, decoupling of position and stiffness subsystem is demanding control challenge which is considered in this paper. since our final goal to control joint stiffness, let us briefly recall the definition of joint stiffness equivalent to the stiffness of a translational spring. the force acting on the spring depends on its extension and this static dependence is defined as the spring stiffness ⁄ . thus, the spring of length in its equilibrium position ( ) stays undeformed, whereas if the spring is extended to a length , it generates force . if this relation is linear, then we consider the spring as linear (8) and the stiffness is constant. otherwise, the spring is considered as non-linear (9) and the stiffness is variable. 
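the distinction between a linear spring of constant stiffness and a non-linear spring of variable stiffness can be illustrated numerically. the python sketch below is an illustration under stated assumptions only: it takes a hyperbolic-sine torque-deflection law, tau = k*sinh(a*phi), which is a common model for this class of actuator but cannot be recovered from the equations as they appear here; it assumes that the two spring coefficients listed in table 1 play the roles of a and k; and the sign convention (elastic torque as a function of joint position minus motor position) is chosen for the example. all function names are hypothetical.

```python
import math

A = 6.7328  # assumed role: coefficient multiplying the spring deflection
K = 0.0227  # assumed role: coefficient scaling the spring torque


def spring_torque(phi):
    """Assumed non-linear spring law: torque for a deflection phi [rad]."""
    return K * math.sinh(A * phi)


def spring_stiffness(phi):
    """Stiffness = derivative of torque with respect to deflection."""
    return A * K * math.cosh(A * phi)


def joint_torque(q, theta_a, theta_b):
    """Sum of the two spring torques acting between shaft and motors."""
    return spring_torque(q - theta_a) + spring_torque(q - theta_b)


def joint_stiffness(q, theta_a, theta_b):
    """Derivative of joint_torque with respect to the joint position q."""
    return spring_stiffness(q - theta_a) + spring_stiffness(q - theta_b)


# symmetric pretension: the motors sit at +p and -p around the shaft at q = 0
for pretension in (0.0, 0.1, 0.2, 0.3):
    tau = joint_torque(0.0, pretension, -pretension)
    sigma = joint_stiffness(0.0, pretension, -pretension)
    print(f"pretension {pretension:.1f} rad: torque {tau:+.4f}, stiffness {sigma:.4f}")
```

under these assumptions, moving the motors apart symmetrically leaves the net torque at zero while the stiffness grows, which is why a non-linear spring characteristic is needed for stiffness to be adjustable independently of position; with a linear law the derivative would be the same constant for every pretension.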
likewise, the stiffness of the robot joint (usually denoted in the literature as ⁄ ) is defined by (10), where stands for the torque generated in the joint and denotes the joint position. ( ) ⁄ (8) ( ) ( ) ( ) ⁄ (9) (10) analogously, joint stiffness can be constant or changeable which is a desirable feature from an exploitation point of view since it enables tradeoffs between safe and precise manipulation. since we focus on robot joints that exploit antagonism, the stiffness of such joints is presented in accordance with the source of mechanical stiffness in antagonistically coupled tendons. therefore, the overall shaft/joint stiffness of qbmove maker pro actuator is estimated as follows in (11). for unloaded shaft, equilibrium position is given by (12). ( ) ( ( )) ( ( )) (11) (12) feedback linearization for decoupled position/stiffness control of bidirectional antagonistic drives 55 3. feedback linearization for decoupled position/stiffness control of bidirectional antagonistic drives since both bidirectional antagonistic motors contribute to joint position and joint stiffness, static feedback linearization is employed to decouple this multivariable system into two decoupled and linearized single-input-single-output systems. the original system can be written in state-space representation (13). ̇ ( ) ( ) (13) here, joint and motor positions and velocities are considered as state space variables ̇ ̇ ̇ , while motor torques are considered as control inputs. joint position and overall joint stiffness are outputs: . by straightforward application of feedback linearization [23], outputs and were differentiated until a linear relation to inputs and/or was obtained. to that end, outputs and were differentiated four times (14) and two times (15) respectively. since the sum of the relative degrees (=4+2) of the outputs was equal to the state dimension of the system (=6), zero dynamics does not exist and all states are fully observable. 
( ) ( ) (14) ( ) ( ) (15) ( ) denotes lie derivative of ( ) along vector function ( ). lie derivatives in cases of position and stiffness of the model representing qbmove maker pro are depicted in (16) and (17) respectively. decoupling the matrix ( ), defined as in (18), has to be non-singular to prove controllability of the system, which is always valid for positive joint stiffness. at the same time, this is the second precondition for the application of static feedback linearization. for the sake of simplicity, the following notation is adopted: ( ( )), ( ( – )), ( ( )), and ( ( )). ( ( ( ) ) ( ̇ ̇) ( ( ) ) ( ̇ ̇) ) (16) ( ( ( ) ) ( ̇ ̇) ( ( ) ) ( ̇ ̇) ) (17) 56 k. jovanović, b. lukić, v. potkonjak [ ] [ ] (18) finally, in accordance to [23], original input can be transformed as in (19) to achieve independent control of both the joint position and stiffness via the newly-defined intermediate input [ ] . the result of this input transformation is two linear single-input-single-output systems controlled by intermediate input which can be written in linear state space form (20). new state vector contains all output derivatives up to the highest order [ ̇ ̈ ( ) ̇] . ( * ( ) ( ) + * +) (19) ̇ (20) from (14) through (20) follows that ( ) ( ) [ ] . thus, if we choose as the desired joint position and as the desired joint stiffness, a basic control law (21) can be applied. accordingly, state feedback linearization allows control of both the positions and stiffness of the bidirectional antagonistic robot joint, using two totally independent linear controllers, composed of static state feedback and feed-forward action. as demonstrated in [24] and [25], the stability of the proposed control methodology (21) is ensured if the gains in are chosen so the polynomials depicted in (22) are hurwitz's. 
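the stability condition above, that the gain polynomials be hurwitz, can be checked with the routh-hurwitz criterion. the python sketch below is an illustration only: it assumes a fourth-order characteristic polynomial for the position subsystem and a second-order one for the stiffness subsystem (matching the four and two differentiations of the outputs), and the pole location used to generate the example gains is hypothetical, not the values adopted in this paper.

```python
def is_hurwitz(coeffs):
    """Routh-Hurwitz test for a real polynomial.

    coeffs: [a_n, ..., a_1, a_0], highest power first, with n >= 1.
    Returns True only if every root has a strictly negative real part.
    """
    n = len(coeffs) - 1
    if n < 1:
        raise ValueError("polynomial degree must be at least 1")
    if any(c <= 0 for c in coeffs):
        return False  # necessary condition: all coefficients positive

    row_even = [float(c) for c in coeffs[0::2]]
    row_odd = [float(c) for c in coeffs[1::2]]
    width = len(row_even)
    row_odd += [0.0] * (width - len(row_odd))
    rows = [row_even, row_odd]

    # build the remaining n - 1 rows of the Routh array
    for _ in range(n - 1):
        upper, lower = rows[-2], rows[-1]
        if lower[0] <= 0:
            return False  # sign change or zero pivot in the first column
        new_row = [
            (lower[0] * upper[j + 1] - upper[0] * lower[j + 1]) / lower[0]
            for j in range(width - 1)
        ]
        new_row.append(0.0)
        rows.append(new_row)

    return all(row[0] > 0 for row in rows)


# hypothetical gains: all closed-loop poles placed at s = -p
p = 10.0
position_poly = [1.0, 4 * p, 6 * p**2, 4 * p**3, p**4]  # (s + p)^4
stiffness_poly = [1.0, 2 * p, p**2]                      # (s + p)^2

print(is_hurwitz(position_poly), is_hurwitz(stiffness_poly))
```

a pure-python routh array is used here to keep the example dependency-free; computing the polynomial roots numerically and checking their real parts would be an equivalent test.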
( ) ( ( ) ( )) ( ̈ ( )) ( ̇ ( )) ( ( )) ̈ ( ̇ ( )) ( ( )) (21) (22) theoretically, if the desired joint positions and stiffness are smooth trajectory, asymptotic trajectory/force tracking is possible. in this paper, the desired trajectories are set manually without considering higher control levels and optimization issues. an illustrative scheme of the proposed algorithm is depicted in fig 2. feedback linearization for decoupled position/stiffness control of bidirectional antagonistic drives 57 fig. 2 decoupled position/stiffness control scheme for qbmove maker pro actuator 4. results and discussion the mathematical model (presented in section 2) and the presented control approach (section 3) are implemented in user-defined dedicated matlab/simulink model. the validation of the proposed approach is given in fig 3 through fig 6. fig 3 presents joint position tracking the desired trajectory combines an interval of smooth increase in position for ⁄ and sine trajectory with an amplitude of ⁄ . desired and achieved stiffness are depicted in fig 4. desired stiffness comprises flat and sine part of an amplitude of which is in accordance with desired trajectory to demonstrate simultaneous control of both joint position and stiffness for different trajectory patterns. theoretically, as elaborated by palli et al. [24], [25], if the desired joint positions are continuous up to the 4 th order ( ) , and the stiffness is planned to be continuous up to the 2 nd order ( ) , asymptotic trajectory/ force tracking is achieved. fig 5 presents coordinated actions of two antagonistically coupled motors which contribute to the joint position but also stiffness. one can see that while the desired stiffness is constant ( ) both motors move in the same direction equally contributing to the joint position which follows its pattern. 
when the stiffness starts changing, the motors act as follows: when the joint stiffens (rise in stiffness), the motors move in opposing directions, while a decrease in joint stiffness results in a decrease in the difference between the antagonistic motor positions. the overall resulting joint torque, depicted in fig 6, fits the pattern of the desired joint trajectory. the demonstrated results are obtained for the parameters shown in table 1. control parameters (23) and (24) are adopted from [6]. ( )( ) ( ) ( ) (23) ( ) (24)

table 1 simulation parameters
description          numerical value   unit
motor inertia        0.000003
joint inertia        0.015
motor damping        0.000001          [ s/rad]
joint damping        0                 [ s/rad]
spring coefficient   6.7328
spring coefficient   0.0227

fig. 3 joint position tracking
fig. 4 joint stiffness tracking
fig. 5 positions of bidirectional antagonistically coupled motors
fig. 6 resulting joint torque as contribution of both bidirectional antagonistically coupled motors

5. conclusion

the paper elaborated the application of the stiffness control method proposed in [1] to a robot joint driven by a bidirectional antagonistic actuator, qbmove maker pro. in this way, the increasingly important topic of variable stiffness actuation was addressed. an approach which enables simultaneous decoupled control of joint position and joint stiffness was demonstrated. the concept is validated through simulations. however, the key issue in the implementation of this feedback linearization based control approach is model dependence. the model itself is very complex and non-linear, so model identification must be considered comprehensively before the approach is used. 
moreover, it is well known that systems that are linearized by decomposing their structure into two or more linear subsystems are prone to behave erratically when disturbed. the robustness of the presented approach is discussed in the authors' previous work [6]. to overcome the dependence on the model, alternative approaches to simultaneous position/stiffness control of bidirectional antagonistic drives were pointed out in the authors' previous works [26] and [27], while neural networks for system modeling and feed-forward control were presented in [28]. future work on the topic will consider the implementation of the proposed approach for stiffness control on a laboratory setup driven by qbmove maker pro actuators and on a model-based multi-jointed robot with bidirectional antagonistic drives, as well as its implementation for cartesian stiffness control. the ultimate goal of this research is the development of a control scheme which shapes cartesian stiffness through the symbiosis of joint stiffness control and posture planning of the robot.

acknowledgment: research leading to these results was funded by the ministry of education, science and technological development, republic of serbia, under contract tr-35003.

references [1] k. jovanovic, b. lukic, v. potkonjak, “enhanced puller-follower approach for stiffness control of antagonistically actuated joints”, in proceedings of international conference on electrical, electronic and computing engineering (icetran ’16), 13-16 jun 2016, pp. roi1.2.1-5. [2] a. diamond, r. knight, d. devereux, o. holland, "anthropomimetic robots: concept, construction and modelling," international journal of advanced robotic systems, vol. 9, no. 209, pp. 1-14, 2012. [3] k. jovanovic, v. potkonjak, o. holland, "dynamic modelling of an anthropomimetic robot in contact tasks," advanced robotics, vol. 28, no. 11, pp. 793-806, 2014. [4] y. nakanishi, s. ohta, t. shirai, y. asano, t. kozuki, y. kakehashi, h. mizoguchi, t. kurotobi, y. motegi, k. 
sasabuchi, j. urata, k. okada, i. mizuuchi, m. inaba, "design approach of biologicallyinspired musculoskeletal humanoids", international journal of advanced robotics systems, vol. 10, no. 216, pp. 1-13, 2013. [5] s. wittmeier, c. alessandro, n. bascarevic, k. dalamagkidis, a. diamond, m. jäntsch, k. jovanovic, r. knight, h. g. marques, p. milosavljevic, b. svetozarevic, v. potkonjak, r. pfeifer, a. knoll, o. holland, "toward anthropomimetic robotics: development, simulation, and control of a musculoskeletal torso", artificial life, vol. 19, no. 1, pp. 171-193, 2013. [6] v. potkonjak, b. svetozarevic, k. jovanovic, o. holland, "the puller-follower control of compliant and noncompliant antagonistic tendon drives in robotic system", international journal of advanced robotics systems, vol. 8, no. 5, pp. 143-155, 2012. [7] s. c. jacobsen, e. k. iversen, d. knutti, r. johnson, k. biggers, "design of the utah/m.i.t. dextrous hand", in proceedings of ieee international conference on robotics and automation (icra 1986), san francisco, ca, usa, 7-10 april 1986. pp. 1520-1532. [8] g. c. klute, j. m. czerniecki, b. hannaford, "mckibben artificial muscles: pneumatic actuators with biomechanical intelligence", in proceedings of ieee/asme international conference on advanced intelligent mechatronics, atlanta, ga, usa, 19-23 september 1999, pp. 221-226. [9] b. tondu, s. ippolito, j. guiochet, a. daidie, "a seven-degrees-of-freedom robotarm driven by pneumatic artificial muscles for humanoid robots", the international journal of robotics research, vol. 24, no. 4, pp. 257-274, 2005. [10] i. boblan, j. maschuw, d. engelhardt, a. schulz, h. schwenk, r. bannasch, i. 
rechenberg, "a humanlike robot hand and arm with fluidic muscles: modelling of a muscle driven joint with an antagonistic setup", in proceedings of international symposium on adaptive motion in animals and machines, ilmenau, germany, 25-30 september 2005. [11] j. yamaguchi, d. nishino, a. takanishi, "realization of dynamic biped walking varying joint stiffness using antagonistic driven joints", in proceedings of ieee international conference on robotics and automation (icra 1998), leuven, belgium, 16-20 may 1998, pp. 2022-2029. [12] b. verrelst, r. van ham, b. vanderborght, f. daerden, d. lefeber, "the pneumatic biped "lucy" actuated with pleated pneumatic artificial muscles", autonomous robots, vol. 18, no. 2, pp. 201-213, 2005. [13] s. čajetinac, d. šešlija, v. nikolić, m. todorović, "comparison of pwm control of pneumatic actuator based on energy efficiency", facta universitatis, series: electronics and energetics, vol. 25, no. 2, pp. 93-101, 2012. [14] r. van ham, t. sugar, b. vanderborght, k. hollander, d. lefeber, "compliant actuator design: review of actuator with passive adjustable compliance/controllable stiffness for robotic applications", ieee robotics & automation magazine, vol. 13, no. 3, pp. 771-789, 2009. [15] k. koganezawa, y. watanabe, n. shimizu, "stiffness and angle control of antagonistically driven joint", advanced robotics, vol. 12, no. 7-8, pp. 81-94, 1997. [16] c. english, d. russell, "implementation of variable joint stiffness through antagonistic actuation using rolamite springs", mechanism and machine theory, vol. 34, no. 1, pp. 27-40, 1999. [17] s. migliore, e. brown, s. deweerth, "biologically inspired joint stiffness control", in proceedings of ieee international conference on robotics and automation (icra ’05), 18-22 april 2005, pp. 4508-4513. [18] j. hurst, j. chestnutt, a. 
rizzi , "an actuator with physically variable stiffness for highly dynamic legged locomotion", in proceedings of ieee international conference on robotics and automation (icra 2004), new orleans, la, usa, 26 april-1 may 2004, pp. 4662-4667. [19] g. tonietti, r. schiavi, a. bicchi, "design and control of a variable stiffness actuator for safe and fast physical human/robot interaction", in proceedings of ieee international conference on robotics and automation (icra 2005), barcelona, spain, 18-22 april 2005. pp. 526-531. [20] m. catalano, g. grioli, m. garabini, f. bonomo, m. mancini, n. tsagarakis and a. bicchi, “vsa-cubebot: a modular variable stiffness platform for multiple degrees of freedom robots”, in proceedings of ieee international conference on robotics and automation (icra ’11), 9-13 may 2011. pp. 5090 5095. [21] natural motion machine initiative (nmmi) [qbmove maker pro assembly guide], last accessed november 13th, 2016 – https://sourceforge.net/projects/nmmiwebsite/files/qbmovev01/assembly%20guide%20v01.pdf/ download [22] k. melo, m. garabini, g. grioli, m. catalano, l. malagia, a. bicchi, “open source vsa-cubebots for rapid soft robot prototyping”, robot makers workshop in conjunction with 2014 robotics science and systems conference, berkeley, california, usa, july 12, 2014. [23] h. k khalil, "chapter 13: state feedback stabilization," in nonlinear systems, 3rd edition, upper saddle river, new jersey, usa, prentice hall, 2002, pp. 197-227. [24] g. palli, c. melchiorri, a. de luca, "on the feedback linearization of robots with variable joint stiffness", in proceedings of ieee international conference on robotics and automation (icra 2008), pasadena, ca, usa, 19-23 may 2008. pp. 1753-1759. [25] g. palli, c. melchiorri, t. wimböck, m. grebenstein, g. 
hirzinger, "feedback linearization and simultaneous stiffness-position control of robots with antagonistic actuated joints", in proceedings of ieee international conference on robotics and automation (icra '07), rome, italy, 10-14 april 2007. pp. 4367-4372. [26] b. lukić, k. jovanović, a, rakić, “realization and comparative analysis of coupled and decoupled control methods for bidirectional antagonistic drives: qbmove maker pro,” presentedat the 3 rd international conference on electrical, electronic and computing engineering (icetran 2016), zlatibor, serbia, jun 13-16, 2016. [27] b. lukić, k. jovanović, “minimal energy cartesian impedance control of robot with bidirectional antagonistic drives,” in proceedings of the iftomm/ieee/eurobotics 25 th international conference on robotics inalpe-adria-danube region – raad 2016, belgrade, june 30 th july 2 nd 2016. [28] b. lukić, k. jovanović, g. kvašĉev, “feedforward neural network for controlling qbmove maker pro variable stiffness actuator”, in proceedings of the 13th symposium on neural networks applications in electrical engineering (neurel 2016), belgrade, serbia, november, 2016. facta universitatis series: electronics and energetics vol. 34, no 2, june 2021, pp. 219-237 https://doi.org/10.2298/fuee2102219p © 2021 by university of niš, serbia | creative commons license: cc by-nc-nd original scientific paper optimal allocation of multiple dgs in rds using pso and its impact on system reliability shradha singh parihar, nitin malik the northcap university, gurugram, india abstract. this article presents the distributed generator (dg) integration in a radial distribution system (rds). the dg penetration changes the single power source to multiple power sources and bidirectional load flow which enhances the system reliability and reduces system power losses. 
particle swarm optimization (pso) and the gravitational search algorithm (gsa) are implemented for the optimal siting and sizing of one and three dg units in the rds to examine their impact on system reliability and loss reduction. the dg types considered are type i (injects real power) and type iv (injects real and reactive power). a constant-power load model is used. the reliability indices used to analyse system reliability are average energy not supplied (aens), total energy not supplied (tens) and the average system interruption duration index (asidi). the efficacy of the proposed method is validated on the 33-bus system in the presence of single and multiple dgs. compared with type i dg, type iv dg gives a larger decrease in system power losses, a better bus voltage profile, higher system reliability and a larger annual loss saving. the results are compared with those of other meta-heuristic approaches and analytical techniques to demonstrate the superiority of the proposed methodology, and are also statistically verified.

key words: dg, siting and sizing, pso, gsa, aens, asidi, tens, reliability, radial distribution system

1. introduction

rapid population growth, industrialization and global economic expansion motivate massive investment in reliable power supply. component failure in the radial distribution system (rds) is the primary cause of power interruptions, which reduce system reliability and significantly affect distribution utilities and consumers. hence, a reliable power supply has become very important. corrective measures such as network reconfiguration using tie and sectionalizing switches have been tried to restore the power supply until the failed components are replaced.
due to the lack of such functionalities, the penetration of distributed generators (dgs) in the rds plays a vital role in providing a reliable supply [1-2].

received september 14, 2020; received in revised form january 11, 2021
corresponding author: nitin malik, the northcap university, huda sector 23-a, gurugram – 122017, india (e-mail: nitinmalik@ncuindia.edu)

based on their power delivering capability, dgs are classified as follows [3]:
type i (injects real power, power factor (pf) = 1): e.g. photovoltaic, battery, fuel cell
type ii (injects reactive power, pf = 0): e.g. synchronous capacitor
type iii (injects real power, consumes reactive power, leading pf): e.g. induction generator
type iv (injects real and reactive power at lagging pf): e.g. synchronous generator, wind power

the optimal penetration of dg in an rds has many advantages, such as an improved bus voltage profile, higher system reliability and reduced power losses, but it may adversely affect the system if the dg is not integrated optimally. dg allocation methods are classified as analytical and meta-heuristic. analytical techniques use mathematical expressions to compute the optimal solution. in [4], the authors proposed an analytical expression to find the dg size without evaluating the associated cost benefits. numerical methods such as mixed-integer non-linear programming [5] and the kalman filter [6] have been used to integrate dg optimally in rds. in [7], the authors proposed a multi-objective index-based approach to find the optimum size and site of dg in rds, considering the voltage deviation at the critical node and the tail-end nodes simultaneously. a power stability index is developed in [8] to site the dg optimally in rds, but only type i dg is considered. many meta-heuristic approaches, such as the ant lion optimization algorithm [9] and the non-dominated sorting genetic algorithm-ii [10], have been used to solve dg installation problems.
a new voltage stability index has been developed for the optimal integration of various types of dgs using analytical and particle swarm optimization (pso) approaches to analyse system performance [11]. a comparison of two surveys of canadian and united states utilities regarding service utility data collection and its utilization is presented in [12] to show the service continuity statistics. the authors in [13] demonstrate the effect of the location and number of dg units on the reliability indices of the rds. the assessment of reliability indices under uncertainties is demonstrated in [14], but not for multiple dgs. the relationship between dg penetration and the power supply reliability of rds is analysed in [15], but only for a small-scale system. a methodology is described in [16] to estimate the impact of dg on the reliability indices in the presence of system constraints. the penetration of multiple dg units in rds may adversely affect system reliability due to excessive power injection, as shown in [17]. the effect of installing dgs of different sizes at different distances from the substation on the reliability indices is demonstrated in [18]. in [19], the authors demonstrated the effect of the optimal penetration of multiple dgs on system power losses and a reliability index in an existing rds.

previous work shows that very few researchers have studied the impact of the optimal installation of multiple dg units on reliability indicators in order to analyse distribution system reliability. in this article, as a first step, the optimal allocation of multiple types of dg units using pso and the gravitational search algorithm (gsa) is carried out in rds considering various operational limits.
thereafter, the effect of dg placement is examined not only on system power losses and bus voltage but also on the reliability indicators, namely total energy not supplied (tens), average energy not supplied (aens) and average system interruption duration index (asidi). the dg types considered in this work are type i and type iv. the efficacy of the presented technique is tested on the ieee 33-bus system. the load model selected for this study is the constant power type. based on the type of dg integrated, two case studies are identified and presented. the main contributions of this article are:
i. two meta-heuristic techniques are implemented for the simultaneous siting and sizing of one and three dg units in rds, and their results are compared.
ii. the impact of type i and type iv dg allocation on reliability indicators such as tens, aens and asidi is analysed in addition to the system voltage profile and power losses.
iii. the ratio of the percentage real power loss reduction (plr) to the penetration level (pl) is determined to show the efficacy of the proposed method over other analytical and meta-heuristic methods.

the rest of the article is arranged as follows. the mathematical modeling of the load, line and dg is presented in section 2. sections 3, 4 and 5 explain the problem formulation, the system reliability indicators and the working of pso and gsa, respectively. the solution methodology for installing multiple dg units using the pso approach is discussed in section 6. the results are analysed in section 7, and conclusions are drawn in section 8.

2. mathematical modeling

2.1. line and load model

system loads are considered to be concentrated at the nodes. most of the system loads in rds are voltage- and frequency-dependent [20].
for the analysis of static loads, only the variation in voltage is taken into account, as the frequency deviation is insignificant [21]. in this article, the load is modeled as a constant complex power type. the single-line diagram (sld) of a branch connected between bus i-1 and bus i is shown in fig. 1. in the short-line distribution model, the line-to-ground capacitance is very small and hence neglected [22].

fig. 1 electrical equivalent of one branch

from fig. 1, we get

$$P_i + jQ_i = V_i\angle\delta_i \cdot I_i^{*} \tag{1}$$

where $V_i$ is the receiving-end bus voltage and $V_{i-1}$ is the sending-end bus voltage. $\delta_i$ represents the voltage angle at bus i. $Q_i$ and $P_i$ represent the reactive and active power load fed to bus i, respectively. conjugating both sides of eq. (1), we get

$$P_i - jQ_i = (V_i\angle\delta_i)^{*} \cdot I_i \tag{2}$$

the receiving-end bus voltage is given as

$$V_i\angle\delta_i = V_{i-1}\angle\delta_{i-1} - (R_i + jX_i)\,I_i \tag{3}$$

where $R_i$ and $X_i$ represent the branch resistance and the branch reactance, respectively, and $\delta_{i-1}$ is the voltage angle at bus i-1. from eq. (2) and eq. (3), the magnitude of the receiving-end voltage is obtained as

$$V_i = \left[\left\{\left(R_iP_i + X_iQ_i - \tfrac{1}{2}V_{i-1}^{2}\right)^{2} - \left(R_i^{2}+X_i^{2}\right)\left(P_i^{2}+Q_i^{2}\right)\right\}^{1/2} - \left(R_iP_i + X_iQ_i - \tfrac{1}{2}V_{i-1}^{2}\right)\right]^{1/2} \tag{4}$$

the branch real power loss (rpl), $P_{loss}$, and the branch reactive power loss, $Q_{loss}$, between bus i-1 and bus i are expressed as

$$P_{loss}(i-1,i) = \frac{P_i^{2}+Q_i^{2}}{|V_i|^{2}} \cdot R_i \tag{5}$$

$$Q_{loss}(i-1,i) = \frac{P_i^{2}+Q_i^{2}}{|V_i|^{2}} \cdot X_i \tag{6}$$

the load flow (lf) algorithm applied in the paper is the backward-forward (b/f) sweep [23]. the stopping criterion is a tolerance of $10^{-4}$ p.u. on the difference in bus voltage between two successive iterations at all the buses.

2.2. dg modeling

dg units with a high rating can lead to a situation in which the losses are higher than in the base case [4]. small dg units generally operate in constant power mode, that is, the generator bus is modeled as a constant negative pq load.
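a minimal python sketch of the single-branch relations of eqs. (4)-(6), combined with the constant negative pq load idea, may help to make the model concrete. the branch data, the function names and the dg size are hypothetical and are not taken from the 33-bus system; the reactive injection is derived from the power factor with the usual tan(acos(pf)) relation, and a complete study would embed these relations in a b/f sweep over all branches.

```python
import math


def receiving_end_voltage(v_send, p, q, r, x):
    """Eq. (4): magnitude of the receiving-end voltage (all values in p.u.)."""
    a = r * p + x * q - 0.5 * v_send ** 2
    b = (r ** 2 + x ** 2) * (p ** 2 + q ** 2)
    return math.sqrt(math.sqrt(a ** 2 - b) - a)


def branch_losses(v_recv, p, q, r, x):
    """Eqs. (5)-(6): real and reactive power loss of the branch (p.u.)."""
    s2_over_v2 = (p ** 2 + q ** 2) / v_recv ** 2
    return s2_over_v2 * r, s2_over_v2 * x


def net_load_with_dg(p_load, q_load, p_dg, pf_dg=1.0):
    """DG treated as a constant negative PQ load; pf_dg = 1 means real power only."""
    q_dg = p_dg * math.tan(math.acos(pf_dg))
    return p_load - p_dg, q_load - q_dg


# Hypothetical single-branch data in p.u. (illustrative only).
V_SEND, R, X = 1.0, 0.01, 0.005
P_LOAD, Q_LOAD = 0.5, 0.2

v_no_dg = receiving_end_voltage(V_SEND, P_LOAD, Q_LOAD, R, X)
loss_no_dg, _ = branch_losses(v_no_dg, P_LOAD, Q_LOAD, R, X)

# Unity-pf DG of 0.3 p.u. placed at the receiving-end bus (assumed size).
p_eq, q_eq = net_load_with_dg(P_LOAD, Q_LOAD, p_dg=0.3)
v_dg = receiving_end_voltage(V_SEND, p_eq, q_eq, R, X)
loss_dg, _ = branch_losses(v_dg, p_eq, q_eq, R, X)

print(f"no dg:   V = {v_no_dg:.4f} p.u., loss = {loss_no_dg:.6f} p.u.")
print(f"with dg: V = {v_dg:.4f} p.u., loss = {loss_dg:.6f} p.u.")
```

in this toy case the dg lowers the net load seen by the branch, so the receiving-end voltage rises and the branch loss falls, which is the mechanism the optimization exploits.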
alternatively, the dg bus can be modeled as a pv bus, in which the reactive power injected by the dg is adjusted to keep the bus voltage at a fixed level. according to the ieee 1547 standard [24], utilities do not recommend that dg units regulate bus voltages, in order to avoid conflict with the existing voltage control schemes [25]. in addition, the amount of reactive power delivered by the generator depends upon the system configuration and cannot be stated in advance. therefore, the dg is modeled as a pq load. type iii dg gives the worst system performance among all dg types in terms of voltage improvement and loss minimization [26].

the change in the load demand at a bus depends upon the power injected by the dg. if a dg is placed at bus i, the equivalent load at that bus can be written as

$$P_i^{eq} = P_i - P_{DGi} \tag{7}$$

$$Q_i^{eq} = Q_i - Q_{DGi} \tag{8}$$

where $Q_{DGi}$ and $P_{DGi}$ represent the reactive and real power injected by the dg at bus i, respectively. the magnitude of reactive power injected at bus i for a given pf of a type iv dg is

$$Q_{DGi} = P_{DGi} \cdot \tan\left(\cos^{-1}\left(PF_{DG}\right)\right) \tag{9}$$

3. problem formulation

the objective of installing multiple dg units of multiple types in rds is to improve the bus voltage profile and system reliability while reducing power losses. pso- and gsa-based techniques are implemented for the optimal installation of single and multiple dgs, with eq. (10) as the objective function (of):

$$OF = \text{minimize} \sum_{i=1}^{N_b-1} P_{loss}(i-1,i) \tag{10}$$

where $N_b$ is the total number of buses in the system. the operational constraints are as follows:

a) power balance principle:

$$P_G = P_D + P_{loss} \tag{11}$$

$$Q_G = Q_D + Q_{loss} \tag{12}$$

where $Q_G$ and $P_G$ represent the generated reactive and real power, $Q_D$ is the system reactive load demand and $P_D$ is the system real load demand.

b) bus voltage limits:

$$0.95\ \text{p.u.} \le V_i \le 1.05\ \text{p.u.} \tag{13}$$

c) branch ampacity constraints:

$$I_{branch} \le I_{thermal} \tag{14}$$

where $I_{branch}$ and $I_{thermal}$ represent the branch current and its thermal limit, respectively.

d) constraints on dg power generation:

$$0 \le P_{DGi} \le \sum P_{Load} \tag{15}$$

$$0 \le Q_{DGi} \le \sum Q_{Load} \tag{16}$$

$$0 \le S_{DGi} \le \sum S_{Load} \tag{17}$$

where $S_{DGi}$ represents the distributed apparent power generation, and $\sum S_{Load}$, $\sum P_{Load}$ and $\sum Q_{Load}$ are the system's total apparent, real and reactive power load, respectively.

e) distribution substation capacity:

$$0 \le P_g^{i} \le P_{g(max)}, \quad i \in \text{slack} \tag{18}$$

$$0 \le Q_g^{i} \le Q_{g(max)} \tag{19}$$

where $Q_{g(max)}$ is the maximum reactive power generation and $P_{g(max)}$ is the maximum real power generation at the slack bus.

4. system reliability indicators

planning procedures use reliability indicators to decide on investments in new generation capacity. in this article, the effect of different dg units in the rds is assessed using the following reliability indicators:

a) total energy not supplied (tens)

tens is a measure of distribution system inadequacy and is estimated using eq. (20):

$$TENS = \sum_{e=1}^{nl} \lambda_{a(e)}\, u_e \quad \text{(mwh/yr)} \tag{20}$$

where $nl$ is the total number of load points, $\lambda_{a(e)}$ is the unavailability of load point e (kw) and $u_e$ is the annual outage time in hours/year. the annual outage time is the sum of the load outages caused by branch failures and can be calculated as

$$u_e = \sum_{nl} \lambda_f\, r_e \tag{21}$$

where $r_e$ is the repair time (hours), i.e. the total interruption time of the load, and $\lambda_f$ is the failure rate.

b) average energy not supplied (aens)

the aens is estimated using eq. (22):

$$AENS = \frac{\sum_{e=1}^{nl} \lambda_{a(e)}\, u_e}{\sum N_e} \quad \text{(mwh/customer-yr)} \tag{22}$$

where $N_e$ is the number of customers at load point e.

c) average system interruption duration index (asidi)

the asidi gives the average duration for which the system load is interrupted due to outages. mathematically, it is given by eq. (23):

$$ASIDI = \frac{\sum L_e\, r_e}{L_T} \quad \text{(hours or minutes)} \tag{23}$$

where $L_e$ is the load interrupted and $L_T$ is the total connected load.
d) customer average interruption duration index (caidi)

the caidi is the ratio of the sum of customer interruption durations to the total number of customer interruptions and is given by eq. (24):

$$CAIDI = \frac{\sum u_e N_e}{\sum \lambda_f N_e} \quad \text{(hours/cust-interruption)} \tag{24}$$

e) system average interruption frequency index (saifi)

the saifi is the average number of interruptions per customer per unit time and is given by eq. (25):

$$SAIFI = \frac{\sum \lambda_f N_e}{\sum_{e=1}^{nl} N_e} \quad \text{(interruptions/customer-yr)} \tag{25}$$

the allocation of dg in rds is a cost-effective way to enhance system reliability, as the dg acts as an alternative source for restoring power and may supply the loads that are disconnected due to faults. depending upon its output power, the integration of dg therefore decreases both the number of customers not connected to the grid and the outage time, which in turn reduces the numerator of all the reliability indices in eqs. (20), (22), (23), (24) and (25), thereby enhancing distribution system reliability.

5. optimization algorithms

5.1. pso algorithm

pso is a stochastic approach in which each particle changes its existing state in a multidimensional search space. let $V_{pd} = [v_{p1}, v_{p2}, \ldots, v_{pn_d}]$ and $S_{pd} = [s_{p1}, s_{p2}, \ldots, s_{pn_d}]$ denote the velocity and the position of particle p, respectively, with $d = 1, 2, \ldots, n_d$ and $p = 1, 2, \ldots, N_s$. here, d is the current dimension, $N_s$ is the swarm size and $n_d$ is the dimension of the problem. the velocity and position are updated as

$$v_{pd}^{k+1} = w_p\, v_{pd}^{k} + c_1\, rand_1\, (pbest_{pd} - S_{pd}^{k}) + c_2\, rand_2\, (gbest_{pd} - S_{pd}^{k}) \tag{26}$$

$$S_{pd}^{k+1} = S_{pd}^{k} + v_{pd}^{k+1} \tag{27}$$

where $S_{pd}^{k}$ and $v_{pd}^{k}$ represent the current position and velocity of particle p at the kth iteration, respectively. $c_1$ and $c_2$ are the acceleration coefficients of the second (personal best) and third (global best) terms, respectively. $rand_1(\cdot)$ and $rand_2(\cdot)$ are random numbers distributed uniformly between 0 and 1.
$pbest_{pd}$ is the particle's best position based on its personal experience, and $gbest_{pd}$ is the global best position based on the experience of the overall swarm. the 1st and 3rd terms in eq. (26) represent the inertia component and the social component, respectively. the inertia weight of the pth particle ($w_p$) decreases linearly with the iterations and is given as

$$w_p = w_{pmax} - \frac{w_{pmax} - w_{pmin}}{k_{max}} \cdot k \tag{28}$$

where $w_{pmin}$ and $w_{pmax}$ are the minimum and maximum values of $w_p$, respectively, and $k_{max}$ and $k$ are the maximum and current iteration numbers.

5.2. gravitational search algorithm

gsa is a stochastic meta-heuristic approach inspired by the law of gravity and the laws of motion. the performance of an object is measured in terms of its mass. these laws result in a global movement of all the objects towards the objects with heavier mass. the agent's mass is calculated using eq. (29):

$$M_g^{k} = \frac{m_g^{k}}{\sum_{h=1}^{N_g} m_h^{k}} \tag{29}$$

where

$$m_g^{k} = \frac{fit_g^{k} - fit_{worst}^{k}}{fit_{best}^{k} - fit_{worst}^{k}} \tag{30}$$

here, $fit_g^{k}$ and $M_g^{k}$ are the fitness value and the mass of agent g at the kth iteration, $N_g$ is the total number of agents, and $fit_{best}^{k}$ and $fit_{worst}^{k}$ are the best and the worst fitness values among the $N_g$ agents at the kth iteration. the force acting between agents g and h as per the law of gravity is given in eq. (31):

$$F_{ghd}^{k} = G^{k} \cdot \frac{M_g^{k} \cdot M_h^{k}}{D_{gh}^{k} + \varepsilon} \cdot (S_{gd}^{k} - S_{hd}^{k}) \tag{31}$$

where $G^{k}$ is the gravitational constant at the kth iteration, $\varepsilon$ is a small constant which ensures that the denominator is non-zero, and $D_{gh}^{k}$ is the euclidean distance between agents g and h. the acceleration of agent g is given in eq. (32):

$$a_{gd}^{k} = \frac{F_{gd}^{k}}{M_g^{k}} \tag{32}$$

where $F_{gd}^{k}$ is the force acting on agent g at iteration k in dimension d. the updated velocity and position of agent g are calculated as

$$v_{gd}^{k+1} = rand \cdot v_{gd}^{k} + a_{gd}^{k} \tag{33}$$

$$S_{gd}^{k+1} = S_{gd}^{k} + v_{gd}^{k+1} \tag{34}$$

the value of $G^{k}$ is set using eq. (35):

$$G^{k} = G_o \cdot e^{-\alpha \frac{k}{K}} \tag{35}$$

where $G_o$ is the initial value of the gravitational constant. $K$ is the total number of iterations and reduces linearly to 1. the gsa proceeds as follows: identification of the search space, random initialization of the gsa parameters, evaluation of the fitness function, update of the gsa parameters, determination of the force using eq. (31), the acceleration using eq. (32) and the velocity using eq. (33), followed by the update of the agents' positions using eq. (34), until the stopping criterion is met.

6. solution methodology for optimal multiple dgs allocation and reliability assessment using pso

the pso-based method to allocate multiple dgs optimally in rds, in order to reduce the system power losses and improve the reliability indicators, takes the following steps:

step i: solve the b/f lf problem for the base case to determine the system bus voltage magnitudes and power losses, as described in section 2.1.
step ii: calculate the reliability indicators tens, aens, asidi, caidi and saifi.
step iii: select the pso parameters (swarm size, acceleration coefficients and inertia weight) to minimize the objective function (of) value.
step iv: set the iteration counter k to 0.
step v: generate the initial dg locations and sizes (sizes continuous between zero and the sum of the system loads) with random velocities and positions in each dimension (locations and sizes of type i and type iv dg), and store them as pbest.
step vi: run the lf algorithm for every particle after placing the dg. if all constraints are within limits, compute the of for the particle; otherwise, reject the infeasible solution.
step vii: the dg site and size giving the lowest of value is taken as gbest, and its corresponding position is stored as the best particle position.
step viii: update the particle's velocity, position and inertia weight using eq. (26), eq. (27) and eq. (28), respectively.
step ix: if kmax is reached, jump to step x. otherwise, increment k and repeat steps iv through ix. a new pbest and gbest are generated and stored if the newly obtained values are better than the previous ones.
step x: the best position gives the optimal sites and sizes of the multiple dgs, and its corresponding of value represents the minimum total rpl.
step xi: calculate the values of tens, aens, asidi, caidi and saifi after the optimal penetration of single and multiple dgs of the corresponding type, and compare them with the values calculated in step ii.

7. numerical results and discussion

this paper demonstrates the effect of the optimal installation of different dg units in rds using pso and gsa to reduce system power losses and improve system reliability. the total capacity of the simultaneously placed multiple dgs must not exceed the total system load. the system data for the 33-bus system are taken from [27]. the ieee 33-bus system has a power demand of 3.715+j2.3 mva and three laterals. the base kv and mva for the test system are 12.66 and 100, respectively. the data used for calculating the reliability indicators are given in the appendix (table 9 and table 10). the total number of interruptions and the number of customers with at least one interruption are taken as 10 and 4012, respectively. the failure rate of the system is assumed to be 0.5 f/yr.

pso and gsa are tested on the standard 33-bus rds to verify their robustness. the pso algorithm is used to analyse the impact of single and multiple dg placement, whereas the gsa is used to analyse the impact of single dg placement on system reliability. the maximum iteration count and swarm size chosen for the pso are 100 and 20, respectively. the values of the pso control variables $c_1$, $c_2$, $w_{pmin}$ and $w_{pmax}$ selected for fast convergence are 2, 2, 0.4 and 0.9, respectively, as in [28]. in gsa, the values of $G_o$ and $\alpha$ are taken as 100 and 20, respectively, as in [29]. the population size, $K$ and dimension selected for the gsa technique are 33, 20 and 1, respectively.
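the pso update of eqs. (26)-(28) with the parameter values quoted above (swarm size 20, 100 iterations, c1 = c2 = 2, inertia weight decreasing from 0.9 to 0.4) can be sketched in python as follows. this is only an illustrative sketch under stated assumptions: the objective `toy_loss` is a hypothetical stand-in for the load-flow based real power loss, the function and variable names are not from the paper, clamping positions to the search bounds is an added assumption, and the constraint handling of section 3 is omitted.

```python
import random


def pso_minimize(objective, bounds, swarm_size=20, k_max=100,
                 c1=2.0, c2=2.0, w_min=0.4, w_max=0.9, seed=0):
    """Minimal PSO following eqs. (26)-(28); returns (best_position, best_value)."""
    rng = random.Random(seed)
    n_dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(swarm_size)]
    vel = [[0.0] * n_dim for _ in range(swarm_size)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(swarm_size), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for k in range(1, k_max + 1):
        w = w_max - (w_max - w_min) * k / k_max  # eq. (28)
        for p in range(swarm_size):
            for d, (lo, hi) in enumerate(bounds):
                # eq. (26): inertia + personal-best pull + global-best pull
                vel[p][d] = (w * vel[p][d]
                             + c1 * rng.random() * (pbest[p][d] - pos[p][d])
                             + c2 * rng.random() * (gbest[d] - pos[p][d]))
                # eq. (27), then clamp to the bounds (assumption, not in the paper)
                pos[p][d] = min(max(pos[p][d] + vel[p][d], lo), hi)
            val = objective(pos[p])
            if val < pbest_val[p]:
                pbest[p], pbest_val[p] = pos[p][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[p][:], val
    return gbest, gbest_val


def toy_loss(x):
    """Hypothetical smooth stand-in for the load-flow based real power loss."""
    return (x[0] - 1.0) ** 2 + (x[1] - 2.0) ** 2 + 0.1


# Two hypothetical DG sizes, each bounded by the total real load in MW.
BOUNDS = [(0.0, 3.715), (0.0, 3.715)]
best_pos, best_val = pso_minimize(toy_loss, BOUNDS)
print(best_pos, best_val)
```

in an actual study, `objective` would run the b/f sweep for the candidate dg sites and sizes, return the total rpl of eq. (10), and reject or penalize particles that violate the constraints of section 3.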
the proposed technique is implemented to calculate the bus voltages, total system power losses, annual cost of energy loss (acel), annual savings and reliability indicators. the value of acel [30] is calculated using eq. (36):

$$ACEL = \left(\sum_{i=2}^{N_b} P_{loss}(i-1,i) \cdot T \cdot E\right)\ \$ \tag{36}$$

where $T$ and $E$ are the annual time duration (8760 hours) and the energy cost (0.06 $/kwh), respectively. the comparative analysis of the results is carried out at the same base voltage and with the same load model. the methodology to integrate multiple dgs in the test system is implemented in matlab.

the results for the 33-bus rds before and after the penetration of one and three dgs using the pso approach are compared in table 1. the total system real and reactive power losses in the absence of any dg are 210.07 kw and 142.43 kvar, respectively. the values of tens, aens and asidi for the uncompensated system are 8.0475 mwh/yr, 0.0004969 mwh/cust-yr and 0.2794 hours, respectively. the case studies based on the type of dg penetration are as follows.

table 1 results of type i dg installation in 33-bus rds using pso approach

| quantity | base case | with 1 dg | with 3 dgs |
|---|---|---|---|
| optimal dg size in kw (optimal bus) | | 2605 (6) | 1067.5 (24), 779.7 (14), 1091.8 (30) |
| minimum bus voltage (vmin) p.u. @ bus (improved voltage in %) | 0.9042 @ 18 | 0.9436 @ 18 (4.35%) | 0.9729 @ 33 (7.59%) |
| rpl (kw) | 210.07 | 110.00 | 71.00 |
| rplr (kw) (% reduction) | | 100.07 (47.63%) | 139.07 (66.20%) |
| reactive power loss (kvar) | 142.43 | 80.82 | |
| acel ($) | 110413 | 57816.00 | 37317.6 |
| annual energy loss savings ($) | | 52597.00 | 73095.4 |

table 2 impact of type i dg installation on reliability indicators for 33-bus rds

| case | tens (mwh/yr) | aens (mwh/cust-yr) | asidi (hours) |
|---|---|---|---|
| no dg | 8.0475 | 4.9691e-04 | 0.2794 |
| one dg (pso) | 2.2230 | 1.3727e-04 | 0.0452 |
| one dg (gsa) | 1.8936 | 1.1693e-04 | 0.0461 |
| three dgs | 1.5682 | 9.6834e-05 | 0.0541 |

7.1. type i dg penetration

with the pso technique, the optimal position for a single dg is bus 6 with a dg size of 2605 kw, whereas for the simultaneous placement of multiple dgs, buses 24, 14 and 30 are obtained with dg capacities of 1067.5 kw, 779.7 kw and 1091.8 kw, respectively (from table 1). the cpu time for the lf computation in the 33-bus system with type i dg using the pso approach is 1.15 seconds, which is faster than other approaches, viz. 4.2651 seconds [8] and 6.9255 seconds [31]. the optimal location and size of a single type i dg in the 33-bus rds using gsa are bus 6 and 2000 kw, respectively. the effect of the penetration of single and multiple dgs on tens, aens and asidi is given in table 2. the reduction in the reliability indicators after the penetration of dgs in the rds demonstrates the improvement in system reliability.

7.1.1. effect of type i dg on system power losses

the optimal installation of a single dg reduces the rpl by 47.63%, whereas with 3 dgs the rpl is reduced by 66.20%, as shown in table 1. this in turn releases a real power demand of 100.07 kw and 139.07 kw after the penetration of single and multiple dgs, respectively, at unity power factor.

7.1.2. effect of type i dg on voltage profile

the minimum bus voltage without dg is 0.9042 p.u. at bus 18. it improves to 0.9436 p.u. at bus 18 with a single dg and to 0.9729 p.u. at bus 33 with multiple dgs, a voltage improvement of 4.35% and 7.59%, respectively. the proposed methodology meets all the constraints except for a small violation of the lower voltage limit with a single type i dg, where the minimum voltage is 5.64% below nominal instead of the permitted 5% (from table 1). the impact of installing one and three dgs on the bus voltage magnitudes of the 33-bus rds is presented in fig. 2, which shows that multiple dgs give a better bus voltage profile than a single dg.

fig. 2 bus voltage profile without and with one and three type i dgs

7.1.3. effect of type i dg on reliability indicators

the impact of single dg allocation on the reliability indices of the 33-bus rds is evaluated using pso and gsa, and the values are given in table 2. with a single dg in the system, tens, aens and asidi decrease by 72.37%, 72.42% and 83.82%, respectively, for pso, and by 76.47%, 76.46% and 83.50%, respectively, for gsa, w.r.t. the base case. after the installation of three dgs, tens, aens and asidi become 1.5682 mwh/yr, 0.0000968 mwh/cust-yr and 0.0541 hours, respectively (from table 2). the percentage improvement in the reliability indices with single and multiple type i dgs is illustrated in fig. 3, which shows that the percentage reduction in tens and aens is larger with three dgs than with one dg. hence, the injection of real power enhances system reliability, but excessive real power injection may adversely affect asidi.

fig. 3 percentage improvement in reliability indices with single and multiple dgs

7.1.4. effect of type i dg on the cost of annual energy loss

the cost of annual energy loss for the base case is $110413.00, which is reduced to $57816.00 and $37317.6 with the installation of one and three dgs, respectively. the annual energy loss saving after single and multiple dg placement is $52597.00 and $73095.4, respectively (from table 1).

7.1.5. result comparison

the results obtained without dg and with single and multiple dgs using pso and gsa are compared with existing results in table 3. because the dg size and the real power loss reduction (rplr) vary from method to method, a ratio of plr to pl is introduced.
the larger ratio indicates the dominance of the method employed to integrate dg optimally. plr is the ratio of rplr considering dg to rpl with no dg. pl is the ratio of real power penetrated by dg to the real power load. the ratio obtained from pso and gsa in the presence of single and multiple dgs is determined to be either equal or superior to the previously published results. it is obvious from the results that, in the presence of three dgs the reduction in power loss (66.20%) is maximum. the acel with the penetration of three dgs is significantly less as compared to a single dg. table 3 comparison of results for multiple type i dgs in 33-bus rds installed dg size in kw (optimal bus) total dg capacity (mw) rpl (kw) plr ratio of plr to pl acel ($) no dg 210.07 110413.00 1 dg proposed method pso 2605(6) 2605 110.00 47.63 0.68 57816.00 gsa 2000(6) 2000 114.60 45.45 0.84 60233.76 ia [32] 2600(6) 2600 111.10 47.39 0.67 58394.16 grid search algorithm [33] 2600(6) 2600 111.00 47.39 0.67 58341.60 pso [34] 3150(6) 3150 115.29 45.36 0.53 60596.42 kha [35] 2590(6) 2590 111.02 47.38 0.53 58352.11 3 dgs proposed method 1067.5(24), 779.7(14), 1091.8(30) 2939 71.00 66.20 0.84 37317.60 pso-cfa [36] 1049.1(10),878.6(25),804.9(33) 2732.6 76.00 62.48 0.84 39945.60 shbat [37] 1190.0(30), 849.0(25), 790.0(13) 2829 72.12 64.34 0.83 37906.27 aco-abc [38] 754.7(14),1099.9(24),1071.4(30) 2926 71.40 64.77 0.81 37527.84 abc [31] 1756.9(6), 575.7(15), 782.6(25) 3115.2 79.20 61.15 0.73 41627.52 ga [39] 1500(11),422.8(29),1071.4(30) 2994.2 106.3 49.61 0.59 55871.28 mocsos [40] 1187.9(13),1197.1(24),1300.2(31) 3685.2 89.40 57.67 0.58 46988.64 mota [41] 980(7),960(14),1340(30) 3280 96.30 54.36 0.56 50615.28 ga/pso [39] 925(11),863(16),1200(32) 2988 124.0 41.22 0.51 65174.40 opttimal allocation of multiple dgs and reliabilty analysis 231 7.2. 
type iv dg allocation for pso approach, the optimal position after applying proposed methodology to locate a single type iv dg is bus 6 with dg size of 3150 kva, whereas, for the positioning of multiple dgs, the buses 24, 30 and 14 are determined with a dg capacity of 859.2 kva, 1031.6 kva and 605.3 kva, respectively, as mentioned in table 4. the optimal bus obtained using gsa to locate single type iv dg is 6 with a dg capacity of 2828.42 kva. the effect of penetration of single and multiple dgs on each reliability indicator using pso and gsa is provided in table 5. table 4 results of type iv dg installation in 33-bus rds using pso base case with dg 1 dg 3 dg optimal dg size in kva (optimal bus) 3150 (6) 859.2 (24), 1031.6 (30), 605.3 (14) vmin in p.u @ bus (improved voltage in %) 0.9042@18 0.9602@18 (6.19%) 0.9953@33 (10.07%) rpl (kw) 210.07 64.00 17.00 rplr (kw) (% reduction in rplr) 146.07 (69.53%) 193.07 (91.90%) acel ($) 110413 33638.40 8935.20 annual energy loss savings ($) 76774.60 101477.80 table 5 impact of type iv dg installation on reliability indicators 7.2.1. effect of type iv dg on system power losses after optimal penetration of one and three dgs in rds, the losses reduced to 64 kw and 17 kw with a reduction of 69.53% and 91.90%, respectively, as illustrated in table 1 at 0.82 pf [28]. the reactive loss obtained without dg is 142.43 kvar. the power loss reduction attained with the installation of single and multiple type i and iv dgs is demonstrated in fig. 4 which concludes that the allocation of multiple type iv dgs gives the highest reduction in system rpl amongst all. tens (mwh/yr) aens (mwh/cust-yr) asidi (hours) no dg 8.0475 4.9691e-04 0.2794 one dg pso 1.7033 1.0518e-04 0.045287 gsa 1.5702 9.6956e-05 0.045513 three dgs 1.1429 7.0571e-05 0.054032 232 s. singh parihar, n. malik fig. 
4 rplr with single and multiple type i & type iv dg penetration

the acel after penetration of single and multiple type iv dgs is $33638.40 and $8935.20, which gives annual energy loss savings of $76774.60 and $101477.80, respectively (from table 4).

7.2.2. effect of type iv dg on system voltage profile

installing a single dg in the 33-bus system raises the minimum bus voltage, which occurs at bus 18, from 0.9042 p.u. to 0.9602 p.u., a voltage improvement of 6.19%. with multiple dgs, the minimum voltage rises from 0.9042 p.u. at bus 18 to 0.9953 p.u. at bus 33, a voltage improvement of 10.07%. the impact of single and multiple type iv dgs on the voltage magnitude at each bus is presented in fig. 5, which shows that the bus voltage profile with multiple dg units is better than with one dg.

fig. 5 comparison of bus voltages in presence of single and multiple type iv dg penetrations

p+jq denotes the system's nominal loading. the impact of system loading on the bus voltage magnitude with an optimally placed dg (at bus 6) is evaluated by gradually increasing the load at all buses, as listed in table 6 [11]. at critical loading, the voltage at bus 6 drops from 0.9496 p.u. to 0.7594 p.u. the critical loading factor obtained for the 33-bus system is 3.405, beyond which voltage collapse occurs. the subsequent incorporation of dg raises the voltage magnitude at all buses and hence provides stable operation with enhanced system capacity.
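the percentages and costs quoted in sections 7.2.1 and 7.2.2 can be cross-checked with a few lines of python. this is only an illustrative sketch: the energy rate of 0.06 $/kwh is not stated in this section and is an assumption inferred from the no-dg row (110413 / (210.07 x 8760)), and the function names are hypothetical.

```python
# cross-check of the loss, cost and voltage figures quoted from table 4.
HOURS_PER_YEAR = 8760
ENERGY_RATE = 0.06  # $/kwh; ASSUMPTION inferred from 110413 / (210.07 * 8760)


def loss_reduction_pct(rpl_base_kw, rpl_dg_kw):
    """percentage reduction in real power loss relative to the no-dg case."""
    return 100.0 * (rpl_base_kw - rpl_dg_kw) / rpl_base_kw


def annual_cost_of_energy_loss(rpl_kw, rate=ENERGY_RATE):
    """acel in $ for a constant loss of rpl_kw sustained over one year."""
    return rpl_kw * HOURS_PER_YEAR * rate


def voltage_improvement_pct(v_base_pu, v_dg_pu):
    """percentage rise of the minimum bus voltage."""
    return 100.0 * (v_dg_pu - v_base_pu) / v_base_pu


RPL_BASE_KW = 210.07
VMIN_BASE_PU = 0.9042
# (rpl in kw, minimum voltage in p.u.) for each case, copied from table 4
cases = {"1 dg": (64.0, 0.9602), "3 dgs": (17.0, 0.9953)}

for name, (rpl_kw, vmin_pu) in cases.items():
    print(
        name,
        f"loss reduction = {loss_reduction_pct(RPL_BASE_KW, rpl_kw):.2f} %,",
        f"acel = {annual_cost_of_energy_loss(rpl_kw):.2f} $,",
        f"vmin improvement = {voltage_improvement_pct(VMIN_BASE_PU, vmin_pu):.2f} %",
    )
```

differences in the last printed digit against the tabulated values are possible because the table entries appear to be rounded or truncated.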
table 6 impact of system loading & type iv dg on voltage for 33-bus [11]

system load | rpl (kw) | reactive power loss (kvar) | voltage in p.u. @ bus 6 | dg size (kva) @ bus 6
p+jq | 210.0704 | 142.4372 | 0.9496 | 0
2(p+jq) | 1016.33 | 691.98 | 0.8894 | 0
3(p+jq) | 3094.5 | 2122.31 | 0.8078 | 0
3.405(p+jq) | 4905.40 | 3382.42 | 0.7594 | 0
3.41(p+jq) | nc | nc | nc | 0
3.41(p+jq) | 2463.5 | 1755.3 | 0.8545 | 1000
3.41(p+jq) | 1412.3 | 1062.1 | 0.9263 | 2000
nc: no convergence

7.2.3. effect of type iv dg on reliability indicators

with the pso method, the values of tens, aens and asidi after placement of a single type iv dg are 1.7033 mwh/yr, 0.00010518 mwh/cust-yr and 0.045287 hours, respectively; with three dgs they become 1.1429 mwh/yr, 0.00007057 mwh/cust-yr and 0.0540 hours (from table 5). the drop in the reliability indicators relative to the no-dg case shows an improvement in system reliability. for a single type iv dg, gsa improves tens and aens by 7.81% and 7.82%, respectively, relative to pso. the impact of single and multiple type iv dgs on the percentage reduction in the reliability indices is illustrated in fig. 6. the percentage reduction in tens and aens is higher with three dgs than with one dg, whereas the reduction in asidi is lower. the results demonstrate that tens and aens decrease with higher dg penetration, whereas asidi increases due to excessive real power penetration.

fig. 6 percentage improvement in reliability indices with different number of type iv dgs

in addition, the impact of the optimal allocation of three dg (type iv) units on caidi and saifi has been analysed. for the uncompensated system the values of caidi and saifi are 0.62466 hours/cust-interruption and 0.72337 interruptions/customer-yr, which reduce to 0.62305 hours/cust-interruption and 0.72028 interruptions/customer-yr, respectively, after integration of the type iv dg units.
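the percentage figures in section 7.2.3 follow directly from the table 5 entries. the sketch below recomputes them; the dictionary layout and helper names are hypothetical, and the reading of the 7.81% / 7.82% figures as gsa relative to pso is an interpretation that happens to reproduce the quoted numbers.

```python
# reliability indicators copied from table 5.
table5 = {
    "no dg":       {"tens": 8.0475, "aens": 4.9691e-04, "asidi": 0.2794},
    "one dg, pso": {"tens": 1.7033, "aens": 1.0518e-04, "asidi": 0.045287},
    "one dg, gsa": {"tens": 1.5702, "aens": 9.6956e-05, "asidi": 0.045513},
    "three dgs":   {"tens": 1.1429, "aens": 7.0571e-05, "asidi": 0.054032},
}


def pct_reduction(reference, value):
    """percentage by which value is lower than reference."""
    return 100.0 * (reference - value) / reference


def reductions_vs_base(case):
    """percentage reduction of each index relative to the no-dg case."""
    base = table5["no dg"]
    return {k: pct_reduction(base[k], table5[case][k]) for k in base}


for case in ("one dg, pso", "one dg, gsa", "three dgs"):
    row = reductions_vs_base(case)
    print(case, {k: round(v, 2) for k, v in row.items()})

# improvement of gsa over pso for a single dg (interpretation, see lead-in)
gsa_vs_pso = {
    k: pct_reduction(table5["one dg, pso"][k], table5["one dg, gsa"][k])
    for k in ("tens", "aens")
}
print("gsa vs pso:", {k: round(v, 2) for k, v in gsa_vs_pso.items()})
```

the comparison of the one-dg and three-dg rows makes the trade-off visible: tens and aens keep falling as more dgs are added, while the asidi reduction shrinks.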
these indices are difficult to compare from one utility to another and from one location to another because of differences in how the number of connected customers is counted. some utilities determine their number of customers from the total number of meters connected and some from customer postal addresses, and they do not consider weather conditions and planned outages in the reliability calculation.

7.2.4. comparison of results

the comparative analysis without and with the integration of single and multiple type iv dgs has been carried out and is tabulated in table 7. the value of plr and the ratio of plr to pl attained from the proposed method are found to be the highest among all reported results for one and three dgs. for single dg placement, the ratio obtained from gsa is superior to that from pso. the presented methodology leads to a superior solution with minimum annual energy loss in most cases. the vmin and the reduction in system rpl attained with multiple type iv dgs are superior to those with a single dg.

table 7 comparison of results for multiple type iv dgs in 33-bus rds

case | method | installed dg size in kva (optimal bus) | total dg capacity (kva) | rpl (kw) | plr (%) | ratio of plr to pl | acel ($)
no dg | - | - | - | 210.07 | - | - | 110412.8
1 dg | proposed method, pso | 3150(6) | 3150 | 64.00 | 69.53 | 0.813 | 33638.40
1 dg | proposed method, gsa | 2828.42(6) | 2828.42 | 64.55 | 69.27 | 0.91 | 33927.48
1 dg | ia [32] | 3107(6) | 3107 | 67.90 | 67.85 | 0.809 | 35688.24
1 dg | minlp [42] | 3105(6) | 3105 | 67.85 | 67.84 | nr | 35661.96
1 dg | gams [43] | 3078(6) | 3078 | 67.80 | 67.80 | nr | 35635.68
3 dgs | proposed method | 859.2(24), 1031.6(30), 605.3(14) | 2496.1 | 17.00 | 91.90 | 1.36 | 8935.20
3 dgs | tm [41] | 705.2(16), 705.2(27), 1410.4(30) | 2820.8 | 27.4 | 87.01 | 1.14 | 14401.44
3 dgs | dgsi [44] | 1208(13), 1208(29), 152(31) | 2568 | 49.8 | 76.22 | 1.09 | 26174.88
3 dgs | lsfsa [45] | 1382.9(6), 551.7(18), 1062.9(30) | 2997.5 | 26.7 | 86.82 | 1.08 | 14033.52
3 dgs | mota [41] | 880(14), 920(25), 1560(30) | 3360 | 15.7 | 92.55 | 1.00 | 8251.92
3 dgs | mocsos [40] | 926.1(13), 1257(24), 1481.2(30) | 3664.3 | 15.1 | 92.83 | 0.93 | 7936.56
nr: not reported

7.3.
statistical analysis of rpl

from table 8, the coefficient of variation (cv) of rpl is lowest with a single type iv dg compared with the other dg types. this demonstrates that the type iv dg reduces the variation of the power losses in the distribution feeders around their mean value more effectively than the type i dg, and hence gives better security against overheating of the distribution feeders.

table 8 statistical results for 33-bus with and without type iv dg using pso technique

dg type | ploss (kw) min | max | mean | std | cv
no dg | 0.013 | 51.896 | 6.570 | 11.536 | 1.756
type i | 0.012 | 15.457 | 3.425 | 4.099 | 1.196
type iv | 0.010 | 9.9412 | 1.993 | 2.367 | 1.187

8. conclusion

this article presents a comprehensive strategy to optimally allocate multiple type i and type iv dgs in an existing rds in order to reduce rpl and the reliability indicators (tens, aens and asidi) and to improve the bus voltage profile. the optimal integration of dg units is carried out using a pso and gsa based approach, which is capable of determining the optimal solution with few or no assumptions, even in a large search space. the comparative analysis on the 33-bus system has been carried out for single and multiple dg placement in the rds. the analysis clearly illustrates that the system performance in terms of reduction in system power losses and enhancement in tens, aens, bus voltage profile and aels is superior for multiple dg placement when compared to a single dg. the results also demonstrate that the penetration of dg resources in the rds using the pso and gsa methods improves tens and aens, but excessive power injection may have an adverse effect on asidi. gsa is found to provide better results than pso for tens and aens improvement in the case of a single dg. the optimal integration of multiple type iv dgs is found to have many positive impacts on system performance.

references

[1] c.l.t.
borges, "an overview of reliability models and methods for distribution systems with renewable energy distributed generation", renew. sustain. energy rev., vol. 16, no. 6, pp. 4008-4015, august 2012.
[2] a. escalera, b. hayes and m. prodanovic, "a survey of reliability assessment techniques for modern distribution networks", renew. sustain. energy rev., vol. 91, pp. 344-357, august 2018.
[3] k.m. jagtap and d.k. khatod, "loss allocation in radial distribution networks with various distributed generation and load models", int. j. electr. power energy syst., vol. 75, pp. 173-186, february 2016.
[4] n. acharya, p. mahat and n. mithulananthan, "an analytical approach for dg allocation in primary distribution network", int. j. electr. power energy syst., vol. 28, no. 10, pp. 669-678, december 2006.
[5] a.c. rueda-medina, j.f. franco, m.j. rider, a. padilha-feltrin and r. romero, "a mixed integer linear programming approach for optimal type, size and allocation of distributed generation in radial distribution system", electr. power syst. res., vol. 97, pp. 133-143, april 2013.
[6] l. soo-hyoung and p. jung-wook, "selection of optimal location and size of multiple distributed generations by using kalman filter algorithm", ieee trans. power syst., vol. 24, no. 3, pp. 1393-1400, august 2009.
[7] n. mohan, t. ananthapadmanabha and a.d. kulkarni, "a weighted multi-objective index based optimal distributed generation planning in distribution system", procedia technology, vol. 21, pp. 279-286, august 2015.
[8] m. m. aman, g. b. jasmon, h. mokhlis and a. h. a. bakar, "optimal placement and sizing of a dg based on a new power stability index and line losses", int. j. electr. power energy syst., vol. 43, no. 1, pp. 1296-1304, december 2012.
[9] e. s. ali, s. m. elazim and a. y. abdelaziz, "ant lion optimization algorithm for optimal location and sizing of renewable distributed generations", renewable energy, vol. 101, pp. 1311-1324, february 2017.
[10] m. karimi and m.r.
haghifam, "risk based multi-objective dynamic expansion planning of subtransmission network in order to have eco-reliability, environmental friendly network with higher power quality", iet generation, transmission & distribution, vol. 11, no. 1, pp. 261-271, january 2017.
[11] s. s. parihar and n. malik, "optimal integration of multi-type dg in rds based on novel voltage stability index with future load growth", evolving systems, october 2020.
[12] r. billinton and j. e. billinton, "distribution system reliability indices", ieee trans. power deliv., vol. 4, no. 1, pp. 561-568, january 1989.
[13] h. falaghi and m. haghifam, "distributed generation impacts on electric distribution systems reliability: sensitivity analysis", in proceedings of the international conference on "computer as a tool", 2005, pp. 1465-1468.
[14] n. nikmehr and s. najafi-ravadanegh, "stochastic risk and reliability assessments of energy management system in grid of microgrids under uncertainty", journal of power technologies, vol. 97, no. 3, pp. 179-189, november 2017.
[15] z. wang, j. li, w. yang and z. shi, "impact of distributed generation on the power supply reliability", in proceedings of the ieee pes innovative smart grid technologies, 2012, pp. 1-5.
[16] a.c. neto, m.g. silva and a.b. rodrigues, "impact of distributed generation on reliability evaluation of radial distribution systems under network constraints", in proceedings of the international conference on probabilistic method applied to power systems, 2006, pp. 1-6.
[17] r. k. mathew, s. ashok and s.
kumaravel, "analyzing the effect of dg on reliability of distribution systems", in proceedings of the international conference on electrical, computer and communication technologies, 2015, pp. 1-4.
[18] i. waseem, m. pipattanasomporn and s. rahman, "reliability benefits of distributed generation as a backup source", in proceedings of the ieee power & energy society general meeting, 2009, pp. 1-8.
[19] s. a. nowdeh et al., "fuzzy multi-objective placement of renewable energy sources in distribution system with objective of loss reduction and reliability improvement using a novel hybrid method", appl. soft. comput. j., vol. 77, pp. 761-779, april 2019.
[20] m. e. el-hawary and l.g. dias, "incorporation of load models in load flow studies. form of model effects", iee proceedings c - generation, transmission and distribution, vol. 134, pp. 27-30, january 1987.
[21] m. h. haque, "load flow solution of distribution systems with voltage dependent load models", electr. power systems res., vol. 36, no. 3, pp. 151-156, march 1996.
[22] s. s. parihar and n. malik, "optimal allocation of renewable dgs in a radial distribution system based on new voltage stability index", int. trans. electr. energy syst., vol. 30, no. 4, pp. 1-19, february 2020.
[23] m. s. thomas, r. ranjan and n. malik, "deterministic load flow algorithm for balanced radial ac distribution systems", in proceedings of the ieee fifth power india conference, 2012, pp. 1-6.
[24] 547-2003 - ieee standard for interconnecting distributed resources with electric power systems, ieee standards, pp. 1-16, 2003.
[25] r.a. walling, r. saint, r. c. dugan, j. burke and l. a. kojovic, "summary of distributed resources impact on power delivery systems", ieee trans. power deliv., vol. 23, no. 3, pp. 1636-1644, july 2008.
[26] h. pradeepa, t. ananthapadmanabha, r. d. n. sandhya and c. bandhavya, "optimal allocation of combined dg and capacitor units for voltage stability enhancement", procedia technology, vol. 21, pp.
216-223, november 2015.
[27] r. ranjan, b. venkatesh and d. das, "voltage stability analysis of radial distribution networks", electr. power compon. syst., vol. 31, no. 1, pp. 501-511, march 2003.
[28] s. kansal, v. kumar and b. tyagi, "hybrid approach for optimal placement of multiple dgs of multiple types in distribution networks", int. j. electr. power energy syst., vol. 75, pp. 226-235, february 2016.
[29] e. rashedi, h. nezamabadi-pour and s. saryazdi, "gsa: a gravitational search algorithm", information sciences, vol. 179, pp. 2232-2248, june 2009.
[30] v. v. s. n. murty and a. kumar, "optimal placement of dg in radial distribution systems based on new voltage stability index under load growth", int. j. electr. power energy syst., vol. 69, pp. 246-256, july 2015.
[31] m.p. lalitha, n.s. reddy and v.c.v. reddy, "optimal dg placement for maximum loss reduction in radial distribution system using abc algorithm", int. j. rev. comput., vol. 3, pp. 44-52, january 2010.
[32] d. q. hung and n. mithulananthan, "multiple distributed generators placement in primary distribution networks for loss reduction", ieee trans. industr. electron., vol. 60, no. 4, pp. 1700-1708, april 2013.
[33] t. gözel, u. eminoglu and m. h. hocaoglu, "a tool for voltage stability and optimization (vs&op) in radial distribution systems using matlab graphical user interface", simul. model. pract. theory, vol. 16, no. 5, pp. 505-518, may 2008.
[34] s. kansal, v. kumar and b. tyagi, "optimal placement of different type of dg sources in distribution networks", int. j. electr. power energy syst., vol. 53, pp. 752-760, december 2013.
[35] s.a. chithradevi, l. lakshminarasimman and r. balamurugan, "stud krill herd algorithm for multiple dg placement and sizing in a radial distribution system", eng. sci. technol. an int. j., vol. 20, no. 2, pp. 748-759, april 2017.
[36] k. d. mistry and r.
roy, "enhancement of loading capacity of distribution system through distributed generator placement considering techno-economic benefits with load growth", int. j. electr. power energy syst., vol. 54, pp. 505-515, january 2014.
[37] c. yammani, s. maheswarapu and s.k. matam, "optimal placement and sizing of distributed generations using shuffled bat algorithm with future load enhancement", int. trans. electr. energy syst., vol. 26, pp. 274-292, april 2016.
[38] m. kefayat, a. l. ara and s. a. n. niaki, "a hybrid of ant colony optimization and artificial bee colony algorithm for probabilistic optimal placement and sizing of distributed energy resources", energy conver. manag., vol. 92, pp. 149-161, march 2015.
[39] m. h. moradi and m. a. abedini, "combination of genetic algorithm and particle swarm optimization for optimal dg location and sizing in distribution systems", int. j. electr. power energy syst., vol. 34, no. 1, pp. 66-74, january 2012.
[40] s. saha and v. mukherjee, "a novel multiobjective chaotic symbiotic organisms search algorithm to solve optimal dg allocation problem in radial distribution system", int. trans. electr. energy syst., vol. 29, no. 5, pp. 2839-2864, february 2019.
[41] n.k. meena, a. swarnkar, n. gupta and k.r. niazi, "multi-objective taguchi approach for optimal dg integration in distribution systems", iet generation transmission distribution, vol. 11, no. 9, pp. 2418-2428, june 2017.
[42] s. kaur, g. kumbhar and j. sharma, "a minlp technique for optimal placement of multiple dg units in distribution systems", int. j. electr.
power energy syst., vol. 63, pp. 609-617, december 2014.
[43] p. v. babu and s. p. singh, "optimal placement of dg in distribution network for power loss minimization using nlp & pls technique", energy procedia, vol. 90, pp. 441-454, december 2016.
[44] p. kayal, s. chanda and c.k. chanda, "an analytical approach for allocation and sizing of distributed generations in radial distribution network", int. trans. electr. energy syst., vol. 27, no. 7, pp. 2322, february 2017.
[45] s.k. injeti and n.p. kumar, "a novel approach to identify optimal access point and capacity of multiple dgs in a small, medium and large scale radial distribution systems", int. j. electr. power energy syst., vol. 45, no. 1, pp. 142-151, february 2013.

appendix

table 9 number of customers at each bus

bus number | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15
total no. of customers | 500 | 600 | 750 | 250 | 425 | 220 | 500 | 640 | 800 | 600 | 730 | 640 | 550 | 920 | 120

bus number | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30
total no. of customers | 60 | 340 | 410 | 230 | 260 | 550 | 650 | 290 | 270 | 700 | 250 | 420 | 720 | 850 | 760

bus number | 31 | 32 | 33
total no. of customers | 180 | 350 | 660

table 10 customer interruption details at five load points

load point | unavailable buses | off time (min)
1 | 5, 21, 24, 6, 3 | 28
2 | 12, 7, 21, 4, 21 | 40
3 | 6, 3, 13, 16, 24 | 14
4 | 16, 9, 14, 10, 7 | 60
5 | 8, 19, 6, 1, 12 | 35

facta universitatis
series: electronics and energetics vol. 28, no 2, june 2015, pp. 237-249
doi: 10.2298/fuee1502237d

novel, low power, nonlinear dilatation and erosion filters realized in the cmos technology

rafał długosz 1,2, andrzej rydlewski 3, tomasz talaśka 1
1 utp university of sciences and technology, faculty of telecommunication, computer science and electrical engineering, bydgoszcz, poland
2 delphi automotive company, kraków, poland
3 alcatel-lucent, coldra woods, chepstow rd, newport np18 2yb

abstract. in this paper we propose novel, binary-tree, asynchronous, nonlinear filters suitable for signal processing realized at the transistor level.
two versions of the filter have been proposed, namely the dilatation (max) and the erosion (min) one. in the proposed circuits an input signal (current) is sampled in a delay line controlled by a multiphase clock. in the subsequent stage particular samples are converted to 1-bit digital signals with delays that depend on the values of these samples. in the last step the delays are compared in a digital binary-tree structure in order to find either the min or the max value, depending on which filter is used. both circuits have been simulated in the tsmc cmos 0.18 µm technology. to make the results reliable we applied the corner analysis procedure. the circuits were tested for temperatures ranging from -40 to 120ºc, for different transistor models and supply voltages. the circuits offer a precision of about 99% at a typical detection time of 20 ns for the max filter and 100 ns for the min filter (the worst case scenario). the energy consumed per input during a single calculation cycle equals 0.32 and 1.57 pj for the max and min filters, respectively.

key words: nonlinear filters, cmos realization, full-custom, binary-tree architecture

1. introduction

dilatation and erosion operations, often referred to as the max and min functions, respectively, are useful in many applications. these operations are commonly used in artificial neural networks (ann), but also in signal and image processing [1]. in competitive learning, which is common in some types of anns, the min function is used to determine which of the neurons is located closest to a provided learning pattern. in this case the operation is known as winner takes all (wta), which is somewhat misleading, as from the formal point of view the min function corresponds to the loser takes all (lta) operation. however, the winning neuron is the one for which the distance is the smallest, hence the convention.
received august 19, 2014; received in revised form february 11, 2015
corresponding author: rafał długosz, utp university of sciences and technology, faculty of telecommunication, computer science and electrical engineering, ul. kaliskiego 7, 85-796 bydgoszcz, poland (e-mail: rafal.dlugosz@gmail.com)

another area in which the min and max operations are used is nonlinear filtering. such filters are used, for example, to enhance signals or to correct the shapes of objects in pictures. in such applications the min and max operations are denoted as erosion and dilatation, respectively. both types of filters can be joined in series in order to perform more complex tasks, such as the morphological opening and closing operations, commonly used in image processing to reconstruct the original form of a digital image from a noisy image. for such applications we can use min/max detector based (mdb) filters or min/max exclusive mean (mmem) filters. these techniques can be used to achieve the best performance [2]-[4]. a large similarity exists between the nonlinear dilatation (max) / erosion (min) filters and the wta / lta operations, as in both cases the core circuit fulfills exactly the same task: searching for either the minimum or the maximum signal among a set of input signals. the difference lies in the input signals used in each of these cases. in anns all signals are independent, as they come from separate neurons distributed over the input data space. the erosion/dilatation filters, on the other hand, process samples of one signal stored in the delay line, as shown in fig. 1. in this diagram a classic delay line is schematically shown to illustrate the idea. in the classic approach the samples are rewritten many times between memory cells.
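the 1-d dilatation and erosion described above can be stated in a few lines of python. this is only a behavioural sketch of the operation, not of the circuit: the function name is hypothetical, the window is assumed to be causal (the current sample and the m-1 preceding ones, as held in a delay line of length m), and the first outputs are assumed to use however many samples are available.

```python
def morph_filter(signal, m, mode="max"):
    """dilatation (mode="max") or erosion (mode="min") of a 1-d signal.

    each output is the max / min over the last m samples, i.e. over the
    contents of a delay line of length m (assumed causal window).
    """
    if mode not in ("max", "min"):
        raise ValueError("mode must be 'max' or 'min'")
    pick = max if mode == "max" else min
    out = []
    for n in range(len(signal)):
        window = signal[max(0, n - m + 1): n + 1]  # samples currently stored
        out.append(pick(window))
    return out


x = [1, 3, 2, 5, 4, 0, 2]
print(morph_filter(x, 3, "max"))
print(morph_filter(x, 3, "min"))
```

chaining the two calls (erosion followed by dilatation, or the reverse) gives a simple form of the opening and closing operations mentioned in the text.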
in software realizations this is not a problem, but in analog transistor level implementations it is usually a source of errors that affect the quality of filtering. we faced this problem in our former projects of finite impulse response (fir) filters realized in the switched-capacitor technique [5]. to avoid the reduced accuracy it is necessary to use filters in which the number of read/write operations is minimized. one possibility in this regard is to use the so-called circular delay line [6], [7]. in this approach the samples are stored in particular memory cells and remain there until they are replaced with new samples after m clock cycles, where m is also the number of samples stored in the whole delay line. we adopted this solution in the nonlinear filters presented in this paper. in filters the input signal can be sampled in the time domain (1-d signal), or particular samples can be, for example, pixels of an image (2-d signal). in this paper we focus on nonlinear filters used in the first situation. however, the proposed solution can be adapted to image filtering as well.

fig. 1 nonlinear dilatation / erosion filtering of a 1-d signal in time domain

numerous min/max circuits have been reported in the literature, but two major types of architectures can be clearly distinguished. in the first group the min/max circuits are usually based on the current conveyor (cc) architecture [8]-[10]. in this approach all input signals (either currents or voltages) are compared in a single stage. such circuits usually feature a simple structure, but suffer from limited accuracy that decreases when the number of inputs increases [9], [11]. this problem results mostly from the so-called 'corner error', which occurs when two or more input signals have similar values. in this case an average value between these signals appears at the output of the filter.
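the difference between the classic and the circular delay line described earlier in this section can be illustrated by counting memory writes. the two classes below are hypothetical software models, not the circuits themselves; the write counter stands in for the analog copy operations that degrade accuracy.

```python
class ShiftDelayLine:
    """classic delay line: every stored sample is copied to the next cell."""

    def __init__(self, m):
        self.cells = [0.0] * m
        self.writes = 0

    def push(self, sample):
        for i in range(len(self.cells) - 1, 0, -1):
            self.cells[i] = self.cells[i - 1]  # rewrite between memory cells
            self.writes += 1
        self.cells[0] = sample
        self.writes += 1


class CircularDelayLine:
    """circular delay line: a sample stays in its cell for m clock cycles."""

    def __init__(self, m):
        self.cells = [0.0] * m
        self.phase = 0  # cell selected by the currently active clock phase
        self.writes = 0

    def push(self, sample):
        self.cells[self.phase] = sample  # only the oldest sample is replaced
        self.phase = (self.phase + 1) % len(self.cells)
        self.writes += 1


shift, circ = ShiftDelayLine(8), CircularDelayLine(8)
for k in range(16):
    shift.push(float(k))
    circ.push(float(k))
print("shift writes:", shift.writes, "circular writes:", circ.writes)
```

both models hold the same set of samples after every push; the circular version reaches it with one write per clock cycle, which is the property exploited in the proposed filters.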
fig. 2 block diagram of the proposed min / max filter binary-tree solution

the second group of filters is based on the concept of the binary tree (bt) structure. in this case the competition between the input signals is conducted at particular layers of the tree. the number of layers equals log2(m), where m is the number of inputs, which equals the length of the delay line. signals at each layer compete in pairs, and only one winning signal from each pair takes part in the competition at the next layer of the tree [9], [12], [13]. binary-tree circuits are usually more complex than their cc counterparts. however, if precise comparators are used, they are able to properly distinguish signals that differ by very small amounts. a further advantage of bt solutions is that they are able to determine the address of the max or min signal, which is not possible in the cc circuits. this is an important feature when such circuits are applied in anns, in which the value of the winning signal is less important than the information which of the input signals has the smallest value. in typical bt solutions the (analog) signals at the outputs of particular layers of the tree are determined (calculated or copied) on the basis of signals provided by the preceding layers [9], [12], [14]. this may be a source of errors [15] that accumulate at the top of the tree. in the proposed solution [16] this problem is less visible. at an early stage of the signal processing chain the analog input signals are converted to digital 1-bit signals with delays that depend on the values of the input signals. the comparison of the signals (their delays) is then performed in a digital bt structure. in this way the copying of analog signals between layers has been eliminated. the paper is organized as follows: in the next section we propose two filters, specific for the dilatation (max) and erosion (min) nonlinear operations.
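the pairwise competition in a binary tree, including the address of the winner, can be sketched as follows. this is a purely behavioural python model with hypothetical names; it compares values directly, whereas the proposed circuit compares the delays of 1-bit flags. ties are resolved in favour of the first input of a pair, which is an arbitrary assumption.

```python
def bt_select(values, mode="max"):
    """return (winning value, its address, number of tree layers)."""
    if not values:
        raise ValueError("at least one input is required")
    if mode == "max":
        wins = lambda a, b: a >= b
    elif mode == "min":
        wins = lambda a, b: a <= b
    else:
        raise ValueError("mode must be 'max' or 'min'")

    layer = list(enumerate(values))  # (address, value) pairs
    layers = 0
    while len(layer) > 1:
        nxt = []
        for i in range(0, len(layer) - 1, 2):
            first, second = layer[i], layer[i + 1]
            nxt.append(first if wins(first[1], second[1]) else second)
        if len(layer) % 2:  # an unpaired signal passes to the next layer
            nxt.append(layer[-1])
        layer = nxt
        layers += 1
    address, value = layer[0]
    return value, address, layers


samples = [3, 7, 1, 9, 4, 6, 2, 8]  # m = 8 inputs, so log2(8) = 3 layers
print(bt_select(samples, "max"))
print(bt_select(samples, "min"))
```

only the winner of each pair is passed on, so the address of the final winner is available at the top of the tree without any extra search.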
in the following section we verify the proposed circuits by means of transistor level simulations. to provide reliable results we performed rigorous pvt (process, voltage, temperature) variation tests. the conclusions are formulated in the last section.

fig. 3 components of the proposed nonlinear filters: (a) input multiple-output cm used in circular delay line, (b) s&h memory element used in delay line, (c) current to time converter (itc), (d) delay comparator used in dilatation (max) filter, (e) delay comparator used in erosion (min) filter, (f) address determination block (adet)

2. proposed dilatation and erosion nonlinear filters

both nonlinear filters proposed in this paper are based on the same structure, shown in fig. 2. the circuit is composed of an analog part, whose role is to prepare simplified signals, and the subsequent digital bt structure. it consists of several blocks, or groups of elements, presented in detail in fig. 3.

analog part of the system

in both filters the input current, iin, is first sampled and held in the circular delay line. this delay line has been used to avoid multiple read and write operations on particular signal samples, which are the source of large errors in the classical delay line. in this approach particular samples are not rewritten between memory cells but remain in their cells until they are replaced by new samples after m clock cycles, as described earlier. the delay line in the proposed filters works as follows: the input signal is copied m times by the multiple-output current mirror (cm) shown in fig. 3(a). in this way each branch receives a separate copy of the input signal, and data processing in particular branches is independent. particular samples of the input signal are stored in sample & hold (s&h) memory elements, shown in fig. 3(b). to compensate the charge injection effect across the storage capacitors, cst, which is typical in this case, we have used so-called dummy switches, swd. in such switches the inputs and outputs are shorted together, so they do not change the functionality of the circuit. the dummy switches are controlled by clock signals of opposite polarity to those of the memory switches, swm. the circular delay line is controlled by an m-phase clock. the complexity of the clock can be viewed as a disadvantage. however, since the length of nonlinear filters usually does not exceed 8-10, this is not a significant problem when the increased precision of the circuit is taken into account. output signals from particular s&h elements, denoted as i'in,i, are provided to current-to-time converters (itc), shown in fig. 3(c), that convert them into binary 1-bit flags (f). these converters are also common to both filters. the flag signals occur at the outputs of particular itcs with delays that depend on the values of the signal samples. each of these blocks is composed of a pmos-type cascoded cm, an integrating capacitor with a reset function, and two not gates. the voltage across the capacitor increases at a rate proportional to the value of the i'in,i signal. the not gates change their logical states when the voltage across the capacitor reaches a value of about vdd/2.

fig. 4 a theoretical influence of transistor sizes on the gain error of the current mirror (due to threshold voltage mismatch) for the weak and strong inversion regions.

to improve the precision of the circuit we have used cascoded cms, which increase the accuracy of the copying operations. an additional problem in designing the cms is how to determine the optimal sizes of the transistors for particular values of the input currents (in the range up to 10 µa in this case).
we faced a similar problem in our former projects [17], [18]. the sizes of the transistors have a strong influence on the mismatch effect [19], for example on the threshold voltage mismatch ∆vth. this parameter has, in turn, an impact on the theoretical gain error of the cm and thus on the precision of the circuit. in the weak inversion region the impact is usually larger than in the strong inversion region, as shown in fig. 4, and therefore we bias the transistors so that their operating points lie in the strong inversion region. to make this possible we do not work with currents smaller than 1 µa. increasing the sizes of the transistors always reduces the mismatch effect. however, for given values of the input currents this also decreases the gate to source voltage, vgs, which in turn enlarges the gain error of the cm. for currents in the range up to 10 µa the optimal sizes of the transistors are w / l = 3 / 1 and 9 / 1 µm for the nmos and pmos transistors, respectively.

digital binary tree structure

the binary tree structure used in the proposed nonlinear filters is composed of delay comparators (dcmp), which are different for particular filters. two versions of this block have been proposed. the circuit used in the dilatation filter is shown in fig. 3(d), while the one used in the erosion filter is shown in fig. 3(e). both circuits are built on the basis of the rs flip flop (rsff), which is able to distinguish very small (at the level of 3-5 ns) differences between the delays of particular input signals. depending on the mode of the filter (min or max), either the smaller or the larger of two input signals becomes the winner, which the dcmp indicates by two digital signals, o1 and o2. in the overall bt the determination of the winning (or losing) signal is based on the competition performed at particular layers of the tree.
to make this possible, dcmp blocks provide an additional signal, the flag (f) of a given pair, that takes part in the competition at the following layer of the tree. depending on the type of filter, between the f11 and f12 inputs and the output f there is only a single OR gate or a single AND gate. in the dilatation filter, as soon as either of the input flags becomes 1, a given dcmp immediately (with a delay below 0.5 ns) sends the flag f of the pair to the next layer of the tree. in the erosion filter, on the other hand, the AND gate forces a given dcmp to wait with sending the flag of the pair until both input flags become 1. this makes the erosion filter slower than the dilatation one. the other problem in this filter appears when the minimum signal is very small or zero. in this case the process of detection of this signal can take a very long time. to solve this problem we assume not only an upper range of the input signals but also a bottom range, which in this case equals 10% of the upper range. if it cannot be guaranteed that the input signals are always larger than the bottom range, we can introduce a constant bias, added in a junction to each signal that charges the integrating capacitor in the itc block. the last operation performed in the proposed filters is the determination of the address of the min or the max signal, depending on the type of the filter. the o1 and o2 signals from particular dcmp blocks are used by the adet block (address determination), shown in fig. 3(e), to determine the address. the o1 and o2 signals always have values that enable an unambiguous indication of the winning signal. unfortunately, the problem with the rsff is that it can hang ('0.5' states at both outputs) when two input flags arrive at almost the same time, i.e. when the corresponding input currents are almost equal. in this case the values at both outputs of the rsff are equal to about vdd/2.
to avoid ambiguity in this case, a simple hierarchy mechanism has been introduced that is able to recognize the '0.5' states. in such situations the circuit arbitrarily decides that one of the input signals obtains the status of the 'winner'. the proposed arbitrary mechanism is based on asymmetrical NOT (notn and notp) gates. the gates have different threshold voltages obtained through proper transistor sizing. these voltages are equal to 0.25∙vdd/2 and 0.75∙vdd/2 for the notp and notn gates, respectively. when the rsff hangs, the gates provide different output signals, which is detected by the XOR gate. this gate, through the configuration switches (controlled by the 'swn' / 'swp' signals), controls the values of the o1 and o2 output signals. in this case the circuit arbitrarily connects the outputs of the rsff to the vdd ('1') and vss ('0') supplies. this function does not introduce a substantial error, as in this case both analog input signals are almost equal (difference < 0.2%). additionally, it is worth noting that the '0.5' states occur seldom in practice, so the mechanism is only an emergency solution.

fig. 5 transistor level simulations of a single dcmp equipped with the proposed arbitrary mechanism. in the b state the arbitrary mechanism eliminates the ambiguity.

fig. 6 simulations of the circular delay line with eight s&h memory elements. from top to bottom are presented: (1) an example input current with the amplitude of 2 µa, (2) controlling clock signals (8 phases), (3) signal samples stored in particular s&h cells (voltages across the storage capacitors, cst), and (4) the supply current.

fig. 7 simulations of the bt block composed of the dcmp (max mode) circuits for t=20ºc and vdd=1.8v. from top to bottom are presented: (1) vc voltages in the itc circuits, (2) resultant flag signals, and (3) addresses of the samples that in a given time period have the maximum values.

fig. 8 simulations of the bt block composed of the dcmp (min mode) circuits for t=20ºc and vdd=1.8v. from top to bottom are presented: (1) vc voltages in the itc circuits, (2) resultant flag signals, and (3) addresses of the samples that in a given time period have the minimum values.

3. verification of the proposed nonlinear filters

the proposed circuit has been tested in several steps. at the beginning we tested the dcmp as a separate circuit. we put a special emphasis on the arbitrary mechanism, as this block is of crucial importance. illustrative results for the circuit used in the dilatation filter are shown in fig. 5. the rs out1 and rs out2 signals can be either in a typical state (a), in which their values are '0' or '1', or in the '0.5' state (b), which is not desired. in case b (e.g. in the range from 47 to 52 µs) the outputs of the asymmetrical notn and notp gates provide different values. this state is detected by the XOR gate, which signals it by inverting the values of the 'swp' and 'swn' signals. as a result, the outputs o1 and o2 are arbitrarily connected to the vdd and vss rails (logical '1' and '0'), respectively. after verifying the dcmp block we have tested the performance of the overall filter composed of a delay line with 8 memory cells and the bt block with three layers (log2 8). the results for the dilatation as well as the erosion filters are presented in figs. 6-8. fig. 6 illustrates the operation of the circular delay line. an example input current, a sine waveform with f = 10 khz and an amplitude of 2 µa superimposed on a 3 µa dc signal, is sampled and held in the memory cells (cst = 400 ff). each sample remains in a given cell during eight subsequent clock cycles.
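as a purely behavioural illustration of the circular delay line and of the filter outputs described above, the following python sketch models the memory as a ring buffer and returns the address and value of the max (dilatation) or min (erosion) sample. the class name, the method names and the sample values are hypothetical and are not taken from the circuit; the analog and timing behaviour (itc, dcmp, arbitrary mechanism) is not modelled.

```python
class CircularDelayLine:
    """Behavioural model of an M-cell circular sample-and-hold delay line."""

    def __init__(self, m=8):
        self.cells = [None] * m   # one entry per S&H memory cell
        self.write_idx = 0        # cell selected by the currently active clock phase

    def push(self, sample):
        """Store a new sample; the oldest sample is overwritten after m pushes."""
        self.cells[self.write_idx] = sample
        self.write_idx = (self.write_idx + 1) % len(self.cells)

    def winner(self, mode="max"):
        """Return (address, value) of the max (dilatation) or min (erosion) sample."""
        filled = [(i, v) for i, v in enumerate(self.cells) if v is not None]
        pick = max if mode == "max" else min
        return pick(filled, key=lambda pair: pair[1])


# Hypothetical samples in microamperes, inside the 1-10 uA input range.
line = CircularDelayLine(8)
for sample in [3.0, 4.2, 4.9, 4.8, 4.1, 3.0, 1.8, 1.1]:
    line.push(sample)

print(line.winner("max"))  # address and value of the largest stored sample
print(line.winner("min"))  # address and value of the smallest stored sample
```

each sample stays in its cell until the write pointer returns to it, i.e. for eight pushes in an 8-cell line, which mirrors the behaviour described for fig. 6.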
the average supply current equals 70 µa, which corresponds to an average power dissipation of 126 µw (for vdd = 1.8 v). the performance of the overall circuit for both types of nonlinear filters is presented in figs. 7 and 8. the top panels in both figures present the voltages across the integrating capacitors in particular itc blocks. this phase is preceded by resetting the capacitors. after the reset signal is released, the capacitors are charged from 0 to vdd by currents (samples of the input signal), whose values are stored in the corresponding s&h elements. the middle panels present the resultant delays of particular flags. finally, the bottom panels show the addresses of the samples with the max or min values that appear at the outputs of the filters. the input signals in both cases have been selected in such a way as to present different scenarios. in fig. 7, in the first cycle (between 50 and 60 µs), the i7 and i8 samples are almost equal. as a result, both corresponding flags are set to 1 within a short period of time, which activates the arbitrary mechanism. in this case the mechanism arbitrarily selects the i7 signal as the winner. the next two cycles (60-70 and 70-80 µs) present a typical situation, in which differences between particular signals are larger. the detection time varies in this case between 5 and 20 ns. in most cases we expect that this time will be no greater than 20 ns. in the worst case scenario (not presented), i.e. for bottom values of all samples of the input signal (1 µa), this time can reach even 100 ns. taking into account the average power dissipation provided above, the energy consumed during one detection cycle per input can be determined to be 1.57 pj and 0.32 pj in the worst case scenario and in a typical situation, respectively. in the case of the erosion filter we selected input signals for which the flags appear at almost the same time (a difference of 1-3 ns). this is shown in fig. 8.
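the power and energy figures quoted above can be checked with a few lines of python. the sketch below only repeats the arithmetic from the text (70 µa, 1.8 v, 8 inputs, detection times of 20 ns and 100 ns); the function and constant names are hypothetical, and the energy is assumed to be the average power multiplied by the detection time and divided equally among the inputs.

```python
VDD = 1.8           # supply voltage [V]
I_SUPPLY = 70e-6    # average supply current [A]
N_INPUTS = 8        # number of filter inputs


def average_power(i_supply=I_SUPPLY, vdd=VDD):
    """Average power dissipation of the whole filter [W]."""
    return i_supply * vdd


def energy_per_input(t_detect, n_inputs=N_INPUTS):
    """Energy per detection cycle and per input [J] (assumed equal split)."""
    return average_power() * t_detect / n_inputs


print(f"power:          {average_power() * 1e6:.1f} uW")
print(f"worst case:     {energy_per_input(100e-9) * 1e12:.3f} pJ")
print(f"typical (20ns): {energy_per_input(20e-9) * 1e12:.3f} pJ")
```

under these assumptions the results are 126 µw, 1.575 pj and 0.315 pj, which agree with the quoted 1.57 pj and 0.32 pj to within rounding.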
in each situation the arbitrary mechanism properly selects one of these signals as the winner (the minimum signal in this case). the detection time is in this case longer than for the dilatation filter, as the circuit must wait until the flag of the smallest signal appears at the output of the itc. we can assume that the detection time is in this case closer to the worst case scenario of the dilatation filter (100 ns), while the power dissipation remains the same.

fig. 9 corner analysis: simulations of the dilatation (max) filter for different temperatures: (a) -40ºc, (b) 120ºc. the meaning of particular diagrams seen from top to bottom is the same as in figs. 7 and 8.

after the initial verification described above we performed a detailed corner analysis of the circuits. we tested the filters in wide ranges of particular pvt (process, voltage and temperature) parameters. the environment temperature varied in the range from -40ºc to 120ºc, while the supply voltage varied in the range from 1.2 to 1.8 v. we tested the circuit for three transistor models, namely slow, fast and typical (ss, ff, tt). fig. 9 presents selected simulation results for the same situation as in fig. 7, but for different temperatures, to illustrate the stability of the system.

table 1 performance comparison of selected min / max circuits reported in literature.

reference               cmos process [µm]  vdd [v]  no. of inputs  p per input [µw]  data rate f [mhz]  input range [µa]  fom (f/p1in) [mhz/µw]
[20]                    0.5                3.3      8              106.25            5                  3.3               0.047
[21]                    0.35               3.3      8              70                1                  10                0.014
[12]                    0.8                6        8              120               2.8                50                0.023
[9]                     2.4                5        8              200               13.8               100               0.069
[15]                    0.6                3        8              283.75            20                 70                0.070
this work (dilatation)  0.18               1.8      8              15.75             50                 1-10              3.174
this work (erosion)     0.18               1.8      8              15.75             10                 1-10              0.635

4. discussion of results

in this section we compare the obtained results with the performance of other min/max circuits reported in the literature. a straightforward comparison is not easy, as particular solutions were designed for different purposes and thus, to some extent, offer different features. most of the reported circuits do not contain the delay line, as they have been designed for independent input signals directly provided to the bt block. in the case of our circuit the memory cells used to store the signal samples contain additional branches that conduct current, thus increasing the power dissipation. the schemes presented in fig. 3(a)-(c) show that each memory cell contains two additional branches, which almost doubles the power dissipation in comparison with the situation in which the same circuit (without the delay line) would be used to process independent signals. to facilitate the comparison of different solutions we define a figure-of-merit (fom) as the data rate over the power dissipation for a single input. such an assumption is justified, as the power dissipation increases approximately linearly with the number of inputs. note that the number of all elements used in the circuit also increases linearly with the number of inputs. as discussed earlier, the number of layers in the binary tree equals log2 m. at each following layer the number of elements that serve as comparators (dcmp) is halved. for example, for 8 inputs the circuit has 3 layers with 4, 2, 1 comparators, respectively (7 comparators in total). for 128 inputs the number of comparators equals 127. the number of itcs and memory cells in the delay line equals the number of inputs. we are aware that the proposed circuit has been realized in a newer technology than the other circuits of this type presented in table 1. however, as described earlier, to reduce the mismatch effect we oversized the transistors used in the analog part, which had some impact on the attainable data rate.
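the tree size and the figure-of-merit used above are simple enough to reproduce in a short python sketch. the function names are hypothetical; the numbers are those given in the text and in table 1, and the number of inputs is assumed to be a power of two.

```python
def comparators_per_layer(m):
    """Number of DCMP blocks in each layer of a binary tree with m inputs."""
    if m < 2 or m & (m - 1):
        raise ValueError("m must be a power of two, at least 2")
    layers = []
    n = m // 2
    while n >= 1:
        layers.append(n)   # each layer halves the number of competing signals
        n //= 2
    return layers


def fom(data_rate_mhz, power_per_input_uw):
    """Figure of merit: data rate over power per single input [MHz/uW]."""
    return data_rate_mhz / power_per_input_uw


power_per_input = 126.0 / 8                 # 126 uW shared by 8 inputs
print(comparators_per_layer(8))             # layers for 8 inputs
print(sum(comparators_per_layer(128)))      # total comparators for 128 inputs
print(round(fom(50, power_per_input), 3))   # dilatation filter
print(round(fom(10, power_per_input), 3))   # erosion filter
```

the total number of comparators is always m - 1, so it grows linearly with the number of inputs, while the number of layers grows only as log2 m. the fom values agree with table 1 to within rounding.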
the main source of the observable delay is the analog part of the system. in particular itc blocks the currents with values between 1 and 10 µa have to charge the capacitors of 100 ff to the value of about 0.9 v, which enables generating the flag. this process takes about 9 to 90 ns, for 10 and 1 µa, respectively. if the circuit were realized in an older technology we would have to increase the supply voltage, so the process of charging the capacitors would take a longer time (let us assume 2-3 times longer). the digital part of the system is very fast. the delay of a single layer of the bt block equals the delay of a single OR or AND gate only, as the flag of the pair is generated by these gates (fig. 3(d)-(e)). this delay in the cmos 0.18 µm technology does not exceed 1 ns for vdd = 1.8 v. if the rsff hangs, the arbitrary mechanism requires about 3 ns to decide which of the inputs is assumed to be the winner. however, this process is parallel to the process of propagating flags in the tree. in the bt block we use transistors with minimal lengths in a given technology. if we redesigned the circuit in an older technology, the propagation time of each layer of the tree would increase by a factor of (l1/l2)^2. in the cmos 0.5 µm technology, for example, this time would be about 7-10 times longer. we suppose that in this technology the delay of the overall circuit in the worst case scenario would not exceed 300 ns. this would reduce the fom of our circuit about 3 times. however, the obtained results are still four times better than in an example circuit reported in [20], designed in cmos 0.5 µm technology. the provided delay times and calculations are for an example case of 8 inputs, i.e. 3 layers in the tree. in the case of larger structures the delay of the analog part will remain the same, while the delay of the digital part will increase only moderately. this is one of the main advantages of the proposed solution.
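the delay figures above follow from elementary relations, which the python sketch below reproduces. it assumes an ideal constant-current charging of the itc capacitor (t = c∙v/i), uses the 1 ns per layer bound for the digital tree, and applies the (l1/l2)^2 scaling quoted in the text; the function and constant names are hypothetical.

```python
import math

C_ITC = 100e-15   # integrating capacitor in the ITC block [F]
V_FLAG = 0.9      # voltage at which the flag is generated, about VDD/2 [V]


def itc_delay(i_in):
    """Time to charge C_ITC to V_FLAG with a constant current i_in [s]."""
    return C_ITC * V_FLAG / i_in


def tree_delay(m, t_layer=1e-9):
    """Upper bound on the digital tree delay for m inputs [s]."""
    return math.log2(m) * t_layer


def scaling_factor(l_old, l_new):
    """Increase of the per-layer propagation time when moving to length l_old."""
    return (l_old / l_new) ** 2


print(f"10 uA: {itc_delay(10e-6) * 1e9:.1f} ns")
print(f" 1 uA: {itc_delay(1e-6) * 1e9:.1f} ns")
print(f"tree, 8 inputs:   {tree_delay(8) * 1e9:.1f} ns")
print(f"tree, 128 inputs: {tree_delay(128) * 1e9:.1f} ns")
print(f"0.5 um vs 0.18 um: {scaling_factor(0.5, 0.18):.1f}x")
```

under these assumptions the analog delay is 9 ns and 90 ns for 10 µa and 1 µa, while the digital tree adds at most a few nanoseconds even for 128 inputs, which illustrates why the analog part dominates the overall delay.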
in other circuits of this type with an analog bt, the delay is linearly proportional to the number of layers. during the corner analysis we simulated the filters with smaller supply voltages. the circuit worked properly, dissipating less power, but it was also much slower in this case. for vdd = 0.8 v the digital part was approximately 10 times slower. working with such supply voltages does not make sense, as the energy consumed during one cycle does not decrease as fast as the dissipated power, simply because of the reduced speed. additionally, for such voltages the transistors used in the analog part work in the weak inversion region, which reduces the precision of the circuit.

5. conclusions

novel nonlinear dilatation and erosion filters have been proposed in the paper. the circuits are based on the binary tree concept. however, in contrast to typical solutions of this type, in which analog bt structures are used, in the proposed circuit we distinguish the analog part, which converts the analog signals to 1-bit signals with different delays, and the parallel and asynchronous digital bt block, which determines which delay is the smallest or the largest, depending on the type of the filter. the proposed digital bt is much faster than its analog counterparts. it additionally eliminates the propagation of analog signals in the tree that takes place in other circuits of this type. as a result, the circuit offers a precision exceeding 99%, which is sufficient in many signal processing tasks. the proposed bt is very sensitive and is able to distinguish very small differences between the delays of particular input signals. this is possible through an atypical use of the rs flip flops, which serve in this case as time comparators. in a typical application of the rs flip flops the '11' input state is not allowed. in our circuit we call this situation an emergency state, which happens relatively seldom.
nevertheless, to avoid the situation in which this state would prevent the calculation of the output sample of the filter, we propose an arbitrary mechanism that is able to handle this situation. the next step of the project will be the design and fabrication of the chip containing the filters and its laboratory tests. this phase is necessary, as the noise can have some impact on the results.

references

[1] m. vemis, g. economou, s. fotopoulos, a. khodyrev, "the use of boolean functions and logical operations for edge detection in images", signal processing, 1995, vol. 45, pp. 161–172.
[2] r.a. araujo, a.l.i. oliveira, s. soares, s. meira, "designing dilation-erosion perceptrons with differential evolutionary learning for air pressure forecasting", in proceedings of the international joint conference on neural networks, 2011, san jose, california, usa, pp. 595–602.
[3] p.t. jackway, m. deriche, "scale-space properties of the multiscale morphological dilation-erosion", ieee transactions on pattern analysis and machine intelligence, 1996, vol. 18, no. 1, pp. 38–51.
[4] joseph (yossi) gil and ron kimmel, "efficient dilation, erosion, opening, and closing algorithms", ieee transactions on pattern analysis and machine intelligence, vol. 24, iss. 12, december 2002, pp. 1606–1617.
[5] a. dąbrowski, r. długosz, p. pawłowski, "integrated cmos gsm baseband channel selecting filters realized using switched capacitor finite impulse response technique", elsevier microelectronics reliability journal, vol. 46, no. 5–6, pp. 949–958, june 2006.
[6] sophocles j. orfanidis, "introduction to signal processing", previously published by pearson education, inc., 1996-2009 by prentice hall, inc., previous isbn 0-13-209172-0.
[7] r. długosz, k. iniewski, "programmable switched capacitor finite impulse response filter with circular memory implemented in cmos 0.18μm technology", journal of signal processing systems (formerly the journal of vlsi signal processing systems for signal, image, and video technology), springer new york, vol. 56, no. 2-3, pp. 295–306, september 2009.
[8] w. w. moses, e. beuville, m. h. ho, "a winner-take-all ic for determining the crystal of interaction in pet detectors", ieee transactions on nuclear science, vol. 43, no. 3, 1996, pp. 1615–1618.
[9] a. demosthenous, s. smedley, j. taylor, "a cmos analog winner-takes-all network for large-scale applications", ieee transactions on circuits and systems-i: fundamental theory and applications, vol. 45, no. 3, 1998, pp. 300–304.
[10] j. ramirez-angulo, j.e. molinar-solis, s. gupta, r. g. carvajal, a. j. lopez-martin, "a high-swing, high-speed cmos wta using differential flipped voltage followers", ieee transactions on circuits and systems ii: express briefs, vol. 54, no. 8, 2007, pp. 668–672.
[11] t. serrano, b. linares-barranco, "a modular current-mode high-precision winner-take-all circuit", ieee transactions on circuits and systems-ii: analog and digital signal processing, vol. 42, no. 2, 1995, pp. 132–134.
[12] k. wawryn, b. strzeszewski, "current mode ab class wta circuit", in proceedings of the ieee international conference on electronics, circuits and systems (icecs), 2001, pp. 293–296.
[13] g. t. tuttle, s. fallahi, a. a. abidi, "an 8-b cmos vector a/d converter", ieee international solid-state circuit conference (isscc), san francisco, usa, 1993, pp. 38–39.
[14] r. długosz, t. talaśka, r. wojtyna, "new binary-tree-based winner-takes-all circuit for learning on silicon kohonen's networks", in proceedings of the int. conf. on signals and electronic systems (icses), łódź, poland, 2006, pp. 441–446.
[15] b. tomatsopoulos, a. demosthenous, "low power, low complexity cmos multiple-input replicating current comparators and wta/lta circuits", in proceedings of the european conference on circuit theory and design (ecctd), vol. 3, no. 28, cork, ireland, 2005, pp. 241–244.
[16] r. dlugosz, a. rydlewski, t. talaska, "low power nonlinear min/max filters implemented in the cmos technology", in proceedings of the 29th international conference on microelectronics, beograd, serbia, 12-14 may 2014, pp. 397–400.
[17] r. długosz, w. pedrycz, "łukasiewicz fuzzy logic networks and their ultra low power hardware implementation", elsevier neurocomputing, vol. 73, iss. 7-9, pp. 1222–1234, march 2010.
[18] r. długosz, t. talaska, w. pedrycz, "current-mode analog adaptive mechanism for ultra-low power neural networks", ieee transactions on circuits and systems–ii: express briefs, vol. 58, iss. 1, pp. 31–35, january 2011.
[19] m.j.m. pelgrom, h.p. tuinhout and m. vertregt, "transistor matching in analog cmos applications", in proceedings of the ieee international electron devices meeting, december 1998, pp. 915–918.
[20] y.c. hung, b.d. liu, "high-reliability programmable wta/lta circuit of o(n) complexity using a single comparator", iee proceedings-circuits devices and systems, vol. 151, no. 6, 2004, pp. 579–586.
[21] yu chien-cheng, tang yun-ching, liu bin-da, "design of high performance cmos current-mode winner-take-all circuit", in proceedings of the international conference on asic, beijing, china, 2003, pp. 568–572.

facta universitatis series: electronics and energetics vol. 29, no. 4, december 2016, pp. 653-674 doi: 10.2298/fuee1604653p

high performance digital current control in three phase electrical drives

ljiljana s. peric, slobodan n. vukosavic
university of belgrade, dept. of electrical engineering, belgrade, serbia

abstract. the majority of contemporary static power converters make use of three-phase, pwm controlled igbt inverters.
typical applications include electrical drives and grid connected converters. in both cases, a high closed loop bandwidth is highly desirable. the bandwidth is constrained by the problems of the feedback acquisition. the feedback errors are caused by the noise, parasitic phenomena and by the current ripple at the pwm frequency. the errors can be reduced by deriving the average value of the output current within the past pwm period. this feedback acquisition method reduces the noise, but it also introduces delay into the feedback lines. along with delays brought in by digital pwm, the feedback delay reduces the range of stable gains and limits the closed loop bandwidth. effects of the delay can be reduced by conveniently placing the control interrupt and adopting an optimum parameter setting which meets both the bandwidth requirements and the robustness against the noise and the parameter changes. experimental verification shows that the proposed current controller achieves a response speed and a robustness against the noise that outperform competing solutions.

key words: current control, high-performance control, signal acquisition, ac motor drives, three phase inverters

1. introduction

electrical drives are frequently used to control the speed or the position of the work piece or the tool. in such cases, the speed and position controllers are used as the outer control loop, which provides the torque reference. the latter determines the desired currents that have to be injected into the stator windings in order to obtain the desired torque. digital current controllers are the inner loop of the drive [1], [2]. the bandwidth of the current loop determines the torque response time. therefore, it determines the overall performance of the drive [3], [4], such as the closed loop bandwidth of the speed or position loop. for the proper operation of the drive, it is essential to decouple the flux control loop and the torque control loop.
the basic prerequisite for the decoupled flux and torque control is a fast and robust current controller [5]-[7].

received august 28, 2015; received in revised form november 13, 2015. corresponding author: slobodan n. vukosavic, university of belgrade, dept. of electrical engineering, 11000 belgrade, serbia (email: boban@ieee.org)

in high speed drives, the fundamental frequency ff of the stator currents and voltages can reach considerable values. in some cases, ff can reach a considerable fraction of the sampling frequency fspl of the current controller [8]-[11]. in such cases, it is of particular importance to have a high closed loop bandwidth of the digital current controller. the objective of this paper is to maximize the closed loop performance in the presence of delays, and it is reasonable to expect that the devised control measures would also improve the closed loop performance at very high fundamental-to-switching-frequency ratios; however, such performance is neither discussed nor verified in this paper. a high bandwidth of the current controller is also important in grid-connected static power converters, such as the three phase inverters that regenerate into the grid, used in conjunction with wind-power plants and solar-power plants. in order to inject undistorted, sinusoidal currents into the grid, the closed loop current controllers have to overcome nonlinearities such as the lockout time, reduce the line harmonics, and ensure a very low total harmonic distortion (thd). to achieve this, digital current controllers should have a quick response and a high disturbance rejection. several imperfections and nonlinearities make this requirement difficult to achieve. performance enhancement requires proper modeling of, and accounting for, such imperfections and nonlinearities [8], [10], [11].
dead-beat and predictive controllers [13] have gained a certain popularity due to their capacity to cope with transport delays, but their wider use is hindered by a pronounced sensitivity to changes in system parameters. digital current controllers are frequently located in the synchronous dq frame. this is done in order to achieve constant steady-state references, instead of the sinusoidal steady-state references that would have to be tracked with controllers located in the stationary coordinate frame. synchronous controllers can achieve steady state performance with zero phase error and zero amplitude error. the controller structure includes proportional and integral action. when used with elevated fundamental frequencies ff, the current controller has to be enhanced by the dq decoupling actions [10], [11], [14]. in cases where the pwm delays and imperfections are negligible, and where the current ripple does not impair the feedback acquisition process, conventional pi controllers can be tuned by using well-known tuning procedures [11], [12], [14], derived by applying the imc concept [5]. neglecting the imperfections, the synchronous frame current controllers can reach the bandwidth frequency of 0.11∙fspl. with stationary frame controllers, it is possible to achieve the bandwidth frequency of 0.07∙fspl [12]. in cases where the pwm ripple of the output current is not negligible, as well as in cases where the pwm delays and lockout time delays impair the sampling process and contribute to parasitic alias components, it is not possible to use the conventional structure and parameter setting of digital current controllers. the current controller designed in this paper reduces the sampling errors by taking the average value of the feedback signals over the past pwm period.
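the effect of averaging the feedback over one pwm period can be illustrated with a minimal python sketch. it is not the authors' dsp implementation: the pwm period, the number of samples, the ripple model (a sinusoid at the pwm frequency) and all names are assumptions chosen for illustration only.

```python
import math

T_PWM = 100e-6     # assumed PWM period [s]
I_DC = 5.0         # assumed average current [A]
I_RIPPLE = 1.0     # assumed ripple amplitude at the PWM frequency [A]


def period_average(signal, t_end, t_pwm, n_samples=32):
    """Average n equally spaced samples of signal(t) taken over the last t_pwm."""
    dt = t_pwm / n_samples
    return sum(signal(t_end - k * dt) for k in range(n_samples)) / n_samples


def current_with_ripple(t):
    """Average current plus a ripple component at the PWM frequency."""
    return I_DC + I_RIPPLE * math.sin(2 * math.pi * t / T_PWM + 0.7)


def ramp(t):
    """Slowly changing current, used to expose the delay of the averaging."""
    return 1000.0 * t


t_now = 3 * T_PWM
print(current_with_ripple(t_now))                         # single sample, ripple included
print(period_average(current_with_ripple, t_now, T_PWM))  # ripple averaged out
print(ramp(t_now) - period_average(ramp, t_now, T_PWM))   # lag caused by averaging
```

a single sample carries the instantaneous ripple, while the average over one full pwm period cancels it. for the ramp, the average lags the instantaneous value by (n-1)/(2n) of the pwm period, i.e. close to half a period, which is the price paid for the ripple suppression.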
in essence, the well-known technique of oversampling is applied within the current controller environment, paced by the pwm carrier, and implemented on an industrial dsp as a time-skewed one-pwm-period-averaging. the method uses an automated, dma-driven oversampling. the feedback signal is calculated from a large number of equally spaced samples collected within the past two sampling periods. with double-update mode, the two sampling periods correspond to one pwm period. although this approach reduces the feedback errors caused by the noise and the ripple, it also introduces delay into the feedback lines. the feedback delay adds to the delay contributed by the digital pwm. in a conventional implementation, where the control interrupt gets triggered by the zero-count and the period-count of the pwm carrier, the equivalent transport delay encountered with the proposed one-pwm-period-averaging reaches 2.5 sampling periods, thus reducing the range of stable gains and limiting the closed loop bandwidth. effects of the delay can be reduced by conveniently shifting the control interrupt and adopting an optimum parameter setting which meets both the bandwidth requirements and the robustness against the noise and the parameter changes. in order to deal with transport delays in digital current controllers, and to provide an error-free feedback acquisition, the authors recently designed and tested several solutions. the most important previous results are given in [20] and [21]. like this paper, [20] and [21] deal with digital current controllers. therefore, it is of interest to clarify what has been done in [20], [21], and what the contribution of this paper is. in [20], delay compensation is performed by introducing a differential control action with the proper d-gain setting. in this paper, we took a different approach. we do not use the differential action.
instead, we rely on the time-skewed acquisition window of fig. 9, which permits rescheduling of the interrupt as an effective way of coping with delays. it is also of interest to compare the feedback acquisition techniques used in [20], [21], and in this paper. the approach used in this paper is similar, but not identical, to the one used in [20], while the approach of [21] is quite different from both. in [20], the impact of the lockout time, the motor cable capacitance and the switching noise on the feedback errors incurred with the conventional regular-sampling-double-update approach is experimentally verified, suggesting the need for oversampling. the thorough experimental evidence of [20] is obtained with variable length of the motor cable, and it shows that the pwm-period-based oversampling and decimation results in a considerable reduction of the sampling errors. the method is named one-pwm-period-averaging, and it takes the average of the samples acquired within one tpwm window bounded by the interrupt ticks. while the method used in this paper also uses oversampling and decimation, the current samples are acquired within a different acquisition window. in fig. 9, the new acquisition window is shifted by texe with respect to the interrupt ticks. the time shift texe corresponds to the execution time of the control interrupt. the skew between the corresponding sampling windows can be observed by comparing [20, fig. 5] and fig. 9. in [21], the authors deal with current controllers which do not use oversampling/decimation, but rely instead on the conventional regular-sampling-double-update approach. this approach is applicable to noise-free sampling cases, such as the one where the inverter is integrated within the motor housing. the structure of the current controller of [21] uses p, i, and d actions, and it is different from the controller considered in this paper.
in [21], the parameter setting deals with the three gains, and it takes into account the controller capability to suppress the impact of the electromotive-force disturbances. in this paper, the parameter setting focuses on the p and i gains, and it does not consider the disturbance rejection capability. the objective of the parameter setting rules in [21, page 7, criterion function q] is different from the objective of the parameter setting rules in section 5 of this paper. the former results in optimum p-i-d gains, while the latter deals with a p-i controller and searches for the optimum p gain while maintaining a constant p/i ratio. the feedback acquisition technique, the structure of the current controller, and the goal of the parameter setting procedure in [21] are different from those proposed in this paper. this paper is organized as follows. the system with the three phase igbt-based inverter, the typical load and the digital controller is revisited in section 2. the feedback acquisition system is discussed in section 3, with the proposal of the error-free sampling scheme which operates with a minimum indispensable delay. in section 4, the two competitive structures of the digital current controller are analyzed and discussed. the optimum parameter setting is proposed in section 5, focused on achieving quick response, robustness, and high rejection of the input disturbances. experimental results obtained with the proposed current controller are included in section 6. conclusions are given in section 7.

2. synchronous-frame digital current controller

most 3-phase digital current controllers are employed either in electrical drives or in grid connected static power converters. the former have the task of controlling the stator currents of 3-phase ac machines, while the latter control the current injected into the 3-phase ac grids. an igbt inverter, used to supply an ac machine, is shown in fig. 1.
the ac line voltage is rectified to obtain the dc voltage e. the voltage e feeds the 3-phase inverter. by means of the pulse width modulation (pwm), the inverter generates the variable frequency, variable amplitude voltages required for the proper current control. in fig. 2, two igbt inverters are used to take the energy from the wind turbine and pass it into the ac grid. one of the inverters (on the right) has a function rather similar to the one in ac drives, and it controls the stator current of the ac generator, thus controlling the flux and torque of the machine. the inverter on the left in fig. 2 takes over the energy that comes through the dc link circuit and passes it into the grid. it has to control the current injected into the grid. the grid current has to be sinusoidal, with a low thd, and with the power factor which depends on the active power and reactive power commands. in both fig. 1 and fig. 2, the three phase inverters are used as the voltage actuators, which supply the voltages required for the proper current control.
fig. 1 three phase inverter as the voltage actuator within an electrical drive.
the three phase inverters of figs. 1 and 2 are nonlinear voltage actuators that cannot supply a continuously changing voltage. they are linearized by means of the pulse width modulation, illustrated in fig. 3. in each switching period tpwm, the output phases are connected to the upper rail of the dc bus during the conduction interval ton, and then switched to the lower rail of the dc bus during the interval toff = tpwm - ton. in this way, the average value within the switching period is uav = e·ton/tpwm, where e is the dc voltage across the dc bus, while the voltage uav is referred to the minus rail of the dc bus. in this way, the switching bridge is linearized as a voltage actuator. in each switching interval tpwm, it provides the average voltage which can be adjusted by altering the value of ton.
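the relation uav = e·ton/tpwm can be illustrated with a minimal sketch. the function names and the numerical values (a 600 v dc bus and a 100 µs switching period) are assumptions chosen only for illustration; they are not taken from the paper.

```python
def on_time(u_cmd, e_dc, t_pwm):
    """Return the conduction interval ton that yields the average voltage u_cmd.

    Based on uav = e * ton / tpwm, with uav referred to the minus rail of the
    dc bus. The command is clamped to the range the bridge can deliver.
    """
    m = min(max(u_cmd / e_dc, 0.0), 1.0)  # modulation index m = ton / tpwm
    return m * t_pwm


def average_voltage(t_on, e_dc, t_pwm):
    """Average output voltage over one switching period."""
    return e_dc * t_on / t_pwm


E_DC = 600.0    # assumed dc-bus voltage, volts
T_PWM = 100e-6  # assumed switching period (fpwm = 10 kHz), seconds

t_on = on_time(150.0, E_DC, T_PWM)         # command of 150 V
u_av = average_voltage(t_on, E_DC, T_PWM)  # recovers the command
```

a command outside the interval [0, e] is clamped in this sketch, which is one simple way of reflecting that the bridge cannot deliver an average voltage beyond the dc bus rails.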
fig. 2 the use of the 3-phase inverters as the voltage actuators within the power conversion system which takes over the energy from a variable speed generator (right) and recuperates the energy into the ac grid.
the left side of fig. 3 illustrates the asymmetrical pwm technique, while the symmetrical pwm is given on the right. due to its inferior performance, asymmetrical pwm is not used in 3-phase inverters. in both electrical drives and grid side power converters, the 3-phase inverters use the symmetrical pwm technique. other techniques which are also used have the same switching sequences as the carrier-based symmetrical pwm. in cases where the output voltage of the 2-level 3-phase inverter is generated by the space vector modulation, the resulting pwm pattern can be proved equal to the pattern obtained with symmetrical pwm where the modulating signal is conveniently changed.
fig. 3 pulse width modulation with asymmetrical (left) and symmetrical (right) carrier. the waveform of the line-to-line voltage uab is given in lower right.
in fig. 3, the intervals where the digital controller executes the control algorithm are designated by exe. one execution occurs during the rising edge of the carrier, while the successive execution takes place during the falling edge. in other words, there are two executions in each tpwm. therefore, the sampling time of the current controller is tspl = tpwm/2. each execution calculates a new value of the conduction interval ton for the phases a, b, and c. due to uav = e·ton/tpwm, the calculated intervals ton actually represent the voltage commands. the ratio m = ton/tpwm represents the modulation index. in fig. 3, the modulation indices are denoted by ma, mb, and mc. the execution between b0 and a1 in fig. 3 calculates the values of ton (m) that cannot be applied before the instant a1, when the carrier reaches the period count and starts to decline.
therefore, the effects of the calculated ton (m) take place between a1 and b1, thus affecting the falling edges of the phase voltage pulses. in the same way, the execution between a1 and b1 produces ton and m that are applied only after the instant b1, when the carrier reaches zero, thus affecting the rising edges of the phase voltage pulses between b1 and a2. these delays are introduced by the hardware properties of the pwm peripheral units. in order to avoid multiple commutations within a single period tpwm, the values of the modulation signals ma, mb, and mc are reloaded into the pwm comparators only at instants where the pwm carrier reaches either zero or the period count. the intrinsic delay of the pwm peripheral unit introduces a transport delay of tspl/2 into the voltage actuator. this delay has to be taken into account when designing the structure and deciding the parameters of the digital current controller. a simplified schematic of the inverter supplied stator winding is given in fig. 4, made by adopting the following assumptions. for both induction motors and synchronous motors, it is reasonable to assume that the magnetic flux within the machine changes very slowly compared to the desired dynamics of the loop. the same assumption holds for the rotor speed. at the same time, the electromotive force emf is the product of the flux and the rotor speed. therefore, it is reasonable to assume that the electromotive force emf has the role of a slow, external disturbance within the current control system.
fig. 4 simplified schematic of the inverter supplied stator winding. this schematic does not reflect the coupling between phases in an electric machine.
in cases where the inverter of fig. 1 generates a three phase system of symmetrical, sinusoidal voltages, the average value of the three output phase voltages corresponds to the center-point of the dc bus, denoted by the ground symbol in fig. 4.
if the three phase winding of the ac machine is symmetrical, and the electromotive forces are symmetrical and balanced, then the star point of the stator winding remains at the potential of the ground, as denoted in fig. 4. in such cases, the stator winding can be represented by the simplified schematic of fig. 4. similar considerations hold for grid side connected power converters with a series l filter. the only difference is that the phase voltages of the ac grid replace the electromotive forces of the stator winding, while the rl parameters of the output filter and the grid replace the elements r and l in fig. 4. in order to perform the current control task, it is necessary to acquire the feedback signals, namely, the value of the stator current. characteristic waveforms are given in fig. 5. due to the pulsed nature of the stator voltage, the current has the fundamental component and a superimposed ripple. the ripple comprises the spectral component at the pwm frequency fpwm and a certain spectral content at integer multiples of fpwm. with tspl = tpwm/2, most of the ripple energy resides at the nyquist frequency and impairs the sampling process. with an assumed linear change of the ripple (curve a in fig. 5), it would be possible to acquire the samples at the center of each voltage pulse, whether positive or negative, and obtain a ripple free feedback (i1) (regular-sampling-double-update). due to the rl nature of the winding impedance, the waveform of the ripple assumes the form of the curve b in fig. 5. moreover, the center of the voltage pulses is affected by unpredictable effects of the lockout time and gating signal delays. therefore, an attempt to acquire a single sample in each half-period of the pwm would result in sampling errors, denoted by i2 in fig. 5.
one way of avoiding such errors is to respect the limits imposed by the kotelnikov sampling theorem, that is, to filter out any spectral content above fspl/2 = fpwm. considering the limited resolution of the analog-to-digital converter, it is usually considered sufficient to reduce any spectral content in the forbidden area below the level of 1 lsb of the adc. even so, a heavy low pass filtering would be required to complete the task, thus introducing unacceptable delays, phase errors and amplitude errors. an alternative way of securing an error-free sampling without introducing considerable delays is explained in the following section.
fig. 5 pwm-related ripple and the fundamental component of the stator current.
3. an advanced feedback acquisition system
the main problem in acquiring the current feedback is the presence of the current ripple, caused by the pulsed nature of the inverter voltages. the ripple has a triangular form, illustrated in fig. 6. most of the spectral energy of the ripple resides at the pwm frequency, with some minor components located at integer multiples of fpwm. the consequential sampling errors, illustrated in fig. 5, can cause a considerable deterioration of the current controller performance. in an ac drives environment, the single-sample feedback acquisition of fig. 5 is prone to sampling errors [17], [18]. with relatively large dv/dt values at the inverter output, the switching causes parasitic oscillations of the voltage and current [17]. the frequency of such oscillations is well above the nyquist frequency. the parasitic lc elements that give rise to poorly damped parasitic oscillations are present even with a rather short inverter-motor cable [17]. in all the sampling schemes where the feedback is obtained from a single sample in each sampling period, the oscillations above the nyquist frequency introduce sampling errors.
in a 3-phase ac controller, the switching instants continuously change according to the voltage command, and their position relative to the sampling instant is variable. the consequential sampling noise is larger when the switching comes close to the sampling instants, where the switching-excited parasitic oscillations may contribute to considerable feedback errors [18]. the regular-sampling-double-update approach introduces sampling errors even in cases where the switching-noise oscillations are much lower, such as the case of ac drives with integrated motor-inverter and no cable. the feedback errors are introduced whenever the sampling instants slide away from the zero-crossing instants of the current ripple. any imperfection, parasitic effect or delay that moves the ripple zero-crossing away from the sampling instant introduces sampling errors. one way of reducing such errors is finding the average value of the current in each pwm period by oversampling.
fig. 6 sampling scheme with pwm-period averaging.
the oversampling technique aided with digital filtering/averaging is well known and widely used [16], [20]. in order to reduce the measurement errors in an industrial current control environment, we implemented the oversampling to perform time-skewed one-pwm-period averaging of the feedback signals on an industrial dsp. the devised technique is rather simple. a similar technique has been introduced and tested in [20]. the thorough experimental evidence of [20] proves that the proposed oversampling and decimation technique, also called one-pwm-period-averaging, results in a considerable reduction of the sampling errors. while the feedback acquisition proposed in [20] takes the average of the samples acquired between the interrupt events, the feedback acquisition proposed in this paper (fig. 9) uses another acquisition window. it is skewed by texe, where texe corresponds to the execution time of the control interrupt.
in [20], the feedback delay is compensated by extending the controller with a differential control action. in this paper, we avoid the differential action and rely on the time-skewed acquisition window (fig. 9), which enables an effective reduction of time delays. the basic approach taken in [20, fig. 5] is similar, but not identical, to the time-skewed approach illustrated in fig. 9, where the oversampling window is shifted by texe. the general description of the oversampling/decimation is found in [20] in a brief, eight-line paragraph above (12). at the same time, neither this specific implementation nor the analysis of the consequential delays has been published by other authors. a more detailed description of the one-period-averaging is given in this section, along with the necessary information and the model of the transport delay, which is required for the proper understanding of the next steps and for the further analysis. later on, the time-skewed sampling window of fig. 9 is emphasized and discussed, as it represents the difference between the feedback acquisition of [20] and the feedback acquisition used in this paper. the feedback averaging is implemented on a low cost dsp controller. the implementation of the time-skewed one-pwm-period averaging on an industrial dsp requires some skill, as the resources are cost-limited and the programming is not trivial. we are also aware of the possibility to implement the relevant algorithm on the laboratory control platform dspace microautobox ii, which has the hardware support for the synchronization of the a/d and pwm processes, and which also supports the oversampling. this opens the possibility to implement and verify the time-skewed one-pwm-period averaging in a laboratory environment, with much lower effort, and without the need to change the dsp code of the actual industrial drive.
the current ripple in the feedback signal can be suppressed by the sampling scheme outlined in fig. 6. the feedback signal at instant (n+1)tspl is calculated from a number of equidistant samples, acquired within the past pwm period, starting from (n-1)tspl and ending with (n+1)tspl. the number of equidistant samples acquired within each tpwm = 2tspl interval can be very large. with contemporary digital signal processors, it is possible to scan all the adc channels every 1 µs. hence, with a typical fpwm = 10 khz, it is possible to acquire up to 100 successive samples of all the analog channels. for practical reasons, the number of samples is usually set to a power of two, 2^n, hence either 32 or 64 samples. the samples are collected automatically, by an internal dma machine, without an additional overload of the cpu. the collected samples are automatically stored in a designated region of the internal ram, thus made ready for further processing. during each control interrupt, it is necessary to find the sum of the samples acquired over the past pwm period, and to calculate their average value by dividing the sum by the number of samples. in cases where the oversampling factor is a power of two (2^n), the division can be replaced by a simple right shift of the sum by n bits. the average obtained in this way corresponds to the average value of the output current in the stationary coordinate frame. transformation into the synchronous frame requires the proper angle between the two frames. the averaged feedback signals have to be transferred into the synchronous frame by using the average angle within the same pwm period where the actual samples have been acquired. in both direct and inverse park transformations, the angle has to be time-synchronized with the samples that are being transformed. with a considerable number of samples, it is reasonable to assume that the average value of the collected samples corresponds to the average value of the sampled current within the preceding tpwm period.
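the averaging step described above can be sketched as follows. this is only an illustration under stated assumptions: the oversampling factor of 32, the function names, and the synthetic test signal (a constant fundamental with a zero-mean triangular ripple, in integer adc counts) are hypothetical and do not reproduce the dsp code of the actual drive.

```python
N_LOG2 = 5         # assumed oversampling exponent: 2**5 = 32 samples per tpwm
N = 1 << N_LOG2


def averaged_feedback(adc_counts):
    """Average N = 2**N_LOG2 integer adc samples; the division is a right shift."""
    if len(adc_counts) != N:
        raise ValueError("expected exactly one pwm period of samples")
    return sum(adc_counts) >> N_LOG2


def one_period_of_samples(fundamental, ripple_amp):
    """Hypothetical test signal: constant fundamental plus a zero-mean
    triangular ripple with one cycle per pwm period, in integer adc counts."""
    samples = []
    for k in range(N):
        # triangle equal to +ripple_amp at k = 0 and -ripple_amp at k = N/2
        ripple = ripple_amp * (4 * abs(k - N // 2) - N) // N
        samples.append(fundamental + ripple)
    return samples


samples = one_period_of_samples(fundamental=2000, ripple_amp=80)
single_sample = samples[0]             # a badly timed single sample sees the ripple peak
averaged = averaged_feedback(samples)  # the ripple cancels over the full period
```

in this synthetic case the single sample carries the full ripple amplitude as an error, while the one-period average returns the fundamental. with signed currents, the right shift rounds toward minus infinity, which is a detail a real implementation would have to consider.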
therefore, the value of $i^f_{n+1}$ at instant (n+1)tspl can be expressed in terms of the current samples $i_{n-1}$, $i_n$ and $i_{n+1}$, representing the instantaneous value of the output current at instants (n-1)tspl, ntspl, and (n+1)tspl. it is of interest to notice that the samples $i_{n-1}$, $i_n$ and $i_{n+1}$ are not actually acquired, and they are not available in the dsp ram. they are mentioned in an effort to relate the feedback signals to the output response of the actual system. namely, the dsp controller does not acquire the samples at instants (n-1)tspl, ntspl, and (n+1)tspl, and it does not have the information on the actual output ($i_{dq}$ in fig. 7). the feedback loop is closed by using the signal $i^f_{dq}$ in fig. 7, obtained by one-pwm-period averaging. for the purposes of the subsequent analysis, it is necessary to find an appropriate model of the delay, and to express the feedback $i^f_{n+1}$ in terms of the samples $i_{n-1}$, $i_n$ and $i_{n+1}$. the samples $i_{n-1}$, $i_n$ and $i_{n+1}$ coincide with the zero-count and the period-count of the pwm carrier (figs. 3 and 5). with regular-sampling-double-update, tpwm = 2tspl, and the average value of the inverter voltage is changed in each tpwm/2. with l/r >> tspl, and neglecting the current ripple, the remaining ripple-free component of the output current has a quasi-linear change between (n-1)tspl and ntspl, as well as between ntspl and (n+1)tspl. with this assumption, the average value of the output current from (n-1)tspl to ntspl is roughly equal to $(i_{n-1}+i_n)/2$, while the average value from ntspl to (n+1)tspl is equal to $(i_n+i_{n+1})/2$. therefore, the average value on the interval [(n-1)tspl .. (n+1)tspl] becomes the average value of $(i_{n-1}+i_n)/2$ and $(i_n+i_{n+1})/2$,
$$i^f_{n+1} = \frac{i_{n+1} + 2 i_n + i_{n-1}}{4}. \qquad (1)$$
both the samples of the stator current, taken at instants such as (n-1)tspl, ntspl, and (n+1)tspl, and the samples of the feedback $i^f$ can be transformed into the z domain and represented by the corresponding complex images $i(z)$ and $i^f(z)$.
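the step from the piecewise-linear assumption to (1) can be checked numerically. the sketch below is an illustration only; the function names and the three sample values are hypothetical.

```python
def dense_average(i_prev, i_mid, i_next, points_per_half=1000):
    """Mean of a piecewise-linear current over [(n-1)tspl, (n+1)tspl].

    The current is assumed linear from i_prev to i_mid and from i_mid to
    i_next; each half is sampled at equidistant midpoints.
    """
    total = 0.0
    for k in range(points_per_half):
        x = (k + 0.5) / points_per_half          # relative position in each half
        total += i_prev + (i_mid - i_prev) * x   # first half-period
        total += i_mid + (i_next - i_mid) * x    # second half-period
    return total / (2 * points_per_half)


def feedback_model(i_prev, i_mid, i_next):
    """Closed-form approximation of the averaged feedback, equation (1)."""
    return 0.25 * (i_prev + 2.0 * i_mid + i_next)


dense = dense_average(1.0, 1.4, 1.1)
model = feedback_model(1.0, 1.4, 1.1)
```

for a current that is exactly linear on each half-period, the densely sampled average and the three-point expression agree up to rounding, which is the reasoning behind (1).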
the two images are related by the transfer function of the feedback path $w_f$,
$$w_f(z) = \frac{i^f(z)}{i(z)} = \frac{z^2 + 2z + 1}{4 z^2}. \qquad (2)$$
with tspl = tpwm/2, the transfer function $w_f$ has an infinite attenuation at the switching frequency fpwm. the attenuation is also infinite at integer multiples of fpwm. therefore, the feedback signal acquired from (1) is not affected by the ripple. the average transport delay introduced by (1) and (2) is equal to tspl, hence, considerably lower than the delay of conventional anti-aliasing filters that could be used instead of the proposed averaging. it is of interest to compare the frequency response of the pulse transfer function (2) and the actual one-pwm-period-averaging. with a large number of current samples within each period, the transfer function of the feedback acquisition system is very close to the analog-implemented average over the past tpwm, which has an infinite attenuation at the switching frequency and its integer multiples. the delay model of one-pwm-period averaging with n samples requires the modified z-transform with the fractional sampling period ratio of n=32 or n=64, which is less convenient, less instructive, and hardly suitable for the analysis, design and parameter setting of the controller. for that reason, we adopted the approximation (2). this approximation is verified by the considerable similarity between simulations and the experimental results. thus, all the further design phases and parameter setting procedures use the delay approximation of (2). the purpose of the $w_f(z)$ approximation is to model the transient phenomena below the nyquist frequency, which is the frequency range of interest when it comes to designing and tuning the digital current controller. in the frequency range well above the nyquist frequency, there are differences between one-pwm-period-averaging and the transfer function (2), but they do not have any meaningful influence on the setting of the feedback gains.
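the properties of the feedback transfer function (z^2 + 2z + 1)/(4z^2) can be verified with a short sketch that evaluates it on the unit circle. the function names and the value of fpwm are assumptions for illustration.

```python
import cmath
import math


def w_f(z):
    """Feedback-path pulse transfer function (z^2 + 2z + 1) / (4 z^2)."""
    return (z * z + 2.0 * z + 1.0) / (4.0 * z * z)


def response_at(f, f_pwm):
    """Frequency response at f, with the sampling period tspl = tpwm / 2."""
    t_spl = 1.0 / (2.0 * f_pwm)
    z = cmath.exp(1j * 2.0 * math.pi * f * t_spl)
    return w_f(z)


F_PWM = 10e3                               # assumed switching frequency, Hz

at_dc = response_at(0.0, F_PWM)            # unity gain for the fundamental
at_fpwm = response_at(F_PWM, F_PWM)        # ripple frequency, z = -1
at_tenth = response_at(F_PWM / 10, F_PWM)  # a frequency inside the loop bandwidth
delay_in_periods = -cmath.phase(at_tenth) / (0.1 * math.pi)
```

since the transfer function can be written as $z^{-1}(z^{1/2} + z^{-1/2})^2/4$, its phase on the unit circle is exactly $-\omega t_{spl}$, that is, a pure delay of one sampling period, while its magnitude $\cos^2(\omega t_{spl}/2)$ drops to zero at fpwm, where z = -1.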
the validity of the above approximations is justified by the experiment.
4. the structure of the current controller
the current controller provides the two voltages (ud and uq) supplied to the three phase ac machine, where they affect the two output currents (id and iq). whether represented in the synchronous or in the stationary frame, the plant has two inputs and two outputs. the transient phenomena in the orthogonal axes are coupled. the coupling depends on the revolving speed of the dq frame, that is, on the revolving speed of the electrical machine. the coupling is also affected by the transport delays. direct digital design (ddc) with the imc applied in z domain [11] decouples the transient phenomena in the orthogonal axes by means of the controller with conveniently embedded proportional and integral actions. the pulse transfer function of such a controller gets multiplied by the pulse transfer function of the plant (fig. 7) to obtain the open loop transfer function which does not have any coupling terms [11, iv.e]. relying on this result, it is possible to add the filtering terms in the feedback path and perform the gain setting assuming that the coupling between the orthogonal axes does not exist; namely, assuming that the current controlled system is a single input single output system (siso). the analyses given in subsections 4.1 and 4.2 are performed under this assumption. more detailed support for such a claim is given in subsections 4.3-4.5. the current controller executes in each tspl interval, that is, two times in each pwm period. the current controller tasks include the acquisition of the feedback signal, the execution of the control algorithm, and writing the voltage references, in the form of ton commands, into the corresponding registers of the pwm peripheral. the exact sequence of events has considerable effects on the consequential transport delay and it determines the closed loop performance.
the execution of the current control tasks takes place within an interrupt, triggered by a programmable event. the interrupts have to repeat twice in each pwm period. in fig. 8, the interrupt events are created whenever the pwm carrier reaches either zero or the period count. one such execution instant is denoted in fig. 8. it takes place after the period count of the pwm carrier, at t = (n-1)tspl. the interrupt calculates the feedback signal $i^f_{n-1}$ as the average value of successive current samples within the past pwm period. it can be approximated by the function of the samples $i_{n-3}$, $i_{n-2}$, and $i_{n-1}$, as $i^f_{n-1} = 0.25(i_{n-3} + 2i_{n-2} + i_{n-1})$. based upon such feedback, the current controller derives the current error, and it calculates the voltage command $u_{n-1}$, suited to drive the current error back to zero. once calculated, the voltage command $u_{n-1}$ is expressed in terms of the pulse width ton for each of the three inverter phases. the pulse width values are reloaded into the pwm peripheral at instant t = ntspl. therefore, the voltage command $u_{n-1}$ determines the average voltage on the interval [ntspl .. (n+1)tspl]. this implies another transport delay that has to be taken into account in the controller design.
fig. 7 block diagram of the digital current controller.
fig. 8 execution of the control interrupt immediately after the pwm carrier reaches zero-count or period-count.
the transport delays can be reduced by using the multisampling technique of [18]. instead of executing the control interrupt twice per each pwm period tpwm, as is the case in most dual-update-mode solutions, the interrupt can be executed n > 1 times in each half-period tpwm/2. each time the interrupt is triggered, a new feedback sample is acquired and a new voltage reference is calculated. the voltage references affect the modulating signals that change n times in each half-period.
most of these references do not get implemented, as the pwm process permits only one switching per half-period, that is, it accepts only one crossing of the modulation signal and the pwm carrier. therefore, the pwm unit uses only one of the voltage references in each tpwm/2, while the multisampling generates considerably more references. one of the consequential drawbacks is a nonlinear relation between the voltage command and the actual output voltage. the positive side of the multisampling approach is the reduction of the transport delay. besides the nonlinear input-output relation, caused by specific insensitivity at vertical transitions, the multisampling picks up the current ripple, which has to be reduced by introducing a dedicated digital filtering, proposed in [18]. in order to suppress the impact of the switching transients on critical samples, it is of vital interest to skip any sampling immediately after the pwm switching. in order to avoid possible multi-switching conditions, the multisampling requires a specialized pwm logic which prevents the system from making more than one commutation in each half-period of the pwm. another possibility of sequencing the current control tasks is denoted in fig. 9, where the zero-count events and period-count events of the pwm carrier take place at (n-1)tspl, ntspl, and (n+1)tspl. the interrupts occur texe before each counter event. the value of texe should be larger than the worst-case execution time of the control interrupt. in this case, the control interrupt completes before the successive event of the pwm counter. with recent dsp controllers, the interrupt execution time does not exceed 4 µs. hence, the interval texe is considerably shorter than the sampling period tspl. the interrupt which completes just before t = ntspl calculates the feedback signal $i^f_n$ as the average value of successive current samples within the past pwm period.
with texe << tspl, the feedback signal can be approximated by the function of the samples in-2, in-1, and in-0, as i f n= 0.25(in2+2in-1+in). the current controller derives the current error and calculates the voltage command un, expressed in terms of the pulse widths ton in corresponding phases. the values are ready before ntspl, and they are reloaded into the pwm peripheral at instant t = ntspl. in this way, transport delay is reduced as un determines the average voltage on an interval [ntspl .. (n+1)tspl]. both the schedule of fig. 8 and the schedule of fig. 9 are considered in this section. 4.1. the schedule with the control interrupt executed after the counter event. in this section, the schedule of fig. 8 is considered, where the current controller collects the feedback i f n-1 as 0.25(in-3+2in-2+ in-1), calculates the voltage command un-1, which, in turn, determines the average voltage on an interval [ntspl .. (n+1)tspl]. the 3 output currents (ia, ib, and ic) can be converted into  frame of reference and expressed in terms of their components i and i. the current can be expressed as a vector i  =i + ji. by introducing  =exp(-rtspl/l), and assuming that the slowly changing emf can be neglected, the difference equation which describes the change of the output current becomes high performance digital current control in three phase electrical drives 665 1 1 1 . n n n i i u r          (3) introducing the complex images i(z) and u(z) by 0 0 ( ) , ( ) , k k k k i z i u z u        the transfer function wp(z) of the plant can be obtained by 0 ( ) 1 1 ( ) ( ) ( ) mf p e i z w z u z r z z        . (4) the structure of the current controller can be determined by applying the imc principle on wp, 1 1 ( ) 1 c p q z w z w z z      , where q is adjusted to make wc feasible, while  is design parameter that determines the response speed. 
applied to (4), the imc concept results in a pi controller, with both the p and i gains determined by $\alpha$. decoupled variation of the gains provides an additional degree of freedom which helps in meeting the desired performance. the current controller with the proportional action $k_p$ and the integral action $k_i$ can be described by the following transfer function,
$$w_c(z) = k_p + k_i\,\frac{z}{z-1}. \qquad (5)$$
with transfer functions (2)-(5), the block diagram of the current controller is given in fig. 7, where emf is assumed to be a slow, external disturbance, the effects of which are reduced by the integral action of the controller.
fig. 9 execution of the control interrupt just before the pwm carrier reaches the next zero-count or period-count. the interrupt must start at least texe before the rollover of the pwm carrier. the worst case execution of the interrupt should not exceed texe.
the open loop transfer function $w_{ol} = w_c w_p w_f$ is equal to
$$w_{ol}(z) = \frac{(k_p + k_i)\,z - k_p}{z-1}\cdot\frac{1-\beta}{r}\cdot\frac{1}{z\,(z-\beta)}\cdot\frac{z^2 + 2z + 1}{4 z^2}. \qquad (6)$$
introducing the relative gains p and i,
$$p = k_p\,\frac{1-\beta}{4r}, \qquad i = k_i\,\frac{1-\beta}{4r}, \qquad (7)$$
the open loop transfer function becomes
$$w_{ol}(z) = \frac{[(p+i)\,z - p]\,(z^2 + 2z + 1)}{z^3\,(z-1)(z-\beta)}. \qquad (8)$$
the closed loop transfer function is
$$w_{cl}(z) = \frac{i(z)}{i^*(z)} = \frac{w_c w_p}{1 + w_{ol}} = \frac{4(p+i)\,z^3 - 4p\,z^2}{z^5 - (1+\beta)\,z^4 + (\beta + p + i)\,z^3 + (p + 2i)\,z^2 + (i - p)\,z - p}. \qquad (9)$$
the optimum gain setting and the resulting performance are discussed in section 5.
4.2. the schedule with the control interrupt executed before the counter event
if the execution time of the control interrupt represents a negligible fraction of the sampling time, then the delay texe can be neglected in fig. 9. in this case, the feedback signal $i^f_n$ is obtained as $0.25(i_{n-2} + 2i_{n-1} + i_n)$, and it is used to obtain the voltage command $u_n$, which gets applied on the interval [ntspl .. (n+1)tspl].
in this case,
$$i_{n+1} = \beta\, i_n + \frac{1-\beta}{r}\, u_n, \qquad (10)$$
which leads to the transfer function $w_p(z)$ of the plant
$$w_p(z) = \left.\frac{i(z)}{u(z)}\right|_{emf=0} = \frac{1-\beta}{r}\cdot\frac{1}{z-\beta}. \qquad (11)$$
the open loop transfer function $w_{ol} = w_c w_p w_f$ of the system in fig. 9 becomes
$$w_{ol}(z) = \frac{[(p+i)\,z - p]\,(z^2 + 2z + 1)}{z^2\,(z-1)(z-\beta)}. \qquad (12)$$
the closed loop transfer function becomes
$$w_{cl}(z) = \frac{i(z)}{i^*(z)} = \frac{w_c w_p}{1 + w_{ol}} = \frac{4(p+i)\,z^3 - 4p\,z^2}{z^4 + (p + i - 1 - \beta)\,z^3 + (\beta + p + 2i)\,z^2 + (i - p)\,z - p}. \qquad (13)$$
due to the reduced delays, the order of the system is reduced from the 5th down to the 4th, providing the potential to improve the bandwidth and robustness of the system.
4.3. decoupling of d-axis and q-axis
the plant has two inputs, the voltages in the d-axis and q-axis. the two outputs are the corresponding currents. therefore, it is a multi input, multi output system (mimo). in cases where the transient phenomena in the orthogonal axes are decoupled, it is possible to design and tune the current controller in a simplified way, considering a single input, single output system (siso). adopting the complex vector notation, where the current error is expressed as $\Delta i = \Delta i_d + j\Delta i_q$, while the output current is $i = i_d + j i_q$, the product $w_c w_p = i(z)/\Delta i(z)$ in fig. 9 is a complex number. in cases where the axes are coupled, this number has a non-zero imaginary part. in cases where the controller $w_c$ cancels the undesired dynamics of the plant $w_p$ and achieves complete decoupling, the transfer function $w_c(z)w_p(z)$, as well as the closed loop transfer function $i(z)/i^*(z)$, do not have an imaginary part. one such example is given in [11, iv.e], where equation (12) of [11] represents the resulting closed loop transfer function. in such cases, it is possible to adopt the siso approach in parameter tuning.
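the difference between the schedules of sections 4.1 and 4.2 can be explored with a time-domain sketch of the single-axis (siso) loop: the first-order plant difference equation, the three-point feedback average, and a pi controller of the form kp + ki·z/(z-1). this is a minimal sketch and not the tuning procedure of section 5. the gain choice (controller zero placed on the plant pole, scaled by a hypothetical factor alpha) and the values of r, l and tspl are assumptions made only for illustration.

```python
import math


def simulate(r, l, t_spl, alpha, extra_delay, steps=1000, i_ref=1.0):
    """Step response of the single-axis current loop with averaged feedback.

    extra_delay = 1: schedule of fig. 8, u[n-1] acts on [n tspl, (n+1) tspl].
    extra_delay = 0: schedule of fig. 9, u[n] acts on [n tspl, (n+1) tspl].
    """
    beta = math.exp(-r * t_spl / l)
    gain = (1.0 - beta) / r
    # assumed gain choice: controller zero kp / (kp + ki) equal to beta
    k_i = alpha * r
    k_p = k_i * beta / (1.0 - beta)
    i = [0.0, 0.0, 0.0]                 # i[n-2], i[n-1], i[n]
    u_hist = [0.0] * (extra_delay + 1)  # past voltage commands
    integral = 0.0
    out = []
    for _ in range(steps):
        feedback = 0.25 * (i[0] + 2.0 * i[1] + i[2])  # three-point average
        error = i_ref - feedback
        integral += error                       # integral action ki * z / (z - 1)
        u_hist.append(k_p * error + k_i * integral)
        u_applied = u_hist[-1 - extra_delay]    # command that reaches the pwm
        i_next = beta * i[2] + gain * u_applied
        i = [i[1], i[2], i_next]
        out.append(i_next)
    return out


# assumed example values: r = 1 ohm, l = 5 mH, tspl = 50 us, alpha = 0.2
after_event = simulate(1.0, 5e-3, 50e-6, 0.2, extra_delay=1)   # fig. 8 schedule
before_event = simulate(1.0, 5e-3, 50e-6, 0.2, extra_delay=0)  # fig. 9 schedule
```

with these assumed values both loops settle at the reference. the two responses can be compared sample by sample to see the effect of removing one sampling period of delay, and a larger alpha can be tried to observe how the stability margin shrinks sooner with the schedule of fig. 8.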
some key considerations on decoupling and tuning of the current controllers are given in [8], [11], [12], [14], and [19]. the axis decoupling can be obtained by using the s-domain imc approach, as shown in [5-7]. in the absence of additional delays, the approaches of [5-7] would provide a decoupled operation of the current controller. yet, any digital implementation of the current controller is time-discrete, and it involves additional time delays. there are a number of valuable contributions that deal with the current controller in s-domain, and they provide useful insights to readers [12], [14], [19]. in s-domain, the transport delays have to be modeled by a rational approximation, most frequently by the pade approximation. at the same time, discrete-time integrators are represented by the tustin approximation. designing and tuning digital current controllers in s-domain is limited by the accuracy of representing the discrete-time phenomena in s-domain. the errors become more visible as the frequency comes close to the desired bandwidth, which is close to 20% of the pwm frequency in cases with regular-sampling-double-update. in addition, the rational approximation of delays also introduces non-minimum phase phenomena that do not correspond to the behavior of the actual system. the above mentioned problems do not exist in cases where the controller design relies on direct digital design, and in particular on the implementation of the imc concept in z-domain. in such cases, the errors introduced by the s-model representation of discrete-time phenomena are absent. in [19], the current controller is designed in s-domain, with approximation of discrete-time phenomena. the transport delay is equal to 3/2 of the sampling period, and it is approximated by the pade approximation. this approximation has a phase error of 1 degree at 10% of the sampling frequency.
the error rises to 8 degrees at 20% of the sampling frequency. the phase shift of the relevant vectors due to delay is compensated by introducing a "lead" compensation which rotates the vectors by an angle of tdelay. the integrators are represented by tustin approximation. notwithstanding the decoupling measures, some cross-coupling effects remain due to delays. remaining cross coupling is seen from nondiagonal elements. these cross coupling effects are seen in differential equation [19, eq. 17] and the plant transfer function [19, eq. 18]. the cross coupling is greatly reduced by mimo design and a systematic procedure for an accurate tuning of the pi controller. the non-minimal phase due to numerator of [19, eq. 23] causes some inaccuracy at the very beginning of transients. 668 lj. s. peric, s. n. vukosavic 4.4. direct digital design with z-domain imc the effects caused by approximations inherent to s-domain design are also seen in the first four controllers considered in [11]. in the design iv.e of [11], where the approximations are not used and the imc design is applied in z-domain, the open loop transfer function wol(z) = idq(z)/idq(z) and the closed loop transfer function wcl(z) = idq(z)/i * dq(z) do not have the cross coupling terms, and do not include delay-dependent factors. the function wcl(z) in [11, eq. 12] represents the ratio between the output current idq(z) = id(z) +j iq(z) and the corresponding reference. the imaginary part of this wcl(z) is equal to zero. therefore, the q axis output current is not affected by the d axis reference, and vice versa. the input step response obtained with the current controller of [11, iv.e] does not depend on the excitation frequency, proving the effective decoupling. it has to be noticed at this point that the above conclusions consider the closed loop transfer function. the same does not hold for the disturbance transfer function, which relates the output to the voltage disturbance. 
the imc-designed controller wc of fig. 9 is multiplied by wp to obtain the product wpwc, which is free from any coupling terms. the electromotive force in fig. 9, however, acts between wp and wc. as a result, the disturbance transfer function comprises the factor wp on its own, without being multiplied by wc. thus, the undesired coupling is not canceled in the disturbance transfer function, which depends on the fundamental frequency even in systems with ddc-imc designed controllers. the same holds for all the competitive current controllers [14], where the response of the output current to changes in the electromotive force depends on the fundamental frequency. the disturbance response of the current controller is of considerable importance, but it falls outside the scope of this paper. at the same time, the electromotive force in synchronous permanent magnet motors is the product of the constant flux of the magnets and the rotor speed. therefore, the changes in the electromotive force are determined by the speed changes, which are considerably slower than the current loop transients. in grid connected inverters, where the electromotive force is replaced by the mains voltage, the disturbance transfer function is of particular importance. namely, it describes the capability of the current loop to reject the low order harmonics of the grid and prevent them from introducing distortion in the output current.

4.5. the ratio between proportional and integral gains

the imc design in s-domain results in wc(s) = r/s + l + jωl/s. the ratio between the proportional gain l and the integral gain r is defined by the electrical time constant l/r, and it has to be maintained in order to cancel the undesired plant dynamics. direct digital design and the use of the imc method in z-domain results in the current controller given in [11, fig. 10] and [11, eq. 11]. it has the proportional gain of l/tspl and the integral gain of r.
hence, the ratio between the proportional and integral gain has to remain equal to (1/tspl)(l/r) in order to maintain the desired decoupled operation. within the closed loop transfer function wcl, the pulse transfer of the controller wc is multiplied by the plant wp to obtain the product wcwp= idq(z)/idq(z). with ddc-imc design [11, iv.e], wcwp does not have any coupling terms. relying on that, it is possible to add the term wf(z) in the feedback path and to perform the gain setting assuming that the coupling between the orthogonal axis does not exist. as long as the function wf(z) does not introduce the coupling terms of its own, the closed loop gain wcwpwf and the closed loop transfer function wcwp/(1+wcwpwf) will remain coupling-free. high performance digital current control in three phase electrical drives 669 the previous conclusions can be applied to fig. 9, where ddc-imc designed controller wc multiplies the plant transfer function wp and provides the direct-path gain wpwc with no diagonal elements. with proper implementation, the transfer function wf of (2) does not introduce any coupling elements. therefore, the analysis and the parameter setting of the system with one-pwm-period averaging in the feedback path can be performed by adopting siso approach, as already done in previous subsections. while applying the siso design procedure, a particular attention has to be paid to the parameter setting procedure. namely, the choice of the proportional and integral gains is not free. in order to maintain the decoupled operation, the gains have to maintain the ratio (1/tspl)(l/r). 5. the optimum parameter setting design and tuning of digital current controllers has attracted considerable attention. a comprehensive and instructive summary of most relevant controllers is given in [11]. 
it includes several s-domain approaches to designing and tuning the digital current controllers, as well as the case with direct digital control (ddc) with z-domain implementation of the imc concept. while the other approaches introduce a number of s-domain approximations of discrete-time features, and therefore introduce additional coupling terms, the approximation-free ddc-imc concept provides a flawless decoupling. the controller wc comprises the basic proportional and integral actions along with several decoupling terms that include exp(jtspl) elements. therefore, parameter tuning procedure has to establish the proportional and integral gains that provide the desired response. in current controller design iv.e of [11], the open loop transfer function and the closed loop transfer function do not have the cross coupling terms, and they do not include delay-dependent factors. based upon that, we concluded that in the systems with ddc-imc designed controller, the feedback filtering and the parameter setting procedure can be performed on a simplified, more transparent and more intuitive bases. in order to keep decoupled response, it is necessary to maintain the ratio between the proportional and integral gains, which reduces the gain tuning to selecting one single parameter. performance of both closed loop transfer functions (9) and (13) depends on the closed loop gains p and i. with characteristic polynomials of the 4 th and the 5 th order, it is difficult to find analytical relation between the gains and the closed loop performance. parameter setting procedure proposed in this section envisages (a) definition of the performance criterion, (b) the search of the p-i plane for the point which offers the best performance. in order to obtain a fast, high-bandwidth response with well damped waveforms, the performance criterion includes the settling time t01. the value of t01 has to do with the closed loop step response. 
following the input step, the output of the system moves towards the target, and it settles on the target exponentially, or with some damped oscillations. after the time delay t01, the output error falls into a ±1% wide strip. following t01, the error does not leave the strip, unless another input disturbance is received. the interval t01 is called the "1% settling time", and it is of interest to keep it as small as possible. the settling time as a performance criterion is an effective way of discarding the responses which have a high bandwidth and a short rise time, but at the same time exhibit a poorly damped, oscillatory approach to the target value. in addition to the settling time, it is also of interest to evaluate the robustness of the controller. due to on-line changes in the system parameters, such as the ac grid impedances in a grid connected power converter, it is of interest to maintain the stability and the character of the response in the presence of variable parameters. the robustness of the system can be measured by the vector margin (vm), as effectively used in [11]. the vector margin is usually calculated from the open loop transfer function wol(z). for the given excitation frequency ω, the argument z is exp(jωtspl). while ω sweeps from 0 up to the nyquist frequency, the values of wol(z) are complex numbers which move in the complex plane and draw a graph. for the system stability, this graph must not pass through the point (-1, j0). the robustness can be judged from the minimum distance (radius) between the graph and the point (-1, j0). the value of the radius is called the vector margin. with vm < 0.5, one would expect an oscillatory response that is likely to pass into instability in the case of a significant parameter change. the motivation for using a larger vm comes from the fact that magnetic saturation in electrical machines has a considerable effect on the equivalent inductance of the stator winding.
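The two performance criteria described above, the 1% settling time and the vector margin, can be sketched numerically. The snippet below is only an illustration under stated assumptions: the function names are hypothetical, and the demo loop 0.5/(z - 1) is a made-up integrating plant, not the open loop transfer function of this paper.

```python
import cmath
import math


def settling_time(samples, target=1.0, band=0.01):
    """Return the first sample index after which the error stays inside
    the +/- band*|target| strip (the 1% settling time for band=0.01)."""
    last_outside = -1
    for n, value in enumerate(samples):
        if abs(value - target) > band * abs(target):
            last_outside = n
    return last_outside + 1


def vector_margin(w_ol, n_points=20000):
    """Minimum distance between the graph of w_ol(z) and the point (-1, j0).

    z = exp(j*theta) with theta = omega*T_spl swept over (0, pi], that is,
    up to the Nyquist frequency. theta = 0 is skipped because a loop with
    an integrator has a pole at z = 1.
    """
    margin = float("inf")
    for k in range(1, n_points + 1):
        theta = math.pi * k / n_points
        z = cmath.exp(1j * theta)
        margin = min(margin, abs(1.0 + w_ol(z)))
    return margin


# Hypothetical loop used only to exercise the functions (an assumption,
# not a transfer function taken from the paper).
demo_vm = vector_margin(lambda z: 0.5 / (z - 1.0))
demo_t01 = settling_time([0.0, 0.5, 0.995, 1.0, 1.0])
```

For this hypothetical loop the closest approach to (-1, j0) occurs at the Nyquist frequency. The settling time is returned in samples, so it has to be multiplied by the sampling period to obtain a time.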
in this paper, there are two search runs for the optimum parameters. the first search assumes that the feedback gains p and i can be changed independently, while the second search respects the need to maintain a constant p/i ratio. 5.1. parameter search in p-i plane parameter search performed in this subsection assumes that the feedback gains p and i can be changed independently. in other words, it is assumed that the ratio p/i does not have to be kept constant. the optimum gains p and i are searched for the closed loop transfer functions of (9) and (13). the space where the optimum gains are searched is a domain in the 2-dimensional p-i space, limited by p > 0, i > 0, and by p < 1, i < 1. the search method is rather simple, it starts by selecting a large number of equally spaced discrete gains along both axes, it proceeds by calculating the performances for each pair of the gains (p, i), and ends by selecting best pair of gains according to design criteria. the optimum gains are searched for the execution schedule of figs. 7 and 8. in both cases, the search method provided the optimum gains (p, i), the frequency f45 where the phase of wcl drops to -45 o , the frequency fbw where the amplitude of wcl drops to -3db, the vector margin vm, and the overshoot of the step response. all the results are obtained with fpwm = 10khz. the results are summarized in table 1. in cases with vm > 2/3 (p < 0.0777 in table 1), the character of the closed loop response is maintained for a wide range of parameter changes. the step responses are compared in fig. 10. it has to be noted in table 1 that, although the ratio p/i is not fixed, the ratio between the optimum gains remains close to (1/tspl)(l/r). the ratio p/i is equal to 119 for the schedule of fig. 8, and 131 for the schedule of fig. 9. for the given motor, the ratio (1/tspl)(l/r), required for the decoupled operation is equal to 144. 
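The search over equally spaced gain pairs described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the loop simulated here is a toy model (an integrating plant with one sample of delay) chosen as an assumption, it does not reproduce the closed loop transfer functions (9) or (13), and the only criterion evaluated is the 1% settling time. All function names are hypothetical.

```python
def settling_time(samples, target=1.0, band=0.01):
    """First sample index after which the error stays within +/- band."""
    last_outside = -1
    for n, value in enumerate(samples):
        if abs(value - target) > band * abs(target):
            last_outside = n
    return last_outside + 1


def step_response(p, i, n_steps=400):
    """Unit-step response of a toy loop (assumption, not eq. (9) or (13)):
    plant y[n+1] = y[n] + u[n-1], driven by a PI controller (gains p, i)."""
    y, integral, u_delayed = 0.0, 0.0, 0.0
    out = []
    for _ in range(n_steps):
        out.append(y)
        error = 1.0 - y
        integral += i * error
        u = p * error + integral
        y += u_delayed          # the plant sees the command one sample late
        u_delayed = u
        if abs(y) > 1e6:        # unstable gain pair: pad and stop early
            out.extend([y] * (n_steps - len(out)))
            break
    return out


def grid_search(n_grid=19):
    """Scan equally spaced (p, i) pairs inside 0 < p < 1, 0 < i < 1 and
    return (t01, p, i) for the pair with the shortest settling time."""
    best = None
    for a in range(1, n_grid + 1):
        for b in range(1, n_grid + 1):
            p = a / (n_grid + 1)
            i = b / (n_grid + 1)
            t01 = settling_time(step_response(p, i))
            if best is None or t01 < best[0]:
                best = (t01, p, i)
    return best


t01_best, p_best, i_best = grid_search()
```

A fuller version would also evaluate the vector margin and the overshoot for every pair and reject the pairs that violate the chosen limits before ranking by settling time.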
hence, in a way, the search procedure finds the optimum close to the area which secures decoupled operation. it is of interest to perform the search procedure where the ratio p/i is kept constant and equal to (1/tspl)(l/r). 5.2. parameter search with constant p/i ratio although there are two gains, p and i, they have to maintain the same ratio, determined by parameters l and r, in order to preserve the proper decoupling between d and q axis [11], [14]. therefore, it is of interest to consider the changes of the gain p, assuming that the ratio p/i remains unaltered and equal to (1/tspl)(l/r). in this case, the search results are high performance digital current control in three phase electrical drives 671 given in table 2. with an overshoot of 2.64%, the closed loop bandwidth reaches 20% of the switching frequency. in table 2, the gain p sweep from 100% to 150% of the optimum value makes the overshoot increase from 3.4% up to 22%, while the closed loop bandwidth increases from 21% up to 33% of the switching frequency. starting with the optimum gain setting, the gain reduction of 50% leads to an overshoot of 0%, while the closed loop bandwidth drops to 7% of the switching frequency. stability limit is reached with the gain equal to 410% of the optimum value. comparable state-of-the-art solutions are summarized in [11] and [14]. they do not use one-period-averaging, and rely instead on regular-sampling-double-update with one sample in each tpwm/2. in table 1 of [11], the closed loop bandwidth of a well damped, low overshoot response reaches 10% of the sampling frequency (20% of the switching frequency). in figs. 12-14 of [14], a well damped, low overshoot response has the rise time of 5-6 sampling times. the corresponding bandwidth is, roughly, 0.35/(5ts) = 0.07fs, (14% of the switching frequency). solution devised in this paper has an additional delay, caused by the feedback averaging. 
with the devised control methods, the consequential closed loop bandwidth of table 2 is better than with comparable solutions.

table 1 performance factors obtained with the optimum gains and with the execution schedule of figs. 7 and 8. parameter search is performed in two dimensional space, with unconstrained proportional and integral gains.

schedule   p        i         f45 [hz]   fbw [hz]   vm      overshoot [%]   t01
fig. 8     0.0442   0.00037   541        1177       0.677   0.84            21
fig. 9     0.0708   0.00054   994        1882       0.705   0.67            9

table 2 parameter search is performed for the schedule of fig. 9, and for fixed ratio between the proportional and integral gains.

schedule   p       fbw [hz]   vm      overshoot [%]
fig. 9     0.065   1607       0.722   0.42
fig. 9     0.067   1687       0.715   0.75
fig. 9     0.071   1862       0.701   1.61
fig. 9     0.075   2005       0.689   2.64
fig. 9     0.077   2116       0.679   3.45
fig. 9     0.081   2252       0.668   4.8
fig. 9     0.086   2474       0.648   6.8
fig. 9     0.091   2618       0.636   8.4
fig. 9     0.095   2753       0.623   10.2
fig. 9     0.1     2912       0.607   12.1
fig. 9     0.116   3382       0.553   22

6. experimental results

the experimental verification of the two current controllers is performed on an experimental setup which comprises a synchronous motor with surface mounted magnets, a pwm inverter and a digital control platform. the stack length of the motor is l = 128mm, and it has 6 poles. the rated torque is 7.3 nm while the rated current is 7.3 arms. the motor has the stator resistance of 0.47 Ω and the inductance of 3.4 mh. the pwm inverter has the dc-bus voltage of edc = 520v, and it has the switching frequency of 10khz. the rated lockout time is set to 3 μs. the digital control platform uses tms320f28335 dsp. it has the adc unit with 12-bit resolution and with 16 input channels. the oversampling mechanism acquires 32 samples per base period. the sampling and storing are automated by the embedded dma machine.
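The arithmetic behind averaging the 32 oversampled values of one period can be sketched as below. This is only an illustration under assumptions: the function name and the synthetic test signal (a constant level plus a ripple with zero mean over the period) are hypothetical, and the sketch says nothing about how the dsp and the dma actually organize the data.

```python
import math


def period_average(samples, n_per_period=32):
    """Average consecutive blocks of n_per_period samples.

    Returns one value per complete block; a trailing partial block
    is ignored.
    """
    if n_per_period <= 0:
        raise ValueError("n_per_period must be positive")
    n_blocks = len(samples) // n_per_period
    return [
        sum(samples[k * n_per_period:(k + 1) * n_per_period]) / n_per_period
        for k in range(n_blocks)
    ]


# Synthetic signal (assumption): 5.0 plus a ripple that completes exactly
# one cycle in each block of 32 samples, over two blocks.
signal = [5.0 + math.sin(2.0 * math.pi * k / 32) for k in range(64)]
feedback = period_average(signal)
```

Because the ripple completes a whole cycle within each block, it cancels in the average and only the constant level remains in the feedback values.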
the anti-aliasing filters are designed as passive rc filters, using the standard procedures, and also taking into account the fact that the effective sampling frequency is 32 time larger. with one-pwm-period-averaging, the effective nyquist frequency is increased 32 times, as well as the desired cutoff frequency of the analog anti-aliasing filters. therefore, it is possible to design a passive anti-aliasing filter that would remove any residual noise, while having the cutoff frequency considerably above the desired bandwidth. thus, the impact of such anti-aliasing filter on phase lag, delays and the closed loop dynamics is negligible, and it has no detrimental effects on the closed loop response. in most reports on digital ac current controllers, the experimental waveforms of the output current in dq frame are calculated by the dsp controller, and then written on a dac or copied into a pc. similar procedure is not feasible with the present system of fig. 9, where the output current i dq (z) gets filtered through the block wf(z) to obtain the feedback signal if dq (z). it has to be noted at this point that the only signal available to the dsp controller is the feedback signal. for that reason, the output current i dq (z) cannot be observed from the registers of the dsp controller. the only way to access the output current instead of the average feedback is direct measurement of the actual motor current. the phase current does reflect the changes in id(t) and iq(t), but it is also affected by the rotor position. therefore, experimental traces in fig. 11 are obtained with the rotor locked in position where the measured phase current gets equal to the q-axis current. it has to be notice though that this approach does not allow the experimental verification of the axes decoupling at high speeds. 
in this regard, the authors rely on the analytical and experimental findings of [11], which considers the direct digital design and the implementation of the imc approach in z-domain, the approach also used in this paper. results of [11] prove that any coupling is removed, while the input-step response does not depend on the excitation (fundamental) frequency. the execution of the control interrupt takes 3.5s on the selected dsp platform. therefore, for the scheduling scheme of fig. 9, the interrupts were triggered 4s prior to each pwm counter event. experimental traces are obtained in fig. 11, showing a reasonable similarity with the simulation results shown in fig. 10. all the measurements were done at the zero speed, with the rotor locked in position where the measured phase current corresponds to the q axis current. fig. 10 step response with execution fig. 11 experimental step responses schedules outlined in figs. 7 and 8. with schedules of figs. 7 and 8. high performance digital current control in three phase electrical drives 673 7. conclusions this paper deals with practical implementation of digital current controllers, which represent the key elements in majority of contemporary static power converters. high performance current controllers are required in the electrical drives and also in grid connected converters. a high bandwidth and considerable robustness are of uttermost importance in all applications of digital current controllers. in this paper, a novel approach to acquiring and filtering the feedback signal is proposed. devised approach is free from sampling errors and it introduces only a minimum delay into the feedback path. we also consider the transport delays in the voltage actuation path and the transport delays in the feedback path. proposed parameter setting procedures takes into account the delays, and it meets both the bandwidth requirements and the robustness against the noise and the parameter changes. 
proposed results are verified by simulation and also on an experimental setup comprising a brushless dc motor, a pwm inverter and a dsp-based control platform. for the relative gain p = 0.075, and for gain ratio p/i determined by the imc procedure, the closed loop bandwidth reaches 20% of the switching frequency (that is, 10% of the sampling frequency) with an overshoot of 2.64% and with a vector margin of 0.689. the system parameters can to change more than 4 times before entering the instability region. devised control solutions have the potential of reducing the noise sensitivity and improving the closed loop performance of digital current controllers applied in 3 phase ac drives and in grid connected power converters. references [1] e. levi, “foc: field oriented control,” in the industrial electronics handbook, (power electronics and motor drives), 2 nd ed., boca raton, fl, usa: crc press, 2011. [2] d. g.holmes, b. p. mcgrath, and s. g. parker, “current regulation strategies for vector-controlled induction motor drives,” ieee trans. ind. electron., vol. 59, no. 10, pp. 3680–3689, oct. 2012. [3] j.-w. choi and s.-k. sul, “fast current controller in three-phase ac/dc boost converter using d-q axis crosscoupling,” ieee trans. power electron., vol. 13, no. 1, pp. 179–185, jan. 1998. [4] j.-w. choi and s.-k. sul, “new current control concept minimum time current control in the threephase pwm converter,” ieee trans. power electron., vol. 12, no. 1, pp. 124–131, jan. 1997. [5] l. harnefors and h. p. nee, “model-based current control of ac machines using the internal model control method,” ieee trans. ind. appl., vol. 34, no. 1, pp. 133–141, jan./feb. 1998. [6] f. briz, m. degner, and r. lorenz, “dynamic analysis of current regulators for ac motors using complex vectors,” ieee trans. ind. appl., vol. 35, no. 6, pp. 1424–1432, nov./dec. 1999. [7] f. briz, m. degner, and r. lorenz, “analysis and design of current regulators using complex vectors,” ieee trans. ind. 
appl., vol. 36, no. 3, pp. 817–825, may/jun. 2000. [8] j.-s. yim, s.-k. sul, b.-h. bae, n. patel, and s. hiti, “modified current control schemes for highperformance permanent-magnet ac drives with low sampling to operating frequency ratio,” ieee trans. ind. appl., vol. 45, no. 2, pp. 763–771, mar./apr. 2009. [9] j. holtz, j. quan, j. pontt, j. rodriguez, p. newman, and h. miranda, “design of fast and robust current regulators for high-power drives based on complex state variables,” ieee trans. ind. appl., vol. 40, no. 5, pp. 1388–1397, sep./oct. 2004. [10] b.-h. bae and s.-k. sul, “a compensation method for time delay of full digital synchronous frame current regulator of pwm ac drives,” ieee trans. ind. appl., vol. 39, no. 3, pp. 802–810, may/jun. 2003. [11] h. kim, m. degner, j. guerrero, f. briz, and r. lorenz, “discrete-time current regulator design for ac machine drives,” ieee trans. ind. appl.,vol. 46, no. 4, pp. 1425–1435, jul./aug. 2010. [12] d. g. holmes, t. a. lipo, b. mcgrath, and w. kong, “optimized design of stationary frame three phase ac current regulators,” ieee trans. power electron., vol. 24, no. 11, pp. 2417–2426, nov. 2009. [13] h.-t. moon, h.-s. kim, and m.-j. youn, “a discrete-time predictive current control for pmsm,” ieee trans. power electron., vol. 18, no. 1, pp. 464–472, jan. 2003. 674 lj. s. peric, s. n. vukosavic [14] a. g. yepes, a. vidal, j. malvar, o. lopez, j. d. gandoy, "tuning method aimed at optimized settling time and overshoot for synchronous proportional-integral current control in electric machines," ieee trans. power electron., vol. 29, no. 6, pp. 3041–3054, jun. 2014. [15] s. h. song, j. w. choi, s. k. sul, "current measurements in digitally controlled ac drives," ieee ind. appl. magazine, vol. 6, no. 4, pp. 51-62, jul./aug. 2000. [16] l. r. carley "an oversampling analog-to-digital converter topology for high resolution signal acquisition systems", ieee trans. circuits sys., vol. cas-34, pp.83 -90 1987 [17] a. said, a. h. 
kamal, "a modeling technique to analyze the impact of inverter supply voltage and cable length on industrial motor-drives," ieee trans. power electron., vol. 23, no. 2, pp. 753–762, mar. 2008. [18] l. corradini, w. stefanutti, and p. mattavelli, "analysis of multi-sampled current control for active filters," ieee trans. ind. appl., vol. 44, no. 6, pp. 1785-1794, nov./dec. 2008. [19] f. d. freijedo, a.vidal, a. g. yepes, j. m. guerrero, o. lopez, j. malvar, and j. dovai-gandoy," “tuning of synchronous-frame pi current controllers in grid-connected converters operating at a low sampling rate by mimo root locus”, ieee trans. ind. electron., vol. 62, no. 8, pp. 5006-5017, aug. 2015. [20] s. n. vukosavic, s. l. peric, and e. levi, “ac current controller with error-free feedback acquisition system,” ieee trans. energy convers., accepted for publication, doi 10.1109/tec.2015.2477267. [21] s. n. vukosavic, s. l. peric, “a modified digital current controller with reduced impact of transport delays,” iet electric power appl., under review (epa-2015-0507). http://dx.doi.org/10.1109/tec.2015.2477267 facta universitatis series: electronics and energetics vol. 30, n o 4, december 2017, pp. 627 638 doi: 10.2298/fuee1704627a a novel supply voltage compensation circuit for the inverter switching point alexandru-mihai antonescu, lidia dobrescu faculty of electronics, telecommunication and information technology, university “politehnica” of bucharest, romania abstract. the present work proposes an innovative circuit that is able to compensate the inverter switching point voltage variation due to supply voltage change. the circuit is designed to work for a 1.6v to 2v supply voltage range. the operation principle includes the back gate effect and an original transistor switching. key words: adaptive threshold, inverter, back gate effect 1. introduction increasing modern circuits working frequency precision and low current consumption are important prerequisites for a modern design. 
the logic gate delays are used for periodical signal generation and time synchronizing. at high speed, the gate delay approach is to be considered, due to its low power consumption, reduced area, simplicity in design and large integration. when using logic gates as delays, the gate delay is proportional with the supply voltage variation. for a ring oscillator case, the frequency lowers as the supply voltage rises due to the fact that the stage capacitors are charging to higher voltage. the inverter schematic is exposed in figure 1. the usual approach is to design the inverter switching point to be half of the supply voltage. the switching point of the inverter is denoted with . the transfer characteristic for the former stated case is shown in figure 2. received january 24, 2017; received in revised form may 18, 2017 corresponding author: lidia dobrescu faculty of electronics, telecommunication and information technology, university “politehnica” of bucharest, 1-3 iuliu maniu boulevard, district 6, bucharest, romania (e-mail: lidiutdobrescu@yahoo.com) fig. 1 cmos inverter schematic 628 a. antonescu, l. dobrescu fig. 2 cmos inverter transfer characteristic [1] in region 1 of the transfer characteristic of the inverter m2 is in on state and m1 in off state. as the input voltage rises over the m1 threshold voltage, the transistor enters conduction state and m2 remains in on state. this is represented as the entering point in region 2. as the input voltage further increases, m1 turns on completely. at half of the second region represented with c is the switching point. in the second region, both transistors are in saturation state. as the circuit is leaving the second region, m2 transistor starts to turn off and, eventually, it reaches the full off state as it enters the third region. for the design of half the supply voltage switching point, both devices have the same drain current. 
under this assumption, equation 1 can be written [1]:

$$\frac{\beta_n}{2}\,(V_{SP}-V_{THN})^2=\frac{\beta_p}{2}\,(V_{DD}-V_{SP}-V_{THP})^2 \qquad (1)$$

after some equation processing, one can find equation 2 [1], which gives the switching point voltage of the inverter; βn and βp are the nmos and pmos transistor transconductance parameters, vthn and vthp are the nmos and pmos device thresholds:

$$V_{SP}=\frac{\sqrt{\beta_n/\beta_p}\;V_{THN}+(V_{DD}-V_{THP})}{1+\sqrt{\beta_n/\beta_p}} \qquad (2)$$

the ratio between the pmos and nmos devices geometry can be estimated using equation 3, where x is a factor between 2 and 4, depending on the technology used:

$$(W/L)_p = x\,(W/L)_n \qquad (3)$$

2. controlling the inverter switching point

if the inverter is at the switching point, rp1 and rp2 form a resistor divider. the output resistances of the nmos and pmos devices are given by equations 4 and 5, where λ is the channel-length modulation parameter:

$$r_{on}=\frac{1}{\lambda\,\frac{\beta_n}{2}\,(V_{GS}-V_{THN})^2} \qquad (4)$$

$$r_{op}=\frac{1}{\lambda\,\frac{\beta_p}{2}\,(V_{SG}-V_{THP})^2} \qquad (5)$$

continuing with the resistor divider analogy, the inverter switching point is given by the following equation:

$$V_{SP}=V_{DD}\,\frac{r_{on}}{r_{tot}},\qquad r_{tot}=r_{on}+r_{op} \qquad (6)$$

an increase of the threshold of the nmos device will result in a decrease of the drain current and an increase of the device resistance. on the other hand, an increase of the absolute value of the pmos threshold will result in an increase of the device current and a decrease of the device resistance. the decrease of the pmos resistance will lead to the lowering of rtot; the value of vsp will rise. on the nmos side, increasing the device resistance will lead to the increase of rtot; vsp will go down. these statements are expressed in a simplified manner in equations 7 and 8:

$$V_{THN}\uparrow\;\Rightarrow\;I_{Dn}\downarrow\;\Rightarrow\;r_{dsn}\uparrow\;\Rightarrow\;V_{SP}\downarrow \qquad (7)$$

$$|V_{THP}|\uparrow\;\Rightarrow\;I_{Dp}\uparrow\;\Rightarrow\;r_{dsp}\downarrow\;\Rightarrow\;V_{SP}\uparrow \qquad (8)$$

the transistor threshold can be modified using the back gate effect, by regulating the bulk voltage accordingly, and sensing the threshold on an inverter with the output tied to the input.
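To make the supply dependence of the switching point concrete, equation 2 can be evaluated over the 1.6 V to 2 V supply range. The sketch below is only an illustration: the function name is hypothetical, vthp is taken as the magnitude of the pmos threshold, and the device values (matched transconductance parameters, 0.4 V thresholds) are assumptions, not data from the paper.

```python
import math


def switching_point(vdd, vthn, vthp, beta_ratio):
    """Switching point per equation 2.

    beta_ratio is beta_n / beta_p; vthp is the magnitude of the
    pmos threshold voltage.
    """
    k = math.sqrt(beta_ratio)
    return (k * vthn + (vdd - vthp)) / (1.0 + k)


# Hypothetical device values (assumptions): beta_n = beta_p, 0.4 V thresholds.
sweep = {vdd: switching_point(vdd, 0.4, 0.4, 1.0) for vdd in (1.6, 1.8, 2.0)}
spread = sweep[2.0] - sweep[1.6]
```

With these matched hypothetical values the switching point follows half of the supply voltage, so it moves by half of any supply change. This is the variation that the compensation circuit is meant to reduce.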
as stated in [2], the transistor threshold is influenced by the bulk voltage according to equation 9:

$$V_{TH}=V_{TH0}+\gamma\left(\sqrt{|2\phi_F+V_{SB}|}-\sqrt{|2\phi_F|}\right) \qquad (9)$$

in figure 3, the inverter schematic with bulk control voltages and the intrinsic device diodes is shown.

fig. 3 back gate controlled cmos inverter

in normal operation, the bulk is tied to the source and diodes dp1 and dn2 are shorted out. the other two diodes, dp2 and dn1, are reverse biased [5] and only a small current crosses them. when the source bulk voltage is applied, diodes dp1 and dn2 become forward biased and, if a sufficient voltage is applied on the bulk of the transistor, the diodes can enter conduction. this situation is illustrated in figure 4.

fig. 4 bulk diode forward biasing

as resulting from figure 4, the maximum back bias voltage should not exceed 600mv. for the sake of safe design, the back bias voltage will be designed not to exceed 550mv. the downside of bulk biasing is that a small current will flow across the forward biased bulk diode, no matter how small the bulk voltage is. also, the nmos device has to be isolated in a separate well. regarding the pmos transistor, this device is already isolated due to the nwell in which it is constructed. the good part is that the diodes will start to conduct only when having a drop of at least 0.6v. also, from the standpoint of area, all the nmos isolated devices can share the same well [5].

3. adaptive threshold circuit

3.1. modified inverter schematic

first, the variation of the inverter switching point with the supply voltage must be evaluated. then, the supply voltage range will be split into two domains, symmetrical around the centre value. for this particular case, the centre value for the supply voltage is 1.8v, the minimum and maximum being 1.6v and, respectively, 2v. the switching point voltage will be increased for the low range of the supply and decreased for the high range.
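The threshold adjustment available from the bulk follows from the body-effect relation of equation 9 together with the 550 mV design limit on the back bias. The sketch below only illustrates that relation: the function name, the parameter values (0.4 V zero-bias threshold, body factor of 0.4, surface potential term of 0.7 V) and the sign convention (a forward bulk bias of an nmos device entered as a negative source-to-bulk voltage) are all assumptions, not values from the paper.

```python
import math

MAX_BACK_BIAS = 0.55  # V, design limit quoted in the text


def threshold_with_body_bias(vth0, gamma, two_phi_f, vsb):
    """Threshold voltage per equation 9.

    vsb is the source-to-bulk voltage; a forward bias of the bulk diode
    is represented here (assumption) by a negative vsb.
    """
    if abs(vsb) > MAX_BACK_BIAS + 1e-12:
        raise ValueError("back bias exceeds the 550 mV design limit")
    return vth0 + gamma * (
        math.sqrt(abs(two_phi_f + vsb)) - math.sqrt(abs(two_phi_f))
    )


# Hypothetical device parameters (assumptions, not from the paper).
VTH0, GAMMA, TWO_PHI_F = 0.4, 0.4, 0.7
thresholds = {
    vsb: threshold_with_body_bias(VTH0, GAMMA, TWO_PHI_F, vsb)
    for vsb in (0.0, -0.25, -0.55)
}
```

With these hypothetical parameters the forward bias lowers the threshold monotonically, and the available shift is bounded by the bias limit, which is consistent with the need for an additional coarse adjustment when the supply range is wide.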
by referring to the original inverter schematic, at half the supply range the original switching point characteristic and the adjusted one will cross. simulations lead to the conclusion that the supply range is too large to be adjusted only by bulk regulation. therefore, the approach of switching a part of the original transistors for each range has been taken into consideration. a novel supply voltage compensation circuit for the inverter switching point 631 the new inverter schematic is shown in figure 5. fig. 5 modified inverter schematic a similar principle is proposed in [3]. this is done by using multiple fingers for the nmos transistor. these transistors, are floating gate type, and need to have their threshold programmed. by programming or erasing, there is added or subtracted one or several nmos devices. this method, although very effective, cannot be used to adjust continuously the inverter switching point. the inverter is made out of transistors p1, p3, n1, and n2; without switching (p2 and n3 are conducting), p1 and p3 are acting as an equivalent pmos transistor of 4w channel width, and n1, n2 as a nmos transistor of 2w channel width. p2 and n3 transistors are in charge of disconnecting transistors p3, respectively, n2, in the regions where the inverter threshold cannot be adjusted only by bulk regulation. the two transistor used as switches, are controlled by signals disc_p and disc_n. p3 represents ¼ of the total pmos width and n2 ½ of the nmos width. the bulks are common and yet to be externally controlled for p1 and p3 pair, and n1, n2 pair. by disconnecting p3 pmos transistors (p2 is off), the inverter threshold will be lowered. in the same manner, by disconnecting n2 nmos transistors (n3 is off), the inverter threshold will rise. the centre of the supply span is taken as reference. this is the reason that the 1.8 volt supply is the point where the circuit will switch from increasing the switching point voltage to decreasing it. 
at 1.8v, the circuit therefore switches from regulating the pmos bulk voltage to regulating the nmos bulk voltage. regulating both transistor bulks at the same time would not bring any functional improvement, because the effects are in opposite directions. switching p3 off (which has ¼ of the total pmos width) decreases the switching point voltage. on the other hand, switching n2 off (which has ½ of the total nmos width) increases the switching point voltage. this is the reason for the two extra sub-domains that control the switching of p3 and n2; they bring a rough adjustment where the body effect can no longer tune the vsp variation. to better describe the functionality of the circuit, the voltage domains are shown in figure 6.

fig. 6 threshold adjustment principle

perfect symmetry of the domains is desired but would be very hard to obtain. the goal is to lay out the complementary pmos and nmos transistors as a number of identical fingers having the same dimensions. also, the width of each finger should have at most one digit after the decimal point. there is no use in trying to obtain high precision when switching the extra transistors, because their effective length and width will change due to process variation. in the end, the maximum bulk voltage will be different between the pmos and nmos devices. this is also due to the different mobility and geometry of the complementary devices. the purpose of this circuit is to obtain a threshold with a low variation while keeping the bulk voltages under the maximum limits. the circuit needs two regulation loops, one for the pmos bulk and one for the nmos bulk, and a supply voltage monitoring block. the supply monitor is composed of 3 independent circuits, each signalling one of the voltage intervals shown in figure 6.

3.2. adaptive threshold circuit

the proposed adaptive threshold circuit is shown in figure 7.

fig.
7 adaptive threshold circuit

the bulk voltages are generated using regulated current to voltage converters [6]. the centre inverter, which contains the schematic shown in figure 5, is used as switching point reference. the pmos bulk voltage is regulated by the operational amplifier oa1, nmos transistor n3, resistor r1 and the ilim1 current source (the latter being a client of the general biasing mirror). oa1 compares the switching point voltage of inverter inv with the vthr_adj voltage. if vsp is lower, the vbulk_p voltage is increased and the threshold goes up. on the other hand, the nmos bulk voltage is regulated by oa2, pmos transistor p6 and current source ilim2. oa2 compares the vsp value with the vthr_adj voltage and regulates the vbulk_n node voltage in order to decrease the threshold. both ilim values are set to 5ua so that the voltage drop across r1 and r2 is limited to about 500mv. with a 10% variation of the bias current, the maximum bulk voltage will not exceed the safe operation area depicted in figure 4. the value of the vthr_adj voltage is the inverter switching point at half the supply domain; in this case, for a supply of 1.8v, it is 0.75v. setting this parameter is critical in order to obtain the lowest variation.

the supply monitor block has three outputs, comp_1v7, comp_1v8 and comp_1v9. the comp_1v7 signal controls the dis_n signal of the inverter, disconnecting the extra nmos transistor for a supply voltage under 1.7v and connecting it when the supply exceeds this limit. the comp_1v9 signal controls the dis_p signal, keeping the extra pmos connected for supply voltages under 1.9v and disconnecting it when this limit is exceeded. comp_1v8 controls which of the pmos or nmos bulks is regulated. for supply voltages under 1.8v, the pmos bulk is regulated, and over 1.8v, the nmos bulk is regulated.
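the behaviour of the supply monitor outputs and the bulk voltage limit described above can be summarised in a short behavioural sketch. it is only a model of the stated rules, not the circuit itself: the function and key names are hypothetical, the behaviour exactly at a threshold is an assumption, and the 100 kohm resistor value is inferred from the 5ua and 500mv figures rather than stated in the text.

```python
def decode_supply(vdd):
    """Behavioural model of the three supply-monitor outputs (vdd in volts)."""
    comp_1v7 = vdd > 1.7
    comp_1v8 = vdd > 1.8
    comp_1v9 = vdd > 1.9
    return {
        # extra NMOS is disconnected below 1.7V, connected above it
        "extra_nmos_connected": comp_1v7,
        # extra PMOS stays connected below 1.9V, disconnected above it
        "extra_pmos_connected": not comp_1v9,
        # PMOS bulk is regulated below 1.8V, NMOS bulk above it
        "regulated_bulk": "nmos" if comp_1v8 else "pmos",
    }


def max_bulk_drop(i_lim, r, tolerance):
    """Worst-case voltage drop across the bulk resistor for a bias tolerance."""
    return i_lim * (1.0 + tolerance) * r


for vdd in (1.65, 1.75, 1.85, 1.95):
    print(vdd, decode_supply(vdd))

# 5uA through an assumed 100 kohm resistor with +10% bias current variation
print(f"worst-case bulk drop: {max_bulk_drop(5e-6, 100e3, 0.10):.3f} V")
```

under the assumed resistor value, the +10% case gives the 550mv design limit quoted for the back bias voltage, below the 600mv maximum read from figure 4.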
the operational amplifiers have basically the same schematic; the only difference is that oa1 pulls its output down when disabled and oa2 pulls it up. this is done in order to switch off n3 and p6 when the opamps are off. in this manner, the pmos bulk is pulled up and the nmos bulk is pulled down by resistors r1 and r2. the supply monitor schematic is shown in figure 8.

fig. 8 supply monitoring circuit

the resistor divider has been drawn as four independent resistors (normally it contains several resistor fingers, each finger having the same number of squares and the same resistance value); the bias for the operational amplifiers is omitted in the picture. each of the amplifiers uses an external 1ua bias current. the resistor divider, composed of r1, r2, r3, r4 and the disabling transistor p1, scales each supply voltage threshold down to the value of the system voltage reference vbg=1.2v. the three opamps, oa1, oa2 and oa3, compare the two values (the corresponding tap of the resistor divider and the bandgap voltage reference) and switch the output to logic 1 when the resistor tap voltage exceeds the bandgap voltage. transistor p1 switches off the resistor divider in order to reduce current consumption when the circuit is disabled. in simulation, the reference voltage is provided by an ideal voltage source. the comparator schematic is depicted in figure 9.

fig. 9 comparator schematic

the comparator uses a trans-admittance amplifier. the topology was chosen for its capacitive load driving capability. pmos transistors p2 and p3 act as active loads for the differential input stage. the input offset is optimised by providing high matching between the differential pair bias currents. this is done by using a topology that offers high precision biasing for the differential input stage of the comparator.
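the resistor divider described above can be sized with a few lines of arithmetic: each tap must reach vbg=1.2v exactly when the supply crosses its threshold (1.7v, 1.8v or 1.9v). the sketch below is an illustration under stated assumptions: the total resistance is arbitrary, the on-resistance of the disabling transistor is neglected, and the segments are simply listed from ground upward without mapping them to the names r1 to r4, since the text does not give their order.

```python
V_BG = 1.2                    # bandgap reference, volts
THRESHOLDS = (1.7, 1.8, 1.9)  # monitored supply thresholds, volts


def divider_segments(r_total, v_ref=V_BG, thresholds=THRESHOLDS):
    """Return the four segment resistances, listed from ground upward."""
    # Tap position as a fraction of the supply; the 1.9V tap is the lowest.
    fractions = sorted(v_ref / v for v in thresholds)
    edges = [0.0] + fractions + [1.0]
    return [r_total * (hi - lo) for lo, hi in zip(edges, edges[1:])]


def tap_voltages(vdd, segments):
    """Voltages at the three taps (lowest first) for a given supply."""
    total = sum(segments)
    taps, acc = [], 0.0
    for r in segments[:-1]:
        acc += r
        taps.append(vdd * acc / total)
    return taps


segments = divider_segments(r_total=1.0e6)  # 1 Mohm total is an assumption
print([round(r) for r in segments])
print([round(v, 4) for v in tap_voltages(1.8, segments)])
```

at a 1.8v supply the middle tap sits at the bandgap voltage, so the corresponding comparator is exactly at its switching threshold, while the other two taps are below and above 1.2v.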
to further improve the schematic, the nmos current mirror (n7-n8), which regulates the current between the active load current mirrors (p1-p2 and p3-p4), is cascoded with transistors n5 and n6. the cascode topology also enhances the output resistance of the block.

3. simulation, results and discussion

figure 10 shows the circuit operation. the source-bulk and bulk-source voltages define the two operation modes.

fig. 10 inverter switching point and bulk voltages simulation

the regions where the back bias increases (pmos) and decreases (nmos) rapidly represent the switching points for the extra transistors. according to the simulation, the inverter switching point can be tuned by body effect alone only between 1.7v and 1.9v supply voltage. for the nmos bulk regulation, the bulk voltage almost reaches the maximum safe operation voltage. the simulation was done for a 1.6-2v supply voltage sweep, using cadence spectre. the temperature was set to 27ºc and the typical corner was used. finer adjustments can be made. for example, the extra nmos that is disconnected can have a slightly lower width. although this adjustment is possible, it would require working with transistor fingers of 0.5w or smaller, reaching the minimum technological size. comparing the original threshold with the adjusted one, it can be concluded that the circuit works as intended. figure 11 presents the difference between the original inverter, with all transistors working and no bulk regulation, and the same inverter with both bulk regulation and transistor switching.

fig. 11 comparison between the original and improved inverter switching point

the two characteristics nearly meet at the 1.8v supply voltage value.
there is a small offset because the original inverter had a switching point slightly higher than 0.75v at vdd=1.8v, while the regulated value was chosen to be 0.75v. also, when reaching the upper part of the supply range, the nmos bulk regulation cannot cope with the variation anymore. the same thing happens around half the designed supply voltage, but there the threshold decreases. the maximum values of some important signals, for the nominal 25ºc simulation, are summarized in table 1:

table 1 circuit relevant signals absolute values for the nominal corner

                  vsp_adj   vsp_orig   vsb_p    vbs_n
nom (vdd=1.8v)    0.7504    0.7572     0.0743   0
min               0.7433    0.6801     0        0
max               0.7601    0.8368     0.2902   0.5282
delta             0.0168    0.1567
delta [%]         2.24      20.7

the delta values in the table represent the total switching point variation, reported to the value at the central 1.8v supply voltage (0.75v). the adjusted switching point variation is almost ten times lower than the original one. the overall technological corner variation is depicted in table 2.

table 2 inverter switching point overall corner variation

detailed corner simulation results are given in table 3 and figure 12.

table 3 inverter switching point technological corners simulation results

temp. [°c]   corner     min. [v]   max. [v]   delta [v]   delta [%]
-40          fast       0.7409     0.7667     0.0258      3.44
-40          fast_hh    0.7326     0.7595     0.0269      3.59
-40          fast_hl    0.7326     0.7595     0.0269      3.59
-40          fast_lh    0.7484     0.7736     0.0252      3.36
-40          fast_ll    0.7484     0.7736     0.0252      3.36
25           fast       0.7389     0.7633     0.0244      3.25
25           fast_hh    0.7389     0.7545     0.0156      2.08
25           fast_hl    0.7389     0.7545     0.0156      2.08
25           fast_lh    0.7478     0.7717     0.0239      3.19
25           fast_ll    0.7478     0.7717     0.0239      3.19
85           fast       0.7474     0.7636     0.0162      2.16
85           fast_hh    0.7375     0.7531     0.0156      2.08
85           fast_hl    0.7375     0.7531     0.0156      2.08
85           fast_lh    0.7481     0.7739     0.0258      3.44
85           fast_ll    0.7481     0.7739     0.0258      3.44
-40          slow       0.7472     0.773      0.0258      3.44
-40          slow_hh    0.7391     0.7658     0.0267      3.56
-40          slow_hl    0.7391     0.7658     0.0267      3.56
-40          slow_lh    0.7491     0.7797     0.0306      4.08
-40          slow_ll    0.7491     0.7797     0.0306      4.08
25           slow       0.7487     0.7679     0.0192      2.56
25           slow_hh    0.7434     0.7593     0.0159      2.12
25           slow_hl    0.7434     0.7593     0.0159      2.12
25           slow_lh    0.749      0.7761     0.0271      3.61
25           slow_ll    0.749      0.7761     0.0271      3.61
85           slow       0.7483     0.7665     0.0182      2.43
85           slow_hh    0.7401     0.7563     0.0162      2.16
85           slow_hl    0.7401     0.7563     0.0162      2.16
85           slow_lh    0.7487     0.7765     0.0278      3.71
85           slow_ll    0.7487     0.7765     0.0278      3.71
-40          typ_hh     0.7356     0.7622     0.0266      3.55
-40          typ_hl     0.7356     0.7622     0.0266      3.55
-40          typ_lh     0.7488     0.7763     0.0275      3.67
-40          typ_ll     0.7488     0.7763     0.0275      3.67
25           typ_hh     0.7408     0.7565     0.0157      2.09
25           typ_hl     0.7408     0.7565     0.0157      2.09
25           typ_lh     0.7483     0.7735     0.0252      3.36
25           typ_ll     0.7483     0.7735     0.0252      3.36
85           typ_hh     0.7384     0.7542     0.0158      2.11
85           typ_hl     0.7384     0.7542     0.0158      2.11
85           typ_lh     0.7484     0.7747     0.0263      3.51
85           typ_ll     0.7484     0.7747     0.0263      3.51

fig. 12 inverter switching point – technological corners simulation results

4. conclusions

the present work proposes a compensation technique for the variation of the inverter switching point with the supply voltage. it is based on bulk voltage regulation and variable inverter geometry. the circuit has a maximum switching point variation over the power supply range of 4% (simulated in technological corners, complete with temperature variation). this represents the difference between the maximum and minimum vsp values, and it is computed using the 0.75v switching point (at 1.8v supply voltage) as reference. compared to the original inverter (simulated in the nominal corner at 25ºc), the compensation scheme brings an improvement of 16% for the vsp variation (or a 5 times lower variation). the compensation technique implies a nodal capacitance variation according to circuit operation and supply voltage variation. for this reason, precautions must be taken when using the proposed circuit for generating time delays.
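the 4% figure quoted above follows directly from the corner data: delta is the difference between the maximum and minimum switching point, and the percentage is taken relative to the 0.75v reference. the sketch below recomputes it for three rows copied from table 3; the function and variable names are hypothetical.

```python
V_REF = 0.75  # switching point at the 1.8V supply, used as reference (volts)


def variation(v_min, v_max, v_ref=V_REF):
    """Return (delta in volts, delta in percent of the reference)."""
    delta = v_max - v_min
    return delta, 100.0 * delta / v_ref


# (corner, temperature in degC) -> (min, max) switching point, from table 3
corners = {
    ("slow_lh", -40): (0.7491, 0.7797),
    ("fast_hh", -40): (0.7326, 0.7595),
    ("typ_hh", 25): (0.7408, 0.7565),
}

worst = max(corners, key=lambda c: variation(*corners[c])[1])
for corner, (v_min, v_max) in corners.items():
    delta, pct = variation(v_min, v_max)
    print(corner, f"delta = {delta:.4f} V ({pct:.2f} %)")
print("worst case among the listed rows:", worst)
```

among the listed rows the slow_lh corner at -40°c is the worst case, which is the entry behind the rounded 4% maximum.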
an oscillator is a typical application for the proposed circuit; a schematic is proposed in [9]. the additional circuitry for bulk voltage regulation and transistor switching brings both additional area and power consumption. for the typical application presented in [9], the frequency accuracy increased almost 6 times for, at most, 20% higher power consumption.

acknowledgement: the paper includes the results of a distinct research work at the starting point of the first author’s phd thesis.

references

[1] r. jacob baker, “circuit design, layout and simulation”, ieee press series on microelectronic systems, 2002.
[2] yannis tsividis, “operation and modeling of the mos transistor, second edition”, oxford university press, 2003.
[3] j. segura, j. l. rosselló, j. morra, and h. sigg, “a variable threshold voltage inverter for cmos programmable logic circuits”, ieee journal of solid-state circuits, vol. 33, no. 8, august 1998.
[4] p. gray, r. meyer, p. hurst, s. lewis, “analysis and design of analog integrated circuits”, john wiley & sons, inc., 2009.
[5] a. hastings, “the art of analog layout”, prentice hall, 1997.
[6] d. johns, k. martin, “analog integrated circuit design”, john wiley & sons, inc., 2011.
[7] g. streel, d. bol, “study of back biasing schemes for ulv logic from the gate level to the ip level”, journal of low power electronics and applications, 2014.
[8] c. toumazou, g. moschytz, b. gilbert, “trade-offs in analog circuit design”, kluwer academic publishers, 2004.
[9] a. antonescu, l. dobrescu, “70 mhz oscillator circuit based on constant threshold inverters”, in proceedings of the 10th international symposium on advanced topics in electrical engineering, bucharest, march 25-27, 2017.

facta universitatis series: electronics and energetics vol. 30, no. 4, december 2017, pp.
639-646 doi: 10.2298/fuee1704639d

exact analytical solutions of continuously graded models of flat lenses based on transformation optics

mariana dalarsson 1, raj mittra 2
1 department of physics and electrical engineering, linnaeus university, växjö, sweden
2 emc laboratory, department of electrical engineering, the pennsylvania state university, university park, pa, usa

abstract. we present a study of exact analytic solutions for electric and magnetic fields in continuously graded flat lenses designed utilizing transformation optics. the lenses typically consist of a number of layers of graded index dielectrics in both the radial and longitudinal directions, where the central layer in the longitudinal direction contributes the bulk of the phase transformation, while the other layers act as matching layers and reduce the reflections at the interfaces of the middle layer. such lenses can be modeled as compact composites with continuous permittivity (and, if needed, permeability) functions which asymptotically approach unity at the boundaries of the composite cylinder. we illustrate the proposed procedures by obtaining the exact analytic solutions for the electric and magnetic fields for one simple special class of composite designs with radially graded parameters. for this purpose we utilize the equivalence between the helmholtz equation of our graded flat lens and the quantum-mechanical radial schrödinger equation with coulomb potential, furnishing the results in the form of kummer confluent hypergeometric functions. our approach allows for a better physical insight into the operation of our transformation optics-based graded lenses and opens a path toward novel designs and approaches.

key words: flat lenses, graded permittivity and permeability models, transformation optics, exact analytical solutions

1.
introduction

flat lens designs based on transformation optics (to) and using left-handed (negative refractive index) metamaterials have been discussed in a number of recent publications ([1], [2]). basically, using the electromagnetic design, one is able to design a lens with the full functionality of a conventional lens, but compressed in space and possibly having additional functionalities. it is possible to do this in a wide range of operating frequencies, including microwave, terahertz and optical. however, the metamaterial composites proposed for such designs may be difficult to manufacture, especially when the required values of relative magnetic permeability and relative dielectric permittivity are less than unity, as argued in [3].

received april 4, 2017; received in revised form may 21, 2017
corresponding author: mariana dalarsson, department of physics and electrical engineering, linnaeus university, 351 95 växjö, sweden (e-mail: mariana.dalarsson@lnu.se)

in order to avoid problems with the fabrication of metamaterials with suitable values of magnetic permeability, it is possible to set the relative permeability to unity and to vary only the relative permittivity to create the desired refractive index (the square root of the relative permittivity), but at the cost of decreasing the efficiency of the composite lenses [4]. in [4] a plano-concave lens has been designed with metamaterials to obtain a gain above 13 db in the frequency band between 10 and 12 ghz. such a lens has a narrow bandwidth, typical for a majority of designs using metamaterials. the conventional flat lens designs, using the ray optics (ro) approach, avoid the above-mentioned difficulties with to designs, but they do not have the same flexibility to control the phase and amplitude of the fields within the lens structure. an approach to remedy the drawbacks of both to and ro designs is the field manipulation (fm) method, described in [3].
the studies of the flat-lens design approaches mentioned above, however, generally require a direct numerical approach to solving the field equations. in the present paper, we use an alternative approach and investigate the possibility of identifying and studying some special designs that allow for exact solutions of the field equations, analogous to those obtained in studying various planar and cylindrical metamaterial structures [5]-[11]. the main motive for pursuing analytical solutions of problems involving flat lenses is that detailed knowledge of the analytical structure of the field solutions may provide additional insights leading to improved or even entirely new designs. we apply our approach to a specific case of a gradient-index (grin) flat lens.

2. problem formulation and field equations

the graded index (grin) approach to the design of a flat lens is based on the concept of field transformation, similar to that proposed by luneburg for the design of spherical lenses [12]. as in luneburg's approach, a desired field distribution in the output port (the exit aperture) is specified, and the parameters of the intervening medium are determined such that the given field distribution in the input port (input aperture) is transformed to the desired field distribution in the exit plane. in many practical cases, this can be performed by tracing rays through a designed inhomogeneous medium. the design parameters of the lens include center frequency, focal length, thickness, and gain. the physical size (diameter d) of the lens depends on the gain and the radial model function (e.g. the radial dependence of the permittivity). one typical design layout is shown in fig. 1. the design goal is to maximize the performance of the lens, and for that purpose we want to realize the desired phases on the face b of the lens while simultaneously maximizing the transmission coefficient over a broad frequency band.
the problem is typically solved using a multi-layer structure, with the desired phase at the center frequency and a transmission coefficient as close to one as possible over a broad frequency band for each of the ten rings shown in fig. 1. in fig. 1 the following symbols are used:

t – thickness of the lens
f – focal length of the lens
i – phase of the plane wave incident from the left on the face a
a, b – the notation for the two faces (a and b) of the lens

fig. 1 flat grin lens. left: cross section (side view) of the lens showing layers; right: top view of the lens

the middle layer performs a majority of the phase transformation, while the other layers act as matching layers to maximize the transmission of the waves incident from either side (graded antireflection structure). in the present approach we model the discrete structure shown in fig. 1 by a cylindrical composite structure with the electric permittivity and permeability being continuous spatial functions ( ) ( ), ( ) ( ) (1) where (r, φ, z) is the set of cylindrical coordinates and the structure is centered around the z-axis. we consider the case of te-wave propagation through the structure, so that the electric and magnetic fields are ( ) , ( ) ( ) (2) here we note that the choice of te-waves is by no means a restriction, and writing an analogous procedure for tm-waves is straightforward. in the case of te-waves as described by (2), the maxwell equations for the scalar field components become ( ) , ( ) ( ) (3) ( ) (4) substituting equations (3) into (4), we obtain the helmholtz equation for the electric field ( ) ( ) ( ) ( ) ( ) ( ) ( ) (5) or introducing a new function ( ) ( ) ( ) ( ) (6) where . equation (5), or (6), is quite general.
after choosing suitable model functions ( ) ( ) ( ) and ( ) ( ) ( ), if we can determine the analytic solution for the electric field ( ), then using (3) we can readily obtain the magnetic field components ( ) and ( ) as well. the challenge is therefore to find suitable model functions ( ) ( ) ( ) and ( ) ( ) ( ) that reasonably resemble actual design structures like the one described in table 1 and fig. 2 of [3].

3. analytics of a simple model of composite designs

at this stage, we need to restrict the form of the functions (1) to allow for a suitable analytical solution. let us here consider a simple model where ( ) ( ) ( ) ( ( ) ) , ( ) (7) in (7) we require that at large distances ( ) the composite permittivity ( ) becomes unity, which describes the gradual transition to the free space outside the structure. this is simultaneously the condition for the antireflective behavior of the lens surface and thus the maximum input electromagnetic flux. utilizing (7) and separating variables using the ansatz ( ) ( ) ( ) ( ), equation (6) gives rise to two ordinary differential equations for the two functions, ( ) and ( ), as follows ( ) (8) [ ( ) ] (9) as indicated in (8), the solutions for ( ) are simple plane waves propagating in the z-direction, and we only need to solve equation (9). introducing ( ) √ ( ), equation (9) becomes * ( ) + (10) let us now introduce two constants , , whereby equation (10) becomes the well-known radial schrödinger equation * ( ) ( ) + (11) where we notice the following analogy between the parameters of the electromagnetic equation (11) and the parameters of the usual quantum-mechanical radial schrödinger equation ( ) ( ) (12) since we require that ( ) when , the simplest model that we can adopt is the coulomb potential ( ) ( ) (13) where α is a constant that must be chosen to provide the best fit to the presented graded model.
such a choice of ( ) introduces an unphysical singularity of the permittivity function for , but with a proper choice of boundary conditions it can provide a sufficiently accurate model of realistic graded permittivity structures. substituting ( ) from (13) into (11) we obtain * ( ) + (14) equation (14) has an exact analytical solution ( ) ( ) ( ) (15) where ( ) ( ) and ( ) ( ) are whittaker functions that can be expanded in terms of kummer confluent hypergeometric functions f1 and u. based on the asymptotic behavior of the whittaker functions for and , and the physical requirements on the behavior of the electric field functions ( ), we see that we must choose c2 = 0, such that for , we have ( ) √ ( ) √ ( ) (16) and for waves propagating in the positive z-direction, we can write ( ) ( ) ( ) ( ) (17) it is here convenient to express the result (17) in terms of kummer confluent hypergeometric functions, in order to further clarify the mathematical properties of the electric field intensity function. thus, we finally obtain ( ) ( ) ( ) (18) the result (18) for the electric field intensity function ( ) refers to the φ-component of the electric field due to the assumed te-wave as defined in (2). it should however be noted that the assumption of the te-wave by no means limits the generality of the results obtained in the present paper. the case of the tm-wave is fully analogous to the case of the te-wave, and the only difference is that the result (18) is then valid for the magnetic field intensity function ( ), which refers to the φ-component of the magnetic field. the electric field components ( ) and ( ) are then readily obtained using the tm-wave analogues of equations (3). the choice of the te-wave in the present paper was made for illustration purposes.
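the link between whittaker and kummer functions mentioned above can be illustrated with the standard identity M_{kappa,mu}(z) = exp(-z/2) * z^(mu + 1/2) * 1F1(mu - kappa + 1/2; 1 + 2*mu; z). the sketch below is a generic numerical illustration, not the solution (18) itself: the parameter values used in the check are chosen only because they have a closed form, M_{0,1/2}(z) = 2*sinh(z/2), and the plain power series is adequate only for moderate |z|.

```python
import cmath
import math


def kummer_1f1(a, b, z, tol=1e-15, max_terms=500):
    """Power series of Kummer's function 1F1(a; b; z) for moderate |z|."""
    term = 1.0 + 0j
    total = term
    for n in range(max_terms):
        term *= (a + n) / (b + n) * z / (n + 1)
        total += term
        if abs(term) < tol * abs(total):
            break
    return total


def whittaker_m(kappa, mu, z):
    """Whittaker M function expressed through Kummer's 1F1."""
    z = complex(z)
    prefactor = cmath.exp(-z / 2) * z ** (mu + 0.5)
    return prefactor * kummer_1f1(mu - kappa + 0.5, 1 + 2 * mu, z)


# Closed-form check: M_{0,1/2}(z) = 2*sinh(z/2), also for imaginary argument.
print(whittaker_m(0, 0.5, 1.3), 2 * math.sinh(0.65))
print(whittaker_m(0, 0.5, 2j), 2j * math.sin(1.0))
```

for production use, a library implementation of the confluent hypergeometric functions would be preferable to this series, which loses accuracy for large arguments.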
following the approach in [3], the relative permittivities are here assumed to be real functions and no dielectric losses are taken into account. it should however be noted that nothing in the present theory limits the relative permittivities to real values. it is fully feasible to use the present model with complex relative permittivities as well. this will be the subject of our future studies.

4. study of a specific numerical case

let us now turn to the specific case of a grin lens studied in [3], where we have a structure with radially graded permittivities for the middle layer, as listed in table 1.

table 1 radially graded permittivities of the middle layer of a grin lens.

layer   r̄         ε(r̄)
1       1.5875    25.5
2       4.7625    24.5
3       7.9375    22.3
4       11.1125   18.5
5       14.2875   14.55
6       17.4625   10.5
7       20.6375   7.65
8       23.8125   5.5
9       26.9875   3.5
10      30.1625   1.65

using the model function (7) with (13), we obtain the fitting graph shown in fig. 2, where we have chosen the parameter α to be equal to 0.36.

fig. 2 fitting of grin lens relative permittivity data using coulomb function with α = 0.36.

the cross section of the solution (18) for a constant z is shown in fig. 3.

fig. 3 cross section of the electric field function e(r, z) for given constant z (z = 0), with c1 = 1, f = 30 ghz, k = 2π f/c and kz = 0.8 k.

finally, a three dimensional plot of the solution (18) is shown in fig. 4.

fig. 4 electric field function e(r, z).

from fig. 4 we readily see how the wave is radially focused while moving along the z-direction, as expected. the wave amplitudes are not normalized with respect to any starting position, and do not reflect any specific initial electric field strength. even though the coulomb function is far from the optimum fit for the grin lens data, the obtained results can be used to describe the chosen lens simply and with sufficient accuracy.
it should be noted here that our choice of the model function (coulomb function) has been made based on the well known analytical solutions for that function. there are a number of other functions that also allow exact analytic solutions of the problem at hand, in particular if the model is extended to allow a graded permeability of the lens layers. the studies of other models involving such more accurate model functions will be the subject of our coming papers.

5. conclusions

the possibility of finding exact analytic solutions for the electric and magnetic fields in continuously graded flat lenses has been studied. the flat lenses are modeled as compact composites with continuous permittivity and permeability functions which asymptotically approach unity at the boundaries of the composite cylinder. in order to illustrate the present approach, we obtain an exact analytic solution for the electric field intensity for an fm composite lens with constant magnetic permeability ( z) and radially dependent dielectric permittivity. in our coming research efforts, we see the need to look for possible models with exact analytical (or at least perturbational and/or wkb) solutions for graded profiles of some more complex flat-lens designs studied in the literature.

references

[1] r. yang, w. tang, and y. hao, "a broadband zone plate lens from transformation optics", optics express, vol. 19, no. 13, pp. 12348-12355, 2011.
[2] d. a. roberts, n. kundtz, and d. r. smith, "optical lens compression via transformation optics", optics express, vol. 17, no. 19, pp. 16535-16542, 2009.
[3] s. jain, m. abdel-mageed and r. mittra, "flat-lens design using field transformation and its comparison with those based on transformation optics and ray optics", ieee antennas and wireless propagation letters, vol. 12, pp. 777-780, 2013.
[4] t. driscoll, g. lipworth, j. hunt, n. landy, n. kundtz, d. n. basov, and d. r.
smith, "performance of a three-dimensional transformation-optical flattened luneburg lens", optics express, vol. 20, no. 12, pp. 13264-13273, 2012.
[5] m. dalarsson and p. tassin, "analytical solution for wave propagation through a graded index interface between a right-handed and a left-handed material", optics express, vol. 17, no. 8, pp. 6747-6752, 2009.
[6] m. dalarsson, m. norgren, and z. jaksic, "lossy gradient index metamaterial with sinusoidal periodicity of refractive index: case of constant impedance throughout the structure", journal of nanophotonics, vol. 5, no. 1, pp. 051804-1-8, 2011.
[7] m. dalarsson, m. norgren, n. doncov, and z. jaksic, "lossy gradient index transmission optics with arbitrary periodic permittivity and permeability and constant impedance throughout the structure", journal of optics, vol. 14, no. 6, pp. 065102-1-7, 2012.
[8] m. dalarsson, m. norgren, t. asenov, n. doncov, and z. jaksic, "exact analytical solution for fields in gradient index metamaterials with different loss factors in negative and positive refractive index segments", journal of nanophotonics, vol. 7, no. 1, pp. 073086-1-13, 2013.
[9] m. dalarsson, m. norgren, t. asenov, and n. doncov, "arbitrary loss factors in the wave propagation between rhm and lhm media with constant impedance throughout the structure", pier, vol. 137, pp. 527-538, 2013.
[10] m. dalarsson, m. norgren, and z. jaksic, "exact analytical solution for fields in a lossy cylindrical structure with linear gradient index metamaterials", pier, vol. 151, pp. 109-117, 2015.
[11] m. dalarsson, and z. jaksic, "exact analytical solution for fields in a lossy cylindrical structure with hyperbolic tangent gradient index metamaterials", optical and quantum electronics, vol. 48, no. 3, pp. 1-6, 2016.
[12] r. k. luneburg, "mathematical theory of optics", university of california press, berkeley, 1964.

facta universitatis series: electronics and energetics vol. 32, no. 3, september 2019, pp.
345-358 https://doi.org/10.2298/fuee1903345s © 2019 by university of niš, serbia | creative commons license: cc by-nc-nd scada systems in the cloud and fog environments: migration scenarios and security issues * mirjana d. stojanović 1 , slavica v. boštjančič rakas 2 , jasna d. marković-petrović 3 1 university of belgrade, faculty of transport and traffic engineering, belgrade, serbia 2 university of belgrade, mihailo pupin institute, belgrade, serbia 3 ce djerdap hydroelectric power plants ltd., hpp djerdap 2, negotin, serbia abstract. this paper addresses scenarios and security issues when migrating scada systems to cloud and fog environments. migration strategies to the cloud refer to different cloud infrastructures (public, private or hybrid) as well as selection of cloud service. benefits of cloud-based scada systems mainly refer to improving economic efficiency. we further address migration risks, with regards to quality of service and cyber security. challenges in security provisioning encompass security solutions, risk management and test environment. finally, we address emerging evolution of scada toward fog computing, including the three-tier system’s architecture and security issues. key words: cloud computing, cyber security, fog computing, quality of service, scada 1. introduction with respect to the earlier version [1], presented at the 4th virtual international conference on science, technology and management in energy – energetics 2018, this paper is extended with more thorough considerations related to cyber security when migrating supervisory control and data acquisition (scada) systems into cloud computing environment, and discussion on the evolution toward fog computing system architecture. in the past few years, the focus of cloud computing has progressively shifted from consumer applications toward corporate control systems. 
the migration of applications, such as scada, into the cloud environment is interesting for business users due to potential reduction of costs, scalability, efficient system configuration and maintenance. access and lease of resources are on-demand, with costs that are much lower than buying, installing and maintaining the hardware and software, and with a decrease in the number of technical staff.
received february 12, 2019; received in revised form april 18, 2019
corresponding author: mirjana d. stojanović, university of belgrade, faculty of transport and traffic engineering, vojvode stepe 305, 11000 belgrade, serbia (e-mail: m.stojanovic@sf.bg.ac.rs)
* an earlier version of this paper was presented at the 4th virtual international conference on science, technology and management in energy, energetics 2018, october 25-26, niš, serbia [1]
the industrial sector is experiencing substantial benefits from using the industrial internet of things (iiot) to automate systems, deploy different types of sensors, improve efficiency, and increase revenue opportunities. the amount of data from such industrial systems can be measured in the millions of gigabytes. traditional information technology (it) cannot fulfill requirements regarding data analysis, delay, mobility, reliability, security, privacy, and network bandwidth. fog computing seems to be a promising solution to resolve such problems. particularly, fog computing outperforms cloud computing for delay-sensitive applications with stringent security requirements. apart from industry efforts [2–5], only a few academic research papers have systematically explored scada systems using cloud and/or fog environments, and particularly the related security issues.
our primary motivation for this work was to provide an in-depth insight into migration of advanced scada systems to both cloud and fog environments, with a focus on security as a crucial risk factor in the context of critical infrastructure. the main objectives of this review paper are: (1) to discuss security issues of cloud-based scada systems, regarding different migration strategies and types of cloud services; (2) to consider scada security solutions and challenges in security provisioning in the cloud environment and (3) to explore scada evolution toward fog computing system architecture, with special attention to security. finally, we identify gaps in current research and propose relevant research priorities for future work in the area. the rest of the paper is organized as follows. section 2 provides a brief theoretical background, regarding operation principles of scada systems, as well as basic concepts of cloud and fog computing. in section 3, we first explain migration strategies of scada systems to the cloud, with respect to different cloud infrastructures and selection of cloud service. further, benefits and risks of cloud-based scada systems are explained, as well as challenges in cloud security provisioning in terms of security solutions, risk management and test environments. section 4 considers evolution toward fog computing system architecture, including migration of scada systems to fog, and security issues. section 5 concludes the paper.
2. theoretical background
2.1. scada system: architecture, configuration and protocols
scada systems are a class of industrial control systems that control and monitor geographically dispersed process equipment in a centralized manner. they are widely used in industrial sectors such as electric power systems, oil refineries and natural gas distribution, water and wastewater treatment, and transportation systems. fig. 1 presents the layered architecture of a scada system, with common components and configuration.
the hierarchy of a scada system is defined according to interconnection of its components and their connectivity with external networks [6–8]. the lowest layer 0 represents physical devices that are in direct interaction with industrial hardware, interconnected via fieldbus.
[fig. 1 the layered architecture of a scada system: layer 0 fieldbus network (field site), layer 1 controller network (field site), layer 2 supervisory network (control center), layer 3 operational traffic over dmz, layer 4 corporate network (web and e-mail servers, business servers)]
controllers at layer 1 process signals from field devices and generate appropriate commands for these devices. they encompass remote terminal units (rtus), programmable logic controllers (plcs) and intelligent electronic devices (ieds) that perform local control of actuators and sensor monitoring. processing results are forwarded to the control center at layer 2 for further analysis and response control. the supervisory network connects the scada server (master terminal unit, mtu), historian server, engineering work stations, human machine interface (hmi) server and consoles, as well as communication devices, such as routers, switches or modems. the control center collects and analyzes information obtained from field sites, presents it on the hmi consoles, and generates actions based on detected events. the control center is also responsible for general alarms, analysis of trends and generating reports. the communication subsystem connects the control center with field sites and allows operators remote access to field sites for diagnostics and failure repair. it also connects the control center with scada partner plants.
layer 3 typically represents the demilitarized zone (dmz), where application servers, historian server and domain controller are located. layer 4 corresponds to the corporate it network, which is connected to the internet. modern scada systems are based on open communication standards, such as ethernet, transmission control protocol/internet protocol (tcp/ip) suite and a variety of wireless standards. a set of standard or proprietary protocols is used for communications, over point-to-point links or a broadband ip-based wide area network (wan). there are several standard and vendor-specific scada communication protocols, and the most widespread are modbus, distributed network protocol (dnp3), iec 60870-5 series and iec 61850 series used for electrical substation automation systems. most of these protocols are designed or extended to operate over tcp/ip networks. besides, most of the existing fieldbus protocols are based on ethernet technology. a comprehensive review of scada protocols can be found in [9] and [10].
2.2. cloud computing: basics and security issues
according to [11], cloud computing is defined as "a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction". public cloud infrastructure is owned by a provider and sold as a service to business and residential users. private cloud refers to an infrastructure that is owned or leased by a business user (single organization). hybrid cloud infrastructure is a combination of private and public cloud infrastructures, which remain mutually independent and connected by standard or proprietary technology that enables portability of data and applications.
service architecture of a cloud computing system is hierarchically structured [12]:
- the lowest, hardware layer at the data center consists of the physical hardware devices including the processor, memory, storage and bandwidth.
- the infrastructure layer assumes virtualization to provide the infrastructure as a service (iaas), which usually consists of a pool of virtual machines (vms) that can be provisioned on demand to the it consumers.
- the platform layer enables creation and development of software, which can be later delivered over the web. hence, this layer provides the platform as a service (paas) by utilizing the components and services of the infrastructure layer.
- the highest, software layer provides the ready-to-use software and applications for the business needs of the cloud service customers. hence, this layer facilitates and provides the software as a service (saas) by utilizing the components and services of the platform layer.
besides security threats that are present in the existing computing platforms and networks, cloud computing faces a number of additional vulnerabilities [13]. they include: (1) attacks by other customers; (2) shared technology issues; (3) failures in provider or customer security systems; (4) flawed integration of provider and customer security systems; (5) insecure application programming interfaces; (6) data loss or leakage and (7) account or service hijacking. in particular, susceptibilities depend on the type of cloud service. in general, iaas is susceptible to all of the threats that are well known from the traditional information and communication environment [14]. all of the client applications running on the virtual machines are like "black boxes" for the provider. in other words, the customer is responsible for securing these applications. paas is particularly susceptible to shared technology issues, because security settings may differ for various kinds of resources.
another problem caused by shared resources refers to data leakage. finally, protection of user objects is one of the most serious issues of paas [15]. since saas requires only a web browser and the internet connection, its security aspects are similar to those of a web service [16]. saas is susceptible to data security issues, particularly regarding data confidentiality. the other common problems with data security include data backup, data access, storage locations, availability, authentication, etc. table 1 summarizes the attack types and their impacts, regarding the layered cloud service architecture, and emphasizes responsibilities of the cloud service provider (csp).
table 1 types of attacks on the cloud and their impacts (adapted from [16]); x marks csp responsibility
security issues | attack types | impacts | saas | paas | iaas
software layer | sql injection attacks, cross site scripting | modification of data, confidentiality, session hijacking | x | - | -
platform layer | domain name system attacks, sniffing, reuse of ip address | traffic flow analysis, exposure in network security | x | x | -
infrastructure layer | dos and ddos, vm escape, hypervisor rootkit | software interruption and modification, programming flaws | x | x | x
hardware layer | phishing attacks, malware injection attack | limited access to data centers, hardware modification and theft | x | x | x
2.3. fog computing: basics and security issues
fog computing is a decentralized network architecture in which data storage, processing and applications are distributed in the most efficient way between the data source and the cloud. fog computing and cloud computing show similar characteristics in terms of computation, storage and networking technologies. however, the most important difference of fog computing is its close distance to end users. this property is essential to support delay-sensitive applications and services.
another difference refers to support of big data by means of edge analytics and stream mining. finally, the location-awareness property enables mobility support. comprehensive surveys on fog computing, including applications in the electric power industry, can be found in [17] and [18]. another benefit of fog computing is its high security because data is processed by a large number of nodes in a distributed system. however, since it relies on virtualization just like the cloud, the fog environment can still be affected by similar threats. as opposed to cloud computing, standard security certifications and measures still do not exist for fog computing. stojmenovic and wen first explored security and privacy issues in the fog computing environment [19]. a more detailed review of fog security solutions can be found in [20].
3. cloud-based scada systems: migration, benefits and risks
3.1. migration of scada systems to cloud
cloud computing provides support for scada applications in two ways [3]:
1. scada application is executed on premises (company, organization, etc.). it is directly connected with the control center and transfers data to the cloud where they can be stored and distributed.
2. scada application is completely executed in the cloud, and is remotely connected to the control center.
the first method, presented in fig. 2, is more widely used. control functions of the scada application are isolated in the controller network, while the scada application is connected to cloud services that allow visualization of processes, reports and remote access. such applications are usually implemented on a public cloud infrastructure. the implementation illustrated in fig. 3 is suitable for distributed scada applications. controllers are connected via wan links to the scada application that is executed in the cloud. such applications are usually implemented on private and hybrid cloud infrastructures.
[fig. 2 public cloud infrastructure, with scada system operating on-premises and sending data through cloud; real-time process and historical data are uploaded to the cloud]
[fig. 3 private or hybrid cloud infrastructure where controllers are connected via wan links to scada application that is executed in the cloud; real-time process and historical data are uploaded to the cloud, command and control messages are downloaded to the controllers]
regarding service selection, there are three possible migration scenarios of a scada system to the cloud, namely re-hosting, refactoring and revising [21]. the first scenario (re-hosting) is the fastest and the simplest, and represents installing the existing scada applications in the cloud environment, based on iaas. this is the first step of the gradual migration, which allows analysis and, if needed, extension of the applications, through several iterations. the second and the third scenarios assume re-engineering to better benefit from the features of cloud computing, primarily in terms of scalability and reliability. this can be a simple modification of particular features (refactoring). an example is implementation of resource control capabilities that allow adding resources when the application is intensively used and releasing resources when they are not needed. larger modifications at the application core are also possible (revising). an example is the use of a paas database for modification of the application in such a way that a multi-tenancy saas offering is possible. the increase of the number of saas offers requires the replacement of the existing scada applications with cloud-based saas solutions.
3.2. benefits and risks
there are several advantages of cloud computing that motivate users to migrate to a cloud-based scada system. with public cloud, access and lease of resources are on-demand, at a much lower price than purchasing, installing and maintaining the company's own software and hardware. consequently, the number of technical staff needed for it resource maintenance decreases. according to [3], a cloud-based scada solution can reduce end-user costs up to 90% over a traditional scada system. scalability is enhanced, since there is no need for purchasing and installing server farms, databases or web servers when more resources are needed. users can easily purchase additional resources on a virtual cloud server, with no need of installing and maintaining the additional hardware [3]. information located on a cloud server can be accessed from anywhere; hence, the collaboration on projects is more efficient due to easier access to information. upgrade of existing and implementation of new applications is simplified through re-hosting, refactoring and revising [21]. with private cloud, efficient resource usage, reduced energy consumption and efficient maintenance enable faster upgrade, business continuity, rapid deployment of new services and overall cost reduction [22]. despite the aforementioned benefits, there are two serious risk factors for cloud-based scada systems, quality of service (qos) and cyber security, which will be considered in the following subsections.
3.2.1. qos requirements
qos refers to system’s performance, as well as reliability and availability. most of the industrial applications pose stringent performance requirements regarding delay, packet loss and bandwidth. the most stringent requirements for delay are in the fieldbus and controller networks. response times are in the range of 250 microseconds to 1 millisecond, while for less demanding processes they are in the range of 1 to 10 milliseconds [9].
upper layers have progressively less stringent delays, typically up to 1 second. industrial applications also assume highly reliable network infrastructure with service availability higher than 99.98% [23]. the use of public cloud services increases the risk that these requirements will not be met, because the user cannot control the network performance. increased and unpredictable delay is challenging, since it can block the real-time scada operation. church et al. presented a case study on migration of a scada system to the iaas cloud [21]. they analyzed several open source scada applications and applied the re-hosting approach to migrate a scada application to a real academic network. performance evaluation has shown that delay introduced by the cloud-based scada system was not a limiting factor. measured response times were in the range from several hundreds of milliseconds to one second. however, problems emerged with polling protocols (e.g., modbus tcp), which are based on individual polls of remote stations. applying event-driven communication protocols seems to be a more efficient solution, because it reduces both delay and the amount of data sent across a network. if such a solution is not possible and polling protocols have to be applied, field devices should be spread across several remote servers. before migration of a scada system to the public cloud, the following questions should be answered [2, 3]:
- what are the consequences of variable qos on the industrial process controlled by a particular scada system?
- what is the impact of increased delay and delay variation on scada system?
- what is the upper bound for delay in each part of the system? for example, increased delay is not critical at the upper layers, which perform monitoring and reporting.
the problems of availability and reliability exist in every system in the public cloud. the servers are placed in unknown locations that users cannot access.
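the delay and availability figures quoted in this subsection can be turned into a quick feasibility check. the following is a minimal sketch, assuming that the upper end of each quoted delay range is used as the bound; the layer labels and function names are illustrative and do not come from the cited sources.

```python
# Upper delay bounds per layer, in seconds, taken from the ranges quoted
# in the text. The dictionary keys are illustrative labels (assumption).
DELAY_BOUNDS_S = {
    "fieldbus_controller_critical": 1e-3,  # 250 microseconds to 1 ms
    "fieldbus_controller_relaxed": 10e-3,  # 1 to 10 ms
    "upper_layers": 1.0,                   # typically up to 1 second
}

SECONDS_PER_YEAR = 365 * 24 * 3600


def meets_delay_bound(layer: str, measured_delay_s: float) -> bool:
    """Return True if a measured delay fits the bound assumed for the layer."""
    return measured_delay_s <= DELAY_BOUNDS_S[layer]


def max_annual_downtime_s(availability: float) -> float:
    """Downtime per (non-leap) year allowed by an availability fraction."""
    return (1.0 - availability) * SECONDS_PER_YEAR


if __name__ == "__main__":
    # A response time of several hundred milliseconds, as reported for the
    # re-hosted system, fits the upper layers but not the controller network.
    print(meets_delay_bound("upper_layers", 0.5))
    print(meets_delay_bound("fieldbus_controller_critical", 0.5))
    # 99.98% availability leaves roughly 105 minutes of downtime per year.
    print(max_annual_downtime_s(0.9998) / 60.0)
```

under these assumptions, cloud response times in the range of hundreds of milliseconds are compatible only with monitoring and reporting functions, which is consistent with keeping control loops in the controller network.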
the data of scada systems encompass results of the industrial process control in real time; therefore, the loss of functionality, even for a few seconds, can cause serious consequences to the industrial process. the situation is different with private cloud infrastructure. chen et al. conducted a study on private cloud-based electric power scada system [22]. their experimental results indicated technical feasibility of the professional private cloud solution; such a system meets the actual need of power grid operations, and some qos parameters such as network load rate are even better than those of the traditional it architecture. 3.2.2. security issues the security issue of the hard real-time system requires overall analysis and holistic understanding of network protection, management theory and physical systems. this problem is getting even more complex in the case of migration to the cloud. cyber attacks on scada systems can be categorized into: hardware attacks, software attacks and communication stack attacks [24]. scada control center performs its actions based on the data received from rtus. attacks that jeopardize process control focus on modifying control data or blocking the data transfer. primary threats to scada systems are command/response injection, various forms of denial of service (dos) attacks, including distributed dos (ddos), and man-in-the-middle (mitm) attack [25, 26]. cloud-based scada systems suffer from the same cyber security risks (indicated in section 2.2) as the other systems integrated into cloud [27]. still, there are a number of threats in the public cloud environment that might make scada systems more vulnerable. first, cloud-based scada systems are more exposed to cyber threats such as command/response injection, dos/ddos attacks, and mitm attack. this comes as a consequence of sharing an infrastructure with unknown outside parties [3]. 
second, network connections between scada systems and the cloud potentially increase the risk of jeopardizing the whole industrial control system by outside attackers [27, 28]. third, some of the scada-specific application layer protocols lack protection [4, 27]. for instance, the most widespread scada protocols, modbus and dnp3, do not support authentication and encryption. finally, the use of commercial off-the-shelf solutions (instead of proprietary ones) potentially increases the cyber security risk [27].
3.3. security solutions
according to [28], security solutions concerning public cloud infrastructure should address the challenges related to:
- information input;
- information and command output;
- shared storage and computational resources and
- shared physical infrastructure.
regarding information input/output, it is essential not to expose the critical control infrastructure to the internet. for that reason, when using public cloud infrastructure, push technology should be utilized to move data to the cloud rather than pull technology. thus, there are no open network ports on the control infrastructure, while scada applications remain isolated in the controller network. concerning shared storage and computational resources, a scada utility interacting with a csp should be aware of how the computational resources are managed for different applications running in the cloud, including guarantees for resource allocation and network access, service levels, fault-tolerance strategy, etc. finally, security of shared physical infrastructure refers to secure cloud infrastructure locations, communication links connecting the cloud infrastructure to the rest of the communications infrastructure, ability to inspect and audit the locations from which the scada application will be served, etc.
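the push-rather-than-pull recommendation above can be illustrated with a short sketch. it assumes a hypothetical on-premises gateway that reads local values and hands a serialized batch to an outbound transport; the payload format, the function names and the `send` callable are illustrative assumptions, not part of any scada product or of the cited sources.

```python
import json
import time
from typing import Callable, Dict


def make_batch(site_id: str, readings: Dict[str, float], timestamp: float) -> bytes:
    """Serialize one set of readings into a JSON payload (illustrative format)."""
    payload = {"site": site_id, "ts": timestamp, "readings": readings}
    return json.dumps(payload, sort_keys=True).encode("utf-8")


def push_once(
    read_tags: Callable[[], Dict[str, float]],
    send: Callable[[bytes], None],
    site_id: str,
) -> int:
    """Read local values and push them outward; return the payload size in bytes.

    The control side is always the initiator: it calls `send`, and nothing
    in this sketch listens for inbound connections from the cloud.
    """
    batch = make_batch(site_id, read_tags(), time.time())
    send(batch)
    return len(batch)


if __name__ == "__main__":
    received = []  # stands in for the cloud endpoint in this sketch
    size = push_once(lambda: {"pump_1_flow": 12.5}, received.append, "site-a")
    print(size, json.loads(received[0])["readings"])
```

in a real deployment `send` would wrap an outbound, encrypted connection opened from the controller network side; the point of the design is that the cloud never needs a reachable port on the control infrastructure.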
consequently, when selecting the csp and assessing maturity of the offered cloud service, the following criteria should be taken into account [4]:
- ensuring secure user access.
- mutual isolation of information originating from different applications.
- determining the level of user control regarding changes of the csp infrastructure.
- data encryption.
- automated distribution of software patches.
- provisioning scheduled and unscheduled reports that satisfy business needs.
- continuous monitoring, which includes assessment of security mechanisms efficiency in near real-time.
- continuous analysis of events, incidents, suspicious activities and anomalies.
- capabilities to create and analyze log files, to detect intrusions in real-time, to generate responses to detected attacks.
- readiness to take immediate corrective actions for all vulnerabilities identified.
- consistent and reliable customer service.
the most efficient way to protect a scada system connected to the public cloud is to establish a precise service level agreement (sla) that fulfills the aforementioned criteria. similar research, regarding enterprise resource planning, pointed out the importance of introducing slas in the context of using saas and open-source software [29]. scada protection is much simpler in private cloud, since security solutions are the responsibility of the network owner. it is recommended to apply a strategy known as "defense-in-depth", i.e., a multilayer security architecture that minimizes the impact of a failure in any one layer mechanism [30]. this strategy assumes corresponding security policies, employing dmz network architecture to prevent direct traffic between the corporate and scada networks, as well as security mechanisms such as smart access control, firewalls, intrusion detection and prevention systems, antivirus software, deploying security patches on a regular basis, etc. in hybrid cloud, using a secured virtual private network (vpn) connection to the control infrastructure is strongly recommended [28].
3.4. risk management
security risk management is a cyclic process that encompasses several phases: risk analysis through identification of vulnerabilities and threats, risk assessment, making decisions on acceptable risk level, selection and implementation of measures to mitigate the risk. risk assessment is the most important phase in the risk management process, but also the most susceptible to errors. according to [30], risk assessment is "the process of identifying risks to operations, assets, or individuals by determining the probability of occurrence, the resulting impact, and additional security controls that would mitigate this impact". different qualitative and quantitative approaches, methods and tools for risk assessment in industrial control environment can be found in the literature. two comprehensive reviews of risk assessment methods for scada systems have been published only recently [31, 32]. however, none of the reviewed methods considers cloud-based scada. hence, significant research efforts are needed to address this important issue, because risk management takes the outputs of the risk assessment process to consider the options for risk mitigation and to find the trade-offs among overall costs, benefits, and risks of scada migration to the cloud.
3.5. test environments
due to the need to maintain operational continuity, it is often infeasible to perform security experiments on a real scada system. hence, proper test environments should be developed, consisting of testbeds, datasets and simulated attacks. while test environments for scada in traditional ip-based networks have gained a certain level of maturity, research work is still needed regarding cloud-based scada systems.
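as a small illustration of what a dataset with simulated attacks might look like, the sketch below generates a synthetic sensor trace and marks the samples where a false value has been injected. it is only an assumed toy model: the nominal value, noise level, injected offset and function name are hypothetical and do not correspond to any published scada dataset or testbed.

```python
import random
from typing import Iterable, List, Tuple


def synthetic_trace(
    n_samples: int,
    attack_indices: Iterable[int],
    seed: int = 0,
    nominal: float = 50.0,
    noise: float = 0.5,
    injected_offset: float = 20.0,
) -> List[Tuple[float, bool]]:
    """Return (value, is_attack) pairs for a single simulated sensor.

    Normal samples are the nominal value plus Gaussian noise; samples at
    `attack_indices` additionally carry an injected offset, mimicking a
    falsified reading, and are labeled True.
    """
    rng = random.Random(seed)  # fixed seed keeps the dataset reproducible
    attacked = set(attack_indices)
    trace = []
    for i in range(n_samples):
        value = nominal + rng.gauss(0.0, noise)
        is_attack = i in attacked
        if is_attack:
            value += injected_offset
        trace.append((value, is_attack))
    return trace


if __name__ == "__main__":
    data = synthetic_trace(100, attack_indices=[10, 20])
    print(sum(1 for _, label in data if label), "attack samples out of", len(data))
```

labeled traces of this kind can be replayed against a detection mechanism, while a fixed seed lets other researchers regenerate exactly the same data.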
a scada security testbed can be implemented as: (1) a single software simulation package; (2) laboratory testbed, which may have several interacting simulations and (3) emulation or implementation-based, which uses an emulator or real hardware [8]. in the context of cloud-based scada, probably the most valuable option will be laboratory testbeds that allow other researchers to repeat the experiments and validate their own upgraded solutions. such testbeds should be built on top of general-purpose cloud simulators, which interact with domain specific models or real world field devices. some examples of such simulators are cloudsim, greencloud, cloudanalyst, icancloud and emusim [33]. due to confidentiality of real scada network data, researchers often use synthetic datasets or datasets obtained from scada testbeds. this is a general problem in verifying security solutions for scada systems. besides synthetic datasets, there is a strong need to use datasets from real scada networks or to reuse publicly available ones. finally, proper attack models and scenarios, in which the attackers try to exploit vulnerabilities in cloud-based scada systems, should be developed. building accurate and plausible threat models is a prerequisite to design secure architecture concepts. this is generally an open issue in scada security, while in the context of cloud-based scada, it has been explicitly recognized for the first time very recently [34].
4. evolution toward fog computing system architecture
4.1. migration of scada systems to fog
fog computing essentially extends cloud computing and services to the edge of the network. consequently, end users, fog and cloud together form a three-tier system architecture. considering migration of a scada system, a possible architecture is proposed in fig. 4 (based on [17]).
the end users stratum corresponds to end user devices, e.g., field devices, smart energy meters, line sensors, etc. it can also include iiot devices. the fog stratum encompasses one or more fog domains, managed by the same or different providers. a fog domain is constituted by the fog nodes, i.e., devices with computing, storage, and network connectivity. examples of fog nodes are industrial controllers, switches, routers, embedded servers, etc. fog nodes provide integration with the cloud stratum, routing and switching, data storage and sharing, real-time analytics, outage management, controller functions (rtus, plcs, ieds), wireless access, etc. the end users stratum and fog stratum are typically connected via wired or wireless local area networks (lans). the cloud stratum is responsible for functions such as high level storage, utility billing system, demand prediction, high level processing and historical data analysis. the fog stratum and cloud stratum communicate via wan connections. table 2 is based on [5] and summarizes the comparison of fog computing and cloud computing environments in the context of scada system requirements.
[fig. 4 three-tier architecture of a fog-based scada system: end users stratum (end user devices, iiot devices), fog stratum (fog domains 1 to n with fog nodes), cloud stratum (private/hybrid cloud); lan links between the end users and fog strata, wan links between the fog and cloud strata]
table 2 fog computing vs. cloud computing in terms of scada requirements
feature | fog computing | cloud computing
architecture | decentralized (distributed) | centralized
communication | wired or wireless lan | ip wan
number of server nodes | large | few
real-time operation | supported | supported
delay | low | relatively high
bandwidth cost | low | relatively high
security | high | relatively high
mobility and location awareness | supported | limited
4.2. security considerations
with fog computing, security does not function in the cloud, but locally, thus using the same corporate it policy, controls, and procedures as in a traditional scada system. inherently, there is an opportunity to increase cyber security as compared to the cloud environment. most fog nodes include a hardware root of trust [35], which represents a basis for the protection chain from the field devices, through the fog stratum up to the cloud stratum. traffic is supervised from the cloud to the distributed fog network, which can use different anomaly-based techniques to detect malicious activities in the local context. security solutions for fog-based scada systems are generally similar to the ones applied for cloud-based scada. the emphasis is on the following techniques [20]:
- authentication. all messages and entities must be authenticated, which is particularly important to prevent mitm attacks. different techniques can be applied, including public cryptography coupled with decoy technology, biometric authentication, etc.
- access control. all fog nodes should provide access control and ensure authorization, to protect operations such as reading or writing data, executing programs and controlling sensors/actuators.
- intrusion detection. intrusion detection techniques are generally deployed within the cloud environment to identify possible incidents, e.g., different types of cyber attacks, and violation of network security policies or standard security practices.
in fog computing, intrusion detection systems can be implemented both on the client side and the fog network side thus allowing double protection, from insider attacks and attacks originating from the cloud. if a threat is detected, fog nodes block malicious traffic and protect the critical scada network. highly-sensitive data can be processed locally without leaving the field site.  privacy. fog nodes are located close to, or at the field sites and collect more sensitive data compared to the cloud computing. for that reason, security techniques must ensure privacy for all field sites. similar to the cloud environment, challenges in security provisioning of fog-based scada systems include further development of fog-specific security solutions, risk assessment methods, as well as dedicated test environments. scada systems in the cloud and fog environments: migration scenarios and security issues 357 5. concluding remarks this paper provided a review of migration scenarios of scada systems toward cloud and fog environments with special attention to cyber security as a main operational risk factor, which requires additional research work and stipulates gradual migration. we have identified a progress in some areas, but also some open issues remain. first, public and private cloud architectures can both be the right selection for scada, but one size does not fit all. a proper risk analysis should be conducted to make right choice, and there is a strong need to develop appropriate risk assessment methods for that purpose. taking into account the assessed risk, the cost increase is justified to provide secure cloud services. second, proper testbeds should be developed to validate security solutions. they include laboratory testbeds, but also research efforts to develop sophisticated hardware/software emulation platforms that are able to interact with the network. 
finally, although the evolution of scada toward the fog computing environment is an emerging trend, which eliminates some of the problems inherent to the cloud, it is not risk-free by default. besides an obvious need for security standards in the area, additional research work is needed to assess the suitability of a complicated three-tier system for scada applications, regarding additional expenses and limited scalability.

acknowledgement: the paper is a part of the research funded by the ministry of education, science and technological development of serbia, within the projects tr 32025 and tr 36002.

references
[1] m. stojanović, s. boštjančič rakas and j. marković-petrović, "cloud-based scada systems: cyber security considerations and future challenges", in proceedings of the 4th virtual international conference on science, technology and management in energy – energetics 2018. niš, serbia: research and development center "alfatec", and complex system research center, 2018, pp. 253–260.
[2] e. nugent, "how cloud and fog computing will advance scada systems", manufacturing automation, pp. 22–24, november/december 2017.
[3] l. combs, "cloud computing for scada", indusoft, 2011. http://www.indusoft.com/documentation/white-papers/artmid/1198/articleid/430/cloud-computing-for-scada (accessed february 05, 2019).
[4] p. d. howard, "a security checklist for scada systems in the cloud", gcn, 2015. https://gcn.com/articles/2015/06/29/scada-cloud.aspx (accessed february 05, 2019).
[5] c. byers, "fog computing for industrial automation", control eng., 2018. https://www.controleng.com/articles/fog-computing-for-industrial-automation/ (accessed february 05, 2019).
[6] i. ahmed, s. obermeier, m. naedele and g. g. richard iii, "scada systems: challenges for forensic investigators", computer, vol. 45, no. 12, pp. 44–51, december 2012.
[7] j. marković-petrović and m. stojanović, "an improved risk assessment method for scada information security", elektron. elektrotech., vol. 20, no. 7, pp.
69–72, september 2014.
[8] s. nazir, s. patel and d. patel, "assessing and augmenting scada cyber security: a survey of techniques", comput. secur., vol. 70, pp. 436–454, september 2017.
[9] b. galloway and g. p. hancke, "introduction to industrial control networks", ieee commun. surv. tut., vol. 15, no. 2, pp. 860–880, second quarter 2013.
[10] j. gao, j. liu, b. rajan, r. nori, et al., "scada communication and security issues", secur. commun. netw., vol. 7, no. 1, pp. 175–194, january 2014.
[11] p. mell and t. grance, the nist definition of cloud computing. nist special publication 800-145, 2011.
[12] a. bashar, "modeling and simulation frameworks for cloud computing environment: a critical evaluation", in proceedings of the international conference on cloud computing and services science – icccss 2014. world academy of science, engineering and technology, 2014, pp. 1–6.
[13] b. hari krishna, s. kiran, g. murali and r. pradeep kumar reddy, "security issues in service model of cloud computing environment", procedia comput. sci., vol. 87, pp. 246–251, 2016.
[14] p. chavan, p. patil, g. kulkarni, r. sutar, et al., "iaas cloud security", in proceedings of the 2013 international conference on machine intelligence and research advancement. ieee, 2013, pp. 549–553.
[15] m. t. sandikkaya and a. e. harmanci, "security problems of platform-as-a-service (paas) clouds and practical solutions to the problems", in proceedings of the ieee 31st symposium on reliable distributed systems. ieee, 2012, pp.
463–468.
[16] s. soufiane and b. halima, "saas cloud security: attacks and proposed solutions", trans. on machine learning and artificial intelligence, vol. 5, no. 4, pp. 291–301, august 2017.
[17] c. mouradian, d. naboulsi, s. yangui, r. h. glitho, et al., "a comprehensive survey on fog computing: state-of-the-art and research challenges", ieee commun. surv. tut., vol. 20, no. 1, pp. 416–464, first quarter 2018.
[18] p. hu, s. dhelima, h. ning and t. qiu, "survey on fog computing: architecture, key technologies, applications and open issues", j. netw. comput. appl., vol. 98, pp. 27–42, november 2017.
[19] i. stojmenovic and s. wen, "the fog computing paradigm: scenarios and security issues", in proceedings of the 2014 federated conference on computer science and information systems. ieee, 2014, pp. 1–8.
[20] s. khan, s. parkinson and y. qin, "fog computing security: a review of current applications and security solutions", j. cloud comput., vol. 6, no. 10, pp. 1–22, august 2017.
[21] p. church, h. mueller, c. ryan, s. v. gogouvitis, et al., "migration of a scada system to iaas clouds – a case study", j. cloud comput. adv. syst. appl., vol. 6, no. 11, pp. 1–12, june 2017.
[22] y. chen, j. chen and j. gan, "experimental study on cloud computing based electric power scada system", zte communications, vol. 13, no. 3, pp. 33–41, september 2015.
[23] integrated service networks for utilities. cigré technical brochure tb 249, wgd2.07, 2004.
[24] b. zhu, a. joseph and a. sastry, "a taxonomy of cyber attacks on scada systems", in proceedings of the international conference on internet of things and the 4th international conference on cyber, physical, and social computing. ieee, 2011, pp. 380–388.
[25] z. el mrabet, n. kaabouch, has. el ghazi and ham. el ghazi, "cyber-security in smart grid: survey and challenges", comput. electr. eng., vol. 67, pp. 469–482, april 2018.
[26] w. gao, t. morris, b. reaves and d.
richey, "on scada control system command and response injection and intrusion detection", in proceedings of the 2010 ecrime researchers summit. ieee, 2010, pp. 1–9.
[27] a. sajid, h. abbas and k. saleem, "cloud-assisted iot-based scada systems security: a review of the state of the art and future challenges", ieee access, vol. 4, pp. 1375–1384, april 2016.
[28] b. a. akyol, "cyber security challenges in using cloud computing in the electric utility industry", technical report pnnl 21724, pacific northwest national laboratory, 2012. https://www.pnnl.gov/main/publications/external/technical_reports/pnnl-21724.pdf (accessed february 05, 2019).
[29] m. stojanović, v. aćimović-raspopović and s. boštjančič rakas, "security management issues for open source erp in the ngn environment", in enterprise resource planning: concepts, methodologies, tools, and applications, vol. ii, m. khosrow-pour, ed. new york: igi global, 2013, pp. 789–804.
[30] k. stouffer, j. falco and k. scarfone, guide to industrial control systems (ics) security. nist special publication 800-82 rev. 2, 2015.
[31] y. cherdantseva, p. burnap, a. blyth, p. eden, et al., "a review of cyber security risk assessment methods for scada systems", comput. secur., vol. 56, pp. 1–27, february 2016.
[32] n. hossain, a. hossain, t. das and t. islam, "measuring the cyber security risk assessment methods for scada system", glob. j. eng. sci. res. manag., vol. 4, no. 7, pp. 1–12, july 2017.
[33] a. ahmed and a. s. sabyasachi, "cloud computing simulators: a detailed survey and future direction", in proceedings of the 2014 ieee international advance computing conference (iacc). ieee, 2014, pp. 866–872.
[34] m. kamal, ics layered threat modeling, sans institute – information security reading room, march 2019. https://www.sans.org/reading-room/whitepapers/ics/ics-layered-threat-modeling-38770 (accessed april 02, 2019).
[35] y. gui, a. s. siddiqui and f.
saqib, "hardware based root of trust for electronic control units", in proceedings of the southeastcon 2018. ieee, 2018, pp. 1–7.

facta universitatis series: electronics and energetics vol. 32, no 2, june 2019, pp. 211-229 https://doi.org/10.2298/fuee1902211m

analysis of jamming successfulness against rcied activation with the emphasis on sweep jamming *

mladen mileusnić, branislav pavić, verica marinković-nedelicki, predrag petrović, dragan mitić, aleksandar lebl
iritel a.d., belgrade, serbia

abstract. in this paper we first briefly compare the performance of active jamming of remote controlled improvised explosive device activation using a wide-band noise signal and a frequency sweep signal. frequency sweep is the most widely used technique for active jamming, and we analyze its characteristics: 1) sweep speed, 2) conditions for certainly successful jamming, 3) successful jamming probability if jamming is not certainly successful, and 4) step of frequency change when frequency sweep is applied. a separate section is devoted to the calculation of successful jamming probability in general. attention is also paid to jamming probability determination when the starting and ending sweep signal frequencies are varied. the initial research has been upgraded and extended. the presented results refer to jamming equipment development in iritel, but they are also applicable to other similar jamming system realizations.

key words: jammer, remote controlled improvised explosive devices, frequency sweep, successful jamming probability, bit error correction

1. introduction

today the world is faced with growing challenges in the fight against terrorism.
methods of terrorist attacks are constantly improved. this is the reason why devices for the fight against these attacks must follow changes in the applied attack techniques. remote controlled improvised explosive devices (rcied) are widely used in terrorist attacks. such devices are activated by messages, which are transmitted from longer or shorter distances by wireless communications. the two most widely used jamming techniques against rcied activation are reactive and active jamming [2]. the advantage of reactive jamming is the lower level of emission power, because the jamming signal is generated only when an rcied activation message is detected in one intercepted channel. it is necessary to detect the appearance of the activation signal and its frequency, i.e. the channel which must be jammed. on the contrary, active jamming supposes constant jamming signal transmission independent of activation signal existence. the reactive jamming technique has been applied more often recently [3]-[10]. in the existing solutions fast fourier transform (fft) is usually used as a fast and reliable detection algorithm [3], [4]. pipelining the different operations of fft-based detection (signal samples collection, processing of these samples, decision making), instead of multiplying hardware elements, contributes to more reliable and faster rcied activation message detection even when it is necessary to analyze a frequency hopping signal [5].

received july 19, 2018; received in revised form october 3, 2018
corresponding author: aleksandar lebl, iritel a.d., 11080 belgrade, batajnički put 23, serbia (e-mail: lebl@iritel.com)
* the earlier version of this paper was awarded as the best one in the section telecommunications at the 5th icetran conference, palić, 11-14 june 2018, [1].
a survey of problems arising in the realization of reactive jammers is presented in [4]. the greatest attention in [4] is devoted to time synchronization in the case of simultaneous operation of multiple jammers. the characteristics of some other detector types, such as the energy detector, matched filter detector, feature detector and detector based on the calculation of eigenvalues of the covariance matrix, are theoretically compared in [6], [7]. contribution [8] deals with activation signal jamming in one specific network (ieee 802.15.4), where message packet duration is very short (only about 350 μs), which requires a very short detection time. in general, the achieved detection time is less than 1 ms in [9], and even about 200 μs for the frequency range up to 6 ghz in [10].

it is important to emphasize several problems which may occur when active jamming is applied. the first one is that activation signal power at the rcied location may vary widely, depending on the implemented techniques for message transmission and on the distance between the activation message transmitter and receiver. the second one is that the operating frequency for signal transmission may lie anywhere in a very wide frequency range. in such a situation the most reliable jamming method is wide-band jamming signal generation. it means that the available transmitter power is used over the whole frequency range. the jamming power is therefore distributed over many available channels and, as a consequence, its level in each channel is relatively low. the jamming signal power in a channel with an activation signal may not be enough to prevent rcied activation signal reception. the other possible, and most often implemented, signal generation method for active jamming is linear variation of jamming signal frequency (i.e. frequency sweep) [11]-[13].
in this second case it is possible to concentrate significantly higher power in the one channel where the activation message is transmitted, compared to wide-band jamming. but, as the jamming signal is not always present in each channel, there is a risk that the generated sweep signal would not reach the desired channel in time, i.e. while the activation signal is not yet finished. sweep jamming implementation is not limited only to rcied activation jamming. it may also be used for jamming mobile telephony systems [14]-[16]. the possibility of achieving a higher sweep speed [17] has made sweep jamming very popular and widely applied. in this way the benefits of sweep jamming in the area of power saving come to the fore. sweep jamming is today the dominant technique of active jamming, and this is the reason to devote significant attention to its analysis in this paper.

the relation between the necessary jamming signal power for wide-band and sweep jamming depends on several factors: the desired jamming probability, the implemented technique (modulation) for rcied activation message transmission, the level of environmental noise, and so on. the results of our analysis in [18] show that in the case of qpsk modulation, for small values of bit error rate (ber) up to ≈2.5%, wide-band jamming is more efficient than sweep jamming. the conclusion is based on the fact that under such conditions lower signal power is needed for wide-band jamming to achieve the same ber. but such low values of ber are not important for jamming realization, and for ber>2.5% sweep jamming is more efficient. for other psk modulation types the limit value of ber above which sweep jamming becomes more efficient than wide-band jamming is ≈10% for bpsk and less than 1% for 8psk and 16psk. the power saving increases with psk modulation level and it may reach even 11 db for 16psk.
the additional disadvantage of the higher power consumption needed for wide-band jamming is that the jammer may be detected more easily. as a consequence, there is a greater risk that personnel controlling jammer operation are exposed to enemy attack [19]. when comparing the efficiency of wide-band jamming and sweep jamming, the available literature mainly concentrates on their qualitative comparison or, in some cases, presents approximate quantitative results of such a comparison [20]. to the knowledge of the authors of this paper, there is no such analysis related to the ber value and the signal modulation type applied for rcied activation.

the main purpose and the novelty of this paper is that it presents and analyzes different parameters of sweep jamming:
1. sweep speed;
2. the role of practical sweep jamming realization as a step function instead of linear frequency change in jamming probability determination;
3. performance comparison of two different sweep jamming strategies;
4. jamming probability calculation when the starting and ending jamming frequency are varied;
5. jamming probability calculation when different error detection and correction algorithms are applied.

the method of frequency sweep realization for jamming rcied activation is presented in section 2 of this paper. the sweep speed is defined as the most important characteristic of this method. after that, successful jamming probability for frequency sweep signal implementation is determined in section 3. two methods of sweep signal generation are analyzed with respect to jamming reliability in section 4. section 5 explains the influence of starting and ending jamming frequency variation on the value of successful jamming probability. section 6 deals with the calculation of successful jamming probability when signal physical characteristics are such that reliable jamming is not guaranteed. at the end, section 7 presents conclusions.

2. sweep speed of rcied activation jamming signal

there are two jamming signal characteristics which must be considered to prevent successful rcied activation: jamming signal frequency and jamming signal level. the jamming signal frequency must be equal to the activation signal frequency or in its proximity. the allowed difference between activation and jamming signal frequency depends on several factors, such as the characteristics of the rcied activation message receiver (its bandwidth and attenuation characteristic) and the relation between the amplitudes of the jamming signal and the rcied activation signal.

in general, there are three possibilities in the analysis of jamming and activation signal levels at the location of the rcied activation message receiver. first, if the activation message signal level is greater than the jamming signal level, jamming is unsuccessful. in the other two situations jamming is successful, but the reaction of the activation message receiver is different. if jamming and activation signal levels have nearly the same values, the rcied receiver detects an activation message that is useless because its content is changed. if the jamming signal level is significantly greater than the activation signal level, the rcied receiver does not detect the activation message, but only the jamming signal [21].

let us suppose that it is necessary to jam a signal which may cause activation of an rcied anywhere in a frequency band of total width w (in hz) [22]. sweep jamming is applied in the same frequency bandwidth w=f2-f1, where f1 is the minimum and f2 is the maximum sweep signal frequency (figure 1). it can be supposed that jamming is successful under the condition that the jamming signal appears in the frequency band (channel) where the activation signal is transmitted; in that case the successful jamming probability is assumed to be pdist=1. the period of one sweep cycle is tsw.
the width of one channel where an activation signal is transmitted is c (channels c(1) and c(2) in figure 1). when the jamming signal appears somewhere in this channel while the rcied activation message is present (time interval tc in figure 1) and the condition related to the levels of the two considered signals is satisfied, we shall suppose that jamming is successfully realized. for the moment we also suppose that it is sufficient for the jamming signal to appear only once, during the message duration, in the frequency band reserved for rcied activation message transmission.

fig. 1 rcied activation jamming when jamming signal frequency is linearly changed (frequency f versus time t, with f1, f2, w, channels c(1) and c(2), tsw, tmess and tc marked).

sweep speed will be defined as the speed of frequency change:

$$v_{sw} = \frac{w}{t_{sw}}. \qquad (1)$$

jamming probability will be pdist=1 if the time of one cycle of frequency change from f1 to f2 satisfies the condition:

$$t_{sw} \le t_{mess}, \qquad (2)$$

where tmess is the rcied activation message duration. it follows from eq. (1) and eq. (2) that

$$v_{sw} \ge \frac{w}{t_{mess}}. \qquad (3)$$

it is possible that jamming is not successful although the jamming signal frequency is in the proximity of the activation message frequency and the condition related to signal levels is satisfied. in such a case the jamming signal must appear more than once (m times in our analysis) in the considered channel during the message duration to achieve a satisfactory rcied activation jamming probability, so the sweep speed must be increased. expression (3) is, consequently, changed to:

$$v_{sw} \ge \frac{m \cdot w}{t_{mess}}. \qquad (4)$$

the value of vsw for which equality holds in (3) or (4) defines the lower limit of sweep speed that assures successful jamming. at this speed the jamming signal is guaranteed to hit the considered channel at least once (in the case of (3)) or m times (in the case of (4)) during the message as its frequency sweeps.

3. successful jamming probability for frequency sweep implementation

let us suppose that the condition from eq. (2) is not satisfied, i.e. that tmess<tsw, so that the ratio

$$k = \frac{t_{sw}}{t_{mess}} \qquad (5)$$

satisfies k>1. in this case rcied activation signal jamming is not guaranteed, i.e. pdist<1. the probability of rcied activation jamming is:

$$p_{dist} = \frac{t_{mess}}{t_{sw}} = \frac{1}{k}. \qquad (6)$$

figure 2 presents successful jamming probability (pdist) as a function of the time interval of one sweep cycle (tsw) and message duration (tmess). the values of pdist are obtained on the basis of eq. (6) if the condition tmess≤tsw is satisfied. if tmess>tsw, the jamming signal frequency in any case crosses the frequency of the rcied activation message at least once. that is why pdist=1 in such a situation, provided that the other conditions for successful jamming are satisfied.

fig. 2 rcied successful jamming probability as a function of sweep time and message duration (axes: tmess (ms), tsw (ms), pdist).

fig. 3 practical realization parameters of rcied activation jamming (frequency f versus time t, with frequency step f∆ and time step t∆, tc=t∆, channels c(1) and c(2)).

practical realization of sweep signal generation differs from the presentation in figure 1. instead of being generated by linear frequency change, the signal is generated as a stepwise function. in this way an approximation of linearly variable signal frequency is realized, as presented in figure 3. according to this figure, the basic data defined in the implementation are the time step (t∆) and the frequency change step (f∆). these two values may be used to express sweep speed in another manner as

$$v_{sw} = \frac{f_{\Delta}}{t_{\Delta}}. \qquad (7)$$

if the condition

$$f_{\Delta} \le c \qquad (8)$$

is satisfied, jamming will be certainly successful. if not, there are two possibilities:
1. the generated frequency is at no moment in the frequency range dedicated to the considered channel (channel c(1) in figure 3);
2. the generated frequency coincides during some time interval with the frequency of a channel (interval tc in figure 3, when the signal in channel c(2) is jammed).
in the first case jamming will be unsuccessful, while in the second case it will be successful.

the aim of practical sweep signal generation is to approximate linear frequency change as closely as possible. to achieve this, the minimum value of t∆ (t∆min) allowed by the applied hardware components is chosen [17]. the calculation is performed for the t∆ value defined in this way. the details of the implemented jammer solution are presented in [13].

let us suppose that we want to determine whether it is possible to assure successful jamming using the selected hardware component for sinusoidal signal generation. the first step in the analysis is to find the necessary number of frequency steps for linear approximation of frequency change. we have already emphasized in eq. (2) the necessary condition for such successful jamming. in the limiting case tsw=tmess, the number of frequency steps for linear approximation of frequency change is:

$$n_{s} = \frac{t_{sw}}{t_{\Delta \min}}. \qquad (9)$$

the value of the frequency change step necessary to achieve successful jamming may now be determined on the basis of eq. (8) and eq. (9) as:

$$f_{\Delta} = \frac{w}{n_{s}} = \frac{w \cdot t_{\Delta \min}}{t_{sw}} \le c. \qquad (10)$$

the conclusion of this short analysis expressed by eq. (9) and eq. (10) is that fast linear change of jamming frequency does not always lead to successful jamming.
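the feasibility check expressed by eq. (8), eq. (9) and eq. (10) can be illustrated with a minimal sketch. the function name and all numerical values below are hypothetical and chosen only for illustration; they are not taken from the jammer described in [13].

```python
def sweep_step_check(w_hz, t_sw_s, t_step_min_s, c_hz):
    """Check whether a stepwise sweep can place a step inside every channel.

    w_hz          total swept bandwidth w = f2 - f1
    t_sw_s        duration of one sweep cycle (here taken equal to t_mess)
    t_step_min_s  minimum time step allowed by the hardware (t_delta_min)
    c_hz          width of one channel
    Returns (n_s, f_step_hz, feasible).
    """
    n_s = t_sw_s / t_step_min_s      # eq. (9): number of frequency steps per sweep
    f_step_hz = w_hz / n_s           # eq. (10): resulting frequency change step
    feasible = f_step_hz <= c_hz     # eq. (8): step must not exceed channel width
    return n_s, f_step_hz, feasible


# hypothetical example: 100 MHz band, 1 ms sweep, 1 us minimum step, 25 kHz channels
n_s, f_step, ok = sweep_step_check(100e6, 1e-3, 1e-6, 25e3)
print(f"steps: {n_s:.0f}, frequency step: {f_step / 1e3:.1f} kHz, feasible: {ok}")
```

with these assumed numbers the frequency step is larger than the channel width, so the sweep time alone would be satisfactory but the step condition of eq. (8) would not hold.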
the jamming successfulness is also related to the characteristics of the hardware components applied for jamming signal synthesis, namely to the possibility of achieving a sufficiently short step for linear approximation of frequency change. it is possible that the time of one frequency sweep from the minimum to the maximum frequency is satisfactory, but that one step of frequency change is still greater than one channel width (c), thus causing unreliable jamming.

4. comparison of jamming successfulness for two sweep signal generation methods

there are two methods for sweep signal generation:
1. signal frequency is always generated from its minimum towards its maximum value, and after reaching the maximum value it immediately drops down to the minimum value;
2. signal frequency starts to increase linearly from its minimum value and, when it reaches its maximum, starts to decrease linearly towards the minimum, usually at the same rate as in the increasing direction.

fig. 4 jamming possibilities of rcied activation signal for two methods of sweep signal generation (panels a and b: frequency f versus time t, with f1, f2, tsw, tmess and channels c(1) and c(2)).

figure 4 presents these two methods for sweep signal generation. the first method is shown in figure 4a and the second one in figure 4b. two rcied activation messages are taken into account together with a sweep signal in both cases. the rcied activation messages are located in two different frequency bands: c(1) and c(2). in this example the message length (tmess) is equal to the sweep time (tsw). if a sweep signal is generated according to the first method, jamming is always successful, irrespective of the part of the frequency range between f1 and f2 where the rcied activation signal appears.
however, if the sweep signal is generated according to the second method, jamming may be successful for a signal in channel c(2), where the jamming signal hits the channel with the activation message twice. it may also be unsuccessful for a signal in channel c(1), because the jamming signal does not hit channel c(1) during the message. it is important to emphasize that jamming is certainly successful for the second method of sweep signal generation if a condition slightly modified with respect to eq. (2) is satisfied:

$$2 \cdot t_{sw} \le t_{mess}. \qquad (11)$$

successful jamming probability for the second method of jamming signal generation is determined starting from formula (11) and is:

$$p_{dist} = \frac{t_{mess}}{2 \cdot t_{sw}} = \frac{1}{2 \cdot k}. \qquad (12)$$

figure 5 presents the variation of rcied activation signal jamming probability as a function of the ratio tsw/tmess for the two presented methods of sweep signal generation. the graph in this figure illustrates that successful jamming probability is always greater if the sweep signal is generated according to the first method for all values tsw/tmess>0.5. for tsw/tmess≤0.5 both methods have pdist=1.

fig. 5 successful jamming probability as a function of the ratio tsw/tmess for two methods of sweep signal generation.

5. the role of starting and ending jamming frequency selection

the analysis from the previous sections and the graphs in figure 2 and figure 5 demonstrate that successful jamming probability decreases very fast when tsw is greater than tmess, i.e. when the activation message is short. there is a limit to the increase of sweep speed due to the characteristics of the hardware components used for signal generation. also, too high a sweep speed shortens the time during which the jamming signal frequency stays close enough to the activation message frequency, so the message content may not be changed enough for jamming to succeed.
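the comparison of the two sweep generation methods from section 4 can be sketched numerically as follows. the sketch assumes that eq. (6) and eq. (12) hold below the limits given by eq. (2) and eq. (11), and that the probability saturates at 1 once those conditions are met; the function names are hypothetical.

```python
def p_dist_sawtooth(t_mess, t_sw):
    """First method (frequency jumps back to f1 after reaching f2).

    eq. (6): p_dist = t_mess / t_sw, limited to 1 when eq. (2) holds.
    """
    return min(1.0, t_mess / t_sw)


def p_dist_triangular(t_mess, t_sw):
    """Second method (frequency rises to f2 and then falls back to f1).

    eq. (12): p_dist = t_mess / (2 * t_sw), limited to 1 when eq. (11) holds.
    """
    return min(1.0, t_mess / (2.0 * t_sw))


# probabilities for a few ratios t_sw / t_mess (t_mess normalized to 1)
for ratio in (0.25, 0.5, 1.0, 2.0, 4.0):
    print(ratio, p_dist_sawtooth(1.0, ratio), p_dist_triangular(1.0, ratio))
```

under these assumptions both functions return 1 for tsw/tmess≤0.5, and the first method gives the higher probability for larger ratios, in line with the description of figure 5.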
These problems may be overcome if the sweep cycle does not cover the whole frequency band foreseen in the jammer, but only a smaller range of frequencies that is estimated to contain the activation message frequency. In that case $t_{sw}$ is no longer significantly greater than $t_{mess}$. The narrower frequency limits of the expected activation signal must be known in advance, which makes it possible to define a smaller distance between the lowest and the highest sweep frequency. It is demonstrated in [23] that the operating frequencies used for RCIED activation are specific to different war areas. These frequencies depend on the devices that can be easily purchased in the area and then simply adapted for the application. It is thus possible to predict the expected activation frequencies a priori. In any case, the condition

$$f_{down} \le f_{mess} \le f_{up} \tag{13}$$

must be satisfied for jamming to succeed. In this expression $f_{mess}$ is the frequency used for activation message transmission, and $f_{down}$ and $f_{up}$ are the minimum and maximum sweep frequency, respectively.

Figures 6, 7 and 8 present $p_{dist}$ as a function of $f_{down}$ and $f_{up}$. The frequencies $f_{down}$ and $f_{up}$ are presented as shifted (normalized) values. The value 0 in these figures corresponds to the minimum sweep frequency ($f_{minsw}$) that can be implemented in the jammer when the sweep signal is realized, while the value 1 corresponds to the maximum sweep frequency ($f_{maxsw}$). The activation signal frequency is shifted in the same way. The graphs in Figures 6, 7 and 8 are meaningful only for $f_{down}\le f_{up}$, which is why $p_{dist}=0$ if this condition is not satisfied.

Fig. 6 Successful jamming probability as a function of minimum ($f_{down}$) and maximum ($f_{up}$) shifted sweep frequency, $t_{mess}/t_{sw}=0.2$, shifted $f_{mess}=0.3$.

Fig. 7 Successful jamming probability as a function of minimum ($f_{down}$) and maximum ($f_{up}$) shifted sweep frequency, $t_{mess}/t_{sw}=0.2$, shifted $f_{mess}=0.6$.

Fig. 8 Successful jamming probability as a function of minimum ($f_{down}$) and maximum ($f_{up}$) shifted sweep frequency, $t_{mess}/t_{sw}=0.5$, shifted $f_{mess}=0.6$.

Figures 6 and 7 are plotted for the case when the complete sweep cycle from the minimum to the maximum frequency lasts five times longer than the activation message ($t_{mess}/t_{sw}=0.2$), while Figure 8 is plotted for $t_{mess}/t_{sw}=0.5$. Figure 6 corresponds to a shifted activation signal frequency of 0.3 (i.e., the real value of this frequency is $f_{minsw}+(f_{maxsw}-f_{minsw})\cdot 0.3$), while the shifted frequency for Figures 7 and 8 is 0.6. The main conclusion from the graphs in Figures 6, 7 and 8 is that $p_{dist}$ may reach the value 1, which is not possible if the whole range of frequencies is swept. However, it is also possible that the activation frequency never falls within the range of jammed frequencies, in which case $p_{dist}=0$. A good estimate of the frequency range used for activation signal transmission is therefore very important.
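The dependence of the jamming probability on the sweep limits can be sketched numerically. The Python function below is only an illustration under stated assumptions, not the calculation used for Figures 6 to 8: it assumes that the sweep time is proportional to the swept span, that the first method gives a probability of min(1, t_mess/t_sw) (implied by, but not restated in, this section), and that the second method follows eq. (12). The names `p_dist_partial_sweep`, `ratio` and `method` are hypothetical.

```python
def p_dist_partial_sweep(f_down, f_up, f_mess, ratio, method=1):
    """Estimate the successful jamming probability for a restricted sweep range.

    f_down, f_up, f_mess: shifted frequencies in the range [0, 1].
    ratio: t_mess / t_sw for a sweep over the whole range (0 to 1).
    method: 1 or 2; the second method halves the probability, as in eq. (12).
    Assumption: sweep time is proportional to the swept span.
    """
    if method not in (1, 2):
        raise ValueError("method must be 1 or 2")
    # Condition (13); it also covers f_down > f_up, where nothing is jammed.
    if not (f_down <= f_mess <= f_up):
        return 0.0
    span = f_up - f_down
    if span == 0.0:
        return 1.0  # the jammer stays on the message frequency
    return min(1.0, ratio / (method * span))


if __name__ == "__main__":
    # Setting of Fig. 6: t_mess/t_sw = 0.2, shifted f_mess = 0.3
    print(p_dist_partial_sweep(0.0, 1.0, 0.3, 0.2))  # whole range swept
    print(p_dist_partial_sweep(0.2, 0.4, 0.3, 0.2))  # narrow range around f_mess
```

Under these assumptions, sweeping the whole range with a ratio of 0.2 gives a probability of 0.2, while restricting the sweep to a span of 0.2 that contains the message frequency already gives 1.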
The second possibility for increasing the probability of successful jamming is to generate sweep signals simultaneously in several frequency bands, so that the whole available frequency range is divided into bands and each band is swept separately. In this way, multisweep signal generation is implemented at the same speed in m bands at the same time. Consequently, $p_{dist}$ is also increased m times, up to the limit $p_{dist}=1$. In the solution presented in [13], the value of m is 7.

6. The influence of RCIED activation message characteristics on successful jamming probability

So far the analysis has assumed that the jamming signal characteristics guarantee successful jamming whenever the signal appears in the channel where the RCIED activation message is transmitted. However, this condition may not be satisfied (first of all because of a low jamming signal level, as already stated in Section 3). Even when the jamming signal level is satisfactory (i.e. greater than the level of the RCIED activation message), it is possible that BER<1. This means that each bit in the activation message is changed from its correct value with probability equal to BER. The total number of bits forming an activation message is n. It is supposed that error correction coding is not applied, which means that the activation message is successfully transmitted only if all of its bits are correctly transmitted. The probability of successful message transmission is therefore:

$$p_{sa} = (1-BER)^{n}, \tag{14}$$

and the successful jamming probability is:

$$p_{dist} = 1 - p_{sa} = 1-(1-BER)^{n}. \tag{15}$$

Figure 9 presents the probability of successful RCIED activation jamming ($p_{dist}$) as a function of the number of bits n that form a message and of BER. This graph is obtained on the basis of eq. (15). Its importance is that it presents the dependence of $p_{dist}$ on two independent variables.
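Equation (15) is simple to evaluate and to invert. The short Python sketch below (the function names are hypothetical) computes $p_{dist}$ for an uncoded message and the smallest BER that reaches a chosen target probability, obtained by solving eq. (15) for BER: $BER = 1-(1-p_{dist})^{1/n}$.

```python
def p_dist_uncoded(n, ber):
    """Eq. (15): probability that at least one of the n message bits is changed."""
    return 1.0 - (1.0 - ber) ** n


def min_ber_for_target(n, target):
    """Smallest BER for which eq. (15) gives p_dist >= target."""
    return 1.0 - (1.0 - target) ** (1.0 / n)


if __name__ == "__main__":
    # One-byte message, target jamming probability 0.95
    print(min_ber_for_target(8, 0.95))
    print(p_dist_uncoded(8, 0.35))
```

For example, an 8-bit message and a target probability of 0.95 require a BER of roughly 0.31.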
We shall suppose that satisfactory combinations of n and BER are those that give $p_{dist}>0.95$. If the activation message consists of only one byte (8 bits), the desired jamming probability is achieved for BER≈0.35.

Fig. 9 Successful jamming probability ($p_{dist}$) as a function of message length (n) and bit error rate (BER).

There is a great variety of transmission techniques that may be used for sending the RCIED activation message. It is possible to use an algorithm that corrects a certain number of incorrectly transmitted message bits. In this paper we consider algorithms that correct one or two message bits. In the case of a code able to correct one message bit, a message is successfully transmitted if no more than one bit is faulty. When there are no faulty bits, the probability of successful message transmission is given by (14). In the other possible case, when one bit is faulty, the probability of successful message transmission is

$$p_{sa1} = \binom{n}{1}\, BER\, (1-BER)^{n-1}. \tag{16}$$

The successful jamming probability on the basis of (14)–(16) is then:

$$p_{dist} = 1 - p_{sa} - p_{sa1} = 1-(1-BER)^{n} - \binom{n}{1}\, BER\, (1-BER)^{n-1}. \tag{17}$$

If the code can correct two faulty message bits, a message is correctly transmitted if there are no more than two faulty message bits. The probability of successful transmission when two bits are faulty is

$$p_{sa2} = \binom{n}{2}\, BER^{2}\, (1-BER)^{n-2}, \tag{18}$$

i.e., the total successful jamming probability in this case is:

$$p_{dist} = 1 - p_{sa} - p_{sa1} - p_{sa2} = 1-(1-BER)^{n} - \binom{n}{1}\, BER\, (1-BER)^{n-1} - \binom{n}{2}\, BER^{2}\, (1-BER)^{n-2}. \tag{19}$$

Fig. 10 Successful jamming probability in the case of error correction coding application to RCIED activation message for BER=0.4.

Fig. 11 Successful jamming probability in the case of error correction coding application to RCIED activation message for BER=0.6.

The graphs in Figures 10 and 11 present the successful jamming probability as a function of the number of bits that form the activation message. These graphs are obtained using formulas (15), (17) and (19). The parameter in the figures is the number of bits that can be corrected in the RCIED receiver by the implemented error correction algorithm. The graphs in Figure 10 and Figure 11 are presented for BER=0.4 and BER=0.6, respectively. The aim is to achieve as great a value of $p_{dist}$ as possible, and for practical considerations a satisfactory jamming probability is taken to be $p_{dist}=0.95$, as already pointed out. For BER=0.6 this target value is reached even against a very robust error correction algorithm that can correct two bit errors per message, and even for very short messages of only 8 bits. Such short messages are unlikely to occur in practice.

Fig. 12 Successful jamming probability in the case of error correction coding application to RCIED activation message for n=16 bits.

Fig. 13 Successful jamming probability in the case of error correction coding application to RCIED activation message for n=32 bits.

Figures 12 and 13 present the successful jamming probability as a function of BER when the activation message consists of only 16 bits (Figure 12) or 32 bits (Figure 13). These results are also obtained on the basis of expressions (15), (17) and (19).
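Expressions (15), (17) and (19) are the first terms of one binomial sum, so they can be evaluated with a single function. The Python sketch below is an illustration with hypothetical names (`p_dist_with_correction`, `t`); it assumes independent bit errors, as the formulas above do, and uses `math.comb` from the standard library (Python 3.8 or later).

```python
from math import comb


def p_dist_with_correction(n, ber, t=0):
    """Successful jamming probability when the receiver corrects up to t bit errors.

    t=0 reproduces eq. (15), t=1 eq. (17) and t=2 eq. (19).
    """
    # The message survives if at most t of its n bits are faulty.
    p_ok = sum(comb(n, i) * ber**i * (1.0 - ber) ** (n - i) for i in range(t + 1))
    return 1.0 - p_ok


if __name__ == "__main__":
    for n, ber in [(8, 0.6), (16, 0.35), (32, 0.19)]:
        print(n, ber, round(p_dist_with_correction(n, ber, t=2), 4))
```

With two-bit correction, this evaluates to about 0.950 for n=8 and BER=0.6, and to about 0.955 for n=16 and BER=0.35.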
When an algorithm with two-bit correction is applied, a satisfactory jamming probability $p_{dist}>0.95$ is reached for BER=0.35 if the message consists of 16 bits, or for BER=0.19 if the message consists of 32 bits. Figures 14 and 15 present the successful jamming probability as a function of n and BER for the case when one bit error in the RCIED activation message can be corrected (Figure 14) and when two bit errors can be corrected (Figure 15). The graph in Figure 14 is obtained using (17) and the graph in Figure 15 using (19). Together with the graph in Figure 9, these two figures give a complete picture.

Fig. 14 Successful jamming probability as a function of n and BER when it is possible to correct one bit error.

Fig. 15 Successful jamming probability as a function of n and BER when it is possible to correct two bit errors.

The results from [18] may be used to estimate the jamming signal power, relative to the activation message level, that is necessary to achieve the desired BER values. The graphs in [18] are presented for the frequently applied MPSK (M-ary phase shift keying) activation signal, where the values of M are 2 (BPSK, binary PSK), 4 (QPSK, quaternary PSK), 8 and 16. When the message consists of a relatively small number of bits (16 in Figure 12 or 32 in Figure 13), BPSK or QPSK is expected to be applied.

It is very interesting to make an additional comparison between the sweep speed when active jamming is implemented and the signal detection time needed when reactive jamming is realized by FFT analysis.
The very fast, modern digital signal processors (DSP) presented in [24], [25] make it possible to achieve very short detection times [10], which are even significantly shorter than the time necessary to complete one sweep cycle. It is often stated in the literature that active jamming is more reliable than reactive jamming. We may conclude that this statement is certainly valid only if wide-band noise jamming is used as the method of active jamming. When a sweep signal is used for active jamming, it is possible to find the frequency of the RCIED activation signal by FFT analysis and to start jamming signal generation in a shorter time than is needed to complete one sweep cycle over all envisaged frequencies. The FFT analysis rate depends on the DSP clock frequency, the number of activated DSP cores and the use of an additional hardware accelerator in the DSP. The clock for the DSP core is obtained from PLL components, which can generate very fast clock signals [26]. Since as many as three factors can increase the FFT calculation speed, the analysis throughput is several tens of times greater when these factors take their maximum values than when they take their minimum values. The consequence is that for some combinations of the considered factors active sweep jamming is more reliable, and for others reactive jamming is the better solution. A more detailed quantitative comparison of these two jamming scenarios, which may be realized by the components presented in [17], [24] and [25], will be the subject of our future analysis.

7. Conclusions

There are two techniques applied to RCIED activation jamming: active and reactive jamming. Frequency sweep, as the most widely used technique for active jamming, is analyzed in this paper. The introductory section explains why sweep jamming is important in practice and what its advantages and disadvantages are.
We emphasized the condition under which jamming is certain to succeed and presented a method for calculating the jamming probability when success is not certain. In the analysis, two methods of sweep signal generation are compared in terms of successful jamming probability, and all formulas are developed for both methods. Attention is devoted to practical sweep hardware implementation, where a linearly variable sweep frequency is approximated by a stepwise change of signal frequency. It is proved that unsuccessful jamming may be caused not only by a frequency sweep that is too slow compared to the RCIED message duration, but also by an excessively large frequency step in the stepwise realization of the jamming signal. A separate section is devoted to determining the successful jamming probability when the starting and ending sweep jamming frequencies are varied. Finally, we presented a method for calculating the successful jamming probability in general. We analyzed the influence of the transmission BER, the RCIED activation message length and the error correction coding algorithm applied to activation messages on the calculated jamming probability.

Fig. 16 RCIED jammer at the Defense & Security International Exhibition Eurosatory 2018 in Paris.

This paper is an enhanced version of the contribution [1]. Compared to [1], the new Section 5 explains how changes of the starting and ending sweep frequency influence the successful jamming probability. The conclusions from the new, final part of Section 3 (eq. (9) and eq. (10)) are important for the practical realization of jamming. It is proved in this part of the paper that jamming may be unsuccessful although the sweep speed satisfies the condition $t_{mess}>t_{sw}$.
The results in Section 6 are completed by new graphs in Figure 12 and Figure 13, which present the successful jamming probability as a function of BER when the activation message length is fixed. This is another way to present the results from Figure 10 and Figure 11, where the message length was variable and BER was fixed. The graphs in Figure 14 and Figure 15 are also new in comparison to [1]. They present the successful jamming probability as a joint function of the number of bits forming an RCIED activation message (n) and BER. Comparing these two graphs with the graph in Figure 9 shows how much a bit error correction algorithm in the RCIED activation message decreases the successful jamming probability. The importance of the additional, last paragraph in Section 6 is that it emphasizes the fact that reactive jamming may in some cases be more reliable than active jamming realized by sweep signal generation. As far as our study of published papers shows, such a statement has not been proved in the available literature. We plan to proceed with a more detailed quantitative analysis of this problem in our future development work. Last but not least, in Section 1 we have added the main results from [18], which are related to a quantitative comparison of the necessary signal power in the case of sweep jamming and wide-band jamming for several modulation techniques. The results are presented without the detailed mathematical proof, which is given in [18].

The presented analysis is based on long-standing IRITEL experience in the development of systems for RCIED activation jamming [13], [18], [27] and of jammers intended for other applications [11], [12], [16]. The analysis procedures and RCIED jamming implementation are mainly related to [13].
The realized jammer was presented with great success at Eurosatory 2018, the Defense & Security International Exhibition in Paris (Figure 16). Given the application of new technologies in our RCIED jammer implementations, such as an absorptive filter at the power amplifier outputs, new theoretical approaches and papers related to this topic, such as [28], are of interest.

Acknowledgement: The paper is realized in the framework of the project TR32051, which is co-financed by the Ministry of Education, Science and Technological Development of the Republic of Serbia.

References
[1] M. Mileusnić, B. Pavić, V. Marinković-Nedelicki, P. Petrović, D. Mitić and A. Lebl, “Analysis of jamming successfulness against RCIED activation”, In Proceedings of the 5th International Conference IcETRAN 2018, Palić, June 11-14, 2018.
[2] S. D’Oro, L. Gallucio, G. Morabito and S. Palazzo, “Efficiency analysis of jamming-based countermeasures against malicious timing channel in tactical communications”, In Proceedings of the 2013 IEEE International Conference on Communications ICC, Budapest, June 2013.
[3] K. Wilgucki, R. Urban, G. Baranowski, P. Grądzki and P. Skarźyński, “Automated protection system against RCIED, Military Communications and Information Technology”, Chapter 7: “Cognitive radio and spectrum management techniques”, 2012, pp. 593-601.
[4] J. Mietzner, P. Nickel, A. Meusling, P. Loos, and G. Bauch, “Responsive communications jamming against radio-controlled improvised explosive devices”, IEEE Communications Magazine, vol. 50, no. 10, pp. 38–46, October 2012.
[5] L. Karlsson, “Method, system and apparatus for maximizing a jammer’s time-on-target and power-on-target”, United States Patent Application, Publication No. US 2006/0164283 A1, 27. July 2006.
[6] M. Tanatwy, “Responsive communication jamming detector with noise power fluctuation using cognitive radio”, International Journal of Innovative Research in Computer and Communication Engineering, vol. 2, no. 10, pp.
5967–5973, October 2014.
[7] T. Trump and I. Müürsepp, “Detection speed of responsive communication jamming detectors, Recent Advances in Telecommunications and Circuits”, In Proceedings of the 2nd International Conference on Circuits, Systems, Communications, Computers and Applications, Dubrovnik, June 2013, pp. 149-154.
[8] M. Wilhelm, I. Martinović, J. Schmitt, and V. Lenders, “Reactive jamming in wireless networks: how realistic is the threat?”, In Proceedings of the 4th ACM Conference on Wireless Network Security (WiSec '11), ACM, Hamburg, June 2011, pp. 47-52.
[9] G. Evans, “A new weapon in the fight against RCIEDs”, Army Technology, August 2015, https://www.army-technology.com/features/featurea-new-weapon-in-the-fight-against-rcieds-4647155/.
[10] Selena Electronics, “RSS intelligent reactive stationary jammer and RSV vehicle reactive jammer”, in “Electronics warfare systems: jamming solution”, 2015.
[11] IRITEL high frequency (HF) radio surveillance and jamming system, in the book M. Streetly, Jane’s Radar and Electronic Warfare Systems. IHS Global Limited, 2011.
[12] IRITEL very/ultra high frequency (V/UHF) radio surveillance and jamming system, in the book M. Streetly, Jane’s Radar and Electronic Warfare Systems. IHS Global Limited, 2011.
[13] M. Mileusnić, P. Petrović, B. Pavić, V. Marinković-Nedelicki, J. Glišović, A. Lebl and I. Marjanović, “The radio jammer against remote controlled improvised explosive devices”, In Proceedings of the 25th Telecommunications Forum (TELFOR). Belgrade, November 2017, pp. 151–154.
[14] Phantom Technologies Ltd., TSECNET S.r.l., SGS, “Selective cellular jammer”, http://www.tsecnet.com/assets/docs/tsecnet%20cellular_selective_jammer.pdf.
[15] “GBPPR 800MHz cellular phone jammer”, http://67.225.133.110/~gbpprorg/mil/celljam1/.
[16] N. Remenski, B. Pavić, P. Petrović, M. Mileusnić, V. Marinković-Nedelicki, “Integrisana radio-oprema za zaštitu prostora od mobilnih veza (treća generacija radio-opreme), tehničko rešenje – novi proizvod s oznakom CJ-1P na projektu tehnološkog razvoja TR-11030 “Razvoj i realizacija nove generacije softvera, hardvera i usluga na bazi softverskog radija za namenske aplikacije”, 2010., http://www.iritel.com/images/pdf/cj-1p-e.pdf, (also published in the book M. Streetly, Jane’s Radar and Electronic Warfare Systems. IHS Global Limited, 2011.). Prva generacija radio-opreme s oznakom CJ-1 je realizovana na projektu tehnološkog razvoja TR6149B, 2006.
[17] Analog Devices, 1 GSPS, 14-bit, 3.3V CMOS direct digital synthesizer AD9910, Data sheet, 2017, http://www.analog.com/media/en/technical-documentation/data-sheets/ad9910.pdf.
[18] M. Mileusnić, P. Petrović, B. Pavić, V. Marinković-Nedelicki, V. Matić and A. Lebl, “Jamming of MPSK modulated messages for RCIED activation”, paper accepted for 8th International Scientific Conference on Defensive Technologies OTEH 2018, Belgrade, 11-12th October 2018.
[19] Elbit Systems EW and SIGINT – Elisra: MRJ family – miniature reactive jammer family for Eurosatory 2016 exhibition, http://elbitsystems.com/media/mrj.pdf.
[20] Elaman German Security Solutions, “Jammer – principle of operation”, https://ht.transparencytoolkit.org/rcs-dev%5cshare/documentation/gamma/elaman/elamancat/jammer/jammer%20principles%20of%20operation.pdf.
[21] M. Strasser, C. Pöpper, S. Čapkun and M. Čagalj, “Jamming-resistant key establishment using uncoordinated frequency hopping”, IEEE Symposium on Security and Privacy. Oakland, CA, USA, May 2008.
[22] K. Burda, “The performance of follower jammer with a wideband scanning receiver”, Journal of Electrical Engineering, vol. 55, no. 1-2, pp. 36–38, 2004.
[23] A. Gulyás, “The radio controlled improvised explosive device (RCIED) threat in Afghanistan”, AARMS, vol. 12, no. 1, pp. 1–11, 2013.
[24] X. Li and E. Blinka, “Very large FFT for TMS320C6678 processors”, Texas Instruments, 2015, pp. 1–6.
[25] Texas Instruments, “Multicore fixed and floating-point digital signal processor”, SPRS691 – November 2010 – revised March 2014, pp. 1–242.
[26] W. Wang, X. Chen, H. Wong, “A system-on-chip 1.5GHz phase locked loop realized using 40nm CMOS technology”, Facta Universitatis, Series: Electronics and Energetics, vol. 31, no. 1, pp. 101–113, March 2018.
[27] P. Petrović, N. Remenski, P. Jovanović, V. Tadić, B. Pavić, M. Mileusnić, B. Mišković, “WRJ 2004 wideband radio jammer against RCIEDs”, tehničko rešenje – novi proizvod na projektu tehnološkog razvoja TR32051 pod nazivom “Razvoj i realizacija naredne generacije sistema, uređaja i softvera na bazi softverskog radija za radio i radarske mreže”, 2011., http://www.iritel.com/images/pdf/wrj2004-e.pdf.
[28] S. C. Dutta Roy, “A new lumped element bridged-T absorptive band-stop filter”, Facta Universitatis, Series: Electronics and Energetics, vol. 30, no. 2, pp. 179–185, June 2017.
Facta Universitatis
Series: Electronics and Energetics Vol. 33, No 1, March 2020, pp. 133-153
https://doi.org/10.2298/fuee2001133a

Wind Turbine Tower Detection Using Feature Descriptors and Deep Learning

Fereshteh Abedini1, Mahdi Bahaghighat2, Misak S’hoyan3
1Department of Electrical Engineering, Amirkabir University of Technology, Tehran, Iran
2Department of Electrical Engineering, Raja University, Qazvin, Iran
3Department of Information Security, National Polytechnic University of Armenia, Yerevan, Armenia

Abstract. Wind turbine towers (WTTs) are the main structures of wind farms. They are costly devices that must be thoroughly inspected according to maintenance plans. Today, machine vision techniques together with unmanned aerial vehicles (UAVs) enable fast, easy, and intelligent visual inspection of these structures. Our work is aimed at developing a vision-based system to perform nondestructive tests (NDTs) of wind turbines using UAVs. In order to navigate the flying machine toward the wind turbine tower and reliably land on it, the exact position of the wind turbine and its tower must be detected. We employ several strong computer vision approaches, namely scale-invariant feature transform (SIFT), speeded up robust features (SURF), features from accelerated segment test (FAST), brute-force matching, and the fast library for approximate nearest neighbors (FLANN), to detect the WTT. Then, in order to increase the reliability of the system, we apply the ResNet, MobileNet, ShuffleNet, EffNet, and SqueezeNet pre-trained classifiers to verify whether a detected object is indeed a turbine tower or not. This intelligent monitoring system has auto navigation ability and can be used for future goals including intelligent fault diagnosis and maintenance purposes. The simulation results show that the accuracy of the proposed model is 89.4% in WTT detection and 97.74% in verification (classification).

Key words: machine vision, object detection, vision inspection, wind turbine, deep learning.

Received August 27, 2019; received in revised form November 26, 2019
Corresponding author: Mahdi Bahaghighat, Department of Electrical Engineering, Raja University, Qazvin, Iran (e-mail: m.bahaghighat@aut.ac.ir)
© 2020 by University of Niš, Serbia | Creative Commons License: CC BY-NC-ND

1 Introduction

Providing reliable and affordable electricity to meet the increasing demand for energy in the near future is a worldwide concern. In this regard, developing renewable and clean energy sources such as wind turbine (WT) farms in smart grid (SG) infrastructures can play a crucial role in increasing the capacity of electricity production in many countries across the world. SGs widely deploy information and communication technology (ICT) subsystems [1]. There is an almost unlimited number of possible applications of ICT subsystems within the smart grid. SGs with these infrastructures make it more feasible to develop reliable systems through artificial intelligence (AI) [2–4]. On the other hand, guaranteeing the reliability of wind turbines is of great importance. In case of failure and faulty operation, the grid will face interruptions in its service. Costly breakdowns such as mechanical deformations, surface defects, and overheated components in rotor blades, nacelles, slip rings, yaw drives, bearings, gearboxes, generators, and transformers should be monitored in order to detect faults intelligently in a wind turbine farm [5–8].
Besides, WTs are costly devices that should have advanced maintenance systems [5]. In order to increase the lifetime of the WTs and reduce the maintenance cost, it is essential to improve the monitoring and maintenance approaches and to find solutions that avoid failure during in-service operation [9, 10]. Vision inspection paves the way toward generating reliable, efficient, and economical electrical energy in wind turbine farms. Image processing and machine learning methods have been widely employed to assist monitoring and fault diagnosis in energy systems [10–15]. The authors in [11] proposed a learning-based approach to inspect power line infrastructures. In [12], the authors suggested a smart framework for system reliability that uses machine learning algorithms to predict failures for preventive maintenance of system components. The authors in [10] used image data and suggested a model to estimate the rotational velocity of the turbine blade. Estimating the velocity of the blades helps to predict the amount of power generated by the WTs in the smart grid, which makes the grid a more reliable system. In [13], the authors employed machine learning algorithms to manage the energy of loads and sources in smart grids. The problems of malicious activity prediction and intrusion detection in smart grid communication systems have been analyzed using machine learning techniques in [14, 15]. The authors in [14] detected malicious events and improved system reliability. In [15], a novel method was proposed to reliably anticipate and warn of abnormalities and failures in distribution and communication systems. Deep learning techniques have had a tremendous impact in recent years on computer vision applications such as autonomous inspection and monitoring [16]. The use of convolutional neural networks (CNNs) has led computer vision to more advanced approaches.
the main feature of a cnn is its deep architecture [17]. one of the common and effective approaches in deep learning is to use a pre-trained network, and several classification problems have been solved this way [17]. for instance, the authors in [16] used a deep cnn architecture for the fault classification of power line insulators.

precise monitoring and forecasting of emerging faults and failures of wts are critical and potentially complex tasks. if system problems are detected and reported accurately, they can be fixed as soon as possible, which increases the reliability of the system. intelligent vision inspection techniques allow these predictions and controls to be carried out automatically and reliably. therefore, in this work, we propose an intelligent approach that deploys computer vision techniques to detect wind turbine towers (wtts). detection of wtts can ease the challenges of automatic fault detection and diagnosis in wind farms using unmanned aerial vehicles (uavs) [18]. the authors of [19] used signal processing approaches and a combination of line and feature detection to locate wind turbine towers. they started the wind turbine detection stage with the hough transform to detect lines, but the background contains many line-shaped objects, such as the horizon, shadows, mountains, and power lines, that should not be located. detecting the lines and then removing the false detections increases the computational cost and can decrease the overall accuracy.

in this research, we developed a new vision-based model to detect wtts and verify the detections. the proposed verification step, which is implemented using a deep learning classification method, is an extra phase that improves the reliability of the navigation system. this classifier decides between ok and ng (not good) detection results. here, ok means that a wtt is detected correctly and the uav should update its navigation information, while ng means that a false detection has occurred and the uav should keep its previous knowledge. in our future work, the uav will be equipped with thermal vision cameras for advanced visual nondestructive tests (ndt).

the remainder of this paper is organized as follows. section 2 presents the methodology of the proposed model. the experiments and results are reported in section 3. finally, in section 4, we conclude the paper.

fig. 1: flowchart of the proposed model.

2 methodology

in this work, we aim to develop an appropriate infrastructure for performing vision-based nondestructive tests (ndt) on wind turbines using uavs in future work. in order to navigate precisely, guide the flying machine toward the wind turbine tower, and land on it reliably, the position of the wind turbine and its tower must be estimated. to tackle this issue, we propose a model based on the flowchart of fig. 1. the model has two main steps: the first probes the wtt location in an input image, and the second checks whether the detection is a real wtt, using a deep learning classifier. this classifier decides between ok and ng detection results. the uav updates the position information when the classifier output is ok.

2.1 dataset

in our research, db1 includes about 1500 images that we captured at different angles, distances, and backgrounds in a real wind turbine farm. the farm contains more than 300 wtts, and on average 5 images per wtt are available in db1. this dataset is used for the object detection problem. in fig. 2, some selected samples from db1 are depicted.
besides this dataset, about 2000 images from two classes, not good (ng) as class 1 and ok as class 2, with equal distribution (1000 images for the ok class and 1000 for the ng class), have been collected as db2 to evaluate the performance of the proposed algorithm in the verification stage (classification problem). fig. 3 illustrates several examples of the two classes in db2.

fig. 2: some selected image samples in our dataset (db1).

fig. 3: several samples from our database (db2) for the verification problem, including two classes: (a) ng (not good), (b) ok.

2.2 the wtt object detection

object detection is a fundamental field of study in computer vision and image processing applications [20–27]. recently, various algorithms have been suggested for object detection purposes [28–31].
these algorithms extract local interest features (key points) and describe them to identify objects [28–31]. in [28], the well-known scale-invariant feature transform (sift) algorithm was presented as a scheme for extracting highly distinctive invariant features that can be used to match different views of objects. the advantage of sift is its invariance to scaling, rotation, and translation. the sift key point detector and descriptor have been reported to be remarkably effective in different applications [28]. however, sift is computationally expensive, especially for real-time systems. this has led to thorough research into alternative algorithms with lower computational cost, such as speeded up robust features (surf) and features from accelerated segment test (fast) [28–31].
in [29], the authors implemented a new detector and descriptor called surf, which is invariant to scaling and rotation. it is competitive with, and often superior to, sift in terms of repeatability, distinctiveness, and robustness, and it can be computed and compared much faster. sift and surf both consist of a detector and a descriptor. once key points have been extracted and described, a matching algorithm must be applied to compare the descriptors. here, we adopt the brute-force matcher [28] and the flann (fast library for approximate nearest neighbors) based matcher [32] to test the similarity between the descriptors of the training and test images. the brute-force matcher is simple: it takes each descriptor in the first set and compares it to all descriptors in the second set using a distance calculation, and the closest ones are returned as the best matches. flann contains a collection of algorithms optimized for fast nearest neighbor search in large datasets and for high-dimensional features.

2.3 the verification step based on deep learning classifiers

we used the sift, surf, and fast algorithms to extract features and detect wtts. since the object detection stage plays a crucial role in reliably navigating toward the correct target, we must verify our detection results. if the towers are not detected reliably, uavs and thermal cameras may hit the blades or land on wrong objects. as a result, the uavs and cameras may be damaged or the inspection and estimation data may be erroneous. there are many classical approaches to classification problems, such as random forest (rf), adaboost, k-nearest neighbor (knn), and support vector machine (svm) [33–38], but to verify the object detection output accurately, we propose the use of a pre-trained cnn as the classifier. the authors in [17] presented mobilenets as a class of efficient models for mobile and embedded vision applications. as indicated in fig. 4, mobilenets are based on an architecture that uses depthwise separable convolutions to build lightweight deep neural networks [17, 39–41]. in [17], the authors introduced two simple global hyper-parameters that trade off latency against accuracy. these hyper-parameters allow the model builder to choose a model of the right size for the application according to the constraints of the problem. they presented many experiments on resource and accuracy trade-offs and demonstrated better performance compared to other common models on imagenet classification [17].

fig. 4: the standard convolution filters in (a) are replaced by two layers, depthwise convolution in (b) and pointwise convolution in (c), to build depthwise separable filters [17]. panels: (a) standard convolution filters; (b) depthwise convolution filters; (c) 1×1 convolution filters, called pointwise convolution in the context of depthwise separable convolution. dimension labels in the figure: dk, m, n.

in this work, we applied the mobilenet [17], shufflenet (an extremely efficient convolutional neural network for mobile devices) [42], effnet (an efficient structure for convolutional neural networks) [43], squeezenet [44], and resnet [45, 46] pre-trained classifiers with two added fully connected layers, dense 1 and dense 2. dense 2 is the output layer and is fixed for our binary classifier, while the parameters of dense 1 are treated as optimization parameters. shufflenet is a practical cnn architecture with high computational efficiency. it provides more feature map channels to encode more information, which is important for the performance of very small networks. shufflenet is designed for embedded devices such as mobile phones with very low computing power [42].
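as a rough illustration of why the depthwise separable filters of fig. 4 are lightweight, the following sketch counts filter weights for one layer. it assumes the usual reading of the figure labels, which the text above does not spell out: dk is the spatial kernel size, m the number of input channels, and n the number of output channels. the function names are hypothetical, biases are ignored, and this is not the authors' code.

```python
def standard_conv_params(dk, m, n):
    """Weights in a standard convolution: n filters of size dk x dk x m."""
    return dk * dk * m * n


def separable_conv_params(dk, m, n):
    """Weights in a depthwise separable convolution.

    Depthwise part: m filters of size dk x dk x 1.
    Pointwise part: n filters of size 1 x 1 x m.
    """
    depthwise = dk * dk * m
    pointwise = m * n
    return depthwise + pointwise


def reduction_ratio(dk, m, n):
    """Separable / standard weight count; algebraically 1/n + 1/dk**2."""
    return separable_conv_params(dk, m, n) / standard_conv_params(dk, m, n)


if __name__ == "__main__":
    # Illustrative layer sizes only (assumed, not taken from the paper).
    dk, m, n = 3, 32, 64
    print(standard_conv_params(dk, m, n))
    print(separable_conv_params(dk, m, n))
    print(round(reduction_ratio(dk, m, n), 4))
```

for 3×3 kernels the ratio is close to 1/9 once n is large, which is the main source of the savings in this kind of layer.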
squeezenet tries to noticeably reduce the time cost and the number of parameters while preserving accuracy [44]. the residual neural network, known as resnet, uses the bottleneck architecture efficiently to obtain impressive performance [45, 46]. this model introduced an innovative structure with skip connections and heavy use of batch normalization. such skip connections resemble gated units or gated recurrent units and are strongly similar to elements recently applied with success in rnns. resnet has proven to be powerful in many applications, but one major disadvantage is that the deeper networks usually need several weeks of training, which makes them impractical in real-world applications. in addition, the model is too large for most embedded devices. in comparison to resnet, shufflenet has lower complexity under the same settings. mobilenet and shufflenet are popular models for embedded and mobile systems, but effnet is an optimized model that can replace them in the same applications. effnet deploys spatial separable convolution, which is simply a depthwise convolution split along the x and y axes with a separable pooling between them. it has been shown to keep the same capacity even when applied to narrow and shallow architectures [43]. the effnet block is designed to be a safe replacement for vanilla convolution layers in mobile hardware applications. it therefore has two main advantages: first, quicker inference, and second, the possibility of applying a larger, deeper network [43]. a comparison of mobilenet and shufflenet with effnet is shown in fig. 5. in this figure, dw means depthwise convolution, mp means max-pooling, ch is the number of output channels, and gc stands for group convolutions [43].

fig. 5: a comparison of mobilenet and shufflenet with effnet [43]: (a) an effnet block, (b) a mobilenet block, (c) a shufflenet block.

in the next section, we present the experimental results of our research.
3 experiments and results

in our work, all implementations and simulations were done in the python programming language with tensorflow, on a machine with a core i7 cpu, an nvidia gtx 1050 gpu, and 16 gb of ddr4 ram. as mentioned in section 2.1, we carried out our experiments and simulations using images that we captured in a real wind farm.

3.1 using sift, surf, and fast for wtt object detection

in order to detect the wind turbine towers, features and descriptors are first extracted using the sift, surf, and fast schemes; then, by applying the brute-force and flann template matching algorithms, a bounding box is predicted for the wind tower. to evaluate the performance of the proposed object detection method, we used the intersection over union (iou) [47]. iou is calculated from the ground-truth (gt) bounding box and the bounding box predicted by the model. fig. 6 illustrates different examples of the simulation results on our dataset. the feature key points are drawn in blue, the predicted bounding box in red, and the ground-truth bounding box in green. the iou of the detected bounding box and the ground-truth box is computed according to equation (1):

$$\mathrm{iou} = \frac{\text{area of overlap}}{\text{area of union}} \tag{1}$$

we use the iou as a scoring factor and judge each detection of the suggested model against a threshold $\lambda$ according to equation (2):

$$\mathrm{prediction} = \begin{cases} 0, & \mathrm{iou} < \lambda \\ 1, & \mathrm{iou} \ge \lambda \end{cases} \tag{2}$$

prediction = 0 means that the detected object is not acceptable. fig. 6(a) demonstrates a poor detection with iou = 0.0198, fig. 6(b) is an example of a less good prediction with iou = 0.3129, fig. 6(c) is a good detection with iou = 0.5734, and fig. 6(d) demonstrates an excellent bounding box with iou = 0.9708. the detection rate of the wtt, $\mathrm{dr}_{wtt}$, can be calculated as introduced in equation (3).
in this equation, n is the total number of images in the experiment and $\mathrm{prediction}_i$ is the result of equation (2) for the i-th image.

$$\mathrm{dr}_{wtt} = \frac{\sum_{i=1}^{n} \mathrm{prediction}_i}{n} \times 100 \tag{3}$$
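equations (1) to (3) can be sketched in a few lines of python. this is a minimal illustration, not the authors' implementation: the box format (x_min, y_min, x_max, y_max), the function names, and the choice to count a tie at the threshold as accepted are assumptions made here.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes, equation (1).

    Boxes are (x_min, y_min, x_max, y_max); this layout is an assumption.
    """
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    overlap = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - overlap
    return overlap / union if union > 0 else 0.0


def prediction(iou_value, lam):
    """Equation (2): 1 if the detection is accepted at threshold lam, else 0."""
    return 1 if iou_value >= lam else 0


def detection_rate(iou_values, lam):
    """Equation (3): percentage of images whose detection is accepted."""
    if not iou_values:
        raise ValueError("at least one image is required")
    accepted = sum(prediction(value, lam) for value in iou_values)
    return 100.0 * accepted / len(iou_values)


if __name__ == "__main__":
    # The four IoU values quoted in the text for fig. 6(a) to 6(d).
    fig6_ious = [0.0198, 0.3129, 0.5734, 0.9708]
    for lam in (0.5, 0.3, 0.25):
        print(lam, detection_rate(fig6_ious, lam))
```

lowering the threshold can only keep or raise the detection rate, which matches the trend across the three threshold columns of table 1.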
fig. 6: detecting the wind tower in different images. (a) iou = 0.0198, poor detection using fast, sift, and flann. (b) iou = 0.3129, less good detection using fast, surf, and flann. (c) iou = 0.5334, good detection using surf and brute-force. (d) iou = 0.9708, excellent detection using fast, sift, and brute-force.

to evaluate the performance of the proposed method, we measured the accuracy using equation (4) and the following metrics, with ok taken as the positive class: tp (true positive) is the number of samples correctly classified as ok, tn (true negative) is the number of samples correctly classified as not belonging to the ok class, fp (false positive) is the number of ng samples misclassified as ok, and fn (false negative) is the number of ok samples misclassified as ng [48].

$$\mathrm{accuracy} = \frac{tp + tn}{tp + fp + tn + fn} \tag{4}$$

the results of applying the sift, surf, and fast feature extractors with the brute-force and flann template matching algorithms are summarized in table 1. according to this table, fast outperforms the other feature extractors in terms of accuracy. it extracts more features than sift and surf, which makes fast more powerful in the detection of wind towers. we also observe that surf is more accurate than sift in extracting and describing features. as shown in table 1, flann template matching works slightly better than brute-force.
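the accuracy of equation (4) can be computed from the four counts as sketched below. the label strings "ok" and "ng", the function names, and the toy label lists are assumptions for illustration; ok is treated as the positive class.

```python
def confusion_counts(true_labels, predicted_labels, positive="ok"):
    """Return (tp, fp, tn, fn) with `positive` as the positive class."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    tp = fp = tn = fn = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if predicted == positive and truth == positive:
            tp += 1  # ok sample classified as ok
        elif predicted == positive:
            fp += 1  # ng sample misclassified as ok
        elif truth == positive:
            fn += 1  # ok sample misclassified as ng
        else:
            tn += 1  # ng sample classified as ng
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    """Equation (4): fraction of all samples that are classified correctly."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("no samples")
    return (tp + tn) / total


if __name__ == "__main__":
    # Toy labels, not data from the paper.
    truth = ["ok", "ok", "ng", "ng", "ok"]
    predicted = ["ok", "ng", "ng", "ok", "ok"]
    counts = confusion_counts(truth, predicted)
    print(counts, accuracy(*counts))
```

because tp and tn appear symmetrically in equation (4), the accuracy does not depend on which class is called positive.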
table 1: result of the wtt object detection method

| feature extractor & descriptor | matcher | runtime (s) | accuracy (%), iou = 0.5 | accuracy (%), iou = 0.3 | accuracy (%), iou = 0.25 |
|---|---|---|---|---|---|
| sift | brute-force | 0.3955 | 45.4 | 74.8 | 79.8 |
| surf | brute-force | 0.3021 | 46.6 | 76.9 | 82.0 |
| fast & sift | brute-force | 0.0728 | 51.1 | 81.8 | 84.1 |
| fast & surf | brute-force | 0.0623 | 52.7 | 83.6 | 87.3 |
| sift | flann | 0.3666 | 47.0 | 78.5 | 83.2 |
| surf | flann | 0.2309 | 49.5 | 82.7 | 84.9 |
| fast & sift | flann | 0.0821 | 52.6 | 82.8 | 87.3 |
| fast & surf | flann | 0.0542 | 54.8 | 83.8 | 89.4 |

3.2 applying deep learning classifiers to verify object detection results

to verify the object detection performance, we first used the pre-trained mobilenet classifier with the mobilenets body architecture [17] that is presented in table 2. in addition, we deployed the following parameter setting: the optimizer used in our work is adam [49], with the learning rate set to $3 \times 10^{-7}$. at the last two layers, we used fully connected (fc) networks. in these layers, each neuron reads the outputs of the neurons in the previous layer, processes the information it needs, and produces the outputs for the next layer [50–52]. the general formula is the following, where b is the bias, $w_i$ are the weights of the connections, and f is a nonlinear activation function.

$$f(w^{T}x) = f\left(\sum_{i=1}^{3} w_i x_i + b\right) \tag{5}$$

the most common activation functions are the sigmoid function, the hyperbolic tangent function (tanh), and the rectified linear function (relu). their formulas are as follows [50–52]:

$$f(w^{T}x) = \mathrm{sigmoid}(w^{T}x) = \frac{1}{1 + \exp(-w^{T}x)} \tag{6}$$

$$f(w^{T}x) = \tanh(w^{T}x) = \frac{e^{w^{T}x} - e^{-w^{T}x}}{e^{w^{T}x} + e^{-w^{T}x}} \tag{7}$$

$$f(w^{T}x) = \mathrm{relu}(w^{T}x) = \max(0, w^{T}x) \tag{8}$$

the sigmoid function takes a real-valued number and maps it to a value between 0 and 1, which can be read as the firing rate of a neuron: 0 for not firing and 1 for firing. the hyperbolic tangent function has a zero-centered output range, [−1, 1] instead of [0, 1]. for the relu function, if the input is less than 0, its activation is thresholded at zero.
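the neuron of equation (5) and the activation functions of equations (6) to (8) can be sketched in plain python as follows. this is only an illustration: the function names are hypothetical, the code is not numerically hardened (very large negative inputs to `sigmoid` would overflow `math.exp`), and it is not how a deep learning framework implements these layers.

```python
import math
from typing import Callable, Sequence


def sigmoid(z: float) -> float:
    """Equation (6): maps a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z: float) -> float:
    """Equation (7): zero-centered output in (-1, 1)."""
    return (math.exp(z) - math.exp(-z)) / (math.exp(z) + math.exp(-z))


def relu(z: float) -> float:
    """Equation (8): negative inputs are thresholded at zero."""
    return max(0.0, z)


def neuron(weights: Sequence[float], inputs: Sequence[float], bias: float,
           activation: Callable[[float], float]) -> float:
    """Equation (5): activation applied to the weighted sum plus bias."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)
```

a call such as `neuron([0.5, -1.0, 2.0], [2.0, 1.0, 0.5], 0.0, relu)` evaluates the weighted sum 1.0 − 1.0 + 1.0 and passes it through relu unchanged, since it is positive.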
the softmax function can also be used as the output neuron function; it is a generalization of the logistic function. the function definition is as follows [50–52]:

$$\sigma(x)_j = \frac{e^{x_j}}{\sum_{k=1}^{K} e^{x_k}}, \quad j = 1, \ldots, K \tag{9}$$

in our work, softmax is used for the final classification at the final layer of the nn. in the verification problem, the detected objects should be classified into two categories: ok and ng. as a result, the number of neurons in the fc output layer (dense 2) is equal to two. the classification results are presented in table 3. based on the information given in this table, mobilenet achieved its best result in the 5th experiment (e5), with an accuracy of 96.01% and a runtime of around 0.034857 s. we then conducted new experiments (e7 to e12) using resnet50 rather than mobilenet. according to e11, resnet50 reached a higher accuracy of about 98.92% in 0.061942 s. although resnet50 achieved the better result in terms of average accuracy, its runtime was nearly doubled. afterwards, e13 to e18 were implemented for effnet, shufflenet, and squeezenet. the obtained results show that effnet with 1000 epochs and 100 neurons before the output layer (e14) leads to 97.74% average accuracy in 0.039351 s as the best scenario. the validation accuracy and the validation loss for e14 are depicted in fig. 7. in fig. 7.(a), the network has been trained for 1000 epochs and reaches a validation accuracy of 97.74%; as is apparent in fig. 7.(b), the validation loss follows the training loss, which is very low.

table 2: mobilenets body architecture [17].
| type/stride | filter shape | input size |
|---|---|---|
| conv/s2 | 3 × 3 × 3 × 32 | 224 × 224 × 3 |
| conv dw/s1 | 3 × 3 × 32 dw | 112 × 112 × 32 |
| conv/s1 | 1 × 1 × 32 × 64 | 112 × 112 × 32 |
| conv dw/s2 | 3 × 3 × 64 dw | 112 × 112 × 64 |
| conv/s1 | 1 × 1 × 64 × 128 | 56 × 56 × 64 |
| conv dw/s1 | 3 × 3 × 128 dw | 56 × 56 × 128 |
| conv/s1 | 1 × 1 × 128 × 128 | 56 × 56 × 128 |
| conv dw/s2 | 3 × 3 × 128 dw | 56 × 56 × 128 |
| conv/s1 | 1 × 1 × 128 × 256 | 28 × 28 × 128 |
| conv dw/s1 | 3 × 3 × 256 dw | 28 × 28 × 256 |
| conv/s1 | 1 × 1 × 256 × 256 | 28 × 28 × 256 |
| conv dw/s2 | 3 × 3 × 256 dw | 28 × 28 × 256 |
| conv/s1 | 1 × 1 × 256 × 512 | 14 × 14 × 256 |
| 5× [conv dw/s1 | 3 × 3 × 512 dw | 14 × 14 × 512 |
| conv/s1] | 1 × 1 × 512 × 512 | 14 × 14 × 512 |
| conv dw/s2 | 3 × 3 × 512 dw | 14 × 14 × 512 |
| conv/s1 | 1 × 1 × 512 × 1024 | 7 × 7 × 512 |
| conv dw/s2 | 3 × 3 × 1024 dw | 7 × 7 × 1024 |
| conv/s1 | 1 × 1 × 1024 × 1024 | 7 × 7 × 1024 |
| avg pool/s1 | pool 7 × 7 | 7 × 7 × 1024 |
| fc/s1 | 1024 × 1000 | 1 × 1 × 1024 |
| softmax/s1 | classifier | 1 × 1 × 1000 |

3.3 decision making and updating the positions

recalling fig. 1, the last stage of our model is decision making. in order to update the location of the wtt, equations (10) and (11) are used for the x-axis and y-axis, respectively.

$$x_i^{*} = \begin{cases} x_{i-1}, & d_i = \mathrm{ng} \\ x_i^{p}, & d_i = \mathrm{ok} \end{cases} \tag{10}$$

$$y_i^{*} = \begin{cases} y_{i-1}, & d_i = \mathrm{ng} \\ y_i^{p}, & d_i = \mathrm{ok} \end{cases} \tag{11}$$

in these equations, $d_i$ denotes the status of the detected object in the current image frame, i. here, $x_{i-1}$, $x_i^{p}$, and $x_i^{*}$ are the x position for the previous frame (i − 1), the predicted x position for the current frame (i), and the new value of the x position for the current frame, respectively (for i = 1, 2). the same rule applies to $y_{i-1}$, $y_i^{p}$, and $y_i^{*}$ for the y positions (for i = 1, 2). according to the result of the classifier and these equations, the uav can decide on updating its information about the location of the wtt.
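the update rule of equations (10) and (11) can be sketched as follows. this is a minimal illustration, assuming that the classifier status arrives as the strings "ok" and "ng" and that positions are (x, y) pairs; the names `update_position` and `track` are hypothetical and do not come from the paper.

```python
from typing import Iterable, List, Tuple

Position = Tuple[float, float]  # (x, y) location of the tower in the frame


def update_position(previous: Position, predicted: Position,
                    status: str) -> Position:
    """Equations (10) and (11): keep the previous position when the
    detection is rejected (ng), adopt the predicted one when accepted (ok)."""
    if status == "ok":
        return predicted
    if status == "ng":
        return previous
    raise ValueError(f"unknown status: {status!r}")


def track(initial: Position,
          detections: Iterable[Tuple[Position, str]]) -> List[Position]:
    """Apply the update rule frame by frame and return the position history."""
    position = initial
    history = []
    for predicted, status in detections:
        position = update_position(position, predicted, status)
        history.append(position)
    return history
```

in this sketch a rejected frame simply repeats the last accepted position, so a single false detection does not move the stored tower location.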
table 3: classification results for the verification problem based on mobilenet, shufflenet, effnet, squeezenet, and resnet

| experiment | base model | epochs | dense 1 (neurons) | run time (s) | accuracy (%) |
|---|---|---|---|---|---|
| e1 | mobilenet | 20 | 40 | 0.027058 | 51.11 |
| e2 | mobilenet | 50 | 40 | 0.030192 | 54.43 |
| e3 | mobilenet | 100 | 100 | 0.032696 | 76.69 |
| e4 | mobilenet | 300 | 100 | 0.032752 | 86.22 |
| e5 | mobilenet | 1000 | 100 | 0.034857 | 96.01 |
| e6 | mobilenet | 1000 | 40 | 0.033198 | 85.00 |
| e7 | resnet50 | 20 | 40 | 0.055964 | 56.01 |
| e8 | resnet50 | 50 | 40 | 0.054390 | 56.13 |
| e9 | resnet50 | 100 | 100 | 0.059588 | 88.21 |
| e10 | resnet50 | 300 | 100 | 0.060267 | 95.78 |
| e11 | resnet50 | 1000 | 100 | 0.061942 | 98.92 |
| e12 | resnet50 | 1000 | 40 | 0.060906 | 89.46 |
| e13 | effnet | 100 | 100 | 0.036546 | 79.12 |
| e14 | effnet | 1000 | 100 | 0.039351 | 97.74 |
| e15 | shufflenet | 100 | 100 | 0.033497 | 77.43 |
| e16 | shufflenet | 1000 | 100 | 0.035001 | 95.89 |
| e17 | squeezenet | 100 | 100 | 0.040951 | 73.41 |
| e18 | squeezenet | 1000 | 100 | 0.041828 | 93.36 |

fig. 7: the comparison between the training and validation results of the effnet pre-trained classifier for the verification problem. (a) accuracy, (b) loss.

4 conclusion

the wind turbine tower (wtt), a main component of a wind farm, is a mechanical structure whose components are formed and constructed using carbon fiber reinforced plastic (cfrp) [9].
in order to have intelligent and proactive maintenance services for the farm, it is essential to develop monitoring infrastructure based on vision inspection (vi) technologies. this can increase the lifetime of the wt farm and reduce the maintenance cost, provided that accurate fault and failure predictions are available via advanced nondestructive testing approaches such as intelligent vi and thermal vision [6–10]. in this paper, we suggested a scheme to detect the wind turbine towers in order to facilitate monitoring, controlling, and maintenance tasks in smart grids. we deployed machine learning techniques for vision inspection purposes to navigate a flying machine precisely in a wind turbine farm. we used sift, surf, and fast as feature extractors and brute-force and flann as matchers to detect wind turbines. our simulation results have shown that fast as the feature extractor with surf as the descriptor, along with the flann matcher, outperforms the other combinations in the object detection task, with 89.4% accuracy. besides, in order to improve navigation reliability, an additional binary classification step was considered based on resnet, mobilenet, shufflenet, effnet, and squeezenet. among all mentioned classifiers, resnet50 obtained the highest average accuracy, about 98.92% in 0.061942 s. its runtime was roughly 1.6 times that of effnet (0.039351 s), whose average accuracy was around 97.74%. therefore, from practical points of view such as computational efficiency and memory restrictions for an embedded device, our well-tuned pre-trained effnet was considered the best classifier among all mentioned models in our research.

references

[1] s. a. motamedi, “psnr enhancement in image streaming over cognitive radio sensor networks,” etri journal, vol. 39, no. 5, pp. 683–694, 2017.
[2] m. a. dimitrijević, m. andrejević-stošović, j. milojković, and v. litovski, “implementation of artificial neural networks based ai concepts to the smart grid,” facta universitatis, series: electronics and energetics, vol. 27, no. 3, pp. 411–424, 2014.
[3] a. janjic, s. savic, g. janackovic, m. stankovic, and l. z. velimirovic, “multicriteria assessment of the smart grid efficiency using the fuzzy analytical hierarchy process,” facta universitatis, series: electronics and energetics, vol. 29, no. 4, pp. 631–646, 2016.
[4] r. martać, n. milivojević, v. milivojević, v. ćirović, and d. barać, “using internet of things in monitoring and management of dams in serbia,” facta universitatis, series: electronics and energetics, vol. 29, no. 3, pp. 419–435, 2015.
[5] p. tchakoua, r. wamkeue, m. ouhrouche, f. slaoui-hasnaoui, t. tameghe, and g. ekemb, “wind turbine condition monitoring: state-of-the-art review, new trends, and future challenges,” energies, vol. 7, no. 4, pp. 2595–2630, 2014.
[6] h. sanati, d. wood, and q. sun, “condition monitoring of wind turbine blades using active and passive thermography,” applied sciences, vol. 8, no. 10, p. 2004, 2018.
[7] f. p. garcía márquez, i. segovia ramírez, a. pliego marugán, á. huerta herráiz, and m. papaelias, “a novel walking robot based system for non-destructive testing in wind turbines,” 2019.
[8] x.-l. li, j. sun, n. tao, l. feng, j.-l. shen, y. he, c. zhang, and y.-j. zhao, “an effective method to inspect adhesive quality of wind turbine blades using transmission thermography,” journal of nondestructive evaluation, vol. 37, no. 2, p. 19, 2018.
[9] c.-s. tsai, c.-t. hsieh, and s.-j. huang, “enhancement of damage-detection of wind turbine blades via cwt-based approaches,” ieee transactions on energy conversion, vol. 21, no. 3, pp. 776–781, 2006.
[10] m. bahaghighat and s. a. motamedi, “vision inspection and monitoring of wind turbine farms in emerging smart grids,” facta universitatis, series: electronics and energetics, vol. 31, no. 2, pp. 287–301, 2018.
[11] c. sampedro, c. martinez, a. chauhan, and p. campoy, “a supervised approach to electric tower detection and classification for power line inspection,” in 2014 international joint conference on neural networks (ijcnn). ieee, 2014, pp. 1970–1977.
[12] c. rudin, d. waltz, r. n. anderson, a. boulanger, a. salleb-aouissi, m. chow, h. dutta, p. n. gross, b. huang, s. ierome et al., “machine learning for the new york city power grid,” ieee transactions on pattern analysis and machine intelligence, vol. 34, no. 2, pp. 328–345, 2011.
[13] r. n. anderson, a. boulanger, w. b. powell, and w. scott, “adaptive stochastic control for the smart grid,” pp. 1098–1115, 2011.
[14] z. m. fadlullah, m. m. fouda, n. kato, x. shen, and y. nozaki, “an early warning system against malicious activities for smart grid communications,” ieee network, vol. 25, no. 5, pp. 50–55, 2011.
[15] y. zhang, l. wang, w. sun, r. c. green ii, and m. alam, “distributed intrusion detection system in a multi-layer network architecture of smart grids,” ieee transactions on smart grid, vol. 2, no. 4, pp. 796–808, 2011.
[16] z. zhao, g. xu, y. qi, n. liu, and t. zhang, “multi-patch deep features for power line insulator status classification from aerial images,” in 2016 international joint conference on neural networks (ijcnn). ieee, 2016, pp. 3187–3194.
[17] a. g. howard, m. zhu, b. chen, d. kalenichenko, w. wang, t. weyand, m. andreetto, and h. adam, “mobilenets: efficient convolutional neural networks for mobile vision applications,” arxiv preprint arxiv:1704.04861, 2017.
[18] m. babaie, m. e. shiri, and m. bahaghighat, “a new descriptor for uav images mapping by applying discrete local radon,” in 2018 8th conference of ai & robotics and 10th robocup iranopen international symposium (iranopen). ieee, 2018, pp. 52–56.
[19] m. stokkeland, k. klausen, and t. a. johansen, “autonomous visual navigation of unmanned aerial vehicle for wind turbine inspection,” in 2015 international conference on unmanned aircraft systems (icuas). ieee, 2015, pp. 998–1007.
[20] b. u. töreyin, y. dedeoğlu, u. güdükbay, and a. e. cetin, “computer vision based method for real-time fire and flame detection,” pattern recognition letters, vol. 27, no. 1, pp. 49–58, 2006.
[21] i. laptev, “improving object detection with boosted histograms,” image and vision computing, vol. 27, no. 5, pp. 535–544, 2009.
[22] s. messelodi, c. m. modena, and m. zanin, “a computer vision system for the detection and classification of vehicles at urban road intersections,” pattern analysis and applications, vol. 8, no. 1-2, pp. 17–31, 2005.
[23] j. c. s. j. junior, s. r. musse, and c. r. jung, “crowd analysis using computer vision techniques,” ieee signal processing magazine, vol. 27, no. 5, pp. 66–77, 2010.
[24] r. akbari, m. k. bahaghighat, and j. mohammadi, “legendre moments for face identification based on single image per person,” in 2010 2nd international conference on signal processing systems, vol. 1. ieee, 2010, pp. v1–248.
[25] m. k. bahaghighat, r. akbari et al., “fingerprint image enhancement using gwt and dmf,” in 2010 2nd international conference on signal processing systems, vol. 1. ieee, 2010, pp. v1–253.
[26] n. karimimehr, a. a. b. shirazi et al., “fingerprint image enhancement using gabor wavelet transform,” in 2010 18th iranian conference on electrical engineering. ieee, 2010, pp. 316–320.
[27] m. bahaghighat, m. mirfattahi, l. akbari, and m. babaie, “designing quality control system based on vision inspection in pharmaceutical product lines,” in 2018 international conference on computing, mathematics and engineering technologies (icomet). ieee, 2018, pp. 1–4.
[28] d. g. lowe, “distinctive image features from scale-invariant keypoints,” international journal of computer vision, vol. 60, no. 2, pp. 91–110, 2004.
[29] h. bay, t. tuytelaars, and l. van gool, “surf: speeded up robust features,” in european conference on computer vision. springer, 2006, pp. 404–417.
[30] e. rosten, r. porter, and t. drummond, “faster and better: a machine learning approach to corner detection,” ieee transactions on pattern analysis and machine intelligence, vol. 32, no. 1, pp. 105–119, 2008.
[31] e. rosten and t. drummond, “machine learning for high-speed corner detection,” in european conference on computer vision. springer, 2006, pp. 430–443.
[32] m. muja and d. g. lowe, “scalable nearest neighbor algorithms for high dimensional data,” ieee transactions on pattern analysis and machine intelligence, vol. 36, no. 11, pp. 2227–2240, 2014.
[33] m. bahaghighat, l. akbari, and q. xin, “a machine learning-based approach for counting blister cards within drug packages,” ieee access, vol. 7, pp. 83785–83796, 2019.
[34] a. esmaeili kelishomi, a. garmabaki, m. bahaghighat, and j. dong, “mobile user indoor-outdoor detection through physical daily activities,” sensors, vol. 19, no. 3, p. 511, 2019.
[35] p. thanh noi and m. kappas, “comparison of random forest, k-nearest neighbor, and support vector machine classifiers for land cover classification using sentinel-2 imagery,” sensors, vol. 18, no. 1, p. 18, 2018.
[36] h. al-shehri, a. al-qarni, l. al-saati, a. batoaq, h. badukhen, s. alrashed, j. alhiyafi, and s. o. olatunji, “student performance prediction using support vector machine and k-nearest neighbor,” in 2017 ieee 30th canadian conference on electrical and computer engineering (ccece). ieee, 2017, pp. 1–4.
[37] f. n. koutanaei, h. sajedi, and m. khanbabaei, “a hybrid data mining model of feature selection algorithms and ensemble learning classifiers for credit scoring,” journal of retailing and consumer services, vol. 27, pp. 11–23, 2015.
[38] m. b. stojanović, m. m. božić, and m. m. stanković, “mid-term load forecasting using recursive time series prediction strategy with support vector machines,” facta universitatis-series: electronics and energetics, vol. 23, no. 3, pp. 287–298, 2010.
[39] w. kim, w.-s. jung, and h. k. choi, “lightweight driver monitoring system based on multi-task mobilenets,” sensors, vol. 19, no. 14, p. 3200, 2019.
[40] b. siemiatkowska, m. majewski et al., “a system for weeds and crops identification–reaching over 10 fps on raspberry pi with the usage of mobilenets, densenet and custom modifications.” 2019.
[41] w. puarungroj and n. boonsirisumpun, “recognizing hand-woven fabric pattern designs based on deep learning,” in advances in computer communication and computational sciences. springer, 2019, pp. 325–336.
[42] x. zhang, x. zhou, m. lin, and j. sun, “shufflenet: an extremely efficient convolutional neural network for mobile devices,” in proceedings of the ieee conference on computer vision and pattern recognition, 2018, pp. 6848–6856.
[43] i. freeman, l. roese-koerner, and a. kummert, “effnet: an efficient structure for convolutional neural networks,” in 2018 25th ieee international conference on image processing (icip). ieee, 2018, pp. 6–10.
[44] f. n. iandola, s. han, m. w. moskewicz, k. ashraf, w. j. dally, and k. keutzer, “squeezenet: alexnet-level accuracy with 50x fewer parameters and <0.5 mb model size,” arxiv preprint arxiv:1602.07360, 2016.
[45] c. szegedy, s. ioffe, v. vanhoucke, and a. a. alemi, “inception-v4, inception-resnet and the impact of residual connections on learning,” in thirty-first aaai conference on artificial intelligence, 2017.
[46] t. akiba, s. suzuki, and k. fukuda, “extremely large minibatch sgd: training resnet-50 on imagenet in 15 minutes,” arxiv preprint arxiv:1711.04325, 2017.
[47] c. l. zitnick and p. dollár, “edge boxes: locating object proposals from edges,” in european conference on computer vision. springer, 2014, pp. 391–405.
[48] z. zhong, j. wen, b. zhang, and y. xu, “a general moving detection method using dual-target nonparametric background model,” knowledge-based systems, vol. 164, pp. 85–95, 2019.
[49] d. p. kingma and j. ba, “adam: a method for stochastic optimization,” arxiv preprint arxiv:1412.6980, 2014.
[50] l. zhang, s. wang, and b. liu, “deep learning for sentiment analysis: a survey,” wiley interdisciplinary reviews: data mining and knowledge discovery, vol. 8, no. 4, p. e1253, 2018.
[51] g.-b. huang, d. h. wang, and y. lan, “extreme learning machines: a survey,” international journal of machine learning and cybernetics, vol. 2, no. 2, pp. 107–122, 2011.
[52] b. karlik and a. v. olgac, “performance analysis of various activation functions in generalized mlp architectures of neural networks,” international journal of artificial intelligence and expert systems, vol. 1, no. 4, pp. 111–122, 2011.

instruction facta universitatis series: electronics and energetics vol. 27, no. 1, march 2014, pp. 57–102 doi: 10.2298/fuee1401057o

a review of diode and solar cell equivalent circuit model lumped parameter extraction procedures

adelmo ortiz-conde, francisco j. garcía-sánchez, juan muci, andrea sucre-gonzález
solid state electronics laboratory, simón bolívar university, caracas 1080a, venezuela

abstract. this article presents an up-to-date review of several methods used for extraction of diode and solar cell model parameters.
in order to facilitate the choice of the most appropriate method for a given application, the methods are classified according to their lumped parameter equivalent circuit model: single-exponential, double-exponential, and multiple-exponential, with and without series and parallel resistances. in general, methods based on numerical integration or optimization are recommended to reduce the possible uncertainties arising from measurement noise.

key words: solar cell, diode, single-exponential, double-exponential, multiple-exponential

received january 8, 2014. corresponding author: adelmo ortiz-conde, solid state electronics laboratory, simón bolívar university, caracas 1080a, venezuela (e-mail: ortizc@ieee.org)

1. introduction

photovoltaic energy conversion has received great attention recently and much research has been dedicated to solar cells. practical applications of solar cells require simple lumped models and efficient parameter extraction methods. parameter extraction in solar cells has been a research topic for many years [1], and several articles [2]-[11] have reviewed over time the different parameter extraction methods in diodes and solar cells. although cotfas et al. reviewed 34 different methods last year [2], not all existing methods were included.

the pioneering fabrication of the first silicon p-n junction and solar cell by ohl in 1940 was presented in a patent in 1941 [12]. ohl and his colleagues at bell labs were studying silicon as a detector and were trying to obtain pure silicon by fusing silica (sio2) and slowly cooling the fused material until it solidified. as a result, impurities inside the silicon spontaneously segregated, forming a p-n junction by serendipity [13], [14]. they observed that the device produced electrical energy when it received light. shockley developed [15] the theory of p-n junctions in 1949, and presented the first single-exponential model for a p-n junction with series resistance. chapin et al. published in 1954 the first article dedicated to the silicon photocell [16]. prince published in 1955 the first single-exponential model for a solar cell with series and parallel resistance [17] and, considering that all the research so far had been done at bell labs, he referred to the solar cell as the bell solar battery [17]. sah, noyce and shockley published in 1957 the first multi-exponential model for a p-n junction with series resistance [18].

in the following sections we review available methods for parameter extraction in diodes and solar cells. we consider that the choice of the best method depends on each particular application, based on the appropriate lumped parameter equivalent circuit model to be used. an important consideration is measurement noise, which obscures parameter extraction, especially if the method is based on using few experimental points [19]-[21]. two approaches are possible when measurement noise is high: (i) applying conventional data smoothing to reduce the possible uncertainties arising from measurement noise; and (ii) using a robust technique based on taking many points, such as numerical integration [22]-[24] or optimization [25], [26]. another promising solution for the future, which is not evaluated in the present article, is the use of genetic algorithms [27], [28].

in order to facilitate the choice of the most appropriate method for a given application, the different methods are organized according to their corresponding lumped parameter equivalent circuit model. section 2 reviews parameter extraction using the single-exponential diode model for three different cases: without any resistance, with series resistance, and with both series and parallel resistances. section 3 presents parameter extraction using multiple-exponential diode models.
section 4 scrutinizes parameter extraction using the single-exponential solar cell model for the following cases: without any resistance, with series resistance, with parallel resistance, and with series and parallel resistances. finally, section 5 reviews and discusses parameter extraction using multiple-exponential solar cell models.

2. single-exponential diode model

2.1. single-exponential diode model without any resistance

consider an idealized diode without resistance whose i-v characteristics may be described by a lumped parameter equivalent circuit model consisting of a single exponential-type ideal junction [15]. figure 1 presents the equivalent circuit of such a model. the terminal current, i, of this lumped parameter equivalent circuit model is explicitly described by shockley's equation:

$$ i = i_0\left[\exp\left(\frac{v}{n\,v_{th}}\right) - 1\right] \tag{1} $$

where v is the terminal voltage, i0 is the reverse current, n is the so-called diode quality factor, and vth = kbt/q is the thermal voltage. alternatively, the terminal voltage may be expressed as an explicit function of the terminal current:

$$ v = n\,v_{th}\,\ln\left(\frac{i}{i_0} + 1\right). \tag{2} $$

fig. 1 idealized diode equivalent circuit without parasitic resistances

figure 2 presents measurements of a silicon diode from motorola [25] and the simulated i-v characteristic of an idealized diode, in linear and logarithmic scales. neglecting the -1 term in (1), which is equivalent to the assumption that i >> i0, yields:

$$ \ln(i) = \ln(i_0) + \frac{v}{n\,v_{th}}. \tag{3} $$

therefore, the plot of ln(i) vs v is a straight line whose slope, 1/(n vth), and intercept, ln(i0), yield at room temperature n = 1.03 and i0 = 0.55 nA, respectively.

fig. 2 measurement (symbols) and simulation (lines) of a silicon diode i-v characteristics at room temperature in linear and logarithmic scales.
simulations are done using (3), which assumes i >> i0 and results in a straight line in logarithmic scale.

2.2. single-exponential diode model with series resistance

figure 3 presents the lumped parameter equivalent circuit model of a diode with parasitic series resistance. as a consequence of the presence of the parasitic series resistance rs, the terminal current of this equivalent circuit is described by an implicit equation:

$$ i = i_0\left[\exp\left(\frac{v - r_s\,i}{n\,v_{th}}\right) - 1\right]. \tag{4} $$

fig. 3 diode equivalent circuit with a parasitic series resistance

the terminal voltage can be solved from (4) as an explicit function of the terminal current:

$$ v = r_s\,i + n\,v_{th}\,\ln\left(\frac{i}{i_0} + 1\right). \tag{5} $$

the implicit terminal current equation given by (4) can be solved explicitly in terms of the terminal voltage by using the lambert w function [29], [30]:

$$ i = \frac{n\,v_{th}}{r_s}\,w_0\left[\frac{i_0\,r_s}{n\,v_{th}}\exp\left(\frac{v + i_0\,r_s}{n\,v_{th}}\right)\right] - i_0, \tag{6} $$

where w0 represents the principal branch of the lambert w function [31], a special function defined as the solution to the equation w(x) exp(w(x)) = x. the lambert w function has already proved its usefulness in numerous physics applications [32], [33].

figure 4 presents aim-spice [34] simulations of a diode with several values of the series resistance. it is important to observe that the effect of rs is significant in the high voltage region and that the region where ln(i) is proportional to v shrinks as rs increases.

2.2.1. vertical and lateral optimization methods

the three parameters (n, i0 and rs) that fully describe the diode in terms of this lumped parameter equivalent circuit model can be extracted by fitting the device's measured data to any of the model's defining equations. equations (4), (5) or (6) can be applied directly for fitting.
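as an illustration of how the explicit solution (6) can be evaluated, the following minimal python sketch computes w0 with a simple newton iteration and checks the result against (5). the function names, the starting guess of the iteration and the parameter values are illustrative assumptions and are not taken from the cited works.

```python
import math


def lambert_w0(x, tol=1e-14, max_iter=100):
    """Principal branch W0(x) for x >= 0, from Newton's method on w*exp(w) = x."""
    if x < 0.0:
        raise ValueError("this sketch only covers x >= 0")
    w = math.log1p(x)  # starting guess; it is never below W0(x) when x >= 0
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w


def diode_current(v, i0, n, rs, vth=0.0259):
    """Terminal current from the explicit Lambert W solution, equation (6)."""
    a = n * vth
    x = (i0 * rs / a) * math.exp((v + i0 * rs) / a)
    return (a / rs) * lambert_w0(x) - i0


def diode_voltage(i, i0, n, rs, vth=0.0259):
    """Terminal voltage as an explicit function of the current, equation (5)."""
    return rs * i + n * vth * math.log(i / i0 + 1.0)


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    i0, n, rs = 0.58e-9, 1.05, 33.4
    for v in (0.2, 0.4, 0.6, 0.8):
        i = diode_current(v, i0, n, rs)
        v_back = diode_voltage(i, i0, n, rs)
        print(f"v = {v:.1f} V, i = {i:.4e} A, v recovered by (5) = {v_back:.6f} V")
```

because (5) and (6) describe the same curve, feeding the current returned by (6) into (5) should reproduce the applied voltage, which gives a quick consistency check on any lambert w implementation used for fitting.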
vertical or lateral optimization could be used for fitting by minimizing either the current quadratic error or the voltage quadratic error, respectively. in the present case the use of equation (5) in combination with lateral optimization [25], [35] affords the best computational convenience, since this equation is not implicit, as (4) is, and does not contain special functions, as (6) does. figure 5 presents measurements of a silicon diode from motorola [25] and the simulated i-v characteristic of a diode, in linear and logarithmic scales, using the parameters extracted by lateral optimization [25].

fig. 4 aim-spice simulations of a diode with several values of series resistance

fig. 5 measured i-v characteristics of a silicon diode and its simulation using the parameter values extracted by lateral optimization [25]

2.2.2. integration method to extract series resistance and ideality factor

following the idea of araujo and sánchez about the use of integration for parameter extraction [36], the terminal current may be integrated by parts in combination with (5):

$$ \int_0^{v} i\,\mathrm{d}v = v\,i - \int_0^{i} v\,\mathrm{d}i = \frac{r_s}{2}\,i^2 + n\,v_{th}\,i - i_0\,v. \tag{7} $$

assuming that i >> i0, the last term in the above equation can be neglected and we obtain [37], [38]:

$$ \int_0^{v} i\,\mathrm{d}v = \frac{r_s}{2}\,i^2 + n\,v_{th}\,i. \tag{8} $$

therefore, the numerical integral of the measured current with respect to voltage is represented by an explicit algebraic quadratic function of i, which requires a much simpler fitting procedure than the original implicit equation. kaminski et al. later generalized this method [39] by allowing an arbitrary lower integration limit (vi, ii) instead of the origin, so that (8) becomes:

$$ \frac{1}{i - i_i}\int_{v_i}^{v} i\,\mathrm{d}v = \frac{r_s}{2}\,(i + i_i) + n\,v_{th}. \tag{9} $$

2.2.3. the integral difference function concept and the g method

we proposed a different approach [22], [23] that does not start with the extraction of the parasitic series resistance value. instead, it does just the opposite. the proposed method is based on calculating an auxiliary function, or rather an operator, whose purpose is to eliminate the effect of the parasitic series resistance, retaining only the intrinsic model parameters. this new function was originally called the "integral difference function"; it is denoted "function d" and is defined as:

$$ d(v,i) \equiv \int_0^{i} v\,\mathrm{d}i - \int_0^{v} i\,\mathrm{d}v = i\,v - 2\int_0^{v} i\,\mathrm{d}v = 2\int_0^{i} v\,\mathrm{d}i - i\,v, \tag{10} $$

where d has units of "power." the integrals with respect to i and v are the device's "content" and "co-content", respectively, as shown in fig. 6. for simplicity's sake and without loss of generality, the lower limit of integration in (10) is taken at the origin, but it may equally be placed at any arbitrary point of interest along the device's characteristics. notice that adding the content and co-content, instead of subtracting them as in (10), yields the device's total power. it can be proved that in any given lumped parameter equivalent circuit model only nonlinear branches produce non-zero terms, and thus they are the only elements that contribute to the total d seen at the terminals. this property embodies the essence of function d's ability to eliminate parasitic resistances (linear elements) from device models.

fig. 6 schematic illustration of the content (c) and co-content (cc) of a simple case of nonlinear function

it is important to point out that function d may be understood as a representation or measure of the device's amount of nonlinearity, which for a linear element is obviously equal to zero.
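the elimination of linear elements by function d can be checked numerically. the sketch below is only an illustration under stated assumptions: it evaluates (10) with the trapezoidal rule on synthetic data (hypothetical parameter values, function names of our choosing) and compares a junction alone with the same junction in series with a resistance.

```python
import math


def function_d(v, i):
    """Running D = (integral of v di) - (integral of i dv), trapezoidal rule.

    The first sample is taken as the lower integration limit.
    """
    d = [0.0]
    content = 0.0      # integral of v di
    co_content = 0.0   # integral of i dv
    for k in range(1, len(v)):
        content += 0.5 * (v[k] + v[k - 1]) * (i[k] - i[k - 1])
        co_content += 0.5 * (i[k] + i[k - 1]) * (v[k] - v[k - 1])
        d.append(content - co_content)
    return d


def synthetic_curves(i0=1e-9, n=1.0, rs=40.0, vth=0.0259, points=400):
    """Currents from the origin up to 10 mA, with and without series resistance."""
    currents = [0.0] + [1e-7 * 10.0 ** (5.0 * k / (points - 1)) for k in range(points)]
    v_junction = [n * vth * math.log(c / i0 + 1.0) for c in currents]
    v_terminal = [vj + rs * c for vj, c in zip(v_junction, currents)]
    return currents, v_junction, v_terminal


if __name__ == "__main__":
    currents, v_junction, v_terminal = synthetic_curves()
    d_resistor = function_d([40.0 * c for c in currents], currents)
    d_junction = function_d(v_junction, currents)
    d_terminal = function_d(v_terminal, currents)
    print("largest |d| of the resistor alone:", max(abs(x) for x in d_resistor))
    print("largest |d| difference with and without rs:",
          max(abs(a - b) for a, b in zip(d_junction, d_terminal)))
```

with the trapezoidal rule the resistor contributes equally to the content and to the co-content on every interval, so its d vanishes apart from rounding, and the series resistance drops out of the d computed at the terminals.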
this description of function d, in terms of linearity, led us to refer to this function as the "integral non linearity function" (inlf) [40], [41], and to use it to quantify the non-linear behavior of devices and circuits in terms of distortion.

applying function d to the case of a single-exponential diode model with series resistance and restricting the analysis to the region of the measured forward characteristics where i >> i0, the substitution of (5) into (10) yields [22], [23]:

$$ d = i\,n\,v_{th}\left[\ln\left(\frac{i}{i_0}\right) - 2\right], \tag{11} $$

which does not contain rs. dividing this equation by the current yields an auxiliary function, which we call g, defined by:

$$ g \equiv \frac{d}{i} = n\,v_{th}\left\{\ln(i) - \left[\ln(i_0) + 2\right]\right\}. \tag{12} $$

since this function g is calculated from function d, it requires a numerical integration of the experimental data. when g is plotted against ln(i), according to (12) the resulting curve is a straight line, whose intercept and slope allow the immediate extraction of the values of i0 and n, respectively, as shown in fig. 7. the extracted values of n = 1.03 and i0 = 0.55 nA are very close to those previously obtained by lateral optimization.

fig. 7 function g as a function of the logarithm of the current calculated from the measured i-v characteristics of a silicon diode (symbols) and a linear fit of its quasi linear portion (solid line)

2.2.4. norde's method

this method [42] contains clever mathematical ideas and was developed for schottky diodes with n = 1. the following notation is adapted to conventional p-n junctions. norde defined the following function, which we denote by his name:

$$ \mathrm{norde} \equiv \frac{v}{2} - v_{th}\,\ln\left(\frac{i}{i_x}\right), \tag{13} $$

where ix represents an arbitrary value of the current. norde's function presents a minimum at (vmin, imin), whose location is independent of the selected value of ix.
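the independence of the minimum from ix can be seen numerically: changing ix only adds a constant to (13). the following short sketch, written under the assumption of synthetic n = 1 data generated from (5) with hypothetical parameter values, locates the sampled minimum for three values of ix.

```python
import math


def norde(v, i, ix, vth=0.0259):
    """Norde's function, equation (13), evaluated sample by sample."""
    return [vk / 2.0 - vth * math.log(ik / ix) for vk, ik in zip(v, i)]


def minimum_location(v, i, ix, vth=0.0259):
    """Return (vmin, imin) at the smallest sampled value of Norde's function."""
    values = norde(v, i, ix, vth)
    k = min(range(len(values)), key=values.__getitem__)
    return v[k], i[k]


def synthetic_diode(i0=1e-9, rs=40.0, vth=0.0259, points=2001):
    """Synthetic n = 1 data from equation (5); parameter values are hypothetical."""
    i = [1e-6 * 10.0 ** (4.0 * k / (points - 1)) for k in range(points)]
    v = [rs * ik + vth * math.log(ik / i0 + 1.0) for ik in i]
    return v, i


if __name__ == "__main__":
    v, i = synthetic_diode()
    for ix in (1e-6, 1e-4, 1e-2):
        vmin, imin = minimum_location(v, i, ix)
        print(f"ix = {ix:.0e} A: vmin = {vmin:.4f} V, imin = {imin:.4e} A")
```

the three curves are vertically shifted copies of each other, so the sampled minimum falls at the same point of the i-v characteristic for every ix.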
the location of this minimum is obtained by differentiating (13) with respect to v and equating the result to zero:

$$ \frac{\mathrm{d}\,\mathrm{norde}}{\mathrm{d}v} = \frac{1}{2} - \frac{v_{th}}{i}\,\frac{\mathrm{d}i}{\mathrm{d}v} = 0. \tag{14} $$

the derivative of v with respect to i is obtained from (5), and using n = 1 yields:

$$ \frac{\mathrm{d}v}{\mathrm{d}i} = r_s + \frac{v_{th}}{i}. \tag{15} $$

combining and solving the two previous equations at i = imin yields the series resistance:

$$ r_s = \frac{v_{th}}{i_{min}}. \tag{16} $$

using (4) with n = 1, the reverse current parameter i0 is obtained:

$$ i_0 = \frac{i_{min}}{\exp\left(\dfrac{v_{min} - i_{min}\,r_s}{v_{th}}\right) - 1}, \tag{17} $$

where vmin is the value of the voltage at the minimum of norde's function.

there are two main disadvantages of norde's method: 1) the ideality factor n must be assumed to be equal to unity, and 2) the parameters are extracted from only a few data points near the minimum of norde's function. nevertheless, this is a clever transition-type extraction method, which extracts the parameters from a region where both the diode and the resistance effects are significant.

to test norde's method, we use the same experimental data as before [25], whose previously extracted parameters were i0 = 0.580 nA, n = 1.05 and rs = 33.4 Ω. since n = 1.05 and the present method requires it to be unity, we let vth = 1.05 x 0.0259 V. the extracted values are rs = 40 Ω and i0 = 0.76 nA for the three selected values of ix, as illustrated in figure 8. figure 9 presents measured and simulated i-v characteristics, using the parameters extracted by norde's method. we observe that the simulation agrees very well with the experimental data for voltages close to vmin = 0.4 V.

fig. 8 norde's function as a function of the voltage calculated from the i-v characteristics of the silicon diode for three values of ix, showing the minimum that defines the value of the series resistance

fig. 9 measured (symbols) and simulated (solid lines) i-v characteristics of the silicon diode using the parameters extracted by norde's method

one of the limitations of norde's method, namely that of having to fix n = 1, has been removed by various authors [43]-[45]. for example, the following generalized norde's function has been proposed [43], [44]:

$$ \mathrm{norde} \equiv \frac{v}{\gamma} - v_{th}\,\ln\left(\frac{i}{i_x}\right), \tag{18} $$

where γ is a new parameter, which for the particular case of γ = 2 yields the original norde's equation. this function presents a minimum at (vmin, imin), whose location is also independent of the selected value of ix. the location of the minimum is obtained as before by differentiating (18) and equating the result to zero:

$$ \frac{\mathrm{d}\,\mathrm{norde}}{\mathrm{d}v} = \frac{1}{\gamma} - \frac{v_{th}}{i}\,\frac{\mathrm{d}i}{\mathrm{d}v} = 0. \tag{19} $$

the derivative of v with respect to i is obtained from (5):

$$ \frac{\mathrm{d}v}{\mathrm{d}i} = r_s + \frac{n\,v_{th}}{i}. \tag{20} $$

combining and solving the two previous equations at i = imin yields:

$$ r_s = \frac{(\gamma - n)\,v_{th}}{i_{min}}. \tag{21} $$

using (4), the reverse current parameter i0 is obtained:

$$ i_0 = \frac{i_{min}}{\exp\left(\dfrac{v_{min} - i_{min}\,r_s}{n\,v_{th}}\right) - 1}, \tag{22} $$

where vmin is the value of the voltage at the minimum of norde's function. because only two equations are available, (21) and (22), and three parameters (n, i0 and rs) need to be extracted, at least two norde's plots with different values of γ are needed.

it is interesting to compare the generalized norde's function with the previous g function. if we make γ tend to infinity and let ix = i0, the generalized norde's function is closely related to the g function by:

$$ g \approx -\,n\,\left.\mathrm{norde}\right|_{\gamma \to \infty,\ i_x = i_0}, \tag{23} $$

the two sides differing only by the constant -2 n vth, as follows from (12) and (18).

2.2.5. cheung's method

cheung et al. [45] proposed the following procedure to extract the ideality factor and the series resistance. using the identity:

$$ \frac{\mathrm{d}v}{\mathrm{d}\ln(i)} = i\,\frac{\mathrm{d}v}{\mathrm{d}i} = \frac{i}{\mathrm{d}i/\mathrm{d}v} \tag{24} $$

in combination with equation (20) yields:

$$ \frac{i}{\mathrm{d}i/\mathrm{d}v} = r_s\,i + n\,v_{th}. \tag{25} $$
therefore, when the ratio of the current to the conductance, i/(di/dv), is plotted against the current it should produce a straight line, as shown in figure 10, whose slope yields the series resistance and whose intercept is n vth, implying in the example shown that rs = 33.3 Ω and n = 1.17.

fig. 10 ratio of the current to the conductance ( i/(di/dv) ) as a function of the current showing a straight line behaviour

cheung et al. proposed the following variation of norde's function [45]:

$$ \mathrm{cheung} \equiv v - n\,v_{th}\,\ln\left(\frac{i}{i_x}\right), \tag{26} $$

where ix is an arbitrary value of the current. rewriting (5) with the assumption i >> i0 yields:

$$ v = r_s\,i + n\,v_{th}\,\ln\left(\frac{i}{i_0}\right). \tag{27} $$

combining the two previous equations yields:

$$ \mathrm{cheung} = r_s\,i + n\,v_{th}\,\ln\left(\frac{i_x}{i_0}\right). \tag{28} $$

therefore, when cheung's function is plotted against the current it should produce a straight line, as shown in figure 11, whose slope yields the series resistance and whose intercept yields the reverse current, implying in the present example that rs = 33.4 Ω and i0 = 2.6 nA.

fig. 11 cheung's function as a function of the current showing a straight line behaviour

2.3. single-exponential diode model with series and parallel resistances

figure 12 presents the lumped parameter equivalent circuit model of a diode with a series parasitic resistance and two parallel parasitic conductances, one at the junction (gp1) and the other at the periphery (gp2). the mathematical description of the terminal current of this equivalent circuit is given by the implicit equation:

$$ i = i_0\left[\exp\left(\frac{v\,(1 + r_s\,g_{p2}) - i\,r_s}{n\,v_{th}}\right) - 1\right] + (v - i\,r_s)\,g_{p1} + v\,g_{p2}\,(1 + r_s\,g_{p1}). \tag{29} $$

the above equation has the following solution for the terminal current as a function of the terminal voltage [46], [47]:

$$ i = \frac{n\,v_{th}}{r_s}\,w_0\left[\frac{i_0\,r_s}{n\,v_{th}\,(1 + r_s\,g_{p1})}\exp\left(\frac{v + i_0\,r_s}{n\,v_{th}\,(1 + r_s\,g_{p1})}\right)\right] + \frac{v\,g_{p1} - i_0}{1 + r_s\,g_{p1}} + v\,g_{p2}, \tag{30} $$

and for the terminal voltage as a function of the terminal current the solution is:

$$ v = -\,n\,v_{th}\,d_2\,w_0\left\{\frac{i_0\,r_{12}}{n\,v_{th}\,d_2}\exp\left[\frac{r_{12}\,(i\,d_2 + i_0)}{n\,v_{th}\,d_2}\right]\right\} + i\,d_2\,(r_s + r_{12}) + i_0\,r_{12}, \tag{31} $$

where w0 represents the principal branch of the lambert w function, and

$$ d_1 = \frac{1}{1 + r_s\,g_{p1}}, \tag{32} $$

$$ d_2 = \frac{1}{1 + r_s\,g_{p2}}, \tag{33} $$

and

$$ r_{12} = \frac{1}{g_{p1} + g_{p2} + g_{p1}\,g_{p2}\,r_s}. \tag{34} $$

fig. 12 lumped parameter equivalent circuit model with a parasitic series resistance and two parallel parasitic conductances, representing two possible shunt current losses, one at the junction (gp1) and another at the device's periphery (gp2)

2.3.1. bidimensional fit of function d

this integration-based procedure, which was developed in 2005 [47], can be summarized as follows. first, for convenience, function d in (10) is rewritten as:

$$ d(v,i) = 2\int_0^{i} v\,\mathrm{d}i - i\,v. \tag{35} $$

secondly, the terminal voltage given by (31) and its integral with respect to i are substituted into (35), which results in a long expression that contains lambert w functions and the variables v and i.
thirdly, substituting all the terms that contain lambert w functions using equation (31), and after some algebraic manipulations, we arrive at a form of function d(i,v) that is conveniently expressed as the following purely algebraic bivariate equation:

$$ d(i,v) = d_{v1}\,v + d_{i1}\,i + d_{v1i1}\,v\,i + d_{v2}\,v^2 + d_{i2}\,i^2, \tag{36} $$

where the five coefficients are given by:

$$ d_{i1} = -2\,r_s\,i_0 - 2\,n\,v_{th}\,(1 + g_{p1}\,r_s), \tag{37} $$

$$ d_{v1} = 2\,r_s\,n\,v_{th}\,g_{p1}\,g_{p2} + 2\,i_0\,(r_s\,g_{p2} + 1) + 2\,n\,v_{th}\,(g_{p1} + g_{p2}), \tag{38} $$

$$ d_{i2} = -\,r_s\,(1 + g_{p1}\,r_s), \tag{39} $$

$$ d_{v2} = -\,g_{p2}^2\,g_{p1}\,r_s^2 - 2\,g_{p1}\,g_{p2}\,r_s - g_{p2}^2\,r_s - g_{p1} - g_{p2}, \tag{40} $$

and the fifth coefficient is dependent upon the others:

$$ d_{v1i1}^{\,2} = 1 + 4\,d_{i2}\,d_{v2}. \tag{41} $$

as can be seen, there are actually four independent coefficients, (37)-(40), and therefore only four unknowns may be extracted uniquely. the general solution for n, i0, gp1 and gp2, in terms of rs, di1, dv1, di2 and dv2, is:

$$ g_{p1} = -\,\frac{d_{i2} + r_s}{r_s^2}, \tag{42} $$

$$ n = -\,\frac{d_{i1} + r_s\,d_{v1} + r_s\,g_{p2}\,d_{i1}}{2\,v_{th}}, \tag{43} $$

$$ g_{p2} = \frac{\sqrt{1 - 4\,r_s^2\,g_{p1}\,d_{v2} - 4\,r_s\,d_{v2}} - 2\,r_s\,g_{p1} - 1}{2\,r_s\,(1 + r_s\,g_{p1})}, \tag{44} $$

and

$$ i_0 = \frac{1}{2}\left(d_{v1} + g_{p1}\,d_{i1} + g_{p2}\,d_{i1} + g_{p1}\,r_s\,d_{v1} + g_{p1}\,g_{p2}\,r_s\,d_{i1}\right). \tag{45} $$

it is important to notice that a set of values of di1, dv1, di2 and dv2 defines a unique i-v characteristic, which can be generated with various combinations of rs, n, i0, gp1 and gp2. particular cases, which do not simultaneously include both conductances gp1 and gp2, present specific solutions as presented in table 1.

the parameter extraction procedure consists of fitting the algebraic equation (36) to the d(i,v) function as numerically calculated from the experimental data with (35). this bidimensional (bivariate) fitting process produces the values of the equation coefficients dv1, di1, dv2 and di2.
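the algebra that links the fitted coefficients to the model parameters can be checked with a short round trip. the sketch below is an illustration only: it codes the forward relations (37)-(40) and the inverse relations (42)-(45) for a known rs, using hypothetical function names and parameter values, and it does not perform the bivariate fit itself.

```python
import math


def d_coefficients(rs, n, i0, gp1, gp2, vth=0.0259):
    """Forward relations (37)-(40): model parameters -> coefficients of (36)."""
    a = n * vth
    d_i1 = -2.0 * rs * i0 - 2.0 * a * (1.0 + gp1 * rs)
    d_v1 = (2.0 * rs * a * gp1 * gp2
            + 2.0 * i0 * (rs * gp2 + 1.0)
            + 2.0 * a * (gp1 + gp2))
    d_i2 = -rs * (1.0 + gp1 * rs)
    d_v2 = -(gp2 ** 2 * gp1 * rs ** 2 + 2.0 * gp1 * gp2 * rs
             + gp2 ** 2 * rs + gp1 + gp2)
    return d_v1, d_i1, d_v2, d_i2


def model_parameters(rs, d_v1, d_i1, d_v2, d_i2, vth=0.0259):
    """Inverse relations (42)-(45), valid for a known, nonzero rs."""
    gp1 = -(d_i2 + rs) / rs ** 2
    root = math.sqrt(1.0 - 4.0 * rs ** 2 * gp1 * d_v2 - 4.0 * rs * d_v2)
    gp2 = (root - 2.0 * rs * gp1 - 1.0) / (2.0 * rs * (1.0 + rs * gp1))
    n = -(d_i1 + rs * d_v1 + rs * gp2 * d_i1) / (2.0 * vth)
    i0 = 0.5 * (d_v1 + gp1 * d_i1 + gp2 * d_i1
                + gp1 * rs * d_v1 + gp1 * gp2 * rs * d_i1)
    return n, i0, gp1, gp2


if __name__ == "__main__":
    # Hypothetical parameter set, used only to exercise the formulas.
    rs, n, i0, gp1, gp2 = 1.0e3, 1.5, 1.0e-12, 1.0e-7, 1.0e-6
    coefficients = d_coefficients(rs, n, i0, gp1, gp2)
    print("coefficients (dv1, di1, dv2, di2):", coefficients)
    print("recovered (n, i0, gp1, gp2):", model_parameters(rs, *coefficients))
```

the recovery of i0 involves the cancellation of much larger terms, which is one reason why the precision of the fitted coefficients matters so much in practice.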
the resulting coefficient values are then used to calculate the diode model parameters (gp1 or gp2, rs, n, and i0), as presented in table 1 for the particular cases.

to illustrate this extraction method, it was applied to simulated i-v characteristics for the case of series resistance and only peripheral shunt loss, using parameter values of i0 = 1 pA, n = 1.5, gp1 = 0 and various combinations of rs and gp2, as shown in figure 13. the symbols used in this figure are not data points; they only identify the several cases. the ideal case of rs = 0 and gp2 = 0, identified by large hollow squares, is a straight line. the case when only rs is significant (rs = 1 kΩ and gp2 = 0), identified by small solid squares, produces a straight line at low voltage that bends down at high voltage (i.e. the effects of rs become important at high voltage). the case when only gp2 is significant (gp2 = 1 μS and rs = 0) is identified by small solid circles. it is a straight line at high voltage and bends up at low voltage (i.e. the effects of gp2 are important at low voltage). the case when rs and gp2 are both simultaneously significant (rs = 1 kΩ and gp2 = 1 μS) is identified by large hollow circles. it is important to notice that the plot in this extreme case does not exhibit any region from which the intrinsic parameters could be obtained, because the overlapping effects of rs and gp2 totally conceal the intrinsic characteristics everywhere. consequently, the intrinsic parameters of this extreme case could not be directly extracted by any traditional method from any portion of its i-v characteristics.
table 1 particular cases of a single-exponential diode model

| | rs only (gp1 = gp2 = 0) | gp1 only (rs = 0, gp2 = 0) | rs and gp1 (gp2 = 0) | rs and gp2 (gp1 = 0) |
|---|---|---|---|---|
| dv1 | $2 i_0$ | $2 i_0 + 2 n v_{th} g_{p1}$ | $2 i_0 + 2 n v_{th} g_{p1}$ | $2 i_0 (r_s g_{p2} + 1) + 2 n v_{th} g_{p2}$ |
| di1 | $-2 r_s i_0 - 2 n v_{th}$ | $-2 n v_{th}$ | $-2 r_s i_0 - 2 n v_{th} (1 + g_{p1} r_s)$ | $-2 r_s i_0 - 2 n v_{th}$ |
| dv2 | $0$ | $-g_{p1}$ | $-g_{p1}$ | $-g_{p2} (1 + r_s g_{p2})$ |
| di2 | $-r_s$ | $0$ | $-r_s (1 + g_{p1} r_s)$ | $-r_s$ |
| gp1 | $0$ | $-d_{v2}$ | $-d_{v2}$ | $0$ |
| gp2 | $0$ | $0$ | $0$ | $\dfrac{-1 + \sqrt{1 - 4 r_s d_{v2}}}{2 r_s}$ |
| rs | $-d_{i2}$ | $0$ | $\dfrac{-1 + \sqrt{1 - 4 g_{p1} d_{i2}}}{2 g_{p1}}$ | $-d_{i2}$ |
| n | $-\dfrac{d_{i1} + 2 r_s i_0}{2 v_{th}}$ | $-\dfrac{d_{i1}}{2 v_{th}}$ | $-\dfrac{d_{i1} + r_s d_{v1}}{2 v_{th}}$ | $-\dfrac{2 r_s i_0 + d_{i1}}{2 v_{th}}$ |
| i0 | $\dfrac{d_{v1}}{2}$ | $\dfrac{d_{v1} - 2 n v_{th} g_{p1}}{2}$ | $\dfrac{d_{v1} - 2 n v_{th} g_{p1}}{2}$ | $\dfrac{d_{v1} + g_{p2} d_{i1}}{2}$ |

the previously described combinations, as well as several other additional cases, were simulated and the quadratic equation of d as a function of current and voltage, defined in (36), was then used to extract the simulated parameters. in all cases the extraction procedure succeeded in producing the exact original parameters, within computational accuracy. this means that the errors between the original and the extracted parameters depend only on the computational precision and accuracy of the fitting algorithms used.

it must be pointed out that, in order to obtain reasonably accurate results, it is advisable that measurements use as small a voltage step as possible (typically at most 10 mV). additionally, it is of paramount importance to use a suitable algorithm for numerical integration, that is, one that will not introduce significant error, such as a closed newton-cotes formula with 7 points, as illustrated in the appendix of [41].

fig. 13 illustrative synthetic i-v characteristics for various cases with several series resistance values and several peripheral shunt loss values. symbols are used to identify the several cases and do not represent data points

2.3.2. iterative g function method

for the particular case of gp1 = 0 an iterative procedure was proposed in 2000 [48], which is based on the g function described in section 2.2.3.
by estimating the value of gp2 (gp2e) we can calculate the current in the diode branch:

$$ i_d = i - g_{p2e}\,v. \tag{46} $$

then, function g is calculated from the measured i-v data and is plotted as a function of ln(id) for different estimated values of gp2e. selecting the plot that best fits a straight line determines the correct value of gp2e = gp2.

to illustrate the approach, we use simulated data with parameter values i0 = 1 pA, n = 1.5, rs = 1 kΩ and gp2 = 1 μS. figure 14 presents several plots of the calculated function g, using the id defined in eq. (46), for several estimated values of gp2e. the best straight line of the function g with respect to ln(id) defines the correct value of gp2e.

fig. 14 function g vs the logarithm of the id estimated using (46). the plots tend to a straight line (solid line) when the estimated value of gp2e approaches the actual value of gp2 = 1 μS

3. multiple-exponential diode model

when modeling real junctions a single-exponential equation is usually not enough to adequately represent the several conduction phenomena that frequently make relevant contributions to the total current of a particular junction. in such cases junctions need to be represented by lumped multi-diode equivalent circuits.

3.1. double-exponential diode model with series resistance

the first single-exponential model for a p-n junction with a unity ideality factor and a series resistance was proposed by shockley in 1949 [15]. in 1957, sah et al. [18] presented the first double-exponential model for a p-n junction with series parasitic resistance and diode quality factors of n2 = 2 n1 and n1 = 1. the lumped parameter equivalent circuit is illustrated in fig. 15.
the mathematical description of this circuit is given by the following implicit equation:

$$ i = i_{01}\left[\exp\left(\frac{v - i\,r_s}{v_{th}}\right) - 1\right] + i_{02}\left[\exp\left(\frac{v - i\,r_s}{2\,v_{th}}\right) - 1\right]. \tag{47} $$

the above implicit equation does not have an explicit solution for the terminal current, but it does have a solution for the terminal voltage as an explicit function of the terminal current [49]:

$$ v = r_s\,i + 2\,v_{th}\,\ln\left[\sqrt{\left(1 + \frac{i_{02}}{2\,i_{01}}\right)^2 + \frac{i}{i_{01}}} - \frac{i_{02}}{2\,i_{01}}\right]. \tag{48} $$

a global lateral fitting procedure based on (48) has been proposed to directly extract the diode's model parameters [49]. figure 16 presents the i-v characteristics of an experimental silicon pin lateral diode fabricated at the université catholique de louvain [49], measured at two temperatures. the model playback i-v characteristics calculated using the parameter values extracted with this global lateral fitting procedure are also shown in fig. 16.

fig. 15 a double-exponential model with series resistance

fig. 16 measured and simulated i-v characteristics of an experimental silicon lateral pin diode at two temperatures. the playback is calculated using the double-exponential model, with diode quality factors of n2 = 2 and n1 = 1, and the rest of the parameter values extracted by a direct global lateral fitting of (48) to the data

it is important to point out that this lateral fitting procedure may be used in general when the value of one diode quality factor can be assumed to be roughly twice the value of the other (n2 ≈ 2 n1), even if n1 ≠ 1. it is also worth mentioning here that a double-exponential model parameter extraction method, based on area error minimization between measured and modeled i-v characteristics, was recently proposed by yadir et al. [50]. the essence of that method is closely related to integration-based extraction methods [23], [24].

3.2. functions a and b

another possible situation worth considering is represented by a double-exponential model where the values of the ideality factors are arbitrary, and all series resistances and shunt conductances are negligible, as illustrated by the equivalent circuit shown in fig. 17. the mathematical description of the terminal current of such a circuit is given by the following explicit function of the terminal voltage:

$$ i = i_{01}\left[\exp\left(\frac{v}{n_1\,v_{th}}\right) - 1\right] + i_{02}\left[\exp\left(\frac{v}{n_2\,v_{th}}\right) - 1\right]. \tag{49} $$

if we additionally assume that diode 2 (n2, i02) is dominant at low voltage, the current in that region may be approximated by:

$$ i \approx i_{02}\left[\exp\left(\frac{v}{n_2\,v_{th}}\right) - 1\right]. \tag{50} $$

fig. 17 ideal double-exponential model with arbitrary ideality factors and without parasitic resistances

substituting (50) into the following operators a and b yields [51]:

$$ a \equiv \frac{cc}{i} = \frac{\int_0^{v} i\,\mathrm{d}v}{i} = n_2\,v_{th} - i_{02}\,\frac{v}{i} \tag{51} $$

and

$$ b \equiv \frac{cc}{v} = \frac{\int_0^{v} i\,\mathrm{d}v}{v} = n_2\,v_{th}\,\frac{i}{v} - i_{02}. \tag{52} $$

therefore, the application of either one of these two operators, (51) or (52), to measured i-v characteristics produces linear equations in the ratio v/i or in its reciprocal i/v, from whose slopes and intercepts the values of the ideality factor n2 and the reverse saturation current i02 may be directly extracted.

figure 18 presents measurements of the base current as a function of forward base-emitter voltage of a power bjt measured at t = 298 K with vbc = 0. figure 19 shows plots of operators a and b applied to these measurements. the slope of a gives an extracted value of i02 = 215 pA, and its ordinate-axis intercept gives an extracted value of n2 = 2. the slope of function b gives an extracted value of n2 = 1.98, and its ordinate-axis intercept gives an extracted value of i02 = 210 pA. it is worth mentioning that a vertical optimization method could also be used for this case, as illustrated in fig. 20.
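the use of operators a and b can be sketched on synthetic data. the example below is a minimal illustration, assuming noise-free data generated from (50) with hypothetical values (i02 = 215 pA, n2 = 2, 10 mV steps), a trapezoidal co-content and ordinary least-squares lines; the function names are ours.

```python
import math


def co_content(v, i):
    """Running integral of i dv from the first sample, by the trapezoidal rule."""
    cc = [0.0]
    for k in range(1, len(v)):
        cc.append(cc[-1] + 0.5 * (i[k] + i[k - 1]) * (v[k] - v[k - 1]))
    return cc


def linear_fit(x, y):
    """Least-squares straight line y = slope * x + intercept."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxx = sum((xk - mx) ** 2 for xk in x)
    sxy = sum((xk - mx) * (yk - my) for xk, yk in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def extract_with_a_and_b(v, i, vth):
    """Apply operators A (51) and B (52); return (n2, i02) obtained from each."""
    cc = co_content(v, i)
    idx = range(1, len(v))  # skip the origin, where v/i and i/v are undefined
    slope_a, icpt_a = linear_fit([v[k] / i[k] for k in idx],
                                 [cc[k] / i[k] for k in idx])
    slope_b, icpt_b = linear_fit([i[k] / v[k] for k in idx],
                                 [cc[k] / v[k] for k in idx])
    return (icpt_a / vth, -slope_a), (slope_b / vth, -icpt_b)


def synthetic_low_voltage_data(i02=215e-12, n2=2.0, vth=0.02568, steps=50):
    """Noise-free samples of (50) from 0 V in 10 mV steps (hypothetical values)."""
    v = [0.01 * k for k in range(steps + 1)]
    i = [i02 * (math.exp(vk / (n2 * vth)) - 1.0) for vk in v]
    return v, i


if __name__ == "__main__":
    vth = 0.02568  # thermal voltage near 298 K
    v, i = synthetic_low_voltage_data(vth=vth)
    (n2_a, i02_a), (n2_b, i02_b) = extract_with_a_and_b(v, i, vth)
    print(f"from a: n2 = {n2_a:.4f}, i02 = {i02_a:.4e} A")
    print(f"from b: n2 = {n2_b:.4f}, i02 = {i02_b:.4e} A")
```

on such uniformly sampled data the trapezoidal rule slightly overestimates the integral of the exponential, so the extracted n2 comes out a fraction of a percent high, which illustrates why a small voltage step or a higher-order integration rule is advisable.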
Fig. 18 Measured characteristics (symbols) of the base current as a function of forward base-emitter voltage of a power BJT measured at T = 298 K, with $V_{BC} = 0$ and 10 mV voltage steps. Also shown is the model playback simulated with (49) (solid line). The low and high voltage asymptotes (dashed lines) are also shown

Fig. 19 Plots of operators A (51) and B (52) applied to the measured power BJT characteristics shown in Fig. 18

Fig. 20 Measured characteristics (symbols) of the power BJT shown in Fig. 18, and the model playback simulated with (49) (solid line) using the parameter values extracted by vertical optimization

3.3. Regional approach for a double diode with series and parallel resistance

This method is based on the idea that some components of the diode model dominate in a given voltage region [39]. Let us assume a double-exponential model with arbitrary values of ideality factors and with parallel and series resistance, as illustrated in Figure 21. Because the terminal current appears on both sides, the mathematical description of this circuit is given by the following implicit equation:

$$I = I_{01}\left[\exp\left(\frac{V - I R_s}{n_1 v_{th}}\right) - 1\right] + I_{02}\left[\exp\left(\frac{V - I R_s}{n_2 v_{th}}\right) - 1\right] + G_p\left(V - I R_s\right)$$ (53)

Fig. 21 Double-exponential model with arbitrary ideality factors and parasitic series and parallel resistances

Figure 22 presents a particular simulation using (53) with specific parameter values, in which we observe that for low voltage diode 2 and $G_p$ are dominant. Thus, equation (53) may be simplified for low voltages to the following explicit equation:

$$I \approx I_{02}\left[\exp\left(\frac{V}{n_2 v_{th}}\right) - 1\right] + G_p V$$ (54)

Fig.
22 Synthetic I-V characteristics simulated by (53) with the parameter values indicated inside the figure, together with the components dominant at low and high voltage, as calculated with the parameter values locally extracted using (54) and (56) (also indicated inside the figure)

Similarly, for high voltage, diode 1 and $R_s$ are dominant; thus equation (53) may be simplified for high voltages to the following implicit equation:

$$I \approx I_{01}\left[\exp\left(\frac{V - I R_s}{n_1 v_{th}}\right) - 1\right]$$ (55)

Although (55) is implicit, it has the following explicit solution for the terminal voltage:

$$V = R_s I + n_1 v_{th} \ln\left(\frac{I}{I_{01}} + 1\right)$$ (56)

Therefore, the parameters can be extracted locally from two regions: 1) $G_p$, $n_2$ and $I_{02}$ by vertical optimization of the low voltage region, fitting equation (54) to the measured current; and 2) $R_s$, $n_1$ and $I_{01}$ by lateral optimization of the high voltage region, fitting equation (56) to the measured voltage. Figure 22 also includes the original parameter values and those extracted by this method.

3.4. Alternative multi-exponential model with parasitic resistances

Figure 23 illustrates a multi-diode equivalent circuit. Accordingly, the total current has been traditionally described by the following conventional implicit equation:

$$I = \sum_{k=1}^{N} I_{0k}\left[\exp\left(\frac{V - R_s I}{n_k v_{th}}\right) - 1\right] + G_p\left(V - R_s I\right)$$ (57)

Fig. 23 A conventional equivalent circuit of a real junction with multiple diodes

In order to circumvent the lack of an explicit solution of the previous equation, we proposed [52] the use of the equivalent circuit presented in Figure 24. By solving each branch separately and adding the solutions, this model's I-V characteristics may be expressed by the following explicit equation for the terminal current:

$$I = \sum_{k=1}^{N}\left\{\frac{n_{ka} v_{th}}{R_{ska}} W_0\left[\frac{R_{ska} I_{0ka}}{n_{ka} v_{th}} \exp\left(\frac{V + R_{ska} I_{0ka}}{n_{ka} v_{th}}\right)\right] - I_{0ka}\right\} + G_{pa} V$$
(58)

where, as before, $W_0$ represents the principal branch of the Lambert W function [31], $G_{pa} = 1/R_{pa}$ is the alternative outer shunt conductance, and the rest of the parameters are defined as before. Notice that the single global series resistance, $R_s$, present in the conventional model, has been substituted in this alternative model by individual series resistances, $R_{ska}$, placed in each of the k-th parallel current paths associated with the k-th conduction mechanism.

Fig. 24 Alternative equivalent circuit with multiple diodes, resistances in series with each diode, and an outer shunt resistance

Figure 25 presents the I-V characteristics of a lateral PIN diode at four temperatures from 300 to 390 K. Model parameters were extracted, for both conventional and alternative double-exponential models, by globally fitting the logarithm of each model to the experimental data. The left figure also includes the corresponding alternative model playbacks, while the right figure includes the corresponding conventional model playbacks. Additional calculations of the playback errors relative to the original measured data indicate that the alternative model produces a more accurate representation of this device's forward conduction behavior at the four temperatures considered here.

Fig. 25 Measured (red symbols), alternative and conventional model playbacks (black solid lines) forward I-V characteristics of an experimental lateral PIN diode at four temperatures

3.5. Lateral optimization using an approximate analytical expression for the voltage in multi-exponential diode models

Whenever the conductance $G_p$ can be neglected in the model presented in Fig. 23, the total current is described by the following conventional implicit equation:

$$I = \sum_{k=1}^{N} I_{0k}\left[\exp\left(\frac{V - R_s I}{n_k v_{th}}\right) - 1\right]$$
(59)

We recently proposed [53] an approximate solution of the above transcendental equation for the terminal voltage as an explicit function of the terminal current, valid for arbitrary $N$, $I_{0k}$, and $n_k$. This approximate solution is [53]:

$$V \approx R_s I - m v_{th} \ln\left[\sum_{k=1}^{N}\left(\frac{I}{I_{0k}} + 1\right)^{-n_k/m}\right]$$ (60)

where $m$ represents an empirical dimensionless joining factor. It is important to note in (60) that at any particular bias point (I, V) at which only one of the conduction mechanisms represented by one of the diodes in the model is dominant, the summation in (60) reduces to only one term. For the particular case of a model with just two parallel diodes (N = 2) with arbitrary values of $n_1$ and $n_2$, the explicit approximate terminal voltage solution simplifies to:

$$V \approx R_s I - m v_{th} \ln\left[\left(\frac{I}{I_{01}} + 1\right)^{-n_1/m} + \left(\frac{I}{I_{02}} + 1\right)^{-n_2/m}\right]$$ (61)

To illustrate the applicability of this approximate model, we applied it to experimental I-V characteristics of lateral thin-film SOI PIN diodes. Figure 26 presents the measured I-V characteristics of a device where parameter extraction was performed using lateral optimization by minimizing voltage errors at a given current. The extracted parameters and the joining factor are indicated in Fig. 26, together with the lateral voltage error with respect to measured data.

Fig. 26 (upper pane) Measured (red dotted lines) and model playback (black solid lines) of a lateral thin-film SOI PIN diode at 150 K in linear and logarithmic scales; and (lower pane) absolute lateral error of model playback with respect to the measured data

4. Single-exponential solar cell model

4.1.
Single-exponential model without any resistance

Consider an idealized solar cell without any parasitic resistance, whose I-V characteristics under illumination may be described by the superposition of two currents: a voltage-independent photo-generated current source and the current of a single exponential-type junction, as shown in Fig. 27.

Fig. 27 Idealized solar cell equivalent circuit without parasitic resistances

The terminal current of this lumped parameter equivalent circuit model is mathematically described by the following explicit equation of the terminal voltage:

$$I = I_0\left[\exp\left(\frac{V}{n v_{th}}\right) - 1\right] - I_{ph}$$ (62)

where the magnitude of the photo-generated current $I_{ph}$ depends only on the illumination intensity. Alternatively, the terminal voltage may be expressed as an explicit function of the terminal current:

$$V = n v_{th} \ln\left(\frac{I + I_{ph}}{I_0} + 1\right)$$ (63)

Figure 28 shows simulated I-V characteristics of an idealized solar cell, in linear and logarithmic scales, under illumination.

Fig. 28 Simulated dark and illuminated I-V characteristic of an idealized solar cell in linear and logarithmic scales

The short circuit current ($I_{sc}$) and open circuit voltage ($V_{oc}$) can be found by evaluating (62) at V = 0 and (63) at I = 0, respectively, as:

$$I_{sc} \equiv I\big|_{V=0} = -I_{ph}$$ (64)

and

$$V_{oc} \equiv V\big|_{I=0} = n v_{th} \ln\left(\frac{I_{ph}}{I_0} + 1\right)$$ (65)

The output power is given by the VI product. Using (62) yields:

$$P = V I = V\left\{I_0\left[\exp\left(\frac{V}{n v_{th}}\right) - 1\right] - I_{ph}\right\}$$ (66)

Maximum output power will be delivered when the magnitude of (66) becomes maximum. Differentiating (66) with respect to voltage and equating to zero yields the value of the voltage ($V_{mpp}$) at the maximum power point (MPP):

$$V_{mpp} = n v_{th}\left\{W_0\left[\frac{e\,(I_{ph} + I_0)}{I_0}\right] - 1\right\} \approx n v_{th}\left\{W_0\left[\frac{e\,I_{ph}}{I_0}\right] - 1\right\}$$ (67)

where $e \approx 2.718$ and $W_0$ stands for the principal branch of the Lambert W function [31].
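As a rough numerical check of (67), the sketch below evaluates $V_{mpp}$ for an ideal cell and confirms that the VI product is stationary there. The cell parameters are hypothetical, and `lambert_w0` is a small Newton iteration written for this example (valid for non-negative arguments only), not a routine taken from the paper.

```python
import math

def lambert_w0(x, tol=1e-14, max_iter=100):
    """Principal branch W0(x) for x >= 0, by Newton iteration on w*exp(w) = x."""
    if x < 0.0:
        raise ValueError("this sketch only handles x >= 0")
    w = math.log1p(x)  # starts at or above the root, so Newton descends monotonically
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w

def cell_current(v, i_ph, i_0, n, v_th):
    """Terminal current of the ideal illuminated cell, eq. (62)."""
    return i_0 * math.expm1(v / (n * v_th)) - i_ph

def v_mpp(i_ph, i_0, n, v_th):
    """Voltage at the maximum power point, exact form of eq. (67)."""
    return n * v_th * (lambert_w0(math.e * (i_ph + i_0) / i_0) - 1.0)

# Hypothetical cell parameters, chosen only for illustration.
I_PH, I_0, N, V_TH = 30.0e-3, 1.0e-9, 1.2, 0.02585

vm = v_mpp(I_PH, I_0, N, V_TH)
im = cell_current(vm, I_PH, I_0, N, V_TH)
print(vm, im, vm * im)  # the power is negative because the cell delivers energy
```

With the sign convention of (62) the generated power is negative, so the maximum power point appears as the minimum of the VI product.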
The corresponding current ($I_{mpp}$) at the MPP is found by evaluating (62) at $V_{mpp}$ using (67).

4.2. Single-exponential model with series resistance

Figure 29 presents the lumped parameter equivalent circuit model of a solar cell with parasitic series resistance.

Fig. 29 Solar cell equivalent circuit with a parasitic series resistance

As a consequence of the presence of the parasitic series resistance $R_s$, the terminal current of this equivalent circuit is mathematically described by an implicit equation:

$$I = I_0\left[\exp\left(\frac{V - R_s I}{n v_{th}}\right) - 1\right] - I_{ph}$$ (68)

The terminal voltage can be solved from (68), resulting in an explicit function of the terminal current:

$$V = R_s I + n v_{th} \ln\left(\frac{I + I_{ph}}{I_0} + 1\right)$$ (69)

The implicit terminal current equation given by (68) can be solved explicitly in terms of the terminal voltage if we introduce the special Lambert W function [54]:

$$I = \frac{n v_{th}}{R_s} W_0\left[\frac{I_0 R_s}{n v_{th}} \exp\left(\frac{V + R_s (I_{ph} + I_0)}{n v_{th}}\right)\right] - (I_{ph} + I_0)$$ (70)

Another consequence of the presence of the parasitic series resistance $R_s$ is that it prevents finding an exact analytical solution for the maximum power point, since equating the derivative of the VI product to zero does not yield an equation that can be solved analytically for either $V_{mpp}$ or $I_{mpp}$. Figure 30 illustrates the effect of series resistance $R_s$ on linear and semilogarithmic scale I-V characteristics simulated under illumination with three values of $R_s$.

Fig. 30 Simulated I-V characteristic of a solar cell at different values of parasitic series resistance, in linear and logarithmic scale

The open circuit voltage $V_{oc}$ does not depend on $R_s$, since its effect, given by $I R_s$, becomes zero when the current goes to zero (open circuit). Thus, the value of $V_{oc}$ is given by the same equation (65).
On the other hand, the short circuit current $I_{sc}$ can be found by evaluating (70) at V = 0, as:

$$I_{sc} \equiv I\big|_{V=0} = \frac{n v_{th}}{R_s} W_0\left[\frac{I_0 R_s}{n v_{th}} \exp\left(\frac{R_s (I_{ph} + I_0)}{n v_{th}}\right)\right] - (I_{ph} + I_0)$$ (71)

The four parameters ($n$, $I_0$, $R_s$ and $I_{ph}$) that fully describe the solar cell in terms of this lumped parameter equivalent circuit model can be extracted by fitting the cell's measured data to any of the model's defining equations. Equations (68), (69) or (70) can be applied directly for fitting. Vertical or lateral optimization could be used for fitting by minimizing either the current quadratic error or the voltage quadratic error, respectively. In the present case the use of equation (69) in combination with lateral optimization affords the best computational convenience, since this equation is not implicit, as (68) is, and does not contain special functions, as (70) does.

4.2.1. First integration method to extract the series resistance of solar cells

To the best of our knowledge, Araujo and Sánchez were the first to propose, back in 1982, the use of integration for parameter extraction in solar cells [36]. They used the integral of (69), assuming that $I \gg I_0$ and $I_{ph} \gg I_0$, to obtain the relation:

$$\int_0^I V\,dI = \frac{R_s}{2} I^2 - n v_{th} I + n v_{th} (I + I_{ph}) \ln\left(\frac{I + I_{ph}}{I_0}\right) - n v_{th} I_{ph} \ln\left(\frac{I_{ph}}{I_0}\right)$$ (72)

Evaluating (72) at an upper limit of integration $I = I_{sc}$, the series resistance $R_s$ can be evaluated as:

$$R_s = \frac{2}{I_{sc}^2}\int_0^{I_{sc}} V\,dI + \frac{2 n v_{th}}{I_{sc}} - \frac{2 V_{oc}}{I_{sc}}$$ (73)

4.3. Single-exponential model with parallel resistance

Figure 31 presents the lumped parameter equivalent circuit model of a solar cell with parasitic parallel conductance.

Fig.
31 Solar cell lumped parameter equivalent circuit model with parasitic parallel conductance

The mathematical description of the terminal current of this equivalent circuit is given in terms of the terminal voltage by the explicit equation:

$$I = I_0\left[\exp\left(\frac{V}{n v_{th}}\right) - 1\right] - I_{ph} + G_p V$$ (74)

The terminal voltage can be solved from the above equation as an explicit function of the terminal current if we use the special Lambert W function:

$$V = \frac{I + I_{ph} + I_0}{G_p} - n v_{th} W_0\left[\frac{I_0}{n v_{th} G_p} \exp\left(\frac{I + I_{ph} + I_0}{n v_{th} G_p}\right)\right]$$ (75)

As a consequence of the presence of the parallel conductance $G_p$, an exact analytical solution for the maximum power point is not possible, since equating the derivative of the VI product to zero does not yield an equation that can be solved analytically for either $V_{mpp}$ or $I_{mpp}$. Figure 32 illustrates the effect of parallel conductance $G_p$ on linear and semilogarithmic scale I-V characteristics simulated under illumination with three values of $G_p$.

Fig. 32 Simulated I-V characteristic of a hypothetical solar cell for three values of parallel conductance in linear and logarithmic scales

For this particular case we find the short circuit current by evaluating (74) at V = 0, yielding $I_{sc} = -I_{ph}$, which is independent of the value of $G_p$. The open circuit voltage $V_{oc}$ is obtained by evaluating (75) at I = 0, yielding:

$$V_{oc} = \frac{I_{ph} + I_0}{G_p} - n v_{th} W_0\left[\frac{I_0}{n v_{th} G_p} \exp\left(\frac{I_{ph} + I_0}{n v_{th} G_p}\right)\right]$$ (76)

The four parameters ($n$, $I_0$, $G_p$ and $I_{ph}$) that fully describe the solar cell in terms of this lumped parameter equivalent circuit model can be extracted by fitting the cell's measured data to any of the model's defining equations. Equations (74) or (75) can be applied directly for fitting. Vertical or lateral optimization could be used for fitting by minimizing either the current quadratic error or the voltage quadratic error, respectively.
In the present case the use of equation (74) in combination with vertical optimization affords the best computational convenience, since this equation is explicit and does not contain special functions, as (75) does.

4.4. Single-exponential model with series and parallel resistances

Figure 33 presents the lumped parameter equivalent circuit model of a solar cell with series resistance and parallel conductance. The mathematical description of the terminal current of this equivalent circuit is given by the implicit equation:

$$I = I_0\left[\exp\left(\frac{V - I R_s}{n v_{th}}\right) - 1\right] + (V - I R_s)\,G_p - I_{ph}$$ (77)

Fig. 33 Solar cell equivalent circuit with parasitic series and parallel resistances

The use of the special Lambert W function allows the above equation to be explicitly solved [54] for the terminal current as a function of the terminal voltage:

$$I = \frac{n v_{th}}{R_s} W_0\left[\frac{I_0 R_s}{n v_{th}(1 + R_s G_p)} \exp\left(\frac{V + R_s (I_{ph} + I_0)}{n v_{th}(1 + R_s G_p)}\right)\right] + \frac{V G_p - (I_{ph} + I_0)}{1 + R_s G_p}$$ (78)

and for the terminal voltage as a function of the terminal current:

$$V = I\left(R_s + \frac{1}{G_p}\right) + \frac{I_{ph} + I_0}{G_p} - n v_{th} W_0\left[\frac{I_0}{n v_{th} G_p} \exp\left(\frac{I + I_{ph} + I_0}{n v_{th} G_p}\right)\right]$$ (79)

An exact analytical solution for the maximum power point is not possible in this case either, because equating the derivative of the VI product to zero does not yield an equation that can be solved analytically for either $V_{mpp}$ or $I_{mpp}$. Figure 34 illustrates the effect of series resistance $R_s$ and parallel conductance $G_p$ on linear and semilogarithmic scale I-V characteristics simulated under illumination with different values of $R_s$ and $G_p$.

Fig.
34 Simulated I-V characteristic of a hypothetical solar cell with different values of series and parallel resistance in linear and logarithmic scales

The short circuit current is found by evaluating (78) at V = 0, yielding:

$$I_{sc} = \frac{n v_{th}}{R_s} W_0\left[\frac{I_0 R_s}{n v_{th}(1 + R_s G_p)} \exp\left(\frac{R_s (I_{ph} + I_0)}{n v_{th}(1 + R_s G_p)}\right)\right] - \frac{I_{ph} + I_0}{1 + R_s G_p}$$ (80)

and the open circuit voltage $V_{oc}$ is obtained by evaluating (79) at I = 0, yielding:

$$V_{oc} = \frac{I_{ph} + I_0}{G_p} - n v_{th} W_0\left[\frac{I_0}{n v_{th} G_p} \exp\left(\frac{I_{ph} + I_0}{n v_{th} G_p}\right)\right]$$ (81)

4.4.1. Vertical optimization

The implicit terminal current equation (77) could be directly fitted to the experimental data to extract the model parameters. However, a more convenient way [55], [56] would be to use instead the explicit equation (78) for the terminal current as a function of the terminal voltage. Of course, this implies having a Lambert W function calculation algorithm implemented within the data fitting software. Del Pozo et al. [55] propose following this route by using MATLAB's non-linear curve fitting routine "lsqcurvefit". This vertical optimization procedure (minimizing the current quadratic error) allows the extraction of all the parameters at the same time, but it frequently requires good initial estimates of the parameters.

4.4.2. Extraction from the co-content function

Model parameters can be extracted from the integrals of the illuminated I-V characteristics. The integral with respect to the voltage is known as the co-content CC(I,V). For an illuminated solar cell it is defined as [23], [24]:

$$CC(I,V) = \int_0^V (I - I_{sc})\,dV$$ (82)

The lower limit of integration in the above equation is defined at the point V = 0, $I = I_{sc}$. Substitution of (78) into (82) and integrating with respect to V results in a long expression that contains Lambert W functions and both variables, V and I.
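The two ingredients used so far, the explicit current (78) and the co-content integral (82), can be sketched numerically as below. This is only an illustration under stated assumptions: the parameter values are hypothetical, the integral is approximated with the trapezoidal rule on a uniform voltage grid, and `lambert_w0` is a simple Newton iteration written for this example.

```python
import math

def lambert_w0(x, tol=1e-14, max_iter=100):
    """Principal branch W0(x) for x >= 0, by Newton iteration on w*exp(w) = x."""
    if x < 0.0:
        raise ValueError("this sketch only handles x >= 0")
    w = math.log1p(x)
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w

def solar_cell_current(v, i_ph, i_0, n, r_s, g_p, v_th):
    """Explicit terminal current of the model with Rs and Gp, eq. (78)."""
    a = n * v_th
    k = 1.0 + r_s * g_p
    arg = (i_0 * r_s / (a * k)) * math.exp((v + r_s * (i_ph + i_0)) / (a * k))
    return (a / r_s) * lambert_w0(arg) + (v * g_p - (i_ph + i_0)) / k

def co_content(voltages, currents):
    """Trapezoidal estimate of eq. (82); the sweep must start at V = 0."""
    i_sc = currents[0]
    cc = [0.0]
    for k in range(1, len(voltages)):
        mean_excess = 0.5 * (currents[k] + currents[k - 1]) - i_sc
        cc.append(cc[-1] + mean_excess * (voltages[k] - voltages[k - 1]))
    return cc

# Hypothetical parameters (A, ohm, S, V), chosen only for illustration.
I_PH, I_0, N, R_S, G_P, V_TH = 30.0e-3, 1.0e-9, 1.5, 0.5, 1.0e-3, 0.02585

voltages = [k * 0.005 for k in range(0, 141)]          # 0 V to 0.7 V
currents = [solar_cell_current(v, I_PH, I_0, N, R_S, G_P, V_TH) for v in voltages]
cc = co_content(voltages, currents)
print(currents[0], cc[-1])
```

With measured data the same `co_content` routine would be applied to the sampled I-V pairs; the accuracy of the result then depends on the voltage step and on measurement noise.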
Replacing the terms that contain Lambert W functions of V, using equation (78), and after some algebraic manipulations, the function CC(I,V) may be conveniently expressed for the solar cell as a purely algebraic equation of the form:

$$CC(I,V) = C_{V1} V + C_{I1} (I - I_{sc}) + C_{I1V1} V (I - I_{sc}) + C_{V2} V^2 + C_{I2} (I - I_{sc})^2$$ (83)

where the five coefficients are given in terms of the model parameters by:

$$C_{I1} = R_s (I_{ph} + I_{sc} + I_0) + n v_{th} (1 + G_p R_s) + I_{sc} R_s^2 G_p$$ (84)

$$C_{V1} = -(I_{ph} + I_{sc} + I_0) - n v_{th} G_p - I_{sc} R_s G_p$$ (85)

$$C_{I2} = \frac{R_s (1 + G_p R_s)}{2}$$ (86)

$$C_{V2} = \frac{G_p}{2}$$ (87)

and the fifth is a coefficient that is dependent on the others:

$$C_{I1V1} = \frac{1 - \sqrt{1 + 16\,C_{I2} C_{V2}}}{2}$$ (88)

As can be realized from (88), there are actually only four independent coefficients, (84)-(87), and therefore only four unknowns may be extracted uniquely. However, all the model parameters may be extracted. The extraction procedure consists of performing bivariate fitting of algebraic equation (83) to the co-content function CC as numerically calculated from the experimental data using (82). This bivariate fitting process yields the values of the four equation coefficients $C_{V1}$, $C_{I1}$, $C_{V2}$, $C_{I2}$, which are then used to calculate the solar cell's model parameters $G_p$, $R_s$, $I_{ph}$, $n$ and $I_0$ as follows. The value of the shunt loss is calculated directly from (87):

$$G_p = 2\,C_{V2}$$ (89)

The value of the series resistance is calculated by substituting (89) into (86) and solving the resulting quadratic equation:

$$R_s = \frac{\sqrt{1 + 16\,C_{V2} C_{I2}} - 1}{4\,C_{V2}}$$ (90)

The value of the junction quality factor is calculated by substituting (89) and (90) into (84) and (85) and solving the two equations to yield:

$$n = \frac{C_{V1}\left(\sqrt{1 + 16\,C_{V2} C_{I2}} - 1\right) + 4\,C_{I1} C_{V2}}{4\,v_{th} C_{V2}}$$ (91)

The value of the photo-generated current is obtained assuming $I_0$ < […] 100 GHz.
The maximum values of the spatial increments of the linear amplification of SCW in the nitride films are achieved at frequencies f > 50 GHz. This differs from bulk crystals, where the maximum increments correspond to zero frequency [20]. Therefore the strong monopulses in semiconductor films with NDC differ from the ordinary domains in bulk semiconductors [20]. When domains are excited, the electric field outside them is well below the threshold value, whereas in the films the electric field outside the monopulses remains practically unchanged and equal to the bias field [1,18,19]. Moreover, only a single domain can propagate within the crystal at a time, and the next domain is excited when the previous one leaves the crystal. The strong monopulses can be excited when the bias electric field slightly exceeds the critical value of the electric field for NDC and the doping levels are moderate [1,18,19]. Note that the domain regime of operation is not preferable in bulk Gunn diodes, because of its lower efficiency compared with other regimes, such as the limited space-charge accumulation regime [20].

The most interesting case is the excitation of sequences of strong monopulses from input envelope pulses of small amplitudes with a carrier frequency in the microwave range 10 – 50 GHz. Such an excitation was not considered earlier and is important for the practical needs of the modulation of optical radiation by SCW in laser devices [21-25]. Under such an excitation the repetition rate of the output monopulses is determined by the input carrier frequency. This paper is devoted to the investigation of the generation of sequences of output strong monopulses under excitation by input microwave pulses of small amplitudes.

In the films the influence of the boundaries on the properties of the films is essential.
namely, at the boundaries the electron mobility can be lower than in the center of the film and, moreover, ndc can be absent at the boundaries due to additional mechanisms of carrier scattering. in the last case the non-uniform doping can be used that is increased in the center of the film x = l and is decreased at the boundaries of the film x = 0, x = 2l, see fig. 1. in the present consideration the permittivities below and above the film are 1 = 4, sio2, and 3 = 1, air. generation of sequences of strong electric monopulses in nitride films 189 a) b) fig. 1 the geometry of the problem. parts a), b) are views from different directions. the nitride film occupies the region 0 ≤ x ≤ 2l, 0 ≤ y ≤ ly, 0 ≤ z ≤ lz. nd(x) is the nonuniform doping profile. the input antenna is i. the nonlinear space charge waves are formed as strong monopulses of picosecond durations at the output antenna ii. the pulses are generally localized also along oy axis. 2. basic equations the non-local electron hydrodynamics can be used to investigate nonlinear scw in the nitride films [1, 15-19, 26, 27]. in this approach the dynamics of the electron gas is described by the full electron concentration n in all valleys jointly, the average electron velocity v, and by the average electron energy w. this model is valid for the nitrides ngan and n-inn and cannot be applied for n-gaas [27], because in the last material the occupation of the different valleys is principally important to calculate the diffusion coefficient. the equations of the balance of the number of electrons, the linear momentum, and of the electron energy are: * 0* 00 * 2 * ( ) 0; ( ) ( ) 1 ( ) ( ); ; ( ) 1 (( ) ) ( ) ( ); 2 ( ) 5 ( ); . 
3 2 2 ( ) p w p n dv v ee div nv v v t dt t m w nt v w e e nm w dw ee v nv t w w w dt n m w v nt where t w m w         + =  +  = −    − = − =  −  −  − − = − = (1) here n is the total electron concentration, v is the average velocity, w is the average electron energy; p, w are the effective relaxation frequencies of the electron momentum and energy, m* is the effective electron mass, t is the electron temperature in energetic units,  is the electron thermoconductivity coefficient. w00 = 0.039 ev is the average electron energy at 300 k; e00  e0z is the bias constant electric field. it is assumed that p, w, m* are the functions of the average electron energy w. an influence of the thermoconductivity 190 v. grimalsky, s. koshevaya, j. escobedo-alatorre, a. kotsarenko on the electron gas dynamics is not essential up to the frequencies f  2 – 3 thz [15]. the electron kinetic energy for the investigated processes is one order lower than the average electron energy, so it is t  (2/3)w. the utilized dependencies of the drift velocity and the average electron energy on the electric field for the zinc blende n-gan and for n-inn [28] are presented in fig. 2. it is assumed that in the non-uniform films ndc at the surfaces is absent, as depicted by the dash curves 1, 2. the results of our investigations are tolerant to changes of the dependencies of the drift velocity at the surfaces of the films. a) b) fig. 2 parts a), b) are dependencies of the drift velocity v and the average electron energy w on the electric field for the zinc blende n-gan, solid curve 1, and n-inn, solid curve 2. dash curves 1 and 2 are the used dependencies v(e) at the boundaries of the n-gan and n-inn non-uniform films where ndc is absent. the dependencies of the relaxation frequencies were computed from eqs. (1) in the stationary case /t = 0 [1, 15-19]; the dependencies in fig. 2, a), b), were used.. 
their computed values are p 2·10 13 s-1 for both nitrides; w  10 12 s-1 for inn, w  10 13 s-1 for gan [15]. thus, it is p >> w. therefore at the frequencies f < 1 thz the inertia of the electron gas can be neglected: * * 1 ( ) ( ) ( ). p p e d v e nt w e nw nwm nm     −  = −  (2) here , d are the coefficients of the electron mobility and the diffusion: * * ; p p e t d t em m     = =  . (3) the relaxation frequency w in n-inn is smaller than in n-gan; this fact limits the frequency range of the amplification of scw in the n-inn films [15]. below the processes are considered in the wide frequency range f  300 ghz, or with the temporal scales 3 ps, so the using of eq. (2) results in the following diffusion-drift equations for the total electron concentration: generation of sequences of strong electric monopulses in nitride films 191 2 2 2 1/ 2 00 0, ; ; ; ( ) , ( ) , ( ) ; ( ) , 4 ( ), ( ) , 3 ( ) , , . yx z x x y y z z x x y y z z d z x y w z x y jj jn n j nv d t x y z x n n j nv d j nv d v e e y z v v e e v e e e e v v v e e e e e w t e e e e z x y            + + + = = −        = − = − =   = =   + =  + +     = − = − = −    (4) the simplest model of the taking account of the non-local dependence of the drift velocity v is applied here, to estimate the influence of the nonlocality. the characteristic time of nonlocality was calculated in [16,17], it is of about 4/(3w(w)). in eqs. (4) vd(e) is the stationary dependence of the drift velocity on the electric field presented in fig. 2, a. in the simulations presented below an influence of the nonlocality is not essential, as it has been checked; such an influence results in <1% variations. thus, the diffusion-drift equation is valid here with the local dependence of the drift velocity v on the electric field e. the equations for the dynamics of the electron gas should be added by the poisson equation for the electric field potential. 
note that the potential 0 ( ) ( , , , )x z x y t  = + includes both the stationary part 0(x) due to the possible non-uniform doping and the variable in time one  due to the propagation of scw: 2 2 2 0 2 2 2 2 ( ( )) , 0 2 ; 0, 0, 2 . d e n n x x l x y z x x l      − −      + + =        (5) the non-uniform doping is considered as: 2 0 ( ) exp( (( ) / ) ). d d d n x n x l x= − − (6) here xd is the scale of the non-uniform doping. under the non-uniform doping the stationary concentration n0(x), the electric potential 0(x), and the additional electric field e0(x) have been calculated from the following set of equations: 2 2 0 0 0 0 0 2 2 0 0 0 2 0 0 ( ) ( ) ( ( ) ( )), ( ) ; ( ) ( ) ( ) exp( ), . ( ) exp( ) d l d l b e b e d x d xe n x n x e x dxdx n x dx e x n x c c k t e x dx k t       = − −  − =  − = −   (7) 192 v. grimalsky, s. koshevaya, j. escobedo-alatorre, a. kotsarenko eqs. (7) have been solved by the iterative newton method, which is analogous to the gummel one [29]. it yields the rapid convergence; the obtained accuracy is <10-12. the poisson equation for the variable electric potential of scw is 0 2 2 2 0 2 2 2 2 ( ( )) , 0 2 ; 0, 0, 2 . e n n x x l x y z x x l      − −      + + =        (8) 3. linear amplification of scw to investigate the amplification of linear scw, the solution of linearized eqs. (4), (8) is searched as the travelling waves: , , ~ exp( ( ))n w i t kz  − . (9) the circular frequency   2f is real here, whereas the longitudinal wave number is complex k  k’+ik’’. the amplification of scw occurs when k’’ > 0. in the non-uniform films the dependence of the electron mobility on the coordinate x is taken as 1 1 2 2 2 0 0 ( ) ( ) ( ), ( ) exp( ( / ) ) exp( ((2 ) / ) . x x x x x l x x    = − −    − + − − (10) here 1, 2 are the mobilities in the center x = l and at the boundaries x = 0, x = 2l of the film taken from fig. 
2, a; $x_0$ is the scale of the non-uniformity of the conductivity. In the center NDC is present, whereas at the boundaries it is absent.

Fig. 3 shows the simulated dependencies of the spatial increments of amplification $k''$ on the frequency $f$ of linear SCW. For all cases the thickness of the films is $2l = 0.5\ \mu$m. This thickness is large enough to allow the use of SCW for the modulation of laser radiation in waveguides with nitride films. Curves 1 and 2 correspond to uniform films. Curve 3 is for the film with a non-uniform dependence of the electron mobility $\mu$ on $x$, as in eq. (10), but with uniform doping. Curves 4 and 5 are for both the non-uniform dependence $\mu(x)$ and the non-uniform doping $n_d(x)$, eq. (6). Note that the critical field of NDC in n-GaN is $E_c = 1.25\times10^5$ V/cm; in n-InN it is $E_c = 0.49\times10^5$ V/cm. In all cases the bias electric field corresponds to relatively small values of NDC, as is necessary for the excitation of strong electric monopulses in the nonlinear regime [1,18,19]. When the bias field exceeds $E_{00} > 1.4\times10^5$ V/cm in n-GaN and $E_{00} > 0.54\times10^5$ V/cm in n-InN, the linear increments of amplification increase sharply and, in the nonlinear regime, stable monopulses are not excited [18].

In Fig. 3, part a) is for the n-GaN film. Curve 1 is for the bias electric field $E_{00} = 1.3\times10^5$ V/cm; the equilibrium electron concentration is $n_0 = n_d = 10^{17}$ cm$^{-3}$. Curve 2 is for $E_{00} = 1.35\times10^5$ V/cm, $n_0 = n_d = 5\times10^{16}$ cm$^{-3}$. Curve 3 is for the film with non-uniform conductivity with the scale $x_0 = 0.1\ \mu$m (see eq. (10)), $E_{00} = 1.4\times10^5$ V/cm, $n_0 = n_d = 10^{17}$ cm$^{-3}$. Curve 4 is for the film with non-uniform conductivity with the scale $x_0 = 0.1\ \mu$m, $E_{00} = 1.3\times10^5$ V/cm; the scale of the non-uniform doping is $x_d = 0.2\ \mu$m, $n_{d0} = 10^{17}$ cm$^{-3}$.
Curve 5 is for $E_{00} = 1.3\times10^5$ V/cm, $x_d = 0.2\ \mu$m, $n_{d0} = 1.53\times10^{17}$ cm$^{-3}$. The last case corresponds to an electron concentration averaged along the Ox axis of $10^{17}$ cm$^{-3}$. In the films with non-uniform conductivity the increments are considerably smaller than in the uniform films. The increments can be increased in the films with non-uniform doping, i.e. through the localization of the conduction electrons in the center of the films near $x = l$.

Analogous results have been obtained for n-InN films, see Fig. 3, b, but the bias electric fields and doping levels are smaller there than in the case of n-GaN films. For all cases the bias electric field is $E_{00} = 0.52\times10^5$ V/cm. Curve 1 is for the electron concentration $n_0 = n_d = 1.5\times10^{16}$ cm$^{-3}$. Curve 2 is for $n_0 = n_d = 2\times10^{16}$ cm$^{-3}$. Curve 3 is for the film with the non-uniform conductivity, $x_0 = 0.1\ \mu$m, $n_0 = n_d = 2\times10^{16}$ cm$^{-3}$. Curve 4 is for the non-uniformly doped film, $x_0 = 0.1\ \mu$m, $x_d = 0.2\ \mu$m, $n_{d0} = 2\times10^{16}$ cm$^{-3}$. Curve 5 is for $x_0 = 0.1\ \mu$m, $x_d = 0.2\ \mu$m, $n_{d0} = 3.06\times10^{16}$ cm$^{-3}$, which corresponds to the averaged electron concentration $2\times10^{16}$ cm$^{-3}$.

Thus, the typical frequency range of the linear amplification of SCW is about 100 GHz and more. The maximum values of the increments are reached at the frequencies f = 50 – 150 GHz in the films with moderate doping and under bias fields slightly above the threshold values. For film lengths $L_z \geq 20\ \mu$m it is possible to obtain an amplification of linear SCW of $\exp(k'' L_z) \geq 10$. After the linear amplification stage, SCW are subject to strong nonlinear processes, which are investigated below in Section 4.

Fig. 3 Spatial increments of amplification of linear SCW. Part a) is for n-GaN films, b) is for n-InN films.
curves 1, 2 are for uniform films, curve 3 is for non-uniform conductivity but uniform doping; 4, 5 are for non-uniform conductivity and non-uniform doping.
the finite size of scw along the oy axis decreases the increment of amplification [24], i.e. for amplified waves in active media the wave diffraction is equivalent to diffusion.
4. excitation of sequences of nonlinear monopulses
the nonlinear dynamics of sequences of nonlinear scw pulses has been simulated by means of the diffusion-drift equation jointly with the poisson equation, supplemented by the boundary conditions, eqs. (4), (8). the equation for the electron concentration has been solved by splitting with respect to physical factors [30-32]. the first fractional step is along the ox axis, the second one is along oy, the third one is along oz. unconditionally stable implicit difference schemes of the second order of approximation have been used. because the spatial scales along the oz axis and along ox, oy differ by an order of magnitude or more, it is important to preserve the balance of the electric charge at the surfaces, namely to use the integral interpolation method to derive the difference approximations both for the volume equations and for the boundary conditions [31]. the absence of surface charge at the boundaries of the film x = 0, x = 2l is assumed. there the electric boundary conditions are the continuity of the potential and of the normal component of the electric induction. at the ends of the film the boundary conditions for the concentration are n(z = 0, x, y) = n0(x) and ∂n/∂z (z = lz) = 0; the last one corresponds to the ohmic junctions [20]. the boundary conditions n(y = 0 or ly, z, x) = n0(x) are used at y = 0 and y = ly. but in the simulations the film is assumed wide enough, so the influence of the boundary conditions along the oy axis is not essential. for the variable electric potential the boundary conditions are φ = 0 at z = 0, z = lz.
the poisson equation for the electric potential φ has been solved by the fast fourier transforms for real functions [30]: the sine transform with respect to z and the cosine transform with respect to y. along the ox axis the finite difference method of the second order has been used. to get a suitable accuracy ≤ 1% the number of points of the computation grid should be ≥ 512 along the oz axis, ≥ 64 along oy, and ≥ 50 along ox. because the splitting with respect to physical factors has been applied, i.e. the method of the total approximation, the temporal step has been chosen ≤ 0.1 ps. the accuracy has been checked by means of 1) checking the approximation of the basic equations at each temporal step; 2) re-simulations with other numbers of the numerical grid points along each axis; 3) using various finite difference approximations, like the central differences and the monotonic schemes; 4) comparison of 3d simulations with 2d ones; 5) comparison of the simulations at the initial linear stage of the dynamics with the analytical results. the initial pulses of scw are excited by a wide-band planar waveguide; the exciting electric field at the input antenna is:
$$E_{exc} = A \exp\left(-\left(\frac{t-t_1}{t_0}\right)^2 - \left(\frac{z-z_1}{z_0}\right)^2 - \left(\frac{y-L_y/2}{y_0}\right)^2\right)\sin(\Omega t). \qquad (11)$$
here Ω is the carrier circular frequency of the envelope pulse (not to be confused with the circular frequency ω from the previous section), z1, z0 are the position of the center of the exciting antenna and its half-width, a is the small input amplitude, a << e00. the exciting field is uniform along x, 0 < x < 2l. in all simulations the thickness of the film is 2l = 0.5 μm. similar results have been obtained for films of the thicknesses 2l ≤ 2 μm. at higher thicknesses the excited pulses are similar to the domains in the bulk gunn diodes. the lengths of the films are lz ≥ 30 μm.
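the exciting field of eq. (11) can be evaluated with a few lines of python. this is only a sketch under stated assumptions: the function name, the argument names and the unit system (ps, μm, kv/cm, rad/ps) are illustrative choices and do not come from the paper. the numerical values are the input-pulse parameters quoted in the text for fig. 4 a), b), together with ly = 100 μm and y0 = 20 μm.

```python
import math


def excitation_field(t, z, y, *, amp, t1, t0, z1, z0, ly, y0, carrier):
    """Exciting field of eq. (11): Gaussian envelope in t, z and y
    multiplied by sin(carrier * t).

    Assumed units (an illustrative choice): t in ps, z and y in um,
    amp in kV/cm, carrier in rad/ps.
    """
    envelope = math.exp(
        -((t - t1) / t0) ** 2
        - ((z - z1) / z0) ** 2
        - ((y - ly / 2.0) / y0) ** 2
    )
    return amp * envelope * math.sin(carrier * t)


# Input-pulse parameters quoted for fig. 4 a), b).
# Carrier: 1.5e11 s^-1 = 0.15 rad/ps.
PARAMS = dict(amp=3.0, t1=600.0, t0=300.0, z1=10.0, z0=0.5,
              ly=100.0, y0=20.0, carrier=0.15)

# Field at the antenna center (z = z1, y = ly / 2) at the envelope maximum t = t1.
center_value = excitation_field(600.0, 10.0, 50.0, **PARAMS)

# One antenna half-width away along z the envelope drops by a factor e.
offset_value = excitation_field(600.0, 10.5, 50.0, **PARAMS)
```

the envelope is narrow along z (z0 = 0.5 μm) and long in time (t0 = 300 ps), so the input is a localized, slowly modulated microwave signal rather than a single short pulse.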
in the case when the bias electric field e00 is essentially higher than the critical one and corresponds to the maximum of ndc, a strong initial amplification occurs at the linear stage. but then the amplified pulse deforms at the nonlinear stage of amplification and as a result several powerful oscillations are formed at the output [17]. therefore, to create stable nonlinear monopulses, the bias electric field should be chosen slightly above the threshold of ndc. it is in this case that short monopulses are formed at the essentially nonlinear stage at the output antenna z = z2 ≈ lz. also moderate doping levels should be applied: n0 = 3·10¹⁶ – 10¹⁷ cm⁻³ for n-gan and n0 = 10¹⁶ – 3·10¹⁶ cm⁻³ for n-inn films. below the typical results of numerical simulations are presented. they are the dependencies on time t of the variable part of the z-component of the electric field of scw in the center of the output antenna y = ly/2:
$$\tilde{E}_z(z = z_2,\ x = 2l,\ y = L_y/2,\ t) \equiv E_z - E_{00} - E_0.$$
three cases have been considered. the first one is for the uniform nitride films, the second one is for the films with the non-uniform conductivity, see eq. (10); the third case is for the films with both the non-uniform conductivity and the non-uniform doping as in eq. (6). in all cases the transverse width of the films is ly = 100 μm, the initial transverse width of the input field is y0 = 20 μm. figs. 4-6 show the results for the uniform films. in all figs. the left panels are general views, the right ones are the detailed views. in fig. 4 the parameters are as follows. the films are n-gan, the carrier input frequency is Ω = 1.5·10¹¹ s⁻¹. for the parts a), b) the bias electric field is e00 = 1.35·10⁵ v/cm, the constant electron concentration is n0 = nd = 5·10¹⁶ cm⁻³. the parameters of the input pulse are a = 3 kv/cm, t1 = 600 ps, t0 = 300 ps, z1 = 10 μm, z0 = 0.5 μm, see eq. 11.
the length of the film is lz = 55 μm, the output antenna is at z2 = 54 μm. for the parts c), d) they are e00 = 1.3·10⁵ v/cm, n0 = nd = 9·10¹⁶ cm⁻³; a = 3 kv/cm, t1 = 600 ps, t0 = 300 ps, z1 = 10 μm, z0 = 0.5 μm; lz = 40 μm, the output antenna is at z2 = 39 μm. the parts a), c) are general views, b), d) are detailed ones.
fig. 4 the shapes of strong electric monopulses at the output antenna in uniform n-gan film. the input carrier circular frequency is Ω = 1.5·10¹¹ s⁻¹. parts a), c) are general views, b), d) are detailed ones.
in fig. 5 there are n-gan films, the carrier input circular frequency is Ω = 2·10¹¹ s⁻¹. the parameters are the same as for fig. 4, parts a), b).
fig. 5 the shapes of strong electric monopulses at the output antenna in uniform n-gan film; Ω = 2·10¹¹ s⁻¹.
in fig. 6 there are n-inn films, the carrier input circular frequency is Ω = 1.5·10¹¹ s⁻¹. the parameters are e00 = 0.52·10⁵ v/cm, n0 = nd = 1.5·10¹⁶ cm⁻³; a = 3 kv/cm, t1 = 600 ps, t0 = 300 ps, z1 = 10 μm, z0 = 0.5 μm; lz = 60 μm, the output antenna is at z2 = 59 μm.
fig. 6 the shapes of strong electric monopulses at the output antenna in uniform n-inn film; Ω = 1.5·10¹¹ s⁻¹.
figs. 7, 8 show the results for the films with the non-uniform conductivity. the scale of the non-uniformity is x0 = 0.1 μm there. in fig. 7 there are n-gan films. the parameters are e00 = 1.4·10⁵ v/cm, n0 = nd = 5·10¹⁶ cm⁻³; a = 2 kv/cm, t1 = 600 ps, t0 = 300 ps, z1 = 10 μm, z0 = 0.5 μm; lz = 50 μm, the output antenna is at z2 = 49 μm. for the parts a), b) the carrier input circular frequency is Ω = 2·10¹¹ s⁻¹; for the parts c), d) it is Ω = 1.5·10¹¹ s⁻¹.
fig. 7 the shapes of strong electric monopulses at the output antenna in n-gan film with the non-uniform conductivity.
in fig. 8 there are n-inn films.
the carrier input circular frequency is Ω = 1.5·10¹¹ s⁻¹. the parameters are e00 = 0.53·10⁵ v/cm, n0 = nd = 2·10¹⁶ cm⁻³; a = 3 kv/cm, t1 = 600 ps, t0 = 300 ps, z1 = 10 μm, z0 = 0.5 μm; lz = 53 μm, the output antenna is at z2 = 52 μm.
fig. 8 the shapes of strong electric monopulses at the output antenna in n-inn film with non-uniform conductivity; Ω = 1.5·10¹¹ s⁻¹.
from figs. 4 – 8 it is seen that sequences of strong electric monopulses are formed at the output antenna. the maximum values of the output pulses exceed the bias electric field e00 several times. the durations of the output monopulses are 3 – 10 ps, so the total frequency range is 100 – 300 ghz. the monopulses are formed practically without any pedestal and differ from the domains of strong electric field in bulk semiconductors [20]; the domains possess values of electric fields essentially below the ndc threshold outside of the domain. the repetition rate of the output monopulses is determined by the input carrier circular frequency Ω. moreover, the peak values of the monopulses repeat the shapes of the envelopes of the input microwave pulses. in n-gan films the input carrier circular frequency should be Ω ≤ 2·10¹¹ s⁻¹, in n-inn films it should be Ω ≤ 1.5·10¹¹ s⁻¹. at higher carrier frequencies the behavior of the sequence of output pulses becomes unstable, because of the mutual influence of neighboring output monopulses. the peak values of the monopulses are higher in n-gan films, so n-gan films are preferable for excitation of sequences of strong monopulses, despite the lower values of ndc there. also the input carrier frequency range is wider in n-gan films, as mentioned above. the results of numerical simulations are tolerant to changes of the parameters of input pulses and of the lengths of the films lz.
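the orders of magnitude quoted above can be checked with simple arithmetic. the sketch below assumes the usual rule of thumb that a pulse of duration τ occupies a frequency band of roughly 1/τ; this rule and the helper names are assumptions made for illustration and are not taken from the paper.

```python
import math


def bandwidth_ghz(duration_ps):
    """Rough spectral width ~ 1/duration (rule of thumb), in GHz, for a duration in ps."""
    return 1.0e3 / duration_ps


def carrier_ghz(circular_freq_per_s):
    """Ordinary frequency f = Omega / (2 * pi), in GHz, for Omega in s^-1."""
    return circular_freq_per_s / (2.0 * math.pi) / 1.0e9


def repetition_period_ps(circular_freq_per_s):
    """Period 2 * pi / Omega of the carrier, in ps, for Omega in s^-1."""
    return 2.0 * math.pi / circular_freq_per_s * 1.0e12


band_long = bandwidth_ghz(10.0)          # 10 ps pulse
band_short = bandwidth_ghz(3.0)          # 3 ps pulse
f_carrier = carrier_ghz(2.0e11)          # upper carrier limit quoted for n-gan
period = repetition_period_ps(2.0e11)    # spacing of output monopulses at that carrier
```

with these numbers a 10 ps pulse corresponds to about 100 ghz and a 3 ps pulse to roughly 330 ghz, in line with the 100 – 300 ghz range stated in the text. at the upper n-gan carrier limit the period is about 31 ps, i.e. only a few times the pulse duration, which is consistent with the remark that neighboring monopulses start to influence each other at higher carrier frequencies.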
under the uniform doping the shapes of the output pulses do not depend on the transverse width of the input pulse y0 when y0 ≥ 20 μm. at smaller initial widths y0 < 20 μm the maximum values of the electric field of the monopulses decrease due to smaller increments of amplification at the linear stage of the scw dynamics. fig. 9 shows the results for the n-gan films with both the non-uniform conductivity and the non-uniform doping. the scales of the non-uniform conductivity and the non-uniform doping are x0 = 0.1 μm and xd = 0.2 μm there. the parameters are e00 = 1.3·10⁵ v/cm, n0 = nd = 10¹⁷ cm⁻³; a = 2 kv/cm, t1 = 600 ps, t0 = 300 ps, z1 = 10 μm, z0 = 0.5 μm; lz = 50 μm, the output antenna is at z2 = 49 μm.
fig. 9 the shapes of strong electric nonlinear pulses at the output antenna in n-gan film with both the non-uniform conductivity and the non-uniform doping; Ω = 1.5·10¹¹ s⁻¹.
under the non-uniform doping the sequences of the monopulses are excited rather due to the non-uniformity of the electric field near the input end of the film z = 0 than due to the input field at the exciting antenna. there is a principal difference between the films with the non-uniform doping and with the uniform one. in the uniformly doped films the monopulses are not electric domains, whereas under the non-uniform doping the nonlinear pulses possess a similarity to the domains of the strong field in the bulk semiconductors. namely, the values of the electric field outside the nonlinear pulses are essentially below the threshold ones and the durations of the nonlinear pulses are essentially longer, ≥ 20 ps.
moreover, in the uniformly doped films the repetition rate of the monopulses at the output antenna is determined by the input carrier circular frequency Ω, whereas under the non-uniform doping the repetition rate is determined only by the distance between z = 0 and the position of the output antenna z = z2. note that when nonlinear pulses are excited from a single input pulse in the films with non-uniform doping, the output pulse can be similar to the strong monopulse in uniform films [19]. such pulses are shorter and are practically without the pedestal. but in the films with non-uniform doping the sequences of the output pulses are similar to the domains. the z-component ez of the variable electric field dominates in the nonlinear monopulses. this component is uniform along the thickness of the film, i.e. along the ox axis. because the peak values of the electron concentration are high within the monopulses, the sequences of the strong monopulses can be used for the effective modulation of the electromagnetic radiation of the higher part of the thz range and of the optical range in waveguides based on the nitride films. in the earlier works such a modulation was considered within the linear regime of amplification of scw [21-24].
5. conclusions
sequences of short strong monopulses of space charge waves of durations 3 – 10 picoseconds can be excited in the nitride n-gan and n-inn films of the thicknesses ≤ 2 μm under the negative differential conductivity when the input signals are modulated microwave pulses of small amplitudes. the repetition rate of the output monopulses is determined by the carrier frequency of the input microwave pulses. each monopulse is without an internal carrier frequency and occupies the wide frequency range of about 100 – 300 ghz. the strong monopulses are formed both in the uniform films and in films with the non-uniform conductivity.
these monopulses differ from the domains of the strong electric fields in the bulk semiconductors. the monopulses are realized under the amplification in the essentially nonlinear regime. the bias electric fields should be chosen slightly higher than the thresholds of the negative differential conductivity. the doping levels should be moderate, ≤ 10¹⁷ cm⁻³. the typical lengths of the films are 30 – 60 μm, the widths are ≥ 100 μm. the n-gan films are preferable for the excitation of the nonlinear monopulses, because the peak values of the monopulses are higher than in n-inn films. also the carrier microwave circular frequencies at the input can be higher in the n-gan films, Ω ≤ 2·10¹¹ s⁻¹, or f = Ω/2π ≈ 30 ghz. the sequences of strong monopulses can be used for the modulation of the optical radiation in waveguides based on the nitride films. in the films with the non-uniform doping the sequences of nonlinear pulses are excited due to the electric field inhomogeneity near the input end of the film; the nonlinear pulses are similar to domains of the strong electric field there.
acknowledgement. the authors are grateful to sep-conacyt, mexico, for a partial support of our work.
references
[1] v. grimalsky, s. koshevaya, j. escobedo-a., and j. sanchez-s., "strong electric monopulses in nonuniformly doped nitride films under negative differential conductivity", in proceedings of the 2019 31st ieee international conference on microelectronics (miel), nis, serbia, 2019, pp. 83–86. [2] y-s. lee, principles of terahertz science and technology, springer, 2009, p. 340. [3] m. perenzoni and d. j. paul, physics and applications of terahertz radiation, springer, 2014, p. 255. [4] h.-j. song, t. nagatsuma, handbook of terahertz technologies. devices and applications, boca raton, crc press, 2015, p. 585. [5] g. carpintero, l.e. garcía muñoz, h.l. hartnagel, s. preu and a.v.
räisänen, semiconductor terahertz technology. devices and systems at room temperature operation, john wiley & sons, 2015, p. 386. [6] y. nakasha, "special section on terahertz waves coming to the real world" ieice trans. electron., vol. e98.c, no. 12, december 2015. [7] s. j. pearton, j. c. zolper, r. j. shul and f. ren, "gan: processing, defects, and devices", j. appl. phys., vol. 86, no 1, pp. 1-79, july 1999. [8] s. jain, m. willander, j. narayan, and r. van overstraeten, "iii-nitrides: growth, characterization, and properties", j. appl. phys., vol. 87, no. 3, pp. 965-1006, february 2000. [9] v. gruzhinskis, p. shiktorov, e. starikov, and j. h. zhao, "comparative study of 200–300 ghz microwave power generation in gan teds by the monte carlo technique", semicond. sci. technol., vol. 16, no 8, pp. 798-805, august 2001. [10] j. t. lü and j. c. cao, "terahertz generation and chaotic dynamics in gan ndr diode", semicond. sci. technol., vol. 19, no 4, pp. 451-456, april 2004. [11] v. i. timofeyev, e. v. semenovskaya and o.m. falieieva, "electrothermal analysis of gan power submicron field-effect heterotransistors", radioelectron. commun. syst., vol. 59, no 2, pp. 66–73, february 2016. [12] a. a. kokolov and l. i. babak, "methodology of built and verification of non-linear eehemt model for gan hemt transistor", radioelectron. commun. syst., vol. 58, no 10, pp. 435–443, october 2015. [13] p. siddiqua, w. a. hadi, a. k. salhotra, m. s. shur and s. k. o’leary, "electron transport and electron energy distributions within the wurtzite and zinc-blende phases of indium nitride: response to the application of a constant and uniform electric field", j. appl. phys., vol. 117, no 12, article id 125705, june 2015. [14] w. a. hadi, p. k. guram, m. s. shur and s. k. o’leary, "steady-state and transient electron transport within wurtzite and zinc-blende indium nitride", j. appl. phys., vol. 113, no 11, article id 113709, june 2013. [15] e. jatirian foltides, v. grimalsky, s. 
koshevaya and j. escobedo-alatorre, "amplification of space charge waves in n-inn films of thz range", in proceedings of the ieee latin america microwave conference lamc-2016, puerto vallarta, mexico, 2016, pp. 1-3. [16] v. grimalsky, s. koshevaya, m. tecpoyotl-t. and f. diaz-a.,"influence of nonlocality on amplification of space charge waves in n-gan films", j. electromagn. analysis & applic. (jemaa), vol. 3, no 2, pp. 33-38, february 2011. [17] v. grimalsky, s. koshevaya, i. moroz, and a. garcia-b., "influence of nonlocality on amplification of space charge waves in n-gan films", in proceedings of the international symposium on physics and engineering of microwaves, millimeter and submillimeter waves, kharkov, ukraine, 2010, pp. 1-4. [18] v. grimalsky, s. koshevaya, j. sanchez-s. and y. rapoport, "excitation of short monopulses in nitride films under negative differential conductivity", in proceedings of the international ieee microwaves, radar, and remote sensing symposium, kyiv, ukraine, 2017, pp. 151-154. [19] s. v. koshevaya, v. v. grimalsky, j. escobedo-alatorre and m. tecpoyotl-torres, "excitation of short electric monopulse in nitride films with negative differential conductivity", radioelectron. commun. syst., vol. 62, no. 6, pp. 262–270, june 2019. [20] s. m. sze and kwok n. ng, physics of semiconductor devices, hobokem, wiley-interscience, 2007. p. 815 [21] g. e. chaika, v. n. malnev and m. i. panfilov, "interaction of light with space charge waves", in proceedings of the spie. vol. 2795, 1996, pp. 279-282. [22] d. g. sannikov and d. i. semetsov, "waveguide interaction of light with amplifying scw", physics of the solid state (fizika tverdogo tela), vol. 49, no. 3, pp. 488-492, march 2007. [23] s. yu, dadoenkova, i. o. zolotovsky, i. s. panyaev and d. g. sannikov, "modeling the generation of optical modes in a semiconductor waveguide with distributed feedback formed by a space charge wave", comput. opt., vol. 44, no 2, pp. 183-188, february 2020. 
[24] v. grimalsky, s. koshevaya, m. tecpoyotl-t. and j. escobedo-a., "nonlinear interaction of terahertz and optical waves in nitride films", terahertz sci. technol., vol. 6, no. 3, pp. 165-176, june 2013. [25] v. v. grimalsky, s. v. koshevaya, yu. g. rapoport, "superheterodyne amplification of electromagnetic waves of optical and terahertz bands in gallium nitride films", radioelectron. commun. syst., vol. 54, no. 8, pp. 401-410, august 2011. [26] k. tomizawa, numerical simulation of submicron semiconductor devices, boston: artech house publ., 1993, p. 356. [27] a. garcia-b., v. grimalsky, e. gutierrez-d. and s. koshevaya, "dispersion relation for two-valley quasi-hydrodynamic models in scws propagation in n-gaas thin films", in proceedings of the 25th international conference on microelectronics, belgrade, serbia, 2006, pp. 507-510. [28] m. levinshtein, s. rumyantsev and m. shur, properties of advanced semiconductor materials: gan, aln, inn, wiley, 2001, p. 216. [29] r. kircher and w. bergner, three-dimensional simulation of semiconductor devices, basel, birkhauser verlag, 1991, p. 124. [30] w. h. press, s. a. teukolsky, w. t. vetterling and b. p. flannery, numerical recipes in fortran, cambridge, cambridge univ. press, 1997, p. 1486. [31] a. a. samarskii, the theory of difference schemes, marcel dekker inc., 2001, p. 761. [32] g. i. marchuk, splitting and alternating direction methods. in handbook of numerical analysis, vol. i, finite difference methods, solution of equations in rⁿ (part 1), amsterdam, elsevier, 1990, pp. 203–462.
instruction facta universitatis series: electronics and energetics vol. 30, no 4, december 2017, pp. 599–609 doi: 10.2298/fuee1704599s performance of macro diversity wireless communication system operating in weibull multipath fading environment suad n. suljović 1 , dejan milić 1 , zorica nikolić 1 , stefan r.
panić 2 , mihajlo stefanović 1 , đoko banđur 3 1 university of niš, faculty of electronic engineering, niš, serbia 2 faculty of natural science and mathematics, university of priština, kosovska mitrovica, serbia 3 faculty of technical sciences, university of priština, kosovska mitrovica, serbia abstract. in this paper, we consider wireless mobile radio communication system with macro diversity reception. signal is subject to weibull small scale fading and gamma large scale fading resulting in system performance degradation. receiver uses macro diversity selection combining (sc) technique in order to reduce the impact of long term fading effects, and two micro diversity sc branches are used to mitigate weibull short term fading effects on system performance. probability density function (pdf), and cumulative distribution function (cdf), as well as level crossing rate (lcr) and average fade duration (afd) of the sc receiver output signal envelope are evaluated. the obtained expressions converge rapidly for all considered values of weibull fading parameter and gamma shadowing severity parameter. mathematical results are studied in order to analyze the influence of weibull fading parameter and gamma shadowing severity parameter on statistical properties of the sc receiver output signal. key words: weibull short term fading, probability density function, cumulative distribution function, level crossing rate, average fade duration. 1. introduction long term fading and short term fading degrade outage probability and limit channel capacity of wireless communication systems in general, and different techniques can be used to lessen the impact of the fading effects. one of the strategies for mitigating both effects: long term fading (shadowing), as well as short term fading, is the use of macro diversity combining reception. 
in general, a macro diversity receiver features two or more micro diversity combiners, and it then combines their outputs in order to avoid the possibility of deep fades. such a system simultaneously reduces the influence of long term fading effects and short term fading effects on system performance.
received december 3, 2016; received in revised form may 5, 2017 corresponding author: suad n. suljović, faculty of electronic engineering, university of niš, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: suadsara@gmail.com)
there are a number of statistical distributions that can be used to describe small scale signal envelope variation in multipath fading channels, depending on propagation environment and communication scenario. rayleigh and nakagami-m distributions can be used to describe signal envelope in small scale non line-of-sight multipath fading environments, while rician distribution can model signal envelope in line-of-sight multipath fading environments. signal envelope variation in nonlinear multipath fading environments can also be well described by using the weibull model [1]. samples of a weibull random process can easily be obtained by taking the samples of a rayleigh random process and raising them to a power. weibull distribution therefore has a parameter related to the nonlinearity of the environment. when this weibull parameter tends to infinity, the weibull multipath fading channel becomes a channel without fading effects. when the weibull parameter goes to two, the weibull channel reduces to the rayleigh channel, and when the weibull parameter goes to one, the weibull channel becomes an exponential fading channel. first order performance measures of a communication system include: outage probability, bit error probability and channel capacity. these performance measures can be calculated by using the probability density function of the receiver output signal.
second order performance measures of a wireless communication system usually encompass average level crossing rate and average fade duration. these performance measures can be evaluated by using the joint probability density function of the receiver output signal and the first derivative of the output signal. log-normal distribution and gamma distribution can be used to describe variations of signal average power in shadowed channels. when the log-normal model is used to describe long term fading, the expressions for the probability density function and cumulative distribution function of the receiver output signal cannot be evaluated in closed form. application of gamma distribution enables tractable calculation of system performance of the wireless communication system in shadowing environment [2]. there are a number of papers in open technical literature considering outage probability, bit error probability and average level crossing rate of macro diversity systems with two or more micro diversity receivers operating over shadowed multipath fading channels. in [3], [4], [5] a macro diversity system with two micro diversity branches operating over gamma shadowed nakagami-m multipath fading channels is considered. the communication channel is described by the use of a compound model [6]. system performance of a macro diversity system in the presence of log-normal shadowing and rayleigh multipath fading is presented in [7]. average level crossing rate and average fade duration of a macro diversity system operating over gamma shadowed multipath fading channel are evaluated in [8], where macro diversity reception in a cellular system is considered and its outage probability is calculated. in this paper, we analyze a macro diversity selection combining receiver, with two micro diversity sc branches, operating over gamma shadowed weibull multipath fading channel.
macro diversity sc receiver serves to reduce the considered gamma shadowing effects, and micro diversity sc branches mitigate weibull multipath fading effects on system performance. analytical expressions can be obtained for calculation of important performance parameters such as outage probability and bit error probability. to the best of the authors' knowledge, system performance of a macro diversity system in weibull fading channel has not been reported in technical literature.
2. weibull random variable
probability density function of weibull random variable is [9]:
$$p_x(x) = \frac{\alpha}{\Omega}\, x^{\alpha-1} e^{-x^{\alpha}/\Omega}, \quad x \ge 0 \qquad (1)$$
where α is weibull fading parameter and Ω is average power of x. cumulative distribution function of weibull random variable is [10,11]:
$$F_x(x) = \int_0^x p_x(t)\,dt = 1 - e^{-x^{\alpha}/\Omega} \qquad (2)$$
weibull random variable x, and its first derivative $\dot{x}$, are:
$$x = y^{2/\alpha}, \quad y = x^{\alpha/2}, \quad \dot{x} = \frac{2}{\alpha}\, y^{2/\alpha-1}\dot{y}, \quad \dot{y} = \frac{\alpha}{2}\, x^{\alpha/2-1}\dot{x}, \qquad (3)$$
where y is rayleigh random variable. joint probability density function (jpdf) of x and $\dot{x}$ is [12]:
$$p_{x\dot{x}}(x,\dot{x}) = |J|\; p_{y\dot{y}}\!\left(x^{\alpha/2},\ \frac{\alpha}{2}\, x^{\alpha/2-1}\dot{x}\right) \qquad (4)$$
where the jacobian of the coordinate transform is:
$$J = \begin{vmatrix} \dfrac{\partial y}{\partial x} & \dfrac{\partial y}{\partial \dot{x}} \\[2mm] \dfrac{\partial \dot{y}}{\partial x} & \dfrac{\partial \dot{y}}{\partial \dot{x}} \end{vmatrix} = \begin{vmatrix} \dfrac{\alpha}{2}\, x^{\alpha/2-1} & 0 \\[2mm] \dfrac{\partial \dot{y}}{\partial x} & \dfrac{\alpha}{2}\, x^{\alpha/2-1} \end{vmatrix} = \frac{\alpha^2}{4}\, x^{\alpha-2} \qquad (5)$$
joint probability density function of rayleigh random variable y and its first derivative $\dot{y}$ is [12]:
$$p_{y\dot{y}}(y,\dot{y}) = p_y(y)\, p_{\dot{y}}(\dot{y}) = \frac{2y}{\Omega}\, e^{-y^2/\Omega} \cdot \frac{1}{\sqrt{2\pi}\,\beta}\, e^{-\dot{y}^2/(2\beta^2)}, \quad \beta^2 = \pi^2 f_m^2\, \Omega, \qquad (6)$$
where fm is maximal doppler frequency, and $\dot{y}$ is zero-mean gaussian random variable [13, 14], with variance β².
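the link between rayleigh and weibull envelopes in eq. (3) can be checked numerically. the following is a minimal python sketch, assuming the rayleigh variable y has mean-square value Ω and the weibull cdf has the form 1 − exp(−x^α/Ω) of eq. (2); the function names, the seed and the values α = 3, Ω = 1.5 are arbitrary illustrative choices, not taken from the paper.

```python
import math
import random


def rayleigh_sample(rng, omega):
    """Rayleigh envelope y with E[y^2] = omega, drawn by the inverse-CDF method."""
    u = rng.random()  # uniform on [0, 1)
    return math.sqrt(-omega * math.log(1.0 - u))


def weibull_sample(rng, alpha, omega):
    """Weibull envelope obtained as x = y^(2/alpha), following eq. (3)."""
    return rayleigh_sample(rng, omega) ** (2.0 / alpha)


def weibull_cdf(x, alpha, omega):
    """Closed-form CDF of eq. (2): 1 - exp(-x^alpha / omega)."""
    return 1.0 - math.exp(-(x ** alpha) / omega)


def empirical_cdf(samples, x):
    """Fraction of samples not exceeding x."""
    return sum(1 for s in samples if s <= x) / len(samples)


# Illustrative parameters (assumed, not from the paper).
ALPHA, OMEGA = 3.0, 1.5
rng = random.Random(12345)
samples = [weibull_sample(rng, ALPHA, OMEGA) for _ in range(100_000)]

# Sample mean of x^alpha, which should be close to omega.
mean_power = sum(s ** ALPHA for s in samples) / len(samples)
```

comparing `empirical_cdf(samples, x)` with `weibull_cdf(x, ALPHA, OMEGA)` at a few levels x shows agreement within the sampling error, and `mean_power` stays close to Ω, which is the sense in which Ω acts as the power parameter of the distribution.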
after substituting (5) and (6) into (4), the expression for jpdf of weibull random variable and its first derivative becomes:
$$p_{x\dot{x}}(x,\dot{x}) = |J|\; p_{y\dot{y}}\!\left(x^{\alpha/2},\ \frac{\alpha}{2}\, x^{\alpha/2-1}\dot{x}\right) = \frac{\alpha^2\, x^{(3\alpha-4)/2}}{2\sqrt{2\pi}\,\beta\,\Omega}\; e^{-x^{\alpha}/\Omega}\; e^{-\alpha^2 x^{\alpha-2}\dot{x}^2/(8\beta^2)} \qquad (7)$$
the average level crossing rate of weibull random processes is [15]:
$$N_x = \int_0^{\infty} \dot{x}\; p_{x\dot{x}}(x,\dot{x})\, d\dot{x} = \frac{\sqrt{2\pi}\, f_m\, x^{\alpha/2}}{\Omega^{1/2}}\; e^{-x^{\alpha}/\Omega} \qquad (8)$$
the selection combining diversity receiver with two inputs operating over identical, independent weibull multipath fading channels is considered next. signal envelopes at the inputs of the sc receiver are denoted with x1 and x2, and the sc receiver output signal envelope is denoted with x. pdf of sc receiver output signal envelope is [16]:
$$p_x(x) = p_{x_1}(x) F_{x_2}(x) + p_{x_2}(x) F_{x_1}(x) = 2\, p_{x_1}(x) F_{x_1}(x) = \frac{2\alpha}{\Omega}\, x^{\alpha-1} e^{-x^{\alpha}/\Omega} \left(1 - e^{-x^{\alpha}/\Omega}\right) \qquad (9)$$
cumulative distribution function of sc receiver output signal envelope is [17]:
$$F_x(x) = F_{x_1}(x)\, F_{x_2}(x) = \left(1 - e^{-x^{\alpha}/\Omega}\right)^2 \qquad (10)$$
the jpdf of sc receiver output signal and its first derivative is [17]:
$$p_{x\dot{x}}(x,\dot{x}) = 2\, F_{x_1}(x)\, p_{x_1\dot{x}_1}(x,\dot{x}) = \frac{\alpha^2\, x^{(3\alpha-4)/2}}{\sqrt{2\pi}\,\beta\,\Omega} \left(1 - e^{-x^{\alpha}/\Omega}\right) e^{-x^{\alpha}/\Omega}\; e^{-\alpha^2 x^{\alpha-2}\dot{x}^2/(8\beta^2)} \qquad (11)$$
where the sc receiver output signal envelope is denoted with x. using the previous expression (11), level crossing rate of the process x is [17]:
$$N_x = \int_0^{\infty} \dot{x}\; p_{x\dot{x}}(x,\dot{x})\, d\dot{x} = 2\, F_{x_1}(x) \int_0^{\infty} \dot{x}\; p_{x_1\dot{x}_1}(x,\dot{x})\, d\dot{x} = 2\, F_{x_1}(x)\, N_{x_1} = \frac{2\sqrt{2\pi}\, f_m\, x^{\alpha/2}}{\Omega^{1/2}}\; e^{-x^{\alpha}/\Omega} \left(1 - e^{-x^{\alpha}/\Omega}\right) \qquad (12)$$
this expression can be used for calculation of average level crossing rate of wireless communication system with sc receiver operating over weibull multipath fading channel.
3. macro diversity system with two micro diversity branches
macro diversity system with two micro diversity sc branches is considered next.
received signal experiences gamma correlated long term fading and weibull short term fading, resulting in variation of both the signal envelope and the average power. the model of the system considered in this paper is shown in figure 1. signal envelopes at the inputs of the first micro diversity sc combiner are denoted with x11 and x12, and at the inputs of the second micro diversity sc combiner with x21 and x22. signal envelopes at the outputs of the micro diversity sc combiners are denoted with x1 and x2, and ultimately, at the output of the macro diversity sc combiner with x.

fig. 1 model of a macro diversity system, featuring two front-end micro diversity combiners

average signal powers at the inputs of the micro diversity sc combiners are denoted with Ω1 and Ω2, and they follow the correlated gamma distribution [18]:

$$p_{\Omega_1\Omega_2}(\Omega_1,\Omega_2) = \frac{(\Omega_1\Omega_2)^{(c-1)/2}}{\Gamma(c)(1-\rho^{2})\rho^{c-1}\Omega_0^{c+1}}\,e^{-\frac{\Omega_1+\Omega_2}{\Omega_0(1-\rho^{2})}}\,I_{c-1}\!\left(\frac{2\rho\sqrt{\Omega_1\Omega_2}}{\Omega_0(1-\rho^{2})}\right) = \frac{1}{\Gamma(c)(1-\rho^{2})\rho^{c-1}\Omega_0^{c+1}}\sum_{i_1=0}^{\infty}\left(\frac{\rho}{\Omega_0(1-\rho^{2})}\right)^{2i_1+c-1}\frac{(\Omega_1\Omega_2)^{i_1+c-1}}{i_1!\,\Gamma(i_1+c)}\,e^{-\frac{\Omega_1+\Omega_2}{\Omega_0(1-\rho^{2})}} \tag{13}$$

where Γ(a) denotes the well-known gamma function [19, eq. (8.310)], ρ is the correlation coefficient, Ω0 is the scaling factor proportional to the mean value of Ω1 and Ω2, c ≥ 1/2 is the gamma shadowing parameter, and I_n(·) is the n-th order modified bessel function of the first kind [19, eq. (8.406)]. the macro diversity sc receiver selects the branch with the highest signal power. therefore, using the expression (9), the probability density function of x can be written as [16]:

$$p_x(x) = \int_0^{\infty}d\Omega_1\int_0^{\Omega_1}d\Omega_2\,p_{x_1}(x/\Omega_1)\,p_{\Omega_1\Omega_2}(\Omega_1,\Omega_2) + \int_0^{\infty}d\Omega_2\int_0^{\Omega_2}d\Omega_1\,p_{x_2}(x/\Omega_2)\,p_{\Omega_1\Omega_2}(\Omega_1,\Omega_2) = 2\int_0^{\infty}d\Omega_1\int_0^{\Omega_1}d\Omega_2\,p_{x_1}(x/\Omega_1)\,p_{\Omega_1\Omega_2}(\Omega_1,\Omega_2)$$

$$= \frac{4\alpha x^{\alpha-1}}{\Gamma(c)(1-\rho^{2})\rho^{c-1}\Omega_0^{c+1}}\sum_{i_1=0}^{\infty}\left(\frac{\rho}{\Omega_0(1-\rho^{2})}\right)^{2i_1+c-1}\frac{\left(\Omega_0(1-\rho^{2})\right)^{i_1+c}}{i_1!\,\Gamma(i_1+c)}\int_0^{\infty}d\Omega_1\,\Omega_1^{i_1+c-2}\,e^{-\frac{\Omega_1}{\Omega_0(1-\rho^{2})}}\left(e^{-x^{\alpha}/\Omega_1}-e^{-2x^{\alpha}/\Omega_1}\right)\gamma\!\left(i_1+c,\frac{\Omega_1}{\Omega_0(1-\rho^{2})}\right)$$

$$= \frac{8\alpha x^{\alpha-1}}{\Gamma(c)(1-\rho^{2})\rho^{c-1}\Omega_0^{c+1}}\sum_{i_1=0}^{\infty}\left(\frac{\rho}{\Omega_0(1-\rho^{2})}\right)^{2i_1+c-1}\frac{1}{i_1!\,\Gamma(i_1+c)\,(i_1+c)}\sum_{i_2=0}^{\infty}\frac{1}{(i_1+c+1)_{i_2}\left(\Omega_0(1-\rho^{2})\right)^{i_2}}\left[\left(\frac{\Omega_0(1-\rho^{2})\,x^{\alpha}}{2}\right)^{i_1+\frac{i_2}{2}+c-\frac{1}{2}}K_{2i_1+i_2+2c-1}\!\left(2\sqrt{\frac{2x^{\alpha}}{\Omega_0(1-\rho^{2})}}\right) - \left(\Omega_0(1-\rho^{2})\,x^{\alpha}\right)^{i_1+\frac{i_2}{2}+c-\frac{1}{2}}K_{2i_1+i_2+2c-1}\!\left(4\sqrt{\frac{x^{\alpha}}{\Omega_0(1-\rho^{2})}}\right)\right] \tag{14}$$

where γ(a, x) is the lower incomplete gamma function, (a)_n is the pochhammer symbol [19] and K_ν(·) is the modified bessel function of the second kind of order ν [19, eq. (8.407)]. using the expression (12), the level crossing rate of the macro diversity sc receiver output signal envelope x can be written in the form [20]:

$$N_x = \int_0^{\infty}d\Omega_1\int_0^{\Omega_1}d\Omega_2\,N_{x_1/\Omega_1}\,p_{\Omega_1\Omega_2}(\Omega_1,\Omega_2) + \int_0^{\infty}d\Omega_2\int_0^{\Omega_2}d\Omega_1\,N_{x_2/\Omega_2}\,p_{\Omega_1\Omega_2}(\Omega_1,\Omega_2) = 2\int_0^{\infty}d\Omega_1\int_0^{\Omega_1}d\Omega_2\,N_{x_1/\Omega_1}\,p_{\Omega_1\Omega_2}(\Omega_1,\Omega_2)$$

$$= \frac{8\sqrt{2\pi}\,f_m\,x^{\alpha/2}}{\Gamma(c)(1-\rho^{2})\rho^{c-1}\Omega_0^{c+1}}\sum_{i_1=0}^{\infty}\left(\frac{\rho}{\Omega_0(1-\rho^{2})}\right)^{2i_1+c-1}\frac{1}{i_1!\,\Gamma(i_1+c)\,(i_1+c)}\sum_{i_2=0}^{\infty}\frac{1}{(i_1+c+1)_{i_2}\left(\Omega_0(1-\rho^{2})\right)^{i_2}}\left[\left(\frac{\Omega_0(1-\rho^{2})\,x^{\alpha}}{2}\right)^{i_1+\frac{i_2}{2}+c-\frac{1}{4}}K_{2i_1+i_2+2c-\frac{1}{2}}\!\left(2\sqrt{\frac{2x^{\alpha}}{\Omega_0(1-\rho^{2})}}\right) - \left(\Omega_0(1-\rho^{2})\,x^{\alpha}\right)^{i_1+\frac{i_2}{2}+c-\frac{1}{4}}K_{2i_1+i_2+2c-\frac{1}{2}}\!\left(4\sqrt{\frac{x^{\alpha}}{\Omega_0(1-\rho^{2})}}\right)\right] \tag{15}$$

using the expression (10), the cumulative distribution function of the macro diversity sc receiver output signal can be written in the form [21]:

$$F_x(x) = \int_0^{\infty}d\Omega_1\int_0^{\Omega_1}d\Omega_2\,F_{x_1}(x/\Omega_1)\,p_{\Omega_1\Omega_2}(\Omega_1,\Omega_2) + \int_0^{\infty}d\Omega_2\int_0^{\Omega_2}d\Omega_1\,F_{x_2}(x/\Omega_2)\,p_{\Omega_1\Omega_2}(\Omega_1,\Omega_2) = 2\int_0^{\infty}d\Omega_1\int_0^{\Omega_1}d\Omega_2\,F_{x_1}(x/\Omega_1)\,p_{\Omega_1\Omega_2}(\Omega_1,\Omega_2)$$

$$= \frac{2}{\Gamma(c)(1-\rho^{2})\rho^{c-1}\Omega_0^{c+1}}\sum_{i_1=0}^{\infty}\left(\frac{\rho}{\Omega_0(1-\rho^{2})}\right)^{2i_1+c-1}\frac{1}{i_1!\,\Gamma(i_1+c)\,(i_1+c)}\sum_{i_2=0}^{\infty}\frac{1}{(i_1+c+1)_{i_2}\left(\Omega_0(1-\rho^{2})\right)^{i_2}}\left[\Gamma(2i_1+i_2+2c)\left(\frac{\Omega_0(1-\rho^{2})}{2}\right)^{2i_1+i_2+2c} - 4\left(\frac{\Omega_0(1-\rho^{2})\,x^{\alpha}}{2}\right)^{i_1+\frac{i_2}{2}+c}K_{2i_1+i_2+2c}\!\left(2\sqrt{\frac{2x^{\alpha}}{\Omega_0(1-\rho^{2})}}\right) + 2\left(\Omega_0(1-\rho^{2})\,x^{\alpha}\right)^{i_1+\frac{i_2}{2}+c}K_{2i_1+i_2+2c}\!\left(4\sqrt{\frac{x^{\alpha}}{\Omega_0(1-\rho^{2})}}\right)\right] \tag{16}$$

using expressions (16) and (15), we can easily obtain the afd. the afd is defined as the average time over which the signal envelope remains below the specified level after crossing that level in a downward direction, and is determined as [12,15]:

$$T_x(x) = \frac{F_x(x)}{N_x(x)} \tag{17}$$

where F_x(x) and N_x(x) are the series given by (16) and (15), respectively.

4. numerical results

numerically obtained results are presented graphically in order to examine the influence of shadowing and fading severity on the concerned quantities. probability density function of macro diversity sc receiver output signal is given in fig. 2.
it is evident that the probability density function shifts to the right due to the increase of α, while the change of correlation coefficient ρ causes only slight changes of general pdf behavior. level crossing rate values normalized by maximal doppler shift frequency fm, versus sc receiver output signal, are presented in fig. 3, for several values of weibull fading parameter α, gamma shadowing severity parameter c and correlation coefficient ρ. in fig. 3, the abscissa represents an arbitrary crossing level, relative to the scaling factor Ω0.

fig. 2 pdf of macro diversity sc receiver output signal (p_x(x) versus x; curves for α = 1, 1.5, 2, 2.5, 3 with Ω0 = 1, ρ = 0.2, c = 1, and for ρ = 0.4, 0.6, 0.8 with α = 1, Ω0 = 1, c = 1)

fig. 3 lcr for different fading severity and correlation parameter (N_x/fm versus x [db]; curves for α = 1, 1.5, 2, 2.5, 3 with Ω0 = 1, ρ = 0.2, c = 1, and for ρ = 0.4, 0.6, 0.8 with α = 1, Ω0 = 1, c = 1)

fig. 4 lcr for different fading and shadowing severity (N_x/fm versus x [db]; curves for α = 1, 1.5, 2, 2.5, 3 with Ω0 = 1, ρ = 0.2, c = 1, and for c = 1.5, 2, 2.5, 3 with α = 1, Ω0 = 1, ρ = 0.2)

average level crossing rate increases as the crossing level increases towards the mean signal level. close to the mean signal level, lcr achieves its maximum and then decreases again with increasing crossing level.
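the rise and fall of the lcr around the mean signal level can be illustrated on the simpler micro diversity expression (12), which depends on the crossing level only through u = x^α/Ω. the sketch below is an illustration under that simplification (it does not evaluate the macro diversity series (15)); the grid limits and the function names are assumptions chosen to mirror the db axis of the figures.

```python
import math


def sc_lcr_normalized(x, alpha, omega=1.0):
    """normalized lcr N_x / fm of a dual-branch weibull sc receiver, expression (12)."""
    u = x ** alpha / omega
    return 2.0 * math.sqrt(2.0 * math.pi * u) * math.exp(-u) * (1.0 - math.exp(-u))


def lcr_peak_db(alpha, omega=1.0, lo_db=-10.0, hi_db=20.0, steps=3001):
    """scan crossing levels on a db grid and return (level_db, peak_value) at the lcr maximum."""
    best_db, best_val = lo_db, -1.0
    for k in range(steps):
        level_db = lo_db + (hi_db - lo_db) * k / (steps - 1)
        x = 10.0 ** (level_db / 20.0)  # envelope level from its value in db
        val = sc_lcr_normalized(x, alpha, omega)
        if val > best_val:
            best_db, best_val = level_db, val
    return best_db, best_val


if __name__ == "__main__":
    for alpha in (1.0, 1.5, 2.0, 2.5, 3.0):
        print(alpha, lcr_peak_db(alpha))
```

because (12) is a function of u alone, the peak height is the same for every α, while the peak position in db scales as 1/α, so larger α compresses the curve around the peak.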
sharpness of the peak near the maximum is closely related to the weibull fading severity parameter. while higher values of the weibull parameter α correspond to less severe fading conditions, increasing the correlation parameter slightly worsens the effectiveness of diversity reception. it is evident from fig. 3 that, for severe fading conditions, higher correlation increases the probability that the signal passes lower threshold levels. the general influence of correlation is the same for lower fading severity, but this is not shown in the figures. when the correlation coefficient tends to one, the same signal is present simultaneously on both antenna ports and the system will not be able to achieve any diversity gain. fig. 4 shows the lcr when the shadowing severity parameter increases. this increases the mean signal level, identified by the peak lcr, and it is the consequence of normalization by the scaling factor Ω0, which we chose previously. going back to (13), we see that after averaging over Ω1, the mean value of Ω2 is cΩ0, and vice versa. this higher mean value is clearly seen as the lcr curves shift to the right in fig. 4. by increasing parameter c, shadowing severity decreases, which is analogous to the behavior due to the weibull parameter α. cumulative distribution function of macro diversity sc receiver output signal for different system parameters is presented in fig. 5. from the figure, we can conclude that changes of the parameter α have a significant influence on the outage probability. as the parameter α increases, the outage probability becomes lower, and the system is more stable at lower threshold levels. the cumulative distribution clearly shows that the probability of the signal staying below the threshold level is lower. an increase of parameter ρ also affects the stability of the system. if ρ rises, the outage probability is greater and the system operation becomes less stable.
fig. 5 cdf for different system parameters (F_x(x) versus x [db]; curves for α = 1, 1.5, 2, 2.5, 3 with Ω0 = 1, ρ = 0.2, c = 1, and for ρ = 0.4, 0.6, 0.8 with α = 1, Ω0 = 1, c = 1)

table 1 shows the convergence of the expression (16) as a function of the variable x. the table lists the number of terms that need to be included in (16) for the resulting expression to be accurate to 6 decimal positions, for the given parameter values. it is evident that the expression converges rapidly for the given parameters. we can conclude from table 1 that, as the coefficient α increases, the number of terms that have to be summed becomes slightly lower, while for greater values of the correlation coefficient ρ the required number of terms increases.

table 1 number of terms that should be added in expression (16) in order to reach 6 accurate decimal positions, when parameters α and ρ change.

parameters                           x = -10 db   x = 0 db   x = 10 db
α = 1,   Ω0 = 1, ρ = 0.2, c = 1      8            13         19
α = 1.5, Ω0 = 1, ρ = 0.2, c = 1      6            13         19
α = 2,   Ω0 = 1, ρ = 0.2, c = 1      5            13         19
α = 2.5, Ω0 = 1, ρ = 0.2, c = 1      5            13         19
α = 3,   Ω0 = 1, ρ = 0.2, c = 1      5            13         19
α = 1,   Ω0 = 1, ρ = 0.4, c = 1      9            15         19
α = 1,   Ω0 = 1, ρ = 0.6, c = 1      9            15         21
α = 1,   Ω0 = 1, ρ = 0.8, c = 1      13           20         29

fig. 6 afd for different system parameters (normalized afd versus x [db]; curves for α = 1, 1.5, 2, 2.5, 3 with Ω0 = 1, ρ = 0.2, c = 1, and for ρ = 0.4, 0.6, 0.8 with α = 1, Ω0 = 1, c = 1)

fig. 6 presents normalized values of the average fade duration for various system parameters. when the crossing threshold level x is below the average signal level, afd stays low, and this is the main mode in which the system operates normally.
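the qualitative behaviour of the afd can be reproduced with the micro diversity expressions alone, since the afd is the ratio of the cdf to the lcr. the sketch below is a simplified illustration that uses (10) and (12) for a single dual-branch weibull sc receiver instead of the macro diversity series (16) and (15); the function name and parameter values are assumptions.

```python
import math


def sc_afd_normalized(x, alpha, omega=1.0):
    """normalized afd (T_x * fm) of a dual-branch weibull sc receiver.

    computed as cdf / normalized lcr, with the cdf from (10) and the lcr from (12).
    valid for x > 0 (the lcr vanishes at x = 0).
    """
    u = x ** alpha / omega
    cdf = (1.0 - math.exp(-u)) ** 2
    lcr = 2.0 * math.sqrt(2.0 * math.pi * u) * math.exp(-u) * (1.0 - math.exp(-u))
    return cdf / lcr


if __name__ == "__main__":
    for level_db in (-10, -5, 0, 5, 10):
        x = 10.0 ** (level_db / 20.0)
        print(level_db, [sc_afd_normalized(x, a) for a in (1.0, 2.0, 3.0)])
```

for thresholds below the mean signal level the ratio stays small and decreases further as α grows, which matches the trend described for fig. 6.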
better performance is expected in cases where the value of the weibull parameter α is higher and the correlation coefficient ρ is lower, resulting in lower afd.

5. conclusion

a macro diversity receiver with a macro diversity sc combiner and two micro diversity sc combiners operating over a gamma shadowed multipath fading environment is considered in this paper. the received signal experiences the combined effects of gamma long term fading and weibull short term fading, resulting in system performance degradation. when the shadowing severity parameter tends to infinity, the composite channel approaches a simple weibull multipath channel, and when the weibull fading parameter tends to infinity, the channel tends to a gamma shadowing channel. when the weibull fading parameter equals two, the composite fading channel reduces to a gamma shadowed rayleigh multipath channel. closed form expressions for the probability density function, cumulative distribution function and average level crossing rate of the macro diversity sc receiver output signal envelope are calculated. for the special case when the weibull parameter is equal to two, we can easily evaluate the pdf, cdf and average level crossing rate for the resulting rayleigh signal envelope. the infinite series expressions converge for any values of the gamma shadowing severity parameter, weibull fading parameter and shadowing correlation coefficient. the number of terms that need to be summed in order to achieve the desired accuracy depends on the gamma severity parameter, weibull fading parameter and correlation coefficient. the number of terms increases as the gamma severity parameter and weibull parameter decrease, and as the correlation coefficient increases. level crossing rate and average fade duration are presented graphically to show the influence of the gamma severity parameter, weibull fading parameter and correlation coefficient on the sc receiver output signal.
as expected, system performance is better when the fading and shadowing severity is lower, and correlation between the diversity branches is relatively low. when the correlation of shadowing effects on the two macro branches is substantial, macro diversity system gains are minimal, and the receiver performance reduces to the performance of a micro diversity receiver.

acknowledgement: the paper is supported in part by the projects iii44006 and tr32051 funded by ministry of education, science and technological development of republic of serbia.

references

[1] n.c. sagias, g.k. karagiannidis, "gaussian class multivariate weibull distributions: theory and applications in fading channels", ieee transactions on information theory, vol. 51, no. 10, 2005.
[2] p.s. bithas, "weibull-gamma composite distribution: alternative multipath/shadowing fading model", electronics letters, vol. 45, no. 14, pp. 749-751, 2009.
[3] p.m. shankar, "analysis of micro diversity and dual channel macro diversity in shadowed fading channels using a compound fading model", international journal of electronics and communications (aeue), vol. 62, pp. 445-449, 2008.
[4] d.b. đosić, d.m. stefanović, č.m. stefanović, "level crossing rate of macro-diversity system with two micro-diversity sc receivers over correlated gamma shadowed α–µ multipath fading channels", iete journal of research, vol. 62, no. 2, 2016.
[5] č.m. stefanović, "macro-diversity system with macro-diversity ssc receiver and two sc microdiversity receivers in the presence of composite fading environment", in proceedings of the 23rd telecommunications forum (telfor), belgrade, pp. 321-324, 2015.
[6] p.m. shankar, "performance analysis of diversity combining algorithms in shadowed fading channels", wireless personal communications, vol. 37, no. 1, pp. 61-72, 2006.
[7] a. adinoyi, h. yanikomeroglu, s. loyka, "hybrid macro- and generalized selection combining microdiversity in lognormal shadowed rayleigh fading channels", in proceedings of the ieee international conference on communications, vol. 1, pp. 244-248, 2004.
[8] s. mukherjee, d. avidor, "effect of micro-diversity and correlated macro-diversity on outages in a cellular system", ieee transactions on wireless communications, vol. 2, no. 1, pp. 50-58, 2003.
[9] a. papoulis, s.u. pillai, "probability, random variables, and stochastic processes", 4th ed., mcgraw-hill, london, uk, isbn-13: 978-0070486584, 2002.
[10] m. stefanović, d. milović, a. mitić, m. jakovljević, "performance analysis of system with selection combining over correlated weibull fading channels in the presence of co-channel interference", aeu international journal of electronics and communications, vol. 62, no. 9, pp. 695-700, october 2008.
[11] a. golubović, n. sekulović, m. stefanović, d. milić, "performance analysis of dual-branch selection diversity system using novel mathematical approach", facta universitatis, series: electronics and energetics, vol. 30, no. 2, pp. 235-244, june 2017.
[12] s. suljović, d. milić, s. panić, "lcr of sc receiver output signal over α-κ-μ multipath fading channels", facta universitatis, series: electronics and energetics, vol. 29, no. 2, pp. 261-268, june 2016.
[13] w.c. jakes, microwave mobile communications, piscataway, nj: ieee press, 1994.
[14] g.l. stüber, principles of mobile communications, boston, kluwer academic publishers, 1996.
[15] f.-p. calmon, m.d. yacoub, "mrcs–selecting maximal ratio combining signals: a practical hybrid diversity combining scheme", ieee transactions on wireless communications, 2009.
[16] b. jaksić, d. stefanović, m. stefanović, p. spalević, v. milenković, "level crossing rate of macro-diversity system in the presence of multipath fading and shadowing", radioengineering, vol. 24, no. 1, 2015.
[17] a. marković, z. perić, d. đošić, m. smilić, b. jakšić, "level crossing rate of macro-diversity system over composite gamma shadowed alpha-kappa-mu multipath fading channel", facta universitatis, series: automatic control and robotics, vol. 14, no. 2, pp. 99-109, 2015.
[18] m. bandjur, n. sekulović, m. stefanović, a. golubović, p. spalević, d. milić, "second-order statistics of system with micro-diversity and macro-diversity reception in gamma shadowed rician fading channels", etri journal, vol. 35, no. 4, pp. 722-725, 2013.
[19] i.s. gradshteyn, i.m. ryzhik, table of integrals, series and products, 6th ed., new york: academic press, 2000.
[20] n. sekulović, m. stefanović, "performance analysis of system with micro- and macro-diversity reception in correlated gamma shadowed rician fading channels", wireless personal communications, vol. 65, no. 1, pp. 143-156, 2012.
[21] s. panić, d. stefanović, i. petrović, m. stefanović, j. anastasov, d. krstić, "second-order statistics of selection macro-diversity system operating over gamma shadowed κ-μ fading channels", eurasip journal on wireless communications and networking, 2011.

instruction facta universitatis series: electronics and energetics vol. 30, no 2, june 2017, pp. 145–160 doi: 10.2298/fuee1702145e

load sharing methods for inverter-based systems in islanded microgrids – a review

augustine m. egwebe, meghdad fazeli, petar igic, paul holland
electronic system design center at college of engineering, swansea university, wales

abstract. this paper explores and discusses various design considerations for inverter-based systems. different load sharing techniques are presented for the integration of renewable energy sources within islanded microgrids. in off-grid connection, renewable energy sources are often configured to share power based on their rated capacity.
this paper explores both conventional and dynamic load sharing interaction between distributed generation units, in both inductive (high voltage) and resistive (low voltage) networks. load sharing based on the proper design of virtual impedance is also reviewed.

key words: distributed generation, microgrids, droop control, virtual impedance, photovoltaic, renewable sources.

1. introduction

the need for clean and reliable energy generation has propelled global activity in various spheres of human endeavor to develop alternative sources of energy. the provision of affordable, reliable and sustainable access to energy in different forms remains one of the key challenges of economic and social development, especially in developing countries [1, 2]. while it may be practically impossible to eliminate conventional nuclear and fossil fueled steam turbines, renewable energy sources (res) offer huge prospects to ease the ever-increasing demand burden on large, centralized conventional power systems. a vast reduction of greenhouse gas emissions can also be achieved via res integration with the existing electricity grid networks [3]. distributed generation is a term commonly used to describe small-scale and modular power generation sources that are located close to the distribution network rather than large power stations connected to the high voltage transmission network [4, 5]. distributed generators (dg) include small-scale fossil and renewable energy generation technologies such as wind, photovoltaic, micro-hydro-turbines, biogas, geothermal, tidal and steam turbines, with supplementary storage devices like fuel cells and batteries.
dg therefore contrasts with conventional large power stations that use a small number of large-scale, frequency-controlled generators; it offers improved power quality and enhanced system security, mitigates issues such as blackouts and gives better control over the cost of energy [6].

received october 28, 2016. corresponding author: augustine marho egwebe, electronic system design center at college of engineering, swansea university, wales (e-mail: augustine.egwebe@swansea.ac.uk)

with distributed generation, consumers now have some degree of flexibility in their energy utilization [7]. increased penetration of green renewable energy requires high-level engineering prowess in maintaining and improving the technologies that make them effective, durable and sustainable [8]. the integration of res with the existing power network mainly involves the strategies and schemes employed via the use of technologies, processes, and advanced control protocols to balance the production and demand of electrical energy within the network. these control schemes enhance the reliability of energy supply irrespective of the intermittent nature of the renewable source (i.e. fluctuating sunshine or wind profile). they include strategies for the optimal harnessing of the available renewable power, effective energy management, power/voltage device control, intelligent control of energy transformation, islanding detection, and line fault management [7-9]. to balance generated energy with demand in modern microgrids, various renewable sources and converters are often interconnected for load sharing and complementary energy support. renewable power generation units are also often supplemented with dispatchable resources such as energy storage systems and local auxiliary generators; the absence of such resources can result in the malfunctioning of the inverter-based sources (ivbs) [10].
an intermediate solution to some of the problems with the integration of dg with the existing power network is the concept of the microgrid shown in fig. 1: an electrical distribution system using distributed energy resources such as generators, storage devices and controllable loads, which are coordinated whether connected to the main power network or operated in islanded mode [9, 10]. in grid-connected mode, control measures are relatively easy to implement since the utility grid regulates voltage and frequency for loads within the microgrid, whereas in islanded mode voltage and frequency must be actively controlled for the continuous and stable performance of the network [11-13]. the microgrid, when operated in islanded mode, must be able to integrate and coordinate several energy resources with appropriate voltage-frequency control strategies. the electrical power generated from dgs must be well regulated to suit sensitive non-linear loads at the distribution level (e.g. computers, motor drives, battery chargers), without placing unregulated constraints on the generator. the control measures in dgs also aim to offer greater power quality control and the low voltage ride through required for eliminating transient stability issues [14, 15]. additionally, microgrids help to reduce congestion on the utility grid, serve as an uninterrupted supply for critical loads, encourage the localized generation of power on the consumer side and offer extra support in the form of voltage support, demand response and spinning reserve via inbuilt storage devices [16]. the hierarchical control approach employed in a microgrid allows autonomously coordinated generation output from dgs and energy storage systems while ensuring appropriate load sharing and interaction with the national grid (ng) [17].
in the absence of free generation capacity in the system due to each dg hitting its maximum generating capacity, the microgrid should be self-sustainable without violating sensitive network parameters like voltage and frequency [18]. a microgrid can connect to and disconnect from the ng, enabling it to operate in either grid-connected or islanded mode, using the microgrid central switch shown in fig. 1. a required basic characteristic of the microgrid is seamless “islanding” from and “reconnection” to the ng. disconnection can be the result of grid events, which include faults, voltage collapse, and blackout [8, 19]. all dgs within the microgrid must be well regulated to present peer-to-peer and plug-and-play characteristics.

fig. 1 an example of a microgrid network (pv arrays, a small wind turbine, an energy storage bank and a diesel generator supplying domestic houses, small industries and local amenities, e.g. a hospital, and connected through a transformer and the microgrid central switch to the mv network (10 kv); lv: low voltage, mv: medium voltage, mc: microsource controller, mcc: microgrid central controller)

islanding detection of distributed generation systems, voltage regulation, protection, power quality improvement and stability of the power system network are some of the technical challenges facing dgs as they are increasingly connected to microgrids. accurate and intelligent controller design is thus required to ensure swift interaction with connected loads and the microgrid while ensuring system stability when disconnecting from the ng in the case of a fault or disturbance [20]. in this paper, a thorough survey of the core components and techniques required for the effective integration of dgs in an islanded microgrid is presented. control paradigms to facilitate the efficient load sharing, operation, and energy management of dgs in a microgrid are also presented. 2.
inverter-based systems

inverter-based systems (ivbs) play a vital role in the effective integration of renewables with the microgrid at a synchronized system frequency. ivbs are commonly employed for switching dc voltages from renewable sources to ac voltages supplied to the microgrid and locally connected loads [19, 21]. monitoring and control functionality are essential requirements for the power electronics interface used in ivbs, so as to ensure the protection of the dg system as well as to meet the connection specifications of the ng [19]. active power, reactive power, voltage and frequency at the point of common connection are some of the critical monitoring parameters for these types of systems, as shown in fig. 2. also, proper conditioning of voltage and current ensures successful control of power flow per specific power references under varying load or dg input sources. motor drives and distributed generation systems use ivbs due to their inherent advantages of adjustable power factor, low total harmonic distortion (thd), and high efficiency.

fig. 2 generic inverter-based system

a conventional inverter circuit consists of controllable transistorized switches, such as igbts, with parallel diodes to provide a bypass path for transient currents, as shown in fig. 3.a. the three-phase igbt bridge circuit operates according to the control signal (vcon) generated by the control algorithm of the controller, as shown in fig. 3.b. the three-phase igbt-based inverter in fig. 3 consists of six switching devices (q1–q6), which are directly controlled by pulse width modulation (pwm) signals (s1 through s6) to be on (closed) or off (open) according to a well-structured switching pattern to produce the desired output ac waveforms [23, 24].
fig. 3 (a) three-phase bridge inverter [22]; (b) spwm control signal generator [22]

the use of pwm switching together with closed-loop voltage and current controllers produces a sinusoidal output current in phase with the grid voltage, with thd aligned to grid regulations. conventional grid-mode ivbs in pv applications ensure that: (1) pv modules operate at the maximum power point (mpp); (2) the ac current injected into the grid is sinusoidal, with consideration for the ieee 1547 standards for grid connection. these standards cover issues such as power quality, islanding detection, grounding and harmonics. one of the challenges of switching ivbs at high frequencies (2–20 khz) is the creation of high-order harmonics. thd in current and voltage can lead to low power factor, overheating of distribution system components, mechanical oscillation in generators and motors, poor performance of communication equipment, and unpredictable behavior of security protection systems [22, 23, 25]. the low-pass filter connected to the output of the inverter helps to prevent the injection of high-frequency harmonics into the ac bus [23, 25]. line frequency transformers are used for galvanic isolation when interfacing the microgrid with the ng, as shown in fig. 1. sinusoidal pulse width modulation (spwm) is the simplest continuous carrier-based pwm method for generating pulses and for switching inverter-based devices with a fundamental frequency of 50 or 60 hz.
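the sine-triangle comparison behind spwm can be sketched in a few lines. the example below is an illustrative model, assuming ideal switches, a triangular carrier normalized to [-1, 1], and pole voltages measured against the dc-link midpoint; the function names, sample count and parameter values are assumptions made for the sketch and are not taken from the paper.

```python
import math


def triangle(t, f_sw):
    """triangular carrier in [-1, 1] with switching frequency f_sw."""
    phase = (t * f_sw) % 1.0
    return 4.0 * abs(phase - 0.5) - 1.0


def spwm_leg_voltages(m, v_dc, f0=50.0, f_sw=10_000.0, samples=100_000):
    """pole voltages of legs a, b, c over one fundamental period.

    the upper switch of a leg (s1, s3 or s5) is on when the sinusoidal control
    signal exceeds the carrier; the lower switch (s4, s6 or s2) is complementary.
    """
    shifts = (0.0, -2.0 * math.pi / 3.0, 2.0 * math.pi / 3.0)
    legs = [[], [], []]
    for k in range(samples):
        t = k / (samples * f0)
        carrier = triangle(t, f_sw)
        for leg, shift in zip(legs, shifts):
            v_con = m * math.cos(2.0 * math.pi * f0 * t + shift)
            leg.append(0.5 * v_dc if v_con > carrier else -0.5 * v_dc)
    return legs


def fundamental_amplitude(waveform):
    """amplitude of the fundamental component via a single-bin dft."""
    n = len(waveform)
    re = sum(v * math.cos(2.0 * math.pi * k / n) for k, v in enumerate(waveform))
    im = sum(v * math.sin(2.0 * math.pi * k / n) for k, v in enumerate(waveform))
    return 2.0 * math.hypot(re, im) / n


if __name__ == "__main__":
    leg_a, leg_b, leg_c = spwm_leg_voltages(m=0.8, v_dc=700.0)
    print(fundamental_amplitude(leg_a))  # close to 0.5 * m * v_dc
```

in the linear range the fundamental of each pole voltage comes out close to half the dc-link voltage scaled by the modulation index, while the remaining energy sits around multiples of the carrier frequency, which is what the output low-pass filter removes.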
the main objectives of any modulation scheme are: (1) lower switching losses, (2) reduced thd of the output current, (3) minimal computational switching time, (4) better dc bus utilization, and (5) easy digital implementation [26]. in the spwm-based system in fig. 2, the three-phase fundamental components of the ac output voltage of the inverter are given by (1) [27-31].

$$\begin{aligned} v_{i\text{-}a} &= 0.5\,m_a V_{dc}\cos(\omega_0 t) \\ v_{i\text{-}b} &= 0.5\,m_b V_{dc}\cos(\omega_0 t - 120^{\circ}) \\ v_{i\text{-}c} &= 0.5\,m_c V_{dc}\cos(\omega_0 t + 120^{\circ}) \end{aligned} \tag{1}$$

where $m_a$, $m_b$, $m_c$ are the modulation indices per phase, $V_{dc}$ is the dc link voltage, and $\omega_0$ is the fundamental angular frequency of the system. by using the vector-control approach, (1) can be represented in αβ-components, which offers better steady-state tracking performance for the proportional-resonant (pr) controller, as shown in (2).

$$\begin{aligned} v_{i\text{-}\alpha} &= 0.5\,m_{\alpha} V_{dc} \\ v_{i\text{-}\beta} &= 0.5\,m_{\beta} V_{dc} \end{aligned} \tag{2}$$

the magnitude of the ac output voltage of the dg in (2) is given in (3):

$$v_{i\text{-}\alpha\beta} = 0.5\,V_{dc}\sqrt{m_{\alpha}^{2}+m_{\beta}^{2}} = 0.5\,m\,V_{dc} \tag{3}$$

in (3), the fundamental component of the ac output voltage is thus controlled through the inverter amplitude modulation index m, where m is defined as the ratio of the amplitude of the modulating signal to that of the carrier signal. the inverter switching process works well for 0 < m < 1, which prevents unwanted harmonic distortion [23, 32, 33]. the dc link voltage $V_{dc}$ of the inverter-based source must satisfy (3) to avoid pwm over-modulation and to ensure the stable operation of the dg in a microgrid. however, when the renewable energy resource level (e.g. wind or solar irradiance level) falls and $V_{dc}$ decreases, m must increase to maintain $v_{i\text{-}\alpha\beta}$ in (3). at m = 1, the achievable $v_{i\text{-}\alpha\beta}$ depends solely on $V_{dc}$. therefore, when designing the ivbs, the minimum dc link voltage that satisfies (3) must be taken into account. the diagrammatic description of the closed-loop control scheme of each inverter-based dg in an islanded microgrid is shown in fig. 2.
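the constraint that (3) places on the dc link can be turned into a short design check. the sketch below simply rearranges (3); the function names and the example voltages (a 325 v phase peak and a 700 v dc link) are assumptions chosen for illustration.

```python
def modulation_index(v_peak, v_dc):
    """modulation index required by (3): v_peak = 0.5 * m * v_dc."""
    return 2.0 * v_peak / v_dc


def min_dc_link(v_peak, m_max=1.0):
    """smallest dc link voltage that still delivers v_peak with m <= m_max."""
    return 2.0 * v_peak / m_max


def in_linear_range(v_peak, v_dc):
    """true when the required modulation index stays in 0 < m <= 1 (no over-modulation)."""
    m = modulation_index(v_peak, v_dc)
    return 0.0 < m <= 1.0


if __name__ == "__main__":
    v_peak = 325.0  # assumed peak phase voltage (about 230 v rms)
    for v_dc in (800.0, 700.0, 650.0, 600.0):
        print(v_dc, modulation_index(v_peak, v_dc), in_linear_range(v_peak, v_dc))
```

as the dc link voltage sags, the required modulation index rises towards one; below twice the peak phase voltage the inverter can no longer hold the output without over-modulation.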
the direct proportional-resonant (pr) control approach can be used to simplify fig. 2 as shown in the block diagram representation in fig. 4. 150 a. m. egwebe, m. fazeli, p. igic, p. holland gv-pr(s) gc-pr(s) + + vdc ÷ * gpwm(s) * * slf + rf 1+ + io scf 1 vovo * il * il ic lc filterpwm invertercontrollers vi fig. 4 block diagram of the closed-loop inverter-based source [20, 34] in the closed-loop dg model in fig. 4, an outer voltage loop gv-pr is used to control the output voltage of the inverter. the main control objective of gv-pr is to maintain a clean and balanced dg voltage as close as possible to the given sinusoidal reference voltage so that the thd of the output voltage is minimized. the voltage reference is compared with the measured voltage in αβ-frame to produce an error signal. the error signal is fed into a pr compensator, which in turn generates the current reference signal for the inner current loop. similarly, in the inner current controller gc_pr in fig. 4, the reference current from the outer voltage loop is compared with the measured output current. the error signal is fed into a pr controller to generate the reference signal for the pwm generator. the controlled output wave from the current controller is transformed back to the abc-frame using the abc/αβ-coordinate transformation principle, to generate the reference control signal for the inverter switching devices. the bandwidth of the inner current controller is usually designed to be much faster than the outer voltage loop to achieve a fast dynamic response. in general, the voltage and current controllers are designed to provide nearly perfect sinusoidal output voltage waveforms at a nominal switching frequency and to offer good damping for the output filter of the inverter and the rejection of high-frequency disturbances. vcα ioα + vcβ ioβ vcβ ioα vcα ioβ vcα vcβ ioβ ioα p qωf s + ωf ωf s + ωf p q ω * mpp v * nqq 1 s ω0 θ v fig. 
5 droop controlled power sharing for islanded dg [30]

the dynamics of the control scheme depend mainly on the bandwidth of the pq controller shown in fig. 4, since the bandwidths of the current and voltage controllers are designed to be much higher than that of the pq controller [25]. the power controller block is used for accurate sharing of p and q according to the droop characteristics as shown in fig. 5 [10]. the low-pass filter with cutoff frequency ωf is used to extract the average powers as shown in fig. 5. the non-ideal pr controller adopted in this paper can overcome two well-known drawbacks of the conventional pi controller: (1) the inability to track a sinusoidal reference with zero steady-state error, and (2) poor disturbance rejection capability. this is because the ideal pr controller has infinite gain at the fundamental frequency [35], which reduces the steady-state error to zero. equation (4) shows the transfer function of the adopted practical non-ideal pr controller, which achieves finite gain at the ac line frequency.

$$ g_{pr}(s) = k_p + \frac{2 k_i \omega_c s}{s^2 + 2 \omega_c s + \omega_0^2} \tag{4} $$

fig. 6 frequency response of a pr controller for kp = 0.01, ki = 1 and ωc = 1, 5, 15, 25, 50, 100 rad/s

the frequency response of (4) shows a wider bandwidth around the 50 hz resonant frequency, which helps to reduce the sensitivity to slight frequency variations caused by load disturbances. the pr controller's bandwidth can be varied with the damping factor ωc as shown in fig. 6. it can be seen that ωc has an effect on both the magnitude and phase of the controller. when choosing ωc, there has to be a compromise between the reduction of sensitivity and steady-state error.

2.1. current controller design

the design objective of the current controller is to have a high loop bandwidth with sufficient stability margins.
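as a quick numerical check of the non-ideal pr transfer function in (4), the sketch below evaluates its gain at and slightly off the 50 hz resonance for the ωc values quoted in the caption of fig. 6. the function name and the 49.5 hz test frequency are assumptions made for illustration; this is not the authors' design code.

```python
import math


def pr_response(omega, kp, ki, omega_c, omega_0):
    """Complex gain of the non-ideal PR controller (4) at s = j*omega."""
    s = complex(0.0, omega)
    resonant = (2.0 * ki * omega_c * s) / (s * s + 2.0 * omega_c * s + omega_0 * omega_0)
    return kp + resonant


omega_0 = 2.0 * math.pi * 50.0     # 50 Hz resonant frequency
omega_off = 2.0 * math.pi * 49.5   # assumed small frequency deviation
kp, ki = 0.01, 1.0                 # values from the caption of fig. 6

for omega_c in (1.0, 5.0, 15.0, 25.0, 50.0, 100.0):
    g_res = abs(pr_response(omega_0, kp, ki, omega_c, omega_0))
    g_off = abs(pr_response(omega_off, kp, ki, omega_c, omega_0))
    print(f"wc = {omega_c:5.1f} rad/s: |G| at 50 Hz = {g_res:.3f}, at 49.5 Hz = {g_off:.3f}")
```

at s = jω0 the resonant term reduces to ki, so the gain at resonance is kp + ki whatever the value of ωc; a larger ωc only widens the band over which the gain stays close to that value, which is the trade-off described for fig. 6.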
from control theory, systems with greater gain margins can withstand greater changes in system parameters before the closed-loop response becomes unstable. when designing via frequency response analysis, the goal is to predict the closed-loop behavior from the open-loop response of the current control loop shown in fig. 4. the closed-loop transfer function of the current loop when the output current is treated as a disturbance is given in (5) [36]. feedforward terms are added to the current loop in order to decouple the αβ components of the output voltage [37].

$$ g_c(s) = \frac{i_l}{i_l^*} = \frac{g_{c\text{-}pr}(s)\, g_{pwm}(s)\, g_{lf}(s)}{1 + g_{c\text{-}pr}(s)\, g_{pwm}(s)\, g_{lf}(s)} \tag{5} $$

where gc-pr is the pr current controller; glf is the transfer function of the lc filter; and gpwm(s) = 1 / (1 + 1.5 ts s) represents the pwm and computational delay with respect to the sampling period ts. by setting ωc in (4) equal to 10 rad/s, (5) can be tuned for a closed-loop bandwidth of 1 khz to give kp and ki of 12.5 Ω and 250 Ω respectively. note that the bandwidth of (5) is usually selected as one-tenth of the switching frequency.

2.2. voltage controller design

the voltage controller is also based on the pr structure in (4), where a generalized integrator is used to achieve a zero steady-state error. the closed-loop dynamic behavior of the dg in fig. 4 is approximated by an equivalent thevenin equation as given in (6):

$$ v_o = \frac{g_{pwm}(s)\, g_{c\text{-}pr}(s)\, g_{v\text{-}pr}(s)}{a_x s^2 + b_x s + c_x}\, v_o^* - \frac{s l_f + r_f + g_{pwm}(s)\, g_{c\text{-}pr}(s)}{a_x s^2 + b_x s + c_x}\, i_o = g_o(s)\, v_o^* - z_o(s)\, i_o \tag{6} $$

where ax = lf cf; bx = (rf + gpwm(s) gc-pr(s)) cf; cx = gpwm(s) gc-pr(s) gv-pr(s); gv-pr(s) is the pr capacitor voltage controller; rf is the parasitic resistance of the filter inductor; go(s) is the closed-loop control transfer function; and zo(s) is the output impedance.

fig.
7 shows the open-loop frequency response of the dg's voltage loop when the output current is treated as a disturbance. the high positive gain margin (46.4 db) and phase margin (79.3 degrees) both confirm the stability of the overall system. the bandwidth of the voltage controller is tuned to be about one-fifth of the bandwidth of the current controller, as shown in the closed-loop frequency response in fig. 8, to give kp and ki values of 0.5 Ω and 7 Ω respectively.

fig. 7 open-loop frequency response of the dg voltage loop

fig. 8 closed-loop frequency response of the dg voltage loop

2.3. virtual impedance design for p and q decoupling

fig. 9 block diagram representation of virtual impedance loop

in order to ensure a stable output impedance of the dg, the output dg voltage is dropped proportionally with the output current as shown in fig. 9 and explained in (7):

$$ v_o = g_o(s)\, v_o^* - \left( g_o(s)\, z_v(s) + z_o(s) \right) i_o \tag{7} $$

the output impedance of the inverter is re-designed to mitigate the influence of control parameters and line impedance on the power-sharing accuracy around the fundamental frequency as shown in fig. 9, so that the power is shared precisely between the distributed ivbs [34]. references [21, 34, 38, 39] propose a design scheme to eliminate the impact of the dg output impedance on the overall system dynamics; hence the virtual impedance loop was implemented for power decoupling and for restraining the circulating current between dgs. a performance comparison of virtual impedance techniques used in droop-controlled islanded microgrids was presented in [40, 41]. it was noted that the virtual inductive loop helps to improve the output impedance of the inverters such that it becomes predominantly inductive, thereby improving the power-sharing accuracy of the droop control algorithm.
similarly, a virtual resistive loop increases the output impedance of the inverters such that it becomes more resistive. the overall effect of impedance mismatches is also reduced by the virtual resistance loop, thereby improving the current sharing. a virtual resistance allows sharing of linear and nonlinear loads in microgrid applications without introducing additional losses in the network and improves the stability of the microgrid [41]. according to fig. 10, the magnitude of the ivbs output impedance at the fundamental frequency is approximately zero. this shows the effectiveness of the designed control parameters of the voltage and current loops. hence, the output impedance of the ivbs is designed to be equal to the virtual impedance around the fundamental frequency as shown in fig. 10. fig. 10 also illustrates the effect of the virtual inductance on the overall output impedance of the ivbs. as can be seen, the overall output impedance becomes more inductive as the virtual inductance increases.

fig. 10 output impedance frequency response of the ivbs with varying virtual inductance

3. conventional load sharing schemes in an islanded microgrid

load sharing without communication between the parallel dgs is the most favored option in an autonomous microgrid, as the network can be complex and can span a large geographical area [19, 42]. numerous studies have presented the droop scheme so that parallel dgs can be locally controlled to deliver the required active and reactive power to the microgrid network. by adopting the droop scheme, two local independent network quantities (voltage and frequency) are controlled to regulate active and reactive power with consideration for the allowable frequency and voltage deviation within the microgrid. the small-signal stability analysis of the droop scheme has also been explored in the literature [19, 30, 43].
one major concern with the droop scheme is its sensitivity to the imbalance in the system's closed-loop output impedance and line impedance, which can lead to undesired coupling between the active and reactive power [21, 42]. the complex power delivered to the common bus in fig. 2 can be expressed as shown in (8).

$$ s = p + jq \tag{8} $$

$$ p = \frac{v_o v_b}{z} \cos(\theta - \phi) - \frac{v_b^2}{z} \cos\theta, \quad q = \frac{v_o v_b}{z} \sin(\theta - \phi) - \frac{v_b^2}{z} \sin\theta \tag{9} $$

where p and q are the active and reactive power delivered by the dg; vo is the ac output voltage of the dg; vb is the bus voltage; z is the magnitude of the output impedance; θ is the phase angle of the output impedance; and ɸ is the power angle.

3.1. active power-frequency droop scheme

conventionally, the output impedance is considered to be purely inductive (i.e. z ≈ jx), hence (9) is re-written as in (10).

$$ p = \frac{v_o v_b}{x} \sin\phi, \quad q = \frac{v_o v_b \cos\phi - v_b^2}{x} \tag{10} $$

the power droop controller in fig. 5 aims to adjust the frequency and voltage difference relative to increasing load in a stable manner. in an inductive-based microgrid, the droop equation is expressed as (11) and shown in fig. 11.

$$ \omega_0 = \omega^* - m_p (p - p^*), \quad v_o = v^* - n_q (q - q^*) \tag{11} $$

where ω* is the fundamental frequency; v* is the ac reference voltage; p* and q* are the reference active and reactive powers; ɸ is the power angle; and p and q are the instantaneous active and reactive power of the dg. the droop gains mp and nq are calculated for a given range of frequency and voltage as shown in (12).

$$ m_p = \frac{\omega_{max} - \omega_{min}}{p_{rated}}, \quad n_q = \frac{v_{max} - v_{min}}{q_{rated}} \tag{12} $$

fig. 11 steady-state characteristic of conventional droop scheme

equation (11) indicates that the active power of the dg is dependent on the power angle, whereas the voltage amplitude difference mainly influences the reactive power.
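the droop relations (11) and (12) can be made concrete with a small steady-state sketch. it assumes p* = 0, a common steady-state frequency for all dgs, and two hypothetical units rated 10 kw and 5 kw sharing a 9 kw load over a 1 hz frequency band; none of these numbers come from the reviewed papers.

```python
import math


def droop_gain(x_max, x_min, rated):
    """Droop gain as in (12): allowed deviation divided by the rated power."""
    return (x_max - x_min) / rated


def share_active_power(p_load, m_p):
    """Steady-state p-omega sharing from (11) with p* = 0 (assumption).

    All units settle at the same frequency, so every unit sees the same
    drop delta_omega = m_p[i] * p[i], and the p[i] must add up to p_load.
    """
    delta_omega = p_load / sum(1.0 / m for m in m_p)
    return [delta_omega / m for m in m_p], delta_omega


# Hypothetical example: 49.5-50.5 Hz band, units rated 10 kW and 5 kW.
w_max, w_min = 2.0 * math.pi * 50.5, 2.0 * math.pi * 49.5
m_p = [droop_gain(w_max, w_min, 10e3), droop_gain(w_max, w_min, 5e3)]

shares, delta_omega = share_active_power(9e3, m_p)
for i, p in enumerate(shares, start=1):
    print(f"dg{i}: p = {p / 1e3:.2f} kW, m_p * p = {m_p[i - 1] * p:.4f} rad/s")
print(f"frequency drop = {delta_omega / (2.0 * math.pi):.3f} Hz")
```

because each gain is inversely proportional to the unit rating, the load splits in proportion to the ratings and the products mp·p are equal for all units.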
equation (13) shows the load distribution for n parallel-connected dgs in a microgrid when (11) is adopted.

$$ m_{p1} p_1 = m_{p2} p_2 = \ldots = m_{pn} p_n, \quad n_{q1} q_1 = n_{q2} q_2 = \ldots = n_{qn} q_n \tag{13} $$

3.2. active power-voltage droop scheme

it is noted in the literature that the performance of the conventional droop control is severely affected by the resistance-to-inductance (r/x) ratio of the output and line impedance. equation (9) becomes (14) when the output impedance of the dg is resistive (i.e. z ≈ r):

$$ p = \frac{v_o v_b \cos\phi - v_b^2}{r}, \quad q = -\frac{v_o v_b}{r} \sin\phi \tag{14} $$

since low voltage microgrid electrical distribution networks present a high r/x ratio, the voltage amplitude is used to control active power, while reactive power is controlled by the system frequency as shown in (15).

$$ \omega_0 = \omega^* + m_p (q - q^*), \quad v_o = v^* - n_q (p - p^*) \tag{15} $$

$$ m_p = \frac{\omega_{max} - \omega_{min}}{q_{rated}}, \quad n_q = \frac{v_{max} - v_{min}}{p_{rated}} $$

3.3. virtual impedance load sharing scheme

the active and reactive power can also be controlled autonomously using the virtual impedance scheme in (16), without any requirement for an additional power controller, as studied in [43]. equation (16) ensures accurate load sharing between the dgs and compensates reactive power differences due to output voltage mismatches or line impedance mismatches. in order to avoid the steady-state frequency deviation, a pll is introduced. this way, the pll adjusts the phase of the inverter, and the system is controlled by a virtual resistance controlling current as in a dc electrical system. reference [44] proposes an autonomous load sharing scheme using the virtual resistance loop and a synchronous reference frame phase-locked loop. this scheme provides both instantaneous current sharing and fast dynamic response of the paralleled ivbs. the relationship between ioαβ and the virtual resistance rv for n dgs is given as (16).

$$ i_{o\alpha 1} r_{v1} = i_{o\alpha 2} r_{v2} = \ldots = i_{o\alpha n} r_{vn}, \quad i_{o\beta 1} r_{v1} = i_{o\beta 2} r_{v2} = \ldots = i_{o\beta n} r_{vn} \tag{16} $$

the small-signal analysis shows that the α- and β-axis output currents of paralleled inverters are inversely proportional to their virtual resistances, since the current-sharing performance is influenced only by the ratio of the output impedances of the dgs rather than by their values [43].

fig. 12 characteristics of virtual impedance droop (v-p)

3.4. energy saving via dynamic load sharing

reference [10] presented a dynamic load sharing scheme for photovoltaic (pv) inverter-based systems in an inductive microgrid, which uses the pv array's current vs voltage characteristics to define an operating range for the inverter-based source. the dynamic load sharing scheme is based on the available solar power to ensure an efficient load sharing interaction with other dgs, without the need for energy support from a locally connected fossil-fuelled auxiliary generator, thereby providing significant energy saving compared with conventional static droop control techniques. in the dynamic loading scheme, the droop gains of the power controller in (11) are redefined for dynamic load interaction between the dgs [45] as follows:

$$ m_p = \frac{\Delta\omega}{p_{dc\text{-}max}}, \quad n_q = \frac{\Delta v}{q_{avail}} \tag{17} $$

where pdc-max is the maximum available power of the pv array, which is deduced from the maximum power curve in fig. 13. qavail is the available reactive power that the dg can supply as defined in (18).

$$ q_{avail} = \sqrt{s_{rated}^2 - p_{dg}^2} \tag{18} $$

figure 14 shows the load sharing profiles of two dgs interfaced to the islanded microgrid [10]. fig. 14.b shows load sharing based on the conventional droop scheme in (11), where a drop in the available power of dg2 causes a similar drop in dg1 even though it has enough available capacity. as a result, the total generation becomes less than the load, and the auxiliary generator (ag) is thus switched on to supply the shortfall paux.
in fig. 14.c, the load is adaptively shared based on the available pv power using (17). thus, a drop in the available energy in dg2 causes a proportional drop in its contribution to pload. similarly, dg1 dynamically compensates for this drop by supplying more power. hence no extra power is required from the ag (paux ≈ 0 in fig. 14.c).

fig. 13 steady-state characteristic of pv operating zone

fig. 14 simulation results of two dg systems using droop-based load sharing scheme showing active power sharing (a) available solar power in pu; (b) static scheme: active power in pu; (c) dynamic scheme: active power in pu.

4. conclusion

a thorough review of the effective integration of inverter-based systems in islanded microgrids was presented in this paper. different control and load sharing methods were discussed with respect to the output impedance of the dg, and a frequency response analysis of the influence of the pr controller on the performance of the dg was also presented. in the dynamic load sharing scheme presented, the droop parameters were tuned based on the available power of the dg. the dynamic load sharing scheme offers energy savings when compared with the conventional loading scheme.

references

[1] m. olken and a. zomers. (2014, jul./aug.) energy for all: world access to electricity. power and energy society.
[2] j. c. vasquez, j.m. guerrero, m. savaghebi, j. eloy-garcia and r. teodorescu, "voltage support provided by a droop-controlled multifunctional inverter," ieee trans. ind. electronics, vol. 56, pp. 4510-4519, oct. 2009.
[3] p. basak, s. chowdhury, s. halder, s. p. chowdhury, "a literature review on integration of distributed energy resources in the perspective of control, protection and stability of microgrid," renewable and sustainable energy reviews, vol. 16, pp.
5545-5556, 2012.
[4] n. jenkins, r. allan, p. crossley, d. kirschen, and g. strbac, embedded generation. london: the institute of electrical engineers, 2000.
[5] y. levron, j. m. guerrero and y. beck, "optimal power flow in microgrids with energy storage," ieee transactions on power systems, vol. 28, pp. 3226-3234, 2013.
[6] m. milligan, b. frew, b. kirby, m. schuerger, k. clark, d. lew, p. denholm, b. zavadil, m. o'malley, and b. tsuchida, "alternatives no more: wind and solar power are mainstays of a clean, reliable, affordable grid," ieee power and energy magazine, vol. 13, pp. 78-87, 2015.
[7] p. k. olulope, k. a. folly, and g. k. venayagamoorthy, "modeling and simulation of hybrid distributed generation and its impact on transient stability of power system," in proc. of the 2013 ieee international conference on industrial technology (icit), 2013, pp. 1757-1762.
[8] q. fu, a. hamidi, a. nasiri, v. bhavaraju, s. b. krstic, and p. theisen, "the role of energy storage in a microgrid concept: examining the opportunities and promise of microgrids," ieee electrification magazine, vol. 1, pp. 21-29, 2013.
[9] g. a. jiménez-estévez, "energy access challenge: it takes a village," ieee power and energy society trans., vol. 12, pp. 60-69, 2014.
[10] a. m. egwebe, m. fazeli, p. igic, and p. m. holland, "implementation and stability study of dynamic droop in islanded microgrids," ieee transactions on energy conversion, vol. 31, pp. 821-832, 2016.
[11] j. rocabert, g. m. s. azevedo, a. luna, j. m. guerrero, j. i. candela, and p. rodríguez, "intelligent connection agent for three-phase grid-connected microgrids," ieee trans. power electronics, vol. 26, pp. 2993-3005, oct. 2011.
[12] j. y. kim, j. h. jeon, s. k. kim, c. cho, j. h. park, h. m. kim, and k. y. nam, "cooperative control strategy of energy storage system and microsources for stabilizing the microgrid during islanded operation," ieee transactions on power electronics, vol. 25, pp. 3037-3048, 2010.
[13] m.
fazeli, g.m. asher, c. klumpner, l. yao, "novel integration of dfig-based wind generators within microgrids," ieee trans. energy conversion, vol. 26, pp. 840-850, aug. 2011.
[14] l. yun wei and k. ching-nan, "an accurate power control strategy for power-electronics-interfaced distributed generation units operating in a low-voltage multibus microgrid," ieee transactions on power electronics, vol. 24, pp. 2977-2988, 2009.
[15] c. trujillo rodriguez, d. velasco de la fuente, g. garcera, e. figueres, and j. a. guacaneme moreno, "reconfigurable control scheme for a pv microinverter working in both grid-connected and island modes," ieee trans. ind. electronics, vol. 60, pp. 1582-1595, nov. 2013.
[16] l. gao, r. a. dougal, s. liu, and a. p. lotova, "parallel-connected solar pv system to address partial and rapidly fluctuating shadow conditions," ieee transactions on industrial electronics, vol. 56, pp. 1548-1556, 2009.
[17] b. homchaudhuri and m. kumar, "market based allocation of power in smart grid," in proceedings of the 2011 american control conference, 2011, pp. 3251-3256.
[18] n. s. wade, p. c. taylor, p. d. lang, and p. r. jones, "evaluating the benefits of an electrical energy storage system in a future smart grid," energy policy, vol. 38, pp. 7180-7188, 2010.
[19] r. majumder, b. chaudhuri, a. ghosh, g. ledwich, and f. zare, "improvement of stability and load sharing in an autonomous microgrid using supplementary droop control loop," ieee power and energy society general meeting, pp. 1-1, jul. 2010.
[20] x. wang, f. blaabjerg, and z. chen, "an improved design of virtual output impedance loop for droop-controlled parallel three-phase voltage source inverters," in 2012 ieee energy conversion congress and exposition (ecce), 2012, pp. 2466-2473.
[21] s. golestan, f. adabi, h. rastegar, and a. roshan, "load sharing between parallel inverters using effective design of output impedance," in proc.
of the power engineering conference, 2008. aupec '08. australasian universities, 2008, pp. 1-5.
[22] n. mohan, t. undeland, and w. robbins, power electronics: converters, applications and design. new jersey: john wiley & sons, inc., 2003.
[23] a. keyhani, design of smart power grid renewable energy systems. hoboken, new jersey: john wiley and sons, 2011.
[24] p. igic, "review of advanced igbt compact models dedicated to circuit simulation," facta universitatis series: electronics and energetics, vol. 27, pp. 1-12, 2014.
[25] m. n. marwali, j. jin-woo, and a. keyhani, "stability analysis of load sharing control for distributed generation systems," ieee trans. energy conversion, vol. 22, pp. 737-745, sep. 2007.
[26] m. trabelsi, l. ben-brahim, t. yokoyama, a. kawamura, r. kurosawa, and t. yoshino, "an improved svpwm method for multilevel inverters," in proc. of the 15th international power electronics and motion control conference (epe/pemc), 2012, pp. ls5c.1-1-ls5c.1-7.
[27] v.f. pires, j.f. martins, and c. hao, "dual-inverter for grid-connected photovoltaic system: modeling and sliding mode control," sciencedirect: solar energy, vol. 86, pp. 2106-2115, jul. 2012.
[28] y. mohamed and e. f. el-saadany, "adaptive decentralized droop controller to preserve power sharing stability of paralleled inverters in distributed generation microgrids," ieee trans. power electron., vol. 23, pp. 2806-2816, nov. 2008.
[29] p. h. divshali, s. h. hosseinian, and m. abedi, "a novel multi-stage fuel cost minimization in a vsc-based microgrid considering stability, frequency, and voltage constraints," ieee trans. power sys., vol. 28, pp. 931-939, may 2013.
[30] s. hongtao, z. fang, h. lixiang, y. xiaolong, and z. dong, "small-signal stability analysis of a microgrid operating in droop control mode," ieee ecce asia downunder (ecce asia), pp. 882-887, jun. 2013.
[31] m. antchev and g.
kunov, "investigation of three-phase to single-phase matrix converter," facta universitatis series: electronics and energetics, vol. 22, pp. 245-252, 2009. [32] m. liserre, r. teodorescu, and j. rodriguez, grid converter for photovoltaic and wind power systems. chichester, west sussex: john wiley & sons inc, 2011. [33] z. grbo, s. vulkovic, and e. levi, "a novel power inverter for switched reluctance motor drives," facta universitatis series: electronics and energetics, vol. 18, 2005. [34] j. c. vasquez, j. m. guerrero, m. savaghebi, j. eloy-garcia, and r. teodorescu, "modeling, analysis, and design of stationary-reference-frame droop-controlled parallel three-phase voltage source inverters," ieee transactions on industrial electronics, vol. 60, pp. 1271-1280, 2013. [35] h. cha, t. k. vu, and j. e. kim, "design and control of proportional-resonant controller based photovoltaic power conditioning system," in 2009 ieee energy conversion congress and exposition, 2009, pp. 2198-2205. [36] a. chatterjee and k. b. mohanty, "design and analysis of stationary frame pr current controller for performance improvement of grid tied pv inverters," in proc. of the ieee 6th india international conference on power electronics (iicpe), 2014, pp. 1-6. [37] f. de bosio, l. a. d. s. ribeiro, m. s. lima, f. freijedo, j. m. guerrero, and m. pastorelli, "inner current loop analysis and design based on resonant regulators for isolated microgrids," in proc. of the ieee 13th brazilian power electronics conference and 1st southern power electronics conference (cobep/spec), 2015, pp. 1-6. [38] x. wang, f. blaabjerg, and z. chen, "autonomous control of inverter-interfaced distributed generation units for harmonic current filtering and resonance damping in an islanded microgrid," ieee transactions on industry applications, vol. 50, pp. 452-461, 2014. [39] j. m. guerrero, v. luis garcia de, j. matas, m. castilla, and j. 
miret, "output impedance design of parallel-connected ups inverters with wireless load-sharing control," ieee transactions on industrial electronics, vol. 52, pp. 1126-1135, 2005.
[40] a. micallef, m. apap, c. spiteri-staines, and j. m. guerrero, "performance comparison for virtual impedance techniques used in droop controlled islanded microgrids," in proc. of the international symposium on power electronics, electrical drives, automation and motion (speedam), 2016, pp. 695-700.
[41] g. herong, g. xiaoqiang, and w. weiyang, "accurate power sharing control for inverter-dominated autonomous microgrid," in proc. of the 7th international power electronics and motion control conference (ipemc), 2012, pp. 368-372.
[42] j.m. guerrero, j. matas, v. luis garcia de, m. castilla, and j. miret, "decentralized control for parallel operation of distributed generation inverters using resistive output impedance," ieee trans. ind. electron., vol. 54, pp. 994-1004, apr. 2007.
[43] y. guan, j. c. vasquez, j. m. guerrero, and e. a. a. coelho, "small-signal modeling, analysis and testing of parallel three-phase-inverters with a novel autonomous current sharing controller," in proc. of the ieee applied power electronics conference and exposition (apec), 2015, pp. 571-578.
[44] y. guan, j.c. vasquez, and j.m. guerrero, "a simple autonomous current-sharing control strategy for fast dynamic response of parallel inverters in islanded microgrids," in proc. of the ieee international energy conference (energycon), 2014, pp. 182-188.
[45] d. wu, f. tang, j. m. guerrero, j. c. vasquez, g. chen, and l. sun, "autonomous active and reactive power distribution strategy in islanded microgrids," in proc. of the ieee applied power electronics conference and exposition apec 2014, 2014, pp. 2126-2131.

instruction facta universitatis series: electronics and energetics vol. 29, n o 4, december 2016, pp.
675-688 doi: 10.2298/fuee1604675s

nonrigorous symmetric second-order abc applied to large-domain finite element modeling of electromagnetic scatterers

slobodan v. savić 1, milan m. ilić 1,2
1 university of belgrade, school of electrical engineering, belgrade, serbia
2 colorado state university, department of electrical and computer engineering, fort collins, co, usa

abstract. nonrigorous symmetric second-order absorbing boundary condition (abc) is presented as a feasible local mesh truncation in the higher-order large-domain finite element method (fem) for electromagnetic analysis of scatterers in the frequency domain. the abc is implemented on large generalized curvilinear hexahedral finite elements without imposing normal field continuity and without introducing new variables. as the extension of our previous work, the method is comprehensively evaluated by analyzing several benchmark targets, i.e., a metallic sphere, a dielectric cube, and the nasa almond. numerical examples show that the radar cross section (rcs) of analyzed scatterers can be accurately predicted when the divergence term is included in computations nonrigorously. the influence of specific terms in the second-order abc, which absorb transverse electric (te) and transverse magnetic (tm) spherical modes, is also investigated. examples show significant improvements in accuracy of the nonrigorous second-order abc over the first-order abc.

key words: absorbing boundary condition, electromagnetic scattering, finite element method, numerical methods

1. introduction

the finite element method (fem) is a widely used computational tool in the frequency-domain analysis of electromagnetic (em) problems [1-4]. to preserve the sparsity of the fem system when analyzing open-region (radiating and scattering) problems, the necessary artificial truncation of the computational domain is often done by applying approximate local absorbing boundary conditions (abcs) [4].
the symmetric second-order vector absorbing boundary condition (abc) is a very popular choice among abcs because it preserves the symmetry of the fem system while maintaining satisfactory accuracy of the solution [5, 6].

received september 29, 2015; received in revised form december 29, 2015
corresponding author: slobodan v. savić, university of belgrade, school of electrical engineering, belgrade, serbia (email: ssavic@etf.rs)

however, this formulation requires computation of the divergence term on the faces of finite elements (fes) belonging to the absorbing boundary surface (abs). this, in turn, is a problem on its own because the required normal continuity of the fields is generally not enforced across the edges of adjacent elements in a standard weak-form fem discretization where edge-based curl-conforming vector basis functions are employed. in addition, a divergence calculation of the nonconforming basis functions in such formulations cannot be done analytically for the generalized curved fes, even across the faces of elements at the abs (excluding the troublesome edges) where these functions are continuous and differentiable. this problem has been addressed before; however, all reported conclusions pertain to evaluation of the second-order abc in small-domain spatial discretization frameworks [7-9], where the fem volume elements are electrically small (e.g., their edges are on the order of λ/10, λ being the wavelength at the operating frequency of the implied time-harmonic excitation). this spatial discretization results in a rather fine mesh throughout the computational domain and at the abs as well. it appears that in such meshes omitting the divergence term in the second-order abc, or computing it nonrigorously without enforcing the normal continuity of the fields, yields approximately the same error [8].
on the other hand, a method which rigorously implements the second-order abc on small curved tetrahedra, while preserving the symmetry of the system, has been recently proposed in [9]. however, this method employs auxiliary variables, thus mandating significant changes in the existing fem code. moreover, in the open literature there appear to be no analyses of the second-order abc performance in coarse large-domain fem meshes, although fine meshes and small elements are really not required at the abs, which is typically moved away from the analyzed structure and resides in a homogeneous free space. the em field is usually not changing rapidly at the abs, hence the advantages of large-domain modeling can be fully exploited. with the above in mind, we proposed that large-domain discretization utilizing curved elements whose edges are up to 2λ long, coupled with truly higher order (e.g., up to the 10th order) polynomial field expansion, can be efficiently used in the abs tessellation. the number of edges shared by faces of adjacent finite elements at the abs is thus reduced, which can, in turn, significantly reduce the error introduced by direct computation of required derivatives, because these edges are the sole locations where discontinuities of the normal field components actually arise when the second-order abc is implemented nonrigorously. preliminary results of the proposed method applied to a simple metallic spherical scatterer can be found in [10]. in this work we present the implementation details of the nonrigorous symmetric second-order abc applied on large curvilinear hexahedra in higher-order fem and evaluate its performance on a comprehensive set of benchmark targets which include: a metallic sphere, a dielectric cube (as an example of a penetrable structure with sharp edges and vertices), and a metallic nasa almond as a standard nontrivial benchmark target of the electromagnetic code consortium (emcc).
2. theory and implementation

2.1. higher-order large-domain fem formulation

when solving three-dimensional (3-d) linear steady-state em problems by the fem, we first geometrically discretize the domain of interest using lagrange-type generalized curved hexahedra of arbitrary orders ku, kv, and kw (ku, kv, kw ≥ 1). these hexahedra are geometrically flexible and can be used for large-domain modeling of arbitrary shapes [11]. they are analytically described by the position vector [11]

$$ \mathbf{r}(u,v,w) = \sum_{i=0}^{k_u} \sum_{j=0}^{k_v} \sum_{k=0}^{k_w} \mathbf{r}_{ijk}\, l_i^{k_u}(u)\, l_j^{k_v}(v)\, l_k^{k_w}(w), \quad l_i^{k_u}(u) = \prod_{l=0,\, l \neq i}^{k_u} \frac{u - u_l}{u_i - u_l}, \quad -1 \le u, v, w \le 1, \tag{1} $$

where $\mathbf{r}_{ijk} = \mathbf{r}(u_i, v_j, w_k)$ are position vectors of interpolation nodes and $l_i^{k_u}$ represent lagrange interpolation polynomials in the u coordinate of the local parametric u-v-w coordinate system, with $u_l$ being the uniformly spaced interpolating nodes defined as $u_l = (2l - k_u)/k_u$, $l = 0, 1, \ldots, k_u$, and similarly for $l_j^{k_v}(v)$ and $l_k^{k_w}(w)$. we then solve the electric field vector wave equation within each of the finite elements [1, 3]. in every hexahedron we expand the electric field vector as

$$ \mathbf{e} = \sum_{i=0}^{n_u - 1} \sum_{j=0}^{n_v} \sum_{k=0}^{n_w} \alpha_{u,ijk}\, \mathbf{f}_{u,ijk} + \sum_{i=0}^{n_u} \sum_{j=0}^{n_v - 1} \sum_{k=0}^{n_w} \alpha_{v,ijk}\, \mathbf{f}_{v,ijk} + \sum_{i=0}^{n_u} \sum_{j=0}^{n_v} \sum_{k=0}^{n_w - 1} \alpha_{w,ijk}\, \mathbf{f}_{w,ijk}, \tag{2} $$

where $\mathbf{f}$ are curl-conforming (and generally div-nonconforming) hierarchical polynomial vector basis functions defined as

$$ \mathbf{f}_{u,ijk} = u^i p_j(v) p_k(w)\, \mathbf{a}_u^r, \quad \mathbf{f}_{v,ijk} = p_i(u)\, v^j p_k(w)\, \mathbf{a}_v^r, \quad \mathbf{f}_{w,ijk} = p_i(u) p_j(v)\, w^k\, \mathbf{a}_w^r, $$

$$ p_i(u) = \begin{cases} 1 - u, & i = 0 \\ u + 1, & i = 1 \\ u^i - 1, & i \ge 2,\ \text{even} \\ u^i - u, & i \ge 3,\ \text{odd} \end{cases}, \quad -1 \le u, v, w \le 1, \tag{3} $$

nu, nv, and nw are the adopted degrees of the polynomial approximation, which are entirely independent of the element geometrical orders ku, kv, and kw, and $\alpha_{u,ijk}$, $\alpha_{v,ijk}$, and $\alpha_{w,ijk}$ are unknown field-distribution coefficients (to be determined by the fem).
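the scalar building blocks of (1) and (3) are easy to check numerically. the sketch below is a minimal illustration, not the authors' implementation: it evaluates the uniformly spaced nodes, the lagrange polynomials, and the functions p_i(u) used in the hierarchical basis. the function names are assumptions chosen for readability.

```python
def lagrange_nodes(k):
    """Uniformly spaced nodes u_l = (2l - k)/k, l = 0..k, on [-1, 1] (k >= 1)."""
    if k < 1:
        raise ValueError("geometrical order k must be at least 1")
    return [(2 * l - k) / k for l in range(k + 1)]


def lagrange_poly(i, k, u):
    """Lagrange interpolation polynomial of order k tied to node i, as in (1)."""
    nodes = lagrange_nodes(k)
    value = 1.0
    for l, u_l in enumerate(nodes):
        if l != i:
            value *= (u - u_l) / (nodes[i] - u_l)
    return value


def p_basis(i, u):
    """Scalar function p_i(u) from (3) used in the hierarchical vector basis."""
    if i == 0:
        return 1.0 - u
    if i == 1:
        return u + 1.0
    if i % 2 == 0:
        return u ** i - 1.0   # i >= 2, even
    return u ** i - u         # i >= 3, odd


print(lagrange_nodes(2))
print([lagrange_poly(i, 2, 0.5) for i in range(3)])
print([p_basis(i, 1.0) for i in range(2, 6)], [p_basis(i, -1.0) for i in range(2, 6)])
```

two properties are worth noting: each lagrange polynomial equals one at its own node and zero at the others, and every p_i with i ≥ 2 vanishes at u = ±1, so only the i = 0 and i = 1 functions contribute at the element boundaries.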
the reciprocal unitary vectors $\mathbf{a}_u^{r}$, $\mathbf{a}_v^{r}$, and $\mathbf{a}_w^{r}$ in (3) are defined as $\mathbf{a}_u^{r}=(\mathbf{a}_v\times\mathbf{a}_w)/J$, $\mathbf{a}_v^{r}=(\mathbf{a}_w\times\mathbf{a}_u)/J$, and $\mathbf{a}_w^{r}=(\mathbf{a}_u\times\mathbf{a}_v)/J$, where $J=(\mathbf{a}_u\times\mathbf{a}_v)\cdot\mathbf{a}_w$ is the jacobian of the covariant transformation and $\mathbf{a}_u$, $\mathbf{a}_v$, and $\mathbf{a}_w$ are unitary vectors defined as $\mathbf{a}_u=\partial\mathbf{r}/\partial u$, $\mathbf{a}_v=\partial\mathbf{r}/\partial v$, and $\mathbf{a}_w=\partial\mathbf{r}/\partial w$. by adopting a higher-order polynomial field expansion [$N_u$, $N_v$, and $N_w$ in (2) can be up to the 10th order], through the process of p-refinement, fes can be up to $2\lambda$ long in each direction [11]. applying the standard galerkin-type discretization yields the disconnected system of linear equations for each of the finite elements [1]

$$([A]-k_0^{2}[B])\,\{\alpha\}=\{G_s\}, \tag{4}$$

where $k_0$ represents the free-space wave number and $\{\alpha\}$ is the column vector of electric field distribution coefficients from (2). the disconnected system of linear equations does not take into account the boundary conditions that the fields must satisfy on the interfaces between two adjacent fes, but considers each finite element (fe) separately. in order to facilitate implementation (and coding), matrices $[A]$ and $[B]$ can be represented using submatrices as in [11]

$$[A]=\begin{bmatrix}[UUA]&[UVA]&[UWA]\\ [VUA]&[VVA]&[VWA]\\ [WUA]&[WVA]&[WWA]\end{bmatrix},\qquad [B]=\begin{bmatrix}[UUB]&[UVB]&[UWB]\\ [VUB]&[VVB]&[VWB]\\ [WUB]&[WVB]&[WWB]\end{bmatrix}. \tag{5}$$

the entries in the submatrices $[UVA]$ and $[UVB]$ are given as

$$\begin{aligned}UVA_{\hat{i}\hat{j}\hat{k},\,ijk}&=\int_V(\nabla\times\mathbf{f}_{u,\hat{i}\hat{j}\hat{k}})\cdot\bar{\bar{\mu}}_r^{-1}(\nabla\times\mathbf{f}_{v,ijk})\,\mathrm{d}V,\\ UVB_{\hat{i}\hat{j}\hat{k},\,ijk}&=\int_V\mathbf{f}_{u,\hat{i}\hat{j}\hat{k}}\cdot\bar{\bar{\varepsilon}}_r\,\mathbf{f}_{v,ijk}\,\mathrm{d}V,\end{aligned}\qquad \begin{aligned}&\hat{i}=0,1,\ldots,N_u-1,\quad i=0,1,\ldots,N_u,\\ &\hat{j}=0,1,\ldots,N_v,\quad j=0,1,\ldots,N_v-1,\\ &\hat{k},k=0,1,\ldots,N_w,\end{aligned} \tag{6}$$

where $V$ stands for the volume of the fe and $\bar{\bar{\varepsilon}}_r$ and $\bar{\bar{\mu}}_r$ are the relative permittivity and permeability tensors [12, 13], respectively.
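the definition of the reciprocal unitary vectors given earlier in this subsection implies the biorthogonality relations $\mathbf{a}_p\cdot\mathbf{a}_q^{r}=\delta_{pq}$. the following minimal python sketch checks this for one set of unitary vectors; the sheared-brick vectors in the demo and the helper names are assumptions made here for illustration and do not come from the paper.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def reciprocal_vectors(a_u, a_v, a_w):
    """Return (J, a_u^r, a_v^r, a_w^r) for unitary vectors a_u, a_v, a_w."""
    jacobian = dot(cross(a_u, a_v), a_w)
    if jacobian == 0:
        raise ValueError("degenerate element: Jacobian is zero")
    ar_u = tuple(c / jacobian for c in cross(a_v, a_w))
    ar_v = tuple(c / jacobian for c in cross(a_w, a_u))
    ar_w = tuple(c / jacobian for c in cross(a_u, a_v))
    return jacobian, ar_u, ar_v, ar_w


if __name__ == "__main__":
    # Hypothetical sheared brick r = (2u, v + 0.5u, 3w): constant unitary vectors.
    a_u, a_v, a_w = (2.0, 0.5, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 3.0)
    J, ar_u, ar_v, ar_w = reciprocal_vectors(a_u, a_v, a_w)
    print("J =", J)
    print("a_u . a_u^r =", dot(a_u, ar_u), " a_u . a_v^r =", dot(a_u, ar_v))
```

on a curved element the unitary vectors, and therefore $J$ and the reciprocal vectors, vary with $(u, v, w)$ and have to be evaluated at every integration point.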
the electric field expansion orders $N_u$, $N_v$, and $N_w$ in (2) are selected in accordance with the reduced-gradient criterion [14, 15] and by following the recipes in [16], which facilitate optimal higher-order computation. the remaining entries of matrices $[A]$ and $[B]$ are calculated in a similar manner. analogously, the column vector $\{G_s\}$ can be represented as

$$\{G_s\}=\begin{Bmatrix}\{UG_s\}\\ \{VG_s\}\\ \{WG_s\}\end{Bmatrix}, \tag{7}$$

and the entries in the column vector $\{UG_s\}$ are given as

$$UG_{s,\hat{i}\hat{j}\hat{k}}=\oint_S\mathbf{f}_{u,\hat{i}\hat{j}\hat{k}}\cdot\left[(\bar{\bar{\mu}}_r^{-1}\,\nabla\times\mathbf{E})\times\mathbf{n}\right]\mathrm{d}S,\qquad \hat{i}=0,1,\ldots,N_u-1,\quad \hat{j}=0,1,\ldots,N_v,\quad \hat{k}=0,1,\ldots,N_w, \tag{8}$$

where $S$ stands for the boundary surface of an element, $\mathbf{E}$ is the electric field vector at $S$ (generally not known in advance), and $\mathbf{n}$ is the unit normal on $S$ pointing outwards from the element. the remaining entries of the column vector $\{G_s\}$ are calculated in a similar manner. the connected system of linear equations [1] is then assembled from (4), and the surface integrals in $\{G_s\}$ [as in (8)] are calculated only at the outer boundary of the fem domain, and not at the boundary of each element [3]. the connected system of linear equations takes into account the natural boundary conditions, i.e., tangential continuity of electric fields (explicitly) and magnetic fields (implicitly), which must be satisfied at the interfaces between finite elements. consequently, $\{G_s\}$ is calculated only at the outer fem domain boundary, and thus it represents a natural connection (interface) between the fem domain and the surrounding space. finally, to obtain a well-defined numerical problem, appropriate em field boundary conditions must be imposed at the outer fem boundary.
these boundary conditions can be (i) exact and nonlocal, as in the hybrid finite element method-method of moments (fem-mom) [17], (ii) exact and local, when the fem domain is surrounded by a perfect electric conductor (pec) or a perfect magnetic conductor (pmc), or (iii) approximate and local, e.g., when em field propagation through free space, far from em sources and media discontinuities, is approximated by an abc placed relatively close to the scatterer. the local boundary conditions do not reduce the sparsity of the final system of linear equations, which is a highly desirable property [18, 19] and one of the strongest benefits of the fem compared to the mom.

2.2. symmetric second-order absorbing boundary condition

consider an em scatterer (or, generally, em field sources) occupying a finite volume, surrounded by free space and illuminated by an incident em field ($\mathbf{E}^{inc}$ and $\mathbf{H}^{inc}$), as shown in fig. 1. in most cases the incident em field is a uniform plane wave, but the theory presented here applies to the general case as well. let $S_{abc}$ be a fictitious spherical surface of radius $r_{abc}$, centered at the origin and surrounding the scatterer. we truncate the fem computational domain by applying the abc at $S_{abc}$. the symmetric (i.e., resulting in a symmetric system of linear equations) second-order abc, obtained by approximating the term $\mathbf{i}_r\times(\nabla\times\mathbf{E}^{sc})$ using the wilcox expansion [20] and given as [6]

$$\mathbf{i}_r\times(\nabla\times\mathbf{E}^{sc})=-\mathrm{j}k_0\,\mathbf{i}_r\times(\mathbf{i}_r\times\mathbf{E}^{sc})+\frac{r_{abc}}{2(1+\mathrm{j}k_0r_{abc})}\,\nabla\times\left[\mathbf{i}_r\,(\mathbf{i}_r\cdot\nabla\times\mathbf{E}^{sc})\right]+\frac{r_{abc}}{2(1+\mathrm{j}k_0r_{abc})}\,\nabla_t(\nabla_t\cdot\mathbf{E}^{sc}), \tag{9}$$

will be applied at $S_{abc}$, where $\mathbf{E}^{sc}=\mathbf{E}-\mathbf{E}^{inc}$ represents the scattered electric field, $\mathbf{i}_r$ is the radial unit vector of the spherical coordinate system, the subscript $t$ denotes the part of a vector or gradient operator tangential to $S_{abc}$, and j is the imaginary unit.

fig. 1 analysis of open em problems using an abc.
note that for the connected system of linear equations, the surface integrals in $\{G_s\}$ are calculated (only) over the entire outer fem domain boundary $S_{abc}$, and that they vanish at the junction of two finite elements. on the other hand, the basis and testing functions appearing in the integrals are taken locally, from a specific element, as the integration progresses. the terms in the surface integrals in $\{G_s\}$ [as in (8)] can be rearranged for easier implementation of the second-order abc (9) as

$$UG_{s,\hat{i}\hat{j}\hat{k}}=\oint_{S_{abc}}\mathbf{f}_{u,\hat{i}\hat{j}\hat{k}}\cdot\left[(\bar{\bar{\mu}}_r^{-1}\,\nabla\times\mathbf{E})\times\mathbf{n}\right]\mathrm{d}S=-\oint_{S_{abc}}\left[\mathbf{i}_r\times(\nabla\times\mathbf{E})\right]\cdot\mathbf{f}_{u,\hat{i}\hat{j}\hat{k}}\,\mathrm{d}S, \tag{10}$$

since $\mathbf{n}=\mathbf{i}_r$ and $\bar{\bar{\mu}}_r^{-1}=[I]$ at $S_{abc}$, with $[I]$ being the identity matrix. applying (10) and imposing the second-order abc (9), the system of linear equations (4) becomes

$$([A]-k_0^{2}[B]+\mathrm{j}k_0[S])\,\{\alpha\}=\{G_s^{abc}\}. \tag{11}$$

matrix $[S]$ in (11) is the sum of three parts: the part corresponding to the first-order abc, the part corresponding to the second-order abc that absorbs transverse electric (te) spherical modes, and the part corresponding to the second-order abc that absorbs transverse magnetic (tm) spherical modes [6, 10]. in matrix notation this can be written as

$$[S]=[S^{1abc}]-\frac{r_{abc}}{2k_0(k_0r_{abc}-\mathrm{j})}\left([S^{2abc,te}]-[S^{2abc,tm}]\right), \tag{12}$$

where the corresponding terms are self-explanatory. analogously to (5), matrix $[S]$ can be represented using submatrices, namely

$$[S]=\begin{bmatrix}[UUS]&[UVS]&[UWS]\\ [VUS]&[VVS]&[VWS]\\ [WUS]&[WVS]&[WWS]\end{bmatrix}, \tag{13}$$

where the entries in the submatrix $[UVS]$, for example, are given [in accordance with (12)] as

$$UVS_{\hat{i}\hat{j}\hat{k},\,ijk}=UVS^{1abc}_{\hat{i}\hat{j}\hat{k},\,ijk}-\frac{r_{abc}}{2k_0(k_0r_{abc}-\mathrm{j})}\left(UVS^{2abc,te}_{\hat{i}\hat{j}\hat{k},\,ijk}-UVS^{2abc,tm}_{\hat{i}\hat{j}\hat{k},\,ijk}\right), \tag{14}$$

and analogously for all other submatrices in (13).
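the factor in front of the second-order parts in (12) and (14) can be related to the factor in (9) by a one-line manipulation. the sketch below assumes, as the surrounding equations suggest, that the second-order terms of the abc carry the factor $r_{abc}/[2(1+\mathrm{j}k_0r_{abc})]$ and that $[S]$ enters the system (11) multiplied by $\mathrm{j}k_0$; the sign finally attached to each part depends on how the individual surface integrals are defined.

$$\frac{1}{\mathrm{j}k_0}\cdot\frac{r_{abc}}{2(1+\mathrm{j}k_0r_{abc})}=\frac{r_{abc}}{2(\mathrm{j}k_0-k_0^{2}r_{abc})}=\frac{r_{abc}}{2k_0(\mathrm{j}-k_0r_{abc})}=-\frac{r_{abc}}{2k_0(k_0r_{abc}-\mathrm{j})}.$$

in other words, dividing the second-order coefficient by $\mathrm{j}k_0$, so that all three parts share the common prefactor $\mathrm{j}k_0$ in (11), produces the denominator $2k_0(k_0r_{abc}-\mathrm{j})$.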
the entries corresponding to the first-order abc, the te part of the second-order abc, and the tm part of the second-order abc, respectively, are calculated as

$$\begin{aligned}UVS^{1abc}_{\hat{i}\hat{j}\hat{k},\,ijk}&=\oint_{S_{abc}}(\mathbf{i}_r\times\mathbf{f}_{u,\hat{i}\hat{j}\hat{k}})\cdot(\mathbf{i}_r\times\mathbf{f}_{v,ijk})\,\mathrm{d}S,\\ UVS^{2abc,te}_{\hat{i}\hat{j}\hat{k},\,ijk}&=\oint_{S_{abc}}\left[\mathbf{i}_r\cdot(\nabla\times\mathbf{f}_{u,\hat{i}\hat{j}\hat{k}})\right]\left[\mathbf{i}_r\cdot(\nabla\times\mathbf{f}_{v,ijk})\right]\mathrm{d}S,\\ UVS^{2abc,tm}_{\hat{i}\hat{j}\hat{k},\,ijk}&=\oint_{S_{abc}}(\nabla_t\cdot\mathbf{f}_{u,\hat{i}\hat{j}\hat{k}})(\nabla_t\cdot\mathbf{f}_{v,ijk})\,\mathrm{d}S,\end{aligned}\qquad \begin{aligned}&\hat{i}=0,1,\ldots,N_u-1,\quad i=0,1,\ldots,N_u,\\ &\hat{j}=0,1,\ldots,N_v,\quad j=0,1,\ldots,N_v-1,\\ &\hat{k},k=0,1,\ldots,N_w.\end{aligned} \tag{15}$$

the column vector $\{G_s^{abc}\}$ in (11) can be written in the form shown in (7), with the addition of the superscript "abc" to distinguish the column vectors in (4) and (11). hence, similarly to (12), the column vector $\{G_s^{abc}\}$ can be represented as the sum of the part corresponding to the first-order abc, the te part of the second-order abc, and the tm part of the second-order abc, respectively, as

$$\{G_s^{abc}\}=\{G_s^{1abc}\}+\{G_s^{2abc,te}\}+\{G_s^{2abc,tm}\}. \tag{16}$$

the entries in the column vector $\{UG_s^{abc}\}$, for example, are given as

$$\begin{aligned}UG^{1abc}_{s,\hat{i}\hat{j}\hat{k}}&=\oint_{S_{abc}}\left[(\mathbf{i}_r\times\mathbf{f}_{u,\hat{i}\hat{j}\hat{k}})\cdot(\nabla\times\mathbf{E}^{inc})+\mathrm{j}k_0\,(\mathbf{i}_r\times\mathbf{f}_{u,\hat{i}\hat{j}\hat{k}})\cdot(\mathbf{i}_r\times\mathbf{E}^{inc})\right]\mathrm{d}S,\\ UG^{2abc,te}_{s,\hat{i}\hat{j}\hat{k}}&=\frac{r_{abc}}{2(1+\mathrm{j}k_0r_{abc})}\oint_{S_{abc}}\left[\mathbf{i}_r\cdot(\nabla\times\mathbf{f}_{u,\hat{i}\hat{j}\hat{k}})\right]\left[\mathbf{i}_r\cdot(\nabla\times\mathbf{E}^{inc})\right]\mathrm{d}S,\\ UG^{2abc,tm}_{s,\hat{i}\hat{j}\hat{k}}&=-\frac{r_{abc}}{2(1+\mathrm{j}k_0r_{abc})}\oint_{S_{abc}}(\nabla_t\cdot\mathbf{f}_{u,\hat{i}\hat{j}\hat{k}})(\nabla_t\cdot\mathbf{E}^{inc})\,\mathrm{d}S,\end{aligned}\qquad \hat{i}=0,1,\ldots,N_u-1,\quad \hat{j}=0,1,\ldots,N_v,\quad \hat{k}=0,1,\ldots,N_w, \tag{17}$$

and analogously for the remaining entries in $\{G_s^{abc}\}$.

2.3. computation of the surface integrals appearing in the symmetric second-order absorbing boundary condition applied to curvilinear elements

consider the surface integrals appearing in (11) when computing the entries in $[S]$ and $\{G_s^{abc}\}$.
the utilized basis and testing functions are curl-conforming and generally div-nonconforming, hence the divergences in the tm parts of (15) and (17), and all similar terms, cannot be expressed in closed form. moreover, as already discussed, these surface integrals are calculated over the entire $S_{abc}$ surface; in other words, they are calculated not only over the finite element surfaces belonging to $S_{abc}$, but across the junctions (edges between the elements) as well. since the basis and testing functions possess only tangential continuity, this results in the appearance of squares of delta-functions ($\delta^2$) in the kernels of the surface-integral terms at all edges enveloping the surfaces of the finite elements belonging to $S_{abc}$ [9]. in order to rigorously treat the divergence of the basis and testing functions at the edges of elements over $S_{abc}$, the basis and testing functions must be adapted to enforce the normal continuity of the em field over $S_{abc}$ [8], or additional auxiliary (scalar) variables need to be introduced as in [9]. nevertheless, since the utilized higher-order polynomial basis and testing functions are continuous and differentiable over the faces of the fes, their divergence can be readily calculated numerically. for example, from (3) it follows that the divergence of $\mathbf{f}_{u,ijk}$ is given as

$$\begin{aligned}\nabla\cdot\mathbf{f}_{u,ijk}={}&i\,u^{i-1}P_j(v)P_k(w)\,(\mathbf{a}_u^{r}\cdot\mathbf{a}_u^{r})+u^{i}P_j(v)P_k(w)\,\frac{1}{J}\frac{\partial}{\partial u}\!\left[J\,(\mathbf{a}_u^{r}\cdot\mathbf{a}_u^{r})\right]\\ &+u^{i}\frac{\partial P_j(v)}{\partial v}P_k(w)\,(\mathbf{a}_u^{r}\cdot\mathbf{a}_v^{r})+u^{i}P_j(v)P_k(w)\,\frac{1}{J}\frac{\partial}{\partial v}\!\left[J\,(\mathbf{a}_u^{r}\cdot\mathbf{a}_v^{r})\right]\\ &+u^{i}P_j(v)\frac{\partial P_k(w)}{\partial w}\,(\mathbf{a}_u^{r}\cdot\mathbf{a}_w^{r})+u^{i}P_j(v)P_k(w)\,\frac{1}{J}\frac{\partial}{\partial w}\!\left[J\,(\mathbf{a}_u^{r}\cdot\mathbf{a}_w^{r})\right].\end{aligned} \tag{18}$$

the partial derivatives in (18) are calculated numerically utilizing the symmetric finite difference. for example,

$$\frac{\partial}{\partial v}\!\left[J\,(\mathbf{a}_u^{r}\cdot\mathbf{a}_v^{r})\right]\approx\frac{\left.J\,(\mathbf{a}_u^{r}\cdot\mathbf{a}_v^{r})\right|_{v'=v+\mathrm{d}v}-\left.J\,(\mathbf{a}_u^{r}\cdot\mathbf{a}_v^{r})\right|_{v'=v-\mathrm{d}v}}{2\,\mathrm{d}v}, \tag{19}$$

where $\mathrm{d}v$ is the numerical-differentiation step.
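the symmetric finite difference in (19) can be sketched in a few lines of python. this is an illustration only: `central_difference` is a hypothetical helper name, and the cubic test function stands in for the geometric factor $J\,(\mathbf{a}_u^{r}\cdot\mathbf{a}_v^{r})$, which in an actual code would be evaluated from the element geometry.

```python
def central_difference(g, v, dv=1e-4):
    """Symmetric finite-difference estimate of dg/dv at v with step dv."""
    return (g(v + dv) - g(v - dv)) / (2.0 * dv)


def stand_in_factor(v):
    """Smooth stand-in for J * (a_u^r . a_v^r) along v (hypothetical)."""
    return v**3 + 2.0 * v


if __name__ == "__main__":
    v0 = 0.5
    exact = 3.0 * v0**2 + 2.0  # analytical derivative of the stand-in
    approx = central_difference(stand_in_factor, v0)
    print("exact:", exact, "approx:", approx, "error:", abs(approx - exact))
```

the truncation error of the symmetric difference scales as $\mathrm{d}v^{2}$, so a moderate step is accurate enough for smooth polynomial geometry, while a very small step would only amplify round-off error.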
since these divergences are computed only at the fem domain-truncation boundary $S_{abc}$, the numerical differentiation is a minimal addition to the complexity of the overall algorithm, and the computation time for the surface integrals in $\{G_s^{abc}\}$ is almost negligible compared to the computation time for the fem volume integrals appearing in matrices $[A]$ and $[B]$. the procedure is similar when the divergence is calculated for the functions $\mathbf{f}_{v,ijk}$ and $\mathbf{f}_{w,ijk}$.

3. numerical results and discussion

3.1. pec spherical scatterer

as the first numerical example, consider a pec spherical scatterer of radius $a = 1$ m. the scatterer is situated in free space, with permittivity $\varepsilon_0$ and permeability $\mu_0$, and illuminated by a time-harmonic plane wave of free-space wavelength $\lambda_0 = 1$ m ($f = 299.792$ mhz), as shown in fig. 2 (a). when constructing the numerical model, the infinite free space surrounding the scatterer is truncated at the artificial spherical boundary $S_{abc}$ of radius $b = 1.5$ m, where the nonrigorous symmetric second-order abc is imposed. the normalized thickness of the free-space layer between the scatterer and $S_{abc}$ is $(b - a)/\lambda_0 = 0.5$, and it is meshed by only six cushion-like triquadratic curved hexahedral fes.

[fig. 2 (b) plot: normalized l2 error norm of the bistatic rcs (left axis, logarithmic) and number of fem-abc unknowns (right axis, logarithmic) versus the expansion order n; curves: 1st-order abc; 1st-order abc with the te part of the 2nd-order abc; 1st-order abc with the tm part of the 2nd-order abc; nonrigorous 2nd-order abc; unknowns.]

fig. 2 (a) large-domain fem-abc model of a pec spherical scatterer. (b) normalized l2 error norm of the computed bistatic rcs for the pec spherical scatterer and the number of unknowns.

first, we consider far-field results. the bistatic radar cross section (rcs) of the scatterer is computed by the proposed fem-abc technique.
the order of the polynomial expansion of the electric field for all fes and in all directions is $N_u = N_v = N_w = N$. numerical integration is performed by means of the 13th-order gauss-legendre quadrature. the bistatic rcs is computed uniformly in all directions (from $\theta_{start} = 0°$ to $\theta_{stop} = 180°$ with the resolution of $\Delta\theta = 5°$, and from $\varphi_{start} = 0°$ to $\varphi_{stop} = 360°$ with the resolution of $\Delta\varphi = 5°$), and its error (with respect to the analytical mie's series solution) is calculated as the normalized l2 norm

$$\frac{\mathrm{L^2norm}(\Delta\mathrm{biRCS})}{\mathrm{L^2norm}(\mathrm{biRCS}_{Mie/MoM})}=\sqrt{\frac{\displaystyle\sum_{\theta=0°}^{180°}\sum_{\varphi=0°}^{360°}\left(\mathrm{biRCS}_{FEM}(\theta,\varphi)-\mathrm{biRCS}_{Mie/MoM}(\theta,\varphi)\right)^{2}}{\displaystyle\sum_{\theta=0°}^{180°}\sum_{\varphi=0°}^{360°}\left(\mathrm{biRCS}_{Mie/MoM}(\theta,\varphi)\right)^{2}}}, \tag{20}$$

where $\mathrm{biRCS}_{FEM}$ stands for the numerical solution for the bistatic rcs obtained by the proposed fem-abc technique and $\mathrm{biRCS}_{Mie}$ stands for the analytical (reference) results in the form of mie's series. in the following subsection, when the analytical mie's series solution is not available, the results obtained by the mom, denoted as $\mathrm{biRCS}_{MoM}$, are used as a reference, as indicated in (20). in fig. 2 (b) numerical results are compared for the first-order and nonrigorous second-order abc, along with results for the first-order abc with only one term included from the nonrigorous second-order abc [$\{G_s^{2abc,te}\}$ and $[S^{2abc,te}]$, or $\{G_s^{2abc,tm}\}$ and $[S^{2abc,tm}]$, from (12) and (16)]. to validate the convergence of the method with p-refinement, the solutions are obtained for various orders $N$, ranging from $N = 1$ to $N = 9$. from fig. 2 (b) it can be concluded that, although it is not implemented rigorously and does not independently improve the accuracy of the solution, the tm part of the symmetric second-order abc, together with the te part, synergistically contributes to the overall solution accuracy.
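the error measure (20) is straightforward to compute once the computed and reference rcs values are sampled on the same angular grid. the following python sketch assumes the usual definition of the l2 norm as the square root of the sum of squares; the function names and the synthetic data in the demo are hypothetical and serve only as an illustration.

```python
import math


def angle_grid(theta_step=5, phi_step=5):
    """All (theta, phi) pairs in degrees: theta 0..180, phi 0..360, inclusive."""
    return [(theta, phi)
            for theta in range(0, 181, theta_step)
            for phi in range(0, 361, phi_step)]


def normalized_l2_error(computed, reference):
    """Normalized L2 error norm of computed samples against reference samples."""
    if len(computed) != len(reference):
        raise ValueError("computed and reference must have the same length")
    numerator = sum((c - r) ** 2 for c, r in zip(computed, reference))
    denominator = sum(r ** 2 for r in reference)
    if denominator == 0:
        raise ValueError("reference samples are all zero")
    return math.sqrt(numerator / denominator)


if __name__ == "__main__":
    grid = angle_grid()
    # Synthetic reference pattern and a computed pattern that is 10 % too large.
    reference = [1.0 + math.cos(math.radians(theta)) ** 2 for theta, _ in grid]
    computed = [1.1 * value for value in reference]
    print(len(grid), normalized_l2_error(computed, reference))
```

with the 5° resolution quoted in the text, the grid has 37 × 73 = 2701 directions, and a uniform 10 % overestimate of the rcs yields a normalized error of 0.1, which gives a feeling for the scale of the values plotted in the error figures.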
in addition, due to the very coarse mesh in this example, the fem solution becomes sufficiently accurate for $7 \le N \le 9$, with $N = 8$ yielding the lowest error, which is consistent with the results reported in [16]. moreover, the lowest errors obtained with the proposed large-domain fem with the nonrigorous second-order abc are of the same order of magnitude as those reported in the first example in [9], where the same scatterer was analyzed utilizing the rigorously implemented second-order abc. in this example the nonrigorous second-order abc performs significantly better in the far field than the first-order abc, and for $N = 8$ the solution error is 2.7 times lower than that obtained utilizing the first-order abc. note that this error difference is even greater (8.8 times in favor of the nonrigorous second-order abc) when the abc is set closer to the scatterer, i.e., when $(b - a)/\lambda_0 = 0.1$, as reported in [10]. since far fields, and related derived parameters, are less sensitive to computational errors than near fields, a more rigorous and complete validation of the proposed fem-abc technique requires a near-field comparison; we therefore next analyze the accuracy of the computed near field of the presented pec spherical scatterer. using the mesh from fig. 2 (a) and setting $N = 8$ (for all elements in all directions), we compute the near electric field numerically and analytically and show the comparison of the obtained results in fig. 3. shown in fig. 3 is the magnitude of the x-component of the total electric field, in the $x = 0$ plane, obtained (a) analytically (mie's series solution) and numerically using (b) the first-order abc and (c) the proposed second-order abc. the incident electric field is $\mathbf{E}^{inc} = 1\,\mathbf{i}_x$ [v/m] ($\mathbf{i}_x$ being the cartesian unit vector in the x-direction), traveling in the z-direction, as shown in fig. 3 (d). in figs.
3 (e) and (f) the errors of the electric field computed by the fem (relative to the reference mie's series solution) for the first-order and second-order abc models are plotted, respectively. the error is calculated as $|\Delta E_x| = \sqrt{(E_{x,FEM}^{re} - E_{x,Mie}^{re})^2 + (E_{x,FEM}^{im} - E_{x,Mie}^{im})^2}$, where $E_{x,FEM}$ and $E_{x,Mie}$ are the x-components of the electric fields obtained numerically and analytically, respectively, and re and im stand for the real and imaginary parts of the complex quantities, respectively.

fig. 3 near-field results for the pec spherical scatterer from fig. 2, obtained (a) analytically and numerically using (b) the first-order and (c) the proposed second-order abc. (d) large-domain fem-abc model of a pec spherical scatterer with illustrated incident field. electric field error (relative to the reference mie's series solution) for (e) the first-order and (f) the proposed second-order abc.

from fig. 3, it can be concluded that the proposed second-order abc significantly outperforms the first-order abc. the results obtained using the nonrigorous second-order abc are more accurate than those using the first-order abc in the complete $x = 0$ plane, and especially for $z > 0$. note that, due to symmetry, the remaining two cartesian components of the electric field vanish in the $x = 0$ plane ($E_y = 0$, $E_z = 0$), hence they are not shown. also, note that other field components in different planes exhibit similar errors, hence they are not shown here for brevity. in addition, the errors in the near field can be further reduced by employing p-refinement.

3.2. dielectric cubical scatterer

as the second numerical example, consider a dielectric cubical scatterer with relative permittivity $\varepsilon_r = 2.25$ and relative permeability $\mu_r = 1$, of edge length $a = 2$ m.
the scatterer is situated in free space and illuminated by a time-harmonic plane wave of free-space wavelength $\lambda_0 = 2$ m ($f = 149.896$ mhz), as shown in fig. 4 (a). when constructing the numerical model, the infinite free space surrounding the scatterer is truncated at the artificial spherical boundary $S_{abc}$ of radius $b = 2$ m, where the nonrigorous symmetric second-order abc is imposed. the free space between the scatterer and $S_{abc}$ is again meshed by only six cushion-like triquadratic curved hexahedral fes, and the dielectric scatterer is meshed by only one trilinear fe. the minimal normalized distance between the scatterer and $S_{abc}$ is $(b - 0.5a\sqrt{3})/\lambda_0 = 0.13$ and the maximal normalized distance is $(b - 0.5a)/\lambda_0 = 0.5$.

[fig. 4 (b) plot: normalized l2 error norm of the bistatic rcs with respect to the mom reference (left axis, logarithmic) and number of fem-abc unknowns (right axis, logarithmic) versus the expansion order n; curves: 1st-order abc; 1st-order abc with the te part of the 2nd-order abc; 1st-order abc with the tm part of the 2nd-order abc; nonrigorous 2nd-order abc; unknowns.]

fig. 4 (a) large-domain fem-abc model of a dielectric cubical scatterer. (b) normalized l2 error norm of the computed bistatic rcs for the dielectric cubical scatterer and the number of unknowns.

the normalized l2 error norm of the computed bistatic rcs for the cubical scatterer is calculated as discussed in subsection 3.1 and shown in fig. 4 (b). the error is calculated with respect to the fully converged mom solutions obtained by the wipl-d software [21]. numerical parameters regarding the field expansion and integration in the fem model are kept the same as in the previous example. based on fig. 4 (b), it can be concluded that the nonrigorously implemented tm part of the second-order abc independently contributes to the quality of the solutions and that, together with the te part of the second-order abc, both parts synergistically contribute to the overall solution accuracy. in this example, the nonrigorous second-order abc performs significantly better compared to the first-order
abc, and for $N = 7$ the error obtained using the second-order abc is 5.6 times smaller than that for the first-order abc.

3.3. pec nasa almond scatterer

as the last example, consider a pec nasa almond scatterer, which is one of the standard benchmarks of the emcc. the nasa almond is geometrically described by the parametric equations given above fig. 2 in [22]. the almond of length $d = 252.37$ mm (parameter d from the equations in [22]), situated in free space and illuminated by a horizontally or vertically (in the $\theta = 90°$ plane) polarized incident em field at the operating frequency $f = 1.19$ ghz ($\lambda_0 = 252$ mm), is considered, as shown in fig. 5.

fig. 5 pec nasa almond scatterer.

the higher-order fem-abc model of the pec nasa almond scatterer consists of 96 triquadratic large-domain lagrange-type fes. these fes model the free space between the almond and the spherical surface $S_{abc}$, where the nonrigorous symmetric second-order abc is applied. the radius of $S_{abc}$ is $b = 220$ mm. the minimum and maximum distances from the almond to $S_{abc}$ are $0.373\lambda_0$ and $0.801\lambda_0$, respectively, and the field expansion orders are set to $N = 6$ (for all finite elements and in all directions), which results in 62220 unknown field distribution coefficients. using the proposed nonrigorous second-order abc coupled with the large-domain higher-order fem technique, the monostatic rcs in the horizontal plane ($\theta = 90°$, $0° \le \varphi \le 180°$) is computed. the results are compared with results obtained by the mom technique [21, 23] for both horizontal and vertical incident field polarizations, and shown in fig. 6.

[fig. 6 plots: monostatic rcs [dbm^2] versus φ from 0° to 180° in the θ = 90° plane; curves: wipl-d; feko; higher-order fem-abc with nu = nv = nw = 6, 62220 unknowns, nonrigorous 2nd-order abc.]

fig. 6 computed monostatic rcs of the pec nasa almond from fig. 5 for the (a) horizontal and (b) vertical incident field polarization; comparison of the proposed fem-abc and two mom results obtained by the wipl-d [21] and feko [23] software.

from fig. 6 it can be concluded that a very good agreement between the fem-abc and mom results is achieved in all directions, and that scatterers of relatively complex shapes can also be accurately analyzed by the proposed fem-abc method.

4. conclusions

we have presented, implemented, and validated by representative numerical experiments a nonrigorous symmetric second-order abc in combination with the large-domain higher-order fem technique for frequency-domain em scattering analysis. in the proposed method, the abc is implemented nonrigorously, without imposing the normal field continuity and without introducing additional variables. the required divergence of the nonconforming field components is computed numerically on the faces of the elements belonging to the abs, using simple finite differences. numerical experiments have shown that the nonrigorous second-order abc performs significantly better than the first-order abc and that the results of the proposed method match very well with highly accurate reference numerical solutions. moreover, the examples have shown that the errors in the computation of the rcs can be significantly lower if the divergence term is included in the abc, as described, than if it is omitted. this conclusion is in contrast with results reported thus far in the literature, where examples with small-domain fem meshes have been utilized exclusively.
finally, examples with a dielectric cubical scatterer and the nasa almond have shown that the proposed method can be successfully applied in the analysis of scatterers with sharp edges and tips.

acknowledgement: this work was supported by the serbian ministry of science and technological development under grant tr-32005.

references
[1] p. p. silvester and r. l. ferrari, finite elements for electrical engineers, 3 ed. new york: cambridge university press, 1996.
[2] j. l. volakis, a. chatterjee, and l. c. kempel, finite element method for electromagnetics (antennas, microwave circuits, and scattering applications), 1 ed. new york: ieee press, 1998.
[3] j.-m. jin, the finite element method in electromagnetics. hoboken, new jersey: john wiley & sons, 2014.
[4] j.-m. jin and d. j. riley, finite element analysis of antennas and arrays, 1 ed. hoboken, new jersey: wiley-ieee press, 2009.
[5] j. p. webb and v. n. kanellopoulos, "absorbing boundary conditions for the finite element solution of the vector wave equation," microwave and optical technology letters, vol. 2, no. 10, pp. 370-372, october 1989.
[6] a. f. peterson, "accuracy of 3-d radiation boundary conditions for use with the vector helmholtz equation," ieee transactions on antennas and propagation, vol. 40, no. 3, pp. 351-355, march 1992.
[7] v. n. kanellopoulos and j. p. webb, "3d finite element analysis of a metallic sphere scatterer: comparison of first and second order vector absorbing boundary conditions," journal de physique iii, vol. 3, no. 3, pp. 563-572, march 1993.
[8] v. n. kanellopoulos and j. p. webb, "the importance of the surface divergence term in the finite element-vector absorbing boundary condition method," ieee transactions on microwave theory and techniques, vol. 43, no. 9, pp. 2168-2170, september 1995.
[9] m. m. botha and d. b.
davidson, "rigorous, auxiliary variable-based implementation of a second-order abc for the vector fem," ieee transactions on antennas and propagation, vol. 54, no. 11, pp. 3499-3504, november 2006.
[10] s. v. savić, b. m. notaroš, and m. m. ilić, "accuracy analysis of the nonrigorous second-order absorbing boundary condition applied to large curved finite elements," in 2015 international conference on electromagnetics in advanced applications (iceaa), turin, italy, 2015, pp. 58-61.
[11] m. m. ilić and b. m. notaroš, "higher order hierarchical curved hexahedral vector finite elements for electromagnetic modeling," ieee transactions on microwave theory and techniques, vol. 51, no. 3, pp. 1026-1033, march 2003.
[12] s. v. savić, a. b. manić, m. m. ilić, and b. m. notaroš, "efficient higher order full-wave numerical analysis of 3-d cloaking structures," plasmonics, vol. 8, no. 2, pp. 455-463, june 2013.
[13] s. v. savić, b. m. notaroš, and m. m. ilić, "conformal cubical 3d transformation-based metamaterial invisibility cloak," journal of the optical society of america a, vol. 30, no. 1, pp. 7-12, january 2013.
[14] j. c. nedelec, "mixed finite elements in r3," numerische mathematik, vol. 35, no. 3, pp. 315-341, september 1980.
[15] j. c. nedelec, "a new family of mixed finite elements in r3," numerische mathematik, vol. 50, no. 1, pp. 57-81, january 1986.
[16] e. m. klopf, n. j. šekeljić, m. m. ilić, and b. m. notaroš, "optimal modeling parameters for higher order mom-sie and fem-mom electromagnetic simulations," ieee transactions on antennas and propagation, vol. 60, no. 6, pp. 2790-2801, june 2012.
[17] m. m. ilić, m. djordjević, a. ž. ilić, and b. m. notaroš, "higher order hybrid fem-mom technique for analysis of antennas and scatterers," ieee transactions on antennas and propagation, vol. 57, no. 5, pp. 1452-1460, may 2009.
[18] g. strang, linear algebra and its applications, 4 ed.: brooks cole, 2005.
[19] g. strang, introduction to linear algebra, 4 ed.
wellesley, ma: wellesley cambridge press, 2009.
[20] c. h. wilcox, "an expansion theorem for electromagnetic fields," communications on pure and applied mathematics, vol. 9, no. 2, pp. 115-134, may 1956.
[21] "wipl-d pro," 11.0 wipl-d d.o.o., 2013. available: http://www.wipld.com.
[22] a. c. woo, h. t. g. wang, m. j. schuh, and m. l. sanders, "benchmark radar targets for the validation of computational electromagnetics programs," ieee antennas and propagation magazine, vol. 35, no. 1, pp. 84-89, february 1993.
[23] "feko," altair development s.a. (pty) ltd, 2011. available: http://feko.info/applications/rcs.

facta universitatis series: electronics and energetics vol. 34, no 3, september 2021, pp. 445-460 https://doi.org/10.2298/fuee2103445s © 2021 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper

introducing an optimal qca crossbar switch for baseline network

reza sabbaghi-nadooshan
department of electrical engineering, central tehran branch, islamic azad university, tehran, iran

abstract. the crossbar switch is the basic component in multi-stage interconnection networks. therefore, this study was conducted to investigate the performance of a crossbar switch built from two multiplexers. the presented crossbar switch was simulated using quantum-dot cellular automata (qca) technology and the qca designer software, and was studied and optimized in terms of cell number, occupied area, number of clocks, and energy consumption. using the provided crossbar switch, the baseline network was designed to be optimal in terms of cell number and occupied area. also, the input states were investigated and simulated to verify the correct operation of the baseline network. the proposed crossbar switch uses 62 qca cells, occupies an area of 0.06 µm², and has a latency of 4 clock zones, which is more efficient than the other designs.
in this paper, using the presented crossbar switch, the baseline network was designed with 1713 cells and an occupied area of 2.89 µm².

key words: qca, crossbar switch, mux, baseline network, multi-stage interconnection networks, energy dissipation

1. introduction

today, device density, power consumption, and speed of response are among the challenges of designing electronic circuits. sizes of semiconductor devices have reached submicron dimensions, increasing the level of complexity in chip design. according to moore's law, the number of transistors in a chip doubles every 18 months, meaning an increase in circuit density and power consumption in the chips [1]. the increase in circuit density means shrinking of the transistors inside the chips, and this shrinkage in complementary metal-oxide-semiconductor (cmos) technology increases leakage current and creates short-channel effects [2].

received january 11, 2021; received in revised form march 6, 2021
corresponding author: reza sabbaghi-nadooshan, niayesh building, emam hasan blvd., pounak, tehran, iran, e-mail: r_sabbaghi@iauctb.ac.ir

problems in downsizing cmos technology have led to the introduction of new emerging technologies, such as the fin field-effect transistor (finfet), the carbon nanotube field-effect transistor (cntfet), and quantum-dot cellular automata (qca), among which qca technology can be the best alternative to cmos in the design of digital circuits [3-5]. qca cells are quantum cells that, in the binary case, consist of four quantum wells and two electrons, whose stable arrangements define the polarization states of the cell [6-8]. in qca, no current flows from one cell to another because electrons can only tunnel inside each cell, and information is transmitted only through the transfer of state from one cell to another. this technology has very low energy consumption due to the lack of current transfer between cells.
the advantages of qca technology include high response speed, small occupied area, and low energy consumption, which make it a strong candidate for the design of future digital circuits [9]. circuits designed using qca include adders [10], a serial-parallel converter [11], a counter [12], a serial-in to serial-out (siso) shift register [13], a multiplier [14], and a comparator [15]. qca can also be used to design complex digital systems, such as parallel communication systems, multiprocessors, and on-chip communication networks [16-17]. in multiprocessor systems, communication between processors is required from input to output; this connection can be established through the nodes of a network [18-20]. the control lines of the switch in each node specify the data path from the desired input to the output. interconnection networks that can be implemented using qca technology include butterfly, dragonfly, beyond network, etc.
in this study, an optimal crossbar switch is presented with respect to occupied area, number of cells, and latency. the interconnection structure of the baseline network is then built from the proposed crossbar switch. in the rest of the paper, section 2 describes qca. the proposed crossbar switch and the baseline network are presented in sections 3 and 4, respectively. finally, section 5 concludes the paper.
2. qca
the qca cell was first proposed by lent [21], and a functional cell was realized in 1997 [22]. each qca cell contains four potential wells and two electrons enclosed in a square [23-25]. the two electrons can tunnel freely between the quantum dots of their own cell, but they cannot leave the enclosing square and tunnel to another cell.
data transfer from one qca cell to another takes place through electrostatic interaction between cells; information is not transmitted through current. two stable states are formed by the placement of the electrons in each cell, giving polarizations of -1 and +1 (the steady state occurs when the two electrons in a cell are at the greatest distance from each other) [26-31]. fig. 1 depicts the structure of the qca cell.
fig. 1 qca cell structure
qca uses clocks to control tunneling and to synchronize data. the qca clock has four phases: switch, hold, release, and relax [19, 25, 27]. fig. 2 illustrates the clock phases in qca.
fig. 2 phase clock in qca
in the switch phase, the barrier potential slowly increases and the kinetic energy of the electrons decreases. the cell then enters the hold phase, in which the barrier potential reaches its highest level and the kinetic energy is almost zero. in the release phase that follows, the barrier potential slowly decreases, the electrons are gradually released, and their kinetic energy begins to increase. the last phase is the relax phase, in which the barrier potential reaches its lowest level and the electrons can tunnel freely inside the cell [19, 25, 27].
as mentioned, the qca cell has two polarizations, -1 and +1, which are assigned to logic levels 0 and 1, respectively, and the basic gates of binary logic can be implemented with these polarizations [6-8]. fig. 3 shows the basic gates in qca.
fig. 3 basic gates in qca
3. proposed crossbar switch
in this section, the crossbar switch is first designed and implemented, and its application in the baseline network is then reviewed. qca designer version 2.0.3 was used to simulate the circuits, and qcapro was used to calculate energy consumption.
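the logic encoding recalled in section 2 can be illustrated with a small behavioral sketch. it assumes the standard qca primitives, a three-input majority gate and an inverter, with and/or obtained by fixing one majority input to polarization -1 or +1. all function names are illustrative, and the gate-level 2:1 mux at the end is a generic construction, not necessarily the optimized layout used in this work.

```python
def to_polarization(bit: int) -> int:
    """Map logic level 0/1 to cell polarization -1/+1."""
    return 1 if bit else -1


def to_logic(polarization: int) -> int:
    """Map cell polarization -1/+1 back to logic level 0/1."""
    return 1 if polarization > 0 else 0


def majority(a: int, b: int, c: int) -> int:
    """Three-input majority gate on polarizations (sum of three +/-1 is never 0)."""
    return 1 if a + b + c > 0 else -1


def inverter(a: int) -> int:
    """Inverter: flips the polarization."""
    return -a


def and_gate(a: int, b: int) -> int:
    """AND: majority gate with one input fixed to polarization -1."""
    return majority(a, b, -1)


def or_gate(a: int, b: int) -> int:
    """OR: majority gate with one input fixed to polarization +1."""
    return majority(a, b, +1)


def mux2(d0: int, d1: int, sel: int) -> int:
    """Generic 2:1 mux: returns d0 when sel is -1 (logic 0), d1 when sel is +1."""
    return or_gate(and_gate(d0, inverter(sel)), and_gate(d1, sel))


# Example: select data input d0 (logic 1) with the select line at logic 0.
selected = to_logic(mux2(to_polarization(1), to_polarization(0), to_polarization(0)))
```

in this example `selected` is 1, because the select line at logic 0 passes the first data input. the sketch is purely logical and ignores clock zones and cell layout.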
the parameters used in the qca designer software are as follows:
number of samples: 50000
convergence tolerance: 0.001000
radius of effect (nm): 65
relative permittivity: 12.9
clock high: 9.8e-22
clock low: 3.8e-23
clock shift: 0
clock amplitude factor: 2.000000
layer separation: 11.500000
maximum iterations per sample: 100
3.1. crossbar switch
a crossbar switch is a digital system that connects its inputs to its outputs according to a pattern selected by a control line. for example, if the control line is equal to 0, input 1 is transferred to output 1 and input 2 to output 2 (bar state); if the control line is equal to 1, input 1 is transferred to output 2 and input 2 to output 1 (cross state). the crossbar switch therefore allows connections between nodes to be made either straight or crosswise. fig. 4 shows the bar and cross states configured using the control line.
fig. 4 crossbar switch configuration
table 1 shows the truth table of the crossbar switch, which can be simplified with a karnaugh map to obtain the relationship between its inputs and outputs. eqs. 1 and 2 give this relationship: each of the outputs op0 and op1 has the form of a 2:1 multiplexer (mux), and both multiplexers are driven by the same control line c.
table 1 truth table for crossbar switch
control line | ip1 | ip0 | op1 | op0
0 | 0 | 0 | 0 | 0
0 | 0 | 1 | 0 | 1
0 | 1 | 0 | 1 | 0
0 | 1 | 1 | 1 | 1
1 | 0 | 0 | 0 | 0
1 | 0 | 1 | 1 | 0
1 | 1 | 0 | 0 | 1
1 | 1 | 1 | 1 | 1
$$op_0 = ip_0 \cdot \overline{c} + ip_1 \cdot c \qquad (1)$$
$$op_1 = ip_0 \cdot c + ip_1 \cdot \overline{c} \qquad (2)$$
fig. 4 shows the data-transfer states of the crossbar switch, and eqs. 1 and 2 are used to design it. the two equations correspond to two multiplexers, which produce the bar and cross states under the control line. fig. 5 shows the wiring of a crossbar switch built from two multiplexers according to eqs. 1 and 2.
fig. 5 crossbar switch using two multiplexers
fig. 6 shows the mux circuit used in this study, and fig. 7 shows its simulation results. in this mux, if c=0 the output is equal to ip1, and if c=1 the output is equal to ip0.
fig. 6 mux circuit
fig. 7 mux simulation result
the qca design of the crossbar switch is presented next. fig. 8 shows the implementation of the crossbar switch in qca technology, and fig. 9 illustrates its simulation results. table 2 compares the cell count, occupied area, and latency of the proposed crossbar switch with those reported in previous works.
fig. 8 crossbar switch with qca
fig. 9 result simulation for crossbar switch
fig. 9 illustrates the transfer of inputs to outputs: for c=0, the ip0 and ip1 data are transferred to op0 and op1, respectively (red square in fig. 9); for c=1, ip0 data are transferred to op1 and ip1 data to op0 (blue square in fig. 9).
table 2 comparison of the crossbar switch with other works
structure | cell count | total area (µm²) | latency (clock zones)
crossbar switch [18] | 101 | 0.096 | 8
crossbar switch [19] | 157 | 0.25 | 6
crossbar switch [20] | 123 | 0.137 | 8
crossbar switch [36] | 81 | 0.08 | 4
crossbar switch, this work | 62 | 0.06 | 4
according to table 2, the proposed crossbar switch has the lowest cell count and occupied area among the compared designs and matches the lowest latency. the circuit designed in this study uses 62 cells, occupies 0.06 µm², and has a latency of 4 clock zones, i.e. one clock pulse.
3.2. energy dissipation on crossbar switch
in this section, the energy dissipated in the proposed qca crossbar switch is investigated.
calculating the consumed energy requires the hamiltonian matrix, which is obtained using the hartree-fock approximation. eq. 3 shows the hamiltonian matrix in qca [32, 33], where P is the polarization value in binary qca, equal to 1 or -1, E_k is the kink energy, and γ is the tunneling energy.
$$H = \begin{bmatrix} -\dfrac{E_k}{2} P & -\gamma \\ -\gamma & \dfrac{E_k}{2} P \end{bmatrix} \qquad (3)$$
the energy of a qca cell can be obtained from the hamiltonian matrix and a quantum measurement on the cell. herein, the dissipated energy was calculated using the qcapro software for tunneling energies of 0.5 E_k, 1 E_k, and 1.5 E_k, as shown in table 3. fig. 10 depicts the thermal layout of the crossbar switch circuit for energy dissipation.
fig. 10 thermal layout for average energy dissipated in crossbar switch circuit: (a) 0.5 E_k, (b) 1 E_k, (c) 1.5 E_k
table 3 energy dissipation for crossbar switch circuit
γ/E_k | 0.5 | 1 | 1.5
avg ediss (meV) | 109.19 | 133.85 | 166.17
max ediss (meV) | 219.46 | 227.44 | 244.82
min ediss (meV) | 17.14 | 55.24 | 100.20
4. baseline network
in the previous section, a crossbar switch was simulated using two multiplexers and was studied in terms of cell number, occupied area, latency, and consumed energy. the baseline network is now simulated using the proposed crossbar switch and investigated for several selected states. the baseline network is one of the multistage interconnection networks (mins) and a subset of the delta network; it consists of several layers, each of which includes several 2×2 switches [34, 35]. each baseline network consists of 2^q rows and q+1 sets (stages), and each node contains a 2×2 switch. the baseline network presented here is 8×8, with eight inputs and eight outputs, and is built from the proposed crossbar switch. fig. 11 shows the building block of the baseline network.
fig. 11 baseline network diagram
4.1. implementation of baseline network by qca
the aim of the present study was to implement a baseline network that is optimal in terms of the number of cells and occupied area, and to simulate it using qca technology and the qca designer software. the proposed baseline network has eight inputs, eight outputs, and 12 crossbar switches. fig. 12 illustrates the baseline network designed in qca technology. the simulated baseline network uses 1713 qca cells, occupies 2.89 µm², and has a latency of 5 clocks, or 20 clock zones. table 4 compares these results with a previous design.
table 4 baseline network comparison results
structure | cell count | total area (µm²) | latency (clock zones)
baseline network [19] | 2491 | 3.85 | 18
baseline network, this work | 1713 | 2.89 | 20
fig. 12 designed baseline network with qca
4.2. simulation scenarios
in the simulation of the baseline network, eight input lines and 12 control lines were used. with so many input and control lines the displayed waveforms become hard to read; for this reason, the vector table setup of the qca designer software was used to define and display the inputs and control lines. in the first scenario, all control inputs are equal to 0. the simulation results are shown in fig. 13, in which the input data are transported to the outputs. table 5 lists the input-to-output transmissions set by the control lines. the purple box in fig. 13 marks the latency of the baseline network, which is equal to 5 clock pulses, and the red box marks the baseline network outputs.
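the routing behavior of the network can be cross-checked with a small behavioral model. the sketch below is not the qca layout: it assumes the textbook 8×8 baseline topology (three stages of four 2×2 switches, with an inverse perfect shuffle over all lines after the first stage and over each half after the second) and models each switch with eqs. 1 and 2. the function names and the convention that `controls[s][j]` is the control line of switch j in stage s (both counted from 0) are illustrative assumptions.

```python
def crossbar(ip0, ip1, c):
    """2x2 crossbar switch following eqs. 1 and 2: bar for c=0, cross for c=1."""
    op0 = ip1 if c else ip0
    op1 = ip0 if c else ip1
    return op0, op1


def inverse_shuffle(lines, block):
    """Inverse perfect shuffle inside each consecutive block of `block` lines."""
    out = [None] * len(lines)
    half = block // 2
    for start in range(0, len(lines), block):
        for k in range(block):
            # even positions move to the upper half, odd positions to the lower half
            dest = start + (k // 2 if k % 2 == 0 else half + k // 2)
            out[dest] = lines[start + k]
    return out


def baseline_route(inputs, controls):
    """Return the list whose element k is the input that arrives at output k."""
    n = len(inputs)
    stages = n.bit_length() - 1  # 3 stages for 8 inputs
    lines = list(inputs)
    for s in range(stages):
        switched = []
        for j in range(n // 2):
            switched.extend(crossbar(lines[2 * j], lines[2 * j + 1], controls[s][j]))
        lines = switched
        if s < stages - 1:
            lines = inverse_shuffle(lines, n >> s)
    return lines


def routing_table(controls, n=8):
    """Map each input label to the output label it reaches."""
    arrived = baseline_route([f"i{k}" for k in range(n)], controls)
    return {label: f"op{k}" for k, label in enumerate(arrived)}


# First scenario: all 12 control lines are 0 (every switch in the bar state).
all_bar = [[0] * 4 for _ in range(3)]
first_scenario = routing_table(all_bar)
```

with all control lines at 0 the model gives the same input-to-output pairs as table 5. setting every control line to 1 reproduces table 6, and crossing only the first two switches of the last stage reproduces table 7, which is consistent with reading c31 and c32 as switches 1 and 2 of stage 3; that reading is an assumption, since the control-line naming is defined only in the figures.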
table 5 state of input to output transport in the first scenario
input | output
i0 | op0
i1 | op4
i2 | op2
i3 | op6
i4 | op1
i5 | op5
i6 | op3
i7 | op7
fig. 13 output result for input state of first scenario
according to fig. 13 and table 5, when all control lines are 0, information is transferred from i0 to op0, i1 to op4, i2 to op2, i3 to op6, i4 to op1, i5 to op5, i6 to op3, and i7 to op7.
in the second scenario, all control inputs are 1; the results are shown in fig. 14, and table 6 lists the input-to-output transmissions.
fig. 14 output result for input state of second scenario
table 6 state of input to output transport in the second scenario
input | output
i0 | op7
i1 | op3
i2 | op5
i3 | op1
i4 | op6
i5 | op2
i6 | op4
i7 | op0
according to fig. 14 and table 6, when all control lines are 1, information is transferred from i0 to op7, i1 to op3, i2 to op5, i3 to op1, i4 to op6, i5 to op2, i6 to op4, and i7 to op0.
in the third scenario, control inputs c31 and c32 are 1 and the other control lines are 0; the simulation results are shown in fig. 15. table 7 lists the resulting input-to-output transmissions. fig. 15 also shows the latency and the outputs of the baseline network; the latency is again equal to 5 clock pulses.
fig. 15 output result for input state of third scenario
table 7 state of input to output transport in the third scenario
input | output
i0 | op1
i1 | op4
i2 | op3
i3 | op6
i4 | op0
i5 | op5
i6 | op2
i7 | op7
according to fig. 15 and table 7, with control lines c31 and c32 equal to 1 and all other control lines equal to 0, information is transferred as listed in table 7.
5. conclusion
in this paper, an optimized crossbar switch was studied in terms of cell number, occupied area, number of clocks, and energy consumption.
the switch uses 62 qca cells, occupies an area of 0.06 µm², and has a latency of 4 clock zones, which is more efficient than the other designs presented in the literature. the energy consumed by the switch for 0.5 E_k, 1 E_k, and 1.5 E_k was calculated with the qcapro software. the baseline network was then designed from the presented crossbar switch with 1713 cells and an occupied area of 2.89 µm². to validate data transfer in the simulated baseline network, three scenarios were considered. in the first scenario, all control lines are equal to 0, and information is transferred from i0 to op0, i1 to op4, i2 to op2, i3 to op6, i4 to op1, i5 to op5, i6 to op3, and i7 to op7. in the second scenario, all control lines are equal to 1, and information is transferred as listed in table 6. in the third scenario, control inputs c31 and c32 are equal to 1 and the other control lines are equal to 0, and information is transferred from i0 to op1, i1 to op4, i2 to op3, i3 to op6, i4 to op0, i5 to op5, i6 to op2, and i7 to op7. the baseline network was optimized in terms of the number of cells and occupied area, but its latency in clock zones increased (20 versus 18 in [19]). to verify correct operation of the designed baseline network, several input and control-line states were applied, and in each case the inputs were transferred to the outputs selected by the control lines. therefore, it can be concluded that the proposed crossbar switch can be used in optimizing other networks.
references
[1] g.e. moore, mcgraw-hill new york, ny, usa, 1965.
[2] t.j.k. liu, k. kuhn, cmos and beyond: logic switches for terascale integrated circuits, cambridge university press, 2015.
[3] s. garg, t.k. gupta, “a 4:1 multiplexer using low-power high-speed domino technique for large fan-in gates using finfet”, circuit world, 2020. https://doi.org/10.1108/cw-09-2019-0128
[4] d.k. nandhaiahgari, r.p. somineni, c.r. kumari, “design and analysis of different full adder cells using new technologies”, international journal of reconfigurable and embedded systems, vol. 9, p. 116, 2020.
[5] r. sabbaghi-nadooshan, z. shahosseini, d. rezaeipour, “design of new qca lfsr and nlfsr for grain-128 stream cipher”, journal of circuits, systems and computers, vol. 25, p. 1650005, 2016.
[6] s. zoka, m. gholami, “two novel d-flip flops with level triggered reset in quantum dot cellular automata technology”, international journal of engineering transactions c: aspects, vol. 31, no. 3, pp. 415–421, 2017.
[7] j. iqbal, f. khanday, n. shah, “design of quantum-dot cellular automata (qca) based modular 2^n−1−2^n mux-demux”, in proceedings of the impact-2013, 2013, pp. 189–193.
[8] s. kamrani, s.r. heikalabad, “a unique reversible gate in quantum-dot cellular automata for implementation of four flip-flops without garbage outputs”, international journal of theoretical physics, vol. 57, pp. 3340–3358, 2018.
[9] f. ahmad, “an optimal design of qca based 2^n:1/1:2^n multiplexer/demultiplexer and its efficient digital logic realization”, microprocessors and microsystems, vol. 56, pp. 64–75, 2018.
[10] h.r. roshany, a. rezai, “novel efficient circuit design for multilayer qca rca”, international journal of theoretical physics, vol. 58, pp. 1745–1757, 2019.
[11] l.e. arani, a. rezai, “novel circuit design of serial–parallel multiplier in quantum-dot cellular automata technology”, journal of computational electronics, vol. 17, pp. 1771–1779.
[12] m.n. divshali, a. rezai, s.f.h. samidpour, “design of novel coplanar counter circuit in quantum dot cellular automata technology”, international journal of theoretical physics, vol. 58, pp. 2677–2691.
[13] m.n. divshali, a. rezai, a. karimi, “towards multilayer qca siso shift register based on efficient dff circuits”, international journal of theoretical physics, vol. 57, pp. 3326–3339, 2018.
[14] s.m. mohaghegh, r. sabbaghi-nadooshan, m. mohammadi, “design of a ternary qca multiplier and multiplexer: a model-based approach”, analog integrated circuits and signal processing, vol. 101, no. 1, pp. 23–29, 2019.
[15] a. shiri, a. rezai, h. mahmoodian, “design of efficient coplanar comparator circuit in qca technology”, facta universitatis, series: electronics and energetics, vol. 32, pp. 119–128, 2019.
[16] r. sabbaghi-nadooshan, m. modarressi, h. sarbazi-azad, “the 2d digraph-based nocs: attractive alternatives to the 2d mesh noc”, the journal of supercomputing, vol. 59, no. 1, pp. 1–21, 2012.
[17] r. sabbaghi-nadooshan, a. patooghy, “analytical performance modeling of de bruijn inspired mesh-based network-on-chips”, microprocessors and microsystems, vol. 39, no. 1, pp. 27–36, 2015.
[18] j.c. das, d. de, “design of single layer banyan network using quantum-dot cellular automata for nanocommunication”, optik, vol. 172, pp. 892–907, 2018.
[19] m.a. tehrani, f. safaei, m.h. moaiyeri, k. navi, “design and implementation of multistage interconnection networks using quantum-dot cellular automata”, microelectronics journal, vol. 42, pp. 913–922, 2011.
[20] j.c. das, d. de, “circuit switching with quantum-dot cellular automata”, nano communication networks, vol. 14, pp. 16–28, 2017.
[21] c.s. lent, p.d. tougaw, w. porod, g.h. bernstein, “quantum cellular automata”, nanotechnology, vol. 4, p. 49, 1993.
[22] a. orlov, i. amlani, g. bernstein, c. lent, g. snider, “realization of a functional cell for quantum-dot cellular automata”, science, vol. 277, pp. 928–930, 1997.
[23] z. mohammadi, k. navi, r. sabbaghi-nadooshan, “design of testable reversible latches by using a novel efficient implementation of fredkin gate”, international journal of electronics, vol. 107, no. 6, pp. 859–878, 2020.
[24] z. taheri, a. rezai, h. rashidi, “novel single layer fault tolerance rca construction for qca technology”, facta universitatis, series: electronics and energetics, vol. 32, no. 4, pp. 601–613, 2019.
[25] r. kianpour, r. sabbaghi-nadooshan, “novel design of n-bit controllable inverter by quantum-dot cellular automata”, international journal nanoscience and nanotechnology, vol. 10, no. 2, pp. 117–126, 2014.
[26] j.r. monfared, a. mousavi, “design and simulation of nano-arbiters using quantum-dot cellular automata”, microprocessors and microsystems, vol. 72, p. 102926, 2020.
[27] m. abutaleb, “a novel qca shuffle-exchange network architecture with multicast and broadcast communication capabilities”, microelectronics journal, vol. 93, p. 104640, 2019.
[28] r. kianpour, r. sabbaghi-nadooshan, “optimized design of multiplexor by quantum-dot cellular automata”, international journal nanoscience and nanotechnology, vol. 9, no. 1, pp. 15–24, 2013.
[29] j.c. das, d. de, “nanocommunication network design using qca reversible crossbar switch”, nano communication networks, vol. 13, pp. 20–33, 2017.
[30] l. silva, l. sardinha, m. vieira, l. vieira, o.v. neto, “robust serial nanocommunication with qca”, ieee trans. nanotechnol., vol. 13, no. 3, pp. 464–472, 2015.
[31] h.a. mousavi, p. keshavarzian, a.s. molahosseini, “a novel fast and small xor-base full-adder in quantum-dot cellular automata”, appl nanosci, vol. 10, pp. 4037–4048, 2020.
[32] s. srivastava, s. sarkar, s. bhanja, “power dissipation bounds and models for quantum-dot cellular automata circuits”, in proceedings of the sixth ieee conference on nanotechnology, 2006, vol. 1, pp. 375–378.
[33] j. timler, c.s. lent, “power gain and dissipation in quantum-dot cellular automata”, journal of applied physics, vol. 91, pp. 823–831, 2002.
[34] j. duato, s. yalamanchili, l.m. ni, “interconnection networks: an engineering approach”, morgan kaufmann, 2003.
[35] d. tutsch, m. marcus brenner, “min simulate. a multistage interconnection network simulator”, in proceedings of the 17th european simulation multiconference: foundations for successful modelling & simulation, 2003, pp. 211–216.
[36] a. chandrasekaran, k. senthil kumar, k. hemalatha, k.s. tamilselvan, p. umarani, “design of coplanar circuit switching network in quantum dot cellular automata”, international journal of recent technology and engineering, vol. 8, no. 4, pp. 10611–10619, 2019.

facta universitatis series: electronics and energetics vol. 30, no 3, september 2017, pp. 363-373
doi: 10.2298/fuee1703363m

lte and wifi co-existence in 5 ghz unlicensed band
nenad milošević¹, bojan dimitrijević¹, dejan drajić², zorica nikolić¹, milorad tošić¹
¹ university of nis, faculty of electronic engineering, nis, serbia
² university of belgrade, school of electrical engineering, belgrade, serbia

abstract. since future mobile networks will require significantly higher data throughput, and the long-term evolution (lte) licensed bands are already occupied, the frequency band extension and the data rate increase may be achieved by using some of the available unlicensed bands. the most appropriate unlicensed band for this purpose lies in the 5 ghz frequency range. however, this band is already occupied by wifi networks, and special attention has to be paid to coordinating these two different networks in the shared spectrum. therefore, this paper considers shared-access co-existence in the 5 ghz unlicensed band between uncoordinated lte and wifi networks. more precisely, it considers the influence of the lte downlink transmission on the performance of the wifi networks.
the experimental results show that lte significantly degrades the wifi network performance, which means that some coordination algorithm has to be employed.
key words: wifi, lte, co-existence, unlicensed band, shared access
1. introduction
the mobile communications industry has been growing rapidly over the past decade, and mobile data transfer has been almost completely based on the licensed spectrum. given the predictions of a 1000-fold growth in cellular data traffic until 2020 [1], and the increasing amount of machine-to-machine data transfer [2], it is clear that licensed band communications would have problems supporting such a high bandwidth demand. one possible solution to this problem is to use some additional spectrum outside the dedicated licensed band, while causing minimum interference to the existing systems in that frequency band. the co-existence of mobile communication networks (global system for mobile (gsm) and long-term evolution (lte)) and digital terrestrial video broadcasting (dvb-t) systems is analyzed in [3]. the paper shows that there could be a significant mutual influence between these systems. besides, the available ultra high frequency (uhf) bandwidth is not very large. all of this indicates that the uhf tv bands are not very appropriate for increasing the bandwidth of mobile communication systems.
received september 5, 2016; received in revised form december 7, 2016
corresponding author: dejan drajić, university of belgrade, school of electrical engineering, belgrade, serbia (e-mail: ddrajic@etf.rs)
on the other hand, the unlicensed bands are particularly suitable for the bandwidth extension. the unlicensed bands consist of the industrial, scientific and medical (ism) and unlicensed national information infrastructure (u-nii) bands.
ism bands occupy frequencies around 900 mhz, 2.4 ghz, and 5.8 ghz, whereas u-nii occupies frequencies from 5 to 5.8 ghz. the 2.4 ghz band provides around 80 mhz of bandwidth, but it is heavily occupied by 802.11b/g wifi networks, bluetooth, and other wireless personal area networks. the 5 ghz band, on the other hand, provides around 500 mhz of bandwidth and is lightly occupied, mainly by 802.11ac/n wifi networks. both 2.4 and 5 ghz wifi use carrier sense multiple access (csma) to access the channel, and they are possible victims of other technologies operating in the same frequency range. bluetooth uses csma for data transmission and time division multiple access (tdma) for audio transmission; therefore, during audio transmission bluetooth may cause interference to other networks. considering the existing interference and the available bandwidth, the 5 to 5.8 ghz band was chosen for the bandwidth extension [4]. however, the implemented technology should be flexible enough to support other frequency bands.
lte was first defined in 3rd generation partnership project (3gpp) release 8 [5]. it is an evolving mobile communication standard that provides high data rates, higher capacity, lower latency, and new levels of user experience. in 3gpp release 10 [6], lte was improved to fulfil the requirements of 4g mobile networks and was named lte-advanced (lte-a). the most important advancement of lte-a is the possibility of using multiple frequency bands simultaneously by means of the carrier aggregation (ca) technology. ca is the key technology that enables lte devices to use the unlicensed spectrum. however, the unlicensed spectrum would only be used to increase the data rate, both in downlink and uplink, while the licensed spectrum, which has predictable performance, will still be used for the important operations, such as network management, delivery of critical information, and guaranteed quality of service.
although the unlicensed band may be freely used by communication systems, some regulations have to be followed, such as dynamic frequency selection (dfs) and listen-before-talk (lbt), which may rely on different techniques, such as carrier sense multiple access or spectrum sensing [7]. these coordination mechanisms, which are variants of dynamic spectrum access (dsa), are essential for achieving efficient co-existence between different systems operating in the unlicensed spectrum. as the 5 ghz band is primarily used by ieee 802.11ac wifi networks, the focus should be on the coordination between lte and wifi. the main problem lies in the fact that lte was designed to operate in a dedicated, licensed band. therefore, it does not have shared access mechanisms, as wifi does. papers [8] and [9] provide simulation and theoretical results, respectively, on the co-existence of lte and wifi networks and show the need for some sort of coordination between the two. an experimental analysis of 2.4 ghz wifi communication influenced by lte is given in [10]. there, lte is represented only by the base station, without any mobile stations; in this case, the lte enb waits for the ue and transmits mainly control signals.
there are two possible solutions to the problem of wifi and lte co-existence. the first approach is to modify the lte standard and adapt it to work in a frequency-shared environment. lte-u (lte-unlicensed), proposed by the lte-u forum [11], uses an lte version with a duty cycle, i.e. with pauses in the transmission. in this way, wifi has the opportunity to transmit its data during the silent periods of lte-u. besides, the lte-u access point listens to wifi transmissions, tries to predict the usage patterns, and adapts to them.
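the time sharing behind the lte-u duty cycle can be illustrated with a back-of-the-envelope model. the sketch below is idealized: it assumes that wifi can use every silent interval completely and loses all airtime while lte-u is on, ignoring contention, collisions at the on/off edges, and rate adaptation. the function names and the numbers are illustrative assumptions and are not taken from the lte-u specification or from the measurements in this paper.

```python
def wifi_airtime_share(lte_on_ms: float, period_ms: float) -> float:
    """Fraction of each duty-cycle period left free for WiFi (idealized)."""
    if period_ms <= 0 or not 0 <= lte_on_ms <= period_ms:
        raise ValueError("need 0 <= lte_on_ms <= period_ms and period_ms > 0")
    return 1.0 - lte_on_ms / period_ms


def wifi_throughput_bound(clean_mbps: float, lte_on_ms: float, period_ms: float) -> float:
    """Upper bound on WiFi throughput: interference-free rate scaled by free airtime."""
    return clean_mbps * wifi_airtime_share(lte_on_ms, period_ms)


# Hypothetical example: LTE-U on for 20 ms out of every 80 ms.
share = wifi_airtime_share(20, 80)
```

in this hypothetical setting wifi keeps three quarters of the airtime, so its throughput could not exceed three quarters of the interference-free value. real co-existence results can be noticeably worse, since carrier sensing and collisions at the start of each lte-u burst are not modeled.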
licensed assisted access (laa) will be a part of the future 3gpp lte release 13 standard [12], [13], and includes a listen-before-talk (lbt) mechanism so that transmission takes place only when the channel is free. the standardization progress and a summary of laa are given in [14], where an operator-level system performance is also analyzed for indoor hotspot, indoor office, and outdoor small cell scenarios. the analysis showed that a significant lte capacity increase may be obtained by using laa and lbt. paper [15] considers the design of lbt for the laa system and analyzes the influence of the laa clear channel assessment threshold on the performance of both lte and wifi networks. it shows that the proposed lbt algorithm is able to improve laa performance while keeping the interference to wifi low. however, both lte-u and laa require significant modifications of the lte standard and will not be available in the near future.
the second approach is to introduce coordinated access to the shared channel. there are two general approaches to spectrum coordination [16]: reactive spectrum coordination and proactive spectrum coordination. the most straightforward reactive concept is the so-called agile wideband radio scheme [17]. in this scheme, the transmitter analyzes the spectrum and chooses its frequency band and modulation scheme, taking into account the highest allowed interference level. there is no higher-level coordination with the neighboring nodes. this coordination scheme is very simple, but it has one serious potential problem: hidden nodes, i.e. nodes that are not visible to the station but may still interfere with it. another simple coordination scheme is reactive control [18]. all the radio stations in a network control their transmit power, rate, or frequency band so as to optimize channel quality and interference levels. the name reactive comes from the fact that a station changes its parameters in reaction to changes in the wireless environment.
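the reactive idea of sensing the spectrum and then picking a band can be illustrated with a toy selection rule. this is only a sketch under simplifying assumptions: the station is assumed to have one sensed power value per candidate channel, and the single threshold stands in for the allowed interference level; the function name, the threshold, and the sample readings are hypothetical and do not come from [17] or [18].

```python
from typing import Mapping, Optional


def pick_channel(sensed_dbm: Mapping[int, float], max_sensed_dbm: float) -> Optional[int]:
    """Return the quietest sensed channel, or None if even that one is too busy."""
    if not sensed_dbm:
        return None
    # choose the channel with the lowest sensed power
    quietest = min(sensed_dbm, key=sensed_dbm.get)
    # transmit only if the quietest channel is at or below the allowed level
    return quietest if sensed_dbm[quietest] <= max_sensed_dbm else None


# Hypothetical readings (dBm) on four 5 GHz channels.
readings = {36: -62.0, 40: -85.0, 44: -71.5, 48: -90.0}
choice = pick_channel(readings, max_sensed_dbm=-82.0)
```

with these made-up readings the rule selects channel 48. the hidden node problem is visible in the model: a transmitter that the station cannot sense contributes nothing to the readings, so a channel can look free while a nearby receiver is still being interfered with.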
although these schemes are simple, with low software and hardware complexity, their application is limited to simple scenarios. proactive spectrum coordination schemes are slightly more complex than the reactive ones. an example is the spectrum etiquette protocol [19]. this scheme employs distributed coordination by means of either internet services or a separate coordination radio channel, reserved for this purpose within the frequency band common to all participating radio nodes. such schemes enable radio nodes that use different radio access technologies to coordinate their activities and adjust their transmit parameters for successful joint operation. the etiquette approach is capable of operating in more complex scenarios than the reactive schemes. the common spectrum coordination channel (cscc) variant of the etiquette approach is given in [19], [20], together with proof-of-concept experiments for co-existing ieee 802.11b/g and bluetooth networks in the shared 2.4 ghz unlicensed band. with the coordination approach, only minor modifications of the existing standards are needed. however, the best solution would be to use coordination together with lte-u or laa.
the analyzed literature shows a lack of experimental results for lte and wifi co-existence in the 5 ghz band. this paper gives experimental data on the interference caused by lte to wifi in the 5 ghz unlicensed band. unmodified versions of the existing standards are used: 802.11a for wifi and 3gpp release 10 for lte. since no commercial lte hardware that operates in an unlicensed band is available, we used a software-radio-based lte implementation named openairinterface (oai) [21]. oai is also meant to be used in the licensed bands, so we had to modify its source code to allow operation in the 5 ghz unlicensed band.
the experimentation is performed at the nitos testbed [22]. the rest of the paper is organized as follows. section 2 briefly describes openairinterface as well as the nitos testbed. the experiment description and the experiment results with discussion are given in section 3. finally, the concluding remarks are presented in section 4.
2. openairinterface and nitos testbed
the openairinterface lte implementation is a full real-time software implementation of 4th generation mobile cellular systems compliant with 3gpp lte standards release-8/10. oai is implemented in gnu-c and uses x86 single instruction, multiple data (simd) hardware acceleration. it is primarily targeted at the x86 real time application interface (rtai), but can be made to run in any gnu environment. oai implements both the lte enb, i.e. the lte base station, and the lte user equipment (ue), i.e. the lte mobile station. it supports both frequency-division duplexing (fdd) and time-division duplexing (tdd) configurations with 5, 10, and 20 mhz channel bandwidth. oai is designed to work with any hardware rf platform with minimal modifications. currently, two platforms are supported: eurecom exmimo2 [23], and universal software radio peripheral (usrp) x- and b-series [24]. in our experiments, we used the usrp b210. besides the usrp, an intel core i5 or i7 based pc with a usb 3.0 port is needed.
the nitos testbed consists of several experimentation environments: outdoor, indoor rf isolated, and office testbeds, which cover different experimentation scenarios (fig. 1).
fig. 1 nitos testbed block diagram (blocks: users, internet, nitos server, openflow switch, outdoor testbed, indoor rf isolated testbed, office testbed)
lte and wifi co-existence in 5 ghz unlicensed band 367
the experiments were executed at the indoor rf isolated testbed because it is the only testbed currently equipped with the usrp b210.
it consists of 4 × 11 nodes arranged in a grid (11 rows with 4 nodes each), as shown in fig. 2. the distance between the neighboring nodes is 1 m.
fig. 2 indoor rf isolated testbed topology (nodes 50 to 93)
the nodes are numbered from 50 to 93 because the previous 49 nodes are in the outdoor and office testbeds. each node consists of a pc with different rf devices attached, such as wifi, usrp, bluetooth, and lte. after the reservation of a time slot, each node may be accessed online by the user and any software may be executed.
3. experiment setup and results
3.1. experiment description
the topology of the experiment setup is shown in fig. 3. nodes 50 and 68 create an ad-hoc 802.11a wifi network. the wireless network adapters are qualcomm atheros ar9580 (rev 01). due to the regulatory domain of the wifi cards, the available channels in the 5 ghz frequency band are 36, 40, 44, and 48. channel 48, with a central frequency of 5.24 ghz, was chosen. the wifi adapter output power was set to 0 or 10 dbm in order to make it less than or equal to the output power of the usrp devices. the transmission control protocol (tcp) throughput between these two stations is generated and measured using the iperf v2 [25] application during 60 seconds, without parallel streams. the lte enb and lte ue are run on nodes 59 and 60, respectively, using the oai software. it may be noticed that the lte nodes are close to each other. that is because oai is still in the development phase and the link quality between the enb and the ue is not very good. currently, eurecom is paying the most attention to the development of the oai enb in order to make it work correctly with different commercial lte devices, such as mobile phones.
legend: wifi network, lte network, wifi node, oai node.
fig.
3 the experiment setup topology
the lte channel width may be configured using the number of resource blocks (nrb) parameter. possible channel widths are 1.4, 3, 5, 10, 15, and 20 mhz for nrb = 6, 15, 25, 50, 75, and 100, respectively. oai is configured to work in fdd mode with 5 mhz channel bandwidth, i.e. the number of resource blocks is set to 25, because oai works best with the 5 mhz channel width. the downlink frequency is set equal to the channel 48 central frequency, 5.24 ghz, and the uplink frequency offset is set to -100 mhz, i.e. the uplink frequency is 5.14 ghz. the throughput and the round-trip time (rtt) between the wifi stations are constantly measured while the lte traffic is varied. again, iperf is used, now to generate user datagram protocol (udp) traffic in the downlink of the lte network.
it should be noted that paper [8] and this paper consider a similar topic. however, the results in this paper cannot be directly compared to those obtained in [8]. namely, paper [8] analyzes the influence of the oai enb (without ues) on the wifi transmission in the 2.4 ghz band. wifi stations are located at the same testbed node, with 25 cm distance between the antennas. the distance between the oai enb and wifi was varied from 1 to 20 m. since we did not have physical access to the nitos testbed, we could not put two wifi cards on one node. also, we could not move the usrps to different nodes, and therefore could not change the distance between the lte and wifi stations.
3.2. experimental results
this section presents experimental results that show the influence of lte on the wifi network, based on the scenario described in the previous section. fig. 4 shows the wifi throughput over time for different lte traffic intensities: no lte network present, only the lte enb generating a light load with control signals, 1 mb/s, and 10 mb/s of downlink lte traffic.
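the lte channel settings described in section 3.1 can be summarized in a short sketch. it only restates the arithmetic given in the text (the nrb to channel width mapping and the uplink frequency obtained from the downlink frequency and the offset); the function and variable names are hypothetical and are not part of the oai configuration interface.

```python
# Mapping between the number of resource blocks (NRB) and the LTE channel
# width in MHz, as listed in the text.
NRB_TO_BANDWIDTH_MHZ = {6: 1.4, 15: 3.0, 25: 5.0, 50: 10.0, 75: 15.0, 100: 20.0}


def lte_carriers(downlink_mhz, uplink_offset_mhz, nrb):
    """Return (downlink, uplink, bandwidth) in MHz for an FDD configuration.

    Hypothetical helper for illustration; it is not an OAI function.
    """
    if nrb not in NRB_TO_BANDWIDTH_MHZ:
        raise ValueError(f"unsupported number of resource blocks: {nrb}")
    uplink_mhz = downlink_mhz + uplink_offset_mhz
    return downlink_mhz, uplink_mhz, NRB_TO_BANDWIDTH_MHZ[nrb]


# Values from the experiment: downlink on WiFi channel 48 (5240 MHz),
# uplink offset of -100 MHz, and 25 resource blocks.
downlink, uplink, bandwidth = lte_carriers(5240.0, -100.0, 25)
print(f"downlink {downlink} MHz, uplink {uplink} MHz, width {bandwidth} MHz")
```

keeping the mapping in one table makes it easy to check that a requested nrb value corresponds to one of the listed channel widths before an experiment is started.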
the usrp b210 output power is around 10 dbm, so the wifi output power was chosen to be equal to that of the usrp (10 dbm) and 10 db lower (0 dbm). it may be noticed that the higher the lte throughput, the lower the wifi throughput. that is because wifi senses the lte transmission and postpones its own transmission. on the other hand, lte does not use carrier sensing and transmits continuously. the wifi transmit power has almost no influence on the wifi throughput (curves a, c, and d), except for the case of light lte traffic with only the enb (curve b), because stronger wifi packets are more likely to reach the destination, even if they are hit by the lte signal during the transmission.
fig. 4 wifi throughput [mb/s] over time t [s] for different lte traffic intensity, for wifi power of 10 dbm and 0 dbm: a) no lte, b) only lte enb, c) 1 mb/s, d) 10 mb/s
besides the throughput, the transmission delay is also an important parameter of a communication network. the round-trip time, i.e. the time needed for a packet to travel from the source to the destination and back to the source, for the wifi network is shown in fig. 5. it is measured using the ping application, which sends internet control message protocol (icmp) echo request packets and waits for icmp echo reply packets. the rtt is considered for different lte traffic intensities and for different icmp packet sizes: 100, 1000, and 10000 bytes. fig. 5 shows the average value and the standard deviation of the rtt. the conclusion from fig. 4 may be applied here: higher lte throughput increases both the average value and the standard deviation of the rtt. the average value increases significantly for 10 mb/s lte throughput. on the other hand, the rtt standard deviation increases approximately exponentially with the increase of lte throughput.
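the two rtt statistics discussed above (average value and standard deviation) can be computed from a list of ping results as in the following minimal sketch. the sample values are hypothetical and are not the measured data; the use of the sample (n-1) standard deviation is also an assumption, since the text does not state which estimator was used.

```python
import statistics


def rtt_summary(rtt_samples_ms):
    """Return (average, standard deviation) of RTT samples given in ms."""
    average = statistics.mean(rtt_samples_ms)
    # Sample standard deviation (n - 1 in the denominator); an assumption.
    deviation = statistics.stdev(rtt_samples_ms)
    return average, deviation


# Hypothetical RTT samples in milliseconds, for illustration only.
samples = [1.2, 1.4, 1.1, 9.8, 1.3]
average, deviation = rtt_summary(samples)
print(f"average {average:.2f} ms, standard deviation {deviation:.2f} ms")
```

a single delayed packet, as in the hypothetical list above, raises the standard deviation much more than the average, which is one reason to report both values.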
fig. 5 wifi network rtt [ms] (average value and standard deviation) as a function of lte throughput [b/s], for packet sizes of 100, 1000, and 10000 bytes
fig. 6 wifi network average rtt [ms] as a function of lte throughput [b/s], for different values of the frequency offset Δf between the wifi and lte carrier frequencies (Δf = 0, 5, and 10 mhz), and wifi packet size a) 100 bytes, b) 1000 bytes, c) 10000 bytes
finally, fig. 6 analyzes the influence of the carrier frequency offset Δf between the wifi channel central frequency (fwifi) and the lte downlink frequency (flte).
fig. 7 mutual position of the wifi (solid line) and lte (dashed line) spectra (spectrum magnitude versus frequency) for different carrier frequency offsets: a) 0 mhz, b) 5 mhz, c) 10 mhz
we should have in mind that wifi occupies a 20 mhz bandwidth (fwifi ± 10 mhz), and lte occupies a 5 mhz bandwidth (flte ± 2.5 mhz, because nrb is chosen to be 25), as shown in fig. 7. as can be seen from fig. 7, for the 0 and 5 mhz offsets, the whole lte spectrum overlaps with the wifi spectrum and 25% of the wifi channel is occupied by lte. note that the lte carrier frequency lies within the wifi channel. for the 10 mhz offset, a half of the lte spectrum (2.5 mhz) overlaps with the wifi spectrum, and the lte carrier frequency is on the edge, or practically out of the wifi channel. the results show that the higher the offset, the lower the influence of lte on the wifi network. if the offset is 10 mhz, lte has very little influence on the wifi network. figs.
6 and 7 show that the lte carrier itself is the main cause of the interference.
4. conclusion
the influence of lte on the wifi network, sharing the same 5 ghz frequency range without coordination, is considered in this paper. the results show that the higher the lte throughput, the lower the wifi throughput. lte similarly influences the round-trip time of the wifi network packets. the influence is the highest if the lte downlink frequency is equal to the wifi channel central frequency. if the difference between these two frequencies is higher, the influence is lower. having in mind the presented results, it can be concluded that the coordination between the lte and wifi networks is very important, and it will be the topic of our future research. we are currently developing spectrum coordination based on an ontological framework. the coordination process will be centralized on one coordination server. it will communicate with the wifi and lte clients and provide them with all the parameters needed for successful co-existence in a shared frequency band.
acknowledgement: the authors thank the anonymous reviewers for their valuable suggestions and comments. the research leading to these results has received funding from the european union's seventh framework programme under grant agreement no 612050 (flex project) and from the european union's horizon 2020 research and innovation programme under grant agreement no. 687860 (softfire project).
references
[1] qualcomm, extending the benefits of lte advanced to unlicensed spectrum, http://www.qualcomm.com/media/documents/files/extending-the-benefits-of-lte-advanced-to-unlicensed-spectrum.pdf; 2014 [accessed 29.08.16].
[2] a. prijić, lj. vračar, d. vučković, d. danković, z. prijić, “practical aspects of cellular m2m systems design”, facta universitatis, series: electronics and energetics, vol. 28, pp. 541–556, december 2015.
[3] l. polak, o. kaller, l. klozar, j. sebesta, t.
kratochvil, “mobile communication networks and digital television broadcasting systems in the same frequency bands: advanced co-existence scenarios”, radioengineering, vol. 23, pp. 375–386, april 2014.
[4] 3gpp, lte in unlicensed spectrum, http://www.3gpp.org/news-events/3gpp-news/1603-lte_in_unlicensed; 2014 [accessed 29.08.16].
[5] 3gpp, 3gpp release 8, http://www.3gpp.org/specifications/releases/72-release-8; 2014 [accessed 29.08.16].
[6] 3gpp, 3gpp release 10, http://www.3gpp.org/specifications/releases/70-release-10; 2014 [accessed 29.08.16].
[7] r. deka, s. chakraborty, j. s. roy, “optimization of spectrum sensing in cognitive radio using genetic algorithm”, facta universitatis, series: electronics and energetics, vol. 25, pp. 235–243, december 2012.
[8] j. jeon, h. niu, q. c. li, a. papathanassiou, g. wu, “lte in the unlicensed spectrum: evaluating coexistence mechanisms”, in proceedings of the ieee globecom work. gc wkshps 2014, austin, tx, usa, 2014, pp. 740–745.
[9] a. babaei, j. andreoli-fang, y. pang, b. hamzeh, “on the impact of lte-u on wi-fi performance”, int j wirel inf networks, vol. 22, pp. 336–344, december 2015.
[10] s. sagari, s. baysting, d. saha, i. seskar, w. trappe, d. raychaudhuri, “coordinated dynamic spectrum management of lte-u and wi-fi networks”, in proceedings of the ieee int. symp. dyn. spectr. access networks, dyspan 2015, stockholm, sweden, 2015, pp. 209–220.
[11] lte-u forum, http://www.lteuforum.org; [accessed 19.08.16].
[12] 3gpp, 3gpp release 13, http://www.3gpp.org/release-13; 2015 [accessed 29.08.16].
[13] 3gpp, rp-151045: new work item on licensed-assisted access to unlicensed spectrum, http://www.3gpp.org/ftp/tsg_ran/tsg_ran/tsgr_68/docs/rp-151045.zip; 2015 [accessed 29.08.16].
[14] r. ratasuk, n. mangalvedhe, a. ghosh, “lte in unlicensed spectrum using licensed-assisted access”, in proceedings of the ieee globecom work. gc wkshps, austin, tx, usa, 2014, pp. 746–751.
[15] y. li, j. zheng, q. li, “enhanced listen-before-talk scheme for frequency reuse of licensed-assisted access using lte”, in proceedings of the ieee int. symp. pers. indoor mob. radio commun. pimrc, hong kong, china, 2015, pp. 1918–1923.
[16] d. raychaudhuri, x. jing, i. seskar, k. le, j. b. evans, “cognitive radio technology: from distributed spectrum coordination to adaptive network collaboration”, pervasive mob comput, vol. 4, pp. 278–302, june 2007.
[17] k. challapali, s. mangold, z. zhong, “spectrum agile radio: detecting spectrum opportunities”, in proceedings of the intern. symp. adv. radio technol, boulder, co, usa, 2004, pp. 61–65.
[18] x. jing, s. c. mau, d. raychaudhuri, r. matyas, “reactive cognitive radio algorithms for co-existence between ieee 802.11b and 802.16a networks”, in proceedings of the globecom ieee glob. telecommun. conf., st. louis, mo, usa, vol. 5, 2005, pp. 2465–2469.
[19] d. raychaudhuri, x. jing, “a spectrum etiquette protocol for efficient coordination of radio devices in unlicensed bands”, in proceedings of the ieee int. symp. pers. indoor mob. radio commun. pimrc, beijing, china, vol. 1, 2003, pp. 172–176.
[20] x. jing, d. raychaudhuri, “spectrum co-existence of ieee 802.11b and 802.16a networks using reactive and proactive etiquette policies”, mob networks appl, vol. 11, pp. 539–554, august 2006.
[21] openairinterface software alliance, openairinterface, http://www.openairinterface.org/; 2015 [accessed 29.08.16].
[22] nitlab, nitos, http://nitos.inf.uth.gr/; [accessed 29.08.16].
[23] eurecom, expressmimo2, https://twiki.eurecom.fr/twiki/bin/view/openairinterface/expressmimo2; [accessed 29.08.16].
[24] ettus, usrp x- and b-series, https://www.ettus.com/; [accessed 29.08.16].
[25] iperf, https://iperf.fr/; [accessed 29.08.16].
facta universitatis series: electronics and energetics vol. 28, no 1, march 2015, pp. 123–131 doi: 10.2298/fuee1501123f
temperature measurement performance of silicon piezoresistive mems pressure sensors for industrial applications
miloš frantlović 1,2, ivana jokić 1,2, žarko lazić 2, branko vukelić 1,2, marko obradov 1,2, dana vasiljević-radović 2, srđan stanković 1
1 school of electrical engineering, university of belgrade, serbia
2 ictm – center of microelectronic technologies, university of belgrade, serbia
abstract. temperature and pressure are the most common parameters to be measured and monitored not only in industrial processes but in many other fields, from vehicles and healthcare to household appliances. silicon microelectromechanical (mems) piezoresistive pressure sensors are the first and the most successful mems sensors, offering high sensitivity, solid-state reliability and small dimensions at a low cost achieved by mass production. the inherent temperature dependence of the output signal of such sensors adversely affects their pressure measurement performance, necessitating the use of correction methods in a majority of cases. however, the same effect can be utilized for temperature measurement, thus enabling new sensor applications.
in this paper we perform characterization of mems piezoresistive pressure sensors for temperature measurement, propose a sensor correction method, and demonstrate that a measurement error as low as ± 0.3 °c can be achieved.
key words: mems sensor, temperature measurement, sensor correction
received august 8, 2014; received in revised form december 3, 2014
corresponding author: miloš frantlović, school of electrical engineering, university of belgrade, bulevar kralja aleksandra 73, 11000 belgrade, serbia (e-mail: frant@nanosys.ihtm.bg.ac.rs)
1. introduction
the most commonly used temperature sensors for contact temperature measurement in industrial processes are those based on the seebeck effect (thermocouples), and those based on the temperature dependent resistance of platinum (resistance temperature detectors – rtds). the former do not offer high accuracy (worse than ± 0.5 °c), but have the widest temperature range, while the latter can be of very high performance (better than ± 0.05 °c for standard platinum resistance thermometers – sprts). in a typical industrial plant both temperature and pressure measurements are required at various points of the process, often at remote locations, while monitoring and control functions are centralized. industrial telemetry relies on the use of a special kind of industrial-grade instruments able to transmit their measurement indication in the form of an electrical signal from the measurement site to the control room, and therefore called industrial transmitters. during the past three decades, industrial pressure transmitters evolved from simple electronic devices that perform analog signal processing and generate an analog output signal to much more complex computerized instruments with two-way digital communication. contemporary intelligent pressure transmitters owe their high measurement performance to sensor correction techniques based on digital signal processing.
in this paper we investigate the possibility of using silicon piezoresistive mems pressure sensors for temperature measurement, utilizing hardware resources already existing in contemporary intelligent pressure transmitters. some early results of our work were presented in ref. [1], while this paper contains more comprehensive information based on measurement data obtained for a new set of sensors.
research of silicon mems piezoresistive pressure sensors, including their design, fabrication and correction techniques, has been performed at the center of microelectronic technologies (cmt) for more than 25 years [2]-[11]. one of the successful types of pressure sensing elements developed and fabricated at cmt is the sp-9, which was chosen for this work. it is intended for measurement of absolute or relative pressure in the range from 0.5 bar to 50 bar. the base material used for its fabrication is a double sided polished single crystal n-type silicon wafer (specific resistivity from 3 Ω·cm to 5 Ω·cm). four p-type piezoresistors are formed by boron diffusion on the surface of the silicon substrate, constituting a wheatstone bridge. two piezoresistors are in the radial direction and the remaining two in the transversal direction relative to the edges of a micromachined diaphragm. the diaphragm is square, 2×2 mm² in size, fabricated by anisotropic etching of silicon on the bottom side of the wafer. the thickness of the diaphragm is from 43 μm to 160 μm, depending on the nominal pressure range of the sensing element. positions of the piezoresistors are optimized for each diaphragm thickness in order to achieve the highest linearity of the output signal. the overall size of the sensing element die is 3.2×3.2×0.38 mm³. after the fabrication of the die, it is anodically bonded to a 1.7 mm thick glass support.
if a sensing element is intended for relative pressure measurement, there must be a channel through the glass support in order for the fluid at the reference pressure to reach the bottom side of the sensing element diaphragm. a photograph of the sensing element mounted on a to-5 housing is shown in fig. 1a.
an industrial-grade pressure sensor consists of a pressure sensing element (e.g. the sp-9) and a metallic sensor body that ensures optimal operating conditions for the sensing element, protects it from damage, and provides for a standardized process connection. a photograph of an industrial pressure sensor based on the sp-9 sensing element is shown in fig. 1b. a separation membrane, located inside the metallic body, eliminates a direct contact between the sensing element and a possibly electrically conductive, chemically aggressive or dirty fluid whose pressure is measured. the sensing element is surrounded by chemically inert silicone oil which is also a good dielectric.
fig. 1 photographs of a) sp-9 sensing element mounted on a to-5 housing, b) industrial pressure sensor based on the sp-9 sensing element
a simplified electrical circuit diagram of a piezoresistive sensor with current excitation is shown in fig. 2. for a typical sensing element made by cmt, the resistances r1, r2, r3, and r4 are approximately equal in the absence of the applied pressure. their value is within the range from 2 kΩ to 3 kΩ, and the temperature coefficient of the resistance is in the range from 0.13 %/°c to 0.15 %/°c.
fig. 2 simplified electrical circuit diagram of a piezoresistive pressure sensor with current excitation
2. method
2.1.
sensor characterization
in order to devise a temperature measurement method based on the resistance of the sensor's wheatstone bridge, three gauge pressure sensors based on the sp-9 sensing element are characterized in terms of their temperature response. the mechanical construction of all the sensors is the same, featuring a separation membrane and silicone oil filling. the experimental setup used for the characterization of the sensors was similar to the one described in our previous work [11], except for the relative pressure that was set to zero by leaving the sensors' pressure ports at the normal atmospheric pressure throughout the experiment. acquisition of the signals from the pressure sensors was performed using a custom designed signal acquisition unit connected to a personal computer. a simplified block diagram of the unit is shown in fig. 3.
fig. 3 simplified block diagram of the signal acquisition unit
the input circuitry connected to the sensor under test consists of a constant current source for sensor excitation (i0 = 420 μa), two 24-bit delta-sigma analog-to-digital converters (adc), two zero-drift programmable gain instrumentation amplifiers (pga), one zero-drift buffer amplifier and one high-performance resistor (rref = 5 kΩ with ±0.01% tolerance, temperature coefficient of resistance ≤ 2 ppm/°c). the amplifiers are necessary when low level signals are measured and also because of the high impedance of the sensor used as the signal source. measurements are ratiometric, with the adc reference voltage proportional to the sensor excitation current (vref = i0·rref), in order to eliminate the error introduced by variations of the excitation current. the resistance of the sensor, seen at its excitation port, is calculated as $r_{br} = \frac{r_{ref}}{a \cdot 2^{n-1}} \cdot N$, where a is the amplifier gain, n is the resolution of the adc in bits, and N is the numeric value at the adc's output (in this case a = 1 and n = 24).
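the conversion from the adc output to the bridge resistance can be written as a short sketch. it implements only the ratiometric formula stated above, with the default values taken from the text (rref = 5 kΩ, a = 1, n = 24); the function and parameter names are hypothetical and do not come from the authors' firmware.

```python
def bridge_resistance(adc_code, r_ref_ohm=5000.0, gain=1.0, adc_bits=24):
    """Bridge resistance in ohm computed from the numeric ADC output.

    Implements r_br = r_ref / (gain * 2**(adc_bits - 1)) * adc_code.
    The excitation current does not appear, because the ADC reference
    voltage is derived from the same current (ratiometric measurement).
    """
    full_scale = 2 ** (adc_bits - 1)
    return r_ref_ohm * adc_code / (gain * full_scale)


# An ADC code equal to half of the positive full scale corresponds to a
# bridge resistance equal to half of the reference resistance.
print(bridge_resistance(2 ** 22))
```

with these settings, bridge resistances from 2 kΩ to 3 kΩ fall between 40% and 60% of the positive full scale of the converter.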
in this experiment the voltage between the remaining ends of the wheatstone bridge was not measured. for the reference measurement of the pressure sensor's temperature a high performance pt-100 sensor was used. the temperature measurement block shown in the diagram is realized using the same circuitry as the one used for the pressure sensor, thus enabling the inputs of the signal acquisition unit to be interchangeable. the control & data acquisition block is based on an msp430f169 microcontroller. it controls all the unit's functions, including the communication with the pc via the rs-232 interface (the comm. interface block). the power supply block contains low-noise voltage regulators. the power consumption of the signal acquisition unit is low, so it is powered by 4 aaa batteries.
the temperature of the sensor under test is controlled using a climatic test chamber, in the range from -20 °c to 70 °c. during the sensor characterization experiment the operator sets the temperature value, waits for the sensor temperature to settle and then initiates the measurement. the process is repeated for each temperature value in a sequence. the personal computer receives the measurement data from the signal acquisition unit, displays the measurement indications and saves the data to a file. the experimentally obtained dependences of rbr on the temperature t for the three tested sensors are shown in fig. 4.
in order to evaluate the temperature measurement performance of the tested sensors without any sensor correction method applied, a linear calibration function is used. its parameters are calculated by fitting it to the obtained characterization data of each of the sensors, using the least squares method [12].
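the linear calibration step can be illustrated with a minimal least squares sketch. the data points below are hypothetical placeholders and are not the measured characterization data; the helper names are likewise assumptions made for this example.

```python
def fit_line(resistances, temperatures):
    """Least squares fit of t = intercept + slope * r.

    Returns the pair (intercept, slope).
    """
    n = len(resistances)
    mean_r = sum(resistances) / n
    mean_t = sum(temperatures) / n
    s_rr = sum((r - mean_r) ** 2 for r in resistances)
    s_rt = sum((r - mean_r) * (t - mean_t)
               for r, t in zip(resistances, temperatures))
    slope = s_rt / s_rr
    intercept = mean_t - slope * mean_r
    return intercept, slope


def max_abs_error(resistances, temperatures, intercept, slope):
    """Largest deviation of the linear indication from the reference."""
    return max(abs(intercept + slope * r - t)
               for r, t in zip(resistances, temperatures))


# Hypothetical characterization points: bridge resistance in ohm and
# reference temperature in degrees Celsius (NOT the paper's data).
r_points = [2300.0, 2400.0, 2510.0, 2630.0, 2760.0]
t_points = [-20.0, 0.0, 20.0, 45.0, 70.0]
intercept, slope = fit_line(r_points, t_points)
print(max_abs_error(r_points, t_points, intercept, slope))
```

the largest residual of the fit plays the role of the uncorrected temperature measurement error evaluated in the text.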
the temperature measurement error is calculated as the difference between the obtained temperature indication and the temperature value measured using the pt-100 sensor, at all the set temperatures. the results are shown graphically in fig. 5. it can be concluded from the diagram that the temperature measurement error exhibited by the tested sensors is within ± 4 °c.
fig. 4 experimentally obtained dependences of the resistance rbr on temperature t for three tested sensors
fig. 5 temperature measurement error ΔT as a function of temperature t for three tested sensors, without sensor correction
2.2. sensor correction method
in order to improve the measurement performance, a suitable sensor correction method must be applied. in this case a third order polynomial has been chosen for the sensor calibration function. its parameters were determined by fitting it to the sensor characterization data, using the least squares method [12]. the calibration functions obtained in this way for the three tested sensors are given in fig. 6.
3. results & discussion
the temperature measurement error with the described correction method applied can be estimated by calculations performed using the characterization data obtained in section 2.1. such a calculation indicates that the temperature measurement error is within ± 0.3 °c for the three tested sensors. in order to experimentally verify that the performance expected based on calculations can be achieved in real applications, a series of temperature measurements was performed using the same sensors with the described correction method applied. the time interval between the sensor characterization and the new series of measurements was approximately six months. an offset correction was subsequently performed at 20 °c. the temperature measurement error as a function of temperature for the three tested sensors is shown in fig. 7.
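the third order polynomial calibration of section 2.2 can be sketched in the same spirit. this is an illustrative implementation under stated assumptions, not the authors' code: the resistance is first mapped to the interval from -1 to 1 (a choice made here to keep the normal equations well conditioned), the normal equations are solved by gaussian elimination, and all names are hypothetical.

```python
def fit_polynomial(xs, ys, degree=3):
    """Least squares polynomial fit; returns coefficients c[0..degree]."""
    n = degree + 1
    # Normal equations: a[i][j] = sum(x**(i+j)), b[i] = sum(y * x**i).
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(a[row][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def make_calibration(resistances, temperatures, degree=3):
    """Return a function that maps bridge resistance to temperature."""
    r_mid = (max(resistances) + min(resistances)) / 2.0
    r_half_span = (max(resistances) - min(resistances)) / 2.0
    xs = [(r - r_mid) / r_half_span for r in resistances]
    coeffs = fit_polynomial(xs, temperatures, degree)

    def calibration(resistance):
        x = (resistance - r_mid) / r_half_span
        result = 0.0
        for c in reversed(coeffs):  # Horner evaluation
            result = result * x + c
        return result

    return calibration
```

a calibration function obtained with `make_calibration` from the characterization points of one sensor can then be applied to each new resistance reading of that sensor; an offset correction such as the one performed at 20 °c would amount to subtracting a constant from its output.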
it can be seen from the diagram that the measurement error exhibited by the tested sensors is indeed within ± 0.3 °c. this result represents a great improvement achieved by using the proposed sensor correction method, since the temperature measurement error is reduced by at least 10 times compared to the results obtained without the correction method applied.
fig. 6 calibration functions of three tested sensors
the achieved measurement accuracy is better than that of thermocouples and also surpasses a majority of dedicated semiconductor-based temperature sensors. however, typical industrial pressure sensors are neither designed nor optimized for temperature measurement, so there are some disadvantages and limitations that must be considered in practical applications. the temperature range, size and shape, and dynamic behavior of the sensors are the most important limitations, and therefore will be discussed here.
the temperature range of a silicon piezoresistive sensing element is predominantly determined by the physical properties of silicon as a semiconductor material. it extends from cryogenic temperatures to 130 °c, whereas platinum resistance thermometers can measure temperatures up to 600 °c, and certain types of thermocouples beyond 1000 °c. there is, however, a multitude of applications where the temperature is below 130 °c, including liquid fuel or water tanks and pipelines, heating, ventilating, and air conditioning (hvac) systems etc.
the size and shape of pressure sensors, as well as their mass and other properties, can differ significantly depending on intended applications.
in some cases the sensing element can be surrounded by the fluid whose pressure or temperature is measured, with only a minimal mechanical support, whereas in many industrial applications a relatively large metallic body with a protective oil filling is required (as described in the introduction). since the thermal time constant of the sensing element in the air is of the order of 10 s, the time constant of the whole sensor is predominantly determined by other sensor elements, especially the sensor body. furthermore, dynamic properties of all contact temperature measurements inevitably depend on parameters and conditions external to the sensor, which contribute to the overall thermal inertia of the system. many industrial processes involve large amounts of fluids and/or large metallic objects whose heat capacity causes the thermal response time of the system to be much greater than that of a typical industrial pressure sensor. some preliminary results indicate that the thermal time constant of the described industrial pressure sensor is approximately 400 s in still air, which will be further investigated in our future work.
fig. 7 temperature measurement error ΔT as a function of temperature t for three tested sensors, with sensor correction
4. conclusion
in this paper we presented a method for temperature measurement using mems piezoresistive pressure sensors. three such sensors made by cmt were tested and characterized for temperature measurement. the measurement error, which was within ± 0.3 °c in the observed temperature range (from -20 °c to 70 °c), can be considered as a good result, knowing that many dedicated semiconductor-based temperature sensors, as well as thermocouples, exhibit greater measurement errors. the use of piezoresistive pressure sensors instead of dedicated temperature sensors for temperature measurements has some disadvantages and limitations.
being a silicon-based semiconductor device, the pressure sensing element has a very limited temperature range (less than 130 °c) compared to some dedicated temperature sensors such as platinum resistance thermometers, and especially thermocouples. another limitation is the thermal response time of a typical industrial-grade pressure sensor. in spite of these limitations, many applications exist where the described temperature measurement method can be useful. some interesting new applications are possible. for example, in industrial processes with many pressure sensors installed there is often a need for an additional temperature measurement. the presented method enables a simple on-site conversion of a pressure transmitter into a temperature transmitter, as well as sensor validation and various multisensor configurations. in our future work in this research field we intend to improve the sensor measurement performance and to overcome the limitations by using more advanced sensor designs, materials and fabrication techniques. for example, the mentioned temperature range limitation can be overcome by fabricating sensing elements on soi (silicon-on-insulator) substrates [13]. combined pressure and temperature influences as well as dynamic properties of the sensors will be investigated. the development of sensor correction methods will be continued and expanded to other types of mems sensors.

acknowledgement: this paper is a result of the research performed within the project tr-32008 funded by the serbian ministry of education, science and technological development.

references
[1] m. frantlović, i. jokić, ž. lazić, b. vukelić, m. obradov, d. vasiljević-radović, s. stanković, "temperature measurement using silicon piezoresistive mems pressure sensors", in proc. 29th international conference on microelectronics miel 2014, 2014, pp. 159-161 (doi: 10.1109/miel.2014.6842110). [2] z.
djurić, j. matović, m. matić, n. mišović (simičić), r. petrović, m. a. smiljanić, and ž. lazić, "pressure sensor with silicon diaphragm", in proc. xiv yugoslav conference on microelectronics miel, beograd, 1986, pp. 88-100. [3] j. matović, z. djurić, n. simičić, m. matić, and r. petrović, "a nonlinear simulation of pressure sensors", in proc. 19th yugoslav conference on microelectronics miel '91, beograd, 1991. [4] d. tanasković, n. simičić, z. djurić, ž. lazić, r. petrović, j. matović, m. popović, and m. matić, "temperature characterics of silicon pressure sensor: the effect of impurity profile variation", in proc. 2nd serbian conference on microelectronics and optoelectronics miopel 93, 1993, pp. 297-302. [5] z. đurić, "rezultati istraživanja i razvoja si senzora i transmitera pritiska u ihtm – centru za mikroelektronske tehnologije i monokristale", in proc. 20th international conference on microelectronics miel, 1995 (serbian). [6] m. m. smiljanić, z. djurić, ž. lazić, m. popović, and k. radulović, "piezootporni senzori pritiska na soi pločicama namenjeni funkcionisanju na visokim temperaturama", in proc. 49th conference for electronics, telecommunications, computers, automation and nuclear engineering etran, budva, 2005,vol. 4, pp. 185-188 (serbian). [7] m. m. smiljanić, ž. lazić, z. djurić, and k. radulović, "dizajn i modelovanje modifikovanog senzora niskih pritisaka sp-6 ihtm-cmtm", in proc. 51st conference for electronics, telecommunications, computers, automation and nuclear engineering etran, herceg novi igalo, 2007, pp. mo3.2-1-4 (serbian). [8] m. m. smiljanić, z. djurić, ž. lazić, and b. popović, "soi piezootporni senzor pritiska za opseg radnih temperatura od 600c do 3000c", in proc. 52nd conference for electronics, telecommunications, computers, automation and nuclear engineering etran, palić, 2008, pp. mo2.6-1-4 (serbian). [9] m. m. smiljanić, v. jović, and ž. 
lazić, "maskless convex corner compensation technique on a (1 0 0) silicon substrate in a 25 wt% tmah water solution", j. micromech. microeng. 22 115011, 1-11, 2012, doi:10.1088/0960-1317/22/11/115011 [10] m. frantlović, i. jokić, and d. nešić, "a wireless system for liquid level measurement", in proc. 8th international conference on telecommunication in modern satellite, cable and broadcasting services telsiks, 2007, pp. 475-8. [11] m. frantlović, v. jovanov, and b. miljković, "intelligent industrial transmitters of pressure and other process parameters", telfor journal, 2009, vol. 1, no. 2, pp. 65-8. [12] j. wolberg, data analysis using the method of least squares, berlin, heidelberg: springer-verlag, 2006. [13] s. s. kumar, b. d. pant, "design principles and considerations for the ‘ideal’ silicon piezoresistive pressure sensor: a focused review", microsyst. technol., vol. 20, pp. 1213–1247, 2014.

facta universitatis series: electronics and energetics vol. 34, no 3, september 2021, pp. 401-413 https://doi.org/10.2298/fuee2103401k © 2021 by university of niš, serbia | creative commons license: cc by-nc-nd original scientific paper

evaluation of electronic readiness level (a case of financial institution)

hossein kardanmoghaddam, nafiseh sarboland
department of computer engineering, birjand university of technology, birjand, iran

abstract. electronic readiness is the ability to accept, use and apply information and communication technology in an organization. to implement information and communication technologies effectively, the first step is to measure the electronic readiness of companies and organizations to adopt these new technologies. in this research, the level of electronic readiness of mellat bank in khorasan razavi province, iran, has been studied from the perspective of the employees of the branches in the cities of feyz abad, kashmar, bajestan and gonabad and of the bazar and central branches in khorasan razavi province.
the electronic readiness of mellat bank has been evaluated in the following dimensions: strategy readiness and it policies, it infrastructure readiness, management readiness, legal-juridical readiness, culture and human resource (personnel) readiness, and process readiness. the research is descriptive in design and applied in purpose. the statistical population consists of personnel with sufficient and necessary information in the field of financial and banking activities regarding e-commerce and e-readiness, 74 people in total. fifty questionnaires consisting of 30 questions were distributed using a non-probability convenience sampling method, of which 42 were usable. spss 15 was used for the analysis. the results showed that the level of electronic readiness of mellat bank in the studied branches in khorasan razavi province is significantly higher than the theoretical average score (3) (p < 0.001), both in total and in each of its components. this demonstrates that the level of electronic readiness of mellat bank in khorasan razavi is high (above average) from the perspective of the studied personnel. also, there is no significant difference in the average score of the personnel's perspective on the level of electronic readiness of mellat bank in khorasan razavi based on gender, age, years of service, level of education, field of study or organizational position.

key words: information technology, electronic commerce, electronic readiness, electronic banking, information technology infrastructure.

received october 22, 2020; received in revised form april 20, 2021
corresponding author: hossein kardanmoghaddam, department of computer engineering, birjand university of technology, birjand, iran (e-mail: h.kardanmoghaddam@birjandut.ac.ir)

1. introduction

researchers argue that we now live in the information technology era, in which knowledge and information are an inevitable necessity; this era has emerged because of new technologies known as information and communication technology (ict) [1]. with the emergence of new technologies and the development of their applications, organizations and communities have planned structural changes by drawing a clear perspective on future goals and analyzing the current situation, and have turned to modern technologies to increase efficiency and the convenience of their citizens and personnel. to use ict effectively, countries around the world should have the necessary readiness in it infrastructure, give a large part of the population easy access to communication technologies, and provide a suitable set of rules for the proper use of these technologies. to achieve development goals, the capacities of ict should be assessed along with the electronic readiness of organizations or countries. electronic readiness means the participation rate of a community or organization in the network space [2]. ict is one of the newest achievements of humankind; not only has it undergone deep changes itself, but it also affects patterns of life, research, education, management, transportation, security and commerce. the use of the prefix “e” in many dimensions of our life reflects this [3]. by evaluating the electronic readiness of organizations, communities can assess their current situation in terms of the different dimensions of it development and raise their quality level by planning the related indices and criteria. electronic readiness has dimensions and elements such as telecommunication infrastructure, human resources, and political and legal frameworks.
most managers are aware of the potential and capabilities of ict and consider it the redeemer of their organization; however, the most important challenge facing them is harnessing the power of ict within the framework of the organization’s goals. using ict without examining the related opportunities and threats, and without assessing the organization’s weak and strong points, not only fails to remove the problem but makes it more complicated. electronic readiness is an environmental analysis tool: it detects surrounding opportunities and threats and, by expressing the weak and strong points of communities and organizations, presents a picture of their electronic readiness level. generally, the evaluation of electronic readiness gives a preliminary recognition of the environment and the available infrastructure and provides criteria and data for assessing the effects of ict. advances in ict and its extension to monetary and financial markets not only facilitate bank customers’ tasks but also change traditional banking methods. banks and financial institutions are moving toward e-banking and modern financial services, and so they play a critical role in increasing the rate of e-commerce. given the widespread application of ict in iran’s organizations and banks, the main research question is posed as: what is the level of electronic readiness of mellat bank in khorasan razavi province from the perspective of the organization’s staff?

2. research background

electronic readiness indicates personal attitudes and the tendency to use it services and products in people’s daily lives, and it can help people fulfil their professional goals [4]. electronic readiness increases the perceived convenience and perceived productivity of these services and their use [5]. another study demonstrated that electronic readiness has a significant effect on technology acceptance [6].
brueckner (2002) [7] stated that it can increase the quality of life of the citizens of a city or country, evaluated the electronic readiness of michigan municipality, and finally presented a municipal website called waes. flak et al (2005) [8] designed a model called megap-3 and used it to evaluate the electronic municipalities of norway; the results indicated that the authorities’ attitude to e-government is simplistic and that bureaucratic government is more common. in [9], the electronic readiness of two financial and commercial institutes in iran was examined and compared using the bridge institute model. according to the results, the electronic readiness of both institutes was not at a suitable level and was lower than average. shirvani and baneshi (2009), in a study called “evaluation of electronic readiness of municipality of new city, baharestan along with fulfillment of electronic municipality”, evaluated the new city of baharestan in iran using the dr. hamid noori model. they rated its electronic readiness at 38% and concluded that the municipality should increase its electronic readiness in three dimensions: technical infrastructure, systems and electronic services [10]. musa (2010) [11] evaluated the electronic readiness of municipalities in iraq; he proposed a measuring tool for electronic readiness in municipalities and applied it in two counties of iraq. tavanaa et al (2013) [12] proposed a combined fuzzy model using topsis and anp as a comprehensive model for evaluating electronic readiness in municipalities of the usa and measured the municipalities of the states with it.
seakow [13], in a study titled electronical learning evaluation in thailand in comparison with ua university, examined the effective success factors of electronic learning in us universities and compared the results with higher education in thailand. the most important and effective factors included supporting resources and online programs, well-introduced programs, precise selection of the early proposed programs, and training instructors to help develop efficient educational styles. olatokun [14], in a study titled evaluating top university of nigeria, examined how ibadan university uses the various opportunities of ict in performing its activities. in this study, five elements (the capability of using infrastructure, access to infrastructure, human resource abilities, ict policy in the organization, and the legal framework) were examined with respect to it development, and the university obtained a score of 2.57. richey (2003) [15] published the results of his research in a book called “technological readiness and strategic interactive fit: dynamic capabilities impacting logistics service competency and performance”, and stated that technological readiness may influence the general performance of the organization. however, kue (2013) [16] showed that technological readiness has a positive moderating role between information system quality (isq) and organizational performance and did not confirm a direct relationship between the two. the efficient and effective performance of electronic government requires educated citizens, a skilled workforce, and employees who do not resist the adoption of new methods in the organization [17]. vasdinus vaati (2009) [18] investigated the readiness of information and communication technologies in the institute of higher education in kenya for establishing an electronic library.
in this research, indices of information and communication technologies were designed to help managers make suitable decisions when evaluating the library’s information and communication technologies. tiemo & edewor (2011) [19] investigated the information and communication technology readiness of the libraries of the institute of higher education in nigeria. the results showed that the available equipment and facilities of the information and communication technology services of these libraries were automated. the results also list some limitations in applying information and communication technologies in these libraries, such as a poor budget, an inadequately skilled workforce, an unreliable electricity supply, inadequate technical support, poor implementation of policies, and lack of repairs. in another study, the most effective factors in the electronic readiness of small and large organizations were identified as the organization’s infrastructure, hardware and software, and workforce [20]. sivaraks et al (2011) [21] investigated the effect of electronic customer communication on the quality of services in commercial banks in thailand. their analysis showed that electronic communication with customers has a positive and significant influence on service quality. hendi et al (2013) [22] carried out a study called “electronic readiness of mashhad university libraries” based on the cspp model; it was an applied survey with a sample of 54 librarians and administrators of the libraries and informatics sections. the results indicate that there is no social, technical or legal difference between the libraries of the different universities; only public universities are in a more suitable situation in terms of the economic dimension and the cspp model.
by analyzing electronic readiness in university libraries, strengths and weaknesses were recognized; in other words, the digital gap was identified. oraee et al (2013) [23] performed a study called “electronic readiness of esfahan university libraries”. the results indicated that the accessibility and readiness of the information and communication technology infrastructure, information services, the readiness of activities requiring information and communication technologies, information security, management readiness, and organizational culture were above the average level, whereas the readiness of organizational characteristics, communication with the external environment, the extent of policies, strategies and legal relations concerning information and communication technologies, financial readiness, the human workforce, and the extent of information and communication technology usage were not above the average level. lee & chieng (2014) [24] consider electronic banking services from three aspects; they believe that bank customers can receive electronic banking services at three levels (informing level, communication level, and transaction level). baversad et al (2015) [25] analyzed the impact of information and communication technology (ict) on the performance of fajr petrochemical company by applying the balanced score card (bsc) model. the results indicated that the greatest impact of information technology is on internal processes and the least is on financial recovery. overall, the improvement of organizational performance based on the bsc model is affected mostly by information technology. pollack & adler (2017) [26] carried out a study called “skills that improve profitability: the relationship between project management, it skills, and small to medium enterprise profitability”.
in this research, it was assumed that the use of project management and it capabilities benefits the commercial performance of the organization. the hypothesis was examined by testing for a positive impact of the use of project management and it capabilities on total sales and profitability. the data came from two governmental longitudinal surveys of small to medium companies in australia. models describing the relationship between project management, it capabilities, profitability and total sales were created using multiple linear regression and binary logistic regression. the results indicated that project management and it capabilities have a positive and distinct effect on sales and profitability when controlling for the effect of other business skills. khaemba et al (2017) [27], in a study called “factors influencing the readiness of citizens for e-government systems in kenya”, indicate that the indices related to countries’ benefit from information and communication technologies (ict) are insufficient for measuring citizens’ electronic readiness. the results indicate that not paying attention to users acts as a disincentive in electronic readiness; that is, for any electronic government plan to succeed, citizens ought to use the system fully. the study lists the following factors affecting citizens’ electronic readiness, in order: poor infrastructure and budget limitations, executive performance, skills and attitude, citizen participation, the digital gap (the discrepancy among citizens in benefiting from technology facilities), culture, privacy, security concerns, etc.
this research indicates that the factors influencing electronic readiness have to be considered across the different social and economic groups in society. tan, wang & sedera [28] presented a model that shows how to use it to achieve operational agility in a company. the model covers it operational agility, new resource management capabilities, the negotiation process, and management actions for using it in the supply chain. the results enable managers to apply it capabilities with better methods and to take a step toward operational agility. masuri et al (2017) [29] investigated various models and identified indices of the electronic readiness of government agencies for establishing a human resource management system. in this research, dimensions and indices suitable for assessing organizations’ electronic readiness to establish human resource management systems were identified and extracted. the results of the study by salek ranjbarzadesh et al (2013) [30] on the electronic readiness of the medical university of tabriz indicated that university personnel, students, managers and it specialists generally had acceptable electronic readiness. noori et al (2007) [31] evaluated the electronic readiness of the colleges of ferdowsi university of mashhad based on information access. the required information was collected for the main parts of the model: organizational readiness, informational readiness, infrastructure readiness, human resource readiness and environmental readiness. the results indicated that the colleges have weak points in most of these parts, especially organizational, environmental and human resource readiness. in this study, the level of electronic readiness has been examined in a number of mellat bank branches in northeastern iran.
mellat bank is currently one of the largest banks in iran. it was named the largest private company among the top 100 iranian companies in 2017. the bank has more than 1390 branches throughout iran and abroad, in locations such as london, yerevan, malaysia, hamburg, istanbul, ankara, izmir and seoul, and more than 19,000 personnel providing banking services [32-34]. mellat bank was included in the list of the 30 largest banks in the middle east in 2020, with a capital value of $82.33 billion, and is ranked 16th among the major banks in the middle east and africa. the bank increased its capital in one year and moved from 20th place in 2019 to 16th place in 2020 [35]. khorasan razavi province is one of the most important border provinces of iran for trade in money and goods. mellat bank has many exchanges with other countries, especially afghanistan and turkmenistan. numerous traders and merchants use the banking and foreign exchange systems in this region, and cooperation between these three countries (iran-afghanistan-turkmenistan) is needed for a comprehensive evaluation of electronic readiness. so far, there has been no comprehensive research on this issue in this region that includes all three countries. the level of electronic readiness can be examined by similar studies in the neighboring provinces of iran, afghanistan and turkmenistan, as these regions have high exchanges of currency, goods and services as well as intensive banking communications owing to rail transportation. moreover, a model of the electronic readiness of the region can be proposed through a comprehensive analysis and an assessment of the strengths and weaknesses of the region. by assessing the level of electronic readiness, organizations can plan more confidently for the development of it-based activities. this research was conducted on the personnel of one of the biggest and most important banks of iran (mellat) in a geographical area, which represents the advantage of this study compared to previous studies, as they were not performed on this scale.

3. research

in the present research, the questionnaire from studies [26][36] was used to capture the attitudes of mellat bank personnel in khorasan razavi province (iran); it uses a 5-point likert scale from totally disagree to totally agree (minimum score 1, maximum score 5). the questionnaire had 30 questions, and its operational variables were strategic readiness and it policies (5 items), it infrastructure readiness (7 items), management readiness (5 items), legal-juridical readiness (4 items), human resources (personnel) readiness and culture (5 items), and process readiness (4 items). test reliability was assessed with cronbach's alpha, calculated in spss 15 for the questions related to each variable. the reliability of the questions measuring each variable is presented in table 1.

table 1 reliability of the questionnaire items

number | variable | cronbach's alpha coefficient
1 | strategy readiness and it policies | 0.704
2 | it infrastructure readiness | 0.701
3 | management readiness | 0.822
4 | legal-juridical readiness | 0.704
5 | culture and human resource (personnel) readiness | 0.861
6 | process readiness | 0.702
7 | total alpha | 0.897

the cronbach's alpha coefficients obtained for all of the questions and for the individual variables indicate that the questionnaire has sufficient reliability. the coefficient for all variables together was 0.897. in the descriptive part, descriptive statistics (frequency, percentage) were used; in the analysis part, the research hypotheses were examined with the one-sample t-test, the independent t-test and one-way analysis of variance at a significance level of 0.05. this research is quantitative in nature, applied in purpose, and descriptive in method.
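the reliability coefficients in table 1 could in principle be recomputed from the raw item scores. the following is a minimal python sketch of cronbach's alpha, assuming the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of the total scores) with sample variances. the function name `cronbach_alpha` and the example responses are hypothetical and are not the study's data; the paper itself computed its coefficients in spss 15.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Hypothetical helper for illustration; uses sample variances (n - 1).
    """
    k = len(responses[0])  # number of items in the scale
    if k < 2:
        raise ValueError("need at least two items")
    items = list(zip(*responses))  # transpose: one tuple of scores per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in responses])  # variance of row totals
    return k / (k - 1) * (1 - item_var_sum / total_var)


# made-up 5-point likert answers: 5 respondents x 3 items (not the study's data)
example = [
    [4, 5, 4],
    [3, 4, 3],
    [5, 5, 4],
    [2, 3, 3],
    [4, 4, 5],
]
print(round(cronbach_alpha(example), 3))  # 0.857
```

for the made-up answers above the coefficient is 6/7, which would count as sufficient under the commonly used 0.7 threshold that the values in table 1 also exceed.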
the population of this research comprised the personnel of the mellat bank branches in feyz abad, kashmar, bajestan, gonabad, bazar and the headquarters in khorasan razavi province in iran during 2020. these subjects had sufficient and necessary information on financial and banking activities in e-commerce and electronic readiness. the population totals 74 subjects; using non-probability convenience sampling, 50 questionnaires of 30 questions each were distributed, of which 42 were usable. the frequency distribution of the studied personnel was as follows. in terms of gender, 38 subjects (90.5%) were male and 4 subjects (9.5%) were female. 27 subjects (64.3%) were 40 years old or younger and 15 subjects (35.7%) were over 40 years old. 35 subjects (83.3%) had 20 years of experience or less and 7 subjects (16.7%) had more than 20 years. in terms of educational level, 29 subjects (69%) had a b.a. degree and 13 subjects (31%) had an m.a. degree. 23 subjects (54.8%) had studied humanities, 7 subjects (16.7%) basic science and 12 subjects (28.6%) engineering. in terms of organizational position, 12 subjects (28.6%) were working in the headquarters (grade one branch), 17 subjects (40.5%) in grade two branches and 13 subjects (31%) in grade three branches. the inferential findings on the electronic readiness of mellat bank in the branches of feyz abad, kashmar, bajestan, gonabad, bazar and the headquarters in khorasan razavi province in iran, from the perspective of the personnel, are given in the following table.
table 2 comparison of the mean score of the personnel's perspective on the electronic readiness of mellat bank in khorasan razavi province with the theoretical mean score (3)

readiness level | average | standard deviation | mean difference | t | df | p
strategy and it policies readiness | 3.59 | 0.14 | 0.59 | 27.46 | 41 | <0.001
it infrastructure readiness | 4.15 | 0.31 | 1.15 | 23.76 | 41 | <0.001
management readiness | 4.35 | 0.28 | 1.35 | 30.86 | 41 | <0.001
juridical-legal readiness | 4.01 | 0.25 | 1.01 | 26.59 | 41 | <0.001
human resource (personnel) and culture readiness | 3.90 | 0.30 | 0.90 | 19.32 | 41 | <0.001
process readiness | 3.92 | 0.41 | 0.92 | 14.42 | 41 | <0.001
electronic readiness in total | 4.00 | 0.20 | 1.00 | 32.53 | 41 | <0.001

as the results of the one-sample t-test indicate, the mean score of the studied personnel’s attitude toward the electronic readiness of mellat bank in khorasan razavi province, in total and for each of its elements, is significantly higher than the theoretical mean score (3) (p<0.001). in other words, from the studied personnel’s perspective, the electronic readiness of mellat bank in khorasan razavi province is high (above average).
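the t values in table 2 can be checked approximately from the published summary statistics alone. the following is a minimal python sketch, assuming the usual one-sample statistic t = (mean - mu0) / (sd / sqrt(n)) with n = 42 usable questionnaires and the theoretical mean mu0 = 3. `one_sample_t` is a hypothetical helper name, not something taken from the paper or from spss.

```python
import math


def one_sample_t(mean, sd, n, mu0):
    """Return (t, df) for a one-sample t-test computed from summary statistics."""
    se = sd / math.sqrt(n)  # standard error of the mean
    return (mean - mu0) / se, n - 1


# "electronic readiness in total" row of table 2: mean 4.00, sd 0.20, n = 42
t_total, df_total = one_sample_t(4.00, 0.20, 42, 3.0)
print(round(t_total, 2), df_total)
```

for the total row this gives t ≈ 32.4 with 41 degrees of freedom, close to the reported 32.53; a gap of this size is what one would expect from rounding of the published mean and standard deviation.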
table 3 comparison of the mean score of the studied personnel's perspective on the electronic readiness of mellat bank in khorasan razavi province based on gender

variable | gender | average | standard deviation | t | df | p
strategy and it policies readiness | female | 3.50 | 0.20 | 1.38 | 40 | 0.18
 | male | 3.60 | 0.13 | | |
it infrastructure readiness | female | 3.75 | 0.61 | 2.93 | 40 | 0.006
 | male | 4.20 | 0.25 | | |
management readiness | female | 4.00 | 0.67 | 2.79 | 40 | 0.008
 | male | 4.38 | 0.19 | | |
juridical-legal readiness | female | 3.69 | 0.47 | 3.03 | 40 | 0.004
 | male | 4.05 | 0.19 | | |
human resource (personnel) and culture readiness | female | 3.60 | 0.40 | 2.16 | 40 | 0.04
 | male | 3.93 | 0.28 | | |
process readiness | female | 3.56 | 0.55 | 1.86 | 40 | 0.07
 | male | 3.95 | 0.38 | | |
electronic readiness in total | female | 3.69 | 0.43 | 3.71 | 40 | 0.001
 | male | 4.03 | 0.13 | | |

as the results of the independent t-test indicate, the mean score of the attitude toward the electronic readiness of mellat bank in khorasan razavi province is significantly higher among male personnel than among female personnel, both in total and for it infrastructure readiness, management readiness, juridical-legal readiness, and human resource (personnel) and culture readiness (p<0.05), whereas the mean scores for strategy and it policies readiness and process readiness did not differ significantly between male and female personnel (p>0.05).
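the independent t values in table 3 can likewise be approximated from the group means, standard deviations and sizes (4 female and 38 male respondents). the df of 40 equals n1 + n2 - 2, which suggests the pooled-variance (equal variances) form of the test; this is an assumption, since the paper does not state which variant of the spss output was reported. `pooled_t` is a hypothetical helper name.

```python
import math


def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return (t, df) for a two-sample t-test with pooled variance.

    Computed from summary statistics only; assumes equal group variances.
    """
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / df
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))  # standard error of the difference
    return (mean_a - mean_b) / se, df


# "electronic readiness in total" row of table 3: male vs female
t_gender, df_gender = pooled_t(4.03, 0.13, 38, 3.69, 0.43, 4)
print(round(t_gender, 2), df_gender)
```

for the total row this gives t ≈ 3.77 with 40 degrees of freedom, near the reported 3.71. with only 4 respondents in one group and clearly unequal standard deviations, a welch-type test would be a reasonable alternative to consider, although the paper does not report one.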
table 4 comparison of the mean score of the studied personnel's perspective on the electronic readiness of mellat bank in khorasan razavi province based on age

variable | age | average | standard deviation | t | df | p
strategy and it policies readiness | 40 years old or younger | 3.57 | 0.14 | 1.26 | 40 | 0.21
 | over 40 years old | 3.63 | 0.13 | | |
it infrastructure readiness | 40 years old or younger | 4.16 | 0.33 | 0.16 | 40 | 0.88
 | over 40 years old | 4.14 | 0.30 | | |
management readiness | 40 years old or younger | 4.3 | 0.33 | 1.36 | 40 | 0.18
 | over 40 years old | 4.43 | 0.17 | | |
juridical-legal readiness | 40 years old or younger | 3.95 | 0.24 | 2.14 | 40 | 0.04
 | over 40 years old | 4.12 | 0.23 | | |
human resource (personnel) and culture readiness | 40 years old or younger | 3.87 | 0.32 | 0.61 | 40 | 0.55
 | over 40 years old | 3.93 | 0.27 | | |
process readiness | 40 years old or younger | 3.84 | 0.42 | 1.59 | 40 | 0.12
 | over 40 years old | 4.05 | 0.37 | | |
electronic readiness in total | 40 years old or younger | 3.97 | 0.23 | 1.35 | 40 | 0.19
 | over 40 years old | 4.05 | 0.12 | | |

as the results of the independent t-test indicate, the mean score for juridical-legal readiness was significantly higher among personnel over 40 years old than among personnel aged 40 or younger (p<0.05), whereas the mean scores for the electronic readiness of mellat bank in total and for its other elements did not differ significantly by age (p>0.05).
table 5 comparison of the mean scores of the studied personnel's perspective on the electronic readiness of mellat bank in khorasan razavi county, by educational level

variable                                          educational level   average   std. dev.   t      df   p
strategy and it policies readiness                b.a                 3.57      0.15        1.26   40   0.21
                                                  m.a                 3.63      0.11
it infrastructure readiness                       b.a                 4.10      0.36        1.72   40   0.09
                                                  m.a                 4.27      0.14
management readiness                              b.a                 4.34      0.32        0.33   40   0.75
                                                  m.a                 4.37      0.20
juridical-legal readiness                         b.a                 4.00      0.29        0.46   40   0.65
                                                  m.a                 4.04      0.09
human resource (personnel) and culture readiness  b.a                 3.84      0.31        1.78   40   0.08
                                                  m.a                 4.02      0.25
process readiness                                 b.a                 3.91      0.42        0.07   40   0.95
                                                  m.a                 3.92      0.41
electronic readiness in total                     b.a                 3.97      0.22        1.39   40   0.17
                                                  m.a                 4.06      0.12

as the independent t-test indicates, the mean scores for the electronic readiness of mellat bank in total and for its elements did not differ significantly among the studied personnel by educational level (p>0.05).
table 6 comparison of the mean scores of the studied personnel's perspective on the electronic readiness of mellat bank in khorasan razavi county, by field of study

variable                                          field of study              average   std. dev.   F      df        p
strategy and it policies readiness                humanities                  3.56      0.16        1.55   (39, 2)   0.23
                                                  basic sciences              3.63      0.08
                                                  technical and engineering   3.63      0.12
it infrastructure readiness                       humanities                  4.09      0.37        1.13   (39, 2)   0.33
                                                  basic sciences              4.22      0.11
                                                  technical and engineering   4.24      0.25
management readiness                              humanities                  4.33      0.34        0.16   (39, 2)   0.86
                                                  basic sciences              4.40      0.20
                                                  technical and engineering   4.35      0.19
juridical-legal readiness                         humanities                  4.0       0.30        0.06   (39, 2)   0.94
                                                  basic sciences              4.04      0.09
                                                  technical and engineering   4.02      0.20
human resource (personnel) and culture readiness  humanities                  3.81      0.31        2.24   (39, 2)   0.12
                                                  basic sciences              4.00      0.23
                                                  technical and engineering   4.00      0.28
process readiness                                 humanities                  3.88      0.47        0.24   (39, 2)   0.79
                                                  basic sciences              4.00      0.32
                                                  technical and engineering   3.94      0.36
electronic readiness in total                     humanities                  3.95      0.25        1.32   (39, 2)   0.28
                                                  basic sciences              4.06      0.08
                                                  technical and engineering   4.05      0.11

as the one-way analysis of variance indicates, the mean scores for the electronic readiness of mellat bank in total and for its elements did not differ significantly among the studied personnel by field of study (p>0.05).

table 7 comparison of the mean scores of the studied personnel's perspective on the electronic readiness of mellat bank in khorasan razavi county, by working experience

variable                                          working experience   average   std. dev.   t      df   p
strategy and it policies readiness                20 years and fewer   3.59      0.15        0.2    40   0.85
                                                  more than 20 years   3.60      0.12
it infrastructure readiness                       20 years and fewer   4.14      0.32        0.65   40   0.52
                                                  more than 20 years   4.22      0.27
management readiness                              20 years and fewer   4.33      0.30        1.13   40   0.27
                                                  more than 20 years   4.46      0.19
juridical-legal readiness                         20 years and fewer   3.99      0.25        1.57   40   0.13
                                                  more than 20 years   4.14      0.20
human resource (personnel) and culture readiness  20 years and fewer   3.90      0.31        0.36   40   0.72
                                                  more than 20 years   3.86      0.28
process readiness                                 20 years and fewer   3.90      0.40        0.58   40   0.56
                                                  more than 20 years   4.00      0.50
electronic readiness in total                     20 years and fewer   3.99      0.21        0.85   40   0.40
                                                  more than 20 years   4.06      0.16

as the independent t-test indicates, the mean scores for the electronic readiness of mellat bank in total and for its elements did not differ significantly among the studied personnel by working experience (p>0.05).

table 8 comparison of the mean scores of the studied personnel's perspective on the electronic readiness of mellat bank in khorasan razavi county, by organizational tenure

variable                                          organizational tenure   average   std. dev.   F      df        p
strategy and it policies readiness                grade 1 branch          3.6       0.15        0.51   (39, 2)   0.60
                                                  grade 2 branch          3.56      0.13
                                                  grade 3 branch          3.62      0.15
it infrastructure readiness                       grade 1 branch          4.25      0.15        0.79   (39, 2)   0.46
                                                  grade 2 branch          4.12      0.39
                                                  grade 3 branch          4.11      0.32
management readiness                              grade 1 branch          4.35      0.17        1.01   (39, 2)   0.37
                                                  grade 2 branch          4.28      0.38
                                                  grade 3 branch          4.43      0.20
juridical-legal readiness                         grade 1 branch          4.04      0.18        0.39   (39, 2)   0.68
                                                  grade 2 branch          3.97      0.32
                                                  grade 3 branch          4.04      0.20
human resource (personnel) and culture readiness  grade 1 branch          3.98      0.26        0.85   (39, 2)   0.44
                                                  grade 2 branch          3.84      0.39
                                                  grade 3 branch          3.89      0.18
process readiness                                 grade 1 branch          3.98      0.20        0.99   (39, 2)   0.38
                                                  grade 2 branch          3.81      0.46
                                                  grade 3 branch          4.00      0.48
electronic readiness in total                     grade 1 branch          4.05      0.09        1.10   (39, 2)   0.34
                                                  grade 2 branch          3.95      0.27
                                                  grade 3 branch          4.02      0.15

as the one-way analysis of variance indicates, the mean scores for the electronic readiness of mellat bank in total and for its elements did not differ significantly among the studied personnel by organizational tenure (p>0.05).

8. discussion and conclusion

according to the obtained results, from the perspective of the studied personnel, the electronic readiness of mellat bank in khorasan razavi, in total and for the elements of strategy and it policies readiness, it infrastructure readiness, management readiness, juridical-legal readiness, human resource (personnel) and culture readiness, and process readiness, was high (more than mean score 4). also, the results indicated that the situation of mellat bank in khorasan razavi was better in the realms of management, it infrastructure and processes than in other fields. tabarsa et al. (2016) [37] examined the electronic readiness of public organizations for the successful deployment of electronic human resource management in yemen and concluded that the electronic readiness of public organizations was high (more than mean score 3) in the dimensions of it and technical infrastructures, human resources and cultural factors, juridical and legal infrastructures, management factors, and strategies based on it and processes. the total electronic readiness score of the studied organization was 3.15, and the organization scored higher in the realms of management and of strategies based on it and processes than in other fields. the results of that study in the field of management factors and processes are close to the results of the present study. badamche et al. (2012) [38] examined and evaluated the factors of electronic readiness in the public libraries of east azerbaijan county and concluded that electronic readiness in the public libraries of this county is at a desirable level (53%). the results of these studies were compatible with the results of the present study.
norouzi and jafarpour (2013) [39] in a study examined electronic readiness of tabriz university libraries from five dimensions including organization and management, using ict, information readiness, personnel and human resource readiness, communication with environment and other organization readiness. results indicated that electronic readiness of tabriz university libraries with score of 2.44 from maximum 5 is not desirable. maximum weak point was in environment dimension and communication with other organizations (2.31) and ict dimension (2.73) was in desirable condition related to other dimensions. the results of these studies are incompatible with the results of present study. evaluation of electronic readiness of organizations can have important role in recognizing different aspects of it in organization and economic institutions and exact planning for successful deployment of information and organizational systems like managing electronic human resources (li and maolin, 2015) [40]. awareness of processes and dimensions and electronic readiness indices help countries’ authorities and organization managers to be successful in designing ict strategies. many country authorities argue that ict can help their country to solve social and economic problems faced with them and they are ready to apply necessary changes for using these new technologies. they need to know real value of using ict and their trust should be boosted along this path. assessing electronic readiness is the first step along with turning goals to planned actions that leads to critical changes in people’s lives. as one of the most important tasks of management is assessing existing jobs within organizations, so designing effective frameworks is necessary and inevitable by which we can equally assess organizations and make organization personnel satisfied. 
the results of the present paper indicated that the mean score of the studied personnel's perspective on the electronic readiness of mellat bank in khorasan razavi county did not differ meaningfully in terms of gender, age, occupational experience, educational level and organizational tenure. by searching the available databases we did not find any study comparing and examining the electronic readiness of organizations based on demographic features. therefore, it was not possible to compare this part of the present study's results with the results of other studies.

references

[1] h. r. r. klidbari, a. davari, a. imani, “the examination and comparison of organizational e-readiness: case study: two financial and commercial organizations”, quarterly journal of bi management studies, vol. 1, no. 1, 2013. [2] a. molla, “the impact of e-readiness on e-commerce success in developing countries: firm-level evidence”. institute for development policy and management, university of manchester, precinct center, manchester, m139 qh, 2004, uk. [3] m. menou, r. taylor, “a grand challenge: measuring information societies”, information society, vol. 22, no. 5, pp. 261–267, nov/dec 2006. [4] y. l. kuo, “technology readiness as moderator for construction company performance”, industrial management and data systems, vol. 113, no. 4, pp. 558–572, 2013. [5] j. s. c. lin, & h. c. chang, “the role of technology readiness in self-service technology acceptance”, managing service quality: an international journal, vol. 21, no. 4, pp. 424–444. [6] s. y. lam, j. chiang, & a. parasuraman, “the effects of the dimensions of technology readiness on technology acceptance: an empirical analysis”, journal of interactive marketing, vol. 22, no. 4, pp. 19–39, 2008. [7] a. brueckner, government & community building: a study of michigan local governments online. proceedings of the 65th asist annual meeting, e.g. toms (ed.), medford, 539–541, 2002. [8] l. flak, d. olsen & p.
wollcat, “local e government in norway”, scandinavian journal of information systems, vol. 17, no. 2, 41–84, 2005. [9] h. rezaeeklidbari, a. davari, a. imani, “the examination and comparison of organizational e-readiness: case study: two financial and commercial organizations”, it management studies, vol. 1, no. 1, pp. 75– 90, 2012. [10] h.r. shirvani, z. baneshi, “baharestran new city electronic municipality’s preparation evaluation according to electronic municipality”, urban management, vol. 7, no. 23, pp. 59–70, 2009. [11] m. r. musa, “an e-readiness assessment tool for local authorities: a pilot application to iraq. a thesis submitted to department of public policy and administration in partial fulfillment of the requirements for the degree of master of public policy and administration”, 2010. [12] m. tavanaa, f. zanadic, & m.n. katehakis, “a hybrid fuzzy group anp–topsis framework for assessment of e-government readiness from a cirm perspective”, information & management, vol. 50, no. 7, pp. 383–397, 2013. [13] a. seakow, d. samson, “e-learning readiness of thailand’s universities comparing to the usa’s cases”, international journal of e-education, e-business,e-management and e-learning, vol. 2, no. 1, pp. 126–131, 2013. [14] w. olatokun, o. michael, o.a. opesade, “an e-readiness assessment of nigeria’s premier university (part 1)”, international education and development of using ict, vol. 2, no. 4, pp. 16–46, 2008. [15] r. g. richey, “technological readiness and strategic interactive fit: dynamic capabilities impacting logistics service competency and performance” (doctoral dissertation). university of oklahoma, 2003. [16] y. l. kuo, “technology readiness as moderator for construction company performance”, industrial management and data systems, vol. 113, no. 4, pp. 558–572, 2013. [17] s.a. alateyah, r.m. crowder, and g.b. 
wills, “identified factors affecting the citizen's intention to adopt e-government in saudi arabia”, international journal of social and industrial engineering, vol. 7, no. 8, pp. 2435–2443, 2013. [18] v. n. vasdinus, “ict-readiness for e-library (a case study of institution of higher learning)”, a project report submitted in partial fulfillment of the requirements for the master of science in information systems. university of nairobi, school of computing and informatics, 2009. [19] p. a. tiemo& n. edewor, “ict readiness of higher institution libraries in nigeria”, international journal of digital library systems, vol. 2, no. 3, pp. 29–38. [20] c. pornchai & k. bundid, “ict readiness assessment model for public and private organizations in developing country”, international journal of information and education technology, vol. 1, no. 2, pp. 99–107, 2011. https://www.magiran.com/volume/54342 evaluation of electronic readiness level (a case of financial institution) 413 [21] p. sivaraks, d. krairit, j.c.s. tang, “effects of e-crm on customer– bank relationship quality and outcomes: the case of thailand”, journal of high technology management research, vol. 22, pp. 141– 157, 2011. [22] f. hendi, a. soleimani nejad, f. doroudi, “survey of e-readiness base of cspp model in mashhad university libraries”, library and information science research, vol. 3, no. 2, pp. 31–50, 2013. [23] n. oraee, m. cheshmeh sohrabi, a. sanayei, h. jabbari noghabi, “e-readiness survey of university libraries in isfahan”, library and information science research, vol. 3, no. 2, pp. 113–132, 2013. [24] p. lee, m. chieng, “building consumer-brand relationship: a crosscultural experiential view, “psychology & marketing”, vol. 23, no.5, pp.10–30, 2014. [25] b. baversad, s.e. shojaei, & m. taheri, “influence of information and communication technology (ict) on fajr petrochemical company performance by using balanced scorecard model (bsc)”, iranian marketing articles bank (in persian), 2015. 
[26] m. b. dehnavi, j. rezaeenour, s.h. hani, “provide a conceptual model for assessing the e-readiness of government agencies using the delphi method”, in proceedings of the 2nd lahijan national conference on software engeering, 2012 (in persian). [27] s. n. khaemba, “factors affecting citizen readiness for e-government systems in kenya”. research in engineering and applied sciences, vol. 2, no. 2, pp. 59–67, 2017. [28] f.t.c. tan, b. tan, w. wang & d. sedera, “it-enabled operational agility: an interdependencies perspective”, information & management, vol. 54, no. 3, 292–303, 2017. [29] gh. tabarsa, m. a. haghighi, kh. al-masuri, “evaluating the e-readiness of government agencies to establish a successful electronic human resource management system in yemen”, public administration perspective, vol. 7, no. 26, pp. 77–104, 2017 (in persian). [30] f. salek ranjbarzadesh, m. h. biglu, s. hassanzadeh, n. safaei & p. saleh, “e-readiness assessment at tabriz university of medical sciences”, res dev med educ, vol. 1, pp. 3–6, 2013. [31] a. noori, m. kahani, h. afkhami, “assessing the level of electronic readiness of the faculties of ferdowsi university of mashhad with emphasis on access to information”, in proceedings of the 3rd international conference on information and knowledge technology (ikt2007), pp. 1–8 (in persian). [32] www.codal.ir [33] www.bankmellat.ir [34] www.mellatib.ir [35] www.thebanker.com [36] m. b. dehnavi, j. rezaeenour, “the study of ereadiness in melli bank of guilan province in experts' view”, thesis for obtaining a master's degree in public administration, islamic azad university, 2012 (in persian). [37] g. a. tabarsa, m. a. haghighi, k. al-maswary, “a survey on e-readiness of public organizations for successful emplementation of e-hrm in yemen”, journal of public administration perspective, vol. 7, no. 26, pp. 77–104, 2016. [38] r. 
badamche vaughne, “review and evaluation of electronic readiness criteria indicators in the public libraries of the country, case study: public libraries of east azerbaijan province (iran)”, master thesis, islamic azad university, north tehran branch, 2012 (in persian). [39] y. norouzi, i. jafarpour, “survey of the e-readiness in university libraries: the case of tabriz university libraries”, library and information sciences, vol. 16, no. 1, pp. 123–150, 2013. [40] l. ma & m. ye, “the role of electronic human resource management in contemporary human resource management”, open journal of social sciences, vol. 3. pp. 71-78, 2015. https://en.symposia.ir/lncse02 https://en.symposia.ir/lncse02 http://www.bankmellat.ir/ http://www.mellatib.ir/ instruction facta universitatis series: electronics and energetics vol. 28, n o 2, june 2015, pp. 223 236 doi: 10.2298/fuee1502223m oscillation-based testing method for detecting switch faults in high-q sc biquad filters  miljana milić, vančo litovski faculty of electronic engineering, university of niš, serbia abstract. testing switched capacitor circuits is a challenge due to the diversity of the possible faults. a special problem encountered is the synthesis of the test signal that will control and make the fault-effect observable at the test point. the oscillation based method which was adopted for testing in these proceedings resolves that important issue in its nature. here we discuss the properties of the method and the conditions to be fulfilled in order to implement it in the right way. to achieve that, we have resolved the problem of synthesis of the positive feed-back circuit and the choice of a proper model of the operational amplifier. in that way, a realistic foundation to the testing process was generated. a second order notch cell was chosen as a case-study. fault dictionaries were developed related to the catastrophic faults of the switches used within the cell. 
the results reported here are a continuation of our previous work and are complementary to some other already published results.

key words: obt method, sc filters, switch faults, fault dictionary.

1. introduction

the synthesis of the test signal is one of the essential problems in analog circuit testing. choices among many possibilities have to be made. first, one should select an analog test domain [1]. testing can be done by analyzing dc signals [2, 3, 4], signals in the frequency domain [5, 6, 7], as well as signals in the time domain [8]. it is often necessary to use test signals from several domains simultaneously. if we use dc signals, then we search for fault effects related to the nonlinearities and quiescent conditions. a number of methods consider the selection of time domain test signals [9, 10, 11]. in the frequency domain, one has to determine the most appropriate spectrum of the testing signal in order to achieve maximal fault effects. in the time domain one searches for one or more signal waveforms that will enable the fastest and the cheapest testing, in order to optimize the production and decrease the price of the product. a technique that does not require solving such problems, since it needs no test signal, is the oscillation-based test (obt) [12]. to implement this powerful method one has to create a feed-back loop during testing. by measuring the frequency of the created oscillator and comparing it with the fault-free frequency, one can detect defective circuits.

received may 16, 2014; received in revised form january 19, 2015
corresponding author: miljana milić, faculty of electronic engineering, university of niš, aleksandra medevedeva 14, 18000 niš, serbia (e-mail: miljana.milic@elfak.ni.ac.rs)

structural testing is a concept where test signals that detect one or more faults are created. to detect all or most of the possible faults we need many tests.
in obt all possible defects are targeted with only one measurement since they all affect the oscillation frequency. here comes the major difference in oscillator design. regular oscillators are created to be insensitive to the presence of parameter variations. on the other hand, obt oscillators and their oscillation frequency should be as much sensitive to parameter variations as possible. unfortunately, the obt technique cannot be automated, since each analog circuit is unique. that is the biggest difficulty in implementing the method. unlike other testing approaches where numerous and most appropriate testing points have to be selected, observed, and captured signals processed, [13], the number of obt test points is one, i.e. the oscillator‟s output. nevertheless, the problem of measurement is not solved, since one has to decide which parameters of the response should be measured and extracted. there are also other problems related to obt implementation. first, bringing the oscillations into a stable state can slow down the testing process. second, in rare situations observing just one testing point (for example, the output voltage), cannot show the fault effect, so additional measurements are needed, such as iddq [14, 15]; additional voltage waveforms [16]; or even mixing domains, including physical redesign of the original circuit to create access points for measurement [17]. problems of needed test (measurement) points and most appropriate quantities of observation for determining the state of the circuit should be solved [18, 19]. the oscillation based testing (obt) method [12, 20, 21] has been drawing our attention for a relatively long period. the main reason for that was the fundamental discrepancy between the theoretical developments reported by the original authors and the practical implementation of the method. 
namely, as elaborated in [22, 23, 24] the original method is based on the presumption that the operational amplifiers (oa) (or active amplifying elements) within the circuit under test (cut) perform ideally as if the frequency is equal to zero. in practice, that is not the case. in fact, at the oscillation frequency the modulus and the phase of the gain of the operational amplifier(s) are so degraded that it makes the theoretically developed expressions not only impractical but also misleading. both the oscillation frequency of the fault-free (ff) and faulty circuits (fc) obtained by the closed form expression derived based on the original method are far from the real ones. later implementations of the obt concept [25, 26, 27, 28, 29, 30] suffer from the same drawback. namely, if one generates a fault dictionary using closed form formulae (or even by simulation) based on use of ideal model of the operational amplifier, and then verifies the results by (repeated) simulation based on the same models, one does not notice the fundamental problem: the gain and the phase distortions of the operational amplifier are not negligible and the fault dictionaries are far of the realistic ones. it is worth mentioning that, admittedly, the need to include the operational amplifier's phase shift in the evaluation of the oscillation frequency for implementation of the obt method to continuous time analog filter was mentioned earlier in the literature [31]. the idea was, however, implemented in the frequency domain and led to conclusions quite different than the ones we reported in [22]. we find the implementation of the obt method for continuous time filters reported in [32] to be the proper one. there, of course, due to the complexity of the cut the developments in the frequency domain were, simply, not feasible. 
based on those considerations we concluded that a different, non-analytical approach to the extraction of these two quantities is to be implemented, based on realistic (dynamic) models of the operational amplifiers and on time (not frequency) domain simulation. of course, that introduces an additional problem related to the overall time needed to get the fault dictionary (for as many faults as needed), since one needs to extract the oscillation frequency from a time domain signal which in turn has to reach a steady state. the problems introduced in the practical implementation of the obt, however, are broadly compensated by the sole fact that obt resolves the main problem in test generation per se. namely, in general, and especially for analog circuits, the synthesis of the testing signal is the foremost problem. obt needs no test signal and, if the circuit is testable, it exposes any fault present in it. that enables simple implementation of the structural concept [33] of test signal generation, by which only selected faults (the most probable ones) in the system are targeted. the obt method has been applied to several types of circuits (as listed in the references above), among them switched capacitor (sc) filters [34, 35, 36, 37, 38]. the problem of the non-idealities was considered in [37] where, due to the difference between the conclusions obtained from the z-domain and from the time domain, much attention was paid to time domain simulation for the generation of the fault dictionaries. there, a simple cmos transconductance amplifier was implemented in the schematic of the sc filter and a conclusion was drawn that the difference between the time and frequency domain analyses is to be attributed to the feed-back circuit. we believe that a purely resistive model of the transistors within the op-amp was used.
it is well known, however, [39] that the op-amp implemented in sc circuits has to fulfill stringent requirements not only in the frequency domain (including the nominal gain and the cut-off frequency) but also in the dc domain (low offset), and in transient domain (high slew-rate). in addition, low noise requirements are usually imposed. for example, in [40] the authors recommend the lt1055 op-amp which, in fact, has a jfet at the input in order to reduce the noise. this is why, we think, the main circuit, not the feed-back, is imposing the need to have a much better model of the operational amplifier. that was illustrated in more detail in [22]. considering the obt method, a very important and powerful one, and having in mind the need for a more realistic implementation, we started to resolve the corresponding issues of implementation of obt to the testing of sc filter cells, one by one. the nature of the faults in sc circuits was studied in [41]. namely, within a circuit one may encounter parametric and catastrophic faults. parametric faults here are related only to the capacitance values. catastrophic faults may belong to the following categories: faults related to the connection lines, faults related to the capacitors, faults related to the switches (transistors), and faults related to the operational amplifiers. to generate a fault dictionary, however, one has to resolve the synthesis of the oscillator and the fault insertion method first. we first attacked the problem of synthesis of the feed-back loop that enables oscillation. the success was demonstrated on the simplest situation that is fault dictionary creation for parametric or soft faults. [42]. large changes were attributed to all capacitance values. introduction of catastrophic faults into a circuit that has active elements and feed-back loops is a challenging task since the fault may have several very dramatic consequences. 
firstly, a catastrophic fault may change the quiescent working conditions of the active elements, possibly bringing them into saturation or cut-off. that fundamentally changes the circuit behavior and makes the simulation settings much more complicated. one should not forget that the simulation of an oscillator is a specific problem related to the conflict between the requirement for a stable numerical integration rule and the simulation of an unstable electronic circuit. on the other hand, a catastrophic fault may break an existing feed-back loop or establish a new one, which, again, leads to a totally new circuit with unknown properties and behavior. for that reason, when speaking of catastrophic faults, we first considered the capacitors in an sc notch cell [43]. after getting good results and accumulating considerable experience, we are here attacking the second element type of the sc filters, the switches. the results reported here are complementary to the ones reported in [43] (which are not repeated here) and we consider the present report and the one given in [43] as the completion of a single task. to our knowledge the fault dictionaries reported here are the first and unique base for testing a notch sc cell.

the paper is structured in the following way. in the next section the analysis and design of the high q sc notch cell is described. then, in section 4, the obt is discussed and its application described. there follow, in paragraph 5, the main results related to the method of creation of the fault dictionary and the dictionary itself. here the discussion of the results is given too.

2. high q sc notch filter

a switched capacitor (sc) is an integrated electronic element used in discrete time signal processing systems. the main idea is to use capacitors and switches to emulate resistors, thereby avoiding the drawbacks of integrated resistors, which have poor accuracy and poor temperature dependence properties.
in that way discrete time systems are obtained from continuous time originals. the circuits so obtained use non-overlapping signals to control the switches, often termed break before make switching, so that all switches are open for a very short time during the switching transitions. filters implemented with these elements are termed 'switchedcapacitor filters'. the switching frequency may be used to control the response of the filters since the equivalent resistances are directly dependent on it. from the implementation point of view the sc filters are in between the analog and the digital ones. namely, while the signals are sampled they are not quantized so that their advantage over digital filters is the potential to achieve a high dynamic range. in the same time the need for analog-to-digital (ad) and digital-to-analog (da) conversion as well as digital signal processing (dsp) hardware is avoided. the analog output signal is simply restored by a low-pass filter. on the other side, besides the potential to be integrated in silicon which is not the case for the continuous time rc active filters, unlike continuous time filters (which have to be constructed with resistors, capacitors and sometimes inductors whose values are accurately known), switched capacitor filters depend only on the ratios between capacitances and the switching frequency. for all these reasons sc filters are an important class of integrated circuits and testing of sc filters is an important issue in electronic design. the physical realization of the sc filters is very frequently performed by cascading second-order cells. that concept will be followed here, too. a topology of a universal second order sc filter cell is depicted in fig. 1 [44, 45]. this is a well known fleischerlaker active sc filter [46]. by proper choice of the parameter values of the cell one may produce all four variants needed for complete filter design: low-pass (lp), band-pass (bp), band-stop (notch), and high-pass (hp). 
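The dependence of the equivalent resistance on the switching frequency, noted above, can be illustrated with the standard first-order relation for a switched capacitor, R_eq = T / C = 1 / (f_clk · C). The following is a minimal sketch: the capacitance and clock values are illustrative assumptions, not taken from the cell discussed here, and the function name is hypothetical.

```python
def sc_equivalent_resistance(capacitance_f, clock_hz):
    """First-order equivalent resistance (ohms) of a capacitor switched at clock_hz.

    Uses the textbook approximation R_eq = 1 / (f_clk * C), valid when the
    signal frequency is much lower than the switching frequency.
    """
    return 1.0 / (clock_hz * capacitance_f)


# Illustrative (assumed) values: a 1 pF capacitor switched at 100 kHz and 50 kHz.
r_fast = sc_equivalent_resistance(1e-12, 100e3)
r_slow = sc_equivalent_resistance(1e-12, 50e3)
print(r_fast, r_slow)
```

Halving the clock frequency doubles the emulated resistance, which is why the filter response can be tuned through the clock while the capacitor ratios stay fixed.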
in these proceedings (as we did in our previous research related to the obt method) the notch cell will be elaborated, mostly because it may be regarded as the most complex one. namely, it has to suppress part of the frequency band and has two pass-bands. in addition, when creating filters with transmission zeroes on the axis of real angular frequencies, this cell is used as many times as there are transmission zeroes (usually n/2-1, n being the order of the filter). practically, only one additional cell of the lp, bp or hp type is enough to complete the filter realization. finally, one should not forget that the use of a universal topology drastically simplifies the layout design, since it allows for 'programming' the layout on the chip.

fig. 1 high q sc notch filter cell (schematic labels: v_in, v_out, c1, c2, α1c1, α2c1, α3c1, α4c1, α5c2, α6c2, two operational amplifiers, switches driven by clock phases φ1 and φ2)

the transfer function of the cell $T(s)$, obtained under the presumption that the operational amplifiers have infinite (frequency-independent) gain, is given by

$$T(s) = \frac{V_{out}(s)}{V_{in}(s)} = \frac{k_2 s^2 + k_1 s + k_0}{s^2 + \dfrac{\omega_0}{Q}\, s + \omega_0^2}. \tag{1}$$

the element values of fig. 1 may be related to the coefficients of (1) in the following way [44]:

$$\alpha_1 = k_0 T / \omega_0 \tag{2}$$
$$\alpha_2 = \alpha_5 = \omega_0 T \tag{3}$$
$$\alpha_3 = k_1 / \omega_0 \tag{4}$$
$$\alpha_4 = 1 / Q \tag{5}$$
$$\alpha_6 = k_2, \tag{6}$$

where $k_0$, $k_1$, $k_2$ are constants determining the position of the passband on the frequency axis (e.g. low-pass, band-pass, etc.), while $\omega_0$ (notch frequency), $Q$ (quality factor), and $T$ (non-overlapping clock period) are design parameters. in the specific case of a notch transfer function one should choose $k_0 = (3 \cdot \omega_0)^2$, $k_1 = 0$, and $k_2 = 9$. in the above expressions $T$ stands for the sampling period and $\omega_0 = 2\pi f_0$ is equal to the modulus of the pole of the cell which is, at the same time, equal to the notch frequency.

the cell is usually designed using (1).
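As a numerical illustration of relations (2)-(6), the sketch below evaluates them for the example adopted in this study (f0 = 1 kHz, Q = 10, T = 10 μs). The function name and dictionary keys are hypothetical. Note one explicit assumption: the quoted element values α1 = 0.063 and α6 = 1 are reproduced here by setting k0 = ω0² and k2 = 1, so those two constants are chosen only to match the quoted numbers and should not be read as the authors' stated choice.

```python
import math


def notch_cell_coefficients(f0_hz, q, t_clk_s, k0, k1, k2):
    """Evaluate the capacitor ratios of relations (2)-(6) for the second-order cell."""
    w0 = 2 * math.pi * f0_hz  # notch (pole) angular frequency in rad/s
    return {
        "w0": w0,
        "alpha1": k0 * t_clk_s / w0,  # relation (2)
        "alpha2": w0 * t_clk_s,       # relation (3)
        "alpha3": k1 / w0,            # relation (4)
        "alpha4": 1.0 / q,            # relation (5)
        "alpha5": w0 * t_clk_s,       # relation (3)
        "alpha6": k2,                 # relation (6)
    }


w0 = 2 * math.pi * 1e3
# Assumption: k0 = w0**2 and k2 = 1 are picked to reproduce the quoted element values.
coeffs = notch_cell_coefficients(1e3, 10, 10e-6, k0=w0 ** 2, k1=0.0, k2=1.0)
for name, value in coeffs.items():
    print(name, round(value, 3))
```

With these inputs, α2 = α5 = ω0·T and α4 = 1/Q follow directly from the design parameters, independently of the assumed constants.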
to do that we choose as an example for this study the following: f0 = 1 kHz, Q = 10, T = 10 μs (the frequency of the two-phase, non-overlapping switching is 100 kHz), and capacitances C1 = C2 = 20 pF. the selected value of Q, the quality factor of the cell, is considered large, hence the name of the cell. after substitution of these values in (2)-(6) we obtain: ω0 = 6283.18 rad/s, α1 = 0.063, α2 = 0.063, α3 = 0, α4 = 0.1, α5 = 0.063, and α6 = 1. the amplitude- and phase-frequency response of this filter cell, produced under the presumption of infinite-gain operational amplifiers, is depicted in fig. 2. note the usual spice [47] presentation of the phase, which is shown as if the phasor were jumping by 180 degrees.

fig. 2 amplitude (full line) and phase (dotted) frequency response of the "ideal" filter

3. basic concepts of the obt

by testing we will here understand the creation of a fault dictionary. it is a look-up table containing the effect of every fault conceived in advance. by using it we practically implement the simulation-before-test approach [33]. the information stored in it tells the test engineer whether the selected fault is testable from both points of view: controllability and observability. the main problem hidden behind this table is the selection of a test signal that will activate the fault and propagate its effect to the output in the shortest time (to reduce the overall testing time in high-volume production). this problem is especially difficult to solve for analog circuits [48], since three domains are to be taken into account: dc, the frequency domain and the time domain. there exists, however, a technique that needs no test signal. it is known as the obt [12, 20, 21]. the basic idea behind this powerful method is to create a redundant feedback loop that is to be activated during testing only.
by measurement of the output signal of the fc and by comparison with the response of the ff circuit, one may conclude whether there are defects in the circuit or not. the method is even simpler in practice, since usually only one testing point is needed (the output) and frequently only one quantity is to be observed: the oscillation frequency. when the ff circuit is set to oscillate, the fc may be revealed either by absence of oscillation or by a different value of the frequency of the signal measured at the output.

fig. 3 simplified obt (blocks: cut, mode select)

for the implementation of this method it is assumed that the system is so structured that an external controlling signal is capable of (1) isolating a part of it (being the cut) and (2) introducing positive feed-back that will make it oscillate. fig. 3 represents the local arrangement. as can be seen, a switch is introduced that, when activated under control of the 'mode select' signal, simultaneously isolates the cut from the rest of the circuit and connects the positive feed-back branch. this concept allows for implementation of the design for testability (dft) concept of integrated circuit (ic) design, which is depicted in fig. 4 in its simplified version [22, 43]. the development of the schematic proposed here is based on [49]. it is reminiscent of the one published in [50], where more details concerning the digital circuitry may be found.

fig. 4 obt at the system level (blocks: analog block 1 to analog block n forming the circuit under test (cut), additional circuitry, amux, control logic, test logic; signals: inputs, outputs, output oscillations, testing mode, scan test signals)

in this configuration additional digital control logic is provided in order to set the system in testing mode and to allow for the analog cuts to be isolated and tested one by one.
since no test signal is needed, in testing mode only the output of the cells is to be connected to the system output. note that every analog block is to be designed with a structure as depicted in fig. 3. when first reported, the obt method was based on two fundamental presumptions: the active elements are ideal with infinite gain and, consequently, the system is linear. neither of these presumptions is valid. namely, the operational amplifiers which are necessary for implementation have real properties such as, among many others, a finite and frequency-dependent modulus of the gain and a phase shift that is different from zero and also frequency dependent. ignoring these properties leads to wrong expressions for calculation of the oscillation frequency and consequently to wrong values for the outcome. unfortunately, even if the simplest more realistic model of the oa (single-pole roll-off) were to be implemented, it would not be possible to obtain closed-form expressions due to the complexity of the node equations of the system. in fact, nonlinear expressions are obtained. on the other side, there are no linear active circuits as such. furthermore, one is to be aware that if one designs an oscillator, one must drive the working point of the active element into saturation in order to limit the rise of the amplitude being forced by the positive feed-back. so, in part of the period, the circuit must be nonlinear. how long the active element will stay in saturation will depend on the quality of the feed-back loop, i.e. on the value of the modulus of the loop-gain. again, we come to the conclusion that no closed-form expressions may be derived for calculation of the oscillation frequency. here we will allow ourselves to make a small digression. the fact that there are nonlinearities within the feedback loop does not disqualify the obt method as such. on the contrary!
the abundance of harmonics in the output signal may be effectively used as additional information (besides the oscillation frequency) for both testing and diagnostic purposes [23, 24]. the use of harmonic analysis of the output signal (that may be done off-line) may drastically reduce the additional efforts, redesigns, and silicon area needed in order to get the supply current as additional information for testing, as was done in [51]. to summarize, if simulation with proper models of the active elements is used instead of closed-form expressions, and if the problem of the additional circuitry needed to create the positive feed-back loop is resolved, the obt method becomes a powerful means for testing and diagnosis of not only analog but also mixed-signal systems [20, 23, 24].

4. simulations and testing

the creation of the fault dictionary goes as follows. a list of faults is assembled first. it is normally shorter than the list of all possible faults for several reasons, one of them being the tractability of the testing process, while another is the lower probability of occurrence of some specific faults. here, as explained in the introduction, we will consider the faults related to all transistors used as switches. two types of faults will be taken into account: stuck-at-open and stuck-at-closed. for the circuit of fig. 1, where 10 switches are used, one is to create a fault dictionary with 21 rows, the first one being allocated for the ff circuit. in the next step a fault is to be inserted in the circuit [52]. in our case we have to model an open-circuit (stuck-at-open fault) and a short-circuit. in the former case we use two variants. in the first we consider the open to be an infinite resistance (ideal open), while in the second we choose a more realistic model: the open has a finite resistance of 1 MΩ (representing the leakage between the source and the drain). similarly, for the modeling of the stuck-at-short we use a real short-circuit, i.e.
zero ohms (ideal short) or the more realistic 0.01 Ω (representing a physical short circuit). note that, to ensure numerical accuracy, one is to use a relatively small span of the resistance values between the open and the closed case. in spice one uses 1 GΩ and 1 Ω as defaults. what we use is more realistic and spans a narrower range. accordingly, in the sequel we will present two fault dictionaries, one for ideal and the other for more realistic models of the faults. based on these, we will conclude whether we can rely on the ideal switch models or not. to get the oscillation frequency of the ff circuit, it has to be extended by a positive feed-back loop. having in mind the value of the gain of the notch cell alone, we concluded that additional gain is to be added to the loop for the oscillations to be enabled. furthermore, the additional gain is to be positive and small enough to avoid excessive harmonic distortion. in fact, additional gain of 6 dB was added to the loop-gain. the resulting configuration is depicted in fig. 5, where the schematic is copied from the schematic input to the spice simulator. before proceeding to simulation we had to solve two additional issues. the first one is related to the choice of the integration rule for the differential equation solver within the simulator. namely, if one is to simulate an electronic circuit, one usually asks for an integration rule (or derivative approximation formula) that is stable and, if possible, a-stable. that, however, as shown in [53], may lead to a signal with a decaying amplitude which eventually vanishes due to the stability requirement imposed. for that reason, for simulation of oscillator circuits, the so-called trapezoidal integration rule [53] is to be implemented. finally, a realistic model of the operational amplifier is to be chosen and implemented.
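the effect of the integration rule on a simulated oscillation can be sketched on a lossless harmonic oscillator, used here as a stand-in for the oscillator circuit. this is an illustration only: the source does not name the rule it contrasts with the trapezoidal one, so backward euler is assumed as a representative strongly damping rule, and the 960 Hz frequency and 10 μs step are merely borrowed from the values quoted in this study.

```python
import math


def final_amplitude(rule, f_hz=960.0, h=10e-6, steps=1000):
    """Integrate x'' = -w^2 x from x=1, v=0 and return the final amplitude.

    The exact solution keeps the amplitude at 1 forever, so any decay is
    numerical damping introduced by the integration rule.
    """
    w = 2.0 * math.pi * f_hz
    x, v = 1.0, 0.0
    for _ in range(steps):
        if rule == "backward_euler":
            # Implicit update solved in closed form for this linear system.
            x_new = (x + h * v) / (1.0 + (h * w) ** 2)
            v = v - h * w * w * x_new
        elif rule == "trapezoidal":
            k = (0.5 * h * w) ** 2
            x_new = ((1.0 - k) * x + h * v) / (1.0 + k)
            v = v - 0.5 * h * w * w * (x + x_new)
        else:
            raise ValueError(f"unknown rule: {rule}")
        x = x_new
    # Amplitude of the oscillation reconstructed from position and velocity.
    return math.hypot(x, v / w)


for rule in ("backward_euler", "trapezoidal"):
    print(rule, round(final_amplitude(rule), 4))
```

after 10 ms of simulated time the backward euler amplitude has decayed to a small fraction of its initial value, while the trapezoidal rule preserves it, which is the behaviour that motivates the choice made above.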
since no model (nor schematic) is normally given with the design kits delivered together with the technology file for ic design under the academic licenses we use, we were forced to use a model that is built into the simulator. that was the model of the ltc6078 [54] op-amp, whose schematic is built into the ltspice simulator [47]. we published the schematic and the parameters of the model in [22]. with all conditions set, the simulation of the ff circuit produced oscillation with frequency fosc = 960 Hz. the result of the fast fourier transform (fft) analysis of the obtained output signal is shown in fig. 6. that is the information from which the oscillation frequency was extracted. in order to get the fault dictionary, as can be seen, for every fault we have to simulate the oscillator and to perform an fft. here altogether we needed 41 simulations and fft analyses. the results are given in table i and table ii. the first one uses ideal models of the faulty switches, while the second one uses more realistic models of the faulty switches. the test dictionaries expressed by table i and table ii contain the following data for the ff and for every fc: 1. oscillation frequency; 2. deviation (in percent) of the oscillation frequency from that of the ff circuit; 3. and 4. the amplitude and the phase of the first harmonic; 5. the dc value of the output; and 6. the thd of the output signal.

fig. 5 ltspice schematic of the notch filter

fig. 6 fft of the obt oscillator output signal

we note, at the beginning, the difference between the notch frequency and the oscillation frequency of the ff circuit. for the idealized case it is 1 kHz, while for the case when realistic models of the op-amps are implemented it becomes 960 Hz. these, if ideal operational amplifiers were to be implemented, would be equal.
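the extraction of the oscillation frequency, the first-harmonic amplitude and the thd from a simulated waveform can be sketched as follows. this is a minimal sketch, not the ltspice fft itself: it uses a plain dft on a synthetic signal, the 960 Hz and 266.6 mV values are those reported for the ff circuit, and the 1 % third harmonic, the 10 kHz sample rate and all function names are hypothetical.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Single-sided amplitude spectrum (bins 0 .. n/2 - 1) via a direct DFT."""
    n = len(samples)
    twiddle = [cmath.exp(-2j * math.pi * k / n) for k in range(n)]
    mags = []
    for k in range(n // 2):
        acc = sum(x * twiddle[(k * i) % n] for i, x in enumerate(samples))
        mags.append(2.0 * abs(acc) / n)
    return mags


def oscillation_metrics(samples, fs):
    """Return (frequency [Hz], first-harmonic amplitude, THD [%])."""
    mags = dft_magnitudes(samples)
    k0 = max(range(1, len(mags)), key=mags.__getitem__)  # skip the dc bin
    harmonics = []
    h = 2
    while h * k0 < len(mags):
        harmonics.append(mags[h * k0])
        h += 1
    thd = math.sqrt(sum(m * m for m in harmonics)) / mags[k0]
    return k0 * fs / len(samples), mags[k0], 100.0 * thd


# Synthetic record: 0.1 s at 10 kHz gives a 10 Hz frequency resolution.
FS, N = 10e3, 1000
signal = [
    0.2666 * math.sin(2 * math.pi * 960 * i / FS)
    + 0.002666 * math.sin(2 * math.pi * 2880 * i / FS)  # hypothetical 3rd harmonic
    for i in range(N)
]
freq, ampl, thd = oscillation_metrics(signal, FS)
print(f"fosc = {freq:.0f} Hz, ampl = {1e3 * ampl:.1f} mV, thd = {thd:.2f} %")
```

the record length sets the frequency resolution (fs/n), so a longer record is needed if oscillation frequencies are to be distinguished more finely than 10 Hz.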
one is not to forget that 1 kHz is a very low frequency, and this effect is to be expected to have much more severe consequences if the working frequency of the circuit is raised. we also expect that a higher gain in the additional circuit will be needed if oscillations at higher frequencies are to be created. as can be seen from the fosc column of both tables, oscillations are not established in all fcs. in cases where no oscillations are established, one simply concludes that the fault is testable. when, however, oscillations are established in the fc, we may distinguish three situations. in the first one, such as the cases s1-open, s5-open, and s7-short in table i, there is a clear difference in the oscillation frequency. that is enough to conclude that the fault is testable. in the second case, we may have oscillation with a frequency near to that of the ff circuit but with a clearly different value of the amplitude of the first harmonic. this is practically always the case in table i and table ii. there is one case in table ii which deserves some additional attention. namely, when s2-short is present, if one is not satisfied and wants an absolutely firm conclusion about the presence of the fault, one may take into account not only the frequency (difference is 8.3%) and the amplitude of the first harmonic (difference is 29%), but the harmonic distortion, too. in all other cases there is no practical need to use the phase shift, the dc value and the distortion as information about the testability of the circuit. there is a special situation where no sinusoidal oscillations are observed at the output. these cases are marked by 100k in the oscillation frequency column. instead, as a consequence of clock feed-through, a trapezoidal waveform having the frequency of the clock is obtained at the output.
note that, since such a signal is far above the passband of the low-pass filter used at the output of the sc cell, if one wants to diagnose this effect, one is to measure directly at the sc output. otherwise, the filter will suppress this fault effect.

table i simulation with ideal switch model: fault dictionary

defect | fosc [Hz] | Δfosc [%] | ampl. 1st har. [mV] | phase 1st har. [deg] | dc val. [mV] | thd [%] | comments
ff | 960 | - | 266.6 | -125.45 | -0.218 | 0.353 |
s1 open | 360 | 62.5 | 139.6 | -40.12 | -0.073 | 1.161 |
s1 short | 100k | >500 | 0.216 | 80.62 | -2999.87 | 209.071 | no sin. osc.
s2 open | 100k | >500 | 0.023 | -117.67 | -2999.91 | 171.753 | no sin. osc.
s2 short | 1040 | 8.3 | 374.1 | 45.05 | -1.147 | 0.914 |
s3 open | 100k | >500 | 0.091 | -93.06 | -2999.53 | 266.543 | no sin. osc.
s3 short | - | 100 | 0 | - | - | - | no oscillations
s4 open | 100k | >500 | 0.018 | -92.74 | -2999.91 | 270.629 | no sin. osc.
s4 short | 100k | >500 | 59.52 | -6.31 | 398.077 | 45.526 | no sin. osc.
s5 open | 890 | 7.3 | 3.259 | 74.36 | 0.003 | 2.918 |
s5 short | 100k | >500 | 43.05 | 71.75 | -16.101 | 109.583 | no sin. osc.
s6 open | 890 | 7.3 | 20.76 | 81.64 | -0.036 | 1.339 |
s6 short | 100k | >500 | 45.95 | 68.30 | 18.849 | 107.222 | no sin. osc.
s7 open | - | 100 | 0 | - | - | - | no oscillations
s7 short | 310 | 67.7 | 476.3 | 105.40 | -12.934 | 134.391 |
s8 open | 100k | >500 | 0.033 | -90.37 | -2999.91 | 281.203 | no sin. osc.
s8 short | 100k | >500 | 0.051 | -91.73 | -2999.9 | 284.390 | no sin. osc.
s9 open | 100k | >500 | 0.092 | -93.11 | -2999.51 | 266.616 | no sin. osc.
s9 short | 100k | >500 | 0.155 | -91.80 | -2999.75 | 210.601 | no sin. osc.
s10 open | 100k | >500 | 0.008 | -71.64 | -2999.91 | 384.349 | no sin. osc.
s10 short | 100k | >500 | 0.430 | -97.77 | -2170.95 | 263.349 | no sin. osc.

table ii simulation with real switch model: fault dictionary

defect | fosc [Hz] | Δfosc [%] | ampl. 1st har. [mV] | phase 1st har. [deg] | dc val. [mV] | thd [%] | comments
ff | 960 | - | 266.6 | -125.45 | -0.218 | 0.353 |
s1 open | 727.3 | 24.2 | 52.67 | -152.95 | -0.195 | 0.541 |
s1 short | 100k | >500 | 0.291 | 82.61 | -2999.84 | 221.73 | no sin. osc.
s2 open | 670 | 30.2 | 6.073 | 7.47 | 0.078 | 1.386 |
s2 short | 1028 | 7.1 | 374 | 133.07 | 1.966 | 1.754 |
s3 open | 680 | 29.2 | 305.3 | 101.60 | -0.509 | 0.29 |
s3 short | - | 100 | 0 | - | - | - | no oscillations
s4 open | 680 | 29.2 | 109.7 | -166.00 | -0.399 | 0.300 |
s4 short | 100k | >500 | 59.65 | -6.35 | 398.039 | 45.352 | no sin. osc.
s5 open | 930 | 3.1 | 120.2 | 47.64 | -0.105 | 0.857 |
s5 short | 100k | >500 | 43.16 | 71.59 | -16.221 | 109.513 | no sin. osc.
s6 open | 930 | 3.1 | 95.39 | 45.16 | -0.070 | 0.872 |
s6 short | 100k | >500 | 45.84 | 68.44 | 18.737 | 107.305 | no sin. osc.
s7 open | 733 | 23.6 | 0.019 | 1.00 | -0.007 | 4.535 |
s7 short | 310 | 67.7 | 478.5 | 107.40 | -12.276 | 133.594 |
s8 open | 670 | 30.2 | 79.09 | -158.37 | -0.421 | 0.554 |
s8 short | 100k | >500 | 0.042 | -91.93 | -2999.9 | 273.155 | no sin. osc.
s9 open | 680 | 29.2 | 318.3 | 22.08 | -0.370 | 1.067 |
s9 short | 100k | >500 | 0.513 | -92.00 | -2998.75 | 227.005 | no sin. osc.
s10 open | 680 | 29.2 | 258.1 | -165.77 | -0.063 | 1.139 |
s10 short | 100k | >500 | 0.421 | -97.95 | -2171.09 | 263.529 | no sin. osc.

by comparison of table i and table ii we may get a notion of the quality of the switch model used. the main difference between table i and table ii is in the number of feed-through fault effects. in addition, as can be seen for the case s7-open, the change of the circuit functionality due to the ideal model leads to a wrong conclusion about the fault effects, although both models cover the fault. the realistic fault model mostly suppresses this effect, and this is why we recommend it for this application. note its simplicity. by inspection of table i and table ii we may conclude that there are no untestable faults. the fault effects being different, all faults may be recognized at the output of the cell, making obt a successful concept for testing this kind of cell while using extremely simple additional circuitry for the synthesis of the oscillator circuit. we want to stress here again that only one testing point was used and only one measurement was undertaken: the output voltage waveform was measured.
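the way such a fault dictionary may be consulted can be sketched as a simple look-up and comparison against the ff entry. the rows below are copied from table ii, but the decision thresholds (5 % in frequency, 10 % in amplitude) and all names are hypothetical and are not taken from the source.

```python
# Fault-free reference from the dictionary: fosc [Hz], 1st-harmonic amplitude [mV].
FF_FOSC, FF_AMPL = 960.0, 266.6

# A few rows of table ii: defect -> (fosc [Hz] or None if no oscillation, ampl. [mV]).
FAULT_DICTIONARY = {
    "s1 open": (727.3, 52.67),
    "s2 short": (1028.0, 374.0),
    "s3 short": (None, 0.0),
    "s5 open": (930.0, 120.2),
    "s7 short": (310.0, 478.5),
}


def classify(fosc, ampl_mv, f_tol_pct=5.0, a_tol_pct=10.0):
    """Decide which observable reveals the fault (thresholds are assumptions)."""
    if fosc is None:
        return "testable: no oscillation"
    df = 100.0 * abs(fosc - FF_FOSC) / FF_FOSC
    da = 100.0 * abs(ampl_mv - FF_AMPL) / FF_AMPL
    if df > f_tol_pct:
        return "testable: frequency"
    if da > a_tol_pct:
        return "testable: amplitude"
    return "not distinguished"


for defect, (fosc, ampl) in FAULT_DICTIONARY.items():
    print(f"{defect:9s} -> {classify(fosc, ampl)}")
```

the order of the checks mirrors the reasoning in the text: absence of oscillation first, then the oscillation frequency, and only then the amplitude of the first harmonic.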
the additional processing (fft) is unavoidable in order to get the oscillation frequency, so that the numbers given in table i and table ii are obtained with no additional cost and effort.

5. conclusion

implementation of the obt is a challenging issue. this comes from the fact that the method was originally proposed based on the presumption that the active elements exhibit ideal performance. that is not the case. in this paper we demonstrate the proper implementation of obt for the case of a second-order notch cell. this cell may be considered the best representative (among other second-order cells) for the task we undertook, since it is the most complicated and the most frequently used one. it was synthesized to be implemented as an integrated circuit with switched capacitors. since the number and the variety of possible faults is large, we are attacking the problem in several phases, one of them being reported here. only catastrophic faults of the switches were modeled, and the corresponding fault dictionary was created. it was shown that full coverage of the selected faults may be achieved if proper modeling of the operational amplifiers is used and a proper feed-back circuit is synthesized. the results reported here are part of a longer-running project in which we started with continuous-time analog filter cells and are here ending with a switched-capacitor filter cell.

acknowledgement. this research was partly funded by the ministry of education and science of republic of serbia under contract no. tr32004.

references
[1] m. soma, "automatic test generation algorithms for analogue circuits", iee proc. circuit, devices and systems, vol. 143, no. 6, december 1996, pp. 366-373.
[2] c. dufaza and h. ihs, "a bist-dft technique for dc test of analog modules", journal of electronic testing: theory and applications, vol. 9, no. 1-2, 1996, pp. 117-133.
[3] m. marlett and j.
abraham, "dc-iatp: an iterative analog circuit test generation program for generating dc single pattern tests", international test conference, 1988, pp. 839-845.
[4] l. milor and v. viswanathan, "detection of catastrophic faults in analog integrated circuits", ieee transactions on computer aided design, vol. 8, 1989, pp. 114-130.
[5] m. slamani and b. kaminska, "multifrequency analysis of faults in analog circuits", ieee design & test of computers, vol. 12, no. 2, 1995, pp. 70-80.
[6] c. wang, y. yun, h. liang, j. he, m. chan, "multi-frequency test for analog circuits", electron devices and solid-state circuits (edssc), ieee international conference, june 2013, pp. 1, 2, 3-5.
[7] s. huynh, s. kim, m. soma and j. zhang, "automatic analog test signal generation using multifrequency analysis", ieee transactions on circuits and systems ii: analog and digital signal processing, vol. 46, no. 5, may 1999, pp. 565-576.
[8] b. burdiek, "generation of optimum test stimuli for nonlinear analog circuits using nonlinear programming and time-domain sensitivities", proc. of design, automation and test in europe, conference and exhibition, 2001, pp. 603-608.
[9] z. guo and j. savir, "algorithm-based fault detection of analog linear time-invariant circuits", proc. of ieee instrumentation and measurement technology conference, budapest, hungary, may 2001, pp. 49-54.
[10] s. cherubal and a. chatterjee, "parametric fault diagnosis for analog system using functional mapping", proc. of ieee date, nice, france, 1999, pp. 195-200.
[11] v. prasannamoorthy and n.
devarajan, "time domain technique for fault diagnosis of analog circuits with flexible accuracy algorithm", eur. journal of scientific research, vol. 51, no. 2, 2011, pp. 211-221.
[12] k. arabi and b. kaminska, "oscillation-test strategy for analog and mixed-signal integrated circuits", proc. of the 14th ieee vlsi test symposium (vts'96), princeton, new jersey, april/may 1996, pp. 476-482.
[13] a. halder and a. chatterjee, "automated test generation and test point selection for specification test of analog circuits", proc. of 5th international symposium on quality electronic design, 2004, pp. 401-406.
[14] g. hu, h. wang, m. hu and s. yang, "oscillation test strategy for analog filters by monitoring output voltage and supply current", tsinghua science and technology, vol. 12, no. s1, 2007, pp. 78-82.
[15] p. alli, testing a cmos operational amplifier circuit using a combination of oscillation and iddq test methods, m.sc. thesis, louisiana state university, usa, 2004.
[16] m. wong and k. ko, "fault diagnostic improvement method for otm-based testing", proc. of 17th ieee instrumentation and measurement technology conf., 2000, baltimore, md, usa, pp. 1118-1123.
[17] s. yellampalli, a. srivastava, and v. pulendra, "a combined oscillation, power supply current and iddq testing methodology for fault detection in floating gate input cmos operational amplifier", proc. of the 48th midwest symp. on circuits and systems, covington, ky, aug. 2005, pp. 503-506.
[18] v. c. prasad and n. s. c. babu, "selection of test nodes for analog fault diagnosis in dictionary approach", ieee transactions on instrumentation and measurement, vol. 49, no. 6, dec. 2000, pp. 1289-1297.
[19] c. yang, s. tian, and b. long, "application of heuristic graph search to test-point selection for analog fault dictionary techniques", ieee trans. on instrumentation and measurement, vol. 58, no. 7, 2009, pp. 2145-2158.
[20] k. arabi and b.
kaminska, "efficient and accurate testing of analog-to-digital converters using oscillation-test method", proc. of the european design and test conference (ed&tc 97), paris, france, march 1997, pp. 384-352.
[21] k. arabi and b. kaminska, "oscillation-test methodology for low-cost testing of active filters", ieee trans. on instrumentation and measurements, vol. 48, no. 4, august 1999, pp. 798-806.
[22] m. milić, m. stošović and v. litovski, "oscillation based analog testing: a case study", in proceedings of the 34th international conference on information and communication technology, electronics and microelectronics mipro 2011, opatija, croatia, 2011, vol. 1, pp. 118-123.
[23] m. stošović, m. milić and v. litovski, "analog filter diagnosis using the oscillation based method", journal of electrical engineering, issn 1335-3632, vol. 63, no. 6, 2012, pp. 349-356.
[24] m. stošović, m. milić, m. zwolinski and v. litovski, "oscillation-based analog diagnosis using artificial neural networks based inference mechanism", computers and electrical engineering, vol. 39, 2013, pp. 190-201.
[25] a. chaehoi, y. bertrand, l. latorre and p. nouet, "improving the efficiency of the oscillation-based test methodology for parametric faults", latw'03, 4th ieee latin american test workshop, natal, brazil, 2003.
[26] a. raghunatan, h. shin and j. a. abraham, "prediction of analog performance parameters using oscillation based test", proc. 22nd ieee vlsi test symp., apr. 2004, pp. 377-382.
[27] a. raghunatan, j. h. chun, j. a. abraham and a. chatterjee, "quasi-oscillation based test for improved prediction of analog performance parameters", proc. of the itc'04, international test conference 2004, pp. 252-261.
[28] k. suenaga, e. isern, r. picos, s. bota, m. roca and e. garcía-moreno, "application of predictive oscillation-based test to a cmos opamp", ieee transactions on instrumentation and measurement, vol. 59, issue 8, 2010, pp. 2076-2082.
[29] e. romero, m. costamagna, g.
peretti and c. marques, "a performance evaluation of oscillation based test in continuous time filters", international journal of mechanical, industrial science and engineering, vol. 8, no. 1, 2014, pp. 196-201.
[30] m. s. sankari and p. sathish kumar, "oscillation test methodology for built-in analog circuits", international journal of computational engineering research, ijcer, vol. 2, issue no. 3, 2012, pp. 868-877.
[31] m. s. zarnik, f. novak and s. macek, "design of oscillation-based test structures of active rc filters", iee proceedings, circuits, devices and systems, 2000, vol. 147, no. 5, pp. 297-302.
[32] m. wong, "on the issues of oscillation test methodology", ieee transactions on instrumentation and measurement, 2000, vol. 49, no. 2, pp. 240-245.
[33] s. hurst, vlsi testing: digital and mixed analogue/digital techniques, institution of engineering and technology (iet), uk, 1999.
[34] m. s. zarnik, f. novak and s. macek, "efficient go no-go test of active rc filters", international journal of circuit theory and applications, 1998, vol. 26, no. 5, pp. 523-529.
[35] u. kač and f. novak, "all-pass sc biquad reconfiguration scheme for oscillation based analog bist", proc. of the 9th european test symposium, ajaccio, france, 2004, pp. 133-138.
[36] u. kač and f. novak, "oscillation test scheme of sc biquad filters based on internal reconfiguration", journal of electron. test, vol. 23, no. 6, pp. 485-495, december 2007.
[37] u. kač and f.
novak, "reconfiguration schemes of sc biquad filters for oscillation based test", information technology and control, vol. 42, no. 1, pp. 38-47, 2013.
[38] g. huertas, d. vazquez, e. j. peralias, a. rueda and h. l. huertas, "practical oscillation-based test of integrated filters", ieee design & test of computers, vol. 19, no. 6, 2002, pp. 64-72.
[39] k. martin and a. sedra, "effect of the opamp finite gain & bandwidth on the performance of switched-capacitor filters", ieee trans. circuits syst., vol. cas-28, no. 8, pp. 822-829, aug 1981.
[40] j. náhlík, j. hospodka, p. sovka and b. pšenička, "implementation of a two-channel maximally decimated filter bank using switched capacitor circuits", radioengineering, vol. 22, no. 1, april 2013, pp. 167-173.
[41] m. robson and g. russell, "a digital method for testing embedded switched capacitor filters", in proceedings of the conference on european design automation, euro-dac '96 / euro-vhdl '96, pp. 239-244.
[42] m. milić and v. litovski, "soft defects testing in notch sc filters using the oscillation method", in proc. of the lvii etran conf., zlatibor, serbia, 2013, pp. el 2.3.
[43] m. milić and v. litovski, "testing capacitors' hard defects in notch sc filters using the oscillation method", in proc. of the 5th small system simulation symposium, ssss 2014, niš, serbia, pp. 30-36.
[44] p. e. allen and d. r. holberg, cmos analog circuit design, 2nd ed., oxford university press, new york, usa, 2002.
[45] f. h. ironns, active filters for integrated circuits applications, artech house, norwood, ma, usa, 2005.
[46] p. e. fleischer and k. r. laker, "a family of active switched capacitor biquad building blocks", bell system technical journal, no. 58, december 1979, pp. 2235-2269.
[47] -, lt spice user manual, http://www.intactaudio.com/forum/viewtopic.php?t=596.
[48] c. chalk, m. zwolinski, and b. r. wilkins, "test stimulus generation for steady-state analysis of analogue and mixed-signal circuits", proc.
of the 3rd ieee international mixed signal testing workshop, 1997, pp. 85-92.
[49] p. kabisatpathy, a. barua and s. sinha, fault diagnosis of analog integrated circuits, springer, dordrecht, the netherlands, 2005.
[50] s. mosin, "a built-in self-test circuitry based on reconfiguration for analog and mixed-signal ic", information technology and control, 2011, vol. 40, no. 3, pp. 260-264.
[51] g. hu, h. wang, m. hu and s. yang, "oscillation test strategy for analog filters by monitoring output voltage and supply current", tsinghua science and technology, vol. 12, no. s1, july 2007, pp. 78-82.
[52] b. kaminska, "analog and mixed signal test", in: eda for ic system design, verification, and testing, crc press, taylor & francis group, london, 2006.
[53] v. litovski and m. zwolinski, vlsi circuit simulation and optimization, chapman and hall, london, 1997.
[54] -, linear technology, [55] http://www.linear.com/designtools/software/?gclid=ck_dzsaknl4cfqbmtaod2akarg#ltspice

instruction facta universitatis series: electronics and energetics vol. 30, no. 3, september 2017, pp. 375-382 doi: 10.2298/fuee1703375d

mixed mode performance of gaas utb-mosfet with extra insulator region and undoped buried oxide region

shiva prasad das 1, ananya dastidar 2, partha sarkar 1, sushanta k. mohapatra 3
1 department of electronics and communication engineering, centre for advanced post graduate studies, biju patnaik university of technology, odisha, india
2 department of instrumentation and electronics, college of engineering and technology, bhubaneswar, bput, odisha, india
3 school of electronics engineering, kiit university, bhubaneswar, odisha, india

abstract. investigating the mixed-mode performance of gaas utb-mosfets in the nanoscale regime, with a view to "beyond cmos", is a current trend in the semiconductor industry.
here it is proposed to modify the conventional structure by adding an extra insulator region (ir) and an undoped buried oxide region (ubr), in order to study the performance related to digital and analog/rf applications. gaas is considered as the channel material. the ir-utb-soi-n-mosfet has shown promising results with respect to ss, dibl, ft and switching speed.

key words: silicon-on-insulator, utb mosfet, gaas, dibl, analog/rf performance, insulator region.

received september 17, 2016; received in revised form november 30, 2016
corresponding author: sushanta k. mohapatra, school of electronics engineering, kiit university, bhubaneswar, odisha, india (e-mail: skmctc74@gmail.com)

1. introduction

in recent years, there has been a growing demand for integrated circuits (ics) providing better analog/rf performance as well as digital functionality [1]-[3]. fully depleted (fd) mosfets based on silicon-on-insulator (soi) technology [1], [4], [5] are widely used in mixed-mode ics, as they offer a sharp sub-threshold slope, high current drive, high transconductance, reduced parasitic capacitance, and absence of latch-up, which are key parameters for digital applications [6]-[8]. due to their high transconductance to drain current (gm/id) ratio and low body factor, fd-soi-mosfets have been used to design low-power circuits that operate at high and low frequency as well as at high temperature, providing better performance than conventional mosfets [9], [10]. the use of a high electron mobility material like gaas is promising, as it has a higher saturated electron velocity and a higher electron mobility than silicon, allowing it to function at much higher frequencies, with less noise, and to be operated at higher power levels [11], [12]. previously it has been shown by orouji et al.
[13] that soi-mosfets with an extra insulator region (ir-soi), in which an insulator region (hfo2) is included in the silicon active layer and drain region, provide high electron reliability owing to low gate leakage current and low critical electric field. the self-heating effect (she), which is one of the drawbacks of fd-soi, has been reduced by a new structure, the undoped buried region mosfet (ubr-mosfet) [14]. in this paper, the analog/rf performance and some scaling parameters are examined for four structures: the ultra thin body (utb) soi n-channel mosfet (utb-soi-n-mosfet), the utb-soi-mosfet with extra insulator region (ir-utb-soi-n-mosfet), the utb-n-mosfet with undoped buried region under the channel (ubr-utb-soi-n-mosfet), and a new structure, the utb-soi-n-mosfet with both an extra insulator region and an undoped buried region under the channel (ir-ubr-utb-soi-n-mosfet). the study is carried out with the device simulator from silvaco tcad [15].

2. device structure and simulation setup
the schematic representation of the four structures considered for the 2-d simulation (utb-soi-n-mosfet, ir-utb-soi-n-mosfet, ubr-utb-soi-n-mosfet and ir-ubr-utb-soi-n-mosfet) is given in fig. 1. the effective oxide thickness (eot), the gate length (lg), the gaas body thickness (tgaas), the sio2 buried oxide thickness (tbox) and the si substrate thickness (tsub) are 1.1 nm, 60 nm, 10 nm, 50 nm and 100 nm, respectively, in all four structures. the source extension (ls) and the drain extension (ld) are 70 nm each. the source and drain regions are highly doped with n-type donor ions at a concentration of 10^20 /cm^3 each to reduce the mobility degradation due to coulomb scattering. the silicon substrate is doped with p-type acceptor ions at a concentration of 10^18 /cm^3, and the gaas channel region is doped with p-type acceptor ions at a concentration of 10^16 /cm^3 to avoid threshold voltage variation [16]. the metal gate work function is set to 4.6 ev during simulation [17].
the structures are calibrated to meet the requirements of the international technology roadmap for semiconductors (itrs) for the 45 nm technology node [18]. the 2-d numerical device simulator atlas [15] is used for the simulation of the proposed structures. the drain bias is fixed to vdd = 1.0 v as per itrs [19]. to study the analog/rf performance, the simulation is carried out at a drain to source voltage vds = 0.5 v (half of the supply voltage, i.e. vdd/2) [20] with the gate to source voltage (vgs) swept from 0 v to 1.0 v. the threshold voltage is obtained from the id~vgs characteristic curve by using the constant current id = 10^-6 a/µm. in the channel region, the electron and hole shockley-read-hall [21], [22] generation and recombination lifetimes, τn and τp, are set to 1×10^-8 s each. among the material models, the lombardi mobility model [23] is used, which considers the effect of transverse electric fields along with doping and temperature dependent parameters of mobility [24]. the numerical solution used here is based on the drift-diffusion approach [25]. some other material models have also been used, namely the concentration dependent mobility (conmob), the parallel electric field dependence (fldmob), which is required for capturing the velocity saturation effect, shockley-read-hall (srh) and optical recombination [15]. the fermi-dirac model helps to get results close to ideal values by a rational chebyshev approximation [19].

fig. 1 schematic device structures (a) utb-soi-n-mosfet (b) ir-utb-soi-n-mosfet (c) ubr-utb-soi-n-mosfet (d) ir-ubr-utb-soi-n-mosfet

table 1 structure notation
notation used in this article    structure
a                                utb-soi-n-mosfet
b                                ir-utb-soi-n-mosfet
c                                ubr-utb-soi-n-mosfet
d                                ir-ubr-utb-soi-n-mosfet

3.
result analysis
as described previously, the four structures were simulated using the 2-d numerical device simulator, and parameters such as the on-state drive current (ion), off-state leakage current (ioff), ion/ioff ratio, threshold voltage (vth) and power dissipation were evaluated, since these are some of the factors affecting the scaling properties of the devices. the surface potential variation along the channel was also observed. the rf/analog performance analysis was done by extracting parameters such as the transconductance (gm), total capacitance (ctotal), q-factor and cut-off frequency (ft) for the four structures. the sub-threshold slope (ss) was calculated by using the following equation [19]:

$$SS\ (\mathrm{mV/dec}) = \frac{\partial V_{gs}}{\partial (\log I_d)} \tag{1}$$

another vital parameter for scaling is the drain induced barrier lowering (dibl), which was evaluated by the following equation [26]:

$$\mathrm{DIBL} = \frac{V_{th1} - V_{th2}}{0.95} \tag{2}$$

where $V_{th1}$ and $V_{th2}$ are the threshold voltages at vds = 0.05 v and vds = 1 v, so that the denominator is the 0.95 v difference between the two drain biases.

fig. 2 surface potential variation along channel for a, b, c and d at vgs = 1 v (a) at vds = 0.05 v (b) at vds = 1 v
fig. 3 ion and ioff comparison for a, b, c and d (a) at vds = 0.05 v (b) at vds = 1 v

fig. 2 shows the surface potential variation along the channel of the structures a, b, c and d: fig. 2(a) at a drain to source voltage vds = 0.05 v and fig. 2(b) at vds = 1 v. the trade-off between ioff and ion for the different structures is shown in fig. 3. fig. 3(a) shows the ion and ioff comparison between a, b, c and d at vds = 0.05 v and fig. 3(b) shows the ion and ioff comparison between a, b, c and d at vds = 1 v.
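as a rough illustration of eqs. (1) and (2), the sketch below evaluates both quantities numerically. it is only a minimal example under stated assumptions: the function names are hypothetical, ss is approximated by a finite difference between two sub-threshold points (which are invented for the example), and the thresholds are the rounded values printed for structure a in table 2, so the dibl result is close to, but not identical with, the tabulated one.

```python
import math


def subthreshold_slope_mv_per_dec(vgs1, id1, vgs2, id2):
    """Finite-difference estimate of eq. (1): SS = dVgs / d(log10 Id), in mV/dec."""
    return 1e3 * (vgs2 - vgs1) / (math.log10(id2) - math.log10(id1))


def dibl_mv_per_v(vth_low, vth_high, vds_low=0.05, vds_high=1.0):
    """Eq. (2): threshold voltage shift per volt of drain bias, in mV/V."""
    return 1e3 * (vth_low - vth_high) / (vds_high - vds_low)


# structure a, thresholds as printed in table 2 (rounded to 3 decimals)
print(f"dibl = {dibl_mv_per_v(0.420, 0.403):.2f} mV/V")

# hypothetical pair of sub-threshold bias points, one decade of current apart
print(f"ss = {subthreshold_slope_mv_per_dec(0.10, 1e-9, 0.17, 1e-8):.2f} mV/dec")
```

the default drain biases reproduce the 0.95 v denominator of eq. (2); passing other values would adapt the same definition to a different pair of bias points.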
at vds = 0.05 v structure c gives the best ion/ioff ratio, and at vds = 1 v structure b shows a significant improvement in ion/ioff ratio.

fig. 4 (a) static power dissipation for a, b, c, and d, (b) threshold voltage variation at vds = 0.05 v and vds = 1 v

in fig. 4(a), the static power dissipation (pd = ioff × vdd) [27] of the four structures is presented. structure b provides lower static power dissipation than the other three structures. fig. 4(b) provides the threshold voltage variation of the four structures at vds = 0.05 v and vds = 1 v. the extracted values of threshold voltage, sub-threshold slope, dibl and static power dissipation are tabulated for all device structures in table 2. in fig. 5, the transconductance, i.e.

$$g_m = \frac{\partial I_d}{\partial V_{gs}} \tag{3}$$

is given for a, b, c and d. fig. 5(a) and fig. 5(b) show the gm variation with id for the four structures at vds = 0.05 v and vds = 1 v respectively.

fig. 5 trans-conductance (gm) variation with id for a, b, c and d (a) at vds = 0.05 v (b) at vds = 1 v
fig. 6 (a) total capacitance (ctotal) with id for a, b, c and d at vds = 1 v (b) cut-off frequency (ft) variation with id for a, b, c and d at vds = 1 v

in fig. 6(a), the variation of the total capacitance (ctotal = cgd + cgs) for a, b, c and d is given at vds = 1 v, where cgd is the parasitic gate to drain capacitance and cgs is the parasitic gate to source capacitance. another important parameter, the cut-off frequency (ft), is plotted in fig. 6(b):

$$f_t = \frac{g_m}{2\pi\,(C_{gs} + C_{gd})} \tag{4}$$

the q-factor (gm/ss) has been calculated for the four device structures and given in the table 3.
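the cut-off frequency of eq. (4) is a one-line calculation once gm and the two parasitic capacitances are known. the following sketch shows it with si units; the function name and the numerical inputs are hypothetical and are not taken from the simulations reported here.

```python
import math


def cutoff_frequency_hz(gm_s, cgs_f, cgd_f):
    """Eq. (4): ft = gm / (2*pi*(Cgs + Cgd)); gm in siemens, capacitances in farads."""
    return gm_s / (2.0 * math.pi * (cgs_f + cgd_f))


# hypothetical per-micron values, chosen only to illustrate the orders of magnitude
gm = 2.0e-3                    # S/um
cgs, cgd = 1.0e-15, 0.6e-15    # F/um

ft = cutoff_frequency_hz(gm, cgs, cgd)
print(f"ft = {ft / 1e9:.1f} GHz")
```

because gm and the capacitances are all normalized to the same device width, the width cancels and ft comes out directly in hertz.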
table 2 performance parameters-1
structure  vth1 (v)  vth2 (v)  ss1 (mv/dec)  ss2 (mv/dec)  dibl (mv/v)  pd (x10^-12 w)
a          0.420     0.403     69.81         71.95         17.678       1.92
b          0.420     0.404     69.68         71.83         17.589       1.82
c          0.505     0.436     74.11         82.21         72.923       6.55
d          0.505     0.437     74.01         81.90         71.872       6.04

table 3 performance parameters-2
structure  ion1/ioff1 (x10^-9)  ion2/ioff2 (x10^-8)  ctotal (ff/µm)  ft (x10^-11 hz)  q-factor
a          1.686                3.920                1.639           2.00             24.21
b          0.681                4.132                1.655           2.03             9.32
c          2.095                1.034                1.629           2.00             23.07
d          0.773                1.120                1.639           2.03             7.68

4. conclusions
a comparative performance analysis of a new structure was presented, namely an ir-ubr-utb-soi-n-mosfet, which contains an extra insulator region (ir) at the channel source junction, an undoped buried region under the channel and a gaas channel region. the scaling and rf parameters of the ir-ubr-utb-soi-n-mosfet have been obtained along with those of the conventional utb-soi-n-mosfet. from the analysis, it has been found that the sub-threshold slope, dibl and static power dissipation are lower for the ir-utb-soi-n-mosfet than for the other three structures, and that it also provides a better ion/ioff ratio at vds = 1 v. this structural change can therefore make the device a good candidate for switching and low standby operating power applications.

references
[1] s. cristoloveanu, “silicon on insulator technologies and devices: from present to future,” solid. state. electron., vol. 45, no. 8, pp. 1403–1411, 2001.
[2] m. a. pavanello, j. a. martino, v. dessard, and d. flandre, “analog performance and application of graded-channel fully depleted soi mosfets,” solid. state. electron., vol. 44, no. 7, pp. 1219–1222, 2000.
[3] k. kim, “1.1 silicon technologies and solutions for the data-driven world,” in digest of technical papers 2015 ieee international solid-state circuits conference-(isscc), 2015, pp. 1–7.
[4] j.-t. park and j.-p. colinge, “multiple-gate soi mosfets: device design guidelines,” electron devices, ieee trans., vol. 49, no. 12, pp.
2222–2229, 2002.
[5] a. chaudhry and m. j. kumar, “investigation of the novel attributes of a fully depleted dual-material gate soi mosfet,” electron devices, ieee trans., vol. 51, no. 9, pp. 1463–1467, 2004.
[6] s. cristoloveanu and s. li, electrical characterization of silicon-on-insulator materials and devices, vol. 305. springer science & business media, 2013.
[7] b. vandana, “study of floating body effect in soi technology,” int. j. mod. eng. res., vol. 3, no. june, pp. 1817–1824, 2013.
[8] s. k. mohapatra, k. p. pradhan, and p. k. sahu, “ztc bias point of advanced fin based device: the importance and exploration,” facta universitatis: series, electronics and energetics, vol. 28, no. 3, pp. 393–405, 2015.
[9] q. xie, c.-j. lee, j. xu, c. wann, j. y.-c. sun, and y. taur, “comprehensive analysis of short-channel effects in ultrathin soi mosfets,” electron devices, ieee trans., vol. 60, no. 6, pp. 1814–1819, 2013.
[10] h.-s. wong, “beyond the conventional transistor,” ibm j. res. dev., vol. 46, no. 2.3, pp. 133–168, 2002.
[11] r. h. reuss et al., “macroelectronics: perspectives on technology and applications,” proc. ieee, vol. 93, no. 7, pp. 1239–1256, 2005.
[12] j. yoon et al., “gaas photovoltaics and optoelectronics using releasable multilayer epitaxial assemblies,” nature, vol. 465, no. 7296, pp. 329–333, 2010.
[13] a. a. orouji and m. k. anvarifard, “soi mosfet with an insulator region (ir-soi): a novel device for reliable nanoscale cmos circuits,” mater. sci. eng. b, pp. 1–7, 2013.
[14] m. rahimian and a. a. orouji, “a novel nanoscale mosfet with modified buried layer for improving of ac performance and self-heating effect,” mater. sci. semicond. process., vol. 15, no. 4, pp. 445–454, 2012.
[15] atlas user manual. silvaco international, santa clara, 2012.
[16] h. a. el hamid, j. r. guitart, and b.
iñíguez, “two-dimensional analytical threshold voltage and subthreshold swing models of undoped symmetric double-gate mosfets,” electron devices, ieee trans., vol. 54, no. 6, pp. 1402–1408, 2007.
[17] j. p. colinge, “multiple-gate soi mosfets,” solid state electron, vol. 48, no. 6, pp. 897–905, 2004.
[18] “the international technology roadmap for semiconductors,” 2011.
[19] s. k. mohapatra, k. p. pradhan, and p. k. sahu, “temperature dependence inflection point in ultra-thin si directly on insulator (sdoi) mosfets: an influence to key performance metrics,” superlattices microstruct., vol. 78, pp. 134–143, 2015.
[20] s. chakraborty, a. mallik, and c. k. sarkar, “subthreshold performance of dual-material gate cmos devices and circuits for ultralow power analog/mixed-signal applications,” electron devices, ieee trans., vol. 55, no. 3, pp. 827–832, 2008.
[21] w. shockley and w. t. read, “statistics of the recombination of holes and electrons,” phys. rev., vol. 87, pp. 835–842, 1952.
[22] r. n. hall, “electron–hole recombination in germanium,” phys. rev., vol. 87, p. 387, 1952.
[23] c. lombardi, s. manzini, a. saporito, and m. vanzi, “a physically based mobility model for numerical simulation of nonplanar devices,” ieee trans. comput. des. integr. circ. syst., vol. 7, no. 11, pp. 1164–1171, 1988.
[24] p. k. sahu, s. k. mohapatra, and k. p. pradhan, “zero temperature-coefficient bias point over wide range of temperatures for single- and double-gate utb-soi n-mosfets with trapped charges,” mater. sci. semicond. process., vol. 31, pp. 175–183, 2015.
[25] s. selberherr, “analysis and simulation of semiconductor devices,” springer–verlag, wien–new york, 1984.
[26] g. c. patil and s. qureshi, “impact of segregation layer on scalability and analog / rf performance of nanoscale schottky barrier,” j. semicond. technol. sci., vol. 12, no. 1, pp. 66–74, 2012.
[27] k. p. pradhan, d. singh, s. k. mohapatra, and p. k.
sahu, “assessment of iii-v finfets at 20 nm node: a process variation analysis,” procedia comput. sci., vol. 57, pp. 454–459, 2015.

facta universitatis series: electronics and energetics vol. 32, no 4, december 2019, pp. 479-501
https://doi.org/10.2298/fuee1904479d
rafal długosz, katarzyna kubiak, tomasz talaśka, inga zbierska-piątek
received july 1, 2019
corresponding author: rafal długosz, utp university of science and technology, faculty of telecommunication, computer science and electrical engineering, bydgoszcz, poland (e-mail: rafal.dlugosz@gmail.com)

facta universitatis series: electronics and energetics vol. 28, no 4, december 2015, pp. 507-525
doi: 10.2298/fuee1504507s
horizontal current bipolar transistor (hcbt) – a low-cost, high-performance flexible bicmos technology for rf communication applications
tomislav suligoj 1, marko koričić 1, josip žilak 1, hidenori mochizuki 2, so-ichi morita 2, katsumi shinomura 2, hisaya imai 2
1 university of zagreb, faculty of electrical engineering and computing, department of electronics, micro- and nano-electronics laboratory, croatia
2 asahi kasei microdevices co. 5-4960, nobeoka, miyazaki, 882-0031, japan
abstract. in an overview of horizontal current bipolar transistor (hcbt) technology, the state-of-the-art integrated silicon bipolar transistors are described, which exhibit ft and fmax of 51 ghz and 61 ghz and an ft·bvceo product of 173 ghz·v and are among the highest-performance implanted-base silicon bipolar transistors. hcbt is integrated with cmos in a considerably lower-cost fabrication sequence as compared to standard vertical-current bipolar transistors, with only 2 or 3 additional masks and fewer process steps. due to its specific structure, the charge sharing effect can be employed to increase bvceo without sacrificing ft and fmax.
moreover, the electric field can be engineered just by manipulating the lithography masks, achieving high-voltage hcbts with breakdowns up to 36 v integrated in the same process flow with high-speed devices, i.e. at zero additional cost. a double-balanced active mixer circuit is designed and fabricated in hcbt technology. the maximum iip3 of 17.7 dbm at a mixer current of 9.2 ma and a conversion gain of -5 db are achieved.
key words: bicmos technology, bipolar transistors, horizontal current bipolar transistor, radio frequency integrated circuits, mixer, high-voltage bipolar transistors.
received march 9, 2015
corresponding author: tomislav suligoj, university of zagreb, faculty of electrical engineering and computing, department of electronics, micro- and nano-electronics laboratory, croatia (e-mail: tom@zemris.fer.hr)

1. introduction
in the highly competitive wireless communication markets, the rf circuits and systems are fabricated in technologies that are very cost-sensitive. in order to minimize the fabrication costs, the sub-10 ghz applications can be processed by using the high-volume silicon technologies. it has been identified that the optimum solution might

parallel matrix multiplication circuits for use in kalman filtering
1 utp university of science and technology, faculty of telecommunication, computer science and electrical engineering
2 adam mickiewicz university, faculty of mathematics and computer science
3 aptiv services poland
abstract. in this work we propose several ways of the cmos implementation of a circuit for the multiplication of matrices. we mainly focus on parallel and asynchronous solutions; however, serial and mixed approaches are also discussed for comparison. practical applications are the motivation behind our investigations. they include, for example, fast kalman filtering commonly used in automotive active safety functions. in such filters, numerous time-consuming operations on matrices are performed.
an additional problem is the growing amount of data to be processed. it results from the growing number of sensors in the vehicle as fully autonomous driving is developed. software solutions may prove insufficient in the near future. that is why hardware coprocessors are of interest to us, as they could take over some of the most time-consuming operations. the paper presents possible solutions, tailored to specific problems (sizes of multiplied matrices, number of bits in signals, etc.). the estimates of the performance made on the basis of selected simulation and measurement results show that multiplication of 3×3 matrices with a data rate of 20-100 msps is achievable in the cmos 130 nm technology.
key words: matrix multiplication, parallel circuits, asynchronous solutions, kalman filter, automotive applications, cmos.
© 2019 by university of niš, serbia | creative commons license: cc by-nc-nd

$$x(n+1) = A\,x(n) + B\,u(n) + v(n), \qquad y(n+1) = C\,x(n) + w(n)$$

$$A_{n_A,m_A} \times B_{n_B,m_B} = C_{n_C,m_C}, \qquad k = m_A = n_B, \quad n_C = n_A, \quad m_C = m_B$$

$$A_{3,4} \times B_{4,2} \rightarrow C_{3,2}: \quad
\begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \end{bmatrix}
\times
\begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \\ b_{31} & b_{32} \\ b_{41} & b_{42} \end{bmatrix}
=
\begin{bmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \\ c_{31} & c_{32} \end{bmatrix}$$

$$\begin{bmatrix} a_{11} & a_{12} & a_{13} & \ldots & a_{1 m_A} \\ a_{21} & a_{22} & a_{23} & \ldots & a_{2 m_A} \\ a_{31} & a_{32} & a_{33} & \ldots & a_{3 m_A} \end{bmatrix}
\times
\begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \\ \vdots & \vdots \\ b_{n_B 1} & b_{n_B 2} \end{bmatrix}
=
\begin{bmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \\ c_{31} & c_{32} \end{bmatrix}$$

nba = nbm + log2 k + d
nbtl = log2 nbi
nsclk = nbi · k

$$[C] = [A_1\ A_2\ A_3\ \ldots\ A_M] \times [B_1\ B_2\ B_3\ \ldots\ B_N]^T, \qquad N = M$$

$$c_{nm} = \sum_{k=1}^{K} a_{km} \cdot b_{nk}$$

$$B_n = 2^0 \cdot b_{n,1} + 2^1 \cdot b_{n,2} + 2^2 \cdot b_{n,3} + \cdots + 2^{L-1} \cdot b_{n,L}$$

$$\begin{aligned}
[C] ={}& 2^0 \cdot A_1 \cdot b_{1,1} + 2^1 \cdot A_1 \cdot b_{1,2} + \cdots + 2^{L-1} \cdot A_1 \cdot b_{1,L} \\
&+ 2^0 \cdot A_2 \cdot b_{2,1} + 2^1 \cdot A_2 \cdot b_{2,2} + \cdots + 2^{L-1} \cdot A_2 \cdot b_{2,L} \\
&+ \cdots \\
&+ 2^0 \cdot A_M \cdot b_{N,1} + 2^1 \cdot A_M \cdot b_{N,2} + \cdots + 2^{L-1} \cdot A_M \cdot b_{N,L}
\end{aligned}$$

$$\begin{aligned}
[C] ={}& 2^0 \cdot (A_1 \cdot b_{1,1} + A_2 \cdot b_{2,1} + \cdots + A_M \cdot b_{N,1}) \\
&+ 2^1 \cdot (A_1 \cdot b_{1,2} + A_2 \cdot b_{2,2} + \cdots + A_M \cdot b_{N,2}) \\
&+ 2^2 \cdot (A_1 \cdot b_{1,3} + A_2 \cdot b_{2,3} + \cdots + A_M \cdot b_{N,3}) \\
&+ \cdots + 2^{L-1} \cdot (A_1 \cdot b_{1,L} + A_2 \cdot b_{2,L} + \cdots + A_M \cdot b_{N,L})
\end{aligned}$$
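the regrouping of the product by bit weights of the second operand can be checked numerically. the sketch below is only an illustration under stated assumptions, not the circuit proposed in the paper: it uses plain python lists, unsigned integer elements, standard row-by-column indexing, and hypothetical function names. each bit plane of b contributes a sum of selected elements of a, which is then weighted by the corresponding power of two.

```python
def matmul(a, b):
    """Reference product: c[i][j] = sum over k of a[i][k] * b[k][j]."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def matmul_bit_planes(a, b, n_bits):
    """Same product, accumulated one bit plane of b at a time.

    Every element of b is written as sum_l 2**l * bit_l, so each partial
    result needs only additions of elements of a, followed by a shift.
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    c = [[0] * cols for _ in range(rows)]
    for l in range(n_bits):
        for i in range(rows):
            for j in range(cols):
                # add a[i][k] only where bit l of b[k][j] is set
                partial = sum(a[i][k] for k in range(inner) if (b[k][j] >> l) & 1)
                c[i][j] += partial << l  # weight the plane by 2**l
    return c


a = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]  # 3 x 4
b = [[1, 2], [3, 4], [5, 6], [7, 8]]                # 4 x 2, all elements fit in 4 bits
assert matmul_bit_planes(a, b, n_bits=4) == matmul(a, b)
```

the number of bit planes must cover the widest element of b; with too few planes the high-order contributions are silently dropped.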
$$f_p = \frac{1}{k \cdot (5 + 20)}, \qquad f_s = \frac{1}{k \cdot 16 \cdot 20}$$

for k = 16: fp = 2.5, fs = 0.195

facta universitatis series: electronics and energetics vol. 30, no 3, september 2017, pp. 417-427
doi: 10.2298/fuee1703417s
design and implementation of non-uniform quantizers for discrete input samples and its application to an image processing algorithm
nikola simić 1, zoran h. perić 1, milan savić 2
1 university of niš, faculty of electronic engineering, department of telecommunications, niš, republic of serbia
2 university of pristina, faculty of natural science and mathematics, department of informatics, kosovska mitrovica, republic of serbia
abstract. this paper describes an algorithm for grayscale image compression based on non-uniform quantizers designed for discrete input samples. non-uniform quantization is performed in two steps for unit variance, whereas design is done by introducing a discrete variance.
the best theoretical and experimental results are obtained for those discrete values of variance for which the operating range of the quantizer lies in the vicinity of the maximal signal value that can appear at the input. the experiment is performed by applying the proposed quantizers to the compression of standard test grayscale images, as a classic example of a discrete input source. the proposed fixed non-uniform quantizers, designed for discrete input samples, provide up to 4.93 [db] higher psqnr compared to the fixed piecewise uniform quantizers designed for discrete input samples.
key words: discrete input samples, grayscale image processing, non-uniform quantization, optimal input range.
received november 6, 2016; received in revised form january 17, 2017
corresponding author: nikola simić, faculty of electronic engineering, university of niš, serbia, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: simicnikola90@gmail.com)

1. introduction
the interest in methods of digital image processing comes from two basic ideas. first of all, rapidly growing information systems aim at reducing the amount of data required for data processing in order to use narrower bandwidth, as well as to save available storage. next, visual interpretation has to be improved since digital images are widely used in a number of applications [1]. generally, all compression algorithms may be classified in two groups: 'lossless' compression algorithms, if there is no loss of information, and 'lossy' methods, if some information is lost irreversibly [1], [2]. even though there is a variety of compression algorithms for different purposes [3], research areas are still expanding. in recent years, schemes which incorporate compressive sensing became very important, and some restoration as well as image reconstruction schemes made an impact in the image processing field [4]-[5].
besides schemes developed for software applications, some effort has also been devoted to fpga based solutions [6]. this paper deals with a type of improved btc (block truncation coding) algorithm, which is a 'lossy' method used for compression of grayscale images [7]. although the basic algorithm has been well known for years, some upgrades proposed in recent years have found application in modern systems [8]. moreover, an improved block truncation coding algorithm based on optimized dot diffusion was proposed by guo et al. [9], whereas an effective image retrieval system was presented a year later [10]. also, a data hiding scheme based on the btc algorithm, designed to embed a huge amount of watermarks, was presented in [11], so it can be concluded that the core algorithm can still be improved and implemented in modern systems. although the core algorithm and its modifications usually cannot improve the coding gain compared to modern state-of-the-art techniques such as jpeg and jpeg2000, their computational complexity is much lower than that of the aforementioned solutions, which makes them very suitable for image retrieval purposes [10]. the difference in designing fixed uniform quantizers for continuous and discrete input was observed in papers [12], [13]. further research in this direction included the design of fixed piecewise uniform quantizers described in [14]. this paper is a logical continuation of that research. we expect that the gain due to the different non-uniform quantizer design for discrete and continuous input is higher than the maximal difference of psqnr (peak signal-to-quantization-noise ratio) between the fixed piecewise uniform (l=16) and the optimal non-uniform quantizer, which is equal to 0.7 [db] (for continuous input signal) [14], [15]. the proposed design is fixed; it was tested on a set of standard test grayscale images and the optimal parameters were found.
however, a non-uniform quantizer can also be designed by using the lloyd-max algorithm, which represents a very powerful iterative solution [16]. moreover, high-quality performance can be achieved by introducing variance adaptation, which would provide better quality of the reconstructed image [17]. on the other hand, the proposed design is less complex and requires less processing time, as it represents a kind of fixed scalar quantization. the paper is organized as follows. in section 2, basic modelling of the discrete input source is shown, improved by introducing non-uniform quantization. section 3 describes an algorithm for grayscale image compression that is used for experimental analysis. finally, the obtained theoretical and experimental results, as well as the obtained gain in comparison to other models, are presented in section 4.

2. system model
the considered system consists of two stages: the purpose of the uniform quantizer $Q_0$ exploited in the first stage is to convert the analog input signal to discrete samples, whereas the proposed quantizer $Q$, designed for discrete input, is exploited in the second stage in order to perform additional data compression. in the first step, samples with a continuous amplitude have to be quantized with a fixed uniform quantizer $Q_0$, which is described by $N_0$ output levels, $X = \{x_1, x_2, \ldots, x_{N_0}\}$, and the maximal amplitude $x_{max}$, which depends on the input signal range [14], [17]. the considered pixel values of standard grayscale images are described with 8 bits and can take values from 0 to 255, so $x_{N_0} = 255$. furthermore, the quantization process in the btc algorithm is based on quantization of the difference between the original pixel value and the mean value of all pixels in a block. therefore, the number of output levels $N_0$ is equal to 512. on the other hand, samples with continuous amplitude can be described only as random variables, since the input information is unknown.
in probability theory, random variables are described by using the probability density function (pdf), which provides the relative likelihood for the observed random variable to take on a given value. it has been shown in the literature that a laplacian source ensures good matching between a btc model and reality [1], [7]. consequently, in the rest of the paper we will suppose that the information source is laplacian with a memoryless property and mean value equal to zero. it is defined with:

$$p(x) = \frac{1}{\sqrt{2}\,\sigma} \exp\left(-\frac{\sqrt{2}\,|x|}{\sigma}\right), \tag{1}$$

where $\sigma$ represents the standard deviation of the random variable $x$. the second step of the quantization process involves quantization of the discrete output samples from the quantizer $Q_0$ using $N$ quantization levels, where $N < N_0$. the probabilities of these discrete input levels for the laplacian distribution are:

$$P_i = \int_{x_i}^{x_{i+1}} p(x)\,\mathrm{d}x = \frac{1}{2}\left[\exp\left(-\frac{\sqrt{2}\,x_i}{\sigma}\right) - \exp\left(-\frac{\sqrt{2}\,x_{i+1}}{\sigma}\right)\right], \tag{2}$$

where $i = 0, \ldots, N_0 - 1$. the main goal of this phase is additional data compression. in the rest of the paper the quantizer from the second step is denoted with $Q$. this paper deals with the design and optimization of the quantizer $Q$. so far, the literature has described the application of both uniform and piecewise uniform quantizers; in this paper we propose the application of a non-uniform quantizer, since it provides better quality of the reconstructed signal for an equal number of quantization levels [1]. the design of the non-uniform quantizer $Q$ is done as follows. firstly, we design the optimal compandor with $N$ quantization levels for the unit standard deviation ($\sigma = 1$). its compressor function maps the range $(-\infty, \infty)$ to $(-1, 1)$. the compressor function formed in this way can be defined with:

$$c(x) = -1 + 2\,\frac{\int_{-\infty}^{x} p^{1/3}(t)\,\mathrm{d}t}{\int_{-\infty}^{\infty} p^{1/3}(t)\,\mathrm{d}t}. \tag{3}$$

decision thresholds obtained in this way can be calculated as [15]:

$$t_i = \frac{3}{\sqrt{2}} \log\left(\frac{2i}{N}\right), \quad 0 < i \le \frac{N}{2}, \tag{4}$$

$$t_i = \frac{3}{\sqrt{2}} \log\left(\frac{N}{2(N-i)}\right), \quad \frac{N}{2} < i < N. \tag{5}$$
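the thresholds of eqs. (4) and (5) follow from inverting the compressor function of eq. (3) at uniformly spaced points $-1 + 2i/N$. the sketch below checks this numerically for the unit-variance laplacian pdf of eq. (1). it is a minimal illustration: the closed form used for $c(x)$ is obtained by carrying out the integrals in eq. (3) by hand, and the function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)


def compressor(x):
    """Closed form of eq. (3) for the laplacian pdf with sigma = 1.

    Since p(t)**(1/3) is proportional to exp(-sqrt(2)*|t|/3), the ratio of
    integrals gives c(x) = sign(x) * (1 - exp(-sqrt(2)*|x|/3)).
    """
    return math.copysign(1.0 - math.exp(-SQRT2 * abs(x) / 3.0), x)


def threshold(i, n):
    """Inner decision threshold t_i for 0 < i < n, following eqs. (4) and (5)."""
    if i <= n / 2:
        return 3.0 / SQRT2 * math.log(2.0 * i / n)
    return 3.0 / SQRT2 * math.log(n / (2.0 * (n - i)))


n = 16
for i in range(1, n):
    # companding places the thresholds uniformly in the compressed domain
    assert math.isclose(compressor(threshold(i, n)), -1.0 + 2.0 * i / n, abs_tol=1e-12)
print([round(threshold(i, n), 3) for i in range(1, n)])
```

the outermost thresholds ($i = 0$ and $i = N$) are not produced by these formulas, since the logarithm diverges there; the support limits have to be chosen separately.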
perić, m. savić furthermore, representational levels are determined with [15]: 2 11 2 log 2 3 n i n i i        , (6) . 21)1(2 log 2 3 ni n n n i           (7) in all previous equations log(x) represents natural logarithm of x. the range of quantizer designed in this way is (tn, tn). since the obtained range is not adjusted to the theoretical range of pixel values, denormalization is required. due to the fact that for a low number of quantization levels tn < xn0 [15] is always valid, denormalization is performed by introducing a discrete variance d̂ . it is obtained by multiplying decision thresholds ti and representational levels i with discrete variance d̂ that is used for quantizer designing. finally, decision thresholds and representational levels of quantizer q are determined with: ,0,ˆ nitx dii   (8) .1,ˆ niy dii  (9) the maximal support xn can be defined in different ways [15], and in this paper we have decided to choose a simplest form in order to place the last represent at the half of the decision range. however, in the case if xn < xn0, the overload distortion will exist. on the other hand, if xn > xn0, the range [xn0, xn] will be unused. as a result, higher granular distortion would exist. if the system conditions require designing of fixed quantizer with the unused range (case xn > xn0), we propose additional modification by introducing another denormalization parameter . its function is to adapt the range [xn, xn] formed in the previous step, to the range [xr, xr], where the desired maximal value of the range is denoted with xr. consequently, we define parameter  with: nr xx / . (10) finally, decision thresholds and representational levels of quantizer q in the case xn > xn0 are equal to: ,0, ' nixx ii  (11) .1, ' niyy ii  (12) as this is a kind of a „lossy‟ compression method, some information will be lost irreversibly during the quantization process. 
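the design steps above can be sketched in a few lines of python. this is a minimal sketch, not the authors' implementation: the function names are hypothetical, and the outer threshold is an assumption, chosen here so that the last representational level sits at the midpoint of the last cell (one reading of the "simplest form" mentioned above). with this choice the sketch gives a maximal support of about 132.3 for 32 levels and a discrete variance of 15.

```python
import bisect
import math


def design_quantizer(n_levels, sigma_d):
    """Return (x, y): thresholds x_0..x_N and levels y_1..y_N, eqs. (4)-(9)."""
    c = 3.0 / math.sqrt(2.0)
    half = n_levels // 2
    t = [0.0] * (n_levels + 1)   # normalized thresholds t_0..t_N
    g = [0.0] * (n_levels + 1)   # normalized levels, index 0 unused
    for i in range(1, half + 1):
        t[i] = c * math.log(2.0 * i / n_levels)                  # eq. (4)
        g[i] = c * math.log((2.0 * i - 1.0) / n_levels)          # eq. (6)
    for i in range(half + 1, n_levels + 1):
        g[i] = c * math.log(n_levels / (2.0 * (n_levels - i + 1) - 1.0))  # eq. (7)
    for i in range(half + 1, n_levels):
        t[i] = c * math.log(n_levels / (2.0 * (n_levels - i)))   # eq. (5)
    # assumption: last level is the midpoint of the last decision interval
    t[n_levels] = 2.0 * g[n_levels] - t[n_levels - 1]
    t[0] = -t[n_levels]
    x = [sigma_d * v for v in t]        # eq. (8)
    y = [sigma_d * v for v in g[1:]]    # eq. (9)
    return x, y


def quantize(value, x, y):
    """Map a value to its representational level; out-of-range values are clipped."""
    idx = bisect.bisect_right(x, value) - 1
    idx = max(0, min(len(y) - 1, idx))
    return y[idx]


if __name__ == "__main__":
    x, y = design_quantizer(32, 15)
    print(round(x[-1], 1), quantize(-14, x, y))
```

clipping in `quantize` is where overload distortion arises when the support is narrower than the input range.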
as a standard measure of reconstructed signal quality we estimate the distortion ($D$), which consists of both granular ($D_g$) and overload ($D_o$) distortion that can be calculated with [8], [9]:
$$D_g = 2\sum_{i=1}^{N/2}\sum_{j=1}^{k_i} (x_{ij} - y_i)^2\, P(x_{ij}), \tag{13}$$
$$D_o = 2\sum_{j=1}^{s} (x_j - y_{N/2})^2\, P(x_j). \tag{14}$$
in eq. (13) the parameter $k_i$ denotes the number of input levels mapped to $y_i$, whereas $x_{ij} \in X$. moreover, in eq. (14) $x_j \in X$, and the parameter $s$ denotes the total number of pixel values from the theoretical range that are not placed within the designed range. this parameter can be calculated as:
$$s = x_{N_0} - x_N. \tag{15}$$
finally, the total distortion is equal to:
$$D_t = D_g + D_o. \tag{16}$$
3. algorithm for image processing
the proposed design of the second-stage quantizer $Q$ from section 2 is tested by analyzing its application to the image processing algorithm, defined as follows.
1. the image is divided into $M$ non-overlapping blocks of dimensions $m \times m$.
2. each block is processed separately by sending data and reconstructing information at the receiver side. the algorithm processes pixels from left to right and from top to bottom.
3. the mean value of all pixels in the block ($x_{av}$) is calculated and then quantized ($\hat{x}_{av}$) with a fixed uniform quantizer. in order to minimize the error in the reconstruction process, the coding process uses values which are available to the decoder.
4. the difference blocks of $m \times m$ pixels are formed. elements of a block are denoted with $d_{i,j}$ and obtained as:
$$d_{i,j} = x_{i,j} - \hat{x}_{av}, \tag{17}$$
where $x_{i,j}$ is the original pixel value and $i = 1, \ldots, m$; $j = 1, \ldots, m$. elements of a difference block have laplacian distribution [1], and they can take integer values from $[-x_{N_0}, x_{N_0}]$.
5. elements of difference blocks are quantized by using the proposed fixed non-uniform quantizers from section 2. these elements are denoted with $\hat{d}_{i,j}$, coded with $\log_2 N$ bits and transmitted to the receiver.
6. in the receiver, pixel reconstruction is done as:
$$\hat{x}_{i,j} = \hat{d}_{i,j} + \hat{x}_{av}. \tag{18}$$
the quantization in step 5 introduces distortion of the original image. it can be experimentally measured as [9]:
$$D = \frac{1}{m \cdot m}\sum_{i=1}^{m}\sum_{j=1}^{m} (x_{i,j} - \hat{x}_{i,j})^2 = \frac{1}{m \cdot m}\sum_{i=1}^{m}\sum_{j=1}^{m} (d_{i,j} - \hat{d}_{i,j})^2. \tag{19}$$
the flow chart of this algorithm is shown in fig. 1.
fig. 1 flow chart of the proposed grayscale image compression method
4. numerical results
to demonstrate the performance of the proposed algorithm for image compression, we compare theoretical with experimental results obtained for a set of standard test grayscale images, and we also compare them with the results available in the literature for the piecewise uniform quantization model [14]. all theoretical calculations and experimental results are obtained for a set of three standard test grayscale images (lena, street and boat). we estimate system performance using the average bit-rate $R_b$ and psqnr, which represent standard measures. since we discuss fixed non-uniform quantizers, the average bit-rate depends on the number of quantization levels $N$ and the number of bits required for transmitting the mean value $\hat{x}_{av}$. on the other hand, psqnr is defined with [13], [14], [17]:
$$\mathrm{PSQNR} = 10 \log_{10}\left(\frac{x_{N_0}^2}{D}\right)\ [\mathrm{dB}]. \tag{20}$$
for measuring the experimental $\mathrm{PSQNR}_{ex}$ we use eq. (20), where $D$ is defined with eq. (19). however, theoretical results have to include a weighting function, since input samples do not occur with the same probabilities [14]. the weighting function in the linear domain for the tested images is shown in fig. 2.
fig. 2 the weighting function
in fig. 2, $\sigma_i$ represents the standard deviation of the difference between pixels and the mean value of the block that the pixel belongs to.
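steps 3 to 6 of the algorithm and the quality measures of eqs. (19) and (20) can be illustrated for a single block as follows. this is a sketch under stated assumptions: the step of the mean quantizer and the stand-in difference quantizer are hypothetical placeholders (the paper's quantizer from section 2 would be plugged in instead), and all function names are illustrative only.

```python
import math


def quantize_mean(mean, step=4):
    """Hypothetical fixed uniform quantizer for the block mean (step is an assumption)."""
    return min(255, int(mean // step) * step + step // 2)


def code_block(block, quantize_diff):
    """Steps 3-6 for one m x m block: returns the reconstructed block."""
    m = len(block)
    mean = sum(sum(row) for row in block) / (m * m)
    mean_q = quantize_mean(mean)                 # step 3: value known to the decoder
    recon = []
    for row in block:
        # step 4: difference, step 5: quantization, step 6: reconstruction, eqs. (17)-(18)
        recon.append([quantize_diff(p - mean_q) + mean_q for p in row])
    return recon


def distortion(orig, recon):
    """Mean squared error over one block, eq. (19)."""
    m = len(orig)
    return sum((orig[i][j] - recon[i][j]) ** 2
               for i in range(m) for j in range(m)) / (m * m)


def psqnr(d, x_max=255):
    """Peak signal to quantization noise ratio in dB, eq. (20)."""
    return 10.0 * math.log10(x_max ** 2 / d)


def stand_in_quantizer(diff, step=8):
    """Placeholder uniform quantizer for differences, not the proposed design."""
    return round(diff / step) * step


if __name__ == "__main__":
    block = [[52, 55, 61, 66], [70, 61, 64, 73], [63, 59, 55, 90], [67, 61, 68, 104]]
    recon = code_block(block, stand_in_quantizer)
    d = distortion(block, recon)
    print(d, psqnr(d))
```

because the decoder only sees the quantized mean, the difference is formed against `mean_q` rather than the exact mean, as step 3 requires.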
taking the previous consideration into account, including weighted averaging for the observed test grayscale images and considering that the total distortion is defined with eq. (16), theoretical results are denoted with $\mathrm{PSQNR}_{wav}$. this measure is defined with [14]:
$$\mathrm{PSQNR}_{wav} = \sum_{i=1}^{255} w(\sigma_i)\,\mathrm{PSQNR}(\sigma_i)\ [\mathrm{dB}]. \tag{21}$$
table 1 shows the experimental results of applying the proposed algorithm for grayscale image compression, as well as the corresponding theoretical results. it can be seen that the experimental results follow the changes of the theoretical values very well, whereas the relative difference between theoretical and experimental values occurs due to non-ideal modelling with the laplacian source as well as because of averaging over a set of images [18]. from table 1, it can be clearly seen that the best theoretical and experimental results are obtained for those values of discrete variance ($\hat{\sigma}_d = 17$ for $N = 32$ and $\hat{\sigma}_d = 15$ for $N = 64$) which ensure an input range of quantizer $Q$ as close as possible to the range $(-152, 152)$ [14], [17]. consequently, this means that parameter $x_r = 152$.
table 1 comparison of experimental and theoretical results for the proposed model
N    $\hat{\sigma}_d$   PSQNR_wav [dB]   PSQNR_ex [dB]   x_N   R_b [bpp]
32   15                 46.82            47.57           132   5.375
32   17                 46.43            46.94           149   5.375
32   29                 44.59            44.51           255   5.375
64   15                 49.38            51.57           154   6.375
64   24                 49.00            50.85           247   6.375
64   29                 48.01            48.50           298   6.375
moreover, it can be noticed that for the case $N = 64$ and $\hat{\sigma}_d = 29$, overload distortion does not exist, since the range $(-298, 298)$ is wider than the theoretical range $(-255, 255)$ and the support region is not adapted to the theoretical one. in this case, decision thresholds and representational levels could be calculated using eqs. (10)-(12). however, this modification involves additional hardware requirements and processing time, as well as information about $x_r$ for specific systems, which depends on the nature of the input signal. fig. 3 shows the original test grayscale images of resolution 512 × 512 pixels, while fig. 4 shows the corresponding images from fig. 3 after processing with the proposed algorithm for $N = 32$ quantization levels and $\hat{\sigma}_d = 15$.
fig. 3 standard test grayscale images: (a) lena, (b) boat and (c) street
fig. 4 standard test grayscale images from fig. 3, after compression with the proposed algorithm ($N = 32$): (a) lena, (b) boat and (c) street
in order to compare the obtained results with models available in the literature, we compare both experimental and theoretical results with the system performance of the model based on fixed piecewise uniform quantizers designed for discrete input, as it represents a model with similar complexity. the experimental comparison is expressed as the experimental gain of the proposed method, which represents the difference of psqnr between the proposed results and the equivalent results from savic et al. [14], i.e. gain [dB] = $\mathrm{PSQNR}_{ex}$ [dB] − $\mathrm{PSQNR}^{inf}_{eq(N)}$ [dB], where equivalent results are provided for $N = 32$ and $N = 64$ quantization levels. in [14], the experimental results closest to non-uniform quantization are achieved for $N = 32$ and $L = 16$ ($\mathrm{PSQNR}^{inf}_{ex(32)}$ = 42.64 [dB], $R_b$ = 5.375 [bpp]), whereas the corresponding theoretical performance is $\mathrm{PSQNR}^{inf}_{th(32)}$ = 42.29 [dB]. since the paper [14] did not deal with systems that use $N = 64$ levels, the comparison for these results is done using the rule that psqnr values increase or decrease by 5.5 [dB] when the bit-rate changes by 1 bit [13], [14]. given that the bit-rate difference between quantizers designed for $N = 32$ and $N = 64$ quantization levels is 1 [bpp], the corresponding result for $N = 64$, which is used for comparison, is $\mathrm{PSQNR}^{inf}_{eq(64)}$ = 42.64 + 1 · 5.5 = 48.14 [dB].
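the gain definition above is simple arithmetic on the values of table 1, and can be checked with a short sketch. the numbers are taken from the text; the variable and function names are illustrative only.

```python
import math

# experimental psqnr from table 1, keyed by (number of levels, discrete variance)
PSQNR_EX = {
    (32, 15): 47.57, (32, 17): 46.94, (32, 29): 44.51,
    (64, 15): 51.57, (64, 24): 50.85, (64, 29): 48.50,
}
PSQNR_INF_32 = 42.64   # experimental reference from [14] for 32 levels
DB_PER_BIT = 5.5       # rule quoted in the text: 5.5 dB per 1 bit of bit-rate


def reference_psqnr(n_levels):
    """Equivalent reference psqnr, extrapolated from the 32-level result."""
    extra_bits = math.log2(n_levels) - math.log2(32)
    return PSQNR_INF_32 + DB_PER_BIT * extra_bits


def gains():
    """Experimental gain in dB for every row of table 1."""
    return {key: round(value - reference_psqnr(key[0]), 2)
            for key, value in PSQNR_EX.items()}


if __name__ == "__main__":
    for key, gain in gains().items():
        print(key, gain)
```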
comparing the obtained results from table 1 with the corresponding results ($\mathrm{PSQNR}^{inf}_{ex(32)}$ and $\mathrm{PSQNR}^{inf}_{eq(64)}$) from [14], the obtained experimental gain is shown in table 2 for the same number of quantization levels.
table 2 experimental gain of the proposed model in comparison to the piecewise uniform quantization model
N    $\hat{\sigma}_d$   gain [dB]
32   15                 4.93
32   17                 4.30
32   29                 1.87
64   15                 3.43
64   24                 2.71
64   29                 0.36
by observing table 2, it can be concluded that fixed non-uniform quantizers designed for discrete input samples with $N = 32$ and $N = 64$ quantization levels give from 0.36 to 4.93 [dB] higher psqnr compared to the fixed piecewise uniform quantizers designed for discrete input samples. in addition, comparing the theoretical results from table 1 with $\mathrm{PSQNR}^{inf}_{th(32)}$, it can be concluded that, besides the experimental gain, the proposed improved theoretical model that uses discrete variance predicts a gain of up to 4.52 [dB] compared to the similar system, confirming the experimental results.
5. conclusion
in this paper we described a novel method for non-uniform quantizer design for discrete input samples and we tested the proposed quantizer for grayscale image coding. considering that quantizers designed for continuous and discrete signals have a different nature, we have introduced the discrete design variance as an additional and effective parameter in the process of quantizer design for discrete input samples. system performance was discussed using weighted averaging of psqnr for a set of three standard test grayscale images. the experimental results demonstrate that the proposed method outperforms other similar models. for most of the discussed cases, the obtained gain of the proposed discrete solution is much higher than the maximal difference of psqnr between the piecewise uniform ($L = 16$) and the optimal non-uniform quantizer, which is equal to 0.7 [dB] (for continuous input signal); this justifies the introduction of the proposed quantizer design.
furthermore, an additional system modification was proposed to adjust the quantizer design in special cases. however, this modification requires additional computing time as well as information about the set of input images. to generalize this approach, future work will include testing of specific images in order to find optimal values of the input range support, as well as implementation for different types of discrete input source.
acknowledgments: this work is supported by serbian ministry of education and science through mathematical institute of serbian academy of sciences and arts (project iii44006) and by serbian ministry of education, science and technological development (project tr32035).
references
[1] jayant n. s., noll p., digital coding of waveforms, prentice hall pb, 1984.
[2] yun q. shi, huifang sun, image and video compression for multimedia engineering, taylor & francis group, 2008.
[3] m. savic, z. peric, n. simic, "coding algorithm for grayscale images based on linear prediction and dual mode quantization", expert systems with applications, vol. 42, pp. 7285-7291, 2015.
[4] n. eslahi, a. aghagolzadeh, "compressive sensing image restoration using adaptive curvelet thresholding and nonlocal sparse regularization", ieee transactions on image processing, vol. 25, no. 7, pp. 3126-3140, july 2016.
[5] j. musić, t. marasović, v. papić, i. orović, s. stanković, "performance of compressive sensing image reconstruction for search and rescue", ieee geoscience and remote sensing letters, vol. 13, no. 11, pp. 1739-1743, nov. 2016.
[6] a. napieralski, j. cłapa, k. grabowski, m. napieralska, w. sankowski, p. sękalski, m. zubert, "image and video processing with fpga support used for biometric as well as other applications", facta universitatis, series: electronics and energetics, vol. 28, no. 2, june 2015, pp. 165-175.
[7] y. yang, q. chen, y. wan, "a fast near-optimum block truncation coding method using a truncated k-means algorithm and inter-block correlation", international journal of electronics and communications (aeu), 2011, no. 65, pp. 576-581.
[8] s. kim, d. lee, j-s. kim, h-j. lee, "a block truncation coding algorithm and hardware implementation targeting 1/12 compression for lcd overdrive", journal of display technology, vol. 12, no. 4, pp. 376-389, april 2016.
[9] j-m. guo, y-f. liu, "improved block truncation coding using optimized dot diffusion", ieee transactions on image processing, vol. 23, no. 3, pp. 1269-1275, march 2014.
[10] j-m. guo, h. prasetyo, n-j. wang, "effective image retrieval system using dot-diffused block truncation coding features", ieee transactions on multimedia, vol. 17, no. 9, pp. 1576-1590, september 2015.
[11] j-m. guo, y-f. liu, "high capacity data hiding for error-diffused block truncation coding", ieee transactions on image processing, vol. 22, no. 12, pp. 4808-4818, december 2012.
[12] m. savić, z. perić, m. dinčić, "design of forward adaptive uniform quantizer for discrete input samples for laplacian source", electronics and electrical engineering, no. 9 (105), pp. 73-76, 2010.
[13] m. savić, z. perić, m. dinčić, "an algorithm for grayscale image compression based on the forward adaptive quantizer designed for signals with discrete amplitudes", electronics and electrical engineering, no. 2 (118), pp. 13-16, 2012.
[14] m. savic, z. peric, m. dincic, "coding algorithm for grayscale images based on piecewise uniform quantizers", informatica, vol. 23, no. 1, pp. 125-140, 2012.
[15] z. peric, m. petkovic, m. dincic, "simple compression algorithm for memoryless laplacian source based on the optimal companding technique", informatica, vol. 20, no. 1, pp. 99-114, 2009.
[16] z. peric, j. nikolic, "an effective method for initialization of lloyd-max's algorithm of optimal scalar quantization for laplacian source", informatica, vol. 18, no. 2, pp. 279-288, 2007.
[17] n. simic, z. peric, m. savic, "improved algorithm for grayscale image compression based on multimode coding algorithm", revue roumaine des sciences techniques-serie electrotechnique et energetique, tome 59, issue 3, pp. 315-323, october 2014.
[18] z. peric, n. simic, m. savic, "analysis and design of two stage mismatch quantizer for laplacian source", elektronika ir elektrotechnika, vol. 21, no. 3, pp. 49-53, 2015.
facta universitatis series: electronics and energetics vol. 30, no 3, september 2017, pp. 429-429 doi: 10.2298/fuee1703429e
corrigendum
tomislav suligoj, marko koričić, josip žilak, hidenori mochizuki, so-ichi morita, katsumi shinomura, hisaya imai. horizontal current bipolar transistor (hcbt) – a low-cost, high-performance flexible bicmos technology for rf communication applications. facta universitatis, series: electronics and energetics (fu elec energ), vol. 28, no 4, december 2015, pp. 507-525. doi: 10.2298/fuee1504507s
the editor-in-chief has been informed that in the article tomislav suligoj, marko koričić, josip žilak, hidenori mochizuki, so-ichi morita, katsumi shinomura, hisaya imai. horizontal current bipolar transistor (hcbt) – a low-cost, high-performance flexible bicmos technology for rf communication applications. facta universitatis, series: electronics and energetics, vol. 28, no 4, 2015, pp. 507-525. doi: 10.2298/fuee1504507s, fig. 15 with its legend was omitted in the published version of the paper. after further discussion with the corresponding author, the editor-in-chief has decided to publish a corrigendum for this article, providing the figure and legend of fig. 15.
[fig. 15, panel (a): currents (A) versus base-emitter voltage (V) at VCE = 2 V for single-poly and double-emitter devices; panel (b): collector current versus collector-emitter voltage (V) for whill = 0.36, 0.5 and 0.6 µm and for the single-poly hcbt]
fig. 15 measured dc characteristics of double-emitter (de) hcbt: (a) comparison between the gummel characteristics of de hcbt with n-hill width whill = 0.36 µm, (b) output characteristics of de hcbts with different n-hill widths (whill). single polysilicon region hcbt is added for reference.
link to the corrected article doi:10.2298/fuee1504507s
received march 2, 2017
facta universitatis series: electronics and energetics vol. 30, no 3, september 2017, pp. 391-402 doi: 10.2298/fuee1703391m
an investigation of side lobe suppression in integrated printed antenna structures with 3d reflectors
marija milijić 1, aleksandar nešić 2, bratislav milovanović 3
1 faculty of electronic engineering, university of niš, serbia
2 "imtel-komunikacije" a.d., novi beograd, serbia
3 university singidunum, 11000 belgrade, serbia
received october 3, 2016; received in revised form november 14, 2016
corresponding author: marija milijić, faculty of electronic engineering, university of niš, serbia, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: marija.milijic@elfak.ni.ac.rs)
abstract. the paper discusses the problem of side lobe suppression in the radiation pattern of printed antenna arrays with different 3d reflector surfaces. the antenna array of eight symmetrical pentagonal dipoles with corner reflectors of various angles is examined. all investigated antenna arrays are fed by the same feeding network of impedance transformers enabling the necessary amplitude distribution. considering the different reflector surfaces, the influence of parasitic radiation from the feeding network on side lobe suppression is studied in order to prevent the reception of unwanted noise and to increase the gain.
key words: printed antenna array, reflector antennas, side lobe suppression, symmetrical pentagonal dipole
1. introduction
modern wireless communication systems set strong antenna requirements relating to their size, weight, cost, performance and ease of installation. printed (microstrip) antennas can meet most of the requirements set by many government and commercial applications (mobile radio and wireless communications), as well as high-performance aircraft, spacecraft, satellite and missile applications. printed antennas feature low profile and low weight, simple and inexpensive production using the standard photolithographic technique, great reproducibility and the possibility of integration with other microwave circuits [1]. major disadvantages of microstrip antennas are spurious feed radiation, tolerances in fabrication, very narrow frequency bandwidth and the surface wave effect [2]. printed antenna arrays with symmetrical pentagonal dipoles can mostly overcome the mentioned limitations of printed antennas. the antenna array is an assembly of radiating elements in an electrical and geometrical configuration that improves the majority of antenna parameters. the symmetrical pentagonal dipole operates at the second resonance (antiresonance), enabling both a much slower impedance variation with frequency and a wider useful bandwidth than in the case of operation at the first resonance [3]-[7]. consequently, the proposed antenna array has lower sensitivity to fabrication tolerances, enabling the use of a low-cost photolithographic printing process for its manufacture. further, the feeding network is also a symmetrical printed structure, which reduces parasitic radiation and the surface wave effect.
however, spurious radiation from the feed network has a very substantial, although indirect, effect on side lobe level [2]. international standards and recommendations define side lobe suppression (sls) for modern communication systems between 20 dB and 40 dB. moreover, radar systems that are employed to control civil and military objects have stricter sls requirements. side lobes should be minimized to avoid false target indications through the side lobes, which can have catastrophic consequences. also, insufficient sls indicates that more power is radiated in side lobes, resulting in a low antenna gain. the proposed antenna array of eight symmetrical pentagonal dipoles uses the dolph-chebyshev distribution of the second order with 19 dB pedestal (Imax/Imin ratio) in order to achieve sls of 44 dB in the ideal case without realization errors. besides realization deviations that can reduce sls, the parasitic radiation from the feeding network may influence antenna parameters, especially sls. reflector plates can be a good tool to overcome the undesired effect of the feeding structure's radiation. furthermore, they can improve gain and control the radiation pattern, which is desirable for many modern wireless applications.
2. printed antenna arrays with high side lobe suppression
the investigated printed antenna consists of an array of eight symmetrical pentagonal dipoles (labelled as d1-d8), a feeding network, a balun and a 3d reflector surface (fig. 1). the array, feeding network and balun are printed on the same dielectric substrate of 0.508 mm thickness with εr = 2.1. the vertex of the corner reflector with angle α is at distance h from the antenna array.
fig. 1 printed antenna array of eight symmetrical pentagonal dipoles with feeding network, balun and reflector
2.1. antenna array
the eight symmetrical dipoles form the antenna array. they have a pentagonal shape, one half of which is on one side of the substrate while the other half, turned in the opposite direction, is on the opposite side. the single pentagonal dipole has a very large bandwidth regardless of the reflector plate used. considering its impedance variation in the desired frequency range, its bandwidth (vswr below 2) is more than 35% of the central frequency [8]. however, most antenna parameters of a single pentagonal dipole are hard to control. therefore, pentagonal dipoles are combined in an array to achieve higher gain, better sls and other required parameters. the pentagonal dipoles in the array are axially positioned at distance d. the array is mirror symmetrical, with the line of symmetry passing through the middle of the array. consequently, dipoles d1 and d8 have the same dimensions and parameters, as well as dipoles d2 and d7, d3 and d6, and d4 and d5. each dipole of the array is fed by a symmetrical feeding line that passes through holes at the contact between the two reflector metal surfaces. the holes must be of sufficient diameter (2.3 mm) to minimize the influence of the metallic plates. previous research [8-10] has shown that the reflector plate influences antenna parameters such as sls, gain, bandwidth, etc. planar reflector plates [8-10] allow the antenna array to be used in a wide frequency range, although its gain and sls are not satisfactory for many modern communication systems. unlike planar reflector plates, the antenna array with corner reflectors [8-9] can achieve greater gain and sls but narrow bandwidth.
the improvement in gain and sls is more noticeable when the angle of the corner reflector is smaller. even though antenna arrays with corner reflectors of 90° and 60° angle [8-9] have been examined, their sls values are not satisfactory for many wireless applications. further investigations should optimize the angle of the corner reflector to obtain satisfactory sls and gain. furthermore, the antenna array with optimal parameters is planned to be realized and measured. the radiation patterns in the h plane will also be investigated.
2.2. feeding network
the feeding network, also a symmetrical printed structure, enables the amplitude distribution calculated by linplan software [11]. it begins with a balun that is used for the transition from the conventional printed to the symmetrical printed structure. after it, there are impedance transformers, t-junctions and feeding lines (fig. 2).
fig. 2 feeding network of impedance transformers for antenna array with high side lobe suppression
feeding lines with impedance Zc = 100 Ω correspond to the dipoles with the same impedance Zd = 100 Ω. there are several t-junctions (one in the first stage, two in the second stage and four in the third stage of the feeding network). the t-junctions are marked by points a, b, c and d (fig. 2). the impedance in the separating points a, b, c and d is Zs = 50 Ω. the impedance transformer Zt = 70.7 Ω is used to transform the impedance of the feeding line Zc = 100 Ω into the impedance in the separating point Zs = 50 Ω. the other impedance transformers (Z1, Z2, Z3, Z4, Za and Zb) are employed to obtain the requested amplitude weight for every radiating element of the array, as obtained by linplan software [11]. linplan software enables the adjustment of an antenna array's parameters in order to achieve optimal values of sls, gain and hpbw (half-power beamwidth) [8]. also, the distance d between radiating elements should have an optimal value to prevent their mutual impedance from becoming too large.
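the value of Zt quoted above agrees with the standard quarter-wave matching relation, in which the transformer impedance is the geometric mean of the two impedances it connects. a short check, assuming an ideal lossless λg/4 section:

```latex
Z_t = \sqrt{Z_c \, Z_s} = \sqrt{100\,\Omega \cdot 50\,\Omega} \approx 70.7\,\Omega
```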
furthermore, the value of the pedestal determines the width of the impedance transformers in the feeding network, which should be moderate for easier realization by photolithographic printing. considering all requirements, a pedestal of 19 dB and a distance between the array's elements d = 0.77λ0 = 19.25 mm (λ0 is the wavelength in vacuum at the centre frequency fc = 12 GHz) are chosen [8]. the amplitude distribution is shown in table 1.
table 1 the distribution coefficients for dolph-chebyshev distribution of the second order with pedestal of 19 dB
dipole number i                1/8     2/7     3/6     4/5
distribution coefficient ui    0.121   0.387   0.742   1
all impedance transformers are λg/4 long (λg is the wavelength at the centre frequency fc for the dielectric substrate whose thickness is 0.508 mm and dielectric constant is 2.17). their characteristics and dimensions have been calculated [9]-[10] using the values ui, i = 1, 2, 3, 4, from table 1 for a dielectric substrate of 0.508 mm thickness, relative dielectric permittivity of 2.17, metal conductivity of 41 MS/m, and negligibly small values of loss tangent and conductor thickness (table 2).
table 2 the impedance transformers of the feeding network
impedance transformer           Z1       Z2      Z3       Z4     Za       Zb
width [mm]                      0.152    1.232   0.615    0.97   0.147    1.245
characteristic impedance [Ω]    236.95   74.08   118.66   88     228.37   74.36
moreover, expected tolerances of the standard photolithographic process have been assumed in order to estimate the sls degradation due to deviations of amplitude, phase and radiating element positions from the optimized values at the operating frequency of 12 GHz.
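the effect of the amplitude taper in table 1 can be illustrated with an idealized array-factor calculation. this is only a sketch under stated assumptions: isotropic elements, no mutual coupling, no reflector and no feed radiation, so its numbers are not expected to reproduce the linplan or wipl-d values reported in the paper; the function names are hypothetical.

```python
import math


def array_factor_db(weights, d_over_lambda, theta_deg):
    """Normalized broadside array factor in dB for a uniformly spaced linear array."""
    psi = 2.0 * math.pi * d_over_lambda * math.sin(math.radians(theta_deg))
    re = sum(w * math.cos(n * psi) for n, w in enumerate(weights))
    im = sum(w * math.sin(n * psi) for n, w in enumerate(weights))
    mag = math.hypot(re, im) / sum(weights)
    return 20.0 * math.log10(max(mag, 1e-12))


def side_lobe_suppression_db(weights, d_over_lambda, step_deg=0.05):
    """Main-lobe peak minus the highest level beyond the first null, 0 to 90 degrees."""
    n_pts = int(round(90.0 / step_deg))
    pattern = [array_factor_db(weights, d_over_lambda, k * step_deg)
               for k in range(n_pts + 1)]
    k = 0
    # walk down the main lobe until the pattern stops decreasing (first null)
    while k < n_pts and pattern[k + 1] < pattern[k]:
        k += 1
    return pattern[0] - max(pattern[k:])


# distribution coefficients of table 1 for dipoles d1..d8
TAPER = [0.121, 0.387, 0.742, 1.0, 1.0, 0.742, 0.387, 0.121]
UNIFORM = [1.0] * 8

if __name__ == "__main__":
    print(side_lobe_suppression_db(UNIFORM, 0.77))
    print(side_lobe_suppression_db(TAPER, 0.77))
```

comparing the tapered weights with a uniform excitation at the same spacing shows why the feeding network is designed around the table 1 coefficients: the taper trades a wider main lobe for much lower side lobes.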
besides the ideal case without realization errors, two more cases have been considered:
- the real case, when deviations in distances between radiating elements in the array are 1 percent of λ0, phase deviations are 0.908° (approximately 40 μm tolerance in the length of the feeding line) and amplitude deviations along feeding lines are 1 dB;
- the worst case, when deviations in distances between radiating elements in the array are 2 percent of λ0, phase deviations are 1.835° (approximately 80 μm tolerance in the length of the feeding line) and amplitude deviations along feeding lines are 2 dB.
the radiation patterns simulated by linplan [11] for all three cases show that the proposed antenna array has sls of 44.8 dB at 12 GHz in the ideal case and that the expected sls is 39.8 dB (in the real case) and 36.2 dB (in the worst case) at the same frequency. further, linplan [11] works with abstract radiating elements, so it can be expected that real antenna arrays with real elements would show larger degradation.
3. side lobe suppression of printed antenna array
initially, all dipoles are fed by single generators and their dimensions are adjusted in order to obtain an impedance of Zd = 100 Ω for all dipoles at the centre frequency (fc = 12 GHz), taking into consideration mutual coupling and reflector influence. afterwards, the array of dipoles with optimized dimensions is connected to the feeding network of impedance transformers.
3.1. simulation results
the first investigated antenna array is situated in a corner reflector with an angle of 90° whose vertex is at distance h = λ0/2 = 12.5 mm from the centres of the dipoles. both reflector plates have dimensions of 308 mm x 60.8 mm. its radiation patterns in both e and h plane, simulated by wipl-d software, are presented in fig. 3 and fig. 4, respectively.
the first simulated model, in which the dipoles are fed by single generators, has side lobe suppression of 40.6 dB in the e plane (fig. 3). the second model is generated by integrating the antenna array with the feeding network. its sls is 36.65 dB. it can be supposed that the unwanted radiation from the feeding network degrades the sls.
fig. 3 radiation pattern of printed antenna array in corner reflector with angle of 90° in e plane
fig. 4 radiation pattern of printed antenna array in corner reflector with angle of 90° in h plane
however, it does not influence the gain and the radiation pattern in the h plane (fig. 4). the gain in the e plane for both antenna simulation models is 19 dBi. the second investigated antenna array is located between two metallic plates of 308 mm x 76 mm dimensions joined at a 60° angle. the distance between the array and the vertex of the corner reflector is h = λ0/2 = 12.5 mm. the antenna array fed by eight single generators has sls = 43.7 dB and gain g = 20.5 dBi in the e plane (fig. 5). when the feeding network of impedance transformers is integrated with the antenna array, sls decreases to 37.3 dB while the gain stays approximately the same.
fig. 5 radiation pattern of printed antenna array in corner reflector with angle of 60° in e plane
fig. 6 radiation pattern of printed antenna array in corner reflector with angle of 60° in h plane
the radiation pattern in the h plane, simulated by wipl-d software, does not change with the feeding method (fig. 6). meanwhile, it is significantly narrower than the simulated radiation pattern in the h plane of the antenna array with the corner reflector of 90° angle. the last examined antenna array is placed in a corner reflector with an angle of 45°. the dimensions of each reflector plate are 308 mm x 106 mm. the smaller angle of the corner reflector requires a greater distance between its vertex and the antenna array.
therefore, the distance h = 0.6λ0 = 15 mm is selected. the first wipl-d simulation model, in which the dipoles in the array are fed by single generators, has sls = 41 db (fig. 7). the second model, which integrates the antenna array of eight dipoles with the feeding network of impedance transformers, has sls = 38.8 db (fig. 7).
since the simulation results are satisfactory for both wipl-d models, the proposed antenna has been realized. fig. 9 shows photographs of the fabricated antenna: fig. 9a is a view of the antenna array in the corner reflector with angle of 45°, and fig. 9b is a view of the antenna array with one metallic plate and with the feeding network. measured results are presented in fig. 7 and fig. 8. the gain in e plane of the realized antenna is about 21 dbi, which is the value obtained by wipl-d software for both simulated models (fig. 7). however, sls is smaller than the value expected from wipl-d simulations. sls of the realized antenna is 32 db, which is about 6.8 db smaller than the simulated sls of the antenna array with feeding network. the possible reasons for sls degradation of the realized antenna are: an accidental reflection during measurement, tolerances in fabrication of the very thin impedance transformers (z1 and za), the influence of the corner reflector metallic plates on the feeding structure, etc. although all these influences are hard to investigate and some of them cannot be eliminated, the measured sls is appropriate for many commercial wireless services. furthermore, the realized antenna has a very good gain of 21 dbi, which is very important for applications where all potential users require a wireless signal of good quality.
fig. 8 presents the simulated and measured radiation pattern in h plane for the antenna array in the corner reflector with 45° angle. it is obvious that it is the narrowest beam in h plane among the examined antennas.
fig. 7 radiation pattern of printed antenna array in corner reflector with angle of 45° in e plane
fig. 8 radiation pattern of printed antenna array in corner reflector with angle of 45° in h plane
the entire symmetrical printed structure, comprising the eight-dipole array in the corner reflector, the feeding network and the balun, is simple and easy to fabricate by printing on a single substrate. in particular, it is fabricated by a cheap and simple photolithographic process that satisfies the requirements of mass production.
fig. 9 a) realized antenna with corner reflector b) antenna array, feeding network and one metallic plate of corner reflector
the simulated and measured vswr is presented in fig. 10. the simulated vswr of the antenna array is less than 2 in a wide frequency range for every considered corner reflector: from below 9 ghz to 13.9 ghz for the reflector with 90° angle, between 10.4 ghz and 13.4 ghz for the reflector with 60° angle and between 10.3 ghz and 13.7 ghz for the reflector with 45° angle. however, the realized antenna array in the corner reflector with 45° angle is characterized by vswr, measured by an agilent n5227a network analyzer, below 2 for the range of frequencies between 11.13 ghz and 13.11 ghz. while the realized antenna is less wideband than the simulation models, possibly due to fabrication tolerances and losses introduced by connectors, it still demonstrates a good bandwidth of 1.98 ghz (16.5% of the central frequency).
fig. 10 vswr of antenna arrays in corner reflector with different angles
the presented results show that the corner reflector can significantly influence radiation patterns in both e and h plane. using a corner reflector of smaller angle can increase gain and sls of the antenna array (fig. 11).
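The bandwidth figure quoted above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function names are illustrative and not taken from the paper, and the 12 GHz reference is the centre frequency fc stated in the text.

```python
def fractional_bandwidth(f_low_ghz, f_high_ghz, f_center_ghz):
    """Return (absolute bandwidth in GHz, bandwidth in percent of f_center)."""
    bandwidth = f_high_ghz - f_low_ghz
    return bandwidth, 100.0 * bandwidth / f_center_ghz


def vswr_from_reflection(gamma_mag):
    """VSWR for a reflection-coefficient magnitude 0 <= |gamma| < 1."""
    return (1.0 + gamma_mag) / (1.0 - gamma_mag)


# Measured VSWR < 2 band of the realized 45° antenna, referenced to fc = 12 GHz.
bw_ghz, bw_percent = fractional_bandwidth(11.13, 13.11, 12.0)

# The VSWR = 2 limit corresponds to |gamma| = 1/3, i.e. about 11 % reflected power.
vswr_limit = vswr_from_reflection(1.0 / 3.0)
```

The same helper applied to the simulated 45° band (10.3 GHz to 13.7 GHz) gives a correspondingly wider fractional bandwidth, which is the comparison the paragraph above makes qualitatively.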
in order to confirm the advantages of the corner reflector, an antenna array with the same parameters as the investigated arrays (distribution, feeding network, distance between elements, dielectric, frequency, etc.), but without any reflector plate, is studied. the simulation results of the antenna array without reflector are presented in fig. 11.
fig. 11 simulated radiation pattern in e plane of antenna arrays without reflector and with corner reflector with different angles
it is obvious that its simulation results are far worse than all results of the antenna array in corner reflectors. its gain is 8.6 dbi while its sls is 15.5 db. even if its simulation results are not satisfactory for communication systems that require high sls, thanks to its planar form and optimal dimensions (185 mm x 50 mm), the printed antenna array without reflector is suitable for many other applications: iot equipment [12], portable devices, rfid systems, etc.
the distance h between the corner reflector's vertex and the dipoles must increase as the angle α of the reflector decreases [1]. furthermore, for reflectors with smaller angles, the dimensions of the reflector plates must be larger [1], increasing the size of the entire antenna. although the gain increases as the angle between the reflector plates decreases, there is an optimum dipoles-to-vertex distance h for each angle α of the corner reflector. if the distance h becomes too small, the antenna can be inefficient. for a very large distance h, the system produces undesirable multiple lobes and loses its directional characteristics [1]. consequently, the corner reflector whose plates are set at the 45° angle has the distance h = 0.6λ0 between its vertex and the dipoles in the array, and gives satisfactory simulated and measured results.
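As a quick check of the dipole-to-vertex distances used in this section, the sketch below converts the centre frequency into a free-space wavelength and then into h. The function name is illustrative; the paper rounds λ0 at 12 GHz to 25 mm, so the results differ from the quoted 12.5 mm and 15 mm only by rounding.

```python
C0 = 299_792_458.0  # speed of light in vacuum, m/s


def free_space_wavelength_mm(freq_ghz):
    """Free-space wavelength in millimetres for a frequency given in GHz."""
    return C0 / (freq_ghz * 1e9) * 1e3


lambda0_mm = free_space_wavelength_mm(12.0)  # close to 25 mm

# Distances reported for the three reflectors:
h_90_and_60_mm = 0.5 * lambda0_mm  # h = λ0/2 for the 90° and 60° reflectors
h_45_mm = 0.6 * lambda0_mm         # h = 0.6 λ0 for the 45° reflector
```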
although using a corner reflector with a smaller angle can increase antenna gain and sls, it also results in larger overall antenna dimensions and can lead to an inadequate radiation pattern.
4. conclusion
side lobe suppression is one of the most important antenna parameters, whose value must be sufficient to minimize false target indications through the side lobes. side lobe levels of −20 db or smaller are usually not desirable in most applications. antennas with side lobe suppression greater than 30 db or 40 db (mostly for radar systems) must be carefully designed and realized. furthermore, modern wireless systems require compact, light and simple antennas that are easy to implement. microstrip (printed) antennas satisfy all listed requirements, although they have several limitations in achieving high side lobe suppression: tolerances in fabrication, mutual coupling between radiating elements, surface wave effects as well as parasitic radiation from the feeding network. the symmetrical printed antenna array of pentagonal dipoles can overcome most of these obstacles to achieve great side lobe suppression.
the presented simulated and measured results show that symmetrical printed antenna arrays can achieve great sls. however, there are several factors that must be considered in their design. the choice of distribution determines not only the maximum sls that can be obtained but also the parameters of the impedance transformers in the feeding network. if a tapered distribution with a great pedestal is used, the transformers with the greatest and the smallest impedance will have the smallest and the biggest width, respectively. the impedance transformers with the smallest width are mechanically unreliable; they can easily be broken. the impedance transformers with the biggest width can support higher-order modes.
also, the technical tolerances of photolithographic realization must be considered, because deviations from the projected values of width and length of the impedance transformers can change the amplitude and phase of the radiating elements, causing sls degradation.
unwanted radiation from the feeding structure has a significant influence on the side lobe level. although the feeding network of symmetrical impedance transformers radiates less than standard microstrip feeding structures, this radiation cannot be completely eliminated. the simulated results show that the use of corner reflectors with different angles can partially solve the problem of parasitic radiation from the feeding network. a corner reflector with a smaller angle better prevents the spurious radiation from the feeding network, so greater sls can be achieved. moreover, greater gain can be obtained using a corner reflector with a smaller angle. furthermore, the angle of the corner reflector influences the width of the beam in h plane.
despite all the mentioned advantages of the corner reflector, the side lobe suppression of the realized antenna array in the corner reflector of 45° is about 6.8 db lower than the value expected from simulation. the reasons for the measured sls degradation can lie in the weakness of the measuring conditions and in the tolerances of realization. however, the gain of the realized antenna is 21 dbi, which is an optimal value for many modern wireless applications.
acknowledgement: this work was supported by the ministry of education, science and technological development of republic serbia under the project no. tr 32052. the authors would like to thank n. tasić and m. pešić from "imtel-communication" for the antenna fabrication. special thanks also go to msc i. radnović and n. popović for their help and support.
references
[1] c.a. balanis, antenna theory: analysis and design, 3rd edition, wiley-interscience, 2005.
[2] d. m. pozar and b. kaufman, “design considerations for low sidelobe microstrip arrays”, ieee trans. on antennas and propagation, vol. 38, no. 8, pp. 1176-1185, august 1990.
[3] a. nešić, i. radnović, z. mićić, s. jovanović, “side lobe suppression of printed antenna arrays for integration with microwave circuits”, microwave j., vol. 53, no. 10, pp. 72-80, 2010.
[4] m. milijić, a. nešić, b. milovanović, “design, realization and measurements of a corner reflector printed antenna array with cosecant squared-shaped beam pattern”, ieee antenn. wirel. pr., vol. 15, pp. 421-424, 2016.
[5] a. nešić, i. radnović, “new type of millimeter wave antenna with high gain and high side lobe suppression”, optoelectron. adv. mat., vol. 3, no. 10, pp. 1060-1064, 2009.
[6] a. nešić, z. mićić, s. jovanović, i. radnović, d. nešić, “millimeter wave printed antenna arrays for covering various sector width”, ieee antennas propag., vol. 49, no. 1, pp. 113-118, 2007.
[7] m. milijić, a. nešić, b. milovanović, “wideband printed antenna array in corner reflector with cosecant square-shaped beam pattern”, in proc. of the 22nd telecommun. forum telfor, belgrade, serbia, november 25-27, 2014, pp. 780-783.
[8] m. milijić, a. nešić, b. milovanović, “the investigation of reflector influence on the bandwidth of symmetrical printed antenna structures”, in proc. of the 3rd international conference on electrical, electronic and computing engineering icetran 2016, zlatibor, serbia, june 13-16, 2016, pp. mti1.1.1-6.
[9] m. milijić, a. nešić, b. milovanović, “printed antenna arrays with high side lobe suppression: the challenge of design”, microwave review, 2013, vol. 19, no. 2, pp. 15-20.
[10] m. milijić, a. nešić, b. milovanović, “side lobe suppression of printed antenna array with perpendicular reflector”, in proc. of the 11th int. conf. telsiks, nis, serbia, october 16-19, 2013, pp. 217-220.
[11] m. mikavica, a. nešić, cad for linear and planar antenna array of various radiating elements, norwood, ma, artech house, 1992.
[12] i. đurić, v. ratković-živanović, m. labus, d. groj, n. milanović, “designing an intelligent home media center”, facta universitatis, series: electronics and energetics, vol. 29, no 3, pp. 461-474, september 2016.
instruction facta universitatis series: electronics and energetics vol. 32, no 3, september 2019, pp. 403-416
https://doi.org/10.2298/fuee1903403s
© 2019 by university of niš, serbia | creative commons license: cc by-nc-nd
energy losses estimation by polynomial fitting and k-means clustering *
lazar sladojević, aleksandar janjić
university of niš, faculty of electronic engineering, niš, serbia
abstract. this paper presents an approach for the estimation and forecast of losses in a distribution power grid from data which are normally collected by the grid operator. the proposed approach utilizes the least squares optimization method in order to calculate the coefficients needed for estimation of losses. besides optimization, a machine learning technique is introduced for clustering of coefficients into several seasons. the amount of data used in calculations is very large due to the fact that electrical energy injected into the distribution grid is measured every fifteen minutes. therefore, this approach is classified as big data analysis. the data sets used are available in the serbian distribution grid operator's report for the year 2017. the obtained results are fairly accurate and can be used for losses classification as well as future losses estimation.
key words: grid losses, least squares optimization, big data, clustering
1. introduction
big data analysis is rapidly becoming one of the most important tools in many aspects of engineering. data are collected everywhere, and their volume and collection rates are increasing each day. therefore, various methods for processing these data have been developed in recent years.
these methods are efficient not only for extracting valuable information from a mass of data and visualizing it, but also for developing predictive models for various applications. an increasingly high amount of data can also be observed in the field of electrical power engineering. the electrical power grid is being modernized faster than ever, with a large number of smart sensors being installed at many points of the grid. these sensors collect information about various electrical variables which are important for normal grid operation. these data are used everywhere, from the power generation side management to the demand side management. a good overview of many different applications of big data in electrical energy management and the most common methods for data processing can be found in [2].
received february 26, 2019; received in revised form may 19, 2019
corresponding author: lazar sladojević, university of niš, faculty of electronic engineering, aleksandra medvedeva 14, 18106 niš, serbia (e-mail: lazar.sladojevic@elfak.rs)
* an earlier version of this paper was presented at the 4th virtual international conference on science, technology and management in energy, energetics 2018, october 25-26, 2018 [1].
one of the most important uses of big data is the prediction of solar and wind power generation based on collected weather data. weather has a major impact on production from renewable sources, and therefore it is very important to observe the relationship between the two. another very interesting application of big data is the detection of different consumption profiles based on measurements of different variables. an example is [3], where the authors have used hourly electricity consumption readings and external temperature measurements to compute consumption profiles for residential customers. electrical faults in the power grid can present a big problem, especially when the fault occurs on a geographically distant part of a network.
fault detection, identification and location [4] can also be performed using the data collected at various measurement points in the grid. another big problem for electrical energy suppliers is energy theft. theft of electrical energy can in some places reach astonishingly high values. therefore, an approach for estimating the amount of stolen electrical energy, based on smart meter data and the least squares method for data processing, has been developed and proposed in [5]. this topic is closely related to this paper, since energy theft is treated as a non-technical loss, which is also evaluated here.
prediction of losses in distribution networks has drawn more attention in recent years due to the deregulated energy market conditions, where distribution network operators are obliged to procure the energy for covering losses on the open electricity market [6, 7]. increased market deregulation [7, 8] and shares of renewable and intermittent energy generation [9] make line loss prediction difficult. line losses themselves are also influenced by a multitude of factors and non-linear correlations, which make prediction models even more complicated.
design of line loss prediction models has become a research priority in transmission networks as well, and several different models have been proposed [10, 11, 12]. losses are allocated using a quadratic expression in [10], formulated explicitly in terms of all the transactions in the system in [11], and with consideration of wind generation and varying loads in [12]. however, a majority of these models are mainly designed for line loss allocation issues for market applications, as opposed to day-ahead predictions for tso purposes.
in distribution networks, different methods for the calculation of losses are used, including heuristic algorithms [13] and neural networks [14]. the near quadratic relationship that exists between load and loss has been used to develop empirical relations for estimation of loss [15].
these relationships relate either the loss and load factors [16] or the loss and load [17, 18]. in these methods, using simplified feeder models for computation of the loss, the coefficients in the quadratic function are determined using a curve fitting approach.
although the previous research and studies established satisfactory models for the energy losses calculation, they did not treat the process of procuring losses on the open market. the contribution of this paper is therefore the seasonal classification and determination of curve fitting parameters for the purchase of energy losses. the least squares optimization method which is used in this paper is very similar to machine learning regression algorithms, but with certain restrictions attached to it. it is suitable for numeric data with linear or quadratic relationships between measured (input) and estimated (output) values. least squares regression is a so-called “supervised” learning algorithm, which will be described further in section 3. however, for experimental purposes, another machine learning algorithm was used for model improvement. this is the so-called clustering algorithm, which belongs to the “unsupervised” learning category. even though these algorithms are used in various applications in electrical power engineering, to the authors' knowledge, this is their first application in estimation of losses.
2. problem description
in this paper, an approach for estimation of losses in the distribution grid based on analysis of the available data is proposed. these losses comprise two components, namely technical losses (tl) and non-technical losses (ntl). the proposed approach is based on analysis of losses data collected in year 2017. these data will be used to estimate the parameters of a predictive model for future losses estimation.
2.1. physical interpretation
technical losses can be split into two terms. the first term represents the constant losses.
these mainly represent the losses in magnetic cores of distribution transformers, but other factors, such as losses due to corona, constantly operating measurement equipment, leakage currents and losses in dielectrics also contribute. the other term is variable losses. they appear mainly in conductors, but a small part of these losses can also be observed in other current carrying parts, such as switch contact resistances and busbars. these losses are proportional to the square of current or, equivalently (for approximately constant voltage and power factor), to the square of active power.
non-technical or commercial losses appear due to infrequent or bad reading of measurement equipment and electrical power thefts. these losses are therefore taken to be proportional to the active power.
in the distribution power grid of serbia, electrical energy received from the transmission grid is measured every fifteen minutes throughout the whole year. on the other hand, energy supplied to end users is measured once every month. total losses represent the difference between the total energy received from the transmission grid and the total energy supplied to end users during one month. these data are collected and can be used for future losses estimation and losses classification. one method that allows this kind of estimation is described in the following section.
2.2. mathematical model
there are several ways to model the losses in the distribution grid, but they all need information about the energy obtained from the transmission grid and distributed sources and the energy delivered to end users. the model chosen here represents the losses in the following polynomial form [19, 20]:
\[ \Delta W_{c,j} = \sum_{i} \left( a_j + b_j P_i + c_j P_i^2 \right) \Delta t_i \tag{1} \]
where ΔWc,j are the calculated (estimated) total losses for month j, i is the index of the fifteen-minute interval Δti in month j, Pi is the average input power for that interval, aj represents the amount of constant losses in month j, bj is the coefficient of the commercial losses, which are proportional to the input power Pi in month j, and finally cj is the coefficient of the variable losses for month j, which are proportional to the square of power. for now, coefficients a and c are considered constant throughout the whole year, while the coefficient b varies by month. this assumption will be addressed in the following section.
on the other hand, measured losses, denoted as ΔWm,j, are already available as the difference between measured input and measured output energy. for calculation of coefficients a, b and c, the least squares method was used, which means that the sum of squared differences between the calculated and measured values of losses was minimized. the total number of variables is 36 (twelve for constant losses: coefficients aj, twelve for commercial losses: coefficients bj, and twelve for variable losses: coefficients cj). some of these variables are considered constant in some calculations, so the effective number of variables is lower. the general objective function for minimization can now be written as:
\[ \min f(a, b, c) = \sum_{j=1}^{12} \left( \Delta W_{c,j} - \Delta W_{m,j} \right)^2 = \sum_{j=1}^{12} \left[ \sum_{i} \left( a_j + b_j P_i + c_j P_i^2 \right) \Delta t_i - \Delta W_{m,j} \right]^2 \tag{2} \]
coefficient values are constrained to a certain range: amin ≤ aj ≤ amax for coefficients aj, bmin ≤ bj ≤ bmax for coefficients bj, and cmin ≤ cj ≤ cmax for coefficients cj. constraints for coefficients a, b and c have to be properly selected, based on their physical interpretation explained in the following section.
2.3. restrictions of the proposed method
the proposed method has one drawback: it uses the monthly readings of energy consumption.
this means that the value of measured losses is prone to errors due to bad or untimely readings. for example, in some rural areas, electricity consumption is read only every three months. this leads to slight under-readings for certain months and slight over-readings for others and, consequently, to miscalculation of some parameters. better results would be obtained if the consumption was read with higher frequency, preferably the same as the input readings. this would require a large number of smart meters installed at every point in the grid, which is not yet realized in practice. however, smart meters are being installed every day and, in the future, more reliable and accurate data will be available for analysis.
3. mathematical model solving method
as mentioned earlier, the mathematical problem given by (2) is solved using the least squares optimization algorithm. this problem is very similar to a linear regression problem which exists in the field of machine learning. in linear regression, output values are a linear combination of constant parameters and measured values of features (predictors). from (1) it can be observed that in this case, output values would be the monthly values of losses ΔWc,j (fitted to the measured values ΔWm,j), features would be the monthly sums of Δti, Pi·Δti and Pi²·Δti, and their coefficients would be a, b and c, respectively. however, since restrictions have been imposed on all parameters and some of them are also variable, this problem was reformulated into a non-linear programming optimization problem, given by (2). for solving this problem, standard mathematical methods were used, in this case the interior point algorithm. input parameters are the fifteen-minute readings of electrical energy injected from the transmission grid and distributed sources into the distribution grid and the monthly measured values of losses in the distribution system. the output consists of the values of coefficients a, b and c.
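The regression view described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the input profile is synthetic (a flat 4000 MW month, not data from the paper), and the solver step is replaced by a closed-form shortcut that only applies to the first calculation, where a and c are held fixed and each month's b is bounded by 0.01 ≤ b ≤ 0.1.

```python
def monthly_features(powers_mw, dt_h=0.25):
    """Aggregate one month of 15-minute average powers into the three model features."""
    sum_dt = dt_h * len(powers_mw)                    # sum of dt_i, in h
    sum_p_dt = dt_h * sum(powers_mw)                  # sum of P_i * dt_i, in MWh
    sum_p2_dt = dt_h * sum(p * p for p in powers_mw)  # sum of P_i^2 * dt_i, in MW^2 h
    return sum_dt, sum_p_dt, sum_p2_dt


def calculated_losses(a, b, c, features):
    """Model (1): estimated monthly losses in MWh."""
    sum_dt, sum_p_dt, sum_p2_dt = features
    return a * sum_dt + b * sum_p_dt + c * sum_p2_dt


def objective(a, b_by_month, c, features_by_month, measured_by_month):
    """Objective (2) with a and c shared by all months and one b per month."""
    return sum(
        (calculated_losses(a, b, c, feats) - measured) ** 2
        for b, feats, measured in zip(b_by_month, features_by_month, measured_by_month)
    )


def best_b(a, c, features, measured, b_min=0.01, b_max=0.1):
    """With a and c fixed, a month's term is a 1-D quadratic in b,
    so the bounded minimizer is the unconstrained one clipped to [b_min, b_max]."""
    sum_dt, sum_p_dt, sum_p2_dt = features
    b_free = (measured - a * sum_dt - c * sum_p2_dt) / sum_p_dt
    return min(max(b_free, b_min), b_max)


# Hypothetical month: 31 days of a flat 4000 MW input (synthetic, not from the paper).
features = monthly_features([4000.0] * (31 * 96))
measured = calculated_losses(32.739, 0.05, 0.000020609, features)  # synthetic "measurement"
b_hat = best_b(32.739, 0.000020609, features, measured)
residual = objective(32.739, [b_hat], 0.000020609, [features], [measured])
```

Because every month has its own b, a fit of this kind can match the measured monthly losses exactly whenever the clipped value stays inside the bounds; when a and c are also free, the problem no longer decouples and a general constrained solver is needed.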
for easy programming and formulation of the problem, the open source optimization platform yalmip [21] was used. yalmip's syntax allows easy and intuitive definition of variables, objective function, constraints and other options. yalmip was used with one of matlab's integrated solvers for performing computations. it can select a solver for a problem automatically, based on its structure, but also permits users to select the solver they think fits best. this allows all kinds of problems to be defined in the same way, unlike the case of using each solver individually, where the user would have to define the problem in a form specific to that particular solver. the solver used for calculation of coefficients is matlab's fmincon nonlinear programming solver. this solver offers several different algorithms for objective function minimization, but the one used here was the interior point algorithm [22].
variables involved in calculations are already denoted aj, bj and cj, with j = 1...12. the objective function is given by (2). constraints are chosen based on real data and the realistic values of coefficients. the total nominal iron core loss power of all transformers in the distribution system of serbia is approximately 32 mw. therefore, the constraint adopted for parameter a is 30 ≤ a ≤ 40. commercial losses always exist, but they do not exceed 10 % in the serbian distribution grid. thus, the adopted constraint for parameter b is 0.01 ≤ b ≤ 0.1. unlike the previous parameters, whose extreme values are relatively easy to estimate, parameter c cannot be constrained in such a straightforward manner. since it is multiplied by the square of power, its value is undoubtedly very small. based on the authors' previous experience, the adopted constraint for this parameter is 0.00002 ≤ c ≤ 0.00006. as mentioned in the previous section, for this calculation the coefficients a and c are considered constant, while the coefficient b varies by month.
this is not to be confused with the constraints, which are simply the minimum and maximum sensible values that can be obtained as calculation results. since the grid topology and the number of transformers remain very much the same throughout the whole year, it makes sense to keep coefficients a and c constant. on the other hand, coefficient b is affected by many external factors and therefore it is considered variable. all of these assumptions are used for the first calculation.
4. numerical results
the obtained results are presented in table 1 and fig. 1. table 1 contains the real values of coefficients, while the values of coefficients bj and c in fig. 1 are scaled. the scaling is used only to make all values visible on a chart. after the scaling, coefficients bj are shown in percent, while the coefficient c is now dimensionally equal to w^-1. fig. 2 represents the comparison between calculated and measured values of losses.
table 1 calculated values for coefficients
month      coefficient a (mw)  coefficient b (pu)  coefficient c (mw^-1)
january    32.739              0.091499            0.000020609
february   32.739              0.072052            0.000020609
march      32.739              0.061105            0.000020609
april      32.739              0.039002            0.000020609
may        32.739              0.057862            0.000020609
june       32.739              0.013125            0.000020609
july       32.739              0.019671            0.000020609
august     32.739              0.014427            0.000020609
september  32.739              0.018085            0.000020609
october    32.739              0.043728            0.000020609
november   32.739              0.051645            0.000020609
december   32.739              0.065777            0.000020609
it can be observed from fig. 2 that the computations were done successfully. calculated and measured losses are equal, which means that the coefficients are well estimated. statistically speaking, it can be said that the input data, defined in the previous section, are the training data for the model (1).
fig. 1 scaled values of the calculated coefficients
fig. 2 comparison between calculated and measured losses
from fig. 2 it is obvious that the model fits the training data well. now these coefficients can be used for future estimation of losses under the previously introduced assumptions.
5. model improvement
distribution companies procure the energy for covering losses on the open electricity market. in order to simplify the procurement procedure, the whole concept can be extended by grouping months into several distinct seasons. the advantage of this distinction is that coefficients are grouped by season instead of being different for each month, with potentially better prediction accuracy within a given season. the accuracy, however, depends on the quality of the collected data. grouped coefficients are less prone to various stochastic errors than individual coefficients. besides that, there are noticeable differences in climate factors, such as external temperature and humidity, for the same months in different years. even the reading of electrical energy consumption is not as frequent in winter as it is in summer. on the other hand, adopting one set of coefficients for the whole year would cause too high error values in future calculation of losses. therefore, a three-season model was adopted, and months were clustered into winter, summer and “transient” seasons.
clustering itself is an “unsupervised” machine learning algorithm, which means that, unlike regression, it doesn't have the output data to compare inputs to. instead, it seeks to find similarities among the input measurements. in this case, there are twelve input points (the twelve months) and three features (coefficients a, b and c). this means that there is a total of 36 input values among which a clustering algorithm should find similarities. a twelve by three matrix was used for convenient storage of these data. in this particular case, that matrix is given in tabular form in table 2, where rows represent input measurements for each month and columns represent features.
the number of clusters is another input to the clustering algorithm. as explained earlier, the number of clusters was chosen to be three, since it was most logical to divide the year into three seasons according to weather conditions. nevertheless, three different numbers of clusters were examined, ranging from two to four, and the clustering results were observed for each calculation. since the results depend on both the number of clusters and the initial (random) clustering [23], there have been several possible solutions, among which the one that made the most sense was chosen. generally, there is no single “best” way of choosing the number of clusters. rather, a certain expertise in the field that the data belong to is required in order to choose the appropriate number [23]. another important issue with clustering is different scaling of data features. features' scales can differ from each other by several orders of magnitude, as in table 2. to address this issue, normalization is required in order to obtain meaningful clustering results.
for this particular case of clustering months into seasons, a somewhat different approach than before was used. the previously introduced constraints still apply, but now only coefficient a is considered constant, and its value is fixed to 32 mw. all the other coefficients are considered variable for every month. this way, coefficients are optimized so that calculated losses are equal or almost equal to measured losses, similar to the previous case, depicted in fig. 2. these values of coefficients serve as the input data set for the process of clustering. therefore, they will be referred to as the initial coefficients. obtained values of these coefficients are shown in table 2, while fig. 3 shows their scaled versions (same scaling as in fig. 1). the comparison of calculated and measured losses is shown in fig. 4. values of errors, i.e. the differences from fig. 4, both in absolute and relative units, are given in table 3.
table 2 calculated values for coefficients

month       coefficient a (mw)   coefficient b (pu)   coefficient c (mw⁻¹)
january     32                   0.0353882            0.00003250295
february    32                   0.0380628            0.00002887427
march       32                   0.0198292            0.00003235119
april       32                   0.0360867            0.00002157678
may         32                   0.0337267            0.00002904083
june        32                   0.0125835            0.00002087766
july        32                   0.0158188            0.00002198364
august      32                   0.0132383            0.00002108569
september   32                   0.0150446            0.00002172672
october     32                   0.039316             0.00002210148
november    32                   0.0455431            0.00002230626
december    32                   0.0357529            0.00002795988

fig. 3 scaled values of the calculated coefficients after initial optimization

fig. 4 comparison between calculated and measured losses after initial optimization

table 3 absolute and relative differences in calculated and measured losses after initial optimization

month   absolute difference (mwh)   relative difference (%)
jan     0.0000                      0.0000
feb     0.0000                      0.0000
mar     0.0000                      0.0000
apr     117.5637                    0.0432
may     0.0000                      0.0000
jun     0.0000                      0.0000
jul     0.0000                      0.0000
aug     0.0003                      0.0000
sep     0.0000                      0.0000
oct     784.1185                    0.2644
nov     407.3877                    0.1088
dec     0.0000                      0.0000

from table 2, it can be noted that the values of coefficients b_j are different from those in table 1. the reason for this lies in the fact that now all the other coefficients are different too (although coefficient a is only slightly different). this means that the coefficients b_j also had to change in order to achieve the best possible fit to the input data. from fig. 4 and table 3 it is obvious that coefficients fit the input data almost perfectly, since the yellow bars are almost invisible for every month. coefficients b and c are now variable throughout months, and discovering similarities among them is the key for clustering of months.
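the clustering step can be illustrated with a minimal sketch. it assumes k-means with min-max normalization of coefficients b and c; the text names neither the clustering algorithm nor the normalization, so both are assumptions, as is the deterministic farthest-point seeding used here in place of random initialization. the function names are hypothetical. with the values as printed in table 2, this sketch yields three groups of months that can be compared with table 4.

```python
from math import dist

# coefficients b (pu) and c (mw^-1) as printed in table 2, january..december
MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]
B = [0.0353882, 0.0380628, 0.0198292, 0.0360867, 0.0337267, 0.0125835,
     0.0158188, 0.0132383, 0.0150446, 0.039316, 0.0455431, 0.0357529]
C = [3.250295e-5, 2.887427e-5, 3.235119e-5, 2.157678e-5, 2.904083e-5,
     2.087766e-5, 2.198364e-5, 2.108569e-5, 2.172672e-5, 2.210148e-5,
     2.230626e-5, 2.795988e-5]


def min_max(values):
    """Rescale a feature to [0, 1] so that b and c carry comparable weight."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def farthest_point_init(points, k):
    """Assumed seeding: start at the first point, then repeatedly add the
    point that is farthest from all centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    return centroids


def k_means(points, k, max_iter=100):
    """Plain k-means: assign each point to the nearest centroid, recompute
    centroids as cluster means, stop when the assignment no longer changes."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(x) / len(members)
                                     for x in zip(*members))
    return labels


points = list(zip(min_max(B), min_max(C)))
labels = k_means(points, 3)

groups = {}
for month, label in zip(MONTHS, labels):
    groups.setdefault(label, []).append(month)
for label, members in sorted(groups.items()):
    print(label, members)
```

the cluster indices produced by such a sketch are arbitrary; only the grouping of months is meaningful, and a different seeding or normalization may give a different grouping.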
this approach theoretically gives better results than the approach with two constant parameters because in the former case, the clustering is done on the basis of two features (parameters b and c), while in the latter case, clustering would have been done on the basis of one feature only (parameter b). this theory has also been proven experimentally. based on these values, months are clustered into seasons and clustering results are shown in table 4. table 4 clustered months month jan feb mar apr may jun jul aug sep oct nov dec cluster index 1 1 1 3 1 2 2 2 2 3 3 1 season win win win tra win sum sum sum sum tra tra win season labels in table 4 stand for winter season (cluster 1, label “win”), summer season (cluster 2, label “sum”) and transient season (cluster 3, label “tra”). from the table 4 it can be observed that months are clustered almost completely as expected. the only exception is may, which has been clustered as a winter month. looking back to the previous calculations, one can notice that may does not follow the usual pattern like other months. in fig. 1, all months follow general pattern that coefficient b values get lower during summer and higher during winter in characteristic “elbow” shape. however, value for may presents an outlier from that pattern since its value is higher than values for the surrounding months. there is also a visible difference of parameter c for may in fig. 3 energy losses estimation using polynomial fitting 413 compared to surrounding months. this value corresponds more to the winter months than to other months. the reason for this could be a significant under reading of electrical energy consumption in may, untimely data collection and report creation, error in statistical processing of data or simply higher energy theft rate in that particular month. final coefficients were calculated as centroids of each cluster from table 4. centroid of a cluster is a vector of mean values of all features for all points in that cluster. 
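the centroid calculation described above can be sketched as follows, using the coefficients as printed in table 2 and the cluster indices of table 4. the function name is hypothetical. with these inputs the summer and transient centroids agree closely with the values listed in table 5; the winter centroid computed from the printed march values comes out somewhat different from the table 5 entry, which may reflect a transcription difference in one of the tables.

```python
from statistics import mean

# coefficients b (pu) and c (mw^-1) as printed in table 2, january..december
B = [0.0353882, 0.0380628, 0.0198292, 0.0360867, 0.0337267, 0.0125835,
     0.0158188, 0.0132383, 0.0150446, 0.039316, 0.0455431, 0.0357529]
C = [3.250295e-5, 2.887427e-5, 3.235119e-5, 2.157678e-5, 2.904083e-5,
     2.087766e-5, 2.198364e-5, 2.108569e-5, 2.172672e-5, 2.210148e-5,
     2.230626e-5, 2.795988e-5]
# cluster index per month from table 4 (1 = winter, 2 = summer, 3 = transient)
CLUSTER = [1, 1, 1, 3, 1, 2, 2, 2, 2, 3, 3, 1]


def centroids(b, c, cluster):
    """Return {cluster index: (mean b, mean c)} over the months in each cluster."""
    out = {}
    for k in sorted(set(cluster)):
        idx = [i for i, lab in enumerate(cluster) if lab == k]
        out[k] = (mean(b[i] for i in idx), mean(c[i] for i in idx))
    return out


result = centroids(B, C, CLUSTER)
for k, (b_mean, c_mean) in result.items():
    print(f"cluster {k}: b = {b_mean:.7f} pu, c = {c_mean:.6e} mw^-1")
```

coefficient a is fixed at 32 mw for every month, so its centroid is trivially 32 mw in each cluster and is left out of the sketch.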
these coefficients are shown in table 5 and their scaled values depicted in fig. 5, while the comparison of calculated and measured losses for each month is shown in fig. 6 and the absolute and relative differences are given in table 6.

table 5 grouped coefficients for three seasons

cluster index   season      coefficient a (mw)   coefficient b (pu)   coefficient c (mw⁻¹)
1               winter      32                   0.0350917            0.00002942712
2               summer      32                   0.0141719            0.00002141825
3               transient   32                   0.0403153            0.00002199484

fig. 5 grouped coefficients for the three seasons

fig. 6 comparison between calculated and measured losses after clustering

table 6 absolute and relative differences in calculated and measured losses after clustering

month   absolute difference (mwh)   relative difference (%)
jan     48709.5891                  7.6852
feb     697.9128                    0.1541
mar     12719.8945                  3.3250
apr     13128.5232                  4.6069
may     4248.3817                   1.5425
jun     6624.1044                   3.5945
jul     7211.0681                   3.7413
aug     4227.7345                   2.1589
sep     3686.2140                   2.0147
oct     2343.3930                   0.7859
nov     16880.7436                  4.7272
dec     17489.1562                  3.5188

it can be seen that a small error appears in the calculation. this error is a consequence of the fact that one set of coefficients per season cannot perfectly fit all months, but seeks to reduce the overall error instead. the highest error is observed in the winter season (7.7 % in january), due to the fact that may once again influenced this calculation. this is also reflected in the somewhat higher value of coefficient c for the winter season. nevertheless, the overall error value is low compared to the values of losses.

6. conclusions

in this paper, a new approach for calculation of losses in the electrical distribution grid was presented. there are two general classes of losses: technical and non-technical losses. both types are unavoidable, but it is important to know how each type affects the total amount of losses, i.e. they have to be classified.
this is done by analyzing the data available from the distribution grid operator. the data contain the distribution grid input energy measurement for every fifteen-minute time interval and the monthly measurements of energy delivered to end users. difference between these two are the real total losses for a certain month. on the other hand, a polynomial equation is introduced to calculate that same losses based on the grid input power. coefficients for this equation are computed using the least squares method, by minimizing the squared differences between the calculated and measured losses. these coefficient values are constrained based on their physical interpretation and authors experience. results show that the minimization was successful and that the losses can clearly be classified this way. additionally, calculated coefficients can be used for future estimation of losses. this concept was further expanded by introducing clustering of months into seasons. the obtained results show expected distinction of months, with the exception of may, which was classified as a winter month. this, along with the results of previous calculations, lead to a conclusion that there has probably been an error in energy consumption readings in this particular month. however, the overall results are very good, and the whole concept can be further improved. future research will be focused on processing the data for the previous few years. this could allow the researchers to discover some specific trends and obtain better clustering accuracy since some seasons may begin in one year and end in another. acknowledgement: this research is partly supported by project grant iii44006 financed by the ministry of education, science and technology development of the republic of serbia. references [1] l. sladojević, a. janjić, m. 
ćirković „calculation of losses in the distribution grid based on big data“, in proceedings of the 4th virtual international conference on science, technology and management in energy, energetics 2018, october 25-26, 2018, pp. 19-22 [2] k. zhou, c. fu and s. yang, “big data driven smart energy management: from big data to big insights”, renewable and sustainable energy reviews, vol. 56, pp. 215–225, 2016. [3] o. ardakanian, n. koochakzadeh, r. p. singh, l. golab, and s. keshav, “computing electricity consumption profiles from household smart meter data,” in proceedings of the edbt workshop on energy data management, 2014, pp. 140–147. [4] h. jiang, j. zhang, w. gao, and z. wu, “fault detection, identification, and location in smart grid based on data-driven computational methods,” ieee transaction on smart grid, vol. 5, no. 6, pp. 2947–2956, 2015. [5] d. n. nikovski, z. wang, a. esenther, h. sun, k. sugiura, t. muso and k. tsuru, “smart meter data analysis for power theft detection”, in proceedings of the int. workshop machine learning and data mining in pattern recognition, ser. lncs, vol. 7988. springer, 2013, pp. 379–389. [6] p. nallagownden and t. p. hong, “development of a new loss prediction method in a deregulated power market using proportional sharing,” in proceddings of power engineering and optimization conference (peoco), 2011 5th international. shah alam, selangor, malaysia: ieee, june 2011, pp. 48–53. 416 l. sladojevic, a. janjic [7] a. petrušić, a. janjić, “economic regulation of electric power distribution network”, in proceedings of 2 nd virtual international conference on science, technology and management in energy, niš, serbia, 2016, pp. 25-32 [8] w. d. liu, t. j. zeng, t. jun, l. l. shang guan, and b. j. li, “research on electric power with development and application of line loss rate forecasting software based on mlrm-gm,” advanced materials research, vol. 977, pp. 182–185, june 2014. [9] k. li, z. q. sun, and m. 
wang, “theoretical line loss calculation of distribution network considering wind turbine power constraint,” advanced materials research, vol. 986-987, pp. 630–634, july 2014. [10] q. ding and a. abur, “transmission loss allocation based on a new quadratic loss expression,” ieee transactions on power systems, vol. 21, pp. 1227–1233, august 2006. [11] g. gross and s. tao, “a physical-flow-based approach to allocating transmission losses in a transaction framework,” ieee transactions on power systems, vol. 15, pp. 631–637, may 2000. [12] elmitwally, a. eladl, and s. m. abdelkader, “efficient algorithm for transmission system energy loss allocation considering multilateral contracts and load variation,” iet generation, transmission & distribution, vol. 9, pp. 2653–2663, august 2015. [13] l. tian, q. q. wang, and a. z. cao, “research on svm line loss rate prediction based on heuristic algorithm,” applied mechanics and materials, vol. 291-294, pp. 2164–2168, february 2013. [14] y. ren, x. g. zhang, and x. c. huang, “study on the prediction of line loss rate based on the improved rbf neural network,” advanced materials research, vol. 915-916, pp. 1292–1295, april 2014. [15] p. s. nagendra rao and r. deekshit energy loss estimation in distribution feeders ieee transactions on power delivery, vol. 21, no. 3, july 2006. [16] m.w. gustafson, j. s. baylor, and s. s. mulnix, “equivalent hours loss factor revisited,” ieee trans. power syst., vol. 3, no. 4, pp. 1502–1507, nov. 1988. [17] m. w. gustafson, “demand, energy and marginal electric system losses,” ieee trans. power app. syst., vol. pas-102, no. 9, pp. 3189–3195, sep. 1983. [18] w. gustafson and j. s. baylor, “approximating the system losses equation,” ieee trans. power syst., vol. 4, no. 3, pp. 850–855, aug. 1989 [19] m. 
järvinen, “developing network loss forecasting for distribution system operator”, master of science thesis, tampere university of technology, 2013 [20] https://www.ofgem.gov.uk/ofgem-publications/43519/sohn-overview-losses-final-internet-version.pdf, accessed on 26.2.2019. [21] j. löfberg, “yalmip: a toolbox for modeling and optimization in matlab”, in proceedings of the cacsd conference, taipei, taiwan, 2–4 september 2004. [22] waltz, r. a., j. l. morales, j. nocedal, and d. orban, “an interior algorithm for nonlinear optimization that combines line search and trust region steps,” mathematical programming, vol. 107, no. 3, pp. 391– 408, 2006. [23] g. james, d. witten, t. hastie, r. tibshirani, ” an introduction to statistical learning with applications in r”, springer texts in statistics, springer science + business media new york, 2013 (corrected at 6th printing 2015). facta universitatis series: electronics and energetics vol. 33, n o 1, march 2020, pp. 105-117 https://doi.org/10.2298/fuee2001105v © 2020 by university of niš, serbia | creative commons license: cc by-nc-nd impact strength of 3d-printed polycarbonate  hans de vries, roy engelen, esther janssen high tech campus 7, eindhoven, netherlands abstract. a vertical wall printed by fused filament fabrication consists of a ribbed surface profile, due to the layer wise deposition of molten plastic. the notches between the printed layers act as stress concentrators and decrease its resistance to impact. this article shows the relation between impact strength and layer height by experimental data and finite element simulations of the stress intensity factor and the plastic zone near the tip of the notch. the impact resistance increased from 6 to 32 kj/m 2 , when the layer height was decreased from 1.8 to 0.2 mm. when notches were removed by sanding, the samples did not fail any more during impact testing, resembling the behavior of smooth molded test bars. 
tensile strength values up to 61 mpa were measured independent of layer height. birefringence measurements were done to determine the actual stress levels, which ranged from 2 to 5 mpa. key words: 3d-printing, polycarbonate, layer height, residual stress, impact strength. 1. introduction fused filament fabrication is an additive manufacturing process in which a product is built up layer-by-layer. each subsequent layer must adhere to the previous layer. in the case of polymers this is done at temperatures well above the glass transition temperature, such that the polymer chains have enough mobility to interpenetrate and a strong interface layer is formed. the tensile and flexural strength of 3d-printed polycarbonate parts can be nearly as high as for injection molded material. values of 88–89% of the bulkor injection molded strength have been published [1,2]. slightly lower values of 63–82% have also been reported, which is probably due to different printing toolpaths and process settings [3,4]. an important geometric difference with injection molded material is the notch that is formed between the printed layers. these can act as loci of stress concentration and might impair the impact strength. moreover, internal stress may add or detract from the strength of the material. this justifies an investigation of the additional stress that builds up in the material during the 3d-printing process. received june 19, 2019; received in revised form october 2, 2019 corresponding author: hans de vries high tech campus 7, 5656ae eindhoven, netherlands (e-mail: j.w.c.de.vries@signify.com)  106 h. de vries, r. engelen, e. janssen unlike for injection molded material, not many studies on the internal stress of 3dprinted materials have been published. here, a few that do exist are briefly mentioned. in [5] the deformation and associated stress was experimentally determined from contact measurements and analytically described for abs (acrylonitrile butadiene styrene)-plates. 
from correspondence with the authors it became clear that their solution cannot be readily applied to samples with a completely different form factor – single pass, vertical walls as used in our work. still, their study is interesting because of its potential to determine the stress in opaque materials. in another investigation, the deformation field was determined by a speckle technique [6]. this also concerned plates of abs. the hole drilling method was used to measure the deformation in printed abs-plates [7]. what all these studies have in common is that they show that the stress at the surface of 3d-printed material is (slightly) compressive while at the interior it is (slightly) tensile. finally, in a study on 3d-printed abs it was also speculated that the notches – or lobes as these authors called it – between the printed layers can act as stress concentrators [8]. in this work the effect of the lobes or notches in 3d-printed structures of polycarbonate on the mechanical strength is investigated. this is done by subjecting as-printed and polished samples to impact tests. in both type of samples, the stress level is estimated from birefringence measurements. 2. experiments 2.1. test samples single wall structures were printed on a desktop printer (ultimaker 2+), which was modified to allow printing of polycarbonate at a temperature between 250 and 300°c. maximum bond strength is reached above the glass transition temperature, when the polymer molecules diffuse across the interface of two consecutive layers [9]. printing was done with layer heights of 0.2 to 1.8 mm. the thickness of the wall after printing was around 2 mm for all layer heights, except for the 1.8 mm layer which had a 3 mm thick wall (see table 1). test bars of 125x13x2 mm 3 were made by machining them out of vertically (sic) printed walls. the stack of printed layers was in the length direction of the sample (see figure 1). 
a part of the bars was polished to remove the notches between the layers. this is supposed to reduce the stress concentration at the tip of these notches as was referred to in the introduction. 2.2. mechanical tests charpy impact tests were performed (impact xjc-25, chengdu jingmi co. ltd.) on as-printed and polished samples. it must be noted that this test was not executed completely according to any standard, only the anvil was set to the required width of 40 mm [10]. the impact strength was determined with the samples positioned as shown in figure 1, thus striking on the face of the wall. machining the test samples from the printed walls means that the narrower cut side is smooth, and thus polishing of that side makes no sense. still, the impact strength in that direction was tested as well. for tensile testing, dog bones (astm type iv) were cut from the printed walls and mounted in a zwick1464 test machine. the tests were performed at a speed of 4 mm per minute. impact strength of 3d-printed polycarbonate 107 fig. 1 structure for impact testing, the hatching representing the printed layers. the dashed grey area (left picture) indicates the position in the vertical wall where the test bar (right picture) is cut from. the arrow shows the direction of the impact 2.3. birefringence internal stress can be determined by retardation caused by birefringence. this was measured by polarization microscopy (leitz laborlux 12 pol) with a 1942k-compensator. for reasons of comparison, the birefringence in injection molded samples was also measured. some materials have a refractive index that depends on the polarization of light. noncubic crystals and plastics under stress exhibit this phenomenon. an incoming light beam is split in two beams with mutual perpendicular polarization directions. after exiting the material, the two beams have a phase difference since one beam will be retarded compared to the other. 
by means of a compensating filter the phase difference can be determined and the retardation (r) can be obtained from a table. the retardation depends on the thickness of the material (t), the stress-optical constant (c), and on the stress level:

$$ r = c \, t \, (\sigma_1 - \sigma_2), \qquad (1) $$

where the σ_i are the first two principal stress components. for polycarbonate several values for the stress-optical constant are mentioned [11,12]. for this investigation a value of 8.9×10⁻¹¹ pa⁻¹ was adopted. this implies that the absolute stress values mentioned in this report are subject to a possible correction if the material’s constant must be changed.

3. results

in this section, the nondestructive analyses will be presented first. this includes both the optical microscopy and polarization measurements. the next part concerns the outcome of the destructive tests, i.e. tensile and impact strength. photographs of a few printed and polished samples are shown in figure 2. it is a top view of the sample as sketched in figure 1. thus, one is looking at the cross section of the printed wall.

fig. 2 top view of printed bars (see figure 1) with 0.2 mm and 1.2 mm layer height, as-printed (left) and after polishing (right). the red scale bar is 0.5 mm

fig. 3 side view of as-printed and polished samples. polarized images of sample with layers of 1 mm height. left: compensator at 0°; right: with compensation (as-printed ~9.5° and polished ~7.5°)

3.1. birefringence / stress

the internal or residual stress that was obtained from the retardation measurements is listed in table 1 for the as-printed and the polished samples. it must be remembered that this is the difference between the first two principal stress components. each measurement was made right in the center of the layer. an example is shown in figure 3 for a sample with layers of 1 mm height. this is a front view of the sample as sketched in figure 1.
thus one is looking at the ridged front wall of the sample. the black band indicates the compensated or stress-free situation. this is approximately 9.5° for the as-printed case and ~7.5° for the polished case. it should be noted that the same samples were measured both before and after polishing. to calculate the stress from the retardation, the thickness of the material is needed (see eq. (1)). in the as-printed samples this is the printed thickness of the wall (t_print), and for the polished samples it is the remaining material, which is the thickness of the interface (t_int). in figure 4 we show the stress data as a function of the printed layer height. data obtained on injection molded pieces of polycarbonate is included. there is no clear difference between the internal stress in the as-printed and polished samples, which indicates that the residual stress does not depend on the ridged geometry of the printed layers. for the small printed layer heights an increased internal stress is observed. as the printed layers get higher, the internal stress approaches the level of the injection molded material.

table 1 retardation (r) and stress (s, difference between first two principal stress components). the printed layer height (h), thickness of wall after printing (t_print) and of interface (t_int) are given.

                             as-printed             polished
h (mm)  t_print (mm)  t_int (mm)  r (nm)  s (mpa)      r (nm)  s (mpa)
0.2     1.93          1.73        853     4.91±0.26    776     5.37±0.17
0.4     1.90          1.72        899     5.26±        669     4.49±0.32
0.8     2.14          1.60        727     3.78±0.23    636     4.36±0.10
1.0     2.09          1.42        627     3.34±0.17    394     2.72±0.26
1.2     2.19          1.39        640     3.26±0.31    373     2.55±0.20
1.8     3.10          1.61        685     2.45±0.11    473     2.35±0.42

3.2. tensile strength and impact strength

tensile strength tests were made on dog bone type samples. the available results are listed in table 2 and graphically shown in figure 5. a polished sample could not be made for testing in all cases.
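returning to the birefringence data of table 1, the conversion from retardation to stress can be sketched numerically. the sketch assumes that eq. (1) has the form r = c·t·(σ1 − σ2), so that the principal stress difference is r/(c·t), with the stress-optical constant of 8.9×10⁻¹¹ pa⁻¹ adopted in the text; the function name is hypothetical. for the as-printed rows of table 1 this assumed form reproduces the listed stress values to within a few percent.

```python
C_OPT = 8.9e-11  # stress-optical constant in 1/Pa, value adopted in the text


def stress_mpa(retardation_nm, thickness_mm, c_opt=C_OPT):
    """Principal stress difference in MPa, assuming eq. (1) reads
    r = c * t * (sigma_1 - sigma_2)."""
    retardation_m = retardation_nm * 1e-9
    thickness_m = thickness_mm * 1e-3
    return retardation_m / (c_opt * thickness_m) / 1e6


# as-printed rows of table 1: (h in mm, t_print in mm, r in nm, listed s in MPa)
AS_PRINTED = [
    (0.2, 1.93, 853, 4.91),
    (0.4, 1.90, 899, 5.26),
    (0.8, 2.14, 727, 3.78),
    (1.0, 2.09, 627, 3.34),
    (1.2, 2.19, 640, 3.26),
    (1.8, 3.10, 685, 2.45),
]

for h, t_print, r, s_listed in AS_PRINTED:
    s_calc = stress_mpa(r, t_print)
    print(f"h = {h} mm: computed {s_calc:.2f} MPa, listed {s_listed} MPa")
```

because the stress scales with 1/c, a different choice of stress-optical constant rescales all values by the same factor, which is the correction mentioned in the text.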
the strength was 77–94% of the bulk value for the injection molded polycarbonate, showing that the process is well controlled. it also shows that the tensile strength of the material is a material property that is not significantly affected by either the layer height or the ridged geometry that result from the 3d-printing process. the measurements from the impact tests give the energy that is absorbed during the shock. in order to compare the results of various samples having different geometries, it was decided to use the energy per cross sectional area. in the case of 3d-printed samples this is the internal cross section for which the thickness of the interface between two layers is taken. the test results are compiled in table 2. a graphical representation of the test results can be found in figure 6. in the case of polished 0.8 mm high layers, two out of four samples did not fail. the polished samples with layers of 0.2 and 0.4 mm height did not fail at all. this means that the impact strength was more than 200 kj/m² given the maximum energy of 4 j of the heaviest hammer in the test facility (in the figure this has been indicated by the dashed arrows). evidently, the impact strength depends on both the layer height and the ridged geometry that result from the 3d-printing process. anticipating the discussion of these results, the iso (179/1ea) test value with a notch radius of 0.25 mm for molded polycarbonate is 65 kj/m². when the notch radius is varied, impact strengths in the range of 10 to over 80 kj/m² were measured [13].

fig. 4 stress calculated from retardation measurements by eq. (1). data from table 1 for as-printed and polished samples. the shaded band indicates the stress level in unnotched injection molded material

fig. 5 tensile strength with data from table 2 for as-printed and polished samples.
the dashed line shows the injection molded value.

table 2 impact strength (j) and tensile strength (ts) for as-printed and polished samples. the printed layer height (h) is given, other sample data are in table 1. in some cases no polished samples could be made.

         j (kj/m²)                  ts (mpa)
h (mm)   as-printed   polished      as-printed   polished
0.2      32±2         > 200         61.4±0.3     56.8±1.1
0.4      14±2         > 200         60.0±1.1     60.6±1.9
0.8      10±2         30±10         60.0±0.9     not tested
1.0      9±2          not tested    56.2±2.2     not tested
1.2      7±2          12±7          50.0±1.5     44.6
1.8      6±0.002      not tested    50.0±5.2     not tested

fig. 6 absorbed impact energy versus printed layer height (see table 2) for as-printed and polished samples. the vertical arrows indicate “no failure” (energy above 200 kj/m²) for polished samples

4. discussion

during the process of 3d-printing, material is added layer-by-layer. the process requires that this is done at an elevated temperature to ensure good adhesion between the layers, which is caused by diffusion and reptation [14]. as stacking of the layers proceeds, the already deposited material begins to cool down and shrinks. thus, thermal stress builds up throughout the entire structure. the additive manufacturing process also leads to inhomogeneity since the printed walls are not smooth but have rounded edges and sharp notches. at such notches stress concentrations occur and these might reduce the impact strength of the printed material. this was also supposed in an investigation of printed abs [8]. the experiments that are described in the previous sections were designed to shed more light on the following questions:

• what is the level of the internal stress in 3d-printed material?
• what is the influence of the inhomogeneous nature (i.e. non-smooth surface finish) of fdm-printed material on its strength?

it is evident that these questions are closely related.
the above questions will be discussed first in a general sense based on the experimental results. afterwards, a detailed treatment of the several topics of this investigation will follow. 4.1. general not much has been published to date on the residual stress in samples made by 3dprinting. relevant to the present investigations is an analytical approach of the deformation and internal stress in 3d-printed beams from abs [5]. it reports radii of curvature of 1.4– 2.0 meter in beams of 5 mm in height made by layers of 200–350 µm thick. these numbers would lead to an internal stress of 2–3 mpa. tests on 3d-printed abs-plates of 3 mm thickness from layers of 240 µm thick, showed bending in the order of 10 µm over a span of 80 mm [6]. one can estimate that the stress amounts to about 6 mpa. also in abs, by the hole-drilling method, a residual stress of 6 mpa was determined [7]. although in these cases abs was used as material, the level of stress is comparable to the value we have found in polycarbonate. in the current study, the internal stress that could be extracted from birefringence measurements is in the range between 2 to 6 mpa (see table 1 and figure 4). this agrees with the abovementioned literature data. our own results for injection molded polycarbonate are on the lower end with about 2 mpa. no information was found concerning the distribution of the stress over the dimensions of the tested samples made by additive manufacturing techniques. of course, as for injection molded material such studies have been published. for instance, by gradually removing thin layers from an injection molded polycarbonate bar and subsequently measuring the stress at the surface, it was found that the stress becomes lower as one reaches the center of the material [15]. at the surface the stress was compressive. elsewhere, the stress near a weld line was modeled, and near the interface the stress increases rapidly [16]. 
this subject will be addressed further in the next paragraphs.

4.2. residual stress distribution

the initial birefringence measurements were done in the center of the layer, i.e. the polarized light travels through the thickest cross-section of the printed ridged structure. no real difference exists between the printed and the polished samples, as one can see in figure 4. with the above indications of stress peaks near a weld line in injection molded material in mind, an attempt was made to perform additional birefringence measurements over the wall thickness of the printed layers. however, only a very tentative indication was found that the stress indeed increases towards the interface. no quantification of the stress at that location was possible.

4.3. tensile and impact strength

the tensile strength does not depend very markedly on the deposited layer height. there is a trend to higher strength as the printed layers get thinner (figure 5) and the strength of injection molded material is seen to be approached (60–65 mpa). as for the notches, these do not seem to have a large effect since the polished samples have the same tensile strength as the as-printed specimens. again, in the literature little has been published regarding the influence of layer height on the tensile strength. to date the only thorough piece of information concerns 3d-printed pla (polylactic acid) [17]. a slight decrease was observed there with thicker layers (0.04 to 0.2 mm).

quite a different picture arises if we look at the impact strength. here, the height of the printed layer has a significant effect on the test result. thinner layers are much more resistant against an impact (figure 6). polishing away the edges makes even more of a difference. striking the polished specimens with layer heights of 0.2 mm, 0.4 mm, and 0.8 mm (partly) on the flat surface did not lead to fracture at all within the limits of the test equipment.
with printed layers of 0.8 mm (partly) and 1.2 mm height, the impact strength after polishing has increased by a factor of between two and three. in combination with the apparent removal of the stress concentration at the layer interfaces, it must be concluded that there is also an increase in the robustness against impact loads. in the next paragraph this will be explored further. 4.4. stress intensity and plastic zone to try to explain the improved resistance of samples fabricated with thin rather than thick layers to impact loads, the concept of the crack-tip plastic zone can be used [18]. from a theoretical point of view, at a crack tip with a vanishing radius the stress would become infinite. since this does not reflect reality, for numerical calculations it was proposed to assume a rounded crack tip with an enhancement factor for the stress. the stress at the end of the crack is then higher than the applied stress. in addition, a region of plastic deformation was defined around the crack tip that prevents the crack propagating on its own. this so-called plastic zone blunts the sharp end of the crack. considering the influence of the printed layer height on the impact strength, it is worthwhile estimating the size of the plastic region. based on studies of the fracture behavior of ductile materials, an expression for the extension of the plastic region was derived [19]. for the present investigation a slightly different form can be used: ( ) (2) where ki is the stress intensity factor and y is the yield stress. estimating the magnitude, for ki values of 2.8 mpa.m 0.5 for sharp and 5–10 mpa.m 0.5 for blunt notches were reported [20]. elsewhere, 1.25 mpa.m0.5 was used [19]. concerning the yield stress, this will typically be around 40 mpa [15,19]. as a result, we get rpl values of 0.06 to 0.6 mm. 
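the order of magnitude of the plastic zone can be checked with a minimal sketch. it assumes irwin's first-order estimates, r = (1/2π)(k_i/σ_y)² for plane stress and r = (1/6π)(k_i/σ_y)² for plane strain; these prefactors are an assumption and may differ from the slightly different form used in eq. (2). the function name is hypothetical. the inputs are the stress intensity factors and the yield stress of about 40 mpa quoted above.

```python
from math import pi


def plastic_zone_mm(k_i, sigma_y, plane_strain=False):
    """Irwin first-order plastic-zone radius in mm (assumed form).

    k_i     : stress intensity factor in MPa*m^0.5
    sigma_y : yield stress in MPa
    """
    prefactor = 1.0 / (6.0 * pi) if plane_strain else 1.0 / (2.0 * pi)
    radius_m = prefactor * (k_i / sigma_y) ** 2
    return radius_m * 1e3  # convert m to mm


SIGMA_Y = 40.0  # MPa, typical yield stress quoted in the text

for k_i in (1.25, 2.8):
    r_stress = plastic_zone_mm(k_i, SIGMA_Y)
    r_strain = plastic_zone_mm(k_i, SIGMA_Y, plane_strain=True)
    print(f"k_i = {k_i} MPa*m^0.5: plane stress {r_stress:.3f} mm, "
          f"plane strain {r_strain:.3f} mm")
```

under these assumptions the estimates fall between a few hundredths and several tenths of a millimetre, the same order of magnitude as the range quoted in the text; since r scales with the square of k_i, the result is sensitive to the choice of stress intensity factor.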
this means that this crack-growth-damping zone can be of the same dimension as the thinner printed layers of 0.2 and 0.4 mm, and perhaps even close to that found in samples with layers of 0.8 mm height. this will be further commented on at the end of this subsection.

all the data for the stress intensity factor were determined for injection molded material. while the similarity in mechanical properties between this and 3d-printed material has been shown, it is better to separately estimate the stress intensity factor for the 3d-printed material. this was done by finite element simulations. a 20 mm-long sample of layered material was modeled in detail for the range of layer heights, in conformity with the dimensions specified in table 1. linear elastic material properties were used for standard polycarbonate, with a modulus of 2.4 gpa and a poisson's ratio of 0.37. the application of a half penny-shaped crack-tip mesh at the notched interface between two layers allows for the accurate calculation of the stress field around the notched interface between the layers. the stress intensity factor was calculated using the j-integral method after imposing 1% tension on the sample. in figure 7 the resulting stress intensity factor is shown for the fabricated layer heights.

in an actual loading situation, the stress will become higher near the end of a crack or notch, as was explained above. should the critical stress be reached, cracks will grow, and failure occurs. this is indicated by the critical stress intensity kic, which is regarded as a material constant. from the data in figure 7 one can infer that for thinner printed layers the actual stress intensity is further away from the critical level than for thicker printed layers. thus, the increased resistance of thin-layered structures against shock impact is understandable.
as for the size of the plastic zone around the notch, the layer-height dependence of the stress intensity factor of figure 7 in combination with equation (2) suggests that the plastic zone grows almost linearly with the height of the printed layer. the validity of this observation is still under investigation.

4.5. notch radius

the influence of the notch radius has previously been studied for molded samples. this concerned testing of samples with a predefined notch. the impact strength of polycarbonate at room temperature increased by a factor of about six if the radius of the notch was increased from 0.125 to 0.25 mm [13]. values were reported of 10–80 kj/m². compared with the values in table 2 these are in line with the strength of 3d-printed polycarbonate. experiments on samples with notches of 0.13 to 0.5 mm radius showed roughly a doubling of the impact strength [21]. it is not yet possible to make an estimation of the radius of the notches between the printed layers. still, qualitatively, a rounding-off as the layers are thinner was found. again, this makes the samples made with thin layers stronger than those with thick layers.

to summarize, three mechanisms have been identified that all enhance the impact strength of 3d-printed structures. printing of thin layers leads to a rounding of the notch and thus a lower stress peak at the interface between the layers. the second mechanism stems from the size of the plastic zone around the notch that reduces crack growth. there are indications that this zone is of the same dimension as the height of the thinnest layers. finally, polishing the printed wall to make it smooth removes the origin of the stress concentration.

fig. 7 stress intensity ki as function of printed layer height. the dashed line is a guide to the eye

5.
summary and conclusions

various experiments and analyses were carried out on 3d-printed samples with deposited layer heights from 0.2 mm to 1.8 mm. the purpose was to understand the relation between mechanical strength, internal stress and layer height in fdm-printed polycarbonate products. as it was supposed that the shape of the printed layers forms a notch-like indentation, several samples were polished to obtain a smooth wall. the main point of attention of this study was to evaluate the level and the variation of the internal stress and the possible relation with the impact strength of the samples.

3d-printed plastic parts may contain more internal stress than injection molded parts, due to the layered way of fabrication. this leads to deformation and residual stresses, especially for engineering plastics, which are processed at high temperatures. in this study, the measured internal stresses ranged from 2 mpa for molded samples to 2–5 mpa for 3d-printed parts, depending on the deposited layer height.

polycarbonate is known to be sensitive to notches. notched molded samples have a lower impact strength than unnotched samples, because the notch acts as a stress concentrator. due to the layered structure of 3d-printed parts, the surface intrinsically contains many notches. depending on layer height, and thus the radius of the notch, the measured impact strength ranged between 6 and 32 kj/m². similar values (10–80 kj/m²) have been reported for notched molded samples [13].

tensile strength is dependent on the processing temperature and is not influenced by the notches on the surface. perpendicular to the layers, single pass, vertical printed walls have a strength of up to 61 mpa, which is 95% of the value measured for injection molded samples.
the following conclusions can be drawn from the results of the investigations and are supported by results from finite element simulations:

5.1 internal stress

• stress concentrates at the notches between the printed layers. removal of the edges to make a smooth wall reduces the stress concentration and makes the stress almost constant over the layer height.
• the notch between thin layers is rounded as compared to the notch in thick layers. this reduces the magnitude of the stress at the interface.
• the internal stress is of equal magnitude in the center of the layers in as-printed and polished samples. this was determined by optical birefringence measurements.
• the internal stress increases as the printed layers get thinner. in the thickest layers of 1.8 mm the internal stress level was the same as that of injection molded material.
• the stress intensity factor is lower for thinner layers, which means that in samples with thinner layers more stress-load can be applied before fracture occurs.
• there are indications of the existence of a plastic zone around the notch which is of the same size as the height of the thinner layers. this zone hampers the propagation of cracks.
• it is very likely that the stress is not constant over the wall thickness of the printed layers. an attempt was made to determine such variation, but with an inconclusive result. this is a subject for further research.

based on this set of conclusions, some trends in the impact strength can be noted.

5.2 impact strength and tensile strength

• the impact strength is highest in samples with thin layers because of the rounded notches. the role of the plastic zone around the tip of the notch is however not yet completely understood.
• removing the edges increases the impact strength in thin layered samples further because the stress concentrations are taken away.
• the impact strength depends on the geometry of the printed samples (e.g. notch radius and layer height); it cannot be regarded as a material constant.
• the tensile strength is weakly affected by the layer height.
• provided that the processing conditions during printing are such that a proper adhesion between the printed layers can be achieved, the tensile strength can be regarded as a material property for the 3d-printed samples.

acknowledgement: the authors are grateful to rifat hikmet, loes koopmans, leendert van der tempel, and ruud stelten for technical and theoretical support.

references

[1] j. cantrell, s. rohde, d. damiani, r. gurnani, l. disandro, j. anton, a. young, a. jerez, d. steinbach, c. kroese, p. ifju, "experimental characterization of the mechanical properties of 3d-printed abs and polycarbonate parts", rapid prototyping j., vol. 23, no. 4, pp. 811-829, 2017.
[2] n. hill, m. haghi, "deposition direction-dependent failure criteria for fused deposition modeling polycarbonate", rapid prototyping j., vol. 20, no. 3, pp. 221-227, 2014.
[3] w.c. smith, r.w. dean, "structural characteristics of fused deposition modeling polycarbonate material", polymer testing, vol. 32, pp. 1306-1312, 2013.
[4] m. domingo-espin, j.m. puigoriol-forcada, a.a. garcia-granada, j. llumà, s. borros, g. reye, "mechanical property characterization and simulation of fused deposition modeling polycarbonate parts", materials and design, vol. 83, pp. 670-677, 2015.
[5] v.a. safronov, r.s. khmyrov, d.v. kotoban, a.v. gusarov, "distortions and residual stresses at layer-by-layer additive manufacturing by fusion", j. manuf. sc. eng., vol. 139, pp. 031017-1-6, 2017.
[6] w. zhang, a.s. wu, j. sun, z. quan, b. gu, b. sun, c. cotton, d. heider, t.w. chou, "characterization of residual stress and deformation in additively manufactured abs polymer and composite systems", composites. sc. techn., vol. 150, pp. 102-110, 2017.
[7] c. casavola, a. cazzato, v. moramarco, g.
pappalettera, "residual stress measurement in fused deposition modelling parts", polymer testing, vol. 58, pp. 249-255, 2017.
[8] j.e. seppala, s.h. han, k.e. hillgartner, c.s. davis, k.b. migler, "weld formation during material extrusion additive manufacturing", soft matter, vol. 13, pp. 6761-6769, 2017.
[9] see e.g. a.c. abbott, g.p. tandon, r.l. bradford, h. koerner, j.w. baur, "process-structure-property effects on abs bond strength in fused filament fabrication", additive manufacturing, vol. 19, pp. 29-38, 2018. j. yin, c. lu, j. fu, y. huang, y. zheng, "interfacial bonding during multi-material fused deposition modeling (fdm) process due to inter-molecular diffusion", materials and design, vol. 150, pp. 104-112, 2018.
[10] metallic materials – charpy pendulum impact test – part 1: test method. iso 148-1. 2009.
[11] s. shirouzu, h. shikuma, n. senda, m. yoshida, s. sakamoto, k. shigematsu, t. nakagawa, s. tagami, "stress optical coefficients in polycarbonates", jpn. j. appl. phys., vol. 29, no. 5, pp. 898-901, 1990.
[12] r. wimberger-friedl, j.g. de bruin, h.f.m. schoo, "residual birefringence in modified polycarbonates", polymer eng. & sc., vol. 43, no. 1, pp. 62-70, 2003.
[13] g. allen, d.c.w. morley, t. williams, "the impact strength of polycarbonate", j. mater. sc., vol. 8, pp. 1449-1452, 1973.
[14] see e.g. j. f. rodriguez, thomas, and j. e. renaud, "maximizing the strength of fused-deposition abs plastic parts", solid freeform fabrication platform, pp. 335-342, 1999. j. p. thomas and j. f. rodríguez, "modeling the fracture strength between fused deposition extruded roads", solid freeform fabrication platform, pp. 16-23, 2000.
[15] a. ram, o. zilber, s. kenig, "residual stresses and toughness of polycarbonate exposed to environmental conditions", polymer eng. & sc., vol. 25, no. 9, pp. 577-581, 1985.
[16] b. yang, j. oujang, f.
wang, "simulation of stress distribution near weld line in the viscoelastic melt mold filling process", j. appl. math., vol. 2013, article id 856171, 2013.
[17] j. floor, "getting a grip on the ultimaker 2: tensile strength of 3d printed pla: a systematic investigation", technical university of delft, msc-thesis, 2015.
[18] g.r. irwin, "analysis of stresses and strains near the end of a crack traversing a plate", j. applied mechanics, vol. 24, pp. 361-36, 1957.
[19] m.t. takemori, d.s. matsumoto, "an unusual fatigue crack-tip plastic zone: the epsilon plastic zone of polycarbonate", j. polym. sc., vol. 20, pp. 2027-2040, 1982.
[20] r.a.w. fraser, i.m. ward, "the impact fracture behavior of notched specimens of polycarbonate", j. mater. sc., vol. 12, pp. 459-468, 1977.
[21] l.e. hornberger, g. fan, k.l. devries, "effect of thermal treatment on the impact strength of polycarbonate", j. appl. phys., vol. 60, pp. 2678-2682, 1986.

facta universitatis series: electronics and energetics vol. 29, no 1, march 2016, pp. 101–112 doi: 10.2298/fuee1601101k

binary to rns encoder for the moduli set $\{2^n - 1, 2^n, 2^n + 1\}$ with embedded diminished-1 channel for dsp application

ivan krstić1, negovan stamenković2, vidosav stojanović3
1university of priština, faculty of technical science, serbia
2university of priština, faculty of natural sciences and mathematics, serbia
3university of niš, faculty of electronic engineering, serbia

received september 24, 2014; received in revised form september 1, 2015
corresponding author: ivan krstić, university of priština, faculty of technical sciences, serbia (e-mail: ivan.krstic@pr.ac.rs)

facta universitatis series: electronics and energetics vol. 28, no 4, december 2015, pp.
507–525 doi: 10.2298/fuee1504507s

horizontal current bipolar transistor (hcbt) – a low-cost, high-performance flexible bicmos technology for rf communication applications

tomislav suligoj1, marko koričić1, josip žilak1, hidenori mochizuki2, so-ichi morita2, katsumi shinomura2, hisaya imai2
1university of zagreb, faculty of electrical engineering and computing, department of electronics, micro- and nano-electronics laboratory, croatia
2asahi kasei microdevices co. 5-4960, nobeoka, miyazaki, 882-0031, japan

abstract. in an overview of horizontal current bipolar transistor (hcbt) technology, the state-of-the-art integrated silicon bipolar transistors are described which exhibit ft and fmax of 51 ghz and 61 ghz and an ft·bvceo product of 173 ghzv, which are among the highest values for implanted-base silicon bipolar transistors. hcbt is integrated with cmos in a considerably lower-cost fabrication sequence as compared to standard vertical-current bipolar transistors, with only 2 or 3 additional masks and fewer process steps. due to its specific structure, the charge sharing effect can be employed to increase bvceo without sacrificing ft and fmax. moreover, the electric field can be engineered just by manipulating the lithography masks, achieving high-voltage hcbts with breakdowns up to 36 v integrated in the same process flow with high-speed devices, i.e. at zero additional cost. a double-balanced active mixer circuit is designed and fabricated in hcbt technology. a maximum iip3 of 17.7 dbm at a mixer current of 9.2 ma and a conversion gain of -5 db are achieved.

key words: bicmos technology, bipolar transistors, horizontal current bipolar transistor, radio frequency integrated circuits, mixer, high-voltage bipolar transistors.

1. introduction

in the highly competitive wireless communication markets, the rf circuits and systems are fabricated in technologies that are very cost-sensitive.
in order to minimize the fabrication costs, the sub-10 ghz applications can be processed by using the high-volume silicon technologies. it has been identified that the optimum solution might

received march 9, 2015
corresponding author: tomislav suligoj, university of zagreb, faculty of electrical engineering and computing, department of electronics, micro- and nano-electronics laboratory, croatia (e-mail: tom@zemris.fer.hr)

facta universitatis (niš) ser.: elec. energ. vol. 27, no. 4, december 2014, xx-xx

binary to rns encoder for the moduli set $\{2^n - 1, 2^n, 2^n + 1\}$ with embedded diminished-1 channel for dsp application

ivan krstić, negovan stamenković and vidosav stojanović

abstract: a binary-to-residues encoder (forward encoder) is an essential building block for the residue number system digital signal processing (rns dsp) and as such it should be built with a minimal amount of hardware and be efficient in terms of speed and power. the main parts of the forward encoder are residue generators, which are usually classified into two categories: those based on arbitrary moduli sets, which make use of look-up tables, and those based on the special moduli sets. a new memoryless architecture of a binary-to-rns encoder based on the special moduli set $\{2^n - 1, 2^n, 2^n + 1\}$ with embedded modulo $2^n + 1$ channel in the diminished-1 representation is presented. either of the two channel operations (standard modulo $2^n + 1$, or modulo $2^n + 1$ in the diminished-1 representation) can be selected by using a single switch. the proposed encoder has been implemented on a xilinx fpga chip for various dynamic range requirements.

keywords: rns system, special moduli set, forward encoder, diminished-1 encoded channel, modulo carry save adders, virtex fpga.

1 introduction

residue number system [1, 2] is a non-weighted integer number system in which arithmetic operations are limited to the addition, subtraction and multiplication.
other arithmetic operations such as division, sign detection, overflow, scaling and magnitude comparison are non-modular and quite complex for implementation. the rns is defined in terms of a set of relatively prime moduli called rns basis. the special moduli set $\{2^n - 1, 2^n, 2^n + 1\}$ has gained popularity and is expected to play an important role in rns digital signal processing [3]. in comparison with the other moduli sets, the special moduli set has the advantage of low-cost forward conversion, modulo reduction, and reverse conversion. thus, the use of this moduli set can significantly reduce hardware complexity and delay [4].

the rns dsp consists of three major parts: binary-to-residue encoder, modular arithmetic channels and residue to binary decoder (reverse converter). the forward encoder and reverse converter are needed to achieve the rns representation of the binary number and vice versa, respectively [5]. the above-mentioned modular operations, required by each modular arithmetic channel, are inherently carry-free addition, multiplication and borrow-free subtraction, which means that each digit of the resulting number is a function of only one digit from each operand and independent of the others. this is the most attractive feature of rns, which enables one to design highly parallel structures for computation, leading to the speed improvement required for dsp applications [6, 7, 8, 9, 10].

manuscript received on september 25, 2014. i. krstić is with university of priština, faculty of technical sciences (e-mail ivan.krstic@pr.ac.rs). n. stamenkovic is with university of priština, faculty of natural sciences and mathematics (e-mail negovan.stamenkovic@pr.ac.rs). v. stojanović is with university of niš, faculty of electronic engineering (e-mail vidosav.stojanovic@elfak.ni.ac.rs).
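the carry-free property described above can be illustrated with a minimal python sketch. it assumes n = 4 and two arbitrary operands purely for illustration; the helper names `moduli_set`, `to_rns`, `rns_add` and `rns_mul` are hypothetical and do not come from the paper. each residue channel is processed independently of the others, and the result agrees with ordinary integer arithmetic reduced modulo the product of the moduli.

```python
from math import prod


def moduli_set(n):
    """Return the special moduli set {2^n - 1, 2^n, 2^n + 1}."""
    return (2**n - 1, 2**n, 2**n + 1)


def to_rns(x, moduli):
    """Forward conversion: one remainder per modulus."""
    return tuple(x % m for m in moduli)


def rns_add(a, b, moduli):
    """Channel-wise addition; nothing propagates between channels."""
    return tuple((ai + bi) % m for ai, bi, m in zip(a, b, moduli))


def rns_mul(a, b, moduli):
    """Channel-wise multiplication; nothing propagates between channels."""
    return tuple((ai * bi) % m for ai, bi, m in zip(a, b, moduli))


n = 4                     # illustrative word length (assumption)
moduli = moduli_set(n)    # (15, 16, 17)
big_m = prod(moduli)      # product of the three moduli
x, y = 23, 45             # arbitrary operands chosen for the demo

rx, ry = to_rns(x, moduli), to_rns(y, moduli)
assert rns_add(rx, ry, moduli) == to_rns((x + y) % big_m, moduli)
assert rns_mul(rx, ry, moduli) == to_rns((x * y) % big_m, moduli)
```

the sketch only models the arithmetic; in hardware the three channels would run in parallel rather than in a loop.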
a binary-to-residues encoder is an essential building block for a residue number system and as such it should be built with a minimal amount of hardware and be efficient in terms of speed and power. conceptually, a binary-to-residues encoder involves computation of the remainders of the input bit stream with respect to each modulus in the rns moduli set. in other words, the binary-to-residues encoder maps a binary weighted number into a finite ring [11]. a finite ring is a finite set of elements over which the modular addition and the modular multiplication operations are defined.

the main parts of a binary-to-residues encoder are the forward converters (residue generators). the forward converters are usually classified into two categories: those based on arbitrary moduli sets [12, 4], which are usually built using look-up tables, and those based on special moduli sets [13, 14, 15]. the use of special moduli sets simplifies the forward conversion algorithms, and such forward converters can be realized using only combinational logic.

the dynamic range of the rns system, which is equal to the product of the moduli of the three-moduli set $\{2^n - 1, 2^n, 2^n + 1\}$, is $M = 2^{3n} - 2^n$, i.e. it corresponds to 3n bits. thus, any 3n-bit unsigned binary integer $X$ can be uniquely represented by its residues: $X = (x_1, x_2, x_3)$, where $x_1$ is the remainder when $X$ is divided by $2^n - 1$, denoted as $\langle X \rangle_{2^n-1}$, $x_2 = \langle X \rangle_{2^n}$ and $x_3 = \langle X \rangle_{2^n+1}$. the diminished-1 number system [16] can be used to represent the modulo $2^n + 1$ residue ($x_3 = \langle X \rangle_{2^n+1}$) as $x'_3 = \langle X - 1 \rangle_{2^n+1}$. thus, each operand is represented decreased by one, and the zero operands are not used in the computation channel.

in the diminished-1 representation, $x'_3$ is represented as $2^n x_{3,n} + x'_3$, where $x_{3,n}$ is the zero indication bit and $x'_3$ is the n-bit number part. if $x_3 > 0$, then $x_{3,n} = 0$ and $x'_3 = x_3 - 1$, whereas for $x_3 = 0$, $x_{3,n} = 1$ and $x'_3 = 0$. thus, for the diminished-1 representation, the residue of $X - 1$ modulo $2^n + 1$ is used instead of the residue of $X$ modulo $2^n + 1$. the results of arithmetic operations are derived separately when any of the operands or the result is equal to zero [17, 18, 19]. in this way, the diminished-1 representation can lead to implementations with delay and area complexity approaching that of the modulo $2^n - 1$ channel.

this paper presents a binary-to-residues encoder based on the special moduli set $\{2^n - 1, 2^n, 2^n + 1\}$ with embedded diminished-1 encoded channel, which unifies the encoder architectures presented in [15]. the theoretical background of the forward converters for the $2^n - 1$ and $2^n$ channels remains the same as in [15], while a new forward converter for the $2^n + 1$ channel with embedded diminished-1 encoded channel has been developed. the standard and the diminished-1 forward converters for the modulo $2^n + 1$ channel are implemented on the same hardware. thus, the standard $2^n + 1$ channel or the diminished-1 channel can be activated simply by using a single switch.

the rest of the paper is organized as follows. in section 2 we introduce the binary-to-residues memoryless encoder for the special moduli set based only on standard combinational logic, and a novel design of the binary to residue encoder with embedded diminished-1 channel. section 3 presents the hardware implementation and performance evaluation. our conclusion is drawn in section 4.

2 binary-to-residues encoder

in our approach, the 3n-bit input is divided into three n-bit sections to obtain the corresponding three residue numbers in parallel.
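the division of the input into three n-bit sections can be sketched in python as follows. the function name `split_sections` is hypothetical, and n = 6 with a 16-bit test value is only an illustrative assumption; shifts and masks stand in for what would be plain wiring in hardware.

```python
def split_sections(x, n):
    """Split an integer of up to 3n bits into low, middle and high n-bit sections."""
    mask = (1 << n) - 1
    low = x & mask                  # bits n-1 .. 0
    mid = (x >> n) & mask           # bits 2n-1 .. n
    high = (x >> (2 * n)) & mask    # bits 3n-1 .. 2n
    return low, mid, high


n = 6
x = 0b1101_010010_011001            # arbitrary 16-bit test value (assumption)
low, mid, high = split_sections(x, n)

# recombining the weighted sections restores the original value
assert high * 2**(2 * n) + mid * 2**n + low == x
```

because the three sections are available simultaneously, each residue generator can start working on them without waiting for the others.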
a 3n-bit integer in the range $0 \le X \le M - 1$ can be represented in power-of-two notation as [20, 21]:

$$X = \sum_{i=0}^{3n-1} b_i 2^i = N_2 \times 2^{2n} + N_1 \times 2^n + N_0, \qquad (1)$$

where

$$N_0 = \sum_{i=0}^{n-1} b_i 2^i, \quad N_1 = \sum_{i=n}^{2n-1} b_i 2^{i-n} \quad \text{and} \quad N_2 = \sum_{i=2n}^{3n-1} b_i 2^{i-2n}. \qquad (2)$$

in order to obtain the rns representation of the integer $X$, partitioned into three n-bit parts $N_0$, $N_1$ and $N_2$, three residue generators are required, one for each channel.

2.1 forward conversion for modulo $2^n$ and modulo $2^n - 1$ channel

forward conversion for the modulo $2^n$ channel is quite simple, i.e. the residue $x_2$ can be obtained by truncation of $X$:

$$x_2 = \langle X \rangle_{2^n} = b_{n-1} b_{n-2} \cdots b_0. \qquad (3)$$

the calculation of $x_1$ can be performed as a sequence of additions [15]:

$$x_1 = \langle N_2 + N_1 + N_0 \rangle_{2^n-1}, \qquad (4)$$

which can be performed by a csa with eac (carry save adder with end around carry) on whose inputs three n-bit operands ($N_2$, $N_1$, $N_0$) are connected, followed by the cpa with eaic (carry propagate adder with end around inverted carry) and decrementer. slight modifications to the architecture of the modulo $2^n - 1$ residue generator presented in [15] are introduced: the lsb full-adder of the cpa with eaic (end-around inverted carry) has been replaced with ha (full-adder with one input driven by the logical one), while the msb half-subtractor of the decrementer has been replaced with an xor gate, fig. 1. ha has the same complexity as the standard half-adder except for an extra inverter, whose delay and area consumption can be ignored if using the unit-gate model as a means of performance evaluation. the critical path of the binary-to-modulo $2^n - 1$ converter is depicted by a dashed line.

fig. 1.
the architecture of the modulo $(2^6 - 1)$ residue generator.

the propagation delay of the binary to modulo $2^n - 1$ converter, according to the unit-gate model, is $T_1 = 3n + 4$. the area cost of the binary to modulo $2^n - 1$ converter is $A_1 = 17n - 5$.

2.2 forward conversion for modulo $2^n + 1$ channel

the modulo $2^n + 1$ residue and the modulo $2^n + 1$ residue in diminished-1 number system representation can be calculated by equations derived in [15]:

$$x_3 = \langle S + C + 1 \rangle_{2^n+1}, \qquad (5)$$

$$x'_3 = \langle S + C \rangle_{2^n+1}, \qquad (6)$$

where $S$ and $C$ are the n-bit partial sum and carry vectors generated by the modulo $2^n + 1$ carry save adder, whose inputs are driven by $N_2$, $\bar{N}_1$ and $N_0$, where $\bar{N}_1 = 2^n - 1 - N_1$ is the one's complement of operand $N_1$ [22]. by introduction of a control bit $d$,

$$d = \begin{cases} 1, & \text{if calculating } x_3 \\ 0, & \text{if calculating } x'_3 \end{cases}, \qquad (7)$$

equations (5) and (6) are combined to form

$$x''_3 = \langle S + C + d \rangle_{2^n+1}. \qquad (8)$$

the modulo $2^n + 1$ addition of the two n-bit operands $S = s_{n-1} s_{n-2} \ldots s_0$, $C = c_{n-1} c_{n-2} \ldots c_0$ and the 1-bit operand $d$ is based on the following relation:

$$x''_3 = \begin{cases} S + C + d, & S + C + d \le 2^n \\ S + C + d + (2^n - 1) - 2^{n+1}, & \text{otherwise} \end{cases}. \qquad (9)$$

in order to implement (9) we can ignore the output carry ($c_{out}$) from the $2^{n+1}$ position and add the constant value of $2^n - 1$ to the result of $A = S + C + d$ (in binary notation $A = a_n a_{n-1} \ldots a_0$), if $S + C + d > 2^n$. that is, the output of the residue generator should yield the value $B$, which is obtained by adding the (n + 1)-bit binary number $K = 0\,\underbrace{11\ldots1}_{n}$ to the binary number $A$, $B = a_n a_{n-1} \ldots a_0 + 0\,\underbrace{11\ldots1}_{n}$, where $a_n = c_{out}$. let $p_{i+1}$ denote the carry from the i-th bit position obtained while performing the addition of the binary numbers $A$ and $K$. it is obvious that:

$$p_1 = a_0, \qquad p_{i+1} = p_i \vee a_i, \quad \text{for } i = 1, 2, \ldots, n - 1, \qquad (10)$$

where $\vee$ corresponds to the logical or operation. furthermore, since every bit of $K$ below position n is equal to one, the output vector $B$ is:

$$b_0 = \bar{a}_0, \qquad b_i = \overline{p_i \oplus a_i}, \quad \text{for } i = 1, 2, \ldots, n - 1, \qquad b_n = p_n \oplus a_n. \qquad (11)$$

according to the value of the control signal $sel$, either $A = a_n a_{n-1} \ldots a_0$ or $B = b_n b_{n-1} \ldots b_0$ should be connected to the output (if $sel = 0$ then $x''_3 = A$, else $x''_3 = B$):

$$x''_{3,k} = (b_k \wedge sel) \vee (a_k \wedge \overline{sel}). \qquad (12)$$

considering the values of $c_{out}$ and $p_n$, there are three cases to be discussed:

1. if $S + C + d < 2^n$, that is $c_{out} = 0$ and $p_n$ can be zero or one, then $sel = 0$, which corresponds to the binary number $A$ at the output.
2. if $S + C + d = 2^n$, that is $c_{out} = 1$ and $p_n = 0$, then $sel = 0$ and $x''_3 = A$.
3. if $S + C + d > 2^n$, that is $c_{out} = 1$ and $p_n = 1$, then $sel = 1$, which corresponds to the binary number $B$ at the output.

according to the above discussion it can be concluded that $sel = c_{out} \wedge p_n$, where $\wedge$ corresponds to the logical and operation. equation (12) can be simplified by putting (11) in (12):

$$x''_{3,0} = a_0 \oplus sel, \qquad x''_{3,i} = (sel \wedge \bar{p}_i) \oplus a_i, \quad \text{for } i = 1, 2, \ldots, n - 1, \qquad x''_{3,n} = a_n \wedge \bar{p}_n. \qquad (13)$$

finally, the architecture of the binary to modulo $2^n + 1$ converter with the embedded modulo $2^n + 1$ channel in the diminished-1 representation for n = 6 is given in fig. 2. depending on the value of the control bit $d$, the converter gives either $x_3$ or $x'_3$. the critical path of the converter is depicted by the dashed line.

the theoretical formula for the propagation delay, i.e. conversion time, of the binary to modulo $2^n + 1$ residue generator with the embedded diminished-1 channel is $T_3 = 2n + 10$. the area cost is equal to $A_3 = 18n$.

the validity of the modulo $2^n + 1$ channel and the diminished-1 encoded channel operation of the binary to residues encoder for a 16-bit input number and n = 6 is demonstrated in the following example. let $X = 54\,425 = 1101\ 010010\ 011001_2$.
the carry save adder with end around inverted carry (eaic) reduces the three 6-bit inputs $N_0$, $\bar{N}_1$ and $N_2$ to two 6-bit numbers, the partial sum sequence ($S$) and the partial carry sequence ($C$):

$$\begin{array}{rcl} N_2 &=& 0\ 0\ 1\ 1\ 0\ 1 \\ \bar{N}_1 &=& 1\ 0\ 1\ 1\ 0\ 1 \\ N_0 &=& 0\ 1\ 1\ 0\ 0\ 1 \\ \hline S &=& 1\ 1\ 1\ 0\ 0\ 1 \\ C &=& 0\ 0\ 1\ 1\ 0\ 1\ \ 1 \quad \text{(eaic)} \end{array}$$

fig. 2. the architecture of binary to modulo $(2^6 + 1)$ converter with the embedded modulo $(2^6 + 1)$ channel in the diminished-1 representation. for $d = 1$ we have $x_3 = \langle X \rangle_{2^6+1}$, but for $d = 0$ we have diminished-1 encoded channel $x'_3 = \langle X - 1 \rangle_{2^6+1}$.

for $d = 1$ the carry input of the lsb full adder of the cpa is equal to 1, and the cpa gives:

$$\begin{array}{rcr} S &=& 1\ 1\ 1\ 0\ 0\ 1 \\ C &=& 0\ 1\ 1\ 0\ 1\ 1 \\ d &=& 1 \\ \hline A &=& 1\ 0\ 1\ 0\ 1\ 0\ 1 \end{array}$$

a carry out $c_{out} = a_6 = 1$ is generated. since the 6-bit vector $P$ is equal to $P = [1\ 1\ 1\ 1\ 1\ 1]$ and $sel = c_{out} \wedge p_6 = 1$ (bit $p_6$ is the msb), the output of the converter is given by:

$$x''_3 = x_3 = [0\ 0\ 1\ 0\ 1\ 0\ 0]$$

or in decimal notation $x_3 = 20$, which can be verified as true.

for $d = 0$, the cpa gives:

$$\begin{array}{rcr} S &=& 1\ 1\ 1\ 0\ 0\ 1 \\ C &=& 0\ 1\ 1\ 0\ 1\ 1 \\ \hline A &=& 1\ 0\ 1\ 0\ 1\ 0\ 0 \end{array}$$

a carry out $c_{out} = a_6 = 1$ is generated. since the 6-bit vector $P$ is equal to $P = [1\ 1\ 1\ 1\ 0\ 0]$ and $sel = c_{out} \wedge p_6 = 1$, the output of the converter is given by:

$$x''_3 = x'_3 = [0\ 0\ 1\ 0\ 0\ 1\ 1]$$

or in decimal notation $x'_3 = 19$, which can be verified as true.
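the worked example can be cross-checked with a small bit-level model in python. this is only a behavioural sketch under stated assumptions, not the authors' vhdl design: the function names `csa_eaic` and `mod_2n_plus_1_channel` are hypothetical, and the inverted carries used in the selection step follow from adding the all-ones constant $2^n - 1$ to the cpa result.

```python
def csa_eaic(n2, n1, n0, n):
    """Carry-save addition of N2, one's complement of N1, and N0.

    The carry out of the msb position is inverted and re-enters at the lsb
    (end-around inverted carry), giving n-bit vectors S and C.
    """
    mask = (1 << n) - 1
    n1c = mask - n1                              # one's complement of N1
    s = n2 ^ n1c ^ n0                            # partial sum vector S
    maj = (n2 & n1c) | (n2 & n0) | (n1c & n0)    # carry bits c_{n-1} .. c_0
    cout = maj >> (n - 1)                        # carry out of the msb position
    c = ((maj << 1) & mask) | (cout ^ 1)         # shifted carries plus inverted carry
    return s, c


def mod_2n_plus_1_channel(x, n, d):
    """Return <x> mod (2^n + 1) for d = 1, or <x - 1> mod (2^n + 1) for d = 0."""
    mask = (1 << n) - 1
    n0, n1, n2 = x & mask, (x >> n) & mask, (x >> (2 * n)) & mask
    s, c = csa_eaic(n2, n1, n0, n)
    a_val = s + c + d                            # (n + 1)-bit cpa result A
    a = [(a_val >> i) & 1 for i in range(n + 1)]

    # p[i] is the carry into bit i when the constant 2^n - 1 is added to A
    p = [0] * (n + 1)
    p[1] = a[0]
    for i in range(1, n):
        p[i + 1] = p[i] | a[i]

    sel = a[n] & p[n]                            # 1 only when A exceeds 2^n

    out = [0] * (n + 1)
    out[0] = a[0] ^ sel
    for i in range(1, n):
        out[i] = (sel & (p[i] ^ 1)) ^ a[i]
    out[n] = a[n] & (p[n] ^ 1)
    return sum(bit << i for i, bit in enumerate(out))


x, n = 54425, 6
assert csa_eaic(13, 18, 25, n) == (0b111001, 0b011011)
assert mod_2n_plus_1_channel(x, n, 1) == x % 65          # standard residue
assert mod_2n_plus_1_channel(x, n, 0) == (x - 1) % 65    # diminished-1 residue
```

the final two assertions compare the gate-level selection logic against python's own modulo operator, so any mismatch in the carry or selection equations would surface immediately.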
3 hardware implementation and performance evaluation

in this section, the propagation delay and the amount of hardware needed to implement the proposed encoder on an asic and an fpga chip are given, along with a comparison to the encoders presented in [15]. the encoder architecture shown in fig. 3 is based on equations (3), (4) and (9).

fig. 3. the architecture of the new binary to residues encoder for moduli set $\{2^n - 1, 2^n, 2^n + 1\}$ with embedded diminished-1 channel. [block diagram: csa with eac, csa with eaic, two cpas, decrementer, selection network and switch sw; inputs $n_2$, $n_1$, $n_0$; outputs $x_1$, $x_2$, and $x_3$ or $x'_3$]

the carry save adder with end around carry, whose inputs are three n-bit operands, followed by the decrementer is used for the modulo $2^n - 1$ channel.

by using the switch sw, the modulo $2^n + 1$ channel in the new encoder can yield either the modulo $2^n + 1$ residue ($x_3$) or the modulo $2^n + 1$ residue represented in the diminished-1 number system ($x'_3$), as illustrated in fig. 3. if the carry input of the cpa is equal to 1 (sw in upper position), the output of the modulo $2^n + 1$ channel is equal to $x_3$. on the other hand, if the carry input of the cpa is equal to 0 (sw in lower position), the output of the modulo $2^n + 1$ channel is equal to the modulo $2^n + 1$ residue in diminished-1 representation, denoted as $x'_3$.

as shown in fig. 3, the cpa is the critical element in both the modulo $2^n - 1$ and the modulo $2^n + 1$ residue generator data paths. the encoder performance can be increased if the cpa is replaced with a parallel-prefix adder. in vlsi implementations, parallel-prefix adders (or carry-tree adders) are known to have the best performance [23].
however, this performance advantage does not translate directly to an fpga chip due to constraints on the logic block configurations and routing overhead. an fpga chip, such as the virtex-6, contains a number of slices, each containing a number of multiplexers, look-up tables, logic gates, flip-flops, etc. a parallel-prefix adder implementation on an fpga chip is given in [24]. however, implementations of the binary to residues encoder based on parallel-prefix adders can lead to a higher hardware cost and consequently to considerably higher power dissipation in comparison to the carry propagate based architectures.

since the residues are computed in parallel, the propagation delay of the binary to residues encoder is $t = \max(t_1, t_3)$. that is, if $n < 6$, $t = 2n + 10$, else $t = 3n + 4$. the area cost of the binary to residues encoder is $a = a_1 + a_3 = 35n - 5$.

the presented algorithms were used to describe the proposed binary to residues encoder in the vhdl hardware description language. the complete design was implemented on the virtex 6 xc6vcx75t fpga chip using xilinx ise design suite 14.2, while behavioral and post-route simulation of the implemented encoder was performed using the isim simulator. the exact values of the area cost and propagation delay for the asic and the fpga-based implementations of the presented encoder architecture for different values of $n$, along with the comparison to the architectures presented in [15], are shown in tables 1 and 2.

table 1. asic implementation performance comparison

    design                           delay                   area
    encoder [15]                     3n + 5                  34n
    encoder with d-1 channel [15]    max(2n + 12, 3n + 5)    37n + 7
    new encoder                      max(2n + 10, 3n + 4)    35n − 5

the encoder shown in fig. 3 can perform the operation of both encoders presented in [15].
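the delay and area expressions of table 1 are easy to compare for a given n. the following is a minimal sketch that only evaluates those expressions as printed; the function name and the dictionary keys are illustrative, and the units are whatever unit the table assumes.

```python
def asic_estimates(n):
    """Evaluate the table 1 expressions; returns {design: (delay, area)}."""
    return {
        "encoder [15]": (3 * n + 5, 34 * n),
        "encoder with d-1 channel [15]": (max(2 * n + 12, 3 * n + 5), 37 * n + 7),
        "new encoder": (max(2 * n + 10, 3 * n + 4), 35 * n - 5),
    }


for n in (4, 6, 8):
    for design, (delay, area) in asic_estimates(n).items():
        print(n, design, delay, area)
```

for n = 6 the two arguments of the maximum for the new encoder coincide (2n + 10 = 3n + 4 = 22), which is the crossover point mentioned in the text.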
as can be seen from table 1, there is no significant propagation delay improvement if the asic implementation is considered. however, the improvement in area consumption is significant in comparison to the encoder with the modulo $2^n + 1$ residue in the diminished-1 number system representation.

table 2. fpga implementation: propagation delay [ns] / area consumption [slices]

    n     encoder [15]    encoder with d-1 channel [15]    new encoder
    4     4.072 / 28      4.034 / 29                       3.885 / 31
    5     4.962 / 36      4.806 / 38                       4.529 / 41
    6     6.238 / 45      6.099 / 47                       5.370 / 48
    7     7.012 / 52      8.049 / 54                       6.500 / 58
    8     4.463 / 74      4.567 / 77                       4.618 / 77
    10    5.535 / 90      5.535 / 95                       5.563 / 96

functional simulation waveforms of the binary to residues encoder based on moduli set {63, 64, 65} with diminished-1 encoded channel are shown in fig. 4. the length of the input string $x$ is 18 bits and it is divided into three groups of 6 bits. the output residues are 6 bits long for the modulo $2^6 - 1$ and modulo $2^6$ channels, and 7 bits long for the modulo $2^6 + 1$ channel.

fig. 4. functional simulation waveforms of the binary-to-residue encoder based on moduli set {63, 64, 65} with diminished-1 encoded channel.

4 conclusion

in this paper, we investigated the memoryless binary to residues encoder, which is an important issue concerning the utilization of the rns number system in dsp applications. we proposed a new binary to residues encoder for moduli set $\{2^n - 1, 2^n, 2^n + 1\}$ with an embedded diminished-1 channel which can be used instead of the standard modulo $2^n + 1$ channel. the modulo $2^n + 1$ channel and the diminished-1 channel are implemented on the same hardware and the encoder can be used to perform either the modulo $2^n + 1$ or the diminished-1 channel operation. a single switch is used for the channel selection. our approach avoids the initial calculation of $x - 1$ in order to compute the residue modulo $2^n + 1$ for the diminished-1 encoded channel and enables utilization of any modulo binary adder design. the speed of the binary to residues encoder can be increased by using pipelining. the encoder proposed in this paper is applicable to asic and fpga implementations. the obtained results of the design in terms of the number of xilinx fpga logic elements and input-to-output propagation delays are given.

acknowledgement

this work was supported by the serbian ministry of science and technological development, project no. 32009tr.

references

[1] h. l. garner, “the residue number system,” ire trans. electronic computer, vol. ec-8, no. 2, pp. 140–147, jun. 1959.
[2] n. szabo and r. i. tanaka, residue arithmetic and its application to computer technology. new york: mcgraw-hill, 1967.
[3] e. gholami, r. farshidi, m. hosseinzadeh, and k. navi, “high speed residue number system comparison for the moduli set $\{2^n - 1, 2^n, 2^n + 1\}$,” journal of communication and computer, vol. 6, no. 3, pp. 40–46, 2009.
[4] j. low and c.-h. chang, “a new approach to the design of efficient residue generators for arbitrary moduli,” circuits and systems i: regular papers, ieee transactions on, vol. 60, no. 9, pp. 2366–2374, sep. 2013.
[5] w. k. jenkins and b. leon, “the use of residue number systems in the design of finite impulse response digital filters,” ieee trans. on circuits and systems, vol. cas-24, no. 4, pp. 191–201, apr. 1977.
[6] r. chaves and l. sousa, “rdsp: a risc dsp based on residue number system,” in digital system design, 2003. proceedings. euromicro symposium on, belek-antalya, turkey, sep. 1–6, 2003, pp. 128–135.
[7] n. stamenković, digital filter implementation using rns-binary arithmetic, monographies ed. lap lambert academic publishing, 2014.
[8] d.
živaljević, n. stamenković, and v. stojanović, “digital filter implementation based on the rns with diminished-1 encoded channel,” in telecommunications and signal processing (tsp), 35th international conference on, prague, czech republic, jul. 3–4, 2012, pp. 662–666.
[9] n. stamenković, d. živaljević, and v. stojanović, “the use of residue number system in the design of the optimal all-pole iir digital filters,” in telecommunications and signal processing (tsp), 36th international conference on, rome, italy, jul. 2–4, 2013, pp. 722–726.
[10] d. živaljević, n. stamenković, and v. stojanović, “fir filter implementation based on the rns with diminished-1 encoded channel,” international journal of advances in telecommunications, electrotechnics, signals and systems, vol. 2, no. 2, pp. 56–62, 2013.
[11] k.-w. kim and w.-j. lee, “an efficient parallel systolic array for $ab^2$ over $gf(2^n)$,” ieice electronics express, vol. 10, no. 20, pp. 1–6, 2013.
[12] r. capocelli and r. giancarlo, “efficient vlsi networks for converting an integer from binary system to residue number system and vice versa,” circuits and systems, ieee transactions on, vol. 35, no. 11, pp. 1425–1430, 1988.
[13] f. pourbigharaz and h. m. yassine, “modulo-free architecture for binary to residue transformation with respect to $\{2^m - 1, 2^m, 2^m + 1\}$ moduli set,” in circuits and systems, 1994. iscas ’94., 1994 ieee international symposium on, vol. 2, 1994, pp. 317–320.
[14] m.-h. sheu, s.-h. lin, y.-t. chen, and y.-c. chang, “high-speed and reduced-area rns forward converter based on $(2^n - 1, 2^n, 2^n + 1)$ moduli set,” in circuits and systems, 2004. proceedings. the 2004 ieee asia-pacific conference on, vol. 2, 2004, pp. 821–824.
[15] i. krstić, n. stamenković, m. petrović, and v.
stojanović, “binary to rns encoder with modulo $2^n + 1$ channel in diminished-1 number system,” international journal of computational engineering & management (ijcem), vol. 17, no. 4, pp. 1–9, may 2014. [online]. available: www.ijcem.com
[16] l. m. leibowitz, “a simplified binary arithmetic for the fermat number transform,” ieee transactions on acoustics, speech, and signal processing, vol. assp-24, no. 5, pp. 356–359, oct. 1976.
[17] h. vergos and c. efstathiou, “a unifying approach for weighted and diminished-1 modulo $2^n + 1$ addition,” circuits and systems ii: express briefs, ieee transactions on, vol. 55, no. 10, pp. 1041–1045, oct. 2008.
[18] c. efstathiou, i. voyiatzis, and n. sklavos, “on the modulo $2^n + 1$ multiplication for diminished-1 operands,” in signals, circuits and systems, 2008. scs 2008. 2nd international conference on, monastir, tunisia, nov. 7–9, 2008, pp. 1–5.
[19] e. vassalos, d. bakalis, and h. vergos, “reverse converters for rnss with diminished-one encoded channels,” in eurocon, 2013 ieee, jul. 1–4, 2013, pp. 1798–1805.
[20] b. vinnakota and v. v. b. rao, “fast conversion techniques for binary-residue number systems,” ieee trans. on circuits and systems-i: fundamental theories and applications, vol. 41, no. 12, pp. 927–929, dec. 1994.
[21] s. j. piestrak, “design of residue generators and multioperand modular adders using carry-save adders,” ieee transactions on computers, vol. 43, no. 1, pp. 68–77, jan. 1994.
[22] e. vassalos, d. bakalis, and h. vergos, “on the design of modulo $2^n \pm 1$ subtractors and adders/subtractors,” circuits syst signal process, vol. 30, no. 6, pp. 1445–1461, 2011.
[23] c. efstathiou, h. vergos, and d. nikolos, “fast parallel-prefix modulo $2^n + 1$ adders,” computers, ieee transactions on, vol. 53, no. 9, pp. 1211–1216, sep. 2004.
[24] s. v. padmajarani and m. muralidhar, “a new approach to implement parallel prefix adders in an fpga,” international journal of engineering research and applications (ijera), vol. 2, no.
4, pp. 1524–1528, july-august 2012. [online]. available: www.ijera.com

facta universitatis series: electronics and energetics vol. 33, no. 2, june 2020, pp. 217-226
https://doi.org/10.2298/fuee2002217v
© 2020 by university of niš, serbia | creative commons license: cc by-nc-nd

comparative analysis of ml and map detectors for pam constellations in awgn channel

slobodan a. vlajkov, aleksandra ž. jovanović, zoran h. perić
university of niš, faculty of electronic engineering, department of telecommunications, niš, republic of serbia

abstract. in this paper we perform a comparative performance analysis of “maximum a posteriori” (map) and “maximum likelihood” (ml) detectors for a one-dimensional constellation in the additive white gaussian noise (awgn) channel. more precisely, error probabilities per symbol for these detectors are compared for the case when the pulse amplitude modulation (pam) constellation with equidistant and non-equiprobable constellation points is used as the one-dimensional constellation. we perform the analysis for different distributions of the constellation point probabilities and different values of the signal-to-noise ratio (snr). the analysis indicates which detector is an adequate choice for a given distribution of constellation point probabilities and a given snr. besides this, for a straightforward performance assessment of the map detector we derive a formula for the symbol error probability. our analysis also points out that a nonuniform distribution of the constellation point probabilities does not necessarily improve the symbol error probability. with the aim of decreasing the symbol error probability we propose a method for defining constellation point probabilities. the presented results show that a pam constellation designed with the method we propose significantly outperforms the conventional pam constellation in terms of the symbol error probability.
key words: pam constellation, awgn channel, ml detector, map detector, symbol error probability

received april 14, 2019; received in revised form july 1, 2019
corresponding author: slobodan a. vlajkov, faculty of electronic engineering, aleksandra medvedeva 14, 18000 niš, republic of serbia (e-mail: vlajkov.slobodan@gmail.com)

1. introduction

one classical issue in digital communications is the estimation of the symbol error probability after transmission of a digital signal through the additive white gaussian noise (awgn) channel [1-17]. this error probability mainly depends on the decision rule, that is, on the type of the detector. the decision rule that minimizes the probability of decision error is the “maximum a posteriori” (map) decision rule. however, the detector based on the map decision rule requires knowledge of the a priori probabilities of symbols and it also has considerable complexity. for these reasons, simpler detectors are desirable. one such simpler detector is the “maximum likelihood” (ml) detector, which does not take into consideration the a priori probabilities of symbols [1, 2].

although map and ml detection have been studied for a long time, the topic has still not been completely investigated, largely because of the complexity of map detection. thus, a comparison of ml and map detectors for multidimensional constellations was recently performed in [3], indicating that the topic is still relevant. besides this, as there has been an increasing number of studies on constellations with non-equiprobable symbols [3-17], the research we perform here can be meaningful for the detection of these constellations. in this paper, for pam constellations with equidistant and non-equiprobable symbols, we estimate and compare the symbol error probabilities of the map and ml detectors in order to determine which detector is suitable for a given scenario.
also, we derive a formula for the symbol error probability of the map detector. the formula is useful for performance estimation when the signal is modulated with a non-equiprobable and equidistant pam constellation and is transmitted through the awgn channel. the choice of the constellation point probabilities considerably influences the symbol error probability, and an inadequate choice can considerably increase it. this motivates us to develop a method for determining constellation point probabilities that leads to a significantly decreased symbol error probability. we will demonstrate that the pam constellation based on the method we propose outperforms some known pam constellations in terms of the symbol error probability.

in order to present our research in a clear and concise manner, in what follows we first define the one-dimensional constellation. then we focus on the signal reception by considering in detail the ml and map detectors. finally, we propose the method for designing a pam constellation with equidistant and non-equiprobable constellation points.

2. symbol error probability for pam constellation in awgn channel

as already mentioned in the introduction, we investigate the performance of ml and map detectors for pam constellations in the awgn channel. before we define the constellation, we briefly explain the considered scenario. the useful signal sent into the channel is $s(t)$. during transmission through the communication channel the signal is corrupted by the additive white gaussian noise $n(t)$. then, at the channel output, that is, at the detector input, there is the signal $r(t)$, which represents the sum of the useful signal and the noise. based on the sample of the received signal $r = r(kT) = s(kT) + n(kT) = a_i + n$, $k \geq 0$, the detector makes a decision about the transmitted signal.

2.1. pam constellation parameters

we consider an $M$-ary pam constellation with constant distance $d$ between the amplitudes of adjacent constellation points $a_i$, $i = 1, 2, \ldots, M$. we also assume that the constellation is symmetric, which enables us to focus on the positive part of the constellation. following these assumptions, the constellation point amplitude is:

$$a_i = \left(i - \frac{1}{2}\right) d, \quad i = 1, \ldots, \frac{M}{2}. \tag{1}$$

the probability of a constellation point is denoted by $p_i = P(a_i)$. we assume a pam constellation with non-equiprobable symbols, so that

$$p_i \neq \mathrm{const}, \tag{2}$$

under the constraint that the sum of probabilities of all constellation points is equal to 1:

$$2 \sum_{i=1}^{M/2} p_i = 1. \tag{3}$$

finally, we define the average energy per symbol $E_s$ in the following manner [1]:

$$E_s = 2 \sum_{i=1}^{M/2} a_i^2 p_i. \tag{4}$$

after that we formulate expressions for the average energy per bit $E_b$ [1]:

$$E_b = \frac{E_s}{\log_2 M} \tag{5}$$

and the signal-to-noise ratio per bit snr [1]:

$$\mathrm{SNR} = 10 \log_{10} \frac{E_b}{N_0}, \tag{6}$$

where $N_0$ is the power spectral density of the awgn.

2.2. map and ml decision rules

the sample of the received signal is expressed analytically in the following manner:

$$r = a_i + n, \tag{7}$$

where $a_i$ is the sample of the useful signal $s(t)$ and $n$ is the sample of the awgn noise $n(t)$ having a gaussian probability density function with zero mean and variance $\sigma_n^2 = N_0/2$. from (7) it follows that for a given $a_i$ the sample $r$ also has a gaussian probability density function with the same variance as the noise, but with the mean value $a_i$ [1]:

$$p(r \mid a_i) = \frac{1}{\sqrt{\pi N_0}} \exp\left\{-\frac{(r - a_i)^2}{N_0}\right\}, \quad r \in \mathbb{R}. \tag{8}$$

the map rule that minimizes the error probability reads [1, 2]:

$$d(r) = a_i, \ \text{if } P(a_i \mid r) \geq P(a_m \mid r) \ \text{for all } m \neq i. \tag{9}$$

by applying the bayes rule to (9) we obtain a simpler form of the decision rule:

$$p(r \mid a_i)\, p_i \geq p(r \mid a_m)\, p_m \ \text{for all } m \neq i. \tag{10}$$

by using (10), (8) and (1) we derive the decision thresholds of the map detector:

$$m_i = (i - 1) d + \frac{N_0}{2d} \ln \frac{p_{i-1}}{p_i}, \quad i = 2, \ldots, \frac{M}{2}; \qquad m_1 = 0, \quad m_{M/2+1} = \infty. \tag{11}$$

let us now define the ml decision rule [1, 2]:

$$d(r) = a_i, \ \text{if } p(r \mid a_i) \geq p(r \mid a_m) \ \text{for all } m \neq i. \tag{12}$$

by substituting (8) and (1) into (12) we derive the decision thresholds of the ml detector:

$$m_i = (i - 1) d, \quad i = 2, \ldots, \frac{M}{2}; \qquad m_1 = 0, \quad m_{M/2+1} = \infty. \tag{13}$$

by comparing equations (10) and (11) with equations (12) and (13) for $p_i = 1/M$, one can notice the well-known fact that in the case of equiprobable symbols the map and ml decision rules are equivalent. after determining the decision thresholds, the following general formulation of the decision rules can be assumed:

$$d(r) = a_i, \ \text{if } m_i \leq r < m_{i+1}. \tag{14}$$

to stress the difference between the ml and map decision rules, we graphically illustrate the decision thresholds in fig. 1.

fig. 1 the decision thresholds for: 1) map detector: $m_i = a$ for $p_{i-1} < p_i$, $m_i = b$ for $p_{i-1} = p_i$, $m_i = c$ for $p_{i-1} > p_i$; 2) ml detector: $m_i = b$.

2.3. formulas for symbol error probability

during transmission through the channel, the channel noise affects the digital signal. due to that, there is a probability that the transmitted symbol is not accurately detected. the probability that the wrong symbol is detected is called the error probability per symbol ($P_e$). the general form of the symbol error probability of a symmetric pam constellation is [1, 2]:

$$P_e = 2 \sum_{i=1}^{M/2} p_i\, P(e \mid a_i), \tag{15}$$

where $P(e \mid a_i)$ represents the conditional error probability per symbol:

$$P(e \mid a_i) = P\big(r \notin [m_i, m_{i+1}) \mid a_i\big) = 1 - P\big(r \in [m_i, m_{i+1}) \mid a_i\big) = 1 - \int_{m_i}^{m_{i+1}} p(r \mid a_i)\, dr. \tag{16}$$

by substituting (16), (8), (11) and (1) into (15) we derive an analytical expression for the symbol error probability:

$$P_e = p_1 \left[ \mathrm{erfc}\left(\frac{d}{2\sqrt{N_0}}\right) + \mathrm{erfc}\left(\frac{d}{2\sqrt{N_0}} + \frac{\sqrt{N_0}}{2d} \ln\frac{p_1}{p_2}\right) \right] + \sum_{i=2}^{M/2-1} p_i \left[ \mathrm{erfc}\left(\frac{d}{2\sqrt{N_0}} + \frac{\sqrt{N_0}}{2d} \ln\frac{p_i}{p_{i-1}}\right) + \mathrm{erfc}\left(\frac{d}{2\sqrt{N_0}} + \frac{\sqrt{N_0}}{2d} \ln\frac{p_i}{p_{i+1}}\right) \right] + p_{M/2}\, \mathrm{erfc}\left(\frac{d}{2\sqrt{N_0}} + \frac{\sqrt{N_0}}{2d} \ln\frac{p_{M/2}}{p_{M/2-1}}\right), \tag{17}$$

where $\mathrm{erfc}(\cdot)$ is the complementary error function:

$$\mathrm{erfc}(z) = \frac{2}{\sqrt{\pi}} \int_z^{\infty} \exp\{-t^2\}\, dt. \tag{18}$$

one can notice that in the case of ml detection the expression for the symbol error probability takes a much simpler form:

$$P_e = \left(1 - p_{M/2}\right) \mathrm{erfc}\left(\frac{d}{2\sqrt{N_0}}\right). \tag{19}$$

assuming in (19) that the symbols are equiprobable, we derive the known expression for the symbol error probability of the conventional pam constellation [1]:

$$P_e = \frac{M - 1}{M}\, \mathrm{erfc}\left(\frac{d}{2\sqrt{N_0}}\right). \tag{20}$$

3. numerical results and discussion

in order to quantify and compare the performance of the ml and map detectors for the previously defined scenario, for a given number of constellation points $M$ we change the other parameters of the pam constellation and calculate snr and $P_e$. we assume the number of constellation points $M = 8$. we define the probabilities of the constellation points in several ways. specifically, we observe the four cases illustrated in fig. 2. in the first case (fig. 2.a) the probabilities of the constellation points are equal and amount to $p_i = 0.125$ (the uniform distribution). in the case given in fig. 2.b the probability of the constellation point decreases with the increase of the amplitude of the constellation point (decreasing distribution). in the third case we observe the pam constellation whose constellation point probability increases with the constellation amplitude, so we name it the increasing distribution. a graphical representation of the third case is shown in fig. 2.c.
the last case is characterized by the probabilities of constellation points arranged in a "zigzag" form, that is, the probability of the constellation points does not change monotonically with the constellation amplitude. the last case can be seen in fig. 2.d.

fig. 2 distribution of the constellation point probabilities for 8-pam constellation: a) uniform distribution $p_i = [0.125, 0.125, 0.125, 0.125]$; b) decreasing distribution $p_i = [0.21, 0.15, 0.1, 0.04]$; c) increasing distribution $p_i = [0.04, 0.1, 0.15, 0.21]$; d) "zigzag" distribution $p_i = [0.18, 0.07, 0.18, 0.07]$, where $p_i = [p_1, p_2, p_3, p_4]$.

for all presented distributions of the constellation point probabilities and for different values of the snr, the symbol error probabilities of the ml and map detectors are calculated by using (17) and (19) and listed in table 1. in order to better distinguish the difference in the performance of the ml and map detectors, we introduce the relative difference between the symbol error probabilities $\Delta$:

$$\Delta[\%] = \frac{\Delta P_e}{P_e^{MAP}} \cdot 100 = \frac{P_e^{ML} - P_e^{MAP}}{P_e^{MAP}} \cdot 100, \tag{21}$$

where $P_e^{MAP}$ and $P_e^{ML}$ are the error probabilities of the map and ml detector, respectively. fig. 3 shows $\Delta$ as a function of the snr.

by analyzing the results presented in fig. 3 and table 1, we can derive several conclusions. in no case is the error probability per symbol of the map detector higher than that of the ml detector (a known fact). for the pam constellation with equiprobable symbols (see the first case) we confirm that there is no difference between the symbol error probabilities of the ml and map detectors. it should also be noticed that the first case defines the conventional pam constellation [1].
for the second case, that is, when the distribution of constellation probabilities is decreasing, we observe that the relative difference between the symbol error probabilities of the ml and map detectors amounts to about 4.6 % for snr = 0 db and 2.5 % for snr = 20 db. besides that, the symbol error probability of both types of detectors is about three orders of magnitude lower than the corresponding one in the first case. in the third case, when the distribution of probabilities is increasing, the relative difference in symbol error probabilities is somewhat greater and ranges from 7.5 % to 3.2 % for the signal-to-noise ratio per bit ranging from 0 db to 20 db. however, for this kind of probability distribution one should observe that the error probability of both detectors is about three orders of magnitude higher than the error probability of the conventional pam. finally, in the fourth case, with the "zigzag" distribution of the constellation point probabilities, the relative difference in symbol error probabilities is significantly greater than in the other cases. this difference reaches 19.1 % for snr = 0 db and 9.2 % for snr = 20 db. at the same time, the symbol error probability of both detectors is of the same order as the error probability of the conventional pam constellation. from fig. 3 we can also observe that the curves in all cases approach the curve corresponding to the case where there is no difference between map and ml detection. this means that with the increase of snr the relative difference in the symbol error probability decreases.
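the comparison above can be reproduced numerically. the following is a minimal sketch, assuming the relations (1), (4)-(6), (11), (15), (16), (19) and (21) in the form given in this paper; the function names are illustrative, d is normalized to 1 (the result does not depend on d once the snr is fixed), and the map routine deliberately rejects cases in which the thresholds (11) are not increasing, since (16) then does not apply directly.

```python
import math


def amplitudes(M, d=1.0):
    """Positive-side amplitudes a_i = (i - 1/2) d, i = 1..M/2, as in (1)."""
    return [(i - 0.5) * d for i in range(1, M // 2 + 1)]


def noise_density(p, snr_db, d=1.0):
    """N0 that yields the requested SNR per bit, from (4)-(6)."""
    M = 2 * len(p)
    es = 2.0 * sum(a * a * pi for a, pi in zip(amplitudes(M, d), p))
    eb = es / math.log2(M)
    return eb / 10.0 ** (snr_db / 10.0)


def pe_ml(p, snr_db, d=1.0):
    """ML symbol error probability, as in (19)."""
    n0 = noise_density(p, snr_db, d)
    return (1.0 - p[-1]) * math.erfc(d / (2.0 * math.sqrt(n0)))


def pe_map(p, snr_db, d=1.0):
    """MAP symbol error probability from thresholds (11) and (15)-(16)."""
    n0 = noise_density(p, snr_db, d)
    half = len(p)
    # m_1 = 0, m_i = (i-1) d + N0/(2d) ln(p_{i-1}/p_i), m_{M/2+1} = infinity
    m = [0.0]
    for i in range(1, half):
        m.append(i * d + n0 / (2.0 * d) * math.log(p[i - 1] / p[i]))
    m.append(math.inf)
    if any(lo >= hi for lo, hi in zip(m, m[1:])):
        raise ValueError("thresholds (11) are not increasing for this case")

    def tail(x, mean):
        # P(r >= x | mean) for Gaussian noise with variance N0/2
        return 0.5 * math.erfc((x - mean) / math.sqrt(n0))

    pe = 0.0
    for a, pi, lo, hi in zip(amplitudes(2 * half, d), p, m, m[1:]):
        p_correct = tail(lo, a) - tail(hi, a)
        pe += 2.0 * pi * (1.0 - p_correct)
    return pe


def relative_difference(p, snr_db):
    """Relative difference (21) in percent."""
    ml, mp = pe_ml(p, snr_db), pe_map(p, snr_db)
    return 100.0 * (ml - mp) / mp


decreasing = [0.21, 0.15, 0.10, 0.04]
for snr_db in (0, 5, 10):
    print(snr_db, pe_ml(decreasing, snr_db), pe_map(decreasing, snr_db),
          relative_difference(decreasing, snr_db))
```

for the uniform distribution all logarithms in (11) vanish, so the two routines return the same value, which is a convenient sanity check before trying other probability vectors.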
table 1 symbol error probability for ml and map detectors as a function of snr: a) case 1 and case 2, b) case 3 and case 4

    a) pe(snr)     case 1                      case 2
    snr [db]       ml           map            ml             map
    0              0.519        0.519          0.461          0.441
    5              0.299        0.299          0.201          0.194
    10             0.080        0.080          0.025          0.024
    15             0.002        0.002          6.91·10^-5     6.73·10^-5
    20             7.90·10^-9   7.90·10^-9     1.61·10^-12    1.57·10^-12

    b) pe(snr)     case 3                      case 4
    snr [db]       ml           map            ml             map
    0              0.517        0.481          0.519          0.436
    5              0.337        0.320          0.277          0.243
    10             0.124        0.119          0.059          0.054
    15             9.36·10^-3   9.05·10^-3     9.17·10^-4     8.35·10^-4
    20             6.03·10^-6   5.84·10^-6     4.34·10^-9     3.97·10^-9

fig. 3 relative difference in the symbol error probabilities of the ml and map detectors.

as we have already noticed, the pam constellation with the increasing distribution of the constellation point probabilities does not represent a good constellation because its symbol error probability is significantly higher than that of the conventional pam constellation. this points out that a nonuniform distribution of the constellation point probabilities does not necessarily improve the symbol error probability. because of that, the determination of constellation point probabilities with the aim of reducing the symbol error probability is an important task. in this paper we propose that the constellation point probabilities follow the gaussian distribution. namely, we assume that the constellation point probability $p_i$ is equal to the probability that a gaussian variable with zero mean and unit variance belongs to the segment $[a_i - d/2,\ a_i + d/2]$:

$$p_i = \frac{1}{2} \left[ \mathrm{erfc}\left(\frac{(i - 1) d}{\sqrt{2}}\right) - \mathrm{erfc}\left(\frac{i d}{\sqrt{2}}\right) \right], \quad i = 1, \ldots, \frac{M}{2}. \tag{22}$$

for $M = 16$ and a pam constellation with equidistant constellation points whose probabilities are defined by (22), we calculate the symbol error probability by utilizing (17) and tabulate the obtained results in table 2. in table 2 we also tabulate the symbol error probability of the conventional pam constellation.
it is evident that the pam constellation we propose achieves a significantly lower symbol error probability than the conventional pam constellation, and the achieved gain grows with the snr. thus, for snr = 20 db the symbol error probability of our pam constellation is $2.57 \cdot 10^{6}$ times lower than that of its conventional counterpart. furthermore, the considered pam constellation outperforms the best pam constellation in [17], whose symbol error probability amounts to $4.57 \cdot 10^{-6}$ for snr = 20 db. unlike the pam constellation defined by (22), the pam constellation in [17] is characterized by only two different constellation point probabilities, so it is of smaller complexity than the pam constellation proposed in this paper.

table 2 symbol error probability for 16-pam constellation based on (22)

    snr [db]    pe             pe conventional pam
    0           0.515          0.712
    5           0.265          0.549
    10          0.051          0.311
    15          5.36·10^-4     0.079
    20          7.86·10^-10    2.02·10^-3

4. conclusion

in this paper, a comparative analysis of the map and ml detectors has been performed for the pam constellation with equidistant and non-equiprobable symbols. one of the results of the analysis is a formula for the symbol error probability of the map detector. it has also been noticed that in all observed cases, except for the first one, the relative difference in the symbol error probabilities of the ml and map detectors decreases with the increase of the signal-to-noise ratio, so it can be expected to become negligible for higher snr values. this means that in channels with higher values of the signal-to-noise ratio the less complex ml detector can be used. another finding is that when the probability distribution has a "zigzag" form, the choice of the detector is of great importance since it significantly influences the symbol error probability.
in such situations, the map detector is the preferable solution. it has been observed that the increasing distribution of the constellation point probabilities negatively affects the symbol error probability, while the decreasing distribution decreases the symbol error probability. finally, this observation has helped us to define the method for determining constellation point probabilities. the proposed method has led to a symbol error probability of the pam constellation that is significantly decreased in comparison to that of the conventional pam constellation.

acknowledgement: this paper was realized as a part of the projects "development and implementation of next-generation systems, devices and software based on software radio for radio and radar networks" (tr-32051) and "development of dialogue systems for serbian and other south slavic languages" (tr 32035), financed by the ministry of education, science and technological development of the republic of serbia.

references

[1] j. g. proakis, digital communications, mcgraw-hill science, 2001.
[2] p. ivaniš, d. drajić, introduction to information theory and coding (in serbian), akademska misao, 2018.
[3] a. alvarado, e. agrell, f. brannstrom, "asymptotic comparison of ml and map detectors for multidimensional constellations", ieee transactions on information theory, vol. 64, no. 2, pp. 1231–1240, 2018.
[4] m. ivanov, f. brannstrom, a. alvarado, et al., "on the exact ber of bit-wise demodulators for one-dimensional constellations", ieee transactions on communications, vol. 61, no. 4, pp. 1450–1459, 2013.
[5] h. kuai, f. alajaji, g. takahara, "tight error bounds for nonuniform signaling over awgn channels", ieee transactions on information theory, vol. 46, no. 7, pp. 2712–2718, 2000.
[6] l. wei, i. korn, "optimal m-amplitude shift keying/quadrature amplitude shift keying with non-equal symbol probabilities", iet communications, vol. 5, no. 6, pp. 745–752, 2011.
[7] l.
wei, "optimized m-ary orthogonal and bi-orthogonal signaling using coherent receiver with non-equal symbol probabilities", ieee communications letter, vol. 16, no. 6, pp 793–796, 2012. [8] i. korn, j. p. fonseka, s. xing, "optimal binary communication with nonequal probabilities", ieee transactions on communications, vol. 51, no. 9, pp 1435–1438, 2003. [9] v. ipatov, "comments on optimal binary communication with nonequal probabilities", ieee transactions on communications, vol. 55, no. 1, pp. 231, 2007. [10] z. h. perić, "nonlinear transformation of one-dimensional constellation points in order to error probability decreasing", facta universitatis, series: electronics and energetics, vol. 11, no. 3, pp. 291– 299, 1998. [11] z. h. perić, m. s. bogosavljević, "performance of nonuniform pam constellations for gaussian channel", electronics and electrical engineering, vol. 52, no. 3, pp. 27–30, 2004. [12] z. h. perić, n. milošević, a. ž. jovanović, et al., "design of piecewise uniform pam constellation", in proceedings of the xii international saum conference on systems, automatic control and measurements, niš, serbia, november 2014, pp. 109–111. [13] z. h. perić, a. ž. jovanović, s. a. vlajkov, "comparative analysis of various pam constellations", in proceedings of the xiii international saum conference on systems, automatic control and measurements, niš, serbia, november 2016, pp. 27-30, 2016. [14] i. b. djordjevic, a. ž. jovanovic, z. h. peric, t. wang, "optimized vector-quantization-based signal constellation design (ovq-scd) for multidimensional optical transport", in proceedings of thd cleo: science and innovations, san jose, california united states, june 8-13, 2014. [15] i. b. đorđević, a. ž. jovanović, z. h. perić, t. wang, "multidimensional optical transport based on optimized vector-quantization-inspired signal constellation design", ieee transactions on communications, vol. 62, no. 9, pp. 3262–3273, 2014. [16] i. b. đorđević, a. ž. jovanović, m. 
cvijetić, z. h. perić, "multidimensional vector quantization-based signal constellation design enabling beyond 1 pb/s serial optical transport networks", ieee photonics journal, vol. 5, no. 4, 2013. [17] s. a. vlajkov, a. ž. jovanović, z. h. perić, "approach in companding-quantisation-inspired pam constellation design", iet communications, vol. 12, no. 18, pp. 2305–2314, 2018.

facta universitatis series: electronics and energetics vol. 33, no. 1, march 2020, pp. 73-82 https://doi.org/10.2298/fuee2001073k © 2020 by university of niš, serbia | creative commons license: cc by-nc-nd

numerical compact modeling approach of dispersive magnetoelectric media based on scattering parameters

miloš kostić 1, nebojša dončov 2, zoran stanković 2, john paul 3
1 innovation center of advanced technologies, niš, serbia
2 faculty of electronic engineering, university of niš, serbia
3 electromagnetics scientist, nottingham, united kingdom

abstract: a z-tlm based compact modeling approach for dispersive media exhibiting magnetoelectric coupling is presented in this paper. a scattering-parameter based representation of the considered medium is created in the form of a compact model by extracting effective electromagnetic parameters using a retrieval method and implementing them in a non-uniform tlm grid. the proposed approach is illustrated here on the example of dispersive isotropic chiral medium modeling.

key words: dispersive media, compact models, scattering parameters, z-tlm method, retrieval method, non-uniform mesh.

1. introduction
numerical modeling techniques are nowadays important tools in research on complex materials, especially when it is not possible or worthwhile to solve a problem with an analytical approach.
the two most widely used discrete time-domain numerical techniques, the finite difference time domain (fd-td) method [1] and the transmission line matrix (tlm) method [2], are well suited to solving problems of electromagnetic (em) wave propagation through complex structures and media. even though the fd-td method is often favored by researchers, the tlm method offers in some cases a more straightforward approach for describing and modeling different discontinuities, internal boundaries, propagation in dispersive media, etc. this is because in tlm both electric and magnetic field components are solved simultaneously at the center of the tlm cell, without a need for temporal and spatial averaging. modifying and extending the tlm method with z-transform techniques creates a valuable means for efficient time-domain modeling of linear isotropic and anisotropic, bi-isotropic, nonlinear, quantum and chiral materials and metamaterials [3-7]. this so-called z-tlm method supports implementation of the debye, drude, lorentz and other dispersion models, along with specific methods which allow for describing materials with complex frequency dependencies. compact models allow a complex structure, or an artificial or multilayered material, to be represented as one effective material block via scattering parameters, which to some degree simplifies the numerical analysis and modeling process. in addition, compact models can also be used to reduce the computational and time costs of the simulation by using a much coarser mesh for modeling of a thin material panel, where, instead of direct modeling by a fine mesh, the material is replaced with a single interface between two tlm cells [8,9].

received march 21, 2019; received in revised form june 18, 2019. corresponding author: miloš kostić, innovation center of advanced technologies, 18000 niš, serbia (e-mail: r.i.p.romeo@gmail.com)
in this paper, a formulation based on nonlinear constitutive relations and discretization of maxwell's equations, which allows implementation of the most general properties of dispersive and anisotropic materials into the z-tlm non-linear grid, is described. procedures for applying the tlm method to modeling of dispersive and general anisotropic media inside a non-uniform mesh are given in [9-11]. the z-tlm based approach presented in [12,13] is here expanded to allow modeling of dispersive materials with magnetoelectric coupling characteristics while preserving the advantage of including arbitrary frequency dependencies of the modeled material em parameters, i.e. these dependencies do not necessarily have to follow any of the known dispersion models. the effective em parameters used to characterize materials with magnetoelectric coupling are extracted from s parameters through a retrieval procedure [14-16], approximated with the vector fitting (vf) method [17-19], and then used to form a compact model after applying the bilinear z-transform, which is later included in the tlm scattering algorithm. the created model efficiently describes the studied material based on the provided s parameters and enables analysis and observation of em field propagation throughout the medium. the proposed approach is demonstrated by modeling a dispersive isotropic chiral material slab exhibiting magnetoelectric coupling [20].

2. formulation of maxwell's equations for non-uniform mesh
a non-uniform mesh generally enables proper resource handling, especially when modeling materials and structures with nonlinear characteristics. it allows smaller cells to be used in areas which are physically small but of greater em importance, and bigger cells in less complex or less em-important areas of the structure, which improves the overall efficiency of the simulations. in a non-uniform tlm cell (see fig. 1), one or more of the directional space steps (Δx, Δy and Δz) are not equal.
relations between incident v i and reflected v r voltage wave components of the twelve-port non uniform hybrid tlm cell [2] can be presented as: 1 1 2 2 2 1 3 3 10 4 4 9 5 5 6 6 6 5 r i z z r i z z r i x y r i y x r i z y r i z y v v i z v v v i z v v v i z v v v i z v v v i z v v v i z v                   , 7 7 8 8 8 7 9 9 4 10 10 3 11 11 12 12 12 11 r i z x r i z x r i y x r i x y r i y z r i y z v v i z v v v i z v v v i z v v v i z v v v i z v v v i z v                   (1) dispersive media numerical compact modeling method based on scattering parameters 75 where impedance of each link line can be calculated based on link line inductance, time and space steps: / , k k z l i t   (1, 2,...,12)k  (2) fig. 1 non-uniform tlm cell using notations for fields, current and flux densities maxwell’s curl equation can be written as , ef mf h j d t e j b t           (3) after given constitutive relations for electric and magnetic current and flux densities , , , ef mf j j d b , (3) can be written as: 0 0 0 0 ( / ), ( / ) ef e e r mf m r m h j e e e c h t t e j h h c e h t t                                     (4) where sign  denotes time domain convolution, ,e m  and ,e m  are electric and magnetic conductivity and susceptibility matrices respectively, 0, 0 are free space permittivity and permeability, respectively, and ,r r  are dimensionless matrices describing magnetoelectric coupling coefficients given as 1 r rr r r r r rr xyxx xz yx yy yz r zyzx zz c                        , 1 r rr r r r r rr xyxx xz yx yy yz r zyzx zz c                        (5) formulation (4) is extended further with transformations and introduced additional compact notations to create normalized form of maxwell’s equations (6): 76 m. kostić, n. donĉov, z. stanković, j. 
paul 1 1 2 2 2 ( ) , 2 2 2 ( ) t ef b e e r t mf b m m r c i i v v g v a v i t t t c v v i i r i a i v t t t                                                            (6) with 1 , , , b a c   representing background susceptibility matrix, matrix of inverse cell areas, normalized curl matrix and matrix of inverse cell length, respectively [9]. previous form can be further transformed into the traveling wave format by using following relation where vector i v represents sum of appropriate incident wave intensities [9]: 2 2 4 , 2 2 4 tlm i tlm i c i v v v t c v i v i t              (7) this leads to: 1 1 2 4 2 ( ) , 2 4 2 ( ) i t ef e b e r i t mf m b m r v i v g v v a v i t v v i r i i a i v t                                                    (8) if reflected voltage wave r v and reflected free current r i are introduced as in [9], and matrix p is defined: 2 , 2 r i ef r i mf v v i i v v      (9) 1tp a       (10) while 1 b p   , (8) can be written as: 2 4 2 ( ( )), 2 4 2 ( ( )) r e b e r r m b m r v v g v v p v i t i i r i i p i v t                                (11) 3. material modeling based on extracted effective parameters modeling process based on proposed approach (see fig. 2) begins by applying appropriate retrieval method [14-16] on the scattering matrix parameters of considered material, obtained either analytically, numerically or experimentally, in order to acquire effective permittivity, permeability and magnetoelectric coupling coefficients in the frequency range of interest. dispersive media numerical compact modeling method based on scattering parameters 77 in order to calculate effective parameters from s parameters the index of refraction n and characteristic impedance of considered medium zred need to be first determined [15]. 
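the first retrieval step mentioned above, obtaining n and zred from the s parameters, can be illustrated with a minimal sketch. it assumes the standard homogeneous, non-chiral slab relations in the style of [15], a wave that propagates as exp(+i n k0 d), and the lowest branch of the complex logarithm; the function names are hypothetical, and the magnetoelectric coupling term treated in [14-16] is not covered here.

```python
import cmath
import math


def slab_s_parameters(n, z, k0, d):
    """Forward model (hypothetical helper): S11 and S21 of a homogeneous
    slab of thickness d in free space, for index n and relative impedance z."""
    r = (z - 1) / (z + 1)            # reflection coefficient of one interface
    p = cmath.exp(1j * n * k0 * d)   # propagation factor for one pass
    den = 1 - r * r * p * p
    return r * (1 - p * p) / den, p * (1 - r * r) / den


def retrieve_n_z(s11, s21, k0, d, branch=0):
    """Recover n and z from S11 and S21 (assumed non-chiral slab)."""
    # cmath.sqrt returns the root with Re(z) >= 0, i.e. the passive choice.
    z = cmath.sqrt(((1 + s11) ** 2 - s21 ** 2) / ((1 - s11) ** 2 - s21 ** 2))
    p = s21 / (1 - s11 * (z - 1) / (z + 1))   # equals exp(i n k0 d)
    log_p = cmath.log(p)
    # n = -i ln(p) / (k0 d), plus the branch term 2*pi*m / (k0 d)
    n = (log_p.imag + 2 * math.pi * branch - 1j * log_p.real) / (k0 * d)
    return n, z


def effective_eps_mu(n, z):
    """Effective relative permittivity and permeability from n and z."""
    return n / z, n * z
```

for electrically thick slabs the branch index has to be chosen so that n stays continuous over frequency; the sketch leaves that choice to the caller.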
based on calculated n and zred effective parameters of the material can be determined as: red n z   , rednz  , n   (12) after straightforward conversion of effective permittivity, permeability and magnetoelectric coupling coefficients to appropriate effective susceptibilities, (electric–e, magnetic–m and magnetoelectric–r which is related to (5)), they are then further approximated using the vf method [17-19] in order to represent each susceptibility in the form of rational function: [ , , ] 1 [ , , ] [ , , ] [ , , ]0 ( ) e m r e m r i e m r e m r pii c s s s       np (13) in (13) np[e,m,r] stands for the number of poles, s[e,m,r]pi represents the set of complex pole frequencies, and c[e,m,r]i are the pole residues of vf approximated susceptibilities. next, the bilinear z-transform is applied: 1 1 2 1 1 z z s t z             (14) to obtain a discrete-time model: [ , , ] [ , , ] [ , , ] 0 [ , , ] [ , , ] 0 ( ) e m r e m r i e m r i i e m r i e m r i i b z z a z         np np (15) where a[e,m,r]i and b[e,m,r]i are real coefficients and z is time-shift operator. fig. 2 z-tlm compact modelling approach a and b coefficients are used to define compact model which is incorporated into tlm scattering procedure where electric and magnetic fields are calculated as: 78 m. kostić, n. donĉov, z. stanković, j. paul 1 1 1 1 (2 ) ( 2 ) 2 2 r e e r m m re e e rm rm m m re t v z s v i t i z s s v k v s s s i k i s s                                       (16) where 1 0 0 0 (4 4 4 ) e e e r t g        , 1 0 0 0 (4 4 4 ) m m m r t r        , 0 1 (4 4 ) e e e k g     , 0 1 (4 4 ) m m m k r     and 1 1 , , e m re rm s s s s represent additional material accumulator vectors. for simplicity, it is assumed that electric conductivity and magnetic resistivity are not dependent on frequency. 4. 
numerical results
the proposed approach is illustrated here by efficient modeling of a dispersive isotropic chiral material slab of width d = 200 mm placed between two isotropic free-space regions [20]. in this case, due to the isotropic nature of the material slab, the magnetoelectric susceptibilities have the form

\frac{\xi_r}{c} = \frac{1}{c}\begin{bmatrix} \xi_r & 0 & 0 \\ 0 & \xi_r & 0 \\ 0 & 0 & \xi_r \end{bmatrix}, \qquad \xi_r = -\zeta_r \qquad (17)

using the retrieval method, the effective susceptibilities are first extracted from the s parameters obtained analytically in [20] (figs. 3-6, marked with triangles). the retrieved electric, magnetic and magnetoelectric susceptibilities are shown in figs. 7, 8 and 9, respectively (marked with red triangles).

fig. 3 magnitude of scattering parameters s11 and s21
fig. 4 phase of scattering parameters s11 and s21
fig. 5 magnitude of scattering parameters s12 and s22
fig. 6 phase of scattering parameters s12 and s22

the retrieved susceptibilities are approximated with the vf method [17-19] using a fourth-order rational function (np = 4) of the form (13). the obtained a and b coefficients for the electric, magnetic and magnetoelectric susceptibilities are presented in table 1. the accuracy of the approximation is confirmed by the excellent agreement between the retrieved parameter values (red triangles) and the values calculated from the a and b coefficients (solid lines) (see figs. 7-9). the discrete-time models described by (15) are then incorporated into the z-tlm scattering algorithm in (16), taking into account the magnetoelectric coupling of the isotropic chiral slab. a total of 800 tlm cells are used in the x direction, of which the chiral slab itself occupies 200 cells, and the simulation is run for 2000 time steps. the model was excited with an initial z-polarized gaussian pulse propagating along the +x direction. fig.
7 real and imaginary parts of effective electric susceptibility of isotropic chiral medium
fig. 8 real and imaginary parts of effective magnetic susceptibility of isotropic chiral medium
fig. 9 real and imaginary parts of effective magnetoelectric susceptibility of isotropic chiral medium

two simulations are performed: the first without the chiral medium in the mesh, in order to obtain the incident field values, and the second with the chiral medium included in the mesh, in order to obtain the total field at the first interface (free space-chiral slab) and the transmitted field at the second interface (chiral slab-free space). the reflected field was calculated by subtracting the incident field from the total field. the accuracy of the approach is confirmed by comparing the magnitudes and phases of the scattering parameters from [20] with the simulated scattering parameters, shown with solid lines in figs. 3-6.

5. conclusion
in this paper, an approach for compact modeling of dispersive media exhibiting magnetoelectric coupling, described with effective parameters extracted from the scattering matrix, is presented. since there is a wide variety of dispersive media exhibiting magnetoelectric coupling, the compact models presented here provide efficient tools for characterization of, and insight into, em wave propagation through these media. in future research this approach will be used in the design process of devices based on these media, because it allows optimization and fine tuning of their characteristics in the frequency range of interest.
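the step from the pole-residue form (13) to the discrete-time model (15) via the bilinear transform (14), which produces coefficients of the kind listed in table 1, can be sketched as follows. this is a generic illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the susceptibility is taken as a plain sum of terms c_i / (s - s_pi), and the denominator is normalized so that a_0 = 1.

```python
import cmath


def poly_mul(p, q):
    """Multiply two polynomials in z^-1 given as coefficient lists."""
    out = [0j] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def bilinear_pole_residue(poles, residues, dt):
    """Map sum_i c_i / (s - p_i) to b(z) / a(z) using
    s = (2 / dt) * (1 - z^-1) / (1 + z^-1). Returns (b, a) with a[0] = 1."""
    k = 2.0 / dt
    # Each term becomes c (1 + z^-1) / ((k - p) - (k + p) z^-1).
    dens = [[k - p, -(k + p)] for p in poles]
    nums = [[c, c] for c in residues]
    a = [1 + 0j]
    for den in dens:
        a = poly_mul(a, den)
    b = [0j] * len(a)
    for i, num in enumerate(nums):
        term = num
        for j, den in enumerate(dens):
            if j != i:
                term = poly_mul(term, den)
        for m, coeff in enumerate(term):
            b[m] += coeff
    a0 = a[0]
    return [x / a0 for x in b], [x / a0 for x in a]


def eval_discrete(b, a, z):
    """Evaluate b(z) / a(z) at a point z on the unit circle."""
    zi = 1 / z
    num = sum(bk * zi ** k for k, bk in enumerate(b))
    den = sum(ak * zi ** k for k, ak in enumerate(a))
    return num / den
```

with np poles the result has np + 1 coefficients in each of a and b, which matches the five a and five b values per susceptibility for np = 4. for poles and residues that occur in complex-conjugate pairs, the imaginary parts of the coefficients vanish up to rounding.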
table 1 coefficients of discrete-time model in (15)
coefficient    electric (e)      magnetic (m)      magnetoelectric (r)
a0             1                 1                 1
a1             -3.994898e+00     -1.971846e+00     -3.994897e+00
a2             5.985618e+00      -2.760262e-02     5.985617e+00
a3             -3.986539e+00     1.971845e+00      -3.986539e+00
a4             9.958199e-01      -9.723764e-01     9.958197e-01
b0             5.471056e-05      -4.988997e-01     5.224473e-03
b1             2.651185e-08      9.860555e-01      -1.044641e-02
b2             -1.093681e-04     1.174398e-02      -3.898947e-09
b3             2.649748e-08      -9.860555e-01     1.044641e-02
b4             5.471051e-05      4.871557e-01      -5.224469e-03

acknowledgement: this work has been supported by the ministry of education, science and technological development of serbia, project number tr32024. references [1] k.s. kunz and r.j. luebbers, the finite difference time domain method for electromagnetics, crc press, 1993. [2] c. christopoulos, the transmission-line modelling (tlm) method, ieee/oup press, 1995. [3] j. paul, c. christopoulos and d.w.p. thomas, "generalized material models in tlm – part i: materials with frequency-dependent properties", ieee trans. antennas and propagation, vol. 47, no. 10, pp. 1528-1534, 1999. [4] j. paul, c. christopoulos and d.w.p. thomas, "generalized material models in tlm – part 2: materials with anisotropic properties", ieee trans. antennas and propagation, vol. 47, no. 10, pp. 1535-1542, 1999. [5] j. paul, c. christopoulos and d.w.p. thomas, "time-domain modelling of electromagnetic wave propagation in complex materials", electromagnetics, vol. 19, no. 6, pp. 527-546, 1999. [6] j. paul, c. christopoulos and d.w.p. thomas, "generalized material models in tlm – part iii: materials with nonlinear properties", ieee trans. antennas and propagation, vol. 50, no. 7, pp. 997-1004, 2002. [7] j. paul, c. christopoulos, and d.w.p. thomas, "time-domain simulation of electromagnetic wave propagation in two-level dielectrics", international journal of numerical modelling: electronic networks, devices and fields, vol. 22, no. 2, pp. 129-141, 2009. [8] n. dončov, m. kostić, z.
stanković, "compact numerical models for efficient representation of em field propagation through dispersive and anisotropic media", in proceedings of the 5th international conference icetran 2018, palić, serbia, 2018, pp. 582-587. [9] m. kostić, n. donĉov, z. stanković, j. paul, "efficient tlm-based approach for compact modeling of anisotropic materials and composites", applied computational electromagnetics society (aces) journal, vol. 34, no. 1, pp. 1-10, 2019. [10] p. saguet, h. louzani and f. ndagijimana, "the use of z-transform in tlm with non-uniform meshes", in proceedings of the workshop on computational electromagnetics in time-domain, cem-td, atlanta, usa, 2005, pp. 68-71. [11] a. l. farhat, s. le maguer, p. queffelec and m. ney, "tlm extension to electromagnetic field analysis of anisotropic and dispersive media: a unified field equation", ieee transactions on microwave theory and techniques, vol. 60, no. 8, pp. 2339-2351, 2012. [12] t. asenov, m. kostić, n. donĉov and b. milovanović, “z-tlm method simulation of left-handed metamaterials based on retrieved effective parameters”, in proceedings of the 2nd international conference icetran 2015, silver lake, serbia, pp. mti1.7.1-5, 2015. [13] m. kostić, n. donĉov, z. stanković and t. asenov, "3d z-tlm modeling of dispersive lossy metamaterial structures described by scattering parameters", proceedings of the 3nd international conference on electrical, electronic and computing engineering, icetran 2016, zlatibor, serbia, pp. mti2.7.1-4, 2016. [14] f. j. hsieh and w. c. wang, "full extraction methods to retrieve effective refractive index and parameters of a anisotropic metamaterial based on material dispersion models", journal of applied physics, vol. 112, 064907, september 2012. [15] d. r. smith, d. c. vier, th. koschny and c. m. soukoulis, "electromagnetic parameter retrieval from inhomogeneous metamaterials", physical review e, 71, 036617, 2005. [16] v. milosević, b. jokanović and r. 
bojanić, "effective electromagnetic parameters of metamaterial transmission line loaded with asymmetric unit cells", ieee transactions on microwave theory and techniques, vol. 61, no. 8, pp. 2761-2772, aug. 2013. [17] b. gustavsen and a. semlyen, "rational approximation of frequency domain responses by vector fitting", ieee transactions on power delivery, vol. 14, no. 3, pp. 1052-1061, 1999. [18] b. gustavsen, "improving the pole relocating properties of vector fitting", ieee transactions on power delivery, vol. 21, no. 3, pp. 1587-1592, 2006. [19] d. deschrijver, m. mrozowski, t. dhaene, and d. de zutter, "macromodeling of multiport systems using a fast implementation of the vector fitting method", ieee microwave and wireless components letters, vol. 18, no. 6, pp. 383-385, 2008. [20] j. paul, c. christopoulos and d. w. p. thomas, "time-domain modeling of electromagnetic wave propagation in complex materials", electromagnetics, vol. 19, no. 6, pp. 527-546, 1999.

facta universitatis series: electronics and energetics vol. 31, no. 3, september 2018, pp. 367-388 https://doi.org/10.2298/fuee1803367d

nbt stress and radiation related degradation and underlying mechanisms in power vdmosfets

vojkan davidović 1, danijel danković 1, snežana golubović 1, snežana djorić-veljković 2, ivica manić 1, zoran prijić 1, aneta prijić 1, ninoslav stojadinović 1,3, srboljub stanković 4
1 university of niš, faculty of electronic engineering, niš, serbia
2 university of niš, faculty of civil engineering and architecture, niš, serbia
3 serbian academy of sciences and arts, branch niš, niš, serbia
4 university of belgrade, institute of nuclear sciences "vinča", belgrade, serbia

abstract. in this paper we provide an overview of instabilities observed in commercial power vdmosfets subjected to irradiation, to nbt stress, and to consecutive exposure to both.
the results have indicated that irradiation of previously nbt stressed devices leads to additional threshold voltage shift, while nbt stress effects in previously irradiated devices depend on the gate bias applied during irradiation and on the total dose received. this points to the importance of the order of applied stresses, indicating that for proper prediction of device behaviour not only the harsh conditions but also the order of exposure have to be considered. it has also been shown that changes in the densities of oxide trapped charge and interface traps during spontaneous recovery after each of the applied stresses can be significant, thus leading to additional instability even though the threshold voltage seems to remain stable, which points to the need for clarifying the responsible mechanisms.

key words: negative bias temperature instability (nbti), irradiation effects, responsible mechanisms, oxide trapped charge, interface traps, spontaneous recovery

received april 29, 2018. corresponding author: snežana djorić-veljković, university of niš, faculty of civil engineering and architecture, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: snezana.djoric.veljkovic@elfak.ni.ac.rs)

1. introduction
development of the advanced electronic industry is based on combining two concepts: more moore (miniaturization) and more than moore (diversification), i.e. on combining the system-on-chip and system-in-package concepts, thus leading to higher value systems. the second concept includes integration of different devices, such as passives, analog/rf, power devices, sensors and actuators, and biochips. among these, power vertical double-diffused metal oxide semiconductor (vdmos) transistors exhibit a
number of advantages, such as high switching speed, high current driving capabilities, high breakdown voltage, high input impedance and high thermal stability, which make these devices attractive for various application in power control in industrial electronics (automation, robotics), auto industry (automotive electronics), nuclear power plants, communication satellites, military and civil airplane industry, and military equipment (tanks, ships, submarines). in many of these applications devices may be subjected to stress or harsh environment conditions. accordingly, investigation of their reliability and related effects is of high importance [1-11]. because of its superior switching characteristics which enable operation at high frequencies, the power vdmosfet is attractive as a switching device especially in communication satellites that require many extremely small, lightweight power supplies for supplying various components, circuits and systems. namely, high-frequency operation of power supplies enables reduction of their weight and volume through the use of smaller passive components (transformers, choke-coils, and capacitors), so the power vdmosfets are suited for these applications. however, during a communication satellite operation of several years, assembled devices can accumulate the total dose up to 100 gy (sio2), while in high orbits this dose can be even 10 kgy (sio2) [12]. therefore, the most important requirement for power vdmosfets assembled in electronic systems for application in radiation environment is high radiation tolerance. the ionizing irradiation may cause degradation of power vdmosfets electrical parameters, such as threshold voltage shift, reduction of transconductance, increase of leakage currents and reduction of breakdown voltage [13, 14]. 
threshold voltage shift (Δvt) is the most serious problem in these devices since it may cause a change of operation mode from enhancement to depletion in n-channel devices or a dramatic reduction of current driving capability in p-channel ones. threshold voltage shift is known to increase with the total dose received, and many investigations have shown that the main irradiation effects on electrical parameters are caused by the creation of positive gate oxide charge (not) and interface traps (nit) [15]. besides operation in the irradiation environment, in a number of applications devices are routinely operated at high voltage and current levels, which lead to both self-heating and increased gate oxide fields [16]. negative bias temperature instability (nbti) is a phenomenon that is commonly observed in p-channel devices operated in the temperature range 100-250 °c at negative gate voltages producing gate oxide electric fields of 2-6 MV/cm [17-20]. note that the electric fields and temperatures that cause nbti are typically found during device burn-in tests [21, 22]. nbt stress may lead to degradation of important electrical parameters of power vdmosfets. among these, the negative Δvt caused by the increase of not and nit is the most serious reliability problem [23]. note that a more significant negative Δvt is obtained at higher temperatures and/or higher gate voltages, i.e. higher oxide electric fields [24-28]. although the nbti phenomenon has been known for more than half a century, the reliability issues associated with nbti have resurfaced in the past two decades due to the convergence of several factors resulting from device scaling. this is the reason that the vast majority of recent extensive investigations of nbti have focused on the related phenomena in ultrathin gate dielectric layers, and only a few research groups seem to have addressed nbti in thick gate oxides [24, 29, 30].
however, in spite of device dimensions being generally scaled down, there is still high interest in ultra-thick oxides owing to the widespread use of mos technology for the realization of power devices, so the investigation of nbti in vdmosfets is of high interest. it should be emphasised that pmos transistors can be subjected to a single stress, but in numerous applications also to simultaneous or consecutive nbt and irradiation stresses. namely, if p-channel power devices which are exposed to radiation operate at higher temperature or at maximum power, the mechanisms responsible for both radiation effects and nbt instability can be activated. for example, satellite electronic equipment can be exposed to cosmic irradiation for a long time without air convection cooling. the active p-channel mos devices can thus be irradiated and at the same time exposed to nbt stress, while the back-up devices (which do not operate) are only irradiated. irradiation effects and nbt instabilities in power mos devices have been extensively studied, but they have been investigated separately. the effects of elevated temperature on the radiation response have been reported in some studies performed in order to estimate mos device behaviour in a real irradiation environment [31, 32], but the p-channel devices used in those studies were irradiated and/or annealed under positive gate bias. given that devices which operate in real applications can be stressed and recovered under different conditions, and that the final effects depend on the specific application and device mission, in this paper we present the results for consecutively nbt stressed and irradiated p-channel power vdmos transistors. in this way the effects of a specific kind of stress in devices previously subjected to the other kind of stress are investigated.
however, for proper understanding of the effects induced by the applied stresses, it is important to analyze in detail not only the changes in the electrical parameters, but also the mechanisms responsible for the observed effects. clarifying the behaviour and nature of the oxide and interface defects created during and after the stress is very important in order to improve device stability and resistance to the applied stress. that is why this paper is aimed at the analysis of reliability problems in power vdmos transistors caused by nbt stress and radiation, as well as at the related degradation and underlying mechanisms. the most vulnerable parts of vdmos transistors subjected to extreme, harsh environmental conditions or to stress are the parts based on dielectrics (sio2 and the sio2-si interface), as both nbt stress and irradiation of vdmos transistors lead to the creation of oxide and interface defects causing significant degradation of electrical parameters. it is of great interest to know the nature of these defects, which are still in the focus of many investigations aimed at clarifying the responsible mechanisms and improving the ability to predict device behaviour in a specific application.

2. radiation effects
as already mentioned, the threshold voltage shift is undoubtedly the most serious problem for irradiated devices since it may cause a change of operation mode from enhancement to depletion in n-channel devices (thus leading to faulty operation of switching power supplies), or a dramatic reduction of current driving capability in p-channel ones. even radiation-hardened devices may fail due to a reduction in current-drive capability owing to channel carrier mobility degradation and/or positive Δvt [33]. the irradiation effects in mosfets have been extensively investigated by many researchers in the last decades. in our early study we have examined the radiation response of
commercially available n-channel power vdmosfets efl1n10 manufactured by "eimicroelectronics", niš, serbia, which were realized in a standard si-gate technology with hexagonal cell geometry and a gate oxide thickness of 100 nm. gamma irradiation was performed using a co-60 source (dose rate 0.04 gy/s) at room temperature for two groups of devices (without gate bias and with gate bias vg = +9 v applied). drains and sources of all devices were grounded during irradiation. the changes of the threshold voltage (Δvt) and mobility (μ/μ0) during the irradiation of the devices [2] are presented in fig. 1(a), together with the results for similar devices [34] for comparison. the observed threshold voltage shift and mobility reduction were much more pronounced when positive gate bias was applied. similar behaviour of these electrical parameters of power vdmos transistors has also been observed by other investigators and is generally established as typical [14, 15, 33]. the radiation tolerance of power vdmos transistors, a very important requirement, can be determined for the maximum operating positive bias applied, as this is the worst case scenario. as can be seen in fig. 1(a), the threshold voltage shift becomes equal to the threshold voltage (Δvt = vt) at a total dose of about 250 gy (the marked point at which the investigated devices change their operating mode from enhancement to depletion). therefore, the radiation tolerance of the commercial devices used is about 250 gy, which is half the value required for their application in communication satellites with life spans of ten years [2].
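The tolerance criterion described above (the dose at which the magnitude of the threshold voltage shift reaches the initial threshold voltage) can be sketched as a simple interpolation on a measured Δvt versus dose curve. The sketch below is a minimal illustration only: the function name `tolerance_dose`, the choice of interpolating linearly in log10(dose), and the sample numbers are all assumptions made here, not data or procedures taken from the paper.

```python
import math


def tolerance_dose(doses_gy, dvt_v, vt0_v):
    """Dose at which |dVT| first reaches vt0_v.

    Interpolates linearly in log10(dose) between the two samples that
    bracket the crossing; returns None if no crossing is found.
    """
    points = list(zip(doses_gy, dvt_v))
    for (d0, s0), (d1, s1) in zip(points, points[1:]):
        a0, a1 = abs(s0), abs(s1)
        if a0 < vt0_v <= a1:
            frac = (vt0_v - a0) / (a1 - a0)
            log_d = math.log10(d0) + frac * (math.log10(d1) - math.log10(d0))
            return 10.0 ** log_d
    return None


# Hypothetical samples for illustration only (not data from fig. 1).
doses = [10.0, 100.0, 1000.0]   # total dose, Gy
shifts = [-0.5, -2.0, -8.0]     # threshold voltage shift, V
print(tolerance_dose(doses, shifts, vt0_v=3.0))
```

With real measurements, the returned dose would be compared against the mission requirement, as is done in the text for the 250 gy result.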
fig. 1 gamma-irradiation induced (a) Δvt and μ/μ0; (b) Δnot and Δnit in n-channel power vdmosfets (efl1n10). [plots versus dose (gy) for vg = 0 v and vg = 9 v, compared with vg = 10 v data (sakai & yachy); the level Δvt = vt is marked in panel (a).]

considering that the main radiation effects on electrical parameters are caused by the creation of both not and nit, the changes in their densities (Δnot and Δnit) are very often analysed and discussed in the literature [8, 15, 33, 35-37]. in fig. 1(b) Δnot and Δnit in the devices irradiated in our experiment are presented. it should be emphasized that reliability screening is important in achieving high reliability of vdmosfets for application in a radiation environment. screening is normally performed on all devices in order to reduce the possibility of infant mortality. the standard reliability screening for these devices includes "burn-in tests" (us mil-std 883, test method 1015), such as high temperature reverse bias (htrb), high temperature gate bias (htgb) and high temperature storage life (htsl) stresses [38]. however, it was shown that htgb stress affects the radiation response of mos transistors. this was the reason for modifying the standard qualification testing for application of mosfets in a radiation environment and for imposing the requirement of radiation qualification testing after burn-in (us mil-std 883, test method 1019). our results [22, 39, 40], which have shown that burn-in tests could have a significant impact not only on the radiation response of vdmosfets but also on the annealing of radiation defects, have confirmed the need for modification of qualification testing.
in another experiment, commercially available irf510 devices (with a nominal gate oxide thickness of 100 nm, realized in a standard si-gate technology) and efl1n10 devices (from two different batches, a and b) were irradiated using a co-60 source (dose rate 0.13 gy/s) at room temperature with gate bias vg = +10 v applied. the changes of the threshold voltage and mobility during the irradiation of efl1n10 (batch a and b) and irf510 devices [22] are presented in fig. 2 and fig. 3, respectively, while the underlying changes of not and nit are presented in fig. 4. in these figures the results for reference devices and for devices subjected to htrb (vd = 80 v, t = 125 °c for 168 h) and htgb (vg = 20 v, t = 125 °c for 168 h) stresses are compared. it can be seen that Δvt was more pronounced in efl devices, indicating that irf devices were better in terms of radiation tolerance, while there was almost no difference in mobility reduction.

fig. 2 gamma-irradiation induced Δvt in efl-batch a, efl-batch b and irf n-channel power vdmosfets in htrb and htgb stressed and reference devices [40]. [plot: Δvt (v) versus dose (gy), vg = +10 v.]

it can be seen that there was almost no difference between the htrb and htgb stress effects on the radiation response of the investigated devices. results suggesting that the radiation response is almost independent of the device pre-irradiation stress biasing were also obtained for field-oxide mosfets [41]. as can be seen in figs. 2 and 3, Δvt during irradiation was slightly larger in htrb stressed devices, while the mobility reduction was slightly larger in htgb stressed ones. similar behaviour of Δvt was obtained for irradiated field-oxide mosfets [41]. as can be seen from fig. 4, the build-up of oxide trapped charge appeared to be almost independent of the device pre-irradiation stress. on the other hand, the build-up of interface traps was somewhat less pronounced in the stressed devices.
to explain such behaviour of Δnot and Δnit, a chain of mechanisms has been proposed that involves the diffusion of hydrogen related species (originating either from the package interior or from structures adjacent to the gate oxide) from the bulk of the oxide towards the interface.

fig. 3 gamma-irradiation induced μ/μ0 in efl-batch a, efl-batch b and irf n-channel power vdmosfets in htrb and htgb stressed and reference devices [40]. [plot: μ/μ0 versus dose (gy), vg = +10 v.]

in many investigations of the radiation response of p-channel power mosfets, the roles of not and nit were also emphasized. in fig. 5(a) the radiation induced threshold voltage shift (Δvtnh) and degradation of the hole mobility μ/μ0 in non-hardened irf9130, and the threshold voltage shift in radiation hardened frm9130 (Δvtrh) p-channel power mosfets are presented [14].

fig. 4 gamma-irradiation induced Δnot and Δnit in efl-batch a, efl-batch b and irf n-channel power vdmosfets in htrb and htgb stressed and reference devices [40]. [plot: Δnot, Δnit (10^11 cm^-2) versus dose (gy), vg = 10 v.]

also, in fig. 5(a) the contributions of the gate oxide charge (Δvot) and interface traps (Δvit) to Δvtnh (for non-hardened devices) are presented. devices were irradiated at room temperature by a co-60 gamma-ray source (dose rate of 0.2 gy(si)/min), with gates biased at vg = +9 v, while source and drain terminals were grounded. unlike in the non-hardened devices, Δvtrh of the hardened devices is small for total doses below 400 gy and the mobility degradation is less than 4%.
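The voltage contributions Δvot and Δvit are converted to areal densities through the gate-oxide capacitance per unit area, Δn = |Δv| cox/q. The following is a minimal sketch of that conversion for a 100 nm gate oxide, the thickness quoted for the devices in this paper. The relative permittivity of 3.9 for sio2 is a textbook value assumed here, the helper names are hypothetical, and magnitudes are used so that the sign convention of the shift does not matter.

```python
EPS0_F_PER_CM = 8.854e-14   # vacuum permittivity, F/cm
EPS_R_SIO2 = 3.9            # relative permittivity of SiO2 (assumed textbook value)
Q_C = 1.602e-19             # elementary charge, C


def oxide_capacitance(tox_cm):
    """Gate-oxide capacitance per unit area, F/cm^2."""
    return EPS0_F_PER_CM * EPS_R_SIO2 / tox_cm


def areal_density(delta_v, tox_cm):
    """Charge density (cm^-2) corresponding to a voltage shift delta_v (V)."""
    return abs(delta_v) * oxide_capacitance(tox_cm) / Q_C


TOX_CM = 100e-7  # 100 nm expressed in cm

# Illustrative split of a threshold voltage shift into its two components
# (hypothetical numbers, not values read from fig. 5).
d_vot, d_vit = -1.2, -0.3
print(f"cox  = {oxide_capacitance(TOX_CM):.3e} F/cm^2")
print(f"dNot = {areal_density(d_vot, TOX_CM):.3e} cm^-2")
print(f"dNit = {areal_density(d_vit, TOX_CM):.3e} cm^-2")
print(f"dVT  = {d_vot + d_vit:.2f} V")
```

Under these assumptions a 1 v shift corresponds to roughly 2 x 10^11 cm^-2, which is the order of magnitude of the density axes in the figures.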
both gate oxide charge (Δnot = Δvot cox/q) and interface traps (Δnit = Δvit cox/q) are positive, which gives rise to a negative Δvtnh, i.e. Δvtnh = Δvot + Δvit.

fig. 5 gamma-radiation induced (a) Δvtnh and μ/μ0 in non-hardened and Δvtrh in hardened p-channel power mosfet; (b) Δvt and corresponding Δnot and Δnit (in inserted figure) in commercial p-channel power vdmosfets. [panel (a): Δvtnh, Δvot, Δvit, Δvtrh (v) and μ/μ0 versus dose (gy) at vg = 9 v (zupac et al.); panel (b): Δvt (v) and Δnot, Δnit (10^10 cm^-2) versus dose (gy) for vg = 0 v, -10 v and +10 v.]

in fig. 5(b) the radiation induced Δvt and the corresponding build-up of not and nit (in inserted figure) in commercially available irf9520 p-channel power vdmosfets irradiated at different gate biases are presented. although the irradiation conditions and the obtained results will be discussed in detail in sect. 4, it should be mentioned that the significant negative Δvt induced by radiation increases with the total dose received and depends on the gate bias applied.

3. nbt stress effects

nbt stress-induced threshold voltage instabilities in commercial power vdmosfets, as well as the implications of the related degradation for device lifetime, have been extensively investigated in our research in the last decade [27, 42-44]. although in many experiments devices have been subjected to various nbt stress (static or pulsed) and annealing conditions [9, 23, 25, 45, 46, 48-52], in this section only a part of the results obtained during static nbt stress and annealing is presented, with attention to the insight into nbti gained from sequential nbt stress and bias annealing steps. in these investigations commercial p-channel power vdmosfets irf9520 (with current and voltage ratings of 6.8 a and 100 v) were used.
these devices were built in standard silicon-gate technology with a 100 nm thick gate oxide. devices have been stressed for up to 2000 hours by applying negative voltages (-30 to -45 v) to the gate, with drain and source terminals grounded, at temperatures ranging from 125 to 175 °c. important details of the equipment used for stress, annealing and measurement will be described in sect. 4. during nbt stress, Δvt of the investigated p-channel power vdmosfets was more significant at higher stress voltages and/or temperatures [25]. the underlying phenomenon leading to the observed Δvt in the stressed devices is the stress-induced buildup of not and nit. typical time dependencies of the stress induced buildup of not and nit for different stress voltages at a temperature of 150 °c and for different stress temperatures at a stress voltage of -40 v are presented in fig. 6, while the corresponding Δvt are presented in the inserted figures. these figures show the results for devices nbt stressed for 2000 hours. analysis has shown that the Δvt time dependencies follow the t^n power law, but with three different phases (distinguished by the value of the parameter n), which are indicated by the dashed lines in the inserted figures. in the first phase, the parameter n depends on bias as well as on temperature, and varies from 0.4 to 1.14. in the second phase the parameter n is almost independent of bias and temperature, and equals approximately 0.25, as obtained in all earlier nbti investigations [17, 24, 53, 54]. this phase begins earlier in devices stressed at higher voltages and/or temperatures, and it might even be expected that the first phase disappears under more severe stress conditions. in the third (long stress) phase the parameter n again becomes bias and temperature dependent, varying from 0.25 to 0.14.

also, in fig. 6 it can be observed that the buildup of not is more pronounced than that of nit for each specific combination of temperature and stress voltage in all three stress phases. in addition, it can be seen that Δnit increases rapidly in the early phase, but slows down in the second phase and tends to saturate more rapidly than Δnot. it should be emphasized that a strong correlation between the time dependence of Δvt and that of the corresponding Δnot was observed in all cases (all combinations of temperatures and stress voltages). on the other hand, such a correlation between the Δvt and Δnit time dependencies was not observed. this disagreement becomes more noticeable as the nbt stressing advances into the second phase and, especially, further into the third phase. therefore, the time dependence of Δvt in the investigated p-channel power vdmosfets seems to be mostly affected by the nbt stress induced buildup of oxide trapped charge, which does not appear to be consistent with most of the literature data emphasizing the dominant role of stress induced interface traps [17, 24, 53].

in addition, it was shown that the effects of post-stress annealing (at various voltages and/or temperatures, during various time intervals), performed after each phase of nbt stress (1st to 3rd), depend not only on temperature and gate bias conditions, but also on the status of the gate oxide and the sio2-si interface immediately after the stress [46]. namely, the observed effects were affected by the densities of stress-induced not and nit and their spatial and energy distributions, the number of potential trapping sites and the quantities of reacting species available after the stress, the quantity and distribution of new defects possibly created by the preceding stress, etc. besides that, in order to further disclose the effects of post-stress and intermittent annealing on the degradation associated with nbti, another experiment, in which devices were subjected to a five step sequence, was performed.
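The exponent n of the t^n power law discussed above for the time dependence of Δvt is commonly estimated as the slope of a straight-line fit on a log-log plot. The sketch below shows one such least-squares fit; the function name `fit_power_law` and the synthetic data are assumptions introduced for illustration, and this is not necessarily the fitting procedure used by the authors. To reproduce the three phases, the fit would be applied separately to each chosen time window.

```python
import math


def fit_power_law(times_h, dvt_v):
    """Least-squares fit of ln|dVT| = ln(A) + n*ln(t); returns (A, n)."""
    xs = [math.log(t) for t in times_h]
    ys = [math.log(abs(v)) for v in dvt_v]
    count = len(xs)
    x_mean = sum(xs) / count
    y_mean = sum(ys) / count
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    n = sxy / sxx
    a = math.exp(y_mean - n * x_mean)
    return a, n


# Synthetic data generated from |dVT| = 0.02 * t^0.25 (illustration only).
times = [1.0, 10.0, 100.0, 1000.0]            # stress time, h
shifts = [0.02 * t ** 0.25 for t in times]    # |dVT|, V
prefactor, exponent = fit_power_law(times, shifts)
print(f"A = {prefactor:.4f} V, n = {exponent:.3f}")
```

Because the fit is linear in log-log coordinates, a change of slope between consecutive time windows is what marks the transition from one phase to the next.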
in the five-step experiment, commercial p-channel irf9520 and n-channel irf510 power vdmosfets were used. irf510 transistors were also built in standard silicon-gate technology with a 100 nm thick gate oxide. the experiment included three nbt stress steps alternating with two bias annealing steps. namely, one week of nbt stressing with a gate voltage of -40 v at t = 150 °c was followed by one week of annealing without or with gate bias applied, also at 150 °c. after that, nbt stress and annealing were repeated, followed by the final nbt stress. devices were annealed without or with gate bias applied (vg = +10 v or vg = -10 v). it was shown that annealing with negative gate bias applied did not noticeably affect Δnot and Δnit, while annealing performed under zero and positive gate bias removed a portion of the stress induced oxide charge, but created new interface traps in addition to those created during the preceding nbt stress. the observed effects were more pronounced in the case of positive gate bias. therefore, the evolution of Δvt in p- and n-channel power vdmosfets and the corresponding evolution of Δnot and Δnit in p-channel transistors obtained during nbt stress and annealing under positive gate bias are presented in fig. 7 and fig. 8, respectively. in these figures it can be observed that the majority of the changes occurred in the early stage of the annealing steps, as well as of the nbt stresses.
fig. 6 time dependence of Δnot and Δnit (and Δvt in inserted figure) for different: (a) stress voltages at the same temperature (150 °c); (b) stress temperatures for the same stress voltage (vg = -40 v). [plots: Δnot, Δnit (cm^-2) and |Δvt| (v) versus stress time (h) on logarithmic axes; panel (a) for vg = -30, -35, -40 and -45 v; panel (b) for t = 125, 150 and 175 °c.]

in fig. 7 it can be seen that the evolution of Δvt was similar in both types (p- and n-channel) of transistors, and that the overall variations of Δvt over the entire stress and anneal sequence were greater in n-channel ones. besides that, in fig. 7(a) (Δvt in the p-channel transistor) it can be seen that Δvt recovered significantly, but the initial stress-induced Δvt did not fall below 100 mv after either annealing step. during repeated nbt stress the major portion of Δvt induced by the initial nbt stress is also quickly restored. the changes of Δvt tend to decrease on each new repetition of annealing, indicating that there is a non-reversible component of Δvt, which results from the portion of non-annealed stress-induced oxide-trapped charge and interface traps and from newly created interface traps. in fig. 8 it can be seen that the shapes of Δnot mostly follow the shapes of Δvt over the complete sequence. this suggests that charge trapping/detrapping processes occurring in the oxide bulk could be of primary importance for nbti in power vdmosfets. it should be emphasized that although recovery of Δvt during annealing was observed, it does not seem to be a true device recovery, because only Δnot decreases while Δnit simultaneously increases. this increase could be ascribed to a reversed drift direction of positively charged species.
it should be emphasized that, similarly to radiation induced degradation, degradation induced by nbt stressing in power vdmosfets might be associated with gate oxides acting as reservoirs of the hydrogen related species required for both passivation and depassivation processes occurring at the sio2-si interface during and after the stress. accordingly, some elements of the approach applied in the standard model of irradiation damage [15, 23, 55, 56] might be reasonable in considering nbti in power vdmosfets.

fig. 7 evolution of Δvt in power vdmosfets during complete sequence of nbt stressing and positive bias annealing steps in: (a) p-channel and (b) n-channel. [plots: |Δvt| (v) versus stress time and anneal time (h) over the 1st stress, 1st anneal, 2nd stress, 2nd anneal and 3rd stress; stressing at vg = -40 v, annealing at vg = +10 v, t = 150 °c.]

fig. 8 evolution of (a) Δnot and (b) Δnit in p-channel power vdmosfets during complete sequence of nbt stressing and positive bias annealing steps. [plots: Δnot and Δnit (10^10 cm^-2) versus stress time and anneal time (h) for the same sequence and conditions as in fig. 7.]

also, in fig. 8 it can be observed that the changes of Δnot and Δnit tend to decrease during each new repetition, indicating that the non-reversible components of Δnot and Δnit tend to increase.
namely, the repetition of nbt stress after annealing re-created the annealed portion of Δnot, while it removed the reversible component of Δnit. it is interesting that the interface traps created during each annealing are almost completely removed during the following nbt stress. the second and the third nbt stresses actually lead to a decrease of Δnit to a value approximately equal to that after the first stress. in this way Δnit remains at almost the same value as it had after the first nbt stress. besides, it can be noticed that the values of Δnot are significantly higher than those of Δnit after each nbt stress. on the other hand, the values of Δnot after annealing become almost the same as the values of Δnit after nbt stresses, under these experimental conditions. the observed changes could be ascribed to the available oxide trapped charge and interface trap precursors, as well as to the presence of hydrogen species that significantly contribute to the observed Δvt, Δnot and Δnit evolution.

4. consecutive radiation and nbt stress effects

the devices used in the investigation of consecutive irradiation and nbt stress were also the commercial p-channel power vdmosfets irf9520, whose important properties were presented in sect. 3. in this investigation two different experiments were performed: 1) after nbt stress the samples were irradiated (nbt-rad experiment) and 2) after irradiation the samples were nbt stressed (rad-nbt experiment). in the first experiment nbt stress was followed by spontaneous recovery (24 hours), irradiation, another spontaneous recovery (168 hours) and thermal annealing. in the second experiment irradiation was followed by spontaneous recovery (24 hours), nbt stress, another spontaneous recovery (168 hours) and thermal annealing. in both experiments, all stresses and recoveries, as well as the thermal annealing, were done under the same conditions.
during nbt stresses and irradiations the source and drain were grounded. nbt stressing was performed in thermally stable heraeus chambers at 175 °c (168 h) with device gates biased at vg = -45 v. the chosen voltage value of -45 v makes it possible to observe a notable Δvt within a reasonable period of time. namely, stressing these devices with a gate voltage within the range found in the manufacturer's data sheet (maximum gate voltage 20 v) would lead to a small degradation that would become notable only after a long period (thousands of hours) [16]. the chosen voltage value of -45 v exceeds the range of gate voltages allowed for application of the investigated devices, but it is within the range of gate voltages used for nbt stress experiments on these power devices. regarding the choice of temperature, significant device degradation at room temperature can be observed only at stress voltages just a few volts below the gate oxide breakdown voltage (70 v), and can be ascribed to tunnelling effects [57], while at t > 175 °c backward interface reactions can be activated [58]. it should be mentioned that in this study the nbt stressing was limited to 168 h with the aim of shortening the experiment. therefore the combination of bias and temperature, as well as the nbt stress duration, was chosen in order to obtain optimal conditions for this investigation. the irradiation was performed at the department of radiation and environmental protection at the institute for nuclear sciences, vinča, serbia. the devices were gamma irradiated by co-60 (dose rate of the source was 0.5 gy(sio2)/min) up to a total dose of 75 gy (total duration of 150 min). the devices were irradiated without gate voltage applied, and with positive (+10 v) and negative (-10 v) gate voltage applied. the chosen voltage magnitude of 10 v serves to enhance the irradiation effects and to simulate the real case of a biased device under working conditions.
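The stress and irradiation conditions quoted above can be cross-checked with simple arithmetic. The sketch below estimates the nominal oxide field as gate voltage divided by oxide thickness and recomputes the total dose from dose rate and exposure time. The field estimate is a rough assumption made here: it ignores the flat-band voltage and the potential drop in the silicon, and the function names are hypothetical.

```python
def oxide_field_mv_per_cm(vg_v, tox_nm):
    """Nominal oxide field magnitude, MV/cm (1 V/nm equals 10 MV/cm).

    Assumes the whole gate voltage drops across the oxide.
    """
    return abs(vg_v) / tox_nm * 10.0


def total_dose_gy(rate_gy_per_min, minutes):
    """Total dose accumulated at a constant dose rate."""
    return rate_gy_per_min * minutes


TOX_NM = 100.0  # gate oxide thickness quoted for irf9520

for label, vg in [("nbt stress", -45.0),
                  ("irradiation bias", 10.0),
                  ("oxide breakdown", 70.0)]:
    print(f"{label}: {oxide_field_mv_per_cm(vg, TOX_NM):.1f} MV/cm")

print(f"total dose: {total_dose_gy(0.5, 150):.0f} Gy")
```

Under this simplification the nbt stress field is about 4.5 mv/cm, below the roughly 7 mv/cm implied by the quoted 70 v breakdown voltage, and the irradiation time of 150 min at 0.5 gy/min reproduces the stated 75 gy.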
the chosen total dose of 75 gy (relatively low compared to the very high doses that could be reached in devices assembled in satellites) prevents the radiation effects from significantly surpassing and masking the nbt stress effects. in addition, the thermal annealing of all devices (the final phase in both experiments) was performed at t = 175 °c for 168 hours without any bias applied. both spontaneous recoveries were carried out at room temperature, t = 25 °c, also without any bias applied. in order to detect and monitor the degradation during all phases of the experiments, each phase was interrupted after certain predefined periods to measure the device transfer id-vgs characteristics. highly precise source measurement units (smus), a keithley 237 (for drain biasing and drain current measurement) and a keithley 2400 (for sweeping the gate voltage), both controlled by a pc over ieee 488 gpib, were used for electrical characterization of the devices. it should be noted that all measurements were performed at room temperature. figs. 9 and 10 present Δvt in p-channel power vdmosfets during the nbt stress-irradiation and irradiation-nbt stress experiments, respectively. all devices subjected to the initial nbt stress in the nbt stress-irradiation experiment follow the same degradation curve of Δvt. on the other hand, irradiation of virgin devices (in the irradiation-nbt stress experiment) induced a significant negative Δvt, which increased with the total dose received and depended on the gate bias applied. in the case of zero bias, the value of Δvt was the lowest, while at a gate bias of 10 v in magnitude it was significantly more pronounced. at the same time, Δvt was somewhat more pronounced in the case of positive bias (vg = +10 v) than of negative bias (vg = -10 v) [11].
the underlying changes of Δnot and Δnit, determined by the commonly used subthreshold midgap technique [59], during the nbt-rad experiment are presented in fig. 11, while the underlying changes of Δnot and Δnit during the rad-nbt experiment are presented in fig. 12 [60]. it should be mentioned that the microscopic origin of nbti related degradation, as well as of radiation related degradation, has been extensively investigated. namely, the changes of oxide trapped charge and interface traps, which lead to the corresponding threshold voltage shift, can be explained by numerous models of the mechanisms responsible for these changes during nbt and gamma radiation stress, as well as during the annealing of stressed devices. in many models the changes of not and nit are the result of electro-chemical processes that occur in the gate oxide and at the sio2-si interface. these electro-chemical processes and the underlying reactions are based on the charge trap precursors existing in the gate oxide and at the sio2-si interface. some models include reactions at the sio2-si interface involving holes and their transport through the oxide. besides that, there are models based on the transport of hydrogen species ($\mathrm{H^{\bullet}}$, $\mathrm{H^{+}}$, $\mathrm{H_2}$, $\mathrm{OH^{\bullet}}$, $\mathrm{H_2O}$, $\mathrm{H_3O^{+}}$) which can properly explain the results obtained in investigations of nbti and radiation degradation. the presence of hydrogen species is associated with the presence of hydrogen as a common impurity in mos devices. the results of this investigation can also be explained by the mentioned models. the mechanisms responsible for nbt stress induced changes of not and nit are bias dependent and thermally activated [9, 16, 20, 23, 24, 26-29]. interpretations of the mechanisms responsible for degradation very often include various forms of a model based on the assumption that previously passivated defects at the sio2-si interface release hydrogen species, which diffuse into the oxide and leave interface traps behind [17, 19, 53, 61].
in these models dispersive motion of hydrogen species was proposed, based on various assumptions related to trap controlled hydrogen migration in the oxide [62-65]. in many investigations of nbti, it was proposed that interface trap creation could be a reaction controlled rather than a diffusion controlled mechanism [18]. generation of positive charge in the oxide bulk due to hole trapping has been reported in addition to generation of interface traps [18, 62, 63]. although there was a controversy on the role of trapped charge in nbti [18], numerous studies suggested that hole trapping dominantly contributes to degradation [20, 66, 67]. this might lead to the proposal of a new charge trapping model, which connects the nbti degradation with the creation of switching oxide traps and is consistent with recovery data showing dispersion over a wide range of time.

fig. 9 behaviour of Δvt in p-channel power vdmosfets (irf9520) during the nbt stress-irradiation experiment [11]. [plot: |Δvt| (v) versus time over the successive phases: nbt stress (vg = -45 v, t = 175 °c), recovery 1 (t = 25 °c), irradiation (t = 25 °c; vg = +10 v, 0 v, -10 v), recovery 2 (t = 25 °c) and annealing (t = 175 °c).]

fig. 10 behaviour of Δvt in p-channel power vdmosfets (irf9520) during the irradiation-nbt stress experiment [11]. [plot: |Δvt| (v) versus time over the successive phases: irradiation (t = 25 °c; vg = +10 v, 0 v, -10 v), recovery 1 (t = 25 °c), nbt stress (vg = -45 v, t = 175 °c), recovery 2 (t = 25 °c) and annealing (t = 175 °c).]
fig. 11 behaviours of Δnot and Δnit in p-channel power vdmosfets (irf9520) during the nbt stress-irradiation experiment [60]. [plot: Δnot, Δnit (10^10 cm^-2) versus time over the phases nbt stress (vg = -45 v, t = 175 °c), recovery 1, irradiation (vg = +10 v, 0 v, -10 v), recovery 2 (all at t = 25 °c) and annealing (t = 175 °c).]

fig. 12 behaviours of Δnot and Δnit in p-channel power vdmosfets (irf9520) during the irradiation-nbt stress experiment [60]. [plot: Δnot, Δnit (10^10 cm^-2) versus time over the phases irradiation (vg = +10 v, 0 v, -10 v; t = 25 °c), recovery 1 (t = 25 °c), nbt stress (vg = -45 v, t = 175 °c), recovery 2 (t = 25 °c) and annealing (t = 175 °c).]

the results obtained in our investigations of power vdmosfets indicate that the major contribution to nbti in these devices also originates from the oxide trapped charge. the other important feature of nbti in power vdmos devices is the additional generation of interface traps in devices annealed under positive gate bias. it is important to note that our results indicate a strong bias dependence of the processes which occurred during both stress and annealing. this suggests that one or more kinds of charged species could be involved. the holes induced and/or accumulated under the gate oxide have to be among them, as negative gate bias stress resulted in a significant threshold voltage shift. we also believe that hydrogen, as the most common impurity in mos devices, which is widely considered the primary agent of instabilities associated with radiation damage [55, 56], hot carrier injection, and high electric field stress [68, 69], has to be considered in bti as well.
5. underlying mechanisms

during nbt stress, the high electric field at elevated temperature in the presence of holes ($h^{+}$) may cause dissociation of weak si-h bonds at the interface, thus leading to the creation of interface traps and hydrogen atoms:

$$\mathrm{Si_3{\equiv}Si{-}H} + h^{+} \leftrightarrow \mathrm{Si_3{\equiv}Si^{+}} + \mathrm{H^{\bullet}} \tag{1}$$

the released, highly reactive hydrogen atoms ($\mathrm{H^{\bullet}}$) could react with holes from the channel and create hydrogen ions ($\mathrm{H^{+}}$). the holes originate from the channel owing to the applied high negative gate bias of -45 v. the created hydrogen ions may dissociate si-h bonds at the interface, thus creating additional interface traps:

$$\mathrm{Si_3{\equiv}Si{-}H} + \mathrm{H^{+}} \leftrightarrow \mathrm{Si_3{\equiv}Si^{+}} + \mathrm{H_2} \tag{2}$$

alternatively, hydrogen ions could drift away from the interface into the oxide bulk, due to the applied high negative gate bias, and participate in the creation of positive oxide charge:

$$\mathrm{O_3{\equiv}Si{-}H} + \mathrm{H^{+}} \leftrightarrow \mathrm{O_3{\equiv}Si^{+}} + \mathrm{H_2} \tag{3}$$

the buildup of oxide charge under the high negative oxide field can also be explained by hole trapping at oxygen vacancy defects near the interface:

$$\mathrm{O_3{\equiv}Si^{\bullet}\;{}^{\bullet}Si{\equiv}O_3} + h^{+} \rightarrow \mathrm{O_3{\equiv}Si^{+}\;{}^{\bullet}Si{\equiv}O_3} \tag{4}$$

it should be mentioned that oxide-trapped charge and switching traps (interface traps and near-interface oxide traps, so-called "border traps" [70]) are all positive in the case of a p-channel mos transistor and thus contribute to a negative Δvt. in fig. 11 (nbt-rad) it can be observed that during nbt stress the increase of Δnot was more pronounced than that of Δnit and that these values were not notably affected by the subsequent spontaneous recovery at 25 °c, as the temperature was too low to activate any process of relevance for the phenomena under investigation. because of that, the changes of Δvt were not notably affected by the subsequent spontaneous recovery either.
regarding ionizing radiation, the knowledge acquired during many years of microelectronic device testing [15, 71, 72] has been successfully applied in explaining the impact of ionizing radiation on vdmosfets, and an appropriate model of the responsible electrochemical processes was proposed in [2]. the essence of the model is the assumption that weak bonds between silicon and oxygen atoms in the oxide structure (as well as the bonds in the defects between silicon atoms and hydrogen/hydroxyl groups and/or atomic clusters containing hydrogen) and near the oxide-silicon interface would be broken due to irradiation. namely, high energy (mev magnitude) ionizing irradiation breaks not only weak si-h and si-oh bonds in the oxide, but also the regular si-o-si bonds, and generates electron-hole pairs in the gate oxide structure:

$$\mathrm{O_3{\equiv}Si{-}O{-}Si{\equiv}O_3} \xrightarrow{h\nu} \mathrm{O_3{\equiv}Si^{\bullet}} + \mathrm{O_3{\equiv}Si{-}O^{\bullet}} + e^{-} + h^{+} \qquad (5)$$

$$\mathrm{O_3{\equiv}Si{-}H}\ (\mathrm{O_3{\equiv}Si{-}OH}) \xrightarrow{h\nu} \mathrm{O_3{\equiv}Si^{\bullet}} + \mathrm{H^{\bullet}}\ (\mathrm{OH^{\bullet}}) + e^{-} + h^{+} \qquad (6)$$

although some of these pairs recombine, most of the generated electrons quickly escape from the oxide, while most of the holes (which are weakly mobile) get captured in the oxide volume on oxygen vacancy defects O₃≡Si• •Si≡O₃, contributing to the creation of positive oxide trapped charge through a reaction identical to (4). when the gate is positively biased, the electrons are removed almost immediately [35] through the gate, while when the gate is negatively biased, the electrons are removed through the semiconductor. when a higher electric field is applied, more unrecombined holes remain trapped in the oxide, which leads to higher oxide trapped charge. the small difference between the irradiation effects obtained for positive and negative gate bias can be explained by the small difference (due to different surface potential) between the corresponding values of the electric field in the oxide, which affects the removal of electrons.
it should be mentioned that a fraction of the holes may dissociate weak si-h and si-oh bonds and can be trapped again in the oxide, contributing to the increase of oxide trapped charge:

$$h^{+} + \mathrm{O_3{\equiv}Si{-}H}\ (\mathrm{O_3{\equiv}Si{-}OH}) \rightarrow \mathrm{O_3{\equiv}Si^{+}} + \mathrm{H^{\bullet}}\ (\mathrm{OH^{\bullet}}) \qquad (7)$$

also, a fraction of the holes could be trapped at oxide defects, such as oxygen vacancies, likewise contributing to the increase of oxide trapped charge through a reaction identical to (4). as mentioned before, holes can react with hydrogen atoms, forming hydrogen ions. these hydrogen ions could also contribute to the increase of oxide trapped charge [56]. released holes could dissociate the weak Si₃≡Si-H and Si₃≡Si-OH bonds which exist at the interface, creating interface traps (the electron is supplied from the silicon):

$$h^{+} + \mathrm{Si_3{\equiv}Si{-}H}\ (\mathrm{Si_3{\equiv}Si{-}OH}) + e^{-}\,(\text{from silicon}) \rightarrow \mathrm{Si_3{\equiv}Si^{\bullet}} + \mathrm{H^{\bullet}}\ (\mathrm{OH^{\bullet}}) \qquad (8)$$

similarly, hydrogen ions could contribute to the creation of interface traps:

$$\mathrm{H^{+}} + \mathrm{Si_3{\equiv}Si{-}H}\ (\mathrm{Si_3{\equiv}Si{-}OH}) + e^{-}\,(\text{from silicon}) \rightarrow \mathrm{Si_3{\equiv}Si^{\bullet}} + \mathrm{H_2}\ (\mathrm{H_2O}) \qquad (9)$$

in fig. 12 (rad-nbt) it can be seen that the values of Δnot are significantly higher than those of Δnit after irradiation, and that all changes were the smallest in the case of irradiation without gate bias applied. also, it can be seen that both Δnot and Δnit were somewhat more pronounced in the case of a positive gate bias applied. the reason for these differences is found in the electric field dependence of irradiation effects [12, 35]. it should be emphasized that post-radiation spontaneous recovery resulted in a decrease of Δnot and an increase of Δnit (fig. 12), although it seems that Δvt remained stable (fig. 10). in the nbt stress-irradiation experiment (fig. 11), the irradiation applied after nbt stressing produced an additional significant increase of Δnot and Δnit (leading to the additional negative Δvt presented in fig. 9), which were slightly lower than, but almost the same as, those previously observed in irradiated virgin devices during the first step of the irradiation-nbt stress experiment (fig. 12).
this suggests that radiation effects probably were not noticeably affected by nbt stress-induced degradation. such behaviour can be explained by the relatively low temperature (room temperature) and relatively low electric field applied during irradiation, as well as the relatively low total irradiation dose. however, for the effects observed during the nbt stress applied after irradiation (in the irradiation-nbt stress experiment) two mechanisms might be responsible. the first one is activation of the electrochemical reactions contributing to nbti, which leads to additional creation of oxide charge and interface traps, and the second one is annealing of the irradiation-induced oxide charge due to the high temperature (175 °c) applied. in order to compare the obtained values of Δnot and Δnit, fig. 13 presents their behaviours during nbt stress of virgin (fig. 11) and previously irradiated (fig. 12) devices. in devices previously irradiated without gate bias applied, the amount of radiation-induced defects was rather small while the number of available defect precursors remained rather high. therefore, additional defects were created during the nbt stress applied after irradiation. this caused a further increase of the threshold voltage shift. on the other hand, in devices previously irradiated at positive or negative gate bias, the amount of irradiation-induced defects was much higher, and their decrease during the subsequent nbt stress was actually dominant over new defect creation. the decrease of oxide trapped charge and interface traps led to the decrease of the threshold voltage shift.
[figure 13: two panels, a) Δnot and b) Δnit (10^10 cm^-2) versus stress time (h) during nbt stress (vg = −45 v, t = 175 °c); curves: virgin, after irradiation at +10 v, after irradiation at 0 v, after irradiation at −10 v]
fig. 13 comparative presentation of (a) Δnot and (b) Δnit during nbt stress for investigated devices (virgin and previously irradiated).
similarly, in order to compare the obtained values of Δvt, fig. 14 presents the behaviours of Δvt during nbt stress of virgin (fig. 9) and previously irradiated (fig. 10) devices [60]. it can be seen that the difference in Δvt established after irradiation between devices irradiated with positive and negative gate bias applied decreased very fast at the beginning of the nbt stressing step (within about 24 hours) to a level that remained almost unchanged until the end of nbt stress. it should be noted that the second spontaneous recovery generally causes a small decrease of Δvt in the first period in all devices. during the rest of the spontaneous recovery, Δvt remains almost unchanged in the nbt-rad experiment (fig. 9), while it slightly decreases in the rad-nbt experiment (fig. 10). as in the case of the first spontaneous recovery, Δvt also seems to be relatively stable during the second spontaneous recovery in both experiments. despite this, a decrease of Δnot and an increase of Δnit were observed in both experiments (figs. 11 and 12). in figs. 7, 8, 11 and 12 it can be seen that during annealing (last step) Δvt, Δnot and Δnit decrease significantly and that this decrease is more pronounced in devices subjected to the nbt-rad experiment. although the conditions of nbt stress, irradiation and annealing were the same in both experiments, the final values of Δvt, Δnot and Δnit were found to depend on the order of the stress steps, and were generally lower in the rad-nbt experiment.
these values are the result of the two high temperature steps applied after irradiation in the rad-nbt experiment (nbt stress and annealing, both at 175 °c for 168 h), so more defects were annealed. in the nbt-rad experiment, only one thermal annealing step was applied after irradiation, which resulted in higher final values. namely, in the nbt-rad experiment the defects induced by nbt stress and by radiation were subjected to thermal annealing (175 °c) for 168 h, while in the rad-nbt experiment only the nbti defects were subjected to thermal annealing (175 °c) for 168 h, whereas the radiation defects were exposed to high temperature (175 °c) for twice as long. the more pronounced and faster decrease of all values (Δvt, Δnot and Δnit) in the initial period of annealing could be ascribed to the higher amount of defects created in the previous steps. the obtained results undoubtedly point to the importance of the order of applied stresses. during annealing the devices were not biased, and annealing is strongly thermally supported, as observed by comparing the two last steps in the experiments (spontaneous recovery and annealing). the mechanisms during annealing are thermally activated, so the diffusion of neutral species like hydrogen molecules from the areas of high concentration in the oxide toward lower concentrations near the interface could lead to a decrease of Δnot and Δnit. namely, hydrogen molecules could be cracked at charged oxide traps (O₃≡Si⁺ and O₃≡Si⁺ •Si≡O₃), leading to neutralization of positive oxide traps followed by the release of H⁺ ions [55] through the reverse of reaction (3) and:

$$\mathrm{O_3{\equiv}Si^{+}\;{}^{\bullet}Si{\equiv}O_3} + \mathrm{H_2} \rightarrow \mathrm{O_3{\equiv}Si{-}H} + \mathrm{H^{+}} \qquad (10)$$

the decrease of interface traps during the annealing might also be attributed to the hydrogen species (molecule H₂ and highly reactive atom H•) involved in the reactions [20]:

$$\mathrm{Si_3{\equiv}Si^{\bullet}} + \mathrm{H_2} \rightarrow \mathrm{Si_3{\equiv}Si{-}H} + \mathrm{H^{\bullet}} \qquad (11)$$

$$\mathrm{Si_3{\equiv}Si^{\bullet}} + \mathrm{H^{\bullet}} \rightarrow \mathrm{Si_3{\equiv}Si{-}H} \qquad (12)$$

the observed threshold voltage decrease is in agreement with comparable published results [73] (power mos, 105 nm gate oxide, annealed at 175 °c), and also fits the switching-oxide traps model, originally used as the so-called hdl model in interpreting irradiation effects and later nbti phenomena [12, 20, 30, 55, 56].
[figure 14: Δvt (v) versus stress time (h) during nbt stress (vg = −45 v, t = 175 °c); curves: virgin devices, after irradiation at +10 v, after irradiation at 0 v, after irradiation at −10 v]
fig. 14 comparative presentation of Δvt during nbt stress of virgin devices and previously irradiated devices.

6. conclusions

the main features of independent nbti and irradiation effects in p- and n-channel power vdmosfets, as well as of consecutive nbti and irradiation effects in p-channel power vdmosfets, have been reviewed. it was shown that the experimental results of consecutive stresses complement the results of research on independent nbti and irradiation effects. the obtained results were analysed in terms of the underlying mechanisms. this investigation is important for assessing the device behaviour in real working conditions (where devices are simultaneously under negative bias, irradiation and self-heating). it was shown that the radiation induced degradation of previously nbt stressed devices was practically not affected by the previous nbt stress. however, previously irradiated devices with and without gate bias applied have shown different behaviours. devices previously irradiated without gate bias were further degraded by nbt stress, while devices previously irradiated with gate bias were partially recovered by nbt stress, due to the high temperature introduced by nbt stress.
the obtained results undoubtedly point to the importance of the order of applied stresses, indicating that for proper insight into the prediction of device behaviour not only harsh conditions, but also the order of their possible applications have to be considered. acknowledgement this research was supported by the ministry of education, science and technological development of the republic of serbia under the grants no. oi171026 and tr32026. also, the research was supported by serbian academy of sciences and arts (sasa) under the grant no. f-148. the authors would like to thank the staff of department of radiation and environmental protection at the institute for nuclear sciences, vinča, serbia, for providing the facilities for radiation experiments. references [1] n. stojadinović, s. djorić, s. golubović, v. davidović, "separation of the radiation induced gate oxide charge and interface traps effects in power vdmosfets", electron. lett., vol. 30, pp. 1992-1993, 1994. [2] n. stojadinović, s. golubović, s. djorić, s. dimitrijev, "analysis of gamma-irradiation induced degradation mechanisms in power vdmosfets", microelectron. reliab., vol. 35, pp. 587-602, 1995. [3] a. jakšić, m. m. pejović, g. ristić, s. raković, "latent interface-trap generation in commercial power vdmosfets", ieee trans. nucl. sci., vol. 45, pp. 1365-1372, 1998. [4] c. pickard, c. brisset, o. qittard, m. marceau, a. hoffman, f. joffre, j-p. charles, "use of commercial vdmosfets in electronic systems subjected to radiation", ieee trans. nucl. sci., vol. 47, pp. 627-633, 2000. [5] u. schwalke, m. polzl, t. sekinger, m. kerber, "ultra-thick gate oxides: charge generation and its impact on reliability", microelectron. reliab., vol. 41, pp. 1007-1010, 2001. [6] n. stojadinović, i. manić, s. djorić-veljković, v. davidović, s. golubović, s. dimitrijev, "mechanisms of positive gate bias stress induced instabilities in power vdmosfets", microelectron. reliab., vol. 41, pp. 1373-1378, 2001. [7] m.s. 
park, i. na, c.r. wie, "a comparison of ionizing radiation and high field stress effects in n-channel power vertical double-diffused metal-oxide-semiconductor field-effect transistors", j. appl. phys., vol. 97, pp. 014503-1-6, 2005. [8] g. bo, y. xuefeng, r. diyuan, l. gang, w. yiyuan, s. jing, c. jiangwei, "total ionizing dose effects and annealing behavior for domestic vdmos devices", j. semicond., vol. 31, pp. 044007-1-5, 2010. [9] n. stojadinović, d. danković, i. manić, a. prijić, v. davidović, s. djorić-veljković, s. golubović, z. prijić, "threshold voltage instabilities in p-channel power vdmosfets under pulsed nbt stress", microelectron. reliab., vol. 50, pp. 1278-1282, 2010. [10] d. danković, i. manić, a. prijić, s. djorić-veljković, v. davidović, n. stojadinović, z. prijić, s. golubović, "negative bias temperature instability in p-channel power vdmosfets: recoverable versus permanent degradation", semicond. sci. technol., vol. 30, pp. 105009-1-105009-9, 2015. [11] v. davidović, d. danković, a. ilić, i. manić, s. golubović, s. djorić-veljković, z. prijić, n. stojadinović, "nbti and irradiation effects in p-channel power vdmos transistors", ieee trans. nucl. sci., vol. 63, pp. 1268-1275, 2016. [12] d.m. fleetwood, p.s. winokur, p.e. dodd, "an overview of radiation effects on electronics in the space telecommunication environment", microelectron. reliab., vol. 40, pp. 17-26, 2000. [13] k.r. davis, r.d. schrimpf, f.e. cellier, k.f. galloway, d.i. burton, jr. c.f. wheatley, "the effects of ionizing radiation on power-mosfet termination structures", ieee trans. nucl. sci., vol. ns-36, pp. 2104-2109, 1989. [14] d. župac, k.f. galloway, r.d. schrimpf, p. augier, "effects of radiation-induced oxide-trapped charge on inversion-layer hole mobility at 300 and 77 k", appl. phys. lett., vol. 60, pp. 3156-3158, 1992. [15] t.p. ma, p.v.
dressendorfer, ionizing radiation effects in mos devices and circuits, john wiley & sons, new york, 1989. [16] n. stojadinović, d. danković, s. djorić-veljković, v. davidović, i. manić, s. golubović, "negative bias temperature instability mechanisms in p-channel power vdmosfets", microelectron. reliab., vol. 45, pp. 1343-1348, 2005. [17] d.k. schroder, j.a. babcock, "negative bias temperature instability: road to cross in deep submicron silicon semiconductor manufacturing", j. appl. phys., vol. 94, pp. 1-18, 2003. [18] v. huard, m. denais, c. parthasarathy, "nbti degradation: from physical mechanisms to modelling", microelectron. reliab., vol. 46, pp. 1-23, 2006. [19] j.h. stathis, s. zafar, "the negative bias temperature instability in mos devices: a review", microelectron. reliab., vol. 46, pp. 270-286, 2006. [20] t. grasser, b. kaczer, w. gös, h. reisinger, t. aichinger, p. hehenberger, p.-j. wagner, f. schanovsky, j. franco, ph.j. roussel, m. nelhiebel, "recent advances in understanding the bias temperature instability", ieee proc. iedm, 2010, pp. 82-85. [21] n. tošić, b. pešić, n. stojadinović, "high-temperature-reverse-bias testing of power vdmos transistors", microelectron. reliab., vol. 37, pp. 1759-1762, 1997. [22] s. djorić-veljković, i. manić, v. davidović, s. golubović, n. stojadinović, "effects of burn-in stressing on post-irradiation annealing response of power vdmosfets", microelectron. reliab., vol. 43, pp. 1455-1460, 2003. [23] n. stojadinović, i. manić, d. danković, s. djorić-veljković, v. davidović, a. prijić, s. golubović z. prijić, "negative bias temperature instability in thick gate oxides for power mos transistors", pp. 533559, in bias temperature instability for devices and circuits, tibor grasser, editor, springer science publisher, 2014. [24] s. gamerith, m. polzl, "negative bias temperature stress in low voltage p-channel dmos transistors and role of nitrogen", microelectron. reliab., vol. 42, pp. 1439-1443, 2002. [25] n. stojadinović, i. 
manić, v. davidović, d. danković, s. djorić-veljković, s. golubović, s. dimitrijev, "effects of electrical stressing in power vdmosfets", microelectron. reliab., vol. 45, pp. 115-122, 2005. [26] d. danković, i. manić, v. davidović, s. djorić-veljković, s. golubović, n. stojadinović, "negative bias temperature instability in n-channel power vdmosfets", microelectron. reliab., vol. 48, pp. 1313-1317, 2008. [27] i. manić, d. danković, a. prijić, v. davidović, s. djorić-veljković, s. golubović, z. prijić, n. stojadinović, "nbti related degradation and lifetime estimation in p-channel power vdmosfets under the static and pulsed nbt stress conditions", microelectron. reliab., vol. 51, pp. 1540-1543, 2011. [28] a.n. tallarico, p. magnone, g. barletta, a. magri, e. sangiorgi, c. fiegna, "negative bias temperature stress reliability in trench-gated p-channel power mosfets", ieee trans. dev. mater. reliab., vol. 14, pp. 657-663, 2014. [29] s. aresu, w. kanert, r. pufall, m. goroll, "exceptional operative gate voltage induced negative bias temperature instability (nbti) on n-type trench dmos transistors", microelectron. reliab., vol. 47, pp. 1416-1418, 2007. [30] t. grasser, t. aichinger, g. pobegen, h. reisinger, p.j. wagner, j. franco, m. nelhiebel, b. kaczer, "the 'permanent' component of nbti: composition and annealing", in proc. ieee international reliab. phys. symp. 2011, pp. 6a.2.1-6a.2.9. [31] j.r. schwank, f.w. sexton, d.m. fleetwood, r.v. jones, r.s. flores, m.s. rodgers, k.l. hughes, "temperature effects on the radiation response of mos devices", ieee trans. nucl. sci., vol. 35, pp. 1432-1437, 1988. [32] m.r. shaneyfelt, j.r. schwank, d.m. fleetwood, p.s. winokur, "effects of irradiation temperature on mos radiation response", ieee trans. nucl. sci., vol. 45, pp. 1372-1378, 1998. [33] k.f. galloway, r.d.
schrimpf, "mos device degradation due to total dose ionizing radiation in the natural space environment: a review", microelectron. j., vol. 21, pp. 67-81, 1990. [34] t. sakai, t. yachi, "effects of gamma-ray irradiation on thin-gate-oxide vdmosfet characteristics", ieee trans. electron dev., vol. 38, pp. 1510-1515, 1991. [35] t.r. oldham, f.b. mclean, "total ionizing dose effects in mos oxides and devices", ieee trans. nucl. sci., vol. 50, pp. 483-499, 2003. [36] g. ristić, m. pejović, a. jakšić, "comparison between post-irradiation annealing and post-high electric field stress annealing of n-channel power vdmosfet", appl. surf. sci., vol. 220, pp. 181-185, 2003. [37] s. djorić-veljković, i. manić, v. davidović, d. danković, s. golubović, n. stojadinović, "comparison of gamma-radiation and electrical stress influences on oxide and interface defects in power vdmosfets", nucl. technol. radiat. protec., vol. 28, pp. 406-414, 2013. [38] discrete and special technology group, tmos power mosfet, selector guide, cross reference and reliability data, motorola inc. 1985. [39] n. stojadinović, i. manić, s. djorić-veljković, v. davidović, s. golubović, s. dimitrijev, "effects of high electric field and elevated-temperature bias stressing on radiation response in power vdmosfets", microelectron. reliab., vol. 42, pp. 669-677, 2002. [40] n. stojadinović, s. djorić-veljković, i. manić, v. davidović, s. golubović, "effects of burn-in stressing on radiation response of power vdmosfets", microelectron. j., vol. 33, pp. 899-905, 2002. [41] m.r. shaneyfelt, p.s. winokur, d.m. fleetwood, j.r. schwank, r.a.jr. reber, "effects of reliability screens on mos charge trapping", ieee trans. nucl. sci., vol. 43, pp. 865-872, 1996. [42] d. danković, i. manić, s. djorić-veljković, v. davidović, s. golubović, n. stojadinović, "nbt stressinduced degradation and lifetime estimation in p-channel power vdmosfets", microelectron. reliab., vol. 46, pp. 1828-1833, 2006. [43] d. danković, i. manić, s. 
djorić-veljković, v. davidović, s. golubović, n. stojadinović, "implications of negative biastemperature instability in power mos transistors", 19.319 19.342, in micro electronic and mechanical systems, edited by kenichi takahata, in-tech press, boca raton, 2009. [44] d. danković, i. manić, v. davidović, a. prijić, s. djorić-veljković, s. golubović, z. prijić, n. stojadinović, "lifetime estimation in nbt-stressed p-channel power vdmosfets", facta univers.: ser. automat. cont. rob., vol. 11, pp. 15-23, 2012. [45] d. danković, i. manić, v. davidović, s. djorić-veljković, s. golubović, n. stojadinović, "negative bias temperature instabilities in sequentially stressed and annealed p-channel power vdmosfets", microelectron. reliab., vol. 47, pp. 1400-1405, 2007. [46] i. manić, d. danković, s. djorić-veljković, v. davidović, s. golubović, n. stojadinović, "effects of low gate bias annealing in nbt stressed p-channel power vdmosfets", microelectron. reliab., vol. 49, pp. 1003-1007, 2009. [47] a. prijić, d. danković, lj. vračar, i. manić, z. prijić, n. stojadinović, "a method for negative bias temperature instability (nbti) measurements on power vdmos transistors", meas. sci. technol., vol. 23, pp. 085003-1-8, 2012. [48] d. danković, i. manić, a. prijić, v. davidović, s. djorić-veljković, s. golubović, z. prijić, n. stojadinović, "effects of static and pulsed negative bias temperature stressing on lifetime in p-channel power vdmosfets", inform. midem., vol. 43, pp. 58-66, 2013. [49] i. manić, d. danković, a. prijić, z. prijić, n. stojadinović, "measurement of nbti degradation in pchannel power vdmosfets", inform. midem., vol. 44, pp. 280-287, 2014. [50] d. danković, n. stojadinović, z. prijić, i. manić, v. davidović, a. prijić, s. djorić-veljković, s. golubović, "analysis of recoverable and permanent components of threshold voltage shift in nbt stressed p-channel power vdmosfet", chinese phys. b, vol. 24, pp. 106601-1-9, 2015. [51] i. manić, d. danković, v. davidović, a. 
prijić, s. djorić-veljković, s. golubović, z. prijić, n. stojadinović, "effects of pulsed negative bias temperature stressing in p-channel power vdmosfets", facta univers.: ser. electron. energ., vol. 29, pp. 49-60, 2016. [52] d. danković, i. manić, v. davidović, a. prijić, m. marjanovic, a. ilic, z. prijić, n. stojadinović, "on the recoverable and permanent components of nbti in p-channel power vdmosfets", ieee trans. on device mater. reliab., vol. 16, art. no. 7536114, pp. 522-531, 2016. [53] s. ogawa, m. shimaya, n. shiono, "interface-trap generation at ultrathin sio2 (4-6 nm)-si interfaces during negative-bias temperature aging", j. appl. phys., vol. 77, pp. 1137-1148, 1995. [54] a. demesmaeker, a. pergoot, p. de pauw, "bias temperature reliability of p-channel high-voltage devices", microelectron. reliab., vol. 37, pp. 1767-1770, 1997. [55] r.e. stahlbush, a.h. edwards, d.l. griscom, b.j. mrstik, "post-irradiation cracking of h2 and formation of interface states in irradiated metal-oxide-semiconductor field-effect transistors", j. appl. phys., vol. 73, pp. 658-667, 1993. [56] d.m. fleetwood, "effects of hydrogen transport and reactions on microelectronics radiation response and reliability", microelectron. reliab., vol. 42, pp. 523-541, 2002. [57] v. davidović, n. stojadinović, d. danković, s. golubović, i. manić, s. djorić-veljković, s. dimitrijev, "turn around of threshold voltage in gate bias stressed p-channel power vertical double-diffused metal-oxide-semiconductor transistors", jap. j. app. phys., vol. 47, pp. 6272-6276, 2008. [58] k.e. kambour, d.d. nguyen, c. kouhestani, r.a.b. devine, "comparison of nbti and irradiation induced interface states", ieee international integrated reliability workshop, 2013, pp. 157-160. [59] p.j. mcwhorter, p.s. winokur, "simple technique for separating the effects of interface traps and trapped-oxide charge in metal-oxide-semiconductor transistors", app. phys.
lett., vol. 48, pp. 133-135, 1986. [60] v. davidović, d. danković, a. ilić, i. manić, s. golubović, s. djorić-veljković, z. prijić, a. prijić and n. stojadinović, "effects of consecutive irradiation and bias temperature stress in p-channel power vertical double-diffused metal oxide semiconductor transistors", jap. j. app. phys., vol. 57, no. 4, art. no 044101, pp. 044101-1-10, 2018. [61] k.o. jeppson, c.m. svensson, "negative bias stress of mos devices at high electric fields and degradation of mnos devices", j. appl. phys., vol. 48, pp. 2004-2014, 1977. [62] s. zafar, "statistical mechanics based model for negative bias temperature instability induced degradation", j. appl. phys., vol. 97, pp. 103709-1-103709-9, 2005. [63] a.e. islam, k. kufluoglu, d. varghese, s. mahapatra, m.a. alam, "recent issues in negative-bias temperature instability: initial degradation, field dependence of interface trap generation, hole trapping effects, and relaxation", ieee trans. electron. dev., vol. 54, pp. 2143-2154, 2007. [64] t. grasser, w. goes, b. kaczer, "dispersive transport and negative bias temperature instability: boundary conditions, initial conditions, and transport models", ieee trans. on device and mater. reliab., vol. 8, pp. 79-97, 2008. [65] t. grasser, bias temperature instability for devices and circuits, springer, new york, 2014. [66] v. huard, "two independent components modeling for negative bias temperature instability", intl. reliab. phys. symp. anaheim, ca, 2010, pp. 33-42. [67] d.s. ang, z.q. teo, t.j.j. ho, c.m. ng, "reassessing the mechanisms of negative-bias temperature instability by repetitive stress/relaxation experiments", ieee trans. device mater. reliab., vol. 11, pp. 19-34, 2011. [68] e. cartier, "characterization of the hot-electron-induced degradation in thin sio2 gate oxides", microelectron. reliab., vol. 38, pp. 201-211, 1998. [69] d.j. dimaria, j.h.
stathis, "anode hole injection, defect generation, and breakdown in ultrathin silicon dioxide films", j. app. phys., vol. 89, pp. 5015-5024, 2001. [70] d.m. fleetwood, "'border traps' in mos devices", ieee trans. nucl. sci., vol. 39, pp. 269-271, 1992. [71] s. dimitrijev, s. golubović, d. župac, m. pejović, n. stojadinović, "analysis of gamma-radiation induced instability mechanisms of cmos transistors", solid-state electron., vol. 32, pp. 349-353, 1989. [72] s. djorić-veljković, i. manić, v. davidović, d. danković, s. golubović, n. stojadinović, "annealing of radiation-induced defects in burn-in stressed power vdmosfets", nucl. technol. radiat. protec., vol. 26, pp. 18-24, 2011. [73] s. mahapatra, n. goel, s. desai, s. gupta, b. jose, s. mukhopadhyay, k. joshi, a. jain, a.e. islam, m.a. alam, "a comparative study of different physics-based nbti models", ieee trans. electron dev., vol. 60, pp. 901-916, 2013.

facta universitatis series: electronics and energetics vol. 34, no 3, september 2021, pp. 415-433 https://doi.org/10.2298/fuee2103415r © 2021 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper
a high efficiency pneumatic drive system using vane-type semi-rotary actuators
alfred rufer
epfl ecole polytechnique fédérale de lausanne, ch-1015 lausanne, switzerland

abstract. a compressed air driven generator is proposed, where the pneumatic energy is converted into mechanical energy using two vane-type rotational actuators. the use of a second actuator with a higher displacement, in order to produce a thermodynamic expansion, makes it possible to reach a better energetic efficiency in comparison to the classical operation of such actuators. the alternating movement of the angular actuators is transformed into a unidirectional rotational motion with the help of a mechanical motion rectifier. the paper analyses the enhancement of the energetic performance of the system. an experimental set-up is also described.
the performance of the new system is described, and the limits of its realization are discussed on the basis of experimental recordings of the evolution of the pressure in the chambers.
key words: compressed air motor; energy efficiency; adiabatic expansion; compressed air energy storage; pneumatic actuators; motion rectifier

1. introduction

renewable energy sources were developed principally during the last decades of the 20th century and reached their technical and economic maturity in the early years of the 21st. these developments are, however, accompanied by new questions of availability and intermittency related to weather, daily and seasonal cycles. these conditions are the main motivation for the development of energy storage techniques. alongside the classical storage technologies, such as pumped hydro plants, battery energy storage facilities seem to be well adapted solutions for decentralized energy production. recent developments of advanced li-ion batteries and of their production facilities have led to their current high performance in energy and power density, together with favourable economics [1].
received december 11, 2020; received in revised form march 3, 2021 corresponding author: alfred rufer epfl ecole polytechnique fédérale de lausanne, ch-1015 lausanne, switzerland e-mail: alfred.rufer@epfl.ch
beyond questions about the material resources available for a wide expansion and mass production of batteries for electrical systems such as distributed generation or electric mobility, other open questions remain, such as the achievable life cycles and the aging effects of electrochemical energy carriers. finally, the recycling of battery materials is currently the topic of much research, looking for answers to technical and economic challenges.
in such a context, many alternative solutions for energy storage are being investigated, such as technologies based on reversible physics like mechanical or thermodynamic principles. compressed air energy storage (caes) belongs to the list of candidate solutions, using only classical materials and established technology. in addition, and in contrast to electrochemical batteries, these systems can be repaired or refurbished, offering incomparably longer life cycles. additionally, their materials are not problematic for recycling. the development of caes systems includes the development of high-performance compression and expansion machinery and must be in accordance with the elementary rules of thermodynamics. in many different development projects, the focus has been set on isothermal compression and expansion [2], [3], [4] in order to reach the best possible efficiency. elementary conversion systems based on classical pneumatic equipment have also been proposed, but their operating principle has led to poor performance. in this paper, a new solution is analysed for the energy conversion from compressed air to mechanical and electrical power. the proposed system is based on the use of vane-type rotary actuators, in which the oscillating movement of the rotor is transformed into a unidirectional rotational motion using a so-called motion rectifier to drive the electrical generator [5], [6], [7], [8]. the paper describes the enhancement of the energy efficiency of the pneumatic to mechanical transformation, where in addition to the classical displacement work of the actuator an expansion volume is added to the system, making it possible to recover an important part of the primarily injected enthalpy [9]. first, the operation principle of the original system is presented, which is an electric generator driven by one actuator only and operated under constant pressure like all common pneumatic devices (fig. 1a).
the typical variables, such as the pressure and the developed torque, are simulated, and the resulting energetic efficiency is evaluated. then, the modified system with enhanced efficiency is presented, where an added expansion volume allows the internal energy of the pressurized air to be exploited (fig. 1b). its performance and typical variables are evaluated by simulation. finally, a third system is studied in which the same energetic performance as for the previous system can be achieved with one actuator only, the displacement work and the expansion of the air being executed inside the same chambers through a dedicated control of the intake and exhaust valves (fig. 1c).
fig. 1 the three versions of the compressed air driven generator: a) the original system, b) the system with enhanced efficiency, c) the simplified system with the same enhanced efficiency
2. torque and mechanical work produced by the three systems
2.1. the basic concept of the air-powered diver lamp
gianfranco gallino, an ingenious inventor from southern switzerland, has realized and patented an original lamp for divers that uses the breathing-air bottle as its energy reservoir. the conversion of the energy from its pneumatic form into electric power for the lighting bulb is realized by a pneumatic vane-type semi-rotary actuator driving an electric generator. figure 1a) shows the patented system of gallino with its main parts: the high-pressure air bottle, the pneumatic actuator with the motion rectifier, and the electric generator powering the bulb. at the left side of fig.
1a), the air reservoir is connected to a pressure reduction and regulation valve (prv), which reduces the air pressure from the reservoir to the operating pressure of the vane-type angular actuator. this actuator comprises two working chambers, fed alternately with pressurized air through the control valves. the motion rectifier is composed of two driving trains, the first being of the direct non-inverting type, realized with a toothed belt. the second train is an inverting gear. both trains are driven from the input shaft in their respective direction through two free-wheeling devices working in the clockwise and counter-clockwise directions, resulting in a unidirectionally rotating output shaft. this output shaft is connected to the electric generator through a simple multiplying gear in order to reach a sufficiently high generator speed.
2.1.1. the typical curves characterizing the gallino system
the pneumatic actuator of the gallino system has two chambers fed alternately with air under pressure. the volumes of the chambers vary linearly with the angular position of the vane rotor. in the chosen example, the 270° excursion of the rotor lasts 0.5 s for each of the clockwise and counter-clockwise motions, leading to a period of 1 s. the complementary evolution of both chambers is represented in fig. 2. the maximal value of the volume is equal to 118 cm3.
fig. 2 evolution of the volumes of the chambers of the rotary actuator
the actuator is fed by air at a pressure of 10 bar, generating torque contributions on the two sides of the vane. fig. 3 shows a simple model simulation of the two torque components, each composed of the action of the feeding pressure during the first (respectively the second) half period, and the action of the atmospheric pressure during the second (respectively the first) half period when the exhaust valves are open.
in this simulation, the dynamic response of the air flow into the chambers is not considered; in reality, the cross-section and length of the tubing, as well as the size of the control valves, have an influence, as was observed on the experimental set-up (see section 5).
fig. 3 torque contributions of the left and right sides of the rotor vane
fig. 4 resultant torque of the actuator
the evolution of the resulting torque of the actuator according to the simple model is represented in fig. 4 and shows the alternation of its sign depending on the rotor’s motion. in this simulation, the rotor’s inertia is neglected, and the reversal of the motion is supposed to be infinitely short. the real motion of the rotor will be considered in the simulation of the third system in section 4. at the output side of the motion rectifier, the torque is applied to the generator in one direction only. the “rectified” torque of the first system (gallino) is represented in fig. 5.
fig. 5 torque at the output side of the motion rectifier
fig. 6 torque developed with the system with an additional expansion volume. the figure shows adiabatic (blue and yellow) and isothermal (violet and light blue) expansions together with their averages.
fig. 6 anticipates, for comparison, the result of the simulation of the second system with an additional expansion device as represented in fig. 1b). it is clearly visible that the mechanical performance of this second system is nearly doubled in comparison with the first system.
2.2. the benefit of adding an expansion chamber
instead of simply opening an exhaust valve and releasing the pressurized air to the atmosphere at the end of the stroke, this air can be expanded in a supplementary actuator of the same type but with a larger volume, running mechanically in parallel to the first one. such a system is represented in fig. 1b).
the alternating movement of the coupled rotors is transmitted through the motion rectifier in the same way as in the first described system. the two vane-type rotary actuators have two active chambers each, corresponding to the volumes v1a and v1b, respectively v2a and v2b (fig. 1b). the chambers v1a and v1b are fed alternately with the input compressed air, and they produce torque contributions alternately according to the clockwise and anti-clockwise motions. the chambers v2a and v2b are fed from the exhaust air of the chambers of the first actuator. the volume v2a receives the exhaust air of the v1b chamber during the clockwise motion, and the volume v2b receives that of the v1a chamber during the anti-clockwise motion. because the chambers of the first and second actuators have different volumes, the air transfer from the chambers of the first actuator to those of the second corresponds to a real expansion of the transferred air, which allows a significant part of the internal energy of the pressurized fluid to be recovered. in the studied example, the volume ratio of the two actuators is chosen as v2/v1 = 3. the evolution of the volumes of the chambers of the actuators is indicated in fig. 7a and 7b. fig. 7a shows the volumes of the two complementary chambers v1a and v2b during the two half-periods (0 to 0.5 s clockwise and 0.5 to 1 s anti-clockwise). fig. 7b shows the evolution of the two other chambers (v1b and v2a) for the same cycle. in addition, the yellow curve represents the equivalent expansion volume of the interconnected chambers v1b+v2a. during the first half-cycle, this last evolution corresponds to a real expansion, but in the second half-cycle, the volume decrease does not make any significant contribution to the torque, the exhaust valve being open.
fig. 7 evolution of the volumes of the two actuators.
a) volumes v1a and v2b, b) volumes v1b, v2a and v1b+v2a
the added actuator is of the type crb1-100 and has a chamber volume of 376 cm3. it develops a nominal torque of 75 nm at 10 bar. its radius r is 0.038 m, and the surface of the wing is 21 cm2.
2.2.1. the pressure variation during the expansion
the expansion of the air is supposed to be of the adiabatic type. as long as the clockwise and anti-clockwise motions are not completed, the resulting pressure p2 in the chambers v2a (first half-cycle) and v2b (second half-cycle), and in the respective feeding chambers v1b and v1a, takes the value
$$ p_2 = p_{in}\left(\frac{V_{1max}}{V_{1a,b} + V_{2b,a}}\right)^{\gamma}, \quad \gamma = 1.4 \qquad (1) $$
at the end of the two motions, the volume ratio is equal to 1/3, and the corresponding pressure ratio becomes p2/pin = 0.21 according to rel. (1). in fig. 8, the pressure p2 in the expansion volume v1b+v2a is represented (blue curve), together with the pressure in the inlet chamber of the first actuator (pin) (red). while pin does not change during the motion of the wing, the pressure p2 decreases according to rel. (1).
fig. 8 pressures pin (red) and p2 (blue) during the first half of the cycle (clockwise motion)
if the cascaded actuators are fed from a lower pressure, the expansion can lead to a pressure below atmospheric pressure, producing a negative torque value. in order to avoid this phenomenon, an anti-return valve is connected to the exchange lines between the two actuators [9].
2.2.2. from the pressure to the torque
figure 9 shows the evolution of the torque contributions of the first actuator during the first stroke (clockwise motion) occurring in the first half-period. these torque contributions are related to both sides of the wing. the v1a-side surface s1a is multiplied by the pressure pin to obtain the acting force.
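as a numerical aside, rel. (1) can be evaluated along one stroke with a few lines of python. this is only a sketch under stated assumptions: the feeding chamber is taken to shrink linearly from 118 cm3 to zero while the receiving chamber grows linearly to three times that volume (the ratio v2/v1 = 3 of the studied example), and the anti-return valve is idealized as a floor at atmospheric pressure. the function names are hypothetical and do not come from the original simulation.

```python
GAMMA = 1.4  # adiabatic exponent of air, as used in rel. (1)


def expansion_pressure(p_in_bar, v1_max_cm3, v_connected_cm3, gamma=GAMMA):
    """Pressure p2 in the interconnected chambers according to rel. (1)."""
    return p_in_bar * (v1_max_cm3 / v_connected_cm3) ** gamma


def p2_along_stroke(p_in_bar, v1_max_cm3=118.0, ratio=3.0, steps=5,
                    p_atm_bar=1.0):
    """Return (stroke fraction, p2 in bar) pairs for one stroke.

    Assumption of this sketch: the feeding chamber shrinks linearly from
    v1_max to 0 while the receiving chamber grows from 0 to ratio * v1_max.
    The anti-return valve is idealized as a floor at atmospheric pressure.
    """
    result = []
    for i in range(steps + 1):
        x = i / steps
        v_connected = v1_max_cm3 * (1.0 - x) + ratio * v1_max_cm3 * x
        p2 = expansion_pressure(p_in_bar, v1_max_cm3, v_connected)
        result.append((x, max(p2, p_atm_bar)))
    return result


if __name__ == "__main__":
    for x, p2 in p2_along_stroke(10.0):
        print(f"stroke fraction {x:.1f}: p2 = {p2:.2f} bar")
```

with a 10 bar supply the end-of-stroke pressure stays above atmospheric, whereas a lower supply pressure (for example 4 bar) would fall below 1 bar without the floor, which is the situation the anti-return valve is meant to prevent.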
further, this force is multiplied by the corresponding radius r1 to get the m1a torque component. the effective surface and related radius are given in table 1. this contribution is represented in red in fig. 9. the v1b-side surface s1b is multiplied by the pressure p2 and further by the same radius r1 to obtain the m1b torque component (blue curve in fig. 9). m1 is obtained by subtraction of both components. this torque is represented in yellow in fig. 9.
fig. 9 torque contributions of the first actuator
fig. 10 torque contributions of the two actuators and total torque during the first half-period
the contribution of the second actuator in the first stroke (clockwise motion) is represented in red in fig. 10 (m2). this torque results from the v2a-side contribution (effect of p2, p2 being the absolute pressure) and the v2b-side contribution, where the pressure is equal to the atmospheric pressure. on the same figure, the torque contribution m1 of the first actuator is represented again (blue curve), together with the total torque mtot during the first half-period (yellow curve). finally, fig. 11 shows the total torque during one complete period. the torque is positive throughout, which indicates that it is taken at the output of the motion rectifier. the average value of the torque is also represented. further, the torque is represented for an expansion occurring in isothermal conditions.
fig. 11 torque developed by the two coupled actuators. the figure shows adiabatic (blue and yellow) and isothermal (violet and light blue) expansions together with their averages. x: time [s], y: torques [nm]
fig. 12 torque developed by the large actuator alone (fig. 1c) and with a specific control of the inlet valve. x: time [s], y: torques [nm]
2.3. displacement and expansion work in one single actuator
2.3.1. basic principle
with the third system represented in fig.
1c) the displacement work and the expansion work are realized in the same device. in the single rotating actuator, the displacement work corresponds to an intake phase under constant pressure, for a volume between zero and a given volume v1. after this first phase, the intake valve is closed and the air under pressure is expanded while the vane continues its motion up to the end position. the expanded air then occupies the total volume of the chamber v2. to reach the same energetic performance as for the system with two cascaded actuators, the full volume of the single actuator (v2) must have the same value as the volume of the larger actuator of the cascade. the volume v1 of the intake phase is equal to the volume of the small device of the cascade. for the displacement work at constant pressure and for the expansion work in the same actuator, two rotation angles are defined for each chamber, for example for the a-side of the rotating vane, namely the intake angle _int_a and the expansion angle _exp_a, as indicated in fig. 13. these angles are obtained from the angular sensors shown in fig. 14.
fig. 13 intake and expansion angles
the intake phase at constant pressure begins at the origin of the stroke at the a-side of the actuator. the intake and the related work at constant pressure then last over the defined angle _int_a. during this phase, the intake valve at this side (a) is open, and the exhaust valve of the opposite side (b) is also open. for the double-acting rotary actuator, four valves are needed, two for the intakes and two for the exhausts. the principle of adding a non-return valve for each chamber in order to avoid a decrease of the pressure below the atmospheric level remains. the different valves are represented in fig. 14. after the intake, an expansion phase lasts for the remaining angle of the stroke _exp_a. during this phase, the intake valve is closed in order to initiate the expansion.
the exhaust valve at the opposite side is kept open up to the end of the stroke.
fig. 14 the complete system of one rotary actuator with displacement and expansion work in the same device
the torque developed by the actuator is composed of two different segments, namely a first constant torque segment corresponding to the intake at constant pressure, followed by a second segment where the torque decreases proportionally to the pressure during the expansion angle. the torque curves are given in fig. 12. two different expansion conditions are represented, namely an adiabatic expansion and an isothermal expansion.
3. comparison of the efficiencies
3.1. basics
in a classical pneumatic actuator like a cylinder or an angular actuator, the mechanical work is obtained from the displacement of the piston or the vane under constant pressure p2 through a volume v (w2, red in fig. 15). p2 is generally maintained constant by a pressure release valve. at the end of the stroke, the pressure in the fully deployed cylinder is released to the atmosphere by opening the exhaust valve, allowing the free return of the piston. this amounts to giving up the pneumatic energy content inside the cylinder. the pneumatic energy content of the deployed cylinder can be illustrated by the w2d surface (green, right side of v2) in the diagram of fig. 15. the maximal value of this energy is obtained under isothermal conditions of the expansion and can be calculated through rel. (2):
$$ W_{2d} = p_2 V_2 \left( \ln\frac{p_2}{p_a} - 1 + \frac{p_a}{p_2} \right) \qquad (2) $$
this value will be considered as the internal energy u of the injected air for the calculation of the efficiency in the next paragraph.
fig. 15 pv diagram of the air in an actuator
3.2.
the efficiency of the single actuator system
the efficiency of the pneumatic motor with a single actuator is calculated on the basis of the produced mechanical work and the enthalpy injected into the system:
$$ \eta_{conv} = \frac{W_{out}}{W_{in}} = \frac{W_{out}}{U + p_{in} V_{in}} \qquad (3) $$
u is the thermodynamic content of the injected air under pressure and is calculated according to rel. (2). numerically, and considering the two half-cycles, u becomes
$$ U = E_{comp} = 10 \cdot 10^{5}\,\mathrm{N/m^2} \cdot 2 \cdot 0.000118\,\mathrm{m^3} \cdot \left( \ln\frac{10\,\mathrm{bar}}{1\,\mathrm{bar}} - 1 + \frac{1\,\mathrm{bar}}{10\,\mathrm{bar}} \right) = 331\,\mathrm{J} \qquad (4) $$
$$ p_{in} V_{in} = 10 \cdot 10^{5}\,\mathrm{N/m^2} \cdot 0.000236\,\mathrm{m^3} = 236\,\mathrm{J} $$
in this single actuator, the torque produced corresponds to
$$ M_1 = M_{1th} - M_{1atm} = 25.05\,\mathrm{Nm} - 2.4\,\mathrm{Nm} = 22.65\,\mathrm{Nm} \qquad (5) $$
then, the angular velocity of the rectified motion is calculated. during the 1 s period of the pneumatic motor, an angle of 270° is covered in both directions. after the motion rectification, the output angle is equal to
$$ \theta_{out} = 2\,\theta_{actuator} = 2 \cdot 270^{\circ} = 540^{\circ} \qquad (6) $$
the resulting angular velocity becomes
$$ \omega_{out} = \frac{540^{\circ}}{360^{\circ}} \cdot 2\pi \cdot \frac{1}{1\,\mathrm{s}} = 1.5 \cdot 2\pi\,\mathrm{rad/s} = 9.42\,\mathrm{rad/s} \qquad (7) $$
the corresponding efficiency over the 1 s cycle becomes
$$ \eta_{conv\_single} = \frac{W_{out}}{U + p_{in} V_{in}} = \frac{M_1\,\omega_{out} \cdot 1\,\mathrm{s}}{U + p_{in} V_{in}} = \frac{22.65\,\mathrm{Nm} \cdot 9.42\,\mathrm{rad}}{331\,\mathrm{J} + 236\,\mathrm{J}} = 0.376 $$
3.3. the efficiency of the motor with two cascaded actuators
for the evaluation of the efficiency of the system with two cascaded actuators, the average value of the developed torque is considered. from the simulation, and as represented in fig. 11, the average torque value is equal to 40.5 nm. the pressure and the intake volume of air of this system are identical to those of the previous one. this means that the input enthalpy is identical. consequently, and taking into account that the angular velocity is the same, the efficiency becomes
$$ \eta_{conv} = \frac{W_{out}}{U + p_{in} V_{in}} = \frac{40.5\,\mathrm{Nm} \cdot 9.42\,\mathrm{rad}}{331\,\mathrm{J} + 236\,\mathrm{J}} = 0.672 $$
for an adiabatic expansion. under isothermal expansion conditions, the average torque is equal to 46.3 nm (fig. 11).
the efficiency becomes here
$$ \eta_{conv} = \frac{W_{out}}{U + p_{in} V_{in}} = \frac{46.3\,\mathrm{Nm} \cdot 9.42\,\mathrm{rad}}{331\,\mathrm{J} + 236\,\mathrm{J}} = 0.769 $$
3.4. considering the friction in the actuators
to get a more realistic value of the conversion efficiency, the friction torque of the pneumatic actuators must be introduced. the identification of the friction torque has been described in [10]. according to these results, the value of the coulomb friction torque alone is considered. for a general case, the friction torque is supposed to be equal to 10% of the rated torque of the actuator. for the single actuator, the effective torque is given by
$$ M_{1e} = M_1 - M_{1fric} = 0.9 \cdot 22.65\,\mathrm{Nm} = 20.38\,\mathrm{Nm} \qquad (8) $$
then, taking the friction into account, the efficiency becomes
$$ \eta_{conv\_single\_eff} = \frac{W_{out}}{U + p_{in} V_{in}} = \frac{M_{1e}\,\omega_{out} \cdot 1\,\mathrm{s}}{U + p_{in} V_{in}} = \frac{20.38\,\mathrm{Nm} \cdot 9.42\,\mathrm{rad}}{331\,\mathrm{J} + 236\,\mathrm{J}} = 0.338 \qquad (9) $$
for the cascaded actuators, the effective torque is equal to the average torque minus the friction of both individual actuators:
$$ M_{e\_tot} = M_{avtot} - M_{fric\_1} - M_{fric\_2} = 40.5\,\mathrm{Nm} - 0.1 \cdot 22\,\mathrm{Nm} - 0.1 \cdot 75\,\mathrm{Nm} = 30.8\,\mathrm{Nm} \qquad (10) $$
the efficiency for the 1 s cycle becomes
$$ \eta_{conv\_cascaded} = \frac{W_{out}}{U + p_{in} V_{in}} = \frac{M_{e\_tot}\,\omega_{out} \cdot 1\,\mathrm{s}}{U + p_{in} V_{in}} = \frac{30.8\,\mathrm{Nm} \cdot 9.42\,\mathrm{rad}}{331\,\mathrm{J} + 236\,\mathrm{J}} = 0.51 $$
4. simulating the system with sensors and closed loop control
the simulation of the system represented in fig. 1c) with sensors, control signals and closed loop operation considers the real operation of the motion rectifier described in section 2.1. for this purpose, the mechanical model given in fig. 16 is used. in this model, a first integration block represents the movement of the wing-rotor of the actuator alone. at its input side, a factor k1 is used to characterize the inertia of this rotor, k1 being the inverse of the mechanical time constant tm of the rotor.
the action of the motion rectifier amounts to limiting the speed of the rotor to the value of the rotational speed of the output shaft. this value is set as a constant (omega) in order to simplify the simulation. this parameter could be simulated separately as another mechanical integrator, as represented with a dotted line in the lower left side of the figure. this additional integrator would simulate the dynamics of the whole output shaft including the electric generator. the limitation function is given through a simple limiting block where the upper limit (omega) and the lower one (-omega) are directly imposed. as a result, the speed of the rotor appears at the output of the limiting block. in reality the angular velocity of the rotor cannot exceed that of the output shaft (omega), and the output of the first integrator must be controlled with an anti-reset wind-up circuit. this circuit comprises a subtraction block which compares the output and the input of the limitation block. the output of the comparator is fed back to the input with a factor k2. this factor (100 in this case) controls the static error between the simulated rotor speed and the speed of the output shaft. then, the position of the rotor is simulated with a second integrator. the output quantity corresponds to the rotor position.
fig. 16 mechanical model of the rotor and motion rectifier (input: torque; outputs: rotor speed and rotor position; limit: omega)
from the position of the rotor, the signals produced by the sensors are generated, as well as the control signals for the valves. the computation of the control signals is realized according to the description of fig. 14. sensor signals and control signals are represented in fig. 17.
fig. 17 signals from the sensors and control signals (x_a_270, x_b_270, x_int_a, x_int_b, x_exh_a, x_exh_b)
in fig. 18, the speed of the rotor and its position are represented.
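the structure of the mechanical model of fig. 16 (integrator, limiting block, anti-reset wind-up feedback, second integrator, two-point reversal) can be sketched in a few lines of python. this is a minimal illustration and not the original simulation: the driving torque is replaced by a constant of hypothetical magnitude that only changes sign at the end-of-stroke positions, the value of k1 is an assumed placeholder, and a simple explicit euler integration is used. only k2 = 100, omega = 6.28 rad/s and the 270° stroke are taken from the text.

```python
import math


def simulate_rotor(torque=20.0, k1=50.0, k2=100.0, omega=6.28,
                   stroke=1.5 * math.pi, dt=1e-4, t_end=3.0):
    """Euler simulation of the rotor model with speed limiter and anti-windup.

    torque and k1 are hypothetical values chosen for illustration only.
    Returns a list of (time, rotor speed, rotor position) tuples.
    """
    speed_int = 0.0   # output of the first integrator (before the limiter)
    position = 0.0    # output of the second integrator [rad]
    direction = 1     # +1 clockwise, -1 anti-clockwise (two-point controller)
    trace = []
    for n in range(int(round(t_end / dt))):
        # limiting block: the rotor cannot run faster than the output shaft
        speed = max(-omega, min(omega, speed_int))
        # anti-reset wind-up: limiter output minus limiter input, gain k2
        feedback = k2 * (speed - speed_int)
        # first integrator, with k1 as inverse mechanical time constant
        speed_int += dt * k1 * (direction * torque + feedback)
        # second integrator: rotor position
        position += dt * speed
        # reversal at the end-of-stroke positions
        if position >= stroke:
            direction = -1
        elif position <= 0.0:
            direction = 1
        trace.append((n * dt, speed, position))
    return trace


if __name__ == "__main__":
    trace = simulate_rotor()
    speeds = [s for _, s, _ in trace]
    positions = [p for _, _, p in trace]
    print("speed range   :", min(speeds), max(speeds))
    print("position range:", min(positions), max(positions))
```

in this sketch the limited speed switches between +omega and -omega with a finite slope set by k1, and the position swings between approximately zero and 4.71 rad.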
the angular speed of the rotor is limited through the parameter omega set to 6.28 rad/s. one can see that the rotor speed changes from a positive value (omega) to a negative one (-omega) with a specific slope, corresponding to the inertia of the wing rotor alone. the position of the rotor is controlled by the rs flip-flop represented in fig. 14, which acts as a kind of two-point controller. the rotation angle swings up and down between nearly zero and 4.71 rad. these values correspond to the end-of-stroke positions of the 270° actuator.
fig. 18 rotor speed (blue) [rad/s] and rotor position (red) [rad]
fig. 19 volumes of the two chambers of the actuator. side a (blue) and side b (red). x: time [s], y: volumes [cm3]
from the angular position of the rotor, the volumes of both chambers (a and b) can be easily calculated. they are represented in fig. 19. by combining the variables representing the volumes with the control signals of the valves, the pressure in each chamber can be calculated. on the curves of fig. 20, one can clearly distinguish the flat segment at the beginning of the clockwise and anti-clockwise strokes, corresponding to the intake under constant pressure. these segments are followed by the expansion curves when the intake valves are closed. in the simulation, the expansion is considered in adiabatic conditions. at the end of the expansion, the rotor has reached its end position and returns in the opposite direction. in this phase, the exhaust valves are open and the atmospheric pressure appears in the chambers.
fig. 20 pressures in both chambers (a: upper curve, and b: lower curve). x: time [s], y: pressure [bar]
fig. 21 torque applied to the rotor (red) and rotor speed (blue).
x: time [s], y: torque [nm], speed [rad/s]
from the pressures in the chambers and the surface of the wing, the forces acting on both sides of the wing are calculated, as well as the generated torque, considering the radius at which the force acts. the torque produced by the pressure in the chambers is represented in fig. 21 (red curve). this torque produces the motion of the rotor shown by the blue curve in the figure. this curve corresponds to the speed of the rotor, which is limited in both directions by the motion rectifier driving the generator at a constant speed of 6.28 rad/s. between the clockwise and the anti-clockwise motions, the rotor undergoes a speed reversal. during the reversal, the roller clutches of the free-wheeling devices in the motion rectifier are inactive, and no torque is transmitted to the output shaft. this can be observed on the curve of fig. 22, where the torque effectively transmitted to the output of the rectifier is shown. at each change of the speed from a positive to a negative value, one can see a zero segment lasting for the duration of the speed reversal. finally, the mechanical power at the output of the rectifier can be represented (fig. 23). the instantaneous value of the power (blue curve) corresponds to the value of the transmitted torque multiplied by the value of the rotational speed. in the figure, the average value of the power is also represented.
fig. 22 torque transmitted to the output of the motion rectifier [nm]
fig. 23 mechanical power and average at the output of the motion rectifier [w]
5. experimental set-up
the different systems with enhanced efficiency have been realized as prototypes for demonstration. fig. 24 shows the complete system with two cascaded actuators. in the middle of the figure, the motion rectifier can be seen. on the left of the figure is the electric generator, and on the right, the coupled actuators with the intake, transfer and exhaust valves.
fig.
24 experimental set-up of the two coupled actuators
the actuators used are of the type crb2 bw30 (11.3 cm3) and crb2 bw20 (4.8 cm3). in figure 25, the motion rectifier can be seen with its direct and inverting trains. in fig. 26, the two coupled actuators are represented.
fig. 25 the motion rectifier
fig. 26 the two coupled actuators
finally, fig. 27 shows the system with one actuator only, in which both the displacement and expansion work are produced. between the motion rectifier (middle of the figure) and the single actuator, the position sensors can be observed.
fig. 27 the system with one actuator only
figure 28 shows the evolution of the pressure in the two chambers of the single actuator with integrated expansion according to the scheme of fig. 14. these curves correspond to the theoretical curves which were simulated (fig. 20) but show some differences, mainly due to the limited performance of the valves and tubing. the different phases are indicated: intake, expand and exhaust. in the intake phase, the pressure in the chambers is not established instantaneously, but rises exponentially, mainly because of the small cross-section of the tubes and of the valves. then, the expansion phase lasts over a reduced time and a reduced expansion ratio. this is due to delays of the control valves, as can be seen in fig. 29. the period of one cycle is 150 ms.
fig. 28 pressures in the chambers (a: upper curve, b: lower curve; phases: intake, expand, exhaust)
the left picture of fig. 29 shows the relation between the sensor signal “end of stroke” of the chamber b and the pressure in the chamber a. before the sensor signal rises, the chamber a is in its exhaust phase. the pressure is not at the level of the atmospheric pressure but takes the value of a pressure difference pexh caused by the exhaust flow of the air through the small valve and tubing.
after the delay time tint the intake valve is opened. due to this delay, the rotor of the actuator hits its end of stroke stop after tmech instead of being braked pneumatically by the opening of the intake valve of the chamber a. while the rotor is blocked at its stop, the exhaust flow goes to zero and the pressure decreases from pexh to the atmospheric level. then, after tint the intake valve a opens and the pressure rises during the new intake phase. in the right picture of fig. 29, the relation between the sensor signal giving the length of the intake and the pressure in the corresponding chamber is represented. the sensor signal rises at a “length angle” which is the symmetric one before the end of stroke position. the signal x_int_a for real length of the intake for the corresponding chamber is activated through the sr flip flop as represented in fig.14 (logic “and” function). a time delay texp appears between the falling edge of the sensor signal and the real beginning of the expansion. this is due to the slow response of the valve especially by closing. the delayed expansion causes a reduction of the expansion factor. these records illustrate the limits of the realized demonstrator and the need of using fast control valves for such an application [11]. additionally, the section of the valves and of the pipes must be chosen as large as possible, allowing fast filling and exhaust of the chambers of the actuator. tint tmech pexh texp fig. 29 control signals and pressure in the chamber left/upper curve: end of stroke signal from the sensor left/lower curve: pressure in the corresponding chamber right/upper curve: sensor signal for the length of the intake right/lower curve: pressure in the corresponding chamber 6. conclusions a special pneumatic drive system has been analyzed, where the energetic efficiency can be enhanced. the basic drive system using semi-rotary vane-type actuators is simulated and its energetic performance is evaluated. 
in this system, the alternating movement of the actuator is transmitted to an electric generator via a dedicated motion rectifier. a first version of the drive with enhanced efficiency is proposed, where a second actuator of larger size is added with the goal of increasing the mechanical performance. this system is based on the principle of extracting, in addition to the classical displacement work, an additional amount of mechanical work in the form of expansion work of the pressurized air. a second version of the enhanced drive uses one oversized actuator only, in which both the displacement work and the expansion work are realized in the same device. the principle uses two intake valves and two exhaust valves, controlled by a more complex electronic circuit and sensor system. the three systems are simulated with an idealized model of the air transfer into the chambers. the simulation of the produced torque and mechanical work shows that the energetic performance of such a drive system can be theoretically doubled. the performances of the three systems are compared, and a corresponding efficiency is evaluated in the sense of the produced work for an identical air consumption. the real behavior of the systems has been tested on a small-scale experimental setup. in these real systems, the different phases of the pneumatic to mechanical transformation are well recognized through pressure measurements. the real behavior shows that the dynamic phenomena related to the air transfer play a major role. the cross-section of the tubes and of the valves, as well as the length of the pneumatic connections, must be designed specifically. additionally, the dynamic response of the electromagnetic valves was observed, together with its impact on the system behavior and performance.
references
[1] a.
rufer, energy storage – systems and components, crc press taylor & francis group, boca raton fl33487-2742, isbn-13: 978-1-138-08262-5.
[2] s. lemofouet, investigation and optimisation of hybrid electricity storage systems based on compressed air and supercapacitors, thesis epfl, 2006.
[3] m. heidari, a. rufer, “fluid flow analysis of a new finned piston reciprocating compressor using pneumatic analogy”, international journal of materials, mechanics and manufacturing, 2014.
[4] a. iglesias, d. favrat, “innovative isothermal oil-free co-rotating scroll compressor-expander for energy storage with first expander tests”, energy conversion and management, elsevier, vol. 85, pp. 565–572, september 2014.
[5] vane type rotary actuators, smc company, series crb2/crb1, cat es20-159 a.
[6] z. li, l. zuo, j. kuang and g. luhrs, “energy harvesting shock absorber with a mechanical motion rectifier,” smart materials and structures, vol. 22, no. 025008, pp. 1–10, 2013.
[7] dr. mohamed moshrefi-torbati, mechanical motion rectifier supervisor: investigating the potential of a mechanical device for transforming bi-directional rotational motion into unidirectional rotational motion. individual project work undertaken by geoffrey moore, academic year 2015/2016.
[8] g. gianfranco, underwater electric torch with incorporated electric generator, european patent application ep 1 249 659 a2, 28.03.2002.
[9] a. rufer, a compressed air driven generator with enhanced energetic efficiency, iemera 2020 at: imperial college london (virtual).
[10] m. dagdelen, m. sarigecili, “friction torque estimation of vane type semi-rotary pneumatic actuators in the form of combined coulomb-viscous model”, in proceedings of the 5th international conference on advances in mechanical engineering, december 2019.
[11] festo website: fast switching valve mhe2, available on web site: http://www.festo.com.
[12] s. čajetinac, d. šešlija, v.
nikolić, m. todorović, “comparison of pwm control of pneumatic actuator based on energy efficiency”, facta universitatis, series: electronics and energetics, vol. 25, no. 2, pp. 93–101, august 2012.

instruction facta universitatis series: electronics and energetics vol. 27, no 4, december 2014, pp. 601–611 doi: 10.2298/fuee1404601s

fuzzy model reference adaptive control of velocity servo system

momir r. stanković 1, milica b. naumović 2, stojadin m. manojlović 1, srđan t. mitrović 1
1 military academy, university of defence in belgrade, serbia
2 department of automatic control, faculty of electronic engineering, university of niš, serbia

abstract. the implementation of fuzzy model reference adaptive control of a velocity servo system is analysed in this paper. the design of the model reference adaptive control (mrac) and the problem of choosing the adaptation gain are considered. tuning the adaptation gain by a fuzzy logic subsystem and a simple synthesis procedure of fuzzy mrac are proposed. several simulation runs show the advantages of the fuzzy mrac approach. experimental validation on a laboratory speed servo is realized by the acquisition system. the results confirm the benefits of the proposed controller in comparison with the standard mrac.

key words: mrac, fuzzy mrac, adaptation gain.

1. introduction

the major conventional controller design concepts are model based. however, process modelling is a complex procedure which, at best, provides only an approximate model of the real process, accompanied by some level of model uncertainty. controllers with constant parameters are in most cases unable to cope with parameter perturbations, unmodelled dynamics and external disturbances. in order to provide acceptable system behaviour in the presence of internal and external disturbances, the appropriate adaptation of controller parameters is necessary [1].
adaptive control contains a proper adjustment mechanism of controller parameters in accordance with working conditions and the current state of the system. recall that adaptive systems are divided into two classes: self-tuning systems and model reference adaptive systems based on a parameter adaptation technique [2], [3]. in self-tuning control systems some of the recursive methods for on-line process identification are used, and controller parameters are adjusted in real time based on the estimated values and a predefined algorithm [4].

received may 14, 2014; received in revised form october 13, 2014. corresponding author: momir r. stanković, military academy, university of defence in belgrade, generala pavla jurišića šturma 33, 11000 belgrade, serbia (e-mail: momir_stankovic@yahoo.com)

in model reference adaptive control (mrac) the system performance is given by the reference model. tuning of controller parameters is based on the error, defined as the difference between the reference model and real process responses [2], [5]. the main weakness of the standard mrac with the mit rule is the lack of clearly defined rules for selecting the adaptation gain. in most cases it is chosen based on a large number of simulations and trial and error methods [2]. the use of a fractional order parameter adjustment rule instead of the gradient approach with the mit rule, and the employment of a fractional order reference model, are proposed in [6]. different modifications of the mrac, with a variable structure design [7], based on performing repetitive tasks [8] and with a time-varying reference model [9], have successfully been applied to plants with unmodeled dynamics, external disturbances and unknown parameters, and to a system with bounded control effort.
since ichikawa presented the novel design of model reference adaptive fuzzy control [10], many authors have made progress in the application of fuzzy theory in mrac [11], [12]. the fuzzy set theory allows the use of experience in system control design. the great contribution of fuzzy logic is the possibility of modelling unstructured heuristic assertions, which are expressed linguistically [13]. the fuzzy adaptive concept is closer to the designer and allows the use of expert knowledge and experience in designing control systems. as a result, the performance/complexity ratio is better for fuzzy adaptive controllers [14], [15]. fuzzy mrac is suitable for application in industrial control systems, where the influence of internal and external disturbances is high. in [16] a simple design procedure of fuzzy mrac using the error signal as the fuzzy subsystem input was presented, and this controller has shown better results than conventional ones. in this paper the fuzzy mrac of the speed servo system is proposed. in speed servo systems the varying load torque has the main influence on system performance. based on the estimation of the load torque and its first derivative, the adaptation gain is adjusted by a fuzzy logic subsystem (flss). through matlab/simulink® simulation models, the proposed and standard mrac of the speed servo system with different load disturbance profiles are compared. the proposed controller is implemented in a laboratory dc velocity servo system, and experimental validation of the simulation results is obtained.

2. a revisit to the model reference adaptive control (mrac)

a block diagram of the model reference adaptive control is shown in fig. 1. the desired behaviour of the system is expressed by the reference model. parameters of the controller are adjusted based on the error e = y − ym, which is the difference between the plant output y and the reference model output ym.
the main sources of error e are the difference between the reference and plant dynamics and external disturbances, denoted by w in fig. 1. the system has two feedback loops: an ordinary one composed of the plant and controller, and a feedback loop for controller parameter adjustment [2].

fig. 1 block diagram of model reference adaptive control

adjusting the controller parameters in the direction of the negative gradient of e is realized by the well-known mit rule:

$$\frac{d\theta}{dt} = -\gamma\, e\, \frac{\partial e}{\partial \theta}, \qquad (1)$$

where θ is the parameter of the controller, the partial derivative ∂e/∂θ is called the sensitivity derivative of the system and γ is the adaptation gain [2]. consider a first order system described by the model [2]:

$$\frac{dy}{dt} = -a y + b u, \qquad (2)$$

where u is the control variable. let the reference model be given as:

$$\frac{dy_m}{dt} = -a_m y_m + b_m u_c, \qquad (3)$$

where uc is the command signal. the perfect model reference following can be achieved by the controller:

$$u = \theta_1 u_c - \theta_2 y, \qquad (4)$$

with parameters:

$$\theta_1 = \theta_1^0 = \frac{b_m}{b} \quad \text{and} \quad \theta_2 = \theta_2^0 = \frac{a_m - a}{b}. \qquad (5)$$

the sensitivity derivatives directly follow from partial derivations of e with respect to the controller parameters θ1 and θ2 [2]:

$$\frac{\partial e}{\partial \theta_1} = \frac{b}{p + a + b\theta_2}\, u_c, \qquad (6)$$

$$\frac{\partial e}{\partial \theta_2} = -\frac{b^2 \theta_1}{(p + a + b\theta_2)^2}\, u_c = -\frac{b}{p + a + b\theta_2}\, y, \qquad (7)$$

where p = d/dt is the differential operator. equations (6) and (7) cannot be used directly because a and b represent parameters of the system, which are uncertain or unknown. in order to exclude parameters a and b the following approximation is required:

$$p + a + b\theta_2 \approx p + a_m. \qquad (8)$$

based on (8) and the mit rule (1) the following equations for updating the controller parameters are obtained:

$$\frac{d\theta_1}{dt} = -\gamma \left( \frac{1}{p + a_m}\, u_c \right) e, \qquad (9)$$

$$\frac{d\theta_2}{dt} = \gamma \left( \frac{1}{p + a_m}\, y \right) e. \qquad (10)$$

it can be noted that parameter b is absorbed in the adaptation gain γ [2]. from (9) and (10) it can be seen that mrac has only one parameter, the adaptation gain γ, which has to be chosen a priori and whose selection influences system performance significantly [15]. by substitution of (2), (3) and (4) in (9) and (10), y and e are excluded, and the following equations are obtained (constant factors are again absorbed in γ):

$$\frac{d\theta_1}{dt} = -\gamma \left( \frac{g_m(p)\,\theta_1 u_c}{1 + \theta_2\, g_m(p)} - y_m \right) y_m, \qquad (11)$$

$$\frac{d\theta_2}{dt} = \gamma \left( \frac{g_m(p)\,\theta_1 u_c}{1 + \theta_2\, g_m(p)} - y_m \right) g_{ref}(p)\, \frac{g_m(p)\,\theta_1 u_c}{1 + \theta_2\, g_m(p)}, \qquad (12)$$

where gm(p) and gref(p) are equivalent to transfer functions of the plant and reference model, respectively. the influence of the adaptation gain γ on the convergence rates of θ1 and θ2 to θ1⁰ and θ2⁰ cannot be analytically derived from (11) and (12). if γ is constant, the convergence rates depend only on the uncertainty of the plant transfer function. it is known that system performance differs for different values of γ [2], and therefore it is assumed that varying γ as a function of the external disturbances influencing the system can significantly increase the convergence rates.

3. fuzzy mrac of velocity servo system

3.1. concept of fuzzy mrac

it is known that external disturbances significantly influence the convergence rates of controller parameters. the main external disturbance for a speed servo system is the varying load torque on the motor shaft, and it can be shown that convergence rates depend on the disturbance and its dynamics. for example, a constant load torque can be effectively compensated with a small value of γ, while the compensation of a small load torque with high frequency dynamics requires a significantly larger value. depending on the load torque and its dynamics, different rules can be formed based on experience, but the exact mathematical solution cannot be easily found.
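the update laws (9) and (10) of section 2 can be illustrated with a short simulation. the following is only a minimal sketch, assuming forward-euler integration with a fixed step and a first order plant of the form (2); the function name simulate_mrac, the step size and all numerical values are hypothetical and are not taken from the paper.

```python
def simulate_mrac(gamma, a, b, am, bm, uc, theta0=(0.0, 0.0), dt=1e-4, t_end=2.0):
    """Forward-Euler sketch of plant (2), control law (4) and MIT-rule updates (9), (10).

    uc is a function of time returning the command signal.
    Returns the history of e = y - ym and the final (theta1, theta2).
    """
    steps = int(round(t_end / dt))
    y = ym = 0.0          # plant output and reference-model output
    f_uc = f_y = 0.0      # uc and y passed through the filter 1/(p + am)
    th1, th2 = theta0
    errors = []
    for k in range(steps):
        r = uc(k * dt)
        u = th1 * r - th2 * y            # control law (4)
        e = y - ym                       # model-following error
        errors.append(e)
        # right-hand sides, all evaluated at the current step
        d_th1 = -gamma * f_uc * e        # update law (9)
        d_th2 = gamma * f_y * e          # update law (10)
        d_f_uc = -am * f_uc + r          # filter state driven by uc
        d_f_y = -am * f_y + y            # filter state driven by y
        d_y = -a * y + b * u             # plant (2)
        d_ym = -am * ym + bm * r         # reference model (3)
        # forward-Euler step
        th1 += dt * d_th1
        th2 += dt * d_th2
        f_uc += dt * d_f_uc
        f_y += dt * d_f_y
        y += dt * d_y
        ym += dt * d_ym
    return errors, (th1, th2)


if __name__ == "__main__":
    # illustrative numbers only (assumptions, not values from the paper)
    a, b, am, bm = 45.0, 400.0, 100.0, 1000.0
    errors, theta = simulate_mrac(0.05, a, b, am, bm, lambda t: 10.0)
    print("final error:", errors[-1])
    print("final theta:", theta, "ideal theta (5):", (bm / b, (am - a) / b))
```

starting the sketch from the ideal parameters (5) keeps the error at rounding level, which is a quick check that the control law and the reference model are wired consistently. how fast the parameters converge from other starting values depends on γ and on the signal levels, which is the tuning difficulty discussed in the text.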
a system that can be described through a set of rules based on experience, while its mathematical model is too complicated or does not exist at all, meets one of the required prerequisites for fuzzy logic subsystem design [18]. this fact was the main motivation to include a special fuzzy logic subsystem (flss) in the control loop.

3.2. application of fuzzy mrac

the block diagram of a velocity servo system with a fuzzy model reference adaptive control is shown in fig. 2.

fig. 2 block diagram of fuzzy model reference adaptive control

the performance of the system is given by the first order reference model:

$$g_{ref}(s) = \frac{y_m(s)}{u_c(s)} = \frac{10}{0.01 s + 1}. \qquad (13)$$

the perfect model-following controller (4) is designed with the parameter adjustment rules (9) and (10) and fuzzy logic based tuning of the gain γ. the dc servo motor transfer function with armature voltage u as input and tacho voltage utg as output is given by:

$$g_m(s) = \frac{u_{tg}(s)}{u(s)} = \frac{k_{tg}\, k_m}{t_m s + 1}, \qquad (14)$$

where

$$k_m = \frac{k_{em}}{r_a f_e + k_{em} k_{me}} \quad \text{and} \quad t_m = \frac{j_e r_a}{r_a f_e + k_{em} k_{me}}$$

are the dc motor static gain and time constant, respectively [19]. the values of the tacho constant ktg and the dc motor electrical and mechanical parameters were previously identified [20] and are shown in table 1.
table 1 parameters of dc motor and tacho

parameter                          symbol   value
armature resistance                ra       8.91 Ω
armature inductance                la       4.5 mH
moment of inertia                  je       2.93e-5 kg m²
coefficient of viscous friction    fe       11.7e-5 kg m²/rad/s
electromechanical constant         kem      0.103 N m/A
mechanical-electrical constant     kme      0.103 V/rad/s
tacho constant                     ktg      0.0191 V/rad/s

the observer for estimation of the load torque and its first derivative is designed based on the dc motor moment equation:

$$\hat{m}_d(t) = k_{em}\, i_a(t) - j_e \frac{d\omega(t)}{dt} - f_e\, \omega(t), \qquad \dot{\hat{m}}_d(t) = \frac{d\hat{m}_d(t)}{dt}, \qquad (15)$$

where ia(t) and ω(t) are the measured armature current and shaft angular velocity, respectively. the estimated values of the load torque and of its first derivative are inputs to the fuzzy subsystem for γ tuning, and the corresponding membership functions are shown in fig. 3. the linguistic variable m (load torque) is described by five membership functions: small (ms), intermediate positive (mip), large positive (mlp), intermediate negative (min) and large negative (mln). the linguistic variable cm (the first derivative of load torque) is defined by the membership functions: small (cs), intermediate positive (cip), large positive (clp), intermediate negative (cin) and large negative (cln).

fig. 3 membership functions of: a) linguistic variable m and b) linguistic variable cm

the developed flss is of the takagi–sugeno type, with two inputs, the estimated load torque and its estimated first derivative, and one output, the adaptation gain γ. for the t and s norm, the minimum and maximum method was selected, respectively [21]. the fuzzy rules base of the suggested flss for adaptation gain selection is shown in table 2. the set comprises 25 rules, and the rules are defined experimentally, based on repeated simulations with different values of γ.
the flss output γ is a nonnegative scalar, and can assume values from γmin = 0.005 to γmax = 0.25.

table 2 fuzzy rules base of flss

        mln        min        ms         mip        mlp
cln     0.6γmax    0.8γmax    0.9γmax    0.8γmax    0.4γmax
cin     10γmin     0.6γmax    0.4γmax    0.6γmax    2γmin
cs      2γmin      10γmin     0.4γmax    10γmin     γmin
cip     10γmin     0.8γmax    0.4γmax    0.6γmax    2γmin
clp     0.6γmax    γmax       γmax       γmax       0.4γmax

3.3. simulation results

based on matlab/simulink® simulation models, the mrac and the proposed fuzzy mrac of the velocity servo system with the parameters given in table 1 are compared. the performances of tracking a square reference with magnitude of ±100 rad/s and period of 10 s are analyzed. in the first case the trapezoidal load, shown in fig. 4a, is applied on the motor shaft. responses of the speed servo system with mrac and fuzzy mrac are presented in fig. 4b and fig. 4d. it can be seen that during the transient of the load disturbance, when it has a constant rate of change, mrac with greater γ provides a smaller reference tracking error, but when the load disturbance becomes constant, the response is more oscillatory. the fuzzy mrac has better reference tracking performance during both periods of the load disturbance profile, due to the fuzzy adjustment of γ in which the information on the load disturbance derivative is included. the change of γ is shown in fig. 4c. the tracking performance in the presence of the sinusoidal load disturbance with angular frequency of 4 rad/s, presented in fig. 5a, is also analysed. the responses of mrac and of the proposed controller are shown in fig. 5b and fig. 5d, respectively. it can be noted that mrac with greater γ almost completely eliminates the influence of the load disturbance in steady state, but the transient is more oscillatory. the proposed controller enables acceptable overshoot and steady state reference tracking performance. the adaptation gain for this case is shown in fig. 5c.
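the rule base of table 2 can be evaluated as a zero-order takagi–sugeno system. the following is a minimal sketch under stated assumptions: the membership functions are taken as triangular with saturated ends, their centres (M_CENTRES, CM_CENTRES) are hypothetical placeholders rather than the values of fig. 3, and the output is formed as a weighted average of the rule consequents. only the consequents, γmin, γmax and the minimum t-norm come from the text.

```python
GAMMA_MIN, GAMMA_MAX = 0.005, 0.25

# Assumed centres of the triangular membership functions (hypothetical values).
M_CENTRES = [-0.25, -0.15, 0.0, 0.15, 0.25]   # mln, min, ms, mip, mlp   [N m]
CM_CENTRES = [-0.4, -0.2, 0.0, 0.2, 0.4]      # cln, cin, cs, cip, clp   [N m/s]

# Consequents of table 2: rows cln..clp, columns mln..mlp.
RULES = [
    [0.6 * GAMMA_MAX, 0.8 * GAMMA_MAX, 0.9 * GAMMA_MAX, 0.8 * GAMMA_MAX, 0.4 * GAMMA_MAX],
    [10 * GAMMA_MIN, 0.6 * GAMMA_MAX, 0.4 * GAMMA_MAX, 0.6 * GAMMA_MAX, 2 * GAMMA_MIN],
    [2 * GAMMA_MIN, 10 * GAMMA_MIN, 0.4 * GAMMA_MAX, 10 * GAMMA_MIN, GAMMA_MIN],
    [10 * GAMMA_MIN, 0.8 * GAMMA_MAX, 0.4 * GAMMA_MAX, 0.6 * GAMMA_MAX, 2 * GAMMA_MIN],
    [0.6 * GAMMA_MAX, GAMMA_MAX, GAMMA_MAX, GAMMA_MAX, 0.4 * GAMMA_MAX],
]


def memberships(x, centres):
    """Degrees of membership for a triangular partition; inputs are clamped to the range."""
    x = min(max(x, centres[0]), centres[-1])
    mu = [0.0] * len(centres)
    for i in range(len(centres) - 1):
        lo, hi = centres[i], centres[i + 1]
        if lo <= x <= hi:
            w = (x - lo) / (hi - lo)
            mu[i] = 1.0 - w
            mu[i + 1] = w
            break
    return mu


def adaptation_gain(m_d, dm_d):
    """Adaptation gain from the estimated load torque and its derivative."""
    mu_m = memberships(m_d, M_CENTRES)
    mu_c = memberships(dm_d, CM_CENTRES)
    num = den = 0.0
    for row, w_c in enumerate(mu_c):
        for col, w_m in enumerate(mu_m):
            w = min(w_m, w_c)             # minimum t-norm (rule firing strength)
            num += w * RULES[row][col]
            den += w
    return num / den                      # weighted average of the consequents


if __name__ == "__main__":
    for torque, rate in [(0.0, 0.0), (0.1, 0.0), (0.05, 0.3)]:
        print(torque, rate, adaptation_gain(torque, rate))
```

because the output is a weighted average of the table entries, it always stays between γmin and γmax, in line with the stated range of the flss output.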
the integral of absolute error e = y − ym for all cases with trapezoidal and sinusoidal load disturbance is summarized in table 3, and the results confirm the advantages of the proposed fuzzy mrac controller.

fig. 4 simulation results for tracking reference model: a) load torque, b) mrac for different γ, c) change of γ of fuzzy mrac, d) fuzzy mrac

fig. 5 simulation results for tracking reference model: a) load torque, b) mrac for different γ, c) change of γ of fuzzy mrac, d) fuzzy mrac

table 3 integral absolute error

                     mrac                           fuzzy mrac
                     γ=0.02    γ=0.05    γ=0.18
trapezoidal load     43.4      37.5      149.2      15.1
sinusoidal load      182.6     70        27.5       24.5

4. experimental validation

the experimental validation of the simulation results is realized with a laboratory velocity servo system. in fig. 6 the experimental setup is shown. a dc servo motor with outputs for angular rate and armature current signals is used. the motor is equipped with a magnetic brake for generating a variable load torque. the communication between the personal computer and the dc servo motor is provided by the acquisition card dt 9812. the control signals from the acquisition card are amplified by the power amplifier before being applied to the armature of the motor.

fig. 6 experimental setup

mrac and fuzzy mrac are designed in the matlab/simulink® environment. the simulink model of the proposed fuzzy mrac is shown in fig. 7.

fig. 7 simulink model of fuzzy mrac

the step reference with magnitude of 100 rad/s is software generated. the signals of the tacho and the dc motor armature current from the acquisition card are introduced into the simulink environment by analog input blocks. the control signal from the controller is passed to the acquisition card by an analog output block. the varying load torque, generated by the magnetic brake, is estimated with the observer and is shown in fig. 8a. the angular rate of the motor shaft is acquired and graphically presented in fig. 8b and fig. 8d. from the figures it can be seen that the experimental results are very similar to the simulation results. speed servo system performance is much better with the proposed fuzzy mrac than with conventional mrac. in fig. 8c the change of the adaptation gain γ is shown.

fig. 8 experimental results for tracking reference model: a) load torque, b) mrac for different γ (γ = 0.03 and γ = 0.1), c) change of γ of fuzzy mrac, d) fuzzy mrac

5. conclusion

the synthesis procedure of fuzzy logic model reference adaptive control (mrac) is realized in this paper. fuzzy mrac is suitable for use in industrial control applications under all disturbance conditions.
the implementation of the proposed control algorithm is analysed on the laboratory velocity servo system, where the varying load torque has the main influence on system performance. the influence of the varying load disturbance is compensated by changing the adaptation gain parameter using a relatively simple t-s fuzzy logic subsystem. some simulation results show the advantages of the fuzzy mrac concept. the experimental validation confirms the simulation results.

acknowledgement: the paper is a part of the research supported by the ministry of education, science and technology development within the project iii44004 (2011-2014).

references
[1] z. bubnicki, modern control theory, general characteristics of control system, springer, 2005. doi: 10.1007/3-540-28087-1
[2] k. astrom, b. wittenmark, adaptive control, second ed. netherlands, addison-wesley, 1995.
[3] e. nebosko, a. proskurnikov, v. yakubovich, "adaptive regulators for the control of an uncertain linear discrete time system with a reference model", doklady mathematics, vol. 82(1), pp. 667–670, 2010. doi: 10.1134/s1064562410040423
[4] t. ren, t. chen, c. chen, "motion control for a two-wheeled vehicle using a self-tuning pid controller", control engineering practice, vol. 16, pp. 365–375, 2008. doi: 10.1016/j.conengprac.2007.05.007
[5] s. abdeddaim, a. betka, s. drid, m. becherif, "implementation of mrac controller of a dfig based variable speed grid connected wind turbine", energy conversion and management, vol. 79, pp. 281–288, 2014. doi: 10.1016/j.enconman.2013.12.003
[6] b. vinagre, i. petráš, i. podlubny, y. chen, "using fractional order adjustment rules and fractional order reference models in model-reference adaptive control", nonlinear dynamics, vol. 29, pp. 269–279, 2002. doi: 10.1023/a:1016504620249
[7] c. chien, k. sun, a. wu, l.
fu, "a robust mrac using variable structure design for multivariable plants", automatica, vol.32, pp.833-848, 1996. doi: 10.1016/0005-1098(96)00009-x [8] a. tayebi, "model reference adaptive iterative learning control for linear systems", international journal of adaptive control and signal processing, vol. 20, pp.475–489, 2006. doi :10.1002/acs.913 [9] p. balaguer, "similar model reference adaptive control with bounded control effort", international journal of adaptive control and signal processing, vol.25, pp.577–592, 2011. doi:10.1002/acs.1222 [10] k. ichikawa, "an approach to the synthesis of model reference adaptive control system", international journal of control, vol. 32, pp.175-190, 1980. doi:10.1080/00207178008922852 [11] n. golea, a. golea, k. benmahammed, "fuzzy model reference adaptive control", ieee transactions on fuzzy systems, vol. 10(4), pp. 436-444, 2002. doi: 10.1109/tfuzz.2002.800694 [12] h. abid, m. chtourou, a. toumi, "an indirect model reference robust fuzzy adaptive control for a class of siso nonlinear systems", international journal of control, automation and systems, vol. 7, pp. 982-991, 2009. doi: 10.1007/s12555-009-0615-8 [13] s. mitrović, ţ. đurović, "fuzzy logic controller for bidirectional garaging of differential drive mobile robot", advanced robotics, vol. 24(8), pp.1291-1311, 2010. doi:10.1163/016918610x501444 [14] c. dragoş, s. preitl, r. precup, m. cretiu, "modern control solutions for mechatronic servosystems. comparative case studies", in proceedings of the 10th international symposium of hungarian researchers on computational intelligence and informatics cinti 2009, budapest, hungary, 2009, pp. 69-82. [15] m. kadjoudj, n. golea, m. benbouzid, "fuzzy rule – based model reference adaptive control for pmsm drives", serbian journal of electrical engineering, vol.4, pp. 13-21, 2007. 
doi: 10.2298/sjee0701013k
[16] z. li, "model reference adaptive controller design based on fuzzy inference system", journal of information & computational science, vol. 8, pp. 1683–1693, 2011.
[17] p. swarnkar, s. jain, r. nema, "effect of adaptation gain in model reference adaptive controlled second order system", engineering, technology and applied science research, vol. 1, pp. 70–75, 2011.
[18] n. sinha, m. gupta, l. zadeh, soft computing and intelligent systems: theory and applications, academic press, 2000.
[19] ž. đurović, b. kovačević, signals and systems, beograd, academic mind, 2006 (in serbian).
[20] m. stanković, m. naumović, s. manojlović, "a simple servo system as a laboratory equipment for demonstrating optimal control design", in proceedings of the 57th conference etran, zlatibor, serbia, 2013, pp. au4.2.1-6 (cd edition in serbian).
[21] k. tanaka, h. wang, fuzzy control systems design and analysis, john wiley & sons, 2001.

instruction facta universitatis series: electronics and energetics vol. 30, no 4, december 2017, pp. 571–584 doi: 10.2298/fuee1704571v

toward acoustic noise type detection based on qq plot statistics *

sanja vujnović, aleksandra marjanović, željko đurović, predrag tadić, goran kvaščev
university of belgrade, school of electrical engineering, belgrade, serbia

abstract. fault detection and state estimation using acoustic signals is a procedure highly affected by ambient noise. this is particularly pronounced in an industrial environment where noise pollution is especially strong. in this paper a noise detection algorithm is proposed and implemented. this algorithm can identify the times in which the recorded acoustic signal is influenced by different types of noise in the form of unwanted impulse disturbance or speech contamination. the algorithm compares statistical parameters of the recordings by generating a series of qq plots and then using appropriate stochastic signal analysis tools such as hypothesis testing.
the main purpose of this algorithm is to eliminate noisy signals and to collect a set of noise free recordings which can then be used for state estimation. the application of these techniques in a real industrial environment is extremely complex because sound contamination usually tends to be intense and nonstationary. the solution described in this paper has been tested on a specific problem of acoustic signal isolation and noise detection of a coal grinding fan mill in a thermal power plant in the presence of intense contaminating sound disturbances, mainly impulse disturbance and speech contamination.

key words: acoustic signal, qq plot, noise detection, predictive maintenance

received november 16, 2016; received in revised form february 21, 2017. corresponding author: sanja vujnović, school of electrical engineering, university of belgrade, kralja aleksandra blvd. 73, 11120 belgrade, serbia (e-mail: svujnovic@etf.bg.ac.rs)
* an earlier version of this paper received best section paper award at automatics section at 3rd international conference on electrical, electronic and computing engineering icetran 2016, zlatibor, serbia, 13-16 june, 2016 [1]

1. introduction

it is well known that the largest financial loss for modern industrial plants is due to inefficient or untimely maintenance [2]. this is especially true for power plants which are designed to be in function for many decades after their construction. therefore, it is only logical that there is a significant amount of research done in an attempt to prolong the working life of the plant, improve the quality of its operation [3] and reduce unnecessary losses [4]. with this in mind, the fact that predictive maintenance has become a very popular area of research is not so surprising. crucial aspects of predictive maintenance are fault detection and state estimation, i.e.
the estimate of whether the fault has occurred somewhere within the system or whether certain components are worn and the maintenance needs to be done in order to replace them. the accelerometers are the sensors most commonly used for implementing predictive maintenance algorithms on rotating machinery. the logic behind this is sound: as the fault occurs within the rotating element or as the wear of some components becomes pronounced, the vibration of the machine is sure to change accordingly [5]. the sensors can measure this vibration and algorithms can be constructed which can, based on the change in vibration signal, detect the amount of wear of certain components. these techniques are widely used in the industry with much success [6]; however, an alternative has been presented in the early 90s. this alternative proposes the use of acoustic signals for the same purpose. it has been shown that sound recordings can be as informative as vibration signals when it comes to state estimation of components [7], but acoustic sensors (microphones) are cheaper to obtain and are contactless, which is a very important feature for certain types of processes. one major drawback of using microphones for predictive maintenance is the fact that they are very sensitive to ambient noise. this makes them less than ideal for the use in an industrial environment which is usually very polluted with contaminating noise. for this reason microphones are still rarely used for predictive maintenance in real industrial environments. one way to significantly increase the applicability of acoustic signals for this purpose is developing an algorithm capable of filtering out the acoustic noise caused by the surrounding events. there are many preprocessing algorithms developed in recent years for purpose of fault detection and state estimation. 
using one of the standard frequency filters is usually not applicable because it is very difficult (if not impossible) to determine the frequencies on which the noise is dominant. even if that can be established, usually the useful part of the signal exists on the same frequencies as well, so filtering out the noise would significantly damage the informative part of the signal. impulse disturbance in time domain, for example, is equally pronounced on all frequencies, so it cannot be filtered using traditional algorithms. taking this into consideration one can easily conclude that standard frequency domain analysis is not reliable enough for noise detection in acoustic signals. therefore advanced procedures should be used for this purpose, such as statistical analysis of the signal. statistical parameters of the recorded signal can be very informative in this case because different statistical behavior is expected when the noise occurs and when the signal is in its nominal form. one of the standard tools used for statistical comparison and analysis are qq plots and they are shown to be quite effective in this case [8]. the purpose of the algorithm proposed in this research is not removal, but rather detection of noise. the entire recording is separated into windowed signals, and each windowed segment is tested for noise. this is done by comparing the statistical distribution of the recorded signal against the statistical distribution of the signal in nominal working condition. the comparison is conducted using qq plots and neyman-pearson hypothesis test. the noisy sequences are discarded and those which are classified as nominal are saved for the purpose of state estimation or some other predictive maintenance procedure. 
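the windowing and comparison steps described above can be sketched as follows. this is a minimal illustration under stated assumptions: the mean squared distance of the qq points from the line y = x with a fixed threshold stands in for the neyman-pearson test, and the function names, window length and threshold value are hypothetical, not taken from the paper.

```python
import random


def qq_points(window, reference):
    """Pair each sorted window sample with the matching empirical quantile of the reference."""
    xs = sorted(window)
    ref = sorted(reference)
    n, m = len(xs), len(ref)
    points = []
    for i, x in enumerate(xs):
        p = (i + 0.5) / n                  # plotting position of the i-th order statistic
        j = min(int(p * m), m - 1)         # index of the same quantile in the reference
        points.append((ref[j], x))
    return points


def qq_deviation(window, reference):
    """Mean squared distance of the qq points from the line y = x."""
    points = qq_points(window, reference)
    return sum((y - x) ** 2 for x, y in points) / len(points)


def detect_noisy_windows(signal, reference, win_len, threshold):
    """Return one flag per non-overlapping window: True means 'classified as noisy'."""
    flags = []
    for start in range(0, len(signal) - win_len + 1, win_len):
        window = signal[start:start + win_len]
        flags.append(qq_deviation(window, reference) > threshold)
    return flags


if __name__ == "__main__":
    rng = random.Random(0)
    nominal = [rng.gauss(0.0, 1.0) for _ in range(5000)]    # recording in nominal regime
    recording = [rng.gauss(0.0, 1.0) for _ in range(2000)]
    for k in range(500, 1000, 10):                           # impulse disturbance in window 2
        recording[k] += 20.0
    print(detect_noisy_windows(recording, nominal, 500, 0.5))
```

windows flagged as noisy would be discarded, and the remaining ones kept for state estimation. in practice the threshold would follow from the chosen hypothesis test rather than being fixed by hand.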
the algorithm developed in this research is seen as a part of a larger system for state estimation and fault detection of rotating elements in thermal power plants based on acoustic signals. it has been tested on real recordings taken in thermal power plant kostolac a1 in serbia, on a specific fan mill which is a part of the coal grinding subsystem. it has been shown that state estimation of impact plates within a mill is possible using only recordings from a microphone placed in the vicinity of the mill [9]. however, it has also been shown that noise can significantly influence the classification results. the purpose of this algorithm is to conduct signal preprocessing, so that the noise-free samples of the acoustic signals can be used for state estimation of the impact plates of the mill. this paper is structured as follows. section 2 contains a theoretical description of the algorithm used, mainly qq plots and the neyman-pearson method of hypothesis testing. section 3 contains the description of the real industrial coal grinding subsystem in thermal power plants on which this algorithm has been tested. in section 4 the detailed results of the algorithm are given. here, the algorithm has been tested on nominal and noisy signals. furthermore, the effect of changing certain parameters of the algorithm has been examined, as well as an upgrade of the algorithm which enables it to be used for classification and not just noise detection. finally, the conclusions are presented in section 5.

2. qq plot as a tool for noise detection

in nominal, stationary operation of the system it is assumed that the statistical parameters of the measured signals will remain constant. if, on the other hand, an event occurs that causes a deviation from the nominal state (e.g. nonstationary ambient noise), statistical parameters of the recorded signals are expected to change in a certain way.
therefore, the probability distribution of the recorded signal in nominal regime is going to be different from the distribution of the signal which is polluted with noise. this change is going to depend on the duration and the type of noise, so the statistical parameters can be used not only for noise detection, but for noise classification as well.

2.1. qq plot

a very efficient graphical tool for comparing the expected and the obtained probability distribution is the qq plot [10]. this graph is obtained by plotting quantiles of the measured signal against the quantiles of the expected probability distribution. if the two distributions are similar, all the points in the qq plot will approximately lie on the line $y = x$. figure 1 shows a qq plot of an experimentally obtained zero mean unit variance gaussian distribution against its theoretical expectation. this type of data inspection allows not only the comparison of two probability distributions, but also the identification of the distribution of the recorded data. for example, if outliers occur at the ends of the line, this means that the measured distribution has larger (or smaller) tails than the expected distribution. if all dots lie on a straight line, but its angle is not $45^\circ$, then the variance of the expected distribution is not the same as in the measured signal.

fig. 1 experimentally obtained gaussian samples plotted against the theoretical distribution.

fig. 2 contaminated gaussian distribution in time domain (upper left) with the appropriate qq plot (upper right) and laplace distributed sample data in time domain (lower left) with its qq plot (lower right).

using these rules one can easily infer the shape of the probability distribution of the measured signal relative to the expected distribution. for example, a gaussian signal polluted with noise is expected to contain large tails on the qq plot, as in fig. 2 (up).
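as an illustration of the rules above, the following sketch builds the points of a qq plot against a standard gaussian and fits a straight line through them. it is only a sketch under stated assumptions: the plotting positions $(i - 0.5)/n$, the helper names and the synthetic samples are choices made here, not taken from the paper.

```python
import random
from statistics import NormalDist, mean


def qq_points(samples, dist=NormalDist()):
    """Return (theoretical, empirical) quantile pairs for a QQ plot."""
    ordered = sorted(samples)
    n = len(ordered)
    # Plotting positions (i - 0.5) / n avoid the infinite quantiles at 0 and 1.
    theoretical = [dist.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


def fitted_slope(points):
    """Least-squares slope of empirical versus theoretical quantiles."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


rng = random.Random(0)  # fixed seed so the illustration is repeatable
unit = [rng.gauss(0.0, 1.0) for _ in range(2000)]   # matches the expected distribution
wide = [rng.gauss(0.0, 2.0) for _ in range(2000)]   # same shape, larger variance

slope_unit = fitted_slope(qq_points(unit))  # close to 1, i.e. an angle near 45 degrees
slope_wide = fitted_slope(qq_points(wide))  # close to 2, i.e. a steeper line
```

the second sample still lies on a straight line, but with a slope near 2 rather than 1, which is the "angle is not 45 degrees" case of a variance mismatch.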
on the other hand, if the distribution of the experimentally obtained signal is significantly different in nature from the expected distribution, one will expect a deviation from the line for both lower and higher values of quantiles. this is shown in fig. 2 (down) where laplacian distributed experimental samples are plotted against the gaussian distribution. the graph indicates that the highest samples are larger than the gaussian distribution would predict, and the plot curves away from the line for the lowest values as well.

if the measured samples follow a distribution $F(x)$, an ordered nondecreasing sequence $x_{(1)}, x_{(2)}, \dots, x_{(N)}$ can be obtained, where $x_{(i)} \le x_{(j)}$ for $i < j$. here, $N$ represents the number of samples taken. by observing the ordered sequence, the formula for conditional probability can be obtained [8] which calculates the probability that a measurement $x$ will have the rank $i$ in the said sequence:

$$P(i \mid x) = \binom{N-1}{i-1} F(x)^{i-1} \bigl(1 - F(x)\bigr)^{N-i} \qquad (1)$$

2.2. hypothesis testing

qq plots in this research are used to represent the relationship between the measured signal distribution and the distribution of the signal in nominal working conditions. for this reason hypothesis testing is implemented in order to decide, based on the available data, whether the assumption of nominal working conditions is correct. if not, then the signal is considered polluted by noise and is discarded. the noise detection algorithm developed in this research relies heavily on eq. (1). in order to successfully implement it several initial calculations need to be performed.
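a small numerical sketch of the rank probability may help here. it assumes that eq. (1) is the usual order-statistic result, namely that a sample $x$ has rank $i$ among $N$ samples exactly when $i - 1$ of the other $N - 1$ samples fall below it; the function name and the example values are hypothetical.

```python
from math import comb


def rank_probability(i: int, n: int, cdf_value: float) -> float:
    """P(rank = i | x) for one sample among n, where cdf_value = F(x).

    Exactly i - 1 of the remaining n - 1 samples must fall below x,
    which gives a binomial probability.
    """
    if not 1 <= i <= n:
        raise ValueError("rank must satisfy 1 <= i <= n")
    if not 0.0 <= cdf_value <= 1.0:
        raise ValueError("cdf_value must be in [0, 1]")
    return comb(n - 1, i - 1) * cdf_value ** (i - 1) * (1.0 - cdf_value) ** (n - i)


# A sample at the median of the distribution (F(x) = 0.5) among n = 5 samples:
# the middle rank is the most likely one and the probabilities sum to 1.
probs = [rank_probability(i, 5, 0.5) for i in range(1, 6)]
```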
first the expected probability distribution in nominal regime (when there is no noise) needs to be established. then, after calculating the nominal probability density function $f_{nom}(x)$, the discriminant boundaries should be determined. if all the samples of the qq plot lie within these boundaries, then the recorded signal is in nominal working condition, i.e. there is no noise. if, on the other hand, points on the qq plot find themselves beyond the calculated boundaries, a fault has occurred, and the recorded samples are dismissed.

there are two objectives which must be taken into account when establishing valid bounds on the qq plot. the first objective is maximization of the probability that the noise-free recordings will be classified as valid. the second objective is minimization of the probability that faulty recordings will be falsely classified as valid. therefore, a tradeoff needs to be made, and as a solution a variation of the neyman-pearson method [11,12] for hypothesis testing has been chosen. this means that the probability $\gamma$ of the desired efficiency under nominal conditions has been fixed. in the literature this value is usually adopted in the range between and . in this paper the value has been taken. therefore, lower and upper boundaries ($x_l(i)$ and $x_u(i)$) are calculated so that the following condition is satisfied:

$$\int_{x_l(i)}^{x_u(i)} f(x \mid i)\,dx = \gamma \qquad (2)$$

where the probability density function $f(x \mid i)$ can be expressed using the bayesian formula:

$$f(x \mid i) = \frac{P(i \mid x)\, f_{nom}(x)}{P(i)} \qquad (3)$$

with $P(i \mid x)$ the rank probability from eq. (1) evaluated for the nominal distribution and $P(i)$ the prior probability of rank $i$.

3. case study

coal fueled thermal power plants play a very important role in energy production worldwide and are the number one energy provider in serbia. for that reason an increase of productivity and work life of an entire plant, as well as its subsystems, is of great economic importance. the coal grinding subsystem is one of the key parts of a thermal power plant and is responsible for pulverization of coal, so it can be used in a burner system.
in thermal power plant kostolac a1 in serbia fan mills used for coal pulverization have ten impact plates which rotate around the center. pulverization occurs as a result of friction between the plates and the chunks of coal within the mill. when the coal is ground into a fine powder it is transported into a burner system where it is used as a fuel. the particles which are not small enough return back into the mill where they are additionally pulverized. after several weeks the impact plates within the mill get worn due to constant impact with coal chunks and rock and the efficiency of the mill starts to decrease. this is when the maintenance needs to be performed or other more serious problems and malfunctions will occur. the algorithms which can detect the moment the maintenance is needed based on the recorded acoustic signals have already been developed. they, however, are unable to perform their function when noisy measurements are provided, which often happens with acoustic signals in a real industrial environment. mills in thermal power plants produce high intensity noise and they are located in the vicinity of other mills of the coal grinding subsystem. therefore, the acoustic environment in which the recordings are measured is extremely complex. even with all this in mind, the frequency features of this noise are very informative for state estimation of impact plates within the mills. however, since the area around the mill contains a large number of other actuators, valves, pipes and pumps, additional works such as welding, repairs, maintenance and the like are quite common. at the same time, the sound recording is being enriched by sporadic impact of larger chunks of coal.
these occurrences contaminate the acoustic recording and, considering that their statistics are not included in the training sets used for impeller state estimation algorithms, they can cause the algorithm to make a wrong decision or, at the very least, cause a large time delay in making a correct decision. for this reason it is of great importance to develop techniques for detection and, if possible, classification of contamination in the acoustic recording.

acoustic signals used to demonstrate the results of the proposed algorithm are recorded in different acoustic surroundings of the mill. one part of these recordings is taken in nominal working conditions in which, other than the noise from the mills and other rotating elements, there are no other sources of contamination. the second group of recordings consists of nominal sound sources as well as the sound of people talking in the vicinity of the microphone. the third group of signals contains nominal sound as well as the sound produced during welding and repair of the steam lines near the mill.

the noise detection algorithm developed in this research has been tested on real acoustic signals recorded in thermal power plant kostolac a1 in serbia. there are 10 impact plates within the mill for which the noise detection algorithm has been tested and the recorded signal has the sampling frequency of . the length of the obtained recording is approximately 20 minutes. this recording consists of intervals in which the system is in nominal regime, as well as intervals when the artificially created noise has been used to pollute the recording.

4. results

the proposed algorithm is tested in several steps. first the learning part of the algorithm is conducted in which the recordings in nominal regime are analyzed. in this way the nominal probability density function $f_{nom}(x)$, as well as the discriminant boundary for nominal recordings, are obtained.
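the learning step can be sketched in a simplified form. the sketch below is not the authors' procedure: it assumes that the ranks are equally likely a priori, so that the conditional density of eq. (3) is the density of the $i$-th order statistic, it replaces the estimated nominal pdf by the empirical quantiles of the nominal samples, and it uses a hypothetical default probability `gamma = 0.95`. it is also written for small windows only, since the direct binomial sum overflows for windows with more than roughly a thousand samples.

```python
from math import comb


def order_stat_cdf(p: float, i: int, n: int) -> float:
    """P(F(X_(i)) <= p): at least i of the n samples fall below the p-quantile."""
    return sum(comb(n, k) * p ** k * (1.0 - p) ** (n - k) for k in range(i, n + 1))


def upper_level(i: int, n: int, gamma: float, tol: float = 1e-10) -> float:
    """Solve order_stat_cdf(p, i, n) = gamma for p by bisection (monotone in p)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if order_stat_cdf(mid, i, n) < gamma:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def empirical_quantile(sorted_nominal, p: float) -> float:
    """Inverse of the empirical nominal CDF (simple nearest-rank rule)."""
    idx = min(len(sorted_nominal) - 1, max(0, int(p * len(sorted_nominal))))
    return sorted_nominal[idx]


def upper_boundary(nominal_samples, n: int, gamma: float = 0.95):
    """Upper bound for each of the n ordered quantiles of a nominal window."""
    ordered = sorted(nominal_samples)
    return [empirical_quantile(ordered, upper_level(i, n, gamma))
            for i in range(1, n + 1)]
```

a window of `n` samples would then be flagged as noisy when any of its ordered samples exceeds the corresponding entry of `upper_boundary(...)`. for realistic window sizes a log-domain evaluation or a beta-distribution quantile would be a more robust way to compute the same level.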
after that, the algorithm is tested on both contaminated and nominal samples in order to determine how prone it is to false classification. the effect of window length on the proposed algorithm is analyzed as well. finally, an attempt has been made to classify the obtained noise and to determine whether the impulse disturbance or speech contamination has occurred.

4.1. nominal recordings

as stated earlier, the initial part of the algorithm is a learning process in which a sufficiently long signal in nominal regime is used to approximate the nominal probability density function. after that the hilbert transform of the signal is performed in order to obtain an envelope of the signal. there are several ways to approximate the probability density function (pdf) of the obtained sequence. one is by observing the scaled histogram of the signal, and the other is using the method of kernel functions. the latter method is chosen in order to obtain a smoother version of the estimate without a significant increase in computational complexity. for pdf estimation an epanechnikov kernel function is used because it is commonly applied in the literature and minimizes the mean square error of the estimate. as expected, the pdf estimate obtained in this way roughly resembles the shape inferred from the histogram.

after estimating the pdf of a noise-free signal, the next step is to determine the boundaries of a qq plot from eq. (2). since all the samples of the hilbert transform of the signal are positive, and a noisy signal is expected to have a larger variance and a greater mean value than the noise-free signal, a slight simplification of (2) can be implemented, for the sake of easier numerical calculations:

$$\int_{0}^{x_u(i)} f(x \mid i)\,dx = \gamma \qquad (4)$$

where $x_u(i)$ is the upper boundary for the $i$-th quantile and $\gamma$ is the probability fixed for eq. (2). the lower classification boundary does not need to be determined because when the noise occurs, the points on the qq plot are expected to drift above the line. therefore, eq. (4) is used for the purpose of noise detection and boundary calculation. the resulting qq plot of the samples in nominal regime and the calculated boundary are shown in fig. 3.

fig. 3 qq plot of nominal recorded quantiles with respect to nominal expected quantiles, with the calculated boundary (horizontal axis: $F_{nom}^{-1}$; vertical axis: nominal data quantiles).

4.2. noisy recordings

testing the algorithm as a tool for noise detection is conducted on the part of the signal which is 12 seconds long and whose hilbert transform is shown in fig. 4. this signal contains dominant sections of nominal regime (blue), sections contaminated with speech (green) and samples which contain impulse disturbance (red). in this way all the aspects of the noise detection algorithm are tested. the hilbert transform is applied in order to obtain an envelope of the signal.

fig. 4 part of the recording on which the algorithm has been preliminarily tested. blue represents the nominal regime, green represents the part of the signal contaminated with speech, and red represents the part of the signal contaminated by impulse disturbance.

the testing recording has been separated into smaller pieces using a window of 1 sec, with an overlap of 50%. each window has been tested for noise, and the noisy recordings have been dismissed. all the windows which include only the nominal behavior without the noise have qq plots which resemble the shape shown in fig. 3. all the points of the plot are below the discriminant boundary and are therefore classified as noise-free samples. the effect of speech contamination on the qq plot depends heavily on the percentage of contaminated signal which is enveloped within the window, as shown in fig. 5. when the windowed signal consists exclusively of speech contaminated samples (fig. 5 down), its qq plot has quantiles which lie on an approximately straight line with an angle larger than $45^\circ$.
this indicates that the variance of the recorded signal, as well as its mean value, is larger than expected. also, most of the samples lie above the discriminant line which means that the algorithm has detected the noise. the situation is not so clear when only part of the window which is examined contains speech contaminated samples. in that case the angle of the plot is lower and, depending on the amount of speech included in the window, sometimes all the quantiles lie below the discriminant line. this means that the contamination has not been detected (fig. 5 up).

[fig. 4 axes: t [sec] versus hilbert envelope; legend: nominal, impulse disturbance, speech contamination]

fig. 5 speech contaminated samples in time domain (left) and the appropriate qq plot (right). upper figures show the behavior of the plot when only a small part of the speech contamination is encompassed in the window. central figures show the behavior when about 50% of the window contains contamination, while lower figures show what happens when the contamination is present in the entire windowed signal.

with impulse disturbance the problem becomes much simpler and the algorithm manages to detect the contamination regardless of the percentage of noisy samples in the window. the nature of impulse disturbance is so abrupt that even a small number of samples encompassed within a window is enough to significantly change the statistical parameters. the appropriate qq plot of this is shown in fig. 6.

fig. 6 samples which contain impulse disturbance in time domain (left) and the appropriate qq plot (right).

the classification results of the algorithm are presented in table 1. while classifying the nominal samples and samples which contained impulse disturbance the algorithm has achieved accuracy of 100%, while speech contamination has a lower percentage of detection. this is due to the fact that the statistical parameters of the windowed signal do not vary considerably with respect to the nominal regime when only a small part of the window contains speech contamination. this is precisely what happened in those 2 windowed parts of the signal which were wrongly classified.

table 1 results of the noise detection algorithm

                       | nominal recordings | speech contamination | impulse disturbance
classified as nominal  | 13 (100%)          | 2 (25%)              | 0
classified as noisy    | 0                  | 6 (75%)              | 4 (100%)

4.3. length of the window adjustment

the previous analysis suggests that the proposed algorithm easily detects impulse disturbances, but speech contamination can be somewhat more elusive. in the given example, out of 8 windowed signals contaminated with speech, the algorithm cannot correctly classify two of them. the problematic windowed signals are at the beginning and the end of the speech sequence and the incorrect classification is due to the fact that there is a small percentage of contaminated samples inside the window. one way to correct this error is by changing the length of the window. the noise detection results as the length of the window is changed are given in table 2.
table 2 changeable length of the window tested on speech contaminated signals

window length | classified as nominal | classified as noisy | total number of windowed signals
1.5 s         | 1 (17%)               | 5 (83%)             | 6
1 s           | 2 (25%)               | 6 (75%)             | 8
0.5 s         | 2 (13%)               | 13 (87%)            | 15
0.1 s         | 31 (37%)              | 52 (63%)            | 83

one thing which is obvious from the results is that the number of speech contaminated windowed signals increases as the length of the window decreases. this is important for the statistical significance of the experiment. however, with a smaller number of samples inside the window, the qq plots are not as representative as they are for a larger number of samples. the table shows that for window sizes between 1.5 s and 0.5 s only one or two windowed signals are wrongly classified as nominal, and those correspond to the beginning or the end of the sequence, as discussed previously. therefore a smaller length of the window will yield statistically better results because a higher percentage of signals will be correctly classified as noisy. by continuing to decrease the length of the window, however, the algorithm starts to behave inconsistently. for a window length of 0.1 s the percentage of misclassified signals drastically increases. this is due to several factors. first of all, qq plots have fewer samples and are therefore less accurate. secondly, the dynamics of speech is such that usually the gaps between the words, and sometimes even within a single word, are larger than 0.1 s. therefore there is a significant number of windowed signals which do not contain any information about the speech. furthermore, while other window lengths correctly classify all nominal recordings and all impulse disturbance recordings, for the window length of 0.1 s misclassification occurs not only for speech contaminated signals, but for nominal signals as well.

4.4. noise detection and classification

from fig. 5 and 6 it is clear that two different types of noise present themselves quite differently on the qq plot.
with this in mind it might be possible to classify which type of noise has occurred when the algorithm detects the presence of contamination. the way in which this can be done is by determining another classification line, as in eq. (4), but this time with respect to speech contaminated signals, rather than nominal recordings. in this way two classification lines are obtained, one which separates nominal recordings from the contaminated ones, and the other which detects whether contaminated recordings have impulse or speech disturbance, as shown in fig. 7. in the upper graph it can be seen that nominal recordings do not trigger any errors. speech contaminated recordings can be seen in the lower left part of the figure, and they fit ideally between two classification lines. impulse disturbance, on the other hand, has the quantiles above both discrimination lines, as can be seen in the lower right part of the figure.

this upgraded algorithm for noise detection and classification has been tested on the recording from fig. 4 and the results are shown in table 3. as can be seen, the impulse disturbance has been impeccably classified as such. nominal recordings have a high percentage of nominal classification as well. speech still has the lowest detection and classification percentage due to the facts discussed earlier.

table 3 results of the noise detection algorithm

                            | nominal recordings | speech contamination | impulse disturbance
classified as nominal       | 100%               | 25%                  | 0
classified as noisy         | 0%                 | 55%                  | 0
classified as impulse noise | 0%                 | 20%                  | 100%

fig. 7 qq plot with 2 classification lines. when the samples of a qq plot go above the red line, the noise has been detected. however, if samples are above the black line, this means that impulse disturbance has occurred, and when they are between the red and blue classification lines the speech contamination has occurred.
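the two-line decision rule of fig. 7 can be sketched as below. the function and label names are hypothetical, the rule "any quantile above a line triggers it" is an assumption taken from the nominal test described in section 2.2, and the second boundary is assumed to lie on or above the first one.

```python
from typing import Sequence


def classify_window(quantiles: Sequence[float],
                    nominal_bound: Sequence[float],
                    speech_bound: Sequence[float]) -> str:
    """Label one window from its ordered quantiles and two upper boundaries.

    nominal_bound -- line separating nominal from contaminated windows
    speech_bound  -- line separating speech from impulse disturbance
    """
    if not (len(quantiles) == len(nominal_bound) == len(speech_bound)):
        raise ValueError("quantiles and boundaries must have equal length")
    above_nominal = any(q > b for q, b in zip(quantiles, nominal_bound))
    above_speech = any(q > b for q, b in zip(quantiles, speech_bound))
    if above_speech:
        return "impulse"   # at least one quantile above both lines
    if above_nominal:
        return "speech"    # above the first line, never above the second
    return "nominal"       # all quantiles below the first line
```

for example, with `nominal_bound = [1, 2, 3]` and `speech_bound = [2, 3, 4]`, the ordered quantiles `[0.5, 1.5, 3.5]` cross only the first line and are labelled as speech.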
[fig. 7 panels: nominal data quantiles, speech data quantiles and impulse data quantiles, each plotted against $F_{nom}^{-1}$]

5. conclusion

in this paper an algorithm was presented which is capable of detecting the occurrence of noise in acoustic signals and is able to classify this noise with a high percentage of accuracy. the main tool used for this purpose is a qq plot with probability density function estimates and hypothesis testing algorithms. this research has been conducted with the purpose of making acoustic signals more broadly usable in the industry as a tool for predictive maintenance and state estimation of machines. the algorithm has been tested in a real industrial environment in thermal power plant kostolac a1 in serbia, and is shown to be capable of detecting whether the noise has occurred, and of classifying whether the impulse disturbance or speech contamination is in question. furthermore, the influence of the length of the window used on the efficiency of the algorithm is tested as well.

the rate of successful detection and classification is much lower on speech signals than on impulse disturbance because the intensity of the speech, as well as the words that are spoken, directly influence the amount of contamination of the nominal signal. therefore, if someone speaks quietly or makes long pauses while speaking, the chances are that the proposed algorithm will not manage to detect all the polluted parts of the signal. also the percentage of contamination which is included in the analyzed window affects the detectability of the contamination, so the beginning and the end of a speech contaminated sequence may not always be detectable. this can be improved by increasing the overlap between the windows and decreasing the size of the window, but only up to a point.
the algorithm proposed in this paper is an introductory study of a preprocessing tool that should be capable of detecting and isolating acoustic noise in an industrial environment, with the purpose of making acoustic recordings more suitable for use in industrial predictive maintenance algorithms. further research is going to contain robustification of the algorithm and improvement of speech detection, possibly by using correlation analysis or some similar tools. also, a pdf estimation of noisy signals based on their qq plots is something that might yield more robust results as well.

acknowledgement: this paper is a result of activities within the projects supported by serbian ministry of education and science iii42007 and tr32038.

references

[1] s. vujnović, a. al-hasaeri, p. tadić and g. kvaščev, "acoustic noise detection for state estimation," in proceedings of the 3rd international conference on electrical, electronic and computing engineering (icetran 2016), zlatibor, serbia, june 13-16, 2016, aui4.6 1-5.
[2] r. k. mobley, an introduction to predictive maintenance, 2nd ed. amsterdam, netherlands: butterworth-heinemann, 2002.
[3] m. a. stošović, m. dimitrijević, s. bojanić, o. nieto-taladriz, v. litovski, "characterization of nonlinear loads in power distribution grid," facta universitatis, series: electronics and energetics, vol. 29, no. 2, pp. 159-175, 2016.
[4] d. stevanović, p. petković, "utility needs smarter power meters in order to reduce economic losses," facta universitatis, series: electronics and energetics, vol. 28, no. 3, pp. 407-421, 2015.
[5] m. j. crocker, handbook of noise and vibration control, hoboken, new jersey: john wiley & sons, 2007.
[6] z. su, p. wang, x. yu, z. lv, "experimental investigation of vibration signal of an industrial tubular ball mill: monitoring and diagnosing," miner eng, vol. 21, no. 10, pp. 699-710, 2008.
[7] n. baydar, a. ball, "a comparative study of acoustic and vibration signals in detection of gear failures using wigner-ville distribution," mech syst signal pr, vol. 15, no. 6, pp. 1091-1107, 2001.
[8] g. s. kvascev, z. m. djurovic, b. d. kovacevic, "adaptive recursive m-robust system parameter identification using the qq-plot approach," iet control theory & applications, vol. 5, no. 4, pp. 579-593, 2011.
[9] s. vujnovic, z. djurovic, g. kvascev, "fan mill state estimation based on acoustic signature analysis," control engineering practice, vol. 57, pp. 29-38, 2016.
[10] j. j. filliben, "the probability plot correlation coefficient test for normality," technometrics, vol. 17, no. 1, pp. 111-117, 1975.
[11] k. fukunaga, introduction to statistical pattern recognition, 2nd ed. san diego, california: academic press professional, 1990.
[12] s. theodoridis, k. koutroumbas, pattern recognition, 3rd ed. orlando, florida: academic press, 2006.

facta universitatis series: electronics and energetics vol. 33, n° 2, june 2020, pp. 261-271 https://doi.org/10.2298/fuee2002261k © 2020 by university of niš, serbia | creative commons license: cc by-nc-nd

dependencies of current harmonics of some nonlinear load devices on rms supply voltage

lidija m. korunović, ivan anastasijević
faculty of electronic engineering, university of niš, niš, serbia

abstract. the paper deals with the determination of current harmonic dependencies of some nonlinear load devices on the rms supply voltage. these dependencies are based on laboratory experiments that include variations of the rms supply voltage in relatively wide ranges. the experiments were performed on some representatives of nonlinear load devices. both current harmonic amplitudes and their angles are recorded during the voltage changes, and the corresponding dependencies on rms voltage are obtained by curve fitting. the results are related to actual devices that are typically used in the residential load sector.
the obtained dependencies are the indices of potentially significant effects of rms voltage variation on current harmonics in low voltage installations.

key words: load device, voltage variation, current harmonics, power quality, harmonic distortion

received september 2, 2019; received in revised form september 20, 2019
corresponding author: lidija m. korunović, faculty of electronic engineering, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: lidija.korunovic@elfak.ni.ac.rs)

1. introduction

in modern power networks, there is a significant increase in the use of nonlinear load devices in all characteristic load sectors, such as industrial, commercial and residential [1], [2]. for example, the residential load sector typically includes the following nonlinear devices: energy efficient lighting (led lamps and fluorescent lamps with electronic ballast); switch-mode power supplied loads (laptop computers, tv sets, mobile chargers); air conditioners; direct drive washing machines; refrigerators and freezers. all nonlinear devices inject current harmonics into the network nodes, and numerous problems can arise [3], [4]. therefore, the limitation of harmonic emission is needed [5], as well as adequate modelling of individual and/or aggregate nonlinear loads for proper harmonic analysis [6], [7].

for correct harmonic analyses of low voltage networks, load devices should be properly modelled. in [8], modelling of low voltage devices that is based on simulations is presented. these simulations use equivalent electric circuits that represent analytical models of the devices, with typical parameter values. however, the parameters of the devices are almost impossible to get from the manufacturers.
thus, although the results presented in [8] regard real and reactive power and power factor, as well as total harmonic distortion of the current obtained for different rms values of supply voltages, these cannot be adopted for load devices that belong to the same load category when they consist of different circuits or of the same circuits with different parameters. also, for harmonic power flow analysis, information regarding particular current harmonics is needed.

on the other hand, there is a group of references that present individual harmonic distortions of currents that can be used for harmonic studies. thus, [9] provides these distortions for some load devices used in households, obtained by measurements and simulations, while individual harmonic distortions of currents and current harmonic angles obtained by measurements are listed in [10]. the significant influence of harmonic distortion of the supply voltage on current harmonic distortion of low voltage devices is found in many references, e.g. [6] and [11]-[13]. it indicates that network operating conditions strongly influence current harmonic emissions. however, published references do not analyse measured current harmonics of low voltage devices when the rms supply voltage changes in the relatively wide range that can appear in various network operating conditions.

therefore, this paper presents the results of experiments performed on some typical low voltage devices used in the residential load sector, in order to obtain proper dependencies of current harmonics and their angles on the rms supply voltage. for this purpose, the results of experiments performed in time periods when the harmonic distortion of the low voltage network supply voltage was almost constant are selected for the presentation. it is presumed that current frequency component dependencies of a device, and the dependencies of their angles, are not always similar to each other.
this assumption is proven in this paper, since the mentioned dependencies, expressed in the form of mathematical functions, demonstrate significant mutual differences for the same load device. the presented approach to determining dependencies of current harmonics on rms supply voltage is applicable to any low voltage load, and the obtained dependencies have numerous potential applications.

the rest of the paper is organized as follows: typical nonlinear residential load devices are listed in section 2, the description of experiments is presented in section 3, section 4 discusses experimental findings of current harmonic dependencies on rms supply voltage in detail, section 5 summarises the functions that represent these dependencies, while the main conclusions are drawn in section 6. the paper ends with the references, and with data of load devices used in experiments that are listed in the appendix.

2. representatives of nonlinear load devices in residential load sector

the group of linear load devices used in the residential load sector consists of resistive load devices such as cooker hot plates, water heaters and space heaters. the participation of these devices in the total energy consumption of the residential load sector is decreasing, due to the increased usage of other energy sources for cooking (natural gas) and heating (e.g. heat pumps and central heating), while the consumption of nonlinear load devices is increasing constantly.

the group of nonlinear load devices used in the residential load sector consists of devices that belong to different types of load. one of these types is nonlinear indoor lighting load, whose use is increasing. namely, linear indoor lighting loads (incandescent lamps) have very low efficiency [14], [15], and therefore are progressively being replaced with more efficient nonlinear lighting sources. in some of the developed countries they are even forbidden.
nonlinear lighting load applied in the residential load sector includes compact fluorescent lamps (cfl) with electronic ballasts, as well as light-emitting diode (led) lamps. since compact fluorescent lamps are rapidly being replaced with light-emitting diode lamps in modern households (and office buildings) [15], [16], a representative led lamp was selected for the experiment. its data are listed in the appendix together with the data of the other examined nonlinear load devices. all load devices with a switch mode power supply are classified as the switch mode power supply (smps) type of load [17]. the representatives of this type of nonlinear load are personal computers, monitors, dvd/cd players/recorders, televisions, etc. the participation of smps loads in the total energy consumption of a modern household is increasing, as concluded in [18] on the basis of numerous scenarios. in order to obtain current harmonic-rms voltage dependencies of one representative of smps loads typically used in the residential load sector, the experiment was performed on a laptop computer. as a result of climate change, air-conditioners are frequently used in serbian households during summer months. therefore, air-conditioners have an impact on the total energy consumption of residential consumers and consequently on the whole power system, as also noted in many countries (e.g. [19]). in [18] it is found that the ownership of air-conditioning equipment (and personal computers) will grow mostly in urban and rural settlements in the future. therefore, an air-conditioner was selected for the laboratory experiment. refrigerators consume a noticeable portion of energy in each household, since they operate during the whole year [14]. there are many types of compressor refrigerators, but the refrigerator that was already available in the laboratory was used for the experiment.
in modern households, conventional washing machines are rapidly being replaced with direct drive washing machines, which are also nonlinear devices typically used in households. however, an experiment on a washing machine was not performed, since no such machine was available in the laboratory.

3. description of the experiments

as mentioned in the introduction, laboratory experiments were performed in order to investigate the influence of rms supply voltage on current harmonics of selected nonlinear load devices. in these experiments, the supply voltage of the devices was changed by a variable autotransformer connected to the low voltage laboratory installation. the output voltage of the autotransformer was decreased from the upper limit of 110 % un, set by the standard en 50160 [20] for normal operating conditions in low (and medium) voltage distribution networks, to 80 % un (that is, 10 % below the en 50160 lower limit of 90 % un), selected to be sufficiently high to ensure that none of the tested devices is damaged. the steps of voltage change during the experiments were 2 % un, where un is the rated phase-to-neutral voltage of the low voltage network, i.e. 230 v. thus, the results presented in the paper indicate current harmonic variations during both normal operating conditions and those abnormal conditions in low voltage networks when the voltage drops to 10 % below the en 50160 lower limit. the measurements of voltage and current were performed by the 4-channel power meter lmg450 [21] after each voltage decrease, once the steady-state regime was achieved. the applied meter measures voltage and current according to standard [22], i.e. it performs a discrete fourier transform (dft) in order to obtain the amplitudes and phase angles of voltage and current harmonics.
during the experiments, total harmonic distortions of voltage and current, thdu and thdi, respectively, were calculated by the meter on the basis of voltage and current fourier series with frequency components whose harmonic order is up to the 99th:

$$\mathrm{thdu} = \frac{\sqrt{\sum_{h=2}^{99} u_h^2}}{u_1} \cdot 100\,\%, \qquad \mathrm{thdi} = \frac{\sqrt{\sum_{h=2}^{99} i_h^2}}{i_1} \cdot 100\,\%, \qquad (1)$$

where uh and ih are the voltage and current frequency components, respectively, of the h-th harmonic order, and u1 and i1 are the fundamental voltage and current components. although the meter recorded numerous frequency components, including those whose frequencies are higher than 2 khz (high-frequency components), for practical reasons this paper analyses only several current frequency components whose order is up to the 9th. total and individual harmonic distortions are most frequently used as measures of sine wave harmonic distortion [1], [3]. the latter are also defined for voltage and current,

$$\mathrm{hdu}_h = \frac{u_h}{u_1} \cdot 100\,\%, \qquad \mathrm{hdi}_h = \frac{i_h}{i_1} \cdot 100\,\%, \qquad (2)$$

and calculated from the corresponding voltage and current frequency components. as mentioned, the autotransformer used in the experiments was supplied by the public network. therefore, the voltage waveform was not an ideal sinusoid. the values of thdu and uh were recorded during the experiments on the selected nonlinear load devices. the dominant frequency components, u5 and u7, are analysed in particular. the other frequency components were less than 1 % of the voltage fundamental component during all experiments. for example, the odd frequency components u3, u9, u11, u13 and u15 were less than 0.14 %, 0.14 %, 0.50 %, 0.35 % and 0.17 %, respectively, at half of all measuring instants during the experiments. table 1 summarizes the ranges of thdu, hdu5 and hdu7 obtained by the statistical analysis of data during each experiment of the voltage decrease.
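The distortion measures of equations (1) and (2) can be illustrated with a minimal sketch. It assumes the harmonic amplitudes are already available as a mapping from harmonic order to amplitude; the function names and the example spectrum are hypothetical and are not measured values from the experiments.

```python
import math


def total_harmonic_distortion(components):
    """Return THD in percent from a {harmonic order: amplitude} mapping.

    Order 1 is the fundamental; every other order present in the mapping
    contributes to the sum under the square root, as in equation (1).
    """
    fundamental = components[1]
    harmonic_sum = sum(a * a for h, a in components.items() if h != 1)
    return 100.0 * math.sqrt(harmonic_sum) / fundamental


def individual_harmonic_distortion(components, order):
    """Return the individual distortion of one order in percent, as in equation (2)."""
    return 100.0 * components[order] / components[1]


# Hypothetical voltage spectrum in volts (illustration only, not measured data).
voltage = {1: 230.0, 5: 4.6, 7: 3.45}
thdu = total_harmonic_distortion(voltage)
hdu5 = individual_harmonic_distortion(voltage, 5)
hdu7 = individual_harmonic_distortion(voltage, 7)
print(f"thdu = {thdu:.2f} %, hdu5 = {hdu5:.2f} %, hdu7 = {hdu7:.2f} %")
```

The same two functions apply unchanged to a current spectrum, which gives thdi and hdih.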
the ranges during a particular experiment are represented in the form:

$$(\mathrm{thdu}_{25\%};\ \mathrm{thdu}_{75\%}), \qquad (\mathrm{hdu}_{h,25\%};\ \mathrm{hdu}_{h,75\%}), \qquad (3)$$

where thdu25% and thdu75% are the 25th and 75th percentiles of thdu, respectively, and hduh,25% and hduh,75% are the 25th and 75th percentiles of hduh, respectively. since each voltage frequency component is also characterized by its angle, which influences the current harmonic emission of nonlinear loads as well [6], [11]-[13], the ranges of the angles of u5 and u7 during the experiments are determined in the same way as in (3), and are presented in table 1. it is found that the recorded thdu ranges that relate to different experiments, performed in the same laboratory on various days and at various times of day, differ from each other, but during each experiment they changed stochastically in narrow ranges. the same conclusion is drawn for the particular hduh and angle ranges. therefore, it should be emphasized that the results presented in the paper correspond to the specific distortion of the supply voltage recorded when the experiments were performed.

table 1 the ranges of thdu, hdu5, hdu7 and the angles of u5 and u7 during experiments

| load device | led lamp | laptop computer | air-conditioner (cooling) | refrigerator |
|---|---|---|---|---|
| thdu [%] | 2.23; 2.31 | 3.02; 3.06 | 2.50; 2.62 | 2.14; 2.19 |
| hdu5 [%] | 1.73; 2.40 | 2.02; 2.09 | 2.29; 2.41 | 1.57; 1.64 |
| hdu7 [%] | 1.44; 1.86 | 1.65; 1.67 | 0.76; 0.81 | 1.25; 1.36 |
| angle of u5 [°] | 169.2; 172.6 | 171.2; 173.3 | 178.7; 181.3 | 178.8; 180.9 |
| angle of u7 [°] | 51.1; 53.1 | 53.0; 54.3 | 59.9; 63.1 | 43.6; 45.5 |

4. results

4.1. led lamp

current harmonics of one representative led lamp are examined on the basis of the experimental results. it is found that thdi of the led lamp increases slightly, from 39.9 % to 43.4 %, for a supply voltage variation of 30 %, i.e. for the voltage decrease from 110 % un to 80 % un (253 v to 184 v).
as mentioned before, the meter records up to the 99th frequency component, but this paper analyses only the odd current components, i.e. the 1st (fundamental), 3rd, 5th, 7th and 9th, denoted i1, i3, i5, i7 and i9, respectively, and their angles, so as not to burden the text and figures. even harmonics are not analysed, since they are negligible. fig. 1a) presents the measured values of i1, i3, i5 and i9 along with the corresponding second order polynomial fits. the measured values of i7 (and the corresponding fitting polynomial) are not presented in the figure, since they are almost the same as the measured values of i9. the polynomials, whose general form is $y = b_0 + b_1 u + b_2 u^2$, are obtained with an adjusted coefficient of determination, i.e. adjusted $R^2$ ($\bar{R}^2$), greater than 0.5 [23]. all polynomial fittings of current harmonics and their angles presented in the paper are obtained with such a relatively large $\bar{R}^2$. this indicates significant relationships between the variables. the parameters of all polynomial fits obtained for the led lamp, laptop computer, air-conditioner and refrigerator are listed in section 5 (table 2). fig. 1a) shows that i1 is significantly greater than the other frequency components and that i1 slightly decreases with the voltage decrease of 30 %, by approximately 8.6 % of its value measured at 110 % un. on the other hand, i3, i5, i9 (and i7) are small: i3 is less than one third, i5 is about fifteen percent, and i7 and i9 are about ten percent, of i1. the third current harmonic is almost constant during the voltage decrease, while the fifth, seventh and ninth current harmonics increase by about 7 %, 15 % and 3 %, respectively, for the 30 % voltage decrease. for the considered voltage decrease, current harmonic angle of i1 and i7 of examined led lamp decrease from 28.7° to 22.4°, and from 28.8° to about 10°, respectively, the angle of i5, increases from 21.0° to 11.7°, while the angles of other frequency components are almost constant (fig. 1b).
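The fitting step can be sketched as follows. This is a minimal illustration of an ordinary least-squares fit of y = b0 + b1·u + b2·u² together with the adjusted coefficient of determination; it is not the authors' fitting tool, and the function names and the synthetic test data are assumptions. The voltage is centred before solving the normal equations only to keep the 3×3 system well conditioned.

```python
def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0, 0.0, 0.0]
    for r in reversed(range(3)):
        known = sum(m[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = (m[r][3] - known) / m[r][r]
    return x


def fit_quadratic(u, y):
    """Least-squares fit of y = b0 + b1*u + b2*u**2; returns (b0, b1, b2)."""
    u_mean = sum(u) / len(u)
    x = [v - u_mean for v in u]  # centred voltage
    s = [sum(xi ** k for xi in x) for k in range(5)]
    t = [sum(yi * xi ** k for xi, yi in zip(x, y)) for k in range(3)]
    c0, c1, c2 = solve3(
        [[s[0], s[1], s[2]], [s[1], s[2], s[3]], [s[2], s[3], s[4]]], t
    )
    # Convert the coefficients of the centred polynomial back to powers of u.
    return (c0 - c1 * u_mean + c2 * u_mean ** 2, c1 - 2.0 * c2 * u_mean, c2)


def adjusted_r_squared(u, y, coefficients):
    """Adjusted R^2 for a model with two predictors (u and u**2)."""
    b0, b1, b2 = coefficients
    n = len(y)
    y_mean = sum(y) / n
    ss_res = sum((yi - (b0 + b1 * ui + b2 * ui * ui)) ** 2 for ui, yi in zip(u, y))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r_squared) * (n - 1) / (n - 3)


# Voltage grid of the experiments: 110 % un down to 80 % un in steps of 2 % of 230 v.
voltages = [253.0 - 4.6 * k for k in range(16)]
```

A fit would then be obtained with `fit_quadratic(voltages, measured_values)`, where `measured_values` stands for the sixteen recorded amplitudes or angles of one frequency component.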
the angles are also fitted by second order polynomials, whose parameters are denoted by b0, b1 and b2 (table 2).

fig. 1 measured values and corresponding fitting curves of: a) amplitudes of i1, i3, i5 and i9, and b) angles of i1, i3, i5, i7, and i9, of led lamp

4.2. laptop computer

laptop computers are commonly used in modern households. however, their currents are very distorted. for example, thdi of the examined laptop computer varies in the range from 203.7 % to 198.5 % for the voltage change from 110 % un to 80 % un, i.e. it is very high and slightly decreases with the mentioned voltage decrease. the measured values of the amplitudes of i1, i3, i5 and i9 for different supply voltages, as well as the corresponding fitting polynomials, are presented in fig. 2a). the values of the i7 amplitudes are omitted from this figure, since they are almost equal to the measured i5 values. the laptop computer is characterized by large amplitudes of current frequency components. thus, the analysed frequency components are between 70 % and 90 % of the current fundamental component. furthermore, all of the components, i1, i3, i5, i7 and i9, increase with the considered 30 % voltage decrease, by about 23 %, 29 %, 40 %, 32 % and 31 %, respectively. the recorded angles of the current frequency components are presented in fig. 2b) together with the corresponding second order polynomial fittings. it is found that the angle of i1 decreases with the voltage decrease from 39.2° to 21°, while the angles of other analysed components generally slightly increase with voltage decrease. the angles of i3, i5, i7 and i9 increase: from 158.4° to 167.3°, from 40.8° to 24.6°, from 130.7° to 149.3°, and from 66.2° to 40.3°, respectively, for 30 % supply voltage decrease.

fig. 2 measured values and corresponding fitting curves of: a) amplitudes of i1, i3, i5 and i9, and b) angles of i1, i3, i5, i7, and i9, of laptop computer

4.3. air-conditioner

as mentioned, air-conditioners are commonly used for cooling in households in serbia. therefore, the experiments were performed when the selected air-conditioner operated in cooling mode. in this mode, thdi changed from 38.9 % to 28.2 % for the voltage variation from 253 v to 184 v, i.e. it decreases significantly, by as much as 27.5 % of the thdi obtained at 110 % un. this trend of thdi change differs from the trends found for the led lamp and the laptop computer. according to fig. 3a), amplitudes of fundamental component and current frequency components, slightly change with 30 % voltage decrease, and these changes can be also represented by second order polynomials presented in the same figure. the amplitudes of i7 during the experiments are almost the same as the i9 amplitudes, and are not presented in fig. 3a). the amplitude of i1 changes negligibly with the 30 % voltage decrease (by about 2 %), while the final values of i3, i5, i7 and i9 are as much as 19.6 %, 55.5 %, 83.2 % and 31.8 % lower, respectively, than their values obtained at 110 % un.

fig. 3 measured values and corresponding fitting curves of: a) amplitudes of i1, i3, i5 and i9, and b) angles of i1, i3, i5, i7, and i9, of air-conditioner

on the other hand, there is an increase of the angles of i1, i3, i5 and i9 with 30 % voltage decrease: from 34.6° to 8°, from 104.5° to 23.3°, from 177° to 33.5°, and from 26.4° to 81.1°, respectively. the angle of i7 increases from 106.5° to 141.9° as the voltage decreases to 225.4 v, and then decreases to 63.8°, which is measured at 184 v.

4.4. refrigerator

unlike the other nonlinear load devices, the examined refrigerator has a relatively small thdi.
it changes in the range from 5.7 % to 9.6 % for the supply voltage variation from 253 v to 184 v. it increases significantly (by about 67 % of its initial value) with the 30 % voltage decrease, and this trend is quite different from the trends of thdi change of the other examined load devices. the amplitudes of i1, i3 and i5 of the refrigerator, as well as the corresponding polynomial fittings, are depicted in fig. 4a). the amplitudes of i7 and i9 are omitted, because they are very small and almost the same as the amplitudes of i5. the fundamental current component decreases by 13.3 % with the 30 % voltage decrease, while the trends of i3 and i5 are opposite: they increase by as much as about 49 % and 149 % of their small initial values measured at 253 v. for the same, 30 % voltage decrease, the angles of i1, i3 and i7 increase: from 51.7° to 35.5°, from 107.3° to 117.6°, and from 35.5° to 24.9°, respectively, i9 increases significantly from 105.4° to 220.7°, while the angle of i5 is almost constant.

fig. 4 measured values and corresponding fitting curves of: a) amplitudes of i1, i3 and i5, and b) angles of i1, i3, i5, i7, and i9, of refrigerator

5. summary of current harmonic dependencies on rms supply voltage

as discussed in section 4, the thdi values of the led lamp, laptop computer, air-conditioner and refrigerator change quite differently with the rms supply voltage decrease from 110 % un to 80 % un. the reason is the different variation of the current frequency components of different devices with the voltage change. also, the current frequency components of a single device have different trends of change with the same voltage variation. these trends are represented by polynomial fittings with acceptable accuracy. for better insight, the parameters of these fittings, of both amplitudes and angles of i1, i3, i5, i7 and i9, are summarized in table 2 for all examined load devices.
the listed parameters correspond to particular devices operating under conditions of specific harmonic distortion of the supply voltage, as discussed in section 3. the parameters of similar load devices and of other nonlinear devices used in different load sectors can be obtained using experiments analogous to those presented in this paper. comprehensive research on parameter determination at different harmonic distortions of the supply voltage requires an expensive programmable source. polynomial functions of both amplitudes and angles of current frequency components of various load devices, obtained under different supply voltage pollution conditions, can be used as input data for harmonic load flow analysis in low voltage installations, or as the basis for determining the harmonic model of an aggregate load. the obtained dependencies on rms supply voltage can be used as part of a load and/or harmonic management system in future smart homes and smart buildings, since the regulation of supply voltage can decrease current harmonics and eliminate existing or potential problems caused by harmonic emission.
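As an illustration of how such polynomial dependencies could feed a harmonic load flow, the sketch below evaluates one amplitude fit and one angle fit at a given rms voltage and returns the harmonic current as a complex phasor. It assumes the coefficient units of table 2 (amplitude b1 in 10⁻⁴ A/V, amplitude b2 in 10⁻⁶ A/V², angles in degrees). The function names and the coefficient values in the example are hypothetical and are not taken from table 2.

```python
import cmath
import math


def evaluate_quadratic(b0, b1, b2, u):
    """Evaluate y = b0 + b1*u + b2*u**2."""
    return b0 + b1 * u + b2 * u * u


def harmonic_phasor(amplitude_fit, angle_fit, u_rms):
    """Return one harmonic current as a complex phasor at rms voltage u_rms.

    amplitude_fit: (b0 [A], b1 [1e-4 A/V], b2 [1e-6 A/V^2])
    angle_fit:     (b0 [deg], b1 [deg/V], b2 [deg/V^2])
    """
    a0, a1, a2 = amplitude_fit
    amplitude = evaluate_quadratic(a0, a1 * 1e-4, a2 * 1e-6, u_rms)
    angle_deg = evaluate_quadratic(*angle_fit, u_rms)
    return cmath.rect(amplitude, math.radians(angle_deg))


# Hypothetical coefficients for one harmonic of one device (illustration only).
example = harmonic_phasor((0.05, 2.0, 0.5), (10.0, 0.1, 0.0), 200.0)
print(abs(example), math.degrees(cmath.phase(example)))
```

Summing such phasors over the devices connected at one node, harmonic by harmonic, is one simple way to form an aggregate injection, under the assumption that all devices see the same supply voltage distortion.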
table 2 parameters of polynomial fits of ih amplitudes and angles, of different nonlinear load devices

| load device | ih | amplitude b0 [a] | amplitude b1 [10^-4 a/v] | amplitude b2 [10^-6 a/v^2] | angle b0 [°] | angle b1 [°/v] | angle b2 [°/v^2] |
|---|---|---|---|---|---|---|---|
| led lamp | i1 | 0.04617 | 0.03683 | 0.16840 | 14.2008 | 0.27478 | 0.00041 |
| led lamp | i3 | 0.01668 | 0.17859 | 0.04997 | 31.7018 | 0.27437 | 0.00060 |
| led lamp | i5 | 0.01922 | 0.86482 | 0.18041 | 113.517 | 1.06471 | 0.00211 |
| led lamp | i7 | 0.01505 | 0.79088 | 0.15743 | 174.346 | 1.4987 | 0.00276 |
| led lamp | i9 | 0.00873 | 0.24801 | 0.04654 | 98.1075 | 0.80731 | 0.00175 |
| laptop computer | i1 | 0.08447 | 3.13475 | 0.48013 | 76.1786 | 0.72586 | 0.00108 |
| laptop computer | i3 | 0.10307 | 4.74304 | 0.69349 | 86.9708 | 2.43972 | 0.00579 |
| laptop computer | i5 | 0.10933 | 5.67059 | 0.96997 | 432.505 | 3.94086 | 0.00943 |
| laptop computer | i7 | 0.07309 | 2.6096 | 0.33821 | 405.194 | 5.32206 | 0.01263 |
| laptop computer | i9 | 0.04765 | 0.91079 | 0.00171 | 845.348 | 7.72459 | 0.01831 |
| air-conditioner | i1 | 3.08103 | 137.600 | 32.1795 | 267.487 | 2.70608 | 0.00706 |
| air-conditioner | i3 | 4.17088 | 358.800 | 84.3613 | 674.884 | 7.00444 | 0.01881 |
| air-conditioner | i5 | 4.28898 | 405.200 | 97.1605 | 62.3786 | 1.01267 | 0.00791 |
| air-conditioner | i7 | 0.1112 | 21.9000 | 9.24923 | 1745.01 | 16.5361 | 0.03644 |
| air-conditioner | i9 | 0.52854 | 51.1000 | 10.534 | 1217.14 | 12.6702 | 0.03058 |
| refrigerator | i1 | 2.11544 | 144.100 | 36.3634 | 20.3878 | 0.35263 | 0.00027 |
| refrigerator | i3 | 0.02649 | 4.48077 | 1.51362 | 363.872 | 2.1763 | 0.00464 |
| refrigerator | i5 | 0.12448 | 7.48473 | 1.20006 | 109.046 | 0.45892 | 0.0014 |
| refrigerator | i7 | 0.06567 | 4.40451 | 0.90482 | 335.228 | 3.28763 | 0.00723 |
| refrigerator | i9 | 0.0322 | 3.22985 | 0.76027 | 444.094 | 1.14503 | 0.00067 |

6. conclusions

the research, based on laboratory experiments, reveals that the current harmonic distortion of the examined typical nonlinear load devices changes in different ways with the variation of rms supply voltage. the dependencies of the amplitudes and angles of the fundamental and other frequency components of the current are obtained in the form of second order polynomials for the considered load devices. it is revealed that the corresponding dependencies of the devices are significantly different.
also, the dependencies of both amplitudes and angles for a particular device often show opposite trends of change with rms voltage variation. therefore, the dependencies on rms supply voltage should be taken into account for proper harmonic models of nonlinear load devices. future research should include experiments on numerous load devices under different harmonic distortion conditions of the supply voltage, using a programmable source when investment in such equipment is possible, as well as the implementation of their results into voltage control procedures for load and harmonic control in low voltage installations and distribution networks.

acknowledgement: the paper is a part of the research done within the projects iii44004 and iii44006 supported by the ministry of education, science and technological development of the republic of serbia. the authors would like to thank prof. milutin p. petronijević for the help during setting up the experiments.

references

[1] l. m. korunović, power quality. niš, rs: faculty of electronic engineering, 2014. (in serbian)
[2] s. đorđević, m. dimitrijević and v. litovski, "a non-intrusive identification of home appliances using active power and harmonic current", facta universitatis, series electronics and energetics, vol. 30, pp. 199–208, june 2017.
[3] r. c. dugan, m. m. mcgranaghan, s. h. santoso and w. beaty, electrical power system quality. ny: mcgraw-hill, 2002.
[4] e. agić, d. šljivac and b. agić, "the impact of the larger number of non-linear consumers on the quality of electricity", facta universitatis, series electronics and energetics, vol. 32, pp. 369–385, september 2019.
[5] ieee, ieee standard 519-1992: ieee recommended practices and requirements for harmonic control in electrical power systems. ieee, 1993.
[6] j. yong, l. chen and s. chen, "modelling of home appliances for power distribution system harmonic analysis", ieee trans. power del., vol. 25, pp. 3147–3155, october 2010.
[7] c. f. m. almeida and n. kagan, "harmonic coupled norton equivalent model for modeling harmonic-producing loads", in proceedings of 14th international conference on harmonics and quality of power ichqp 2010. bergamo, it: ieee, 2010, pp. 1–9.
[8] c. cresswell, steady state load models for power system analysis, phd thesis. edinburgh, uk: the university of edinburgh, dept. elect. eng., 2009.
[9] m. j. h. rawa, d. w. p. thomas and m. sumner, "experimental measurements and computer simulations of home appliances loads for harmonic studies", in proceedings of 2014 uksim-amss 16th international conference on computer modelling and simulation. cambridge, uk: ieee, 2014, pp. 1–5.
[10] j. niitsoo, i. palu, j. kilter, p. taklaja and t. vaimann, "residential load harmonics in distribution grid", in proceedings of 2013 3rd international conference on electric power and energy conversion systems. istanbul, tr: ieee, 2013, pp. 1–5.
[11] s. bhattacharyya, j. f. g. cobben and w. l. kling, "harmonic current pollution in distribution grid", in proceedings of the 2010 ieee power and energy society general meeting. minneapolis, mn: ieee, 2010, pp. 1–8.
[12] a. m. blanco, s. yanchenko, j. meyer and p. schegner, "impact of supply voltage distortion on the current harmonic emission of non-linear loads", dyna, vol. 82, pp. 150–159, august 2015.
[13] s. cobben, w. kling and j. myrzik, "the making and purpose of harmonic finger-prints", in proceedings of the 19th international conference on electricity distribution. vienna, at: cired, 2007, pp. 1–4.
[14] a. j. collin, advanced load modelling for power system studies, phd thesis. edinburgh, uk: university of edinburgh, dept. elect. eng., 2013.
[15] t. g. reames, m. a. reiner and m. b. stacey, "an incandescent truth: disparities in energy-efficient lighting availability and prices in an urban u.s. county", appl. energy, vol. 218, pp. 95–103, may 2018.
[16] b.-l. ahn, c.-y. jang, s.-b. leigh, s. yoo and h. jeong, "effect of led lighting on the cooling and heating loads in office buildings", appl. energy, vol. 113, pp. 1484–1489, january 2014.
[17] s. a. rashid, z. haider, s. m. c. hossain, k. memon, f. panhwar, k. m. mbogba, p. hu and g. zhao, "retrofitting low-cost heating ventilation and air-conditioning systems for energy management in buildings", appl. energy, vol. 236, pp. 648–661, february 2019.
[18] m. li, r. shan, m. hernandez, v. mallampalli and d. patiño-echeverri, "effects of population, urbanization, household size, and income on electric appliance adoption in the chinese residential sector towards 2050", appl. energy, vol. 236, pp. 293–306, february 2019.
[19] j. c. lam, h. l. tang and d. h. w. li, "seasonal variations in residential and commercial sector electricity consumption in hong kong", energy, vol. 33, pp. 513–523, march 2008.
[20] cenelec, en 50160:2010 + corrigendum december 2010 voltage characteristics of electricity supplied by public distribution systems, cenelec, 2010.
[21] zes zimmer electronic systems, 4 channel power meter lmg450. oberursel, de: zes zimmer electronic systems gmbh, 2006.
[22] iec, international standard iec 61000-4-7: electromagnetic compatibility (emc) - part 4-7: testing and measurement techniques - general guide on harmonics and interharmonics measurements and instrumentation, for power supply systems and equipment connected thereto. iec, 2000.
[23] m. merkle, probability and statistics for engineers and students of technical sciences. belgrade, rs: academic mind, 2006. (in serbian)

appendix

examined load devices:
- led lamp with rated power of 11 w: general electric dimmable lamp, a60, 810 lm, 2700k.
- intel pentium quad core laptop computer with charger input 100-240 v~; 1.4 a; 50-60 hz; and charger output 19.5 v dc; 2.31 a dc: hp 250 g3.
- window type air-conditioner with rated power of 1500 w: frozzini, kfr 35 gw/a.
- classic refrigerator with rated power of 72 w: obod, hl-145 ecolux.

facta universitatis series: electronics and energetics vol. 32, no 2, june 2019, pp. 315-330 https://doi.org/10.2298/fuee1902315o

feature selection for intrusion detection system in a cluster-based heterogeneous wireless sensor network

opeyemi osanaiye 1, olayinka ogundile 2, folayo aina 3, ayodele periola 4
1 department of telecommunication engineering, federal university of technology, minna, niger state, nigeria
2 department of physics and telecommunications, tai solarin university of education, ogun state, nigeria
3 department of telecommunication science, university of ilorin, ilorin, kwara state, nigeria
4 electrical electronics and computer engineering, bells university of technology, ota, nigeria

abstract. the wireless sensor network (wsn) has become one of the most promising networking solutions with exciting new applications for the near future. notwithstanding the resource constraints of wsns, they have continued to enjoy widespread deployment. security in wsns, however, remains an ongoing research trend, as the deployed sensor nodes (sns) are susceptible to various security challenges due to their architecture, hostile deployment environment and insecure routing protocols. in this work, we propose a feature selection method that combines three filter methods, gain ratio, chi-squared and relieff (triple-filter), in a cluster-based heterogeneous wsn prior to classification. this will increase the classification accuracy and reduce system complexity by extracting 14 important features from the 41 original features in the dataset. an intrusion detection benchmark dataset, nsl-kdd, is used for performance evaluation by considering detection rate, accuracy and the false alarm rate.
results obtained show that our proposed method can effectively reduce the number of features with a high classification accuracy and detection rate in comparison with other filter methods. in addition, the proposed feature selection method tends to reduce the total energy consumed by sns during intrusion detection as compared with other filter selection methods, thereby extending the network lifetime and functionality for a reasonable period.

key words: chi-squared, cluster, gain ratio, intrusion detection, nsl-kdd, relieff, wsns

received january 16, 2019; received in revised form march 9, 2019
corresponding author: opeyemi osanaiye, department of telecommunication engineering, federal university of technology, minna, niger state, nigeria (e-mail: opyosa001@myuct.ac.za)

1. introduction

wireless sensor networks (wsns) are formed by sets of distributed autonomous devices with the capability to sense, process, transmit and receive observed or measured conditions. the sensor nodes (sns) used in wsns are characterized by their light weight, limited processing power, limited energy, low storage capacity, short communication range and low bandwidth [1]. the sensor component of the sn measures the observed condition of a particular situation or physical surroundings, while the microprocessor ensures that the obtained information is intelligently computed [2]. the wireless radio of the node, on the other hand, ensures communication between neighbouring nodes. wsns are often deployed in remote, harsh and unattended environments over a certain period of time. these locations are usually not accessible; therefore, it is impractical to carry out maintenance on the nodes after installation. common applications include environmental monitoring, aircraft control, disaster control, medical health monitoring, surveillance and military applications, among many others [3].
although wsns have been used in numerous applications, the requirements of these applications have put a lot of constraints on their design and deployment. security has been identified in the literature as one of the main constraints in the deployment of wsns. this is evident as wsns are subjected to vulnerabilities associated with wireless communication. additionally, in deployments that involve unprotected hostile outdoor environments, wsns are prone to different types of attack that compromise the confidentiality, integrity, authentication and availability of the data traffic and the battery life of the sns [4,5]. many of these attacks have been identified, analysed and discussed in the literature, with authors proffering different defence and prevention techniques. one such attack is the denial of service attack, which can also be referred to as packet drop attack or sinkhole attack [6]. the blackhole attack in wsns is also a type of denial of service attack, in which the attacker advertises itself as either the destination node or the shortest route to the destination. upon receiving packets from other nodes that trusted the false advertisement, the attacker discards all of them. selective forwarding is a derivative of the blackhole attack in which the adversary node does not reject all received packets; instead, it randomly selects packets that will be discarded [7]. the adversary can use this to evade detection. in order to protect wsns from intrusion by an adversary, various intrusion detection systems (idss) have been proposed by researchers. these ids defence solutions are categorized into signature-based and anomaly-based. the former relies on signatures of known attack patterns, while the latter profiles a statistical usage model over a certain amount of time to classify data packets as either normal or anomalous using various techniques such as data mining, machine learning and statistical modelling.
the signature-based approach has the major flaw of not being able to detect unknown attacks, while anomaly-based detection suffers from a high false positive rate [8]. this has necessitated the emergence of a hybrid solution that uses the complementary features of both techniques to achieve a higher detection rate. the challenges facing most of these proffered security solutions for wsns include the limited storage capacity, computational resources and battery power of the nodes. therefore, traditional security solutions are inappropriate for wsns. due to the resource limitation in the wsn environment, proposed ids designs are often lightweight and highly specialized by type of attack to reduce false alarms. a computational intelligence ids improves its performance by providing features such as learning, reasoning, perception, evolution and adaptation [5]. these features can be explored to develop a more robust ids that is adaptive to different application scenarios and can handle unknown attacks. in this work, we introduce a pre-processing phase in the form of feature selection by combining three filter feature selection methods, gain ratio, chi-squared and relieff, herein called triple-filter, to select a one-third split (the 14 most important features) from the original dataset before classifying with a decision tree algorithm. the motivation behind feature selection is the resource constraint of sns: machine learning techniques use feature selection to eliminate redundant features and so reduce the complexity of the proposed system. the intrusion detection benchmark dataset nsl-kdd, which consists of 41 features [9], was used to evaluate the performance of the ids by considering the detection rate, classification accuracy and false alarm rate in the waikato environment for knowledge analysis (weka). furthermore, we compared our result with the proposed work in [10].
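The idea of combining three filter rankings can be illustrated with a short sketch. The combination rule is not specified at this point in the text, so the sketch assumes a simple majority vote: a feature is kept when it appears among the top k features of at least two of the three filters. The function names, the vote threshold and the scores are hypothetical; only the feature names are borrowed from nsl-kdd, and with the real dataset k would be 14.

```python
from collections import Counter


def top_k(scores, k):
    """Return the k feature names with the highest scores."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def majority_vote_selection(score_tables, k, min_votes=2):
    """Keep features ranked in the top k by at least min_votes filters.

    Assumed combination rule, used here for illustration only.
    """
    votes = Counter()
    for scores in score_tables:
        votes.update(top_k(scores, k))
    return sorted(name for name, count in votes.items() if count >= min_votes)


# Hypothetical filter scores for four features (illustration only).
gain_ratio = {"src_bytes": 0.9, "service": 0.7, "flag": 0.5, "duration": 0.1}
chi_squared = {"src_bytes": 120.0, "flag": 95.0, "service": 60.0, "duration": 5.0}
relieff = {"service": 0.30, "src_bytes": 0.25, "flag": 0.10, "duration": 0.01}

selected = majority_vote_selection([gain_ratio, chi_squared, relieff], k=2)
print(selected)
```

Raising `min_votes` to 3 would keep only the features on which all three filters agree, which gives a smaller and more conservative subset.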
the results obtained show that our proposed method can effectively reduce the number of features with a high classification accuracy and detection rate and a low false alarm rate as compared with [10]. the contribution and relevance of this paper are as follows. in this work, we introduce a pre-processing phase in the form of feature selection, similar to our approach in [11]. however, here we combine three filter feature selection methods, herein called triple-filter. this is used to select the 14 most important features in nsl-kdd for intrusion detection in wsns. this reduces the complexity of the ids by presenting a lightweight technique. reduced ids complexity implies that the sns in a wsn will consume less energy while maintaining high availability. since the sns are battery powered, prolonging the network lifetime and functionality to a reasonable time is paramount. thus, our proposed ids defence solution is suitable for use in a real-time wsn, as it helps to efficiently extend the network lifetime and functionality. the rest of the paper is structured as follows. section 2 describes related work on ids defence solutions for wsns. in section 3, the wsn architecture and the proposed ids are discussed. the section also explains the three filter feature selection methods, gain ratio, chi-squared and relieff, in detail. the feature selection and execution process is highlighted in section 4, while section 5 presents the experimental results. section 6 highlights the performance measures with respect to the classification accuracy, detection rate and false alarm rate, while we discuss the results in section 7. finally, section 8 concludes the work and suggests possible research directions.

2. related work

in defending against malicious attacks in wsns, various intrusion detection approaches have been proposed in the literature. an intelligent intrusion and prevention system was proposed in [1] by introducing a specialized dataset for wsn.
this improves the detection and classification of four types of denial of service (dos) attacks: blackhole, grayhole, flooding, and scheduling attacks. an artificial neural network (ann) was trained on the dataset to detect and classify the different dos attacks. results from the work show that the dataset, wsn-ds, enhanced the ability of the ids to achieve a higher classification accuracy. an ids based on evidence theory was proposed in [12] for cluster-based wsns. in this work, each cluster head collects the behavioural pattern of its cluster members before constructing an input evidence according to the deviation from the normal pattern. a weight value is further applied to represent the importance of each behaviour characteristic and to revise the evidence before its synthesis. a hybrid ids that enhances security in cluster-based wsns has been proposed in [13]. in this work, the proposed ids is deployed on the cluster head and consists of both an anomaly and a misuse module. the outputs of the anomaly and misuse modules are integrated by a decision-making module to identify the presence of an attack before subsequently classifying it into different attack types. in [14], a distributed two-layer and three-layer ids scheme was proposed for wsns to detect intrusion using 10% of the data for learning during the training phase. a complexity reduction process was introduced to select the features so as to minimize the energy consumption. a specification-based intrusion detection system was proposed in [15]. this system uses a rule-based technique to map behaviours to either normal or anomalous. the rule-based technique optimizes the local information obtained by watchdogs into global information for decision making by cluster heads. this compensates for the communication pattern in the network. in [16], a decentralized ids was proposed for wsn.
the proposed algorithm is divided into three phases: data acquisition, rule application and intrusion detection. in the data acquisition phase, messages are obtained in promiscuous mode and the relevant information is filtered and subsequently stored for analysis. the rule application phase, on the other hand, processes the information and applies the rules to the stored data. if a message fails a test during analysis, a failure is raised. lastly, in the intrusion detection phase, the number of raised failures is compared with the expected number of occasional failures in the network. an intrusion alarm is raised if the former is higher than the latter. in [10], an integrated intrusion detection system (iids) was proposed for cluster-based wsns. the iids was based on an earlier work in [17] and consists of three individual idss, namely: intelligent hybrid intrusion detection system (ihids), hybrid intrusion detection system (hids) and misuse ids. these idss are designed for the base station (bs), cluster head and cluster members respectively, based on their capacity and the type of attack they are vulnerable to. for example, the ihids, which has a learning capability, is deployed in the bs. the ihids combines anomaly and misuse detection by first filtering out a large number of normal packets. the remaining packets are then forwarded to the misuse detection module to identify the type of attack. this is done to achieve a high detection rate with a low false alarm rate. the cluster heads, on the other hand, house the hids, which is similar to the ihids but without a learning ability. the hids functions to optimally detect attacks; however, it retrains the behaviour of new attacks previously detected and classified by the ihids. lastly, due to the resource constraints of sns, the misuse ids is proposed for the cluster members. the misuse ids uses a predetermined attack model to match packets to find and detect attacks.
experimental results for the misuse detection, using a back propagation network and the kddcup'99 dataset, show that a detection rate of 90.96% was achieved with an accuracy of 99.75% and a false positive rate of 2.06%.

in the discussion above, different techniques have been considered for feature selection in wsns. the overall aim of these techniques is to enhance the ability of sns to differentiate attacks in wsns. the performance of a security mechanism designed in this manner can be influenced by the number of features of the dataset. different kinds of feature selection methods can be used to achieve varying results. because of the resource limitation characterizing wsns, a combination of different feature selection methods that considers these resource constraints is required. a strategy that uses multiple algorithms and harnesses their combined strengths will be advantageous in classifying the type of attack in wsns.

considering the resource limitation that characterizes wsns, this work proposes a feature selection method that combines the trio of gain ratio, chi-squared and relieff (triple-filter) to select a one-third split (14 features) from the initial 41 features of the dataset. this will significantly reduce the complexity of the ids and minimize the energy consumed during intrusion detection. moreover, this filter feature selection method offers a high detection rate with good classification accuracy and a low false alarm rate, as shown in table 4.

3. wsn architecture and proposed ids

the deployment of wsns is often made up of tens to hundreds of thousands of autonomous sns that function via member node communication. this is necessary as a single sensor node only covers a small area and can therefore only provide limited information.
this limitation of single-node deployment has brought forth networks of sns that are self-organising and collaborative, to achieve wider coverage over a large environment. the sns monitor, sense, compute and transmit the observed and measured condition of the environment, relaying the information to the intended user through the base station. a typical sensor node consists of a sensor component, a microprocessor component and a wireless radio. the sensor component measures the condition of the observed environment of interest, while the microprocessor component embedded in the node is used to intelligently process the obtained information [1]. the wireless radio component of the sensor node is used to initiate communication between neighbouring sensor nodes in the wsn. a significant benefit of sensor network deployment is its ability to extend coverage to environments that are nearly impossible for human beings to access. wsns can be categorized by the environment in which the sensor nodes are deployed. the work in [18] described five types of wsn, namely: underground wsn, terrestrial wsn, underwater wsn, multi-media wsn and mobile wsn. in underground wsn deployment, sensor nodes are buried under the surface of the ground to monitor and sense its condition. these sensor nodes transmit the sensed information to the sink node, which is placed above the ground, to relay it to the base station. terrestrial wsns, on the other hand, consist of several cheap sensor nodes deployed over a specific area of interest on the surface of the earth, in a pre-planned or ad hoc way. pre-planned deployment involves the optimal placement of sensor nodes, such as grid placement and 3-d placement models [19], while in ad hoc deployment, sensor nodes are randomly deployed.
underwater wsn deployments are instances where the sensor nodes are deployed under water to sense, explore and gather information about a subject matter and transmit this information using acoustic waves [20]. underwater wsns present a sparse sensor node deployment as compared to the dense deployment of terrestrial wsns. multi-media wsns consist of sensor nodes equipped with cameras and microphones to ensure the efficient monitoring and tracking of multi-media events, such as imaging, audio and video [21]. here, the sensor nodes interconnect over a wireless medium to retrieve, process, compress and convey sensed data in a pre-planned arrangement to ensure coverage. one major obstacle to the deployment of multi-media wsns is the resource challenge of sensor nodes, due to the excessive energy consumed by compression and decompression when transmitting multi-media events. finally, mobile wsns are sets of deployed sensor nodes that move and interact with the physical environment. just as with static wsns, mobile nodes can sense, compute, transmit and receive observed and measured events. the sensor nodes have the potential to reorganise and reposition themselves after deployment to obtain information. the obtained information can be distributed among other mobile nodes within their communication range using a dynamic routing protocol.

wsns can be further classified according to the structure and uniformity of the deployed sensor nodes. some deployments consist of uniform nodes with equal capacity, while other deployments consist of nodes of different sizes and capacities, depending on the architecture. in wsns, the network structure (topology) can be categorized into two types, namely: flat-based and hierarchical [22].
the flat-based topology consists of sensor nodes with equal capacity playing similar roles, such as monitoring and sensing events, computing the sensed information and transmitting it directly or via multi-hop routing towards the bs [23]. on the other hand, hierarchical wsns are designed to distribute the sensing and monitoring functions of the sns into different levels. cluster-based wsns are a typical example of hierarchical wsns. in this paper, we limit our scope to cluster-based wsns. arranging sns into clusters has been widely employed by researchers to efficiently sense and monitor a particular environment. the clustering technique is widely used in wsns because it offers advantages such as reduced energy consumption, fault-tolerance, scalability, efficient data aggregation, latency reduction, and robustness [3,24]. a clustered wsn comprises two sets of nodes, namely: the member nodes, known as the non-cluster head nodes, and the coordinating nodes, often referred to as the cluster heads. fig. 1 shows a typical example of a cluster-based wsn, where c represents a cluster. as shown in fig. 1, the non-cluster head nodes forward the sensed messages to their respective cluster heads in a process known as intra-cluster communication. the cluster heads organise the messages from their respective members before transmitting them to the bs. thus, a clustered network can be regarded as a two-layer hierarchical wsn, where the cluster heads work in the upper layer and the non-cluster head nodes operate in the lower layer. the coordinating nodes in most cases perform more functions than the lower layer nodes. therefore, the cluster head nodes are sometimes equipped with a better processing subsystem, sensing unit, radio subsystem, and power supply unit than the lower layer nodes. if the components of all the sensor nodes in the network are the same, the clustering wsn is usually referred to as a homogeneous clustering wsn.
otherwise, it is referred to as a heterogeneous clustering wsn. in this work, we assume that the cluster heads are equipped with a better processing subsystem, sensing unit, radio subsystem, and power supply unit. accordingly, our proposed ids is deployed on the cluster heads for intrusion detection. the cluster heads will monitor the sns to detect attacks. furthermore, the cluster heads will filter abnormal data and forward all the reliable sensed information to the bs, either directly or via one or more relay nodes. from the literature, a relay node can be either a cluster head node or a non-cluster head node [3]. since our proposed ids is installed only on the cluster head nodes, we assume that the relay nodes towards the bs can only be cluster head nodes in order to maintain high availability. moreover, the ids is deployed only on the cluster head nodes to conserve the battery energy of the non-cluster head nodes, which in turn prolongs the network lifetime and functionality. finally, the bs integrates all the collected information and transmits the final result to the end user. this proposed ids defence solution can be deployed with relevant energy-efficient and energy-balanced clustering routing protocols such as [25, 26, 27, 28, 29]. however, in this paper, we verify our proposed ids solution with the routing algorithm proposed in [26].

fig. 1 typical example of a cluster-based wsn

in this section, we present a detailed explanation of our proposed ids. current feature selection methods can be categorised into filter, wrapper and embedded methods.
while wrapper and embedded methods are time consuming and require specific classification techniques to determine the importance of a feature subset, filter methods often rely on the general attributes of the dataset to carry out data pre-processing, a step which is independent of the induction algorithm [11]. furthermore, filter methods can be classified into univariate and multivariate techniques. univariate techniques, such as information gain, are efficient and scalable; however, they tend to disregard feature dependencies. multivariate filter techniques, on the other hand, incorporate feature dependencies. this makes multivariate techniques more complex: systems that use them are less scalable and have a longer computational time than systems incorporating univariate techniques. in this work, we combine three filter selection methods, gain ratio, chi-squared and relieff, herein referred to as the triple-filter method. these filter methods were chosen for their ranking and space searching algorithms. furthermore, research has shown that combining feature selection methods can improve the performance of classifiers by identifying features that are weak individually but strong as a group [31]. our proposed triple-filter method relies on the combined strength of the trio to determine the features that are strong in determining the output class. here, we select the 14 most important features.

a. gain ratio

in filter feature selection, the value of gain ratio is large when data are evenly spread, while it is small when all data belong to only one branch of an attribute. gain ratio is an improvement on information gain that remedies the latter's bias towards features with a large diversity of values. it takes the number and size of branches into account when choosing an attribute and corrects information gain by using intrinsic information [30].
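the correction that gain ratio applies to information gain can be illustrated with a minimal sketch. it assumes the usual definition (information gain divided by the entropy of the feature's own value distribution); the function names and the toy records are hypothetical and are not taken from nsl-kdd.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def information_gain(feature_values, class_labels):
    """Class entropy minus the expected class entropy after splitting on the feature."""
    total = len(class_labels)
    branches = {}
    for value, label in zip(feature_values, class_labels):
        branches.setdefault(value, []).append(label)
    remainder = sum(len(b) / total * entropy(b) for b in branches.values())
    return entropy(class_labels) - remainder


def gain_ratio(feature_values, class_labels):
    """Information gain divided by the intrinsic information of the feature."""
    intrinsic = entropy(feature_values)  # entropy of the feature's value distribution
    if intrinsic == 0:
        return 0.0  # a constant feature carries no information
    return information_gain(feature_values, class_labels) / intrinsic


# Hypothetical toy records: a 3-valued feature and a unique identifier.
protocol = ["tcp", "tcp", "udp", "udp", "icmp", "icmp"]
record_id = ["a", "b", "c", "d", "e", "f"]
label = ["attack", "attack", "normal", "normal", "attack", "normal"]

for name, feature in (("protocol", protocol), ("record_id", record_id)):
    print(name, round(information_gain(feature, label), 4), round(gain_ratio(feature, label), 4))
```

on this toy data the identifier-like feature has the larger information gain, because every value isolates a single record, whereas the three-valued feature has the larger gain ratio. this is the bias towards many-valued features that the intrinsic-information denominator is meant to remove.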
intrinsic information is the entropy of the distribution of instance values for a given feature. the gain ratio for a given feature $x$ and a feature value $y_i$ can be calculated [30] using equation (1):

$$\mathrm{GR}(x) = \frac{\mathrm{IG}(x)}{\mathrm{IV}(x)} \qquad (1)$$

where the intrinsic value is $\mathrm{IV}(x) = -\sum_{i} \frac{|S_i|}{|S|}\log_2\frac{|S_i|}{|S|}$, $|S|$ is the number of possible values feature $x$ can take, and $|S_i|$ is the number of actual values of feature $x$. in our work, we select the 14 highest-ranked features of the nsl-kdd dataset according to gain ratio.

b. chi-squared

chi-squared ($\chi^2$), from mathematical statistics, is used as a feature selection method to determine the worth of an attribute with respect to a particular class. chi-squared can be used to test the independence of two variables under an initial (null) hypothesis $H_0$, which assumes that the two features are not related [30, 31]. this can be tested using the chi-squared formula:

$$\chi^2 = \sum_{i=1}^{r}\sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} \qquad (2)$$

where $O_{ij}$ is the actual (observed) value and $E_{ij}$ is the predicted (expected) value declared by the hypothesis $H_0$, with $i$ and $j$ running over the $r$ rows and $c$ columns of the contingency table. the higher the value of chi-squared, the stronger the evidence against the null hypothesis.

c. relieff

relieff is an extension of the earlier relief algorithm, which randomly samples an instance from the dataset and locates its nearest neighbours from both the same and the opposite class [32]. the attribute values of the nearest neighbours, after comparison with the sampled instance, are used to update the relevance score of each attribute. the idea behind this is that significant attributes take different values for instances that belong to different classes and the same value for instances belonging to the same class [32]. key among the advantages of the relieff filter method are its ability to deal with multiclass problems and its robustness to noisy and incomplete data [33]. relieff can be applied in virtually all situations because of its low bias.

4. feature selection and execution process

as depicted in fig.
2, we divided our proposed ids defence solution for cluster-based heterogeneous wsns into three phases. the first phase in implementing a lightweight ids is an initial pre-processing stage for the dataset prior to training. to achieve this, we use our proposed triple-filter method for ranking. by ranking, the features that are strong in determining the output class of the dataset are obtained and a one-third split of the ranked features (that is, 14 features) is selected. the one-third figure was arrived at by ranking the features and eliminating redundant ones up to the point before the performance of the classifier started to decline. the selected features represent the most significant features across all the filter methods. in the second phase, the training phase, the features selected after pre-processing the nsl-kdd dataset are used to train the ids to detect possible attacks in the network. the ids is deployed on the cluster head to monitor data from the sensor nodes to the base station. the final phase, the classification phase, is a process whereby a labelled training dataset is used to learn a model, which subsequently classifies test data into one of the class labels [34]. anomaly detection techniques that use classification-based algorithms can be divided into two stages: the training stage and the testing stage. in the training stage, labelled data are used to learn a particular classifier. subsequently, this classifier can be used in the testing stage to classify a test instance as either normal or anomalous. in this work, we use a decision tree classification algorithm to detect the occurrence of a dos attack. decision trees are a data mining approach, often called classifier trees or hierarchical classifiers, used for prediction.
it is a popular method because of its simple structure, ease of understanding and the short time required for interpretation [35]. during the classification process, the degree of adjustment of the model to the training set is essential. a tight stopping criterion often creates a small and under-fitted decision tree, while a loose stopping criterion gives a larger decision tree that tends to over-fit the training dataset. decision trees have been embraced for classification and data analysis in fields such as agriculture, environment and health. decision trees are recursive partition models that use a single variable to divide the dataset at each level. initially, all cases are defined to belong to the same class before a variable is selected, using a split criterion, to determine the attribute to insert in a node and branch. decision tree nodes consist of a set of rules, where each tree node is labelled with an attribute variable which creates a branch for each value. they are represented by a tree-like structure, with the leaf nodes labelled with a class label [36]. from the original id3 (iterative dichotomiser 3), c4.5 and c5.0 have been developed as advanced versions [35]. over the years, the c4.5 algorithm has been used in the literature as the standard model for supervised learning. during a classification process, a training dataset is used to train the decision tree algorithm while a test dataset is used to validate the model. for a new test sample, a prediction can be made on the state of the class variable by following the path of the tree from the root to a leaf node according to the tree structure and the sample values. for example, let us consider a set $S$ and select a case at random belonging to class $C_i$. the probability $p_i$ that the random sample belongs to the class $C_i$ is found using the equation [37]:

$$p_i = \frac{\mathrm{freq}(C_i, S)}{|S|} \qquad (3)$$

where $\mathrm{freq}(C_i, S)$ is the number of samples in $S$ that belong to class $C_i$ and $|S|$ denotes the number of samples contained in the set $S$. the information conveyed can therefore be represented by $P = \{p_1, p_2, \ldots, p_n\}$, where $P$ is the probability distribution. the entropy of $P$, which is the information conveyed by the distribution, can be expressed as follows:

$$\mathrm{Info}(S) = -\sum_{i=1}^{n} p_i \log_2 p_i \qquad (4)$$

where $n$ is the length of the information, that is, the number of entries in $P$. when a set of samples is segmented using a non-categorical attribute $x$, we have a set of subsets $\{S_1, S_2, \ldots, S_m\}$, where $m$ is the number of subsets. the weighted average is the information used in determining the class of an element and can be determined using the formula:

$$\mathrm{Info}(x, S) = \sum_{j=1}^{m} \frac{|S_j|}{|S|}\,\mathrm{Info}(S_j) \qquad (5)$$

therefore, the information gain can be computed as follows:

$$\mathrm{Gain}(x, S) = \mathrm{Info}(S) - \mathrm{Info}(x, S) \qquad (6)$$

eq. (6) expresses the difference between the information required to determine the class of an element of $S$ and the information required to determine it once the value of the attribute $x$ has been obtained. this is therefore referred to as the information gain due to attribute $x$. in this work, we use the j48 decision tree classification algorithm, a version of c4.5, for classification.

fig. 2 proposed intrusion detection model

5. experimental results

in this work, we use the combination of three filter methods during the pre-processing stage to select features from the labelled dataset, nsl-kdd. the most relevant features that are strong in determining the output class are ranked and chosen to be used by the machine learning algorithm to classify traffic packets as either normal or anomalous. weka [38], a machine learning tool that provides a collection of machine learning algorithms, is used for our experimental analysis. during classification, the parameters of weka are set to their default values. during evaluation, we determine the performance of our proposed triple-filter method by using the open source nsl-kdd dataset. nsl-kdd was chosen because it is open source and readily available online. furthermore, nsl-kdd can be modified to suit different experimental attack scenarios in wsns.
the nsl-kdd is a labelled benchmark dataset developed from the initial kddcup'99, which presented some shortcomings. the nsl-kdd consists of 41 features and 2 classes, labelled as either attack or normal. the features in the dataset are categorized into four groups, namely: basic features, content features, time-based traffic features and connection-based traffic features [9]. the attacks in the dataset are grouped into dos, probe, r2l and u2r, and are divided between a training set and a test set. the training set consists of 21 attack types, while the test set contains an extra 17 unique attack types [9]. in this work, we have modified the dataset and extracted the dos attack traces. dos is one of the most prevalent attacks on the resource-constrained sensor nodes in wsns; it depletes their energy and causes a denial of service. dos attacks on systems are often carried out using similar methods; however, their impact on different hosts varies. the feature selection process is carried out to determine the one-third split (14 highest ranked features) of the nsl-kdd dataset using our proposed triple-filter method, as shown in table 1. this experiment is performed on an hp machine running 64-bit windows 10 with an intel(r) core(tm) i7-4700mq cpu and 8 gb of ram. we use 10-fold cross-validation to estimate the performance of our proposed classifier. in 10-fold cross-validation, the data are split into 10 folds of equal size prior to carrying out 10 iterations of training and validation.
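before the classifier is cross-validated, the three per-method rankings have to be merged into one triple-filter ranking. the sketch below illustrates one way this weighted rank aggregation could work. the three rankings are the ones reported in table 1; the linear position weights (14 for the top-ranked feature down to 1 for the fourteenth) and the tie-breaking rule are assumptions made for illustration, since the paper states only that a weight is attached to each ranking position and that the weights are summed across methods.

```python
from collections import defaultdict

# Top-14 rankings per filter method, copied from table 1 (best feature first).
RANKINGS = {
    "gain_ratio": [12, 26, 4, 25, 39, 6, 30, 38, 5, 29, 3, 37, 34, 33],
    "chi_squared": [5, 3, 6, 4, 29, 30, 33, 34, 35, 12, 23, 38, 25, 39],
    "relieff": [3, 29, 4, 32, 38, 33, 39, 12, 36, 23, 26, 34, 40, 31],
}


def aggregate_rankings(rankings, keep=14):
    """Sum position weights across methods and return the `keep` best features."""
    scores = defaultdict(int)
    for ranked in rankings.values():
        for position, feature in enumerate(ranked):
            # Assumed linear weights: len(ranked) for first place down to 1 for last.
            scores[feature] += len(ranked) - position
    # Highest cumulative weight first; ties broken by feature index (an assumption).
    ordered = sorted(scores, key=lambda f: (-scores[f], f))
    return ordered[:keep], dict(scores)


selected, scores = aggregate_rankings(RANKINGS)
print(selected)
```

under these assumed weights feature 4 scores highest (35), and 13 of the 14 selected features coincide with table 2. the sketch picks feature 32 where table 2 lists feature 23, so the weighting actually used evidently differs in detail from this linear scheme.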
table 1 feature selection using filter method

filter method | features selected
gain ratio | 12, 26, 4, 25, 39, 6, 30, 38, 5, 29, 3, 37, 34, 33
chi-squared | 5, 3, 6, 4, 29, 30, 33, 34, 35, 12, 23, 38, 25, 39
relieff | 3, 29, 4, 32, 38, 33, 39, 12, 36, 23, 26, 34, 40, 31

from table 1, it is seen that each filter method has ranked the features of the dataset according to their strength in determining the class. we attach a weight to each ranking position, cumulatively sum the weights of each feature across the three filter methods, and thereby determine the strongest features overall. table 2 presents the output of our triple-filter method, that is, the fourteen most important features. these fourteen features have been used as the input for training the decision tree classifier, j48 in weka.

table 2 triple-filter feature selection method

filter method | features selected
triple-filter | 3, 4, 29, 33, 34, 39, 12, 5, 30, 38, 26, 25, 23, 6

6. performance measure

during the evaluation of a classifier, different metrics such as classification accuracy, detection rate and false alarm rate can be used. these metrics are determined from the numbers of true positives (tp), false positives (fp), true negatives (tn) and false negatives (fn). tp are the instances where attack packets are correctly classified, while fp occur when normal packets are misclassified as attacks (false alarms). tn, on the other hand, are instances where normal packets are correctly classified, whereas fn are instances where packets are classified as normal when they are in fact attacks. recently developed idss for detecting attacks in wsns require a relatively high detection rate with a low false alarm rate. as discussed, in this work, we consider the classification accuracy, detection rate and false alarm rate of our triple-filter method. we compare these metrics with the performance of the full dataset containing all the features and of each of the filter methods, using the j48 classifier.
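the four counts defined above are sufficient to compute all three metrics. the following is a minimal sketch assuming the usual confusion-matrix definitions (accuracy over all packets, detection rate over actual attacks, false alarm rate over actual normal packets); the function name and the example counts are hypothetical and are not results from this work.

```python
def ids_metrics(tp, fp, tn, fn):
    """Return classification accuracy, detection rate and false alarm rate in percent."""
    total = tp + fp + tn + fn
    return {
        # Share of all packets (attack and normal) that were classified correctly.
        "accuracy": 100.0 * (tp + tn) / total,
        # Share of actual attack packets that were flagged as attacks.
        "detection_rate": 100.0 * tp / (tp + fn),
        # Share of actual normal packets that were wrongly flagged as attacks.
        "false_alarm_rate": 100.0 * fp / (fp + tn),
    }


# Hypothetical counts: 1000 attack packets and 1000 normal packets.
metrics = ids_metrics(tp=950, fp=10, tn=990, fn=50)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}%")
```

keeping the three ratios in one function makes it explicit that they share the same confusion matrix, so a classifier cannot be credited with a high detection rate without its false alarms being reported alongside.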
the metrics used for comparison are defined as follows.

1. classification accuracy: this is defined as the ratio of the data classified correctly to the entire dataset, in percentage. the accuracy of a proposed technique can be derived using the formula:

$$\mathrm{CA} = \frac{TP + TN}{TP + TN + FP + FN} \times 100\% \qquad (7)$$

2. detection rate: the detection rate is usually based on the confusion matrix and can be determined by using the formula:

$$\mathrm{DR} = \frac{TP}{TP + FN} \times 100\% \qquad (8)$$

3. false alarm rate: this is the proportion of normal data that is misclassified as attack during detection. the false alarm rate can be determined by using the formula:

$$\mathrm{FAR} = \frac{FP}{FP + TN} \times 100\% \qquad (9)$$

table 3 presents the performance measure of our proposed ids defence solution with respect to the classification accuracy, detection rate, and false alarm rate.

7. discussion

intrusion detection in wsn during an attack can further increase the complexity and resource consumption of the sns. filter methods for feature selection are therefore preferred, since they are fast and easy to interpret when compared to wrapper methods. however, previous research has shown that a single filter method cannot determine features that are strong as a group but weak individually [39]. we have chosen to deploy our proposed ids on the cluster heads because we assume that the cluster heads have better battery life and higher software and hardware capability than the other nodes. fig. 3 shows the classification accuracy across different filter feature selection methods and our triple-filter method.

fig. 3 classification accuracy for different filter methods (full set, gain ratio, chi-squared, relieff, triple-filter)

as shown in fig. 3 and table 3, our proposed method exhibits the best accuracy. it presents a slight improvement of 0.01% over the chi-squared filter method, which gives the second best accuracy.
in fig. 4, the detection rate across the different filter methods and our proposed triple-filter is presented. the result shows that our proposed filter method with 14 selected features offers the best detection rate in comparison with the other filter methods. as shown in table 3 and fig. 4, the triple-filter method offers a slight increase in detection rate of 0.02% when compared with the next best filter feature selection method.

fig. 4 detection rate for different filter methods (full set, gain ratio, chi-squared, relieff, triple-filter)

for the false alarm rate, relieff presents the worst result of 0.87% while the full dataset (with all the features) shows the best performance, 0.38%. our proposed method presents a false alarm rate of 0.42%, as shown in fig. 5. although our proposed triple-filter method does not offer the best false alarm rate, it is still suitable for real-time wsns because it offers good classification accuracy and detection rate at a reduced complexity. note that a lightweight ids is an important requirement in order to prolong the lifetime and functionality of sensor networks.

fig. 5 false alarm rate for different filter methods (full set, gain ratio, chi-squared, relieff, triple-filter)

table 3 performance comparison of the triple-filter with full dataset, gain ratio, chi-squared and relieff filter methods

method | no. of features | accuracy | detection rate | false alarm rate
full set | 41 | 99.56% | 99.49% | 0.38%
gain ratio | 14 | 99.60% | 99.68% | 0.47%
chi-squared | 14 | 99.66% | 99.74% | 0.41%
relieff | 14 | 99.08% | 99.02% | 0.87%
triple-filter | 14 | 99.67% | 99.76% | 0.42%

finally, we compared the triple-filter method with a similar work in [10]. table 4 presents the performance comparison of the proposed triple-filter method with the work in [10].
table 4 performance comparison of the triple-filter with the work in [10]

filter method | classifier | no. of features | accuracy | detection rate | false alarm rate
svm-rfe [32] | bpn | 24 | 99.75% | 95.13% | 2.06%
triple-filter | j48 | 14 | 99.67% | 99.67% | 0.42%

as presented in table 4, the triple-filter feature selection, with 14 features, presents an improvement in the detection rate and the false alarm rate as compared with the work in [10] using the nsl-kdd dataset. this shows the efficiency of our proposed triple-filter feature selection method in improving the detection rate of the decision tree classifier with minimal false alarms while conserving the limited resources of the sensor network.

8. conclusion and future work

in this paper, we have proposed the combination of three filter feature selection methods, gain ratio, chi-squared and relieff, called triple-filter, to pre-process the dataset prior to attack classification. the proposed feature selection method is deployed in a heterogeneous cluster-based wsn, where the ids is implemented on the cluster head nodes. the proposed ids reduces the complexity of the system by selecting the important features in the dataset, thus reducing the features from 41 to 14 before classification using a decision tree algorithm, j48. the experimental results show an improved performance with the feature set reduced from 41 to 14. also, our proposed triple-filter feature selection method performed better than the individual filter methods using the j48 classifier. in the future, we seek to extend our work to study the effect of our solution on homogeneous wsns and also to evaluate our proposed triple-filter feature selection on other classification algorithms.

references

[1] i. almomani, b. al-kasasbeh, m. al-akhras, “wsn-ds: a dataset for intrusion detection systems in wireless sensor networks”, journal of sensors, pp. 1–16, 2016.
[2] c. o'reilly, a. gluhak, m. a. imran, s.
rajasegarar, “anomaly detection in wireless sensor networks in a non-stationary environment”, ieee communications surveys & tutorials, vol. 16, pp. 1413–1432, 2014. [3] o.o. ogundile, a. s. alfa (2017), “a survey on an energy-efficient and energy-balanced routing protocol for wireless sensor networks”, sensor, vol. 17, 1084, 1–51, 2017. [4] o. osanaiye, a. alfa, “denial of service defence for resource availability in wireless sensor networks”, ieee access, vol. 6, pp. 6975–7004, 2018. [5] h.m. salmon, et al, “intrusion detection system for wireless sensor networks using danger theory immune-inspired techniques”, international journal of wireless information networks, vol. 20, pp. 39–66, 2013. [6] v. f. taylor, d. t. fokum, “mitigating black hole attacks in wireless sensor networks using noderesident expert systems”, in proceedings of the ieee wireless telecommunications symposium (wts), pp. 1–7, 2014. [7] s. athmani, d.e. boubiche, a. bilami, “hierarchical energy efficient intrusion detection system for black hole attacks in wsns”, in proceedings of the ieee world congress computer and information technology (wccit), pp. 1–5, 2013. [8] o. osanaiye, r. choo, m. dlodlo, “distributed denial of service (ddos) resilience in cloud: review and conceptual cloud ddos mitigation framework”, journal of network and computer applications, vol. 69, pp. 1447–1465, 2016. [9] m. tavallaee, e. bagheri, w. lu, a. ghorbani, “a detailed analysis of the kdd cup 99 dataset”, in proceedings of the second ieee symposium on computational intelligence for security and defence applications cisda, pp. 1–6. [10] wang s.-s., yan k.-q., wang s.-c., liu c.-w. (2011) an integrated intrusion detection system for cluster-based wireless sensor networks. expert systems with applications, 38, 15234–15243. [11] o. osanaiye, h. cai, k.k.r. choo, a. dehghantanha, z. xu and m.
dlodlo, “ensemble-based multifilter feature selection method for ddos detection in cloud computing” eurasip journal on wireless communications and networking, vol. 130, pp. 1–10, 2016. [12] x. deng, “an intrusion detection system for cluster based wireless sensor networks”, in proceedings of the 16 th ieee international symposium on wireless personal multimedia communications (wpmc), 2013, pp. 1–5. [13] k. q. yan, s.c. wang, s.s. wang, c.w. liu, “hybrid intrusion detection system for enhancing the security of a cluster-based wireless sensor network”, in proceedings of the 3rd ieee international conference on computer science and information technology (iccsit), vol. 1, 2010, pp. 114–118. [14] k. medhat, r.a. ramadan, i. talkhan, “distributed intrusion detection system for wireless sensor networks”, in proceedings of the 9th ieee international conference on next generation mobile applications, services and technologies, 2015, pp. 234–239. [15] m. tiwar, k.v. arya, r. choudhari, k. s. choudhary, “designing intrusion detection to detect black hole and selective forwarding attack in wsn based on local information”, in proceedings of the 4th ieee international conference on computer sciences and convergence information technology iccit'09, 2009, pp. 824–828. [16] a.p.r. da silva, et al, “decentralized intrusion detection in wireless sensor networks” in proceedings of the 1st acm international workshop on quality of service & security in wireless and mobile networks, 2005, pp. 16–23. [17] jong k., marchiori e., sebag m., van der vaart a. (2004) feature selection in proteomic patten data with support vector machines. symposium on computational intelligence in bioinformatics and computational biology, 41–48. [18] j. yick, b. mukherjee, d. ghosal, “wireless sensor network survey”, computer networks, vol. 52, pp. 2292–2330, 2008. [19] i. f. akyildiz, w. su, y. sankarasubramaniam, e. cayirci, “a survey on sensor networks”, ieee communications magazine, vol. 40, pp. 
102–114, 2002 [20] j. heidemann, et al, “research challenges and applications for underwater sensor networking” in proceedings of the ieee wireless communications and networking conference, wcnc, 2006, pp. 228– 235. [21] i. f. akyildiz, t. melodia, k. r. chowdhury, “a survey on wireless multimedia sensor networks”, computer networks, vol. 51, pp. 921–960, 2007. [22] a. abduvaliyev, a.-s. k. pathan, j. zhou, r. roman, w.-c. wong, “on the vital areas of intrusion detection systems in wireless sensor networks”, ieee communications surveys & tutorials, vol. 15, pp. 1223–1237, 2013. [23] y. yu, k. li, w. zhou, p. li, “trust mechanisms in wireless sensor networks: attack analysis and countermeasures”, journal of network and computer applications, vol. 35, pp. 867–880, 2012. [24] c.c. su, k.m. chang, y.h. kuo, m.f. horng, “the new intrusion prevention and detection approaches for clustering-based sensor networks”, in proceedings of the ieee wireless communications and networking conference, vol. 4, 2015, pp. 1927–1932. [25] m. h. anisi, a. h. abdullah, s. a. razak, “energy-efficient and reliable data delivery in wireless sensor networks”, wireless networks, vol. 19, pp. 495–505. [26] p. kuila, p.k. jana, “energy efficient loadbalanced clustering algorithm for wireless sensor networks”, procedia technology, vol. 6, pp. 771–777, 2012. [27] p. kuila, s. k. gupta, p.k. jana, “a novel evolutionary approach for load balanced clustering problem for wireless sensor networks”, swarm and evolutionary computation, vol. 12, pp. 48–56, 2013. [28] p. kuila, p.k. jana, “approximation schemes for load balanced clustering in wireless sensor networks”, journal of supercomputing, vol. 68, pp. 87–105, 2014. [29] r. xie, x. jia, “transmission-efficient clustering method for wireless sensor networks using compressive sensing”, ieee trans. parallel distrib. syst., vol. 25, pp. 806–815, 2014. [30] o. a.
osanaiye, ddos defence for service availability in cloud computing. doctoral dissertation, university of cape town, 2016. [31] v. bolon-canedo, n. sanchez-marono, a. alonso-betanzos, “a review of feature selection methods on synthetic data”, knowledge and information systems, vol. 34, no. 3, pp. 483–519, 2013. [32] c. j. mantas, j. abellan, “credal-c4. 5 decision tree based on imprecise probabilities to classify noisy data”, expert systems with applications, vol. 41, pp. 4625–4637, 2014. [33] h.f. eid, a.e. hassanien, t.h. kim, s. banerjee, “linear correlation-based feature selection for network intrusion detection model”, in advances in security of information and communication networks, pp. 240–248, 2013. [34] m.b. yassein, y. khamayseh, m. abujazoh, “feature selection for black hole attacks”, journal of universal computer science, vol. 22, no. 4, pp. 521–536, 2016. [35] j. gehrke, v. ganti, r. ramakrishnan, w.y. loh, “boat-optimistic decision tree construction” in acm sigmod record, vol. 28, pp. 169–180, 1999. [36] n. sanchez-marono, a. alonso-betanzos, m. tombilla-sanroman, “filter methods for feature selection a comparative study”, intelligent data engineering and automated learning-ideal, pp. 178-187, 2007. [37] n. sengupta, j. sen, j. sil, m. saha, “designing of on line intrusion detection system using rough set theory and q-learning algorithm”, neurocomputing, vol. 111, pp. 161-168, 2013. [38] http://www.cs.waikato.ac.nz/ml/weka/, [online] access 2nd august 2017. [39] o. osanaiye, r. choo, m. dlodlo, “analysing feature selection and classification techniques for ddos detection in cloud”, in proceedings of the southern africa telecommunication, pp. 198-203, 2016.

facta universitatis series: electronics and energetics vol. 31, n o 3, september 2018, pp. 401-410 https://doi.org/10.2298/fuee1803401s single event transients monitoring and diagnostic in fpga georgy s.
sorokoumov national research nuclear university mephi (moscow engineering physics institute) moscow, russia

abstract. single event transients (sets) generated in field programmable gate arrays (fpgas) under heavy charged particle (hcp) irradiation are analysed, together with set suppression methods. a circuit for fpga set detection is designed for transients generated both inside the fpga and outside it, at the package pin level. a set is registered inside the fpga as an event in which a logic cell switches. the efficiency of the set control circuit has been comparatively verified using a heavy ion accelerator and a picosecond focused laser source. experimental results on sets in fpgas are presented and discussed.

key words: high-performance systems, space radiation, single event transients, digital integrated circuits, fpga, failures, hcp, vlsi, tmr

1. introduction

the development of high-performance systems, such as telecommunication technologies, orbital constellations for earth remote sensing, and navigation and global positioning systems (glonass, gps, galileo, beidou), requires high-performance microelectronic devices. the main trend in the design of such systems is an increase in data processing speed. this increase is obtained both by memory cluster growth and by raising the operating frequencies of integrated circuits (ics). this trend requires microchips with design rules below 250 nm, which leads to lower supply voltages and higher element density on a chip. one of the main components used to build high-performance systems is the field programmable gate array (fpga). a distinctive feature of fpgas is that their logical structure can easily be reconfigured within the basis of logic elements implemented in the device.
modern fpgas provide both elementary logic structures, built from the look-up table (lut) cells that make up the fpga, and complete built-in ip units, such as memory, phase-locked loops (pll), codecs and decoders for various purposes, microcontroller and microprocessor cores, etc. due to these characteristics, fpgas are used to implement autonomous high-performance systems in on-board space equipment.

received september 19, 2017; received in revised form march 7, 2018
corresponding author: georgy s. sorokoumov, national research nuclear university mephi (moscow engineering physics institute), moscow, russia (e-mail: gssor@spels.ru)

however, in real operating conditions ionizing radiation from the space environment affects all electronic devices and may lead to ic upsets, failures and damage. the space environment induces two main ionizing radiation effects in microelectronic devices: total ionizing dose (tid) [1] – [5] and single event effects (see) [30] – [32]. tid effects degrade ic element parameters through the accumulation of radiation-induced charge in the silicon dioxide insulator, which shifts the threshold voltage of mos transistors and increases the leakage current at the edges of nmos transistors. as design rules shrink, the gate silicon dioxide becomes thinner, and therefore tid effects in mos transistors are not of great importance for modern submicron devices. see caused by hcp are crucial to ics. see such as single event upsets (seu) and single event transients (set) can invert the logic state of digital elements. modern design rules make it possible to increase the operating frequency and to reduce the input capacitance of ic elements. a higher operating frequency raises the probability that a set on a data line causes real data corruption.
a small input capacitance of an ic element lowers the linear energy transfer (let) threshold at which the logic state of a digital element can change. if we assume a typical submicron process capacitance of 0.1–0.4 pf, the induced charge can form a voltage pulse up to volts (for a typical 1.2 v power supply voltage), which can change the logic state of a digital circuit element. for these reasons, the dominance of set over seu in modern circuits is the most significant effect for the design of high-reliability systems for space applications. the main goal of this work is to demonstrate how sets in fpgas can be registered and what their possible consequences are. the paper analyses set research methods and the appropriate ground facilities, and describes original experimental results on sets in fpgas.

2. typical schemes of sets effects research in digital ics

modelling of set effects in different combinations of asynchronous logic elements is presented in [6] – [9]. analysis of these data shows that the authors use two basic combinations: parallel (fig. 1.a) and series, or "consistent" (fig. 1.b), chains of logic circuit elements. for sets in analog ics, the authors of [10], [11] cite amplitude and duration as the significant characteristics. for digital sets such characteristics are redundant. the most important question for a real system is how sets propagate through internal ic elements to the external package pins and how they are latched by ic trigger structures. taking this into account, the logic chain variants presented in fig. 1 are not sufficient for set research in digital ics. they give information about set propagation to an external pin, as well as about set duration and amplitude, but no information about the latching of sets by internal trigger structures. set simulation for a sub-micron process is presented in [12].
tcad results show that the set duration should be from tens to hundreds of ps for a submicron process (see fig. 2), which is in good agreement with experimental results.

a. parallel chain b. consistent chain
fig. 1 examples of asynchronous logic elements chains used for sets investigation

fig. 2 tcad and on-chip measured time duration of sets in sub-micron process [12]

3. fpga configuration for sets monitoring and diagnostics in ground experiments

fig. 3 presents the proposed schematic diagram for monitoring and diagnostics of sets in an fpga. the investigated sets are generated in a chain of asynchronous logic (in this example the chain consists of 195 inverters connected in series). the number of inverters in the chain is a compromise. on the one hand, a large number of inverters increases the probability of a set. on the other hand, the logical capacity of the ic must be taken into account. thus, the number of inverters in the chain should be as high as the logical capacity allows. the external signal 'inv_in' sets the logical state of the chain output signal. the chain output is connected directly to an external package pin of the ic under test (signal 'inv_out') and also to the clock inputs of three d flip-flops. connecting the output of the inverter chain to three d flip-flops makes it possible to register the occurrence of a set capable of switching the logical state of the ic digital elements. the occurrence of a set is monitored through the output state of the triple modular redundancy (tmr) element, which is connected to the outputs of the d flip-flops, and through the state of the 'tmr_out[2..0]' signals, which are the d flip-flop outputs connected to external package pins of the ic. in ion accelerator experiments all ic elements are irradiated, whereas with the focused laser source the irradiation affects only a limited area of the chip.
in the experiments on the focused laser source the irradiated area does not include the tmr element. the tmr element in this scheme makes it possible to distinguish a set in the inverter chain from an seu in a d flip-flop. the output 'inv_out' is used to observe the shape of the transient that has propagated to the external pin of the investigated ic. the set shape is observed with an oscilloscope that records the signal at the package pin to which the signal line 'inv_out' is connected.

fig. 3 the schematic diagram for sets in fpga monitoring and diagnostics

transient registration is based on the following algorithm: an hcp ionizes part of the ic semiconductor; the induced charge is collected by the active region of an element in the inverter chain; the collected charge switches the logic state of the inverter and generates a voltage pulse at the chain output. the short transient is seen by the clock inputs of the d flip-flops as a synchronization signal, and the input data (logical '1') of the d flip-flops is latched. after the set event an external reset (signal 'reset') is sent, and the outputs of the d flip-flops return to the state '0'. the set registration diagram (see fig. 3) includes a block, the duration measurement unit (dmu), responsible for analysing the set duration (see fig. 4). the dmu is built from logic elements available in the ic under test. it measures the set duration in units of the switching time of a logic element of the ic. as shown in fig. 4, the dmu consists of two parts. the first part measures the set duration as the number of switched logic elements. the second part is a functional block that unloads the data on the switched logic elements.

fig. 4 set duration measuring diagram based on logic elements.
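The registration algorithm described above can be illustrated with a small behavioural model. This is only a sketch under stated assumptions, not the author's circuit: the class and method names are hypothetical, every transient is assumed to be wide enough to clock all three flip-flops, and the gate delay used by the duration measurement is an arbitrary illustrative value.

```python
from dataclasses import dataclass


@dataclass
class DFlipFlop:
    """Edge-triggered D flip-flop whose D input is tied to logical '1'."""
    q: int = 0

    def clock_edge(self) -> None:
        self.q = 1  # the constant '1' is latched on a clock edge

    def reset(self) -> None:
        self.q = 0


def majority(a: int, b: int, c: int) -> int:
    """Two-out-of-three voter, standing in for the TMR element."""
    return (a & b) | (a & c) | (b & c)


class SetMonitor:
    """Hypothetical model of the three flip-flops, the TMR voter and the DMU."""

    def __init__(self, gate_delay_ps: float) -> None:
        self.flip_flops = [DFlipFlop() for _ in range(3)]
        self.gate_delay_ps = gate_delay_ps  # assumed switching time of one element
        self.duration_gates = 0

    def on_transient(self, duration_ps: float) -> None:
        """A SET at the chain output acts as a clock for all three flip-flops."""
        for ff in self.flip_flops:
            ff.clock_edge()
        # DMU idea: duration expressed as a count of switched logic elements.
        self.duration_gates = int(duration_ps // self.gate_delay_ps)

    def upset_flip_flop(self, index: int) -> None:
        """An SEU flips the state of a single flip-flop."""
        self.flip_flops[index].q ^= 1

    def reset(self) -> None:
        """External 'reset': all flip-flop outputs return to '0'."""
        for ff in self.flip_flops:
            ff.reset()
        self.duration_gates = 0

    @property
    def ff_out(self) -> list:
        return [ff.q for ff in self.flip_flops]

    @property
    def tmr_out(self) -> int:
        return majority(*self.ff_out)


monitor = SetMonitor(gate_delay_ps=100.0)
monitor.on_transient(duration_ps=450.0)
print(monitor.ff_out, monitor.tmr_out, monitor.duration_gates)
monitor.reset()
monitor.upset_flip_flop(1)
print(monitor.ff_out, monitor.tmr_out)
```

In this model a SET in the chain sets all three flip-flop outputs and therefore the voter output, while an SEU in one flip-flop changes only one of the three individual outputs and leaves the voter at '0'. That difference is what lets the scheme separate the two kinds of event.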
the dmu schematic diagram is presented in fig. 5. the set duration is measured as the number of inverters that switch while the transient is in the logical '1' state. the feedback of the signal 'set_in' to the 'and' element prevents accidental latching of the state '1' by the d flip-flop if a set occurs in the duration measurement circuit itself.

fig. 5 dmu schematic diagram.

the unit that unloads the set duration data can be implemented in various ways. if it is implemented as a fifo with parallel loading, the signal group 'unload' consists of the parallel loading signal 'load', the load enable signal 'ce' and the clock signal 'clk'. alternatively, the unit can be implemented as a multiplexer; in this case 'unload' is the group of signals driving the address inputs of the multiplexer (see fig. 6). other implementations are also possible.

fig. 6 set duration data downloading unit in mux realization.

4. experimental verification of the setup for monitoring sets in fpga

the proposed scheme for set monitoring was verified at the u-400m cyclotron of heavy short-range ions (jinr, dubna, russia) [13] and at the piko-3 source of focused laser radiation (spels, moscow, russia) [14], [15]. the research covered three types of fpgas built in an antifuse process and one cmos asic with 180 nm design rules. in future we are going to investigate sets in a test chip with 90 nm design rules. sets were detected in all devices under test (dut) and in all experiments. the most interesting results were found in the first type of irradiated fpga; in the other duts sets of regular (traditional bell-shaped) form were registered. sets with durations from hundreds of ns to several microseconds and with substantial amplitude fluctuations were detected at the u-400m ion accelerator.
the amplitude of the registered sets ranged from hundreds of mv to the supply voltage limit. sets were registered under ions with let from 7 mev·cm²/mg (si) to 69 mev·cm²/mg (si) (all ion lets available at the facility). at the second research stage the proposed method for investigating sets in antifuse fpgas was verified using the picosecond laser source piko-3. the ic chip was irradiated from the top side by laser radiation with a wavelength of 1.064 μm. at the initial stage of the experiment, the chip was scanned with a step of 50 μm and an energy of 300 nj. while the chip surface was scanned, sets were monitored at the dut external pin (line 'inv_out'), at the tmr output line (line 'tmr_out') and at the three d flip-flop output signals (tmr_out[2..0]). after the scan, the specific places on the chip layout responsible for set generation were localized and the stability of the registered set generation was confirmed.

fig. 7 laser and ion facilities sets comparison (amplitude, v, versus time, ns; traces: laser facility 'pico-3' and ion facility 'u400m')

the laser experiments gave the following results:
• places in the fpga layout where transients are generated were found;
• it was established that all the generated sets trigger the tmr;
• the energy threshold of set generation in the picosecond focused laser test was determined to be 200 nj.

a comparison of the shapes and durations of the transients registered in the ion accelerator and focused laser tests (see fig. 7) demonstrates that laser methods are applicable for investigating single events in digital vlsi [16] – [18] and that their results are in good agreement with heavy ion accelerator results.
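The per-element cross sections reported in table 1 follow the usual definition: the number of registered events divided by the ion fluence and by the number of sensitive elements. The sketch below is a hedged illustration; the function name is hypothetical, and normalising the SET cross section by the 195 inverters of the chain is an assumption inferred from the text rather than something the paper states explicitly.

```python
def cross_section(events: int, fluence_per_cm2: float, n_elements: int) -> float:
    """Cross section per element, in cm^2 per gate (or per bit).

    events          -- number of SETs or SEUs registered in the session
    fluence_per_cm2 -- accumulated ion fluence, ions/cm^2
    n_elements      -- number of sensitive elements (assumed normalisation)
    """
    if fluence_per_cm2 <= 0 or n_elements <= 0:
        raise ValueError("fluence and element count must be positive")
    return events / (fluence_per_cm2 * n_elements)


# Xe, DUT 1 row of table 1: 101 SETs at a fluence of 7,2e+06 ions/cm^2,
# normalised by an assumed chain length of 195 inverters.
sigma_set = cross_section(events=101, fluence_per_cm2=7.2e6, n_elements=195)
print(f"{sigma_set:.2e} cm^2/gate")
```

Under this assumption the Xe, DUT 1 row gives roughly 7.2e-08 cm²/gate, close to the 7,1e-08 listed in table 1; the small difference is consistent with the fluence being quoted to two significant figures.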
some abnormal set durations and shapes were found to be associated with the discharge of the hcp-induced charge through the large resistance of disabled ram cells in antifuse mode. in the next fpga chip versions this design mistake was corrected and the abnormal sets disappeared. seu in ram and set effects in one type of fpga were also investigated in accelerator experiments. table 1 presents the summary information of set and seu control. fig. 8 shows the experimental set and seu cross sections. the following conventions are used in table 1:
• let [mev×cm²/mg] – linear energy transfer;
• f [1/cm²] – ion fluence accumulated during the irradiation session;
• σset / σseu [cm²/gate / cm²/bit] – set/seu cross section measured for one inverter/bit;
• dut – device under test;
• nset / nseu – number of sets/seus registered during the ion irradiation session.

table 1 experimental results of seu and set control in fpga

ion   let,          dut   f,        nset   σset,      f,        nseu   σseu,
      mev×cm²/mg          1/cm²            cm²/gate   1/cm²            cm²/bit
xe    ≈ 65          1     7,2e+06   101    7,1e-08    6,3e+06   72     4,5e-08
                    2     7,0e+06   120    8,6e-08    6,1e+06   55     3,6e-08
kr    ≈ 40          1     1,2e+07   103    4,3e-08    4,2e+06   25     2,4e-08
                    2     1,2e+07   105    4,4e-08    4,2e+06   30     2,9e-08
ar    ≈ 17          1     2,7e+06   3      6,8e-09    2,7e+07   49     7,2e-09
                    2     2,5e+07   21     4,3e-09    2,5e+07   55     8,7e-09
                    3     2,3e+07   11     2,5e-09
ne    ≈ 7           1     1,6e+07   0      3,7e-10
                    2     1,6e+07   0      3,7e-10
                    3     1,7e+07   0      3,7e-10

as can be seen from fig. 8, even in asynchronous combinational logic in static mode sets can occur at a rate close to that of seus. this means that it is important to control not only seu but also set. with an unfavourable design of the synchronization lines in an fpga, for example, a set can significantly affect the functioning of the device as a whole.

fig. 8 set (σset) and seu (σseu) cross section vs heavy ions let (axes: σ, cm²/gate (bit), versus let, mev×cm²/mg)

5. conclusions

research on sets in digital ics is an important aspect of fault-tolerant system design for space applications. the presented experimental results demonstrate that hcp generate sets in modern vlsi and that these sets can falsely trigger combinational circuits and thereby change the information stored in trigger structures, i.e. a set turns into an seu. prediction of the amplitude and time characteristics of the generated sets is necessary for developing set filtering and correction circuits for space applications. the comparative experimental results obtained at the heavy ion accelerator u-400m and the focused picosecond laser source piko-3 are in good agreement. a laser source makes it possible to localize the semiconductor structures responsible for set generation and for the conversion of sets into seus. focused picosecond laser sources, especially in combination with a heavy ion accelerator, are a very informative and efficient tool for experimental research and prediction of sets.

references [1] d. boychenko, o. kalashnikov, a. nikiforov, a. ulanova, d. bobrovsky, p. nekrasov, "total ionizing dose effects and radiation testing of complex multifunctional vlsi devices," facta universitatis, series: electronics and energetics, vol. 28, no. 1, pp. 153-164, 2015. [2] a. sogoyan, a. artamonov, a. nikiforov, d. boychenko, "method for integrated circuits total ionizing dose hardness testing based on combined gamma- and x-ray irradiation facilities," facta universitatis, series: electronics and energetics, vol. 27, no. 3, pp. 329-338, 2014. [3] o.a. kalashnikov and a.y. nikiforov, “tid behavior of complex multifunctional vlsi devices,” in proceedings of the 29th int. conf. on microelectronics, miel 2014, belgrade, serbia, may 2014, pp. 455-458. [4] a. karakozov, o. korneev, p. nekrasov, et.
al, “bias conditions and functional test procedure influence on powerpc7448 microprocessor tid tolerance”, in proceedings of the radecs, 2013. pp. 1-2. [5] o. kalashnikov, a. nikiforov. “tid behavior of complex multifunctional vlsi devices”, in proceedings of the international conference on microelectronics, icm, 2014, pp. 455-458. [6] f. l. kastensmidt, l. tambara, d. v. bobrovskiy, a. a. pechenkin, and a. y. nikiforov, “laser testing methodology for diagnosing diverse soft errors in a nanoscale sram-based fpga,” ieee trans. nucl. sci., vol. 61, no. 6, pp. 3130-3137, 2014. [7] b. narasimham, b.l. bhuva, r.d. schrimpf, l.w. massengill, m.j. gadlage, o.a. amusan, w.t. holman, a.f. witulski, w.h. robinson, j.d. black, j.m. benedetto, and p.h. eaton, “characterization of digital single event transient pulse-widths in 130-nm and 90-nm cmos technologies”, ieee trans. nucl. sci., vol. 54, no. 6, pp. 2506-2511, 2007. [8] b. narasimham, v. ramachandran, b. l. bhuva, r. d. schrimpf, a. f. witulski, w. t. holman, l. w. massengill, j. d. black, w. h. robinson and d. mcmorrow, “on-chip characterization of single event transient pulse widths”, ieee trans. on device and materials reliability, vol. 6, no. 4, pp. 542-549, 2006. [9] kartik mohanram, “simulation of transients caused by single-event upsets in combinational logic”, in proceedings of the ieee international conference on test, pp. 981-990, 2005. [10] a. zanchi, s. buchner, y. lotfi, s. hisano, c. hafer, d. kerwin, “correlation of pulsed-laser energy and heavy-ion let by matching analog set ensemble signatures and digital set thresholds”, ieee trans. nucl. sci., vol. 60, no. 6, pp. 4412 – 4420, 2013. [11] r.m. chavez, l.z. scheick, t.f. miyahira, a.h. johnston,“single event transients (sets) in the rh108 operational amplifier in analog circuits”, in proceedings of the ieee radiation effects data workshop, 2006, pp. 154 – 159. 
[12] cadence tool use – single-event transient pulse-width measurement in advanced cmos technologies, url: http://www.isde.vanderbilt.edu/rer/cadence/2013-research-projects-using-cadencetools/cadence-tool-use-set-pulse-width-measurement-in-advanced-cmos-technologies [13] v.a. skuratov, y.g. teterev, v.b. zager, a.i. krylov, i.v. kalagin, g.g. gulbekyan, v.s. anashin, “ion beam diagnostics for see testing at u400m flnr jinr cyclotron”. in proceedings of the radec2012, pp. 756 – 759. [14] a.n. egorov, "“pico-4” single event effects evaluation and testing facility based on wavelength tunable picosecond laser, " in proceedings of the ieee radiation effects data workshop 2012, pp. 6972. [15] a.n. egorov, et al., "laser «pico» family simulators for testing electronic components for resistance to hcp. russia, " specialized machinery and communication, no. 4-5, pp. 8–13, 2011. [16] a.i. chumakov, a.a. pechenkin, d.v. savchenkov, a.s. tararaksin, a.l. vasil'ev, and a.v. yanenko, “local laser irradiation technique for see testing of ics,” in proc. of the 12th european conf. on radiation and its effects on components and systems, radecs-2011, sevilla; spain; sept. 19 -23, 2011, pp. 449-453. [17] d.v. savchenkov, a.i. chumakov, a.g. petrov, a.a. pechenkin, a.n. egorov, o.b. mavritskii and a.v. yanenko, “study of sel and seu in sram using different laser techniques” in proc. 14 th european conf. on radiation and its effects on components and systems, radecs-2013, oxford; united kingdom; sept. 23 -27, article number 6937411. [18] a.g. petrov, a.l. vasil'ev, a.v. ulanova, a.i. chumakov and a.y. nikiforov, “flash memory cells data loss caused by total ionizing dose and heavy ions,” central european journal of physics, vol. 12, no. 10, pp. 725-729, 2014. [19] m. berg, “field programmable gate array (fpga) single event effect (see) radiation testing”. 2012, nasa electronic parts and packaging (nepp); and defense threat reduction agency under iacro #11-4395i [20] a. evans, d. alexandrescu, “see test report. single event transient (set) measurement microsemi a3p3000l fpgas”, iroc technologies. wtc pox 1510. grenoble, 2014 [21] c. jacoboni, "a review of some charge transport properties of silicon", solid state electron, vol. 20, no. 2, pp. 77-89, 1977. [22] keith s. morgan, daniel l. mcmurtrey, brian h. pratt, and michael j.
facta universitatis series: electronics and energetics vol. 27, no 3, september 2014, pp. 425-433 doi: 10.2298/fuee1403425b
relevance of the types and the statistical properties of features in the recognition of basic emotions in speech
milana bojanić, vlado delić, milan sečujski faculty of technical sciences, university of novi sad, serbia
abstract. due to the advance of speech technologies and their increasing usage in various applications, automatic recognition of emotions in speech represents one of the emerging fields in human-computer interaction. this paper deals with several topics related to automatic emotional speech recognition, most notably with the improvement of recognition accuracy by lowering the dimensionality of the feature space and evaluation of the relevance of particular feature types. the research is focused on the classification of emotional speech into five basic emotional classes (anger, joy, fear, sadness and neutral speech) using a recorded corpus of emotional speech in serbian.
key words: emotional speech recognition, acoustic features, basic emotions
1.
introduction
basic emotion is a term used in categorical emotion models, among which ekman's concept of six basic emotions is the most prominent one. his theory of basic emotions, which are "psychological universals and constitute a set of basic, evolved functions that are shared by all humans", is supported by experimental findings of cross-culturally recognized emotions from vocal signals and facial expressions [1]. from the beginning of its development, emotional speech recognition (esr) studies have used corpora of acted emotional speech, since those corpora were easy to collect. such corpora usually contained several basic emotions reproduced by actors [2]. there are apparently reasonable objections to acted speech corpora: acting emotions is not the same as producing 'spontaneous' emotions, and within human-machine interaction emotion-related states are much more common than prototypical full-blown emotions (such as those represented in acted speech corpora) [3]. still, recent research has shown that the relationships between acted emotions and their acoustic correlates and between real-life emotions and their acoustic correlates do not necessarily contradict each other [4].
received february 10, 2014; received in revised form march 13, 2014 corresponding author: milana bojanić, university of novi sad, faculty of technical sciences, trg dositeja obradovića 6, 21000 novi sad, serbia (milana.bojanic@uns.ac.rs)
a more flexible solution to the problem of the representation of emotional states is to represent them as points in a continuous 2d space whose co-ordinates are the activation and evaluation involved in the emotional state [5]. such dimensional models also allow for the mapping of basic emotions into the continuous 2d emotional space [5, 6], thus enabling a broad field of application of the recognition of basic emotions in speech.
the paper summarizes our approach to the recognition of basic emotions in speech, focusing particularly on the improvement of recognition accuracy by lowering the dimensionality of the feature space. additionally, a feature selection procedure has been performed in order to rank feature types and used statistical functionals. the presented research has been conducted on a corpus of acted emotional speech in serbian. the paper is organized as follows. aspects of the proposed approach that are relevant to the recognition of basic emotions, including acoustic modeling, classification scheme and speech corpus, are presented in section 2. in section 3, theoretical background about feature dimensionality reduction techniques is given and their possible benefits are pointed out. experimental results are shown and discussed in section 4. finally, the conclusions are given in section 5. 2. the proposed approach 2.1 the proposed approach to acoustic modeling the proposed approach to acoustic modeling is based on the statistical analysis of acoustic feature contours [7, 8] and it is performed in three stages, as shown in fig. 1. the first stage includes the extraction of acoustic features on a frame basis. these features belong to two acoustic feature sets, namely prosodic and spectral feature set. in the prosodic feature set, pitch and energy are extracted. as to spectral feature set, only the first 12 mfccs are taken into account in our analysis, since they correspond to slow changes in the spectrum, i.e., the spectrum envelope. the feature contours which correspond to the pitch contour, energy contour and mfcc contour, are, respectively, sequences of short-term pitch, energy and mfcc values extracted on a frame basis. fig. 
1 feature extraction process in three stages
the extracted features are forwarded to the second stage, in which the first derivative of the acoustic features is calculated in order to model the dynamics of speech. the first derivative carries information about the dynamics of emotional speech, which is useful in emotional speech classification [4].
the third stage of the feature extraction process involves a statistical analysis of the feature contours. the final feature set is obtained from the feature contours by applying so-called static modeling through functionals [9]. in the literature, larger numbers of statistical features are analyzed [10, 11]. our selection of statistical functionals was guided by the principle that the chosen statistical features should describe the variations and follow the trend of changes of the acoustic features correlated with different types of emotional speech. at the same time, since it was impossible to predict which statistical characteristics would be the most effective, the proposed set included 12 statistical features, bearing in mind that if particular information in the feature vector proved to be redundant and detrimental to classification, an efficient subset of features would be extracted using a dimensionality reduction technique.
the proposed set of 12 statistical functionals has been chosen from the three most frequently used groups of functionals [9]. these groups and their corresponding functionals are [7]:
1. the first four moments (mean, standard deviation, skewness and kurtosis),
2. extrema and their positions (minimum, maximum, range, relative position of minimum and relative position of maximum),
3.
regression coefficients (the slope and the offset of the linear regression of the contour) and regression error (the mean squared error between the regression curve and the original contour).
by applying the proposed procedure, three sets of features have been extracted [7]. the first feature set includes only prosodic features (pitch and energy) and will be referred to as the prosodic feature set (p-fs). the second feature set includes only spectral features (12 mfcc) and will be referred to as the spectral feature set (s-fs). finally, the third feature set includes both prosodic and spectral features, and additionally the voicing probability and the zero crossing rate. for these 16 features, the first derivative is calculated, and then the 12 functionals are applied to all of them, resulting in 384 features extracted for each utterance. the third feature set will be referred to as the prosodic-spectral feature set (ps-fs).
2.2 classification scheme
for the purpose of emotional speech classification, we have considered the linear discriminant classifier (ldc) and the k-nearest neighbours classifier (knn), as they are well-known, simple classifiers which have been used by other researchers for this purpose and which have proved to be successful for both acted and spontaneous emotional speech [9]. as for ldc, two classification schemes have been considered. the first one is the linear bayes classifier, with the underlying assumption that classes have gaussian densities and equal covariance matrices. the second one is the derivation of linear discriminant functions via the perceptron rule [12]. in the latter case, no assumptions have been made about the underlying class densities.
2.3 emotional speech corpus
the research was conducted on the corpus of emotional and attitude expressive speech (gees, according to the serbian acronym), which is the first speech corpus recorded in serbian for the purpose of research on acoustic manifestations of emotions in human speech in the context of speech technology [13]. it contains recordings of acted speech-based emotional expressions corresponding to five basic emotional states: anger, joy, fear, sadness, and neutral, reproduced by six actors (3 female, 3 male). the underlying textual material is emotionally neutral with respect to lexical content, and for the purpose of this study a section of the corpus including 30 short and 30 long sentences was used. the reported human recognition accuracy for this corpus is 94.7%. to avoid an imbalance between male and female speakers, an equal portion of the material from each emotional class belonging to each speaker was chosen, and a total of 1740 sentences (75 minutes of speech) have been processed. both training and test sets included utterances from all speakers. therefore, these experiments belong to the case of speaker-dependent emotion recognition.
3. dimensionality reduction
dimensionality reduction can be performed through feature extraction or feature selection. while feature extraction employs a mapping (usually linear) of a given feature space onto a lower dimensional space, creating a feature subset which is a combination of existing features, feature selection involves a selection of a subset from the existing features without any transformation.
3.1 linear discriminant analysis
linear discriminant analysis (lda) is a linear feature extraction technique whose goal is the enhancement of the class-discriminatory information in a lower dimensional feature space. fisher's lda for a two-class problem is based on a search for a projection that maximizes the ratio of between-class to within-class scatter.
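the two-class criterion can be illustrated numerically. the sketch below is not the authors' implementation: it uses hypothetical 2-d toy points (`class_a`, `class_b`) and the textbook closed-form solution w = s_w^{-1}(mu_a - mu_b), and then compares the between-class to within-class scatter ratio along w with the ratio along the two coordinate axes.

```python
# two-class fisher discriminant on 2-d toy data.
# all data values and function names are illustrative assumptions.

def class_mean(points):
    n = len(points)
    return [sum(p[k] for p in points) / n for k in range(2)]

def class_scatter(points, mu):
    """2x2 scatter matrix: sum of (x - mu)(x - mu)^T over one class."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - mu[0], p[1] - mu[1]]
        for r in range(2):
            for c in range(2):
                s[r][c] += d[r] * d[c]
    return s

def fisher_direction(class_a, class_b):
    """projection direction w = s_w^{-1} (mu_a - mu_b)."""
    mu_a, mu_b = class_mean(class_a), class_mean(class_b)
    s_a, s_b = class_scatter(class_a, mu_a), class_scatter(class_b, mu_b)
    s_w = [[s_a[r][c] + s_b[r][c] for c in range(2)] for r in range(2)]
    det = s_w[0][0] * s_w[1][1] - s_w[0][1] * s_w[1][0]
    diff = [mu_a[0] - mu_b[0], mu_a[1] - mu_b[1]]
    # closed-form inverse of the 2x2 within-class scatter matrix
    return [
        (s_w[1][1] * diff[0] - s_w[0][1] * diff[1]) / det,
        (-s_w[1][0] * diff[0] + s_w[0][0] * diff[1]) / det,
    ]

def scatter_ratio(w, class_a, class_b):
    """between-class over within-class scatter of the 1-d projections."""
    proj_a = [w[0] * p[0] + w[1] * p[1] for p in class_a]
    proj_b = [w[0] * p[0] + w[1] * p[1] for p in class_b]
    m_a = sum(proj_a) / len(proj_a)
    m_b = sum(proj_b) / len(proj_b)
    within = sum((v - m_a) ** 2 for v in proj_a) + sum((v - m_b) ** 2 for v in proj_b)
    return (m_a - m_b) ** 2 / within

class_a = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0), (2.0, 1.5)]
class_b = [(5.0, 1.0), (6.0, 2.0), (7.0, 2.5), (6.0, 0.5)]

w = fisher_direction(class_a, class_b)
print("fisher direction:", w)
print("ratio along w:", scatter_ratio(w, class_a, class_b))
print("ratio along x axis:", scatter_ratio([1.0, 0.0], class_a, class_b))
print("ratio along y axis:", scatter_ratio([0.0, 1.0], class_a, class_b))
```

on such data the ratio along w should be at least as large as along any other direction, which is the property the projection is chosen for.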
the solution is in a specific choice of direction for the projection of the data where the examples from the same class are projected so as to be very close to each other and, at the same time, the projected class means are as far from each other as possible [14].
fisher's lda generalizes easily to a c-class problem (in our case c = 5, since we deal with 5 emotional classes). since the projection is no longer a scalar (it has c−1 dimensions), the determinants of the scatter matrices are used to obtain an objective function. the between-class scatter matrix represents the scatter of the class mean vectors around the mixture mean, defined as:
$$S_b = \sum_{i=1}^{c} n_i (\mu_i - \mu)(\mu_i - \mu)^T, \qquad (1)$$
where $\mu_i = \frac{1}{n_i}\sum_{x \in c_i} x$ is the mean vector of each class in the original feature space x, and $\mu = \frac{1}{n}\sum_{x} x$ is the mean vector of the mixture distribution. the within-class scatter matrix shows the scatter of samples around their respective class mean vectors, and is expressed by:
$$S_w = \sum_{i=1}^{c} S_i, \qquad (2)$$
where
$$S_i = \sum_{x \in c_i} (x - \mu_i)(x - \mu_i)^T. \qquad (3)$$
it can be shown that the optimal projection matrix $W^* = [w_1^* \,|\, w_2^* \,|\, \dots \,|\, w_{c-1}^*]$ is the one whose columns are the eigenvectors corresponding to the largest eigenvalues of the following generalized eigenvalue problem [14]:
$$(S_b - \lambda_i S_w)\, w_i^* = 0. \qquad (4)$$
the projections with maximum class separability information are the eigenvectors corresponding to the largest eigenvalues $\lambda_i$ of the matrix $S_w^{-1} S_b$.
3.2 feature selection
the drawback of feature extraction methods is that they are not very appropriate for feature mining, as the original features are not retained after the transformation [9]. in order to gain an insight into the significance of particular features, feature selection was used. we adopted sequential forward feature selection (sffs) as the search strategy and wrapper based evaluation as the objective function.
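a wrapper-driven forward search of this kind can be sketched as follows. this is a minimal illustration under stated assumptions, not the authors' code: the search is the plain greedy forward variant (no floating or backward steps), the wrapper is a nearest-class-mean classifier scored by leave-one-out accuracy as a stand-in for the linear bayes classifier, and the data and names (`forward_select`, `loo_accuracy`, `samples`, `labels`) are hypothetical.

```python
# greedy forward feature selection with a wrapper objective.
# the classifier, the data and all names are illustrative assumptions.

def loo_accuracy(samples, labels, subset):
    """leave-one-out accuracy of a nearest-class-mean classifier
    that only sees the features listed in `subset`."""
    correct = 0
    for i, (x, y) in enumerate(zip(samples, labels)):
        means = {}
        for c in sorted(set(labels)):
            # class means are computed without the held-out sample i
            rows = [s for j, (s, l) in enumerate(zip(samples, labels))
                    if l == c and j != i]
            means[c] = [sum(r[f] for r in rows) / len(rows) for f in subset]

        def dist(c):
            return sum((x[f] - m) ** 2 for f, m in zip(subset, means[c]))

        predicted = min(means, key=dist)
        correct += predicted == y
    return correct / len(samples)

def forward_select(n_features, n_select, score):
    """start from an empty set and repeatedly add the feature that gives
    the highest score together with the already selected ones."""
    selected = []
    remaining = list(range(n_features))
    while remaining and len(selected) < n_select:
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected

# toy data: feature 0 separates the two classes, features 1 and 2 do not
samples = [
    (0.0, 5.0, 1.0), (0.2, 1.0, 0.0), (0.1, 3.0, 1.2), (0.3, 4.0, 0.1),
    (1.0, 4.5, 1.1), (1.2, 1.5, 0.2), (0.9, 3.5, 1.0), (1.1, 2.0, 0.0),
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

chosen = forward_select(3, 2, lambda subset: loo_accuracy(samples, labels, subset))
print("selected feature indices:", chosen)
```

passing the objective as a callable keeps the search independent of the classifier, so any wrapper with a cross-validated recognition rate can be substituted.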
sffs starts the selection with an empty set and sequentially adds the feature that results in the highest value of the objective function when combined with the already selected features [15]. in the case of wrappers, the objective function is a classifier which evaluates feature subsets by their recognition rate on test data employing cross-validation. in our case, the linear bayes classifier was selected as the wrapper, as it had shown the best performance in previous recognition tests [7]. ideally, feature selection methods should not only reveal the single most relevant attribute (or groups thereof), but they should also decorrelate the feature space [9]. feature selection results in a reduced, interpretable set of significant features; their counts and weights in the selection set allow us to draw conclusions on the relevance of the feature types they belong to [16]. the feature set used in our feature selection experiments was ps-fs. since it is a combination of both prosodic and spectral features, the relevance of particular feature types within ps-fs was expected to be evaluated.
4. experimental results
the focus of the research was on the investigation of a possible improvement of recognition accuracy in the case of a reduced feature space in the task of basic emotions classification. therefore, the performance of each classifier was tested in two ways: (1) using the 3 extracted feature sets (p-fs, s-fs, ps-fs), and (2) using the 3 feature sets obtained after lda feature reduction has been applied to the 3 initial feature sets. the experiments were carried out using 3 classification techniques (the knn classifier, the linear bayes classifier and the perceptron rule).
table 1 shows the class and average recognition rate of the knn classifier (k = 9) in case of 3 feature sets, before lda (originally extracted feature sets) and after lda (original feature space reduced to 4 projection vectors).
it can be observed that rather poor performance of knn in the case of all three original sets has been significantly improved in the reduced feature space. the highest improvement has been achieved in the case of prosodic-spectral feature set (an increase from 39.9% to 91.3% average recognition rate), which could be explained by the fact that the performance of the knn classifier is affected by the high dimensionality, which is particularly apparent in case of ps-fs.

table 1 recognition accuracy of knn classifier using 3 feature sets (before and after feature reduction using lda)
class recognition rate [%]
feature set               anger   fear   joy    neutral   sadness   average
p-fs                      44.3    23.9   39.1   25        44.5      35.4
p-fs reduced with lda     53.7    51.2   52.3   53.2      61.2      54.3
s-fs                      73.9    56.9   35.1   58.1      37.1      52.2
s-fs reduced with lda     81.3    92.8   81.6   95.7      93.9      89.1
ps-fs                     57.8    37.9   32.2   23.6      33.3      39.9
ps-fs reduced with lda    86.8    93.7   83.6   95.9      96.3      91.3

table 2 shows the class and average recognition rate of the linear bayes classifier in case of three feature sets, before lda (initially extracted feature sets) and after lda (original feature space reduced to 4 projection vectors). an improvement of recognition accuracy is obtained only in the case of prosodic feature set (p-fs). this improvement amounts to about 5%, which is a rather moderate increase compared to the results in table 1, where the improvement is about 19%. as to s-fs and ps-fs there were no improvements, which is probably due to good linear separability in the original feature space (resulting in high recognition rates using non-reduced s-fs and ps-fs).

table 2 recognition accuracy obtained with 3 feature sets (before and after feature reduction using lda) and with the linear bayes classifier
class recognition rate [%]
feature set               anger   fear   joy    neutral   sadness   average
p-fs                      51.4    43.7   46.8   45.4      62.4      49.9
p-fs reduced with lda     51.4    53.4   46.8   56.6      69.8      55.6
s-fs                      85.1    91.7   81     95.9      93.9      89.5
s-fs reduced with lda     85.1    91.4   80.7   96.5      94.3      89.6
ps-fs                     88.8    92.5   84.2   97.1      94.8      91.5
ps-fs reduced with lda    88.2    92.5   85.3   95.9      95.7      91.5

the class and average recognition rate of the perceptron rule in two test conditions (3 feature sets before lda and 3 feature sets after lda) are given in table 3. slight improvements of recognition accuracy are noticeable in the case of all three reduced feature sets. the improvement is the lowest in case of p-fs.
when these three classifiers are compared, it can be noted that a substantial improvement of recognition accuracy has been achieved for the simplest classifier, namely knn. using ps-fs reduced with lda, knn achieves an accuracy almost equal to the best result in our experiments (91.5%). this also holds for the perceptron as a classifier, although the relative improvement of the average performance of the perceptron is much smaller.

table 3 recognition accuracy using 3 feature sets (before and after feature reduction using lda) and with the perceptron rule as the classifier
class recognition rate [%]
feature set               anger   fear   joy    neutral   sadness   average
p-fs                      34.8    29.3   36.2   21.3      56.9      35.7
p-fs reduced with lda     16.9    33.1   42.2   33        62.6      37.6
s-fs                      79.9    81.9   72.1   89.7      87.9      82.3
s-fs reduced with lda     78.2    90.5   80.5   91.1      93.4      86.7
ps-fs                     83.9    88.2   77.1   91.4      93.7      86.9
ps-fs reduced with lda    86.8    94.2   82.8   93.9      94.5      90.5

employing lda, the original feature space is transformed to a new one, making it impossible to interpret the relevance of particular feature types.
for an insight into the list of the most relevant features in the original (untransformed) feature space, sffs (sequential forward feature selection) has been applied. the wrapper for sffs is the linear bayes classifier since it had the best recognition results. the number of selected features has been preset to 35. for the interpretation of results, three indicators have been used. the first indicator of the relevance of a feature type is the number (#) of the features selected by sffs. the other two indicators are the so-called 'share' and 'portion', as described in [16]. with 'share', the count of the selected feature type is normalized by the total number of features in the reduced set (#/35 in our experiment). with 'portion', the same number is normalized by the cardinality of a feature type in the original feature set (#/#total). for each feature type, the 'share' indicator displays its percentage in modeling our 5-class problem, while the indicator 'portion' gives the percentage of the total number of the feature type which contributes to the modeling of the problem.
the results of the selection of 35 features from ps-fs and the effectiveness of each feature type are displayed through 3 indicators in table 4. the observed feature types from ps-fs are: zero crossing rate (zcr), energy, pitch (plus voicing probability) and mfcc. columns '#total' and '#' show the total number and the number of selected features per each feature type, respectively. from table 4 it can be observed that the most selected features ('share'=77.1%) belong to the mfcc type. the second important feature type is energy ('share'=11.4%). the third and the fourth feature type are zcr and pitch, respectively. as regards the indicator 'portion', the list of feature types can be arranged in the following way: from the total feature set energy is selected with the highest percentage (16.7%), followed by zcr (12.5%).
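as a quick check of the two normalizations, the indicators can be recomputed from the per-type counts reported in table 4. the sketch below only restates the definitions ('share' = #/35, 'portion' = #/#total); the variable and function names are illustrative assumptions.

```python
# 'share' and 'portion' indicators from the counts reported in table 4.
# dictionary and function names are illustrative only.

selected = {"zcr": 3, "energy": 4, "pitch": 1, "mfcc": 27}   # features picked by sffs
total = {"zcr": 24, "energy": 24, "pitch": 48, "mfcc": 288}  # features available per type

n_selected = sum(selected.values())  # size of the reduced set (35)

def share(feature_type):
    """percentage of the reduced set that belongs to this feature type."""
    return 100.0 * selected[feature_type] / n_selected

def portion(feature_type):
    """percentage of this feature type's features that were selected."""
    return 100.0 * selected[feature_type] / total[feature_type]

for name in selected:
    print(f"{name:7s} share = {share(name):5.1f} %   portion = {portion(name):5.1f} %")
```

the two indicators answer different questions: 'share' is dominated by mfcc because that type contributes most of the 384 candidates, whereas 'portion' corrects for the size of each type.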
although the mfcc feature type is the most frequent one in the selected feature set, only 9.4% of the total number of mfcc is selected. the pitch feature type is selected by the lowest rate (2.1%).

table 4 summary of feature selection results (35 features selected using sffs), displayed with respect to feature types
               zcr    energy   pitch   mfcc
#total         24     24       48      288
sffs #         3      4        1       27
share [%]      8.6    11.4     2.9     77.1
portion [%]    12.5   16.7     2.1     9.4

table 5 summarizes the results of the feature selection distributed along groups of used statistical functionals: moments, extrema and regression coefficients. the features derived via moments are the most frequent among the selected features ('share'=57.1%), followed by the features derived via extrema (22.9%) and the features derived via linear regression (20%). observing the 'portion' of the total number of features in each group of functionals, the most highly ranked are moments, followed by regression functionals and extrema, in that order.

table 5 summary of feature selection results, distributed along groups of used statistical functionals
               moments   extrema   regression
#total         128       160       96
sffs #         20        8         7
share [%]      57.1      22.9      20
portion [%]    15.6      5         7.3

5. conclusion
the paper gives an outline of a system for the recognition of basic emotions in speech, with particular emphasis on the extracted acoustic feature sets, classification schemes and emotional speech corpus. the paper discusses the obtained improvement of the recognition accuracy in a lower dimensional feature space obtained by applying linear discriminant analysis. the most substantial improvement of the recognition accuracy has been achieved for the simplest classifier in our experiments, namely the knn classifier. a combination of knn with a reduced prosodic-spectral feature set nearly approaches the best results obtained in the experiments (the accuracy of 91.5%).
a feature selection algorithm has been employed in order to evaluate the relevance of the feature types and their statistical properties in the given task of the recognition of 5 basic emotions. in descending order of relevance, the feature types are: mfcc, energy, zero crossing rate and pitch. observing the ratio of selected features to the total number of features in each feature type, features related to the energy are the most frequently selected. the results of the feature selection distributed along groups of used statistical functionals imply that moments are the most relevant statistical features, although the extrema, regression coefficients and regression error also play notable roles.
by combining chosen prosodic and spectral features, represented by appropriate statistical features, recognition results comparable with those of more complex systems can be achieved even with a very simple classification scheme (such as knn).
acknowledgement: the research presented in this paper has been carried out within the project "the development of dialogue systems for serbian and other south slavic languages" (tr32035), supported by the ministry of education, science and technological development of the republic of serbia.
references
[1] d.a. sauter, f. eisner, p. ekman, s. scott, "crosscultural recognition of basic emotions through nonverbal emotional vocalizations", proceedings of national academy of sciences of the usa, vol. 107(6), pp. 2408-2412, 2010.
[2] d. ververidis, c. kotropoulos, "emotional speech recognition: resources, features and methods", speech communication, vol. 48, pp. 1162-1181, 2006.
[3] s.l. lutfi, f. fernandez-martinez, j.m. lucascuesta, l. lopez-lebon, j.m. montero, "a satisfactionbased model for affect recognition from conversational features in spoken dialog systems", speech communication, vol. 55, pp. 825-840, 2013.
[4] m.e. ayadi, m.s.
kamel, f. karray, "survey on speech emotion recognition: features, classification schemes and databases", pattern recognition, vol. 44, pp. 572-587, 2011.
[5] n. fragopanagos, j.g. taylor, "emotion recognition in human-computer interaction", neural networks, vol. 18, pp. 389-405, 2005.
[6] b. schuller, b. vlasenko, f. eyben, g. rigoll, a. wendemuth, "acoustic emotion recognition: a benchmark comparison of performances", ieee workshop on automatic speech recognition and understanding, asru 2009, italy, 2009, pp. 552-557.
[7] v. delić, m. bojanić, m. gnjatović, m. sečujski, s.t. jovičić, "discrimination capability of prosodic and spectral features for emotional speech recognition", electronics and electrical engineering, kaunas technologija, vol. 18, no. 9, pp. 51-54, 2012.
[8] m. bojanić, extraction and selection of feature set for automatic emotional speech recognition. ph.d. dissertation, dept. elect. eng., faculty of technical sciences, university of novi sad, 2013.
[9] b. schüller, a. batliner, s. steidl, d. seppi, "recognising realistic emotions and affect in speech: state of the art and lessons learnt from the first challenge", speech communication, vol. 53, pp. 1062-1087, 2011.
[10] c.m. lee, s.s. narayanan, "toward detecting emotions in spoken dialogs", ieee transactions speech audio processing, vol. 13, no. 2, pp. 293-303, 2005.
[11] h. altun, g. polat, "new frameworks to boost feature selection algorithms in emotion detection for improved human computer interaction", lncs, vol. 4729, berlin-heidelberg: springer, pp. 533-541, 2007.
[12] r.o. duda, p.e. hart, d.g. stork, pattern classification, 2nd edition. wiley, new york, 2000.
[13] s.t. jovičić, z. kašić, m. djordjević, m. rajković, "serbian emotional speech database: design, processing and evaluation", proceedings of international conference on speech and computer (specom 2004), st. petersburg, 2004, pp. 77–81.
[14] k. fukunaga, introduction to statistical pattern recognition. academic press, 1990.
[15] p.
pudil, j. novovicova, j. kittler, "floating search methods in feature selection", pattern recognition lett., vol. 15, pp. 1119-1125, 1994.
[16] a. batliner et al., "whodunnit – searching for the most important feature types signalling emotion-related user states in speech", computer speech and language, vol. 25, pp. 4-28, 2011.
facta universitatis series: electronics and energetics vol. 30, no 2, june 2017, pp. 257-265 doi: 10.2298/fuee1702257s
analytical test structure model for determining lateral effects of tri-layer ohmic contact beyond the contact edge
neelu shrestha, geoffrey k. reeves, patrick w. leech, yue pan, anthony s. holland school of engineering, rmit university, melbourne, victoria, australia
abstract. contact test structures with more than one non-metal layer are significantly more complex to analyse than those with only one such layer, such as active silicon on an insulating substrate. here, we use analytical models for complex test structures in a two contact test structure and compare the results obtained with those from finite element models (fem) of the same test structures. the analytical models are based on the transmission line model, and the tri-layer transmission line model in particular, and do not include vertical voltage drops except for the interfaces. the comparison shows that analytical models for tri-layer contacts to dual active layers agree well with fem when the specific contact resistances (scr) of the contact interfaces are a significant part of the total resistance. overall, there is a broad range of typical dual-layer-to-tltlm contacts where the analytical model works. the insight (and quantification) that the analytical model gives into the effect of the presence of the contact on the distribution of current away from the contact is also shown.
key words: ohmic contact, specific contact resistance, transfer length, transmission line model
1.
introduction
the study of specific contact resistivity (ρc, [Ω·cm²]), one of the most important parameters for investigating metal-semiconductor interfacial properties, has been reported using several test structures [e.g. 1-6]. the effect of this parameter on the resistance of device components is also an area of investigation, and an example is the effect of interfaces in via liners [7]. the transmission line model (tlm) is the most commonly used among them and was first applied to analysing and determining the resistance components of semiconductor ohmic contacts by shockley in 1964 [5]. for many years the tlm test structure was considered accurate for contacts such as aluminium to silicon. a more complex tlm model for contacts such as alloyed ni/ge/au to gaas was introduced by reeves [6]. the example given has a metal, an alloyed and a semiconductor layer. another example of a tri-layer contact structure is aluminium-silicide-silicon.
received september 26, 2016; received in revised form november 16, 2016 corresponding author: neelu shrestha, school of engineering, rmit university, melbourne, victoria, australia (e-mail: neelu_shrestha@hotmail.com)
the analytical models for the dual-layer and tri-layer structures considered the contact as beginning at the leading edge of the metal. yao li [8] developed an analytical model to show (and quantify) how the presence of the contact affects the current (and hence the voltage distribution) away from the leading edge. this was also investigated by reeves [9]. the works in these references [6, 8, 9] were combined in the investigation for this paper, and their utilisation and accuracy are further demonstrated. the analytical model demonstrated in this paper can provide a solution to the 2l-tlm contact structure (a metal contact to a dual semiconductor layer, see fig. 1(a)), considered intractable in 1994 [10].
for most semiconductor devices the sheet resistance of the semiconductor active layer is sufficient to quantify its resistive effect, even for typical planar contacts such as source/drain contacts. the principle of the tlm models for ohmic contacts relies on this being so. only in cases where the specific contact resistance of the metal/semiconductor interface is extremely small will there be a significant vertical voltage drop in the semiconductor layer. tlm models (analytical type models) do not account for the vertical voltage drop in the semiconductor layer, but fem models do. this paper investigates the difference. fem is a powerful technique and is used here to demonstrate the accuracy of the analytical models, which are also expanded here to demonstrate their usefulness as one approach to quantifying the total resistance encountered by current between two tri-layer contacts to a dual-layer (parallel sheet resistances) structure, as shown in fig. 1(a) with the corresponding resistor network model shown in fig. 1(b).
fig. 1 test structure for investigating resistance effects of contacts to dual-active layers. (a) schematic of test structure and (b) resistor network model showing the tltlm [6] contact model connected to the dual active layer model [9]. the one resistor that is common to the resistor network of both analytical models is indicated
2. methodology
the general tri-layer contact structure investigated in this work is illustrated diagrammatically in fig. 1 with its three layers, namely a metal layer, an intermediary active layer a and a bottom active layer b. for the tlm models the metal layer is an ideal metal layer, i.e. the metal is at an equipotential. the fem model reproduces this by assigning the metal an extremely low resistivity, so that it is effectively at an equipotential.
two interfaces with associated specific contact resistances (scr) are included in the model. they occur between the metal and layer a and between layers a and b, with scr values ρca and ρcu respectively. the sheet resistance of the metal at the metal-semiconductor interface is considered to be zero, and rsa and rsu are the sheet resistances of layers a and b respectively. (the subscripts -sa, -su, -ca and -cu are used to maintain uniformity in the expressions with refs [6] and [9].) the total current supplied to the structure is i0; i1 and i2 are the currents in layers b and a respectively, and i3 is the total current exiting through the contact. the length of the contact is d, the distance between the two contacts is l and w is the width of the structure. the analytical expressions for the total resistance are derived by combining reeves' tri-layer transmission line model (tltlm) [6] and yao li's model [8]. because the models are based on the transmission line model (tlm), which can at best be regarded as a 2-d model, vertical voltage drops are not included except across the interfaces. as defined by berger [11], the η parameter gives the ratio of the voltage drop across the contact interface to the vertical voltage drop in the semiconductor layer, and thus determines whether a metal-semiconductor contact has to be treated as three-dimensional or not (fem is used in this work to investigate 3-d effects).
$\eta = \rho_c / (\rho_b \, t)$ (1)
where ρb and t are the resistivity and thickness of the individual semiconductor layer respectively. consequently, the model presented in this work is best suited to contacts for which η is greater than 1, because the current flow is then essentially 2-d.
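as a rough numerical illustration of equation (1), the sketch below evaluates the ratio for the two interfaces of one parameter set reported later in table 1 (set 1), using the layer thicknesses of 0.2 and 0.6 micrometres quoted in the results section. the function name `berger_ratio` and the conversion ρb = (sheet resistance) × (thickness), i.e. a uniform layer, are assumptions made for this example only.

```python
def berger_ratio(rho_c, r_sheet, thickness_cm):
    """Ratio of interface voltage drop to vertical voltage drop in a layer.

    rho_c        : specific contact resistivity of the interface [ohm.cm^2]
    r_sheet      : sheet resistance of the semiconductor layer [ohm/sq]
    thickness_cm : layer thickness [cm]
    Assumes a uniform layer, so that rho_b = r_sheet * thickness.
    """
    rho_b = r_sheet * thickness_cm          # bulk resistivity [ohm.cm]
    return rho_c / (rho_b * thickness_cm)


UM = 1e-4  # one micrometre expressed in cm

# set 1 of table 1: metal/layer a interface and layer a/layer b interface
ratio_a = berger_ratio(8.0e-7, 40.0, 0.2 * UM)
ratio_u = berger_ratio(1.6e-6, 60.0, 0.6 * UM)
print(round(ratio_a, 2), round(ratio_u, 2))
```

both values are well above 1, which is the regime in which the text expects the 2-d analytical model to apply.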
when the two models, the dual active layer model [8] and the tltlm [6], are combined to describe the test structure, the voltage drop across the common resistor (see insert in fig. 1) must be the same in both models, and the current division factor f can therefore be determined by solving for the voltage using this boundary condition. reeves [9] in 1997 reported a similar structure applied to the source/drain region of a mosfet (with a short extended dual (silicide/silicon) active layer before the contact) and considered all current to enter the dual layer through layer b. in this work, we focus on the more general case of a test structure in which current enters the dual layer through both layers, as shown in fig. 1. the assumption made is that the two contacts are geometrically and electrically identical. in the following equations, f1 is the current division factor at the intersection where current leaves one tltlm contact and enters the dual active layer. the factor f is the modified current division factor at the intersection where current enters the other tltlm contact from the dual active layer. thus the boundary conditions are i1(x) = i0·f1 at x = l and i1(x) = i0·f at x = 0. the modified current division factor f is given by
$f = \dfrac{E + G}{D + H + G}$ (2)
where the transfer length of the dual active layer is
$L_t = \sqrt{\rho_{cu} / (r_{sa} + r_{su})}$ (3)
and
$D = \dfrac{\rho_{cu}}{L_t} \coth\!\left(\dfrac{l}{L_t}\right)$ (4)
$E = \dfrac{\rho_{cu}}{L_t} \left( \dfrac{r_{sa}\cosh(l/L_t) + \left(f_1 (r_{sa}+r_{su}) - r_{sa}\right)}{(r_{sa}+r_{su})\sinh(l/L_t)} \right)$ (5)
$H = \dfrac{b\,(r_{su}-\rho_{cu}a^2)\tanh(ad) - a\,(r_{su}-\rho_{cu}b^2)\tanh(bd)}{(b^2-a^2)\tanh(bd)\tanh(ad)}$ (6)
$G = \dfrac{r_{sa}\left(b\tanh(ad) - a\tanh(bd)\right)}{(b^2-a^2)\tanh(bd)\tanh(ad)}$ (7)
reeves [6] demonstrated that the current i1(x) through layer b and the contact resistance rc in the tltlm are given by
$i_1(x) = \dfrac{i_0}{\rho_{cu}(b^2-a^2)} \left( \dfrac{p\,\sinh(b(d+x))}{\sinh(bd)} - \dfrac{q\,\sinh(a(d+x))}{\sinh(ad)} \right)$ (8)
$r_c = K \left\{ \dfrac{p}{b\tanh(bd)} - \dfrac{q}{a\tanh(ad)} \right\}$ (9)
where x is measured from the contact edge (-d ≤ x ≤ 0 inside the contact) and
$p = f\,(r_{su}-\rho_{cu}a^2) - (1-f)\,r_{sa}$ (10)
$q = f\,(r_{su}-\rho_{cu}b^2) - (1-f)\,r_{sa}$ (11)
$K = \dfrac{r_{su}}{\rho_{cu}\,w\,(b^2-a^2)}$ (12)
the equations for a, b and c are given in the appendix as (a1), (a2) and (a3). using the work reported by li et al.
[8], the current i1(x) through the lower layer b of the dual-layer structure and the total resistance rtot(li) are given by
$i_1(x) = i_0 \left( \dfrac{r_{sh}}{r_{su}} + \dfrac{(f - r_{sh}/r_{su})\sinh((l-x)/L_t)}{\sinh(l/L_t)} + \dfrac{(f_1 - r_{sh}/r_{su})\sinh(x/L_t)}{\sinh(l/L_t)} \right)$ (13)
$r_{tot(li)} = 2\left(r_c + \dfrac{b_{cor}}{2}\right) + \dfrac{r_{sh}\,l}{w}$ (14)
where
$b_{cor} = \dfrac{r_{sh}\,\beta\,L_t}{w}$ (15)
$\beta = 2\left(\dfrac{(f_1 + f)\,r_{su}}{2\,r_{sh}} - 1\right)$ (16)
and rsh = rsa·rsu/(rsa + rsu) is the effective sheet resistance of the two parallel active layers. for two contacts (in a tlm type test structure) with identical geometrical and electrical features, the total resistance rtot(std) between probes connected to each contact is usually given by
$r_{tot(std)} = 2\,r_c + \dfrac{r_{sh}\,l}{w}$ (17)
the numerical evaluation of the total resistance and of the current flow in all layers, for different contact parameters giving various η values, is presented in a table and in graphs obtained from matlab. models with the same contact parameters were also simulated using finite element modelling (fem), and the results for rtot(fem), rtot(li) and rtot(std) are compared in this work.
3. results and discussion
the table below gives the sets of contact parameters used for the analysis of the test structure of fig. 1. the values ηa and ηu for the two layers (η = ρc/(ρb·t), where ρb and t are the semiconductor layer resistivity and thickness) differ between sets because of the different contact parameters. the contact length is 10 µm and the length between the contacts (dual active layer) is 20 µm for sets 1, 2 and 3, but 5.2 µm for set 4. the thicknesses of layers a and b are 0.2 µm and 0.6 µm respectively. the distances were chosen as being typical of a tlm type test structure. the 5.2 µm value came about because of the mesh density used, which was the same for all models.
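the behaviour of the layer b current along the dual active layer can be sketched numerically. the example below follows the expression used in the matlab listing of the appendix (a steady-state share rsa/(rsa+rsu) plus two decaying sinh terms fixed by the boundary values f at x = 0 and f1 at x = l). the function name `layer_b_fraction` and the end values f = f1 = 0.32 are illustrative assumptions, not figures quoted in the text.

```python
import math


def layer_b_fraction(x, l, lt, share, f, f1):
    """Fraction i1(x)/i0 of the total current carried by layer b.

    x, l  : position and length of the dual active layer [cm]
    lt    : transfer length of the dual layer [cm]
    share : steady-state share of layer b, rsa / (rsa + rsu)
    f, f1 : fractions imposed at x = 0 and x = l
    """
    s = math.sinh(l / lt)
    return (share
            + (f - share) * math.sinh((l - x) / lt) / s
            + (f1 - share) * math.sinh(x / lt) / s)


UM = 1e-4                                 # one micrometre in cm
rsa, rsu, pcu = 40.0, 60.0, 1.6e-6        # sheet resistances and interface scr
lt = math.sqrt(pcu / (rsa + rsu))         # about 1.26 micrometres
share = rsa / (rsa + rsu)                 # 0.4, i.e. i1:i2 = 4:6
f = f1 = 0.32                             # assumed edge values (illustrative)
length = 20 * UM

profile = [layer_b_fraction(k * UM, length, lt, share, f, f1)
           for k in (0, 5, 10, 15, 20)]
print([round(v, 4) for v in profile])
```

a few transfer lengths away from either contact the fraction settles at the steady-state share, and it departs from it only near the contact edges.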
table 1 contact parameters for the test structure
set  rsa (Ω/sq)  rsu (Ω/sq)  ρca (Ω·cm²)  ρcu (Ω·cm²)  l (µm)  ηa     ηu     rtot(fem) (Ω)  rtot(li) (Ω)  rtot(std) (Ω)
1    40          60          8.0e-7       1.6e-6       20      50     7      576.1          575.7         587.8
2    30          40          8.0e-7       9.0e-6       20      66.67  62.50  445.9          445.4         482.9
3    40          60          8.0e-9       1.0e-8       20      0.50   0.04   487.4          507.8         507.7
4    40          60          8.0e-7       1.6e-6       5.2     50     7      220.8          220.1         232.4
the analytical and fem results are in excellent agreement for the sets in which η is greater than 1. moreover, the percentage error relative to rtot(fem) is smaller for rtot(li) than for rtot(std). this clearly shows that the redistribution of current near the intersection of the tltlm contact and the dual layer affects the total resistance between the contacts. in sets 1 and 2, rtot(fem) and rtot(li) are practically the same and the error is less than 1% in both cases (considering the fem model to give the most realistic result). however, the error between rtot(fem) and rtot(std) is 2% for set 1 and increases to 8% for set 2. this is because the transfer length in the dual active layer is longer for set 2 than for set 1. the distance ahead of the leading edge of the contact that is affected by the redistribution of current due to the contact is directly proportional to the transfer length of the dual active layer. consequently, the total resistance of the dual layer of the test structure differs from rsh*l/w (where rsh is the effective sheet resistance of the two parallel active layers). in set 3, both rtot(std) and rtot(li) disagree with the fem result by approximately 4%. in set 4, all the parameters are the same as in set 1 except that the length between the contacts is 5.2 µm. as the dual active layer is shorter and the transfer lengths do not change, the effect of the contacts on the current distribution in the dual layer is relatively larger. the error is ~5% between rtot(fem) and rtot(std) and less than 1% with rtot(li).
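the analytical columns of table 1 can be cross-checked with a short script that follows the matlab listing in the appendix. this is only a sketch under two assumptions that the text does not state explicitly: the structure width is taken as w = 1 µm, and, because the two contacts are identical, the factor f1 is set equal to f, which turns the expression for f into a linear equation. all lengths are in cm; the function name `tltlm_dual_layer` is hypothetical.

```python
import math

UM = 1e-4  # one micrometre expressed in cm


def tltlm_dual_layer(rsa, rsu, pca, pcu, l, d, w):
    """Analytical total resistance between two identical tri-layer contacts.

    Follows the appendix listing; assumes f1 == f (symmetric structure).
    """
    rs = rsa + rsu
    rsh = rsa * rsu / rs                       # parallel sheet resistance
    al = math.sqrt(rs / pcu)                   # inverse transfer length
    c = rs / pcu + rsa / pca
    z = math.sqrt(c * c - 4.0 * rsu * rsa / (pcu * pca))
    a = math.sqrt((c - z) / 2.0)
    b = math.sqrt((c + z) / 2.0)
    ta, tb = math.tanh(a * d), math.tanh(b * d)
    den = (b * b - a * a) * tb * ta

    big_d = al * pcu / math.tanh(al * l)
    big_h = (b * (rsu - pcu * a * a) * ta - a * (rsu - pcu * b * b) * tb) / den
    big_g = rsa * (b * ta - a * tb) / den
    # E = e0 + e1 * f1; with f1 == f the relation f = (E + G)/(D + H + G)
    # becomes f = (e0 + G) / (D + H + G - e1)
    sh, ch = math.sinh(al * l), math.cosh(al * l)
    e0 = al * pcu * rsa * (ch - 1.0) / (rs * sh)
    e1 = al * pcu / sh
    f = (e0 + big_g) / (big_d + big_h + big_g - e1)

    p = f * (rsu - pcu * a * a) - (1.0 - f) * rsa
    q = f * (rsu - pcu * b * b) - (1.0 - f) * rsa
    k = rsu / (pcu * w * (b * b - a * a))
    rc = k * (p / (b * tb) - q / (a * ta))     # contact resistance

    rtot_std = 2.0 * rc + rsh * l / w          # standard tlm formula
    beta = 2.0 * (f * rsu / rsh - 1.0)         # (f1 + f)/2 == f here
    bcor = rsh * beta / (w * al)               # correction for redistribution
    rtot_li = 2.0 * (rc + bcor / 2.0) + rsh * l / w
    return {"f": f, "rc": rc, "rtot_std": rtot_std, "rtot_li": rtot_li}


set1 = tltlm_dual_layer(40.0, 60.0, 8.0e-7, 1.6e-6, 20 * UM, 10 * UM, 1 * UM)
set4 = tltlm_dual_layer(40.0, 60.0, 8.0e-7, 1.6e-6, 5.2 * UM, 10 * UM, 1 * UM)
for name, res in (("set 1", set1), ("set 4", set4)):
    print(name, {key: round(val, 3) for key, val in res.items()})
```

with these assumptions the script should land close to the rtot(li) and rtot(std) entries of table 1 for sets 1 and 4; the difference between the two columns comes entirely from the correction term `bcor`, which is negative because the share of current in layer b at the contact edge is lower than its steady-state share.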
thus, the expressions described in this work can give an accurate insight into the electrical behaviour of the test structure, provided that η > 1. the graph in fig. 2(a) is for set 1, and the junction between the tltlm contact and the dual layer is taken as point 0. the length of the tltlm contact region is 10 µm and that of the dual-layer region is 20 µm. the graph shows the flow of current starting from the point at which current exits one tltlm contact and enters the dual layer. here, i1 (solid line) is the current flowing in layer b, i2 (dotted line) is the current in layer a and i3 (fine dotted line) is the total current in the tltlm contact. the graph illustrates that the current redistributes and tends to settle at a constant ratio i1:i2 = rsa:rsu (i.e. i1:i2 = 4:6) in the dual-layer structure. the flow is, however, disturbed near the point where current enters the tltlm structure from the dual layer. the fem result confirms this, as shown in fig. 2(b): the voltage contours are uniform in the middle of the dual layer and disturbed near the tltlm contact region.
fig. 2 (a) current distribution in one tltlm contact and dual layer contact regions for a test structure with rsa = 40 Ω/sq, rsu = 60 Ω/sq, ρca = 8e-7 Ω·cm², ρcu = 1.6e-6 Ω·cm² and a dual active layer of 20 µm; (b) distribution of the corresponding voltage contours, determined by fem for the test structure
fig. 3(a) shows results for set 4, in which all parameters are the same as in set 1 except that the length between the contacts is reduced to 5.2 µm. because of this, the current flows with the ratio rsa:rsu over a shorter length. this leads to a greater variation in total resistance. the higher error of ~5% between rtot(std) and rtot(fem) shows that the total resistance between the contacts is not given by rsh*l/w.
fig.
3 (a) current distribution in a tltlm contact and dual layer contact regions for a test structure with rsa = 40 Ω/sq, rsu = 60 Ω/sq, ρca = 8e-7 Ω·cm², ρcu = 1.6e-6 Ω·cm² and a dual active layer of 5.2 µm in length; (b) distribution of the corresponding voltage contours, determined by fem for the test structure
4. conclusion
two variations of transmission line model networks, namely the tri-layer tlm and a dual-layer network, for modelling current in semiconductor contact regions, were combined to model a test structure with multiple layers. a comparison of the mathematical analysis with a two-dimensional finite element model of the test structure, with two metal contacts to a dual active layer, shows that the combination of the tltlm and dual-layer network expressions provides an accurate analysis of these test structures. the limitations on the accuracy of the expressions have been presented in terms of the η parameter. the distribution of current through the dual layer and the tltlm contact region is discussed in detail to understand its influence on the total resistance of the test structure. this distribution is accurately represented by the combined tltlm-dual-active-layer model, which is an improvement on models in which the current distribution and sheet resistance are considered uniform between the contacts.
references
[1] a. m. collins, y. pan, a. s. holland, “using a two-contact circular test structure to determine the specific contact resistivity of contacts to bulk semiconductors”, facta universitatis, electronics and energetics, vol. 28, no. 3, pp. 457-464, september 2015.
[2] y. pan, a. m. collins, a. s. holland, “determining specific contact resistivity to bulk semiconductor using a two-contact circular test structure”, in proceedings of the ieee international conference on miel, pp. 257-260, may 2014.
[3] v. gudmundsson, p. hellstrom, and m.
ostling, “error propagation in contact resistivity extraction using cross-bridge kelvin resistors”, ieee trans. electron devices, vol. 59, no. 6, pp. 1585–1591, june 2012.
[4] a. s. holland, g. k. reeves, “new challenges to the modelling and electrical characterisation of ohmic contacts for ulsi devices”, in proc. of the 22nd international conference on microelectronics (miel 2000), vol. 2, niš, serbia, pp. 461-464, 2000.
[5] w. shockley, “research and investigation of inverse epitaxial uhf power transistors”, air force atomic laboratory, wright-patterson air force base, rep. no. al-tdr-64-207, sept. 1964.
[6] g. k. reeves, h. b. harrison, “an analytical model for alloyed ohmic contacts using a tri-layer transmission line model”, ieee trans. electron devices, vol. 42, no. 8, p. 1536, 1995.
[7] g. k. reeves, a. s. holland, p. w. leech, “influence of via liner properties on the current density and resistance of vias”, in proc. of the 23rd international conference on microelectronics (miel 2002), vol. 2, niš, yugoslavia, pp. 535-538, 2002.
[8] y. li, g. k. reeves, h. b. harrison, “correcting separating errors related to contact resistance measurement”, microelectronics journal, vol. 29, 1996.
[9] g. k. reeves, h. b. harrison, “using tlm principles to determine mosfet contact and parasitic resistance”, solid-state electronics, vol. 41, no. 8, 1997.
[10] y. shiraishi, n. furuhata, a. okhamoto, “influence of metal/n-inas/interlayer/n-gaas structure on nonalloyed ohmic contact resistance”, journal of applied physics, vol. 76, p. 5099, 1994.
[11] h. h. berger, “models for contacts to planar devices”, solid state electronics, vol. 12, 1972.
appendix
the expressions for the tltlm
$a = \sqrt{\dfrac{c}{2}\left\{1 - \sqrt{1 - \dfrac{4\,r_{sa} r_{su}}{c^2 \rho_{ca}\rho_{cu}}}\right\}}$ (a1)
$b = \sqrt{\dfrac{c}{2}\left\{1 + \sqrt{1 - \dfrac{4\,r_{sa} r_{su}}{c^2 \rho_{ca}\rho_{cu}}}\right\}}$ (a2)
$c = \dfrac{r_{sa} + r_{su}}{\rho_{cu}} + \dfrac{r_{sa}}{\rho_{ca}}$ (a3)
matlab code
% inputs assumed to be defined beforehand: rsa, rsu, pca, pcu, l, d, w, i0, f1, x, y
% upper-case D, E, H, G, A, K1, X1, Y1 are kept distinct from the lower-case
% d (contact length), a, k1 and y1 (transfer length); matlab is case-sensitive
% calculation of f factor
rs=rsa+rsu;
rsh=(rsu*rsa)/(rsu+rsa);
al=sqrt(rs/pcu);
c=(rs/pcu)+(rsa/pca);
z=sqrt((c*c)-(4*rsu*rsa/(pcu*pca)));
a=sqrt((c-z)/2);
b=sqrt((c+z)/2);
D=al*pcu*coth(al*l);
E=al*pcu*(rsa*cosh(al*l)+(f1*rs-rsa))/(rs*sinh(al*l));
H=((b*((rsu-(pcu*a*a))*tanh(a*d)))-(a*((rsu-(pcu*b*b))*tanh(b*d))))/((b*b-a*a)*tanh(b*d)*tanh(a*d));
G=rsa*(b*tanh(a*d)-a*tanh(b*d))/((b*b-a*a)*tanh(b*d)*tanh(a*d));
f2=(E+G)/(D+H+G);
% current flow dual active layer
A=rsh/rsu;
y1=sqrt(pcu/(rsu+rsa));
i11=i0*(A+((f2-A)*sinh((l-x)/y1)/sinh(l/y1))+((f1-A)*sinh(x/y1)/sinh(l/y1)));
i21=i0-i11;
% current flow tltlm contact
c=((rsa+rsu)/pcu)+(rsa/pca);
z=(c*c)-(4*rsu*rsa)/(pcu*pca);
a=sqrt((c-sqrt(z))/2);
b=sqrt((c+sqrt(z))/2);
p=f2*(rsu-pcu*a*a)-(1-f2)*rsa;
q=f2*(rsu-pcu*b*b)-(1-f2)*rsa;
i12=(i0/(pcu*(b*b-a*a)))*((p*sinh(b*(d+y))/sinh(b*d))-(q*sinh(a*(d+y))/sinh(a*d)));
i23=(i0/(rsa*pcu*(b*b-a*a)))*((p*(rsu-pcu*b*b)*sinh(b*(d+y))/sinh(b*d))-(q*(rsu-pcu*a*a)*sinh(a*(d+y))/sinh(a*d)));
itot=i0-(i12+i23);
plot(x,i11); hold on;
plot(y,i12); hold on;
plot(x,i21); hold on;
plot(y,i23); hold on;
plot(y,itot);
% contact resistance
c1=((rsa+rsu)./pcu)+(rsa./pca);
z1=(c1.*c1)-((4.*rsu.*rsa)./(pcu.*pca));
a1=sqrt((c1-sqrt(z1))/2);
b1=sqrt((c1+sqrt(z1))/2);
K1=rsu./(pcu.*w.*(b1.*b1-a1.*a1));
X1=tanh(b1*d);
Y1=tanh(a1*d);
p1=f2.*(rsu-(pcu.*a1.*a1));
q1=f2.*(rsu-(pcu.*b1.*b1));
k1=(1-f2).*rsa;
rc=K1.*(((p1-k1)./(b1.*X1))-((q1-k1)./(a1.*Y1)));
% total resistance using standard formula
rtot=2*rc+(rsh*l/w)
% total resistance using yao et al. formula when dual layer is longer
beta=2*((((f1+f2)*rsu)/(2*rsh))-1);
bcor=(rsh*beta)/(w*al);
rtotli=2*(rc+(bcor/2))+(rsh*l)/w;
instruction facta universitatis series: electronics and energetics vol. 27, n o 3, september 2014, pp.
389 398 doi: 10.2298/fuee1403389p analysis of measurement error in direct and transformer-operated measurement systems for electric energy and maximum power measurement  slaviša puzović 1 , branko koprivica 2 , alenka milovanović 2 , milić đekić 2 1 edb užice, prijepolje, serbia 2 faculty of technical sciences ĉaĉak, university of kragujevac, serbia abstract. analysis of error in measuring electric energy and maximum power within direct and half-indirect measurement system at the voltage of 0.4kv is presented in the paper. the analysis involved all the elements of the measurement system, i.e. calibration and testing of the transformer-operated and direct digital energy meters and measuring current transformers. this equipment was also used for measurements in the transformer substation aiming at error analysis at measurements made under the real conditions. the results obtained show significant negative measurement error introduced by the energy meters under overload conditions. energy meters have lower values of both the consumed electric energy and maximum power in this operating mode, which can be interpreted as a loss. key words: measurement error, digital energy meters, measuring current transformers 1. introduction in the early xxi century, the power system of serbia is facing numerous strategic challenges, one of the most important ones being enhancing energy efficiency of the systems for generation, transmission and distribution of electricity. the continual increase in electricity consumption, changed consumers’ structure and inhibited construction of the resources and the network caused the long-term and excessive operation of the power system. this has resulted in its inefficient operation and has led to substantial electricity losses. these losses may be due to a number of reasons, one of major factors that require analysis being the error at measuring electric energy and maximum power (maximum average fifteen-minute active power). 
the measurement systems for electric energy and maximum power comprise half-indirect systems, which include measuring current transformers and transformer-operated instruments for measuring active and reactive electric energy, and direct systems, which include instruments for direct measurement of electric energy and maximum power.
received january 28, 2014; received in revised form march 28, 2014. corresponding author: branko koprivica, faculty of technical sciences čačak, svetog save 65, 32000 čačak, serbia (e-mail: branko.koprivica@ftn.kg.ac.rs)
390 s. puzović, b. koprivica, a. milovanović, m. đekić
the measuring instruments need to provide the required accuracy in operation. given that measuring current transformers need to meet the given accuracy class (up to 120% of the rated current), the question is whether or not the measuring current transformers exceed the rated accuracy class limits when the primary current is near zero, when it exceeds 120% of the rated current, and even when the overload amounts to 100% [1, 2]. similarly, the question is raised as to the extent to which changes in the load on the secondary windings of measuring current transformers affect measurement accuracy. this primarily refers to replacing measuring instruments on the secondary windings of measuring current transformers, i.e. replacing electro-mechanical meters with less energy-consuming digital ones. precise determination of the rated power of a measuring current transformer is of utmost importance, as the accuracy class and security factor are adjusted to that power. transformer-operated instruments for measuring active and reactive electric energy and maximum power, which within the half-indirect measuring system are connected to the secondary windings of the measuring current transformers, are dimensioned to comply with the rated secondary current of the measuring current transformers (1a or 5a).
in [3-4] the issue is raised of how the measuring instruments behave when the current through the measuring current transformers exceeds the specified one. the same goes for the measuring instruments within the direct measuring system. the base current in the latter is usually 10a, with the maximum current amounting to 40a, 60a, 80a or 100a, and these meters are usually operated in conditions where the actual current values are twice as high as the maximum ones. the aim of this paper is to examine how measuring current transformers and direct and transformer-operated three-phase energy meters behave under underload and overload conditions, and to determine the resulting measurement error. recent research on the accuracy of measuring current transformers and energy meters considers mostly the impact of non-linear loads, i.e. the distortion of current and voltage, on the value of the measurement error [5-9]. the influence of current and voltage thd has been studied separately for measuring current transformers and for current transformers embedded in energy meters. the analyses presented show a significant influence of thd on the phase error of both types of current transformers. this error is highly dependent on frequency, so the measurement of harmonics may be highly inaccurate. generally, a high error may be expected when the load current is nonsinusoidal. given that the literature does not provide enough information on the measurement errors under underload and overload conditions, the idea was to perform a detailed examination of this issue. the analysis presented in this paper includes separate laboratory tests on measuring current transformers and on direct and transformer-operated three-phase energy meters, under underload and overload conditions. furthermore, the paper presents the results obtained through measurements in a 10/0.4kv substation.
2.
measuring electric power and energy and measurement errors
the measurement system for electric energy and peak power at the voltage level of 0.4kv includes the measuring current transformers and transformer-operated instruments for measuring active and reactive power, and maximum power, in the transformer-operated measurement system. the measurement system also involves instruments for measuring active and reactive power, and maximum power, within the direct measurement system.
analysis of measurement error in direct and transformer-operated measurement systems... 391
in this paper, measuring current transformers of 50a/5a, 100a/5a, 150a/5a and 400a/5a current ratios (manufacturer fmt zaječar) were tested, as well as the digital energy meters of enel belgrade, i.e. two three-phase transformer-operated energy meters and three direct three-phase energy meters.
2.1. measuring current transformer errors
the errors of measuring current transformers include the current, phase and complex errors. the current error, gi, results from the deviation of the actual transformation ratio from its specified value. it is determined by [1, 2]:
$g_i = \dfrac{m_n i_2 - i_1}{i_1} \cdot 100\%$ (1)
where mn = i1n/i2n is the rated transformation ratio (i1n and i2n being the rated primary and secondary currents), and i1 and i2 are the actual primary and secondary winding currents. the phase error, δi, is defined by the angle between the secondary and primary current phasors. the phase error is positive if the secondary current leads the primary current. given that distortion of the secondary current is possible at increased primary current, as a result of saturation in the core, the complex error of measuring current transformers can be defined as follows:
$p_i = \dfrac{100\%}{I_1} \sqrt{\dfrac{1}{T}\int_0^T \left(m_n i_2 - i_1\right)^2 \, dt}$ (2)
where I1 is the rms value of the primary current, i1 and i2 are the instantaneous currents and T is the period. the accuracy class of a measuring current transformer is equal to the absolute value of the current error expressed in percent, at the specified load on the secondary winding and 120% of the rated primary current. standard accuracy classes are 0.1, 0.2, 0.5, 1, 3 and 5.
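the two error definitions above can be illustrated with a small numerical sketch. the transformer data below (a 100a/5a ratio, a secondary current of 4.98 a and a phase error of 30 minutes) are hypothetical values chosen for the example, and the complex error is evaluated by sampling one period of purely sinusoidal currents, which is a simplifying assumption.

```python
import math


def current_error_percent(m_n, i2_rms, i1_rms):
    """Current (ratio) error in percent from rms currents."""
    return (m_n * i2_rms - i1_rms) / i1_rms * 100.0


def complex_error_percent(m_n, i1_samples, i2_samples):
    """Complex error in percent from samples covering exactly one period."""
    n = len(i1_samples)
    i1_rms = math.sqrt(sum(v * v for v in i1_samples) / n)
    diff_ms = sum((m_n * s - p) ** 2
                  for p, s in zip(i1_samples, i2_samples)) / n
    return 100.0 * math.sqrt(diff_ms) / i1_rms


m_n = 100.0 / 5.0                       # rated transformation ratio
i1_rms, i2_rms = 100.0, 4.98            # hypothetical operating point
delta = math.radians(30.0 / 60.0)       # 30 minutes, secondary leading

n = 2000
angles = [2.0 * math.pi * k / n for k in range(n)]
i1 = [math.sqrt(2.0) * i1_rms * math.sin(w) for w in angles]
i2 = [math.sqrt(2.0) * i2_rms * math.sin(w + delta) for w in angles]

g_i = current_error_percent(m_n, i2_rms, i1_rms)
p_i = complex_error_percent(m_n, i1, i2)
print(round(g_i, 3), round(p_i, 3))
```

for undistorted currents the complex error combines the current error and the phase error, so it is larger in magnitude than the current error alone; with core saturation the sampled secondary waveform would simply be replaced by the distorted one.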
measurement of the electricity consumption does not include accuracy classes 1, 3 and 5. fig. 1 shows the limit values of the current and phase errors of measuring current transformers of accuracy classes 0.1, 0.2, 0.5 and 1, set out in [10]. thus, for a transformer of accuracy class 1, the limit of the current error is ±1% at 120% of the rated primary current and the specified load on the secondary windings of the measuring current transformer.
fig. 1 error value range: a) current error gi [%], b) phase error δi [min], both versus the primary current in % of i1n, for accuracy classes 0.1, 0.2, 0.5 and 1
2.2. digital energy meters errors
three-phase energy meters (direct and transformer-operated) are intended for measuring active and reactive electric energy in a three-phase voltage system of the specified frequency of 50 hz. the accuracy of digital measurement groups is set out in [11]. under the reference conditions, the percentage error should not exceed the value of the relevant accuracy class, tables 1 and 2.
table 1 percentage error limits in single-phase and three-phase direct energy meters of accuracy class 1 (ib is the base current, imax is the maximum current)
current values          power factor             error limit in %
0.05 ib ≤ i < 0.1 ib    1                        1.5
0.1 ib ≤ i ≤ imax       1                        1.0
0.1 ib ≤ i < 0.2 ib     0.5 (ind.), 0.8 (cap.)   1.5
0.2 ib ≤ i ≤ imax       0.5 (ind.), 0.8 (cap.)   1.0
table 2 percentage error limits in single-phase and three-phase transformer-operated energy meters of accuracy class 1 (in is the rated current, imax is the maximum current)
current values          power factor             error limit in %
0.02 in ≤ i < 0.05 in   1                        1.5
0.05 in ≤ i ≤ imax      1                        1.0
0.05 in ≤ i < 0.1 in    0.5 (ind.), 0.8 (cap.)   1.5
0.1 in ≤ i ≤ imax       0.5 (ind.), 0.8 (cap.)   1.0
3. measurement results
3.1.
tests with measuring current transformers
the testing of measuring current transformers was performed in fmt zaječar, a measuring transformer company, on measuring current transformers of 50a/5a, 100a/5a, 150a/5a and 400a/5a current ratios. three stem 081 type transformers (50a/5a, 100a/5a, 150a/5a) and one sten 081 type transformer (400a/5a) were used [3]. the testing was performed under the following conditions: voltage: rated phase voltage; current: from 0% to 200% of the rated current; power factor: cosφ=1, cosφ=0.8 (ind); power: sn=1.25va, sn=2.5va, sn=10va; and frequency: rated frequency of 50 hz. current errors expressed in % and phase errors expressed in minutes were measured at different current values. all the measurements gave a similar distribution of errors, regardless of the measuring current transformer ratio and the load on the secondary winding. typical graphs that show current and phase errors with the primary current are given in figures 2 and 3.
fig. 2 variation in the current error gi [%] with the primary current i1/in [%] for different loads on the secondary winding (1.25va, cosφ=1; 2.5va, cosφ=0.8; 2.5va, cosφ=1; 10va, cosφ=0.8): a) without the designated error limits, according to standard, and b) with the designated error limits, according to standard (broken line)
the results presented suggest that the current and phase errors are lower than the limit values, regardless of the value and power factor of the primary load, and the load on the secondary windings of the measuring current transformer.
fig. 3 variation in the phase error δi [min] with the primary current i1/in [%] for different loads on the secondary winding (1.25va, cosφ=1; 2.5va, cosφ=0.8; 2.5va, cosφ=1; 10va, cosφ=0.8): a) without the designated error limits, according to standard, and b) with the designated error limits, according to standard (broken line)
figure 4 presents the current error for the different current ratios of measuring current transformers and the 2.5va load on the secondary winding at cosφ=1. it can be seen that changing the current ratio, for the same load on the secondary winding, does not affect the current error.
fig. 4 current error gi [%] versus i1/in [%] for the different current ratios of measuring current transformers (50a/5a, 100a/5a, 150a/5a, 400a/5a)
3.2. testing of digital energy meters
digital energy meters were tested using a control measurement system, i.e. the iskramatic cats system [12, 13]. the testing was done on two three-phase transformer-operated energy meters (manufacturer enel belgrade, type dmg2), and three direct three-phase energy meters, type db2mg, of the same manufacturer [4]. the testing involved the following conditions: voltage: specified phase voltage; current: from 0.5% to 200% of the 5a rated current (transformer-operated energy meters), and from 2.5% to 1000% of the 10a base current (direct energy meters); power factor: cosφ=1, cosφ=0.5 (ind.), cosφ=0.8 (ind.), cosφ=0.8 (cap.); and frequency: the rated frequency of 50 hz. errors for active and reactive electric energy were measured, as well as for the maximum power.
three-phase transformer-operated energy meters
fig.
5 shows the measurement errors occurring when measuring active electric energy with two transformer-operated three-phase energy meters of the same type at cosφ=1. fig. 6 shows the error occurring when measuring active energy at the different power factor values (cosφ=1, cosφ=0.5 (ind), cosφ=0.8 (ind), cosφ=0.8 (cap)) using a three-phase transformer-operated energy meter. a similar distribution of errors occurred when measuring reactive power. the graphs in figs. 5 and 6 show that the measurement errors exceed the error limits set out by the standard.
fig. 5 comparison of the errors g [%] versus current i [%] when measuring active energy with two transformer-operated energy meters of the same type. the broken line presents the error range set out by the standard
fig. 6 error occurring when measuring active energy at the different power factors
direct three-phase energy meters
the base current of the tested direct measurement groups was 10a, whereas their maximum current was imax = 60 a or imax = 80 a. given that the testing was done with currents not exceeding 100a, the error in active and reactive power measurements was within the error range set out by the standard when measurements were performed with the energy meter with maximum current imax = 80 a (tables 3 and 4). the reason for this is the small difference between the maximum current of the meter and the maximum current used in the testing. in the two meters with imax = 60 a, this difference was substantially greater, which resulted in measurement errors (tables 5 and 6).
table 3 percentage errors g % in direct energy meter, accuracy class 1 (active energy measurement, imax=80a)
no.              1      2      3     4     5     6     7     8     9
i [a]            0.25   0.5    1     2     5     10    50    80    100
cosφ             1      1      1     1     1     1     1     1     1
error limit [%]  ±1     ±1     ±1    ±1    ±1    ±1    ±1    ±1    ±1
g %              -0.78  -0.27  0.07  0.19  0.03  0.05  0.31  0.35  0.65
table 4 percentage errors g % in direct energy meter, accuracy class 1 (reactive energy measurement at cosφ=0.5 (ind), cosφ=0.8 (ind), imax=80a)
no.              1      2      3     4      5    6     7      8
i [a]            5      5      10    10     50   50    100    100
cosφ             0.5    0.8    0.5   0.8    0.5  0.8   0.5    0.8
error limit [%]  ±2     ±2     ±2    ±2     ±2   ±2    ±2     ±2
g %              -0.33  -0.13  -0.2  -0.06  0.1  0.04  -0.03  0.17
table 5 percentage errors g % in direct energy meter, accuracy class 1 (active energy measurement, imax=60a)
no.              1     2     3     4     5      6      7      8      9
i [a]            0.25  0.5   1     2     5      10     50     80     100
cosφ             1     1     1     1     1      1      1      1      1
error limit [%]  ±1    ±1    ±1    ±1    ±1     ±1     ±1     ±1     ±1
g %              1.47  0.71  -0.2  -0.3  -0.33  -0.53  -0.66  -2.19  -4.22
table 6 percentage errors g % in direct energy meter, accuracy class 1 (reactive energy measurement at cosφ=0.5 (ind), cosφ=0.8 (ind), imax=60a)
no.              1    2    3    4    5    6    7    8
i [a]            5    5    10   10   50   50   100  100
cosφ             0.5  0.8  0.5  0.8  0.5  0.8  0.5  0.8
error limit [%]  ±2   ±2   ±2   ±2   ±2   ±2   ±2   ±2
g %: -0.76, -1, -1.05, -4.64
maximum power measurement error
the measurement of the maximum power error was done on transformer-operated three-phase energy meters at the current of 9a (180%) and cosφ=1. the results obtained show that the peak power measurement errors at the load of 180%, cosφ=1, at rated voltage and frequency, on the transformer-operated energy meters were -3.201% and -3.154%, respectively. specifically, the reference measuring instrument gave the value of 5.9362kw, whereas the energy meters showed the values of 5.744kw and 5.748kw.
laboratory studies do not fully correspond to real conditions, which is primarily due to the short testing period. in addition, the testing carried out in the laboratory is done for a finite number of measurement points at certain values of currents and power load factors.
in practice, current values and load type can change very quickly within a wide range of values, whereas long-term current overloads on the measurement equipment cause it to overheat, which can affect the measurement characteristics of the equipment and the value of the measurement errors accordingly. hence, the equipment tested in the laboratory was set up in a 10/0.4 kv substation for measurements in real conditions. substation feeders on which major changes and long-term overloads can be expected were used in these measurements.
measurements performed in a 10/0.4 kv substation
the measurements included setting up three measurement systems in a 10/0.4 kv substation. the complete measurement system comprised two transformer-operated energy meters dmg2, a single direct energy meter db2mg (imax = 80 a) and two sets of measuring current transformers with current ratios of 150a/5a and 50a/5a, fig. 7. the measurement systems were connected to each other so as to enable mutual load.
fig. 7 connection diagram of the measurement system in a 10/0.4kv substation (elements shown: dmg2 5(6)a, db2mg 10-80a, dmg2 5(6)a, stem 081 150/5a, stem 081 50/5a; conductors l1, l2, l3, n)
earlier measurements conducted in a substation showed that changes in current were within the range of 70a to 120a. these current values provide nominal operation of the measuring current transformers of 150a/5a current ratio. on the other hand, the transformers with 50a/5a current ratio operate under overload. therefore, one of the transformer-operated energy meters (dmg2) operates in the nominal mode, whereas the other is overloaded. the direct energy meter db2mg works with overload only partially. the average value of the phase voltage during the measurement was 234v. the consumed active and reactive energy and the maximum power were measured over a period of 2h 45min.
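the relative deviations reported for this substation measurement (table 7) can be reproduced directly from the measured totals, taking the system with the 150a/5a transformers and the dmg2 meter as the reference, as the text does. the dictionary keys and the function name below are illustrative choices for this sketch.

```python
def relative_deviation_percent(value, reference):
    """Deviation of a reading from the reference reading, in percent."""
    return (value - reference) / reference * 100.0


# readings of the reference system (dmg2 + mct 150/5) from table 7
reference = {"wa_kwh": 138.90, "wr_kvarh": 31.20, "pmax_kw": 71.16}

# readings of the two other measurement systems from table 7
systems = {
    "db2mg": {"wa_kwh": 135.37, "wr_kvarh": 30.42, "pmax_kw": 66.80},
    "dmg2 + mct 50/5": {"wa_kwh": 119.50, "wr_kvarh": 29.30, "pmax_kw": 57.12},
}

deviations = {
    name: {key: relative_deviation_percent(val, reference[key])
           for key, val in readings.items()}
    for name, readings in systems.items()
}

for name, dev in deviations.items():
    print(name, {key: round(val, 2) for key, val in dev.items()})
```

all deviations are negative, i.e. both the partially overloaded direct meter and the overloaded transformer-operated system register less energy and a lower maximum power than the reference system.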
table 7 shows the results of measurements obtained under the stated conditions. the results indicate a significant difference among the individual measurement systems. it can be assumed that the first measurement system, comprising measuring current transformers with the 150a/5a current ratio and the dmg2 transformer measurement group, gives measurements with an error within an acceptable range (based on the results shown in previous subsections). the relative deviations of the other two measurement systems were calculated with respect to these results. they point to significantly greater deviations than allowed.

table 7 percentage g % errors in direct energy meter, accuracy class 1 (active and reactive energy and peak power)
quantity | dmg2+mct 150/5 | db2mg | dmg2+mct 50/5
wa [kwh] | 138.90 | 135.37 | 119.50
wr [kvarh] | 31.20 | 30.42 | 29.30
pmax [kw] | 71.16 | 66.80 | 57.12
δwa [%] | (ref.) | -2.54 | -13.97
δwr [%] | (ref.) | -2.5 | -6.09
δpmax [%] | (ref.) | -6.13 | -19.73

4. conclusion
this paper presents the results of testing electric energy and maximum power measurement systems within the system of direct and half-indirect measurement at the voltage level of 0.4kv. laboratory studies of measuring current transformers indicated that the current and phase errors are below the limit values, regardless of the power factor, the primary load values and the load on the secondary windings of the measuring current transformer. however, it is important to note that, when selecting a measuring current transformer, attention should be paid to the load on the secondary winding, as it can affect the measurement error.
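the relative deviations listed in table 7 can be reproduced from the measured values, taking the dmg2+mct 150/5 system as the reference, as the text describes. a minimal sketch follows; the variable and function names are illustrative assumptions.

```python
# Measured values from table 7; the dmg2+mct 150/5 system is the reference.
reference = {"wa_kwh": 138.90, "wr_kvarh": 31.20, "pmax_kw": 71.16}
systems = {
    "db2mg": {"wa_kwh": 135.37, "wr_kvarh": 30.42, "pmax_kw": 66.80},
    "dmg2+mct 50/5": {"wa_kwh": 119.50, "wr_kvarh": 29.30, "pmax_kw": 57.12},
}


def relative_deviation_pct(value, ref):
    """Relative deviation of a reading from the reference reading, in percent."""
    return 100.0 * (value - ref) / ref


deviations = {
    name: {
        key: round(relative_deviation_pct(val, reference[key]), 2)
        for key, val in readings.items()
    }
    for name, readings in systems.items()
}

for name, row in deviations.items():
    print(name, row)
```

the computed values agree with the δwa, δwr and δpmax rows of table 7 to the two decimal places reported there.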
laboratory testing of transformer-operated energy meters revealed that the measurement errors of active and reactive electric energy and maximum power are:
- within the limits of the accuracy class in overloads up to 70% (regardless of the load type),
- beyond the limits of the accuracy class in overloads above 70%, i.e.: 1) in 80% overloads the error ranges from 3.154% to 3.5%, and 2) in 100% overloads the error exceeds 9%.
in direct energy meters, measurement results were within the limits of the accuracy class when the maximum current of the measurement group was only slightly lower (by up to 20%) than the maximum operating current. higher values of the operating current result in error values similar to those of transformer-operated energy meters. measurements conducted in the substation confirm the results obtained in laboratory conditions. an increase in the measurement error can be expected under real conditions. the results obtained imply that the energy meters introduce a significant negative measurement error under overload conditions. this implies that, in this operating mode, energy meters register lower values of both the consumed electric energy and the maximum power, which can be interpreted as a loss. future analysis in this area will focus on the influence of current and voltage thd on the errors of measuring current transformers and energy meters.

references
[1] p. duduković, m. đekić, electrical measurements, first edition, naučna knjiga, beograd, 1991. (in serbian)
[2] v. bego, measuring transformers, školska knjiga, zagreb, 1977. (in serbian)
[3] katalog proizvoda strujni transformatori za merenje 0.72 kv, fabrika mernih transformatora zaječar, zaječar 2010.
[4] catalogues – db2mg, dmg1, dmg2, enel belgrade, belgrade, 2010.
[5] a.e. emanuel, j.a. orr, "current harmonics measurement by means of current transformers", ieee trans. power deliv., vol. 22, pp. 1318–1325, july 2007.
[6] p. mlejnek, p.
kaspar, "calibrations of phase and ratio errors of current and voltage channels of energy meter", journal of physics: conference series, vol. 450, p. 012046, 2013.
[7] d. stevanovic, p. petkovic, "the losses at power grid caused by small nonlinear loads", serb. jour. elec. eng., vol. 10, pp. 209–217, feb 2013.
[8] m. soinski, w. pluta, s. zurek, a. kozłowski, "metrological attributes of current transformers in electrical energy meters", in proceedings of the international workshop on 1&2 dimensional measurement and testing. vienna, austria, 2012.
[9] k. draxler, r. styblíkova, "effect of magnetization on instrument transformer errors", jour. elec. eng., vol. 10, pp. 209–217, feb 2013.
[10] srps en 60044-1:2009, merni transformatori deo 1: strujni transformatori, institut za standardizaciju srbije, beograd, 25.02.2009.
[11] srps en 62053-21:2008, oprema za merenje električne energije naizmenične struje deo 21: statička brojila aktivne energije (klase 1 i 2), institut za standardizaciju srbije, beograd, 29.12.2008.
[12] j.g. webster, the measurement, instrumentation and sensors handbook, first edition, crc press, boca raton, fl, usa, 2000.
[13] http://www.iskraemeco.si/emecoweb/eng/products/equipment/iskramatic_cats.html

facta universitatis series: electronics and energetics vol. 30, no 3, september 2017, pp. 313-326 doi: 10.2298/fuee1703313h
circular test structures for determining the specific contact resistance of ohmic contacts
anthony s. holland, yue pan, mohammad saleh n. alnassar, stanley luong
school of engineering, rmit university, melbourne, victoria, australia

abstract. though the transport of charge carriers across a metal-semiconductor ohmic interface is a complex process in the realm of electron wave mechanics, such an interface is practically characterised by its specific contact resistance.
error correction has been a major concern in regard to specific contact resistance test structures and investigations by finite element modeling demonstrate that test structures utilising circular contacts can be more reliable than those designed to have square shaped contacts as test contacts become necessarily smaller. finite element modeling software nastran can be used effectively for designing and modeling ohmic contact test structures and can be used to show that circular contacts are efficient in minimising error in determining specific contact resistance from such test structures. full semiconductor modeling software is expensive and for ohmic contact investigations is not required when the approach used is to investigate test structures considering the ohmic interface as effectively resistive.

key words: ohmic contact, specific contact resistance, contact resistance, test structure, circular transmission line model, transmission line model.

1. introduction
in practice, ohmic contacts are one of the least complex aspects of semiconductor devices. the modelling of an ohmic contact requires only contact geometry, material resistivities and the specific contact resistances of all contact interfaces in the contact structure. if a contact is ohmic then its current-voltage behaviour is linear. an ohmic contact interface has a finite thickness defined by the alignment of the fermi levels of the two contacting materials at equilibrium and the thickness is really that of the 'disturbed' region (depletion layer) of the semiconductor. the 'undisturbed' region (undisturbed by the presence of the metal) of the semiconductor behaves resistively as intended due to whatever doping values it was fabricated with. though the transport of charge carriers across a metal-semiconductor ohmic interface is a complex process, such an interface is practically characterised by a characteristic specific
received january 23, 2017
corresponding author: anthony s.
holland, school of engineering, rmit university, melbourne, victoria, australia (e-mail: anthony.holland@rmit.edu.au)
contact resistance (scr). the evolution of semiconductor devices required the lowering of values of this parameter for ohmic contacts and the investigation of test structures to determine these small values has been a significant and important area of research. error correction has been a major concern in regard to scr test structures and investigations by finite element modelling demonstrate that test structures utilising circular contacts can be more reliable than those with square shaped contacts, which are impractical to realise for small geometries. contacts designed as circles will remain circles when fabricated and hence their area can be accurately determined. test structures with circular contacts can be realised with an equipotential always resulting at the contact circumference and mathematical solutions are obtainable. in this paper it is shown that the finite element modelling software nastran can be used effectively for designing and modelling ohmic contact test structures and that circular contacts are efficient in minimising error in determining scr from such test structures. nastran software solves for heat flow and gives temperature contour distribution. it has been extensively used for solving problems for the analogous situation of electrical current flow and equipotential distribution. full semiconductor modelling software is expensive and for ohmic contact investigation is not required when the approach used is to investigate test structures considering the ohmic interface as purely resistive.
more complex investigations that consider tunnelling probability and other aspects of charge transport across an interface will require software for full semiconductor physics solutions, but this is not necessary when an experimentalist wants to determine the net effect of this physics, which is an interface's scr.
there are two ways to look at the parameter specific contact resistance (represented by the symbol ρc, in Ω·cm²). first there is the rather academic or theoretical way of describing it as the inverse of the derivative of current density with respect to voltage (j-v at the origin) for uniform current density, so that even a schottky contact has a scr value. there is much to be gained from this first approach in understanding the physics of current across a metal-semiconductor junction. semiconductor software tools are well suited to this first approach, but real test structures are not, as it is difficult to isolate a contact so that it is the only entity determining a j-v curve and to have uniform current density at the same time. however, this approach is worth pursuing with computer modelling (and actual test structures if possible) of appropriate test structures to demonstrate the physics in scr equations [1].
the second, more practical investigation is to study scr in regard to practical ohmic contacts only, so that the derivative of a contact's j-v curve is the same at the origin as it is at practical voltage values, e.g. j-v being linear from -5v to +5v. this second approach need not consider the physics of current transfer but rather the effective resistance of a contact interface as it contributes to the total resistance of, for example, a source or drain contact of a mosfet. in practical ohmic contacts the depletion layer of the metal-semiconductor interface is relatively small and, for active layers of a practical contact structure, each layer-to-layer interface can be considered to have a unique scr value.
having determined scr values for any layer-to-layer interface should enable accurate modelling of structures of any geometry involving such interfaces. so, the second approach is very much 'try and see': the first enquiry is to determine whether a particular contact interface is ohmic and, if so, what its scr value is and whether this can be used to determine the effective resistance of a contact of a particular area. the reverse is also used, where the effective resistance of a two-layer contact is used to determine the scr of the interface using appropriate analytical expressions relating scr and contact resistance. although the authors are not aware of any report on using computer modelling in this reverse way, it should be possible to use computer modelling in an iterative way to determine what scr realises a given (e.g. experimentally determined) contact resistance.
the study of scr requires the use of test structures for measuring the voltage drop across the contact of interest and any parasitic resistance encountered. it is the parasitic resistance that causes most difficulty. another difficulty is the effect of contact area: unless the area is small enough that uniform current distribution can be assured, the concept of transfer length has to be considered. it is in regard to area that circular contacts have an advantage (compared to square contacts) in that even though a circular contact realised after fabrication may not have the same diameter as designed, it will still be a circle and its diameter and area can be accurately determined. square designs have the disadvantage of ending up with rounded corners. several test structures have been developed using this advantage of circular contact designs [2-5].
the circular transmission line model (ctlm) ohmic contact test structure [6] was developed not for this advantage of circular contacts always being circular but for the advantage that no mesa etch or active area isolation is required, which is a significant advantage compared to linear transmission line model (tlm) ohmic contact test structures [7, 8]. the disadvantages of the linear tlm include the active layer isolation process steps and active layer overlap of contacts where in theory there should be none. the cross kelvin resistor (ckr) test structure has the same disadvantages [9].

2. ohmic contact characterisation
ohmic contacts are fundamentally important: there are at least two contacts in every transistor and there are billions of transistors on the most complex semiconductor chips. ohmic contact research is crucial for the development of novel nanotechnology devices [1]. it is imperative to have low resistance contacts to these nanoscale devices. fundamental understanding of ohmic contact structures, materials properties and processing will result in better semiconductor device performance and enhanced power efficiency.
scr is an extremely important parameter for quantifying a metal to semiconductor ohmic contact. its theoretical description is defined as the reciprocal of the derivative of current density with respect to voltage at v = 0 [8] (equation 1). a good ohmic contact requires a negligible value of scr to ensure the linear i-v characteristic (between two such contacts) is mainly due to the resistance of the semiconductor.
$$ \rho_c = \left( \frac{\partial J}{\partial V} \right)^{-1} \Bigg|_{V=0} \qquad (1) $$
note that equation (1) is the definition of scr, which is a theoretical quantity referring to the metal-semiconductor interface only. in practice, a more meaningful definition of the scr for a real metal-semiconductor ohmic contact is an electrical parameter which is determined from measured contact resistance between a metal and a semiconductor.
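the theoretical definition above (the reciprocal of the derivative of current density with respect to voltage at v = 0) can be evaluated numerically for any j-v relation. the sketch below is a minimal illustration under stated assumptions: `scr_from_jv` and `ohmic_contact` are hypothetical names, the central-difference step is arbitrary, and the linear j-v curve is a stand-in for measured or modelled data.

```python
import math


def scr_from_jv(j_of_v, dv=1e-3):
    """Estimate scr [ohm cm^2] as 1 / (dJ/dV) at V = 0.

    j_of_v: callable returning current density [A/cm^2] for a voltage [V].
    dv: half-width of the central-difference stencil [V] (assumed value).
    """
    slope = (j_of_v(dv) - j_of_v(-dv)) / (2.0 * dv)  # A / (cm^2 V)
    return 1.0 / slope


RHO_C_TRUE = 1e-6  # hypothetical scr of an ideal ohmic interface [ohm cm^2]


def ohmic_contact(v):
    """Linear j-v relation of an ideal ohmic interface: J = V / rho_c."""
    return v / RHO_C_TRUE


estimate = scr_from_jv(ohmic_contact)
print(math.isclose(estimate, RHO_C_TRUE, rel_tol=1e-9))
```

for a perfectly linear j-v curve the estimate does not depend on the step `dv`; for a nonlinear (e.g. schottky-like) curve the step would need to be small compared with the voltage scale of the nonlinearity.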
scr is a very useful term for characterising ohmic contacts because it is independent of contact area and is a convenient parameter when comparing contacts of various sizes. though in practice an experimentalist or device process engineer will want to know how many ohms an ohmic contact presents to current flow, a design engineer can utilise known scr values to better design and model a contact, provided that the contact layers, interfaces and geometry parameters are all known, including the scr of each interface. to ensure accurate semiconductor modelling, any scr determined experimentally and used in contact or device modelling should be in agreement with equation 1, if this is possible to demonstrate.
ohmic contact characterisation is carried out by using test structures to investigate the electrical behaviour and a suite of materials analysis tools to investigate the materials which make up the ohmic contacts, e.g. a silicide (metal-silicon reaction product) layer. characterisation usually aims to attain the outcomes listed below for optimising contact properties.
1. use test structures to accurately quantify the resistance due to contact interfaces. this resistive property is quantified using scr, and improved efficiency and speed in determining low scr values for ultra-small contacts is often a goal of research in this area.
2. understand the influence of mechanical, electrical and thermal materials parameters, in particular the influence of defects and stress formation at the contact interface, on scr values.
3. optimise test structures and demonstrate new ones to confirm a test structure's suitability for determining processing changes that contribute to reducing the scrs of, for example, metal-silicide-silicon contacts.
4. hybridisation of analytical calculations and numerical computations of ohmic contact architectures to model the electrical behaviour of fabricated test structures.
item 4 above is an area that could be explored further. multilayer ohmic contact test structures will of course have an effective resistance to electrical current, and the resistance of such contact structures can be determined more accurately if the scrs of the interfaces are included; for ohmic contact structures the only other parameters are layer resistivities and geometries.
scr is a parameter that has been reduced by several orders of magnitude throughout the semiconductor era (due, for example, to the introduction of silicides for silicon contacts [17]). reported values of scr for some ohmic contacts are listed in table 1. in the international technology roadmap for semiconductors (itrs), the scr values required for particular technology nodes have been given in many of its publications, showing a significant reduction in the required value as technology generations progress. in 2017 the target is in the low 10⁻⁹ Ω·cm² range.
determining the value of scr quantifies the interface for a particular processing technology and gives information about the quality of an ohmic contact fabrication process. it also allows for comparison of different two-layer ohmic contacts or of different device processes using the same two layers for contacts. hence, determining scr allows for optimisation of the process for forming an ohmic contact. determining accurate values of scr will aid in better modelling of contact structures, in order to minimise the contact resistance (rc). note that scr is the biggest contributor to rc for relatively small contacts. minimising rc in turn minimises the net resistance of a circuit and its overall power consumption; this will result in more power efficient devices and circuits.
unlike contact resistance rc, the scr value should not include contributions from the resistivity of the two contacting layers or topological effects due to the contact geometry design.
if a reported value of scr does include these effects it is regarded as an effective scr for a particular contact (including geometry) and such a scr value cannot be used in designing and modelling other geometries with the same contact layers (and processing steps).
the units of scr (Ω·cm²) may be misleading to some researchers who have not specialised in this area. this parameter cannot be used directly to determine the resistance contribution of the interface in a two-layer contact unless one is confident that the current is uniformly distributed in the contact interface. if current can be assumed to be distributed uniformly in the interface of a two-layer contact then for area a (cm²) the resistance of the interface is simply ρc/a (Ω). this assumption does not always hold (unless a is relatively small) because electrical current in most semiconductor devices (which are planar) has to turn 90° into a contact, and so is not always uniformly distributed across a contact interface. for example, current can flow laterally under the gate region of a transistor and then turn upwards through the drain ohmic contact; the same applies to a contact in a test structure. the distribution of current in the drain contact area is dependent on the value of ρc, but is also influenced by other parameters such as the resistivity of any silicide used, the interconnect material, any liner used, and the geometry of these materials. intuitive understanding in this case can be misleading, and only rigorous analytical and numerical modelling will portray the actual current distribution.

table 1 reported values of specific contact resistance (scr) for some ohmic contacts
ohmic contact layers | scr value (Ω·cm²) | ref.
al-si | 1 × 10⁻⁶ | [11]
al-wsix-si | 3 × 10⁻⁷ | [12]
al-tisi2-si | 1 × 10⁻⁸ | [13]
al-tisi2 | 4 × 10⁻⁹ | [5]
nisi-si | 5 × 10⁻⁹ | [14]
nige-ge | 2.3 × 10⁻⁹ | [15]
tisix-si | 1.3 × 10⁻⁹ | [16]

3.
test structure modelling
the main test structures used for characterising ohmic contacts, and determining scr in particular, are the transmission line model (or transfer length method) (tlm) [7,8], the cross kelvin resistor [9], and the circular transmission line model [6]. more recent test structures are the multi-ring ctlm [15], the refined tlm [16] and the two-circle electrode contacts [4].
one of the main issues with test structures based on the transmission line model is that they are essentially 2-d models and do not allow for vertical voltage drops. an estimate as to whether a 3d correction is applicable to a contact can be made by calculating the ratio ρc/(ρb·t), where ρc is the scr, ρb is the resistivity of the semiconductor layer and t is its thickness. this parameter was first used by berger [8] to estimate the influence of semiconductor depth and resistivity on the derivation of ρc using transmission line model test structures. the parameter gives an indication of the ratio of the voltage drop across the contact interface to the voltage drop in the vertical direction occurring in the semiconductor material beneath the contact. when the ratio is less than 1, 3d effects are significant, as the voltage drop in the vertical direction in the semiconductor layer is nominally greater than the voltage drop across the contact interface. when the ratio is greater than 1 and increasing, the voltage drop in the vertical direction becomes less important (the contribution of this vertical voltage drop to the measured values becomes less significant) and 2d modelling will be sufficiently accurate. calculation of the ratio requires some knowledge of ρc; however, an initial upper figure for it can be found using ρc determined from a 2d correction. this will give an indication of whether a 3d correction may be applicable. if the ratio is less than 1 and the corrections are made using 2d data, then significant errors can be introduced (overestimation) in the derivation of ρc [18].
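the check described above reduces to evaluating the dimensionless ratio of scr to the product of semiconductor resistivity and layer thickness, ρc/(ρb·t), and comparing it with 1. a minimal sketch follows; the function names and the numerical values are hypothetical and chosen only to show one case on each side of the threshold.

```python
import math


def interface_to_bulk_ratio(rho_c, rho_b, t):
    """Dimensionless ratio rho_c / (rho_b * t).

    rho_c: specific contact resistance [ohm cm^2]
    rho_b: semiconductor layer resistivity [ohm cm]
    t:     semiconductor layer thickness [cm]
    """
    return rho_c / (rho_b * t)


def needs_3d_correction(rho_c, rho_b, t):
    """True when the vertical voltage drop is expected to matter (ratio < 1)."""
    return interface_to_bulk_ratio(rho_c, rho_b, t) < 1.0


# Hypothetical layer: 0.01 ohm cm resistivity, 1 um (1e-4 cm) thick.
low_scr = interface_to_bulk_ratio(1e-7, 0.01, 1e-4)   # about 0.1
high_scr = interface_to_bulk_ratio(1e-5, 0.01, 1e-4)  # about 10
print(low_scr, needs_3d_correction(1e-7, 0.01, 1e-4))
print(high_scr, needs_3d_correction(1e-5, 0.01, 1e-4))
```

because ρc is not known in advance, the value passed in would in practice be the upper figure obtained from a 2d correction, as the text notes.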
by using finite element analysis we can optimise the use of material and geometries of interconnect to minimise ohmic contact and interconnect via resistance [18]. the equation used to describe d.c. electrical conduction (equation 2) is analogous to that for thermal conduction (equation 3):
$$ J = -\sigma \frac{\partial V}{\partial n} \qquad (2) $$
where J = electrical current density, V = voltage, n = spatial coordinate in the direction of current flow and σ = material electrical conductivity, and
$$ h = -k \frac{\partial T}{\partial n} \qquad (3) $$
where h = heat flux, T = temperature, n = spatial coordinate in the direction of heat flow, and k = material thermal conductivity.
equations 2 and 3 have the same form and therefore can be solved using the same finite element program. nastran is a finite element program developed by nasa for heat transfer analysis (and mechanical structural analysis). nathan et al. [19] reported on the use of this program for electrical analysis based on the analogy indicated by equations 2 and 3. nastran has been used by the authors to design and model various ohmic contact test structures as well as interconnect vias [20], as shown in figure 1.
figures 2 (a) and (b) show an example of modeling a ckr test structure using nastran. in figure 2(b) the metal layer has been lifted up to show the equipotentials. the contact layers are typically separated by a thin oxide layer with the contact opening. vb is the value of the equipotential of the voltage tap of the (top) metal layer of the contact. the voltage measured on the tap (va) is used to determine the average voltage at the bottom of the contact interface. figure 3 shows an ideal ckr ohmic contact test structure. it can easily be appreciated that such a test structure is not possible to realise, and contact widths smaller than the current and voltage arms are required (stavitski et al give an excellent report on using ckr test structures in [21]). this leads to parasitic error which can be studied using software such as nastran.
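because the conduction equations for current and heat share the same form, one routine can serve both problems. the sketch below is a one-dimensional illustration only, not a nastran model: it computes the steady flux magnitude through layers in series, optionally including area-specific interface resistances such as an scr. the function name and all numerical values are assumptions made for the example.

```python
def flux_through_stack(potential_drop, layers, interface_resistances=()):
    """Steady 1-d flux magnitude through layers in series.

    potential_drop: voltage drop [V] or temperature drop [K] across the stack.
    layers: iterable of (thickness, conductivity) pairs, with electrical
        conductivity for the electrical problem or thermal conductivity
        for the thermal problem.
    interface_resistances: area-specific resistances in series with the
        layers (for the electrical problem, scr values in ohm cm^2).
    """
    specific_resistance = sum(th / cond for th, cond in layers)
    specific_resistance += sum(interface_resistances)
    return potential_drop / specific_resistance


# Electrical reading: result is a current density J.
j = flux_through_stack(4.0, [(1.0, 2.0), (3.0, 6.0)], [1.0])
# Thermal reading of the same numbers: result is a heat flux h.
h = flux_through_stack(4.0, [(1.0, 2.0), (3.0, 6.0)], [1.0])
print(j, h)
```

the two calls are deliberately identical: only the physical interpretation of the inputs changes, which is the basis for using a thermal finite element solver for electrical conduction problems.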
figure 4 shows a possible test structure for fast turnaround in ckr measurements using the technique described in [5]. again, the software nastran can be readily used to model such a test structure. the circular contacts used can be as small as possible as long as their diameters can be measured. this contrasts with square contacts, which will most likely have rounded corners (figure 5). extrapolation of scr' (scr plus parasitic resistance effect) for small contacts, where the effective scr' is determined for each d/w value using scr' = (va - vb) × area, gives the actual scr, as shown in [5]. again, the use of circular contacts is more reliable as contact area can be reliably determined using measured diameters. the series of ckr test structures demonstrated in figure 4 utilises the technique of the ckr and the accuracy of determining the area of circular contacts. the benefit of the series of ckr structures of figure 4 is that as the contact becomes vanishingly small the contact resistance will dominate the ckr resistance measurement.
the possible problems with tlm test structures can be demonstrated using nastran modelling. figure 6 shows the effect of vertical voltage drop in the semiconductor layer, which occurs when the semiconductor resistance (due to semiconductor resistivity and thickness) below the contact is comparable to that due to the contact interface (scr effect). investigation shows the relevance of the ratio ρc/(ρb·t) [17]. in fig 6(a) there is the effect of horizontal and vertical voltage drop, and in fig 6(b) the semiconductor has only the horizontal resistance effect of sheet resistance and the tlm equations can be reliably applied. figure 7 shows a schematic with the inclusion of this tlm contact section in a test structure; the effect is to increase the value of rc determined. similar error contributions occur for the ckr test structure [10].
fig.
1 example of equipotential distribution in an interconnect via (for input current i) determined using nastran finite element modeler. ρc1 is the specific contact resistance between metal1 and the via liner material [7].
fig. 2 (a) example of equipotentials in a cross kelvin resistor test structure for ohmic contact characterisation of a semiconductor layer (bottom layer) to a metal layer contact. the distribution of equipotentials in the semiconductor layer's current input arm and the voltage (va) tap is more clearly shown in (b). the metal layer is shown as having one equipotential (vb) for the scale used. (modelled using nastran).
fig. 3 ideal ckr test structure, where the square contact area has the same width as the four arms.
fig. 4 (a) schematic of a chain of ckr test structures with varying contact sizes to determine scr and quick electrical testing. v1a etc. are voltages measured on the respective ckr taps. (b) expected and observed trend for scr' determined for varying ckr contact geometry. the actual value of scr is obtained by extrapolating to d/w = 0. d is the contact diameter and w is the ckr arm width.
fig. 5 possible effects of fabrication steps in reducing designed area of contacts of circular and square shapes.
fig. 6 examples of equipotentials (volts) distribution for tlm models of metal to semiconductor contacts. in (a) the structure has ρc/(ρb·t) < 1 and in (b) it has ρc/(ρb·t) > 1, this ratio being the parameter introduced by berger [17] to quantify the effect of semiconductor layer resistivity on tlm resistance measurements.
fig. 7 (a) schematic of tlm test structure for determining contact resistance (rc) by measuring resistance between two contacts.
(b) shows the equipotential distribution where the vertical voltage drop is significant, as indicated by the curvature of the equipotentials. the tlm model does not include this contribution, whereas measurements will. hence error results when the measured rc is used to determine the scr of the contact. (c) plan view of tlm test structure. (i is input current, rsh is sheet resistance, l is distance between contacts, w is width of active layer).

4. 2d circular specific contact resistance test structure
the circular transmission line model (ctlm) test structure can be demonstrated using nastran finite element modelling. this test structure completely eliminates alignment error (as there is no alignment); error is mainly due to any inaccuracy in sheet resistance and, as with the tlm and ckr, the finite resistivity of the semiconductor layer can cause a significant voltage drop in the semiconductor layer under the contact. yue et al [4] reported a technique using the ctlm test structure shown in figure 8 (a) and figure 9. the outer radius r1' is regarded as infinite in figure 9. here we will call this test structure the yue-2d. it consists of three electrode discs, and resistance measurements from these can relatively easily give semiconductor sheet resistance and scr. the main error that can occur in the yue-2d will be due to the ratio ρc/(ρb·t) [17]. figure 8 (b)-(d) show images from examples of nastran finite element modelling of the yue-2d test structure. the perfect symmetry of each electrode means that only a small 'wedge' of each of the three electrodes (of fig. 9) needs to be modelled. the equipotentials shown in the semiconductor layer of figure 8(d) are similar to those in figure 6(b), where the vertical arrangement of the equipotentials indicates that there is little voltage drop in the vertical direction and hence accurate determination of sheet
fig.
8 (a) schematic of circular transmission line model (ctlm) test structure using two electrodes, for determining contact resistance (rc) and scr, (b) finite element mesh used to model representative section of ctlm, (c) nastran model result showing equipotential distribution for two electrode ctlm and (d) section of ctlm showing equipotentials in semiconductor layer.
resistance and scr should ensue. again, an advantage of the circular electrodes is that accurate contact geometry can be measured (the fabricated contacts will be circular) and the actual radii can be used in calculations to determine contact parameters. extremely small contacts can also be realised when the value of scr is small, and appropriate geometry for this is described in [4]. such a test structure with extremely small contacts will require more than one metal layer [22] in order to connect a probe to the electrode.
fig. 9 schematic of the yue-2d test structure for determining semiconductor layer sheet resistance and scr of metal to semiconductor interface [4].

5. 3d circular specific contact resistance test structure
the scr of metal contacts to bulk semiconductor material is not usually reported, as the main interest for the semiconductor industry is in determining and reducing scr to shallow active layers. however, the authors consider the test structure shown in figure 10, hereafter called the yue-3d, to give the most reliable measurements [3]. unlike the test structures reported previously in this paper, though, there is no analytical solution available relating resistance measurements and scr. solutions have to be obtained by computer modelling, with resistance measurements plotted as a function of varying semiconductor resistivity, scr and the two radii (see figure 11). unlike the yue-2d, the yue-3d only needs one resistance measurement (from one pair of electrodes).
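since no analytical solution relates the measured resistance to scr for this structure, the scr has to be recovered by comparing a measurement with modelled results. one way to sketch that inversion is a bisection search over scr on a forward model that returns total resistance. everything named below is an assumption for illustration: `solve_scr` is not the authors' procedure, and `toy_forward_model` is a crude placeholder (uniform current through a single circular contact plus a fixed bulk term) standing in for a finite element run or for interpolation of modelled curves.

```python
import math


def solve_scr(target_resistance, forward_model, lo=1e-10, hi=1e-4,
              tol=1e-6, max_iter=200):
    """Find rho_c [ohm cm^2] with forward_model(rho_c) == target_resistance.

    Assumes forward_model increases monotonically with rho_c on [lo, hi].
    """
    if not forward_model(lo) <= target_resistance <= forward_model(hi):
        raise ValueError("target resistance is outside the bracketed range")
    for _ in range(max_iter):
        mid = math.sqrt(lo * hi)  # geometric midpoint: scr spans decades
        if forward_model(mid) < target_resistance:
            lo = mid
        else:
            hi = mid
        if hi / lo - 1.0 < tol:
            break
    return math.sqrt(lo * hi)


def toy_forward_model(rho_c, radius_cm=3e-4, bulk_ohm=5.0):
    """Placeholder forward model, NOT the yue-3d solution."""
    area = math.pi * radius_cm ** 2
    return bulk_ohm + rho_c / area


measured = toy_forward_model(2e-8)  # synthetic 'measurement' for the demo
recovered = solve_scr(measured, toy_forward_model)
print(math.isclose(recovered, 2e-8, rel_tol=1e-5))
```

the search is done on a logarithmic scale because scr values of interest span several orders of magnitude; with a real forward model each evaluation would be one finite element solution, so the number of iterations matters.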
because of its accuracy, this test structure would be very suitable for studies of scr where a series of substrates is available with varying resistivity, and for investigating the effects of surface treatments on scr. the yue-3d can be used for investigating ohmic contacts to bulk semiconductors where the semiconductor has uniform resistivity to a depth of several times the inner radius (r1) of the outer electrode shown in figure 10(a). as in the yue-2d, the outer radius r1′ can be infinite [2]. a scaling equation can be applied to this test structure, similar to that reported by loh et al. [23] for ckr test structures:

$$R_T(m r_0,\, m r_1,\, m r_2,\, n\rho_b,\, mn\rho_c) = \frac{n}{m}\, R_T(r_0,\, r_1,\, r_2,\, \rho_b,\, \rho_c) \qquad (2)$$

fig. 10 (a) schematic of the yue-3d test structure for determining scr of a metal to semiconductor contact interface for bulk semiconductor [3], (b) example of equipotential distribution in a section of the yue-3d test structure obtained from fem modeling using nastran.

fig. 11 example of fem (nastran) analysis results for total resistance rt between two electrodes (fig. 10) as a function of scr (ρc) with resistivity ρb varying from 0.001 Ω·cm to 0.01 Ω·cm. geometry is fixed; r0 = 3 μm, r1 = 5 μm, and r2 = 9 μm. note that this figure can be scaled using (3). [2]

6. conclusion

this paper has reviewed ohmic contact test structures investigated by the authors for ohmic contact characterisation between a metal and semiconductor in both two-dimensional (2-d) and three-dimensional (3-d) circumstances. the issues with regard to error correction, difficulty in analysing results and difficulty in fabrication led to the development of test structures with circular electrodes.
these issues are (i) active layer definition, (ii) contact misalignment and overlap, (iii) equipotential problem, (iv) complicated analytical expressions and (v) vertical voltage drop. when the semiconductor layer in a metal-to-semiconductor contact is neither true 2-d nor true 3-d, there will always be some error, and error correction is required. for the test structure presented here, accurate results can be always determined when semiconductor layer can be regarded as truly 2-d or 3-d. in summary, all of the above issues with conventional test structures have been addressed and improved by the novel test structures (yue-2d and yue3d) developed for ohmic contact characterisation in both 2-d and 3-d circumstances. the corresponding methods for determining scr have also been presented and demonstrated using finite element modeling (fem). because of the resistance only effect of ohmic contacts, a full semiconductor physics modelling program is not required. commercially available fem software for static thermal analysis, such as nastran can be used for ohmic contact test structure investigation considering the analogous equations for heat and electric current flow. the yue-2d set of three two-contact circular test structures does not require mesa isolation and correction factors are unnecessary. furthermore, the analytical expressions are relatively simple compared to the conventional ctlm test structure. a 3d test structure (yue-3d) was demonstrated that should be most accurate in determining specific contact resistance. references [1] hiep n. tran, tuan a. bui, aaron m. collins, and anthony s. holland, “consideration of the effect of barrier height on the variation of specific contact resistance with temperature”, ieee trans. electron devices, vol. 64, no. 1, pp. 325, 2017. [2] a. m. collins, y. pan, a. s. 
holland, “using a two-contact circular test structure to determine the specific contact resistivity of contacts to bulk semiconductors”, facta universitatis, series electronics and energetics, vol. 28, no. 3, pp. 457 – 464, september 2015. [3] y. pan, a. m. collins and a. s. holland, "determining specific contact resistivity to bulk semiconductor using a two-contact circular test structure", in proceedings of the ieee international conference on miel, may 2014, pp. 257-260 [4] y. pan, g. k. reeves, p. w. leech, and a. s. holland, “analytical and finite-element modeling of a twocontact circular test structure for specific contact resistivity,” ieee trans. electron devices, vol. 60, no. 3, pp. 1202–1207, mar. 2013. [5] a. s. holland, g. k. reeves, "new challenges to the modelling and electrical characterisation of ohmic contacts for ulsi devices", microelectronics reliability, vol. 40, pp. 965-971, 2000. [6] g. k. reeves, “specific contact resistance using a circular transmission line model,” solid state electron., vol. 23, no. 5, pp. 487–490, may 1980. [7] w. shockley, “research and investigation of inverse epitaxial uhf power transistors”, air force atomic laboratory, wright-patterson air force base, rep. no. al-tdr-64-207, sept. 1964. [8] h. berger, “models for contacts to planar devices,” solid state electronics, vol. 15, pp. 145-158, 1972. [9] s. j. proctor and l. w. linholm, ieee electron device lett., edl-3 (10) 294 (1982). [10] c. y. chang, y. k. fang, and s. m. sze, “specific contact resistance of metal-semiconductor barriers,” solid-state electron., vol. 14, no. 7, pp. 541–550, jul. 1971. 326 a. s. holland, y. pan, m. s. n. alnassar, s. luong [11] g. srinivasan, m. f. bain, s. bhattacharyya, p. baine, b. m. armstrong, h.s. gamble, d. w. mcneill, mat. sci. eng. b, 114-115, pp.223-227, 2004. [12] m. finetti, s. guerri, p. negrini, a. scorzoni, and i. suni, thin solid films, vol. 130, no. 37, 1985. 
[13] majumdar et al, “stlm: a sidewall tlm structure for accurate extraction of ultralow specific contact resistivity”, ieee trans. electron devices, vol. 34, no. 9, september 2013. [14] miyoshi et al, “in-situ contact formation for ultra-low contact resistance nige using carrier activation enhancement (cae) techniques for ge cmos”, in digest of technical papers symposium on vlsi technology, 2014. [15] yu et al, “titanium silicide on si:p with precontact amorphization implantation treatment: contact resistivity approaching 1 × 10 −9 ohm-cm 2 ”, ieee trans. electron devices, vol. 63, no.12, september 2016. [16] r. dormaier and s. e. mohney, “factors controlling the resistance of ohmic contacts to n-ingaas,” j. vac. sci. technol. b, vol. 30, no. 3, pp. 031209-1–031209-10, may/jun. 2012. [17] n. stavitski, m. h. van dal, a. lauwers, c. vrancken, a. y. kovalgin, and r. m. wolters, “evaluation of transmission line model structures for silicide-to-silicon specific contact resistance extraction,” ieee trans. electron devices, vol. 55, no. 5, pp. 1170–1176, may 2008. [18] holland a. s. and reeves g.k., “new challenges to the modelling and electrical characterisation of ohmic contacts for ulsi devices", in proc. of the miel 2000 conference, vol. 2, pp.461-464, nis, may 2000. [19] m. nathan, s. purushothaman and r. dobrowolski, “geometrical effects in contact resistance measurements: finite element modelling and experimental results”, j. appl. phys., vol. 53, no. 8, pp. 5776-5782, august 1982, [20] anthony s. holland, geoffrey k. reeves, patrick w. leech, “finite element modelling of misalignment in interconnect vias”, pp. 307-310, commad, brisbane 2004. [21] n. stavitski, j. h. klootwijk, h. w. van zeijl, a. y. kovalgin, and r. a. m. wolters, “cross-bridge kelvin resistor structures for reliable measurement of low contact resistances and contact interface characterization,” ieee trans. semicond. manuf., vol. 22, no. 1, pp. 146–152, feb. 2009. 
[22] y. pan, “versatile circular test structure for ohmic contact characterisation”, phd thesis, rmit university, 2015. [23] w. m. loh, s. e. swirhun, t. a. schreyer, r. m. swanson, and k. c. saraswat, “analysis and scaling of kelvin resistors for extraction of specific contact resistivity,” ieee electron device lett., vol. edl-6, pp. 105–108, mar. 1985.

facta universitatis series: electronics and energetics vol. 30, no 2, june 2017, pp. 161-178 doi: 10.2298/fuee1702161p

spice modeling of ionizing radiation effects in cmos devices

tatjana pešić-brđanin
faculty of electrical engineering, university of banja luka, republic of srpska, bosnia and herzegovina

abstract. electric characteristics of devices in advanced cmos technologies change over time because of ionizing radiation effects. device aging is caused by the cumulative generation of defects in the gate oxide and/or at the silicon-oxide interface. the concentrations of these defects are time- and bias-dependent. existing models include these effects through a constant shift of the threshold voltage. this paper presents a method for including ionizing radiation effects in spice models of the mos transistor and finfet. the method is based on an auxiliary diode circuit that is used to derive the values of the surface potential and that also calculates the time-dependent correction voltage due to the concentration of trapped charges.

key words: ionizing radiation effects, trapped charges, spice model, cmos devices

1. introduction

with aggressive scaling of device dimensions in cmos technologies, which includes the decrease of oxide thickness and the increase of doping concentration in the channel, the susceptibility of most cmos technologies to ionizing radiation has been reduced. scaling of the oxide thickness decreased the concentration of fixed charge in the oxide, because this concentration is directly proportional to the oxide thickness.
on the other hand, the increase of doping concentration in the channel decreased the effect of oxide trapped charge on the surface potential of the channel, which also made the components more robust to ionizing radiation [1]. however, recent studies showed that negative bias temperature instability damage and hot carrier injection damage are attributed to the charges trapped in the oxide (with areal density nox) and/or at the interface of the silicon and oxide layers (with energy density distribution dit) [2-4]. therefore, trapped charges still represent a potential radiation threat and have a measurable impact on the performance of integrated circuits [2,5].

received november 2, 2016
corresponding author: tatjana pešić-brđanin, faculty of electrical engineering, patre 5, 78000 banja luka, republic of srpska, bosnia and herzegovina (e-mail: tatjanapb@etfbl.net)

the harmful effect of ionizing radiation on cmos devices can be diminished by using well-known techniques, such as radiation-hardening-by-process (rhbp) and radiation-hardening-by-design (rhbd) [6,7]. however, even with significant efforts in rhbp and rhbd techniques, the capability of estimating the influence of ionizing radiation on the electric characteristics of devices in advanced technologies is still inadequate [8]. testing integrated circuits under ionizing radiation is quite expensive [7], so incorporating ionizing radiation effects into the compact device models used in standard circuit simulators presents itself as an alternative. this incorporation requires knowledge of the physical processes which contribute to the emergence of defects due to ionizing radiation, and of the impact which these defects have on the electric characteristics of components in advanced cmos technologies [8,9].
numerous existing techniques for modelling these effects in circuit simulators are based on a fixed change of threshold voltage (threshold voltage shift), without considering the specific impact which these defects have on the electric characteristics of the transistors [2,10-12]. this paper describes how the previously derived surface-potential-based non-quasi-static mos model (nqs mos model) and non-quasi-static soi model (nqs soi model) can be modified to include the effects of oxide trapped charges and interface trapped charges [13,14].

2. ionizing radiation effects in cmos devices

the main cause of the damage that occurs in cmos devices after ionizing radiation is the generation of electron-hole pairs in the oxide (or another dielectric), which is the material most sensitive to ionizing radiation in cmos devices. after the generation of the electron-hole pairs, some of the pairs recombine immediately. since the electron mobility in the oxide is considerably higher than the hole mobility [15,9], the electrons are soon swept out of the oxide or the dielectric, while the holes move slowly through the oxide towards the sio2-si interface, causing the long-term effects of the ionizing radiation. fig. 1 shows the processes after the ionizing radiation.

fig. 1 processes in the oxide after the ionizing radiation [16]

vacancies in the oxide or the dielectric can trap the generated holes. the total amount of trapped charge in the oxide is nox. the trapped charge changes the threshold voltage vth of cmos devices by the threshold voltage shift [17]:

$$\Delta V_{th} = -\frac{q\, t_{ox}^{2}\, N_{ox}}{\varepsilon_{ox}}, \qquad (1)$$

where q is the electron charge, tox is the oxide thickness and εox is the oxide permittivity.
the threshold voltage shift $\Delta V_{th}$ is negative, which means that in the case of the nmos transistor the off current increases, while in the case of the pmos transistor the absolute value of the threshold voltage increases, as shown in fig. 2(a). it can be concluded from (1) that $\Delta V_{th}$ depends on the square of the oxide thickness; with the decrease of the oxide thickness in nanometer cmos technologies, the change of the threshold voltage due to the oxide trapped charge will therefore be smaller.

fig. 2 illustration of the threshold voltage shift $\Delta V_{th}$ due to the oxide trapped charges (a) and increase in subthreshold swing due to interface trapped charges (b) [17]

after the ionizing radiation, interface traps are also generated, with concentration nit. the generated holes react with hydrogen atoms in the oxide, producing H+ ions [18]. these ions drift to the sio2-si interface and create dangling bonds (i.e. pb centres) [2]. interface trapped charges are often linked with the permanent effects of component aging [2,10]. fig. 2(b) shows the impact of the generation of trapped charges at the sio2-si interface on the transfer characteristic of the transistors. it can be noted that these charges increase the swing in the device subthreshold region. for nmos and pmos transistors, the generation of interface trapped charges decreases the transistor off current.

3. nqs mos and nqs soi transistor models

static and dynamic characteristics of transistors can be described by a set of basic equations, comprising poisson's equation and the drift-diffusion and continuity equations [19]. since mos transistor modelling is a three-dimensional problem, solving this set of equations is complex and memory demanding. however, for numerous practical applications of mos transistors, changes in the third direction can be neglected and the problem can be reduced to two dimensions (the x and y directions).

3.1.
nqs mos transistor model in [13] a physically based nqs mos transistor model is described, which belongs to a group of models based on surface potential. fig. 3 shows equivalent model scheme, which as a subcircuit can be embedded into electric circuit simulators. external elements of transistor model (resistors and capacitors) can be modelled in a similar way as in other stationary or non-stationary models. unlike some known models [20-22], in the nqs mos model there are no analytical expressions for node currents, but they are obtained after the solution of equivalent circuit shown on fig. 3(a). this subcircuit has two parts, as shown on fig. 3(b):  internal part is connected to transistor gate terminal. this part of the model is, in fact, equivalent line that models drift-diffusion transport of electrons in transistor channel;  external part is connected to source, drain and gate terminals, and it contains current-controlled current sources is1 and isn. this part of the circuit is defined by the potential of source, drain and substrate that is obtained by mirroring the currents which flow through voltage sources s1 and sn. voltage generators s1 and sn copy values of boundary surface potentials to subcircuit in the source end and the drain end of channel. voltage generator vb serves to copy bulk polarisation to equivalent subcircuit. capacitance coxk represents gate-oxide capacitance (coxk = cox / n). the other model elements rk and ck, non-linear channel resistance and depletion region capacitance, are respectively defined by the equations: 3 31/ 1 2 1 4 5 6 (1 ( )) (1 ( ( )) ) , ( ) a a gs sk sk sk k gb fb sk sk a v a r a a v v a                (2) 1/ 20 7 ( ) , 2 bk si ch k sk sk sk q qn c a           (3) spice modeling of ionizing radiation effects in cmos devices 165 where the constants a1  a7 are physically based, nch is doping concentration in the channel and si is the silicon permittivity. 
surface potential of every cell is denoted with sk. the derivations for (2) and (3) and the expressions for a1  a7 are given in [13]. (a) (b) fig. 3 nqs mos model (a) and the equivalent subcircuit (b) in a surface charge-sheet model, which describes mos transistor operation [23], the boundary channel potentials s1 and sn at the source and drain side are functions of biasing voltage of transistor terminals through the following recurrent relations [24]: 166 t. pešić-brđanin 2 1 1 12 1 1 2 ln ( ) , s f sb t gb fb s s t v v v v v                    (4) 2 2 1 1 2 ln ( ) . sn f sb ds t gb fb sn sn t v v v v v v                     (5) in the previous equations  is the body factor, vt is the thermal voltage, f is the channel potential (=vt ln(nch/ni))) and vfb is the flatband voltage. since the equations (4) and (5) are implicit relations, to determine surface potentials s1 and sn there are several iterative methods proposed in the literature [25]. in the nqs mos model, relations (4) and (5) are determined by diode circuits. for any point y in the channel is: 2 2 1 1 exp( / ) 1 exp(2 / ) ( ) 1. sy t fy t gb fb sy sy t v v v v v                     (6) by comparing the equation (6) with the diode current expression: 0 (exp( / ) 1) d sy t ss i i v i   (7) the conclusion is that: 2 2 0 1 1 exp(2 / ) ( ) 1, 1. ss fy t gb fb sy sy t i v v v v i                    (8) when determining the boundary surface source potential s1, in the equation (8) sy and fy should be replaced with sy = s1 and fy = 2f + vsb, consecutively, while for determining boundary surface potential on the drain side sn instead sy and fy should be used sn and 2f + vsb + vds, respectively. 
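the implicit surface potential relation can be illustrated numerically. the following is a minimal sketch, not the diode-circuit method of the paper: it assumes the common textbook charge-sheet form vgb = vfb + φs + γ·sqrt(φs + vt·exp((φs − 2φf − vcb)/vt)), which may differ in small terms from (4) and (5), and it solves it by plain bisection. the function names, the material constants and the choice vfb = −1 V in the usage note are illustrative assumptions.

```python
import math

Q = 1.602176634e-19        # elementary charge [C]
EPS0 = 8.8541878128e-14    # vacuum permittivity [F/cm]
EPS_SI = 11.7 * EPS0       # silicon permittivity [F/cm]
EPS_OX = 3.9 * EPS0        # SiO2 permittivity [F/cm]
NI = 1.0e10                # intrinsic carrier density of Si near 300 K [cm^-3], approximate
VT = 0.02585               # thermal voltage near 300 K [V]


def body_factor(n_ch, t_ox):
    """Body factor gamma [V^0.5] for channel doping n_ch [cm^-3] and oxide thickness t_ox [cm]."""
    c_ox = EPS_OX / t_ox
    return math.sqrt(2.0 * Q * EPS_SI * n_ch) / c_ox


def fermi_potential(n_ch):
    """Bulk potential phi_f = Vt * ln(Nch / ni) [V]."""
    return VT * math.log(n_ch / NI)


def residual(phi_s, v_gb, v_fb, v_cb, gamma, phi_f):
    """Assumed textbook charge-sheet relation, written as f(phi_s) = 0."""
    inversion = VT * math.exp((phi_s - 2.0 * phi_f - v_cb) / VT)
    return v_fb + phi_s + gamma * math.sqrt(phi_s + inversion) - v_gb


def solve_surface_potential(v_gb, v_fb, v_cb, n_ch, t_ox):
    """Bisection on [0, v_gb - v_fb]; v_cb is v_sb at the source end, v_sb + v_ds at the drain end."""
    gamma = body_factor(n_ch, t_ox)
    phi_f = fermi_potential(n_ch)
    lo, hi = 0.0, v_gb - v_fb
    if hi <= 0.0 or residual(lo, v_gb, v_fb, v_cb, gamma, phi_f) > 0.0:
        raise ValueError("sketch only covers depletion/inversion (v_gb above v_fb)")
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        # the residual increases monotonically with phi_s, so the root stays bracketed
        if residual(mid, v_gb, v_fb, v_cb, gamma, phi_f) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)
```

for example, `solve_surface_potential(2.0, -1.0, 0.0, 4e17, 5e-7)` gives the source-end value for vsb = 0, and passing `v_cb = v_ds` gives the drain-end value. the diode subcircuit described in the text lets the circuit simulator find the same kind of root with its own nonlinear solver instead of an explicit loop.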
owning to this type of analysis, it is possible to construct a circuit for solving equations (7) and (8), which is comprised of a diode (with unit current i0 = 1) and voltage-controlled current source, where the current is calculated by the equation (8). figure 4 shows this type of auxiliary diode circuit. for determining both boundary surface potentials, 1s and sn , there are used two identical diode subcircuits and the described method is used to solve the equations (4) and (5). the values of the boundary surface potentials determined in this way are copied with voltage generators s1 and sn (shown in fig. 3(b)) on the input and output of equivalent circuit to solve the transport of the electrons in the channel. knowing the boundary surface potentials allows us to calculate the values of nonlinear resistors and capacitors rk and ck, namely to determine the transistor currents. fig. 4 diode subcircuit for solving surface potentials spice modeling of ionizing radiation effects in cmos devices 167 a physical base of the nqs mos model in an easy way allows including significant effects shown in aggressive scaling of transistor dimensions, like, for example, short channel effects and quantum-mechanics effects. 3.2. nqs soi transistor model a compact model for n-channel fully depleted soi mos transistor with double gate (fd soi transistor) is developed based on the nqs mos model, and it is applicable for asymmetrical and symmetrical planar structures [14]. in non-stationary model of fd soi mos transistor (nq soi model), a transistor is represented by parallel connection of two soi transistors with one gate, as shown in fig. 5, to model current in a front and back channel [14]. fig. 5 schematic presentation of fd soi transistor (a) and its electric equivalent (b) by comparison with the nqs mos model, recurrent expressions for calculating boundary surface potentials in the nq soi model also contains the influence of biasing of both gates. 
so the boundary surface potentials in channel s1 and sn in the fd soi transistor are connected with biasing of front (vgf) and back (vgb) gate, and biasing between drain and source vds with new recurrent relations [26,27]: 1 11 1 2 2 2 1 12 2 2 / / // / 1 1 1 ( ) ( ) ( ) ( ) ,f t s t s tb t b t oxf gf fbf s gf fbb b oxb v v vv v t t s b t v v v v t v e e e v e e                             (9) 2 2 2 2 2 ( 2 ) / / / / / 1 ( ) ( ) ( ) ( ) ,f ds t sn t bn t sn t bn t oxf gf fbf sn gf fbb bn oxb v v v v v v t t sn bn t v v v v t v e e e v e e                               (10) where, in the case of fully depleted silicon layer, boundary potentials of back channel can be expressed as: ,and 22 11 si sich snbn si sich sb tnqtnq      (11) while for a fully symmetrical transistor applies toxf = toxb. in the equations (9)-(11) the index f relates to the front gate, and the index b relates to the back gate. recurrent 168 t. pešić-brđanin relations (9) and (10) are calculated with the assumption that the difference of fermi’s potentials between the source and the drain is equal to the voltage vds. electric potential distribution in the channel through depth, i.e. in the line of axis x, is obtained by solving these recurrent relations (fig. 6). fig. 
6 electric potential distribution in the channel through depth of fd soi transistor for applications in the nq soi model for a symmetrical fd soi mos transistor, recurrent equations for calculating boundary surface potentials can be written with basic algebraic transformations [14] in the following form: / / 1 2 ( ) ( ) ,sx t sx t v v s s s i e i e i     (12) while: 2 2 2 2 2 1 1 ( ) ,ch si ch si s gf fbf sx gf fbb sx t si si qn t qn t i v v v v v                           (13) ,exp1 2 / 1                   sit sichtvfx s v tnq ei   (14) ,exp1 2 2          sit sich s v tnq i  (15) where on the source side sx = s1 and fx = 2f , while on the drain side the changes have to be made sx = sn and fx = 2f + vds. in the previous expressions, tsi is the silicon film (body) thickness. auxiliary diode circuits, similar to the nqs mos model for solving recurrent relations, are used in this way for calculating boundary values of surface potentials in the nq soi model. fig. 7 shows equivalent diode circuit for solving the equation (12) [14]. fig. 7 diode subcircuit for solving surface potentials in nqs soi model spice modeling of ionizing radiation effects in cmos devices 169 4. inclusion of nox and dit in nqs mos and nqs soi models a physical foundation of previously described models allows easily inclusion of effects important for transistor operation. modelling of the effects of generation interface trapped charge with energy density distribution dit and oxide trapped charge with areal density nox is possible in nqs mos and nqs soi model by changing the surface potential equations. it is possible to model the impact of these effects onward on the characteristics of transistor in two ways: 1. auxiliary diode circuits, with the included effects of nox and dit, are used for determining surface potentials for use in nqs mos and nqs soi models or 2. 
auxiliary diode circuits, with the included effects of nox and dit, are used for determining surface potentials, and their output is then connected in series to the gate of a standard model (for example, bsim 4 for the mos transistor or bsim.cmg for finfet).

the total amount of electric charge trapped in the oxide is:

$$Q_{ox} = qN_{ox}, \qquad (16)$$

while the total amount of interface charge is [19]:

$$Q_{it} = -q\int_{E_g/2}^{E_f} D_{it}\, dE = -qD_{it}\left(E_f - \frac{E_g}{2}\right), \qquad (17)$$

where eg / 2 is the midgap energy level at the interface and ef is the energy of the fermi level. if we add and subtract the factor egb / 2, where egb is the bulk midgap energy level, inside the parenthesis of equation (17), we have:

$$Q_{it} = -qD_{it}\left(E_f - \frac{E_g}{2} + \frac{E_{gb}}{2} - \frac{E_{gb}}{2}\right) = -qD_{it}\left[\left(\frac{E_{gb}}{2} - \frac{E_g}{2}\right) - \left(\frac{E_{gb}}{2} - E_f\right)\right] = -qD_{it}\left(\varphi_s - \varphi_f\right). \qquad (18)$$

as stated in section 2, the charges qox and qit have an impact on the change of the transistor threshold voltage. this change can be expressed by the correction potential $\varphi_{nt}$ [6]:

$$\varphi_{nt} = \frac{Q_{ox} + Q_{it}}{C_{ox}} = \frac{q}{C_{ox}}\left[N_{ox} - D_{it}\left(\varphi_s - \varphi_f\right)\right]. \qquad (19)$$

in the nqs mos model, the equations (4)-(6) are modified in a way to include the correction potential $\varphi_{nt}$. eqn. (6) in a modified form with included correction potential is: 2 2 1 1 exp( / ) 1 exp(2 / ) ( ) 1. sy t fy t gb fb sy nt sy nt t v v v v v (20)

for determining the surface potential $\varphi_{sy}$, two identical diode circuits are used, as shown in fig. 3(b).
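the correction potential of (19) can be evaluated directly once nox, dit and the surface potential are known. the sketch below is only illustrative: the sign convention (positive oxide charge; interface charge negative when φs exceeds φf, i.e. net acceptor-like behaviour for an n-channel device) is an assumption, dit is taken in units of cm^-2 eV^-1 so that energies in eV map directly to potentials in volts, and the function name and constants are hypothetical.

```python
Q = 1.602176634e-19               # elementary charge [C]
EPS_OX = 3.9 * 8.8541878128e-14   # SiO2 permittivity [F/cm]


def correction_potential(n_ox, d_it, phi_s, phi_f, t_ox):
    """Correction potential phi_nt [V].

    n_ox  : oxide trapped charge areal density [cm^-2]
    d_it  : interface trap density [cm^-2 eV^-1], assumed uniform over energy
    phi_s : surface potential [V]
    phi_f : bulk (Fermi) potential [V]
    t_ox  : oxide thickness [cm]
    """
    c_ox = EPS_OX / t_ox                     # oxide capacitance per unit area [F/cm^2]
    q_ox = Q * n_ox                          # oxide trapped charge [C/cm^2]
    q_it = -Q * d_it * (phi_s - phi_f)       # assumed sign: negative above midgap
    return (q_ox + q_it) / c_ox
```

with dit = 0 the result does not depend on φs, which corresponds to the constant voltage shift produced by oxide trapped charge alone. with nox = 0 the correction changes sign at φs = φf, so the shift depends on the bias point.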
in the nqs soi model, for a symmetrical fd dg soi transistor, the equation for surface potential is modified in a way to include nt in the following way: 1 11 1 2 2 1 12 ( 2 ) / / // / 1 1 1 ( ) ( ) ( ) ( ) .f ds t s t s tb t b t gf fbf nt s gf fbb nt b bv v v vv v t t s b v v v v v e e e v e e                               (21) the parameter b, which appears in the equation (21), can have the value b = 0 for the source end of the channel and b = 1 for the drain end of the channel (in accordance with the equations (9) and (10)). however, the main problem in modelling of trapped charges with (21) is the fact that the distribution of surface potential in the channel depends not only on gate voltage, but also on drain voltage vds due to split of quasi fermi levels [19]. it means that the concentration qit will change along the channel, even for the constant nit. the impact of the changeable charge qit along the channel can be modelled with a modified value of the parameter b  (0,1). in the equation (21) it is calculated with in advance known value, and it is possible with the fine tuning [28] to accomplish better match of the model results with the results of 2d tcad numeric simulator silvaco atlas [29]. the equation (21) can also be solved with auxiliary diode circuits (fig. 7) with: 2 2 2 2 2 1 1 ( ) ,ch si ch si s gf fbf nt sx gf fbb nt sx t si si qn t qn t i v v v v v                               (22) ,exp1 2 /)2( 1                   sit sichvbv s v tnq ei tdsfx   (23) .exp1 2 2          sit sich s v tnq i  (24) the surface potential s from the diode circuit in fig. 7 represents the equation solution (21) for any combination of voltage variables vds and vgs. 5. simulation results and discussion the ionizing radiation has the effects on the changes of the electric characteristics of the transistor. 
in the paper, the approaches described in the section 3 are used for the simulation of electric characteristics of the transistor, and the results are compared with numerical results.

5.1. modeling of nox and dit effects in mos transistor

the effects of nox and dit are included in the nqs mos transistor model by incorporating the correction potential $\varphi_{nt}$ into the surface potential equation (eqn. 20). as already stated, the boundary surface potentials are obtained with diode circuits as in fig. 4, using the mathematical apparatus available in spice, and based on them the equivalent line is solved (fig. 3). in this paper, the equivalent line is divided into 10 equal segments. fig. 8 shows the obtained surface potentials, illustrating the impact of nox (fig. 8(a)) and the impact of the interface trapped charges through dit (fig. 8(b)) on the surface potential value. the results obtained with diode circuits are shown with a solid line, while the numerical results are shown with open circles. the good agreement of the results confirms the efficiency of the diode circuit as a new method for solving iterative relations (21). as can be seen in the figure, with the increase of nox at dit = 0 the surface potential curve is shifted by a constant negative voltage. in the case of the increase of dit at nox = 0, the voltage shift of the surface potential depends on the value of the surface potential itself, due to the dynamic charge contribution at the sio2-si interface. namely, the interface charges have energies inside the forbidden gap. interface trapped charges with energies above the intrinsic energy level ei behave as acceptor-like charges, while all interface trapped charges with energies below the intrinsic energy level behave as donor-like charges, which is experimentally verified [2,30,31].

fig.
8 surface potential versus gate voltage dependence for different values of nox at dit = 0 (a) and for different values of dit at nox = 0 (b), obtained from spice simulation of the proposed model (solid line) and tcad numerical results (open circles), for a mos transistor with tox = 5 nm and nch = 4×10^17 cm^-3

fig. 9 shows the transfer characteristics of the mos transistor obtained from spice and compared with tcad numerical results, which shows good agreement between the method applied in the nqs mos model and the tcad numerical results. it is important to state that the same expression for the correction potential due to the effects of ionizing radiation is used in [2], implemented with a voltage-controlled voltage source (vcvs) with voltage:

$$V_{df} = f(V_{gb}, V_{sb}, N_{ox}, D_{it}) = f(\varphi_s) \qquad (25)$$

which is connected in series to the transistor gate, the transistor itself being described by one of the standard models (for example, the bsim model). for determining vdf = $\varphi_{nt}$, that is, for solving (19), the authors of [2] used a non-iterative algorithm inside the verilog-a model, while in our method the iterative equation for determining the surface potential is solved in a physical way, with diode subcircuits.

fig. 9 transfer characteristics id(vgs) for different values of nox at dit = 0 (a) and for different values of dit at nox = 0 (b), obtained from spice simulation of the proposed model (solid line) and tcad numerical results (open circles)

5.2. modeling of nox and dit effects in finfet

with the scaling of device dimensions, conventional transistors have reached their limits, so new technological structures for future generations of integrated circuits are emerging. one such structure is the fully-depleted floating-body (fin) multi-gate fet (finfet) [32]. however, it has recently been shown that finfet technology has a rapid rate of aging, so that the degradation of finfet exceeds the degradation of the planar technology node at higher stress voltage and longer stress time [33].
therefore, the modelling of ionizing radiation effects in these structures is important. the standard bsim.cmg model [34] for finfet, however, offers only the fitting parameter cit (interface trap capacitance parameter) in the sub-threshold region [35], and it does not provide a possibility for user-defined input of oxide trapped charges. fig. 10 shows a schematic presentation of the n-type finfet analysed in this paper (with the following parameters: l = 0.9 µm, tox = 5 nm, tsi = 20 nm, nch = 2.4×10^18 cm^-3 and nd = 10^20 cm^-3).

fig. 10 schematic representation of n-type finfet

fig. 11 shows the output characteristics of the transistor obtained by using tcad numerical results, the bsim.cmg model with parameters acquired by fitting, and the modified nqs soi model. in order to simplify the tuning of the parameters of the bsim.cmg model, the simulated structure has a long channel, and the thicknesses of the gate oxide and the silicon fin are chosen so that short channel effects can be neglected and the silicon fin is fully depleted [28,36]. the same parameter set is used for the p-type finfet, except that the fin film has the opposite doping (n-type fin film). in the absence of ionizing radiation effects, the results of the different models agree [28].

fig. 11 the output characteristics of n and p-type finfets simulated for nox = 0 and dit = 0 with spice using bsim.cmg model (solid line), nqs soi model (dashed line) and tcad simulator silvaco atlas (open circles)

modeling of nox and dit effects by using auxiliary diode subcircuits (ads) for solving the surface potential equations (21) is possible in two ways: by using the nqs soi model (time consuming), or, as shown in [2,6], by using the determined surface potential as the control voltage of a vcvs that produces vdf = $\varphi_{nt}$ = f (vgb, vsb, nox, dit). this vcvs is connected in series with the gate node of the bsim.cmg model, as shown in fig. 12.
the second approach to modelling the ionizing radiation effects in finfet is more practical, because the simulation execution time is shorter and there are no convergence problems; the nqs soi model, being physically based, has the advantage that other effects important for the operation of finfet can be easily included (for example, quantum-mechanical effects). the second approach, the bsim.cmg model with ads, was used in this paper for modelling the ionizing radiation effects.

fig. 12 schematic of diode subcircuit shown together with the bsim.cmg finfet model as implemented in spice simulations to include the effects of nox and dit

fig. 13 shows transfer characteristics of n and p-type finfets for different values of dit while nox = 0. fig. 14 shows transfer characteristics for different values of nox while dit = 0, and fig. 15 shows characteristics for combinations of different values of nox and dit. in figs. 14 and 15 there are no results obtained with the bsim.cmg model, because the oxide trapped charge effect is not included in this model. all characteristics are generated for vds = 1.2 v. in the bsim.cmg model, the parameter cit is determined for the given dit. parameter b, which appears in equation (21), was used with the value b = 0.05, for the reason previously explained in section 4. all of these characteristics show a good match between the suggested approaches and the tcad numerical results [28,37].

fig. 13 transfer characteristics id(vgs) for different values of dit at nox = 0

fig. 14 transfer characteristics id(vgs) for different values of nox at dit = 0

fig. 15 transfer characteristics id(vgs) for combined influence of nox and dit for n-type finfet

fig. 16 shows changes of threshold voltages for n and p-type finfets after ionizing radiation, obtained from tcad and the proposed method. the constant current method is used for threshold voltage extraction [28,38], with i'd = 100 na/μm.
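the constant current extraction mentioned above can be illustrated with a short python sketch. it is only a minimal illustration, not the authors' procedure: the function name `threshold_voltage`, the sample sweep and the target current are hypothetical, and the sketch assumes that the drain current is interpolated linearly on a logarithmic scale between neighbouring sweep points.

```python
import math

def threshold_voltage(vgs, ids, i_target):
    """Return the gate voltage at which |Id| crosses i_target.

    Constant-current method: scan the transfer curve for the two
    neighbouring points that bracket the target current and
    interpolate log10(|Id|) linearly between them.
    """
    log_target = math.log10(i_target)
    for k in range(len(vgs) - 1):
        i0, i1 = abs(ids[k]), abs(ids[k + 1])
        if i0 <= 0 or i1 <= 0 or i0 == i1:
            continue  # cannot interpolate on a log scale here
        if min(i0, i1) <= i_target <= max(i0, i1):
            l0, l1 = math.log10(i0), math.log10(i1)
            t = (log_target - l0) / (l1 - l0)
            return vgs[k] + t * (vgs[k + 1] - vgs[k])
    raise ValueError("target current is not bracketed by the sweep")

# hypothetical transfer curves (not data from the paper)
vgs_sweep = [0.0, 0.2, 0.4, 0.6]
id_fresh = [1e-10, 1e-8, 1e-6, 1e-4]
id_irradiated = [1e-9, 1e-7, 1e-5, 1e-3]  # same curve shifted by 0.1 V

vth_fresh = threshold_voltage(vgs_sweep, id_fresh, 1e-7)
vth_irradiated = threshold_voltage(vgs_sweep, id_irradiated, 1e-7)
delta_vth = vth_irradiated - vth_fresh  # threshold shift between the curves
```

in practice the target current would be the normalized value i'd scaled by the device geometry; that scaling is left out here because it depends on the layout of the simulated structure.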
the impact of this ionizing radiation effect is also experimentally confirmed [39].

fig. 16 threshold voltages vth for p and n-type finfet as function of nox and dit.

6. conclusion

the modelling of ionizing radiation effects for cmos devices is presented in this paper. it is shown how the surface potential equations can be modified with a correctional potential that results from the presence of oxide charges and interface trapped charges. auxiliary diode circuits were used to determine the modified surface potentials, while two approaches were used to obtain the electrical characteristics of the devices: previously developed non-stationary models for cmos devices, and a vcvs (with the control voltage obtained from the diode circuits) in series with the gate node of standard models. comparison with tcad numerical simulations shows the efficiency of the suggested approaches in predicting the impact of dynamic effects of both oxide and interface trapped charges on the electrical characteristics of devices.

references

[1] n. s. saks and m. g. ancona, "generation of interface states by ionizing radiation at 80k measured by charge pumping and subthreshold slope techniques," ieee trans. on nucl. sci., vol. 34, pp. 1348-1354, 1987.
[2] i. esqueda, h. barnaby, "a defect-based compact modeling approach for the reliability of cmos devices and integrated circuits," solid-state circuits, vol. 91, pp. 81-86, 2014.
[3] v. huard, cr. parthasarathy, a. guerin, e. pion, "cmos device design in reliability approach in advanced nodes," ieee irps conference, pp. 624-633, 2009.
[4] v. huard, "two independent components modeling for negative bias temperature instability," ieee irps conference, pp. 32-42, 2010.
[5] a.v. sogoyan, a.s. artamonov, a.y. nikiforov, d.v. boychenko, "method for integrated circuits total ionizing dose hardness testing based on combined gamma- and x-ray irradiation," facta universitatis, series: electronics and energetics, vol. 27, no. 3, pp.
329-338, 2014.
[6] h.j. barnaby, m.l. mclain, i.s. esqueda, v. xiao jie, "modeling ionizing radiation effects in solid state materials and cmos devices," ieee trans. on circuits and systems i, vol. 56, pp. 1870-1833, 2009.
[7] d. boychenko, o. kalashnikov, a. nikiforov, a. ulanova, d. bobrovsky, p. nekrasov, “total ionizing dose effects and radiation testing,” facta universitatis, series: electronics and energetics, vol. 28, no. 1, pp. 153-164, 2015.
[8] t.p. ma and p.v. dressendorfer, ionizing radiation effects in mos devices and circuits, new york: wiley, 1989.
[9] m.m. pejovic, "p-channel mosfet as a sensor and dosimeter of ionizing radiation," facta universitatis, series: electronics and energetics, vol. 29, no. 4, pp. 509-541, 2016.
[10] t. grasser, b. kaczer, w. goes, t. aichinger, "a two-stage model for negative bias temperature instability," ieee irps conference, pp. 33-44, 2009.
[11] j.p. campbell, p.m. lenahan, a.t. krishnan, "nbti: an atomic-scale defect perspective," ieee irps conference, pp. 442-447, 2006.
[12] w. wang, s. yang, s. bhardwaj, s. vrudhula, f. liu, y. cao, "the impact of nbti effect on combinational circuit: modeling, simulation and analysis," ieee trans. on vlsi syst., vol. 18, pp. 173-183, 2010.
[13] t. pešić, n. janković, "a compact non-quasi-static mosfet model based on the equivalent nonlinear transmission line", ieee trans. on computer-aided-design of integrated circuits and systems, vol. 24, pp. 1550-1561, 2005.
[14] n. janković, t. pešić, "non-quasi-static physics based circuit model of fully-depleted double-gate soi mosfet", solid-state electronics, vol. 49, pp. 1086-1089, 2005.
[15] g. a. ausman and f. b. mclean, "electron-hole pair creation energy in sio2," appl. phys. lett., vol. 26, pp. 173-177, 1975.
[16] f. b. mclean and t. r. oldham, "basic mechanisms of radiation effects in electronic materials and devices," harry diamond laboratories technical report, vol.
hdl-tr, pp. 2129, 1987.
[17] esko mikkola, "hierarchical simulation method for total ionizing dose radiation effects on cmos mixed signal circuits", doctorate thesis, university of arizona, 2008.
[18] f. b. mclean, "a framework for understanding radiation-induced interface states in sio2 mos structures," ieee trans. on nucl. sci., vol. 27, no. 6, pp. 1651-1657, dec. 1980.
[19] s. m. sze, semiconductor devices, physics and technology, wiley, new york, 2008.
[20] a.s. porret, j.-m. sallese, c. enz, "a compact non-quasi-static extension of a charge-based mos model," ieee trans. on electron devices, vol. 48, pp. 1647-1654, 2001.
[21] m. miyake et al., "hisim-igbt: a compact si-igbt model for power electronic circuit design," in ieee trans. on electron devices, vol. 60, no. 2, pp. 571-579, feb. 2013.
[22] g. gildenblat et al., "psp: an advanced surface-potential-based mosfet model for circuit simulation," in ieee trans. on electron devices, vol. 53, no. 9, pp. 1979-1993, sept. 2006.
[23] j. r. brews, "a charge-sheet model of the mosfet", solid-state electronics, vol. 21, pp. 345-355, 1978.
[24] f. van de wiele, "a long channel mosfet model," solid-state electronics, vol. 22, no. 12, pp. 991-997, 1979.
[25] m. miura-mattausch, u. feldman, a. rahm, m. bollu, d. savignac, "unified complete mosfet model for analysis of digital and analog circuits", ieee trans. on computer-aided design of integrated circuits and systems, vol. 15, pp. 1-7, 1996.
[26] j. sleight, r. rios, "a continuous compact mosfet model for fully- and partially-depleted soi devices", ieee trans. on electron devices, vol. 45, pp. 821-825, 1998.
[27] s. bolouki, m. maddah, a. afzali-kusha, m. el nokali, "a unified i-v model for pd/fd soi mosfets with a compact model for floating body effects", solid-state electronics, vol. 47, pp. 1909-1915, 2003.
[28] nebojsa jankovic, tatjana pesic-brdjanin, "spice modeling of oxide and interface trapped charge effects in fully-depleted double-gate finfets", springer journal of computational electronics, vol. 14, no. 3, pp. 844-851, 2015.
[29] silvaco atlas user's manual, http://www.silvaco.com, 2010.
[30] ch helms, eh poindexter, "the silicon–silicon-dioxide system: its microstructure and imperfections," rep progr phys., vol. 57, pp. 791-852, 1994.
[31] nh thoan, k. keunen, vv. afanas’ev, a. stesmans, "interface state energy distribution and pb defects at si(110)/sio2 interfaces: comparison to (111) and (100) silicon orientations," journal of appl. phys., 2011; 109:013710.
[32] j.-p. colinge (ed.), finfets and other multi-gate transistors, springer, 2008.
[33] h. kukner, p. weckx, p. raghavan, b. kaczer, f. catthoor, lauwereins r. van der perre, g. groeseneken, "bti reliability from planar to finfet nodes," in proc. of the 3rd workshop on manufacturable and dependable multicore architectures at nanoscale (median'14), pp. 11-14, 2014.
[34] n. paydavosi, s. venugopalan, y.s. chauhan, j.p. duarte, s. jandhyala, a.m. niknejad, c.c. hu, "bsim-spice models enable finfet and utb ic designs," ieee access, vol. 1, pp. 201-215, 2013.
[35] s. yao, t.h. morshed, d.d. lu, s. venugopalan, w. xiong, c.r. cleavelin, a. m. niknejad, c. hu, "global parameter extraction for a multi-gate mosfets compact model," in proc. of the ieee international conference on microelectronic test structures (icmts), pp. 194-197, march 2010.
[36] h r. khan, d. mamaluy, d. vasileska, "approaching optimal characteristics of 10-nm high-performance devices: a quantum transport simulation study of si finfet," ieee trans. on electron devices, vol. 55, no. 3, pp. 743-752, march 2008.
[37] t.
pesic-brdjanin and nebojsa jankovic, "sub-circuit model of fully-depleted double-gate finfet including the effects of oxide and interface trapped charge", in proceedings of the 16th edition of ieee region 8 eurocon conference, pp. 273-276, salamanca, spain, september 2015.
[38] a. ortiz-conde, f.j. garcia sanchez, j.j. liou, a. cerdeira, m. estrada, y. yue, "a review of recent mosfet threshold voltage extraction methods," microelectronics reliability, vol. 42, pp. 583-596, 2002.
[39] yang-kyu choi, daewon ha, e. snow, j. bokor and tsu-jae king, "reliability study of cmos finfets," in proc. of the ieee international electron devices meeting, 2003. iedm '03, washington, dc, usa, 2003, pp. 7.6.1-7.6.4.

instruction facta universitatis series: electronics and energetics vol. 32, no. 1, march 2019, pp. 91-104 https://doi.org/10.2298/fuee1901091i

characteristics of curcumin dye used as a sensitizer in dye-sensitized solar cells

stefan ilić, vesna paunović
university of niš, faculty of electronic engineering, niš, serbia

abstract. dye-sensitized solar cells are the closest mankind has come to replicating nature’s photosynthesis. the type of dye influences the efficiency of these cells. in this paper we studied curcumin dye as a sensitizer in dye-sensitized solar cells and compared it with the most often used cyanidin. the results have shown that curcumin has higher efficiency and higher absorption in the visible part of the spectrum compared to cyanidin. simulation models of dye molecules, curcumin and cyanidin, are deprotonated upon adsorption on the titanium dioxide surface. the energy levels obtained from the calculation indicate a higher probability of electron transition from molecule to titanium dioxide surface in the case of curcumin than in the case of cyanidin. based on these results, we concluded that curcumin dye has better properties as sensitizer in dye-sensitized solar cells.
key words: solar cells, curcumin, cyanidin, titanium dioxide, density functional theory, voltage-controlled resistance

received february 28, 2018; received in revised form july 5, 2018
corresponding author: stefan ilić, faculty of electronic engineering, university of niš, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: stefan.ilic@yahoo.com)
* an earlier version of this paper was presented at the 61st national conference on electrical, electronic and computing engineering (etran 2017), june 5-8, 2017, in kladovo, serbia [1].

1. introduction

a solar cell is a renewable source of energy that directly converts visible light into electricity [2-4]. when exposed to light, the solar cell becomes a source of direct current. the operation principle of all solar cells is based on the photoelectric effect. there are first, second and third generations of solar cells. dye-sensitized solar cells (dssc) belong to the third generation. the major part of these cells is nanoparticle anatase titanium dioxide coated with dye molecules. the type of dye and the way it is anchored to the tio2 directly affect the efficiency. ruthenium polypyridyl complexes are known as the most efficient pigments; they have achieved almost 12% efficiency [5]. however, these pigments contain a heavy metal, which has an undesired environmental impact. a cheaper alternative is offered by natural pigments, such as anthocyanins, betalains, chlorophyll, etc. betalains are reported as the most efficient natural pigments, achieving more than 2% [6]. anthocyanins are very frequent in research papers that study natural pigments as sensitizers in dssc [7]. they give different sensitizing performances from various plants, absorb light at the longest wavelength and have widespread availability [8]. wongcharee et al. used extracts from rosella and blue pea flowers.
solar cells sensitized by rosella (delphinidin and cyanidin) have been reported to achieve efficiency up to 0.37%, whereas extract from blue pea (ternatin) can achieve up to 0.05% [9]. tekerek et al. fabricated a solar cell also with rosella dye and compared it to black raspberry and black carrot dyes. they achieved efficiencies of 0.16%, 0.16% and 0.25%, respectively [10]. curcumin can also be a sensitizer, but it has not attracted significant research attention. kim et al. reported a dye-sensitized solar cell sensitized with curcumin dye, and showed 0.36% efficiency [11]. in this work we investigate two types of natural pigments: cyanidin extracted from raspberries and curcumin dye extracted from curcuma longa. the aim of this paper is to confirm, both experimentally and theoretically (by simulation), the thesis that curcumin is a better sensitizer in dye-sensitized solar cells than cyanidin. first, we measured the current-voltage characteristics and absorption spectrum. after that, to confirm the experimental results, we simulated models of the anatase (tio2)16 cluster with a cyanidin or curcumin molecule attached to it. our calculation is based on density functional theory (dft) and time-dependent density functional theory (tddft). calculations were carried out with the nwchem software [12].

2. operation principle of dsscs

the main idea of dye-sensitized solar cells is to separate the light absorption process from the charge collection process by using a dye sensitizer together with a semiconductor. this process imitates the natural light harvesting procedure in photosynthesis [13]. that is why dye-sensitized solar cells are the closest mankind has ever come to replicating nature's photosynthesis. to separate these two processes, a semiconductor with a wide band gap, such as titanium dioxide (tio2), can be used. a dye-sensitized solar cell is composed of a photoactive electrode, an electrolyte and a counter electrode.
the photoactive electrode is made of porous nanocrystalline anatase titanium dioxide deposited on fto conducting glass (fluorine doped tin oxide). the fto layer is 220 nm thick and is deposited on the glass. it enables transport of photo-generated charge carriers to the electrode and it is also transparent, so the light can penetrate into the solar cell. dye is adsorbed on the tio2 layer to complete the photoactive electrode. the counter electrode is also fto glass, but it is coated with platinum to increase the conductivity. the space between the electrodes is filled with an electrolyte based on iodide and triiodide ions (fig. 1). when sunlight passes through the photoactive electrode, molecules of the dye absorb the photons and electrons go from the homo (highest occupied molecular orbital) in the ground state to the lumo (lowest unoccupied molecular orbital) in the excited state. some of the excited electrons have enough energy to jump to the conduction band of titanium dioxide and then to diffuse to the electrode. dye molecules that have lost electrons are oxidized. the electrolyte gives electrons to replace the lost ones. after that, iodide ions are oxidized. electrons from the photoactive electrode flow through an external load to the counter electrode and recombine with the electrolyte, thus completing the circuit. hence, the operating mechanism of a dye-sensitized solar cell generates electricity without irreversible chemical changes in the cell. dye molecules play a key role in producing electricity. they compensate for the small absorption of titanium dioxide by absorbing the photon and exciting the electron, and therefore increase the efficiency of the solar cell. thus, the greater the absorption of the dye, the more efficient the solar cell will be.

3. dye sensitizers

a dye sensitizer absorbs energy in dye-sensitized solar cell.
when natural pigments are used as dye sensitizers, a major problem is their degradation during prolonged exposure to sunlight due to uv radiation. figure 2 shows optimized molecular structures of the cyanidin and curcumin.

fig. 1 schematic structure and principle of operation of dssc.

fig. 2 optimized molecular structures of the cyanidin and curcumin.

anthocyanins are widespread water-soluble pigments that can be found in many flowers, fruits and leaves of angiosperms. they are responsible for different colours (red, blue and violet) depending on the ph value [14]. they have found new application in dye-sensitized solar cells because they have significant absorption in the visible part of the spectrum. only organic dyes that contain several =o or -oh groups (for example cyanidin found in raspberry) capable of chelating to tio2 can be used as dye sensitizers. curcumin is an active ingredient of turmeric (curcuma longa). turmeric is a rhizomatous herbaceous perennial plant of the ginger family. it is used as an indian spice, has a yellow color and is known as the food additive e100. curcumin can exist in two tautomeric forms (keto in the solid state and enol in solution). a molecule of curcumin has carbonyl and hydroxyl groups which can bind to the tio2 surface.

4. models and computational details

we used a (tio2)16 cluster to model the anatase tio2 slab. the cluster is obtained by correct ''cutting'' of the anatase slab (fig. 3). for proper cutting, three conditions must be fulfilled: all titanium atoms must be coordinated to at least four oxygen atoms, all oxygen atoms must be coordinated to at least two titanium atoms, and the ratio of the number of titanium and oxygen atoms in the cluster must be 1 : 2 [15]. after optimization, the band gap of the (tio2)16 cluster was 4.52 ev.

fig. 3 model of anatase (tio2)16 cluster before and after optimization.
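the three cutting conditions can be expressed as a simple check. the python sketch below is only an illustration under stated assumptions: the function name `is_valid_cut` and the coordination lists are hypothetical, and the coordination numbers are assumed to have been obtained beforehand (for example by counting ti-o neighbours within a chosen bond-length cutoff), which is not described in the paper.

```python
def is_valid_cut(ti_coordination, o_coordination):
    """Check the three cluster-cutting conditions.

    ti_coordination: number of oxygen neighbours of each titanium atom
    o_coordination:  number of titanium neighbours of each oxygen atom
    """
    n_ti = len(ti_coordination)
    n_o = len(o_coordination)
    every_ti_ok = all(c >= 4 for c in ti_coordination)  # ti bound to >= 4 o
    every_o_ok = all(c >= 2 for c in o_coordination)    # o bound to >= 2 ti
    ratio_ok = n_ti > 0 and n_o == 2 * n_ti             # ti : o = 1 : 2
    return every_ti_ok and every_o_ok and ratio_ok

# hypothetical coordination numbers for a 16-unit cluster (illustration only)
example_ti = [4] * 16
example_o = [2] * 32
cluster_is_acceptable = is_valid_cut(example_ti, example_o)
```

a candidate cut that leaves, say, a singly coordinated oxygen atom would be rejected by this check and a different cut of the slab would have to be tried.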
for all calculations the freely available software nwchem was used, performing density functional theory and time-dependent density functional theory calculations with the b3lyp functional and the 6-31g basis set. density functional theory is a powerful tool for solving many-body problems in quantum mechanics. it allows the complicated n-electron wave function and its associated schrodinger equation to be replaced by much simpler single-electron equations in which the electron density is determined. we used dft to calculate the band gap, homos and lumos for all structures, and tddft to calculate the absorption spectra of the molecules.

5. fabrication of dsscs

fabrication of a dye-sensitized solar cell requires preparation of the titanium dioxide film, extraction of natural pigments, electrolyte preparation and solar cell assembly [16].

5.1. preparation of tio2 film

a nanoparticle tio2 powder (p25 degussa) was used to prepare the films. water and acetic acid were added because they contribute to the mechanical properties of the films, i.e. good adhesion to the substrate and prevention of crack formation. terpineol was added to prevent particle growth, and ethyl cellulose to achieve porosity of the films through its decomposition during thermal annealing. the films were deposited on fto glass with a doctor-blade technique, in which paste is spread over a surface with a razor blade while scotch tape serves as a pattern that gives the deposited layer its shape and a uniform thickness of about 40 μm. square films with dimensions of 5×5 mm were made, with an initial thickness of 40 μm and a final thickness of 10-11 μm after drying and thermal annealing.
after the deposition, the films were left at room temperature for a few minutes, after which each film was calcined using the following procedure: 120°c for 10 min, 250°c for 10 min, 400°c for 10 min, 450°c for 5 min and finally 500°c for 15 min, similar to the procedure presented elsewhere [17].

5.2. natural dyes preparation and photoactive electrode formation

anthocyanins were extracted from frozen raspberries, which were crushed with a mortar and pestle until they became juicy. curcumin was extracted from commercially purchased turmeric powder. the preparation process involved dissolving 5 grams of turmeric powder in ethanol. the prepared solutions were stored at room temperature in a dark place to prevent photodegradation. photoactive electrodes were made by soaking fto glasses with a tio2 layer in crushed raspberries or in the turmeric solution. they can be left to soak from several minutes to several hours, while dye molecules from the raspberries and turmeric naturally adsorb onto the titania particles. the tio2 layer adsorbs more dye molecules the longer it soaks [18]. films were pre-warmed to 80°c during staining to prevent unwanted binding of moisture from the air to tio2. figure 4 shows the appearance of a photoactive electrode after each procedure, in chronological order.

fig. 4 fto glass with tio2 layer deposition (a), finalized photoactive electrode stained with raspberry (b) and finished solar cell stained with curcuma longa (c).

5.3. preparation of electrolyte

the electrolyte was prepared by dissolving 1.66 g of lithium iodide (approx. 60 mm lii) and 0.254 g of iodine (approx. 0.5 mm i2) in 20 ml of ethylene glycol at 50°c with stirring. the preparation of the iodine-based electrolyte was chosen based on reported procedures [9, 19].

5.4. solar cell assembly

after photoactive electrode formation, the films were washed carefully with ethanol and distilled water.
after drying with warm air, they were coupled with the counter electrode and fastened with clips.

fig. 5 dye-sensitized solar cell assembly.

a transparent platinum electrode was prepared by doctor-blade deposition of commercially available platinum paste (platisol t/sp, solaronix) on fto glass. the counter electrode was then thermally annealed at 450°c for 30 minutes. figure 5 shows a schematic representation of the cross-section of the solar cell. after coupling the electrodes, the pressure of the clips is slightly reduced and the electrolyte is added between the electrodes with a needle and syringe, which completes the solar cell assembly (fig. 4).

6. measurement of current-voltage characteristics

to measure the current-voltage characteristics of a solar cell, it is necessary to measure the cell voltage and the current passing through the cell for different values of resistance in the circuit while the cell is exposed to solar radiation. since a dye-sensitized solar cell gives a very weak current (microamperes or less), the current in the circuit was not measured directly with an ammeter, because this would disturb the measurement. instead, the current was determined indirectly, by measuring the resistance and voltage in the circuit. this is done by using a light-emitting diode and a photo-resistor facing each other in a dark, closed enclosure. we therefore used a so-called voltage-controlled resistance, because different voltages on the light-emitting diode give different resistances on the photo-resistor.

fig. 6 measuring equipment.

we used the multifunctional system ni usb-6008 [20]. led voltages were applied for 1136 known resistance values of the photo-resistor (range of 367-250000 Ω), and then, after 10 milliseconds, the voltage of the solar cell (which was exposed to solar radiation) was measured (fig. 6).
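the indirect current measurement amounts to applying ohm's law at every load setting. the python sketch below illustrates this and the usual follow-up figures (maximum power, fill factor, efficiency). it is a minimal sketch, not the authors' processing script: the function names and the sample readings are hypothetical, the open-circuit voltage and short-circuit current are approximated by the largest measured voltage and current of the sweep, and the 5×5 mm film area together with the 790 w/m^2 irradiance reported in the paper are assumed to define the incident power.

```python
def currents_from_loads(voltages, resistances):
    """Ohm's law at each load setting: I = V / R."""
    return [v / r for v, r in zip(voltages, resistances)]

def cell_figures(voltages, currents, irradiance_w_m2, area_m2):
    """Return (p_max, fill_factor, efficiency) from a measured i-v sweep.

    v_oc and i_sc are approximated by the largest voltage and the
    largest current of the sweep, which assumes that the load range
    comes close to open-circuit and short-circuit conditions.
    """
    powers = [v * i for v, i in zip(voltages, currents)]
    p_max = max(powers)
    v_oc = max(voltages)
    i_sc = max(currents)
    fill_factor = p_max / (v_oc * i_sc)
    efficiency = p_max / (irradiance_w_m2 * area_m2)
    return p_max, fill_factor, efficiency

# hypothetical readings (not data from the paper)
loads_ohm = [1_000.0, 10_000.0, 100_000.0]
cell_volts = [0.1, 0.3, 0.4]

cell_amps = currents_from_loads(cell_volts, loads_ohm)
p_max, ff, eta = cell_figures(cell_volts, cell_amps,
                              irradiance_w_m2=790.0,
                              area_m2=5e-3 * 5e-3)
```

with a dense sweep of load values, such as the 1136 resistance settings used here, the maximum of the power list is a reasonable estimate of the maximum power point without any curve fitting.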
based on the known resistance and voltage in the circuit the current is calculated. after that the current-voltage characteristics are drawn.

7. experimental results

the analysis of tio2 films by scanning electron microscopy confirms the presence of a developed surface and a porous structure (fig. 7).

fig. 7 sem image of the tio2 on fto glass surface on the left and on the right its cross section.

results for the current-voltage characteristics measured for dye-sensitized solar cells stained with curcuma longa and raspberry are shown in figure 8. all measurements were recorded at a solar radiation intensity of 790 w/m^2.

fig. 8 current-voltage curve of dsscs stained with curcuma longa (black curve) and with raspberry (red curve).

the dye-sensitized solar cell stained with curcuma longa has an efficiency of 0.028% and a fill factor of 45%, while the dye-sensitized solar cell stained with raspberry has an efficiency of 0.017% and a fill factor of 36%. by comparing the current-voltage characteristics, we can conclude that the dye-sensitized solar cell stained with curcuma longa is better than the dye-sensitized solar cell stained with raspberry. these results can be explained by the absorption spectra of curcuma longa and raspberry (fig. 9).

fig. 9 absorption spectra of curcuma longa (black) and raspberry (red).

the curcuma longa is active in the visible region 400-500 nm and has a peak at 429.6 nm, while the raspberry is active in the visible region 480-580 nm and has a peak at 544 nm, which is characteristic of anthocyanins [16]. the absorption spectra were recorded using the perkin-elmer lambda 15 uv/vis spectrophotometer. the samples did not have the same solution concentration; turmeric has a much higher absorption than shown. for our work the most important thing was to see the absorption peaks and compare them with the simulation results.

8.
simulation results

based on tddft, absorption spectra for cyanidin and curcumin were calculated (fig. 10). curcumin has an absorption peak at 420.8 nm, while the cyanidin has the highest peak at 477.3 nm, which differs from the experimental results. considering that in the experiment the raspberry dye contains more than one pigment that can absorb light, the results obtained from simulation are in good agreement with the experimental values discussed previously (fig. 9). curcumin has higher absorption than cyanidin, which can explain the higher efficiency of the solar cell stained with curcuma longa [21]. after optimization of the models of the cyanidin and curcumin molecules, the homo-lumo gap is 2.43 ev for cyanidin, which is in perfect agreement with reference work [13], and 3.22 ev for curcumin.

fig. 10 absorption spectra of curcumin (black) and cyanidin (red).

a dye molecule can be anchored on the tio2 surface by the carbonyl (=o), hydroxyl (-oh) or carboxyl group (-cooh). curcumin and cyanidin have only carbonyl and hydroxyl groups. a carboxyl group can be represented as a combination of a hydroxyl group and a carbonyl group. the adsorption mode can be bridged bidentate or monodentate. for simplicity, the adsorption modes are represented with a carboxyl group (fig. 11) [22].

fig. 11 anchoring region for bridged bidentate (a) and monodentate (b) adsorption modes. the dotted circle denotes the position of the deprotonated atom (a).

when a dye molecule binds to the titanium dioxide surface, a deprotonation process may occur. deprotonation happens when a hydrogen atom of the dye molecule transfers to the titanium dioxide surface during anchoring. in the case of curcumin and cyanidin the h atom is transferred from the hydroxyl group to the tio2 structure. the deprotonation process lowers the energy of the system. in figure 11a, we can see that the dye molecule formed a bridged bidentate adsorption after the deprotonation was performed.
note that the hydrogen atom (dotted circle) is bound to oxygen from the cluster of titanium dioxide. however, the dye molecule can also be adsorbed without deprotonation, as in figure 11b. in this case, a hydrogen bond may occur. of course, the hydrogen atom can also be deprotonated, which lowers the energy of the system [13].

fig. 12 optimized geometries of the cyanidin adsorbed onto the (tio2)16 model (c@tio2), along with their homo and lumo+7. (the c@tio2 label means cyanidin anchored onto the tio2.)

in our simulation, we observed three molecule/cluster systems. deprotonation was performed in each of them. dotted circles denote the positions of protons that have been transferred from the dye molecules to the (tio2)16 cluster (fig. 12, 13, 14). the figures also illustrate the homos and lumos of the molecule/cluster systems. in the case of c@tio2 and k2@tio2 the first level above the lumo that is delocalized over the whole molecule/cluster is lumo+7 (energy -2.859 ev for c@tio2 and -3.039 ev for k2@tio2). for k1@tio2 the first such level is lumo+28, which is at higher energy (-2.597 ev). the transition of electrons from the valence band to the lumo+7 and lumo+28 levels leads to direct electron injection [23] into the tio2, since these levels are delocalized along the whole system.

fig. 13 optimized geometries of the curcumin in monodentate anchoring adsorbed onto the (tio2)16 model (k1@tio2), along with their homo and lumo+28. (the k1@tio2 label means curcumin with one bond anchored onto the tio2.)

fig. 14 optimized geometries of the curcumin in bridged bidentate anchoring adsorbed onto the (tio2)16 model (k2@tio2), along with their homo and lumo+7. (the k2@tio2 label means curcumin with two bonds anchored onto the tio2.)

after optimization was carried out for the three molecule/cluster systems, the homo-lumo gaps were calculated: 2.37 ev, 1.93 ev and 2.34 ev for c@tio2, k1@tio2 and k2@tio2, respectively. we notice that the homo-lumo gaps have decreased after binding the molecules onto the clusters, and that curcumin has the smallest homo-lumo gap when it is monodentate (k1@tio2) anchored onto the tio2. based on these results, an energy diagram of the cyanidin, curcumin, tio2 model and three molecule/cluster systems was made (fig. 15). an effective dye-sensitized solar cell requires the homo of the dye molecule to reside in the tio2 band gap and its lumo to lie within the conduction band of the tio2 [13]. we noticed that the homo levels of all three molecule/cluster systems are in the band gap of the tio2, and that the lumo levels are below the cbm (conduction band minimum). the energy of the cbm is -3.835 ev. the nearest to the conduction band is the lumo level of k2@tio2 (-3.842 ev), then the lumo level of k1@tio2 (-3.925 ev) and finally the lumo level of c@tio2 (-4.386 ev). in all systems, all other lumo levels were found in the conduction band of the tio2 cluster.

fig. 15 schematic energy diagram of the cyanidin, curcumin, tio2 model and three molecule/cluster systems.

the results confirm that an electron has a higher probability of reaching the conduction band in the systems with curcumin than in the system with cyanidin, which indicates another reason why the solar cell with curcumin has greater efficiency.

9. conclusion

the experimental results showed that the dye-sensitized solar cell stained with curcuma longa provides greater efficiency than the dye-sensitized solar cell stained with raspberry. dft calculations showed that the lumo of curcumin is closer to the conduction band minimum than that of cyanidin, which indicates that an electron from curcumin has a higher probability of reaching the conduction band.
we concluded that curcumin has better properties as a sensitizer than cyanidin for the needs of dye-sensitized solar cells, which is confirmed both by experimental and by simulation results. it is essential to find new dye sensitizers to improve efficiency of the dye-sensitized solar cells, one of the potential new dye sensitizer could be curcumin. acknowledgement: the authors would like to thank to the petnica science center, the institute of physics in belgrade on great assistance and cooperation, also the authors gratefully acknowledge the financial support of serbian ministry of education, science and technological development. references [1] s. ilić, v. paunović, “application of curcumin in dye-sensitized solar cells,” in proceedings of the extended abstracts of the 61st national conference on electrical, electronic and computing engineering (etran 2017), kladovo, serbia, june 5-8, 2017. [2] s. abasian, r. sabbaghi-nadooshan, “introducing a novel high-efficiency arc less heterounction dj solar cell,” facta universitatis, series: electronics and energetics, vol. 31, no. 1, pp. 89-100, 2018. [3] m. jošt, m. topič, “efficiency limits in photovoltaics – case of single junction solar cells,” facta universitatis, series: electronics and energetics, vol. 27, no. 4, pp. 631-638, 2014. [4] r. singh, g. alapatt, g. bedi, “why and how photovoltaics will provide cheapest electricity in the 21 st century,” facta universitatis, series: electronics and energetics, vol. 27, no. 2, pp. 257-298, 2014. [5] b. o’regan, m. gratzel, “a low-cost, high-efficiency solar cell based on dye-sensitized colloidal tio2 films,” nature, vol. 353, pp. 737-740, 1991. [6] g. calogero, j. yum, a. sinopoli, g. di marco, m. gratzel, m. k. nazeeruddin, “anthocyanins and betalains as light-harvesting pigments for dye-sensitized solar cells,” solar energy, vol. 86, pp. 15631575, 2012. [7] n. a. ludin, et al. "review on the development of natural dye photosensitizer for dye-sensitized solar cells." 
renewable and sustainable energy reviews, vol. 31, pp. 386-396, 2014. [8] m. r. narayan, "dye sensitized solar cells based on natural photosensitizers." renewable and sustainable energy reviews, vol. 16, no. 1, pp. 208-215, 2012. [9] k. wongcharee, v. meeyoo, s. chavadej. "dye-sensitized solar cell using natural dyes extracted from rosella and blue pea flowers." solar energy materials and solar cells, vol. 91, no. 7, pp. 566-571, 2007. [10] s. tekerek, a. kudret, and ü. alver. "dye-sensitized solar cells fabricated with black raspberry, black carrot and rosella juice." indian journal of physics, vol. 85, no. 10, pp. 1469-1476, 2011. [11] h. kim, d. kim, s.n. karthick, k.v. hemalatha, c. justin raj, sunseong ok, youngson choe, “curcumin dye extracted from curcuma longa l. used as sensitizers for efficient dye-sensitized solar cells,” int. j. electrochem. sci., vol. 8, pp. 8320-8328, 2013. [12] m. valiev, et al., “nwchem: a comprehensive and scalable open-source solution for large scale molecular simulations,”computer physics communications, vol. 181, pp. 1477-1489, 2010. [13] s. meng, j. ren, e. kaxiras, “natural dyes adsorbed on tio2 nanowire for photovoltaic applications: enhanced light absorption and ultrafast electron injection,” nano letters, vol. 8, no. 10, pp. 32663272, 2008. [14] m. alhamed, a. isaa, w. doubal, “studying of natural dyes properties as photo-sensitizer for dyesensitized solar cells (dssc),” journal of electron devices, vol. 16, pp. 1370-1383, 2012. [15] p. persson, j. c. gebhardt, s. lunell, “the smallest possible nanocrystals of semiionic oxides,” thejournal of physical chemistry b, vol. 107, pp. 3336-3339, 2003. 104 s. ilić, v. paunović [16] i. đorđević, s. ilić, “the application of combined natural pigments in dye-sensitized solar cells,” petnica science center – selected students’ papers, vol. 73, pp. 96-105, 2014 (in serbian). [17] s. ito, p. chen, p. comte, m. k. nazeeruddin, p. liska, p. péchy, m. 
grätzel, "fabrication of screen‐printing pastes from tio2 powders for dye‐sensitised solar cells." progress in photovoltaics: research and applications, vol. 15, no. 7, pp. 603-612, 2007. [18] from the official website solaronix [on line]. available at: http://www.solaronix.com/documents/ dye_solar_cells_for_real.pdf [19] a. luque, s. hegedus, eds. handbook of photovoltaic science and engineering. john wiley & sons, 2011. [20] multifunctional system ni usb-6008. available at: http://www.ni.com/pdf/manuals/371303n.pdf [21] s. ilić, “dft characterization of curcumin and cyanidin as photosensitizers in dye-sensitized solar cells,” petnica science center – selected students’ papers, vol. 74, pp. 68-74, 2015 (in serbian). [22] e. ronca, m. pastore, l. belpassi, f. tarantelli, f. de angelis, “influence of the dye molecular structure on the tio2 conduction band in dye-sensitized solar cells: disentangling charge transfer and electrostatic effects,” energy & environmental science, vol. 6, pp. 183-193, 2013. [23] d. rocca, r. gebauer, f. de angelis, m. k. nazeeruddin, s. baroni, “time-dependent density functional theory study of squaraine dye-sensitized solar cells,” chemical physics letters, vol. 475, pp. 49-53, 2009. facta universitatis series: electronics and energetics vol. 34, no 1, march 2021, pp. 141-156 https://doi.org/10.2298/fuee2101141d © 2021 by university of niš, serbia | creative commons license: cc by-nc-nd original scientific paper high-accuracy quasistatic numerical model for bodies of revolution tailored for rf measurements of dielectric parameters antonije djordjević1,2, dragan olćan1, jovana petrović1, nina obradović3, and suzana filipović3 1university of belgrade – school of electrical engineering, belgrade, serbia 2serbian academy of sciences and arts, belgrade, serbia 3institute of technical sciences of the serbian academy of sciences and arts, belgrade, serbia abstract. 
we have developed rotationally symmetrical coaxial chambers for measurements of dielectric parameters of disk-shaped samples, in the frequency range from 1 MHz to several hundred MHz. the reflection coefficient of the chamber is measured and the dielectric parameters are hence extracted utilizing a high-accuracy quasistatic numerical model of the chamber and the sample. we present this model, which is based on the method-of-moments solution of a set of integral equations for composite metallic and dielectric bodies. the equations are tailored to bodies of revolution. the model is efficient and accurate so that the major contribution to the measurement uncertainty comes from the measurement hardware.

key words: dielectric measurements, electromagnetic modeling, method of moments, bodies of revolution

received september 2, 2020; received in revised form october 16, 2020
corresponding author: dragan olcan, school of electrical engineering, kralja aleksandra blvd. 73, 11000 belgrade, serbia (e-mail: olcan@etf.rs)

1. introduction

the key parameter for characterization of a linear, isotropic dielectric material is the relative complex permittivity and its dependence on frequency. there exist many methods for measuring the permittivity [1], [2]. it is beyond the scope of this paper to present and compare these techniques, so we give only a brief overview.

for measurements at frequencies up to several hundred megahertz, the most commonly used technique is based on the measurement of the capacitance of a parallel-plate capacitor, where the sample is inserted between the capacitor electrodes. this method assumes that the electromagnetic field within the measured sample is quasistatic, which imposes an upper frequency limit on the method. at high frequencies, this technique has a drawback due to the strong electromagnetic coupling with the environment. hence, shielding is required.
another potential drawback is that commercially available meters [1] require large-diameter samples (15 mm or more).

for broadband measurements at microwave frequencies (above around 1 GHz), open coaxial lines [3] or waveguides [4] can be used. parameters of sheet materials can be estimated by measuring the transfer between two antennas [5]. all these techniques require relatively large samples.

in yet another set of techniques, a material sample is inserted into a coaxial line or a waveguide [6]. the sample is relatively large and has to be machined according to the shape of the coaxial line or the waveguide. for measurements of dielectric substrates, other techniques can be used (e.g., [7]), which also require a special shape of the dielectric or a particular metallization pattern on it.

narrowband measurements are performed in resonators. they are convenient for low-loss materials and can be used for measurements of anisotropic dielectric materials [8], but they provide data only for discrete frequencies.

in our research, we primarily deal with ceramic materials. we utilize disk-shaped samples, which are relatively small due to the restricted available mass of starting components used for sintering: the diameter ($d$) is in the range from 4 mm to 12 mm, whereas the height ($h$) is between 1 mm and 4 mm. the samples are too small for the standard test equipment based on the parallel-plate capacitor. further, their shape and size do not fit into the available test equipment for standard measurements at microwave frequencies.

hence, for measurements in a wide frequency range (1 MHz–5 GHz), we have developed several coaxial chambers. the first prototype is described in [9], whereas an improved design of the chamber is shown in fig. 1. the dielectric sample is pressed between a plate and a plunger, both made of brass.
using a vector network analyzer (vna), we measure the reflection coefficient at the sma (subminiature version a) connector and, hence, evaluate the input admittance of the chamber. on the other hand, we utilize a numerical electromagnetic model of the chamber with the sample. in the model, we optimize the dielectric parameters of the sample in order to match the measured admittance.

fig. 1 cross-section of coaxial chamber (labels: sma connector, brass, teflon, air, dielectric sample, plunger, plate, $d$, $h$, vna calibration plane, shifted reference plane)

we have selected a coaxial structure because it is electromagnetically closed, and thus well shielded from the environment. note that the chamber, with the inserted sample, is practically a rotationally symmetrical structure, i.e., a body of revolution (bor).

in practice, the measurement structure does not have a perfect rotational symmetry. this is not critical for lower frequencies because the chamber input admittance, as a function of the sample position, has a stationary point when the sample is in the middle of the chamber. however, at higher frequencies, when resonances of the chamber and the sample occur, the positioning is critical because modes with asymmetrical field distributions can be excited. in order to facilitate positioning of the sample, we use three thin screws that protrude through the chamber wall. after inserting the dielectric sample, the screws hold it in the required position. the screws are removed once the plunger presses the sample so that they do not influence the measurements.

at lower frequencies, up to around 500 MHz, which we consider in this paper, the dimensions of the coaxial structure are relatively small compared to the wavelength. (the frequency limit depends on the dimensions and the relative permittivity of the sample.) hence, the quasistatic approximation can be used for the analysis.
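the size-to-wavelength argument above can be illustrated with a short calculation. the following python sketch divides a characteristic dimension by the wavelength inside the sample material. the function name `electrical_size` and the sample values (12 mm diameter, real relative permittivity of 10) are illustrative assumptions, not data from the measurements; the complex permittivity used in the paper is replaced here by its real part only.

```python
import math

C0 = 299_792_458.0  # speed of light in a vacuum, m/s


def electrical_size(dimension_m, frequency_hz, eps_r=1.0):
    """Characteristic dimension divided by the wavelength in a medium
    with (real) relative permittivity eps_r."""
    wavelength = C0 / (frequency_hz * math.sqrt(eps_r))
    return dimension_m / wavelength


# assumed example: 12 mm sample diameter, eps_r = 10, upper frequency 500 MHz
ratio = electrical_size(12e-3, 500e6, eps_r=10.0)
print(f"dimension / wavelength = {ratio:.4f}")
```

a ratio well below one indicates an electrically small structure; the ratio grows with the square root of the permittivity, which is why the usable frequency limit depends on the sample.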
assuming a time-harmonic electromagnetic field [10], the equations involved in the analysis are formally the same as for the electrostatic fields. the differences from the electrostatics are: (a) phasors are involved, for the field sources (charges), electric scalar-potential, and the electric-field vector, and (b) the complex permittivity is used to characterize dielectrics. such an approach enables analysis of lossy dielectrics. note that losses in the metallic parts of the chamber have a negligible influence on the overall results of our measurements, which we have verified experimentally and computationally. the structure shown in fig. 1 belongs to the class of structures that consist of metallic (conductive) regions and piecewise-homogeneous dielectric regions [11]. the electrostatic (quasistatic) analysis of the chamber cannot be performed analytically, but only numerically. to that purpose, various methods can be used, like the method of moments (mom) [12], the finite-element method (fem) [13], the method of fictive charges [14], the method of equivalent electrode [15], etc. based on the mom and the fem, methods have been developed for the analysis of arbitrary two-dimensional (2-d) and three-dimensional (3-d) structures. also, several commercial electrostatic solvers for arbitrary 2-d and 3-d structures are available, e.g., [16]–[18]. in the implementation of such general 3-d solvers, the analyzed structure is segmented without taking into account the rotational symmetry. consequently, the required computer resources (memory and cpu time) are substantially larger than if a bor solver were used (where the rotational symmetry of the sources and fields is utilized), resulting in non-optimal running time and even jeopardizing the accuracy due to oversized systems of equations. unfortunately, there is no commercial simulator for the electrostatic analysis of bors. 
also, in the open literature we could not find papers devoted to the electrostatic analysis of arbitrary bors (which consist of metallic and dielectric regions). only very few older papers partly deal with this topic, e.g., [19] and [20], but their scope is limited because they are related to the analysis of slender conductors and oblate dielectric bodies, respectively. in both papers, uniform asymptotic expansions are used.

tailoring the analysis method to bors is important because it can be substantially faster compared to the conventional analysis of 3-d structures. the speed is important for our applications, because many analysis cycles are involved in the optimization. the accuracy of the method is even more important because we want to achieve negligible influence of numerical errors on the overall measurement uncertainty. hence, we have been motivated to develop a new method for precise and efficient quasistatic (electrostatic) analysis of arbitrary bors, which is described here.

the paper is organized as follows. in section 2, the numerical method is described. in section 3, some benchmark numerical results are presented. section 4 illustrates the implementation of the proposed method on actual measurements. the paper is concluded with section 5.

2. numerical method

we consider a bor structure (fig. 2) that consists of charged conducting (metallic) bodies (electrodes) and electrically-neutral dielectric bodies (which collectively constitute a piecewise-homogeneous, isotropic dielectric medium). the axis of symmetry (revolution) is $z$. the generatrix of the bor is in the right-hand part of the $Oxz$ plane ($x \ge 0$). hence, $x$ is the distance from the axis of symmetry. the operating frequency is $f$.

based on our past experience, as the preferred technique for the numerical analysis, we have selected the integral-equation approach, along with the mom. we follow a similar path as in [16], [18].
fig. 2 example of bor consisting of conductors and dielectrics (labels: conductor 1, conductor 2, vacuum (air), axis of revolution ($z$), dielectric 1 ($\varepsilon_{r1}$), dielectric 2 ($\varepsilon_{r2}$), source element, field point)

2.1. integral equations

first, we replace the conducting bodies by free surface charges (whose density is $\rho_s$) and the dielectric bodies by bound surface charges (whose density is $\rho_{sb}$). all these charges are assumed to be in a vacuum. the electric scalar-potential and the electric-field vector of these charges are the same as in the original (analyzed) system. the reason for introducing these surface charges is to homogenize the medium, so that the potential and the electric field can be evaluated using the standard integral relations for a vacuum.

we collectively refer to these surface charges as the total charges, whose (phasor) surface density is $\rho_{st}$. at an interface (boundary) between a conducting body and a vacuum, the total charges comprise only the free charges, i.e., $\rho_{st} = \rho_s$. at an interface between a conducting body and a dielectric body, we have $\rho_{st} = \rho_s + \rho_{sb}$. at an interface between a dielectric body and a vacuum, there are only bound charges, so that $\rho_{st} = \rho_{sb}$. finally, at an interface between two dielectric bodies, there are also only bound charges. in this case, we write $\rho_{st} = \rho_{sb}$ and assume that $\rho_{sb}$ is the sum of the densities of the bound charges of these two dielectrics.

assuming rotationally-symmetric charge distributions, their (phasor) electric scalar-potential at a field point defined by the position-vector $\mathbf{r} = x\mathbf{u}_x + z\mathbf{u}_z$ (where $\mathbf{u}_x$ and $\mathbf{u}_z$ are the unit vectors of the cartesian coordinate system) is given by [21]

$$V(\mathbf{r}) = \frac{1}{\varepsilon_0}\int_{C'} \rho_{st}(\mathbf{r}')\,G_{\mathrm{bor}}(\mathbf{r},\mathbf{r}')\,\mathrm{d}l', \qquad (1)$$

where $\varepsilon_0 \approx 8.8541878128 \cdot 10^{-12}\ \mathrm{F/m}$ is the permittivity of a vacuum and

$$G_{\mathrm{bor}}(\mathbf{r},\mathbf{r}') = \frac{x'}{\pi\sqrt{q}}\,K(m) \qquad (2)$$

is bor green’s function.
further, $\mathbf{r}' = x'\mathbf{u}_x + z'\mathbf{u}_z$ defines the location of the source (i.e., the element $\mathrm{d}l'$), $K(m)$ is the complete elliptic integral of the first kind, $q = (x + x')^2 + (z - z')^2$, and $m = 4xx'/q$. the bor generatrix line $C'$ defines boundaries of all conducting and dielectric bodies and $\rho_{st}$ is an unknown function of the position along $C'$, i.e., a function of a local coordinate $l'$. the reference point for the potential is at infinity. note that the kernel of (1) becomes singular when $\mathbf{r} = \mathbf{r}'$. the singularity is logarithmic and integrable. the corresponding (phasor) electric field is $\mathbf{E} = -\operatorname{grad} V$.

we formulate a set of integral equations for $\rho_{st}$ based on the boundary conditions. the first part of this set is based on the boundary condition for the potential at the surfaces of electrodes. each conducting body is equipotential. we denote the number of the conducting bodies by $N_c$ and assume to know their potentials, $V_i$, $i = 1, \ldots, N_c$. consequently, when the field point is on the surface of a conducting body whose potential is $V_i$, we have an integral equation of the form $V(\mathbf{r}) = V_i$, i.e.,

$$\frac{1}{\pi\varepsilon_0}\int_{C'} \rho_{st}(\mathbf{r}')\,\frac{x'}{\sqrt{q}}\,K(m)\,\mathrm{d}l' = V_i, \quad i = 1, \ldots, N_c. \qquad (3)$$

the second part of this set of integral equations is based on the boundary condition for the normal component of the electric field at the dielectric-to-dielectric interfaces (fig. 3). we include here interfaces between any two dielectric bodies, as well as between a dielectric body and the surrounding vacuum. this boundary condition yields

$$\mathbf{E}_1 \cdot \mathbf{u}_{21}\left(1 - \frac{\varepsilon_{r1}}{\varepsilon_{r2}}\right) = \frac{\rho_{st}}{\varepsilon_0}, \qquad (4)$$

where $\mathbf{E}_1$ is the electric field in the first dielectric just at the boundary, $\mathbf{u}_{21}$ is the unit vector perpendicular to the boundary surface (directed from the second dielectric toward the first dielectric), $\rho_{st} = \rho_{sb}$ is the charge density at the interface, and $\varepsilon_{r1}$ and $\varepsilon_{r2}$ are the relative complex permittivities of the two dielectrics.
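the kernel (2) is simple to evaluate numerically. the sketch below is a minimal python illustration and not the authors' implementation: it computes $K(m)$ with the arithmetic-geometric mean instead of the library routines cited in the paper, and checks the kernel against the textbook potential of a thin charged ring observed on the axis. the names `ellipk` and `g_bor`, and the ring dimensions and charge, are assumptions made for the example.

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of a vacuum, F/m


def ellipk(m):
    """Complete elliptic integral of the first kind K(m), 0 <= m < 1,
    computed with the arithmetic-geometric mean (AGM)."""
    a, b = 1.0, math.sqrt(1.0 - m)
    for _ in range(40):  # AGM converges quadratically; 40 is a generous cap
        if abs(a - b) <= 1e-15 * a:
            break
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return math.pi / (2.0 * a)


def g_bor(x, z, xs, zs):
    """Kernel (2) for a field point (x, z) and a source ring at (xs, zs)."""
    q = (x + xs) ** 2 + (z - zs) ** 2
    m = 4.0 * x * xs / q
    return xs * ellipk(m) / (math.pi * math.sqrt(q))


# check on the axis (x = 0): thin ring of radius xs with total charge `charge`
xs, zs, z_field, charge = 0.010, 0.0, 0.020, 1e-12  # assumed values (m, m, m, C)
width = 1e-6                                         # assumed ring width dl', m
rho_st = charge / (2.0 * math.pi * xs * width)       # equivalent surface density
v_kernel = rho_st * g_bor(0.0, z_field, xs, zs) * width / EPS0
v_exact = charge / (4.0 * math.pi * EPS0 * math.hypot(xs, z_field - zs))
print(v_kernel, v_exact)
```

on the axis $m = 0$ and $K(0) = \pi/2$, so the kernel reduces to $x'/(2\sqrt{q})$ and the two printed values agree to rounding error.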
note that $\mathbf{E}_1 \cdot \mathbf{u}_{21} = E_{n1}$ is the normal component of the electric field in the first dielectric, so that $\mathbf{E}_{n1} = E_{n1}\mathbf{u}_{21}$.

fig. 3 boundary surface between two dielectrics (labels: dielectric 1 ($\varepsilon_{r1}$), dielectric 2 ($\varepsilon_{r2}$), $\rho_{st} = \rho_{sb}$, $\mathbf{u}_{21}$, $\mathbf{E}_1$, points $A$ and $B$, $\Delta n$)

the cartesian components of the electric field in fig. 2 are [21]

$$E_x(\mathbf{r}) = \frac{1}{\pi\varepsilon_0}\int_{C'} \rho_{st}(\mathbf{r}')\,x'\,\frac{2x'K(m)(1-m) - E(m)\bigl(2x' - m(x+x')\bigr)}{q^{3/2}\,m\,(1-m)}\,\mathrm{d}l', \qquad (5)$$

$$E_z(\mathbf{r}) = \frac{1}{\pi\varepsilon_0}\int_{C'} \rho_{st}(\mathbf{r}')\,x'\,\frac{(z-z')\,E(m)}{q^{3/2}\,(1-m)}\,\mathrm{d}l', \qquad (6)$$

where $E(m)$ is the complete elliptic integral of the second kind. these expressions contain stronger singularities than (1), because the kernels in (5) and (6) come from the derivative of green’s function. nevertheless, the technique for the evaluation of integrals described in subsection 2.3 handles even these integrals well.

an alternative approach is to compute the electric field by numerical differentiation. we evaluate $V(\mathbf{r})$ at two points ($A$ and $B$) on $\mathbf{u}_{21}$, which are close to the boundary surface (fig. 3), and calculate the normal component of the electric field as $E_{n1} \approx (V_A - V_B)/\Delta n$. the distance $\Delta n$ has to be carefully chosen in order to maximize the accuracy of computations. if $\Delta n$ is too small, the error of subtracting two similar numbers ($V_A$ and $V_B$) dominates. if $\Delta n$ is too large, the error of replacing the differentiation by differencing becomes pronounced.

our primary goal is to numerically evaluate the matrix of electrostatic-induction coefficients $[\mathbf{B}]$ [22]. we consider here a system that consists of two conductors (such as the one shown in fig. 2). the conductor free charges ($Q_1$ and $Q_2$) and potentials ($V_1$ and $V_2$) are related as $Q_1 = b_{11}V_1 + b_{12}V_2$, $Q_2 = b_{21}V_1 + b_{22}V_2$, where $b_{ij}$, $i, j = 1, 2$, are the electrostatic-induction coefficients, so that

$$[\mathbf{B}] = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix}.$$

(due to reciprocity, $b_{ij} = b_{ji}$.) a generalization to a system with an arbitrary number of conductors is straightforward.
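the two ways of obtaining the field, the closed-form kernels of (5) and (6) and the differencing of the potential, can be compared directly. the python sketch below does this for a single source ring, with the kernels written in the form obtained by differentiating the kernel of (2) with respect to $x$ and $z$ (the form is spelled out in the comments). everything here is an illustrative assumption rather than the authors' code: the names `ellipke` and `kernels`, the test point, the step `h`, and the use of a central difference along the coordinate axes instead of a two-point difference along the normal.

```python
import math


def ellipke(m):
    """Return (K(m), E(m)) for 0 <= m < 1 using the AGM iteration:
    E = K * (1 - sum_n 2**(n-1) * c_n**2), with c_0**2 = m."""
    a, b = 1.0, math.sqrt(1.0 - m)
    series = 0.5 * m   # n = 0 term
    weight = 1.0       # 2**(n-1) for n = 1
    for _ in range(40):
        c = 0.5 * (a - b)          # c_{n+1} = (a_n - b_n) / 2
        if abs(c) <= 1e-16 * a:
            break
        a, b = 0.5 * (a + b), math.sqrt(a * b)
        series += weight * c * c
        weight *= 2.0
    k = math.pi / (2.0 * a)
    return k, k * (1.0 - series)


def kernels(x, z, xs, zs):
    """Kernels g, ex, ez such that V, E_x, E_z equal
    (1/eps0) * integral of rho_st * kernel * dl'.  Requires x > 0, m < 1."""
    q = (x + xs) ** 2 + (z - zs) ** 2
    m = 4.0 * x * xs / q
    k, e = ellipke(m)
    g = xs * k / (math.pi * math.sqrt(q))
    # ex = x' [2x'(1-m)K - (2x' - m(x+x'))E] / (pi m q^{3/2} (1-m))
    ex = xs * (2.0 * xs * (1.0 - m) * k - (2.0 * xs - m * (x + xs)) * e) / (
        math.pi * m * q ** 1.5 * (1.0 - m))
    # ez = x' (z-z') E / (pi q^{3/2} (1-m))
    ez = xs * (z - zs) * e / (math.pi * q ** 1.5 * (1.0 - m))
    return g, ex, ez


x, z, xs, zs = 0.015, 0.004, 0.010, 0.0   # assumed field point and source ring, m
h = 1e-7                                  # assumed differencing step, m
_, ex, ez = kernels(x, z, xs, zs)
ex_fd = -(kernels(x + h, z, xs, zs)[0] - kernels(x - h, z, xs, zs)[0]) / (2.0 * h)
ez_fd = -(kernels(x, z + h, xs, zs)[0] - kernels(x, z - h, xs, zs)[0]) / (2.0 * h)
print(abs(ex - ex_fd) / abs(ex), abs(ez - ez_fd) / abs(ez))
```

shrinking `h` much further makes the subtraction of nearly equal potentials dominate, while enlarging it increases the truncation error, which is the trade-off described in the text.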
in order to compute the elements of the matrix $[\mathbf{B}]$, we assume that one conductor is at a certain non-zero potential (e.g., 1 V) and all other conductors are at a zero potential. we numerically evaluate the free charges of the conductors. hence, the elements of one column of $[\mathbf{B}]$ are easily calculated. this procedure is repeated for all conductors.

for our measurements, we also need the matrix of partial capacitances $[\mathbf{C}]$ of the analyzed system. for a two-conductor system, the matrix $[\mathbf{C}]$ is defined in terms of the elements of the matrix $[\mathbf{B}]$ as

$$[\mathbf{C}] = \begin{bmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{bmatrix} = \begin{bmatrix} b_{11} + b_{12} & -b_{12} \\ -b_{21} & b_{21} + b_{22} \end{bmatrix}.$$

2.2. method-of-moments solution

the complete set of integral equations is solved numerically using the mom. as the basis (expansion) functions, we implement one of the simplest approximations for the distribution of the total surface charges: the piecewise-constant (staircase, pulse) approximation. to that purpose, we divide the contour $C'$ into a number of straight-line segments. (in the general case, each segment corresponds to a right conical frustum, which may degenerate into a right cylindrical frustum or a flat circular ring.) we assume that $\rho_{st}$ is constant along a segment, though yet unknown. in order to provide a high accuracy and at the same time minimize the number of unknowns, we take the segments to be denser in regions where we expect faster variations of $\rho_{st}$, e.g., near edges of conductor and dielectric bodies. the distribution of the segments is defined in a way similar to [16].

for testing, we implement the galerkin procedure: we integrate the left-hand side and the right-hand side of each integral equation over the surface of each frustum, one at a time.
as the result, the elements of the part of the mom matrix that corresponds to the boundary condition (3) have the form

$$Z_{ij} = \int_{C_i}\left[\frac{1}{\pi\varepsilon_0}\int_{C_j}\frac{x_j}{\sqrt{q}}\,K(m)\,\mathrm{d}l_j\right] 2\pi x_i\,\mathrm{d}l_i, \quad i = 1, \ldots, N_{sc}, \quad j = 1, \ldots, N_s, \qquad (7)$$

where the index $i$ corresponds to the field segment (i.e., the segment where the boundary condition is implemented) and $j$ corresponds to the source segment. further, $C_i$ denotes the field segment and $C_j$ denotes the source segment. $N_{sc}$ is the total number of segments for conductors and $N_s = N_{sc} + N_{sd}$ is the total number of segments (unknowns) for the whole structure, where $N_{sd}$ is the total number of segments for dielectric-to-dielectric interfaces. finally, in (7), $q = (x_i + x_j)^2 + (z_i - z_j)^2$ and $m = 4x_ix_j/q$.

the elements of the remaining part of the mom matrix, which corresponds to the boundary condition (4), i.e., $Z_{ij}$, $i = N_{sc} + 1, \ldots, N_s$, $j = 1, \ldots, N_s$, have a similar form, except that, in their derivation, the integrals in (5) and (6) are used instead of the integral in (3).

we have found that high-contrast dielectrics (e.g., if the relative permittivity of one dielectric is 1000 and the other dielectric is a vacuum) tend to destabilize the system. in order to solve this problem, we add an equation that requires that the total bound charge of the system is zero [23].

we solve the resulting system of linear equations by the lu decomposition and back substitution. thus we obtain the total charge densities on the segments. all bor conductors are assumed to have finite thicknesses. hence, we evaluate the free-charge density of a segment simply as $\rho_s = \varepsilon_r\rho_{st}$, where $\varepsilon_r$ is the relative complex permittivity of the adjacent dielectric. knowing the free-charge densities, we evaluate the free charges of the conductors and, hence, calculate the matrices $[\mathbf{B}]$ and $[\mathbf{C}]$.

2.3. evaluation of integrals

we have devoted particular care to the evaluation of integrals, in order to soften the influence of singularities, yet obtain a good accuracy and high computational speed. we use double precision arithmetic. we evaluate the elliptic integrals using library functions [24].

the inner integration in (7), along the source segment $C_j$, is performed numerically in the following way. let us consider the source segment shown in fig. 4. in the coordinate system $Oxz$, the endpoints of the segment are $P_1(x_1, z_1)$ and $P_2(x_2, z_2)$. a local coordinate system is attached to the segment, so that its origin ($O_{uv}$) is in the middle of the segment, the $u$-axis is along the segment, and the $v$-axis is perpendicular to it.

let us assume that the global coordinates of the field point are $P(x, z)$. the local $(u, v)$ coordinates of the field point are evaluated and the point $P$ is projected onto the $u$-axis to obtain $P'$. the minimal distance between $P$ and the segment is calculated. two distinct cases are considered: first, when $P'$ lies on the segment, and second, when it is out of the segment (either towards negative $u$-coordinates or towards positive $u$-coordinates). in the first case, the distance is equal to $\overline{PP'}$ and the segment is divided into two integration intervals, bounded by $P'$. the integration is then carried out on these two parts separately. in the second case, the minimal distance is the distance between $P$ and the closer end of the segment ($P_1$ or $P_2$), and the integration is carried out on the whole segment as one integration interval.

fig. 4 local coordinate system for evaluation of integrals (labels: $O$, $x$, $z$, $u$, $v$, $x_1$, $x_2$, $z_1$, $z_2$, $O_{uv}$, $P$, $P'$, $P_1$, $P_2$)

based on our experience, if the minimal distance is greater than one half of the length of the integration path ($\overline{P_1P_2}$), the integration is performed on the whole path as a single integration interval, using a gauss-legendre integration formula.
otherwise, the integration interval is divided into nonuniform subintervals (at most 30), whose lengths progressively increase away from $P'$. each increase is by the factor of 2. the same integration formula is used for all subintervals, both for the potential and for the field components.

the outer integration in (7), along the field segment $C_i$, is also performed numerically using gauss-legendre integration.

if the electric field is evaluated using differentiation, numerical experiments have shown that the optimal choice for the evaluation of the electric field is to take $\Delta n = \overline{AB}$ (fig. 3) determined by a small parameter equal to $10^{-6}$.

as the result, all the integrals (and their derivatives) are calculated to at least 5 significant digits.

3. benchmark results

the analysis method was tested on various examples where analytical solutions exist (fig. 5).

fig. 5 longitudinal cross sections of benchmark structures: (a) conducting sphere, (b) dielectric-covered conducting sphere, (c) conducting prolate ellipsoid, (d) spherical capacitor, and (e) coaxial-line section; all dimensions are in millimeters

3.1. conducting sphere

shown in fig. 5a is a conducting sphere located in a vacuum. the radius of the sphere is $a = 10$ mm. the theoretical capacitance of the sphere is $C_{\mathrm{th}} = 4\pi\varepsilon_0 a = 1.112650$ pF.

the cross section of the sphere is a circle, which is approximated in our computations by a regular polygon with $N_p$ sides. hence, the generatrix of the sphere is a semi-circle, which is approximated by a polygonal line with $N_s = N_p/2$ uniform segments. in the numerical model, the actual sphere is approximated by a set of right conical frustums.

in order to reduce the error of the geometrical modeling, we use the same strategy as in [25]: the radius of the given sphere ($a$) is the mean value of the radius of the circle inscribed into the polygon ($r_{\mathrm{in}}$) and the radius of the circle circumscribed around the polygon ($r_{\mathrm{out}}$). hence, $r_{\mathrm{out}} = 2a/(1 + \cos(\pi/N_p))$ and the generatrix is easily constructed. this strategy is in accordance with the theorem, due to maxwell, that the capacitance of a conducting body is larger than the capacitance of an inscribed body and smaller than the capacitance of a circumscribed body [26].

the numerical result for the capacitance obtained with 20 pulses is $C_{\mathrm{num}} = 1.112098$ pF, which corresponds to a relative error with respect to $C_{\mathrm{th}}$ of around 0.0005. the relative error is reduced below $10^{-6}$ when the number of pulses is increased to 150.

3.2. dielectric-covered conducting sphere

fig. 5b shows a conducting sphere, whose radius is $a = 10$ mm, covered by a concentric dielectric layer. the outer radius of the dielectric is $b = 30$ mm and the relative permittivity is $\varepsilon_r = 10^4$. the remaining space is a vacuum. the theoretical capacitance of the sphere is $C_{\mathrm{th}} = 4\pi\varepsilon_0 / ((b - a)/(\varepsilon_r ab) + 1/b) = 3.336459$ pF. the computed value, obtained with 20 segments per spherical surface, is $C_{\mathrm{num}} = 3.336185$ pF, so that the relative error is around 0.0005. the same low relative error is obtained for any other $\varepsilon_r$ ranging from 1.000000 to $10^{18}$. similar results are obtained for a sphere with several concentric dielectric layers.

3.3. conducting prolate ellipsoid

fig. 5c shows a conducting prolate ellipsoid, located in a vacuum. the longer semi-axis of the spheroid (which is the axis of rotational symmetry) is $a = 10$ mm and the shorter semi-axis is $b = 2$ mm. the theoretical capacitance is $C_{\mathrm{th}} = 8\pi\varepsilon_0 c / \ln((a + c)/(a - c)) = 0.4755518$ pF, where $c = \sqrt{a^2 - b^2}$. in order to keep the relative error below 0.001, at least 30 segments are needed.

3.4. spherical capacitor

a spherical capacitor, which consists of two concentric conducting spherical shells, is shown in fig. 5d. the radius of the inner conductor is $a = 10$ mm, the inner radius of the outer conductor is $b = 30$ mm, and the outer radius of the outer conductor is $c = 32$ mm. the medium is a vacuum. the theoretical matrix of electrostatic induction coefficients for this system is

$$[\mathbf{B}]_{\mathrm{th}} = \begin{bmatrix} \dfrac{4\pi\varepsilon_0 ab}{b-a} & -\dfrac{4\pi\varepsilon_0 ab}{b-a} \\ -\dfrac{4\pi\varepsilon_0 ab}{b-a} & \dfrac{4\pi\varepsilon_0 ab}{b-a} + 4\pi\varepsilon_0 c \end{bmatrix}$$

and the corresponding capacitance matrix is

$$[\mathbf{C}]_{\mathrm{th}} = \begin{bmatrix} 0 & \dfrac{4\pi\varepsilon_0 ab}{b-a} \\ \dfrac{4\pi\varepsilon_0 ab}{b-a} & 4\pi\varepsilon_0 c \end{bmatrix} = \begin{bmatrix} 0 & 1.668975 \\ 1.668975 & 3.560480 \end{bmatrix}\ \mathrm{pF}.$$

using 20 segments per spherical surface (i.e., a total of 60 unknowns), the computed capacitance matrix is

$$[\mathbf{C}]_{\mathrm{num}} = \begin{bmatrix} -2.048 \cdot 10^{-6} & 1.668181 \\ 1.668181 & 3.558716 \end{bmatrix}\ \mathrm{pF}.$$

if 50 segments are used, then

$$[\mathbf{C}]_{\mathrm{num}} = \begin{bmatrix} -2.684 \cdot 10^{-9} & 1.668842 \\ 1.668842 & 3.560192 \end{bmatrix}\ \mathrm{pF}.$$

theoretically, $c_{11} = 0$ because the inner conductor is completely shielded by the outer conductor. the numerical result for $c_{11}$ is very small, indicating a high accuracy of computations.

3.5. coaxial line

the last example considered here, for which an analytical solution exists, is a section of a coaxial line (fig. 5e), whose dielectric is teflon, of relative permittivity $\varepsilon_r = 2.1$. the radius of the inner conductor is $a = 2$ mm, the inner radius of the outer conductor is $b = 7$ mm, and the outer radius of the outer conductor is $c = 8$ mm. the coaxial line is open-circuited at both ends and the width of both gaps between the conductors is 5 mm. the length of the inner conductor is $l_a = 50$ mm, the inner length of the outer conductor is $l_b = 60$ mm, and the outer length of the outer conductor is $l_c = 62$ mm.

the structure shown in fig. 5e has significant fringing capacitances at both ends.
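the closed-form reference values quoted in subsections 3.1, 3.3 and 3.4 can be reproduced with a few lines of python, which is a convenient sanity check before comparing against numerical results. the sketch below also applies the two-conductor relation between the induction coefficients and the partial capacitances from subsection 2.1 to the spherical capacitor. the function names are illustrative assumptions, and the value of the vacuum permittivity is the one given with equation (1).

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of a vacuum, F/m


def sphere_capacitance(a):
    """Isolated conducting sphere of radius a."""
    return 4.0 * math.pi * EPS0 * a


def prolate_capacitance(a, b):
    """Conducting prolate spheroid, semi-axes a > b, c = sqrt(a^2 - b^2)."""
    c = math.sqrt(a * a - b * b)
    return 8.0 * math.pi * EPS0 * c / math.log((a + c) / (a - c))


def spherical_capacitor_b(a, b, c):
    """Theoretical [B] of two concentric shells: inner radius a,
    outer shell with inner radius b and outer radius c."""
    k = 4.0 * math.pi * EPS0 * a * b / (b - a)
    return [[k, -k], [-k, k + 4.0 * math.pi * EPS0 * c]]


def partial_capacitances(b):
    """Two-conductor [C] from [B]:
    c11 = b11 + b12, c12 = -b12, c21 = -b21, c22 = b21 + b22."""
    (b11, b12), (b21, b22) = b
    return [[b11 + b12, -b12], [-b21, b21 + b22]]


c_sphere = sphere_capacitance(10e-3)
c_prolate = prolate_capacitance(10e-3, 2e-3)
c_matrix = partial_capacitances(spherical_capacitor_b(10e-3, 30e-3, 32e-3))

print(f"sphere:    {c_sphere * 1e12:.6f} pF")
print(f"spheroid:  {c_prolate * 1e12:.7f} pF")
for row in c_matrix:
    print("capacitor:", [f"{value * 1e12:.6f} pF" for value in row])
```

the element $c_{11}$ comes out exactly zero here because it is the difference of two identical terms, whereas a numerical solver returns a small residual, as in the computed matrices above.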
in order to compute the per-unit-length capacitance of the coaxial line ($C'$), we have to remove the effect of the fringing capacitances. in the middle zone of the structure, which is sufficiently far away from the ends, the structure of the electric field is practically the same as in an infinitely long line. if we increase the length of the structure by $\Delta l$ (i.e., if we increase $l_a$, $l_b$, and $l_c$ by $\Delta l$), without changing the gap widths, the fringing capacitances will remain the same. hence, the corresponding increase in the mutual capacitance between the inner and the outer conductor can be attributed only to the increased capacitance of the middle zone. following this reasoning, we compute the mutual capacitance for the original dimensions of the structure ($C_{12}^{(1)}$) and for the increased length ($C_{12}^{(2)}$). from these two results, $C' = (C_{12}^{(2)} - C_{12}^{(1)}) / \Delta l$. using 35 segments for the inner conductor and 93 for the outer conductor, for $\Delta l = 2$ mm, the computed per-unit-length capacitance is $C'_{\mathrm{num}} = 93.24421$ pf/m. the theoretical per-unit-length capacitance is $C'_{\mathrm{th}} = 2\pi\varepsilon_r\varepsilon_0 / \ln(b/a) = 93.25647$ pf/m. the relative error between $C'_{\mathrm{num}}$ and $C'_{\mathrm{th}}$ is 0.00013.
3.6. run time
the run time of the program is primarily influenced by the number of unknowns. the program is not parallelized, i.e., it uses only one core. with 100 unknowns, the run time is less than 1 s on a desktop computer with intel core i7-3770 @ 3.4 ghz, 32 gb ram, and 64-bit windows operating system.
4. measurements using coaxial chamber
in this section we apply the technique for the bor analysis, described in section 2, to the coaxial chamber shown in fig. 1. we describe the model of the chamber and the calibration procedure, and present some measurement results.
4.1. bor model of measurement setup
the segmented model of the chamber is shown in fig. 6a. the plot shows the generatrix.
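since the same length-difference step from subsection 3.5 is reused for the sma section during the chamber calibration, a minimal python sketch of it may be useful here. it is an illustration only, under stated assumptions: the function names are hypothetical, the two mutual capacitances would come from the bor solver (they are not reproduced in the text, so none are filled in), and only the closed-form reference uses the quoted dimensions a = 2 mm, b = 7 mm, εr = 2.1.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m


def coax_c_per_length(a, b, eps_r):
    """Closed-form per-unit-length capacitance (F/m) of an infinitely long coaxial line."""
    return 2.0 * math.pi * eps_r * EPS0 / math.log(b / a)


def c_per_length_from_two_runs(c12_short, c12_long, delta_l):
    """Length-difference estimate: the fringing capacitance is the same in both
    runs, so it cancels when the two mutual capacitances are subtracted."""
    return (c12_long - c12_short) / delta_l


if __name__ == "__main__":
    # Reference value for the teflon line of subsection 3.5
    # (the text quotes 93.25647 pF/m for the same dimensions).
    c_th = coax_c_per_length(a=2e-3, b=7e-3, eps_r=2.1)
    print(f"C'_th = {c_th * 1e12:.5f} pF/m")
```

the second function takes whatever pair of solver results is available; the only requirement is that the gap widths are identical in both runs, so that the end effects cancel.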
the numbers of segments were chosen by an educated guess and numerical experiments (i.e., convergence tests) so as to provide good accuracy at a reasonable run time. obviously, there are several differences between the model and the actual structure shown in fig. 1:
the generatrix in fig. 6a does not completely follow the contours of the actual device.
when analyzing antennas and various microwave circuits, the structure must have ports and it is excited at those ports [11]. this is the same situation as in actual measurements. however, in our electrostatic model, the excitation is “virtual”: the conductors are assumed to be at a certain potential with respect to the reference point. no interconnections are provided between the conductors and the surroundings.
in fig. 1, which shows the actual device, the inner conductor of the chamber extends all the way to the vna reference plane at the bottom of the sma connector. in measurements, the inner and the outer conductors of the sma connector further extend into the vna connector. however, in the electrostatic model, the conductors must be floating. hence, in fig. 6a, the inner conductor of the coaxial line is left open-circuited inside the sma connector.
the outer conductor of the chamber in fig. 1 has an opening at the mouth of the sma connector (where it extends to the mating sma connector of the vna), whereas in fig. 6a there is no such opening.
the structure shown in fig. 6a is completely shielded so that there is no electric field outside. hence, the shape of the outer surface of the outer conductor is irrelevant. for simplicity, we have taken a spherical shape.
fig. 6 panel labels: shifted reference plane, sma connector, teflon, dielectric sample, plunger, plate; panels (a) and (b)
fig. 6 segmented model of (a) chamber shown in fig.
1 and (b) coaxial-line section and its positive image; red segments are for inner conductor, blue segments are for outer conductor, and green segments are for dielectric-to-dielectric interfaces; coordinates are in meters
in subsection 4.2 we present numerical results of the electrostatic bor analysis. in subsection 4.3 we describe a theoretically rigorous calibration procedure that relates the actual setup with the electrostatic model. the aim of the calibration is to obtain a unique and measurable result for the chamber capacitance as seen looking upwards from the shifted reference plane in fig. 6a.
4.2. numerical results for empty chamber
we consider an empty chamber, without a sample, but with a gap of $h = 2$ mm between the brass plate and the plunger. (equivalently, the relative permittivity of the sample is 1.) the electrostatic analysis of the structure shown in fig. 6a yields the following matrix of the electrostatic induction coefficients:
$$[B] = \begin{bmatrix} 3.834183568551 & -3.834183291384 \\ -3.834183443997 & 5.688195908149 \end{bmatrix}\ \mathrm{pF}.$$
note that the numerically obtained matrix $[B]$ is almost perfectly symmetrical (up to 8 significant digits). the matrix of the partial capacitances is
$$[C] = \begin{bmatrix} 0.00000012455 & 3.834183291384 \\ 3.834183443997 & 1.854012616765 \end{bmatrix}\ \mathrm{pF}.$$
the outer surface of the chamber in fig. 6a is a sphere, which we approximate in the same way as described in subsection 3.1. the theoretical capacitance, $C_{\mathrm{sphere}} = 1.8543089243$ pf, agrees with the computed $C_{22}$ within the first four digits.
4.3. calibration
referring to the previous subsection, the mutual capacitance $C_{\mathrm{model}} = C_{21} \approx 3.8342$ pf is the capacitance between the inner conductor of the chamber and the outer conductor. the modeled structure (fig. 6a) includes the inner conductor of a section of the coaxial line (within the zone of the sma connector) whose length is 3 mm. this conductor is left open-circuited, but it contributes to $C_{\mathrm{model}}$.
hence, its influence must be calibrated out. there exists a strong fringing effect at the open end of the coaxial line. this situation is similar to the one described in subsection 3.5. there also exists a discontinuity at the transition between the coaxial line and the chamber. hence, we cannot assume that the field structure along the whole line is the same as in an infinitely long line. (note that the field in an infinitely long line corresponds to the electric field of the guided tem wave.)
we consider this coaxial-line section and its positive image in the shifted reference plane (fig. 6b). this structure has two identical fringing zones. the computed capacitance between the inner conductor and the outer conductor is $C_{\mathrm{coax\_double}} = 0.7016$ pf. one half of it can be ascribed to the coaxial line in fig. 6a, assuming that the tem field exists all the way up to the shifted reference plane (although this is not true). hence, the apparent capacitance of the chamber, looking from the shifted reference plane upwards, is $C_{\mathrm{chamber}} = C_{\mathrm{model}} - C_{\mathrm{coax\_double}} / 2$.
note that, theoretically, we cannot uniquely define $C_{\mathrm{chamber}}$ because it depends on the presence of the inner coaxial-line conductor, which affects the fringing field in the vicinity of the shifted reference plane. however, the described procedure of evaluating the apparent capacitance is essentially the same as used in actual measurements, where we measure the input admittance at the vna calibration plane, then shift the reference plane, and calculate the new admittance. in this procedure, it is assumed that a pure tem wave exists in the coaxial line all the way up to the shifted reference plane.
on the other hand, from manufacturer’s data, we know the geometrical dimensions of the sma connector and that its dielectric is teflon. hence, the actual length of the coaxial line, from the vna calibration plane in fig.
1 up to the shifted reference plane, is $l_{\mathrm{coax}} = 11.75$ mm. as in subsection 3.5, the per-unit-length capacitance of the coaxial line is calculated to be $C' = 96.045$ pf/m, so that the capacitance of this section (assuming that the electric field has the same structure as in an infinitely long line) is $C_{\mathrm{coax}} = l_{\mathrm{coax}} C' = 1.1285$ pf. the apparent capacitance transformed back from the shifted reference plane to the vna calibration plane is thus $C_{\mathrm{at\ vna\ reference\ plane}} = C_{\mathrm{chamber}} + C_{\mathrm{coax}} = C_{\mathrm{model}} + 0.7777$ pf = 4.6064 pf. this is physically the same quantity as evaluated by measurements at the vna calibration plane, looking towards the chamber. this capacitance, measured at f = 30 mhz, is $C_{\mathrm{measured}} = (4.60 \pm 0.01)$ pf and it agrees with $C_{\mathrm{at\ vna\ reference\ plane}}$ within the measurement uncertainty.
4.4. examples of measurement of dielectric parameters
in this subsection, we use the complete measurement setup (vna, coaxial chamber, and software) to evaluate parameters of various dielectric samples. the general procedure is to measure the reflection coefficient of the chamber and the dielectric sample (at the vna reference plane) and, hence, calculate the corresponding complex admittance. from the admittance, we evaluate the complex capacitance of the chamber. thereafter, we use the quasistatic model of the chamber with the sample. in that model, we vary (optimize) the relative complex permittivity of the sample so as to obtain the same complex capacitance as measured. the procedure can be simplified because, in a reasonably wide range of permittivities, the capacitance is practically a linear function of the permittivity (i.e., $C_{\mathrm{at\ vna\ reference\ plane}} = \alpha\varepsilon_r + \beta$, where $\alpha$ and $\beta$ are constants). hence, it is sufficient to implement a linear fit in the complex domain between two capacitances computed for two assumed permittivities, which are selected, e.g., based on an educated guess.
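the linear-fit shortcut described above can be sketched as follows. this is a minimal python illustration, not the authors' software: the function names and all numerical values in the example are hypothetical, and the two "computed" capacitances stand in for results of the quasistatic model evaluated at two assumed permittivities.

```python
def fit_linear(er_1, c_1, er_2, c_2):
    """Return (alpha, beta) of C = alpha * er + beta from two model evaluations.
    All quantities may be complex."""
    alpha = (c_2 - c_1) / (er_2 - er_1)
    beta = c_1 - alpha * er_1
    return alpha, beta


def permittivity_from_capacitance(c_measured, alpha, beta):
    """Invert the linear relation to get the complex relative permittivity."""
    return (c_measured - beta) / alpha


if __name__ == "__main__":
    # Hypothetical model results (pF) for two assumed permittivities.
    alpha, beta = fit_linear(2.0, 4.9 + 0j, 6.0, 6.1 + 0j)
    # Hypothetical measured complex capacitance (pF).
    er = permittivity_from_capacitance(5.6 - 0.03j, alpha, beta)
    loss_tangent = -er.imag / er.real  # assuming the convention er = er' - j er''
    print(er, loss_tangent)
```

the sign convention for the imaginary part is an assumption of this sketch; if the measured data use the opposite convention, the sign of the loss tangent line changes accordingly.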
if the sample is small (i.e., d and h are sufficiently smaller than the diameter of the plate shown in fig. 1), the electric field in the whole dielectric sample is practically homogeneous. in that case, the measurement procedure is simple. first, the sample is inserted, fixed by the plunger, and the complex capacitance $C$ is measured. second, the sample is removed, the plunger is positioned at the same elevation h as when the sample was present, and the capacitance $C_0$ of the empty chamber is measured. this is an elementary situation in electrostatics, for which $C - C_0 = (\varepsilon_r - 1)\,\varepsilon_0 \pi d^2 / (4h)$. hence, $\varepsilon_r$ can easily be calculated.
in order to illustrate the applications of the coaxial chamber shown in fig. 1, we present here results for three measured samples. two samples are printed-circuit board (pcb) substrates, measured at f = 100 mhz. the first substrate is taconic 602-250. the measured relative permittivity was $\varepsilon_r = 2.55$, which agrees well with the manufacturer’s data ($\varepsilon_r = 2.50$). the measured loss tangent was $\tan\delta < 0.001$ (below the resolution of our measurement system). for the second substrate, fr-4, we obtained $\varepsilon_r = 4.49$ and $\tan\delta = 0.025$, which agrees well with the data from [7].
the third example is a ceramic material, alumina (al2o3) doped with nickel oxide (nio), mechanically activated by ball-milling for 60 minutes and sintered at 1400 °c [27]. fig. 7 shows the relative permittivity and loss tangent of the material in the frequency range from 1 mhz to 500 mhz. the material is lossy and, hence, the relative permittivity significantly decreases with frequency. mathematically, this decay follows from the causality conditions [7]. the measurement uncertainty at the lowest frequencies (around 1 mhz) is large because the input admittance of the chamber is very small (i.e., the chamber behaves almost like an open circuit).
hence, very small measurement errors of the reflection coefficient cause huge errors of the input admittance. the accuracy at frequencies in the range from 10 mhz to 100 mhz is much better. the accuracy at higher frequencies decreases because the field in the chamber cannot be considered to be quasistatic anymore. for these frequencies, the estimation of the relative permittivity requires a full-wave model of the chamber.
fig. 7 measured relative permittivity and loss tangent of alumina doped with nickel oxide
5. conclusion
a high-precision and efficient quasistatic numerical method for the analysis of arbitrary metallo-dielectric bodies of revolution was presented. the method has been developed for measurements of dielectric parameters of small disk-shaped samples, for frequencies up to several hundred mhz. for higher frequencies, up to around 10 ghz, a full-wave (dynamic) solver is under development.
acknowledgement: this paper was funded in part by the project f133 of the serbian academy of sciences and arts, and in part by the ministry of education, science and technological development of the republic of serbia.
references
[1] basics of measuring the dielectric properties of materials, application note, keysight technologies, document available at: https://www.keysight.com/zz/en/assets/7018-01284/application-notes/5989-2589.pdf.
[2] o. v. tereshchenko, f. j. k. buesink and f. b. j. leferink, "an overview of the techniques for measuring the dielectric properties of materials", in proceedings of the xxxth ursi gen. ass. sci. symp., vol. 1320, istanbul, turkey, 2011, pp. 1–4.
[3] t. p. marsland and s. evans, "dielectric measurements with an open-ended coaxial probe", iee proc. microw., antennas propag., vol. 134, no. 4, pp. 341–349, 1987.
[4] b. sanadiki and m. mostafavi, "inversion of inhomogeneous continuously varying dielectric profiles using open-ended waveguides", ieee trans.
antennas propag., vol. 39, no. 2, pp. 158–163, feb. 1991. [5] d. k. ghodgaonkar and v. v. varadan, "a free-space method for measurement of dielectric constants and loss tangents at microwave frequencies", ieee trans. instrum. meas., vol. 37, pp. 789–793, 1989. [6] a. m. nicolson and g. f. ross, "measurement of the intrinsic properties of materials by time domain techniques", ieee trans. instrum. meas., vol. im–19, pp. 377–382, 1970. [7] a. r. djordjević, r. m. biljić, v. d. likar-smiljanić and t. k. sarkar, "wideband frequency-domain characterization of fr-4 and time-domain causality", ieee trans. electromagn. compat., vol. 43, pp. 662–667, 2001. [8] p. dankov, b. hadjistamov, i. arestova and v. levcheva, "measurement of dielectric anisotropy of microwave substrates by two-resonator method with different pairs of resonators", piers online, vol 5, pp. 501–505, oct. 2009. [9] a. đorđević, j. dinkić, m. stevanović, d. olćan, s. filipović and n. obradović, "measurement of permittivity of solid and liquid dielectrics in coaxial chambers", microw. rev., vol. 22, pp. 3–9, dec. 2016. [10] r. f. harrington, time-harmonic electromagnetic fields, hoboken, nj: wiley-ieee press, 2001, chapter 1. [11] b. m. kolundžija and a. r. djordjević, electromagnetic modeling of composite metallic and dielectric structures, norwood, ma: artech house, 2002, pp. 6–8. [12] r. f. harrington, field computation by moment methods, hoboken, nj: wiley-ieee press; 1993. [13] m. salazar-palma, t. k. sarkar, l.-e. garcia-castillo, t. roy and a.r. djordjević, iterative and selfadaptive finite-elements in electromagnetic modeling, norwood, ma: artech house, 1998. [14] j. v. surutka and d. m. veličković: "some improvements of the charge simulation method for computing electrostatic fields", bull. serb. acad. sci. arts, class sci. techn., no. 15, pp. 27–44, 1981. [15] d. m. veličković and a. milovanović, "electrostatic field of cube electrodes", serbian j. electr. eng., vol. 1, pp. 187–198, june 2004. 
[16] a. r. djordjević, m. b. baždar, r. f. harrington and t. k. sarkar, linpar for windows: matrix parameters for multiconductor transmission lines, norwood, ma: artech house, 1999. [17] cst studio suite, available at: https://www.3ds.com/products-services/simulia/products/cst-studio-suite/. [18] m. m. nikolić, a. r. djordjević and m. m. nikolić, es3d: electrostatic field solver for multilayer circuits, norwood, ma: artech house, 2007. [19] r. a. handelsman and j. b. keller, "the electrostatic field around a slender conducting body of revolution", siam j. appl. math., vol. 15, pp. 824–841, july 1967. [20] r. barshinger, "the electrostatic field about a thin oblate dielectric body of revolution", siam j. appl. math., vol. 52, pp. 651–675, may 1991. [21] o. ciftja, a. babineaux and n. hafeez n, "on the electrostatic potential of a uniformly charged ring", eur. j. phy., vol. 30, 623–627, may 2009. [22] a. r. djordjević, electromagnetics, belgrade, serbia: academic mind, 2008, section 2.5. [23] d. olćan, diakoptic analysis of electromagnetic systems, ph.d. thesis, school of electrical engineering, university of belgrade, serbia, 2008, pp. 26–28. [24] imsl fortran and c application development tools, houston, tx: visual numerics, 1997. [25] j. dinkić, d. olćan, a. djordjević, а. zajić, "design and optimization of nonuniform helical antennas with linearly varying geometrical parameters", ieee access, vol. 7, pp. 1–12, oct. 2019. [26] a. r. djordjević, fundamentals of electrical engineering, belgrade, serbia: academic mind, 2016, section 1.10.1. [27] s. filipović, n. obradović, s. marković, a. đorđević, i. balać, a. dapčević, j. rogan, v. pavlović, "physical properties of sintered alumina doped with different oxides", sci. sinter., vol. 50, pp. 1–11, 2018. facta universitatis series: electronics and energetics vol. 34, no 3, september 2021, pp. 
435-444 https://doi.org/10.2298/fuee2103435b © 2021 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper
dual wideband and high gain microstrip antenna for wireless system
biplab bag, sushanta biswas, partha pratim sarkar
department of engineering and technological studies, kalyani university, west bengal, india
abstract. in this paper, a dual wideband, high gain, circular microstrip antenna with a modified ground plane is presented for wireless communication systems. the overall dimension of the proposed antenna is 50 x 40 x 1.6 mm3. the radiating element consists of a circular patch which is excited by a microstrip feed-line printed on an fr4 epoxy substrate. the ground plane, on the other side of the substrate, has a rectangular ring shape to enhance the peak gain of the antenna. the proposed antenna exhibits two wide fractional bandwidths (based on ≤ -10 db) of 61.1% (ranging from 2.0 to 3.8 ghz, centred at 2.88 ghz) and 53.37% (ranging from 5.48 to 9.6 ghz, centred at 7.44 ghz). the measured peak gain is 8.25 dbi at 8.76 ghz. the measured impedance bandwidth and gain are sufficient for the commercial bands of wireless systems such as 4g lte band-40, bluetooth, wi-fi, wlan, wimax, c-band, and x-band. the simulated results are verified experimentally, and reasonable agreement is found between them.
key words: high gain, monopole, wide bandwidth, wlan, wimax, x-band
1. introduction
recently, wireless communication systems have played an important role in the development and advancement of modern technology. microstrip patch antennas (mpa) are widely used in wireless systems due to inherent features such as low cost, light weight, conformability, and ease of integration with microwave circuits [1]. however, microstrip patch antennas suffer from narrow bandwidth and low gain.
short-range, high-speed wireless links (wlan, wi-fi, bluetooth) require wide bandwidth, whereas long-range point-to-point communication in the c-band and x-band also needs high gain. the narrow bandwidth problem can be solved to some extent by a monopole antenna. the bandwidth enhancement of monopole antennas has been reported in [2]-[9]. a wideband l-shaped printed monopole antenna with an impedance bandwidth over 4.7 ghz has been reported in [2].
received december 23, 2020; received in revised form march 27, 2021
corresponding author: biplab bag, department of engineering and technological studies, kalyani university, west bengal, india, e-mail: bbagateie@gmail.com
a dual wideband monopole antenna with split ring resonators, having impedance bandwidths of 280 mhz and 3100 mhz, is presented in [3]. bandwidth enhancement of a planar monopole microstrip patch antenna with a curved slot, giving 109% fractional bandwidth, is demonstrated in [4]. a novel cpw-fed wideband printed monopole antenna with dgs, which exhibits an impedance bandwidth of 3.18 ghz and a peak gain of 4.5 dbi, is developed in [5]. in [6], adding small fractal elements to the polygon-shaped radiator of a cpw-fed monopole antenna leads to coverage of the 3.1-10.6 ghz band. a dual wideband monopole antenna for umts, wlan, and wimax applications is presented in [7]. an inverted question mark broadband high gain patch antenna has been designed in [8], which exhibits -10 db impedance bandwidths of 250 mhz and 8.07 ghz. cpw-fed dual band and wideband antennas using crescent-shaped and t-shaped stubs for wi-fi and wimax applications are reported in [9]. besides these, a few researchers have developed designs [10]-[14] in which the gain of the monopole antenna is increased. an arrow-shaped broadband high gain monopole antenna with a percentage bandwidth of 133% and a peak gain of 5.2 dbi is proposed in [10].
a high gain ultra-wideband monopole antenna that exhibits 154% impedance bandwidth with a maximum gain of 8 dbi is demonstrated in [11]. a high gain dual-band antenna with maximum gains of 6.2 dbi and 10.4 dbi is reported in [12]. the disadvantage of this antenna is its larger size (109.03 mm x 77.88 mm). an acs-fed dual band antenna with a truncated ground plane for 2.4/5 ghz wlan application, with a peak gain of 4.85 dbi, is proposed in [13]. a planar stair-like uwb monopole antenna with enhanced boresight gain has been presented in [14]. there are many other promising techniques [15]-[17] proposed for enhancing the gain of mpas. however, the structures used in the aforementioned gain enhancement techniques are bulky and complex, which makes them unsuitable for portable devices.
in this paper, a wideband microstrip antenna with high gain is presented for wireless communication systems. the configuration of the proposed antenna is very simple. it consists of a circular patch excited through a microstrip feed line, which acts as the radiating element on the top of the substrate, and a rectangular ring loaded on the ground plane, which enhances the impedance bandwidth as well as the gain. the proposed antenna has two wide bands with resonant frequencies at 3.36 ghz, 6.88 ghz, and 8.76 ghz. the measured -10 db impedance bandwidth of the 1st band is 1800 mhz (from 2.0-3.8 ghz) and that of the 2nd band is 4120 mhz (from 5.48-9.6 ghz). the peak gains at the specified resonant frequencies are 4.5 dbi, 5.5 dbi and 8.25 dbi. the lower and upper bands may cover all the commercial bands (such as 4g lte, wi-fi, bluetooth, wimax, c-band, and x-band). the detailed evolution of the proposed antenna structure is given in the following sections.
2. antenna structure
the structural configuration of the proposed antenna is very simple and is illustrated in fig.
the radiator consists of a circular patch and a 50 Ω microstrip feedline (wf x lf) on the top of the substrate. the low cost fr4-epoxy substrate (εr = 4.4, tan δ = 0.02, thickness = 1.6 mm) is used for construction of the proposed antenna. the overall dimension is 50 x 40 x 1.6 mm3. the values of the antenna parameters are optimized with the help of the hfss v.15 electromagnetic simulator. the evolution of the proposed monopole antenna is done in two steps, shown in fig. 2. instead of a partial ground plane on one side of the substrate, an almost rectangular ring (or square slot) is loaded at the ground plane, which enhances the gain and reshapes the impedance characteristics of the antenna. the conventional circular patch antenna (without ring on ground plane) produces a dominant tm11 mode at 4.64 ghz. the radius is taken as 9 mm (r) [wf1 x lf1 = 1.8 x 12 mm²], and the resonant frequency is estimated by the theoretical equations [1]:
$$f_r = \frac{1.8412\,c}{2\pi r_e \sqrt{\varepsilon_r}} \qquad (1)$$
$$r_e = r\sqrt{1 + \frac{2h}{\pi r \varepsilon_r}\left[\ln\left(\frac{\pi r}{2h}\right) + 1.7726\right]} \qquad (2)$$
where r = physical radius (mm), re = effective radius (mm), c = speed of light in free space (m/s), fr = resonant frequency (ghz), εr = relative permittivity and h = thickness of the substrate. the theoretical resonant frequency is similar to the simulated resonant frequency obtained by the hfss simulator. however, the desired characteristics could not be achieved from this configuration. therefore, the ground plane has been modified by adding strips of length l1, zg, and wg, which turn the ground plane into a rectangular ring. the lengths l1 (or l2), zg and lg excite resonances at ~6.95 ghz and ~9.65 ghz, which are determined by the following equations [1]:
$$f_{\mathrm{TM}_{mn}} = \frac{c}{2\sqrt{\varepsilon_{\mathrm{eff}}}}\sqrt{\left(\frac{m}{L_e}\right)^2 + \left(\frac{n}{W_e}\right)^2} \qquad (3)$$
$$\varepsilon_{\mathrm{eff}} = \frac{\varepsilon_r + 1}{2} \qquad (4)$$
where ftmmn = resonance frequencies of the different resonant tmmn modes and εeff = effective permittivity.
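equations (1)-(4) are easy to evaluate numerically. the following python sketch is only an illustration: the function names are hypothetical, lengths are taken in metres rather than millimetres, and the n = 0 case of equation (3) is handled by simply dropping the width term, which is an assumption of this sketch.

```python
import math

C0 = 299_792_458.0  # speed of light in free space, m/s


def effective_radius(r, h, eps_r):
    """Equation (2): effective radius of a circular patch (same length unit as r, h)."""
    fringe = (2.0 * h / (math.pi * r * eps_r)) * (math.log(math.pi * r / (2.0 * h)) + 1.7726)
    return r * math.sqrt(1.0 + fringe)


def circular_patch_fr(r, h, eps_r):
    """Equation (1): dominant-mode resonant frequency in Hz (r, h in metres)."""
    r_e = effective_radius(r, h, eps_r)
    return 1.8412 * C0 / (2.0 * math.pi * r_e * math.sqrt(eps_r))


def rect_mode_fr(m, n, l_e, w_e, eps_r):
    """Equations (3)-(4): TMmn resonant frequency in Hz (l_e, w_e in metres)."""
    eps_eff = (eps_r + 1.0) / 2.0
    width_term = (n / w_e) ** 2 if n != 0 else 0.0  # assumption: skip width term for n = 0
    return C0 / (2.0 * math.sqrt(eps_eff)) * math.sqrt((m / l_e) ** 2 + width_term)


if __name__ == "__main__":
    # Circular patch from the text: r = 9 mm on 1.6 mm fr4 (eps_r = 4.4).
    f_patch = circular_patch_fr(r=9e-3, h=1.6e-3, eps_r=4.4)
    print(f"circular patch: {f_patch / 1e9:.2f} GHz")
```

for the quoted patch dimensions this evaluates to a value somewhat below the simulated 4.64 ghz, which is the level of agreement one would expect from a closed-form estimate of this kind.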
at the second resonance frequency (~6.95 ghz), the tm2.50 mode is present, as two half-wavelength variations in the field appear along the length le and zg of the ground plane. therefore, le becomes (l1+zg), and we is equal to zero for the tm2.50 mode. at the third resonance frequency (~9.65 ghz), the tm34 mode is present, as three half-wavelength variations occur along the length (l1+zg) and four half-wavelength variations in the field occur along the width wg of the ground surface. therefore, le becomes (l1+zg), and we becomes wg for the tm34 mode. the reflection coefficient and gain of the two antenna structures are compared in fig. 3 and fig. 4, respectively. fig. 4 shows clearly that the proposed antenna has better gain characteristics. the gain has been improved by introducing a rectangular ring on the ground plane. this improvement may be due to the increase in directivity compared to the antenna without the rectangular ring on the ground plane. in fig. 3, in contrast, the reflection coefficient is only slightly changed (in terms of resonant frequency and impedance bandwidth). the final dimensions of the antenna were fixed after a large number of simulations. the parameters are as follows (in millimetre): r = 7, wf = 1.6, lf = 12, wf1 = 1.8, lf1 = 12, s = 4, p = 4, zg = 10, wg = 40, lg = 11, g = 10.5, r = 9, l1 = 29, l2 = 29, d = 1, t = 1.5, h = 1.6.
fig. 1 geometry of the proposed antenna (a) top view (b) side view
fig. 2 evolution of the proposed antenna (a) without ring on ground plane (b) with rectangular ring on ground plane
fig. 3 simulated reflection coefficient of proposed antenna with and without ring on ground plane
fig. 4 simulated (a) directivity and (b) peak gains of proposed antenna with and without rectangular ring
3.
simulated and measured results
the antenna prototype, shown in fig. 5, has been fabricated using the optimal dimensions. after fabrication, the |s11| of the proposed antenna was measured with a rohde and schwarz znb 20 vector network analyzer. fig. 6 and table 1 show that the measured |s11| agrees with the simulated result with a small discrepancy. this may be due to fabrication tolerance, dielectric inconsistency, and/or the low cost sma connector. the measured |s11| shows that the -10 db impedance bandwidth of the lower band is 1800 mhz and that of the upper band is 4120 mhz. the proposed antenna can therefore cover the 4g lte, bluetooth, wi-fi, wimax, and c-band bands.
fig. 5 photograph of the proposed antenna (a) top view (b) bottom view
fig. 6 simulated and measured s11(db) versus frequency (ghz)
table 1 comparison of simulated and measured s11(db)
result | resonant frequencies (ghz) | impedance bandwidth (mhz)
simulated | 3.73, 6.98, 9.64 | 2160, 4130
measured | 3.36, 6.88, 8.76 | 1800, 4120
the far-field radiation patterns in the e-plane and h-plane for 2.36 ghz, 3.5 ghz, and 6.9 ghz are shown in fig. 7, fig. 8 and fig. 9, respectively. the proposed antenna exhibits a bi-directional e-plane pattern and an omnidirectional h-plane pattern, which is desirable for monopole antenna radiation. the cross-polar values are small in the broadside direction because the configuration is symmetrical with respect to the feedline: they are below -20 db at 2.36 ghz and below -30 db at 3.5 ghz and 6.9 ghz. the simulated 2d and 3d radiation patterns at 9.65 ghz are illustrated in fig. 10. the 3 db beam-width is about 35˚ at 9.65 ghz.
fig.
7 measured and simulated far-field radiation patterns at 2.36 ghz of (a) e-plane (b) h-plane
fig. 8 measured and simulated radiation patterns at 3.5 ghz of (a) e-plane (b) h-plane
fig. 9 measured and simulated far-field radiation patterns at 6.9 ghz of (a) e-plane (b) h-plane
fig. 10 simulated far-field radiation patterns at 9.65 ghz of (a) e-plane (b) h-plane (c) 3d pattern
the measured and simulated gains are compared in fig. 11 over the operating bands. the measured peak gain reaches a maximum of 8.25 dbi at 8.76 ghz. the peak gains at the other frequencies are 4.5 dbi and 5.5 dbi, at 3.36 ghz and 6.9 ghz, respectively.
fig. 11 simulated and measured realized gains of proposed antenna
the proposed monopole antenna is compared with some other monopole antennas in table 2. the antennas in [4], [8], [10] and [11] have larger impedance bandwidth but lower peak gain than the proposed antenna.
table 2 comparison of various monopole antennas with proposed antenna
ref. no. | size (mm³) | operating bands (ghz) | impedance bandwidth (mhz) | peak gain (dbi)
[3] | 25x40x0.635 | 2.4/5.2/5.8 | 280, 3100 | -2, 4
[4] | 50x55x1.5 | 2.4/3.5/5.2/5.8 | 4800 | 4.8
[7] | 42x51x1.6 | 2.4/3.5/5.2/5.8 | 2120, 1540 | 5.03, 1.94
[8] | 30x20x1.6 | 2.4/5.2/5.8 | 200, 8070 | 5.5
[10] | 40x43x1.6 | 2.4/3.5/5.2/5.8 | 6500 | 5.2
[11] | 48x32x1.6 | 3/5/11 | 1.82–14.07 | 8
[13] | 25x17.5x1.6 | 2.4/5.2/5.8 | 250, 2590 | 4.85
[18] | 45x40x1.6 | 2.4/2.5/5.2/5.8 | 5500 | 6.8
[19] | 30x30x1.56 | 2.4/2.5 | 180, 1080 | 1.49, 3.23
[20] | 60x60x1.6 | 3.6/5.8/8.5 | 309, 270, 310 | 6.4, 7.52, 7.32
[21] | 30x23x1.575 | 5.8/10 | 160, 680 | 4.11, 7.15
[22] | 21x17x1.6 | 2.74/6.34 | 20, 79 | 4.43, 5.37
[23] | 48x25x1.6 | 5.8 | 500 | 6.85
proposed antenna | 50x40x1.6 | 2.35/2.4/5.8/6.9 | 1800, 4120 | 8.25
4.
conclusion
in this paper, a dual wideband, high gain microstrip monopole antenna has been proposed for wireless systems. the antenna is moderate in size and simple in configuration. the measured results show that the proposed antenna has two wide frequency bands, from 2 ghz to 3.76 ghz (1800 mhz) and from 5.48 ghz to 9.6 ghz (4120 mhz), with a high peak gain of about 8.25 dbi achieved at 8.76 ghz. the peak gain of the proposed antenna has been enhanced by modifying the ground surface (introducing a rectangular ring or square slot) without significantly degrading the radiation characteristics (standard monopole type). the measured bandwidth shows that the antenna can cover commercial bands such as 4g lte band-40 (india), bluetooth, wi-fi, wlan, wimax, c-band, and x-band.
references
[1] g. kumar and k. p. ray, broadband microstrip antenna. artech house, 2003, chapters 1-2, pp. 1–52.
[2] k. p. ray, s. s. thakur and r. a. deshmukh, "wideband l-shaped printed monopole antenna", aeu int. j. electron. commun., vol. 66, pp. 693–696, 2012.
[3] s. c. basaran, and k. sertel, "dual wideband cpw-fed monopole antenna with split-ring resonators", microw. opt. technol. lett., vol. 55, pp. 2088–2092, 2013.
[4] s. baudha, and d. k. vishwakarma, "bandwidth enhancement of a planar monopole microstrip patch antenna", int. j. microw. wirel. technol., vol. 8, pp. 237–242, 2016.
[5] a. singh, and s. singh, "a novel cpw-fed wideband printed monopole antenna with dgs", aeue int. j. electron. commun., vol. 69, pp. 299–306, 2015.
[6] h. fallahi, and z. atlasbaf, "bandwidth enhancement of a cpw-fed monopole antenna with small fractal elements", aeu int. j. electron. commun., vol. 69, pp. 590–595, 2015.
[7] d. ustun, and a. akdagli, "signs, curls, and time variations: learning to appreciate faraday’s law", int. j. microw. wirel. technol., vol. 9, pp. 1197–1208, 2017.
[8] k. mondal, a. shaw, and p. p.
sarkar, "inverted question mark broadband high gain microstrip patch antenna for ism band 5.8 ghz/wlan/wifi/x-band applications", microw. opt. technol. lett., vol. 59, pp. 866–869, 2017. [9] r. p. dwivedi, and u. k. kommuri, "cpw feed dual band and wideband antennas using crescent shape and t-shape stub for wi-fi and wimax application", microw. opt. technol. lett., vol. 59, pp. 2586– 2591, 2017. [10] k. mondal, and p. p. sarkar, "arrow shaped broadband high gain monopole antenna for wireless communications", wirel. pers. commun., vol. 95, pp. 1019–1030, 2017. [11] s. de, and p. p. sarkar, "a high gain ultra-wideband monopole antenna", aeu int. j. electron. commun., vol. 69, pp. 1113–1117, 2015. [12] x. he, s. hong, h. xiong, q. zhang, and e. m. m. tentzeris, "design of a novel high-gain dual-band antenna for wlan applications", ieee antennas wirel. propag. lett., vol. 8, pp. 798–801, 2009. [13] k. a. ansal, and t. shanmuganantham, "a novel cb acs-fed dual band antenna with truncated ground plane for 2.4/5 ghz wlan application", aeu int. j. electron. commun., vol. 69, pp. 1506–1513, 2015. [14] c. thajudeen, w. zhang, and a. hoorfar, "boresight gain enhancement of planar stair-like uwb monopole antenna", microw. opt. technol. lett., vol. 56, pp. 2809–2812, 2014. [15] h. zhai, q. gao, c. liang, r. yu, and s. liu, "a dual-band high-gain base-station antenna for wlan and wimax applications", ieee antennas wirel. propag. lett., vol. 13, pp. 876–879, 2014. [16] d. gangwar, s, das, and r. l. yadava, "gain enhancement of microstrip patch antenna loaded with split ring resonator based relative permeability near zero as superstrate", wirel. pers. commun., vol. 96, pp. 2389–2399, 2017. [17] s. liu, w. wu, and d. g. fang, "wideband monopole-like radiation pattern circular patch antenna with high gain and low cross-polarization", ieee trans. antennas propag., vol. 64, pp. 2042–2045, 2016. [18] a. desai, t. k. upadhyaya, r. h. patel, s. bhatt, and p. 
mankodi, "wideband high gain fractal antenna for wireless applications", progress in electromagnetics research, vol. 74, pp. 125–130, 2018. [19] a. gupta, a. patro, a. negi, and a. kapoor, "a compact dual-band metamaterial inspired antenna with virtual ground plane for wimax and satellite applications", progress in electromagnetics research, vol. 81, pp. 29–37, 2019. [20] a. ghosh, t. mandal, and s. das, "design of triple band slot-patch antenna with improved gain using triple band artificial magnetic conductor", radioengineering, vol. 25, pp, 442–448, 2016. [21] i̇. ataş, t. abbasov, and m. b. kurt, "development of a high gain, dual-band and two-layer miniaturized microstrip antenna for 5.8 ghz ism and 10 ghz x-band applications", applied computational electromagnetics society journal, vol. 34, pp. 1568–1575, 2019. [22] s. b. behera, d. barad, and s. behera, "a small form factor impedance tuned microstrip antenna with improved gain response", progress in electromagnetics research, vol. 95, pp. 13-23, 2020. [23] s. maddio, g. pelosi, m. righini, and s. selleri, "a slotted patch antenna with enhanced gain pattern for automotive applications", progress in electromagnetics research, vol. 95, pp. 135-141, 2021. instruction facta universitatis series: electronics and energetics vol. 28, n o 1, march 2015, pp. 85 102 doi: 10.2298/fuee1501085к rpll rendezvous protocol for long-living sensor node  mirko r. kosanović 1 , mile k. stojčev 2 1 college of applied technical sciences, niš, 2 university of niš, faculty of electronic engineering, niš, serbia abstract. sensor nodes (sns), as constituents of wireless sensor network (wsn), are battery-powered not rechargeable devices and have limited amount of energy available. since the lifetime of sns is a crucial parameter for energy-efficient wsn design, it is essential to extend their lifetimes as much as possible. here we propose a rendezvous scheme called rendezvous protocol for long-living sn, rpll. 
this scheme is based on the implementation of a duty-cycling technique. each sn within the wsn is allocated a unique identification number (id), thanks to which the collision problem is effectively remedied. the rpll provides on-time wake-up of sns in a fully decentralized way and fast detection of newly appended sns. taking into account the wsn and sn working parameters, such as beacon time, beacon period, number of active sns, and quartz oscillator instability, a wsn designer can use the proposed method to determine the maximal lifetime of a sn, i.e. to achieve optimal energy consumption.

key words: rendezvous protocol; duty cycling; energy efficiency; wireless sensor networks

received june 3, 2014; received in revised form august 25, 2014
corresponding author: mirko r. kosanović, college of applied technical sciences, aleksandra medvedeva 20, 18 000 niš, serbia (e-mail: mirko.kosanovic@open.telekom.rs)

1. introduction

wireless sensor networks (wsns) consist of a large number of cooperating, radio-equipped and battery-powered sensor nodes (sns). the battery is the main power source in a sn. since a sn operates during its lifetime with limited battery capacity (a single battery charge), energy consumption becomes a critical issue. for example, an off-the-shelf sn works for only a few days if all its parts, including the transceiver and microcontroller, are permanently powered on. a sn is assumed to be dead when its battery is exhausted. the challenge is to guarantee a lifetime of several years [1]. the four main activities during which a sn consumes energy are sensing, communication, computation, and storage [2]. communication accounts for the greatest portion of the energy consumed by any sn [1].

communication between any two sns is possible only if both of them are powered on simultaneously. a rendezvous scheme is commonly used to arrange this simultaneous on-time. several ideas for rendezvous schemes in low duty cycle wsns have been proposed [3]. they usually function at the media access control (mac) layer and can be categorized into three general classes [4, 5]: (i) asynchronous – the sender sn tries to capture the unknown active time of the receiver sn; (ii) synchronous – sns are synchronized in time and agree on specific communication time slots; and (iii) pseudo-asynchronous – sns establish rendezvous on demand by using periodic wake-up. see ref. [4] for more details on this issue.

we consider a pseudo-asynchronous scheme because our application is primarily intended for the observation of rare events, which is typical for environment monitoring. in this scheme sns are powered on and off periodically, and a beaconing approach is used to express the desire or willingness to communicate. the concept of periodically powering an sn on and off, also called duty-cycling, has to satisfy performance requirements related to throughput, guaranteed end-to-end delay and a lifetime of several years, which are often contradictory design parameters.

in general, duty cycling is the most widely used mechanism for saving energy in wsns, i.e. for extending the wsn lifetime. the idea behind it is simple: keep all or part of the hardware in a low power sleep state except when the hardware has to be operative. in this way, depending on the network activity, the sn switches its mode of operation between active and sleep. duty-cycling protocols based on a rendezvous scheme not only arrange for sns to communicate, but also inherently include the ability to plan the channel access time, to avoid and resolve collisions, and some time-synchronization mechanisms, which are required to determine locally the beginning of the active and sleep states [4, 6, 7].

here we propose the usage of rendezvous scheme called rendezvous protocol for long-living sn, rpll.
in essence we continue our work [8], and present a complete rpll protocol. the rpll uses pseudo-asynchronous scheme and is based on the implementation of a duty cycling mechanism. its principle of operation is similar to distributed low duty cycle (dldc-mac) protocol [9], from which we have adopted all advantages that it offers (synchronized wake-up times of sns, hidden terminal, link failures, and asymmetric links elimination, etc.). with the aim to remedy the notified disadvantages of dldc-mac protocol, which primarily relate to collision avoidance during the registration of new sns, as well as fast and correct selection of beacon time and beacon period, we propose its modification. the modification deals with involving a unique id for each sn within wsn which provides us with integration of two activities (related to neighbor discovering and neighbor registration) into a single one (neighbor discovery and registration). this modification makes it possible to effectively overcome the aforementioned disadvantages. our primary interest in this investigation is to determine how the working parameters of this protocol (beacon time, beacon period, number of communication active sns, and quartz oscillator instability) have impact on power consumption, and to understand how to achieve a proper tradeoff between performance and power. to this end, the performances of rpll protocol that relate to the number of active sns, duration of a beacon period, and power consumption of sn, are estimated. for the defined and selected working parameters, by using the appropriate method, the designer can minimize the power consumption of sn. rpll rendezvous protocol for long-living sensor nod 87 2. 
power management protocols and energy waste in wsns

depending on the layer of the network architecture at which they are implemented, power management protocols can be divided into the following two groups [10]:
a) independent sleep/wakeup protocols – these execute on top of a media access control (mac) protocol. since they run at the network or application layer, they can be used with any mac protocol and adapt well to different application needs.
b) strictly integrated with the mac protocol – these protocols permit optimization of media access functions, but as specific solutions they are not universal.

in terms of the approach used to determine when sns should be switched on, the sleep/wakeup protocols can be divided into the following three categories [5]:
1. on-demand – a sn should wake up when another sn wants to communicate with it.
2. scheduled-rendezvous – each sn should wake up at the same time as its neighbors.
3. asynchronous protocols – a sn should wake up when it wants and still be able to communicate with its neighbors.

depending on the way the source and destination sn achieve rendezvous, three categories of rendezvous schemes exist [11]:
1. synchronous scheme – all sns agree on the same clock time, wake up synchronously and rendezvous with one another.
2. asynchronous scheme – a source sn actively wakes up destination sns.
3. pseudo-asynchronous scheme – a source sn wakes up first and waits for destination sns to wake up and rendezvous.

as previously discussed, a sn consumes energy for sensing, data processing and communication. communication consumes the largest amount of energy. in a typical sn, the major waste of energy during communication occurs for the following reasons [12]:
a) idle listening – an sn senses the carrier on an idle channel in anticipation of the possible arrival of packets, which wastes power.
b) collision – when a large number of sns is present in a small area, collision is a common occurrence if it is not effectively controlled. a collided packet is discarded; an sn usually retransmits the packet, but packet retransmission causes a further waste of energy.
c) overhearing – when the neighbors of an sn are transmitting packets, the sn still receives a packet even though it is not addressed to it, which is yet another source of power waste.
d) over-emitting – a sn sends packets to another sn but the receiver sn is not ready, so the packet has to be sent again, which also wastes power.
e) control packet overhead – the presence of extra control packets in the wsn, such as request to send (rts), clear to send (cts), acknowledgment (ack), beacons, packets in csma-based protocols, and other transmitted and received control packets, costs power as well.

3. distributed low duty cycle rendezvous protocol

first, we briefly analyze the dldc-mac protocol and point out its principle of operation, advantages and drawbacks. the basic idea of this solution is described in [9]. for the sake of simplicity, and without loss of generality, the following discussion explains dldc-mac operation for a wsn that consists of three sns: sn1, sn2 and sn3. figure 1 presents a scenario of wsn activities under the dldc-mac protocol. as can be seen from fig. 1, the following four activities (phases) exist:
1. current state (cu_st) – each registered sn (in our case, sn1 and sn2) periodically sends a short message called a beacon and wakes up to receive beacons from its neighbors. the beacon period is the same for all sns. after receiving a neighbor's beacon the sn estimates the time of the next beacon, taking into account the beacon period, the reception time, and the guard period. since the sn performs this estimation for each of its neighbors, it knows the beacon times of all of them.
thanks to this, the sn spends most of the time in a sleep state, and wakes-up only to receive neighbor`s beacons and to send its own beacon. upon transmitting a beacon, each sending sn enters shortly into a receive mode during which it can accept data and commands piggybacked in the beacon. in this way each sn permanently takes an active part in communication and knows with how many sns it communicates. let us note that every phase (cu_st, ne_di, ne_re and up_sc) is divided into several time slots denoted as tbcy (see fig. 1). due to involving a guard time, used for compensation of sn`s quartz instability, each time slots is slightly longer than the time needed to send and to receive a beacon from its neighbor. 2. neighbor discovering (ne_di) – after powering-on (see fig. 1.) the sn3 enters into a phase ne_di and listens for a whole beacon period. during this period sn3 receives neighbors` beacons from sn1 and sn2, and stores their reception times. 3. neighbor registration (ne_re) – after recognizing sn1 and sn2, the node sn3 sends network join advertisement messages during the time periods when sn1 and sn2 are in the receive mode, respectively. the advertisement messages contain information which relate to the beacon time of the node sn3. 4. update scheduler (up_sc) – this phase is identical to cu_st phase with one exception, the joined sn3 becomes active now, since sn1 and sn2 accept the beacons from sn3. the main advantages of dldc-mac protocol are: i) even in the presence of very unreliable links, it can successfully synchronize the wake-up times of sns in a fully decentralized way. ii) problems accompanied with clock drifts, link failures, temporarily asymmetric links, and a hidden terminal, are effectively overcome. iii) by using off-the-shelf sns, long lifetime is possible to achieve. 
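the wake-up estimate described for the cu_st phase can be sketched as follows. this is a minimal illustration, not the dldc-mac implementation of [9]: the names `NeighborSchedule` and `guard_for_drift`, the rule "wake up one guard period before the expected beacon", and the sizing of the guard as twice the oscillator instability are all assumptions introduced here.

```python
from dataclasses import dataclass


def guard_for_drift(beacon_period_s: float, instability_ppm: float) -> float:
    """Guard time covering the worst-case clock drift over one beacon period.

    Assumption: sender and receiver clocks may drift in opposite directions,
    so the relative error is taken as twice the quartz instability.
    """
    return 2.0 * instability_ppm * 1e-6 * beacon_period_s


@dataclass
class NeighborSchedule:
    """Per-neighbor bookkeeping kept by a registered node (hypothetical)."""

    last_rx_s: float        # local time at which the last beacon was received
    beacon_period_s: float  # beacon period, identical for all nodes
    guard_s: float          # guard period used to absorb oscillator drift

    def next_wakeup_s(self) -> float:
        # Expected arrival of the next beacon, minus the guard period.
        return self.last_rx_s + self.beacon_period_s - self.guard_s

    def on_beacon(self, rx_time_s: float) -> None:
        # Every received beacon re-anchors the estimate, so drift is
        # bounded per period instead of accumulating.
        self.last_rx_s = rx_time_s


guard = guard_for_drift(beacon_period_s=60.0, instability_ppm=50.0)
sn1 = NeighborSchedule(last_rx_s=100.0, beacon_period_s=60.0, guard_s=guard)
wakeup = sn1.next_wakeup_s()
```

a node would keep one such record per neighbor and sleep until the earliest `next_wakeup_s()` among them or its own beacon time, whichever comes first.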
fig. 1 dldc-mac rendezvous scheme (timing diagram for sn1, sn2 and sn3 showing send and receive activity in the cu_st, ne_di, ne_re and up_sc phases, with beacon period, beacon time, sleep period and tbcy marked)

the dldc-mac protocol has the following drawbacks:
a) determination of an accurate beacon time during which the newly joined sn tries to announce its presence in the wsn, i.e. the selection of a proper beacon time for the newly joined sn.
b) when the time difference between any two beacons is smaller than tbcy (see fig. 1), one of the affected sns must choose a new beacon time. since the beacon time is not pre-defined, a problem arises when two or more new sns choose the same beacon time, or the beacon time overlaps with the beacon period of other neighboring sns during the new attempt period.
c) collision appears when two or more sns are simultaneously registered during the ne_re phase.

4. rendezvous protocol for long-living sensor node

in rpll we assume that each sn is uniquely identified by its identification number, idx, x=1, 2 … n, where n corresponds to the total number of sns within the wsn. the use of idx allows us to define accurately a unique time slot for each sn, within the beacon period, during which the corresponding sn can send data. let us note that the unique id is assigned during the sn's software initialization, i.e. when the system and application software are installed. the scenario of events for rpll is given in fig. 2. the following three phases exist:
1. current state (cu_st) – identical to the one defined in dldc-mac.
2. neighbor discovering and registration (ne_di_re) – since each sn has a unique idx, it knows in advance the position of its time slot within the beacon period. each newly joined sn enters the receive mode during the ne_di_re phase. during a time slot t3 (t1), see fig.
2, the node sn2 accepts the beacon from sn3 (sn1) and notifies sn3 (sn1) about its presence in wsn, while during the time slot t2 it announces its presence to the wsn (in this way sn3 announces its presence to others sns (for example sn4, sn5, …) as possible candidates that can attempt to join wsn during the same beacon period). this opportunity allows us to reduce a registration latency of the new joined sn (in fig. 2 it is sn2) to a single beacon period. 3. update scheduler (up_sc) – identical to that one defined in the dldc-mac protocol. 90 m. r. kosanović, m. k. stojĉev a procedure which deals with the joining of new sn to the wsn is presented in figure 2 (in our case a new sn is sn2). after powering-on, at the instant t1, for a single beacon period tbp, sn2 switches into the receive mode (ne_di_re phase) and accepts beacons from its neighbors. during this period each registered sn (sn1 and sn3) in a predefined time slot, determined by the sn`s id number, sends a beacon. since sn2 accepts a beacon from its neighbor sn3 (sn1) it shortly switches to send a mode and announces a presence by sending its id2 to sn3 (sn1). in this moment sn3 (sn1) is in receive mode waiting to receive information from the new joined sn (in our case sn2). according to the received id2 the node sn3 (sn1), in the next beacon period (during the phase up_sc) precisely determines a time slot for listening sn2. in this manner, possible collision during registration of a new single joined sn (i.e. sn2) is effectively avoided. however, let us note that a collision problem can appear when two or more sns simultaneously join the wsn after powering-on during the same beacon period (for more details how to bypass this problem see subsection 4.1). as can be seen from fig. 2 the rpll protocol differs in respect to dldc_mac (see fig. 1) in that it merges ne_di and ne_re phases into a single phase called now ne_di_re. the other two phases cu_st and up_sc remain identical for both protocols. 
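the id-based slot assignment described above can be illustrated with a short sketch. the text only states that the slot position is fixed by the id and that the beacon period starts with the beacon of the sn with id=1; the concrete mapping used here, slot start = (id - 1) · tbc with equal-length slots, and the function names are assumptions made for illustration.

```python
def slot_offset_s(node_id: int, slot_s: float) -> float:
    """Start of a node's beacon slot, measured from the start of the beacon period.

    Assumed mapping: slots have equal length and are ordered by ID,
    so the node with ID 1 transmits at offset 0.
    """
    if node_id < 1:
        raise ValueError("node IDs start at 1")
    return (node_id - 1) * slot_s


def min_beacon_period_s(n_nodes: int, slot_s: float) -> float:
    """Shortest beacon period that gives each of n_nodes its own slot."""
    return n_nodes * slot_s


def listen_schedule(own_id: int, neighbor_ids: list[int], slot_s: float) -> list[tuple[float, int]]:
    """(offset, id) pairs at which a node must be awake to hear its neighbors."""
    return sorted((slot_offset_s(i, slot_s), i) for i in neighbor_ids if i != own_id)


# Example: three nodes with hypothetical 0.5 s slots, seen from the node with ID 2.
schedule = listen_schedule(own_id=2, neighbor_ids=[1, 2, 3], slot_s=0.5)
```

because every node can compute these offsets locally from the ids alone, a newly joined node never has to negotiate a beacon time, which is the property the single ne_di_re phase relies on.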
by merging the two phases, the latency of the transition from phase ne_di to phase cu_st (see fig. 1) decreases by one beacon period. during the cu_st (up_sc) phase a beacon period always starts with the beacon sent by the sn with id=1 (instant t2 in fig. 2). thanks to this, it is easy to synchronize all sns in the wsn at the global level with respect to the start and the end of each protocol phase.

some important observations concerning the implementation of the rpll protocol are the following:
1. it is preferable to use the rpll protocol in a wsn with a moderate number of sns (n < 255). in this case the id number is 8 bits wide.
2. the designer assigns a unique id number to each sn before it joins the wsn.
3. the time slot position of each beacon time, within a beacon period, is fixed and is directly determined by the sn's id number.
4. the duration of a beacon period is known in advance to each sn, i.e. it is defined during programming of the sn.
5. the minimal duration of the beacon period directly depends on the total number of sns, n, within the wsn.

fig. 2 initialization phase without collisions (timing diagram for sn1, sn2 and sn3 showing send and receive activity in slots t1, t2 and t3: ne_di_re valid for sn2 after powering on, cu_st valid for sn1 and sn3, followed by up_sc valid for sn1, sn2 and sn3; beacon period tbp marked)

4.1. collision avoidance

collision appears when, during a single beacon period, several sns are switched on simultaneously and try to transmit a packet at the same time. the following difficulty then appears in the dldc-mac protocol: sns that are in collision cannot accurately determine their beacon times, and cannot accurately define the duration of the total beacon period. without loss of generality, fig. 3 shows how the collision problem can be solved in the rpll protocol. namely, we assume that two sensor nodes sn2 and sn4 are powered-on during the same beacon period (i.e.
time overlapping between phases ne_di_re of sn2 and sn4 exists). during this we assume that id1 < [...] (> 99.9 %) each sn spends in the cu_st or up_sc phase. during a single beacon period, tbp, there are at most n-1 receive beacon cycles, trbc, and one transmit-receive beacon cycle, ttrbc.

for a given tbp and defined duty cycle, dc, we will determine the maximal number of neighboring sns with which an active sn can communicate. we assume that the wsn consists of at most n sns. during the cu_st phase two different sn activities exist. the first one, with duration trbc, is called the receive beacon cycle. during this cycle the sn receives a beacon from a neighboring sn. from fig. 4a we have trbc = trb + toff. during toff the sn is in sleep mode, while during trb it receives a beacon. the second activity, with duration ttrbc, is called the transmit-receive beacon cycle. during this cycle the sn transmits its beacon and receives an acknowledgment from a newly appended neighboring sn. from fig. 4b we have ttrbc = ttb + trb + toff, where ttb is the time needed to acquire data from the sensor element and to transmit the beacon. in our case we assume that trbc = ttrbc = tbc, where tbc is the duration of the time slot defined in fig. 2.

for an arbitrary communication active (com_active) sensor node snk, k ∈ {1,..., n}, we define its duty cycle, dck (see fig. 3 and fig. 4), as:

\[ dc_k = \frac{t_{onk}}{t_{bp}} = \frac{t_{tbk} + t_{rbk} + \sum_{i=1,\, i \neq k}^{n} a_i\, t_{rbi}}{n\, t_{bc}} \qquad (1) \]

where ai ∈ {0,1}, i=1,...,n, i ≠ k, indicates whether node snk can (ai=1) or cannot (ai=0) communicate with its neighboring node sni, and tonk is the total active time of snk during a single beacon period tbp. within a wsn all active sns are called com_active, while all sns that directly communicate (point-to-point communication) are referred to as com_visible. from eq. (1) we have

\[ t_{onk} = t_{tbk} + t_{rbk} + \sum_{i=1,\, i \neq k}^{n} a_i\, t_{rbi} \qquad (2) \]

let us assume now that during a cu_st phase the number of com_active neighboring sns is p, and that p < [...] > 0, the following condition, related to eq. (8), has to be fulfilled:

\[ dc_k - 2 s_x (p + 2) > 0 \qquad (9) \]

i.e.:

\[ p < \frac{dc_k}{2 s_x} - 2 \qquad (10) \]

according to eq. (10) the maximal number of com_active sns within a single wsn region can now be derived. in this analysis, sx is taken as a parameter. in real wsn applications, the designer must decide what kind of oscillator unit, in terms of quartz instability expressed in ppm, to build into the sn. usually, quartz units with a factory-declared frequency instability from 10 ppm up to 50 ppm are built into sns. due to process, voltage, and temperature (pvt) variations, as well as the influence of other ambient conditions (pressure, humidity, etc.), some additional frequency deviation with respect to the factory-declared instability inevitably appears. we consider the case when the impact of all additional frequency deviations is within ±10 %. as a direct consequence of this deviation, the number of com_active sns differs from the number obtained with the nominal quartz frequency instability. in the worst case, for a frequency deviation of +10 % and a fixed dc factor, the wsn is feasible with the smallest number of com_active sns. according to eq. (10), for a given dc factor (from 0.1 up to 1 %), specified frequency instability (from 10 up to 50 ppm) and frequency deviation of ±10 %, the maximal number of com_active sns is presented in table 1.
table 1 maximal number of com_active sns in terms of dc factor

          | 10 ppm (±10%)   | 20 ppm (±10%)   | 30 ppm (±10%)   | 40 ppm (±10%)   | 50 ppm (±10%)
dc [%]    | -10%  nom  +10% | -10%  nom  +10% | -10%  nom  +10% | -10%  nom  +10% | -10%  nom  +10%
0.1       |  52    47   42  |  24    22   19  |  15    13   12  |  10     9    8  |   8     7    6
0.2       | 108    97   87  |  52    47   42  |  34    30   27  |  24    22   19  |  19    17   15
0.3       | 163   147  133  |  80    72   65  |  52    47   42  |  38    34   31  |  30    27   24
0.4       | 219   197  178  | 108    97   87  |  71    63   57  |  52    47   42  |  41    37   33
0.5       | 274   247  224  | 135   122  110  |  89    80   72  |  66    59   53  |  52    47   42
0.6       | 330   297  269  | 163   147  133  | 108    97   87  |  80    72   65  |  63    57   51
0.8       | 441   397  360  | 219   197  178  | 145   130  118  | 108    97   87  |  85    77   69
1         | 552   497  451  | 274   247  224  | 182   163  148  | 135   122  110  | 108    97   87

*note: in shaded columns the maximal number of com_active sns within wsn, is derived, i.e. this column points to p.

by analyzing the results presented in table 1 we can conclude that:
1. for the same dc factor the maximal number of com_active sns always decreases as sx increases, i.e. as less stable oscillators are built in.
2. independently of quartz instability, as the dc factor takes a higher value, the maximal number of com_active sns increases.
3. in all cases, a small frequency deviation (from 9 up to 11 ppm) causes a significant variation of the number of com_active sns. for example, for dc=1% and nominal sx=10 ppm, the difference is 552-451=101 sns.

5.2. beacon period selection

for a wsn with a maximal number of com_active sns, let us now determine the beacon period, tbp. to this end, two analyses are conducted. the first one deals with the choice of tbp in terms of the number of com_active sns, with the quartz instability sx as a parameter and a fixed (predefined) dc=0.5 %. in the second analysis we determine the tbp duration in terms of the number of com_active sns, with the dc factor (dc=0.1 %, …, 1 %) as a parameter and fixed sx (sx=10 ppm and sx=40 ppm).

first analysis: in general, low dc factor is preferable when long life sn operation is required.
in practice the dc factor is within the range from 0.1 up to 1 % [15]. for illustration purposes only, we choose a middle value (dc=0.5 %). fig. 5 shows the minimal duration of tbp in terms of the number of com_active sns, with sx as a parameter. we have adopted the following values:
a) tproc = 4 ms (in our case the cpu runs at 1 mhz, the packet length is 64 bytes, and the data transfer rate is 128 kbps);
b) tswitch = 6 μs (for a microcontroller of type msp430f123, where 6 μs corresponds to the switching time from active mode to low power mode 3, and vice versa [16]).

fig. 5 minimal duration of beacon period in terms of com_visible sns for dc=0.5% (plot: beacon period [s] versus number of com_visible sns; curves for 10, 20, 30, 40 and 50 ppm)

according to the results presented in fig. 5 we can conclude that quartz oscillator instability has a direct impact on the maximal number of com_visible sns. independently of tbp, as sx increases the maximal number of com_visible sns decreases.

second analysis: when the dc factor is within the range from 0.1 up to 1 %, similar results concerning the minimal duration of tbp are obtained.

fig. 6 minimal duration of tbp in terms of p, for dc factor as parameter: a) sx=10 ppm; b) sx=40 ppm (plots: beacon period [s] versus number of com_active sns; curves for dc=0.1 %, 0.2 %, 0.5 % and 1 %)

the duration of tbp in terms of p (p<20), with the dc factor as a parameter and for a fixed value of quartz oscillator instability sx (sx=10 ppm and sx=40 ppm), is sketched in fig. 6a and 6b, respectively.

by analyzing the results presented in fig. 6 we can conclude the following:
1.
as the dc factor decreases and the number of com_active sns increases, tbp increases too.
2. as the dc factor increases, the curves defined by the function tbp=φ(p) become more linear.

according to the results presented in fig. 5 and fig. 6 we can conclude that, when a quartz oscillator with a lower ppm value is built into the sn's architecture:
a) for a beacon period of fixed duration, a wsn with a larger number of com_active sns can be made operative. for example, for tbp=100 s and sx=40 (10) ppm, a wsn composed of p=35 (82) com_active sns can be realized (see fig. 5).
b) for a wsn with a fixed number of com_active sns, correct operation can be achieved with a lower dc factor. for example, for p=20 sns and sx=50 (10) ppm, a dc factor of 1 (0.1) % is needed for feasible wsn operation (see fig. 6).

6. estimating energy consumption

it is well known that the energy capacity declared by the battery manufacturer is not always equal to the energy that can actually be drawn from the battery installed in an sn. in order to extract the maximum energy from a battery, it is necessary to understand the following two phenomena. the first one is the leakage current, which is a direct consequence of the battery's self-discharge characteristics. the second one is the energy consumption of the sn during the different phases of the rpll protocol. the energy consumption of the sn is equal to:

\[ e = \int_{0}^{T} i_{avr}(t)\, v\, dt + \int_{0}^{T} i_{sd}(t)\, v\, dt = e_{sn} + e_{sd} \qquad (11) \]

where esn is the energy consumption of the sn during its lifetime; esd is the energy wasted due to battery self-discharge; v is the power supply voltage; T is the lifetime of the sn; iavr is the average current during the lifetime of the sn; and isd is the self-discharge current. to simplify the analysis we assume that the power supply voltage is constant during the lifetime of the sn. in order to determine accurately the term esn = iavr·v·T it is necessary to know the profile of iavr.
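with a constant supply voltage and constant average currents, eq. (11) reduces to simple products, which the following sketch evaluates. the function names and all numeric inputs (100 µa, 10 µa, 3.0 v, one hour) are hypothetical values chosen only to make the arithmetic easy to follow; they are not taken from the measurements reported here.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 h


def energy_joule(i_avr_a: float, i_sd_a: float, v: float, lifetime_s: float) -> float:
    """Eq. (11) for constant v, i_avr and i_sd: e = e_sn + e_sd, in joules."""
    e_sn = i_avr_a * v * lifetime_s  # energy consumed by the node
    e_sd = i_sd_a * v * lifetime_s   # energy lost to battery self-discharge
    return e_sn + e_sd


def capacity_mah(current_ua: float, hours: float) -> float:
    """Charge drawn by a constant current, expressed in mAh."""
    return current_ua * 1e-3 * hours


# Hypothetical example: 100 uA node current, 10 uA self-discharge, 3.0 V supply.
e_one_hour = energy_joule(i_avr_a=100e-6, i_sd_a=10e-6, v=3.0, lifetime_s=3600.0)
mah_per_year = capacity_mah(current_ua=100.0, hours=HOURS_PER_YEAR)
```

expressing the result in mah per year is convenient because battery capacities, such as those discussed in this section, are quoted in the same unit.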
current profiles of the sensor, transceiver, and cpu during phases up_sc and ne_di_re are presented in fig. 7a and 7b, respectively.

fig. 7 current profile of sn3's constituents (sensor, transceiver, cpu) during phases: a) up_sc; b) ne_di_re

according to the principle of operation of the rpll protocol we have:

\[ i_{avr} = h\, i_{au} + (1 - h)\, i_{ad} + i_{sd} \qquad (12) \]

where: iau is the average current of the sn during the up_sc phase; iad is the average current of the sn during the ne_di_re phase; isd is the battery self-discharge current; and h is the fraction of its lifetime that the sn spends in the up_sc phase, as opposed to the ne_di_re phase. let us note that during the lifetime of the sn the phase ne_di_re happens at least once (during the registration of a new sn or after a reset), so h approaches 1 only in the limit.

the average current iau during the phase up_sc (see fig. 7a) is equal to:

\[ i_{au} = i_{aucpu} + i_{aurf} + i_{ausen} \qquad (13) \]

where:

\[ i_{ausen} = \frac{t_{sen}}{t_{bp}}\, i_{sen}, \qquad (14) \]

\[ i_{aurf} = p\, \frac{t_{off}}{t_{bp}}\, i_{srf} + 2p\, \frac{t_{switch}}{t_{bp}}\, i_{switchrf} + (p + 1)\, \frac{t_{rec}}{t_{bp}}\, i_{arf}, \qquad (15) \]

and

\[ i_{aucpu} = p\, \frac{t_{off}}{t_{bp}}\, i_{scpu} + 2p\, \frac{t_{switch}}{t_{bp}}\, i_{switchcpu} + (p + 1)\, \frac{2 t_{proc} + t_{guard}}{t_{bp}}\, i_{acpu}. \qquad (16) \]

the average current, iad, during the phase ne_di_re (see fig. 7b) is equal to:

\[ i_{ad} = 2\, \frac{t_{switch}}{t_{bp}}\, i_{switch} + (i_{acpu} + i_{arf}) \left( 1 - 2\, \frac{t_{switch}}{t_{bp}} \right) \qquad (17) \]

from fig. 7a we have that iacpu, iarf and isen correspond to the currents of the cpu, transceiver, and sensor, respectively, when the sn is in the on state, while iscpu and isrf correspond to the currents of the cpu and transceiver when the sn is in the off state. in our case, the cpu (msp430f123) goes from active to lpm3 mode within the switching period tswitch = 6 µs.
furthermore, we assume that during the switching period tswitch the switching current iswitch varies linearly, and is given by the following formulas:
$$i_{switchrf} = \frac{i_{arf} + i_{srf}}{2} \qquad (18)$$
$$i_{switchcpu} = \frac{i_{acpu} + i_{scpu}}{2} \qquad (19)$$
$$i_{switch} = \frac{i_{acpu} + i_{arf} + i_{scpu} + i_{srf}}{2} \qquad (20)$$
where iacpu (iscpu) and iarf (isrf) are the cpu and rf currents in active (sleep) mode, respectively (see fig. 7). by substituting eqs. (14), (15), (16) and (17) into eq. (12), the exact formula for calculating iavr is obtained. the architecture of our sn consists of: a) cpu: low-power microcontroller msp430f123; b) communication part: rf modulator cc2420; c) sensor subsystem: sensor ms55er (for barometric pressure); and d) lithium-ion battery with a capacity of 560 mah. all results presented in the sequel relate to this sn architecture. in our design solution we have iacpu = 300 µa, iarf = 19700 µa, iscpu = 1.6 µa, isrf = 1 µa, and isen = 1000 µa. for a rechargeable (non-rechargeable) lithium-ion battery, the self-discharge current, isd, causes a battery capacity loss of 10 % (2 %) per year [17]. the required battery capacity, rbc, for a wsn composed of ten neighboring sns (p = 10), in terms of tbp, for a one-year working period is sketched in fig. 8. in fig. 8, sx is taken as a parameter, while the time ratio h = 0.9999884 corresponds to one discovery phase per 24-hour period. as can be seen from fig. 8, in all cases the minimal rbc is obtained for tbp = 60 s. also, the better the quality of the quartz unit, the lower the rbc. for example, for sx = 50 ppm the minimal rbc50 = 670 mah/year, while for sx = 10 ppm the minimal rbc10 = 516 mah/year, which corresponds to an rbc reduction of 23 % relative to the 50 ppm case. under identical operating conditions for p = 20, we obtain similar results (the curves rbc=φ(tbp) have nearly the same shape as those presented in fig. 8).
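the figures quoted above can be cross-checked with a few lines of arithmetic. the sketch below is not taken from the paper: the function names are hypothetical, the self-discharge current is modelled as a constant drain, and the 1 s duration of a discovery phase is an assumption chosen because it reproduces the quoted time ratio h.

```python
HOURS_PER_YEAR = 365 * 24
SECONDS_PER_DAY = 24 * 3600


def self_discharge_current_ua(capacity_mah, loss_per_year):
    """Constant current (in uA) that drains `loss_per_year` of the capacity in one year."""
    return capacity_mah * loss_per_year / HOURS_PER_YEAR * 1000.0


def time_ratio_h(discovery_s, scan_period_s):
    """Fraction of time spent in up_sc when one discovery phase lasting
    `discovery_s` seconds (assumed value) occurs every `scan_period_s` seconds."""
    return 1.0 - discovery_s / scan_period_s


def relative_reduction(reference, improved):
    """Reduction of `improved` with respect to `reference`, as a fraction."""
    return (reference - improved) / reference


if __name__ == "__main__":
    # 560 mAh battery, 10 % (rechargeable) or 2 % (non-rechargeable) loss per year
    print(self_discharge_current_ua(560.0, 0.10))
    print(self_discharge_current_ua(560.0, 0.02))
    # one discovery phase per day, assuming it lasts 1 s
    print(time_ratio_h(1.0, SECONDS_PER_DAY))
    # minimal rbc for sx = 50 ppm versus sx = 10 ppm (p = 10)
    print(relative_reduction(670.0, 516.0))
```

under these assumptions the 23 % quoted in the text is the difference between the two rbc values taken relative to the 50 ppm value, (670 - 516)/670.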
as an illustration, for sx = 50 ppm and p = 20, the minimal rbc50 = 1188 mah/year, while for sx = 10 ppm we obtain rbc10 = 895 mah/year, which corresponds to an rbc reduction of 25 %.

fig. 8 required battery capacity in terms of beacon periods with sx as parameter (x-axis: beacon period [s]; y-axis: battery capacity for one year [mah]; curves for sx = 10, 20, 30, 40 and 50 ppm)

in fig. 9 the power consumption of the sn for a one-year working period, in terms of tbp with h as a parameter, is presented. according to the results presented in fig. 9 we can conclude that: a) as h decreases, the power consumption of the sn increases. b) in all cases a minimal consumption exists, but for different tbp values. for example, for h = 0.999988426 (single discovery cycle per day) and tbp = 60 s we obtain emin = 670 mah, while for h = 0.9999996 (single discovery cycle per month) and tbp = 300 s we obtain emin = 466 mah, which corresponds to an rbc reduction of 30.45 %.

fig. 9 required battery capacity in terms of beacon period for different wsn scanning periods (x-axis: beacon period [s]; y-axis: power consumption for one year [mah]; curves for scanning periods of 1, 2, 3, 4, 5, 6, 7, 10, 15 and 30 days)

6.1. power consumption comparison between dldc-mac and rpll protocols

fig. 10 gives a power comparison between the dldc-mac and rpll protocols for a one-year period in terms of the number of collisions, nc, with the duration of the beacon period, tbp, as a parameter. the comparison is presented as the ratio between the sn's power consumption under dldc-mac and under the rpll protocol, and is denoted as pcr. by analyzing the results presented in fig. 10 we can conclude the following: i. the power consumption of dldc-mac is always higher than that of the rpll protocol. ii. for both protocols, as nc increases, the power consumption increases too. in addition, as nc increases the pcr increases too.
for example: for tbp = 180 s and nc = 10 the pcr = 2.81 %, while for tbp = 180 s and nc = 180 we obtain pcr = 36.55 %. iii. higher power saving is always achieved for larger tbp. as tbp increases the pcr increases too. for example: for tbp = 180 s and nc = 60 the pcr = 15.95 %, while for tbp = 600 s and nc = 60 we obtain pcr = 41.00 %.

fig. 10 power comparison between dldc-mac and rpll protocol for one year period in terms of number of collisions (x-axis: collisions per 1 year (num); y-axis: power consumption for 1 year, dldc-mac/rpll (%); curves for tbp = 10, 30, 60, 120, 180, 300, 600 and 1800 s)

the achieved saving in power consumption justifies the use of rpll instead of the dldc-mac protocol.

7. conclusion

power saving is a crucial issue in battery-powered sns. in order to achieve correct in-time wake-up of sns we have implemented some modifications to the dldc-mac rendezvous protocol. the principle of operation of the proposed rpll protocol is based on the duty cycling technique. it uses a pseudo-asynchronous scheme and is preferable for wsn applications that are typical for environment monitoring. the modification with respect to the dldc-mac protocol consists of introducing a unique id for each sn within the wsn, which reduces activity during the registration process and avoids collisions of newly joined sns. the performance of the proposed protocol has been estimated with respect to: a) the maximal number of sns within the wsn in terms of the dc factor; b) the maximal duration of a beacon period in terms of simultaneously active sns; c) the energy consumption of the sn in terms of the instability of the sn's quartz oscillator; and d) the energy consumption of the sn in terms of the wsn scanning period used to detect newly active sns.
according to the obtained results, for a defined beacon period and selected sns quartz oscillator instability, wsn designer can exactly determine the minimal power consumption of sn, and thus extend its lifetime. the power consumption of a sn which implements a rpll protocol in respect to sn which uses dldc-mac protocol is always lower. the achieved power saving, for one year working period, is within the range from 0.01 % (tbp = 1 s and nc = 180) up to 86.43 % (tbp =1800 s and nc = 180). the obtained results justify the involved modifications in the rpll protocol. acknowledgement: this research was sponsored in part by the serbian ministry of science and technological development, project no. tr-32009 "low-power reconfigurable fault-tolerant platforms". the authors would like to thank to anonymous reviewers for their useful and constructive suggestions, mainly intended to improve the quality of this paper. references [1] raghunathan v., ganerival s., srivastava m., "emerging techniques for long lived wireless sensor networks", ieee communication magazine, vol. 44, no. 4, pp. 108-114, 2006 [2] p. k. dutta, d. e. culler, "system software techniques for low power operation in wireless sensor networks", in proceedings of the iccad`05, san jose, california, usa, 2005, pp. 925-932 [3] m. riduan ahmad, eryk dutkiewicz and xiaojing huang (2011). "a survey of low duty cycle mac protocols in wireless sensor networks", emerging communications for wireless sensor networks, (ed.), isbn: 978-953-307-082-7, intech, available from: http://www.intechopen.com/books/emergingcommunications-for-wirelesssensor-networks/a-survey-of-low-duty-cycle-mac-protocols-in-wirelesssensor-networks, acc.01.11.2013 [4] en yi lin, a comprehensive study of power-efficient rendezvous schemes for wireless sensor networks, phd thesis, university of california, berkeley, 2005 [5] en-yi a. lin, jan m. 
rabaey, adam wolisz, "power-efficient rendez-vous schemes for dense wireless sensor networks", in proceeding of icc2004, paris, france, june, 2004, vol.7, pp. 3769-3776
[6] d. christmann, r. gotzhein, m. krämer, m. winkler, "flexible and energy-efficient duty cycling in wireless networks with macz", proc. 10th annual int new technologies of distributed systems (notere) conf, ieee, tozeur, tunisia, 2010, pp. 121-128
[7] e.serpedin, q.m.chaudhari, synchronization in wsn: parameter estimation, performance benchmarks and protocols, cambridge university press, new york, 2009
[8] m.kosanovic, m.stojcev, "energy efficient rendezvous protocol for wireless sensor networks", 2nd mediterranean conference on embedded computing, meco 2013, budva, montenegro, pp. 215-218
[9] m.brzozowski, k.piotrowski, p.langendoerfer, "a cross-layer approach for data replication and gathering in decentralized long-living wireless sensor networks", isads 2009, the 9th international symposium on autonomous decentralized system, athens 2009, pp.49-54
[10] giuseppe anastasi, mario di francesco, marco conti, andrea passarella, how to prolong the lifetime of wireless sensor networks, chapter 6 in handbook on mobile ad hoc and pervasive communications, l.t. yang and m.k.
denko, editors, american scientific publishers, december 2006, http://info.iet.unipi.it/~anastasi/papers/yang.pdf, acc.10.11.2013
[11] g.anastasi, m.conti, m.d.francesco, a.passarella, "energy conservation in wireless sensor networks: a survey", ad hoc networks 7 (2009) 537–568, http://info.iet.unipi.it/~anastasi/papers/adhoc08.pdf, acc.10.02.2013
[12] wei ye, heidemann j., estrin d., "an energy-efficient mac protocol for wireless sensor networks", in proceedings of ieee infocom 2002 (june 2002), new york, usa, vol. 3, pp. 1567–1576
[13] zvi rosberg, ren ping liu, tuan le dinh, yi fei dong, sanjay jha, "statistical reliability for energy efficient data transport in wireless sensor networks", wireless netw (2010) 16, pp.1913-1927, published on line by springer science + business media, http://www.cse.unsw.edu.au/~ydon/publications/winet.pdf, acc. 20.01.2013
[14] m. kosanovic, m. stojcev, "delay compensation method for time synchronization in wireless sensor networks", 10 telsiks ieee, nis 2011, vol.2 pp.623-629
[15] anton ageev, time synchronization and energy efficiency in wireless sensor networks, disi university of trento, italy, phd thesis, march 2010
[16] texas instruments datasheets, msp430f123 16-bit ultra-low-power microcontroller, 8kb flash, 256b ram, usart, comparator, http://www.ti.com/product/msp430f123, acc.15.02.2013
[17] gianfranco pistoia, battery operated devices and systems, elsevier b.v., amsterdam, the netherlands, 2009

facta universitatis series: electronics and energetics vol. 33, no 1, march 2020, pp.
61-71 https://doi.org/10.2298/fuee2001061n © 2020 by university of niš, serbia | creative commons license: cc by-nc-nd parallel-strip line stub resonator for permittivity characterization dušan a. nešić 1 , ivana radnović 2 1 centre of microelectronic technologies, institute of chemistry, technology and metallurgy, university of belgrade, serbia 2 institute imtel komunikacije a.d, belgrade, serbia abstract. a new type of a microwave permittivity sensor with a short open stub as a resonator is introduced. the open stub is realized as a double-sided parallel-strip line without a substrate and can be totally immersed into the measured material. it provides high sensitivity of the resonant frequency nearly proportional to the ratio of square roots of dielectric constants of the measured materials. the sensor is tested in two different frequency ranges and for two different dielectric constant ranges (oils and ethanol-water mixture). its technology is without any additional technological processes such as vias, air-bridges or defected ground structures. presented sensor is designed, fabricated and tested showing good agreement between simulations and measurements. key words: microwave sensor, microstrip, double-sided parallel-strip line, permittivity measurement. 1. introduction microwave sensors are being increasingly used as sensing components in many applications [1]. they are sensitive, able to survive overdrives and their signal can be directly transmitted over a distance [2]. one type of microwave sensor is a resonant sensor. great advantage of this type of sensor is its principle of operation that is based on the resonance frequency and is generally immune to the environmental noise. besides, the use of the planar technology enables an easy, fast and inexpensive fabrication. advantages of the planar microwave fabrication process finds wide application in planar structures such as microstrip, cpw and strip line [1,3]. 
accordingly, a microwave microstrip resonator is a good choice for a sensor [4-9]. the location of the material under test (mut) is usually above the microstrip line [4,9], under the pattern etched in the microstrip ground plane [5,6] or above the coupling area of the coupled microstrip structures [7,8]. however, there is one main problem: the sensitivity depends on the extent of the field penetration inside the mut [3]. in all three mentioned positions of the mut only a part of the field lines is inside the mut, because the field lines in microstrip are predominantly concentrated within the substrate, as presented in fig. 1.

received march 4, 2019; received in revised form august 21, 2019. corresponding author: dušan a. nešić, centre of microelectronic technologies, institute of chemistry, technology and metallurgy, university of belgrade, njegoseva 12, 11000 belgrade, serbia (e-mail: nesicad@nanosys.ihtm.bg.ac.rs)

fig. 1 electric (e) and magnetic (h) field lines in microstrip are stronger within the substrate. material under test (mut) is usually above the substrate in the lower field region. gray areas represent metallization

it is obvious that locating the mut inside the substrate results in a higher sensitivity [3]. to that end, one can insert the mut (e.g. a fluid) through the substrate [10, 11]. this solution is inconvenient, especially when thin substrates are used, and is suitable only for microfluids. another solution is a double-sided parallel-strip line printed on dielectric pipes for fluid testing [12]; it is appropriate for pipes, but not for immersing a stub into a fluid. also, the resonance occurs at low frequencies, so the open stubs are in this case too long (around 25 cm). some analogy with a coaxial open stub is given in [4]. its resonance is also at low frequencies, so the open stub is too long (around 33 cm) and is not practical for a number of applications.
besides, it is tested only for high dielectric constants. a microstrip sensor for immersing into a fluid is presented in [5]. its disadvantages are its construction and protection problems during measurements. one solution to the problems from [5] is the use of substrate integrated waveguide (siw) technology [13]. however, the disadvantage of the solution presented in [13] is the great number of vias required by the siw technology. in this paper a new type of modified microstrip λ/4 open stub resonant sensor is introduced. it is suitable for immersing into a fluid and has a short open stub (20 mm). the whole structure is in the form of a double-sided parallel-strip line [14,15], i.e. a t-junction with an open stub without a substrate as the sensing part, fig. 2. the pair of two symmetrical metal strips without a substrate represents the sensing part of the stub. double-sided parallel-strip line technology is chosen in order to obtain such a sensing structure. the absence of a substrate enables each stub strip to be totally surrounded by the mut. as a result, the total field around the stub strips is inside the mut, which naturally produces higher sensitivity. the sensing stub can be simply immersed into the mut without any additional preparation or use of any auxiliary structure such as a cavity. the sharp stopband always exists and the resonant frequency can be clearly measured.

fig. 2 basic layout of a double-sided parallel-strip line t-junction with an open stub without substrate

an open stub is a well-known resonator. the first resonance of an open shunt stub occurs when the stub is a quarter of the guided wavelength long:
$$l = \frac{\lambda_{gr}}{4} = \frac{\lambda_0}{4\sqrt{\varepsilon_{reff}}} = \frac{c}{4 f_r \sqrt{\varepsilon_{reff}}}, \qquad f_r = \frac{c}{4\,l\,\sqrt{\varepsilon_{reff}}}, \qquad \varepsilon_{reff} = \left(\frac{c}{4\,l\,f_r}\right)^{2} \qquad (1)$$
where λgr is the guided resonant wavelength, λ0 is the free space wavelength, ɛreff is the effective dielectric constant, l is the length of the open stub, fr is the resonant frequency and c is the speed of light.
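as a numerical illustration of the quarter-wave relation, the sketch below evaluates the resonant frequency of a 20 mm open stub and inverts the relation to recover the effective dielectric constant. it is an idealized estimate, not the authors' procedure: it assumes the whole stub is surrounded by a single homogeneous medium and ignores fringing fields and the stub segment printed on the substrate. the function names are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in vacuum [m/s]


def resonant_frequency_hz(stub_length_m, eps_reff):
    """First resonance of an ideal quarter-wave open stub: f_r = c / (4 l sqrt(eps_reff))."""
    return C / (4.0 * stub_length_m * math.sqrt(eps_reff))


def eps_reff_from_resonance(stub_length_m, f_r_hz):
    """Inverse relation: eps_reff = (c / (4 l f_r))^2."""
    return (C / (4.0 * stub_length_m * f_r_hz)) ** 2


if __name__ == "__main__":
    length = 0.020  # 20 mm stub, as quoted in the text
    for eps in (1.0, 2.5, 22.0, 73.0):
        f_ghz = resonant_frequency_hz(length, eps) / 1e9
        print(f"eps_reff = {eps:5.1f} -> f_r = {f_ghz:.3f} ghz")
```

for air the ideal estimate is about 3.75 ghz, close to the 3.74 ghz obtained in the simulation of the real structure.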
in the microstrip structure ɛreff mainly depends on the dielectric constant ɛr of the microstrip dielectric substrate, because the field lines of the microstrip are predominantly concentrated within the substrate, as presented in fig. 1. the goal of the paper is to use an open stub without a substrate, in which case the material under test totally fills both the area surrounding the substrate and the area commonly occupied by the substrate. in that case ɛreff ≈ ɛr-mut, which yields high sensitivity. the ideal sensitivity, expressed as the shift of the open stub resonant frequency, is equal to the ratio of the square roots of the dielectric constants of the measured materials, eq. (1). the proposed sensor is fabricated in microstrip printed planar technology without any additional technological process such as vias, air-bridges, defected ground structures (dgs) or the many vias needed for substrate integrated waveguide (siw). the sensor was realized in an easy way using a standard photolithographic procedure. besides, the sensor dimensions are within technological tolerances.

2. design and fabrication

as mentioned previously, the structure is designed in printed planar technology as a double-sided parallel-strip line t-junction. the objective of the design was to fabricate the t-junction with an open stub without a substrate. owing to fabrication constraints, the realized structure is somewhat different from the basic ideal model shown in fig. 2. photos of both sides of the fabricated structure are displayed in fig. 3. the main part of the proposed structure is realized on cuclad 217 substrate (with relative dielectric constant ɛr = 2.17 and thickness h = 1.143 mm) as a double-sided parallel-strip line t-junction. layouts of the bottom and the top parts of the structure are presented in fig. 4 and are denoted by gray and black color, respectively.
the structure consists of a 4.5 mm wide 50 Ω double-sided parallel-strip line with a double-sided parallel-strip line open stub in the middle, which is 4.75 mm long and 4.5 mm wide, as shown in fig. 4. the part of the stub printed on the dielectric substrate serves for bonding the rigid metal strips (a in fig. 3) on both sides, while the distance between the strips is the same as the thickness of the substrate (1.143 mm). since the structure is designed as a symmetrical (balanced) microstrip line, there has to be a transition (balun) to an unsymmetrical (conventional) 50 Ω microstrip line at both of its ports [15]. in our case, for the used dielectric substrate, the width of this 50 Ω line is 3.5 mm. the width of the ground plane area at the sma connector location is 14 mm. rigid metal strips, 20 mm long, 4.5 mm wide and 0.3 mm thick, are bonded (with a conventional eutectic alloy) to the 4.75 mm long stubs (a in fig. 3) on both sides of the substrate. the free parts of the rigid metal strips form the 15.25 mm long part of the open stub without a substrate (b in fig. 3).

fig. 3 photograph of the proposed microwave sensor with sma connectors: a) bottom side; b) top side. a - part of the metal strip on the substrate; b - part of the metal strip without the substrate

fig. 4 layout of the bottom (gray) and the top (black) side metallization of the proposed double-sided parallel-strip line t-junction with a balun transition to the conventional microstrip line at both ports

3. simulation

the main problem is the double segmented open stub. the shorter part of this stub (part a in fig. 5) is printed on the substrate and cannot be immersed in the mut. it is treated like a common double-sided parallel-strip line on a substrate. part b (fig. 5) is immersed into the mut so as to be totally surrounded by it.
simulations were carried out using the 3d wipl-d microwave pro program package [16].

fig. 5 sketch of the open stub resonator and its wipl-d simulation model: a) segments of the open stub; b) wipl-d pro simulation model. a - segment of the stub printed on the substrate (4.75 mm); b - segment of the stub without the substrate immersed in the mut (15.25 mm)

the wipl-d simulation model is presented in fig. 5b. simulation results are obtained for two specific ranges of the relative dielectric constant. the first is for ɛr ranging from 1.5 to 3, specific for oils, while the second is for ɛr ranging from 20 to 80, specific for water-ethanol mixtures. for the water-ethanol mixture the parameters are taken from [17]. the high imaginary parts of ɛr are incorporated from [17] in order to calculate the real resonant frequency for the measured frequency range (ethanol 70%: 39.5 - i7 and ethanol 96%: 22 - i11). the relative dielectric constant ɛr-mut versus the resonant frequency of the open stub is presented in the diagrams in fig. 6, fig. 7 and fig. 11. for the reference, air (ɛr = 1), the simulated resonant frequency is 3.74 ghz.

fig. 6 simulated diagrams for the first specific range of the mut relative dielectric constants (1.5-3.0) vs. the resonant frequencies

fig. 7 simulated diagrams for the second specific range of the mut relative dielectric constants (20-80) vs. the resonant frequencies

4. measurement

the measurements are performed in the steady state at a temperature of around 300 k in order to obtain stable results. the measurement setup with the sensing open stub and the container with the mut is presented in fig. 8. the container shown in fig. 8 itself introduces a negligible frequency shift. the transmission coefficient (s21) of the proposed structure is measured using the agilent technologies network analyzer n5227a. several materials were tested: air, gasoline (medical), paraffin oil and sunflower oil, as well as water and ethanol.
the measured s21 parameters in both cases are presented in figures 9, 10 and 12, respectively.

fig. 8 measurement setup with the sensing open stub and the container. a - segment of the stub printed on the substrate; b - segment of the stub without the substrate to be immersed into the mut

fig. 9 measured s21 coefficient of various mut

fig. 10 measured s21 coefficient of water and ethanol

fig. 11 ethanol 96% simulated s21 coefficient (parameters from [17])

fig. 12 ethanol 96% measured s21 coefficient (x-axis: f [ghz], 0 to 2; y-axis: s21 [db], -16 to 0)

according to the diagrams presented in fig. 6 and fig. 7, the ɛr-mut values (with the measured resonant frequencies) are: gasoline-medical (2.755 ghz) ɛr = 1.90; paraffin oil (2.584 ghz) ɛr = 2.16; sunflower oil (2.4 ghz) ɛr = 2.5; water (0.449 ghz) ɛr = 73; diluted ethanol 35% (0.49 ghz) ɛr = 61; ethanol 70% v/v (0.629 ghz) ɛr = 37 and ethanol 96% v/v (0.787 ghz) ɛr = 22. for air (3.74 ghz), ɛr = 1. all results reasonably match values from the available references [17-21], as shown in table 1. agreement between simulation and measurement can be checked by comparing the s21 parameters for ethanol 96% from the simulation in fig. 11 and from the measurement in fig. 12. the loss tangent tan(δ) is extracted (from the -3 db frequency range) according to [22], using the relation between the quality factor q and tan(δ) and the contribution of the mut part to the entire electrical length of the open stub. the authors assume that tan(δ) of the cuclad 217 substrate as well as tan(δ) of the rigid metal strips in air are negligible compared to the tan(δ) of the mut. the proposed estimation gives a somewhat higher tan(δ) of the mut (conservative version).
the tan(δ) of the mut is estimated from the influence of the mut on the resonator and is slightly higher than the measured tan(δ) (only the longer part of the open stub is in the mut):
$$\tan(\delta)_{mut} = \tan(\delta)_{meas}\left(1 + \frac{d_{shorter}\sqrt{\varepsilon_{reff}}}{d_{mut}\sqrt{\varepsilon_{mut}}}\right) \qquad (2)$$

table 1 results
mut | measured ɛr (fr) | measured tan(δ) | reference | reference ɛr (error) | reference tan(δ) (error)
gasoline-medical | 1.90 ±0.003 (2.755 ghz) | 0.015 | [18] | 2.0 (5 %) | 0.015 (1 %)
paraffin oil | 2.16 ±0.018 (2.584 ghz) | 0.013 | [19] | 2.2 (2 %) |
sunflower oil | 2.50 ±0.005 (2.4 ghz) | 0.08 | [20] | 2.56 (3 %) | 0.128 (38 %)
water # | 73.0 ±3.8 (0.449 ghz) | 0.05 | [21] | 76.0 (4 %) | 0.026 (90 %)
ethanol 35% | 61.0 ±2.6 (0.49 ghz) | 0.064 | [17] | 58.9 (4 %) | 0.07 (9 %)
ethanol 70% | 37.0 ±1.2 (0.629 ghz) | 0.186 | [17] | 39.5 (7 %) | 0.177 (5 %)
ethanol 96% | 22.0 ±1.0 (0.787 ghz) | 0.53 | [17] | 22.0 (1 %) | 0.5 (6 %)
# tap water: water from the regular water supply

5. discussion

the sensor is tested for two dielectric constant and frequency ranges (oils and ethanol-water mixture). the frequency shift between two measured materials is close to the ratio of the square roots of their relative dielectric constants ɛr for both ranges. for example, the ratio between the air and the water resonant frequencies is around 8.3, and the ratio between the square roots of the water and the air dielectric constants is around 8.5. for gasoline these ratios are 1.36 and 1.38, respectively. the sensing part of the open stub is relatively short (15.25 mm) and can be immersed into a small container. the measurement errors are calculated according to the frequency step in the measurement process, table 1. the deviations from the values in references [17-21] are given in percentages [%]. the errors are high for tan(δ) of the sunflower oil and water because the composition of the sunflower oil and of the water from the regular water supply is not well defined (which affects tan(δ) in particular).
the relative sensitivities for both dielectric constant ranges, (1.5-3.0) and (20-80), are given in fig. 13 and fig. 14, respectively. the resolution depends on the frequency step and on the dielectric constant range.

fig. 13 relative sensitivity for the first specific range of the mut relative dielectric constants (1.5-3.0) vs. dielectric constant

fig. 14 relative sensitivity for the second specific range of the mut relative dielectric constants (20-80) vs. dielectric constant

the second group of resonant frequencies in fig. 10 comes from the second resonant bandgap of the open stub (at 3 times the first resonance). the second resonances are somewhat shifted and have wider bandgaps. the reason is the lower dielectric constant and higher tan(δ) at higher frequencies [17, 21].

6. conclusion

the paper introduces a new type of microwave resonant sensor realized as a t-junction with an open stub as the sensing part. the sensing part of the stub is a pair of metal strips in the form of a double-sided parallel-strip line without a substrate. the absence of the substrate enables each stub strip to be totally surrounded by the mut. the frequency shift between two measured materials is close to the ratio of the square roots of their relative dielectric constants ɛr-mut. the proposed sensor is fabricated in planar technology without dimension tolerance problems: the narrowest line width is 3.5 mm, which is much wider than typical photolithographic manufacturing tolerances (around 30 microns). the sensing open stub is short (15.25 mm), but still significantly longer than common tolerances. there are no additional technological processes such as vias, air-bridges, defected ground structures (dgs) or the great number of vias needed in substrate integrated waveguide (siw) technology. the only additional process is the bonding of the rigid metal strips to the microstrip line on the substrate.
the sensing stub can be simply immersed into the mut without any additional preparing or use of auxiliary structures like cavity. the sensor is suitable for distinguishing the mut, especially mixture concentrations such as water and ethanol mixture. presented sensor is tested for two dielectric constant ranges: oils (1.5-3) and ethanol-water mixtures (20-80), and in two frequency ranges: around 2 ghz and below 1 ghz, respectively. in both cases frequency shift between two measured materials is closely proportional to the ratio of the square roots of their relative dielectric constants. all results reasonably match values from the available references. acknowledgment: the authors would like to thank colleagues m. pesic, n. tasic, lj. radovic, n. popovic and p. manojlovic from the institute imtel for their help in the realization and to professor m. potrebic from the university of belgrade, school of electrical engineering, for her assistance in performing the measurements. this work was funded by the serbian ministry of education and science within the project tr 32008. references [1] s. dey, j.k. saha, and n.c. karmakar, "smart sensing", ieee microwave magazine, pp. 26-39, november 2015. [2] j. polivka, "an overview of microwave sensor technology", high frequency electronic, pp. 32-42, april 2007. [3] k. saeed, m. f. shafique, m. b. byrne and i. c. hunter (2012). planar microwave sensors for complex permittivity characterization of materials and their applications, applied measurement systems, prof. zahurul haq (ed.), intec. [4] a. hoog, m.j.j. mayer, h. miedema, w. olthuis, f.b.j. leferink and a. van den berg, "modeling and simulations of the amplitude–frequency response of transmission line type resonators filled with lossy dielectric fluids", sensors and actuators a, vol. 216, pp. 147-157, 2014. parallel-strip line stub resonator for permittivity characterizati 71 [5] c. liu and y. 
pu, "a microstrip resonator with slotted ground plane for complex permittivity measurements of liquid", ieee microwave and wireless components letters, vol. 18, no. 4, pp. 257259, 2008. [6] c.-s. lee and c.-l. yang, "complementary split-ring resonators for measuring, dielectric constants and loss tangents“, ieee microwave and wireless components letters, vol. 24, no. 8, pp. 563-565, 2014. [7] a. a. abduljabar, d. j. rowe, a. porch, and d. a. barrow, "novel microwave microfluidic sensor using a microstrip split-ring resonator", ieee transactions on microwave theory and techniques, vol. 62, no. 3, pp. 679-688, 2014. [8] m. t. jilani, w. p. wen, l. y. cheong, m. z. u. rehman, and m. t. khan, "determination of sizeindependent effective permittivity of an overlay material using microstrip ring resonator", microwave and optical technology letters, vol. 58, no. 1, pp. 4-9, 2016. [9] lescopa, f. galléeb, s. riouala, "development of a radio frequency resonator for monitoring water diffusion in organic coatings", sensors and actuators a, vol. 247, pp. 30-36, 2016. [10] l. le cloirec, a. benlarbi-delaï and b. bocque, "new concept of rf functions by microfluidic coupling", microwave and optical technology letters, vol. 48, no. 10, pp. 1912-1916, 2006. [11] d.l. diedhiou, r. sauleau, and a.v. boriskin, "microfluidically tunable microstrip filters", ieee transactions on microwave theory and techniques, vol. 63, no. 7, pp. 2245-2252, 2015. [12] m. a. karimi, m. arsalan and a. shamim, "low cost and pipe conformable microwave-based water-cut sensor", ieee sensors journal, vol. 16, no. 21, pp. 7636-7645, 2016. [13] c. liu and f. tong, an siw resonator sensor for liquid permittivity measurements at c band, ieee microwave wireless components letters, vol. 25, no. 11, pp. 751-753, 2015. [14] s.-g. kim and k. chang, "ultrawide-band transitions and new microwave components using doublesided parallel-strip lines", ieee transactions on microwave theory and techniques, vol. 52, no. 9, p. 
2148, september 2004. [15] j.-x. chen, c.-h. k. chin and q. xue, "double-sided parallel-strip line with an inserted conductor plane and its applications", ieee transactions on microwave theory and techniques, vol. 55, no. 9, p. 1899, 2007. [16] 3d wipl-d microwave pro program package. [17] a. megriche1, a. belhadj and a. mgaidi, "microwave dielectric properties of binary solvent wateralcohol, alcohol-alcohol mixtures at temperatures between -35°c and +35°c and dielectric relaxation studies", mediterranean journal of chemistry, vol. 1, no. 4, pp. 200-209, 2012. [18] f. s. jafari, j. ahmadi-shokouh, reconfigurable microwave siw sensor based on pbg structure for high accuracy permittivity characterization of industrial liquids, sensors and actuators a, vol. 283, pp. 386-395, 2018. [19] https://www.engineeringtoolbox.com/relative-permittivity-d_1660.html. [20] j. vrba and d. vrba, "temperature and frequency dependent empirical models of dielectric properties of sunflower and olive oil", radioengineering, vol. 22, no. 4, pp. 1281-1287, 2013. [21] martin chaplin, water and microwaves, http://www1.lsbu.ac.uk/water/microwave_water.htm. [22] a. r. fulford and s. m. wentworth, "conductor and dielectric property extraction using microstrip tee resonators," microwave and optical technology letters, vol. 47, no. 1, pp. 14-16, 2005. http://onlinelibrary.wiley.com/doi/10.1002/mop.v48:10/issuetoc http://ieeexplore.ieee.org/xpl/tocresult.jsp?isnumber=7579242 https://www.engineeringtoolbox.com/relative-permittivity-d_1660.html http://www1.lsbu.ac.uk/water/microwave_water.htm  facta universitatis series: electronics and energetics vol. 34, no 2, june 2021, pp. 291-305 https://doi.org/10.2298/fuee2102291s © 2021 by university of niš, serbia | creative commons license: cc by-nc-nd original scientific paper coefficient quantization effects on new filters based on chebyshev fourth-kind polynomials biljana p. 
stošić university of niš, faculty of electronic engineering, niš, serbia
abstract. the aim of this paper is to construct non-recursive filters, an extensively used type of digital filter in digital signal processing applications, based on chebyshev orthogonal polynomials. the paper proposes the use of the fourth-kind chebyshev polynomials as functions in generating new filters. in this way, low-pass filters with linear phase responses are obtained. a comprehensive study of the frequency response characteristics of the generated filter functions is presented. the effects of coefficient quantization, as one type of quantization that influences a filter characteristic, are also investigated. the quantized-coefficient errors are considered based on the number of bits and the implementation algorithms.
key words: chebyshev recursion, orthogonal polynomials, non-recursive filters, linear phase characteristic, coefficient quantization, implementation structure
1. introduction
digital filters are used in a wide variety of digital signal processing (dsp) applications. most of the time, the final goal of using a filter is to achieve a kind of frequency selectivity on the spectrum of the input signal. researchers have suggested different methods for improving digital filter efficiency, as presented in [1]-[8]. the methods presented in [4]-[8] share the same idea, namely, using cic (cascaded integrator-comb) sections with some added solutions to improve filter attenuation. different methods have been proposed to compensate for the passband droop, as described in [9]-[17]. the objective of the paper [11] was to categorize and describe the most important methods for compensator design proposed until then, and to propose some future directions for compensator design. in [12]-[17], new compensator designs are described and applied to different filters.
received october 15, 2020; received in revised form january 12, 2021. corresponding author: biljana p. stošić, faculty of electronic engineering, aleksandra medvedeva 14, 18115 niš, serbia, e-mail: biljana.stosic@elfak.ni.ac.rs
chebyshev polynomials have been the focus of many studies and have drawn wide attention due to their frequent appearance in various applications in polynomial approximation, integral and differential equations, etc. there are four kinds of these polynomials, as described by mason and handscomb [18], which leads to an extended range of new filter functions. the chebyshev polynomials of the first-kind [19]-[21] are used to generate fir (finite impulse response) filters in [3] and the integer coefficients of the coleman filter in [20]. filter functions recently presented in [22] and [23] prove the use of the first- and second-kind chebyshev polynomials in filter design, respectively. the principal idea behind obtaining these filter functions is based on the coleman filter described in [20].
chebyshev polynomials of the fourth-kind are usually less well known. the aim of this paper is to construct non-recursive filters, an extensively used type of digital filter in dsp applications, based on these polynomials. the presented design algorithm leads to filter functions having a linear phase characteristic, which is very important: if the phase characteristic is nonlinear, the phase delay of the individual frequency components of the signal is not equal, and a change in signal shape occurs.
in dsp, quantization has a great influence. coefficient quantization is a type of quantization that influences the frequency response of a digital filter. hardware implementation of digital filters requires quantization of the filter coefficients to the word size of the registers. the goal is to obtain a quantized frequency characteristic that is as close as possible to the frequency characteristic of the infinite-precision filter. the contribution of this study is twofold.
firstly, it has been mathematically proved that it is possible to obtain novel filter functions by considering chebyshev orthogonal polynomials. some of their characteristics are shown here to demonstrate their usefulness in communication systems. to the best of the authors' knowledge, this is the first study that investigates the use of the fourth-kind chebyshev polynomials as functions in generating new filters. as the second contribution, after designing a digital filter, the effects of the coefficient quantization on its frequency characteristic have to be examined and studied. if the quantized filter does not meet the target specifications, the designer needs to redesign the filter. in particular, it is studied how the magnitude response of the filter is affected by coefficient quantization. also, different implementation structures are considered in order to suggest the best solution.
this paper is organized into several sections. the design procedure of the novel filters and the verification of the designed filters are discussed in sections 2 to 4. then, section 5 elaborates the effects of coefficient quantization on the filter frequency characteristic depending on the number of bits and the implementation structure. finally, the summary is drawn in section 6.
2. coleman filter: a brief overview
the recursion for generation of the chebyshev polynomials of the first-kind [19]-[21], denoted as $T_n(x)$, is
$$T_n(x) = \begin{cases} 1, & n = 0 \\ x, & n = 1 \\ 2x\,T_{n-1}(x) - T_{n-2}(x), & n > 1 \end{cases} \qquad (1)$$
the chebyshev polynomials of the first-kind are used to generate the coleman filter form given in [20]. the frequency response is calculated as $G(f) = 2^{-20}\,T_7(F(f))$, where the function $F(f) = 2 + 2\cos(2\pi f)$ is used and $T_7(x)$ represents the chebyshev polynomial of the first-kind of degree seven.
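the recursion of eq. (1) can be sketched in a few lines of python. this is only an illustration, not the author's implementation: the function names `chebyshev_t` and `coleman_response` are hypothetical, and the argument $F(f) = 2 + 2\cos(2\pi f)$ is assumed as reconstructed from the text. under that assumption $F(0) = 4$, and $T_5(4) = 15124$ and $T_5(3) = 3363$ agree with the $G(0)$ entries of the first-kind columns of table 2.

```python
import math


def chebyshev_t(n, x):
    """First-kind Chebyshev polynomial T_n(x) via the recursion of eq. (1)."""
    if n == 0:
        return 1
    t_prev, t_curr = 1, x  # T_0(x), T_1(x)
    for _ in range(2, n + 1):
        # T_k(x) = 2 x T_{k-1}(x) - T_{k-2}(x)
        t_prev, t_curr = t_curr, 2 * x * t_curr - t_prev
    return t_curr


def coleman_response(f, n=7, scale=2.0 ** -20):
    """G(f) = scale * T_n(F(f)), assuming F(f) = 2 + 2 cos(2 pi f)."""
    return scale * chebyshev_t(n, 2 + 2 * math.cos(2 * math.pi * f))


# T_5 evaluated at F(0) = v + 2 for v = 2 and v = 1
print(chebyshev_t(5, 4), chebyshev_t(5, 3))
# value of the power-of-two scaled response at f = 0
print(coleman_response(0.0))
```

the power-of-two scale $2^{-20}$ only approximates unit gain at $f = 0$; dividing by $T_n(F(0))$ instead gives an exactly normalized response.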
the exact functions with chebyshev polynomials of degree n are obtained by the following relation
$$G_n(f) = \frac{T_n(F(f))}{G(0)}, \qquad (2)$$
where the constant is calculated as $G(0) = T_n(F(0))$.
3. the new generated filter forms
design of a digital filter involves the following steps: (1) filter specification, (2) filter coefficient calculation, (3) realization, (4) analysis of finite word length effect and (5) implementation. some of the steps are described in the following sections.
3.1. filter specifications
the filter specifications are determined by the application. once the specifications are defined, various concepts and mathematics can be used to come up with a filter description that approximates the given set of specifications. in the case of the newly suggested design, for given specifications such as passband cut-off frequency, stopband cut-off frequency, and maximum and minimum attenuations, the sampling frequency as well as the filter order can be calculated by using exact formulas for the design of chebyshev filters. the actual attenuation of the filter depends on the filter order (the higher the order, the higher the attenuation). here, in the case of a lowpass filter, the group delay is equal to the filter order. meeting the specifications is not guaranteed a priori; trial and error is often required. if the resulting filter does not meet the specifications, one can adjust the filter parameters, e.g. changing the filter order could help to resolve the problem. in order to obtain more attenuation, some optimization can be done, like increasing the filter order or the free parameter v. an advantage of this design approach is that by varying the single parameter v a much better design can be obtained for the same filter order.
3.2. definition of fourth-kind chebyshev polynomials
a brief overview of the basic definition of chebyshev polynomials of the fourth-kind [9], [21] is given here. usually, the polynomials are defined according to the trigonometric formula.
the nth chebyshev polynomial of the fourth-kind at the point x is denoted as $W_n(x)$. a simplified definition of the polynomials on the interval $x \in [-1, 1]$ by using a recurrence relation is as follows
$$W_n(x) = 2x\,W_{n-1}(x) - W_{n-2}(x), \qquad (3)$$
where n = 2, 3, ... represents the polynomial degree and the initial conditions are $W_0(x) = 1$ and $W_1(x) = 2x + 1$.
3.3. design procedure: filter coefficient calculation
the chebyshev polynomials of the fourth-kind are used here to generate different low-pass filter functions. in this case, the new filter functions are generated by the equation
$$G_{n,new}(f) = \frac{W_n(F_{new}(f))}{G_{new}(0)}, \qquad (4)$$
where the normalization constant $G_{new}(0) = W_n(F_{new}(f))$ is calculated for f = 0, and the applied cosine function is
$$F_{new}(f) = v + 2\cos(2\pi f), \qquad (5)$$
with parameter $1 \le v \le 2$. equations (4) and (5) allow one to predict how the filter will respond to varying frequency and constant parameter v. in order to analyze filter characteristics, a few filter examples are designed by using the matlab environment.
the function proposed by j. coleman [20], eq. (2), and the functions of the new filters, eq. (4), for filter order n, are arranged here in rectangular form as the following unique function of frequency $\omega$
$$G_n(\omega) = \frac{a(0) + a(1)\cos(\omega) + \dots + a(n)\cos(n\omega)}{G(0)}, \qquad (6)$$
which is normalized with the constant $G(0)$ or $G_{new}(0)$. assume that the transfer function $H(z)$ of the designed filter can be presented as
$$H(z) = h(n)\,z^{-n} + \sum_{k=0}^{n-1} h(k)\left[z^{-k} + z^{-(2n-k)}\right]. \qquad (7)$$
the filter coefficients show symmetry around the center value.
4. design examples
4.1. magnitude and phase responses
normalized curves of the designed low-pass filters with functions $F(f)$, v = 2, and $F_{new}(f)$, v = 1 or 2, versus normalized frequency $f = \omega/\pi$, are summarized in figs. 1 and 2, for filter orders n = 5 and n = 6, respectively. examples of odd and even orders are generated. the filter characteristics generated by eq.
(4), for the function $F_{new}(f)$ and v = 1, and different filter orders n show higher selectivity in the transition region in comparison with both designed functions $F(f)$ and $F_{new}(f)$ for v = 2. the worst-case suppression in the stop-bands for the new filters with $F_{new}(f)$ is between 63.81 db and 76.41 db for n = 5, as shown in fig. 1. the worst-case suppression in the stop-bands for the new filters with $F_{new}(f)$ for the case of n = 6, shown in fig. 2, is between 75.52 db and 90.78 db.
fig. 1 normalized filter curves in db for case n = 5
fig. 2 normalized filter curves in db for case n = 6
table 1 gives some filter characteristics, such as the pass-band cut-off frequency $f_{cp}$ at $\alpha_{max}$ = 0.1 db, and the minimum attenuation $\alpha_{min}$ at the stop-band cut-off frequency $f_{cn}$. the listed values of the stop-band cut-off frequencies indicate higher selectivity of the filter functions with parameter v = 1. a low-pass filter seeks to eliminate all frequencies in the stop-band, that is, all frequencies above the cut-off frequency should be filtered out. in these cases, a high suppression level in the stop-band is evidently desirable. the key features of the designed filters are good in-band and out-of-band performance.
table 1 characteristics of designed fir filters
| parameter n | 5 | 5 | 6 | 6 |
|---|---|---|---|---|
| parameter v | 1 | 2 | 1 | 2 |
| $f_{cp}$ @ $\alpha_{max}$ = 0.1 db | 0.0263 | 0.0305 | 0.0239 | 0.0278 |
| $\alpha_{min}$ [db] @ $f_{cn}$ | 63.81 @ 0.50 | 76.41 @ 0.67 | 75.52 @ 0.50 | 90.78 @ 0.66 |
the filter characteristics generated by eq. (5) for the function $F_{new}(f)$, filter order n = 5 and different values of the parameter $1 \le v \le 2$ are depicted in fig. 3. the magnitude characteristics of the designed filters have a passband droop that depends on the parameter v. depending on the chosen normalized frequency, the passband droop can vary by about 0.0038 db at normalized frequency 0.01, 0.0152 db at frequency 0.02, or 0.0581 db at 0.025, for different parameter v values.
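the passband droop discussed above can be explored numerically with a short python sketch. it is a hedged illustration only: the names `chebyshev_w` and `magnitude_db` are hypothetical, the argument is assumed to be $F_{new}(f) = v + 2\cos(2\pi f)$ with f in cycles per sample, and the standard fourth-kind recurrence ($W_0 = 1$, $W_1 = 2x + 1$) is used, so the numbers it produces are not guaranteed to reproduce the tabulated values of the paper.

```python
import math


def chebyshev_w(n, x):
    """Fourth-kind Chebyshev polynomial W_n(x) from the three-term recurrence."""
    if n == 0:
        return 1
    w_prev, w_curr = 1, 2 * x + 1  # W_0(x), W_1(x)
    for _ in range(2, n + 1):
        # W_k(x) = 2 x W_{k-1}(x) - W_{k-2}(x)
        w_prev, w_curr = w_curr, 2 * x * w_curr - w_prev
    return w_curr


def magnitude_db(f, n, v):
    """Normalized magnitude in dB, assuming F_new(f) = v + 2 cos(2 pi f)."""
    g0 = chebyshev_w(n, v + 2.0)  # value at f = 0, used for normalization
    g = chebyshev_w(n, v + 2.0 * math.cos(2.0 * math.pi * f)) / g0
    if g == 0.0:
        return float("-inf")  # exact transmission zero
    return 20.0 * math.log10(abs(g))


# droop at a small frequency for the two extreme values of v
for v in (1, 2):
    print(v, magnitude_db(0.01, 5, v))
```

in this sketch the droop at a fixed small frequency is larger for v = 1 than for v = 2, which is the qualitative behaviour described in the text.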
the passband droop is higher when the parameter v has a smaller value (the width of the passband is smaller in this case, as shown in table 1).
fig. 3 normalized new filter curves in db for case n = 5 and $1 \le v \le 2$
in order to achieve correct performance, the filter should have a flat passband. the passband droop can be further improved by applying an additional filter, called a compensating filter or compensator. this idea for improving the passband has been applied to some other filters (comb, cic and new cic filters), as shown in [4] and [9]-[17]. further investigation can be done in order to determine which of the suggested methods is the most convenient for passband droop compensation in the filters proposed here.
the comparison of filter coefficients a(i), i = 0, 1, 2, ..., n, from eq. (6) for n = 5 is given in table 2. it lists the coefficients of the filters based on first-, second- and fourth-kind chebyshev polynomials for parameter values v = 1 and v = 2. in all cases, the filter coefficients calculated for v = 1 have smaller values than those calculated for v = 2.
table 2 coefficients of designed filter functions $G_n(\omega)$: a comparison (n = 5)
| | first-kind chpoly*, v = 2 | first-kind chpoly, v = 1 | second-kind chpoly, v = 2 | second-kind chpoly, v = 1 | fourth-kind chpoly, v = 2 | fourth-kind chpoly, v = 1 |
|---|---|---|---|---|---|---|
| source | literature [20] | literature [23] | literature [22] | literature [22] | new filter functions | new filter functions |
| G(0) | 15124 | 3363 | 30744 | 6930 | 34633 | 8107 |
| a(0) | 3642 | 681 | 7436 | 1414 | 8477 | 1679 |
| a(1) | 6130 | 1210 | 12492 | 2508 | 14180 | 2964 |
| a(2) | 3600 | 840 | 7296 | 1728 | 8168 | 2024 |
| a(3) | 1400 | 440 | 2816 | 896 | 3072 | 1024 |
| a(4) | 320 | 160 | 640 | 320 | 672 | 352 |
| a(5) | 32 | 32 | 64 | 64 | 64 | 64 |
*chpoly: chebyshev polynomials
in comparison with traditional fir filters, the designed filters have a higher suppression level in the stopband for the same filter order. the difference is more than 30 db in favor of the newly designed filters.
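the link between the cosine-series coefficients a(i) of eq. (6) and the symmetric impulse response can be sketched as follows. the helper name `impulse_response` is hypothetical, and the mapping is an assumption based on the identity $\cos(i\omega) = (e^{ji\omega} + e^{-ji\omega})/2$: a(0) becomes the centre tap and each a(i), i ≥ 1, is split equally between the two taps at distance i from the centre. the input values are the fourth-kind, v = 2 column of table 2.

```python
def impulse_response(a, g0):
    """Map cosine-series coefficients a(0..n) to 2n+1 symmetric FIR taps."""
    n = len(a) - 1
    h = [0.0] * (2 * n + 1)
    h[n] = a[0] / g0  # centre tap
    for i in range(1, n + 1):
        # a(i) cos(i w) contributes a(i)/2 to the taps at n - i and n + i
        h[n - i] = h[n + i] = a[i] / (2.0 * g0)
    return h


# fourth-kind column of table 2 for n = 5 and v = 2
A_FOURTH_V2 = [8477, 14180, 8168, 3072, 672, 64]
G0 = sum(A_FOURTH_V2)  # G(0) is the value of the cosine series at w = 0

h = impulse_response(A_FOURTH_V2, G0)
print(G0, len(h), h[0], sum(h))
```

with these inputs the first tap equals 32/34633, which agrees with the gain factor g listed in table 5, and the taps sum to one, i.e. unit gain at zero frequency.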
figure 4 shows the linear phase response of the designed new filter. the amplitudes of the individual frequency components will not change by passing the signal through such a filter. the group delay of the designed filters is equal to the filter order n.
fig. 4 magnitude and phase responses for case n = 5 and v = 2
4.2. pole-zero plots
figure 5 graphically displays the locations of the poles and zeros of an unquantized system. this is a two-dimensional plot of the z-plane that shows the unit circle, the real and imaginary axes, and the position of the system's zeros. a location having multiple poles is marked with a number next to that location to indicate how many poles exist there.
fig. 5 pole-zero plot of the filter designed for n = 5 and v = 2
the zeros of a linear-phase filter occur in three different arrangements: a) a pair of complex-conjugate zeros which are not on the unit circle along with their reciprocals, e.g. $z_2$, $z_2^*$, $1/z_2^*$ and $1/z_2$; b) a pair of complex-conjugate zeros on the unit circle, such as $z_3$, $z_3^*$, $z_4$ and $z_4^*$; and finally, c) a real zero which is not on the unit circle along with its reciprocal, such as $z_1$ and $1/z_1$. the filter with the transmission function having zeros in the left half-plane has a smaller phase, which means smaller signal delay. the zeros do not need to be inside the unit circle to maintain stability. as shown in this figure, some zeros lie outside the unit circle. recall that zeros near the unit circle can be expected to have a strong influence on the magnitude frequency response of the filter. this example shows that after designing a filter, the effects of the coefficient quantization have to be examined because they have a direct influence on the filter zeros. if the quantized filter does not meet the target specifications, the designer needs to redesign the filter.
5.
realization and analysis of finite word length effect
the design of a digital filter involves finding the coefficients of the filter, as described in section 3. for computing the coefficients of a digital filter, infinite precision arithmetic is used (the number of digits is only limited by the available memory of the system). as is known, a common filter design tool like matlab can run the design algorithm and return the filter coefficients. after designing a digital filter, the effects of the coefficient quantization on its frequency response have to be examined and studied. if the quantized filter does not meet the target specifications, the designer needs to redesign the filter.
in a dsp system, the number of bits available for the filter coefficients is limited by the word length of the registers used to store them (registers have a fixed number of bits). quantization methods like rounding and truncating are used to quantize the filter coefficients to the word size of the register [24], [25]. the designed coefficients have to be converted or quantized to a fixed-point representation [24]. a signed fixed-point representation with b + 1 total bits including the sign is used here. notice that the filter coefficients of the novel designed fir filters are smaller than 1. because of that, a (b + 1)-bit format which reserves 1 bit for the sign, 1 bit for the integer part and b − 1 bits for the fractional part has been chosen for the coefficient representation.
since a digital filter uses a finite number of bits to represent signals and filter coefficients, a structure is needed which can retain the target filter specifications even after quantizing the coefficients. an analysis of the direct-form structure and the cascade-form structure is given below. the main difference between these realization structures is their sensitivity to using a finite number of bits.
5.1.
direct-form structure
the direct-form structure is directly obtained from the difference equation. the coefficients of the new filter according to eq. (7), for n = 6 and parameter v = 2, are listed in table 3, and for parameter v = 1 in table 4. the filter coefficients obtained by using different numbers of bits are presented in these tables.
table 3 the unquantized and quantized filter coefficients, (b + 1)-bit format, for n = 6 and v = 2
| k | h(k) unquantized | h_new(k) quantized, b = 7 | h_new(k) quantized, b = 15 | h_new(k) quantized, b = 31 |
|---|---|---|---|---|
| 0 | 0.000234721982814 | 0 | 0.000244140625000 | 0.000234722159803 |
| 1 | 0.002934024785174 | 0 | 0.002929687500000 | 0.002934024669230 |
| 2 | 0.016371858301273 | 0.015625000000000 | 0.016357421875000 | 0.016371858306229 |
| 3 | 0.054455500012836 | 0.046875000000000 | 0.054443359375000 | 0.054455500096083 |
| 4 | 0.121409945610516 | 0.125000000000000 | 0.121398925781250 | 0.121409945189953 |
| 5 | 0.192611392084735 | 0.187500000000000 | 0.192626953125000 | 0.192611391656101 |
| 6 | 0.223965114445304 | 0.218750000000000 | 0.223937988281250 | 0.223965113982558 |
table 4 the unquantized and quantized filter coefficients, (b + 1)-bit format, for n = 6 and v = 1
| k | h(k) unquantized | h_new(k) quantized, b = 7 | h_new(k) quantized, b = 15 | h_new(k) quantized, b = 31 |
|---|---|---|---|---|
| 0 | 0.001354526021715 | 0 | 0.001342773437500 | 0.001354525797069 |
| 1 | 0.008804419141146 | 0.015625000000000 | 0.008789062500000 | 0.008804419077933 |
| 2 | 0.030138203983153 | 0.031250000000000 | 0.030151367187500 | 0.030138203874230 |
| 3 | 0.070435353129167 | 0.078125000000000 | 0.070434570312500 | 0.070435353554785 |
| 4 | 0.123600499481471 | 0.125000000000000 | 0.123596191406250 | 0.123600499704480 |
| 5 | 0.170797265550594 | 0.171875000000000 | 0.170776367187500 | 0.170797265134752 |
| 6 | 0.189739465385511 | 0.187500000000000 | 0.189758300781250 | 0.189739465713501 |
obviously, there are changes in the filter coefficients and they depend on the number of bits used, so the filter frequency response will change correspondingly.
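the quantized columns of table 3 are consistent with rounding each coefficient to the nearest multiple of $2^{-(b-1)}$, i.e. b − 1 fractional bits. a minimal python sketch of that rounding is given below; the function name `quantize` is hypothetical, round-to-nearest is an assumption inferred from the tabulated values, and saturation to the representable range is deliberately left out.

```python
import math


def quantize(x, b):
    """Round x to the nearest multiple of 2**-(b-1) (b-1 fractional bits)."""
    step = 2.0 ** -(b - 1)
    return math.floor(x / step + 0.5) * step  # round half up, no saturation


# unquantized coefficients h(0)..h(6) of table 3 (n = 6, v = 2)
H_UNQ = [
    0.000234721982814,
    0.002934024785174,
    0.016371858301273,
    0.054455500012836,
    0.121409945610516,
    0.192611392084735,
    0.223965114445304,
]

for b in (7, 15):
    print(b, [quantize(x, b) for x in H_UNQ])
```

for b = 7 the two smallest coefficients fall below half a quantization step (2^-7) and are rounded to zero, which effectively shortens the filter and explains the strong stop-band deviation of the direct form.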
as examples of how the magnitude response of a filter is affected by coefficient quantization, new low-pass filters with different orders are designed and different numbers of bits are considered. normalized curves are pictured in figs. 6-9.
fig. 6 new filter curves in db: n = 5, v = 2
fig. 7 new filter curves in db: n = 6, v = 2
fig. 8 new filter curves in db: n = 5, v = 1
fig. 9 new filter curves in db: n = 6, v = 1
it can be concluded that the quantization of filter coefficients creates a deviation in the frequency response of the filter, as seen in figs. 6-9. note that if the word length b is not large enough, there will be undesirable effects. in summary, after coefficient quantization, a filter is obtained whose frequency response diverges from the frequency response of the filter with unquantized coefficients. the quantization effect is more visible and more significant in the stopband. the conclusion from this analysis and the given characteristics is that the newly designed filters realized in direct form exhibit high sensitivity to coefficient quantization.
5.2. cascade-form structure
in this part of the article, it will be shown that implementing a high-order filter as a cascade of second-order sections can significantly reduce the sensitivity to coefficient quantization. the cascade structure is obtained from the system function H(z). the idea is to decompose the target system function into a cascade of second-order fir systems. in other words, we need to find second-order systems which satisfy
$$H(z) = \sum_{k=0}^{M-1} b_k z^{-k} = \prod_{k=1}^{[M/2]} \left(b_{0k} + b_{1k} z^{-1} + b_{2k} z^{-2}\right) = \prod_{k=1}^{[M/2]} H_k(z) \qquad (8)$$
or
$$H(z) = \sum_{k=0}^{M-1} b_k z^{-k} = G \prod_{k=1}^{[M/2]} \left(b_{0k} + b_{1k} z^{-1} + b_{2k} z^{-2}\right) = G \prod_{k=1}^{[M/2]} H_k(z), \qquad (9)$$
where $b_{0k}$, $b_{1k}$, $b_{2k}$ represent filter coefficients and G represents a gain factor.
a software implementation, such as matlab code which uses the tf2sos function, is applied to convert the transfer function into the cascade form instead of tedious mathematics. the function tf2sos converts the forward and feedback path coefficients of the filter into numerator and denominator coefficients of the second-order sections. then the coefficients of the second-order sections are quantized and the frequency response of the obtained structure is compared with that of the unquantized system.
to clarify the conversion of a system function into the cascade form, as well as the effects of the chosen number of bits for quantization, a few examples given below in table 5 are reviewed. each row in table 5 gives the transfer function of one of the second-order sections. the first three numbers of each row represent the numerator of the corresponding second-order section and the second three numbers give its denominator. the tf2sos command can also give a gain factor g, which can be included in the cascade-form structure.
from the listed coefficient values it can be concluded that increasing the coefficient word length can be a viable option. also, using the tf2sos command in the form which returns a gain factor is desirable, since the section coefficients are then more resistant to quantization errors.
5.3. performance analysis notes
the main difference between the aforementioned realization structures, direct- and cascade-form, is their sensitivity to using a finite number of bits. realizations such as direct forms are very sensitive to quantization of the coefficients. however, structures with cascaded second-order sections show smaller sensitivity and are preferred. the analyzed examples show that implementing a high-order filter as a cascade of second-order sections can significantly reduce the sensitivity to the coefficient quantization.
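the difference between folding the gain into the first section and keeping it as a separate factor can be reproduced with a small stdlib-only python sketch. it does not call tf2sos; it simply multiplies the numerator polynomials of the second-order sections listed in table 5 (n = 5, v = 2), following eq. (9). the helper names `poly_mul`, `cascade` and `quantize` are hypothetical, and rounding to b − 1 fractional bits without saturation is an assumption.

```python
import math


def poly_mul(p, q):
    """Multiply two polynomials in z^-1 given as coefficient lists."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def cascade(sections, gain=1.0):
    """Coefficients of gain * prod_k (b0k + b1k z^-1 + b2k z^-2)."""
    h = [gain]
    for sec in sections:
        h = poly_mul(h, sec)
    return h


def quantize(x, b):
    """Round x to the nearest multiple of 2**-(b-1); no saturation."""
    step = 2.0 ** -(b - 1)
    return math.floor(x / step + 0.5) * step


# unquantized section numerators and gain from table 5, [sos,g] = tf2sos(b,a)
SECTIONS = [
    [1.0, 3.078163051271750, 0.999999999999806],
    [1.0, 3.487567927223607, 3.069689062680776],
    [1.0, 1.713003943968079, 1.000000000000962],
    [1.0, 1.085134395891595, 0.999999999999881],
    [1.0, 1.136130681644964, 0.325765893411380],
]
GAIN = 9.239742442179425e-04

h_ref = cascade(SECTIONS, GAIN)  # unquantized reference

# case 1: gain folded into the first section, then everything quantized (b = 7)
folded = [[GAIN * c for c in SECTIONS[0]]] + SECTIONS[1:]
h_folded = cascade([[quantize(c, 7) for c in sec] for sec in folded])

# case 2: gain kept as a separate full-precision factor, sections quantized (b = 7)
h_gain = cascade([[quantize(c, 7) for c in sec] for sec in SECTIONS], GAIN)

print(max(abs(c) for c in h_folded))
print(max(abs(x - y) for x, y in zip(h_gain, h_ref)))
```

in the first case the small coefficients of the scaled first section round to zero, so the whole cascade collapses to zero, as noted in table 5; in the second case the unit-leading-coefficient sections survive 7-bit rounding and the overall response stays close to the reference.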
table 5 the second-order sections and their coefficients for n = 5 and v = 2
(each row lists the numerator coefficients b0, b1, b2 followed by the denominator coefficients a0, a1, a2 of one second-order section)

function: sos = tf2sos(b,a);
unquantized coefficients
| b0 | b1 | b2 | a0 | a1 | a2 |
|---|---|---|---|---|---|
| 0.000923974244218 | 0.002844143378878 | 0.000923974244218 | 1.000000000000000 | 0 | 0 |
| 1.000000000000000 | 3.487567927223607 | 3.069689062680776 | 1.000000000000000 | 0 | 0 |
| 1.000000000000000 | 1.713003943968079 | 1.000000000000962 | 1.000000000000000 | 0 | 0 |
| 1.000000000000000 | 1.085134395891595 | 0.999999999999881 | 1.000000000000000 | 0 | 0 |
| 1.000000000000000 | 1.136130681644964 | 0.325765893411380 | 1.000000000000000 | 0 | 0 |
quantized coefficients, b = 7
| b0 | b1 | b2 | a0 | a1 | a2 |
|---|---|---|---|---|---|
| 0 | 0 | 0 | 1.000000000000000 | 0 | 0 |
| 1.000000000000000 | 3.484375000000000 | 3.062500000000000 | 1.000000000000000 | 0 | 0 |
| 1.000000000000000 | 1.718750000000000 | 1.000000000000000 | 1.000000000000000 | 0 | 0 |
| 1.000000000000000 | 1.078125000000000 | 1.000000000000000 | 1.000000000000000 | 0 | 0 |
| 1.000000000000000 | 1.140625000000000 | 0.328125000000000 | 1.000000000000000 | 0 | 0 |
note: impossible to get the quantized transfer function, it is equal to zero!
quantized coefficients, b = 15
| b0 | b1 | b2 | a0 | a1 | a2 |
|---|---|---|---|---|---|
| 0.000915527343750 | 0.002868652343750 | 0.000915527343750 | 1.000000000000000 | 0 | 0 |
| 1.000000000000000 | 3.487548828125000 | 3.069702148437500 | 1.000000000000000 | 0 | 0 |
| 1.000000000000000 | 1.713012695312500 | 1.000000000000000 | 1.000000000000000 | 0 | 0 |
| 1.000000000000000 | 1.085144042968750 | 1.000000000000000 | 1.000000000000000 | 0 | 0 |
| 1.000000000000000 | 1.136108398437500 | 0.325744628906250 | 1.000000000000000 | 0 | 0 |
note: the two graphs show a very good agreement!

function: [sos,g] = tf2sos(b,a);
unquantized coefficients
| b0 | b1 | b2 | a0 | a1 | a2 |
|---|---|---|---|---|---|
| 1.000000000000000 | 3.078163051271750 | 0.999999999999806 | 1.000000000000000 | 0 | 0 |
| 1.000000000000000 | 3.487567927223607 | 3.069689062680776 | 1.000000000000000 | 0 | 0 |
| 1.000000000000000 | 1.713003943968079 | 1.000000000000962 | 1.000000000000000 | 0 | 0 |
| 1.000000000000000 | 1.085134395891595 | 0.999999999999881 | 1.000000000000000 | 0 | 0 |
| 1.000000000000000 | 1.136130681644964 | 0.325765893411380 | 1.000000000000000 | 0 | 0 |
g = 9.239742442179425e-04
quantized coefficients, b = 7
| b0 | b1 | b2 | a0 | a1 | a2 |
|---|---|---|---|---|---|
| 1.000000000000000 | 3.078125000000000 | 1.000000000000000 | 1.000000000000000 | 0 | 0 |
| 1.000000000000000 | 3.484375000000000 | 3.062500000000000 | 1.000000000000000 | 0 | 0 |
| 1.000000000000000 | 1.718750000000000 | 1.000000000000000 | 1.000000000000000 | 0 | 0 |
| 1.000000000000000 | 1.078125000000000 | 1.000000000000000 | 1.000000000000000 | 0 | 0 |
| 1.000000000000000 | 1.140625000000000 | 0.328125000000000 | 1.000000000000000 | 0 | 0 |
g = 9.239742442179425e-04
note: the two graphs show a very good agreement!
quantized coefficients, b = 15
| b0 | b1 | b2 | a0 | a1 | a2 |
|---|---|---|---|---|---|
| 1.000000000000000 | 3.078186035156250 | 1.000000000000000 | 1.000000000000000 | 0 | 0 |
| 1.000000000000000 | 3.487548828125000 | 3.069702148437500 | 1.000000000000000 | 0 | 0 |
| 1.000000000000000 | 1.713012695312500 | 1.000000000000000 | 1.000000000000000 | 0 | 0 |
| 1.000000000000000 | 1.085144042968750 | 1.000000000000000 | 1.000000000000000 | 0 | 0 |
| 1.000000000000000 | 1.136108398437500 | 0.325744628906250 | 1.000000000000000 | 0 | 0 |
g = 9.239742442179425e-04
note: the two graphs are barely distinguishable from each other!

6. concluding remarks
the presented approach for filter design which relies on the fourth-kind chebyshev polynomials is computationally very simple. the chebyshev polynomials can be used here to produce a set of low-pass non-recursive filter coefficients. filter characteristics have been fully demonstrated through numerical examples to illustrate the efficiency and accuracy of the design approach. when implementing a digital filter in the real world, a finite number of bits to represent each coefficient has to be used.
the frequency response of the quantized filter might be quite different from that of the original design. filter stability based on quantization effect on filter characteristics has been also analyzed on several examples. in practice, it is reasonable to do coefficient quantization of a single filter and measure the effect on its frequency response. implementing a higher-order filter requires a higher number of bits for coefficient representation, and a cascade of second-order sections used as implementation algorithm can significantly reduce the sensitivity to the coefficient quantization. acknowledgment: the ministry of education, science and technological development of the republic of serbia has supported this research. references [1] b. a. shenoi, introduction to digital signal processing and filter design. new jersey: john wiley & sons, 2006. [2] s. k. mitra, digital signal processing: a computer-based approach. new york: mcgraw-hill education, 2011. [3] v. d. pavlović, n. s. dončov and d. g. ćirić, "1d and 2d economical fir filters generated by chebyshev polynomials of the first kind", int. j. electron., vol. 100, no. 11, pp. 1592–1619, march 2013. [4] b. p. stošić and v. d. pavlović, "design of new selective cic filter functions with passband-droop compensation", iet electron. lett., vol. 52, no. 2, pp. 115–117, january 2016. [5] g. jovanović doleček and c. j. s. cruz, "improving design of comb decimation filters using symmetrical polynomials", in proceedings of the 2019 ieee international fall meeting on communications and computing (roc&c), 2019, pp. 9–12. [6] g. jovanović doleček , "improving magnitude response of comb two-stage structure using simple multiplierless filters", in proceedings of the 2019 ieee 31st international conference on microelectronics (miel), serbia, niš, 2019, pp. 223–226. [7] g. jovanović doleček , "exploring three classes of symmetrical polynomials for improving comb filter design", in proceedings of the aip conference, vol. 
2116, no. 1, pp. 450036-1–450036-4, 2019. [8] a. dudarin, g. molnar and m. vucic, "optimum multiplierless compensators for sharpened cascadedintegrator-comb decimation filters", electron. lett., vol. 54, no. 16, pp. 971–972, august 2018. [9] g. molnar, a. dudarin and m. vucic, "design of multiplierless cic compensators based on maximum passband deviation", in proceedings of the 2017 40th international convention on information and communication technology, electronics and microelectronics (mipro), opatija, croatia, 2017, pp. 119-124. [10] g. jovanović doleček , "multiplierless wideband and narrowband cic compensator for sdr application", int. j. commun. netw. syst. sci., vol. 10, no. 8b, august 2017. [11] g. jovanović doleček , "design of compensators for comb decimation filters", in encyclopedia of information science and technology, fourth edition, edited by mehdi khosrow-pour, igi global, 2018, pp. 6043-6056. [12] g. jovanović doleček , "improving magnitude response of comb two-stage structure using simple multiplierless filters", in proceedings of the 2019 ieee 31st international conference on microelectronics (miel), nis, serbia, 2019, pp. 223-226. https://digital-library.theiet.org/search;jsessionid=1qp0art4j9aao.x-iet-live-01?value1=&option1=all&value2=a.+dudarin&option2=author https://digital-library.theiet.org/search;jsessionid=1qp0art4j9aao.x-iet-live-01?value1=&option1=all&value2=g.+molnar&option2=author https://digital-library.theiet.org/search;jsessionid=1qp0art4j9aao.x-iet-live-01?value1=&option1=all&value2=m.+vucic&option2=author coefficient quantization effects on filters based on chebyshev fourth-kind polynomials 305 [13] g. jovanović doleček and c. j. s. cruz, "decimation structures for power of three decimation factors for consumer devices," in proceedings of the 2019 ieee 23rd international symposium on consumer technologies (isct), ancona, italy, 2019, pp. 181-185. [14] g. jovanović doleček and c. j. s. 
cruz, "improving design of comb decimation filters using symmetrical polynomials", in proceedings of the 2019 ieee international fall meeting on communications and computing (roc&c), acapulco, mexico, 2019, pp. 9-12. [15] g. jovanović doleček , "design of multiplierless comb compensators with magnitude response synthesized as sinewave functions", fu elec energ, vol. 33, no. 1, pp. 1-14, march 2020. [16] g. jovanović doleček and j. m. de la rosa, "design of wideband comb compensator based on magnitude response using two sinusoidals and particle swarm optimization", aeu-int. j. electron. commun., vol. 130, p. 153570, december 2020. [17] g. jovanović doleček , l. camuñas-mesa and j. m. de la rosa, "low order wideband multiplierless comb compensator", in proceedings of the 2020 ieee 63rd international midwest symposium on circuits and systems (mwscas), springfield, ma, usa, 2020, pp. 162-165. [18] j. c. mason and d. c. handscomb, chebyshev polynomials. chapman and hall/crc, 2002. [19] j. o. coleman, "chebyshev stopbands for cic decimation filters and cic-implemented array tapers in 1d and 2d", ieee trans. circ. syst. i: reg. papers, vol. 59, no. 12, pp. 2956–2968, december 2012. [20] j. o. coleman, "integer-coefficient fir filter sharpening for equiripple stopbands and maximally flat passbands", in proceedings of the 2014 ieee international symposium on circuits and systems (iscas), melbourne vic, australia, 2014, pp. 1604–1607. [21] m. abramowitz and i. a. stegun, handbook of mathematical functions with formulas, graphs, and mathematical tables. usa, national bureau of standars, applied mathematics series, 1972. [22] b. p. stošić and v. d. pavlović, "chebyshev polynomials of the second kind in filter design", in proceedings of the 2017 13th international conference on advanced technologies, systems and services in telecommunications (telsiks), serbia, niš, 2017, pp. 191–194. [23] b. p. stošić and v. d. 
pavlović, "chebyshev recursion in design of linear phase low-pass fir filter with equiripple stop-band", proceedings of the romanian academy, series a, vol. 20, no. 3/2019, pp. 267–273, november 2019. [24] a.v. oppenheim, r.w. schafer, discrete-time signal processing. vol. 3, prentice hall englewood cliffs, nj, 2010. [25] b. p. lathi, r. a. green, essentials of digital signal processing, cambridge university press, 2014.
facta universitatis series: electronics and energetics vol. 32, no. 4, december 2019, pp. 571-579 https://doi.org/10.2298/fuee1904571r © 2019 by university of niš, serbia | creative commons license: cc by-nc-nd
compact left-handed dual-band filters based on shunted stub resonators
vasa radonić, vesna crnojević-bengin
university of novi sad, biosense institute – research and development institute for information technologies in biosystems, novi sad, serbia
abstract. in this paper, a super-compact microstrip dual-band resonator is presented, designed using the superposition of two simple left-handed (lh) resonators with a single shunt stub. the proposed resonator exhibits spurious response in a wide frequency range and therefore allows construction of dual-band filters using the superposition principle. the equivalent circuit model of the proposed resonator is created and the influence of different geometrical parameters on the performance of the resonator is analyzed in detail. as examples, two dual-band filters that operate simultaneously at the wimax frequency bands are designed.
key words: left-handed, metamaterials, microstrip resonator, dual-band resonator, dual-band filter
1. introduction
increasing demands for higher capacity of various communication systems, together with the unavailability of certain portions of the frequency spectrum, have created a great need for multi-band passive filters that can simultaneously operate at several non-harmonically related frequencies. 
the design of compact filters that operate at two independent frequencies and exhibit independently controlled bandwidths still remains a challenge. different dual-band filter topologies have been recently proposed based on different synthesis techniques such as dual-band frequency transformation [1], filters based on dual-mode resonators [2], stepped impedance resonators [3], etc. double-negative or left-handed (lh) media, which simultaneously exhibit negative values of permittivity and permeability in a certain frequency range, have attracted significant attention in the last decade [4-5]. artificial left-handed transmission lines (lh tl) allow the design of miniature passive microwave devices such as filters, antennas or directional couplers [4-8]. in [8], lh tl are implemented in microstrip architecture using unit cells comprised of one interdigital capacitor and one grounded inductive stub.
received february 25, 2019; received in revised form april 16, 2019 corresponding author: vasa radonić, university of novi sad, biosense institute – research and development institute for information technologies in biosystems, zorana đinđića 1, 21101 novi sad, serbia (e-mail: vasarad@uns.ac.rs)
although these elements provide the lh contribution to the structure, the whole structure is in fact a composite right/left handed (crlh) transmission line, due to the inherent parasitic series inductance and parallel capacitance of the microstrip line. balanced crlh lines show the unique property of achieving zero propagation constant at a nonzero frequency. by utilizing this feature, a resonator called the zeroth order resonator (zor) was designed in [9]. the zor is implemented as a microstrip line with two interdigital capacitors and one stub inductor connected to the ground plane. its resonance is independent of the length of the structure and depends only on the reactive loading. 
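the zero propagation constant at a nonzero frequency can be illustrated with the textbook dispersion relation of a lossless crlh unit cell, cos(βp) = 1 − ½(ωL_R − 1/(ωC_L))(ωC_R − 1/(ωL_L)). the sketch below is not taken from the paper: the element names and the 1 nh / 1 pf values are hypothetical and were chosen only so that the cell is balanced (L_R·C_L = L_L·C_R).

```python
import math

def cos_beta_p(freq_hz, l_r, c_l, l_l, c_r):
    """cos(beta*p) of a lossless CRLH unit cell (Bloch dispersion relation)."""
    w = 2.0 * math.pi * freq_hz
    series_reactance = w * l_r - 1.0 / (w * c_l)    # ohms
    shunt_susceptance = w * c_r - 1.0 / (w * l_l)   # siemens
    return 1.0 - 0.5 * series_reactance * shunt_susceptance

def phase_per_cell(freq_hz, l_r, c_l, l_l, c_r):
    """|beta*p| in radians, or None when the frequency falls in a stopband."""
    c = cos_beta_p(freq_hz, l_r, c_l, l_l, c_r)
    if abs(c) > 1.0:
        return None
    return math.acos(c)

# Hypothetical balanced cell: L_R*C_L == L_L*C_R (values are assumptions).
L_R, C_L, L_L, C_R = 1e-9, 1e-12, 1e-9, 1e-12
f0 = 1.0 / (2.0 * math.pi * math.sqrt(L_L * C_R))  # transition frequency

for f in (0.1 * f0, 0.8 * f0, f0, 1.2 * f0):
    phase = phase_per_cell(f, L_R, C_L, L_L, C_R)
    print(f"{f / 1e9:6.3f} GHz -> |beta*p| = {phase}")
```

at the transition frequency both the series reactance and the shunt susceptance vanish, so the phase shift per cell is zero regardless of how many cells are cascaded, which is why the resonance does not depend on the physical length.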
the performance of the zor is comparable to that of the conventional half-wavelength microstrip resonator on the same substrate. however, when tuned to the same resonant frequency, the overall length of the zor is smaller and equal to λg/5, where λg is the guided wavelength. in this paper, we propose a much simpler version of the crlh resonator, called the single shunt stub resonator (sssr), where the interdigital capacitor is replaced by a simple gap capacitor. when tuned to the same resonant frequency as the zor, with the same fractional bandwidth and insertion loss, the sssr exhibits more than three times smaller dimensions, as well as significantly better out-of-band performance. using the proposed sssrs, compact dual-band resonators are formed using the superposition method. the influence of different geometrical parameters on the performance and the tuning range of the proposed dual-band resonator is analyzed in detail. the analysis shows that the characteristics of each passband can be independently controlled. such dual-band resonators are of great importance for modern wireless systems, where multi-band operation at non-harmonically related frequencies is greatly needed. to demonstrate the applicability of the proposed resonator, two second-order dual-band filters that operate according to the ieee wimax 802.16 standard are designed and compared with different dual-band filters recently proposed in the literature in terms of operating frequency, insertion losses and overall size.
2. single shunt stub resonator
end-coupled microstrip resonators employ gaps that are, in fact, series capacitances and can be used to obtain lh behaviour. consequently, an lh resonator can be designed without the interdigital capacitors used in the zor, which significantly reduces its fabrication complexity. in fig. 
1a, the proposed single shunt stub resonator (sssr) is shown together with relevant dimensions, where g denotes the gap between the microstrip line and the resonator, l is the overall length of the resonator, w is the width of the 50 Ω line, and ls and ws are the length and width of the inductive stub, respectively. to allow a fair comparison, the gap g and the dimensions of the inductive stub ls and ws were optimized to achieve the same resonant frequency, the same fractional bandwidth and the same insertion loss at resonance as the zor with dimensions reported in [9]. the same substrate was used for both circuits, with thickness equal to 1.575 mm, dielectric constant εr=2.17 and dissipation factor 0.0009. fig. 2 shows the photograph of the fabricated resonator and a comparison of the sssr and zor responses. the overall length of the zor is equal to 24.4 mm, while the proposed sssr is only 7.5 mm long, i.e. more than three times shorter than the zor. the simulated responses of both structures are compared in fig. 2, where it can be seen that the sssr exhibits its second harmonic at 7.69 ghz, thus creating a wider stopband with insertion loss higher than 35 db. this stopband presents a novel feature of the sssr, not existing in the zor, which exhibits its second harmonic at approximately two times the resonant frequency, i.e. at 3.81 ghz. a measured response of the sssr is also shown in fig. 2, illustrating good agreement with the simulated one. furthermore, the zor is designed with interdigital spacing equal to 100 μm, so it cannot be coupled more strongly to the feed lines (i.e. the insertion loss at resonance and the fractional bandwidth are the best achievable in standard pcb technology). for this comparison, the sssr was designed using gaps equal to 900 μm. therefore, by decreasing the gaps, significantly stronger coupling to the feed lines and smaller insertion loss can be easily obtained. 
the proposed resonator can be modelled using a standard equivalent circuit of a crlh tl, shown in fig. 1b, which is basically equivalent to the circuit proposed in [10] used for modelling of dual-band crlh resonators. the inductive stub is modelled by the parallel resonant circuit with inductance ls and capacitance cs, while the inductance of the via is included in ls. to better understand the behaviour of the sssr, the values of the elements of the equivalent circuit have been extracted for the lossless case using standard formulas for microstrip elements calculation. values obtained for sssr and zor are compared in table 1. as expected, it can be seen that the sssr exhibits much lower series capacitance than the zor, but significantly increased shunt inductance. in both cases, the shunt inductance ls is the sum of the stub inductance and the via inductance. while the original zor design uses a via with diameter equal to 0.4 mm, the via used in the optimized sssr has a diameter of 0.5 mm for fabrication reasons. this results in approximately 10% lower via inductance, so a slightly longer (1%) stub should be used. however, resonators with unchanged performance could be designed by using vias with a smaller diameter (for example 0.1 mm) and shorter stub lengths. this illustrates well the impact that via dimensions have on the performance of all crlh-based resonators. to validate the parameter extraction procedure, the responses of the electromagnetic simulation and the electric model of the sssr are compared in fig. 3. it can be seen that the electrical model describes the em behaviour very accurately and in a wide frequency range, up to the second harmonic.
fig. 1 single shunt stub resonator (sssr): (a) layout with optimized dimensions, (b) equivalent circuit of the proposed sssr resonator.
fig. 2 comparison of the simulated responses of zor and sssr, and the measured sssr response. photograph of fabricated circuit is shown as inset. 
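as a rough plausibility check of the extracted shunt-branch values listed in table 1, the resonance of the parallel ls-cs circuit can be computed directly. this is only a back-of-the-envelope sketch: it ignores the series elements cg and lr, so it is not the extraction procedure used by the authors, and the function and dictionary names are made up for illustration.

```python
import math

def parallel_lc_resonance_hz(inductance_h, capacitance_f):
    """Resonant frequency of an ideal parallel LC circuit."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))

# Shunt-branch values as listed in table 1 (nH and pF converted to SI units).
SHUNT_BRANCH = {
    "sssr": {"ls": 8.24e-9, "cs": 0.81e-12},
    "zor": {"ls": 3.0e-9, "cs": 2.0e-12},
}

shunt_resonance_hz = {
    name: parallel_lc_resonance_hz(p["ls"], p["cs"])
    for name, p in SHUNT_BRANCH.items()
}
for name, f in shunt_resonance_hz.items():
    print(f"{name}: shunt ls-cs resonance ~ {f / 1e9:.2f} GHz")
```

both results land close to 2 ghz, which is roughly half of the 3.81 ghz second harmonic quoted for the zor; since the series elements are neglected, exact agreement with the simulated resonance should not be expected.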
table 1 extracted equivalent circuit parameters for sssr and zor
        cg [pf]   lr [nh]   cs [pf]   ls [nh]
sssr    0.0564    0.874     0.81      8.24
zor     0.519     2.58      2         3
fig. 3 comparison of responses obtained from full-wave electromagnetic simulation (em) and the equivalent circuit model (el) of sssr.
3. dual-band resonator
due to the small size of the proposed structure, compact dual-band resonators can be designed using superposition of two sssrs tuned to different frequencies and placed at a distance s, fig. 4a. the equivalent circuit of the proposed dual-band resonator is shown in fig. 4b. it consists of electric models of two sssrs, capacitively coupled through cc. an additional coupling between the feed lines is included in the model by cx. although very small (cx<1nf), this additional coupling is responsible for the pole located around 1 ghz in the transmission characteristic of the dual-band resonator. to illustrate the applicability of the proposed dual-band resonator, the dimensions of the resonator have been optimized to obtain center frequencies of the passbands equal to 2.4 ghz and 3.5 ghz, i.e. the frequencies of the ieee wimax 802.16 standard. the following dimensions of the resonators have been obtained: l=3.5 mm, g=0.1 mm, w1=1.4 mm, w2=2.2 mm, ws1=ws2=0.5 mm, ls1=9.6 mm, ls2=14.3 mm, s=1.6 mm, via dimensions equal to 0.15 × 0.15 mm. the simulated response of the proposed dual-band resonator is shown in fig. 5, together with the measured results. the proposed dual-band resonator is fabricated in standard pcb technology on a 1.575 mm thick taconic substrate with dielectric constant εr=2.17. layout of the fabricated circuit is shown as an inset in fig. 5. a very good agreement is obtained. the measured insertion losses are equal to -2.68 db and -2.61 db, respectively. 
the 3 db fractional bandwidths are somewhat larger than in the simulated case, especially in the second passband, and are equal to 1.91 % and 2.38 %, respectively, and the spurious response is located at approximately four times the resonant frequency. the overall width of the resonator is only 3.5 mm, i.e. λg/26, where λg is the guided wavelength. the length of the resonator, currently equal to λg/3, can be significantly reduced by meandering the inductive stubs.
fig. 4 a) layout of the proposed dual-band resonator, and b) equivalent circuit model of the proposed dual-band resonator.
fig. 5 measured and simulated responses of the proposed dual-band resonator. photograph of the fabricated resonator is shown as inset.
4. parameters analysis
the response of the proposed dual-band resonator, i.e. its two resonant frequencies, the corresponding insertion losses and the bandwidths, can be independently controlled by changing specific geometrical parameters of the individual sssrs. the influence of different geometrical parameters of the dual-band resonator on its performance is analysed in fig. 6. simulation results for several different distances between the two sssrs, s, are compared in fig. 6a, while the influence of the stub length on the resonances is shown in fig. 6c. it can be seen that the coupling between the two sssrs can be neglected in most cases. only for very small values of s is the performance of the dual-band resonator slightly affected by mutual coupling between the two sssrs. reducing s to the minimal value achievable in standard pcb technology, equal to 0.1 mm, results in shifts of the resonant frequencies of only 6%. it is interesting to observe how the total length of the structure l influences the performance, fig. 6b. for example, reducing l from 7.5 to 3.5 mm, with all other dimensions fixed, increases the resonant frequency by more than 21%. 
the limiting situation is also analysed, when the dual-band resonator consists only of two grounded stubs, i.e. l=ws1=ws2=0.5mm. the dual-band behaviour is preserved, while the selectivity of the peaks is influenced by increased coupling to the feed lines. obviously, if two identical sssrs are used, only one resonant peak arises. therefore, several mechanisms can be used for tuning the resonant frequencies and the bandwidths. small changes in the dimensions of the stub, ls and ws, result in large changes of the resonant frequency and the bandwidths. for example, by reducing the stub length from 18 mm to 12 mm, the resonant frequency increases by more than 22%, due to decreased stub inductance, fig. 6c. the width of the stub has a slight influence on the resonant frequency but a dominant influence on the width of the resonant peak.
fig. 6 influence of the dimensions on the performance of the proposed dual-band resonator: a) the coupling between two sssrs, b) length of the resonator, l, c) length of the stub, ls1, and d) width of the stub, ws1.
consequently, the resonant frequencies and the bandwidths can be independently set with the proposed optimization of all parameters.
5. dual-band filters
in order to demonstrate the applicability of the proposed dual-band resonators, two dual-band passband filters of the second order have been designed. the layout of the proposed dual-band filter with all relevant dimensions is shown in fig. 7. in order to obtain proper coupling between resonators in each band, i.e. independent control of the passband characteristics, the feeding lines are slightly modified. one additional segment has been added that allows independent control of the coupling lengths d1 and d2. filter1 has been designed to operate at 2.4 and 3.5 ghz with 3 db fractional bandwidths of 1.2% and 4.7 %, respectively, and return losses better than -12 db in both bands. 
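the 3 db fractional bandwidths quoted for each passband can be read off a sampled transmission response. a minimal sketch follows, assuming the band edges are the points 3 db below the passband peak and that the center frequency is the arithmetic mean of the two edges (the paper does not state which definition it uses); the synthetic single-resonance data and all names are illustrative only.

```python
import math

def fractional_bandwidth_3db(freqs_hz, s21_db):
    """Return (center_hz, fbw) for the passband around the highest |S21| sample.

    Band edges are where the response drops 3 dB below the peak, found by
    linear interpolation; the center is taken as their arithmetic mean.
    """
    n = len(freqs_hz)
    peak = max(range(n), key=lambda i: s21_db[i])
    level = s21_db[peak] - 3.0

    lo = peak
    while lo > 0 and s21_db[lo - 1] >= level:
        lo -= 1
    hi = peak
    while hi < n - 1 and s21_db[hi + 1] >= level:
        hi += 1
    if lo == 0 or hi == n - 1:
        raise ValueError("3 dB band edge lies outside the sampled range")

    def edge(i_inside, i_outside):
        # Linear interpolation between the last sample above the 3 dB level
        # and the first sample below it.
        f_in, f_out = freqs_hz[i_inside], freqs_hz[i_outside]
        y_in, y_out = s21_db[i_inside], s21_db[i_outside]
        return f_in + (level - y_in) * (f_out - f_in) / (y_out - y_in)

    f_low, f_high = edge(lo, lo - 1), edge(hi, hi + 1)
    center = 0.5 * (f_low + f_high)
    return center, (f_high - f_low) / center

# Synthetic single-resonance response (assumed shape, not data from the paper).
F0, Q = 2.4e9, 50.0
freqs = [2.3e9 + i * 1e6 for i in range(201)]
s21 = [-10.0 * math.log10(1.0 + (2.0 * Q * (f - F0) / F0) ** 2) for f in freqs]
center_hz, fbw = fractional_bandwidth_3db(freqs, s21)
print(f"center = {center_hz / 1e9:.3f} GHz, 3 dB fractional bandwidth = {100 * fbw:.2f} %")
```

for a dual-band response the same function can be applied to each passband separately by slicing the frequency axis around the band of interest.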
the geometrical parameters of the filter have been determined as follows: l=3.5 mm, g=0.1 mm, w1=w2=2.2 mm, ws1=0.5 mm, ws2=0.9 mm, ls1=6.7 mm, ls2=13.5 mm, s=1 mm, d1=0.75 mm, d2=0.55 mm, and via dimensions are equal to 0.2 mm. the response of filter1 is shown in fig. 8a. the simulated insertion losses are -2.5 db and -1 db, respectively. filter2 has been designed to operate at 3.5 and 5.8 ghz with 3 db fractional bandwidths of 3.6 % and 8%, respectively, and return losses better than -12 db in both bands. the optimized geometrical parameters of the filter are: l=3.5 mm, g=0.1 mm, w1=2.2 mm, w2=1.9 mm, ws1=0.5 mm, ws2=0.1 mm, ls1=6.7 mm, ls2=1.5 mm, s=0.9 mm, d1=0.8 mm, d2=0.75 mm, and via dimensions are equal to 0.2 mm. the simulated response of filter2 is shown in fig. 8b. the proposed filter2 shows small insertion losses of -1.3 db and -1.53 db, respectively. the size of filter2 is 7.9 mm × 13.2 mm, i.e. 0.13λg × 0.21λg, where λg is the guided wavelength on the given substrate at 3.5 ghz. both proposed filters exhibit excellent characteristics, high selectivity, and compact dimensions.
fig. 7 layout of the second order dual-band passband filter based on sssrs with all relevant dimensions.
fig. 8 response of the second order dual-band passband filters: a) filter 1, and b) filter 2
the proposed filters are compared with recently published microstrip dual-band bandpass filters based on different configurations, [11]-[16]. the characteristic parameters for these filters are summarized in table 2, where fc1 and fc2 denote the central frequencies, and il1 and il2 are the insertion losses, in the first and second band, respectively. the size of all filters is expressed in guided wavelengths, calculated at the first resonant frequency. the proposed filters have comparable in-band characteristics and they are more compact than most of the other filters published so far. 
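the normalization of a physical footprint to the guided wavelength can be sketched as follows. the effective permittivity is estimated with the common quasi-static hammerstad approximation, and the width-to-height ratio of the 50 Ω line is an assumed value (the text does not give the line width), so the result is only expected to agree approximately with the 0.13λg × 0.21λg quoted for filter2.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s

def eps_eff_microstrip(eps_r, w_over_h):
    """Quasi-static effective permittivity (Hammerstad form, intended for w/h >= 1)."""
    return (eps_r + 1.0) / 2.0 + (eps_r - 1.0) / 2.0 / math.sqrt(1.0 + 12.0 / w_over_h)

def guided_wavelength_mm(freq_hz, eps_eff):
    """Guided wavelength in millimetres for a quasi-TEM line."""
    return C0 / (freq_hz * math.sqrt(eps_eff)) * 1e3

def electrical_size(width_mm, length_mm, freq_hz, eps_eff):
    """Return (width, length, area) normalized to lambda_g and lambda_g^2."""
    lam = guided_wavelength_mm(freq_hz, eps_eff)
    return width_mm / lam, length_mm / lam, (width_mm * length_mm) / lam ** 2

EPS_R = 2.17     # substrate permittivity stated in the text
W_OVER_H = 3.1   # assumed ratio for a 50-ohm line; not given in the text

eps_eff = eps_eff_microstrip(EPS_R, W_OVER_H)
w_lam, l_lam, area_lam2 = electrical_size(7.9, 13.2, 3.5e9, eps_eff)
print(f"eps_eff = {eps_eff:.3f}")
print(f"filter2: {w_lam:.3f} x {l_lam:.3f} lambda_g, area = {area_lam2:.4f} lambda_g^2")
```

the same helper can be used to put footprints reported in millimetres on the common λg² scale used in table 2, provided the substrate and reference frequency of each design are known.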
although one may argue that the filter published in [14] has a smaller size, it should be noted that this filter has higher insertion losses. the filter published in [15] outperforms all other filters in terms of insertion losses but its footprint is nearly twice as large as that of the proposed filter 2. additional miniaturization of the proposed filters can be achieved by bending the inductive stubs. therefore, the proposed topology presents a good candidate for the design of high-performance dual-band filters since it unifies the requirements for good in-band characteristics and compactness.
table 2 comparison of the characteristics of the proposed filters and other recently published dual-band passband filters
reference   fc1 [ghz]   fc2 [ghz]   il1 [db]   il2 [db]   size [λg²]
[11]        2.45        5.25        -2         -2.33      0.245
[12]        2.09        2.62        -1.56      -1.45      0.0377
[13]        2.45        5.8         -2.12      -2.33      0.2346
[14]        2.4         5.2         -1.66      -2.25      0.0143
[15]        2.35        5.68        -0.35      -0.25      0.0452
[16]        2.41        3.52        -2         -2.2       0.0679
filter 1    2.45        3.5         -2.5       -1         0.052
filter 2    3.5         5.8         -1.3       -1.53      0.0273
6. conclusion
in this paper, we have proposed a novel compact dual-band resonator realized using superposition of two single shunt stub resonators. the proposed resonator exhibits spurious response in a wide frequency range and therefore allows construction of dual-band filters using the superposition principle. the circuit model and design procedures of the dual-band resonators have been analysed in detail and verified through two fabricated prototypes. as examples, two dual-band bandpass filters that operate simultaneously at the wimax frequency bands are designed. the designed filter1 is characterized by center operating frequencies of 2.4 ghz and 3.5 ghz, with return losses better than 12 db in both bands, and 3 db fractional bandwidths equal to 1.2% and 4.7%, respectively. 
filter2 is characterized by central frequencies of 3.5 ghz and 5.8 ghz with fractional bandwidths of 3.6% and 8%, respectively, and return losses better than 12 db in both bands. the proposed filters exhibit excellent in-band characteristics, spurious response, high selectivity, and compact dimensions.
acknowledgement: this work has been supported by ministry of education, science and technological development, republic of serbia within the project iii 44006 development of new information and communication technologies, based on advanced mathematical methods, with applications in medicine, telecommunications, power systems, protection of national heritage and education.
references
[1] p. ma, b. wei, j. hong, z. xu, x. guo, b. cao, and l. jiang, "a design method of multimode multiband bandpass filters," ieee trans. microw. theory tech., vol. 66, no. 6, pp. 2791–2799, mar. 2018.
[2] j.-x. chen, j. li and j. shi, "miniaturized dual-band differential filter using dual-mode dielectric resonator," ieee microw. wirel. comp. lett., vol. 28, no. 8, pp. 657–659, aug. 2018.
[3] h. liu, y. song, b. ren, p. wen, x. guan, h. xu, "balanced tri-band bandpass filter design using octosection stepped-impedance ring resonator with open stubs," ieee microw. wirel. comp. lett., vol. 27, no. 10, pp. 912–914, oct. 2017.
[4] f. capolino, applications of metamaterials. boca raton: crc press, taylor & francis group, 2009.
[5] a. lai, t. itoh and c. caloz, "composite right/left-handed transmission line metamaterials," ieee microw. magazine, vol. 5, no. 3, pp. 34–50, sept. 2004.
[6] c. caloz, h. okabe, t. iwai, and t. itoh, “transmission line approach of left-handed (lh) materials,” in proc. usnc/ursi national radio science meeting, san antonio, tx, june 2002, vol. 1, p. 39.
[7] g.v. eleftheriades, o. siddiqui, and a.k. iyer, “transmission line models for negative refractive index media and associated implementations without excess resonators,” ieee microwave wireless compon. 
lett., vol. 13, pp. 51–53, feb. 2003 [8] caloz and t. itoh, “electromagnetic metamaterials: transmission line theory and microwave applications,” john wiley & sons, inc, 2006. [9] a. sanada, c. caloz, and t. itoh, “zeroth order resonance in composite right/left-handed transmission line resonators,” in proc. of the asia-pacific microwave conf., seoul, korea, 2003, vol. 3, pp. 1588–1592. [10] g. sisó, m. gil, j. bonache and f. martín, “generalized model for multiband metamaterial transmission lines,” ieee microwave and wireless components letters, vol. 18, no. 11, pp. 728–730, november 2008. [11] c.-h. lee, c.-i. g. hsu, and c.-c. hsu, “balanced dual-band bpf with stub-loaded sirs for commonmode suppression,” ieee microwave and wireless components letters, vol. 20, no. 2, pp. 70–72, february, 2010. [12] x.b. wei, g.t. yue, j.x liao, p. wang, z. q. xu, y. shi, “compact dual-band bandpass filter with ultrawide upper-stopband,” electron letters, vol. 49, pp. 708–709, 2013. [13] m. jiang, l.m. chang, a. chin, “design of dual-passband microstrip bandpass filters with multi-spurious suppression,” ieee microwave and wireless components letters, vol. 20, pp. 199–201, 2010. [14] j. luo, c. liao, h. zhou, x. xiong, “a novel miniature dual-band bandpass filter based on the first and second resonances for 2.4/ 5.2 ghz wlan application,” microwave and optical technology letters, vol. 57, pp. 1143–1146, 2015. [15] m. r. salehi, e. abiri, l. noori, “design of a microstrip dual-band bandpass filter with compact size and tunable resonance frequencies for wlan applications,” international journal of computer and electrical engineering, vol. 6, pp. 248–251, 2014 [16] k. song, f. zhang, y. fan, “miniaturized dual-band bandpass filter with good frequency selectivity using sir and dgs,” international journal of electronics and communications, vol. 68, pp. 384–387, 2014. instruction facta universitatis series: electronics and energetics vol. 27, n o 2, june 2014, pp. 
i i editorial as emphasized in the editorial for the second in the series of the anniversary issues, we will strive to attract best submissions and publish best papers from a very broad geographic area, thus making facta universitatis: series electronics and energetics a truly international journal. we will also insist that all published papers are of high quality and practical value, thus leading to their worldwide citation, i.e. to the journal’s placement onto sci list. whilst insisting that all published papers are of high quality and practical value, we wish to avoid creation a situation where the journal publishes by quantity rather than quality, and that is the reason why we already started with rigorous refereeing of all submitted papers. our new policy regarding publication of practical papers in facta universitatis: series electronics and energetics deserves to be elucidated now in more details. we want to publish more practical papers, as badly as the readers want to see them, but they are hard to provide. it should be emphasized here that the acceptance rate for practical papers is considerably higher than that for theoretical ones, since we want to encourage the submission of practical papers. the main reason why you see so many theoretical papers and so few practical papers is that people from an academic environment get paid to produce hardware rather than to write papers about it, and both of them do their jobs reasonably well. when next time you complain about how only a few practical papers appear in this, or any similar journal in the field, please ask yourself the following question: “when was the last time when i, or someone from this division of my organization submitted a practical paper to this journal?” if you have an idea for practical paper, do not hesitate to contact me, and i will be pleased to discuss it with you. 
this, third in the series of the anniversary issues, is collection of 9 invited papers by well-known experts for the specific areas, most of them being members of our editorial team, who present and discuss the state-of-the-art issues of practical interest in the field. as a new editor-in-chief, i, along with our editorial team, promise to continue to develop and improve facta universitatis: series electronics and energetics in order to keep it at the forefront of science and technology. ninoslav stojadinović editor-in-chief instruction facta universitatis series: electronics and energetics vol. 27, n o 3, september 2014, pp. 399 410 doi: 10.2298/fuee1403399c effective combining of color and texture descriptors for indoor-outdoor image classification  stevica cvetković 1 , saša v. nikolić 1 , slobodan ilić 2 1 university of niš, faculty of electronic engineering, niš, serbia 2 technische universitat münchen (tum), munich, germany abstract. although many indoor-outdoor image classification methods have been proposed in the literature, most of them have omitted comparison with basic methods to justify the need for complex feature extraction and classification procedures. in this paper we propose a relatively simple but highly accurate method for indoor-outdoor image classification, based on combination of carefully engineered mpeg-7 color and texture descriptors. in order to determine the optimal combination of descriptors in terms of fast extraction, compact representation and high accuracy, we conducted comprehensive empirical tests over several color and texture descriptors. the descriptors combination was used for training and testing of a binary svm classifier. we have shown that the proper descriptors preprocessing before svm classification has significant impact on the final result. 
comprehensive experimental evaluation shows that the proposed method outperforms several more complex indoor-outdoor image classification techniques on a couple of public datasets.
key words: feature extraction, image classification, image color analysis, image edge detection, support vector machines.
1. introduction
indoor-outdoor image classification is a problem that attracts considerable attention from the scientific community involved in content-based image retrieval [1]. it is a restricted case of the general image classification problem, and it provides the basis for deciding on further processing steps depending on the scene type. for instance, the assumption that indoor and outdoor images are usually taken under different illumination conditions can be used to decide on the subsequent color correction approach [2]. furthermore, indoor-outdoor classification can be exploited in many image processing applications such as image orientation detection [3], image retrieval [4], or robotics [5].
received january 28, 2014; received in revised form march 28, 2014 corresponding author: stevica cvetković, university of niš, faculty of electronic engineering (e-mail: stevica.cvetkovic@elfak.ni.ac.rs)
several approaches for indoor-outdoor image classification have been proposed in the literature so far. in the recent methods there is an evident trend of introducing additional information about the camera or scene [6], [7]. however, this kind of information, which includes exposure time, object distance or flash-fired info, is commonly unavailable to the system. there is also evident involvement of domain-specific assumptions about indoor-outdoor scenes, such as the presence of sky or grass in outdoor images [8], [9], or intentional favoring of specific image partitions [10]. 
in our work, the goal is to define an effective and efficient method for indoor-outdoor image classification based on standardized low-level descriptors and proven machine learning techniques. another goal is to achieve sufficient generality of the method and to avoid introducing domain-specific knowledge about the scene. to this end we propose a carefully engineered procedure for the composition of mpeg-7 color and texture descriptors characterized by efficient extraction, compact representation and high discriminative power. we conducted comprehensive empirical tests to determine the optimal combination of descriptors for this purpose. after an important preprocessing procedure, the combined descriptors are fed to the input of an optimally tuned support vector machine (svm) classifier. we have empirically shown a large impact of feature preprocessing before the svm on the accuracy of the system. experimental evaluation will show that the proposed method outperforms some more complex state-of-the-art indoor-outdoor classification techniques on a couple of standard datasets. to be successfully applied to the image classification task, an image descriptor (feature) should be highly discriminative and invariant against image content [11]. such a descriptor should generate features with high variance and good distribution over category samples. in addition, it should be robust to different levels of image quality and resolution. there is a large collection of visual descriptors available in the literature with corresponding strengths and weaknesses [12], [13]. in order to provide standardized descriptors of image and video content, the mpeg-7 standard defines three classes of still image visual descriptors: color, texture and shape descriptors [14], [15]. each class of visual features characterizes only a certain aspect of image content, so a combination of features is necessarily employed to provide an appropriate description of image content. 
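the idea of combining several descriptors and preprocessing them before classification can be sketched as simple concatenation followed by per-dimension scaling. min-max scaling to [0, 1] is used here purely as an assumption, since this part of the text does not say which preprocessing the authors apply; the function names and the toy numbers are illustrative.

```python
def combine_descriptors(*descriptors):
    """Concatenate several per-image descriptor vectors into one feature vector."""
    combined = []
    for descriptor in descriptors:
        combined.extend(descriptor)
    return combined

def fit_min_max(training_vectors):
    """Learn per-dimension minima and maxima from the training set only."""
    columns = list(zip(*training_vectors))
    return [min(c) for c in columns], [max(c) for c in columns]

def apply_min_max(vector, mins, maxs):
    """Scale each dimension to [0, 1]; constant dimensions map to 0."""
    scaled = []
    for value, lo, hi in zip(vector, mins, maxs):
        span = hi - lo
        scaled.append((value - lo) / span if span else 0.0)
    return scaled

# Toy example: a 2-bin "color" descriptor and a 1-bin "texture" descriptor.
train = [
    combine_descriptors([0.0, 10.0], [100.0]),
    combine_descriptors([2.0, 30.0], [300.0]),
]
mins, maxs = fit_min_max(train)
query = combine_descriptors([1.0, 20.0], [200.0])
print(apply_min_max(query, mins, maxs))
```

fitting the scaling parameters on the training set and reusing them for test images keeps descriptors with large numeric ranges from dominating the distance or kernel computations of the classifier.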
we performed exhaustive experiments on combining several mpeg-7 color and texture descriptors which were carefully chosen to meet the requirements of fast calculation, compact representation and high discriminative power.
once features have been extracted, a method for automatic image classification should be applied. approaches for image classification can be roughly grouped in two categories:
(a) learning-based methods that are able to learn optimal parameters based on input training samples. these methods include svm [6], [8], [10], [16]-[18], neural networks [19], [20], decision trees [2], hidden markov models [36], etc.
(b) non-parametric methods that perform classification directly on the data, without learning the parameters. the most widely used non-parametric method is k-nearest neighbors (k-nn), which determines the image class based on the class of its most similar images [4], [21], [22].
although non-parametric methods require no learning steps and are able to naturally handle a large number of classes, they often suffer from high variation along the decision boundary caused by finite sampling in terms of bias-variance decomposition [23]. as a consequence, their accuracy could be inferior compared to learning-based methods [24]. in addition, the processing time of non-parametric methods is considerably larger than that of the learning-based methods, which makes them inconvenient for large scale classification systems. for our purpose we propose to apply a binary svm image classifier on preprocessed feature vectors formed by combination of several visual descriptors. we applied classification over low-level features only, i.e. features that can be automatically extracted without any a priori knowledge of the image content.
although our paper does not involve fundamentally new procedures, it has several main contributions: (1) it shows empirically that a baseline method for indoor-outdoor image classification can reach satisfactorily good results without using complex image descriptors and sophisticated machine learning techniques, (2) it gives an extensive review of the indoor-outdoor image classification topic, which is lacking in the literature, and (3) it provides a comprehensive statistical analysis of descriptor combinations with feature scaling methods in order to demonstrate the importance of each component.
in the rest of the paper we first give an overview of the previous research in the field of indoor-outdoor image classification. in section 3, the reasons for choosing specific mpeg-7 descriptors are explained, including a brief overview of feature extraction procedures as well as the strategy for their combination. then, details of the svm image classification method are presented, including a description of the feature preprocessing step and svm parameter selection. finally, testing methodology and results are presented and discussed.
2. previous research
the research on indoor-outdoor image classification can be traced back to the work of szummer and picard [21], who applied a two-stage classification approach on features that combine an ohta color space histogram and a multi-resolution simultaneous autoregressive model (msar). at the first stage, they used k-nn to classify sub-blocks of the image, while the final decision was based on the majority rule. an accuracy of 90.3% was achieved on a set of over 1300 consumer images. in a similar approach, serrano et al. [18] extracted an lst color histogram and wavelet texture features for classification of image sub-blocks. they used linear svm classifiers to train color and texture features separately. the recognition rate was 90.2% on a set of 1200 images.
in [4], indoor-outdoor classification is proposed at the highest level of a hierarchical image classification method. color moments in the luv color space were computed for 10x10 image sub-blocks. concatenation of the feature vectors of all sub-blocks produced the final feature vector. finally, k-nn classifiers were evaluated on a database of 6931 vacation photographs, achieving an accuracy of 90.5%.
straight edges were used as a feature in a method proposed in [22]. the authors claim that the proportion of straight edges in indoor images is larger in comparison to outdoor images. the final classification of the image is based on a k-nn rule applied to the proportion of straight edges contained in sub-blocks of the image. in addition, multi-resolution estimates are used to improve the results. tests conducted on a set of 872 photographs reported a classification accuracy of 90.71%.
gupta et al. [19] use a fuzzy clustering method to initially segment an image into sub-regions. segments are then described using simple color, texture, and shape features. a probabilistic neural network is finally applied for the classification, with a reported accuracy of 92.36% on a benchmark set of 902 images.
indoor-outdoor classification is used to improve automatic illuminant estimation in [2]. the feature vector consists of color, texture and edge information. decision forests of classification and regression trees are used for classification. testing was performed on a collection of 6785 images, downloaded from the web or acquired by digital cameras. they reported a classification accuracy of 93.1%.

table 1 chronological overview of the published methods for indoor-outdoor image classification
authors                  year   classifier        accuracy (in %)   number of images
szummer & picard [21]    1998   knn               90.30             1343
vailaya et al. [4]       2001   knn               90.50             6931
serrano et al. [18]      2004   svm               90.20             1200
serrano et al. [8]       2004   svm+bayesian      90.70             1200
payne & singh [22]       2005   knn               90.71             872
boutell & luo [6]        2005   svm+bayesian      94.10             5120
liu et al. [7]           2005   lda+boosting      92.20             13000
lu et al. [9]            2005   gmm+lda           93.80             1400
gupta et al. [19]        2007   neural network    92.36             902
bianco et al. [2]        2008   decision forest   93.10             6785
kim et al. [10]          2010   svm               90.26             1276

all of the previously mentioned methods are concerned only with low-level features that are extracted directly from digital images with no impact of human perception. in addition, several methods based on high-level image information, i.e. semantic assumptions about the scene, have been proposed.
the authors of [18] extended their approach in [8] by introducing semantic detectors for grass and sky to improve the classification accuracy. the results from the svm sub-block classification and semantic detectors were integrated using a bayesian classifier. a classification accuracy of 90.7% was reported on a set of 1200 consumer images.
boutell and luo [6] proposed a fusion of low-level image information with the camera metadata provided in the exchangeable image file format (exif), such as exposure time, flash fired and subject distance. first, they applied svm to classify images by low-level color and texture features. then, a bayesian network was used to classify low-level features integrated with exif metadata. on a benchmark set of 5120 images, the reported accuracy was 94.1%.
in a similar approach [7], the combination of color moments and an edge direction histogram was extracted as low-level features. to improve the classification accuracy, they utilized 14 exif features associated with images. the linear discriminant analysis (lda) algorithm was utilized to implement linear combinations between all extracted features. finally, the combined features are used with the original features in a boosting classification algorithm. on a large set of about 13000 digital photographs, they achieved 92.2% accuracy.
the authors of [9] first trained gaussian mixture models (gmm) to describe the color-texture properties of image patches for 20 predefined materials (building, blue sky, bush, etc.). these models are then applied to a test image to produce 20 probability density response maps, which are later used to train lda classifiers for scene categories. a database of 1400 photos taken from 43 persons was used for testing. the indoor-outdoor classification rate was 93.8%.
in [10], the authors assume that foreground objects (human bodies and faces), which often appear in the central part of the image, may negatively affect the system performance. they partitioned the image into five blocks and extracted an edge and color orientation histogram (ecoh) for each block. then, the features are weighted according to the block positions (the central part is weighted less) and concatenated to generate the final feature vector. an svm classifier evaluated on 1276 images achieved a classification rate of 90.26%.
3. feature extraction and combination
statistical analysis of global visual descriptors for image retrieval in [11], [12] has shown a large overlap in the information they extract. although mpeg-7 suggests a number of different descriptors, most of them are highly dependent on each other [11]. on the other hand, some of them, like the dominant color descriptor (dcd), are too computationally expensive for practical applications. to extract features with compact representation and low computational costs we considered the following five mpeg-7 descriptors: scalable color descriptor (scd), color structure descriptor (csd), color layout descriptor (cld), homogeneous texture descriptor (htd), and edge histogram descriptor (ehd). in order to test their behavior and select the best descriptors for combining, we have conducted exhaustive experiments.
we will show that a combination of only a few of them is sufficient for successful indoor-outdoor image classification. first, we give a brief overview of the selected descriptors and describe the method that we used for their combination. further details about descriptor extraction procedures can be found in [14], [25], [26].
color descriptors
scalable color descriptor (scd) measures color distribution over an entire image. it is a histogram in the hsv color space that is encoded using the haar transform. the histogram is extracted in hsv space uniformly quantized to 16 levels of h, 4 levels of s and 4 levels of v, giving 256 bins in total. these values are truncated into an 11-bit integer representation and non-linearly mapped into a 4-bit representation. this representation gives a higher significance to smaller values with high probability. to reduce the size of this representation, the histogram values are encoded using the haar transform. its representation is scalable in terms of the number of coefficients, varying from 16 up to 256. our experiments have shown that using more than 64 coefficients does not necessarily lead to a significant accuracy improvement. therefore, we used the following representation of the descriptor:
$f^{scd} = (f^{scd}_1, \ldots, f^{scd}_{64})$ (1)
color structure descriptor (csd) extends the image color histogram with information about the local spatial structure of the color. it is based on the concept of a color structure histogram, which counts the number of times a particular color is contained within the 8x8 window as the window scans over the image. an mpeg-7 specific color space, denoted as hmmd [14], is used for the extraction. it is first non-uniformly quantized into n colors [27], determining the number of bins in the color structure histogram. then, the window scans over the entire image, and for each color which is present within the window, it increments the corresponding histogram bin.
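the window-scanning step of the csd can be illustrated with a short sketch. this is a minimal python illustration under stated assumptions, not the mpeg-7 reference extraction: it assumes the image has already been quantized to color indices (the hmmd conversion and non-uniform quantization are omitted), it moves the window one pixel at a time, and it stops before normalization. the function name and the list-of-lists image layout are hypothetical.

```python
def color_structure_histogram(indexed, n_colors, win=8):
    """count, for every color index, how many win x win windows contain it.

    indexed  : 2-d list of already-quantized color indices (assumed input format)
    n_colors : number of quantized colors, i.e. number of histogram bins
    win      : side length of the square structuring window (8 in the text)
    """
    height, width = len(indexed), len(indexed[0])
    hist = [0] * n_colors
    for top in range(height - win + 1):
        for left in range(width - win + 1):
            # set of distinct colors inside the current window
            present = {
                indexed[y][x]
                for y in range(top, top + win)
                for x in range(left, left + win)
            }
            # each present color is counted once per window,
            # no matter how many pixels of that color the window holds
            for color in present:
                hist[color] += 1
    return hist


# 9x9 toy image: color 0 everywhere, a single pixel of color 1 in the corner
toy = [[0] * 9 for _ in range(9)]
toy[0][0] = 1
csd_counts = color_structure_histogram(toy, n_colors=4)
print(csd_counts)  # prints [4, 1, 0, 0]
```

in the toy image there are four window positions; color 0 appears in all of them, while the single corner pixel of color 1 falls inside only one. a plain pixel-count histogram would not distinguish this isolated pixel from the same color spread across several windows, which is the structural information the csd adds.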
finally, histogram values are normalized and nonlinearly quantized to 8 bits/bin. in our experiments we used a csd containing 64 bins:
$f^{csd} = (f^{csd}_1, \ldots, f^{csd}_{64})$ (2)
color layout descriptor (cld) has been designed to efficiently represent the spatial layout of colors inside an image. it is obtained by applying the discrete cosine transformation (dct) on local representative colors of 64 image blocks in the ycbcr color space. the descriptor is characterized by compact representation, invariance to resolution changes and low computational complexity. the extraction process starts with partitioning each rgb color channel of the image into 8x8=64 non-overlapping blocks to guarantee resolution invariance. then, a single representative color is computed for each block by simple pixel averaging. in the next step, conversion to the ycbcr color space is done and the color channels are transformed by dct to obtain three sets of 64 dct coefficients. finally, zigzag-scanned dct coefficients are concatenated into a feature vector containing the most informative elements of each ycbcr color channel. our rough experiment has shown that a feature vector with 22 elements represents a good choice:
$f^{cld} = (f^{cld}_{y,1}, \ldots, f^{cld}_{y,10}, f^{cld}_{cb,1}, \ldots, f^{cld}_{cb,6}, f^{cld}_{cr,1}, \ldots, f^{cld}_{cr,6})$ (3)
texture descriptors
homogeneous texture descriptor (htd) characterizes the region texture by the mean energy and the energy deviation from a set of 30 frequency channels. the descriptor is extracted by first partitioning the frequency space into 30 equidistant channels. the individual feature channels are filtered with a bank of 2-d gabor functions, and the mean and standard deviation of the energy in each of the channels are calculated.
the final form of the descriptor, which consists of 62 coefficients, is
$f^{htd} = (f^{htd}_{dc}, f^{htd}_{sd}, e^{htd}_1, \ldots, e^{htd}_{30}, d^{htd}_1, \ldots, d^{htd}_{30})$ (4)
the first two components are the mean and standard deviation of the complete image, and $e^{htd}_i$ and $d^{htd}_i$ are the mean energy and energy deviation of the corresponding i-th frequency channel, respectively.
edge histogram descriptor (ehd) represents the spatial distribution of 5 types of edge orientations inside local image partitions called sub-images. one local edge histogram is generated for each of 4x4=16 sub-images, representing the distribution of five edge orientations inside a sub-image. to generate the local edge histogram, edges in the sub-image are extracted and classified into five categories depending on the orientation (vertical, horizontal, 45° diagonal, 135° diagonal, and non-directional). since there are 16 sub-images, the final edge histogram will contain 16x5 = 80 bins formed by concatenation of the local histograms:
$f^{ehd} = (f^{ehd}_1, \ldots, f^{ehd}_{80})$ (5)
combination of descriptors
when using multiple visual features for image classification, a crucial problem is how to combine them in order to measure image similarity. generally, there are two approaches for feature combination (fusion, aggregation, composition, merging) [17], [28], [34]. the first one, named “early fusion”, performs a combination of features before the estimation of the distances between images. in contrast, “late fusion” applies classification on each feature separately, after which it integrates these results into a final decision. an obvious disadvantage of the late fusion approach is its computational cost, as every feature requires a separate classification stage. another disadvantage is the potential loss of correlation in mixed feature space [17]. since our goal is to develop a computationally efficient and accurate method, we focused on the “early fusion” approach.
specifically, we create the final feature vector by concatenating the extracted feature vectors (3 color and 2 texture), where not all feature vectors are necessarily involved. formally, the most extensive form of the final feature vector is:
$f_i = (f^{scd}, f^{csd}, f^{cld}, f^{htd}, f^{ehd})$ (6)
when considering the combination of only two features (e.g. cld+ehd), we will use the final feature vector in the form $f_i = (f^{cld}, f^{ehd})$. as will be shown in the experimental evaluation, not all the features have to be combined to achieve the best performance. our experiments have shown that the combination of only a few of them is sufficient for fast and accurate indoor-outdoor image classification. depending on the number of features chosen for image representation, the final feature vector will contain from 22 up to 292 elements. the feature vector formed in this way will serve as the input of the svm classifier described in the following section.
4. svm based indoor-outdoor image classification
svm is one of the most popular machine learning methods for classification of multimedia content [6], [8], [10], [16]-[18]. it is a supervised machine learning technique that learns from examples in order to predict the values of previously unseen data. svm can be formalized as an optimization problem which finds the best hyperplane for two or more groups of vectors by maximizing the size of the margin between groups. in order to get the best performance from svm, it is crucial to apply appropriate feature scaling and svm parameter tuning [35]. the procedures that we performed are described in detail below.
although many existing svm approaches apply some sort of feature scaling, its impact on svm classification performance is still not sufficiently clear. our intention is to empirically test the significance of feature scaling procedures before svm classification.
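the early-fusion concatenation of eq. (6) and the resulting vector lengths can be sketched in a few lines of python. this is an illustration only: the descriptor dimensions are the ones stated in the text, while the function name, the dictionary layout and the zero-filled placeholder vectors are assumptions standing in for real extracted descriptors.

```python
# descriptor dimensions as given in the text
DESCRIPTOR_DIMS = {"scd": 64, "csd": 64, "cld": 22, "htd": 62, "ehd": 80}
# fixed concatenation order, following eq. (6)
ORDER = ("scd", "csd", "cld", "htd", "ehd")


def fuse(features, selected):
    """early fusion: concatenate the selected descriptor vectors in a fixed order."""
    fused = []
    for name in ORDER:
        if name in selected:
            vec = features[name]
            if len(vec) != DESCRIPTOR_DIMS[name]:
                raise ValueError(f"{name}: expected {DESCRIPTOR_DIMS[name]} values")
            fused.extend(vec)
    return fused


# placeholder vectors (all zeros) stand in for real extracted descriptors
features = {name: [0.0] * dim for name, dim in DESCRIPTOR_DIMS.items()}

len_cld = len(fuse(features, {"cld"}))
len_cld_ehd = len(fuse(features, {"cld", "ehd"}))
len_all = len(fuse(features, set(ORDER)))
print(len_cld, len_cld_ehd, len_all)  # prints 22 102 292
```

the three printed lengths match the range quoted in the text: 22 elements for cld alone, 102 for cld+ehd, and 292 when all five descriptors are concatenated.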
in general, complex image features may contain significantly different ranges of values since they are combined from several components (e.g. texture and color information). as a consequence, components with a higher variance will be dominant in determining the distance between images. to avoid excessive influence of feature components with a large variance, each element of the feature vector must be scaled using an appropriate method.
let us consider a collection of m indoor-outdoor images where each image is represented by its n-dimensional feature vector formed by combination of several mpeg-7 descriptors. we examined two basic and efficient feature scaling methods: a) linear min-max scaling [11], and b) scaling to zero mean and unit variance (“z-score”) [29]. the linear min-max method can be mathematically represented as:
$$f'_i(j) = \frac{f_i(j) - \min_i f_i(j)}{\max_i f_i(j) - \min_i f_i(j)}, \quad i = 1, \ldots, m; \; j = 1, \ldots, n$$ (7)
where $f'_i(j)$ represents the element at position j in the scaled feature vector of the image i, and $\min_i f_i(j)$ and $\max_i f_i(j)$ are the minimal and maximal elements at position j among all training feature vectors. the resulting feature vector will be normalized to the range [0, 1]. this approach has the advantage that the relative distributions (variances) of both rows and columns of the feature matrix are preserved.
another approach to be considered is scaling to zero mean and unit variance (“z-score”). it is defined by:
$$f'_i(j) = \frac{f_i(j) - \operatorname{mean}_i f_i(j)}{\operatorname{stdev}_i f_i(j)}, \quad i = 1, \ldots, m; \; j = 1, \ldots, n$$ (8)
where $\operatorname{mean}_i f_i(j)$ and $\operatorname{stdev}_i f_i(j)$ represent the mean and standard deviation of the elements at position j among all training feature vectors. the comparative evaluation of the two scaling approaches will reveal their influence on svm classification performance. as we will show later on, “z-score” significantly outperforms the first method, and hence was chosen as the optimal one.
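the two scaling rules in equations (7) and (8) can be sketched as follows. this is a minimal pure-python illustration, not the authors' matlab code: the function names and the toy data are hypothetical, the statistics are computed on the training vectors only and then reused for new vectors (as the text describes), and the population standard deviation is assumed because the text does not say which estimator was used.

```python
from statistics import mean, pstdev


def fit_min_max(train):
    """per-position minimum and maximum over all training vectors."""
    columns = list(zip(*train))
    return [min(c) for c in columns], [max(c) for c in columns]


def apply_min_max(vec, mins, maxs):
    """eq. (7); constant positions are mapped to 0.0 to avoid division by zero."""
    return [(v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(vec, mins, maxs)]


def fit_z_score(train):
    """per-position mean and (population) standard deviation of the training vectors."""
    columns = list(zip(*train))
    return [mean(c) for c in columns], [pstdev(c) for c in columns]


def apply_z_score(vec, means, stds):
    """eq. (8); constant positions are mapped to 0.0 to avoid division by zero."""
    return [(v - m) / s if s > 0 else 0.0
            for v, m, s in zip(vec, means, stds)]


# toy training set: m = 3 vectors, n = 2 positions with very different ranges
train = [[0.0, 10.0], [2.0, 30.0], [4.0, 20.0]]
mins, maxs = fit_min_max(train)
means, stds = fit_z_score(train)

new_vec = [2.0, 30.0]
scaled_mm = apply_min_max(new_vec, mins, maxs)
scaled_z = apply_z_score(new_vec, means, stds)
print(scaled_mm)  # prints [0.5, 1.0]
print(scaled_z)
```

fitting on the training set and applying the stored statistics to test vectors keeps the test data from leaking into the scaling parameters. note that with min-max scaling a test value outside the training range falls outside [0, 1].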
it will be shown that proper feature scaling may increase classification accuracy by up to several percent.
besides appropriate scaling of feature vectors, svm requires choosing a kernel function with corresponding parameters. we considered a commonly used non-linear gaussian rbf kernel with l2 norm, over the scaled feature vectors. for optimal selection of the pair of svm parameters we applied “n-fold cross validation”, which separates the training dataset into n subsets and tests every subset using an svm classifier trained on the remaining subsets. a systematic “grid-search” [13] was performed over various pairs of values to select the pair with the best accuracy. in order to limit the search complexity, parameter values for evaluation were sampled to form a grid of equidistant steps.
5. experimental evaluation
image datasets
currently, there is an evident lack of a comprehensive standard dataset for testing indoor-outdoor image classification. most of the methods proposed in the literature use their own custom image datasets. for the purpose of objectivity, we test and compare our results only with relevant methods whose test datasets are publicly available. thus, the methods presented in [19] and [10] were used for comparison, using the datasets they provided.
the first image dataset is the iitm-scid2 (extended scene classification image database) introduced in [19]. it contains 902 indoor-outdoor images with a wide variation of scenes and resolutions ranging from 80x80 up to 2048x1536 pixels. out of this dataset, 193 indoor and 200 outdoor images are used for training, while 249 indoor and 260 outdoor images are used for testing. compared to the second dataset, images of this dataset show a large variation in scene content and resolution, which makes this dataset more suitable for testing real-world performance.
the second dataset, hereafter referred to as corel-inout, was provided by the authors in [10].
its basis is wang’s image database [30], extended with various images obtained from the web. it consists of a total of 1276 indoor-outdoor images of different scenes, all of size 256x256 pixels. specifically, 650 of the images were used for the training of classifiers, among which 320 are indoor and 330 outdoor images. for the verification phase, the other 626 images, composed of 310 indoor and 316 outdoor images, are used. examples of images from both datasets are shown in fig. 1.

fig. 1 examples of test images: a) iitm-scid2 indoor, b) iitm-scid2 outdoor, c) corel-inout indoor, d) corel-inout outdoor

test results
we have tested the performance of indoor-outdoor image classification using various combinations of descriptors as well as two feature scaling approaches. the prototype system is implemented in matlab, where mpeg-7 features are extracted using a c++ implementation [31]. for svm classification we utilized the matlab implementation of the libsvm library [32].
in the first experiment we tested the impact of feature vector scaling on svm classification accuracy. table 2 presents the accuracy of svm classification for different single descriptors, when feature vectors are scaled using two approaches: min-max and “z-score”.

table 2 impact of feature scaling method on the svm classification accuracy (in %)
descriptor     iitm-scid2 dataset     corel-inout dataset
(dimension)    min-max    z-score     min-max    z-score
scd (64)       81.73      84.68       70.73      70.93
csd (64)       84.87      87.43       80.67      83.23
cld (22)       78.98      82.71       80.83      82.59
htd (62)       79.17      82.12       74.76      79.39
ehd (80)       83.10      87.03       83.70      83.87

we have performed a comprehensive experimental evaluation in order to find the combination of mpeg-7 features that gives the best classification performance. table 3 presents svm classification accuracy of the proposed method for different combinations of mpeg-7 color and texture descriptors.
note that each combined descriptor includes at least one color and one texture descriptor. the same tests are performed on the iitm-scid2 and corel-inout image datasets.

table 3 accuracy of svm classification using different combinations of descriptors and “z-score” preprocessing (in %)
descriptor combination    dimension   iitm-scid2 dataset   corel-inout dataset
scd+htd                   126         88.61                81.95
scd+ehd                   144         91.36                87.38
csd+htd                   126         88.02                86.74
csd+ehd                   144         90.37                88.66
cld+htd                   84          88.02                87.70
cld+ehd                   102         91.55                91.53
scd+csd+htd               190         89.59                84.82
scd+csd+ehd               208         91.55                88.66
scd+cld+htd               148         92.34                88.66
scd+cld+ehd               166         93.71                91.53
csd+cld+htd               148         91.16                89.30
csd+cld+ehd               166         92.93                91.05
scd+htd+ehd               206         91.36                89.30
csd+htd+ehd               206         92.73                91.05
cld+htd+ehd               164         92.34                92.01
scd+csd+cld+htd           212         91.94                88.82
scd+csd+cld+ehd           230         93.32                91.05
scd+csd+htd+ehd           270         92.34                89.78
scd+cld+htd+ehd           228         93.32                92.17
csd+cld+htd+ehd           228         93.71                92.49
scd+csd+cld+htd+ehd       292         93.71                92.01

it can be observed that the combination of four descriptors csd+cld+htd+ehd gives the best overall results for both datasets, and therefore can be considered the optimal combination of mpeg-7 descriptors. it can also be noted that among combinations of two descriptors, cld+ehd performs better than all others. when considering three-descriptor combinations, scd+cld+ehd gives the best average accuracy. the general observation is that introducing an additional descriptor does not necessarily lead to a performance improvement. if the requirement is a fast and sufficiently accurate descriptor, then cld+ehd represents a reasonable choice, providing an excellent cost/performance ratio.
finally, table 4 presents the results of our method using the most accurate mpeg-7 descriptor combination (csd+cld+htd+ehd) with “z-score” scaling and svm classification, compared to the results of methods [19] and [10] on the iitm-scid2 and corel-inout datasets, respectively.
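the ranking statements drawn from table 3 can be re-derived with a few lines of python. the accuracies below are copied from table 3; the choice of an unweighted mean of the two dataset accuracies as the "average accuracy" is an assumption, since the text does not state how the average was formed, and the function names are hypothetical.

```python
# (iitm-scid2 accuracy, corel-inout accuracy) in %, copied from table 3
TABLE3 = {
    "scd+htd": (88.61, 81.95), "scd+ehd": (91.36, 87.38),
    "csd+htd": (88.02, 86.74), "csd+ehd": (90.37, 88.66),
    "cld+htd": (88.02, 87.70), "cld+ehd": (91.55, 91.53),
    "scd+csd+htd": (89.59, 84.82), "scd+csd+ehd": (91.55, 88.66),
    "scd+cld+htd": (92.34, 88.66), "scd+cld+ehd": (93.71, 91.53),
    "csd+cld+htd": (91.16, 89.30), "csd+cld+ehd": (92.93, 91.05),
    "scd+htd+ehd": (91.36, 89.30), "csd+htd+ehd": (92.73, 91.05),
    "cld+htd+ehd": (92.34, 92.01),
    "scd+csd+cld+htd": (91.94, 88.82), "scd+csd+cld+ehd": (93.32, 91.05),
    "scd+csd+htd+ehd": (92.34, 89.78), "scd+cld+htd+ehd": (93.32, 92.17),
    "csd+cld+htd+ehd": (93.71, 92.49),
    "scd+csd+cld+htd+ehd": (93.71, 92.01),
}


def mean_accuracy(name):
    """unweighted mean over the two datasets (assumed averaging rule)."""
    return sum(TABLE3[name]) / len(TABLE3[name])


def best_combination(n_descriptors):
    """best combination, by mean accuracy, among those with n_descriptors parts."""
    candidates = [k for k in TABLE3 if k.count("+") + 1 == n_descriptors]
    return max(candidates, key=mean_accuracy)


best_overall = max(TABLE3, key=mean_accuracy)
for n in (2, 3, 4):
    print(n, best_combination(n), round(mean_accuracy(best_combination(n)), 2))
print("overall:", best_overall)
```

under this averaging rule the sketch selects cld+ehd among pairs, scd+cld+ehd among triples, and csd+cld+htd+ehd overall, in agreement with the observations in the text. it also shows that the full five-descriptor vector does not beat the best four-descriptor one.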
the results presented in table 4 show that the proposed method outperforms both compared methods. we have achieved 93.71% classification accuracy on the iitm-scid2 dataset, which is better than the 92.36% reported in [19]. on the second dataset, a result of 92.49% is an improvement of over 2% compared to [10]. since the accuracy on both datasets is at least 92.49%, it may be concluded that the proposed method is very effective for indoor-outdoor image classification. the high quality of the results is also notable given the relatively small size of the training datasets, since svm typically requires a rather large dataset of images to obtain good generalization capabilities.

table 4 accuracy comparison of different methods for indoor-outdoor image classification (in %)
method              iitm-scid2 dataset          corel-inout dataset
                    total   indoor   outdoor    total   indoor   outdoor
gupta et al. [19]   92.36   94.00    90.80      -       -        -
kim et al. [10]     -       -        -          90.26   90.00    90.29
our method          93.71   95.58    91.92      92.49   93.55    91.46

6. conclusion
we have presented a relatively simple but highly accurate method for indoor-outdoor image classification based on a combination of mpeg-7 features and svm classification. since we intended to create a computationally efficient method, we chose to apply a combination of low-level color and texture features in which all features contribute equally to the final result. we have empirically found that the combination of four mpeg-7 descriptors (csd+cld+htd+ehd), scaled to zero mean and unit variance before input into the svm classifier, outperforms all others. also, the combination of only two descriptors, cld+ehd, is a good trade-off if we further intend to reduce computational costs while retaining a high level of accuracy. experiments conducted on two public datasets achieved 93.71% and 92.49% accuracy, which is comparable to the top results previously published in the literature.
future research will be targeted towards using regions of interest (roi) [33] to improve performance.
references
[1] r. datta, d. joshi, j. li, j. z. wang, “image retrieval: ideas, influences, and trends of the new age,” acm computing surveys, vol. 40, no. 2, pp. 1-60, 2008.
[2] s. bianco, g. ciocca, c. cusano, r. schettini, “improving color constancy using indoor-outdoor image classification,” ieee transactions on image processing, vol. 17, no. 12, pp. 2381-2392, 2008.
[3] l. zhang, m. li, h.-j. zhang, “boosting image orientation detection with indoor vs. outdoor classification,” proceedings of wacv ’02, washington, dc, usa, ieee computer society, 2002, pp. 95-99.
[4] a. vailaya, m. a. t. figueiredo, a. k. jain, h.-j zhang, “image classification for content-based indexing,” ieee transactions on image processing, vol. 10, no. 1, pp. 117-130, 2001.
[5] j. collier, a. ramirez-serrano, “environment classification for indoor/outdoor robotic mapping,” proceedings of canadian conference on computer and robot vision crv’09, kelowna, british columbia, canada, 2009, pp. 276-283.
[6] m. boutell, j. luo, “beyond pixels: exploiting camera metadata for photo classification,” pattern recognition, vol. 38, no. 6, pp. 935-946, 2005.
[7] x. liu, l. zhang, m. li, h. zhang, d. wang, “boosting image classification with lda-based feature combination for digital photograph management,” pattern recognition, vol. 38, pp. 887-901, 2005.
[8] n. serrano, a. savakis, j. luo, “improved scene classification using efficient low-level features and semantic cues,” pattern recognition, 37(9), pp. 1773-1784, 2004.
[9] l. lu, k. toyama, g. d. hager, “a two level approach for scene recognition,” proceedings of cvpr’05, washington, dc, ieee computer society, 2005, pp. 688-695.
[10] w. kim, j. park, c. kim, “a novel method for efficient indoor–outdoor image classification,” journal of signal processing systems, vol. 61, no. 3, pp. 251-258, 2010.
[11] h. eidenberger, “statistical analysis of content-based mpeg-7 descriptors for image retrieval,” multimedia systems, vol. 10, no. 2, pp. 84-97, 2004.
[12] t. deselaers, d. keysers, h. ney, “features for image retrieval: a quantitative comparison,” proceedings of dagm sspr’04, tübingen, germany, springer, 2004, pp. 228-236.
[13] t. deselaers, d. keysers, h. ney, “features for image retrieval: an experimental comparison,” information retrieval, vol. 11, pp. 77-107, 2008.
[14] b.s. manjunath, p. salembier, t. sikora, introduction to mpeg-7, san francisco, ca, usa, wiley, 2002.
[15] s. chang, t. sikora, a. puri, “overview of the mpeg-7 standard,” ieee transactions on circuits and systems for video technology, vol. 11, no. 6, pp. 688-695, 2001.
[16] s. n. lindstaedt, r. mörzinger, r. sorschag, v. pammer, g. thallinger, “automatic image annotation using visual content and folksonomies,” multimedia tools and applications, vol. 42, no. 1, pp. 97-113, 2009.
[17] c. g. m. snoek, m. worring, a. w. m. smeulders, “early versus late fusion in semantic video analysis,” proceedings of acm multimedia ’05, new york, ny, usa, acm, 2005, pp. 399-402.
[18] n. serrano, a. savakis, a. luo, “a computationally efficient approach to indoor/outdoor scene classification,” proceedings of icpr’02, quebec city, canada, 2002, pp. 146-149.
[19] l. gupta, v. pathangay, a. patra, a. dyana, s. das, “indoor versus outdoor scene classification using probabilistic neural network,” eurasip journal on advances in signal processing, pp. 1-11, 2007.
[20] s. park, “content-based image classification using a neural network,” pattern recognition letters, vol. 25, no. 3, pp. 287-300, 2004.
[21] m. szummer, r. w. picard, “indoor-outdoor image classification,” proceedings of iwcbaivd’98, ieee computer society, 1998, pp. 42-51.
[22] a. payne, s. singh, “indoor vs. outdoor scene classification in digital photographs,” pattern recognition, vol. 38, no. 10, pp. 1533-1545, 2005.
[23] h. zhang, a. c. berg, m. maire, j. malik, “svm-knn: discriminative nearest neighbor classification for visual category recognition,” in proceedings of cvpr’06, new york, ny, usa, 2006, pp. 2126-2136.
[24] m. varma and d. ray, “learning the discriminative power-invariance trade-off,” proceedings of iccv’07, rio de janeiro, brazil, 2007, pp. 1-8. available: http://dx.doi.org/10.1109/iccv.2007.4408875
[25] b. s. manjunath, j. r. ohm, v.v. vinod, a. yamada, “color and texture descriptors,” ieee trans. circuits and systems for video technology, vol. 11, no. 6, pp. 703-715, 2001.
[26] a. yamada, m. pickering, s. jeannin, l. cieplinski, j. r. ohm, m. kim, mpeg-7 visual part of experimentation model version 10.0. iso/iec jtc1/sc29/wg11/n4063, 2001.
[27] r. datta, j. li, j. z. wang, “content-based image retrieval: approaches and trends of the new age,” proceedings acm sigmm mir ’05, new york, ny, usa, acm, 2005, pp. 253-262.
[28] s. ayache, g. quénot, j. gensel, “classifier fusion for svm-based multimedia semantic indexing,” proceedings of ecir’07, berlin, germany, springer-verlag, 2007, pp. 494-504.
[29] r. j. larsen and m. l. marx, an introduction to mathematical statistics and its applications, pearson prentice hall, 2006.
[30] j. z. wang, j. li, g. wiederhold, “simplicity: semantics-sensitive integrated matching for picture libraries,” ieee transactions on pattern analysis and machine intelligence, vol. 23, no. 9, pp. 947-963, 2001.
[31] m. bastan, h. cam, u. gudukbay, o. ulusoy, “bilvideo-7: an mpeg-7-compatible video indexing and retrieval system,” ieee multimedia, vol. 17, pp. 62-73, 2010.
[32] c.-c. chang and c.-j. lin, “libsvm: a library for support vector machines,” acm transactions on intelligent systems and technology, vol. 2, no. 3, pp. 1-27, 2011.
[33] j. lee, j. nang, “content-based image retrieval method using the relative location of multiple rois,” advances in electrical and computer engineering, vol. 11, no. 3, pp. 85-90, 2011.
[34] m. soysal and a.a. alatan, “combining mpeg-7 based visual experts for reaching semantics,” in proc. of vlbv03, madrid, 2003. [35] d. lu and d. weng, “a survey of image classification methods and techniques for improving classification performance,” international journal of remote sensing, vol. 28, no. 5, pp. 823-870, 2007. [36] j. li and j.z. wang, “automatic linguistic indexing of pictures by a statistical modeling approach,” ieee trans. on pami, vol. 25, no. 9, pp. 1075-1088, 2003.

facta universitatis series: electronics and energetics vol. 32, no. 4, december 2019, pp. 601-613
https://doi.org/10.2298/fuee1904601t
© 2019 by university of niš, serbia | creative commons license: cc by-nc-nd

novel single layer fault tolerance rca construction for qca technology

zahra taheri, abdalhossein rezai, hamid rashidi
acecr institute of higher education, isfahan branch, isfahan, iran

abstract. quantum-dot cellular automata (qca) technology has become a promising and accessible candidate for implementing digital circuits at the nanoscale, but circuit design in the qca technology has been limited by the high defect rate of fabrication. fault-tolerant design is therefore an interesting research topic in qca circuit design. in this study, a novel 3-input fault tolerance (ft) majority gate (mg) is developed. an efficient 1-bit qca full adder (fa) is then developed using this 3-input mg, and a new 4-bit ft qca ripple carry adder (rca) is developed based on the proposed 1-bit ft qca fa. the developed circuits are implemented in the qcadesigner tool version 2.0.3. the results indicate that the developed qca circuits provide advantages compared to other qca circuits in terms of tolerance to double and single cell missing defects, area and delay time.

key words: nanoelectronics, fault-tolerance, majority gate, qca fa, ripple carry adder, quantum-dot cellular automata

received march 21, 2019; received in revised form june 18, 2019
corresponding author: abdalhossein rezai, acecr institute of higher education, isfahan branch, isfahan, iran (e-mail: rezaie@acecr.ac.ir)

1. introduction

the qca technology is a promising computing paradigm that has widespread applications in emerging technologies like the carbon nanotube field effect transistor (cntfet) [1, 2, 3] and silicon on insulator (soi) [4, 5]. in addition, it has the capability to provide better performance than other technologies such as conventional cmos technology [6]. the qca technology was first introduced by lent et al. [7]. this technology presents a novel method of computation and information transformation [8, 9, 10]. the four-dot square cell, which contains two free electrons, is the fundamental unit in the qca technology [9, 11, 12]. there exist two stable arrangements of these electrons in a qca cell, which are denoted as the cell polarizations. the interaction between the electrons of neighboring cells allows these polarizations to represent logic ‘1’ and logic ‘0’, so the logical states can be computed from the charge configuration of the qca cell [8, 13]. the basic components in this technology are the qca majority gate (mg), the qca inverter gate (ig) and the qca wire [9, 14]. this technology has received extensive attention due to its many practical applications such as the qca multiplexer [8, 9, 11], the qca multiplier [15], efficient designs of the qca full adder (fa) [10, 15-35], the comparator [14, 36] and the shift register [37]. moreover, the fa circuit is an inseparable component of computer arithmetic circuits. hence, efficient qca fa construction design is an interesting research topic. the 1-bit qca fa construction can be designed by employing the mg and the ig in the qca technology [10]. fault tolerance (ft) circuit design is also an interesting and necessary topic in the qca technology [38-47].
hence, in order to make the developed digital circuits suitable for future computing, many researchers have worked on the ft characteristics of qca full adder constructions. in this paper, a novel 3-input mg is proposed to offer high fault tolerance for double and single missing cell defects. next, a novel 1-bit ft qca fa circuit is developed using the developed 3-input mg. in addition, a novel 4-bit ft qca rca construction is designed using the developed 1-bit ft qca fa. the developed qca constructions are implemented with the qcadesigner tool version 2.0.3. the simulation results confirm that the developed qca constructions have considerable advantages compared to other designs.

the paper is organized as follows: in section 2, an overview of the qca technology is presented. then, the proposed designs for the new 3-input ft mg, the 1-bit qca ft fa construction, and the 4-bit qca ft rca construction are presented in section 3. the simulation results and the comparison with related works are discussed in section 4. finally, the conclusion is given in section 5.

2. background

2.1. qca cell

the four-dot square cell, which contains two free electrons, is the fundamental unit in the qca technology [6]. due to electrostatic repulsion, these electrons occupy two of the four quantum dots, located at diagonally opposite corners of the cell [18]. thus, there exist two stable arrangements for the qca cell, which are shown in fig. 1 [9]. these two arrangements are denoted as the cell polarizations. through the interaction between the electrons of neighboring cells, they can represent logic ‘1’ and logic ‘0’. the logical state in this construction is defined by the positions of the electrons in the quantum dots [20].

fig. 1 the polarizations of the qca cell [9]

2.2. the qca gates

the mg, the xor gate and the ig are indispensable building blocks of qca circuits [48]. fig. 2 shows a 3-input mg, an ig and an xor gate [12, 48].

fig. 2 the basic gates in the qca technology (a) majority gate, (b) inverter gate, (c) exclusive-or gate [12, 48]

the 3-input mg logical function is defined as follows [11]:

$m(a, b, c) = ab + ac + bc$ (1)

a majority gate can work as a 2-input or gate or a 2-input and gate by fixing one of its inputs to the polarization 1 or -1, respectively [8].

2.3. the qca fa

the full adder is an important part of computer arithmetic circuits. if a, b, and cin are the inputs, and carry and sum are the outputs of a 1-bit full adder, the carry and sum outputs can be computed as follows [20]:

$c_{out} = \mathrm{maj3}(a, b, c_{in}) = ab + a c_{in} + b c_{in}$ (2)

$sum = a \oplus b \oplus c_{in}$ (3)

2.4. defect in qca circuits

as shown in fig. 3, defects can occur during the positioning of the cells on a surface or during the synthesis of the cellular layout of the qca circuits [29]. in general, they can be categorized as follows:
• cell omission (missing cell): in this case, the location of the qca cell is changed from its original position [44, 45]. the cell omission is displayed in fig. 3(b).
• cell displacement: in this case, the qca cell has lost its original direction [41]. the cell displacement defect is displayed in fig. 3(c).
• cell misalignment: in this case, the fault occurs where the qca cells are shifted from their intended locations [45]. the cell misalignment is displayed in fig. 3(d).
• extra (additional) cell: in this case, an extra cell is erroneously deposited on the substrate [29]. the extra cell defect is displayed in fig. 3(e).
• cell dislocation or cell rotation: this fault occurs where qca cells are rotated relative to the other cells in the array [40, 46]. fig. 3(f) shows this defect.

fig. 3 defects in the qca circuits, (a) fault-free majority gate, (b) cell omission, (c) cell displacement, (d) cell misalignment, (e) extra cell, (f) rotation defect [29]

2.5. related works

a majority gate is a vital component for 1-bit qca full adder constructions.
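as a minimal sketch of equations (1)-(3), the python snippet below models the gates purely at the logic level; it says nothing about the qca layout, clocking or defects. the function names are illustrative and are not taken from the paper or from qcadesigner, the sum output is assumed to be the usual three-input exclusive-or of the inputs, and the logic values 1 and 0 stand in for the fixed polarizations 1 and -1.

```python
from itertools import product


def majority(a, b, c):
    """3-input majority vote, equation (1): ab + ac + bc."""
    return (a & b) | (a & c) | (b & c)


def or2(a, b):
    """2-input OR: a majority gate with one input fixed to logic 1."""
    return majority(a, b, 1)


def and2(a, b):
    """2-input AND: a majority gate with one input fixed to logic 0."""
    return majority(a, b, 0)


def full_adder(a, b, c_in):
    """Return (sum, c_out) of a 1-bit full adder."""
    c_out = majority(a, b, c_in)  # equation (2)
    total = a ^ b ^ c_in          # assumed form of equation (3)
    return total, c_out


# Exhaustive check of the full adder against ordinary binary addition.
for a, b, c_in in product((0, 1), repeat=3):
    s, c_out = full_adder(a, b, c_in)
    assert 2 * c_out + s == a + b + c_in
```

the exhaustive loop covers all eight input combinations, so the check passes only if the majority-based carry and the xor-based sum together reproduce binary addition.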
hence, previous mgs and 1-bit qca constructions are reviewed in this section. note that the fault tolerance is calculated as follows [42]:

$\text{fault tolerance (\%)} = \frac{\text{number of correct output patterns}}{\text{number of defective patterns}} \times 100$ (4)

based on (4), the maximum fault tolerance in terms of the single missing cell defects is 20% for the conventional mg [42]. fig. 4 shows the mg construction in [47], with a 3 × 3 tile. this design has 13 qca cells. its correct function rate is 16.67% and 55.6% in terms of the double and single cell missing defects, respectively.

fig. 4 the utilized mg in [47]

fig. 5 shows the utilized mg in [42] with a 3 × 5 tile. this construction has 19 qca cells. this mg produces the correct output in 9 cases out of 15 for single cell omission, so it achieves 60% fault tolerance. it produces the correct output in 33 cases out of 105 for double cell omission, so it achieves 31.4% fault tolerance.

fig. 5 the utilized mg in [42]

on the other hand, the full adder is an important component in the construction of digital circuits. roohi et al. [12] have presented a 1-bit ft fa that uses one 5-input mg, one 3-input mg, and one inverter gate. the fault tolerance for the sum and carry outputs is 22.22% and 72.22%, respectively, in [12]. this 1-bit ft fa has 0.01 µm² area, 23 qca cells and three clock phases delay. cho and swartzlander [17] have presented a 1-bit fa that utilizes three mgs and two igs. this 1-bit fa has 0.09 µm² area and four clock phases delay. the fault tolerance for the sum and carry outputs is 32% and 60.49%, respectively, in [17]. du et al. [42] have presented a 1-bit ft fa that utilizes one 3-input ft mg, one 5-input mg and one inverter gate. the fault tolerance of the sum and carry outputs is 29.92% and 94.87%, respectively, in [42]. the area is 0.08 µm² and the delay is two clock phases.
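the arithmetic behind the mg percentages quoted above can be sketched as follows. the snippet assumes that equation (4) is the ratio of correct outputs to injected defect patterns, as the worked figures for the mg in [42] suggest, and that the defect patterns are counted as combinations of missing cells over the 15 cells of the 3 × 5 tile. both points are assumptions of this illustration, and the function names are hypothetical.

```python
from math import comb


def missing_cell_patterns(tile_cells, missing):
    """Number of ways to remove `missing` cells from a tile of `tile_cells` cells."""
    return comb(tile_cells, missing)


def fault_tolerance(correct_patterns, total_patterns):
    """Percentage of defect patterns for which the gate output is still correct."""
    if total_patterns <= 0:
        raise ValueError("total_patterns must be positive")
    return 100.0 * correct_patterns / total_patterns


# Figures reported for the 3 x 5 tile MG of [42].
single_total = missing_cell_patterns(15, 1)
double_total = missing_cell_patterns(15, 2)
single_ft = fault_tolerance(9, single_total)
double_ft = fault_tolerance(33, double_total)

print(f"single missing cell: {single_total} patterns, {single_ft:.1f}% fault tolerance")
print(f"double missing cell: {double_total} patterns, {double_ft:.1f}% fault tolerance")
```

under these assumptions the pattern counts come out as 15 and 105, matching the totals quoted for [42].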
kassa and nagaria [25] have presented a 1-bit fa that utilizes one 3-input mg, one 5-input mg and one ig. this 1-bit fa has 0.05 µm² area, 48 qca cells and three clock phases delay. the fault tolerance for the sum and carry outputs is 17.94% and 92.30%, respectively, in [25]. hayati and rezaei [21] have proposed a 1-bit fa that utilizes three 3-input mgs and one ig. this 1-bit fa has 0.02 µm² area, 38 qca cells and two clock phases delay. the fault tolerance for the sum and carry outputs is 12.12% and 48.48%, respectively, in [21]. kianpour et al. [22] have presented a 1-bit fa that utilizes three 3-input mgs and one ig. this 1-bit fa has 0.07 µm² area, 69 cells and three clock phases delay. the fault tolerance for the sum and carry outputs is 12.70% and 71.43%, respectively, in [22]. tougaw and lent [34] have presented a 1-bit fa that utilizes five 3-input mgs and three inverter gates. this 1-bit fa has 0.2 µm² area, 192 cells and five clock phases delay. the fault tolerance for the sum and carry outputs is 34.78% and 60%, respectively, in [34]. sen et al. [23] have presented a 1-bit fa that has 0.01 µm² area, 31 cells and two clock phases delay. the fault tolerance for the sum and carry outputs is 11.54% and 76.92%, respectively, in [23]. angizi et al. [31] have presented a 1-bit fa that utilizes one 3-input mg, one 5-input mg and one ig. this 1-bit fa has 0.09 µm² area, 95 cells and five clock phases delay. the fault tolerance is 25.55% for the sum output and 74.44% for the carry output.

3. the developed constructions

this section designs a new ft mg as a basic module. then, a novel 1-bit ft qca fa and a 4-bit ft qca rca construction are implemented using this novel ft mg.

3.1. the proposed ft mg

the proposed construction for the 3-input ft mg is shown in fig. 6. in this construction, the inputs are denoted by a, b and c, and the output is denoted by out.
the output of the proposed construction is determined as follows [20]:

$out = m(a, b, c) = ab + ac + bc$ (5)

in addition, table 1 shows the truth table of the developed 3-input ft mg.

fig. 6 the proposed 3-input ft mg (a) qca layout (b) logic diagram

table 1 truth table of the proposed ft 3-input mg
a   b   c   out
0   0   0   0
0   0   1   0
0   1   0   0
0   1   1   1
1   0   0   0
1   0   1   1
1   1   0   1
1   1   1   1

the layout of the proposed 3-input ft mg consists of 20 qca cells. it also has 0.02 μm² area with a 4 × 4 tile.

3.2. the developed 1-bit ft qca fa construction

the logical design and layout of the developed 1-bit qca ft full adder, which uses the proposed 3-input ft mg and a 3-input xor gate as building blocks, are shown in fig. 7.

fig. 7 the proposed 1-bit qca ft fa (a) qca layout, (b) logic diagram

as shown in fig. 7, the inputs are labeled a, b and cin and the outputs are labeled sum and cout. the proposed 1-bit qca ft full adder takes two clock phases to generate the sum function. the layout of the proposed 1-bit qca ft full adder includes 44 qca cells and its occupied area is 0.04 μm².

3.3. the proposed 4-bit qca ft rca

the logical design and layout of the proposed 4-bit qca ft rca are shown in fig. 8. as shown in fig. 8, the developed 4-bit qca ft rca uses four of the 1-bit qca ft full adder modules developed in this paper.

fig. 8 the proposed 4-bit qca ft rca, (a) qca layout, (b) logic diagram

the inputs are labeled a (a0, a1, a2, a3), b (b0, b1, b2, b3) and cin, and the outputs are labeled sum (s0, s1, s2, s3) and cout. the developed 4-bit qca ft rca takes four clock phases to generate the sum function. the layout of the proposed 4-bit qca ft rca consists of 236 qca cells and its occupied area is 0.49 μm².

4. simulation results and comparison

this section presents simulation results and comparison of the proposed ft constructions.
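before turning to the layout-level simulations, the intended logical behaviour of the 4-bit rca of section 3.3 can be sketched as a chain of four 1-bit full adders whose carry ripples from bit 0 to bit 3. this is only a functional reference model, assuming each stage computes sum = a xor b xor cin and cout = m(a, b, cin). the helper names and the least-significant-bit-first ordering are choices made for this illustration and are not part of the paper.

```python
from itertools import product


def majority(a, b, c):
    """3-input majority vote: ab + ac + bc."""
    return (a & b) | (a & c) | (b & c)


def full_adder(a, b, c_in):
    """1-bit full adder built from a 3-input XOR and a majority gate."""
    return a ^ b ^ c_in, majority(a, b, c_in)


def ripple_carry_add(a_bits, b_bits, c_in=0):
    """Add two equal-length bit lists (least significant bit first).

    Returns (sum_bits, c_out); the carry of each stage feeds the next one.
    """
    if len(a_bits) != len(b_bits):
        raise ValueError("operands must have the same width")
    sum_bits = []
    carry = c_in
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        sum_bits.append(s)
    return sum_bits, carry


def to_bits(value, width=4):
    """Integer to bit list, least significant bit first."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """Bit list (least significant bit first) to integer."""
    return sum(bit << i for i, bit in enumerate(bits))


# Exhaustive check of the 4-bit adder against integer addition.
for x, y, c_in in product(range(16), range(16), (0, 1)):
    s_bits, c_out = ripple_carry_add(to_bits(x), to_bits(y), c_in)
    assert from_bits(s_bits) + (c_out << 4) == x + y + c_in
```

a model of this kind gives the expected s0-s3 and cout values for any input vector, which can then be compared by hand with the simulated output waveforms.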
the qcadesigner tool version 2.0.3 has been utilized to simulate the proposed constructions.

4.1. the proposed 3-input ft mg

fig. 9 shows the simulation results of the proposed 3-input ft mg. for a fair comparison, the layout of the proposed 3-input ft mg is re-plotted in fig. 10, in which the qca cells are numbered.

fig. 9 the output waveform of the designed 3-input ft mg

fig. 10 the proposed mg with cell number

table 2 shows the simulation results of the 3-input ft mg in terms of the single missing cell defect. the correct functions and the fault tolerance of the various constructions for the double and single cell missing defects are compared in table 3.

table 2 single missing cell defect in the proposed 3-input ft mg
defect cell   output           result
1             m(a, b, c)       correct
2             m(a, b', c)      incorrect
3             m(a, b, c)       correct
4             m(a, b, c)       correct
5             m(a', b, c)      incorrect
6             c                incorrect
7             c                incorrect
8             m(a', b', c')    incorrect
9             m(a, b, c)       correct
10            m(a, b, c)       correct
11            m(a, b, c)       correct
12            m(a, b, c)       correct
13            m(a, b, c)       correct
14            m(a, b, c')      incorrect
15            m(a, b, c)       correct
16            m(a, b, c)       correct

table 3 the comparative table for the mg for double and single cell missing defects
reference                 missing cells   m(a, b, c)   total   fault tolerance (%)
(3 × 3-based mg) [47]     one             5            9       55.6
(3 × 3-based mg) [47]     two             6            36      16.67
(3 × 5-based mg) [42]     one             9            15      60
(3 × 5-based mg) [42]     two             33           105     31.4
this paper                one             10           16      62.5
this paper                two             45           120     37.5

based on the results shown in tables 2 and 3 and fig. 10, the fault tolerance of the proposed 3-input mg is 62.5% in terms of the single cell missing defect and 37.5% for the double cell missing defect. as a result, the fault tolerance for the double and single cell missing defects in the proposed 3-input mg is improved compared to the mgs in [42, 47].

4.2. the developed 1-bit qca ft fa

the output waveforms of the developed 1-bit qca ft fa are shown in fig. 11.

fig. 11 the output waveform of the developed 1-bit ft qca fa

the carry output and sum output fault tolerance of the developed 1-bit qca ft fa are 84.6% and 56.4%, respectively. in addition, the area is 0.04 µm² and the delay is two clock phases. table 4 shows the comparison of the 1-bit qca ft fa constructions. the cost is calculated as follows:

$\text{cost} = \text{area } (\mu m^2) \times \text{delay (clock phase)}$ (6)

the comparison between the proposed 1-bit qca ft fa and other 1-bit qca fa constructions shows that although the fault tolerance of the carry output in our design is lower than that in [42, 25], the fault tolerance of the sum output in the proposed 1-bit qca ft fa is improved compared to [42, 25]. in addition, the proposed 1-bit qca ft fa improves on [42] in terms of area and cost by about 50% each. the percentage improvement is calculated as follows:

$\text{improvement (\%)} = \left(1 - \frac{\text{our implementation result}}{\text{previous implementation result}}\right) \times 100$ (7)

for example, the area improvement over [42] is (1 - 0.04/0.08) × 100 = 50%.

table 4 comparative table for 1-bit qca fa
reference     sum fault tolerance (%)   carry fault tolerance (%)   area (µm²)   delay (clock phase)   cost (delay × area)
[12]          22.22                     72.22                       0.01         3                     0.03
[17]          32.00                     60.49                       0.09         3                     0.36
[42]          26.92                     94.87                       0.08         2                     0.16
[25]          17.94                     92.30                       0.04         2                     0.08
[21]          12.12                     48.48                       0.02         2                     0.04
[22]          12.70                     71.43                       0.09         4                     0.36
[34]          34.78                     60.00                       0.20         5                     1
[23]          11.54                     76.92                       0.01         2                     0.02
[31]          25.55                     74.44                       0.09         5                     0.45
this paper    56.4                      84.6                        0.04         2                     0.08

4.3. the developed 4-bit qca ft rca

fig. 12 shows the output waveform of the proposed 4-bit qca ft rca. for an optimum layout, the proposed 4-bit qca ft rca is implemented in only one layer using 236 qca cells and 0.49 μm² area. it also takes four clock phases to generate the outputs.

fig. 12 simulation results for the proposed 4-bit qca ft rca

in order to present a fair comparison, we have compared the proposed 4-bit ft qca rca with the existing designs in [12, 17, 32, 33, 42] in terms of the clock phase, occupied area, number of qca cells and fault tolerance in table 5. based on these results, our design provides a significant improvement in terms of the clock delay, area, number of qca cells and fault tolerance compared to previous designs in [12, 17, 23, 32, 33, 42]. moreover, the proposed constructions have significant robustness against the missing cell defects, and they can achieve a higher level of fault tolerance. according to equation (7), the proposed 4-bit qca rca has approximately 45.2% and 42.8% improvements compared to the 4-bit qca rca presented in [42] in terms of the number of qca cells and clock delay, respectively.

table 5 comparative table for 4-bit qca rcas
reference     number of cells   area (µm²)   delay (clock phase)   considered fault tolerance
[12]          165               0.18         6                     yes
[17]          371               0.4          6                     no
[21]          156               0.18         5                     yes
[42]          431               0.44         7                     yes
[23]          153               0.11         5                     no
[32]          308               0.29         8                     no
[33]          570               0.68         8                     no
this paper    236               0.49         4                     yes

5. conclusions

this paper presented and evaluated an efficient 3-input ft mg. the fault tolerance of the proposed 3-input mg has been investigated for double and single missing cell defects and compared to previous works. then, the 1-bit fa and 4-bit rca have been designed. the developed designs are simulated using the qcadesigner tool version 2.0.3. our simulation results confirm that the proposed 3-input ft mg could reach 62.5% fault tolerance for the single cell missing defect and 37.5% fault tolerance for the double cell missing defect. the proposed 1-bit qca ft full adder could reach 84.6% and 56.4% fault tolerance for the carry output and sum output, respectively. the results show that the developed adder constructions have significant improvements compared to other designs.

references

[1] a. naderi, m. ghodrati, "binary an efficient structure for t-cntfets with intrinsic-n-doped impurity distribution pattern in drain region", turk j elec eng & comp sci., vol. 26, no. 5, pp. 2335–2346, 2018. [2] a. karimi, a. rezai, "a design methodology to optimize the device performance in cntfet", ecs journal of solid state science and technology, vol. 6, no. 8, pp. 97–102, 2017. [3] m. shafizadeh, a. rezai, "improved device performance in a cntfet using lao3 high-κ dielectrics", journal of computational electronics, vol. 16, no. 2, pp. 221–227, 2017. [4] m. zareiee, "a new construction of the dual gate transistor for the analog and digital applications", int. j. electron. commun., vol. 100, no. 1, pp. 114–118, 2019. [5] a. naderi, k. moradi satari, f. heirani, "soi-mesfet with a layer of metal in buried oxide and a layer of sio2 in channel to improve rf and breakdown characteristics", materials science in semiconductor processing, vol. 88, no. 1, pp. 57–64, 2018. [6] m. zareiee, "a new structure for lateral double diffused mosfet to control the breakdown voltage and the on-resistance", silicon, https://doi.org/10.1007/s12633-019-0092-5, pp. 1–9, 2019. [7] c. s. lent, p. d. tougaw, w. porod, g.h. bernstein, "quantum cellular automata", nano., vol. 4, no. 1, pp.
49–57, 1994. [8] h. rashidi, a. rezai, "design of novel efficient multiplexer construction for quantum-dot cellular automata", j. nano electr. phys., vol. 9, no. 1, pp. 1–7, 2017. [9] h. rashidi, a. rezai, s. soltany, "high-performance multiplexer construction for quantum-dot cellular automata", j. comput. electr., vol. 15, no. 3, pp. 968–981, 2016. [10] d. mokhtari, a. rezai, h. rashidi, f. rabeie, s. emadi, a. karimi, "design of novel efficient full adder construction for quantum-dot cellular automata technology", facta universitatis, series: electronics and energetics, vol. 31, no. 2, pp. 279–285, 2018. [11] b. sen, m. goswami, s. mazumdar, b. k. sikdar, " towards modular design of reliable quantum-dot cellular automata logic circuit using multiplexers", comput. electr. eng., vol. 45, pp. 42–54, 2015. [12] a. roohi, r. f. demara, n. khoshavi, "design and evaluation of an ultra-area-efficient ft qca full adder", microelectr. j., vol. 46, no. 6, pp. 531–542, 2015. [13] m. niknezhad divshali, a. rezai, s.s. falahieh hamidpour, "design of novel coplanar counter circuit in quantum-dot cellular automata technology", in transaction journal of theoretical physics, 2019. [14] a shiri, a rezai, h mahmoodian, "design of efficient coplanar 1-bit comparator circuit in qca technology", facta universitatis, series: electronics and energetics, vol. 32, no. 1, pp.119–128, 2019. [15] i. edrisi arani, a. rezai, "novel circuit design of serial-parallel multiplier in quantum-dot cellular automata technology", j. comput. electr., vol. 17, no. 4, pp.1771–1779, 2018. [16] h. roshany, a. rezai, " novel efficient circuit design for multilayer qca rca", int. j. theor. phys., vol. 58, no. 6, pp. 1745–1757, 2019. [17] h. cho, e. e swartzlander, "adder and multiplier design in quantum-dot cellular automata", ieee trans comput., vol. 58, no. 6, pp. 721–727, 2009. [18] y. adelnia, a. rezai, "a novel adder circuit design in quantum-dot cellular automata technology", int. j. theor. phys., vol. 
58, no. 1, pp. 184–200, 2019. [19] m. r. azghadi, o. kavehei, k. navi, "a novel design for quantum-dot cellular automata cells and fulladders", j. appl. sci., vol. 7, no. 22, pp. 3460–3468, 2007. [20] h. rashidi, a. rezai, "high-performance full adder construction in quantum-dot cellular automata", j. eng., vol. 2017, no. 7, pp. 394–402, 2017. [21] m. hayati, a. rezaei, "design of novel efficient adder and subtractor for quantum-dot cellular automata", int. j. circuit theor. appl., vol. 43, no. 10, pp. 1446–1454, 2014. [22] m. kianpour, r. s. nadooshan, k. navi, "a novel design of 8-bit adder/subtractor by quantum-dot cellular automata", j. comput. sys. sci., vol. 80, no. 7, pp. 1404–1414, 2014. [23] b. sen, a. rajoria, b. k. sikdar, "design of efficient full adder in quantum-dot cellular automata”, sci world j., vol. 2013, pp. 1–10, 2013. [24] m. balali, a. rezai, h. balali, f. rabiei, s. emadi, " towards coplanar quantum-dot cellular automata adders based on efficient three-input xor gate", result phys., vol. 7, pp. 1389–1395, 2017. [25] s. r. kassa, r. k. nagaria, "a novel design of quantum-dot cellular automata 5-input mg with some physical proofs", j. comput. electr., vol. 15, no. 1, pp. 324–334, 2016. [26] r. farazkish, f. khodaparast, "design and characterization of a new ft full adder for quantum-dot cellular automata", microprocess microsyst., vol. 39, no. 6, pp. 426–433, 2015. [27] s. sheikhfaal, s. angizi, s. sarmadi, m. h. moaiyeri, s. sayedsalehi, "designing efficient qca logical circuits with power dissipation analysis", microelecter. j., vol. 46, no. 6, pp. 462–471, 2015. [28] h. b. sousan, m. mosleh, s. setayeshi, "designing and implementing a fast and robust full-adder in quantum-dot cellular automata (qca) technology", j. adv. comput. res., vol. 6, no. 1, pp. 27–45, 2015. [29] m. goswami, b. sen, r. mukherjee, b. k sikdar, "design of testable adder in quantom-dot cellular automata with fault secure logic", microelectr j., vol. 60, pp. 1–12, 2017. 
[30] k. navi, r. farazkish, s. sayedsalehi, m. r. azghadi, "a new quantum-dot cellular automata fulladder", microelectr j., vol. 41, no. 12, pp. 820–826, 2010. [31] s. angizi, e. alkaldy, n. bagherzadeh, k. navi, " novel robust single layer wire-crossing approach for exclusive-or sum of products logic design with quantum-dot cellular automata", j. low power electr., vol. 10, no. 2, pp. 259–271, 2014. [32] s. hashemi, m. tehrani, k. navi, "an efficient quantum-dot cellular automata full-adder", sci. res. essays., vol. 7, no. 2, pp. 177–189, 2012. [33] i. hänninen, j. takala, "binary adders on quantum-dot cellular automata", j. signal. proc. syst., vol. 58 no. 1, pp. 87–103, 2010. [34] p. d. tougaw, c. s. lent, "logical devices implemented using quantum cellular automata", j. appl. phys., vol. 75, no. 3, pp. 1818–1825, 1993. novel single layer fault tolerance rca construction for qca technology 613 [35] m. balali, a. rezai, " design of low-complexity and high-speed coplanar four-bit ripple carry adder in qca technology", international journal of theoretical physics, vol. 57, no. 7, pp. 1948–1960, 2018. [36] r. mokhtarii, a. rezai, "investigation and design of novel comparator in quantum-dot cellular automata technology", journal of nano-& electronic physics, vol. 10, no. 5, p. 05014(4pp), 2018. [37] m. niknezhad divshali, a. rezai, a. karimi "investigation and design of novel comparator in quantum-dot cellular automata technology", international journal of theoretical physics, vol. 57, no. 11, pp. 3326–3339, 2018. [38] a. fijany, b. n. toomarian, "new design for quantum dots cellular automata to obtain fault tolerant logic gates", j. nano. res., vol. 3, no. 1, pp. 27–37, 2001. [39] r. farazkish, k. navi, "new efficient five-input majority gate for quantum-dot cellular automata”, j. nano. res., vol. 14, no . 11, pp. 1–6, 2012. [40] r. farazkish, s. sayedsalehi, k. navi, "novel design for quantum dots cellular automata to obtain ft majority gate", j. nano., vol. 
2013, pp. 1–7, 2012. [41] m. dalui, b. sen, b. k. sikdar, "fault tolerant qca logic design with coupled majority-minority gate", int. j. comput. appl., vol. 1, no. 29, pp. 81–87, 2010. [42] h. du, h. lv, y. zhang, f. peng, g. xie, "design and analysis of new ft majority gate for quantum-dot cellular automata", j. comput. electr., vol. 15, no. 4, pp. 1484–1497, 2016. [43] b. sen, m. dutta, r. mukherjee, r. k. nath, a.p. sinha, b.k. sikdar, "towards the design of hybrid qca tiles targeting high fault tolerance", j. comput. electr., vol. 15, no. 2, pp. 429–445, 2016. [44] m. momenzadeh, m. ottavi, f. lombardi, "modeling qca defects at molecular-level in combinational circuits", in proc. ieee int. symp. defect fault tolerance vlsi syst., 3-5 oct 2005, pp. 208–216. [45] y. mahmoodi, m. a. tehrani, "novel fault tolerant qca circuits", in proc. conf. electr. comput. eng., 20-22 may 2014, pp. 20–22. [46] m. rahimpour gadim, n. jafari navimipour, "a new three-level fault tolerance arithmetic and logic unit based on quantum dot cellular automata", microsyst. technol., 2017. [47] j. huang, m. momenzadeh, f. lombardi, "on the tolerance to manufacturing defects in molecular qca tiles for processing-by-wire", j. electr. test., vol. 23, no. 2-3, pp. 163–174, 2017. [48] a. newaz bahar, md. mo. asaduzzaman, "a novel 3-input xor function implementation in quantum dot-cellular automata with energy dissipation analysis", alexandria engineering journal, vol. 57, no. 2, pp. 729–738, 2018.

facta universitatis series: electronics and energetics vol. 31, no. 2, june 2018, pp. 313-328
https://doi.org/10.2298/fuee1802313m

the surface recombination velocity and bulk lifetime influences on photogenerated excess carrier density and temperature distributions in n-type silicon excited by a frequency-modulated light source

dragana k. markushev¹, dragan d. markushev², slobodanka galović³, sanja aleksić¹, dragan s. pantić¹, dragan м. todorović⁴
¹ faculty of electronic engineering, university of niš, niš, serbia
² institute of physics, university of belgrade, belgrade-zemun, serbia
³ vinča institute of nuclear sciences, belgrade, serbia
⁴ institute for multidisciplinary research, university of belgrade, belgrade, serbia

abstract. the temperature distributions in an n-type silicon circular plate, excited from one side by a frequency-modulated light source, are investigated theoretically in the frequency domain. the influence of the photogenerated excess carrier density on the temperature distributions is considered with respect to the sample thickness, surface quality and carrier lifetime. the presence of the thermalization and non-radiative recombination processes is taken into account. the existence of fast and slow heat sources in the sample is recognized. it is shown that the temperature distribution on the sample surfaces is a sensitive function of the excess carrier density under bulk and surface recombination. the most favorable values of the ratio of surface velocities and of the bulk lifetime are established, allowing a simpler and more effective analysis of the carrier influence in semiconductors. the photothermal and photoacoustic transmission detection configuration is proposed as the most suitable experimental scheme for investigating the influence of excess carriers on the silicon surface temperatures.

key words: semiconductor, surface recombination, bulk lifetime, photogeneration, excess carriers, periodic thermal excitation, temperature distribution, frequency domain, photothermal, photoacoustic.
received august 23, 2017; received in revised form november 9, 2017
corresponding author: sanja aleksić, faculty of electronic engineering, university of niš, 14 aleksandra medvedeva, 18000 niš, serbia (e-mail: sanja.aleksic@elfak.ni.ac.rs)

1. introduction

it is well known that photoacoustic (pa) and photothermal (pt) techniques are based on a photo-induced change in the thermal state of the sample. when modulated light is used to excite a sample, part of the light energy is absorbed and periodically increases its internal energy, resulting in sample heating. periodic sample heating causes a periodic temperature change in the sample, both on its surfaces and in the bulk [1-3]. the evolution of solid-state materials and the development of the technology for integrated circuit and/or microelectromechanical systems (mems) fabrication have brought silicon (si) and other crystalline semiconductor materials into the focus of modern pt investigations [4-9]. our ability to understand the influence of the thermal propagation and carrier transport processes on the thermal state of si and other semiconductors is based on a detailed understanding of solid-state physics, as well as on the development of the new theoretical and experimental pa and pt methods and models necessary to turn the "pure" theory into the "measurable" reality [10-13]. a variety of pa and pt methods can be used to characterize semiconductors in detail [14-17]. the temperature changes can be detected by directly measuring the sample surface temperature variations or the perturbations of temperature-dependent thermodynamic parameters, such as pressure or density.
changes in these parameters are detected with various transducers, assuming that the transducer signal intensity is proportional to the amplitude of the measured parameter and depends on the temperature propagation across the investigated sample and its environment [18-20]. several experimental schemes can be used to efficiently detect temperature changes in irradiated semiconductor samples in the form of circular or rectangular plates. one of the simplest and most popular is the so-called pa transmission detection configuration [20-22]. in this scheme the sample "front" surface is irradiated by the modulated optical beam, and the transducer detects the signal behind the "rear" one. the thermal state of the illuminated semiconductor depends, among other things, on the density of photogenerated excess carriers and on their generation and recombination rates in response to the light-matter interaction. the reason is that the periodic generation of excess carriers in a semiconductor produces thermal waves induced by the carrier thermalization and recombination processes. these processes are responsible for the existence of fast and slow thermal sources within the semiconductor. a fast thermal source exists due to the thermalization of carriers to the band gap. slow thermal sources exist due to the recombination of carriers in the bulk and on the surfaces of the sample. the slow sources depend directly on the density of the photogenerated carriers [23-25]. the lifetime of the photogenerated excess carriers is a single-valued parameter that usually refers to the carrier recombination lifetime in the material bulk [26,27]. this bulk lifetime depends on the carrier density and on the concentration of dopant atoms. during their lifetime, carriers move from regions where their density is high to regions where it is low.
this transport mechanism, called carrier diffusion, is associated with the random motion of carriers. such motion is characterized by the carrier diffusion length, the average distance a carrier covers between generation and recombination. the carrier diffusion length is related to the bulk lifetime and the carrier diffusivity [29]. the diffusion length can therefore be used as a convenient parameter to define the relative thickness of a semiconductor sample. besides the bulk, the semiconductor surface also plays an important role in recombination. recombination on the surfaces is typically described by a surface lifetime. it depends, among other things, on the sample thickness and on the surface recombination velocities at the front and rear surfaces of the sample [30]. higher recombination velocities mean a larger number of recombination centers (active states) created by the abrupt termination of the semiconductor crystal. the reduction of these centers by conventional chemical or mechanical methods is known as surface passivation; the recombination velocity is expected to decrease during passivation. the surface and bulk generation and recombination effects on photogenerated carrier densities and temperature distributions can be analyzed by various pa and pt measuring techniques [3,8,24,31]. in pa methods, where the photoinduced effects are measured and analyzed as a function of the modulation frequency of the optical source, the sample response depends practically only on the periodically varying temperature (and thus on the distribution of the periodically varying photogenerated carriers). the steady temperature and carrier density components do not affect the photoacoustic response. this is why, in all theoretical considerations, only the periodic (quasi-static) components are taken into account [32-35].
on the other hand, transients can be ignored when signals are analyzed as a function of a periodic optical excitation [22,31]. the objective of this paper is to investigate theoretically the influence of photogenerated excess carrier densities on the temperature distribution in an n-type silicon circular plate irradiated from one side, both on the sample surfaces and in the bulk. assumptions regarding changes of the surface and bulk lifetimes are taken into account as parameters closely connected to the various pa and pt measurement conditions and configurations. a detailed thermal source analysis is presented as a function of modulation frequency in the range of ($10^1$-$10^7$) hz, for sample thicknesses larger (thick sample) and smaller (thin sample) than the excess carrier diffusion length.

2. theoretical background

photogenerated carrier dynamics

doped semiconductors contain majority and minority carriers. in n-type silicon, the majority carriers are electrons and the minority carriers are holes. excess carriers (electrons and holes) can be generated in a semiconductor by illuminating it with a modulated light source. a necessary condition is that the photon energy is larger than the band gap of the material. both minority and majority carriers are generated when a photon is absorbed. the number of majority carriers in an irradiated semiconductor does not change significantly. the opposite is true for the minority carriers: the number of photogenerated minority excess carriers far exceeds the number of minority carriers existing in the doped sample. therefore, the number of minority carriers in an irradiated sample can be approximated by the number of light-generated excess carriers.
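the band-gap condition stated above can be checked numerically. the following is a minimal sketch, not part of the original analysis: it computes the photon energy for a given wavelength and compares it with the silicon band gap. the 660 nm wavelength is the one used in the simulations of this paper; the band gap of 1.12 ev is an assumed room-temperature textbook value, and all function names are illustrative.

```python
# Physical constants (exact SI values)
PLANCK = 6.62607015e-34              # J s
LIGHT_SPEED = 2.99792458e8           # m/s
ELEMENTARY_CHARGE = 1.602176634e-19  # J per eV

# Assumption: room-temperature band gap of silicon (textbook value)
SI_BAND_GAP_EV = 1.12


def photon_energy_ev(wavelength_m):
    """Photon energy h*c/lambda, expressed in electronvolts."""
    return PLANCK * LIGHT_SPEED / wavelength_m / ELEMENTARY_CHARGE


def can_photogenerate(wavelength_m, band_gap_ev=SI_BAND_GAP_EV):
    """True if a photon of this wavelength can create an electron-hole pair."""
    return photon_energy_ev(wavelength_m) > band_gap_ev


def excess_energy_fraction(wavelength_m, band_gap_ev=SI_BAND_GAP_EV):
    """Fraction of the photon energy above the gap (released by thermalization)."""
    energy = photon_energy_ev(wavelength_m)
    return max(0.0, (energy - band_gap_ev) / energy)


if __name__ == "__main__":
    wavelength = 660e-9  # red laser diode wavelength used in this paper
    print(f"photon energy: {photon_energy_ev(wavelength):.3f} eV")
    print(f"above band gap: {can_photogenerate(wavelength)}")
    print(f"excess energy fraction: {excess_energy_fraction(wavelength):.3f}")
```

under these assumptions a 660 nm photon carries noticeably more energy than the gap, so part of the absorbed energy is available as a fast (thermalization) heat source and the rest is released later through recombination.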
if we assume that the excess carrier transport happens due to diffusion only, and accept a simple model in which the recombination and generation rates are proportional to the minority excess carrier density, we can write the time-dependent 1d diffusion equation for holes along the x-axis in n-type silicon with a low donor density [5,22]:

$$\frac{\partial\,\delta n_p(x,t)}{\partial t} = D_p\,\frac{\partial^2 \delta n_p(x,t)}{\partial x^2} + G_p(x,t) - R_p(x,t), \qquad (1)$$

where $\delta n_p(x,t)$ is the excess minority carrier (hole) density, $D_p$ is the hole diffusivity, $G_p(x,t)$ is the volumetric rate of hole generation and $R_p(x,t)$ is the volumetric rate of hole recombination. simple equations can be used to define $G_p(x,t)$ and $R_p(x,t)$ assuming modulated light excitation [5,6]:

$$G_p(x,t) = \frac{\beta I_0}{\varepsilon}\,e^{-\beta x}\,\frac{1}{2}\left(1+e^{i\omega t}\right) \quad\text{and}\quad R_p(x,t) = \frac{\delta n_p(x,t)}{\tau} = \frac{\delta n_p(x)}{\tau}\,\frac{1}{2}\left(1+e^{i\omega t}\right), \qquad (2)$$

where $I_0$ is the incident light intensity, $\varepsilon$ is the photon energy, $\beta$ is the absorption coefficient of the semiconductor material, $\omega = 2\pi f$, $f$ is the light modulation frequency, $\tau = \tau_p$ is the hole bulk lifetime and $\delta n_p(x)$ is the spatial part of $\delta n_p(x,t)$. if the general solution of eq. (1) is given by $\delta n_p(x,t) = \tfrac{1}{2}\,\delta n_p(x)\left(1+e^{i\omega t}\right)$, one can obtain the steady (eq. (3.a)) and periodic (eq. (3.b)) components of the equation describing the excess carrier (hole) dynamics in the sample [22,23,31]:

$$\frac{d^2 \delta n_p(x)}{dx^2} - \frac{\delta n_p(x)}{L_p^2} = -\frac{\beta I_0}{\varepsilon D_p}\,e^{-\beta x}, \qquad (3.a)$$

$$\frac{d^2 \delta n_p(x,\omega)}{dx^2} - \frac{\delta n_p(x,\omega)}{L^2} = -\frac{\beta I_0}{\varepsilon D_p}\,e^{-\beta x}, \qquad (3.b)$$

where $L = L_p/\sqrt{1+i\omega\tau}$ is the complex, frequency-dependent diffusion length, $L_p = \sqrt{D_p\tau}$ is the hole diffusion length, and $\delta n_p(x,\omega)$ is the periodic excess carrier density. for the sake of simplicity, we will denote $\delta n_p(x,\omega) = \delta n_p(x)$ in all further analysis. assuming that only $\delta n_p(x)$ is responsible for the sample response (pa case), the general solution of eq. (3.b) can be written in the form:

$$\delta n_p(x) = A_1 e^{x/L} + A_2 e^{-x/L} - \frac{\beta I_0}{\varepsilon D_p\left(\beta^2 - L^{-2}\right)}\,e^{-\beta x}, \qquad (4)$$

where $A_1$ and $A_2$ are integration constants, which can be obtained using proper boundary conditions. the boundary conditions introduce, among other things, the surface recombination velocities $s$ on the front ($s_1$) and back ($s_2$) sample surfaces as the parameters that define the efficiency of the surface recombination process: a higher value of $s$ means a higher recombination efficiency. in our case, the boundary conditions are given by:

$$D_p\left.\frac{d\,\delta n_p(x)}{dx}\right|_{x=0} = s_1\,\delta n_p(0) \quad\text{and}\quad D_p\left.\frac{d\,\delta n_p(x)}{dx}\right|_{x=l} = -s_2\,\delta n_p(l), \qquad (5)$$

where the sample front surface is at $x = 0$ and the rear surface is at $x = l$. using eqs. (3-5), the constants $A_1$ and $A_2$ can be calculated and presented in the following form:

$$A_1 = \frac{A}{A_0}\left[(v_D - s_2)(v_\beta + s_1)\,e^{-l/L} - (v_D + s_1)(v_\beta - s_2)\,e^{-\beta l}\right], \qquad (6a)$$

$$A_2 = \frac{A}{A_0}\left[(v_D + s_2)(v_\beta + s_1)\,e^{l/L} - (v_D - s_1)(v_\beta - s_2)\,e^{-\beta l}\right], \qquad (6b)$$

$$A = \frac{\beta I_0}{\varepsilon D_p\left(\beta^2 - L^{-2}\right)}, \qquad (6c)$$

and

$$A_0 = (v_D + s_1)(v_D + s_2)\,e^{l/L} - (v_D - s_1)(v_D - s_2)\,e^{-l/L}. \qquad (6d)$$

here $v_D = D_p/L$ and $v_\beta = \beta D_p$ are the excess carrier diffusion velocities.

temperature distribution with excess carrier influence

the excess carrier generation, recombination and transport processes affect the spatial temperature distribution $T_s(x,t) = T_{st}(x) + \mathrm{Re}\left(\vartheta_s(x)\,e^{i\omega t}\right)$ within the sample, where $T_{st}(x)$ is the steady component and $\mathrm{Re}\left(\vartheta_s(x)\,e^{i\omega t}\right)$ is the periodic (quasi-static) component (a modulated light source is assumed). taking into account the pa simplification, only the $\mathrm{Re}\left(\vartheta_s(x)\,e^{i\omega t}\right)$ component will be considered in our further analysis [32,33]. to include the excess carrier influence, $\delta n_p(x)$ (eq. (4)) is used in the $\vartheta_s(x)$ calculations.
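the solution given by eqs. (4)-(6) can be evaluated directly with complex arithmetic. the block below is a minimal sketch under stated assumptions rather than the authors' code: si units are used throughout, the absorption coefficient `beta` (about 2.58e5 1/m for silicon near 660 nm) and the unit light intensity `i0` are assumed illustrative values, the photon energy `eps` corresponds to 660 nm, and the function name is hypothetical.

```python
import numpy as np


def excess_carrier_density(x, f, l, s1, s2, tau=1e-4, d_p=1.2e-3,
                           beta=2.58e5, i0=1.0, eps=3.01e-19):
    """Periodic excess hole density delta n_p(x) from eqs. (4)-(6).

    x, l  : position and sample thickness (m)
    f     : modulation frequency (Hz)
    s1,s2 : front and rear surface recombination velocities (m/s)
    tau   : bulk lifetime (s); d_p : hole diffusivity (m^2/s)
    beta, i0, eps : assumed absorption coefficient (1/m), light
                    intensity (W/m^2) and photon energy (J)
    """
    omega = 2.0 * np.pi * f
    # complex, frequency-dependent diffusion length L = L_p / sqrt(1 + i*omega*tau)
    big_l = np.sqrt(d_p * tau) / np.sqrt(1.0 + 1j * omega * tau)
    v_d = d_p / big_l          # diffusion velocity v_D
    v_b = beta * d_p           # diffusion velocity v_beta
    a = beta * i0 / (eps * d_p * (beta**2 - big_l**-2))            # eq. (6c)
    denom = ((v_d + s1) * (v_d + s2) * np.exp(l / big_l)
             - (v_d - s1) * (v_d - s2) * np.exp(-l / big_l))       # eq. (6d)
    a1 = a * ((v_d - s2) * (v_b + s1) * np.exp(-l / big_l)
              - (v_d + s1) * (v_b - s2) * np.exp(-beta * l)) / denom  # eq. (6a)
    a2 = a * ((v_d + s2) * (v_b + s1) * np.exp(l / big_l)
              - (v_d - s1) * (v_b - s2) * np.exp(-beta * l)) / denom  # eq. (6b)
    return a1 * np.exp(x / big_l) + a2 * np.exp(-x / big_l) - a * np.exp(-beta * x)


if __name__ == "__main__":
    thickness = 10e-6  # thin sample
    front = excess_carrier_density(0.0, 10.0, thickness, 0.0, 24.0)
    rear = excess_carrier_density(thickness, 10.0, thickness, 0.0, 24.0)
    print("front/rear amplitude ratio:", abs(front / rear))
```

because the result is complex, its modulus gives the amplitude and its argument the phase of the periodic carrier density; the absolute scale depends on the assumed `i0`, so only ratios and frequency trends should be read from such a sketch.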
assuming that thermalization $H_{therm}(x)$, bulk recombination $H_{br}(x)$ and surface recombination $H_{sr}(x)$ heat sources exist in the sample, the 1d thermal diffusion equation can be written in the form [22,23,31]:

$$\frac{d^2 \vartheta_s(x)}{dx^2} - \sigma^2\,\vartheta_s(x) = -\frac{1}{k}\left[H_{therm}(x) + H_{br}(x)\right], \qquad (7)$$

where $\sigma = (1+i)\sqrt{\pi f/D}$ is the complex heat diffusion coefficient, $D = k/(\rho C)$ is the thermal diffusivity of the sample ($k$ is the heat conductivity, $\rho$ is the density, $C$ is the heat capacity) and:

$$H_{therm}(x) = \beta I_0\,\frac{\varepsilon - \varepsilon_g}{\varepsilon}\,\exp(-\beta x), \qquad (8.a)$$

$$H_{br}(x) = \frac{\varepsilon_g}{\tau}\,\delta n_p(x), \qquad (8.b)$$

where $\varepsilon_g$ is the semiconductor band gap energy. the surface recombination heat sources $H_{sr}(0)$ and $H_{sr}(l)$ are incorporated in the boundary conditions on the front ($x = 0$) and back ($x = l$) sample surfaces:

$$-k\left.\frac{d\vartheta_s(x)}{dx}\right|_{x=0} = s_1\,\delta n_p(0)\,\varepsilon_g \quad\text{and}\quad -k\left.\frac{d\vartheta_s(x)}{dx}\right|_{x=l} = -s_2\,\delta n_p(l)\,\varepsilon_g. \qquad (9)$$

the general solution of eq. (7), using the boundary conditions (eq. (9)), can be written as a total temperature distribution $\vartheta_s(x)$ given by a vector sum of the thermalization $\vartheta_{therm}(x)$, bulk recombination $\vartheta_{br}(x)$ and surface recombination $\vartheta_{sr}(x)$ components, resulting from three different heat generation mechanisms [21,22,31]:

$$\vartheta_s(x) = \vartheta_{therm}(x) + \vartheta_{br}(x) + \vartheta_{sr}(x), \qquad (10)$$

where

$$\vartheta_{therm}(x) = \frac{I_0\,\beta}{k}\,\frac{\varepsilon - \varepsilon_g}{\varepsilon}\,\frac{1}{\beta^2 - \sigma^2}\left\{ b\,\frac{e^{\sigma(x-l)} + e^{-\sigma(x-l)} - \left(e^{\sigma x} + e^{-\sigma x}\right)e^{-\beta l}}{e^{\sigma l} - e^{-\sigma l}} - e^{-\beta x} \right\}, \qquad (11.a)$$

$$\vartheta_{br}(x) = \frac{\varepsilon_g}{\tau k \sigma^2}\left\{ \frac{\delta n_p(x)}{1 - c^2} + \frac{b_1\left(c^2 - b^2\right)}{\left(1 - b^2\right)\left(1 - c^2\right)}\,e^{-\beta x} + \frac{b_2\,e^{\sigma x} + b_3\,e^{-\sigma x}}{e^{\sigma l} - e^{-\sigma l}} \right\}, \qquad (11.b)$$

$$\vartheta_{sr}(x) = \frac{2\,\varepsilon_g}{k\,\sigma}\,\frac{s_1\,\delta n_p(0)\cosh[\sigma(x - l)] + s_2\,\delta n_p(l)\cosh(\sigma x)}{e^{\sigma l} - e^{-\sigma l}}, \qquad (11.c)$$

with $b = \beta/\sigma$, $c = 1/(L\sigma)$, $b_1 = A$, $b_2 = b_4\,e^{-\sigma l} - b_5$, $b_3 = b_4\,e^{\sigma l} - b_5$, and

$$b_4 = \frac{c\left\{\left[\delta n_p(l) - \delta n_p(0)\cosh(l/L)\right] + b_1\left[e^{-\beta l} - \cosh(l/L)\right]\right\}}{\left(1 - c^2\right)\sinh(l/L)} + \frac{b\,b_1}{1 - b^2},$$

$$b_5 = \frac{c\left\{\left[\delta n_p(l)\cosh(l/L) - \delta n_p(0)\right] + b_1\left[e^{-\beta l}\cosh(l/L) - 1\right]\right\}}{\left(1 - c^2\right)\sinh(l/L)} + \frac{b\,b_1\,e^{-\beta l}}{1 - b^2}.$$

3. results and discussions

our simulations were performed assuming 1d transport of the photogenerated carriers along the x-axis (sample thickness). the illuminated sample front surface is at $x = 0$ and its rear surface is at $x = l$. the samples were supposed to be n-type si circular plates with the donor density $N_d = 2\cdot10^{15}$ cm$^{-3}$. the resistivity $\rho_{si}$ and hole diffusivity $D_p$ are found to be $\rho_{si} = 2.37$ $\Omega$ cm and $D_p = 12$ cm$^2$/s [5]. the carrier (hole) bulk lifetime is taken to be $\tau = 10^{-4}$ s [4,5,27]. it is assumed that the sample is excited by a 5 mw red laser diode ($\lambda = 660$ nm) modulated in the frequency range $f$ of ($10$-$10^7$) hz. the upper frequency limit is defined by the condition $2\pi f\tau_{therm} < 1$ [32-35], where $\tau_{therm} \sim 10^{-8}$ s is the assumed si thermal relaxation time [35-36]. under this condition the $\vartheta_s(x)$ spectra predicted by the fourier parabolic and the cattaneo-vernotte hyperbolic theoretical models behave similarly. the lower frequency limit is determined by the fact that for $f \geq 10$ hz we are able to separate the influence of the individual $\vartheta_s(x)$ components. within the given frequency range it is assumed that the temperatures of all quasi-particle subsystems are the same [33]. a simple ratio between the sample thickness $l$ and the hole diffusion length $L_p$ appears in many of the equations written above, affecting the behavior of the carrier density and temperature distribution in the frequency domain.
the presented analysis was performed assuming two groups of samples defined by their relative thickness: thick in the case of $l/L_p > 1$ and thin in the case of $l/L_p < 1$. in both cases, for the given $\tau$, $L_p = 346.4$ µm. each group is characterized by specific effects in the carrier density or temperature distributions induced by photogenerated carriers. a 1000 µm thick silicon plate was taken as a typical representative of the thick samples, and a 10 µm thick plate represents the thin ones. our calculations account for the silicon surface quality by changing the values of the recombination velocities at the front and at the rear. bulk lifetime changes are taken into account, too; they are closely connected to the sample material quality due to defects and impurities. finally, our analysis is focused on the front and rear sample surfaces, which are characterized by different levels of carrier influence on $\vartheta_s(x)$. this type of analysis could help to find the best method and experimental conditions in pa and pt measurements for obtaining a significant influence of photogenerated carriers on the semiconductor thermal response.

carrier density: surface recombination influence

in order to simplify the analysis and to better understand the behavior of the photogenerated excess carrier density in the frequency domain, the amplitudes $A$ of $\delta n_p(x)$ are considered rather than the phases. to calculate $A$, equation (1) is used with different values of the recombination velocities $s_1$ and $s_2$ and a constant hole bulk lifetime ($\tau = 10^{-4}$ s). the values of $s$ are taken to be 0 m/s and 24 m/s, representing a totally passive and an active surface, respectively [20,22,23,31]. the value of $\tau$ corresponds to the assumed $N_d$ value [5,6,27]. it must be noted that in reality most silicon wafers have higher levels of contaminants and thus shorter lifetimes than used here.
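the quoted diffusion length and the thick/thin classification follow directly from $L_p = \sqrt{D_p\tau}$. the short sketch below reproduces these numbers from the parameter values given in the text; the helper names are illustrative, and the last function simply returns the frequency at which $2\pi f\tau_{therm}$ reaches one, for the assumed $\tau_{therm}$ of about $10^{-8}$ s.

```python
import math


def diffusion_length(d_p, tau):
    """Hole diffusion length L_p = sqrt(D_p * tau), in metres."""
    return math.sqrt(d_p * tau)


def relative_thickness_class(l, l_p):
    """Classify a sample as 'thick' (l/L_p > 1) or 'thin' (l/L_p < 1)."""
    return "thick" if l / l_p > 1.0 else "thin"


def unity_frequency(tau_therm):
    """Frequency at which 2*pi*f*tau_therm equals one, in Hz."""
    return 1.0 / (2.0 * math.pi * tau_therm)


if __name__ == "__main__":
    d_p = 12e-4   # hole diffusivity: 12 cm^2/s expressed in m^2/s
    tau = 1e-4    # bulk lifetime in s
    l_p = diffusion_length(d_p, tau)
    print(f"L_p = {l_p * 1e6:.1f} um")
    for thickness in (1000e-6, 10e-6):
        label = relative_thickness_class(thickness, l_p)
        print(f"l = {thickness * 1e6:.0f} um -> {label} (l/L_p = {thickness / l_p:.3f})")
    print(f"2*pi*f*tau_therm = 1 at f = {unity_frequency(1e-8):.3g} Hz")
```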
calculated density amplitudes on the front, $\delta n_p(0)$, and rear, $\delta n_p(l)$, sample surfaces as a function of modulation frequency $f$ are shown in figure 1.

[figure 1, panels a)-d): amplitude $A$ (m$^{-3}$) versus modulation frequency $f$ (hz), $\tau = 10^{-4}$ s; a) $s_1 = 0$ m/s, $s_2 = 24$ m/s; b) $s_1 = s_2 = 24$ m/s; c) $s_1 = s_2 = 0$ m/s; d) $s_1 = 24$ m/s, $s_2 = 0$ m/s]

fig. 1 comparison of the excess carrier density $\delta n_p(x)$ amplitudes $A$ at the front ($\delta n_p(0)$, magenta) and rear ($\delta n_p(l)$, light blue) surface versus modulation frequency $f$, in the case of a thick ($l = 1000$ µm, dash) and thin ($l = 10$ µm, solid) sample. different ratios of the surface recombination velocities $s_1$ and $s_2$ are presented (a,b,c,d) assuming a constant lifetime $\tau = 10^{-4}$ s.

in the case of thick samples, the results show that the density of excess carriers on the front surface is much greater, while the rear density is at least one order of magnitude lower. a brief numerical analysis shows that $\delta n_p(0) \sim L$ and $\delta n_p(l) \sim L\,e^{-l/L}$, where $L$ is the frequency-dependent diffusion length of eq. (3.b). both densities are independent of frequency when $\omega\tau \ll 1$. at higher frequencies, a sharp decrease of $\delta n_p(0)$ and $\delta n_p(l)$ is expected, with different slopes.
obviously, $\delta n_p(l)$ is much more sensitive to the sample thickness $l$ than $\delta n_p(0)$ is. in the case of thin samples, the results show that the $\delta n_p(0)$ and $\delta n_p(l)$ values are almost identical. at lower frequencies, they are independent of frequency and approach the limiting density values for the given $s_1$ and $s_2$ ratio. at higher frequencies, the $l/L$ ratio determines their behavior. in the case of $l/L < 1$, a sharp decrease of both densities is expected with the same slope. in the case of $l/L \geq 1$, the sharp decrease of $\delta n_p(l)$ continues but the slope of $\delta n_p(0)$ becomes smaller. an exception must be made in the case of full passivation on both surfaces ($s_1 = s_2 = 0$) presented in figure 1.c. the values of $\delta n_p(0)$ and $\delta n_p(l)$ obtained at lower frequencies are much higher than those calculated for the other $s_1$ and $s_2$ ratios. at higher frequencies, the densities on both surfaces start to decrease following the frequency dependence of $L$ until $l/L \sim 1$, when they start to differ ($f \sim 10^7$ hz). the correct understanding of the carrier density influence on the thermal state of the sample sometimes lies in the analysis of a density change rather than of its absolute value [3,31]. for this reason, the amplitude ratio $A_R$ of the carrier density on the front and back of the sample ($A_R = |\delta n_p(0)/\delta n_p(l)|$) was taken as a suitable parameter for our analysis. the results of this analysis in the frequency domain, as a function of the sample thickness and for different $s_1$ and $s_2$ ratios, are shown in figure 2. as follows from this figure, the highest values of the density ratio (red lines) are obtained when the illuminated front sample surface is perfectly passivated ($s_1 = 0$ m/s) while the rear one has a high recombination velocity ($s_2 = 24$ m/s). this result will be used in the further analysis as the most suitable ratio of recombination velocities, for which the effects of the excess carriers are most obvious.
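since the high-frequency behavior described above is governed by the ratio $l/L$, it can be useful to see how its modulus grows with frequency. the following minimal sketch assumes the definition $L = L_p/\sqrt{1+i\omega\tau}$ from section 2 and the parameter values of this paper; the function names are illustrative, and the crossover expression is just the algebraic solution of $|l/L| = 1$, not a result taken from the original text.

```python
import cmath
import math


def relative_thickness(f, l, d_p=1.2e-3, tau=1e-4):
    """Modulus |l/L| with L = L_p / sqrt(1 + i*omega*tau) (SI units)."""
    omega = 2.0 * math.pi * f
    l_p = math.sqrt(d_p * tau)
    return abs(l * cmath.sqrt(1.0 + 1j * omega * tau) / l_p)


def crossover_frequency(l, d_p=1.2e-3, tau=1e-4):
    """Frequency where |l/L| = 1; returns None for samples with l >= L_p.

    From |l/L| = (l/L_p) * (1 + (omega*tau)^2)^(1/4) = 1.
    """
    l_p = math.sqrt(d_p * tau)
    if l >= l_p:
        return None  # already 'thick' at every frequency
    omega_tau = math.sqrt((l_p / l) ** 4 - 1.0)
    return omega_tau / (2.0 * math.pi * tau)


if __name__ == "__main__":
    thin = 10e-6
    for freq in (1e1, 1e3, 1e5, 1e7):
        print(f"f = {freq:8.0e} Hz  |l/L| = {relative_thickness(freq, thin):.3f}")
    print(f"|l/L| = 1 at about {crossover_frequency(thin):.3g} Hz")
```

for the thick sample the function returns `None`, reflecting that $|l/L| > 1$ already holds at the lowest frequencies.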
[figure 2, panels a) and b): amplitude ratio $A_R$ versus modulation frequency $f$ (hz), $\tau = 10^{-4}$ s, curves for $s_1 = 0$ m/s, $s_2 = 24$ m/s; $s_1 = s_2 = 24$ m/s; $s_1 = s_2 = 0$ m/s; $s_1 = 24$ m/s, $s_2 = 0$ m/s; a) $l = 1000$ µm; b) $l = 10$ µm]

fig. 2 comparison of the photogenerated excess carrier density amplitude ratio $A_R = |\delta n_p(0)/\delta n_p(l)|$ between the front and rear surface versus modulation frequency $f$, calculated for different ratios of the surface recombination velocities $s_1$ and $s_2$ and constant $\tau = 10^{-4}$ s in the case of: a) thick ($l = 1000$ µm) and b) thin ($l = 10$ µm) sample.

carrier density: bulk lifetime influence

equation (1) is used again to calculate the amplitude $A$ of the photogenerated excess carrier density $\delta n_p(x)$ for different values of the bulk lifetime and constant values of $s_1 = 0$ m/s and $s_2 = 24$ m/s. the values of $\tau$ are taken to be between $10^{-4}$ s and $5\cdot10^{-7}$ s, representing low- and high-level donor density values $N_d$, respectively. a shorter lifetime also means a higher level of si crystal lattice defects and/or contaminants. the calculated densities on the front, $\delta n_p(0)$, and rear, $\delta n_p(l)$, sample surfaces as a function of the modulation frequency $f$ are shown in figure 3. in the case of thick samples, the results show that, besides a much greater density of excess carriers on the front surface, the rear density decreases sharply with decreasing bulk lifetime, reaching the limit of 1 carrier per unit volume at $\tau = 5\cdot10^{-7}$ s. as mentioned before, both densities are independent of frequency when $\omega\tau \ll 1$. it is seen that $\delta n_p(0)$ and $\delta n_p(l)$ decrease at higher frequencies with different slopes.
as follows from this analysis, $\delta n_p(l)$ is much more sensitive to lifetime changes than $\delta n_p(0)$ is. in the case of thin samples, the results again show that the shapes of $\delta n_p(0)$ and $\delta n_p(l)$ in the frequency domain are almost identical, with a weak dependence on the lifetime changes. calculated for the same $s_1$ and $s_2$ ratio, both density values remain constant at lower frequencies and decrease at higher ones following the $l/L$ ratio. no significant differences between $\delta n_p(0)$ and $\delta n_p(l)$ are found except at very high frequencies ($l/L \sim 1$).

[figure 3, panels a)-d): amplitude $A$ (m$^{-3}$) versus modulation frequency $f$ (hz), $s_1 = 0$ m/s, $s_2 = 24$ m/s; a) $\tau = 10^{-4}$ s; b) $\tau = 10^{-5}$ s; c) $\tau = 10^{-6}$ s; d) $\tau = 5\cdot10^{-7}$ s]

fig. 3 comparison of the excess carrier density $\delta n_p(x)$ amplitudes $A$ at the front ($\delta n_p(0)$, magenta) and rear ($\delta n_p(l)$, light blue) surface versus modulation frequency $f$, in the case of a thick ($l = 1000$ µm, dash) and thin ($l = 10$ µm, solid) sample. different lifetime values $\tau$ are taken into account (a,b,c,d) assuming a constant $s_1$ and $s_2$ ratio.
in order to further survey the influence of the photogenerated excess carriers, we analyzed the amplitude ratios $A_R$ ($A_R = |\delta n_p(0)/\delta n_p(l)|$) for the thick and thin samples presented in figure 3. the results of this analysis in the frequency domain are shown in figure 4. it is clear that the less intensive ratios are assigned to the higher $\tau$ values. in our calculation, higher values of the lifetime mean lower levels of defects and contaminants in the sample. this fact allows for a simpler analysis of carrier transport effects in the sample. therefore, in our further calculations, the value of $\tau = 10^{-4}$ s (red lines) will be taken as the basis for the study of the excess carrier effect on the thermal properties of a semiconductor.

[figure 4, panels a) and b): amplitude ratio $A_R$ versus modulation frequency $f$ (hz), $s_1 = 0$ m/s, $s_2 = 24$ m/s, curves for $\tau = 10^{-4}$ s, $10^{-5}$ s, $10^{-6}$ s and $5\cdot10^{-7}$ s; a) $l = 1000$ µm; b) $l = 10$ µm]

fig. 4 comparison of the excess carrier density amplitude ratio $A_R = |\delta n_p(0)/\delta n_p(l)|$ between the front and rear surface versus modulation frequency $f$, calculated for different values of the bulk lifetime $\tau$ and a constant $s_1$ and $s_2$ ratio in the case of: a) thick ($l = 1000$ µm) and b) thin ($l = 10$ µm) sample.

total surface temperature: excess carrier contributions

based on the results obtained in the previous sections and the established most convenient parameter values ($s_1 = 0$ m/s, $s_2 = 24$ m/s, $\tau = 10^{-4}$ s) for the study of the excess carrier influence, the dependences of the total temperature distribution $\vartheta_s(x)$ and its components ($\vartheta_{therm}(x)$, $\vartheta_{br}(x)$ and $\vartheta_{sr}(x)$) on the modulation frequency $f$ are presented in figures 5 and 6. their normalized amplitudes $A_n$ and phases $\varphi$ are calculated using eqs. (10) and (11) on the front ($x = 0$) and rear ($x = l$) sample surfaces in the case of thick ($l = 1000$ µm) and thin ($l = 10$ µm) samples. the amplitude normalization is performed using the same value of the light intensity $I_0$ to make the different temperature distribution components comparable to each other. all calculations are performed assuming that the absorption of the incident light in the surrounding media is negligible. in the case of thick samples, the results show that on the front surface (figure 5.a,b) and at lower frequencies ($f < 10^3$ hz), the photogenerated excess carrier contribution to $\vartheta_s(x)$ is significant through the bulk recombination component $\vartheta_{br}(x)$. at higher frequencies ($f > 10^3$ hz), the $\vartheta_{therm}(x)$ component predominates, so no significant excess carrier contribution to $\vartheta_s(x)$ can be found. the $\vartheta_{sr}(x)$ component is negligible in the whole frequency range, which is expected ($s_1 = 0$ m/s). on the rear surface (figure 5.c,d), the excess carrier contribution to $\vartheta_s(x)$ can be found in the whole frequency range: at lower frequencies $\vartheta_{br}(x)$ contributes significantly; at higher frequencies $\vartheta_{sr}(x)$ predominates ($s_2 = 24$ m/s). this $\vartheta_s(x)$ analysis confirms previously obtained results [22,23,31] and must hold for all sample thicknesses satisfying the condition $l/L_p > 1$.

[figure 5, panels a)-d): normalized amplitude $A_n$ (a.u.) and phase $\varphi$ (degrees) versus modulation frequency $f$ (hz), $l = 1000$ µm; a), b) front surface, $x = 0$; c), d) rear surface, $x = l$; curves for the total temperature and its thermalization, bulk recombination and surface recombination components]

fig. 5 comparison of the surface temperature distribution $\vartheta_s(x)$ normalized amplitude $A_n$ and phase $\varphi$ at the front (a,b, solid magenta) and rear surface (c,d, solid light blue) of a thick sample ($l = 1000$ µm), with the thermalization ($\vartheta_{therm}(x)$, dash), bulk recombination ($\vartheta_{br}(x)$, dot) and surface recombination ($\vartheta_{sr}(x)$, dash-dot) contributions versus modulation frequency $f$.

in the case of thin samples, the results show that on the front surface (figure 6.a,b), the photogenerated excess carrier contribution to $\vartheta_s(x)$ is significant in almost the whole frequency range through the surface recombination component $\vartheta_{sr}(x)$. only at very high frequencies ($f > 10^5$ hz) does the $\vartheta_{therm}(x)$ component dominate, so that no significant excess carrier contribution to $\vartheta_s(x)$ can be found. the $\vartheta_{br}(x)$ component is negligible in the whole frequency range, which is expected ($l \ll L_p$). on the rear surface (figure 6.c,d), the excess carrier contribution to $\vartheta_s(x)$ can be found in the whole frequency range, where the $\vartheta_{sr}(x)$ component dominates. as far as we know, especially in the case of higher frequencies, this kind of $\vartheta_s(x)$ analysis has not been presented in any previously published article. it must hold for all sample thicknesses satisfying the condition $l/L_p < 1$, indicating that in the case of thin samples and higher frequencies $\vartheta_s(0)$ at the illuminated front surface could drop much faster than $\vartheta_s(l)$ at the non-illuminated rear one. such $\vartheta_s(x)$ behavior may result in a major change of pa and pt signals, especially those that depend on the temperatures of both surfaces (e.g. pa thermoelastic signals) [22,23,31].
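the frequency ranges discussed above can be put in context by comparing the sample thickness with the thermal diffusion length $\mu = \sqrt{D/(\pi f)}$, which is related to the complex heat diffusion coefficient of section 2 by $\sigma = (1+i)/\mu$. the sketch below is an illustration under assumed material data: the silicon heat conductivity, density and heat capacity are typical room-temperature textbook values, not values quoted in this paper, and the function names are hypothetical.

```python
import math

# Assumed room-temperature properties of silicon (textbook values)
SI_CONDUCTIVITY = 150.0    # W/(m K)
SI_DENSITY = 2330.0        # kg/m^3
SI_HEAT_CAPACITY = 700.0   # J/(kg K)


def thermal_diffusivity(k, rho, c):
    """Thermal diffusivity D = k / (rho * C), in m^2/s."""
    return k / (rho * c)


def heat_diffusion_coefficient(f, d):
    """Complex heat diffusion coefficient sigma = (1 + i) * sqrt(pi*f/D), in 1/m."""
    return (1.0 + 1j) * math.sqrt(math.pi * f / d)


def thermal_diffusion_length(f, d):
    """Thermal diffusion length mu = sqrt(D / (pi*f)), in m."""
    return math.sqrt(d / (math.pi * f))


if __name__ == "__main__":
    d = thermal_diffusivity(SI_CONDUCTIVITY, SI_DENSITY, SI_HEAT_CAPACITY)
    for freq in (1e1, 1e3, 1e5, 1e7):
        mu = thermal_diffusion_length(freq, d)
        print(f"f = {freq:8.0e} Hz  mu = {mu * 1e6:10.2f} um")
```

with these assumed values the thermal diffusion length exceeds the 10 µm thickness of the thin sample over most of the frequency range and falls below it only at the highest frequencies.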
[figure 6, panels a)-d): normalized amplitude $A_n$ (a.u.) and phase $\varphi$ (degrees) versus modulation frequency $f$ (hz), $l = 10$ µm; a), b) front surface, $x = 0$; c), d) rear surface, $x = l$; curves for the total temperature and its thermalization, bulk recombination and surface recombination components]

fig. 6 comparison of the surface temperature distribution $\vartheta_s(x)$ normalized amplitude $A_n$ and phase $\varphi$ at the front (a,b, solid magenta) and rear surface (c,d, solid light blue) of a thin sample ($l = 10$ µm), with the thermalization ($\vartheta_{therm}(x)$, dash), bulk recombination ($\vartheta_{br}(x)$, dot) and surface recombination ($\vartheta_{sr}(x)$, dash-dot) contributions versus modulation frequency $f$.

based on the results presented in figures 5 and 6, we analyze the amplitude ratios $A_R$ and phase differences $\Delta\varphi$ between $\vartheta_s(x)$ and $\vartheta_i(x)$, where $i$ = therm, br and sr. the results of this analysis are presented in figures 7 (thick samples) and 8 (thin samples). obviously, the back surface of both the thick and the thin samples experiences the greatest influence of photogenerated carriers. therefore, we can recommend the transmission detection configuration as the most favorable experimental pa or pt scheme for studying the influence of free carriers on the thermal state of silicon surfaces.
[figure 7, panels a)-d): amplitude ratio $A_R$ and phase difference $\Delta\varphi$ (degrees) versus modulation frequency $f$ (hz), $l = 1000$ µm; a), b) front surface, $x = 0$; c), d) rear surface, $x = l$; curves for the total temperature relative to its thermalization, bulk recombination and surface recombination components]

fig. 7 comparison of the amplitude ratios $A_R$ and phase differences $\Delta\varphi$ at the front (a,b) and rear surface (c,d) of a thick sample ($l = 1000$ µm) versus modulation frequency $f$.

[figure 8, panels a)-d): amplitude ratio $A_R$ and phase difference $\Delta\varphi$ (degrees) versus modulation frequency $f$ (hz), $l = 10$ µm; a), b) front surface, $x = 0$; c), d) rear surface, $x = l$; curves for the total temperature relative to its thermalization, bulk recombination and surface recombination components]

fig. 8 comparison of the amplitude ratios $A_R$ and phase differences $\Delta\varphi$ at the front (a,b) and rear surface (c,d) of a thin sample ($l = 10$ µm) versus modulation frequency $f$.

4. conclusions

by combining the photogenerated excess carrier density and heat source analysis in n-type silicon excited by a frequency-modulated light source, we established the most favorable values of the surface recombination velocities and bulk lifetime for examining the excess carrier influence on the temperature distributions at the semiconductor surfaces. these analyses were performed by a theoretical study of the carrier density and temperature amplitudes and phases in the case of thin (10 µm) and thick (1000 µm) silicon circular plates as a function of modulation frequency in the range of ($10$-$10^7$) hz. the relative thickness of the samples is defined by the ratio of the sample thickness $l$ and the carrier diffusion length $L_p$: $l/L_p > 1$ thick and $l/L_p < 1$ thin. it was demonstrated that the photogenerated excess carrier density strongly depends on the sample surface characteristics and material quality, described by the surface recombination velocity and bulk lifetime, respectively. we suggest that the most intensive excess carrier influence in thick and thin samples is found when the illuminated front surface is perfectly passivated ($s_1 = 0$ m/s) and the rear has a high recombination velocity ($s_2 = 24$ m/s). also, we propose a high value for the carrier bulk lifetime ($\tau = 10^{-4}$ s), which implies a low concentration of dopants and internal defects and allows a much simpler analysis of the carrier transport in semiconductors. furthermore, a detailed heat source analysis was performed to demonstrate the effects of photogenerated carriers on the surface temperature distributions. in the case of thick samples, this analysis predicts a strong influence of the bulk recombination on both surfaces at lower frequencies ($f < 10^3$ hz).
At higher frequencies (f > 10^3 Hz), a strong surface recombination influence is found only on the rear surface. For thin samples, bulk recombination is negligible and surface recombination dominates: on the rear surface at all frequencies, and on the front surface at f < 10^5 Hz. In addition, on the front surface of thin samples one can expect a sudden decrease of the total temperature at higher frequencies, due to the changed roles of the temperature components (thermalization and surface recombination).

This kind of analysis allows one to choose which experimental set-up, transmission or reflection, is more convenient for studying the influence of free carriers. The results of our study predict a strong influence of the recombination processes on the rear sample surface over the whole frequency range, regardless of thickness. Therefore, the transmission photoacoustic or photothermal configuration can be recommended as the most suitable experimental scheme for studying the influence of photogenerated carriers on surface temperatures.

Acknowledgements: This work has been supported by the Ministry of Education, Science and Technological Development of the Republic of Serbia under the grant ON171016.

References
[1] L. A. Skvortsov, "Laser photothermal spectroscopy of light-induced absorption", Quantum Electronics, vol. 43, pp. 1-13, 2013.
[2] P. Almond, P. Patel, Photothermal Science and Technique, London: Chapman and Hall, 1996.
[3] S. Bialkowski, Photothermal Spectroscopy Methods for Chemical Analysis, New York: John Wiley, 1996.
[4] B. G. Streetman and S. Banerjee, Solid State Electronic Devices, 5th edition, Prentice Hall, 2000.
[5] S. M. Sze, Semiconductor Devices: Physics and Technology, 2nd edition, John Wiley & Sons, (1985, 2002).
[6] Shyh Wang, Fundamentals of Semiconductor Theory and Device Physics, Prentice Hall International, Inc., Englewood Cliffs, NJ 07632, USA, ISBN 0-13-344425-2.
[7] Y. Chen, Y. Dai, H. Chou, I. Chang, "Photoinduced absorption studied by photothermal deflection spectroscopy: its application to the determination of the energy of dangling-bond states in a-Si:H", Chinese Journal of Physics, vol. 31, pp. 767-772, 1993.
[8] J. Kitao, Y. Kasuya, T. Kunii, N. Yoshida, S. Nonomura, "Photoinduced effects on infrared and near infrared absorption of amorphous and microcrystalline Si measured by photothermal bending spectroscopy", Analytical Sciences, vol. 17, pp. 302-304, 2001.
[9] M. A. Nigro, M. Gagliardi, F. G. D. Corte, "Measurement of the IR absorption induced by visible radiation in amorphous silicon and silicon carbide thin films by an in-guide technique", Optical Materials, vol. 30, p. 1240, 2008.
[10] L. Goris, K. Haenen, M. Nesládek, P. Wagner, D. Vanderzande, L. Schepper, J. D'Haen, L. Lutsen, J. Manca, "Absorption phenomena in organic thin films for solar cell applications investigated by photothermal deflection spectroscopy", Journal of Materials Science, vol. 40, pp. 1413-1418, 2005.
[11] K. Tanaka, T. Gotoh, N. Yoshida, S. Nonomura, "Photothermal deflection spectroscopy of chalcogenide glasses", Journal of Applied Physics, vol. 91, p. 125, 2002.
[12] L. Xiao, L. Changyoung, L. Zhang, Y. Zhao, S. Jia, G. Zhou, "Pulsed-laser pumped photothermal deflection spectroscopy for liquid thermal diffusivity measurement", Chinese Journal of Lasers B, vol. B9, pp. 538-544, 2000.
[13] Y. Li, R. Gupta, "An investigation of the photothermal deflection spectroscopy technique for temperature measurements in a flame", Applied Physics B: Lasers and Optics, vol. 75, pp. 103-112, 2002.
[14] A. Mandelis (Ed.), Photoacoustic and Thermal Wave Phenomena in Semiconductors, North-Holland, 1987.
[15] J. Batista, A. Mandelis and D. Shaughnessy, "Temperature dependence of carrier mobility in Si wafers measured by infrared photocarrier radiometry", Appl. Phys. Lett., vol. 82, pp. 4077-4079, 2003.
[16] K. M. Gupta, Nishu Gupta, "Carrier transport in semiconductors", in Advanced Semiconducting Materials and Devices, Engineering Materials series, Springer International Publishing Switzerland, pp. 87-144, 2016.
[17] Igor Lashkevych, Oleg Titov and Yuri G. Gurevich, "Recombination and temperature distribution in semiconductors", Semiconductor Science and Technology, vol. 27, 055014 (7 pp.), 2012.
[18] A. Arnau Vives, Piezoelectric Transducers and Applications, Springer-Verlag Berlin Heidelberg New York, 2008.
[19] Min-Hang Bao, Micro Mechanical Transducers: Pressure Sensors, Accelerometers, and Gyroscopes, Handbook of Sensors and Actuators, vol. 8, Elsevier, 2000.
[20] D. D. Markushev, M. D. Rabasovic, D. M. Todorovic, S. Galovic, S. E. Bialkowski, Review of Scientific Instruments, vol. 86, p. 035110, 2015.
[21] D. K. Schroder, Semiconductor Material and Device Characterization, 3rd ed., John Wiley and Sons, Inc., Hoboken, New Jersey, 2006.
[22] D. M. Todorović, P. M. Nikolić, M. D. Dramićanin, D. G. Vasiljević, Z. D. Ristovski, "Photoacoustic frequency heat-transmission technique: thermal and carrier transport parameters measurements in silicon", Journal of Applied Physics, vol. 78, p. 5790, 1995.
[23] D. M. Todorović, P. M. Nikolić, A. I. Bojičić, and K. T. Radulović, "Thermoelastic and electronic strain contributions to the frequency transmission photoacoustic effect in semiconductors", Physical Review B, vol. 55, pp. 15631-15642, 1997.
[24] A. Mandelis, R. Bleiss, F. Shimura, "Highly resolved separation of carrier- and thermal-wave contributions to photothermal signals from Cr-doped silicon using rate-window infrared radiometry", Journal of Applied Physics, vol. 74, p. 3431, 1993.
[25] Y. G. Gurevich, I. Lashkevych, "Sources of fluxes of energy, heat, and diffusion heat in a bipolar semiconductor: influence of nonequilibrium charge carriers", International Journal of Thermophysics, vol. 34, p. 341, 2013.
[26] O. Palais, J. Gervais, S. Martinuzzi, "High resolution lifetime scan maps of silicon wafers", Materials Science and Engineering: B, vol. 71, pp. 47-50, 2000.
[27] Andres Cuevas, Daniel Macdonald, "Measuring and interpreting the lifetime of silicon wafers", Solar Energy, vol. 76, no. 1-3, pp. 255-262, 2003.
[28] K. T. Radulović, P. M. Nikolić, D. Vasiljević-Radović, D. M. Todorović, S. S. Vujović, A. I. Bojičić, V. Blagojević, D. Urosević, "A contribution of carrier transport processes to the photoacoustic effects in doped narrow gap semiconductors", Review of Scientific Instruments, vol. 74, p. 595, 2003.
[29] Gary Hodes, Prashant V. Kamat, "Understanding the implication of carrier diffusion length in photovoltaic cells", The Journal of Physical Chemistry Letters, vol. 6, pp. 4090-4092, 2015.
[30] A. B. Sproul, "Dimensionless solution of the equation describing the effect of surface recombination on carrier decay in semiconductors", Journal of Applied Physics, vol. 76, pp. 2851-2854, 1994.
[31] M. D. Dramićanin, P. M. Nikolić, Z. D. Ristovski, D. G. Vasiljević, and D. M. Todorović, "Photoacoustic investigation of transport in semiconductors: theoretical and experimental study of a Ge single crystal", Physical Review B, vol. 51, p. 14226, 1995.
[32] Yuri G. Gurevich, Georgiy N. Logvinov, Gerardo G. de la Cruz, and Gabino Espejo Lopez, "Physics of thermal waves in homogeneous and inhomogeneous (two-layer) samples", International Journal of Thermal Sciences, vol. 42, pp. 63-69, 2003.
[33] G. Gonzalez de la Cruz and Yu. G. Gurevich, "Thermal diffusivity of a two-layered system", Physical Review B, vol. 51, pp. 2188-2192, 1995.
[34] J. Ordonez-Miranda, J. J. Alvarado-Gil, "Determination of thermal properties for hyperbolic heat transport using a frequency-modulated excitation source", International Journal of Engineering Science, vol. 50, pp. 101-112, 2012.
[35] S. Galovic and D. Kostoski, "Photothermal wave propagation in media with thermal memory", Journal of Applied Physics, vol. 93, pp. 3063-3070, 2003.
[36] A. Vedavarz, S. Kumar, and M. K. Moallemi, "Significance of non-Fourier heat waves in conduction", Journal of Heat Transfer - Transactions of the ASME, vol. 116, pp. 221-224, 1994.

Facta Universitatis, Series: Electronics and Energetics, Vol. 32, No 4, December 2019, pp. 615-631
https://doi.org/10.2298/fuee1904615s
© 2019 by University of Niš, Serbia | Creative Commons License: CC BY-NC-ND

Development of an automated gas-leakage monitoring system with feedback and feedforward control by utilizing IoT

Mhia Md. Zaglul Shahadat, Avijit Mallik, Md. Monowarul Islam
Dept. of Mechanical Engineering, Rajshahi University of Engineering & Technology, Rajshahi, Bangladesh

Abstract.
Liquefied petroleum gas (LPG) is used in a wide range of applications, such as home and industrial appliances, vehicles, and as a propellant and refrigerant. However, leakage of LPG has a hazardous and toxic impact on human beings and other living creatures. Therefore, the authors developed a system to monitor LPG leakage and alert its users. In this research, an MQ-6 gas sensor is used to sense the gas concentration in a closed volume, and an IoT platform has been introduced to monitor the consequences of environmental changes. Robust control along with cloud-based manual control has been applied so that gas leakage can be prevented in response to either feedback or feedforward commands individually. The system switches on the specified relays to control the gas concentration at the time of leakage. It rechecks the value repeatedly, and if it crosses 300 ppm it sets up a relay-based switching control mechanism using the ThingSpeak cloud. The controller used here is a NodeMCU v1.0. This research provides a design approach for both software and hardware. Hence, an embedded system comprising relay switches, embedded C++, a gas sensor, a temperature and humidity sensor, and the Internet of Things (IoT) is fabricated to meet the objectives of the current research.

Key words: Internet of Things, smart system, gas leakage control, embedded system

1. Introduction

LPG is a blend of propane and butane, which are highly combustible compounds. It is an odorless gas, so ethanethiol is added as a powerful odorant to make leaks easy to detect. EN 589 is the international standard [1]; amyl mercaptan and tetrahydrothiophene are other commonly used odorants. LPG is one of the alternative fuels in use today.
Received May 1, 2019; received in revised form September 3, 2019
Corresponding author: Avijit Mallik, Dept. of Mechanical Engineering, Rajshahi University of Engineering & Technology, Rajshahi-6204, Bangladesh (e-mail: avijitme13@gmail.com)

LPG is likewise used as an alternative fuel in vehicles because of the rising prices of oil and diesel. Some people have a weak sense of smell and may not respond to a low concentration of leaked gas. In such cases, high-security systems become essential to prevent gas leakage accidents. The Bhopal, Chernobyl and Okishima tragedies are cited as examples of leakage accidents in India, Russia and Japan [2, 3]. Accidents due to gas leakage have been increasing in recent times. Accordingly, researchers have focused on developing smart systems to monitor and avoid gas leakage incidents [4]. Detecting a gas leak is important, but stopping the leak within the shortest possible time is equally essential. The authors have designed and fabricated a system capable of sniffing LPG leakage on the basis of volumetric concentration (ppm) and taking immediate action to control the situation with the aid of the Internet of Things (IoT).

For a long time, wireless technologies have been a prime target of hackers, because traffic is easy to intercept and attacks can go unnoticed owing to weaknesses in the security protocols [5]. Wireless sensor networks (WSNs) are also a major target because of the importance of the information they hold. For many wireless applications, the authentication system relies on a pre-shared key (PSK), which must be established before data communication starts between two or more devices. In WirelessHART (Highway Addressable Remote Transducer protocol) communication, the PSK holds a join key (JK) having an 'intrinsic security (IS)' [6].
Only when the JK is compromised is the security overcome, after which numerous attacks may take place. As a preventive measure, this research includes IoT-enabled WSNs with state-of-the-art algorithms capable of detecting possible node-capturing attacks, along with a timing-delay measurement technique to predict which specific nodes are vulnerable to security threats [7, 8]. Industries that use gas piping (process, pharmaceutical, chemical and fuel-fired power plants) are considered as reference applications for the proposed gas leakage monitoring and control using WSN (IoT enabled with WirelessHART), as per the experimental findings of this research. Table 1 lists selected works on gas leakage/concentration detection and control in recent times.

The works listed in Table 1 mainly focused on sensing environmental changes (such as gas concentration and the relative humidity of soil and air), but all of those recent works lack swift feedforward control and digital signal processing (DSP) analysis (i.e. data coherence, stability of wireless data logging, noise filtering, etc.). Like the previous literature, this paper presents an automated LPG concentration monitoring system, here with both automatic and manual actuator control from an IoT-operated embedded platform. This research focuses on monitoring the volumetric concentration (in parts per million, ppm) of economical gaseous fuels such as petroleum gases, liquefied petroleum gas and carbon-based toxic gases (high in carbon monoxide), and on alerting specific administrators/operators about a leak through an IoT server (ThingSpeak), with both automatic and decision-based actuator control to mitigate probable accidents from gas leakage. It also includes temperature and relative humidity sensing inside the control volume to check for unstable sensor behavior as per the sensor datasheet.
Table 1 Referred works on electronic sensor-based gas concentration detection and actuator control.

Year | Purpose of research | Major findings | Reference
2014 | Development of an electronic nose for selected oil odor detection. | In the newly proposed method, the underdamped natural frequency ωn is calculated by considering the rise time from 10% to 90% of the overshoot. | [9]
2017 | Development and analysis of a GSM-based gas leakage alerting system. | Successful wireless GSM-aided alerting for possible fire accidents. | [10]
2017 | Development of a wireless electronic nose using an artificial neural network for monitoring various gas concentrations. | An artificial neural network was used to estimate the concentration of a gas in the air based on the ratio. | [11]
2017 | Internet of Things-based cargo monitoring system (IoT-CMS) to monitor any environmental changes. | Introduced WSN and fuzzy logic control simultaneously for the first time. | [12]
2018 | IoT-based cold storage temperature and humidity monitoring. | Showed how to control actuators over the net by simply changing the state parameters from "0" to "1" and vice versa. | [13]
2018 | Smart irrigation control system using IoT. | The data and control were hosted on an online IoT platform, which provides real-time monitoring and control via a simplified online graphical user interface (GUI). | [14]

2. Project planning & algorithm

This project deals with monitoring LPG leakage and alerting the administrator by sounding a buzzer, switching on the specified relay(s) and sending an alert message to the administrator(s), who decide on the precautionary measures.
For this purpose, the gas concentration is sensed by a gas sensor (MQ-6), which sends analog data to the controller (NodeMCU). There the analog value undergoes a sequence of conversions to estimate the gas concentration inside the control volume. On crossing a reference threshold value (set by the administrator/operator), the controller switches on the relay(s) and buzzer(s) to alert the administrator/operator of the plant (industry/home). An extra control option, manual control through the IoT server, is included to prevent accidents from the leakage. Fig. 1 shows the flowchart of the experimental system, and Figs. 2 and 3 show the schematic block diagram and the experimental setup of the system, respectively.

Fig. 1 Flowchart of the fabricated control system.

From the flowchart (Fig. 1), the controller first sets its baud rate (sampling frequency), checks the input-output pins and sets the delay time/samples before starting sensor data logging. Then the analog signal from the gas sensor and the digital signal from the relative humidity sensor are read and stored in the read-only memory of the controller, continuously, for a specified timestamp. The gas sensor data is subjected to analog-to-digital conversion (ADC) followed by voltage measurement, from which the sensed gas concentration is calibrated in code. A comparator (+/-) then controls the feedback action, and the wireless chip (ESP-8266) sends the collected data to an authenticated server through a router connected to an internet service provider (ISP).

Fig. 2 System setup (schematic).

The system schematic shown in Fig. 2 is a graphical representation of the experimental system (Fig. 3). Here the NodeMCU used as the controller is electrically connected to an analog sensor, a digital sensor and an actuator with a high volt-ampere rating.
The data transfer protocol used in this experiment is MQTT (Message Queuing Telemetry Transport), a lightweight data transfer protocol widely used in wireless transmission.

Fig. 3 Experimental setup.

3. Mathematical model for gas sensor calibration

As the system focuses on gas concentration-based alerting and control of appliances, only the MQ-6 gas sensor calibration (analog value to ppm) is considered in the mathematical analysis. The DHT-11 sensor is used for checking the temperature and relative humidity conditions given in the gas sensor's datasheet. The gas sensor circuit schematic is shown in Fig. 4.

Fig. 4 MQ-6 gas sensor equivalent circuit schematic.

Let $V_C$ = supply voltage = +5 V, $R_S$ = sensor resistance, $R_L$ = load resistance (variable), $V_{RL}$ = sensor output voltage. From the current flow and voltage relationship,

$I = \dfrac{V_C}{R_S + R_L}$ (1)

Here $V_{RL} = I R_L$, so equation (1) becomes

$V_{RL} = \dfrac{V_C R_L}{R_S + R_L}$ (2)

Again, from equation (1),

$R_S + R_L = V_C R_L / V_{RL}$ (3)

From equation (3),

$R_S = \dfrac{V_C R_L}{V_{RL}} - R_L$ (4)

Fresh-air resistance ratio for gas sensors: $R_S/R_0$ = 4.4 (MQ-4), 9.8 (MQ-2), 10 (MQ-6) [23].

For calibrating the sensor data, the equation of a straight line is useful. From basic geometry, the equation of a straight line is

$y = m x + b$ (5)

Here y = value on the y-axis; x = value on the x-axis; m = slope of the line; b = intercept on the y-axis. For analyzing the sensitivity from the $R_S/R_0$ vs ppm graph (log-log plot) of the datasheet, equation (5) can be written as

$\log_{10}(y) = m \log_{10}(x) + b$ (6)

Slope (m): if $(x_0, y_0)$ and $(x, y)$ are any two points of a line on a log-log plot, then

$m = \dfrac{\log_{10}(y / y_0)}{\log_{10}(x / x_0)}$ (7)

The intercept on the y-axis (b) follows from equation (6):

$b = \log_{10}(y) - m \log_{10}(x)$ (8)

Using equations (1) to (8), the gas concentration can be determined directly in parts per million (ppm):

$x = 10^{(\log_{10}(y) - b)/m}$

where $y = R_S/R_0$ = gas sensor sensitivity, b = intercept on the y-axis (from the sensitivity vs ppm graph of the sensor datasheet), and m = slope from equation (7).
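The calibration chain described in this section (ADC count, load voltage, sensor resistance, log-log straight line, ppm) can be sketched in a few lines of Python. This is a minimal illustration and not the authors' firmware: the function names, the 5 V ADC reference and the two curve points `P0` and `P1` are hypothetical placeholders, and real curve points have to be read from the MQ-6 datasheet.

```python
import math

VC = 5.0           # supply voltage in volts, as stated in the text
ADC_LEVELS = 1024  # 10-bit ADC resolution mentioned in the text


def adc_to_voltage(adc_value, v_ref=VC):
    """Convert a raw ADC count to a voltage (assumes the ADC reference equals v_ref)."""
    return adc_value * v_ref / (ADC_LEVELS - 1)


def sensor_resistance(v_rl, r_load, vc=VC):
    """Voltage-divider relation: Rs = Vc * RL / V_RL - RL."""
    if v_rl <= 0:
        raise ValueError("output voltage must be positive")
    return vc * r_load / v_rl - r_load


def fit_loglog_line(p0, p1):
    """Slope and intercept of the straight line through two (ppm, Rs/R0) points on a log-log plot."""
    (x0, y0), (x1, y1) = p0, p1
    m = math.log10(y1 / y0) / math.log10(x1 / x0)
    b = math.log10(y1) - m * math.log10(x1)
    return m, b


def ratio_to_ppm(ratio, m, b):
    """Invert log10(y) = m * log10(x) + b to get the concentration x in ppm."""
    return 10 ** ((math.log10(ratio) - b) / m)


# Hypothetical curve points (ppm, Rs/R0); replace with values read from the datasheet.
P0 = (200.0, 2.0)
P1 = (10000.0, 0.4)
M, B = fit_loglog_line(P0, P1)
```

With a measured ratio `Rs/R0`, calling `ratio_to_ppm(ratio, M, B)` returns the estimated concentration; the slope is negative because the sensor resistance falls as the gas concentration rises.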
For analog signals with 1024 ($2^{10}$) levels of resolution, the gas sensor's sensitivity is $R_{S0}/R_0$, where $R_{S0}$ = sensor output resistance in the experimental environment and $R_0$ = sensor output resistance in the ideal environment (1000 ppm of LPG, 25 °C and 60% RH). At 32 °C and 55-65% RH, the sensor sensitivity is $R_{S0}/R_0$ = 9.8 (in fresh air for MQ-6) [14]. In this study, at 27 °C and 67% RH, the sensor sensitivity $R_S/R_0$ = 8.71 (at normal room conditions) is obtained from the ADC [14, 15].

4. Test outputs & signal processing

From the experimental data, various calculations were performed in the MATLAB environment. Its digital signal analysis toolbox is well suited to signal processing and smoothing. As the data transfer took place over a wireless medium, some additional noise was logged in an arbitrary manner, so further processing is needed. For this research, mainly the fast Fourier transform (FFT) was used for signal processing. The analysis procedure and the related graphs are described below.

Coherence estimation via Welch method: In digital signal processing, coherence is a statistic between two functions or signals that is used to estimate the power transfer between the input and output of a linear/linear time-invariant system. This algorithm is based on standard MATLAB tools. The standard deviation of the mean is computed as in equation (9) and the coherence as in equation (10) [15-18]. Here x(t) and y(t) are two real-valued time-variant functions, and the coherence between the two signals is sometimes also called the magnitude-squared coherence. Using equations (9) and (10), Fig. 5 is plotted with a normalized frequency for better visualization; it depicts the coherence estimation via the Welch method. The figure shows that no data points overlap one another, which confirms that the continuous data logging is quite satisfactory.

Fig. 5 Coherence estimation via Welch method.

Fig. 6 shows the magnitude response curve. The RMS signal-to-noise ratio for an ideal N-bit converter is

$\mathrm{SNR} = 20 \log_{10}\left(\dfrac{\text{rms value of full-scale input}}{\text{rms value of quantization noise}}\right)$ (11)

or, with q the quantization step,

$\mathrm{SNR} = 20 \log_{10}\left(\dfrac{q\,2^{N}/(2\sqrt{2})}{q/\sqrt{12}}\right) = 20 \log_{10}\left(2^{N}\right) + 20 \log_{10}\sqrt{3/2}$ (12)

SNR = 6.02N + 1.76 dB, over the Nyquist bandwidth of interest.

Fig. 6 Magnitude response graph (gas sensor).

Time-domain signal processing: Using the gas sensor data, the time-domain graphs in Fig. 7(a) to Fig. 7(d) were plotted, along with 3rd- and 8th-order fast Fourier transform (FFT) fits of the signal for better analysis. The smoothing is achieved, in a process known as convolution, by fitting successive subsets of adjacent data points with a low-degree polynomial by the method of linear least squares [19]. Fig. 7(a) shows the raw analog values, the Savitzky-Golay filtered values, the moving-average filtered values and the median plot.

Fig. 7(a) Processed signal (ppm vs time).

Fig. 7(b) shows the Fourier-fitted curve for the gas sensor values, from which the best-suited filter was found to be the 5th-order FFT. The upper and lower ranges are also shown, along with the residuals plotted in black. This signal filtering was done in a room environment with no manual gas leak. By comparing the plots of Fig. 7, the 5th-order FFT was finally selected as the filter for signal processing because of its very low range of residuals.

Fig. 7(b) ppm vs time domain (5th-order FFT with 95% confidence limit).

Fig. 7(c) and (d) below are the same plots as Fig. 7(a) and (b), but for the temperature and humidity values. The same simulation method was used to smooth both sensors' data with respect to time.

Fig. 7(c) Processed signal (humidity vs time).

Fig. 7(d) Humidity vs time domain graph (8th-order FFT with 95% confidence limit).
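Two of the steps above lend themselves to a short numerical illustration: the ideal converter SNR (6.02N + 1.76 dB) and the simple smoothing filters compared in Fig. 7(a). The sketch below uses only the Python standard library; the function names and the edge handling (windows that shrink at the ends of the record) are assumptions made for the example and are not taken from the MATLAB tools used in the paper.

```python
import math
from statistics import median


def ideal_adc_snr_db(n_bits):
    """Ideal RMS signal-to-quantization-noise ratio of an N-bit converter, in dB."""
    return 20 * math.log10(2 ** n_bits) + 20 * math.log10(math.sqrt(1.5))


def moving_average(samples, window):
    """Centered moving average; the window shrinks at both ends of the record."""
    half = window // 2
    smoothed = []
    for i in range(len(samples)):
        chunk = samples[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def median_filter(samples, window):
    """Centered median filter, useful against isolated spikes in logged data."""
    half = window // 2
    return [median(samples[max(0, i - half): i + half + 1])
            for i in range(len(samples))]
```

For the 10-bit ADC mentioned in the calibration section, `ideal_adc_snr_db(10)` evaluates to roughly 62 dB. A median filter removes a single outlier completely, whereas a moving average only spreads it over the window, which is one reason to compare both on noisy wireless logs.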
Group delay response analysis: From the convolution theorem, if y is a time-variant function of x, and x is convolved with another time-variant function h, then [20-24]

$y(t) = (x * h)(t) = \int_{-\infty}^{\infty} x(\tau)\, h(t - \tau)\, d\tau$ (13)

In signal processing, y(t) depends mostly on the values of x that occurred near the time t. So, for any linear system with operator $\mathcal{O}$,

$\mathcal{O}\left\{\int c_\tau\, x_\tau(t)\, d\tau\right\} = \int c_\tau\, \mathcal{O}\{x_\tau(t)\}\, d\tau$ (17)

and the time-invariance requirement is

$\mathcal{O}\{x(t)\} = y(t) \;\Rightarrow\; \mathcal{O}\{x(t - \tau)\} = y(t - \tau)$ (18)

From equations (13) to (18), the impulse response can be written as

$h(t) = \mathcal{O}\{\delta(t)\}$ (19)

From equations (12) to (19), using the amplitude sampling technique, Fig. 7(e) to Fig. 7(k) are computed numerically, together with the related graphical representations. Fig. 7(e) is the group delay response plot for the gas sensor, which shows that the delay remains constant between 11.3 and 12.2 kHz of sampling frequency. Fig. 7(f) shows the phase delay (with respect to samples), which indicates positive noise addition in the cloud data; a low-pass filter should therefore be placed in series with the sensor's input pin, resulting in about 80-85% reduction of the induced noise.

Fig. 7(e) Group delay response.

Fig. 7(f) Phase delay samples.

Fig. 7(g) shows the pole-zero plot for the system, which indicates that the system has some noise but is ultimately stable. To make data logging more stable, the phase delay response should be reduced to almost zero; it is induced mainly by the use of an old and very low-cost sensor. It can be reduced by using a low-pass filter at the input (as it controls the positive phase delay addition) or by using sensors from renowned brands.

Fig. 7(g) Pole-zero plot.

Fig. 7(h) Power spectrum (left) and sensitivity vs ppm plot (right).

Fig. 7(h) is the round-off noise power spectrum plot for the system, showing the power band of the noise induced by positive phase addition. It also shows the actual and experimental sensitivity vs concentration plot, from which the calibration is seen to be 90-95% accurate.

Fig. 7(i) Step response (left) and ppm vs voltage plot (right).

Fig. 7(i) shows the step response (left): initially the system runs fine, but after a certain period it shows some abrupt changes, which the statistical filter corrects automatically, resulting in a smooth response. The figure also shows the experimental and simulated relationship between the supplied voltage and the gas concentration. Fig. 7(j) below shows the time-domain and frequency-domain analysis plots for the system, including the initial sensor noise. The transfer function of the system can be estimated using Welch's transfer function estimation, which is plotted in Fig. 7(k).

Fig. 7(j) Time domain and frequency domain graph for the system.

Fig. 7(k) Transfer function estimation via Welch distribution.

5. Design of control system

Both feedback and feedforward control options were applied to the experimental prototype. Combining these two control options was the most challenging part of this research. The experimental threshold for the gas sensor was set to 300 ppm. For the feedback control, when the nearby gas concentration crosses 300 ppm, the controller automatically sets the actuator off for a certain delay period and rechecks the sensor value repeatedly. If the value falls below 300 ppm, the actuators are turned on again for a certain period by a digital signal; if the gas concentration crosses 300 ppm once more, the system detects a gas leakage problem and the source valve is turned off.

Fig. 8(a) Feedback control for proposed system.

Fig. 8(a) shows the schematic of the feedback control developed for the experimental system. In this device, the mean error is the algebraic sum of the desired voltage and the measured voltage, which differ in sign: the desired value is always positive and the measured value is always negative. When the two values are equal in magnitude and opposite in sign, there is no error in the control system, which is the absolute equilibrium condition [25-27]. This type of feedback control system is therefore very effective. Feedforward control, in simple terms, means controlling something with the aid of a manual signal. In this research, when the feedback control crashes or manual switching is needed, a predefined signal is sent from the server to the controller and the selected actuators can be controlled. Fig. 8(b) shows the schematic of the feedforward mechanism applied for controlling actuators by manual command over IoT.

Fig. 8(b) Feedforward control system for the experimental setup.

The feedforward control actions are implemented using the 'HTTPS-request' protocol for one-way communication. The control algorithm for this research was formulated through logical reasoning and implemented in C++ code following the fuzzy logic given in Table 2. From this logic, nine actuation cases are possible. Two variables, 'HTTPS-request' (feedforward command 1 or 0; if undefined, value == 0) and 'gas concentration' (value < 300 == 0, value > 300 == 1; if value = 300, value == 0), determine the controller response, which is followed by a series of changes in the actuation commands from the controller, using both feedback and feedforward control actions simultaneously.

Table 2 Control logic topology for the proposed system.

Logic | 'HTTPS-request' from server (str.) | Gas concentration (ppm) | Controller response (bin) | Actuator output (bin)
Logic-1 | 0 | <300 | 0 | 0
Logic-2 | 0 | >300 | 1 | 1
Logic-3 | 1 | >300 | 1 | 1
Logic-4 | 1 | <300 | 1 | 1
Logic-5 | undefined (== 0) | <300 | 0 | 0
Logic-6 | undefined (== 0) | >300 | 1 | 1
Logic-7 | 0 | =300 | 0 | 0
Logic-8 | 1 | =300 | 1 | 1
Logic-9 | r | 0 | reset | 1

6. System performance analysis

To analyze the system performance, a total of 20 trials were made. The trials were successfully investigated, and no major error was observed. Table 3 below shows the performance test results. In this investigation, the commands were issued through the cloud server by the mobile application, and the actuation was then observed on the experimental setup described earlier in Fig. 3. Fig. 9 presents the system performance test outputs graphically.

Table 3 System performance data table.

No. of obs. | Gas concentration | HTTPS input | Actuator response | Elapsed time (sec) | Experimentation
1 | 170 (== 0) | 0 | 0 | 0 (start) | success
2 | 173 (== 0) | 1 | 1 | 15.3 | success
3 | 211 (== 0) | 0 | 0 | 20 | success
4 | 288 (== 0) | 1 | 1 | 11 | success
5 | 316 (== 1) | 0 | 1 | 7.5 | success
6 | 375 (== 1) | 1 | 1 | n/a | no specified change
7 | 402 (== 1) | 0 | 1 | n/a | no specified change
8 | 287 (== 0) | 1 | 1 | 12 | success
9 | 255 (== 0) | 1 | 1 | n/a | no specified change
10 | 221 (== 0) | 0 | 0 | 15.7 | success
11 | 300 (== 0) | 0 | 0 | n/a | no specified change
12 | 300 (== 0) | 1 | 1 | 13.4 | success
13 | 302 (== 1) | 2 | 1 | error | error
14 | 380 (== 1) | 4 | 1 | error | error
15 | 293 (== 0) | 0 | 0 | 12 | success
16 | 319 (== 1) | r | reset | n/a | reset of process
17 | 327 (== 1) | 1 | 1 | 18 | success
18 | 289 (== 0) | 0 | 0 | 8 | success
19 | 294 (== 0) | 1 | 1 | 11 | success
20 | 277 (== 0) | 0 | 0 | 5 | success

Fig. 9 Feedback and feedforward control from system performance test.

The evaluation shows that the whole system behaves like an error-free system. Although the data processing time is a bit long (avg. 13 s), the system behaved well. This processing delay can be reduced by using a manual API and a custom server.
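The decision rules of Table 2 can be written as a small truth function and checked against the 20 observations of Table 3. The sketch below is one possible reading of the tables, not the authors' C++ firmware: the function names are hypothetical, requests are modeled as strings, any request other than "1" or "r" is treated as 0 (which is how the invalid inputs 2 and 4 in Table 3 behave), and the "reset" return value is an assumption based on observation 16.

```python
THRESHOLD_PPM = 300  # reference threshold stated in the paper


def gas_flag(ppm, threshold=THRESHOLD_PPM):
    """Return 1 only when the reading is strictly above the threshold (300 ppm maps to 0)."""
    return 1 if ppm > threshold else 0


def request_flag(request):
    """Return 1 for the string "1"; "0", undefined or invalid requests count as 0."""
    return 1 if request == "1" else 0


def actuator_output(request, ppm):
    """Actuator command: 'reset' for an "r" request, otherwise the OR of both flags."""
    if request == "r":
        return "reset"  # assumption: modeled on observation 16 of Table 3
    return request_flag(request) | gas_flag(ppm)


# (gas concentration in ppm, request string, reported actuator response) from Table 3
TABLE_3 = [
    (170, "0", 0), (173, "1", 1), (211, "0", 0), (288, "1", 1), (316, "0", 1),
    (375, "1", 1), (402, "0", 1), (287, "1", 1), (255, "1", 1), (221, "0", 0),
    (300, "0", 0), (300, "1", 1), (302, "2", 1), (380, "4", 1), (293, "0", 0),
    (319, "r", "reset"), (327, "1", 1), (289, "0", 0), (294, "1", 1), (277, "0", 0),
]

# Rows where this reading of Table 2 disagrees with the reported actuator response
mismatches = [row for row in TABLE_3 if actuator_output(row[1], row[0]) != row[2]]
```

Under these assumptions the list `mismatches` is empty, so the actuator column of Table 3 is consistent with a plain logical OR of the feedforward request and the feedback threshold flag.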
thingspeak gives students free access at a limited browsing speed, so that it cannot be misused for commercial purposes. fig. 10 shows the monitored states of the experimental setup.

fig. 10 graphical presentation of monitored data; (a) relay state data; (b) temperature and rh data; (c) ppm concentration of lpg gas and (d) android application gui.

this fabricated system can be used to prevent accidents caused by gas leakage in homes and industries. in the process, food, chemical and fertilizer industries, various toxic hydrocarbon-based gases (which may be able to ignite) are widely used, which makes these industries more vulnerable to accidents caused by unnoticed leakage of those gases. the fabricated prototype has almost 95% efficiency and offers both feedback and feedforward control options, which makes it more stable in monitoring and preventing unnoticed gas leakage above a defined reference value. the implemented cloud-server (feedforward) control actions demonstrate the system's very long range actuating capability, but both controllers (transmitter and receiver) must be connected to an isp.

6. system performance analysis

the fabricated system ran successfully, and 20 trials were performed to measure its performance. it showed no errors in those 20 trials, which took almost 3 hours to execute. the control system took almost 13 seconds to perform the necessary commands when signals came from the host server. this delay can be reduced by using the premium version of the thingspeak server; it can be shortened down to 4 seconds, which is the minimal refresh period of the iot server. it is apparent from the analysed signals that the impulse response and the group delay response are satisfactory if the baud rate (sampling frequency) is tuned within the range from 11 to 12.6 khz. in this study, an 11.52 khz sampling frequency is used.
the power density spectrum of gas sensor is quite fine although the signal to noise ratio is comparatively higher. this can be overcome by using a 20-pf ceramic disk capacitor as it can work as a low pass filter. the overall system efficiency was about 95%, which is quite good for a robust controlling operation. acknowledgement: the authors gratefully acknowledge the financial support received from the university grant commission (ugc), bangladesh. the authors thank mr. s. m. asif hossain (dept. of electronics and communication engineering, kuet) and md. robiul islam (lecturer, dept. of mechatronics engineering, ruet) for their contributions in this study. references [1] j. wang, m. tong, x. wang, y. ma, d. liu, j. wu, d. gao & g. du, "preparation of h2 and lpg gas sensor", sensors and actuators b: chemical, vol. 84, no. 2-3, 95–97, 2002. [2] m. miftakul amin, m. azel aji nugratama, andino maseleno, miftachul huda, and kamarul azmi jasmi. "design of cigarette disposal blower and automatic freshner using mq-5 sensor based on atmega 8535 microcontroller." international journal of engineering & technology, vol. 7, no. 3, pp. 1108–1113, 2018. [3] n. sinha, k. eswari pujitha, and j. sahaya rani alex, "xively based sensing and monitoring system for iot", in proceedings of the ieee international conference on computer communication and informatics (iccci), 2015, pp. 1–6, 2015. [4] a. mallik, s. a. hossain, a. b. karim, & s. m. hasan, "development of local-ip based environmental condition monitoring using wireless sensor network", international journal of sensors, wireless communications and control, vol. 9, pp. 1–8, 2019. [5] k. keshamoni, and s. hemanth. "smart gas level monitoring, booking & gas leakage detector over iot", in proceedings of the 2017 ieee 7th international advance computing conference (iacc), pp. 330-332. 2017. [6] a. mallik, a. ahsan, m. m. z. shahadat and j. c. tsou. “man-in-the-middle-attack: understanding in simple words.” int. j. 
data networks and security, (2019) [7] v. yadav, a. shukla, s. bandra, v. kumar, u. ansari, and s. khanna. "a review on iot based hazardous gas leakage detection & controlling system using microcontroller & gsm module." journal of vlsi design and signal processing, vol. 3, no. 1, 2017. [8] m. sharma, d. tripathi, n. p. yadav, and p. rastogi, "gas leakage detection and prevention kit provision with iot." gas, vol. 5, no. 02, 2018. [9] a. j. moshayedi, m. v. kukade, and d. gharpure, "electronic-nose (e-nose) for recognition of cardamom, nutmeg and clove oil odor", 2014. [10] v. v. alekseev, v. s. konovalova, and e. n. sedunova, "information-measurement and control system “smart house” as object of practice-oriented training of master's degree “instrumentation technology”", in proceedings of the international conference "quality management, transport and information security, information technologies"(it&qm&is), 2017, pp. 612–615. [11] s. i. sabilla, r. sarno, and j. siswantoro. "estimating gas concentration using artificial neural network for electronic nose", procedia computer science, vol. 124, pp. 181–188, 2017. [12] y. p. tsang, k. l. choy, c. h. wu, g. t. s. ho, h. y. lam, and p. s. koo. "an iot-based cargo monitoring system for enhancing operational effectiveness under a cold chain environment." international journal of engineering business management, vol. 9, 1847979017749063, 2017. [13] a. b. karim, a. z. hassan, and m. m. akanda. "monitoring food storage humidity and temperature data using iot." moj food process technol, vol. 6, no. 4, pp. 400–404, 2018. [14] j. mari, j. maja, j. robbins, "controlling irrigation in a container nursery using iot", aims agriculture and food, vol. 3, no. 3, pp. 205–215, 2018. [15] a. brandt, "a signal processing framework for operational modal analysis in time and frequency domain", mechanical systems and signal processing, vol. 115, pp. 380–393, 2019. [16] s. a. hossain, m.
hossen, and s. anower, "estimation of damselfish biomass using an acoustic signal processing technique", journal of ocean technology, vol. 13, no. 2, 2018. [17] s. mariani, l. tarokh, i. djonlagic, b. e. cade, m. g. morrical, k. yaffe, k. l. stone et al, "evaluation of an automated pipeline for large-scale eeg spectral analysis: the national sleep research resource", sleep medicine, vol. 47, pp. 126–136, 2018. [18] tns tengku zawawi, a. r. abdullah, w. t. jin, r. sudirman, and n. m. saad, "electromyography signal analysis using time and frequency domain for health screening system task", international journal of human and technology interaction (ijhati), vol. 2, no. 1, pp. 35–44, 2018. [19] s. a. hossain, m. hossen, a. mallik, and s. mahmudul hasan, "a technical review on fish population estimation techniques: non-acoustic and acoustic approaches." akustika, vol. 31, pp. 87–103, 2019. [20] regalia, phillip. adaptive iir filtering in signal processing and control. routledge, 2018. [21] b. boashash, a. aïssa-el-bey, and m. f. al-sa‟d. "multisensor time–frequency signal processing matlab package: an analysis tool for multichannel non-stationary data", softwarex, 2018. [22] a. e. cohen, "automated hdl signal processing deployment performance from high level matlab specification for an unmanned aerial vehicle (uav)", in proceedings of the ieee 8th annual computing and communication workshop and conference (ccwc), 2018, pp. 900-905. ieee, 2018. [23] van drongelen, wim. signal processing for neuroscientists. academic press, 2018. [24] s. a. hossain, a. mallik, and md arefin, "a signal processing approach to estimate underwater network cardinalities with lower complexity", journal of electrical and computer engineering innovations, vol. 5, no. 2, pp. 131–138, 2017. [25] u. yilmaz, a. kircay, and s. borekci, "pv system fuzzy logic mppt method and pi control as a charge controller", renewable and sustainable energy reviews, vol. 81, pp. 994–1001, 2018. [26] w. he, t. 
meng, d. huang, and x. li, "adaptive boundary iterative learning control for an euler– bernoulli beam system with input constraint", ieee transactions on neural networks and learning systems, vol. 29, no. 5, pp. 1539–1549, 2018. [27] steven walczak, "artificial neural networks." in advanced methodologies and technologies in artificial intelligence, computer simulation, and human-computer interaction, pp. 40-53. igi global, 2019. instruction facta universitatis series: electronics and energetics vol. 30, n o 3, september 2017, pp. 403 416 doi: 10.2298/fuee1703403p analasys of two low-cost and robust methods for indoor localisation of mobile robots * miloš petković, vladimir sibinović, dragiša popović, vladimir mitić, darko todorović, goran s. đorđević university of niš, faculty of electronic engineering, niš, serbia abstract. this paper presents two simple and cost effective indoor localisation methods. the first method uses ceiling-mounted wide-view angle webcam, computer vision and coloured circular markers, placed on the top of a robot. main drawbacks of this method are lens distortion and sensitivity to lighting conditions. after solving these problems, a high localisation accuracy of ±1cm is achieved at about 5 hz sampling rate. the second method is a version of trilateration, based on ultrasound time of flight distance measurement. an ultrasonic beacon is placed on a robot while wall detectors are strategically placed to avoid an excessive occlusion. the zigbee network is used for inter-device synchronisation and for broadcasting measured data. robot location is determined as a solution to the minimisation of measurement errors. using nelder-mead algorithm and low-cost distance measuring devices, a solid sub 5 cm localisation accuracy is achieved at 10hz. key words: robot localization, nelder-mead, gnu scientific library, usb camera, opencv 1. 
introduction indoor localisation of robots and objects is a vital research area, intrinsically important for expanding the competences of future low-cost home robots. a comprehensive overview of the research is best gained by browsing the entries to microsoft's indoor localisation competition, held three years in a row [2], starting in 2014. the best scores are often achieved by using expensive components such as lidars. however, for a low-cost mobile robot, localisation has to be both reliable and inexpensive. consequently, the compromise comes down to the ratio between positioning accuracy and the costs of producing and implementing the localisation. this is not difficult to achieve for service robots.

received october 7, 2016; received in revised form december 15, 2016 corresponding author: miloš petković, faculty of electronic engineering, university of niš, serbia, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: milos.petkovic@elfak.ni.ac.rs) * an earlier version of this paper received best section paper award at electronics section at 3rd international conference on electrical, electronic and computing engineering icetran 2016, zlatibor, serbia, 13-16 june, 2016 [1]

for example, home cleaning robots do not require high-precision localisation for wandering. however, if servicing an arbitrary point in the workspace is required, comprehensive research is needed in order to stay below the price tag. furthermore, indoor localisation is especially challenging [3] because the gps signal is weak or non-existent, and because of occlusion caused by the variety of objects and their placement within a room. thus, any method that needs a straight line of sight between two parts would require a redundant solution. on the other hand, such an increase in complexity leads to an increase in the overall costs.
therefore, careful consideration is needed before choosing the right method. the localisation is based on a low-cost, ultrasonic, time-of-flight distance measuring system, similar to cricket [4, 5]. the robot emits an ultrasonic beacon signal, while fixed wall-mounted devices measure the time of flight. this kind of system is often inexpensive, so increasing redundancy by adding more wall devices does not increase the overall system cost considerably. the use of straightforward trilateration poses a few problems. the first one appears when, due to measurement errors, three or more spheres do not intersect at a single point. for smaller measurement errors this could be neglected and considered a rounding error. since the accuracy of our system was only known to be better than ±10 cm, this could not be the case. the other problem, a special case of the first one, is the absence of any intersection between the spheres in the case of negative errors. mathematically speaking, the solution of the trilateration is then imaginary. arguably, accuracy could be improved by calibrating each wall unit separately and ensuring that their coordinates are precise. however, in cases of occlusion and reflections, these problems would reappear. therefore, we seek a solution through criterion-based optimisation, to get as close as possible to the point that minimises the measurement error. further improvement could be achieved by using a secondary, more accurate localisation system. when the two systems run in parallel, the second system is a good reference for the calibration of the first one. for this purpose, the localisation rate does not even need to be high. therefore, we decided to base the secondary system on computer vision and recognition of passive markers. low cost was a priority as well, so overcoming the typical drawbacks of such image processing methods was important.
fisheye lens distortion was removed by using known geometry [6], and the complexity of object recognition was avoided by simplifying and colour coding the markers [7]. the remaining drawbacks and their solutions are presented in detail in the following section.

2. visual feedback mapping for localisation

2.1. materials and method

we placed a fish-eye webcam on the ceiling in the middle of the test room. in order to make this system affordable, we based it on a full hd webcam, genius f100, with a 120° view angle lens, and a moderately powerful pc with an amd athlon ii x3 455 3.30ghz processor, an ati radeon hd 6450 graphics card and 4gb of ddr3 ram. the distance between the camera lens and the floor is 3.1 metres; therefore, the camera with the 120° wide view angle lens can cover an area of 4×3 m. a grid of 0.5 × 0.5 m was drawn on the floor to ease calibration and provide a visual cue during the measurements. the grid is highly accurate, with only 5 mm of distortion error over the diagonals of 5 m. rectification was crucial for this system because the wide-angle lens used has intrinsic distortion. its removal is easy since the camera itself is stationary and the marker height was assumed to be constant. sampling images of the marker at different positions reveals the levels of distortion. this data is then used to invert the effects. we gathered those samples at the drawn grid points. as an aid we used a tripod, and as a marker we used an orange ball, as shown in fig. 1. the height of this customised calibration tool was set to 1.1 m, which reduced the distance between the camera lens and the marker to exactly 2 m. after relocating the tripod around the grid and overlaying all the images one on top of the other, we generated fig. 1. the central part of the grid, which aligns with the middle of the camera, is free from lens distortion.
that is why we dropped some of the middle points but kept those on the edges and in the corners, where the distortion is at its largest.

fig. 1 overlay of tripod with marker as calibration points in our test room.

we found it fitting to divide the frame into 9 regions and linearize them independently. this keeps rectification simple and calibration easy. the number of pixels between sampled points was counted manually and converted to centimetres. calibration constants and offsets for each region were then calculated and embedded in the positioning algorithm. distortional displacement within the camera image is not the same for close and distant objects, so an additional calibration is required if the height of the marker is changed. however, there is no need for this if the marker placement is optimal. the best place for the marker is on top of the tracked object, where the chances of occlusion are negligible. we should note that markers placed higher do require more linearization sectors, as the difference between the real position of the object on the floor and its position in the camera frame varies. an important part of simplifying marker recognition is colour coding. this makes identification easy. in addition, the extracted marker shape is more accurate, which enhances precision in calculating the marker centre. we implemented this extraction through pixel classification. the classification generates a black and white image, in which the white pixels are those whose original colour lies in the colour space adjacent to the marker colour. this new image contains slightly etched shapes of the markers, with some artefacts as well. a further smoothing filter corrects this. we suggest gaussian blur, as it produced quite useful results for us. larger artefacts, if they persist, are filtered out by shape and size classification. we opted for a circular marker design. marker colour distinction also enables multi-object tracking, or orientation recognition by using two markers per object. in particular, we used the larger, orange marker for tracking position, while the smaller, green one was an aid in tracking the robot heading. this marker combination proved to be the most desirable with respect to program execution time. after the marker positions in pixels are extracted, in our case after the centre of the only remaining circle is calculated, they are converted to absolute positions in centimetres by using formula (1) and the calibration constants.

$$mpc = \frac{mp - os_1}{c_{calib}} + os_2 \qquad (1)$$

mp is the marker position in pixels, while os1 is the marker offset in pixels for the region it belongs to. ccalib and os2 are the linearity gain and the offset in centimetres for the region. their values, for all nine calibration regions, are given in table 1. finally, mpc is the marker position in centimetres, in the coordinate system whose origin is placed at the bottom left calibration point of fig. 1.

table 1 calibration constants and offsets for conversion into cm
marker position | os1 x | os1 y | ccalib x | ccalib y | os2 x | os2 y
upper left | 285 | 28 | 3.44 | 3.64 | 0 | 0
centre left | 285 | 28 | 3.44 | 3.43 | 0 | 0
lower left | 285 | 900 | 3.36 | 3.23 | 0 | 250
upper middle | 620 | 20 | 2.29 | 2.26 | 100 | 0
centre | 620 | 200 | 3.5 | 3.5 | 100 | 50
lower middle | 620 | 1319 | 3.6 | 3.6 | 50 | 0
upper right | 1319 | 28 | 3.44 | 3.43 | 300 | 0
centre right | 1319 | 28 | 3.44 | 3.43 | 300 | 0
lower right | 285 | 900 | 3.36 | 3.23 | 0 | 250

2.2. implementation and results

the program was developed under windows 10 with microsoft visual studio community 2015 and the opencv library, version 3.0. at the start-up of the program, camera parameters such as brightness, contrast, saturation, hue, gamma, sharpness and exposure are preset to suitable values. tweaking these parameters improves pixel colour classification under the given lighting conditions. we determined them experimentally for our neon-lit test room with west-facing windows.
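the per-region conversion of formula (1) can be sketched in a few lines. the python below assumes that formula (1) reads mpc = (mp - os1) / ccalib + os2, applied separately to the x and y axes; the function names and the dictionary layout are hypothetical, and only two of the nine regions of table 1 are copied in for brevity.

```python
# (os1 [pixels], c_calib, os2 [cm]) for the x and y axes of two regions of table 1
CALIBRATION = {
    "upper left": {"x": (285, 3.44, 0), "y": (28, 3.64, 0)},
    "centre": {"x": (620, 3.5, 100), "y": (200, 3.5, 50)},
}


def pixels_to_cm(mp: float, os1: float, c_calib: float, os2: float) -> float:
    """Formula (1) for one axis, assuming division by the linearity gain."""
    return (mp - os1) / c_calib + os2


def marker_position_cm(region: str, mp_x: float, mp_y: float) -> tuple:
    """Convert a marker position in pixels to centimetres for a given region."""
    cal = CALIBRATION[region]
    return pixels_to_cm(mp_x, *cal["x"]), pixels_to_cm(mp_y, *cal["y"])
```

a marker found exactly at the pixel offset of its region maps to the centimetre offset of that region, and each further `c_calib` pixels add one centimetre.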
prior to the pixel classification, the image is converted from rgb to hsv. after this, the inrange function is used as a classifier to generate a black and white image. as already stated, we used gaussianblur for smoothing the bw image and removing smaller artefacts. in the next step we calculate the marker position by data extraction. we used simpleblobdetector in this process. the parameters of this detector are set to ignore everything but circles of a particular size, thus filtering out any larger artefacts. the middle point of the found blob is taken as the marker position in pixels. to speed up the program, we decided to trim the sampled frames to a region of interest (roi). this way, computationally intensive functions like simpleblobdetector execute faster. during the initialisation phase, the program searches the whole frame for the marker until it is found. afterwards, the roi is extracted from each frame based on the previous marker position and the maximum expected movement. this roi trimming not only shortens the calculation time but also filters out other objects with visual properties similar to those of the marker. a precaution that needs to be taken is to make sure that no such objects are present at the start of the program. otherwise, some other object could be recognised for tracking instead of the marker, and the wrong roi would then be extracted. in the last step, the marker position in pixels is converted into the actual position in centimetres, in an absolute coordinate frame attached to the floor. approximately, one centimetre corresponds to 2.5 pixels. the initial verification of the system includes repeated measurements with the marker, placed on a tripod, at an arbitrary point in the workspace. this tests calibration accuracy and system repeatability.
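the roi trimming described above can be illustrated without any image library. the python sketch below only computes the rectangle to crop; the function name, the marker radius and the maximum movement in pixels are hypothetical parameters, and the actual program presumably crops the frame with opencv, which is not reproduced here.

```python
def roi_around(prev_x: int, prev_y: int, max_move_px: int, marker_radius_px: int,
               frame_w: int, frame_h: int) -> tuple:
    """Return (left, top, width, height) of the search window for the next frame.

    The window is centred on the previous marker position and is large enough
    to contain the marker after the largest expected movement. It is clamped
    to the frame so that it never extends past the image borders.
    """
    half = max_move_px + marker_radius_px
    left = max(0, prev_x - half)
    top = max(0, prev_y - half)
    right = min(frame_w, prev_x + half)
    bottom = min(frame_h, prev_y + half)
    return left, top, right - left, bottom - top
```

a position found inside the window has to be shifted back by `(left, top)` before it is converted to centimetres, because blob coordinates are then relative to the cropped image.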
after a large number of consecutive measurements, we can confirm that the system is reliable and repeatable at an acceptable level. a total of 1572 location samples of a still marker were acquired. on average, one localisation cycle required 235 ms. with a localisation rate of 4.26 hz, such a system is not suitable for localising high-speed mobile platforms. nevertheless, a robot that travels at a comfortable speed of 0.3 m/s would be localised at points 7 cm apart. this can be considered acceptable in applications such as fetching objects for a customer or telepresence, but not in precise object handling. repeatability for all 1572 measurements was within a one-centimetre range, which corresponds to 2 to 3 pixels of the camera. due to small variations in lighting and inherent camera noise, there is jitter in the marker position found by the simple blob detector. when the position in pixels is converted into the position in centimetres and rounded, the jitter passes on to the marker position in centimetres. an improvement is possible by increasing the camera resolution, or perhaps by increasing the number of linearization sections. however, we find the static performance of this system quite satisfactory for calibrating and supporting the low-cost, ultrasound-based, time-of-flight localisation system. for the dynamic testing of the camera localisation system, we decided to make several circular motions in the centre of the test room. there are two reasons for this. the first is the simplicity of the trajectory equations, which allows easier data analysis later on. the second is the trajectory length, which should provide enough time for the acquisition of a sufficient amount of data. since the test room was not large enough for straight-line movements, the most logical trajectory was circular. it can also be performed easily, without the need for an expensive setup. for example, a simple remotely driven mobile platform, such as a more powerful homemade rc car, suffices.
another option is a motor-driven rotating stand. at our disposal was a small, student-grade robotic platform. after attaching the marker to it, we started the localisation and made 30 laps at an approximately constant speed of 20 cm/s. the program was set to log the marker positions with the time stamps of frame acquisition. the time stamps are expressed in milliseconds, and the local time is measured from the beginning of the test. fig. 2 shows the plotted positions of the marker. as can be seen, the trajectory is circular, but there is some slight movement of the centre.

fig. 2 logged trajectory of circular motion of marker. number of repetitive cycles is 30.

in the next step, we carried out a time stamp analysis of the measurement run, which was about 426 s long. this was necessary for the later analysis of the trajectory. the logged time appeared rather linear when plotted. the average time between processed frames is 236 ms, with a standard deviation of 22.9 ms. the histogram of the time between two successively grabbed frames, shown in fig. 3, reveals an interesting pattern.

fig. 3 histogram of time differences, dt, between two successively grabbed frames.

the histogram peaks are equally spaced, approx. 15 ms apart. since the camera streams at about 30 fps, this 15 ms appears to be half of a frame time. the average period of 236 ms then corresponds to 7 frames. a slight variance in the stream frame rate and in code execution could lead to frame grabbing jitter. the jitter would be only one frame. its effect would be an increase in localisation uncertainty equal to one frame time multiplied by the speed of the marker. if the speed is low, the increase in uncertainty is only a few centimetres. in our case, for a speed of just under 20 cm/s, it is evaluated at 0.6 cm.
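the figures quoted above follow from simple arithmetic, reproduced in the python sketch below. the 30 fps stream rate and the 236 ms average period come from the text; the 18.4 cm/s value used in the example is the fitted peripheral speed reported later in the paper, and the function names are hypothetical.

```python
FPS = 30.0                # nominal camera stream rate
FRAME_TIME_S = 1.0 / FPS  # about 33.3 ms per frame
MEAN_CYCLE_S = 0.236      # average time between processed frames


def frames_per_cycle(cycle_s: float = MEAN_CYCLE_S) -> float:
    """Number of streamed frames that pass during one localisation cycle."""
    return cycle_s / FRAME_TIME_S


def jitter_uncertainty_cm(speed_cm_s: float, jitter_frames: int = 1) -> float:
    """Extra position uncertainty caused by a grabbing jitter of whole frames."""
    return speed_cm_s * jitter_frames * FRAME_TIME_S


print(frames_per_cycle())           # close to 7 frames
print(jitter_uncertainty_cm(18.4))  # close to 0.6 cm
```

the same function shows how quickly the uncertainty grows with speed: at the 0.3 m/s quoted for a comfortably moving robot, one frame of jitter already amounts to 1 cm.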
when the time stamps are converted to an integer number of frames from the beginning of the test, and the time difference is recalculated, the histogram looks as in fig. 4. now it is much clearer that almost half of the samples are taken with a 7 frame difference. of the remaining samples, about one third are taken with a 6 frame difference and one third with an 8 frame difference. in other words, the standard deviation is 0.77 frames. to conclude, as far as the timing analysis is concerned, since no real-time os was used, a variance in frame processing and sampling does exist. however, it is not more than 10 frames, or one third of a second.

fig. 4 histogram of time stamp differences, when time is converted to frames with 30 fps rate.

in parallel with the dynamic performance test, we carried out an additional timing analysis. we wondered whether this kind of localisation system could be integrated as a small localisation device capable of broadcasting the tracked object location via wi-fi. thus, the image processing pc was set to send the position via udp packets to a pc within the same wireless network. comparing the localisation frame sampling times with the udp arrival times, we obtained a time difference of 236 ms between successive location updates. on the other hand, the standard deviation is now 133 ms, which is almost 6 times more than for the localisation alone. the main culprits are packet buffering and wireless signal quality. because of them, a considerable number of packets were late. note also that this differential analysis excludes the fixed amount of latency from wi-fi, as it did with camera frame grabbing. since we are using low-cost off-the-shelf components, it is not possible to determine this kind of delay accurately, at least not without the use of special setups.
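a location packet that carries the frame time stamp could look as in the python sketch below. the packet layout is purely hypothetical, since the paper does not describe the format that was actually sent; the sketch only shows that a time stamp and a position fit into a small fixed-size payload that the receiver can decode without ambiguity.

```python
import struct

# Hypothetical layout: little-endian uint32 frame time stamp in milliseconds,
# followed by two float64 values with the marker position in centimetres.
PACKET_FORMAT = "<Idd"
PACKET_SIZE = struct.calcsize(PACKET_FORMAT)  # 4 + 8 + 8 bytes


def encode_location(timestamp_ms: int, x_cm: float, y_cm: float) -> bytes:
    """Pack one localisation result into a fixed-size payload."""
    return struct.pack(PACKET_FORMAT, timestamp_ms, x_cm, y_cm)


def decode_location(payload: bytes) -> tuple:
    """Unpack a payload into (timestamp_ms, x_cm, y_cm)."""
    return struct.unpack(PACKET_FORMAT, payload)


payload = encode_location(1234, 191.1, 149.6)
# The payload could be sent with socket.socket(AF_INET, SOCK_DGRAM).sendto(...);
# the network part is left out so that the sketch runs without a peer.
```

because the time stamp travels with the position, the receiver can compare it with its own arrival time and treat late packets accordingly.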
nevertheless, we find sending the location via udp packets over wi-fi plausible for control purposes; however, the control algorithms must either be rugged enough for variable time delays or take advantage of the frame time stamp and perform small corrections of the received location. in the following step, we carried out the trajectory analysis in two stages. firstly, we found the trajectory radius r and centre (x0, y0), as well as the speed of the centre movement (vx, vy). this was achieved by finding the best fit for function (2).

$$f(x, y, t) = r - \sqrt{(x - v_x t - x_0)^2 + (y - v_y t - y_0)^2} \qquad (2)$$

basically, function (2) represents the difference between the estimated radius and the radius of the acquired location. for any measured point it should be equal to zero. the best fit gave r of 41.9 cm, (x0, y0) of (191.1, 149.6) cm, and (vx, vy) of (0.264, -0.096) mm/s. the average error of the best fit is 6e-16, while the standard deviation is 0.633 cm. it is interesting to note that the standard deviation is at the level of the mentioned frame jitter for an object with a speed of 20 cm/s. nevertheless, we state that the accuracy of this system for a moderate speed of the tracked marker is ±1.5 cm, or ±2.25 cm if absolute limits are applied. so the performance of the system in tracking a moving object does not differ much from the static measurements. now, if we take into consideration that the speed of the marker was constant, we can assume that the coordinates (x, y) change as in (3), where ω is the constant angular velocity and ϕ is the initial angular offset. formula (3) is our ideal mathematical model of the real trajectory.

$$(x(t), y(t)) = (x_0 + v_x t + r\cos(\omega t + \phi),\; y_0 + v_y t + r\sin(\omega t + \phi)) \qquad (3)$$

the difference between the trajectory given by formula (3) and the measured data is given by function (4). ideally, it equals zero.
$$g(x, y, t) = \sqrt{(x_0 + v_x t + r\cos(\omega t + \phi) - x)^2 + (y_0 + v_y t + r\sin(\omega t + \phi) - y)^2} \qquad (4)$$

the best fit gives an angular velocity of -0.439 rad/s, which translates to a peripheral speed of 18.4 cm/s, and an angular offset of 3.163 rad. the negative velocity comes from the clockwise direction of the trajectory. the average fitting error is 2.8 cm and the standard deviation is 1.8 cm. since this result is much worse than the one from the trajectory path analysis, we conclude that the method is accurate for localisation within a frame. however, when the tracked object is moving, unsynchronised frame grabbing causes a larger margin of error. indeed, when we calculated the distances travelled between successively sampled frames, we got 4.4 cm on average and a standard deviation of 0.5 cm. this is a large variance, considering that the marker speed was nearly constant. after calculating the velocities between successive samples, we found an average speed of 18.6 cm/s and a standard deviation of 0.6 cm/s. so generally, due to the variance in image capture timing, a velocity approximation based on only two samples is very rough. however, after filtering, this information seems quite right.

3. time-of-flight localisation method

3.1. materials and method

a simplified block diagram of the time-of-flight distance measurement system is presented in fig. 5. there is a beacon that emits ultrasound on the left and a wall-mounted device on the right. the minimum number of wall devices necessary for successful trilateration is three. before the beacon fires a burst of waves, it notifies the wall device via the radio module, and the wall device starts its counter. when the wall device detects the emitted sound, it stops the counter. information about the time of flight is then sent via radio. the distance is calculated by multiplying the time of flight by the speed of sound.
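the last step, from counter value to distance, can be written out as a short python sketch. the speed of sound of 343 m/s (dry air at about 20 °c) and the counter frequency are assumptions of this example, as are the function names; the paper states neither value.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed constant, dry air at about 20 degrees Celsius


def tof_to_distance_m(tof_s: float) -> float:
    """Distance travelled by the ultrasonic burst during the time of flight."""
    if tof_s < 0:
        raise ValueError("time of flight cannot be negative")
    return tof_s * SPEED_OF_SOUND_M_S


def ticks_to_distance_m(ticks: int, timer_hz: float) -> float:
    """Convert a counter value to distance for a counter running at timer_hz."""
    return tof_to_distance_m(ticks / timer_hz)
```

with these assumptions, a counter running at a hypothetical 1 mhz that stops after 10000 ticks corresponds to a time of flight of 10 ms and a distance of about 3.43 m.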
since the device is for indoor use only, changes in the speed of sound due to temperature variations are neglected. multiple ultrasonic transducers are used in both devices. the beacon covers 360 degrees horizontally and about 45 degrees vertically. the wall device covers about 140 degrees horizontally and 45 degrees vertically. therefore, proper redundancy is needed for a specific coverage. currently we use 4 wall devices placed in the corners of a rectangle and oriented toward the common centre. we made sure to take measurements only in areas covered by more than 3 wall units. although the devices are low-cost to make, this is only an initial accuracy test, so we consider coverage of any particular size or shape irrelevant.

fig. 5 simplified block diagram of system: ultrasound emitting beacon on the left and time-of-flight measuring wall mount device on the right.

in order to overcome the problems of trilateration with a low-accuracy, but also low-cost, distance measuring system, we calculate the solution by minimising the sum of squares of the measurement errors. in the minimisation function

$$f(p_r) = \sum_{i=1}^{n} \left( \lVert p_r - p_i \rVert - d_i \right)^2, \qquad (5)$$

n represents the number of wall devices that responded to the ultrasonic beacon. the position vector of the beacon pr and the position vectors of the wall devices pi are defined in 3d with regard to some ground reference point. the vectors pi, where i is from 1 to n, are known, as they are measured during the installation of the localisation system. the x and y axes are in the plane of the floor, while the z axis is oriented toward the ceiling. the measured distances di are obtained shortly after the beacon signal is emitted. the function minimum is located around the beacon's position. the function is equal to zero when no measurement error is present. otherwise, measurement errors introduce a small uncertainty in the position. when the noise in the measured data is random, there is no possibility of narrowing down the solution search area, at least not statically.
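function (5) is easy to evaluate directly, as the python sketch below shows. it assumes that (5) is the sum of squared differences between the geometric and the measured distances; the wall device coordinates, the beacon position and the 10 cm error are hypothetical values chosen only for illustration.

```python
import math

# Hypothetical layout in metres: four wall devices in the corners of a
# 4 x 3 m rectangle, all mounted at the same height.
WALL_DEVICES = [(0.0, 0.0, 2.5), (4.0, 0.0, 2.5), (4.0, 3.0, 2.5), (0.0, 3.0, 2.5)]


def cost(p_r, wall_positions, distances) -> float:
    """Sum of squared range errors for a candidate beacon position p_r."""
    return sum((math.dist(p_r, p_i) - d_i) ** 2
               for p_i, d_i in zip(wall_positions, distances))


beacon = (1.0, 2.0, 1.3)
exact = [math.dist(beacon, p_i) for p_i in WALL_DEVICES]
noisy = [exact[0] + 0.10] + exact[1:]  # +10 cm error on the first device only

print(cost(beacon, WALL_DEVICES, exact))  # zero when there is no measurement error
print(cost(beacon, WALL_DEVICES, noisy))  # about 0.01, the squared 10 cm error
```

the function accepts any number of responding devices, which reflects the remark that redundancy does not change how the solution is calculated.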
In order to test this method, we created a Wolfram Mathematica script. It simulates a system of 3 or 4 wall devices and a beacon. The distance measuring error is randomly generated and added to the precise value. We set the x and y plane to correspond to the floor and the z axis to point to the ceiling. Although this method allows finding the position of the beacon in 3D, we are more interested in keeping its height constant, which is the most probable use case in mobile robotics. Therefore, the script visualises the horizontal plane at the fixed beacon height of z = 1.3 m, as in Fig. 6. Possible beacon positions in that plane are circles, drawn as thick circular arcs in Fig. 6. Note that both positive and negative measurement errors were introduced. The dot represents the calculated position, while the short lines that connect it to the arcs are the estimated measurement errors. The squares represent the projections of the wall devices onto the plane; they are also the centres of the circles. The lower left part contains a magnified detail around the dot.

Fig. 6 A plane with a constant z coordinate of 1.3 m that contains the calculated robot position, shown with a dot. Possible beacon positions in that plane, according to the measured data, are circles, shown partially with thick arcs. The short lines represent the estimated measurement error. The squares represent the projections of the wall devices onto the plane; they are also the centres of the circles. The zoomed detail around the solution point is presented at the bottom left.

Visual checks were only used as an aid for better understanding of the behaviour of the solution in response to errors and device placement. For example, the actual and calculated positions are identical when there is no measurement error. Equal errors in all wall devices tend to cancel each other. Numeric evaluation was done as well. We used the NMinimize function for minimisation.
Available minimisation methods are Nelder-Mead [8], differential evolution [9], simulated annealing [10] and random search [11]. We used all of them side by side in order to compare them with respect to efficiency and accuracy. Wall devices were placed in a rectangular pattern at the same height, as they would commonly be used. We generated random beacon positions, calculated the exact distances to the wall devices, and then added a Gaussian error in the range of ±10 cm. The beacon location found by minimisation of function (5) was accurate enough, mostly below 5 cm of error. However, in some cases the error reached almost 20 cm. This occurs when two adjacent wall devices have the maximal error of +10 cm while the opposite two have -10 cm of error. The probability of this is rather low, and the general conclusion is that the method works well. It is robust to both positive and negative measurement errors. A solution exists regardless of the number of wall devices. Increasing their number to overcome temporary occlusion problems does not affect the solution calculation, neither in complexity nor in time. Comparison of the results of the four minimisation methods showed no significant difference between them: the difference in accuracy was well below 1 cm, and the same can be said about efficiency. We therefore chose Nelder-Mead for the practical implementation.

3.2. Implementation and results

After successful validation of the method in Wolfram Mathematica, we implemented it in C++. We chose the Nelder-Mead solver from the GNU Scientific Library. The program was run on a MinnowBoard computer with a non-commercial Ubuntu OS. The MinnowBoard is an open-source, 64-bit Intel® Atom™ based mini/embedded PC. Initial tests were done with pre-calculated examples generated with the Mathematica script. The execution time was about 1 ms on average.
It sometimes reached 3 ms; however, this was not the only program running. Nevertheless, we find this quite satisfactory. At service-robot speeds, this introduces a localisation error of less than one millimetre. Delays in the distance measuring system are much greater, and the position sampling rate is below 10 Hz. If the execution time has to be reduced, this could be done by lowering the solver precision. We noticed that in most cases 10 to 20 iterations were enough to obtain the position with centimetre resolution.

As with the visual feedback localisation in Section 2, we initially verified the system through repeated measurements with the beacon fixed at an arbitrary position in the workspace. This verification helps in understanding the repeatability of the measurement and gives reasonable confidence in its usability for further implementation on a mobile robot. After a large number of consecutive measurements, we can confirm that the system is reliable and repeatable at an acceptable level. The beacon firing rate was fixed, with a period of 150 ms, which corresponds to a frequency of 6.67 Hz. Although we could set it up to 10 Hz, we did not want to use it at its limits. A set of 1172 measurements at a fixed position is presented as a histogram in Fig. 7. The average point is (213 cm, 169 cm) and the standard deviation is 0.62 cm, or 0.38 cm for the x axis data and 0.49 cm for the y axis data. In total, only 0.26%, or 3 points, are outside the ±1.5 cm accuracy region. These data show a satisfactory initial accuracy of the method. Although it returns slightly more scattered locations than the camera-based method, it works faster.

For dynamic testing of the ultrasonic localisation system, we performed the same test as with the camera-based localisation system. Furthermore, we decided to run both tests in parallel, which makes the comparative analysis easier. The ultrasonic beacon was therefore placed on the same platform as the marker.
Since the platform, which was in the centre of the test room, was making circular motions, both the marker and the beacon had the same centre of rotation. Since the beacon must not occlude the marker, it was placed as close as possible to it. Nevertheless, there was still a slight difference of almost 3 cm in their radii. The initial trajectory analysis confirmed a slightly lower localisation accuracy of this system compared to the camera-based one. Therefore, we decided to use the centre of rotation (x0, y0) calculated from the camera-based system trajectory analysis, as well as the speed values (vx, vy), and to repeat the fitting process with (2). The best result gave r of 44.8 cm, an average error of -3e-14, and a standard deviation of 7.44 cm. This result is much higher than the one for the static test. This stems from the poor choice of RF modules for the system: they are low-power ZigBee modules. Several studies indicate low performance of ZigBee communication in the presence of Wi-Fi signals, as summarised in [12], which clearly states that a Wi-Fi signal can corrupt a ZigBee signal at the bit level or cause a drastic increase in retransmissions. Since our setup room had one Wi-Fi router and there were many more in nearby offices, we observed both effects. When we analysed the arrival times of packets from a single wall device, we discovered that the latency between packets varies drastically. Instead of packets arriving at the regular beacon firing interval of 150 ms, plus or minus the time of flight of ultrasound over up to 5 m, packets were buffered and arrived less than 30 ms apart. Since packets with distance information were not time-stamped at the transmitter side, it was impossible to determine whether the wall device failed to transmit after one beacon firing or the measured distance information arrived after the following beacon firing.
In such cases, mixing of data occurred. This can alternatively be interpreted as higher inaccuracy in distance measurement, which leads to higher localisation error. At some rare moments, packets from an unknown wall unit address were received, which we interpret as obvious corruption of data. It is quite possible that the lower performance of the ZigBee modules is also due to their quality, since they were among the cheapest on the market.

Fig. 7 Histogram of 1172 measurements at a single beacon point. The most frequently measured position is (213, 169) cm.

Problems associated with ZigBee modules could perhaps be overcome by using better and more reliable modules, and by implementing a better protocol for sending data over ZigBee, as suggested in [12]. Another solution could be to use modules that avoid the overcrowded 2.4 GHz band altogether. Since we had already identified the problematic latency in our system, we skipped the second part of the trajectory accuracy analysis that we did with the camera-based system; it would not add any value to the results.

4. Conclusion

We have implemented two methods for indoor localisation and tested them against each other under identical conditions in our testing facility. After initial static testing and validation of the systems' accuracy with a laser range finder, we determined that the first method, the camera-based one, has better accuracy. Although its localisation rate is half that of the time-of-flight method, we decided to use it as the reference system during dynamic testing. Since mobile service robots move at moderate speeds, the localisation rate of the vision-based system is quite adequate. The dynamic test showed that the ultrasonic localisation system has lower accuracy and a lower measurement success rate, due to communication glitches of the ZigBee modules that require additional attention and improvements.
On the other hand, the first method has its own pitfalls. The foremost is its sensitivity to changes in lighting conditions. It also requires a comprehensive calibration, which should be automated in order to make it an off-the-shelf localisation solution. The standard PC could easily be replaced with an embedded-type PC, for example any of the newer Raspberry Pi series. Nevertheless, both systems proved simple to set up and use. Their low implementation cost makes them affordable for use in education and in some less demanding real-life applications, such as service robots. In conclusion, the camera-based system is better for laboratory conditions due to its high accuracy. The other system, although less accurate, is more suitable for a variety of other locations.

References

[1] M. Petković, V. Sibinović, D. Popović, V. Mitić, D. Todorović and G. S. Đorđević, “Robust indoor localisation methods of mobile robots: direct visual feedback and time-of-flight trilateration”, in Proceedings of the 3rd International Conference on Electrical, Electronic and Computing Engineering (IcETRAN 2016), Zlatibor, Serbia, June 13-16, 2016, ELI2.6 1-6.
[2] “Microsoft Indoor Localisation Competition”. research.microsoft.com. N.p., 2016. Web. 15 Apr. 2016.
[3] J. Borenstein, et al., “Mobile robot positioning sensors and techniques”, invited paper for the Journal of Robotic Systems, Special Issue on Mobile Robots, vol. 14, no. 4, pp. 231-249, 1996.
[4] “The Cricket Indoor Location System: An NMS Project”. cricket.csail.mit.edu. N.p., 2016. Web. 15 Apr. 2016.
[5] N. B. Priyantha, A. Chakraborty and H. Balakrishnan, “The Cricket location-support system”, in Proceedings of the 6th ACM MobiCom, Boston, MA, August 2000.
[6] C. Hughes, et al., “Wide-angle camera technology for automotive applications: a review”, IET Intelligent Transport Systems, vol. 3, no. 1, pp. 19-31, 2009.
[7] Z. Garofalaki, et al., “Object motion tracking based on color detection for Android devices”, International Journal of Computer, Electrical, Automation, Control and Information Engineering, vol. 9, no. 4, pp. 970-973, 2015.
[8] J. A. Nelder and R. Mead, “A simplex method for function minimization”, Computer Journal, no. 7, pp. 308-313, 1965.
[9] R. Storn and K. Price, “Differential evolution - a simple and efficient heuristic for global optimization over continuous spaces”, Journal of Global Optimization, no. 11, pp. 341-359, 1997.
[10] S. Kirkpatrick, C. D. Gelatt Jr and M. P. Vecchi, “Optimization by simulated annealing”, Science, vol. 220, no. 4598, pp. 671-680, 1983.
[11] L. A. Rastrigin, “The convergence of the random search method in the extremal control of a many parameter system”, Automation and Remote Control, vol. 24, no. 10, pp. 1337-1342, 1963.
[12] C. M. Liang, N. B. Priyantha, J. Liu and A. Terzis, “Surviving Wi-Fi interference in low power ZigBee networks”, in Proceedings of the 8th ACM Conference on Embedded Networked Sensor Systems, ACM NY, 2010, pp. 309-322.

Facta Universitatis, Series: Electronics and Energetics, Vol. 31, No 1, March 2018, pp. 63-74
https://doi.org/10.2298/fuee1801063f

Prediction of annual energy production from PV string under mismatch condition due to long-term degradation *

Miodrag Forcan
University of East Sarajevo, Faculty of Electrical Engineering, East Sarajevo, Bosnia and Herzegovina
University of Belgrade, Faculty of Electrical Engineering, Belgrade, Serbia

Abstract. Reduction of long-term degradation effects represents a long-standing challenge in the photovoltaic (PV) manufacturing industry. Modelling of long-term degradation types and their impact on the maximum power of PV systems is analysed in this article. Brief guidelines for PV cell-based modelling of PV systems are illustrated.
A special study case, a PV string consisting of 12 PV modules, is modelled in order to determine degradation and mismatch power losses. A modified methodology for prediction of annual energy production from a PV string, based on experimental measurements of horizontal irradiation and ambient temperature at the location of Belgrade, is developed. A coefficient named “degradation factor” is introduced to include and validate degradation power losses. Economic considerations indicate an evident reduction in income, as a consequence of lower annual energy production related to long-term degradation.

Key words: PV string, energy production, long-term degradation, degradation factor, mismatch losses

1. Introduction

Precise determination of annual energy production from PV systems is very difficult to achieve, mostly due to variable operating conditions (irradiation and ambient temperature) [1]. Electricity production is closely related to conversion efficiency, which is one of the most important parameters when discussing PV systems [2]. New materials are constantly being developed with the purpose of increasing conversion efficiency and mitigating degradation effects. According to the research presented in [3], organic materials with PV properties have proved to be one of the most promising solutions. Meanwhile, conventional silicon materials remain the most widely used in field applications.

Received January 31, 2017; received in revised form September 18, 2017
Corresponding author: Miodrag Forcan, University of East Sarajevo, Faculty of Electrical Engineering, Vuka Karadzica 30, 71126 Lukavica, 71123 East Sarajevo, Republic of Srpska, Bosnia and Herzegovina (e-mail: miodrag.forcan@live.com, miodrag.forcan@etf.unssa.rs.ba)
* An earlier version of this paper was presented at the 2nd Virtual International Conference on Science, Technology and Management in Energy (eNergetics 2016), 22-23 September, 2016, in Niš, Serbia [1].
Conventional methods for prediction of energy production from PV systems usually use hourly-averaged horizontal irradiation and ambient temperature measurements for specific locations [4-6]. One of their main shortcomings is that they neglect long-term degradation related to the encapsulating material, e.g. delamination, discoloration and corrosion. According to various research results [7-12], long-term degradation effects can often reduce a PV system's power by up to 15-20% during its lifetime.

This article is organized as follows. The second chapter covers basic facts related to the most common types of long-term degradation. In the third chapter, modelling guidelines for PV systems and degradation types are presented. PV module degradation effects under variable irradiation and temperature conditions are analysed in the fourth chapter. The fifth chapter presents a study case dedicated to PV string power reduction due to long-term degradation. A modified methodology for prediction of annual energy production from a PV string and of financial income, based on the introduction of a degradation factor, is investigated in the sixth chapter. Conclusions are drawn in the final chapter.

2. Long-term degradation of PV systems

Degradation represents a gradual deterioration of PV system components caused by real operating conditions in the field. Affected PV modules can continue to generate electricity, although the produced energy can be significantly reduced. According to manufacturers, it is common practice to identify a PV module as degraded when its maximum power falls below 80% of the initial value. Long-term degradation of PV systems is related to deterioration of the encapsulation material, and its effects can be observed on the surfaces of PV cells during the exploitation period. Ethylene vinyl acetate (EVA) has been recognized, over the decades, as one of the best encapsulation materials for PV cells.
As a consequence, nearly 80% of PV modules produced around the world are encapsulated with EVA [7]. Typical long-term degradation types of PV cells related to EVA are delamination, discoloration and corrosion. Characteristic field examples of PV cells affected by long-term degradation are shown in Fig. 1 [8].

Fig. 1 Typical long-term degradations of PV cells [8]: (a) delamination; (b) discoloration; (c) corrosion

2.1. Delamination

Glass, EVA and PV material are tightly affixed (laminated) in normal PV cells. Damage to any of these layers can lead to delamination. Delamination is the separation between the different layers within the PV cells, and it is usually followed by the penetration of moisture and corrosion. The most commonly affected surface area of a PV cell is located around the busbars, as can be seen in Fig. 1(a). This type of degradation is observed in more than 50% of installed PV modules according to [9].

2.2. Discoloration

Ultraviolet radiation, combined with high humidity and ambient temperature, is recognized as the main cause of discoloration. Discoloration is the most common type of long-term degradation; it is an electro-chemical process in which the PV material changes colour, usually from light yellow to brown (Fig. 1(b)). According to [10] and [11], discoloration can reduce a PV cell's short-circuit current by up to 15%.

2.3. Corrosion

The main reason for corrosion in PV cells is moisture penetration. Corrosion damages the metal parts and contacts of PV cells (Fig. 1(c)), which leads to an increase in the PV cell's series resistance. Based on the results of accelerated corrosion tests, it has been found that the probability of corrosion is related to the presence of oxygen in the silicon layers of the PV cell [12].

3. PV system modelling

In order to precisely determine degradation and mismatch power losses in PV systems, it is essential to use PV cell-based modelling [1]. For proper calculation of degradation effects on PV systems, it is necessary to model the dependence of generated power on specific ambient conditions. It is common practice to model the I-V curve with irradiation and temperature as controllable primary input variables. A one-diode MATLAB-based model of a PV cell has been created by using recommendations from [13]. The corresponding PV module and cell models have been used throughout previous research and a series of related publications [14-16]. Future PV modelling research will include the cooling effect of wind on PV cell temperature [17]. Regarding the mismatch effect due to long-term degradation, it can be assumed that wind conditions are uniform over the relatively small surface of a PV string. Low irradiation effects have been included in the modelling process by treating the PV cell's series resistance, parallel resistance and diode ideality factor as functions of irradiation and operating temperature, with corresponding analytical expressions recommended in the literature [18-20]. PV module modelling has been realized in MATLAB/Simulink [21]. The chosen PV module ZDNY-250P60 250Wp [22] consists of 60 polycrystalline Suntellite 156M PV cells, with electrical data for standard test conditions (STC) presented in Table 1. The PV string model consists of 12 PV modules with a maximum installed power of 2.995 kW. Similar types of PV systems are often used on the roofs of households in urban environments. The PV system modelling procedure is presented in Fig. 2. PV modules within the PV string are numbered 1-12.
Table 1 STC electrical data of Suntellite 156M PV cell and PV module

Suntellite PV       PV cell        PV module
Efficiency [%]      17.00-17.19    17.00
Pmpp [W]            4.16           249.61
Vmpp [V]            0.531          31.84
Impp [A]            7.834          7.84
Voc [V]             0.63           37.78
Isc [A]             8.35           8.35
FF [%]              79.08          79.12

Fig. 2 PV system modelling procedure: (a) one-diode PV cell model; (b) 60-cell PV module model; (c) 12-module PV string model

3.1. Long-term degradation modelling

In order to analyse the effects of long-term degradation on PV string power, it is necessary to establish a relation between the degradation mechanisms and the PV cell's parameters. As delamination, discoloration and corrosion are impossible to predict precisely, their modelling is limited to approximate relations resulting from field observations and statistical analysis of experimentally obtained data.

According to experimental research results [8], delamination reduces the PV module's short-circuit current Isc, while its effect on the open-circuit voltage Voc can be neglected. Based on experimentally obtained data for a characteristic PV module, several modelling cases are defined:
1. Case 0 (Del 0): no delamination; Isc = Isc-(STC).
2. Case 1 (Del 1): limited area around PV cells' busbars affected; Isc = 0.95 × Isc-(STC) (5% decrease).
3. Case 2 (Del 2): limited area around small cracks in PV module's surface; Isc = 0.92 × Isc-(STC) (8% decrease).

Based on statistical analyses and experimental field data obtained in a temperate climate zone [23], discoloration can also be modelled as a reduction of Isc. The following cases are defined:
1. Case 0 (Dis 0): no discoloration; Isc = Isc-(STC).
2. Case 1 (Dis 1): bright colours present on less than 50% of PV module's surface; Isc = 0.9473 × Isc-(STC) (5.27% decrease).
3. Case 2 (Dis 2): bright colours present on more than 50% of PV module's surface; Isc = 0.9137 × Isc-(STC) (8.63% decrease).
4. Case 3 (Dis 3): dark colours present on less than 50% of PV module's surface; Isc = 0.9088 × Isc-(STC) (9.12% decrease).
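The delamination and discoloration cases listed above can be collected in a small lookup, sketched here in Python. The factors and the STC short-circuit current are taken from the lists and from Table 1; multiplying the two factors when both degradation types are present is an assumption of this sketch, since the way the cases are combined in the underlying model is not stated here.

```python
# Isc scaling factors from the modelling cases listed above.
DELAMINATION_FACTOR = {0: 1.0, 1: 0.95, 2: 0.92}
DISCOLORATION_FACTOR = {0: 1.0, 1: 0.9473, 2: 0.9137, 3: 0.9088}

ISC_STC_A = 8.35  # short-circuit current at STC (Table 1)


def degraded_isc(del_case: int, dis_case: int, isc_stc: float = ISC_STC_A) -> float:
    """Short-circuit current for a given delamination and discoloration case.

    Assumption of this sketch: the two reductions act independently,
    so their factors are simply multiplied.
    """
    return isc_stc * DELAMINATION_FACTOR[del_case] * DISCOLORATION_FACTOR[dis_case]


# Example: Del 1 with no discoloration, and Dis 3 with no delamination.
print(degraded_isc(1, 0), degraded_isc(0, 3))
```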
Regarding corrosion modelling of PV modules, according to the same experimental results as in the case of discoloration [23], the PV module's series resistance Rs has been identified as the key parameter. Corrosion manifests as an increase of Rs. The following modelling cases are defined:
1. Case 0 (Cor 0): no corrosion.
2. Case 1 (Cor 1): bright colour corrosion on metal parts of PV module; Rs = 1.65 × Rs-(STC) (65% increase).
3. Case 2 (Cor 2): bright colour corrosion on metal parts and terminals of PV module; Rs = 2.2 × Rs-(STC) (120% increase).
4. Case 3 (Cor 3): dark colour corrosion on metal parts and terminals of PV module; Rs = 4.3 × Rs-(STC) (330% increase).

4. PV module degradation effects under variable irradiation and temperature condition

In real field conditions, PV systems operate under hourly irradiation and temperature variations. It is therefore important to determine long-term degradation effects under variable irradiation and temperature conditions. Using the long-term degradation modelling cases defined earlier, the PV module maximum power is observed for the ambient temperature range -5 °C to 35 °C and the irradiation range 200 W/m² to 1000 W/m². Ambient temperature was varied at a constant irradiation of 800 W/m². Similarly, irradiation was varied at a constant ambient temperature of 8.75 °C (PV cells' operating temperature 25 °C). The corresponding results are presented in Fig. 3. By analysing the graphs in Fig. 3, several observations can be made:
• Delamination and discoloration preserve an approximately linear correlation between the PV module's maximum power (Pm) and both ambient temperature (T) and irradiation (I), while corrosion introduces slightly nonlinear components.
• In the case of T variations, the Pm curve slopes remain approximately constant in the delamination and discoloration analysis. As a consequence, the differences between Pm for all modelling cases (Del0, Del1, Del2 and Dis0, Dis1, Dis2, Dis3) remain approximately constant.
• In the case of I variations, the Pm curve slopes change slightly in the delamination and discoloration analysis, which leads to an important conclusion: for higher irradiation values, the differences between Pm for all considered modelling cases (Del0, Del1, Del2 and Dis0, Dis1, Dis2, Dis3) are also higher.
• Regarding the corrosion effects, it can be seen from Fig. 3(c) that modelling case Cor3 differs significantly from the other cases in terms of the Pm curve slope under variable T. Under variable I, the Pm differences between the corrosion modelling cases are lower than for the corresponding cases of delamination and discoloration.
• Degradation losses related to delamination and discoloration maintain approximately equal values over the whole analysed T and I ranges, while corrosion losses increase nonlinearly with increasing T and I (Fig. 3(d)). It can be concluded that delamination and discoloration losses are approximately unaffected by variation of T and I.

Fig. 3 PV module maximum power and degradation losses under variable temperature and irradiation condition: (a) delamination; (b) discoloration; (c) corrosion; (d) power losses due to long-term degradation

5. PV string power reduction due to long-term degradation: study case

Determination of PV string power losses due to degradation is a complex task, because of the occurrence of the mismatch condition. The term “mismatch condition” refers to differences in the current-voltage (I-V) curves of individual PV modules in a PV string due to different degradation rates. In field conditions, during long exploitation periods, it is very common for PV modules to degrade differently.
In order to investigate the PV string's degradation and degradation mismatch power losses, a special study case, consisting of the adopted PV module degradation modelling cases, is defined in Table 2.

Table 2 PV string under long-term degradation study case

          10 years               15 years               20 years               25 years
Module    Del.   Disc.  Corr.    Del.   Disc.  Corr.    Del.   Disc.  Corr.    Del.   Disc.  Corr.
PV1       Del 0  Dis 1  Cor 1    Del 1  Dis 2  Cor 2    Del 2  Dis 2  Cor 2    Del 2  Dis 2  Cor 3
PV2       Del 0  Dis 0  Cor 1    Del 1  Dis 0  Cor 2    Del 1  Dis 1  Cor 2    Del 2  Dis 1  Cor 3
PV3       Del 0  Dis 1  Cor 0    Del 0  Dis 3  Cor 1    Del 1  Dis 3  Cor 2    Del 2  Dis 3  Cor 2
PV4       Del 0  Dis 0  Cor 0    Del 0  Dis 1  Cor 1    Del 1  Dis 2  Cor 2    Del 1  Dis 2  Cor 2
PV5       Del 0  Dis 0  Cor 0    Del 0  Dis 1  Cor 1    Del 0  Dis 2  Cor 2    Del 1  Dis 2  Cor 2
PV6       Del 0  Dis 0  Cor 0    Del 0  Dis 0  Cor 1    Del 0  Dis 1  Cor 2    Del 1  Dis 2  Cor 2
PV7       Del 0  Dis 0  Cor 0    Del 0  Dis 1  Cor 0    Del 0  Dis 3  Cor 1    Del 1  Dis 3  Cor 2
PV8       Del 0  Dis 0  Cor 0    Del 0  Dis 1  Cor 0    Del 0  Dis 3  Cor 1    Del 1  Dis 3  Cor 2
PV9       Del 0  Dis 0  Cor 0    Del 0  Dis 0  Cor 0    Del 0  Dis 1  Cor 1    Del 0  Dis 3  Cor 2
PV10      Del 0  Dis 0  Cor 0    Del 0  Dis 0  Cor 0    Del 0  Dis 1  Cor 1    Del 0  Dis 1  Cor 2
PV11      Del 0  Dis 0  Cor 0    Del 0  Dis 0  Cor 0    Del 0  Dis 1  Cor 0    Del 0  Dis 1  Cor 1
PV12      Del 0  Dis 0  Cor 0    Del 0  Dis 0  Cor 0    Del 0  Dis 0  Cor 0    Del 0  Dis 1  Cor 1

According to the data in Table 2, several key time points are defined during the 25-year exploitation period of the PV string. The long-term degradation modelling cases are assumed to take place after 10, 15, 20 and 25 years of exploitation. The highest combined degradation rate is set for the PV modules with the lowest indexes (1, 2, 3 …). It is assumed that the degradation rate is negligible in the first 10 years of exploitation.
In order to determine PV string degradation losses and mismatch losses separately, it is necessary to identify the total maximum power of the individual PV modules (12 PV modules operating separately), in addition to the maximum power of the entire PV string (12 PV modules operating in series connection). For the defined study case (Table 2), under constant ambient temperature T = 20 °C and irradiation I = 600 W/m², the maximum power points of the individual PV modules Pmpp-IM and of the PV string Pmpp-string have been determined and are presented in Table 3.

Table 3 Maximum power points of individual PV modules and PV string for the study case defined in Table 2 under constant ambient temperature and irradiation values (T = 20 °C and I = 600 W/m²)

Maximum power point                    10 years    15 years    20 years    25 years
PPV1-mpp [W]                           126.773     115.478     111.637     109.500
PPV2-mpp [W]                           133.563     126.533     119.793     113.653
PPV3-mpp [W]                           127.631     121.935     114.847     111.005
PPV4-mpp [W]                           134.518     126.918     115.478     115.478
PPV5-mpp [W]                           134.518     126.918     121.895     115.478
PPV6-mpp [W]                           134.518     133.702     126.189     115.478
PPV7-mpp [W]                           134.518     127.773     121.935     114.847
PPV8-mpp [W]                           134.518     127.773     121.935     114.847
PPV9-mpp [W]                           134.518     134.518     126.914     121.268
PPV10-mpp [W]                          134.518     134.518     126.914     126.189
PPV11-mpp [W]                          134.518     134.518     127.773     126.194
PPV12-mpp [W]                          134.518     134.518     134.518     126.194
Pmpp-IM = ΣPPVi-mpp (i = 1…12) [W]     1598.6      1561.1      1475.2      1418.5
Pmpp-string [W]                        1591.7      1511        1440.4      1392.7
Degradation losses [W]                 22.52       103.2       173.82      221.52
Degradation losses [%]                 1.4         6.4         10.8        13.7
Mismatch losses [W]                    6.9         50.1        34.8        25.8
Mismatch losses [%]                    0.43        3.1         2.16        1.6

Degradation losses: power losses due to long-term degradation, Pmpp-new string* - Pmpp-string.
Mismatch losses: mismatch losses due to long-term degradation, Pmpp-IM - Pmpp-string.
* New PV string maximum power: 1614.22 W

According to the results presented in Table 3, power losses due to long-term degradation increase from 1.4% to 13.7% over the 25-year exploitation period.
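The two loss rows of Table 3 follow directly from the string-level totals, as the short Python sketch below illustrates. The input powers are copied from Table 3; expressing both losses as a percentage of the new-string power is an inference from the tabulated percentages rather than a stated definition.

```python
P_NEW_STRING_W = 1614.22  # maximum power of the new (non-degraded) PV string

# Totals from Table 3, keyed by years of exploitation.
P_MPP_IM_W = {10: 1598.6, 15: 1561.1, 20: 1475.2, 25: 1418.5}      # sum of separate modules
P_MPP_STRING_W = {10: 1591.7, 15: 1511.0, 20: 1440.4, 25: 1392.7}  # series-connected string


def string_losses(years: int):
    """Return (degradation loss W, degradation loss %, mismatch loss W, mismatch loss %)."""
    degradation_w = P_NEW_STRING_W - P_MPP_STRING_W[years]
    mismatch_w = P_MPP_IM_W[years] - P_MPP_STRING_W[years]
    # Assumption: both percentages are referred to the new-string power.
    return (
        degradation_w,
        100.0 * degradation_w / P_NEW_STRING_W,
        mismatch_w,
        100.0 * mismatch_w / P_NEW_STRING_W,
    )


for years in (10, 15, 20, 25):
    print(years, [round(v, 2) for v in string_losses(years)])
```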
On the other hand, mismatch losses have their highest value after just 15 years of exploitation (3.1%), because the degradation rates of individual PV modules differ the most in that period. It is important to note that mismatch losses are very difficult to predict and depend on the particular study case. Their values could reach up to 50% of the power losses due to long-term degradation itself.

6. PV string annual energy production

Statistical prediction of energy production from a PV string is based on horizontal irradiation and ambient temperature measurements. The acquisition system provided measurements of horizontal irradiation and ambient temperature every 10 minutes between July 15th, 2013 and July 15th, 2014, at a location in Belgrade, Serbia, with WGS coordinates 44.8°; 20.47°; 120 m. The obtained irradiation and ambient temperature values were averaged over every three hours and then monthly averaged. Based on the procedure given in [24], horizontal irradiation can be divided into direct and diffuse components. In addition, the reflected component can be determined by using the corresponding reflection coefficient.

In order to determine the irradiation components on the PV string surface, position angles need to be defined. The assumed tilt and azimuth angles are σ = 30° and s = 0°, respectively. In the process of determining the ambient reflection coefficient, it is assumed that the household with the PV string on its roof is located on a grassy surface. The adopted reflection coefficient value is ρ = 0.15. Total irradiation on the surface of the PV string has been calculated using the following relation:

$$I_{pv} = I_{dir} + I_{dif} + I_{ref}, \qquad (1)$$

where Idir, Idif and Iref are the direct, diffuse and reflected irradiation components, respectively.
By using the calculated irradiation and the measured ambient temperature data, it is possible to determine the operating temperature of the PV string according to the following relation:
$$T_{PV} = T_{amb} + \frac{NOCT - 20}{0.8}\, I_{PV}, \tag{2}$$
where T_PV is the operating temperature of the PV string; T_amb is the ambient temperature; NOCT is the nominal operating temperature of the PV cell (47 °C for the considered PV cells); I_PV is the irradiation value on the surface of the PV string.
Based on the calculated and averaged I_PV and T_PV values, the PV string DC power values (P_DC) are obtained. The conventional relation for calculating the DC power of PV systems in field conditions, expanded with a degradation factor, is defined as follows:
$$P_{DC(field)} = P_{DC}(I_{PV}, T_{PV})\,(1 - \mu_n)\,(1 - \mu_z)\,(1 - \mu_d), \tag{3}$$
where μn is the efficiency reduction factor due to unequal I-V curves resulting from the manufacturing process of PV modules; μz is the efficiency reduction factor resulting from soiling of PV modules in the field; μd (DF) is the newly defined efficiency reduction factor arising from long-term degradation (degradation factor). The assumed values of the efficiency reduction factors in the analysed study case are μn = 0.03 (3%) and μz = 0.04 (4%). The degradation factor has been calculated on an hourly basis according to relation (4), initially averaged over every three hours and, in the next step, monthly averaged.
$$DF = \mu_d = \frac{P_{new\text{-}string} - P_{long\text{-}term\text{-}string}}{P_{new\text{-}string}}. \tag{4}$$
Monthly averaged PV string maximum power and degradation factor are presented in Fig. 4 for the analysed exploitation period of one year and the considered study case. Daily time intervals with irradiation values below 30 W/m² have been neglected in the analysis.
Fig. 4 Monthly averaged PV string maximum power and degradation factor during the considered year of exploitation in different hourly-based time intervals: (a) 8:30 h - 11:30 h; (b) 11:30 h - 14:30 h; (c) 14:30 h - 17:30 h
According to the results in Fig. 4, the following observations can be made:
- The degradation factor has higher values in the summer (up to 14%), with the exception of January in the time interval 11:30 h - 14:30 h.
- The highest values of the degradation factor occur in the period with maximum irradiation (11:30 h - 14:30 h).
- Monthly differences in the degradation factor are most pronounced after 25 years of PV string exploitation.
Prediction of annual energy production requires calculation of the PV string AC power by using the following relation:
$$P_{AC(field)} = P_{DC(field)}\, \mu_i, \tag{5}$$
where μi is the inverter efficiency. After the P_AC(field) values have been determined, annual energy production can be easily calculated. The purchase price of electricity produced from small-capacity PV systems installed on households in Serbia can be calculated by using relation (6):
$$\text{purchase price} = 0.01\,(20.941 - 9.383\,P)\ \ \text{EUR/kWh}, \tag{6}$$
where P is the installed power of the PV system in MW.
With an assumed inverter efficiency of μi = 97%, the PV string annual energy production has been predicted, together with the annual money income and the losses due to long-term degradation. The corresponding results are presented in Table 4.
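Relations (2) to (6) chain together into a short calculation. The following is a sketch under stated assumptions, not the author's implementation: function names are illustrative, the irradiation in relation (2) is assumed to be expressed in kW/m² (the usual convention for the NOCT formula, whose reference irradiation is 0.8 kW/m²), and the DC power passed to relation (3) is assumed to come from a separate module model.

```python
NOCT = 47.0   # deg C, nominal operating cell temperature quoted in the text
MU_N = 0.03   # reduction factor for unequal I-V curves (manufacturing)
MU_Z = 0.04   # reduction factor for soiling
MU_I = 0.97   # inverter efficiency


def pv_temperature(t_amb_c, i_pv_kw_m2):
    """Relation (2); irradiation assumed to be in kW/m^2."""
    return t_amb_c + (NOCT - 20.0) / 0.8 * i_pv_kw_m2


def degradation_factor(p_new_string, p_long_term_string):
    """Relation (4): relative power loss of the aged string."""
    return (p_new_string - p_long_term_string) / p_new_string


def dc_power_field(p_dc, mu_d, mu_n=MU_N, mu_z=MU_Z):
    """Relation (3): DC power corrected for field conditions."""
    return p_dc * (1.0 - mu_n) * (1.0 - mu_z) * (1.0 - mu_d)


def ac_power_field(p_dc_field, mu_i=MU_I):
    """Relation (5): AC power after the inverter."""
    return p_dc_field * mu_i


def purchase_price_eur_per_kwh(p_installed_mw):
    """Relation (6); installed power in MW."""
    return 0.01 * (20.941 - 9.383 * p_installed_mw)


# Example: conditions of Table 3 and the 25-year string power.
t_pv = pv_temperature(20.0, 0.6)
df = degradation_factor(1614.22, 1392.7)
p_ac = ac_power_field(dc_power_field(1614.22, df))
print(t_pv, df, p_ac)
```

Annual energy then follows by summing the AC power over the averaged time intervals of the year, and the income by multiplying the energy by the price from relation (6).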
Table 4 PV string annual energy production, money income and losses due to long-term degradation
PV string annual energy production and income | Period of PV string exploitation: new string | 10 years | 15 years | 20 years | 25 years
Annual energy production [kWh] | 3422 | 3374 | 3204 | 3058 | 2962
Annual money income [EUR] | 715.6 | 705.6 | 670.1 | 639.5 | 619.4
Loss of money due to degradation [EUR] | 0 | 10 | 45.5 | 76.1 | 96.2
With the assumption that the loss of money due to long-term degradation is approximately equal within consecutive time periods of 5 years (e.g. the loss of money in the period 7.5 - 12.5 years is equal to 5 × the loss of money in the 10th year of exploitation), it is possible to roughly estimate the total loss of money in the period 0 - 27.5 years (very close to the lifetime of the PV string): 5 × (10 + 45.5 + 76.1 + 96.2) = 1139 EUR. It can be concluded that the predicted amount of money lost due to long-term degradation is enough to buy several new PV modules during the considered exploitation period. Even a rough estimate of the degradation factor could be of significant interest for economic predictions, especially for larger installed PV capacities (> 20 kW), where money income could be reduced by more than 10 000 EUR over the lifetime exploitation period due to long-term degradation.
7. Conclusions
Modelling of PV system degradation in terms of statistical prediction of annual energy production proved to be a very complex task, mainly because of the many uncertainties related to the long exploitation period and field conditions. Several useful guidelines and study case results have been presented in this article. The most common long-term degradation types have been modelled by using approximate relations, adopted on the basis of experimental observations. It has been shown that power losses of individual PV modules due to delamination and discoloration remain approximately constant over a wide range of irradiation and ambient temperature values, while power losses due to corrosion proved to be temperature-dependent.
Mismatch power losses, caused by different degradation rates of the individual PV modules in a PV string, have been identified as a potentially significant part of total degradation losses. The methodology for prediction of annual energy production from a PV string, based on field measurements of horizontal irradiation and ambient temperature, has been modified in order to include long-term degradation effects. The degradation factor has been introduced as a useful tool for evaluating power losses due to long-term degradation. Analysis of the study case, a PV string consisting of 12 PV modules located in Belgrade, showed that money losses during the lifetime exploitation period caused by long-term degradation could exceed the price of several new PV modules.
Acknowledgement: The author would like to thank Professors Jovan Mikulović and Željko Đurišić for their advice and support during the research period. Special acknowledgement belongs to my best friend Slobodan Elez, who contributed useful results related to his master thesis.
References
[1] M. Forcan, "Prediction of energy production from string PV system under mismatch condition", In Proceedings of the 2nd Virtual International Conference on Science, Technology and Management in Energy, eNergetics, 2016, pp. 3-9.
[2] M. Jošt and M. Topič, "Efficiency limits in photovoltaics - case of single junction solar cells", Facta Universitatis, Series: Electronics and Energetics, vol. 27, no 4, pp. 631-638, December 2014.
[3] Y. Georgiev, G. Angelov, T. Takov, I. Zhivkov and M. Hristov, "The photovoltaic behavior of vacuum deposited diphenyl-diketo-pyrrolopyrrole polymer", Facta Universitatis, Series: Electronics and Energetics, vol. 27, no 4, pp. 639-648, December 2014.
[4] O. Perpinan, E. Lorenzo and M.A. Castro, "On the calculation of energy produced by PV grid-connected system", Progress in Photovoltaics: Research and Applications, vol. 15, issue 3, pp. 265-274, 2007.
[5] M. Brabec, E. Pelikán, P. Krč, K. Eben and P. Musilek, "Statistical modeling of energy production by photovoltaic farms", In Proceedings of the IEEE Elect. Power Energy Conf. (EPEC), Aug. 2010, pp. 1-6.
[6] O. Perpinan, "Statistical analysis of performance and simulation of two axis tracking PV system", Solar Energy, vol. 83, issue 11, pp. 2074-2085, Nov. 2009.
[7] S. Jiang, K. Wang, H. Zhang, Y. Ding and Q. Yu, "Encapsulation of PV modules using ethylene vinyl acetate copolymer as the encapsulant", Macromol. React. Eng., 9, pp. 522-529, 2015.
[8] T. Shioda, "Delamination failures in long-term field-aged PV modules from point of view of encapsulant", Lecture presented at 2013 NREL PV Module Reliability Workshop, Denver.
[9] D. C. Jordan, J. H. Wohlgemuth and S. R. Kurtz, "Technology and climate trends in PV module degradation", In Proceedings of the 27th European Photovoltaic Solar Energy Conference and Exhibition, 2012, pp. 3118-3124.
[10] M. Kempe, "Modelling of rates of moisture ingress into photovoltaic modules", Solar Energy Materials & Solar Cells, vol. 90, issue 16, pp. 2720-2738, 2006.
[11] M. Kempe, "Ultraviolet test and evaluation methods for encapsulants of photovoltaic modules", Solar Energy Materials & Solar Cells, vol. 94, issue 2, pp. 246-253, 2010.
[12] A. Ndiaye, A. Charki, A. Kobi, C.M.F. Kébé, P.A. Ndiaye and V. Sambou, "Degradations of silicon photovoltaic modules: a literature review", Solar Energy, vol. 96, pp. 140-151, 2013.
[13] D. Sera, R. Teodorescu and P. Rodriguez, "PV panel model based on datasheet values", In Proceedings of the IEEE International Symposium on Industrial Electronics, Vigo, Spain, 2007, pp. 2392-2396.
[14] M. Forcan, Ž. Đurišić and J. Mikulović, "An algorithm for elimination of partial shading effect based on a theory of reference PV string", Solar Energy, vol. 132, pp. 51-63, 2016.
[15] M. Forcan, J. Tuševljak, S. Lubura and M. Šoja, "Analyzing and modeling the power optimizer for boosting efficiency of PV panel", IX Symposium Industrial Electronics INDEL, Banja Luka, November 2012, pp. 193-198.
[16] M. Forcan and Ž. Đurišić, "The analysis of PV string efficiency under mismatch conditions", In 4th International Symposium on Environment Friendly Energies and Applications EFEA, 2016, pp. 1-6.
[17] C. Schwingshackl, M. Petitta, J.E. Wagner, G. Belluardo, D. Moser, M. Castelli, M. Zebisch and A. Tetzlaff, "Wind effect on PV module temperature: analysis of different techniques for an accurate estimation", Energy Procedia, vol. 40, pp. 77-86, 2013.
[18] S. Bensalem and M. Chegaar, "Thermal behavior of parasitic resistances of polycrystalline silicon solar cells", Revue des Energies Renouvelables, vol. 15, pp. 171-176, 2013.
[19] M.L. Priyanka and S.N. Singh, "A new method of determination of series and shunt resistances of silicon solar cells", Solar Energy Materials & Solar Cells, vol. 91, pp. 137-142, Jan. 2007.
[20] D. Macdonald and A. Cuevas, "Reduced fill factors in multicrystalline silicon solar cells due to injection-level dependent bulk recombination lifetimes", Progress in Photovoltaics: Research and Applications, vol. 8, pp. 363-375, 2000.
[21] MATLAB/Simulink. MathWorks, Inc. Natick. Massachusetts. United States.
[22] PV module data sheet, available online at http://www.suntellite.cn/en/product/suntellite-module-polycrystalline-20.html
[23] R. Dubey, S. Chattopadhyay, V. Kuthanazhi, J. J. John, B. M. Arora, A. Kottantharayil, K. L. Narasimhan, C. S. Solanki, V. Kuber, J. Vasi, A. Kumar and O. S. Sastry, "All India survey of photovoltaic module degradation 2013", National Centre for Photovoltaic Research and Education, Mumbai, India, 2014, available online at http://www.ncpre.iitb.ac.in/pages/publications_reports.html
[24] G. M. Masters, Renewable and Efficient Electric Power Systems. Hoboken, NJ: John Wiley & Sons, 2004, chapters 7-8.
Facta Universitatis, Series: Electronics and Energetics, Vol. 33, No 3, September 2020, pp. 351-378, https://doi.org/10.2298/FUEE2003351M
© 2020 by University of Niš, Serbia | Creative Commons License: CC BY-NC-ND
Graphene-reinforced polymeric nanocomposites in computer and electronics industries
Hossein Kardanmoghaddam 1, Mohamadreza Maraki 1, Amir Rajaei 2
1 Faculty member of Birjand University of Technology, Birjand, Iran
2 Faculty member of Computer Engineering, Velayat University, Iranshahr, Iran
Abstract. Graphene is the newest member of the multidimensional graphitic carbon family. Graphene is a two-dimensional atomic crystal formed by the arrangement of carbon atoms in a hexagonal network. It is the most rigid and thinnest material ever discovered and, owing to its unique characteristics, has a wide range of uses. It is expected that this material will create a revolution in the electronics industry. Graphene is a very good electrical conductor, as the mobility of charge carriers in it is high; additionally, because of its high surface energy and free π electrons, graphene can be used in manufacturing many electronic devices. In this paper, the applications of polymer nanocomposites reinforced with graphene nanoparticles in the computer and electronics industry are investigated. These nanoparticles have received much attention from researchers and manufacturers, because graphene has unique thermal, electrical and mechanical properties. Its use as a filler in very small quantities substantially enhances the properties of nanocomposites. There are various methods for producing graphene-reinforced polymer nanocomposites.
These methods affect the degree of graphene dispersion within the polymer substrate and the final properties of the composite. The applications and properties of graphene-reinforced polymer nanocomposites are discussed along with examples of results published in the literature. To better understand such materials, the applications of these nanocomposites have been investigated in a variety of fields, including batteries, capacitors, sensors, solar cells, etc., and the barriers to the growth and development of their application, as suggested by researchers, are discussed. As the use of these nanocomposites is developing and many researchers are interested in working on them, the need to study these materials is increasingly felt.
Key words: graphene, nanocomposite, electrical conductance, electromagnetic waves, capacitor, solar cells, light absorption, OLED
Received September 26, 2019; received in revised form May 7, 2020
Corresponding author: Hossein Kardanmoghaddam, Faculty member of Birjand University of Technology, Birjand, Iran (E-mail: h.kardanmoghaddam@birjandut.ac.ir)
1. Introduction
Up to 1980, only three kinds of three-dimensional carbon allotropes were recognized: diamond, graphite and amorphous carbon. In diamond, the hardest natural material, each carbon atom binds to four other carbon atoms, and the carbon atoms in this structure are sp3 hybridized. In graphite, carbon atoms are sp2 hybridized and the carbon hexagons form planes, each of which is bound to the underlying planes through weak van der Waals bonds. These single planes of graphite are called graphene and have attracted considerable attention. The special features of graphene have made it usable for electronic applications. Studies on graphene have led to the development of electronic components with lower volume and higher speeds.
Graphene has remarkable mechanical properties, which make it an excellent material for reinforcing metal matrix composites. Due to its unique optical and thermal properties, graphene is a very good filler for multilayer composites, especially for metal matrix composites. Additionally, it is considered for its viability and outstanding mechanical properties. Research on graphene and its nanocomposites is developing at many universities and research and development centers [1][2]. There are many motivations for improving graphene-metal composites. The reinforcing mechanism of graphene is related to its unique mechanical and structural characteristics and to good bonding between graphene and the matrix. There are many challenges in this area. One of them is the dispersion of graphene in the metal matrix of the composite by conventional metallurgical methods and processes, which is hindered by the great difference in density between graphene nanoplatelets and the metal matrix. A larger contact surface in comparison with carbon nanotubes, reactions at the matrix-reinforcement interface in the case of more reactive metals, and poor dispersion of graphene in the matrix are among the other problems in this area [2][3]. There is widespread research on the production and application of polymer nanocomposites aimed at improving polymer properties and increasing their applicability in different areas [4]. In this context, carbon-based nanoparticles such as carbon nanotubes [5-9] and graphene [10-16] have attained a special position in making polymer nanocomposites. It should be noted that these nanoparticles differ from each other in features such as mechanical reinforcement, electrical conductance and heat stability [16-18].
Despite great advancements in using carbon nanotubes as a reinforcement phase, issues such as the tendency of nanotubes to agglomerate during processing, limited access to high-quality carbon nanotubes in large amounts and their high prices have restricted the manufacturing of polymer nanocomposites reinforced with carbon nanotubes. Graphene nanoparticles, because of their mechanical and electrical features and the natural abundance of their precursor material, graphite, are therefore considered a good alternative to carbon nanotubes for manufacturing polymer nanocomposites [19]. Graphene, and in particular chemically modified graphene (CMG), is a suitable candidate for different applications such as energy storage materials, paper-like materials, polymer composites, liquid crystal devices and mechanical resonators [1]. On the other hand, the unique properties of graphene, including its electrical, thermal and electrochemical properties and high specific surface area, have increased the usability of this material in many applications such as sensors, catalysts, energy suppliers and various composites [20-26]. Fabbri et al. [27] produced poly(butylene terephthalate) nanocomposites reinforced with graphene by the in-situ polymerization method.
They found that increasing the amount of graphene decreases the molecular mass of the obtained polymer, but causes no significant change in the heat resistance of the nanocomposite. Shen et al. [28] synthesized polycarbonate nanocomposites reinforced with modified graphene through melt blending. According to their results, process conditions significantly affect the final properties of the resulting nanocomposite and the degree of polycarbonate grafting on the surface of the graphene sheets. Dong et al. [29] prepared graphene-reinforced polyimide fibers by in-situ polymerization. Fibers containing 0.8% graphene showed a tensile strength 1.6 times greater than that of pure fibers, with a 200% increase in Young's modulus. Wang et al. [30] studied the effect of adding graphene to glass fiber-reinforced epoxy resins on the mechanical and fire resistance (flammability) properties of the resulting nanocomposite. They found an increase in both the mechanical and the fire resistance properties of the nanocomposite when graphene is added to the polymeric matrix. Research by Rafiee et al. [31][32] and Verdejo et al. [33] on polymer nanocomposites reinforced with modified graphene and carbon nanotubes shows that the properties of nanocomposites reinforced with modified graphene are improved more than those reinforced with carbon nanotubes [31][32][34]. The researchers attributed the results to the higher contact surface area and the large length-to-width ratio of graphene plates compared to carbon nanotubes.
The following sections of this paper cover: the structure of graphene (Section 2), graphene-reinforced metal matrix composites (Section 3), the use of graphene in lithium batteries (Section 4), the effect of graphene on electrical conductivity (Section 5), increasing the cooling power of electronic components by combining different materials with graphene (Section 6), graphene in sensors (Section 7), graphene for protection against electromagnetic waves (Section 8), graphene in the construction of capacitors (Section 9), graphene in touch pads (Section 10), graphene in lamps and optical LEDs (Section 11), graphene in the construction of microphones (Section 12), graphene in the manufacture of OLED displays (Section 13), graphene in the manufacture of ink (Section 14), graphene for reinforcing electrical circuits against moisture (Section 15), graphene in industrial applications of IoT (Section 16), graphene in transistors (Section 17), the development of nanoelectromechanical switches using graphene (Section 18) and graphene in the manufacture of cameras (Section 19).
2. Graphene
Graphene is a two-dimensional (2D) sheet of carbon atoms bound in a hexagonal, honeycomb-like configuration, in which the atoms are sp2 hybridized. This monolayer honeycomb structure is shown in Figure 1. Graphene has attracted the attention of scientists because of its outstanding mechanical, electrical, thermal and optical properties, its high surface area and the ability to control all of these features through chemical modification [1][2][35]. The thinnest and strongest material known so far is a two-dimensional sheet of carbon atoms called graphene [36]. Graphene is a nanoparticle with a two-dimensional planar structure, and its thickness is about one carbon atom. In these planes, carbon atoms are bound in a hexagonal network.
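The hexagonal network fixes the area occupied by each carbon atom, which is enough to estimate the theoretical specific surface area of a single sheet. The sketch below is a worked check, not part of the cited studies: it assumes a carbon-carbon bond length of 0.142 nm and that both faces of the sheet are accessible, neither of which is stated in the text. The result comes out at roughly 2.6 × 10³ m²/g.

```python
import math

BOND_LENGTH_M = 0.142e-9        # m, C-C distance in graphene (assumed value)
MOLAR_MASS_C_G = 12.011         # g/mol
AVOGADRO = 6.02214076e23        # 1/mol


def hexagon_area(side):
    """Area of a regular hexagon with the given side length."""
    return 1.5 * math.sqrt(3.0) * side * side


def specific_surface_area(bond_length=BOND_LENGTH_M):
    """Theoretical specific surface area of a graphene monolayer in m^2/g.

    Each hexagon has six corner atoms, each shared by three hexagons,
    so one hexagon holds two atoms. Both faces of the sheet are counted.
    """
    atoms_per_hexagon = 2
    faces = 2
    area_per_atom = faces * hexagon_area(bond_length) / atoms_per_hexagon
    mass_per_atom_g = MOLAR_MASS_C_G / AVOGADRO
    return area_per_atom / mass_per_atom_g


print(f"{specific_surface_area():.0f} m^2/g")
```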
The material structure is flawless, so graphene has desirable physical properties such as electrical conductance, heat transmittance, high mechanical strength, 98% transparency and very high specific surface area [37][38][39].
Fig. 1 Graphene monolayer and honeycomb structure [2]
According to [1], the features of graphene are: high Young's modulus (about 1100 GPa), high breaking strength (125 GPa), good heat conductance (5000 W/mK), high mobility of charge carriers, in other words high electrical conductance (200 000 cm²/Vs), high specific surface area (calculated value: 2630 m²/g) and remarkable transport phenomena such as the quantum Hall effect. Some of the most important physical and mechanical properties of graphene are presented in Table 1.
Table 1 Some of the most important physical and mechanical properties of graphene [2]
Property | Graphene
Electron mobility | 1500 cm² V⁻¹ s⁻¹
Resistivity | 10⁻⁶ Ω·cm
Thermal conductivity | 5.3 × 10³ W m⁻¹ K⁻¹
Transmittance | >95% for 2 nm thick film; >70% for 10 nm thick film
Elastic modulus | 0.5-1 TPa
Coefficient of thermal expansion | -6 × 10⁻⁴ /K
Specific surface area | 2630 m² g⁻¹
Tensile strength | 130 GPa
The fewer the graphene layers, the better its properties [40][41]. Methods by which few-layer graphene is produced on a large scale are therefore more important. Electrochemical exfoliation is among the top-down graphene synthesis methods with a higher production rate and lower costs than other methods [42]. Also, with many solvents, many kinds of graphite can be processed at room temperature [43]. A variety of methods are used for graphene synthesis, including micromechanical exfoliation, epitaxial growth, chemical vapor deposition (CVD) and chemical methods [1][2].
3. Graphene reinforced metal matrix composites
Proper dispersion of nanoparticle reinforcements within the polymer matrix is an essential parameter for achieving enhanced properties compared to the matrix polymer.
If the graphene is properly dispersed within the polymer phase and there are strong interactions at the interface between the graphene and the polymer, the overall properties of the polymer matrix will be significantly improved. Much effort has been invested to achieve a homogeneous system with good dispersion of graphene sheets in the polymer matrix through covalent or noncovalent bonds on the graphene surface [33]. There is a great deal of research on graphene-reinforced metal matrix composites, for example graphene-platinum, graphene-gold, graphene-cobalt, silicon-graphene, aluminum powder-graphene, magnesium-graphene, copper-graphene composite foil and nickel-graphene composites [2]. Some graphene-reinforced metal matrix composites, together with their properties and applications, are listed in Table 2.
Table 2 An illustration of metal-graphene composites [2]
Composition | Properties and applications
Pt-graphene | Super capacitor, fuel cell applications; electrochemically active surface area, catalyst carrier in electrocatalysis and fuel cell applications
Al/Pd/Pt | Acts as catalyst for methanol oxidation, methanol fuel cell applications
Au-graphene | DNA gets adsorbed faster than on the Au surface alone, biosensors, biodevices and DNA sequencing applications; voltammograms of electrolytic reduction of oxygen and glucose oxidation show more for Au-graphene than for Au alone, fuel cell and bioelectroanalytical chemistry applications; apparent electrode area, environmental monitoring, detection of mercury; electroactive surface area, electrochemical detection of DNA specific sequence applications
Co-graphene | Anode material for Li-ion battery applications
Si-graphene | Anode material for Li-ion battery applications
Al powder-graphene | Graphene as reinforcement, strengthening of composite applications; decreased strength and hardness; lower failure strain and higher Vickers hardness
Mg-graphene based composite | Production of ultra-high performance metal matrix composite
Cu-graphene composite foil | Higher electrical conductivity and hardness compared to copper alone
Mg-1%Al-1%Sn reinforced graphene | Superior nano-filler adhesion and increased tensile strength
Au-graphene-HRP-CS | H2O2 biosensor applications
4. Use of graphene in lithium batteries
Lithium-ion batteries are used very widely nowadays. At the moment, all new laptops use lithium-ion batteries. The reason for the popularity of these batteries is their ability to deliver more power than other batteries. Lithium-ion battery electrodes are made of lightweight materials such as lithium and carbon. Lithium is also a very reactive metal, which means that it can store a large amount of energy in its atomic bonds. Because of this, lithium-ion batteries have a high energy storage density. Lithium-ion batteries are at present the standard batteries for computers and laptops. They are light and have a long cycle life. These batteries are not restricted by the memory effect: they do not need to be completely discharged before recharging and can be charged at any time, in any order. Finally, these batteries do not become hot in cases of overcharge, and the possibility of their explosion is very low. They are also thinner and smaller than any other batteries used in laptops, which makes them ideal for use in the very lightweight and small laptops produced nowadays. When these batteries are charged, lithium ions move from the electrolyte and the positive pole to the negative pole and bind with carbon. Normally, lithium-ion batteries can be charged between 950 and 1200 times. There has been a great deal of research on adding graphene as a second phase to the anode of lithium batteries used in laptops. Silicon, because of its high nominal capacity and low discharge potential, is a suitable material for lithium batteries.
However, its volume change during charge and subsequent discharge lowers its capacity. By adding graphene to silicon, these problems can be mitigated owing to the high conductance, chemical stability and good mechanical properties of graphene. Adding graphene to silicon therefore improves its stability as an anode material for lithium batteries [3]. Chou et al. reported about 15% higher output for graphene-silicon composites. Although the initial charge capacity of the composite (2185 mAh/g) is lower than that of silicon (3026 mAh/g), the composite preserved 54% of its initial capacity after 30 cycles, while silicon preserved only 11%. Graphene-silicon composites thus have a higher capacity than silicon or graphene alone at higher cycle numbers [3].
Fig. 2 Cycling stability of graphene, nano-sized silicon and silicon-graphene composite electrodes, and the calculated contribution of pure silicon [3].
Li et al. [44] prepared silicon-graphene foams by silicon film deposition on graphene foam. They used the highly flexible nanocomposites as an anode for the manufacture of lithium batteries. Both experimentally and theoretically (using density functional theory (DFT)), Kumar et al. [45] found that the use of organic species such as porphyrins as pillars between graphene oxide sheets is a promising method for the development of highly flexible, stable anode electrodes in sodium-ion batteries. The cell constructed by Kumar et al. showed a capacity of 200 mAh/g at a current density of 100 mA/g. The electrode showed an insignificant capacity reduction even after 700 charging/discharging cycles. The specific capacity of the cell remained stable after it had been left for 1 month. Both experiments and DFT calculations revealed that the higher performance (stability and capacity) of the anode prepared from the porphyrin-graphene oxide framework could be related to an increase in the spacing between graphene layers.
5. Electricity current conductance
Any material which conducts electricity well is called a conductor. The main reason for the conductance of some materials is that their electrons can easily be released from the atom and move. Electrical conductance plays an important role in the computer industry and in data transfer in electronic and computer systems. Adding nanomaterials to conductors increases their conductance. According to the literature, carbon nanotubes show an electrical percolation threshold at low concentrations. Graphene or modified graphene, in the same amounts as carbon nanotubes or less, is also capable of forming a conductive network [34]. In a graphene-epoxy resin nanocomposite, electrical conductance increased considerably (12 times) [46]. At a specific nanoparticle content, called the percolation threshold, the nanoparticles are able to form a network structure. This causes a sudden increase in the electrical conductance of the nanocomposite [47]. The inherent conductance and the length-to-width ratio of carbon-based filler nanoparticles make them suitable for reaching this percolation threshold at lower amounts of the filler phase. Research results have indicated that flawless graphene planes show signs of ballistic transport [48]. Although the electrical conductance of graphene obtained by chemical modification methods is not as good as that of flawless graphene, it is still a suitable candidate for producing electrically conductive nanocomposites. The first and most commonly used method for the chemical reduction and exfoliation of graphene oxide is dispersion in colloidal form in hydrazine hydrate [49][50]. The chemically modified graphene produced by this method contains carbonyl, epoxy and carboxylic acid groups. The conductance of a powder sample was measured at about 200±2400 S/m, which is comparable with 20±2500 S/m for graphene oxide [49].
graphene oxide is reduced and thermally exfoliated through continuous heating at high temperatures [50]. the resulting chemically modified graphene contains carbonyl, epoxy and carboxylic acid functional groups, together with structural defects and surface shrinkage. despite these surface defects, a bulk conductivity of 1000 to 2300 s/m has been measured for modified graphene. kuang et al. [51] used an electrodeposition technique to prepare nickel nanocomposite films. according to the hardness and young's modulus measured for nickel-graphene composites, the mechanical properties of the composite were significantly improved in comparison with neat electrodeposited nickel. while the nickel-graphene composite showed a hardness and young's modulus of 6.85 and 252.76 gpa, respectively, the corresponding values for pure nickel were only 1.81 and 166.70 gpa. the electrical conductivity of the nickel-graphene composite also rose by 15% compared with pure nickel, as shown in figure 3.

fig. 3 comparison of electrical conductance of pure nickel and graphene-nickel composite [51].

tchernook et al. [52] produced nanocomposites of high-density polyethylene nanocrystals and graphene, with the polyethylene obtained by ethylene insertion polymerization in water. dispersing graphene in the aqueous dispersion of polyethylene single crystals increased the electrical conductance. the high electrical conductance and low percolation threshold may be related to the composite microstructure: the small polyethylene nanocrystals allow a homogeneous distribution of graphene in the polymer matrix. kim et al. [53] produced nanocomposites of lldpe and graphene by the solution method. in this research, the graphene nanoparticles were coated with paraffin. the results indicated that adding paraffin lowers the electrical percolation threshold.
measurements on the above-mentioned composites revealed that samples prepared by solution blending have a lower electrical conductivity than nanocomposites synthesized by other methods. in a similar study, jiang et al. [54] produced hdpe/graphene nanocomposites and hdpe/multi-walled carbon nanotube nanocomposites (hdpe/mwcnt). both types of nanoparticles were coated with paraffin. the results indicated that nanocomposites made with multi-walled carbon nanotubes reach the percolation threshold at lower filler contents than those made with graphene. in another study, mohamadzadeh et al. [55] synthesized polyaniline-graphene oxide by polymerizing aniline monomer in the presence of graphene oxide sheets, which acted as a mild oxidizing agent. no external oxidizing agent was used in this method, and the resulting composites showed high electrical conductance and crystallinity.

6. increasing the cooling power of electronic components by combining different materials with graphene

cooling electronic parts is an ever-present problem for users, who have always tried to cool their systems in order to preserve output and efficiency in the long term. although attempts have been made to use lower voltages and frequencies in computer parts, components such as the cpu, hard disk and graphics card in particular still produce a great deal of heat during operation. the heat generated in electronic components must be removed so that the components operate at a safe temperature and the system does not suffer data loss or crashes. new portable computer and telecommunication systems (laptops) use small heat pipes for removing heat, because advances in computing mean that higher capabilities come with more heat production. heat pipes are widely used in cooling computers. these pipes are hollow and contain a heat transfer fluid.
when the fluid is vaporized in the warm part of the pipe, it transfers heat to the cool part, where it condenses and returns to the warm part of the pipe. research on increasing the thermal conductivity of materials by manufacturing polymer-graphene nanocomposites has grown rapidly. most studies have focused on the effect of modified graphene on the thermal properties of polymeric nanocomposites, such as thermal stability, glass transition temperature (tg), melting point (tm) and crystallinity [33]. fang et al. reported that the thermal conductivity of polystyrene film filled with 2 wt% of polystyrene-grafted graphene increased from 0.158 w/m.k to 0.413 w/m.k [56]. noorunnisa khanam et al. [57] prepared lldpe/graphene nanocomposites by melt blending and studied the effect of nanoparticle concentration and extruder rotor speed on the thermal stability of the resulting nanocomposites. in this study, rotor speeds of 50, 100 and 150 rpm were used for blending the polymer and the nanoparticles. the results indicated that in nanocomposites prepared at 150 rpm the degradation temperature increases because of better dispersion of the nanoparticles in the polymer matrix. graphene nanoparticles also act as a thermal barrier and improve heat resistance. kuila et al. [58] prepared nanocomposites of lldpe and dodecylamine-modified graphene (da-g) by the solution method and examined their thermal properties. the results of thermogravimetric analysis (tga) indicated that graphene increases the thermal stability of the polymer. kim et al. [59] prepared graphene/lldpe nanocomposites and studied the effect of graphene nanoparticles on crystallization. the results indicated that with increasing graphene content the crystallization temperature does not change, but the crystallization rate decreases.
in other words, the physical presence of graphene nanosheets hinders the movement of polymer chains, so the crystallization rate decreases. park et al. [60] prepared fluids with high thermal conductivity by combining graphene sheets with a pure fluid such as water. these researchers reported that fluids containing oxidized graphene have better thermal properties than fluids containing graphene. choi et al. [61] prepared poly(methyl methacrylate)-graphene nanocomposites with a very low graphene content for removing the heat produced in electronic equipment. they observed that the thermal conductivity of these nanocomposites is 3 times that of the pure samples. these nanocomposites are also lightweight and highly transparent. boron nitride (bn) has been combined with graphene to construct a device capable of effective heat removal and cooling of electronic devices [62]. to this end, graphene is deposited on boron nitride, and a graphene-based transistor is built on the boron nitride substrate. this system is reported to cool electronic devices 10 times more effectively than conventional methods. the mechanism takes advantage of the 2d nature of graphene and boron nitride to create a thermal bridge with the substrate. for this purpose, a boron nitride crystal with a thickness of several tens of nanometers was used. this layer is mounted on a gold surface, and the layer containing the graphene-based transistor is then mounted on top. the heat flow in the boron nitride crystal was measured by raman spectroscopy. the cooling mechanism was explained by the dielectric anisotropy of the boron nitride layer. owing to this anisotropy, the insulator supports a mode called hyperbolic polariton (hp), in which heat flows through the material in regions considered forbidden zones in other insulators. thus, this system removes heat more effectively than other methods.
this mode opens a real thermal bridge between the graphene and the back electrode, leading to heat removal with 10 times higher efficiency. the efficiency of the transistor increases by 10 times when it enters the zener-klein regime. the device makes use of the hyperbolic polariton mode to transfer heat to the substrate without any damage to the graphene lattice.

7. graphene containing sensors

moisture is the water vapor present in the air. water vapor is completely transparent and cannot be seen with the naked eye, but high moisture causes many problems, especially short circuits in electronic machines, and damages computer parts. moisture settles as water drops on objects. air can hold a large amount of water vapor, and the warmer the air, the more moisture it can hold. sensors can be made of materials with a wide range of electrical conductivity, from polymeric composites with conductivities close to the percolation threshold to polymeric nanocomposites, which are used for the manufacture of gas, ph, pressure or temperature sensors [44][63][64][65]. wang et al. [66] prepared sensors responsive to ph fluctuations and heat using graphene oxide and poly(methyl methacrylate) derivatives. they observed that the sensitivity of these sensors is related to the incorporated graphene oxide. qin et al. [67] prepared membranes for separating gases using graphene. they found that these membranes have a high capability for separating hydrogen and methane and can be used to address environmental problems efficiently. li et al. [68] prepared a new type of moisture sensor by combining graphene with polypyrrole at different graphene contents. their results indicated that sensors with 10 wt% graphene are more sensitive than the other sensors, with a response time of 15 to 20 s, and that this sensor can be used to protect electronic parts against moisture.
one of the substances in industrial and chemical environments that damages electronic and computer parts is ammonia. the authors of [69] prepared a sensor for detecting ammonia using chemically reduced graphene oxide. they observed that this kind of sensor is highly sensitive to ammonia at very low concentrations and can be used in industrial centers with electronic servers and equipment to protect this equipment.

8. protection against electromagnetic interference (emi)

electromagnetic interference (emi) means disruption of, or a decrease in, the efficient performance of equipment and tools caused by electromagnetic emissions from an unwanted source in the same frequency range as the working frequency. the purpose of protection against electromagnetic waves is to attain a certain level of attenuation of these waves. this is achieved through reflection and absorption of the waves by the protective material. in general, the behaviour of a material in electromagnetic fields is determined by the free movement of its electrons and by atomic motion in the magnetic field. the main mechanism of electromagnetic protection is the generation of a magnetic field opposing the incident magnetic field, which results in energy loss in the protective region and weakening of the incoming waves (figure 4) [70]. with nanoscale materials, this protection is more effective than with other protective materials because of the high conductance of nanocomposites.

fig. 4 electromagnetic protection mechanism (emi). waves crossing the protective cover lose much of their strength when exiting [70].

the electromagnetic protection effectiveness of a composite material depends mainly on the inherent conductance of the protective material and on its electrical permittivity.
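the attenuation level mentioned above is usually quoted as a shielding effectiveness in decibels, defined from the ratio of incident to transmitted power. the sketch below assumes this standard power-ratio definition and converts between decibels and the transmitted power fraction. the function names and the example values are illustrative only and do not come from the cited studies.

```python
import math


def shielding_effectiveness_db(p_incident, p_transmitted):
    """Shielding effectiveness in dB from incident and transmitted power."""
    return 10.0 * math.log10(p_incident / p_transmitted)


def transmitted_fraction(se_db):
    """Fraction of the incident power that still passes a shield of se_db decibels."""
    return 10.0 ** (-se_db / 10.0)


# Illustrative values: each additional 10 dB cuts the transmitted power tenfold.
for se in (10.0, 20.0, 30.0):
    blocked = 100.0 * (1.0 - transmitted_fraction(se))
    print(f"{se:.0f} dB -> {blocked:.1f}% of the incident power is blocked")
```

on this scale, a shield of about 20 db already blocks roughly 99% of the incident power.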
among these, nanomaterials such as carbon nanofibers and graphite layers have attracted much attention for use in composites because of their unique thermal, electrical, mechanical and physical features. the small diameter, high aspect ratio, high conductance and high mechanical stability of carbon nanotubes make them a great option for electromagnetic interference protection at low weight percentages and with high performance. graphene is also a good option for emi shielding because of its unique features, and it is effective in improving the shielding properties of composites even at lower weight percentages. figure 5 shows that the emi shielding effectiveness increases with nanomaterial content and that this trend holds over the whole frequency range. a shielding effectiveness of 21 decibels is obtained for 15 weight percent of graphene (8.8 vol%) [71]. song et al. [72] prepared composite films containing ethylene vinyl acetate and graphene and studied their suitability for electromagnetic interference shielding.

fig. 5 emi shielding effectiveness for different weight percentages of graphene in graphene-epoxy nanocomposite [71].

9. graphene-based supercapacitors

supercapacitors, which have a higher power density and cycle life than batteries, are used to store electrical charge [73][74]. however, their widespread use has been limited by their energy density [75]. accordingly, most studies in this area have focused on methods for increasing the specific capacity of supercapacitors [78]. graphene and its derivatives are extensively used for the manufacture of supercapacitor electrodes because they cost less than other materials such as metal oxides [76]. in addition, graphene-based electrodes provide a lower energy density than metal oxides [77].
the total capacitance of a supercapacitor can be increased through changes in the electrode structure [75][77][78][79]. according to hwang (2012), quantum capacitance is the factor limiting the total capacitance of graphene-based capacitors [79]. these calculations were repeated by wood et al. (2014), who reported the same results. the effect of graphene functionalization on quantum capacitance was studied in 2015 [80]. according to the results, the use of functionalized graphene as the base material for supercapacitor electrodes provides very favorable results. according to [81], modification of graphene sheets with n, p, s and si atoms may affect the quantum capacitance. studies have also shown interesting effects of graphene sheet functionalization [80]. the results of an experimental study confirmed the significant impact of functionalization of graphene sheets on energy density [82]. structural defects also improve the quantum capacitance [77]. graphite oxide has nanometric porosity throughout its three-dimensional structure and curved walls with a thickness of one atom. this makes it an exceptional electrode material for supercapacitors and enables these energy storage devices to be used in a wide range of applications, especially in the manufacture of electronic devices for the computer and telecommunication industries. graphite oxide can be converted to individual monolayer sheets by methods such as thermal or chemical treatment.
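the limiting role of quantum capacitance noted earlier in this section is commonly pictured as two capacitances in series: the quantum capacitance of the graphene electrode and the double-layer capacitance of the electrode-electrolyte interface. the sketch below assumes this simple series model. the numerical values are hypothetical and serve only to show that the smaller of the two contributions dominates the total.

```python
def total_capacitance(c_quantum, c_double_layer):
    """Series combination of quantum and double-layer capacitance (same units)."""
    return 1.0 / (1.0 / c_quantum + 1.0 / c_double_layer)


# Hypothetical areal capacitances in microfarads per square centimeter.
C_DOUBLE_LAYER = 20.0

for c_q in (5.0, 20.0, 100.0):
    c_tot = total_capacitance(c_q, C_DOUBLE_LAYER)
    print(f"C_Q = {c_q:6.1f}, C_dl = {C_DOUBLE_LAYER:.1f} -> C_total = {c_tot:.2f} uF/cm^2")
```

because the total can never exceed the smaller term, raising the quantum capacitance (for example through the functionalization or doping discussed above) is one way to raise the total capacitance in this model.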
graphite oxide is exfoliated into graphene oxide monolayers owing to the polar oxygen functional groups on the sheets, which improve the dispersion of graphene oxide sheets in polar solvents such as water and many organic solvents. the graphene oxide sheets can subsequently be reduced by reducing agents such as hydrazine hydrate, dimethyl hydrazine, sodium borohydride and ascorbic acid, which allows them to recover their sp2 carbon network. however, graphene oxide layers are not completely reduced, so carboxylic acid and hydroxyl functional groups remain on the edges of the sheets [83],[50]. at present, activated carbon is used as the electrode material in most commercial supercapacitors (electrochemical double-layer capacitors) because of its high surface area and electrical conductance [84]. carbon nanofillers can be added to polymers to improve the performance of electrodes. the results of one study indicated a specific capacity of 120 f/g for graphene in propylene carbonate with tetraethylammonium tetrafluoroborate as the electrolyte. according to another study, an electrode made of polyaniline nanocomposite filled with microwave-reduced graphene oxide showed a capacity of 408 f/g in sulfuric acid [85][86]. yuan et al. [87] developed a method for preparing three-dimensional porous structures from graphene oxide that are highly efficient in capacitors. these researchers started from graphene oxide and then used chemical reduction to obtain porous graphene structures. tu f. et al. [88] developed a simple method for making three-dimensional structures of reduced graphene oxide in ethylene glycol that can be used in lithium capacitors. these researchers stated that such capacitors do not lose their capability after 3000 cycles.
despite the considerable theoretical double-layer capacitance of graphene (526 f/g), experimental values are lower than the theoretical ones owing to the agglomeration of graphene sheets, which reduces the surface area and the permeation of electrolyte ions into the electrodes [89]. to solve this problem, three-dimensional graphene networks (3dgns) with an intertwined structure have been used as an ideal supercapacitor material. intertwined 3dgns, with their very large surface area, high specific capacity, excellent mechanical properties and high electrical conductivity, provide a unique supercapacitor material with a high charging-discharging capability and lifetime [90]. table 3 lists the specifications of some supercapacitor materials made of 3d graphene composites with transition metal oxides/hydroxides.

table 3 supercapacitor materials made of 3d graphene (3dgns) and transition metal oxides/hydroxides (-- : not reported)

| composite | specific capacity cs (f/g) | number of cycles tolerated (capacity retained) | energy density (wh/kg) | power density (kw/kg) | electrolyte | synthesis method | ref |
|---|---|---|---|---|---|---|---|
| sno2/ga | 310 | 1000 (~90%) | 30 | 8.3 | 2 m koh | chemical self-assembly | [91] |
| v2o5 nanobelt/gh | 426 | 5000 (95%) | 21.3 | -- | 0.5 m k2so4 | ultrasound waves and hydrothermal | [92] |
| mno2/gf | 130 | 5000 (82%) | 6.8 | 2.5 | 0.5 m na2so4 | cvd, electrochemical deposition | [93] |
| sponge-rgo/mno2 | 450 | 10000 (90%) | 8.34 | 47 | 1 m na2so4 | dip coating | [94] |
| mno2/ga | 410 | 50000 (95%) | -- | -- | 0.5 m na2so4 | sol-gel, electrochemical deposition | [95] |
| mno2/gh | 242 | 1000 (89.6%) | 21.2 | -- | 1 m na2so4 | self-assembly | [96] |
| mno2/gh/nf | 234 | 10000 (98.5%) | -- | -- | 0.5 m na2so4 | electrochemical deposition | [97] |
| ruo2/cnt/gf | 503 | 8100 (106%) | 39.28 | 128.01 | 2 m li2so4 | cvd | [98] |
| co3o4/gh | 757 | 500 (94.5%) | 9.3 | 142.9 | 6 m koh | hydrothermal | [99] |
| co(oh)2/gf | 1139 | 1000 (74%) | 13.9 | 18 | 1 m koh | chemical bath deposition | [100] |
| ni(oh)2/gh | 1247 | 2000 (95%) | 31.1 | 9 | 6 m koh | hydrothermal | [101] |
| ni(oh)2/gf | 1560 | 10000 (63.2%) | 6.9 | 44 | 6 m koh | hydrothermal | [102] |
| ni(oh)2/gf | 1450 | 1000 (78%) | -- | -- | 6 m koh | apcvd, hydrothermal | [103] |
| nio/gf | 816 | 2000 (100%) | -- | -- | 3 m koh | cvd, electrochemical deposition | [104] |
| fe2o3/gh anode | 908 | 200 (75%) | -- | -- | 1 m koh | go solution, hydrothermal, ultrasound waves | [105] |
| 3d fmg | 508 | 1000 (94%) | 15.32-66 | 14.43-52 | -- | thermal decomposition and ultrasound waves | [106] |
| 3dg layers | 231.2 | 8000 (>99%) | 32.1 | 0.5 | 1 m na2so4 | gas foaming | [107] |
| nico2o4 nanoneedles/3dgn | 970 | 3000 (~96.5%) | -- | -- | -- | solvothermal, cvd | [108] |
| nio/gm | 727 | 1000 (94.5%) | -- | -- | 6 m koh | hydrothermal | [109] |
| n-rgo | 214 | 5000 (100%) | -- | -- | 6 m koh | ice-templating | [110] |

10. improving light absorption using graphene

graphene is a two-dimensional carbon sheet with a honeycomb lattice and a thickness of only one atom. it has remarkable optical, electronic and thermal properties. electrons in graphene behave like massless dirac fermions with a linear energy-momentum relation, which raises the carrier mobility of graphene to 10^5 cm^2/(v s) at room temperature and 10^6 cm^2/(v s) at lower temperatures. the absorption coefficient of graphene is higher than that of conventional semiconductors. further features of graphene are its high thermal conductivity, about 5000 w/m.k for monolayer graphene, and its zero energy gap, which is why graphene is used in devices such as photodetectors, solar cells and light emitting diodes. it should be noted that, despite its high absorption coefficient, a single graphene layer absorbs only about 2.3 percent of light from the visible to the infrared because of its short interaction length. this amount of light absorption is not enough for use in photodetection devices. recently, photonic techniques have been used to improve light absorption in graphene. echtermeyer et al. [111] combined a graphene layer with plasmonic nanostructures, which increased the efficiency of graphene photodetectors. abajo et al. [112] used a periodic graphene pattern to improve light absorption in the infrared range. zhao et al. [113] used a metal grating to increase light absorption in graphene.
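the small single-layer absorption mentioned above is commonly estimated as pi times the fine-structure constant, which gives about 2.3% per layer. the sketch below evaluates this estimate and, under the simplifying assumption that stacked layers absorb independently, shows how slowly absorption builds up with the number of layers. the function names are illustrative, and the stacking model is an approximation rather than a result from the cited works.

```python
import math

# Fine-structure constant (dimensionless), approximately 1/137.036.
ALPHA = 7.2973525693e-3


def single_layer_absorption():
    """Estimated fraction of light absorbed by one graphene layer (pi * alpha)."""
    return math.pi * ALPHA


def stacked_absorption(n_layers):
    """Absorbed fraction for n layers, assuming independent absorption and no reflection."""
    return 1.0 - (1.0 - single_layer_absorption()) ** n_layers


print(f"single layer: {100.0 * single_layer_absorption():.2f}%")
for n in (2, 5, 10):
    print(f"{n:2d} layers: {100.0 * stacked_absorption(n):.1f}%")
```

this weak intrinsic absorption is the motivation for the plasmonic, grating and patterning approaches surveyed in this section.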
zhu et al. [114] found that combining graphene with a plasmonic array of nanoscale holes increases its light absorption to 30% in the visible range. the photoresponsivity and photocurrent of several photodetectors have been compared at a communication wavelength of 1.55 µm [115]. that work indicates the superiority of the designed photodetector over the other cases. table 4 compares the photoresponsivity and photocurrent of the proposed photodetector with those reported in [115].

table 4 comparison of photoresponsivity and photocurrent of the proposed photodetector with those in [115]

| reference | power or input intensity | photoresponsivity | photocurrent |
|---|---|---|---|
| [116] | 80 mw/cm2 | 0.1 a/w | 2.765 ma |
| [117] | 0.5 mw | 0.9 ma/w | 0.45 µa |
| [118] | 0.5 mw | 0.273 a/w | 0.137 ma |
| [119] | 0.6 µw | 0.37 a/w | 0.222 µa |
| [120] | 0.5 mw | 100 a/w | 50 ma |

11. graphene application in solar cells

one of the options examined for supplying energy to portable electronic devices is the use of photovoltaic cells. photovoltaic cells are considered one of the world's future energy sources. it is estimated that by 2050, 15 to 30 percent of world power will be provided by solar energy. at present, most photovoltaic cells are made of monocrystalline or polycrystalline silicon. the new materials currently used in making photovoltaic cells include many kinds of carbon nanostructures [121]. carbon nanotubes used in composite structures can increase the efficiency of solar cells because of their high surface area and current conductance. researchers have also turned to graphene for making solar cells. using nanotube structures and graphene in the photoelectrode can increase the speed of electron transport. however, increasing the amounts of nanotubes and graphene also increases the rate at which electrons return from the photoelectrode, so an optimal amount should be chosen.
in a comparison of the current-voltage curves of three electrodes, made of graphene, multi-walled carbon nanotubes (mwcnt) and graphene/multi-walled carbon nanotubes (gmwnt), charge transfer in the gmwnt-based electrode was higher than in the other two electrodes [122]. jeon et al. [123] prepared a hole transport layer (htl) for solar cells by thermal reduction of graphene oxide. according to their results, graphene dispersed in polystyrene sulfonate increased the efficiency of solar cells compared with those manufactured by adding polyethylene dioxythiophene to polystyrene sulfonate. adding a small amount of gold nanoparticles and boron-doped carbon nanotubes (au:bcnt) improves light excitation and the separation of electron-hole pairs as charge carriers for transport to the electrodes in solar cells. multiple synergistic effects led to a high efficiency of 9.81% [124].

12. graphene-based touchscreens

touchscreens are input/output (i/o) tools that allow users to interact with or control what is shown on the screen by touching it with one or more fingers or with a stylus tip (a pen-like tool). because of the need for transparent conductive materials (tcms), extensive studies have been conducted to find alternatives to indium tin oxide (ito), the most commonly used tcm. transparent conductive electrodes are used in numerous optoelectronic devices. ito is currently used as a transparent conductive electrode because of its high electrical conductivity and optical transparency. however, it may not be a suitable choice because of technical and economic limitations. the use of ito is limited for two reasons. first, indium is a very rare and expensive element. second, ito is brittle, and its electrical conductivity decreases irreversibly with even a small amount of bending. this limits the use of ito in applications requiring high flexibility [125][126].
as an oxide ceramic material, ito is very brittle and susceptible to cracking. today, graphene is considered a good alternative for transparent conductive electrodes in many applications. graphene is combined with silver nanowires for the facile manufacture of novel touchscreens with a higher strength. furthermore, graphene-based touchscreens consume less energy and bend easily because of their high flexibility. touchscreens available on the market are usually constructed from indium tin oxide (ito) layers, which are very expensive and brittle despite their high electrical conductivity. touchscreens constructed from silver alone are not economically feasible, so it is better to manufacture touchscreens by combining graphene and silver. ito-based touchscreens have a low mechanical strength, and most users have complained of their fracture. metal nanowires (mnws) combine high flexibility, optical transparency and good electrical conductivity. despite using less raw material, mnws show good transparency and electrical conductivity, like ito electrodes, because of their large length to diameter ratio. to achieve optimal electrical conductivity and transparency, more indium is needed in ito electrodes than silver in silver nanowire electrodes. adding graphene to silver nanowires increased their electrical conductivity by a factor of around 10000. this shows that similar or even better results, with improved performance and lower energy consumption, can be obtained with only part of the silver previously used in the manufacture of such electrodes. according to the literature, the use of graphene significantly reduces the manufacturing costs of touch films. it should be noted that silver darkens when exposed to air, but the graphene layer protects the silver from air molecules.
furthermore, the electrical properties of graphene films do not change with bending. while previous samples faced many limitations in this regard, this new material paves the way for the manufacture of flexible devices [127].

13. graphene-based bulbs and light emitting diodes (leds)

according to the light spectrum emitted from graphene, it can be heated to temperatures above 2500 °c, hot enough to emit light. the light emitted from this thin graphene layer is very intense and visible to the naked eye without magnification. a visible light source using graphene strings on a chip is under development, and many studies are currently being conducted in this field. this new type of broadband light emitter can be integrated with chips to pave the way for the manufacture of atomically thin, flexible, transparent displays as well as graphene-based optical communications. light emission in small structures on a chip is essential for the development of photonic circuits that work with light. graphene can reach these temperatures without melting the metal or the silicon substrate. moreover, at such high temperatures graphene conducts heat poorly, so heat is concentrated at the center of the carbon strings, leading to intense light emission [128][129][130]. graphene bulbs are similar to ordinary light bulbs in appearance but contain a thin graphene layer, which leads to higher efficiency and light emission. graphene bulbs, with a lower manufacturing cost and higher brightness and lifetime, reduce power consumption. compared with ordinary light bulbs and light emitting diodes (leds), graphene bulbs consume less energy. graphene-based leds are used in led displays and monitors with color and quality settings, cell phones, tvs and lightweight led textiles with color adjustment. h. diker et al. used a composite of graphene oxide and pedot:pss for hole injection in light-emitting diodes (leds) [131]. to this end, graphene oxide sheets of various sizes were examined.
despite the low efficiency of the leds, they found a significant effect of graphene oxide as a hole injection layer (hil). it seems that graphene bulbs and leds are able to change the computer and communications industry. the very small size of graphene bulbs and leds allows the manufacture of displays with higher sensitivity and color capabilities. however, many more studies are needed before graphene-based photonic devices can be used in practice.

14. graphene microphones

a microphone is a transducer that converts sound into an electrical signal. there are numerous studies on the use of graphene in the structure of microphones [132-135]. graphene microphones are ultrasonic and lightweight. both conventional speakers and microphones use either paper or plastic diaphragms, which play a key role in sound generation or detection through vibration. the diaphragms used in the new devices are made of a graphene sheet with a thickness of only one atom. this lightweight diaphragm has high hardness and strength and is able to respond to a wide range of frequencies, from infrasound (≤20 hz) to ultrasound (≥20 khz). one of the great advantages of using graphene is that ultrathin graphene sheets respond well to the various frequencies of an electronic pulse. unlike currently used piezoelectric speakers and microphones, the frequency response of this lightweight membrane covers a very wide range and can generate high frequencies for measuring distances more accurately than traditional methods. the membrane converts over 99% of the energy into sound, whereas the corresponding value for conventional microphones and speakers is only 8% [133]. graphene microphones constructed by spasenović et al. [132] were 32 times more sensitive than standard nickel microphones.
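a sensitivity ratio such as the factor of 32 quoted above can also be expressed on the decibel scale. the sketch below assumes a power-type ratio, for which the level difference is 10·log10(ratio); this convention is an assumption made here, and the function names are illustrative.

```python
import math


def ratio_to_db(power_ratio):
    """Level difference in dB for a power-type ratio."""
    return 10.0 * math.log10(power_ratio)


def db_to_ratio(level_db):
    """Power-type ratio corresponding to a level difference in dB."""
    return 10.0 ** (level_db / 10.0)


print(f"ratio 32 -> {ratio_to_db(32.0):.2f} dB")
print(f"15 dB    -> ratio {db_to_ratio(15.0):.1f}")
```

under this convention, a factor of 32 and a difference of about 15 db describe the same improvement.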
spasenović et al. constructed a very sensitive graphene membrane to convert sound into electric current. in their study, graphene membranes with 60 layers were grown on a nickel foil by chemical vapor deposition (cvd). once the graphene had been produced, the nickel layer was removed and the grown layer was placed on the surface of the microphone. according to the results, this microphone was 15 db more sensitive than commercial microphones on the market. they also simulated a 300-layer membrane with a high efficiency in the infrasound range. a thick graphene membrane can be flexible and perform well for infrasound waves. graphene microphones are very useful for studying acoustic signals at high frequencies. besides electromagnetic waves, acoustic waves and highly directional long-range sound are expected to be used in communication devices such as mobile phones.

15. graphene oleds

organic light-emitting diodes (oleds) have been widely developed and used because of their high image quality, low power consumption and ultrathin structure. oleds consist of an active, light-emitting organic structure embedded between two electrodes, one of which should be transparent. indium tin oxide (ito) is commonly used in oleds. however, indium is an expensive, rare element that is difficult to recover. graphene is a good alternative for making electrodes: it is used to form transparent electrodes (tes) and also serves as an electrode in oleds [136]. graphene oleds showed a performance similar to that of control devices made with ito transparent electrodes, indicating the potential of graphene. the oled designed by lee and yoo in [137] was very complex and yet efficient. in these displays, graphene was embedded between a thin transparent titanium oxide layer and a conductive polymer layer. the displays consist of 5 stacked layers and are mounted on a plastic substrate.
when a voltage is applied, the cathode emits electrons and the anode supplies holes, and their recombination in one of the layers leads to photon emission by the oled. the light emission path in this type of display can be set by adjusting the cathode and anode. a titanium oxide layer with a high refractive index and a hole injection layer of conductive polymer with a low refractive index are used in this scheme, placed on either side of the graphene layer. in this type of display the color can be controlled and the loss due to surface plasmon polaritons is reduced. the performance of the resulting display improves because of the concurrent use of two layers with different refractive indices.

16. graphene inks

graphene inks with very high electrical conductivity are highly flexible and can be used on a wide range of substrates, including paper and plastics. this type of ink shows higher surface stability than conductive composites and carbon inks, and is much cheaper than metal inks. graphene inks are formulated with graphene, various polymers and surfactants, and functionalization techniques that reduce aggregation and improve the dispersion of graphene in the ink. graphene inks could speed up the printing of electronic devices and so reduce the associated costs. to produce graphene inks, graphene particles are dispersed in a solvent and water is then added to the base ink. the ratio of ingredients is adjusted to preserve the solvent properties without affecting the composition of the ink and of the graphene in the solvent. conventional conductive ink is combined with highly conductive silver for use in printing machines, which increases its cost. graphene inks are relatively cheap compared with silver-containing inks. silver cannot be recycled while graphene can be recycled and reused.
there are numerous studies on the use of cheap, nontoxic and environmentally friendly graphene inks [138-142]. graphene inks and other two-dimensional graphene related materials (grms) have received much attention in the smart textile industry as a way to combine electronic devices with novel fabrics. a method was reported in [143] for depositing graphene ink on cotton to produce conductive fabrics. other conductive inks are prepared from expensive metals such as silver, which makes them costly and unstable. the graphene used in that study, by contrast, is cheap, environmentally friendly and chemically compatible with cotton. this method allows electronic systems to be placed directly on the human body, and it is a novel technology for manufacturing smart fabrics.

17. use of graphene in reinforcing electrical circuits against moisture

graphene can be combined with metals to produce moisture-resistant connections in electronic circuits [144]. this can be useful for developing novel low-cost sensors. to produce highly efficient sensors, graphene should preserve its electrical conductivity after being combined with electronic circuits. a durable connection is necessary in any sensor and plays a key role in its function. however, graphene is sensitive to moisture: water molecules adsorb on the graphene surface and change its electrical conductivity, so that wrong signals are sent to the sensor. if the graphene is attached to the metal in the electronic circuit, its resistance does not change in the presence of water molecules, and moisture does not adversely affect its performance. according to quellmalz, one of the researchers on the project, this technology will simplify sensor design because moisture is no longer a concern and water will not negatively affect the circuit.
they conducted experiments on graphene connected to gold-metallized silica plates, and evaluated the structure by different methods and by computer simulations. according to quellmalz, graphene has unique properties while conventional electronics is cheap, so combining them offers the advantages of both. to combine these technologies, graphene is placed on the finished electronic component rather than depositing metal on the graphene surface.

18. graphene ink for development of internet of things (iot)

the internet of things (iot) is bringing extensive changes to modern cities and to people's everyday lives around the world. iot deals with people's accessories, assets, information, knowledge, services, and businesses. scientists have developed a soft, flexible type of graphene that can be bent and folded and can be formed into three-dimensional conductive objects of any shape. these achievements will make it easier to produce iot-based objects and smart products, and graphene can be used to produce the advanced electrical products used in iot. graphene can also be used to power iot devices [149-151]. pan et al. [145] proposed a method for manufacturing printed electronic components using 2d materials that is both fast and cost effective. the graphene-based ink can be used to manufacture electronic devices, in particular those employed in iot. as a 2d material consisting of carbon atoms with high electrical conductivity, graphene is well suited to the iot industry. however, two main problems, high price and rapid oxidation, limit the use of conductive inks in iot and the industrial use of 2d carbon materials.
dihydrolevoglucosenone, known as cyrene, was used to modify the graphene properties. according to the results, cyrene increases the exfoliation rate of graphite, so graphene can be produced at lower cost by a nontoxic and fully renewable method.

19. using graphene in construction of transistors

defect-free graphene is an ideal 2d lattice in which carbon atoms with sp² hybridization provide high strength. however, the durability of polycrystalline graphene is not sufficient for industrial applications, which limits its commercial use. an imperfect crystal structure and defects in the lattice are the most important drawbacks limiting commercial applications of graphene in electronics. to overcome this limitation, the mechanical durability of single-atom-thick polycrystalline graphene sheets must be improved before graphene can be used in practice in soft and flexible electronics, and this requires a simple, low-cost method consistent with conventional processes. increasing the durability of large graphene sheets would enable a new generation of flexible electronics. a method has been proposed to improve the mechanical durability of graphene by chemically bonding nanoscale patches to its surface [146]. the organosilane nanopatches are attached to the graphene surface with nanometric thickness. the nanopatches improve the resistance of graphene to harsh environments, which broadens its applications, and they increase its mechanical durability when it is used as an electrode in wearable sensors. furthermore, when this graphene is used as an electrode, the orientation of the semiconducting organic layers on its surface is controlled and charge is injected more effectively in organic transistors. this type of graphene can be used in organic field-effect transistors (ofets).
using graphene and boron nitride, the researchers in [152] produced a two-dimensional field-effect transistor. unlike silicon-based field-effect transistors, the performance of this transistor does not decline at high voltages, and it provides high electron conductivity even when its thickness is reduced to a single layer. the transistor was made from sheets of hexagonal boron nitride, transition metal dichalcogenides, and graphene held together by van der waals interactions. each section of the transistor is made of a thin layer; the layers are designed to be similar in thickness, free of surface roughness, and bonded by van der waals forces.

20. development of nanoelectromechanical switches using graphene

graphene-based nanoelectromechanical switches can be used for electrostatic discharge (esd) protection of electronic devices [147]. the switches consist of two terminals with a gap between them. the gap is formed in the lower conductive substrate and the graphene membrane is placed on top of that substrate. this is a new concept in on-chip esd protection of electronic components. according to chen, these esd switches have numerous advantages over conventional pn junction-based esd devices. this passive mechanical switch has near-zero leakage and very low parasitic capacitance. in addition, the switch operates in both polarities, while pn junction-based devices only provide unipolar protection. with its very high mechanical and thermal performance, graphene is well suited to this type of switch. the switch can be produced in a cmos infrastructure through heterogeneous integration, and the cmos-based process proposed in that study can be used to produce graphene nems esd switches. the method was applied in a clean room, and the results were characterized by transmission line pulse (tlp) testing.
this device can be used to produce a new generation of on-chip esd protectors.

21. use of graphene in the manufacture of cameras

under identical conditions, graphene image sensors are more sensitive to light than commercial cmos and ccd sensors, and they also consume less energy. the use of this type of sensor in surveillance equipment and satellites could make a great difference. image quality normally decreases when cameras are downsized, because the image sensor becomes smaller. the graphene sensor increased camera resolution despite camera downsizing [148]. the use of graphene with complementary metal-oxide semiconductor (cmos) technology may produce high-resolution, low-noise images, and it allows cameras to be integrated in small electronic devices. graphene can be used in the production of image sensors, and also as a phototransistor in digital image sensors to convert light into an electric current. a digital imaging system capable of imaging visible, ultraviolet and infrared light simultaneously was constructed from graphene and quantum dots [148]. the sensor was built from pbs quantum dots mounted on a graphene sheet. the resulting hybrid system was then connected to a cmos chip and the whole system was connected to a readout circuit. the resulting high-resolution imaging sensor was sensitive to a wide range of wavelengths, from uv at 300 nm through visible light to infrared at 2000 nm, and was able to detect visible and invisible light simultaneously. the thin graphene layer reduces the size of the final product while increasing its resolution. each pixel in the sensor is covered by graphene with a layer of quantum dots on top. the dots in this layer absorb light and transfer their electrons to the graphene. with these graphene sensors, there is no need to downsize pixel dimensions.
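the wavelength limits quoted above can be translated into photon energies with e = hc/λ, which gives a feel for how wide the spectral window of such a sensor is. this is a minimal sketch using the exact si values of the constants; the function name is a hypothetical helper, and nothing here models the actual device in [148].

```python
# Exact SI defining constants (2019 redefinition).
PLANCK_J_S = 6.62607015e-34          # Planck constant, J*s
LIGHT_SPEED_M_S = 299_792_458.0      # speed of light in vacuum, m/s
ELECTRON_VOLT_J = 1.602176634e-19    # one electronvolt in joules


def photon_energy_ev(wavelength_nm: float) -> float:
    """Return the photon energy in eV for a vacuum wavelength given in nm."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    wavelength_m = wavelength_nm * 1e-9
    energy_j = PLANCK_J_S * LIGHT_SPEED_M_S / wavelength_m
    return energy_j / ELECTRON_VOLT_J


if __name__ == "__main__":
    # Wavelength limits taken from the text: uv at 300 nm, infrared at 2000 nm.
    for wavelength in (300.0, 2000.0):
        print(f"{wavelength:6.0f} nm -> {photon_energy_ev(wavelength):.2f} eV")
```

for 300 nm this gives roughly 4.1 ev and for 2000 nm roughly 0.62 ev, so the quoted range spans a factor of nearly seven in photon energy.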
the number of pixels is not the only factor determining the imaging quality of a camera; sensor size also plays a key role. unlike in cmos sensors, increased noise is not a concern in this new sensor. cameras made with this sensor can record images in low light using infrared waves with a wavelength of 1100-1900 nm, just as night vision cameras do, and they can also record images under normal conditions. the highly sensitive cameras constructed from this type of phototransistor are cost effective and can easily be used in computers, laptops and other electronic devices.

22. conclusion

graphene is an emerging material that plays an important role in the development of advanced technologies. compared with other polymer matrix composites or other carbon-based composites (those reinforced with carbon nanotubes or carbon fibers), graphene composites offer better properties and performance. in this review we examined graphene-reinforced polymer nanocomposites and the recent advances, properties and applications of these materials in the computer and electronics industries. the application of these nanomaterials in the computer and electronics industries adds considerable value and is of economic interest; at the same time, these industries face many challenges in their progress that need to be solved, and polymer nanocomposites filled with graphene can help solve them. on the other hand, preparing high-quality graphene at a reasonable price remains a problem, and new synthesis and production methods are needed. there are different methods for producing graphene-reinforced polymer nanocomposites, and the choice of method affects the degree of graphene dispersion within the polymer matrix and the final properties of the composite.
making full use of graphene-filled nanocomposites depends on the amount of graphene, its distribution and its orientation, which in turn affect the cost of producing the final material. the distribution and orientation of graphene are therefore critical for optimizing structural and functional effectiveness. preventing random orientation of the filler nanoparticles yields designed nanocomposites with controlled, precise configurations. in addition, graphene and chemically modified graphene can be functionalized to form strong interfacial bonds with other materials, which is why they are used in making new computer and telecommunication components. these nanomaterials therefore deserve more study in the computer and telecommunication industries than they have received so far.

references

[1] s. park and r. s. ruoff, “chemical methods for the production of graphenes”, nature nanotechnology, 2009. [2] m.a. xavior and h.g. prashantha kumar, “graphene reinforced metal matrix composite (grmmc): a review”, materials today: proceedings, vol. 4, no. 2, pp. 3334–3341, 2017. [3] w. choi and j. lee, graphene synthesis and applications, crc press taylor & francis group, 2012, pp. 1–223. [4] r.a. vaia and h.d. wagner, “framework for nanocomposites”, materials today, vol. 7, no. 11, pp. 32–37, 2004. [5] d.v. rueger and m.r. kessler, “effect of silane structure on the properties of silanized multiwalled carbon nanotube-epoxy nanocomposites”, polymer, vol. 55, pp. 1854–1865, 2014. [6] s.b. jin, g.s. son, y.h. kim, and c.g. kim, “enhanced durability of silanized multi-walled carbon nanotube/epoxy nanocomposites under simulated low earth orbit space environment”, compos. sci. tech., vol. 87, pp. 224–231, 2013. [7] s. ullah khan, j.r. pothnis, and j.k.
kim, “effects of carbon nanotube alignment on electrical and mechanical properties of epoxy nanocomposites”, composite part a: applied science and manufacturing, vol. 49, pp. 26–34, 2013. [8] s. shadlou, e. alishahi, and m.r. ayatollahi, “fracture behavior of epoxy nanocomposites reinforced with different carbon nano-reinforcements”, composite structure, vol. 95, pp. 577–581, 2013. [9] m.m. gallego, m. hernández, v. lorenzo, r. verdejo, m.a. manchado, and m. sangermano, “cationic photocured epoxy nanocomposites filled with different carbon fillers”, polymer, vol. 53, pp. 1831–1838, 2012. [10] j. lingpu, y. shengjiao, j. yimi and w. chunming, “electrochemical deposition of diluted magnetic semiconductor znmnse on reduced graphene oxide/polyimide substrate and its properties”, alloy. compos., vol. 609, pp. 233–238, 2014. [11] x. zhang, m. liu, y. mao, y. xu, and s. niu, “ultrasensitive photoelectrochemical immunoassay of antibody against tumor-associated carbohydrate antigen amplified by functionalized graphene derivates and enzymatic biocatalytic precipitation”, biosensor and bioelectronics, vol. 59, pp. 21–27, 2014. [12] t.d. dao, j.e. hong, k.s. ryu, and h.m. jeong, “supertough functionalized graphene paper as a high-capacity anode for lithium ion batteries”, chemical engineering journal, vol. 250, pp. 257–266, 2014. [13] s. aoyama, y.t. park, w. and macosko, “melt crystallization of poly(ethylene terephthalate): comparing addition of graphene vs. carbon nanotubes”, polymer, vol. 55, pp. 2077–2085, 2014. [14] t. gokkurt, a. durmus, and v. sariboga, “investigation of thermal, rheological, and physical properties of amorphous poly(ethylene terephthalate)/organoclay nanocomposite films”, appl. polym. sci., vol. 129, pp. 2490–2501, 2013. [15] h. kim, a.a. abdala, and c.w. macosko, “thermal analysis of epoxy-based nanocomposites: have solvent effects been overlooked”, macromolecules, vol. 43, pp. 6515–6530, 2010. [16] h. zhang, w. zheng, q. yan, y. yang, j. 
wang, and z. lu, “electrically conductive polyethylene terephthalate/graphene nanocomposites prepared by melt compounding”, polymer, vol. 51, pp. 1191–1196, 2010. [17] s. bandla and j.c. hanan, “microstructure and elastic tensile behavior of polyethylene terephthalate-exfoliated graphene nanocomposites”, mater. sci., vol. 47, pp. 76–82, 2012. [18] m. li and y.g. jeong, “influences of exfoliated graphite on structures, thermal stability, mechanical modulus, and electrical resistivity of poly(butylene terephthalate)”, appl. polym. sci., vol. 125, pp. 53–40, 2012. [19] n.a. kotov, “materials science: carbon sheet solutions”, nature, vol. 442, pp. 254–255, 2006. [20] c.n.r. rao, k. biswas, k.s. subrahmanyam and a. govindaraj, “graphene, the new nanocarbon”, mater. chem., vol. 19, pp. 2457–2469, 2009. [21] c.n.r. rao, a.k. sood, k.s. subrahmanyam and a. govindaraj, “graphene: the new two-dimensional nanomaterial”, angew. chem., int. ed., vol. 48, pp. 7752–7777, 2009. [22] d. cai and m. song, “recent advance in functionalized graphene/polymer nanocomposites”, mater. chem., vol. 20, pp. 7906–7915, 2010. [23] x. li, y. zhu, w. cai, m. borysiak, b. han, d. chen, r.d. piner, l. colombo and r.s. ruoff, “transfer of large-area graphene films for high-performance transparent conductive electrodes”, nano lett., vol. 9, pp. 4359–4363, 2009. [24] t.d. dao, h.i. lee and h.m. jeong, “alumina-coated graphene nanosheet and its composite of acrylic rubber”, colloid. int. sci., vol. 416, pp. 38–43, 2014. [25] h.m. seo, j.h. park, t.d. dao, and h.m. jeong, “compatibility of functionalized graphene with polyethylene and its copolymers”, nanomaterials, vol. 129, no. 5, pp. 1–8, 2013. [26] j.t. choi, t.d. dao, k.m. oh, h.i. lee, h.m. jeong, and b.k. kim, “shape memory polyurethane nanocomposites with functionalized graphene”, smart mater. struct., vol. 21, 2012. [27] p. fabbri, e. bassoli, s.b. bon, and l.
valentini, “preparation and characterization of poly(butylene terephthalate)/graphene composites by in-situ polymerization of cyclic butylene terephthalate”, polymer, vol. 53, pp. 897–902, 2012. [28] b. shen, w.z. tao, d. lu, and w. zheng, “enhanced interfacial interaction between polycarbonate and thermally reduced graphene induced by melt blending”, compos. sci. tech., vol. 86, pp. 109–116, 2013. [29] j. dong, c. yin, x. zhao, y. li, and q. zhang, “high strength polyimide fibers with functionalized graphene”, polymer, vol. 54, pp. 6415–6424, 2013. [30] x. wang, l. song, w. pornwannchai, y. hua and b. kandola, “the effect of graphene presence in flame retarded epoxy resin matrix on the mechanical and flammability properties of glass fiber-reinforced composites”, composites part a: applied science and manufacturing, vol. 53, pp. 88–96, 2013. [31] m.a. rafiee, j. rafiee, z. wang, h. song, z.z. yu and n. koratkar, “enhanced mechanical properties of nanocomposites at low graphene content”, acs nano, vol. 3, pp. 3884–3890, 2009. [32] m.a. rafiee, j. rafiee, i. srivastava, z. wang, h. song, z.z. yu, and n. koratkar, “fracture and fatigue in graphene nanocomposites”, nano micro small, vol. 6, pp. 179–183, 2010. [33] r. verdejo, m. mar bernal, l.j. romasanta and m.a. manchado, “graphene filled polymer nanocomposites”, mater. chem., vol. 21, pp. 3301–3310, 2011. [34] p. steurer, r. wissert, r. thomann and r. mulhaupt, “functionalized graphenes and thermoplastic nanocomposites based upon expanded graphite oxide”, macro. rapid commun., vol. 30, pp. 316–327, 2009. [35] w. choi and j. lee, graphene synthesis and applications, crc press taylor & francis group, pp. 1–223, 2012. [36] c. low, f. walsh, m. chakrabarti, m.a hashim, m.a.
hussain, “electrochemical approaches to the production of graphene flakes and their potential application”, carbon, vol. 54, pp. 1–21, 2013. [37] a.k. geim and k.s. novoselov, “the rise of graphene”, nat. mater., vol. 6, pp. 183–191, 2007. [38] c.n.r. rao, k. biswas, k.s. subrahmanyam and a. govindaraj, “graphene, the new nanocarbon”, mater. chem., vol. 19, pp. 2457–2469, 2009. [39] c. soldano, a. mahmood and e. dujardin, “production, properties and potential of graphene”, carbon, vol. 48, pp. 2127–2150, 2010. [40] j. paredes and s. villar rodil, “atomic force and scanning tunneling microscopy imaging of graphene nanosheets derived from graphite oxide”, langmuir, vol. 25, pp. 5957–5968, 2009. [41] m. pumera, “electrochemistry of graphene, graphene oxide and other graphenoids”, electrochemistry communications, vol. 36, pp. 14–18, 2013. [42] k.k. sadasivuni, d. ponnamma, s. thomas, and y. grohens, “evolution from graphite to graphene elastomer composites”, progress in polymer science, vol. 39, pp. 749–780, 2014. [43] r. salvatierra, s. domingues, m. oliveira, and a. zarbin, “tri-layer graphene films produced by mechanochemical exfoliation of graphite”, carbon, vol. 57, pp. 410–415, 2013. [44] s. ansari and e.p. giannelis, “functionalized graphene sheet-poly(vinylidene fluoride) conductive nanocomposites”, polym. sci., part b., vol. 47, pp. 888–897, 2009. [45] n.a. kumar, r.r. gaddam, m. suresh, s.r. varanasi, d. yang, s.k. bhatia, x.s. zhao, “porphyrin-graphene oxide frameworks for long life sodium ion batteries”, journal of materials chemistry a, vol. 5, pp. 13204–13211, 2017. [46] w. lu, j. weng, d. wu, c. wu, g. chen, “epoxy resin/graphite electrically conductive nanosheet nanocomposite”, materials and manufacturing processes, vol. 21, pp. 167–171, 2006. [47] s.s. park and n.j. kim, “study on methane hydrate formation using ultrasonic waves”, ind. eng. chem., vol. 20, pp. 1911–1915, 2014. [48] l.y. choi, s.w. kim, and k.y.
cho, “improved thermal conductivity of graphene encapsulated poly(methyl methacrylate) nanocomposite adhesives with low loading amount of graphene”, compos. sci. technol., vol. 94, pp. 147–154, 2014. [49] s. stankovich, d.a. dikin, g.h.b. dommett, k.m. kohlhaas, e.j. zimney, e.a. stach, r.d. piner, s.t. nguyen and r.s. ruoff, “graphene-based composite materials”, nature, vol. 442, pp. 282–286, 2006. [50] d.r. dreyer, s. park, c.w. bielawski and r.s. ruoff, “the chemistry of graphene oxide”, chem. soc. rev., vol. 39, pp. 228–240, 2010. [51] d. kuang, l. xu, l. liu, w. hu and y. wu, “graphene–nickel composites”, applied surface science, vol. 273, pp. 484–490, 2013. [52] a. tchernok, m. krumova, f. johannes tölle, r. mülhaupt and s. meching, “composites from aqueous polyethylene nanocrystal/graphene dispersions”, macromolecules, vol. 47, pp. 3017–3021, 2014. [53] s. kim, j. seo and l.t. drzal, “improvement of electric conductivity of lldpe based nanocomposite by paraffin coating on exfoliated graphite nanoplatelets”, compos. part a-appl sci, and manufacturing, vol. 41, pp. 581–587, 2010. [54] x. jiang and l.t. drzal, “improving electrical conductivity and mechanical properties of high density polyethylene through incorporation of paraffin wax coated exfoliated graphene nanoplatelets and multiwall carbon nano-tubes”, compos part a-appl. s., vol. 42, pp. 1840–1849, 2011. [55] m. h. mohamadzadeh and s. sabury, “graphene oxide-induced polymerization and crystallization to produce highly conductive polyaniline/graphene oxide composite”, j. polym. sci., part a: polym. chem., vol. 52, pp. 1545–1554, 2014. [56] m. fang, k.g. wang, h.b. lu, y.l. yang and s. nutt, “single-layer graphene nanosheets with controlled grafting of polymer chains”, mater. chem., vol. 19, pp. 7098–7105, 2009. [57] p. noorunnisa khanam, a.m. almaadeed, m. ouederni, e. harkin-jones, b. mayoral, a. hamilton, d.
sun, “melt processing and properties of linear low density polyethylene-graphene nanoplates composites”, vacuum, vol. 130, pp. 63–71, 2016. [58] t. kuila, s. bose, c.e. hong, m.e. uddin, p. khanra, n.h. kim, and j.h. lee, “preparation of functionalized graphene/linear low density polyethylene composites by a solution mixing method”, carbon, vol. 49, pp. 1033–1051, 2011. [59] s. kim, i. do and l.t. drzal, “multifunctional xgnp/lldpe nanocomposites prepared by solution compounding using various screw rotating systems”, macromol. mater. eng., vol. 294, pp. 196–205, 2009. [60] s.s. park and n.j. kim, “study on methane hydrate formation using ultrasonic waves”, ind. eng. chem., vol. 20, pp. 1911–1915, 2014. [61] j.y. choi, s.w. kim and k.y. cho, “improved thermal conductivity of graphene encapsulated poly(methyl methacrylate) nanocomposite adhesives with low loading amount of graphene”, compos. sci. technol., vol. 94, pp. 147–154, 2014. [62] w. yang, s. berthou, x. lu, q. wilmart, a. denis, m. rosticher, t. taniguchi, k. watanabe, g. fève, j.m. berroir, g. zhang, c. voisin, e. baudin and b. placais, “a graphene zener–klein transistor cooled by a hyperbolic substrate”, nature nanotechnology, vol. 13, pp. 47–52, 2018. [63] l. chen, g. chen and l. lu, “piezoresistive behavior study on finger-sensing silicone rubber/graphite nanosheet nanocomposites”, adv. funct. mater., vol. 18, pp. 898–904, 2007. [64] k.h. an, s.y. jeong, h.r. hwang and y.h. lee, “enhanced sensitivity of a gas sensor incorporating single-walled carbon nanotube–polypyrrole nanocomposites”, adv. mater., vol. 16, pp. 1005–1009, 2004. [65] j.q. liu, l. tao, w.r. yang, d. li, c. boyer, r. wuhrer, f. braet and t.p. davis, “synthesis, characterization, and multilayer assembly of ph sensitive graphene−polymer nanocomposites”, langmuir, vol. 26, pp. 10068–10075, 2010. [66] j. wang, d. song, s. jia and z.
shao, “poly(n,n-dimethylaminoethyl methacrylate)/graphene oxide hybrid hydrogels: ph and temperature sensitivities and cr(vi) adsorption”, reac. func. polym., vol. 81, pp. 8–13, 2014. [67] x. qin, q. meng, y. feng and y. gao, “graphene with line defect as a membrane for gas separation: design via a first-principles modeling”, surface sci., vol. 607, pp. 153–158, 2013. [68] f. li, h. yue, z. yang, x. li, y. qin and d. he, “flexible free-standing graphene foam supported silicon films as high capacity anodes for lithium ion batteries”, mater. let., vol. 128, pp. 132–135, 2014. [69] f. liu, y. piao, j. choi and t.s. seo, “three-dimensional graphene micropillar based electrochemical sensor for phenol detection”, biosen. bioel., vol. 50, pp. 387–392, 2013. [70] m. jaroszaewski, j. ziaja, “em shielding: theory and development of new materials”, research signpost, kerala, 2012. [71] j. liang, y. wang, y. huang, y. ma, z. liu, j. cai and c. zhang, “electromagnetic interference shielding of graphene/epoxy composites”, carbon, vol. 47, no. 3, pp. 922–925, 2009. [72] w.l. song, m. cao, m.m. lu, s. bi, c.y. wang, j. liu, j. yuan and l.z. fan, “flexible graphene/polymer composite films in sandwich structures for effective electromagnetic interference shielding”, carbon, vol. 66, pp. 67–76, 2014. [73] d.a.c. brownson, d.k. kampouris and c.e. banks, “an overview of graphene in energy production and storage applications”, j. power sources, vol. 196, no. 11, pp. 4873–4885, 2011. [74] t. kim, g. jung, s. yoo, k.s. suh, r.s. ruoff, “activated graphene-based carbons as supercapacitor electrodes with macro-and mesopores”, acs nano, vol. 7, pp. 6899–6905, 2013. [75] e. paek, a.j. pak, k.e. kweon, g.s. hwang, “on the origin of the enhanced supercapacitor performance of nitrogen-doped graphene”, j. phys. chem. c, vol. 117, no. 11, pp. 5610–5616, 2013. [76] l.l. zhang, r. zhou, x.s. zhao, “graphene-based materials as supercapacitor electrodes”, j. mater. chem., vol. 20, no. 29, pp.
5983–5992, 2010. [77] a.j. pak, e. paek and g.s. hwang, “tailoring the performance of graphene-based supercapacitors using topological defects: a theoretical assessment”, carbon n. y., vol. 68, pp. 734–741, 2014. [78] b.c.b. wood, t. ogitsu, m. otani and j. biener, “first-principles-inspired design strategies for graphene-based supercapacitor electrodes”, j. phys. chem. c, vol. 118, no. 1, pp. 4–15, 2013. [79] e. paek, a.j. pak and g.s. hwang, “a computational study of the interfacial structure and capacitance of graphene in [bmim][pf6] ionic liquid”, j. electrochem. soc., vol. 160, no. 1, pp. a1–a10, 2012. [80] s.m. mousavi-khoshdel and e. targholi, “exploring the effect of functionalization of graphene on the quantum capacitance by first principle study”, carbon, vol. 89, pp. 148–160, 2015. [81] m. mousavi-khoshdel, e. targholi and m.j. momeni, “first-principles calculation of quantum capacitance of codoped graphenes as supercapacitor electrodes”, the journal of physical chemistry c, vol. 119, pp. 26290–26295, 2015. [82] z. lin, y. liu, y. yao, o.j. hildreth, z. li, k. moon and c.p. wong, “superior capacitance of functionalized graphene”, the journal of physical chemistry c, vol. 115, pp. 7120–7125, 2011. [83] s. park and r.s. ruoff, “chemical methods for the production of graphenes”, nat. nanotech., vol. 4, pp. 217–224, 2009. [84] p. simon and y. gogotsi, “materials for electrochemical capacitors”, nat. mater., vol. 7, pp. 845–854, 2008. [85] a.v. murugan, t. muraliganth and a. manthiram, “rapid, facile microwave-solvothermal synthesis of graphene nanosheets and their polyaniline nanocomposites for energy storage”, chem. mater., vol. 21, pp. 5004–5006, 2009. [86] y. zhu, m.d. stoller, w. cai, a. velamakanni, r.d. piner and d. chen, “exfoliation of graphite oxide in propylene carbonate and thermal reduction of the resulting graphene oxide platelets”, acs nano, vol. 4, pp.
1227–1233, 2010. [87] c.z. yuan, l. zhou and l.r. hou, “facile fabrication of self-supported three-dimensional porous reduced graphene oxide film for electrochemical capacitors”, mater. lett., vol. 124, pp. 253–255, 2014. [88] f. tu, s. liu, t. wu, g. jin and c. pan, “porous graphene as cathode material for lithium ion capacitor with high electrochemical performance”, power technol., vol. 253, pp. 580–583, 2014. [89] q. cheng, j. tang, j. ma, h. zhang, n. shinyaa and l. qin, “graphene and carbon nanotube composite electrodes for super capacitors with ultra-high energy density”, phys. chem. chem. phys., vol. 13, pp. 17615–17624, 2011. [90] c. li, x. zhang, k. wang, h. zhang, x. sun and y. ma, “three dimensional graphene networks for supercapacitor electrode materials”, new carbon materials, vol. 30, no. 3, pp. 193–206, 2015. [91] m. chen, h. wang, l. li, z. zhang, c. wang, y. liu, w. wang, j. gao, “novel and facile method, dynamic self-assemble, to prepare sno2/rgo droplet aerogel with complex morphologies and their application in supercapacitors”, acs appl. mater. interfaces, vol. 6, pp. 14327–14337, 2014. [92] h. wang, h. yi, x. chen and x. wang, “one-step strategy to three-dimensional graphene/ vo2 nanobelt composite hydrogels for high performance supercapacitors”, j. mater. chem. a, vol. 2, pp. 1165–1173, 2014. [93] y. he, w. chen, x. li, z. zhang, j. fu, c. zhao and e. xie, “freestanding three-dimensional graphene/mno2 composite networks as ultralight and flexible supercapacitor electrodes”, vol. 7, pp. 174–182, 2013. [94] j. ge, h.b. yao, w. hu, x.f. yu, y.x. yan, l.b. mao, h.h. li, s.s. li and s.h. yu, “facile dip coating processed graphene/mno2 nanostructured sponges as high performance supercapacitor electrodes”, nano energy, vol. 2, pp. 505–513, 2013. [95] c.c. wang, h.c. chen and s.y. lu, “manganese oxide/graphene aerogel composites as an outstanding supercapacitor electrode material”, chem eur, vol. 20, pp. 517–523, 2014. [96] s. wu, w. chen and l. 
yan, “fabrication of a 3d mno2/graphene hydrogel for high-performance asymmetric supercapacitors”, mater. chem. a, vol. 2, pp. 2765–2772, 2014. [97] t. zhai, f. wang, m. yu, s. xie, c. liang, c. li, f. xiao, r. tang, q. wu, x. lu and y. tong, “3d mno2-graphene composites with large areal capacitance for high-performance asymmetric supercapacitors”, nanoscale, vol. 7, pp. 6790–6796, 2013. [98] w. wang, s. guo, i. lee, k. ahmed, j. zhong, z. favors, f. zaera, m. ozkan and c.s. ozkan, “hydrous ruthenium oxide nanoparticles anchored to graphene and carbon nanotube hybrid foam for supercapacitors”, sci rep, vol. 4, pp. 4452–4461, 2014. [99] j. yuan, j. zhu, h. bi, x. meng, s. liang, l. zhang and x. wang, “graphene-based 3d composite hydrogel by anchoring co3o4 nanoparticles with enhanced electrochemical properties”, phys. chem. chem. phys., vol. 15, pp. 12940–12945, 2013. [100] u.m. patil, s.c. lee, j.s. sohn, s.b. kulkarni, k.v. gurav, j.h. kim, j.h. kim, s. lee and s.c. jun, “enhanced symmetric supercapacitive performance of co(oh)2 nanorods decorated conducting porous graphene foam electrodes”, electrochimica acta, vol. 129, pp. 334–342, 2014. [101] y. xu, x. huang, z. lin, x. zhong, y. huang and x. duan, “one-step strategy to graphene/ni(oh)2 composite hydrogels as advanced three-dimensional supercapacitor electrode materials”, nano research, vol. 6, pp. 65–76, 2013. https://www.sciencedirect.com/science/journal/00134686 https://www.sciencedirect.com/science/journal/00134686/129/supp/c 376 h. kardanmoghaddam, m. maraki, a. rajaei [102] j. ji, l.l. zhang, h. ji, y. li, x. zhao, x. bai, x. fan, f. zhang, r.s. ruoff, “nanoporous ni(oh)2 thin film on 3d ultrathin-graphite foam for asymmetric supercapacitor”, acs nano, vol. 7, pp. 6237–6243, 2013. [103] c. jiang, b. zhao, j. cheng, j. li, h. zhang, z. tang and j. yang, “hydrothermal synthesis of ni(oh)2 nanoflakes on 3d graphene foam for high-performance supercapacitors", electrochimica acta, vol. 173, pp. 399–407, 2015. 
[104] x. cao, y. shi, w. shi, g. lu, x. huang, q. yan, q. zhang, and h. zhang, “preparation of novel 3d graphene networks for supercapacitor applications”, nano micro small, vol. 7, no. 22, pp. 3163– 3168, 2011. [105] h. wang, z. xu, h. yi, h. wei, z. guo and x. wang, “one-step preparation of single-crystalline fe2o3 particles/graphene composite hydrogels as high performance anode materials for supercapacitors”, nano energy, vol. 7, pp. 86–96, 2014. [106] w. tian, q. gao, y. tan, y. zhang, j. xu, z. li, k. yang, l. zhu, and z. liu, “three-dimensional functionalized graphenes with systematical control over the interconnected pores and surface functional groups for high energy performance supercapacitors”, carbon, vol. 85, pp. 351–362, 2015. [107] j. hao, y. liao, y. zhong, d. shu, c. he, s. guo, y. huang, j. zhong and l. hu, “three-dimensional graphene layers prepared by a gas-foaming method for supercapacitor applications”, carbon, vol. 94, pp. 879–887, 2015. [108] s. liu, j. wu, j. zhou, g. fang and s. liang, “mesoporous nico2o4 nanoneedles grown on three dimensional graphene networks as binder-free electrode for high-performance lithium-ion batteries and supercapacitors”, electrochimica acta, vol. 176, pp. 1–9, 2015. [109] j. liu, w. lv, w. wei, c. zhang, z. li, b. li, f. kangb and q.h. yang, “a three-dimensional graphene skeleton as a fast electron and ion transport network for electrochemical application”, j. mater. chem. a, vol. 2, pp. 3031–3037, 2014. [110] m. kota, x. yu, s.h. yeon, h.w. cheong and h.s. park, “ice-templated three dimensional nitrogen doped graphene for enhanced supercapacitor performance”, power sources, vol. 303, pp. 372–378, 2016. [111] t. echtermeyer, l. britnell, p. jasnos, a. lombardo, r. gorbachev, a. grigorenko, a. geim, a. ferrari and k. novoselov, “strong plasmonic enhancement of photovoltage in grapheme”, nat.commun,. vol. 2, no.1, pp. 458–468 ,2011. [112] s. thongrattanasiri, f. koppens and f. 
abajo, “total light absorption in graphene”, phys.rev,. lett., vol. 108, 2012. [113] b. zhao, j m. zhao and z.m. zhang, “enhancement of near-infrared absorption in graphene with metal gratings”, appl.phys. lett., vol. 105, pp. 031905, 2014. [114] x. zhu, l. shi, m. schmidt, a. boisen, o. hansen, j. zi, s. xiao and n. mortensen, “enhanced lightmatter interactions in graphene covered gold nanovoid arrays”, nnano. lett, vol. 13, pp. 4690, 2013. [115] m.h. mahdabinezhad, m. pourmahyabadi, “design of graphene based photodetector with high absorption and responsivity”, in proceedings of the 23 rd iranian conference on optics and photonics and 9 th conference on photonics engineering and technology tarbiat modares university, tehran, iran, 2017. [116] b. chitara, l.s. panchakarla, s.b. krupanidhi and c.n.r. rao, “infrared photodetectors based on reduced graphene oxide and graphene nanoribbons”, advanced materials, vol. 23, pp. 5419–5424, 2011. [117] z. cheng, j. wang, k. xu, h.k. tsang and c. shu, “graphene on silicon on-sapphire waveguide photodetectors”, in proceedings of the conference on laser and electro-optic, 2015. [118] i. wang, z. cheng, z. chen, x. wan , b.q. zhu and h. ki tsang, “high responsivity graphene-on silicon slotwaveguide photodetectors”, nanoscale, vol. 8, no. 27, pp. 13206-11, 2016. [119] l. goykhman, u. sassi, b. desiatov, n. mazurski, “on chipintegrated silicongraphene plasmonic schottky photodetector with high responsivity andavalanche photogain”, nano letters, vol. 16, no.5, pp. 3005–3013, 2016. [120] m.h. mahdabi nezhad and m. pourmahyabadi, “design of graphene based photodetector with high absorption and responsivity”, in proceedings of the 23 rd iranian conference on optics and photonics and 9 th conference on photonics engineering and technology tarbiat modares university, tehran, iran, 2017. [121] h. zhu, j. wei, k. wang and d. wu, “applications of carbon materials in photovoltaic solar cells”, j. solar energy materials & solar cells , vol. 
93, pp. 1461–1470, 2009. [122] h. choi, h. kim, s. hwang, w. choi and m. jeon, “dye-sensitized solar cells using graphene-based carbon nano composite as counter electrode”, solar energy materials & solar cells, vol. 95, pp. 323–325, 2011. graphene-reinforced polymeric nanocomposites in computer and electronics industries 377 [123] y.j. jeon, j.m. yun, d.y. kim, s.i. naa and s.s. kim, “high-performance polymer solar cells with moderately reduced graphene oxide as an efficient hole transporting layer, solar energy mater”, sol. cells, vol. 105, pp. 96–102, 2012. [124] x. zhengguo, y. yuan, b. yang, j. vanderslice, j. chen, o.d. gerd duscher and j. huang, “universal formation of compositionally graded bulk heterojunction for efficiency enhancement in organic photovoltaics”, advanced materials, vol. 26, no.19, pp. 3068–3075, 2014. [125] a. r. madaria and a. kumar, “large scale, highly conductive and patterned transparent films of silver nanowires on arbitrary substrates their application in touch screens”, nanotechnology, vol. 22, pp. 245201–245208, 2011. [126] w. cai and y. zhu, “large area few-layer graphene/graphite films as transparent thin conducting electrodes”, applied physics letters, vol. 95, pp. 123115–123118, 2009. [127] m.j. large, s.p. ogilvie, s. alomairy, t. vöckerodt, d. myles, m. cann, h. chan, i. jurewicz, a.a. k. king and a.b. dalton, “selective mechanical transfer deposition of langmuir graphene films for highperformance silver nanowire hybrid electrodes”, langmuir, vol. 33, no. 43, pp. 12038–12045, 2017. [128] y.d. kim, h. kim, y. cho, j.h. ryoo, c.h. park, p. kim, y.s. kim, s. lee, y. li, s.n. park, y.s. yoo, d. yoon d, dorgan ve, pop e, heinz tf, hone j, chun sh, cheong h, lee sw, bae mh, park yd, “bright visible light emission from grapheme”, nat nanotechnol, vol. 10, no. 8, pp. 676–681, 2015. [129] y.d. kim, y. gao, r.j. shiue, l. wang, o.b. aslan, m.h. bae, h. kim, d. seo , h.j, choi, s.h. kim, a. nemilentsau, t. low, c. tan, d.k. efetov, t. 
taniguchi, k. watanabe, k.l. shepard, t.f. hein , d. englund and j. hone, "ultrafast graphene light emitters", nano lett. vol. 18, no. 2, pp.934-940, 2018. [130] s. zhou, k. chen, m.t. cole, z. li, j. chen, c. li and q. dai, “ultrafast field-emission electron sources based on nanomaterials”, adv mater., vol. 31, no. 45, 2019. [131] h. diker, g.b. durmaz, h. bozkurt, f. yeşil and c. varlikli, “controlling the distribution of oxygen functionalities on go and utilization of pedot:pss-go composite as hole injection layer of a solution processed blue oled”, curr. appl.phys., vol. 17, pp. 565–572, 2017. [132] d. todorović, a. matković, m. milićević, đ. jovanović, r. gajić, i. salom, m. spasenović, “multilayer graphene condenser microphone”, 2d materials, vol. 2, no. 4, 2015. [133] q. zhou, j. zheng, s. onishi, m.f. crommie and a.k. zettl, “graphene electrostatic microphone and ultrasonic radio”, in proceedings of the national academy of sciences, 2015, pp. 8942-6. [134] s.t. woo, j.h. han, j.h. lee, s. cho, k.w. seong, m. choi and j.h. cho, “realization of a high sensitivity microphone for a hearing aid using a graphene–pmma laminated diaphragm”, acs applied materials & interfaces, vol. 9, no. 2, pp. 1237–1246 , 2017. [135] r. z.h.m. auliya, m.a. md ali, m.s. rusdi, “graphene mems capacitive microphone: highlight and future perspective”, scientific journal of ppi-ukm, vol. 3, no. 4, pp. 187–191, 2016. [136] j. wu, m. agrawal, h.a. becerril, z. bao, z. liu, y. chen, and p. peumans, “organic light-emitting diodes on solution-processed graphene transparent electrodes”, acs nano, vol. 4, no.1, pp. 43–48, 2010. [137] j. lee, t.h. han, m.h. park, d.y. jung, j. seo, h.k. seo, h. cho, e. kim, j. chung, s.y. choi, t.s. kim, t.w. kim and s. yoo, “synergetic electrode architecture for efficient graphene-based flexible organic light-emitting diodes”, nat. commun., vol. 7, no. 11791, 2016. [138] j. kim, r. kumar, a.j. bandodkar and j. 
wang, “advanced materials for printed wearable electrochemical devices: a review”, advanced electronic materials, vol. 3, no. 1, 2017. [139] k. arapov, e. rubingh, r. abbel, j. laven, g. de with and h. friedrich, “conductive screen printing inks by gelation of graphene dispersions”, advanced functional materials, vol. 26, no. 4, pp. 586– 593, 2016. [140] w.j. hyun, e.b. secor, m.c. hersam, c.d. frisbie and l.f. francis, “high-resolution patterning of graphene by screen printing with a silicon stencil for highly flexible printed electronics”, advanced materials, vol. 27, no.1, pp. 109–115, 2015. [141] j.r. windmiller and j. wang, "wearable electrochemical sensors and biosensors: a review", electroanalysis, vol. 25, no. 1, pp. 29–46, 2013. [142] s. majee, m. song, s.l. zhang and z.b. zhang, “scalable inkjet printing of shear-exfoliated graphene transparent conductive films”, carbon, vol. 102, pp. 51–57, 2016. [143] j. ren, c. wang, x. zhang, t. carey, k. chen, y. yin and f. torrisi, “environmentally-friendly conductive cotton fabric as flexible strain sensor based on hot press reduced graphene oxide”, carbon, vol. 111, pp. 622–630, 2017. 
instruction facta universitatis series: electronics and energetics vol. 30, no. 4, december 2017, pp. 585–597 doi: 10.2298/fuee1704585b
spectral parameters for finger tapping quantification*
vladislava n. bobić¹, milica d. djurić-jovičić², nathanael jarrasse³, milica ječmenica-lukić⁴, igor n. petrović⁴, saša m. radovanović⁵, nataša dragašević⁴, vladimir s.
kostić⁴
¹school of electrical engineering, university of belgrade, serbia
²innovation center of school of electrical engineering, university of belgrade, serbia
³institut des systèmes intelligents et de robotique, université pierre et marie curie, paris, france
⁴neurology clinic, clinical center of serbia, medical faculty, university of belgrade, serbia
⁵institute for medical research, university of belgrade, serbia

abstract. a miniature inertial sensor placed on the tip of the index finger during the finger tapping test can be used for objective quantification of the finger tapping motion. temporal and spatial parameters such as cadence, tapping duration, and tapping angle can be extracted for detailed analysis. however, these parameters, although intuitive and simple to interpret, do not always provide all the necessary information about the subject's motor performance. analysis of the frequency content of the finger tapping movement can provide crucial information about the patient's condition. in this paper, we present parameters extracted from spectral analysis that we found to be significant for finger tapping assessment. with these parameters, the intra-variability of tapping, movement smoothness and anomalies that may occur within the tapping performance can be detected and described, providing significant information for further diagnostics and for monitoring the progress of the disease or the response to therapy.

key words: frequency analysis, finger tapping, parkinson's disease.

1. introduction

patients with parkinson's disease (pd) exhibit severe motor problems; therefore, objective assessment of their movements is crucially important for diagnostics and for evaluating the progress of the disease. frequency analysis is widely used for such assessment of parkinsonian patients.
received november 29, 2016; received in revised form march 29, 2017
corresponding author: vladislava n. bobić, school of electrical engineering, university of belgrade, kralja aleksandra blvd. 73, 11120 belgrade, serbia (e-mail: vladislava.bobic@yahoo.com)
* an earlier version of this paper received the best section paper award at the 3rd international conference on electrical, electronic and computing engineering, icetran 2016, zlatibor, serbia, june 13–16, 2016 [1].

some usual frequency-derived measures obtained from the fast fourier transform (fft), such as amplitude, median power frequency, power dispersion, and power percentage within the 4–7 hz frequency range, were used for quantification of hand tremor [2]. a body-area inertial sensing system and signal processing based on filter-bank analysis and cross correlation were used for the interpretation of tremor frequency and energy [3]. one study proposed a new technique for tremor detection from gyro data [4] that combines empirical mode decomposition and the hilbert spectrum, introducing the concept of instantaneous frequency in the field of tremor. frequency-derived measures were extracted from the results of welch's averaged modified periodogram method of spectral estimation applied to acceleration data and used for assessment of stride-to-stride variability in pd patients and healthy controls in real-life settings [5]. the authors defined four parameters for the main peak of the power spectral density function: its frequency, its amplitude, its width at half amplitude, and the slope from the point of the peak's maximum to the point of half of the peak's amplitude. body motion of pd patients was also assessed by using a maximum-likelihood-estimator-based fractal analysis method for triaxial accelerometer data [6]. freezing of gait in patients with pd was quantified from the power spectral density of the shank acceleration [7].
researchers defined a new index, named the frequency ratio, as the square of the total power in the 3–8 hz band divided by the square of the total power in the 0.5–3 hz band. results showed that this parameter differentiates between patients better than traditional spatial gait measures.

although spectral components hidden in the performed movement can indicate motor impairment [8], fourier analysis is not the most effective tool for the analysis of transient behavior or discontinuities that are typical of human movement. in such cases, time-frequency algorithms can provide a detailed analysis of the signal's frequency content over time, allowing detection of localized features at specific time instants. the time-frequency algorithms short-time fourier transform (stft) and wavelet transform (wt) have already been used in many studies in the field of human movement [9]-[11]. detection of transient episodes and tripping in inertial data can be performed with both the stft and the discrete wavelet transform [12]. however, wavelets proved to be superior at describing anomalies, pulses and other transient events that start and stop within a movement signal [13]. parameters expressing the main frequencies, pattern decrement and activity volume of the basic finger tapping rhythm, as well as the vigor of the performed movements, were extracted from the coefficients of the continuous wavelet transform applied to gyro signals, providing classification between pd patients and healthy subjects [14].

neurological disorders, including parkinson's disease [15], can affect the smoothness of the patient's motor performance. because of that, an objective measure of movement smoothness can be a very important part of the assessment of the patient's motor abilities. it was shown that frequency analysis can provide information about movement smoothness by analyzing the spectral arc length (sparc) [16].
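To make the spectral arc length idea concrete, the following is a minimal Python sketch of one commonly published formulation of SPARC: the negative arc length of the amplitude-normalized Fourier magnitude spectrum of a movement speed profile, taken over an adaptively chosen frequency band. The function name `sparc`, the zero-padding level, the 5% amplitude threshold and the 20 Hz upper limit are illustrative assumptions; they are not taken from this paper and are not guaranteed to match the settings used in [16].

```python
import numpy as np

def sparc(speed, fs, pad_level=4, fc_max=20.0, amp_threshold=0.05):
    """Spectral arc length of a speed profile (sketch, assumed parameters).

    speed         : 1-D array with the movement speed profile
    fs            : sampling frequency in Hz
    pad_level     : extra powers of two used for zero padding (assumption)
    fc_max        : upper limit of the analysed band in Hz (assumption)
    amp_threshold : normalized amplitude below which the spectrum is
                    treated as negligible (assumption)
    """
    speed = np.asarray(speed, dtype=float)

    # Zero-padded FFT gives a finely sampled magnitude spectrum.
    nfft = int(2 ** (np.ceil(np.log2(speed.size)) + pad_level))
    freqs = np.arange(nfft) * fs / nfft
    mag = np.abs(np.fft.fft(speed, nfft))
    mag = mag / mag.max()  # amplitude normalization

    # Keep only the band up to fc_max.
    keep = freqs <= fc_max
    f_sel, m_sel = freqs[keep], mag[keep]

    # Adaptive cut-off: from the first to the last bin above the threshold.
    above = np.nonzero(m_sel >= amp_threshold)[0]
    f_sel = f_sel[above[0]:above[-1] + 1]
    m_sel = m_sel[above[0]:above[-1] + 1]

    # Arc length of the curve (normalized frequency, normalized magnitude).
    df = np.diff(f_sel) / (f_sel[-1] - f_sel[0])
    dm = np.diff(m_sel)
    return -np.sum(np.sqrt(df ** 2 + dm ** 2))


if __name__ == "__main__":
    fs = 200.0
    t = np.arange(400) / fs
    # One bell-shaped movement versus two separate sub-movements.
    smooth = np.exp(-(t - 1.0) ** 2 / (2 * 0.1 ** 2))
    jerky = (np.exp(-(t - 0.7) ** 2 / (2 * 0.1 ** 2))
             + np.exp(-(t - 1.3) ** 2 / (2 * 0.1 ** 2)))
    print("SPARC smooth:", sparc(smooth, fs))
    print("SPARC jerky: ", sparc(jerky, fs))
```

Values closer to zero indicate a smoother movement. A profile composed of several sub-movements produces ripples in the magnitude spectrum, which lengthen the arc and make the value more negative.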
repetitive finger tapping is one of the descriptive characteristics of the patient's motor ability that is included in the unified parkinson's disease rating scale (updrs test, e.g., fahn et al., 1987 [17]). in clinical practice, finger tapping performance is often evaluated visually, which results in a low diagnostic resolution [18]. however, with appropriate instrumentation, such as miniature inertial sensors, finger tapping performance can be quantified, allowing objective assessment of specific characteristics or changes in the finger tapping pattern over time [19]-[20].

our goal is to offer a new method for the objective quantification of finger tapping performance, which is regularly used for assessment and visually estimated by physicians. we suggest a set of frequency-derived parameters that can provide an assessment of the rhythmic behavior of tapping, the vigor of its performance, intra-variability, tremor and motor blocks. in this way, a quantitative assessment of repetitive finger tapping performance can be obtained, providing support in monitoring the patient's condition and response to therapy, as well as in the differential diagnostics of parkinsonism.

2. methods and materials

instrumentation

the instrumentation includes an inertial sensor unit comprising a 3d gyroscope l3g4200 (stmicroelectronics, usa) [21]. in our system, the small (10x12 mm) and lightweight (3 g) sensor is placed on the fingertip of the subject's index finger (fig. 1). the sensor is connected to its sensor control unit (scu), positioned on the forearm, by a thin, light, flexible and loose cable. the designed instrumentation and mounting concept ensure that the movement path and range are not hindered in any way. different technical and mounting solutions (sensor gloves, wireless sensors) were also considered; however, all of them showed certain shortcomings in terms of size, weight (e.g.
having a wireless sensor on the fingertip requires a mounted battery, which increases the size and weight), limited performance and tactility (gloves), as well as hygiene and price. the signals are collected by the scu and wirelessly transmitted to a remote computer. a custom-made, user-friendly graphical interface developed in cvi (cvi 9.0, ni labwindows, usa) controls data acquisition and storage and provides export in ascii comma-separated value (csv) format for further analysis.

fig. 1 system setup: sensor (s) positioned on the fingertip and connected to the sensor control unit (scu) mounted on the subject's hand.

experiments

twenty patients with parkinson's disease (age: 61.39±9.7) and twelve age- and gender-matched controls (age: 56.53±9.13) were enrolled in this study. during the test, subjects sat comfortably in a chair with their hand placed in front of them. as part of the test, they repeatedly tapped the index finger and thumb as rapidly and as widely as possible for 15 s, as described in [19]. each recording began and ended with the fingers closed in the "zero posture". for each subject, three trials per affected hand were recorded. a resting period of one minute was given between trials, because fatigue may compromise the performance.

the study was performed at the neurology clinic, clinical centre of serbia, belgrade, in accordance with the ethical standards of the declaration of helsinki. all participants gave informed written consent prior to participation in the study.

signal processing

angular velocity was recorded using digital gyroscopes at a sampling frequency fs = 200 hz, calibrated, and directly processed by a custom-made matlab script (matlab 7.6.0, r2008a). examples of recorded signals for one healthy control (ctrl) and two pd patients are presented in fig. 2.

fig. 2 examples of recorded gyro signals for two pd patients and one ctrl subject.
firstly, tapping performance was described with parameters typically used for tapping description [19]:
• duration of the taps tt, expressed in seconds,
• tapping cadence ct, expressing the number of taps in the observed 15 s long sequence,
• angle that the index finger forms relative to the “zero posture” of the fingers αt, expressed in degrees.

additionally, the continuous wavelet transform (cwt), welch's averaged modified periodogram method of spectral estimation and the spectral arc length method (sparc) [16] were applied to the observed 15 s long sequences of the signal. the methods were applied over the frequency range between 0.01 and 20 hz (frequency increment 0.01 hz), covering the complete possible spectral content of finger tapping.

continuous wavelet transformation

the continuous wavelet transformation based on the fft algorithm was applied to the 15 s long sequences of the gyro signal. for this application, we used a mother wavelet from the complex morlet wavelet family, with center frequency f0=1 hz and time-frequency resolution σ=0.7. the fourier transform of the wavelet function was computed for each scale (the reciprocal of each frequency in the defined band 0-20 hz) and multiplied by the representation of the gyro signal in the frequency domain. complex cwt coefficients were obtained using the inverse fourier transform and then normalized with the weighting function, i.e., by dividing the coefficients by the square root of the scale. the final result is obtained in the form of a matrix with the same time resolution ∆t=5 ms (∆t=1/fs=1/200 hz) as the original gyro signal (no additional interpolation or downsampling was performed). examples of the obtained cwt coefficients, presented as a 3d scalogram, are shown in fig. 3. the scalogram is a color-coded illustration of the wavelet coefficients.
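The FFT-based procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' MATLAB implementation: it assumes NumPy is available, takes the scale as f0/f, and models the complex Morlet wavelet as a Gaussian of width σ in time, so that its spectrum is a Gaussian centred on f0 (constant factors are dropped). The function name `cwt_morlet_fft` and its arguments are hypothetical.

```python
import numpy as np


def cwt_morlet_fft(x, fs, freqs, f0=1.0, sigma=0.7):
    """FFT-based CWT sketch with a complex Morlet-type wavelet.

    Assumptions (not taken from the paper): scale = f0 / f, and the
    mother wavelet spectrum is exp(-2 * (pi * sigma)**2 * (f - f0)**2).
    Returns a complex matrix with one row per frequency and one column
    per input sample.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    x_hat = np.fft.fft(x)                    # signal in the frequency domain
    f_axis = np.fft.fftfreq(n, d=1.0 / fs)   # FFT bin frequencies in Hz
    coeffs = np.empty((len(freqs), n), dtype=complex)
    for i, f in enumerate(freqs):
        scale = f0 / f
        # Spectrum of the wavelet dilated by `scale` (includes the factor `scale`).
        wavelet_hat = scale * np.exp(
            -2.0 * (np.pi * sigma) ** 2 * (scale * f_axis - f0) ** 2
        )
        # Multiply in the frequency domain, go back to time,
        # then divide by the square root of the scale.
        coeffs[i] = np.fft.ifft(x_hat * wavelet_hat) / np.sqrt(scale)
    return coeffs
```

For a 15 s recording sampled at 200 hz the returned matrix has 3000 columns, i.e. the same 5 ms time resolution as the input signal, and `np.abs(coeffs)` gives the amplitudes that a scalogram would display.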
for this application, we used the jet colormap, where small amplitudes are represented with cold color tones (starting from navy blue), whereas warmer colors (ending with dark red) follow the increase in amplitude.

fig. 3 3d representation of cwt coefficients. an example is given for patient pd1.

in order to observe temporal changes in tapping activity, we defined the cross-sectional area perpendicular to the t-axis (csa-ttot) [14]. csa-ttot was calculated by summing the absolute values of the cwt coefficients and was finally expressed as a percentage of the maximum energy of the csa-ttot characteristic. by introducing two thresholds at 50 and 25% (light and dark dashed grey lines in fig. 4, respectively), we found the signal parts where tapping performance was compromised, causing energy loss below the two defined levels.

fig. 4 representative example of the csa-ttot [%] distribution for one pd patient. light and dark dashed grey lines mark the two defined thresholds at 50 and 25%, whereas dashed blue and solid red rectangles outline signal parts with energy loss below the defined levels (50 and 25%, respectively).

in this way, tapping performance can be described with regard to disturbances of its basic rhythmic behavior, e.g., motor blocks. we introduced two parameters representing the duration of the detected anomalies, expressed in seconds (cwt<50 and cwt<25, respectively).

welch's method of spectral estimation

power spectral density was calculated with welch’s method of spectral estimation. for this application, a window size of 800 samples and a 50% overlap between windows were used. the fft length was twice the next power of 2 above the signal length. for each subject, we extracted four parameters for the main peak, i.e., the dominant harmonic of the obtained power spectral density function (fig.
5) [5]:
• the frequency of the peak, f;
• the amplitude of the peak, h;
• the width of the peak at half of its amplitude, w (the red lines in fig. 5);
• the slope of the peak, calculated from the point at half of the peak’s amplitude to the peak’s maximum point, s (the blue lines in fig. 5).

fig. 5 representation of the power spectral density function. the blue line marks the slope of the peak, whereas the red line shows the width of the peak at half of its amplitude. examples are given for one ctrl subject (top panel) and two pd patients (middle and bottom panels).

sparc method for assessment of tapping smoothness

the spectral arc length method is used to assess the smoothness of signals describing any rhythmic sensorimotor behavior [22]-[24]. the sparc method applied here is the modified spectral arc length method defined in [16]. it represents signal smoothness as a single scalar, obtained by calculating the arc length of the fourier spectrum of a given velocity within the defined frequency range. the final value of this parameter was expressed as the negative logarithm of the calculated arc length. larger values correspond to greater smoothness. smoothness was calculated for the upward trend of the taps, because it corresponds partially to both opening and closing but does not include the moment when the fingers are closed, which may cause changes in the signal and thus introduce error. the procedure was repeated for all taps, which had previously been segmented. for each subject we calculated the total measure of tapping smoothness, expressed as descriptive statistics (average ± std.dev), and the trend of change in smoothness across all segmented taps, represented by the slope of the linear regression line fitted to the corresponding smoothness characteristic (the red dashed line in fig. 6).

fig.
6 sparc smoothness characteristic with the corresponding slope (red dashed line) for one ctrl subject (top panel) and two pd patients (middle and bottom panels). the dashed blue rectangle marks a detected change in movement smoothness.

statistical analysis

the two groups were compared using the t-test for two independent samples (if both groups were normally distributed) or the wilcoxon-mann-whitney test (if the distributions were not normal). two-tailed tests were used, and differences were considered statistically significant when p<0.05. statistical analysis was performed in spss v17.0 (chicago, il).

3. results

the examples of recorded gyro signals (fig. 2) show that the healthy subject had a rapid and vigorous performance. patient pd1 performed even more rapidly, but less vigorously, less rhythmically and with noticeable amplitude changes within the signal, as a consequence of a motor block that occurred during the performance. patient pd2, on the other hand, had a slower and non-smooth but more rhythmical tapping performance. the results for all participants, given as descriptive statistics (average ± std.dev) of the parameters expressing tap duration, tapping cadence and angle, together with the statistical differences between the two groups, are summarized in table 1. the distributions of the introduced parameters are shown in fig. 7. although those parameters show statistically significant differences between the groups (the grey shaded cells in table 1), they cannot provide information about changes in tapping shape or the appearance of specific transient events, and therefore they are not suitable for the detection or description of such noticeable characteristics of tapping performance. for this reason, the evaluation of the tapping pattern needs to be supplemented with frequency analysis of the gyro data.
table 1 descriptive statistics of finger tapping duration, cadence and angle for both ctrl and pd subjects

param. | ctrl (av±std) | pd (av±std) | p-value
tt [s] | 0.32 ± 0.07 | 0.65 ± 0.41 | 0.001
ct [taps per 15 s] | 49.00 ± 13.02 | 30.40 ± 17.22 | 0.001
αt [°] | 61.88 ± 18.18 | 39.53 ± 18.74 | 0.024

in order to provide a complete analysis of the tapping data, we applied cwt, sparc and welch's method of spectral estimation to the 15 s long sequences of the signal. the continuous wavelet transformation has an important role in the detection and localization of anomalies that may appear within a movement signal. patient pd1 had some changes in the tapping motion that are obvious from the raw gyro signal (marked with the solid red rectangle in fig. 4). with the cwt method, this disturbance can be described in terms of its degradation level (below 25% of the maximum performing energy) and duration. in addition, the suggested technique allowed detection of another, less noticeable tapping "anomaly" (marked with the dashed blue rectangle, around 12 s), which could otherwise be left unnoticed. by combining the csa-ttot function with a color-coded representation of the cwt coefficients such as the 3d scalogram (fig. 3), clinicians can assess anomalies in tapping performance, localize them in time and evaluate the duration and severity of those disturbances.

the parameters extracted with welch’s algorithm of spectral estimation allow tap-to-tap variability to be assessed. the sparc algorithm allowed calculation of tapping smoothness and its decrement in time. the combined frequency analysis of all three methods can provide clinicians with crucial information about tapping performance that can be used for further analysis or for assistance in diagnostics.
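The two duration parameters cwt<50 and cwt<25 can be illustrated with a short sketch. It assumes that csa-ttot is obtained by summing the absolute cwt coefficients over all frequencies at each time sample, which is one reading of the definition given in the methods; the function name `csa_low_energy_durations` and the layout of the coefficient matrix (rows are frequencies, columns are time samples) are hypothetical.

```python
def csa_low_energy_durations(coeffs, fs, thresholds=(50.0, 25.0)):
    """Time (in seconds) that the CSA-Ttot curve spends below each threshold.

    coeffs     -- 2D sequence of CWT coefficients, rows = frequencies,
                  columns = time samples (assumed layout)
    fs         -- sampling frequency in Hz
    thresholds -- levels in percent of the maximum of the CSA-Ttot curve
    """
    n_samples = len(coeffs[0])
    # Cross-sectional area at each time sample: sum of |coefficient| over frequency.
    csa = [sum(abs(row[k]) for row in coeffs) for k in range(n_samples)]
    peak = max(csa)
    csa_percent = [100.0 * value / peak for value in csa]
    dt = 1.0 / fs
    # Count the samples below each threshold and convert the count to seconds.
    return {
        level: sum(1 for value in csa_percent if value < level) * dt
        for level in thresholds
    }
```

With fs = 200 hz each counted sample contributes 5 ms, so a motor block lasting one second would add about 1 s to the corresponding duration parameter.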
the applied analysis is summarized in table 2, which shows descriptive statistics (average ± std.dev) of the listed frequency parameters for all subjects, as well as the statistical difference between the two groups. a statistically significant difference between pd patients and healthy subjects was found for all parameters (except the slope of sparc). in addition, for all ctrl subjects the value of the cwt<25 parameter was equal to zero, indicating that none of them had a severe energy loss below 25%, as opposed to pd patients, who showed such anomalies with durations of up to 5 s. this indicates that cwt-based evaluation is suitable for finger tapping quantification, with potential for differential diagnostics.

table 2 descriptive statistics of cwt, welch and sparc based parameters of finger tapping for both ctrl and pd subjects

param. | ctrl (av±std) | pd (av±std) | p-value
cwt<50 [s] | 1.02 ± 1.49 | 5.23 ± 3.26 | <0.001
cwt<25 [s] | 0.00 ± 0.00 | 0.94 ± 1.74 | 0.023
f [hz] | 3.47 ± 0.92 | 2.10 ± 1.21 | 0.002
h [psd] | 1.34 ± 0.29 | 1.14 ± 0.39 | 0.039
s [psd/hz] | 3.42 ± 0.70 | 2.90 ± 1.09 | 0.042
w [hz] | 0.39 ± 0.04 | 0.42 ± 0.07 | 0.041
sparc | -3.13 ± 0.13 | -3.69 ± 0.70 | 0.001
sparcs | -0.0005 ± 0.003 | -0.03 ± 0.05 | 0.373

the distributions of cwt<50 and the four psd-based parameters for the two groups of subjects (ctrl and pd) are shown in fig. 7. the sparc smoothness parameter distributions are presented for 10 randomly selected healthy subjects and 10 pd patients with different patterns of tapping performance, in the form of a boxplot in the bottom panel of fig. 7. the results of the sparc analysis show that healthy subjects have small intra- and inter-subject variability of tapping smoothness. patients with pd, on the other hand, have a wider range of the sparc index within their tapping patterns (intra-variability) as well as within the group (inter-variability).
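The per-subject smoothness summary (average ± std.dev and the slope of the regression line across taps) can be sketched as follows. The sketch assumes that a smoothness value has already been computed for every segmented tap; the function name `smoothness_summary` is hypothetical, and the use of the sample standard deviation and of the tap index as the regression variable are assumptions, since the text does not specify them.

```python
import statistics


def smoothness_summary(per_tap_smoothness):
    """Return (mean, std, slope) for a sequence of per-tap smoothness values.

    The slope is the ordinary least-squares slope of smoothness against
    the tap index 0, 1, 2, ... (assumed regression variable).
    """
    values = list(per_tap_smoothness)
    if len(values) < 2:
        raise ValueError("at least two taps are needed")
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # sample standard deviation (assumption)
    taps = range(len(values))
    tap_mean = statistics.mean(taps)
    sxx = sum((k - tap_mean) ** 2 for k in taps)
    sxy = sum((k - tap_mean) * (v - mean) for k, v in zip(taps, values))
    return mean, std, sxy / sxx
```

A negative slope corresponds to smoothness that decreases over the course of the 15 s sequence.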
this finding indicates that the sparc parameter is suitable for the analysis of tapping performance and has potential for differential diagnostics.

fig. 7 boxplot representation of all listed parameters for both ctrl subjects and pd patients.

4. discussion and conclusion

tapping performance can be described with temporal and spatial parameters, namely tapping duration, cadence and the angle between the fingers at maximum opening. although these characteristics of tapping performance can be used to distinguish between healthy individuals and patients (table 1), they are not suitable for a detailed analysis of changes that may occur within tapping performance, or of movement variability and smoothness. therefore, the analysis should be supplemented with other techniques that can provide such an evaluation of tapping performance. in this paper, three frequency-based methods were applied to the gyro signal acquired from one miniature sensor mounted on the subject’s index finger, and the results of these techniques were used for the quantification of finger tapping performance. with the continuous wavelet transform, the frequency content of the signal can be observed over time (fig. 3) and also analyzed in terms of energy changes, which can be useful for anomaly detection (the solid red rectangle in fig. 4). the two cwt-based parameters expressing the duration of energy loss below 50% and 25% proved to be statistically different between the groups (the grey shaded cells in table 2). in previous research studies, a smaller slope and a larger width of the dominant frequency within welch’s power spectral density function were defined as indicators of greater signal intra-variability.
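The slope and width of the dominant peak referred to here can be extracted from an already estimated power spectral density with a short routine. The sketch below is an illustration under assumptions: the psd is given as two equally long sequences of frequencies and values, the half-amplitude points are located by linear interpolation, and the slope is taken on the rising (low-frequency) side of the peak. The function name `peak_parameters` is hypothetical.

```python
def peak_parameters(freqs, psd):
    """Return (f, h, w, s) for the highest peak of a PSD.

    f -- frequency of the peak
    h -- amplitude of the peak
    w -- width of the peak at half of its amplitude
    s -- slope from the left half-amplitude point to the peak maximum
    """
    k = max(range(len(psd)), key=psd.__getitem__)
    f_peak, h = freqs[k], psd[k]
    half = h / 2.0

    def half_crossing(step):
        # Walk away from the peak while the PSD stays above half amplitude.
        i = k
        while 0 <= i + step < len(psd) and psd[i + step] > half:
            i += step
        j = i + step
        if not 0 <= j < len(psd):
            return freqs[i]  # spectrum ends before dropping to half amplitude
        # Linear interpolation between sample i (above half) and j (at or below).
        frac = (psd[i] - half) / (psd[i] - psd[j])
        return freqs[i] + frac * (freqs[j] - freqs[i])

    f_left, f_right = half_crossing(-1), half_crossing(+1)
    w = f_right - f_left
    s = (h - half) / (f_peak - f_left) if f_peak > f_left else float("inf")
    return f_peak, h, w, s
```

For a roughly symmetric peak the slope is close to h divided by w, so a lower and wider peak yields a smaller slope, which is the pattern reported for the pd group.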
the most prominent peak of the psd function was described with the f, h, s and w parameters, which proved to be statistically different between the two groups of subjects (the grey shaded cells in table 2). for the pd group, the smaller slope and the higher values of the width parameter compared to the ctrl group indicate prominent tapping intra-variability in pd patients. this finding agrees with the result of weiss et al., obtained on gait data [5]. the sparc-based parameter provides an assessment of movement smoothness, whereby larger values indicate smoother movements. in this paper, it was demonstrated (table 2, fig. 7) that pd patients have decreased movement smoothness, with a statistically significant difference from healthy subjects. with this method, a patient’s motion smoothness and its decrement in time can be assessed. also, the combined analysis of these methods allows detection of some changes (the dashed blue rectangle in fig. 4 and fig. 6) that are not obvious from the gyro signal and can therefore be overlooked. based on the presented analysis, finger tapping can be quantified in terms of its rhythmic behavior, the vigor of its performance, tapping intra-variability, tremor and motor blocks that can occur within the tapping performance. these methods allow monitoring of a patient’s response to therapy and of the progress of the disease, as well as comparison with other evaluated patients. in the future, the defined parameters will be complemented with additional parameters that can provide a complete assessment of the tapping movement. the designed methodology will be implemented in an automated differential diagnostic system.

acknowledgment: this work was partially supported by the serbian ministry of education, science and technological development under grant no. 175016, grant no. 175090 and grant “pavle savic” bilateral collaboration with france. we would also like to thank phd student minja belić for assisting with recordings.

references

[1] v. n. bobić, m. d. djurić-jovičić, n. jarrasse, m. ječmenica-lukić, i. n. petrović, s. m. radovanović, n. dragašević and v. s. kostić, “frequency analysis of repetitive finger tapping – extracting parameters for movement quantification”, in proceedings of the 3rd international conference on electrical, electronic and computing engineering (icetran 2016), zlatibor, serbia, june 13 – 16, 2016, pp. mei2.2 1-5.
[2] c. duval, "rest and postural tremors in patients with parkinson's disease", brain research bulletin, vol. 70, no. 1, pp. 44-48, 2006.
[3] h. c. powell, m. a. hanson and l. john, "on-body inertial sensing and signal processing for clinical assessment of tremor", ieee transactions on biomedical circuits and systems, vol. 3, no. 2, pp. 108-116, 2009.
[4] e. rocon, j. l. pons, a. o. andrade and s. j. nasuto, "application of emd as a novel technique for the study of tremor time series", in proceedings ieee eng med biol soc conf, 2006, pp. 6533-6536.
[5] a. weiss, s. sharifi, m. plotnik, j. p. van vugt, n. giladi and j. m. hausdorff, "toward automated, at-home assessment of mobility among patients with parkinson disease, using a body-worn accelerometer", neurorehabilitation and neural repair, vol. 25, no. 9, pp. 810-818, 2011.
[6] m. sekine, m. akay, t. tamura, y. higashi and t. fujimoto, "fractal dynamics of body motion in patients with parkinson's disease", journal of neural engineering, vol. 1, no. 1, pp. 8, 2008.
[7] s. t. moore, h. g. macdougall, and w. g. ondo, “ambulatory monitoring of freezing of gait in parkinson’s disease,” j. neurosci. methods, vol. 167, no. 2, pp. 340–348, 2008.
[8] i. shimoyama, t. ninchoji and k. uemura, "the finger-tapping test: a quantitative analysis", arch neurol, vol. 47, no. 6, pp. 681-684, 1990.
[9] g. strang, "wavelet transforms versus fourier transforms", bulletin of the american mathematical society, vol. 18, pp. 288–305, 1993.
[10] t. m. e. nijsen, p. j. m. cluitmans, p. a.
m. griep and r. m. aarts, ”short time fourier and wavelet transform for accelerometric detection of myoclonic seizures”, embs benelux symposium, pp. 155-158, december 7-8, 2006.
[11] a. napieralski, z. ciota, m. janicki, m. kamiński, r. kotas, p. marciniak, a. mielczarek, m. napieralska, r. ritter, b. sakowicz, w. tylman and m. zubert, “examples of medical software and hardware expert systems for dysfunction analysis and treatment”, facta universitatis, series: electronics and energetics, vol. 28, no. 1, pp. 29-50, 2014.
[12] m. a. hanson and l. john, "assessing joint time-frequency methods in the detection of dysfunctional movement", in proceedings of the fortieth asilomar conference on signals, systems and computers, 2006. acssc'06, 2006.
[13] b. xu, a. song and j. wu, "algorithm of imagined left-right hand movement classification based on wavelet transform and ar parameter model", in proceedings of the 1st int. conf. on bioinformatics and biomedical engineering, icbbe 2007, 6-8 july 2007, pp. 539-542.
[14] m. d. djuric-jovicic, v. n. bobic, m. jecmenica-lukic, i. n. petrovic, s. m. radovanovic, n. s. jovicic, v. s. kostic and m. b. popovic, "implementation of continuous wavelet transformation in repetitive finger tapping analysis for patients with pd", in proc of the 22nd telecommunications forum telfor 2014, ieee, 2014, pp. 541-544.
[15] j. jankovic and j. d. frost, "quantitative assessment of parkinsonian and essential tremor clinical application of triaxial accelerometry", neurology, vol. 31, no. 10, pp. 1235-1235, 1981.
[16] s. balasubramanian, a. melendez-calderon, a. roby-brami and e. burdet, "on the analysis of movement smoothness", journal of neuroengineering and rehabilitation, vol. 12, no. 1, pp. 1, 2015.
[17] s. fahn and r. l. elton, “unified parkinsons disease rating scale”, in: s. fahn, c. d. marsden, m. goldstein and d. b. calne, recent developments in parkinsons disease ii, committee mot ud, new york: macmillan, pp. 153-63, 1987.
[18] á. jobbágy, p.
harcos, r. karoly and g. fazekas, "analysis of finger-tapping movement", journal of neuroscience methods, vol. 141, pp. 29–39, 2005.
[19] m. djurić-jovičić, i. petrović, m. ječmenica-lukić, s. radovanović, n. dragašević-mišković, m. belić, v. miler-jerković, m. b. popović and v. s. kostić, “finger tapping analysis in patients with parkinson’s disease and atypical parkinsonism”, journal of clinical neuroscience, vol. 30, pp. 49-55, 2016.
[20] s. r. muir, r. d. jones, j. h. andreae and i. m. donaldson, "measurement and analysis of single and multiple finger tapping in normal and parkinsonian subjects", parkinsonism & related disorders, elsevier science ltd, great britain, vol. 1, no. 2, pp. 89-96, 1995.
[21] n. s. jovičić, l. v. saranovac and d. b. popović, "wireless distributed functional electrical stimulation system", journal of neuroengineering and rehabilitation, vol. 9, no. 1, pp. 1-10, 2012.
[22] s. balasubramanian, a. melendez-calderon and e. burdet, “a robust and sensitive metric for quantifying movement smoothness”, ieee transactions on biomedical engineering, vol. 59, no. 8, pp. 2126-2136, 2012.
[23] v. crocher, j. fong, m. klaic, d. oetomo and y. tan, “a tool to address movement quality outcomes of post-stroke patients”, in replace, repair, restore, relieve–bridging clinical and engineering solutions in neurorehabilitation. springer international publishing, 2014, pp. 329-339.
[24] s. estrada, m. k. o'malley, c. duran, d. schulz and j. bismuth, “on the development of objective metrics for surgical skills evaluation based on tool motion”, in proceedings of the 2014 ieee international conference on systems, man, and cybernetics. ieee, 2014, pp. 3144-3149.

instruction facta universitatis series: electronics and energetics vol. 33, no 2, june 2020, pp.
317-326 https://doi.org/10.2298/fuee2002317s
© 2020 by university of niš, serbia | creative commons license: cc by-nc-nd

a single power supply 0.1-3.5 ghz low noise amplifier design using a low cost 0.5 µm d-mode phemt process *

denis sotskov, vadim elesin, alexander kuznetsov, nikolay usachev, nikita zhidkov, alexander nikiforov

1 national research nuclear university mephi (moscow engineering physics institute), moscow, russian federation
2 specialized electronic systems, moscow, russian federation

abstract. design and testing results of a single power supply wide-band low noise amplifier (lna) based on a low cost 0.5 µm d-mode phemt process are presented. it is shown that the designed cascode lna has an operating frequency range up to 3.5 ghz, power gain above 15 db, noise figure below 2.2 db, output linearity above 17 dbm and power consumption less than 325 mw. the potential immunity of the lna to total ionizing dose and destructive single event effects exceeds 300 krad and 60 mev·cm²/mg, respectively.

key words: low noise amplifier (lna), pseudomorphic high-electron mobility transistor (phemt), cascode, radiation tolerance

1. introduction

the low noise amplifier (lna) is an important functional unit in the receiver paths of communication, radar and navigation systems. lna parameters determine the sensitivity (noise figure), power gain and input linearity (p1db) of the receiver [1-3]. nowadays mass-produced lna ics are manufactured with iii-v and silicon based technologies using the following devices: gaas depletion (d-) or enhancement (e-) mode pseudomorphic high-electron mobility transistors (phemt); gan high-electron mobility transistors (hemt); gaas and sige heterojunction bipolar transistors (hbt). the d-mode phemt process has become a good choice for lna design because of the low noise figure of the transistor, its high cut-off frequency (ft) and appropriate fabrication costs [4].
in addition, the low sensitivity of phemts to total ionizing dose (tid) and single event effects (see) makes them promising for space applications [5, 6]. a disadvantage of a d-mode phemt based lna is the negative bias supply requirement, which limits the possible field of applications [2, 7]. meanwhile, the conventional approach to single positive supply lna design on d-mode phemt is based on self-biasing [2]. the purpose of this work was the design, manufacturing and testing of an lna with a single positive supply and a 0.1-3.5 ghz operating frequency range using a low cost 0.5 µm d-mode phemt process [8].

received september 26, 2019; received in revised form december 29, 2019
corresponding author: denis sotskov, national research nuclear university mephi (moscow engineering physics institute); specialized electronic systems, moscow, russian federation
e-mail: disot@spels.ru
* an earlier version of this paper was presented at the 31st international conference on microelectronics (miel 2019), september 16-18, 2019, in niš, serbia [1].

2. lna design

wide-band (multi-octave) lnas are designed using various architectures, including distributed (traveling-wave), balanced and resistive feedback configurations [9]. resistive feedback is widely used to achieve a tradeoff among several lna performance parameters (operating frequency range, noise figure, gain, gain flatness, linearity, vswr, power consumption) [9, 10]. among the possible configurations based on resistive feedback, the cascode lna can provide not only flat gain and power over its operating band but also flat linearity in the same band and higher output impedance (better wide-band potential) [11]. therefore, a single positive power supply cascode lna based on the resistive feedback configuration and self-biasing techniques is presented in this work.
the lna was implemented in a d-mode phemt 0.5 μm process with ft up to 35 ghz and a minimum noise figure (nfmin) of 1.2 db at 8 ghz. the lna circuit schematic is shown in fig. 1.

fig. 1 simplified circuit of the lna

the amplifier is composed of cascoded input (vt2) and output (vt1) transistors with the same width of 4×150 µm. series feedback (resistor r3, capacitor c2, inductor l2) and parallel feedback (capacitor c1, resistor r1) are used to provide stability and gain flatness in a wide frequency range. it should be noted that a cascode transistor with a capacitance connected to the gate terminal forms a colpitts oscillator. a damping resistor should be added to the gate of the cascode transistor to decrease the quality factor of the parasitic resonator and thus improve amplifier stability [12]. the lna is designed for a single positive supply voltage of 5 v. resistor r2, the resistive divider implemented by r4-r6 and the self-bias circuit r3 provide the required operating point of the transistors. the input matching network consists of integrated spiral inductors l1 and l2. the output matching network consists of capacitor c3 and resistor r5. circuit parameter optimization was carried out in a computer aided design (cad) tool using the technique presented in [13], which allows determining the operating current, the width of the transistors and the parameters of the matching networks that provide optimal values of gain, noise figure and return loss. the lna area is 2.15×1.65 mm² and includes an additional pad “c” needed to connect external bypass capacitors for enhanced performance at frequencies below 0.5 ghz.

3. lna performances

3.1. simulation and measurement setups

simulation has been performed using a scalable non-linear phemt model and linear models for microstrip lines, inductors, t-shapes, vias and pads, based on scattering (s-) parameter measurements verified up to 20 ghz and provided by the foundry [8].
measurements have been performed on the wafer for a significant number (above 50) of lna chips using a specialized microwave test system, described in [14], based on a cascade semiautomated probe station, a vector network analyzer (vna) and a signal (spectrum) analyzer with a noise figure measurement option. the experimental setup used for testing the lna dies is shown in fig. 2. according to the test procedure, s-parameters and p1db (linearity) are measured at a 5 v supply, and the noise figure is measured at a 3 v supply.

fig. 2 experimental setup based on the microwave probe station

3.2. simulation and measurement results

the simulated and measured lna chip performances (gain, noise figure, p1db, etc.) are shown in fig. 3 and fig. 4. the measured and simulated results are in good agreement in the frequency range 0.5-3.5 ghz and demonstrate that the lna has power gain above 15.3 db, noise figure below 2.2 db and output p1db above 17 dbm (f = 1.5 ghz).

fig. 3 simulated and measured lna gain and noise figure versus frequency

fig. 4 simulated and measured lna gain and output power versus input power

3.3. model accuracy estimation

the relative error of the simulation results was estimated with the following expression:

$$\delta_x = \frac{|x_m - x_s|}{x_m} \cdot 100\,\%, \tag{1}$$

where $x_m$ and $x_s$ are the measured and simulated values of the gain or noise figure, respectively. according to the small-signal analysis, the relative gain error and the noise figure error do not exceed 3 % (frequency range 0.5-3.5 ghz) and 4 % (frequency range 1-3.5 ghz), respectively, which confirms the accuracy of the phemt model. the relative input p1db error does not exceed 15 % (f = 1.5 ghz) in the worst case; the typical value is less than 5 %.

3.4.
parameters variation estimation

the coefficient of variation was estimated with the following expression:

$$v_x = \frac{\sigma_x}{\bar{x}} \cdot 100\,\%, \tag{2}$$

where $\sigma_x$ and $\bar{x}$ are the standard deviation and mean (average) values, respectively. on-wafer measurement results showed that the coefficient of variation does not exceed 1.3 % for gain (frequency range 0.5-3.5 ghz), 0.5 % for noise figure (frequency range 1-3.5 ghz) and 2.8 % for current consumption.

3.5. special measurement test-fixture

a special test-fixture based on ro4003c laminate has been designed and implemented to allow lna performance measurements, especially under extreme temperature and ionizing radiation exposure. the schematic and photograph of the special measurement test-fixture are shown in fig. 5 and fig. 6, respectively. external capacitors c2 and c3 can be used to improve lna performance at frequencies below 0.5 ghz.

fig. 5 special measurement test-fixture schematic

fig. 6 special measurement test-fixture photograph

3.6. low-frequency applications

measurements of the low-frequency s-parameters and noise figure were performed at 5 v and 3 v supply (vdd), respectively, using the special test-fixture (see fig. 5, 6) and a microwave test system operating up to 26 ghz [14]. the measured lna low-frequency performance with and without mounted 1 nf smd capacitors c2 and c3 is shown in fig. 7. the dependencies in fig. 7 demonstrate that at a frequency of 100 mhz the power gain and noise figure are improved by more than 10 db and 5 db, respectively.

fig. 7 measured lna gain and noise figure versus frequency

3.7. parameters variation over temperature range

the lna parameters measured in the ambient temperature range from -60 °c to +125 °c at a frequency of 1 ghz are shown in fig. 8 and fig. 9. the gain decreases monotonically with increasing ambient temperature; the gain change does not exceed 2 db.
the noise figure monotonically increases with increasing ambient temperature; its variation does not exceed 1.3 db. the output linearity (output p1db) monotonically decreases with increasing ambient temperature; the output p1db shift does not exceed 2 db over the temperature range. the current consumption change over the considered ambient temperature range does not exceed 3.8 ma.
fig. 8 measured lna gain and noise figure at 1 ghz versus temperature
fig. 9 measured lna output p1db at 1 ghz and current consumption versus temperature
3.8. radiation tolerance estimation
radiation tolerance estimation has been performed by the spels / nrnu mephi test center for typical test structures: a transistor and a c-band two-stage lna implemented in the given d-mode phemt 0.5 µm process. the experimental study of the test structures under tid irradiation has been performed using the cs-137 “panorama-mephi” irradiation facility [15, 16]. heavy ion irradiation has been performed at the facility based on the u400m heavy-ion cyclotron of the joint institute for nuclear research (jinr, dubna, russia). according to the test results, no parameter degradation of the test structures is observed up to an equivalent gamma dose of 300 krad. destructive see (sel, seb and others) have not been observed under heavy ion exposure with let up to 60 mev·cm²/mg.
3.9. performance summary
the lna’s measured performance is summarized in table 1 and compared with commercially available analogues implemented in different processes (gaas phemt, gaas hbt, gan hemt, and sige hbt).
table 1 lna’s performance summary
lna | this work | tga5108 | hmc395 | qpl1002 | sgl0622z
company | – | qorvo (triquint) | analog dev. (hittite) | qorvo | qorvo (rfmd)
process | 0.5 µm gaas d-phemt | 0.5 µm gaas e/d-phemt | gaas hbt | 0.25 µm gan hemt | sige hbt
configuration | cascode | cascode | darlington | cascode | –
operating frequency [ghz] | 0.5…3.5 | 0.5…3.5 | 0.5…3.5 | 0.5…3.5 | 0.5…3.5
gain [db] | 15.3 | 15.0 | 14.3 | 16.1 | 18.7
gain flatness [db] | 4 | 7 | 2 | 3 | 14
noise figure [db] | 2.2 | 2.2 | 5.0 | 1.6 | 3.2
output p1db [dbm] (f = 1.5 ghz) | 17 | 20 | 15 | 23 | 6
gain temperature coefficient [db/°c] | -0.011 | -0.011 | -0.008 | -0.022 | -0.034
noise figure temperature coefficient [db/°c] | 0.007 | 0.005 | 0.012 | 0.016 | 0.010
power consumption [mw] | 325 | 425 | 270 | 600 | 36
chip size [mm²] | 2.15×1.65 | 1.49×0.85 | 0.38×0.58 | in package | in package
negative bias supply requirement | no | no | no | yes | no
fom [arbitrary unit] | 0.60 | 0.66 | 0.14 | 1.55 | 0.08
in order to compare the presented lnas, a figure of merit (fom) is introduced as a function of gain ($G$), gain flatness ($\Delta G$), noise figure ($NF$), output p1db ($OP_{1\mathrm{dB}}$), gain temperature coefficient ($\delta_G$), noise figure temperature coefficient ($\delta_{NF}$), power consumption ($P_{dc}$) and operating temperature range ($\Delta T$):
$$\mathrm{FOM} = \frac{G \cdot OP_{1\mathrm{dB}}}{(NF - 1) \cdot P_{dc} \cdot \Delta G \cdot 10^{|\delta_G| \Delta T / 20} \cdot 10^{\delta_{NF} \Delta T / 10}}, \qquad (3)$$
where $G$, $\Delta G$, $NF$ are in arbitrary units; $OP_{1\mathrm{dB}}$ and $P_{dc}$ are in mw; $\delta_G$ and $\delta_{NF}$ are in db/°c; $\Delta T$ = 125 °c (-40 to +85 °c).
according to table 1, the presented single power supply lna implemented in the 0.5 µm d-mode phemt process shows performance similar to that of the lna based on 0.5 µm gaas e/d-phemt, and better noise figure, output p1db and fom than the lnas based on gaas and sige hbts. the lna implemented in 0.25 µm gan hemt shows superior performance but requires a negative bias supply.
4. conclusion
the single power supply wide-band low noise amplifier design approach was considered with respect to the low cost 0.5 µm d-mode phemt process.
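as a worked check of the figure of merit (3), the following sketch recomputes the fom row of table 1 from the tabulated parameters. it is an illustration under stated assumptions, not the authors' tooling: gain and gain flatness are converted from db to linear voltage ratios (10^(db/20)), noise figure to a linear power ratio (10^(db/10)), output p1db from dbm to mw, and the magnitude of the gain temperature coefficient is used in the exponent. the function and argument names are hypothetical.

```python
def db_to_voltage_ratio(db):
    """Convert a dB value to a linear voltage (amplitude) ratio."""
    return 10 ** (db / 20)


def db_to_power_ratio(db):
    """Convert a dB value to a linear power ratio (also converts dBm to mW)."""
    return 10 ** (db / 10)


def fom(gain_db, flatness_db, nf_db, op1db_dbm, dg_db_per_c, dnf_db_per_c,
        pdc_mw, delta_t=125.0):
    """Figure of merit (3). The unit conversions are assumptions of this sketch."""
    gain = db_to_voltage_ratio(gain_db)
    flatness = db_to_voltage_ratio(flatness_db)
    nf = db_to_power_ratio(nf_db)
    op1db_mw = db_to_power_ratio(op1db_dbm)
    # Penalty terms for drift over the operating temperature range.
    gain_drift = 10 ** (abs(dg_db_per_c) * delta_t / 20)
    nf_drift = 10 ** (dnf_db_per_c * delta_t / 10)
    return gain * op1db_mw / ((nf - 1) * pdc_mw * flatness * gain_drift * nf_drift)


# Table 1 rows: gain [dB], flatness [dB], NF [dB], OP1dB [dBm],
# gain temp. coeff. [dB/°C], NF temp. coeff. [dB/°C], power consumption [mW]
TABLE_1 = {
    "this work": (15.3, 4, 2.2, 17, -0.011, 0.007, 325),
    "tga5108": (15.0, 7, 2.2, 20, -0.011, 0.005, 425),
    "hmc395": (14.3, 2, 5.0, 15, -0.008, 0.012, 270),
    "qpl1002": (16.1, 3, 1.6, 23, -0.022, 0.016, 600),
    "sgl0622z": (18.7, 14, 3.2, 6, -0.034, 0.010, 36),
}

if __name__ == "__main__":
    for name, row in TABLE_1.items():
        print(f"{name}: fom = {fom(*row):.2f}")
```

under these conversions the computed values agree with the fom row of table 1 to within rounding, which is why the voltage-ratio convention for gain and the magnitude of the gain temperature coefficient were chosen for the sketch.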
it was demonstrated that the designed lna’s performance in the frequency range 0.5–3.5 ghz is not inferior to that of commercially available single power supply analogues implemented in more expensive gaas phemt, gaas hbt and sige hbt processes. it is also important to note that this low cost process is supported by a proper design kit (model accuracy, low process variations) and is also a good choice for the radiation tolerant design of lnas and control circuits (switches, attenuators, phase shifters) for space applications, providing tolerance to total ionizing dose (tid) up to 300 krad and to single event effects at let up to 60 mev·cm²/mg.
acknowledgment: this work was supported in accordance with agreement between ministry of education and science of the russian federation and national research nuclear university mephi № 8.2373.2017/4.6.
references
[1] d.i. sotskov, n.a. usachev, v.v. elesin, a. g. kuznetsov, k.m. amburkin, g. v. chukov, m. i. titova, n. m. zidkov, "d-phemt 0.5 um process characterization to wide-band lna design", in proceedings of the 31st int. conf. on microelectronics (miel 2019), 2019, pp. 99–102.
[2] g. gonzalez. microwave transistor amplifiers: analysis and design. 2nd ed. pearson. 1996. 528 p.
[3] n. usachev, v. elesin, a. nikiforov, g. chukov, g. nazarova, d. sotskov, n. shelepin, v. dmitriev, "system design considerations of universal uhf rfid reader transceiver ics", facta universitatis, series: electronics and energetics, vol. 28, no. 2, pp. 297–307, 2015.
[4] g.d. vendelin, a.m. pavio, u.l. rohde. microwave circuit design using linear and nonlinear techniques. john wiley & sons ltd, 2005, 1058 p.
[5] d.v. gromov, v.v. elesin, s.a. polevich, et al. "ionizing-radiation response of the gaas/(al, ga)as phemt: a comparison of gamma- and x-ray results", russian microelectronics, vol. 33, no. 2, 2004, pp. 111–115.
[6] g.v. chukov, v.v. elesin, g.n. nazarova, a.y. nikiforov, d.v. boychenko, v.a. telets, a.g. kuznetsov, k.m.
amburkin, "see testing results for rf and microwave ics", in proceedings of the 2014 ieee radiation effects data workshop, 2014, pp. 233–235.
[7] h.-c. chiu et al., "enhancement- and depletion-mode ingap/ingaas phemts on 6-inch gaas substrate", in proceedings asia-pacific microwave conference, 2005, pp. 1–4.
[8] o.r. fazylkhanov, i.s. pushnitsa, s.i. strelnikov, m.a. kalyakin, a.h. filaretov, "process design kit verification methodology and practice", 2017, crimico, pp. 143–149.
[9] i.j. bahl. fundamentals of rf and microwave transistor amplifiers. john wiley & sons. 2009. 671 p.
[10] g. wang, j. liu et al., "the design of broadband lna with active biasing based on negative technique", journal of microelectronics, electronic components and materials, vol. 48, no. 2, pp. 115–120, 2018.
[11] j.p. conlon, n. zhang, m.j. poulton et al., "gan wide band power integrated circuits", ieee compound semiconductor integrated circuit symposium (csic), 2006, pp. 85–88.
[12] bagher afshar, ali m. niknejad. "x/ku band cmos lna design techniques", ieee custom integrated circuits conference, 2006, pp. 389–392.
[13] g.n. nazarova, v.v. elesin, d.i. sotskov, "an approach to low noise amplifier optimization in advanced design system cad", it security (russia), 2016, vol. 23, no. 3, pp. 53–59.
[14] d.i. sotskov, v.v. elesin, k.m. amburkin, g.n. nazarova, n.a. usachev, a.y. nikiforov, "design and testing issues of a high-speed soi cmos dual-modulus prescaler for radiation tolerant frequency synthesizers", in proceedings of the 30th int. conf. on microelectronics (miel 2017), 2017, pp. 329–332.
[15] a.s. artamonov, a.a. sangalov, a.y. nikiforov, v.a. telets, d.v. boychenko, "the new gamma irradiation facility at the national research nuclear university mephi", in proceedings of the ieee radiation effects data workshop, 2014, pp. 258–261.
[16] d. boychenko, o. kalashnikov, a. nikiforov, a. ulanova, d.
bobrovsky, p. nekrasov. "total ionizing dose effects and radiation testing of complex multifunctional vlsi devices", facta universitatis, series: electronics and energetics, vol. 28, no. 1, pp. 153–164, 2015.

facta universitatis series: electronics and energetics vol. 33, no 3, september 2020, pp. 379-394 https://doi.org/10.2298/fuee2003379g © 2020 by university of niš, serbia | creative commons license: cc by-nc-nd
comparative evaluation of keypoint detectors for 3d digital avatar reconstruction
dušan gajić, gorana gojić, dinu dragan, veljko petrović
university of novi sad, faculty of technical sciences, novi sad, serbia
abstract. three-dimensional personalized human avatars have been successfully utilized in shopping, entertainment, education, and health applications. however, it is still a challenging task to obtain both a complete and highly detailed avatar automatically. one approach is to use general-purpose, photogrammetry-based algorithms on a series of overlapping images of the person. we argue that the quality of avatar reconstruction can be increased by modifying parts of the photogrammetry-based algorithm pipeline to be more specifically tailored to the human body shape. in this context, we perform an extensive, standalone evaluation of eleven algorithms for keypoint detection, which is the first phase of the photogrammetry-based reconstruction pipeline. we include the well-established, patented scale-invariant feature transform (sift) and speeded up robust features (surf) detection algorithms as a baseline, since they are widely incorporated into photogrammetry-based software. all experiments are conducted on a dataset of 378 images of the human body captured in a controlled, multi-view stereo setup. our findings are that binary detectors highly outperform commonly used sift-like detectors in the avatar reconstruction task, both in terms of detection speed and in the number of detected keypoints.
key words: detector, photogrammetry-based reconstruction, 3d human avatar, structure from motion, multi-view stereo
received october 13, 2019; received in revised form january 12, 2020
corresponding author: dušan gajić, university of novi sad, faculty of technical sciences, trg dositeja obradovića 6, 21102 novi sad, serbia, e-mail: dulegajic@gmail.com
1. introduction
an avatar is a digital self-representation of a participant in a computer generated virtual world [1] and can be represented in either two (2d) or three dimensions (3d). the significance of 3d avatars is constantly growing due to the expansion of virtual worlds in which participants identify themselves through their avatars. recently, avatars have been successfully involved in many applications, including entertainment [2], shopping [3], education [4], health [5], and military [6]. for some applications, the avatar must be a 3d, highly personalized representation of a person, e.g., avatars used for meeting events or virtual try-on applications [3], [7]. since it is a labor-intensive task to produce high-quality 3d avatars manually, many techniques for automatic generation have been proposed. one of them is digital photogrammetry, which is the subject of the research described in this paper. to obtain a 3d avatar through digital photogrammetry software, a series of overlapping images showing a person from different viewpoints is first acquired. a typical photogrammetry-based pipeline consists of three phases [8]: structure from motion (sfm), multi-view stereo (mvs), and mesh creation. as an input, the sfm phase receives a series of overlapping 2d images and outputs a 3d sparse point cloud. this phase relies on a triangulation process to recover 3d points from multiple 2d projections of the same 3d point present on two or more images.
to identify 2d points on images that represent the same 3d point, an algorithm for point detection, in the literature also known as a detector, is applied to all input images. this helps locate keypoints, i.e., patches of the image which represent the 3d points that will make up the sparse point cloud. depending on their type, detectors find keypoints corresponding to structures known as edges, blobs, or corners. detected keypoints are matched with each other, using a description-generating algorithm, to find tracks of keypoints that represent the same 3d point. for more information about the sfm, mvs, and mesh creation phases of the photogrammetry-based pipeline, we refer the reader to [8]. recently, a large number of detection algorithms have been proposed. although minor discrepancies exist in the research on the evaluation of detection algorithms, scale-invariant feature transform (sift) based algorithms are still considered to be state-of-the-art algorithms for general-purpose use. however, it has been shown that even cutting-edge commercial software solutions that use sift or sift-like algorithms in the keypoint detection phase, such as agisoft photoscan [9], have difficulties when reconstructing human avatars. those difficulties are often caused by an insufficient number of detected keypoints on particular problem areas (e.g., backs), and ultimately result in an incomplete avatar model. according to [10], the optimal choice of detector might depend on properties of the input data. this means that sift and surf might not perform best in the specific case of 3d avatar reconstruction. additionally, the price of software used for avatar reconstruction could be reduced if a patent-free detector algorithm were to be used. recently, patent-free detectors have been implemented in some of the leading open-source photogrammetry-based solutions, such as meshroom [11] and openmvg [12].
many of the detectors tested in our study were proposed considerably later than the sift and surf algorithms, so it is expected that more of them will be implemented in photogrammetry software in the future to compensate for the shortcomings of sift and surf. all detectors included in this study, including those already incorporated into photogrammetry software, can be used for human avatar reconstruction. still, it remains an open question whether detectors not yet implemented in available photogrammetry software could yield results comparable to or better than those already implemented. from this viewpoint, our study can be seen as a first step to guide the implementation of human-based photogrammetry software. to this end, we have conducted an extensive, standalone detector evaluation study on a human-based image dataset captured in controlled conditions. the results of such a study can lead to less expensive, more widely available photogrammetry software, if it shows that free-to-use detection algorithms can replace sift without sacrificing quality. we evaluate eleven detectors, both binary and floating-point, in terms of the number of keypoints detected, detection speed, and detector efficiency in finding keypoints in the region of an image representing a person. our overall findings are that binary detectors highly outperform the floating-point detectors tested in this study, including the sift and surf detectors, for the task of 3d human body reconstruction.
the rest of the paper is organized as follows. in section 2, we present a brief overview of work in the field of detector evaluation. section 3 discusses in detail the experimental framework for evaluation, as well as the dataset used in the experiments. we present and discuss the obtained results in section 4. the final section offers the main conclusions, as well as possible directions for future work.
2.
related work
in this section, we give a brief overview of the work related to human body reconstruction. we start by presenting associated studies in the detector evaluation field and report established state-of-the-art results. next, we discuss different techniques for 3d reconstruction of a clothed human body representing an avatar, with a particular interest in the model’s level of detail.
2.1. detector evaluation
detector evaluation has been a widely addressed topic in computer vision. an extensive standalone detector evaluation for a use case similar to ours was presented in [13]. ten detectors that were well established at the time the paper was written were evaluated on a dataset captured in a multi-view stereo setup showing complex, non-planar scenes such as buildings, fruits, etc. the authors evaluate detectors through three metrics: the recall rate introduced in [14], keypoint location, and the average number of detected keypoints. to calculate the first two metrics, they use ground-truth data in the form of known camera positions and a 3d dense point cloud of a scene captured by a laser scanner. since we do not have a precomputed 3d model of a person that can be used for recall and location calculation, we adopt the average number of keypoints as a metric in our work. the results of the experiments conducted in [13] show that the fast (features from accelerated segment test) detector exhibited unreliable performance despite the large average number of detected keypoints. although not as extensive, one of the most influential works in the field of detector evaluation is the early work of mikolajczyk and schmid [15]. to evaluate the performance of the detectors under an extensive set of image transformations, the authors used ground-truth homographies between image pairs to match detected keypoints.
this solution for keypoint verification is commonly used in experiments performed on images showing planar scenes, and is therefore not applicable in our case, since the scenes used for detector evaluation are non-planar. along with standalone detector evaluation, recent studies evaluate detectors jointly with description algorithms through the feature matching task [10], [16]–[18]. joint detector-descriptor evaluation has been appealing due to the nature of the keypoint matching problem. keypoint matching between two images is a two-step problem: (1) keypoints are detected on both images and (2) each keypoint is described by a descriptor algorithm of choice. then, keypoint pairs are matched by the similarity of the descriptor outputs. however, introducing descriptors into detector evaluation adds more complexity to the evaluation task, since the final performance cannot be attributed solely to the detection or the description algorithm, but rather to the combination of the two. in [16], the sift, surf, mser (maximally stable extremal regions), fast and orb (oriented fast and rotated brief) detectors are evaluated in terms of fast matching on a dataset with different geometric and photogrammetric transformations including rotation, scale change, viewpoint change, image blur, jpeg compression and change in illumination. in [17], more detectors are added to the evaluation, including censure, agast (adaptive and generic accelerated segment test), and brisk (binary robust invariant scalable keypoints), over an extensive image transformations dataset comprised of multiple well-known feature evaluation datasets. although commonly employed metrics for joint detector evaluations include repeatability score, precision and recall values, number of keypoint correspondences and keypoint detection time, in our experiment we adopt just the keypoint detection time, since the other metrics are descriptor dependent.
there is broad agreement among the proposed evaluation methods that fast is one of the top-performing detectors in terms of the number of detected keypoints and detection speed. considering the detection speed, fast is followed by other binary detectors such as orb and agast [17]. although fast exhibits superior performance when it comes to the number of keypoints detected, it is stated in [13] that it was unreliable compared to scale-space keypoint detectors, such as difference of gaussian (dog), today incorporated as a part of the sift detector. according to a ranking proposed in [19], the best performing detector-descriptor combinations were fast+sift and fast+brisk. in [20], a novel method for detector evaluation is introduced through the reconstruction of a 3d dense point cloud. although the authors compare just the sift and akaze detectors, the method can be applied to other detection algorithms to additionally verify already produced numerical results. as future work, we intend to incorporate a similar approach in our evaluation framework.
2.2. human body reconstruction
to reconstruct the 3d body model of a clothed human, affordable image-based techniques are used as an alternative to more expensive laser scan and structured light techniques [21]. image-based reconstruction requires one [22]–[25] or more [26]–[29] temporally [26], [30] or spatially [27] connected images captured by rgb [22]–[25] or rgb-d [26], [27], [30] sensors. early work in this field was directed toward general-purpose multi-view stereo algorithms. in multi-view stereo, multiple sensors are used in a setup to simultaneously capture images of the subject from different viewpoints with certain redundancy between the views. although these techniques are not primarily designed for human body reconstruction, it has been demonstrated that highly detailed models can be obtained using this method [28], [29], [31], [32].
by design, multi-view stereo algorithms are sensitive to complex occlusions between the views, as well as sparse or repeated textures [28], [33], [34]. these conditions are ubiquitous in human body reconstruction: texture issues are often caused by clothes, and occlusions by nontrivial body shape and pose. as a result, the output body model may be missing some of the body parts [33], [34]. in [35], the authors minimize model incompleteness by increasing redundancy between the views in a dense multi-view stereo setup. however, using tens or thousands of sensors in a setup significantly limits the proposed method’s applicability due to high setup price and increased reconstruction time. a different approach to the model incompleteness problem, based on the compressive sensing (cs) technique, is presented in [36]. compressive sensing has already been used to refine depth maps that are generated in later steps of the human avatar 3d reconstruction pipeline. this technique could be used to reduce the number of sensors in the setup, with the limitation that it can be applied only in cases where exact sensor positions are known during the image acquisition process, which is not assumed in this paper. still, it could be possible to apply cs to fill missing parts of the final model reconstructed by the photogrammetry-based pipeline. it remains to be tested whether cs could recover whole body parts or just minor patches on the model. another effort to reduce the setup price was made in [27], where more affordable, low-resolution rgb-d sensors are used instead of rgb sensors. although rgb-d sensors improve the reconstruction results in terms of depth estimation, body models reconstructed in these setups lack details due to the low sensor resolution. another approach to reduce the setup price is to use a sparser sensor setup. this is the approach we utilize in our work.
since the redundancy between views in a sparse setup is low, models generated using these setups are more likely to be incomplete. to overcome this problem, algorithms for reconstruction from sparse setups usually do not rely solely on input images. in [24], [26], a coarse human body template is used as a basis to overcome model incompleteness issues. the template is further modified according to the input images to obtain a personalized model of a clothed person. the main disadvantage of using a template in the reconstruction process is the inability to generate models with a high level of detail. lately, human body reconstruction from a single image has been a topic of great interest in computer vision. the most successful single-image approaches are those based on convolutional neural networks (cnns) [22]–[25]. there are two common approaches to reconstructing a body model with cnns: (1) estimate human body template parameters [37], [38] or (2) directly output voxel occupancy in the form of a voxel grid [22], [23]. the latter approach is of more interest to this work since it is more suitable for the reconstruction of a clothed body model. recently, not just input color images, but also segmentation masks and body landmarks have been used to output the clothed body model successfully. however, although voxel grid-based cnn reconstruction methods output promising results both in terms of completeness and level of detail, this approach is currently limited by computational power to a voxel grid of approximately 128×128×128 voxels. this constraint affects the model detail level, which is limited by the maximal size of the grid. it is of particular interest to our work that the 3d model of a clothed human body is highly detailed and complete. thus, we choose a multi-view setup with rgb cameras to capture images of a clothed subject, since currently no other method can produce models with a comparably high level of detail.
as a basis for our research, we use a general-purpose multi-view stereo reconstruction algorithm to obtain the clothed body model. unlike other work, we make an effort towards adapting general-purpose photogrammetry-based reconstruction algorithms to the human body reconstruction domain. to achieve that, we perform an extensive study of the detector algorithms that are used as the first step in the pipeline, in order to choose the best performing detectors on a human-based dataset. in this way, we tackle the problem of improving the reconstruction detail level and completeness through the improvement of the algorithm, instead of the more expensive sensor setup densification.
3. experimental framework
in this section, we explain the sensor setup used to capture the human-body based dataset used for the experiments, and give a detailed description of the conducted experiments. to refer to the person who has been photographed, we use the term subject.
3.1. camera setup
to capture image data, we use a multi-view stereo setup with 54 high-resolution calibrated rgb cameras, conceptually similar to the one described in [39] and shown in fig. 2. during the image acquisition process, the subject is standing in the center of the setup with legs slightly apart and arms positioned at an approximately 30-degree angle away from the body (the so-called a-pose) [39]. fig. 1 gives an idea of the body areas visible on images captured by different cameras in the setup. due to privacy concerns, we display subject silhouettes instead of color images. almost all parts of the subject’s body are visible on the captured images. the subject’s soles are the exception, since they are not visible during the image acquisition process.
fig. 1 body coverage schema: anonymized real data
3.2. dataset
the dataset we conduct experiments on consists of seven image sets that we will refer to as scans.
each scan is captured with a setup similar to [39] and consists of images displaying different body parts, as illustrated in fig. 1. to capture scans, we use two different camera types (see table 1). due to the relatively sparse camera setup, redundancy between images of a single scan is low. images are captured from different viewpoints without precisely known camera positions. some of the images may suffer from an illumination effect. other geometric or photogrammetric transformations frequently tested in general-purpose evaluations, such as rotation, blur or jpeg compression, are omitted from the dataset, since the presence of those transformations indicates an error in the scan acquisition process.
table 1 scans specification
scan identifier | camera manufacturer | resolution
1, 2, 3, 4, 5, 6 | canon | 3456x5184
7 | raspberry pi | 2464x3280
fig. 2 conceptual camera setup shown from the top (a) and side view (b). acquisition cameras are represented as columns of dark spots. this image is taken from [36]. in our work, we use an acquisition setup similar to the one presented in the image.
3.3. software and hardware
all experiments are conducted on a personal computer running windows 10 64-bit, powered by an intel i5-6600 cpu at 3.3 ghz, 32 gb of ram, and an nvidia geforce 1050ti graphics card. the evaluation pipeline is implemented as a single-threaded application in the c++ programming language and compiled with the visual c++ 2015 compiler using speed optimizations (/o2 compiler flag). to make the experiments easily reproducible, all detector implementations used are part of the publicly available opencv 3.2 [40] library. we compile the library from source with the opencv-contrib package to include support for patented algorithms such as sift and surf.
3.4. detector evaluation
we include both floating-point and binary detection algorithms in our study.
as for floating-point algorithms, we include the currently most popular sift [41] and surf [42] detection algorithms, as well as star [43], maximal self-dissimilarities (msd) [44], maximally stable extremal regions (mser) [45], [46], and good features to track (gftt) [47]. we also evaluate a number of binary detection algorithms: oriented fast and rotated brief (orb) [48], features from accelerated segment test (fast) [49], binary robust invariant scalable keypoints (brisk) [50], adaptive and generic accelerated segment test (agast) [51], and accelerated kaze (akaze) [52]. we choose to include as many detectors as possible in our study, limiting ourselves to implementations available as a part of the opencv library. when instantiating a detector object, we use default parameters for all detectors except for orb and gftt, for which the maximum number of keypoints has been set to 300000 instead of the much smaller default values of 500 and 1000, respectively. we experimentally choose 300000 as an upper limit for the number of detected keypoints, since none of our test images exceed this limit under any detector. we evaluate detectors on the scans from table 1. experiments are conducted both on original scans and on scans with the background removed (so-called masked scans). to remove the background, we apply a mask image to each image from the scan. a mask is a new, binary image that corresponds to the original color image, with white pixels representing the subject’s body and black pixels representing the background. each mask image is manually labeled to precisely follow the subject’s outline. after the mask is applied, the image is left showing just the subject, while the background is made entirely white. since our study is also directed toward setup cost reduction, we are also interested in detector performance on lower resolution images, since low-resolution cameras are cheaper. motivated by this fact, we test all detector algorithms on images with applied scale factors of 1, 2, 4, and 8.
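the masking step described above can be sketched as follows. this is a minimal pure-python illustration on nested lists, not the c++/opencv implementation used in the study; the function name `apply_mask` and the pixel representation (rgb tuples, mask value 255 for the subject and 0 for the background) are assumptions made for the example.

```python
WHITE = (255, 255, 255)


def apply_mask(image, mask, background=WHITE):
    """Keep pixels where the mask is white (subject); paint all others white.

    image: list of rows, each row a list of (r, g, b) tuples.
    mask:  list of rows with 255 for the subject and 0 for the background.
    """
    same_height = len(image) == len(mask)
    same_widths = all(len(row) == len(m_row) for row, m_row in zip(image, mask))
    if not (same_height and same_widths):
        raise ValueError("image and mask must have the same dimensions")
    return [
        [pixel if m == 255 else background for pixel, m in zip(row, m_row)]
        for row, m_row in zip(image, mask)
    ]


if __name__ == "__main__":
    # Tiny 2x2 example: only the diagonal pixels belong to the subject.
    image = [[(10, 20, 30), (40, 50, 60)],
             [(70, 80, 90), (15, 25, 35)]]
    mask = [[255, 0],
            [0, 255]]
    print(apply_mask(image, mask))
```

filling the background with a uniform color removes background texture, so any keypoint found on a masked image lies on the subject or along its outline.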
to downscale original images, we use bilinear interpolation.
3.4.1. performance metric
we use three metrics for the measurement of detector performance:
• the average number of detected keypoints, calculated for both masked and original images. this metric is important for detector evaluation, since an insufficient number of detected keypoints in the image segment showing the subject will almost certainly result in a sparse point cloud with too few points and, consequently, an incomplete avatar reconstruction. certain areas of the human body, such as the back or legs, can be particularly challenging to reconstruct due to a lack of edges or textures, which are detected as features by some detection algorithms.
• the average number of keypoints per second = number of detected keypoints / time to detect keypoints. a large number of keypoints is necessary to reconstruct a complete and detailed avatar of a human, making keypoint detection time a significant factor in the 3d reconstruction process. choosing a detector with a long execution time might limit the applicability of avatar reconstruction to non-realtime applications. thus, similar to [19], we include detector execution time measurement in our study. we improve the approach from [19] by not limiting the maximum number of keypoints detected by the algorithm. since the number of keypoints detected by different detectors on the same image can vary, we do not measure absolute execution time as introduced in [19]. instead, we measure a detector’s execution time indirectly as the number of keypoints detected per second.
• semantic precision = number of keypoints detected on the image segment showing the subject / total number of keypoints detected both on the image segment representing the subject and on the segment representing the background. to get the first value, we apply the selected detection algorithm on the masked image.
for the second value, the detector algorithm is applied to the original image, and the total number of detected features is calculated. this measure is used as an indicator of detector algorithm expressiveness: a higher ratio indicates a better ability of the detector to distinguish between subject and background, and possibly to reduce the number of bad matches and noise later in the reconstruction process.
4. results and discussion
this section offers our findings for detector evaluation on the human-based dataset.
4.1. detector evaluation
here we present the results of standalone detector evaluation for each of the aforementioned three metrics.
4.1.1. the average number of detected keypoints
as mentioned earlier, the number of detected keypoints can significantly impact later stages of the reconstruction process, since a low number of detected keypoints will undoubtedly lead to a low-quality avatar reconstruction. in table 2, we rank the detectors by the average number of keypoints detected on the proposed dataset of seven scans, for different scaling factors applied to both original and masked images. in all tested scenarios, binary detectors highly outperform sift and surf, as shown in table 3. in general, masking does not have a significant impact on the number of detected keypoints. we observe that the average number of keypoints detected on masked images can vary by up to 10% compared to the average number of keypoints detected by the same detectors on original images. since the change is positive in all cases except for the surf and msd detectors, our estimate is that by eliminating the background from the input image we emphasize the contours of the subject, which leads to an increased number of keypoints detected by the majority of detection algorithms. at the same time, keypoints detected on the background are discarded on masked images.
in the case of the surf and msd algorithms, the number of keypoints rejected by the mask is slightly larger than the number of newly detected keypoints on the masked images, which leads to a reduced number of keypoints detected on masked images. in all cases except for the surf and msd detectors, more keypoints are detected on the masked image than on the original image, even though image masking discards all keypoints detected on the background. the higher keypoint detection rate on masked images can be attributed to additional keypoints being identified by the majority of detectors when masking emphasizes the subject contours.

table 2 detection algorithms ranked according to the average number of detected keypoints. the first row lists the detectors that on average detect the largest number of keypoints, and the last row lists the detectors that on average detect the smallest number of keypoints.
the table provides detector rankings on images with (column masked) and without masks applied (column original).

rank  scale 1             scale 2             scale 4             scale 8
      original  masked    original  masked    original  masked    original  masked
1     orb       orb       orb       orb       agast     agast     agast     agast
2     fast      fast      agast     agast     fast      fast      orb       orb
3     agast     agast     fast      fast      orb       orb       fast      fast
4     gftt      gftt      gftt      gftt      gftt      gftt      gftt      gftt
5     brisk     brisk     brisk     brisk     surf      surf      surf      surf
6     sift      sift      surf      surf      brisk     brisk     brisk     brisk
7     surf      surf      sift      sift      sift      sift      sift      sift
8     akaze     akaze     akaze     akaze     akaze     akaze     akaze     msd
9     msd       msd       msd       msd       msd       msd       msd       akaze
10    mser      mser      mser      mser      mser      mser      mser      mser
11    star      star      star      star      star      star      star      star

table 3 average number of detected keypoints

rank  detector  scale 1   scale 2   scale 4   scale 8
1     orb       122591    46600     10392     2823
2     fast      115551    39980     10570     2644
3     agast     114616    42207     11880     3013
4     sift      48546     9981      1872      608
5     surf      37910     11416     3564      1079

4.1.2. the average number of detected keypoints per second

since we do not limit the number of detected keypoints, it would be unfair to rank detectors directly according to the execution time. instead, we use the ratio of the number of detected keypoints to the time spent on keypoint detection, as shown in table 4. the binary detectors fast, agast and orb show the best overall detection speed performance. on both the original and masked images, fast detects 3.5 and 4.5 times more keypoints per second than agast and orb, respectively. the detected keypoint ratio between these three detectors persists even across different scales, which does not hold for the comparison of fast with the state-of-the-art sift and surf detectors, where ratio variations are not negligible. for original images and different values of the scale factor, fast detects up to 540 times more keypoints per second than sift, and 137 times more than surf. for masked images, these ratios are slightly larger.
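the per-second metric and the semantic precision can be reproduced with a few lines of code. the following is a minimal python sketch, not the authors' implementation: all function names are hypothetical, and the sample rates are the scale 1, original-image values listed for fast, agast and orb in table 4.

```python
def keypoints_per_second(num_keypoints: int, detection_time_s: float) -> float:
    """Detection rate: keypoints found divided by the time spent finding them."""
    if detection_time_s <= 0:
        raise ValueError("detection time must be positive")
    return num_keypoints / detection_time_s


def semantic_precision(num_on_masked: int, num_on_original: int) -> float:
    """Keypoints found on the masked image over keypoints found on the original."""
    if num_on_original <= 0:
        raise ValueError("the original image must yield at least one keypoint")
    return num_on_masked / num_on_original


def speed_ratio(rate_a: float, rate_b: float) -> float:
    """How many times more keypoints per second detector A finds than detector B."""
    return rate_a / rate_b


# Rates for original images at scale 1, copied from Table 4.
fast_rate, agast_rate, orb_rate = 2959, 740, 680
print(round(speed_ratio(fast_rate, agast_rate), 2))  # 4.0
print(round(speed_ratio(fast_rate, orb_rate), 2))    # 4.35
```

comparing rates instead of raw execution times keeps the comparison meaningful when detectors return very different numbers of keypoints for the same image.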
when compared to other detectors, the msd and mser algorithms are highly inefficient, detecting at most about a single keypoint per second on the original images, and one to two keypoints per second on masked images.

fig. 3 average number of keypoints detected per time unit (second)

table 4 average number of keypoints detected per second

detector  scale 1             scale 2             scale 4             scale 8
          original  masked    original  masked    original  masked    original  masked
agast     740       739       1046      1047      1224      1232      1172      1196
akaze     4         4         5         5         6         6         8         8
brisk     78        77        87        87        88        88        86        88
fast      2959      2956      3902      3878      3900      3904      3465      3516
gftt      140       100       169       150       181       165       161       155
msd       1         1         1         1         1         1         2         2
mser      1         1         1         1         1         2         2         3
orb       680       686       894       885       787       778       663       681
sift      11        11        9         9         7         6         9         8
star      3         3         8         8         8         7         17        15
surf      3         3         8         8         8         15        17        15

fig. 4 the ratio of detected keypoints on original scans and scans with applied masks

4.1.3. semantic precision

not all keypoints are equally important in the process of human reconstruction, since we would like to reconstruct the avatar of the subject with as little background noise as possible. that makes keypoints detected on the subject more important than the keypoints detected on the background. in fig. 4 we show the ratio of the average number of keypoints detected on masked images to the number detected on the original images. surf and msd are more likely to detect keypoints on the background for all tested values of the scale factor. the other algorithms show moderate to high robustness to background keypoints, since the computed ratio indicates that the number of keypoints detected on the subject is at least equal to, or even larger than, the total number of keypoints detected on the original image.

5. conclusion

in this paper, we presented an extensive evaluation of algorithms for keypoint detection in the context of 3d avatar reconstruction from an image sequence.
although similar exhaustive evaluations of detector performance exist, we are not aware of any other study performed in the context of photogrammetry-based human body reconstruction. first, we created a human body image dataset by capturing images of seven different persons in a multi-view stereo setup under controlled lighting conditions. the dataset is used to evaluate eleven algorithms for keypoint detection, including the well-established and patented sift and surf algorithms as a baseline. our findings are mainly in agreement with the previously conducted work in [13], [17]. binary detectors show superior performance compared to floating-point detectors in terms of detection speed and number of detected keypoints. among the binary detectors, fast is the most efficient in terms of detection speed, detecting a considerably larger number of keypoints per second than the sift and surf detectors, followed by agast and orb. orb, agast, and fast are the top-performing detectors in terms of the number of detected keypoints; their performance increases further when they are applied to the masked image. in our use case, fast does not produce the largest number of keypoints but is very close to the top-performing orb detector, with approximately 2% fewer keypoints detected. we also found that surf and msd, in comparison with the other detectors, discover a significant number of keypoints in the background area, meaning that the use of these detectors in the pipeline could lead to noisy reconstructions. in future work, detectors learned by machine learning techniques will be included in the evaluation. although advanced handcrafted detector algorithms still exhibit at least comparable performance to learned ones, machine learning is a rapidly developing area, and it can be expected that learned detectors will soon outperform handcrafted ones.
another direction for future work is the improvement of the evaluation framework. the most reliable way to estimate actual detector performance would be to produce a 3d reconstruction based on the detected keypoints. current photogrammetry-based software commonly includes just the sift and surf detection algorithms in the pipeline. more work toward the adaptation of other detectors in the pipeline will be done to further verify the given numerical results.

acknowledgements. the research reported in this paper is partially supported by the ministry of education, science, and technological development of the republic of serbia, projects number tr32044 (2011-2020), on174026 (2011-2020), and iii44006 (2011-2020).

references

[1] j. n. bailenson, n. yee, j. blascovich, and r. e. guadagno, “transformed social interaction in mediated interpersonal communication,” mediated interpersonal communication, 2008, pp. 77–99.
[2] h. lin and h. wang, “avatar creation in virtual worlds: behaviors and motivations,” comput. human behav., vol. 34, pp. 213–218, may 2014.
[3] f. cordier, w. lee, h. seo, and n. magnenat-thalmann, “from 2d photos of yourself to virtual try-on dress on the web,” in people and computers xv—interaction without frontiers, london: springer london, 2011, pp. 31–46.
[4] c. zizza, a. starr, d. hudson, s. s. nuguri, p. calyam, and z. he, “towards a social virtual reality learning environment in high fidelity,” in proceedings of the 15th ieee annual consumer communications & networking conference (ccnc), 2018, pp. 1–4.
[5] d. dragan, z. anišić, s. mihić, and v. puhalac, “3d avatar platforms: tomorrow’s gateways for digitized persons into virtual worlds,” springer, cham, 2018, pp. 141–155.
[6] i. hudson and j. hurter, “avatar types matter: review of avatar literature for performance purposes,” in lecture notes in computer science (including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics), 2016, vol. 9740, pp. 14–21.
[7] m. yuan, i. r. khan, f. farbiz, s. yao, a. niswar, and m.-h. foo, “a mixed reality virtual clothes try-on system,” ieee trans. multimed., vol. 15, no. 8, pp. 1958–1968, dec. 2013.
[8] t. luhmann, s. robson, s. kyle, and j. boehm, close range photogrammetry and 3d imaging. 2013.
[9] agisoft, “agisoft photoscan professional (version 1.2.6) (software),” 2016. [online]. available: https://www.agisoft.com/downloads/installer/.
[10] j. heinly, e. dunn, and j. m. frahm, “comparative evaluation of binary features,” in lecture notes in computer science (including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics), 2012, vol. 7573 lncs, no. part 2, pp. 759–773.
[11] alicevision, “meshroom: a 3d reconstruction software,” 2018.
[12] p. moulon, p. monasse, r. perrot, and r. marlet, “openmvg: open multiple view geometry,” in proceedings of the international workshop on reproducible research in pattern recognition, 2016, pp. 60–74.
[13] h. aanæs, a. l. dahl, and k. s. pedersen, “interesting interest points: a comparative study of interest point performance on a unique data set,” int. j. comput. vis., vol. 97, no. 1, pp. 18–35, mar. 2012.
[14] k. mikolajczyk and c. schmid, “a performance evaluation of local descriptors,” ieee trans. pattern anal. mach. intell., vol. 27, no. 10, pp. 1615–1630, 2005.
[15] k. mikolajczyk and c. schmid, “scale & affine invariant interest point detectors,” int. j. comput. vis., vol. 60, no. 1, pp. 63–86, oct. 2004.
[16] o. miksik and k. mikolajczyk, “evaluation of local detectors and descriptors for fast feature matching,” in proceedings of the 21st int. conf. pattern recognit. (icpr), 2012, pp. 2681–2684.
[17] a. canclini, m. cesana, a. redondi, m. tagliasacchi, j. ascenso, and r. cilla, “evaluation of low-complexity visual feature detectors and descriptors,” in proceedings of the 18th international conference on digital signal processing, dsp 2013, 2013, pp. 1–7.
[18] ş. işık, “a comparative evaluation of well-known feature detectors and descriptors,” int. j. appl. math. electron. comput., vol. 3, no. 1, p. 1, dec. 2014.
[19] d. mukherjee, q. m. jonathan wu, and g. wang, “a comparative experimental study of image feature detectors and descriptors,” mach. vis. appl., vol. 26, no. 4, pp. 443–466, may 2015.
[20] k. yamada and a. kimura, “a performance evaluation of keypoints detection methods sift and akaze for 3d reconstruction,” in proceedings of the 2018 international workshop on advanced image technology, iwait 2018, 2018, pp. 1–4.
[21] b. allen, b. curless, and z. popović, “the space of human body shapes,” acm trans. graph., vol. 22, no. 3, p. 587, 2003.
[22] a. s. jackson, c. manafas, and g. tzimiropoulos, “3d human body reconstruction from a single image via volumetric regression,” sep. 2018.
[23] g. varol et al., “bodynet: volumetric inference of 3d human body shapes,” in lecture notes in computer science (including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics), 2018, vol. 11211 lncs, pp. 20–38.
[24] z. zheng, t. yu, y. wei, q. dai, and y. liu, “deephuman: 3d human reconstruction from a single image,” mar. 2019.
[25] a. venkat, s. s. jinka, and a. sharma, “deep textured 3d reconstruction of human bodies,” sep. 2018.
[26] j. tong, j. zhou, l. liu, z. pan, and h. yan, “scanning 3d full human bodies using kinects,” ieee trans. vis. comput. graph., vol. 18, no. 4, pp. 643–650, apr. 2012.
[27] z. liu et al., “3d real human reconstruction via multiple low-cost depth cameras,” signal processing, vol. 112, pp. 162–179, jul. 2015.
[28] y. m. kim, c. theobalt, j. diebel, j. kosecka, b. miscusik, and s. thrun, “multi-view image and tof sensor fusion for dense 3d reconstruction,” in proceedings of the 2009 ieee 12th international conference on computer vision workshops, iccv workshops 2009, 2009, pp. 1542–1546.
[29] y. furukawa and j. ponce, “accurate, dense, and robust multiview stereopsis,” ieee trans. pattern anal. mach. intell., vol. 32, no. 8, pp. 1362–1376, aug. 2010.
[30] a. weiss, d. hirshberg, and m. j. black, “home 3d body scans from noisy image and range data,” in proceedings of the ieee international conference on computer vision, 2011, pp. 1951–1958.
[31] j. l. schönberger and j.-m. frahm, “structure-from-motion revisited,” in proceedings of the conference on computer vision and pattern recognition (cvpr), 2016.
[32] j. l. schönberger, e. zheng, j. m. frahm, and m. pollefeys, “pixelwise view selection for unstructured multi-view stereo,” in lecture notes in computer science (including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics), 2016, vol. 9907 lncs, pp. 501–518.
[33] h. aanæs, r. r. jensen, g. vogiatzis, e. tola, and a. b. dahl, “large-scale data for multiple-view stereopsis,” int. j. comput. vis., vol. 120, no. 2, pp. 153–168, nov. 2016.
[34] m. goesele, b. curless, and s. m. seitz, “multi-view stereo revisited,” in proceedings of the 2006 ieee computer society conference on computer vision and pattern recognition volume 2 (cvpr’06), vol. 2, pp. 2402–2409.
[35] s. r. fanello et al., “ultrastereo: efficient learning-based matching for active stereo systems,” in proceedings of the 30th ieee conference on computer vision and pattern recognition, cvpr 2017, 2017, vol. 2017-janua, pp. 6535–6544.
[36] i. stančić, m. brajović, i. orović, and j. musić, “compressive sensing for reconstruction of 3d point clouds in smart systems,” in proceedings of the 24th international conference on software, telecommunications and computer networks, softcom 2016, 2016, pp. 1–5.
[37] v. tan, i. budvytis, and r. cipolla, “indirect deep structured learning for 3d human body shape and pose prediction,” in proceedings of the british machine vision conference 2017, 2017.
[38] a. kanazawa, m. j. black, d. w. jacobs, and j. malik, “end-to-end recovery of human shape and pose,” in proceedings of the ieee computer society conference on computer vision and pattern recognition, 2018, pp. 7122–7131.
[39] d. gajić, s. mihić, d. dragan, v. petrović, and z. anišić, “simulation of photogrammetry-based 3d data acquisition,” int. j. simul. model., vol. 18, no. 1, 2019.
[40] g. bradski, “the opencv library,” dr. dobb’s j. softw. tools, 2000.
[41] d. g. lowe, “distinctive image features from scale-invariant keypoints,” int. j. comput. vis., vol. 60, no. 2, pp. 91–110, 2004.
[42] h. bay, t. tuytelaars, and l. van gool, “surf: speeded up robust features,” in lecture notes in computer science (including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics), 2006, vol. 3951 lncs, pp. 404–417.
[43] m. agrawal, k. konolige, and m. r. blas, “censure: center surround extremas for realtime feature detection and matching,” in lecture notes in computer science (including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics), 2008, vol. 5305 lncs, no. part 4, pp. 102–115.
[44] f. tombari and l. di stefano, “interest points via maximal self-dissimilarities,” in lecture notes in computer science (including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics), 2015, vol. 9004, pp. 586–600.
[45] p. e. forssén, “maximally stable colour regions for recognition and matching,” in proceedings of the ieee computer society conference on computer vision and pattern recognition, 2007, pp. 1–8.
[46] d. nistér and h. stewénius, “linear time maximally stable extremal regions,” in lecture notes in computer science (including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics), 2008, vol. 5303 lncs, no. part 2, pp. 183–196.
[47] j. shi and c. tomasi, “good features to track,” in proceedings of ieee conference on computer vision and pattern recognition cvpr-94, 1994, pp. 593–600.
[48] e. rublee, v. rabaud, k. konolige, and g. bradski, “orb: an efficient alternative to sift or surf,” in proceedings of the ieee international conference on computer vision, 2011, pp. 2564–2571.
[49] e. rosten and t. drummond, “fusing points and lines for high performance tracking,” in proceedings of the ieee international conference on computer vision, 2005, vol. ii, pp. 1508–1515.
[50] s. leutenegger, m. chli, and r. y. siegwart, “brisk: binary robust invariant scalable keypoints,” in proceedings of the ieee international conference on computer vision, 2011, pp. 2548–2555.
[51] e. mair, g. d. hager, d. burschka, m. suppa, and g. hirzinger, “adaptive and generic corner detection based on the accelerated segment test,” in lecture notes in computer science (including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics), 2010, vol. 6312 lncs, no. part 2, pp. 183–196.
[52] p. alcantarilla, j. nuevo, and a. bartoli, “fast explicit diffusion for accelerated features in nonlinear scale spaces,” in proceedings of the british machine vision conference 2013, 2014, pp. 13.1-13.11.

facta universitatis series: electronics and energetics vol. 33, n o 3, september 2020, pp.
395-412 https://doi.org/10.2298/fuee2003395n © 2020 by university of niš, serbia | creative commons license: cc by-nc-nd

using an intelligent anfis-online controller for statcom in improving dynamic voltage stability

vinh huu nguyen 1, tien minh cao 2, hung nguyen 3, hung kim le 4
1 ho chi minh city power corporation, ho chi minh city, vietnam
2 faculty of telecommunications engineering, telecommunications university, nha trang city, vietnam
3 hutech institute of engineering, ho chi minh city university of technology (hutech), ho chi minh city, vietnam
4 faculty of electrical engineering, the university of danang—university of science and technology, vietnam

abstract. this research introduces an intelligent anfis-online controller for statcom that improves the dynamic voltage of the power network under a 3-phase short circuit fault. the anfis-online controller is built using an artificial neural network identifier, and, based on this identifier, the premise and consequent parameters of anfis are adjusted online. transient responses are shown to demonstrate the effectiveness of the intelligent anfis-online controller in enhancing the transient response of the studied system under a 3-phase short circuit fault. the results show that the suggested intelligent anfis-online controller provides better responses than the other controllers considered for the statcom, namely the anfis, anfis-pso, and anfis-ga controllers, in enhancing transient voltage stability.

key words: adaptive neuro-fuzzy inference system (anfis), artificial neural network (ann), on-line training, voltage stability, synchronous machine (sg), wind farm (wf), statcom.

1. introduction

wind energy is growing all over the world; in the usa (united states of america), for example, wind energy is expected to account for up to 20% of power capacity by 2030 [1].
however, increasing the wind power penetration into the grid could destabilize the system, since wind energy is an unpredictable resource. to address this, flexible alternating current transmission systems (facts) equipment is suggested for installation on the power system to enhance the stability of the power grid. among facts equipment, the static synchronous compensator (statcom) is suitable equipment that can be installed not only to maintain the voltage at the point of common coupling (pcc) but also to enhance the dynamic stability of the power system [2]. in [3], calculation and simulation results show the effectiveness of statcom in improving the low-voltage ride-through capability of wind power plants. in [4], a statcom was proposed for installation at the pcc to keep the voltage stable, protecting wind farms connected to a weak power system. in [5], a full-power wind permanent-magnet generator (pmg) with a statcom is used in a new control algorithm that compensates reactive power to enhance the stability of the transient voltage. in [6], a statcom is proposed for a network with a large load, in which absorbing reactive power from the power system can have a serious impact on the connected load. in [7], a fuzzy controller is applied to statcom for the dynamic stability of interconnected power grids, but all the responses show large oscillations and a long settling time. in [8], the design steps for a statcom controller are presented, with pi controller parameters that are constantly updated to improve voltage regulation in a multi-machine system when a large disturbance occurs.

received november 3, 2019; received in revised form may 7, 2020
corresponding author: vinh huu nguyen, ho chi minh city power corporation, ho chi minh city, vietnam (e-mail: nguyenhuuvinhdlhcm@gmail.com)
in [9], the authors study the combination of statcom with a pss in a multi-machine system connected to a pv generator to improve dynamic stability. each controller has its positive and negative effects; for example, in [7] the fuzzy logic controller (flc) is applied in the statcom controller to improve power stability in a two-area four-generator interconnected power system. the results presented in [10] show the effect of combining pi and fl controllers in the speed controller of an interior permanent magnet synchronous motor to improve dynamic stability. in the distribution power system, to keep the thd of the current within the ieee standard, a d-statcom device is installed and different structures of the fuzzy controller with a pi controller are applied. in [12], an auxiliary controller is employed for statcom to improve the dynamic stability margin of multi-machine power grids.

neuro-fuzzy systems are a recently developed kind of smart system that combines the key features of fuzzy logic systems and artificial neural networks. it is well known that neither fuzzy logic systems nor artificial neural networks alone are capable of solving problems that involve both linguistic and numerical knowledge at once [13]. the anfis system combines the transparent linguistic reasoning of fuzzy logic with the learning ability of artificial neural networks to perform smart self-learning, which leads to wide-ranging applications. in [14], anfis-pso and anfis-ga controllers were suggested for statcom to enhance the dynamic voltage stability of the power system. in general, two training methods exist for the adaptive neuro-fuzzy inference system: the hybrid and the back-propagation learning methods.
the hybrid learning algorithm has two stages: a forward pass, which identifies the consequent parameters using the learning mechanism of the fis based on the least squares estimator (lse) and neural topologies, and a backward pass, which updates the premise parameters using the error back-propagation algorithm. back-propagation is fast, simple, and easy to program, so the back-propagation learning method is used for training the anfis controller in this research.

identification and modeling play a crucial role in the analysis and design of a physical system. the most common method for linear systems is to measure the values of the input and output signals and build a mathematical model that describes the relationship between them. it is difficult, however, to identify a mathematical model for nonlinear systems because of variations in their model parameters. the approximation capability of artificial neural networks makes them suitable for modeling uncertain nonlinear systems. in [15], narendra and parthasarathy proposed incorporating artificial neural networks in adaptive systems for the identification and control of uncertain nonlinear systems. neural networks are also widely applied elsewhere, for example to the analysis of the limit bearing capacity of continuous beams depending on the character of the load [16] and to the classification of electricity consumers based on several different criteria [17].

this paper aims to suggest an intelligent anfis-online controller of statcom for improving the dynamic voltage of one machine-infinite bus and multi-machine power networks under a 3-phase short circuit fault. the anfis-online controller is built using an artificial neural network identifier, and, based on this identifier, the premise and consequent parameters of anfis are adjusted online.
to demonstrate the operation of the suggested controller, transient responses are shown that illustrate the effectiveness of the intelligent anfis-online controller in enhancing the dynamic stability of the studied system under a 3-phase short circuit fault. the paper is organized as follows: the first section is the introduction. the second section presents the intelligent anfis-online controller design. the third section describes the mathematical model of statcom. simulation results and discussion that demonstrate the operation of the suggested controllers are given in the fourth section. the last section presents the main conclusions of this research.

2. intelligent anfis-online controller design

2.1. intelligent anfis-online controller

the anfis controller differs from a controller based on fuzzy logic systems in its adaptive parameters, i.e., both the premise and the consequent parameters are adjustable. the operation of anfis depends on its membership functions, the number of membership functions, the amount of training data, the amount of verification data, and the training time, all of which must be carefully adjusted. the proposed intelligent anfis-online controller is presented in figure 1. five membership functions are applied to each of the two inputs, the error (e) and its rate of change (de).

fig. 1 proposed intelligent anfis-online controller (block diagram with anfis, plant, d/dt and ann identifier blocks; signals e, de, u(n), y(n), ym(n), ε(n), x(n))

2.2. anfis structure

the typical structure of the considered anfis controller is illustrated in figure 2, in which a square indicates an adaptive node and a circle indicates a fixed node. a system with two inputs, $x_1$ and $x_2$, and one output, $f$, is considered.
among the fuzzy system models, the sugeno fuzzy model is the most widely applied because of its high computational efficiency and interpretability, and because it integrates well with optimization and adaptive techniques. a common rule set with two fuzzy if-then rules can be expressed as below [18]:

rule i: if $x_1$ is $A_i$ and $x_2$ is $B_i$, then $f_i = d_{1i} x_1 + d_{2i} x_2 + d_{0i}$

where $A_i$ and $B_i$ are fuzzy sets in the antecedent, $f_i = f(x_1, x_2)$ is a crisp function in the consequent, and $d_{1i}$, $d_{2i}$, $d_{0i}$ are the updating parameters of the rules.

fig. 2 configuration of anfis

in this study, the anfis controller includes the following five layers:

layer 1: the input fuzzification takes place in this layer. mathematically, it can be expressed as:

$$O^{(1)}_{ij} = \mu_j\left(I^{(1)}_{ij}\right) \qquad (1)$$

where $O^{(1)}_{ij}$ is the output of the layer 1 node that corresponds to the j-th linguistic term of the i-th input variable $I^{(1)}_{ij}$; i indexes the input variables and j indexes the linguistic terms of each input.

layer 2: in this layer, the total number of rules is 25. each node output represents the activation level of a rule:

$$O^{(2)}_k = w_k = \prod_{i=1}^{q} O^{(1)}_{ij} \qquad (2)$$

where k is the rule number.

layer 3: the k-th node’s output is the firing strength of each rule divided by the total sum of the activation values of all the fuzzy rules. this leads to the normalization of the activation value for each fuzzy rule.
this operation is expressed as follows:

$$O^{(3)}_k = \bar{w}_k = \frac{O^{(2)}_k}{\sum_{m=1}^{y} O^{(2)}_m} \qquad (3)$$

layer 4: every node k in this layer is accompanied by a set of adjustable parameters, $d_{1k}, d_{2k}, \ldots, d_{N_{input}k}, d_{0k}$, and applies the linear function below:

$$O^{(4)}_k = \bar{w}_k f_k = \bar{w}_k \left( d_{1k} I^{(1)}_1 + d_{2k} I^{(1)}_2 + \ldots + d_{N_{input}k} I^{(1)}_{N_{input}} + d_{0k} \right) \qquad (4)$$

layer 5: the single node in this layer calculates the overall output as the total of all incoming signals, which is written as:

$$O^{(5)} = \sum_{k=1}^{y} O^{(4)}_k = \sum_{k=1}^{y} \bar{w}_k f_k = \frac{\sum_{k=1}^{y} w_k f_k}{\sum_{k=1}^{y} w_k} \qquad (5)$$

2.3. ann identifier

a multilayer perceptron (mlp) network is applied to model the dynamics of the plant. the architecture of the mlp is described in figure 3. the suggested mlp network has 6 inputs, a hidden layer of 9 neurons with hyperbolic tangent activation functions, and an output layer with one linear neuron. the overall architecture of the plant with the ann identifier is illustrated in figure 3. the output of the ann identifier is written as:

$$\hat{P}_s(n+1) = f\left(P_s(n), P_s(n-1), P_s(n-2), u(n), u(n-1), u(n-2)\right) \qquad (6)$$

fig. 3 ann identifier

where $P_s(n)$ is the power at the statcom at time step n; $u(n)$ is the control signal at time step n; and $\hat{P}_s(n+1)$ is the output of the ann identifier, i.e. the forecast of $P_s$ at time step n+1. the inputs of the ann identifier are normalized to the range [-1, +1] before being fed to the neural network.
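the identifier structure described above can be sketched in a few lines of python. this is an illustrative sketch under stated assumptions rather than the authors' implementation: the class and function names are hypothetical, the random weight initialisation is an assumption, and the six inputs are simply taken to be three samples of the statcom power and three samples of the control signal.

```python
import math
import random


def scale_to_unit_range(value, lo, hi):
    """Map a value from [lo, hi] linearly onto [-1, 1]."""
    return 2.0 * (value - lo) / (hi - lo) - 1.0


class MlpIdentifier:
    """6-9-1 perceptron: tanh hidden layer, single linear output neuron."""

    def __init__(self, n_inputs=6, n_hidden=9, seed=0):
        rng = random.Random(seed)
        # Small random initial weights; the initialisation scheme is an assumption.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                         for _ in range(n_hidden)]
        self.b_hidden = [0.0] * n_hidden
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b_out = 0.0

    def predict(self, inputs):
        """Return the one-step-ahead estimate for already normalised inputs."""
        hidden = [math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
                  for row, b in zip(self.w_hidden, self.b_hidden)]
        return sum(w * h for w, h in zip(self.w_out, hidden)) + self.b_out


# Hypothetical usage: three power samples and three control samples,
# each scaled from an assumed physical range of [0, 2] p.u.
raw = [1.0, 0.9, 1.1, 0.2, 0.1, 0.3]
inputs = [scale_to_unit_range(v, 0.0, 2.0) for v in raw]
estimate = MlpIdentifier().predict(inputs)
```

the linear output neuron leaves the estimate unbounded, while the tanh hidden layer works best with inputs of order one, which is why the inputs are normalised first.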
to correctly predict the output of the system at time step n+1, the identifier is first trained so that its estimated output, $\hat{P}_s$, follows the actual output of the plant, $P_s$, by minimizing the cost function written as follows:

$$J(n) = \frac{1}{2}\varepsilon^2(n) = \frac{1}{2}\left[P_s(n) - \hat{P}_s(n)\right]^2 \qquad (7)$$

the parameters of the ann identifier are updated online by using the gradient descent algorithm as below [19]:

$$w(n+1) = w(n) - \eta \nabla_w J(n) \qquad (8)$$

where $w(n)$ is the parameter matrix at time step n, $\eta$ is the learning rate of the network, and $\nabla_w J(n)$ is the gradient of $J(n)$ with respect to the parameter matrix $w(n)$. the gradient is computed by:

$$\nabla_w J(n) = -\left[P_s(n) - \hat{P}_s(n)\right] \frac{\partial \hat{P}_s(n)}{\partial w(n)} \qquad (9)$$

2.4. online training of anfis

the task of the learning method in this architecture is to adjust all the adjustable weights, namely the gaussian membership function parameters $\{c_{ij}, \sigma_{ij}\}$ and the consequent parameters of the anfis rules $\{d_{1k}, d_{2k}, d_{0k}\}$. the weights are modified so that the anfis output matches the training data. this research suggests a neuro-fuzzy control algorithm based on an artificial neural network identifier. its topology is shown in figure 1. the anfis weights are updated online using the output of the ann identifier described above. the error is defined as follows: = (10)

the operation index for evaluating controller proficiency is written as: = = (11)

for both inputs, the authors use the well-known bell-shaped membership function, which is defined as follows:

$$\mu_j\left(I^{(1)}_{ij}\right) = e^{-\frac{1}{2}\left(\frac{x_i - c_{ij}}{\sigma_{ij}}\right)^2} \qquad (12)$$

where the parameters $\{c_{ij}, \sigma_{ij}\}$ are referred to as premise weights or non-linear weights; they tune the shape and the location of the membership function. these weights are tuned during the training mode of operation by the error back-propagation method. in this research, i = (1, 2) and j = (1, 2, …, 5).
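the five layers of section 2.2 combined with the membership function of (12) give the forward pass of the controller. a minimal python sketch follows, assuming the gaussian form of the membership function, two inputs with five linguistic terms each (25 rules), and first-order consequents; all function and parameter names are hypothetical and the parameter values in the example are arbitrary.

```python
import math
from itertools import product


def gaussian_mf(x, center, sigma):
    """Gaussian membership grade of x for a term with the given center and width."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def anfis_forward(inputs, centers, sigmas, consequents):
    """Forward pass of a first-order Sugeno ANFIS.

    inputs:      [x1, x2, ...]
    centers[i][j], sigmas[i][j]: premise parameters of term j of input i
    consequents[k]: (d1, d2, ..., d0) for rule k, one rule per term combination
    """
    # Layer 1: fuzzification of every input by every linguistic term.
    grades = [[gaussian_mf(x, c, s) for c, s in zip(cs, ss)]
              for x, cs, ss in zip(inputs, centers, sigmas)]
    # Layer 2: firing strength of each rule is the product of its grades.
    firing = [math.prod(combo) for combo in product(*grades)]
    # Layer 3: normalise the firing strengths.
    total = sum(firing)
    normalised = [w / total for w in firing]
    # Layer 4: weight each rule's linear consequent by its normalised strength.
    weighted = [
        w_bar * (sum(d * x for d, x in zip(params[:-1], inputs)) + params[-1])
        for w_bar, params in zip(normalised, consequents)
    ]
    # Layer 5: the overall output is the sum of all incoming signals.
    return sum(weighted)


# Arbitrary example: five evenly spaced terms per input, 5 x 5 = 25 rules.
centers = [[-1.0, -0.5, 0.0, 0.5, 1.0]] * 2
sigmas = [[0.5] * 5] * 2
consequents = [(1.0, 0.0, 0.0)] * 25  # every rule returns f_k = x1
output = anfis_forward([0.3, -0.2], centers, sigmas, consequents)
```

because the normalised firing strengths sum to one, the output is a weighted average of the rule consequents; with identical consequents in every rule, as in this example, it reduces to that single consequent.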
to update the premise weights $c_{ij}$ and $\sigma_{ij}$, the authors use the equations below:

$c_{ij}(n+1) = c_{ij}(n) + \eta(\ldots)$ (13)

$\sigma_{ij}(n+1) = \sigma_{ij}(n) + \eta(\ldots)$ (14)

to update the consequent weights { $d_k$ }, the equations below are used:

$d_{0k}(n+1) = d_{0k}(n) + \eta(\ldots)$ (15)

$d_{1k}(n+1) = d_{1k}(n) + \eta(\ldots)$ (16)

$d_{2k}(n+1) = d_{2k}(n) + \eta(\ldots)$ (17)

in this case, the numbers of neurons in the adaptive neuro-fuzzy controller shown in figure 1 are 5, 10, 20, 20, and 5 for layers 1, 2, 3, 4, and 5, respectively. this means that ten center weights in the membership functions (five for each input) and five consequent weights must be updated online. a system with this many weights to update is relatively complex and time-consuming to compute and train, especially when applied to real-time systems [20].

3. statcom modeling

a statcom is installed to regulate the voltage at its connection point by injecting reactive power into, or absorbing it from, the electrical system. when the system voltage is lower than the statcom voltage, the statcom injects reactive power into the power system; when the system voltage is higher than the statcom voltage, it absorbs reactive power from the power system. besides, a statcom can be operated as an active filter to absorb system harmonics [5, 8]. mathematical modeling is applied to study and analyze the performance of the statcom. the output voltage is resolved into d-axis and q-axis components as below [14, 21, 22].

fig. 4 statcom model

the dq-axis output voltages of the studied statcom illustrated in figure 4 are given by equations (18) and (19), respectively [14, 21, 22]:

$v_{dsta} = v_{dcsta}\, km_{sta} \sin(\theta_{bus} + \alpha_{sta})$ (18)

$v_{qsta} = v_{dcsta}\, km_{sta} \cos(\theta_{bus} + \alpha_{sta})$ (19)

where $\theta_{bus}$ is the phase angle of the common ac-bus voltage; $v_{dcsta}$ is the p.u dc voltage of the dc capacitor $C_m$; $v_{dsta}$ and $v_{qsta}$ are the p.u d-axis and q-axis voltages of the statcom, respectively; $km_{sta}$ is the modulation index; and $\alpha_{sta}$ is the phase angle of the statcom. the p.u dc voltage-current equation of the dc capacitor $C_m$ can be described by:

$C_m\, \dot{v}_{dcsta} = \omega_b \left[ i_{dcsta} - \dfrac{v_{dcsta}}{R_m} \right]$ (20)

in which the dc current can be calculated as:

$i_{dcsta} = i_{qsta}\, km_{sta} \cos(\theta_{bus} + \alpha_{sta}) + i_{dsta}\, km_{sta} \sin(\theta_{bus} + \alpha_{sta})$ (21)

where $R_m$ is the p.u equivalent resistance representing the statcom electrical losses, and $i_{qsta}$ and $i_{dsta}$ are the p.u q-axis and d-axis currents of the statcom, respectively.

4. simulation results

power networks are highly nonlinear systems that operate over a wide range. they are subject to unpredictable load changes and sudden faults that can lead to oscillations in the power system. these transient responses require damping; in addition, maintaining stable operation of the system is both very important and difficult. it is therefore desirable to develop a controller that can adjust its parameters online, according to the environment in which it operates, to deliver satisfactory control performance. for successful use in power networks, the flexibility of an adaptive controller is a major advantage because it determines its applicability to various conditions. it is also desirable to minimize outside intervention in its operation. as a rule, the more controller coefficients that must be tuned manually, the harder the controller is to implement in real situations. in this research, the adaptive neuro-fuzzy system structure shown in figure 2 uses 5 membership functions for each input, with 5, 10, 20, 20 and 5 neurons in layer 1, layer 2, layer 3, layer 4 and layer 5, respectively.
the total number of parameters to be updated is 35: 20 parameters in layer 1 (for the 2 inputs, the error e and its derivative, ∆e) and 15 parameters in the output layer (one output with 5 membership functions and 3 variables, d0, d1, and d2 of equation (4), in each consequent function). these parameters are updated during the training. in this section, the simulation results of the intelligent anfis-online controller for statcom in improving dynamic voltage stability are presented. the test scenarios are: i) the omib system and ii) the ieee 9-bus system. the time-domain voltage waveforms are plotted in matlab to evaluate the effectiveness of the proposed controller for statcom.

4.1. scenario 1

wind power is a source with strong fluctuations in operation, due to natural changes in wind speed as well as faults in the power system. therefore, the authors selected a simulation scenario with a wind farm to study the responsiveness of the applied statcom controller. figure 5 shows the architecture of the studied system, a one-machine infinite-bus (omib) system with a 160 mva sg, a 20 mw wind farm (wf), and a 5 mvar statcom. the wind farm and statcom are connected to the pcc. the pcc is connected to a power system at an infinite bus through a 200 km overhead transmission line [23]. it is assumed that the wind generator operates under a fixed wind speed of 12 m/s. a sampling frequency of 1 khz is applied in the simulation and the learning factor is set to 0.01.

fig. 5 configuration of the omib system

fig. 6 transient responses voltage at pcc (vpcc)

in this study, a three-phase short-circuit fault occurs at the grid bus at t = 1 s and is isolated after 5 cycles. simulation results obtained with the suggested anfis-online controller are compared to those of other controllers, namely the anfis controller [13], the anfis-ga controller [14], and the anfis-pso controller [14]. figure 6, figure 7, and figure 8 show the comparative transient responses of the studied system with the anfis controller (black dashed lines), the anfis-ga controller (red lines), the anfis-pso controller (green lines) and the anfis-online controller (blue lines). figure 6 shows the transient voltage response at the pcc, figure 7 shows the active power and reactive power of the sg, and figure 8 shows the active power and reactive power of the wind farm.

fig. 7 active power (psg) and reactive power (qsg) of sg

as shown in figure 6, when the 3-phase short-circuit fault appears at the grid, the voltage at the pcc drops below 0.5 p.u; after the fault is cleared, the voltage recovers and deviates from the reference voltage before reaching the steady state.
besides that, it is found that the active power (psg) and reactive power (qsg) responses were completely damped at t = 1.346 s when the anfis-online controller is employed.

fig. 8 active power (pwf) and reactive power (qwf) of wf

as for the active power and reactive power profiles of the sg, figure 7 shows that the responses with the anfis, anfis-pso, anfis-ga, and anfis-online controllers are almost the same. however, the settling times of these power responses, when the anfis controller is applied to the statcom device, are 2.641 s and 3.681 s. these settling times are reduced to 1.157 s and 2.111 s, respectively, when applying the anfis-online controller. figure 8 shows that the active power and reactive power waveforms of the wind farm after the fault is cleared are almost the same. the settling times of the wf's active and reactive power, when employing the anfis-pso and anfis-ga controllers, are 2.533 s – 2.351 s and 2.507 s – 2.361 s, respectively, while they are 2.095 s – 2.111 s when using the anfis-online controller. these figures show that, with the anfis-online controller implemented for the statcom equipment, the outputs are more stable, so that the voltage at the pcc is improved in both overshoot and settling time after a 3-phase short-circuit fault.
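the two indexes used to compare the controllers, percent overshoot and settling time, can be computed from a sampled response as sketched below. this is a minimal illustration and not the authors' evaluation script: the function names are hypothetical, the response is synthetic, the reference value is chosen arbitrarily, and the settling time is assumed to be the absolute simulation time after which the signal stays within a ±5% band around the reference. the 1 khz sampling frequency follows the simulation setup.

```python
import math


def percent_overshoot(y, ref):
    """Peak excursion above the reference, in percent of the reference."""
    return 100.0 * (max(y) - ref) / ref


def settling_time(t, y, ref, band=0.05):
    """Time of the first sample after which y stays within +/- band*ref.

    Returns None if the last sample of the record is still outside the band.
    """
    tol = band * abs(ref)
    last_outside = -1
    for k, yk in enumerate(y):
        if abs(yk - ref) > tol:
            last_outside = k
    if last_outside == len(y) - 1:
        return None
    return t[last_outside + 1]


# synthetic response: flat until t = 1 s, then a decaying oscillation
fs = 1000.0          # sampling frequency in Hz
ref = 1.0            # hypothetical reference value in p.u.
t = [k / fs for k in range(6000)]
y = [ref if tk < 1.0
     else ref + 0.3 * math.exp(-3.0 * (tk - 1.0))
     * math.sin(2.0 * math.pi * 2.0 * (tk - 1.0))
     for tk in t]

print(percent_overshoot(y, ref), settling_time(t, y, ref))
```

a band defined relative to the reference, rather than to the peak, keeps the settling time comparable between controllers with different overshoot.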
table 1 comparison of the controllers in omib system
items | index | anfis | anfis-ga | anfis-pso | anfis-online
voltage vpcc (p.u) | settling time (s) | 1.632 | 1.497 | 1.444 | 1.346
voltage vpcc (p.u) | peak value (p.u) | 0.991 | 0.995 | 0.993 | 0.987
voltage vpcc (p.u) | pot (%) | 1.26 | 1.62 | 1.43 | 0.85
active power of sg (psg) | settling time (s) | 2.641 | 1.688 | 1.634 | 1.157
active power of sg (psg) | peak value (p.u) | 1.114 | 1.109 | 1.108 | 0.977
active power of sg (psg) | pot (%) | | | |
reactive power of sg (qsg) | settling time (s) | 3.681 | 2.533 | 2.351 | 2.111
reactive power of sg (qsg) | peak value (p.u) | 0.245 | 0.245 | 0.245 | 0.233
reactive power of sg (qsg) | pot (%) | | | |
active power of wf (pwf) | settling time (s) | 3.718 | 2.507 | 2.361 | 2.095
active power of wf (pwf) | peak value (p.u) | 0.799 | 0.806 | 0.805 | 0.798
active power of wf (pwf) | pot (%) | | | |
reactive power of wf (qwf) | settling time (s) | 3.681 | 2.533 | 2.351 | 2.111
reactive power of wf (qwf) | peak value (p.u) | 0.245 | 0.245 | 0.245 | 0.233
reactive power of wf (qwf) | pot (%) | | | |

in order to compare the efficiency of the controllers, the percent of overshoot (pot) index and the settling time are used. these indexes are shown in table 1. as for the voltage response of the pcc bus, table 1 shows that the percent of overshoot of the voltage at the pcc bus with the anfis-pso, anfis-ga, and anfis-online controllers is 1.43%, 1.62%, and 0.85%, respectively. the anfis-online controller thus gives the best response of all the controllers. comparing the voltage settling time after fault isolation, defined as the time for the voltage to recover to within the permissible range of 5%, the anfis-online controller gives the shortest time: the settling times of the voltage at the pcc with the anfis-pso, anfis-ga, and anfis-online controllers are 1.444 s, 1.497 s, and 1.346 s. in conclusion, the proposed anfis-online controller can significantly reduce the fluctuations of the voltage at the pcc (settling at about 1.346 s with an overshoot of about 0.85%) and of the power of the synchronous generator (reaching the steady state at about 1.1 to 2.1 s) after a 3-phase short-circuit fault is isolated.

4.2.
scenario 2

figure 9 shows the topology of the studied system, which includes three synchronous generators (sgs) connected to buses 1, 2, and 3 that supply power to three loads on bus 8, bus 9, and bus 10. the 50 mvar statcom is installed at bus 9 of the 3-machine 9-bus system. it is assumed that this multi-machine system operates under stable conditions as described in [24]. a 3-phase short-circuit fault suddenly occurs on the power grid at t = 1 s and is isolated after 0.1 s.

fig. 9 topology of the 3-machine 9-bus system with statcom

simulation results are presented in figure 10 and figure 11. these figures plot the comparative transient voltage responses at the point of common coupling (pcc) and at the 1st machine (sg-1), 2nd machine (sg-2) and 3rd machine (sg-3) of the studied system, with the suggested statcom using the basic anfis controller (black dotted lines), the anfis-pso controller (red lines), the anfis-ga controller (green lines), and the anfis-online controller (blue lines), subject to a 3-phase fault at bus 9. figure 10 shows the voltage at the pcc, while figures 11a to 11c show the voltage waveforms at sg-1, sg-2, and sg-3. as for the voltage profile at the pcc, figure 10 shows that the voltage with the anfis, anfis-pso, anfis-ga, and anfis-online controllers deviates from the reference value, by 0.041 p.u and 0.1 p.u with the anfis-online and anfis controllers, respectively. the voltage waveforms at the point of common coupling recover to the pre-fault steady-state operating conditions with large oscillations. with the anfis-online controller, these responses recover to the pre-fault steady-state operating conditions in around 3 seconds with small oscillations.
regarding voltage settling times after fault isolation, defined as the time for the voltage to recover to within the permissible range of 5%, the anfis-online controller gave the smallest values. with the anfis-online controller, the voltage settling times at sg-1, sg-2, and sg-3 were 4.6 s, 3.6 s, and 3.5 s, respectively. with the anfis controller, the settling times of the voltage were 4.65 s, 3.8 s, and 3.6 s, respectively.

fig. 10 voltage waves at pcc

a. voltage waves of sg-1
b. voltage waves of sg-2
c. voltage of sg-3
fig. 11 voltage waves of sg-1, sg-2 and sg-3

table 2 comparison of controllers in ieee 9 bus system
items | index | anfis | anfis-ga | anfis-pso | anfis-online
voltage of sg-1 (vsg1) | settling time (s) | 4.651 | 4.65 | 4.65 | 4.63
voltage of sg-1 (vsg1) | peak value (p.u) | 1.044 | 1.044 | 1.044 | 1.04
voltage of sg-1 (vsg1) | pot (%) | 1% | 1% | 1% | 0.5%
voltage of sg-1 (vsg1) | maximum values of the second-half cycle (p.u) | 0.963 | 0.963 | 0.963 | 0.961
voltage of sg-1 (vsg1) | put (%) | 7% | 7% | 7% | 7%
voltage of sg-2 (vsg2) | settling time (s) | 3.801 | 3.8 | 3.8 | 3.6
voltage of sg-2 (vsg2) | peak value (p.u) | 1.039 | 1.039 | 1.039 | 1.036
voltage of sg-2 (vsg2) | pot (%) | 1% | 1% | 1% | 1%
voltage of sg-2 (vsg2) | maximum values of the second-half cycle (p.u) | 0.861 | 0.861 | 0.861 | 0.864
voltage of sg-2 (vsg2) | put (%) | 16% | 16% | 16% | 16%
voltage of sg-3 (vsg3) | settling time (s) | 3.601 | 3.6 | 3.6 | 3.5
voltage of sg-3 (vsg3) | peak value (p.u) | 1.019 | 1.019 | 1.019 | 1.018
voltage of sg-3 (vsg3) | pot (%) | | | |
voltage of sg-3 (vsg3) | maximum values of the second-half cycle (p.u) | 0.879 | 0.878 | 0.879 | 0.886
voltage of sg-3 (vsg3) | put (%) | 14% | 14% | 14% | 13%
voltage at pcc (vpcc) | settling time (s) | 3.501 | 3.5 | 3.5 | 3.5
voltage at pcc (vpcc) | peak value (p.u) | 1.104 | 1.101 | 1.101 | 1.089
voltage at pcc (vpcc) | pot (%) | 5% | 5% | 5% | 4%
voltage at pcc (vpcc) | maximum values of the second-half cycle (p.u) | 0.951 | 0.949 | 0.959 | 0.954
voltage at pcc (vpcc) | put (%) | 9% | 10% | 9% | 9%

it is clearly observed from the comparative transient simulation results presented in figure 10 that the suggested statcom with the different anfis controllers can offer better damping characteristics and improve voltage quality in the power grid. all the results show that the suggested anfis-online controller performs better than the anfis-pso, anfis-ga, and anfis controllers. in this scenario, the percent of overshoot (pot) index, the percent of undershoot (put) index, and the settling time are used to compare the efficiency of the controllers. these indexes are shown in table 2. with the anfis-online controller, the pots of the voltages of sg-1, sg-2, sg-3, and the pcc are 0.5%, 1%, 0%, and 4%, respectively. meanwhile, these voltage overshoots in the case of the anfis controller are 1%, 1%, 0%, 5%.
in comparison, the puts of the voltages of sg-1, sg-2, sg-3, and the pcc with the anfis-online controller are 7%, 16%, 13%, and 9%, while they are 7%, 16%, 14%, and 10% with the anfis-pso controller or the anfis-ga controller. this shows that the suggested statcom equipment with the intelligent anfis-online controller can supply suitable reactive power to the power system and provides better damping than the other controllers, quickly damping out the inherent oscillations of the studied power system under a 3-phase short-circuit fault at the pcc bus of the power grid, which helps to improve voltage quality in the power system.

6. conclusions

this paper has presented the results of research on using the intelligent anfis-online controller for statcom to improve dynamic voltage stability. the proposed anfis-online controller with an ann identifier for statcom was designed and applied to i) the omib system and ii) the ieee 9-bus network. a statcom can respond quickly to balance reactive power in the grid, which helps to improve dynamic voltage stability. the simulation results above have shown that the suggested controller improves the system stability as well as the voltage quality more effectively than other controllers, such as anfis [13], anfis-pso [14] and anfis-ga [14], and that the proposed anfis-online controller gives the best response after a 3-phase short-circuit fault. it can be concluded that the proposed anfis-online controller has better damping characteristics to enhance the dynamic stability performance of the studied power network under a severe fault.

references
[1] e. muljadi, t. b. nguyen, and m. a. pai, "impact of wind power plants on voltage and transient stability of power systems", in proceedings of the ieee energy 2030 conf, 2008.
[2] foad h. gandoman, abdollah ahmadi, a.m.
sharaf and pierluigi siano, "review of facts technologies and applications for power quality in smart grids with renewable energy systems", renewable and sustainable energy reviews, no. 82, pp. 502–514, 2018.
[3] m. molinas, j.a. suul, and t. undeland, "low voltage ride through of wind farms with cage generators: statcom versus svc", ieee trans. power electronics, vol. 23, no. 3, pp. 1104–1117, 2008.
[4] b. pokharel and w. gao, "mitigation of disturbances in dfig-based wind farm connected to weak distribution system using statcom", north american power symposium (naps), arlington, texas, usa, 2010.
[5] g. cai, c. liu, q. sun, d. yang, and p. li, "a new control strategy to improve voltage stability of the power system containing large-scale wind power plants", in proceedings of the 4th int. conf. electric utility deregulation and restructuring and power technologies (drpt), weihai, china, 2011.
[6] m. n. eskander and s. i. amer, "mitigation of voltage dips and swells in grid-connected wind energy conversion systems", in proceedings of the iccas-sice, fukuoka, japan, 2009.
[7] h. v. nguyen, k. h. le, and h. nguyen, "hybrid damping controller for statcom to enhance power quality in multi-machine system", in proceedings of the ieee international conference on system science and engineering (icsse2017), ho chi minh city, vietnam, 2017.
[8] a. ganesh, r. dahiya, g. k. singh, "development of simple technique for statcom for voltage regulation and power quality improvement", in proceedings of the 2016 ieee international conference on power electronics, pedes, trivandrum, india, 2016.
[9] g. shahgholian, e. mardani, a. fattollahi, "impact of pss and statcom devices to the dynamic performance of a multi-machine power system", engineering, technology & applied science research, vol. 7, no. 6, pp. 2113–2117, 2017.
[10] m. n. uddin and r. s.
rebeiro, "improved dynamic and steady state performance of a hybrid speed controller based ipmsm drive", in proceedings of the ieee industry applications society annual meeting (ias), 2011.
[11] a. varshney and r. garg, "comparison of different topologies of fuzzy logic controller to control dstatcom", in proceedings of the 3rd international conference on computing for sustainable global development (indiacom), new delhi, 2016.
[12] a. karami and k. mahmoodi galougahi, "improvement in power system transient stability by using statcom and neural networks", electrical engineering, vol. 101, no. 1, pp. 19–33, 2019.
[13] h. v. nguyen, h. nguyen, and k. h. le, "anfis and fuzzy tuning of pid controller for statcom to enhance power quality in multi-machine system under large disturbance", aeta 2018 recent advances in electrical engineering and related sciences: theory and application, lnee, vol. 554, pp. 34–44, 2018.
[14] h. v. nguyen, m. t. cao, h. nguyen, and k. h. le, "performance comparison between pso and ga in improving dynamic voltage stability in anfis controllers for statcom", engineering, technology & applied science research, vol. 9, no. 6, pp. 4863–4869, 2019.
[15] k. s. narendra and k. parthasarathy, "adaptive identification and control of dynamic systems using neural networks", in proceedings of the 28th ieee conf. on decision and control, 1989.
[16] m. bogdanović, ž. petrović, b. milošević, m. mijalković and l. stoimenov, "a neural network approach for the analysis of limit bearing capacity of continuous beams depending on the character of the load", facta universitatis, series: electronics and energetics, vol. 31, no. 1, pp. 115–130, 2018.
[17] d. knežević and m. blagojević, "classification of electricity consumers using artificial neural networks", facta universitatis, series: electronics and energetics, vol. 32, no. 4, pp. 529–538, 2019.
[18] j. jang, c. sun, and e. mizutani, neuro-fuzzy and soft computing, upper saddle river, nj: prentice hall, 1997.
[19] p. shamsollahi and o.p. malik, "an adaptive power system stabilizer using online trained neural networks", ieee transactions on energy conversion, vol. 12, no. 4, pp. 382–387, 1997.
[20] miguel ramirez-gonzalez and o.p. malik, "simplified fuzzy logic controller and its application as a power system stabilizer", in proceedings of the international conference on intelligent applications on power systems, 2009.
[21] d. shen and p. w. lehn, "modeling, analysis and control of a current source inverter based statcom", ieee trans. on power delivery, vol. 17, no. 1, pp. 248–253, 2002.
[22] a. jain, k. joshi, a. behal, and n. mohan, "voltage regulation with statcoms: modeling, control and results", ieee trans. power delivery, vol. 21, no. 2, pp. 726–735, 2006.
[23] l. wang and l.-y. chen, "reduction of power fluctuations of a large-scale grid-connected offshore wind farm using a variable frequency transformer", ieee trans. on sustainable energy, 2011.
[24] a. awasthi, s. k. gupta, m. k. panda, "design of a fuzzy logic controller based statcom for ieee9 bus system", european journal of advances in engineering and technology, vol. 2, no. 4, pp. 62–67, 2015.
instruction facta universitatis series: electronics and energetics vol. 30, no 3, september 2017, pp. 267-284 doi: 10.2298/fuee1703267a

recent advances in ntc thick film thermistor properties and applications

obrad s. aleksić 1, pantelija m. nikolić 2
1 institute for multidisciplinary research, university of belgrade, serbia
2 serbian academy of sciences and arts, belgrade, serbia

abstract. a brief introduction to thermal sensors and thermistor materials is given. after that, novel electrical components such as thick film thermistors and thermal sensors based on them are described: custom designed ntc thermistor pastes based on nickel manganite nimn2o4 micro/nanostructured powder were composed and new planar cell-based (segmented) constructions were printed on alumina. the thick film segmented thermistors were used in novel thermal sensors such as anemometers, water flow meters, a gradient temperature sensor of the ground, and other applications. the advances achieved are the consequence of previous improvements of thermistor material based on nickel manganite and modified nickel manganite such as cu0.2ni0.5zn1.0mn1.3o4 and of the optimization of thick film thermistor geometries for sensor applications. the thermistor powders were produced by a solid state reaction of mnco3, nio, cuo, zno powders mixed in the proper weight ratio. after calcination, the obtained thermistor materials were milled in planetary ball mills and agate mills and finally sieved through a 400 mesh sieve. the powders were characterized by xrd and sem. the new thick film pastes were composed of the powders obtained, an organic vehicle and glass frit. the pastes were printed on alumina, dried and sintered, and characterized again by xrd, sem and electrical measurements.
different thick film thermistor constructions such as rectangular, sandwich, interdigitated and segmented were printed from the new thermistor pastes. their properties, such as the electrical resistance of the thermistor samples, were mutually compared. the electrode effect was measured for all the mentioned constructions and the surface resistance was determined. it was used for modeling and realization of high, medium and low ohmic thermistors with different power dissipation and heat loss. finally, all the results obtained led to thermal sensors based on heat loss for measuring the air flow, water flow, temperature gradient and heat transfer from the air to the ground.

key words: metal oxide thermistors, thick film thermistor geometries, thick film thermistor sensors and systems

received november 10, 2016
corresponding author: obrad s. aleksić, institute for multidisciplinary research, university of belgrade, kneza višeslava 1a, 11 000 belgrade, serbia (e-mail: obradal@yahoo.com)

1. introduction to temperature sensors

temperature measurement and control are today found everywhere around us: from homes and buildings, the ground surface, water and air to power machines in industry, cars and trucks, electronic equipment, chemistry, medicine etc. [1-4]. the temperature measuring range is divided into several sub-ranges such as cryogenic temperatures, near-room temperatures, moderately elevated temperatures, and high temperatures. to cover these temperature ranges, different temperature measuring methods and temperature sensor devices were developed, mainly based on thermocouples (thermoelectric effect), thermistors (thermoresistive effect) and pyrometers (infrared to visible light radiation) [5-7]. the thermoelectric effect was discovered by seebeck in 1821. the thermocouple was formed of two wires of different metals joined at one point by welding [8].
the thermoelectric electromotive force (emf) depends on the metals used in forming the thermocouple. there are various combinations of metals such as copper and iron, the metal alloys alumel (ni/mn/al/si), chromel (ni/cr), constantan (cu/ni), nicrosil (ni/cr/si) and nisil (ni/si/mn), the noble metals platinum and tungsten, and the noble-metal alloys platinum/rhodium and tungsten/rhenium [9]. the output of thermocouples increases as the temperature rises from low temperatures to moderate and high temperatures. the platinum rhodium-platinum thermocouple reaches 20 mv at 1600 °c, chromel-alumel 50 mv at 1200 °c and chromel-constantan 80 mv at 1000 °c, while at cryogenic temperatures around 0 k the emf is negative and reaches -6 mv for copper-constantan and -10 mv for chromel-aufe wires. the emf of thermocouples is a nonlinear function of temperature t [°c] which crosses zero at 0 °c. it can be approximated by the following equation:

$E = a_1 t + a_2 t^2 + a_3 t^3 + a_4 t^4 + \ldots$

where $a_1, a_2, a_3, a_4, \ldots$ are experimentally determined constants. the inaccuracy is around ±0.2% of the emf voltage, or ±2 °c at 1000 °c, for example [10]. at higher temperatures, optical pyrometers are used for measuring temperatures using the stefan–boltzmann law [11]. the output signal of the photo detector is related to the thermal radiation or irradiance $j^{*} = \varepsilon \sigma T^{4}$, where $\sigma$ is the stefan–boltzmann constant and $\varepsilon$ is the emissivity of the object. the temperature t is measured from a distance (1-10 meters) and the measuring temperature range of pyrometers is typically 650-2500 °c. the pyrometer inaccuracy is high, typically around ±25 °c at the beginning of the measuring range and ±50 °c at the end of the measuring range [12]. thermistors are a new class of temperature sensors based on the thermoresistive effect [13,14]. this class includes sintered metal oxides (electronic ceramics) exhibiting different values of resistance thermal coefficients in the near-room and moderate temperature range.
generally their resistance temperature coefficient can be positive or negative (ptc and ntc, respectively). ptc thermistors are used in heaters, temperature level controls and thermal switches while ntc thermistors are used in temperature measurements in electronic equipment, air conditioning, cars, domestic appliances etc. [15]. the main advantage of this class of sensors compared to thermocouples is small dimensions, high sensitivity and low price: thermistors are mass produced as electronic components with one pair of leads (small disc shape), or with short leads (small cubic chips) like other smd components (figure1) [ 16]. the nominal resistance for example is in the range from 1kω to 10k, temperature application range from -50 to 150 ᵒc and sensitivity is 3-5 % per ᵒc, which is much higher compared to thermocouples. the resistance temperature coefficient https://en.wikipedia.org/wiki/stefan%e2%80%93boltzmann_law https://en.wikipedia.org/wiki/thermal_radiation https://en.wikipedia.org/wiki/irradiance https://en.wikipedia.org/wiki/stefan-boltzmann_constant https://en.wikipedia.org/wiki/stefan-boltzmann_constant https://en.wikipedia.org/wiki/emissivity recent advances in ntc thick film thermistor properties and applications 269 values round b=4000, for example and resistance r decreases with temperature t increase by exponential law such as: r= r0 exp [b*(1/t  1/t0), r0-nominal thermistor resistance measured at t0= 293 k (20ᵒ c). it enables easy temperature measurement with inaccuracy of +0.1 ᵒc or less [17]. it also can be used for surge protection, delay in electronic circuits and temperature compensation in power electronic circuits. therefore there is a permanent interest in thermistor materials innovation and development for new applications. fig. 1 disc and chip thermistors based on sintered metal oxides (top figures): leaded and leadlessleft and right respectively, and typical resistance values r vs. 
temperature t for commercial nickel manganese ntc thermistors (bottom).

a comparison of typical values of the main characteristics of ntc thermistors, platinum resistors (rtd), thermocouples and semiconductor thermal sensors from the application aspect is given in table 1 [18-20]. one of the ways to reach new thermistor applications is the use of custom-designed thick film thermistor pastes (modified micro/nano structure, partial substitution of metals, even doping with rare earths) and new planar constructions adapted in shape and size to the required resistance and power. the objectives are the measurement of moderately elevated temperatures, of the flow of gases and liquids by heat loss thermistors (flowmeters), of heat transfer through a surface or heat radiation (bolometers), and of temperature gradients in the ground, water and air.

table 1 comparison of modern temperature sensor devices

| | ntc thermistor | platinum rtd | thermocouple | semiconductor |
|---|---|---|---|---|
| sensor | ceramic (metal-oxide spinel) | platinum wire or metal film | thermoelectric | semiconductor junction |
| temperature range | -100 to +325 °C | -200 to +650 °C | -200 to +1750 °C | 70 to 150 °C |
| accuracy | 0.05 to 1.5 °C | 0.1 to 1.0 °C | 0.5 to 5.0 °C | 0.5 to 5.0 °C |
| stability at 100 °C | 0.2 °C/year (epoxy), 0.02 °C/year (glass) | 0.05 °C/year (film), 0.002 °C/year (wire) | variable, some types very prone to aging | >1 °C/year |
| output | ntc resistance, -4.4 %/°C typical | ptc resistance, 0.00385 Ω/Ω/°C | thermovoltage, 10 μV to 40 μV/°C | digital, various outputs |
| linearity | exponential | fairly linear | most types nonlinear | linear |
| response time | fast, 0.12 to 10 s | slow, 1 to 50 s | fast, 0.10 to 10 s | slow, 5 to 50 s |
| cost | low to moderate | wire high, film low | low | moderate |

thick film technology enables miniaturization of thermistors, faster thermistor response, mass production, integration with electronics and realizing novel geometries such as row and matrix of thermistors, segmented geometry,
multilayer and interdigitated geometry. the thermistors can be modified in shape to cover the heat source or to meet measuring requirements for differential measurement (at a defined point) or average measurement (over a defined surface). this work deals with recent research on thermistor materials and their applications: from powder preparation, pressed and sintered samples and thick film pastes to thick film devices, including our contributions to the field in the last two decades. our intention was to govern thermistor properties by finding correlations between their electronic properties and micro/nano structure, and between thick film geometry and electrical properties, in order to optimize sensitivity, reliability, reproducibility, robustness, long-term stability etc., and to answer the specific requirements of the new sensor applications mentioned above. the authors expect new sensor products based on thick film thermistors to appear on the market in the near future.

2. metal oxide thermistors

thermistors were invented in 1930 and their name is a combination of the words thermal and resistor: thermal-resistors [22]. the first thermistors were made of a mixture of metal oxide powders pressed and sintered at elevated temperatures. generally, they are classified into two types: positive temperature coefficient (ptc) thermistors, where the resistance increases with increasing temperature, and negative temperature coefficient (ntc) thermistors, where the resistance decreases with increasing temperature. moreover, they are also classified into two groups: thermistors for low temperature operation (-50 to 150 °C) and for high temperature operation (150 to 900 °C) [23,24]. a large number of metal oxides show a decrease of resistance with increasing temperature, i.e. ntc behavior, but they are not used as thermistor materials if their electrical resistance is too high.
the most suitable thermistor resistances for temperature measurement and control in electronics, or for temperature sensors at room temperature, are moderate values (1-10 kΩ). the materials that can be used in the synthesis of thermistors for the low temperature range of -50 to 150 °C are mixed oxides of mn, ni, co. they have a spinel structure such as (nimn)3o4, (nimnco)3o4, (nimnfeco)3o4, or a cubic structure like (fe,ti)2o3, and can be doped with other metal oxides such as lio, ruo2, zno, cuo, coo etc. [25,26]. the bulk resistivity ρ25 of powder pressed and sintered samples, measured at room temperature, is correlated with the exponential coefficient b of the bulk resistance and compared in figure 2 [27]. a small addition of co to the spinel gives higher b values and higher values of resistance [28-30]. the resistance is a consequence of the initial powder properties, the sintering temperature/time profiles and the microstructure of the samples [30-36].

fig. 2 correlation between bulk resistivity ρ and exponential coefficient of resistivity b for different ntc thermistor materials: curve r - (mn, ni, co), li-doped; curve r1 - spinel group (nimn)3o4, (nimnco)3o4, (nimnfeco)3o4; curve r2 - (fe,ti)2o3.

the materials that can be used in the synthesis of thermistors for the high temperature range of 150 to 1100 °C are mixed oxides such as mg(al1−xcrx)2o4 [37], y-al-mn-fe-ni-cr-o [38], mgal2o4–lacr0.5mn0.5o3 [39], catio3 [40], sr7mn4o15 [41], ni1.0mn2-xzrxo4 [42], fe2tio5 [43], batio3 [44], al2o3-cr2o3-zro2 [45] etc. their properties such as operating temperature range, resistance and exponential factor b are given in table 2.
table 2 high temperature ntc thermistor materials

| material | operating range [°C] | t [°C] | resistivity [Ω·cm] at t | b [K] |
|---|---|---|---|---|
| la2o3-al2o3/cr2o3, cuo | 50-600 | 50, 600 | 3·10^11, 5·10^4 | 10^4 |
| bi2zr3o7 | 250-600 | | | 11-17·10^4 |
| zro2/cao | 400-1000 | 400, 1000 | 5-6·10^5, 20 | |
| coo-tio2/cr2o3 | 620-1100 | | | 22.8·10^3 |
| coo-tio2/y2o3 | 620-1100 | | | 31.1·10^3 |
| tio2-coo/y2o3 | 650-1100 | | | 28·10^3 |
| mno-al2o3-na2o 1:2:4.8 | 20-600 | 20 | 9·10^8 | 14.4·10^3 |
| ceo2/pro2 | 200-1200 | | | 10-15·10^3 |

the high temperature thermistors are usually of disc and chip type, sealed in glass with pt wire terminations. the special ceramic-type glass has a strength more than two times that of traditional glass-coated products and has excellent durability against reducing gases, such as hydrogen gas. their accuracy is lower than that of low temperature thermistors such as nickel manganese thermistors nimn2o4. therefore they are used in applications that directly detect high temperatures in regions to be heated: burner temperature control in gas ranges, soldering tools, oil heaters, detection of other abnormal heating in combustion equipment, and industrial equipment, instead of platinum temperature detectors and thermocouples.

3. thick film thermistor pastes

thick film pastes are composed of fine powders of thermistor materials, an organic vehicle and glass frit as a binder to the ceramic substrate. the most often used thermistor pastes are based on nickel manganese nimn2o4 where nickel is partly substituted with co [46,47] and nickel manganese is substituted with zn, cu [48-50]. the base material nickel manganese can be doped with bi, la, sn, cr, ru, al oxides to improve the stability of electrical characteristics over the operating temperature range and to adjust the exponential factor b [50-55]. moreover, our contribution to thick film thermistor layers includes not only substitution of the basic oxides with other oxides but also the development of pastes based on thermistor nanopowders [56].
the electronic properties of the sintered thermistors were measured by fir spectrometers [57,58], hall effect measurement [59], electrical measurement (activation energy) [60] and photoacoustic spectroscopy (pa). the thermal diffusivity of the thermistor material was also determined for the first time by pa [61-63]. xrd of thermistor powders and sem of recently developed low resistance thick film thermistor layers based on nimn2o4 partly substituted with cuo and zno are given below in figures 2 and 3. the ntc behavior of the thick films is given in figure 4, while electronic properties are given in table 3 [64].

fig. 2 xrd of thermistor powder based on modified nickel manganese nimn2o4 (partly substituted with zno and cuo) and used for thermistor paste preparation.

fig. 3 sem of ntc thick film thermistor layers sintered in air at 850 °C / 10 min: cu0.25ni0.5zn1.0mn1.25o4 and cu0.4ni0.5mn2.1o4: (a and b) sample surfaces, respectively, and (c and d) sample cross sections, respectively.

fig. 4 ntc behavior of micro/nanostructured thick film thermistors based on nimn2o4 partly substituted with cuo and zno.

table 3 electronic properties of ntc thermistor nanostructured thick films

| thermistor composition | rsq [MΩ/sq] | b [K] | ea [eV] |
|---|---|---|---|
| cu0.2ni0.5zn1.0mn1.3o4 | 1.3 | 3356 | 0.294 |
| cu0.25ni0.5zn1.0mn1.25o4 | 1.2 | 3294 | 0.288 |
| cu0.4ni0.5mn2.1o4 | 0.39 | 2915 | 0.255 |

the sheet resistance rsq of the thick film thermistors was measured on a rectangular resistor geometry of 2.5 × 2.5 mm printed on a pdag electrode matrix. thick films printed from pure nimn2o4 thermistor paste composed of powder of around 0.9 micron have a more than 10 times higher electrical resistance. the resistances were measured at room temperature (20 °C). the exponential factor b of the thick film thermistors was determined from the resistances r20 and r30 measured at 20 and 30 °C in a climatic chamber.
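the two-point determination of b can be sketched in a few lines. the sketch below assumes the exponential law $R = R_0 \exp[B(1/T - 1/T_0)]$ used throughout this work, so that $B = \ln(R_{20}/R_{30}) / (1/T_{20} - 1/T_{30})$ with temperatures in kelvin, and it converts b to an energy by multiplying with the boltzmann constant. the function names and the resistance values are hypothetical and chosen only for illustration; they are not measured data.

```python
import math

# boltzmann constant expressed in eV/K
BOLTZMANN_EV_PER_K = 8.617333262e-5


def exponential_factor(r_a, t_a_celsius, r_b, t_b_celsius):
    """Exponential factor B [K] from two resistances measured at two temperatures.

    Assumes R = R0 * exp(B * (1/T - 1/T0)) with T in kelvin.
    """
    t_a = t_a_celsius + 273.15
    t_b = t_b_celsius + 273.15
    return math.log(r_a / r_b) / (1.0 / t_a - 1.0 / t_b)


def activation_energy_ev(b_kelvin):
    """Activation energy in eV, taken as B times the boltzmann constant."""
    return b_kelvin * BOLTZMANN_EV_PER_K


# hypothetical readings from a climatic chamber (illustration only)
r20 = 1.30e6  # ohm at 20 degC
r30 = 0.90e6  # ohm at 30 degC

b = exponential_factor(r20, 20.0, r30, 30.0)
print(f"B = {b:.0f} K, Ea = {activation_energy_ev(b):.3f} eV")
```

a narrow 10 °C interval keeps the fit local to room temperature, but it also makes b sensitive to small resistance errors, so in practice the two readings should be taken after the chamber has fully stabilized.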
activation energy $E_a$ is defined as $E_a = B k$, where $k$ is the boltzmann constant.

4. thick film thermistor geometries

thermistor resistance $R$ is a complex function of sheet resistivity, geometry (shape and size), temperature $T$ and time $t$. the sheet resistivity $\rho(k)$ changes with the inter-electrode spacing $k$ due to the electrode effect, i.e. the diffusion of the conductor layer into the thermistor layer. far enough from the electrodes ($k$ = a few mm) there is practically no diffusion of the metal electrodes into the resistive layer at the sintering temperature of 850 °C/10 min, and the sheet resistivity is constant and denoted $\rho_{bulk}$. but this is not the nominal resistance of the thermistor paste: it is conventional to use a rectangular resistor geometry of 2.5 × 2.5 mm (du pont test resistor) to determine the nominal resistance of the pastes, together with the electrode effect. in fact, other inhomogeneities in layer deposition (thickness deformations) occur during printing and can vary the resistance along the resistor [65]. the ideal resistance $R(l, w, d, n)$ depends on the resistor length $l$, width $w$, thickness $d$ and number of layers $n$, and it can be easily calculated. the ntc behavior is described by the exponential factor $A \exp(B/T)$, where $B$ is the exponential temperature coefficient and $A$ is a constant. moreover, thermistor resistance depends on time through $f(t)$: it is initially an increasing linear function (for a few seconds); when heating exceeds cooling, the ntc effect sets in and the resistance then becomes a decreasing function of time. finally, the equation which describes thermistor resistance is:

$$R = \rho(k)\, R(l, w, d, n)\, A\, e^{B/T} f(t) \qquad (1)$$

the most difficult part to model is $f(t)$, as the heat transfer from a thick film thermistor to air depends on the shape and size of the thermistor, which can differ depending on the application. rectangular thermistors measure temperature at a defined point, while surface planar thermistor constructions can measure radiation flux, the average temperature of a surface, heat loss in fluids etc. for example, different planar thick film thermistor constructions such as rectangular, sandwich, multilayer, segmented and interdigitated are given in figure 5. their ideal resistance is modeled as $R(l, w, d, n)$ and the sheet resistivity as $\rho(k)$, where $k$ is the electrode spacing [66].

fig. 5 different thick film thermistor constructions (top view and cross section): rectangular, sandwich, multilayer, segmented and interdigitated, respectively. gray area - pdag conductive paste, black - ntc thermistor layer.

the sheet resistivity $\rho(k)$ changes with the inter-electrode spacing $k$ as given in figure 6. the strongest electrode effect on the sheet resistivity occurs in the sandwich, multilayer and segmented constructions. the electrode spacing in that case is only 30-33 microns (three sequentially printed and fired thermistor layers), while in the rectangular and interdigitated constructions the spacing is 1 mm or more. the sheet resistivity is ρ = 20-32 Ωm for low $k$ and ρ = 275-285 Ωm for $k$ > 1 mm, respectively.

fig. 6 the sheet resistivity ρ of thick film thermistors versus inter-electrode spacing k (axes: k [mm], ρ [Ωm]). the dashed line shows experimental data measured at room temperature and the solid line is the modeled (fitted) ρ(k), determined by an exponential function.

the simple modeling of the sheet resistivity is done using the following equation:

$$\rho(k) = \rho_{bulk}\left(1 - e^{-(k/k_0)^2}\right) \qquad (2)$$

where for $k = 0$, $\rho = 0$; for $k \gg k_0$, $\rho = \rho_{bulk}$; and for $k = k_0 = 1$ mm, $\rho = \rho_{bulk}(1 - 1/e)$. the modeled data differ from the experimental data (figure 6) for thermistor layer thicknesses below 30 microns: the sheet resistivity never reaches zero, as the diffusion of the pdag conductor into the ntc thermistor layer is limited by the finite porosity of the thermistor microstructure (electrode effect saturation). the thick film segmented thermistor is a novel construction developed for heat loss sensor applications, as it has a temperature gradient along the fluid flow. its equivalent electrical scheme consists of serial and parallel resistances $R_s$ and $R_p$ and serial and parallel capacitances $C_s$ and $C_p$ arranged between the bottom and top electrodes and between neighbouring electrodes in the same row of electrodes (figure 7). different thick film segmented thermistors are given in figure 8.

fig. 7 equivalent electrical scheme of the segmented thick film thermistor construction: rs - serial resistor, cs - serial capacitance, rp - parallel resistor, cp - parallel capacitor.

fig. 8 different thick film segmented thermistors: 5 W (76.7 × 12.7 mm) and 2 W (51 × 6.35 mm) printed from nimn2o4 paste, and 1 W (25.4 × 6.35 mm) printed from modified paste cu0.25ni0.5zn1.0mn1.25o4.

the total resistance $R$ of a thick film segmented thermistor in the dc regime is given by $R = 2n R_s = \rho \cdot 2n \cdot d / ((l/3) \cdot w)$ ($w$ - electrode width, $l$ - electrode length, $l/3$ - electrode spacing, $n$ - number of top electrodes as given in figure 5); the parallel resistance $R_p \gg R_s$ is neglected. in the dc regime only the resistances are active, while in the ac regime both resistances and capacitances are active and form a low-pass filter [67, 68]. in the segmented thermistor construction $R_p \gg R_s$ and $C_s \gg C_p$.
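as a numerical illustration of the two relations above, the following minimal sketch evaluates the electrode-effect model and the dc resistance of a segmented thermistor. it assumes that equation (2) has the form $\rho(k) = \rho_{bulk}(1 - e^{-(k/k_0)^2})$ with $k_0 = 1$ mm, and that the dc resistance is $R = \rho \cdot 2n \cdot d / ((l/3) \cdot w)$ with the parallel resistance neglected. the function names and all numerical inputs are hypothetical and serve only to show the shape of the calculation.

```python
import math


def sheet_resistivity(k_mm, rho_bulk, k0_mm=1.0):
    """Modeled resistivity versus inter-electrode spacing k (same units as rho_bulk).

    Assumed form: rho(k) = rho_bulk * (1 - exp(-(k / k0)**2)).
    """
    return rho_bulk * (1.0 - math.exp(-((k_mm / k0_mm) ** 2)))


def segmented_dc_resistance(rho, n_top_electrodes, d, l, w):
    """DC resistance of a segmented thermistor, R = rho * 2n * d / ((l/3) * w).

    rho in ohm*m, lengths d, l, w in metres; parallel resistance Rp is neglected.
    """
    return rho * 2 * n_top_electrodes * d / ((l / 3.0) * w)


# hypothetical inputs, for illustration only
rho_bulk = 280.0  # ohm*m, within the 275-285 range quoted for k > 1 mm
for k in (0.1, 1.0, 3.0):
    print(f"k = {k} mm -> rho = {sheet_resistivity(k, rho_bulk):.1f} ohm*m")

r_dc = segmented_dc_resistance(rho=25.0, n_top_electrodes=5,
                               d=30e-6, l=6e-3, w=6e-3)
print(f"segmented thermistor dc resistance = {r_dc:.1f} ohm")
```

note that this model goes to zero for vanishing spacing, whereas the measured resistivity saturates at a finite value, so the sketch should not be used below the 30 micron range discussed in the text.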
the voltage applied to a segmented thermistor is distributed over the segments (cells) in accordance with the $R_s$ value.

5. thick film thermistor sensors and systems

thermistor temperature sensors generate output signals in one of two ways:
1. through a change in output voltage (constant current)
2. through a change in resistance of the sensor's electrical circuit.

there are two sensing methods: contact and non-contact. in the contact method the sensor is in direct physical contact with the object to be sensed, and it is used to monitor solids, liquids and gases over a wide range. the non-contact method interprets the radiant energy of a heat source as energy in the electromagnetic spectrum, and is used to monitor non-reflective solids and liquids (thermistor bolometers). temperature is a scalar quantity that determines the direction of heat flow between two bodies. temperature measurement and control by thermistors is enabled using the steinhart-hart equation [69]:

$$R(T) = \exp\left(a + \frac{b}{T} + \frac{c}{T^2} + \frac{d}{T^3}\right) \quad \text{or} \quad \frac{1}{T} = a + b \ln R + c\,(\ln R)^3 \qquad (3)$$

where a, b, c are constants determined experimentally. in the first approximation, the thermistor resistance $R_0$ at $T_0 = 293.16$ K and the resistance $R_1$ at temperature $T_1$ are connected by the following equation:

$$R_1 = R_0 \exp\left(\frac{B\,(T_0 - T_1)}{T_1 T_0}\right) \quad \text{or} \quad T_1 = \frac{B\,T_0}{B + T_0 \ln(R_1/R_0)} \qquad (4)$$

the unknown temperature $T_1$ is determined by measuring $R_1$ and using the ratio $R_1/R_0$ in (4). methods for calibration and linearization of ntc thermistors for high precision temperature measurements are given in the literature [70-72]. the temperature gradient (in air, water or ground) can be measured by segmented thick film thermistors using the inner electrodes (see figure 7): for resistances $R_1, R_2, R_3$ and $R_4$ measured on the segments, for example, the temperatures $T_1, T_2, T_3$ and $T_4$ are determined using equation (4), respectively. the first application of the segmented thermistor as a heat loss sensor was in air flow measurement, e.g. in anemometers [73-76].
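the conversion from a measured resistance to a temperature can be sketched as follows. the sketch assumes the first-approximation relation (4) in the form $T_1 = B T_0 / (B + T_0 \ln(R_1/R_0))$ and the three-constant steinhart-hart form $1/T = a + b \ln R + c (\ln R)^3$. the function names, the nominal resistance of 10 kΩ and the value $B = 4000$ K are hypothetical illustration values, not calibration data.

```python
import math


def temperature_from_resistance(r1, r0, beta, t0=293.16):
    """Temperature T1 [K] from resistance R1, using T1 = B*T0 / (B + T0*ln(R1/R0))."""
    return beta * t0 / (beta + t0 * math.log(r1 / r0))


def resistance_from_temperature(t1, r0, beta, t0=293.16):
    """Resistance R1 at temperature T1 [K], using R1 = R0*exp(B*(T0 - T1)/(T1*T0))."""
    return r0 * math.exp(beta * (t0 - t1) / (t1 * t0))


def steinhart_hart_temperature(r, a, b, c):
    """Temperature [K] from the three-constant form 1/T = a + b*ln(R) + c*ln(R)**3."""
    ln_r = math.log(r)
    return 1.0 / (a + b * ln_r + c * ln_r ** 3)


# hypothetical thermistor: 10 kohm nominal at T0, B = 4000 K
r0, beta = 10_000.0, 4000.0

# temperatures of four segments from four measured resistances (made-up values)
segment_resistances = [10_000.0, 9_200.0, 8_500.0, 7_900.0]
for i, r in enumerate(segment_resistances, start=1):
    t_kelvin = temperature_from_resistance(r, r0, beta)
    print(f"segment {i}: R = {r:.0f} ohm -> T = {t_kelvin - 273.15:.2f} degC")
```

with $c = 0$, $b = 1/B$ and $a = 1/T_0 - \ln(R_0)/B$ the steinhart-hart form reduces to relation (4), which is a convenient consistency check when fitting the constants.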
after that, a three-dimensional anemometer comprising thick film segmented thermistors was formed using three uniaxial anemometers positioned under compasses to measure wind velocity as a three-dimensional vector having projections {x, y, z} on the x, y, z axes, respectively. the modulus of the wind velocity vector |v| was calculated as the square root of the sum of squared projections, $|v| = (x^2 + y^2 + z^2)^{1/2}$, and the angles of the wind vector to the axes were calculated from arctg {(x/v), (y/v), (z/v)}. the three-dimensional anemometer construction is given in figure 9, while the response of the uniaxial anemometer, i.e. the self-heating current ith of the thermistor as a function of wind velocity v, is given in figure 10 for air at room temperature.

fig. 9 three-dimensional anemometer comprising thick film segmented thermistors: thick film segmented thermistors, top view (left); anemometer construction (middle): x, y, z - three uniaxial anemometers positioned under compasses, t - input air thermometer, v - humidity sensor with thermistors; cross section of uniaxial anemometer (right): (+, -) power supply, u1, u2 - inner electrodes, 1 - air flow reductor, 2 - sensor housing (tube), 3 - segmented thermistor.

the segmented thermistor is self-heated at constant voltage, and the wind causes heat loss from its surface, i.e. an increase of its resistance, which in turn lowers the self-heating current. the wind direction in uniaxial anemometers is determined from the voltage gradient, using the voltage difference (u1 - u2) on the inner electrodes of the two halves of the self-heated segmented thermistor (figure 9 right).

fig. 10 the self-heating current ith [mA] of the segmented thermistor in a uniaxial anemometer as a response to input air velocity v [m/s], at supply voltage u = 27 V (measured at room temperature in an aerodynamic tunnel).
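the combination of the three uniaxial readings into a wind vector can be sketched as below. the modulus follows $|v| = (x^2 + y^2 + z^2)^{1/2}$ as given in the text. for the angles, this sketch makes its own assumption: it uses the arccosine of the direction cosines x/|v|, y/|v|, z/|v|, which is the usual definition of the angle between a vector and a coordinate axis, whereas the text writes arctg of these ratios. the function name and the sample components are hypothetical.

```python
import math


def wind_vector(x, y, z):
    """Return (speed, angles) for wind velocity components x, y, z in m/s.

    speed is the modulus sqrt(x^2 + y^2 + z^2); angles are the angles in degrees
    between the wind vector and the x, y, z axes, computed here as arccos of the
    direction cosines (an assumption of this sketch). For zero wind the angles
    are undefined and returned as None.
    """
    speed = math.sqrt(x * x + y * y + z * z)
    if speed == 0.0:
        return 0.0, (None, None, None)
    angles = tuple(math.degrees(math.acos(component / speed))
                   for component in (x, y, z))
    return speed, angles


# hypothetical readings of the three uniaxial anemometers
speed, (ax, ay, az) = wind_vector(3.0, 4.0, 0.0)
print(f"|v| = {speed:.2f} m/s, angles to x, y, z = {ax:.1f}, {ay:.1f}, {az:.1f} deg")
```

the sign of each component, which the uniaxial anemometer obtains from the voltage difference on its inner electrodes, has to be applied to x, y and z before calling the function; otherwise all angles fall in the first octant.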
the heat loss volume flow sensor for water (water flowmeter) was formed with two segmented thermistors, as a micro flowmeter and as a flowmeter for stationary flow [77-78]. the water flowmeter for non-stationary flow, intended for measuring the current volume flow in waterworks, was formed of segmented thermistors with reduced dimensions printed on alumina from cu0.25ni0.5zn1.0mn1.25o4 thermistor paste (figure 11). it also consists of two segmented thermistors: the first segmented thermistor (r) measures the input water temperature from r(t) using equation (4), and the second is a thermistor self-heated at constant voltage (u). the self-heating current i changes with the water volume flow q as i = f(q, t), where the input water temperature t is a parameter [79].

fig. 11 flowmeter for water based on heat loss of segmented thermistors: segmented thermistors with reduced dimensions (left) and cross section of the flowmeter (right): the cold thermistor (r) serves as a thermometer for measuring the input water temperature t, and the thermistor self-heated at constant voltage u, with self-heating current i, measures the volume water flow q.

the response of the flowmeter to the current water flow q is given in figure 12: the left diagram shows the response of an ultrasonic flowmeter used as a reference and the right diagram shows the thermal flowmeter for water with segmented thermistors. two impulses were generated, q = 0.15 l/s of duration 30 s, a 30 s pause and q = 0.15 l/s of duration 30 s, by fast switching of the water flow using lever valves.

fig. 12 the electrical response of the thermal flowmeter to the current volume water flow: impulse water flow input q = 0.15 l/s and q = 0.2 l/s, time duration 15 s. left diagram measured on the reference ultrasonic flowmeter, right diagram measured on the thermal flowmeter comprising a segmented thermistor with reduced dimensions.
their responses are given as the electrical current i of the ultrasonic flowmeter (output amplifier) and the self-heating current i of the ntc segmented thermistor, respectively. the input water temperature is t = 14.35 °C and the thermistor supply voltage is u = 14.7 V. the start and stop of the water flow are detected from the voltage gradient measured on the inner electrodes of the thermistor. moreover, thick film segmented thermistors can also be applied as gradient sensors for measuring heat transfer from air to ground. other possible applications of thick film thermistors are thick film thermistor bridges, hybrid circuits with thermistors, bolometers for measuring radiated heat, and thermistor arrays for measuring heat transfer and temperature or heat homogeneity etc.

6. discussion and conclusion

metal oxide thermistors have higher sensitivity than other temperature dependent devices (table 1), higher accuracy and stability, shorter response time, smaller size and a lower price/performance ratio. therefore they are more suitable for application in different types of electronics: both ptc and ntc thermistors have been widely produced as disc and chip shaped electrical components for many years, but ntc thermistors are more often applied in temperature measurement as they have a moderate exponential dependence on temperature (figure 1). ntc thermistors cover a wide operating temperature range from -100 to 1200 °C using different metal oxides (table 2). thick film thermistors with rectangular geometry (or flip chip) have appeared recently as commercial smd (surface mount device) electrical components (tateyama kagaku device technology co., ltd.). the main advantages of thick film chip thermistors are laser trimming of resistance and faster response to temperature change, due to the thermal conductivity of the alumina used as a substrate for the thick film thermistor layer.
another advantage of thick film thermistors is their sensitivity, due to the low thickness of the thermistor layer (microns) and the low heating power needed for a resistance change (a few mW). thick film thermistors are designed for application in microelectronics, e.g. for temperature compensation of other devices and for temperature sensing. their delay is as low as a few seconds (transition from the initial linear to the nonlinear regime), while sintered disc and chip thermistors are bulky and have a much longer delay time. nickel manganese and other ntc thermistors which operate near room temperature (low temperature range) are often used in electronics as leaded or leadless electrical components for temperature measurement. thick film thermistors are used much less as hybrid components: they are used mainly as custom designed hybrid planar components for thermal sensors. high temperature thick film thermistors are very rarely used for sensor applications above 300-400 °C. in practice, thick film thermistor pastes appear as sensor pastes or as resistive pastes: they are produced for low temperature range applications (-50 to 130 °C), with nominal sheet resistances of 1, 10 or 100 kΩ/□ at room temperature (esl, koartan, heraeus). the modified nickel manganese thermistor pastes developed recently and presented in this work also belong to custom designed sensor pastes intended for temperature sensors: they have a mesoporous structure and a moderate ntc slope (see figures 2 and 3), and enable resistances around 10 times lower than those of pure nickel manganite thermistor paste (see table 3). different thick film thermistor devices (planar geometries) were analyzed and optimized to achieve suitable resistance and power dissipation and the faster thermistor response needed for heat loss sensors.
optimization of resistance included the influence of electrode shape, size and arrangement, electrode spacing, and the diffusion of the metal electrode into the thermistor layer, i.e. the electrode effect on sheet resistance (figures 5 and 6). mutual comparison of thick film thermistor geometries shows that sandwich and multilayer geometries are “low ohmic” while rectangular and interdigitated geometries are “high ohmic” (Ω and MΩ, respectively). a new geometry called the segmented thermistor appeared as a “moderate ohmic” geometry (kΩ), and is the most suitable for heat loss sensors. segmented thermistors were designed, realized and applied for heat loss of 1-5 W (different sizes and numbers of segments, as given in figure 8). equations (1) and (2), given above in part 4, are the basis for resistance calculation, modeling, design and simulation, i.e. for predicting the properties of new geometries before thick film thermistor printing, sintering and measuring. the three-axis anemometers and water flow sensors presented above are fully thermal devices based on self-heated thick film thermistors and the heat loss principle. compared to electromechanical or ultrasonic flowmeters they are simpler: they do not contain amplifiers or moving parts, and they are smaller and cheaper. the aim was to develop intelligent sensors as the second step of the research and to introduce intelligent functions such as auto-range, auto-calibration, auto-correction of delay, auto-display of measured and calculated values, selection of continuous or switching operating mode, etc. finally, summing up the recent advances in thick film thermistors, three tendencies can be noticed: 1. new thermistor materials development, 2. new custom designed thermistor pastes development (micro/nano structured and doped with different oxides including rare earths) and 3.
new thick film geometries development (planar constructions) aimed at thermistor sensors and systems. all three tendencies combined lead to novel thick film thermistors, i.e. to new applications which fit customer requirements. new applications such as the thick film gradient temperature sensor, the temperature sensor array and the bolometer with a high temperature thermistor are partly realized and are expected to appear in the near future.

acknowledgement: the paper is a part of the research done within the project iii 45 007 financed by the ministry of education, science and technological development of serbia.

references

[1] p. r. n. childs, j. r. greenwood, c. a. long, “review of temperature measurement”, review of scientific instruments, vol. 71, no. 8, pp. 2959-2965, 2000.
[2] t. d. mcgee, principles and methods of temperature measurement, john wiley, 1988, pp. 2-21.
[3] n.g. lewis, m. randall, thermodynamics, 2nd edition, mcgraw-hill, new york, 1961, pp. 378-379.
[4] d. sherry, “thermoscopes, thermometers, and the foundations of measurement”, studies in history and philosophy of science, vol. 42, pp. 509–524, 2011.
[5] p. coates, d. lowe, the fundamentals of radiation thermometers, chapter 1: the basis of temperature measurement, crc press, 2016, pp. 10-30.
[6] m. j. moran, h. n. shapiro, fundamentals of engineering thermodynamics, chapter 1, john wiley & sons, 2006, pp. 10-25.
[7] j. g. webster, h. eren, measurement, instrumentation and sensor handbook, thermal and temperature measurement, chapter 7, 2nd edition, crc press, 2014, pp. 65-78.
[8] b. l. hunt, “the early history of the thermocouple”, platinum metals rev., vol. 8, no. 1, pp. 23-28, 1964.
[9] y.s. touloukian, d.p. dewitt, thermophysical properties of matter, tprc series, vol. 7: thermal radiative properties, metallic elements and alloys, ifi/plenum, ny, 1970, pp. 159-168.
[10] r.e. bentley, handbook of temperature measurement, vol. 3: theory and practice of thermoelectric thermometry.
springer-verlag singapore pte. ltd., 1998, pp. 24-36.
[11] r.a. felice, “pyrometry for liquid metals”, advanced materials & processes, vol. 166 (7), asm international, pp. 31-33, 2008.
[12] r. p. benedict, fundamentals of temperature, pressure, and flow measurements, third edition, chapter 8: optical pyrometry, john wiley, 1984, pp. 130-145.
[13] e.d. macklean, thermistors, electrochem. pub., glasgow, 1979, pp. 5-22.
[14] f. j. hyde, thermistors, first edition, iliffe, 1971, pp. 2-15.
[15] d. r. white, “temperature errors in linearizing resistance networks for thermistors”, international journal of thermophysics, vol. 36, no. 12, pp. 3404–3420, 2015.
[16] p. umadevi, c. l. nagendra, “preparation and characterization of transition metal oxide microthermistors and their application to immersed thermistor bolometer infrared detectors”, sensors and actuators a: physical, vol. 96, no. 2–3, pp. 114-124, 2002.
[17] h. zumbahlen, linear circuits design handbook, chapter 3-2: temperature sensors: thermistors, analog devices - newnes, 2008, pp. 231-240.
[18] w. kester, j. bryant, w. jung, sensor signal conditioning, temperature sensors, chapter 7, analog devices, pp. 1-38, 2000.
[19] d. d. pollock, thermocouples: theory and properties, crc press, 1991, pp. 181-195.
[20] b. gosselin jr, “ntc thermistors versus voltage output ic temperature sensors”, texas instruments ecn: 04/02/2013, pp. 1-3.
[21] t.
kuglestadt, “semiconductor temperature sensors challenge precision rtds and thermistors in building automation”, texas instruments: application report: snaa267–04 2015, pp. 2-10.
[22] t.g. nanov, s.p. yordanov, ceramic sensors: technology and applications, chapter 5: thermistors, crc press, 1996, pp. 193-203.
[23] c. ma, y. liu, y. lu, h. qian, “preparation and electrical properties of ni0.6mn2.4-xtixo4 ntc ceramics”, journal of alloys and compounds, vol. 650, pp. 931-935, 2015.
[24] j. park, “microstructural and electrical properties of y0.2al0.1mn0.27−xfe0.16ni0.27−x(cr2x)oy for ntc thermistors”, ceramics international, vol. 41, no. 5, pp. 6386-6390, 2015.
[25] o. shpotyuk, a. kovalskiy, o. mrooz, l. shpotyuk, v. pechnyo, s. volkov, “technological modification of spinel-based cuxni1−x−yco2ymn2−yo4 ceramics”, journal of the european ceramic society, vol. 21, no. 1112, pp. 2067–2070, 2001.
[26] r. metz, “electrical properties of n.t.c. thermistors made of manganite ceramics of general spinel structure: mn3−x−x′mxnx′o4 (0 ⩽ x + x′ ⩽ 1; m and n being ni, co or cu). aging phenomenon study”, journal of materials science, vol. 35, pp. 4705–4711, 2000.
[27] r.c. buchanan, ceramic materials for electronics: processing, properties and applications (electrical engineering & electronics), marcel dekker inc., enlarged 2nd edition, 1986, pp. 125-162.
[28] m. vakiv, o. shpotyuk, o. mrooz, i. hadzaman, “controlled thermistor effect in the system cuxni1-x-yco2ymn2-yo4”, journal of the european ceramic society, vol. 21, pp. 1783–1785, 2001.
[29] e. s. na, u. g. paik, s. c. choi, “the effect of a sintered microstructure on the electrical properties of a mn-co-ni-o thermistor”, journal of ceramic processing research, vol. 2 (1), pp. 31-34, 2001.
[30] h. zhang, a. chang, c. peng, “preparation and characterization of fe3+-doped ni0.9co0.8mn1.3-xfexo4 (0 < x < 0.7) negative temperature coefficient ceramic materials”, microelectronic engineering, vol. 88, no. 9, pp. 2934–2940, 2011.
In this case N = 6 and the matrices have dimension 6x6. Equations for all the CVs are obtained under the lowest-order CV theory, the bare approximation. Applying the bare approximation implies that the residual field is set to zero, g(z,t) = 0. For m = 2, the elements of the matrix R are as follows:

[Equations (16)-(18): elements of the matrix R; the expressions are garbled in the source and are not reconstructed here.]

44 M. Veljković, D. Milović, M. Belić, Q. Zhou, S. P. Moshokoa, A. Biswas

[Equations (19)-(21): remaining elements of the matrix R; the expressions are garbled in the source and are not reconstructed here.]

Finally, the nonlinear dynamical system (DS) reduces to:

[Equations (22)-(24): evolution equations of the collective variables; the expressions are garbled in the source and are not reconstructed here.]

Super-sech soliton dynamics in optical metamaterials by using collective variables 45

[Equations (25)-(27), together with the definitions of the constants p, q and r: the expressions are garbled in the source and are not reconstructed here.]

4. Results and conclusion

The collective variable approach was applied to solve the evolution equation that governs the dynamics of the soliton and its propagation through optical metamaterials. Numerical investigations of the evolution of the pulse parameters were carried out to illustrate the results of the collective variable approach. The results were obtained using the standard fourth-order Runge-Kutta method to integrate the system of ordinary differential equations that resulted from the CV analysis. Figure 1 presents the dynamics of the system for the following parameter values: [...] = 0.25, a = 0.1, b = 20, [...]. As the pulse propagates, the amplitude (x1), pulse width (x3), frequency (x5) and chirp (x4) vary periodically. The control parameter of the soliton solution as it evolves is the total energy Q. The total energy can be expressed as a function of the super-sech function parameters:

$$ Q = \frac{4}{3}\, x_1^2\, x_3 \tag{28} $$

This expression shows that the total energy depends strongly on the amplitude (x1) and the pulse width (x3). The collective variables method enables a clear analysis of the equations and reveals the influence of the various parameters.

Fig. 1 Variation of pulse parameters (x1: soliton amplitude, x2: center position of the soliton, x3: pulse width, x4: soliton chirp, x5: soliton frequency, x6: soliton phase) with propagation distance.

In conclusion, we have investigated the dynamics of an ultrashort pulse in optical fibers using the CV approach. This paper could be used for further investigations of soliton dynamics and of the influence of nonlinear parameters on the soliton amplitude, temporal position, frequency, phase and chirp.

Acknowledgement: This research is funded by Qatar National Research Fund (QNRF) under the grant number NPRP 6-021-1-005. The second, third and sixth authors (DM, MB & AB) thankfully acknowledge this support from QNRF. The second author (DM) thankfully acknowledges the support from the Ministry of Education, Science, and Technological Development of the Republic of Serbia [III 44006, TR-32051]. The fourth author (QZ) was funded by the National Science Foundation of Hubei Province in China under the grant number 2015CFC891. The fifth author (SPM) would like to thank the research support provided by the Department of Mathematics and Statistics at Tshwane University of Technology and the support from the South African National Foundation under grant number 92052 IRF1202210126. The sixth author (AB) would like to thank Tshwane University of Technology during his academic visit in 2016. The authors also declare that there is no conflict of interest.

References

[1] V. G. Veselago, "The electrodynamics of substances with simultaneously negative values of ε and μ", Sov. Phys. Usp., vol. 10, no. 4, pp. 509-514, 1968.
[2] N. A. Zharova, I. V. Shadrivov, A. A. Zharov, Y. S.
Kivshar, "Nonlinear transmission and spatiotemporal solitons in metamaterials with negative refraction", Optics Express, vol. 13, no. 14, pp. 1291-1298, 2005.
[3] V. M. Shalaev, Nature Photonics, 1, 41 (2007).
[4] A. Biswas, K. R. Khan, M. F. Mahmood & M. Belic, "Bright and dark solitons in optical metamaterials", Optik, vol. 125, issue 13, pp. 3299-3302, 2014.
[5] A. Biswas, M. Mirzazadeh, M. Savescu, D. Milovic, K. R. Khan, M. F. Mahmood & M. Belic, "Singular solitons in optical metamaterials by ansatz method and simplest equation approach", Journal of Modern Optics, vol. 61, issue 19, pp. 1550-1555, 2014.
[6] A. Biswas, M. Mirzazadeh, M. Eslami, D. Milovic & M. Belic, "Solitons in optical metamaterials by functional variable method and first integral approach", Frequenz, vol. 68, issues 11-12, pp. 525-530, 2014.
[7] G. Ebadi, A. Mojavir, J. Vega-Guzman, K. R. Khan, M. F. Mahmood, L. Moraru, A. Biswas & M. Belic, "Solitons in optical metamaterials by F-expansion scheme", Optoelectronics and Advanced Materials - Rapid Communications, vol. 8, no. 9-10, pp. 828-832, 2014.
[8] M. Veljkovic, Y. Xu, D. Milovic, M. F. Mahmood, A. Biswas, and M. R. Belic, "Super-Gaussian solitons in optical metamaterials using collective variables", Journal of Computational and Theoretical Nanoscience, vol. 12, no. 12, pp. 5119-5124, 2015.
[9] S. I. Fewo & T. C. Kofane, "A collective variable approach for optical solitons in cubic-quintic complex Ginzburg-Landau equation with third order dispersion", Optics Communications, vol. 281, issue 10, pp. 2893-2906, 2008.
[10] P. Green, D. Milovic, D. Lott & A. Biswas, "Dynamics of Gaussian optical solitons by collective variables method", Applied Mathematics and Information Sciences, vol. 2, issue 3, pp. 259-273, 2008.
[11] E. V. Krishnan, M. Al Gabshi, Q. Zhou, K. R. Khan, M. F. Mahmood, Y. Xu, A. Biswas & M. Belic, "Solitons in optical metamaterials by mapping method", Journal of Optoelectronics and Advanced Materials, vol. 17, no. 3-4, pp. 511-516, 2015.
[12] A. B. Moubissi, K. Nakkeeran, P. T. Dinda & T. C. Kofane, "Non-Lagrangian collective variable approach for optical soliton in fibers", Journal of Physics A, vol. 34, pp. 129-136, 2001.
[13] M. Savescu, K. R. Khan, P. Naruka, H. Jafari, L. Moraru & A. Biswas, "Optical solitons in photonic nano waveguides with an improved nonlinear Schrödinger's equation", Journal of Computational and Theoretical Nanoscience, vol. 10, no. 5, pp. 1182-1191, 2013.
[14] S. Shwetanshumala & A. Biswas, "Femtosecond pulse propagation in optical fibers under higher order effects: a collective variables approach", International Journal of Theoretical Physics, vol. 47, issue 6, pp. 1699-1708, 2008.
[15] S. Shwetanshumala, "Temporal solitons in nonlinear media modeled by modified complex Ginzburg-Landau equation under collective variables approach", International Journal of Theoretical Physics, vol. 48, issue 4, pp. 1122-1131, 2008.
[16] Y. Xu, Q. Zhou, A. H. Bhrawy, K. R. Khan, M. F. Mahmood, K. R. Khan & M. Belic, "Bright solitons in optical metamaterials by traveling wave hypothesis", Optoelectronics and Advanced Materials - Rapid Communications, vol. 9, no. 3-4, pp. 384-387, 2015.
[17] Q. Zhou, Q. Zhu, Y. Liu, A. Biswas, A. H. Bhrawy, K. R. Khan, M. F. Mahmood & M. Belic, "Solitons in optical metamaterials with parabolic law nonlinearity and spatio-temporal dispersion", Journal of Optoelectronics and Advanced Materials, vol. 16, no. 11-12, pp. 1221-1225, 2014.
[18] Z. Jakšić, M. Obradov, S. Vuković, M. Belić, "Plasmonic enhancement of light trapping in photodetectors", Facta Universitatis, Series: Electronics and Energetics, vol. 27, issue 2, pp. 183-203, 2014.
[19] E. Suhir, "Fiber optics engineering: physical design for reliability", Facta Universitatis, Series: Electronics and Energetics, vol. 27, no. 2, pp. 153-182, 2014.

Facta Universitatis, Series: Electronics and Energetics, Vol. 33, No 2, June 2020, pp. 273-287
https://doi.org/10.2298/fuee2002273j
© 2020 by University of Niš, Serbia | Creative Commons License: CC BY-NC-ND

The Peak Windowing for PAPR Reduction in Software Defined Radio Base Stations *

Borisav Jovanović, Srđan Milenković
University of Niš, The Faculty of Electronic Engineering, Niš, Serbia

Abstract. The utilization of techniques for peak to average power ratio (PAPR) reduction makes the wireless infrastructure conform to rigorous telecommunication standard specifications (error vector magnitude (EVM), bit error rate (BER), transmit spectrum mask (TSM)). In modern modulation schemes, reduction of PAPR is an important requirement for distortion-free and energy-efficient operation of power amplifiers (PA). In this paper, a novel implementation of the peak windowing method for PAPR reduction in software defined radio (SDR) base stations (BS) is presented. The measurement results in terms of EVM and ACPR are given for 5 MHz, 10 MHz, 15 MHz, 20 MHz long-term evolution (LTE) and wideband code division multiple access (WCDMA) modulations. In the case of the 10 MHz LTE signal, we achieved PAPR = 8 dB, EVM = 2.0% and ACPR = -52 dBc at the modulated PA output, antenna point.

Key words: peak to average power ratio, peak windowing, software defined radio

1. Introduction

In radio frequency (RF) transceivers, power amplifiers (PA) consume the most power among the analog circuits; their energy efficiency is therefore an important design requirement. PA nonlinearity causes high out-of-band radiation, inter-carrier interference and bit error rate (BER) performance degradation. Digital predistortion (DPD) is proven to be an effective method for PA linearization, decreasing in-band and out-of-band distortions [2, 3]. DPD improves PA energy efficiency and reduces the operating expenses of RF transceivers.
The peak-to-average power ratio (PAPR) of the signal s(n) is defined as the ratio of the peak power to the average power of the signal:

$$ \mathrm{PAPR}_{\mathrm{dB}} = 10 \log_{10} \frac{\max_n |s(n)|^2}{s_{\mathrm{rms}}^2} \tag{1} $$

Received September 20, 2019; received in revised form December 11, 2019
Corresponding author: Borisav Jovanović, Faculty of Electronic Engineering, Aleksandra Medvedeva 14, 18000 Niš, Serbia (e-mail: borisav.jovanovic@elfak.ni.ac.rs)
* An earlier version of this paper was presented at the 6th IcETRAN Conference, June 3-6, 2019, in Silver Lake, Serbia, where the paper was awarded as the best paper in the Electronics Section [1].

Modern modulation schemes exhibit frequent occurrence of large signal peaks. In the presence of high-PAPR waveforms, the PA cannot be efficiently linearized by DPD [4]. The solution that helps the DPD compensate distortions is to operate on signals with reduced PAPR. In this case it is possible to increase the average power of the transmitted signal, avoid PA operation in the nonlinear region and improve PA energy efficiency [5]. The use of PAPR reduction techniques in modern modulation schemes has become a necessity.

The flexibility to implement a variety of modulation schemes is an important feature of software defined radio (SDR) systems. Because of this property, SDR is our choice for the implementation of the RF base station (BS). A new implementation of a PAPR reduction technique, operating in an SDR-based BS, is presented in the paper. In conjunction with DPD, it is a promising solution for achieving good PA linearity and energy efficiency.

In the literature there are many studies in which the DPD and PAPR reduction methods are realized using expensive laboratory equipment and where modulated waveforms are produced using MATLAB software and vector signal generators (VSG). Our PAPR reduction implementation is based on an SDR board which is a part of the BS.
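As a small numerical illustration of definition (1), the following sketch computes the PAPR of a sampled signal. It assumes the signal is available as a list of real or complex baseband samples; the function name `papr_db` and the two test signals are illustrative choices and are not part of the implementation described in the paper.

```python
import math


def papr_db(samples):
    """PAPR in dB of a sequence of real or complex samples, following (1)."""
    powers = [abs(s) ** 2 for s in samples]
    peak_power = max(powers)
    mean_power = sum(powers) / len(powers)  # square of the RMS value
    return 10.0 * math.log10(peak_power / mean_power)


# Hypothetical test signals: a constant-envelope complex tone and a real sine wave
# sampled over an integer number of periods.
tone = [complex(math.cos(0.1 * n), math.sin(0.1 * n)) for n in range(1000)]
sine = [math.sin(2 * math.pi * n / 100) for n in range(1000)]

print(papr_db(tone), papr_db(sine))
```

A constant-envelope tone has equal peak and average power, so its PAPR is 0 dB, while the real sine wave has a peak power of twice its average power, that is 10 log10(2), or about 3 dB. Multicarrier LTE and WCDMA waveforms lie well above both, which is why the reduction techniques discussed in this paper are needed.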
To assess the performance of the PAPR reduction method we used the following figures of merit: adjacent channel power ratio (ACPR) and error vector magnitude (EVM). Telecommunication standards define minimum requirements in terms of ACPR and EVM for different modulation schemes [6].

This paper is organized as follows. Related work is given in the following section. Section 3 describes the technique for PAPR reduction and its implementation in SDR hardware. Measurement results of the PAPR method are presented in Section 4. The results are obtained using the following schemes: long-term evolution (LTE) and wideband code division multiple access (WCDMA). Section 5 is reserved for discussion. Finally, the conclusion is drawn in the final section.

2. Related work

A number of methods have been proposed for PAPR reduction. They can generally be divided into two major groups: receiver-dependent methods and receiver-independent methods.

In receiver-dependent methods, the PAPR is reduced at the cost of increased complexity of the RF receiver [7]. In this case, the transmitter sends additional data which is decoded at the receiver for reliable reconstruction of the useful information. The disadvantage of receiver-dependent methods is the reduction of data rate caused by the transmission of additional data. Examples of receiver-dependent methods are tone reservation [8], selective mapping [9] and partial transmit sequence [10].

The receiver-independent methods do not transmit any additional data and do not modify the structure of the receiver. Instead, the shape of the transmitted signal is modified by limiting the magnitude of large peaks [5, 11]. The disadvantage of these methods is an increase of in-band distortions and spectral regrowth. The receiver-independent techniques include clipping and filtering (CAF) [12, 13], peak windowing (PW) [14, 15] and peak cancellation (PC). The CAF is the simplest method for PAPR reduction.
Since the clipping operation is a nonlinear process, it results in high in-band and out-of-band distortions. The clipping operation is followed by a low-pass filtering (LPF) operation, which is employed to eliminate the spectral regrowth.

In PW, large signal peaks are multiplied with a specific window function [11]. In an advanced PW proposed in [14], new weighting coefficients are obtained whenever successive peaks are found within a half of the window length. In [15], the sequential asymmetric suppression (SAS) PW method and optimally weighted windowing were proposed for PAPR reduction. The focus was to reduce the unwanted attenuation of the signal caused by closely spaced peaks. In [16] a hybrid peak windowing (HPW) is proposed, which minimizes the distortion by changing the PAPR reduction method. When a single peak is detected within a period of half of the window length, the peak is shortened using PW. When successive peaks are detected within the same period, the HPW eliminates the peaks using the CAF method.

However, an experimental validation of the PW method, and the disadvantages induced by its application in an SDR BS, are not reported in the literature.

3. Peak-to-average power ratio reduction

3.1. Peak windowing method

We have adopted the PW algorithm for PAPR reduction because it can be easily implemented in an SDR BS. In addition, it provides the flexibility to apply different modulation schemes. The operation of hard clipping is described by (2):

$$ y(n) = c(n)\, x(n), \tag{2} $$

where the signals x(n) and y(n) represent the input and clipped output signals respectively, each consisting of I and Q signal components. The clipping operation is modeled by c(n):

$$ c(n) = \begin{cases} \dfrac{th}{|x(n)|}, & |x(n)| > th \\ 1, & |x(n)| \le th \end{cases} \tag{3} $$

The clipping operation forces peaks in the signal envelope to stay below the threshold, denoted by th in (3). At the same time, the clipping operation produces sharp edges in y(n), which results in increased signal distortion.
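The hard clipping of (2) and (3) can be sketched in a few lines. This is only a floating-point illustration with hypothetical helper names (`clip_gain`, `hard_clip`) and made-up sample values; it is not the fixed-point circuit described later in the paper.

```python
def clip_gain(x, th):
    """Clipping function c(n) of (3): th/|x(n)| above the threshold, 1 otherwise."""
    return [th / abs(s) if abs(s) > th else 1.0 for s in x]


def hard_clip(x, th):
    """Hard clipping y(n) = c(n) x(n) of (2); only the magnitude is scaled."""
    return [c * s for c, s in zip(clip_gain(x, th), x)]


# Hypothetical I/Q samples with magnitudes 0.5, 1.0 and 1.3.
x = [0.3 + 0.4j, 0.6 - 0.8j, -1.2 + 0.5j]
y = hard_clip(x, th=0.7)
print([abs(v) for v in y])
```

Because c(n) is a real gain, each clipped sample keeps its phase and only its magnitude is pulled back to the threshold. Samples below the threshold pass through unchanged.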
In the PW method, the sharp edges of clipped signal peaks are smoothed by multiplying the original signal in the region of the peaks with a windowing function. To avoid sharp edges in y(n) and keep the signal envelope below th, the clipping function c(n) is replaced by b(n), given by (4). The Kaiser, Hamming or Hann windowing functions can be used for the realization of w(n) [17-18].

$$ b(n) = 1 - \sum_{k=-\infty}^{\infty} \bigl(1 - c(k)\bigr)\, w(n-k) \tag{4} $$

The output signal y(n) is the product of the original signal x(n) and the gain correction b(n), which is itself obtained by convolving 1 - c(n) with the windowing function:

$$ y(n) = b(n)\, x(n) \tag{5} $$

3.2. Peak windowing operations

Fig. 1 a) The peak windowing PWFIR architecture; b) the structure of the peak search block

The operation described by (4) is implemented by a finite impulse response (FIR) filter with a symmetric impulse response, denoted PWFIR, whose architecture is described in this section in detail.

The implementation of PW consists of several stages. The PW preprocessing operations are depicted in Fig. 1a [19]. To determine the envelope e(n), the I/Q components xi(n) and xq(n) of the input signal are squared and summed, and the square root of the sum is taken. The envelope e(n) is then compared to the threshold th. According to (3), if the amplitude of e(n) is greater than th, the clipping function c(n) is formed as the value of th divided by e(n). Otherwise, c(n) is set to one.

The peak search block is introduced in the preprocessing stage to find local minimum values of the signal c(n). If a sample is not a local minimum, the output signal of the peak search block, cp(n), is set to one. If a sample is recognized as a local minimum, the value of this sample is passed to the output cp(n). The peak search block examines a sequence of seven consecutive c(n) samples. The value of seven is chosen based on the results of simulations in which the LTE 10 MHz and WCDMA waveforms are used. The structure of the peak search block is given in Fig. 1b. The operation of the block COMP is described by (6).

$$ y(n) = \begin{cases} a, & a < b \\ 1, & a \ge b \end{cases} \tag{6} $$

Fig. 2 The top panel shows the threshold th and the envelopes of the input and output signals; the bottom panel depicts the signals c(n), cp(n) and b(n).

The description of the PWFIR filter architecture is presented in Fig. 1a [19]. The PWFIR takes 1 - cp(n) as its input signal and generates 1 - b(n). Negative values of 1 - cp(n) are substituted by zeros. The signal b(n) is used for gain correction of the input samples xi(n) and xq(n). Before these signals are multiplied with b(n), they are delayed by a period equal to the delay of the PWFIR filter.

The difference between c(n) and b(n) is minimized by choosing narrow window lengths, resulting in decreased EVM values. If the clipping operation happens frequently, neighboring windows may overlap and the difference between c(n) and b(n) becomes larger [18]. To reduce window overlapping, a feedback structure is employed. The feedback structure adjusts the input values of cp(n), preventing the signal from being clipped more than is really needed. The feedback structure reduces the EVM in situations when successive peaks in cp(n) occur within a time interval shorter than half of a window length [19].

The signals illustrating the PW method are given in Fig. 2. The top panel presents the envelope of the input signal x(n), containing peaks that are greater than the threshold level th, and the envelope of the resulting signal y(n), which is constrained to th. The bottom panel depicts the clipping function c(n), the modified clipping function cp(n) and the resulting gain correction signal b(n). As can be seen from Fig. 2, the amplitudes of the local minima of the signals c(n), cp(n) and b(n) are equal. Also, the amplitude of the output signal envelope is precisely limited to th.

3.3. Hardware implementation of PW technique

To implement the preprocessing stage, the square, division and square-root circuits are designed.
Architecture descriptions of these circuits can be found in [20]. The arithmetic circuits are pipelined and operate at a clock frequency equal to the input and output signal data rate of 30.72 MS/s, which is the rate defined by 3G and 4G standards [6]. All arithmetic circuits are implemented in 18-bit fixed-point precision, which has no influence on the performance of the PAPR reduction algorithm.

The PWFIR filter structure, depicted in Fig. 1a, can be divided into two parts: the filter PWFIR1, producing the output signal 1 - b(n), and PWFIR2, which generates the feedback signal f(n). For the filter coefficients we chose the Hann windowing function. The coefficients are generated by the following equation:

$$ w(k) = \frac{1}{2}\left(1 - \cos\left(\frac{2\pi k}{N-1}\right)\right), \quad 0 \le k \le N-1, \tag{7} $$

where N is the window length. PWFIR1 is designed as a 40-tap FIR filter. The value of N = 40 is determined by simulations [21]. For example, for a filter size of N = 40 and a clipping threshold of th = 0.7, the out-of-band distortions obtained at the output of the PW block are minimized down to the system's noise floor. The filter length N and the coefficients are programmable. Namely, the PWFIR1 block has provision to change the filter order in the range from 1 to the maximum of 40. In addition, new filter coefficients can easily be loaded.

The architecture of the PWFIR1 filter is based on multiply-and-accumulate (MAC) circuitry and is optimized for implementation in FPGA. The number of utilized multipliers is reduced by multiplexing the input data and operating at a clock frequency which is four times larger than the sampling rate. The PWFIR1 operates at a clock frequency of 122.88 MHz, and the input and output data rate is equal to 30.72 MS/s. The architecture is additionally optimized by exploiting the fact that the filter has linear phase and therefore coefficients that are symmetrical around the center tap.
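With the Hann window of (7) in hand, the peak windowing operations of Sections 3.1 and 3.2 can be sketched in software. The code below is a simplified floating-point model under stated assumptions and not the authors' FPGA design: the feedback path (PWFIR2) is omitted, an odd window length is used so that the centre tap of (7) equals exactly 1, a sample is treated as a local minimum when it is the smallest value in a seven-sample neighbourhood centred on it, and all function names are hypothetical.

```python
import math


def hann(n_taps):
    """Hann window of (7); for odd n_taps the centre tap equals 1."""
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * k / (n_taps - 1)))
            for k in range(n_taps)]


def peak_search(c, span=7):
    """cp(n): keep c(n) only where it is the minimum of a span-sample neighbourhood."""
    half = span // 2
    cp = [1.0] * len(c)
    for n, value in enumerate(c):
        neighbourhood = c[max(0, n - half):n + half + 1]
        if value < 1.0 and value == min(neighbourhood):
            cp[n] = value
    return cp


def window_gain(cp, w):
    """b(n) = 1 - sum_k (1 - cp(k)) w(n - k), with the window centred on each peak."""
    centre = (len(w) - 1) // 2
    b = [1.0] * len(cp)
    for k, value in enumerate(cp):
        if value < 1.0:
            for j, wj in enumerate(w):
                n = k + j - centre
                if 0 <= n < len(b):
                    b[n] -= (1.0 - value) * wj
    return b


def peak_window(x, th, n_taps=9):
    """Clipping function (3), peak search, windowed gain (4) and gain correction (5)."""
    c = [th / abs(s) if abs(s) > th else 1.0 for s in x]
    b = window_gain(peak_search(c), hann(n_taps))
    return [bn * s for bn, s in zip(b, x)]


# Hypothetical envelope with a single peak of amplitude 1.0 at index 7.
x = [0.2, 0.2, 0.2, 0.2, 0.4, 0.6, 0.9, 1.0, 0.9, 0.6, 0.4, 0.2, 0.2, 0.2, 0.2]
y = peak_window(x, th=0.7, n_taps=9)
print(max(abs(v) for v in y))
```

For this isolated peak the gain b(n) equals c(n) at the peak itself, so the peak is brought exactly to th, while the neighbouring samples are attenuated smoothly instead of being clipped. With closely spaced peaks the windows overlap and the signal is attenuated more than necessary, which is the situation the feedback structure of the paper is meant to handle and which this sketch does not model.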
The symmetry of the filter coefficients enables an additional reduction of the number of multiplication operations by a factor of two. The arithmetic precision is 18-bit and it does not impact the PW performance. The architecture utilizes the 18x18-bit embedded FPGA DSP multipliers and provides up to 20 programmable coefficients. The number of coefficients is halved because of the coefficient symmetry. As a result of the architecture optimizations, the PWFIR1 block occupies only 10 FPGA DSP blocks, which are needed for the implementation of only five embedded 18x18-bit multipliers.

The detailed architecture of PWFIR1 is given in Fig. 3a. The PWFIR1 coefficient value at index j is determined by (8) and (9). The coefficients have indexes in the range from 0 to 19. Whenever the relation (9) is met, the coefficient at index j is determined by (8). Otherwise, the coefficient is equal to zero.

$$ h_{PWFIR1}(j) = w\left(j - \left(20 - \left\lfloor \frac{N+1}{2} \right\rfloor\right)\right) \tag{8} $$

$$ 20 - \left\lfloor \frac{N+1}{2} \right\rfloor \le j \le 19 \tag{9} $$

In Fig. 3a, the signal CLK is the clock signal, while the enable signal xen determines the input and output data rates. The input signal is sampled at the positive edge of CLK whenever xen = 1. The filter coefficients hi and input samples di are stored in the MEM and DMEM memory blocks respectively.

Fig. 3 Architecture of a) the PWFIR1 and b) the PWFIR2 module

Both MEM and DMEM are addressed by the 2-bit binary counter CNT. Five MEM blocks provide the signals h0 to h4; ten DMEM blocks give the signals d0 to d9. In each clock cycle, the outputs of MEM and DMEM are multiplied and the result is fed to the digital integrator:

$$ sum = \sum_{j=0}^{4} \left(d_j + d_{9-j}\right) h_j \tag{10} $$

The delayed input enable signal ien is used to set the integrator feedback signal to zero whenever the calculation starts. After four clock cycles, the output signal y holds the filter output. The output signal y is provided by latching the integrator output, which is also controlled by ien.
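The resource saving described above can be checked with a short model of the folded, time-multiplexed multiply-and-accumulate scheme of (10). The sketch assumes a 40-tap symmetric filter, five multipliers and four clock cycles per sample, as stated in the text; the assignment of coefficient index `j * n_cycles + cnt` to multiplier j in cycle cnt is an assumption made for illustration, since the memory layout is not specified, and the function names are hypothetical.

```python
import math


def hann(n_taps):
    """Hann window of (7), used here as a symmetric set of filter taps."""
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * k / (n_taps - 1)))
            for k in range(n_taps)]


def fir_direct(taps, history):
    """Direct-form FIR output; history[i] holds the input sample x(n - i)."""
    return sum(h * history[i] for i, h in enumerate(taps))


def fir_folded(half_taps, history, n_mult=5, n_cycles=4):
    """Symmetric FIR computed as n_cycles passes over n_mult multipliers."""
    n_taps = 2 * len(half_taps)
    acc = 0.0                                  # integrator, cleared for each output sample
    for cnt in range(n_cycles):                # value of the 2-bit counter
        cycle_sum = 0.0
        for j in range(n_mult):                # multipliers working in parallel
            i = j * n_cycles + cnt             # assumed coefficient index
            pair = history[i] + history[n_taps - 1 - i]  # pre-add of mirrored samples
            cycle_sum += pair * half_taps[i]
        acc += cycle_sum
    return acc


N_TAPS = 40
taps = hann(N_TAPS)
half = taps[:N_TAPS // 2]                      # 20 stored coefficients
history = [math.sin(0.37 * i) for i in range(N_TAPS)]  # arbitrary test data

cycles_per_sample = round(122.88e6 / 30.72e6)  # clock rate over sample rate
multipliers = (N_TAPS // 2) // cycles_per_sample

print(multipliers, fir_folded(half, history), fir_direct(taps, history))
```

Symmetry halves the 40 products to 20, and running the multipliers at four times the sample rate divides the count by four again, which gives the five multipliers quoted in the text. The folded computation returns the same value as the direct form up to rounding.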
At the positive edge of CLK when xen = 1, the counter is reset and the process is repeated. The coefficient memory blocks are implemented as dual-port register arrays. Similarly, the data memory is implemented as a single-port register array. The multipliers are implemented by 18x18-bit FPGA DSP blocks.

The PWFIR2 architecture is given in Fig. 3b. It provides up to 20 programmable filter coefficients, which are stored in a register array and are indexed in the range from 0 to 19. The coefficients of PWFIR2 are determined by (11) whenever the condition in (12) is met. Otherwise, the coefficient at index j is equal to zero.

$$ h_{PWFIR2}(j) = w\left(\left\lfloor \frac{N+1}{2} \right\rfloor + j\right) \tag{11} $$

$$ 0 \le j \le \left\lfloor \frac{N}{2} \right\rfloor - 1 \tag{12} $$

In Fig. 3b, five MEM blocks provide the signals h0 to h4; also, five DMEM blocks give d0 to d4. In each clock cycle, the outputs of MEM and DMEM are multiplied and the following sum is calculated:

$$ sum = \sum_{j=0}^{4} d_j\, h_j \tag{13} $$

4. Measurement results

In this section, the results of the measurements are reported [22]. We utilized the SDR board, which includes a transceiver IC covering the frequency range up to 3.8 GHz [23] and an additional on-board FPGA IC in which the PAPR reduction circuits are implemented. A test waveform can be uploaded and played from the WFM RAM block for development or demonstration. In real applications, the WFM RAM blocks are not required. The CPU core performs the functions of the BB digital modem, which are application specific (WCDMA or LTE [24], for example), and provides the input signal of the PAPR reduction block.

The output signal of the PAPR reduction block is filtered by a symmetrical 40-tap low-pass FIR filter. The FIR filter is necessary for two reasons. First, it compensates distortions generated by the BB digital modem. Second, it removes residual out-of-band distortions generated by the PAPR reduction block. The filter length, whose maximum is 40, is programmable, as are the filter coefficients.
The FIR provides up to 20 programmable coefficients, which can be changed to support different modulation schemes. The FIR architecture is identical to the architecture of PWFIR and is optimized using the same methods. The time multiplexing and the symmetry of coefficients are exploited to reduce the number of FPGA multipliers by a factor of 8.

A moderate-output-power PA with a saturated power of 19 dBm and a supply voltage of 5 V is used in the measurements. The transmitted signal output power is set to Pout = 6 dBm. The SDR board RF center frequency is set to 763 MHz.

The results are reported by analyzing the PAPR, ACPR and EVM of the PA output signal. In the measurements the PWFIR filter order N is changed (N = 9, 19, 29 and 39). Different clipping thresholds th are examined; the value of th is decreased from 1.0 down to 0.6 in steps of 0.04. For each combination of threshold th and filter order N, the PAPR, ACPR and EVM are measured by a spectrum analyzer.

In the different test cases the following waveforms are used: 5 MHz, 10 MHz, 15 MHz and 20 MHz LTE test model 3.1 (E-TM 3.1) and WCDMA. In particular, the E-TM 3.1 test specification applies to most LTE modulation schemes at maximum power, and it is regarded as the most rigorous and one of the most important of all EVM test specifications [16].

4.1. Test case 1: 10 MHz LTE test model 3.1 waveform

For the E-TM 3.1 10 MHz LTE waveform, the cut-off frequency of the FIR filter is set to 10 MHz. The measured PAPR vs. threshold graph is presented in Fig. 4a. As can be seen from Fig. 4a, the PAPR vs. threshold curve is almost linear. The PAPR value of the unmodified waveform is 10.2 dB. When the threshold value is set to 0.7, the PAPR of the output signal is reduced by 3 dB. The EVM versus PAPR plot is given in Fig. 4b.
the evm results are obtained at pa output after waveform is processed by papr reduction and low-pass fir filter blocks. when th = 1.0 is selected, the block for papr reduction is bypassed. in this case the evm is equal to 1.2%. as it is shown in fig. 4b, the evm is decreased with reduction of pwfir filter length n. for example, the combination of n = 9 and th = 0.6 yields to evm = 5.8%. in the case of n=39 and th = 0.6, evm = 7.7%. a) b) fig. 4 a) papr vs. th for 10mhz lte; b) evm for 10mhz lte as a function of n and papr. a) b) fig. 5 acpr for 10mhz lte as a function of papr when: a) fir filter is not used; b) fir is applied. the figs. 5a and 5b give the acpr values, obtained at pa output, as a function of papr and n. the measurements are performed for two cases: when low-pass fir filtering operation is bypassed and in the case when fir block is utilized. the figures clearly point the necessity of low-pass filtering. if the filtering is not performed, the acpr can only be improved when large n value is chosen, n=29 for example. if low-pass filter is used, the acpr results become insensitive to selection of n. in this case, when parameters th = 0.6 282 b. jovanović, s. milenković and n = 19 are chosen, out-of-band distortions are reduced to the systems noise floor and the acpr is equal to 52dbc. the utilization of low-pass fir not only reduces the out-ofband distortions at pw block output but also enables the usage of shorter pwfir window lengths, which gives better results in terms of evm. when papr is decreased to 8 db, the evm = 2% and acpr = 52dbc. similar results stand for in bs application, where the 10 mhz lte waveform is amplified using 10 w modulated output power pa with integrated dpd. the reduction of papr down to 8 db by proposed block gives the performance at pa output of evm = 2% and acpr = 52 dbc, obtained at 39.5 dbm modulated output power. 4.2. 
test case 2: 10 mhz lte when different window functions are used

in the previous test case the hann window function is applied. however, different window functions can be used in the realization of the coefficients w(n). we considered other window functions that behave differently from the hann function, for example the hamming and blackman-harris windows. we measured the papr, acpr and evm of the signal at the pa output when the hann, hamming and blackman-harris functions are separately applied. the results in terms of acpr and evm are given in fig. 6a and 6b, respectively. in these measurements the fir filter is left bypassed. different window lengths are considered in the figures: n = 9, n = 19 and n = 39.

fig. 6 a) acpr and b) evm as a function of papr and n when the 10 mhz lte waveform and different window functions are used: hann, hamming and blackman-harris.

4.3. test case 3: 5 mhz lte test model 3.1

for the 5 mhz lte waveform the cut-off frequency of the low-pass fir filter is set to 5 mhz. in this test case the hann window function is applied. the evm versus papr plot is given in fig. 7a. fig. 7b depicts the acpr results, obtained at the pa output, as a function of papr and n. the results presented in fig. 7b are obtained when the low-pass fir filtering operation is bypassed. when the low-pass fir block is used and the papr is reduced by 2 db down to papr = 8.2 db, the out-of-band distortions are reduced down to -55 dbc and evm = 2%.

fig. 7 a) evm and b) acpr as a function of n and papr for 5 mhz lte.

4.4. test case 4: 20 mhz lte test model 3.1

fig. 8 a) evm and b) acpr as a function of n and papr for 20 mhz lte.

in the test case of 20 mhz lte the cut-off frequency of the low-pass fir filter is set to 20 mhz. the results in terms of evm and acpr are given in figs. 8a and 8b, respectively. the utilization of low-pass fir filter is necessary when n = 9.
in the other cases of n = 19, n = 29 and n = 39, the acpr is equal to -50.2 dbc. when the papr is decreased by 2 db, the evm = 5.36%.

4.5. test case 5: wcdma test model 1

in the case of the wcdma test model 1 waveform, the coefficients for a 5 mhz low-pass filter are loaded in the fir, and the hann function is applied. the evm vs. papr plot is given in fig. 9a. the papr of the unclipped wcdma signal is 10.6 db, and the evm is then 1.1%. when th is reduced to 0.7, the papr of the output signal is reduced by 3 db. increasing n increases the evm. fig. 9b depicts the acpr results obtained for the wcdma input waveform when signal filtering is not used. when the fir block is used, the out-of-band distortions are reduced to the system's noise floor and the acpr becomes equal to -55 dbc.

fig. 9 a) evm and b) acpr as a function of n and papr for wcdma.

5. discussion

a signal envelope containing high peaks is an unwanted characteristic of modern modulation schemes. this can be seen in the examples of the wcdma and 10 mhz lte waveforms. for the reduction of such high signal peaks we use the pw method. the original version of the pw method, found in [11, 19], is modified by introducing a novel peak search block. compared to the original pw implementation, the new peak search block minimizes the absolute value of the difference between the local minima of the gain correction b(n) and the clipping signal c(n). this difference is minimized independently of the clipped signal amplitude. as a result of equalizing the amplitudes of b(n) and c(n), the peaks in the output signal envelope are more accurately constrained to the threshold th, resulting in lower evm values. the pwfir order does not affect the papr, but does affect the acpr and evm. the acpr and evm also depend on the threshold level. we investigated the tradeoff between smaller papr and larger signal distortions.
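for orientation, the basic peak windowing idea referred to above can be sketched in a few lines of python. this is only a minimal illustration of the original pw scheme as it is commonly described (a clipping function c(n) smoothed by a window to obtain the gain correction b(n)); it does not model the peak search block, the feedback path or the fpga architecture. the function names, the hann window choice and the toy test signal are assumptions made for this example.

```python
import numpy as np

def papr_db(x):
    """Peak-to-average power ratio of a baseband signal, in dB."""
    p = np.abs(x) ** 2
    return 10.0 * np.log10(p.max() / p.mean())

def peak_window(x, th, n_win):
    """Basic peak windowing (illustrative sketch, not the paper's modified block)."""
    mag = np.abs(x)
    # clipping function c(n): 1 below the threshold, th/|x(n)| above it
    c = np.where(mag > th, th / np.maximum(mag, 1e-12), 1.0)
    # Hann window; for odd n_win its centre tap is exactly 1
    w = np.hanning(n_win)
    # gain correction b(n) = 1 - sum_k (1 - c(k)) * w(n - k)
    b = 1.0 - np.convolve(1.0 - c, w, mode="same")
    # overlapping windows can over-attenuate; keep the gain within [0, 1]
    b = np.clip(b, 0.0, 1.0)
    return b * x

# toy signal (assumption): constant envelope with one isolated peak
x = np.full(101, 0.5, dtype=complex)
x[50] = 2.0
y = peak_window(x, th=1.0, n_win=9)
print(f"papr before: {papr_db(x):.2f} dB, after: {papr_db(y):.2f} dB")
```

for an isolated peak the local minimum of b(n) coincides with c(n), so the peak lands on the threshold. when several peaks fall within one window length the windows overlap and this simple form attenuates more than necessary, which is the kind of mismatch between b(n) and c(n) discussed above.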
in our pw implementation, the low-pass fir is employed to reduce the out-of-band distortions. after processing with the low-pass fir, high-frequency signal components are eliminated and the implemented block shows improved acpr performance. as can be seen from the measured results, the out-of-band distortion is reduced down to the system's noise floor. the low-pass fir also enables the selection of shorter pwfir window lengths, which yields lower evm, conforming to strict telecommunication standards [6]. the novel pwfir architecture is created to fulfill two main requirements: to be programmable and to save fpga resources. the programmability enables the modulation scheme to be changed by adjusting the pwfir parameters. namely, the pwfir circuit has provision to change the window filter length and the filter coefficients. besides modifying the pwfir configuration, we have the option to specify the clipping threshold. the pwfir architecture is dedicated to fpga implementation and it is optimized to save fpga chip resources. the number of multipliers in the pwfir is reduced eight times. time multiplexing reduces the number of multipliers by a factor of four. the throughput of the pwfir implementation is equal to the sample rate of 30.72 msps, which is defined by the 3g and 4g lte standards [6]. the circuits operate at a clock frequency of 122.88 mhz, i.e., four times the sample rate; this frequency is determined by the propagation delays of the embedded fpga multiplier blocks. the pwfir architecture is additionally optimized by a factor of two by exploiting the symmetry of the pwfir filter coefficients. each of the pwfir1 and pwfir2 filters occupies exactly 10 fpga dsp blocks, implementing only five 18×18 bit multipliers. additional 10 bit multipliers are used for the realization of the low-pass fir filter block, whose architecture is similar to the architecture of the pwfir filters.
the arithmetic precision of the digital blocks implemented within the fpga is 18-bit and it does not influence the pw performance. the utilization of fpga resources is given in table 1.

table 1 the occupied altera cyclone v fpga resources

digital block           combinatorial alut   dedicated logic registers   dsp blocks
papr reduction block    2168                 3389                        14
fir                     1523                 2168                        10

the results in terms of acpr, evm and papr, obtained for different modulation schemes (5 mhz, 10 mhz, 15 mhz, 20 mhz lte e-tm 3.1 and wcdma), are summarized in table 2. in all measurements the papr of the input signal is reduced by 2 db. as the bandwidth of the waveform increases, the evm and acpr results become worse. for example, for 20 mhz lte the acpr is equal to -50.2 dbc, compared to -55 dbc obtained for the 5 mhz lte signal. the evm of the 20 mhz signal is equal to 5.36%.

table 2 the measured acpr, evm and papr for 5 mhz, 10 mhz, 15 mhz, 20 mhz lte and wcdma when the papr of the input waveform is reduced by 2 db

              5 mhz lte   10 mhz lte   15 mhz lte   20 mhz lte   wcdma
acpr [dbc]    -55         -51.8        -51.3        -50.2        -55
evm [%]       1.9         1.9          2.93         5.36         1.4
papr [db]     8.2         8.2          8.2          8.4          8.6

in order to evaluate the performance of the proposed papr reduction, it is compared with the caf [12], original pw [14] and hpw [16] methods found in the literature. the pwfir filter length n = 19 and the lte 10 mhz e-tm 3.1 waveform are used. fig. 10 summarizes the performance comparison of the proposed method and the existing methods. the proposed method outperforms the caf, original pw and hpw schemes in terms of evm. besides, the measured evm is well below evm = 8%, the value required by standards, even when the papr is reduced down to 6 db.

fig. 10 comparison of evm vs. papr plots of the proposed method with references. the lte 10 mhz e-tm 3.1 waveform is used.

different window functions are considered in the realization of the pwfir filter. hann and hamming functions produce similar results in evm and acpr.
the blackman harris gives better results in evm but acpr is significantly worsened. in the papers which are used for comparison, waveforms are generated by laboratory equipment. after the waveforms are processed by matlab software, implementing the papr reduction, they are up-converted to rf by vsg. in all references the dac resolution is greater or equal than 14 bits. in our case the resolution of input waveforms, as well as the resolution of embedded dac, located in lms7002 transceiver ic, is equal to 12 bits. 6. conclusion the state-of-the-art modulation schemes exhibit large papr values, enhancing the non-linear effects of power amplifiers (pa) and increasing the running cost of rf base stations. this paper presents novel peak windowing papr reduction method dedicated for implementation in sdr based rf base stations. the pa is constrained to operate within its linear region using pw method which employs low-pass filtering for complete elimination of residual out-of-band distortion. besides, the novel peak search block is created in preprocessing stage of pw to precisely constrain the envelope of the output signal to selected threshold. in conjunction with feedback path in pw architecture, the peak search block reduces amount of in-band distortion. the advantage of implemented hardware is that it can be used in different modulation schemes. namely, to support various schemes, the pw module provides adjustment of different window lengths, threshold levels and loading of new filter coefficients. to demonstrate performance of papr reduction method the wcdma, 5mhz, 10mhz, 15mhz and 20mhz lte modulations are utilized. the papr, evm and acpr are obtained by spectrum analyzer at pa output, antenna point. we show that proposed method exhibits better performance in terms of in-band distortions than the receiver-independent methods found in literature. 
besides, the novel papr reduction architecture reduces the number of embedded multipliers and it is therefore suitable for implementation in fpga ics.

references [1] b. jovanović and s. milenković, "the implementation of peak windowing technique", in proceedings of the 6th icetran conference, srebrno jezero, 3-6 june 2019, eli 1.2. [2] c. eun and e. j. powers, "a new volterra predistorter based on the indirect learning architecture", ieee trans. signal process., vol. 45, no. 1, pp. 223–227, 1997. [3] j. k. cavers, "amplifier linearization using a digital predistorter with fast adaptation and low memory requirements", ieee trans. veh. technol., vol. 39, no. 4, pp. 374–382, 1990. [4] a. đorić, n. maleš-ilić and a. atanasković, "rf pa linearization by signals modified in baseband digital domain", facta universitatis, series electronics and energetics, vol. 30, no. 2, pp. 209–221, 2017. [5] d. w. lim, s. j. heo and j. s. no, "an overview of peak-to-average power ratio reduction schemes for ofdm signals", journal of communications and networks, vol. 11, no. 3, pp. 229–239, june 2009. [6] evolved universal terrestrial radio access (e-utra); base station (bs) radio transmission and reception 2012, www.3gpp.org. [7] t. jiang and y. wu, "an overview: peak-to-average power ratio reduction techniques for ofdm signals", ieee trans. broadcasting, vol. 54, no. 2, pp. 257–268, june 2008. [8] j. tellado, "peak to average power reduction for multicarrier modulation", ph.d. dissertation, stanford university, 2000. [9] r. w. bäuml, r. fisher and j. b. huber, "reducing the peak-to-average power ratio of multicarrier modulation by selected mapping", electronics letters, vol. 32, no. 22, pp. 2056–2057, 1996. [10] s. h. müller and j. b. huber, "ofdm with reduced peak-to-average power ratio by optimum combination of partial transmit sequences", electronics letters, vol. 33, no. 5, pp. 368–369, 1997. [11] h.
mistry, "implementation of a peak windowing algorithm for crest factor reduction in wcdma", master of engineering thesis, simon fraser university, canada, 2006. [12] x. li and l. j. cimini, "effects of clipping and filtering on the performance of ofdm", ieee communication letters, vol. 2, no, 5, pp. 131–131, 1998. [13] t. lee and h. ochiai, "experimental analysis of clipping and filtering effects on ofdm systems", in proceedings of the 2010 ieee international conference on communications (icc 2010), south africa, ieee, 2010. [14] s. cha, m. park, s. lee, k. j. bang and d. hong, "a new papr reduction technique for ofdm systems using advanced peak windowing method", ieee trans. of consumer electronics, vol. 54, no. 2, pp. 405–410, 2008. [15] g. chen, r. ansaru and y. yao, "improved peak windowing for papr reduction in ofdm", in proceedings of the ieee 69th vtc spring conference, barcelona, spain: ieee, 2009, pp. 1–5. [16] d. kim and s. an, "experimental analysis of papr reduction technique using hybrid peak windowing in lte system", journal on wireless communications and networking, vol. 75, december 2015. [17] s. han and j. lee, "an overview of peak-to-average power ratio reduction techniques for multicarrier transmission", ieee trans. wireless communications, vol.12, no.2, april 2005, pp. 56– 65. [18] m. pauli and h. p. kuchenbecker, "minimization of the inter-modulation distortion of a nonlinearly amplified ofdm signal", journal of wireless personal communications, vol.4, no.1, pp. 90–101, january 1996. [19] o. vaananen et al., "effect of clipping in wideband cdma systems and simple algorithm for peak windowing", in proceedings of the of world wireless congress, san francisco, usa, may 28-31 2002, pp. 614–619. [20] b. jovanović, m. damnjanović and v. litovski, "square root on chip", etf journal of electrical engineering, ee department, university of montenegro, vol. 12, pp. 65–75, may 2004. [21] b. jovanović and s. 
milenković, "peak windowing for peak to average power reduction", in proceedings of the 7th small systems simulation symposium, nis, serbia, 2018, pp. 33–36. [22] limemicrosystems limesdr qpcie (2019), https://wiki.myriadrf.org/limesdr-qpcie. [23] limemicrosystems lms7002m (2019), https://limemicro.com/. [24] n. milosević, b. dimitrijević, d. drajić, z. nikolić and m. tosić, "lte and wifi co-existence in 5 ghz unlicensed band", facta universitatis, series electronics and energetics, vol.30, no.3, pp. 363– 373, 2017. instruction facta universitatis series: electronics and energetics vol. 29, no 4, december 2016, pp. 509 541 doi: 10.2298/fuee1604509p p-channel mosfet as a sensor and dosimeter of ionizing radiation  milić m. pejović university of niš, faculty of elecronic engineering, niš, serbia abstract. this paper presents a study of mosfets as a sensor and dosimeter of ionizing radiation. the electrical signal used as a dosimetric parameter is the threshold voltage. the functionality of these components is based on radiation-induced ionization in sio2, which results in increase of positive charge trapped in the sio2 and interface traps at si sio2, leads to change in threshold voltage. the first part of the paper deals with analysis of defect precursors created by ionizing radiation, responsible for creation of fixed and switching traps, as well as most important techniques for their separation. afterwards, the results for sensitive p-channel mosfets (radfets) are presented, following with results for commercially available mosfets applications as a sensors of ionizing radiation. key words: fixed traps, fading, mosfet, radfet, switching traps, threshold voltage shift 1. introduction the attention of today’s research on the impact of ionizing radiation on mosfets is directed in two ways. 
the first one is the production of mosfets with the highest possible resistance to ionizing radiation (radiation hardness), while the other is toward the production of ionizing radiation dosimeters. the first report on the use of a p-channel mosfet as an integrating radiation dosimeter was published in 1970 [1] and this idea was verified by results published in 1974 [2]. further investigations led to the manufacture of radiation sensitive p-channel mosfets, also known as radiation sensitive field effect transistors (radfets) or pmos dosimeters [3]. the radfet has been shown to be suitable for dose measurements in various applications, such as diagnostic radiology and radiotherapy [4]-[8], space radiation monitoring [9]-[12], irradiation of food plants [13] and personal dosimetry [14], [15]. the radfet radiation-sensitive region, the oxide film layer under the al gate, is typically 1 μm × 200 μm × 200 μm, i.e., the sensing volume is much smaller than that of competing integral dose measuring devices such as the ionization chamber or the thermoluminescent dosimeter, implying that it can also be used for in vivo dosimetry [16], [17]. this property of the radfet also makes it attractive for measurement in a gradient radiation field where the gradient mostly depends on a single space coordinate, such as resolving the dose of x-ray microbeams or the depth dose distribution [18]. the advantages of radfets include immediate, nondestructive readout of dosimetric information, real-time or delayed reading, possible integration with other sensors and/or electronics, wide dose range, accuracy and competitive price [19], [20]. the application of radfets for dosimetry in hadron therapy, which is one of the promising radiation modalities in radiotherapy, is another field where it is possible to explore their advantages.

received march 22, 2016. corresponding author: milić m. pejović, university of niš, faculty of electronic engineering, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: milic.pejovic@elfak.ni.ac.rs)
hadron therapy includes, fast neutron therapy, proton therapy, heavy ion therapy and boron-neutron capture therapy. it is shown [21] that radfets are less sensitive to neutron radiation than the photon or charge particles. on the other hand, a disadvantage of radfets is the need for separate calibration in the fields of different modalities and energy. moreover, radfets have a certain range of the total accumulated dose, which depends on the dosimeter type and sensitivity. once the upper limit of linearity is achieved, the radfets need to be replaced. however, recent studies have shown that such radfets can be recovered for reuse by storing at room or elevated temperature for a sufficient time [22], [23] or by annealing with current [24], [25]. the dosimetry of ionizing radiation using radiation sensitive mosfets is based on converting the threshold voltage shift vt into radiation dose d. this shift originates in the radiation-induced electron-hole pairs in the gate oxide layer of the transistor which lead to increase in the density of interface traps and build-up or neutralization of positive trapped charge. the sensitivity of radfets can be adjusted, which makes them suitable for various applications. for example, sensitivity can be tuned using different gate oxide layer thickness [26], [27], or in some cases by stacking transistors [14], [15]. the sensitivity can also be tuned by applying positive bias on the gate during irradiation [28], [29]. 2. the defects precursors created by ionizing radiation ionizing radiation leads to formation of large number of defects in sio2 and at sio2 si interface, which are responsible for mosfets threshold voltage shift. the defects which make significant impact to devices performance will be discussed further. 2.1. 
photon induced ionization

during gamma or x-ray irradiation, photons interact with the electrons in the sio2 molecules, releasing secondary electrons and holes, i.e., photons break si_o-o and si_o-si_o covalent bonds in the oxide [30] (the index o is used to denote a silicon atom in the oxide). the released electrons (so called “secondary electrons”), which are highly energetic, may recombine with holes at the place of production or may escape recombination. the secondary electrons that escape recombination with holes travel some distance until they leave the oxide, losing their kinetic energy through collisions with the bonded electrons in the si_o-o and si_o-si_o covalent bonds in the oxide and releasing more secondary electrons (the latter bond represents an oxygen vacancy). each secondary electron, before it leaves the oxide or recombines with a hole, can break many covalent bonds in the oxide and produce many new highly energetic secondary electrons, since its energy is usually much higher than the energy of an impact ionization process (an energy of 18 ev is necessary for the creation of one electron-hole pair [30], i.e., for the ionization of the molecule). it is obvious that the secondary electrons play a more important role in bond breaking than highly energetic photons, as a consequence of the difference in their effective masses, i.e., in their effective cross sections. the electrons leaving the place of production escape the oxide very fast (within several picoseconds), but the holes remain in the oxide. the holes released in the oxide bulk are usually only temporarily, not permanently, trapped at the place of production, since there are no energetically deeper centers in the oxide bulk. the holes move toward one of the interfaces (sio2-si or sio2-gate), depending on the direction of the oxide electric field, where they are trapped at energetically deeper hole trap centers [31], [32].
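as a rough order-of-magnitude illustration of the 18 ev figure quoted above, the number of electron-hole pairs initially generated per unit oxide volume and per gray of absorbed dose can be estimated as sketched below. the sio2 mass density of 2.2 g/cm3 and the function name are assumptions introduced for this example, and the estimate ignores the recombination mentioned in the text, so it is an upper bound on the pairs that actually survive.

```python
# order-of-magnitude sketch: initial electron-hole pair generation in SiO2
EV_PER_JOULE = 1.0 / 1.602176634e-19  # electronvolts in one joule

def pairs_per_cm3_per_gray(pair_energy_ev=18.0, density_g_cm3=2.2):
    """Pairs generated per cm^3 of oxide per Gy, before any recombination.

    pair_energy_ev : mean energy per electron-hole pair (18 eV in the text)
    density_g_cm3  : assumed SiO2 mass density (not given in the text)
    """
    ev_per_kg = EV_PER_JOULE            # 1 Gy = 1 J/kg absorbed energy
    kg_per_cm3 = density_g_cm3 * 1e-3   # g/cm^3 -> kg/cm^3
    return ev_per_kg * kg_per_cm3 / pair_energy_ev

g0 = pairs_per_cm3_per_gray()
print(f"about {g0:.2e} pairs per cm^3 per Gy")
```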
moreover, even in the zero gate voltage case, the electrical potential due to a work function difference between gate and substrate is high enough for partial or complete moving towards an interface. 2.2. the defects created by secondary electrons in impact ionization a secondary electrons passing through oxide bulk, break covalent bonds in the oxide by the impact ionization and create  sio  o + sio  complex, where  denotes the three sio  o bonds (o3  sio  o) and  denotes the unpaired electron. the formed  sio  o + sio  complex is energetically very shallow, representing the temporary hole centre (the trapped holes can easily leave it [33]). the strained silicon-oxygen bond  sio  o  sio  mainly distributed near the interfaces can also be easily broken by the passing secondary electrons, usually created non-bridgingoxygen (nbo) centre,  sio  o  , and positively charged e' center,   o si [34] known as a e's center [35]. a nbo centre is an amphoteric defect that could be more easily negatively charged than positively by trapping an electron. the nbo as an energetically deeper centre is the main precursor of the traps (defects) in the oxide bulk and the interface regions. a secondary electron passing through the oxide can also collide with an electron in the strained oxygen vacancy bond  sio  sio , which is a precursor of a e' centre (   o si ) [36], breaking this bond and knocking out an electron. the oxygen vacancy bonds are mainly distributed in the vicinity of the interfaces. the trapped charge can be positive (oxide trapped holes) and negative (oxide trapped electrons) and the former is more important, since the hole trapping centers more numerous including e's, e' and nbo centers, compared with one electron trap centre (nbo). the holes and electrons trapped near the si – sio2 interface have the biggest effect on mosfet characteristics, since they have the strongest influence on the channel carriers. 2.3. 
defects created in sio2 by hole transport

the e' and e's hole trapping centers, formed from oxygen vacancies and strained silicon-oxygen bonds, respectively, are energetically deep and stable; holes can remain at them for a long time, i.e., they are harder to neutralize by electrons than shallowly trapped holes. these centers exist near both interfaces, especially near the si-sio2 interface. the holes created and trapped at the bulk defects, which represent energetically shallow centers, are forced to move towards one of the interfaces under the electric field, where they are trapped at deeper traps, since there are many oxygen vacancies as well as many strained silicon-oxygen bonds near the interfaces, so that all positive trapped charge is grouped there. the holes leave the energetically shallow centers in the oxide spontaneously and are transported to the interface (fig. 1(a)) by a hopping process, using either shallow centers in the oxide (fig. 1(b)), where the holes “hop” from one center to another, or centers in the oxide valence band (fig. 1(c)) [31], [37]. fig. 1 displays the hole transport in space for positive gate bias (a) and the energetic diagrams of the possible mechanisms of this process (b) and (c).

fig. 1 space diagram: (a) hole transport through the gate oxide layer in the case of positive gate bias. “x” represents unbroken bonds and “o” broken bonds (trapped holes at shallow traps), respectively, and “  ” represents the hole trap precursors near the interface (precursors of a deep trap). energetic diagram: the hole transport (b) by tunneling between two localized traps and (c) by the oxide valence band.

fig. 2 shows the possible hole (electron) tunneling between adjacent centers: a shallow centre and a deep centre. when there is no gate bias (fig. 2 (a)), hole (electron) tunneling between these centers is not possible. when the transistor is positively biased (fig.
2 (b)), the bonded electron can tunnel from the deep centre to the shallow centre. it represents the hole tunneling from shallow to deep centers, being permanently trapped at the deep center. the electron, which is now in the shallow centre, can easily tunnel from this shallow centre to the next adjusted shallow centre, enabling the hole transport towards the interface [31]. p-channel mosfet as a sensor and dosimeter of ionizing radiation 513 fig. 2 the electron tunneling between adjacent centers: (a) shallow centre and (b) deep centre. moving throughout the oxide, the holes react with the hydrogen defect hsio  and ohsi o  finally create e's, e' , nbo centers, hydrogen ions  h and hydrogen atoms o h .  h ions and o h atoms were important for defect creation at sio2-si interface (see the next subsection). when the holes reach the interface, they can break both the strained oxygen vacancy bonds  oo sisi , forming e' centers [34] and the strained silicon-oxygen bonds  oo siosi , created the e's and nbo centers [31]. these centers represent energetically deeper hole and electron trapping centers, respectively. it should be noted that the energetic levels of the defects created after the holes at e's, and e' centers and electrons at nbo centre, respectively, have been trapped can be various. the chemically same defects show different behaviors depending on the whole bond structure: the angles and distances between the surrounding atoms [38]-[43]. 2.4. defects created at sio2 si interface the defects at the sio2-si interface known as true interface traps represent an amphoteric defect si3  si  s (index s is used to denote silicon atom in substrate): a silicon atom  si  s at sio2-si interface back bonded to three silicon atoms from the substrate  sis usually denoted as  si  s or si  . they can directly be created by incident photons passing to the substrate or the gate [44], [45] but this amount can be neglected. 
interface traps are mainly created by trapped holes (h + model) [46]-[49] and by hydrogen released in the oxide (hydrogen-released species model– h model) [50]-[52]. the h + model proposes that a hole trapped near the sio2-si interface created interface trap, suggesting that an 514 m. pejović electron-hole recombination mechanism is responsible [47]. namely, when holes are trapped near the interface and electrons are subsequently injected from substrate, recombination occurs. from the energy released by this electron-hole recombination the interface state may be created. the h model proposes that h + ions released in the oxide by trapped holes drift towards sio2-si interact with  sio  h and  sio  oh defects, drifting toward the sio2-si interface under the positive electric field. when the h + ion arrives at the interface, it picks up an electron from the substrate, breaking a highly reactive hydrogen atom h o [53]. also, according to the h model the hydrogen atoms o h released in reaction holes with  sio  h and  sio  oh defects and diffuse towards the sio2-si interface under the existing concentration gradient. these atoms react without an energy barrier at the interface producing interface trap in interaction with interface trap precursors  sis  h and  sis  oh [54]-[56]. interaction between h o atoms with  sis  h and  sis  oh precursors, beside creation of interface traps in interaction with interface trap precursors, leads to the creation of h2 and h2o molecules, respectively [31], [53]. h2 molecules diffuse towards the bulk of oxide where it is cracked at cc + centers [57]. this cracking process ensured the continuous source of h + ions, which drift towards the interface to form interface traps [58]. 2.5. classification of defects according to their influence on i-v characteristics the above mentioned defects can be divided to fixed traps (ft) and switching traps (st). 
ft represent traps in the oxide that do not have the ability to exchange charge with the channel (substrate) within the transfer/subthreshold characteristic measurement time frame [59]. ft can be either negatively or positively charged, and they attract or repel the channel carriers by the coulomb force, depending on the sign of their charge and of the channel carrier charge. st represent the traps created near and at the sio2-si interface, and they do capture (communicate with) carriers from the channel within the transfer/subthreshold characteristic measurement time frame [59]. the st created in the oxide near the sio2-si interface are called slow switching traps (sst), while the st created at the interface are called fast switching traps (fst), also called true interface traps. the sst located in the oxide, close to the sio2-si interface, are also known as slow states (ss) [60], anomalous positive charge (apc) [61], [62], switching oxide traps (sot) [63] and border traps [64]. the influence of ft and st on the transistor subthreshold characteristics is manifested through a parallel shift and a slope variation, respectively. ft are usually deeper in the oxide, and during long-time post-irradiation annealing they can only be permanently recovered or temporarily compensated (as in the case of switching gate bias experiments). fst are amphoteric, and each of them contributes two states within the silicon band gap (an acceptor and a donor), which can be randomly distributed inside it.

3. transistor characterization

there are several techniques for ft and st separation [65]. the most commonly used are the subthreshold midgap and charge pumping techniques. their basic principles will be presented.

3.1. subthreshold midgap technique

fig.
3 subthreshold characteristics of radfets with 100 nm thick gate oxide manufactured by tyndall national institute, cork, ireland: (0) before gamma-ray irradiation and (1) after irradiation to 500 gy. the midgap-subthreshold (mg) technique [59] for determination of ft and st densities is based on analysis of mosfets subthreshold characteristics. namely, the influence of ft and st on the transistor subthreshold characteristics in saturation is in their parallel shifts and their slope changes, respectively. the first step is linear regression of the linear regions of subthreshold characteristics (fig. 3). the linear regression gives a straight line nvmi gd )log( . the next step in the procedure is the calculation of the midgap current before irradiation, img0 and after irradiation img. the calculation of the midgap current is performed using the subthreshold-current equation for a transistor in saturation [66]: )exp()( 2 2 2 ,0 s sda i dx s d kt q q kt qn ktn lc i     , (1) where 0 /x effw c l  and 2 , / d s a d l kt q n is the debye length. in this equation c0x is the oxide capacitance per unit area, k is the boltzmann’s constant, q is the absolute value of electron charge, t is the absolute temperature, ni is the intrinsic carrier concentration, na,d is the doping concentration, s is the silicon permittivity, s is the surface potential,  is the carriers mobility, w is the channel width and leff is the effective channel length. regardless of the distribution within the substrate energy gap, interface traps are electrically neutral (total charge equals zero) when surface potential s is equal to fermi’s potential f and that is the case when fermi’s level is in the middle of the semiconductor’s energy gap. in that case, the shift between two subthreshold characteristics towards the vg-axis is a consequence of the charge of ft only, and the gate voltage which 516 m. 
pejović corresponds to these surface potential is denoted as vmg (midgap voltage) and it can be obtained as abscissa of the (vmg, img) point at subthreshold characteristics (fig. 3). using the equation log(id) = m  vg + n, obtained by the linear fit of subthreshold characteristic, the vmg, i.e., vg that corresponds to id = img could be found as ./])[log( mniv mgmg  using this procedure, vmg0 and vmg are found. in fig. 3 a region used for the linear fit is shown, and the straight lines obtained by the linear fits of subthreshold characteristics are extended up to corresponding midgap current img. the component of threshold voltage shift due to ft, ft v is 0mgmgmgft vvvv  , (2) where vmg0 and vmg are midgap voltages before irradiation and after irradiation, respectively. the component of threshold voltage shift due to st, vst, is 000 )()( ssmgtmgtst vvvvvvv  , (3) where vt0 and vt are transistors threshold voltages before irradiation and after irradiation, respectively and threshold voltage shift is 0ttt vvv  . vt0 and vt are determined from the transfer characteristics in saturation as the intersection between vg-axis and extrapolated linear region of )( gd vfi  curves that are modeled by the following equation [66]: 20 )( 2 tg eff x d vv l wc i    . (4) the total value of threshold voltage shift, t v can be expressed as [67]: stftt vvv  , (5) st x ft x t n c q n c q v  00 , (6) where q is the absolute value of electron charge nft is the areal density of ft and nst is the areal density of st. signs “+” and “-“ are for p-channel and n-channel mosfet, respectively. as it can be seen from exp. (6), both the ft and st contribute to the threshold voltage shift in p-channel mosfet in the same direction. 
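The separation procedure described by expressions (2), (3) and (6) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the voltage readings and the oxide capacitance value are hypothetical, the voltages are treated as magnitudes, and the midgap current is taken as an input rather than computed from expression (1).

```python
import math

Q = 1.602176634e-19  # elementary charge in coulombs


def linear_fit(x, y):
    """Least-squares line y = m*x + n; returns (m, n)."""
    count = len(x)
    mean_x = sum(x) / count
    mean_y = sum(y) / count
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    m = sxy / sxx
    return m, mean_y - m * mean_x


def midgap_voltage(v_g, i_d, i_mg):
    """V_mg = [log(I_mg) - n] / m from a fit of log10(I_D) versus V_G.

    v_g and i_d must cover only the linear part of the subthreshold
    characteristic; i_mg is the midgap current (an input here).
    """
    m, n = linear_fit(v_g, [math.log10(i) for i in i_d])
    return (math.log10(i_mg) - n) / m


def separate_shifts(v_mg0, v_mg, v_t0, v_t):
    """Return (dV_ft, dV_st) following expressions (2) and (3)."""
    dv_ft = v_mg - v_mg0
    dv_st = (v_t - v_mg) - (v_t0 - v_mg0)
    return dv_ft, dv_st


def density_change(dv, c_ox):
    """Areal density change |dV| * C_0x / q, the magnitude form of (6)."""
    return abs(dv) * c_ox / Q


# Hypothetical magnitudes of midgap and threshold voltages, in volts.
dv_ft, dv_st = separate_shifts(v_mg0=1.0, v_mg=3.0, v_t0=2.5, v_t=4.8)
C_OX = 3.45e-8  # F/cm^2, assumed value for a 100 nm SiO2 layer
print(dv_ft, dv_st)
print(density_change(dv_ft, C_OX), density_change(dv_st, C_OX))  # in cm^-2
```

By construction the two components add up to the total shift $V_T - V_{T0}$, in agreement with expression (5).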
Also, the so-called "rebound effect" [30] is absent in p-channel MOSFETs. This phenomenon arises in n-channel MOSFETs from the competing effects of the positive charge in the oxide and the negatively charged interface traps, which lead to a positive or negative $\Delta V_T$ depending on the relative values of $\Delta N_{ft}$ and $\Delta N_{st}$. This is the reason why p-channel MOSFETs are more commonly used as sensors or dosimeters of ionizing radiation. It should be noted that $\Delta N_{ft}$ may contain a small contribution from SST located deeper in the oxide, since there is not enough time for carriers from the channel to reach them within the measurement time frame.

3.2. Charge pumping technique

As opposed to the MG technique, the charge-pumping (CP) technique does not give the changes in the densities of both the positive oxide trapped charge and the interface traps. It is used solely to determine the interface trap density, while the positive oxide trapped charge can be subsequently determined from expression (6), provided that the change in threshold voltage is known [68]-[70].

Fig. 4 Schematic diagram of charge pumping measurement.

The charge-pumping effect can be explained on the basis of the scheme in Fig. 4 [69]. The source and the drain of the transistor are short-circuited, and the p-n junctions of the source and drain with the substrate are reverse biased with voltage $V_R$. In the absence of a signal at the gate, the reverse bias of the source-substrate and drain-substrate junctions causes the reverse saturation current of these junctions to flow. When a train of rectangular pulses of sufficiently high amplitude is applied to the gate (from a pulse generator), the direction of the substrate current changes. The intensity of that current is proportional to the pulse frequency, i.e., each pulse "pumps" the same amount of electric charge towards the substrate.
As current cannot flow through the oxide, the electric charge in the substrate flows through the p-n junctions of the source and drain. In the case of n-channel MOSFETs, a channel is formed under the gate during the positive half-period of the pulse, whereby electrons are captured on interface traps. During the negative half-period, when the channel area is driven into accumulation, mobile electrons from the channel return to the source and drain, and the captured electrons recombine with holes from the accumulation layer, thereby generating the CP current $I_{cp}$, whose maximum value $I_{cp,max}$ is given by [70]

$$ I_{cp,max} = q^2 A_g f D_{it}\, \Delta\psi_s = q A_g f D_{it}\, \Delta E, \qquad (7) $$

where $A_g$ is the area under the gate active in charge pumping, $f$ is the pulse frequency, $D_{it}$ is the interface trap density per unit energy and $\Delta\psi_s = \Delta E/q$ is the total sweep of the surface potential that corresponds to the energy range $\Delta E$. In order to avoid recombination with channel electrons, it is necessary to ensure their return to the source and drain before holes from the substrate flood the channel area. This is accomplished by reverse biasing the p-n junctions or by using a train of trapezoidal or triangular pulses with sufficiently long rise time $t_r$ and fall time $t_f$. However, some of the electrons, those captured in the shallowest traps, are in the meantime thermally emitted into the conduction band of the substrate. This reduces the width of the interface trap energy range measured by the CP technique, so that the CP current is generated by interface traps within the range [70]

$$ \Delta E = -2kT \ln\left( v_{th}\, n_i \sqrt{\sigma_n \sigma_p}\, \frac{|V_{FB} - V_T|}{\Delta V_G} \sqrt{t_r t_f} \right), \qquad (8) $$

which is 0.5 eV from the middle of the forbidden band. In expression (8), $v_{th}$ is the thermal velocity, $\sigma_n$ and $\sigma_p$ are the carrier capture cross sections, $n_i$ is the intrinsic carrier concentration of the semiconductor, $V_{FB}$ is the flat-band voltage and $\Delta V_G$ is the pulse height.

Fig. 5 Elliot-type CP curves of RADFETs with a 100 nm thick gate oxide manufactured by Tyndall National Institute, Cork, Ireland: (0) before gamma-ray irradiation and (1) after irradiation to 500 Gy.
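To make expressions (7) and (8) more concrete, the following Python sketch evaluates the scanned energy range and then inverts (7) for the mean interface trap density per unit energy. All numerical values (capture cross sections, pulse edges, voltages, gate area, current) are illustrative assumptions rather than data from the measurements discussed here, and the function names are hypothetical.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K
Q = 1.602176634e-19      # elementary charge in coulombs


def cp_energy_window(temp, v_th, n_i, sigma_n, sigma_p,
                     v_fb, v_t, dv_g, t_r, t_f):
    """Energy range dE (eV) scanned by charge pumping, expression (8).

    Units assumed: temp in K, v_th in cm/s, n_i in cm^-3,
    cross sections in cm^2, voltages in V, pulse edges in s.
    """
    arg = (v_th * n_i * math.sqrt(sigma_n * sigma_p)
           * abs(v_fb - v_t) / abs(dv_g) * math.sqrt(t_r * t_f))
    return -2.0 * K_B_EV * temp * math.log(arg)


def mean_dit(i_cp_max, area, freq, d_e):
    """Mean D_it (cm^-2 eV^-1) from I_cp,max = q * A_g * f * D_it * dE.

    With dE expressed in eV, D_it * dE is an areal density and each
    trap contributes one elementary charge per pulse.
    """
    return i_cp_max / (Q * area * freq * d_e)


# Assumed example values, not taken from the measurements in the text.
d_e = cp_energy_window(temp=300.0, v_th=1e7, n_i=1e10,
                       sigma_n=1e-15, sigma_p=1e-15,
                       v_fb=-1.0, v_t=-2.0, dv_g=5.0,
                       t_r=1e-7, t_f=1e-7)
print("dE [eV]:", d_e)
print("D_it [cm^-2 eV^-1]:", mean_dit(1e-8, area=1e-4, freq=1e5, d_e=d_e))
```

The product `mean_dit(...) * d_e` is the areal density of traps that take part in charge pumping, so longer pulse edges (a narrower energy window) give a smaller current for the same trap distribution.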
The absolute value of the interface trap density $N_{it}$ can be calculated using equation (7) and $N_{it} = D_{it}\,\Delta E$:

$$ N_{it} = \frac{I_{cp,max}}{q A_g f}. \qquad (9) $$

The change in the areal density of interface traps is $\Delta N_{it}(CP) = N_{it}(t) - N_{it0}$, where $N_{it}(t)$ is the absolute value of the interface trap density after irradiation time $t$ and $N_{it0}$ is the absolute value of the interface trap density before irradiation. $I_{cp,max}$ (Fig. 5) is directly proportional to the pulse frequency, and a small transistor with a usual state density needs a frequency of at least several kHz for the charge-pumping current to reach the order of picoamperes. For this reason, CP measurements are most often conducted at frequencies between 100 kHz and 1 MHz, whereby only FST (true interface traps) are registered (at some frequencies, SST that also capture electrons from the channel contribute to the CP current as well [71]). As the CP technique requires a separate substrate terminal, it might be concluded that it is not applicable to power VDMOSFETs, in which the p-bulk is technologically connected to the source. However, the CP technique is applicable to these devices in a somewhat altered form (see [72], [73] for more details).

The density $N_{it}$ found by the CP technique using expression (9) is, in fact, also the switching trap density, $N_{st}(CP) \equiv N_{it}(CP)$. However, a very useful feature of the CP technique is that, being a much faster technique, it senses only the FST and possibly just the fastest among the SST. Hence, the density of ST measured by the CP technique is indicative of true interface trap (FST) behavior, i.e., $\Delta N_{st}(CP) = \Delta N_{fst}$. The simultaneous use of both techniques has a great advantage. For instance, a difference in the behavior of $\Delta N_{st}(MG)$ and $\Delta N_{st}(CP)$ is a consequence of SST [74], since $\Delta N_{st}(MG) = \Delta N_{sst} + \Delta N_{fst}$ and $\Delta N_{st}(CP) = \Delta N_{fst}$.

3.3. Single point threshold voltage shift measurements

Fig. 6 Threshold voltage measurement configuration based on constant current.

As mentioned above, one of the methods for threshold voltage determination is based on the transfer characteristics in saturation: the threshold voltage is the intersection between the $V_G$-axis and the extrapolated linear region of the $\sqrt{I_D} = f(V_G)$ curves, which are modeled by equation (4). The single point threshold voltage measurement requires measuring the drain-source voltage while the transistor remains biased by a constant drain current and the gate and drain terminals are short-circuited (Fig. 6) [75]. In this configuration, the source-drain voltage shift is taken as $\Delta V_T$. The drain-source voltage can be monitored continuously during irradiation. Most of the commercial dosimetry systems based on MOSFETs measure increments of the drain-source voltage at constant drain current [76]-[78]. Usually, in order to minimize the thermal drift, the drain current selected is the zero temperature coefficient current, $I_{ZTC}$, for which the thermal dependence of the drain-source voltage cancels out. When $I$-$V_{out}$ characteristics are measured at different temperatures, all of them intersect at the same point. Fig. 7 presents such characteristics, with the readout current ranging from 1 to 150 μA and the $V_{out}$ voltage ($V_{SD}$) measured at temperatures from 25 to 100 °C, for RADFETs with a 400 nm thick gate oxide manufactured by Tyndall National Institute, Cork, Ireland. As can be seen, all of these curves intersect in the vicinity of 12 μA. It can be concluded that selecting this current would minimize the effect of temperature on the threshold voltage.

Fig. 7 Single-point characteristics of RADFETs with a 400 nm thick gate oxide layer manufactured by Tyndall National Institute, Cork, Ireland at various temperatures.

4. RADFET as a sensor and dosimeter of ionizing radiation

As stated before, the first results on the application of MOSFETs in dosimetry were published by Andrew Holmes-Siedle in 1974 [2].
The basic principles of the application of these devices as sensors and dosimeters of ionizing radiation were presented there. Several research groups dealing with similar problems appeared afterwards: a Canadian research group [79], a US Navy research group [80], [81], a French research group [82], [83], and groups in the Netherlands [84], [85], the USA [86], [87] and Serbia [88]-[90]. A large number of companies and institutes throughout the world are engaged in the production of radiation-sensitive MOSFETs. Among them is Tyndall National Institute, Cork, Ireland, which produces RADFETs with gate oxide thicknesses of 100 nm, 400 nm and 1 μm. Some of the results related to these components, regarding several important dosimetric parameters, are presented in this paper.

4.1. Sensitivity of RADFETs irradiated by gamma rays

4.1.1. Influence of gate bias

Fig. 8 shows the threshold voltage shift $\Delta V_T$ of RADFETs with a 100 nm gate oxide layer for gamma-ray radiation dose $D$ in the range from 100 to 500 Gy, without and with gate bias during irradiation, $V_{irr}$ = 5 V [91]. It can be seen that for irradiation without gate bias in the range from 0 to 500 Gy, $\Delta V_T$ increases by about 0.4 V. For the same dose interval with a gate bias of 5 V, $\Delta V_T$ increases by about 2.3 V. The sensitivity is defined as $\Delta V_T / D$, so it can be concluded that the gate bias $V_{irr}$ = 5 V significantly increases the sensitivity of the RADFET.

Fig. 8 Threshold voltage shift $\Delta V_T$ as a function of radiation dose $D$ for RADFETs with a 100 nm thick gate oxide layer, without and with gate bias during irradiation of $V_{irr}$ = 5 V.

In order to determine the contributions of FT and ST to the total $\Delta V_T$ during irradiation, their densities were determined using the MG and CP techniques, and the results are presented in Figs. 9 and 10, respectively.
It can be seen that an increase in radiation dose $D$ leads to an increase in both $\Delta N_{ft}$ and $\Delta N_{st}$, and that these increases are smaller for RADFETs irradiated without gate bias. Also, for the same values of $D$ and $V_{irr}$, the increase in FT density is larger than the increase in ST density. The ST density $\Delta N_{st}(MG)$ determined using the MG technique is larger than the ST density $\Delta N_{st}(CP)$ determined using the CP technique. This is because the MG technique detects both SST and FST, while the CP technique detects only FST (true interface traps). From Figs. 9 and 10 it can be seen that the FT density is about an order of magnitude larger than the ST density obtained using the MG technique. These results show that FT play a crucial role in the threshold voltage shift.

Fig. 9 The change in areal density of fixed traps $\Delta N_{ft}$ as a function of radiation dose $D$ for RADFETs with a 100 nm thick gate oxide layer, without and with gate bias during irradiation of $V_{irr}$ = 5 V.

Fig. 10 The change in areal density of switching traps as a function of radiation dose $D$ for RADFETs with a 100 nm thick gate oxide layer, obtained by the MG and CP techniques, without and with gate bias of $V_{irr}$ = 5 V.

Fig. 11 shows the $\Delta V_T = f(D)$ dependence of RADFETs with a gate oxide layer thickness of 100 nm for gamma-ray radiation dose in the range from 0 to 50 Gy [92]. During the irradiations the gate biases $V_{irr}$ were 0, 1.25, 2.5, 3.75 and 5 V. For the same dose, the threshold voltage shift increases with increasing gate bias. Radiation doses up to 50 Gy did not significantly degrade the linearity of the RADFETs, which is important for practical applications of these devices.

Fig. 11 Threshold voltage shift $\Delta V_T$ as a function of radiation dose $D$ for RADFETs with a 100 nm thick gate oxide layer for various gate biases $V_{irr}$ during irradiation.

In general, the dependence of $\Delta V_T$ on $D$ can be expressed as:

$$ \Delta V_T = A \cdot D^n, \qquad (10) $$

where $A$ is a constant and $n$ is the degree of linearity.
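As an illustration of how expression (10) can be applied to measured data, the sketch below fits $A$ and $n$ by linear regression in log-log coordinates and, separately, fits a straight line through the origin ($n$ fixed to 1) to obtain a single sensitivity together with $r^2$. The function names are hypothetical, the data points are synthetic, and $r^2$ is computed here as the coefficient of determination, which is an assumption about how the quoted correlation coefficients were obtained.

```python
import math


def _line(x, y):
    """Least-squares line y = slope*x + intercept."""
    count = len(x)
    mean_x = sum(x) / count
    mean_y = sum(y) / count
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_power_law(dose, dv_t):
    """Fit dV_T = A * D**n by regressing ln(dV_T) on ln(D).

    Points with D = 0 or dV_T = 0 must be excluded beforehand.
    Returns (A, n).
    """
    n, ln_a = _line([math.log(d) for d in dose],
                    [math.log(v) for v in dv_t])
    return math.exp(ln_a), n


def linear_sensitivity(dose, dv_t):
    """Fit dV_T = S * D (n fixed to 1); returns (S, r_squared)."""
    s = sum(d * v for d, v in zip(dose, dv_t)) / sum(d * d for d in dose)
    mean_v = sum(dv_t) / len(dv_t)
    ss_res = sum((v - s * d) ** 2 for d, v in zip(dose, dv_t))
    ss_tot = sum((v - mean_v) ** 2 for v in dv_t)
    return s, 1.0 - ss_res / ss_tot


# Synthetic example: doses in Gy and threshold voltage shifts in V.
dose = [10.0, 20.0, 30.0, 40.0, 50.0]
dv_t = [0.047, 0.091, 0.139, 0.183, 0.232]
print(fit_power_law(dose, dv_t))
print(linear_sensitivity(dose, dv_t))
```

A fitted $n$ close to 1 together with $r^2$ close to 1 supports the use of a single dose-independent sensitivity over the fitted range.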
Ideally, $n$ = 1, and the dependence is linear with the sensitivity $S = \Delta V_T / D$. The correlation coefficients of the linear fits for all values of $V_{irr}$ (Fig. 11) are $r^2$ = 0.999. Since the values of $r^2$ are very close to one, it can be assumed that there is a linear dependence between $\Delta V_T$ and $D$, and that the sensitivity of the RADFETs for a given value of $V_{irr}$ is the same throughout the range from 0 to 50 Gy.

Fig. 12 Sensitivity $S$ as a function of gate bias $V_{irr}$ during irradiation for RADFETs with a 100 nm thick gate oxide layer for a radiation dose of 50 Gy.

Fig. 12 shows the sensitivity $S$ as a function of gate bias $V_{irr}$ for a gamma-ray radiation dose of 50 Gy for RADFETs with 100 nm gate oxide thickness [92]. The symbols stand for experimental data, while the solid lines represent exponential fits. Fig. 13 shows $\Delta V_T = f(D)$ for RADFETs with a 400 nm gate oxide layer for gamma-radiation dose in the range from 0 to 5 Gy, for $V_{irr}$ = 0 V and $V_{irr}$ = 5 V [78], [93]. Expression (10) describes the experimental data very well, since the correlation coefficient is $r^2$ = 0.999. These results, as well as those presented in Figs. 8 and 12, show that an increase in gate bias during irradiation leads to an increase in $\Delta V_T$, i.e., the sensitivity increases as well.

Fig. 13 Threshold voltage shift $\Delta V_T$ as a function of radiation dose $D$ for RADFETs with a 400 nm thick gate oxide layer, without gate bias and with gate bias of $V_{irr}$ = 5 V.

4.1.2. Influence of gate oxide layer thickness

Fig. 14 shows the threshold voltage shift $\Delta V_T$ as a function of radiation dose $D$ for RADFETs with gate oxide layer thicknesses of 100 nm, 400 nm and 1 μm. The gamma-ray irradiation of these devices was performed in the dose range from 0 to 50 Gy, with a gate bias of $V_{irr}$ = 5 V [92]. It can be seen that an increase in gate oxide layer thickness leads to a significant increase in $\Delta V_T$ for the same radiation dose.
This is mainly due to the increase in FT density [94]. Fitting the experimental data using expression (10) with $n$ = 1 gives, for RADFETs with 100 and 400 nm gate oxide thickness, a correlation coefficient of $r^2$ = 0.999. This indicates a linear dependence between $\Delta V_T$ and $D$, i.e., the sensitivity is the same throughout the considered dose range, and it is higher for the 400 nm gate oxide layer thickness. For RADFETs with 1 μm gate oxide thickness, the correlation coefficient is $r^2$ = 0.976, so the dependence between $\Delta V_T$ and $D$ is not linear, and hence the sensitivity is different for different values of radiation dose.

Fig. 14 Threshold voltage shift $\Delta V_T$ as a function of radiation dose $D$ for three values of gate oxide layer thickness. Gate bias during irradiation was $V_{irr}$ = 5 V.

The $\Delta V_T = f(D)$ dependence for RADFETs with gate oxide layer thicknesses of 400 nm and 1 μm is shown in Fig. 15 [78]. These devices were also irradiated with gamma rays with a gate bias of $V_{irr}$ = 5 V, but the dose range was from 0 to 5 Gy. It was also shown that the sensitivity increases with gate oxide thickness and that there is a linear dependence between $\Delta V_T$ and $D$ (correlation coefficient $r^2$ = 0.999).

Fig. 15 Threshold voltage shift $\Delta V_T$ as a function of radiation dose $D$ for two values of gate oxide layer thickness. Gate bias during irradiation was $V_{irr}$ = 5 V.

4.1.3. Photon energy influence on RADFETs sensitivity

The threshold voltage shift $\Delta V_T$ for RADFETs with 1 μm gate oxide layer thickness, irradiated with gamma rays originating from $^{60}$Co and with X-rays with an energy of 140 keV, for radiation doses in the range from 0 to 5 Gy, is presented in Figs. 16 and 17 for $V_{irr}$ = 0 V and $V_{irr}$ = 5 V, respectively [95]. It can be seen that $\Delta V_T$ is much higher when the RADFETs are irradiated with X-rays than when they are irradiated with gamma rays.
This is a consequence of the different photon energies that lead to ionization of the gate oxide molecules. Namely, X-ray photons with an energy of 140 keV ionize molecules by both the photoelectric effect and the Compton effect, while gamma rays originating from $^{60}$Co, with energies of 1.17 and 1.33 MeV, ionize molecules only by the Compton effect [31]. Since the probability of molecule ionization by the photoelectric effect is significantly higher than by the Compton effect, a larger number of FT and ST are formed during X-ray irradiation than during gamma-ray irradiation, which directly causes the change in $\Delta V_T$ values.

Fig. 16 Threshold voltage shift $\Delta V_T$ as a function of gamma and X-ray radiation dose $D$ for RADFETs with a 1 μm thick gate oxide layer irradiated without gate bias.

Fig. 17 Threshold voltage shift $\Delta V_T$ as a function of gamma and X-ray radiation dose $D$ for RADFETs with a 1 μm thick gate oxide layer irradiated with $V_{irr}$ = 5 V.

Fitting the experimental data for gamma radiation doses in the range from 0 to 5 Gy (Figs. 16 and 17) using expression (10) with $n$ = 1 gives a correlation coefficient of $r^2$ = 0.998 for both $V_{irr}$ = 0 V and $V_{irr}$ = 5 V. Since the correlation coefficients are very close to one, it can be assumed that there is a linear dependence between $\Delta V_T$ and $D$, so that the sensitivity $\Delta V_T / D$ is the same over the whole interval. The correlation coefficients for RADFETs irradiated with X-rays (Figs. 16 and 17) are 0.96 and 0.95 for $V_{irr}$ = 0 V and $V_{irr}$ = 5 V, respectively, which shows that the dependence between $\Delta V_T$ and $D$ is not linear.

Fig. 18 Threshold voltage shift $\Delta V_T$ as a function of radiation dose $D$ for RADFETs with a 1 μm thick gate oxide layer, for two values of X-ray energy. Gate bias during irradiation was $V_{irr}$ = 5 V.

Fig. 18 shows the $\Delta V_T = f(D)$ dependence of RADFETs with 1 μm gate oxide layer thickness during X-ray irradiation with photon energies of 90 and 140 keV for a gate bias of $V_{irr}$ = 5 V [96].
It can be seen that a lower photon energy leads to a greater change in $\Delta V_T$ for the same radiation dose. Similar behavior was detected in TN-502 RDI MOSFETs (Thomson and Nielsen Electronics Ltd, Ottawa, Canada) [97].

Fig. 19 Threshold voltage shift $\Delta V_T$ as a function of X-ray radiation dose $D$ for RADFETs with a 400 nm thick gate oxide layer. Irradiation was performed without and with gate bias of $V_{irr}$ = 5 V.

Fig. 20 Threshold voltage shift $\Delta V_T$ as a function of X-ray radiation dose $D$ for RADFETs with a 1 μm thick gate oxide layer. Irradiation was performed without and with gate bias of $V_{irr}$ = 5 V.

Figs. 19 and 20 show the threshold voltage shift for X-ray radiation doses in the range from 0 to 100 cGy for RADFETs with gate oxide layer thicknesses of 400 nm and 1 μm, respectively. Fig. 21 shows the same dependence for X-ray radiation doses in the range from 1 to 10 cGy for RADFETs with a gate oxide layer thickness of 1 μm [98]. These dependences are given for gate biases during irradiation of $V_{irr}$ = 0 V and $V_{irr}$ = 5 V. As can be seen, the $\Delta V_T$ values are higher when the gate bias during irradiation was $V_{irr}$ = 5 V than when it was $V_{irr}$ = 0 V. Furthermore, $\Delta V_T$ is higher for RADFETs with the larger gate oxide layer thickness (Figs. 19 and 20). The results presented in Fig. 21 show that $\Delta V_T$ can be detected with good reliability even for a radiation dose of 1 cGy.

Fig. 21 Threshold voltage shift $\Delta V_T$ as a function of X-ray radiation dose $D$ for RADFETs with a 1 μm thick gate oxide layer. Irradiation was performed without and with gate bias of $V_{irr}$ = 5 V.

4.2. Irradiated RADFETs fading

As a dosimeter, a RADFET must satisfy two fundamental dosimetric demands: a good compromise between sensitivity to irradiation and stability with time after irradiation.
Stability means an insignificant change in $\Delta V_T$ of an irradiated RADFET at room temperature over a long period of time, i.e., the dosimetric information should be preserved for a long period. There are two important reasons for this. First, the dose cannot always be read out immediately after irradiation, but only after a certain period of time. Second, in individual monitoring the exact moment of irradiation is often unknown, and the radiation dose measurements are performed periodically. The room temperature stability of an irradiated RADFET can be followed by calculating the fading $f$ [95]:

$$ f = \frac{V_T(0) - V_T(t)}{V_T(0) - V_{T0}} \cdot 100\,[\%] = \frac{\Delta V_T(0) - \Delta V_T(t)}{\Delta V_T(0)} \cdot 100\,[\%], \qquad (11) $$

where $V_{T0}$ is the pre-irradiation threshold voltage, $V_T(0)$ is the threshold voltage immediately after irradiation, $V_T(t)$ is the threshold voltage after annealing time $t$ and $\Delta V_T(0)$ is the threshold voltage shift immediately after irradiation.

Fig. 22 Fading $f$ at room temperature for 2000 h of RADFETs with a 400 nm thick gate oxide layer previously irradiated with a gamma-ray radiation dose of 5 Gy.

Fig. 23 Fading $f$ at room temperature for 2000 h of RADFETs with a 400 nm thick gate oxide layer previously irradiated with a gamma-ray radiation dose of 50 Gy.

The fading of RADFETs with a gate oxide layer thickness of 400 nm previously irradiated with 5 Gy of gamma rays is shown in Fig. 22. It can be seen that the fading during the first 24 h of annealing at room temperature is about 3.5%, while for annealing times from 24 h to 800 h it increases by about 6%. During further annealing the change in fading is insignificant. The fading of the same type of RADFETs previously irradiated with 50 Gy of gamma rays is shown in Fig. 23. It can be seen that during the first 24 h of annealing at room temperature the fading is about 6%, and its value increases slightly up to an annealing time of 200 h.
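Expression (11) needs only three threshold voltage readings, as the short Python sketch below shows. The function name and the voltage values are hypothetical and serve only to illustrate the arithmetic.

```python
def fading_percent(v_t0, v_t_after, v_t_annealed):
    """Fading f in percent, following expression (11).

    v_t0         threshold voltage before irradiation, V_T0
    v_t_after    threshold voltage immediately after irradiation, V_T(0)
    v_t_annealed threshold voltage after annealing time t, V_T(t)
    """
    shift_after = v_t_after - v_t0  # dV_T(0)
    if shift_after == 0:
        raise ValueError("no radiation-induced shift, fading is undefined")
    return 100.0 * (v_t_after - v_t_annealed) / shift_after


# Hypothetical readings in volts, keyed by annealing time in hours.
V_T0, V_T_AFTER = 2.00, 2.50
readings = {24: 2.48, 200: 2.47, 2000: 2.46}
for hours, v_t in readings.items():
    print(hours, "h:", round(fading_percent(V_T0, V_T_AFTER, v_t), 2), "%")
```

A positive value means that part of the radiation-induced shift has been lost during annealing, while a negative value would indicate that the shift kept growing after irradiation.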
For annealing times longer than 200 h there is a further slight increase in fading, so that the fading after 2000 h is about 2% higher than after 200 h.

Fig. 24 Fading $f$ at room temperature for 2000 h of RADFETs with a 400 nm thick gate oxide layer previously irradiated with an X-ray radiation dose of 100 cGy.

Fig. 25 Fading $f$ at room temperature for 28 d of RADFETs with a 1 μm thick gate oxide layer previously irradiated with an X-ray radiation dose of 100 cGy.

Fading results at room temperature for RADFETs with gate oxide thicknesses of 400 nm and 1 μm, previously irradiated with X-rays up to 100 cGy, are presented in Figs. 24 and 25, respectively [98]. Fig. 26 presents the fading of RADFETs with a gate oxide layer thickness of 1 μm previously irradiated with X-rays up to 10 cGy [98]. The fading of RADFETs with a 400 nm gate oxide layer irradiated with a gate bias of 5 V is 40% in the first 7 d, whereas RADFETs irradiated without gate bias show 22% fading in the first 7 d (Fig. 24). Between 7 and 28 d, the fading of RADFETs irradiated with a gate bias of 5 V increased by about 3%, whereas that of RADFETs irradiated without gate bias remained nearly constant. The fading of RADFETs with a 1 μm thick gate oxide layer irradiated up to 100 cGy with a gate bias of 5 V was 14% in the first 7 d (Fig. 25), and between 7 and 28 d it increased by about 1%. For RADFETs with the same gate oxide layer thickness irradiated without gate bias, the fading increases by about 1% in the first 7 d, and this value is maintained up to 28 d. Figs. 24 and 25 show that fading is lower when the gate oxide layer of the RADFETs is thicker, which is in accordance with earlier studies [10], [89] showing that fading decreases with increasing gate oxide thickness.

Fig. 26 Fading $f$ at room temperature for 28 d of RADFETs with a 1 μm thick gate oxide layer previously irradiated with an X-ray radiation dose of 10 cGy.

In RADFETs with a gate oxide layer thickness of 1 μm irradiated up to 10 cGy, the highest fading occurs in the first 3 d: it is 15% for RADFETs irradiated without gate bias and 13% for RADFETs with 5 V gate bias during irradiation (Fig. 26). Moreover, in both cases the change in fading from 3 to 28 d is smaller than 2%. The fading of RADFETs is mainly a consequence of the decrease in positive oxide trapped charge. This decrease is caused by electrons tunneling from Si into SiO$_2$. These electrons are captured at positive oxide trapped charges, which leads to their neutralization/compensation, and thus to instability of the measured threshold voltage shift [99].

4.3. The possibility of RADFETs re-use

Many investigations have shown that RADFETs cannot be used for subsequent determination of ionizing radiation dose. Namely, these dosimeters are used only to measure up to a maximum dose, which is determined by the type and sensitivity of the RADFET. When the maximum radiation dose is reached, the RADFETs should be replaced. The first results dealing with the possibility of re-using these devices are given in Ref. [10] for a radiation dose of 400 Gy. Later investigations of the same components are presented in [23], [100]. Irradiation was performed with gamma rays up to 35 Gy, without gate bias and with gate biases of $V_{irr}$ = 2.5 V and $V_{irr}$ = 5 V. Fig. 27 shows the threshold voltage shift $\Delta V_T$ as a function of radiation dose $D$ for both the first and the second irradiation with a gate bias of $V_{irr}$ = 5 V. After the first irradiation, the RADFETs were annealed at room temperature for 5232 h without gate bias. After this, the annealing was continued at 120 °C without gate bias for 432 h. The RADFETs were then irradiated under the same conditions.
The values of $\Delta V_T$ during the first and second irradiation are very close. These results are in contrast to earlier results [10], where the values of $\Delta V_T$ during the first irradiation were higher than those obtained during the second irradiation.

Fig. 27 Threshold voltage shift $\Delta V_T$ as a function of radiation dose $D$ of RADFETs with a 400 nm thick gate oxide layer for both the first and second irradiation with gate bias of $V_{irr}$ = 5 V.

The first and second irradiation of the RADFETs lead to approximately the same increase in $\Delta N_{ft}$ (Fig. 28), while the increase in $\Delta N_{st}(MG)$ is higher during the second irradiation (Fig. 29). $\Delta N_{fst}(CP)$ is also higher during the second irradiation (Fig. 30). On the basis of the results presented in Figs. 28, 29 and 30, it can be seen that the major contribution to the $\Delta V_T$ increase during both the first and the second irradiation originates from FT, whose density is an order of magnitude higher than the ST(MG) density for a radiation dose of 35 Gy.

Fig. 28 Areal density of fixed traps $\Delta N_{ft}$ as a function of radiation dose $D$ of RADFETs with a 400 nm thick gate oxide layer for both the first and second irradiation with gate bias of $V_{irr}$ = 5 V.

Fig. 29 Areal density of switching traps $\Delta N_{st}(MG)$ as a function of radiation dose $D$ of RADFETs with a 400 nm thick gate oxide layer for both the first and second irradiation with gate bias of $V_{irr}$ = 5 V.

Fig. 30 Areal density of switching traps $\Delta N_{st}(CP)$ as a function of radiation dose $D$ of RADFETs with a 400 nm thick gate oxide layer for both the first and second irradiation with gate bias of $V_{irr}$ = 5 V.

5. Low-cost commercial p-channel MOSFET as a radiation sensor

In recent years, many investigations have been directed toward the application of low-cost commercial p-channel MOSFETs as radiation sensors in radiotherapy [101]. Paper [102] presents results for some of the most important dosimetric parameters (sensitivity, linearity, reproducibility and angular dependence) of power p-channel MOSFETs 3N163.
These transistors were irradiated by gamma rays from $^{60}$Co up to 55 Gy, without gate bias. Fig. 31 shows the threshold voltage shift $\Delta V_T$ versus radiation dose $D$ for 15 devices. As expected, $\Delta V_T$ increases as the radiation dose absorbed in the MOSFETs increases. The data show excellent linearity, with a mean sensitivity of 29.2 mV/Gy, and reasonably good reproducibility up to a total dose of 58 Gy (which is around the total dose used in typical radiotherapy treatments). Moreover, the angular and dose-rate dependencies are similar to those of other, more specialised p-channel MOSFETs (RADFETs). The authors of that paper concluded that the power p-channel MOSFET 3N163 would be an excellent candidate as the sensor of a low-cost system capable of measuring gamma radiation dose. This radiation sensor could be placed on the patient without the need for wires, and the threshold voltage shift, which is indicative of the radiation dose, could be measured after the completion of each irradiation session with a reasonable degree of confidence.

Fig. 31 Threshold voltage shift $\Delta V_T$ as a function of radiation dose $D$ for fifteen MOSFETs 3N163 irradiated with gamma rays without gate bias.

Martínez-García et al. [103] investigated the possibility of using vertical diffusion MOS transistors, also called double-diffused MOS transistors or simply DMOS, as sensors of ionizing radiation. The components were DMOS BS250F, ZVP3306 and ZVP4525, manufactured by Diodes Incorporated (Plano, USA). The irradiation was performed with an electron beam of 6 MeV energy without gate bias. The same authors investigated the behavior of p-channel MOS transistors from the integrated circuit CD4007 (Texas Instruments, Dallas, USA and NXP Semiconductors, Eindhoven, Netherlands) under a 6 MeV electron beam. In Fig. 32, $\Delta V_T$ versus $D$ is plotted for four samples of the ZVP3306 DMOS transistor.
The results for the other DMOS transistor models are similar. As can be seen, there is a linear dependence between $\Delta V_T$ and $D$ up to a radiation dose of 25 Gy. The sensitivities of BS250F, ZVP4525 and ZVP3306 are 3.1, 3.4 and 3.7 mV/Gy, respectively.

Fig. 32 Threshold voltage shift $\Delta V_T$ as a function of radiation dose $D$ for four DMOS ZVP3306 transistors irradiated with 6 MeV electrons without gate bias.

It is shown in [103] that p-channel MOS transistors from the integrated circuit CD4007, unbiased during irradiation, have a sensitivity of 4.6 mV/Gy with a very good linear behaviour of the threshold voltage shift versus radiation dose. As thermal compensation may be applied, this transistor may be considered a promising candidate for use as a dosimeter in intra-operative radiology.

Fig. 33 Threshold voltage shift $\Delta V_T$ and sensitivity $S$ as a function of radiation dose $D$ for p-channel MOS transistors from the integrated circuit CD4007 irradiated with 6 MeV electrons with gate bias $V_{irr}$ = 0.6 V.

Fig. 33 shows the $\Delta V_T = f(D)$ dependence for the CD4007 manufactured by Texas Instruments irradiated with a 6 MeV electron beam. The gate bias during irradiation was 0.6 V. The data show a linear behaviour, indicating that the p-channel transistors from this integrated circuit are suitable for electron beam dosimetry, with a sensitivity of 7.4 mV/Gy. The sensitivity of the CD4007 manufactured by NXP Semiconductors under the same conditions is 8.9 mV/Gy.

Fig. 34 Threshold voltage shift $\Delta V_T$ as a function of radiation dose $D$ for RADFETs and VDMOSFETs IRF9520 irradiated without gate bias.

Fig. 35 Threshold voltage shift $\Delta V_T$ as a function of radiation dose $D$ for RADFETs and VDMOSFETs IRF9520 irradiated with gate bias of $V_{irr}$ = 10 V.
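When the response is linear, a sensitivity value converts a measured threshold voltage shift directly into dose. The Python sketch below illustrates this with the sensitivities quoted in this section for the commercial devices. The dictionary keys, the function names and the example reading are illustrative assumptions, and the conversion is meaningful only within the dose range and irradiation conditions for which each sensitivity was reported.

```python
# Sensitivities in mV/Gy as quoted in the text for commercial devices.
SENSITIVITY_MV_PER_GY = {
    "3N163 (gamma rays, unbiased)": 29.2,
    "BS250F (6 MeV electrons, unbiased)": 3.1,
    "ZVP4525 (6 MeV electrons, unbiased)": 3.4,
    "ZVP3306 (6 MeV electrons, unbiased)": 3.7,
    "CD4007 (6 MeV electrons, unbiased)": 4.6,
    "CD4007 TI (6 MeV electrons, 0.6 V bias)": 7.4,
    "CD4007 NXP (6 MeV electrons, 0.6 V bias)": 8.9,
}


def dose_from_shift(dv_t_mv, sensitivity_mv_per_gy):
    """Dose in Gy from a threshold voltage shift in mV, D = dV_T / S."""
    if sensitivity_mv_per_gy <= 0:
        raise ValueError("sensitivity must be positive")
    return dv_t_mv / sensitivity_mv_per_gy


def shift_for_dose(dose_gy, sensitivity_mv_per_gy):
    """Expected threshold voltage shift in mV for a given dose in Gy."""
    return dose_gy * sensitivity_mv_per_gy


# Hypothetical reading: a 14.8 mV shift measured on each device type.
for device, s in SENSITIVITY_MV_PER_GY.items():
    print(device, "->", round(dose_from_shift(14.8, s), 2), "Gy")
```

The same relation shows why sensitivity matters for low-dose work: the smaller the sensitivity, the finer the voltage resolution needed to resolve a given dose step.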
a comparative study of the sensitivity to gamma-ray irradiation, in the dose range from 0 to 500 gy, of radfets manufactured by tyndall national institute, cork, ireland, with a 100 nm thick gate oxide layer and of commercial p-channel power vdmosfets irf9520 manufactured by international rectifier is given in [104]. figs. 34 and 35 show the dependence of Δvt on d for radfets and irf9520 irradiated without gate bias (virr = 0 v) and with a gate bias of virr = 10 v, respectively. it can be seen that Δvt is higher for irf9520 than for radfet at the same radiation dose. the difference in Δvt is probably a consequence of different technological procedures during device fabrication. it is shown that the linear dependence between Δvt and d is valid only for devices biased with virr = 10 v during irradiation (the correlation coefficient obtained by fitting the experimental data with expression (10) is r² = 0.998). figs. 36 and 37 present the change in areal density of ft, Δnft, for radfet and irf9520 without gate bias and with gate bias virr = 10 v during irradiation, respectively [104]. it can be seen that Δnft is larger in irf9520 than in radfet, and that gate bias increases Δvt at the same radiation dose for both types of transistors.

fig. 36 the change in areal density of ft, Δnft, as a function of radiation dose d for radfets and vdmosfets irf9520 irradiated without gate bias.

fig. 37 the change in areal density of ft, Δnft, as a function of radiation dose d for radfets and vdmosfets irf9520 irradiated with gate bias of virr = 10 v.

fig. 38 the change in areal density of st, Δnst, determined using the mg technique, as a function of radiation dose d for radfets and vdmosfets irf9520 irradiated without gate bias.

fig. 39 the change in areal density of st, Δnst, determined using the mg technique, as a function of radiation dose d for radfets and vdmosfets irf9520 irradiated with gate bias of virr = 10 v.
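the linearity check mentioned above (a correlation coefficient of r² = 0.998 for biased devices) can be illustrated with an ordinary least-squares fit. expression (10) is not reproduced in this part of the text, so the sketch below assumes a plain straight line, Δvt = a·d + b, rather than the author's fitting function. the data points are made up for illustration and are not measurements from [104].

```python
def linear_fit(doses, shifts):
    """Least-squares straight line through (dose, shift) points.

    Returns (slope, intercept, r_squared). The straight-line model is an
    assumption; the source fits its own expression (10).
    """
    n = len(doses)
    if n < 2 or n != len(shifts):
        raise ValueError("need at least two matching data points")
    mean_x = sum(doses) / n
    mean_y = sum(shifts) / n
    sxx = sum((x - mean_x) ** 2 for x in doses)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(doses, shifts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Coefficient of determination: 1 - residual variation / total variation.
    ss_res = sum((y - (slope * x + intercept)) ** 2
                 for x, y in zip(doses, shifts))
    ss_tot = sum((y - mean_y) ** 2 for y in shifts)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


if __name__ == "__main__":
    # Hypothetical doses (Gy) and threshold-voltage shifts (V).
    doses = [0.0, 100.0, 200.0, 300.0, 400.0, 500.0]
    shifts = [0.00, 1.05, 1.98, 3.02, 3.96, 5.01]
    slope, intercept, r2 = linear_fit(doses, shifts)
    print(f"sensitivity = {slope * 1000:.2f} mv/gy, r^2 = {r2:.4f}")
```

an r² close to 1 only indicates that a straight line describes the points well over the fitted range; it says nothing about behaviour beyond the highest dose in the data.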
the changes in areal density of st, Δnst, determined by the mg technique for radfet and irf9520 without gate bias and with virr = 10 v gate bias are presented in figs. 38 and 39, respectively [104]. it can be seen that Δnst is smaller in irf9520 than in radfet. however, Δnft is considerably larger than Δnst in both types of devices. on the basis of these data it can be concluded that Δnft is the predominant contribution to the increase of Δvt during irradiation. the fading of irf9520 and radfet devices irradiated up to 500 gy was calculated 24 h after irradiation using eq. (11). during this time the devices were kept at room temperature without gate bias. it was shown that the fading is higher in irf9520 than in radfets and that it is smaller for devices previously irradiated with gate bias virr = 10 v.

6. conclusion

intensive investigations of radiation sensitive mosfets (radfets) have been performed in order to assess their application in dosimetry. their relatively small volume gives them an advantage over some other dosimetric systems, which is particularly important in in-vivo dosimetry as well as in the control of gradient radiation fields of x-rays. radfets are most commonly used for the detection of photons and charged particles of ionizing radiation. they can also be used for neutron detection, but their sensitivity is much smaller than for photons and charged particles. their sensitivity can be increased by applying a gate bias during irradiation and by increasing the gate oxide layer thickness. the sensitivity increases as the photon energy of the ionizing radiation decreases. these components are required to show minimal variation in threshold voltage shift after irradiation at room temperature, i.e. it is necessary to preserve the dosimetric information for a long period of time. the radfets considered are sensitive sensors of gamma and x-rays, because they can register doses below 1 cgy.
unfortunately, their major disadvantage is large fading immediately after irradiation. investigations in the past few years have shown that some commercially available p-channel mosfets can be applied very efficiently as gamma and x-ray sensors, as well as sensors of electrons with energies of several mev. those are the low power p-channel mosfets 3n163, the dmos bs250f, zvp3306 and zvp4525, and the power vdmosfets irf9520. furthermore, p-channel mos transistors from integrated circuits, for example from cd4007, can be used as sensors of ionizing radiation.

acknowledgement: the paper is a part of the research done within the project supported by the ministry of education, science and technological development of the republic of serbia under project no. 32026.

references
[1] w. poch and a.g. holmes-siedle, „the mosimeter - a new instrument for measuring radiation dose“, rca eng., vol. 16, pp. 56-59, 1970.
[2] a. g. holmes-siedle, „the space charge dosimeter - general principles of a new method of radiation dosimetry“, nucl. instrum. methods, vol. 121, pp. 169-179, 1974.
[3] l. adams and a. holmes-siedle, „the development of mos dosimetry unit for use in space“, ieee trans. nucl. sci., vol. 18, pp. 1607-1612, 1978.
[4] r. r. price, c. benson and k. rodgers, „development of radfet linear array for intracavitary in vivo dosimetry in external radiotherapy and brachytherapy“, ieee trans. nucl. sci., vol. 51, pp. 1420-1426, 2004.
[5] r. ramasechum, k.s. kulli, t.j. zhang, b. norling, a. hallil and m. islam, „performance characteristics of a micro mosfet as an in vivo dosimeter in radiation therapy“, phys. med. biol., vol. 49, pp. 4031-4048, 2004.
[6] g. tarr, k. shortt, y. wang and i. thomson, „a sensitive temperature-compensated, zero-bias floating gate mosfet dosimeter“, ieee trans. nucl. sci., vol. 51, pp. 1277-1282, 2004.
[7] m. c. lavallee, l. gingras and b. luc, „energy and integrated dose dependence of mosfet dosimeter sensitivity for irradiation energies between 30 kv and 60co“, med. phys., vol. 33, pp.
3683-3689, 2006. [8] r. kohuo, t. nishio, t. miyagishi, e. hirano, k. hotta, m. kowashima and t. ogino, „experimental evaualation of a mosfet dosimeter for proton dose measurements“, phys. med. biol., vol. 51, pp. 6077-6086, 2006. [9] a. holmes-siedle and l. adams, „radfets: a review of the use of metal-oxide silicon devices as integrating dosimeters“, rad. phys. chem., vol. 28, 224-235, 1986. [10] a. kelleher, n. mcdonnell, b. o'neill, w. lane, l. adams, „investigation into the re-use of pmos dosimeter“, ieee trans. nucl. sci., vol. 41, pp. 445-451, 1994. [11] l. z. scheick, p.j. mcnulty and d.r. roth, „dosimetry based on the erasure of floating gates in natural radiation environments“, ieee trans. nucl. sci., vol. 45, pp. 2681-2688, 1998. [12] k. kay, e. mullen, w. stapor, r. circle and p. mcdonald, „grres dosimetry results and comparison using the space radiation dosimeter and p-channel mos disieteter“, ieee tran. nucl. sci., vol. 39, pp. 1846-1850, 1992. [13] a. faigon, j. lipovetzky, e. redin and g. kruscenski, „expresion of measurement range of mos dosimeters using radiation induced charge neutralization“, ieee trans. nucl. sci., vol. 55, pp. 21412147, 2008. [14] b. o'connell, c. connely, c. mccarthy, j. doyle, w. lane and l. adams, „electrical performance and irradiation sensitivity of stacked pmos dosimeters under bulkbias control“, ieee trans. nucl. sci., vol. 45, pp. 2689-2694, 1988. [15] g. sarrabayrouse, buchdahl, v. poliscuk and s. siskos, „stacked-mos ionizing radiation dosimeters: potentials and limitations, radiat. phys. chem., vol. 71, pp. 737-739, 2004. [16] r.c. hughes, d. huffman, j.v. snelling, t.e. zipperian, a.j. ricoo and c.a. kelsey, „miniature radiation dosimeter for in vivo radiation measuremnts“, int. rad. oncol. biol. phys., vol. 14, pp. 963967, 1988. [17] d.j. gladstone, x.q. lu. j.l. humm, h.f. bowman and l.m. chin, „a miniature mosfet radiation probe“, med. phys., vol. 21, pp. 1721-1728, 1994. [18] g.i. kaplan, a.b. rosenfeld, b.j. 
allen, j.t. booth, m.g. carolan and a. holmes-siedle, „a special resolution by mosfet dosimetry of an x-ray microbeam“, med. phys., vol. 27, pp. 239-244, 2000. [19] g. sarabayrouse and v. polischuk, „mos ionizing radiation dosimeters: from low to high dose measurement“, radiat. phys. chem., vol. 61, pp. 511-513, 2001. [20] a. jaksić, g. ristić, m. pejović, a. mohammadzadeh, c. sudre and w. lane, „gamma-ray irradiation and post-irradation response of high dose range radfets“, ieee trans. nucl. sci., vol. 49, pp. pp. 1356-1363, 2002. [21] r. a. price, „towards and optimum design of a p-mos radiation detector for use in high-energy medical photon beams and neutron facilities: analysis of activation materials“, radiation protection dosimetry., vol. 115, pp. 386-390, 2005. [22] m. m. pejović, m.m. pejović, a.b. jakšić, k.dj. stanković and a.a. marković, “successive gamma-ray irradiation and corresponding post-irradiation annealing of pmos dosimeters”, nucl. technol. and radiat. protection, vol. 27, pp. 341-345, 2012. [23] m. m. pejović, m.m. pejović and a.b. jakšić, „contribution of fixed oxide traps to sensitivity of pmos dosimeters during gamma ray irradiation and annealing at room and elevated temperature”, sensors and actuators a, vol. 174, pp. 85-90, 2012. [24] s. alshaikh, m. carolan, m. petasecca, m. lerch and a.b. metealfe, „direct and pulsed current annealing of p-mosfet based dosimeter, the moskin“, australs phys. eng. sci. med., vol. 37, pp. 311-319, 2014. [25] g-wen luo, qi. z.-y. deng, a. rosenfeld and wx?, „investigated of a pulsed current annealing method in reusing mosfet dosimeters for in vivo imrt dosimetry“, med. phys., vol. 41, 0511710, 2014. [26] g. ristić, s. golubović and m. pejović, „pmos dosimeter with two-layer gate oxide operated at zero negative bias”, electr. lett., vol. 30, pp. 295-296, 1994. [27] g. ristić, a. jakšić, m. pejović, “pmos dosimetric transistors with two-layer gate oxide”, sensors and actuators a, vol. 63, pp. 129-134, 1997. 
p-channel mosfet as a sensor and dosimeter of ionizing radiation 539 [28] g. sarrabayrouse and f. gessinn, “thick oxide mos trnsistors for ionizing radiation dose measurement”, radioprotection, vol. 29, pp. 557-572, 1994. [29] a. haran, a. jakšić, n. rafaeli, a. elyahu, d. david and j. barak, ieee trans. nucl. sci., vol. 51, 2917-2921, 2004. [30] t. p. ma and p.v. dressendorfer, ionizing radiation effects in mos devices and circuits, new york: willey and sons, 1989. [31] g. s. ristić, “influence of ionizing radiation and hot carrier injection on metal-oxide-semiconductor transistors”, j. phys. d: appl. phys., vol. 41, 023001 (19 pp), 2008. [32] m. pejović, p. osmokrović, m. pejović and k. stanković, “influence of ionizing radiation and hot carrier injection on metal-oxide-semiconductor transistors”. in m. nenoi (ed), current topic in radiation research. intech. institute for new technologies, maastricht (nl), chapter 33, . oclc: 846871029, 2012. [33] c. t. sah, “origin of interface states and oxide charges generated by ionizing radiation”, ieee tran. nucl. sci., vol. 23, pp. 1563-1567, 1976. [34] d. l. griscom, “optical properties and structure of defects in silica glass”, j. ceram. soc. japan, vol. 99, pp. 923-941, 1991. [35] r. helms and e.h. poindexter, “the silicon-silicon-dioxide system: its microstructure and imperfections”, rep. prog. phys., vol. 57, pp. 791-852, 1994. [36] r. a. weeks, “paramagnetic resonance of lattice defects in irradiated quartz”, j. appl. phys., vol. 27, pp. 1376-1381, 1959. [37] h. e. boesch, jr, f.b. mclean, j.m. mcgarrity and g.a. ausman, jr,”hole transport and charge relaxation in irradiated sio2 mos capacitors”, ieee trans. nucl. sci., vol. 22, pp. 2163-2167, 1975. [38] w. l. warren and p.m. lenahan, “a comparison of positive charge generation in high field stressing and ionizing radiation on mos structure”, ieee trans. nucl. sci., vol. 34, pp. 1355-1358, 1987. [39] l. p. trombetta, f.j. feigl and r.j. 
zeto, “positive charge generation in metal-oxide-semiconductor capacitors, j. appl. phys., vol. 69, pp. 2512-2521, 1991. [40] r. k. freitag, d.b. brown and c.m. dosier, “experimental evidence of two species of radiation induced trapped positive charge”, ieee trans. nucl. sci., vol. 40, pp. 1316-1322, 1993. [41] r. k. freitag, d.b. brown and c.m. doser, “evidence for two types of radiation-induced trapped positive charge”, ieee trans. nucl. sci., vol. 41, pp. 1828-1834, 1994. [42] j. e. conley, p.m. lenahan, a.h. lelis and t.r. oldham, “electron spin resonance evidence for the structure of a switching oxide trap: long term structural charge at silicon dangling bond sites in sio 2”, appl. phys. lett., vol. 67, pp. 2179-2181, 1995. [43] j. f. conley, p.m. lenahan, a.j. lelis and t.r. oldham, “electron spin resonance evidence that  e center can behave as switching oxide trap”, ieee trans. nucl. sci., vol. 42, pp. 1744-1749, 1995. [44] d. a. buchanan, a.d. marwick, d.j. dimaria and l. dori, “hot-electron-induced hydrogen redistribution and defect generation in metal-oxide-semiconductors”, j. appl. phys., vol. 76, pp. 35953605, 1994. [45] d. j. dimaria, d.a. buchanan, j.h. stathis and r.e. stahlbush, “interface states induced by the presence of trapped holes near the silicon-silicon-dioxide interface”, j. appl. phys., vol. 77, pp. 20322040, 1995. [46] s.k. lai, “two carrier nature of interface-state generation in hole trapping and radiation damage”, appl. phys. lett., vol. 39, pp. 58-60, 1981. [47] s. k. lai, interface trap generation in silicon dioxide when electrons are captured by trapped holes”, j. appl. phys., vol. 54, pp. 2540-2546, 1983. [48] s. t. chang, j.k. wu and s.a. lyon, “amphoterical defects at si-sio2”, appl. phys. lett., vol. 52, pp. 622-624, 1986. [49] s. j. wang, j.m. sung and s.a. lyon, “relationship between hole trapping and interface state generation in metal-oxide-silicon structures, appl. phys. lett., vol. 52, pp. 1431-1433, 1986. [50] f. b. 
mclean, “a framework for understanding radiation-induced interface states in sio2 mos structures, ieee trans. nucl. sci., vol. 27, pp. 1651-1657, 1980. [51] n. s. saks, c.m. dozier and d.b. brown, ”time dependence of interface trap formation in mosfets following pulsed irradiation”, ieee trans. nucl. sci., vol. 35, no. 6, pp. 1168-1177, 1988. [52] n. s. saks and d.b. brown, “interface trap formation via the two-stage h + process”, ieee tran. nucl. sci., vol. 36, no. 6, pp. 1848-1857, 1989. 540 m. pejović [53] d. l. griscom, d.b. brown and n.s. saks, nature of radiation-induced point deffcts in amorphous sio2 and their role in sio2-on-si structure,the physics and chemistry of sio2 and sisio2 interface, ed c.r. holmes and b.e. deal, ney-york, plenum, 1988. [54] k. l. brower and s.m. mayers, “chemsical kinetics of hydrogen and (111) sisio2 interface defect”, appl. phys. lett., vol. 57, pp. 162-164, 1990. [55] j. h. stathis and e. cartier, “atomic hydrogen reactions with pb centers at the (100) sisio2 interface”, phys. rev. lett., vol. 72, pp. 2745-2748, 1994. [56] e. h. poindexter, “chemical reactions of hydrogenous species in the sisio2 system”, j. non. cryst. solids, vol. 187, pp. 257-263, 1995. [57] r. e. stahlbush, a.h. edwards, d.l. griscom and b.j. mrstik, “post-irradiation cracking of h2 and formation of interface states in irradiated metal-oxide-semiconductor field-effect transistors”, j. appl. phys., vol. 73, pp. 658-667, 1993. [58] m. m. pejović, “physico-chemical processes in vertical-double-diffusion metal-oxide-semiconductor field effect transistors induced by gamma-ray irradiation and post-irradiation annealing”, facta universitatis, series: physics, chemistry and technology, vol. 13, pp. 13-27, 2015. [59] mcwhorter and p.s. winocur, “simple technique for separating the effects of interface traps and trappedoxide charge in metal-oxide semiconductor transistors”, appl. phys. lett., vol. 48, pp. 133-135, 1986. [60] m. v. fischetti, r. gastaldi, f. 
maggoni and a. madelli, “slow and fasdt states induced by hot electrons at sisio2 inteface”, j. appl. phys., vol. 53, pp. 3136-3144, 1982. [61] l. p. trombetta, f.j. feigl and r.j. zeto, “positive charge generation in metal-oxide-semiconductor capacitors”, j. appl. phys., vol. 69, pp. 2512-2521, 1991. [62] r. k. freitag, d.b. brown and c.m. dozier, “experimental evidence of two species of radiation induced trapped positive charge”, ieee tran. nucl. sci., vol. 40, pp. 1316-1322, 1993. [63] a.j. lelis. and t.r. oldham, “time dependence of switching oxide traps”, ieee tran. nucl. sci., vol. 41, pp. 1835-1843, 1994. [64] d. m. fleetwood, “border traps in mos devices”, ieee tran. nucl. sci., vol. 39, 269-271, 1992. [65] v. davidovic, ph. d., university of nis, 2010. [66] s. m. sze, physics of semiconductor devices, ney york, wiley, 1981. [67] a. holmes-siedle and l. adams, handbook of radiation effects, 2 nd ed., new york: oxford university press, 2002. [68] m.a.b. eliot, “the use charge pumping currents to measure surface state densities in mos transistors”, solid-state electron., vol. 19, pp. 241-247, 1986. [69] j.s. brugler and p.g. jespres, “charge pumping in mos devices”, ieee trans. electron dev. lett., vol. 13, pp. 627-629, 1969. [70] g. groeseneken, h.e. maes, n. baltron and r.f. de keersmaeeker, “a reliable approch to chargepumping measurements in mos transistors”, ieee trans. electron dev., vol. 31, pp. 42-53, 1984. [71] r. e. paulsen, r.r. siergiej, m.l. french andm.h. white, “observation of near-interface oxide traps with the change pumping technique”, ieee electron dev. lett., vol. 13, pp. 627-629, 1992. [72] d. habaš, z. prijić, d. pantić and n. stojadinović, “charge-pumping characterization of sio2/si interface virgin and irradiated power vdmosfets”, ieee trans. electron dev., vol. 43, pp. 2197-2208, 1996. [73] s. c. witezak, k.f. gallawoy, r.d. schrimpf and j.r. brews, g. 
prevost, “ the determination of si sio2 interface trap density in irradiated four-terminal vdmosfets using charge pumping”, ieee trans. nucl. sci., vol. 43, pp. 2558-2564, 1996. [74] g. s. ristić, m.m. pejović and a.b. jakšić, „comparison between post-irradiation annealing and posthigh electrical field stress annealing of n-channel power vdmosfets”, appl. surf. sci., vol. 220, pp. 181-185, 2003. [75] a. kelleher, m. o’sullivan, j. rayn, b. o’neal and w. lane, “development of the radiation sensitivity of pmos dosimeters”, ieee tran. nucl. sci., vol. 39, pp. 342-346, 1992. [76] i. thomson, “direct reading dosimeters”, european patent office, ep0471957a2, 02/07/1991. [77] s. best, a. ralson and n. suchowerska, “clinical application of the one dose patient dosimetry system for total body irradistion”, phys. in medic. and biology, vol. 50, pp. 5909-5919, 2005. [78] m. m. pejović, “the gamma-ray irradiation sensitivity and dosimetric information instability of radfet dosimeter”, nucl. technol. and radiat. protection, vol. 28, pp. 415-421, 2013. [79] i. thomson, r.e. thomson and l. p. brendt, “radiation dosimetry with mos sensors”, radiation protec. dosimetry, vol. 6, pp. 121-124, 1983. [80] l. s. august, r.r. circle and j.c. ritter, “an mos dosimeter for use in space”, ieee tran. nucl. sci., vol. 30, pp. 508-511, 1983. p-channel mosfet as a sensor and dosimeter of ionizing radiation 541 [81] l. s. august, “estimating and reducing errors in mos dosimeters caused by exposure to different radiations”, ieee trans. nucl. sci., vol. 29, no. 6, pp. 2000-2003, 1982. [82] g. sarrabayrouse, a. bellaouar and p. rossel, “electrical properties of mos radiation dosimeters”, revue phys. appl., vol. 21, pp. 283-287, 1986. [83] a. ballaouar, g. sarrabayrouse and p. rassel, “mos transistor for ionizing radiation dosimetry”, proc. 13 th yugoslav conf. on mictoelectronics (miel 85), ljubljana, pp. 161-168, 1985. [84] l. adams and a. 
holmes-siedle, “the development of mos dosimetry unit for use in space”, ieee trans. nucl. sci., vol. 18, pp. 1607-1612, 1978. [85] l. adams, e.j. daly, r. harboe-sorensen, a.g. holmes-siedle, a.k. ward and a.a. bull, “measurements of seu and total dose in geostationary orbit under normal and solar frame conditions”, ieee trans. nucl. sci., vol. 38, pp. 1686-1692, 1991. [86] j. s. leffler, s.r. lendgren and a.g. holmes-siedle, “the aplications of radfet dosimetry to equipment radiation qualification and monitoring”, trans. of the american society, vol. 60, pp. 535536, 1989. [87] a. g. holmes-siedle, l. adams, j.s. leffler and s.r. lingren, ”the radfet system for real-time dosimetry in nuclear facilities”, 7th annual astm-euratom symp. on reac. dosimetry, strasbourg, pp. 851-859, 1990. [88] g. ristić, s. golubović and m. pejović, “p-channel metal-oxide-semiconductor detector fading dependencies on gate bias and oxide thickness”, appl. phys. lett., vol. 66, pp. 88-89, 1995. [89] g. ristić, s. golubović and m. pejović, “sensitivity and fading of pmos dosimeters with thick gate oxide”, sensors and actuators a, vol. 51, pp. 153-158, 1996. [90] z. savić, s. stanković, m. kovačević and m. petrović, „energy dependence of pmos dosimeters“, radiation protect. dosimetry, vol. 64, pp. 205-211, 1996. [91] m. m. pejović and m. m. pejović, „radiation-sensitive field effect transistor response to gamma-ray irradiation“, nuclear technol. and radiat. protection, vol. 26, pp. 25-31, 2011. [92] m. m. pejović, „dose response, radiation sensitivity and signal fading of p-channal mosfets (radfrts) irradiated up to 50 gy with 60 co”, appl. radiation and isotopes, vol. 104, 100-115, 2015. [93] s. pejović, p. bošnjaković, o. ciraj-bjelac and m.m. pejović, “characteristics of a pmosfet suitable for use in radiotherapy”, appl. radiation and isotopes, vol. 77, pp. 44-49, 2013. [94] g. ristić, a. jakšić and m. 
pejović, „pmos dosimetric transistors with two-layer gate oxide”, sensors and actuators a, vol. 63, pp. 129-134, 1997. [95] m. m. pejović, s.m. pejović, d. stojanov and o. ciraj-bjelac, “sensitivity of radfets for gamma and x-ray doses used in medicine”, nuclear technol. and radiat. protection, vol. 29, pp. 179-185, 2014. [96] m. pejović, o. ciraj-bjelac, m. kovačević, z. rajović and g. ilić, “sensitivity of p-channel mosfet to xand gamma-ray irradiation”, international journal of photoenergy, vol. 2013, pp. 1-6, 2013. [97] c. ehringfeld, s. schmid, k. poljanc, ch. kirisits, h. aiginger and d. georg, “application of commercial mosfet detectors in vivo dosimetry in the therapic x-ray range from 80 kv to 250 kv, physics in medicine and biology, vol. 50, pp. 289-303, 2005. [98] s. m. pejović, m.m. pejović, d. stojanov and o. ciraj-bjelac, “sensitivity and fading of pmos dosimeters irradiated with x-ray radiation doses from 1 to 100 cgy”, radiation protect. dosimetry, vol. 168, pp. 33-39, 2016. [99] p. j. mcwhorter, s.l. miller and w.m. miller, “modeling the anneal of radiation-induced traps holes in a varying thermal environment”, ieee trans. nucl. sci., vol. 37, pp. 16821689, 1990. [100] m. m. pejović, m. m. pejović and a.b. jakšić, “response of pmos dosimeters on gamma-ray irradiation during its re-use”, radiation protection dosimetry, vol. 155, pp. 394-403, 2013. [101] j. aristu, f. calvo, r. martinez, j. dubois, m. santors, s. fisher, et al., “lung cancer, in; intraoperative irradiation techniques and results, 437-453, 1999. [102] l. j. asensio, m.a. carvajal, j.a. lopez-villaneva, m. vilches, a.m. lallena and a.j. palma, „evaluation of a low-cost commercial mosfet as radiation dosimeter“, sensors and actuators a, vol. 125, pp. 288-295, 2006. [103] m. s. martinez-garcia, f. simancos, a.j. palma, a.m. lallena, j. banqueri and m.a. 
carvajal, „general purpose mosfets for the dosimetry of electron beams used in intra-operative radiotherapy“, sensors and actuators a, vol. 210, pp. 175-181, 2014.
[104] m. m. pejović, „application of p-channel power vdmosfet as a high radiation dose sensor“, ieee trans. nucl. sci., vol. 62, pp. 1905-1910, 2015.

facta universitatis series: electronics and energetics vol. 31, no 1, march 2018, pp. 89-100 https://doi.org/10.2298/fuee1801089a

introducing a novel high-efficiency arc less heterojunction dj solar cell

sobhan abasian 1,2, reza sabbaghi-nadooshan 1
1 electrical engineering department, islamic azad university, central tehran branch, tehran, iran
2 ahvaz electricity distribution company, ahvaz, iran

abstract. the present study examined the effect of the structure and performance of heterojunctions on the fill factor, short circuit current and open circuit voltage of an ingap/gaas dual-junction solar cell. the goal of this work was to reduce recombination in the bottom cell so that the electrons and holes produced in the top cell participate in the output current with the lowest recombination. semiconductors with a high band gap from the iii-v group were studied in order to obtain a high open circuit voltage. by examining the mobility and lattice constant of the semiconductors (al0.52in0.48p, gaas and in0.49ga0.51p), it was concluded that al0.52in0.48p has high electron and hole mobility and that its lattice constant, matched to that of gaas, can be effective in reducing recombination. the cathode current and absorbed photons show that the ingap/alinp combination increased the number of charge carriers in the top cell. the ingap-alinp/gaas-alinp structure was obtained by inserting an ingap-alinp heterojunction at the top and a gaas-alinp heterojunction at the bottom of an ingap/gaas dual-junction cell.
for this structure, short circuit current (jsc) = 22.96 ma/cm², open circuit voltage (voc) = 2.72 v, fill factor (ff) = 93.26% and efficiency (η) = 58.28% were obtained under am1.5 (1 sun) illumination.

key words: solar cell, dual-junction, heterojunction, arc less

received march 30, 2017; received in revised form july 14, 2017
corresponding author: reza sabbaghi-nadooshan, electrical engineering department, islamic azad university, central tehran branch, tehran, iran (e-mail: sobhan_aba@yahoo.com)

1. introduction

the growing demand for energy and increasing environmental pollution have drawn the attention of researchers and investors to the renewable energy sector. solar energy is a renewable resource that produces electricity through photovoltaic solar cells. much research has been conducted to increase the efficiency of solar cells by using iii-v compound semiconductors and by increasing the number of p-n junctions for greater absorption of sunlight. hutchby et al. presented the first algaas/gaas dual-junction cell [1]. lueck et al. obtained an efficiency of 23.6% under am1.5 (1 sun) illumination by placing the ingap/gaas dual-junction cell on gaas [2]. leem et al. designed an ingap/gaas dual-junction cell with an ingap/ingap tunnel diode. the efficiency of this cell was 25.14% under am1.5 (1 sun) illumination [3]. singh et al. obtained an efficiency of 32.196% under am1.5 (1 sun) illumination by inserting in0.5(al0.7ga0.3)0.5p into the back surface field (bsf) layer at the bottom of an ingap/gaas dual-junction cell [4]. nayak et al. reported an efficiency of 39.15% under am1.5 (1000 sun) illumination by creating an extra electric field using two back surface field (bsf) layers [5]. abbasian et al. used the semiconductor in0.5(al0.7ga0.3)0.5p at low irradiation and arrived at 53.51% efficiency under am1.5g (1 sun) illumination [6].
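the four figures of merit compared throughout this literature are tied together by the standard relation η = jsc·voc·ff / pin. the python sketch below applies it to the values quoted for the proposed cell. it assumes an incident power density of 100 mw/cm² for 1 sun, which is the usual convention for the am1.5 standard but is not stated explicitly in the text; the function name is illustrative.

```python
def efficiency_percent(jsc_ma_cm2, voc_v, ff_percent, pin_mw_cm2=100.0):
    """Conversion efficiency (%) from Jsc (mA/cm^2), Voc (V) and FF (%).

    pin_mw_cm2 is the incident power density; 100 mW/cm^2 for 1 sun is an
    assumption made for this sketch.
    """
    if pin_mw_cm2 <= 0:
        raise ValueError("incident power density must be positive")
    # mA/cm^2 * V = mW/cm^2, scaled by the fill factor as a fraction.
    p_max_mw_cm2 = jsc_ma_cm2 * voc_v * ff_percent / 100.0
    return 100.0 * p_max_mw_cm2 / pin_mw_cm2


if __name__ == "__main__":
    eta = efficiency_percent(22.96, 2.72, 93.26)
    print(f"efficiency = {eta:.2f} %")
```

with the quoted jsc, voc and ff this gives roughly 58.2%, consistent with the reported 58.28% to within rounding of the inputs.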
the current study obtained a favorable structure for ingap-alinp/gaas-alinp dual-junction cells using heterojunctions. the performance of the proposed cell was simulated using silvaco atlas under the standard am1.5 spectrum, and the values obtained for efficiency (η), fill factor (ff), open circuit voltage (voc) and short circuit current (jsc) were compared with those from previous works. in the rest of the paper, section 2 describes the solar cell model. section 3 shows and discusses the results, and section 4 compares the results with other works. finally, section 5 concludes the paper.

2. modeling solar cells

2.1. structure of multi-junction cells

in multi-junction cells, each cell consists of a window layer, a p-n junction layer and a back surface field (bsf) layer. the p-n junction of each cell uses a semiconductor that absorbs the wavelengths matched to its band gap, and the cells are connected by a tunnel junction. the window layer has a high band gap, so it lets photons pass like a transparent substance and allows maximum absorption of sunlight. charge carriers generated by the absorbed photons are separated at the p-n junction, and the back surface field (bsf) layer reduces surface recombination of the charge carriers by producing an electric field. the tunnel junction gives the charge carriers a low-resistance path by creating a narrow depletion region [4-8]. figure 1 shows a dual-junction solar cell.

fig. 1 layers of dual-junction solar cell

2.2. selection of materials for heterojunction

heterojunctions are used to combine the properties of different materials at a p-n junction. materials with a high band gap such as al0.52in0.48p and gaas were used to reduce the effect of recombination in the gaas emitter layer and to increase the open circuit voltage [9]. m.r.
islam et al. [10] showed that the combination of ingap/alinp increased the lifetime of the carriers, reduced recombination and increased the short circuit current. table 1 shows that the electron and hole mobility of al0.52in0.48p is suitable for combination with the gaas and in0.49ga0.51p semiconductors. the lattice constant is an important parameter when different semiconductors are combined. table 1 shows that in0.5(al0.7ga0.3)0.5p, in0.49ga0.51p and al0.52in0.48p have similar lattice constants, which prevents trap levels from forming in the structure. to increase the efficiency of the ingap/gaas dual-junction base cell [5], ingap/alinp and gaas/alinp heterojunctions were placed at the top and bottom of the cells, respectively. figure 2 shows the proposed structure to increase the efficiency of the ingap/gaas dual-junction cell. as can be seen, a high-band-gap semiconductor was selected for the window layer to allow the light to pass through and to prevent surface recombination. the p-n junction in the top cell absorbed the shorter wavelengths of the light spectrum, and the bsf layers blocked the recombination of electrons and holes generated in the top cell. the top and bottom cells were connected by a gaas tunnel junction with a band gap of 1.4 ev. the bottom cell absorbed the long wavelengths of sunlight, and the passage of charge carriers generated in the top cell increased the efficiency of the dual-junction cell.

table 1 major parameters for the ternary (al0.52in0.48p, in0.49ga0.51p) and quaternary in0.5(al0.7ga0.3)0.5p materials, lattice matched to gaas, used in this design [11-14].

material                                    alinp      inalgap    ingap      gaas
band gap eg (ev) @ 300 k                    2.4        2.3        1.9        1.42
lattice constant α (å)                      5.65       5.65       5.65       5.65
permittivity (es/eo)                        11.7       11.7       11.6       13.1
affinity (ev)                               4.2        4.2        4.16       4.07
heavy e⁻ effective mass (me*/m0)            2.65       2.85       3          0.063
heavy h⁺ effective mass (mh*/m0)            0.64       0.64       0.64       0.5
e⁻ mobility mun (cm²/v·s)                   2291       2150       1945       8800
h⁺ mobility mup (cm²/v·s)                   142        141        141        400
e⁻ density of states nc (cm⁻³)              1.08e+20   1.20e+20   1.30e+20   4.7e+17
h⁺ density of states nv (cm⁻³)              1.28e+19   1.28e+19   1.28e+19   7.0e+18

fig. 2 schematic of the proposed dual-junction cell

2.3. simulation model

the performance of the proposed cell was simulated using silvaco atlas under the standard am1.5 (1 sun) spectrum, and the results were compared with those from previous works. figure 3 shows the meshing of the proposed cell. the regions were meshed with different densities to simulate the structure of the solar cell accurately. in this model, qtx.mesh and qty.mesh were used to calculate the quantum tunneling current. figure 4 shows the energy band diagram of the dual-junction cell at a bias voltage of 0 v.

fig. 3 generated mesh of the proposed dual-junction cell.

fig. 4 energy band diagram of the proposed model

3. discussion and results

3.1. thickness and optimal impurity for upper and lower cells

in a p-n junction solar cell, the impurity concentration in the emitter layer should be greater than in the base layer. the emitter layer is an absorbing layer in solar cells. to improve the performance of the solar cell, the thickness of this layer should be less than that of the base layer [15]. the optimal thickness (emitter = 0.05 μm, base = 0.55 μm) and impurity concentration (emitter = 2×10¹⁹ cm⁻³, base = 7×10¹⁶ cm⁻³) for the ingap/alinp heterojunction are shown in figures 5a and 5b. the optimal thickness (emitter = 0.3 μm, base = 3.0 μm) and impurity concentration (emitter = 2×10¹⁸ cm⁻³, base = 2×10¹⁷ cm⁻³) for the gaas/alinp heterojunction are shown in figures 6a and 6b.

fig. 5 a) different solar cell parameters with various doping of new top cell, b) the different parameters obtained by varying the thickness of new top cell.

fig.
6 a) different solar cell parameters with various doping of new bottom cell, b) the different parameters obtained by varying the thickness of new bottom cell.

3.2 illumination with am1.5g

figures 7 and 8 show the optical intensity in the base cell [5] and the proposed cell. comparison of the optical intensity of the different layers indicates that the optical intensity of the proposed cell is less than that of the base cell because of the increased absorption of photons in the different layers. the value of the spectral response determines the solar cell gain and includes the source photocurrent, the available photocurrent and the cathode current.

fig. 7 optical intensity of different layers of the base model

fig. 8 optical intensity of different layers of the proposed model.

the source photocurrent represents the amount of photons produced by the light source. the available photocurrent and the cathode current measure the photons absorbed in the solar cell and the output current resulting from their absorption, respectively [15]. figure 9 shows the absorbed photons and cathode current for the ingap and ingap/alinp cells and indicates that the amount of absorbed photons and the cathode current are greater in the ingap/alinp cell. figure 10 shows the absorbed photons and cathode current in the gaas and gaas/alinp cells; both are greater in the gaas/alinp cell.

fig. 9 generation of photocurrent by top cell of the base and proposed dual-junction cell

fig. 10 generation of photocurrent by bottom cell of the base and proposed dual-junction cell

3.3.
photogeneration

photogeneration describes the amount of electron-hole pairs produced by sunlight in the different layers of a solar cell and is obtained as:

$$ G = \eta_0 \frac{P \lambda}{h c} \alpha e^{-\alpha y} \qquad (1) $$

where G is the photogeneration rate, \eta_0 is the internal quantum efficiency, P is the cumulative effect of reflections, transmissions and losses due to absorption, \lambda is the wavelength, h is planck's constant, c is the speed of light, \alpha is the absorption coefficient for each set of (n,k) and y is the relative distance [4]. figures 11 and 12 show that photogeneration decreased from the top of the cell to the bottom due to the reduced absorption of photons in the lower layers. for example, in the base cell, photogeneration, which primarily occurs in the window layers at the top of the cell, equaled 10^22 electrons and holes per cm^3. this amount decreased to ~10^19 electrons and holes per cm^3 in the base layer at the bottom of the cell. figure 13 shows photogeneration in the base and proposed cells. comparison of the two cells indicates that the higher photogeneration in the base layers of the top and bottom cells of the proposed model was caused by a reduction in the rate of recombination of the electrons and holes produced and an increase in the absorption of photons after application of the inalp semiconductor.

fig. 11 photogeneration rate of the base model

fig. 12 photogeneration rate of the proposed model

fig. 13 cutline view of photogeneration rate of the base and proposed model

3.4. important parameters in solar cells

3.4.1. short circuit current and open circuit voltage

the total current in one solar cell is obtained as [4]:

$$ I = I_L - I_0 \left[ \exp\left( \frac{qV}{nkT} \right) - 1 \right] \qquad (2) $$

where n is the diode ideality factor, k is the boltzmann constant, T is the temperature (k), q is the electron charge, V is the voltage across the cell, I_L is the light generated current and I_0 is the dark saturation current. the open circuit voltage follows from setting I = 0 in (2) and is obtained as [4]:

$$ V_{oc} = \frac{nkT}{q} \ln\left( \frac{I_L}{I_0} + 1 \right) \qquad (3) $$

3.4.2.
fill factor

the fill factor is the ratio of the maximum output power to the ideal power of a solar cell and is obtained using the i-v curve. this factor is expressed in percent and is calculated as [16]:

$$ FF = \frac{V_{max} I_{max}}{V_{oc} I_{sc}} \times 100\% \qquad (4) $$

where V_{max} and I_{max} are the voltage and current at the maximum power point, and V_{oc} and I_{sc} are the open circuit voltage and short circuit current.

3.4.3. efficiency

the performance of a solar cell is determined by the efficiency, which is obtained as [16]:

$$ \eta = \frac{V_{oc} I_{sc} FF}{P_{in}} \qquad (5) $$

where P_{in} is the incident light power. the i-v curve for the proposed model is illustrated in figure 14.

fig. 14 i-v curve of the base and proposed model

4. comparison of performance

the short circuit current, open circuit voltage, fill factor and efficiency of the optimized model of the proposed cell are compared with the results of other models in table 2. in abbasian et al. [6], the in0.5(al0.7ga0.3)0.5p semiconductor was reduced in the cells, which increased the electric field. this resulted in less recombination; thus, the use of the gaas/alinp heterojunction in the bottom cell reduced recombination and increased efficiency. as seen, the use of the ingap/alinp heterojunction at the top and the gaas/alinp heterojunction at the bottom of a dual-junction cell increased its efficiency.

table 2 comparison of the proposed model with different optimized ingap/gaas dj solar cell structures for the am1.5g spectrum

solar cell            spectrum   sun    voc (v)   jsc (ma/cm^2)   ff (%)   η (%)
lueck et al. [2]      am1.5g     1.0    2.23      10.9            79.00    23.6
leem et al. [3]       am1.5g     1.0    2.30      10.7            87.55    25.14
singh & sarkar [4]    am1.5g     1.0    2.39      16.1            87.52    32.196
nayak et al. [5]      am1.5g     1000   2.66      17.3            88.67    39.15
dutta et al. [17]     am1.5g     1000   2.668     18.2            88.29    40.879
sahoo et al. [7]      am1.5g     1000   2.7043    18.9            88.88    43.603
abbasian et al. [6]   am1.5g     1      3.347     17.5            90.90    53.51
this model            am1.5g     1.0    2.72      22.9            93.26    58.28

5.
conclusion

in this design, the short circuit current, open circuit voltage and fill factor increased as a result of changing the structure and using the inalp semiconductor, with a band gap of 2.4 ev, in the base layer of the top and bottom cells of a gainp/gaas dual-junction cell. the efficiency has been optimized by changing the thicknesses and the impurity densities of the top and bottom layers. this optimized cell provides open circuit voltage (voc) = 2.72 v, short circuit current (jsc) = 22.96 ma/cm^2, fill factor (ff) = 93.26% and efficiency (η) = 58.28% under 1 sun illumination.

references

[1] j.a. hutchby, r.j. markunas, s.m. bedair, "material aspects of the fabrication of multi junction solar cells", in proceedings of the 14th critical reviews of technology conference. arlington, 1985, pp. 40-61.
[2] m.r. lueck, c.l. andre, a.j. pitera, m.l. lee, e.a. fitzgerald, s.a. ringel, "dual junction gainp/gaas solar cells grown on metamorphic sige/si substrates with high open circuit voltage", ieee electron device lett., vol. 27, pp. 142-144, 2006.
[3] j.w. leem, y.t. lee, j.s. yu, "optimum design of ingap/gaas dual-junction solar cells with different tunnel diodes", opt. quantum electron., vol. 41, pp. 605-612, 2009.
[4] k.j. singh, s.k. sarkar, "highly efficient arc less ingap/gaas dj solar cell numerical modeling using optimized inalgap bsf layers", opt. quantum electron., vol. 43, pp. 1-21, 2009.
[5] p.p. nayak, j.p. dutta, g.p. mishra, "efficient ingap/gaas dj solar cell with double back surface field layer", eng. sci. technol. int. j., vol. 18, pp. 325-335, 2015.
[6] s. abbasian, r. sabbaghi-nadooshan, "design and evaluation of arc less ingap/algainp dj solar cell", optik, vol. 136, pp. 487-496, 2017.
[7] g.s. sahoo, p.p. nayak, g.p. mishra, "an arc less ingap/gaas dj solar cell with hetero tunnel junction", superlattices and microstructures, vol. 95, pp. 115-127, 2016.
[8] f.s. gabibov, e.m. zobov, "effect of optical and thermal stimulation on gaas photosensitivity", inorganic materials, vol. 49, no. 8, pp. 754–757, 2013.
[9] a.s. gudovskikh, k.s. zelentsov, n.a. kalyuzhnyy, v.m. lantratov, s.a. mintairov, "anisotype gaas based heterojunctions for iii-v multijunction solar cells", in proceedings of the 25th european photovoltaic solar energy conference and exhibition, 2010.
[10] m.r. islam, r.d. dupuis, a.l. holmes, a.p. curtis, n.f. gardner, g.e. stillman, j.e. baker, r. hull, "luminescence characteristics of inalp-ingap heterostructures having native-oxide windows", journal of crystal growth, vol. 170, pp. 413-417, 1997.
[11] i. vurgaftman, j.r. meyer, l.r. ram-mohan, "band parameters for iii-v compound semiconductors and their alloys", j. appl. phys., vol. 89, pp. 5815, 2001.
[12] silvaco data systems inc, silvaco atlas user's manual, 2010.
[13] h.y. lee, c.t. lee, "the investigation for various treatments of inalgap schottky diodes", in proceedings of the 8th international conference on electronic materials, iumrs-icem 23, 2002, pp. 99-102.
[14] p. michalopoulos, "a novel approach for the development and optimization of state-of-the-art photovoltaic devices using silvaco", naval postgraduate school monterey, california, 2002.
[15] a. luque, s. hegedus, handbook of photovoltaic science and engineering, england, john wiley & sons ltd., 2003, pp. 83-87.
[16] s.m. sze, m.k. lee, semiconductor devices: physics and technology, wiley, 2010.
[17] j.p. dutta, p.p. nayak, g.p. mishra, "design and evaluation of arc less ingap/gaas dj solar cell with ingap tunnel junction and optimized double top bsf layer", optik, vol. 127, pp. 4156-4161, 2007.

facta universitatis series: electronics and energetics vol. 34, no 2, june 2021, pp.
239-257 https://doi.org/10.2298/fuee2102239c

vladimir ćirić, dušan cvetković, nadja gavrilović, natalija stojanović, ivan milentijević

received september 16, 2020; received in revised form october 28, 2020
corresponding author: vladimir m. ćirić, faculty of electronic engineering, computer science department, aleksandra medvedeva 14, 18000 niš, serbia, e-mail: vladimir.ciric@elfak.ni.ac.rs

facta universitatis series: electronics and energetics vol. 28, no 4, december 2015, pp. 507-525 doi: 10.2298/fuee1504507s

horizontal current bipolar transistor (hcbt) – a low-cost, high-performance flexible bicmos technology for rf communication applications

tomislav suligoj1, marko koričić1, josip žilak1, hidenori mochizuki2, so-ichi morita2, katsumi shinomura2, hisaya imai2
1university of zagreb, faculty of electrical engineering and computing, department of electronics, micro and nano-electronics laboratory, croatia
2asahi kasei microdevices co. 5-4960, nobeoka, miyazaki, 882-0031, japan

abstract. in an overview of horizontal current bipolar transistor (hcbt) technology, the state-of-the-art integrated silicon bipolar transistors are described which exhibit ft and fmax of 51 ghz and 61 ghz and an ft·bvceo product of 173 ghz·v that are among the highest-performance implanted-base silicon bipolar transistors. hcbt is integrated with cmos in a considerably lower-cost fabrication sequence as compared to standard vertical-current bipolar transistors, with only 2 or 3 additional masks and fewer process steps. due to its specific structure, the charge sharing effect can be employed to increase bvceo without sacrificing ft and fmax. moreover, the electric field can be engineered just by manipulating the lithography masks, achieving high-voltage hcbts with breakdowns up to 36 v integrated in the same process flow with high-speed devices, i.e. at zero additional cost. a double-balanced active mixer circuit is designed and fabricated in hcbt technology. the maximum iip3 of 17.7 dbm at a mixer current of 9.2 ma and a conversion gain of -5 db are achieved.

key words: bicmos technology, bipolar transistors, horizontal current bipolar transistor, radio frequency integrated circuits, mixer, high-voltage bipolar transistors.

1. introduction

in the highly competitive wireless communication markets, the rf circuits and systems are fabricated in technologies that are very cost-sensitive. in order to minimize the fabrication costs, the sub-10 ghz applications can be processed by using the high-volume silicon technologies. it has been identified that the optimum solution might

received march 9, 2015
corresponding author: tomislav suligoj, university of zagreb, faculty of electrical engineering and computing, department of electronics, micro and nano-electronics laboratory, croatia (e-mail: tom@zemris.fer.hr)

input splits design techniques for network intrusion detection on hadoop cluster

university of niš, faculty of electronic engineering, niš, serbia

abstract. intrusion detection system (ids) is one of the most important components used to monitor a network for possible cyber-attacks. however, the amount of data that should be inspected imposes a great challenge to idss. with the recent emergence of various big data technologies, there are ways of overcoming the problem of the increased amount of data. nevertheless, some of these technologies inherit data distribution techniques that can be a problem when splitting sensitive data such as network data frames across cluster nodes. the goal of this paper is the design and implementation of a hadoop based ids. in this paper we propose different input split techniques suitable for network data distribution across cloud nodes and test the performance of their apache hadoop implementation. four different data split techniques will be proposed and analysed. the techniques will be described in detail. the system will be evaluated on an apache hadoop cluster with 17 slave nodes. we will show that processing speed can differ by more than 30% depending on the chosen input split design strategy. additionally, we will show that the level of malicious network traffic can slow down processing, in our case by nearly 20%. the scalability of the system will also be discussed.

key words: network intrusion detection, cloud computing, apache hadoop.

© 2021 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper

1 introduction

the complexity of the internet, the diversity of available services, and the desire to expand applications of the global network contribute to its increased insecurity. even with decades of research, and a lot of available security products, the internet has steadily become more and more dangerous [1,2]. living in the era when everything is connected to the internet requires a different security strategy. when the attack begins, it is irrelevant how the network is configured, what kind of “boxes” the network has, or how many security devices are installed. the only thing that matters is who is defending the network. the only way to stay ahead of new vulnerabilities and attacks is through vivid detection and response [3].
unfortunately, constant security monitoring is a key component missing in most networks [4,5].

intrusion detection system (ids) is one of the most important components used to detect attacks in monitored network traffic [6]. intrusion detection is broadly considered to be a classification problem. based on their classification model, idss are classified into signature (or pattern) matching and anomaly based idss. the signature matching ids monitors the network activity for a known misuse pattern that was previously identified as a malicious attempt [6]. having in mind typical bandwidths on the network boundaries, the amount of data that needs to be analyzed for malicious signatures becomes challenging.

there are ids implementations available that aim to speed up network packet analysis [7–12]. different approaches to task and data parallelism were exploited [9,10,12]. some implementations use multi-core software development frameworks to parallelize the execution on cpu [11], while some utilize gpus [8].

apache hadoop is a framework for distributed processing of large amounts of data on clusters of computers (nodes) using the mapreduce programming model, where each node offers local computation and storage [13]. hadoop distributed file system (hdfs) is used for distributed data storage, and it represents a layer above the existing file system of every node in the cluster, used to store input files or parts of them. large files are split into a group of smaller blocks. the size of these blocks is fixed, so it is easy for hadoop to index any block within the file [7]. however, this data distribution technique can introduce problems when splitting sensitive data such as network data frames across cluster nodes. due to the fixed size of the block, one part of a network packet can end up on one node, while the other part is on another, making malicious pattern matching challenging [14,15].

several authors already dealt with the problem of ids implementation on hadoop.
however, to the best of our knowledge, there is no solution that implements an ids on hadoop without the support of other software tools. in [16,17] the authors used hadoop to analyse logs gathered from the well-known snort ids. in [18] the authors proposed hadoop as a distributed database manager, but the main processing is not performed by hadoop.

the goal of this paper is the design and implementation of an ids based on apache hadoop, with a focus on techniques for splitting data and distributing it to cluster nodes. in this paper we propose different input split techniques suitable for network data distribution across cloud nodes and test the performance of their apache hadoop implementations. four different data split techniques will be proposed and analysed. the techniques will be described in detail. the ids will be implemented using the myers pattern search algorithm as a core for signature-based packet analysis and evaluated on an apache hadoop cluster with 17 slave nodes. we will show that processing speed can differ by more than 30% depending on the chosen input split design strategy. additionally, we will show that the level of malicious network traffic can slow down processing, in our case by nearly 20%. the scalability of the system will also be discussed.

the paper is organized as follows. section 2 gives a brief introduction to ids. section 3 is devoted to the mapreduce framework, as a basis for the proposed apache hadoop implementation. section 4 is the main section and presents the design of the ids workflow on the hadoop framework. in this section we will discuss the design of data input split techniques as well. section 5 is devoted to the system evaluation, while in section 6 concluding remarks are given.

2 intrusion detection system background

an ids monitors network traffic and deploys various techniques in order to provide security services.
based on the technique used to assess the network packets as regular or malicious, idss are classified into signature (or pattern) matching and anomaly based idss [11,19,20]. the signature matching ids searches the network traffic for a known misuse pattern that was previously identified as a malicious attempt [7,8]. a database with malicious signatures is prepared in advance. this leads to fast and reliable operation, but these idss are not able to detect new attacks that have not been seen before. the anomaly based idss make the decision based on a profile of normal network behavior, and they are capable of detecting zero day attacks, with the drawback of possible false positives [20,21]. in this paper we will focus on pattern matching based ids.

the workflow of a pattern matching based ids is shown in fig. 1 [12,19]. intrusion detection starts with network monitoring, followed by network packet preparation for efficient pattern matching, which is based on the predefined signature database (fig. 1). network monitoring can be performed as packet capture, deep packet inspection or flow-based monitoring. packet capture intercepts a data packet that is crossing a specific computer network, but it focuses only on packet headers. deep packet inspection (dpi) is an advanced method of packet filtering, which inspects packets at the application layer of the osi (open systems interconnection) reference model.

fig. 1: the typical ids workflow.

any signature based ids checks for the presence of a malicious signature in the incoming packet sequence and acts as instructed by the corresponding rule. snort is a widely used open-source ids based on pattern matching [11].
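the signature check described above can be sketched in a few lines. this is only an illustration under stated assumptions: the `Rule` type, the two signatures and the `inspect` function are hypothetical names introduced here, the rules are not written in snort syntax, and plain exact substring search stands in for whatever matching algorithm a real ids uses.

```python
from typing import List, NamedTuple


class Rule(NamedTuple):
    sid: int        # hypothetical rule identifier
    pattern: bytes  # byte signature searched for in the payload
    action: str     # what the rule instructs on a match, e.g. "alert"


# Hypothetical signature database, prepared in advance.
SIGNATURES = [
    Rule(1, b"/etc/passwd", "alert"),
    Rule(2, b"cmd.exe", "drop"),
]


def inspect(payload: bytes, rules: List[Rule] = SIGNATURES) -> List[Rule]:
    """Return every rule whose signature occurs somewhere in the payload."""
    return [rule for rule in rules if rule.pattern in payload]


if __name__ == "__main__":
    # Exact substring search is an assumption made for brevity.
    for rule in inspect(b"GET /../../etc/passwd HTTP/1.1"):
        print(rule.action, rule.sid)
```

note that a check of this kind only sees the bytes it is handed: a signature that straddles two separately processed fragments of the traffic would go undetected, which is why the way the input is split matters.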
the pattern matching algorithm must be fast enough to keep up with the network link speed. there are various implementations of pattern matching algorithms [7–12]. we use the myers pattern search algorithm for dpi packet inspection, with rules in snort syntax, as proposed in [12]. in order to speed up pattern matching, in this paper we choose the apache hadoop distributed environment, with a focus on the distribution of network traffic data across the nodes.
3 apache hadoop hdfs and mapreduce
apache hadoop is a framework for distributed computing based on the mapreduce programming model, where each computer in a hadoop cluster (node) offers local computation and storage [13]. an apache hadoop cluster consists of one master node and many slave nodes. apache hadoop is available in versions 1.x and 2.x. a hadoop 1.x system has two main components: the hadoop distributed file system (hdfs), used for distributed data storage, and the mapreduce computing framework for data manipulation. the architecture of hadoop 2.x adds yarn (yet another resource negotiator) as an extension for resource management.
hdfs is an abstraction over the file systems of all cluster nodes, which creates the illusion of a common file storage. large files are split into smaller parts called blocks (the default block size is 64 mb) [13]. the block size is fixed in order to simplify indexing. hdfs has a master-slave architecture, based on two types of (linux) daemons: datanode and namenode (fig. 2). the namenode runs on the master node and is responsible for managing the datanodes (slaves) [13].
fig. 2: hdfs components and their communication
the namenode is also responsible for maintaining the replication factor of data blocks. replication contributes to fault tolerance by keeping several copies of each block across the cluster. in fig. 2 the replication factor is 2 (the default replication factor is 3). if a datanode fails, the namenode chooses new datanodes for new replicas, balances disk usage and manages the communication traffic to the datanodes [13].
a typical hadoop workflow has four parts: (1) transferring input data from the client host to hdfs, (2) processing the data using the mapreduce framework on the slave nodes, (3) storing the results on hdfs, and (4) reading the results from hdfs by the client host.
mapreduce is a programming model for distributed data processing, where the map function is applied to every data element in parallel, followed by the reduce function, which summarizes the collections of intermediate results produced by the map functions. the mapreduce paradigm assumes that there are no data dependencies between instances of the map function.
the map and reduce functions are implemented in hadoop as follows. before execution begins, the input data files must be added to hdfs. data processing starts with the determination of the logical units that will be processed, called input splits. most commonly one input split corresponds to one hdfs block, but this is not required.
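the default case, one input split per fixed size block, can be sketched in python as a simple computation of byte ranges. the function name block_ranges and the 200 mb example file are hypothetical; the sketch only illustrates the arithmetic and does not call any hadoop api.

```python
from typing import List, Tuple


def block_ranges(file_size: int, block_size: int) -> List[Tuple[int, int]]:
    """Return the [start, end) byte ranges of the fixed-size blocks of a file."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [(start, min(start + block_size, file_size))
            for start in range(0, file_size, block_size)]


MB = 1024 * 1024
ranges = block_ranges(200 * MB, 64 * MB)

# Three full 64 MB blocks followed by one shorter 8 MB block.
for start, end in ranges:
    print(start // MB, end // MB)
```

only the last block can be shorter than the configured block size, and nothing in this computation looks at the content of the file, which is why a record can be cut in two at a block boundary.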
when the data requires it, the partitioning of data into input splits can be done differently, through a special implementation of the inputformat class that creates them [14,15]. all input and output data are given as key-value pairs <k, v>. the default behavior is to use textinputformat, where the key is the offset in bytes from the beginning of the file, and the value is the content of one line of the file. binary files can be used as well.
one map task processes one input split (fig. 3). each input split is divided into records, which are represented as key-value pairs <k_i, v_j>. each pair is processed by the map task with one call of the map function. the map function takes one key-value pair <k_i, v_j> and executes the given operations on it. it produces intermediate results, also in the form of key-value pairs <k_n, v_m> (fig. 3). those results are then grouped so that all pairs with the same key are sent to the same reducer. the reducer summarizes all data with the same key in order to produce the final result (fig. 3).
fig. 3: the mapreduce execution
in order to design an efficient hadoop based ids, given the fixed size of hdfs data blocks, in this paper we focus on experimenting with different techniques of dividing the input data into input splits.
4 design of hadoop based ids
the architecture of the proposed ids is shown in fig. 4. the proposed architecture uses the available snort rules database and distributes the pattern search across the hadoop cluster. because hdfs by default splits the data into fixed size blocks, while network protocols produce packets of different sizes, the crucial design decision is how packets on block boundaries are handled. with this in mind, we introduced the pcap input format package into the architecture from fig. 4, which allows us to experiment with different approaches by abstracting the inputformat class mentioned in the previous section.
fig. 4: the architecture of the proposed ids
the ids package from fig. 4 is the central part, which implements the mapreduce pattern search through captured network traffic using the myers algorithm [12]. we chose the standard pcap format for capturing and storing the network traffic [3]. the pcap input format package from fig. 4 is specialized for controlling the boundaries of the input splits, while pcap input counter tests the validity of its execution. the snort rules parser creates a distributed cache out of the snort rules, which is used as an input to the pattern search algorithm. the pattern search algorithm itself is implemented in the tests package, while utils provides pcap network traffic decoding functionality.
4.1 the input file format
the de facto standard for network traffic capture and storage is the pcap file format, which is used by various well known network tools such as wireshark, tcpdump, libpcap, etc. [3]. the internal structure of a pcap file is given in fig. 5. the file begins with a global header, followed by the individual network traffic packets. the global header contains, among others, two important pieces of information (fig. 5): the network protocol of the stored packets (network), and the maximum length of the stored packets (snaplen). in most cases the network protocol is ethernet, but it can be ip or any other protocol. the maximum length of the stored packets makes it possible to store only the beginning of each packet, for the sake of efficiency, in cases when only headers are required.
in such cases the snaplen value is less than the original packet length recorded in the packet header, which shows that only the first snaplen bytes of the packet are stored. each packet header from fig. 5 contains information about the stored network packet and should not be confused with the actual network protocol header. the packet header from fig. 5 contains the time when the packet was captured (ts_sec and ts_usec) and its length (incl_len and orig_len).
fig. 5: the internal structure of pcap file
as network protocol packets have variable length (from a few bytes to several tens of kb, depending on the protocol), while hdfs blocks are of fixed size, the fields incl_len and orig_len are of great importance for the proposed system. the incl_len field gives the length of the packet in bytes as it is stored in the pcap file, while the orig_len field gives its original length in bytes as seen on the network. for each packet truncated in this way the following relation holds:
incl_len ≤ snaplen ≤ orig_len, (1)
where only the first incl_len bytes of each packet are captured in the pcap file. for a packet shorter than snaplen the second inequality does not apply; in general incl_len = min(snaplen, orig_len), so incl_len ≤ orig_len always holds.
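the layout described in this section can be sketched with the python struct module. the sketch assumes the classic little-endian pcap variant with microsecond timestamps (magic number 0xa1b2c3d4); the function names read_pcap and make_pcap and the synthetic capture are hypothetical and serve only to show how snaplen, incl_len and orig_len relate.

```python
import struct

GLOBAL_HDR = struct.Struct("<IHHiIII")  # magic, major, minor, thiszone, sigfigs, snaplen, network
RECORD_HDR = struct.Struct("<IIII")     # ts_sec, ts_usec, incl_len, orig_len
PCAP_MAGIC = 0xA1B2C3D4                 # assumed: little-endian, microsecond timestamps


def read_pcap(data: bytes):
    """Return (snaplen, network, records); each record is (offset, incl_len, orig_len)."""
    magic, _major, _minor, _zone, _sigfigs, snaplen, network = GLOBAL_HDR.unpack_from(data, 0)
    if magic != PCAP_MAGIC:
        raise ValueError("not a little-endian microsecond pcap file")
    records, offset = [], GLOBAL_HDR.size
    while offset + RECORD_HDR.size <= len(data):
        _sec, _usec, incl_len, orig_len = RECORD_HDR.unpack_from(data, offset)
        if incl_len > orig_len or incl_len > snaplen:
            raise ValueError(f"inconsistent lengths at offset {offset}")
        records.append((offset, incl_len, orig_len))
        offset += RECORD_HDR.size + incl_len  # jump to the next packet header
    return snaplen, network, records


def make_pcap(packets, snaplen=65535, network=1):
    """Build a synthetic capture; every payload is truncated to snaplen bytes."""
    out = [GLOBAL_HDR.pack(PCAP_MAGIC, 2, 4, 0, 0, snaplen, network)]
    for payload in packets:
        stored = payload[:snaplen]
        out.append(RECORD_HDR.pack(0, 0, len(stored), len(payload)) + stored)
    return b"".join(out)


capture = make_pcap([b"a" * 60, b"b" * 200], snaplen=96)
print(read_pcap(capture))  # (96, 1, [(24, 60, 60), (100, 96, 200)])
```

in the example the 60 byte packet is stored whole, while the 200 byte packet is cut to 96 bytes, so only the second record satisfies the full chain of eq. (1).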
here we demonstrate and compare several techniques for input split design, having in mind the variable size of network traffic packets.
4.2 input split techniques
we performed experiments with four different input split designs. in the first design technique we use a textual file as the input, while in the other three techniques we use the binary file format with different techniques for aligning network packets with hdfs data blocks.
technique 1: tshark packet pre-decoding
the simplest solution regarding the implementation of the map and reduce functions is to use an input file in textual format, and to pre-decode the captured network traffic before placing the input file on hdfs. the paper [12] deals with this particular type of implementation. as malicious attempts can be recognized from their signatures in the form of character or byte arrays, the pcap file should be decoded in order to obtain plain text data from the headers of all encapsulating network protocols (i.e. ethernet, ip, and tcp), including the data carried by the application layer. in this case it is not necessary to implement a custom hadoop inputformat; textinputformat can be used instead. we use the tshark linux command line tool for network traffic decoding. an example of tshark usage is:
tshark -r input.pcap -T fields -E separator=, -e ip.addr -e _ws.col.Protocol -e tcp.port -e udp.port -e data > output.txt
where input.pcap is a placeholder for the capture file, whose name is not given in the original command. each line in output.txt contains the information from one captured network packet, and output.txt is the input file for hdfs. the input file is divided into input splits, and each map task is fed one input split. the map function implemented to support this technique takes one line at a time and executes the pattern search algorithm. if any of the snort rules matches the malicious network packet, the mapper emits a <key, value> pair, where the key is the attack identification and the value is the constant 1.
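the map and reduce steps of this technique can be simulated in plain python without a hadoop cluster. the sketch below is a simplification under stated assumptions: the signature table, the line layout and the function names map_line and reduce_counts are hypothetical, the payload column is assumed to be hex-encoded, and an exact substring test replaces the pattern search algorithm.

```python
from collections import defaultdict

# Hypothetical signatures keyed by attack identifier (stand-ins for parsed snort rules).
# They are hex-encoded because the payload column is assumed to be hex text.
SIGNATURES = {
    "attack-a": b"/etc/passwd".hex(),
    "attack-b": b"cmd.exe".hex(),
}


def map_line(line: str):
    """Map step: emit (attack_id, 1) for every signature found in one decoded line."""
    for attack_id, pattern in SIGNATURES.items():
        if pattern in line:
            yield attack_id, 1


def reduce_counts(pairs):
    """Shuffle and reduce step: group the pairs by key and sum their values."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def decoded_line(src_ip: str, payload: bytes) -> str:
    """Build one comma-separated line in an assumed pre-decoded layout."""
    return f"{src_ip},10.0.0.2,TCP,1234,80,{payload.hex()}"


lines = [
    decoded_line("10.0.0.1", b"GET /etc/passwd"),
    decoded_line("10.0.0.3", b"hello"),
    decoded_line("10.0.0.1", b"/etc/passwd"),
]
pairs = [pair for line in lines for pair in map_line(line)]
print(reduce_counts(pairs))  # {'attack-a': 2}
```

in a real job the grouping by key is done by the framework between the map and reduce phases; here it is folded into reduce_counts for brevity.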
since they share the same key, the results from the same malicious flow go to the same reducer, which counts the malicious packets in the flow and outputs the result. the advantage of this approach is the ease of hadoop implementation, with the drawback that packet decoding has to be done before the hadoop program and the pattern search begin.
technique 2: custom inputformat for pcap input
in order to overcome the disadvantages of the previous technique, the pcap file has to be used in its original binary format, without pre-decoding. this can be achieved by implementing a custom hadoop inputformat that processes pcap files. two questions should be taken into account: (1) how to divide the input file into input splits, and (2) how to read records from an input split and feed them into the map function.
during input file processing and the creation of each input split, in this technique we ensure that the boundary between two adjacent input splits lies exactly on the boundary between two packets (fig. 6). we crawl the pcap file packet by packet until the configured block size limit is reached (fig. 6). at that point a new input split is created. since the division is performed at packet boundaries, the sizes of the obtained input splits can differ from each other, but by no more than the maximum length of one packet, which is 1536 bytes for ethernet.
fig. 6: dividing the pcap file into input splits: (1) correct boundary, (2) incorrect boundary
in order to perform the pattern search, the map function requires both the packet header and the packet data.
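the packet-by-packet crawl that places split boundaries can be sketched as follows. this is a minimal python illustration, not the hadoop inputformat itself: the function name packet_aligned_splits is hypothetical, a little-endian pcap file is assumed, and the sketch closes a split just before the packet that would exceed the block size limit, which is one possible reading of the rule described above.

```python
import struct

GLOBAL_HDR_SIZE = 24
RECORD_HDR = struct.Struct("<IIII")  # ts_sec, ts_usec, incl_len, orig_len


def packet_aligned_splits(data: bytes, block_size: int):
    """Walk the file record by record and cut a split before it exceeds block_size.

    Returns [start, end) byte ranges; every boundary coincides with the
    start of a pcap packet header.
    """
    splits = []
    start = offset = GLOBAL_HDR_SIZE
    while offset + RECORD_HDR.size <= len(data):
        incl_len = RECORD_HDR.unpack_from(data, offset)[2]
        next_offset = offset + RECORD_HDR.size + incl_len
        if next_offset - start > block_size and offset > start:
            splits.append((start, offset))  # close the split before this packet
            start = offset
        offset = next_offset
    if offset > start:
        splits.append((start, offset))      # last, possibly shorter, split
    return splits


def record(payload: bytes) -> bytes:
    """One synthetic pcap record: 16-byte header followed by the stored bytes."""
    return RECORD_HDR.pack(0, 0, len(payload), len(payload)) + payload


# The global header content is irrelevant for this sketch, so it is zero-filled.
capture = bytes(GLOBAL_HDR_SIZE) + b"".join(record(bytes(n)) for n in (100, 200, 300, 400))
print(packet_aligned_splits(capture, block_size=500))  # [(24, 356), (356, 672), (672, 1088)]
```

the resulting splits have different sizes (332, 316 and 416 bytes), but none of them cuts a packet in two.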
the output from the mapper is in the form <key, value>, where the key is the offset of the beginning of the packet header, and the value is the whole packet in its original binary format. in this approach the mapper itself decodes the binary packet and locates all required data (ip addresses, ports, payload, etc.), which makes decoding a distributed operation as well. as the myers algorithm natively works with bytes, the mapper only has to decode the packet up to the application layer (to find ip addresses and port numbers), but not the application layer payload itself, which, compared to tshark, reduces the number of operations required for decoding.
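decoding a packet only up to the transport layer can be sketched in python as shown below. the function name decode_frame is hypothetical, and the sketch assumes an untagged ethernet frame carrying ipv4 with tcp or udp; vlan tags, ipv6 and other protocols are simply rejected. the application layer payload is returned as raw bytes and is not interpreted.

```python
import socket
import struct

ETHERTYPE_IPV4 = 0x0800
PROTO_TCP, PROTO_UDP = 6, 17
ETH_LEN = 14  # destination MAC, source MAC, ethertype


def decode_frame(frame: bytes):
    """Decode an Ethernet/IPv4 frame up to the transport layer.

    Returns (src_ip, dst_ip, src_port, dst_port, payload), or None when the
    frame is not IPv4 over TCP or UDP.
    """
    if len(frame) < ETH_LEN + 20:
        return None
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETHERTYPE_IPV4:
        return None
    ihl = (frame[ETH_LEN] & 0x0F) * 4        # IPv4 header length in bytes
    proto = frame[ETH_LEN + 9]
    if proto not in (PROTO_TCP, PROTO_UDP):
        return None
    src_ip = socket.inet_ntoa(frame[ETH_LEN + 12:ETH_LEN + 16])
    dst_ip = socket.inet_ntoa(frame[ETH_LEN + 16:ETH_LEN + 20])
    l4 = ETH_LEN + ihl                       # start of the TCP/UDP header
    if len(frame) < l4 + (20 if proto == PROTO_TCP else 8):
        return None
    src_port, dst_port = struct.unpack_from("!HH", frame, l4)
    # TCP carries its header length in the data offset field; UDP is fixed.
    header_len = (frame[l4 + 12] >> 4) * 4 if proto == PROTO_TCP else 8
    return src_ip, dst_ip, src_port, dst_port, frame[l4 + header_len:]


# Synthetic frame: zeroed MAC addresses, minimal IPv4 and TCP headers.
eth = bytes(12) + struct.pack("!H", ETHERTYPE_IPV4)
ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 45, 0, 0, 64, PROTO_TCP, 0,
                 socket.inet_aton("10.0.0.1"), socket.inet_aton("10.0.0.2"))
tcp = struct.pack("!HHIIBBHHH", 1234, 80, 0, 0, 5 << 4, 0x18, 0, 0, 0)
print(decode_frame(eth + ip + tcp + b"GET /"))
```

the returned payload bytes can be handed directly to a byte-oriented pattern search, which is the reason no text decoding of the application layer is needed.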
technique 3: custom inputformat with probabilistic packet boundary detection
although the previous technique is much better than tshark packet decoding thanks to its distributed packet decoding, it still crawls from packet to packet through the pcap file in order to align the boundary of the input split with the boundary of a packet. to do so, it loads each network packet into memory. this takes time before the start of the "useful" distributed processing, and slows down the whole processing. in order to avoid this bottleneck we propose the third technique, probabilistic packet boundary detection, where we assume that the pcap file contains network packets captured only on the data link layer of the osi reference model, i.e. that each "packet" in the pcap file is an ethernet frame.
let the hdfs block size be z bytes. we propose not to load all z bytes into memory in order to find the boundary between the input splits, but rather to skip the first x (x < z) bytes (fig. 7). the question now is whether a boundary between network packets lies at the chosen offset of x bytes. if so, then eq. (1) should hold for the next packet in the pcap file.
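the boundary test of this technique can be sketched in python as follows. the sketch is an illustration under stated assumptions, not the authors' implementation: the names plausible_header and find_boundary are hypothetical, a little-endian pcap file is assumed, only ipv4 frames are expected (ethertype 0x0800), and a candidate offset is accepted only when k consecutive headers chained from it pass the length checks derived from eq. (1). the buffer passed to find_boundary plays the role of the bytes loaded after skipping x bytes, so the returned value is relative to offset x.

```python
import struct
from typing import Optional

RECORD_HDR = struct.Struct("<IIII")  # ts_sec, ts_usec, incl_len, orig_len
MAX_FRAME = 1536                     # frame size limit quoted in the text
ETHERTYPE_IPV4 = 0x0800              # assumption: the capture holds IPv4 frames


def plausible_header(buf: bytes, pos: int) -> Optional[int]:
    """Return the offset of the following record if a plausible header starts at pos."""
    if pos + RECORD_HDR.size + 14 > len(buf):
        return None
    _sec, _usec, incl_len, orig_len = RECORD_HDR.unpack_from(buf, pos)
    if not (0 < incl_len <= orig_len and incl_len <= MAX_FRAME):
        return None
    (ethertype,) = struct.unpack_from("!H", buf, pos + RECORD_HDR.size + 12)
    if ethertype != ETHERTYPE_IPV4:
        return None
    return pos + RECORD_HDR.size + incl_len


def find_boundary(buf: bytes, k: int) -> Optional[int]:
    """Return the first offset in buf from which k chained headers look plausible."""
    for candidate in range(len(buf)):
        pos: Optional[int] = candidate
        for _ in range(k):
            pos = plausible_header(buf, pos)
            if pos is None:
                break          # this candidate fails, try the next byte offset
        else:
            return candidate   # all k headers passed the checks
    return None


# Synthetic window: 37 bytes from the tail of an unknown packet, then four records.
frame = bytes(12) + struct.pack("!H", ETHERTYPE_IPV4) + bytes(46)
rec = RECORD_HDR.pack(0, 0, len(frame), len(frame)) + frame
window = b"\xff" * 37 + rec * 4
print(find_boundary(window, k=3))  # 37
```

a larger k lowers the chance that random payload bytes are mistaken for a chain of headers, at the cost of needing a longer window.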
in practice, this means that the value in the orig len field should be greater than zero, that the value of the inc len field should be within the valid limits of the ethernet frame size, and that the value of the inc len field should be less than or equal to the value of the orig len field. the other fields of the packet itself must be valid, too. a suitable place for an additional check is the position where the ethertype field of the ethernet frame header should be. this value should be compared with the value that indicates the ethernet protocol. this is a highly probabilistic and fuzzy way of boundary detection, and checking only a few fields can mislead us with a false positive answer. thus, we check the same conditions for the following k packets (up to the packet denoted as pn+k in fig. 7). if all conditions hold for the next k packets, we declare the boundary between the packets found and create the input split. in order to perform these additional checks on the next k packets, after skipping the offset of x bytes we load the following y bytes, as shown in fig. 7. note that y ≪ x. this can be used to form a probabilistic algorithm as follows: (1) if the verified conditions hold for the next k packets, we assume that the correct boundary between the input splits is at the x-th offset; (2) if the conditions do not hold for at least one of the k packets, the offset x is not the correct boundary, and the offset of the (x + 1)-st byte should be examined (fig. 7).

fig. 7: probabilistic method for packet boundary detection

in the worst case scenario, the number of offsets that should be examined is equal to the maximum length of an ethernet frame, which is not a problem having in mind that the y bytes from the pcap file are already loaded and available. even in this case, the work is much less than loading and processing all z bytes. with a careful selection of the parameters x, y and k, a high degree of certainty can be reached with the proposed technique. in our implementation we selected the following parameters: z = 128mb, x = 123mb and y = 5mb. the portion of y = 5mb contains more than 3,000 ethernet frames, more than enough for us not to obtain any false positive, while loading only 4% of the pcap file.

technique 4 custom inputformat with aligned blocks and input splits

the previous technique has an important drawback related to the way hdfs operates when the sizes of blocks and input splits are unequal. the hdfs block size is always constant. if the input split size is less than the block size, as it is in the previous technique, the boundary of the input split will not be aligned with the block boundary, forcing hdfs to fill the remaining space with the next input split. that input split will be divided, with a small portion in one block and a larger portion in the next block. when one input split resides in two blocks, hadoop is forced to copy both blocks to the node where the mapper that processes this input split is executed. this can cause large and unnecessary network traffic while copying the blocks. to overcome this issue and prevent unnecessary block copying, we propose the fourth technique, where we have a custom inputformat and exactly the same sizes of blocks and input splits (fig. 8).
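the probabilistic boundary check of fig. 7, which the fourth technique also relies on, can be sketched in python as follows. this is a minimal sketch under stated assumptions, not the authors' implementation: it assumes a little-endian pcap file with the classic 16-byte per-packet header (ts sec, ts usec, inc len, orig len), and the frame size limits, the accepted ethertype values and the names looks_like_record and find_boundary are illustrative choices.

```python
import struct

REC_HDR = 16                      # pcap per-packet header: ts_sec, ts_usec, incl_len, orig_len
MIN_FRAME, MAX_FRAME = 14, 1518   # assumed valid ethernet frame size limits, in bytes
KNOWN_ETHERTYPES = {0x0800, 0x0806, 0x86DD}  # assumed accepted values: ipv4, arp, ipv6


def looks_like_record(buf, pos):
    """Return incl_len if a plausible pcap record starts at pos, else None."""
    if pos + REC_HDR + 14 > len(buf):
        return None  # not enough bytes left for a record header plus ethernet header
    _, _, incl_len, orig_len = struct.unpack_from("<IIII", buf, pos)
    if orig_len == 0 or not (MIN_FRAME <= incl_len <= MAX_FRAME) or incl_len > orig_len:
        return None
    # the ethertype sits after the two 6-byte mac addresses, in network byte order
    (ethertype,) = struct.unpack_from(">H", buf, pos + REC_HDR + 12)
    if ethertype not in KNOWN_ETHERTYPES:
        return None
    return incl_len


def find_boundary(buf, k):
    """Scan buf (the y bytes loaded after offset x) byte by byte and return the
    first offset where the candidate packet and the next k packets all pass
    the checks, or None if no such offset exists."""
    for start in range(len(buf)):
        pos = start
        for _ in range(k + 1):
            incl_len = looks_like_record(buf, pos)
            if incl_len is None:
                break  # a check failed: try the next byte offset
            pos += REC_HDR + incl_len
        else:
            return start
    return None
```

a larger k lowers the chance of a false positive at the cost of reading further into the loaded y bytes, which mirrors the trade-off between x, y and k discussed above.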
now, the boundaries of input splits are not aligned with the boundaries of network packets in most cases, and they split some packets into two parts (fig. 8). therefore, we ignore the split packets as invalid. they are intentionally not processed further through the mapreduce framework, which leaves a small chance of a false negative response of our network intrusion detection system in exchange for the speed gained by avoiding unnecessary block copying. in the worst case, the number of packets that will not be processed is equal to the number of input splits, i.e. one invalid packet per input split (roughly one ethernet frame per 100,000 frames will be ignored). the remaining problem is finding the valid beginning of the first packet within the input split, which is the reason for having a custom inputformat in this technique, too. to find the first valid packet in an input split, we use the same probabilistic algorithm as in the previous technique, and we search through the first y bytes of the input split for the valid beginning of an ethernet frame (fig. 8).

fig. 8: aligned blocks and input splits

5 implementation results

the proposed techniques are suitable for implementation in both hadoop 1.x and 2.x without any restrictions. in order to evaluate the proposed techniques, the ids is implemented in apache hadoop 1.x and 2.x, and tested on a cluster with 18 commodity nodes, where 1 node is the master and the remaining 17 nodes are slaves. the nodes are equipped with an intel(r) core(tm)2 duo e4600 cpu @ 2.40ghz and 1gb of ram. in order to compare the performance of the proposed hadoop ids with the reference snort ids, technique 1 is evaluated in a single-processor environment, because the snort ids doesn't support distributed execution. for that purpose we used an environment with an i3 6006u cpu and 8gb of ram.
the e4600 processor has 2 cores and operates at 2.4ghz, with whetstone benchmark results of 2.25 gflops per core, i.e. 4.50 gflops in total. the i3 6006u processor has 4 cores at 2.00ghz, with whetstone benchmark results of 2.08 gflops per core, i.e. 8.31 gflops in total [22]. we used input pcap files of variable sizes with 1, 2, 3, 4, and 5gb of network traffic data, having a "low", "medium", and "high" number of malicious packets (less than 1%, 30-40%, and more than 70%, respectively). we also varied the number of snort rules and the number of slave nodes in the cluster. the proposed hadoop ids with technique 1 for input split preparation (tshark packet pre-decoding) is evaluated in hadoop 2.9.0, in a pseudo-distributed environment on the previously mentioned single-processor system, along with the snort ids. the results are given in table 1.

table 1: hadoop ids with tshark packet pre-decoding vs. snort ids

             input file size [gb]   preprocessing time [s]   pattern search time [s]   total processing time [s]
snort ids    1                      0                        120                       120
hadoop ids   1.6                    211                      94                        305

the input pcap file is 1gb in size and contains about 2 million network packets with a "medium" number of malicious packets. as can be seen from table 1, packet pre-decoding increased the file size from 1gb to 1.6gb. nevertheless, the pattern search time of the proposed ids is 21% shorter than that of snort (94s vs. 120s). this is because the hadoop ids takes advantage of the multi-core cpu, while snort doesn't. however, tshark packet pre-decoding took more than 3 minutes (211 seconds), which makes the total processing time of the hadoop ids about 2.5 times longer than that of the snort ids.
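the ratios quoted from table 1 can be re-derived with a few lines of python. this is only an illustrative check: the variable names are assumptions, and the break-even value is plain arithmetic on the table entries rather than a result reported by the authors.

```python
# values copied from table 1, in seconds
snort = {"preprocess": 0, "search": 120}
hadoop = {"preprocess": 211, "search": 94}


def total(times):
    """total processing time = preprocessing time + pattern search time"""
    return times["preprocess"] + times["search"]


# relative reduction of the pattern search time (hadoop ids vs. snort ids)
search_gain = (snort["search"] - hadoop["search"]) / snort["search"]
# ratio of the total processing times, including tshark pre-decoding
slowdown = total(hadoop) / total(snort)
# pre-decoding time above which the hadoop ids loses its search advantage
break_even = snort["search"] - hadoop["search"]

print(f"search time reduction: {search_gain:.1%}")
print(f"total time ratio: {slowdown:.2f}")
print(f"break-even preprocessing time: {break_even} s")
```

under these numbers the pre-decoding step would have to finish in well under half a minute for technique 1 to match snort on a single machine, which is the motivation for the techniques that avoid pre-decoding.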
for job execution, hadoop 2.x requests three different kinds of containers from yarn: the application master container, map containers, and reduce containers. the application master itself requires 1.5gb of ram by default, which makes hadoop 2.x better suited for large clusters with a lot of resources. the proposed techniques 2, 3 and 4 are therefore evaluated in the hadoop 1.2.1 environment, due to its lower resource requirements.
as independent variables in the experiment we chose the input pcap file size, the level of malicious packets, the number of snort rules in the database, and the number of slave nodes. as dependent variables we obtained the total execution time, the total number of map tasks, as well as the number of data local and rack local¹ map tasks.

¹ a data local map task is a map task whose data block is already available on the node where it executes before the execution starts, while a rack local map task needs to fetch the data block from another slave node.

fig. 9 shows the evaluation results of techniques 2, 3 and 4 for a 1gb input file with a low level of malicious packets, on a cluster with 17 slave nodes, and a snort database with 1000 rules. the results are as expected: the hadoop ids with input split technique 4 has the best execution time (fig. 9a). in this case it performs 32% faster than the proposed technique 2 (170 vs. 250 in fig. 9a). the numbers of data local and rack local blocks confirm the design hypothesis about additional block copying (fig. 9b). techniques 2 and 4 have the same number of map tasks because both force the size of the input split to be exactly equal (tcq4) or very close (tcq2) to the size of a block, while tcq3 introduces the greatest deviation between the size of an input split and a block.

fig. 9: evaluation results of different input split design techniques: a) total processing time for techniques 2, 3 and 4, b) map tasks and blocks distribution across the cluster. plotted values: a) total processing time [s]: tcq2 250, tcq3 187, tcq4 170; b) number of map tasks / data local / rack local tasks: tcq2 17 / 13 / 4, tcq3 20 / 15 / 5, tcq4 17 / 13 / 4.

we also evaluated how technique 4 performs with a variable malicious level of the input pcap file and variable file size, with a variable number of snort rules in the database, and in clusters with a different number of slave nodes.
the evaluation results are given in fig. 10. for easier comparison, we used the same parameters for the starting points of the graphs in figs. 10 a) and b) as in fig. 9 a): tcq4, a snort database with 1000 rules, an input file size of 1gb, and a low pcap malicious level. figs. 10 b), c), and d) have one common point, as well.

fig. 10: evaluation of tcq4 with variable parameters: a) variable number of malicious packets in pcap, b) variable input pcap file size, c) variable number of snort rules, d) variable size of the cluster. plotted values of the total execution time [s]: a) malicious level low / medium / high (f=1gb, sr=1000): 170 / 182 / 202; b) input pcap file size 1 / 2 / 3 / 4 / 5 gb (m=low, sr=1000): 170 / 182 / 230 / 280 / 335; c) 200 / 400 / 600 / 800 / 1000 snort rules (f=5gb, m=low): 127 / 158 / 190 / 255 / 335; d) 5 / 9 / 13 / 17 slave nodes (f=5gb, m=low): 807 / 461 / 343 / 335.

from fig. 10 a) it can be seen that the total execution time strongly depends on the number of malicious attempts in the network traffic flow. in this case the total execution time differs by 18% (202 vs. 170 in fig. 10a). this is not a consequence of the choice of pattern search algorithm, but rather a consequence of the structure of the snort rules database. the myers algorithm used has a stable execution time, which is affected by the contents of neither the text nor the pattern [12]. the snort database is hierarchically organized, with rules categorized in levels from general to specific. for example, if the protocol is not http, the sql injection rules are not going to be examined.
thus, for a pcap file with a low malicious level, a lot of packets are simply skipped after the port and protocol checks, and the myers search algorithm is not started for them.
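the effect of the general-to-specific rule hierarchy on the number of pattern searches can be illustrated with a small python sketch. everything in it is hypothetical: the GeneralRule class, the packet dictionary layout and the inspect_packet function are illustrative names and not the snort rule format, and a plain substring test stands in for the myers search.

```python
from dataclasses import dataclass, field


@dataclass
class GeneralRule:
    """general rule: a cheap protocol/port test plus the specific patterns bound to it"""
    protocol: str
    port: int
    patterns: list = field(default_factory=list)


def inspect_packet(packet, rules, stats):
    """return the patterns found in the packet and count every pattern search started"""
    alerts = []
    for rule in rules:
        # general level: only header fields are compared, the payload is not scanned
        if packet["protocol"] != rule.protocol or packet["port"] != rule.port:
            continue
        # specific level: one pattern search per specific rule bound to the general rule
        for pattern in rule.patterns:
            stats["searches"] += 1
            if pattern in packet["payload"]:  # stand-in for the myers search
                alerts.append(pattern)
    return alerts


rules = [GeneralRule("http", 80, ["union select", "' or 1=1"])]
stats = {"searches": 0}

inspect_packet({"protocol": "dns", "port": 53, "payload": "example.org"}, rules, stats)
print(stats["searches"])  # prints 0: the packet is skipped after the header check

inspect_packet({"protocol": "http", "port": 80, "payload": "id=1 union select *"}, rules, stats)
print(stats["searches"])  # prints 2: both specific patterns are searched
```

in this toy model the cost per packet grows with the number of specific rules bound to each matched general rule, which is the behaviour the text attributes to the rule database rather than to the search algorithm.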
if the monitored traffic contains a lot of packets that fall into the "suspicious" category, one or more additional pattern searches are performed, depending on the number of specific rules bound to the matched general rule. this is directly reflected in the results from fig. 10a. note that the ids process in general can be strongly affected by the choice of the pattern search algorithm, as well as by the specifics of the topology and network organization. if an anomaly-based approach is chosen instead of pattern search, the search performance can also be significantly affected.

from fig. 10 b) it can be seen that the total execution time depends linearly on the size of the input files. the very slow growth at the beginning of fig. 10b, between the execution times for 1gb and 2gb inputs, is explained by the fact that the cluster consists of 17 nodes, where each node can execute 2 map tasks on two separate cores in parallel, giving a maximum of 34 simultaneously executed map tasks on the cluster. a 1gb file is stored on hdfs as 16 blocks of 64mb, while a 2gb file is stored as 32 blocks. this means that all necessary tasks can run at the same time for both 1gb and 2gb files. the rise in the execution time between these two is due to the increased number of rack local tasks for the larger file. for larger files, more than 34 map tasks are required for processing, which means that not all of them can be started immediately, and some need to wait for previously started map tasks to finish. once this cluster limit is reached, the execution time increases linearly with the file size (fig. 10b). the total processing time depends linearly on the number of snort rules (fig. 10c), while it declines asymptotically with the number of nodes (fig. 10d). it can be noticed in fig. 10d that for 17 nodes the graph enters saturation and the processing speed remains slightly under 0.015 gb/sec (5gb processed in 335s).
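the slot arithmetic behind fig. 10b can be worked through in python. this is a simplified sketch: it assumes one map task per 64mb block and that map tasks run in full "waves" of at most 34 tasks, and the name map_waves is illustrative. real scheduling, data locality and block copying are ignored.

```python
import math

BLOCK_MB = 64        # hdfs block size used in the text
NODES = 17           # slave nodes in the cluster
SLOTS_PER_NODE = 2   # map tasks per node, one per core


def map_waves(file_gb, nodes=NODES):
    """return (number of blocks, number of map task waves) for an input file"""
    blocks = math.ceil(file_gb * 1024 / BLOCK_MB)
    slots = nodes * SLOTS_PER_NODE
    return blocks, math.ceil(blocks / slots)


for size in (1, 2, 3, 4, 5):
    blocks, waves = map_waves(size)
    print(f"{size}gb: {blocks} blocks, {waves} wave(s)")
```

under these assumptions the 1gb and 2gb inputs both fit into a single wave (16 and 32 blocks against 34 slots), while 3gb and larger inputs need additional waves, which is consistent with the flat start of the curve followed by steady growth.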
as the number of nodes in the cluster grows, each of them stores a smaller number of blocks on average. this leads to an increase in rack local tasks. copying of remote blocks during the execution is a limiting factor that leads to the saturation in this case. 6 conclusion in this paper the design and implementation of ids using apache hadoop is proposed. four different input data split techniques are proposed and analysed. the techniques are described in detail. the ids is implemented using myers pattern search algorithm as a core for signature-based packet analysis. we showed the suitability of hadoop environment for the implementation of network ids and discussed inherited problem from hadoop that relates to splitting sensitive data across cluster nodes. the system is evaluated on apache hadoop cluster with 17 slave nodes. the implementation and evaluation results are given and discussed in detail. we showed that processing speed can differ for more than 30% depending on chosen input split design strategy. additionally, we showed that malicious level of network traffic can 254 v. ćirić, d. cvetković, n. gavrilović, n. stojanović, i. milentijević input splits design techniques for ids on hadoop cluster 255254 v. ćirić, d. cvetković, n. gavrilović, n. stojanović, i. milentijević input splits design techniques for ids on hadoop cluster 255 18 v.ciric et al. slow down the processing time, in our case, for nearly 20%. the scalability of the system was also discussed. the proposed techniques deal with specific type of input data, i.e. network traffic packets, but they can be easily generalized to deal with any type of sensitive data which need a special attention before it can be split into pieces and scatter onto different nodes in distributed environment. acknowledgments this work was supported by the serbian ministry of education, science and technological development [grant number tr32012]. references [1] l. a. maglaras, k.-h. kim, h. janicke, m. a. ferrag, s. 
rallis, p. fragkou, a. maglaras, and t. j. cruz, “cyber security of critical infrastructures,” ict express, vol. 4, no. 1, pp. 42–45, 2018. [2] m. a. ferrag, l. maglaras, s. moschoyiannis, and h. janicke, “deep learning for cyber security intrusion detection: approaches, datasets, and comparative study,” journal of information security and applications, vol. 50, pp. 1–19, 2020. [3] j. svoboda, i. ghafir, v. prenosil et al., “network monitoring approaches: an overview,” int j adv comput netw secur, vol. 5, no. 2, pp. 88–93, 2015. [4] i. ghafir, v. prenosil, j. svoboda, and m. hammoudeh, “a survey on network security monitoring systems,” in 2016 ieee 4th international conference on future internet of things and cloud workshops (ficloudw). ieee, 2016, pp. 77–82. [5] b. schneier, “managed security monitoring: network security for the 21st century,” computers & security, vol. 20, no. 6, pp. 491–503, 2001. [6] g. kumar, k. kumar, and m. sachdeva, “the use of artificial intelligence based techniques for intrusion detection: a review,” artificial intelligence review, vol. 34, no. 4, pp. 369–387, 2010. [7] m. aldwairi and d. alansari, “exscind: fast pattern matching for intrusion detection using exclusion and inclusion filters,” in 2011 7th international conference on next generation web services practices. ieee, 2011, pp. 24–30. [8] d. xu, h. zhang, and y. fan, “the gpu-based high-performance patternmatching algorithm for intrusion detection,” journal of computational information systems, pp. 3791–3800, 2013. 256 v. ćirić, d. cvetković, n. gavrilović, n. stojanović, i. milentijević input splits design techniques for ids on hadoop cluster 257256 v. ćirić, d. cvetković, n. gavrilović, n. stojanović, i. milentijević input splits design techniques for ids on hadoop cluster 257 18 v.ciric et al. slow down the processing time, in our case, for nearly 20%. the scalability of the system was also discussed. the proposed techniques deal with specific type of input data, i.e. 
network traffic packets, but they can be easily generalized to deal with any type of sensitive data which need a special attention before it can be split into pieces and scatter onto different nodes in distributed environment. acknowledgments this work was supported by the serbian ministry of education, science and technological development [grant number tr32012]. references [1] l. a. maglaras, k.-h. kim, h. janicke, m. a. ferrag, s. rallis, p. fragkou, a. maglaras, and t. j. cruz, “cyber security of critical infrastructures,” ict express, vol. 4, no. 1, pp. 42–45, 2018. [2] m. a. ferrag, l. maglaras, s. moschoyiannis, and h. janicke, “deep learning for cyber security intrusion detection: approaches, datasets, and comparative study,” journal of information security and applications, vol. 50, pp. 1–19, 2020. [3] j. svoboda, i. ghafir, v. prenosil et al., “network monitoring approaches: an overview,” int j adv comput netw secur, vol. 5, no. 2, pp. 88–93, 2015. [4] i. ghafir, v. prenosil, j. svoboda, and m. hammoudeh, “a survey on network security monitoring systems,” in 2016 ieee 4th international conference on future internet of things and cloud workshops (ficloudw). ieee, 2016, pp. 77–82. [5] b. schneier, “managed security monitoring: network security for the 21st century,” computers & security, vol. 20, no. 6, pp. 491–503, 2001. [6] g. kumar, k. kumar, and m. sachdeva, “the use of artificial intelligence based techniques for intrusion detection: a review,” artificial intelligence review, vol. 34, no. 4, pp. 369–387, 2010. [7] m. aldwairi and d. alansari, “exscind: fast pattern matching for intrusion detection using exclusion and inclusion filters,” in 2011 7th international conference on next generation web services practices. ieee, 2011, pp. 24–30. [8] d. xu, h. zhang, and y. fan, “the gpu-based high-performance patternmatching algorithm for intrusion detection,” journal of computational information systems, pp. 3791–3800, 2013. 256 v. ćirić, d. cvetković, n. 
gavrilović, n. stojanović, i. milentijević input splits design techniques for ids on hadoop cluster 257 18 v.ciric et al. slow down the processing time, in our case, for nearly 20%. the scalability of the system was also discussed. the proposed techniques deal with specific type of input data, i.e. network traffic packets, but they can be easily generalized to deal with any type of sensitive data which need a special attention before it can be split into pieces and scatter onto different nodes in distributed environment. acknowledgments this work was supported by the serbian ministry of education, science and technological development [grant number tr32012]. references [1] l. a. maglaras, k.-h. kim, h. janicke, m. a. ferrag, s. rallis, p. fragkou, a. maglaras, and t. j. cruz, “cyber security of critical infrastructures,” ict express, vol. 4, no. 1, pp. 42–45, 2018. [2] m. a. ferrag, l. maglaras, s. moschoyiannis, and h. janicke, “deep learning for cyber security intrusion detection: approaches, datasets, and comparative study,” journal of information security and applications, vol. 50, pp. 1–19, 2020. [3] j. svoboda, i. ghafir, v. prenosil et al., “network monitoring approaches: an overview,” int j adv comput netw secur, vol. 5, no. 2, pp. 88–93, 2015. [4] i. ghafir, v. prenosil, j. svoboda, and m. hammoudeh, “a survey on network security monitoring systems,” in 2016 ieee 4th international conference on future internet of things and cloud workshops (ficloudw). ieee, 2016, pp. 77–82. [5] b. schneier, “managed security monitoring: network security for the 21st century,” computers & security, vol. 20, no. 6, pp. 491–503, 2001. [6] g. kumar, k. kumar, and m. sachdeva, “the use of artificial intelligence based techniques for intrusion detection: a review,” artificial intelligence review, vol. 34, no. 4, pp. 369–387, 2010. [7] m. aldwairi and d. 
alansari, “exscind: fast pattern matching for intrusion detection using exclusion and inclusion filters,” in 2011 7th international conference on next generation web services practices. ieee, 2011, pp. 24–30. [8] d. xu, h. zhang, and y. fan, “the gpu-based high-performance patternmatching algorithm for intrusion detection,” journal of computational information systems, pp. 3791–3800, 2013. input splits design techniques for ids on hadoop cluster 19 [9] m. kharbutli, m. aldwairi, and a. mughrabi, “function and data parallelization of wu-manber pattern matching for intrusion detection systems.” netw. protoc. algorithms, vol. 4, no. 3, pp. 46–61, 2012. [10] x. su, z. ji, and x. lian, “a parallel ac algorithm based on spmd for intrusion detection system,” in proceedings of the 2nd international conference on computer science and electronics engineering. atlantis press, 2013. [11] m. aldwairi, a. m. abu-dalo, and m. jarrah, “pattern matching of signaturebased ids using myers algorithm under mapreduce framework,” eurasip journal on information security, vol. 2017, no. 1, pp. 1–11, 2017. [12] v. ciric, d. cvetkovic, and i. milentijevic, “design and implementation of network intrusion detection system on the apache hadoop platform,” in proceedings on 5th international conference on electrical, electronic, and computer engineering (icetran 2018), palic, serbia, 2018, pp. 1102–1105. [13] c. lam, hadoop in action. manning publications co., 2010. [14] m. y. eltabakh, y. tian, f. özcan, r. gemulla, a. krettek, and j. mcpherson, “cohadoop: flexible data placement and its exploitation in hadoop,” proceedings of the vldb endowment, vol. 4, no. 9, pp. 575–585, 2011. [15] a. sayar et al., “hadoop optimization for massive image processing: case study face detection,” international journal of computers communications & control, vol. 9, no. 6, pp. 664–671, 2014. [16] j. cheon and t.-y. 
choe, “distributed processing of snort alert log using hadoop,” international journal of engineering and technology, vol. 5, no. 3, pp. 2685–2690, 2013.
[17] p. prathibha and e. dileesh, “design of a hybrid intrusion detection system using snort and hadoop,” international journal of computer applications, vol. 73, no. 10, 2013.
[18] k. kato and v. klyuev, “development of a network intrusion detection system using apache hadoop and spark,” in 2017 ieee conference on dependable and secure computing. ieee, 2017, pp. 416–423.
[19] c. f. endorf, e. schultz, and j. mellander, intrusion detection & prevention. mcgraw-hill osborne media, 2004.
[20] h.-d. j. jeong, w. hyun, j. lim, and i. you, “anomaly teletraffic intrusion detection systems on hadoop-based platforms: a survey of some problems and solutions,” in 2012 15th international conference on network-based information systems. ieee, 2012, pp. 766–770.
[21] a. khraisat, i. gondal, p. vamplew, and j. kamruzzaman, “survey of intrusion detection systems: techniques, datasets and challenges,” cybersecurity, vol. 2, no. 1, p. 20, 2019.
[22] u. of washington, “cpu performance,” https://boinc.bakerlab.org/rosetta/cpu_list.php, accessed: 2020-10-22.

facta universitatis series: electronics and energetics vol. 32, no 1, march 2019, pp. 65-74
https://doi.org/10.2298/fuee1901065m

comparison of memristor models for microwave circuit simulations in time and frequency domain

ivo marković 1, milka potrebić 1, dejan tošić 1, zlata cvetković 2
1 university of belgrade, school of electrical engineering, belgrade, serbia
2 university of niš, faculty of electronic engineering, niš, serbia

abstract.
as reported in the open literature, there are many memristor models for circuit-level simulations. some of them are not particularly suitable for microwave circuit simulations. at rf/microwave frequencies, the memristor dynamics become an important issue for the transition process. in this paper we present a number of different spice memristor model groups. each group is explained using representative models, which are analysed and compared from the microwave circuit analysis viewpoint. we consider the model behaviour at rf/microwave frequencies and the memristance setting issues. results are compared and the best models are recommended.

key words: memristor models, microwave circuit, transition process.

1. introduction

in 1971, leon chua theoretically predicted [2] the existence of the memristor, the fourth fundamental element. he claims that the memristor fills the gap in the relation that connects magnetic flux and electric charge: $M(t) = d\varphi(t)/dq(t)$. in 2008, strukov et al. [3] published a paper in which the authors claim that the component was produced in hp's laboratory. the conclusion was made on the basis of matching the behavioural properties that chua predicted with the measured results. to the contrary, there are papers such as [4], in which the authors claim that the memristor is not a new element. they claim that hp's laboratory did not create a new component, and that chua's prediction was not correct. this discussion should be classified as theoretical, since it is not relevant for the component's implementation. presently, there are two companies that provide commercially available memristors [5, 6], and a lot of researchers are trying to apply memristors in various fields of electrical engineering.
there are plenty of reasons for implementing memristors, such as: small dimensions, small power consumption, fast switching time (of the order of seconds or less) from on to off state and vice versa, no mechanical parts, etc. all of the mentioned characteristics are relative to the commercially available devices in use for implementations for rf/microwave systems and devices.

received february 26, 2018; received in revised form july 31, 2018
corresponding author: milka potrebić, university of belgrade, school of electrical engineering, bulevar kralja aleksandra 73, belgrade, serbia (e-mail: milka_potrebic@etf.rs)
* an earlier version of this paper was presented at the 13th international conference on applied electromagnetics (пес 2017), august 30 - september 01, 2017, in niš, serbia [1].

in order to implement memristor-based circuits, appropriate models for circuit-level simulations are required. in this paper, we analyse models of memristors that operate at high frequencies. first, we present models for the transition process and the results achieved using these models. next, models for the frequency analysis are discussed and compared. finally, we briefly summarize the results and suggest which models are most suitable for simulations at rf/microwave frequencies. in the new volume of ieee circuits and systems magazine, there are two papers whose understanding could improve research in both the modelling and the implementation areas. biolek et al. [7] point out important fingerprints of memristors, which could serve as good guidelines. ascoli et al. [8] discuss the dynamics of real-world memristors. some of the models were recently used to realize memristor-based circuits. different filter realizations [9] and reconfigurable microwave filters [10,11] were analysed using these models. transition of memristor states in reconfigurable microwave filter was analysed in [12].
the use of memristors in power dividers, coupled resonator bandpass filters, and a low-reflection quasi-gaussian lowpass filter with lossy elements was discussed in [13]. realisation of a phase shifter was presented in [14]. modelling and simulation of large memristive networks was reported in [15]. potential applications of memristors in rf/microwave circuits were presented in [16]. there is also a chapter in the book [17] regarding memristors, in which the authors summarized possible applications of memristors in passive microwave circuits.

2. models for transition process

in this section a couple of memristor model types are presented. we analyse the specific models which seem to be the most effective, to the best of our knowledge.

2.1. biolek's model

based on hp's device that was fabricated back in 2008, biolek et al. came up with an initial memristor model in 2009 [18], and with improved models in 2013 [19]. the physical model of the memristor from [3] is shown in fig. 1. it consists of a two-layer thin film, sandwiched between two platinum electrodes. the first layer is doped with oxygen vacancies, so it behaves as a semiconductor. the other region, which is undoped, has an insulating property.

fig. 1 graphical representation of memristor model

the total resistance of the device is a sum of the resistances of the doped and undoped regions. by applying adequate voltage at the platinum electrodes, it is possible to change the width of the doped and undoped regions. this fact implies that it is possible to change the total resistance of the memristor.
after some calculations, we can present a simplified equation for the memristor resistance:

$$M(q) = \frac{d\varphi}{dq} = R_{\mathrm{OFF}}\left(1 - \frac{\mu_v R_{\mathrm{ON}}}{D^2}\, q\right), \qquad (1)$$

where $M$ represents memristance (memristor's resistance), $\varphi$ is the flux, $q$ is charge, $\mu_v$ is dopant mobility, $D$ is the distance between platinum electrodes, $R_{\mathrm{ON}}$ and $R_{\mathrm{OFF}}$ are resistances of the doped and undoped regions. the total memristance depends on initial values of $R_{\mathrm{ON}}$ and $R_{\mathrm{OFF}}$, as well as the width of the doped region $x$ referenced to the total width of the memristor $D$. the speed of the movement of the boundary between the doped and undoped regions may be calculated as:

$$\frac{dx(t)}{dt} = \frac{\mu_v R_{\mathrm{ON}}}{D^2}\, i_M(t)\, f(x) = k\, i_M(t)\, f(x), \qquad (2)$$

where $k$ is a constant, $k = \mu_v R_{\mathrm{ON}} / D^2$, $i_M$ is the current through the memristor, and $f(x)$ is a so-called window function. the purpose of the window function is to describe nonlinear dopant drift. it is a phenomenon that manifests when small voltages yield enormous electric fields, which can produce significant nonlinearities in ionic transport. these nonlinearities manifest themselves particularly at the thin film edges, where the speed of the boundary between the doped and undoped regions gradually decreases to zero. this model is graphically presented in fig. 2. the memristor memory effect is modelled using a feedback-controlled integrator. it stores the effects of the passing current, and controls the memristance.

fig. 2 graphical representation of the memristor model

on the basis of this reasoning, it becomes apparent that the problem of modelling the memristor is reduced to the modelling of the window function. the difficulty lies not only in providing one formula for all the micro effects, but also in the limitations that exist in the software for circuit analysis. in light of this, it becomes clear why so many papers on this topic were published in the last decade.
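the boundary-drift equation (2) can be integrated numerically to see how a window function shapes the state trajectory. the following is a minimal python sketch, not the spice implementation used in the paper. it assumes a joglekar-type window $f(x) = 1 - (2x-1)^{2p}$ (one possible choice; the paper does not prescribe it here), a simple forward-euler scheme, and illustrative parameter values (`MU_V`, `D`, `R_ON`, `R_OFF`, the drive current) that are hypothetical and not taken from the paper. the memristance is computed as the sum of the doped and undoped contributions, with $x$ normalized to [0, 1].

```python
import numpy as np

def joglekar_window(x, p=2):
    """Joglekar-type window f(x) = 1 - (2x - 1)^(2p); it vanishes at both film edges."""
    return 1.0 - (2.0 * x - 1.0) ** (2 * p)

def simulate_drift(i_m, dt, x0, k, window=joglekar_window):
    """Forward-Euler integration of dx/dt = k * i_M(t) * f(x), with x kept in [0, 1]."""
    x = np.empty(len(i_m) + 1)
    x[0] = x0
    for n, current in enumerate(i_m):
        x[n + 1] = np.clip(x[n] + dt * k * current * window(x[n]), 0.0, 1.0)
    return x

def memristance(x, r_on, r_off):
    """Total resistance as the sum of the doped and undoped region contributions."""
    return r_on * x + r_off * (1.0 - x)

# Illustrative values only (assumptions, not taken from the paper).
MU_V = 1e-14               # dopant mobility, m^2 / (V s)
D = 10e-9                  # distance between the electrodes, m
R_ON, R_OFF = 100.0, 16e3  # limiting resistances, ohms
K = MU_V * R_ON / D**2     # constant k from eq. (2)

DT = 1e-4
t = np.arange(0.0, 1.0, DT)
i_m = 1e-4 * np.sin(2.0 * np.pi * 1.0 * t)  # 100 uA, 1 Hz sinusoidal drive
x = simulate_drift(i_m, dt=DT, x0=0.5, k=K)
m = memristance(x, R_ON, R_OFF)
```

with these assumed values the state first moves towards the on side and then returns close to its starting point after one full period, because the net charge of a sinusoidal drive over a period is zero. swapping `joglekar_window` for another window only requires passing a different function.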
some of the most important papers from this group, alongside biolek's, are joglekar [20] and yakopcic [21]. there are also new papers by biolek et al. [22, 23], in which they solve some problems caused by spice programs, in terms of representation of very large numbers. here, we discuss the resistive model reported in [19]. the model parameters are the minimal resistance in the on state $R_{\mathrm{on}}$ and the maximal resistance in the off state $R_{\mathrm{off}}$; the initial resistance $R_{\mathrm{ini}}$; the mobility of charge $\mu_v$; and the distance between the platinum electrodes $D$. memristance could be calculated as:

$$R = R_{\mathrm{on}}\, x + R_{\mathrm{off}}\,(1 - x), \qquad (3)$$

where $x$ is the width of the doped region, and $x \in [0,1]$. so $R_{\mathrm{ini}} = R$ for some initial state of $x$, $R = R_{\mathrm{off}}$ when $x = 0$, and $R = R_{\mathrm{on}}$ when $x = 1$. those are the only values that need to be set up for the model to work properly. $R_{\mathrm{on}}$, $R_{\mathrm{off}}$, $\mu_v$ and $D$ are fabrication dependent values. $R_{\mathrm{on}}$ and $R_{\mathrm{off}}$ differ from $R_{\mathrm{ON}}$ and $R_{\mathrm{OFF}}$ in eq. (1), and should not be confused with them: $R_{\mathrm{ON}} = R_{\mathrm{on}}\, x$, $R_{\mathrm{OFF}} = R_{\mathrm{off}}\,(1-x)$. this model provides theoretically predicted results in case of i-v curve analysis, as shown in fig. 3. results are obtained using ltspice [24].

fig. 3 i-v curve for biolek's model, at f = 10 Hz.

additionally, it is an excellent model of an ideal memristor. the model is resistive, so it might be used for the proof-of-concept transition process of idealized microwave circuits. for the analysis of switching from the off to the on state and from the on to the off state, a simple circuit containing a series connection of a generator and a memristor is used. the generator produces a pwm signal, with voltage amplitude of ±2 V, at the frequency of 100 kHz, and the high voltage level lasts 70% of the signal period. the simulation result suggests that the transition time is of the order of seconds, for both on-to-off and off-to-on transitions. an increase of voltage amplitude would shorten the transition time, but we considered that the memristor cannot handle very high voltages. an increase of frequency would also shorten the transition time.

2.2.
mazady's model

mazady et al. proposed models for both transition process and frequency analysis [26]. they are based on the tio2 device. we can classify these models as a new category, compared to biolek's, yakopcic's, joglekar's etc. in this case, the model for time-domain analysis is defined by 24 ideal switches (exact moments of switching on or switching off) and the currents through 6 parallel branches. the model for frequency-domain analysis consists of resistances in on and off states rhc, rlc; parasitic inductances l1 and l2; capacitances c and cb. rhc and c are time dependent variables. the i-v curve matches theoretically predicted results, as shown in fig. 4. results are obtained using ltspice [24]. all of the circuit parameters are the same as in the original paper [26].

fig. 4 i-v curve for mazady's model at 50 Hz.

switching time is very short. authors claim [16] that it takes only 210 ps for switching from off to on state, and vice versa. since the switching time is very short, it is possible to use a short rectangular impulse for programming the memristor's states. the particularity of these models lies in the fact that they depend on excitations and the actual circuit parameters. consequently, no stand-alone self-contained memristor model is available, which is based only on the port description equations.

2.3. amirsoleimani's model

amirsoleimani et al. also proposed a memristor model based on the tio2 device [27]. this kind of model could be classified as the third category. it includes plenty of fabrication details and specific calculations. amirsoleimani's approach is based on the tunnelling effect, and the model is very similar to the schottky diode model. it is based on many physical properties, which are directly used in calculations for the current of the tunnelling effect and the ohmic currents.
the memristor i-v curve obtained from this model matches the theoretically predicted results for different excitations, as shown in fig. 5. results are obtained using ni awr microwave office [28].

fig. 5 i-v curve for amirsoleimani's model, at 100 Hz.

switching time depends on parameters, but it is of the order of magnitude of a few seconds, as reported in [27]. this model does not depend on excitation and circuit parameters. the downside of this model is that it requires many fabrication parameters to be measured and set in order to properly illustrate the memristor's behaviour. also, the complicated equations require more time to be calculated, so the simulation time is longer. the signal for programming could be the same as for biolek's model. it is important to mention another category of models: analytical. the purpose of this kind of model is to provide a simplified mathematical (analytical) apparatus, which will facilitate the design of electrical circuits. this means that it will improve modelling of memristors, but also facilitate the design of memristor programming circuitry. to the best of our knowledge, there is only one published model [29] at this time. models of this kind are still in development, and perhaps there will be some useful results in the near future. another group of memristor models that is important to mention is based on verilog models. one of the best models of this kind was presented in a series of papers [30-32]. since our focus is on spice models and circuits, we will not analyse this model, but leave verilog models for further research.

3. models for rf/microwave simulations

from the rf/microwave simulation viewpoint, mazady's model [26], pi's model [25] and the improved pi's model [33] may be particularly interesting. mazady's model [26] is actually the laplace transform of the model (equations) for the transition process.
however, this model strongly depends on the actual circuit and its excitations.

3.1. pi's model

pi et al. [25] fabricated a series of devices and provided the obtained results. the authors state that the devices were fabricated on an intrinsic silicon wafer with a 380-nm-thick thermally grown silicon dioxide. they used a silver (ag) layer as a terminal on one side, and gold (au) with a thin titanium (ti) adhesion layer as a terminal on the other side. the terminals are separated by a 35-nm-wide air gap. this structure is graphically presented in fig. 6.

fig. 6 pi's device graphical model.

switching is related to the formation/rupture of multiple conductive filaments between two electrodes. in the on state, there is always at least one conductive filament, which leads to the low-resistance state. in the off state, ruptures are dominant, so this state is dominantly defined by capacitances of the gaps between the filaments. according to the measurements on fabricated devices, the authors proposed an rf model of the memristor switch that behaves as a resistor in the on state, and a capacitor in the off state, as shown in fig. 7. averaged values for resistance and capacitance of the fabricated series of devices are: ron = 3.6 Ω, coff = 1.37 fF. the capacitance is predominantly associated with the capacitance of the air gap, which depends on the effective dielectric constant of the substrate-air interface. it is constant over the frequency range of [10 MHz, 110 GHz]. frequency and power ratings are also provided in the paper. the model was tested in the range of 10 MHz - 110 GHz. at 40 GHz, in the on state, the measured insertion loss was 0.3 dB, and in the off state, the measured isolation was 30 dB. the memristor is operative up to about 20 dBm (17 dBm measured; 24 dBm calculated). the on state is critical, since the device is limited by fusing.
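the reported figures can be cross-checked against the lumped model with a short calculation. the python sketch below treats the switch as a single series element between two matched ports, using $S_{21} = 2Z_0/(2Z_0 + Z)$. the series two-port topology and the 50 Ω reference impedance are assumptions made for illustration, not details stated in this paper; the element values are the averaged ron and coff quoted above.

```python
import math

Z0 = 50.0  # reference impedance in ohms (assumed, typical RF system value)

def s21_series(z, z0=Z0):
    """Transmission coefficient of a series impedance z between two z0 ports."""
    return 2.0 * z0 / (2.0 * z0 + z)

def loss_db(s21):
    """Attenuation in dB (positive number) for a transmission coefficient."""
    return -20.0 * math.log10(abs(s21))

R_ON = 3.6        # ohms, averaged on-state resistance quoted in the text
C_OFF = 1.37e-15  # farads, averaged off-state capacitance quoted in the text
F = 40e9          # hertz, frequency of the quoted measurement

# On state: the switch is modelled as a pure series resistor.
insertion_loss = loss_db(s21_series(R_ON))

# Off state: the switch is modelled as a pure series capacitor.
z_off = 1.0 / (1j * 2.0 * math.pi * F * C_OFF)
isolation = loss_db(s21_series(z_off))

print(f"insertion loss: {insertion_loss:.2f} dB, isolation: {isolation:.1f} dB")
```

under these assumptions the calculation gives roughly 0.3 dB of insertion loss and about 29 dB of isolation at 40 GHz, which is in line with the measured values quoted from [25].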
as the memristance increases, the memristor's power handling also increases.

fig. 7 pi's rf memristor model.

3.2. weinstein's model

weinstein et al. [33] analysed the behaviour of pi's memristor model, and presented some improvements. they take into account the following effects: sio2 parasitic capacitance; si substrate capacitance and the fringe capacitance between the signal line and ground planes. the inductances of the filament and the electrodes are also taken into account. the model is displayed in fig. 8. values for the proposed model are: ron = 2.56 Ω, coff = 1.168 fF, l = 52 fH, lω = 3.1 pH, cp = 1.15 fF.

fig. 8 weinstein's rf memristor model.

4. comparison of memristor models for transition process

in this section we compare the results obtained by using different memristor models. biolek's model [19] is based on hp's device [3]. this model's parameters are initial resistance, minimal resistance in on state, maximal resistance in off state, memristor width, dopant drift speed. mazady's model [26] was also based on the tio2 device. in this model, the parameters are switching times. amirsoleimani [27] based his model on accurate charge transport calculation, and it could be applied to tio2 as well. in this model, there are a lot of parameters to set, such as: temperature, electron mobility, ion mobility, barrier height, density of states in valence and conduction bands, activation energy, some circuit parameters, and others. our goal is to have an rf memristor model. spice memristor models available to the present day do not accurately describe the transient process of the corresponding rf devices.
basic parameters of pi's rf device are: resistance in on state, capacitance in off state, resistance in off state, memristor width, conductor atoms mobility… main model properties and results are presented in table 1, for models used for time-domain analysis:

table 1 result comparison

property                                  biolek    mazady    amirsoleimani
matches chua's theoretical prediction     yes       yes       yes
excitation independent                    yes       no        yes
setting simplicity                        high      medium    low
calculation complexity                    medium    low       high
matching real physical behaviour          medium    low       high
order of magnitude of transition time     ~s        ~ps       ~s

compared with the other two models, biolek's model has the longest transition time. the transition time is longer in this case because biolek's model is a model of an ideal resistive memristor. other parameters / properties are equal or better. it is simple to set, and there is no great calculation complexity. the main issue of mazady's model is that it is not excitation independent, so the model should be changed as the excitation changes, which requires calculations. amirsoleimani's model is the most precise model. on the other hand, it is not easy to set, and its calculation complexity is high in comparison with the other two models. in the case of the frequency-domain models, pi's and weinstein's models provide very similar results in circuits such as phase shifters and filters. weinstein's model could possibly be treated as more precise, since more parameters are taken into consideration. on the other hand, in complex circuitry pi's model would provide simulation results faster.

5. conclusion

a lot of researchers are working on memristor fabrication and modelling. there are many different models of memristors available in the open literature. not all of them are suitable for rf/microwave analysis. in this paper we compared several spice models from several categories used for transition process and frequency analysis.
the best results for transition process are achieved using models based on window functions. biolek's model seems to be one of the most precise in this category, as well as in general. it is excitation independent and simple to use and adjust. the equations are not of great complexity, so the simulation is not time-consuming. mazady's models provide good matching with theoretical expectations. these models are dependent on the circuit parameters and excitations. amirsoleimani's model is probably the most accurate when considering physical processes inside the tio2 device. it is excitation independent, circuit independent, and its matching with theory is good. on the other hand, the main issue with this kind of model is the complexity of setting all the parameters. simulation time is a little longer compared with biolek's model. analytical models are still in development. in the case of frequency analysis, we analysed two models. to be more precise, we discussed one device, its model and an improved version of the model. the original pi's model was experimentally verified. the authors provided information on average values in on and off states, power handling, isolation in off state and insertion loss in on state. the improved model was based on the original model, with some side effects taken into consideration. it is only theoretical and no measurement was involved. these two models provide very similar simulation results when applied in phase shifter circuits.

references
[1] i. marković, m. potrebić, d. tošić, z. cvetković, “comparison of memristor models for microwave circuit simulations”, in proceedings of the 13th international conference on applied electromagnetics пес 2017. niš, serbia, 2017, pp. p10 1-3.
[2] l. chua, “memristor – the missing circuit element”, ieee transactions on circuit theory, vol. 18, no. 5, pp. 507-519, 1971.
[3] d. strukov, g. snider, d. stewart, r.
williams, “the missing memristor found”, nature, vol. 453, no. 7191, pp. 80-83, 2008.
[4] s. vongehr, x. meng, “the missing memristor has not been found”, scientific reports, vol. 5, pp. 1-7, 2015.
[5] knowm, inc. http://knowm.org, santa fe, nm, 87502–4698, usa. [accessed june 2017]
[6] bio inspired technologies, llc. http://bioinspired.net, 720 w. idaho st. suite 30 boise, idaho 83702 usa. [accessed june 2017]
[7] d. biolek, z. biolek, "about fingerprints of chua's memristors", ieee circuits and systems magazine, vol. 18, no. 2, pp. 35-47, 2018.
[8] a. ascoli, r. tetzlaff, s. menzel, "exploring the dynamics of real-world memristors on the basis of circuit theoretic model predictions", ieee circuits and systems magazine, vol. 18, no. 2, pp. 48-76, 2018.
[9] i. marković, m. potrebić, d. tošić, "possibility of applying memristors in microwave filters", tehnika, vol. 71, no. 6, pp. 853-860, 2016. [in serbian]
[10] m. potrebić, d. tošić, d. biolek, "reconfigurable microwave filters using memristors", international journal of circuit theory and applications, vol. 46, no. 1, pp. 113-121, 2018.
[11] i. marković, “implementation of reconfigurable bandpass band stop filter using memristors”, in proceedings of the 24th telecommunications forum – telfor 2016, belgrade, serbia, 22-23 nov. 2016, pp. 11.11. 1-4.
[12] i. marković, m. potrebić, d. tošić, “memristor state transition in reconfigurable microwave filter”, in proceedings of the 2017 ieee 30th international conference on microelectronics (miel), ieee, niš, serbia, 9-11 oct. 2017, pp. 71-74.
[13] m. potrebić, d. v. tošić, "application of memristors in microwave passive circuits", radioengineering, vol. 24, no. 2, pp. 408-419, 2015.
[14] i. marković, m. potrebić, d. tošić, "main-line memristor mounted type loaded-line phase shifter realization", microelectronic engineering, vol. 185-186, pp. 48-54, 2018.
[15] d. biolek, z. kolka, v. biolková, z. biolek, m. potrebić, d.
tošić, "modeling and simulation of large memristive networks", international journal of circuit theory and applications, vol. 46, no. 1, pp. 50-65, 2018.
[16] m. potrebić, d. tošić, "potential applications of memristors in microwave circuits", invited plenary paper and talk, in proceedings of full papers, 13th international conference on applied electromagnetics, пес 2017, faculty of electronic engineering of niš, niš, serbia, august 30-september 01, 2017, pp. i2 1-4.
[17] m. potrebić, d. tošić, d. biolek, "rf/microwave applications of memristors", chapter, pp. 159-185, s. vaidyanathan, c. volos (editors), advances in memristors, memristive devices and systems, studies in computational intelligence, vol. 701, springer, 2017.
[18] z. biolek, d. biolek, v. biolkova, “spice model of memristor with nonlinear dopant drift”, radioengineering, vol. 18, no. 2, pp. 210-214, 2009.
[19] d. biolek, m. di ventra, y. v. pershin, “reliable spice simulations of memristors, memcapacitors and meminductors”, radioengineering, vol. 22, no. 4, pp. 945-968, 2013.
[20] y. n. joglekar, s. j. wolf, "the elusive memristor: properties of basic electrical circuits", european journal of physics, vol. 30, no. 4, pp. 661-685, 2009.
[21] c. yakopcic, t. m. taha, g. subramanyam, r. e. pino, s. rogers, "a memristor device model", ieee electron device letters, vol. 32, no. 10, pp. 1436-1438, 2011.
[22] z. biolek, d. biolek, v. biolkova, z. kolka, a. ascoli, r. tetzlaff, "analysis of memristors with nonlinear memristance versus state maps", international journal of circuit theory and applications, vol. 45, no. 11, pp. 1814-1832, 2017.
[23] d. biolek, v. biolkova, z. kolka, "modified mim model of titanium dioxide memristor for reliable simulations in spice", in proceedings of the 14th international conference on synthesis, modeling, analysis and simulation methods and applications to circuit design (smacd), ieee, 2017, pp. 1-4.
[24] m. engelhardt, ltspice iv version 4.23i. linear technology corporation, http://linear.com. [accessed june 2018]
[25] s. pi, m. ghadiri-sadrabadi, j. bardin, q. xia, “nanoscale memristive radiofrequency switches”, nature communications, vol. 6, no. 7519, pp. 1-9, 2015.
[26] a. mazady, m. anwar, “memristor: part ii–dc, transient, and rf analysis”, ieee transactions on electron devices, vol. 61, no. 4, pp. 1062-1070, 2014.
[27] a. amirsoleimani, j. shamsi, m. ahmadi, a. ahmadi, s. alirezaee, k. mohammadi, m. a. karami, c. yakopcic, o. kavehei, s. al-sarawi, “accurate charge transport model for nanoionic memristive devices”, microelectronics journal, vol. 65, pp. 49-57, 2017.
[28] ni awr design environment, national instruments, inc., el segundo, ca 90245, usa. http://ni.com/awr [accessed june 2018]
[29] a. ascoli, v. ntinas, r. tetzlaff, g. c. sirakoulis, "closed-form analytical solution for on-switching dynamics in a tao memristor", electronics letters, vol. 53, no. 16, pp. 1125-1126, 2017.
[30] s. kvatinsky, e. g. friedman, a. kolodny, u. c. weiser, "team: threshold adaptive memristor model", ieee transactions on circuits and systems i: regular papers, vol. 60, no. 1, pp. 211-221, 2013.
[31] s. kvatinsky, m. ramadan, e. g. friedman, a. kolodny, "vteam: voltage threshold adaptive memristor model", ieee transactions on circuits and systems ii, vol. 62, pp. 786-790, 2015.
[32] n. wainstein, s. kvatinsky, “a lumped rf model for nanoscale memristive devices and nonvolatile single-pole double-throw switches”, ieee transactions on nanotechnology, 2018.
[33] n. wainstein, s. kvatinsky, “an rf memristor model and memristive single-pole double-throw switches”, in proceedings of the ieee international symposium on circuits and systems (iscas), baltimore, md, usa, may 28-31, 2017, pp. 1-4.

facta universitatis series: electronics and energetics vol. 34, no 3, september 2021, pp.
461-482
https://doi.org/10.2298/fuee2103461k
© 2021 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper

acceptance of cloud computing in an airline company based on roger’s diffusion of innovation theory

hossein kardanmoghaddam 1, amir rajaei 2, seyedeh fatemeh fatemi 3
1 department of computer engineering, birjand university of technology, birjand, iran
2 department of computer engineering, velayat university, iranshahr, iran
3 birjand university of technology, birjand, iran

abstract. as the use of cloud computing grows, one of the problems that managers face at the organization level is that personnel cannot work with these systems and users do not accept them. the acceptance of these technologies, the factors that affect it and the barriers that users face are therefore very important. in many organizations where cloud computing has been launched, personnel need a period of time to accept the new system. the shorter this period and the earlier the personnel can work with these systems, the more the productivity of the organization increases. the present paper analyses the factors affecting the acceptance of cloud computing by personnel working in birjand international airport in south khorasan county (iran), based on roger’s diffusion of innovation theory. examining the factors that affect the acceptance of cloud computing in an organization can pave the way for improving its implementation and performance. this research was carried out with a descriptive survey method, and its population includes personnel working in different official and informatics departments of birjand international airport. the data gathering instrument was a questionnaire; its validity was determined from expert opinions and its consistency with cronbach's alpha. descriptive and inferential statistics were used for data analysis.
the results of the present paper indicate a significant positive relationship between the acceptance rate of cloud computing and the perceived testability, comparative advantage, visibility, complexity, and compatibility of cloud computing, and a significant negative relationship between the perception of not needing cloud computing and its acceptance rate. however, no significant relationship was found between the perceived opportunity to test cloud computing and its acceptance rate. since cloud computing has become one of the modern ways of providing electronic services in recent years and has many benefits for air transportation systems, this paper presents a model for the analysis of the factors affecting the acceptance of cloud computing among personnel of an airline company (airport), which can be used for examining cloud acceptance in other air companies.

key words: cloud computing, it, computing services, roger’s diffusion of innovation theory, adoption of cloud computing

received january 28, 2021; received in revised form july 6, 2021
corresponding author: hossein kardanmoghaddam, department of computer engineering, birjand university of technology, birjand, iran
e-mail: h.kardanmoghaddam@birjandut.ac.ir

1. introduction

nowadays, the development of it in many organizations and institutes has resulted in a number of fundamental and massive changes compared with the past. these changes have forced organizations to stay competitive in order to survive in the current intensely competitive market, leaving them no option but to transform and use the latest available technological achievements in order to reach the highest levels of capability for themselves, their employees, or both [1].
it can therefore be claimed that information technology has overtaken other evolving technologies and is considered a basic need for organizations and individuals in many different industries all over the world [2]. cloud computing is one of the latest forms of providing data services and has recently attracted the attention of many organizations and data-collecting institutions. it is considered the latest generation in the evolution of the internet and may indicate the future of application engineering and design [3]. as defined by the national institute of standards & technology (nist), cloud computing is a model for providing convenient, on-demand network access to a shared pool of configurable computing resources. networks, servers, applications and services are examples of such resources, which can be provisioned quickly with minimal management effort and minimal interaction with the service provider [4]. from the nist perspective, the five features of cloud computing are on-demand self-service, broad network access, resource pooling, rapid elasticity and measured service. cloud computing, which is the next generation of network and grid computing, makes it possible to use information technology as a service provided through the network. this technology represents a set of services in the form of various packages and has an application program interface (api) that can be used over the network and may include both storage and computing services [5]. the term cloud refers to the data center hardware and software of the provider [6].
cloud computing is built on the premise that, instead of creating their own hardware and software for storing and processing data, organizations and institutes can use cloud services over the network and pay according to their overall usage rather than purchasing separate products. this is comparable to public utilities such as water, electricity, telephone lines and the internet, where individuals and organizations do not need to purchase the underlying infrastructure and instead pay for the precise amount they use. large firms and institutions that have the potential and the capital can also create their own infrastructure, such as the cloud computing research center of amirkabir university of technology and the free cloud computing society of iran [7]. according to the theory of rogers and shoemaker, 50% of the people in a society are resistant to innovation and are known as the resistant and slow groups, while the other 50% are more receptive to innovation. shoemaker represents this division as a normal distribution [8]. venkatesh et al. [9] introduced the conceptual basis of most technology acceptance models, which depends on the relationship between individual reactions to the "intention to use" and the "actual use" of technology, as shown in figure 1.

fig. 1 the conceptual basis of technology acceptance models (venkatesh et al. 2003)

organizational change concerns employees with long experience more than novices. experienced employees are usually older and have lower media literacy [10].
indeed, these employees have been accustomed for years to doing things in their usual way. the manager must therefore expect resistance if the change forces employees to modify the way they work. the change itself is not the problem for employees; the issue is the possibility that employees who already know what to do in any situation will no longer be considered "experts". for many people, being an expert is very important for self-esteem. using cloud computing technology means accepting changes in how systems work, so some of the organization's human resources resist it. this resistance increases if the change is inconsistent with their abilities and job position. acceptance of technology as an individual, voluntary behavior has been explained by various models. these include the theory of reasoned action (tra), fishbein & ajzen (1975) [11]; the theory of planned behavior (tpb), ajzen (1991) [12]; the technology acceptance model (tam), davis (1989) [13]; the technology, organization and environment (toe) framework, tornatzky & fleischer (1990) [14]; diffusion of innovations (doi), rogers (2010) [15]; and the unified theory of acceptance and use of technology (utaut), venkatesh et al. (2003) [9]. some acceptance theories, such as the technology acceptance model and the theory of planned behavior, are more appropriate for assessing acceptance by final users (individuals) [16]. others, such as the technology, organization and environment framework, rely more on the organizational attitude. since environmental influences affect technology acceptance, the attention this framework pays to the environmental dimension is another of its strengths. it provides a framework of three variables (technology, organization and environment) for studying the acceptance of new technologies in organizations [17].
cloud computing acceptance, especially among organizations, is enhanced through public access to services, which offers several benefits such as increased flexibility and agility [18]. promoting cloud computing acceptance is therefore an important issue for policymakers due to its many benefits [19]. the study in [20] assessed a model for the acceptance and deployment of cloud computing technology and presented an optimal method for its use and implementation. the findings showed that cloud computing and the use of the model (cloud computing: evaluation of cloud computing for acceptance and use) can provide an appropriate path for efficient cloud implementation. rogers' theory is more concerned with the processes of innovation diffusion and acceptance in a systematic and planned manner. it assesses the social process of innovation and how an innovation is achieved and extended to an entire social system. rogers' diffusion of innovation theory has been applied in this research because our case study was the personnel of an international airport in the east of iran, and because accepting this innovation takes time and people do not accept it immediately. the theory is also used in the commercialization of research results, in innovation resulting from technological opportunities, and in the marketing of new products. the innovation process can be adapted to the rogers model to explain the behavioral and social characteristics at different stages of the innovation process within the organization [21].
given the many benefits of cloud computing services, which are becoming increasingly popular with organizations and the general population, we conclude that this technology can play a major role in the aviation and airport-related industries, assisting airlines in providing the best possible services to passengers and helping the organization as a whole succeed. the low degree of knowledge about new cloud services among airport staff can cause uncertainty and curiosity, which may in turn motivate staff to learn about this technology in greater detail. the more familiar staff are with cloud computing and its features, the better and more efficiently they can use the service. identifying the factors that affect the acceptance and adoption of cloud computing, and determining its impact on the quality of airport services, is of great importance for planning policies that serve organizational objectives, because the introduction and widespread use of cloud technology can lead to large quantitative and qualitative changes in the services of airports and aviation systems overall. the main issue in using such new technologies, however, is analyzing and measuring the precise degree of staff willingness to use these newly available tools and technologies [22].
before implementing new technologies in any organization, both managers and employees must have a complete and clear understanding of the upcoming technology and of its advantages and possible uses in their activities, and must closely observe and evaluate the functions and efficiencies that the technology may provide in organizational processes. birjand international airport has the iata code xbj and the icao code oimb, and is usually used for commercial flights (birjand.airport.ir). birjand holds a political and strategic position owing to its location in the east of iran and its 331 km common border with neighboring afghanistan, and its airport, built in 1933, was the third airport built in iran, after qala-e-marghi airport in tehran and bushehr airport. birjand is located at 59° and 13 minutes longitude and 32° and 53 minutes latitude, at an altitude of 1,470 meters above sea level, and is the capital city of south khorasan province. due to the strategic and sensitive location of the city, considerable attention has been paid to the air navigation system, specifically at birjand airport, and new and more efficient systems, such as cloud computing services, can therefore be of great significance for the airport [23] [24] [25]. in addition to cloud computing, the current study uses rogers' diffusion of innovation theory, which is discussed in greater detail in the following section. according to rogers (1995), an innovation is any new or original idea, object or practice, and can even include something that only appears new.
however, despite the name, an innovation does not have to be an extremely new idea that was never seen before. an idea, method or object can be called an innovation if it is perceived as new by the people who are willing to accept and use it. when disseminating a new idea, it is therefore more important for the idea to appear new to the target market than to have a high degree of actually new features and characteristics [26]. according to rogers (1995) [26], the diffusion of innovation has four elements: 1) an innovation that is accepted, 2) by the members of a society, 3) through specific communication networks, 4) over a period of time [27]. this research proposes a model for analyzing the factors affecting the acceptance of cloud computing by the personnel of an airline company (airport). the findings can serve as a guideline for the efficient application of cloud computing in air transportation systems and can also contribute to the examination of cloud acceptance in other air organizations. birjand international airport uses an extensive satellite system and an automatic terminal information system (atis). its ict infrastructure includes a data center, active and passive network infrastructure, storage and processing infrastructure, and weak-current infrastructure. the airport uses the ndb, dvor, vor and dme navigation systems. its communication systems include a switching tower, access (controller), recorders, fixed and mobile transceivers, fm and am bands, a system for sending meteorological information, and the automatic dependent surveillance systems ads-b and ads-c for increasing radar coverage.
the airport also uses an rcag flight control system, electronic flight planning (efpl), gps, microwave, closed-circuit television (cctv), a nationally integrated flight information display system (fids), and papi and sals systems. it is planning to implement an intelligent transportation system (its), the aeronautical telecommunication network (atn), a control center and an a-smgcs system. in addition, birjand international airport is planning to use a wtmd auxiliary inspection system and intelligent gateway systems, and to purchase new equipment with modern x-ray technology using iot. a remote tower scheme has also been proposed and is being studied by an iranian company, although installing, launching and maintaining ils equipment is costly for the airport. its senior authorities are planning to use performance based navigation (pbn). biometric identification is among the other research plans at birjand international airport and has not been put into operation yet. in this research, the characteristics of innovation that affect its acceptance rate are discussed from rogers' point of view in the second section. the literature review and previous work concerning cloud computing are discussed in the third section. the research methodology is stated in the fourth section. the findings and hypotheses of the research are presented in the fifth section. the regression model of the determinants of the cloud computing acceptance rate is stated in the sixth section. the conclusion and a comparison of this research with similar works are given in the final section.

2. various characteristics & features of innovation acceptance

rogers identified several characteristics of a technology that strongly affect its adoption rate: a) comparative advantage, b) compatibility, c) complexity, d) testability and e) observability of the results [28] [29].
comparative advantage: the degree to which individuals perceive the innovation as more efficient and generally better than the current technology it is meant to replace. the degree of comparative advantage is measured by economic factors, but other factors such as social credibility, convenience and overall satisfaction can also play a major role. the objective benefits of the innovation matter less than the degree of advantage that consumers perceive. the better the adopter perceives the benefits of the innovation or technology, the faster it is adopted and accepted.
compatibility: the extent to which individuals perceive the innovation as being in harmony with existing values and past experiences and as meeting the needs of the recipient. ideas and innovations that are compatible with prevailing social values are accepted more quickly than those that are not. accepting an incompatible technology often requires the prior acceptance of an entirely new value system.
complexity: the degree to which an individual perceives the innovation as difficult to learn and apply. for most members of society, some new technologies are simpler to comprehend and apply than others, and the more difficult ones are accepted by the target market more slowly. in general, new ideas that do not require specific learning and investment are accepted much sooner than technologies that require new skills and knowledge.
testability: the extent to which a technology can be reviewed and tried out on a limited basis. technologies that can be tested with limited facilities and resources have a higher chance of being accepted quickly than those that cannot be tried at all, and innovations that have previously been tested on a smaller scale appear significantly less risky to use.
observability: the extent to which the results of a technology are visible and clear to others in society. the clearer the results of the innovation are to people, the sooner it is likely to be accepted by the general population [28] [29].
rogers (1995) [26] argues that innovations and technologies with greater comparative advantage, greater compatibility, a higher degree of observability and less complexity are adopted more easily and quickly by their recipients than other innovations. because of the many capabilities cloud computing offers, many public and private institutions and organizations are gradually considering using it to process the largest amount of data in the shortest possible time and at the lowest cost. experts have suggested setting up cloud computing facilities for aviation and airport-related systems so that they can meet the needs of passengers even at the busiest times. most air terminals and airports can meet their specific requirements using one of the predominant forms of cloud computing (public, private, community and hybrid cloud services).
the use of cloud computing technology in both private and public institutions and organizations in iran is lower than in other parts of the world. it can be claimed that only a limited number of institutions are even aware of this technology and its many benefits and uses. one of the main reasons is that managers and officials lack knowledge about the capabilities of this technology. this lack of knowledge and familiarity can create a large gap between organizations and the benefits of cloud computing, and may cause organizations never to take advantage of technologies that could have a large positive impact. the findings of the current study, conducted on the staff of birjand international airport, can provide the necessary information about the features of cloud computing and thereby lay the foundations for acceptance of cloud computing technology by the entire staff of the airport. the present study examines the factors affecting the acceptance rate of cloud computing among personnel working at birjand international airport, in order to determine whether there is a relationship between the characteristics of cloud computing services and their adoption rate.

3. research background

prior to the current study, no research had been conducted specifically on the acceptance rate of cloud computing in airports and the aviation industry based on rogers' diffusion of innovation theory. the research background reviewed here is therefore drawn mainly from data science and related fields of study. behrend et al.
(2011) studied the acceptance and use of cloud computing among approximately 750 local college students who took part in computing skills courses. in this research, conducted through an online questionnaire, access to software, easier transportation, individual creativity, technological concerns, teacher support and reliability were each examined as determinants of perceived usefulness, perceived ease of use, and actual and potential future use. the results indicate that perceived usefulness and perceived ease of use both have a strong positive effect on students' decisions to use cloud computing. on the other hand, despite the negative impact of factors such as technology concerns, access to software, and easier communication and transportation, perceived ease of use remains more significant than perceived usefulness, so perceived usefulness does not affect students' use of cloud computing to a great extent [30]. in another study, opitz et al. (2012) examined the acceptance of cloud computing using data obtained from 100 it managers at leading german companies listed on the stock market. in this questionnaire-based study, they concluded that there is a significant relationship between perceived usefulness and the managers' behavior toward using cloud computing services, and that perceived ease of use has significantly less effect than the managers' behavioral factors [31].
alharbi (2012) examined the adoption of cloud computing in saudi arabia using factors including age, gender, field of work, level of education and nationality. the study was conducted through an online questionnaire completed by 171 employees of it organizations in saudi arabia and analyzed with linear regression. the results indicate a significant relationship between the perceived usefulness and perceived ease of use of cloud services and the managers' behavioral factors. there is also a significant relationship between perceived ease of use and perceived usefulness, and between these and the managers' behavioral response, which can ultimately affect the decision on whether to use cloud computing services [32]. in another study, akbari, sohrabi & zanjani (2012) examined the status and acceptance rate of search engines and meta search engines among users at the university of isfahan, based on rogers' diffusion of innovation theory. the main purpose of the research was to investigate the acceptance rate and to identify the capabilities and tools that influence the acceptance of search engines and meta search engines by these users. the results show that the average acceptance rate of search engines and meta search engines did not differ significantly by gender, grade or faculty; the only significant difference, found in the acceptance of specialized search engines, was related to gender.
on the other hand, the results showed that users' friends played a major role in the acceptance of public search engines and subject directories, and that professors can be very effective in the acceptance of specialized search engines and meta search engines. the authors concluded that universities are the best place to apply search engines and meta search engines. a comparison of the simple and advanced google search pages indicated a greater tendency among users toward the simple search page, which confirmed rogers' theory [33]. luo (2012) [34] examined how reference librarians can use cloud computing services and technology. the objective of that study was to examine the use of software as a service (saas) tools to support and facilitate their work. the results indicated that librarians may use such tools for a variety of purposes, including facilitating internal communication, collaborative work, data support, and literacy practice and training. in another study, heidari, alizadeh & hamdipour (2013) investigated the factors affecting the adoption rate of electronic data resources among the faculty members of iranian data science departments, based on rogers' diffusion of innovation theory. the research method was descriptive-survey, carried out through a set of electronic questionnaires.
the exploratory factor analysis identified the most significant features of electronic data resources: comparative advantage, testability, compatibility, observability and available testing opportunities. further findings show that these features have a significant relationship with the speed of adoption of electronic data resources such as clouds and can be very effective in the adoption rate. the multivariate results show that observability, test opportunity and age are important predictors of the acceptance of electronic data resources, together explaining 27% of the variance of the dependent variable. overall, the findings confirm rogers' diffusion of innovation theory (2003) in understanding the characteristics of electronic data resources among the members of iranian data science departments [35]. fung (2013) used the tam model and porter's five forces analysis to estimate the decision on whether to apply cloud computing technology in outsourced it services. this quantitative inferential research was conducted through an online questionnaire and analyzed with multiple regression and path analysis using pls software. the only factors investigated were those of the tam model: the perception of the features of cloud computing, its user-friendliness, and the overall response toward its acceptance. according to the results, the perception factor in the overall acceptance of cloud computing positively and strongly predicts its perceived usefulness.
both the perceived usefulness of cloud computing features and the perceived ease of use significantly predict the decision to accept cloud services [36]. sun et al. (2014) focused mostly on the perspective of cloud service providers, categorized into five groups: a) the use of decision-making methods, b) the use of data display models, c) the use of cloud service characteristic parameters, d) the content and e) objectives [37]. yuvaraj (2014) surveyed 209 librarians of the central library of university of india to determine the degree of acceptance of cloud computing services and applications among them. using this survey, which was conducted over eleven months by postal questionnaire, yuvaraj emphasizes the effects of four factors on the use of cloud computing services and applications: perceived usefulness of the cloud, perceived ease of use of the cloud, decision-making, and behavior toward the use of cloud computing. according to the results, there was no significant relationship among these factors in the use of cloud computing applications. however, a significant relationship was found between the perceived usefulness of the cloud and behavior toward its use [38].
this particular study examines the cloud migrations that occurred during 2009-2014 and introduces a set of factors affecting the adoption of cloud computing services in the form of a conceptual model. the key factors include the feasibility study of cloud migration, analysis of the requirements for developing a migration plan, cost reduction, low effort to maintain resources, efficient use of resources, and unlimited scalability of resources [39]. in another study, mangai, ganesan & kumar (2014) examined the current and possible future use of cloud computing technology in library services. they analyzed the origin and types of cloud computing, its impact on libraries, and its potential benefits and drawbacks. they concluded that information and communication technology (ict) has forced libraries to change how they process and deliver information, and that with the rapid advancement of technology, automated, digital and other advanced libraries are being created. the research team believed that users would soon be able to access their information from anywhere in the world through cloud computing, which may save a considerable amount of money, time and resources [40]. scholtz (2016) examined the technical and environmental factors that could affect the adoption of cloud computing in the public departments of south africa. in this study, 51 experts from 40 public organizations in south africa were surveyed using the delphi method and a questionnaire.
the findings indicated that most participants were very concerned about their privacy, which ranked first among all factors. environmental factors such as constant learning pressure, resistance to change and a lack of a sense of security are other major factors that must be dealt with in order to implement cloud computing services properly and legally in such organizations according to their specific requirements [41]. valmohammadi & mazaheri (2017), based on the technology acceptance model of davis, described the factors that can influence the final decision to use cloud computing among employees of the national radio & television organization of iran; perceived usefulness was identified as the most significant factor influencing the decision on whether to use cloud computing services [42]. yuvaraj, mayank (2016) studied the factors affecting the acceptance and targeted use of cloud computing in the libraries of medical universities in india using the delphi 3d method. the study was conducted with a group of 32 experts with experience in cloud computing in the field of libraries. sixty factors were identified, of which 42 had a direct impact on the level of acceptance and targeted use of cloud computing technologies in these libraries [43]. heydari dahouei et al. (2017) provided a framework for selecting the appropriate system for implementing cloud computing. a case study was conducted at the faculty of modern sciences and technologies of the university of tehran.
in conclusion, experts determined that the criteria of accessibility, system reliability as well as stability were all considered to be the top most important determining factors [44]. sabi et al. (2018) conducted another research on acceptance rate of cloud computing in the southern regions of africa, in which they examined and analyzed building a background model based on the factors that could potentially affect the actual acceptance rate. additionally, the methods they used included a mixture of either (doi, diffusion of innovation theory) or (tam, which stands for technology acceptance method model). furthermore, the obtained results indicated that socio-cultural factors as well as displaying results, usefulness and data security were all found amongst the most significant factors in order to appropriately accept cloud computing in universities [45]. changchit, chuleeporn and chuchuen (2018) in another study, they examined the factors that affect the adoption rate of cloud computing. additionally, they used the technology acceptance model and identified the various factors that could potentially influence the acceptance rate of cloud computing and these factors including usefulness comprehension, easy use, security and the cost of using the cloud computing provided services [46]. elzamly et al. (2019) [47] proposed a model for cloud computing acceptance in ebanking management systems. in their proposed model, there are four steps to successfully accept the cloud computing model for managing an electronic banking system. the proposed model includes 4 stages which are divided into technological factors, organizational factors, environmental factors, and operational factors for the acceptance of the cloud computing model. in e-banking systems, the successful adoption of the cloud computing model improves the probability of the success of cloud e-banking in banking organizations. alidoust et al. 
[48] (2020) studied the factors that influence cloud computing adoption among employees of physical education faculties in tehran, iran. the study identified and examined 12 factors, 11 of which had a positive effect on adoption, whereas complexity negatively affected the intention to adopt. technology readiness had the highest impact. growth in technology readiness among physical education faculties, data security and reduced complexity were identified as the most important factors for facilitating cloud computing adoption. albelaihi, khan (2020) [49] examined the advantages of and barriers to cloud computing adoption in small and medium enterprises in saudi arabia. they interviewed managers in the information and communication technology industry, and the questionnaire was completed by telephone with managers of enterprises in different geographical parts of saudi arabia. after evaluating the questionnaires, they found a significant relationship between the use of cloud computing and both the quality of services and the performance of small and medium companies. they concluded that the level of knowledge of managers in these enterprises is lower than the global average. they also found that the biggest challenge in saudi arabia is privacy and security for cloud computing service providers and users, and that the culture of saudi arabia has played a role in preventing cloud computing adoption. in another study, d. h. tesema (2020) [50] examined the challenges and the importance of cloud computing adoption in terms of security, privacy and availability in a commercial bank in ethiopia.
a descriptive method was used to study and analyze the role of cloud computing in commercial banks. also, the role of cloud computing was analyzed as a powerful tool for savings and cost-effectiveness on large commercial banks considering the mentioned bank as a case study. the results of this study display the main effective factors such as cost efficiency, security and compliance, reliability and also the ability to cooperate in cloud computing adoption in commercial bank of ethiopia (cbe). omar ali et al. (2021) [51] assessed the complexity of cloud computing adoption in local governments in australia. in this research, they proposed a hybrid evaluation model for the cloud computing adoption in the information systems. to evaluate the proposed model, 21 it managers from local governments were interviewed in the first phase, and then in the second phase, 480 it employees from 47 local governments responded to an online survey. after evaluating the results, they concluded that the complexity of an organization, the structural complexity of technology, the dynamic complexity of an organization and the dynamic complexity of should be considered in the use of cloud computing and without them, the results will not be satisfactory. the importance of cloud computing acceptance by personnel and the extensive research before its implementation have been considered, thus, this research is conducted to address the importance of cloud systems in aviation, especially in the east of iran. 4. research methodology this conducted research has been completed, using descriptive survey method. additionally, the selected statistical population for this particular research included all the current personnel who work at the birjand international airport which is located in the southern parts of khorasan province (iran). 
furthermore, because the number of people was limited, a census was used rather than sampling, which meant that all the employed personnel were included. in total, 67 questionnaires were issued to the staff, of which 57 were returned and analyzed; the questionnaire was the data collection tool used in this study. the questionnaire was based on roger’s diffusion of innovation theory as well as on similar previous studies [27], [35] & [52]. the questions in each section were written according to the constructs that make up roger’s diffusion of innovation theory and the particular features and functions of the technologies under study. a total of 30 questions were designed on a six-point likert scale (from 1 = strongly disagree to 6 = strongly agree) in order to measure the overall comprehension of cloud computing features. to establish the validity of the tool, the questionnaire was first made available to professors and experts in sociology, computing and management, as well as to a number of airport staff; after the errors found in the questions were corrected, the questionnaires were distributed. reliability was then assessed using cronbach’s alpha coefficient, which was 92.2%, and also through a retest method conducted on a group of personnel who at that time worked at birjand international airport.
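the reliability coefficient mentioned above can be illustrated with a minimal sketch. the function name `cronbach_alpha` and the toy answers are hypothetical and are not the study's data; the sketch only assumes the standard definition of cronbach's alpha (number of items, item variances, variance of the total score) and uses sample variances.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*responses)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers (NOT the study's data): 4 respondents, 3 items
# on a 1-6 scale. The items move together perfectly, so alpha is at
# its maximum for this toy set.
toy_responses = [
    [1, 2, 3],
    [2, 3, 4],
    [3, 4, 5],
    [4, 5, 6],
]

alpha = cronbach_alpha(toy_responses)
print(f"alpha = {alpha:.3f}")
```

in the study itself the coefficient would be computed over the 30 questionnaire items; a value of 0.922 is conventionally read as high internal consistency.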
finally, spss and excel were used to analyze the questionnaire results, covering both descriptive and inferential statistics. many organizations nowadays use a private cloud model that they have set up for themselves. in this study, the cloud computing services used by the administration and technology employees of the airport were studied. since the clouds used are private or associate cloud models managed by upstream management organizations, they do not face the challenge of confidentiality and the diffusion of sensitive data. programs have been implemented to transition to the cloud computing model in aviation systems such as flight planning, ticket reservation and sales systems, electronic control of goods, air traffic and navigation control, customs services, accounting and security services at birjand international airport (systems for which implementation is not possible). most of the questions in the questionnaire concerned the use of software as a service (saas), the cloud computing service that is most used and most important in the airport business.
5. research findings
a total of 57 individuals took part in the research: 5 females (8.8%) and 52 males (91.2%). most participants held a bachelor’s degree or lower (68.4%), and the mean age and length of service of the participants were 36.79 ± 4.17 and 12.96 ± 4.43 years, respectively (table 1).
table 1 demographic characteristics of the participants
variable | category | frequency | percentage
gender | female | 5 | 8.8
gender | male | 52 | 91.2
educational degree | bachelor’s degree or lower | 39 | 68.4
educational degree | master’s degree or higher | 18 | 31.6
age (mean ± sd) | | 36.79 ± 4.17 |
years of service (mean ± sd) | | 12.96 ± 4.43 |

a descriptive analysis of the research variables is presented in table 2. the mean total score of the cloud computing adoption characteristics was 120.65 ± 9.99, and the mean adoption rate was 66.09 ± 9.60. because the skewness and kurtosis of all research variables fall within the range [-2, 2], the variables can be regarded as approximately normally distributed.

table 2 descriptive analysis of research variables
variable | mean | standard deviation | median | mode | variance | minimum score | maximum score | skewness | kurtosis
testability comprehension | 23.95 | 3.37 | 24 | 21 | 11.34 | 17 | 30 | 0.39 | -0.56
comparative advantages comprehension | 42.14 | 4.05 | 42 | 40 | 16.41 | 36 | 51 | 0.30 | -0.91
understanding the visibility of results (observability) | 12.68 | 1.77 | 12 | 12 | 3.15 | 10 | 16 | 0.24 | -1.03
examination opportunities comprehension | 7.26 | 1.48 | 7 | 7 | 2.20 | 5 | 11 | 0.58 | 0.40
complexities comprehension | 16.93 | 2.29 | 16 | 20 | 5.25 | 13 | 22 | 0.43 | -1.02
compatibility comprehension | 8.49 | 1.28 | 8 | 8 | 1.65 | 5 | 10 | -0.69 | 0.38
understanding the occasional lack of need for cloud computing | 9.19 | 1.91 | 9 | 9 | 3.66 | 4 | 12 | -0.70 | 0.65
general acceptance features of cloud computing | 120.65 | 9.99 | 116 | 113 | 99.95 | 108 | 138 | 0.44 | -1.53
adoption rate | 66.09 | 9.60 | 64 | 55 | 92.08 | 53 | 84 | 0.55 | -0.97

hypothesis (1): there is a significant relationship between testability comprehension of cloud computing and its adoption rate.
the pearson correlation coefficient in table 3 indicates a positive, significant relationship between testability comprehension of cloud computing and the adoption rate (r = 0.63, p < 0.001). in other words, the more people test new technologies and innovations, the higher the adoption rate.
hypothesis (2): there is a significant relationship between the comprehension of the comparative advantage of cloud computing and its adoption rate. the pearson correlation coefficient in table 3 indicates a significant relationship between comparative advantage comprehension and the adoption rate (r = 0.69, p < 0.001). in other words, those who better understand the comparative advantages also have a higher adoption rate.
hypothesis (3): there is a relationship between understanding the visibility of cloud computing results and its adoption rate. the pearson correlation coefficient in table 3 shows a significant relationship between the observability of the results of cloud computing and the adoption rate (r = 0.55, p < 0.001). that is, people with a higher comprehension of the observability of the results also tend to have a higher adoption rate.
hypothesis (4): there is a significant relationship between the examination opportunities comprehension of cloud computing and the adoption rate. the pearson correlation coefficient in table 3 indicates no significant relationship between comprehension of the examination opportunities and the adoption rate (r = 0.18, p = 0.17), so this hypothesis is not supported.
hypothesis (5): there is a relationship between the complexities comprehension of cloud computing and the adoption rate.
the pearson correlation coefficient in table 3 shows a positive, significant relationship between the comprehension of complexities in cloud computing and its adoption rate (r = 0.79, p < 0.001). in other words, those who understand these complexities more thoroughly also tend to have a higher adoption rate.
hypothesis (6): there is a significant relationship between the compatibility comprehension of cloud computing and the adoption rate. the pearson correlation coefficient in table 3 shows a significant relationship between compatibility comprehension of cloud computing and the adoption rate (r = 0.37, p = 0.004). this means that those who perceive cloud computing as more compatible tend to have a higher acceptance rate.
hypothesis (7): there is a significant relationship between the lack of need for cloud computing and the adoption rate. the pearson correlation coefficient in table 3 shows a negative, significant relationship between the lack of need for cloud computing and the adoption rate (r = -0.69, p < 0.001). in other words, people who feel a greater need for cloud computing tend to have a higher adoption rate than those who do not.

table 3 the correlations between cloud computing acceptance features & adoption rate
variable | r (with adoption rate) | p
testability comprehension | 0.63 | <0.001
comparative advantage comprehension | 0.69 | <0.001
understanding the visibility of results (observability) | 0.55 | <0.001
examination opportunities comprehension | 0.18 | 0.17
complexities comprehension | 0.79 | <0.001
compatibility comprehension | 0.37 | 0.004
understanding the lack of need for cloud computing | -0.69 | <0.001
general acceptance features of cloud computing | 0.71 | <0.001

6.
regression model determinants of cloud computing adoption rate
multiple regression analysis of the data and the relationships among the variables are discussed in this part of the research. the objective of this analysis is to determine the contribution of the main research variables (cloud computing adoption features) to estimating changes in the dependent variable (adoption rate). in other words, multiple regression helps to explain and predict the variance of the dependent variable, the cloud computing adoption rate, by estimating the role each independent variable plays in creating that variance. multiple regression can therefore be used to establish the existence and degree of a relationship between a dependent variable (y) and any number of independent variables. to determine the degree of the relationship between the cloud computing acceptance characteristics and the adoption rate, multiple regression analysis was used with a stepwise method. in the first step, complexities comprehension of cloud computing was entered, which explained 62% of the variance of the adoption rate. in the second step, testability comprehension was entered into the equation, which increased the coefficient of determination by 4% (from 62% to 66%). in the third step, the lack of need for cloud computing entered the equation and increased the coefficient of determination by 3% (from 66% to 69%).
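the stepwise figures can be cross-checked against the sums of squares reported in table 4 and the coefficients in table 5. the following is a minimal sketch, not the authors' spss procedure: the function names are hypothetical, the numbers are copied from tables 2, 4 and 5, and n = 57 is the number of analyzed questionnaires. it recomputes f, r and r² for each step and evaluates the fitted equation at the sample means. as an observation from this recomputation only, the reported determination coefficients (0.62, 0.66, 0.69) appear to agree with the adjusted r² rather than with the raw ratio of sums of squares in steps two and three.

```python
import math

N = 57              # analysed questionnaires
SS_TOTAL = 5156.56  # total sum of squares (table 4)

# (variable entered at this step, regression sum of squares, predictors so far)
STEPS = [
    ("complexities comprehension", 3212.33, 1),
    ("testability comprehension", 3482.78, 2),
    ("lack of need for cloud computing", 3663.71, 3),
]


def anova_summary(ss_regression, n_predictors, ss_total=SS_TOTAL, n=N):
    """Recompute F, R, R^2 and adjusted R^2 from ANOVA sums of squares."""
    ss_residual = ss_total - ss_regression
    df_residual = n - 1 - n_predictors
    f_value = (ss_regression / n_predictors) / (ss_residual / df_residual)
    r2 = ss_regression / ss_total
    adj_r2 = 1 - (1 - r2) * (n - 1) / df_residual
    return {"F": f_value, "R": math.sqrt(r2), "R2": r2, "adj_R2": adj_r2}


def predict_adoption(complexity, testability, lack_of_need):
    """Fitted equation using the unstandardized coefficients of table 5."""
    return 26.75 + 2.04 * complexity + 0.69 * testability - 1.27 * lack_of_need


for name, ss_reg, k in STEPS:
    s = anova_summary(ss_reg, k)
    print(f"{name}: F={s['F']:.2f} R={s['R']:.2f} "
          f"R2={s['R2']:.3f} adjusted R2={s['adj_R2']:.3f}")

# A least-squares fit with an intercept passes through the sample means, so
# plugging in the table 2 means should land near the mean adoption rate (66.09).
print(f"prediction at the means: {predict_adoption(16.93, 23.95, 9.19):.2f}")
```

the small gap between the prediction at the means and the reported mean adoption rate is consistent with rounding of the published coefficients.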
in other words, complexities comprehension, testability comprehension and understanding the lack of need for cloud computing together explain 69% of the variance of the cloud computing adoption rate; the other variables were removed from the equation because they were not significant. among these three variables, complexities comprehension, with a standardized coefficient of 0.49, has the largest share in estimating the cloud computing adoption rate (tables 4 & 5).

table 4 multiple regression test result determinants of cloud computing adoption rate
step (variable entered) | source of changes | sum of squares | degrees of freedom | mean square | f | significance | correlation coefficient | determination coefficient
1 (complexities comprehension) | regression | 3212.33 | 1 | 3212.33 | 90.87 | <0.001 | 0.79 | 0.62
1 | residual | 1944.23 | 55 | 35.35 | | | |
1 | total | 5156.56 | 56 | | | | |
2 (testability comprehension) | regression | 3482.78 | 2 | 1741.39 | 56.18 | <0.001 | 0.82 | 0.66
2 | residual | 1673.78 | 54 | 30.99 | | | |
2 | total | 5156.56 | 56 | | | | |
3 (understanding the lack of need for cloud computing) | regression | 3663.71 | 3 | 1221.24 | 43.36 | <0.001 | 0.84 | 0.69
3 | residual | 1492.85 | 53 | 28.17 | | | |
3 | total | 5156.56 | 56 | | | | |

table 5 beta coefficient of cloud computing adoption rate for regression model variables
variable | non-standard coefficient (b) | standard error | standard coefficient (β) | t value | significance level
fixed | 26.75 | 11.37 | | 2.35 | 0.02
complexities comprehension | 2.04 | 0.45 | 0.49 | 4.58 | <0.001
testability comprehension | 0.69 | 0.26 | 0.24 | 2.69 | 0.01
understanding the lack of need for cloud computing | -1.27 | 0.50 | -0.25 | 2.53 | 0.01

furthermore, a structural equation model was also used to observe and analyze the possible effects of cloud computing adoption characteristics on the adoption rate.
additionally, the results indicated that the standardized coefficient between the cloud computing acceptance characteristics and the adoption rate is about 0.95. since the corresponding t-statistic (9.30) is clearly greater than 1.96, it can be concluded that the adoption characteristics of cloud computing have a positive and significant effect on the adoption rate.
fig. 1 the standard factors related to the effect of cloud computing characteristics on its adoption rate
fig. 2 the t-statistics for the effect of cloud computing characteristics on the adoption rate
7. discussions & conclusions
the results of the present paper, conducted in an international airport (birjand), indicated a meaningful and positive relationship between the adoption rate and the acceptance features of cloud computing, namely the perceived testability, comparative advantages, visibility of the results (observability) and compatibility of cloud computing. the relationship between the lack of need for cloud computing and the adoption rate, however, is negative and also significant. on the other hand, no significant relationship was found between comprehension of the examination opportunities of cloud computing and the adoption rate. farmanlu lilab et al. (2019) conducted a study on the librarians who worked at the tabriz university of medical sciences and analyzed the factors that could affect the adoption rate of cloud computing based on roger’s diffusion of innovation theory.
additionally, the most significant features of cloud computing which were identified using heuristic factor analysis are as follows: testability, comparative advantages, complexity, compatibility and the observability of the obtained results. in addition to these five features, two other factors were also identified during this study which included test & examination opportunities as well as the lack of need for cloud computing. furthermore, the obtained results of this particular study indicated that there is significant relationship among the characteristics and features of cloud computing such as testability, compatibility, complexity, comparative advantages, examination opportunities along with the lack of need for cloud computing and the adoption rate of cloud computing, however, there was no major relationship to be found between the observability of the results and cloud computing adoption rate in general. moreover, we conclude that the results of this study, apparently is not consistent with the results obtained through the current conducted research (with the only exception being in the characteristics of examination opportunities and the observability of the results) [52]. hamdipour & zavaraqi (2018) conducted a research on the analyze of the factors that may potentially affect the acceptance of research information management system (simap), using the personnel of tabriz university. as a result, they came to conclusion that there is a significant relationship between the seriousness and determination of the university and the characteristics such as compatibility, testability, complexity and comparative advantages which very possibly may lead to a change in the speed of the adoption rate [53]. nazari et al. 
(2012) in another study, examined the relationships between the complete comprehension of the characteristics mentioned in roger’s diffusion of innovation theory and the acceptance rate through the female personnel of district 8 of islamic azad university, as a result, they came to the conclusion that features such as compatibility, complexity, testability and observability, all have a positive impact on the relationship towards the acceptance rate [54]. the obtained results of the research conducted by heidari et al. (2014) based on the roger’s diffusion of innovation theory, using the faculty members of iranian department of information science on the study of factors that could potentially affect the adoption rate of electronical information resources (such as cloud computing), indicates that the characteristics and features such as comparative advantage, testability, compatibility, complexity, observability and examination opportunities all have a significant impact on the speed or pace of the adoption rate of cloud computing or other available electronical information resources [35]. furthermore, ebadolahi et al. (2014) in another study conducted on the public libraries of the city of tehran and its 22 different districts, examined and analyzed the possible effects of the five characteristics of innovation dissemination on the adoption rate. in conclusion, the results showed a significant as well as positive relationship between the various characteristics of technology (compatibility, testability, easy use, observability and comparative advantage) and the acceptance rate of journal index software used by librarians [27]. do (2008) [55] in another research conducted on the faculty members and students of the college of specialized studies of hawaii, examined the possible effects of the five characteristics of roger’s diffusion of innovation theory on the acceptance rate as well as the many uses in online educational admissions. 
additionally, the results show a significant relationship between the features of complexity, compatibility, observability, testability and comparative advantage and the acceptance rate of online educational technologies by potential users. al-gahtani (2004) [56], in another study on computing technologies and their acceptance rate based on roger’s diffusion of innovation theory, concluded that testability and observability of the results both have a large, positive impact on the acceptance rate of computers. the results of that research are consistent with our current study. to interpret the results of the current study, it can be said that technologies or innovations that can be tested when necessary, or evaluated in either the short or long term, are more likely to be accepted by the social system and the community. furthermore, those with a higher degree of comparative advantage comprehension also have a higher adoption rate; this can be explained by the fact that usefulness, benefits, ease of use, user-friendliness and task performance are among the most important factors affecting the adoption rate. individuals also tend to have a higher adoption rate when they better understand the observability of the results. therefore, if the results of using a new technology are constantly visible to the population, people can witness its positive impacts for themselves, and as a result acceptance rates may increase.
moreover, the use of technology as an innovation is generally better accepted, if it is compatible with other factors such as social values, past experiences, job aspects and essential requirements. finally, take note that those who are at more complex situations and feel the greater need for cloud computing, consequently tend to have a higher adoption rate. at the end, based on the obtained results of multivariate regression in the current study, we learn that among the various characteristics of cloud computing acceptance (understanding the lack of need for cloud computing as well as complexity and testability comprehension) are the most significant factors and predictors of cloud computing adoption rate overall, in addition, they make 69% of the variance of the cloud computing adoption variables. additionally, amongst these three mentioned variables, the complexity comprehension with a beta coefficient of 0.49 seems to have the greater share of impact on the adoption rate of cloud computing in comparison with the rest of the variables. moreover, the obtained results of the research conducted by al-gahtani (2004) [56] can indicate and confirm that technological complexity feature is the most effective factor found amongst the five technological characteristics which can ultimately determine the prevention or acceptance of computers and technological advancements. air transportation systems and airports have gradually adopted cloud computing as they worry about reliability, surveillance and security threats. but cloud computing is gradually changing the interaction of people with these systems. acceptance of cloud computing provides different levels of decision making for air industry managers seeking prospects of applying cloud computing in their organization and it can affect decision making of air transport system managers. as using cloud computing service or the time of using the service requires costs, controlling it and lowering the prices is easier. 
when an organization chooses cloud software, it controls extra costs: it does not need to pay the costs of purchasing and maintaining servers and pays only for the amount used. among air companies, small and medium ones do not have the financial and human resources of large organizations, so they cannot efficiently update and upgrade their it requirements. as a result, these organizations lose the chance to compete with their larger rivals. cloud computing helps these organizations assess their organizational conditions for accepting cloud computing services. using cloud computing in air transportation systems and airports gives these companies, or a group of subsidiaries, better management strategies at lower cost, because the cost is shared among the companies, and cloud computing can be profitable for these systems. cloud computing provides more storage space than traditional methods of storing information because storage is performed by the cloud, and cloud users can access the required knowledge anywhere and at any time from different systems such as personal computers or smart devices such as tablets, smart phones or mobile computers. cloud computing makes information technology flexible and accessible through different resources and easier to work with, which leads to smart and rapid survey of resources: documents are not placed on personal computers but hosted on the cloud, and multiple users can work on documents and projects simultaneously. issues such as customer relationship management, flexibility in organizational activities, ease of establishing new systems and business durability are important in using the cloud. the scalability of cloud computing services allows resources to be increased or decreased for users at any time.
whenever due to high visit of website, so many resources are needed; cloud computing service automatically increases bandwidth so that there won’t be any problem in processing. using cloud computing systems decreases website, software and system downtime to great extent. many cloud service providers present policies, technologies and controls that boost security condition of air transportation systems and support data, programs and organizational infrastructures against probable threats. capability to backup and retrieve information by cloud computing system has turned it into one of the most reliable information processing and storing methods. cloud computing systems are implemented in a big and secure datacenter and when update is provided, they are rapidly updated. this updating trend leads to high speed and high performance services at airports. using cloud computing enables air transportation systems and airports to provide new services and technologies, identify their customer’s needs and fulfill them. like any other study, the researcher has faced limitations due to the small sample size. if a larger sample size (for example the entire employees of airports in eastern iran or total employees of iranian airports) was available, more accurate results would be likely obtained. the results of the present study display the cloud computing acceptance at birjand international airport, thus, generalizing it to other airports should be carefully made with sufficient knowledge. also, this study was conducted on personnel working in an international airport in eastern iran, therefore it cannot be generalized to the entire community affiliated with the iranian aviation industry. another limitation of this study is the lack of similar research of cloud computing acceptance on airport personnel to compare the results. references [1] y. a. qasem, r. abdullah, y. y. jusoh, r. atan, & s. 
asadi, “cloud computing adoption in higher education institutions: a systematic review”, ieee access, vol. 7, pp. 63722–63744, 2019.
[2] v. i. chang, “a proposed framework for cloud computing adoption”, in sustainable business: concepts, methodologies, tools, and applications, igi global, pp. 978–1003, 2020.
[3] w. yaokumah, & r. a. amponsah, “examining the contributing factors for cloud computing adoption in a developing country”, in cloud security: concepts, methodologies, tools, and applications, igi global, pp. 1663–1685, 2019.
[4] p. mell & t. grance, “the nist definition of cloud computing. nist”, (accessed: http://www.crs.nist.gov/publications/nistpubs/800-145/sp800-145.pdf), 2011.
[5] sun microsystems. 2009. introduction to cloud computing architecture (white paper). sun microsystems.
[6] m. armbrust et al, “above the clouds: a berkeley view of cloud computing”, uc berkeley technical report, 2009.
[7] v. ghobadpour, n. naghshineh, a. sabetpour, “from cloud computing to cloud library: proposing cloud model to configure future libraries”, iranian journal of information processing and management, vol. 28, no. 4, pp. 859–877, 2014.
[8] m. wright and d. charlett, “new product diffusion models in marketing: an assessment of two approaches”, marketing bulletin, vol. 6, pp. 32–41, article 4, 1995.
[9] v. venkatesh, m. g. morris, g. b. davis & f. d. davis, “user acceptance of information technology: toward a unified view”, mis quarterly, pp. 425–478, 2003.
[10] h. kardan moghaddam and b. aghdas asghari, “comparison of media and digital knowledge among university of technology birjand (iran), pune in india and university of liege in belgium”, asian journal of information technology, vol. 14, pp. 268–272, 2015.
[11] m. fishbein, & i. ajzen, belief, attitude, intention and behavior: an introduction to theory and research, 1975.
[12] i.
ajzen, “the theory of planned behavior”, organizational behavior and human decision processes, vol. 50, no. 2, pp. 179–211, 1991. [13] f. d. davis, “perceived usefulness, perceived ease of use, and user acceptance of information technology”, mis quarterly, pp. 319–340, 1989. [14] l. tornatzky, & m. fleischer, the processes of technological innovation, 1990. [15] e. m. rogers, diffusion of innovations: simon and schuster, 2010. [16] t. oliveira, m. thomas & m. espadanal, “assessing the determinants of cloud computing adoption: an analysis of the manufacturing and services sectors,” information & management, pp. 497–502, 2014. [17] h. f. lin, “understanding the determinants of electronic supply chain management system adoption: using the technology–organization–environment framework”, technological forecasting & social change, pp. 80–92, 2014. [18] m. stieninger, d. nedbal, w. wetzlinger, g. wagner, & m. a. erskine, “impact on the organizational adoption of cloud computing: a reconceptualization of influencing factors”, procedia technology, vol. 16, pp. 58–93, 2014. [19] t. kihara, & d. gichoya, “adoption and use of cloud computing in small and medium enterprises in kenya”, in proceedings of the the ist-africa conference and exhibition (ist-africa), 2013. [20] felix n. njeh, “cloud computing: an evaluation of the cloud computing adoption and use model”, doctor of science in computer science department of computer science college of arts and science a dissertation submitted to the graduate school bowie state university, 2014. [21] b. bernstein, and p. j. singh, “innovation generation process: applying the adopter categorization model and concept of “chasm” to better understand social and behavioral issues”, european journal of innovation management, vol. 11, no. 3, pp. 366–388. [22] h. gangwar, h. date, & r. ramaswamy, “understanding determinants of cloud computing adoption using an integrated tam-toe model”, journal of enterprise information management, vol. 28, no. 
1, pp. 107–130, 2015. [23] https://birjand.airport.ir [24] https://iranairems.com/ [25] www.world-airport-codes.com [26] e. rogers, diffusion of innovations (4 ed.). new york: the free press, 1995. [27] n. ebadolahi, m. cheshmeh sohrabi, f. nooshinfard, “analysis of technological factors influencing on adoption based on rogers' diffusion of innovation theory: case study”, namaye nashryat software, vol. 7, no. 26, pp. 79–92, 2014. [28] e. rogers, and f. f. shoemaker, “communication of innovations: a cross-cultural approach, 2nd ed.”, 1971. [29] e.m. rogers, diffusion of innovations, 5th ed., free press, new york, ny, 2003. [30] t. s. behrend, e. n. wiebe, j. e. london, & e. c. johnson, “cloud computing adoption and usage in community colleges”, behaviour & information technology, vol. 30, no. 2, pp. 231–240, 2011. [31] n. opitz, t. f. langkau, n. h. schmidt, & l. m. kolbe, “technology acceptance of cloud computing: empirical evidence from german it departments”, in proceedings of the ieee 45th hawaii international conference on system science (hicss), pp. 1593–1602, 2012. [32] s. t. alharbi, “users’ acceptance of cloud computing in saudi arabia: an extension of technology acceptance model”, international journal of cloud applications and computing (ijcac), vol. 2, no. 2, pp. 1–11, 2012. [33] m. akbari, m.c. sohrabi, e.a. zanjani, “analysis of search engines and meta search engines' position by university of isfahan users based on rogers' diffusion of innovation theory”, iranian journal of information processing management, vol. 27, no. 4, pp. 961–984, 2012. [34] l. luo, “reference librarians’ adoption of cloud computing technologies: an exploratory study”, internet reference services quarterly, vol. 17, no. 3-4, pp. 147–166, 2012. [35] g. r. heidari, m. bagher, a. aghdam, a. 
hamdipour, “factors affecting the adoption of electronic information resources (eir) by iranian knowledge and information science faculty members, based on https://www.magiran.com/volume/84990 https://www.emerald.com/insight/search?q=boaz%20bernstein https://www.emerald.com/insight/publication/issn/1460-1060 https://www.emerald.com/insight/publication/issn/1460-1060 https://sciexplore.ir/documents/details/220-707-875-254?title=analysis%20of%20search%20engines%20and%20meta%20search%20engines%27%20position%20by%20university%20of%20isfahan%20users%20based%20on%20rogers%27%20diffusion%20of%20innovation%20theory https://sciexplore.ir/documents/details/220-707-875-254?title=analysis%20of%20search%20engines%20and%20meta%20search%20engines%27%20position%20by%20university%20of%20isfahan%20users%20based%20on%20rogers%27%20diffusion%20of%20innovation%20theory 482 h. kardanmoghaddam, a. rajaei, s. f. fatemi roger's diffusion of innovation theory”, library and information science, vol. 16, no. 3, 2013, (in persian). [36] h. p. fung, “using porter five forces and technology acceptance model to predict cloud computing adoption among it outsourcing service providers”, internet technologies and applications research itar, vol. 1, no. 2, pp. 18–24, 2013. [37] l. sun, h. dong, f. khadeer hussain, o. khadeer hussain, e. chang, “cloud service selection: state-of-the-art and future research directions”, journal of network and computer application, vol. 45, pp. 134–150, 2014. [38] m. yuvaraj, “examining librarians’ behavioural intention to use cloud computing applications in indian central universities”, annals of library and information studies (alis), vol. 60, no. 4, pp. 260–268, 2014. [39] r. rai, g. sahoo, and s. mehfuz, “exploring the factors influencing the cloud computing adoption: a systematic study on cloud migration,” springerplus, vol. 4, no. 1, p. 197, 2015. [40] g. mangai, p. ganesan, & kumar d. 
kirana, “a perspective study of cloud computing in library services”, in proceedings of the international conference on library space and content management for networked society, (ic.liscom-2014), 18-20 october 2014, under the auspices of dvk central library, dharmaram vidya kshetram, bangalore. [41] b. scholtz, j. govender, & j. m. gomez, “technical and environmental factors affecting cloud computing adoption in the south african public sector”, in proceedings of the international conference on information resources, 2016. [42] c. valmohammadi, m. mazaheri, “clarification of factors affecting the decision to use cloud computing among irib employees based on a technology acceptance model”, it management studies, vol. 5, no. 19, pp. 105–124, 2017. [43] m. yuvaraj, “ascertaining the factors that influence the acceptance and purposeful use of cloud computing in medical libraries in india", new library world, vol. 117, no. 9/10, pp. 644–658, 2016 [44] j.h. dahooie, n. mohammadi, a. vanaki, & m. jamali, “developing proper systems for successful cloud computing implementation using fuzzy aras method (case study: university of tehran faculty of new science and technology)”, journal of information technology management, vol. 9, no. 4, pp. 759–786, 2017. [45] h. m. sabi, f. m. e. uzoka, k. langmia, f. n. njeh, & c. k. tsuma, “a cross-country model of contextual factors impacting cloud computing adoption at universities in sub-saharan africa”, information systems frontiers, vol. 20, no. 6, pp. 1381–1404, 2018. [46] c. changchit, & c. chuchuen, “cloud computing: an examination of factors impacting users’ adoption”, journal of computer information systems, vol. 58, no. 1, pp. 1–9, 2018. [47] elzamly, a., messabia, n., doheir, m., mahmoud, a., basari, a. s. b. h., selmiya, n. a., & al-shami, s. s. a. (2019). adoption of cloud computing model for managing e-banking system in banking organizations. international journal of advanced science and technology, 28(1), 318–326. 
retrieved from http://sersc.org/journals/index.php/ijast/article/view/397 [48] e. alidoust ghahfarokhi, a. safarpour, & a. amani samani, “identifying effective factors in cloud computing adoption among staff of physical education faculties in tehran”, journal of human resource management in sport, vol. 7, no. 2, pp. 245–263, 2020. [49] a. albelaihi, n. khan, “top benefits and hindrances to cloud computing adoption in saudi arabia: a brief study”, journal of information technology management, vol.12, no. 2, pp. 107–122, 2020. [50] d. h. tesema, “cloud computing adoption challenge in case of commercial bank of ethiopia”, international journal of development research, vol. 10, no. 1, pp. 33562–33565, 2020. [51] o. ali, a. shrestha, m. ghasemaghaei, et al., “assessment of complexity in cloud computing adoption: a case study of local governments in australia”, inf syst front, 2021. [52] a. farmanlu lilab, “analysis of the affecting factors on cloud computing adoption of the university of tabriz and medical sciences based on rogers diffusion of innovation theory”, journal of academic librarianship and information research, vol. 52, no. 4, pp. 39–58, 2019 [53] a. hamdipour, r. zavaraqi, “investigation of factors affecting the adoption research information management system (simap) by faculty members of the tabriz university: application of innovations diffusion theory”, library and information sciences, vol. 21, no. 2, pp. 131–164, 2018. [54] f. nazari, f. khosravi, f. babolhavaeji, (2012), “characteristics of online database perceptions and its compliance by female faculty members”, scientific research quarterly of woman and culture, vol. 4, no. 13, pp. 95–107, 2012 (in persian). [55] t. do, “rogers’ five main attributes of innovation on the adoption rate of online learning hawaii pacific university”, dissertation ma, 2008. [56] s. a. 
al-gahtani, “computer technology success factors in saudi arabia: an exploratortudy”, journal of global information technology management, vol. 7, no. 1, pp. 5-29, 2004. design of microwave waveguide filters with effects of fabrication imperfections facta universitatis series: electronics and energetics vol. 30, n o 4, december 2017, pp. 431 458 doi: 10.2298/fuee1704431m design of microwave waveguide filters with effects of fabrication imperfections  marija mrvić, snežana stefanovski pajović, milka potrebić, dejan tošić university of belgrade, school of electrical engineering, belgrade, serbia abstract. this paper presents results of a study on a bandpass and bandstop waveguide filter design using printed-circuit discontinuities, representing resonating elements. these inserts may be implemented using relatively simple types of resonators, and the amplitude response may be controlled by tuning the parameters of the resonators. the proper layout of the resonators on the insert may lead to a single or multiple resonant frequencies, using single resonating insert. the inserts may be placed in the e-plane or the h-plane of the standard rectangular waveguide. various solutions using quarter-wave resonators and splitring resonators for bandstop filters, and complementary split-ring resonators for bandpass filters are proposed, including multi-band filters and compact filters. they are designed to operate in the x-frequency band and standard rectangular waveguide (wr-90) is used. besides three dimensional electromagnetic models and equivalent microwave circuits, experimental results are also provided to verify proposed design. another aspect of the research represents a study of imperfections demonstrated on a bandpass waveguide filter. fabrication side effects and implementation imperfections are analyzed in details, providing relevant results regarding the most critical parameters affecting filter performance. 
the analysis is primarily based on software simulations, which shorten and improve the design procedure. measurement results, however, provide additional validation of the approach and confirm the conclusions regarding the crucial phenomena affecting the filter response.

key words: bandpass filter, bandstop filter, multi-band filter, printed-circuit discontinuity, equivalent circuit, fabrication effects

1. introduction

a great diversity of microwave filters is found in modern communication systems. the continuous improvement of these systems calls for microwave filters that combine low cost, compact size, low loss and operation in several frequency bands. this topic therefore still attracts significant attention in microwave engineering.

received february 18, 2017
corresponding author: marija mrvić, school of electrical engineering, university of belgrade, kralja aleksandra blvd. 73, 11120 belgrade, serbia (e-mail: marija.mrvic@gmail.com)

a filter design procedure consists of several steps: specification, approximation, synthesis, simulation model, implementation, study of imperfections and optimization [1, 2]. the purpose of each step can be briefly explained as follows [3]. design starts by setting the criteria (a filter specification) to be met for the intended application. the specification should be represented mathematically, so an approximation is needed, which is in fact a filter transfer function. at that point the filter simulation model and the filter prototype (a fabricated device) may be introduced and evaluated. a study of imperfections is then performed to investigate the various effects and phenomena caused by the real components used for the filter implementation. finally, optimization may be used for systematic numerical tuning of the filter parameters to meet the specification.
amongst the available filter manufacturing technologies, rectangular waveguides are attractive for communication systems such as radar and satellite systems, owing to their ability to handle high power with low losses [4]. in this technology, bandstop and bandpass filters can be easily implemented with properly employed feeders [5]. filters are designed by inserting discontinuities into the e-plane or h-plane of the rectangular waveguide. various types of resonators, relatively simple to design and fabricate, can be used on these discontinuities to obtain resonating inserts with a single or multiple resonant frequencies. for e-plane filters, it is important to properly couple the resonators with the same resonant frequency and to decouple the resonators operating at different frequencies. for h-plane filters, on the other hand, it is important to decouple the resonators with different resonant frequencies on the same insert and to properly implement the inverters between the resonators with the same resonant frequency [6].

in this paper, various types of bandstop and bandpass waveguide filters, with single or multiple frequency bands, are presented and their characteristics are analyzed in detail. the proposed filters are designed to operate in the x frequency band (8.2–12.4 ghz); therefore the standard rectangular waveguide wr-90 (inner cross-section dimensions: width a = 22.86 mm, height b = 10.16 mm) is used and the dominant mode of propagation, te10, is considered. both e-plane and h-plane filters are presented. split-ring resonators (srrs) and quarter-wave resonators (qwrs) are used for the bandstop filter design, and complementary split-ring resonators (csrrs) for the bandpass filter design. along with the three-dimensional electromagnetic (3d em) models, equivalent microwave circuits are generated and, for the chosen examples, the obtained results are also experimentally verified.
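the waveguide figures quoted above can be checked with a few lines of code. the following is a minimal sketch, assuming an ideal air-filled, lossless rectangular waveguide and the standard te10 relations (cutoff fc = c/(2a), guided wavelength and wave impedance scaled by sqrt(1 - (fc/f)^2)); the function names are illustrative and do not come from the paper.

```python
import math

C0 = 299_792_458.0    # speed of light in vacuum, m/s
ETA0 = 376.730        # free-space wave impedance, ohm
A_WR90 = 22.86e-3     # wr-90 inner width a, m (value quoted in the text)


def te10_cutoff(a=A_WR90):
    """cutoff frequency (hz) of the te10 mode: fc = c / (2a)."""
    return C0 / (2.0 * a)


def _dispersion(f_hz, a):
    """factor sqrt(1 - (fc/f)^2); only defined above cutoff."""
    fc = te10_cutoff(a)
    if f_hz <= fc:
        raise ValueError("frequency must be above the te10 cutoff")
    return math.sqrt(1.0 - (fc / f_hz) ** 2)


def guided_wavelength(f_hz, a=A_WR90):
    """guided wavelength (m) of the te10 mode."""
    return (C0 / f_hz) / _dispersion(f_hz, a)


def wave_impedance(f_hz, a=A_WR90):
    """te10 wave impedance (ohm)."""
    return ETA0 / _dispersion(f_hz, a)


for f_ghz in (9.0, 10.0, 11.0, 11.1):
    f = f_ghz * 1e9
    print(f"{f_ghz:5.1f} ghz: lambda_g = {guided_wavelength(f) * 1e3:6.2f} mm, "
          f"z_te = {wave_impedance(f):6.1f} ohm")
```

under these assumptions the te10 cutoff is about 6.56 ghz and the next mode (te20, at twice that value) lies above 12.4 ghz, so the whole x band is single-mode. the computed wave impedances land within about 1 Ω of the 550 Ω, 500 Ω and 468 Ω port impedances used in the paper at 9, 10 and 11.1 ghz.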
bearing in mind the operational frequency band and the implementation technology, these filters can be used as components of radar and satellite systems of various purposes [3].

a study of imperfections, based on an investigation of fabrication side effects, is also presented and exemplified. a waveguide resonator and a third-order bandpass waveguide filter are analyzed in detail in terms of implementation imperfections, including the implementation technology, the tolerance of the machine used for fabrication, and the positioning of the inserts inside the waveguide. this investigation provides relevant results regarding the most critical parameters influencing the filter performance. it is based on software simulations, which shorten and improve the design procedure, and is verified by measurements on a laboratory prototype.

2. bandstop filter design

bandstop filters, as key components in rf/microwave communication systems, have the important task of rejecting unwanted signals [7]. they can be easily implemented by inserting discontinuities into the e-plane or h-plane of the rectangular waveguide. the authors of [8] present an h-plane filter using horizontal and vertical stepped thin wire conductors connecting the opposite waveguide walls. the usefulness of srrs is verified for compact waveguide h-plane filter design in [9-13] and for e-plane filter design in [14-17].

in this section, e-plane and h-plane bandstop waveguide filters are discussed. both types of filters use printed resonators in the form of qwrs and srrs. compact size and independent control of the designed stopbands are common features of the presented filters. independently tunable stopbands are achieved in different ways for the two filter types, so detailed design procedures and results are presented for each.

2.1. e-plane bandstop waveguide filters using qwrs

the e-plane single-band filter design using qwrs, presented in [18], is expanded to multi-band bandstop filter design [19]. first, we consider the waveguide qwr shown in fig. 1a, designed for the resonant frequency f0 = 11 ghz. the qwr is printed on the upper side of the substrate and connected to the lower waveguide wall.

fig. 1 a) waveguide qwr: 3d model and equivalent circuit, b) comparison of amplitude responses (s11 and s21 in db versus f in ghz) for the 3d model and equivalent electrical circuit of the qwr

fiberglass/ptfe resin laminate (tle-95) (www.taconic-add.com) is chosen as the substrate to implement the e-plane inserts. the parameters of the substrate are: εr = 3, h = 0.11176 mm, tanδ = 0.0028, t = 0.0175 mm. the metal losses due to the skin effect and surface roughness are taken into account by setting the conductivity to σ = 20 ms/m.

the equivalent-circuit model of the waveguide qwr is shown in fig. 1a. simulated results for the 3d em model of the waveguide qwr and its equivalent circuit are compared in fig. 1b. the values of the circuit elements are calculated using equation (1), as proposed in [6]:

$$ r = \frac{2 z_0\, s_{11}(j\omega_0)}{1 - s_{11}(j\omega_0)}, \qquad l = \frac{2 b_{3\mathrm{db}}\, z_0\, s_{11}(j\omega_0)}{\omega_0^2}, \qquad c = \frac{1}{2 b_{3\mathrm{db}}\, z_0\, s_{11}(j\omega_0)}, \tag{1} $$

where ω0 denotes the angular resonant frequency (rad/s), b3db is the 3-db bandwidth (rad/s), and s11(jω0) is the value of the s11 parameter at the considered resonant frequency. the port impedances z0 correspond to the wave impedance of the waveguide at the resonant frequency f0 = 9 ghz (550 Ω).

the quality factor (q-factor) is an important parameter that characterizes a microwave resonator. a detailed determination of the q-factor for the considered resonator is given in [19].
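equation (1) can be exercised numerically. the sketch below assumes that the equivalent circuit of the qwr is a parallel rlc resonator connected in series with a line of impedance z0, which is the reading of equation (1) that reproduces the l and c values the paper reports for the 9 ghz filter; the input values in the usage lines (|s11| = 0.9, b3db = 400 mhz) are hypothetical and serve only to show the round trip.

```python
import math


def bandstop_rlc(s11_mag, f0_hz, b3db_hz, z0):
    """r, l, c of a parallel rlc in series with the line (assumed form of eq. (1)).

    s11_mag : |s11| at resonance (0 < s11_mag < 1)
    f0_hz   : resonant frequency in hz
    b3db_hz : 3-db bandwidth in hz (converted to rad/s internally)
    z0      : port (wave) impedance in ohm
    """
    if not 0.0 < s11_mag < 1.0:
        raise ValueError("|s11| must lie strictly between 0 and 1")
    w0 = 2.0 * math.pi * f0_hz
    b = 2.0 * math.pi * b3db_hz
    r = 2.0 * z0 * s11_mag / (1.0 - s11_mag)
    l = 2.0 * b * z0 * s11_mag / w0 ** 2
    c = 1.0 / (2.0 * b * z0 * s11_mag)
    return r, l, c


def resonant_frequency(l, c):
    """resonant frequency (hz) of an lc pair."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l * c))


def s11_at_resonance(r, z0):
    """|s11| of a series resistance r inserted in a matched line of impedance z0."""
    return r / (r + 2.0 * z0)


# hypothetical inputs, not taken from the paper
r, l, c = bandstop_rlc(s11_mag=0.9, f0_hz=9e9, b3db_hz=400e6, z0=550.0)
print(f"r = {r:.1f} ohm, l = {l * 1e9:.3f} nh, c = {c * 1e12:.4f} pf")
```

a useful sanity check is that the element values quoted in the paper for the 9 ghz resonator (l = 0.757 nh, c = 0.4136 pf) resonate within a few mhz of 9 ghz when passed to `resonant_frequency`.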
the obtained q-factors are ql = 22.5 for the loaded resonator and qu = 175.34 for the unloaded resonator.

fig. 2 e-plane waveguide bandstop filter a) 3d model, b) equivalent microwave circuit

2.1.1. bandstop waveguide filter and equivalent circuit

the bandstop waveguide filter using the presented qwrs is shown in fig. 2a. a printed-circuit insert consisting of two identical qwrs is placed in the e-plane of the rectangular waveguide. the center frequency of the bandstop filter can be set by adjusting the length of the qwrs. the qwrs are grounded to the lower waveguide wall, and the spacing between them yields the desired bandwidth. for the considered filter, the center frequency is f0 = 9 ghz and the qwrs are spaced 8.5 mm apart to achieve a bandwidth of 570 mhz.

fig. 3 comparison of amplitude responses (s11 and s21 in db versus f in ghz) for the bandstop filter (fig. 2a) and its equivalent circuit (fig. 2b)

the equivalent circuit of the bandstop waveguide filter using qwrs is shown in fig. 2b. to develop the equivalent circuit, the qwrs are represented by mutually coupled lc resonators. the coupling is composed of the following elements: an inductor (lm) provides the magnetic part of the coupling and a capacitor (cm) provides the electric part of the coupling. the values of l and c are found from equation (1). the waveguide section of length w1 spans the distance between the middle parts of the qwrs in the 3d em model; in the calculations, it is replaced by an equivalent transmission line of characteristic impedance zc = 550 Ω and electrical length θ = 1.60 rad at 9 ghz.
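the electrical length of the connecting section follows from its physical length and the te10 phase constant. the sketch below is a minimal check under the assumption of an ideal air-filled wr-90 guide; it uses the length w1 = 12.35 mm that the paper lists among the circuit element values, and the function name is illustrative.

```python
import math

C0 = 299_792_458.0    # speed of light in vacuum, m/s
A_WR90 = 22.86e-3     # wr-90 inner width a, m


def electrical_length(length_m, f_hz, a=A_WR90):
    """electrical length theta = beta * length (rad) for the te10 mode."""
    fc = C0 / (2.0 * a)
    if f_hz <= fc:
        raise ValueError("frequency must be above the te10 cutoff")
    # te10 phase constant: beta = (2*pi*f/c) * sqrt(1 - (fc/f)^2)
    beta = (2.0 * math.pi * f_hz / C0) * math.sqrt(1.0 - (fc / f_hz) ** 2)
    return beta * length_m


theta = electrical_length(12.35e-3, 9e9)   # w1 = 12.35 mm at 9 ghz
print(f"theta = {theta:.3f} rad")
```

under these assumptions the result agrees with the quoted θ = 1.60 rad to within rounding, i.e. the section is close to a quarter of the guided wavelength (π/2 rad) at 9 ghz.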
having determined all these parameters, we can find the values of the coupling elements (lm, cm) using equation (2):

(2) [closed-form expressions for f1 and f2 as functions of l, lm, c, cm, zc and θ, containing cot, csc, tan, cos and sign terms of θ and θ/2; the expressions are not legible in this copy]

this equation is derived for the resonant frequencies (f1, f2) of the coupled qwrs. the numerical values of these resonant frequencies are found for the unloaded coupled resonators in the 3d em model. the values of the circuit elements are as follows: l = 0.757 nh, lm = 0.00371 nh, c = 0.4136 pf, cm = 0.00038 pf, w1 = 12.35 mm and we1 = 5.255 mm. fig. 3 shows the comparison of the simulated amplitude responses for the 3d em model and the equivalent circuit model of the waveguide bandstop filter.

2.1.2. multi-band bandstop waveguide filter design

to validate the design of e-plane waveguide filters with multiple stopbands, filters with two and three stopbands are designed. the presented filters exhibit independent control of the designed stopbands (icds).

3d models of the non-miniaturized icds (nmicds) and miniaturized icds (micds) dual-band bandstop waveguide filters are shown in fig. 4. the specified center frequencies of the dual-band bandstop filter are f01 = 9 ghz and f02 = 11 ghz. in the nmicds dual-band filter design, all of the printed qwrs are connected to the same waveguide wall.

fig. 4 3d models of the nmicds and micds e-plane dual-band waveguide filters

to eliminate the unwanted coupling between the qwrs for different stopbands, they are separated from each other by a spacing of 12.5 mm.
in that manner, each of the stopbands can be controlled individually, and the whole filter can be regarded as a cascade connection of bandstop filters, each intended for a particular stopband. the overall length of the nmicds filter is 0.876 λg, where λg denotes the guided wavelength at the center frequency of the lower stopband.

to reduce the footprint of the nmicds filter, the qwrs for different stopbands are connected to different waveguide walls, which is a relatively simple way to implement the micds dual-band bandstop waveguide filter. the amplitude responses of the nmicds and micds filters match exactly. for the micds filter, the unwanted coupling is overcome by shifting the qwrs for the specified stopband along the upper waveguide wall. the minimal value of the shift was found to be 12 mm. even so, the overall length decreases to 0.512 λg.

the equivalent microwave circuit of the nmicds dual-band bandstop filter is a cascade of the equivalent networks of the single-band filters (fig. 2b) with the specified center frequencies, and it is shown in fig. 5a. the port impedances are set to 500 Ω, which corresponds to the wave impedance at 10 ghz (the frequency midway between the considered center frequencies). the values of the equivalent circuit elements of the filter at 9 ghz remain unchanged, while the circuit element values for the filter at 11 ghz are: l2 = 0.518 nh, lm2 = 0.001122 nh, c2 = 0.4036 pf, cm2 = 0.0007 pf, w2 = 10.98 mm, wm = 15.92 mm and we = 2.705 mm. the amplitude responses of the 3d em model and its equivalent circuit are compared in fig. 5b.

fig. 5 a) equivalent microwave circuit of the nmicds filter from fig. 4, b) comparison of amplitude responses (s11 and s21 in db versus f in ghz) for the 3d em model of the nmicds filter and its equivalent circuit

fig. 6 a) tbbwf, b) comparison of amplitude responses (s11 and s21 in db versus f in ghz) for the icds tbbwf and the single-band filters for each specified center frequency (9 ghz, 10 ghz, 11 ghz)

following the proposed design guidelines, a triple-band bandstop waveguide filter (tbbwf) is designed for the specified center frequencies f01 = 9 ghz, f02 = 10 ghz and f03 = 11 ghz. the middle stopband is obtained by adding a pair of identical qwrs whose length is tuned to resonate at f0 = 10 ghz. the tbbwf thus consists of alternating pairs of qwrs for different stopbands, attached to the top and bottom waveguide walls. the 3d model of the tbbwf is shown in fig. 6a. in the proposed design of the filter with three stopbands, the qwrs for the second and third stopbands are connected to the same waveguide wall, while the qwrs for the first stopband are grounded to the opposite waveguide wall. the distances between the qwrs are set to secure independent control of the stopbands. a comparison of the amplitude responses for the tbbwf and the single-band filters for each specified center frequency is given in fig. 6b. the total length of the tbbwf is 0.86 λg, λg being the guided wavelength at the lowest center frequency.

fig. 7 a) 3d model of the uc dual-band bandstop waveguide filter, b) comparison of simulated amplitude responses (s11 and s21 in db versus f in ghz) for the uc and icds dual-band waveguide filters

2.1.3. miniaturization

further miniaturization of the micds dual-band bandstop filter is achieved through several steps.
the 3d model of the presented ultra-compact (uc) dual-band bandstop filter is shown in fig. 7a. some of the geometric parameters are given symbolically in order to investigate their impact on the filter response. the qwrs for different stopbands are printed on different sides of the insert. the aim was to preserve the characteristics of the micds filter while reducing the length of the filter. the overall length of the uc dual-band bandstop filter is 0.295 λg. the proximity of the participating qwrs restricts the independent control of the stopbands. a comparison of the simulated amplitude responses for the uc and icds dual-band bandstop waveguide filters is shown in fig. 7b. the effect of changes in the parameters on the center frequencies and the obtained bandwidths is summarized in table 1.

table 1 influence of the parameters on the response of the uc dual-band filter

parameter (mm)   f01 (ghz)   b3db1 (mhz)   f02 (ghz)   b3db2 (mhz)
c21 ↑            −           −             ↓           ↑
c11 ↓            ↓           ↓             ↓           ↑
r2 ↓             −           ↓             −           ↑
r1 ↓             ↑           ↑             ↓           ↑
d11 ↑            ↓           ↑             −           ↑
m ↑              ↑           ↑             −           ↓

two further options for miniaturization were examined: implementing the qwrs in straight form, and increasing the dielectric constant of the substrate used for the qwrs. the filter design with straight qwrs features significantly wider bandwidths than the design in which the qwrs are implemented as folded elements. to preserve the characteristics of the icds filter, the space between the qwrs therefore has to be increased, which results in a filter longer than the micds filter. the same effect is observed for substrates with higher permittivity (εr). since a higher εr makes the printed qwrs shorter, the bandwidth becomes significantly wider. the distance between the qwrs then has to be increased, which in turn increases the length of the filter. as a consequence, that filter is longer than our proposed realization.
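the filter lengths in this section are given in units of the guided wavelength at the lower stopband center frequency. as a minimal sketch, assuming an ideal air-filled wr-90 guide and taking 9 ghz as that frequency, they can be converted to millimetres and compared directly; the dictionary of designs simply collects the values quoted in the text, and the function name is illustrative.

```python
import math

C0 = 299_792_458.0    # speed of light in vacuum, m/s
A_WR90 = 22.86e-3     # wr-90 inner width a, m


def length_mm(length_in_lambda_g, f_hz=9e9, a=A_WR90):
    """convert a length given in guided wavelengths (te10 mode) to millimetres."""
    fc = C0 / (2.0 * a)
    if f_hz <= fc:
        raise ValueError("frequency must be above the te10 cutoff")
    lambda_g = (C0 / f_hz) / math.sqrt(1.0 - (fc / f_hz) ** 2)
    return length_in_lambda_g * lambda_g * 1e3


# overall lengths quoted in the text, in units of lambda_g at the lower stopband
designs = {"nmicds": 0.876, "micds": 0.512, "uc": 0.295}

reference = designs["nmicds"]
for name, electrical in designs.items():
    saving = 100.0 * (1.0 - electrical / reference)
    print(f"{name:7s}: {length_mm(electrical):5.1f} mm "
          f"({saving:4.1f} % shorter than nmicds)")
```

with these assumptions the nmicds filter is roughly 43 mm long and the uc filter roughly 14 mm, i.e. about one third of the non-miniaturized length.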
additional solution for miniaturization is proposed in [20], where connection of the qwrs for specified stopband to the opposite waveguide walls is suggested. 7.5 8 8.5 9 9.5 10 10.5 11 11.5 12 12.5 -40 -35 -30 -25 -20 -15 -10 -5 0 f (ghz) 3d em exp 3d em exp s 21 (db) s 11 (db) (a) (b) fig. 8 a) a photograph of the fabricated e-plane dual-band bandstop waveguide filter. b) comparison of the simulated and measured results 2.1.4. experimental verification in order to demonstrate the effectiveness of the proposed design, the e-plane dualband bandstop filter is verified on a fabricated prototype (fig. 8a). the amplitude response was measured using agilent n5227 network analyzer. fig. 8b shows comparison between the measured and simulated amplitude responses for the dual-band bandstop 440 m. mrvić, s. stefanovski pajović, m. potrebić, d. tošić filter. measured response is in good agreement with the 3d em simulation results. slight discrepancies are observed in terms of the passband insertion loss, which occurred as a consequence of the losses within the waveguide walls and transitions from waveguide wr-90 to sma connectors (waveguide-to-coaxial adapters). these losses have not been taken into account during the 3d em analysis of the considered filter. 2.2. h-plane bandstop waveguide filters using srrs for the implementation of the h-plane filter, srrs in the form of the printed-circuit inserts are positioned in the transverse plane of the standard wr-90 waveguide [11, 12]. the printed-circuit inserts are implemented using copper clad ptfe/woven glass laminate (tlx-8) with the parameters: εr = 2.55, tanδ = 0.0019, h = 1.143 mm and t = 0.018 mm. the losses due to the skin effect and surface roughness are taken into account by setting the conductivity to ζ = 20 ms/m. 10 10.5 11 11.5 12 -30 -25 -20 -15 -10 -5 0 f [ghz] circuit circuit 3d em 3d em s 21 [db] s 11 [db] (a) (b) fig. 9 srr: a) 3d and equivalent circuit model. b) comparison of amplitude responses 2.2.1. 
waveguide srr 3d model of the considered h-plane waveguide srr is presented in fig. 9a. it is designed for resonant frequency of 11 ghz, so appropriate dimensions are given. equivalent circuit model is also presented in fig. 9a, and the values of the circuits’ elements are obtained using the equation (1). comparison of amplitude responses for the 3d em model and its equivalent circuit is shown in fig. 9b. 2.2.2. third-order bandstop waveguide filter using srrs a third-order bandstop waveguide filter using srrs is designed for the center frequency f0 = 11 ghz [11, 12]. 3d model of the filter is shown in fig. 10a, and its response is given in fig 10b. the h-plane inserts are separated by the waveguide section of length of λg11ghz/4 = 8.494 mm, to implement the quarter-wave inverters for the center frequency. design of microwave waveguide filters with effects of fabrication imperfections 441 8 8.5 9 9.5 10 10.5 11 11.5 12 -30 -25 -20 -15 -10 -5 0 f [ghz] s 11 [db] s 21 [db] (a) (b) fig. 10 h-plane bandstop filter using srrs: a) 3d model b) amplitude response equivalent microwave circuit of the third-order bandstop filter is shown in fig. 11a, and fully corresponds to the 3d em model of the filter. in the presented circuit, losses are not taken into account. values of the elements of the circuit are calculated using equation (1). comparison of the amplitude responses for the 3d em model and the equivalent microwave circuit is presented in fig. 11b. a good agreement between the results is observed in terms of the center frequency and the obtained bandwidth. 9 9.5 10 10.5 11 11.5 12 12.5 13 -40 -35 -30 -25 -20 -15 -10 -5 0 f [ghz] circuit circuit 3d em 3d em s 21 [db] s 11 [db] (a) (b) fig. 11 a) equivalent microwave circuit of the h-plane bandstop filter. b) comparison of the amplitude responses for the 3d em model and the equivalent microwave circuit 2.2.3. 
third-order dual-band bandstop filter using srrs

to verify the usefulness of the design, a third-order h-plane dual-band bandstop filter is proposed for the center frequencies f01 = 9 ghz and f02 = 11 ghz [11, 12]. the 3d model of the filter is shown in fig. 12a. srrs for different stopbands are separated by quarter-wavelength waveguide sections to realize the immittance inverters for the corresponding center frequency. thus, the designed stopbands can be controlled independently. srrs for the different stopbands are distanced by (λg9ghz − λg11ghz)/4 = 3.678 mm.

fig. 12 h-plane dual-band bandstop filter: a) 3d model b) amplitude response (s11 and s21 [db] versus f [ghz] for the 9 ghz, the 11 ghz, and the combined 9 and 11 ghz designs)

3. bandpass filter design

bandpass waveguide filters can be designed using inserts with different types of resonators. the inserts may be placed in the e-plane or the h-plane of the rectangular waveguide. herein, bandpass waveguide filters using h-plane inserts with csrrs as resonating elements are considered. being relatively simple to model and fabricate, and providing a bandpass frequency response, csrrs are widely used for bandpass waveguide filter design. they allow the frequency response to be controlled by modifying their parameters, thus providing a flexible design.

some of the previously reported solutions can be found in the open literature. in [21], the use of a csrr for the h-plane bandpass design is demonstrated. a third-order bandpass filter using csrrs is presented in [22], while a compact solution can be found in [23].

3.1. resonating inserts with csrrs

a resonating insert with a csrr, placed in the h-plane of the standard rectangular waveguide (wr-90), is assumed to be the basic element of the higher-order filters. therefore, various implementations of the waveguide resonators with such inserts are possible.
first, a waveguide resonator using a multi-layer planar insert with a csrr is shown in fig. 13a. the substrate used for the printed-circuit insert is copper-clad polytetrafluoroethylene (ptfe)/woven glass laminate (tlx-8) (http://www.taconic-add.com). the parameters of this substrate are as follows: εr = 2.55, tan δ = 0.0019, h = 1.143 mm and t = 18 μm. the specification of this resonator requires a resonant frequency of f0 = 11.1 ghz and a 3-db bandwidth of b3db = 520 mhz. the equivalent microwave circuit of the waveguide resonator is also given in fig. 13a. the following equations [6, 24] are used for calculation of the circuit parameters:

$$R = \frac{Z_0\, S_{21}(j\omega_0)}{2\,\bigl(1 - S_{21}(j\omega_0)\bigr)}, \qquad L = \frac{B_{3\mathrm{dB}}\, Z_0\, S_{21}(j\omega_0)}{2\,\omega_0^{2}}, \qquad C = \frac{2}{B_{3\mathrm{dB}}\, Z_0\, S_{21}(j\omega_0)}, \qquad (3)$$

where ω0 denotes the angular resonant frequency (rad/s), b3db is the 3-db bandwidth (rad/s), and s21(jω0) is the value of the s21 parameter at the considered resonant frequency. the port impedances correspond to the wave impedance of the waveguide at the resonant frequency f0 = 11.1 ghz (468 Ω).

as shown in fig. 13b, the amplitude response meets the given specification for the chosen csrr dimensions. also, there is good agreement between the amplitude responses of the 3d em model and the equivalent circuit. the printed-circuit insert presented here uses the basic csrr form; however, a csrr may have additional elements for fine-tuning the amplitude response, as exemplified in [24, 25].

fig. 13 waveguide resonator using multi-layer planar insert with csrr: a) 3d model and equivalent microwave circuit, b) comparison of amplitude responses (3d em and circuit): s11 and s21 [db] versus f [ghz]

besides multi-layer planar structures, the resonating insert can be a pure metallic structure, which is even easier to implement (fig. 14a). the thickness of the metal insert is 100 μm.
the conductivity of the metal plates is set to σ = 20 MS/m to include the losses (the surface roughness and the skin effect). this resonator achieves a resonant frequency of f0 = 11.06 ghz (fig. 14b).

fig. 14 waveguide resonator using metal insert with csrr: a) 3d model, b) amplitude response

in the previously considered models, the csrr was centrally positioned on the insert. however, this is not mandatory; in fact, by changing the position of the resonator (besides modifying its parameters) one can influence the frequency response. a relatively simple design of the waveguide resonator using a metal insert with a csrr-like resonator attached to the top waveguide wall [26] is depicted in fig. 15a. the obtained amplitude response, having a bandpass characteristic, is shown in fig. 15b (f0 = 11.06 ghz, b3db = 680 mhz). similarly, the resonator can be attached to the bottom waveguide wall.

a common property of all presented types of inserts is that more than one resonator can be accommodated on the insert, thus allowing for multiple resonant frequencies. in fact, by properly positioning the resonators, each frequency band can be independently tuned by modifying the parameters of a single resonator. this is an important property for multi-band filter design. some of the previously reported printed-circuit discontinuities with multiple resonant frequencies can be found in [3, 6, 24, 26-28].

fig. 15 waveguide resonator using metal insert with csrr: a) 3d model, b) amplitude response

3.2. third-order filter using csrrs

starting from the printed-circuit insert with a csrr, a higher-order filter can be designed. since the resonating circuits are connected in parallel, inverters are needed between them [4, 29]. for the waveguide filter design, an inverter can be deployed as a quarter-wave waveguide section at the center frequency of interest, as explained in [3, 6].
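The resonating circuits mentioned here are the parallel RLC networks whose elements follow from equation (3). As a rough numerical sketch, the snippet below evaluates those expressions for the resonator specification quoted in section 3.1 (f0 = 11.1 ghz, b3db = 520 mhz, port impedance 468 Ω). The value S21 = 0.95 at resonance is purely hypothetical (the paper does not quote it), the function name `rlc_from_s21` is ours, and equation (3) is read as R = Z0·S21 / (2(1 − S21)), L = B3dB·Z0·S21 / (2ω0²), C = 2 / (B3dB·Z0·S21).

```python
import math

def rlc_from_s21(f0_hz, b3db_hz, z0_ohm, s21_mag):
    """Evaluate equation (3) for a shunt parallel RLC resonator.

    s21_mag is the linear value of S21 at resonance (0 < s21_mag < 1).
    """
    if not 0.0 < s21_mag < 1.0:
        raise ValueError("s21_mag must lie strictly between 0 and 1")
    w0 = 2.0 * math.pi * f0_hz      # angular resonant frequency, rad/s
    bw = 2.0 * math.pi * b3db_hz    # 3-dB bandwidth, rad/s
    r = z0_ohm * s21_mag / (2.0 * (1.0 - s21_mag))
    l = bw * z0_ohm * s21_mag / (2.0 * w0 ** 2)
    c = 2.0 / (bw * z0_ohm * s21_mag)
    return r, l, c

# Specification from the text; s21_mag = 0.95 is an assumed, illustrative value.
R, L, C = rlc_from_s21(11.1e9, 520e6, 468.0, 0.95)

# Sanity check: the L and C of equation (3) must resonate at f0.
f_res = 1.0 / (2.0 * math.pi * math.sqrt(L * C))

print(f"R = {R:.1f} ohm, L = {L * 1e12:.3f} pH, C = {C * 1e12:.3f} pF")
print(f"resonance check: {f_res / 1e9:.3f} GHz")
```

Note that the product LC equals 1/ω0² by construction, so the resonance check only confirms that the three expressions were transcribed consistently; it says nothing about the accuracy of the assumed S21 value.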
a third-order bandpass filter with a single pass band is considered as an example of the higher-order filter design. it uses multi-layer planar inserts with csrrs on the same substrate as the insert shown in fig. 13a. the 3d model of the filter is shown in fig. 16a. the filter is designed to meet the following specification: f0 = 11 ghz, b3db = 300 mhz, and the parameters of the csrrs are set accordingly. also, the waveguide sections of length equal to λg 11ghz/4 = 8.49 mm represent inverters between the resonating elements. fig. 16b shows the obtained amplitude response.

3.3. multi-band bandpass filter design

as previously stated, resonating inserts with multiple resonant frequencies can be used for the higher-order multi-band filter design. however, it is necessary to properly design both the inserts and the inverters. this means that each waveguide section representing an inverter has to be of the proper length, equal to λg/4 (λg is the guided wavelength in the waveguide), for each center frequency. therefore, the folded inserts have been introduced as an adequate solution [3, 6, 25, 27, 28], being a novel solution at the same time, compared to the available open literature.

to exemplify the use of the folded inserts for the filter design, a second-order dual-band (f01 = 9 ghz, f02 = 11 ghz) filter model with two multi-layer planar inserts is shown in fig. 17a. as can be seen, the parts of the inserts with csrrs are separated by the proper distance to meet the inverter requirement, and the fold is achieved by adding a metal plate to connect these parts. the substrate used for the inserts is rt/duroid 5880 (εr = 2.2, h = 0.8 mm) (http://www.rogerscorp.com). according to fig. 17a, the lengths of the inverters are λg 9ghz/4 = 12.17 mm and λg 11ghz/4 = 8.49 mm and the metal plate length is lpl = (λg 9ghz − λg 11ghz)/8 = 1.84 mm.
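The inverter and plate lengths quoted above can be cross-checked with the standard TE10 guided-wavelength formula, λg = λ0 / sqrt(1 − (fc/f)²) with fc = c/(2a). The sketch below is only a sanity check under stated assumptions: an air-filled WR-90 waveguide with broad-wall width a = 22.86 mm and a rounded speed of light c = 3·10^8 m/s; the helper name `guided_wavelength` is ours.

```python
import math

C0 = 3.0e8         # speed of light in m/s (rounded value, assumed)
A_WR90 = 22.86e-3  # WR-90 broad-wall width in m

def guided_wavelength(f_hz, a=A_WR90, c=C0):
    """Guided wavelength (m) of the TE10 mode in an air-filled rectangular waveguide."""
    fc = c / (2.0 * a)  # TE10 cutoff frequency
    if f_hz <= fc:
        raise ValueError("frequency is at or below the TE10 cutoff")
    return (c / f_hz) / math.sqrt(1.0 - (fc / f_hz) ** 2)

lg9 = guided_wavelength(9e9)
lg11 = guided_wavelength(11e9)

inv9_mm = lg9 / 4 * 1e3             # quarter-wave inverter at 9 GHz
inv11_mm = lg11 / 4 * 1e3           # quarter-wave inverter at 11 GHz
plate_mm = (lg9 - lg11) / 8 * 1e3   # metal plate length of the folded insert

print(f"lambda_g(9 GHz)/4  = {inv9_mm:.2f} mm")
print(f"lambda_g(11 GHz)/4 = {inv11_mm:.2f} mm")
print(f"plate length       = {plate_mm:.2f} mm")
```

Under these assumptions the three results agree with the quoted 12.17 mm, 8.49 mm and 1.84 mm to within about 0.01 mm; the small residual depends on the value taken for the speed of light.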
for the insert designated as i1, the width of the metal plate corresponds to the waveguide width. the other possibility is to have a narrow plate connecting the resonating inserts (insert i2). in the considered example, the width of the metal plate is set to wpl = 3 mm. the obtained amplitude responses for the filters having both inserts implemented as i1 or i2 (with the same dimensions and positions of the csrrs) are compared in fig. 17b. as can be seen, for the model with i2 inserts, a transmission zero occurs above the upper band. since the dimensions of the csrrs have not been tuned for the i2 insert, the discrepancy between the parameters of the frequency bands is notable; however, the idea is to present the design possibilities and to point out their influence.

fig. 16 third-order bandpass filter using multi-layer planar inserts with csrrs: a) 3d model, b) amplitude response

the previous model may be simplified by using metal inserts [27, 28] instead of the multi-layer planar ones (fig. 18a). the filter is designed to meet the following specification: f01 = 9 ghz, b3db-1 = 450 mhz, f02 = 11 ghz, b3db-2 = 650 mhz. regarding the inverter implementation, the same holds as for the filter in fig. 17a. therefore, the distances between the resonators are λg 9ghz/4 = 12.17 mm and λg 11ghz/4 = 8.49 mm. the length of the metal plate of the folded insert is lpl = (λg 9ghz − λg 11ghz)/8 = 1.84 mm.

an equivalent microwave circuit has been generated for this filter in ni awr microwave office (http://www.awrcorp.com) (fig. 18b). each resonating insert is represented by a network consisting of rlc circuits (one for each csrr) and an inductor connected between them. the inverter is represented by a waveguide section of length equal to λg 9ghz/4, inserted between these networks. the details regarding the equivalent microwave circuit and the equations used for calculation of the lumped-element parameters can be found in [3, 6, 25, 27, 28]. a comparison of the amplitude responses obtained by a 3d em simulation and an equivalent circuit is given in fig. 18c.

fig. 17 second-order dual-band bandpass filter using multi-layer planar inserts with csrrs: a) 3d model, b) amplitude responses (s11 and s21 [db] versus f [ghz] for the models with i1 and i2 inserts)

fig. 18 second-order dual-band bandpass filter using metal inserts with csrrs: a) 3d model, b) equivalent microwave circuit, c) comparison of amplitude responses (circuit and 3d em): s11 and s21 [db] versus f [ghz]

in order to develop a compact filter, the waveguide sections representing inverters may be shortened; however, an additional properly designed insert between the resonating inserts is needed to preserve the original filter response. this way of miniaturization assumes that the normalized lengths of the inverters are the same for both center frequencies, but the resonating inserts are still folded. for the sake of easier fabrication, a solution with flat inserts has been proposed as a preferable one [3, 6, 25, 28] (fig. 19a). as can be seen, the additional insert is still needed; however, the inverters for the center frequencies are not miniaturized in the same manner (the normalized length of the inverter for the csrrs with f01 = 9 ghz is λg 9ghz/8, while the normalized length of the inverter for the csrrs with f02 = 11 ghz is equal to 0.18λg 11ghz). fig. 19b shows a comparison of the amplitude responses before and after applying inverter miniaturization, with the same and different normalized lengths of the inverters, for the considered second-order dual-band filter.
fabrication of the flat metal inserts is relatively simple; however, supporting plates and fixtures are needed in order to have stable inserts inside the waveguide [6, 30]. in [3, 28] a detailed explanation regarding filter fabrication can be found, including the implementation of the structure for precise positioning of inserts. the proposed solution has been successfully deployed for the experimental verification and the measured results have shown good agreement with the simulated ones.

fig. 19 compact second-order dual-band bandpass filter using flat metal inserts: a) 3d model, b) comparison of amplitude responses (s11 and s21 [db] versus f [ghz]) of the filter without inverter miniaturization (blue), with equal (red) and unequal (green) inverter miniaturization

fig. 20 second-order dual-band bandpass filter using metal inserts with csrrs attached to the waveguide walls: a) 3d model, b) amplitude responses (s11 and s21 [db] versus f [ghz] for inserts i1 and i2)

multi-band filters may also be designed using inserts with resonators attached to the waveguide walls, such as the single-resonator example shown in fig. 15a. a second-order dual-band filter with two folded metal inserts is shown in fig. 20a [26]. the dimensions of the resonators are tuned to provide center frequencies of f01 = 9 ghz and f02 = 11 ghz. therefore, the lengths of the inverters are λg 9ghz/4 = 12.17 mm and λg 11ghz/4 = 8.49 mm and the metal plate length is lpl = (λg 9ghz − λg 11ghz)/8 = 1.84 mm. the proposed resonators occupy less space on the insert, compared to the centrally positioned csrrs, for the same resonant frequencies (the occupied area may be reduced by up to 20 %).
for the insert designated as i1, the width of the metal plate corresponds to the waveguide width. the other possibility is to have a narrow plate connecting the resonating inserts (insert i2). in the considered example, the width of the metal plate is set to wpl = 2 mm. the obtained amplitude responses for the filters having both inserts implemented as i1 or i2 (with the same dimensions and positions of the resonators) are compared in fig. 20b. as can be seen, for the model with i2 inserts, a transmission zero occurs above the upper band and better matching is obtained for that band as well, although at the expense of a wider band.

4. bandpass waveguide filter fabrication side effects

an important step of the filter design procedure is certainly experimental verification, i.e. the measurement of the filter response on a fabricated prototype. at that point, the obtained simulation models may be optimized and corrected and another control fabrication may be performed [31]. the fabrication process itself may affect the obtained filter responses; thus a study of imperfections should be carried out in order to estimate the influence of the fabrication side effects on the amplitude response. this topic has already gained attention, since some previously published papers considered the influence of the substrate parameters on the frequency response of microwave structures (e.g. [32]). regarding waveguide filter fabrication and possible deviations of the frequency response, some of the available solutions can be found in [33-38].

in our study, we have considered various implementation imperfections and fabrication side effects influencing the frequency response of the bandpass waveguide filters [31].
since these filters use printed-circuit inserts as discontinuities, we have taken into account the parameters of the substrate (dielectric permittivity, thickness, losses, including the tolerances) used for the multi-layer planar inserts. furthermore, the machine used for fabrication may introduce some inaccuracy and imperfections during the fabrication of the inserts. finally, it is not always possible to have stable and perfectly positioned inserts in the waveguide during the measurement and regular operation, so this should also be taken into account when investigating filter response deviation.

our goal was to investigate the influence of the aforementioned imperfections on the bandpass waveguide filter amplitude response by making precise 3d em models, which included the considered effects, and by performing software simulations. in this manner, we were able to estimate the influence of various effects and phenomena on the filter response and draw conclusions regarding the most relevant ones. an additional advantage of this method of investigation is that the majority of settings can be made in software, without unnecessary fabrications, thus shortening the filter design procedure. the experimental verification of the chosen models has confirmed the simulated results, showing good mutual agreement, and thus also confirming the proposed method of investigation.

we have considered a waveguide resonator using a single csrr (fig. 13a) and a third-order filter, as a more complex structure using three multi-layer planar inserts with csrrs (fig. 16a). in both cases, the substrate used for the inserts is copper-clad polytetrafluoroethylene (ptfe)/woven glass laminate (tlx-8), with the following nominal values of the substrate parameters and the tolerances: εr = 2.55 ± 0.04, tan δ = 0.0019 ± 0.001, h = 1.143 ± 0.05715 mm, t = 18 μm (http://www.taconic-add.com/). the conductivity of the metal plates was set to σ = 20 MS/m to include the losses (the surface roughness and the skin effect).
for the modeling of the waveguide structures, wipl-d software has been used (http://www.wipl-d.com/), to make precise models with various effects included and to perform full-wave simulations of metallic and dielectric structures [39]. for the fabrication of the printed-circuit inserts, a mits electronics fp21-tp machine (http://www.mitspcb.com/) has been used. according to the manufacturer’s specification, the precision of the machine can be specified as follows: the minimum achievable microstrip line width is 50 μm and the minimum gap between microstrip lines is 50 μm. the csrrs have been made using a milling process. all filter response measurements have been performed on the agilent n5227a network analyzer.

in order to investigate the influence of the considered effects and phenomena, we have analyzed the filter response deviation. this deviation can be described as the difference between the nominal value of the observed parameter of the amplitude response (center frequency, bandwidth, insertion loss) and the value obtained when some of the fabrication side effects are taken into account. furthermore, the deviation can be quantified by the relative change of the parameters of the amplitude response [31],

xrel [%] = 100 (x − xref)/xref, (4)

where xrel is the relative change in percent, x represents the obtained value and xref is the reference (nominal) value, without introducing any inaccuracy. accordingly, the absolute change can be calculated as xabs = x − xref.

we have adopted a set of criteria to evaluate performance degradation. we have assumed that the filter response is not significantly degraded if the following conditions are met:
1) the relative change of the center frequency (f0rel) is less than 1 %,
2) the relative change of the bandwidth (b3dbrel) is less than 2 %,
3) the absolute change of the passband attenuation (s21abs(f0)) is less than 0.3 db.
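Equation (4) and the three acceptance criteria above translate directly into a small check. The following is a minimal sketch, not the authors' tooling: the function names are ours, the magnitudes of the changes are compared against the limits (an assumption, since the text only says "less than"), and the numerical values in the example are hypothetical.

```python
def relative_change(x, x_ref):
    """Relative change in percent, equation (4)."""
    return 100.0 * (x - x_ref) / x_ref

def absolute_change(x, x_ref):
    """Absolute change x_abs = x - x_ref."""
    return x - x_ref

def response_acceptable(f0, f0_ref, b3db, b3db_ref, s21_db, s21_ref_db):
    """True if all three degradation criteria are met.

    Assumption: limits apply to the magnitude of each change.
    """
    return (abs(relative_change(f0, f0_ref)) < 1.0           # center frequency, %
            and abs(relative_change(b3db, b3db_ref)) < 2.0   # 3-dB bandwidth, %
            and abs(absolute_change(s21_db, s21_ref_db)) < 0.3)  # passband S21, dB

# Hypothetical example: nominal 11 GHz / 300 MHz filter with -1.0 dB passband S21.
ok = response_acceptable(11.05e9, 11.0e9, 304e6, 300e6, -1.1, -1.0)
bad = response_acceptable(11.20e9, 11.0e9, 304e6, 300e6, -1.1, -1.0)
print(ok, bad)
```

In the first case the center frequency shifts by about 0.45 % and the bandwidth by about 1.3 %, so all criteria hold; in the second, the 1.8 % frequency shift alone violates criterion 1.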
the filter response degradation was analyzed and evaluated using simulation results of the 3d em models and measurement results on the laboratory prototype.

4.1. influence of the design parameters

in order to investigate the influence of the implementation technology, the substrate parameters have been varied according to the manufacturer’s specification provided earlier in this section. the same procedure has been carried out for the waveguide resonator and the third-order filter. in the latter case, it has been assumed that each printed-circuit insert was made using the same substrate board, thus the same type of imperfection was applied to all inserts.

the substrate parameters εr, tan δ and h have been varied discretely, within the provided boundaries, and the frequency response parameters (f0rel, b3dbrel, s21abs(f0)) have been observed. a complete set of the obtained numerical results can be found in [31]. while the change of tan δ and h had practically no influence, the most significant degradation of the amplitude response has been introduced by varying εr (f0rel was nearly 0.5 %, b3dbrel was below 2 % and s21abs(f0) was significantly lower than 0.3 db, related to the reference values), for both the waveguide resonator and the filter. since the given criteria have been met, one can conclude that the variation of the substrate parameters within the tolerances provided by the manufacturer does not introduce significant degradation of the amplitude response. fig. 21 shows a comparison of amplitude responses for various values of εr.

fig.
21 comparison of amplitude responses for various values of εr (2.51, 2.55 and 2.59; s11 and s21 [db] versus f [ghz]): a) waveguide resonator, b) waveguide filter

since the relative dielectric permittivity had the most significant influence on the amplitude response, the next step in our study was to find an analytical expression for the resonant frequency (f0) in terms of εr. therefore, we have analyzed the amplitude response for various values of εr in case only one printed-circuit insert was placed in the waveguide (the first/third insert or the second insert of the filter) and in case of the third-order filter. the obtained results have shown that there is a linear dependency between f0 and εr, in the following form [31]:

f0 = k εr + m, (5)

where k = −1.43 and m varies between the considered cases. this expression represents the best linear fit to each set of the obtained results (fig. 22). in practice, for the desired resonant frequency, one should perform a measurement using a single insert; based on that and the given family of curves, the exact permittivity can be determined and used for the filter design.

fig. 22 design curve: resonant frequency fr [ghz] as a linear function of permittivity εr (simulated and approximated curves for the filter, the 1st/3rd resonator and the 2nd resonator; fitted lines fr = −1.43 εr + 14.590, fr = −1.43 εr + 14.650 and fr = −1.43 εr + 14.694)

4.2. inaccuracy of the machine used for fabrication

the machine used for fabrication of the printed-circuit inserts may also introduce some inaccuracy, thus the obtained amplitude response may be degraded to some extent. we have considered a few possible issues related to the machine tolerance.

as previously mentioned, the milling process was used to remove the metallization. therefore, it was possible to obtain traces, i.e.
csrrs, with larger or smaller dimensions than those given in the design specification. the details of the analysis and the obtained simulation results are given in [31]. it has been shown that the amplitude response is not degraded as long as the deviation of the trace width is within the limits of ± 5 μm.

the next considered issue is also a consequence of using the milling process. namely, while removing the metallization, the tool may dig into the substrate to a certain depth [40]. in our study, a trace of a cylindrical tool was used and the 3d em model of such an insert was successfully made in wipl-d software [31]. fig. 23 shows the compared amplitude responses for various values of the digging depth d, for the waveguide resonator and the third-order filter. as can be seen, by increasing the depth, the center frequency increases as well, and the bandwidth gets wider, for both the waveguide resonator and the filter. for the waveguide resonator, there is good agreement between the simulated and measured results for d = 50 μm (fig. 23a), thus confirming the proposed method for modeling the influence of this type of inaccuracy in the software. in addition, the following conclusions can be made: 1) for a single insert, the digging depth of 10 μm can be declared as critical; 2) for the filter using three inserts with the same digging depth, the critical value is even lower than 10 μm (which is around 50 % of the metallization thickness).

finally, we have considered the possibility of fabricating inserts with dimensions not exactly the same as those of the waveguide cross-section. in our example, the insert was narrowed by the same amount on both sides. the detailed analysis and the simulated and measured results can be found in [31].
it has been confirmed that this effect has practically no influence on the amplitude response (for both the waveguide resonator and the filter), despite the fact that the insert was not physically short-circuited to each waveguide wall. precisely, in case the inserts were equally narrowed, by the same amount, on both sides, this amount should be kept below 500 μm (i.e., 1000 μm in total), so that the filter response is not degraded.

fig. 23 comparison of amplitude responses (s11 and s21 [db] versus f [ghz]) for various values of digging depth d (0, 10, 20 and 50 μm): a) waveguide resonator (including measurement results for d = 50 μm), b) waveguide filter

4.3. precise positioning of inserts

inaccuracy in the positioning of printed-circuit inserts might introduce filter response degradation. therefore, we have considered two possible issues (inclined and rotated inserts) for both the waveguide resonator and the filter. the detailed analysis has been carried out for a single insert, and those results have been further taken into account when considering the positioning of inserts for the third-order filter.

fig. 24a shows an inclined insert in the waveguide, and two situations that may occur in practice were considered. in case the dimensions of the fabricated insert perfectly match the dimensions of the waveguide cross-section (b1 = b), the following equation can be used to calculate the inclination angle [31]:

$$\cos(\alpha) = \frac{b - x}{b}, \qquad x = \frac{2 b w^{2}}{b^{2} + w^{2}}. \qquad (6)$$

it has been shown that the critical angle which still allows the insert to remain more or less stable, i.e. to have contact with the top and bottom waveguide walls, is α ≈ 13°.
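The quoted critical angle can be reproduced numerically. The sketch below reads equation (6) as cos(α) = (b − x)/b with x = 2bw²/(b² + w²), which is the condition b·cos(α) + w·sin(α) = b for a rectangular insert of height b and thickness w touching both walls. Two assumptions are ours and not stated in the text: w is taken as the substrate thickness plus one metallization layer (1.143 mm + 0.018 mm), and b = 10.16 mm is the standard WR-90 height.

```python
import math

def critical_angle_deg(wall_dim, thickness):
    """Critical angle (degrees) from equation (6), as read above.

    wall_dim  : waveguide dimension spanned by the insert (mm)
    thickness : total insert thickness w (mm)
    """
    x = 2.0 * wall_dim * thickness ** 2 / (wall_dim ** 2 + thickness ** 2)
    return math.degrees(math.acos((wall_dim - x) / wall_dim))

b = 10.16          # WR-90 narrow-wall height, mm
w = 1.143 + 0.018  # assumed: substrate plus one 18 um metallization layer, mm

alpha = critical_angle_deg(b, w)

# Equivalent half-angle form of the same geometric condition: tan(alpha/2) = w/b.
alpha_check = math.degrees(2.0 * math.atan(w / b))

print(f"critical inclination angle: {alpha:.3f} deg (check: {alpha_check:.3f} deg)")
```

With these assumed dimensions the result is close to 13.04°, in line with the α ≈ 13° quoted above; a different choice of w (for example, counting metallization on both faces) would shift the angle slightly.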
the other possible situation is to have the insert fabricated shorter than needed (b1 ≠ b). the inclination angle that still provides a stable insert, for a known value of b1, can be calculated using the following equation [31]:

$$\cos(\alpha) = \frac{b - x}{b_1}, \qquad x = \frac{b w^{2} \pm w\, b_1 \sqrt{w^{2} + b_1^{2} - b^{2}}}{b_1^{2} + w^{2}}. \qquad (7)$$

in case of a shorter insert, the critical inclination angle may have lower values (e.g. α = 4°), compared to the case with b1 = b.

furthermore, fig. 24b shows a rotated insert in the waveguide, and the minimum rotation angle can be found using the following equation [31]:

$$\cos(\theta) = \frac{a/2 - x}{a/2}, \qquad x = \frac{a w^{2}}{a^{2} + w^{2}}. \qquad (8)$$

the minimum rotation angle for the insert with dimensions perfectly matching the waveguide cross-section is θ ≈ 6°. the maximum rotation angle (in positive or negative direction) which does not introduce response degradation is θ = 15°. it has been shown that in this case the insert has physical contact with the waveguide walls over its top and bottom sides, so it should remain stable although it is not perfectly short-circuited to the side walls [31].

fig. 24 printed-circuit insert in the waveguide: a) inclined by α (side view), b) rotated by θ (top view)

in case a single insert is inclined by α = 13.038° or rotated by θ = 15°, it has been shown that there is no significant influence on the amplitude response of the waveguide resonator [31].

the next step was to investigate the influence of the inaccurately positioned inserts on the third-order filter response. in this case, the function of the inverters may be disrupted, since their lengths may be inadequate. therefore, we have thoroughly investigated the filter response in case one or multiple inserts were rotated or inclined. we have considered the filter with the central insert rotated by θ = 15° (fig.
25a) and it has been shown that this type of inaccuracy does not introduce significant amplitude response degradation, particularly in the passband (fig. 25b). the fabricated filter is shown in fig. 25c. the detailed explanation regarding filter fabrication, along with the structures designed to hold the inserts, can be found in [30, 31]. a comparison of the simulated and measured amplitude responses shows their good agreement, as can be seen in fig. 25d.

finally, the amplitude response has been analyzed when two or three inserts were inclined or rotated, since these are also possible situations in practice. it has been shown that the cases with all three inserts inclined or rotated exhibit the most significant response degradation, so these models were considered in detail in [31], and herein the most important observations will be pointed out.

in case of three inclined inserts, fig. 26a shows the models with the most noticeable performance degradation. namely, model 1 results in the most significant response deviation, even for small inclination angles. however, model 2 is the most probable one in practice: in case the fixtures holding the inserts, attached to the top and bottom waveguide walls, are mutually shifted, all three inserts are inclined by the same angle, in the same direction. for model 2 with perfectly fabricated inserts and inclination angle α ≈ 13°, b3dbrel is around 5 %, compared with the reference bandwidth of the original filter. for the same model with slightly shorter inserts (b1 ≈ 10.1 mm) and inclination angle α = 8° for all three inserts, there is practically no response degradation, i.e. the parameters of the amplitude characteristic meet the criteria provided earlier in this section.

in case of three rotated inserts, model 1 in fig. 26b exhibited the most significant response deviation.
it has been found that the maximum rotation angle still providing an acceptable amplitude response in terms of the required criteria, for an arbitrary position of the inserts, was θ = 8°.

finally, in case the inserts were simultaneously inclined and rotated, the aforementioned criteria would be met for the inclination angle α ≤ 5° and the rotation angle θ ≤ 7°.

fig. 25 filter with central insert rotated by θ = 15°: a) top view and wipl-d model, b) comparison of amplitude responses (s11 and s21 [db] versus f [ghz]) for the original model and the filter with rotated insert, c) fabricated filter, d) comparison of simulated and measured results for the filter with rotated insert

fig. 26 a) inclined inserts, b) rotated inserts

5. conclusions

in this paper, various solutions for the bandstop and bandpass waveguide filter design have been presented. the goal was to exemplify a relatively simple waveguide filter design procedure, using printed-circuit discontinuities and different types of resonators that are easy to design and implement.

first, bandstop filters were designed using printed-circuit inserts within the rectangular waveguide. inserts with srrs were placed in the h-plane, while the insert with qwrs was positioned in the e-plane of the standard wr-90 waveguide. the filters designed using these inserts have been thoroughly analyzed and the results have been presented. both types of the considered filters allow independent control of the designed stopbands and are compact in size.

as for the e-plane filters, the miniaturized icds multi-band bandstop waveguide filter design using qwrs has been discussed. as a proof of concept, e-plane icds dual-band and triple-band bandstop waveguide filters have been designed.
center frequencies can be flexibly adjusted by the length of the corresponding qwrs. as for the icds dual-band bandstop filter, connecting the qwrs for different stopbands to opposite waveguide walls has resulted in a size reduction of about 41 %, compared to the case where they are connected to the same waveguide wall. the miniaturized icds dual-band bandstop filter has been fabricated and measured. the filter is 0.512 λg in length. further miniaturization of the dual-band bandstop filter has been achieved when qwrs of different size were printed one below another. in this arrangement, the unwanted mutual coupling has been particularly strong and has restricted the control of the center frequencies and bandwidth. the impact of altering the physical dimensions on the filter response has been thoroughly investigated and presented. the obtained ultra-compact e-plane dual-band bandstop waveguide filter has a length of 0.295 λg, which is about 66 % and 42 % shorter compared to the non-miniaturized and miniaturized icds dual-band bandstop filters, respectively. additionally, an equivalent microwave circuit of the multi-band bandstop filter with independently tunable stopbands is presented in the form of a cascade of the equivalent microwave networks of the single-band bandstop filters. the equivalent circuit corresponds to the decomposed 3d filter structure, and it is suitable for faster filter design and optimization as well. for the design of the h-plane filters, inserts with printed srrs have been used. the third-order bandstop filter has been designed using srrs separated by quarter-wave waveguide sections acting as immittance inverters for the center frequency. accordingly, the dual-band bandstop filter has been implemented with srrs separated by the inverters for the specified center frequencies. the filter is 0.5 λg in length, which is attributed to the length of the quarter-wave waveguide section used as the inverter for the lower stopband design.
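As a quick consistency check of the size figures quoted above, the sketch below recomputes the relative length reductions. The length of the non-miniaturized filter is not stated in this section; here it is inferred from the "about 41 %" figure, which is an assumption of this example, and the helper name `percent_shorter` is purely illustrative.

```python
def percent_shorter(new_len: float, old_len: float) -> float:
    """Relative length reduction of new_len with respect to old_len, in percent."""
    return 100.0 * (old_len - new_len) / old_len


# Lengths quoted in the text, in guided wavelengths (lambda_g).
MINIATURIZED_LEN = 0.512   # miniaturized ICDS dual-band bandstop filter
ULTRA_COMPACT_LEN = 0.295  # ultra-compact E-plane dual-band bandstop filter

# Assumption: the non-miniaturized length is not given explicitly, so it is
# inferred here from the quoted size reduction of about 41 %.
NON_MINIATURIZED_LEN = MINIATURIZED_LEN / (1.0 - 0.41)

vs_miniaturized = percent_shorter(ULTRA_COMPACT_LEN, MINIATURIZED_LEN)
vs_non_miniaturized = percent_shorter(ULTRA_COMPACT_LEN, NON_MINIATURIZED_LEN)

print(f"shorter than miniaturized filter:     {vs_miniaturized:.1f} %")
print(f"shorter than non-miniaturized filter: {vs_non_miniaturized:.1f} %")
```

Under this assumption, both values come out close to the 42 % and 66 % quoted in the text.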
regarding bandpass waveguide filters, various types of resonating inserts with a bandpass characteristic have been introduced. they have been used for higher-order h-plane bandpass filters with a single or multiple passbands. a novel solution for a dual-band filter using folded inserts has been presented, in order to properly implement the inverters, i.e. the quarter-wave waveguide sections, for each center frequency. the inserts may be implemented either as multi-layer planar inserts or, as a simpler solution, as metal inserts. the dual-band filter with folded metal inserts has been further modified to obtain a compact solution with flat inserts and miniaturized inverters, optimized for fabrication. it has also been demonstrated that csrrs do not necessarily need to be centrally positioned on the inserts, but may be attached to the top and bottom waveguide walls. finally, the side effects of bandpass waveguide filter fabrication have been investigated in detail. the amplitude responses of the waveguide resonator and the third-order filter have been analyzed in terms of the implementation technology, the tolerance of the machine used for fabrication and the positioning of the inserts inside the waveguide. the obtained results are relevant for identifying critical parameters affecting the performance of the considered structures. various effects and phenomena have been modeled in software, and for the chosen examples the results were also experimentally verified, showing good agreement with the simulation results.
the obtained results can be summarized as follows: 1) regarding substrate parameters, the dielectric permittivity of the printed-circuit insert had the major impact on the amplitude response (a closed-form expression based on a linear dependency between the permittivity and center frequency was proposed as a design curve); 2) in terms of machine tolerance, the digging depth into the substrate during the milling process introduced the most significant response degradation; 3) the inaccuracy in positioning of the inserts in the waveguide did not introduce deviation of the filter response in the passband, for the critical angles which were determined, for both the waveguide resonator and the filter with three arbitrarily inclined or rotated inserts. the findings of our study may be applicable to other types of waveguide filters using similar resonating inserts and also to filters operating in different frequency bands, since the presented results point out the most significant phenomena and side effects of the fabrication process. the advantage of the proposed method is the possibility of improving and shortening the design procedure by performing the majority of settings and analyses in software, thus avoiding unnecessary fabrications.
acknowledgement: this work was supported by the ministry of education, science and technological development of the republic of serbia under grant tr32005.
references
[1] m. d. lutovac, d. v. tošić, b. v. evans, filter design for signal processing using matlab and mathematica, upper saddle river, nj: prentice hall; translated in chinese. beijing, p. r. china: publishing house of electronics industry, phei; 2004.
[2] d. m. pozar, microwave engineering. new york: john wiley & sons, 2012.
[3] s. stefanovski pajović, m. potrebić, d. v. tošić, "advanced filtering waveguide components for microwave systems", microwave systems and applications, dr. sotirios goudos (ed.), intech, january 2017.
[4] r. j. cameron, c. m.
kudsia, and r. r. mansour, microwave filters for communication systems: fundamentals, design and applications, new jersey: john wiley & sons, 2007.
[5] b. milovanović, j. joković, t. dimitrijević, "analysis of feed waveguide length influence on em field in microwave applicator using tlm method", facta universitatis, series: electronics and energetics, vol. 21, no. 1, pp. 65-72, april 2008.
[6] s. lj. stefanovski, "microwave waveguide filters using printed-circuit discontinuities", ph.d. dissertation, school of electrical engineering, university of belgrade, belgrade, serbia, 2015.
[7] s. c. dutta roy, "a new lumped element bridged-t absorptive band-stop filter", facta universitatis, series: electronics and energetics, vol. 30, no. 2, pp. 179-185, june 2017.
[8] s. prikolotin, a. kirilenko, "a novel notch waveguide filter", microw. opt. techn. let., vol. 52, pp. 416-420, february 2010.
[9] s. fallahzadeh, h. bahrami, m. tayarani, "a novel dual-band bandstop waveguide filter using split ring resonators", prog. electromagn. res. lett., vol. 12, pp. 133-139, 2009.
[10] s. fallahzadeh, h. bahrami, m. tayarani, "very compact bandstop waveguide filters using split ring resonators and perturbed quarter-wave transformers", electromagnetics, vol. 30, no. 5, pp. 482-490, june 2010.
[11] s. stefanovski, m. potrebić, d. tošić, "novel realization of bandstop waveguide filters", technics, special edition, pp. 69-76, 2013.
[12] s. stefanovski, m. potrebić, d. tošić, "a novel design of dual-band bandstop waveguide filter using split ring resonators", j. optoelectron. adv. mat., vol. 16, no. 3-4, pp. 486-493, march-april 2014.
[13] s. stefanovski, m. potrebić, d. tošić, z. cvetković, "design and analysis of bandstop waveguide filters using split ring resonators", in proceedings of the 11th international conference on applied electromagnetics (pes 2013), niš, serbia, 2013, pp. 135-136.
[14] m.
mrvić, m. potrebić, d. tošić, z. cvetković, "e-plane microwave resonator for realisation of waveguide filters", in proceedings of xii international saum conference on systems, automatic control and measurements (saum 2014), niš, serbia, 2014, pp. 205-208.
[15] s. stefanovski pajović, m. potrebić, d. tošić, z. stamenković, "e-plane waveguide bandstop filter with double-sided printed-circuit insert", facta universitatis, series: electronics and energetics, vol. 30, no. 2, pp. 223-234, june 2017.
[16] h. sun, c. feng, y. huang, r. wen, j. li, w. chen, g. wen, "dual-band notch filter based on twist split ring resonators", int. j. antennas propag., vol. 2014, article id 541264, 6 pages, april 2014.
[17] p. castro, j. barosso, j. leite neto, a. tomaz, u. hasar, "experimental study of transmission and reflection characteristics of a gradient array of metamaterial split-ring resonators", j. microw., optoelectron. electromagn. appl., vol. 15, no. 4, pp. 380-389, october/december 2016.
[18] s. stefanovski, m. potrebić, d. tošić, "a novel design of e-plane bandstop waveguide filter using quarter-wave resonators", optoelectron. adv. mat., vol. 9, no. 1-2, pp. 87-93, january-february 2015.
[19] m. mrvić, m. potrebić, d. tošić, "compact e-plane waveguide filter with multiple stopbands", radio sci., vol. 51, no. 12, pp. 1895-1904.
[20] m. mrvić, m. potrebić, d. tošić, z. cvetković, "miniaturization of waveguide bandstop filter", in proceedings of the 12th international conference on applied electromagnetics (pes 2015), niš, serbia, 2015, pp. 79-80.
[21] n. ortiz, j. d. baena, m. beruete, f. falcone, m. a. g. laso, t. lopetegi, r. marques, f. martin, j. garcia-garcia, m. sorolla, "complementary split-ring resonator for compact waveguide filter design", microw. opt. techn. let., vol. 46, no. 1, pp. 88-92, july 2005.
[22] m. m. potrebić, d. v. tošić, z. ž. cvetković, n.
radosavljević, "wipl-d modeling and results for waveguide filters with printed-circuit inserts", in proceedings of the 28th international conference on microelectronics (miel 2012), niš, serbia, 2012, pp. 309-312.
[23] h. bahrami, m. hakkak, a. pirhadi, "analysis and design of highly compact bandpass waveguide filter utilizing complementary split ring resonators (csrr)", prog. electromagn. res., vol. 80, pp. 107-122, 2008.
[24] s. stefanovski, m. potrebić, d. tošić, "design and analysis of bandpass waveguide filters using novel complementary split ring resonators", in proceedings of the 11th international conference on telecommunications in modern satellite, cable and broadcasting services (telsiks 2013), niš, serbia, 2013, pp. 257-260.
[25] s. stefanovski pajović, m. potrebić, d. tošić, "microwave bandpass and bandstop waveguide filters using printed-circuit discontinuities", in proceedings of the 23rd telecommunications forum (telfor 2015), belgrade, serbia, 2015, pp. 520-527.
[26] s. stefanovski, đ. mirković, m. potrebić, d. tošić, "novel design of h-plane bandpass waveguide filters using complementary split ring resonators", in proceedings of progress in electromagnetics research symposium (piers 2014), guangzhou, china, 2014, pp. 1963-1968.
[27] s. stefanovski, m. potrebić, d. tošić, z. stamenković, "a novel compact dual-band bandpass waveguide filter", in proceedings of ieee 18th international symposium on design and diagnostics of electronic circuits and systems (ddecs 2015), belgrade, serbia, 2015, pp. 51-56.
[28] s. stefanovski, m. potrebić, d. tošić, z. stamenković, "compact dual-band bandpass waveguide filter with h-plane inserts", j. circuit syst. comp., vol. 25, no. 3, 1640015 (18 pages), 2016.
[29] j.-s. hong, microstrip filters for rf/microwave applications, nj: john wiley & sons, 2011.
[30] s. stefanovski, m. potrebić, d.
tošić, "structure for precise positioning of inserts in waveguide filters", in proceedings of the 21st telecommunications forum (telfor 2013), belgrade, serbia, 2013, pp. 689-692.
[31] s. lj. stefanovski pajović, m. m. potrebić, d. v. tošić, z. ž. cvetković, "fabrication parameters affecting implementation of waveguide bandpass filter with complementary split-ring resonators", j. comput. electron., vol. 15, no. 4, pp. 1462-1472, 2016.
[32] s. c. gao, l. w. li, t. s. yeo, m. s. leong, "a dual-frequency compact microstrip patch antenna", radio sci., vol. 36, no. 6, pp. 1669-1682, november-december 2011.
[33] m. albooyeh, a. a. lotfi neyestanak, b. mirzapour, "wideband dual posts waveguide band pass filter", int. j. microw. opt. techn., vol. 2, no. 3, pp. 203-209, 2007.
[34] n. s. choi, d. h. kim, g. jeung, j. g. park, j. k. byun, "design optimization of waveguide filters using continuum design sensitivity analysis", ieee t. magn., vol. 46, no. 8, pp. 2771-2774, 2010.
[35] r. l. villaroya, "e-plane parallel coupled resonators for waveguide bandpass filter applications", ph.d. dissertation, heriot-watt university, edinburgh, 2012.
[36] [online] http://www.ros.hw.ac.uk/bitstream/handle/10399/2604/lopez-villarroyar_1012_eps.pdf?sequence=1&isallowed=y
[37] p. soto, d. de llanos, v. e. boria, e. tarin, b. gimeno, a. onoro, l. hidalgo, m. j. padilla, "performance analysis and comparison of symmetrical and asymmetrical configurations of evanescent mode ridge waveguide filters", radio sci., vol. 44, no. 6, rs6010, december 2009.
[38] j. bornemann, j. uher, "design of waveguide filters without tuning elements for production-efficient fabrication by milling", in proceedings of asia-pacific microwave conference (apmc), pp. 759-762, taipei, taiwan, 2001.
[39] c. zhao, t. kaufmann, y. zhu, c. c.
lim, "efficient approaches to eliminate influence caused by micromachining in fabricating h-plane iris band-pass filters", in proceedings of asia-pacific microwave conference (apmc), pp. 1306-1308, sendai, japan, 2014.
[40] b. m. kolundžija and a. r. djordjević, electromagnetic modeling of composite metallic and dielectric structures, 1st ed. norwood, ma: artech house, 2002.
[41] a. r. djordjević, d. i. olćan, a. g. zajić, "modeling and design of milled microwave printed circuit boards", microw. opt. technol. let., vol. 53, no. 2, pp. 264-270, 2011.
instruction facta universitatis series: electronics and energetics vol. 30, no. 3, september 2017, pp. 285-293 doi: 10.2298/fuee1703285t
modeling of magnetoelectric microwave devices
alexander s. tatarenko, darya v. snisarenko, mirza i. bichurin
novgorod state university, veliky novgorod, russia
abstract. the possibility of computer modeling of electrically controlled magnetoelectric (me) microwave devices is considered. computer modeling results are presented for different me microwave devices based on a layered ferrite-piezoelectric structure placed on a slot line, microstrip line or coplanar waveguide. results are reported as frequency dependencies of the insertion losses of the me devices.
key words: magnetoelectric microwave devices, ferrite-piezoelectric resonator, dual magnetic and electrical control
1. introduction
with the increasing significance of microwave communication, radar and navigation systems in modern society, the requirements for their reliability, mobility and power consumption have grown. telecommunication and mobile satellite radiotelephone systems, mobile navigation and radar stations, and global and local computer networks need electrically controllable and inexpensive devices.
this requirement can be met by replacing complex circuits containing active components with tunable microwave devices based on thin film materials with nonlinear physical properties, such as ferroelectrics and ferrites. one way to control the parameters of electronic components is based on the change in the dielectric constant of the components under the influence of an external electric field. the "electric" method of control is characterized by high speed and low energy consumption, since the tuning is carried out without leakage currents through the control circuit. in some ferroelectrics, controllability by the electric field is maintained over a wide frequency range, from the lowest to the highest frequencies. this feature is widely used in microwave devices for rapid regulation of the amplitude-frequency and phase-frequency characteristics. the disadvantages of ferroelectric control structures are a relatively narrow tuning range of the operating frequency and a high level of voltage applied to the electrodes. these drawbacks can be overcome by designing new modifications of the transmission lines, as well as by using layered structures containing not only ferroelectric but also ferromagnetic films.
received november 30, 2016. corresponding author: mirza bichurin, novgorod state university, 173003 veliky novgorod, russia (e-mail: mirza.bichurin@novsu.ru)
with ferrite-ferroelectric layered structures, the performance can be controlled by electric and magnetic fields at the same time. such devices combine the advantages of the "electric" and "magnetic" control methods, i.e. high speed and a wide tuning range of the operating frequency and other microwave device parameters.
analysis of the current state of the field of microwave devices controlled by electric and magnetic fields indicates the existence of open scientific and technical issues, including radiophysical and physical-technological aspects. these issues define a number of research tasks, such as theoretical studies of electrodynamic characteristics and improvement of the design of microwave transmission lines, experimental investigations of wave processes, and design and development of the controlled devices. magnetoelectric (me) materials [1-6], which simultaneously exhibit ferroelectricity and ferromagnetism, have recently stimulated a sharply increasing number of research activities owing to their scientific interest and significant technological promise for novel multifunctional devices. the me effect [7-9] in composite materials is known as a product tensor property, which results from the cross interaction between different orderings of the two phases in the composite. neither the piezoelectric nor the magnetic phase has the me effect, but composites of these two phases show a remarkable me effect. thus the me effect is a result of the product of the magnetostrictive effect (magnetic/mechanical effect) in the magnetic phase and the piezoelectric effect (mechanical/electrical effect) in the piezoelectric phase. one of the promising directions in microwave technology is currently the development of me microwave devices. the application of me non-reciprocal devices eliminates the drawbacks of ferrite devices. electric field control allows such devices to be implemented in integrated form, which reduces the cost of the devices, improves performance, reduces power consumption in the control circuit, and eliminates the interference arising from magnetic field control [10-11].
2. modeling of me microwave devices
magnetoelectric interactions in ferrite-ferroelectric composites have facilitated a new class of microwave signal processing devices.
such devices are based on either hybrid spin electromagnetic waves or mechanical force mediated magnetoelectric interactions. when a ferrite-piezoelectric bilayer is driven to ferromagnetic resonance (fmr) and an electric field e is applied across the piezoelectric (ferroelectric) layer, the me effect results in a frequency or field shift of the fmr. thus devices based on fmr can be tuned with both the electric field e and the magnetic field h. several dual tunable me devices, including resonators, filters, attenuators, circulators, isolators and phase shifters, have been demonstrated so far. simulation of me microwave devices with modern software that calculates multimode s-parameters and the electromagnetic field in three-dimensional passive structures greatly simplifies the selection of optimal parameters of such devices: the parameters of the transmission line (dimensions and relative permittivity of the substrate, the size of the conductors) and the resonator parameters (size, shape, material). as the industry turns to monolithic integrated/hybrid nonreciprocal microwave devices, planar geometries have to be used. this requires the development of planar elements compatible with strip line and microstrip systems. as high-frequency systems are manufactured using monolithic microwave integrated circuit (mmic) designs, the size of the me resonator must be compatible with the mmic chip technology. the difference between the proposed me non-reciprocal devices and ferrite devices is the replacement of the ferrite magnetic resonator and magnetic control system by a ferrite-piezoelectric resonator and a system of electrodes connected to the source of the control voltage. the me resonator (fig. 1) is a layered composite in the form of a disk or plate.
the ferrite phase can be one of various spinels (nife2o4, cofe2o4, ni0.8zn0.2fe2o4, co0.6zn0.4mn2o4 and others) or yttrium iron garnet (yig thick film or single crystal); the piezoelectric phase can be the polycrystalline material lead zirconate titanate (pzt), or single-crystal materials such as lead magnesium niobate-lead titanate (pmn-pt) or lead zinc niobate-lead titanate (pzn-pt).
fig. 1 me resonator: 1 is the piezoelectric component, 2 is the ferrite component, 3 is the metal electrodes
the basis for the design of me microwave devices is a microwave transmission line on a dielectric substrate with a me resonator placed in the transmission line. the operating principle of the me non-reciprocal microwave devices is based on the microwave me effect. the essence of this effect is a shift of the fmr line under the influence of an electric field. the me layered composite operates as a resonator in this case. electric field control allows the device characteristics to be tuned within a frequency range. the fmr line can also be controlled using a magnetic field. this dual tunability of the device parameters opens up new possibilities for the design of such devices. fmr is a powerful tool for studies of the microwave me interaction in ferrite-piezoelectric structures. the efficiency of the magnetoelectric interaction in ferrite-piezoelectric bilayers is characterized by the coefficient of magnetoelectric interaction a = δh/δe, where δh is the variation of the internal magnetic field in the ferrite and δe is the variation of the electric field applied to the piezoelectric. the magnitude of a depends mainly on the magnetostriction constant of the ferrite and the piezoelectric coefficient of the piezoelectric. an electric field e applied to the composite produces a mechanical deformation in the piezoelectric that in turn is coupled to the ferrite and results in the shift δf in the resonance field. information on the nature of high frequency me coupling was therefore obtained from data on shift δf vs e.
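The relation between the coupling coefficient and the FMR tuning can be illustrated numerically. The sketch below assumes the simplest case, in which the resonance frequency shifts linearly with the internal field, δf = γ·δH with γ ≈ 2.8 MHz/Oe. This proportionality constant, the function name and the sample values of the coefficient and field are assumptions made for illustration and are not taken from the paper.

```python
# Assumed approximate gyromagnetic ratio of the ferrite, in MHz per oersted.
GAMMA_MHZ_PER_OE = 2.8


def fmr_shift_mhz(a_oe_cm_per_kv: float, e_kv_per_cm: float) -> float:
    """Estimate the FMR frequency shift for a given ME coefficient and field.

    a_oe_cm_per_kv: ME coefficient A = dH/dE in Oe*cm/kV (hypothetical value)
    e_kv_per_cm:    applied electric field in kV/cm (hypothetical value)
    """
    delta_h_oe = a_oe_cm_per_kv * e_kv_per_cm  # dH = A * dE
    return GAMMA_MHZ_PER_OE * delta_h_oe       # df = gamma * dH (linear model)


# Hypothetical example: A = 1 Oe*cm/kV and E = 10 kV/cm.
print(f"estimated FMR shift: {fmr_shift_mhz(1.0, 10.0):.1f} MHz")
```

In a real resonator the slope df/dH depends on geometry and magnetization, so the linear model should be read only as a first-order estimate.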
the shift is proportional to the linear me coupling coefficient. the design of a me microwave device assumes the presence of a me resonator, which is placed on a microstrip line or circuit-resonator, on a slot line, or in a waveguide, in the region of circular polarization of the microwave field. the circular polarization of the microwave field allows more effective use of the composite component and increases the magnetic susceptibility. the working point is selected depending on the purpose of the device. for example, in the case of an attenuator or isolator, the device is tuned to the resonance absorption. for a phase shifter, the region near resonance with the lowest absorption but maximal control depth is selected. computation, design and manufacturing technology of nonreciprocal microwave devices intended for application in receiving-transmitting modules of antenna arrays are currently of great interest. the high frequency structure simulator (hfss) from ansoft, which is intended for the analysis of three-dimensional microwave structures, including antennas and non-reciprocal devices containing ferrites and ferroelectrics, is now widely used. electromagnetic simulation in hfss is based on the finite element method (fem). microstrip line [12], coplanar line and slot line are used in the microwave range. microstrip lines are used most widely [13-14]. however, the design of non-reciprocal devices using ferrites requires a microwave field of circular polarization. in a microstrip line such a region is absent and additional elements are needed, for example in the form of stubs, to create an area of circular polarization. from this point of view, the slot line and coplanar line are of interest. the structure of the microwave field in the slot line and coplanar waveguide is significantly different from the structure of the wave field in a microstrip line.
coplanar waveguide (cpw) is a transmission line which consists of a center strip, two slots and a semi-infinite ground plane on either side of it [15]. this type of waveguide offers several advantages over the conventional microstrip line: it facilitates easy shunt as well as series mounting of active and passive devices, it eliminates the need for wraparound and via holes, and it has a low radiation loss. another important advantage of cpw which has recently emerged is that cpw circuits lend themselves to fast and inexpensive on-wafer characterization at frequencies as high as 50 ghz. lastly, since the microwave magnetic fields in the cpw are elliptically polarized, nonreciprocal components such as ferrite circulators and isolators can be efficiently integrated with the feed network. fig. 2 (a, b, c) shows the computer models of me devices on different types of transmission line.
fig. 2 a) microstrip line
the transmission line structure in fig. 2a) consisted of microstrip lines of nonresonant lengths with two stubs of lengths 1/8 and 3/8 wavelengths on a dielectric substrate with a ground plane on the bottom side. the stubs are required for creating an elliptically polarized microwave magnetic field.
fig. 2 b) slot line
the slot line transmission system [16-17] has been shown to contain elliptically polarized h field regions, which are required for producing nonreciprocal microwave devices. the development of such a device depended on being able to determine a me composite slot line configuration that would yield good interaction between the me resonator and the propagating mode of the slot line with a minimum of concurrent insertion loss. the microstrip to slot line transition is used to convert input microwave signals from a tem mode to the required slot line mode. the slot width on the transition is designed so as to match into the slot line etched on one of the me resonator inserts in the slot of the device.
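The stub lengths quoted for the microstrip structure of Fig. 2a (1/8 and 3/8 of a wavelength) can be converted to physical dimensions once the guided wavelength is known. The following is a minimal sketch assuming a quasi-TEM line, for which λg = c / (f·√εeff). The operating frequency and effective permittivity used in the example are hypothetical placeholders, not values from the paper.

```python
import math

C0_M_PER_S = 299_792_458.0  # speed of light in vacuum


def guided_wavelength_mm(freq_ghz: float, eps_eff: float) -> float:
    """Guided wavelength of a quasi-TEM line, in millimetres."""
    return C0_M_PER_S / (freq_ghz * 1e9 * math.sqrt(eps_eff)) * 1e3


def stub_lengths_mm(freq_ghz: float, eps_eff: float) -> tuple[float, float]:
    """Physical lengths of the lambda/8 and 3*lambda/8 stubs, in millimetres."""
    lam = guided_wavelength_mm(freq_ghz, eps_eff)
    return lam / 8.0, 3.0 * lam / 8.0


# Hypothetical example: 10 GHz and an effective permittivity of 4.
short_stub, long_stub = stub_lengths_mm(10.0, 4.0)
print(f"lambda/8 stub:   {short_stub:.3f} mm")
print(f"3*lambda/8 stub: {long_stub:.3f} mm")
```

In practice the effective permittivity depends on the substrate and line geometry, so it would normally be taken from a line calculator or from the field solver itself.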
the pertinent characteristics of this type of transmission system, such as field configurations, propagation constants, and impedance as functions of dielectric material characteristics, dielectric thickness, and slot width, were derived. the slot line contained a microwave magnetic field configuration which was suitable for generating nonreciprocal me devices. there existed regions within the slot line that contained a circularly or elliptically polarized microwave magnetic field.
fig. 2 c) coplanar waveguide
the use of modern simulation software allows the fast design of various types of nonreciprocal microwave devices. we conducted a simulation of various types of nonreciprocal magnetoelectric devices based on slot and coplanar lines using hfss. a comparison with similar devices based on the microstrip line was made.
3. results and discussion
the devices were simulated in the hfss software environment. the s-parameters in the frequency range were optimized for each investigated device. the amplitude characteristics were investigated. computer simulation results for different designs of me microwave devices realized on the strip transmission lines are shown in the figures. figure 3 shows the frequency dependence of the microstrip line attenuation in the forward and reverse directions.
fig. 3 the microstrip transmission line. dependence of attenuation vs. frequency. resonator parameters: yig disk of thickness 0.1 mm on a ggg substrate of thickness 0.44 mm and diameter 3 mm; magnetizing field is 2700 oe.
fig. 4 slot transmission line. dependence of attenuation vs. frequency. resonator dimensions are 10 mm×1 mm×0.2 mm; slot line width is 0.62 mm, with the gap widening to 1.2 mm; the relative permittivity of the substrate is 30, the substrate thickness is 2 mm; magnetizing field is 2514 oe.
figure 4 shows the frequency dependence of the slot transmission line attenuation in the forward and reverse directions. figure 5 shows the frequency dependence of the coplanar transmission line attenuation in the forward and reverse directions.
fig. 5 the coplanar transmission line. dependence of attenuation vs. frequency. resonator dimensions are 0.6×4×0.1 mm³; slot width is 0.4 mm; the center conductor width is 0.6 mm; ε of the substrate is 40; substrate thickness is 1 mm; magnetizing field is 3125 oe.
figure 6 shows the experimental frequency dependence of the coplanar transmission line attenuation in the forward and reverse directions. the experimental investigation of the me microwave properties of the bilayer structures was based on measurements of the resonator frequency responses for different values of external dc voltage and bias magnetic field. namely, the reflection spectra s11(f) = 10 log|pref(f)/pin(f)|, where pin(f) is the incident power, pref(f) is the reflected power, and f is the excitation frequency, were measured. the frequency responses were measured with an agilent network analyzer.
fig. 6 for comparison. coplanar waveguide, the experimental frequency dependence of attenuation. magnetizing field is 2780 oe.
computation, design and manufacturing technology of nonreciprocal microwave devices are currently of great interest. the main directions for further research are based on the use of modern computer design programs. the use of modern simulation software allows the fast design of various types of non-reciprocal microwave devices. such simulation makes it possible to select the substrate parameters and the shape of the me resonator. a me resonator based on a layered structure of yig and pzt was used.
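The reflection-spectrum definition given above, S11(f) = 10 log|Pref(f)/Pin(f)|, can be evaluated directly from a pair of power readings. The sketch below is only an illustration of that formula: the function name and the sample power values are hypothetical, and the logarithm is taken to base 10 so that the result is in decibels.

```python
import math


def s11_db(p_ref: float, p_in: float) -> float:
    """Reflection coefficient in dB from reflected and incident power.

    Both powers must be in the same linear unit (e.g. mW) and positive.
    """
    if p_in <= 0.0 or p_ref <= 0.0:
        raise ValueError("powers must be positive")
    return 10.0 * math.log10(abs(p_ref / p_in))


# Hypothetical readings: 1 mW incident, 0.01 mW reflected.
print(f"S11 = {s11_db(0.01, 1.0):.1f} dB")
```

A sweep over frequency would simply apply the same function to each measured pair of incident and reflected powers.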
to decrease the control voltage and increase the valve ratio, it is necessary to reduce the thickness of the piezoelectric, and hence the thickness of the ferrite. the use of computer simulation for me structures in non-reciprocal microwave devices opens promising opportunities for the creation of new devices.
4. conclusion
magnetoelectric layered structures are ideal for studies of wideband magnetoelectric interactions between the magnetic and electric subsystems that are mediated by mechanical forces. such structures show a variety of magnetoelectric phenomena, including microwave me effects. these phenomena can be used for electrical tuning of microwave me resonators and of devices based on them. the possibility of realizing me microwave devices on strip transmission lines controlled by both electric and magnetic fields is shown. the results of computer simulation of various me microwave device designs with resonators based on me layered structures placed in the transmission line are given. the simulated results are compared with the experimental results.
acknowledgement: the paper is a part of the research done within the project of russian science foundation 16-12-10158.
references
[1] r. heindl, h. srikanth, s. witanachchi, p. mukherjee, t. weller, a.s. tatarenko and g. srinivasan, "structure, magnetism, and tunable microwave properties of pulsed laser deposition grown barium ferrite/barium strontium titanate bilayer films", j. appl. phys., vol. 101, p. 09m503, 2007.
[2] g. srinivasan, a.s. tatarenko, y. k. fetisov, v. gheevarughese, and m.i. bichurin, "microwave magneto-electric interactions in multiferroics", in proc. of the mater. res. soc. symp, 2007, vol. 966, p. 0966t14-01.
[3] d. seguin, m. sunder, l. krishna, a. tatarenko, p.d. moran, "growth and characterization of epitaxial fe0.8ga0.2/0.69pmn-0.31pt heterostructures", journal of crystal growth, vol. 311, no. 12, pp. 3235-3238, 2009.
[4] g. srinivasan, i.v. zavislyak, a.s.
tatarenko, "millimeter-wave magnetoelectric effects in bilayers of barium hexaferrite and lead zirconate titanate", appl. phys. lett., vol. 89, p.152508, 2006. [5] c.-w. nan, m. i. bichurin, s.x. dong, d. viehland, and g. srinivasan, "multiferroic magnetoelectric composites: historical perspective, status, and future directions", j. appl. phys., vol. 103, p.031101, 2008. [6] a.s. tatarenko, a.b. ustinov, g. srinivasan, v.m. petrov, and m.i. bichurin, "microwave magnetoelectric effects in bilayers of piezoelectrics and ferrites with cubic magnetocrystalline anisotropy", j. appl. phys., vol. 108, p.063923, 2010. [7] m.i. bichurin, i.a. kornev, v.m. petrov, a.s. tatarenko, yu.v. kiliba, g. srinivasan, "theory of magnetoelectric effects at microwave frequencies in a piezoelectric/magnetostrictive multilayer composite", phys. rev. b., vol. 64, p.094409, 2001. [8] m.i. bichurin, i.a. kornev, v.m. petrov, yu.v. kiliba, a.s. tatarenko, n.a. konstantinov, g. srinivasan, "resonance magnetoelectric effect in multilayer composites", ferroelectrics, vol. 280, p.187-198, 2002. [9] s. shastry, g. srinivasan, m.i. bichurin, v.m. petrov, a.s. tatarenko, "microwave magnetoelectric effects in single crystal bilayers of yttrium iron garnet and lead magnesium niobate – lead titanate", phys. rev. b., vol. 70, p.064416, 2004. [10] a.s. tatarenko and m.i. bichurin, "microwave magnetoelectric devices", advances in condensed matter physics, vol. 2012, p.10. [11] a.s. tatarenko, g. srinivasan, m.i. bichurin, "magnetoelectric microwave phase shifter", appl. phys. lett., vol. 88, p.183507, 2006. [12] m. perić, s. ilić, s. aleksić, n. raičević, m. bichurin, a. tatarenko, r.
petrov, "covered microstrip line with ground planes of finite width", facta universitatis series: electronics and energetics, vol. 27, no. 4, pp. 589 – 600, december 2014. [13] m.i. bichurin, v.m. petrov, r.v. petrov, g.n. kapralov, f.i. bukashev, a.yu. smirnov, a.s. tatarenko "magnetoelectric microwave devices", ferroelectrics, vol. 280, pp.213-220, 2002. [14] m.i. bichurin, a.s. tatarenko, d.v. lavrenteva, s.r. aleksić, "magnetoelectric microwave devices", in proc. of the 11th international conference on applied electromagnetics πec'2013, niš, serbia, september 01 – 04, 2013, pp.77-78 [15] c.p. wen, "coplanar waveguide: a surface strip transmission line suitable for nonreciprocal gyromagnetic device applications," ieee transactions on microwave theory and techniques, vol. mtt-17, no. 12, pp. 1087-1090, december 1969. [16] s. b. cohn, "slot line an alternative transmission medium for integrated circuits", in digest of the 1968 ieee g-mtt international microwave symposium, pp 104-109. [17] mariani, heinzman, agrios and cohn, "slot line characteristics", ieee transactions microwave theory and techniques, vol. mtt-17, december 1969, pp 1091-1096. instruction facta universitatis series: electronics and energetics vol. 30, n o 2, june 2017, pp. 199 208 doi: 10.2298/fuee1702199d a non-intrusive identification of home appliances using active power and harmonic current  srđan đorđević, marko dimitrijević, vančo litovski university of niš, faculty of electronic engineering, niš, serbia abstract. in recent years, research on non-intrusive load monitoring has become very popular since it allows customers to better manage their energy use and reduce electrical consumption. the traditional non-intrusive load monitoring method, which uses active and reactive power as signatures, has poor performance in detecting small non-linear loads. this drawback has become more prominent because the use of nonlinear appliances has increased continuously during the last decades. 
to address this problem, we propose a nilm method that utilizes harmonic current in combination with the changes of real power. the advantages of the proposed method with respect to the existing frequency-analysis-based nilm methods are lower computational complexity and the use of only one feature to characterize the harmonic content of the current. key words: non-intrusive load monitoring (nilm), load signature, energy management

received june 3, 2016; received in revised form september 1, 2016. corresponding author: srđan đorđević, faculty of electronic engineering, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: srdjan.djordjevic@elfak.ni.ac.rs)

1. introduction

the rapid growth in energy consumption and carbon emissions has generated interest in the deployment of efficient household energy management systems. a home energy management system enables consumers to control and manage their electrical consumption according to information on the consumption of individual loads [1, 2]. therefore, in order to significantly reduce waste in residential energy consumption, a load monitoring system is necessary. appliance load monitoring is useful not only for energy saving, but also in fault detection systems, remote monitoring systems and some residential applications such as in-home activity tracking [3, 4]. there are two methods for monitoring individual electrical loads: 1. distributed direct sensing, or intrusive load monitoring, and 2. single-point sensing, or non-intrusive load monitoring (nilm). the first approach requires a complex instrumentation system to measure the energy consumption of each device separately. this solution has many practical disadvantages, such as complex installation, low scalability, low reliability and high cost due to the large number of sensors and communication devices. a more practical solution for monitoring individual loads is nilm, which uses only one sensor attached to the electric utility service entry. this method disaggregates the whole-house energy consumption into the energy usage of individual appliances. the most commonly used steady-state nilm method detects the operation of individual loads from the step changes in real and reactive power [5]. this method works well for devices with two states of operation, but it is not suitable for variable loads and multi-state appliances. another problem is the detection of loads that consume similar steady-state power, since their two-dimensional signatures overlap in the p-q plane. it is especially difficult to distinguish low-power loads (below 150 w). the current trends in electricity consumption show a rapid increase in the type and number of household appliances [6], most of which are not predictable or controlled. consequently, the task of load identification becomes more challenging. an additional problem is the inability of previous nilm algorithms to detect low-power devices, which have become more numerous and diverse. a solution to this problem has been proposed in [7], through the use of circuit-level instead of whole-house power measurements. this approach represents a trade-off between intrusive and non-intrusive load monitoring, which facilitates load disaggregation at the expense of cost and complexity. in order to improve performance in detecting small non-linear loads, several nilm methods have used current harmonics [8-11]. however, most of these techniques are not practical because they require the calculation of many harmonics in real time [12, 13]. in our previous works [14, 15], we proposed the use of distortion power for appliance identification.
the aim of these papers was to improve the identification of small nonlinear loads by analyzing three electrical quantities (active, reactive and distortion power), which are easy to obtain from metering devices. however, this method does not take into account the fact that the time variations of the voltage harmonics make the load identification imprecise. namely, distortion power, which is used to characterize the nonlinear loads, mainly consists of cross-products of voltage and current harmonics of different orders. in this work we suggest the use of harmonic current instead of distortion power in order to improve the appliance disaggregation accuracy. this paper proposes a nilm method based on the analysis of steady-state values of harmonic current and active power. the proposed approach uses only one feature to characterize the harmonic content of the current, as opposed to the previous nilm methods. we explore the effectiveness of the proposed electrical quantities in recognizing low-power loads. the remainder of the paper is organized as follows: in the next section we review some of the commonly used nilm techniques. the proposed method for load monitoring is discussed in section 3. section 4 presents the results of the application of the proposed nilm method to low-power appliances. the conclusion is given in section 5. 2. background the main stages of a nilm system are: a) data acquisition, b) feature extraction and event detection, and c) load identification. the purpose of the first stage is to gather the voltage and current measurements at an adequate sampling rate. the sampling frequency depends on the electrical characteristic used by the nilm method. generally,
the next step is to transform the raw data into a specific appliance feature, or load signature. in order to extract features it is necessary to first detect load events like switching on/off or changing state. the load signature can be derived from the steady-state signal component, which can be expressed as a finite number of sinusoids, or from the transient signal component [16-18]. dong et. al. [19] studied non-intrusive extraction of load signatures and demonstrated their technique by using the smart meter data. the final step is the estimation of the appliance-specific states by using machine learning algorithms. there are two categories of load identification algorithms: supervised learning algorithm, which requires a training procedure, and unsupervised learning algorithms, which is able to directly recognize appliance operations. the residential nilm systems usually use steady-states instead of transients load signatures. transient load monitoring systems are not suitable for residential energy disaggregation since they require expensive hardware (high-frequency energy meters) that makes them impractical. in addition, turn-off events are very difficult to detect with transient signatures. recently, wang et. al. [20] developed a new nilm method which is not limited to transient or steady state analysis and categorize the appliances according to working style. the most common method of nilm uses power measurement to characterize appliance [5]. since the load signature of this method involves two electrical parameters, the steadystate changes in active and reactive power, they are mapped to a two-dimensional signature space (p-q plane). an important characteristic of pq signature is that it can be obtained by using data from the existing smart meters. the second advantage of the power change method is that it allows automatic identification of on-off appliances. however, the method has some limitations. 
first, it requires step changes in power level to identify loads. therefore, there is a problem in detecting devices with variable power draw. although the steady-state power level between two events of on-off devices is easy to detect, the method has difficulty distinguishing these devices in some cases. false positives may occur when two or more devices change state at nearly the same time, since the sum of the power consumptions of these devices may be associated with another load. furthermore, different loads may exhibit overlapping signatures in the pq signature space, which becomes more prominent as the number of loads increases. 3. non-intrusive load monitoring by using harmonic current and active power nowadays, the number of non-linear household loads, such as energy-efficient variable speed drives and switched-mode power supplies, increases continuously. since nonlinear loads inject harmonic currents into the power system, it is promising to use harmonics as a load signature in residential nilm. methods of this kind require spectral analysis, as opposed to the power-based methods, which use features directly derived from the raw current and voltage waveforms. due to the presence of many linear loads in residential buildings, it is not possible to use harmonic content as a unique load signature for load disaggregation. harmonic currents may also be caused by transients during turn-off and turn-on events. the harmonic content of the transient waveforms varies with time and may contain frequencies that are not related to the fundamental frequency. some nilm methods are based on the frequency analysis of the transient waveforms [17, 18]. the problem with this approach is that transient detection is prone to errors. to overcome this problem some researchers have proposed the use of steady-state current harmonics to characterize the nonlinear loads [10, 11].
the online calculation of many current harmonics implies higher computational requirements [12]. consequently, a practical harmonic-based nilm system must use a limited number of harmonics. to this end, researchers have proposed various nilm methods. cole and albicki [21] developed the first harmonic-based nilm method, which is based on a calculation of the first eight odd harmonics. the authors of [22] proposed the use of only the 2nd and 3rd harmonics, while the authors of [23] considered the first sixteen odd harmonics. recently, the authors of [24] proposed a method which uses the first three odd harmonics. nonlinear loads can be characterized not only by harmonic components of the current signal, but also by other quantities such as total harmonic distortion of current (thdi), distortion power (d), harmonic current (ih), crest factor (ki) and distortion power factor (dpf). the focus of our research is on the identification of small non-linear loads. when large loads are active, smaller loads are difficult to identify due to the limited resolution of the data acquisition. therefore, we need to consider a load signature that enables identification of low-consumption appliances in the presence of high-power devices. in a typical household most of the existing load is still linear, while nonlinear loads are small. therefore the influence of small loads on the fundamental current harmonic is negligible. each of the aforementioned electrical parameters (thdi, d, dpf, ki, and ih) can be expressed in terms of the current and voltage harmonics. according to these expressions, only harmonic current, as opposed to the other quantities, is not mathematically related to the fundamental current harmonic. therefore, we propose a novel approach in which non-linear loads are characterized by steady-state harmonic current. the proposed method utilizes harmonic current in combination with the changes of real power.
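as a reminder of why the fundamental enters most of these quantities, the standard textbook definitions of two of them can be written as below. this is only a sketch: it assumes a negligible dc component and uses $I_1$ for the rms value of the fundamental, $I_h$ for the rms value of the h-th harmonic, $I_{rms}$ for the total rms current and $I_H$ for the harmonic current. the expressions the authors refer to are not reproduced in the text, so these forms are an assumption.

```latex
\begin{align}
I_H &= \sqrt{\sum_{h \ge 2} I_h^2}, \\
\mathrm{THD}_I &= \frac{I_H}{I_1}, \\
\mathrm{DPF} &= \frac{I_1}{I_{rms}} = \frac{1}{\sqrt{1 + \mathrm{THD}_I^2}} .
\end{align}
```

both thdi and dpf are normalized by $I_1$, so a large linear load that raises the fundamental changes them even when the nonlinear load itself is unchanged, whereas $I_H$ depends only on the harmonics of order two and higher.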
the current signal can be represented as the sum of the dc, fundamental and harmonic components, so that its rms value satisfies:

$$I_{rms}^2 = I_0^2 + I_1^2 + \sum_{h \ge 2} I_h^2 = I_0^2 + I_1^2 + I_H^2 \qquad (1)$$

where: $I_{rms}$ is the total rms value of the current, $I_1$ is the rms value of the fundamental harmonic, $I_0$ is the dc component of the current, $I_h$ is the rms value of the h-th harmonic component of the current signal, and $I_H$ is the harmonic current. therefore, the harmonic current can be simply expressed in terms of the effective current and the fundamental harmonic of the current as follows:

$$I_H = \sqrt{I_{rms}^2 - I_1^2 - I_0^2} \qquad (2)$$

the effective current is usually calculated by using the root mean square method as:

$$I_{rms} = \sqrt{\frac{1}{N} \sum_{n=1}^{N} i[n]^2} \qquad (3)$$

where: $n$ is the sample index, $i[n]$ is the current value measured at sampling point $n$, and $N$ is the number of samples taken during a full wave of the current.

the standard method for calculation of the harmonic currents and voltages is based on the use of the discrete fourier transform (dft). this method works well for the estimation of periodic signals in steady state. the rms value of the fundamental harmonic can be calculated using

$$I_1 = \sqrt{2}\,\sqrt{\mathrm{Re}\{\bar{I}_1\}^2 + \mathrm{Im}\{\bar{I}_1\}^2} \qquad (4)$$

where $\bar{I}_1$ is the first harmonic of the current obtained by the discrete fourier transform as:

$$\bar{I}_1 = \frac{1}{N} \sum_{n=1}^{N} i[n]\, e^{-j 2\pi n / N} \qquad (5)$$

according to (1-3), the calculation of the harmonic current is less computationally demanding than the calculation of individual harmonics. therefore, we can claim that the proposed method is more computationally efficient than other approaches that use harmonic analysis. to the best of our knowledge, the computational complexity of the harmonic-based nilm algorithms has not been explicitly stated by researchers. however, the computational cost of these methods can be determined from the algorithm used to calculate the dft (discrete fourier transform) and the number of frequencies required by the method.
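the chain from samples to harmonic current can be sketched in a few lines of python. this is a minimal illustration, not the authors' implementation: the function name `harmonic_current` is hypothetical, the dc component is estimated as the sample mean (an assumption), and exactly one full period of samples is assumed, which makes the choice of starting index (0 here, 1 in the text) irrelevant for the magnitudes.

```python
import cmath
import math


def harmonic_current(samples):
    """Return (i_rms, i_1, i_h) for one full period of current samples."""
    n_total = len(samples)
    # total rms value over one period
    i_rms = math.sqrt(sum(x * x for x in samples) / n_total)
    # dc component, estimated as the mean of the samples (assumption)
    i_0 = sum(samples) / n_total
    # single dft bin at the fundamental frequency, normalised by the sample count
    bin_1 = sum(
        x * cmath.exp(-2j * math.pi * n / n_total) for n, x in enumerate(samples)
    ) / n_total
    # rms value of the fundamental: sqrt(2) times the magnitude of the bin
    i_1 = math.sqrt(2.0) * abs(bin_1)
    # harmonic current; clamp at zero to absorb rounding errors
    i_h = math.sqrt(max(i_rms ** 2 - i_1 ** 2 - i_0 ** 2, 0.0))
    return i_rms, i_1, i_h


if __name__ == "__main__":
    # synthetic test signal: 10 a peak fundamental plus 2 a peak third harmonic
    n_samples = 200
    wave = [
        10.0 * math.sin(2 * math.pi * n / n_samples)
        + 2.0 * math.sin(2 * math.pi * 3 * n / n_samples)
        for n in range(n_samples)
    ]
    print(harmonic_current(wave))
```

only one dft bin is evaluated, so the cost grows linearly with the number of samples per period; no other harmonic has to be computed to obtain the harmonic current.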
in most nilm methods, steady-state current harmonics are obtained by applying the fft [11, 21, 22]. however, when only a few dft frequencies are needed, as in the proposed method, it is more suitable to use the goertzel algorithm. the main advantage of the goertzel algorithm over the commonly used fft algorithms is that fewer mathematical operations are required for harmonic analysis. the goertzel algorithm has linear complexity: for n data points and m required harmonics it is o(mn), while fft algorithms have o(n log2 n) complexity. it is clear that for m

= 0) and (x < height) and (y >= 0) and (y < width) then {if coordinates are valid}
      rptr^ := x * width + y {store offset to transformation array}
    else
      rptr^ := -1; {store -1}
    inc(rptr); {increment pointer}
    inc(ptr1, 2); {increment pointer}
    inc(ptr2, 2); {increment pointer}
  end;
end;

the following subsections provide implementation details for the main parts of the proposed approach.

4.1. hough transform implementation

in order to achieve a fast implementation of the hough transform, pointer arithmetic is used. for both the linear and the matrix representation of the image, the classical implementation is not suitable due to slow indexed access to the array elements. using the linear image representation combined with pointer arithmetic, direct memory access is achieved. the following code represents the pascal implementation of the standard hough transform based on pointer arithmetic, which is capable of determining skew angles in the range from -90 to 90 degrees. it should be mentioned that, by using different parameters in the following implementation, it is possible to estimate angles in different ranges, which is exploited for obtaining the experimental results.
dmax := trunc(sqrt(width * width + height * height)) + 1; {maximal allowed line length}
teta := 90; {angle which defines the range of estimation angle}
ptr1 := @imagebinary[0]; {pointer to binary image which is being processed}
ptr2 := @posmap[0]; {pointer to support lookup table}
ptr3 := @votingmatrix[0]; {pointer to voting matrix}
for i := 0 to count - 1 do {main loop}
begin
  if ptr1^ = 1 then {is it black pixel?}
  begin
    j := ptr2^; {get x value from linear offset lookup table}
    inc(ptr2); {increment pointer}
    l := ptr2^; {get y value from linear offset lookup table}
    inc(ptr2); {increment pointer}
    ptr5 := @sincos[0]; {pointer to lookup table of trigonometric values sin}
    ptr6 := @sincos[1]; {pointer to lookup table of trigonometric values cos}
    for k := 0 to teta do {loop through all angles}
    begin
      d := trunc(j * ptr5^ + l * ptr6^); {determine the line length for current parameters}
      if (d >= 0) and (d < dmax) then {is it in range?}
      begin
        ptr4 := ptr3; {get the starting pointer of voting matrix}
        inc(ptr4, d * teta + k); {increment the pointer by determined offset}
        ptr4^ := ptr4^ + 1; {increment the voting matrix value}
      end;
      inc(ptr5); {increment pointer}
      inc(ptr6); {increment pointer}
    end;
  end
  else
    inc(ptr2, 2); {increment pointer}
  inc(ptr1); {increment pointer}
end;
result := maxind(votingmatrix) - 90; {final estimated angle}

4.2. rotation implementation

image rotation is performed using the previously described ultra-fast architecture for geometrical image transformations. this approach proved to be very efficient and performs almost 50 times faster than the standard approach to image rotation. in order to achieve the highest computational performance, a highly optimized low-level machine code implementation of the image rotation is used. the following listing shows the machine routine for image rotation:

asm
  pushad                 {push all registers to stack}
  mov ecx,count          {number of pixels to process}
  mov esi,rptr           {pointer to r transformation array}
  mov ebx,imagesrcptr    {source image pointer}
  mov edi,imagedstptr    {destination image pointer}
@main:                   {main loop}
  lodsd                  {load current offset from r transformation array}
  mov edx,eax            {save current offset}
  or eax,eax             {is it -1?}
  js @init               {if true, jump to label init}
  shl edx,2              {offset * 4}
  mov eax,[edx+ebx]      {calculate final offset and load value from source to eax}
  stosd                  {store loaded value from eax to destination}
  dec ecx                {decrement counter}
  jnz @main              {if not zero, loop again through ecx}
  jmp @ex                {else, jump to ex label}
@init:                   {label init}
  mov eax,white_color    {store white color definition to eax}
  stosd                  {store value from eax to destination}
  dec ecx                {decrement counter}
  jnz @main              {if not zero, loop again through ecx}
@ex:                     {label ex}
  popad                  {pop all registers from stack}
end;

5. experiments

testing of the proposed optimized skew correction approach was performed on a pc with an amd quad-core processor running at 3.1 ghz and 4 gb of ram. the performance of the proposed approach in terms of processing time is analyzed and compared with related work. besides the processing time, skew estimation accuracy results are also provided. to demonstrate the performance of the proposed approach and the importance of skew correction for the character segmentation process, nikola tesla’s documents from the “nikola tesla museum” in belgrade are used. the performance of the proposed approach is compared with the results provided in [18]. table 1 shows a comparison of rotation processing time for the proposed approach and the algorithms analyzed in [18].

table 1 comparison of processing time (ms) for image rotation. the first four algorithm columns are from c. singh et al. [18]; each holds the two processing times reported for that algorithm (forward and inverse rotation).

image dimensions (px) | float rotation | integer rotation | fast implementation | bresenham’s line like algorithm | proposed approach
1249x1249 | 15, 172 | 15, 78 | 15, 78 | 16, 31 | 6
4148x4068 | 218, 1875 | 156, 844 | 157, 841 | 235, 313 | 64

the results show the efficiency of the rotation algorithm proposed as a part of the skew correction approach. c. singh et al. provided results for different algorithms, including results for forward and inverse rotation for each analyzed rotation algorithm. an important fact is that the proposed rotation approach does not depend on the complexity of the calculations, so both forward and inverse rotation are performed with the same processing time. taking this fact into consideration, the results show that the proposed approach is faster than any of the rotation algorithms analyzed in [18]. besides the rotation processing time, other important aspects of skew correction approaches are the hough transform processing time and the skew estimation accuracy. c. singh et al. used the bag algorithm in the process of skew estimation and compared the overall processing time with the standard approach, which does not use the bag algorithm. table 2 shows the comparison of the results of the proposed approach with the results provided in [18]. the results given in table 2 show that the proposed approach is faster than the classical implementation of the hough transform and slower than the implementation which exploits the bag algorithm. since the proposed approach performs approximately 2.5 times faster than the classical implementation, this is a significant improvement. also, since the proposed approach exploits the ultra-fast architecture for image transformation, the bag algorithm could be used in combination with the proposed approach to provide even faster results than those already presented.
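the skew estimate whose accuracy is discussed here comes from hough voting. a minimal python sketch of that idea follows. it is not the authors' pointer-based pascal routine: the function name `estimate_skew`, the dictionary accumulator, the one-degree angle step and the mathematical (y pointing up) coordinate convention are all assumptions made for illustration.

```python
import math
from collections import Counter


def estimate_skew(points, angles_deg=range(-90, 90)):
    """Return the candidate line direction (degrees) with the fullest accumulator cell.

    points: iterable of (x, y) coordinates of black pixels.
    """
    votes = Counter()
    # lookup table of (angle, sin, cos), computed once for all pixels
    trig = [
        (a, math.sin(math.radians(a)), math.cos(math.radians(a)))
        for a in angles_deg
    ]
    for x, y in points:
        for a, s, c in trig:
            # signed distance from the origin of the line through (x, y)
            # whose direction makes angle a with the x axis
            rho = round(-x * s + y * c)
            votes[(rho, a)] += 1
    (_rho, angle), _count = votes.most_common(1)[0]
    return angle


if __name__ == "__main__":
    # synthetic "text line": pixels along a straight line tilted by 10 degrees
    slope = math.tan(math.radians(10))
    line = [(x, round(x * slope + 5)) for x in range(200)]
    print(estimate_skew(line))
```

the sketch visits every candidate angle for every black pixel, which is the cost that both the lookup tables of the proposed approach and the bag algorithm of [18] aim to reduce.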
results provided in table 1 and table 2 were obtained using document images with the same characteristics as the images used in related work [18]. as for the importance of skew correction for the character segmentation system, the character segmentation system behind the proposed skew correction approach proved to be very sensitive to document skew. document images skewed by random angles were used for obtaining the results. it is shown that document images skewed by even a small angle, greater than 2°, greatly decrease the percentage of successful character segmentation.

table 2 comparison of overall processing time and skew estimation accuracy

image dimensions (px) | % of black pixels | skew angle | c. singh et al. processing time, bag (ms) | c. singh et al. processing time, no bag (ms) | c. singh et al. difference in skew (bag) | c. singh et al. difference in skew (no bag) | proposed approach processing time (ms) | proposed approach difference in skew
4168x4088 | 7.16 | 0 | 593 | 3875 | 0.00 | 0.000 | 1548 | 0.00
4308x4231 | 6.6 | 2 | 610 | 3969 | 0.06 | -0.855 | 1589 | -0.35
4508x4436 | 6.1 | 5 | 704 | 3968 | 0.079 | 0.079 | 1585 | -0.15
4815x4750 | 5.3 | 10 | 797 | 4025 | 0.08 | 0.08 | 1612 | -0.10
4981x4921 | 4.9 | 13 | 860 | 4079 | -0.48 | -0.65 | 1631 | -0.25
5272x5222 | 4.42 | 19 | 953 | 4188 | -0.19 | 0.195 | 1654 | 0.12
5654x5624 | 3.8 | 30 | 1046 | 4234 | 0.02 | -0.26 | 1687 | 0.20
5838x5838 | 3.57 | 45 | 1234 | 4484 | 0.00 | 0.00 | 1711 | 0.00
4088x4168 | 7.1 | 90 | 522 | 3781 | 0.00 | 0.00 | 1536 | 0.00
5267x5315 | 4.3 | 110 | 985 | 4134 | -0.43 | -0.43 | 1644 | -0.47
5739x5759 | 3.6 | 150 | 1374 | 4655 | -0.32 | -0.19 | 1720 | -0.20
1904x2588 | 13.88 | 0 | 280 | 2034 | 0.20 | 0.00 | 478 | 0.00
1993x2653 | 14.07 | 2 | 250 | 2281 | -0.73 | -1.11 | 515 | -0.88
2122x2744 | 11.69 | 5 | 235 | 2094 | -0.43 | -1.19 | 487 | -0.65
2324x2879 | 9.47 | 10 | 281 | 2124 | 0.49 | -1.16 | 492 | -0.45
2437x2950 | 9.47 | 13 | 327 | 2125 | -0.70 | -0.48 | 489 | -0.32
2643x3067 | 8.4 | 19 | 328 | 2140 | -1.14 | -0.57 | 493 | -0.60
2943x3193 | 7.24 | 30 | 375 | 2172 | -0.95 | -0.95 | 500 | -0.95
3176x3176 | 6.75 | 45 | 407 | 2187 | -1.14 | -0.64 | 508 | -0.42
2588x1904 | 13.88 | 90 | 171 | 2078 | 0.00 | 0.00 | 481 | 0.00
3083x2674 | 8.25 | 110 | 281 | 2141 | -0.65 | -0.43 | 496 | -0.71
2943x3193 | 7.24 | 150 | 375 | 2172 | 0.96 | 0.96 | 504 | 0.90

this characteristic of the character segmentation algorithm is due to its nature. the graph showing the dependence of the successful character segmentation percentage on the skew angle is given in fig. 7.

fig. 7 character segmentation results as a function of document skew angle

this graph shows that even a small document skew represents a big problem for the character segmentation algorithm. visual results of the extended character segmentation algorithm are provided using nikola tesla’s documents from the “nikola tesla museum” in belgrade. these results are shown in fig. 8 a), b), c), and d).

fig. 8 extended character segmentation results for skewed and de-skewed original nikola tesla’s documents: a) first skewed document, b) first de-skewed document, c) second skewed document, d) second de-skewed document

the character segmentation results shown in fig. 8 confirm the previous conclusion based on the graph results. based on the scanned document images from the “nikola tesla museum”, it is clear that further character segmentation is impossible without skew correction performed at a very early stage of the character segmentation process.

6. conclusions

in this paper, an optimized hough transform based approach for skew correction is presented as an essential pre-processing part of a character segmentation system. the character segmentation system is initially designed for machine-typed documents, but can be used for machine-printed documents as well. in section 2 the theoretical background of the hough transform is provided. in section 3 the flowchart of the complete character segmentation algorithm is shown and a description of the proposed skew correction approach is provided. the proposed approach uses the ultra-fast generalized image transformation architecture to achieve high computational performance.
in order to achieve a fast implementation, linear image representation is used. the ultra-fast image transformation architecture is used for the implementation of the image rotation algorithm, which is implemented using highly optimized low-level machine code. the standard hough transform is implemented using pointer arithmetic. for both implementations, support lookup tables are used. in section 5, the experimental results for the proposed approach in terms of processing time and estimation accuracy are given and compared with existing approaches. the results which show how skew correction affects the character segmentation process are also given. based on the results, the proposed approach performs approximately 2.5 times faster than the classical implementation used in [18], but it is slower than the skew correction approach which exploits the bag algorithm. although the proposed approach does not give the best results, it could be used in combination with the bag algorithm to provide even better results than the existing ones. the estimation accuracy results show that the proposed approach gives results as good as those provided in the related work. on the other hand, it is clear that character segmentation of skewed documents gives bad results without skew correction. since the character segmentation system is initially designed for the needs of the “nikola tesla museum”, namely for the conversion of nikola tesla’s scanned documents to electronic form, the original nikola tesla’s documents from the “nikola tesla museum” are used for testing the performance of the complete character segmentation algorithm. also, the official evaluation of the performance of the complete character segmentation system will be carried out at the “nikola tesla museum”.
our future work will be focused on the automation of the manual parts of the character segmentation algorithm, improving its performance, the optimization of the complete algorithm including the proposed skew correction approach, and the integration of the character segmentation system into a complete real-time ocr system. acknowledgments: this paper is supported by the ministry of education, science and technological development of the republic of serbia (project iii44006-10), mathematical institute of serbian academy of science and arts (sanu) and museum of nikola tesla (providing original typewritten documents of nikola tesla). references [1] s. s. cvetković, s. v. nikolić and s. ilić, “effective combining of color and texture descriptors for indoor-outdoor image classification”, facta universitatis: electronics and energetics, vol. 27, no. 3, pp. 399-410, 2014. [2] j. j. hull, “document image skew detection: survey and annotated bibliography”, series in machine perception and artificial intelligence, vol. 29, pp. 40-66, 1998. [3] h. s. baird, “the skew angle of printed documents”, document image analysis, pp. 204-208, 1995. [4] a. papandreou et al., “icdar2013 document image skew estimation contest (disec’13)”, in proceedings of the 12th international conference on document analysis and recognition (icdar), 2013. [5] a. d. bagdanov and j. kanai, “evaluation of document image skew estimation techniques”, in spie proceedings 2660: document recognition iii, 1996, pp. 343-354. [6] p. mukhopadhyay and b. b. chaudhuri, “a survey of hough transform”, pattern recognition, vol. 48, no. 3, pp. 993-1010, 2015. [7] r. o. duda and p. e. hart, “use of the hough transformation to detect lines and curves in pictures”, communications of the acm, vol. 15, no. 1, pp. 11-15, 1972. [8] s. n. srihari and v. govindaraju, “analysis of textual images using the hough transform”, machine vision and applications, vol. 2, no. 3, pp.
141-153, 1989. [9] o. g. okun, “geometrical approach to skew detection for documents containing the latin/cyrillic characters”, in proceedings of the spie, vol. 3811: vision geometry viii, 1999, pp. 357-365. [10] a. boukharouba, “a new algorithm for skew correction and baseline detection based on the randomized hough transform”, journal of king saud university computer and information sciences, vol. 29, no. 1, pp. 29-38, 2016. [11] d. kumar and d. singh, “modified approach of hough transform for skew detection and correction in documented images”, international journal of research in computer science, vol. 2, no. 3, pp. 37-40, 2012. [12] f. stahlberg and s. vogel, “document skew detection based on hough space derivatives”, in proceedings of the 13th international conference on document analysis and recognition, 2015. [13] v. shapiro, “accuracy of the straight line hough transform: the non-voting approach”, computer vision and image understanding, vol. 103, no. 1, pp. 1-21, 2006. [14] s. guo et al., “an improved hough transform voting scheme utilizing surround suppression”, pattern recognition letters, vol. 30, no. 13, pp. 1241-1252, 2009. [15] b. gatos, n. papamarkos and c. chamzas, “skew detection and text line position determination in digitized documents”, pattern recognition, vol. 30, no. 9, pp. 1505-1519, 1997. [16] u. pal and b. b. chaudhuri, “an improved document skew angle estimation technique”, pattern recognition letters, vol. 17, no. 8, pp. 899-904, 1996. [17] a. amin et al., “fast algorithm for skew detection”, in proceedings of the spie 2661: real-time imaging, 1996, pp. 65-77. [18] c. singh, n. bhatia and a. kaur, “hough transform based fast skew detection and accurate skew correction methods”, pattern recognition, vol. 41, no. 12, pp. 3528-3546, 2008. [19] l. a. f. fernandes and m. m. oliveira, “real-time line detection through an improved hough transform voting scheme”, pattern recognition, vol. 41, no. 1, pp. 299-314, 2008. [20] l. a. 
najman, “using mathematical morphology for document skew estimation”, in proceedings of the spie 5296: document recognition and retrieval xi, 2003, pp. 182-192.
[21] g. bessho, k. ejiri and j. f. cullen, “fast and accurate skew detection algorithm for a text document or a document with straight lines”, in proceedings of the spie 2181: document recognition, 1994, pp. 133-141.
[22] j. van beusekom and t. m. breuel, “resolution independent skew and orientation detection for document images”, in proceedings of the spie 7247: document recognition and retrieval xvi, 2009, pp. 72470k-72470k-8.
[23] r. kapoor, d. bagai and t. s. kamal, “a new algorithm for skew detection and correction”, pattern recognition letters, vol. 25, no. 11, pp. 1215-1229, 2004.
[24] h. liu et al., “skew detection for complex document images using robust borderlines in both text and non-text regions”, pattern recognition letters, vol. 29, no. 13, pp. 1893-1900, 2008.
[25] y. cao, s. wang and h. li, “skew detection and correction in document images based on straight-line fitting”, pattern recognition letters, vol. 24, no. 12, pp. 1871-1879, 2003.
[26] j. fabrizio, “a precise skew estimation algorithm for document images using knn clustering and fourier transform”, in proceedings of the ieee international conference on image processing (icip), 2014.
[27] a.
chan-hon-tong, c. achard and l. lucat, “simultaneous segmentation and classification of human actions in video streams using deeply optimized hough transform”, pattern recognition, vol. 47, no. 12, pp. 3807-3818, 2014. [28] c. tu et al., “vehicle position monitoring using hough transform”, in proceedings of the international conference on electronic engineering and computer science, vol. 4, pp. 316-322. [29] r. varun et al., “face recognition using hough transform based feature extraction”, procedia computer science, vol. 46, pp. 1491-1500, 2015. [30] v. vuĉković and s. spasić, “3-d stereoscopic modeling of the tesla’s long island”, facta universitatis: electronics and energetics, vol. 29, no. 1, pp. 113-126, 2016. facta universitatis series: electronics and energetics vol. 34, no 4, december 2021, pp. 483-498 https://doi.org/10.2298/fuee2104483k © 2021 by university of niš, serbia | creative commons license: cc by-nc-nd original scientific paper analysis of rotor asymmetry fault in three-phase line start permanent magnet synchronous motor mahdi karami1*, norman mariun2, mohd zainal abidin ab kadir2, mohd amran mohd radzi2, norhisam misron2 1department of electrical engineering, jam branch, islamic azad university, jam, iran 2department of electrical and electronic engineering, faculty of engineering, universiti putra malaysia, serdang, malaysia abstract. this article proposed a detection scheme for three-phase line start permanent magnet synchronous motor (lspmsm) under different levels of static eccentricity fault. finite element method is used to simulate the healthy and faulty lspmsm with different percentages of static eccentricity. an accurate laboratory test experiment is performed to evaluate the proposed index. effects of loading condition on lspmsm are also investigated. the fault related signatures in the stator current are identified and an effective index for lspmsm is proposed. 
the simulation and experimental results indicate that the low frequency components are an effective index for detection of static eccentricity in lspmsm.
key words: line start permanent magnet synchronous motor, static eccentricity, current signature analysis, finite element method, fault detection
received january 1, 2021; received in revised form september 17, 2021 corresponding author: mahdi karami department of electrical engineering, jam branch, islamic azad university, jam, iran e-mail: mehdikarami.en@gmail.com
1. introduction
the line start permanent magnet synchronous motor (lspmsm) is a hybrid electric motor that combines the structures of the induction motor (im) and the permanent magnet synchronous motor (pmsm): rotor bars provide the starting torque that brings the motor into synchronism, and permanent magnets generate the synchronous torque at steady state. the lspmsm starts with induction characteristics, using the rotor bar torque against the opposing permanent magnet torque (braking torque). once the speed approaches synchronous speed, a synchronization process begins and the motor pulls into the synchronous state, in which no currents are induced in the rotor bars except those due to harmonic fields. lspmsms have been introduced as a viable alternative to ims [1-3]. environmental concerns such as pollution, greenhouse gases and landfills have been major issues in recent years [4], which has drawn considerable attention to improving the efficiency of power systems and electrical machines [5,6]. unpredictable faults create unexpected problems in electrical motors because of the many stress mechanisms, such as thermal, electrical, mechanical and environmental effects during operation, that ultimately result in efficiency reduction, serious damage and breakdown.
the resulting economic and safety consequences justify the importance of fault detection techniques for preventive maintenance [7]. faults in lspmsms can be classified into different types as indicated in fig. 1. mechanical faults are the most likely faults in electrical motors and account for 60% of all faults, while 80% of mechanical faults are due to eccentricity between stator and rotor, which is why much research effort is still devoted to eccentricity in electrical motors. eccentricity in the lspmsm could cause a fault cycle that grows with the eccentricity percentage and leads to a decrease in motor efficiency. despite the significance of eccentricity in electrical motors, few studies have been reported on eccentricity fault detection in lspmsm. it is worth mentioning that the type of electrical motor has a significant influence on the fault detection procedure [8]; hence, precise eccentricity fault diagnosis in lspmsm can help preserve the motor lifetime and maintain its high-efficiency performance.
fig. 1 fault classification in lspmsm
the initial study on static eccentricity detection in lspmsm is reported in [3]. a three-phase lspmsm is modeled using the finite element method (fem). the stator current signal of the lspmsm under static eccentricity is analyzed in the frequency domain and the fault index is determined. however, the loading effect on fault detection is not considered. the investigation then continued and a cost-effective, non-invasive detection strategy was proposed for mixed eccentricity fault diagnosis in lspmsm; the examination was carried out through simulation and experimental work, and an efficient frequency pattern as well as a detection criterion were specified [9]. a mathematical model of lspmsm under static eccentricity is developed in [10]. the proposed model is verified using fem. the performance of the eccentric lspmsm is analyzed and the time variations of stator current, speed and torque are investigated.
a pmsm has been simulated with eccentricity using fem to calculate the stator current signal, in addition to an experimental investigation [11]. barbour and thomson [12] analyzed the influence of rotor shapes on static eccentricity in im using motor current signature analysis (mcsa). fem has been employed to simulate the stator current. it was concluded that the rotor slot design has a remarkable impact on the harmonic components of the stator current with static eccentricity, with semi-closed slots showing a higher increase in the presence of static eccentricity. these researchers then showed that rotor slot skewing reduces the static eccentricity harmonic components of the stator current signal in im [13]. nandi et al. worked on the detection of eccentricity fault in three-phase im by measuring the high frequency harmonic components in the stator current signal [14]. the amplitude of sideband components around the principal slot harmonics (psh) is examined for eccentricity detection. these researches proposed two feature formulas for detection of eccentricity fault in pmsm using mcsa. in [15] the same index as proposed in [16] has been evaluated for static eccentricity recognition in a reluctance synchronous motor by predicting a proper harmonic component in the stator current signal through modeling and experiments. in view of the above studies, as well as the ease and accessibility of measuring the current spectrum via a current clamp, mcsa is the most widely accepted noninvasive technique; it relies on tracking the variation of eccentricity-related harmonics in the motor current signal. this article proposes a method to identify the static eccentricity fault signature for three-phase lspmsm. the main contribution is to examine the proposed index through simulation and experimental analyses and to determine its effectiveness.
the stator current spectrum at steady state operation is used as a reference signal. a laboratory test setup is developed to measure the motor current signal non-invasively for both healthy and faulty conditions with different degrees of fault and loading levels. in addition, the stator current signal of the lspmsm is calculated using fem. the simulated and measured current signals are processed in the frequency domain using the power spectral density (psd) technique. it is shown that the amplitudes of the fault-related components increase with fault severity, while they decrease at higher load levels.
2. eccentricity fault
eccentricity fault in electrical motors is the non-uniform air-gap distribution between stator and rotor caused by displacement of one or more of the rotor symmetry axis, the stator symmetry axis and the rotor rotation axis from the center of the motor. it is classified into three types: static, dynamic and mixed eccentricity. static eccentricity is defined as the condition in which the rotor symmetrical axis (cr) is concentric with the rotor rotational axis (cg) but both are displaced with respect to the stator symmetrical axis (cs). this results in a non-uniform air-gap between stator and rotor in which the position of the minimum (and maximum) air-gap relative to the stator is fixed and time-independent. several forces and conditions lead to static eccentricity fault in electric motors, such as shaft deflection, motor housing imperfections, wrong placement of the rotor or stator during assembly or after maintenance, an elliptical stator core, incorrect bearing positioning, bearing deterioration, end-shield misalignment, excessive tolerance, rotor weight or pressure of interlocking ribbon. the consequences of static eccentricity can seriously damage the motor, especially at advanced degrees of fault.
static eccentricity fault causes static unbalanced magnetic pull (ump) in the radial direction across the motor, rubbing between rotor and stator, abnormal noise and vibration, damage to the rotor and stator laminations, destruction of the windings, rotor deflection, a bent shaft and bearing defects. the magnetomotive force (mmf) of the stator, acting through the non-uniform air-gap caused by eccentricity, introduces permeance harmonics into the electromotive force (emf) that is induced in the rotor. the same applies to the emf induced in the stator by the rotor mmf. thus, the air-gap flux resulting from permeance and mmf generates an air-gap magnetic field composed of fundamental components, stator and rotor mmf harmonics, stator and rotor slot permeance harmonics, eccentricity permeance harmonics and saturation permeance harmonics. the lspmsm starts to run by rotor bar torque against the permanent magnet braking torque. when the rotor rotates close to the synchronous speed, the motor is driven into synchronization and steady state operation [3]. the air-gap permeance including stator slotting and a smooth rotor can be calculated for lspmsm at steady state operation as follows:
$$P_{sl} = \sum_{k_{sl}=0}^{\infty} P_{k_{sl}} \cos(k_{sl} S_1 \theta) \tag{1}$$
where $k_{sl}$ is an integer, $P_{k_{sl}}$ is the stator slotting permeance, $S_1$ is the number of stator slots and $\theta$ is the space variable. the saturation permeance of this machine is expressed as:
$$P_{sat} = \sum_{k_{sat}=0}^{\infty} P_{k_{sat}} \cos[k_{sat}(2\omega t - 2P\theta)] \tag{2}$$
where $k_{sat}$ is an integer, $P_{k_{sat}}$ is the specific permeance of saturation, $P$ is the number of main pole pairs, $\omega$ is the angular supply frequency and $t$ is the time variable. since the non-concentric air-gap in static eccentricity is time invariant, the permeance due to this fault, considering a smooth rotor and stator, will be:
$$P_{SE} = \sum_{k_{SE}=0}^{\infty} P_{k_{SE}} \cos(k_{SE} \theta) \tag{3}$$
where $k_{SE}$ is an integer and $P_{k_{SE}}$ is the specific permeance related to static eccentricity.
accordingly, the total permeance can be computed with the following expression:
$$P_T(t) = \sum_{k_{sl}=0}^{\infty} \sum_{k_{sat}=0}^{\infty} \sum_{k_{SE}=0}^{\infty} P_{k_{sl},k_{sat},k_{SE}} \cos\left[\pm\left(\frac{2k_{sat}}{P}\right)\omega t + \left(k_{sl} S_1 \pm 2k_{sat} P \pm k_{SE}\right)\theta\right] \tag{4}$$
then, the air-gap flux density of lspmsm at steady state can be calculated using ampere’s circuital law as:
$$B(t) = P_T(t) \int \mu_0 j_s(\theta, t)\, d\theta \tag{5}$$
where $\mu_0$ is the vacuum permeability and $j_s$ is the current density on the stator interior surface:
$$j_s(\theta, t) = \sum_{k_j=1}^{\infty} J_s \sin(k_j \omega t - P\theta) \tag{6}$$
where $k_j$ is an integer. substituting expressions (4) and (6) into (5) results in:
$$B(t) = \frac{\mu_0 J_s}{P} \sum_{k_j=1}^{\infty} \cos(k_j \omega t - P\theta) \sum_{k_{sl}=0}^{\infty} \sum_{k_{sat}=0}^{\infty} \sum_{k_{SE}=0}^{\infty} P_{k_{sl},k_{sat},k_{SE}} \cos\left[\pm\left(\frac{2k_{sat}}{P}\right)\omega t + \left(k_{sl} S_1 \pm 2k_{sat} P \pm k_{SE}\right)\theta\right] \tag{7}$$
expression (7) can be rewritten in the following form:
$$B(t) = \frac{\mu_0 J_s P_{k_{sl},k_{sat},k_{SE}}}{P} \sum_{k_{sl}=0}^{\infty} \sum_{k_{sat}=0}^{\infty} \sum_{k_{SE}=0}^{\infty} \cos\left[\left(k_j \pm \frac{2k_{sat}}{P}\right)\omega t + \left(\pm k_{sl} S_1 \pm 2k_{sat} P \pm k_{SE} - P\right)\theta\right] \tag{8}$$
eventually, expression (8) can be finalized as follows:
$$B(t) = \sum_{\nu,\mu} B_{\nu,\mu} \cos\left[\left(k_j \pm \frac{\nu}{P}\right)\omega t \pm \mu\theta\right] \tag{9}$$
with $\nu = 2k_{sat}$ and $\mu = \pm k_{sl} S_1 \pm 2k_{sat} P \pm k_{SE} - P$. the mmf of the stator is expressed as:
$$F_s = \frac{B(t)}{P_T(t)} \tag{10}$$
the stator current involving space and time harmonics is as follows:
$$i(t) = \sum_{\nu,\mu} I_{\nu,\mu} \cos\left[\left(k_j \pm \frac{\nu}{p}\right)\omega t \pm \mu\theta\right] \tag{11}$$
expression (11) is simplified for sinusoidal supply voltage as:
$$i(t) = \sum_{\nu,\mu} I_{\nu,\mu} \cos\left[\left(1 \pm \frac{\nu}{p}\right)\omega t \pm \mu\theta\right] \tag{12}$$
where $\nu = 1, 3, 5, \ldots$.
3. experimental setup
the experimental test rig is shown in fig. 2(a). the motor used for both healthy and faulty cases is a 1-hp, 4-pole, three-phase lspmsm with the specifications given in table 1. the motor is fed directly from the grid power supply, the stator windings are y-connected and the nominal current is 1.28 a. the lspmsm is coupled to a torque/speed sensor in order to measure the torque under different operating conditions.
on the other side, a mechanical load is provided by a dc-excited magnetic powder brake (mpb) coupled to the torque/speed sensor. a specific load torque can be applied to the motor shaft by controlling the input dc voltage of the mpb. using an mpb instead of a generator as the mechanical load has some advantages, such as stability of the load torque. the schematic of the test rig is displayed in fig. 2(b), which gives brief information about the lspmsm, mpb, current transducer and data acquisition system. this system is used to sample the stator current noninvasively while the motor operates in steady state. the stator current is measured using current probe model pico-pp264, and data acquisition is carried out by a picoscope 4424, a high-resolution usb-connected system that provides an 80 MS/s adc on each channel with 1% accuracy. only one phase current signal needs to be recorded for the detection process. the recorded signals are analyzed by a computer-based signal processing program.
fig. 2 (a) experimental test rig (b) schematic view of system
table 1 specification of three-phase lspmsm
rated output power (hp) | 1
rated voltage (v) | 415
rated frequency (hz) | 50
number of poles | 4
rated speed (rpm) | 1500
connection | y
air-gap length (mm) | 0.30
remanent flux density of magnets (t) | 1.235
3.1. noninvasive implementation of static eccentricity in three-phase lspmsm
hitherto, different approaches have been proposed to create eccentricity fault in im and pmsm for the purpose of fault diagnosis. some of the introduced approaches invasively modify the configuration of the motor and damage it permanently.
in some cases, changing the fault severity is impossible, while others require an accurate measuring device to specify the fault percentage. the noninvasive method used here makes the lspmsm temporarily eccentric without any costly measuring device; the fault severity can be easily changed and, afterwards, the motor can be returned to its normal condition for further use. to this end, the original bearings of the lspmsm are replaced by a new set of bearings with larger inner diameter and smaller outer diameter, creating free space between the shaft and the bearings and also between the bearings and the housings of the end shields. the fault is created by filling these free spaces with specially made concentric or non-concentric inner and outer rings. a screw prevents the inner ring from sliding on the rotor shaft and inside the bearing. meanwhile, a slot is made in the outer ring to fix it properly in the housing. protective layers are designed for the inner ring to keep the outer ring inside the housing under any possible misalignment and to avoid friction and rubbing between the inner ring and the bearing or outer ring. the prototype of the eccentricity implementation is shown in fig. 3.
fig. 3 inner, outer rings and new bearing before and after assembly
static eccentricity is created by fixing the concentric inner rings between the new bearings and the shaft on both ends of the lspmsm and non-concentric outer rings between the new bearings and the housings of both end shields. the non-concentric outer ring is made by offsetting the center of the ring by a specified amount, so the severity of static eccentricity can be varied by fitting outer rings with different offsets. this strategy is followed to create 20%, 35% and 50% static eccentricity in lspmsm. 4.
fem based simulation of three-phase lspmsm fem based analysis provides an accurate tool for modeling of electrical motors since it includes material characteristic, nonlinearity and complexities. the reliability and accuracy of fem for analyzing the electric motor performance is more than other method such as winding function theory (wft). fem is a precise method because the inductances can be directly calculated by field analysis which leads to various conditions such as slot effects, saturation and etc. to be spontaneously taken into account [9]. the three-phase lspmsm is simulated in 2-d environment based on fem utilizing maxwell 2-d software to calculate the stator current spectrum with eccentricity effect. the simulations are performed at the same condition of the experimental study such as motor specification, static eccentricity percentages, sampling frequency, sampling time and frequency resolution. the geometrical and physical complexities of lspmsm like stator, hybrid rotor and shaft, stator winding distribution, non-uniform permeance of air-gap due to eccentricity, nonlinear characteristics of stator and rotor cores and permanent magnets materials are considered for modeling study. the 2-d model of lspmsm is described in fig. 4. (a) (b) fig. 4 cross-section of three-phase lspmsm: (a) geometric configurations and (b) plotting of the mesh three-phase sinusoidal voltage is injected to stator windings. a solver with time integration method based on backward euler is employed to solve the steady state current of lspmsm. magnetic field distribution in lspmsm is calculated by fem and the stator current signal is computed. the proposed lspmsm is simulated with different degrees of static eccentricity by shifting the center of rotor from the center of stator while the rotor rotates concentrically with its own center as mentioned in section 2. 
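as a rough geometric illustration of what the static eccentricity percentages mean, the sketch below uses the common first-order approximation g(θ) ≈ g0(1 − δ cos θ) for the air-gap length of a statically eccentric rotor, with g0 = 0.30 mm taken from table 1. the approximation, the function names and the choice of θ = 0 as the minimum-gap position are assumptions made for illustration; they are not taken from the fem model described here.

```python
import math

G0_MM = 0.30  # nominal air-gap length from table 1 (mm)


def rotor_offset_mm(ecc_percent, g0_mm=G0_MM):
    """Rotor-centre displacement corresponding to a given eccentricity percentage."""
    return g0_mm * ecc_percent / 100.0


def air_gap_mm(theta_rad, ecc_percent, g0_mm=G0_MM):
    """First-order air-gap length at angle theta (assumed model, minimum gap at theta = 0)."""
    delta = ecc_percent / 100.0
    return g0_mm * (1.0 - delta * math.cos(theta_rad))


for ecc in (0, 20, 35, 50):
    g_min = air_gap_mm(0.0, ecc)
    g_max = air_gap_mm(math.pi, ecc)
    print(f"{ecc:2d}%: offset={rotor_offset_mm(ecc):.3f} mm, "
          f"g_min={g_min:.3f} mm, g_max={g_max:.3f} mm")
```

under this assumed model, the minimum and maximum gaps stay at fixed angular positions, which is the time-independent property that distinguishes static from dynamic eccentricity.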
the non-uniform air-gap magnetic field due to static eccentricity in lspmsm produces asymmetrical current, torque and speed. several harmonic components appear in the magnetic flux and the stator current of the motor. detecting these harmonic components in the stator current provides a precise fault signature for static eccentricity diagnosis in lspmsm.
5. stator current signature analysis in three-phase lspmsm
static eccentricity fault detection in lspmsm is investigated by analyzing the stationary spectrum of the stator current in the frequency domain using the power spectral density (psd) technique. the current signal is recorded with a sampling frequency of 5 khz over a total sampling time of 6.5 s, which gives a frequency resolution of about 0.15 hz (1/6.5 s). in order to evaluate static eccentricity at early stages, lower degrees of fault are also considered. inherent eccentricity of up to 10% is normally disregarded in electrical motors [2]. the flowchart of the rotor asymmetry fault analysis is displayed in fig. 5. two configurations are tested in this study:
1. healthy lspmsm (0% static eccentricity)
2. lspmsm with 20, 35 and 50% static eccentricity
fig. 5 flowchart of static eccentricity detection strategy for lspmsm
the time variation of the simulated and measured steady state current signal for the three-phase lspmsm in healthy condition and under static eccentricity at full load operation is shown in fig. 6. the effect of static eccentricity is not noticeable in the time variation of the current signal. the amplitude of the sideband components around the fundamental frequency, extracted by processing the stator current in the frequency domain, can be used for precise eccentricity fault detection.
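the frequency-domain step can be illustrated with a small standard-library sketch. it does not reproduce the psd processing used in this study: it evaluates a hann-windowed single-frequency dft at a few chosen frequencies of a synthetic current and reports each amplitude in db relative to the 50 hz fundamental. the sampling frequency (5 khz) and record length (6.5 s) come from the text; the synthetic sideband amplitudes, the phase offset and all function names are hypothetical.

```python
import cmath
import math

FS_HZ = 5000.0     # sampling frequency stated in the text
DURATION_S = 6.5   # total sampling time stated in the text


def tone_amplitude(samples, freq_hz, fs_hz=FS_HZ):
    """Amplitude of one spectral line via a Hann-windowed single-frequency DFT."""
    n_total = len(samples)
    acc = 0j
    w_sum = 0.0
    for n, x in enumerate(samples):
        w = 0.5 - 0.5 * math.cos(2.0 * math.pi * n / n_total)  # Hann window
        acc += w * x * cmath.exp(-2j * math.pi * freq_hz * n / fs_hz)
        w_sum += w
    return 2.0 * abs(acc) / w_sum


def relative_db(samples, freq_hz, ref_hz=50.0):
    """Amplitude at freq_hz in dB relative to the amplitude at ref_hz."""
    ratio = tone_amplitude(samples, freq_hz) / tone_amplitude(samples, ref_hz)
    return 20.0 * math.log10(ratio)


def synthetic_current(sidebands):
    """Unit 50 Hz fundamental plus sidebands given as {frequency_hz: amplitude}."""
    n_total = int(FS_HZ * DURATION_S)
    out = []
    for n in range(n_total):
        t = n / FS_HZ
        x = math.cos(2.0 * math.pi * 50.0 * t)
        for f, a in sidebands.items():
            x += a * math.cos(2.0 * math.pi * f * t + 0.3)
        out.append(x)
    return out


# hypothetical sideband amplitudes, chosen only to exercise the code
signal = synthetic_current({25.0: 0.01, 75.0: 0.005, 125.0: 0.002})
print(f"frequency resolution: {1.0 / DURATION_S:.3f} Hz")
for f in (25.0, 75.0, 125.0):
    print(f"{f:5.1f} Hz: {relative_db(signal, f):6.1f} dB re 50 Hz")
```

the window is used because 25 hz, 75 hz and 125 hz do not complete an integer number of cycles in 6.5 s, so an unwindowed estimate would pick up leakage from the much larger fundamental.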
controller harmonics have always affected previous eccentricity detection methods in synchronous motors. owing to the line-start capability of lspmsms, this research studies the motor behavior under static eccentricity without any drive.
fig. 6 the stator current signal of lspmsm: (a) healthy condition (b) with 50% static eccentricity
this article aims to identify an index for eccentricity fault detection based on mcsa. the normalized line current spectra of the lspmsm in healthy condition and with 20% static eccentricity are displayed in fig. 7. comparison between the healthy condition and the faulty condition with 20% static eccentricity shows an increase in the amplitude of the harmonic components around the fundamental frequency. a low percentage of static eccentricity generates harmonic components at 25 hz, 75 hz and 125 hz in the stator current. these harmonics are clearly visible, which makes them suitable as a static eccentricity fault signature for incipient detection and maintenance procedures in lspmsms. besides the eccentricity-related harmonics, the main field also generates harmonics, because of the associated rotating force wave, at 50 hz, 100 hz, 150 hz, 200 hz, etc.; since the amplitude of the main field harmonics exceeds that of the eccentricity components, the eccentricity components at 50 hz, 100 hz, etc. are negligible.
fig. 7 psd spectra of current signal for: (a) healthy lspmsm (b) simulation result of 20% static eccentricity (c) experimental result of 20% static eccentricity
accordingly, the nominated signatures can be effective for detecting static eccentricity in lspmsm at early stages.
it is derived from the results that the following index can be introduced as an appropriate frequency pattern to pinpoint the static eccentricity related signatures in three-phase lspmsm:
$$f_{static} = \left[1 \pm \frac{\nu}{p}\right] f \tag{13}$$
where $f_{static}$ is the frequency of the harmonic components due to static eccentricity in lspmsm, $\nu$ is an odd integer, $p$ is the number of pole pairs and $f$ is the fundamental frequency. the proposed index can be used to detect static eccentricity in lspmsms with different numbers of pole pairs through the variable $p$ in expression (13). the influence of fault severity on the current signal is important for assessing the capability of the proposed index. the psd spectra of the stator current for lspmsm with 35% and 50% static eccentricity are shown in fig. 8 and fig. 9, respectively.
fig. 8 psd spectrum of current signal with 35% static eccentricity: (a) simulation result (b) experimental result
fig. 8 shows a remarkable rise in the amplitudes of the harmonic components at the nominated frequencies due to 35% static eccentricity. this increase in amplitude demonstrates the ability of expression (13) to detect static eccentricity and its degree. furthermore, comparison between fig. 7 and fig. 8 illustrates that the amplitudes of the harmonic components increase as the static eccentricity degree changes from 20% to 35%. the psd spectrum of the stator current in fig. 9 indicates that the amplitude of the harmonic components at low frequencies increases further at 50% fault severity. a static eccentricity degree of 50% increases the amplitude at 25 hz to -36 db, at 75 hz to -40 db and at 125 hz to -44 db, which are significantly higher than the amplitudes of these components in healthy condition as well as in faulty condition with 20% and 35% static eccentricity.
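the frequencies predicted by expression (13) can be enumerated directly. the minimal sketch below evaluates the pattern for the tested motor (4 poles, so 2 pole pairs, and a 50 hz supply) using the odd integers 1 and 3, called nu in the code. folding negative results onto positive frequencies with an absolute value is an assumption made here for illustration, and the function name is hypothetical.

```python
def static_eccentricity_frequencies(f_hz, pole_pairs, orders=(1, 3)):
    """Evaluate [1 +/- nu/p] * f for each odd order nu; negative values are folded (assumption)."""
    freqs = set()
    for nu in orders:
        for sign in (+1, -1):
            freqs.add(abs(1 + sign * nu / pole_pairs) * f_hz)
    return sorted(freqs)


# 4-pole motor (p = 2) on a 50 Hz supply, as in table 1
print(static_eccentricity_frequencies(50.0, 2))
```

for these inputs the result is 25 hz, 75 hz and 125 hz, the components nominated as the fault signature in the spectra of fig. 7.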
thus, the amplitudes of the harmonic components at the frequencies determined by expression (13) are a suitable index for detection of static eccentricity fault in three-phase lspmsm. in addition, the proposed feature can be used to predict the static eccentricity degree in this type of motor.
fig. 9 psd spectrum of current signal with 50% static eccentricity: (a) simulation result (b) experimental result
the evaluation of the proposed index for various loading levels and fault severities is summarized in table 2. these amplitudes increase with fault severity and decrease at higher load levels. the effect of load variation on the amplitudes of eccentricity-related harmonics depends on the type of electrical motor. the amplitudes of eccentricity-related harmonics in pmsm remain constant versus load variation at fixed degrees of fault [8]. although pmsm and lspmsm are both synchronous motors, their reactions to static eccentricity fault are not the same because of their different configurations. this observation again demonstrates the importance of motor type in the fault detection process. a comparison between the proposed method and previous techniques is summarized in table 3.
table 2 amplitudes of harmonic components in current spectrum of lspmsm (db), by static eccentricity degree
index | load (%) | simulation 0% | simulation 20% | simulation 35% | simulation 50% | experimental 0% | experimental 20% | experimental 35% | experimental 50%
[1 − 1/p] f | 0 | -46 | -42 | -42 | -40 | -49 | -44 | -45 | -41
[1 − 1/p] f | 20 | -47 | -40 | -39 | -35 | -44 | -34 | -32 | -30
[1 − 1/p] f | 60 | -50 | -45 | -43 | -40 | -46 | -42 | -37 | -35
[1 − 1/p] f | 100 | -55 | -50 | -45 | -43 | -68 | -47 | -43 | -36
[1 + 1/p] f | 0 | -48 | -44 | -43 | -40 | -52 | -48 | -45 | -42
[1 + 1/p] f | 20 | -50 | -40 | -37 | -32 | -53 | -41 | -39 | -34
[1 + 1/p] f | 60 | -51 | -49 | -44 | -42 | -53 | -46 | -40 | -46
[1 + 1/p] f | 100 | -54 | -52 | -49 | -47 | -57 | -53 | -46 | -40
[1 + 3/p] f | 0 | -52 | -46 | -43 | -41 | -61 | -48 | -45 | -44
[1 + 3/p] f | 20 | -55 | -47 | -40 | -39 | -63 | -51 | -42 | -38
[1 + 3/p] f | 60 | -59 | -50 | -46 | -43 | -66 | -55 | -44 | -41
[1 + 3/p] f | 100 | -63 | -58 | -59 | -46 | -72 | -56 | -58 | -44
table 3 comparison between eccentricity detection methods
ref. | monitoring technique | motor type | fault type | achievement
[3] | current | lspmsm | static eccentricity | a simulation study has been done; an index has been proposed for fault detection; loading condition is not considered; the method is not examined experimentally.
[10] | current, speed, torque | lspmsm | static eccentricity | motor performance has been analyzed under healthy and faulty condition; mathematical and simulation studies are performed; no index has been proposed for detection.
[11] | current | pmsm | static and dynamic eccentricity | a simulation and experimental examination has been done; specific frequencies under different loading levels have been analyzed.
[12] & [13] | current | im | static eccentricity | effect of rotor shapes on fault related features in im has been investigated; it has been shown that rotor slot skewing reduces the fault-related components.
6. conclusions
the stator current spectrum of lspmsm in both healthy and faulty conditions during steady state operation was examined to identify the features of static eccentricity and to propose an index for precise fault detection at an early stage.
the stator current signal of the motor was measured noninvasively. different degrees of static eccentricity were created in the motor by a noninvasive method, and the motor was then returned to the healthy condition to continue its normal operation. a three-phase lspmsm was simulated using fem under the same conditions as the laboratory test, such as motor specification, static eccentricity percentages, sampling frequency, sampling time and frequency resolution. the effects of static eccentricity on the harmonic content of the stator current spectrum were examined in the frequency domain using psd analysis in order to propose a criterion for fault detection in this hybrid type of electrical motor. it is concluded that the low-frequency components are accurate signatures of static eccentricity in lspmsm; they follow the frequency pattern [1 ± k/p]f (k = 1 and k = 3 appear in table 2), which serves as an index for this fault. higher degrees of fault increase the amplitude of these components, which can be used to estimate the degree of static eccentricity. conversely, higher load levels reduce the amplitudes of the eccentricity-related components in the stator current. accordingly, this frequency pattern is reliable for static eccentricity detection at an early stage.

acknowledgements: the authors would like to express their gratitude to the ministry of education malaysia for financial support through grant number frgs-5524356 and to universiti putra malaysia for the facilities provided during this research work.

references
[1] v. šarac, "line-start synchronous motor a viable alternative to asynchronous motor", facta univ., series: automat. control robot., vol. 19, pp. 39-58, july 2020.
[2] m. karami, n. mariun, m. r. mehrjou, m. z. a. ab kadir, n. misron and m. a. m. radzi, "diagnosis of static eccentricity fault in line start permanent magnet synchronous motor", in ieee proceedings of international conference on power and energy, december 2014, pp. 83-86.
[3] m. karami, n. mariun, m. r. mehrjou, m. z. a.
ab kadir, n. misron and m. a. m. radzi, "static eccentricity fault recognition in three-phase line start permanent magnet synchronous motor using finite element method", math. probl. eng., vol. 2014, pp. 1-12, nov. 2014.
[4] a. nochian, o. mohd tahir, s. mualan and d. rui, "toward sustainable development of a landfill: landfill to landscape or landscape along with landfill? a review", pertanika j. soc. sci. humanit., vol. 27, no. 2, pp. 949-969, 2019.
[5] m. karami, n. mariun, m. a. m. radzi and g. varamini, "intelligent stability margin improvement using series and shunt controllers", int. j. appl. power eng. (ijape), vol. 10, no. 4, pp. 281-290, 2021.
[6] f. rahmani, m. s. mashhadi, h. gh. lamouki, f. asghari, h. shokouhandeh and m. amoozadeh, "maximum power point tracking–a study of photovoltaic systems in supplying stand-alone and grid-connected electrical loads", in ieee proceedings of international conference on applied and theoretical electricity, may 2021, pp. 1-6.
[7] j. faiz, t. asefi and m. azeem khan, "design of dual rotor axial flux permanent magnet generators with ferrite and rare-earth magnets", facta univ. series: elec. energ., vol. 33, pp. 553-569, 2020.
[8] b. m. ebrahimi, j. faiz and b. n. araabi, "pattern identification for eccentricity fault diagnosis in permanent magnet synchronous motors using stator current monitoring", iet electric power applications, vol. 4, pp. 418–430, 2010.
[9] m. karami, n. b. mariun, m. z. a. ab-kadir, n. misron and m. a. m. radzi, "motor current signature analysis-based noninvasive recognition of mixed eccentricity fault in line start permanent magnet synchronous motor", electr. power compon. syst., vol. 49, no. 1-2, pp. 133-145, june 2021.
[10] i. hussein, z. al-hamouz, m. a. abido and a. milhem, "on the mathematical modeling of line-start permanent magnet synchronous motors under static eccentricity", energies, vol. 11, pp. 1-17, jan. 2018.
[11] w. le roux, r. g. harley and t. g. habetler, "detecting rotor faults in low power permanent magnet synchronous machines", ieee trans. power elec., vol. 22, pp. 322–328, jan. 2007.
[12] a. barbour and w. t. thomson, "finite element study of rotor slot designs with respect to current monitoring for detecting static airgap eccentricity in squirrel-cage induction motors", in ias ’97. conference record of the 1997 ieee industry applications conference thirty-second ias annual meeting, 1997, vol. 1, pp. 112–119.
[13] w. t. thomson and a. barbour, "on-line current monitoring and application of a finite element method to predict the level of static airgap eccentricity in three-phase induction motors", ieee trans. energy conv., vol. 13, pp. 347–357, dec. 1998.
[14] s. nandi, s. ahmed and h. toliyat, "detection of rotor slot and other eccentricity related harmonics in a three phase induction motor with different rotor cages", ieee trans. energy conv., vol. 16, pp. 253–260, sep. 2001.
[15] t. c. ilamparithi and s. nandi, "detection of eccentricity faults in three-phase reluctance synchronous motor", ieee trans. ind. appl., vol. 48, pp. 1307–1317, may 2012.
[16] w. le roux, r. g. harley and t. g. habetler, "detecting rotor faults in low power permanent magnet synchronous machines", ieee trans. power elec., vol. 22, pp. 322–328, jan. 2007.

facta universitatis series: electronics and energetics vol. 28, no 4, december 2015, pp. 507-525 doi: 10.2298/fuee1504507s

horizontal current bipolar transistor (hcbt) – a low-cost, high-performance flexible bicmos technology for rf communication applications

tomislav suligoj1, marko koričić1, josip žilak1, hidenori mochizuki2, so-ichi morita2, katsumi shinomura2, hisaya imai2
1university of zagreb, faculty of electrical engineering and computing, department of electronics, micro and nano-electronics laboratory, croatia
2asahi kasei microdevices co. 5-4960, nobeoka, miyazaki, 882-0031, japan

received march 9, 2015
corresponding author: tomislav suligoj, university of zagreb, faculty of electrical engineering and computing, department of electronics, micro and nano-electronics laboratory, croatia (e-mail: tom@zemris.fer.hr)

abstract. in an overview of horizontal current bipolar transistor (hcbt) technology, the state-of-the-art integrated silicon bipolar transistors are described, which exhibit ft and fmax of 51 ghz and 61 ghz and an ft·bvceo product of 173 ghz·v, placing them among the highest-performance implanted-base silicon bipolar transistors. hcbt is integrated with cmos in a considerably lower-cost fabrication sequence than standard vertical-current bipolar transistors, with only 2 or 3 additional masks and fewer process steps. due to its specific structure, the charge sharing effect can be employed to increase bvceo without sacrificing ft and fmax. moreover, the electric field can be engineered just by manipulating the lithography masks, yielding high-voltage hcbts with breakdown voltages up to 36 v integrated in the same process flow as the high-speed devices, i.e. at zero additional cost. a double-balanced active mixer circuit is designed and fabricated in hcbt technology. a maximum iip3 of 17.7 dbm at a mixer current of 9.2 ma and a conversion gain of -5 db are achieved.

key words: bicmos technology, bipolar transistors, horizontal current bipolar transistor, radio frequency integrated circuits, mixer, high-voltage bipolar transistors.

1. introduction

in the highly competitive wireless communication markets, the rf circuits and systems are fabricated in technologies that are very cost-sensitive. in order to minimize the fabrication costs, the sub-10 ghz applications can be processed by using the high-volume silicon technologies. it has been identified that the optimum solution might

facta universitatis series: electronics and energetics vol. 30, no 3, september 2017, pp. 351-362 doi: 10.2298/fuee1703351s

lowpass filters approximation based on the jacobi polynomials

nikola stojanović1, negovan stamenković2
1university of niš, faculty of electronic engineering, serbia
2university of priština, faculty of natural science and mathematics, serbia

received june 14, 2016; received in revised form november 18, 2016
corresponding author: nikola stojanović, university of niš, faculty of electronic engineering, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: nikola.stojanovic@elfak.ni.ac.rs)

abstract. a case study related to the design of the analog lowpass filter using a set of orthogonal jacobi polynomials, having four parameters to vary, is considered. the jacobi polynomial has been modified in order to be used as a filter approximating function. the obtained magnitude response is more general than the response of the classical ultraspherical filter, due to one additional parameter available in orthogonal jacobi polynomials. this additional parameter may be used to obtain a magnitude response having either smaller passband ripple, smaller group delay variation or sharper cutoff slope. two methods for transfer function approximation are investigated: the first method is based on the known shifted jacobi polynomial, and the second method is based on the proposed modification of jacobi polynomials. the shifted jacobi polynomials are suitable only for odd degree transfer functions. the proposed modified jacobi polynomial filter function is more general, but it is not orthogonal. it becomes an orthogonal polynomial when the two orders are equal, and it then includes the chebyshev filter of the first kind, the chebyshev filter of the second kind, the legendre filter, the gegenbauer (ultraspherical) filter and many other filters as its special cases.
key words: filters, analog circuits, approximation, filter characteristic function, jacobi polynomial, orthogonal polynomials.
1 introduction

the classical orthogonal polynomials of jacobi, laguerre and hermite [1] and their special cases, i.e. the gegenbauer, chebyshev and legendre polynomials, are widely used in communication theory and particularly in the synthesis of transfer functions of electric filters. the coefficients of the bessel-thomson filters, which provide a maximally flat group delay response in the passband without any ripple, are related to the bessel polynomials [2]. however, the bessel-type polynomials are not orthogonal on an interval of the x-axis, but in certain cases are orthogonal on the unit circle.

manuscript received on june 9, 2016.

apart from chebyshev polynomials, which are of utmost importance in the synthesis of filters exhibiting a sharp increase in attenuation as the frequency rises above the corner frequency, the other classes of orthogonal polynomials mentioned above have found many useful applications in the synthesis of electrical filters. in particular, the approximation problem in the synthesis of electrical filters consists of finding a physically realizable function of frequency that meets a prescribed set of specifications with regard to its magnitude and/or group delay characteristics. it is known that, for a given filter degree, there is always a trade-off between the magnitude and group delay characteristics. over the whole frequency band, a better group delay characteristic is generally associated with a better time domain characteristic [3], which in turn leads to a smaller time delay or a smaller overshoot in the step response.
there are approximations that have a very good magnitude characteristic to the detriment of their group delay characteristic, for example butterworth [4], chebyshev [5], [6], bernstein [7], legendre [8], [9], [10] and their derivatives by ku and drubin [11]. the converse occurs with other approximations, for example bessel [12], gauss [13], hermite [11] and least-squares monotonic [14], [15]; all these filters present optimized characteristics at specific points. transitional filters are alternative filter solutions that perform a trade-off between the magnitude and group delay characteristics. transitional butterworth-chebyshev [16] filters have magnitude characteristics that vary gradually from those of the butterworth filter to those of the chebyshev filter as the number of passband ripples (or the degree of flatness at the origin) is varied. three degrees of freedom are available for transitional butterworth-chebyshev filters: the degree n, the ripple factor ε and the degree of flatness at the origin. a smooth transition is also accomplished by the method proposed by peless and murakami [17], which finds each pole of the transitional butterworth-thomson filter as an interpolation between a pole of the butterworth filter and the corresponding pole of the thomson filter. a special class of filter functions of odd order providing a monotonic magnitude characteristic was first investigated by papoulis [18] by means of legendre polynomials. these results were subsequently extended to include filters of even degree [19], [20], and some other functions leading to the same class of filtering networks, whose magnitude response is bound to be monotonic, have been derived using a different approach based on jacobi polynomials [21]. in this paper, the concept of magnitude response synthesis is extended to orthogonal jacobi lowpass filters.
a simple modification of the orthogonal jacobi polynomial, suitable for continuous-time lowpass filter design, is proposed in this paper. if the degree of the filter is given, both indices (orders) of the jacobi polynomial can be used for smoothly adjusting the filter performance. the magnitude response obtained is more general than the continuous-time response of the chebyshev filter because of the two additional parameters available with the modified jacobi polynomials. it should be noted that the proposed jacobi approximation covers many of the above-mentioned all-pole filter functions.

2 filter magnitude function

in lowpass filter design, assuming all the zeros of the system function are at infinity, the squared magnitude function (insertion loss) can be written as

$$|H_n(j\omega)|^2 = \frac{1}{1+\varepsilon^2\phi_n^2(\omega)} \tag{1}$$

where ω is the frequency variable, ε is a parameter that controls the passband attenuation tolerance, n denotes the degree of the filter, and the polynomial $\phi_n(\omega)$ is the characteristic (or approximating) function of the filter, which is to be selected to give the desired magnitude characteristic. the characteristic function is normalized to unity at the passband edge frequency $\omega_p$, which is itself normalized to $\omega_p = 1$, so that $\phi_n(1) = 1$. this conventional procedure for filter design using the insertion loss method includes the design of a lumped-element lc ladder lowpass filter known as the lowpass prototype. a more modern procedure uses this network synthesis technique to design filters with a completely specified frequency response. the design is simplified by beginning with lowpass filter prototypes that are normalized in terms of impedance and frequency. transformations are then applied to convert the prototype designs to the desired frequency range and impedance level.
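as a minimal sketch of how (1) is evaluated once a characteristic function has been chosen, the script below computes the attenuation at a few normalized frequencies. the chebyshev polynomial T3 is used only as a familiar stand-in that satisfies φ(1) = 1; it is not the function proposed in this paper, and the ripple factor and the variable names (epsilon, phi, w, H2, A_dB) are illustrative assumptions.

```matlab
% illustrative evaluation of eq. (1); all numerical values are assumptions
epsilon = 0.5088;                  % ripple factor, close to 1 dB at the passband edge
phi = [4 0 -3 0];                  % stand-in characteristic function T3(w) = 4w^3 - 3w
w = [0 0.5 1 2];                   % normalized frequencies, passband edge at w = 1

% squared magnitude according to eq. (1)
H2 = 1 ./ (1 + epsilon^2 * polyval(phi, w).^2);

% attenuation in dB; at w = 1 it equals 10*log10(1 + epsilon^2) because phi(1) = 1
A_dB = -10*log10(H2);
disp([w; A_dB]);
```

because φ(1) = 1 by normalization, the passband-edge attenuation depends on ε alone, which is why ε is described as the parameter controlling the passband attenuation tolerance.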
in filter design, the characteristic frequency used for frequency normalization is the cutoff frequency, known as the filter passband corner frequency, and therefore the normalized cutoff frequency is equal to 1. for this application, the function $\phi_n^2(x)$ is required to be an even polynomial, $\psi_n(x^2) = \phi_n^2(x)$, with $x = \omega$. if $\phi_n(x)$ is even or odd, then $\phi_n^2(x)$ is always even, as required. polynomials $\phi_n(x)$ which are neither even nor odd may also be used in magnitude functions if $\phi_n(x)$ is replaced by $\phi_n(x^2)$. it is therefore necessary that no terms of the form $x^{2k+1}$ appear in the squared characteristic function. the jacobi polynomials $P_n^{(\alpha,\beta)}(x)$ have n distinct zeros for $\alpha \neq \beta$, but they are neither even nor odd. polynomials of this type are not suitable as a filter characteristic function. however, jacobi orthogonal polynomials can be adapted for use in lowpass filter magnitude functions, as will be shown in the next section.

3 jacobi polynomial

the jacobi polynomials [22] of degree n, denoted by $P_n^{(\alpha,\beta)}(x)$, are orthogonal on the interval [−1,1] with respect to the jacobi weight function $w^{(\alpha,\beta)}(x) = (1-x)^{\alpha}(1+x)^{\beta}$ when $\alpha, \beta > -1$. we shall refer to α and β as the orders of the jacobi polynomial. namely,

$$\int_{-1}^{1} P_m^{(\alpha,\beta)}(x)\,P_n^{(\alpha,\beta)}(x)\,w^{(\alpha,\beta)}(x)\,dx = h_n^{(\alpha,\beta)}\,\delta_{n,m}, \tag{2}$$

where

$$h_n^{(\alpha,\beta)} = \frac{2^{\alpha+\beta+1}}{2n+\alpha+\beta+1}\,\frac{\Gamma(n+\alpha+1)\,\Gamma(n+\beta+1)}{\Gamma(n+1)\,\Gamma(n+\alpha+\beta+1)}, \tag{3}$$

$\delta_{n,m}$ is the kronecker delta symbol and $\Gamma(\cdot)$ is the well-known gamma function.
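as a quick sanity check of (3), consider the legendre case α = β = 0, where the weight function reduces to 1. the worked step below is an illustration added for orientation, not part of the original derivation.

```latex
h_n^{(0,0)} = \frac{2^{1}}{2n+1}\,
              \frac{\Gamma(n+1)\,\Gamma(n+1)}{\Gamma(n+1)\,\Gamma(n+1)}
            = \frac{2}{2n+1},
\qquad\text{so that}\qquad
\int_{-1}^{1} P_m^{(0,0)}(x)\,P_n^{(0,0)}(x)\,dx = \frac{2}{2n+1}\,\delta_{n,m}.
```

this is the familiar normalization of the legendre polynomials, which is consistent with (2) and (3).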
the jacobi polynomials are generated by the three-term recurrence relation:

$$P_0^{(\alpha,\beta)}(x) = 1, \qquad P_1^{(\alpha,\beta)}(x) = \tfrac{1}{2}(\alpha+\beta+2)\,x + \tfrac{1}{2}(\alpha-\beta),$$
$$P_{n+1}^{(\alpha,\beta)}(x) = \left(a_n^{(\alpha,\beta)}x - b_n^{(\alpha,\beta)}\right)P_n^{(\alpha,\beta)}(x) - c_n^{(\alpha,\beta)}P_{n-1}^{(\alpha,\beta)}(x), \qquad n \geq 1 \tag{4}$$

where

$$a_n^{(\alpha,\beta)} = \frac{(2n+\alpha+\beta+1)(2n+\alpha+\beta+2)}{2(n+1)(n+\alpha+\beta+1)}$$
$$b_n^{(\alpha,\beta)} = \frac{(\beta^2-\alpha^2)(2n+\alpha+\beta+1)}{2(n+1)(n+\alpha+\beta+1)(2n+\alpha+\beta)}$$
$$c_n^{(\alpha,\beta)} = \frac{(n+\alpha)(n+\beta)(2n+\alpha+\beta+2)}{(n+1)(n+\alpha+\beta+1)(2n+\alpha+\beta)}$$

matlab is an inexpensive, easy-to-use and widely available commercial software package that is in widespread use in both academia and industry [23]. a matlab function for evaluating the coefficients of the jacobi polynomials using the above recurrence is given in jacobipoly.m. in addition to the general jacobi polynomial, the function also covers the legendre case (one input argument, α = β = 0) and the symmetric gegenbauer-type case (two input arguments, β = α).

jacobipoly.m

    function p = jacobipoly(n,a,b)
    % coefficients p of the jacobi polynomial of degree n and orders a, b
    % they are stored in descending order of powers
    if nargin == 1, a = 0; b = 0;       % legendre case
    elseif nargin == 2, b = a; end      % symmetric (gegenbauer-type) case
    p0 = 1;
    p1 = [(a+b)/2+1, (a-b)/2];
    if n == 0, p = p0;
    elseif n == 1, p = p1;
    else
      for k = 2:n
        % recurrence coefficients; capital letters keep them distinct from the orders a, b
        D = 2*k*(k+a+b)*(2*k-2+a+b);
        A = (2*k+a+b-1)*(2*k+a+b-2)*(2*k+a+b)/D;
        B = (2*k+a+b-1)*(a^2-b^2)/D;
        C = 2*(k-1+a)*(k-1+b)*(2*k+a+b)/D;
        p = conv([A B],p1) - C*[0,0,p0];
        p0 = p1; p1 = p;
      end
    end
    end

some properties of the jacobi polynomials which are needed here are as follows:

$$P_n^{(\alpha,\beta)}(1) = \frac{\Gamma(n+\alpha+1)}{\Gamma(n+1)\,\Gamma(\alpha+1)} \tag{5}$$

and

$$P_n^{(\alpha,\beta)}(-1) = \frac{(-1)^n\,\Gamma(n+\beta+1)}{\Gamma(n+1)\,\Gamma(\beta+1)} \tag{6}$$

jacobi polynomials have the symmetry

$$P_n^{(\alpha,\beta)}(x) = (-1)^n P_n^{(\beta,\alpha)}(-x) \tag{7}$$

the following important derivative relation holds:

$$\frac{d}{dx}P_n^{(\alpha,\beta)}(x) = \tfrac{1}{2}(n+\alpha+\beta+1)\,P_{n-1}^{(\alpha+1,\beta+1)}(x) \tag{8}$$

3.1 shifted jacobi polynomials

in order to use jacobi polynomials on the interval $x \in [0,1]$, we define the so-called shifted jacobi polynomials by introducing the change of variable $x \mapsto 2x - 1$.
let the shifted jacobi polynomials $P_n^{(\alpha,\beta)}(2x-1)$ be denoted by $J_n^{(\alpha,\beta)}(x)$. the shifted jacobi polynomials are orthogonal with respect to the weight function $w_s^{(\alpha,\beta)}(x) = (1-x)^{\alpha}x^{\beta}$ on the interval [0,1], with the orthogonality property:

$$\int_{0}^{1} w_s^{(\alpha,\beta)}(x)\,J_m^{(\alpha,\beta)}(x)\,J_n^{(\alpha,\beta)}(x)\,dx = \frac{1}{2n+\alpha+\beta+1}\,\frac{\Gamma(n+\alpha+1)\,\Gamma(n+\beta+1)}{\Gamma(n+1)\,\Gamma(n+\alpha+\beta+1)}\,\delta_{n,m} \tag{9}$$

the shifted jacobi polynomials are generated from the three-term recurrence relations [24]:

$$J_0^{(\alpha,\beta)}(x) = 1, \qquad J_1^{(\alpha,\beta)}(x) = (\alpha+\beta+2)\,x - (\beta+1),$$
$$J_{n+1}^{(\alpha,\beta)}(x) = \left(a_n^{(\alpha,\beta)}x - b_n^{(\alpha,\beta)}\right)J_n^{(\alpha,\beta)}(x) - c_n^{(\alpha,\beta)}J_{n-1}^{(\alpha,\beta)}(x), \qquad n \geq 1 \tag{10}$$

where the recursion coefficients are

$$a_n^{(\alpha,\beta)} = \frac{(2n+\alpha+\beta+1)(2n+\alpha+\beta+2)}{(n+1)(n+\alpha+\beta+1)}$$
$$b_n^{(\alpha,\beta)} = \frac{(2n+\alpha+\beta+1)\left(2n^2+(1+\beta)(\alpha+\beta)+2n(\alpha+\beta+1)\right)}{(n+1)(n+\alpha+\beta+1)(2n+\alpha+\beta)}$$
$$c_n^{(\alpha,\beta)} = \frac{(2n+\alpha+\beta+2)(n+\alpha)(n+\beta)}{(n+1)(n+\alpha+\beta+1)(2n+\alpha+\beta)} \tag{11}$$

the shifted jacobi polynomial $J_n^{(\alpha,\beta)}(x)$ can be written in the standard polynomial form as

$$J_n^{(\alpha,\beta)}(x) = \sum_{i=0}^{n} (-1)^{n-i}\,\frac{\Gamma(n+\alpha+\beta+i+1)}{\Gamma(i+1)\,\Gamma(n+\alpha+\beta+1)}\,\frac{\Gamma(n+\beta+1)}{\Gamma(n-i+1)\,\Gamma(\beta+i+1)}\,x^i \tag{12}$$

suppose the jacobi polynomials are to be normalized so that $\phi_n(1) = 1$. according to (12), the normalization constant of $J_n^{(\alpha,\beta)}(x)$ is $k_n^{(\alpha,\beta)} = J_n^{(\alpha,\beta)}(1) = \sum_{i=0}^{n} a_i^{(n)}$, where $a_i^{(n)}$ are the corresponding polynomial coefficients. as an example, fig. 1 shows the characteristic functions based on the shifted jacobi polynomials for n = 1, 2, ..., 5 in the form $\phi_n(x) = x^{\nu}J_m^{(\alpha,\beta)}(x^2)/k_m^{(\alpha,\beta)}$, where $n = 2m + \nu$, i.e. $m = \lfloor n/2 \rfloor$ (the floor function rounds n/2 to the nearest integer towards zero), with ν = 0 for n even and ν = 1 for n odd.

fig. 1. the normalized shifted jacobi polynomials $\phi_n(x) = x^{\nu}J_m^{(\alpha,\beta)}(x^2)/k_m^{(\alpha,\beta)}$, with ν = 0 for n even and ν = 1 for n odd, used as characteristic functions for α = −0.5 and β = 0.5, n = 2m + ν, m = 0, 1 and 2 (horizontal axis: x; vertical axis: $\phi_n(x)$).

as shown in fig. 1, the hump at x = 0 occurs when the filter degree is even.
using (6), the size of the hump can be obtained as

$$\phi_m^{(\alpha,\beta)}(0) = \frac{1}{k_m^{(\alpha,\beta)}}\,P_m^{(\alpha,\beta)}(-1) = \frac{1}{k_m^{(\alpha,\beta)}}\,\frac{(-1)^m\,\Gamma(m+\beta+1)}{\Gamma(m+1)\,\Gamma(\beta+1)} \tag{13}$$

because $J_m^{(\alpha,\beta)}(0) = P_m^{(\alpha,\beta)}(-1)$. one can easily show that the size of the hump increases when the degree of the filter increases. for example, for n = 4 (m = 2 and ν = 0), (13) gives $P_2^{(-0.5,0.5)}(-1) = 1.875$ and (12) gives $k_2^{(-0.5,0.5)} = 0.3750$, so the value of the hump is $\phi_2(0) = 5$. for n = 6 (m = 3 and ν = 0), $P_3^{(-0.5,0.5)}(-1) = -2.1875$ and $k_3^{(-0.5,0.5)} = 0.3125$, so $\phi_3(0) = -7$. thus, the even-degree characteristic function based on the shifted jacobi polynomial is not suitable as the filter characteristic function.

another definition, the monic shifted jacobi polynomials $G_n(p,q,x)$ given in [22, chapter 22], which are also orthogonal on the interval [0,1] with respect to the weight function $w(x) = (1-x)^{p-q}x^{q-1}$ (with q > 0 and p > q − 1), has been used for constructing the magnitude of the filter transfer function [25], [26], [27]. these shifted jacobi polynomials [22] are related to the jacobi polynomials $P_n^{(\alpha,\beta)}(x)$ as [28]

$$G_n(p,q,x) = \frac{\Gamma(n+1)\,\Gamma(n+p)}{\Gamma(2n+p)}\,P_n^{(p-q,\,q-1)}(2x-1) \tag{14}$$

it can be concluded that the shifted jacobi polynomials $J_n^{(\alpha,\beta)}(x)$ have n distinct positive real zeros in the interval (0,1), but they are neither even nor odd and therefore cannot be used directly as a characteristic function in equation (1). however, $[xJ_n^{(\alpha,\beta)}(x^2)]^2$ or $[xG_n(p,q,x^2)]^2$ could be used in (1) in place of the squared characteristic function $\phi_n^2(\omega)$.

3.2 modified jacobi polynomials

we propose the following modified jacobi polynomials, based on the sum of two jacobi orthogonal polynomials of the same degree n:

$$\mathcal{J}_n^{(\alpha,\beta)}(x) = P_n^{(\alpha,\beta)}(x) + P_n^{(\beta,\alpha)}(x) \tag{15}$$

where $P_n^{(\alpha,\beta)}(x)$ is the classical jacobi orthogonal polynomial in x introduced above. one can easily show that the modified jacobi polynomial (15) is not an orthogonal polynomial except when α = β.
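a small worked instance may make (15) more concrete. writing the modified polynomial of (15) as $\mathcal{J}_n^{(\alpha,\beta)}(x)$ (a notational choice made here for clarity), the degree-one case follows directly from $P_1^{(\alpha,\beta)}(x)$ in (4), and the general parity follows from the symmetry of the jacobi polynomials under exchange of the orders. this is an illustrative sketch, not part of the original derivation.

```latex
\begin{aligned}
P_1^{(\alpha,\beta)}(x) &= \tfrac12(\alpha+\beta+2)\,x + \tfrac12(\alpha-\beta),\\
P_1^{(\beta,\alpha)}(x) &= \tfrac12(\alpha+\beta+2)\,x - \tfrac12(\alpha-\beta),\\
\mathcal{J}_1^{(\alpha,\beta)}(x) &= P_1^{(\alpha,\beta)}(x) + P_1^{(\beta,\alpha)}(x) = (\alpha+\beta+2)\,x ,\\[4pt]
\mathcal{J}_n^{(\alpha,\beta)}(-x) &= P_n^{(\alpha,\beta)}(-x) + P_n^{(\beta,\alpha)}(-x)
 = (-1)^n\left[P_n^{(\beta,\alpha)}(x) + P_n^{(\alpha,\beta)}(x)\right]
 = (-1)^n\,\mathcal{J}_n^{(\alpha,\beta)}(x).
\end{aligned}
```

for α = −0.5 and β = 0.5 the first result gives 2x. the last line shows that the constant term, and more generally every term of the wrong parity, cancels in the sum.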
since the jacobi polynomials $P_n^{(\beta,\alpha)}(x) = (-1)^n P_n^{(\alpha,\beta)}(-x)$ are not orthogonal with respect to the weight function $w^{(\alpha,\beta)}(x)$ over the interval [−1,1], the modified jacobi polynomials (15) are not orthogonal polynomials, whereas the shifted jacobi polynomials are. however, the modified jacobi polynomial has degree n and is a purely even or purely odd polynomial in x (for n even or odd, respectively), and hence the lowpass filter can be realized for all specifications if it is used as the characteristic function.

many of the aforementioned polynomials are special cases of the modified jacobi polynomials. for α = β, one obtains the ultraspherical polynomials (symmetric jacobi polynomials) [29]. for α = β = ∓1/2, one obtains the chebyshev polynomials of the first and second kinds. for α = β = 0, one obtains the legendre polynomials. for the two important special cases α = −β = ±1/2, the chebyshev polynomials of the third and fourth kinds are also obtained.
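the first-kind chebyshev case can be checked by hand for n = 2. the sketch below applies the recurrence (4) with α = β = −1/2 and then forms the sum (15), written here as $\mathcal{J}_2$; the symbol $T_2$ denotes the chebyshev polynomial of the first kind and is introduced only for this illustration.

```latex
\begin{aligned}
P_1^{(-1/2,-1/2)}(x) &= \tfrac12\,x, \qquad
a_1 = \tfrac{2\cdot 3}{2\cdot 2\cdot 1} = \tfrac32,\quad
b_1 = 0,\quad
c_1 = \tfrac{\frac12\cdot\frac12\cdot 3}{2\cdot 1\cdot 1} = \tfrac38,\\
P_2^{(-1/2,-1/2)}(x) &= \tfrac32\,x\cdot\tfrac12\,x - \tfrac38 = \tfrac38\left(2x^2-1\right),\\
\mathcal{J}_2^{(-1/2,-1/2)}(x) &= 2\,P_2^{(-1/2,-1/2)}(x) = \tfrac34\left(2x^2-1\right) = \tfrac34\,T_2(x).
\end{aligned}
```

dividing by the value at x = 1, which is 3/4, leaves exactly $T_2(x) = 2x^2 - 1$.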
Finally, the constants $c_n^{(\alpha,\beta)} = J_n^{(\alpha,\beta)}(1)$ have to be chosen so that the normalization criterion $\phi_n(1) = 1$ is satisfied, i.e.

$$\phi_n(\omega) = \frac{J_n^{(\alpha,\beta)}(\omega)}{c_n^{(\alpha,\beta)}}, \tag{16}$$

where

$$c_n^{(\alpha,\beta)} = \frac{1}{\Gamma(n+1)}\left[\frac{\Gamma(n+\alpha+1)}{\Gamma(\alpha+1)} + \frac{\Gamma(n+\beta+1)}{\Gamma(\beta+1)}\right]. \tag{17}$$

The modified Jacobi polynomials are symmetric with respect to the orders $\alpha$ and $\beta$, i.e. $J_n^{(\alpha,\beta)}(\omega) = J_n^{(\beta,\alpha)}(\omega)$. Table 1 contains the modified Jacobi polynomials for $\alpha = -0.5$ and $\beta = 0.5$ up to the tenth degree.

Table 1. The modified Jacobi polynomials $J_n^{(\alpha,\beta)}(x)$, $\alpha = -0.5$, $\beta = 0.5$, and $n = 1, 2, \ldots, 10$.

n    $J_n^{(-0.5,0.5)}(x)$
1    $2x$
2    $3x^2 - \frac{3}{4}$
3    $5x^3 - \frac{5}{2}x$
4    $\frac{35}{4}x^4 - \frac{105}{16}x^2 + \frac{35}{64}$
5    $\frac{63}{4}x^5 - \frac{63}{4}x^3 + \frac{189}{64}x$
6    $\frac{231}{8}x^6 - \frac{1155}{32}x^4 + \frac{693}{64}x^2 - \frac{231}{512}$
7    $\frac{429}{8}x^7 - \frac{1287}{16}x^5 + \frac{2145}{64}x^3 - \frac{429}{128}x$
8    $\frac{6435}{64}x^8 - \frac{45045}{256}x^6 + \frac{96525}{1024}x^4 - \frac{32175}{2048}x^2 + \frac{6435}{16384}$
9    $\frac{12155}{64}x^9 - \frac{12155}{32}x^7 + \frac{255255}{1024}x^5 - \frac{60775}{1024}x^3 + \frac{60775}{16384}x$
10   $\frac{46189}{128}x^{10} - \frac{415701}{512}x^8 + \frac{323323}{512}x^6 - \frac{1616615}{8192}x^4 + \frac{692835}{32768}x^2 - \frac{46189}{131072}$

It is important to know where the roots of the modified Jacobi polynomials are located. The fastest way to calculate the zeros of the modified Jacobi polynomials is to use mathematical software such as MATLAB, Mathematica or Maple. It can be concluded that the modified Jacobi polynomials $J_n^{(\alpha,\beta)}(x)$ have $n$ simple real zeros in the closed interval $[-1,1]$. For example, the zeros of the modified Jacobi polynomial of degree 8 with $\alpha = -0.5$ and $\beta = 0.5$ are:

{−0.9396926, −0.7660444, −0.5000000, −0.1736482, 0.1736482, 0.5000000, 0.7660444, 0.9396926}.

The zeros of $J_n^{(\alpha,\beta)}(x)$ are located symmetrically about $x = 0$ in the interval $-1 < x < 1$. Note that the modified Jacobi polynomials are the only non-orthogonal polynomials which are suitable for the synthesis of the filter function given in a closed form.
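The normalization constant (17) and the zeros quoted for degree 8 can be checked without a computer algebra system. The sketch below evaluates the Jacobi polynomials from the standard explicit sum in floating point and locates the zeros by a sign-change scan followed by bisection. This is one possible approach chosen for illustration, not the procedure used by the authors, and the function names and the scan resolution `cells` are assumptions.

```python
from math import comb, factorial, gamma


def rising(a, k):
    """Pochhammer symbol (a)_k in floating point."""
    out = 1.0
    for i in range(k):
        out *= a + i
    return out


def jacobi(n, alpha, beta, x):
    """P_n^{(alpha,beta)}(x) from the explicit sum in powers of (x - 1)/2."""
    t = (x - 1.0) / 2.0
    total = 0.0
    for m in range(n + 1):
        total += (comb(n, m) * rising(alpha + beta + n + 1, m)
                  * rising(alpha + m + 1, n - m) * t ** m)
    return total / factorial(n)


def modified_jacobi(n, alpha, beta, x):
    """J_n^{(alpha,beta)}(x) of eq. (15)."""
    return jacobi(n, alpha, beta, x) + jacobi(n, beta, alpha, x)


def norm_constant(n, alpha, beta):
    """c_n^{(alpha,beta)} of eq. (17)."""
    return (gamma(n + alpha + 1) / gamma(alpha + 1)
            + gamma(n + beta + 1) / gamma(beta + 1)) / gamma(n + 1)


def zeros(n, alpha, beta, cells=2001, iterations=60):
    """Real zeros in (-1, 1); assumes no zero lies exactly on a scan point."""
    def f(x):
        return modified_jacobi(n, alpha, beta, x)

    roots = []
    for i in range(cells):
        lo = -1.0 + 2.0 * i / cells
        hi = -1.0 + 2.0 * (i + 1) / cells
        if f(lo) * f(hi) >= 0.0:
            continue  # no sign change inside this cell
        for _ in range(iterations):
            mid = 0.5 * (lo + hi)
            if f(lo) * f(mid) <= 0.0:
                hi = mid
            else:
                lo = mid
        roots.append(0.5 * (lo + hi))
    return roots


print(norm_constant(8, -0.5, 0.5), modified_jacobi(8, -0.5, 0.5, 1.0))
print([round(z, 7) for z in zeros(8, -0.5, 0.5)])
```

The first line prints $c_8$ from (17) next to $J_8(1)$, which should agree. The second line lists the eight zeros for comparison with the values given above.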
The characteristic functions $\phi_n(x)$ based on the modified Jacobi polynomials $J_n^{(\alpha,\beta)}(x)$ are illustrated in Figure 2 for $x$ in $[-1,1]$ and $n = 1, 2, \ldots, 5$. They satisfy the following relationships: for $|x| < 1$, the characteristic polynomial oscillates around zero and its ripples are bounded by $\pm 1$ for $\alpha, \beta \geq -0.5$, with $\phi_n(0) \neq 0$ for $n$ even and $\phi_n(0) = 0$ for $n$ odd. For $|x| > 1$, the polynomials increase (decrease) monotonically.

[Figure 2: curves for $n = 1, \ldots, 5$ plotted against $x$ from −1 to 1; vertical axis $\phi_n(x) = [P_n^{(\alpha,\beta)}(x) + P_n^{(\beta,\alpha)}(x)]/c_n^{(\alpha,\beta)}$; plot title "Modified Jacobi polynomials"; plot annotation "α = −0.5, β = 0".]

Fig. 2. The normalized modified Jacobi polynomials $J_n^{(\alpha,\beta)}(\omega)/c_n^{(\alpha,\beta)}$ used as the characteristic function $\phi_n(x)$, $\alpha = -0.5$ and $\beta = 0.5$, $n = 1, \ldots, 5$.

An example is given in Figure 3, which shows ninth-order modified Jacobi lowpass filters for three individual choices of the orders $\alpha$ and $\beta$. As mentioned earlier, the Jacobi orthogonal polynomial corresponds to the Chebyshev polynomial if $\alpha = \beta = -0.5$, which gives 3 dB ripples in the passband. In general, passband ripples are undesirable, but a value less than 0.5 dB is acceptable in many applications. If $\alpha = -0.5$ and the order $\beta$ increases, the passband ripples become unequal and decrease smoothly in magnitude. For $\beta > 1.5$ the passband response is nearly flat, but the cutoff slope is much steeper than that of a Butterworth filter. On the other hand, for $-1 < \beta < -0.5$ the passband ripples are unequal and larger than 1 in magnitude, but these values of $\beta$ (and likewise of $\alpha$) have no practical significance. It is shown that the passband ripple can be adjusted to improve the linearity of the group delay response near $\omega = 0$.

[Figure 3: three panels plotted against the normalized frequency $\omega$ from $10^{-1}$ to $10^{0}$: stopband attenuation (dB), passband attenuation (dB) and group delay (s); curves: modified Jacobi, $n = 9$, with $\alpha = -0.5$ and $\beta = 0$, $\beta = 0.5$, $\beta = 1.5$, and Butterworth.]

Fig. 3. The frequency responses of the 9th-degree modified Jacobi filters.

Generally, the modified Jacobi polynomials may also be used as the filter function in microwave applications. The most widely used filters in microwave applications are bandpass filters [30]. Using the lowpass-to-bandpass frequency transformation of a lumped-element lowpass filter, each series inductor converts to a series resonator and each parallel capacitor converts to a parallel resonator. Richards' transformation can be used to emulate the inductive and capacitive behaviour of the lumped circuit elements with distributed elements consisting of transmission line sections, and Kuroda's identities can be used to facilitate the conversion between the various transmission line realizations.

The approximation of the filter magnitude function based on the Christoffel-Darboux formula for classical orthonormal Jacobi polynomials gives excellent results [31], [32]. However, this method cannot be applied directly to the modified Jacobi filters, because the modified Jacobi polynomials are not orthogonal.
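The attenuation curves discussed in connection with Fig. 3 can be approximated with a few lines of Python. The sketch below assumes that equation (1), which is not restated in this part of the text, has the usual form $|H(j\omega)|^2 = 1/(1 + \varepsilon^2\phi_n^2(\omega))$, and it takes $\varepsilon = 1$, i.e. 3 dB attenuation at the cutoff frequency. Both choices, as well as the function names, are assumptions made for illustration only.

```python
from math import comb, factorial, log10


def rising(a, k):
    """Pochhammer symbol (a)_k in floating point."""
    out = 1.0
    for i in range(k):
        out *= a + i
    return out


def jacobi(n, alpha, beta, x):
    """P_n^{(alpha,beta)}(x) from the explicit sum in powers of (x - 1)/2."""
    t = (x - 1.0) / 2.0
    total = 0.0
    for m in range(n + 1):
        total += (comb(n, m) * rising(alpha + beta + n + 1, m)
                  * rising(alpha + m + 1, n - m) * t ** m)
    return total / factorial(n)


def phi(n, alpha, beta, w):
    """Characteristic function (16): modified Jacobi polynomial with phi(1) = 1."""
    c = jacobi(n, alpha, beta, 1.0) + jacobi(n, beta, alpha, 1.0)
    return (jacobi(n, alpha, beta, w) + jacobi(n, beta, alpha, w)) / c


def attenuation_db(w, n, alpha, beta, eps=1.0):
    """Attenuation in dB under the ASSUMED form 1 / (1 + eps^2 phi_n^2)."""
    return 10.0 * log10(1.0 + (eps * phi(n, alpha, beta, w)) ** 2)


def butterworth_db(w, n, eps=1.0):
    """Butterworth reference: characteristic function w^n."""
    return 10.0 * log10(1.0 + (eps * w ** n) ** 2)


for beta in (0.0, 0.5, 1.5):
    row = [round(attenuation_db(w, 9, -0.5, beta), 3) for w in (0.5, 0.9, 1.0, 2.0)]
    print("beta =", beta, row)
print("Butterworth", [round(butterworth_db(w, 9), 3) for w in (0.5, 0.9, 1.0, 2.0)])
```

Because every characteristic function is normalized to $\phi_n(1) = 1$, all curves pass through the same attenuation at $\omega = 1$. The differences appear in the passband ripple and in the stopband slope.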
In this case, one should either generate the sum of products of the modified Jacobi polynomials, or apply the Christoffel-Darboux formula separately to both orthonormal Jacobi polynomials, as

$$A_{2n}(\omega^2) = [p_0^{(\alpha,\beta)}(\omega)]^2 + [p_1^{(\alpha,\beta)}(\omega)]^2 + \cdots + [p_n^{(\alpha,\beta)}(\omega)]^2 + [p_0^{(\beta,\alpha)}(\omega)]^2 + [p_1^{(\beta,\alpha)}(\omega)]^2 + \cdots + [p_n^{(\beta,\alpha)}(\omega)]^2 \tag{18}$$

where $p_i^{(\alpha,\beta)}(\omega)$, $i = 0, 1, \ldots, n$, are orthonormal Jacobi polynomials with respect to the weight function $w^{(\alpha,\beta)}(\omega) = (1-\omega)^{\alpha}(1+\omega)^{\beta}$, and $p_i^{(\beta,\alpha)}(\omega)$, $i = 0, 1, \ldots, n$, are also orthonormal Jacobi polynomials, but with respect to the other weight function $w^{(\beta,\alpha)}(\omega) = (1-\omega)^{\beta}(1+\omega)^{\alpha}$. The orthonormal Jacobi polynomials are

$$p_n^{(\alpha,\beta)}(\omega) = \sqrt{\frac{2n+\alpha+\beta+1}{2^{\alpha+\beta+1}}\,\frac{\Gamma(n+1)\,\Gamma(n+\alpha+\beta+1)}{\Gamma(n+\alpha+1)\,\Gamma(n+\beta+1)}}\; P_n^{(\alpha,\beta)}(\omega) \tag{19}$$

and

$$p_n^{(\beta,\alpha)}(\omega) = \sqrt{\frac{2n+\alpha+\beta+1}{2^{\alpha+\beta+1}}\,\frac{\Gamma(n+1)\,\Gamma(n+\alpha+\beta+1)}{\Gamma(n+\alpha+1)\,\Gamma(n+\beta+1)}}\; P_n^{(\beta,\alpha)}(\omega) \tag{20}$$

where $P_n^{(\alpha,\beta)}(\omega)$ and $P_n^{(\beta,\alpha)}(\omega)$ are the orthogonal Jacobi polynomials, which can be evaluated by the proposed MATLAB program. By using the Christoffel-Darboux formula, equation (18) is reduced to

$$A_{2n}(\omega^2) = \frac{k_n^{(\alpha,\beta)}}{k_{n+1}^{(\alpha,\beta)}}\left[\frac{dp_{n+1}^{(\alpha,\beta)}}{d\omega}\,p_n^{(\alpha,\beta)} - \frac{dp_n^{(\alpha,\beta)}}{d\omega}\,p_{n+1}^{(\alpha,\beta)}\right] + \frac{k_n^{(\beta,\alpha)}}{k_{n+1}^{(\beta,\alpha)}}\left[\frac{dp_{n+1}^{(\beta,\alpha)}}{d\omega}\,p_n^{(\beta,\alpha)} - \frac{dp_n^{(\beta,\alpha)}}{d\omega}\,p_{n+1}^{(\beta,\alpha)}\right] \tag{21}$$

where $k_n^{(\alpha,\beta)}$ and $k_n^{(\beta,\alpha)}$ are the leading coefficients of the orthonormal Jacobi polynomials $p_n^{(\alpha,\beta)}(\omega)$ and $p_n^{(\beta,\alpha)}(\omega)$, respectively.
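The equivalence of the direct sum (18) and the Christoffel-Darboux form (21) can be checked numerically. The sketch below is a plain Python illustration rather than the MATLAB program mentioned in the text. It uses the scaling factor of (19) and (20), the standard derivative identity $\frac{d}{dx}P_n^{(\alpha,\beta)}(x) = \frac{n+\alpha+\beta+1}{2}P_{n-1}^{(\alpha+1,\beta+1)}(x)$, and the standard leading coefficient of $P_n^{(\alpha,\beta)}$. The function names are illustrative, and no normalization of the result at the cutoff frequency is applied, so the absolute scale is not meant to match any particular numerical example.

```python
from math import comb, factorial, gamma, sqrt


def rising(a, k):
    """Pochhammer symbol (a)_k in floating point."""
    out = 1.0
    for i in range(k):
        out *= a + i
    return out


def jacobi(n, alpha, beta, x):
    """P_n^{(alpha,beta)}(x) from the explicit sum in powers of (x - 1)/2."""
    t = (x - 1.0) / 2.0
    total = 0.0
    for m in range(n + 1):
        total += (comb(n, m) * rising(alpha + beta + n + 1, m)
                  * rising(alpha + m + 1, n - m) * t ** m)
    return total / factorial(n)


def jacobi_derivative(n, alpha, beta, x):
    """d/dx P_n^{(alpha,beta)}(x) via the shifted-order identity."""
    if n == 0:
        return 0.0
    return 0.5 * (n + alpha + beta + 1) * jacobi(n - 1, alpha + 1, beta + 1, x)


def scale(n, alpha, beta):
    """Square-root factor of (19)/(20): orthonormal p_n = scale * P_n."""
    s = alpha + beta + 1
    return sqrt((2 * n + s) / 2 ** s * gamma(n + 1) * gamma(n + s)
                / (gamma(n + alpha + 1) * gamma(n + beta + 1)))


def leading(n, alpha, beta):
    """Leading coefficient k_n of the orthonormal polynomial p_n."""
    return scale(n, alpha, beta) * rising(n + alpha + beta + 1, n) / (2 ** n * factorial(n))


def sum_of_squares(n, alpha, beta, x):
    """One family of terms in (18): p_0^2 + p_1^2 + ... + p_n^2."""
    return sum((scale(i, alpha, beta) * jacobi(i, alpha, beta, x)) ** 2
               for i in range(n + 1))


def christoffel_darboux(n, alpha, beta, x):
    """The same quantity in the closed form used in (21)."""
    p_n = scale(n, alpha, beta) * jacobi(n, alpha, beta, x)
    p_n1 = scale(n + 1, alpha, beta) * jacobi(n + 1, alpha, beta, x)
    dp_n = scale(n, alpha, beta) * jacobi_derivative(n, alpha, beta, x)
    dp_n1 = scale(n + 1, alpha, beta) * jacobi_derivative(n + 1, alpha, beta, x)
    ratio = leading(n, alpha, beta) / leading(n + 1, alpha, beta)
    return ratio * (dp_n1 * p_n - dp_n * p_n1)


def a2n(n, alpha, beta, w, term=sum_of_squares):
    """A_{2n}: contributions of both weight functions, as in (18) or (21)."""
    return term(n, alpha, beta, w) + term(n, beta, alpha, w)


for w in (0.0, 0.3, 0.9):
    print(w, a2n(5, -0.5, 0.5, w), a2n(5, -0.5, 0.5, w, christoffel_darboux))
```

The two printed columns should agree to rounding error, which is the content of the reduction from (18) to (21).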
The following filter approximating function for $n = 5$, $\alpha = -0.5$ and $\beta = 0.5$ is given as an example:

$$A_{10}(\omega^2) = 325.9493\,\omega^{10} - 488.9240\,\omega^{8} + 244.4620\,\omega^{6} - 40.7437\,\omega^{4} + 3.8197\,\omega^{2} + 2.5$$

According to the definition, the characteristic function should be normalized so that it is unity, $A_{10}(1) = 1$, at the cutoff frequency $\omega_p = 1$.

4 Conclusion

In this paper, we intended to illuminate the use of Jacobi orthogonal polynomials in the design of continuous-time lowpass filter transfer functions. Since the Jacobi polynomial cannot be used directly as a filter characteristic function, we suggested shifted Jacobi polynomials and proposed a simple modification of the Jacobi polynomials for use as a filter characteristic function. The modified Jacobi polynomials are not orthogonal, but they are suitable for approximating the filter transfer function. The filter degree, the maximum passband attenuation and the two indexes of the Jacobi polynomials are the four parameters that adjust the performance of the filter. The new modified Jacobi polynomials are used to approximate the lowpass filter transfer function in such a way that they serve directly as the filter characteristic function (like the standard orthogonal polynomials: Chebyshev or Legendre). These methods of approximation can be used to provide filters with adjustable passband ripple, group delay deviation or cutoff slope.

Acknowledgment

This work is supported by the Serbian Ministry of Education, Science and Technological Development, project No. 32009TR.

References

[1] W. V. Assche and E.
coussement, “some classical multiple orthogonal polynomials,” journal of computational and applied mathematics, vol. 127, no. 12, pp. 317 – 347, jan. 2001, numerical analysis 2000. vol. v: quadrature and orthogonal polynomials. [online]. available: http://www.sciencedirect.com/science/article/pii/s0377042700005033 [2] l. storch, “synthesis of constant-time-delay ladder networks using bessel polynomials,” proceedings of the ire, vol. 42, no. 11, pp. 1666–1675, nov. 1954. [3] b. d. rakovich and v. s. stojanovich, “on the design of equal ripple delay filters with chebyshev stopband attenuation,” radio and electronic engineer, vol. 43, no. 4, pp. 257–265, apr. 1973. [4] s. butterworth, “on the theory filter amplifier,” experimental wireless and the radio engineer, vol. 7, pp. 536–541, oct. 1930. [5] h. g. dimopoulos, “optimal use of some classical approximations in filter design,” ieee transactions on circuits and systems ii: express briefs, vol. 54, no. 9, pp. 780–784, sep. 2007. [6] s. c. d. roy, “modified chebyshev lowpass filters,” international journal of circuit theory and applications, vol. 38, no. 5, pp. 543–549, 2010.[online]. available: http://dx.doi.org/10.1002/cta.585 [7] r. ramiz and h. sedef, “general method for designing and simulating of resistively terminated lc ladder filters,” facta universitatis, series: electronics and energetics, vol. 12, no. 3, pp. 79–94, 1999. [8] s. prasad, l. g. stolarczyk, j. r. jackson, and e. w. kang, “filter synthesis using legendre polynomials,” proceedings of the iee, vol. 114, no. 8, pp. 1063–1064, aug. 1967. [9] m. t. chryssomallis and j. n. sahalos, “filter synthesis using products of legendre polynomials,” electrical engineering, vol. 81, no. 6, pp. 419–424, 1999. [10] d. živaljević, n. stamenković, and v. stojanović, “nearly monotonic passband low-pass filter design by using sum-of-squared legendre polynomials,” international journal of circuit theory and applications, vol. 44, no. 1, pp. 147–161, jan. 2016. 
[online]. available: http://dx.doi.org/10.1002/cta.2068 [11] y. h. ku and m. drubin, “network synthesis using legendre and hermite polynomials,” j. franklin inst., vol. 273, no. 2, pp. 138–157, feb. 1962. [12] i. m. filanovsky, “bessel-butterworth transitional filters,” in 2014 ieee international symposium on circuits and systems (iscas), jun. 2014, pp. 2105–2108. [13] a. dey, s. sadhu, and t. k. ghoshal, “adaptive gauss hermite filter for parameter varying nonlinear systems,” in 2014 international conference on signal processing and communications (spcom), jul. 2014, pp. 1–5. 12 n. stojanović and n. stamenković: chebyshev or legendre). these methods of approximation can be used to provide filters with adjustment of the passband ripple, group delay deviation or cutoff slope. acknowledgment this work is supported by serbian ministry of education, science and technological development, project no. 32009tr. references [1] w. v. assche and e. coussement, “some classical multiple orthogonal polynomials,” journal of computational and applied mathematics, vol. 127, no. 12, pp. 317 – 347, jan. 2001, numerical analysis 2000. vol. v: quadrature and orthogonal polynomials. [online]. available: http://www.sciencedirect.com/science/article/pii/s0377042700005033 [2] l. storch, “synthesis of constant-time-delay ladder networks using bessel polynomials,” proceedings of the ire, vol. 42, no. 11, pp. 1666–1675, nov. 1954. [3] b. d. rakovich and v. s. stojanovich, “on the design of equal ripple delay filters with chebyshev stopband attenuation,” radio and electronic engineer, vol. 43, no. 4, pp. 257–265, apr. 1973. [4] s. butterworth, “on the theory filter amplifier,” experimental wireless and the radio engineer, vol. 7, pp. 536–541, oct. 1930. [5] h. g. dimopoulos, “optimal use of some classical approximations in filter design,” ieee transactions on circuits and systems ii: express briefs, vol. 54, no. 9, pp. 780–784, sep. 2007. [6] s. c. d. 
roy, “modified chebyshev lowpass filters,” international journal of circuit theory and applications, vol. 38, no. 5, pp. 543–549, 2010.[online]. available: http://dx.doi.org/10.1002/cta.585 [7] r. ramiz and h. sedef, “general method for designing and simulating of resistively terminated lc ladder filters,” facta universitatis, series: electronics and energetics, vol. 12, no. 3, pp. 79–94, 1999. [8] s. prasad, l. g. stolarczyk, j. r. jackson, and e. w. kang, “filter synthesis using legendre polynomials,” proceedings of the iee, vol. 114, no. 8, pp. 1063–1064, aug. 1967. [9] m. t. chryssomallis and j. n. sahalos, “filter synthesis using products of legendre polynomials,” electrical engineering, vol. 81, no. 6, pp. 419–424, 1999. [10] d. živaljević, n. stamenković, and v. stojanović, “nearly monotonic passband low-pass filter design by using sum-of-squared legendre polynomials,” international journal of circuit theory and applications, vol. 44, no. 1, pp. 147–161, jan. 2016. [online]. available: http://dx.doi.org/10.1002/cta.2068 [11] y. h. ku and m. drubin, “network synthesis using legendre and hermite polynomials,” j. franklin inst., vol. 273, no. 2, pp. 138–157, feb. 1962. [12] i. m. filanovsky, “bessel-butterworth transitional filters,” in 2014 ieee international symposium on circuits and systems (iscas), jun. 2014, pp. 2105–2108. [13] a. dey, s. sadhu, and t. k. ghoshal, “adaptive gauss hermite filter for parameter varying nonlinear systems,” in 2014 international conference on signal processing and communications (spcom), jul. 2014, pp. 1–5. 362 n. stojanović, n. stamenković lowpass filters approximation based on the jacobi polynomials pb 12 n. stojanović and n. stamenković: chebyshev or legendre). these methods of approximation can be used to provide filters with adjustment of the passband ripple, group delay deviation or cutoff slope. 
acknowledgment this work is supported by serbian ministry of education, science and technological development, project no. 32009tr. references [1] w. v. assche and e. coussement, “some classical multiple orthogonal polynomials,” journal of computational and applied mathematics, vol. 127, no. 12, pp. 317 – 347, jan. 2001, numerical analysis 2000. vol. v: quadrature and orthogonal polynomials. [online]. available: http://www.sciencedirect.com/science/article/pii/s0377042700005033 [2] l. storch, “synthesis of constant-time-delay ladder networks using bessel polynomials,” proceedings of the ire, vol. 42, no. 11, pp. 1666–1675, nov. 1954. [3] b. d. rakovich and v. s. stojanovich, “on the design of equal ripple delay filters with chebyshev stopband attenuation,” radio and electronic engineer, vol. 43, no. 4, pp. 257–265, apr. 1973. [4] s. butterworth, “on the theory filter amplifier,” experimental wireless and the radio engineer, vol. 7, pp. 536–541, oct. 1930. [5] h. g. dimopoulos, “optimal use of some classical approximations in filter design,” ieee transactions on circuits and systems ii: express briefs, vol. 54, no. 9, pp. 780–784, sep. 2007. [6] s. c. d. roy, “modified chebyshev lowpass filters,” international journal of circuit theory and applications, vol. 38, no. 5, pp. 543–549, 2010.[online]. available: http://dx.doi.org/10.1002/cta.585 [7] r. ramiz and h. sedef, “general method for designing and simulating of resistively terminated lc ladder filters,” facta universitatis, series: electronics and energetics, vol. 12, no. 3, pp. 79–94, 1999. [8] s. prasad, l. g. stolarczyk, j. r. jackson, and e. w. kang, “filter synthesis using legendre polynomials,” proceedings of the iee, vol. 114, no. 8, pp. 1063–1064, aug. 1967. [9] m. t. chryssomallis and j. n. sahalos, “filter synthesis using products of legendre polynomials,” electrical engineering, vol. 81, no. 6, pp. 419–424, 1999. [10] d. živaljević, n. stamenković, and v. 
stojanović, “nearly monotonic passband low-pass filter design by using sum-of-squared legendre polynomials,” international journal of circuit theory and applications, vol. 44, no. 1, pp. 147–161, jan. 2016. [online]. available: http://dx.doi.org/10.1002/cta.2068
[11] y. h. ku and m. drubin, “network synthesis using legendre and hermite polynomials,” j. franklin inst., vol. 273, no. 2, pp. 138–157, feb. 1962.
[12] i. m. filanovsky, “bessel-butterworth transitional filters,” in 2014 ieee international symposium on circuits and systems (iscas), jun. 2014, pp. 2105–2108.
[13] a. dey, s. sadhu, and t. k. ghoshal, “adaptive gauss hermite filter for parameter varying nonlinear systems,” in 2014 international conference on signal processing and communications (spcom), jul. 2014, pp. 1–5.
[14] b. d. rakovich and v. b. litovski, “least-squares monotonic lowpass filters with sharp cutoff,” electronics letters, vol. 9, no. 4, pp. 75–76, feb. 1973.
[15] d. mirković, m. a. stošović, p. petković, and v. litovski, “design of iir digital filters with critical monotonic passband amplitude characteristic - a case study,” facta universitatis, series: electronics and energetics, vol. 29, no. 2, pp. 269–283, 2016.
[16] a. budak and p. aronhime, “transitional butterworth-chebyshev filters,” circuit theory, ieee transactions on, vol. 18, no. 3, pp. 413–415, may 1971.
[17] y. peless and murakami, “analysis and synthesis of transitional butterworth-thomson filters and bandpass amplifier,” rca rev., vol. 18, no. 3, pp. 60–94, mar. 1957.
[18] a. papoulis, “optimum filters with monotonic response,” proceedings of the ire, vol. 46, no. 3, pp. 606–609, mar. 1958.
[19] ——, “on monotonic response filters,” proceedings of the ire, vol. 47, no. 2, pp. 332–333, feb. 1959.
[20] m. fukada, “optimum filters of even orders with monotonic response,” ire transactions on circuit theory, vol. 6, no. 3, pp. 277–281, sep. 1959.
[21] p.
halpern, “optimum monotonic low-pass filters,” circuit theory, ieee transactions on, vol. 16, no. 2, pp. 240–242, may 1969.
[22] m. abramowitz and i. stegun, handbook of mathematical functions with formulas, graphs, and mathematical tables, 9th ed. new york, dover: national bureau of standards applied mathematics series 55, 1972.
[23] m. lutovac and d. tošić, “symbolic signal processing and system analysis,” facta universitatis, series: electronics and energetics, vol. 16, no. 3, pp. 423–431, 2003.
[24] a. h. bhrawy, e. h. doha, s. s. ezz-eldien, and r. a. van gorder, “a new jacobi spectral collocation method for solving 1+1 fractional schrödinger equations and fractional coupled schrödinger systems,” the european physical journal plus, vol. 129, no. 12, pp. 1–21, 2014. [online]. available: http://dx.doi.org/10.1140/epjp/i2014-14260-6
[25] c. beccari, “the use of the shifted jacobi polynomials in the synthesis of lowpass filters,” international journal of circuit theory and applications, vol. 7, no. 2, pp. 289–295, 1979.
[26] b. d. rakovich, “designing monotonic low-pass filters - comparison of some methods and criteria,” international journal of circuit theory and applications, vol. 2, no. 3, pp. 215–221, sep. 1974. [online]. available: http://dx.doi.org/10.1002/cta.4490020302
[27] d. topisirović, v. litovski, and m. andrejević stošović, “unified theory and state-variable implementation of critical-monotonic all-pole filters,” international journal of circuit theory and applications, vol. 43, no. 4, pp. 502–515, apr. 2015. [online]. available: http://dx.doi.org/10.1002/cta.1956
[28] t. v. hoang and s. tabbone, “errata and comments on ‘generic orthogonal moments: jacobi-fourier moments for invariant image description’,” pattern recognition, vol. 46, no. 11, pp. 3148–3155, nov. 2013. [online]. available: http://www.sciencedirect.com/science/article/pii/s0031320313001817
[29] d. johnson and j.
johnson, “low-pass filters using ultraspherical polynomials,” ieee transactions on circuit theory, vol. 13, no. 4, pp. 364–369, dec. 1966.
[30] z. d. milosavljević and m. v. gmitrović, “realizable band-pass filter structures with optimal redundancy parameters,” facta universitatis, series: electronics and energetics, vol. 13, no. 1, pp. 131–141, 2000.
[31] v. d. pavlović and ć. b. dolićanin, “mathematical foundation for the christoffel-darboux formula for classical orthonormal jacobi polynomials applied in filters,” scientific publications of the state univ. of novi pazar, series a: appl. math. inform. and mech., vol. 3, no. 2, pp. 139–151, 2011.
[32] v. d. pavlović et al., “new class of filter functions generated most directly by the christoffel-darboux formula for classical orthonormal jacobi polynomials,” scientific publications of the state univ. of novi pazar, series a: appl. math. inform. and mech., vol. 5, no. 1, pp. 23–33, 2013.

instruction facta universitatis series: electronics and energetics vol. 32, no. 4, december 2019, pp. 503-512 https://doi.org/10.2298/fuee1904503k
© 2019 by university of niš, serbia | creative commons license: cc by-nc-nd

control of systems on spatial domains with moving boundaries: 3d printing and traffic

miroslav krstić
department of mechanical and aerospace engineering, university of california, san diego, la jolla, ca 92093-0411, u.s.a.

abstract. until roughly the year 2000, control algorithms (of the kind that can be physically implemented and that provide guarantees of stability and performance) were mostly available only for systems modeled by ordinary differential equations. in other words, while controllers were available for finite-dimensional systems, such as robotic manipulators or vehicles, they were not available for systems like fluid flows.
with the emergence of the “backstepping” approach, it became possible to design control laws for systems modeled by partial differential equations (pdes), i.e., for infinite-dimensional systems, with inputs at the boundaries of spatial domains. but, until recently, such backstepping controllers for pdes were available only for systems evolving on fixed spatial domains, not for systems whose boundaries are themselves dynamic and move, as in systems undergoing a phase transition (like the solid-liquid transition, i.e., melting or crystallization). in this invited article we review new control designs for moving-boundary pdes of both parabolic and hyperbolic types and illustrate them by applications in additive manufacturing (3d printing) and freeway traffic, respectively.

key words: pde backstepping, stefan problem, 3d printing, traffic

1. control systems and feedback laws

for dynamical systems modeled by ordinary or partial differential equations (pdes) with significantly fewer input variables than state variables (like a scalar input variable for a pde with a spatially distributed or infinite-dimensional state), control theory constructs the input as a function(al) of the state. this achieves stability for the dynamical system, where “stability” in a technically rigorous sense refers to a set of mathematical properties, which includes the property that the state converges to zero as time approaches infinity.

received october 10, 2019
corresponding author: miroslav krstić, department of mechanical and aerospace engineering, university of california, san diego, la jolla, ca 92093-0411, u.s.a. (e-mail: krstic@ucsd.edu)

constructing such input functions, also called “feedback laws” because the input depends on the measurable state, is part of the design of most technological systems.
a simple example is the segway, whose driver would nosedive or fall backward without the feedback system that feeds the pitch angle measurements into the wheel angle inputs to keep the apparatus and rider upright. less obvious feedback systems developed through evolution, both to keep organisms alive and to prevent them from making drastic changes to themselves, regardless of how much they desire such modifications. for instance, feedback systems that regulate metabolism prevent people from achieving significant weight loss by starving themselves over several days. these feedback systems developed in living organisms in order to maintain, in the case of human organisms, our energy reserves in periods of famine and during strenuous travel.

2. pde control on moving domains

classical control theory developed for ordinary differential equations (odes) requires remarkable sophistication in the design of feedback laws for nonlinear systems. feedback synthesis for pdes poses even greater challenges, namely in transitioning from finite to infinite system dimension. nonlinear ode control saw its greatest achievements in the 1980s [1] and 1990s [2], whereas pde control has blossomed during the last two decades [3].

not all physical systems are modeled by odes of a fixed order or pdes on fixed domains. some important applications, including traffic, opinion dynamics, and climate science, involve processes whose dimensions or domains depend on the size of the process state. for instance, the state vector dimension can increase with the size of the state. or a higher temperature in the pde's spatial domain may cause the domain to grow, as in melting ocean ice. classical control techniques are unequipped to deal with such dimension-varying dynamics.
in fact, such possibilities have rarely even occurred to the control research community, which has been preoccupied in recent years with already difficult nonlinear, infinite-dimensional, stochastic, and hybrid phenomena in fixed dimension.

fig. 1 examples of cascade systems in which a pde, which is directly controlled, feeds into an ode. top: a hyperbolic pde-ode cascade, where a pure delay is an example of the simplest hyperbolic pde (example: control of congested traffic). bottom: a parabolic pde-ode cascade (example: additive manufacturing/3d printing).

among the simplest and most elegant problems in which the state’s dimension varies with the state’s size are those that involve an interconnected ode and pde, in which the pde’s state acts as an input to the ode, whose state in turn represents the pde’s boundary location. such pde-ode systems may involve either hyperbolic or parabolic pdes. figure 1 depicts general pde-ode cascade systems in which the ode is a general stabilizable dynamical system. control of such pde-ode cascade systems is studied in [4]. in this article the ode considered is a special case: a scalar ode governing the position of the pde’s boundary.

3. control of the stefan system (parabolic): example of additive manufacturing with laser actuation

an example of a parabolic pde-ode system in which the ode state represents the pde’s boundary location is the so-called stefan system. developed and analytically solved in the late 1800s by slovenian-austrian physicist josef stefan (of stefan-boltzmann fame), known in former yugoslavia as jožef štefan, the system models melting and freezing [5].

fig. 2 diagrams of additive manufacturing through laser-based sintering. the laser melts metal powder, which subsequently solidifies, making it possible to build, layer by layer, a complex 3d solid form. top: a diagram of the laser sintering system.
bottom: a notational representation of the temperature fields in the liquid and solid phases, represented in one spatial dimension, denoted by x. the heat flux q_c represents a boundary input to the liquid phase.

researchers have recently used the stefan system to model numerous other physical phenomena, including additive manufacturing with both polymers and metals, depicted in figure 2; growth of axons in neurons; tumor growth; cancer treatment via cryosurgeries; spread of invasive species in ecology; lithium-ion batteries; domain walls in ferroelectric thin films; and information propagation in social networks.

figure 3 shows the image at the bottom of figure 2 rotated clockwise by 90 degrees, where T_l(x,t) and T_s(x,t) respectively represent the spatiotemporal temperatures in the liquid and the solid. heat pdes govern the temperatures. a scalar ode, whose inputs are the heat fluxes at the pdes’ boundary, governs the liquid-solid interface position s(t).

fig. 3 temperature profiles and phase interface in a pde-ode system involving a liquid, a solid, and rightward melting with the aid of heat flux applied by a laser on the left boundary.

the stefan model consists of a parabolic (heat equation) pde, in which T(x,t) represents the spatiotemporal distribution of temperature at location x and time t and the heat flux q_c represents a boundary input at x = 0, together with an ode that governs the liquid-solid interface s. even though the heat equation for T appears linear, the scalar ode governing s is clearly nonlinear because its right-hand side is a nonlinear function of s, where the nonlinearity is the heat flux function at the liquid-solid interface. this nonlinearity, along with the non-constancy of the pde’s domain, is what makes control of this seemingly simple system quite challenging and entirely unconventional.
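as a hedged illustration of the model just described, the one-phase stefan problem is commonly written in the form below. this is a sketch meant to be consistent with the formulation in [10], not a transcription of the article's own display; the symbols α (thermal diffusivity), k (thermal conductivity), and β (a constant that lumps together the density and the latent heat of fusion) are assumptions introduced here for illustration.

```latex
\begin{align}
T_t(x,t) &= \alpha\, T_{xx}(x,t), \qquad 0 < x < s(t), \\
-k\, T_x(0,t) &= q_c(t), \\
T(s(t),t) &= T_m, \\
\dot{s}(t) &= -\beta\, T_x(s(t),t).
\end{align}
```

the last line makes the nonlinearity visible: the interface velocity depends on the temperature gradient evaluated at x = s(t), so s enters the right-hand side through the argument of T_x, and the domain of the first line changes as s moves.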
stefan’s pde-ode model gives rise to several control and state estimation problems. early efforts on control of the stefan problem are reported in [6, 7, 8, 9]. here we focus on a control problem that is both simple and difficult. the goal is to regulate the liquid-solid interface position s(t) to a setpoint s_r > 0. this goal is depicted in figure 4. the nonobvious thing to note is that, as the liquid-solid interface position s(t) is regulated to its equilibrium value s_r, the temperature in both the liquid and the solid phases is being regulated to the melting/freezing temperature T_m. if this were not the case, namely, if the liquid were to be regulated substantially above T_m, and the solid substantially below it, the liquid-solid interface position s(t) would keep on moving, either melting more of the solid or freezing more of the liquid.

fig. 4 a depiction of the control objective in the stefan problem. the liquid-solid interface is regulated to the setpoint, while, at the same time, the temperature fields of both the liquid and the solid phases are being regulated to the melting/freezing temperature, which represents the thermal equilibrium in this problem.

using the backstepping approach for pde-ode systems [4], we design a feedback law q_c(s, T), which is implemented by using a laser to apply a heat flux to the liquid. this backstepping feedback involves a positive gain constant c. the control law is proportional, with gain c, to the sum of two terms: the error between the measured thermal energy and the thermal energy at the melting/freezing point, and the interface tracking error s − s_r. the feedback law appears linear but it is not. the dependence of the upper limit of integration in x on the solid-liquid interface s is what makes this controller nonlinear, as befits a system that is itself nonlinear.
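the structure of this feedback can be sketched numerically. the snippet below is a minimal illustration, not the author's implementation: it assumes the control law has the form q_c = −c[(k/α)∫₀^s (T − T_m) dx + (k/β)(s − s_r)], which is our reading of the description above and of [10], and it evaluates the integral with a trapezoidal rule on a sampled temperature profile. the function names, the parameters α, k, β, and all numerical values are hypothetical.

```python
def trapezoid(xs, ys):
    """Composite trapezoidal rule for samples ys taken on the grid xs."""
    return sum(0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
               for i in range(len(xs) - 1))


def backstepping_heat_flux(xs, temps, s, s_r, t_m, k, alpha, beta, c):
    """Assumed form of the law:
    q_c = -c * [(k/alpha) * int_0^s (T - T_m) dx + (k/beta) * (s - s_r)].

    xs is an increasing grid covering the liquid phase [0, s]; the upper
    limit of integration moves with s, which is what makes the law nonlinear.
    """
    excess = [t - t_m for t in temps]
    energy_error = (k / alpha) * trapezoid(xs, excess)
    interface_error = (k / beta) * (s - s_r)
    return -c * (energy_error + interface_error)


# Illustrative, dimensionless numbers (not taken from the article).
s, s_r, t_m = 0.5, 1.0, 0.0
xs = [s * i / 10 for i in range(11)]

# Liquid exactly at the melting temperature: only the interface error acts.
q_cold = backstepping_heat_flux(xs, [t_m] * 11, s, s_r, t_m,
                                k=1.0, alpha=1.0, beta=1.0, c=2.0)

# Liquid already warm (linear profile, 0.4 at x = 0 down to T_m at x = s):
# the stored thermal energy reduces the heat flux that is requested.
hot = [0.4 * (1.0 - x / s) for x in xs]
q_hot = backstepping_heat_flux(xs, hot, s, s_r, t_m,
                               k=1.0, alpha=1.0, beta=1.0, c=2.0)

print(q_cold)  # 1.0
print(q_hot)   # approximately 0.8
```

in this sketch the requested heat flux is positive while the interface is short of its setpoint, and it shrinks as thermal energy accumulates in the liquid, which matches the qualitative behavior described in the text.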
the backstepping approach entails construction of a volterra transformation of the temperature state and a lyapunov functional based on the transformed temperature state [10, 11] (footnote 1: http://a2c2.org/awards/o-hugo-schuck-best-paper-award).

fig. 5 time evolution of the liquid-solid interface (top), which approaches its setpoint without an overshoot, and the temperature at the initial location of the liquid-solid interface (bottom), which starts from the melting point, has an upward excursion while the solid gets melted, and returns to the melting point, which is the system’s thermal equilibrium. at no point does the temperature in the liquid phase fall below freezing. at no point does the heat flux become negative, which ensures the monotonicity of the motion of the liquid-solid interface and the absence of frozen islands within the liquid.

figure 5 shows that the controller succeeds in its task. the solid-liquid interface is regulated to its setpoint. the temperature throughout the liquid domain is regulated to the melting point, which is the system’s thermal equilibrium. this control law achieves global stabilization for all initial conditions where the liquid temperature is above melting and the solid temperature is below freezing; both temperatures remain in these states for all time. in physical terms, this means that no solid islands form within the liquid and no pools of liquid form within the solid. the maximum principle for the heat equation establishes this result [12, 13].

4. control of moving shock in congested traffic

the analog of the stefan system’s parabolic pde phenomenon is the hyperbolic pde phenomenon that arises in traffic. it originates with a moving shock that delineates the free traffic (upstream of the shock) from the congested traffic (downstream of the shock), as seen in figure 6.

fig.
6 free traffic (upstream/left) and congested traffic (downstream/right) are separated by a shock, depicted as a sharp increase in density. modulating the durations of the red and green lights on the on-ramps regulates the shock location to a desired position.

the hyperbolic nonlinear lighthill-whitham-richards (lwr) pde [14, 15], which acts as a simple delay for small deviations, models the traffic flow. a scalar ode governs the shock motion, and the traffic densities of the congested and free traffic at the shock location form the ode’s inputs. this ode represents the rankine-hugoniot jump condition that is common in compressible gas models. the pde-ode system consists of two pdes and one ode: the first pde models the density of cars in the free traffic segment, the second pde models the density in the congested traffic segment, and the ode models the motion of the free-congested interface l(t), namely, of the shock location.

if left uncontrolled, this system will exhibit upstream motion of the shock until the entire freeway is consumed by congestion. this is shown in figure 7, which shows a simulation of the pde model on the left and a simulation of a “microscopic” model on the right (where each car’s motion is modeled individually).

fig. 7 shock starting near the downstream end of the freeway segment propagates upstream until the entire freeway segment is consumed by congestion. left: lwr pde simulation. right: “microscopic” simulation showing density of cars, where blue denotes low density and yellow/green denotes high density, namely, congestion.

to prevent the loss of free traffic, we again use the pde backstepping design to devise a feedback law that regulates the moving shock’s position to a setpoint.
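the upstream drift of an uncontrolled shock can be illustrated with a short calculation of the rankine-hugoniot speed. the sketch below is only an illustration under stated assumptions: it uses the greenshields flux q(ρ) = v_f ρ (1 − ρ/ρ_max), which is one common choice of fundamental diagram and is not taken from the text above, and the function names and numerical values are hypothetical. position is taken to increase in the downstream direction, so a negative speed means the shock moves upstream.

```python
def greenshields_flux(rho, v_free, rho_max):
    """Flow Q(rho) = v_free * rho * (1 - rho / rho_max).

    The Greenshields fundamental diagram is an assumption made for this sketch.
    """
    return v_free * rho * (1.0 - rho / rho_max)


def shock_speed(rho_free, rho_cong, v_free, rho_max):
    """Rankine-Hugoniot speed of the interface separating free traffic
    (density rho_free, upstream) from congested traffic (rho_cong, downstream):
    jump in flow divided by jump in density."""
    q_free = greenshields_flux(rho_free, v_free, rho_max)
    q_cong = greenshields_flux(rho_cong, v_free, rho_max)
    return (q_cong - q_free) / (rho_cong - rho_free)


# Hypothetical numbers: free speed 30 m/s, jam density 0.15 vehicles/m,
# light traffic at 0.03 vehicles/m meeting congestion at 0.14 vehicles/m.
speed = shock_speed(rho_free=0.03, rho_cong=0.14, v_free=30.0, rho_max=0.15)
print(speed)  # about -4 m/s: negative, so the shock travels upstream
```

with this flux the speed reduces to v_f (1 − (ρ_free + ρ_cong)/ρ_max), so the shock stands still only when the two densities sum to the jam density; in this simplified picture, a controller can hold the shock in place by acting on the boundary densities so that the flows on the two sides of the shock balance.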
in the formulas for this backstepping controller, the variable u_in denotes the deviation of the density of cars at the inlet of the freeway segment relative to a setpoint, whereas the variable u_out denotes the deviation of the density of cars at the outlet of the freeway segment relative to a setpoint. the quantities k_f and k_c denote positive gain constants, whereas L denotes the length of the freeway segment. the feedback laws are implemented via “ramp metering,” which involves modulation of the red and green lights on the freeway on-ramps around steady durations that correspond to the desired location of the shock.

figure 8 illustrates the success of the feedback laws. they “arrest” the upstream drift of the shock and keep the segment of the freeway upstream of the shock in free, i.e., uncongested traffic.

fig. 8 the controllers implemented through ramp metering at the inlet and outlet of the freeway prevent the drift of the congested traffic beyond the setpoint for the shock. hence, the upstream portion of the freeway is kept uncongested (blue denotes low density of cars in both pictures). allowing the downstream portion of the freeway to be congested is important: not doing so would mean that many cars are prevented from entering the freeway and are instead kept on the ramps and on the streets leading to the ramps.

the similarity between the feedback laws for the stefan (additive manufacturing) and the freeway problems is quite noticeable. both feedbacks include integrals over varying spatial domains and both feedbacks also include the error between the measured interface position and the reference position. analyzing the pde-ode system with the feedback law once again employs a backstepping/volterra transformation of the traffic density pde’s state, along with a resulting lyapunov functional. as with the stefan system, stability holds in the H^1 sobolev norm.
the details are contained in [16]. however, while stability for the stefan system holds for all physically meaningful initial conditions, for the traffic problem it holds only locally, that is, for small deviations of the density field around its equilibrium profile. another important result on control of an lwr-like model of traffic is [17].

5. conclusions

in this tutorial exposition of two pde control designs from distinct domains of physics and engineering, we have illustrated the current state of the art in designing controllers for infinite-dimensional systems modeled by pdes with moving boundaries. these techniques are also applicable to a variety of other phase-change problems, including tumor growth and cancer treatment, lithium-ion batteries, and information propagation in social networks, as well as to multi-phase flows, fluid-structure interactions, and undersea construction using long cables.

future research needs to advance these techniques from one spatial dimension to two and three spatial dimensions, to multi-pde scenarios, and to systems in which the interface is not governed by an ode but by another pde, possibly from a different class than in the main domain. an example of such a dynamical system is a biological cell whose membrane is governed by an elastic structural pde model (second-order in time and fourth-order in space), while the interior is governed by a diffusion-dominated parabolic pde.

acknowledgement: the paper is the result of joint work with my students shumon koga (for the stefan problem) and huan yu (for the traffic problem). the material in this article was presented in two lectures that the author gave at the serbian academy of sciences and arts, one dedicated to traffic control and the other dedicated to the stefan model of systems with a phase change.

references
[1] a. isidori, nonlinear control systems, springer, 1989.
[2] m. krstic, i. kanellakopoulos, and p. v.
kokotovic, nonlinear and adaptive control design, wiley, 1995. [3] m. krstić and a. smyshlyaev, boundary control of pdes: a course on backstepping designs, siam, 2008. [4] m. krstić, delay compensation for nonlinear, adaptive, and pde systems, boston, ma: birkhauser, 2009. [5] j. stefan, ―uber die theorie der eisbildung, insbesondere uber die eisbildung im polarmeere,‖ annalen der physik, vol. 278, pp. 269–286, 1891. [6] a. armaou and p.d. christofides, ―robust control of parabolic pde systems with time-dependent spatial domains,‖ automatica, vol. 37, pp. 61–69, 2001. [7] n. petit, ―control problems for one-dimensional fluids and reactive fluids with moving interfaces,‖ in advances in the theory of control, signals and systems with physical modeling, volume 407 of lecture notes in control and information sciences, pages 323–337, lausanne, dec 2010. [8] b. petrus, j. bentsman, and b.g. thomas, ―feedback control of the two-phase stefan problem, with an application to the continuous casting of steel,‖ in proceedings of the 49th ieee conference on decision and control (cdc), 2010, pp. 1731–1736. [9] m. izadi and s. dubljevic, ―backstepping output feedback control of moving boundary parabolic pdes,‖ european journal of control, vol. 21, pp. 27–35, 2015. [10] s. koga, m. diagne, and m. krstić, ―control and state estimation of the one-phase stefan problem via backstepping design,‖ ieee transactions on automatic control, vol. 64, pp. 510–525, 2019. [11] s. koga, i. karafyllis, and m. krstić, ―input-to-state stability for the control of stefan problem with respect to heat loss at the interface,‖ in proceedings of the 2018 american control conference. milwaukee, wi, 2018. [12] a. friedman ―free boundary problems for parabolic equations i. melting of solids,‖ journal of mathematics and mechanics, vol. 8, no. 4, pp. 499–517, 1959. [13] s. gupta, the classical stefan problem. basic concepts, modelling and analysis. north-holland: applied mathematics and mechanics, 2003. [14] m. 
j. lighthill and g. b. whitham, ―on kinematic waves. ii. a theory of traffic flow on long crowded roads,‖ proc. roy. soc. london. ser. a., 229 317–345, 1955. [15] p. i. richards, ―shock waves on the highway,‖ operations res., 4, 42–51, 1956 [16] h. yu, l.-g. zhang, m. diagne, and m. krstic, ―bilateral boundary control of moving traffic shockwave,‖ ifac symposium on nonlinear control systems, 2019. [17] i. karafyllis, n. bekiaris-liberis, & m. papageorgiou. ―feedback control of nonlinear hyperbolic pde systems inspired by traffic flow models‖. ieee transactions on automatic control, 2018. facta universitatis series: electronics and energetics vol. 34, no 2, june 2021, pp. 307-322 https://doi.org/10.2298/fuee2102307z © 2021 by university of niš, serbia | creative commons license: cc by-nc-nd original scientific paper the evolution of breakdown voltage and delay time under high overvoltage for different types of surge arresters emilija živanović, marija živković, milić pejović faculty of electronic engineering, university of niš, serbia abstract. the results of the reliability testing of littelfuse and epcos gas-filled surge arresters for different overvoltages under dc discharge will be presented in this paper. the static breakdown voltage of these gas components was estimated using voltage increase rates ranging from 1 to 10 v/s. a detailed statistical analysis of experimental data has also been done. the delay time of these components for different nominal overvoltages has been investigated as an additional aspect important for component reliability. in addition, the delay time method was used as a statistical method which can give neither ion nor neutral active states number density in the glow and afterglow. it can be employed for qualitative observation of ions and neutral active states decay in the afterglow to such low concentrations where the other methods cannot be applied. 
finally, a comparison has been done between the characteristics of two gas-filled surge arresters which have the same nominal overvoltage (littelfuse and epcos) from different manufacturers. key words: gas-filled surge arresters, nominal overvoltage, delay time, static breakdown voltage 1. introduction efficient overvoltage protection of electronic components and systems is very important for their proper operation. the gas-filled surge arresters (marked as gfsa in this paper) are non-linear components used in overvoltage protection. in literature it is known as surge voltage protector or gas discharge tube. overvoltage is a phenomenon where the potential of one point of a component or device in relation to the point of zero potential is greater than allowed. overvoltage above a certain value can endanger the safety of people who operate the devices, as well as damage the devices themselves. besides, overvoltage above the permitted levels can lead to permanent or temporary damage to certain electronic components and devices and to the appearance of noise in received october 30, 2020; received in revised form march 3, 2021 corresponding author: emilija živanović faculty of electronic engineering, aleksandra medvedeva 14, 18115 niš, serbia e-mail: emilija.zivanovic@elfak.ni.ac.rs 308 e. živanović, m. živković, m. pejović the transmission signals. atmospheric discharges, electrostatic discharges, commutation overvoltage, radar pulses and electromagnetic pulses of a nuclear explosion can cause overvoltage existence. these types of discharges significantly affect the telecommunication lines through which they damage components. of all the types of overvoltage, atmospheric discharge is the most dangerous because its occurrence is unpredictable. overvoltage protection elements can be divided with respect to the operating voltage type into linear and nonlinear according to the manner of applying voltage on them, when the current through them increases. 
linear elements for overvoltage protection are electric filters, whose most sensitive elements are capacitors. nonlinear elements for overvoltage protection are used more than linear ones and they can be divided into three groups according to the manufacturing technology and the principle of operation. these are transient suppresser diodes (tsds), metal оxide varistors (movs) and gas-filled surge arresters (gfsa). in order to protect against overvoltage, various combined (hybrid) schemes are sometimes used [1]. today, the most widely used gfsa consists of two or three electrodes that are enclosed in a ceramic or glass housing [1]. the distance between the electrodes is the order of a millimeter or part of a millimeter. as the insulating material, either noble gases (argon, neon, krypton or xenon) or their mixtures at pressures from 100 pa to 70 kpa are used. the major drawbacks in gfsas application are their delay time and cut off delay upon voltage disconnection as well as relatively large deviation in breakdown voltage, which goes up to 20% with respect to values usually found in datasheets [2,3]. time delay method proved to be a valuable tool for littelfuse gfsa reliability testing [4]. the continuation of our research relates to a detail analysis of these gas components and further extends to similar components manufactured by epcos at the same operating voltage of 230 v. on this occasion, a similar analysis was performed with the possibility to comparing the new results with previously used components as well as with characteristics from the datasheet [2,3]. the static breakdown voltage was estimated for all used samples by discretized dynamic method. in addition, the results of testing the reliability of these components, which implies determining the components’ static breakdown voltage and delay time for the different overvoltage, as well as the different relaxation time, will be also shown. 
further, the influence of overvoltage on the reliability of gfsa will represent, as well as a detailed statistical analysis of the obtained experimental results. 2. related work previous research on gfsa has reflected that the most commonly examined types of it were siemens and citel. also tests performed in the field of ionizing radiation are widely represented in the literature [5,6]. the possibility of stabilizing the static working point of the gfsa by adequately selecting the parameters that are important during their fabricated was sought. this analysis has a practical importance to their manufacturer and provides a much better understanding of the pre-breakdown effects in gas at low pressure. the operating voltage of gfsa used in the experiment was 230 v [7]. the paper [8] examined the influence of the type of noble gas, gas pressure, inter-electrode gap, electrode material and the type of electrode surface processing, as well as the change of absorbed dose rate in radiation field on the operation of gfsa. such an extensive analysis of the evolution of breakdown voltage and delay time under high overvoltage 309 the impact of various parameters is performed due to their wide application in telecommunications systems, space technology and military industry. in addition, this type of testing of gfsa is related to the period immediately before the breakdown, but there are published papers that show the results related to the contribution of positive ions and neutral active particles remaining during the previous breakdown and discharge in the gas, using the delay time method [9,10]. this statistical method can provide the qualitative separation of contribution of different particle species which can induce the secondary electron emission processes, which lead to initiation of breakdown in insulation gas. citel, siemens and epcos are types of used gas-filled arresters in previous research at operating dc voltage in range of 220 to 250 v. 3. 
experimental details
3.1. gas-filled surge arresters
the gas-filled surge arrester samples used in the experiments were chosen from two manufacturers, littelfuse and epcos. the littelfuse gfsas have the following typical dc breakdown voltages:
▪ 285 v dc (marked as lf1 in this paper),
▪ 230 v dc (marked as lf2),
▪ 250 v dc (marked as lf3),
▪ 350 v dc (marked as lf4).
in addition to the littelfuse gfsas, experiments were also carried out with an epcos gfsa, which is designed to operate at a voltage of:
▪ 230 v dc (marked as ep).
the technical characteristics of the gfsas mentioned above are listed in detail in table 1 [2,3].
table 1 device specifications (at 25°c)
device   breakdown voltage in the dc mode (v)   dc breakdown, typical (v)   insulation resistance (gω)   capacitance (pf)
lf1      230 – 340                              285                         10                           1.5
lf2      184 – 276                              230                         1                            1.5
lf3      200 – 300                              250                         10                           1.5
lf4      280 – 420                              350                         10                           1.5
ep       184 – 276                              230                         10                           1.5
the components’ geometry is shown in fig. 1. the inter-electrode space d is close to 6 mm. precise information about the gas type and pressure could not be obtained. however, the manufacturer states that the gfsa is filled with neon and/or argon at a pressure below atmospheric. for this reason, the further analysis of physical processes will focus on noble gases in general at low pressures.
fig. 1 the geometry of gas-filled surge arrester
3.2. measurement system for breakdown voltage estimation
according to the widely established definition, the breakdown voltage ub is the voltage applied to a gas component which induces the transition of the gas from a non-self-sustaining to a self-sustaining discharge. due to its statistical nature, the breakdown voltage is not a strictly predefined value. many different factors may influence its value [11,12].
those are for example, the presence of external ionization source (such as x, gamma or uv source), electrodes precondition [13], ambient temperature, electrode’s shape and product of inter-electrode distance and gas pressure [14,15], and many more. considering this, gas components, gfsa in our case, manufacturers usually give the expected breakdown voltage in the datasheets, with tolerance, which goes up to 20%. from the statistical point of view, it is of great importance to determine the breakdown voltage on the onset of breakdown, i.e., the voltage for which breakdown probability is still kept at zero value. this value is referred in the literature as the static breakdown voltage us and there are several methods for its determination. in our experiments, we used discretized dynamic method. estimation of static breakdown voltage is important due to the scaling of the overvoltage in relation to it. unlike dynamic method [16], which considers the application of linear ramp signal on gas component until breakdown, discretized dynamic method requires the application of stepped voltage on the diode until breakdown, while voltage step up and its duration tp are predefined (see fig. 2 below). in our experiments voltage step was fixed to 0.1 v. the duration of the steps was varied from 0.01 s up to 0.1 s. correspondingly, the voltage increase rates k = up /tp ranged from 1 to 10 v/s. this choice of voltage increase rate increases the resolution in measurement accuracy. the block diagram of electrical system for breakdown voltage measurement along with signals on gas diode for three successive measurements is presented in fig. 2. the evolution of breakdown voltage and delay time under high overvoltage 311 fig. 2 block diagram of a system for breakdown voltage acquisition: uk significantly smaller than the expected breakdown; ug value to which the voltage rises as it can be seen from fig. 2, the voltage on gas device is raised in steps until breakdown. 
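The relation k = up/tp and the role of the static breakdown voltage can be illustrated with a short numerical sketch. It assumes, in line with the linear fit described in section 4.1, that the static breakdown voltage is taken as the value of the fitted line at k = 0. The function names and the breakdown voltage values below are hypothetical and serve only as an illustration; they are not measured data.

```python
import numpy as np

def increase_rate(step_v, step_s):
    """Voltage increase rate k = up / tp in V/s."""
    return step_v / step_s

def estimate_static_breakdown(k, ub_mean):
    """Least-squares line ub_mean = us + a * k; us is the fitted value at k = 0."""
    slope, intercept = np.polyfit(k, ub_mean, 1)
    return intercept, slope

# Voltage step of 0.1 V and step durations inside the 0.01 s to 0.1 s range from the text.
durations = [0.1, 0.05, 0.02, 0.01]
rates = np.array([increase_rate(0.1, tp) for tp in durations])

# Hypothetical mean breakdown voltages in volts (illustration only, not measurements).
ub_mean = 230.0 + 0.05 * rates

us, slope = estimate_static_breakdown(rates, ub_mean)
print("rates (V/s):", rates)
print("estimated us (V):", round(us, 3), "slope (V per V/s):", round(slope, 3))
```

With real data, `ub_mean` would hold the mean of the repeated breakdown voltage measurements at each rate, and the quality of the linear fit should be checked before the intercept is used.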
a computer-based control and acquisition system was set up to increase the voltage in steps of a given duration until breakdown occurs and the breakdown voltage is recorded, and to repeat this a sufficient number of times to obtain good statistics for the measured quantity.
3.3. measurement system for delay time measurement
the memory curve is the dependence of the delay time on the relaxation period. due to the statistical nature of the delay time, it is necessary to perform large series of measurements and use the mean values of the delay time as reference values. in our experiments, we used series of a hundred delay time measurements for different relaxation periods. the relaxation periods were chosen on a logarithmic scale up to the saturation of the memory curve. the block diagram of the measurement system, along with the signals on the gfsa for three successive measurements, is presented in fig. 3. the system can be divided into two separate subsystems, analog and digital. the main purpose of the analog subsystem is to provide fast and accurate voltage switching on the gfsa. the digital subsystem, on the other hand, is responsible for measurement data collection and storage as well as for measurement control and execution.
fig. 3 block diagram of a system for delay time acquisition
the measurement is executed in the following manner. a series of constant voltage pulses was applied to the component, while the time elapsed between the voltage pulse application and breakdown was measured. the measured delay times were stored in memory, and the voltage pulse was maintained on the diode for the time tg in order to maintain the same conditions for every measurement. after that, the gas component was disconnected for the relaxation period τ. the procedure was repeated a desired number of times, for different relaxation periods.
4. results and discussion
4.1.
analysis of static breakdown voltage
special attention in this paper is paid to two basic characteristics of gfsas. the static dc breakdown voltage is one of them. it should be noted that the static breakdown voltage is the starting point for further overvoltage determination, and its meaning should not be confused with the dc breakdown voltage from the datasheet. its estimation for each of the components used was performed by applying the dynamic discretized method [16]. it is based on a linear fit of the experimentally obtained dependence $\bar{u}_b = f(k)$, where $\bar{u}_b$ is the mean value of a thousand measured breakdown voltages and k is the voltage increase rate. the results presented in figs. 4 and 5 show the mean value of breakdown voltage as a function of voltage increase rate for the littelfuse gfsas (four different components) and the epcos gfsa, respectively. the estimated static breakdown voltage is shown in all figures.
fig. 4 mean value of breakdown voltage as a function of voltage increase rate for littelfuse gfsa (panels lf1, lf2, lf3 and lf4; axes: ub (v) versus k (v/s); experimental data with linear fit; estimated us = 236.7 v, 273.9 v, 284 v and 373.2 v, respectively)
fig. 5 mean value of breakdown voltage as a function of voltage increase rate for epcos gfsa (axes: ub (v) versus k (v/s); experimental data with linear fit; estimated us = 279.9 v)
fig. 6 histograms and fitted distribution density of dynamic breakdown voltage for lf2 and ep gfsa for k = 3 and 4 v/s
in the literature, analyses of the breakdown voltage in gas tubes filled with noble gases and nitrogen have shown that the breakdown voltage is a statistical quantity with a gaussian distribution function. this method also takes into account the stochastic nature of the breakdown voltage. namely, even at low voltage increase rates, a series of repeated measurements under the same experimental conditions yields different values. this confirms that the breakdown voltage has a statistical nature with a certain distribution due to stochastic processes in the gas. this can be observed in fig. 6, which shows histograms of the relative frequencies of the experimental breakdown voltage data (a thousand measurements for each voltage increase rate) together with the gaussian distribution function. the breakdown voltage represented in this figure was obtained for voltage steps up = 0.3 and 0.4 v of duration tp = 0.1 s. as mentioned above, k = up /tp, so the increase rate was k = 3 and 4 v/s, respectively (details about the increase rate estimation can be found in [16]). a similar tendency was obtained for the other values of k. it should be expected for very small k that us has a constant value. however, the $\chi^2$ test and the $r^2$ correlation coefficient (shown in table 2) indicate good agreement between the gaussian distribution function and the experimental data. a complete analysis was performed for all samples used in the experiments, but the detailed statistical analysis is shown for the gfsas marked as lf2 and ep, which have the same operating voltage of 230 v dc.
table 2 $\chi^2$ and $r^2$ values for analyzed gfsa
device   k (v/s)   $\chi^2$   $r^2$
ep       3         24.21      0.91
ep       4         19.58      0.92
lf2      3         1.18       0.99
lf2      4         9.31       0.96
4.2.
analysis of delay time
the existence of a delay time can be the main problem in the use of gfsas. due to the stochastic nature of the breakdown process, breakdown does not occur instantly upon voltage application to the gfsa. the time elapsed between the moment of application of a voltage higher than the breakdown voltage and the moment when the gfsa current starts to flow is called the delay time td. the delay time consists of the statistical delay time ts and the formative time tf, i.e., td = ts + tf [17,18]. the statistical delay time is the time interval between the moment of operating voltage application and the appearance of a free electron which initiates the breakdown. the formative time is the time from the end of the statistical delay time to the onset of breakdown, characterized by the collapse of the applied voltage as a transition to a self-maintained glow [17]. various parameters influence the delay time, but the most important factor is the relaxation time τ, which represents the time interval between two successive measurements when there is no voltage on the component [18]. this dependence, $t_d = f(\tau)$, is usually called the memory curve. it can be divided into three distinctive regions: the plateau, the growth of delay time with relaxation time, and saturation. different mechanisms of breakdown initiation play the dominant role in each region of the memory curve. however, the existence of a memory curve is undesirable when studying the reliability of gas-filled surge arresters, which will be discussed later in the paper. since the gfsa manufacturers do not provide an exact specification of the gas composition in the technical documentation, it can only be found that these are noble gases, and that argon and neon are most often used for this purpose. during the experiment itself, a reddish-orange glow is noticed, which is a hint that these are noble gases. the experimentally obtained memory curves for littelfuse gfsa (fig.
7) can be compared with those previously obtained for argon and neon [19,20]. it can be seen that the shape of the curve is similar, i.e., that the plateau appears as well as the region where the delay time increases with the relaxation period. the plateau itself is characterized by a delay time that is constant with relaxation time regardless of the overvoltage; only a decrease in the delay time with an increase in overvoltage is observed. in the plateau range of relaxation times there is a high concentration of positive ions in the gas, which are formed both during the discharge and in the afterglow. the processes possible in noble gases in which positive ions are formed, and which are responsible for maintaining the discharge, are listed in table 3 [21].
table 3 positive ions’ creation
process: reaction
direct ionization: $e + X \rightarrow X^+ + 2e$
stepwise ionization of metastable atoms by electron impact: $e + X_m \rightarrow X^+ + 2e$
metastable-metastable collision ionization: $X_m + X_m \rightarrow X^+ + X + e$ and $X_m + X_m \rightarrow X_2^+ + e$
excited atom-ground state atom collision: $X^* + X \rightarrow X_2^+ + e$
three-body collision (ion conversion): $X^+ + 2X \rightarrow X_2^+ + X$
ion recombination between electron and molecular ion: $e + X_2^+ \rightarrow X^* + X \rightarrow X_m + X$
diffusion on the device wall: $e, X^+, X_2^+, X_m \rightarrow \mathrm{wall}$
in the reactions above (table 3), $X$ is a noble gas atom in the ground state, while $X^*$ and $X_m$ denote the resonant excited and metastable levels, respectively. $X^+$ and $X_2^+$ denote positive ions in atomic and molecular form, respectively. figs. 7 and 8 show families of delay time vs. relaxation time dependencies (memory curves) for all littelfuse samples and for the epcos gfsa, respectively. the dependencies were recorded for different overvoltages. the overvoltage is most often expressed as a percentage and defined as (Δu / us) × 100 %, where Δu is the difference between the operating voltage uw (uw > us) and the static breakdown voltage, i.e., Δu = uw − us.
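As a small worked example of the overvoltage definition, the sketch below converts between an operating voltage and the percentage overvoltage. The function names are hypothetical; the value us = 273.9 v is the static breakdown voltage quoted for lf2 in fig. 4, and the 20 % level is chosen only for illustration.

```python
def overvoltage_percent(uw, us):
    """Overvoltage in percent: (uw - us) / us * 100, defined for uw > us."""
    if uw <= us:
        raise ValueError("operating voltage must exceed the static breakdown voltage")
    return (uw - us) / us * 100.0

def operating_voltage(us, percent):
    """Operating voltage uw that corresponds to a given overvoltage in percent."""
    return us * (1.0 + percent / 100.0)

us_lf2 = 273.9                          # static breakdown voltage of lf2 (V), from fig. 4
uw = operating_voltage(us_lf2, 20.0)    # operating voltage for 20 % overvoltage
print(round(uw, 2), round(overvoltage_percent(uw, us_lf2), 2))
```

In the notation used for the memory curves, an operating voltage written as 1.2us corresponds to an overvoltage of 20 %.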
the aim of this study was to examine the effect of the applied voltage to the delay time in the function of relaxation time because of determined overvoltage range corresponding to safe component operation. the memory curves’ plateau (see figs. 7 and 8) is a consequence of positive ions’ recombination formed by first five reactions from table 3. atoms and molecules in ground and metastable state have also been included in the secondary electron emission (see) process, but because of their electroneutrality had much smaller contribution. the rapid growth of delay time can be observed in some cases. it is a consequence of a change in mechanisms yielding the dominant influence in the see process. significant decrease of ion the evolution of breakdown voltage and delay time under high overvoltage 317 concentration as well as longer recombination time of neutral particle is the main cause of sudden td rise. fig. 7 mean value of delay time as a function of relaxation time for different overvoltage for littelfuse gfsas fig. 8 mean value of delay time as a function of relaxation time for different overvoltage for epcos gfsa 318 e. živanović, m. živković, m. pejović two conclusions can be drawn from the presented results. the first one is a tendency toward delay time decrease with increasing overvoltage for all gfsas. it has been previously tested and confirmed for gas-filled tube that by increasing the voltage, the probability of a breakdown in the gas increases as well as the probability that the secondary electrons released from the cathode lead to a breakdown [22-24]. when the yield of electrons in a gap is a constant, the mean value of delay time is inversely proportional to the breakdown probability [22]. it has also been shown that the breakdown probability increases with increasing overvoltage, which is manifested by a decrease in the mean value of delay time. 
in most of the experimental results, the obtained characteristics do not show an increase in delay time, which is expected. since the delay time is practically independent of the relaxation time τ, it can be concluded that the tested components worked reliably in the whole range of tested relaxation periods. however, it can be seen in fig. 7 that samples lf1 and lf3 show a different tendency for lower values of overvoltage. namely, sample lf1 shows a significant increase in delay time for relaxation periods longer than 7 s for overvoltages 1.3us and 1.4us, while sample lf3 shows similar behavior for relaxation periods longer than 700 ms and overvoltage 1.2us. it can therefore be concluded that these components are not reliable for operation in the region of significant increase of delay time. the ep sample in fig. 8 shows similar behavior for a relaxation period of about 1.5 s and 1.1us overvoltage, although the increase in delay time is smaller than for the lf1 and lf3 samples. in order to establish the overvoltages below which the components are not reliable, the delay time method allows the delay time of these components to be evaluated. namely, the memory curves (figs. 7 and 8) indicate delay time values of approximately 10 µs to 30 µs for lf1, 15 µs to 200 µs for lf2, around 10 µs for lf3, 30 µs to 300 µs for lf4, and 80 µs to 400 µs for ep. as the gfsas of both manufacturers showed a deviation in the results for relaxation times of about 105 ms, further statistical analysis was required. figs. 7 and 8 also indicate that for some overvoltages the delay time increases with increasing relaxation time. this indicates instability in the operation of the component with respect to the delay time.
earlier studies of the memory effect [22-25] show that, in the region of increase, ts is less than tf for most experimental conditions and that the standard deviation of tf is very small. in the analysis of the total delay time we can therefore assume, as a first approximation, that tf is deterministic under constant experimental conditions. in this case, the delay time becomes the sum of one deterministic quantity tf and one stochastic quantity ts, so it takes on its stochastic character from the statistical delay time. it has been shown earlier [18] that the statistical delay time has an exponential distribution, which is based on the physical nature of the processes that occur in the gas. in the physical literature, this exponential distribution is referred to as the laue distribution. it is represented by diagrams called lauegrams, which plot $\ln[n(t)/N]$ against $t$, where $N$ is the total number of delay time measurements and $n(t)$ is the number of measured delay times whose values are greater than $t$. this corresponds to plotting the function
$$\ln R(t) = -\lambda\,(t - t_f),$$
where
$$R(t) = 1 - F(t) = 1 - \int_{t_f}^{t} \lambda \exp[-\lambda\,(x - t_f)]\,dx$$
is the probability that a breakdown in the gas will occur after time $t$. since $\lambda = 1/\bar{t}_s$, the laue distribution is usually written in the form [26]
$$\ln\frac{n(t)}{N} = -\frac{t - t_f}{\bar{t}_s}.$$
fig. 9 lauegrams (panels lf2 and ep) of the relaxation time for the previously observed deviation of gfsa of both manufacturers where the delay time increases suddenly and where the operation of the components is not reliable
this expression shows that the mean value of the statistical delay time, and thus the electron yield, can be obtained from the slope of the straight line, while the formative time is given by the intercept on the td axis. on lauegrams, the linear fit is obtained using the least squares method when determining the distribution parameters.
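The least-squares procedure on a lauegram can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' processing code. The delay times are synthetic (a fixed formative time plus an exponentially distributed statistical delay), the function names are hypothetical, and ties between measured values are ignored.

```python
import numpy as np

def lauegram(td):
    """Return sorted delay times t and ln(n(t)/N), n(t) = number of delays greater than t."""
    t = np.sort(np.asarray(td, dtype=float))
    total = t.size
    n_greater = total - np.arange(1, total + 1)
    keep = n_greater > 0                    # the largest delay gives ln(0), so drop it
    return t[keep], np.log(n_greater[keep] / total)

def fit_laue(td):
    """Least-squares line on the lauegram: slope = -1/ts, ln(n/N) = 0 at t = tf."""
    t, y = lauegram(td)
    slope, intercept = np.polyfit(t, y, 1)
    ts_mean = -1.0 / slope                  # mean statistical delay time
    tf = -intercept / slope                 # formative time (intercept on the time axis)
    r = np.corrcoef(t, y)[0, 1]             # Pearson correlation of the lauegram points
    return ts_mean, tf, r

# Synthetic data in microseconds (assumed values): tf = 20, mean ts = 50.
rng = np.random.default_rng(0)
delays = 20.0 + rng.exponential(50.0, size=1000)

ts_mean, tf, r = fit_laue(delays)
print(round(ts_mean, 1), round(tf, 1), round(r, 3), round(r * r, 3))
```

The Pearson coefficient is negative because ln(n(t)/N) decreases with t; its square plays the role of the r2 value reported alongside Pearson's r.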
the correlation coefficient was determined for each data set. it relates two quantities, in this case ln[n(td)/N] and td, and represents a quantitative measure of the agreement of the experimental delay time data with the exponential distribution. fig. 9 shows the lauegrams for those relaxation times for which the deviation in figs. 7 and 8 is observed. for these relaxation times, the delay time increases suddenly and the operation of the components is not reliable. a detailed statistical analysis was performed to check whether the deviation was due to a measurement fault or to physical processes occurring in the gas. if the data follow the laue distribution well, i.e., if the $r^2$ correlation coefficient is close to unity, the experimental delay time data are scattered. the memory curve is then pronounced, so from a technical point of view the arrester is not good. for smaller $r^2$, the data are more difficult to describe by the laue distribution, so there is no memory curve and the delay times are small, so the arrester is reliable. the results of pearson’s test confirm this, as well as the fact that $r^2$ increases with the growth of relaxation time. this can be seen in tables 4 and 5. the first one refers to the analysis of experimental data for the gas-filled surge arrester marked as lf2, and the second to the ep.
table 4 pearson’s test and $r^2$ for lf2
τ       pearson’s r   $r^2$
700     -0.92         0.85
1500    -0.98         0.97
3000    -0.99         0.99
table 5 pearson’s test and $r^2$ for ep
τ       pearson’s r   $r^2$
1500    -0.90         0.81
3000    -0.92         0.85
7000    -0.99         0.97
5. conclusion
based on all of the above, the following can be concluded. the paper investigates the reliability testing of littelfuse and epcos gas-filled surge arresters. using the dynamic discretized method for different voltage increase rates from 1 to 10 v/s, the static breakdown voltage was precisely estimated.
components’ delay time has been determined for different nominal overvoltages using delay time method. the mean value of breakdown voltage as a function of voltage increase rate for all gfsas has been shown. in addition, histograms and fitted distribution density of dynamic breakdown voltages for lf2 and ep gfsas, that are at the same operating voltage of 230 v dc, for k = 3 and 4 v/s are shown. it is evident that the breakdown voltage for these gfsas is a statistical quantity with gaussian distribution function, as presented. the evolution of breakdown voltage and delay time under high overvoltage 321 the delay time of the gfsas is determined from the obtained experimentally memory curve. figs. 7 and 8 show those dependencies, the mean value of delay time as a function of relaxation time for different overvoltage for littelfuse and epcos gfsas. as it can be observed, in the case of lf1, for overvoltages greater than 40% this type of surge arrester works reliably. also, for overvoltages less than 40% it also works reliably for relaxation times up to 7 s. for lf2, we can notice that the device works reliably for overvoltages greater than 20%. in the case of lf3, reliability is shown for overvoltages over 30%. for overvoltages less than 30% it works reliably up to 1.5 s of relaxation time. also, lf4 shows very good reliability for overvoltages over 10%. finally, epcos device has the reliability for overvoltages over 20%, while under 20% unreliability is shown for relaxation times over 1.5 s. additionally, the previous results which show an increase in delay time with increasing relaxation time were used for further statistical analysis. the laue distribution of these data is represented by lauegrams in fig. 9. it shows that the mean value of the statistical delay time, and thus the electron yield, could be obtained from the slope of a straight line, and the formation time is cut off on the td axis. 
further analysis is planned in order to compare the results already obtained with an additional analysis of the influence of ionizing radiation on gfsa samples produced by littelfuse and epcos, similar to the investigation of the xenon-filled tube published in [27], and in view of the current interest in radiation studies of different types of components [28-30].

acknowledgement: this work has been supported by the ministry of education, science and technological development of the republic of serbia.

references
[1] m. m. pejović, introduction to electrical gas discharges. gas electronic components, in serbian, university of niš, faculty of electronic engineering, 2008, chapter 8, pp. 124-128.
[2] https://www.littelfuse.com/~/media/electronics/product_catalogs/littelfuse_gdt_catalog.pdf.pdf
[3] https://www.tdk-electronics.tdk.com/inf/100/ds/a81-a230xg-x3800t502.pdf
[4] e. živanović, s. veljković, m. živković and m. pejović, "reliability of various type of gas-filled surge arresters under dc discharge", in proceedings of the 31st international conference on microelectronics, niš, serbia, 2019, pp. 113-116.
[5] k. stanković and l. perazić, "determination of gas-filled surge arresters lifetimes", ieee trans. on plasma sci., vol. 47, no. 1, pp. 935-943, january 2019.
[6] j. he, j. lin, w. liu, h. wang, y. liao and s. li, "structure-dominated failure of surge arresters by successive impulses", ieee trans. on power delivery, vol. 32, no. 4, pp. 1907-1914, august 2017.
[7] b. lončar, p. osmokrović, a. vasić and s. stanković, "influence of gamma and x radiation on gas-filled surge arrester characteristics", ieee trans. plasma sci., vol. 34, no. 4, pp. 1561-1565, august 2006.
[8] b. lončar, m. vujisić, k. stanković, d. aranđić and p. osmokrović, "radioactive resistance of some commercial gas filled surge arresters", in proceedings of the 26th international conference on microelectronics, niš, serbia, 2008, pp. 587-590.
[9] m. m. pejović and m. m. pejović, "investigations of breakdown voltage and time delay of gas-filled surge arresters", j. phys. d: appl. phys., vol. 39, pp. 4417-4422, september 2006.
[10] m. m. pejović, k. stanković, i. fetahović and m. m. pejović, "processes in insulating gas induced by electrical breakdown responsible for commercial gas-filled surge arresters delay response", vacuum, vol. 137, pp. 85-91, march 2017.
[11] y. fu, p. zhang, j. p. verboncoeur and x. wang, "electrical breakdown from macro to micro/nano scales: a tutorial and a review of the state of the art", plasma res. express, vol. 2, p. 013001, february 2020.
[12] z. lj. petrović, j. sivoš, m. savić, n. škoro, m. radmilović rađenović, g. malović, s. gocić and d. marić, "new phenomenology of gas breakdown in dc and rf fields", j. phys.: conf. series, vol. 514, p. 012043, may 2014.
[13] s. gocić, n. škoro, d. marić and z. lj. petrović, "influence of the cathode surface conditions on v-a characteristics in low-pressure nitrogen discharge", plasma sources sci. technol., vol. 23, p. 035003, may 2014.
[14] d. marić, n. škoro, p. d. maguire, c. m. o. mahony, g. malović and z. lj. petrović, "on the possibility of long path breakdown affecting the paschen curves for microdischarges", plasma sources sci. technol., vol. 21, p. 035016, may 2012.
[15] a. m. loveless and a. l. garner, "a universal theory for gas breakdown from microscale to the classical paschen law", physics of plasmas, vol. 24, p. 113522, november 2017.
[16] m. m. pejović, č. s. milosavljević and m. m. pejović, "the estimation of static breakdown voltage for gas-filled tubes at low pressures using dynamic method", ieee trans. plasma sci., vol. 31, pp. 776-781, august 2003.
[17] j. m. meek and j. d. craggs, electrical breakdown of gases, new york, usa: wiley, 1987.
[18] m. m. pejović, g. s. ristić and j. p. karamarković, "electrical breakdown in low pressure gases", j. phys. d: appl. phys., vol. 35, pp. r91-r103, april 2002.
[19] m. m. pejović, m. m. pejović, č. i. belić and k. đ. stanković, "separation of vacuum and gas breakdown processes in argon and their influence on electrical breakdown time delay", vacuum, vol. 173, p. 109151, march 2020.
[20] m. m. pejović, "the application of a small-volume neon-filled tube in overvoltage protection", ieee trans. on plasma sci., vol. 43, no. 4, pp. 1063-1067, april 2015.
[21] m. pejović, k. stanković, m. pejović and p. osmokrović, processes induced by electrical breakdown responsible for the memory effect in low pressure noble gases, in book advances in chemistry research, 2019, chapter 2, vol. 47, edited by j. c. taylor, new york: nova science publishers, inc., pp. 47-93.
[22] e. n. živanović, "investigation of the effect of additional electrons originating from the ultraviolet radiation on the nitrogen memory effect", fu elec. energ., vol. 28, no. 3, pp. 423-437, september 2015.
[23] m. m. pejović, n. t. nesić, m. m. pejović and e. n. živanović, "afterglow processes responsible for memory effect in nitrogen", j. appl. phys., vol. 112, p. 013301, may 2012.
[24] e. n. živanović, "influence of combined gas and vacuum breakdown mechanisms on memory effect in nitrogen", vacuum, vol. 107, pp. 62-67, september 2014.
[25] e. n. živanović, m. m. pejović, m. m. pejović and n. t. nešić, "analysis of the statistical nature of electrical breakdown time delay in nitrogen at 6.6 mbar pressure in presence of positive ions and n(4s) atoms", contrib. to plasma phys., vol. 51, no. 9, pp. 877-884, april 2011.
[26] f. llewellyn jones and e. t. de la perrelle, "field emission of electrons in discharges", proc. math. phys. eng. sci., vol. 216, no. 1125, pp. 267-279, january 1953.
[27] m. pejović, e. živanović and m. živanović, "investigation of xenon-filled tube breakdown voltage and delay response as possible dosimetric parameters for small gamma ray air kerma rates", radiat. prot. dosim., vol. 190, no. 1, pp. 84-89, august 2020.
[28] d. boychenko, o. kalashnikov, a. nikiforov, a. ulanova, d. bobrovsky and p. nekrasov, "total ionizing dose effects and radiation testing of complex multifunctional vlsi devices", fu elec. energ., vol. 28, no. 1, pp. 153-164, march 2015.
[29] m. pejović, "p-channel mosfet as a sensor and dosimeter of ionizing radiation", fu elec. energ., vol. 29, no. 4, pp. 509-541, december 2016.
[30] t. pešić-brđanin, "spice modeling of ionizing radiation effects in cmos devices", fu elec. energ., vol. 30, no. 2, pp. 161-178, june 2017.

facta universitatis, series: electronics and energetics, vol. 30, no 4, december 2017, pp. 459-475, doi: 10.2298/fuee1704459s

a novel architecture with scalable security having expandable computational complexity for stream ciphers

prathap siddavaatam, reza sedaghat
electrical and computer engineering, ryerson university, toronto, canada

abstract. stream cipher designs are difficult to implement since they are prone to weaknesses based on usage: their properties are similar to those of the one-time pad, and the keystream is subject to very strict requirements. contemporary stream cipher designs are highly vulnerable to algebraic cryptanalysis based on linear algebra, in which the inputs and outputs are formulated as multivariate polynomial equations. solving a nonlinear system of multivariate equations will reduce the complexity, which in turn yields the targeted secret information. recently, addition modulo $2^n$ has been suggested over logic xor as a mixing operator to guard against such attacks. however, it has been observed that the complexity of modulo addition can be drastically decreased with the appropriate formulation of polynomial equations and probabilistic conditions. a new design for addition modulo $2^n$ is proposed.
the framework for the new design is characterized by user-defined expandable security for stronger encryption and does not impose changes in the existing layout of any stream cipher such as snow 2.0, sosemanuk, cryptmt, the grain family, etc. the structure of the proposed design is highly scalable, which boosts the algebraic degree and thwarts the probabilistic conditions while maintaining the original hardware complexity and without changing the integrity of the addition modulo $2^n$.

key words: algebraic attack, modulo addition, algebraic degree, scalability, snow 2.0, trivium, s-box, lfsr, nfsr, sat solver, stream cipher.

received april 6, 2017. corresponding author: reza sedaghat, electrical and computer engineering, ryerson university, toronto, canada (e-mail: rsedagha@ee.ryerson.ca)

1. introduction

in 1949, shannon mentioned [1] the possibility of decrypting a good cryptosystem by solving a system of simultaneous equations that describes the cryptosystem and has a large number of unknowns. one of the interpretations of this proposition [2] is to describe a cryptosystem as a system of multivariate polynomial equations and to solve this system. this has built the foundations of the cryptanalysis method commonly known nowadays as the algebraic attack. applications of this idea were first introduced in [3] and [4] to break public key schemes. later, the attack was generalized and applied to stream ciphers and block ciphers [5][6][7]. the algebraic attack focuses on formulating multivariate polynomial equations of low algebraic degree between the inputs and outputs. in the higher echelons of the security framework, some methods tend to deploy a family of protocols designed specifically to be secure against algebraic attacks [8]. the significance of the attack is that the formulae hold with probability 1 or close to 1, unlike traditional probabilistic attacks such as differential cryptanalysis [9] and linear cryptanalysis [10].
as a result, solving such equations successfully will always yield the desired value of the targeted variable. the procedure to set up the attack typically starts with the attacker finding a set of equations that describes the relationship between the input and the output. each equation in the set has an algebraic degree, and a higher degree makes the equations more difficult to solve. at the same time, it is very common that the number of multivariate equations is less than the number of variables. therefore, an attacker would try to uncover ways to lower the algebraic degree of the existing equations, or new independent equations that help describe the relationship between input and output. moreover, it is often possible to lower the degree or to form new equations based on some probabilistic condition. finally, the set of equations can be solved through techniques such as gaussian reduction or the methods described in [11], [12], and [13].

addition modulo $2^n$ has been widely used as an elementary cryptographic module in both block ciphers, such as cast [14], twofish [15], and mars [16], and stream ciphers, such as sober-t32 [17], snow 2.0 [18], and zuc [19]. typically, it is used for mixing, which combines two data sources to provide security. while the logic xor operation is also often used for mixing, modulo addition offers better security against algebraic attack [20] because it is partly non-linear in gf(2). a linear operation in gf(2), such as xor, can be described by an equation of algebraic degree 1. modulo addition is linear only at its least significant bit (lsb); therefore, it is harder for an attacker to solve using algebraic attack. it has been discovered that the algebraic degree of the formulae describing modulo addition can be reduced to quadratic [11]. at the same time, conditional properties of the modulo addition have also been discovered that lower the algebraic degree and create new independent equations.
these techniques help reduce the complexity of solving modulo addition tremendously. as a result, this paper aims to devise a new structure that increases the algebraic degree compared to the traditional modulo addition and increases the difficulty of using the conditional properties. at the same time, the size of the structure is user-defined and flexible, providing the users with scalable security against algebraic attack, specifically when security is considered a key requirement during the early stages of systems development [21][22][23][24].

the paper is structured as follows: section 2 discusses the complexity of algebraic attack and describes in detail modulo addition under the lens of algebraic attack. section 3 presents the details of the proposed design. section 4 provides the analysis of the design and compares the design with traditional modulo addition. section 5 demonstrates the application of the new design in a contemporary stream cipher example and gives the analysis of its application. finally, section 6 summarizes the proposed design briefly.

2. preliminaries and terminology

2.1. algebraic attack and its complexity

the steps typically taken in an algebraic attack can be summarized as follows:
- formulate multivariate polynomial equations describing input-output relations with probability 1
- explore additional independent equations supplementing the existing system of equations with probability 1 or close to 1
- explore conditions that can lower the algebraic degree of the system of equations
- solve the system of equations with the appropriate techniques.

algebraic immunity is a metric that has been developed [25] to provide a fast evaluation of the security against algebraic attack for a given cryptographic function. it is defined as the minimum algebraic degree in the system of equations.
moreover, the algebraic immunity has been deemed to be insufficient, and the describing degree, which is the minimum algebraic degree such that an s-box can be entirely defined by equations of that minimum degree, has been developed to provide a more thorough evaluation of security against algebraic attack [20]. these metrics all focus on measuring the algebraic degree of the set of equations that describes the targeted function, because the degree has a direct relationship to the complexity of solving the set of equations. these metrics do not attempt to address fault attacks, which inject faults at a random location and a random point of time in a stream cipher [26].

the complexity of solving algebraic attack has been explored in [5] and [7]. though the complexity can vary depending on the specific method used, it can be generalized by using the estimation of the complexity of a gaussian elimination. when applying algebraic attack to a generic stream cipher, the complexity can be estimated with the following steps:
- define r as the number of multivariate equations formed between the output bit and the states in the stream cipher
- define n as the number of variables in the stream cipher, or equivalently the number of states
- define d as the describing degree of the multivariate equations
- define t as the number of monomials of degree ≤ d; t can be calculated as:

$$t = \sum_{i=0}^{d} \binom{n}{i} \qquad (1)$$

the estimated complexity can be calculated by multiplying t with the number of operations and the cycle time required for each operation. since these parameters can be algorithm and platform dependent, t itself can be used to provide an estimation. at the same time, if conditional properties are used to reduce the algebraic degree of the set of equations, the complexity increases by attaching the probability of the conditional properties happening to t. the complexity of algebraic attack on a generic block cipher is slightly different.
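the monomial count t used in the stream cipher estimate above can be tabulated directly. the sketch below assumes that t is the usual number of monomials of degree at most d in n binary variables, i.e. the sum of the binomial coefficients c(n, i) for i = 0..d; the function name `monomial_count` and the sample value n = 64 are illustrative choices and are not taken from the paper.

```python
from math import comb


def monomial_count(n: int, d: int) -> int:
    """number of monomials of degree <= d in n gf(2) variables (assumed form of t)."""
    return sum(comb(n, i) for i in range(d + 1))


if __name__ == "__main__":
    n = 64  # hypothetical number of state variables, for illustration only
    for d in (1, 2, 3, 4):
        # t grows roughly like n^d / d!, which is why the degree dominates the cost
        print(f"d = {d}: t = {monomial_count(n, d)}")
```

the growth of t with d is what makes an increase of the algebraic degree an effective countermeasure in this complexity model.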
the complexity is contributed mainly by the non-linear component of a block cipher, which is typically an s-box. a typical s-box transforms its n-bit input variables to m-bit output variables using a vectorial boolean function. using the definitions above, the number of monomials, t, can be calculated as: (2)

the number of equations, r, is determined by forming a matrix m of size $2^n \times t$. this r is calculated as a difference given by: (3)

thereby the complexity of solving this system of equations can be estimated by: (4)

from (4), the complexity increases when the number of monomials increases. the number of monomials increases when the algebraic degree of the set of equations increases in (1) and (2).

2.2. addition modulo $2^n$

when viewing modulo addition through the eyes of algebraic attack, a set of equations describing the relationship between the input and output needs to be formed. this is outlined in [12] for the n-bit modulo addition $z = x + y \bmod 2^n$, and shown in (5).

$$z_0 = x_0 + y_0, \qquad z_i = x_i + y_i + c_{i-1}, \quad 1 \le i \le n-1 \qquad (5)$$

the variable c is used to denote the carries. the + sign in the equations denotes addition in gf(2), or simply the logic xor operation. each carry variable can be described by the set of equations given in (6). it can be seen from combining (5) and (6) that the modulo addition is partly non-linear, because only the lsb of the output is linear. the algebraic degree is dominated by the carry terms.

$$c_0 = x_0 y_0, \qquad c_i = x_i y_i + (x_i + y_i)\, c_{i-1}, \quad 1 \le i \le n-2 \qquad (6)$$

from (6), the degree increases linearly with the carry terms. this is because the more significantly positioned carry terms depend not only on their corresponding input variables, which have degree 1, but also on the previous carry terms. as c0 is generated by x0 and y0, the degree of z1 becomes 2. similarly, c1 is generated by x1 and y1 together with c0, and the degree of z2 becomes 3.
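the degree growth described above can be reproduced symbolically for small n. the sketch below assumes the standard ripple-carry description of addition modulo $2^n$ over gf(2), namely $z_i = x_i + y_i + c_{i-1}$ with $c_0 = x_0 y_0$ and $c_i = x_i y_i + (x_i + y_i) c_{i-1}$, and represents each polynomial as a set of monomials; the helper names are illustrative only.

```python
def var(name):
    """polynomial consisting of the single monomial `name`."""
    return frozenset({frozenset({name})})


def poly_add(p, q):
    """addition in gf(2): monomials present in both operands cancel."""
    return p ^ q


def poly_mul(p, q):
    """multiplication in gf(2)[x]/(x^2 = x): a product of monomials is the union of their variables."""
    out = set()
    for a in p:
        for b in q:
            out ^= {a | b}  # toggling handles coefficients modulo 2
    return frozenset(out)


def degree(p):
    """algebraic degree: size of the largest monomial (0 for a constant)."""
    return max((len(mono) for mono in p), default=0)


def modadd_degrees(n):
    """degrees of the output bits z_0 .. z_{n-1} of n-bit modular addition."""
    degs = []
    carry = frozenset()  # zero polynomial: no carry into the lsb
    for i in range(n):
        xi, yi = var(f"x{i}"), var(f"y{i}")
        s = poly_add(xi, yi)
        degs.append(degree(poly_add(s, carry)))
        carry = poly_add(poly_mul(xi, yi), poly_mul(s, carry))
    return degs


if __name__ == "__main__":
    print(modadd_degrees(6))
```

under these assumptions the output bit z_i has degree i + 1, so only the lsb is linear.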
in fact, for an n-bit output, the algebraic degree for each output bit is:

$$\deg(z_i) = i + 1, \quad 0 \le i \le n-1 \qquad (7)$$

as mentioned before, the complexity of solving the equations increases with the algebraic degree. in [11], the author has devised a set of equations that describes modulo addition but limits the algebraic degree to 2. this is shown in (8). (8)

moreover, the author is able to create in total $6n - 3$ independent equations, instead of the original n equations. this effectively reduces the complexity of algebraic attack on modulo addition even before the deployment of conditional properties.

2.3. conditional properties of modulo addition

conditional properties have been studied in [25], and the idea can be applied to modulo addition in a similar fashion. the conditional properties of modulo addition were first explored in [26] and then expanded in [20]. the goal of using conditional equations in modulo addition is to lower the algebraic degree of the equations or to create more independent equations of lower degree. in general, the occurrence of these conditions is based on the manipulation of input bits and carry bits. as mentioned before, the cost of these conditions is the probability. for input bits, the probability is assumed to be uniform, or ½. for the carry bits, the probability can be generalized as in (9). the probability of a carry being 1 approaches ½ as the number of bits increases.

$$p(c_i = 1) = \frac{1}{2} - \frac{1}{2^{\,i+2}} \qquad (9)$$

2.3.1. modulo addition with no carries

a modulo addition that generates no carries will have a completely linearized equation. in other words, the algebraic degree of (5) will be reduced to 1 when all carries are 0. the probability of this condition can be calculated using (9) and can be approximated to .

2.3.2. modulo addition output characteristic

an output characteristic that can help linearize the equations is when the output bits of the addition are all 1’s, i.e. the output is $2^n - 1$.
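the conditions of section 2.3 can be checked by exhaustive enumeration for small n. the sketch below assumes uniformly distributed n-bit inputs, no carry into the lsb, and that c_i denotes the carry out of bit position i; the function names are illustrative, and the enumeration is only practical for small n since it visits all $4^n$ input pairs.

```python
from fractions import Fraction
from itertools import product


def carry_probabilities(n):
    """exact p(c_i = 1) for i = 0 .. n-1, where c_i is the carry out of bit i."""
    counts = [0] * n
    for x, y in product(range(2 ** n), repeat=2):
        carry = 0
        for i in range(n):
            xi, yi = (x >> i) & 1, (y >> i) & 1
            carry = (xi & yi) | ((xi ^ yi) & carry)  # generate or propagate
            counts[i] += carry
    return [Fraction(c, 4 ** n) for c in counts]


def no_carry_probability(n):
    """probability that no bit position produces a carry (x and y share no 1 bit)."""
    hits = sum(1 for x, y in product(range(2 ** n), repeat=2) if x & y == 0)
    return Fraction(hits, 4 ** n)


def all_ones_probability(n):
    """probability that (x + y) mod 2^n has all output bits equal to 1."""
    mask = 2 ** n - 1
    hits = sum(1 for x, y in product(range(2 ** n), repeat=2) if (x + y) & mask == mask)
    return Fraction(hits, 4 ** n)


if __name__ == "__main__":
    n = 4
    print("carry probabilities:", carry_probabilities(n))
    print("no carry at all:", no_carry_probability(n))
    print("all-ones output:", all_ones_probability(n))
```

the carry probabilities obtained this way approach ½ from below as the bit position grows, in line with the behaviour described for the carry bits.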
given that the carry-in is 0, the output bits can be all 1’s only when the input bits are of opposite polarity. in other words, for each pair of input bits, the two bits are either (0, 1) or (1, 0). this is also referred to as propagate [6], and the probability of this occurring is $2^{-n}$. the algebraic degree of the equations is lowered to 1 in this case.

2.3.3. modulo addition input characteristics

two input characteristics can be utilized to linearize the equations. first, no carry is generated when one of the inputs is simply 0. second, there can be no carries generated when one of the inputs is the two’s complement of the other input. the output bits of this input pairing are always 0 in modulo addition. the distribution of the carry bits is as follows: there will be no carries generated from the input pairs until the first (1, 1) pair. then, the subsequent input pairs will always generate a carry. this provides a controlled distribution of the carry bits. in fact, if one of the inputs is a power of 2 and the other input is the two’s complement, then no carry will be generated. although these conditions can reduce the algebraic degree, the probability attached to these conditions is . the proposed design will not only increase the algebraic degree of the equations describing the relationship between the input and output, but also increase the difficulty of utilizing the conditional properties.

3. the new modulo addition

the proposed design is a new type of cryptographic module that provides user-defined scalable security against algebraic attack. the components of the new modulo addition include input expansion, modulo addition, and output compaction. a block diagram is shown in figure 1 to contrast the new design with traditional modulo addition.

fig. 1 block diagram of the new design

3.1.
input expansion

the input expansion function, fin(), is a function that expands each single input bit into a $2^m$-bit string based on an $n \cdot m$-bit control string ki. we specify a user-defined parameter m that can be determined depending on the security requirement. the input control string ki can typically be generated within a cipher. the actual expansion function can be flexible, meaning that the user can substitute other expanding functions for the proposed one. for example, the expanding function can be an algebraic function or it can be an s-box. the proposed expansion function is an arithmetic relationship that is easily scalable. also, each of its output bits is 0-1 balanced. we can define the input expansion function as follows: let $x = x_{n-1} \ldots x_1 x_0$ be an n-bit input and $\mathit{ki} = \mathit{ki}_{n-1} \ldots \mathit{ki}_1 \mathit{ki}_0$ be its input control string, $\mathit{ki} \in \{0,1\}^{nm}$. furthermore, let $\mathit{ki}_i = \mathit{ki}_{i(m-1)}, \ldots, \mathit{ki}_{i1}, \mathit{ki}_{i0}$ such that $\mathit{ki}_{ij} \in \{0,1\}$, $0 \le i \le n-1$ and $0 \le j \le m-1$. then, let x′ be the expanded input such that $x' = x'_{n-1} \ldots x'_1 x'_0$ and $x'_i \in \{0,1\}^{w}$, where $w = 2^m$. $\mathit{ki}_i$ is treated as a decimal number in (10). (10)

using (10), it is recommended to choose the user-defined parameter m ≥ 2 to avoid repeating values.

3.2. addition modulo $2^{nw}$

the second component of the design takes the expanded inputs and performs modulo addition. nevertheless, the number of additions has now increased from $n$ to $nw$, where $w = 2^m$, as the inputs have been expanded. let $x = x_{n-1} \ldots x_1 x_0$ be an n-bit input and $y = y_{n-1} \ldots y_1 y_0$ be another n-bit input. let $x' = x'_{n-1}, \ldots, x'_1, x'_0 = (x'_{(n-1)(w-1)}, \ldots, x'_{(n-1)1}, x'_{(n-1)0}, \ldots, x'_{1(w-1)}, \ldots, x'_{11}, x'_{10}, x'_{0(w-1)}, \ldots, x'_{01}, x'_{00})$ be the expanded input and $y' = y'_{n-1}, \ldots, y'_1, y'_0 = (y'_{(n-1)(w-1)}, \ldots, y'_{(n-1)1}, y'_{(n-1)0}, \ldots, y'_{1(w-1)}, \ldots, y'_{11}, y'_{10}, y'_{0(w-1)}, \ldots, y'_{01}, y'_{00})$ be the other expanded input. then, let $z' = z'_{n-1}, \ldots, z'_1, z'_0 = (z'_{(n-1)(w-1)}, \ldots, z'_{(n-1)1}, z'_{(n-1)0}, \ldots, z'_{1(w-1)}, \ldots, z'_{11}, z'_{10}, z'_{0(w-1)}, \ldots, z'_{01}, z'_{00})$ be the sum of the modulo addition. either equation (8) or equations (5) and (6) can be used to describe the modulo addition.
in (11), the equations are derived from (8). (11)

3.3. output compaction

the final component of the new design is a function that compresses z′ to z, i.e. from $\{0,1\}^{nw}$ to $\{0,1\}^{n}$, based on an $n \cdot m$-bit output control string ko. the function is flexible as long as the function chosen is capable of constricting the sum. in this design, the proposed function is a mux function. let $\mathit{ko} = \mathit{ko}_{n-1} \ldots \mathit{ko}_1 \mathit{ko}_0$ and $\mathit{ko}_i = \mathit{ko}_{i(m-1)} \ldots \mathit{ko}_{i1} \mathit{ko}_{i0}$, where $\mathit{ko}_{ij} \in \{0,1\}$. we have $z = z_{n-1} \ldots z_1 z_0 = z'_{(n-1)\mathit{ko}_{n-1}} \ldots z'_{1\,\mathit{ko}_1} z'_{0\,\mathit{ko}_0}$. the expression for $z_i$ can be generalized as in (12). (12)

here, (-1) refers to the complement of the control bit, summation refers to logic xor, and multiplication refers to logic and. an example is given below for m = 3; p and q are index variables.

4. analysis of the new design

in this section, the proposed design is analyzed with respect to algebraic attack.

4.1. probability of carry

the probability of carry in the traditional modulo addition can be estimated using (9). as mentioned before, each bit of the expanded input is 0-1 balanced. the input expansion function can be viewed as a collection of boolean functions such that each output bit is a $\{0,1\}^{m+1} \to \{0,1\}$ function. each boolean function, in this case, is 0-1 balanced because the output of the function has an equal chance of producing a 0 or a 1. with this assumption, the probability of carry for the new design can be derived as below. the result shows that the formula is very similar to (9). let $c' = c'_{n-1} \ldots c'_1 c'_0 = (c'_{(n-1)(w-1)}, \ldots, c'_{01}, c'_{00})$ be the carry bits generated from summing the two expanded inputs. also, the limits are $0 \le i \le n-1$, $0 \le j \le w-1$, and $w = 2^m$. (13)

4.2. new design with no carries

as a result of (13), the probability of carry has decreased from to for the same n-bit input pair. this helps increase the difficulty of an attacker to create a scenario without any carry, as discussed in section 2.

4.3.
modulo addition output characteristics in the new design

in a traditional modulo addition, the output bits can be used directly to derive potential carries and input pairings. in the new design, the output compaction function is lossy; thus, the attacker can only obtain n bits out of the nw sum bits even if the output control string ko is known. therefore, these n bits cannot provide enough information to derive the potential carries and input pairings. however, it is still possible to have all 1’s in the sum of the modulo addition component in the new design. this requires specific combinations of the two m-bit input control strings of the two inputs; in particular, the two input control strings need to be the same while the corresponding inputs need to be a propagate pair. as mentioned before, the probability of the output being all 1’s in a traditional modulo addition is $2^{-n}$. the probability of this condition occurring in the new design is decreased to ( )( ).

4.4. modulo addition input characteristics in the new design

a similar approach is applied to evaluate the use of input characteristics of the modulo addition component in the new design. first, the expanded inputs will never be all 0’s when using the input expansion function given in (10). therefore, this characteristic becomes invalid. nevertheless, it is possible for the expanded inputs to be the two’s complement of one another. by observing (10) carefully, it is discovered that there are only 3 such cases given any m and m . thus, the probability is derived to be ⁄ , which is significantly less than .

4.5. complexity of solving the new design

to evaluate the complexity of solving the new design under algebraic attack, the algebraic degree needs to be understood. the algebraic degree can be obtained by expressing the new design in its algebraic normal form (anf), which describes a boolean function as an xor of and-products of its variables [25].
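the anf and the algebraic degree of a small boolean function can be computed mechanically from its truth table with the binary möbius transform. the sketch below is a generic illustration and not the authors' procedure; as a test function it uses a $2^m$:1 multiplexer, since the output compaction of section 3.3 is described as a mux. the variable ordering (select lines in the low bits of the index, data inputs above them) and all function names are assumptions made for this example.

```python
def anf_from_truth_table(tt):
    """binary moebius transform: truth table of length 2^k -> anf coefficients.

    coefficient at index u belongs to the monomial formed by the variables whose bits are set in u.
    """
    coeffs = list(tt)
    k = len(coeffs).bit_length() - 1
    for b in range(k):
        step = 1 << b
        for idx in range(len(coeffs)):
            if idx & step:
                coeffs[idx] ^= coeffs[idx ^ step]
    return coeffs


def algebraic_degree(tt):
    """largest number of variables in any monomial with a non-zero anf coefficient."""
    coeffs = anf_from_truth_table(tt)
    return max((bin(idx).count("1") for idx, c in enumerate(coeffs) if c), default=0)


def mux_truth_table(m):
    """truth table of a 2^m:1 mux: low m index bits select, the next 2^m bits are the data."""
    w = 1 << m
    table = []
    for v in range(1 << (m + w)):
        sel = v & ((1 << m) - 1)
        data = v >> m
        table.append((data >> sel) & 1)
    return table


if __name__ == "__main__":
    for m in (1, 2, 3):
        print(f"m = {m}: degree of the 2^{m}:1 mux = {algebraic_degree(mux_truth_table(m))}")
```

with data inputs treated as independent degree-1 variables, the computed degree of the mux is m + 1, since every data input is multiplied by a product over all m select lines.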
the algebraic degree of each component is first studied and then the degree of the whole design is considered.

4.5.1. algebraic degree of input expansion

the algebraic degree is the degree of the monomial with the largest degree in the algebraic normal form. for the input expansion function, each expanded variable can be expressed in the anf by considering it as a boolean function. intuitively, the value of the expanded variable is a manipulation of the original input value based on the value of the user-defined parameter m. from the two examples given in table 1, it can be observed that the algebraic degree directly relates to the value of the user-defined parameter m. table 1 also provides a comparison summary of the algebraic degree of the input variables going into a traditional modulo addition and into the modulo addition component of the new design.

table 1 anf of input expansion function

when m = 2
xi      anf                                      algebraic degree
x′i0    kix1 + kix0 + kix1kix0 + xi              2
x′i1    1 + kix0 + kix1kix0 + xi                 2
x′i2    1 + kix1 + kix1kix0 + xi                 2
x′i3    1 + kix1kix0 + xi                        2

when m = 3
xi      anf                                                                            algebraic degree
x′i0    kix0 + kix1 + kix1kix0 + kix2 + kix2kix0 + kix2kix1 + kix2kix1kix0 + xi        3
x′i1    1 + kix0 + kix1kix0 + kix2kix0 + kix2kix1kix0 + xi                             3
x′i2    1 + kix1 + kix1kix0 + kix2kix1 + kix2kix1kix0 + xi                             3
x′i3    1 + kix1kix0 + kix2kix1kix0 + xi                                               3
x′i4    1 + kix2 + kix2kix0 + kix2kix1 + kix2kix1kix0 + xi                             3
x′i5    1 + kix2kix0 + kix2kix1kix0 + xi                                               3
x′i6    1 + kix2kix1 + kix2kix1kix0 + xi                                               3
x′i7    1 + kix2kix1kix0 + xi                                                          3

comparison of input algebraic degrees
                      input to traditional modulo addition    input to modulo addition in the new design
algebraic degree      1                                       m

4.5.2. algebraic degree of modulo addition

the algebraic degree of the modulo addition component can be evaluated using (8) or (5) and (6). equation (8) limits the algebraic degree to quadratic in the original modulo addition by utilizing the output variables. this is under the assumption that the output is observable.
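returning to table 1, its anf entries can be read as a compact rule: reading the listed terms of each row as xor-ed monomials, bit j of the expanded block equals xi when the control value equals j and the complement of xi otherwise. the sketch below implements this reading together with the wide addition of section 3.2. it is an interpretation of table 1 only, not the arithmetic form of (10); the function names and the bit ordering (x′_{i,j} at position i·w + j, so x′_{0,0} is the lsb) are assumptions.

```python
def expand_bit(x_bit, k, m):
    """expand one input bit into w = 2^m bits using the rule read off table 1.

    k is the m-bit control value for this input bit, taken as an integer 0 .. 2^m - 1.
    """
    w = 1 << m
    return [x_bit ^ 1 ^ int(j == k) for j in range(w)]


def expand_word(x, ki, n, m):
    """expand an n-bit word; ki[i] is the control value for bit i of x."""
    w = 1 << m
    out = 0
    for i in range(n):
        for j, b in enumerate(expand_bit((x >> i) & 1, ki[i], m)):
            out |= b << (i * w + j)
    return out


def expanded_sum(x, y, kix, kiy, n, m):
    """sum of the two expanded inputs modulo 2^(n*w)."""
    w = 1 << m
    return (expand_word(x, kix, n, m) + expand_word(y, kiy, n, m)) % (1 << (n * w))


if __name__ == "__main__":
    n, m = 2, 2
    x, y = 0b10, 0b11
    kix, kiy = [0, 3], [1, 2]  # hypothetical control values
    print(bin(expand_word(x, kix, n, m)), bin(expand_word(y, kiy, n, m)))
    print(bin(expanded_sum(x, y, kix, kiy, n, m)))
```

under this reading an expanded block always contains at least one 1 for m ≥ 1, which matches the observation in section 4.4 that the expanded inputs are never all 0’s.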
in the new design, the output variables of the modulo addition component may not be observable; however, it is possible to define them as additional variables so that the algebraic degree of the expression can be reduced. the drawback of this method is that the number of variables used to solve the set of equations increases. assuming that additional variables are used, the algebraic degree of the modulo addition component is at most 2m. this is because each input variable now has a degree of m and the largest degree is quadratic using (8). at this point, it can be observed that the algebraic degree has already increased by a factor of the user-defined parameter m.

in addition, it is possible to express the modulo addition using (5) and (6), and its algebraic degree is outlined by (7). as mentioned before, each input variable now has a degree of m. the lsb of the addition then has a degree of m and the rest of the output bits have a degree of $(iw + j + 1)m$. note that $0 \le i \le n-1$, $0 \le j \le w-1$, and $w = 2^m$. the derivation approach is similar to what is outlined in the previous section. the degree of the carry terms increases linearly according to their bit positions; however, the degree increases in multiples of m because the expanded input variables have a degree of m. as a result, the degree of z′01 is generated by the multiplication of two degree-m variables x′00 and y′00. the degree of z′02 can be generated by the multiplication of x′01, x′00, and y′00, or the combination of y′01, x′00, and y′00. the degrees are 2m and 3m, respectively. therefore, each output variable of the modulo addition, z′ij, has a degree of $(iw + j + 1)m$. a comparison summary of the algebraic degree is given in table 2. at this point, the algebraic degree has effectively increased by a factor of m when compared to the original modulo addition.

4.5.3. algebraic degree of output compaction

finally, the algebraic degree of the output compaction function needs to be determined.
as mentioned before, the function is a $2^m$:1 logic multiplexer (mux) function defined by (12). this equation is itself in the algebraic normal form. therefore, the degree can be determined simply by observing (12). the degree is m + 1 because the output of the mux function depends on the values of all the select lines and the input. note that the 1 comes from the assumption that the degree of the input to the mux is 1. it needs to be substituted when the degree changes.

4.5.4. algebraic degree of the new design

the algebraic degree of the new design can be determined by combining the degrees of all the components. table 3 provides the summary of the algebraic degree of the new design and a comparison to the traditional modulo addition. note that the algebraic degree of the traditional modulo addition is calculated using (8).

table 2 comparison of modulo addition algebraic degrees
                                        sum of traditional modulo addition    sum of modulo addition in the new design
algebraic degree using (8)              2                                     2m
algebraic degree using (5) and (6)      i + 1                                 (iw + j + 1)m

table 3 algebraic degree of the new design
                                 traditional modulo addition    new modulo addition
algebraic degree of input        1                              m
algebraic degree of addition     2                              m i w j m
algebraic degree of output                                      m + 1
total                            1  2                          m m i w j m

algebraic immunity and describing degree
                                 traditional modulo addition    new modulo addition
algebraic immunity               1                              2m
describing degree                2                              m ( n – – )m

table 4 complexity comparison of the corner case
                                 traditional modulo addition using (8)    new modulo addition corner case
number of input variables        2n                                       3mn + 2n
number of output variables       n                                        n
number of extra variables        0                                        0
number of equations              r = 6n – 3                               r = n
algebraic degree                 2                                        n + 1
condition cost                   0                                        $2^{3mn}$
number of monomials              = ∑ ( n i )                              = ∑ ( n m i )
complexity                       = n⁄ ⌈ ⁄ ⌉                               = ( mn n ⁄ ⌈ ⁄ ⌉

as described in table 3, the algebraic immunity has increased from 1 to 2m, i.e. to 4 for m = 2.
the describing degree has increased from 2 to at least 10 for m = and n = . in addition, it is worth noting that an attacker can seek to lower the degree of the new design by looking for additional independent equations of lower degree or by creating extra variables. whether these methods are beneficial is for the attacker to determine. however, a corner case study is provided in the next section as a starting point.
4.5.5. corner case analysis of the new design
the corner case can be obtained by looking for a conditional property of the new design. in other words, there is a probabilistic condition that can help lower the algebraic degree. from table 1, it can be seen that the algebraic degree of the input expansion function depends on the multiplication of the input control string variables. once these variables are known, the degree falls to 1. specifically, if the input control string is all 0’s, the expanded inputs are either the same as the inputs or the complement of the inputs. under this condition, the degree of the addition becomes at least 1, which is the same as for the traditional modulo addition. however, the degree does not scale linearly with the significance of the bit positions. in each block of expanded inputs, the expression for the summation of the expanded input variables can be reduced because many of the variables are the same. in fact, the degree of the lsb in each block of expanded inputs is , for 0 ≤ i ≤ n – 1. the rest of the summation bits from adding each block of the expanded inputs have a degree of . furthermore, the attacker would notice that the algebraic degree is lowest if the output compaction function selects the lsb in each block of the summation, i.e., $z'_{i0}$. for this to happen, the output control string needs to be all 0’s. as a result, the degree of the new design becomes for $z'_{i}$, and 0 ≤ i ≤ n – 1.
this is the same as the traditional modulo addition, as shown in table 2. nevertheless, this condition holds only with a probability of $2^{-3mn}$, as all control bits need to be 0’s.
fig. 2 new design schematic for snow 2.0
the traditional modulo addition and the new design can both be viewed as s-boxes, and their complexity against algebraic attack can be approximated accordingly. for the traditional modulo addition, the required parameters have been studied in [20]. a comparison is listed in table 4. in the corner case, the number of monomials is still larger because of the increase in the number of variables and in the algebraic degree. at the same time, the complexity has increased through the attached conditional cost.
5. applications of new design
in this section, a contemporary stream cipher, snow 2.0, is used to demonstrate the application of the new design. this application explores the new design being deployed in a stream cipher that uses a combiner with memory. snow 2.0 consists of an lfsr with non-linear feedback and uses an output combiner function with memory [18].
5.1.1. overview of snow 2.0
snow 2.0 uses a length 16 lfsr over . in other words, the lfsr has 16 elements, or states, and each state contains a 32-bit word. let ,…, denote the states of the lfsr. the feedback function is defined as the xor combination of multiplied by , and divided by . to produce the output key stream, a finite state machine (fsm) is used in conjunction with the lfsr. the fsm contains two 32-bit registers r1 and r2. the value of r2 is determined by feeding the value of r1 through a set of aes s-boxes and the aes mix column function. the value of r1 is determined by performing addition modulo $2^{32}$ between r2 and s5.
finally, the output combiner function is defined as first performing addition modulo $2^{32}$ between r1 and s15, then xoring the result with r2, and finally xoring that result with . snow 2.0 operates in two modes, initialization and key stream generation; it uses a 128-bit secret key and a 128-bit initialization vector in initialization mode.
5.1.2. application of the new design
the new design is used to replace the two modulo additions, and the user-defined parameter m is chosen to be 3. there are 288 extra bits required to supply the input and output control strings of each addition, because a 3-bit control string is needed for each bit of the two 32-bit inputs and of the 32-bit output (3 × 32 × 3 = 288). therefore, 576 bits in total are required for the two additions. again, the extra bits can be generated in many ways. in this case, is used to generate 288 bits and the same set of bits is used for the two insertions of the new design. the generation logic is defined as follows:
• for each bit of the first input x, 3 input control bits are needed. they are the 3 lsbs of the 3-bit circular-left-shifted . for example: ki = ( , , ) and ki = ( , , ).
• for each bit of the second input y, the 3 input control bits come from the 3 lsbs of the 3-bit circular-right-shifted and inverted . let denote the bit-wise inverted . then, ki = ( , , ) and ki = ( , , ).
• for the output control string, each group of 3 output control bits comes from the 3 lsbs of the 3-bit circular-right-shifted . for example: k = ( , ) and k = ( , , ).
this setup guarantees at least that the input control bits for the first input pair will not all be 0’s simultaneously. the new snow 2.0 setup is shown in table 5. the secret key and the initialization vector are defined as given in table 5. this is one of the test vectors listed in [18].
at this specific timeframe, s15 = 0xcc15a50b, r1 = 0xaab91a68, and s14 = 0x5164b6d9. the output of the new design here is 0x37f7b4f7, while the original modulo addition gives 0x76cebf73. also, the first output key stream word of the new snow 2.0 is 0x91cc022f and the original key stream word is 0xc355385d.
table 5 snow 2.0 test vectors [24]
attribute | hexadecimal value
secret key (k) | x
initialization vector (iv) | x
5.1.3. analysis of new snow 2.0
algebraic attacks on snow 2.0 have been studied in [26] and [20]. two methods have been proposed to linearize the addition modulo $2^{32}$ in the stream cipher. the first method is relatively straightforward, as the modulo addition can be completely linearized when there are no carries. the probability of this occurring can be estimated using (9). to be more precise, the condition is satisfied as long as no input pair generates a carry. the probability of this happening is $(3/4)^{31}$ because the probability that an input pair generates no carry is 3/4. the author in [26] seeks to use this condition for both additions and for 17 consecutive cycles. the probability of this is ⁄ , which is close to the exhaustive search, $2^{-576}$. in snow 2.0, the exhaustive search covers the 512 bits of the lfsr states and the two 32-bit registers. in the new snow 2.0, the cost of having no carries has greatly increased. as m = 3 in this application, the length of the modulo addition component in the new design becomes $32 \cdot 2^{3} = 256$. the probability of fixing the carries for one modulo addition is estimated to be $2^{-(31 \cdot 8 \cdot 17)} = 2^{-4216}$ by using (9). the corresponding cost is far higher than that of the exhaustive search. in the second method, the attacker tries to manipulate the output characteristics of the modulo addition to linearize the equations, as described in [20]. in particular, 9 consecutive values of the register r1 are fixed. the desired output values from the summation are $r1_{1} = 0$, $r1_{2} = 2^{32} - 1$, $r1_{3} = 0$, $r1_{4} = 0$, $r1_{5} = 0$, $r1_{6} = 0$, $r1_{7} = 0$, $r1_{8} = 0$, and $r1_{9} = 0$.
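as a brief aside, the figures quoted earlier in this subsection can be checked with a few lines of integer arithmetic. the sketch below is only a sanity check under stated assumptions: it takes the words s15 and r1 given above, recomputes the traditional addition modulo 2^32, and re-derives the adder length and the exponent quoted for fixing the carries in the new design. the variable names are illustrative and not taken from the paper, and the new design itself (input expansion and output compaction) is not modelled.

```python
import math

# illustrative names; only the numeric values come from the text
MASK32 = 0xFFFFFFFF

s15 = 0xCC15A50B
r1 = 0xAAB91A68

# traditional addition modulo 2^32 of the two words quoted in the text
traditional_sum = (s15 + r1) & MASK32

# method 1 on the original cipher: log2 of the probability (3/4)^31
# that one 32-bit addition generates no carries
no_carry_log2 = 31 * math.log2(3 / 4)

# method 1 on the new design with m = 3: the adder is 32 * 2^m bits long,
# and the text uses the exponent 31 * 8 * 17 for 17 consecutive cycles
m = 3
adder_length = 32 * 2**m
carry_exponent = 31 * 8 * 17

print(hex(traditional_sum))          # 0x76cebf73
print(adder_length, carry_exponent)  # 256 4216
print(round(no_carry_log2, 2))
```

the first printed value reproduces the output of the original modulo addition reported for this test vector; the output of the new design depends on the control strings and cannot be reproduced from this sketch.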
the value of r1 comes from summing r2 and s5, but the value of r2 comes from feeding r1 through the aes s-boxes and the mix column operation. therefore, only s5 needs to be fixed. due to the nature of the lfsr, 9 states need to be fixed: s5, s6, s7, s8, s9, s10, s11, s12, and s13. the associated probability is $2^{-(32 \cdot 9)} = 2^{-288}$. with the new design applied, however, this output characteristic may not be applicable. as discussed in section 4, the probability of fixing all outputs to 1 in the new design is $2^{-n(m+1)}$. in this scenario, the probability becomes $2^{-32(3+1)} = 2^{-128}$. in addition, the probability of fixing all outputs to 0 in the new design is $(3/2^{2m+2})^{n}$, which here becomes $(3/2^{2 \cdot 3+2})^{32} \approx 2^{-205}$. for a total of 9 consecutive cycles, the probability becomes $2^{-205 \cdot 8} \cdot 2^{-128} = 2^{-1768}$. in essence, the adversary may want to exploit the corner case of the new design to lower the algebraic degree. however, the control string generation logic outlined in section 5 guarantees that the input control strings for the lsbs of the two inputs will not be 0 simultaneously. therefore, the set of equations cannot be completely linearized. a summary is provided in table 6.
table 6 result analysis of new snow 2.0
(columns: snow 2.0 | new snow 2.0)
by method 1: fix carries to 0: $2^{-248}$ | $2^{-4216}$
by method 2: fix consecutive outputs: $2^{-288}$ | $2^{-1768}$
corner case: na
6. conclusions
in this paper, a new type of modulo addition is proposed to defend against algebraic attack. it contains three components: input expansion, modulo addition, and output compaction. in addition, the new design uses an expanding and compacting structure that can be user-defined to fit into various cryptographic security architectures. the new design is capable of improving the algebraic immunity by a factor of 4 when the user-defined parameter m is set to 2, and the describing degree by a factor of at least 5, as outlined in table 3.
although the algebraic degree can potentially fall back to that of the original modulo addition, the associated cost of doing so is at least $2^{-mn}$. in section 5, the new design was applied to a stream cipher, snow 2.0, to demonstrate its capability against algebraic attack. in the new snow 2.0, the cost of exploiting the output and input characteristics of the modulo addition has been increased beyond that of the exhaustive search, which is $2^{-576}$. overall, the new design can serve as a new cryptographic component that provides scalable security against algebraic attack.
acknowledgement: this work is supported by the optimization and algorithm research lab (opral), ryerson university.
references
[1] c. shannon, “communication theory of secrecy systems,” bell system technical journal, vol. 28, 1949.
[2] c. adams, “designing against a class of algebraic attacks on symmetric block ciphers,” applicable algebra in engineering, communications, and computing, vol. 17, no. 1, pp. 17-27, apr. 2004.
[3] j. patarin, “hidden fields equations (hfe) and isomorphisms of polynomials (ip): two new families of asymmetric algorithms,” in advances in cryptology – eurocrypt’96, springer berlin heidelberg, 1996, pp. 33-48.
[4] j. patarin, “cryptanalysis of the matsumoto and imai public key scheme of eurocrypt’88,” in advances in cryptology – eurocrypt’95, springer berlin heidelberg, 1995, pp. 248-261.
[5] n. courtois and w. meier, “algebraic attack on stream ciphers with linear feedback,” in advances in cryptology – eurocrypt 2003, springer berlin heidelberg, 2003, pp. 345-359.
[6] n. courtois, “algebraic attack on combiners with memory and several outputs,” in information security and cryptology – icisc 2004, springer berlin heidelberg, 2004, pp. 3-20.
[7] n. courtois and j. pieprzyk, “cryptanalysis of block ciphers with overdefined system of equations,” in advances in cryptology – asiacrypt 2002, springer berlin heidelberg, 2002, pp. 267-287.
[8] c. adams and s.
tavares, “designing s-boxes for ciphers resistant to differential cryptanalysis,” in proceedings of the 3rd symposium on the state and progress of research in cryptography, feb. 1993, pp. 181-190.
[9] e. biham and a. shamir, “differential cryptanalysis of des-like cryptosystems,” journal of cryptology, vol. 4, no. 1, pp. 3-72, jan. 1991.
[10] m. matsui, “linear cryptanalysis method for des cipher,” in advances in cryptology – eurocrypt’93, springer berlin heidelberg, 1994, pp. 386-397.
[11] n. courtois and j. patarin, “about the xl algorithm over gf(2),” in topics in cryptology – ct-rsa 2003, springer berlin heidelberg, 2003, pp. 141-157.
[12] n. courtois, “higher order correlation attacks, xl algorithm and cryptanalysis of toyocrypt,” in information security and cryptology – icisc 2002, springer berlin heidelberg, 2002, pp. 182-199.
[13] n. courtois, a. klimov, j. patarin, and a. shamir, “efficient algorithms for solving overdefined systems of multivariate polynomial equations,” in advances in cryptology – eurocrypt 2000, springer berlin heidelberg, 2000, pp. 392-407.
[14] c. adams, “constructing symmetric ciphers using the cast design procedure,” in selected areas in cryptography, springer us, 1997, pp. 71-104.
[15] b. schneier et al., the twofish encryption algorithm: a 128-bit block cipher, new york, ny, wiley, 1994.
[16] c. burwick et al., “mars – a candidate cipher for aes,” ibm corp., rep., 1998.
[17] p. hawkes and g. rose, “primitive specification and supporting documentation for sober-t32 submission to nessie,” in the proceedings of the first open nessie workshop, 2000.
[18] p. ekdahl and t. johansson, “a new version of the stream cipher snow,” in selected areas in cryptography, springer berlin heidelberg, 2003, pp. 47-61.
[19] “specification of the 3gpp confidentiality and integrity algorithms 128-eea3 & 128-eia3. document 2: zuc specification,” rep.
version 1.6, jan. 2011.
[20] n. courtois and b. debraize, “algebraic description and simultaneous linear approximations of addition in snow 2.0,” in information and communications security, springer berlin heidelberg, 2008, pp. 328-344.
[21] a. bushager and m. zwolinski, “evaluating system security using transaction level modelling,” facta universitatis, series: electronics and energetics, vol. 27, no. 1, pp. 137-151, 2014.
[22] a. khanna, “an architectural design for cloud of things,” facta universitatis, series: electronics and energetics, vol. 29, no. 3, pp. 357-365, 2016.
[23] a. janjic, s. savic, g. janackovic, m. stankovic, and l. velimirovic, “multi-criteria assessment of the smart grid efficiency using the fuzzy analytic hierarchy process,” facta universitatis, series: electronics and energetics, vol. 29, no. 4, pp. 631-646, 2016.
[24] m. a. dimitrijević, m. andrejević-stošović, j. milojković, and v. litovski, “implementation of artificial neural networks based ai concepts to the smart grid,” facta universitatis, series: electronics and energetics, vol. 27, no. 3, pp. 411-424, 2014.
[25] w. meier, e. pasalic, and c. carlet, “algebraic attacks and decomposition of boolean functions,” in advances in cryptology – eurocrypt 2004, springer berlin heidelberg, 2004, pp. 474-491.
[26] s. sarkar, s. banik, and s. maitra, “differential fault attack against grain family with very few faults and minimal assumptions,” ieee transactions on computers, vol. 64, no. 6, pp. 1647-1657, june 2015.
facta universitatis series: electronics and energetics vol. 34, no 2, june 2021, pp.
203-217 https://doi.org/10.2298/fuee2102203g © 2021 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper
pso based takagi-sugeno fuzzy pid controller design for speed control of permanent magnet synchronous motor
hamid ghadiri1, hamed khodadadi2, hooman eijei3, milad ahmadi4
1faculty of electrical, biomedical and mechatronics engineering, qazvin branch, islamic azad university, qazvin, iran
2department of electrical engineering, khomeinishahr branch, islamic azad university, isfahan, iran
3engineering and technical department, electricity distribution company of lahijan, lahijan, iran
4school of control science and engineering, shandong university, jinan 250061, pr china
abstract. the permanent magnet synchronous motor (pmsm) is a popular type of motor. pmsms are used in industrial applications because of their ability to operate at a constant speed, the absence of an excitation current and of rotor losses, and their small size. in this paper, a fuzzy evolutionary algorithm is combined with a proportional-integral-derivative (pid) controller to control the speed of a pmsm. in this structure, a takagi-sugeno fuzzy logic-pid (tsfl-pid) controller is designed to overcome the pmsm challenges, including its nonlinear nature, cross-coupling, air gap flux, and cogging torque in operation. additionally, the particle swarm optimization (pso) algorithm is used to optimize the membership functions' parameters and the rule bases of the fuzzy logic pid controller. to evaluate the proposed controller's performance, the genetic algorithm (ga), another evolutionary algorithm, is also incorporated into the fuzzy pid controller, and the resulting pmsm speed control results are compared. the results show that, although both controllers perform well, the pso based tsfl-pid controller is superior.
key words: particle swarm optimization (pso), takagi-sugeno fuzzy logic (tsfl), pid, pmsm, genetic algorithm (ga).
received september 7, 2020; received in revised form december 2, 2020
corresponding author: hamid ghadiri, faculty of electrical, biomedical and mechatronics engineering, qazvin branch, islamic azad university, qazvin, iran, e-mail: h.ghadiri@qiau.ac.ir
1. introduction
the pmsm has been broadly used in many industrial applications and is rapidly replacing induction and dc motors in servo applications. pmsms are popular because of their efficiency, high power density, low weight, and small size compared to dc and induction machines [1-3]. synchronous motors are constant-speed machines whose speed is determined by the frequency of the armature current. the armature current depends on the armature voltage. therefore, the simplest way to control the synchronous motor speed is to vary the frequency of the voltage applied to the armature. however, although the synchronous motor speed in steady-state conditions is determined by the excitation frequency, speed control through frequency is limited in practice. the main reason is that it is difficult for a synchronous machine's rotor to follow an arbitrary change in the frequency of the voltage applied to the armature. control of synchronous motors can be facilitated by control algorithms in which the stator flux and its relation to the rotor flux are directly controlled. this structure leads to torque control of the synchronous motors. control methods such as model predictive control (mpc) [4], terminal sliding mode control [5,6], the application of support vector machines in internal model control [7], adaptive control [8], the input-output feedback linearization technique [9], and neural network control approaches [10,11] have been employed to control the speed and position of pmsms.
however, traditional pid control approaches remain well known, since these controllers are readily applicable and work well around the desired operating point [12,13]. notably, conventional pid controllers fail to guide the system to high-performance operation because of integral windup. model-based control approaches are presented in [14-16]. in [17,18], researchers have treated demagnetization fault diagnosis and fault model recognition as a significant issue in the pmsm drive system. in [19], the presented method aims to reject voltage disturbances with an internal model control (imc); however, the imc method fails when the system's parameters vary significantly during operation. a fuzzy control system is analyzed in [20], but transient speed responses under load torque variations are not reported. predictive model methods have been considered by several researchers. a beneficial mpc method is applied to pmsm drives in [21]. a significant disadvantage of this method is that solving the optimization problem requires a burdensome computation at each sampling instant. in what follows, a pid controller is considered to minimize the error signal; however, constant pid coefficients are inefficient because of the system's nonlinear behavior. several studies have addressed self-tuning of the pid coefficients. fuzzy logic can be identified as the most effective method, offering excellent flexibility in adapting the pid coefficients to the system's nonlinear behavior [22]. the combination of fuzzy and pid controllers can create a reliable controller for nonlinear models. however, there are considerable deficiencies in some applications. when the system behavior becomes highly nonlinear, it is virtually impossible to use the pid controller under the fuzzy rules.
the approach presented in this study is based on optimizing the fuzzy rules of the takagi-sugeno fuzzy logic pid (tsfl-pid) controller so that it is more flexible than the conventional tsfl-pid controller. the tsfl-pid control parameters are tuned using pso. for comparison, the tsfl-pid controller is also optimized with ga. the obtained results indicate that the pso algorithm performs better than ga. the rest of this paper is organized as follows. the pmsm model and the design procedure of the fuzzy pid controller are explained in sections 2 and 3. the tsfl-pid controller is then presented and optimized using pso and ga in section 4. section 5 presents the simulation results and the related discussion. finally, the paper is concluded in section 6.
2. mathematical model of pmsm
the mathematical model of the pmsm in the d-q coordinates is given below. the voltage equations are [23]:
$$u_q = R_s i_q + L_q \dot{i}_q + \omega_e L_d i_d + \omega_e \Psi_f, \qquad u_d = R_s i_d + L_d \dot{i}_d - \omega_e L_q i_q, \tag{1}$$
where $u_d$ and $u_q$ are the direct-axis and quadrature-axis input voltages, and $i_d$ and $i_q$ are the direct-axis and quadrature-axis currents. the stator phase resistance and the direct-axis and quadrature-axis inductances are denoted by $R_s$, $L_d$, and $L_q$, respectively. furthermore, $\Psi_f$ and $\omega_e$ denote the excitation magnetic field and the electrical angular speed of the rotor. the flux linkage equations are [24]:
$$\Psi_d = L_d i_d + \Psi_f, \qquad \Psi_q = L_q i_q, \tag{2}$$
where $\Psi_d$ and $\Psi_q$ are the flux linkages of the stator winding along the direct axis and the quadrature axis, respectively. in the d-q coordinates, the pmsm electromagnetic torque is [25]:
$$T_e = 1.5\, n_p \left( \Psi_f i_q + (L_d - L_q)\, i_d i_q \right), \tag{3}$$
where $n_p$ is the number of pole pairs. the pmsm motion equations are:
$$J \dot{\omega}_r = T_e - T_l - B \omega_r, \qquad \omega_r = \frac{\omega_e}{n_p}, \tag{4}$$
where $\omega_r$ is the mechanical angular speed of the rotor, $J$ is the moment of inertia of the rotor, $B$ is the viscous friction coefficient, and $T_l$ is the load (external) torque. the state-space equations follow from the above equations as:
$$\dot{i}_q = \frac{1}{L_q}\left(u_q - R_s i_q - \omega_e L_d i_d - \Psi_f \omega_e\right), \qquad \dot{i}_d = \frac{1}{L_d}\left(u_d - R_s i_d + \omega_e L_q i_q\right),$$
$$\dot{\omega}_e = \frac{1}{J}\left(1.5\, n_p^{2} \left(\Psi_f i_q + (L_d - L_q)\, i_d i_q\right) - n_p T_l - B \omega_e\right). \tag{5}$$
in the pmsm vector control system, $i_d$ is assumed to be zero. thus, the state-space equations (5) can be rewritten as follows:
$$\dot{i}_q = \frac{1}{L_q}\left(u_q - R_s i_q - \Psi_f \omega_e\right), \qquad \dot{\omega}_e = \frac{1}{J}\left(1.5\, n_p^{2} \Psi_f i_q - n_p T_l - B \omega_e\right). \tag{6}$$
the dynamic equation of a field-oriented, vector-controlled isotropic pmsm can be obtained as follows:
$$\dot{\omega}(t) = \varphi_1 i_{qs}(t) - \varphi_2 \omega(t) - \varphi_3 T_L(t), \tag{7}$$
where $\omega = \dot{\theta}$ is the electrical rotor angular speed, $\theta$ is the electrical rotor angle, $T_L$ is the load torque disturbance input, and $\varphi_i > 0$, $i = 1, \ldots, 3$, are as follows:
$$\varphi_1 = \frac{3P^{2}}{8J}\lambda_m, \qquad \varphi_2 = \frac{B}{J}, \qquad \varphi_3 = \frac{P}{2J}, \tag{8}$$
where $P$ is the number of poles, and $J$, $B$, and $\lambda_m$ are the rotor inertia, the viscous friction coefficient, and the magnetic flux, respectively. fig. 1 shows the block diagram of the control system applied to the pmsm.
fig. 1 block diagram of a field-oriented pmsm control system [26].
as can be seen, in a field-oriented pmsm control system the three-phase current commands are computed by converting the current controller commands $i_{qsd}$ and $i_{dsd}$, where $i_{dsd}$, the direct-axis reference current, is typically equal to zero. thus, the main challenge can be described as proposing an evolutionary algorithm (ea) based fl-pid for speed control of the pmsm.
3. fuzzy-pid controller design
3.1. pid control structure
one of the most widely used controllers in various industries is the pid controller.
the main reason for its extensive use is its simplicity; the control signal is calculated as follows [27]:
$$u(t) = k_p e(t) + k_i \int e(t)\,dt + k_d \frac{d}{dt} e(t), \qquad e(t) = r(t) - y(t), \tag{8a}$$
where $u(t)$ is the control signal, $e(t)$ is the error signal, and $r(t)$ and $y(t)$ are the reference and output signals, respectively. the proportional, integral, and derivative gains are $k_p$, $k_i$, and $k_d$, respectively. the closed-loop system's operation is directly affected by variation of the controller parameters [28].
3.2. fuzzy logic controller
uncertainty is one of the inseparable parts of the industrial system [29-31]. the fuzzy logic approach controls a system intelligently without depending on an accurate system model [32,33]. a fuzzy inference system (fis) consists of a fuzzifier, a rule base, an inference engine, and a defuzzifier. a set of fuzzy if-then rules forms the central part, called the knowledge base.
3.3. designing fuzzy-pid controller
the traditional pid controller is modified by using a takagi-sugeno fuzzy controller to tune the pid gains as follows [26]:
rule i: if $\omega_e$ and $\dot{\omega}_e$ is $F_i$, then
$$i_{qsd} = -K_i^{P} \omega_e - K_i^{I} \int_0^t \omega_e \, d\tau - K_i^{D} \dot{\omega}_e, \tag{9}$$
where $\omega_e = \omega - \omega_d$ is the speed error, $\omega_d$ is the desired speed, $F_i$ ($i = 1, \ldots, 2n - 1$) are fuzzy sets, $n > 1$, $r = 2n - 1$ is the number of fuzzy rules, and $K_i^{P}$, $K_i^{I}$, $K_i^{D}$ are constants. the fuzzy sets $F_i$ are determined by the membership functions $m_i$. we assume that the fuzzy set $F_n$ accounts for $\omega_e = 0$, that $F_i$ accounts for more negative $\omega_e$ than $F_{i+1}$ does for $1 \le i \le n - 1$, and that $F_{i+1}$ accounts for more positive $\omega_e$ than $F_i$ does for $n \le i \le 2n - 2$. fig. 2 illustrates the membership functions of the input variables $\omega_e$ and $\dot{\omega}_e$, which are normalized into [-1,1]. in this figure, zo ($F_3$) denotes zero, nb ($F_1$) negative big, nm ($F_2$) negative medium, pm ($F_4$) positive medium, and pb ($F_5$) positive big.
fig. 2 membership functions of the input variables $\omega_e$ and $\dot{\omega}_e$.
the controller gains are set as [29]:
$$\begin{aligned}
&K_1^{P} \ge K_2^{P} \ge \cdots \ge K_{n-1}^{P} \ge K_n^{P} \ge 0, \qquad K_{2n-1}^{P} \ge K_{2n-2}^{P} \ge \cdots \ge K_{n+1}^{P} \ge K_n^{P} \ge 0,\\
&K_1^{I} \ge K_2^{I} \ge \cdots \ge K_{n-1}^{I} \ge K_n^{I} \ge 0, \qquad K_{2n-1}^{I} \ge K_{2n-2}^{I} \ge \cdots \ge K_{n+1}^{I} \ge K_n^{I} \ge 0,\\
&0 \le K_1^{D} \le K_2^{D} \le \cdots \le K_{n-1}^{D} \le K_n^{D}, \qquad 0 \le K_{2n-1}^{D} \le K_{2n-2}^{D} \le \cdots \le K_{n+1}^{D} \le K_n^{D}.
\end{aligned} \tag{10}$$
using the typical fuzzy inference approach, the pid control input is obtained as:
$$i_{qsd} = -\sum_{i=1}^{r} h_i(\omega_e) \left[ K_i^{P} \omega_e + K_i^{I} \int_0^t \omega_e \, d\tau + K_i^{D} \dot{\omega}_e \right], \tag{11}$$
where $h_i = m_i / \sum_{j=1}^{r} m_j$ is the normalized weight of each if-then rule and satisfies $h_i \ge 0$, $\sum_{i=1}^{r} h_i = 1$. fig. 3 shows a graphic interpretation of the fuzzy logic method used to obtain the pid control input.
fig. 3 flowchart representation of the fis [26].
according to fig. 3, the input premise variables are the speed error and its derivative. hence, the number of input variables is equal to 2, and only one output variable, $i_{qsd}$, is used. thus, the proposed approach is simpler than previous heuristics-based fuzzy control techniques, such as [34]. it should be noted that if the controller gains are set to
$$K_1^{P} = K_2^{P} = \cdots = K_{2n-1}^{P} = K^{P} > 0, \qquad K_1^{I} = K_2^{I} = \cdots = K_{2n-1}^{I} = K^{I} > 0, \qquad K_1^{D} = K_2^{D} = \cdots = K_{2n-1}^{D} = K^{D} > 0, \tag{12}$$
the fuzzy control law reduces to the conventional pid control law:
$$i_{qsd} = -\left( K^{P} \omega_e + K^{I} \int_0^t \omega_e \, d\tau + K^{D} \dot{\omega}_e \right). \tag{13}$$
using the error vector $e = [e_1, e_2]^{T}$, with $e_1 = \omega_e = \omega - \omega_d$ and $e_2 = \frac{de_1}{dt}$, the error dynamics are obtained as follows:
$$\dot{e}_1 = e_2, \qquad \dot{e}_2 = -\sum_{i=1}^{r} h_i(e_1)\, \varphi_1 \left[ K_i^{P} e_1 + K_i^{I} \int_0^t e_2 \, d\tau + K_i^{D} \dot{e}_2 \right] + \eta(t), \tag{14}$$
where $\eta(t) = -\varphi_2 \omega(t) - \varphi_3 T_L(t)$ (see equations (7) and (8)).
4. fuzzy-pid controller optimized by pso
because the pid coefficients $k_p$, $k_i$, and $k_d$ are constant, this controller is not well suited to nonlinear systems.
moreover, when the system conditions vary, the designed controller does not perform well [27,28,32,33]. employing an optimal pid controller is a suitable way to handle these conditions. the pso algorithm is one of the most effective methods for adapting and optimizing the fuzzy rules under a specific cost function. pso is a meta-heuristic algorithm that solves problems with minimal information. the genetic algorithm can also act as an optimizer with three basic operators: crossover and mutation operators, which create a new population, and one operator that discriminates among the members of the generated population [35]; members classified as unfit for survival are eliminated. in terms of performance, ga has some significant weaknesses. its main problem can be a high running cost, since hundreds of chromosomes must be kept in memory and the algorithm must run for thousands of generations. nevertheless, ga is a meta-heuristic algorithm, and it runs faster than systematic methods. in the pso algorithm, candidate solutions, known as particles, each occupy a specific position in the search space. pso is a population-based algorithm and is considerably similar to evolutionary computational techniques like ga. pso starts from a set of random solutions and searches for the optimum by updating generations. unlike ga, pso has no evolutionary operators such as crossover and mutation. as previously mentioned, particles are potential solutions that fly through the problem space, following the current optimum particles [36,37].
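the search loop described above can be sketched in a few lines of python. this is a minimal illustration under stated assumptions, not the authors' implementation: it uses the standard inertia-weight velocity and position updates with positions clamped to box bounds, and the function name pso_minimize, the swarm size, the values of w, c1, and c2, and the quadratic test cost are all placeholders chosen for the example. in the paper, the cost would instead be evaluated by simulating the pmsm drive with the candidate controller parameters.

```python
import random


def pso_minimize(cost, lower, upper, n_particles=20, n_iter=15,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """minimise cost(x) over the box [lower, upper] with a basic pso."""
    rng = random.Random(seed)
    dim = len(lower)
    # random initial positions inside the bounds, zero initial velocities
    pos = [[rng.uniform(lower[d], upper[d]) for d in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # personal bests start at the initial positions
    pbest = [p[:] for p in pos]
    pbest_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # inertia term + pull towards personal best + pull towards global best
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # move the particle and clamp it to the allowed range
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lower[d]), upper[d])
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
    return gbest, gbest_cost


def sphere(x):
    """toy cost used only for this example: sum of squares."""
    return sum(v * v for v in x)


best, best_cost = pso_minimize(sphere, [-1.0, -1.0], [1.0, 1.0])
print(best, best_cost)
```

because the global best is replaced only when a strictly lower cost is found, the returned cost can never be worse than the best cost in the initial random population.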
the equations describing the behavior of the particles in pso are given below:

$$v_i[t+1] = w\,v_i[t] + c_1 r_1 \left(x_{i,best}[t] - x_i[t]\right) + c_2 r_2 \left(x_{gbest}[t] - x_i[t]\right) \qquad (15)$$

$$x_i[t+1] = x_i[t] + v_i[t+1] \qquad (16)$$

in these equations, $t$ specifies a pseudo time increment; $v_i$ and $x_i$ describe the velocity and position of the $i$th particle; $x_{i,best}$ and $x_{gbest}$ denote the best previous position of the $i$th particle and the best global position of the swarm, respectively. besides, $c_1$ and $c_2$ are the personal and collective learning coefficients, $w$ is the inertia weight, and $r_1$, $r_2$ are random numbers between 0 and 1.

to find a solution using pso, the steps are as follows [38]:
step 1: select an approach to encode the variables into particles.
step 2: initialize the algorithm with a randomly generated population of $N_p$ particles.
step 3: evaluate the fitness of the $N_p$ particles.
step 4: initialize the positions and fitness values of the global and local bests and calculate the velocity vector.
step 5: calculate the new position from the velocity vector and evaluate the fitness corresponding to the new position.
step 6: update the local best fitness and the global best fitness and continue until the stopping criterion is reached.
step 7: stop the procedure when the stopping criterion is satisfied; otherwise, repeat the algorithm from step 4.

4.1. fitness function

several objective functions can be considered for improving the pmsm performance. in this research, the integral time absolute error (itae) index is utilized. itae is defined as follows:

$$ITAE = \int_0^T t\,|e(t)|\,dt \qquad (17)$$

the velocity vector is obtained from the following equation:

population(i).velocity = w * population(i).velocity
    + rand * C1 * (population(i).Pbest - population(i).position)
    + rand * C2 * (Gbest.position - population(i).position)

4.2. calculating new position

the new position is obtained using the following equations:

population(i).position = population(i).velocity + population(i).position
population(i).position = min(max(population(i).position, lowerbound), upperbound)

in this paper, the maximum number of iterations is set to 15.

5. simulation results

fig. 4 shows the block diagram of the suggested ea based tsfl-pid control system. an optical encoder measures the rotor position ($\theta$) and the motor speed ($\omega$). the switching frequency is assumed to be 5 khz, and a space vector modulation (svm) technique is in charge of controlling the pulse-width modulation (pwm). the pmsm parameters are given in table 1. the controlled system involves two control loops (see fig. 4). the outer loop is the suggested ea based tsfl-pid controller, and the inner loop is a traditional pi current controller. the output generated by the considered controller ($i_{qsd}$) is applied to the input of the pi current controller.

fig. 4 block diagram of the suggested ea based tsfl-pid controller for the pmsm system.

table 1 pmsm parameters
parameter                                  value
rated output                               400 w
rated current                              3.5 a
rated speed                                3000 rpm
voltage constant (vpeak l-l)               300 v
number of poles ($p$)                      4
stator resistance ($R_s$)                  0.18 ohm
stator inductance ($L_s = L_q = L_d$)      0.71 mh
magnetic flux ($\lambda_m$)                0.413497
equivalent inertia ($J$)                   0.0008 kg.m^2
friction factor ($B$)                      0.0001 n.m.s

in the first step, simulations are performed with no load applied to the pmsm. three reference speeds, 1, 1000, and 2500 rpm, denoted as cases a, b, and c, are applied to the pmsm at start time. membership functions are considered in the quasi form. the fuzzy rules for designing the output based on the error and its rate of change are given in table 2, in which $\alpha$ stands for $P$, $I$, or $D$.

table 2 fuzzy rules for computing pid gains ($K_j^\alpha$, $\alpha$: $P$, $I$, $D$)
ω_e / ω̇_e    nb         nm         zo         pm         pb
nb            K_1^α      K_2^α      K_3^α      K_4^α      K_5^α
nm            K_6^α      K_7^α      K_8^α      K_9^α      K_10^α
zo            K_11^α     K_12^α     K_13^α     K_14^α     K_15^α
pm            K_16^α     K_17^α     K_18^α     K_19^α     K_20^α
pb            K_21^α     K_22^α     K_23^α     K_24^α     K_25^α

with the increase in $r$, the control performance can be improved at the expense of the computational burden. the simulation results clearly show that $r = 5$ gives a satisfactory control performance. it should be noted that the proposed approach is similar to the ga-based fuzzy-pid controller presented in [29] and simpler than the previous fuzzy methods given in [34]. figs. 5 and 6 show the convergence diagrams of the ga and pso methods, respectively. pso converges faster than ga. besides, the final value of the cost function for the pso algorithm (0.01554) is lower than that for ga (0.01787). thus, for the present control problem, better optimal control parameters and faster convergence can be obtained using pso. the optimal control parameters for each meta-heuristic algorithm (pso and ga) are given in table 3.

fig. 5 the convergence of the ga algorithm
fig. 6 the convergence of the pso algorithm

the numbers of decision variables and control parameters are assumed equal for both methods. using eq. (18), the coefficients of the ea based tsfl-pid controllers for each meta-heuristic algorithm are obtained and listed in tables 4-6. for brevity, only the first five pid gains are given in these tables.
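as an illustration of the optimizer whose convergence is discussed above, the following python sketch implements the update rules of eqs. (15) and (16) together with the bound clamping of section 4.2. it is a minimal sketch and not the authors' implementation: the function name pso_minimize, the swarm size, the values of w, c1, and c2, the zero initial velocities, and the quadratic test cost are assumptions made for illustration; only the limit of 15 iterations is taken from the text. in the setting of this paper the cost would be the itae index of eq. (17) obtained from a simulation of the drive.

```python
import random


def pso_minimize(cost, lower, upper, n_particles=20, n_iter=15,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize cost(x) over the box [lower, upper] with a basic PSO.

    All default parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    dim = len(lower)
    # Step 2: random initial positions; zero initial velocities (assumption).
    pos = [[rng.uniform(lower[d], upper[d]) for d in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Steps 3-4: evaluate fitness, initialise personal and global bests.
    pbest = [p[:] for p in pos]
    pbest_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=pbest_cost.__getitem__)
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    history = [gbest_cost]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Eq. (15): inertia term + personal pull + global pull.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Eq. (16), followed by clamping to the bounds (Section 4.2).
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lower[d]), upper[d])
            # Steps 5-6: evaluate the new position and update the bests.
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
        history.append(gbest_cost)
    return gbest, gbest_cost, history


if __name__ == "__main__":
    # Hypothetical stand-in cost: squared distance from the origin.
    best, best_cost, history = pso_minimize(
        lambda x: sum(v * v for v in x), [-1.0, -1.0], [1.0, 1.0])
    print(best, best_cost)
```

the returned history holds the best cost after each iteration, which is the quantity plotted in a convergence diagram; by construction it never increases.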
$$K_1^P = K_{2n-1}^P = K^P,\quad K_2^P = K_{2n-2}^P = \alpha_1 K^P,\ \dots,\ K_n^P = \Big(\prod_{j=1}^{n-1} \alpha_j\Big) K^P$$

$$K_1^I = K_{2n-1}^I = K^I,\quad K_2^I = K_{2n-2}^I = \beta_1 K^I,\ \dots,\ K_n^I = \Big(\prod_{j=1}^{n-1} \beta_j\Big) K^I$$

$$K_n^D = K^D,\quad K_{n+1}^D = K_{n-1}^D = \gamma_1 K^D,\ \dots,\ K_1^D = K_{2n-1}^D = \Big(\prod_{j=1}^{n-1} \gamma_j\Big) K^D \qquad (18)$$

table 3 the achieved optimal control parameters for the pso and ga methods
parameter    pso       ga
α1           0.9814    0.8137
α2           0.5277    0.4721
β1           0.9116    0.8133
β2           0.7984    0.7107
γ1           0.7221    0.6510
γ2           0         0.1575
δ1           0.2319    0.6591
δ2           0         0.3394
1           0         0.8594
2           0.5411    0.9733
3           0.9412    0.9508

table 4 achieved $K_j^P$ parameters related to the ea based tsfl-pid controller
gain      pso         ga
K_1^P     0.04        0.0400
K_2^P     0.04        0.03176
K_3^P     0.000228    0.000228
K_4^P     0.04        0.023076
K_5^P     0.04        0.040

table 5 achieved $K_j^D$ parameters related to the ea based tsfl-pid controller
gain      pso     ga
K_1^D     0       0.00972
K_2^D     0.01    0.008155
K_3^D     0.01    0.01
K_4^D     0.01    0.003055
K_5^D     0       0.00627

table 6 achieved $K_j^I$ parameters related to the ea based tsfl-pid controller
gain      pso    ga
K_1^I     8      8
K_2^I     8      5.238
K_3^I     8      2.011
K_4^I     8      5.901
K_5^I     8      8

since the fuzzy outputs feed the pid coefficients, a takagi-sugeno (ts) fuzzy system is used instead of the mamdani fuzzy system structure. in the ts fuzzy structure, only the input variables have membership functions; the output variables do not. figure 7 is presented to evaluate the proposed controller's performance in speed tracking for the pmsm system. this figure shows the pmsm responses to variations of the speed setpoint and to external load disturbances. besides the approach proposed in this paper, the pso based tsfl-pid, the ga based tsfl-pid and pi controllers are employed for comparison. figure 7 is composed of three sub-figures. in the first sub-figure (case a), the setpoint is 1 rpm. in the second and third sub-figures (cases b and c), the setpoints are 1000 and 2500 rpm, respectively.
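the symmetric gain pattern of eq. (18) can be made concrete with a short python sketch. it assumes that the second equality in the proportional and integral rows pairs $K_2$ with $K_{2n-2}$, so that the gains are mirrored about the middle index $n$, as they are in the derivative row. the function name symmetric_gains and the numbers in the usage lines are hypothetical and are not taken from tables 3-6.

```python
from math import prod


def symmetric_gains(k_base, ratios, peak_at_center=False):
    """Build 2n-1 mirrored gains from a base gain and n-1 ratios.

    half[m] = ratios[0] * ... * ratios[m-1] * k_base, for m = 0 .. n-1.
    """
    half = [k_base * prod(ratios[:m]) for m in range(len(ratios) + 1)]
    if peak_at_center:
        # Derivative row of Eq. (18): K_n = k_base in the middle,
        # the full product of ratios at both ends.
        return half[:0:-1] + half
    # Proportional and integral rows: K_1 = K_{2n-1} = k_base at the ends,
    # the full product of ratios in the middle.
    return half + half[-2::-1]


if __name__ == "__main__":
    # Hypothetical values with n = 3, giving 2n - 1 = 5 gains per row.
    print(symmetric_gains(2.0, [0.5, 0.5]))
    print(symmetric_gains(2.0, [0.5, 0.5], peak_at_center=True))
```

with this parameterization the optimizer only has to search over one base gain and $n-1$ ratios per row instead of all $2n-1$ gains.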
besides, to assess the ability of the proposed method in disturbance attenuation, external load disturbances have been added to the pmsm system. a step disturbance of 0.5 rpm is applied in case a, while the amplitudes of the step disturbances are 200 and 1000 rpm for cases b and c, respectively. in all cases, the external disturbance is applied at 10 seconds. for a comprehensive comparison, the transient characteristics, consisting of overshoot, settling, rise, and peak times, for the pso and ga based tsfl-pid and pi controllers are presented in table 7, where cases 1, 2, and 3 correspond to cases a, b, and c. as can be seen, the proposed approach has the smallest overshoot and the shortest settling time in all cases, and the shortest rise time in cases a and b, for both speed tracking and disturbance attenuation. in other words, the disturbance effect on the pmsm performance is reduced by the proposed controller.

table 7 transient characteristics of pmsm speed control for the pso and ga optimization algorithms and the pi controller
                      case 1                  case 2                  case 3
                      pso     ga      pi      pso     ga      pi      pso     ga      pi
overshoot (mp%)       0       3.5     21.5    0       8.75    22.7    0       6.84    14
settling time (ts)    0.41    1.52    3.68    0.4     1.55    3.7     0.811   1.55    6.03
rise time (tr)        0.22    0.582   0.685   0.229   0.485   0.659   0.636   0.539   1.4
peak time (tp)        -       1.28    1.57    -       1.138   1.4     -       1.002   3.04

fig. 7 comparison between the speed response of the pmsm applying the pso-tsfl-pid, ga-tsfl-pid, and pi controllers in three cases
[panels (a), (b), (c): "speed response of the pmsm in the presence of external load disturbance"; horizontal axis: time (sec); vertical axis: speed (rpm); curves: pso-tsfl-pid, ga-tsfl-pid, pi]

6. conclusion

this paper considers the challenge of pmsm speed control, which arises from the nonlinear nature, cross-coupling, and air gap flux of the motor. a tsfl-pid controller is introduced for this purpose, and the system's overall performance is evaluated. two metaheuristic algorithms, pso and ga, are each combined with the proposed controller to optimize its parameters. the simulation results demonstrate that although both optimization algorithms perform suitably in the speed control of the pmsm, the pso based tsfl-pid controller is superior in this particular problem. moreover, compared to the pi controller, the proposed approach gives the best transient response in speed tracking and disturbance attenuation.

references
[1] k. zhao et al., "sliding mode observer-based current sensor fault reconstruction and unknown load disturbance estimation for pmsm driven system", sensors, vol. 17, no. 12, p. 2833, december 2017.
[2] h. ghadiri, "real-time stability assessment of power system using ann without requiring expert experience", majlesi j. electr. eng., vol. 14, no. 2, pp. 43-49, june 2020.
[3] s. heidarpoor, m. tabatabaei and h. khodadadi, "speed control of a dc motor using a fractional order sliding mode controller", in proceedings of the 2017 ieee international conference on environment and electrical engineering and 2017 ieee industrial and commercial power systems europe (eeeic/i&cps europe), 2017, pp. 1-4.
[4] m. preindl and s. bolognani, "model predictive direct torque control with finite control set for pmsm drive systems, part 2: field weakening operation", ieee trans. industr. inform., vol. 9, no. 2, pp. 648-657, may 2013.
[5] y. wang, h. yu, z. che, y. wang and y. liu, "the direct speed control of pmsm based on terminal sliding mode and finite time observer", processes, vol. 7, no. 9, p. 624, september 2019.
[6] k.
zhao et al., "robust model-free nonsingular terminal sliding mode control for pmsm demagnetization fault", ieee access, vol. 7, pp. 15737-15748, february 2019.
[7] g. liu et al., "internal model control of permanent magnet synchronous motor using support vector machine generalized inverse", ieee trans. industr. inform., vol. 9, no. 2, pp. 890-898, may 2013.
[8] y. zhan, j. guan and y. zhao, "an adaptive second-order sliding-mode observer for permanent magnet synchronous motor with an improved phase-locked loop structure considering speed reverse", trans. inst. meas. control, vol. 42, no. 5, pp. 1008-1021, march 2020.
[9] k. zhou, m. ai, d. sun, n. jin and x. wu, "field weakening operation control strategies of pmsm based on feedback linearization", energies, vol. 12, no. 23, p. 4526, november 2019.
[10] f. f. el-sousy, "intelligent optimal recurrent wavelet elman neural network control system for permanent-magnet synchronous motor servo drive", ieee trans. industr. inform., vol. 9, no. 4, pp. 1986-2003, november 2013.
[11] b. zhang and x. gao, "hybrid adaptive integral sliding mode speed control of pmsm system using rbf neural network", in proceedings of the 2020 international symposium on power electronics, electrical drives, automation and motion (speedam), 2020, pp. 17-22.
[12] f. cao, "pid controller optimized by genetic algorithm for direct-drive servo system", neural comput. appl., vol. 32, no. 1, pp. 23-30, september 2020.
[13] j.-w. jung et al., "adaptive pid speed control design for permanent magnet synchronous motor drives", ieee trans. power electron., vol. 30, no. 2, pp. 900-908, february 2014.
[14] c. zhang et al., "sliding observer-based demagnetisation fault-tolerant control in permanent magnet synchronous motors", j. eng., vol. 2017, no. 6, pp. 175-183, june 2017.
[15] c. zhang et al., "robust fault-tolerant predictive current control for permanent magnet synchronous motors considering demagnetization fault", ieee trans. ind. electron., vol. 65, no. 7, pp. 5324-5334, july 2018.
[16] h. mekki et al., "fault tolerant design for permanent magnet synchronous motor using fuzzy speed controller", ifac-papersonline, vol. 49, no. 5, pp. 315-320, june 2016.
[17] l. hongmei, c. tao, and y. hongyang, "mechanism, diagnosis and development of demagnetization fault for pmsm in electric vehicle", transactions of china electrotechnical society, vol. 28, no. 8, pp. 276-284, august 2013.
[18] h. li and t. chen, "demagnetization fault diagnosis and fault mode recognition of pmsm for ev", transactions of china electrotechnical society, vol. 32, no. 5, pp. 1-8, march 2017.
[19] c. xia et al., "voltage disturbance rejection for matrix converter-based pmsm drive system using internal model control", ieee trans. ind. electron., vol. 59, no. 1, pp. 361-372, january 2011.
[20] h. h. choi and j.-w. jung, "discrete-time fuzzy speed regulator design for pm synchronous motor", ieee trans. ind. electron., vol. 60, no. 2, pp. 600-607, february 2013.
[21] s. chai, l. wang, and e. rogers, "a cascade mpc control structure for a pmsm with speed ripple minimization", ieee trans. ind. electron., vol. 60, no. 8, pp. 2978-2987, august 2013.
[22] i. hassanzadeh, h. ghadiri, and r. dalayimilan, "design and implemention of a simple fuzzy algorithm for obstacle avoidance navigation of a mobile robot in dynamic environment", in proceedings of the 5th international symposium on mechatronics and its applications (isma), 2008, pp. 1-6.
[23] s. wang, "windowed least square algorithm based pmsm parameters estimation", math. probl. eng., vol. 2013, article id 131268, september 2013.
[24] q. xu et al., "multiobjective optimization of pid controller of pmsm", j. contr. sci. eng., vol. 2014, article id 471609, august 2014.
[25] s. wang, "adrc and feedforward hybrid control system of pmsm", math. probl. eng., vol. 2013, article id 180179, december 2013.
[26] h. h. choi et al., "precise pi speed control of permanent magnet synchronous motor with a simple learning feedforward compensation", electr. eng., vol. 99, no. 1, pp. 133-139, march 2017.
[27] m. ahmadi and h. khodadadi, "self-tuning pd2-pid controller design by using fuzzy logic for ball and beam system", in fundamental research in electrical engineering, springer, 2019, pp. 217-225.
[28] a. dehghani and h. khodadadi, "designing a neuro-fuzzy pid controller based on smith predictor for heating system", in proceedings of the 17th international conference on control, automation and systems (iccas), 2017, pp. 15-20.
[29] h. ghadiri and m. r. jahed-motlagh, "lmi-based criterion for the robust guaranteed cost control of uncertain switched neutral systems with time-varying mixed delays and nonlinear perturbations by dynamic output feedback", complexity, vol. 21, no. s2, pp. 555-578, november 2016.
[30] h. ghadiri, m. r. jahed-motlagh and m. barkhordari yazdi, "robust stabilization for uncertain switched neutral systems with interval time-varying mixed delays", nonlinear anal.-hybri., vol. 13, pp. 2-21, august 2014.
[31] h. ghadiri, m. r. jahed-motlagh and m. b. yazdi, "robust output observer-based guaranteed cost control of a class of uncertain switched neutral systems with interval time-varying mixed delays", int. j. control, autom. syst., vol. 12, no. 6, pp. 1167-1179, october 2014.
[32] h. khodadadi and h. ghadiri, "fuzzy logic self-tuning pid controller design for ball mill grinding circuits using an improved disturbance observer", mining, metallurgy & exploration, vol. 36, no. 6, pp. 1075-1090, july 2019.
[33] h. khodadadi and a. dehghani, "fuzzy logic self-tuning pid controller design based on smith predictor for heating system", in proceedings of the 16th international conference on control, automation and systems (iccas), 2016, pp. 161-166.
[34] y.-s. kung, c.-c. huang and m.-h. tsai, "fpga realization of an adaptive fuzzy controller for pmlsm drive", ieee trans. ind.
electron., vol. 56, no. 8, pp. 2923-2932, august 2009.
[35] a. abdollahi, a. foruzan tabar and h. khodadadi, "optimal controller design for quadrotor by genetic algorithm with the aim of optimizing the response and control input signals", cumhuriyet üniversitesi fen-edebiyat fakültesi fen bilimleri dergisi, vol. 36, no. 3, pp. 135-147, may 2015.
[36] h. b. novin and h. ghadiri, "particle swarm optimization base explicit model predictive controller for limiting shaft torque", in proceedings of the 5th iranian joint congress on fuzzy and intelligent systems (cfis), 2017, pp. 35-40.
[37] m. mohammadhosseini and h. ghadiri, "a combination of genetic algorithm and particle swarm optimization for power systems planning subject to energy storage", j. comp. robot., vol. 12, no. 1, pp. 65-76, november 2019.
[38] h. h. choi, h. m. yun and y. kim, "implementation of evolutionary fuzzy pid speed controller for pm synchronous motor", ieee trans. indust. inform., vol. 11, no. 2, pp. 540-547, april 2015.

ninoslav stojadinović, full member of the academy of engineering sciences of serbia (ains) since 1999

facta universitatis series: electronics and energetics vol. 34, no 1, march 2021, pp. i-ii © 2021 by university of niš, serbia | creative commons license: cc by-nc-nd

in memoriam

ninoslav stojadinović, a friend of many, passed away on december 25, 2020, after a month-long fight with covid-19. ninoslav, nino to his friends, was a chairman of many conferences, an editor of many scientific journals, a well-known professor, a highly respected mentor to his students, and a role model for colleagues. he was editor-in-chief of the scientific journal facta universitatis, series: electronics and energetics (university of niš) in the period 2013-2020. ninoslav d. stojadinović was born in niš on 20 september 1950 (father dobrivoje stojadinović and mother nadežda đorđević). he was married to anđelka, with whom he had a son, dragan.
he spent most of his student and working life at the faculty of electronic engineering, university of niš, serbia. it was at this university that he obtained all his degrees, b.s. (1974), m.s. (1977), and ph.d. (1980), all in electrical engineering. he started his first professional job at "ei-semiconductors" in niš in 1974 and joined the faculty of electronic engineering in 1976, where he became full professor in 1991. he was head of the department of microelectronics (1984-2005), vice-dean (1986-1989), and dean of the faculty of electronic engineering in niš (1989-1994). he was vice-director of the sasa research center at the university of niš from 1991 to 1996. based on his versatile scientific and teaching work, as well as the achieved results, he became an academician at the department of technical sciences of the serbian academy of sciences and arts (sasa, corresponding member from october 30, 2003; full member from november 1, 2012). he was president of the sasa branch in niš from 2016 to 2020. ninoslav stojadinović was a member of commissions for the promotion of teaching staff at griffith university (australia), brown university (usa), national technical university of athens (greece), technical university of sofia (bulgaria), and banaras hindu university (india). in 1997 he was a visiting professor at the technical university of wien. under his supervision 48 dipl. ing. theses, 13 m.sc. theses, and 17 ph.d. theses were realised. from 1990 he was a member of the editorial board of microelectronics journal (elsevier), and from 1993 to 1995 he was editor-in-chief of that journal. from 1993 to 1996 he was regional editor for europe of microelectronics reliability (elsevier), and he was editor-in-chief of that journal in the period 1996-2017. from 2013 to 2020 he was editor-in-chief of the journal facta universitatis, series: electronics and energetics (university of niš).
he was also a member of the scientific and/or programme committees of more than 50 international and numerous national scientific meetings. from 1986 he was a member of the international institute of electrical and electronics engineers (ieee). in the period 2002-2005 he was chair of the serbia & montenegro ieee section, and from 1994 to 2020 he was chair of its ieee ed/ssc chapter. in 1998 he became an ieee senior member, in 2003 an ieee fellow, and in 2019 a life fellow. from 1996 he was an ieee eds distinguished lecturer for the field of microelectronics. he was chair of the society for etran in the period 2001-2006. he was editor-in-chief of the journal ieee eds newsletters in the period 1998-2001, and an ieee eds adcom member in the period 2002-2007. in the period 1984-1991 he was a member of expert teams for the creation of a development strategy for microelectronics and electronics in serbia and yugoslavia. from 1990 to 1995 he was chair of the committee for electrotechnics, and from 1995 to 1998 he was a member of the committee for information technology at the ministry of science and technology of the republic of serbia. in 1988 he founded the department of microelectronics at the faculty of electronic engineering in niš, and the course in microelectronics. in the period 1988-1991, within the department of microelectronics at the faculty of electronic engineering in niš, he founded three scientific research laboratories where several scientific research projects were realised. from 2001 he was a member of the international scientific advisory boards at the center for nanotechnologies, clemson university (usa) and the center of excellence: micro and nanotechnology applied research, warsaw technical university (poland). from 2005 he was a member of the european expert commission for the seventh research framework program ec fp7, and from 2008 he was a consultant of the national science foundation of taiwan government (nsftg).
ninoslav stojadinović published 97 papers (9 invited/review) in reputable international journals on the sci list and 185 papers (23 invited) in the proceedings of international and national scientific conferences. he was author/coauthor of four chapters in international monographs: computer engineering handbook and digital design and fabrication (crc press, usa), micro electronic and mechanical systems (in-tech press, boca raton) and bias temperature instability for devices and circuits (springer science). according to data from scopus, his papers have been cited more than 560 times. he realised a number of technological solutions, six of which are applied in the microelectronics industry all over the world. ninoslav stojadinović had significant political and diplomatic experience. he was a member of the assembly of the republic of serbia in the period 1997-2000, a member of the assembly of the state union of serbia and montenegro and its representative to the parliamentary assembly of the council of europe in the period 2004-2006, and deputy speaker of the assembly of the republic of serbia in the period 2014-2016. he was also ambassador of the republic of serbia to the kingdom of sweden (2005-2011) and to bosnia and herzegovina (2011-2013). nino was always available to support students and colleagues in their first steps in many important professional tasks. we will never forget his significant contribution in all fields. we will miss him for a long time, but his contribution will stay engraved in the marble of the history of many universities, institutions, journals and conferences. we express our deepest condolences to all scientists around the world who followed his works, knew him, or met him.

editorial board

facta universitatis series: electronics and energetics vol. 27, no 4, december 2014, pp. 543-560 doi: 10.2298/fuee1404543v

lifetime aspects of wearable electronics

hans de vries
philips research laboratories, eindhoven, the netherlands

abstract.
in the course of two european projects, the endurance behavior of stretchable electronic substrates and of electronic textiles was investigated. the results have to a large extent been published already. in this work, new analyses of the earlier results are presented. straightforward analytical approaches are used to describe fatigue under cyclic mechanical loading in the two technologies. for stretchable substrates, the actual plastic strain upon stretching could be qualitatively evaluated to replace the engineering strain. for the textile-based substrates, the bending strain was estimated. additionally, there are sub-categories within each technology which perform differently and apparently show percolation behavior.

key words: stretchable electronics, electronic textile, endurance test, cyclic mechanical loading, fatigue, percolation

1. introduction

during the last decade, or perhaps even longer, the notions of wearable electronics and smart textiles have received ever more attention. the main justification for this development is the possibility to make electronic applications that conform to complex shapes. one can think of monitoring human body functions, where small signals need to be measured and thus the devices must be placed close to or on the skin. the application areas are, amongst others, health care, wellness and leisure. one also has greater freedom of design in consumer and professional products, for instance in lighting and automotive. in this contribution the endurance behavior of the abovementioned technologies is addressed. quite understandably, research and development organizations have initially devoted most of their effort to the study and development of the application areas and the technologies. as for stretchable substrates with conducting tracks and components, the first investigations are from princeton university, lawrence livermore national laboratory, and johns hopkins university (see references in [1]).
here, we refer to the european project stella*, where three technologies were investigated and demonstrator products were made [1, 2, 3]. textile-based technologies and applications were the subject of the european project place-it†.

received september 25, 2014
corresponding author: hans de vries, philips research laboratories, high tech campus 34, 5656ae eindhoven, the netherlands (e-mail: j.w.c.de.vries@philips.com)
* stella: stretchable electronics for large area applications. ist-028026, 2006–2010.

recently, an extensive review has been published on the history of smart textiles, on trends, and on the important challenges such as the manufacturing technologies and the robustness of the products [4]. of course, much more has been published on stretchable and textile-based electronics, but it is not the purpose of this paper to give a full account of the literature. an important factor that needs attention before products are introduced on the market is the reliability and the lifetime. since reliability is defined as ‘the ability / probability of a product to fulfill its intended function, under specified loading conditions, during a specified period of time’, the reliability and the lifetime requirements depend on the application. one thus has to make an inventory of the relevant stress factors that the product will be subjected to, and test for them accordingly. especially when new applications, new technologies, new materials, or a combination of these are developed, one might encounter new stress conditions which ask for new test strategies. two examples may serve as illustration. suppose one designs an activity monitor for paramedical usage or for athletes (for example, to measure heartbeat or respiration frequency). the device will stretch about 5% when it is pulled over the wrist; in actual use the stretch is less. this will be done ten to twenty times daily for, say, three hundred days per year.
with a desired lifetime of three years, the product should endure on the order of ten thousand stretch cycles of 5% (ten to twenty cycles per day for three hundred days per year over three years gives 9,000 to 18,000 cycles). the second case is a textile-based device that will be tightly wrapped around a limb or the body, for instance a device that is meant for skin treatments (light therapy). because a textile is only slightly stretchable in the warp and weft directions (perhaps a few percent), the most important stress type is the bending that occurs each time the device is applied. with a required lifetime of three years, the product must endure several thousand bending cycles. of course, this will also depend on the bending radius. a special feature of wearable electronic products is washability. this was also touched upon in the two referenced european projects, but the scope was comparatively limited, and as far as the author is aware no results were published. apart from the fluids and detergents, various mechanical stresses will obviously be imposed on the products: bending, folding, rolling, and shock impacts. to summarize, the introduction of stretchable and textile-based electronic products goes hand in hand with additional factors that influence the reliability and the lifetime of these products. it will be clear that the conventional stress factors, such as temperature and humidity, still apply and must be tested for. this leads to the actual purpose of the paper, which is to address the particularities of wearable electronics that relate to reliability and lifetime. all data come from the two european projects that were referred to earlier. often a dedicated test setup had to be designed to investigate the failure modes. a couple of results on endurance tests have been published for stretchable electronics [5, 6] and for structures made with woven and nonwoven textile [7, 8]. sometimes, a model description for the failure mechanism could be included.
it is worthwhile to review these earlier studies and to explore further the possibilities of modeling the failure mechanisms. these will be rather simple models, but they are only meant to shed light on the underlying physics. thus we hope to encourage other researchers to continue in this field.

† place-it: platform for large area conformable electronics by integration. ist-0248048, 2010–2013.

the main subject is the fatigue lifetime modeling of the two technologies in general. however, within each of these two there is a category which cannot be related to fatigue. instead, there are indications that the behavior of these categories should be explained in terms of percolation theory. the paper is organized as follows. in the second chapter the samples fabricated with the various technologies are briefly introduced. the test methods and the test results are also presented, including the essential results of the failure analyses. fatigue lifetime analyses are the subject of the third chapter. the abovementioned special categories will be discussed in chapter four. finally, a summary and outlook are formulated.

2. test structures, test methods, and test results

as was mentioned, nearly all test structures, methods and results have been published already, and thus we will give only a brief description in this chapter. further details can be found in the referenced literature. at the end of this chapter a compilation of the available test data will be made, which will be further analyzed and discussed in the next chapter.

fig. 1 meander structure of stretchable copper board and stretchable molded interconnect technologies [5].
fig. 2 design of meander and definition of parameters [5, 9, 11].

2.1. stretchable technology

for each of the three stretchable technologies basically the same test structures were used [5]. these have a meander-shaped conductor incorporated in or on the stretchable substrate.
a meander-like structure allows for larger deformations of the metal conductor as it acts as a kind of spring [9]. other shapes, which have a spring-like feature in common, have been proposed and studied as well, see e.g. [10]. the second technology is called stretchable molded interconnection (smi) and uses polydimethylsiloxane structures (pdms, sylgard 186 of dow corning) [3]. here, the copper tracks have a thickness of 18 µm. the total thickness of the substrate is 1 mm. 546 h. de vries a typical layout of the samples for both technologies is shown in figure 1. the influence of the shape of the meander – the angle (), radius (r), pitch (p), and width (w), see figure 2 – on the mechanical properties [9, 11] and the endurance behavior [12, 13] has been investigated. the third technology is called stretchable polymer board (spb) and one applies screen printing of a conductive paste on a polyurethane micro fiber carrier non-woven material [2]. the paste consists of a thermoplastic polyurethane matrix filled with silvercoated flakes. in this case a horseshoe shape of track was made (for this case, in figure 2 the angle  = 0). the substrates have a thickness of about 0.27 mm. we will present and discuss the results in chapter 4. 2.2. electronic textile two basic types of textile samples were manufactured for the endurance tests. one type is made by weaving conductive yarns into a textile, mostly parallel to the warp direction [4, 7]. since light therapy is considered as a possible application, light emitting diodes (leds) are mounted on the textile on positions where the conductive yarn lies on top of the weave, the so-called floats. fig. 3 woven test sample with rows of leds [7]. the globtop is visible as the brown ring around the leds. the conductive yarns are in the warp direction (left-right). fig. 4 layout of non-woven test structure of combined stitched (red dashed lines) and embroidered conductive yarns (blue solid lines) and interconnection [8]. 
(black dot, see insert). the resistance is measured between a and b.

in figure 3 an example of a textile with leds is depicted. the leds are electrically connected to the conductive yarns with conductive adhesive paste and this connection is reinforced by globtop around the led. typical dimensions of the textile are 85 × 100 mm². the weave is characterized by the thickness of the cotton wire, which is given by the dtex number (mass in grams of 10,000 meter of wire), and by the pitch of the weave, given in picks per centimeter. the conductive yarn consists of 20 silver-clad copper filaments of 40 µm thickness. the bundle has a twist of about 240 turns per meter. the other type of textile test sample is made from non-woven material of about 0.6 mm thickness and a size of 80 × 150 mm². conductive yarns are bundles of 34 polyamide filaments which are coated with about 1 µm silver. two different techniques were used to attach these yarns to the substrate. one is by stitching with a sewing machine. in this case an upper and an under yarn, both conductive, are stitched together. alternatively, the conductive yarn is embroidered to the top of the substrate by a separate, non-conductive yarn. this is called soutache. in the test samples both techniques were combined and the two yarns were connected by a knot as shown schematically in figure 4. one can also connect two similar yarns in this way; the essential point is to use existing methods of applying yarns and making electrical interconnections. the insert shows a detail of such an interconnection. the resistance was monitored between the points a and b. for further information, see [8]. the results will be discussed separately in chapter 4.

2.3. test methods

from the introduction it will have become clear that the relevant type of stress is repeated bending or stretching. thus, test setups are needed to provide such deformation in a cyclic manner to the test structures.
the stretchable samples were mounted in test machines, either commercially available or home built, and subjected to stretching cycles of up to 20% for scb- and even higher for smi-technology, and 5% for the spb-samples [5, 14]. care was taken not to stretch the samples too close to the limit of maximum elongation. simultaneously, the integrity of the metallic conductors and of the electrical interconnections was monitored in order to determine the moment of failure.

fig. 5 setup for cyclic bending test, showing springs to hold the textile straight, and clamp (white blocks) to move the sample up and down.

fig. 6 schematic of cyclic bend test with springs, and clamp moving up and down over distance (z). the half-length of the non-rigid part is given by l.

for the electronic-textile samples a dedicated bending test machine was built as described in [15]. the sample (figure 5) is held straight by springs. the central part with the leds is moved down and back again, which causes the textile substrate to bend next to the rigid row of leds, see figure 6. the current through the leds was recorded to observe the moment of failure in the conductive yarns or in the electrical interconnections. in the same setup the non-woven substrates with the stitched and embroidered conductive yarns (figure 4) were tested. here, the row with the knot-type interconnections is tested. the electrical resistance of the conductive yarns was measured. the layout of the various test structures made it possible to do a number of measurements per sample in parallel. in most cases experiments were repeated, so that in the end a statistical evaluation of the failure data could be done. for that purpose weibull statistics were applied. the two-parameter weibull function is a probability distribution function described by a scale parameter ($\eta$) denoting its position and a shape parameter ($\beta$) denoting its width.
the cumulative distribution function as function of time (t), with scale parameter $\eta$ and shape parameter $\beta$, is:

$$F(t) = 1 - \exp\left[-\left(\frac{t}{\eta}\right)^{\beta}\right] \tag{1}$$

examples of failure distributions with the 95%-confidence levels are given in figure 7 and figure 8. the slopes of the distributions are 6.4–10 for the scb-samples and 3.5–4.8 for the textile samples. these values indicate that accelerated test conditions were applied, leading to wear-out failures. ‡

‡ the failure rate decreases over time (t) for $\beta$ < 1 (early failures), is constant for $\beta$ = 1 (random failures), and increases for $\beta$ > 1 (aging or wear-out failures). for severe or accelerated test conditions, $\beta$ can have values well larger than two or three.

fig. 7 weibull failure distribution with 95%-confidence levels of scb-samples cyclically stretched to 5% and 7.5%. [axes: unreliability f(t) versus cycles to failure]

fig. 8 weibull failure distributions with 95%-confidence levels of textile sample cyclically bent by displacing over 59 mm and 77 mm. [axes: unreliability f(t) versus cycles]

2.4. test results & failure analyses

in the following paragraphs of this section the results from the endurance tests are collected. both the lifetime results and the failure analyses will be given. for each case (stretchable, textile) the cyclic lifetime is given as it was obtained for each test condition. this will be the elongation or engineering strain for the stretchable samples, and the displacement (z) imposed by the bending tool (see figure 6) for the textile-based samples. the lifetime is defined as the scale parameter of the weibull distributions ($\eta$, see equation (1)), which is thus the time or number of cycles to 63.2% failures.

table 1 shows the results for two of the stretchable technologies. table 2 lists the corresponding results for the textile substrates. it should be mentioned that after 950,000 cycles to 2.6% stretch, the smi-sample had not failed.
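as a small numerical illustration of equation (1) and of the lifetime definition used here, the sketch below evaluates the two-parameter weibull function and its inverse. the function names are hypothetical, and the parameter values are only illustrative: the scale parameter is borrowed from the 5% scb entry of table 1, while the shape parameter is an assumed value inside the 6.4–10 range quoted for the scb-samples.

```python
import math


def weibull_cdf(t, eta, beta):
    """Two-parameter Weibull unreliability F(t) = 1 - exp(-(t/eta)**beta)."""
    return 1.0 - math.exp(-((t / eta) ** beta))


def cycles_at_fraction(fraction, eta, beta):
    """Invert F(t): number of cycles at which the given fraction has failed."""
    return eta * (-math.log(1.0 - fraction)) ** (1.0 / beta)


# Illustrative values only (assumptions, not fitted results):
# eta from the 5% scb entry of table 1, beta assumed within 6.4-10.
eta, beta = 3200.0, 8.0

# At t = eta the unreliability equals 1 - 1/e, whatever the value of beta.
f_at_eta = weibull_cdf(eta, eta, beta)

# Cycles at which 10% of the population would have failed.
t10 = cycles_at_fraction(0.10, eta, beta)

print(f"F(eta) = {f_at_eta:.3f}")
print(f"cycles to 10% failures = {t10:.0f}")
```

evaluating the function at t = η gives 1 − 1/e, which is why the scale parameter corresponds to 63.2% failures regardless of the slope.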
as with the smi-sample at 2.6% stretch, the textile-bending test to 35 mm displacement did not lead to complete failure, but several filaments in the bundle were broken, and so we include this result in the evaluations as if it were a failure.

table 1 engineering strain and observed cycles to failure for the indicated stretchable technologies. the margins are the 90%-confidence levels from the failure distributions.

technology   strain (%)   cycles to failure      reference
scb          20           150±20                 [5]
             15           300±50
             10           700±100
             7.5          920±120
             5            3200±200
             2.5          45000±6500
smi          20           200±50                 [5]
             15           500±150
             10           2100±800
             10           2500±800
             7.5          3300±300
             5            16000±6000
             4            100000±15000
             2.6          950000: no failure

table 2 displacement (z) and observed cycles to failure for textile with indicated characteristics (dtex in grams per 10,000 meter of yarn, picks/cm in number of yarns per cm). the margins are the 90%-confidence levels from the failure distributions.

textile         z (mm)   cycles to failure          reference
dtex: 76        86       25000±7000                 [7]
picks/cm: 33    77       49000±8000
                59       105000±20000
                35       1500000: partial failure

finally, selected results from the failure analyses are given to illustrate the failure mechanisms. this is an essential step in deciding which models should be used to describe the lifetime of the stretchable and textile substrates. typical examples of the failure mode in the meandering cu-tracks of the stretchable substrates are shown in figure 9 and figure 10 for scb and in figure 11 and figure 12 for smi. from the pictures one can derive that the failure mechanism is fatigue, caused by the mechanical cycling. from the statistical analyses (see e.g. figure 7) it followed that the cumulative distributions are nearly parallel to each other, which supports the conclusion that the same failure mechanism applies in all cases.

fig. 9 scb-sample with fatigue deformation. the scale bar is 100 µm.

fig. 10 scb-sample with fatigue crack after testing to 5% stretching. the scale bar is 200 µm. from [5].

fig.
11 micro cracks in the cu-track of smi-sample after testing to 4% stretching. the scale bar is 100 µm. from [5].

fig. 12 incomplete fatigue crack in the cu-track of smi-sample after testing to 4% stretching. the scale bar is 200 µm.

as for the textile-based technology, some results of failure analyses have been published before [7], but since these are new in comparison to the polyurethane-based stretchable technologies it is useful to elaborate on them in somewhat more detail here. figure 3 shows a typical layout of an electronic textile with rigid components (leds reinforced with globtop). upon bending, most of the deformation takes place at the transition from the rigid to the textile parts (see figure 5 and figure 6), and this is exactly the location where failure occurs. in figure 13 an x-ray image of a detail of the textile with led after the cyclic bend test is shown. one can see the position where the bundles of filaments are broken. next to this picture, figure 14 shows an optical image of the same part where one of the two broken conductive yarns has been pulled out of the weave. the failure site coincides with the edge of the globtop. for completeness, a cross section of such a construction proves that the metallic conductor indeed breaks right outside the edge of the rigid globtop. one can see this in figure 15, where a part of the led is shown with its lead attached to the conductive yarn in the textile. the electrical interconnection is made with conductive adhesive, which is a mechanically weak connection. therefore globtop is applied, which is also visible in the photo. the detail of figure 16 reveals a crack in the cu-filaments close to where the globtop ends. all this is evidence for fatigue as failure mechanism.

fig. 13 x-ray image of led-on-textile after cyclic bend test. the two conductive yarns are broken at the edge of the rigid to textile transition.

fig.
14 optical photo of led-on-textile after cyclic bend test, same as in figure 13. one of the two conductive yarns has been pulled out of the textile.

fig. 15 cross section of led on textile with from top to bottom: lead / adhesive / yarn. the dark grey mass is the globtop.

fig. 16 cross section showing yarn with crack close to the edge of the globtop.

the statistical analyses indicate that in all test cases the conductive yarns failed because of the same mechanism: the cumulative distributions are parallel, as one can infer from figure 8.

before turning to the evaluation of the results in order to develop a model description of the failure mechanism, it is convenient to summarize the results that were collected. cyclic endurance tests have been carried out. samples based on stretchable technologies were elongated to various levels until electrical failure occurred. similarly, samples based on electronic textile were bent at different levels. the number of cycles to failure depends on the strain level. from the failure analysis it was concluded that the failure mechanism is fatigue.

3. analyses & discussion

in this chapter, some basic elements that relate to fatigue are discussed first. then we provide additional evaluations of the data that were published earlier. this leads to a close agreement with results obtained on stretchable substrates that are reported in other research.

3.1 fatigue general

as discussed in the previous chapter, fatigue has been identified as the mechanism of failure. in so far as low-cycle fatigue is concerned, the well-known manson-coffin relation [16] is commonly used to describe the cycles to failure as function of the applied strain:

$$\frac{\Delta\varepsilon_p}{2} = \varepsilon_f'\,(2N_f)^{c} \tag{2}$$

$\Delta\varepsilon_p$ is the plastic strain range, $\varepsilon_f'$ and $c$ are the fatigue coefficient and exponent respectively, and $N_f$ is the number of cycles.
this relation strictly applies only to the situation of large strain values, where plastic strain dominates. typical values for the exponent (c) are between about -0.5 and -0.7 for most metals. a similar relation was described by basquin [17] for the fatigue stress as function of the number of repetitions, which morrow proposed to write [18]:

$$\sigma_a = \sigma_f'\,(2N_f)^{b} \tag{3}$$

here, $\sigma_a$ is the fatigue strength, $\sigma_f'$ and $b$ are the fatigue strength coefficient and exponent, and $N_f$ the number of cycles. values for the exponent (b) are normally in the range of -0.07 to -0.12. this latter expression describes what is referred to as high-cycle fatigue. morrow recognized that because stress and strain are connected through the elastic modulus ($E$), both equations can be combined to an expression for the total strain amplitude $\Delta\varepsilon/2$ [18], with the symbols of equations (2) and (3):

$$\frac{\Delta\varepsilon}{2} = \frac{\sigma_f'}{E}\,(2N_f)^{b} + \varepsilon_f'\,(2N_f)^{c} \tag{4}$$

it thus follows that two slopes exist in a diagram of the total strain amplitude versus the number of cycles, and these two slopes describe two regimes. below about 10,000 cycles low-cycle fatigue occurs, while (very) high-cycle fatigue should be found at one million cycles or more [19]. in between there is a transition region.

on several occasions in his paper on the effects of thermal stress on the cyclic durability of metals, manson points out that concentration of stress is of great importance [16]. whereas in brittle materials such concentration of stress determines the failure, this is not so in ductile materials, unless the stress is applied cyclically and plastic strain accumulates. this has eventually led to considering the plastic strain energy as the leading influence factor for fatigue [18]. later, more sophisticated approaches were proposed, as one can read in the review on prediction methods for fatigue damage [20]. with respect to wearable electronics it was recognized that even at small elongations deformation can pile up and lead to failure.
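the two regimes of the combined strain-life relation can be made concrete with a short sketch. it evaluates the elastic (basquin) and plastic (manson-coffin) terms separately and computes the number of reversals at which they are equal, which follows from setting the two terms equal to each other. all parameter values below are hypothetical and chosen only for illustration; they are not fitted to any of the test data in this paper.

```python
def strain_amplitude(reversals, sigma_f, eps_f, b, c, modulus):
    """Return (elastic, plastic, total) strain amplitude at 2*N_f reversals.

    elastic = (sigma_f / modulus) * reversals**b   (Basquin term)
    plastic = eps_f * reversals**c                 (Manson-Coffin term)
    """
    elastic = (sigma_f / modulus) * reversals ** b
    plastic = eps_f * reversals ** c
    return elastic, plastic, elastic + plastic


def transition_reversals(sigma_f, eps_f, b, c, modulus):
    """Reversals 2*N_t at which the elastic and plastic terms are equal.

    Setting (sigma_f/E)*(2N)**b = eps_f*(2N)**c gives
    2N_t = (eps_f * E / sigma_f) ** (1 / (b - c)).
    """
    return (eps_f * modulus / sigma_f) ** (1.0 / (b - c))


# Hypothetical parameters (assumptions for illustration only):
# sigma_f and modulus in MPa, eps_f dimensionless.
params = dict(sigma_f=500.0, eps_f=0.3, b=-0.1, c=-0.6, modulus=110e3)

two_nt = transition_reversals(**params)
elastic, plastic, total = strain_amplitude(two_nt, **params)

print(f"transition at about {two_nt:.0f} reversals")
print(f"elastic = {elastic:.5f}, plastic = {plastic:.5f}, total = {total:.5f}")
```

below the transition the plastic term dominates and the curve follows the steeper exponent c; above it the elastic term dominates and the curve follows the shallower exponent b.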
by means of finite element modeling the locations of maximum stress and strain in a stretched meander structure were identified. these are the apex and the inflection point in the meander [9]. we will now use the above to analyze the lifetime data on stretchable samples in section 3.2 and similar for the electronic-textile samples in section 3.3. 3.2 lifetime modeling for stretchable technology turning to the actual data, figure 17 represents the engineering strain range versus the cycle life of the stretchable test structures. two power law curves such as equation (2) have been fitted through the data. their exponents are for the scb-data c = -0.37 and for the smi-data c = -0.32. for smi the curve was fitted through the data with the exclusion of the points at 4% and 2.6% strain. two remarks must be made to this analysis. in the first place, the data point for the smi-samples at an elongation of 2.6% does not represent a failure (see table 1) and an actual failure would thus occur after more cycles. but already there is an indication that the data point deviates from the fitted power law. this could mean that stretching to 2.6%, and perhaps already at 4%, involves mainly elastic deformation and the high-cycle regime applies. the second remark concerns the values of the exponents of the power law function. although the values are in line with other reported values of the fatigue exponent for cu-structures, and in particular for such of comparable thickness where values of -0.14 to -0.46 could be evaluated [21], one must be cautious as to the precise value of the exponents in view of the above mentioned effect of stress concentration. below it will be demonstrated what the consequences are for the analyses of the test data. in particular the work at imec [9] referenced in the previous subsection made clear that the highest deformation concentrates at the apex and the inflection points of the meandering cu-trace. 
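the power-law fits to the engineering-strain data can be retraced with a few lines of code. the sketch below is a plain least-squares fit in log-log space applied to the values of table 1; it is an assumption that this matches the fitting procedure originally used, and the helper name `fit_power_law` is hypothetical. the smi fit leaves out the points at 4% and 2.6% strain, as described in the text.

```python
import math


def fit_power_law(cycles, strain):
    """Least-squares fit of strain = a * cycles**c in log-log space.

    Returns the prefactor a and the exponent c.
    """
    xs = [math.log(n) for n in cycles]
    ys = [math.log(s) for s in strain]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    c = sxy / sxx
    a = math.exp(mean_y - c * mean_x)
    return a, c


# Table 1: engineering strain (%) and cycles to failure (Weibull scale parameter).
scb_strain = [20, 15, 10, 7.5, 5, 2.5]
scb_cycles = [150, 300, 700, 920, 3200, 45000]

# SMI data without the 4% and 2.6% points.
smi_strain = [20, 15, 10, 10, 7.5, 5]
smi_cycles = [200, 500, 2100, 2500, 3300, 16000]

a_scb, c_scb = fit_power_law(scb_cycles, scb_strain)
a_smi, c_smi = fit_power_law(smi_cycles, smi_strain)

print(f"scb exponent: {c_scb:.2f}")
print(f"smi exponent: {c_smi:.2f}")
```

with this simple procedure the exponents should come out close to the values quoted in the text for the scb- and smi-data; small differences would point to a different weighting or fitting method in the original analysis.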
in our experiments the apex is indeed the position where failure occurs (see e.g. figure 12), but no failure was observed at the inflection point. by finite element modeling the plastic strain was computed as function of the engineering strain for two pdms-materials [9]. a correct model description of fatigue failure must take the actual deformation into account: not the elongation used in figure 17, but the plastic strain. the data reported in [9] for the substrate material sylgard 186 were used to transform the elongation that was applied in the test (see table 1) to plastic strain. thus, the result of figure 18 is obtained, which we show together with the initial analysis. the power law of equation (2) was fitted to the data of 5% engineering strain or higher, which gives an exponent of c = -0.60. this value is nearly the same as the exponent of c = -0.59 that the imec-paper reports.

fig. 17 engineering strain versus cycles to failure for scb- and smi-technology [5]. for the smi-series at 2.6% the sample had not yet failed. the dashed lines are power-law fits. [axes: strain (%) versus cycles to failure]

fig. 18 strain versus cycles to failure of smi-test samples (same data as in figure 17). engineering strain and plastic strain according to [9]. the lines are explained in the text. [axes: strain (%) versus cycles to failure]

as was explained in the introductory part of this chapter, in the case where plastic strain dominates the power law usually has an exponent in the range of -0.5 to -0.7. the above analysis with an exponent of -0.60 is thus in good agreement. moreover, the failure analysis also shows that failure is due to plastic deformation; see for instance the photos in figure 11 and figure 12. the two data points taken at 4% and 2.6% engineering strain were not included in this analysis. from 4% and below the slope of the power-law curve is less steep: c = -0.19 (see dotted line in figure 18), which is somewhat larger in magnitude than the range mentioned above for high-cycle fatigue.
however, since at 2.6% no failure had occurred yet, the exponent of the true power law will be still smaller in magnitude than 0.19. this is a clear indication for the transition from low to high-cycle fatigue. the same procedure cannot be done for the scb-samples since no such assessment of the plastic strain has been done yet.

3.3 lifetime modeling for textile technology

the textile samples were not stretched but bent, and thus a different analytical approach is required. in the test, the samples were held straight by springs and in the center they were vertically displaced over a certain distance (z), see figure 6. it is thus now necessary to derive a relation between this displacement and the bending strain. the maximum bending strain in a beam of thickness $d$ over a radius $R$ is:

$$\varepsilon_b = \frac{d}{2R} \tag{5}$$

the system of the textile, the conducting yarn, and the spring can be regarded as a beam which is kept fixed at one end (a), see figure 19. point a is the edge of the led-globtop combination. for bending of a beam of length $L$ the displacement caused by the force ($F$) at any location ($x$) is given by:

$$z = \frac{F x^{2}}{6EI}\,(3L - x) \tag{6}$$

$E$ is the elastic modulus and $I$ the moment of inertia of the beam. for the strain at the constrained end of the beam (a) the radius of the curve of the beam at that point must be determined. the curvature, i.e. the inverse of the bending radius, is:

$$\kappa(x) = \frac{z''}{\left(1 + z'^{2}\right)^{3/2}} \tag{7}$$

where $z'$ and $z''$ are the first and the second derivatives of the displacement ($z$, equation (6)). working this out, we have at point a (for $x = 0$, the edge of the rigid part, where $z' = 0$) that the curvature is proportional to the force, which is the only variable:

$$\kappa(0) = \frac{FL}{EI} \propto F \tag{8}$$

fig. 19 bending of a beam of length l by a force f. the beam is fixed at point a (x = 0).

fig.
20 experimentally measured force-displacement of textile (open symbols). calculated force from spring constant (drawn line). [axes: force, f (n) versus displacement, z (mm)]

the next step is to evaluate the force as function of the imposed displacement. this was done in two ways. first, the force in the vertical direction and the resulting displacement of the test sample were measured (see figure 6). (note that in the actual cyclic test the sample is moved up and down by a clamp which is attached to a piston with air pressure. for this particular purpose the clamp was removed in order to only move the textile with the springs to keep it straight.) figure 20 gives the results of force versus the displacement as the open symbols. with respect to the second assessment, we must realize that the force originates from the four springs keeping the textile straight. referring again to the setup of figure 6, the displacement ($z$) and the angle ($\theta$) determine the elongation of the springs. together with the spring constant the force along the direction of the textile is obtained, which is also shown in figure 20 as the drawn line. the full expression for the force of the four springs ($F_{springs}$) and the curve-fitted relation for the measured force on the test jig ($F_{jig}$, in n for $z$ in mm) are given by the following equations:

$$F_{springs} = 4k\left(\frac{z}{\sin\theta} - L\right), \qquad \theta = \arctan\left(\frac{z}{L}\right) \tag{9a}$$

$$F_{jig} = 0.0033\,z^{2} - 0.0079\,z \tag{9b}$$

the spring constant is $k$ = 0.19 n/mm and the length is $L$ = 100 mm. returning to the test results of table 2, these are now complemented with the force on the textile, using equation (9b). table 3 lists the result, and so, finally, the bending strain expressed in terms of the acting force can be plotted against the cycle life, see figure 21. a power-law fit to the data yields an exponent of c = -0.41.

table 3 displacement (z) and observed cycles to failure for textile (see table 2). the force (f) is calculated with equation (9b).
textile         z (mm)   f (n)    cycles to failure
dtex: 76        86       23.73    25000±7000
picks/cm: 33    77       18.96    49000±8000
                59       11.02    105000±20000
                35       3.77     1500000

fig. 21 bending strain given as force (see table 3) versus cycle life in textile sample. the fitted line is a power-law fit. [axes: bending strain, force (n) versus cycles to failure]

this value can be compared to the exponent that was obtained for the engineering strain of the stretchable substrates, which is in the range of -0.32 to -0.37 (see section 3.2). the difference is small. for the stretchable samples we were able to estimate the plastic strain, which is not yet possible for the bending of the conducting cu-yarns in the woven textile. but even without knowing the plastic strain, we can summarize the analysis of the cyclic bend tests on electronic textile by concluding that the result gives confidence that the qualitative expression for the bending strain is correct.

4. conductor networks

in the second chapter we explained that there are some special situations. both groups of technologies contain a separate case in which the electrical conductivity evolves differently under cyclic mechanical loading than in the other cases, which were treated in chapter 3. the spb-technology, based on non-woven materials with screen-printed conductors, differs from the scb- and smi-technologies. as was shown in the original paper [5], the resistance of the meandering tracks does not stay constant and then suddenly increase upon failure, as happens in the two other technologies. rather, the resistance increases gradually, and at some point it becomes irregular. a typical example of a failed sample is shown in figure 22, which indicates that the screen-printed structure breaks up because of its granular structure.
the resistance was recorded during the cycles and is shown for each cycle as the value at 5% elongation and at rest. figure 23 gives the resistance data, which become irregular in the stretched state after about 200 cycles.

fig. 22 failure mode of spb-sample after test to 5% stretching [5]. cracks are visible in the screen-printed layer.

fig. 23 resistance of spb-sample under cyclic stretching from 0% (dashed line) to 5% (solid line). from [5]. [axes: resistance (Ω) versus cycles]

the second special case is the combination of two methods to apply a conducting yarn to a non-woven textile substrate, as is shown in figure 4 where a stitched and an embroidered yarn are connected by a knot. the thin silver coating of the yarns will be damaged under mechanical loading [8] (see figure 24). initially, the conduction takes place through the coating of each individual filament and, because these are in direct contact in the bundle, there is also inter-filament conduction. when the coating is damaged, less intra-filament conduction is possible. but between the filaments a conducting resistor network may still exist. it is obvious that the inclusion of a knot to connect two yarns further complicates the conductivity, since the interconnection is formed by a clamped contact. this clamp loosens upon cyclic mechanical loading, and in a clamped contact the resistance increases when the clamping force is reduced. typical data are shown in figure 25 for cyclic bending over a vertical displacement of 86 mm. the test was interrupted a few times; this can be seen as the decrease of the resistance at 500, 1000, 2000, and 3000 cycles. after resuming the cyclic loading the value prior to the interruptions is reached again.

fig. 24 failure mode of embroidered yarn on non-woven textile after cyclic bending. damage to the ag-coating of the filaments is visible.

fig.
25 resistance of non-woven sample with knot between stitched and embroidered yarn under cyclic bending. the dashed line is explained in the text.

we thus have two superimposed effects in these measurements. first, there is the background of the gradual increase of the resistance as function of the number of stretching and bending cycles. this is the interpretation of the dashed line in figure 23 for stretchable spb-substrates, which reflects the sample at 0% elongation for each cycle. very tentatively, that gradual increase of the resistance is attributed to a reduction of the elasticity of the substrate material, which functions to keep the conducting ag-particles in tight contact. for the non-woven sample no data were taken at the exact moments that the sample was in the un-stretched state; therefore we have indicated the background with the dashed line in figure 25. the clamping force of the knot-interconnection stems from the stitching and embroidering, and this force weakens. the second effect adds to the background resistance. it appears as an increasing and significant variation of the resistance during mechanical cycling, which is further discussed below. in the two situations, it is very likely that the large variation of the resistance is caused by percolation. percolation theory describes the behavior of clustered elements with a certain physical functionality, such as electrical conduction (see e.g. [22]). one considers a network composed of sites that can each be occupied by one element, with a probability p that the site is occupied. if two adjacent sites are occupied, the transport process takes place between them. conduction through the network is possible when there is at least one coherent cluster of occupied sites between the two ends of the network, i.e. the contacts to the conducting track or yarn. below the critical probability that a site is occupied, the so-called percolation threshold pc, the conduction vanishes.
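the site-percolation picture described above can be illustrated with a small monte carlo sketch. it uses a two-dimensional square lattice with nearest-neighbour connections, which is an assumption made purely for illustration and is not a model of the actual conductor geometry in the spb or non-woven samples; the function names are hypothetical. for this particular lattice the site-percolation threshold is known to lie near p ≈ 0.59.

```python
import random
from collections import deque


def spans(grid):
    """True if occupied sites connect the top row to the bottom row.

    Connectivity is between nearest neighbours (up, down, left, right).
    """
    size = len(grid)
    seen = [[False] * size for _ in range(size)]
    queue = deque()
    for col in range(size):
        if grid[0][col]:
            seen[0][col] = True
            queue.append((0, col))
    # Breadth-first search from all occupied sites in the top row.
    while queue:
        row, col = queue.popleft()
        if row == size - 1:
            return True
        for d_row, d_col in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            r, c = row + d_row, col + d_col
            if 0 <= r < size and 0 <= c < size and grid[r][c] and not seen[r][c]:
                seen[r][c] = True
                queue.append((r, c))
    return False


def spanning_fraction(p, size=30, trials=200, seed=1):
    """Fraction of random lattices with occupation probability p that conduct."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        grid = [[rng.random() < p for _ in range(size)] for _ in range(size)]
        hits += spans(grid)
    return hits / trials


for p in (0.3, 0.5, 0.6, 0.7, 0.9):
    print(f"p = {p:.1f}: spanning fraction = {spanning_fraction(p):.2f}")
```

the fraction of conducting lattices changes from nearly zero to nearly one over a narrow range of p, which is the reason why small changes in the number of contacts have such a large effect near the threshold.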
close to this value the conduction ($\sigma$) is described by [23]:

$$\sigma \propto (p - p_c)^{t} \tag{10}$$

the value of the critical site occupancy $p_c$ and the exponent $t$ depend on various factors, such as the dimensionality of the system. typically, one reports values of $p_c$ = 0.25–0.5 and $t$ = 1.5–2 [22, 23, 24].

the elements that occupy a site in the two types of test structures are the ag-flakes of the screen-printed conductor in the spb-samples, and the conducting yarns with ag-coating in the non-woven textile samples. both form a conductor network. the upshot is that at some point the connected cluster of conducting elements reaches the percolation threshold: the probability of site occupancy approaches $p = p_c$. even small variations in the number and strength of the bonds between the elements will then have a large impact on the resistance, as one sees in figure 23 and figure 25 and as follows from equation (10). in the stitched interconnection in particular, the resistance varies per bend cycle over two orders of magnitude. to summarize, under cyclic mechanical loading, the conductivity in systems made with screen-printed particles or with clamped conducting yarns decays by two mechanisms. on the one hand, a declining clamping force leads to a gradually increasing resistance. on top of this, the electrical contacts between conducting particles and yarns become fewer and fewer, which leads to the formation of a percolation network with unstable conductivity.

5. summary and conclusions

in this contribution no new data were presented. instead, existing results from two projects on wearable electronic applications and technologies have been reviewed and attempts have been made to refine the originally reported analyses.
to this end we used analytical models to arrive at a qualitative description of the bending strain of conducting yarns in electronic textiles. the bending strain is proportional to the force that is used to keep the textile straight in the test. the cycle life obeys a power law with an exponent that compares well with the one found for the cycle life of the stretchable substrates. as to the stretchable substrates, for the smi-technology the engineering strain was transformed to plastic strain values using results from finite element simulations that were carried out elsewhere. as a result the power-law dependence is in better agreement with low-cycle fatigue. these evaluations show where work still needs to be done. in particular, modeling of the plastic strain that accumulates in the cyclic bend test for electronic textiles is such a topic. finally, an additional failure mechanism has been observed in clamped contacts. in some subsets of the stretchable and textile technologies, the contact force between conducting particles or between conducting yarns gradually diminishes, which might be attributed to fatigue. but eventually a conductor network evolves that behaves as a percolation network. this is also a topic that may receive further attention.

acknowledgement. instead of mentioning each individually, the author would like to thank all colleagues who contributed to the stella and place-it projects and who have supplied him with additional information for this paper. co van veen’s critical reading of the text is greatly appreciated.

references

[1] t. löher, m. seckel, r. vieroth, c. dils, c. kallmayer, a. ostmann, r. aschenbrenner, h. reichl. stretchable electronic systems: realization and applications. proc. 11th eptc, 2009, pp 893–898.
[2] b. schmied, j. guenther, c. klatt, h. kober, e. raemaekers. stella – stretchable electronics for large area applications. adv. sci. technol. 2008, vol. 60, pp 67–73.
[3] f. bossuyt, t. vervust, f. axisa, j.
vanfleteren. improved stretchable electronics technology for large area applications. proc. mrs spring meeting symposium, 2010, pp 1271–1277.
[4] k.h. cherenack, l. van pieterson. smart textiles: challenges and opportunities. j. appl. phys. 2012, vol. 112, no 9, 091301.
[5] f. bossuyt, j. guenther, t. löher, m. seckel, t. sterken, h. de vries. cyclic endurance reliability of stretchable electronic substrates. microelectronics reliability, 2011, vol. 51, pp 628–635.
[6] m. jablonski, r. lucchini, f. bossuyt, t. vervust, j. vanfleteren, h. de vries, p. vena, m. gonzalez. impact of geometry on stretchable meandered interconnect uniaxial tensile extension fatigue reliability. microelectronics reliability, 2014, accepted for publication.
[7] h. de vries, k.h. cherenack. endurance behavior of conductive yarns. microelectronics reliability, 2014, vol. 54, pp 327–330.
[8] m. de kok, h. de vries, k. pacheco, g. van heck. reliability of conducting yarns in electronic-textile applications. textile research journal, 2014, submitted.
[9] m. gonzalez, f. axisa, m. vande broucke, d. brosteaux, b. vandevelde, j. vanfleteren. design of metal interconnects for stretchable electronic circuits. microelectronics reliability, 2008, vol. 48, pp 825–832. m. gonzalez, f. axisa, f. bossuyt, y.y. hsu, b. vandevelde, j. vanfleteren. design and performance of metal conductors for stretchable electronic circuits. proc. 2nd estc, 2008, pp 371–377.
[10] d.h. kim, j.a. rogers. stretchable electronics: materials strategies and devices. adv. mater. 2008, vol. 20, pp 4887–4892.
[11] f. bossuyt, t. vervust, j. vanfleteren. stretchable electronics technology for large area applications: fabrication and mechanical characterization. ieee trans. comp. pack. manuf. technol. 2013, vol. 3, no. 2, pp 229–235.
[12] m. jablonski, f. bossuyt, j. vanfleteren, t. vervust, h. de vries. reliability of a stretchable interconnect utilizing terminated, in-plane meandered copper conductor.
microelectronics reliability, 2013, vol. 53, pp 956–963. [13] m. jablonski, r. lucchini, f. bossuyt, t. vervust, j. vanfleteren, h. de vries, p. vena, m. gonzalez. impact of geometry on stretchable meandered interconnect uniaxial tensile extension fatigue reliability. microelectronics reliability, 2014, submitted. [14] y.y. hsu, b. dimcic, m. gonzalez, f. bossuyt, j. vanfleteren, i. de wolf. reliability assessment of stretchable interconnects. proc. ieee impact, 2010, pp 1–4; polyimide-enhanced stretchable interconnects: design, fabrication, and characterization. ieee trans. electron dev. 2011, vol. 58, no. 8, pp 2680–2688. [15] h. de vries, k.h. cherenack. failure modes in textile interconnect lines. ieee electron dev. lett. 2012, vol. 33, no. 10. pp 1450–1452. [16] s.s. manson. behavior of materials under conditions of thermal stress. naca tn 2933, 1953. l.f. coffin jr. a study of the effects of cyclic thermal stresses on a ductile metal. trans. asme 1954, vol. 76, pp 931950. [17] o.h. basquin. the exponential law of endurance tests. proc. astm 1910, vol. 10, pp 625–630. [18] j.d. morrow. cyclic plastic strain energy and fatigue of metals. astm stp 1965, vol. 378, pp 45–87. [19] s.s. manson. thermal stress and low-cycle fatigue. new york: mcgraw-hill, 1966. [20] a. fatemi, l. yang. cumulative fatigue damage and life prediction theories: a survey of the state of the art for homogeneous materials. int. j. fatigue 1998, vol. 20, nr. 1, pp 9–34. [21] d. farley, y. zhou, f. askari, m. al-bassyiouni, a. dasgupta, j.f.j. caers, h. de vries. copper trace fatigue models for mechanical cycling, vibration and shock/drop of high-density pwas. microelectronics reliability 2010, vol. 50, pp 937–947 . [22] d. stauffer. introduction to percolation theory. taylor & francis,1985. [23] j.p. clerc, g. giraud, j.m. laugier, j.m. luck. the electrical conductivity of binary disordered systems, percolation clusters, fractals and related models. adv. phys. 1990, vol. 39, nr. 
3, pp 191–309. [24] m. cattani, m.c. salvadori, f.s. teixeira. insulator-conductor transition: a brief theoretical review. 2009, arxiv:0903.3587. facta universitatis series: electronics and energetics vol. 33, n o 3, september 2020, pp. 445-458 https://doi.org/10.2298/fuee2003445p © 2020 by university of niš, serbia | creative commons license: cc by-nc-nd various diffraction effects and their importance for detection of inhomogeneites in human tissues * nikola petrović, per olov risman mälardalen university (mdh), academy of innovation, design and engineering, västerås, sweden abstract. hitherto described microwave modalities for detection of internal inhomogeneities in human tissues such as breasts and heads are by image reconstruction, requiring time-consuming computational resources. the method developed at mdh is instead based on the use of a magnetic field transducer, creating an essentially circular electrical field. this is in turn diffracted by the dielectric inhomogenity and that signal is received by an e-field sensor in an appropriate position. the transmitting applicator is unique by no need to contact the object under study (ous) and does not generate any surface waves at it. the primary field has properties behaving as coming from a magnetic monopole. the receiving 3d contacting applicator contains a high-permittivity ceramic and is resonant in order to provide the desired field polarisation sensitivity. the desired system properties are achieved by optimized use of the orthogonality properties of the primary magnetic, induced electric, and diffracted electric fields. key words: diffraction, magnetic field, applicator, internal inhomogeneity. 1. introduction detection of internal inhomogeneities by microwave techniques has been an area of research for several decades [2-4]. the breakthrough was with human breasts submerged in a so-called bolus liquid in the mid 1990‟s [5]. microwave imaging was then – as now – the goal, i.e. 
to provide images of the variations of the microwave permittivity of the various tissues and inhomogeneities in the object under study (ous).

the choice of operating frequency is an important consideration. there are two primary factors: a) the need for dielectric contrast and b) the need for sufficient spatial resolution. both are dealt with in sections 3.1 and 6.2 of this paper. in general, microwave frequencies between 0.8 ghz and 3 ghz are optimal, depending on:

1. the size of the inhomogeneity to be discovered: a lower frequency will of course result in a poorer spatial resolution.
2. the attenuation: a lower frequency will result in a smaller attenuation.
3. the microwave propagation path, which causes power losses in the tissues between the surface of the ous where the transmitting structure is located, via the internal inhomogeneity to be discovered, and along the path to the receiving structure. this is often described by the power penetration depth (at which 1/e of the power flux intensity of a plane wave remains). another common expression for this factor is the attenuation budget, which also includes the power losses of the propagation through a bolus.

received january 1, 2020; received in revised form may 1, 2020
corresponding author: per olov risman, mälardalen university (mdh), academy of innovation, design and engineering, box 883, se-72123, västerås, sweden, e-mail: per.olov.risman@mdh.se
* an earlier version of this paper was presented at the 14th international conference on applied electromagnetics (пес 2019), august 26 – august 28, 2019, in niš, serbia [1].

a first example: typically, the blood of a haemorrhage has 30 % higher real permittivity as well as loss factor at the typical 1 ghz frequency of operation. haemorrhages typically have volumes exceeding 20 ml (see image ii in fig. 8), so detection is feasible if the haemorrhage is not in a deep position.
the surrounding brain matter has rather high permittivities and that of blood is still higher, and the head dimensions are comparatively large, so this low frequency is optimal. however and importantly, the head cannot be submerged in a bolus tank, as can e.g. the female breast.

a second example: much of the female breast consists of fat tissue, which has up to ten times lower complex permittivity than brain tissue. this results in a larger penetration depth and a good dielectric contrast to tumorous tissue. however, tumours are small, so a higher microwave frequency is required for detecting them. 2.5 ghz may then be a good choice.

a third example: this relates to the propagation path. if the ous is in a bolus tank, multiple small antennas are used and each creates a non-directive propagation which causes diffraction phenomena over most of the ous. signals are then received by several antennas, excluding those nearest to the transmitting one (too strong a direct signal, burying the diffraction signals) and also the opposite ones (too high an attenuation budget, i.e. the system becomes more expensive and prone to external interference). with our system principle (see fig. 5), the induced electric field “directivity” has a conical shape with no field at the axis. this reduces the diffracted fields emanating from the opposite side to the receiving structure location, while diffraction from inhomogeneities up to about 90° sideways is promoted and measured on the ous surface.

another factor: it is concluded from the above that transmission/detection across, and preferably about or slightly more than 90° sideways, is needed with microwave systems. signals emanating from the central regions of the torso are not measurable in practice. protruding body parts such as the head (albeit with reservations), female breasts, legs, and arms are feasible.
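the power penetration depth referred to in the examples above can be estimated with a short calculation. the following is a minimal sketch, assuming a plane wave in a homogeneous, non-magnetic medium with relative permittivity ε = ε′ − jε″, which is an idealisation of real tissue. the function names are hypothetical and chosen for illustration only. the sample values (ε′ = 45 and σ = 0.8 s/m at 1 ghz, and ε = 52−j20 at 2.45 ghz) are taken from values quoted elsewhere in this paper; the conversion ε″ = σ/(ωε0) is the standard one.

```python
import cmath
import math

C0 = 299_792_458.0        # speed of light in vacuum, m/s
EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m


def loss_factor_from_conductivity(sigma, freq_hz):
    """Equivalent loss factor eps'' = sigma / (omega * eps0)."""
    return sigma / (2.0 * math.pi * freq_hz * EPS0)


def power_penetration_depth(eps_real, eps_imag, freq_hz):
    """Depth in metres at which 1/e of the plane-wave power flux remains.

    Assumes a homogeneous, non-magnetic medium with relative
    permittivity eps = eps_real - j*eps_imag (illustrative model only).
    """
    k0 = 2.0 * math.pi * freq_hz / C0                # free-space wavenumber, rad/m
    n = cmath.sqrt(complex(eps_real, -eps_imag))     # complex refractive index
    alpha = k0 * abs(n.imag)                         # field attenuation, Np/m
    if alpha == 0.0:
        return math.inf                              # lossless medium
    return 1.0 / (2.0 * alpha)                       # power decays as exp(-2*alpha*z)


if __name__ == "__main__":
    # brain-like continuum at 1 GHz (eps' = 45, sigma = 0.8 S/m)
    eps_imag = loss_factor_from_conductivity(0.8, 1.0e9)
    d_brain = power_penetration_depth(45.0, eps_imag, 1.0e9)
    print(f"brain-like medium, 1 GHz: {d_brain * 1e3:.1f} mm")

    # lossy load with eps = 52 - j20 at 2.45 GHz
    d_load = power_penetration_depth(52.0, 20.0, 2.45e9)
    print(f"eps = 52 - j20, 2.45 GHz: {d_load * 1e3:.1f} mm")
```

for a fixed permittivity the depth scales inversely with frequency, which is the reason a lower operating frequency gives a smaller attenuation. real tissue permittivities are themselves frequency dependent, so the sketch only indicates the trend.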
the ous surface waves have been and are still “neutralised” by immersion of the ous in a mixture of liquids (typically water-glycerol) having a microwave permittivity similar to that of the ous. that method is, however, not possible with heads.

the submersion of the ous in a liquid tank with similar permittivity properties to the tissues provides a simple solution to several antenna design issues. since the antennas are submerged in the bolus liquid and at a distance from both the tank walls and the ous, the nearfields are in the homogeneous liquid. simple propagating fields can then be assumed at the ous, and reflections at the tank walls become insignificant. in addition, the large bolus volume contacting the ous, with both having similar permittivity, solves a very significant problem: that of surface waves otherwise created along the surface of the ous. however, submerging a human head in a bolus is practically impossible, so other means of eliminating the surface waves become necessary.

the hitherto most successful efforts are by the em-tensor company in austria [6, 7], using a thin bolus layer between the head and a metallic hood. a large number of fixed ceramic antenna applicators in the hood is used, with a large number of sets of transmission paths. the tomographic computations are then made in two basic steps: a first to compute the ous surface dimensions, and a second to compute the structure of the internal inhomogeneities in the ous.

all existing systems such as that by em-tensor, as well as ours and any future systems, employ diffraction effects. somewhat astonishingly, only our group at mdh has studied and analysed the actual diffraction phenomena, and we are also using our system for direct detection of these, as opposed to using multiple antennas [2] submerged with the ous in a large tank.
in the following, we first describe the diffraction phenomena in and at a lossy dielectric wedge and lossy spheres. for the latter we also discuss the polarisation and directions of the external diffracted fields, which are actually much of the basis of our work on direct detection of inhomogeneities in the ous. our system with its main component, the special quasistatic b-field transmitting applicator, is then described. numerical fdtd modeling of a simple phantom head is then addressed, using a new field subtraction method showing the diffracted field patterns. finally, some experiments are described, and conclusions on the limitations of discerning haemorrhages are presented.

2. what diffraction effects of possible interest exist?

2.1. general

several kinds of diffraction phenomena have been studied for many years. some are a nuisance in e.g. microwave oven food heating, and some phenomena have only extremely complicated analytical solutions. the physical “explanation” of some phenomena is actually not possible by analytical calculations, so the geometric theory of diffraction may be a way out. this is based on huygens' principle but requires experimental determination of a so-called canonical constant. of interest in our context is, however, that some diffraction effects cause field polarisations that are very useful for detection of internal inhomogeneities.

2.2. the edge overheating and centre focussing effects

the edge overheating effect (in e.g. microwave oven food loads) is a significant issue and is polarisation dependent. it is illustrated in fig. 1 a) and b), obtained with numerical modeling using the quickwave fdtd software [8]. they show the power distributions in a 90° dielectric wedge illuminated from below by a 2.45 ghz tem wave. the impinging e field in fig. 1 a) is in the direction of the wedge tip (in-out in the paper); it is left-right in fig. 1 b).
the strong heating in the wedge regions is up to four times stronger than what would occur with vectorial addition of fields coming from both directions to the surfaces.

fig. 1 illustrations of some diffraction phenomena

the strength of the phenomenon can be deduced by a first order theory based on the geometric theory of diffraction, which has also been confirmed by numerical modeling and experiments [9]. it turns out that the statement in many textbooks that the effect results from simple addition of wave energy irradiating the edge from two directions (from the side and from above, instead of one direction) is in error. as an example, for ε = 52−j20 and wedge angle 90°, the maximum power density in the tip is typically 4 times higher than some distance away from the tip, and 6 times higher for wedge angle 60°, for a single plane wave irradiation. for ε′ about 15 the factor is about 2 for wedge angles between 30° and 60°. the effect has almost disappeared for ε′ about 4. the size of the overheated zone also varies with ε′; the length becomes 12 mm for a 60° wedge. the first order equation for the electric field maximum e_ε inside the edge tip is: √ ( ) ( √| |) (1) where e_i is the incident field and the wedge angle is 2α, with all angles in degrees. the first term is a wave potential term which also gives the field just inside a large flat load, and the second term is a diffraction source term. the factor 0.50 is a so-called canonical constant that cannot be derived theoretically and was determined by numerical modeling [9]. it should be noted that the direction of incidence has a very small and only second order influence on the effect, as long as it is within the 'free' (180−2α) angle. impinging waves which do not hit the edge tip first will also create surface waves. the wave potential term will then change, and with that the edge overheating effect. fig.
1 c) and d) show the phenomenon of a remarkable “focussing” effect of power density to the centre region of a round object having a circumference of about one free space wavelength. fig. 1 c) is an illustration by fdtd modeling with the quickwave software [8] of the momentary overall average e field in and at a 40 mm diameter sphere with ε = 40−j16 illuminated by a 2.45 ghz plane wave with its e field with relative amplitude 10 (green). magenta = maximum = +23 units, dark blue = 0. the pattern is an e-field-dominated hybrid mode of the second kind: eh202 (the te and tm type modes are non-separably coupled). the external mode is tmy01 with the same spherical mode designation (by the bromwich method [10]) and has similarities with the lowest circularly cylindrical tmz00 resonant mode bound to a long thin z-directed dielectric exposed to an e field parallel to the axis. fig. 1 d) shows the power density pattern in the same load as in fig. 1 c), in the same plane. the radius of the volume in which the power is larger than half the maximal is about 3.2 mm, corresponding to about 4 % of the overall volume. the average power density in most of the load volume is about 7 % of that at the centre.

fig. 2 left shows the standard radar cross section analytical solution for a metal sphere. it is seen that the main external surface resonance effect shown in fig. 1 c) and d) is confirmed by the first maximum. fig. 2 right shows the relative absorbed power flux density (1/r², y axis) in free spheres under plane wave irradiation, as function of the radii in mm (x axis) and with the complex permittivity ε = ε′−jε″ as parameter, at 2.45 ghz, by analytical calculations. note the maxima at 20 mm radius for both ε = 16−j8 and 52−j20. there are also weaker maxima at 40 mm radius.

fig.
2 illustrations of mie scattering of spherical objects in free space

the phenomena shown in figs 1 and 2 are manifested for example by a chicken egg in its basic shape, raw or boiled, deshelled or not, shattering violently in any microwave oven within about half a minute. this is thus not by any quasi-optical focussing effect, but instead by the formation of an external surface wave resonance, basically being maximal for the object circumference being about one free space wavelength [11]. of course, related phenomena occur also for cylindrical objects. it is to be noted that it is possible to employ the radar cross section differences for discerning free objects of different sizes by operating a multiantenna detection system, using also the polarisation of the diffracted waves, then in the region circumference/wavelength from about 0.6 to 1.8.

3. diffraction phenomena vs rounded object size and permittivity

3.1. absorbed and diffracted power

a good beginning is to study the analytically solvable scenario of a dielectric lossy sphere under plane wave irradiation. fig. 3 a) shows the case of spheres at 2.45 ghz (the ism band for e.g. microwave ovens), with the radius as variable and complex permittivity as parameter. one can see the te101 resonance at about 8 mm radius (upper arrow), the tm101 at 12 mm, and the particular overall eh202 and external tmy01 resonance at 20 mm. the first index m for te_mnp modes is for h field maxima along the z axis (θ = 90°), n is for h field variations along the equator and p for the e field in the radial direction. importantly, the resonant te101 mode exists down to very small diameters, down to where the constant quasistatic field takes over. the mode is that of a magnetic dipole, as shown by the h field lines in fig. 3 b) right [12] at free resonance for ε = 14. the tm101 mode is that of an electric dipole and its e field lines are shown in fig. 3 b) left at free resonance for ε = 14. fig.
3 c) shows the analytically calculated absorbed (red) and diffracted (blue) relative power at 2.45 ghz under plane wave irradiation of spheres with ε = 52−j20, with the radius as variable. fig. 3 d) shows the diffracted power flux density into a lossless medium with ε = 40, from a spherical object with ε = 70−j10, with its radius as variable, at 2.45 ghz. the red curve is for the te101 and tm101 fields only, and the magenta curve for all modes up to index 5, i.e. te_m05 and tm_m05. it is seen that the m = 1 term dominates up to 20 mm radius, and that the diffraction is very weak up to about 5 mm radius, after which it rises linearly with the radius. this is consistent with rayleigh's law: the same small object diffraction proportional to λ0^−4, but it can be shown that the reduction is approximately by r^−6 at constant frequency in the steepest part of the curve. however, a comparison with the absorbed power flux density shows that the diffracted power flux is at least twice as strong as the absorbed.

fig. 3 diffraction phenomena at smaller objects

to sum up:
1. a high quotient between the ε′ values gives better diffraction;
2. a significant quotient between only ε″ gives a weak diffraction;
3. the diffracting object needs to be at least 10 mm in diameter at 2.45 ghz for providing a “good” diffraction in a typical human object such as a head or breast containing some inhomogeneities.
the sensitivity increases significantly with a fully homogeneous ous model; see the section on investigations of a test scenario.

3.2. diffracted field patterns

fig. 4 shows the numerically modelled maximal momentary ey field (left) and hx field (right) for a 16 mm diameter sphere with ε = 40−j16 in free space, at 2.45 ghz. the irradiation is by ez and hy, so the fields shown in the images are both perpendicular to those of the irradiating wave. the scaling is equalised (i.e.
with a normalisation providing equal electric and magnetic field energies) in the zy plane shown in the figure. it is seen that there is a sideways propagation, i.e. perpendicularly to that of the impinging wave. by studies of these and other fields and their variations along the axes one can deduce that the mode is te101; see fig. 3 b) right. it is characterised by diffracted fields in mainly the zy plane.

fig. 4 momentary fields at a 16 mm diameter sphere at 2.45 ghz

4. our system – overview and function

4.1. general

our most recent system at mdh [13, 14] is not based on a need for complex and time-consuming mathematical algorithms and complicated microwave circuitry. we instead directly employ certain internal diffraction phenomena, using a movable set consisting of a special transmission applicator surrounded by receiving devices; see the outline in fig. 5. the transmission applicator has the unique property of acting as a “magnetic monopole” with the electric field being created within the ous and no such components on its surface.

an outline of the fields generated by the transmitting applicator and the diffracting object is shown in fig. 5. the h field (blue lines) induces a “circular” e field (in-out in the illustration; red circles), both hitting the diffracting “inhomogeneity object” (green) essentially as a tem wave. mainly the “ends” or periphery of this produce fields as shown in the figure with the resulting e fields shown as red arrows. this, plus the diffracted field directions as shown in fig. 4, results in a minimum diffraction field straight outward from the central projection of a larger inhomogeneity object than that shown in fig. 5, but then instead diffracted fields from the edges of the object. with the geometry of the scenario there will thus be an e field component perpendicular to the direction of the primary e field, i.e.
radially as seen from the ous surface. this is important, since both the stronger primary field and any other surface-parallel e field are enhanced (i.e. reflected back into the ous) by the magnetic wall effect. however, in the case of the non-contacting receiving applicator, the surface-perpendicular field will be stronger in the airspace just outside the ous, by the d field vector continuity.

fig. 5 system overview

the procedures for collecting and using the data are tentatively as follows. with a fixed position of the transmitting device, the receiving device (or two diametric such devices) is moved around the skull surface, in a circumferential region around the axis of the transmitting device and at an azimuthal angle between about 30° and 70° (see fig. 5), which will be the best for receiving the diffracted signals. these are then recorded together with the geometric positions of the devices, in for example 30 locations of the receiving device. the mechanical set of devices is then moved to another location, and the same procedure is carried out, in e.g. four locations of the transmitting device. the stored geometric and signal data are then processed by relatively simple software, providing an almost immediate “map” of the signal strengths over the skull surface region. this will provide a correlation with the (continuous) haemorrhage periphery as “seen” from the outside. more advanced algorithms may also be developed, for providing haemorrhage depth indications. of course, the same principles are applicable also for breasts and other protruding body parts.

4.2. the transmitting applicator

this is shown in fig. 6 left. there is a pair of current feeds in each gap in the inner ring; these are coupled for creation of a circulating current in it, at about 1 ghz with an overall applicator diameter of about 80 mm as in the picture. the double gaps are for optimising the purity of the created h field, as are the outer rings. fig.
6 mid shows the numerically modelled axial h field amplitude in the axial plane, in db scale, with a 6 mm thick ous outer layer with ε′ = 20 and σ = 0.4 s/m at 2 mm distance, and an inner continuum with ε′ = 45 and σ = 0.8 s/m, i.e. simulating a human head. there is thus a certain attenuation, but it is seen that the applicator largely functions as a magnetic monopole. the particular properties of the quasistatic field emanating from the applicator result in virtually no surface waves at all being created in the ous surface region. this allows the device to be located some millimetres away from the ous: no bolus liquid is needed and the applicator can easily be moved over the ous surface. the diameter of 80 mm at 1 ghz is of course a disadvantage, but only one device is needed and the resulting propagating waves in the ous are favourable: basically radially outwards from the device axis. the optimum positions of the receiving devices are not close to the transmitting device sideways at the ous, as shown in fig. 5. the disadvantage that there will be no or a very weak induced e field at the transmitter axis is of course compensated in practice by moving the whole set, as addressed earlier.

4.3. the receiving device

this is still under development, and a working design is shown in fig. 6 right. it is resonant at about 1 ghz. the overall diameter of the high permittivity ceramic (magenta in the figure; ε′ = 73) is 20 mm. the applicator mode is of the circular tm01 type, resulting in an axial e field at the sensing end. the frustum conical part provides a nulling of the reception properties of any radial (up-down in the figure) e field components. this results in a filter function for sensing mainly the radially outwards-going diffracted e field at the ous surface.

fig. 6 the transmitting applicator (left), its axially directed b field (mid), and the receiving applicator (right).
these properties of the receiving device are quite desirable, but the device is expensive to manufacture, in particular since at least two are needed in a complete practical set-up. there are also issues with disturbances by the connected cables, as seen by the curve fluctuation in fig. 8 iii and 8 iv, so a design with a built-in amplifier, a/d converter, a small battery and bluetooth transmitter is now under consideration. it is then to be noted that only the signal amplitude and not the phase is needed, and that the activation needs to be only about 1 % of the total time for an overall measurement session.

it is to be noted that the bolus layer is also under development and can be very thin, but there is a need for contact with the ous, so a thin flexible rubber film may be used. a particular issue is then also the microwave properties of the contacting skin layer, which are currently under investigation by particular in vivo measurements. without contact, there will be a sensitivity to any residual surface waves, which have an axial e field component. the resonance properties virtually eliminate the influences of extraneous propagating fields in the surroundings, and thereby simplify the electronics.

a particular advantage of the principle is that no phase measurements are needed. this virtually eliminates calibration procedures, which are typically both complicated and time-consuming with systems such as in [6, 15]. components and connections become much less expensive.

5. investigation of a test scenario and experimental work

5.1. the test scenario and numerical modeling

fig. 7 a and b show the head model with a 20 cm³ cylindrical water inhomogeneity, as seen in the images. frequency 1 ghz, about 12 million voxels, 0.13 mm smallest distance between z planes, main voxel setting with 0.6 mm sides. fig.
7 a and b illustrate the modeling scenario with the transmitting applicator on top of the simulated head and an asymmetrically located water object (magenta) with diameter 25 mm and height 40 mm. the head maximum diameter is 220 mm and there is a 6 mm thick bone and skull layer (brown) outside the simulated homogeneous brain, which is represented by a ternary liquid mixture in the experiments. a numerical modeling result is shown in fig. 7 c and d, with the use of a new quickwave [8] subtraction module allowing field images (subtracted average/amplitude or momentary at the same timestep), in this case with and without the water object. fig. 7 c shows the difference in the axially directed e field amplitude in a vertical plane 10 mm in front of the central plane, in linear scale, i.e. magenta = positive max; green = 0, darkest blue = negative max. fig. 7 d shows the same but now in the central plane and with the span between darkest blue and magenta being 40 db. fig. 7 e is the difference in the left-right-directed e field amplitude in a vertical plane near the centre of the object in fig. 7 a and b, between scenarios with and without the water object (linear scaling). fig. 7 f has the same scaling as fig. 7 e, but now in a horizontal plane through the centre of the water object.

fig. 7 the test scenario and numerical analysis

fig. 7 c and d show that the field is stronger to the left of the water object and weaker on the other side. it is to be noted that the scenario is a replication of an actual experimental set-up; see fig. 8 a. the signal difference as “measured” as the numerical result in the position shown by the arrow in fig. 7 d was +4 db with/without the water object present.

fig.
8 experimental setups and results; skull/brain investigation

it is seen that the subtracted field is stronger above and below the water object, and weaker straight outwards. the images confirm the theory on the main diffracted fields emanating from the “ends” of the enclosed object.

the results in this section clearly show the usefulness of our field subtraction facility. even if its usability is excellent for our diffraction studies described here, it can have many other uses, such as qualitative and quantitative detection of faults or deviation sensitivities in various objects and systems such as microwave filters and similar components. with this system and somewhat larger inhomogeneities we have by fdtd modeling obtained a difference of up to +15 db in a homogeneous material simulating brain permittivity, at a level of about 50 db signal overall attenuation.

5.2. experimental work

fig. 8 i is a picture of the experimental set-up with the experimental skull. it was specially made and contains strontium titanate for increasing the permittivity: ε′ ≈ 15 and σ ≈ 0.3 [16]. the transmitting applicator is placed under it and the receiving applicator at the side. a thin-wall glass vial with water in gelled form was submerged in the water and glycerol mixture with properties simulating the average permittivity of white and grey matter [17] and moved around for obtaining the best signal change. this was about 3.5 db and corresponds rather well to the position shown in figs 7 a and b. the signal differences are shown in fig. 8 iii and iv and confirm our modeling results. they show the measured s21 without (iii) and with (iv) the water clot. the sensing frequency of the receiving applicator is marked as mkr 1. the difference with/without the clot is about 3.5 db. it is of importance that there are virtually no surface waves. this applies to both the modeling and the experiments.
however, the transmitting applicator did not have a good balun feeding, since it represents an extremely low impedance. furthermore, the receiving applicator design has imperfections. these issues and extraneous interferences are the reasons for the irregularities in the s21 curves.

6. discussion and conclusions

6.1. general on haemorrhage detection in human heads

fig. 8 ii shows a cross section of a human brain with marked regions of typical haemorrhages. the most interesting ones that are possible to detect with microwaves are marked a and b in the image. the others are too deep-lying and detection will also be hindered by the major normal inhomogeneities in this central region, as seen in the image.

the quite large natural inhomogeneities in the head are characterised by such different dielectric composition that they will inevitably cause diffraction effects. the question is then to what extent these inherent diffraction fields may overshadow those caused by a haemorrhage. it seems likely that the volume of the haemorrhage must be comparable to or larger than those of the surrounding natural inhomogeneities of the brain, for sufficiently reliable haemorrhage detection to be achievable by direct detection methods such as ours.

reliable microwave detection of deep haemorrhages is in our opinion possible only with acquisition of s_mn phase and amplitude data using more than a hundred multi-switched antennas and advanced and time-consuming computations. true 3d tomographic images are then obtainable, but even then the so-called attenuation budget will become problematic and require almost extreme dynamic ranges of the amplifiers, etc.

6.2. studies of the austin man head

all kinds of inhomogeneities in the skull will cause diffraction phenomena, as well as reflection and absorption. in order to quantify this, the head of the austin man [18] was converted for use in the modeling software, with 2 mm cubical voxels and the specified dielectric data of the 30 kinds of substances, at 1 ghz.
the model is shown in fig. 8 v, with a blood clot having dimensions 18 mm × 34 mm and height 30 mm (i.e. a volume of about 18 cm³). the clot was asymmetrically located. the result of subtracting the averages with and without the blood clot, for the x-directed e field in the y plane giving the strongest diffraction phenomena, is shown in fig. 8 vi in decibel scaling. the signal to the receiving applicator was not evaluated, but it is seen that the strongest signal is just to the right of it. the difference is 3 db.

an important conclusion is that the disturbing wave phenomena caused by the many inhomogeneities and their widely different complex permittivities (ε′ from 1 in air pockets to 70, and σ up to 2.5 s/m for the cerebrospinal fluid) require an inhomogeneity to be quite voluminous, or located close to the skull bone, in order to be detected. it seems that deep-lying inhomogeneities (e.g. b in fig. 8 ii) need to be larger than about 30 cm³ at 1 ghz, and that flat or spread-out inhomogeneities need to have surface areas of at least 25 cm² for non-tomographic multi-antenna systems to be useful. our investigations on this are continuing.

6.3. conclusion

as described in this presentation, our system has some unique properties. the key elements and their function are related to the polarisation and field orthogonality phenomena, with a magnetic field emitting device creating a “rotating” electric field in the ous. the internal inhomogeneity then creates orthogonal diffracted fields. the combination of a special transmitting device and a resonant receiving device requires only amplitude measurements, which significantly reduces system complexity, cost and some calibration issues, while achieving good sensitivity. we therefore expect our overall system approach to become a viable and cost-effective class of equipment for detection of human body-part abnormalities.
our method of direct measurement of diffraction effects caused by internal inhomogeneities such as haemorrhages and tumorous tissue offers about the same operational procedure as x-ray mammography, and partially also ultrasound, but at a projected much lower cost. a further contributing factor to the low cost is that complicated computations are not needed, which also saves time since our method provides immediate results. the contrast arises from permittivity differences, which are in principle not correlated with the x-ray grayscale representation. a combination of the two methods should therefore considerably reduce false positive and false negative medical conclusions.

acknowledgement: our work is financially supported by the swedish knowledge foundation (kks) through the project embedded sensor systems for health plus (ess-h+), project number 20180158.

references

[1] p.o. risman and n. petrovic, “detection of diffraction effects by brain haemorrhages with a special microwave transmission system”, in proceedings of the 14th international conference on applied electromagnetics (пес 2019), aug. 2019.
[2] p.m. meaney, m.w. fanning, dun li, s.p. poplack, and k.d. paulsen, “a clinical prototype for active microwave imaging of the breast”, ieee transactions on microwave theory and techniques, vol. 48, no. 11, pp. 1841–1853, nov. 2000.
[3] s. y. semenov et al., “microwave tomography: two-dimensional system for biological imaging”, ieee transactions on biomedical engineering, vol. 43, no. 9, pp. 869–877, sept. 1996.
[4] a. franchois, a. joisel, c. pichot and j. c. bolomey, “quantitative microwave imaging with a 2.45-ghz planar microwave camera”, ieee transactions on medical imaging, vol. 17, no. 4, pp. 550–561, aug. 1998.
[5] p.m. meaney, k.d. paulsen, a. hartov and r.k. crane, “an active microwave imaging system for reconstruction of 2-d electrical property distributions”, ieee transactions on biomedical engineering, vol. 42, no. 10, pp. 1017–1026, oct. 1995.
[6] i. el kanfoud, v. dolean, c.
migliaccio, j. lanteri, i. aliferis, c. pichot, p. tournier, f. nataf, f. hecht, s. semenov, m. bonazzoli, f. rapetti, r. pasquetti, m. de buhan, m. kray and m. darbas, “whole-microwave system modeling for brain imaging”, in proceedings of the 2015 ieee conference on antenna measurements & applications (cama), pp. 1–4, november 2015.
[7] p. tournier, m. bonazzoli, v. dolean, f. rapetti, f. hecht, f. nataf, i. aliferis, i. el kanfoud, c. migliaccio, m. de buhan, m. darbas, s. semenov and c. pichot, “numerical modeling and high speed parallel computing: new perspectives on tomographic microwave imaging for brain stroke detection and monitoring”, ieee antennas and propagation magazine, vol. 59, no. 5, pp. 98–110, october 2017.
[8] qwed company, quickwave 3d – complete 3d electromagnetic simulation, 2019 (accessed december 2019).
[9] p.o. risman, “diffraction phenomena inside dielectric wedges – qualitative theory and verification by modelling and experiment”, in proceedings of the mikon 2008 conference, paper a7/1, wroclaw, poland.
[10] s. gallagher and w.j. gallagher, “the spherical resonator”, ieee trans. on nuclear science, vol. ns-32, no. 5, pp. 2980–82, october 1985.
[11] p.o. risman and m. celuch-marcysiak, “electromagnetic modelling for microwave heating applications”, invited paper, in proceedings of the 13th intl. conf. on microwaves, radar and wireless communications, wroclaw, may 2000, vol. 3, pp. 167–182.
[12] m. gastine, l. courtois and j.l. dormann, “electromagnetic resonances of free dielectric spheres”, ieee trans. microwave theory and techniques, vol. mtt-15, pp. 694–700, december 1967.
[13] n. petrovic, m. otterskog and p. o. risman, “antenna applicator concepts using diffraction phenomena for direct visualization of brain hemorrhages”, in proceedings of the 2016 ieee conference on antenna measurements & applications (cama), syracuse, ny, 2016, pp. 1–4.
[14] n. petrovic, m. otterskog and p. o.
risman, “breast tumor detection with microwave applicators in open air”, in proceedings of the 2017 ieee conference on antenna measurements & applications (cama), tsukuba, 2017, pp. 272–274.
[15] p. m. meaney, k. d. paulsen and j. t. chang, “near-field microwave imaging of biologically-based materials using a monopole transceiver system”, ieee transactions on microwave theory and techniques, vol. 46, no. 1, pp. 31–45, jan. 1998.
[16] m. otterskog, n. petrovic and p. o. risman, “a multi-layered head phantom for microwave investigations of brain hemorrhages”, in proceedings of the 2016 ieee conference on antenna measurements & applications (cama), oct. 2016, pp. 1–3.
[17] p. m. meaney, c. j. fox, s. d. geimer and k. d. paulsen, “electrical characterization of glycerin: water mixtures: implications for use as a coupling medium in microwave tomography”, ieee transactions on microwave theory and techniques, vol. 65, no. 5, pp. 1471–1478, may 2017.
[18] j. w. massey and a. e. yilmaz, “austinman and austinwoman: high-fidelity, anatomical voxel models developed from the vhp color images”, in proceedings of the 38th annual international conference of the ieee engineering in medicine and biology society (embc), august 2016, pp. 3346–3349.

facta universitatis series: electronics and energetics vol. 33, no. 1, march 2020, pp. 37-59
https://doi.org/10.2298/fuee2001037n
© 2020 by university of niš, serbia | creative commons license: cc by-nc-nd

the high frequency surface wave radar solution for vessel tracking beyond the horizon

dejan nikolić¹, nikola stojković¹, pavle petrović¹, nikola tošić¹, nikola lekić¹, zoran stanković², nebojša dončov²
¹vlatacom institute, belgrade, serbia
²university of niš, faculty of electronic engineering, niš, serbia

abstract. with a maximum range of about 200 nautical miles (approx.
370 km), high frequency surface wave radars (hfswr) provide a unique capability for vessel detection far beyond the horizon without the use of any moving platforms. such uniqueness requires design principles unlike those usually used in microwave radar. in this paper the key concepts of hfswr based on frequency modulated continuous wave (fmcw) principles are presented. the paper further describes the operating principles, with a focus on the signal processing techniques used to extract the desired data. the signal processing description covers range and doppler processing, but the focus is on the digital beamforming (dbf) and constant false alarm rate (cfar) models. in order to better present the design process, data obtained from the hfswr sites operating in the gulf of guinea are used.

key words: high frequency surface wave, radar, maritime surveillance, digital beamforming, constant false alarm rate, frequency modulated continuous wave.

1. introduction

in recent years organized crime in maritime regions has flourished, threatening both the secure flow of goods from exclusive economic zones (eez) [1] and the lives of participants in marine operations. consequently, all maritime nations are forced to fully control the whole eez, not only their territorial waters. moreover, in some areas of the world the situation is so serious that un [2] and/or eu intervention [3] has been required, since the nations which have jurisdiction over those waters have limited resources. since eezs are huge bodies of water which can cover hundreds of thousands of square kilometers, complete monitoring is much easier said than done. so, the first question is how to monitor the whole eez.

received february 25, 2019; received in revised form may 14, 2019
corresponding author: dejan nikolić, faculty of electronic engineering, university of niš, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: dejan.nikolic@vlatacom.com)

to the best of our knowledge, there are only two ways to achieve complete eez monitoring, especially if the primary targets are non-cooperative vessels. the first approach uses optical and microwave sensors on platforms such as satellites and airplanes, thus avoiding the sensors' limitations but introducing the platforms' limitations. the most limiting factor is interrupted data availability, since no airplane is able to stay in the air constantly throughout the year and in all weather conditions, while satellites orbit the earth and are over the zone of interest only for a limited time. the other approach uses a network of hfswrs [4] to ensure constant surveillance well beyond the horizon. since the price of an hfswr network is significantly lower than the combined cost of the aforementioned sensors, and data are available constantly throughout the year, it is clear why these radars are slowly becoming the sensors of choice for maritime surveillance at over-the-horizon (oth) distances.

the paper examines the most important questions encountered in the design of vlatacom's high frequency over the horizon radar (vhf-othr). please note that this paper relies on [5]; while the main focus of [5] is on signal processing, this paper provides an overview of the vhf-othr. a general overview of hfswr principles can be found in [4, 6-9]. our solution relies on fmcw, and the main reasons for that choice are:

1. peak power requirement. since we wanted to minimize the required transmit peak power, we opted for fmcw, which requires significantly less peak power than pulsed waveforms.
2. integration time. our targets stay within the radar resolution cell for a few minutes, so we wanted to make full use of the available integration time.

in contrast, the solution presented in [10] is based on a pulsed waveform, which requires at least an order of magnitude greater peak power in order to achieve the same radar range.
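the peak power argument can be made concrete with a small python sketch. it uses only the textbook relation average power = peak power × duty cycle; the 1 kw average power and the 10 % pulsed duty cycle are illustrative assumptions and are not taken from this paper or from [10]. treating equal average power as equal range is itself a simplification that ignores differences in processing and losses.

```python
import math


def peak_power_w(average_power_w: float, duty_cycle: float) -> float:
    """Peak power needed to deliver a given average power at a given duty cycle."""
    if not 0.0 < duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be in (0, 1]")
    return average_power_w / duty_cycle


def peak_ratio_db(duty_cycle: float) -> float:
    """Peak-power penalty of a pulsed waveform relative to continuous wave, in dB."""
    return 10.0 * math.log10(1.0 / duty_cycle)


# Hypothetical values, chosen only for illustration (not from the paper).
AVERAGE_POWER_W = 1000.0   # assumed average transmit power
PULSED_DUTY_CYCLE = 0.1    # assumed duty cycle of a pulsed waveform

fmcw_peak = peak_power_w(AVERAGE_POWER_W, 1.0)  # FMCW transmits continuously
pulsed_peak = peak_power_w(AVERAGE_POWER_W, PULSED_DUTY_CYCLE)

print(f"fmcw peak power:   {fmcw_peak:.0f} W")
print(f"pulsed peak power: {pulsed_peak:.0f} W")
print(f"peak penalty:      {peak_ratio_db(PULSED_DUTY_CYCLE):.1f} dB")
```

with these assumed numbers the pulsed waveform needs ten times (10 db) the peak power of the continuous one, which is the order-of-magnitude difference mentioned above.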
on the other hand, the solution presented in [10] requires less land area to deploy the system, since the receivers can be blanked during transmission.

the rest of the paper is organized as follows: the tactical situation and environmental challenges are presented in section 2, section 3 is dedicated to the general vhf-othr design, section 4 focuses on signal processing, and conclusions are presented in section 5.

2. tactical situation and environmental challenges

the demands which end users usually put before hfswr designers may be formulated in the following manner:

• cover areas of the sea beyond the range of shore-based microwave radars. ideally, cover the complete eez and neighboring areas.
• provide reliable detection and stable tracking at oth distances regardless of environmental conditions. simultaneously, minimize the number of false tracks.

in order to completely understand such demands, designers must be familiar with the situation in the maritime arena and fully understand what types of vessel are present in the open sea. the vessels of most interest are those used for the transportation of goods, such as various types of cargo vessels and tankers. all those vessels have the following common characteristics:

• most of the vessels are very large: their length is often more than 100 meters, while their displacement is usually more than 50,000 deadweight tons (dwt). although those vessels are mandated by international regulations to carry and use ais devices, this is not always the case in the gulf of guinea.
• the top speed of the vessels seldom exceeds 25 knots, while the usual cruising speed ranges from 10 to 20 knots, or even less. please note that vessels may be stationary as well, fishing vessels for example.
• most of the time these vessels travel along a straight line, and when they turn they do so slowly, in wide arcs.
in some cases, however, some (smaller) vessels can perform sharp manoeuvres.

it is clear that tracking vessels of this type is not very demanding; their influence on hfswr coverage is presented in [11]. on the other hand, the track initiation process may be a very demanding task, especially at long ranges. however, since tracking is not the focus of this paper, readers are referred to [12, 13]. unfortunately, this is the only favorable circumstance; all other factors present quite demanding challenges. those challenges can be divided into natural and man-made challenges.

2.1. natural challenges

the very first natural challenge is a direct consequence of the operating band, since natural noise levels in the hf band depend on geographical location, most notably geographic latitude [14]. this leads to situations where the very same hfswr achieves different range performance at different locations. an example of noise influence can be found in appendix a of [15].

next, the roughness of the sea is a factor which must be carefully considered during hfswr design, since it causes additional propagation losses. there are two scales which describe the roughness of the sea surface: the douglas and the beaufort scale. the beaufort scale is usually used by mariners, while the douglas scale represents the world meteorological organization (wmo) standard. in this paper we decided to rely on the douglas scale. on this scale the sea state is expressed with digits from 0 to 9. a higher number on the scale corresponds to a higher wave height, which leads to higher propagation losses. analysis of sea states from 0 to 6 shows that the increase in wave height is proportional to the increase in propagation loss. a detailed analysis of this phenomenon can be found in [16]. it is also important to note that the losses increase with increasing operating frequency.
many shorelines are quite low, and it may not be feasible to mount an hfswr on an embankment or cliff. although this does not affect the performance of an hfswr, it does represent another challenge in installing and maintaining it. if there is no option other than to install at the shoreline (and in some regions there is none), wave erosion can endanger the hfswr installation and in some cases inflict significant damage on the installation site and thus on the hfswr itself. although this is more a civil engineering problem than a design issue, it must be carefully considered prior to hfswr deployment.

2.2. man-made challenges

the first challenge is man-made noise sources, which are constantly changing, especially in developing countries. in developed countries all changes are regulated by governmental bodies. in developing countries (especially in africa), on the other hand, those changes are not strictly regulated. moreover, due to the fast development of those countries, international recommendations are not quite up to date regarding man-made noise data. furthermore, in those countries there is very often no strong regulatory body which controls bandwidth occupancy and users. this leads to situations where a completely unexpected noise source appears in the hfswr operating band (see fig. 1).

fig. 1 unexpected noise sources

as can be seen from fig. 1, a strong unexpected noise source (red and yellow dots in the spectrogram) appeared and lasted for a few minutes across the whole band. this should not be confused with regular radio devices operating in the same band as the hfswr (light blue vertical lines in the spectrogram). although this usually happens in developing countries, it does not mean that there are no noise sources in the hf band in developed countries.
on the contrary, in highly developed countries the hf band can be fully occupied, and finding a suitable operating band can be quite challenging. in extreme cases hfswrs are forced to operate at very low power levels (less than 40 dbm equivalent isotropically radiated power, eirp) in order to avoid the need for a designated band. obviously, this significantly limits range performance. on the other hand, there are some signal processing techniques which can be applied in order to mitigate the influence of noise [17, 18].

the next problems are mostly present in developing countries. those problems are:

• connectivity problems and
• power supply problems.

2.2.1. connectivity problems

one sensor, no matter how large an area it can cover, is often not enough to provide constant surveillance of the eez, so a network of hfswrs is needed. in order to form the network, a connection to the command and control (c2) centers is a must. however, in developing countries this can be a major problem, especially in remote areas. from our point of view, the connection between remote sites and c2 centers in developing countries can be established in one of the following ways:

• mobile telephony,
• microwave links and
• satellite links.

mobile telephony represents an optimal solution, if it is available. since no additional infrastructure is needed, deployment costs are practically negligible. since hfswr does not require high data rates (approximately 256 kb/s is quite enough) and the total data transfer is around 5 gb per month, any present-day mobile network will be suitable. moreover, in many countries some mobile telephony operators are state controlled, or even state owned, so data security should be at a very good level. even then, encryption of the output data is recommended. unfortunately, mobile telephony is not always available, especially in sparsely inhabited areas, so other means of connection are needed.
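as a rough plausibility check of the connectivity figures quoted for mobile telephony, the short python sketch below converts the monthly data volume into an average bit rate and compares it with the quoted link rate. the 30-day month and the decimal interpretation of "gb" and "kb/s" are assumptions made here for illustration; the paper does not specify them.

```python
SECONDS_PER_DAY = 86_400


def average_rate_bps(bytes_per_month: float, days: int = 30) -> float:
    """Average bit rate implied by a monthly data volume."""
    return bytes_per_month * 8 / (days * SECONDS_PER_DAY)


def link_utilization(bytes_per_month: float, link_rate_bps: float, days: int = 30) -> float:
    """Fraction of the link capacity used on average."""
    return average_rate_bps(bytes_per_month, days) / link_rate_bps


MONTHLY_BYTES = 5e9      # 5 GB per month (decimal gigabytes assumed)
LINK_RATE_BPS = 256e3    # 256 kb/s (decimal kilobits assumed)

avg_bps = average_rate_bps(MONTHLY_BYTES)
utilization = link_utilization(MONTHLY_BYTES, LINK_RATE_BPS)

print(f"average rate: {avg_bps / 1e3:.1f} kb/s")
print(f"utilization:  {utilization:.1%}")
```

under these assumptions the average rate is roughly 15 kb/s, about 6 % of a 256 kb/s link, so the quoted link rate leaves ample headroom for bursts and retransmissions.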
microwave links are definitely the most reliable and the safest means of connectivity between hfswr sites and c2 centers. on the other hand, the cost of deployment is very high, which limits their usefulness.

the last option is a satellite link. although it may look appealing, it is crucial to understand its flaws. the first is a security issue, since the whole network depends on a third party (the link provider). the second issue is data availability. hfswr as a sensor is only slightly affected by meteorological factors and virtually unaffected by rain and clouds, while typical satellite links are very susceptible to weather conditions. so, choosing a satellite link as the means of data transfer is not recommended; unfortunately, it is sometimes the only viable solution, due to the lack of a mobile network or the lack of funds to develop a microwave link network.

2.2.2. power supply problems

all electrical devices require a power supply to perform their functions, and hfswrs are no exception. this simple requirement can be a real problem in developing countries, since the electrical power network is not always available, and where it is available the quality of the supplied electrical energy is questionable. it is clear that other means of power supply are needed. usually, diesel generators or solar panels are the only viable options. regardless of how electrical energy is provided, uninterruptible power supply (ups) units are a must. moreover, it is highly advisable to back up both the generators and the ups systems in order to provide uninterrupted hfswr operation.

3. general hfswr design

3.1. usual site deployment

like all other radars, the hfswr consists of a transmitter and a receiver with their antenna arrays. besides those crucial elements, the vhf-othr site includes a central site location (csl). although this area is not mandatory, in some cases it is a must.
in those cases the csl provides power supply, connectivity to the c2 and sometimes even physical security (armed guards) for the hfswr. a typical site layout is shown in fig. 2.

fig. 2 vhf-othr site deployment

the selected site geometry and working frequency define the coverage area as well as the area required for site deployment. as an example, a system centered at 4.6 mhz, which yields a maximum range of approximately 200 nautical miles (370 km), will be discussed. the site geometry for the antenna arrays is presented in fig. 3.

fig. 3 hfswr array layout for system centered at 4.6 mhz

according to fig. 3, the minimal length of the rx antenna array is 7.5 wavelengths, which for the operating frequency of 4.6 mhz (a wavelength of about 65 meters) is nearly 500 meters. in order to prevent self-interference, the rx and tx arrays need to be separated by at least 10 wavelengths, or 650 meters for a system operating at 4.6 mhz. next, the transmit array length is at least half a wavelength, or 32 meters in the case of the 4.6 mhz system. furthermore, radials are needed in order to boost antenna efficiency. taking into account that the radial length is 25 meters, the required length increases by another 50 meters (25 meters before receive antenna 1 in fig. 3 and 25 meters after the transmit antenna). finally, fencing is needed in order to secure the site, so the occupied area grows again. to summarize, deploying the 4.6 mhz site requires a land area nearly 1.5 km long and 100 meters wide. this area needs to be as close as possible to the sea; in practice the best location is at the shoreline (see fig. 4).

fig. 4 deployed vhf-othr site

for applications where a coverage range of 200 nautical miles is not a requirement, the occupied area can be reduced, since a higher frequency can be used. for example, a system centered at 12 mhz requires only an area 300 meters long, but its maximal range is reduced to 60 nautical miles.

3.2.
system’s block diagram

a brief description of the vhf-othr block diagram, with the most important parameters which influence its performance, is presented here and shown in fig. 5. a more detailed description of the individual components is presented in the following sections or in standalone articles.

fig. 5 vhf-othr system’s block diagram

it is important to note that this radar is based on frequency modulated continuous wave (fmcw) principles, similarly to the ones presented in [19]. as in all other radar systems, the exciter generates the waveforms needed for system operation. here, a direct digital synthesizer (dds) is used to generate a linearly frequency modulated signal, also known as a chirp. this signal is split into 17 channels, one for the transmitter and 16 for the receiver channels. the transmitter signal is amplified to the desired level by a power amplifier and fed to the transmit antenna array.

although the majority of the signal is transmitted towards the open sea, some portion is also radiated directly towards the receiving array, where it interferes with the signal reflected from the open sea. it is important to note that losses during propagation over any type of soil (l_land) are much greater than losses over the sea surface (l_sea). therefore the echo returned from the sea is dominant at the receiver input. the echo itself consists of two predominant components: the signal reflected from vessels (defined by each target's radar cross section, rcs) and the signal reflected from the ocean waves, also known as sea clutter. besides the direct and reflected signal components, external noise originating from natural (f_na) and man-made (f_mm) sources is also present at the receiver inputs. in some cases a third component, ionospheric clutter, may appear at the receiver inputs.
this ionospheric clutter is the result of unwanted sky-wave propagation of the transmitted signal, which is reflected back to the radar via the ionosphere.

each rx antenna receives the signal completely independently of the others and feeds it to a separate receiver (see fig. 6).

fig. 6 receivers block diagram

the signal from each of the 16 antennas (a1 to a16) is first filtered in order to suppress out-of-band components, and then amplified to the level needed for further processing. next, it is mixed with the signal from the exciter, which translates it to baseband i and q signals. afterwards, the signal passes through a notch filter and a low pass filter. the notch filter suppresses signals around 0 hz (dc) in order to reduce the impact of transmitter leakage. it is important to note that this filter comes after the received signal has already been translated to the frequency domain (where the baseband frequency corresponds to range); thus it does not affect stationary targets located far from the hfswr, but only close targets and, most importantly, leakage from the transmitter. the low pass filter, with a cut-off frequency of 1 khz, is used to clean the channel of higher harmonics, which are mainly produced by the mixer. at the final stage, the signal is digitized with a 16 bit a/d (analog-to-digital) converter, after which it is sent to the digital signal processing.

the digital signal processing steps are presented in section 4, while the tracking and integration processes are presented in [13] and [15] respectively. finally, in order to completely cover a nation's eez, multiple hfswrs may be needed, so an hfswr network must be formed [20].

3.3. antenna arrays

the receiver antenna array consists of 16 monopole antennas [21], with each antenna feeding its own independent receiver. beamforming is done digitally and is presented in section 4.3.
the transmit array radiation pattern is formed by the array geometry and the phase shifts at each array element. primarily, the transmitting planar array is designed with the intention of maximizing the energy radiated towards the sea over a wider band, so that the operating frequency can deviate when necessary. secondly, radiation pattern nulls towards the receiving array must be achieved over the entire operating band. in order to achieve those goals, elements a and d have a phase shift of 126° relative to elements b and c in fig. 7.

fig. 7 transmit array diagram

lightning poles are placed at θ=0° and 180°, 0.75λ from the center of the array, where the nulls are expected to be. it is important to note that lightning poles are a must, since the antenna array towers above the surrounding area. the 3d radiation pattern of the array far field can be seen in fig. 8, while a horizontal cut at 8° is shown in fig. 9.

fig. 8 3d polar plot of transmitting array radiation pattern

fig. 9 radiation pattern of tx array – horizontal cut, θ=8°

as shown, the side lobe towards θ=270° has about 8.7 db lower gain than the maximum directed towards θ=90°, and the nulls occur in the directions θ=0° and θ=180°.

4. signal processing

this section provides an overview of the signal processing applied in the vlatacom radar. the complete signal processing consists of the following steps:

1. range processing,
2. doppler processing,
3. beamforming,
4. constant false alarm rate (cfar) and
5. target tracking (not the subject of this paper).

all these steps are presented in fig. 10.

fig. 10 vhf-othr signal processing steps

4.1. range processing

the input is the digitized complex envelope of the received signal, translated to baseband (0 to 1000 hz). at this stage the signal has the form of signal level versus time.
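in an fmcw radar the baseband (beat) frequency of an echo is proportional to the target range, which is why the spectrum of this baseband signal can be read as a range profile. the python sketch below illustrates that standard relation, f_b = 2·r·b / (c·t), together with the usual range resolution c / (2·b). the sweep bandwidth b and sweep period t are hypothetical values, chosen here only so that a target at 200 nautical miles falls inside the 0 to 1000 hz baseband; they are not the parameters of the vhf-othr.

```python
C = 299_792_458.0  # speed of light, m/s


def beat_frequency_hz(range_m: float, bandwidth_hz: float, sweep_s: float) -> float:
    """Beat frequency of a linear-FM (chirp) echo from a point target at range_m."""
    return 2.0 * range_m * bandwidth_hz / (C * sweep_s)


def range_from_beat_m(beat_hz: float, bandwidth_hz: float, sweep_s: float) -> float:
    """Inverse mapping: target range that produces the given beat frequency."""
    return beat_hz * C * sweep_s / (2.0 * bandwidth_hz)


def range_resolution_m(bandwidth_hz: float) -> float:
    """Range resolution of a chirp with the given sweep bandwidth."""
    return C / (2.0 * bandwidth_hz)


# Hypothetical sweep parameters (assumptions, not taken from the paper).
BANDWIDTH_HZ = 100e3          # assumed sweep bandwidth
SWEEP_S = 0.25                # assumed sweep period
MAX_RANGE_M = 200 * 1852.0    # 200 nautical miles in meters

beat = beat_frequency_hz(MAX_RANGE_M, BANDWIDTH_HZ, SWEEP_S)
print(f"beat frequency at maximum range: {beat:.1f} Hz")
print(f"range resolution: {range_resolution_m(BANDWIDTH_HZ):.0f} m")
```

with these assumed values the farthest target of interest maps to a beat frequency just below 1 khz, which is consistent with a baseband limited to 1000 hz, and each frequency bin of the subsequent fft corresponds to a fixed range interval.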
digital signal processing starts with the “range fft”, which is the fast fourier transform of the digitized received signal. in this way the received signal is reshaped into signal level versus range form (fig. 11).

fig. 11 signal after the “range processing” (taken from [5])

4.2. doppler processing

the first step in doppler processing is to rearrange the data based on range values and rx channels. after the data are rearranged into the desired form, they are first processed by a window function before being passed to the fft. the blackman-harris window function [22] is used here because it suppresses higher order side lobes better than other window functions, while still maintaining very good selectivity of the main lobe. the last step is the fft, whose outputs are the “range-doppler maps”. these maps are generated for each angle in the hfswr field of view and represent the dependence of signal level on the range and speed (doppler shift) of targets. doppler processing is schematically presented in fig. 12.

fig. 12 “doppler processing” (taken from [5])

4.3. digital beamforming

in order to form the desired radiation pattern for the receiving array, various summation techniques can be used [23]. here a conventional phase shift beamformer is used for angle calculations, while the orchard algorithm [24] is used for calculating the weighting coefficients. the array factor of the formed antenna array can be written as:

$$AF = \sum_{n=0}^{N-1} I_n \,(z - z_n) \qquad (1)$$

where $AF$ represents the array factor of the formed array, $I_n$ are the currents at each element, $N$ is the number of elements and $z$ (with $z_n = e^{a_n + j b_n}$) is the complex variable which defines the array factor. by manipulating $a_n$ and $b_n$ the desired array factor can be obtained. the parameters $a_n$ and $b_n$ are obtained by iteratively solving the equation system (eq. 2, see below) which describes the antenna gain at characteristic points (maxima and minima). the beamforming process is described below:

1.
$a_n$ are set to 0, while $b_n$ are uniformly distributed between 0 and 2π, which gives the array gain as a function of βd cos(θ), as shown in fig. 13. here β = 2π/λ (λ is the wavelength used), d is the distance between antenna elements (in this case 0.45λ) and θ is the aspect angle relative to the array axis.

fig. 13 starting gain (taken from [5])

2. the desired gain is set and eq. 2 is solved in order to find $a_n$ and $b_n$:

$$A \,\delta x = g - \tilde{g} \qquad (2)$$

where $\tilde{g}$ are the current gain values, $g$ are the required gain values (defined by the shaping function) and $\delta x$ is the change of the $x$ matrix, which contains $a_n$ and $b_n$. it is important to note that there is no closed-form solution to eq. 2, but only optimal solutions which can be found using a numerical approach. one example can be seen in fig. 14.

fig. 14 intermediate step, obtained during optimization process (taken from [5])

3. it is worth noting that a numerical approach using software tools may be suboptimal, as matrix x may become singular when the solution reaches some local stationary point. when this happens, some manual tweaking is required in order to reach the desired shaping function and thus derive the required $a_n$ and $b_n$. the final result is shown in fig. 15.

fig. 15 end of optimization process – fully derived $a_n$ and $b_n$ (taken from [5])

note that the shift in angle between the desired shaping function and the obtained radiation pattern is not relevant, since the exact position of the radiation pattern is achieved through the phase shifting coefficients (see fig. 9). the end result of beamforming is the so-called range-doppler-azimuth cube (rda cube), which contains signal power levels as a function of all relevant parameters (range, doppler shift and azimuth).

4.4. cfar algorithm

the input data for this processing step are the rda cubes and the joint noise/clutter distribution functions.
these distribution functions depend on system parameters and on the environment where the system is installed, so a thorough statistical analysis is needed in order to derive the required distribution functions precisely.

4.4.1. statistical analyses

first, the system's noise distribution function must be derived using data obtained in the laboratory, with the othr transmitter directly connected to the othr receiver and a test signal (one lone target) running. one rd map obtained during such a test is shown in fig. 16.

fig. 16 rd map obtained in the laboratory

processing all rd maps obtained during this test yields the statistical properties of the hfswr system noise. the distribution of the obtained results is shown in fig. 17.

fig. 17 statistical properties of system's noise

one rda cube contains approximately 12 million cells. in one of them a test target is present, while the rest contain only system noise. orange lines with circles on top represent the number of cells (samples) used to determine the value of the blue line. for example, nearly 120,000 cells had power levels of 50 db above the noise floor, so the orange line with a circle on top reached 0.01 (120,000 / 12 million). the distribution of the measured results is shown with the blue line; the two distributions that match the measured results best, normal and weibull, are shown with orange and purple lines respectively. the weibull distribution clearly matches the measured samples better than the normal distribution, so it is chosen as the statistical model of the system's noise and is used for further statistical analyses.

next, clutter and noise introduced by the environment are analyzed. an rda cube obtained from an hfswr site situated in the gulf of guinea is used. one rd map obtained during the test is shown in fig. 18.

fig. 18 rd map obtained in the field (gulf of guinea)

the following areas are easily noted in fig. 18:

1. dominant area, usually containing the noise / clutter distribution, which is the primary goal of this analysis. its distribution function is presented in fig. 17.
2. ionospheric clutter region, which can mask some distant targets and has distinctive statistical properties, so it will be discussed separately.
3. 1st order bragg lines [25], representing scattering which is very important for oceanographic measurements but has no value for vessel tracking, since it introduces blind velocities. since no statistical method known to the authors can reliably detect vessels in this region, this area is excluded from further statistical analyses. please note that although this cannot be solved within cfar, it can be addressed with frequency diversity.
4. fast moving targets, such as airplanes, or even returns from the d or e ionosphere layers. since they have no significant impact on the clutter / noise distributions, they will not be discussed further.
5. potential targets. they occupy very few cells and have no significant impact on the noise / clutter distribution functions.

fig. 19 statistical properties of the region 4

from fig. 19 it can be seen that the weibull distribution matches the measured data best, and it is adopted for further cfar modeling. on the other hand, when ionospheric clutter is present, the weibull distribution does not match the measured data in the affected area very well. as can be seen from fig. 20, the log-normal distribution describes that region best.

fig. 20 statistical properties of the region affected by ionospheric clutter

please note that the obtained distribution functions represent system noise, environmental noise and clutter distribution in the gulf of guinea. in other regions of the world these distributions may differ.
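the text does not state how the candidate distributions were fitted to the measured samples. as a hedged illustration only, the sketch below estimates weibull parameters with a straight-line fit on a weibull plot (median-rank plotting positions); the function name `fit_weibull` and the synthetic test data are assumptions made for this example and are not taken from the hfswr processing chain.

```python
import math
import random


def fit_weibull(samples):
    """Estimate Weibull (shape, scale) from positive samples.

    Uses the linearised CDF  ln(-ln(1 - F)) = k*ln(x) - k*ln(lam)
    and an ordinary least-squares line through the sorted samples.
    This is an illustrative estimator, not the authors' method.
    """
    xs = sorted(s for s in samples if s > 0)
    n = len(xs)
    pts = []
    for i, x in enumerate(xs, start=1):
        f = (i - 0.3) / (n + 0.4)              # median-rank plotting position
        pts.append((math.log(x), math.log(-math.log(1.0 - f))))
    mean_u = sum(u for u, _ in pts) / n
    mean_v = sum(v for _, v in pts) / n
    s_uv = sum((u - mean_u) * (v - mean_v) for u, v in pts)
    s_uu = sum((u - mean_u) ** 2 for u, _ in pts)
    shape = s_uv / s_uu                         # slope of the fitted line
    scale = math.exp(mean_u - mean_v / shape)   # from the intercept -k*ln(lam)
    return shape, scale


if __name__ == "__main__":
    random.seed(1)
    # synthetic stand-in for measured cell power levels (scale 2.0, shape 1.5)
    cells = [random.weibullvariate(2.0, 1.5) for _ in range(5000)]
    print(fit_weibull(cells))
```

in practice the same fit would be repeated for each candidate distribution (for example normal or log-normal) and the residuals compared, which mirrors the visual comparison made in the figures.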
4.4.2. cfar algorithm

the analyses presented above suggest the use of an adaptive cfar algorithm such as [26] or a fusion cfar algorithm [27]. the cfar used here is based on the approach presented in [27] and is a slight modification of the well-known cell-averaging greatest-of cfar (cago-cfar). the only difference is that the threshold level depends not only on the averaged signal level, but also on the assumed distribution function presented above. as in all cfar detectors, the "cell under test" is surrounded by guard cells, after which come the cells used for signal level estimation, the training cells (see fig. 21). please note that for simplicity's sake fig. 21 is drawn in 2d, while in reality cfar operates in 3d (range, azimuth, doppler), estimating each cell in the rda cube.

fig. 21 cell estimation principle

the mean signal value in the training cells is calculated as:

$$\mu = \frac{1}{L} \sum_{i \in TC} y_i \qquad (3)$$

where $\mu$ is the mean signal value, $L$ is the number of training cells and $y_i$ is the signal level in the current training cell, while the variance ($\sigma^2$) is calculated as:

$$\sigma^2 = \frac{1}{L} \sum_{i \in TC} (y_i - \mu)^2 \qquad (4)$$

the threshold level ($T$) is calculated as:

$$T = C \cdot \mu \qquad (5)$$

where the value of $C$ depends on the distribution function derived above; its main role is to maintain the predefined false alarm rate (see fig. 22).

fig. 22 maintaining constant probability of false alarm

the end result of this processing is a list of detected targets. that list is presented graphically in fig. 23, where all detected targets are plotted in a polar diagram. a value of -50 knots is chosen as the figure background because it is highly unlikely that any vessel of interest can reach that speed.

fig. 23 detected targets

it is important to note that not all of the detections present in fig. 23 are necessarily real vessels.
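the cell estimation principle of eqs. (3)-(5) can be illustrated with a minimal one-dimensional sketch. it assumes a greatest-of cell-averaging scheme in which the means of the leading and trailing training windows are compared and the larger one is scaled by a constant c. the function name `cago_cfar_1d`, the window sizes and the value of c are illustrative assumptions; the system described here works on the full 3d rda cube and derives c from the fitted distribution.

```python
def cago_cfar_1d(signal, num_train=8, num_guard=2, c=3.0):
    """Greatest-of cell-averaging CFAR along one dimension (illustrative).

    signal    : sequence of non-negative power levels (linear scale)
    num_train : training cells on EACH side of the cell under test
    num_guard : guard cells on EACH side of the cell under test
    c         : threshold multiplier (assumed constant in this sketch)
    Returns the indices of cells whose level exceeds the threshold.
    """
    detections = []
    half = num_train + num_guard
    for cut in range(half, len(signal) - half):
        # training cells before and after the guard band
        lead = signal[cut - half: cut - num_guard]
        lag = signal[cut + num_guard + 1: cut + half + 1]
        mu_lead = sum(lead) / len(lead)
        mu_lag = sum(lag) / len(lag)
        mu = max(mu_lead, mu_lag)          # "greatest of" the two means
        threshold = c * mu                 # T = C * mu
        if signal[cut] > threshold:
            detections.append(cut)
    return detections


if __name__ == "__main__":
    # flat noise floor with one strong cell in the middle
    levels = [1.0] * 50
    levels[25] = 20.0
    print(cago_cfar_1d(levels))
```

cells closer than `num_train + num_guard` to either edge are skipped in this sketch; a full implementation would need an explicit edge policy and would replace the fixed c with a value chosen for the required false alarm rate.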
on some occasions, especially at longer ranges, sea clutter can be detected as a legitimate target, which may cause false alarms in the c2 system. to resolve this issue, all detections are fed to the tracking algorithms, which use the track before principle [13] to eliminate false alarms. afterwards, data obtained from multiple radars are fused into a single stream of data and integrated with ais data [15] in order to create a unique operational picture. the end result of this process can be seen in fig. 24. ais data originating from one vessel are also displayed in fig. 24 in order to facilitate verification of the vessel detection.

fig. 24 an example of an operational picture

5. conclusion

with the development of technology, prevention of illegal activities on the open sea is becoming increasingly complex, which further increases the need for more sophisticated solutions for monitoring large maritime areas far from the shore. at ranges well beyond the horizon, constant data availability and affordable price put hfswrs at the forefront of the battle for safer seas. unfortunately, unlike microwave radars, hfswrs are not mass-produced, well-established devices. there are many questions which demand answers before the design process even starts. the most important of those questions, from the authors' point of view, and the corresponding answers are presented in this manuscript. an introduction to hfswrs that adopt frequency-modulated continuous waves (fmcw) to measure the range, angular position and velocity of remote objects has been given in this article. the article focuses on a detailed analysis of how the received signal is processed in order to obtain the vessels' positions, although some other important aspects are discussed as well.
in the description of the signal processing, the main focus is on the digital beamforming (dbf) and constant false alarm rate (cfar) models, but the other steps, such as range and doppler processing, are presented as well. in order to better present the design process, data obtained from the hfswr sites operating in the gulf of guinea are used. in the future, development of this system will move in the multiple-input multiple-output (mimo) direction, with the intention of designing one system which consists of multiple interlinked nodes.

acknowledgement: the paper is a part of the research done within the project #89.1, funded by vlatacom institute.

references [1] united nations, law of the sea, part v – exclusive economic zone. august 2011. [2] https://news.un.org/en/story/2017/11/570172-un-security-council-urges-comprehensive-response-piracysomali-coast. [3] http://eunavfor.eu/mission/. [4] g. fabrizio, "high frequency over-the-horizon radar: fundamental principles, signal processing, and practical applications," london, uk, mcgraw-hill inc., 2013. [5] n. stojkovic, d. nikolic, p. petrovic, n. tosic, n. lekic, "an implementation of dbf and cfar models in othr signal processing", in proceedings of the 15th ieee colloquium on signal processing and its applications (cspa 2019), penang, malaysia, 8-9 mar. 2019. [6] l. sevgi, a. ponsford, h.c. chan, "an integrated maritime surveillance system based on high-frequency surface-wave radars. part 1. theoretical background and numerical simulations", ieee antennas and propagation magazine, vol. 43, no. 4, pp. 28-43, aug 2001. [7] a. ponsford, a. ponsford, h.c. chan, "an integrated maritime surveillance system based on high-frequency surface-wave radars. part 2. operational status and system performance", ieee antennas and propagation magazine, vol. 43, no. 5, pp. 52-63, oct 2001. [8] t. ponsford and j.
wang, "a review of high frequency surface wave radar for detection and tracking of ships", special issue on skyand ground-wave high frequency (hf) radars: challenges in modelling, simulation and application, turk j elec eng & comp sci, vol. 18, no. 3, 2010. [9] l. sevgi, "modeling and simulation strategies in high frequency surface wave radars", turk j elec eng & comp sci, vol. 18, no. 3, 2010. [10] a. ponsford, r. mckerracher, z. ding, p. moo, d. yee, "towards a cognitive radar: canada’s thirdgeneration high frequency surface wave radar (hfswr) for surveillance of the 200 nautical mile exclusive economic zone", sensors, vol. 17, no. 7, article no. 1588, 2017. [11] n. lekic, d. nikolic, b. milanovic, d. vucicevic, a. valjarevic, b. todorovic, "imapact of radar cross section on hf radar surveillance area: simulation approach", in proceedings of 2015 ieee radar conference, johannesburg, rsa, 2015. [12] z. ding, p. moo, "design of an imm-nnjpda tracker for hfswr", in proceedings of the 17th international radar symposium (irs), 2016, pp. 1-5. [13] n. stojkovic, d. nikolic, b. dzolic, n. tosic, v. orlic, n. lekic, b. todorovic, "an implementation of tracking algorithm for over-the-horizon surface wave radar", in proceedings of the 24th telecommunications forum (telfor), belgrade, serbia, 22-23 nov. 2016. [14] itu-r recommendation p.372-11, september 2013. the high frequency surface wave radar solution for vessel tracking beyond the horizon 59 [15] d. nikolic, n. stojkovic, n. lekic, "maritime over the horizon sensor integration: high frequency surface-wave-radar and automatic identification system data integration algorithm", sensors, vol. 18, no. 4, article no.1147, 2018. [16] d. e. barick, "theory of ground-wave propagation across a rough sea at decameter wavelengths", battelle memorial institute, 1970. [17] s.v. vaseghi, "advanced digital signal processing and noise reduction 4th ed", john wiley and sons ltd., 2008. [18] g. chen, z. zhao, g. zhu, y. huang, and t. 
li, "hf radio-frequency interference mitigation", ieee geoscience and remote sensing letters, vol. 7, no. 3, july 2010. [19] v. milovanovic, "on fundamental operating principles and range-doppler estimation in monolithic frequency-modulated continuous-wave radar sensors", facta universitatis, series: electronics and energetics, vol. 31, no. 4, pp 547-570, 2018. [20] s.j.anderson, "optimizing hf radar siting for surveillance and remote sensing in the strait of malacca", ieee transaction on geoscience and remote sensing, vol. 51, no. 3, pp. 1805-1816, mar. 2013. [21] m. m. weiner, "monopole antennas," new york – basel: marcel dekker inc., 2003. [22] j. o. smith iii, "spectral audio signal processing," w3k publishing, 2011. [23] h. l. van trees, "optimum array processing, " john wiley & sons, inc., 2002. [24] h.j. orchard, r.s. elliott, g.j. stern, "optimizing the synthesis of shaped beam antenna patterns", in proceedings of the iee microwave, antennas and propagation, h 132, mar. 1985, vol. 1, pp 63 – 68. [25] o. m. phillips, "radar returns from the sea surface—bragg scattering and breaking waves," j. phys. oceanogr., vol. 18, pp. 1065-1074, 1988. [26] x. lu, j. wang, r. dizaji, z. ding, a. m. ponsford , "a new constant false alarm rate technique for high frequency surface wave radar", ieee ccece, 2004. [27] d. ivković, m. andrić, b. zrnić, "detection of very close targets by fusion cfar detectors", scientific technical review, vol. 66, no. 3, pp. 50-57, 2016. instruction facta universitatis series: electronics and energetics vol. 27, n o 4, december 2014, pp. 631 638 doi: 10.2298/fuee1404631j efficiency limits in photovoltaics – case of single junction solar cells  marko jošt, marko topič university of ljubljana, faculty of electrical engineering, ljubljana, slovenia abstract. the conversion efficiency of solar energy into electrical energy is the most important parameter when discussing solar cells, photovoltaic (pv) modules or pv power plants. 
so far many papers have been written to address the limiting efficiency of solar cells, the theoretical maximum conversion efficiency an ideal solar cell could achieve. however, most researchers modelled the sun's spectrum as a blackbody, which does not represent a realistic case. in this paper we have calculated the limiting efficiency as a function of the absorber's band gap at standard test conditions using the solar spectrum am1.5. in addition, the other key solar cell performance parameters (open-circuit voltage, short-circuit current density and fill factor) are evaluated, while the intrinsic losses in the solar cells are also explained and presented as a function of cell temperature.

key words: efficiency limit, single junction solar cell, loss mechanisms, am1.5 spectrum.

1. introduction

the conversion efficiency is one of the most important parameters when discussing photovoltaic or any other energy conversion devices, telling us the ratio between output and input energy. besides the actual state-of-the-art average or record efficiencies, the theoretical limiting efficiency is also very important, since it indicates how much progress is still left to achieve. in photovoltaics, the first limiting efficiency for solar cells was calculated by shockley and queisser in 1961 [1]. in their work they assumed the detailed balance principle based on the second law of thermodynamics. they calculated the limiting efficiency to be 30% for single junction solar cells, modelling the sun as a blackbody with t = 6000 k. later, other attempts in this field were also reported [2]–[4]. different techniques were examined in an attempt to achieve or even exceed the limiting efficiency. the most popular are light management through optical light scattering [7], tandem solar cells and concentrator solar cells. other approaches, such as multiple exciton generation [8] and up- [9] and down- [10] conversion, are also being researched. here we will focus on single-junction solar cells under terrestrial conditions.
in the existing studies most researchers used the sun's blackbody radiation spectrum. this, however, does not represent a realistic situation for terrestrial conditions, as part of the sun's spectrum is absorbed in the atmosphere. in this paper we will use the standard solar spectrum am1.5 [11]. the two spectra are shown in fig. 1. the difference is clear, thus a separate analysis has to be done. we will determine the maximum efficiency under am1.5 one-sun illumination for a single junction solar cell at standard testing conditions, together with the key performance parameters of the solar cells. in the second part of this paper, intrinsic losses will be explained and calculated. also, a comparison between results under the am1.5 and blackbody radiation spectra will be shown.

received july 28, 2014; received in revised form october 23, 2014
corresponding author: marko jošt, university of ljubljana, faculty of electrical engineering, tržaška cesta 25, 1000 ljubljana, slovenia (e-mail: marko.jost@fe.uni-lj.si)

fig. 1 comparison between blackbody radiation spectrum at ts = 6000 k (g = 1550 w/m²) and solar spectrum am1.5 (g = 1000 w/m²). the same energy span was considered for both spectra.

2. efficiency limit

according to the detailed balance principle [1], in equilibrium everything that is absorbed has to be emitted. radiative recombination is therefore the only recombination mechanism present in the ideal solar cell and as such is necessary and unavoidable. all other recombination mechanisms diminish the efficiency significantly. the emission from the solar cell in equilibrium follows the planck equation, where the emission spectrum is described by the solar cell temperature tc:

[equation (1)]

e is the energy of the emitted photon, h is planck's constant, c is the speed of light and k is boltzmann's constant. emissivity ε = 1 was used in the following calculations, as a black body was assumed. under illumination the system is no longer in equilibrium.
due to the chemical potential between the quasi-fermi levels, the photon emission increases by $\exp(\mu / k T_c)$, following the normalised np product, where µ = q·voc is the chemical potential and q is the elementary charge. absorbed incident photons create electron-hole pairs. at the open-circuit voltage voc the generated electrons cannot be extracted, as no load is connected to the solar cell. in the ideal solar cell they all recombine radiatively and are emitted. the open-circuit voltage can therefore be derived by equating recombination rext (equation 2) and generation ppump (equation 3):

[equation (2)]

[equation (3)]

[equation (4)]

[equation (5)]

eg denotes the band gap, s(e) is the spectrum of the solar radiation, while ω and θ stand for the solid and polar angle, respectively. if non-radiative recombinations rnr do occur, they can be described with the external fluorescence efficiency ηext < 1. voc is then:

[equation (6)]

the current in the solar cell is the difference between the electrons generated by solar radiation and the electrons that recombine, radiatively or non-radiatively, and it depends on the voltage v. this gives us:

[equation (7)]

here we consider only radiative recombination, since we assumed an ideal solar cell in which the only loss is the emission loss due to radiative recombination. therefore ηext = 1. the short-circuit current jsc can easily be calculated from equation 7 by inserting v = 0. the maximum power is obtained as the product of jmpp and vmpp, the current and voltage at the maximum power point, while the efficiency, the most important factor when discussing solar cells, is the ratio between the generated electrical power pel and the incoming power from the sun pin:

$$\eta = \frac{P_{el}}{P_{in}} = \frac{J_{mpp} V_{mpp}}{P_{in}} = \frac{FF \cdot J_{sc} V_{oc}}{P_{in}} \qquad (8)$$

where ff is the fill factor, calculated by the following equation:

$$FF = \frac{J_{mpp} V_{mpp}}{J_{sc} V_{oc}} \qquad (9)$$

the discussed parameters are presented in fig. 2 as a function of band gap. the first graph shows the famous sq limit for the two spectra.
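the detailed-balance procedure described above can be sketched numerically for the blackbody case, which needs no tabulated spectrum. the sketch below is an illustration under stated assumptions, not the authors' code: the sun is a 6000 k blackbody seen under the absorption solid angle 6.8221e-5 sr, the cell emits into an effective solid angle π, the emission is scaled by exp(qv/ktc), and the incident power is the full (untruncated) blackbody power. the function names are hypothetical.

```python
import math

H = 6.62607015e-34      # Planck constant, J s
C = 2.99792458e8        # speed of light, m/s
K = 1.380649e-23        # Boltzmann constant, J/K
Q = 1.602176634e-19     # elementary charge, C

OMEGA_ABS = 6.8221e-5   # solid angle of absorption (sun disc), sr
OMEGA_EMIT = math.pi    # effective solid angle of emission, sr


def photon_flux(e_min, temp, omega, span_kt=60.0, steps=3000):
    """Blackbody photon flux in 1/(m^2 s) for photon energies above e_min (eV)."""
    kt = K * temp / Q                          # thermal energy in eV
    de = span_kt * kt / steps                  # integrate over 60 kT above e_min
    total = 0.0
    for i in range(steps + 1):
        e = e_min + i * de
        w = 0.5 if i in (0, steps) else 1.0    # trapezoid weights
        total += w * e * e / math.expm1(e / kt)
    # Q**3 converts the eV^3 integral to J^3
    return 2.0 * omega * Q ** 3 / (H ** 3 * C ** 2) * total * de


def sq_efficiency(eg, t_sun=6000.0, t_cell=298.15):
    """Limiting efficiency for band gap eg (eV), radiative recombination only."""
    kt = K * t_cell / Q
    gen = photon_flux(eg, t_sun, OMEGA_ABS)    # absorbed photons
    rec0 = photon_flux(eg, t_cell, OMEGA_EMIT) # emitted photons at V = 0
    sigma = 2 * math.pi ** 5 * K ** 4 / (15 * H ** 3 * C ** 2)
    p_in = OMEGA_ABS / math.pi * sigma * t_sun ** 4   # incident power, W/m^2
    best, v = 0.0, 0.0
    while v < eg:                              # scan voltage for the maximum power point
        j = Q * (gen - rec0 * math.exp(v / kt))
        best = max(best, j * v)
        v += 0.001
    return best / p_in


def scan_band_gaps():
    """Return (peak efficiency, band gap in eV) on a 0.05 eV grid from 0.5 to 2.5 eV."""
    grid = [0.5 + 0.05 * i for i in range(41)]
    return max((sq_efficiency(eg), eg) for eg in grid)


if __name__ == "__main__":
    print(scan_band_gaps())
```

on this grid the peak falls near 1.3 ev at roughly 31 %. small differences from published blackbody values are expected, because the incident power here is not truncated to the energy span of the am1.5 data. replacing the sun's `photon_flux` call with an integral over tabulated am1.5 data would give the terrestrial case.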
the peak is 33.8% at 1.34 ev for am1.5 and 31.4% at 1.29 ev for blackbody radiation. voc increases linearly with band gap, while jsc decreases because fewer photons are absorbed at higher band gap energies. the corresponding parameters of the c-si and gaas record solar cells are also inserted in the graphs.

fig. 2 graphical presentation of solar cell parameters for am1.5 (blue line) and blackbody radiation (green line) at tc = 25°c, with inserted points for record c-si [12] and gaas [13] solar cells

since most papers are based on blackbody radiation, we decided to show the comparison between blackbody radiation at ts = 6000 k and the am1.5 spectrum for all the basic parameters of the solar cell. to calculate the parameters for blackbody radiation, only the am1.5 spectrum data has to be replaced with the blackbody formula. while there is not much difference in voc and ff, a clear difference between the generated currents can be observed. this is a result of the higher number of photons in the blackbody radiation. the efficiency of the solar cell is higher for am1.5 despite the lower current, because the incident power density is lower. in table 1 we present the comparison between theoretical efficiency limits and the achieved record efficiencies for crystalline silicon (c-si) and crystalline gallium arsenide (gaas) solar cells. columns 3 and 4 show that the material properties are very important in determining the limiting efficiency of solar cells, while record solar cells are still some way below the theoretical limits. the j-v characteristics of the record and the ideal solar cell for both c-si and gaas are presented in fig. 3. since there is no j-v data for the 28.8% gaas solar cell, that of the 27.6% solar cell [14] is used instead. the record c-si cell exhibits better utilization of incident photons (higher jsc compared to jsc_ideal), while the record gaas cell exhibits better utilization of photovoltage (higher voc compared to voc_ideal).
in both cases, and in particular in thin-film solar cells, there is room for further improvements [11].

table 1 comparison between efficiencies for si and gaas solar cells, solar spectrum am1.5

material | eg | limit at eg | limit for the material | record cell [13]
c-si | 1.12 ev | 33.2% | 29.8% | 25.6%
gaas | 1.42 ev | 33.4% | 33.4% | 28.8%

fig. 3 i-u characteristics for the ideal and record c-si (a) and gaas (b) solar cell at stc (am1.5, tc = 25°c)

3. loss mechanisms

as shown in the previous section, the efficiency limit is 33.8%. where is the rest of the power lost? in this section we explain the intrinsic losses in a solar cell. since the number of absorbed incident photons is strongly related to the band gap, the biggest losses are spectral losses. other losses, such as emission, carnot and boltzmann losses, also contribute to the lower efficiency of solar cells.

3.1. spectral losses

the band gap is the most important parameter in determining efficiency. photons with energy below the band gap do not have enough energy to generate an electron-hole pair and are therefore transmitted rather than absorbed. such losses are named below band gap losses. they can be calculated by the following equation, where we integrate the am1.5 spectrum over all energies below the band gap:

[equation (10)]

photons with energy above the band gap are absorbed in the active layer, creating free electron-hole pairs. the excess energy, the difference between the photon's energy and the band gap, is however lost in a thermalization process in which the generated electron relaxes to the edge of the conduction band. such losses are named above band gap losses or thermalization losses:

[equation (11)]

3.2. emission loss

the free electrons generated by the incoming photons are not stable and eventually drop back to the valence band, where they recombine.
recombination results in a phonon if it is non-radiative or in a photon if it is radiative. here we assumed that only radiative recombination occurs in the solar cell, as it is unavoidable due to the detailed balance principle, where everything absorbed has to be emitted. the emission loss can be calculated as the radiation from a blackbody at the maximum power point:

[equation (12)]

3.3. energy less than band gap

ideally the open-circuit voltage would be equal to the band gap. however, in practice the open-circuit voltage is lower and corresponds to the potential difference between the quasi-fermi levels, while the voltage at the maximum power point is lower still. this is a result of the carnot and boltzmann factors. the carnot factor appears because the conversion from thermal to electrical work needs some energy [15], while the boltzmann factor is a consequence of unequal solid angles of absorption and emission:

[equation (13)]

[equation (14)]

the symbol ωe denotes the solid angle of emission and ωa the solid angle of absorption. their values are π and 6.8221e-5, respectively.

fig. 4 losses in solar cell for am1.5 at tc = 0 k (a) and tc = 298.15 k = 25°c (b)

the structure of the losses is presented in fig. 4. first we assumed the temperature of the solar cell to be 0 k. by observing equations 1, 12, 13 and 14, we see that at tc = 0 k the emission, carnot and boltzmann losses are all equal to 0. the only loss mechanism present is the spectral loss, which is not temperature dependent, so the efficiency increases. the maximum efficiency is now 49.1% at 1.14 ev. this state is shown in fig. 4 a. second, we calculated the losses for stc conditions, which require a solar cell temperature of 25°c (298.15 k). the result is shown in fig. 4 b. the spectral losses make up most of the losses. thermalization losses decrease with the band gap, while below band gap losses increase because fewer photons are absorbed.
the emission loss represents only a small fraction, while carnot and boltzmann losses are not insignificant. the maximum efficiency at tc = 25°c is 33.8% at 1.34 ev.

conclusions

we have calculated the efficiency limit of single junction solar cells for the standard solar spectrum am1.5 under stc conditions. the peak efficiency is 33.8% at 1.34 ev. the basic solar cell parameters (efficiency, jsc, voc and ff) were derived and shown as a function of band gap. the comparison between blackbody radiation and the solar spectrum am1.5 was also shown to emphasize the difference between the two spectra. in addition, intrinsic losses in solar cells were explained and discussed. spectral losses, due to unabsorbed photons with energy below the band gap or the thermalization of absorbed photons, contribute to an over 50% drop in efficiency. attention was also paid to the case tc = 0 k, where only spectral losses are present, which raises the efficiency limit under the am1.5 spectrum to 49.1%.

acknowledgement: we would like to thank j.r. sites and b. lipovšek for fruitful discussions. the work has been funded by the slovenian research agency under the research programme p2-0197.

references [1] w. shockley and h. j. queisser, “detailed balance limit of efficiency of p‐n junction solar cells,” j. appl. phys., vol. 32, no. 3, pp. 510–519, mar. 1961. [2] g. araujo and a. marti, “absolute limiting efficiencies for photovoltaic energy-conversion,” sol. energy mater. sol. cells, vol. 33, no. 2, pp. 213–240, jun. 1994. [3] t. tiedje, e. yablonovitch, g. d. cody, and b. g. brooks, “limiting efficiency of silicon solar cells,” ieee trans. electron devices, vol. 31, no. 5, pp. 711–716, may 1984. [4] l. c. hirst and n. j. ekins-daukes, “fundamental losses in solar cells,” prog. photovolt. res. appl., vol. 19, no. 3, pp. 286–293, may 2011. [5] c. h. henry, “limiting efficiencies of ideal single and multiple energy gap terrestrial solar cells,” j. appl. phys., vol. 51, no.
8, pp. 4494–4500, aug. 1980. [6] o. d. miller, e. yablonovitch, and s. r. kurtz, “intense internal and external fluorescence as solar cells approach the shockley-queisser efficiency limit,” ieee j. photovolt., vol. 2, no. 3, pp. 303–311, jul. 2012. [7] e. yablonovitch and g. d. cody, “intensity enhancement in textured optical sheets for solar cells,” ieee trans. electron devices, vol. 29, no. 2, pp. 300–305, feb. 1982. [8] a. j. nozik, m. c. beard, j. m. luther, m. law, r. j. ellingson, and j. c. johnson, “semiconductor quantum dots and quantum dot arrays and applications of multiple exciton generation to thirdgeneration photovoltaic solar cells,” chem. rev., vol. 110, no. 11, pp. 6873–6890, nov. 2010. [9] t. trupke, m. a. green, and p. würfel, “improving solar cell efficiencies by up-conversion of sub-bandgap light,” j. appl. phys., vol. 92, no. 7, pp. 4117–4122, oct. 2002. [10] t. trupke, m. a. green, and p. würfel, “improving solar cell efficiencies by down-conversion of highenergy photons,” j. appl. phys., vol. 92, no. 3, pp. 1668–1674, aug. 2002. [11] “solar spectral irradiance: air mass 1.5.” [online]. available: http://rredc.nrel.gov/solar/spectra/am1.5/. [accessed: 19-mar-2014]. [12] “panasonic hit(r) solar cell achieves world’s highest energy conversion efficiency of 25.6% at research level | headquarters news | panasonic global.” [online]. available: http://panasonic.co.jp/ corp/news/official.data/data.dir/2014/04/en140410-4/en140410-4.html. [accessed: 08-jul-2014]. 638 m. jošt, m. topič [13] m. a. green, k. emery, y. hishikawa, w. warta, and e. d. dunlop, “solar cell efficiency tables (version 44),” prog. photovolt. res. appl., vol. 22, no. 7, pp. 701–710, jul. 2014. [14] b. m. kayes, h. nie, r. twist, s. g. spruytte, f. reinhardt, i. c. kizilyalli, and g. s. higashi, “27.6% conversion efficiency, a new record for single-junction solar cells under 1 sun illumination,” in 2011 37th ieee photovoltaic specialists conference (pvsc), 2011, pp. 000004–000008. 
[15] e. fermi, thermodynamics. courier dover publications, 1956. instruction facta universitatis series: electronics and energetics vol. 32, n o 1, march 2019, pp. 75-89 https://doi.org/10.2298/fuee1901075u composite milliohm-meter for resistance measurement of precision current shunts in industrial environment * marjan urekar, đorđe novaković, nemanja gazivoda university of novi sad, faculty of technical sciences, department of power, electronic and telecommunication engineering, novi sad, serbia abstract. a composite milliohm-meter (com-mohm), based on standard four-wire kelvin resistance measurement method, was developed for measurements of precision current shunts used in an industrial environment. the system is comprised of a precision, temperature compensated, low drift, high stability 100 ma current source and a 4 ½ digit commercial multimeter. due to the set of specific demands and conditions concerning the industrial applications of precision current shunts, standard measurement equipment and methods could not be implemented. the composite resistor standard was used for temperature stabilization of the precision current source based on precision voltage reference ref102. the measurement precision of ±0.1 milliohms is observed during measurement of 20 milliohm current shunts. key words: calibration, milliohm-meter, precision current source, four-wire resistance measurement, current shunt, measurement uncertainty 1. introduction this paper is an extension of the conference paper [1] and the lecture [2]. it is based on the experience in solving the specific task described, which required a unique solution within the projected budget and adapting to working conditions in industrial environment. laboratory for metrology and chair for electrical measurements at the faculty of technical sciences in novi sad have continuing cooperation with companies in the field of instrumentation, metrology and control of industrial processes. 
user demands often require finding non-standard solutions and constant innovation in the field of electrical measurement, as described in [3] and [4].

received february 28, 2018; received in revised form july 5, 2018
corresponding author: marjan urekar, university of novi sad, faculty of technical sciences, department of power, electronic and telecommunication engineering, trg dositeja obradovića 6, novi sad, serbia (e-mail: urekarm@uns.ac.rs)
* an earlier version of this paper was presented at the 61st national conference etran 2017, kladovo, serbia, june 5-8, 2017. [1]

2. problem description

for the development of precision cnc machine motors and other devices where measurement of direct current (dc) with high accuracy is required, an automated 16-channel control system has been developed within a company in the field of automation of industrial processes. an industrial system for precision motor control is shown in fig. 1. each system control channel consists of:

1. microprocessor control module (um), common to all 16 channels.
2. high-resolution current driver (ps), digitally controlled by a microprocessor system which is part of the um.
3. precision cnc motor (m) being controlled by current im.
4. serial shunt resistor (rs) with 20 mω resistance and tolerance of ±0.5 %.
5. module for monitoring (mm) of the voltage drop um on the shunt, with feedback to the um.
6. external process control unit (e), the control desk.

fig. 1 block diagram of an industrial system for precision motor control

the control module um receives a command from the control panel e and adjusts the corresponding digital signal that sets the required dc current im of the power driver ps. this current drives the motor m and creates a voltage drop um on the precision current shunt rs, which is measured at one of the 16 analog inputs of the monitoring module mm.
using a/d conversion, this voltage is converted into a digital signal and forwarded to um, creating a voltage feedback line. this signal is compared with the value set by the user, and the output of the ps is adjusted to the required value if necessary. the projected accuracy of the system is ±0.1 % of the current im.
3. problem analysis and standard solutions
the company noted a measurement-related problem when producing precision low-resistance current shunts. this resistance has to be measured with accuracy better than ±0.5 %. a task was defined to create an instrument that will successfully measure the resistance of a shunt during its production. the voltage measurement range of the monitoring module is 200 mv, so for an operating current of up to 10 a, it is necessary to produce a shunt resistor of (20 ± 0.1) mω. resistance measurement with resolution of 100 μω is an extremely complex requirement, even for the most modern instruments, [5] and [6]. a high performance laboratory multimeter standard, such as fluke 8846a with 6 ½ digits resolution, at the lowest ohms range of 10.00000 ω, has a sensitivity of 10 μω [7]. the price of this instrument is over $2000. procurement of this (or a similar) instrument was not an option for the company, because the budget for the realization of this instrument was about 10 times lower.
another significant problem emerged from the analysis of the production process of mounting shunts (a shunt with associated mounting bracket, which also has its finite contact resistance). the mounting shunt consists of a piece of copper alloy (shunt), which is mounted in the brackets. the shunt is fixed with screws passing through the top of the bracket and pressing the copper against the contacts at the bottom of the carrier, as in fig. 2. each contact has a soldering point attached to it, enabling connection to the rest of the circuit.
for brevity, the mounting shunt is referred to simply as the shunt.
fig. 2 mounting shunt components
pieces of copper alloy are procured prefabricated to the resistance of (15 ± 5) mω. it is possible to obtain prefabricated pieces of alloy of higher accuracy, but the price is exponentially higher. the company decided to purchase cheaper products, expecting an "easy, quick and cheap" process of adjustment to the desired tolerance. in order to achieve the desired accuracy of the shunt, the copper piece is mounted in the bracket and its resistance is manually increased by cutting into the body of the alloy and by filing the surface of the shunt with a fine diamond-coated file. during this final treatment, the shunt resistance is continuously measured, i.e. dc current is continuously passed through it. this is the reason why two basic methods of measuring resistance are not possible in this case.
the first method, using an industrial milliohm-meter [8], implies that during handling of the shunt and its processing, a constant high current (up to 2 a) is passed through it, which can endanger the safety of workers and the surrounding workplace, consumes a lot of energy and leads to a large thermal dissipation.
another method is the wheatstone bridge, [9] and [10], one of the most commonly used bridge methods. even today, high precision measurement bridges are produced to cover a wide range of resistances (from µω to gω), for industrial and laboratory applications, e.g. [11] and [12]. with three resistors of a known resistance, the value of the fourth (unknown) resistance can be determined very precisely, as in fig. 3. the values of the resistors r1 and r3 are known and fixed, while the resistance rx is measured. the bridge is in balance when there is no current flowing through the galvanometer g in the diagonal of the bridge. the balance condition is simply that the product of r1 and rx must be equal to the product of r2 and r3.
this is achieved by using a variable resistor r2 for bridge balancing. if the resistors r1 and r3 have the same resistance value, the bridge is in balance when ig = 0, and resistance r2 corresponds to the value of the unknown resistance rx. the value of the unknown resistance can be determined as:
$r_x = \frac{r_2 r_3}{r_1}$ (1)
fig. 3 the wheatstone bridge
in this case, the measurement would seem to be possible by simply selecting equal values of the resistors r1 and r3, with the value of r2 adjusted to (20 ± 0.1) mω. it turns out that it is not possible to perform such delicate measurements due to mechanical vibrations during the manual processing, as there are sudden changes in the contact resistance that cause momentary current overload of g, which has very high sensitivity (in μa). if r2 were used as a variable resistor, the measurement would be significantly more difficult due to the mechanical properties of the variable resistor.
the wheatstone bridge measurement process in an industrial-grade automated system is done in the following way: the microcontroller (mc) changes dozens of r2 resistance values while measuring voltage udb on the bridge. this voltage is converted by an a/d converter (adc) into a digital signal for the mc. using the method of successive approximations, the correct value of r2 is found when the bridge response is near the zero level (i.e. the bridge is in balance). it is necessary to ensure that the adc can read very small voltage changes, which is only possible with a high-resolution σ-δ adc. the price of the whole system would significantly exceed the budget limit defined for the project if expensive σ-δ adcs were used.
fig. 4 shows the standard 2-wire u/i method of resistance rx measurement, integrated into a digital multimeter (dmm), [6].
the current source (cs) in the instrument generates a constant current that passes through the measured resistor, creating a voltage drop ux measured by the digital voltmeter (v) in the instrument itself. this voltage drop is proportional to the measured resistance rx [9].
fig. 4 block diagram of resistance measurement with standard 2-wire u/i method using dmm
the problem with lower-grade multimeters is that the minimum resistance measurement range is usually 200 ω. it is a regular practice among technical staff and novice engineers to use the 2-wire method regardless of the resistance measured and the available measurement ranges. this is a common mistake due to the lack of formal education in metrology, inexperience in the field and especially bad practices that continue to be implemented due to the human factor.
one of the main characteristics of a dmm is the number of digits and the number of counts of its display resolution. the cheapest and most commonly used are 3 ½ digits instruments, while 3 ¾, 3 ⅚, 4 ½, 4 ¾, 6 ½, etc. digits are also commercially available. the number of counts can be determined from the number of digits, e.g. a 3 ½ digits multimeter can display 3 full digits. the ½ indicates that the most significant digit has the maximum value of 1 (out of 2) that can be displayed. the maximum value that can be displayed on this multimeter is 1999. with the zero count counted in, we say that there are 2000 counts in total. in the same way, the number of counts for other multimeters is determined: 4000 for 3 ¾, 6000 for 3 ⅚, 20000 for 4 ½, etc. with an increase in resolution, i.e. the number of counts, the price of a multimeter also increases. the resolution of a dmm can be determined by dividing the measurement range value by the number of counts.
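The range-over-counts rule can be illustrated with a short Python sketch. It assumes the 200 Ω lowest range mentioned above; the 6 ½ digit count of 2,000,000 is an assumption that simply extends the same counting rule, and the function name is illustrative only.

```python
# Counts per display format; the first three follow the text,
# the 6 1/2 digit entry is an assumption extending the same rule.
COUNTS = {"3 1/2": 2000, "3 3/4": 4000, "4 1/2": 20000, "6 1/2": 2000000}

LOWEST_RANGE_OHM = 200.0  # typical lowest ohms range of a low-cost DMM (from the text)


def resolution_ohm(range_ohm: float, counts: int) -> float:
    """Smallest displayable step: measurement range divided by number of counts."""
    return range_ohm / counts


for digits, counts in COUNTS.items():
    step = resolution_ohm(LOWEST_RANGE_OHM, counts)
    print(f"{digits} digits: {counts} counts, {step * 1e3:g} mOhm per count")
```

Under these assumptions only the 6 ½ digit format reaches a step of 0.1 mΩ on a 200 Ω range.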
since the low-ohm measurements are carried out in the lowest range, for the 3 ½ digits instrument this means a resolution of 100 mω, and 10 mω for 4 ½ digits, which is insufficient to measure (20 ± 0.1) mω. only a 6 ½ digits instrument can achieve the 100 μω resolution, but the price of such instruments exceeds the projected budget. also, the resistance measurement error of such instruments is 0.5 % to 5 %, depending on the measurement range, thus they are completely inadequate for the purposes of direct measurement of a high-precision shunt. if a more expensive laboratory-class instrument with adequate measurement characteristics were used, the problem of instrument thermal overload would arise due to the longer measurement periods with high cs current [13].
milliohm resistance is, effectively, a short circuit for any instrument, which in this case measures in the lowest ohms range with the highest available current of the current source. if this current is small (on the order of ma), it produces a voltage drop on the shunt (in μv) that is too small for the instrument to measure. for a milliohm-meter with a current source in the ma or ampere range [5], the manufacturer always specifies a time limit: the maximum measurement time is 10 to 30 seconds [8]. a large cs current creates high thermal dissipation within the instrument case, inevitably leading to thermal overload of the device and its failure. in addition, when the current flows through the resistor, its resistance changes due to the temperature increase. for a copper alloy conductor with resistance of 20 mω at 20 ºc, when exposed to 50 ºc its resistance changes by 242.46 μω, exceeding the error limit. therefore, it is necessary to avoid large temperature changes in order to maintain the resistance value within the given limits. for the above reasons, the maximum recommended current for measuring resistance is 100 ma [14].
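The temperature example above can be turned into a small worked calculation. The sketch below assumes a linear resistance model, ΔR = R0·α·ΔT, and back-computes the temperature coefficient α from the figures quoted in the text; α is therefore an inferred value, not one stated by the authors, and all names are illustrative.

```python
R20_OHM = 20e-3              # shunt resistance at 20 °C (from the text)
DELTA_R_AT_50C = 242.46e-6   # resistance change quoted for 50 °C (from the text)
TOLERANCE_OHM = 0.1e-3       # allowed deviation of the shunt, (20 ± 0.1) mΩ


def implied_alpha(r0: float, delta_r: float, delta_t: float) -> float:
    """Temperature coefficient implied by a linear model dR = r0 * alpha * dT."""
    return delta_r / (r0 * delta_t)


def resistance_change(r0: float, alpha: float, delta_t: float) -> float:
    """Resistance change for a temperature step under the same linear model."""
    return r0 * alpha * delta_t


alpha = implied_alpha(R20_OHM, DELTA_R_AT_50C, 50.0 - 20.0)
max_delta_t = TOLERANCE_OHM / (R20_OHM * alpha)

print(f"implied alpha: {alpha:.3e} per °C")
print(f"temperature rise that uses up the 0.1 mΩ tolerance: {max_delta_t:.1f} °C")
```

With these numbers, a rise of roughly 12 °C would already consume the whole tolerance band, which is consistent with the text's advice to avoid large temperature changes.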
this option is not available in most mid-range multimeters powered by regular batteries, which limit the maximum cs current. continuous use of such instruments on the lowest resistance measurement range results in only several hours of autonomous work on standard batteries. frequent replacement of batteries would quickly result in a significant increase of the total cost of the device.
the standard method for measuring resistance in modern instruments is the ratiometric method, [14] and [15]. while regular absolute (direct) measurements are simpler (such as the u/i method), the ratiometric method performs two (or an even number of) measurements, and the desired value is calculated from the ratio of those measurements. most often, one value is taken as a reference, in relation to which the second value is measured. the advantage of this method is that the measured values mutually depend on one another, so a change in one value is automatically reflected in the other value, while their ratio remains unchanged. in this way, the potential sources of error such as humidity, temperature, pressure etc. are eliminated by calculation alone. in this method, the reference resistor of a known value and the unknown resistor (of the same order of magnitude as the reference resistor) are connected in series and form a voltage divider with a known input voltage across them. a precision voltmeter measures the voltage drop on each resistor and the voltage ratio is compared with the resistance ratio, a method shown to have greater accuracy than a direct measurement of resistance with the conventional u/i method, [14]. the problem with this method occurs when measuring low resistances: for the series connection of two 20 mω resistors, a voltage source of 1 v has to produce a current of 25 a, which could endanger the safety of the user.
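A minimal Python sketch of the ratiometric idea and of the low-resistance problem follows. The function names are illustrative and the voltmeter is assumed ideal; the numbers are the ones used in the text.

```python
def ratiometric_resistance(r_ref: float, u_ref: float, u_x: float) -> float:
    """Unknown resistance from the voltage ratio across two series resistors.

    The same current flows through both resistors, so u_x / u_ref = r_x / r_ref.
    """
    return r_ref * u_x / u_ref


def series_current(u_source: float, r_ref: float, r_x: float) -> float:
    """Current forced through the series pair by a voltage source."""
    return u_source / (r_ref + r_x)


R_REF = 20e-3   # reference resistor, ohms
R_X = 20e-3     # resistor under test, ohms
U_SOURCE = 1.0  # source voltage from the example in the text, volts

i_series = series_current(U_SOURCE, R_REF, R_X)   # about 25 A
p_each = i_series ** 2 * R_X                      # about 12.5 W per resistor

print(f"series current: {i_series:.1f} A, dissipation per resistor: {p_each:.1f} W")
print(f"recovered r_x: {ratiometric_resistance(R_REF, 0.5, 0.5) * 1e3:.1f} mOhm")
```

The ratio itself is insensitive to the exact source voltage, but the current needed to get a usable voltage drop across milliohm resistors is what makes the method impractical here.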
such a source would be expensive, and with a power dissipation of 12.5 w it would also burn out a low-power resistor. the finite resistance of the test leads connecting the shunt to the multimeter [15] introduces an additional error in measurement. this resistance can be of the same magnitude as the shunt resistance, but also up to ten times higher if low-quality test leads are used. when measuring low resistances, intermittent contacts increase the error, mainly at the cable-to-shunt connections and the instrument input connectors, as in fig. 4. the total resistance of the test leads and contact resistances can be up to 1 ω, which is unacceptable for this type of measurement. even with high-grade instruments such as the fluke 87v [16], test lead resistance is about 100 mω, five times larger than the shunt resistance. this problem is usually resolved by so-called relative "zeroing": the ends of the test leads are shorted together and the instrument is adjusted to display a zero reading. the measured test lead resistance is stored in memory and simply subtracted from every subsequent measurement, [17]. this solves the problem only to a certain degree, as the resistance of the test leads and contacts varies over time: it changes with current, physical position, temperature, etc. hence, the "zeroing" is also inadequate for precise µω measurements [6], as contact resistance changes even with minor shifting of the contacts by the user. during the manual shunt processing stage (filing and cutting), the operator exerts mechanical pressure on the shunt, creating significant vibrations and stress in the resistive material that instantly change the resistance of the contacts with each attempt to process the material. regular and periodic "zeroing" of the test leads would be required, complicating the device design and affecting its price and measurement time.
the problem of wide-range variation of ambient temperature in an industrial environment is always present, affecting resistors and current sources (cs are used in temperature sensors very often due to their temperature sensitivity) and lowering the measurement accuracy, [14]. a typical current source error is 0.01 %/°c. a temperature variation of ±5 °c would make this source too unstable for low resistance measurement.
4. description of the optimal solution
after analyzing all known problems and limitations, it was decided to create a composite milliohm-meter (com-mohm), made up of several separate modules, based on the kelvin (four-wire) version of the standard u/i method for resistance measurement, as shown in fig. 5. this example is well known in measurement theory: the current source dictates the current ics through the resistor, and the corresponding voltage drop ux is measured with voltmeter v. the value of the measured resistance rx is obtained as:
$r_x = \frac{u_x}{i_{cs}}$ (2)
fig. 5 kelvin's four-wire method of measuring resistance
the most important condition in this method is to use two pairs of cables (four wires), a voltage pair (for the voltmeter) and a current pair (for the current source), with current and voltage terminals connected directly at the ends of rx, in order to eliminate the influence of finite cable and contact resistances. the current through resistor rx is calculated from the current divider equation:
$i_x = i_{cs} \frac{r_v}{r_x + r_v}$, (3)
where rv represents the input resistance of the voltmeter (multimeter). the input resistance of a dmm depends on the adc within the instrument and the resistance of the measurement range voltage divider. the cheapest digital 3 ½ digits multimeters, based on the icl7106 or a similar adc, have an input resistance of 1 – 10 mω (megohms). higher-grade instruments have 10 – 100 mω (megohms) input resistance, reaching up to several gω on the lowest voltage range.
the lowest measurement range of these instruments is usually the basic voltage range, where the multimeter input is connected directly to the adc input with gω input resistance. this high input resistance of a dmm results in a minor bias current flowing into the instrument. for the current source of 100 ma, the multimeter bias current is in the pa range, not affecting measurement results. if a pair of cables with cross-sections of 0.5 mm² and 1 m of length were used for the measurement, the resistance would be about 72 mω, much higher than the 20 mω shunt resistance. in most practical cases, the real resistance being measured with this method is rm, as in (4): rx plus the sum of all resistances of test leads (rmc) and the sum of all contact resistances (rcr).
$r_m = r_x + \sum_i r_{mc,i} + \sum_j r_{cr,j}$ (4)
the advantage of the kelvin method is that very long connecting cables can be used (5-10 meters, or even more), which is a common case for an industrial setting, [6] and [9]. the main problems of this approach are:
1. a current source of 100 ma with stability (precision) of 0.05 % is needed. this level of precision is only available in integrated current sources with an output range of 100 μa to 10 ma, [14]. most of the integrated current sources over 10 ma have at least ±1 % error, such as ref200 [18] and lm334 [19] made by texas instruments.
2. the current of 100 ma produces only 2 mv at 20 mω resistance. in order to achieve the desired accuracy and precision, an instrument capable of reading 2 mv ± 0.05 %, i.e. with 1 μv resolution, must be used.
3. the best instrument that could be obtained for this purpose was the uni-t ut61e [20] with resolution of 4 ½ digits (maximum display of 22000 counts). the lowest dc voltage range of this multimeter is 220 mv, with resolution of 10 μv, 10 times worse than needed.
the optimal solution was found in the modification of the kelvin 4-wire method in the form of a composite milliohm-meter (com-mohm), as seen in fig. 6.
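The figures in this passage can be reproduced with a short Python sketch. The copper resistivity of 0.018 Ω·mm²/m is an assumed handbook-style value, chosen because it reproduces the 72 mΩ quoted for a pair of 1 m leads; the function and constant names are illustrative only.

```python
RHO_CU = 0.018  # assumed copper resistivity in ohm*mm^2/m (not stated in the text)


def lead_pair_resistance(length_m: float, area_mm2: float) -> float:
    """Resistance of a pair of test leads, i.e. two conductors in series."""
    return 2 * RHO_CU * length_m / area_mm2


R_SHUNT = 20e-3     # shunt resistance, ohms
I_SOURCE = 100e-3   # measurement current, amperes
STABILITY = 0.05e-2 # required relative precision, 0.05 %

r_leads = lead_pair_resistance(1.0, 0.5)  # about 72 mOhm
two_wire_reading = R_SHUNT + r_leads      # what a 2-wire measurement would see
u_shunt = I_SOURCE * R_SHUNT              # voltage sensed in the 4-wire setup
u_step_needed = u_shunt * STABILITY       # voltmeter resolution this implies

print(f"lead pair: {r_leads * 1e3:.0f} mOhm, 2-wire reading: {two_wire_reading * 1e3:.0f} mOhm")
print(f"shunt voltage: {u_shunt * 1e3:.1f} mV, needed resolution: {u_step_needed * 1e6:.1f} uV")
```

In the four-wire arrangement the lead resistance drops out of the reading, so the remaining difficulty is the microvolt-level voltage resolution, which is the point of items 2 and 3 above.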
the current source uses a double servo loop. a servo loop is a system in which its parts are adjusted automatically to keep the response within certain limits. in this case, one servo loop is made with operational amplifier u2, which corrects the level of gnd voltage for the precision reference u1, depending on the monitored current at the output of the current source. the other servo loop, with u3, ensures temperature stability of the current source, which is set to iref = 100 ma. for an rx = 20 mω shunt with tolerance of δrx/rx = 0.5 %, it is possible to determine the maximum variation of the voltage ux measured with voltmeter v:
$u_x \pm \Delta u_x = 100\ \mathrm{mA} \cdot (20\ \mathrm{m\Omega} \pm 0.1\ \mathrm{m\Omega}) = 2\ \mathrm{mV} \pm (100\ \mathrm{mA} \cdot 0.1\ \mathrm{m\Omega}) = 2\ \mathrm{mV} \pm 10\ \mu\mathrm{V}$ (5)
the maximum variation of the shunt resistance corresponds to the smallest quantum (resolution) of the dc voltage reading of the ut61e. the declared multimeter error in this measurement range is expressed as 0.1 % of reading + 5 digits. this equals an error of 52 μv for a 2 mv reading, rendering it impossible to measure 10 μv of voltage change. however, it was found that the ut61e actually possesses a sensitivity of 10 μv, in the form of the last figure on the display accurately changing in these increments. this was verified by changing the calibrator standard voltage in 10 μv steps, observing the ut61e display and checking it against the fluke 8846a multimeter. this allowed the optimal solution to be found in the form of relative measurements against a transfer standard, according to the principles given in [4], [9], [10] and [21]. a prototype of the shunt standard r0 was made, with its resistance adjusted to 20 mω ± 10 μω using the fluke 8846a. this standard then serves to determine what voltage the ut61e shows as reference (uvref) when the resistance of the shunt is "exactly" 20 mω. when the ut61e is used in com-mohm, a shunt resistance change of ±0.1 mω produces a reading change of ±1 digit (i.e. uvref ± 10 μv) in relation to the value obtained for r0.
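The relative (go/no-go) use of the multimeter can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' procedure: readings are expressed in display counts of 10 µV, the reference count is whatever the meter shows for the standard r0, and all function names are hypothetical.

```python
I_REF = 100e-3   # source current, amperes
DMM_STEP = 10e-6 # one display count on the lowest range, volts


def declared_error(reading_v: float, pct: float = 0.1, digits: int = 5) -> float:
    """Declared DMM error: a percentage of reading plus a number of counts."""
    return reading_v * pct / 100 + digits * DMM_STEP


def counts_per_tolerance(r_tol_ohm: float) -> float:
    """Number of display counts produced by a resistance change of r_tol_ohm."""
    return I_REF * r_tol_ohm / DMM_STEP


def within_tolerance(reading_counts: int, ref_counts: int, max_counts: int = 1) -> bool:
    """Indicator logic: accept if the reading is within max_counts of the reference."""
    return abs(reading_counts - ref_counts) <= max_counts


print(f"declared error at 2 mV: {declared_error(2e-3) * 1e6:.0f} uV")
print(f"counts per 0.1 mOhm: {counts_per_tolerance(0.1e-3):.0f}")
print(within_tolerance(201, 200), within_tolerance(202, 200))
```

The declared absolute error is far larger than one count, which is why the meter is compared against the reading for r0 rather than trusted as an absolute instrument.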
it can be concluded that the multimeter here serves only as an indicator, indicating whether the measured resistance is or is not within tolerance limits. if it is, its value cannot be accurately determined. this represents sufficient functionality of the instrument for the required task, at the lowest price. also, it is simple enough to be handled by untrained technical personnel in production, making it the best possible solution for the given problem.
fig. 6 composite milliohm-meter (com-mohm) scheme
5. practical realization
during the design process of com-mohm, various design inputs from [14], [15], [22], [23], [24], [25] and [26] were used. the fully functional com-mohm is given in fig. 6. its main block is the current source with the u1 precision voltage reference ref102 [24], with a voltage output of 10 v ± 2.5 mv and a temperature coefficient of 2.5 ppm/°c that ensures negligible change in current, even at the large external temperature fluctuations that are typical for the industrial environment. the precision reference allows a single supply voltage from 11.4 v to 36 v, enabling battery-powered operation. its low drift and high stability output is impervious to large input voltage swings. in addition, its long-term stability is estimated at 5 μv over 1000 hours of work. the output voltage is passed to operational amplifier u3 and transistor q1, which serves as a current amplifier in active mode, [22] and [23], since the voltage reference has a maximum output current of only 10 ma. from the q1 emitter, 100 % feedback of dc voltage is applied back to the inverting input of u3. since the non-inverting input of u3 is at vref potential, and an operational amplifier always forces its "+" and "-" inputs to equal voltage levels, it is clear that the voltage at the q1 emitter will be vref.
this eliminates the variation of the transistor base-emitter voltage vbe due to temperature change, a prerequisite for stable current amplification. vbe variation affects base and collector currents, as well as the current through re, resulting in a large measurement error. the temperature coefficient of vbe is -2 mv/°c. q1 is a standard power npn transistor tip47 [27], rated 40 w with a maximum current of 1 a, in a to-220 package, which can be easily mounted on a metal heat sink if necessary.
proper selection of operational amplifiers u2 and u3 is one of the most critical aspects of the entire system. opa1013 was chosen, with its small offset voltage drift, typically around 0.5 μv, providing stability of the current source. another important factor is its low input bias current, around 10 na, ensuring that the current through re is identical to the current through the measured resistor rx. the temperature coefficient is 2.5 μv/ºc, which allows stable operation in a wide range of temperatures.
u2 (opa1013) is configured as a voltage buffer, forcing the voltage on the gnd reference pin to be the same as the voltage across rx (the voltage between the non-inverting u2 input and ground), [25] and [26]. as a result, the current iref through rx is constant and defined as:
$i_{ref} = \frac{v_{ref}}{r_e}$ (6)
vref is the reference output voltage and re is the emitter resistor, adjusted to obtain the correct current. a sudden change in the resistance of rx or re results in a sudden change of the voltage value at the u2 "+" input due to the flow of constant current iref:
$\Delta v = i_{ref} \cdot \Delta r_x$ (7)
while the first servo loop has full feedback that ensures temperature stability of the q1 current amplifier, the second servo loop provides constant current through re, regardless of the rx value. the second servo loop in the feedback line, also in the form of a voltage buffer, manages the potential of the gnd input of the voltage reference, thus controlling the output voltage.
we can say that the current source is stabilized with two separate feedbacks, in order to compensate for two different sources of error. if the voltage at the "+" input of u2 increases or decreases, the potential of the gnd input of the voltage reference will also increase or decrease. the voltage reference ensures that the potential difference between its output and the gnd pin is constant at 10 v. this ensures that the voltage across re is also exactly 10 v, regardless of the voltage drop on rx. this results in the effect of a "floating" voltage, relative to the battery ground potential, but constant at the ends of re. the voltage on the resistor re remains unchanged, resulting in a 100 ma constant current source, dependent only on the value of precision resistor re. the accuracy of the current source is directly related to re accuracy, and its stability (precision) relates to the temperature stability of re.
the value of resistor re is calculated from (6) as the ratio of voltage reference vref = 10 v and constant current iref = 100 ma. the nominal value of the resistor re is 100 ω. the constant current of 100 ma generates thermal dissipation of 1 w at the resistor. this amount of heat inevitably leads to the effect of self-heating and resistance change [3], affecting the resistance by at least one order of magnitude more than the environment temperature change. in order to preserve the stability of the current reference, re is constructed as a composite resistor standard, described in [3], reducing the self-heating resistance change to zero. metal-film resistors rnmf = 400 ω were selected (serial connection of two resistors rnmf1 = rnmf2 = 200 ω) and carbon resistors rncf = 200 ω, connected according to the equivalent scheme in fig. 7, as a substitute for re, [3]. opposite temperature coefficients of metal-film and carbon resistors provide temperature stability of the current source, eliminating the self-heating effect.
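The sizing of re and its influence on the source current can be checked with a small Python sketch based on equation (6). The drift value used in the example (100.05 Ω) is an illustrative assumption, chosen to show what a 0.05 % change of re does to the current; the names are hypothetical.

```python
V_REF = 10.0   # reference voltage held across re, volts
I_REF = 100e-3 # target source current, amperes

r_e_nominal = V_REF / I_REF  # nominal emitter resistor, ohms
p_r_e = V_REF * I_REF        # power dissipated in re, watts


def current_error_pct(r_e_actual: float) -> float:
    """Relative deviation of the source current when re differs from nominal."""
    return (V_REF / r_e_actual - I_REF) / I_REF * 100


print(f"re = {r_e_nominal:.0f} ohm, dissipation = {p_r_e:.1f} W")
print(f"current error for re = 100.05 ohm: {current_error_pct(100.05):+.3f} %")
```

Because the current follows 1/re directly, a 0.05 % drift in re already consumes the whole ±0.05 mA adjustment window, which motivates the composite resistor construction described above.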
r1 and r2 are selected high-resistance resistors for precision adjustments of the source current to (100 ± 0.05) ma.
fig. 7 composite resistor standard re, equivalent connection scheme
100 µf and 100 nf capacitors in the circuit provide voltage stabilization and filtering. additional stabilization of the re voltage is achieved by the electrolytic 10 μf capacitor. a tantalum capacitor is used, with equivalent series resistance (esr) 10-50 times lower than a conventional aluminum electrolytic capacitor of the same capacitance (around 0.2 ω). tantalum capacitors are thermally stable with low leakage current.
as a milliohm-meter component, the current source must have its own dedicated power source, which should provide a stable voltage (over 12 v), required for the stable operation of ref102. in addition, this power source has to provide 120 ma (for the current source and associated components) for at least 16 hours (two shifts in one working day). the switch s1 turns the battery power on and off.
table 1 comparison of multimeter characteristics
characteristics | ut61e | fluke 8846a
number of digits | 4 ½ | 6 ½
number of counts | 22000 | 1000000
input impedance (gω) | 3 | 10
the lowest dc voltage range (mv) | 220 | 100
voltage resolution at the lowest range (μv) | 10 | 0.1
price (eur) | 60 | 1500
a spare siemens laptop battery is used as the main power source. the battery is composed of eight rechargeable lithium-ion cells, with a voltage of 14.4 v and a capacity of 4400 mah, around two times more than the com-mohm needs. the battery is similar in size to the ut61e multimeter, which makes com-mohm compact and easy to operate and transport. the second component of the com-mohm, the ut61e multimeter, has its own battery supply.
com-mohm was adjusted and calibrated in the laboratory for metrology. the current source is set to (100 ± 0.05) ma by trimming re.
measurements with the fluke 8846a confirmed the accuracy and stability of the current source below ±1 μa. the precision shunt standard r0 was also adjusted and calibrated. a comparison of the uni-t ut61e and the fluke 8846a characteristics is given in table 1.
6. measurement results
com-mohm was installed and used in the factory hall where the first set of 16 shunts was produced. the hall temperature was 5 °c higher than the temperature in the reference laboratory.
fig. 8 graphical comparison of measurement results
table 2 overview of the measurement results
rx # | rm1 (mω) | rm2 (mω) | |rm1-rm2| (µω) | |rm1-rm2|/rm2 · 100 (%)
1 | 19.99 | 19.9976 | 7.6 | 0.0380
2 | 20.00 | 20.0045 | 4.5 | 0.0225
3 | 20.01 | 20.0073 | 2.7 | 0.0135
4 | 20.01 | 20.0191 | 9.1 | 0.0455
5 | 20.00 | 20.0031 | 3.1 | 0.0155
6 | 20.01 | 20.0165 | 6.5 | 0.0325
7 | 19.99 | 19.9919 | 1.9 | 0.0095
8 | 20.00 | 19.9998 | 0.2 | 0.0010
9 | 20.01 | 20.0084 | 1.6 | 0.0080
10 | 20.01 | 20.0069 | 3.1 | 0.0155
11 | 19.99 | 19.9902 | 0.2 | 0.0010
12 | 20.00 | 20.0035 | 3.5 | 0.0175
13 | 19.99 | 19.9922 | 2.2 | 0.0110
14 | 19.99 | 19.9807 | 9.3 | 0.0465
15 | 20.00 | 19.9981 | 1.9 | 0.0095
16 | 20.01 | 20.0092 | 0.8 | 0.0040
all the manufactured shunts were subsequently moved to the laboratory, where they were calibrated after 24 hours of stabilization. the results of field measurements using the ut61e (rm1) and the laboratory measurements with the fluke 8846a (rm2) are given in table 2. the absolute measurement error is determined as the absolute difference of readings between com-mohm and the fluke 8846a. the absolute value of the relative error is also given. the standard deviation for the 16 shunts of the first production series is 9.64 μω.
fig. 8 shows a graphical representation of the measurement results with the upper (20 mω + 10 μω) and the lower (20 mω – 10 μω) range of allowed values. the results seem to agree well; only r4, r6 and r14 are outside the required range.
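The statements about table 2 can be cross-checked with a short Python sketch that uses only the tabulated readings. It assumes that the ±10 µΩ band is applied to the laboratory readings rm2 and that the quoted standard deviation is the population standard deviation of rm2; both are inferences that happen to reproduce the numbers in the text, and the variable names are illustrative.

```python
from statistics import pstdev

# Readings from table 2, in milliohms: field (UT61E) and laboratory (Fluke 8846A).
RM1 = [19.99, 20.00, 20.01, 20.01, 20.00, 20.01, 19.99, 20.00,
       20.01, 20.01, 19.99, 20.00, 19.99, 19.99, 20.00, 20.01]
RM2 = [19.9976, 20.0045, 20.0073, 20.0191, 20.0031, 20.0165, 19.9919, 19.9998,
       20.0084, 20.0069, 19.9902, 20.0035, 19.9922, 19.9807, 19.9981, 20.0092]

NOMINAL_MOHM = 20.0
LIMIT_UOHM = 10.0

# Absolute error in microohms and relative error in percent, row by row.
abs_err_uohm = [abs(a - b) * 1e3 for a, b in zip(RM1, RM2)]
rel_err_pct = [abs(a - b) / b * 100 for a, b in zip(RM1, RM2)]

# Shunts whose laboratory reading falls outside 20 mOhm +/- 10 uOhm.
outside = [n for n, r in enumerate(RM2, start=1)
           if abs(r - NOMINAL_MOHM) * 1e3 > LIMIT_UOHM]

spread_uohm = pstdev(RM2) * 1e3  # population standard deviation, microohms

print("outside the band:", outside)
print(f"largest field/lab difference: {max(abs_err_uohm):.1f} uOhm")
print(f"standard deviation of lab readings: {spread_uohm:.2f} uOhm")
```

All field/laboratory differences stay below 10 µΩ, i.e. well under the ±0.1 mΩ tolerance of the shunt.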
these small discrepancies can be attributed to the change in the copper alloy resistance due to the difference in the temperature of the factory hall and the laboratory, and to the measurement error of the laboratory multimeter.
7. the assessment of the measurement uncertainty
shunt resistor measurement is carried out in two steps [28], as in fig. 9. in the first step, a standard shunt r0 is placed in place of rx and the voltage drop u0 is measured. in the second step, the shunt under test rx is placed in place of r0 and the voltage drop ux is measured. based on these two measurements, the value of the shunt under test is calculated. current i0 is calculated from the first measurement with the shunt standard r0:
$i_0 = \frac{u_0}{r_0}$ (8)
after adding corrections, (8) becomes:
$i_0 = \frac{u_0 + \delta u_0 + \delta u_0 + \delta u_{0stab}}{r_0 + \delta r_0}$ (9)
in step two, the resistance of the shunt under test rx is calculated:
$r_x = \frac{u_x}{i_x}$ (10)
fig. 9 two-step shunt measurement
current ix can be represented as a sum of current i0 and correction i0stab due to i0 instability:
$r_x = \frac{u_x}{i_0 + i_{0stab}}$ (11)
the mathematical model of the measurement result is obtained by combining (9) and (10), according to the gum [29]:
$r_x = \frac{u_x + \delta u_x}{\frac{u_0 + \delta u_0 + \delta u_0 + \delta u_{0stab}}{r_0 + \delta r_0} + i_{0stab}}$ (12)
used notation:
ux: mean value of ux,
δux: measurement result ux correction due to finite voltmeter resolution,
δu0stab: measurement result correction due to instability of measurement of u0,
u0: mean value of u0,
δu0: measurement result u0 correction due to finite voltmeter resolution,
δu0: measurement result u0 correction due to inaccuracy of measurement,
r0: resistance of the standard shunt r0,
δr0: measurement result correction due to inaccuracy of shunt r0,
i0stab: measurement result correction due to instability of current i0,
i: measured current.
measurement uncertainty budget is given in table 3.
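The combined uncertainty reported in table 3 can be cross-checked by taking the root sum of squares of the individual contributions, which is the usual GUM rule for uncorrelated inputs. The sketch below uses the contribution column of table 3 as given; the dictionary keys, in particular `dU0_res` and `dU0_acc` for the two δu0 rows, are hypothetical labels introduced only for this example.

```python
from math import sqrt

# Contributions to the uncertainty of rx, in ohms, from the last column of table 3.
CONTRIBUTIONS_OHM = {
    "Ux": 63.2e-6,
    "dUx": 28.9e-6,
    "dU0stab": 1.3e-6,
    "U0": 63.6e-6,
    "dU0_res": 29e-6,    # correction due to finite voltmeter resolution
    "dU0_acc": 635e-9,   # correction due to inaccuracy of measurement
    "R0": 0.0,
    "dR0": 11.6e-6,
    "i0stab": 5.8e-6,
}


def combined_uncertainty(contributions) -> float:
    """Root sum of squares of uncorrelated uncertainty contributions."""
    return sqrt(sum(c * c for c in contributions))


u_c = combined_uncertainty(CONTRIBUTIONS_OHM.values())
u_k2 = 2 * u_c  # coverage factor k = 2

print(f"combined standard uncertainty: {u_c * 1e6:.1f} uOhm")
print(f"uncertainty for k = 2: {u_k2 * 1e6:.0f} uOhm")
```

The two voltage readings ux and u0 dominate the budget, so averaging more readings would be the most direct way to tighten the result under this model.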
table 3 com-mohm measurement uncertainty budget
symbol | estimate | standard uncertainty | type | distribution | sensitivity | contribution to measurement uncertainty
ux | 2.01 × 10^-3 | 6.32 × 10^-6 | a | normal | 10 | 63.2 × 10^-6
δux | 0 | 2.89 × 10^-6 | b | uniform | 10 | 28.9 × 10^-6
δu0stab | 0 | 127 × 10^-9 | b | uniform | 10 | 1.3 × 10^-6
u0 | 2 × 10^-3 | 6.32 × 10^-6 | a | normal | -10.1 | 63.6 × 10^-6
δu0 | 0 | 2.89 × 10^-6 | b | uniform | -10.1 | 29 × 10^-6
δu0 | 0 | 12.7 × 10^-6 | b | uniform | -50 × 10^-3 | 635 × 10^-9
r0 | 20 × 10^-3 | 0 | b | uniform | 1.01 | 0
δr0 | 0 | 11.5 × 10^-6 | b | uniform | 1.01 | 11.6 × 10^-6
i0stab | 0 | 28.9 × 10^-6 | b | uniform | -201 × 10^-3 | 5.8 × 10^-6
rx | 0.0201 | | | | | 99.4 × 10^-6
expanded measurement uncertainty with coverage factor k=1: ±0.000099 ω
expanded measurement uncertainty with coverage factor k=2: ±0.00020 ω
8. conclusion
the realized composite milliohm-meter system (com-mohm) meets the requirements of stability, accuracy, reliability and low price. the set of hardware components (multimeter, current source and shunt standard) costs about $100, about 20 times less (not including procedures like calibration, adjustments, circuit design, etc.) than a commercial laboratory-grade instrument. the measurement results in the field and laboratory are sufficiently consistent for the stated requirements.
a solution of a specific industrial problem, with numerous constraints, is presented. careful analysis of the problem, examination of all influential parameters, the optimal design of the electronic circuit and the knowledge of how all the relevant blocks of the system function provide a non-standard solution for a specific problem with a favorable benefit/cost ratio. core electrical engineering knowledge is fundamental for bridging the gap between theory and practice, especially in regular industrial environment applications, where both innovative and old (but well-tested) technologies are used in conjunction.
in the present volatile economy, it may become very important to maintain, repair and improve current and "obsolete" technologies, which form the backbone of existing industry, so that a company can keep its product lines cost-effective.

acknowledgement: this work is supported in part by the ministry of education, science and technological development of the republic of serbia within the framework of the project tr-32019.

references
[1] m. urekar, đ. novaković, b. vujičić, p. sovilj, m. bulat, kompozitni miliommetar za merenje preciznih strujnih šantova u industrijskim uslovima, in proceedings of etran 2017. kladovo, ml1.8., jun 2017.
[2] m. urekar, n. gazivoda, j. đorđević-kozarov, m. bulat, p. sovilj, precizni kompozitni ommetar za merenje industrijskih strujnih šantova na mω opsegu, merno-informacione tehnologije 2017 (mit2017), fakultet tehničkih nauka, novi sad, 8-9. decembar 2017.
[3] m. urekar, m. bulat, b. vujičić, d. pejić, composite resistor standard for calibration of measuring transducers in laboratory conditions, serbian journal of electrical engineering, vol. 13, no. 1, feb. 2016.
[4] m. urekar, n. gazivoda, the transfer voltage standard for calibration outside of a laboratory, serbian journal of electrical engineering, vol. 14, no. 1, feb 2017.
[5] a guide to low resistance testing, megger, mar 2004.
[6] low level measurements handbook, 7th ed., keithley, 2014.
[7] fluke 8845a/8846a digital multimeters extended specifications, fluke corporation, 2008.
[8] extech models 380560 and 380562 high resolution benchtop milliohm meter user's guide, extech instruments, 2010.
[9] d. placko ed., fundamentals of instrumentation and measurement, iste usa, 2007.
[10] a. s. morris, measurement and instrumentation principles, butterworth-heinemann, 2001.
[11] tinsley portable precision wheatstone bridge model qj49, http://tinsley.co.uk/wp/?page_id=1713
[12] yokogawa precision wheatstone bridge 2768, user's manual im 2768-01e, 4th edition, oct. 2017. https://tmi.yokogawa.com/solutions/products/portable-and-bench-instruments/dc-precisionmeasuring/2768-precision-wheatstone-bridge/
[13] calibration: philosophy in practice, 2nd ed., fluke corporation, 1994.
[14] p. horowitz, w. hill, the art of electronics, 3rd ed., cambridge university press, 2015.
[15] robert a. pease, troubleshooting analog circuits, edn series for design engineers, newnes, jul 1991.
[16] fluke 87v industrial multimeter, http://en-us.fluke.com/products/digital-multimeters/fluke-87v-digital-multimeter.html
[17] low ohms measurements you can trust, application note, fluke corporation, 2004.
[18] ref200 dual current source and current sink datasheet, rev. b, texas instruments, jul 2015.
[19] lm134/lm234/lm334 3-terminal adjustable current sources datasheet, rev. b, texas instruments, may 2013.
[20] digital multimeter ut61e manual, uni-t, 2011.
[21] a.e. fridman, the quality of measurements – a metrological reference, springer, 2012.
[22] n. zhao, r. malik, w. liao, difference amplifier forms heart of precision current source, analog dialogue vol. 43, analog devices, sep 2009.
[23] b. ing, using a linear regulator to produce a constant current source, application note 4404, maxim integrated, jun 2009.
[24] ref102 2.5ppm/°c drift, 10-v precision voltage reference datasheet, rev. b, texas instruments, jun 2009.
[25] r. m. stitt, make a precision current source or current sink, application bulletin, burr-brown, feb 1994.
[26] r. m. stitt, implementation and applications of current sources and current receivers, application bulletin, burr-brown, mar 1990.
[27] tip47g, tip48g, tip50g high voltage npn silicon power transistors, rev. 11, on semiconductor, oct. 2014.
[28] f. galliana, p. p. capra, e.
gasparotto, a traceable technique to calibrate dc current shunts and resistors in the range from 10 µΩ to 10 mΩ, measurement 46, elsevier, 2013.
[29] jcgm 100:2008 evaluation of measurement data - guide to the expression of uncertainty in measurement, jcgm, bipm, 2010.

facta universitatis series: electronics and energetics vol. 31, no. 4, december 2018, pp. 571-583 https://doi.org/10.2298/fuee1804571f

a self organizing map (som) based electric load classification

mahdi farhadi
birjand university of technology, birjand, iran

abstract. it is of vital importance to use proper training data to perform accurate short-term load forecasting (stlf) based on artificial neural networks. the pattern of the loads which are used for the training of the kohonen self organizing map (som) neural network in stlf models should have the highest similarity with the pattern of the electric load of the forecasting day. in this paper, an electric load classifier model is proposed which relies on the pattern recognition capability of som. the performance of the proposed electric load classifier method is evaluated using iranian electric grid data. the proposed method requires very few training samples for training the kohonen neural network of the stlf model and can accurately predict the electric load in the network.

key words: short-term load forecasting, similar sampling process, kohonen self-organizing map, pattern recognition, electric load classification, load classifier

1. introduction

electric load forecasting is of high importance for efficient generation and distribution of electric energy in power grids. therefore, load forecasting is usually performed over different time horizons such as very short-term, short-term, medium-term, and long-term.
short-term load forecasting (stlf) refers to the prediction of electric load from one day to one week ahead and is performed for purposes such as unit commitment, evaluation of net interchange scheduling functions, system security analysis, and control and scheduling of power systems [1, 2]. the stlf models require neural network training using appropriate training data to achieve optimal forecasting. the behavior of the load data used in the training process of the neural network in the stlf models should have the highest degree of similarity with the behavior of the load data on the forecasting days. high similarity between the behavior of the training sample days and the behavior of the forecasting day largely guarantees the accuracy of forecasting. so, finding appropriate training samples is of great importance in the process of stlf.

received august 1, 2017; received in revised form august 23, 2018
corresponding author: mahdi farhadi, birjand university of technology, birjand, iran (e-mail: mahdifarhadi.staff@yahoo.com)

to design stlf models, some load classification studies are needed to identify the exact patterns of electricity consumption in the network. accurate recognition of the behavior of past daily power consumption curves makes it possible to forecast power requirements over the short term. in fact, a daily load classifier makes it possible to identify the pattern of power consumption and gives designers and operators of the power grid the knowledge required to design and propose efficient stlf models. clustering techniques are widely used for grouping customers and consumption loads in the network. different classification methodologies based on discriminant analysis, linear regression, and artificial intelligence (fuzzy logic and neural networks) techniques have been proposed in the literature [3].
among different neural network classifiers, the kohonen self organizing map (som) is one of the most effective methods. kohonen som has two valuable features, namely, pattern recognition and pattern complementarity. these two kohonen som properties are exploited for short-term load classification and for forecasting, respectively. in recent years, classification algorithms have been the subject of extensive research and have been widely used for different applications in the area of electrical engineering. for instance, in [4] a theoretical framework is formulated for customer classification using annual load profiles. it is also demonstrated how to extract characteristic attributes in frequency domain (cafd) to represent signatures for customer classes and subclasses. the cafds are obtained by using a data mining method called cart (classification and regression tree). these cafds are then used for systematic customer load classification. a method with low computational cost yet sufficient accuracy is proposed in [5] to extract signatures for load classification. this method, instead of utilizing digital signal processing and frequency-domain analysis, quantifies the similarity of voltage–current (v–i) trajectories for different loads and maps v–i trajectories to a grid of cells with binary values. graphical signatures can then be extracted and used for several applications. this technique significantly reduces the computational cost compared with the existing frequency-domain signature extraction methods. in [6] an algorithm called isodata is developed for customer classification and load profiling. this method takes into account the impact of temperature on the loads for classification and filters the impact of outlier samples. a novel iterative active learning technique based on a self-organizing map (som) neural network and a support vector machine (svm) classifier is proposed in [7].
the technique exploits the properties of the svm classifier and of the som neural network to identify uncertain and diverse samples to be included in the training set. it selects uncertain samples from low-density regions of the feature space by exploiting the topological properties of som. in addition to the extensive use of classification methods for different applications in electrical engineering, classification methods have been used in particular for load management and load forecasting. for instance, a novel load profile management software framework is presented in [8] for boosting the efficiency of power systems operation. the proposed framework performs real-time encoding and classifies load profiles. the classifier engine is based on an implementation of a locality sensitive hashing algorithm. in [9], the identification of operating electrical appliances inside residential buildings has been addressed. a method is proposed that can identify each appliance from the aggregate power readings of the electrical measurement panel. the possibility of applying a temporal multi-label classification approach in the domain of non-intrusive load monitoring (non-event-based method) is also investigated. in [10], a non-intrusive appliance load classification and monitoring strategy is proposed for energy management in smart buildings. the use of the isodata algorithm for customer classification is proposed in [11]. temperature dependency correction and outlier filtering are considered in the proposed customer classification and load profiling method. o.e. dragomir et al. [12] proposed an object-oriented software application using the matlab toolbox to classify customers, and the results were assessed for three urban areas in romania in several different scenarios.
in this paper, an efficient and flexible load classifier model is proposed as the basis of a software system for analyzing the daily electricity usage of the nationwide network. we implemented the proposed kohonen som model in the matlab programming environment. the developed load classifier program can adjust the classifier accuracy by changing the dimensions of a two-layer kohonen network for different time periods. we propose efficient rules for a similar sampling process to train the kohonen neural network in the load forecasting models. according to these rules, each of the 'normal week days', 'official holidays', 'before official holidays', 'after official holidays' and 'ramadan days' has its own specific behavior [13]. it is noteworthy that the author's previous papers [13], [17] forecast the short-term (next day) load, while the present paper focuses only on the method of classifying daily electric load and grouping loads with the same behavior for the selected time period (e.g. one week, one month, one year, or any desired time interval from one day to several years). although all three papers are based on the kohonen neural network, the two reference articles [13], [17] use the pattern-complement property of the kohonen neural network in the process of daily electric load forecasting, while the present paper uses the pattern recognition property of the kohonen neural network to classify daily electric load. in fact, the classification technique used in this paper is a prerequisite for finding suitable training patterns for the kohonen neural network used in the process of electric load forecasting in the reference papers [13], [17]. section 2 addresses the structure, specifications, and the algorithm of the som network classifier. in section 3, the classifier system which is designed and implemented in the matlab programming environment is presented and the related input and output parameters are explained.
in section 4, the proposed classifier program is evaluated, and its performance is verified by running the classifier on monthly time intervals of iranian network data. in section 5, the advantages of the proposed model and the implemented som classifier program and the sampling process are discussed. an effective method called the similar sampling method is introduced to specify the pattern of an appropriate training sample to train the kohonen neural network in the stlf models. in section 6, the method for correcting the predicted load based on the fuzzy expert system is briefly described, and finally, the paper is concluded in section 7.

2. kohonen self-organizing-map neural network

the kohonen network is a specific type of som which is capable of classifying complex sets of patterns in an unsupervised way [14]. this classifier extracts some classification criteria from the data and uses them implicitly. to perform the classification, it maps the input space onto an output space of lower dimension, while still preserving the topological features of the patterns in the input space. a set of elements called neurons, arranged along a line or on a plane in a rectangular or hexagonal pattern, represents the output space. fig. 1 shows a neighborhood function which is defined in the output space. the internal connections and links of the self-organized network must incorporate the important properties, patterns, categories, regularities, and correlations which are extracted from the input data. the neurons self-organize based on the inputs as external stimuli [15]. the som architecture consists of a two-layered neural network. there are n neurons in the input layer, each of which is associated with one input variable. the output layer consists of neurons which are spatially distributed along a 2d grid.

fig. 1 kohonen neural network

the ith input neuron and the jth output neuron are connected together by a weight wij.
the jth output neuron represents the average prototype vector of the category, and its reference weight vector wj is called the codebook [15]. the steps of the som algorithm are as follows [16]:
step 1: initialize neuron weights
step 2: select input vector
step 3: calculate the distance between each neuron and the input vector
step 4: select the nearest output neuron
step 5: adjust the output neuron and its neighbors
step 6: repeat the process from step 2

(fig. 1 labels: input vector, winner neuron, neighborhood area, grid dimensions n and m)

3. som electric load classifier

in order to implement an electric load classifier, we created a self-organizing kohonen layer with adjustable dimensions of m×n and implemented the training algorithm of som in the matlab programming environment. in the implemented classifier program, the input and output parameters of the classifier algorithm are defined. after the classification process is finished, similar days are assigned to one neuron or to close neurons of the two-layered self-organizing kohonen network.

3.1 matlab software

matlab provides a powerful environment for numerical computations [16], data analysis, simulations, algorithm implementation and program development, and makes model implementation rather easier than in other programming languages. in this paper, the classifier has been developed in the matlab programming language based on the kohonen classification algorithm, instead of using the neural network library of matlab. in addition, a matlab gui has been developed which provides an interface to receive the inputs and illustrate the outputs of the proposed classifier. our implemented software provides a flexible and powerful environment.

3.2. classifier's structure

as mentioned earlier, the structure of the proposed classifier is formed by an autonomous kohonen network with an adjustable and flexible set of m×n neurons according to fig. 1.
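The six steps of the SOM algorithm, applied to an m×n layer as in Section 3.2, can be sketched in a few lines. This is a minimal illustration in Python rather than the author's MATLAB program. The random initialization, the Gaussian neighborhood function and the linear decay of learning rate and radius are assumptions made here, and the default parameter values simply mirror the start and end values the paper lists as classifier inputs.

```python
import math
import random


def best_matching_unit(weights, x):
    """Steps 3-4: return (row, col) of the neuron closest to x (squared Euclidean distance)."""
    best, best_dist = (0, 0), float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            dist = sum((a - b) ** 2 for a, b in zip(x, w))
            if dist < best_dist:
                best, best_dist = (r, c), dist
    return best


def train_som(samples, m, n, cycles=7, lr_start=0.9, lr_end=0.1,
              radius_start=10.0, radius_end=1.0, seed=0):
    """Train an m x n Kohonen layer; weights[r][c] is the codebook vector of neuron (r, c)."""
    rng = random.Random(seed)
    dim = len(samples[0])
    # Step 1: initialize neuron weights (random values in [0, 1) are an assumption)
    weights = [[[rng.random() for _ in range(dim)] for _ in range(n)] for _ in range(m)]
    total_steps = cycles * len(samples)
    step = 0
    for _ in range(cycles):
        for x in samples:  # Step 2: select an input vector
            frac = step / max(total_steps - 1, 1)
            # Assumed linear decay from the start value to the end value
            lr = lr_start + (lr_end - lr_start) * frac
            radius = radius_start + (radius_end - radius_start) * frac
            win_r, win_c = best_matching_unit(weights, x)
            # Step 5: move the winner and its grid neighbors toward x
            for r in range(m):
                for c in range(n):
                    grid_dist2 = (r - win_r) ** 2 + (c - win_c) ** 2
                    h = math.exp(-grid_dist2 / (2.0 * radius ** 2))  # assumed Gaussian neighborhood
                    w = weights[r][c]
                    for k in range(dim):
                        w[k] += lr * h * (x[k] - w[k])
            step += 1  # Step 6: continue with the next input vector
    return weights


# Toy example with two 4-value "load shapes" (hypothetical data, not grid measurements)
flat_day = [1.0, 1.0, 1.0, 1.0]
peaky_day = [0.6, 0.8, 1.4, 1.2]
som = train_som([flat_day, peaky_day], m=3, n=3)
print(best_matching_unit(som, flat_day), best_matching_unit(som, peaky_day))
```

After training, each day is assigned to its best matching neuron. Days that share a neuron form one class, and days on adjacent neurons have similar load shapes.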
the accuracy of the classifier is adjusted by changing the dimensions of the kohonen network and the number of neurons in the network. the classifier module which we implemented in matlab is shown in fig. 2.

fig. 2 classifier module

the input and output parameters of the classifier program are listed in the following.

3.2.1. inputs

as shown in table 2, in addition to the dimensions m and n of the kohonen network, the inputs of the classifier include the calendar inputs of the program: the first date, the last date and the number of days. other input quantities of the classifier algorithm include the number of load values (which is fixed and equal to 24), the learning cycle, the start value of the learning rate, the end value of the learning rate, the start value of the neighborhood radius, and the end value of the neighborhood radius.

3.2.2. outputs

as shown in table 2, the outputs of the classifier program include the number of classes, the selected neurons (whose number is equal to the number of classes), and the empty neurons.

3.3. model's structure

according to fig. 3, the inputs of the som network are two vectors of normalized active power consumption during 24 hours in the iranian power grid. as shown in fig. 3, the classifier model receives two vectors of length 24, load-normal (d−1) and load-normal (d), which represent the normalized electric load on the consecutive days d−1 and d (d = 2,…,365), respectively. the values of these vectors for a period of one year would be used for the classification of the daily electric load vectors [13, 17]. fig.
3 the load som model in the classification process

these normalized inputs were calculated as follows:

$$\text{load-normal}(d-1) = \frac{L(d-1)}{\operatorname{average}(L(d-1))} \qquad (1)$$

$$\text{load-normal}(d) = \frac{L(d)}{\operatorname{average}(L(d-1))} \qquad (2)$$

where l (d−1) and l (d) represent the loads for the pre-forecasting and forecasting days, respectively, as follows:

$$L(d-1) = [\,L_1(d-1)\;\; L_2(d-1)\;\; L_3(d-1)\;\; \dots\;\; L_{24}(d-1)\,] \qquad (3)$$

$$L(d) = [\,L_1(d)\;\; L_2(d)\;\; L_3(d)\;\; \dots\;\; L_{24}(d)\,] \qquad (4)$$

the normalized loads of the previous and present days are obtained by dividing the 24 hourly loads of the previous day and the 24 hourly loads of the present day by the average load of the previous day. because this removes the effect of load growth, the samples of different years can be used for training together.

4. validation of the classifier model

the proposed classifier can be applied flexibly to all electric networks that follow the persian official calendar. in order to validate the operation of the implemented model, the electric load data of the iranian network, which follows the persian official calendar according to table 1, is used. according to the persian official calendar, saturday to wednesday are working days, thursday is a half-holiday working day, and friday is the official weekend holiday. it is noteworthy that official holidays are days marked in the persian official calendar for various occasions such as national celebrations, religious celebrations, national mourning, religious mourning, etc., and are divided into two groups: official solar holidays and official lunar holidays. official solar holidays include national celebrations and mourning days that fall on fixed dates of the solar persian year. official lunar holidays include religious mourning days and celebrations which, because of the 11-day difference between the lunar and solar calendars, fall at varying times of a persian solar year.
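Before the validation data are fed to the classifier, each pair of consecutive days is normalized as in Eqs. (1) and (2). A minimal sketch of that step follows. The function name and the toy load values are chosen here for illustration.

```python
def normalize_pair(load_prev, load_curr):
    """Eqs. (1)-(2): divide both 24-hour vectors by the average load of day d-1."""
    if len(load_prev) != 24 or len(load_curr) != 24:
        raise ValueError("expected 24 hourly values per day")
    avg_prev = sum(load_prev) / 24
    norm_prev = [v / avg_prev for v in load_prev]
    norm_curr = [v / avg_prev for v in load_curr]
    return norm_prev, norm_curr


# Toy data: the same day shapes at two different absolute load levels (hypothetical MW values)
year_a = normalize_pair([100.0] * 24, [110.0] * 24)
year_b = normalize_pair([200.0] * 24, [220.0] * 24)
print(year_a[1][0], year_b[1][0])
```

Both pairs normalize to the same vectors even though the second has twice the absolute load. This is the sense in which the normalization removes load growth and lets samples from different years be trained together.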
table 1 the months of a year based on persian official calendar.

half of year | season | no | month
first half | spring | 1 | farvardin
first half | spring | 2 | ordibehesht
first half | spring | 3 | khordad
first half | summer | 4 | tir
first half | summer | 5 | mordad
first half | summer | 6 | shahrivar
second half | autumn | 7 | mehr
second half | autumn | 8 | aban
second half | autumn | 9 | azar
second half | winter | 10 | dey
second half | winter | 11 | bahman
second half | winter | 12 | esfand

in this paper, the classifier is evaluated in a time interval of 10 years from 1370 to 1380 (thursday, march 21, 1991 to wednesday, march 21, 2001). the classifier used hourly active power consumption (in megawatts) in the iranian electric grid for the classification, training and forecasting purposes. the electric load consumption in the iranian nationwide grid follows a considerably nonlinear pattern and is composed of base loads and peak loads. kohonen som in general has a scalability problem; however, the volume of data required for training the model in the specific application addressed in this paper is not large (at most 365×24 data points). therefore, there is no scalability problem in this case study and no technique is needed to manage the increasing size of the input data. in this study, due to the high volume of output tables and curves, sample results of the monthly experiments are provided for the classification of the days in bahman of 1380, from 1/11/1380 to 1/12/1380 (from monday, january 21, 2002 to wednesday, february 20, 2002). in this experiment, the daily load curves of the network are classified in a two-dimensional array with dimensions of 6×6. the input parameters of the classifier are set according to table 2.

table 2 input parameters of the classifier based on kohonen self-organizing network

by running the classifier module which we implemented in matlab (the interface is shown in fig. 2), the numerical output results are inserted according to the classifier curves based on the number of established classes. as shown in fig.
4, the som network assigns the 21 established classes to four distinctive groups: after official holidays, official holidays, before official holidays and normal working days (from sunday to wednesday). the neurons related to each of these groups are illustrated, respectively, in yellow, red, blue and green. the dates of the days in each class of the som network of fig. 4 are shown in table 3. also, the curves of daily average load related to four sample classes of the 21 established classes are shown by the colors related to each class in fig. 4. the displayed classes 1, 16, 21 and 4 correspond to after official holidays, official holidays, before official holidays and normal days, respectively. according to the classifier results in this example, 31 sample days are assigned to 21 classes. among the established classes, a few classes had the highest share of a specific type of day. for example, out of the 21 established classes, four classes (1, 2, 5 and 8) included after official holidays, two classes (16 and 17) included official holidays, three classes (19, 20 and 21) included before official holidays, and the remaining classes included normal days. it is noteworthy that the association of classes to neurons is such that adjacent neurons represent days which have relatively similar electric load patterns. in particular, the class associated with saturday is beside the class of after official holidays; the class representing thursday is beside that of before official holidays; and the class corresponding to friday is near that of official holidays. in addition, normal working days in the middle of the week, including sunday, monday, tuesday and wednesday, are located in their own specific classes.
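The three classifier outputs listed in Section 3.2.2 (number of classes, selected neurons and empty neurons) follow directly from the winning neuron of each day. The sketch below shows one way to derive them. The function name and the tiny 2×2 example are hypothetical and do not reproduce the Bahman 1380 assignment.

```python
from collections import defaultdict


def summarize_classes(winner_by_day, m, n):
    """Group days by winning neuron; return (classes, selected neurons, empty neurons)."""
    classes = defaultdict(list)
    for day, neuron in winner_by_day.items():
        classes[neuron].append(day)
    selected = sorted(classes)  # one selected neuron per class
    empty = [(r, c) for r in range(m) for c in range(n) if (r, c) not in classes]
    return dict(classes), selected, empty


# Hypothetical winners for three days on a 2 x 2 map
winners = {"sat": (0, 0), "sun": (1, 1), "mon": (1, 1)}
classes, selected, empty = summarize_classes(winners, m=2, n=2)
print(len(classes), selected, empty)
```

By the same counting, the 6×6 map of this experiment has 36 neurons, so 21 established classes leave 15 neurons empty.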
classification input parameters
first date | 1380/11/1 | learning cycle | 7
last date | 1380/12/1 | start value of learning rate | 0.9
number of days | 31 | end value of learning rate | 0.1
number of load value | 24 | start value of neighborhood radius | 10
m | 6 | end value of neighborhood radius | 1
n | 6 | |

fig. 4 two-dimensional neural network som classifier with dimensions of 6×6 for the month of bahman of 1380.

as the number of days to be classified increases, the dimensions of the kohonen network are increased. increasing the dimensions of a kohonen network, i.e. increasing the number of neurons in the network, leads to higher accuracy of the classifier. by distributing the input sample days among more neurons, more classes with fewer sample patterns each are created. therefore, the days in each class have higher similarity than in a network with lower dimensions.

table 3 dates of som network classifier with dimensions of 6×6 for bahman of 1380

class | day type | days
class 5 | after holiday | sat 80/11/13
class 4 | normal day | mon 80/11/1, tues 80/11/2, wed 80/11/3, mon 80/11/8
class 3 | normal day | tues 80/11/30
class 2 | after holiday | tues 80/11/23
class 1 | after holiday | sat 80/11/20, sat 80/11/27
class 8 | after holiday | sat 80/11/6
class 7 | normal day | tues 80/11/9, wed 80/11/24
class 6 | normal day | mon 80/11/29
class 12 | normal day | sun 80/11/7, tues 80/11/16
class 11 | normal day | sun 80/11/14
class 10 | normal day | wed 80/11/10, sat 80/11/27
class 9 | normal day | wed 80/12/1
class 15 | normal day | sun 80/11/28
class 14 | normal day | wed 80/11/17
class 13 | normal day | mon 80/11/15
class 16 | holiday | fri 80/11/5, fri 80/11/12, fri 80/11/19, mon 80/11/22
class 21 | before holiday | thurs 80/11/4, thurs 80/11/14
class 20 | before holiday | thurs 80/11/18
class 19 | before holiday | thurs 80/11/25
class 18 | normal day | sun 80/11/21
class 17 | holiday | fri 80/11/26

the curves of daily average electric loads related to four sample classes are shown in fig. 5.
fig. 5 the curves of daily average electric load related to four sample classes: (a) class 1: after holidays (b) class 4: normal days (c) class 16: holidays (d) class 21: before holidays

5. similar sampling method for stlf

although none of the outputs of the som classifier is directly used for prediction, the results of the som network classification process can be used as an effective way of sampling the appropriate training patterns for training and forecasting the 24-hour active power consumption of the nationwide electricity grid. this can be used in a separate som network. the details of the training and forecasting processes of this som network are presented by the same author in [13]. according to the outcomes of the classifier, every normal day of a week has its specific load consumption curve and holidays have distinct ones. in the similar sampling method, only a limited number of samples is used to train the kohonen som neural networks. at most 17 training samples are selected from the interval of two weeks before and two weeks after the date of the forecasting day in the three past years, according to the calendar features of the forecasting day, so that the date of the forecasting day is in the center of these four weeks [13]. since the behavior of load curves may change over many years, using samples from many years back may degrade the results of load forecasting; therefore, only the samples of the three preceding years are used to train the neural networks. for example, to forecast tuesday 5/04/1380 (26/06/2001), the samples of monday and tuesday are used in the time intervals of 22/03/1380 (12/06/2001) to 4/04/1380 (25/06/2001), 22/03/1379 (11/06/2000) to 22/04/1379 (12/07/2000), 22/03/1378 (12/06/1999) to 22/04/1378 (13/07/1999) and 22/03/1377 (12/06/1998) to 22/04/1377 (13/07/1998) in order to train the kohonen neural networks.
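The date selection in this example can be approximated with a short sketch. It is a simplification under stated assumptions: it works on Gregorian dates (the paper uses the Persian calendar), it centers each past window on the same Gregorian date, it matches only the weekday and it ignores holiday types. The function names are hypothetical.

```python
from datetime import date, timedelta


def _same_date_in_year(day, year):
    """Same month/day in another year (29 February falls back to 28 February)."""
    try:
        return day.replace(year=year)
    except ValueError:
        return day.replace(year=year, day=28)


def similar_sample_days(forecast_day, years_back=3, half_window=14, max_samples=17):
    """Candidate training days with the same weekday as the forecasting day."""
    # Current year: only the two weeks before the forecasting day are available
    windows = [(forecast_day - timedelta(days=half_window),
                forecast_day - timedelta(days=1))]
    # Past years: two weeks before and two weeks after the same date
    for k in range(1, years_back + 1):
        center = _same_date_in_year(forecast_day, forecast_day.year - k)
        windows.append((center - timedelta(days=half_window),
                        center + timedelta(days=half_window)))
    samples = []
    for start, end in windows:
        day = start
        while day <= end:
            if day.weekday() == forecast_day.weekday():
                samples.append(day)
            day += timedelta(days=1)
    return samples[:max_samples]  # the most recent windows are kept first


days = similar_sample_days(date(2001, 6, 26))       # Tuesday, 5/04/1380 in the example above
pairs = [(d - timedelta(days=1), d) for d in days]  # (Monday, Tuesday) input pairs
print(len(days), days[0], days[-1])
```

Each selected Tuesday is paired with the preceding Monday, which matches the (d−1, d) input format of the classifier and is consistent with the example using the samples of both Monday and Tuesday.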
the annual mean values of the mape (mean absolute percentage error) index for predicting the load of the normal days of the years 1380 (from 21/03/2001 to 20/03/2002), 1381 (from 21/03/2002 to 21/03/2003) and 1382 (from 21/03/2003 to 19/03/2004) are 1.83%, 1.77%, and 1.29%, respectively [17]. the details of the load forecasting model incorporating temperature are discussed by the author in [13].

6. fuzzy expert system for stlf

a fuzzy-expert system adjusts the primary predicted load for special days by taking into account the modifications in the load which happened in the same day's load behavior. the relative difference between the actual and forecasted hourly loads of the same day is defined as the relative gap, regardless of the possible influence of a holiday [15, 16], and is calculated via the following formula:

$$\text{relative gap}(i) = \frac{\text{load}_{\text{real}}(i) - \text{load}_{\text{forecast}}(i)}{\text{load}_{\text{forecast}}(i)} \times 100\ (\%) \qquad (5)$$

where load real(i) and load forecast(i) represent the actual and forecasted loads for hour i, respectively. it is worth noting that load forecast(i) in formula (5) is the forecasted load which resulted from the kohonen neural network. the input variables of the fuzzy system are the day type (day) and the day-night hours (hour) (which specify the time of a particular event), and its output variable is the relative gap, which is added to or subtracted from the primary predicted load for the purpose of a more precise forecast. finally, the ultimate forecasted load is obtained by the following formula:

$$\text{final load}(i) = \text{load}_{\text{forecast}}(i) + \frac{\text{relative gap}(i) \times \text{load}_{\text{forecast}}(i)}{100} \qquad (6)$$

a complete description of the predicted load correction process based on the fuzzy system is available in reference [17].
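Formulas (5) and (6), together with the MAPE index quoted for the forecasts, can be written compactly as below. This is a sketch only. In the paper the relative gap applied in (6) is the output of the fuzzy system, whereas here it is simply passed in as a list. The MAPE function uses the standard definition, which the paper does not spell out, and the two-hour data are made up for illustration.

```python
def relative_gap(load_real, load_forecast):
    """Formula (5): percentage gap between actual and forecasted hourly loads."""
    return [(r - f) / f * 100.0 for r, f in zip(load_real, load_forecast)]


def final_load(load_forecast, gap_percent):
    """Formula (6): correct the primary forecast by a signed percentage gap."""
    return [f + g * f / 100.0 for f, g in zip(load_forecast, gap_percent)]


def mape(load_real, load_forecast):
    """Mean absolute percentage error (standard definition, assumed here)."""
    errors = [abs(r - f) / r for r, f in zip(load_real, load_forecast)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical two-hour example
real = [110.0, 90.0]
forecast = [100.0, 100.0]
gap = relative_gap(real, forecast)
corrected = final_load(forecast, gap)
print(gap, corrected, mape(real, forecast), mape(real, corrected))
```

Applying (6) with the exact gap from (5) recovers the actual load, so the quality of the corrected forecast depends entirely on how well the fuzzy system estimates the gap for a given day type and hour.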
the mape of the initial load forecast by the kohonen neural network and of the final load forecast by the fuzzy-expert system for official holidays in the year 1380 is 3.19% and 1.78%, respectively [17]. the complete numerical results of these studies are presented by the author in references [13,17].

7. conclusion

we implemented a som classifier in the matlab coding environment and performed numerical tests for the validation of the classifier. the flexibility, efficiency, and usefulness of the proposed classifier program make it a suitable solution for use in the power industry. because the dimensions of the kohonen network can be changed, the accuracy of the electric load classifier can be adjusted and increased by increasing the dimensions and the number of neurons of the neural network.

according to the outcomes of the classifier program, each normal day of the week (saturday to friday) has its own specific load curve. also, the curves of power consumption on working days in the middle of the week, from sunday to wednesday, have high similarity. the curves of load consumption on official holidays and on official working days of a week have completely different patterns. the behavior of an official holiday is relatively similar to that of its nearest friday. also, the curves of load consumption on the days before official holidays and on the days after them are completely different. the electric consumption pattern of a working day before an official holiday is considerably similar to that of the nearest thursday, and the electric consumption pattern of the working day after an official holiday is similar to that of the nearest saturday before it. in addition, in the curve of morning consumption during the month of ramadan, in which people pray early in the morning and late at night, a relative peak of power consumption is observed.
in the different seasons of a year, the load curve changes according to season-dependent factors such as the length of daytime. over time, the average load consumption increases with population and economic growth, which results in load growth in each year. all these factors should be considered in the modeling of load consumption. in summary, there are ten classes of electric load consumption in the iranian power grid, comprising the normal working days of a week (saturday to friday) and the specific days (official holiday, day before an official holiday and day after an official holiday). in order to design models to forecast electric load based on kohonen neural networks, ten sub-models, one for each type of day, are created, and the load patterns of each class, selected by the similar sampling process, are used to train the neural networks and to forecast the load. the similar sampling process, with a maximum of 17 training samples, high accuracy and low computation time, is able to supply the training data required for electric load forecasting and benefits the different models of kohonen self-organizing neural networks. in addition to the capability of the proposed classifier program to perform load forecasting on an industrial scale, the program provides insight into the behavior of the power consumption curve of a given power network, which needs to be studied by the designers and the users of the network.

acknowledgement: the paper is a part of the research work within the project supported by industrial parks company of khorasan e jonoobi province, birjand, iran. the author would like to thank the manager and the experts of the mentioned company.

references

[1] g. dudek, "artificial immune system with local feature selection for short-term load forecasting," ieee transactions on evolutionary computation, vol. 21, no. 1, pp. 116-130, feb. 2017. 
[2] h. quan, d. srinivasan, a. khosravi, "short-term load and wind power forecasting using neural network-based prediction intervals", ieee transactions on neural networks and learning systems, vol. 25, no. 2, pp. 303-315, feb. 2014. [3] s. v. verdú, m. o. garcía, c. senabre, a. g. marín, f. j. g. franco, "classification, filtering, and identification of electrical customer load patterns through the use of self-organizing maps," ieee transactions on power systems, vol. 21, no. 4, pp. 1672-1682, 2006. [4] s. zhong, k. s. tam, "hierarchical classification of load profiles based on their characteristic attributes in frequency domain," ieee transactions on power systems, vol. 30, no. 5, pp. 2434-2441, sep. 2015. [5] l. du, d. he, r. g. harley, and t. g. habetler, "electric load classification by binary voltage-current trajectory mapping," ieee transactions on smart grid, vol. 7, no. 1, pp. 358-365, jan. 2016. [6] l. du, j. a. restrepo, y. yang, r. g. harley, t. g. habetler, "nonintrusive, self-organizing, and probabilistic classification and identification of plugged-in electric loads," ieee transactions on smart grid, vol. 4, no. 3, pp. 1371-1380, sept. 2013. [7] s. patra, l. bruzzone, "a novel som-svm-based active learning technique for remote sensing image classification," ieee transactions on geoscience and remote sensing, vol. 52, no. 11, pp. 6899-6910, nov. 2014. [8] ervin d. varga, sándor f. beretka, christian noce, and gianluca sapienza, "robust real-time load profile encoding and classification framework for efficient power systems operation", ieee transactions on power systems, vol. 30, no. 4, pp. 1897-1904, july 2015. [9] kaustav basu, vincent debusschere, seddik bacha, ujjwal maulik, and sanghamitra bondyopadhyay, "nonintrusive load monitoring: a temporal multilabel classification approach", ieee transactions on industrial informatics, vol. 11, no. 1, pp. 262-270, feb. 2015. 
[10] dawei he, liang du, yi yang, ronald harley, and thomas habetler, "front-end electronic circuit topology analysis for model-driven classification and monitoring of appliance loads in smart buildings", ieee transactions on smart grid, vol. 3, no. 4, pp.2286-2293, dec. 2012. [11] antti mutanen, maija ruska, sami repo, and pertti järventausta," customer classification and load profiling method for distribution systems", ieee transactions on power delivery, vol. 26, no. 3, pp. 1755-1763, july 2011. [12] o.e. dragomira, f. dragomirb, m. radulescuc, matlab application of kohonen selforganizing map to classify consumers‟ load profiles. itqm, computer science 31, pp. 474 479, 2014. [13] m. farhadi, s.m. moghaddas tafreshi, effective model for next day load cureve forecasting based upon combination of perceptron and kohonen anns applied to iran power network. in proceedings of the ieee, intelec, rome, italy, sep 30 oct 4, 2007. [14] s. valero, j. aparicio, c. senabre, m. oritz, j. sancho, and a. gabaldon, "comparative analysis of self organizing maps vs. multilayer perceptron neural networks for short-term load forecasting," in proceedings of the international symposium on modern electric power systems (meps), 2010, no. 1, pp. 1-5. [15] j. llanos, d. sáez, r. palma-behnke, a. núñez, g. jiménez-estévez. load profile generator and load forecasting for a renewable based microgrid using self organizing maps and neural networks, in proceedings of the ieee wcci, 2012. [16] k.i. kimi, c. h. jini, y. k. leei, k. d.kim, k.h.ryui. forecasting wind power generation patterns based on som clustering. awareness science and technology (icast), 2011. [17] m. farhadi, m. farshad," a fuzzy inference self-organizing-map based model for short term load forecasting" in proceedings of the 17th conference on electrical power distribution networks (epdc), 2-3 may 2012, tehran, iran. 
facta universitatis series: electronics and energetics vol. 33, no 3, september 2020, pp. 489-498 https://doi.org/10.2298/fuee2003489d © 2020 by university of niš, serbia | creative commons license: cc by-nc-nd

design of efficient delay block for low frequency application

sandeep k. dash, satya n. mishra, nirmal k. rout
school of electronics engineering, kiit deemed to be university, bhubaneswar, odisha, india

abstract. in recent years researchers have been focusing on the design of low power and small size oscillators for emerging areas of interest such as the internet of things (iot) and biomedical applications. in this paper a new delay block for ring oscillators is proposed using a cmos inverter cascaded with an inverted current starved inverter (cicsi). the designed delay block provides approximately 50% more delay with a smaller number of transistors than the conventionally designed circuits. furthermore, a ring oscillator and a non-overlapping clock (noc) generator are designed using it. the designed circuits can be used in switched capacitor (sc) circuits and analog mixed signal circuits to meet the need for low frequency portable biomedical applications. the designed circuits are simulated on the generic 90 nm 1.2 v process design kit (gpdk90) using the cadence virtuoso design environment. the simulation result shows that the delay of the cicsi delay block is 592 ps. a ring oscillator using 101 stages of the delay block is designed, and it is shown that it operates at a frequency of 17 mhz with a power consumption of 420 µw.

key words: cmos inverter, inverted current starved inverter, noc

1. introduction

in modern cmos vlsi design, researchers focus on the switched capacitor (sc) technique because of its advantages such as lower power, smaller area, higher precision, and easy implementation on chip [1-3]. 
the optimal realization of sc circuits depends upon the efficient design of a non-overlapping clock (noc) generator. the traditional noc consists of a ring oscillator, a nor gate, and cascaded delay blocks. several methods have been reported for realizing the delay block, which is further used to design the ring oscillator and the noc [4-14]. for low frequency applications, the delay block should provide more delay. the most efficient and easiest way to design the delay block is to use a cmos inverter and a current starved inverter [15-16]. different techniques have been reported to increase the delay of the delay block, namely a voltage scaling technique [6], a transmission gate method [7], and an inverted inverter [17, 18].

received february 20, 2020; received in revised form may 26, 2020. corresponding author: sandeep k. dash, school of electronics engineering, kiit deemed to be university, bhubaneswar, odisha, india (e-mail: sandeepfet@kiit.ac.in)

a cascaded inverter with an inverted inverter, reported by a. k. mal et al. [17], is used to get more delay compared to conventional delay blocks. the drawback of this delay block is that two or more inverted inverters cannot be cascaded due to the logic level restoration problem [6, 17]. so, to improve the delay without the logic level restoration problem, an inverted current starved inverter (icsi) is used in place of the inverted inverter. the proposed cascaded inverter with inverted current starved inverter (cicsi) delay block consists of a cmos inverter cascaded with an icsi. the cicsi delay block has more delay mainly due to three reasons. first, the output voltage swing of the cicsi is not rail to rail. second, the transistors of the icsi are not allowed to turn off during the transition of the input signal, so the delay due to switching increases. third, the current flowing through the cicsi is smaller compared to previously reported work [17]. 
the cicsi circuit produces a higher delay with fewer transistors, so it can be used in low power and low frequency applications. this paper is organized as follows: in section 2, the delay block, ring oscillator, and noc are discussed. section 3 gives a detailed analysis of the delay calculation of the delay block. section 4 presents the results and analysis. finally, section 5 draws the conclusions of the work.

2. cicsi structure

the previously reported delay block using an inverted inverter circuit has more delay, but it shows a two-stage problem [17]. to improve the delay further, a cmos inverter cascaded with inverted current starved inverter (cicsi) delay block, shown in fig. 1, is proposed in this paper. the inverted current starved inverter is designed by swapping pmos with nmos, and nmos with pmos, and shorting the gate terminals of all nmos and pmos transistors. the cicsi has two stages. the first stage is a normal cmos inverter and the second stage is a modified current starved inverter which acts as a cascaded pass transistor [1]. in the second stage, the drain terminal of nmos (m3), as shown in fig. 1, is connected to vdd, and its source is connected to the drain terminal of nmos (m4). a logic '1' input turns on both m3 and m4. the drain of pmos (m6) is grounded and its source is connected to the drain of pmos (m5). a logic '0' input turns on both m5 and m6. by applying the concepts of cascaded pass transistor logic [19], when a logic '1' is applied to the second stage of the cicsi, the m4 pass gate is turned on and the output will be ≈ vdd−vtn. similarly, when a logic '0' is applied to the second stage, m5 is turned on and the output will be ≈ |vtp|, where vtn and |vtp| are the threshold voltages of the m4 and m5 transistors, respectively. for the gpdk 90 nm process with vdd = 1.2 v, vtn0 and −vtp0 are approximately 0.15 v, so the output voltage of the cicsi ranges from 0.15 v to 1.05 v. fig. 
1 cicsi delay block of noc

the output voltage swing of the inverted current starved inverter is smaller by one threshold voltage on both sides, as per the principle of cascaded pass transistor logic. so, the output signal swing of the cmos inverter followed by the inverted current starved inverter is not rail to rail. when this output is fed to the next stage inverter, the output is not affected much, because the basic cmos inverter provides an excellent gain between the voltage levels vil and vih [19]. the output voltage swing of the next cmos inverter is not affected as long as the input to the cmos inverter is either less than vil or more than vih. the delay of the inverter circuit is inversely proportional to the current flowing through it [19]. in the proposed cicsi block, a smaller current flows compared to the conventional circuit. ring oscillators are realized by cascading an odd number of inverters [4, 17-19]. the ring oscillator using the cicsi block is shown in fig. 2. the frequency of oscillation of a ring oscillator is given by

$$f_{osc} = \frac{1}{2 n \tau_p} \quad (1)$$

where n is the number of inverter stages and τp is the average propagation delay of a single cicsi delay block. since the frequency of oscillation depends on the delay introduced by each inverter stage, the total delay must be large for low frequency applications.

fig. 2 ring oscillator using cicsi delay block

the block diagram of the basic non-overlapping clock (noc) generator is shown in fig. 3. an even number of delay blocks is connected in series in the noc generator [1]. different non-overlapping clock periods can be obtained by suitably choosing the number of cascaded delay blocks. the total non-overlapping clock period is given by n × τp, where n represents the number of cicsi delay blocks and τp represents the average propagation delay of a single cicsi delay block. 
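the two timing relations above, eq. (1) for the ring oscillator and n × τp for the noc period, can be sketched as follows. this is a minimal illustration, not the authors' design flow: the function names are hypothetical, and the numeric example simply inverts eq. (1) for the 101-stage oscillator and the simulated frequency of 17.08 mhz reported in this paper.

```python
def ring_oscillator_frequency(n_stages: int, tau_p: float) -> float:
    """Oscillation frequency in Hz from eq. (1): f_osc = 1 / (2 * n * tau_p)."""
    if n_stages < 1 or n_stages % 2 == 0:
        raise ValueError("a ring oscillator needs an odd number of inverting stages")
    if tau_p <= 0:
        raise ValueError("tau_p must be positive")
    return 1.0 / (2 * n_stages * tau_p)


def implied_stage_delay(n_stages: int, f_osc: float) -> float:
    """Eq. (1) solved for tau_p: average per-stage delay in seconds."""
    return 1.0 / (2 * n_stages * f_osc)


def noc_period(n_blocks: int, tau_p: float) -> float:
    """Total non-overlapping clock period n * tau_p of a chain of delay blocks."""
    return n_blocks * tau_p


# Per-stage delay implied by eq. (1) for 101 stages oscillating at 17.08 MHz.
tau = implied_stage_delay(101, 17.08e6)
print(f"implied average stage delay: {tau * 1e12:.1f} ps")
```

inverting eq. (1) in this way gives an average per-stage delay of roughly 290 ps for the reported oscillator; the same helper can be used to estimate how many stages a target frequency would need for a given block delay.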
the noc output does not have a rail to rail swing, so two cmos inverters are connected in series (as buffers) at the end of the delay block to make the output signal swing rail to rail.

fig. 3 noc block

fig. 4 shows the voltage transfer characteristic (vtc) of the cmos inverter, the inverted inverter [17], and the proposed inverter (cicsi). in the figure, the cmos inverter shows a rail to rail swing (0 to 1.2 v), but the inverted inverter does not show a rail to rail output. although the vtc of the cicsi does not have a rail to rail swing, it is not degraded relative to that of the inverted inverter. from the vtc of the cmos inverter it is found that the transition region extends from 0.328 v to 0.6 v. so inputs below 0.328 v and above 0.6 v are considered as perfect logic '0' and logic '1', respectively. for the gpdk 90 nm process, vtn = -vtp = 0.15 v, so the output swing of the cicsi is approximately between 0.15 v and 1.05 v, and it is recovered by the next stage cmos inverter. in these three designs, the minimum transistor sizing ratio (w/l = 120 nm/100 nm) and a supply voltage of 1.2 v are used.

fig. 4 vtc curve

3. delay estimation

in this section the delay of the cicsi delay block, consisting of a cmos inverter and an icsi, is calculated. the delay of each cascaded stage is calculated separately, and the results are then combined to get the final delay. then the frequency of a ring oscillator designed using an odd number of such delay blocks is estimated. the delay of the cmos inverter is calculated in two steps. when the output of the cmos inverter changes from the logic '1' state to the logic '0' state, the delay is defined as the propagation delay (high to low), denoted by τphl. similarly, when the output of the cmos inverter changes from the logic '0' state to the logic '1' state, the delay is defined as the propagation delay (low to high), denoted by τplh. 
combining both the delays, the total propagation delay [17] (high to low) is expressed as

$$\tau_{pHL} = \frac{C_{load}}{k_n (V_{DD} - V_{Tn})} \left[ \frac{2V_{Tn}}{V_{DD} - V_{Tn}} + \ln\left( \frac{3V_{DD} - 4V_{Tn}}{V_{DD}} \right) \right] \quad (2)$$

similarly, the propagation delay [17] (low to high) is estimated as

$$\tau_{pLH} = \frac{C_{load}}{k_p (V_{DD} - |V_{Tp}|)} \left[ \frac{2|V_{Tp}|}{V_{DD} - |V_{Tp}|} + \ln\left( \frac{3V_{DD} - 4|V_{Tp}|}{V_{DD}} \right) \right] \quad (3)$$

the next objective is to model the delay of the icsi in terms of the mos process parameters. the behaviour of the icsi circuit is the same as that of the cascaded pass transistor [19]. when the input of the icsi is high (vdd), the nmos transistors are on and act as a cascaded pass transistor that transfers the logic input to the output. thus, during the logic 1 transfer, the load capacitor is charged through the nmos transistors. to simplify the analysis, the substrate bias effect is neglected at this point. when the input changes from 0 to vdd, transistors m3 and m4 are on and in saturation until the output voltage reaches ηvdd, where 0 < η < 0.5; for output voltages from ηvdd to 0.5vdd, m3 is in saturation and m4 is in the linear region.

$$\tau_{i1LH} = \int_{V_{Tp}}^{\eta V_{DD}} \frac{2 C_{load}\, dv}{k_n (V_{DD} - V_{Tn} - v)^2} = \frac{2 C_{load}}{k_n} \left[ \frac{1}{V_{DD}(1 - \eta) - V_{Tn}} - \frac{1}{V_{DD} - V_{Tp} - V_{Tn}} \right] \quad (4)$$

$$\tau_{i2LH} = \int_{\eta V_{DD}}^{0.5 V_{DD}} \frac{2 C_{load}\, dv}{k_n \left[ 2 (V_{DD} - V_{Tn} - v)(V_{DD} - v) - (V_{DD} - v)^2 \right]} = \frac{C_{load}}{k_n V_{Tn}} \ln\left[ \frac{V_{DD} (V_{DD} - \eta V_{DD} - 2V_{Tn})}{(V_{DD} - 4V_{Tn})(V_{DD} - \eta V_{DD})} \right] \quad (5)$$

here, v represents the output voltage of the icsi delay block, τi1lh is the time required to charge the load capacitor from vtp to ηvdd, and τi2lh is the time required to charge the load capacitor from ηvdd to 0.5vdd. so the total time (τilh) required to charge the load capacitor from vtp to 0.5vdd is calculated as follows:

$$\tau_{iLH} = \tau_{i1LH} + \tau_{i2LH} \quad (6)$$

in a similar way, the high to low delay (τihl) during the logic 0 transfer is calculated as

$$\tau_{iHL} = \tau_{i1HL} + \tau_{i2HL} \quad (7)$$

the total delay of the cicsi delay block τlh(tot) [or τhl(tot)] is the sum of the delay of the cmos inverter τplh [or τphl] and the delay of the icsi τilh [or τihl]. 
it is expressed as

$$\tau_{LH(tot)} = \tau_{pLH} + \tau_{iLH}, \qquad \tau_{HL(tot)} = \tau_{pHL} + \tau_{iHL} \quad (8)$$

4. result and discussion

this section presents the simulation results of the delay block and the noc, obtained in cadence using the cmos 90 nm (gpdk 90 nm) process, and their characterization for proper functioning. in all the designs, the transistor sizing ratio w/l = 120 nm/100 nm is used with a supply voltage of 1.2 v. fig. 5 shows the plot of the input and output voltages of the cicsi. when the input is fed to the cicsi, the first stage gives the inverted output, which is rail to rail. the output of the first stage is applied as input to the second stage (inverted current starved inverter). the output of the second stage is not rail to rail. the vil and vih of the cicsi are 0.32 v and 0.6 v, as shown in fig. 4. the low and high output levels of the second stage inverted current starved inverter are below 0.32 v and above 0.6 v, respectively. when this output is applied to the next stage of the cicsi, it is therefore considered as perfect logic 0 and logic 1.

fig. 5 input and output voltage level of cicsi delay block

fig. 6 shows the plot of the ring oscillator output with and without buffer. without the buffer, the ring oscillator output ranges from 0.25 v to 1.05 v, which is not rail to rail. when the buffer is added at the output, the signal level is restored to the range from 0 to 1.2 v, as shown in fig. 6.

fig. 6 ring oscillator output with and without buffer

fig. 7 shows the plot of transistor count versus frequency for the inverted inverter and the cicsi. the graph shows that, for the same number of transistors, the cicsi produces a lower frequency than the inverted inverter. the slope of the cicsi curve is also smaller than that of the inverted inverter. this shows that the proposed cicsi design has less variation in frequency (307 mhz to 68 mhz) for the same number of transistors as compared to the inverted inverter (737 mhz to 145.9 mhz). 
fig. 7 number of transistors vs frequency

fig. 8 shows the plot of temperature versus frequency for the cicsi. the duty cycle of the cicsi is 50% over almost the whole temperature range. the cicsi was tested from -40°c to 100°c, which shows minimal change (9.85 mhz to 21.82 mhz) in the frequency compared with the other oscillator in table 1.

fig. 8 frequency vs temperature

fig. 9 shows the output of the noc block. the cicsi gives more delay compared to other delay circuits such as the inverted inverter [17]. the increased delay helps to reduce the number of transistors. in this work, a 2.59 ns non-overlapping clock period is achieved using 48 transistors in the delay chain.

fig. 9 noc delay

fig. 10 compares the frequency of oscillation obtained from simulation and from the analytical model as the number of oscillator stages varies. based on fig. 10, the delay is calculated and the frequency is obtained. table 2 compares the oscillator using the cicsi block with the existing oscillator and shows improved performance, such as a reduced number of stages, so that the power is reduced by nearly 50%. table 3 compares the frequency of oscillation of the different oscillators for different process corners. table 4 shows that, using a smaller number of transistors, a higher noc period can be obtained with the cicsi block. the noc and oscillators are an integral part of the switched capacitor circuits used in filter, eeg and ecg applications [20, 21]. fig. 
10 analytical & simulated delay

table 1 comparison of frequencies of different oscillators for different temperatures
temperature (°c) | frequency of cicsi ring oscillator (mhz) | frequency of 1 cmos inverter and 1 inverted cmos inverter (mhz)
-40 | 9.85 | 13.26
-20 | 12.22 | 14.82
0 | 14.44 | 15.98
20 | 16.45 | 16.97
27 | 17.08 | 17.15
40 | 18.20 | 17.92
60 | 19.69 | 18.24
80 | 20.87 | 18.64
100 | 21.82 | 18.88

table 2 the important parameters of different oscillator blocks
parameter | cicsi oscillator | 1 cmos inverter and 1 inverted cmos inverter
frequency | 17.08 mhz | 17.15 mhz
no. of stages | 101 | 313
transistor count | 610 | 1256
power | 419.2 µw | 822.94 µw

table 3 the comparison of the frequency of different oscillators for different processes
process | frequency of cicsi ring oscillator (mhz) | frequency of 1 cmos inverter and 1 inverted cmos inverter (mhz)
fs | 4.59 | 5.63
ss | 12.05 | 11.92
nn | 17.08 | 17.15
ff | 20.85 | 21.77
sf | 31.56 | 26.81

table 4 the comparison of noc period of different delay blocks
no. of transistors | tnoc of cicsi block (ns) | tnoc of 1 cmos inverter and 1 inverted cmos inverter (ns)
24 | 1.41 | 0.69
48 | 2.59 | 1.23
72 | 3.74 | 1.79
96 | 4.91 | 2.35
120 | 6.05 | 2.91

5. conclusion

this paper presents a delay block which generates a higher delay with the same number of mos transistors as previously reported designs. the delay expression for the proposed cicsi circuit is formulated. the delay block is used in a ring oscillator circuit, and simulation results show its power and area efficiencies. finally, a two phase non-overlapping clock is generated using the noc generator, which shows an improvement in terms of area.

acknowledgement: the authors would like to acknowledge the school of electronics engineering, kiit deemed to be university, bhubaneswar, for providing all the facilities for carrying out the work.

references

[1] p. e. allen, e. sanchez-sinencio, switched capacitor circuits, van nostrand reinhold, 1984. [2] roubik gregorian, gabor c. 
temes, analog mos & integrated circuits for signal processing, john wileysons, 1986. [3] david a. johns, ken martin, analog integrated circuit design, john wiley & sons, 1997. [4] b. razavi, “the ring oscillator [a circuit for all seasons]”, in ieee solid-state circuits magazine, vol. 11, no. 4, pp. 10-81, fall 2019. [5] g. jovanovic, m. stojcev, “a method for improvement stability of a cmos voltage controlled ring oscillators”, in proceedings of the icest2007, ohrid, june 2007, vol. 2, pp. 715–718. [6] r. s. s. m. r. krishna, g. l. madhumati and a. k. mal, “design of substantial delay block using voltage scaled cmos inverter and transmission gate blend”, in proceedings of the 2016 international conference on microelectronics, computing and communications (microcom), durgapur, 2016, pp. 1–6, [7] meng-liehsheu, ta-wei lin, wei-hung hsu, “wide frequency range voltage controlled ring oscillator based on transmission gates”, iscas 2005, vol. 3, pp. 2731–2734, may 2005. [8] t.jiang,j.yin,p.mak and r.p.c.martins, “a 0.5-v 0.4-to-1.6-ghz 8-phase bootstrap ring-vco using inherent non-overlapping clocks achieving a 162.2-dbc/hz fom”, ieee transactions on circuits and systems ii: express briefs, vol. 66, no. 2, pp. 157-161, feb. 2019. [9] f. pepe and p. andreani, “an accurate analysis of phase noise in cmos ring oscillators”, ieee transactions on circuits and systems ii: express briefs, vol. 66, no. 8, pp. 1292–1296, aug. 2019. [10] b.j. v. t. ferreira and c. galup-montoro, “ultra-low-voltage cmos ring oscillators”, electronics letters, vol. 55, no. 9, pp. 523–525, 2 5 2019. [11] b. s. salem, h. zandevakili, a. mahani and m. saneei, “fault-tolerant delay cell for ring oscillator application in 65 nm cmos technology”, iet circuits, devices & systems, vol. 12, no. 3, pp. 233–241, 2018. [12] b.x. yu, y. fang and z. shi, “2.5 mw 2.73 ghz non-overlapping multi-phase clock generator with duty-cycle correction in 0.13 µm cmos”, electronics letters, vol. 52, no. 14, pp. 
1261–1262, 2016. [13] p. angelov, m. nielsen-lönn and a. alvandpour, “ring-oscillator-based timing generator for ultralow-power applications”, in proceedings of the 2017 ieee nordic circuits and systems conference (norcas): norchip and international symposium of system-on-chip (soc), linkoping, pp. 1–4, 2017. [14] b.nowacki, n. paulino and j. goes, “a simple 1 ghz non-overlapping two-phase clock generators for sc circuits”, in proceedings of the 20th international conference mixed design of integrated circuits and systems mixdes 2013, gdynia, pp. 174–178. [15] b.l. minatiet al., “current-starved cross-coupled cmos inverter rings as versatile generators of chaotic and neural-like dynamics over multiple frequency decades”, ieee access, vol. 7, pp. 54638– 54657, 2019. [16] b.c. q. liu, y. cao and c. h. chang, “acro-puf: a low-power, reliable and aging-resilient current starved inverter-based ring oscillator physical unclonable function”, ieee transactions on circuits and systems i: regular papers, vol. 64, no. 12, pp. 3138–3149, dec. 2017. [17] a.k. mal, r. thodani,”non overlapping clock (noc) generator for low frequency switched capacitor circuits”, in proceedings of the students technology symposium 2011 ieee, pp. 226–231. [18] a. k. mal and r. todani, “non overlapping clock generator for switched capacitor circuits in bio-medical applications”, in proceedings of the 2011 international conference on signal processing, communication, computing and networking technologies, thuckafay, 2011, pp. 238–243. [19] s.m. kang, y. lebebici, cmos digital integrated circuits,analysis and design, mcgraw-hili publishing company limited, 2003. [20] n. n. singh and p. p. bansod, “switched-capacitor filter design for ecg application using 180nm cmos technology”, in proceedings of the 2017 international conference on recent innovations in signal processing and embedded systems (rise), bhopal, 2017, pp. 439–443. [21] a. 
karimi-bidhendi et al., "cmos ultralow power brain signal acquisition front-ends: design and human testing", ieee transactions on biomedical circuits and systems, vol. 11, no. 5, pp. 1111–1122, 2017.

facta universitatis series: electronics and energetics vol. 33, no 4, december 2020, pp. 553-569 https://doi.org/10.2298/fuee2004553f © 2020 by university of niš, serbia | creative commons license: cc by-nc-nd

design of dual rotor axial flux permanent magnet generators with ferrite and rare-earth magnets

jawad faiz¹, tohid asefi¹, mohammad azeem khan²
¹school of electrical and computer engineering, college of engineering, university of tehran, tehran, iran
²department of electrical engineering, university of cape town, cape town, south africa

abstract. this article addresses a dual rotor axial flux ferrite permanent magnet (pm) generator as an alternative to surface mounted and spoke type nd-fe-b generators with concentrated windings. the performance parameters of all generators, particularly the efficiency, are identical. the design objective function is the minimization of the generator mass using a population-based algorithm. to predict the performance of the generators, a finite element (fe) technique is applied. besides, the aims of the design include minimizing the cogging torque and examining different rotor pole topologies and different pole arc to pole pitch ratios. a three-dimensional fe technique is employed. it is shown that the surface mounted ferrite generator topology cannot develop the rated torque and also has a high torque ripple. in addition, it is heavier than the spoke type generator. furthermore, it is indicated that the spoke type ferrite pm generator has favorable performance and could be an alternative to rare-earth pm generators, particularly in wind energy applications. finally, the performance of the designed generators is experimentally verified. 
key words: axial flux, permanent magnet generator, dual rotor, finite element analysis, wind turbines, cogging torque, population-based algorithm

1. introduction

in recent years, permanent magnet (pm) machines have been used in different applications such as electric vehicle (ev) traction and wind energy generation. the price increase of rare-earth pms and the insecurity of their supply are two reasons for the interest in substituting these pms with alternative materials such as ferrites. ferrites are low cost materials that can substitute rare-earth pms [1, 2]. some companies such as abb have offered a ferrite pm wind generator called the "wind former" [3]. in the wind former, ferrite pms are mounted between the pole shoes of the generator. a high speed ferrite pm motor has been introduced in [4] as a substitute for an sm-co pm motor. a ferrite pm electric motor has been employed in [5] for evs, in which the no-load induced phase voltage is sinusoidal and the cogging torque is low [5], [6]. it is noted that the efficiency and power of pm machines are higher than those of induction machines. this is the main reason for substituting induction machines with pm machines [7]. a higher torque density can be obtained by axial flux pm (afpm) machines compared to radial flux and transverse flux pm machines [8], [9].

received december 26, 2019; received in revised form february 11, 2020. corresponding author: jawad faiz, electrical and computer engineering, college of engineering, university of tehran, tehran 1439957131, iran (e-mail: jfaiz@ut.ac.ir). *an earlier version of this paper was presented at the 14th international conference on applied electromagnetics (пес 2019), august 26-28, 2019, in niš, serbia [1].

the two more common radial flux pm machines are: 1. conventional radial flux (rfpm), 2. outer rotor radial flux (orrfpm). the five axial flux pm machines are: 1. double-stator axial flux (dsafpm), 2. 
double-rotor axial flux (drafpm), 3. single-sided stator balanced axial flux (sbafpm), 4. single-sided rotor balanced axial flux (rbafpm) and 5. toroidal wound axial flux machines (twafpm). the advantages of the drafpm are the highest torque density, power density and efficiency, and the lowest mass of pm and active materials [10]. this machine could be the most appropriate choice in applications requiring a high power and torque density. however, this holds for a specific low-power application and cannot be extended to all applications with high power and torque density. this paper proposes a drafpm machine with surface mounted and spoke type ferrite pms which is optimally designed by a population-based algorithm. the design objective function is the minimization of the mass of the machine. the 3d-fem is used for predicting the performance of the proposed generators. the topology of the generator is determined so as to minimize the cogging torque and torque ripple. the effects of the pole arc/pole pitch ratio (α) upon the flux leakage, cogging torque and torque ripple are investigated by sensitivity analysis. it is noted that the air gap clearance and the types of materials used (except the pms) are identical. it is shown that the torque ripple, cogging torque and total harmonic distortion (thd) of the induced voltage in the spoke type ferrite pm generator at full-load are improved compared to the surface mounted nd-fe-b machine. it is shown that the surface mounted ferrite machine cannot develop the rated electromagnetic torque. besides, it has a higher torque ripple and weight compared to the spoke type machine. therefore, the spoke type ferrite generator can be considered as a viable alternative to the rare-earth pm generators. finally, the performance of the designed generators is verified experimentally.

2. different topologies of generator

fig. 1 shows the proposed small wind generator topologies, which include the spoke type and surface mounted pm rotors with a laminated stator. 
these generators are analysed and the feasibility of substituting nd-fe-b pms with ferrite pms in axial flux type generators is investigated.

fig. 1 axial flux pm generator topologies: (a) ferrite surface mounted, (b) spoke type.

fig. 2 shows the norton magnetic equivalent circuits (mecs) of both generators and fig. 3 presents their flux paths. it is clear that two pms contribute to the pole flux in the spoke type generator, whereas four pms contribute partially to the pole flux in the surface mounted pm machine. the performance of the designed 400 w, 10-pole, 12-slot ferrite pm generator is compared with the corresponding surface mounted drafpm machine having nd-fe-b pms [11]. table 1 gives the main specifications of the nd-fe-b generator.

fig. 2 mec of pm generators: (a) surface mounted and (b) spoke type.
rr: reluctance of rotor pole shoe
rg: reluctance of air gap
rs: reluctance of stator tooth
pl: permeance of leakage flux
rm: reluctance of pm
fa: armature reaction
fm: total mmf drop
cφ: flux focusing factor

fig. 3 magnetic flux paths in pm generators: (a) surface mounted and (b) spoke type.

table 1 specifications of proposed generator
output power (w) | 400
line voltage (v) | 24
frequency (hz) | 50
rated speed (rpm) | 600
winding type | fractional slot concentrated winding (fscw)

for both pm arrangements, the mec is solved and the pm surface area needed to generate the required air gap magnetic flux density $B_g$ is estimated for the no-load spoke type machine ($F_a = 0$). the leakage coefficient $f_{lkg}$, defined in [12], is the ratio of the air gap flux to the magnet flux. referring to fig. 2b, the mec of the spoke type machine, $f_{lkg}$ is determined as follows:

\[ f_{lkg} = \frac{0.5\,\varphi_g}{\varphi_m} = \frac{0.5\,\varphi_g}{0.5\,\varphi_g + \varphi_l} \tag{1} \]
generally, the leakage coefficient is less than 1 and it depends on the configuration of the machine. the fem is applied to evaluate this coefficient in the design of the axial flux generators. fig. 2b shows clearly that a fraction $k$ of the total remanent magnetic flux,

\[ k = \frac{0.25\,P_g}{0.25\,P_g + P_m + P_l} \tag{2} \]

crosses the air gap, where $P_g = 1/R_g$. therefore, the air gap magnetic flux is estimated as follows:

\[ \frac{\varphi_g}{2} = \varphi_r\,\frac{0.25\,P_g}{0.25\,P_g + P_m + P_l} \tag{3} \]

the numerator and denominator of (3) are multiplied by the following magneto-motive force (mmf) drop:

\[ \mathrm{mmf} = F_m = 2F_g + F_s + 2F_r \tag{4} \]

and the flux is determined as follows, where $\varphi_{Fm} = P_m F_m$ denotes the flux through the internal permeance of the magnet:

\[ \frac{\varphi_g}{2} = \varphi_r\,\frac{0.5\,\varphi_g}{0.5\,\varphi_g + \varphi_{Fm} + \varphi_l} \tag{5} \]

dividing the numerator and denominator of (5) by $(0.5\,\varphi_g + \varphi_l)$ and using (1) gives:

\[ \frac{\varphi_g}{2} = \varphi_r\,\frac{0.5\,\varphi_g / (0.5\,\varphi_g + \varphi_l)}{(0.5\,\varphi_g + \varphi_{Fm} + \varphi_l)/(0.5\,\varphi_g + \varphi_l)} \tag{6} \]

\[ \frac{\varphi_g}{2} = \varphi_r\,\frac{f_{lkg}}{1 + f_{lkg}\,\dfrac{\varphi_{Fm}}{0.5\,\varphi_g}} \tag{7} \]

with $0.5\,\varphi_g = F_m/(4R_g)$ and

\[ \varphi_{Fm} = P_m F_m \tag{8} \]

so:

\[ \frac{\varphi_g}{2} = \varphi_r\,\frac{f_{lkg}}{1 + 4 f_{lkg} P_m R_g} \tag{9} \]

as a result, the pm surface area for generating the magnetic flux $\varphi_g$ in the air gap is estimated by substituting $\varphi_r = B_r A_m$, $\varphi_g = B_g A_g$, $P_m = \mu_0 \mu_{rec} A_m / l_m$ and $R_g = l'_g/(\mu_0 A_g)$ in (9) as follows:

\[ A_m = \frac{B_g A_g}{2 f_{lkg} B_r - 4 \mu_{rec} f_{lkg} B_g\, l'_g / l_m} \tag{10} \]

where $l'_g = k_c l_g$, $l_m$ is the magnet thickness and $k_c$ is carter's coefficient. the same procedure is applied to the surface mounted generator and $A_m$ is obtained as follows:

\[ A_m = \frac{B_g A_g}{f_{lkg} B_r - \mu_{rec} f_{lkg} B_g\, l'_g / l_m} \tag{11} \]

fig. 4 variations of air gap magnetic flux density versus pm thickness (horizontal axis: air gap flux density (t); vertical axis: magnet thickness (mm)).

as shown in fig. 4, $l_m$ for the surface mounted ferrite generator is calculated as follows:

\[ l_m = \frac{\mu_{rec}\, l'_g}{B_r/B_g - 1/(c_\varphi f_{lkg})} \tag{12} \]

which indicates that $l_m$ depends on the air gap magnetic flux density $B_g$. to achieve a higher power density and simplify prototyping, a concentrated winding configuration is used [13].

3. design optimization of designed generator

the parameters of the machines are optimized using a population-based algorithm to achieve the minimum total mass of active materials, whilst keeping the overall efficiency identical with that of the nd-fe-b machine. the objective function is as follows:

\[ f_{new}(x) = w_{loss}\,\frac{f_{loss}(x)}{loss_{min}} + w_{mass}\,\frac{f_{mass}(x)}{mass_{min}} \tag{13} \]

where $loss_{min}$ is the minimum loss, $mass_{min}$ is the minimum mass, $f_{loss}(x)$ is the loss function, $f_{mass}(x)$ is the mass function, $w_{loss}$ is the loss weighting factor and $w_{mass}$ is the mass weighting factor, such that $w_{loss} + w_{mass} = 1$. the same constraints are used throughout the optimization process. the variables during the optimization process include:
1. ratio of stator inner to outer diameter, di/do,
2. ratio of slot width to slot pitch, ws/τs,
3. air gap magnetic flux density, bg,
4. specific electric loading, as,
5. stator current density, js,
6. magnet grade (residual flux density).

the flkg is determined by the fem; considering the flux focusing factor cφ = am/ag, the bg estimated by (10) may not be increased beyond a limit, as shown in fig. 4. this means that even for infinite lm, the bg of the surface mounted ferrite generator would not exceed 0.27 t, which can be considered a major drawback of the machine. for minimum mass of the generator, the air gap flux density is 0.24 t, which is lower than that of the other topology. on the other hand, the surface mounted ferrite generator is heavier, its axial length is longer, and its outer diameter is larger than that of the spoke type machine. therefore, the surface mounted topology is less viable as a replacement for the nd-fe-b machine. table 2 and table 3 present the optimization variables and specifications of the designed drafpm machines.
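the saturation of bg with magnet thickness noted above can be read directly from (12). the following worked limit is only a sketch: it assumes that (12) has the form $l_m = \mu_{rec} l'_g / (B_r/B_g - 1/(c_\varphi f_{lkg}))$, and the product $c_\varphi f_{lkg} \approx 0.66$ is inferred here from the quoted 0.27 t limit and the ferrite grade br = 0.41 t; it is not a value reported in the paper.

```latex
l_m \to \infty
\;\Longleftrightarrow\;
\frac{B_r}{B_g} - \frac{1}{c_\varphi f_{lkg}} \to 0
\quad\Longrightarrow\quad
B_{g,\max} = c_\varphi\, f_{lkg}\, B_r ,
\qquad
c_\varphi f_{lkg} \approx \frac{0.27\ \mathrm{T}}{0.41\ \mathrm{T}} \approx 0.66 .
```

under this reading, no amount of additional magnet thickness can raise the air gap flux density above the remanence scaled by the flux focusing factor and the leakage coefficient, which is why the surface mounted ferrite design is limited while the spoke type design (with a larger cφ) is not.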
the optimum result is obtained after approximately 50 generations of the optimization process; however, the number of iterations is set to 450 to avoid being trapped in a local minimum.

4. finite element analysis

3d-fem is applied to analyse the performance of the generators.

4.1. magnetic flux density and back-emf

the phase back-emf of the drafpm generator can be calculated by [14]: π α ω 2 m lkg f ph w r av m s e f k n k d b l (14) where dav = (do + di) / 2, ls = (do − di) / 2, kf is the distribution coefficient of the air gap magnetic flux density and kw is the winding factor. fig. 5 and fig. 6 present the no-load flux density and back-emf of the generator, respectively. table 4 gives the thd of the back-emf for the different types of pm machines.

table 2 optimization variables
quantity | variable | min. | max.
di/do | kd | 0.50 | 0.85
ws/τs | -- | 0.50 | 0.70
air-gap magnetic flux density (t) | bg | 0.20 | 0.75
specific electric loading (ka/m) | as | 9 | 25
stator current density (a/mm²) | js | 3 | 8
pm grade (t) | br | 0.30 | 0.41

table 3 specifications of proposed drafpm generator
generator | nd-fe-b [10] | ferrite spoke type | surface mounted ferrite
magnet grade | n33, br: 1.13 t | y33, br: 0.41 t (ferrite designs)
active material (kg) | 7.40 | 8.46 | 9.54
total axial length (mm) | 81.5 | 96 | 110
outer diameter (mm) | 180 | 170 | 215
pole arc/pole pitch | 0.7 | 0.3 | 0.7
electric loading (a/m) | 9885.8 | 10632 | 13557
current density (a/mm²) | 5 | 5.28 | 5.56
flux density of air gap bg (t) | 0.593 | 0.42 | 0.24
thd of back emf (%) | 6.61 | 1.52 | 5.6

table 4 thd of back-emf of pm machines
type of pm machine | thd of back-emf (%)
surface mounted nd-fe-b | 6.61
ferrite spoke type | 1.52
surface mounted ferrite | 5.6

fig. 5 no-load magnetic flux density/pole pitch of generator: (a) ferrite spoke type, (b) nd-fe-b.

fig. 6 time variations of phase back emf (nd-fe-b, ferrite spoke type, ferrite surface mounted).

4.2. full load torque and cogging torque

the cogging torque and the torque component for the case of no armature excitation may be determined by fem at several rotor positions under the no-load condition. fine meshes are necessary in this case. the cogging torque is proportional to $B_g^2$ and is calculated as follows [15]:

\[ T_{cog}(\theta) = -\frac{\partial W(\theta)}{\partial \theta} = -\frac{\partial}{\partial \theta}\left[\frac{1}{2\mu_0}\int_V B_g^2(\theta)\,dV\right] \tag{15} \]

the cogging torque in the surface mounted ferrite machine is small, because its air gap magnetic flux density has the minimum value. however, its torque ripple is too high and the machine is unable to develop the rated torque. the reason is that the rotor magnetic flux excitation is weaker than the on-load armature reaction [10]. the developed torque is calculated as follows: 2 3 α η 4 lkg f d w o m m t f k k k d b a (16) fig. 7 presents the developed torque and cogging torque against the angular position of the rotor.

fig. 7 full-load torque and cogging torque of proposed generators.

4.3. losses and efficiency

the power losses in electric machines consist of core losses (∆pe), copper losses (∆pw) and rotational losses (∆prot). therefore, the total power losses of an afpm machine are as follows:

\[ \Delta P = \Delta P_w + \Delta P_e + \Delta P_{rot} \tag{17} \]

\[ \Delta P_e = \Delta P_{fe} + \Delta P_{pm} \tag{18} \]

\[ \Delta P_{rot} = \Delta P_{fr} + \Delta P_{wind} + \Delta P_{vent} \tag{19} \]

where ∆pfe denotes the core (stator and rotor) losses, ∆ppm the pm losses, and ∆pfr, ∆pwind and ∆pvent the friction, windage and ventilation losses, respectively. the eddy current losses in the pms have a direct impact on the heat generation, the rotor temperature rise and the efficiency of the machine [16]. the resistivity of the nd-fe-b pm is in the range of 110-170×10^-8 ωm, and the resistivity of ferrite pms is approximately 1000×10^-8 ωm. ferrite pms are electrically insulating, so practically no electric current passes through them. owing to this relatively high resistance, they are called ceramic pms.
therefore, the ferrite pm losses can be neglected for the spoke type generator [17]. the core losses in the surface mounted ferrite generator are higher than those of the spoke type and, along with the higher copper losses, decrease the overall efficiency of the generator. the loss components of the generators are compared in table 5, in which the efficiency is estimated by pout / (pout + δp).

table 5 comparison of performance of generators
generator | nd-fe-b [9] | spoke type ferrite | surface mounted ferrite
electromagnetic torque (nm) | 8.71 | 8.67 | 6.78
torque ripple (nm) | 1.073 | 0.44 | 1.16
cogging torque (nm) | 1.563 | 0.426 | 0.345
copper losses (w) | 47 | 51 | 64
iron losses (w) | 16.3 | 12.1 | 18.1
efficiency @ 600 rpm (%) | 83 | 82 | 75

4.4. tapered pole machine

fig. 8a presents the pm magnetic flux density distribution in a non-tapered radial sided pole machine. it is clear that the pole is partially saturated; therefore, the iron of the rotor has not been optimally used. fig. 8b shows the tapered pole, which makes it possible to use the iron optimally. the peak-to-peak cogging torque drops to 205 mnm; the reason is the saturation of the pole. as shown in fig. 7, the torque ripple rises due to the non-linear behaviour of the core material, and the electromagnetic torque is not developed. on the other hand, the generator active material is reduced from 8.45 kg to 7.55 kg. fig. 9 exhibits the full load torques and cogging torques of the tapered parallel and non-tapered radial sided ferrite generators.

fig. 7 (a) flux concentration in radial sided pole and (b) tapered parallel sided pole.

fig. 8 (a) full load torques, and (b) cogging torques of tapered parallel and non-tapered radial sided ferrite generators.

4.5. minimization of cogging torque

there are different cogging torque waveforms for various machines. these waveforms contain valuable information about the machines.
the cogging torque can be estimated by a quasi-3d multi-slice model as follows: 3 3 ( , ) ( , ) ( , ) ( , ) 10 1 4 2 3 1 ( ) θ 3μ ( ) c cs isn cog iii s iii s iii s iii s s n n n n n r r t a a a a (20)

many attempts have been made to address the cogging torque in axial flux machines [18]. however, to analyse the cogging torque of the spoke type generators, fem is the most efficient and reliable method. greater distortion in the cogging torque produces a higher thd in the resulting back emf waveform. it is evident from the resultant cogging torques shown in fig. 10 that a smaller pole arc/pole pitch ratio leads to a more sinusoidal cogging torque waveform. cogging torque waveforms for different pole arc/pole pitch ratios (0.3, 0.4, 0.5, 0.6 and 0.75) are investigated, and those for the ratios of 0.3 and 0.75 are shown in fig. 9 for the spoke type generator. the ratio of 0.3 has the highest peak-to-peak cogging torque and the minimum cogging torque is obtained for the ratio of 0.75. cogging torque for ratios higher than 0.75 has not been investigated, because such ratios result in large inter-polar flux leakage, which leads to a high magnetic flux density in the rotor pole body and an improper flux distribution over the whole stator. the improper air gap flux distribution at α=0.75 is seen clearly from the cogging torque waveform. also, the pm flux leakage is calculated for each value of α. due to the larger inter-polar flux leakage for higher α, the flux leakage increases from 15.4% at α=0.3 to 17.2% at α=0.75. in table 6, φm1 and φm2 are the fluxes through the pms contributing to the flux of one pole in the spoke type generator and φagap is the air gap flux through one pole.

fig. 9 higher α increases cogging torque distortion.
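the leakage figures quoted above can be cross-checked against the per-pole fluxes of table 6. the short python sketch below assumes that the leakage coefficient is the ratio of the air gap flux to the sum of the two magnet fluxes, flkg = φagap / (φm1 + φm2); this relation is an assumption made here, chosen because it reproduces the tabulated flkg row, and the names `table6` and `leakage` are illustrative only.

```python
# per-pole fluxes in mWb copied from table 6: alpha -> (phi_m1, phi_m2, phi_agap)
table6 = {
    0.3: (0.2979, 0.2977, 0.5038),
    0.5: (0.2979, 0.2976, 0.4967),
    0.675: (0.2974, 0.2980, 0.4955),
    0.75: (0.2969, 0.2969, 0.4907),
}


def leakage(phi_m1, phi_m2, phi_agap):
    """Return (f_lkg, leakage in percent), assuming f_lkg = phi_agap / (phi_m1 + phi_m2)."""
    f_lkg = phi_agap / (phi_m1 + phi_m2)
    return f_lkg, 100.0 * (1.0 - f_lkg)


results = {alpha: leakage(*fluxes) for alpha, fluxes in table6.items()}
for alpha, (f_lkg, percent) in results.items():
    print(f"alpha={alpha}: f_lkg={f_lkg:.4f}, leakage={percent:.1f} %")
```

under this assumption the computed flkg values match the tabulated row to four decimal places, while the leakage percentages obtained as 1 − flkg agree with the tabulated φp only to within about 0.2 percentage points, so the paper may have computed φp slightly differently.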
table 6 flux leakages for different α
α | 0.3 | 0.5 | 0.675 | 0.75
φm1 (mwb) | 0.2979 | 0.2979 | 0.2974 | 0.2969
φm2 (mwb) | 0.2977 | 0.2976 | 0.2980 | 0.2969
φagap (mwb) | 0.5038 | 0.4967 | 0.4955 | 0.4907
φp (%) | 15.4 | 16.6 | 16.7 | 17.2
flkg | 0.8459 | 0.8341 | 0.8322 | 0.8264

5. experimental verification

it is noted that the prototyped spoke type generator is lighter than the surface mounted ferrite generator, has lower torque ripple and is able to develop the rated torque.

5.1. structure of rotor

fig. 10a shows the rotor of the spoke type ferrite machine. aluminium latches are used to retain the circumferentially magnetized magnets between the steel parts, which eliminates the need for adhesives. to minimize vibration and noise, the parts of the rotor must be carefully assembled with good tolerances.

5.2. structure of stator

as shown in fig. 10b, it is possible to wind and assemble the stator easily and quickly due to the modular stator teeth and aluminium. to block the path of eddy currents in the stator teeth, only one bolt and nut is employed to keep the laminations stacked. fig. 10c presents the complete generator assembly. fig. 11 exhibits the test rig of the generator. a 10 nm torque transducer with 1% accuracy, coupled between the generator and a 3 kw servo motor, was used to measure the shaft torque. a data acquisition system and a power analyser were utilized to monitor the power signals. as shown in fig. 12, it is necessary to analyse the vibration of the heavier and more complicated rotor of the spoke type ferrite generator. structural mechanical fe analysis was used to analyse the whole test rig. to estimate the dynamics of the rotor, a modal analysis was conducted to determine the air gap unbalance.

fig. 10 (a) rotor structure, (b) stator assembly, (c) generator assembly, and (d) dummy stator for measuring friction and windage losses.

fig. 11 test rig: (1) generator under test, (2) 10 nm in-line torque transducer, (3) 3 kw servo motor, (4) resistor bank, (5) servo control unit and ni cdaq-9174 data acquisition system, (6) oscilloscope.

fig. 12 vibration analysis of whole test rig.

it was found that the total deformation of the rotor is 24 µm. the generator air gap clearance is 1.5 mm, so the above-mentioned deformation results in a total air gap unbalance of 1.6%. as fig. 13 shows, it leads to no-load voltage unbalance. it is noted that the mechanical assembly, part tolerances and mechanical imperfections exacerbate the unbalance in the induced voltage.

5.3. thermal behaviour of generator

the thermal behaviour of the generator is predicted in the full load and open-circuit conditions with natural air convection. the electric loading and current density of the spoke type ferrite generator are increased by 7.5 % and 5.6 %, respectively, compared to the nd-fe-b generator. the generator hotspot must be within the safe margin of the generator's insulation class. a better insulation class would increase the total cost of the generator, which is contrary to the purpose of the project. the fe technique has been chosen for thermal analysis in view of the 3d structure of the proposed generator and the precise results of 3d cfd. although the electric loading and current density are increased, the spoke construction of the rotor increases the heat transfer coefficient of the surfaces in the vicinity of the air gap and of the winding, which counteracts the increase of electric loading and current density and results in a lower hotspot temperature in full load operation. fig. 14 shows the generator steady-state temperature at full load when the environment temperature is 40 ℃. it verifies the 3d fem results in fig. 15, with 3 % error. this error includes the error of the thermal camera image and of the calculations.
the temperature rise in different parts of the generator has been obtained from the thermal analysis and also from hotspot temperature measurements of the winding. the comparison with the temperature sensors in fig. 16 shows an error of 5 %, which comprises the error of the calculations and of the readings from the pt100 sensors.

fig. 13 unbalanced three-phase experimental no-load voltages.

fig. 14 thermal image of generator at full load.

5.4. test results

in this section the experimental results are presented and compared to the results of the analytical design and the numerical analysis for the spoke type ferrite pm generator. fig. 17 shows the measured and numerically calculated back-emf. the total thd of the back-emf is 1.52 %. the fea results for the designed generator agree well with the experimental results. in particular, the dirichlet and neumann boundary conditions were imposed, and the air gap region was meshed precisely, to calculate the cogging torque.

fig. 15 steady-state thermal analysis of generator at full load with 40 ℃ environment temperature.

with the generator at no load, the total no-load losses are measured. the dummy plastic stator of fig. 10d was used to measure the windage and friction losses. then, the core losses are determined by deducting the windage and friction losses from the total no-load losses. to eliminate the effect of windage and friction on the cogging torque, the rotor is rotated in small steps of 0.01 mechanical degrees. the torque transducer signal is then conditioned by means of an fft, and the lower and higher frequencies are filtered out. fig. 18a clearly shows the generator cogging torque and its oscillations with a mechanical period of 360°/lcm(2p, qs) = 360°/lcm(10, 12) = 6°.

fig. 16 temperature rise in different parts of generator (cfd) and hotspot from pt100 sensors in windings.

fig. 17 phase back emf: phase voltage (v) versus time (s), fem and experiment.

fig. 18 (a) cogging torque, (b) efficiency map.

table 7 experimental results comparison of alternators at 600 rpm
generator | nd-fe-b | spoke type ferrite
cogging torque (nm) | 0.52 | 0.23
copper losses (w) | 48 | 54
iron losses (w) | 19.35 | 15.63
max. efficiency (%) | 80.8 @ 8.6 a | 80.3 @ 8.7 a

phase back emf is due to asymmetries in the mechanical structure and assembly imperfections. finally, fig. 18b shows the efficiency map of the test machine, where the measured results are compared to the analytical design. this map indicates that the maximum efficiency of the generator occurs at 87 % of the rated power, i.e. a phase current of 8.7 a, at a shaft speed of 600 rpm. in table 7 the experimental results of the nd-fe-b and ferrite generators are compared.

6. conclusion

in this paper, a comparative study was conducted on the impact of ferrite and nd-fe-b magnets upon the performance of a high pole number drafpm wind generator. in particular, a surface mounted and a spoke type topology with low cost ferrite pms were compared with a surface mounted nd-fe-b pm topology. the designed generators were optimized to achieve minimum weight using a population-based algorithm. extensive fea was conducted to further optimize the machine topology with ferrite pms. experimental results verified the design and fea results. it was shown that the spoke type ferrite pm topology is a viable alternative to the conventional nd-fe-b generator and has lower thd in its open circuit voltage, lower cogging torque and lower torque ripple. however, its active material mass was 16% more and its induced voltages were more unbalanced.
it was also found that the surface mounted topology is unable to develop the nominal electromagnetic torque, and that its torque ripple is higher and its efficiency is lower than those of the spoke type generator.

references

[1] j. faiz, t. asefi and m. a. khan, "design of axial flux ferrite permanent magnet synchronous generator", in proceedings of the 14th international conference on applied electromagnetics (пес 2019), august 26-28, 2019, niš, serbia.
[2] t. asefi, j. faiz and m. a. khan, "design of dual rotor axial flux permanent magnet generators with ferrite and rare-earth magnets", in proceedings of the ieee 18th international power electronics and motion control conference (pemc), budapest, hungary, 2018.
[3] m. dahlgren, h. frank, m. leijon and f. owman, "wind former; wind power goes large scale", abb rev., no. 3, pp. 31-37, 2000.
[4] k. h. kim, h. i. park, s. m. jang, d. j. you and j. y. choi, "comparative study of electromagnetic performance of high-speed synchronous motors with rare-earth and ferrite permanent magnets", ieee trans. magn., vol. 52, no. 7, pp. 1–4, jul. 2016.
[5] s. chung, s. h. moon, d. j. kim and j. m. kim, "development of a 20-pole–24-slot spmsm with consequent pole rotor for in-wheel direct drive", ieee trans. ind. electron., vol. 63, no. 1, pp. 302–309, jan. 2016.
[6] m. kimiabeigi, j. d. widmer, r. long, y. gao, j. goss, r. martin, t. lisle, j. m. soler vizan, a. michaelides and b. mecrow, "high-performance low-cost electric motor for electric vehicles using ferrite magnets", ieee trans. ind. electron., vol. 63, no. 1, pp. 113–122, jan. 2016.
[7] i. petrov and j. pyrhönen, "performance of low-cost permanent magnet material in pm synchronous machines", ieee trans. ind. electron., vol. 60, no. 6, pp. 2131–2138, june 2013.
[8] y. chen, p. pillay and m. khan, "pm wind generator comparison of different topologies", ieee trans. ind. appl., vol. 41, no. 6, pp. 1619–1626, nov. 2005.
[9] m. a. khan, p. pillay and k. visser, "on adapting a small pm wind generator for a multi-blade, high solidity wind turbine", ieee trans. energy convers., vol. 20, no. 3, pp. 685–692, sep. 2005.
[10] h. jagau, m. a. khan and p. s. barendse, "design of a sustainable wind generator system using redundant materials", ieee trans. ind. appl., vol. 48, no. 6, pp. 1827–1837, nov./dec. 2012.
[11] j. wanjiku, h. jagau and m. a. khan, "a simple core structure for small axial-flux pmsgs", in ieee international electric machines & drives conference (iemdc), niagara falls, may 2011.
[12] j. r. hendershot and t. j. e. miller, "design of brushless permanent-magnet motors", usa: oxford university press, 1995.
[13] j. f. gieras, r. wang and m. j. kamper, "axial flux permanent magnet brushless machines", 2nd ed., springer, 2008.
[14] w. zhao, t. a. lipo and b. kwon, "comparative study on novel dual stator radial flux and axial flux permanent magnet motors with ferrite magnets for traction application", ieee trans. magn., vol. 50, no. 11, pp. 1–4, nov. 2014.
[15] h. tiegna, a. bellara, y. amara and g. barakat, "analytical modeling of the open-circuit magnetic field in axial flux permanent-magnet machines with semi-closed slots", ieee trans. magn., vol. 48, no. 3, pp. 1212–1226, sep. 2012.
[16] d. ishak, z. q. zhu and d. howe, "performance variation of ferrite magnet pmbldc motor with temperature", ieee trans. magn., vol. 51, no. 12, pp. 95–103, dec. 2015.
[17] p. zhang, g. y. sizov, j. he, d. m. ionel and n. a. o. demerdash, "calculation of magnet losses in concentrated-winding permanent-magnet synchronous machines using a computationally efficient finite-element method", ieee trans. ind. appl., vol. 49, no. 6, pp. 1495–1504, nov./dec. 2013.
[18] j. g. wanjiku, m. khan, p. barendse and p. pillay, "influence of slot openings and tooth profile on cogging torque in axial-flux pm machines", ieee trans. ind. electron., vol. 62, no. 12, pp. 7578–7589, dec. 2015.

facta universitatis series: electronics and energetics vol.
33, no 1, march 2020, pp. 15-26
https://doi.org/10.2298/fuee2001015o
© 2020 by university of niš, serbia | creative commons license: cc by-nc-nd

noise shaping in sar adc

dmitry osipov¹, aleksandr gusev¹, vitaly shumikhin², steffen paul¹
¹ institute of electrodynamics and microelectronics, university of bremen, bremen, germany
² asic lab, national research nuclear university mephi, moscow, russia

abstract. the successive approximation register (sar) analog-to-digital converter (adc) is currently the most popular type of adc architecture, owing to its power efficiency. sar adcs are also used in multichannel systems, where power efficiency is of high importance because of the large number of simultaneously working channels. however, the sar adc architecture is not the most area efficient. in sar adcs, a binary weighted capacitive digital-to-analog converter (dac) is used, which means that one additional bit of resolution doubles the area. oversampling and noise shaping are methods that allow an increase in resolution without an increase of area. in this paper we present new sar adc architectures with noise shaping. a first-order noise transfer function (ntf) with a zero located nearly at one can be achieved. we propose two modifications of the architecture: one with a zero-only ntf and one with an ntf with an additional pole. the additional pole theoretically increases the efficiency of noise shaping by a further 3 db. the architectures were applied to the design of sar adcs in a 65 nm complementary metal-oxide semiconductor (cmos) process with an osr equal to 10. a 6-bit capacitive dac was used. the proposed architectures provide nearly 4 additional bits in enob. the equivalent input bandwidth is equal to 200 khz with the sampling rate equal to 4 ms/s.

key words: sar adc, noise shaping, fom.

1. introduction

sar adcs with a capacitive dac in the feedback loop are currently the most popular type of adcs.
the general benefit of this architecture is its power efficiency. some of the recently published sar adcs can achieve an fom of several fj/conv.-step [1]. to provide the required accuracy, a binary weighted capacitive dac is usually employed. the matching of capacitors has a significant influence on the characteristics of sar adcs [2]. to improve the matching, the best option is to use metal-insulator-metal (mim) capacitors, which have relatively high capacitance and area. other types of capacitors, such as metal-oxide-metal (mom), have also been used to design sar adcs, but because of their poor matching additional techniques (for example, calibration [3] or dithering [4]) have to be used to provide the required precision, even to achieve 10 bit linearity. larger mim capacitors provide good matching, but the final layout of the adc requires a large chip area. furthermore, the larger capacitance leads to higher dac power consumption, for two reasons: first, because of the energy drawn from the reference source to switch the capacitors; second, because of the power dissipated in the larger switches. the first problem can be solved by using an advanced switching scheme, such as those previously proposed in [5] and [6]. the second problem has no known solution, except for a decrease of the dac capacitance. noise shaping and oversampling are alternative solutions, which allow a decrease in the number of capacitors in the dac for the same adc resolution. this field is relatively new, with the first important work conducted at the beginning of the 2010s.

received november 4, 2019
corresponding author: dmitry osipov, institute of electrodynamics and microelectronics (item), university of bremen, 28359 bremen, germany (e-mail: osipov@uni-bremen.de)

2. review of state-of-the-art noise shaping sar adc architectures

the basic idea of noise shaping can be described as follows.
the digital output d(n) of an adc can be expressed as

\[ d(n) = x(n) + q(n) \]

where x(n) is the analog input value and q(n) is the quantization error. in conventional adcs, the q(n) value is not used. the basic idea of noise shaping is to combine the quantization error of the previous sample with the current sample, for example as

\[ d(n) = x(n) + q(n) - q(n-1) \]

in this case, the transfer function of the adc can be expressed as

\[ D(z) = X(z) + (1 - z^{-1})\,Q(z) \]

it is evident that the quantization error is filtered by a first-order high-pass filter. the quantization error power is moved to the high frequency part of the spectrum, meaning that the sinad of the adc increases in the low frequency part of the spectrum. noise shaping only makes sense if it is used with oversampling. the most obvious way to implement noise shaping is to store the remainder of the last conversion on the dac while sampling the current sample. a noise-shaping scheme realizing this approach is described in [7]. the simplified circuit is shown in fig. 1(a). in this approach, an active amplifier is used to feed the residue of the previous sample back to the dac. this schematic allows the dac area to be lowered, but it utilizes an active amplifier in the feedback loop, which renders it impractical for low power applications. fully passive noise shaping is a more promising approach. most such schemes use a four-input comparator to add the previous sample to the converted input [8]. similar designs were presented in [9, 10], but in these works an ota is used for residue filtering, so these schemes cannot be called fully passive. these adcs achieve foms of several to several tens of fj/conv.-step, while providing an effective resolution of 10-12 bits with an effective bandwidth of several ms/s. all these schemes utilize a four-input (or fully differential) comparator. the first two differential inputs are connected to the dac, and the second two to the output of the filter which processes the residue (see fig. 1(b)).
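the first-order error-feedback relation above, d(n) = x(n) + q(n) − q(n−1), can be illustrated with a few lines of python. this is a minimal behavioural sketch, assuming an ideal 1-lsb rounding quantizer and the initial condition q(−1) = 0; the function name and the test signal are illustrative only and do not model any of the cited circuits.

```python
import math


def error_feedback_quantize(samples):
    """First-order noise shaping: d(n) = x(n) + q(n) - q(n-1)."""
    codes, errors = [], []
    q_prev = 0.0  # assumed initial condition q(-1) = 0
    for x in samples:
        v = x - q_prev            # subtract the previous quantization error
        d = math.floor(v + 0.5)   # ideal 1-LSB quantizer (round to nearest)
        q = d - v                 # quantization error of this conversion
        codes.append(d)
        errors.append(q)
        q_prev = q
    return codes, errors


# slowly varying (oversampled) input, expressed in LSB
x = [3.3 + 0.4 * math.sin(2 * math.pi * n / 200) for n in range(400)]
d, q = error_feedback_quantize(x)

# the accumulated error telescopes to q(n) - q(-1), so it stays within 0.5 LSB
running_error, worst = 0.0, 0.0
for xn, dn in zip(x, d):
    running_error += dn - xn
    worst = max(worst, abs(running_error))
print(worst)
```

because the accumulated error never grows, the low-frequency content of the output tracks the input closely, which is the time-domain counterpart of the high-pass factor (1 − z⁻¹) applied to the quantization error.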
fig. 1 typical noise shaping sar architectures: (a) – with active filtering of quantization error; (b) – fully passive.

in 2019, we also proposed two fully passive noise shaping schemes [11, 12], which are based on a completely different principle. in our schematic we do not use a four-input comparator. these schemes will be described in detail in the next sections of this paper. the dac capacitors and the residue filter capacitor are connected to each other through the attenuation capacitor (see fig. 2). after the charge redistribution, the new voltage at the dac equals the attenuated sample plus the filtered residue of the previous sample, so the usual two-input comparator can be used. furthermore, our schematic allows the most efficient noise shaping to be implemented, as the zero of its noise transfer function (ntf) lies very near to one, at 0.943 to be exact. in competing designs the zero is located between 0.5 and 0.75 (see fig. 3), so it does not make sense to sample with osrs higher than 8, as no gain in enob can be achieved. both our noise shaping schemes allow the use of higher osrs, up to 15. this schematic allows competitive sar adcs to be designed.

fig. 2 proposed fully passive noise-shaping sar adc architecture.

fig. 3 comparison of state-of-the-art fully passive noise-shaping sar adcs. both proposed schemes allow the use of an osr higher than 10.

3. proposed noise shaping sar architecture

the proposed noise shaping architecture is shown in fig. 2. depending on the connection of the attenuation capacitor after the conversion phase, this architecture can be split into two sub-architectures: one implements the zero-only ntf, the other has an ntf with an additional pole. the pole influences the stability of the system, but also increases the attenuation of the noise in the frequency band of interest.
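the benefit of moving the ntf zero closer to one can be estimated with a short calculation. the python sketch below assumes an idealized model in which the quantization noise is white and the only shaping is ntf(z) = 1 − a·z⁻¹; it integrates |ntf|² over the signal band 0 ≤ ω ≤ π/osr and ignores thermal and comparator noise. the function name is illustrative only.

```python
import math


def inband_noise_gain_db(zero, osr):
    """SQNR gain (dB) over a Nyquist-rate converter for NTF(z) = 1 - zero * z^-1.

    Uses |NTF|^2 = 1 + zero^2 - 2*zero*cos(w), integrated over 0 <= w <= pi/osr,
    under the assumption of white quantization noise.
    """
    band = math.pi / osr
    inband_fraction = ((1 + zero ** 2) * band - 2 * zero * math.sin(band)) / math.pi
    return -10 * math.log10(inband_fraction)


for zero in (0.5, 0.75, 0.943):
    gains = {osr: inband_noise_gain_db(zero, osr) for osr in (4, 8, 10, 15)}
    # 6.02 dB of SQNR corresponds to one bit of resolution
    print(zero, {osr: round(gain / 6.02, 2) for osr, gain in gains.items()})
```

under this simplified model a zero at 0.943 with an osr of 10 yields roughly 4 bits of improvement, in line with the "nearly 4 additional bits" quoted in the abstract, while zeros at 0.5 to 0.75 yield noticeably less at the same osr.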
The circuit uses additional capacitors equal to BC0 and CC0, where B and C are integers, and 8 switches, which can be implemented as single n-channel MOS (NMOS) transistors. Additionally, the most-significant-bit (MSB) capacitor should be equal to $2^{N-1}$, where N is the DAC's resolution. So the DAC is twice as big as in the standard monotonic switching SAR architecture [14]. The additional capacitor CC0 is needed to obtain the value of the quantization error e(n) at the end of conversion, as in the noise-shaping architecture presented in [15].

3.1. Architecture without the pole in NTF

First, we consider the architecture without the pole in the NTF. In this case, the digital control logic does not differ much from the standard SAR logic. The additional switches are operated with the sampling signal and its inverse. During sampling, the capacitors BC0 are discharged to ground, and the comparator and CC0 are disconnected from the capacitive DAC. The CC0 capacitor holds the remainder voltage of the previous conversion. During the conversion phase, the DAC is connected to the comparator and CC0 through the discharged attenuation capacitor BC0. The working principle is shown in Fig. 4. Only one side of the DAC is shown for simplicity. The following equations can be written for the charge redistribution after the end of the sampling phase (b in Fig. 4):

$$2^N C_0\,[x(n) - k(n)] = B C_0\,[k(n) - g(n)] = C C_0\left[g(n) - \frac{e(n-1)}{C\,(1/2^N + 1/B + 1/C)}\right] \tag{1}$$

where x(n) is the sampled analog input, e(n-1) is the quantization error of the previously converted digital output D(n-1), g(n) is the analog value that is actually converted, which appears at the comparator input after the initial charge redistribution that follows the sampling phase and precedes the start of conversion, and k(n) is the voltage on the capacitive DAC at the same moment.
From this set of equations the voltage g(n) can be found as:

$$g(n) = \frac{x(n)}{C\,(1/2^N + 1/B + 1/C)} + \frac{(1/2^N + 1/B)\,e(n-1)}{C\,(1/2^N + 1/B + 1/C)^2} \tag{2}$$

After the initial charge redistribution, the conventional monotonic SAR algorithm can start. At the end of conversion, the voltage at the comparator will be equal to:

$$g(n) - \frac{D(n)}{C\,(1/2^N + 1/B + 1/C)}, \tag{3}$$

where D(n) is the digital output code of the SAR. This voltage, obviously, equals the quantization error e(n) divided by the coefficient C(1/2^N + 1/B + 1/C). Taking this into account, the following equation can be written for the digital output D(z) in the complex frequency domain:

$$D(z) = X(z) - \left(1 - \frac{1/2^N + 1/B}{1/2^N + 1/B + 1/C}\,z^{-1}\right)E(z) \tag{4}$$

Fig. 4 Working principle of the proposed noise-shaping architecture. a: sampling phase; b: charge redistribution after the sampling phase; c: SAR matrix and voltages at the end of the conversion cycle. α = 1 / (1/2^N + 1/B + 1/C)

The following observations can be made. First, as C increases, the coefficient at z^-1 approaches 1. Second, from equation (2), as C increases the voltage at the comparator input at the beginning of the conversion cycle becomes independent of x(n), and so the circuit becomes nonfunctional. In practice this means that, to obtain a better NTF, we have to increase the resolution of the comparator. For example, in this work the following configuration was applied: C = 2^(N-1), B = 1. That configuration gives an NTF equal to (1 - 0.9429 z^-1), while x(n) is divided by 17.5. So, in the proposed design, the comparator resolution should be equivalent to 10.13 bit. However, the comparator design can be relaxed in comparison with the comparator in the classical 10-bit monotonic scheme, as the common-mode variation in the proposed scheme is no more than Vref / 17.5.

3.2. Architecture with the pole in NTF

The additional pole in the NTF provides additional attenuation of the noise in the band of interest.
However, the realization of this architecture requires additional control signals. Furthermore, the charge on the attenuation capacitor should be divided by two to ensure the circuit's stability. The modified schematic is shown in Fig. 5.

Fig. 5 Architecture modification to implement the additional pole in the NTF

The simplified functional diagram of the proposed SAR ADC architecture is shown in Fig. 6. Initially, the input signal x(n) is sampled on the top plates of the capacitive DAC. At the same time, the residue voltage of the previous sample (the voltage on the capacitive DAC after the end of conversion), saved on the capacitor BC0, is divided by two. Without this step the pole would be equal to one and the circuit would become unstable. The capacitor CC0, holding the previous quantization error divided by α = (1/2^N + 1/B + 1/C), is disconnected from the DAC. After the end of the sampling phase, the DAC is connected to the quantization error storage capacitor CC0 through the capacitor BC0, as in the previous circuit. The charge redistribution that occurs can be described as:

{ ( ( ) ( )) ( ( ) ( ) ( )) ( ( ) ( ) ⁄ ) , (5)

where k(n) and g(n) are the voltages on the DAC and on the quantization error storage capacitor CC0, respectively, after the charge redistribution, and e(n-1) is the quantization error of the previous sample. After the charge redistribution the conversion begins. As the capacitor CC0 is connected directly to the comparator, its voltage after the end of conversion will be equal to e(n) / α. The voltage at the DAC will be equal to:

( ) ( ) (6)

where D(n) represents the digital DAC input (the ADC's output).

Fig. 6 Functional diagram of the proposed ADC architecture

After the end of conversion, the new value of vres(n) can be saved on the capacitor BC0 (if BC0 ≪ 2^N C0).
From (5)-(6) the following equation can be written for the determination of the output digital code D(z):

( )( ( ⁄ ) ) ( )( ) ( )( ⁄ ⁄ ⁄ ) (7)

If ⁄ ≪ 1, this equation can be simplified to:

( ) ( ) ⁄ ⁄ ( ) (8)

If C, N ≫ 1, equation (8) can be rewritten as:

( ) ( ) ( ) (9)

So the circuit will perform first-order noise shaping. The pole gives an additional 3 dB of noise shaping in the input frequency band. To implement this architecture, a modification of the SAR logic is needed, which is shown in Fig. 7. It can be seen that only 6 additional combinational logic blocks and one delay are used to generate the signals for noise shaping. Our delay design [16] is used to provide low power consumption.

Fig. 7 Control logic modification for the realization of the proposed noise-shaping SAR ADC architecture (with additional pole)

4. Simulation results and comparison of proposed noise shaping SAR architectures

For both architectures we used the 65 nm technology of UMC. A binary-weighted capacitive DAC is built with minimum metal-insulator-metal (MIM) capacitors (5 µm x 5 µm, 51 fF). Two capacitors are connected in series to implement C0, so the C0 value is equal to 25.5 fF. A 16C0 (C = 16) MIM capacitor is used to store the quantization error. The attenuation capacitor is set to 2C0 (B = 2). So the input voltage x(n) is divided by 17.5 at the input of the comparator, and in this configuration the comparator resolution should be equivalent to 10.13 bit. The common-mode voltage at the input of the comparator sequentially decreases from Vref/2/17.5 to zero, so a single p-type input differential pair can be used in the comparator circuit. A common dynamic one-stage topology, shown in Fig. 8, was used. All switches are realized as single NMOS transistors with minimum length and width, except for the input sampling switch, where a bootstrapped switch is used to suppress harmonics, as is quite common in SAR ADCs with top-plate sampling.

Fig. 8 Simple one-stage comparator used in this design

The second architecture does not affect the area significantly. The area estimate for both architectures (automatic place and route) equals 0.05 mm^2. The circuits were simulated with Cadence Spectre. The sampling speed was set to 4 MS/s. The simulated output spectra of the proposed ADCs with a 191.47 kHz @0.9151 dB sinusoidal input are shown in Fig. 9.

Fig. 9 Simulated output spectrum of the proposed architectures

A comparison of both proposed architectures with other architectures is given in Table 1.

Table 1 State-of-the-art noise-shaping SAR ADCs

| | [8] | [10] | [9] | Zero-only | Zero+pole |
|---|---|---|---|---|---|
| Architecture | | | | | |
| NTF zero location | 0.75 | 0.5 | 0.65 | →1 | →1 |
| Need of OTA | no | yes | yes | no | no |
| Need of comparator modification | yes | yes | yes | yes | yes |
| Input attenuation | no | yes | no | yes | yes |
| Number of unit capacitors | 2^N | 2^(N+1) | 2^N | 3(2^(N-1)) | 3(2^(N-1)) |
| Circuit performance | | | | | |
| Technology, nm | 130 | 65 | 65 | 65 | 65 |
| Bandwidth, MHz | 0.125 | 6.25 | 11 | 0.2 | 0.2 |
| DAC size, bit | 10 | 8 | 8 | 6 | 6 |
| ENOB, bit | 12 | 10 | 9.35 | 10.0 | 10.12 |
| Additional bits in ENOB, bit | 2 | 2 | 1.35 | 4.0 | 4.12 |
| OSR | 8 | 4 | 4 | 10 | 10 |
| FOM, fJ/conv.-step | 59.6 | 14.8 | 35.8 | 19.4 | 18.0 |
| FOMs, dB | 167 | 165.2 | 163.3 | 167.2 | 167.9 |
| Verification | meas. | meas. | meas. | simulation | simulation |
| Year | | | | 2019 | 2019 |

It can be noted that the architecture variant with the pole gives only a slight improvement in SINAD of 0.8 dB, which corresponds to only 0.12 bits in ENOB. The power consumption of the second architecture is slightly higher (because of the additional logic and switches): for the zero-only architecture the average power consumption equals 7.94 µW, while for the architecture with the additional pole it is 8.0 µW. For both implementations the DAC consumes nearly 50% of the power, the comparator nearly 40%, and the rest is consumed by the digital logic. The theoretical 3 dB improvement of SINAD in the frequency band of interest was not achieved. The nonideality of the division by two with a real switch and capacitor may be the cause.
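Several of the figures quoted in Sections 3.1 and 4 can be cross-checked with a few lines of arithmetic. The sketch below is hedged: it assumes that the input attenuation is C(1/2^N + 1/B + 1/C), that the NTF zero is (1/2^N + 1/B)/(1/2^N + 1/B + 1/C), that 2^N = 32 with C = 16 and B = 1 (the Section 3.1 configuration), and that the Walden FOM is computed as P / (2^ENOB · 2 · BW). The function names are hypothetical.

```python
import math

def passive_ns_coefficients(dac_units, b, c):
    """Input attenuation and NTF zero of the zero-only architecture.

    dac_units stands for the 2**N term; b and c are the attenuation and
    storage capacitors, all in units of C0 (assumed interpretation).
    """
    s = 1.0 / dac_units + 1.0 / b + 1.0 / c
    return c * s, (1.0 / dac_units + 1.0 / b) / s

def walden_fom_fj(power_w, enob_bits, bandwidth_hz):
    """Walden FOM in fJ/conv.-step, using 2*bandwidth as the conversion rate."""
    return power_w / (2.0 ** enob_bits * 2.0 * bandwidth_hz) * 1e15

# Assumed configuration: 2**N = 32, C = 16 (C = 2**(N-1)), B = 1.
attenuation, ntf_zero = passive_ns_coefficients(dac_units=32, b=1, c=16)
comparator_bits = 6 + math.log2(attenuation)   # 6-bit DAC plus attenuation

# Power, ENOB and bandwidth as reported for the two simulated variants.
fom_zero_only = walden_fom_fj(7.94e-6, 10.0, 200e3)
fom_zero_pole = walden_fom_fj(8.0e-6, 10.12, 200e3)

print(attenuation, round(ntf_zero, 4), round(comparator_bits, 2))
print(round(fom_zero_only, 1), round(fom_zero_pole, 1))
```

Under these assumptions the sketch reproduces the attenuation of 17.5, the zero at 0.9429, the 10.13-bit comparator resolution and the two Walden FOM values listed in Table 1.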
So, for now we recommend using the simpler variant without the pole. Neither proposed scheme needs any modification of the comparator circuit, and the additional capacitor needed for storage of the quantization error is compensated by the use of monotonic switching. So the total area of the proposed architectures is nearly the same as in other noise-shaping schemes. The main advantage of the proposed architectures is the possibility of using higher OSRs, which makes it possible to achieve a higher number of effective bits with less area. The previously reported noise-shaping SAR ADCs typically achieve only 2 additional bits, while the proposed SAR ADC achieves 10-bit resolution with a 6-bit capacitive DAC.

5. Conclusion

Two new, fully passive noise-shaping architectures for a SAR ADC were proposed. In both architectures the theoretically achievable NTF zero location tends to one. In the proposed implementation of the architectures the real zero location is 0.943, while in the alternative solutions the maximum NTF zero was 0.75. The architectures introduce only a slight modification of the standard SAR ADC scheme. A four-input comparator is not needed. The digital logic remains unmodified for the first architecture (zero only) and is slightly modified for the second architecture (6 additional logic gates). The second architecture can theoretically give an additional 3 dB attenuation of quantization noise in the frequency band of interest. However, in the practical implementation the additional attenuation was equal to 0.8 dB, which makes the first architecture more suitable because of its simplicity. Both architectures were used to implement the SAR ADC in the 65 nm CMOS technology of UMC. A 6-bit capacitive DAC and an OSR equal to 10 were used. The input frequency bandwidth was set to 200 kHz. Both architectures provide 4 additional bits in ENOB.
According to the simulation results, both ADCs have a Walden FOM of less than 20 fJ/conv.-step and a Schreier FOM of more than 167 dB. Further research can concentrate, first, on a more accurate investigation of the lower attenuation of quantization noise in the second architecture (with the additional pole in the NTF) and, second, on the implementation of second-order NTFs, for example by combining the proposed architecture with an architecture with a 4-input comparator.

Acknowledgement: This work was supported by the German Research Foundation (DFG), project number 389481053, and by grant no. 18-79-10259 of the Russian Science Foundation.

References

[1] M. Liu, P. Harpe, R. van Dommele and A. van Roermund, "15.4 A 0.8V 10b 80kS/s SAR ADC with duty-cycled reference generation," in Digest of Technical Papers of the IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, 2015, pp. 1-3.
[2] D. Osipov and Y. Bocharov, "Behavioral model of split capacitor array DAC for use in SAR ADC design," in Proceedings of the 8th Conference on Ph.D. Research in Microelectronics & Electronics, PRIME 2012, Aachen, Germany, 2012, pp. 1-4.
[3] J. Shen, A. Shikata, A. Liu, B. Chen and F. Chalifoux, "A 12-bit 31.1µW 1-MS/s SAR ADC with on-chip input-signal-independent calibration achieving 100.4-dB SFDR using 256-fF sampling capacitance," IEEE Journal of Solid-State Circuits, vol. 54, no. 4, pp. 937-947, April 2019.
[4] J. Guerber, H. Venkatram, M. Gande, A. Waters and U. Moon, "A 10-b ternary SAR ADC with quantization time information utilization," IEEE Journal of Solid-State Circuits, vol. 47, no. 11, pp. 2604-2613, November 2012.
[5] D. Osipov and St. Paul, "Two advanced energy-back SAR ADC architectures with 99.21 and 99.37% reduction in switching energy," Analog Integrated Circuits and Signal Processing, vol. 87, no. 1, pp. 81-91, 2016.
[6] D. Osipov and S. Paul, "Two-step monotonic switching scheme for low-power SAR ADCs," in Proceedings of the 15th IEEE International New Circuits and Systems Conference (NEWCAS), Strasbourg, 2017, pp. 205-208.
[7] M. Shahghasemi, R. Inanlou, M. Yavari, "An error-feedback noise-shaping SAR ADC in 90 nm CMOS," Analog Integr Circ Sig Process, vol. 81, pp. 805-814, 2014. https://doi.org/10.1007/s10470-014-0434-6
[8] W. Guo and N. Sun, "A 12b-ENOB 61μW noise-shaping SAR ADC with a passive integrator," in Proceedings of the ESSCIRC Conference, 2016, vol. 42, pp. 405-408.
[9] A. Fredenburg and M. P. Flynn, "A 90-MS/s 11-MHz-bandwidth 62-dB SNDR noise-shaping SAR ADC," IEEE Journal of Solid-State Circuits, vol. 47, no. 12, pp. 2898-2904, December 2012.
[10] Z. Chen, M. Miyahara, and A. Matsuzawa, "A 9.35-ENOB, 14.8 fJ/conv.-step fully-passive noise-shaping SAR ADC," IEICE Transactions on Electronics, vol. 99, no. 8, pp. 963-973, 2016.
[11] D. Osipov, A. Gusev, S. Paul, "First order fully passive noise-shaping SAR ADC architecture with NTF zero close to one," in Proceedings of the 17th IEEE International New Circuits and Systems Conference (NEWCAS), Munich, 2019. (not published yet)
[12] D. Osipov, A. Gusev, V. Shumikhin, S. Paul, "SAR ADC architecture with fully passive noise shaping," in Proceedings of the IEEE 31st International Conference on Microelectronics (MIEL), Nis, 2019, pp. 219-222.
[13] Z. Chen, M. Miyahara and A. Matsuzawa, "A 9.35-ENOB, 14.8 fJ/conv.-step fully-passive noise-shaping SAR ADC," in Proceedings of the 2015 Symposium on VLSI Circuits (VLSI Circuits), Kyoto, 2015, pp. C64-C65.
[14] C. Liu, S. Chang, G. Huang and Y. Lin, "A 10-bit 50-MS/s SAR ADC with a monotonic capacitor switching procedure," IEEE Journal of Solid-State Circuits, vol. 45, no. 4, pp. 731-740, April 2010.
[15] R. Inanlou and M. Yavari, "A simple structure for noise-shaping SAR ADC in 90 nm CMOS technology," AEU - International Journal of Electronics and Communications, vol. 69, no. 8, pp. 1085-1093, 2015.
[16] D. Osipov, H. Lange, St. Paul, "Energy-efficient CMOS delay line with self-supply modulation for low-power SAR ADCs," International Journal of Electronics, published online: 06 September 2019.

instruction FACTA UNIVERSITATIS Series: Electronics and Energetics Vol. 28, No 1, March 2015, pp. 17-28 DOI: 10.2298/FUEE1501017S

Virtual instruments in low-frequency noise spectroscopy experiments

Adam W. Stadler 1, Andrzej Dziedzic 2
1 Department of Electronics Fundamentals, Rzeszów University of Technology, Rzeszow, Poland
2 Faculty of Microsystem Electronics and Photonics, Wrocław University of Technology, Wrocław, Poland

Abstract. Low-frequency noise spectroscopy (LFNS) is an experimental technique to study noise spectra, typically below 10 kHz, as a function of temperature. Results of LFNS may be presented as so-called noise maps, giving a detailed insight into fluctuating phenomena in electronic devices and materials. The authors show the usefulness of the virtual instrument concept in developing and controlling the measurement setup for LFNS experiments. An example of a noise map obtained for polymer thick-film resistors (PTFRs), made of commercial compositions, for the temperature range 77 K to 300 K is shown. The experiments proved that 1/f noise caused by resistance fluctuations is the dominant noise component in the studied samples. However, the obtained noise map also revealed thermally activated noise sources. Furthermore, parameters describing the noise properties of resistive materials and components have been introduced and calculated using data from LFNS.
The results of the work may be useful for comparing the noise properties of different resistive materials, and they also give directions for improving thick-film technology in order to manufacture reliable, low-noise and stable PTFRs.

Key words: low-frequency noise, noise spectroscopy, virtual instrument, polymer thick-film resistor

1. Introduction

Noise measurements are much more sensitive to internal imperfections of electronic components than ordinary resistance measurements, since noise and resistance are proportional to the fourth and second moments of the local current distribution, respectively. Taking into account that the reduction of supply voltage in modern electronic circuits is a common trend forced by commercial applications, one may realize that the noise generated in electronic components becomes one of their most important parameters. This is also observed in specialized electronics, like cryogenic thermometry, where the intrinsic noise of the device limits the capabilities or the resolution of the overall circuit. On the other hand, it has also been shown that there is a relationship between the low-frequency noise observed in electronic components and their reliability [1, 2]. Therefore, noise measurements are a very important research and diagnostic tool. However, noise measurements require much more sophisticated equipment and are time consuming, as compared to resistance (or conductivity) and I-V curve tests.

Received October 30, 2014
Corresponding author: Adam W. Stadler, Rzeszow University of Technology, Department of Electronics Fundamentals, W. Pola 2, PL 35-959 Rzeszow, Poland (e-mail: astadler@prz.edu.pl)

It is good practice to start noise studies with the identification of noise components, which means that the shape of the noise spectrum and its dependence on the excitation signal have to be known. One of the noise components commonly observed in the low-frequency range is 1/f noise.
However, in many electronic components, apart from 1/f noise, other noise components may also be detected. A good example is thick-film resistors (TFRs), in which the resistive layer of RuO2+glass is prepared in a high-temperature process, i.e., made of pastes deposited on a proper substrate and then fired at a temperature of about 850 ºC. In these resistors, apart from the dominant 1/f noise, Lorentzian components, caused by thermally activated noise sources (TANSs) distributed randomly in the resistive layer, have also been found [3]. However, to detect TANSs and study their properties, noise spectra have to be measured as a function of temperature. This research technique is called low-frequency noise spectroscopy (LFNS). Noise spectra vs. temperature have been investigated in TFRs at fixed temperature points [4], whereas LFNS as used in [3, 5, 6] makes it possible to depict a large amount of experimental data in a plot of the fluctuating quantity vs. frequency f and temperature T, called a 'noise map'. This was possible due to continuous acquisition of the voltage fluctuations observed on TFRs under current excitation, and recording of consecutive spectra, calculated in real time, while the temperature varies slowly. The progress was also due to the use of software-defined instruments for controlling the experiment and processing the data.

The software that supports the experiment uses the virtual instrument (VI) concept, which means that the instrument consists of three main components: (i) a general-purpose personal computer, (ii) specialized software responsible for interaction with the user by means of a graphical interface, and (iii) internal and/or external functional hardware (DAQ board/device, digital meters with remote control, etc.) [7, 8]. Due to this synergy, the functionality and capabilities of VIs result in equal parts from the software and the hardware.
In this work we describe in more detail the software layer that is used in studies of the noise properties of electronic materials and components as a function of temperature. The usefulness of VIs developed to support LFNS experiments will be shown using polymer thick-film resistors (PTFRs) as the subject of study.

2. Role of the software

As a base, the noise signal analyzer described in [9] has been reused. The hardware part of the VI is shown in Fig. 1. The heart of the system is a multi-channel plug-in DAQ board, which simultaneously digitizes analog signals. Voltage signals from the multi-terminal electronic component, including the fluctuations of interest, are conditioned, sampled and converted to digital representation, giving 2^20 samples in each time record of duration Trec = 2 s. Spectra and/or cross-spectra are calculated in real time using the FFT algorithm, then displayed, and certain low parts of the spectra are recorded. Additional digital voltmeters (3458A and 34410A, both from Agilent) and a temperature controller (Lake Shore 340) are used for monitoring the biasing conditions of the sample as well as its temperature. In order to tune the sampling frequency precisely, an external generator (Agilent 33120A with extended stability) is used for triggering the DAQ board. All auxiliary instruments are recognized automatically at the start of the system and then exchange messages with the PC controller via the GPIB or USB bus.

Fig. 1 Hardware part of VI used in low-frequency noise experiments (blocks shown: low-noise amplifiers, low-pass filters, differential analog inputs, DAQ board PCI 6132, auxiliary instruments, GPIB, USB)

The software part has been prepared in the LabVIEW environment and is divided into two main parts: (a) controlling the experiment, and (b) post-experiment analysis. The following main tasks are defined in part (a): (i) signal sampling in continuous mode, (ii) digital signal processing of the sampled signal (in real time), including spectra calculations, (iii) monitoring the resistance R of the sample and its temperature T, (iv) spectra presentation and recording together with additional data (R and T) for further analysis. The above tasks are executed iteratively in independently running threads. The software part responsible for digital signal processing also offers more advanced functions for analysis in either the time or the frequency domain, such as (i) phase-sensitive detection (for the AC method of noise measurements), (ii) a detrend function [10] for removing trend and thus rejecting distortion of the noise spectrum due to temperature drift, (iii) the power cross-spectral density function, which, apart from the ordinary power spectral density function, is the most useful in experiments that involve multi-terminal samples, (iv) the correlation function, (v) second spectra, etc. The complexity of the software may be expressed as the number of sub-VIs, which is about 500 for part (a) and 270 for part (b) of the software.

The block diagram of the software part of the considered VI is shown in Fig. 2, where three main pieces are visible: configuration, experiment control and monitoring, and post-experiment data processing. Furthermore, the three traditional layers, i.e. hardware (acquisition), analysis and presentation, are shown in the background of experiment control. Under the hood, during the experiment several loops are executed concurrently in separate threads, which communicate with each other via queues.
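The software discussed here is written in LabVIEW, which is graphical, so the following Python sketch only mirrors the loop-and-queue structure described above: one thread produces time records, another consumes them and computes spectra, and a queue connects the two. The record length, the number of records and the random data are hypothetical stand-ins for the DAQ board.

```python
import queue
import threading

import numpy as np

RECORD_LEN = 1024   # hypothetical; the real records hold 2**20 samples
N_RECORDS = 8       # hypothetical number of records
STOP = None         # sentinel that marks the end of acquisition

def acquisition_loop(out_q):
    """Stand-in for the DAQ loop: pushes numbered time records."""
    rng = np.random.default_rng(1)
    for index in range(N_RECORDS):
        out_q.put((index, rng.standard_normal(RECORD_LEN)))
    out_q.put(STOP)

def dsp_loop(in_q, results):
    """Consumes records in arrival order and stores a raw periodogram."""
    while True:
        item = in_q.get()
        if item is STOP:
            break
        index, record = item
        spectrum = np.abs(np.fft.rfft(record)) ** 2 / RECORD_LEN
        results.append((index, spectrum))

records = queue.Queue()
spectra = []
threads = [
    threading.Thread(target=acquisition_loop, args=(records,)),
    threading.Thread(target=dsp_loop, args=(records, spectra)),
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

Because the queue buffers records, the consumer may run at its own pace without dropping any record, which is the property the text attributes to the queue mechanism.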
The first loop acquires data from the DAQ board and passes them to the digital signal processing engine, which calculates the functions in the time and/or frequency domain that were selected at the configuration stage. Another loop continuously reads measurements from the auxiliary meters and calculates the actual resistance and temperature. Loops interacting with hardware functional blocks, i.e. the DAQ board/device and monitors/controllers, iterate independently at their own pace. For transferring data between loops, an advanced queue mechanism has been employed so as not to lose any time record or any result of the calculated functions. Averaging engines have been used for data decimation during the experiment, in the way and at the rate configured by the user.

Fig. 2 Block diagram of the software part of VI developed to support low-frequency noise spectroscopy (layers shown: hardware layer (data acquisition), analysis layer, presentation layer; blocks shown: save/read config, configuration, DAQ board acquisition, auxiliary instruments acquisition, DSP (frequency/time domain), resistance and temperature calculations and averaging, data recording, data presentation; post-experiment analysis: integral measure of noise, excess noise calculation, spectra normalization, noise maps viewer and TANSs analyzer, calculator of power in frequency bands, noise exponent calculator, bulk and contact noise decomposition, spectra viewer and editor)

An LFNS experiment takes a long time, which is the result of the small rate of temperature change. Namely, to avoid spectra distortion due to temperature drift, the following relation has to be fulfilled [11]:

$$\frac{\alpha^2\,T_{rec}}{(2\pi f)^2} \ll S_{Vex}(f), \tag{1}$$

where α is the rate of voltage drift caused by temperature change, and SVex is the excess noise, SVex = SV - SV=0, where SV and SV=0 are the power spectral densities of voltage fluctuations with bias and with no sample bias, respectively.
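The drift condition can be explored numerically. The sketch below is an illustration under stated assumptions: it reads the left-hand side of relation (1) as α² Trec / (2πf)², and it models detrending as subtraction of a least-squares straight line from each time record, which may differ from the detrend function actually used by the authors. The sampling rate, noise level and drift rate are hypothetical.

```python
import numpy as np

def drift_psd(alpha, t_rec, f):
    """Assumed left-hand side of relation (1): alpha^2 * T_rec / (2*pi*f)^2."""
    return alpha ** 2 * t_rec / (2.0 * np.pi * f) ** 2

def remove_linear_trend(record, fs):
    """Subtract the least-squares straight line from one time record."""
    t = np.arange(record.size) / fs
    slope, offset = np.polyfit(t, record, 1)
    return record - (slope * t + offset)

# Hypothetical record: white noise plus a linear voltage drift.
fs, t_rec = 1024.0, 2.0
t = np.arange(int(fs * t_rec)) / fs
rng = np.random.default_rng(2)
noise = 1e-6 * rng.standard_normal(t.size)
drifting = noise + 1e-4 * t            # drift rate alpha = 1e-4 V/s
cleaned = remove_linear_trend(drifting, fs)

print(np.std(drifting), np.std(cleaned), np.std(noise))
print(drift_psd(1e-4, t_rec, np.array([1.0, 10.0, 100.0])))
```

The drift term falls as 1/f², which is why a drift that is too fast distorts mainly the lowest part of the spectrum.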
However, if the above relation is not fulfilled, the spectra will include distortion (mainly in the low-frequency range) that obscures the useful information. Nevertheless, the negative influence of the resistance drift may be significantly reduced by using the detrend procedure, which removes unwanted small changes in the signal just before the spectra are calculated [11]. The above issues limit the rate of temperature change, and therefore a temperature sweep covering the range 77 K to 300 K lasts nearly 2 days. During this time, consecutive time records acquired from the device under test are used for (cross-)spectra calculations. This means that about 86 thousand spectra, each counting 1000 (or even more) bins for every acquired signal, have to be decimated and recorded in separate files. Hence, software support for post-experiment data processing is necessary.

The main progress with respect to the noise signal analyzer described in [9] concerns part (b) of the software. It consists of the functions responsible for (i) spectra viewing, editing and transformation, including excess noise calculations, noise normalization, etc., (ii) noise map generation and viewing, as well as TANSs identification and analysis, including calculation of the activation energy, (iii) calculation of the power of the fluctuating quantity in user-defined frequency bands as a function of temperature, necessary for evaluation of the integral noise measure [3], (iv) decomposition of the total noise into bulk and contact components, which is possible for multi-terminal samples, and (v) calculation of parameters describing noise properties, like the noise spectral exponent and the material noise intensity parameter C (see next section). All the functions create graphs with plots and/or save their results in files for further analysis (see bottom part of Fig. 2).

3. Experiment

As the subject of study, a pair of PTFRs with matched resistance, prepared on the same substrate, has been taken. The resistive layers of the PTFRs have been made of the ED7100_200Ω carbon-based composition from Electra Polymers Ltd on FR-4 laminate. The samples have been prepared as multi-terminal PTFRs in which a rectangular resistive layer (length L = 15 mm, width w = 0.5 mm and thickness d = 20 µm) ends on opposite sides with current terminals. Additional terminals (voltage probes), evenly spaced along the layer, have also been formed. This sample shape has been used in our previous experiments [3]. The polymer thick-film resistive films have been screen-printed on bare Cu contacts. The substrate with the pair of samples has then been inserted into a liquid nitrogen cryostat, where the samples form the lower arms of a DC bridge (see Fig. 3). The ballast resistors RB have been selected so as to fulfil the relation RB ≫ RS. The circuit has been supplied from a programmable source measure unit (Keithley 2636) working in voltage mode and a low-pass passive filter of large time constant. Signals from the diagonal of the bridge and its sub-diagonals have been connected to differential low-noise preamplifiers (5186 from Signal Recovery) with a gain of 1000 and then, after low-pass filtering, digitized by the acquisition plug-in board. Spectra SV and cross-spectra (with 0.5 Hz resolution) for signals taken from the sample terminals have been calculated in real time by the software and gathered in files for further analysis. The experiment started with cooling down the sample and then letting the temperature rise freely.

It should be noted that the voltage V7-7 acquired from the diagonal of the bridge includes fluctuations arising in both samples. Similarly, the voltages from the sub-diagonals, Vi-i, include fluctuations that arise in both samples, but only in the parts of the resistive layers (sectors) marked by terminals '1' and 'i', where i = 2…6.
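The acquisition figures quoted in Sections 2 and 3 are mutually consistent, as the short arithmetic check below suggests. It assumes that time records follow each other without gaps and that the sweep lasts exactly two days (the text says nearly two days), so the record count is an estimate.

```python
# Figures taken from the text: 2**20 samples per record, T_rec = 2 s.
samples_per_record = 2 ** 20
t_rec_s = 2.0

sampling_rate_hz = samples_per_record / t_rec_s   # samples per second
resolution_hz = 1.0 / t_rec_s                     # FFT bin spacing

# Assumption: back-to-back records over a sweep of exactly two days.
sweep_s = 2 * 24 * 3600
records_in_sweep = int(sweep_s / t_rec_s)

print(sampling_rate_hz, resolution_hz, records_in_sweep)
```

The bin spacing agrees with the 0.5 Hz resolution of the spectra, and the record count agrees with the figure of about 86 thousand spectra.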
furthermore, in order to obtain voltages from sectors 3-5 and 2-6 covering internal sectors of the resistive layer, voltages v5-5-v3-3 and v6-6-v2-2 have been created by the use of amplifiers/filters with differential inputs. to improve the accuracy of the spectra calculation for the internal sectors, cross-correlation technique has been employed, i.e. cross-power spectral density function has been used for calculation sv(1-i) using voltages v7-7 and vi-i, while sv(1-7) has been calculated using ordinary power spectral density function from voltage v7-7. due to the advantage of the method [6], sv(1-6), are free from noise of voltage probes, including both contacts noise and noise of resistive arm, although it includes (apart from noise of the part of the main resistive layer) also noise of current contacts at terminals ‘1’. on the other hand, sv(1-7) includes noise of contacts at terminals ‘1’ and ‘7’ and noise of the whole resistive layers. in a similar way, sv(26) and sv(3-5) calculated as cross-power spectral density of v7-7 and either v5-5-v3-3 or v6-6-v2-2 are free from noise of current contacts as well as voltage probes. 22 a.w. stadler, a. dziedzic during the experiment, apart from the spectra of interest, also current resistance of the sample, rs, is calculated and recorded, using supply and sample voltages and the known value of ballast resistor rb. since temperature of the sample is also monitored and recorded, temperature dependence of sample resistance, r1-7(t), is also the result of lfns experiment. + temperature monitor r t d rb rb vsupply vsample 7 1 6 5 4 3 2 7 1 6 5 4 3 2 a b multi-terminal thick-film resistors in balanced dc bridge ln2 cryostat v o lt a g e s fr o m t h e b ri d g e d ia g o n a l v 7 -7 a n d s u b -d ia g o n a ls v ii fig. 3 measurement setup for noise studies in multi-terminal tfrs 4. 
results
entry tests of the samples, carried out at room temperature, cover measurements of the voltage distribution along the resistive layer, noise spectra measurements for identification of the main noise components, and sample selection. the voltage distribution is shown in fig. 4 (points). the data points for terminals 2-6 have been used in a linear fit (solid line) to evaluate the sheet resistance rsq = 0.353 kΩ, while the expected nominal value was 200 Ω. from the intersections of the extrapolated line (dashed line) with the coordinate axes, the resistances of the contacts may be calculated. it turned out that the resistance of the bottom contact is rcontact(1) = 273 Ω, while rcontact(7) ≈ 0 Ω.
fig. 4 voltage distribution along the resistive layer. terminals’ labels have been given near the corresponding points.
the fundamental issue in studies of noise properties is the identification of noise components. to do that, spectra of the fluctuating quantity (current or voltage) are acquired as a function of sample bias. this is shown in fig. 5a, where a collection of excess noise spectra svex(1-7), gathered at room temperature for terminals 7-7, is plotted for different sample voltages. additionally, the background noise, sv=0, and a dashed line for pure 1/f noise have been drawn for reference. 1/f noise is clearly dominant. therefore, it is convenient to use the product of frequency and svex, averaged over a certain frequency band Δf and denoted ⟨f·svex⟩Δf, as the measure of noise intensity. the plot of this quantity, calculated for decade frequency bands, versus sample voltage (points) is shown in fig. 5b. a squared voltage dependence has been added (dashed line) for reference. the noise intensity follows this v² dependence, which shows that the observed noise is caused by resistance fluctuations. after the tests at room temperature, the substrate with the studied ptfrs has been inserted into the ln2 cryostat and cooled down, and noise spectra have been recorded during warming-up.
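the noise intensity measure used above, and the check of its v² scaling, can be sketched as follows. this is a minimal illustration under stated assumptions, not the authors' labview implementation: the function names, the synthetic 1/f spectra and the constant `k_noise` are hypothetical, and only the definition of ⟨f·svex⟩Δf and the 0.5 hz resolution come from the text.

```python
import math


def noise_intensity(freqs, s_vex, f_lo, f_hi):
    """Average the product f * S_Vex over the band f_lo <= f < f_hi."""
    products = [f * s for f, s in zip(freqs, s_vex) if f_lo <= f < f_hi]
    if not products:
        raise ValueError("no spectral points inside the band")
    return sum(products) / len(products)


def log_log_slope(xs, ys):
    """Least-squares slope of log10(y) versus log10(x)."""
    lx = [math.log10(x) for x in xs]
    ly = [math.log10(y) for y in ys]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den


# Synthetic data for illustration only (not measured values):
# pure 1/f excess noise whose level grows with the square of the bias.
freqs = [1.0 + 0.5 * k for k in range(18)]  # 1.0 ... 9.5 Hz, 0.5 Hz steps
voltages = [0.1, 0.2, 0.4, 0.8]             # hypothetical sample voltages, V
k_noise = 1e-12                             # hypothetical noise constant

intensities = []
for v in voltages:
    spectrum = [k_noise * v ** 2 / f for f in freqs]
    intensities.append(noise_intensity(freqs, spectrum, 1.0, 10.0))

# A slope close to 2 on the log-log plot indicates resistance fluctuations.
slope = log_log_slope(voltages, intensities)
```

for pure 1/f noise the product f·svex is flat within the band, so the band average does not depend on where the band is placed; a log-log slope near 2 then plays the role of the dashed ~v² reference line in fig. 5b.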
an exemplary noise map is shown in fig. 6. in this map the product f·sr, where sr = svex/i² and i is the biasing current, has been plotted versus frequency and reciprocal temperature. the quantity f·sr has been chosen because it is independent of the sample voltage and emphasizes noise components other than 1/f noise. furthermore, the reciprocal temperature scale (horizontal scale in fig. 6) helps in direct detection of tanss, which are visible in the map as streaks, marked with dashed lines. from the slopes of these lines, the activation energies of the tanss [6] have been calculated: eg = 1013 mev, 736 mev, 642 mev, 316 mev, 303 mev. the set of tanss visible in the noise map is sample-dependent, since only those tanss are visible that modulate resistances in the critical percolation path. analyzing the noise map of fig. 6, we may notice that at temperatures of 145 k and 186 k there are vertical streaks that look like unwanted interference. however, further inspection of the noise spectra confirms that the noise map is correct and that non-stationary noise sources have been recorded. their individual spectra, of lorentzian shape, are shown in fig. 7. additionally, another two spectra, for intermediate temperatures, including the tans of eg = 303 mev, have been plotted. furthermore, fig. 7 also shows a spectrum with two lorentzians captured at 77 k, which means that tanss also exist in a lower temperature range.
[fig. 5: (a) excess noise svex, v²/hz, versus frequency f, hz, for sample voltages from 0.00 v to 5.130 v, with a 1/f reference line; (b) noise intensity ⟨f·svex⟩Δf, v², versus sample voltage v1-7, v, for decade frequency bands from 1 hz to 10 khz, with a ~v² reference line; sample ed7100/cu, rs = 11.1 kΩ]
fig. 5 (a) excess noise for different sample voltages.
(b) noise intensity versus sample voltage
fig. 6 noise spectra gathered for samples made of ed7100 with cu contacts plotted as a noise map, i.e. the quantity f·sr versus frequency and temperature
[fig. 7: f·sr, Ω², versus frequency, hz, at temperatures 77.6 k, 145.6 k, 153.3 k, 171.2 k and 186.5 k]
fig. 7 selected spectra with visible lorentzians
5. integral noise measure
as we can see, the noise intensity varies strongly with temperature, as shown in the noise map in fig. 6 and also in fig. 8. since only those tanss that modulate critical resistances in percolation paths are visible in noise maps, the dependence of ⟨δr²⟩ on temperature is irregular, although the temperature dependence of resistance, also shown in fig. 8, is monotonic and smooth. this implies the necessity of averaging ⟨δr²⟩ over temperature to obtain a reliable noise measure. namely, having sr(f, t) as a result of the lfns experiment, we use the integral measure of noise intensity [12]:
$s \equiv \frac{1}{t_2 - t_1} \int_{t_1}^{t_2} \int_{f_1}^{f_2} s_r(f, t)\, df\, dt$, (2)
where t1, t2 and f1, f2 determine the temperature and frequency range, respectively. the inner integral calculates the power, ⟨δr²⟩, of resistance fluctuations, while the outer integral averages ⟨δr²⟩ over temperature. the plot of ⟨δr²⟩, calculated for the studied resistors in decade frequency bands, is shown in fig. 8. the values of the integral s calculated for different sectors of the studied resistor, for the frequency band 1 khz – 10 khz, have then been used to decompose the noise of the whole resistor into bulk and contact noise components. this is depicted in fig. 9, where s has been plotted versus the sectors’ size. it should be noted that the noise of two samples/sectors is measured when signals are acquired from the bridge diagonal/sub-diagonals. as the internal sectors of resistor, i.e.
sectors (3-5) and (2-6), are far from the current contacts, s calculated for them is also free of contact noise, while s calculated for sectors (1-6) and (1-7) includes the noise of the bottom contacts and of all contacts, respectively. hence, from the slope of the linear fit of s for sectors (3-5) and (2-6) we obtain ‘s per square’, ssq = 1.4·10⁻⁵ Ω². next, the dimension-independent bulk noise intensity cbulk ≡ ssq/rsq²·Ωsq is calculated: cbulk = 5.7·10⁻²² m³, where Ωsq is the volume of the individual square [12]. the parameter cbulk has proved very helpful for comparing the noise properties of different materials [12-14].
[fig. 8: power of resistance fluctuations ⟨δr²⟩, Ω², versus temperature, k, for decade frequency bands from 1 hz to 10 khz, together with the resistance r1-7, Ω, versus temperature]
fig. 8 power of resistance fluctuations versus temperature and temperature dependence of resistance
[fig. 9: integral s, Ω², versus sector length x, m, and versus sector length in number of squares, x/w, for sectors 3-5, 2-6, 1-6 and 1-7, with ssq, 2sbulk and 4sint marked]
fig. 9 the integral s versus sector size (points). sectors are labelled near data points. the solid line is the linear fit of the data corresponding to the internal sectors of the resistor (labelled (3-5) and (2-6)), while the dashed line is its extrapolation
the noise generated in the interfaces of both current contacts can be evaluated as the difference between the noise intensity measured for the whole resistor, s1-7, and the extrapolated bulk noise sbulk = (l/w)·ssq, as depicted in fig. 9. analyzing the plot of fig. 9, we can see that the noise generated in the contacts at terminals ‘7’ contributes significantly to the overall noise. to compare the quality of the interface we use a contact-geometry-independent parameter, cint ≡ w·sint/ssq [12].
its numerical value is the length of a hypothetical resistive film whose noise intensity would equal the intensity of the interface noise. in this case, we obtain cint = 32 mm, which means that the quality of the interface between the resistive layer and the cu contacts is very poor and that the contribution of the contacts to the noise of the resistor is significant. the main reason for introducing parameters that describe noise properties is to enable quantitative comparison of the noise properties of different materials or devices. an example is the current noise index (cni), defined for resistors or resistive materials [15, method 308] and often used by manufacturers. however, cni is useless for characterizing material properties as long as the geometrical dimensions of the samples are unknown. therefore, another parameter is introduced, i.e. the material noise intensity,
$c \equiv \Omega f s_r / r^2$, (3)
where Ω is the volume of the sample. it is worth noting that c describes the properties of the material itself, since it is independent of frequency as well as of sample volume and bias. hence, the parameter c describes the properties of samples with various geometries and is therefore considered the most appropriate quantity for characterizing materials with respect to 1/f noise [16]. furthermore, using the value of c and the geometrical dimensions of the sample, it is possible to calculate a measurable quantity, i.e. the power spectral density of voltage or current fluctuations for a known bias. the parameter c is equivalent to the previously defined cbulk up to a constant factor, cbulk = c·ln 10, because cbulk is obtained by integrating the 1/f spectrum over a decade frequency band. the value c = 2.5·10⁻²² m³ obtained for the ed7100 resistive layer studied in this work is close to 10⁻²¹ m³ found in [17] for square resistors (size 1.5 mm) of ed7100 with cu contacts, but it is still more than one order of magnitude larger than c ≈ 0.6·10⁻²³ m³ found for pb-rich ruo2+glass thick-film layers [3].
6.
summary
the importance of software support in low-frequency noise spectroscopy has been explained. the virtual instrument concept and its use in controlling the lfns experiment and in post-experiment data processing have been described. since the functionality and possibilities of vis result in equal parts from software and hardware, it is easy to expand the capabilities of vis by changing existing sub-vis or developing new ones. as the subject of the lfns experiment that showed the usefulness of the vi concept, a ptfr of ed7100 resistive ink with cu contacts has been used. the multi-terminal shape of the test resistor allowed for (i) studies of noise versus the volume of the sample, (ii) localization of noise sources in different parts of the resistor, and (iii) obtaining valuable information concerning the quality of the interface between the resistive and conductive layers. the experiments proved that 1/f noise caused by resistance fluctuations is the dominant noise component in the studied samples. however, the exemplary noise map obtained as the result of lfns also revealed thermally activated noise sources. the noise map gave detailed insight into the fluctuating phenomena and opened the door for the introduction of the integral noise measure. parameters used for the characterization of noise properties of resistive materials and components, including the material noise intensity cbulk and cint, which describes the quality of the interface between resistive and conductive layers in a tfr, have been defined and calculated for the studied samples. it has been found that the interface between the polymer composition ed7100 and cu contacts has very poor noise properties. the concepts shown in this work may also be used for studies of noise properties in other electronic components, both passive and active.
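as a worked check of the material parameters summarised above, the reported cbulk and c can be reproduced from the quantities quoted in the paper. this is only a sketch under stated assumptions: the volume of one square is taken as w·w·d, with the layer thickness read as d = 20 μm, and the function names are hypothetical.

```python
import math


def bulk_noise_intensity(s_sq, r_sq, width, thickness):
    """C_bulk = S_sq / R_sq^2 * Omega_sq, with Omega_sq = w * w * d."""
    omega_sq = width * width * thickness  # volume of one square, m^3 (assumed)
    return s_sq / r_sq ** 2 * omega_sq


def material_noise_intensity(c_bulk):
    """C = C_bulk / ln(10), i.e. the decade-band factor removed."""
    return c_bulk / math.log(10)


# Values quoted in the text; d = 20 um is an assumed reading of the thickness.
w = 0.5e-3      # layer width, m
d = 20e-6       # layer thickness, m
r_sq = 353.0    # sheet resistance, ohm
s_sq = 1.4e-5   # integral noise measure per square, ohm^2

c_bulk = bulk_noise_intensity(s_sq, r_sq, w, d)
c = material_noise_intensity(c_bulk)
```

with these inputs both results agree with the reported cbulk = 5.7·10⁻²² m³ and c = 2.5·10⁻²² m³ to within a few percent, which is the level of rounding in the quoted ssq and rsq.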
the results obtained by lfns may be utilized in thick-film technology in selection materials for manufacturing low-noise, reliable resistive components with stable parameters as well as in improvements of the technology in order to achieve technological advantage. acknowledgement: the work has been supported from grant dec-2011/01/b/st7/06564 funded by national science centre (poland) and from rzeszow university of technology project u-235/ds. studies have been performed with the use of equipment purchased in the project no popw.01.03.0018-012/09 from the structural funds, the development of eastern poland operational programme co-financed by the european union, the european regional development fund. references [1] d. rocak, d. belavic, m. hrovat, j. sikula, p. koktavy, j. pavelka, v. sedlakova, "low-frequency noise of thick-film resistors as quality and reliability indicator", microelectronics reliab., vol. 41, pp. 531542, 2002. [2] l.k.j. vandamme, "noise as a diagnostic tool for quality and reliability of electronic devices", ieee transactions on electron devices, vol. 41, pp. 2176-2187, 1994. [3] a.w. stadler, "noise properties of thick-film resistors in extended temperature range", microelectronics reliab., vol. 51, pp. 1264-1270, 2011. [4] b. pellegrini, r.saletti, p. terreni, m. prudenziati, "1/f  noise in thick-film resistors as an effect of tunnel and thermally activated emissions, from measures versus frequency and temperature", phys. rev. b, vol. 27, pp. 1233-1243, 1983. [5] a.w. stadler, a. kolek, z. zawiślak, k. mleczko, m. jakubowska, k.r. kiełbasiński, a. młożniak, "noise properties of pb/cd-free thick film resistors", j. phys. d: applied physics, vol. 43 (26), 265401 (9pp), 2010. [6] a. kolek, a.w. stadler, p. ptak, z. zawiślak, k. mleczko, p. szałański, d. żak, "low-frequency 1/f noise of ruo2-glass thick resistive films", j. appl. phys., vol. 102, 103718 (9pp) , 2007. [7] h. goldberg, "what is virtual instrument", ieee instrumentation &. 
measurement magazine, vol. 3, no. 4, pp. 10-13, 2000. [8] p. janković, j. manojlović, s. đukić, "virtual instrumentation for strain measerument using wheatston bridge model", facta universitatis, series: elec. energ., vol. 26, no 1, pp. 69 – 78, april 2013. [9] a.w. stadler, "noise signal analyzer for multi-terminal devices", in proceedings of the 31st int. conf. of imaps poland chapter, rzeszów krasiczyn, poland, 2007, pp. 413-416. [10] labview 2013 advanced signal processing toolkit help, http://zone.ni.com/reference/enxx/help/372656c-01/lvtimeseriestk/tsa_de-trend/ [11] a. kolek, experimental methods of low-frequency noise (in polish). rzeszow, poland: rzeszow university of technology publishing house, 2006. [12] k. mleczko, z. zawiślak, a.w. stadler, a. kolek, a. dziedzic, j. cichosz, "evaluation of conductive-toresistive layers interaction in thick-film resistors", microelectronics reliab., vol. 48, pp. 881-885, 2008. [13] a. kolek, p. ptak, a. dziedzic, "noise characteristics of resistors buried in low-temperature co-fired ceramics", j. phys. d: applied physics, vol. 36, pp. 1009-1017, 2003. 28 a.w. stadler, a. dziedzic [14] l.k.j. vandamme, h.j. casier, "the 1/f noise versus sheet resistance in poly-si is similar to poly-sige resistors and au-layers", in proceedings of the proc 34th european solid-state device research conf (leuven, belgium), 2004, p 21. [15] test method standard electronic and electrical component parts, mil-std-202g, 2002. [16] a. dziedzic, a. kolek, "1/f noise in polymer thick-film resistors", j. phys. d: applied physics, vol. 31, no. 17, pp. 2091-2097, 1998. [17] a.w. stadler, z. zawiślak, a. dziedzic, w. stęplewski, "studies of noise in polymer thick-film resistors embedded in printed circuit boards", in: microelectronic materials and technologies (ed. by zbigniew suszyński), koszalin technical university monograph series – monograph no. 231, koszalin 2012, vol. 1, pp. 82-97. 
instruction facta universitatis series: electronics and energetics vol. 30, n o 4, december 2017, pp. 557 570 doi: 10.2298/fuee1704557j information system for the centralized display of the transport comfort information * željko jovanović 1 , ranko bačević 1 , radoljub marković 1 , siniša ranđić 1 , dragan janković 2 1 university of kragujevac, faculty of technical sciences, ĉaĉak, serbia 2 university of niš, faculty of electronic engineering, niš, serbia abstract. this paper introduces the information system for presenting road comfort map. the map is generated based on the conducted transportations. as a basis for the information system and the source of the comfort information, developed android application is used. it calculates comfort parameters using three-axis accelerometer values. the calculated data are recorded into the files in the proper format. recorded files are uploaded on the information system to be viewed and analyzed. as a final result of all recorded transportations, it is possible to generate a map of roads comfort. the paper presents the current functionality of the system and the current roads comfort map of covered roads in serbia. based on the collected data about 50% of transportation intervals were comfortable, 44% was moderately uncomfortable, and 6% was uncomfortable. key words: android, transport comfort, gis, comfort map 1. introduction the term transport comfort cannot be strictly defined, but it is of great importance in the assessment of transport quality. the problem is the subjective comfort feeling which is different for every person. comfort depends on many factors like acceleration (vibration), noise, temperature, compartment space, etc. if only mechanical effects are of interest, then generally the acceleration and vibration that passengers feel during the ride have the greatest impact on passenger comfort. vibrations are caused by three factors: vehicle condition, driver skills (driving style), and road condition. 
as for vehicle condition factors, the suspension system and tires are the vehicle parts that affect vibration the most. nowadays, some vehicle suspension systems have active suspension control for better comfort and vehicle stability.
received november 14, 2016; received in revised form march 2, 2017
corresponding author: željko jovanović, university of kragujevac, faculty of technical sciences, čačak, serbia (e-mail: zeljko.jovanovic@ftn.kg.ac.rs)
* an earlier version of this paper received the best paper award in the computer science section at the 60th conference on electronics, telecommunications, computers, automation and nuclear engineering (etran 2016), june 13-16, zlatibor, serbia [1]
driver skills and driving style are also important. on the same road and with the same vehicle, two different drivers could provide different comfort for their passengers. sharp turning, sudden braking, and accelerating are usually marked as uncomfortable actions. road conditions are probably the most important factor for the passenger’s comfort and safety. they can be categorized as static and dynamic factors. static factors are commonly associated with a location, like road bumps and potholes. dynamic factors appear suddenly, like rain, snow, or landslides. the impact of other traffic participants is also a significant dynamic factor. on the basis of the above facts it is evident that many factors affect comfort. it is very important to make transportation as comfortable as possible. uncomfortable transportation affects the mental and physical condition of even healthy passengers. the impact on passengers with health problems is even greater, since uncomfortable driving could worsen their medical condition. since the location of discomfort is one of the most important pieces of information for assessing the comfort of transport, the use of geographic information systems (gis) is of great importance.
nowadays, there are several commonly used gis systems, e.g. openlayers, arcgis, openstreetmaps, geomedia, and googlemaps. their usability and the possibility of integration into other applications increase their popularity. these gis systems are under constant development and new functionalities are implemented almost every day. their real-time interactivity with users provides an increasing amount of information. the dynamics of development and the use of user interaction information can be seen in the example of a recent googlemaps novelty: the route from point a to point b is colored according to the traffic jams detected along it. the information for the road color marking is gathered from numerous users of google maps navigation. features of googlemaps gis are used as the basis for generating the road comfort maps presented in this paper. the aim of the information system presented in this paper is to generate road comfort maps from the information gathered from the users of the client android application. the client android application calculates comfort parameters based on the accelerometer and gps data. the paper is organized as follows. first, related work is presented. then, the android application functionalities and the implemented calculations for transport comfort are demonstrated. after that, the functionalities and usage of the developed web-based information system are demonstrated. at the end, the overall information gathered by the developed information system is presented.
2. related work
in 1972, the international organization for standardization (iso) issued a standard, "a guide to the evaluation of human exposure to whole-body vibration" [2], which is still in general use. it is used for the evaluation of working conditions and exposure to vibrations. the effect of vibration on health, at work, while sitting, and in other life situations is described in [3]. higher vibration exposure has negative health effects.
for vehicles, transport comfort is most affected by tires, suspension, shock absorbers, seats, etc. the impact of the suspension system on the passenger's comfort on various types of roads is presented in [4-6]. it is shown that the suspension system has a great positive effect on comfort but cannot eliminate vibrations. besides vibrations, the noise produced by tires may also affect the passenger’s acoustic comfort, as presented in [7]. gathering information from nodes into a centralized unit is a trend nowadays. wireless sensor networks allow data collection and centralized processing, as in [8]. this is sometimes called swarm intelligence [9]. the role of smartphones is increasing in this area of research. the reason lies in the fact that smartphones equipped with sensors such as an accelerometer, gyroscope, and gps keep increasing their processing capabilities. some phones have almost as much processing power as classic computers. the paper [10] presented a system based on mobile phones to detect potholes on the roads. the phones were placed in taxi vehicles and recorded the locations of detected discomfort. for detection, only the vertical (z) axis was used. in [11], smartphones are used to monitor conditions during transport; potholes, bumps, and siren sounds are detected. comfort calculations are usually based on accelerometer signal processing. an accelerometer detects dynamic movements and is also affected by static gravity. for appropriate dynamic calculations, it is necessary to eliminate the static gravity influence from the accelerometer signal values. this is usually done by implementing a signal filter. the authors of [12, 14] developed automotive real-time observers and an attitude estimation system based on an extended kalman filter (ekf). the authors of [15] used a high-pass filter for road pothole detection.
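one common way to remove the static gravity component is a first-order high-pass filter that tracks gravity with a low-pass stage and subtracts it from the raw signal. the sketch below illustrates that general technique only; it is not the filter of [15] or of the application described later, and the smoothing factor `alpha = 0.8` and the function name are assumptions.

```python
def remove_gravity(samples, alpha=0.8):
    """First-order high-pass filter for a single accelerometer axis.

    A low-pass stage estimates the slowly varying gravity component and
    the estimate is subtracted from each raw sample. alpha is an assumed
    smoothing factor (0 < alpha < 1); larger values track gravity slower.
    """
    gravity = 0.0
    linear = []
    for a in samples:
        gravity = alpha * gravity + (1.0 - alpha) * a  # low-pass estimate
        linear.append(a - gravity)                     # high-pass output
    return linear


# A phone at rest measures only gravity on the vertical axis (m/s^2),
# so the filtered output should decay towards zero.
at_rest = [9.81] * 100
filtered = remove_gravity(at_rest)
```

the filter needs a short settling time after start-up, so the first samples of a recording still carry part of the gravity component.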
the duration of vibration exposure and the interaction between axes also need to be addressed. in [16] the authors did not observe any statistically significant differences in discomfort between 10, 15 and 20-second vibration exposures. in [17] the authors showed that single-axis vertical vibrations were typically associated with less discomfort than multi-axis vibrations. different sensitivity for different axes was also detected for similar ranges of vibration. accordingly, data from all axes need to be collected for appropriate comfort level classification. although the vertical axis is the most influential, the others cannot be ignored. the use of artificial intelligence is increasing in this field of research. in [18] a neural network was used to analyze the quality of public transport. in [19] a bayesian network was used for recognizing the mode of transport. implementing artificial intelligence requires collecting a lot of training data. the work presented in this paper is based on the android application whose development and use are presented in [20]. comfort is measured using values obtained from the triaxial accelerometer, from which the level of passenger comfort is determined. in addition to the accelerometer, gps is used to detect the location of discomfort. accelerometer signals are passed through a high-pass filter to eliminate the static gravity influence. it is set up to cut off 10% of low frequency signals. the decision time interval is set to 10 s, and all three axes are used for comfort level classification. the developed information system will be used to collect a large amount of information that could be used for artificial intelligence implementation in future work.
3. information system architecture
the developed information system is realized in the form of a client-server system. the client part is realized as an android application which measures comfort parameters during transportation.
the server part is realized in the form of a java web application with google maps support for data presentation. a block diagram of the usage of the developed client-server information system is presented in fig. 1.
fig. 1 block diagram of the developed information system usage
as presented, information gathered from all client users is stored in one database (in the server part of the client-server system), which allows transportation and road comfort analysis. with longer use of the information system, the collected information will become more significant and more detailed analyses could be performed. since smartphones are widely used nowadays, there is a large number of potential users for the developed information system.
4. android application – client application
the android application is based on three-axis accelerometer data calculations in a standard ten-second time interval. the development of the main application functionalities was presented in [20], using rxjava [21] for accelerometer calculations, gps monitoring, and the main application thread. to classify transport comfort, it was necessary to determine the comfort levels.
4.1. comfort levels
the iso standard [2] is in general use for comfort level determination. it assumes that acceleration magnitude, frequency spectrum, and duration represent the principal exposure variables, which account for the potentially harmful effects. at the national level (serbia), there is the standard ics 13.160 (srps iso 2631-1:2014 mechanical vibration and shock: evaluation of human exposure to whole-body vibration, part 1: general requirements), which is a translated version of the iso standard [2]. in [2-3], the authors have shown that the sensitivity of the human body at different frequencies depends on the intensity of acceleration.
the effective value of the acceleration (arms) for a discrete system is calculated according to equation (1):
$a_{zrms} = \sqrt{\frac{1}{n}\left(a_{z1}^2 + a_{z2}^2 + \dots + a_{zn}^2\right)}$ (1)
where azi is the i-th z-axis (vertical axis) acceleration sample and n = 200 is the number of samples. in the android application, accelerometer sampling is performed 20 times per second and the decision interval is set to 10 s. the comfort levels defined in [2] are presented in table 1. in this classification only the vertical axis is used.
table 1 comfort levels according to iso standard 2631-1 [2]
arms [m/s²]     comfort level
0-0.315         not uncomfortable
0.315-0.63      a little uncomfortable
0.5-1           fairly uncomfortable
0.8-1.6         uncomfortable
1.6-2.5         very uncomfortable
> 2             extremely uncomfortable
in the developed android application, the arms values of all axes are used for the comfort level classification. three comfort levels are chosen and defined as presented in table 2.
table 2 used comfort levels in the developed android application
arms [m/s²]     comfort level
0-0.315         comfortable
0.315-1         little uncomfortable
> 1             uncomfortable
by comparing table 1 and table 2 it can be seen that the first comfort level (comfortable) from table 2 is the same as the first comfort level (not uncomfortable) from table 1. the little and fairly uncomfortable levels from table 1 are merged into one level (little uncomfortable) in table 2. also, the very and extremely uncomfortable levels from table 1 are merged into one level (uncomfortable) in table 2.
4.2. all implemented calculations
the android application is designed to calculate parameters in standard time intervals. accumulated vibrations are calculated according to equation (1). this calculation is performed for all three axes (rms_x, rms_y, and rms_z). besides these values, for every acceleration sample, the application calculates the magnitude of all three-axis accelerations (2).
$a_{irms} = \sqrt{a_{xi}^2 + a_{yi}^2 + a_{zi}^2}$ (2)
where axi, ayi, azi are the x-, y-, and z-axis accelerations in the i-th sample, and airms is the magnitude of all three-axis accelerations. from these, arms for the decision time interval is calculated according to equation (3).
$a_{rms} = \sqrt{\frac{1}{n}\left(a_{1rms}^2 + a_{2rms}^2 + \dots + a_{nrms}^2\right)}$ (3)
where airms is the i-th magnitude of all three-axis accelerations, and n is the number of samples. for the most uncomfortable sample, the maximum magnitude value (apeak) and its three-axis values (apeak_x, apeak_y, and apeak_z) are determined. for the detected apeak the gps data (latitude, longitude, speed, and time) are stored. besides these, the maximum accelerations on all three axes (max_x, max_y, and max_z) are determined. as a result of the decision time interval calculations, the following values are saved:
• idt: location marker id
• rms_x: x-axis calculation according to equation (1)
• rms_y: y-axis calculation according to equation (1)
• rms_z: z-axis calculation according to equation (1)
• arms: acceleration magnitude according to equation (3)
• apeak: maximum of the acceleration magnitudes calculated by equation (2)
• apeak_x: x-axis acceleration value in apeak
• apeak_y: y-axis acceleration value in apeak
• apeak_z: z-axis acceleration value in apeak
• latitude: gps data for apeak
• longitude: gps data for apeak
• time: gps data for apeak
• speed: gps data for apeak
• max_x: x-axis real value where the absolute maximum x-axis value is detected
• max_y: y-axis real value where the absolute maximum y-axis value is detected
• max_z: z-axis real value where the absolute maximum z-axis value is detected
• comfort: (0=comfortable; 1=a little uncomfortable; 2=uncomfortable)
4.3. file formats
the developed android application saves the final results in files on the mobile device. measurement metadata are stored in the file name, separated by the symbols "--" (two hyphens).
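the interval calculations of section 4.2 and the classification of table 2 can be sketched in a few lines. this is a hedged illustration rather than the application's rxjava code: the function names and the dictionary layout are hypothetical, and the handling of values that fall exactly on a threshold (0.315 and 1 m/s²) is an assumption, since table 2 does not specify it.

```python
import math


def rms(values):
    """Effective (RMS) value of a list of samples, as in equation (1)."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def magnitudes(ax, ay, az):
    """Per-sample magnitude of the three-axis acceleration, equation (2)."""
    return [math.sqrt(x * x + y * y + z * z) for x, y, z in zip(ax, ay, az)]


def classify(a_rms):
    """Comfort level from table 2; boundary handling is an assumption."""
    if a_rms < 0.315:
        return 0  # comfortable
    if a_rms <= 1.0:
        return 1  # little uncomfortable
    return 2      # uncomfortable


def interval_summary(ax, ay, az):
    """Values for one decision interval (gravity already filtered out)."""
    mags = magnitudes(ax, ay, az)
    peak = max(range(len(mags)), key=mags.__getitem__)  # first maximum
    a_rms = rms(mags)                                   # equation (3)
    return {
        "rms_x": rms(ax), "rms_y": rms(ay), "rms_z": rms(az),
        "arms": a_rms,
        "apeak": mags[peak],
        "apeak_x": ax[peak], "apeak_y": ay[peak], "apeak_z": az[peak],
        "max_x": max(ax, key=abs),
        "max_y": max(ay, key=abs),
        "max_z": max(az, key=abs),
        "comfort": classify(a_rms),
    }


# Two hypothetical samples; a real interval holds 200 (20 Hz for 10 s).
summary = interval_summary([0.3, -0.3], [0.0, 0.0], [0.4, 0.4])
```

since equation (3) averages the squared magnitudes, arms² equals rms_x² + rms_y² + rms_z², which gives a quick consistency check on a stored row.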
these are: the time when the measurement was taken, the title and description of the measurement, a unique user id, and the measurement id. below is an example of the file name format.
// format
date--title--description--userid--measurementid.txt
// example
2016-04-02-09-02-52--transport--mladenovac to belgrade--13--183.txt
the data that need to be stored in the database are also separated with "--" symbols. each file row holds the calculated data for one decision interval. the order of the data in a file row needs to be the same as the column order in the database for a successful insert. the symbol ";" is used as the row delimiter. below is an example of the file row format and a sample data row.
// format
measurementid--userid--rmsx--rmsy--rmsz--arms--apeakx--apeaky--apeakz--apeak--latitude--longitude--time--speed--maxx--maxy--maxz--description;
// example
183--13--0.0627--0.033--0.0917--0.1159--0.4161---0.0571--0.0059--0.42--44.4534864--20.6799459--2016-04-02-09-03-01--0--0.4161---0.2515--0.295--merenje;
5. web application for centralized data processing – server application
the server part of the developed client-server information system (vibromap) is realized as a java web application. it is developed as a multilayer application in the model view controller (mvc) architecture. for data presentation, the view part of the mvc, java server pages (jsp) are used. the controller part is developed using servlets, while the model is based on java beans and data access object (dao) classes for communication with the mysql database. the developed information system allows its users to preview and analyze saved transportations (created by the developed client android application) in gis form. the server part of the client-server information system has two types of users: registered users and administrators. the functionalities available to a registered vibromap user are presented by the use-case diagram shown in fig. 2.
fig.
2 the vibromap use-case diagram

after registration and a successful login, the registered user can upload created measurements, preview them in gis or chart form, and delete them. the list of measurements can be presented by date (today or a range of dates). the main functionality is the measurement upload. the measurement upload algorithm is presented in fig. 3.

fig. 3 the measurement upload algorithm

files created with the developed android application can be uploaded to vibromap by registered users through the measurement upload functionality. the choose file window appears and the created file needs to be selected. the selected file is sent to the controller part of vibromap, to the servlet called servletuploadfile. the servlet creates a temporary file from the uploaded file and checks its name format and content format. after a successful check, the servlet checks whether the file's measurementid already exists in the database. if the id doesn't exist, the file data are stored in the database. if the id exists, the number of rows in the file is compared with the number of rows in the database that carry the uploaded measurementid. if the file has more rows than the database, the database rows are deleted. this functionality is implemented for future work on real-time data upload from the android application, because a loss of internet connectivity could lead to incomplete data being recorded in the database. after a successful upload, the data are presented in gis format using google maps. every file row is presented with an appropriate marker on the map. if any of the described validations fails, an appropriate error message is displayed and the data upload is canceled. the data are copied from the file into the database by an sql query in the following format.
LOAD DATA LOCAL INFILE file_location INTO TABLE table_name
FIELDS TERMINATED BY '--'
LINES TERMINATED BY ';'
(column names,…);

file data are first stored in a temporary table. then, depending on the marker type, the data are copied to the appropriate table (marker or analysis). after a successful copy, the uploaded file and the temporary data are deleted.

besides the registered user type, vibromap has an administrator user type. an administrator can preview the data of all registered users and generate the overall road comfort map, which is the most important functionality presented in this paper. this functionality is realized using the googlemaps gis. google released the googlemaps api in june 2005, which allowed its integration into third-party applications. the integration is performed by client-side scripting using javascript and ajax. the latitude and longitude of the city of čačak, serbia, are chosen as the default map center, with a zoom value of 8. this location is chosen since most of the measurements were conducted in its surroundings.

to take advantage of gis, data with location parameters such as latitude and longitude need to be passed to the map. these data are usually stored in a server-side application database. the googlemaps api is realized in a client-side language (javascript), so server-side java functionalities are included to perform the database queries. by combining server-side java and client-side javascript, the appropriate data arrays for gis presentation are created. every array member is presented by one marker on the map. as an additional functionality, actions can be added to the markers by defining an appropriate marker listener using the google.maps.event.addListener function. this enables an info window preview for the marker click action, in which all calculated data for the clicked location are presented.

5.
presentation of the usage scenario of the developed information system

with every measurement uploaded to the web application, new data about transportation comfort are stored in the database, and the road comfort map becomes more complete and more comprehensive. the presented system has been in use since december 2015. the first upload was on december 7, with a measurement created on the čačak-užice route. until the completion of this paper, 38 successful file uploads were performed. fig. 4 presents all collected data (zoomed-out).

fig. 4 the complete road comfort map (zoomed-out)

the presented information system was mostly developed at the faculty of technical sciences in čačak. therefore, most of the measurements were conducted from the city of čačak, serbia, to other cities. at this stage, the comfort map can be analyzed for several destinations: čačak-užice, čačak-beograd, čačak-kraljevo, čačak-kragujevac, and part of the mladenovac-beograd route. on the zoomed-out map it is not easy to detect the comfort value (marker color) for a certain location. since the map is interactive, it can be zoomed into the desired location for a detailed preview. the road comfort map of the main roads of čačak is presented in fig. 5.

fig. 5 the road comfort map, city of čačak, serbia

since the developed android application is mostly used on intercity routes, only the main roads are marked. the marker color presents the first level of information about the measured comfort level:
- green: comfortable interval,
- yellow: semi-uncomfortable interval,
- red: uncomfortable interval.

every presented marker contains information for a ten-second driving interval around its location. the marker location is the location with the highest detected discomfort value (max apeak) in the ten-second comfort decision interval.
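the way one marker is derived from a ten-second decision interval can be sketched in a few lines of python. this is only an illustrative sketch, not the authors' android or java code: the sample layout (a dict with ax, ay, az, latitude, longitude) and the function name summarize_interval are assumptions, the per-sample magnitude is assumed to be the euclidean norm of the three axes, and the comfort classification is left out because its thresholds are not given in this section.

```python
import math


def summarize_interval(samples):
    """Summarize one decision interval (hypothetical helper, not from the paper).

    Each sample is assumed to be a dict with keys
    ax, ay, az (accelerations) and latitude, longitude (GPS position).
    """
    # Per-sample magnitude of the three-axis acceleration (assumed Euclidean norm).
    magnitudes = [
        math.sqrt(s["ax"] ** 2 + s["ay"] ** 2 + s["az"] ** 2) for s in samples
    ]
    # RMS of the magnitudes over the interval, as in equation (3).
    a_rms = math.sqrt(sum(m * m for m in magnitudes) / len(magnitudes))
    # The marker is placed at the sample with the largest magnitude (apeak).
    peak_index = max(range(len(magnitudes)), key=magnitudes.__getitem__)
    peak = samples[peak_index]
    return {
        "arms": a_rms,
        "apeak": magnitudes[peak_index],
        "apeak_x": peak["ax"],
        "apeak_y": peak["ay"],
        "apeak_z": peak["az"],
        "latitude": peak["latitude"],
        "longitude": peak["longitude"],
    }


# Made-up sample values, used only to show the call.
interval = [
    {"ax": 0.0, "ay": 0.0, "az": 0.0, "latitude": 43.89, "longitude": 20.35},
    {"ax": 0.3, "ay": 0.0, "az": 0.4, "latitude": 43.90, "longitude": 20.36},
]
marker = summarize_interval(interval)
```

the returned dictionary mirrors a subset of the saved values listed for a decision interval, and its latitude and longitude give the position at which the marker would be drawn.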
detailed information (the second level of information) is presented in an info window by clicking on the desired marker. this is shown in fig. 6.

fig. 6 the preview of all calculated data for the desired location, presented by clicking on the desired marker in the info window form

as presented in fig. 6, besides the gps data (latitude, longitude, time, and speed), all parameters calculated from the accelerometer signals are displayed. thanks to these, it is possible to provide additional analyses of the comfort or of the cause of discomfort. as the number of uploaded measurements increases, the map becomes more complete and provides more information. it is important to mention that some routes were measured only once, while others have several conducted measurements. fig. 7 presents the part of the čačak-beograd route that was measured three times. fig. 7 shows that some locations from different measurements are marked with markers of different colors, and some with markers of the same color. in the case of same-color markers, every measurement detected the same comfort level. in the case of different-color markers, the comfort level detected by different measurements differed. there are several causes for this situation. the first cause is the time elapsed between measurements. it is possible that a location that was comfortable developed a problem in the meantime (the appearance of potholes, bumps, landslides ...) and became uncomfortable in the next measurement. it is also possible that an uncomfortable location was repaired in the meantime by the road maintenance department and therefore became comfortable. the second cause lies in the variety of vehicles used for the measurements. to be precise, on the same road different vehicles provide different comfort levels for their passengers according to the vehicle's class and condition.
in any case, the presented comfort is the comfort detected inside the vehicle.

fig. 7 the preview of the route with multiple conducted measurements

the third cause is the driver and their driving skills. a more experienced driver adjusts their driving to the road conditions. this can result in comfortable driving even though the road conditions are not satisfactory. the opposite scenario is also possible, when a less experienced driver turns sharply or brakes hard because they are not able to adapt to the road conditions. the fourth cause lies in the influence of other traffic participants. their activities, such as overtaking or braking, can result in a lower comfort level. based on all measurements performed so far, it is possible to analyze the transportation comfort statistics. the statistics for all 38 performed measurements are shown in table 3.

table 3 the statistics of performed measurements
title | value (%)
number of measurements | 38
total number of markers | 6923
comfortable markers | 3500 (50.56)
semi uncomfortable markers | 3012 (43.50)
uncomfortable markers | 411 (5.94)

since the developed android application is used only in intercity transportation, the presented statistics describe transportation comfort for a part of the primary roads of serbia. according to the results, around 6% of the driving intervals were very uncomfortable, while almost 44% contained some cause of discomfort. overall, only about 50% of the intervals were comfortable.

6. conclusion

in this paper, a web-based information system for generating transport comfort maps is presented. the measurement of transportation comfort is conducted using the android application presented in [20]. the presented information system provides centralized data functionalities and a gis preview in the form of interactive googlemaps. the advantage of the presented information system is its web-based approach, whereby the data are available to all users of the system worldwide.
the information that the system provides can be of great importance when choosing the route for future transportations. if some of the already recorded paths are going to be used, reviewing the previous transportations on the route before setting off on the trip can provide information about the expected road conditions. the implementation of the googlemaps api enables an efficient and interactive display of the measured data. the benefits of the developed information system increase with every newly conducted measurement.

some improvements are planned as future work. the first one is the grouping of close markers into a cluster of markers that provides one overall piece of information based on the data of all gathered markers. this would provide road comfort information based on all conducted transportations at the presented location. the second one is communication between the client android application and the server part of the information system. comfort values detected by the android application could be transferred to the developed information system in real time. this would allow a real-time insight into the current status of each vehicle with an active android app. the third one is the implementation of artificial intelligence in comfort level recognition, where the collected data would be used for its training.

acknowledgment: the work presented in this paper was funded by grant no. tr32043 for the period 2011-2016 from the ministry of education and science of the republic of serbia.

references
[1] z. jovanovic, r. bacevic, r. markovic, s. randjic and d. jankovic, "information system for generating road comfort maps", in proceedings of the 60th international conference etran, zlatibor, serbia, 2016, rt 5.6.
[2] iso 2631-1:1997 mechanical vibration and shock - evaluation of human exposure to whole-body vibration - part 1: general requirements.
[3] m. j. griffin, handbook of human vibration, elsevier, 1996.
[4] f. yi and s.
zhang, "ride comfort simulation under random road based on multi-body dynamics", in 3rd international workshop on intelligent systems and applications, ieee, 2011, pp. 1–3.
[5] j. sun and q. yang, "advanced suspension systems for improving vehicle comfort", in proceedings of the international conference on automation and logistics, ieee, 2009, pp. 1264–1267.
[6] s.a. abu bakar, p.m. samin and a.a. azhar, "modelling and validation of vehicle ride comfort model", applied mechanics and materials, vol. 554, pp. 515–519, 2014.
[7] j. ahmad kadri, w.m. wan zuki azman, m.n. zulkifli, m.j.m. nor, a. kamal ariffin and m. hosseini fouladi, "a study on the effects of tyre to vehicle acoustical comfort in passenger car cabin", in proceedings of the 3rd international conference on computer research and development, ieee, 2011, pp. 342–345.
[8] nikolic, n. neskovic, r. antic and a. anastasijevic, "industrial wireless sensor networks as a tool for remote on-line management of power transformers' heating and cooling process", facta universitatis, series: electronics and energetics, vol. 30, no. 1, pp. 107–119, 2017.
[9] f. elfouly, r. ramadan, m. mahmoud and m. dessouky, "swarm intelligence based reliable and energy balance routing algorithm for wireless sensor network", facta universitatis, series: electronics and energetics, vol. 29, no. 3, pp. 339–355, 2016.
[10] mednis, g. strazdins, r. zviedris, g. kanonirs and l. selavo, "real time pothole detection using android smartphones with accelerometers", in proceedings of the international conference on distributed computing in sensor systems and workshop, dcoss'11, 2011.
[11] p. mohan, v.n. padmanabhan and r. ramjee, "trafficsense: rich monitoring of road and traffic conditions using mobile smartphones", in proceedings of the 6th acm conference on embedded network sensor systems, acm, 2008, pp. 323–336.
[12] j. cuadrado, d. dopico, j.
perez, et al., "automotive observers based on multibody models and the extended kalman filter", multibody system dynamics, vol. 27, no. 1, pp. 3-19, 2012. [13] j. lee, e. park, s. robinovitch, "estimation of attitude and external acceleration using inertial sensor measurement during various dynamic conditions", ieee transactions on instrumentation and measurement, vol. 61, no. 8, pp. 2262-2273, 2012. [14] j. blanco-claraco, j. torres-moreno, a. giménez-fernández, "multibody dynamic systems as bayesian networks: applications to robust state estimation of mechanisms", multibody system dynamics, vol. 34, no. 2, pp. 103-128, 2015. [15] j. eriksson, l. girod, b. hull, r. newton, s. madden and f. balakrishnan, "the pothole patrol: using a mobile sensor network for road surface monitoring", in proceedings of the 6th international conference on mobile system, applications and services, 2008, pp. 29-39. [16] j. dickey, m. oliver, p. boileau, t. eger, l. trick and a. edwards, "multi-axis sinusoidal whole-body vibrations: part i how long should the vibration and rest exposures be for reliable discomfort measures?", journal of low frequency noise, vibration and active control, vol. 25, no. 3, pp. 175-184, 2006. [17] j. dickey, t. eger, m. oliver, p. boileau, l. trick and a. edwards, "multi-axis sinusoidal whole-body vibrations: part ll relationship between vibration total value and discomfort varies between vibration axes", journal of low frequency noise, vibration and active control, vol. 26, no. 3, pp. 195-204, 2009. [18] c. garrido, r. de oña, j. de oña, "neural networks for analyzing service quality in public transportation", expert systems with applications, vol. 41, no. 15, pp. 6830-6838, 2014. [19] g. xiao, z. juan, c. zhang, "travel mode detection based on gps track data and bayesian networks", computers, environment and urban systems, vol. 54, pp. 14-22, 2015. [20] z. jovanovic, r. bacevic, r. markovic, s. 
randjic, "android application for observing data streams from built-in sensors using rxjava", in proceedings of the 23rd telecommunications forum telfor, ieee, 2015, pp. 918–921.
[21] reactivex, (n.d.). http://reactivex.io/ (accessed october 5, 2015).

facta universitatis
series: electronics and energetics vol. 27, no. 4, december 2014, pp. 509–520
doi: 10.2298/fuee1404509j

new function for representing iec 61000-4-2 standard electrostatic discharge current

vesna javor
university of niš, faculty of electronic engineering of niš, serbia

abstract. a new function for representing electrostatic discharge (esd) currents according to the iec 61000-4-2 standard is proposed in this paper. good agreement with the parameters defined in the standard is obtained. the function is compared to other functions from the literature. its first derivative, needed for field calculations, is also analyzed. its main advantages are a simplified choice of parameters, the possibility to obtain discontinuities in the decaying part, and a zero value of the first derivative at $t = 0^{+}$. the parameters of the function are obtained by using the least-squares method (lsqm).

key words: analytically extended function, electromagnetic compatibility, electrostatic discharge current, iec 61000-4-2, least-squares method

1. introduction

electromagnetic compatibility (emc) is gaining in importance with the development and global marketing of electronic components, electrical devices and systems, as well as with public concern about electromagnetic pollution. electrical engineers and industrial professionals who design and manufacture such products have to take into account many aspects of emc in order to obtain and market a product that complies with emc standards and directives.
besides offering utility, functionality, good appearance, and the lowest possible cost, any device, equipment or system has to be compatible with its electromagnetic environment and has to function satisfactorily without introducing intolerable electromagnetic disturbances (emds) to other equipment in that environment and without being disturbed by external influences from it [1]. electrostatic discharges (esds) are common phenomena and are among the very important emc aspects of concern. lightning discharges are discharges of static electricity, although their processes are in fact transient, and far from being "static" phenomena. these discharges produce the most powerful emds for electrical systems. in general, electrostatic discharges are dangerous in many technological processes: in the textile industry, petrol industry, powder production, food industry, chemical industry, when handling and transporting various substances, etc. however, there are also useful applications of esds: in medical devices such as defibrillators, in photocopiers, spray painting, electrostatic precipitators, electrostatic dusters, some technological processes in producing fabrics, etc.

received july 1, 2014
corresponding author: vesna javor, faculty of electronic engineering, a. medvedeva 14, 18000 niš, republic of serbia (e-mail: vesna.javor@elfak.ni.ac.rs)

an esd occurs between two objects that are close enough for the difference of their electrostatic potentials to produce breakdown. static electricity may appear not only on parts of machines and after separating different materials that were in contact, but also on humans. in everyday life, the human body may discharge to some object through the fingers or other body parts, via the skin or via small metal pieces such as keys. when this happens at a workplace, it is dangerous for the production of electronic components.
it is well known that integrated circuits and fast complementary metal oxide semiconductor components, as well as digital devices in general, are more sensitive than analog ones, although esd may influence any kind of electrical device or system. the standard iec 61000-4-2 [2], [3], and the european standard en 61000-4-2 (issued by cenelec) deal with the typical waveform of the electrostatic discharge current, the range of test levels, test equipment, test set-up and procedures related to the electrostatic discharge immunity requirements for the equipment under test (eut). scientific committee sc77b, wg 10, also maintains the standard 61000-4-3 on the radiated radio-frequency electromagnetic field immunity test ([4], [5]). the recent status of these standards and the elements of their maintenance are discussed in [6]. the current waveform of test generators is defined in the iec/en 61000-4-2 standards for contact esd testing by its initial peak current, the current level at 30 ns, the current level at 60 ns, and the rise time from 10% to 90% of the initial peak current. in order to improve the repeatability of tests, the tolerance of the rise time of the electrostatic discharge current waveform was expanded in ed.2 of the standard [3]. the oscilloscope bandwidth was increased beyond 1 GHz in order to measure the rise time more accurately [7]. a minimum oscilloscope bandwidth of 2 GHz is required according to iec 61000-4-2, ed.2. esd generators simulate real discharges, thus enabling repetitive test procedures for the eut. however, the current waveshape of an esd test generator depends on various conditions, as discussed in [8]: charging voltages, approach speeds, types of electrodes, relative arc length, humidity, etc. the parameters of real esd testers are also discussed in [9], and the influence of various conditions on the current waveshape is investigated using pspice simulation in [10].
a modified test generator with a reference waveshape close to the standard one, and the corresponding equation for that waveshape, are discussed in [11]. another equation was proposed earlier in [12] in order to study esd in coaxial cable shields. a mathematical function that accurately represents the standard esd current is necessary for computer simulation of such phenomena, for verification of test generators and for better modeling of esds. mathematical functions for modeling lightning discharge currents are used in the literature to approximate the currents of esd testing waveforms, but they have some disadvantages along with their complexity, as described in [13]. a new function that may represent both typical esd and lightning currents, as given in the corresponding standards, is proposed in this paper in order to advance the research and exploit the advantages of computer simulation of the problem. a function is more useful for such purposes if it is as simple as possible while still capable of satisfactorily approximating experimentally measured characteristics. the channel-base current (cbc) function is proposed in [14] for typical and experimental lightning stroke currents, and a two-peaked function in [15]. for representing esd currents, an analytically extended function (aef), formed as the sum of two or three cbc expressions, is used in this paper. the procedure for choosing the function parameters has to be investigated further in order to make it simple for any user. these parameters may be estimated by different procedures, such as a genetic algorithm (ga) as in [17], or the marquardt least-squares method (mlsm) as done in [18] for lightning currents. in this paper the least-squares method (lsqm) is used. first, the commonly used functions are analyzed; after that, the proposed function is compared to the iec 61000-4-2 standard one, and the choice of its parameters and its first derivative are analyzed.
2. functions for approximating electrostatic discharge currents

in the iec 61000-4-2 standard, the esd current peak $i_{peak}$ is specified as 3.75 A/kV, the current value $i_{30ns}$ at 30 ns as 2 A/kV, and $i_{60ns}$ at 60 ns as 1 A/kV. the tolerance for esd contact mode currents is ±10% for $i_{peak}$ in ed.1 and ±15% in ed.2, and ±30% for $i_{30ns}$ and $i_{60ns}$ (in both ed.1 and ed.2). the rise time $t_r$ is defined in the range 0.7 to 1 ns in ed.1 for a typical contact mode discharge, and 0.6 to 1 ns in ed.2 of the standard. the parameters of esd currents are given in table 1 for the defined discharge test voltages. discharges may be contact or air esds. according to the standard, contact discharge is the preferred test method, whereas air discharge is used only where contact discharge cannot be applied. test level voltages range between 2 and 8 kV for contact discharges, and between 2 and 15 kV for air discharges. arc lengths of about 0.85 mm are common for esd test generators at 5 kV, as discussed in [11], but the level and rise time of esd currents are less reproducible in the case of air discharge and depend significantly on humidity, the shape of the tip, the speed of the tip approach, etc. the esd of a human through a small piece of metal is simulated with esd generators for testing the robustness of sensitive electronics toward esd. current waveform parameters are given in table 1 for 2, 4, 6 and 8 kV discharge voltages. the human-body model (hbm) discharge current may be approximately obtained with a simple electrical circuit having a charging resistor of 50 to 100 MΩ, an energy-storage capacitor of 150 pF ± 10%, and a discharge resistor of 330 Ω representing the skin, as in fig. 1. the produced waveshape differs from the test generator esd currents, as well as from the standard one. more complex circuits are also suggested in the literature.
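the per-kilovolt values and tolerances quoted above are enough to regenerate the nominal entries of table 1 and the tolerance windows around them. the short python sketch below does this; the function name iec_contact_parameters and the layout of the returned dictionary are illustrative assumptions, while the numerical factors are the ones stated in the text.

```python
def iec_contact_parameters(voltage_kv, edition=2):
    """Nominal contact-discharge waveform parameters for a given test voltage.

    Hypothetical helper: uses the figures quoted in the text
    (3.75 A/kV peak, 2 A/kV at 30 ns, 1 A/kV at 60 ns).
    """
    peak_tol = 0.10 if edition == 1 else 0.15              # +/-10% (ed.1), +/-15% (ed.2)
    rise_ns = (0.7, 1.0) if edition == 1 else (0.6, 1.0)   # rise-time window in ns
    level_tol = 0.30                                       # +/-30% at 30 ns and 60 ns

    def window(nominal, tol):
        # Lower and upper tolerance limits around a nominal value.
        return (nominal * (1.0 - tol), nominal * (1.0 + tol))

    i_peak = 3.75 * voltage_kv
    i_30 = 2.0 * voltage_kv
    i_60 = 1.0 * voltage_kv
    return {
        "i_peak_a": i_peak,
        "i_peak_range_a": window(i_peak, peak_tol),
        "rise_time_range_ns": rise_ns,
        "i_30ns_a": i_30,
        "i_30ns_range_a": window(i_30, level_tol),
        "i_60ns_a": i_60,
        "i_60ns_range_a": window(i_60, level_tol),
    }


# Nominal parameters for the four contact-discharge test levels.
table = {kv: iec_contact_parameters(kv) for kv in (2, 4, 6, 8)}
```

for the 4 kV level this gives nominal values of 15 A, 8 A and 4 A, which are the reference values used when the approximating functions are compared in the rest of this section.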
table 1 standard 61000-4-2 esd current waveform parameters
discharge voltage [kV] | $i_{peak}$ [A], ±10% (ed.1), ±15% (ed.2) | rise time of the first peak $t_r$ [ns] (ed.1) | rise time of the first peak $t_r$ [ns] (ed.2) | $i_{30ns}$ [A], ±30% (ed.1, 2) | $i_{60ns}$ [A], ±30% (ed.1, 2)
2 | 7.5 | 0.7 to 1 | 0.6 to 1 | 4 | 2
4 | 15 | 0.7 to 1 | 0.6 to 1 | 8 | 4
6 | 22.5 | 0.7 to 1 | 0.6 to 1 | 12 | 6
8 | 30 | 0.7 to 1 | 0.6 to 1 | 16 | 8

fig. 1 simple circuit for obtaining typical hbm current waveform [2]

fig. 2 esd current waveform given in iec 61000-4-2

hbm and contact mode discharges are used for verification of esd test generators, and the standard esd current pulse is given in fig. 2. some functions from the literature are compared for the 4 kV esd, and the proposed function is compared to the best-fitting of those functions and to the standard waveshape. the following expression, using four exponential functions, is proposed in [19]:

$$i(t) = I_1\left[\exp(-t/\tau_1) - \exp(-t/\tau_2)\right] + I_2\left[\exp(-t/\tau_3) - \exp(-t/\tau_4)\right], \qquad (1)$$

for $I_1$ = 498 A, $I_2$ = 148.5 A, $\tau_1$ = 1.4 ns, $\tau_2$ = 1.3 ns, $\tau_3$ = 23.37 ns, $\tau_4$ = 20 ns as function parameters. this function is presented in figs. 3 and 4 with the dash-dot line. an expression using two gaussian functions is proposed for esd currents in [12]:

$$i(t) = A\exp\left[-(t - t_1)^2/\sigma_1^2\right] + B\,t\exp\left[-(t - t_2)^2/\sigma_2^2\right], \qquad (2)$$

for A = 13 A, B = 0.4 A/ns, $t_1$ = 5 ns, $t_2$ = 10 ns, $\sigma_1$ = 1.414 ns, $\sigma_2$ = 35.35 ns. this function is presented in figs. 3 and 4 with the dash-dot-dot line for A = 13.25 A, B = 391 A/ns, $t_1$ = 2 ns, $t_2$ = 300 ns, $\sigma_1$ = 0.6 ns, $\sigma_2$ = 122.2 ns, as given in [13]. for the experimental esd current described in [16], the parameters of (2) are determined by using ga and minimizing the relative error of the current as follows: A = 4.95 A, B = 0.27 A/ns, $t_1$ = 5.18 ns, $t_2$ = 1.62 ns, $\sigma_1$ = 9.78 ns, $\sigma_2$ = 54.72 ns.
using the ga method and minimizing the relative error of the current, the parameters of (2) are determined in [21] as: A = 5.29 A, B = 0.33 A/ns, $t_1$ = 6.07 ns, $t_2$ = 9.48 ns, $\sigma_1$ = 4.31 ns, $\sigma_2$ = 52.03 ns. the pulse function [23] is given with the following expression

$$i(t) = I_0\left[1 - \exp(-t/\tau_1)\right]^{p}\exp(-t/\tau_2), \qquad (3)$$

and its binomial expression with

$$i(t) = I_0\left[1 - \exp(-t/\tau_1)\right]^{p}\exp(-t/\tau_2) + I_1\left[1 - \exp(-t/\tau_3)\right]^{q}\exp(-t/\tau_4). \qquad (4)$$

for the 4 kV esd and the binomial expression (4) of pulse functions, with the parameters $I_0$ = 106.5 A, $I_1$ = 60.5 A, $\tau_1$ = 0.62 ns, $\tau_2$ = 1.1 ns, $\tau_3$ = 55 ns, $\tau_4$ = 26 ns [13], the waveshape is presented in figs. 3 and 4 with the long-dash lines. the trinomial expression of pulse functions is given with

$$i(t) = I_0\left[1 - \exp(-t/\tau_1)\right]^{p}\exp(-t/\tau_2) + I_1\left[1 - \exp(-t/\tau_3)\right]^{q}\exp(-t/\tau_4) + I_2\left[1 - \exp(-t/\tau_5)\right]^{r}\exp(-t/\tau_6), \qquad (5)$$

and the quadrinomial expression with

$$i(t) = I_0\left[1 - \exp(-t/\tau_1)\right]^{p}\exp(-t/\tau_2) + I_1\left[1 - \exp(-t/\tau_3)\right]^{q}\exp(-t/\tau_4) + I_2\left[1 - \exp(-t/\tau_5)\right]^{r}\exp(-t/\tau_6) + I_3\left[1 - \exp(-t/\tau_7)\right]^{s}\exp(-t/\tau_8). \qquad (6)$$

the trinomial (5) and quadrinomial (6) expressions provide better approximations [13] of the esd current and give results closer to the goal function, but these functions have too many parameters. one function commonly used for lightning currents is applied in [11], in the form of a binomial expression of two heidler's functions [20]

$$i(t) = \frac{I_1}{\eta_1}\,\frac{(t/\tau_1)^{n}}{1 + (t/\tau_1)^{n}}\exp(-t/\tau_2) + \frac{I_2}{\eta_2}\,\frac{(t/\tau_3)^{n}}{1 + (t/\tau_3)^{n}}\exp(-t/\tau_4), \qquad (8)$$

for the peak correction factors

$$\eta_1 = \exp\left[-\frac{\tau_1}{\tau_2}\left(\frac{n\,\tau_2}{\tau_1}\right)^{1/n}\right] \quad \text{and} \quad \eta_2 = \exp\left[-\frac{\tau_3}{\tau_4}\left(\frac{n\,\tau_4}{\tau_3}\right)^{1/n}\right].$$

to approximate the measured human-metal esd current at 5 kV, the parameters are chosen as follows: $I_1$ = 21.9 A, $I_2$ = 10.1 A, $\tau_1$ = 1.3 ns, $\tau_2$ = 1.7 ns, $\tau_3$ = 6 ns, $\tau_4$ = 58 ns and n = 3.
for the 4 kV discharge, the parameter values in [13] are chosen as: $I_1$ = 17.5 A, $I_2$ = 10.1 A, $\tau_1$ = 1.3 ns, $\tau_2$ = 1.7 ns, $\tau_3$ = 8.7 ns, $\tau_4$ = 42 ns and n = 3. this function is presented in fig. 3 with the dot line. after choosing n = 3 as an initial value and using ga with minimization of the relative error of the current, the parameters are determined for the experimental esd current described in [16] as follows: $I_1$ = 17.46 A, $I_2$ = 7.81 A, $\tau_1$ = 0.75 ns, $\tau_2$ = 0.82 ns, $\tau_3$ = 3.43 ns, $\tau_4$ = 68.7 ns. the waveform approximating the esd current from iec 61000-4-2 ed.2 [3], for 4 kV, is obtained for: $I_1$ = 16.6 A, $I_2$ = 9.3 A, $\tau_1$ = 1.1 ns, $\tau_2$ = 2.0 ns, $\tau_3$ = 12 ns, $\tau_4$ = 37 ns, n = 1.8, and presented in figs. 3 and 4 with the full lines. after choosing n = 1.7 as an initial value in [22] for the ga procedure with minimization of the relative error of the current, the parameters are calculated for the esd current as follows: $I_1$ = 16.3 A, $I_2$ = 9.1 A, $\tau_1$ = 1.2 ns, $\tau_2$ = 2.05 ns, $\tau_3$ = 11.7 ns, $\tau_4$ = 37.3 ns, n = 1.82. the following function is proposed in [21]

$$i(t) = A\,t\exp(-C\,t) + B\,t\exp(-D\,t), \qquad (9)$$

for approximating the iec 61000-4-2 ed.2 esd current with the following parameters: A = 38.1679 A/ns, B = 1.0526 A/ns, C = 1 ns$^{-1}$, and D = 0.0459 ns$^{-1}$. the function is presented in figs. 3 and 4 with the short-dash lines.

fig. 3 functions approximating the standard 61000-4-2 esd current waveform for 4 kV

the rising time is the difference between $t_b$, at which the current reaches 90% of the current peak ($i_{90\%}$ = 13.5 A), and $t_a$, at which it reaches 10% of the current peak ($i_{10\%}$ = 1.5 A). rising times as in the standard 61000-4-2 are obtained with very different waveshape behaviour in the first 5 ns for the functions from fig. 3, as presented in fig. 4. all the functions are presented from $t = 0^{+}$, for $i_{max}$ = 15 A, although the standard function rises between 6 and 8 ns and is given with a tolerably lowered peak value $i_{max}$ = 14 A if $i_{30ns}$ = 8 A and $i_{60ns}$ = 4 A are chosen as reference (figs. 2 and 5). the two-gauss function has the greatest rising time and the wang function the shortest.
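the rise-time definition above can be checked numerically on the two-heidler expression (8) with the 4 kV iec 61000-4-2 ed.2 parameters quoted in the text. the python sketch below is an illustration under stated assumptions, not the author's code: it assumes the usual heidler form with the peak correction factors written as in (8), the function names are hypothetical, time is in nanoseconds and current in amperes, and the rise time is found by a simple scan over a fixed grid of the first 5 ns.

```python
import math


def heidler_term(t, amplitude, tau_rise, tau_decay, n):
    """One Heidler term with its peak correction factor eta (assumed form of (8))."""
    eta = math.exp(-(tau_rise / tau_decay) * (n * tau_decay / tau_rise) ** (1.0 / n))
    x = (t / tau_rise) ** n
    return (amplitude / eta) * x / (1.0 + x) * math.exp(-t / tau_decay)


def two_heidler(t, i1=16.6, i2=9.3, tau1=1.1, tau2=2.0, tau3=12.0, tau4=37.0, n=1.8):
    """Sum of two Heidler terms; defaults are the 4 kV ed.2 values quoted in the text."""
    if t <= 0.0:
        return 0.0
    return heidler_term(t, i1, tau1, tau2, n) + heidler_term(t, i2, tau3, tau4, n)


def rise_time(current, t_end=5.0, step=0.001):
    """Return (10%-90% rise time, peak) of the first peak found within [0, t_end] ns."""
    times = [k * step for k in range(int(round(t_end / step)) + 1)]
    values = [current(t) for t in times]
    peak = max(values)
    # First grid instants at which the current reaches 10% and 90% of the peak.
    t_a = next(t for t, v in zip(times, values) if v >= 0.1 * peak)
    t_b = next(t for t, v in zip(times, values) if v >= 0.9 * peak)
    return t_b - t_a, peak


t_r, i_peak = rise_time(two_heidler)
i_30 = two_heidler(30.0)
i_60 = two_heidler(60.0)
```

with these parameters the computed peak, the values at 30 ns and 60 ns, and the rise time should fall inside the tolerance windows of table 1 for the 4 kV level; the same rise_time helper can be applied to any of the other approximating functions once they are written as python callables.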
the four-exponential expression and the wang function don’t have realistic rising parts. two-heidler’s function for n=1.8, given with the full lines in figs. 3 and 4, represents the standard waveshape better than the others.
fig. 4 functions approximating the standard 61000-4-2 esd current waveform for 4kv in the first 5ns, with notations from fig. 3
3. new function for approximating electrostatic discharge currents
an analytically extended function (aef), with the same expression before and after the time moments of maxima, but for different parameters, is proposed for approximating esd currents. its main advantages are: a simply adjustable derivative value, rise time value and time to the peak value, exact peak values chosen prior to adjusting other parameters, and a suitable waveform with zero first derivative at the point $t=0^+$. the function is continuous, with its first derivative also continuous at any t, so it is of differentiability class $c^1$. higher order derivatives have discontinuities at the points of maximum/minimum, so the first derivative of the function belongs to class $c^0$. the current function cbc [14] is given by the expression
$$ i_1(t) = \begin{cases} i_{m1}\,(t/t_{m1})^{a}\exp[a(1-t/t_{m1})], & 0 \le t \le t_{m1}, \\ i_{m1}\,(t/t_{m1})^{b}\exp[b(1-t/t_{m1})], & t_{m1} \le t, \end{cases} \tag{10} $$
and another by
$$ i_2(t) = \begin{cases} i_{m2}\,(t/t_{m2})^{c}\exp[c(1-t/t_{m2})], & 0 \le t \le t_{m2}, \\ i_{m2}\,(t/t_{m2})^{d}\exp[d(1-t/t_{m2})], & t_{m2} \le t, \end{cases} \tag{11} $$
so that
$$ i(t) = i_1(t) + i_2(t) \tag{12} $$
may represent the esd current. it is denoted by esd2 and presented in fig. 5. it may be written in another way as
$$ i(t) = \begin{cases} i_{m1}(t/t_{m1})^{a}\exp[a(1-t/t_{m1})] + i_{m2}(t/t_{m2})^{c}\exp[c(1-t/t_{m2})], & 0 \le t \le t_{m1}, \\ i_{m1}(t/t_{m1})^{b}\exp[b(1-t/t_{m1})] + i_{m2}(t/t_{m2})^{c}\exp[c(1-t/t_{m2})], & t_{m1} \le t \le t_{m2}, \\ i_{m1}(t/t_{m1})^{b}\exp[b(1-t/t_{m1})] + i_{m2}(t/t_{m2})^{d}\exp[d(1-t/t_{m2})], & t_{m2} \le t, \end{cases} \tag{13} $$
where a, b, c and d are constants, and $t_{m1} < t_{m2}$. using lsqm to approximate the iec 61000-4-2 standard esd current, the parameters are determined as im1=14a, im2=8.4a, tm1=1ns, tm2=21ns, a=2, b=0.3, c=3, and d=0.9. if three functions based on the same expressions are used, their sum better represents the iec 61000-4-2 standard current, as given in fig. 5 and denoted by esd3:
$$ i_1(t) = \begin{cases} i_{m1}\,(t/t_{m1})^{a}\exp[a(1-t/t_{m1})], & 0 \le t \le t_{m1}, \\ i_{m1}\,(t/t_{m1})^{b}\exp[b(1-t/t_{m1})], & t_{m1} \le t, \end{cases} \tag{14} $$
$$ i_2(t) = \begin{cases} i_{m2}\,(t/t_{m2})^{c}\exp[c(1-t/t_{m2})], & 0 \le t \le t_{m2}, \\ i_{m2}\,(t/t_{m2})^{d}\exp[d(1-t/t_{m2})], & t_{m2} \le t, \end{cases} \tag{15} $$
$$ i_3(t) = \begin{cases} i_{m3}\,(t/t_{m3})^{e}\exp[e(1-t/t_{m3})], & 0 \le t \le t_{m3}, \\ i_{m3}\,(t/t_{m3})^{f}\exp[f(1-t/t_{m3})], & t_{m3} \le t, \end{cases} \tag{16} $$
so that esd3 is
$$ i(t) = i_1(t) + i_2(t) + i_3(t). \tag{17} $$
this may also be written as
$$ i(t) = \begin{cases} i_{m1}(t/t_{m1})^{a}\exp[a(1-t/t_{m1})] + i_{m2}(t/t_{m2})^{c}\exp[c(1-t/t_{m2})] + i_{m3}(t/t_{m3})^{e}\exp[e(1-t/t_{m3})], & 0 \le t \le t_{m1}, \\ i_{m1}(t/t_{m1})^{b}\exp[b(1-t/t_{m1})] + i_{m2}(t/t_{m2})^{c}\exp[c(1-t/t_{m2})] + i_{m3}(t/t_{m3})^{e}\exp[e(1-t/t_{m3})], & t_{m1} \le t \le t_{m2}, \\ i_{m1}(t/t_{m1})^{b}\exp[b(1-t/t_{m1})] + i_{m2}(t/t_{m2})^{d}\exp[d(1-t/t_{m2})] + i_{m3}(t/t_{m3})^{e}\exp[e(1-t/t_{m3})], & t_{m2} \le t \le t_{m3}, \\ i_{m1}(t/t_{m1})^{b}\exp[b(1-t/t_{m1})] + i_{m2}(t/t_{m2})^{d}\exp[d(1-t/t_{m2})] + i_{m3}(t/t_{m3})^{f}\exp[f(1-t/t_{m3})], & t_{m3} \le t, \end{cases} \tag{18} $$
where a, b, c, d, e and f are constants, and $t_{m1} < t_{m2} < t_{m3}$. using lsqm the parameters are determined as im1=14a, im2=8.2a, im3=2.2a, tm1=1ns, tm2=21ns, tm3=50ns, a=2, b=0.3, c=2.5, d=1.5, e=15, and f=7. for both esd2 and esd3 the maximum peak value can be set to 15a simply by choosing im1=15a. esd3 represents the iec 61000-4-2 standard current waveform, as given in fig. 5, better than esd2 or two-heidler’s function for n=1.8. its derivative is also continuous, but only of differentiability class $c^0$, as the second derivative of the function has discontinuities at tm1, tm2 and tm3.
fig. 5 aef approximating iec 61000-4-2 standard current waveform for 4kv
fig. 6 esd3 rising part from 6 to 8 ns
fig. 7 esd3 derivative for a=2
fig. 8 esd3 derivative from 15 to 100ns
the rising part of the esd3 function is given in fig. 6. the function derivative in the first 100ns is presented in fig. 7. the first derivative is greater for a greater parameter a, so that for a=10 the rising time is 0.4ns, for a=5 it is 0.5ns, and for a=2 it is 0.6ns, as defined in the iec 61000-4-2 standard. parameter a does not influence the choice of other parameters. fig.
8 shows the esd3 derivative from 15 to 100ns, where the discontinuities required by the standard current appear. esd2, esd3 and two-heidler’s function (for n=1.8) are given in fig. 9. for comparison, two-heidler’s function is delayed by 6ns and its peak is set to the same value as for esd2 and esd3 representing the standard current.
fig. 9 esd2, esd3 and two-heidler’s function representing the standard current
conclusions
functions for approximating esd currents are needed for simulation of different types of electrostatic discharges, calibration of test equipment and adequate representation of the iec 61000-4-2 standard current. important features of such mathematical functions are a good approximation of realistic waveshapes and of discontinuities in specified time intervals, a zero function derivative at $t=0^+$, and a simple choice of function parameters. the new function presented in this paper in two forms, esd2 and esd3, may be used to approximate different electrostatic discharge currents. their waveshapes are compared to other functions from the literature and show better agreement with the iec 61000-4-2 standard current waveshape and its defined parameters. the function derivative is also analyzed. rising time, maximum and minimum values, as well as the needed discontinuities, may be obtained for this function independently of other parameters and without peak correction factors, which simplifies any optimization algorithm used to obtain its parameters. further research will include calculation of parameters according to experimentally measured esd currents, and application of different optimization procedures.
acknowledgement: this paper is in the frame of research within the project humanism iii 44004 financed by the serbian ministry of education, science and technological development.
references
[1] c. paul, "introduction to electromagnetic compatibility", ed. 2, john wiley & sons, 2006.
[2] emc – part 4-2: testing and measurement techniques – electrostatic discharge immunity test. iec international standard 61000-4-2, basic emc publication, 1995+a1:1998+a2:2000.
[3] emc – part 4-2: testing and measurement techniques – electrostatic discharge immunity test. iec international standard 61000-4-2, ed. 2, 2009.
[4] emc – part 4-3: testing and measurement techniques – radiated radio-frequency immunity test. iec international standard 61000-4-3, ed. 2, 2002.
[5] emc – part 4-3: testing and measurement techniques – radiated radio-frequency immunity test. iec international standard 61000-4-3 (77b/339/fdis), ed. 3, 2006+a1:2007.
[6] t. ishida, g. hedderich, "recent status of iec 61000-4-2 and iec 61000-4-3", emc’09 kyoto, 2009, pp. 821-824.
[7] t. c. moyer, r. gensel, "update on esd testing according to iec 61000-4-2", em test.
[8] d. pommerenke, m. aidam, "esd: waveform calculation, field and current of human and simulator esd", j. electrostat., no.38, 1996, pp. 33-51.
[9] o. fujiwara, h. tanaka, y. yamanaka, "equivalent circuit modeling of discharge current injected in contact with an esd gun", electr. eng. japan, no.149, 2004, pp. 8-14.
[10] n. murota, "determination of characteristics of the discharge current by the human charge model esd simulator", electron. commun. japan, no.80, 1997, pp. 49-57.
[11] k. wang, d. pommerenke, r. chundru, t. van doren, j. l. drewniak, a. shashindranath, "numerical modeling of electrostatic discharge generators", ieee transactions on emc, vol.45, no.2, may 2003, pp. 258-271.
[12] s. v. berghe, d. zutter, "study of esd signal entry through coaxial cable shields", j. electrostat., no.44, 1998, pp. 135-148.
[13] z. yuan, t. li, j. he, s. chen, r. zeng, "new mathematical descriptions of esd current waveform based on the polynomial of pulse function", ieee transactions on emc, vol.48, no.3, 2006, pp. 589-591.
[14] v. javor, "multi-peaked functions for representation of lightning channel-base currents", 31st int. conf. on lightning protection iclp 2012, proc. of papers, doi: 10.1109/iclp.2012.6344384, vienna, 2012.
[15] v. javor, "approximation of a double-peaked lightning channel-base current", compel: the int. journal for comp. and mathematics in electrical and electronic engineering, vol.31, no.3, 2012, pp. 1007-1017.
[16] g. p. fotis, i. f. ganos and i. a. stathopulos, "determination of discharge current equation parameters of esd using genetic algorithms", electronic letters, vol.42, no.14, july 2006.
[17] g. p. fotis, f. e. asimakopoulou, i. f. ganos and i. a. stathopulos, "applying genetic algorithms for the determination of the parameters of the electrostatic discharge current equation", measurement science and technology, vol.17, pp. 2819-2827, 2006.
[18] k. lundengard, m. rančić, v. javor, s. silvestrov, "estimation of pulse function parameters for approximating measured lightning currents using the marquardt least-squares method", emc europe 2014, 2014, (accepted for publication).
[19] r. k. keenan, l. k. a. rosi, "some fundamental aspects of esd testing", in proc. ieee int. symp. on electromagnetic compatibility, aug. 12-16, 1991, pp. 236-241.
[20] f. heidler, "travelling current source model for lemp calculation", in proc. 6th int. zurich symp. emc, zurich, switzerland, pp. 157-162, mar. 1985.
[21] k. wang, j. wang, x. wang, "four order electrostatic discharge circuit model and its simulation", telkomnika, vol.10, no.8, pp. 2006-2012, dec. 2012.
[22] g. p. fotis, l. ekonomou, "parameters’ optimization of the electrostatic discharge current equation", int. journal on power system optimization, vol.3, no.2, pp. 75-80, 2011.
[23] s. shenglin, b. zengjun, l. shange, "a new analytical expression of standard current waveform", high power laser and particle beams, vol.15, no.5, pp. 464-466, 2003.
[24] r. chundru, d. pommerenke, k. wang, t. van doren, f. p. centola, j. s.
huang, "characterization of human metal esd reference discharge event and correlation of generator parameters to failure levels – part i: reference event", ieee transactions on emc, vol.46, no.4, pp. 498-504, nov. 2004.
[25] v. javor, "modeling of lightning strokes using two-peaked channel-base currents", int. journal of antennas and propagation, vol. 2012, article id 318417, doi: 10.1155/2012/318417, 2012.

instruction facta universitatis series: electronics and energetics vol. 30, no 2, june 2017, pp. 179-185 doi: 10.2298/fuee1702179d
a new lumped element bridged-t absorptive bandstop filter
suhash c. dutta roy
formerly at the department of electrical engineering, indian institute of technology, delhi, new delhi, india
abstract. following a brief review of previous work on bandstop filters, the inadequacy of a recent work to obtain a perfect notch or perfect absorption at the notch frequency ω0 is demonstrated. a simple and elegant alternative solution, based on purely analytical arguments, is then presented. the resulting network is shown to achieve perfect matching as well as perfect absorption at the notch frequency and has several other advantages. a comparison has also been made with the conventional bridged-t band-stop filter.
key words: bandstop filter, bridged-t network, circuit design.
1. introduction
bandstop filters are circuits which reject, to within a specified tolerance, a band of frequencies around a centre frequency at which there is complete rejection. such filters are known by various names, such as band rejection filters, notch filters, null networks etc., and are required in many situations in communication and instrumentation. bandstop filters have fascinated a large number of researchers, including the present author, who has written papers on the analysis [1-7], design [8] and its limitations [9], and analysis and applications of dual input techniques to such filters [10-12].
all these contributions relate to analog circuits. bandstop filters are also required in digital signal processing, and the author and his students have done extensive work on digital notch filters, using both fir and iir techniques [13-21]. of these, [21] is a review of fir notch filter design, which appeared in this journal. at low frequencies, passive rc networks are mostly used, except in situations where a selectivity, defined as (notch frequency)/(3 db stop bandwidth), is required to be more than half. in the latter cases, either active rc filters or lc networks are to be used. for high frequencies, lc networks are easily designed and implemented. at microwave frequencies, distributed networks are preferred over lumped networks, although the latter have the advantage of occupying less space, and as is well known, space is at a premium in microwave integrated circuits. examples of lumped element microwave bandstop filters can be found in [22-29], while bandstop filters with distributed elements can be found in [30,31].
received november 3, 2016
corresponding author: suhash c. dutta roy, department of electrical engineering, indian institute of technology, 164, hauz khas apartments, new delhi 110016, india (e-mail: s.c.dutta.roy@gmail.com)
2. scope and organization of the paper
this paper is concerned with the design of a band-stop filter which achieves a perfect notch and perfect absorption at some frequency ω0. in this context, we first demonstrate, in section 3, the inadequacy of a recent solution proposed by chieh and rowland [32], by network theoretic arguments. in section 4, we present a new, simple and elegant alternative design, based on purely analytical arguments. the resulting network is shown to achieve perfect matching as well as perfect absorption at the notch frequency, and has several other advantages. a normalized design is discussed in section 5, and the simulation results are presented.
a comparison of the new design with the conventional bridged-t bandstop filter is made in section 6. finally, section 7 gives the concluding comments.
3. chieh and rowland's design
chieh and rowland [32] proposed the symmetrical network of fig. 1, where
$z_1(j\omega)=1/(j\omega c)$, (1a)
$z_2(j\omega)=r_1+j\omega l_1+1/(j\omega c_1)$ (1b)
and $z_3(j\omega)=r_2+j\omega l_2+1/(j\omega c_2)$, (1c)
and both $z_2$ and $z_3$ resonate at the same frequency $\omega_o$. for ready reference, we reproduce here the expressions for the z-parameters of the network and the scattering parameters, in slightly different forms:
fig. 1 the bridged-t network
$z_{11}=z_{22}=z_2+(z_1^2+z_1z_3)/(2z_1+z_3)$, (2)
$z_{12}=z_{21}=z_2+z_1^2/(2z_1+z_3)$, (3)
$s_{12}=s_{21}=2z_{21}z_o/[(z_{11}+z_o)^2-z_{21}^2]$, (4)
and $s_{11}=s_{22}=(z_{11}^2-z_o^2-z_{21}^2)/[(z_{11}+z_o)^2-z_{21}^2]$. (5)
note from (2) and (3) that $z_{11}=z_{21}+z_1z_3/(2z_1+z_3)$. (6)
from (4) and (5), we observe that for a perfect notch as well as perfect absorption at the frequency $\omega_o$, we require
$z_{21}(j\omega_o)=0$ (7)
and $z_{11}(j\omega_o)=z_o$. (8)
from (1), we have $z_1(j\omega_o)=1/(j\omega_o c)$, $z_2(j\omega_o)=r_1$, and $z_3(j\omega_o)=r_2$. (9)
substituting these values in (3) gives, on simplification,
$z_{21}(j\omega_o)=r_1+1/[j\omega_o c(2+j\omega_o c r_2)]$, (10)
which cannot be made zero because its imaginary part never vanishes. also, under this condition,
$z_{11}(j\omega_o)=r_1+(1+j\omega_o c r_2)/[j\omega_o c(2+j\omega_o c r_2)]$, (11)
which cannot be equal to $z_o$ if the latter is purely resistive, which is usually the case. equation (8) can be satisfied only if $z_o$ is a complex series rc impedance. thus the network of fig. 1 with the element values given by (1) can achieve neither a perfect notch nor perfect absorption.
4. the new design
the problem to be solved can be restated as follows: given $\omega_o$ and $r_o$ and the network topology of fig. 1, find $z_1$, $z_2$ and $z_3$ such that
$z_{21}(j\omega_o)=z_2(j\omega_o)+[z_1(j\omega_o)]^2/[2z_1(j\omega_o)+z_3(j\omega_o)]=0$, (12)
and $z_{11}(j\omega_o)=z_{21}(j\omega_o)+z_1(j\omega_o)z_3(j\omega_o)/[2z_1(j\omega_o)+z_3(j\omega_o)]=r_o$, (13)
where $z_o$ has been assumed to be resistive, equal to $r_o$. in view of (12), (13) reduces to
$z_{11}(j\omega_o)=z_1(j\omega_o)z_3(j\omega_o)/[2z_1(j\omega_o)+z_3(j\omega_o)]=r_o$.
(14)
from (14), $z_3$ is expressed in terms of $z_1$ as
$z_3(j\omega_o)=2r_oz_1(j\omega_o)/[z_1(j\omega_o)-r_o]$. (15)
combining this with (12) and simplifying, we get
$z_2(j\omega_o)=[r_o-z_1(j\omega_o)]/2$. (16)
we can now choose a $z_1$. if we take $z_1(j\omega_o)=1/(j\omega_o c)$, as in (1a), then (15) gives, on simplification,
$z_3(j\omega_o)=[2r_o/(1+\omega_o^2c^2r_o^2)]+j\omega_o[2cr_o^2/(1+\omega_o^2c^2r_o^2)]$, (17)
which represents a series combination of an inductance $l_3$ and a resistance $r_3$, where
$r_3=2r_o/(1+\omega_o^2c^2r_o^2)$ and $l_3=2cr_o^2/(1+\omega_o^2c^2r_o^2)$. (18)
similarly, (16) gives
$z_2(j\omega_o)=(r_o/2)+j\omega_o/(2\omega_o^2c)$, (19)
which also represents a series combination of an inductance $l_2$ and a resistance $r_2$, where
$r_2=r_o/2$ and $l_2=1/(2\omega_o^2c)$. (20)
in theory, c can be chosen to have any value, but as we shall see, it will be most convenient to choose c from the expression for $r_3$ given in (18), which gives
$c=[(2r_o/r_3)-1]^{1/2}/(\omega_o r_o)$. (21)
note that if we choose
$c=1/(\omega_o r_o)$, (22)
then $r_3$ becomes equal to $r_o$. also, under this condition, (18) and (20) give
$l_3=r_o/\omega_o$ and $l_2=r_o/(2\omega_o)$. (23)
this choice of c is advantageous because $z_3$ can then be obtained as a series combination of two $z_2$ branches, and there is no spread in the element values of the network. also note that lossy inductors can be used with ease because their losses can be absorbed in their series resistances. finally, the element values of the network are consolidated as
$c=1/(\omega_o r_o)$, $l_3=2l_2=r_o/\omega_o$ and $r_3=2r_2=r_o$. (24)
fig. 2 the normalized design of the absorptive bandstop filter
5. a normalized design
it is always convenient to have a normalized design which can be denormalized by impedance and frequency scaling. let $r_o=1$ ohm and $\omega_o=1$ rad/sec. then (24) gives the element values as
c=1f, $l_3=2l_2=1$h and $r_3=2r_2=1$ ohm. (25)
the resulting network is shown in fig. 2. this network has been simulated with matlab and the obtained plots of $|s_{11}(j\omega)|$ and $|s_{21}(j\omega)|$ are shown in fig. 3. these plots exactly match the theoretical predictions.
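the normalized design can also be checked numerically from the z-parameter expressions (3) and (6) and the scattering-parameter expressions (4) and (5). the following is a minimal python sketch, not the matlab simulation used by the author; the function name and its default arguments are illustrative, and (5) is used in its standard symmetric two-port form with the $z_o^2$ term.

```python
def bridged_t_s_params(w, c=1.0, l2=0.5, r2=0.5, l3=1.0, r3=1.0, ro=1.0):
    """Return (s11, s21) of the bridged-T network of fig. 1 at angular frequency w.

    Defaults are the normalized element values of (25):
    c = 1 F, l3 = 2*l2 = 1 H, r3 = 2*r2 = 1 ohm, ro = 1 ohm.
    """
    s = 1j * w
    z1 = 1.0 / (s * c)          # series-arm capacitors
    z2 = r2 + s * l2            # shunt branch (series r-l)
    z3 = r3 + s * l3            # bridging branch (series r-l)

    z21 = z2 + z1**2 / (2 * z1 + z3)        # equation (3)
    z11 = z21 + z1 * z3 / (2 * z1 + z3)     # equation (6)

    den = (z11 + ro) ** 2 - z21**2
    s21 = 2 * z21 * ro / den                # equation (4)
    s11 = (z11**2 - ro**2 - z21**2) / den   # equation (5)
    return s11, s21


if __name__ == "__main__":
    for w in (0.1, 0.5, 1.0, 2.0, 10.0):
        s11, s21 = bridged_t_s_params(w)
        print(f"w = {w:5.2f}  |s11| = {abs(s11):.4f}  |s21| = {abs(s21):.4f}")
```

at $\omega=\omega_o=1$ rad/sec both magnitudes evaluate to zero within rounding, which is the perfect notch and perfect absorption required by (7) and (8). denormalized designs can be checked by passing the scaled element values and the desired $r_o$.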
6. comparison with the conventional bridged-t bandstop filter
it may be noted that, compared to the network proposed in [32], the conventional bridged-t bandstop filter [3] performs better because it achieves a perfect notch, although not perfect absorption. in this network,
z1(jω)=1/(jωc), z2(jω)= r+jωl, and z3(jω)=r. (26)
the network then achieves a perfect notch at ω=[2/(lc)]^(1/2) under the condition l=crr, but it cannot achieve s11(jωo)=0 unless zo is a parallel combination of a capacitor c and a resistor r/2, which is not the usual case. also, if we choose r=r, then there is no spread in the component values. further, as in the proposed alternative, a lossy inductor can be used here. in addition, in comparison with the network of [32] and that proposed here, it uses the least number of reactive elements, viz. three, yielding a transfer function of order three.
fig. 3 performance of the normalized design. the upper curve is a plot of |s21(jω)| and the lower curve represents |s11(jω)|
7. concluding comments
it has been shown that the network proposed in [32] achieves neither a perfect notch nor perfect absorption. an alternative solution is proposed here purely by analytical, rather than physical or heuristic, arguments, which achieves these two objectives simultaneously. the element values are obtained very simply, rather than by numerical and parametric methods as in [32]. also, the new solution uses only two capacitors, instead of four, which reduces the order of the transfer function by two. by an appropriate choice of the elements, there is no spread in the element values. a normalized design has been presented and the resulting characteristics of |s11(jω)| and |s21(jω)| have been plotted. a comparison of the two circuits has also been made with the conventional bridged-t bandstop filter.
acknowledgement: the author thanks professor y. v.
joshi for his help in the preparation of the manuscript and performing the simulation.
references
[1] s. c. dutta roy, "analyzing the parallel-t rc network – yet another method", iete j. educ., vol. 44, pp. 111-116,
[2] s. c. dutta roy, "a quick method of analyzing parallel ladder networks", int. j. elect. eng. educ., vol. 13, pp. 70-75, jan. 1976.
[3] s. c. dutta roy, "miller’s theorem revisited", circuits, syst. signal process., vol. 19, pp. 487-499, dec. 2000.
[4] s. c. dutta roy, "on second order digital bandpass and bandstop filters", iete j. educ., vol. 49, pp. 59-63, may-aug. 2008.
[5] s. c. dutta roy, "interference rejection in a uwb system: an example of lc driving point synthesis", iete j. educ., vol. 50, pp. 55-58, may-aug. 2009.
[6] s. c. dutta roy, "on some three terminal lumped and distributed rc null networks", ieee trans. circuit theory, vol. ct-11, pp. 98-103, mar. 1964.
[7] s. c. dutta roy and b. a. shenoi, "notch networks using distributed rc elements", proc. ieee, vol. 54, pp. 1220-1221, sept. 1966.
[8] s. c. dutta roy, "on the design of parallel-t resistance capacitance networks for maximum selectivity", j. ite, vol. 8, pp. 218-223, sept. 1962.
[9] s. c. dutta roy, "parallel-t rc network: limitations of design equations and shaping the transmission characteristic", indian j. pure appl. phys., vol. 1, pp. 175-181, may 1963.
[10] s. c. dutta roy, "dual input null networks", proc. ieee, vol. 55, pp. 221-222, feb. 1967.
[11] s. c. dutta roy and n. choudhury, "an application of dual input networks", proc. ieee, vol. 56, pp. 647-646, may 1970.
[12] s. c. dutta roy and r. p. sah, "dual input distributed rc notch filter", ind. j. pure appl. phys., vol. 9, pp. 762-763, sept. 1971.
[13] s. c. dutta roy, s. b. jain and b. kumar, "design of digital notch filters", iee proc. – vision, image signal process., vol. 141, pp. 334-338, oct. 1994.
[14] b. kumar, s. b. jain and s. c. dutta roy, "on the design of fir notch filters", iete j. res., vol. 43, pp.
65-68, jan.-feb. 1997.
[15] s. b. jain, b. kumar and s. c. dutta roy, "design of fir notch filters by using bernstein polynomials", int. j. circuit theory applic., vol. 25, pp. 135-139, mar.-apr. 1997.
[16] s. c. dutta roy, s. b. jain and b. kumar, "design of digital fir notch filters from second order iir prototype", iete j. res., vol. 43, pp. 275-279, jul.-aug. 1997.
[17] s. b. jain, b. kumar and s. c. dutta roy, "semi-analytic method for the design of digital fir filters with specified notch frequency", signal process., vol. 59, pp. 235-241, 1997.
[18] y. v. joshi and s. c. dutta roy, "design of iir digital notch filters", circuits, syst. signal process., vol. 16, pp. 415-427, 1997.
[19] y. v. joshi and s. c. dutta roy, "design of lift notch filters with different passband gains", iee proceedings – vision, image signal process., vol. 147, pp. 11-19, feb. 1998.
[20] y. v. joshi and s. c. dutta roy, "design of iir multiple notch filters based on all-pass filters", ieee trans. circuits syst.-ii: trans. briefs, vol. 46, pp. 134-138, feb. 1999.
[21] s. c. dutta roy, b. kumar and s. b. jain, "fir notch filter design – a review" (invited paper), facta universitatis (nis) – series: electron. energ., vol. 14, pp. 295-327, dec. 2001.
[22] a. s. alkanhal, "compact bandstop filters with extended upper passbands", active and passive components, vol. 2008, doi: 10.1155/2008/356049.
[23] o. p. gupta and r. j. wenzel, "design tables for a class of optimum new bandstop filters", ieee trans. microw. theo. tech., vol. 18, pp. 402-404, july
[24] k. s. k. yeo and p. vijaykumar, "quasi-elliptic microstrip bandstop filters using tap coupled open loop resonators", prog. electromag. res., vol. 35, pp. 1-11, 2013.
[25] y. s. mezaal, h. t. eyyuboglu and j. k.
ali, "wide bandpass and narrow bandstop microstrip filters based on hilbert fractal geometry: design and simulation results", plos one, vol. 9, e115412, 2014.
[26] m. m. bait-sawailam, "miniaturized bandstop filters using slotted complementary networks", int. j. dig. inform. and wireless commun., vol. 4, pp. 401-407, mar. 2014.
[27] d. r. jachowski, "narrowband absorption bandstop filtres with multiple signal paths", us patent no. 7323955b2, pub: jan. 29 2008.
[28] w. m. fathelbab and m. b. steer, "design of bandstop filters utilising circuit prototypes", iee microw., ant. propag., vol. 1, pp. 523-526, march 2007.
[29] d. r. jachowski, "tunable lumped element notch filter with constant bandwidth", in proc. of the ieee int. wireless inf. technol. syst. conf., pp. 1-4, 2010.
[30] t. c. lee, j. lee, e. j. naglich and d. peroulis, "octave tunable lumped element notch filter with resonator q-independent zero reflection coefficient", in proc. of the ieee mtt-s int. digest, pp. 1-4, 2014.
[31] j. lee, t. lee and w. j. chappell, "lumped element realization of absorptive bandstop filter with anomalously high spectral isolation", ieee trans. microw. theo. tech., vol. 60, pp. 2424-2430, aug. 2012.
[32] j. s. chieh and j. rowland, "quasi-lumped element bridged-t absorptive bandstop filter", ieee microw. wireless compon. lett., vol. 26, pp. 264-266, apr. 2016.

instruction facta universitatis series: electronics and energetics vol. 27, no 3, september 2014, pp. 317-328 doi: 10.2298/fuee1403317s
rapid exploration of cost-performance tradeoffs using dominance effect during design of hardware accelerators
reza sedaghat 1, anirban sengupta 2
1 electrical and computer engineering, ryerson university, toronto, canada
2 computer science and engineering, indian institute of technology, indore, india
abstract. modern very large scale integration (vlsi) designs require a tradeoff between cost efficiency and performance (circuit speed).
furthermore, the design space exploration (dse) of the cost-performance tradeoffs for multi-objective vlsi designs should also be fast and efficient. this paper presents a novel accelerated dse approach for the exploration of cost-performance tradeoffs of modular multi-objective (tri-parametric, viz. cost, execution time and power consumption) vlsi hardware accelerators using hierarchical criterion analysis. the selection of the final design point is made after the tradeoffs are explored using the proposed approach. when applied to various benchmarks, the proposed approach yielded significant acceleration in the exploration process compared to existing approaches with a multi-parametric objective.
key words: hardware accelerator, rapid, exploration, performance, cost
1. introduction
the design space exploration process generally has to balance two conflicting requirements: a) accurately searching for the optimal design point in the huge design space; b) limiting the time taken (or the number of architectures analyzed) to evaluate the architecture design space in order to select the optimal design point. the second requirement is more significant for modern multi-objective heterogeneous vlsi systems because exhaustive exploration of the architecture space is prohibitive due to the massive size of the design space. the architecture exploration process is therefore a battle between the determination of the optimal architecture and the speed of the exploration process. furthermore, since present generation vlsi systems are multi-objective in nature, they demand efficient exploration approaches that can satisfy the multi-objective requisite by concurrently reducing the time spent in the architecture evaluation as well as maximizing the opportunity of automating the exploration methodology [1]-[7].
received january 27, 2014
corresponding author: reza sedaghat, electrical and computer engineering, ryerson university, toronto, canada (e-mail: rsedagha@ee.ryerson.ca)
2. related works
exploration has been a subject of research for almost two decades. many approaches have been proposed in the recent past for fast and efficient evaluation of the design architecture space. the evaluation of the architecture design space has been performed by implementing an architecture configuration graph (acg) based on the hierarchical criterion factor [8], [9]. after the creation of the acg, pareto optimal analysis is performed to find the optimal architecture. although the approach seems promising, its major drawback is the excessive time taken by the framework to build the architecture design space in order to analyze the variants. on the other hand, the authors in [10] use an evolutionary algorithm, the genetic algorithm (ga), for efficiently searching for the optimal solution. they propose a new encoding scheme to improve the efficiency of the ga search for design space exploration. using chromosome representation, the precedence relationships among the tasks in the input behavioral specification are encoded with a topological order-based representation to specify schedule priorities. the authors in [11] also use a ga based on binary encoding of the chromosome for efficient design space exploration. additionally, the authors in [12], [13] have developed a model that can assist designers at the system-level dse stage to explore the utilization of the reconfigurable resources and evaluate the relative impact of certain design choices.
all the above mentioned approaches mostly consider dual objective dse (such as area and delay), whereas the proposed approach considers multi-objective problems (such as cost, delay and power consumption). in addition to the above, a problem space genetic algorithm for design space exploration of data paths has been proposed in [14]. the authors have used the concept of a heuristic/problem pair to convert a data flow graph into a valid schedule. the chromosome is encoded based on the 'work remaining' value of each node. one of the problems with the approach in [14] is that the second special parent chromosome, built in correspondence with the minimum functional units (i.e. serial implementation), does not differ from the first special chromosome in the work remaining field. this may not always lead to the optimal solution. furthermore, the cost function considers only latency and not total execution time. the authors in [15] describe an approach to solve the dse problem which is based on ga and weighted sum particle swarm optimization (wspso). the authors use crossover between the parent and the local-best solution, and then between the parent and the global-best solution, to implement particle swarm optimization (pso) behavior. the authors do not consider velocity to update the position. moreover, in wspso the authors also do not consider user constraints for area and execution time in the cost function. in [16], the authors describe another approach for dse in high level systems based on binary encoding of the chromosomes. however, they consider only traditional latency and not the execution time constraint for data pipelining. the authors in [17] suggest that identification of a few superior design points from the pareto set is enough for an excellent design process. the work shown in [18] discusses the optimization of area, delay and power in behavioral synthesis but does not consider execution time during data pipelining.
the problem of design space exploration is also addressed in [19] by suggesting an order of efficiency, which assists in deciding preferences amongst the different pareto optimal points. the authors in [20] introduce a tool called systemcodesigner that offers rapid design space exploration with rapid prototyping of behavioral systemc models. in [21], evolutionary algorithms such as the genetic algorithm (ga) have been suggested to yield better results for the design space exploration process. an automated tool was developed by integrating behavioral synthesis into their design flow, while the authors in [22] describe current state-of-the-art high-level synthesis techniques for dynamically reconfigurable systems. additionally, the authors in [23]-[25] also use a genetic algorithm for scheduling and resource allocation for data path synthesis. another class of scheduling methods employed previously was probabilistic in nature. for example, the simulated annealing (sa) and simulated evolution (se) based scheduling techniques have been used for the high level synthesis problem. the authors in [26], [27] have proposed a simulated annealing scheduling method called 'salsa', which uses many probabilistic search operators to enhance the performance of the sa-based technique for high level synthesis problems. in addition, the authors have also proposed an extended binding model for handling the scheduling problem in high level synthesis. simulated evolution has been proposed by the authors in [28] to solve the combined problem of scheduling and resource allocation in high level synthesis. unfortunately, the approaches in [23]-[28] do not consider execution time, chaining and data pipelining.
the authors in [20], [29] proposed alternate approaches based on integer linear programming (ilp). although they are capable of providing good results, the computational complexity is massive and they therefore require an extensive amount of time. furthermore, the concept of data pipelining based on execution time was not shown during system trade-off. the work shown in [30] for dse suggests an evolutionary algorithm for successful evaluation of the design for an application specific soc. other well known tools for hls exist, such as gaut [31]. gaut takes a c/c++ behaviour description as input and automatically generates an rtl structure based on the compulsory constraints of throughput (or initiation interval) and clock period. in addition, the authors in [32] propose an open-source hls tool called legup for fpga-based processor/accelerator systems. legup is able to synthesize c language to hardware, thereby providing a nice platform for hls. different fpga architectures are supported by this tool, which allows new scheduling algorithms and parallel accelerators. moreover, roccc, proposed in [33], is an open-source hls tool for generating an rtl structure from c. it was designed for kernels that perform computation intensive tasks, such as most dsp applications. therefore, roccc applies to a specific class of applications (streaming-oriented applications) and is not a general c-to-hardware compiler, unlike legup [32].
3. the proposed framework behind design space exploration
3.1 the proposed framework for cost model
the model for the cost of the resources is proposed in this section and is an extension of the authors' previous work [3]-[5] on the area model. let the area of the resources be given as 'a'. ri denotes the resources available for system designing, where 1<=i<=n. 'rclk' refers to the clock oscillator used as a resource providing the necessary clock frequency to the system.
the total area can be represented as the sum of the areas of all the resources used for designing the system, such as adders, multipliers, dividers etc., and the clock frequency oscillator. the total area is shown in equations (1) and (2).

$$A = \sum A(R_i) \tag{1}$$

$$A = (N_{R1} K_{R1} + N_{R2} K_{R2} + \dots + N_{Rn} K_{Rn}) + A(R_{clk}) \tag{2}$$

where $N_{Ri}$ represents the number of resources $R_i$, and $K_{Ri}$ represents the area occupied per unit resource $R_i$. let the total cost of all resources in the system be $C_R$. further, the cost per area unit of the resources (such as adders, multipliers etc.) is given as $C_{Ri}$ and the cost per area unit of the clock oscillator is $C_{Rclk}$. therefore the total cost of the resources is given as:

$$C_R = (N_{R1} K_{R1} + N_{R2} K_{R2} + \dots + N_{Rn} K_{Rn}) \cdot C_{Ri} + A(R_{clk}) \cdot C_{Rclk} \tag{3}$$

taking the partial derivatives of equation (3) with respect to $N_{R1}, \dots, N_{Rn}$ and $A_{Rclk}$ yields equations (4) to (6) respectively, as shown below:

$$\frac{\partial C_R}{\partial N_{R1}} = \frac{\partial \left[ (N_{R1} K_{R1} + \dots + N_{Rn} K_{Rn}) \cdot C_{Ri} + A(R_{clk}) \cdot C_{Rclk} \right]}{\partial N_{R1}} = K_{R1} \cdot C_{Ri} \tag{4}$$

$$\frac{\partial C_R}{\partial N_{Rn}} = \frac{\partial \left[ (N_{R1} K_{R1} + \dots + N_{Rn} K_{Rn}) \cdot C_{Ri} + A(R_{clk}) \cdot C_{Rclk} \right]}{\partial N_{Rn}} = K_{Rn} \cdot C_{Ri} \tag{5}$$

$$\frac{\partial C_R}{\partial A_{Rclk}} = C_{Rclk} \tag{6}$$

according to the theory of approximation by differentials, the change in the total cost can be approximated by the following equation:

$$dC_R = \frac{\partial C_R}{\partial N_{R1}} \Delta N_{R1} + \dots + \frac{\partial C_R}{\partial N_{Rn}} \Delta N_{Rn} + \frac{\partial C_R}{\partial A_{Rclk}} \Delta A_{Rclk} \tag{7}$$

substituting equations (4) to (6) into equation (7) yields equation (8):

$$dC_R = \Delta N_{R1} \cdot K_{R1} \cdot C_{Ri} + \dots + \Delta N_{Rn} \cdot K_{Rn} \cdot C_{Ri} + \Delta A(R_{clk}) \cdot C_{Rclk} \tag{8}$$

in equation (8), the term $\Delta N_{Rn} \cdot K_{Rn} \cdot C_{Ri}$ is the change of cost contributed by resource $R_n$, and the term $\Delta A(R_{clk}) \cdot C_{Rclk}$ is the change of cost contributed by the clock resource. equation (8) represents the change in total cost of resources with a change in the number of all resources and the clock period (clock frequency). the pf for cost of resources is defined as follows:

$$PF(R1) = \frac{\Delta N_{R1} \cdot K_{R1} \cdot C_{Ri}}{N_{R1}} \tag{9}$$

$$PF(Rn) = \frac{\Delta N_{Rn} \cdot K_{Rn} \cdot C_{Ri}}{N_{Rn}} \tag{10}$$

$$PF(R_{clk}) = \frac{\Delta A(R_{clk}) \cdot C_{Rclk}}{N_{Rclk}} \tag{11}$$

equations (9) and (10) indicate the average deviation of cost with respect to a change in resources $R_1, \dots, R_n$. note: this average deviation of cost helps in finding the dominance effect of the corresponding resource types on cost. further, equation (11) indicates the change of cost of the system with respect to a change in resource $R_{clk}$ (i.e. the dominance effect of $R_{clk}$).

3.2 the framework used for execution time

this section introduces a new mathematical pf model for the clock oscillator resource, thus extending the authors' previous work [3]-[5] on the pf model of functional resources. the priority factor of the resources $R_1, \dots, R_n$ (such as adders, multipliers etc.) for the execution time is taken from [3]-[5], where it is defined as:

$$PF(Rn) = \frac{\Delta N_{Rn} \cdot T_{Rn}}{N_{Rn}} \cdot (T_p)^{max} \tag{12}$$

the pf model for the clock oscillator is defined as:

$$PF(R_{clk}) = \frac{T_{Rclk}^{max} - T_{Rclk}^{min}}{N_{Rclk}} \tag{13}$$

in equation (13), $T_{Rclk}^{max}$ and $T_{Rclk}^{min}$ are the maximum and minimum values of the execution time when all the available resources have their maximum value. the pf defined in equations (12) and (13) indicates the average change in execution time with a change in the number of a particular resource. this average deviation of execution time is used to find the dominance effect of the corresponding resource types on execution time.

3.3 the framework used for power consumption

the pf for power consumption is defined as:

$$PF(Rn) = \frac{\Delta N_{Rn} \cdot K_{Rn}}{N_{Rn}} \cdot (p_c)^{max} \tag{14}$$

$$PF(Rm) = \frac{\Delta N_{Rm} \cdot K_{Rm}}{N_{Rm}} \cdot (p_c)^{max} \tag{15}$$

$$PF(R_{clk}) = \frac{N_{R1} T_{R1} + N_{R2} T_{R2} + \dots + N_{Rn} T_{Rn}}{N_{Rclk}} \cdot \Delta(p_c) \tag{16}$$

similarly as explained above, the priority factors for power consumption defined in equations (14), (15) and (16) indicate the average change in the total power consumption of the system with the change in number of resources at maximum clock frequency. therefore, as discussed before, equations (14), (15) and (16) indicate the dominance effect of resource types $R_n$, $R_m$ and $R_{clk}$ on the power metric.

4. proposed demonstration

4.1 system specifications

the case study of a selected benchmark has been provided for demonstration of the proposed method based on multiple real system specifications (as shown in table 1). the function of the selected second order digital iir chebyshev filter benchmark is given in (17).

$$y(n) = 0.041\,x(n) + 0.082\,x(n-1) + 0.041\,x(n-2) - 0.6743\,y(n-2) + 1.4418\,y(n-1) \tag{17}$$

x(n), x(n-1) and x(n-2) are the input vector variables for the function. the previous outputs are given by y(n-1) and y(n-2), while the present output is y(n).

table 1 system specifications and constraints
1) maximum cost of resources: 1588 area units
2) maximum time of execution: 200 µs (for d = 1000 sets of data)
3) power consumption: minimum
4) maximum resources available for the system design:
   a) 3 adder/subtractor units
   b) 3 multiplier units
   c) 3 clock frequency oscillators: 24 mhz, 100 mhz and 400 mhz
5) no. of clock cycles needed for multiplier and adder/subtractor to finish each operation: 4 cc and 2 cc
6) area occupied by each adder/subtractor and multiplier: 12 area units (a.u.) and 65 a.u. on the chip (e.g. 12 clbs on fpga for adder/subtractor)
7) area occupied by the 24 mhz, 100 mhz and 400 mhz clock oscillator: 6 a.u., 10 a.u. and 14 a.u.
8) power consumed at 24 mhz, 100 mhz and 400 mhz: 10 mw/a.u., 32 mw/a.u. and 100 mw/a.u. respectively
9) cost per area unit resource ($C_{Ri}$) = 10 units and cost per area unit clock oscillator = 8 units

4.2 arrangement of the design space (consisting of resources) in increasing orders of magnitude in the form of architecture tree for cost model

this paper proposes the use of a hierarchical tree topology for arrangement of design points in sorted order and exploration of the optimal design point. unlike the authors' previous works [3]-[5], which used a vector design space, this approach uses a more convenient topology for exploration. the tree structure is easy to construct and does not require a special algorithm to order the design space in increasing/decreasing order. the pf of the different resources for the cost model is given in the equations below:

$$PF(R1) = \frac{\Delta N_{R1} \cdot K_{R1} \cdot C_{Ri}}{N_{R1}} = \frac{(3-1) \cdot 12 \cdot 10}{3} = 80 \tag{18}$$

$$PF(R2) = \frac{\Delta N_{R2} \cdot K_{R2} \cdot C_{Ri}}{N_{R2}} = \frac{(3-1) \cdot 65 \cdot 10}{3} = 433.33 \tag{19}$$

$$PF(R_{clk}) = \frac{\Delta A(R_{clk}) \cdot C_{Rclk}}{N_{Rclk}} = \frac{(14-6) \cdot 8}{3} = 21.33 \tag{20}$$

based on the pf calculated for the cost model, the architecture tree for cost can be constructed. the tree is constructed in such a way that the resource with the highest pf is assigned level (l1) in the tree, followed by level (l2) being assigned to the resource with the next highest pf, and finally the last level being assigned to the resource with the lowest pf. the resource with the highest pf influences the cost of the system the most, and the resource with the lowest pf influences it the least. after assigning the levels, the architecture tree comprising the design space is automatically arranged in increasing order of magnitude for the cost model. the architecture tree for the cost model is shown in fig. 1. after the design space is sorted in increasing order of magnitude, searching is applied on the design space.
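The priority factors of equations (18) to (20) and the resulting level assignment can be checked with a short script. The following is a minimal sketch, not the authors' implementation: the numeric values come from table 1, while the function names and the enumeration order (the highest-PF resource varies slowest, as the outermost loop) are assumptions made for illustration. The enumeration is not claimed to reproduce the variant numbering of fig. 1.

```python
from itertools import product

# Values from table 1 of the case study.
AREA = {"adder": 12, "multiplier": 65}    # K_Ri, area units per resource
UNITS = {"adder": 3, "multiplier": 3}     # N_Ri, maximum number of units
CLOCK_AREA = (6, 10, 14)                  # A(Rclk) for 24, 100 and 400 MHz
C_RI = 10                                 # cost per area unit of a resource
C_RCLK = 8                                # cost per area unit of the clock


def pf_resource(name):
    """Equations (9)/(10): PF = dN * K * C_Ri / N, with dN = N - 1."""
    n = UNITS[name]
    return (n - 1) * AREA[name] * C_RI / n


def pf_clock():
    """Equation (11): PF = dA(Rclk) * C_Rclk / N_Rclk."""
    return (max(CLOCK_AREA) - min(CLOCK_AREA)) * C_RCLK / len(CLOCK_AREA)


def total_cost(point):
    """Equation (3): resource area times C_Ri plus clock area times C_Rclk."""
    resource_area = sum(point[name] * AREA[name] for name in AREA)
    return resource_area * C_RI + point["clock"] * C_RCLK


pf = {
    "adder": pf_resource("adder"),            # 80.0
    "multiplier": pf_resource("multiplier"),  # about 433.33
    "clock": pf_clock(),                      # about 21.33
}

# Level l1 gets the highest PF, the last level the lowest PF.
levels = sorted(pf, key=pf.get, reverse=True)

# Assumed enumeration: the level-l1 resource is the slowest-varying index.
choices = {
    "adder": range(1, UNITS["adder"] + 1),
    "multiplier": range(1, UNITS["multiplier"] + 1),
    "clock": CLOCK_AREA,
}
design_points = [
    dict(zip(levels, combo))
    for combo in product(*(choices[name] for name in levels))
]
costs = [total_cost(p) for p in design_points]
```

With this ordering the 27 cost values come out non-decreasing without any explicit sorting step, which is the property the tree arrangement relies on.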
a mixed searching approach is proposed in this work by combining the advantages of two well known searching algorithms, viz. interpolation search and binary search. previous works [3]-[5] employed a mono binary searching procedure. however, as highlighted in fig. 1, a mixed searching approach is proposed to further enhance the speed of the exploration process. interpolation search is used with the cost model in order to search for the border variant for cost, while for the execution time model binary search is used to find the border variant. interpolation search performs faster than binary search in cases of uniformly sorted models, such as the design space for cost (cost is an increasingly linear function of the number of resources, i.e. the cost of the system increases with an increase in the number of resources). on the other hand, binary search exploits the 'divide and conquer' approach. hence, it works faster on nonuniformly linear sorted models, such as execution time (execution time being a nonuniformly decreasing linear function of the number of resources, i.e. an increase in the number of resources does not always decrease the execution time, which may remain the same). therefore, applying interpolation search on the sorted design space for cost, shown in fig. 1, yields the border variant in just 2 comparisons (cost is calculated according to eqn. (3)). the border variant for cost is the last variant in the design space (in fig. 1) which satisfies the constraint specified for cost. the border variant obtained for cost is 'v11'.

fig. 1 architecture tree representing the design space for cost arranged in increasing order

4.3 arrangement of the design space in decreasing orders of magnitude in the form of architecture tree for execution time model

the pf of the different resources used in system design for the execution time model is given in the equations below:

$$PF(R1) = \frac{\Delta N_{R1} \cdot T_{R1}}{N_{R1}} \cdot (T_p)^{max} = \frac{(3-1) \cdot 2}{3} \cdot (0.0416) = 0.055 \tag{21}$$

$$PF(R2) = \frac{\Delta N_{R2} \cdot T_{R2}}{N_{R2}} \cdot (T_p)^{max} = \frac{(3-1) \cdot 4}{3} \cdot (0.0416) = 0.111 \tag{22}$$

$$PF(R_{clk}) = \frac{T_{Rclk}^{max} - T_{Rclk}^{min}}{N_{Rclk}} = \frac{333.5 - 20.01}{3} = 104.50 \tag{23}$$

similarly, as described in section 4.2, the architecture tree for execution time is constructed based on the pf calculated for execution time. thus, the architecture tree obtained after construction is now also automatically arranged (sorted) in decreasing order of magnitude. after arrangement, binary search is applied in order to find the border variant for execution time (execution time is calculated according to the model of execution time shown in [4]). the border variant for execution time is the first variant in the design space which satisfies the constraint specified for execution time. the border variant obtained is variant 'v5'. after the border variants for both cost and execution time are found, the pareto optimal set is derived as explained in [3]-[5]. the architecture tree for power consumption is constructed similarly, in increasing order of magnitude for power consumption. among the variants of the pareto set, the one which appears first in the ascending sorted design space (in the tree) is the one with the minimum power consumption. it concurrently satisfies the constraints for cost, execution time and power consumption (specified in table 1) for the design problem. therefore the optimal variant obtained, which satisfies all the specified constraints, is variant 'v5' (marked bold red in fig. 1).

5. analysis and results

the results of the proposed approach using pf and the mixed searching scheme for rapid exploration of cost-performance tradeoffs are verified for a number of benchmarks. compared to the authors' previous works [3]-[5], the proposed approach is capable of further speeding up the exploration process. the search for the border architecture in the case of execution time (using binary search) requires only

$$\log_2 \prod_{i=1}^{n} v_{Ri}$$

architecture evaluations, where $n$ is the number of types of resources and $v_{Ri}$ is the number of variants of resource $R_i$. the search for the border architecture (using interpolation search) for the cost parameter requires

$$\log_2 \log_2 \prod_{i=1}^{n} v_{Ri}$$

evaluations. in the design space exploration approach presented here, three objective parameters have been used; execution time and cost are the parametric constraints and power consumption is the optimization parameter. the total number of architecture evaluations performed during searching using the proposed method is given as:

$$\log_2 \log_2 \prod_{i=1}^{n} v_{Ri} + \log_2 \prod_{i=1}^{n} v_{Ri}$$

when applied on various benchmarks, the proposed approach showed a substantial speedup compared to the exhaustive approach. the proposed method was also compared with a current approach in [8], [9]. the acceleration obtained compared to [8], [9], for both small and large size benchmarks, is shown in tables 2 and 3 respectively. moreover, the proposed approach has also been compared with a heuristic approach (wspso) [15]. as evident from tables 4 and 5, the proposed approach performs fewer architecture evaluations than [15] for both small and large benchmarks respectively. for example, in the case of mpeg mmv (shown in table 5) the proposed approach performs only 14 evaluations, while [15] performs 53 evaluations to find a final solution.
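The border-variant search on a sorted design space can be sketched as follows. This is an illustrative sketch only, not the authors' code: `find_border` and its arguments are hypothetical names, the list of values is made-up uniformly spaced data rather than costs from the case study, and an "evaluation" is counted as each distinct design point whose value is read. The function returns the last index whose value satisfies the limit, using either binary probing or interpolation probing.

```python
def find_border(values, limit, use_interpolation):
    """Return (index of last value <= limit, number of distinct evaluations).

    `values` must be sorted in non-decreasing order. The index is -1 when
    no value satisfies the limit.
    """
    seen = {}

    def evaluate(i):
        # Each design point is evaluated (and counted) at most once.
        if i not in seen:
            seen[i] = values[i]
        return seen[i]

    lo, hi, border = 0, len(values) - 1, -1
    while lo <= hi:
        if use_interpolation and evaluate(hi) > evaluate(lo):
            # Estimate the position of the limit assuming a linear trend.
            frac = (limit - evaluate(lo)) / (evaluate(hi) - evaluate(lo))
            probe = lo + int(frac * (hi - lo))
            probe = min(max(probe, lo), hi)   # keep the probe inside [lo, hi]
        else:
            probe = (lo + hi) // 2            # plain binary probing
        if evaluate(probe) <= limit:
            border = probe                    # candidate border, look further right
            lo = probe + 1
        else:
            hi = probe - 1
    return border, len(seen)


# Hypothetical, uniformly spaced "cost" values (not data from the paper).
sample = list(range(100, 1000, 10))
binary_border, binary_evals = find_border(sample, 555, use_interpolation=False)
interp_border, interp_evals = find_border(sample, 555, use_interpolation=True)
```

On uniformly spaced data such as this sample, the interpolation probe lands next to the border almost immediately, which is the behaviour the text attributes to the cost model; on irregular data the binary variant gives the more predictable evaluation count.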
table 2 experimental results of comparison between proposed dse approach with the current approach [8], [9] for small benchmarks (architecture evaluation = number of variants analyzed)

| benchmarks [2],[34],[35] | total possible architectures in the design space for one parameter | proposed approach: cost | proposed approach: execution time | proposed approach: total | approach [8],[9] | percentage speedup of proposed approach compared to [8],[9] |
|---|---|---|---|---|---|---|
| iir chebyshev filter | 27 | 4 | 6 | 10 | 18 | 44.44 % |
| mesa horner | 36 | 5 | 6 | 11 | 19 | 42.10 % |
| elliptic wave filter | 78 | 5 | 7 | 12 | 19 | 36.84 % |
| differential equation solver (hal) | 90 | 5 | 7 | 12 | 19 | 47.82 % |
| bpf | 100 | 5 | 8 | 13 | 21 | 38.09 % |

average speedup using proposed approach compared to [8],[9]: 41.85 %

table 3 experimental results of comparison between proposed dse approach with the current approach [8], [9] for large benchmarks (architecture evaluation = number of variants analyzed)

| benchmarks [2],[34],[35] | total possible architectures in the design space for two parameters | proposed approach: cost | proposed approach: execution time | proposed approach: total | approach [8],[9] | percentage speedup of proposed approach compared to [8],[9] |
|---|---|---|---|---|---|---|
| auto regressive filter | 144 | 5 | 8 | 13 | 21 | 38.09 % |
| mpeg mmv | 200 | 5 | 9 | 14 | 23 | 39.13 % |
| matrix multiplication | 400 | 6 | 10 | 16 | 25 | 36 % |
| jpeg_idct | 900 | 6 | 11 | 17 | 27 | 37.03 % |

average speedup using proposed approach compared to [8],[9]: 37.56 %

table 4 experimental results of comparison between proposed dse approach and the current approach [15] for small benchmarks (architecture evaluation = number of variants analyzed)

| benchmarks [2],[34],[35] | total possible architectures in the design space for one parameter | proposed approach: cost | proposed approach: execution time | proposed approach: total | approach [15] | percentage speedup of proposed approach compared to [15] |
|---|---|---|---|---|---|---|
| iir chebyshev filter | 27 | 4 | 6 | 10 | 17 | 41 % |
| mesa horner | 36 | 5 | 6 | 11 | 21 | 47 % |
| elliptic wave filter | 78 | 5 | 7 | 12 | 31 | 61 % |
| differential equation solver (hal) | 90 | 5 | 7 | 12 | 32 | 62.5 % |
| bpf | 100 | 5 | 8 | 13 | 35 | 62 % |

average speedup using proposed approach compared to [15]: 48.7 %

table 5 experimental results of comparison between proposed dse approach and the current approach [15] for large benchmarks (architecture evaluation = number of variants analyzed)

| benchmarks [2],[34],[35] | total possible architectures in the design space for two parameters | proposed approach: cost | proposed approach: execution time | proposed approach: total | approach [15] | percentage speedup of proposed approach compared to [15] |
|---|---|---|---|---|---|---|
| auto regressive filter | 144 | 5 | 8 | 13 | 52 | 75 % |
| mpeg mmv | 200 | 5 | 9 | 14 | 53 | 73.5 % |
| matrix multiplication | 400 | 6 | 10 | 16 | 65 | 75.3 % |
| jpeg_idct | 900 | 6 | 11 | 17 | 72 | 76.3 % |

average speedup using proposed approach compared to [15]: 75 %

6. conclusions

this paper presented a novel framework for rapid exploration of the cost-performance tradeoffs for modular multi-objective hardware accelerators. once the design space for cost-performance is explored, the final design point with minimum power consumption is searched for in the small pareto optimal set obtained. for different benchmarks, the proposed dse approach yielded superior results in terms of the acceleration obtained compared to the existing approaches.
acknowledgement: this work is supported by the optimization and algorithm research lab (opral), ryerson university, canadian microelectronics corporation (cmc), motorola, nserc crsng, ontario innovation trust and sun microsystems. additionally, this work acknowledges the assistance provided by science and engineering research board (serb), department of science and technology, govt. of india.

references

[1] g. de micheli, "synthesis and optimization of digital circuits". mcgraw-hill: new york, 1994.
[2] saraju p. mohanty, nagarajan ranganathan, elias kougianos and priyadarsan patra, "low-power high-level synthesis for nanoscale cmos circuits", chapter: high-level synthesis fundamentals, springer us, 2008.
[3] anirban sengupta, reza sedaghat, zhipeng zeng, "a high level synthesis design flow with a novel approach for efficient design space exploration in case of multi parametric optimization objective", microelectronics reliability, science direct, elsevier, volume 50, issue 3, march 2010, pp. 424-437.
[4] zhipeng zeng, reza sedaghat, anirban sengupta, "a framework for fast design space exploration using fuzzy search for vlsi computing architectures", accepted to appear in the proceedings of 2010 ieee international symposium on circuits and systems (iscas), june 2, 2010.
[5] anirban sengupta, reza sedaghat, zhipeng zeng, "rapid design space exploration for multi parametric optimization of vlsi designs", proceedings of 2010 ieee international symposium on circuits and systems (iscas), june 2, 2010, paris, france, article # 2016 (session: logic & high-level synthesis, c2l-f).
[6] anirban sengupta, reza sedaghat, zhipeng zeng, "hardware efficient design of speed optimized power stringent application specific processor", proceedings of ieee 21st international conference on microelectronics (icm), morocco, december 22, 2009, pp. 167-170.
[7] d. gajski, n. dutt, a. wu, and s. lin, "high level synthesis: introduction to chip and system design". kluwer: norwell, ma, 1992.
[8] kirischian, l., geurkov, v., kirischian, v. and terterian, i., "multi-parametric optimisation of the modular computer architecture", int. j. technology, policy and management, vol. 6, no. 3, 2006, pp. 327–346.
[9] kirischian, l., "optimization of parallel task execution on the adaptive reconfigurable group organized computing system", proc. of international conference parelec 2000, canada, pp. 150–154.
[10] vyas krishnan and srinivas katkoori, "a genetic algorithm for the design space exploration of datapaths during high-level synthesis", ieee transactions on evolutionary computation, vol. 10, no. 3, june 2006, pp. 229-313.
[11] e. torbey and j. knight, "performing scheduling and storage optimization simultaneously using genetic algorithms," in proc. ieee midwest symp. circuits systems, 1998, pp. 284–287.
[12] giuseppe ascia, vincenzo catania, alessandro g. di nuovo, maurizio palesi, davide patti, "efficient design space exploration for application specific systems-on-a-chip", journal of systems architecture 53 (2007), pp. 733–750.
[13] c. h. gebotys and m. i. elmasry, "global optimization approach for architectural synthesis," ieee trans. comput.-aided des., vol. 12, 1993, pp. 1266–1278.
[14] m. k. dhodhi, f. h. hielscher, r. h. storer, and j. bhasker, "datapath synthesis using a problem-space genetic algorithm," in ieee trans. comput.-aided des., vol. 14, 1995, pp. 934–944.
[15] harish ram d. s., m. c. bhuvaneswari, and shanthi s. prabhu, (2012) "a novel framework for applying multiobjective ga and pso based approaches for simultaneous area, delay, and power optimization in high level synthesis of datapaths", vlsi design, hindawi, article id 273276, 12 pages.
[16] e. torbey and j. knight, "high-level synthesis of digital circuits using genetic algorithms," in proc. int. conf. evol. comput., may 1998, pp. 224–229.
[17] alessandro g. di nuovo, maurizio palesi, davide patti, "fuzzy decision making in embedded system design," proceedings of the 4th international conference on hardware/software codesign and system synthesis, october 2006, pp. 223-228.
[18] a. c. williams, a. d. brown and m. zwolinski, "simultaneous optimisation of dynamic power, area and delay in behavioural synthesis", iee proc.-comput. digit. tech., vol. 147, no. 6, 2000, pp. 383-390.
[19] i. das, "a preference ordering among various pareto optimal alternatives", structural and multidisciplinary optimization, 18(1), aug. 1999, pp. 30–35.
[20] christian haubelt, thomas schlichter, joachim keinert, mike meredith, "systemcodesigner: automatic design space exploration and rapid prototyping from behavioral models", proceedings of the 45th annual acm ieee design automation conference, 2008, pp. 580-585.
[21] j. c. gallagher, s. vigraham, and g. kramer, "a family of compact genetic algorithms for intrinsic evolvable hardware," ieee trans. evolutionary computation, vol. 8, no. 2, apr. 2004, pp. 111–126.
[22] xuejie zhang and kam w. ng, "a review of high-level synthesis for dynamically reconfigurable fpgas", microprocessors and microsystems, elsevier, volume 24, issue 4, 2000, pp. 199-211.
[23] r. m. san and j. p. knight, "genetic algorithms for optimization of integrated circuit synthesis," in proc. 5th int. conf. genetic algorithms, san mateo, ca, 1993, pp. 432–438.
[24] r. j. cloutier and d. e. thomas, "the combination of scheduling, allocation and mapping in a single algorithm," in proc. 27th design automation conf., jun. 1990, pp. 71–76.
[25] n. wehn et al., "a novel scheduling and allocation approach to datapath synthesis based on genetic paradigms," in proc. ifip working conf. logic architecture synthesis, 1991, pp. 47–56.
[26] g. krishnamoorthy and j. a. nestor, "data path allocation using extended binding model," in proc. 32nd acm/ieee design automation conf., 1992, pp. 279–284.
[27] j. a. nestor and g. krishnamoorthy, "salsa: a new approach to scheduling with timing constraints," ieee trans. comput.-aided des., vol. 12, 1993, pp. 1107–1122.
[28] t. a. ly and j. t. mowchenko, "applying simulated evolution to high level synthesis," ieee trans. comput.-aided des., vol. 12, no. 2, feb. 1993, pp. 389–409.
[29] c. t. hwang, j. h. lee, y. c. hsu, and y. l. lin, "a formal approach to the scheduling problem in high-level synthesis," ieee trans. comput.-aided des., vol. 10, no. 2, feb. 1991, pp. 464–475.
[30] giuseppe ascia, vincenzo catania, alessandro g. di nuovo, maurizio palesi, davide patti, "efficient design space exploration for application specific systems-on-a-chip", journal of systems architecture 53, 2007, pp. 733–750.
[31] p. coussy, c. chavet, p. bomel et al., "gaut: a high-level synthesis tool for dsp applications", in high-level synthesis: from algorithm to digital circuits, springer, 2008, pp. 147-169.
[32] canis, a., choi, j., aldham, m., zhang, v., kammoona, a., czajkowski, t., brown, s. d., and anderson, j. h. 2013. "legup: an open-source high-level synthesis tool for fpga-based processor/accelerator systems". acm trans. embedd. comput. syst. 13, 2, article 24 (september 2013), 27 pages.
[33] villarreal, j., park, a., najjar, w., and halstead, r. 2010. "designing modular hardware accelerators in c with roccc 2.0". in proceedings of the ieee international symposium on field-programmable custom computing machines, 2010, pp. 127–134.
[34] http://www.cbl.ncsu.edu/benchmarks/
[35] http://express.ece.ucsb.edu/benchmark/

facta universitatis series: electronics and energetics vol. 30, n o 2, june 2017, pp. 223 234 doi: 10.2298/fuee1702223s

e-plane waveguide bandstop filter with double-sided printed-circuit insert

snežana stefanovski pajović¹, milka potrebić¹, dejan tošić¹, zoran stamenković²
¹school of electrical engineering, university of belgrade, serbia
²ihp, frankfurt (oder), germany

abstract.
in this paper a novel design of an e-plane bandstop waveguide filter with a double-sided printed-circuit insert is presented. split-ring resonators are used as the resonating elements to obtain the bandstop response. the amplitude response of the waveguide resonator with a single resonating element on the insert is analyzed for various dimensions and positions of the split-ring resonator. the coupling between two resonators on the insert, in terms of their mutual distance, is considered as the next step in the filter design. various positions of the resonators are considered, including the case with the resonators on different sides of the insert, which is of interest for the proposed filter design. finally, a third-order bandstop filter with a double-sided printed-circuit insert, operating in the x-frequency band, is introduced. the filter response is analyzed for various distances between the resonators and for various positions of the resonator printed on the other side of the insert. the proposed filter design is simple, providing for accurate fabrication, miniaturization and the possibility of obtaining a multi-band response relatively easily, using resonators with different resonant frequencies on the different sides of the insert.

key words: e-plane waveguide filter, bandstop filter, split-ring resonator, double-sided printed-circuit insert

1. introduction

waveguide filters are widely used components for communication systems operating with high-power signals. they are characterized as passive components with high quality factors and low losses [1]. for example, microwave waveguide filters are elements of various satellite and radar systems, either as bandpass or bandstop filters. e-plane waveguide filters, considered in this paper, are relatively simple to design, fabricate and measure.
however, in spite of the simple design, there are many possibilities to implement a single e-plane insert using different resonating elements.

received june 25, 2016; received in revised form october 3, 2016
corresponding author: milka potrebić, school of electrical engineering, bulevar kralja aleksandra 73, 11120 belgrade, serbia (e-mail: milka.p@mts.rs)

various implementations, for different frequency bands, can be found in the available open literature, thus confirming the great interest in waveguide filters among researchers in the area of microwave filter design. there are various solutions for the bandpass filter design, using simple or complex resonating elements on the insert. a bandpass filter with a ladder-type pattern on the substrate, for ka-band operation, can be found in [2]. an example of a bandpass filter with a t-shaped resonator operating in the x-band is introduced in [3]. furthermore, rectangular ring resonators (rrrs) are used for the ka-band bandpass filter design in [4], while the combination of c-shaped and central-folded stripline resonators (cfsrs) for the waveguide filter design is introduced in [5]. for the bandstop filters, solutions with split-ring resonators (srrs), quarter-wave resonators (qwrs) and other types of simple resonators can be found. bandstop filters using srrs with a single rejection band are proposed in [6]-[9], while multiple rejection bands are obtained in [10]-[11]. in [12], the authors have exemplified the use of an srr array for the waveguide filter design. folded srrs are used for the third-order ka-band bandstop filter in [13]. in [14], the possibility of obtaining bandpass and bandstop filter responses using srrs with microstrip structures is explained and illustrated. a second-order bandstop filter with qwrs, combined with an srr as a coupling element, is introduced in [15]. a dual-band e-plane bandstop filter with qwrs is proposed in [16].
both latter filters are designed to operate in the x-band. simple rectangular resonating slots are used for single-band and dual-band filter design in [17]. for the e-plane filters with multiple resonating elements on the insert it is important to properly couple them, as explained in [18]. the goal of our research is to design a novel e-plane filter using srrs. therefore, we propose a bandstop waveguide filter with a double-sided printed-circuit insert, using srrs with optimized parameters as the resonating elements, in order to obtain the bandstop response in the x-frequency band (f0 = 10 ghz). according to the available open literature, waveguide filter design with double-sided printed inserts is still not widespread. so far, several solutions for the waveguide structures with double-sided printed-circuit inserts have been introduced. in [19], the operation of the x-band rectangular waveguide with double-sided single ring resonator array is analyzed in the frequency range 2-10 ghz, in order to investigate the characteristics of metamaterials in the considered waveguides. furthermore, bandpass filters using various types of resonators (rrrs, c-shaped resonators and csfrs), printed on different sides of the insert, are proposed in [20], for the w-band, and in [21], for the ka-band. the bandstop waveguide filter realization, using double-sided printed-circuit insert with srrs, for the x-band operation, as considered here, represents a novel solution. the following steps are carried out to achieve the targeted filter design. the amplitude response of a waveguide resonator using single srr is analyzed in terms of the dimensions and the position of the srr. furthermore, the coupling between two srrs on the same insert is considered in terms of their mutual distance. various positions of the srrs are observed, taking into account the possibility to have srrs on different sides of the insert, as well. 
finally, a novel third-order bandstop filter with a double-sided printed-circuit insert is introduced. the filter response is analyzed in terms of the mutual distance between the srrs and the position of the srrs printed on the different sides of the insert. wipl-d software [22] is used to make three-dimensional electromagnetic (3d em) models of the considered structures and to perform 3d em full-wave simulations. the advantage of the proposed design is simple and more accurate fabrication when the distance between the resonators on the insert is critical. also, the novel design provides the possibility to have so-called "overlapped" resonators, meaning that the srr on the other side of the insert does not necessarily have to be positioned between the other srrs, but may partly overlap with them. such a design contributes to the compactness of the structure, thereby meeting demanding miniaturization requirements. another important aspect of the proposed design is the possibility of obtaining a multi-band filter response relatively easily, by having resonators with different resonant frequencies on the different sides of the insert.

2. waveguide resonator using e-plane insert with srr

the amplitude response of the waveguide resonator using an e-plane insert with a single srr (figure 1a) is analyzed in terms of the parameters of the srr and its position. the waveguide resonator and filter considered in this paper are designed using the standard rectangular waveguide wr-90 (width a = 22.86 mm, height b = 10.16 mm). they are excited by properly designed ports with probes (monopoles), placed at a distance of λg/4 from the short-circuited end of the port (λg – guided wavelength in the waveguide). the te10 mode of propagation is observed. the printed-circuit insert is modeled using copper clad ptfe/woven glass laminate tlx-8 (εr = 2.55, tanδ = 0.0019, h = 1.143 mm, t = 18 μm).
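For orientation, the guided wavelength λg that sets the probe distance can be estimated from the textbook TE10 relations for a rectangular waveguide. The sketch below assumes an air-filled WR-90 waveguide and evaluates λg at 10 GHz; both the evaluation frequency and the function names are assumptions made here for illustration and are not stated as such in the paper.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def te10_cutoff_hz(width_m):
    """Cutoff frequency of the TE10 mode: fc = c / (2a)."""
    return C0 / (2.0 * width_m)


def guided_wavelength_m(freq_hz, width_m):
    """Guided wavelength of the TE10 mode in an air-filled waveguide."""
    fc = te10_cutoff_hz(width_m)
    if freq_hz <= fc:
        raise ValueError("frequency is at or below the TE10 cutoff")
    free_space_wavelength = C0 / freq_hz
    return free_space_wavelength / math.sqrt(1.0 - (fc / freq_hz) ** 2)


WR90_WIDTH_M = 22.86e-3                              # width a of WR-90
cutoff_ghz = te10_cutoff_hz(WR90_WIDTH_M) / 1e9      # about 6.56 GHz
lambda_g_mm = guided_wavelength_m(10e9, WR90_WIDTH_M) * 1e3   # about 39.7 mm
quarter_wave_mm = lambda_g_mm / 4.0                  # about 9.9 mm
```

Under these assumptions the quarter-wave probe distance comes out at roughly 9.9 mm.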
dimensions of the e-plane insert are apl = 22.86 mm, bpl = 10.16 mm. according to figure 1a, the parameters used for the srr centrally positioned on the insert are given in table 1. the obtained amplitude response is shown in figure 1b (f0 = 10 ghz, b3db = 193 mhz). the amplitude response is analyzed in terms of the dimensions of the srr and its position. the obtained results are presented in figure 2 and table 2.

fig. 1 waveguide resonator using e-plane insert with a single srr: (a) 3d model, (b) amplitude response

table 1 dimensions of the srr in figure 1a

| dimension | d1 | d2 | c | p | l |
|---|---|---|---|---|---|
| value [mm] | 2.76 | 2.5 | 0.4 | 0.6 | 3.43 |

fig. 2 comparison of amplitude responses: (a) d1 varies, (b) c varies, (c) p varies, (d) l varies

table 2 comparison of amplitude responses for single srr

| varied parameter (fixed parameters) | value [mm] | f0 [ghz] | b3db [mhz] |
|---|---|---|---|
| d1 (d2 = 2.5 mm, c = 0.4 mm, p = 0.6 mm, l = 3.43 mm) | 2.6 | 10.276 | 190 |
| | 2.8 | 9.945 | 194 |
| | 3.0 | 9.641 | 192 |
| c (d1 = 2.76 mm, d2 = 2.5 mm, p = 0.6 mm) | 0.2 | 10.748 | 174 |
| | 0.4 | 10.009 | 193 |
| | 0.6 | 9.408 | 209 |
| p (d1 = 2.76 mm, d2 = 2.5 mm, c = 0.4 mm, l = 3.43 mm) | 0.4 | 9.750 | 193 |
| | 0.6 | 10.009 | 193 |
| | 0.8 | 10.238 | 193 |
| l (d1 = 2.76 mm, d2 = 2.5 mm, c = 0.4 mm, p = 0.6 mm) | 2.43 | 10.115 | 183 |
| | 3.43 | 10.009 | 193 |
| | 4.43 | 9.944 | 178 |

variation of the resonator length (d1) primarily influences the resonant frequency (a longer printed resonator provides a lower resonant frequency), while the 3-db bandwidth practically does not change. similarly, an increase of the gap width (p) moves the resonant frequency toward higher values, but the bandwidth remains the same. however, a change of the width of the printed trace (c) influences both the resonant frequency and the bandwidth: by increasing c, f0 decreases while the band becomes wider, and vice versa.
it should be noticed that the change of the trace width c causes small change of l, in order to have centrally positioned srr regardless of its dimensions. furthermore, by moving the resonator up and down from its central position on the insert, both resonant frequency and bandwidth change. these results are important for optimization of the parameters and positions of the srrs used for the filter design, in order to obtain desired amplitude response.

3. coupling between two srrs differently positioned on e-plane insert

coupling between two srrs on the same e-plane insert, depending on their mutual distance, is analyzed for several different cases. namely, possible solutions assume various orientations of the srrs in terms of gap position, and also various positions of the srrs, i.e. both srrs can be on the same or different side of the insert. in order to be able to calculate the value of the coupling coefficient, the amplitude characteristic s21 [db] is observed when the resonators are practically decoupled from the ports, meaning that the excitation is weakened, as proposed in [23]. this is achieved by adding metal plates (s = 8 mm), on both ends of the insert, toward the ports (figure 3a). therefore, two characteristic frequencies (f1 and f2), denoting local maxima of the s21 characteristic (figure 3b), are obtained and used for the coupling coefficient k calculation, according to the following formula [24]:

$$k = \frac{f_2^2 - f_1^2}{f_2^2 + f_1^2}. \qquad (1)$$

fig. 3 method of determining coupling coefficient: (a) 3d model with additional metal plates, (b) s21 characteristic with two local maxima

the inserts with two srrs considered for the coupling analysis are shown in figure 4. the srrs depicted using dashed lines are printed on the other side of the insert. both srrs have the same dimensions, given in table 1. 
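the coupling-coefficient formula (1) of this section is simple to evaluate once the two local maxima of the s21 characteristic have been read off. the following is a minimal sketch; the function name and the two peak frequencies are hypothetical values chosen only for illustration and are not results from the paper.

```python
def coupling_coefficient(f1, f2):
    """Coupling coefficient k = (f2^2 - f1^2) / (f2^2 + f1^2) from two S21 peak frequencies.

    The two frequencies may be given in any order, as long as they share the same unit.
    """
    if f1 <= 0 or f2 <= 0:
        raise ValueError("peak frequencies must be positive")
    lower, upper = sorted((f1, f2))
    return (upper ** 2 - lower ** 2) / (upper ** 2 + lower ** 2)


# hypothetical S21 peaks at 9.8 GHz and 10.2 GHz (illustrative values only)
k = coupling_coefficient(9.8, 10.2)
print(round(k, 3))  # 0.04
```

a wider split between the two peaks gives a larger k, which is how the curves of k versus distance d are built point by point.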
however, for cases 1 and 2, l1 = l2 = 3.43 mm, and for case 3 l1 = 3.43 mm and l2 = 3.30 mm. figure 5 shows coupling coefficient k as a function of the distance d between the srrs. for all considered cases, coupling gets weaker (i.e. k decreases) by increasing the distance d. for cases 1 and 2, coupling between resonators is stronger compared to case 3, so it is analyzed for wider range of values of the distance d. it can be noticed that the coupling is pretty much the same for cases 1a and 1b, meaning there is no significant difference whether the srrs are printed on the same or different sides of the insert. however, for d ≤ 2 mm, there is significant difference between values of the coupling coefficient obtained for case 2a and 2b. the same stands for case 3a and 3b, for d ≤ 1 mm. also, when both srrs are on the same side of the insert, the strongest coupling is obtained for case 2. on the other hand, when the srrs are printed on the different sides of the insert, cases 1 and 2 provide stronger coupling, compared to case 3, for the same distance d. 228 s. stefanovski pajović, m. potrebić, d. tošić, z. stamenković (a) (b) (c) fig. 4 srr inserts used for coupling analysis: (a) case 1, (b) case 2, (c) case 3 4. e-plane bandstop filter with double-sided printed-circuit insert using srrs based on the aforementioned results, a third-order bandstop filter is developed using double-sided printed-circuit insert with srrs. two srrs are printed on the same side of the insert, and the third one (central srr) is printed on the other side (depicted using dashed line in figure 6a). the parameters of the srrs are given in table 3. dimensions of the insert are apl = 22.86 mm, bpl = 10.16 mm. the amplitude response of the proposed filter, for the distance d = 11 mm, is shown in figure 6b (f0 = 10 ghz, b3db = 277 mhz). the total length of the proposed filter is 0.456 λg. the amplitude response of the filter is analyzed for various values of the distance d between two outer srrs. 
according to the amplitude responses shown in figure 6c, an increase of the distance d results in a narrower bandwidth, while the center frequency remains practically the same. furthermore, the influence of the position of the central srr on the filter response is investigated as well. the considered srr can be centrally positioned on the insert, as previously proposed, but it can also be shifted up or down, so that it does not have to be in line with the outer srrs (figure 7a). the obtained amplitude responses, for various values of the shift, are compared in figure 7b. moving the central srr up or down by 1 mm causes no significant change of the center frequency (less than 1 %). however, the 3-db bandwidth changes notably, particularly when the srr is moved up (in the considered case, the 3-db bandwidth increases by 45 %). this property of the filter can be used for bandwidth tuning.

fig. 5 coupling coefficient k as a function of distance d [mm]: (a) case 1 (curves for cases 1a and 1b), (b) case 2 (curves for cases 2a and 2b), (c) case 3 (curves for cases 3a and 3b)

fig. 6 waveguide filter using e-plane insert with srrs: (a) 3d model and wipl-d model of the insert, (b) amplitude response, (c) comparison of amplitude responses for various values of distance d

table 3 dimensions of the srrs in figure 6a
dimension [mm]   di1    di2   ci    pi    li
r1 (i = 1)       2.76   2.5   0.4   0.6   3.43
r2 (i = 2)       2.77   2.5   0.4   0.6   3.43

fig. 7 waveguide filter using e-plane insert with shifted central srr: (a) models of the insert, (b) comparison of amplitude responses for various values of the shift

figure 8 compares the amplitude response of the filter with the double-sided printed-circuit insert (figure 6a) with that of a filter in which all three srrs are printed on the same side of the insert. in both cases, the dimensions of the corresponding srrs are the same, as well as the distance between them (d = 11 mm). as can be seen, there is no significant change of the filter response; the resonant frequency is the same, while the bandwidth is narrowed by 10 mhz, which is 3.6 % of the reference bandwidth. however, the novel solution with srrs printed on both sides of the insert allows more accurate fabrication when the distance between the printed traces is critical, so the srrs can be closer to each other or can even overlap. in this manner, the requirements regarding device miniaturization can be easily met. also, multi-band filters can be developed having srrs with different resonant frequencies on the different sides of the insert, occupying less space compared to the solution in which the srrs are printed on the same side of the insert, next to each other, but separated enough to avoid undesired coupling.

fig. 8 comparison of amplitude responses of e-plane bandstop filters with srrs (model 1: double-sided printed-circuit insert, model 2: single-sided printed-circuit insert)

another possible solution with a double-sided printed-circuit insert is shown in figure 9a. the outer srrs are oriented so that their gaps are positioned on the left/right side. as in the previous examples, the central srr is printed on the other side of the insert. dimensions of the srrs are given in table 4. the distance between the outer srrs is set to d = 9 mm. the filter length is equal to 0.392 λg. 
the amplitude response of the filter is shown in figure 9b (f0 = 10 ghz, b3db = 1027 mhz). as can be seen, a wide-band filter is obtained, using the proposed simple approach. the proposed filters are compared to the similar solutions from the available open literature (e-plane filters of the third order, with a single rejection band), in terms of the filter size on the printed insert. the filter given in figure 9a exhibits a smaller size than the ka-band filter presented in [13], whose length is 0.406 λg, while each of the filters given in figures 6a and 9a is shorter than filter in [7] (total length 0.501 λg) and x-band filters in [8] (total length 0.572 λg) and [17] (total length 1.766 λg). therefore, it can be concluded that the compact structures are designed, with the possibility for further miniaturization. the filter order can be easily increased by adding resonators. 232 s. stefanovski pajović, m. potrebić, d. tošić, z. stamenković (a) (b) fig. 9 waveguide filter using e-plane insert with srrs of various orientations: (a) 3d model and wipl-d model of the insert, (b) amplitude response table 4 dimensions of the srrs in figure 9a dimension [mm] di1 di2 ci pi li r1 (i = 1) 2.8 2.5 0.4 0.6 3.28 r2 (i = 2) 2.76 2.5 0.4 0.6 3.43 5. conclusion novel design of an e-plane bandstop waveguide filter using a double-sided printedcircuit insert with srrs has been proposed. design has started with a model of the waveguide resonator using single srr. the amplitude response has been thoroughly investigated in order to be able to optimize the parameters of the srrs for the filter design. the coupling between two srrs on the insert has been analyzed for various positions of the srrs and their orientation. since the double-sided printed-circuit insert is of interest for the presented research, the model with srrs printed on different sides has been also taken into account. 
for each considered case, it has been shown that the coupling becomes weaker as the distance between the srrs increases. based on these findings, the third-order e-plane filter with the double-sided printed-circuit insert is introduced. the amplitude response has been investigated in terms of the distance between the srrs and the position of the central srr. by moving the central srr up or down the bandwidth can be tuned. it has been shown that the amplitude response of the filter with the double-sided printed-circuit insert agrees relatively well with the response of the filter with all srrs printed on the same side of the insert. however, the advantage of the novel solution lies in the fact that printing resonators on different sides of the insert allows more accurate fabrication when the distance between the traces is critical, so the srrs can be closer to each other. the proposed filter design makes it possible to have various combinations of the resonators on the insert, resulting in different responses. in this manner, a wide-band filter using a double-sided printed-circuit insert with srrs has also been introduced. besides the abovementioned advantage regarding fabrication precision, the proposed filter design allows overlapping of resonators printed on the different sides of the insert, thereby supporting device miniaturization. also, double-sided printing allows development of multi-band filters using a single e-plane insert, having resonators with different resonant frequencies on the different sides of the insert. such a layout of the srrs occupies less space on the insert, compared to the design assuming srrs on the same side, next to each other, separated enough to avoid undesired coupling of different bands. 
comparison with the similar solutions found in the available open literature has confirmed the proposed filter design in terms of the possibility for device miniaturization. it has been shown that the filters presented here occupy less space on the inserts than some previously reported filters of the same order, operating in different frequency bands, assuming that the filter length is normalized to the guided wavelength in the waveguide for the considered center frequency. the future work will be based on the different implementations of compact multi-band filters using e-plane double-sided printed-circuit inserts, which are recognized as relatively simple to design and fabricate, and can be used in real systems operating at microwave frequencies. acknowledgement: this work was supported by the ministry of education, science and technological development of the republic of serbia under grant tr32005. references [1] r. j. cameron, c. m. kudsia, r. r. mansour, microwave filters for communication systems: fundamentals, design, and applications. new jersey: john wiley & sons, 2007. [2] z. wang, r. xu, b. yan, “a covering ka-band two-way switch filter module using a three-line and an e-plane waveguide band-pass filters“, int. j. rf microw. c. e., vol. 25, no. 4, pp. 305-310, may 2015. [3] d. budimir, o. glubokov, m. potrebic, “waveguide filters using t-shaped resonators“, electron. lett., vol. 27, no. 1, pp. 38-40, january 2011. [4] j. y. jin, x. q. lin, q. xue, “a miniaturized evanescent mode waveguide filter using rrrs”, ieee t. microw. theory, vol. 64, no. 7, pp. 1989-1996, july 2016. [5] j. y. jin, x. q. lin, y. jiang, q. xue, “a novel compact e-plane waveguide filter with multiple transmission zeroes”, ieee t. microw. theory, vol. 63, no. 10, pp. 3374-3380, october 2015. [6] a. shelkovnikov, dj. budimir, “miniaturized rectangular waveguide filters“, int. j. rf microw. c. e., vol. 17, no. 4, pp. 398-403, july 2007. [7] a. shelkovnikov, dj. 
budimir, “left-handed rectangular waveguide bandstop filters“, microw. opt. techn. let., vol. 48, no. 5, pp. 846-848, may 2006. [8] m. mrvić, m. potrebić, d. tošić, z. cvetković, “e-plane microwave resonator for realization of waveguide filters”, in proceedings of the 12th international saum conference on systems, automatic control and measurements. niš, serbia, 2014, pp. 205–208. [9] b. jitha, c. s. nimisha, c. k. aanandan, p. mohanan, k. vasudevan, “srr loaded waveguide band rejection filter with adjustable bandwidth”, microw. opt. techn. let., vol. 48, no. 7, pp. 1427-1429, july 2006. 234 s. stefanovski pajović, m. potrebić, d. tošić, z. stamenković [10] m. n. m. kehn, o. quevedo-teruel, e. rajo-iglesias, “split-ring resonator loaded waveguides with multiple stopbands“, electron. lett., vol. 44, no. 12, pp. 714-716, june 2008. [11] e. rajo-iglesias, o. quevedo-teruel, m. n. m. kehn, “multiband srr loaded rectangular waveguide”, ieee t. antenn. propag., vol. 57, no. 5, pp. 1571-1575, may 2009. [12] n. purushothaman, a. jain, w. r. taube, r. gopal, s. k. ghosh, “modeling and fabrication studies of negative permeability metamaterial for use in waveguide applications”, microsyst. technol., vol. 21, no. 11, pp. 2415-2424, november 2015. [13] j. y. jin, q. xue, “a type of e-plane filter using folded split ring resonators (fsrrs)”, in proceedings of asia-pacific microwave conference 2015. nanjing, china, 2015. [14] s.n. burokur, m. latrach, s. toutain, “influence of split ring resonators on the properties of propagating structures”, iet microw. antenna p., vol. 1, no. 1, pp. 94-99, february 2007. [15] s. stefanovski, m. potrebić, d. tošić, “a novel design of e-plane bandstop waveguide filter using quarterwave resonators”, optoelectron. adv. mat., vol. 9, no. 1-2, pp. 87-93, january 2015. [16] m. mrvić, s. stefanovski, m. potrebić, d. tošić, "novel implementation of dual-band bandstop waveguide filter using quarter-wave resonators", (in serbian), tehnika, vol. 64, no. 
3, pp. 473-480, 2015. [17] r. lopez-villarroya, g. goussetis, “novel topology for low-cost dual-band stopband filters”, in proceedings of asia-pacific microwave conference 2009. singapore, singapore, 2009. [18] s. lj. stefanovski, “microwave waveguide filters using printed-circuit discontinuities”, ph.d. dissertation, school of electrical engineering, university of belgrade, belgrade, serbia, 2015. [19] c.-t. chiang, j.-c. liu, y.-c. huang, c.-p. kuei, y.-s. lee, k.-d. yeh, a. h.-c. chen, “both transversal negative permeability and backward-wave propagation in x-band waveguide with double-side srr metamaterials”, int. j. rf microw. c. e., vol. 26, no. 3, pp. 240-246, march 2015. [20] j. y. jin, q. xue, “novel w-band passband filters using the e-plane planar resonators”, ieee international workshop on electromagnetics: applications and student innovation competition (iwem) 2016. nanjing, china, 2016. [21] j. y. jin, x. q. lin, q. xue, “a novel dual-band bandpass e-plane filter using compact resonators”, ieee microw. wirel. co., vol. 26, no. 7, pp. 484-486, july 2016. [22] wipl-d pro 11.0, http://www.wipl-d.com, wipl-d d.o.o., belgrade, serbia, 2013. [23] r. l. villarroya, “e-plane parallel coupled resonators for waveguide bandpass filter applications“, ph.d. dissertation, heriot-watt university, edinburgh, scotland, uk, 2012. [24] j.-s. hong, microstrip filters for rf/microwave applications. new jersey: john wiley & sons, 2011. instruction facta universitatis series: electronics and energetics vol. 27, n o 3, september 2014, pp. 329 338 doi: 10.2298/fuee1403329s method for integrated circuits total ionizing dose hardness testing based on combined gammaand x-ray irradiation facilities  armen v. sogoyan 1,2 , alexey s. artamonov 1,2 , alexander y. nikiforov 1,2 , dmitry v. boychenko 1,2 1 national research nuclear university mephi, moscow, russia 2 jsc "specialized electronic systems" (spels), moscow, russia abstract. 
a method is proposed to test microelectronic parts total ionizing dose hardness based on a rationally balanced combination of gamma- and x-ray irradiation facilities. the scope of this method is identified, and a step-by-step algorithm of combined testing is provided, along with a test example of the method application.

key words: microelectronics, tid effects, gamma ray, x-ray.

1. introduction

testing procedure of microelectronic parts, i.e., integrated circuits (ics), semiconductor devices, solid-state microwave electronics and electronic modules, for compliance with nuclear and space radiation hardness regulations can be based on various radiation facilities that initiate total ionizing dose (tid) effects [1], [2] in devices under test (dut). ever since radiation testing of microelectronic parts first became an issue, tid effects have been induced in the laboratory mainly by gamma irradiation test facilities based on co-60 sources. every isotope-based gamma irradiation facility is a unique and complex installation with full-scale biological protection of personnel, commonly designed under a dedicated project. other types of tid radiation test facilities are also widely used, such as electron accelerators, other isotopic sources (cs-137) and nuclear reactors. in all cases the radiation test installation is aimed at reproducing characteristics equivalent to real-world radiation factors and their effects. as gamma quanta have high energy (about 1 mev), this results in high penetrating power and weak dependence of the total ionizing dose in active areas of the dut on its design and process specifics. at the same time, in order to provide radiation safety, gamma irradiation facilities require a significant length of signal lines (usually from 6 up to 25 meters) from the dut to the measuring hardware.

received february 28, 2014
corresponding author: aleksandr y. nikiforov, national research nuclear university mephi, moscow, russia (e-mail: aynik@spels.ru)

such remote measurements usually fail to cover all necessary modes and conditions of dut operation under irradiation. moreover, a substantial part of the dut informative parameters (including those related to precision and high-frequency performance) becomes totally immeasurable at such a distance. gamma irradiation facilities have low general availability due to strict radiation safety regulations, and it is impossible to use such a facility directly within the ic design and manufacturing process. as a result, the business case for this method of testing is not very compelling.

to overcome this downside of gamma irradiation facilities, in the late 80s to early 90s a new tid simulation test method was developed using relatively compact x-ray irradiators with low energy (10...100 kev). in tests with x-ray facilities, the intensity is tuned so as to result in an equivalent change in parameters, faults and failures of electronic components compared to the real-world ionization sources having the same dominant effect. x-ray testers (e.g., produced by aracor, usa or spels, russia) have been installed in many companies specialized in microelectronics research and development. the main advantages of x-ray testers are their radiation safety (a 2 mm iron shield is enough for a 10 kev source), very short signal lines (less than 1 meter) and good compatibility with automated control and measurement tools (including wafer probes).

implementation of x-ray testers for microelectronics tid hardness was accompanied by theoretical and experimental verification and research to substantiate equivalence of tid effects of various types of radiation [3]-[11]. as a result x-ray testers were incorporated into microelectronic processes and test standards [12], [13].

2. 
use of x-ray irradiation facilities the main issue restraining application of x-ray testers is their low energy and, consequently, low penetration of x-ray radiation, as well as substantial dependence of tid absorbed in active areas, on design and process specifics of dut. all these necessitate advanced expert skills to ensure quantitative tid assessment (i.e. dosimetric evaluation) in the context of process diversity of microelectronic parts, a multitude of packages used, etc. a substantial number of microelectronic parts tested today are sophisticated chips used in modern apparatus. test customers tend to minimize the number of tested samples of each type to 3...5. many types of microelectronic parts have plastic packages. dosimetric evaluation of such samples is rather complex, because in most cases the manufacturer fails to provide data on the component design, layout, process used, chemistry of the package, etc. therefore, in this work we tried to overcome the disadvantages of gamma and x-ray radiation test sources specifically for microelectronics tid research using the inherent benefits of both of them in favor of compact and safe x-ray source and rationally minimizing usage of gamma-sources for necessary cases only. method for integrated circuits total ionizing dose hardness testing based on combined... 331 3. scope of joint testing the joint method of tid hardness testing based on gammaand x-ray irradiation facilities has been designed to enhance precision and quality of x-ray based simulation testing defined in [13]. it covers packaged and caseless silicon-based cmos circuits (i.e., with monosilicon, epitaxial, silicon-on-sapphire and silicon-on-insulator structures), as well as bipolar and bicmos (including sige) ics. to be admitted to tests, microelectronic parts have to meet the following conditions:  number of samples: 3 or more  samples taken from the same production lot, with clearly identified samples. 4. 
calibration method

in x-ray dosimetry the method of calibration is commonly used. the most tid-sensitive parameter of the device under test is chosen as the calibration parameter and denoted as qk. it is assumed that the x-ray dose is equivalent to the γ radiation dose (dγ) if they both produce an identical radiation-induced change in the calibration parameter under identical testing conditions (mode, temperature, time from start of irradiation till measurement): deq(qk) = dγ(qk). dγ(qk) is called the calibration curve; it is determined based on the test results on a gamma irradiation facility. based on this curve, the tid sensitivity of the tested sample is "calibrated". as calibration parameter qk, we propose to choose an electrical parameter of the product whose radiation-induced change is determined by tid effects. additional requirements to be met by the calibration parameter are: ease of measurement, a higher sensitivity to d, a long linear or at least "smooth" monotonous interval of qk = qk(d), and lower susceptibility to electromagnetic interference and crosstalk.

5. algorithm of combined testing

microelectronics tid hardness testing procedure on gamma and x-ray facilities is based on the following algorithm.

1. predicting the level of tid hardness and selection of the most sensitive operating mode. the following prediction methods can be used (in descending priority):
- based on the lab's own previous experience in testing of a given part type, or other products of a given manufacturer;
- based on formally published results of previous testing of a given part type or other products of a given manufacturer provided by other test labs;
- based on formally published results of previous tests of similar parts provided by a given manufacturer, including technical specifications;
- based on results of previous tests of functionally similar parts provided by various other manufacturers;
- based on databases, articles, advertising and other informal sources. 
332 a.v. sogoyan, a.s. artamonov, a.y. nikiforov, d.v. boychenko such a prediction results in a preliminary selection of a particular calibration parameter from various device under test (dut) parameters listed in the test procedure as well as selection of the mostly tid-sensitive electric and operating modes. if there is no technical evidence in favor of a particular electric mode, we recommend opting for the mode with a maximum supply voltage according to specifications. 2. analysis of dut design and estimation of the x-ray package (coating) attenuation ratio. the attenuation ratio is estimated based on the type, thickness and chemical composition of the package (protective coating) of a dut. 3. x-ray irradiation of dut sample, measuring all the criterial parameters specified in the test procedure, in the selected operating mode under the normal climatic conditions. to make a preliminary selection of the calibration parameter and the criterial parameters, the q = q(dx) dependency should be identified. the power of x-ray radiation absorbed on the crystal surface, based on the estimated attenuation ratio, should fall in the range of x-ray irradiation facility power used for calibration. irradiation proceeds until the sample fails in most of criterial parameters, or until the level of exposure at which radiation-induced change of a pre-selected calibration parameter and criterial parameters 100 times exceeds the measurement error. when choosing an irradiation mode, the following condition should be met: trad > 10tmeas, where trad is the full exposure time, tmeas  total time of parameter measurement during irradiation. in case of low radiation sensitivity of the calibration parameter and other criterial parameters (initial value changes less than 100 times the measurement error) hardness is assessed on a smaller number of samples (but still 2 samples at least) on a gamma irradiation facility. 4. 
gamma irradiation of a dut sample, measuring all the criterial parameters in the selected operating mode under the normal climatic conditions. to make a preliminary selection of the calibration parameter and the criterial parameters, the Δq = Δq(dγ) dependency should be identified. the power of gamma radiation absorbed should fall in the range 0.5...2.0 of gamma radiation absorbed on the crystal surface, in view of the estimated attenuation ratio. irradiation continues until d0 is reached, or the sample fails in most of the criterial parameters, or until the level of exposure at which the radiation-induced change of the preselected calibration parameter and criterial parameters exceeds the measurement error by a factor of 100. the tid is measured by the standard dosimetric methods of the gamma irradiation facility. when choosing an irradiation mode, the following condition should be met: trad > 10tmeas, where trad is the full exposure time and tmeas is the total time of parameter measurement during irradiation.

5. comparative analysis of x-ray and gamma irradiation test results. a decision is made on feasibility and validity of x-ray tests and the calibration factor is estimated.

6. applicability of combined testing

the method of joint testing is applicable in case it is possible to build the calibration transformation:

$$d_\gamma = k\,d_x, \qquad (1)$$

where k is a factor for which dependencies Δqk(dx) and Δqk(dγ) are approximately similar: 2 ( / ) ( ) max 0.04 ( ) kx k d k q d k q d q d , (2) where δ is a relative instrumental error for Δq (according to the measurement tool data sheet), Δqkγ(dγ) is the dependence of the criterial parameter versus dγ obtained on the gamma irradiation facility (item 4), and Δqkx(dx) is the dependence of the criterial parameter increment versus the exposure level dx on the x-ray irradiation facility (item 3). 
the k-factor in the relationship (1) can be estimated by the least squares method. condition (2) should be verified at least at two points of dγ. when condition (2) is met, a decision on applicability of the calibration-based dosimetry method is taken. lot #1 of nγ samples is tested on a gamma irradiation facility, and lot #2 of nx samples is tested on an x-ray irradiation facility, where nx > nγ. both lots are tested in an identical electric mode and under the same climatic conditions. the method to estimate the k-factor depends on the nature of functions Δqi(dγ), where i is the number of a sample in lot 1: i = 1 ... nγ. as a calibration parameter, we recommend selecting the one with the higher relative radiation-induced increment. if there are multiple criterial parameters having close relative increment values (within 20%), the conditions outlined below apply to each parameter. if, in the tid range 0...dγ0, the Δqi(dγ) dependency has a maximum in the neighborhood of dimax, it is normalized to the value of Δqil, measured at dil closest to dimax. if, within the dose range 0...dγ0, the Δqi(dγ) dependence has several maxima, the main maximum should be selected. if no maximum is available, the dependency is not normalized. the calibration level Δq0 is selected. the calibration level should be selected close to the value corresponding to the parameter tolerance boundary specified for the tested sample. for the j-th sample of lot 1, j = 1...nγ, based on the experimental dependency Δqj(dγ), the tid value dγj is determined from the condition

$$\frac{\left|\Delta q_j(d_{\gamma j}) - \Delta q_0\right|}{\Delta q_0} \le \delta. \qquad (3)$$

if necessary, to determine dγj from (3), linear interpolation of the dependency Δqj(dγ) can be used. similarly, the values of dxi, i = 1...nx, for lot 2 are defined. then, the point estimate of calibration factor k is made:

$$k = \frac{\bar d_\gamma}{\bar d_x}, \qquad (4a)$$

$$\bar d_x = \frac{1}{n_x}\sum_{i=1}^{n_x} d_{xi}, \qquad (4b)$$

$$\bar d_\gamma = \frac{1}{n_\gamma}\sum_{i=1}^{n_\gamma} d_{\gamma i}. \qquad (4c)$$

when there are multiple criterial parameters with close relative values of increments, the calibration parameter is that for which

$$\frac{1}{n_x^2}\sum_{i=1}^{n_x}\frac{(d_{xi}-\bar d_x)^2}{\bar d_x^2} + \frac{1}{n_\gamma^2}\sum_{i=1}^{n_\gamma}\frac{(d_{\gamma i}-\bar d_\gamma)^2}{\bar d_\gamma^2} \qquad (4d)$$

has the smallest value. the lower boundary kl of the calibration factor confidence interval is calculated: 2 1 2 1 2 1 (1 ) l k q k q q q k q , (5) 2 1 , 2 12 1 2 ( ) 1 1 ( 2) x x n xi x n n i x x x t d d q n n n n d , 2 1 , 2 12 2 2 ( ) 1 1 ( 2) x n i n n i x x x t d d q n n n n d where t1−α/2,n is the quantile of the student distribution with n degrees of freedom at the probability level 1−α/2. the confidence level p = 1−α is defined in the regulatory and technical documentation. if its value is not set, it is assumed to be 0.95 according to radiation test standards. the calibration factor is taken as k = kl. the ratio k/kl > 1 plays the role of the testing norm, which depends on the number of samples tested. the relative dosimetry error δ in such a case is affected by the relative errors of gamma (δγ) and x-ray (δx) dosimetry:

$$\delta = (1+\delta_\gamma)(1+\delta_x) - 1. \qquad (6)$$

dosimetric conformity of products is regulated by radiation test standards.

7. combined testing example

for a test example, we have chosen a typical integrated circuit, hef4013bt, which is a dual cmos d-trigger manufactured by nxp semiconductors. let us estimate the calibration factor for hef4013bt. as the sample was irradiated, we monitored its operation and measured the acceptability criteria (uoh, ioh, iol, icch, iccl) versus the level (time) of exposure (see fig. 1). 
Fig. 1 Experimental dependences of selected HEF4013BT parameters versus exposure time: a) UOH, b) IOH, IOL, c) ICCH, d) ICCL

Next, we have to assess the applicability of the method. For this purpose, we expose the circuit in a gamma irradiation facility (sample 13) and in an X-ray irradiation facility (sample 6). Fig. 2 shows the matching of the dependencies of the supply current increment in the set mode for these samples. The calibration transformation factor in (1) was estimated by the least squares method. At k = 0.0328, relationship (2) holds even at δ = 0 for at least three different exposure levels. Therefore, we can conclude that the combined test method is applicable to this particular sample.

Fig. 2 Matching of dependencies of supply current in the set mode at exposure of HEF4013BT in a gamma irradiation facility (sample 13) and an X-ray irradiation facility (sample 6) at k = 0.0328.

Further, the two lots of integrated circuits are irradiated. The first lot (2 samples, including sample #13) is exposed in a gamma irradiation facility, while the second lot (5 samples, including #6) is exposed in an X-ray facility. As the calibration parameter, the supply current in the set state (ICCH) is selected. Since the dependence of the parameter increment on exposure level is monotonic, it is not normalized.

The calibration level of parameter Q0 = 3 mA is selected. For the j-th sample of lot 1, j = 1...n, based on the experimental dependency Qj(D), the TID value Dj is determined from condition (3) (Fig. 1).

Fig. 3

Then, the levels of exposure Di matching the Q0 criterion are determined, resulting in D = {51.6, 44.6}. Similarly, the values of Dxi, i = 1...nx, are determined for lot 2: Dx = {1734, 1733, 1521, 1488, 1569}.
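The point estimate defined by (4a)-(4c) can be checked against the two lots listed above. The sketch below assumes that (4a) is the ratio of the mean gamma dose to the mean X-ray exposure, which is consistent with the numbers reported for this example; the function and variable names are illustrative only.

```python
from statistics import mean

# Threshold values read from the example above (units as reported in the text).
d_gamma = [51.6, 44.6]                     # lot 1, gamma irradiation facility
d_xray = [1734, 1733, 1521, 1488, 1569]    # lot 2, X-ray irradiation facility


def point_estimate(gamma, xray):
    """Return (mean of lot 1, mean of lot 2, k), with k taken as the ratio of means.

    Assumption: k = mean(D) / mean(Dx), as suggested by eqs. (4a)-(4c).
    """
    d_bar = mean(gamma)
    dx_bar = mean(xray)
    return d_bar, dx_bar, d_bar / dx_bar


d_bar, dx_bar, k = point_estimate(d_gamma, d_xray)
print(f"mean D = {d_bar:.1f}, mean Dx = {dx_bar:.0f}, k = {k:.4f}")
```

With only two samples in lot 1, the mean is sensitive to each measurement, which is why the method goes on to use the lower confidence boundary of k rather than the point estimate itself.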
Then, the point estimate of the calibration factor k is made:

$$\bar{D}_x = 1609, \qquad \bar{D} = 48.1, \qquad k = \frac{\bar{D}}{\bar{D}_x} = 0.0299.$$

The lower boundary k_l of the calibration factor confidence interval is calculated at P = 0.95: k = k_l = 0.025. The relative error of measuring the X-ray exposure duration, δ_x, under automatic source control is below 1%; therefore, the dosimetry error of the test is determined by the relative error of gamma irradiation dosimetry, δ_γ, which is 15% according to the dosimetric system data sheet.

If the case for X-ray testing is proven, the informative parameters of the electronic components that cannot be measured under gamma irradiation are measured on the X-ray source; otherwise, the entire test is run on the gamma irradiation facility.

8. Conclusion

The method of microelectronics TID hardness assurance testing based on a combination of gamma and X-ray irradiation facilities clarifies and develops the method of X-ray test dosimetry specified in regulatory documents. The method can improve the reliability of X-ray test dosimetry by combining, within a single test cycle, the capabilities and benefits of both kinds of facility: gamma irradiation facilities ensure the adequacy of test effects, while X-ray irradiation facilities allow all informative parameters of electronic components (including precision and performance) to be determined and all operating modes and conditions to be checked directly under irradiation. The proposed method of combined electronic component testing offers the benefit of working with small sample lots and presents clear applicability criteria.

Acknowledgement: The authors wish to thank Dr. Yuriy Bogdanov of the Physics and Technology Institute, RAS, Moscow, for his great contribution to the statistical approach.

FACTA UNIVERSITATIS
Series: Electronics and Energetics Vol. 31, No 2, June 2018, pp. 267-277
https://doi.org/10.2298/fuee1802267b

Improving network lifetime by minimizing energy hole problem in WSN for the application of IoT

Trupti M. Behera, Sushanta K. Mohapatra
School of Electronics Engineering, KIIT University, Bhubaneswar, Odisha, India

Abstract. The world today is at the Internet of Things (IoT) inflection point, with a growing number of products adding to its intelligence system through a wide range of connectivity. Wireless sensor networks (WSN) have been very useful in IoT applications for gathering data and processing it for the end user. However, limited battery power and network lifetime are among the major challenges in the design of any sensor network. One of these challenges is the energy hole problem (EHP), which arises when the nodes nearer to the sink or base station die out early because they carry an excess load compared to nodes that are far away. This breaks the connection of the network to the sink, which shortens the lifetime of the network. In this paper, a trade-off is maintained between network lifetime and power requirement by implementing a sleep-awake mechanism. MATLAB simulations show that, after applying the mechanism, the network lifetime was extended by almost 300 and 700 rounds for the TEEN and LEACH protocols respectively. The results will be beneficial for the design process in WSN for IoT applications.

Key words: IoT, WSN, energy hole problem, power consumption, network lifetime.
1. Introduction

The Internet of Things (IoT) is an integration of the existing and evolving internet with future network developments, such as self-configuring capabilities and enhanced network lifetime with proper power management. The IoT cloud creates an intelligent network that can be sensed, controlled and programmed [1]. The future internet designed as IoT includes three major components which enable seamless communication [2]. The first is the hardware, which is made up of sensors, actuators and embedded communication hardware such as radio frequency identification (RFID), wireless sensor networks (WSN), etc. The second is a middleware, which provides on-demand storage and computing tools for data analytics. The last is a presentation layer of novel, easy-to-understand visualization and interpretation tools which can be widely accessed on different platforms and designed for different applications [2].

Received May 25, 2017; received in revised form October 23, 2017
Corresponding author: Trupti M. Behera, School of Electronics Engineering, KIIT University, Bhubaneswar, Odisha, India (e-mail: truptifet@kiit.ac.in)

The emerging IoT has a diversified application scenario equipped with a wide range of heterogeneous devices. As shown in Fig. 1, the WSN acts as a gateway to the IoT. WSN also has a wide range of applications in various working domains and is well suited for long-term data acquisition; hence, WSN will be the best sensor interfacing device in the IoT environment [3][4].

Fig. 1 WSN as a gateway for IoT

One of the major design criteria of a WSN is the communication of data in an IoT environment while trying to prolong the network lifetime. The design procedure should also prevent any connectivity degradation by employing efficient power management techniques.
Further, the placement of the sink, or base station, also plays a vital role in power consumption, as it is responsible for collecting all the sensed data from the sensor nodes and forwarding the information to the end user. The sink node is equipped with one or more receiving antennas and unlimited energy to carry out the communication process effectively. In a WSN, all the nodes are randomly deployed, but nodes nearer to the sink consume more energy than those farther away, as they carry a greater load. Hence these nodes die quickly, creating a vacuum of energy around the sink called the energy hole problem (EHP) [5]. Under this scenario, data transmission to the sink is lost completely, which ends the network lifetime [6]. As a result, optimizing power consumption while enhancing the network lifetime becomes one of the most challenging tasks for researchers.

2. Related work

To date, a number of schemes have been proposed to achieve the desired performance in terms of better power efficiency, network lifetime, throughput, etc. In [7], we discussed a heterogeneous WSN where some of the nodes (called advanced nodes) are assigned more energy than the other nodes. The simulation results showed that when all the normal nodes are dead, the network still continues transmission because the advanced nodes are alive to transmit data to the sink, thus enhancing the network lifetime. An analytical model is proposed in [8] to reduce the EHP by analyzing the effectiveness of several existing approaches, including traffic compression, deployment assistance, and aggregation.

In [9], the authors prepared a model based on calculating the Voronoi polygon of each node to detect any energy hole in the network. Based on this, the node then moves to a better position to provide maximum coverage.
They have also discussed optimizing the network lifetime and data collection simultaneously by adopting a rate allocation algorithm for data aggregation. A non-uniform node distribution strategy is proposed in [10], where the authors show that if the number of nodes increases in geometric proportion from the outer parts of the network to the inner ones, then the energy wastage can be reduced to almost 10%. In [11], the authors proposed that instead of a single sink in a particular field, multiple sinks can be deployed. Each sink is surrounded by normal nodes, thus dividing the network load to avoid the energy hole. This decision depends on the amount of data load in the network.

A data gathering scheme is proposed in [12], where the network employs an optimum, fixed cluster radius intended to improve the network lifetime by avoiding the energy hole problem. In [13], a new scheme called WEMER is proposed that divides the whole network into many small equiangular wedges, which helps to reduce energy hole formation. The authors in [14] proposed a non-uniform node distribution strategy that achieves nearly balanced energy depletion in the network, together with a distributed shortest-path routing algorithm, in order to reduce the energy hole problem.

A sensor network designed for an IoT application needs to perform various operations, such as sensing data, aggregating it, and transferring it to the end user. Performing such operations with limited power is one of the major challenges in the design process. Hence, we need to maximize the network lifetime by conserving energy during the transmission phase. This is possible if only a small percentage of nodes are allowed to transmit data to the base station while the rest of the nodes become inactive and go to sleep.

3. Sleep-awake mechanism

From the above literature, it is clear that due to EHP, the network dies earlier [15] than its expected lifetime.
The main reason behind EHP is that nodes near the sink deliver a much larger amount of data to it than nodes far away. In [16], the authors stated that due to the EHP, the network lifetime ends even when 90% of the energy is left unused. Thus, avoiding EHP has become an important research area.

We use the first-order radio model for energy consumption as used in [16] and shown in Fig. 2, where nodes are randomly deployed with equal energy levels. The sink is centrally positioned with unlimited energy. In each round, the sink searches for the node at the maximum distance in the region. It then calculates the energy required for that node to transmit data to the sink. Let this energy be the reference energy (Eref). A node is permitted to transmit data to the sink only when its energy level is greater than or equal to Eref. When the energy level of any node [15] becomes less than Eref, it goes into sleep mode to save energy. This process continues in each round until the share of sleeping nodes exceeds one tenth of the total nodes in the region. When the number of sleeping nodes exceeds 10%, the node that went to sleep first moves to the awake mode. In subsequent rounds, when the percentage again exceeds 10%, the node that went to sleep second moves to the awake mode, and the mechanism continues. In this scenario, about one tenth of the total nodes always remain asleep to save energy and extend the network lifetime.

To calculate the reference energy, we use the following formula as in [15]:

$$E_{ref} = (E_{tx} + E_{da})\,D + E_{amp}\,D\,d^{4} \tag{1}$$

where
Eref is the reference energy,
D is the length of the data packet,
d is the distance between the maximum-distance node and the sink,
Etx is the energy required for data transmission,
Eda is the energy required for data aggregation,
Eamp is the energy required by the power amplifier.
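The round-by-round rule described above can be sketched as follows. This is a minimal illustration and not the authors' MATLAB code: the function names, the FIFO queue used to remember the order in which nodes fell asleep, and the sample values in the usage lines (a packet length of 4000 bits and the amplifier coefficient) are assumptions made for the example.

```python
from collections import deque


def reference_energy(packet_bits, d_max, e_tx, e_da, e_amp):
    """Eq. (1): E_ref = (E_tx + E_da) * D + E_amp * D * d_max**4."""
    return (e_tx + e_da) * packet_bits + e_amp * packet_bits * d_max ** 4


def update_sleep_set(energies, sleeping, e_ref, max_fraction=0.1):
    """Apply one round of the sleep-awake rule.

    energies : residual energy of each node, indexed by node id
    sleeping : deque of node ids in the order they went to sleep (mutated)
    Returns the ids woken up because the sleep quota was exceeded.
    """
    # Nodes that cannot afford the reference transmission go to sleep.
    for node, energy in enumerate(energies):
        if energy < e_ref and node not in sleeping:
            sleeping.append(node)

    # Keep at most max_fraction of the nodes asleep; wake the earliest sleepers.
    limit = int(max_fraction * len(energies))
    woken = []
    while len(sleeping) > limit:
        woken.append(sleeping.popleft())
    return woken


# Illustrative values only (the packet length is not given in the text).
e_ref = reference_energy(4000, 70.0, e_tx=50e-9, e_da=5e-9, e_amp=0.0013e-12)
sleep_queue = deque()
woken_nodes = update_sleep_set([1.0] * 8 + [0.0, 0.0], sleep_queue, e_ref)
print(e_ref, woken_nodes, list(sleep_queue))
```

A queue is used here because the text wakes nodes in the order in which they fell asleep; a woken node whose energy is still below the reference level would re-enter the queue in the next round, so the sleeping set rotates rather than grows.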
The next step is cluster head selection by the nodes, based on a predefined probability [6]. Only after the cluster heads broadcast their status can the nodes associate with them, thus consuming minimal energy while transmitting data. After the clusters are formed, each cluster head creates a time division multiple access (TDMA) schedule for the nodes within the cluster. The TDMA slots are assigned by the sink to each node. Nodes can transmit their data to the cluster head only during their respective time slots. Once the cluster head has collected all the data, it performs data aggregation and transmits the data to the base station.

The energy consumed in transmitting data from a node n to the cluster head CH for the condition d < d0 (reference distance) can be given as

$$E_{n}^{ch}(d) = E_{ele}\,D + E_{fs}\,D\,d^{2} \tag{2}$$

where

$$d_{0} = \frac{4\pi h_{t} h_{r}}{\lambda}$$

and ht and hr are the heights of the transmitting and receiving antennas respectively.

Fig. 2 First-order radio model

Now considering the scenario where the distance between n and CH is d > d0, the energy can be given as in [15]

$$E_{n}^{ch}(d) = E_{ele}\,D + E_{amp}\,D\,d^{4} \tag{3}$$

The energy consumed by the CH to transmit data to the sink S when the distance between them is d < d0 is given as in [15]

$$E_{ch}^{s}(d) = (E_{ele} + E_{da})\,D + E_{fs}\,D\,d^{2} \tag{4}$$

When the distance between CH and S is d > d0, the energy consumption can be written as in [15]

$$E_{ch}^{s}(d) = (E_{ele} + E_{da})\,D + E_{amp}\,D\,d^{4} \tag{5}$$

The total energy consumed in transmitting data from a particular node to the sink is the sum of the energies in equations (2), (3) and (4), (5), i.e.

$$E_{total\_ch} = E_{n}^{ch} + E_{ch}^{s} \tag{6}$$

The average of the total energy can be found by

$$E_{average\_ch} = \frac{E_{total\_ch}}{n} \tag{7}$$

The energy saved in each round due to the sleeping of normal nodes is

$$E_{save\_n} = E_{ele} + E_{tx} + E_{amp} \tag{8}$$

where Eele is the radio energy dissipation. The energy saving for a CH is

$$E_{save\_ch} = E_{ele} + E_{da} + E_{tx} + E_{rx} + E_{amp} \tag{9}$$

The energy saving for all sleep nodes can be written as

$$E_{save\_total} = \sum_{i=0}^{n} E_{i} \tag{10}$$

where n is the total number of nodes that are in sleep mode. The average energy saving can then be written as

$$E_{save\_avg} = \frac{E_{save\_total}}{n} \tag{11}$$

4. Simulations and result

We have considered a sensor network where 100 nodes are deployed randomly. The sink is located at the center with unlimited energy. The normal sensor nodes have limited energy. In each round, some of the sensor nodes transmit data, while others are set to sleep mode to save energy. We implement this mechanism in some of the cluster-based protocols, namely LEACH [17] [18], DEEC [19] and TEEN [20]. LEACH is a homogeneous protocol, whereas DEEC and TEEN are heterogeneous protocols. However, the work can also be extended to other hierarchical routing protocols such as PEGASIS, EAMMH, and SEP. The MATLAB simulation uses the parameters listed below.

Table 1 Parameters for simulation

| Symbol | Description | Value |
|---|---|---|
| xm | Distance along the x-axis | 100 m |
| ym | Distance along the y-axis | 100 m |
| N | Total number of nodes | 100 nodes |
| E0 | Total energy of network | 0.5 J |
| p | Probability of cluster head | 0.1 |
| Erx | Energy dissipation: receiving | 0.0013 pJ/bit/m^4 |
| Efs | Energy dissipation: free space model | 10 pJ/bit/m^2 |
| Eamp | Energy dissipation: power amplifier | 100 pJ/bit/m^2 |
| Eele | Energy dissipation: electronics | 50 nJ/bit |
| Etx | Energy dissipation: transmission | 50 nJ/bit |
| Eda | Energy dissipation: aggregation | 5 nJ/bit |
| d0 | Reference distance | 87 m |
| n | Number of sleep nodes | 10 nodes |

Fig. 3 Random deployment of 100 sensor nodes

LEACH is a homogeneous protocol where all the sensor nodes are initially assigned the same energy level. According to our concept, the nodes whose energy level is below the threshold are in sleep mode. Following this method, we are able to save the total energy of the network. Figure 4 compares the above technique, iLEACH (LEACH with the sleep-awake mechanism), with LEACH regarding the number of alive nodes, the number of dead nodes, the number of CHs per round and the number of packets sent to the BS.

The figure shows that in LEACH the last node stays alive until around 1500 rounds, whereas in iLEACH the last node is alive until 2200 rounds. This result shows that in iLEACH the utilization of energy is properly distributed among all the nodes in the network, which increases the network lifetime.

Fig. 4 Comparing the performance of LEACH and iLEACH: (a) number of alive nodes during rounds, (b) number of data packets per rounds
[Panels: no. of nodes alive, cluster heads per round, and packets sent to the base station, each versus no. of rounds (r)]

Hence, iLEACH has a prolonged stability period, and its instability region also starts much later than in LEACH. In LEACH, a random number of CHs is selected in every round, whereas iLEACH shows a patterned, controlled CH selection. In iLEACH, the efficient CH selection algorithm supports a better and more constant data rate to the BS.
With the sleep-awake policy, iLEACH delivers data to the base station much more effectively than LEACH: it sends a much higher number of data packets, achieving a higher data rate together with a longer network lifetime.

Fig. 5 Comparing the performance of TEEN and iTEEN: (a) number of alive nodes during rounds, (b) number of data packets per rounds (red: iTEEN, blue: TEEN)
[Panels: no. of nodes alive, cluster heads per round, and packets sent to the base station, each versus no. of rounds (r)]

Fig. 6 Comparing the performance of DEEC and iDEEC: (a) number of alive nodes during rounds, (b) number of data packets per rounds
[Panels: no. of nodes alive, cluster heads per round, and packets sent to the base station, each versus no. of rounds (r)]

Figures 5 and 6 compare two existing heterogeneous protocols, TEEN and DEEC, with their sleep-awake counterparts iTEEN and iDEEC respectively. The simulation results clearly show that iTEEN and iDEEC perform better regarding the number of alive nodes, the number of CHs per round and the number of packets sent to the BS. For the TEEN protocol, the nodes start to die out after 1600 rounds, whereas in iTEEN they last until around 1900 rounds. In a similar manner, the number of data packets sent to the base station also increases for both protocols.

Rasheed et al. in [6] carried out a similar experiment, called EHORM, to compare the number of alive nodes for protocols such as LEACH, DEEC, TEEN, and SEP. Our approach, however, gives better results with more valid comparisons by taking different parameters into consideration. For the LEACH protocol, nodes remain alive up to 2750 rounds, compared to only about 1700 rounds using the EHORM technique in [6]. Similarly, for the TEEN and DEEC protocols, nodes remain alive beyond 3500 rounds, whereas with EHORM the nodes are dead by 3200 and 3000 rounds respectively. Hence, we can say that the network lifetime is enhanced after the implementation of our proposed mechanism for both heterogeneous and homogeneous protocols.

5. Conclusion

In this article, we discussed an important issue in wireless sensor networks for IoT applications, namely the energy hole problem. The EHP arises because the nodes near the sink consume more energy and, as a result, die quickly, which in turn shortens the network lifetime. The EHP is studied in both heterogeneous and homogeneous routing protocols. The sleep-awake mechanism was implemented in the LEACH, DEEC and TEEN protocols to study the behavior of the network under different scenarios in order to remove any energy hole within the network. The simulation results show that less energy is consumed and nodes live longer in iLEACH, iTEEN and iDEEC than in LEACH, TEEN, and DEEC respectively. This indicates that the sensor network lifetime is increased after implementation of the sleep-awake mechanism. The simulation results also show a better stability period and an increased number of data packets sent to the sink. This technique of enhancing the network lifetime while also optimizing energy consumption can be implemented in IoT to achieve better performance.
to extend the work in future direction, performance analysis of iot based applications can be done for other routing protocols such as pegasis, eammh, sep, etc. references [1] j. chase, “the evolution of the internet of things,” texas instruments, 2013. [2] j. gubbi, r. buyya, s. marusic, and m. palaniswami, “internet of things (iot): a vision, architectural elements, and future directions,” futur. gener. comput. syst., vol. 29, no. 7, pp. 1645–1660, 2013. [3] p. bellavista, g. cardone, a. corradi, and l. foschini, “convergence of manet and wsn in iot urban scenarios,” ieee sens. j., vol. 13, no. 10, pp. 3558–3567, 2013. [4] m. t. lazarescu, “design of a wsn platform for long-term environmental monitoring for iot applications,” ieee j. emerg. sel. top. circuits syst., vol. 3, no. 1, pp. 45–54, 2013. [5] j. jia, x. wu, j. chen, and x. wang, “exploiting sensor redistribution for eliminating the energy hole problem in mobile sensor networks,” eurasip j. wirel. commun. netw., vol. 2012, no. 1, p. 68, 2012. [6] m. b. rasheedl, n. javaid, a. javaid, m. a. khan, s. h. bouk, and z. a. khan, “improving network efficiency by removing energy holes in wsns,” arxiv prepr. arxiv1303.5365, 2013. [7] t. m. behera and s. s. singh, “a novel energy efficient network management scheme of heterogeneous wsn with mimo techniques,” int. j. comput. appl., vol. 93, no. 7, 2014. [8] j. li and p. mohapatra, “analytical modeling and mitigation techniques for the energy hole problem in sensor networks,” pervasive mob. comput., vol. 3, no. 3, pp. 233–254, 2007. [9] x. tang and j. xu, “optimizing lifetime for continuous data aggregation with precision guarantees in wireless sensor networks,” ieee/acm trans. netw., vol. 16, no. 4, pp. 904–917, 2008. improving network lifetime by minimizing energy hole problem in wsn... 277 [10] x. wu, g. chen, and s. k. 
das, “on the energy hole problem of nonuniform node distribution in wireless sensor networks,” in proceedings of the ieee international conference on mobile adhoc and sensor systems (mass), 2006, pp. 180–187. [11] m. ahadi and a. m. bidgoli, “a multiple-sink model for decreasing the energy hole problem in largescale wireless sensor networks,” int. j. comput. theory eng., vol. 4, no. 5, p. 843, 2012. [12] a.-f. liu, x.-y. wu, z.-g. chen, and w.-h. gui, “research on the energy hole problem based on unequal cluster-radius for wireless sensor networks,” comput. commun., vol. 33, no. 3, pp. 302–321, 2010. [13] n. sharmin, m. s. alam, and s. s. moni, “wemer: an energy hole mitigation scheme in wireless sensor networks,” in proceedings of the 2016 ieee international wie conference on electrical and computer engineering (wiecon-ece), 2016, pp. 229–232. [14] x. wu, g. chen, and s. k. das, “avoiding energy holes in wireless sensor networks with nonuniform node distribution,” ieee trans. parallel distrib. syst., vol. 19, no. 5, pp. 710–720, 2008. [15] m. b. rasheed, n. javaid, z. a. khan, u. qasim, and m. ishfaq, “e-horm: an energy-efficient hole removing mechanism in wireless sensor networks,” in proceedings of the 26th annual ieee canadian conference on electrical and computer engineering (ccece), 2013, pp. 1–4. [16] j. li and p. mohapatra, “an analytical model for the energy hole problem in many-to-one sensor networks,” in proceedings of the ieee vehicular technology conference, 2005, vol. 62, no. 4, p. 2721. [17] s. k. singh, p. kumar, and j. p. singh, “a survey on successors of leach protocol,” ieee access, vol. 5, pp. 4298–4328, 2017. [18] l. yadav and c. sunitha, “low energy adaptive clustering hierarchy in wireless sensor network (leach),” int. j. comput. sci. inf. technol., vol. 5, no. 3, pp. 4661–4664, 2014. [19] l. qing, q. zhu, and m. wang, “design of a distributed energy-efficient clustering algorithm for heterogeneous wireless sensor networks,” comput. 
commun., vol. 29, no. 12, pp. 2230–2237, 2006. [20] a. manjeshwar and d. p. agrawal, “teen: a routing protocol for enhanced efficiency in wireless sensor networks.,” in ipdps, 2001, vol. 1, p. 189. instruction facta universitatis series: electronics and energetics vol. 30, n o 2, june 2017, pp. 209 221 doi: 10.2298/fuee1702209d rf pa linearization by signals modified in baseband digital domain  aleksandra đorić 1 , nataša maleš-ilić 2 , aleksandar atanasković 2 1 innovation center of advanced technologies, niš, serbia 2 university of niš, faculty of electronic engineering, niš, serbia abstract. this paper represents the linearization of the rf power amplifier performed by a new approach that combines two different methods exploiting the modified baseband signals. the signals for linearization in both methods are formed and processed in digital domain. the required modified baseband signals for linearization are products of the second order nonlinearity of a nonlinear system fed by the useful baseband signal. in the first method, adequate part of the modified baseband signal is adjusted in amplitude and polarity and injected at the input and output of the amplifier transistor across the series lc resonant circuit. in the second method, the appropriate modified baseband signal set on the appropriate amplitude and phase modulates the fundamental carrier second harmonic, which is then inserted at the input and output of the amplifier transistor. the effects of the combined linearization method are considered on a single stage power amplifier for quadrature amplitude modulated signals characterized with frequency spacing between spectral components up to 60 mhz for different input power levels, as well as for wcdma digitally modulated signal. key words: linearization, power amplifier, baseband signal, second harmonic, intermodulation products. 1. 
introduction the new generation of the communication technologies and standards impose demanding requirements to new systems in order to increase bit rate, linearity and spectrum efficiency. these requirements present a serious task for the transmitter designers regarding the power amplifier (pa) topology that should support wideband operation, various modulation formats, a diversity of signal bandwidths and frequency ranges, high efficiency, as well as linear operation [1]. for achieving high power efficiency, the power amplifier should operate closer to its compression region distorting the linearity of the output signals. consequently, significant efforts have been devoted to development of linearization techniques for nonlinear rf and microwave power amplifiers. various linearization methods received june 10, 2016; received in revised form september 12, 2016 corresponding author: aleksandra đorić innovation center of advanced technologies, bulevar nikoletesle 61/5, 18 000 niš, serbia (e-mail: alexdjoric@yahoo.com) 210 a. đorić, n. maleš-ilić, a. atanasković for minimizing nonlinear distortions of power amplifiers have been reported in the literature [2-5]: feedback, feed-forward, predistortion, etc. in the previously deployed linearization technique that uses the fundamental signals’ second-order (im2) and fourth-order nonlinear signals (im4) at frequencies around the second harmonics, [6-12], the signals for linearization were generated and prepared for injection at the input and output of the transistor amplifier in the rf analogue signal domain. linearization effects were validated on the single stage rf power amplifiers throughout the simulation process [6]-[8] and experiments [9], as well as on the doherty amplifiers [10-12]. in this paper, we combine two linearization methods that exploit modified baseband signals formed and processed in the digital domain [13], [14] in order to linearize the rf amplifier. 
in the first method, an adequately prepared signal in the baseband is adjusted in amplitude and polarity and injected at the input and output of the amplifier transistor across the lc resonant circuit. in the second method, specified baseband signals modulate the carrier second harmonic after appropriate setting on the amplitude and phase, and the formed signal is run at the gate and drain of the amplifier transistor. the injected signals for the linearization and the fundamental signal are mixed due to the second order nonlinearity of the transistor generating additional third-order nonlinear products that may suppress the original intermodulation products caused by the transistor nonlinear characteristic. the impact of the proposed linearization methods is considered on a single stage power amplifier for qam signals wherein i and q components are single tones with maximum spectrum bandwidth 60 mhz, as well as for the wcdma digitally modulated signal. the results obtained by the combined method are validated by comparison with the results achieved by the first and second linearization approaches, which are described in [13] and [14]. since the linearization results for the first method are represented in [13] for only one power level of the qam signal, in this paper we have analysed its linearization effect for a range of the input signal powers. this paper is organized as follows: section ii explains the theoretical background of the combined linearization approach; section iii represents the results of linearization of the designed single-stage rf power amplifier obtained by the combined method proposed in this paper, which are also compared to the results obtained by the first linearization approach; in section iv conclusions are reported. 
2. theoretical analysis of the combined linearization technique

the operational principle of the linearization method proposed herein can be comprehended by the theoretical analysis of the current nonlinearity at the transistor output in the amplifier circuit. the dominant nonlinearity of fets can be represented by a taylor-series polynomial model [15-17] in the case when the memory effects are neglected:

$$i_{ds} = g_{m1} v_{gs} + g_{m2} v_{gs}^{2} + g_{m3} v_{gs}^{3} + g_{d1} v_{ds} + g_{d2} v_{ds}^{2} + g_{d3} v_{ds}^{3} + g_{m1d1} v_{gs} v_{ds} + g_{m2d1} v_{gs}^{2} v_{ds} + g_{m1d2} v_{gs} v_{ds}^{2} + \dots \qquad (1)$$

the transistor’s drain current ($i_{ds}$) depends on the voltage between the gate and source ($v_{gs}$), which is expressed by the transconductance terms labeled $g_{mx}$. the dependence of the drain current on the voltage between the drain and source ($v_{ds}$) is included by the drain conductance terms $g_{dy}$. in addition, the drain current is a function of both the gate-source voltage and the drain-source voltage, which is represented by the mixed coefficients $g_{mxdy}$. the order of each coefficient can be calculated as x + y. the digitally modulated signal is characterized by the magnitude $c(t)$, phase $\varphi(t)$, and carrier frequency $\omega_0$, as in

$$v(t) = c(t)\cos(\omega_0 t + \varphi(t)) = c(t)\,[\cos(\varphi(t))\cos(\omega_0 t) - \sin(\varphi(t))\sin(\omega_0 t)] = v_s\,(I\cos(\omega_0 t) - Q\sin(\omega_0 t)) \qquad (2)$$

where $I = (c(t)/v_s)\cos(\varphi(t))$ and $Q = (c(t)/v_s)\sin(\varphi(t))$ are the in-phase and quadrature-phase components of the baseband signal. the second-order nonlinear system generates the output signal given by

$$v_{out\_2\_order}(t) = v_{in}^{2} = \frac{1}{2} v_s^{2}\,[I^{2} + Q^{2}] + \frac{1}{2} v_s^{2}\,[(I^{2} - Q^{2})\cos(2\omega_0 t) - 2IQ\sin(2\omega_0 t)] \qquad (3)$$

if the digital signal expressed by eq. 2 is fed at its input.
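the decomposition in eq. 3 can be checked numerically. the following is a minimal sketch, assuming single-tone i and q components and arbitrary illustrative values for the sample count, tone frequencies and amplitudes (none of these numbers come from the paper). it squares the signal of eq. 2 and compares the result with the baseband term plus the modulated second-harmonic term of eq. 3.

```python
import numpy as np

# Illustrative values only (assumptions, not taken from the paper).
N = 4096                       # samples in the analysis window
n = np.arange(N)
k0, kb = 400, 10               # carrier and baseband tone, in FFT bins
w0 = 2 * np.pi * k0 / N        # carrier angular frequency per sample
wb = 2 * np.pi * kb / N        # baseband angular frequency per sample
vs = 1.0                       # signal amplitude

I = np.cos(wb * n)             # single-tone in-phase component
Q = 0.5 * np.sin(wb * n)       # single-tone quadrature component

# Eq. 2: modulated signal in I/Q form.
v = vs * (I * np.cos(w0 * n) - Q * np.sin(w0 * n))

# Output of an ideal second-order nonlinearity.
squared = v ** 2

# Eq. 3: baseband part and modulated second-harmonic part.
bb_mod = I**2 + Q**2           # modified baseband signal
i_new = I**2 - Q**2            # new in-phase component
q_new = 2 * I * Q              # new quadrature component
eq3 = 0.5 * vs**2 * bb_mod + 0.5 * vs**2 * (
    i_new * np.cos(2 * w0 * n) - q_new * np.sin(2 * w0 * n)
)

max_err = np.max(np.abs(squared - eq3))
print(f"largest difference between v^2 and eq. 3: {max_err:.2e}")
```

the two expressions should agree to within floating-point rounding, which confirms that the baseband term carries $I^2 + Q^2$ and the second-harmonic term carries $I^2 - Q^2$ and $2IQ$.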
the linearization approach suggested in this paper utilizes the complete modified baseband signal from eq. 3: the baseband signal in the appropriate form (the first term) together with the fundamental carrier second harmonic modulated by the adequately shaped baseband signal, whose modified (new) in-phase and quadrature-phase components are in the forms $I_{new} = I^{2} - Q^{2}$ and $Q_{new} = 2IQ$ (the second and third terms). fig. 1 shows the schematic diagram of the amplifier with the linearization circuit that forms, processes, and injects the linearization signals at the input and output of the amplifier transistor in the case of the combined linearization method. the baseband part of the signal for linearization, in the form $BB_{mod} = I^{2} + Q^{2}$, is multiplied by the coefficients $a_{\{ib/ob\}}$ for amplitude and polarity tuning. the baseband signals modified in this manner are then inserted over the series lc circuit at the gate and drain of the amplifier transistor. the other modified components of the linearization signal, $I_{new}$ and $Q_{new}$, which modulate the carrier second harmonic, are adjusted in amplitude by $a_{\{i2h/o2h\}}$ and in phase by $\theta_{\{i2h/o2h\}}$ across two branches. the prepared signals are then inserted at the input and at the output of the amplifier transistor through the bandpass filters. the letters i and o in the subscripts refer to the signals injected at the input and output of the amplifier transistor, respectively. the voltage at the gate, given as

$$v_{gs}(t) = v_s\,[I\cos(\omega_0 t) - Q\sin(\omega_0 t)] + a_{ib}\,(I^{2} + Q^{2}) + a_{i2h}\,e^{-j\theta_{i2h}}\,[(I^{2} - Q^{2})\cos(2\omega_0 t) - 2IQ\sin(2\omega_0 t)] + V_g \qquad (4)$$

is comprised of all signals injected: the fundamental useful signal, the gate bias signal, the modified baseband signal and the fundamental carrier second harmonic modulated by the adequately shaped baseband signal.
the voltage at the drain, given as

$$v_{ds}(t) = v_o\,[I\cos(\omega_0 t) - Q\sin(\omega_0 t)] + a_{ob}\,(I^{2} + Q^{2}) + a_{o2h}\,e^{-j\theta_{o2h}}\,[(I^{2} - Q^{2})\cos(2\omega_0 t) - 2IQ\sin(2\omega_0 t)] + V_d \qquad (5)$$

consists of the fundamental signal amplified linearly, the drain bias voltage and the signals for linearization, including the baseband signal and modulated second harmonic, which are appropriately modified, tuned, and injected together at the amplifier transistor drain. in eq. 5, $v_o\,(I\cos(\omega_0 t) - Q\sin(\omega_0 t))$ is the output signal at the fundamental frequency, and $V_g$ and $V_d$ are the dc bias voltages supplied at the gate and drain of the amplifier transistor, respectively.

fig. 1 schematic diagram of the amplifier linearized by the injection of the modified baseband signals processed in digital domain

the distorted output current is obtained by substituting eq. 4 and eq. 5 in eq. 1, yielding

$$i_{ds}(t)\big|_{IM3} = \Big[\tfrac{3}{4} v_s^{3} g_{m3} + 2 a_{ib} v_s g_{m2} + a_{i2h} e^{-j\theta_{i2h}} v_s g_{m2} + a_{ob} v_s g_{m1d1} + \tfrac{1}{2} a_{o2h} e^{-j\theta_{o2h}} v_s g_{m1d1} + a_{ib} v_o g_{m1d1} + \tfrac{1}{2} a_{i2h} e^{-j\theta_{i2h}} v_o g_{m1d1} + \tfrac{3}{2} v_s v_o^{2} g_{m1d2} + \tfrac{3}{2} v_s^{2} v_o g_{m2d1}\Big]\,(I^{2} + Q^{2})\,[I\cos(\omega_0 t) - Q\sin(\omega_0 t)] \qquad (6)$$

$$i_{ds}(t)\big|_{IM5} = \Big[\tfrac{5}{8} v_s^{5} g_{m5} + 3 a_{ib}^{2} v_s g_{m3} + \tfrac{3}{2} a_{i2h}^{2} e^{-j2\theta_{i2h}} v_s g_{m3} + a_{ob}^{2} v_s g_{m1d2} + 2 a_{ib} a_{ob} v_o g_{m1d2} + \tfrac{1}{2} a_{o2h}^{2} e^{-j2\theta_{o2h}} v_s g_{m1d2} + a_{i2h} a_{o2h} e^{-j(\theta_{i2h} + \theta_{o2h})} v_o g_{m1d2} + 2 a_{ib} a_{ob} v_s g_{m2d1} + a_{ib}^{2} v_o g_{m2d1} + \tfrac{1}{2} a_{i2h}^{2} e^{-j2\theta_{i2h}} v_o g_{m2d1} + a_{i2h} a_{o2h} e^{-j(\theta_{i2h} + \theta_{o2h})} v_s g_{m2d1}\Big]\,(I^{2} + Q^{2})^{2}\,[I\cos(\omega_0 t) - Q\sin(\omega_0 t)] \qquad (7)$$

where eq. 6 and eq. 7 refer to the third- and fifth-order intermodulation products of the drain current at the fundamental frequency, respectively.
the nonlinearity of the drain current in terms of the voltage between the drain and source, $v_{ds}$, is expressed by the coefficients $g_{d1}$ to $g_{d3}$. according to [16] and [17], it is assumed to have a negligible impact on the intermodulation products, so these terms have been omitted from the equations. the first term in eq. 6 represents the signal distorted by the cubic term of the amplifier ($g_{m3}$), which is considered dominant in generating the third-order intermodulation products, im3, and spectral regrowth [16], [17]. the $g_{m2}$ second-order transconductance nonlinear products of the fundamental signal and the linearization signals injected at the amplifier transistor gate are expressed as the second and third terms in eq. 6. the fourth and fifth $g_{m1d1}$ terms are the mixing products between the gate-source voltage of the fundamental signal and the voltage of the linearization signals fed at the amplifier transistor drain. additionally, the fundamental signal at the output of the transistor mixes with the linearization signals driven at the amplifier transistor input, generating the sixth and seventh terms. the drain current at im3 frequencies also includes the third-order mixing products, $g_{m1d2}$ and $g_{m2d1}$, between the drain and gate voltages of the fundamental signal (the eighth and ninth terms in eq. 6). since the output signal at the fundamental frequency is considered to be 180 degrees out of phase with respect to the input signal, these products reduce each other [17] due to their opposite phases.
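the cancellation mechanism behind the first two terms of eq. 6 can be illustrated with a reduced model. the following is a minimal sketch, assuming a memoryless gate-only polynomial ($g_{m1}$, $g_{m2}$, $g_{m3}$), baseband injection at the gate only, and hypothetical coefficient values chosen for illustration; it is not the ads model used in the paper. setting the sum of the $g_{m3}$ term and the $g_{m2}$ baseband-injection term to zero gives $a_{ib} = -3 v_s^{2} g_{m3} / (8 g_{m2})$ for this simplified case.

```python
import numpy as np

# Illustrative values only (assumptions, not taken from the paper).
N = 4096
n = np.arange(N)
k0, kb = 400, 10                 # carrier and baseband tone, in FFT bins
w0 = 2 * np.pi * k0 / N
wb = 2 * np.pi * kb / N

g1, g2, g3 = 1.0, 0.5, -2.0      # hypothetical polynomial coefficients
vs = 0.1                         # fundamental signal amplitude

I = np.cos(wb * n)               # single-tone I component
Q = np.zeros(N)                  # Q = 0 gives two tones at k0 - kb and k0 + kb
x = I * np.cos(w0 * n) - Q * np.sin(w0 * n)
bb_mod = I**2 + Q**2             # modified baseband signal


def drain_current(a_ib):
    """Gate-only polynomial model with baseband injection a_ib * bb_mod."""
    v_gs = vs * x + a_ib * bb_mod
    return g1 * v_gs + g2 * v_gs**2 + g3 * v_gs**3


def im3_level(i_ds):
    """Amplitude of the upper IM3 product, located at bin k0 + 3*kb."""
    spectrum = np.abs(np.fft.rfft(i_ds)) / (N / 2)
    return spectrum[k0 + 3 * kb]


# Coefficient that nulls (3/4) vs^3 g3 + 2 a_ib vs g2 in the reduced model.
a_ib_opt = -3 * vs**2 * g3 / (8 * g2)

im3_before = im3_level(drain_current(0.0))
im3_after = im3_level(drain_current(a_ib_opt))
reduction_db = 20 * np.log10(im3_before / im3_after)
print(f"IM3 reduction in the reduced model: {reduction_db:.1f} dB")
```

the residual im3 after injection comes from the higher-order product of the fundamental with the squared injected signal through $g_{m3}$, so the cancellation in this sketch is partial rather than complete.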
according to the previous analysis, it is possible to reduce the spectral regrowth caused by the third-order distortion of the fundamental signal by selecting the appropriate amplitude and polarity of the modified baseband signal injected at the input ($a_{ib}$) and output ($a_{ob}$) of the amplifier transistor, as well as by choosing the adequate amplitude and phase of the modified baseband signal that modulates the second harmonic injected at the input ($a_{i2h}$, $\theta_{i2h}$) and output ($a_{o2h}$, $\theta_{o2h}$) of the amplifier transistor. the first term in eq. 7 is formed due to the fifth-order nonlinearity of the amplifier ($g_{m5}$) and expresses the fifth-order intermodulation products of the drain current of the amplifier transistor. the mixed terms between the drain and gate, $g_{m1d2}$ and $g_{m2d1}$, are the products between the fundamental signal and the baseband linearization signal as well as the fundamental signal and the modulated second harmonic, which exist at the amplifier transistor input or output. it is supposed that these terms of the drain current at the im5 frequencies neutralize each other to a certain extent, which depends on the phase relations between the linearization signals driven at the gate and drain as well as on the intensity of the mixing products. however, the second and third $g_{m3}$ terms in eq. 7 may increase or decrease the im5 products, depending on the signs of the third- and fifth-order nonlinear coefficients $g_{m3}$ and $g_{m5}$ and on the linearization signal phase $\theta_{i2h}$.

3. linearization results

in order to estimate the effects of linearization, the proposed combined approach is applied to the broadband rf amplifier designed in agilent advanced design system (ads) software. the design process is based on the nonlinear met model of the freescale transistor mrf281s ldmosfet and includes synthesis of the input and output broadband matching circuits with lumped elements [7]. the source and load impedances of the amplifier transistor were determined by source-pull and load-pull analysis in ads, targeting high drain efficiency and maximum output power [8]. the amplifier circuit was designed to operate over the frequency range 0.7 ghz-1.1 ghz. we considered the influence of the bandpass filters (ideal elements from the ads library) connected to the gate and drain of the amplifier transistor to supply the linearization signal, which comprises the modulated fundamental carrier second harmonic, to the amplifier circuit. the series lc circuits that enable injection of the modified baseband signals for linearization into the amplifier were also included in the analysis. the gain, power-added efficiency (pae) and output power of the amplifier loaded by the lc circuits and bandpass filters, in terms of the input power, are shown in fig. 2 for single-tone excitation. it can be noted that the maximum gain observed at 1 ghz is slightly greater than 22 db, showing a variation of approximately 2 db with the change of the excitation signal frequency. the power-added efficiency at the labeled frequencies deviates from the pae at 1 ghz by at most 5%, whereas the maximum pae is 50% at 1 ghz at a maximum output power of around 36 dbm.

fig. 2 gain, pae and pout of the designed amplifier with the series lc circuits and bandpass filters loading gate and drain of the amplifier transistor

the designed power amplifier was tested for qam modulated signals whose spectrum contains two frequency components separated by 2 mhz up to 60 mhz with a centre frequency of 1 ghz. in the ads simulations, the timed source component named qam was used as the source of the signals. the analysis was carried out for different fundamental signal power levels at the amplifier input: 0 dbm, 3 dbm and 7 dbm.
the power levels of the third-order and the fifth-order intermodulation products, before and after the linearization, in terms of the frequency interval between the spectral components of the qam signal, are presented in fig. 3 and fig. 4 for different input power levels. it should be noted that the values of the linearization coefficients $a_{\{ib/ob\}}$ for amplitude and polarity tuning of the baseband signals, and the coefficients $a_{\{i2h/o2h\}}$ and $\theta_{\{i2h/o2h\}}$ for amplitude and phase adjustment of the linearization signals that modulate the carrier second harmonic, were obtained by the optimization process in ads for each considered input signal power level. the random optimization of the adjustable coefficients of the linearization signals was carried out with the aim of suppressing the third-order intermodulation products and restraining the fifth-order intermodulation products at levels below the reduced im3 products. we compared two cases: when the linearization was achieved by inserting only the modified baseband signals at the input and output of the amplifier transistor (the first or baseband method) [13], and when the linearization was performed by the combined approach, i.e. a simultaneous injection of the adequately modified baseband signals together with the second harmonic of the fundamental carrier modulated by another, differently modified baseband signal at the input and output of the amplifier transistor. the combined linearization approach encompasses the linearization methods mentioned above as the first (baseband) and second methods [13], [14]. figure 3 shows the third-order intermodulation products, im3, before and after the linearization for the compared linearization cases. it can be noted that a greater reduction of the im3 products for all input power levels over the considered power range was achieved by the combined linearization approach proposed in this paper.
the linearization method that exploits only the modified baseband signal was proposed and tested in [13] for a specific input power level of the qam signal and a range of input power for the wcdma signal. in this paper, we obtained the linearization results for a power range of the qam signal. the suppression of the im3 products attained with the combined approach suggested in this paper is greater by around 25 db to 15 db in comparison with the results of the baseband approach from [13] for frequency spacing between the qam spectral components from 2 mhz to 20 mhz. the im3 products reduction is at least around 25 db for a frequency separation of 20 mhz when the combined approach is applied. a general observation is that, as the input power increases and the frequency span becomes wider, the im3 products reduction decreases. the im3 products are lessened by 10 db in the case of 7 dbm input power and 60 mhz frequency span when the combined method is applied, which is still a much better result than the reduction of only a few decibels in the case of the baseband method. moreover, it should be stressed that the results achieved by the combined method are also notably better in comparison with the im3 products decrease reported in [14], wherein the fundamental carrier second harmonic modulated by the shaped baseband signal was utilized for the linearization. better linearization results are obtained over the whole signal power range and frequency spacing between the spectral components, which is especially significant for a larger spacing: e.g. for an input power of 7 dbm and a spacing of 20 mhz, the im3 products are lowered by only a few decibels in the linearization approach from [14], whereas in this paper the combined approach decreases the im3 products by 26 db. the influence of the performed linearization approaches on the fifth-order intermodulation products, im5, is presented in fig. 4.
the simulation shows that the im5 products are lessened by a minimum of 10 db for a frequency interval between signals of up to 20 mhz, while, by applying the modified baseband linearization signals, the im5 products stayed unaltered with respect to the state before the linearization for almost all considered input power levels and frequency spacings between the qam components. an exception is noted at an input power of 3 dbm, where the reduction of im5 is 6 db to 13 db.

fig. 3 third-order intermodulation products of the rf power amplifier for qam signal before and after the linearization for different input power levels: a) 0 dbm, b) 3 dbm, c) 7 dbm

fig. 4 fifth-order intermodulation products of the rf power amplifier for the qam signal before and after the linearization for different input power levels: a) 0 dbm, b) 3 dbm, c) 7 dbm

by application of the combined linearization method, the im5 products reduction significantly increases in relation to the previous method and depends on the power and frequency spacing between the qam signal spectral components in a similar manner to the results for the im3 products: the linearization results are significantly better than when only the modified baseband signal is used for linearization. the reduction goes from 14 db at 10 mhz signal spacing for the specified power range to 8 db at 60 mhz spacing and 3 dbm input power level. at 7 dbm input power, the im5 products decrease by 27 db at 10 mhz spacing, whereas they are retained at the level before the linearization at 60 mhz spectral component frequency spacing. in comparison with the results achieved in [14], where only the modulated second harmonics carried out the linearization, better results concerning the im5 products reduction are obtained by the combined method proposed herein.
namely, the results from [14] show that the im5 products stayed unchanged or were lowered by a few decibels in almost every analysed case (0 dbm to 7 dbm input signal power and 10 mhz to 20 mhz frequency spacing). it should be noted that in [14] the im5 products are not lessened with respect to the level before the linearization at 20 mhz spectral component spacing, whereas we have accomplished im5 suppression up to a frequency spacing of 60 mhz for 0 dbm and 3 dbm input power by the combined linearization approach. additionally, the influence of the suggested linearization methods was also investigated for the wcdma signal, which has a 1 ghz centre frequency, a spectrum width of 3.84 mhz and a peak-to-average power ratio (papr) of 6 db, over a range of fundamental signal average output power. the adequate values of the coefficients $a_{\{ib/ob\}}$, $a_{\{i2h/o2h\}}$ and $\theta_{\{i2h/o2h\}}$ for the required linearization results were determined by ads optimization. it should be noticed that the linearization coefficients obtained for the wcdma signal differ from the values obtained for the qam signals. the linearization results obtained by the combined linearization approach are also compared with the results gained by the method that uses only the modified baseband signal [13], as indicated in fig. 5 and fig. 6. the comparison between the baseband linearization approach and the combined approach leads to a similar observation for the wcdma signal as for the qam signal previously considered. in the baseband linearization case, the adjacent channel power ratio (acpr) is enhanced by around 10 db at power levels greater than 24 dbm in the range of dominant third-order intermodulation products at ±4 mhz offset from the carrier (fig.
5), while in the range of dominant fifth-order intermodulation products at ±8 mhz offset from the carrier, the acpr is retained at the levels before the linearization, with the exception of the higher observed power levels, where it is improved by a few decibels. compared with these results, the improvement of acpr obtained in this paper by the combined linearization method at ±4 mhz offset from the carrier is better by at most 5 db in relation to the baseband approach. additionally, the acpr improvement is better by a few decibels in the range of dominant fifth-order intermodulation products when the combined method is utilized than in the first approach. moreover, in reference [14], we analysed the acpr of the wcdma signal before and after the linearization by applying the modulated fundamental carrier second harmonic for only 11 dbm input signal power (output power of 29 dbm), where the acpr was enhanced by more than 10 db at ±4 mhz offset from the carrier. it can be seen from fig. 5 that the combined method gives around 15 db acpr improvement at that output power level.

fig. 5 acpr before the linearization (solid line) and after the linearization (dashed and dotted lines) at ±4 mhz offset from the carrier (the range of the dominant third-order distortion) for the wcdma digitally modulated signal in terms of average output power

fig. 6 acpr before the linearization (solid line) and after the linearization (dashed and dotted lines) at ±8 mhz offset from the carrier (the range of the dominant fifth-order distortion) for the wcdma digitally modulated signal in terms of average output power

4. conclusion

this paper presents a new linearization approach that combines variously modified baseband signals, where one modulates the second harmonic of the fundamental carrier.
the proposed linearization method uses the i and q signals that are adequately processed in the digital domain with the aim of forming the signals for the linearization, which are inserted into the gate and drain of the rf power amplifier transistor. the impact of the proposed combined linearization technique on the intermodulation products suppression is assessed in ads simulation for the qam signal whose i and q components are sinusoidal signals and whose spectrum contains two frequency components symmetrical around the carrier frequency. the linearization effects for different input power levels and different frequency spacings between the signal spectral components are examined for the proposed linearization method. also, the obtained results are compared to the results achieved when only the modified baseband signals are fed to the amplifier circuit. it may be noted that significantly better results are achieved in the reduction of the third- and fifth-order nonlinearity of the amplifier by the combined linearization method in comparison with the method that uses only the modified baseband linearization signal. the same may be inferred regarding the nonlinearity suppression by the combined method with respect to the results given in the literature that were reached by the method that performs linearization by the second harmonics of the fundamental carrier modulated by adequately modified baseband signals. additionally, the linearization influence is also demonstrated for the wcdma digitally modulated signal. the combined linearization method also gives a greater improvement of the acpr for the wcdma digitally modulated signal in the range of the dominant third-order as well as the fifth-order distortions relative to the two mentioned linearization approaches.
acknowledgement: this work was supported by the ministry of education, science and technological development of republic of serbia, the project number tr-32052.

references
[1] n. usachev, v. elesin, a. nikiforov, g. chukov, g. nazarova, d. sotskov, n. shelepin, v. dmitriev, “system design considerations of universal uhf rfid reader transceiver ics”, facta universitatis, series: electronics and energetics, vol. 28, no. 2, pp. 297-307, june 2015.
[2] p. kenington, high-linearity rf amplifier design. artech house, 2000, chapters 4-6, pp. 135–423.
[3] s. cripps, rf power amplifiers for wireless communications. artech house, 1999, chapter 9, pp. 251–282.
[4] m. k. kazimierczuk, rf power amplifiers. wiley, 2008, chapter 9, pp. 321–343.
[5] n. mizusawa, s. kusunoki, “third- and fifth-order baseband component injection for linearization of the power amplifier in a cellular phone”, ieee transactions on microwave theory and techniques, vol. 53, no. 11, pp. 3327-34, 2005.
[6] n. males-ilić, b. milovanović, đ. budimir, “effective linearization technique for amplifiers operating close to saturation”, international journal of rf and microwave computer-aided engineering, vol. 17, no. 2, pp. 169-78, 2007.
[7] a. đorić, n. males-ilić, a. atanasković, b. milovanović, “linearization of broadband microwave amplifier”, serbian journal of electrical engineering, vol. 11, no. 1, pp. 111-120, february 2014.
[8] a. đorić, a. atanasković, n. males-ilić, b. milovanović, “linearization of microwave power amplifier for broadband applications”, xlviii international scientific conference on information, communication and energy systems and technologies icest2013, ohrid, republic of macedonia, pp. 65-68, 2013.
[9] a. atanasković, n. maleš-ilić, b. milovanović, “linearization of power amplifiers by second harmonics and fourth-order nonlinear signals”, microwave and optical technology letters, wiley periodicals, inc., a wiley company, vol. 55, issue 2, pp. 425-430, february 2013.
[10] a. atanasković, n.
males-ilić, b. milovanović, “linearization of two-way doherty amplifier”, in proc. of microwave integrated circuits conference (eumic), european 2011, pp. 304-307.
[11] n. maleš-ilić, a. đorić, a. atanasković, “linearization of broadband two-way microstrip doherty amplifier”, facta universitatis, series: electronics and energetics, vol. 29, no. 1, pp. 127-138, march 2016.
[12] a. đorić, n. maleš-ilić, a. atanasković, b. milovanović, “linearization of broadband doherty amplifier”, in proceedings of the 11th international conference on telecommunications in modern satellite, cable and broadcasting services (telsiks 2013), niš, serbia, october 16-19, 2013, vol. 2, pp. 509-512.
[13] a. atanasković, n. maleš-ilić, a. đorić, m. živanović, “power amplifier linearization by modified baseband signal injection”, in proceedings of the 12th international conference on telecommunications in modern satellite, cable and broadcasting services (telsiks 2015), niš, serbia, 14-17 october, 2015, pp. 102-105.
[14] a. atanasković, n. males-ilić, k. blau, a. đorić, b. milovanović, “rf pa linearization using modified baseband signal that modulates carrier second harmonic”, microwave review, vol. 19, no. 2, pp. 119-124, december 2013.
[15] j. c. pedro and j. perez, “accurate simulation of gaas mesfet’s intermodulation distortion using a new drain-source current model,” ieee trans. microwave theory tech., vol. 42, pp. 25–33, january 1994.
[16] j. p. aikio and t. rahkonen, “detailed distortion analysis technique based on simulated large-signal voltage and current spectra”, ieee mtt trans. microwave theory tech., vol. 53, pp. 3057–3065, 2005.
[17] a. heiskanen, j. aikio, and t. rahkonen, “a 5-th order volterra study of a 30w ldmos power amplifier”, in proceedings of the international symposium on circuits and systems (iscas'03), bangkok, thailand, 2003, vol. 4, pp. 616–619.
facta universitatis series: electronics and energetics vol. 31, no. 1, march 2018, pp. 101-113 https://doi.org/10.2298/fuee1801101w

a system-on-chip 1.5 ghz phase locked loop realized using 40 nm cmos technology

weiyin wang 1, xiangjie chen 1, hei wong 2
1 institute of photonics and microelectronics, school of information sciences and electronic engineering, zhejiang university, hangzhou, china
2 department of electronic engineering, city university of hong kong, tat chee avenue, kowloon, hong kong

abstract. this work presents the design and realization of a fully-integrated 1.5 ghz sigma-delta fractional-n ring-based pll for system-on-chip (soc) applications. some design optimizations were conducted to improve the performance of each functional block, such as the phase frequency detector (pfd), voltage-controlled oscillator (vco), filter and charge pump (cp), and thereby of the whole system. in particular, a time delay circuit is designed for overcoming the blind zone in the pfd; an operational amplifier feedback structure is used to eliminate the current mismatch in the cp; a 3rd-order lpf is used for suppressing noise; and a current overdrive structure is used in the vco design. the design was realized with a commercial 40 nm cmos process. the core die size is about 0.041 mm^2. measurement results indicated that the circuit functions well for the locked range from 500 mhz to 1.5 ghz.

key words: pll, blind zone, current mismatch, ring oscillator

1. introduction

phase-locked loops (plls) are widely used in modern digital and communications systems for frequency synthesis, clock generation, retiming, clock signal recovery, etc. [1-6]. with the fractional-n feature, variable frequency clocks can be generated for the operation of different communication and digital sub-systems in a mixed-signal system-on-chip (soc) design.
to meet some specific system requirements, a pll was usually realized with an analog charge pump, an lc oscillator, and a loop filter with large component values, which require a large silicon area and a non-standard digital/analog cmos fabrication process [3, 6-7]. this work presents a fully integrated, fractional-n pll design solution based solely on a commercial 40 nm cmos process without significant performance tradeoff.

received april 7, 2017; received in revised form october 3, 2017
corresponding author: hei wong, department of electronic engineering, city university of hong kong, tat chee avenue, kowloon, hong kong (e-mail: eehwong@cityu.edu.hk)

figure 1 shows the system block diagram of a fractional-n pll. it consists of a phase frequency detector (pfd), charge pump (cp), low-pass filter (lpf), voltage-controlled oscillator (vco) and multi-modulus divider (mmd) for sigma-delta modulation. the whole structure is configured as a negative feedback system that makes the output signal frequency, fout, track the reference frequency, fref. if there is a difference in frequency or phase, the pfd outputs the control pulses up and dn, which are fed into the charge pump and converted into a single-ended current. the low-pass filter filters out the high-frequency components and outputs a dc voltage for the vco control. that is, the vco control voltage is proportional to the phase error between fref and the vco output frequency, fout, or its fraction as divided by the mmd. a generalized transfer function describing this feedback system is given by

$$H(s) = \frac{\dfrac{I_{cp}}{2\pi}\, H_{lpf}(s)\, \dfrac{K_{vco}}{s}}{1 + \dfrac{I_{cp}}{2\pi}\, H_{lpf}(s)\, \dfrac{K_{vco}}{s}\, \dfrac{1}{N}} \qquad (1)$$

where $I_{cp}/2\pi$ is the current generated by the charge pump per unit phase angle; $H_{lpf}(s)$ is the transfer function of the low-pass filter; $K_{vco}/s$ is the transfer function of the vco; and $1/N$ represents the frequency division for fractional-n operation.

fig. 1 major constitutions of a fractional-n pll.
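the behaviour of eq. 1 can be explored numerically. the following is a minimal sketch, assuming a simple series r-c loop filter impedance in place of the 3rd-order passive filter used in this work, and hypothetical component values chosen only for illustration (the charge pump current, vco gain, division ratio, r and c below are not taken from the paper).

```python
import math

# Hypothetical illustration values (assumptions, not taken from the paper).
I_CP = 100e-6                    # charge pump current, A
K_VCO = 2 * math.pi * 1e9        # VCO gain, rad/s per volt
N_DIV = 60                       # feedback division ratio
R, C = 10e3, 100e-12             # assumed series R-C loop filter


def h_lpf(s):
    """Impedance of a series R-C network (current in, voltage out)."""
    return R + 1.0 / (s * C)


def closed_loop(s, i_cp=I_CP, k_vco=K_VCO, n_div=N_DIV, lpf=h_lpf):
    """Closed-loop transfer function of eq. 1 evaluated at complex s."""
    forward = (i_cp / (2 * math.pi)) * lpf(s) * k_vco / s
    return forward / (1 + forward / n_div)


# Evaluate well inside and well outside the loop bandwidth.
h_low = closed_loop(1j * 2 * math.pi * 1.0)      # 1 Hz offset
h_high = closed_loop(1j * 2 * math.pi * 1e9)     # 1 GHz offset

print(f"|H| at 1 Hz  : {abs(h_low):.3f}")
print(f"|H| at 1 GHz : {abs(h_high):.3f}")
```

inside the loop bandwidth the forward gain is very large, so the magnitude of the closed-loop response approaches the division ratio n; far outside the bandwidth it falls towards the small forward gain.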
the system performance is governed by various characteristics of the building blocks. the blind zone of the pfd and the mismatch of the control currents of the charge pump will cause the vco to output an incorrect signal frequency and cause a large phase noise in the pll [8-13]. a high-order active filter would lead to better loop performance and high-order harmonics suppression. however, it may be costly to implement, and conventional integrated plls often use an external filter component, which allows the user to have custom designs. the characteristics of the vco will greatly affect the overall performance of the pll [7]. within the constraints of the given design rules and the available component types and values of the target 40 nm cmos process, we designed a fully integrated fractional-n pll with some special circuit configurations. in the pfd circuit, the conventional d-type flip-flop based phase detector circuit configuration [14] was used, where we introduced a delay line with approximately 280 ps delay to eliminate the blind zone [14]. in the cp circuit, with reference to some recently reported configurations [9-12], a feedback mechanism constituted by an op amp was established in order to make the up and dn currents track each other. to suppress system noise, a 3rd-order passive lpf was used. in the vco circuit, an io device was used to control the ring oscillator. preliminary results of this design have been reported in ref. [15]. this paper presents further detailed analysis of the circuit configuration, results, circuit constraints and methodologies for further performance improvement.

2. design methodologies

2.1. phase frequency detector

the phase frequency detector (pfd) generates a pulse output that is proportional to the phase difference between the input and output frequencies/phases [4, 12]. in this work, we construct the pfd using the conventional d-type flip-flop based circuit, which is typical in commercial pll designs [14].
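as a behavioural reference for the conventional flip-flop based pfd named above, the following is a minimal sketch of the usual three-state model (state 0: up = 0, dn = 0; state 1: up = 1, dn = 0; state 2: up = 0, dn = 1). it assumes ideal edges and an instantaneous reset, so it does not model the blind zone or the delay line; the function and variable names are hypothetical.

```python
# Outputs (UP, DN) associated with each PFD state.
OUTPUTS = {0: (0, 0), 1: (1, 0), 2: (0, 1)}


def pfd_step(state, edge):
    """Return the next PFD state after a rising edge.

    state: 0 (idle), 1 (UP active) or 2 (DN active)
    edge:  "ref" for a rising edge of fref, "out" for a rising edge of fout
    """
    if edge == "ref":
        # A reference edge sets or holds UP, or cancels an active DN.
        return 1 if state in (0, 1) else 0
    if edge == "out":
        # A feedback edge sets or holds DN, or cancels an active UP.
        return 2 if state in (0, 2) else 0
    raise ValueError(f"unknown edge: {edge!r}")


def run_pfd(edges, state=0):
    """Apply a sequence of edges and return the list of visited states."""
    trace = []
    for edge in edges:
        state = pfd_step(state, edge)
        trace.append(state)
    return trace


trace = run_pfd(["ref", "out", "out", "out", "ref", "ref", "ref"])
for state in trace:
    up, dn = OUTPUTS[state]
    print(f"state {state}: UP={up} DN={dn}")
```

a repeated edge of the same kind leaves the state unchanged, which is how the detector keeps pumping in one direction while the two inputs differ in frequency.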
figure 2(a) gives the specific circuit of the pfd. the d flip-flops make the pfd sensitive to the rising edges of fref or fout only. in connection with the charge pump (cp), a high output of the top d flip-flop enables the up current of the charge pump, whereas the output of the lower d flip-flop enables the dn current. at the rising edge of the reference signal fref, the up signal changes from 0 to 1. it may remain high even after the rising edge of the feedback signal fout, which changes the dn output from 0 to 1 as well. at this point, both signals are fed into the and gate, which resets the up and dn signals simultaneously. this results in a blind zone for the up and dn signals. the blind zone makes the cp insensitive to small phase errors and results in a large phase noise of the pll. thus, a measure to eliminate the blind zone needs to be introduced. here we introduce a delay line (dl) of about 280 ps for this purpose. with this configuration, dn is reset a period of dl after the up signal, which eliminates the blind zone. the detailed operation of this circuit can be understood with the aid of the state diagram given in fig. 2(b). figure 2(b) highlights the three states of the up and dn signal generation for the pfd. when a rising edge of fout is detected, there is a positive transition from the cp. when the system starts up, the pfd is in “state 0” (up = 0, dn = 0). when a rising edge of fref arrives, the pfd changes from “state 0” to “state 1” (up = 1, dn = 0). if a rising edge of fout then arrives, the pfd goes back to “state 0”; if another rising edge of fref is detected instead, the pfd stays in “state 1”. when the system is in “state 0” and a rising edge of fout arrives, the pfd goes into “state 2” (up = 0, dn = 1). at this point, a rising edge of fref switches it back to “state 0”, whereas a second rising edge of fout keeps the pfd in “state 2”.
2.2.
charge pump
the control signals up and dn generated by the phase detector are fed into the charge pump (cp) to control its current flows so as to produce a single current, which is further converted into a voltage for the vco control via the low-pass filter. the total charge, $q_{cp}$, is proportional to the durations of the up and dn signals, namely
$$q_{cp} = i_{up}\, t_{up} - i_{dn}\, t_{dn} \qquad (2)$$
where $i_{up}$ and $i_{dn}$ are the up and dn currents, respectively, and $t_{up}$ and $t_{dn}$ are the pulse durations of the up and dn controls in the cp, respectively.
fig. 2 (a) schematic of phase frequency detector; and (b) state diagram showing the operation flow of the phase frequency detector.
in the ideal case, $i_{up}$ is always equal to $i_{dn}$. in a real circuit, the currents may differ due to device mismatch, charge injection, clock feedthrough, channel-length modulation, etc. [8-13]. this issue becomes even worse in nano cmos circuits, as channel-length modulation is more significant. to eliminate the mismatch between the up and dn currents, several circuit configurations such as the drain-switching charge pump, current steering charge pump, source-switching charge pump and cascade current source charge pump were proposed [8-13]. in our work, we incorporate a comparator so that the up and dn currents track each other. as shown in fig. 3, the drains of m1 and m3 are tied to the noninverting and inverting inputs of the op amp, respectively, which makes the drain voltages of m1 and m3, and then of m2 and m4, equal. it can be readily shown that the up current (id, m3) and the dn current (id, m4) are equal and are both governed by the mirror current of ibias. they are not affected by the output voltage, channel-length modulation or size mismatch of the transistors.
fig.
3 schematic of the cp structure with a feedback path constituted by an operational amplifier so that the up and dn currents track each other.
2.3. low-pass filter
as shown in fig. 1, the low-pass filter (lpf) shapes the error current for vco control. it governs the damping factor and natural frequency of the system. for the sake of flexibility and to save chip area, most commercial plls leave the lpf as an externally connected circuit and thus allow the maximum flexibility for a specific application design. for soc applications, this work implements the lpf with on-chip components and does not treat the filter as a tunable building block. to keep a reasonable system performance, the simple third-order passive rc filter shown in fig. 4 is used. the procedure for determining the component values is given in appendix a.
fig. 4 schematic of the low-pass filter circuit used in this work.
2.4. voltage-controlled oscillator
the voltage-controlled oscillator is one of the most important building blocks governing the performance of the pll. the key performance parameters of concern include: (a) frequency tuning range: it determines the operation range of the pll. (b) tuning gain: expressed in hz/v, it indicates the change in output frequency per unit change in control voltage. it governs the overall gain of the whole system. (c) phase noise level: the phase noise level of the vco is of particular importance in some applications, such as frequency synthesizers. it affects the stability of the system and is the major jitter source. many advanced vco circuit configurations based on rc or lc structures have been proposed [1-3, 7]. in most high-performance oscillators and plls, lc structures are favored. however, a high-quality-factor inductor requires a thick metal layer, which is not available in most cmos processes. it also requires a large silicon area.
an rc-based oscillator also requires a large chip area, and it is also constrained in high-frequency operation. a ring-based vco has poor frequency stability and large phase noise, and it is more vulnerable to power and temperature fluctuations [7]. in addition, it was difficult to achieve very high frequency operation, so it was seldom used in high-performance plls. the advantage of the ring-based vco is that it is simple in circuit design: it is simply built with some cascaded inverters. a ring oscillator is very compact and can be realized with any cmos process. with the availability of nano cmos technology and some digital circuit techniques, high-performance and high-frequency ring-based plls have been obtained [6]. another factor that makes the ring-based pll more attractive is the need for a low-cost and readily available pll for full cmos system integration and soc applications. a ring-based vco together with nano cmos technology does provide a low-cost implementation of a pll with reasonable performance. in the present design, because 40 nm gate length devices are available, high-speed operation can be readily obtained. the performance of the circuit is still acceptable with such a simple vco design, as will be demonstrated later. in our design, we implemented the vco with five-stage cmos inverters. to achieve a higher operation frequency and better stability, the circuit is overdriven with a large-size io mosfet, m1, which operates at the high analog supply voltage avdd of 2.5 v. as shown in fig. 5, being biased at 2.5 v, m1 produces a larger control current so as to generate a higher oscillation frequency. it also reduces the effects due to supply voltage, temperature, and process parameter fluctuations.
fig. 5 schematic of a five-stage voltage-controlled ring oscillator with current overdrive.
note that the core ring oscillator, constituted by the five-stage inverters, was operated at the digital supply voltage dvdd = 1.1 v. to prevent the vco output snw from exceeding dvdd, an operational amplifier op and a level limiter m2 were used. if snw exceeds dvdd, transistor m2 is turned on so as to lower the snw voltage level to a value equal to dvdd.
3. testing and validating
we have realized the designed pll with a commercial 40 nm cmos process. because such short-gate-length transistors are available, high-frequency operation can be readily obtained. the layout of the design is shown in fig. 6. the major building blocks (pfd, cp, lpf, vco, mmd and sigma-delta modulator) are highlighted. the chip size of the core functional block (excluding pads and ios) is about 250 μm × 165 μm or 0.041 mm², which is half the size of the latest digital fractional-n pll realized using 65 nm technology [6]. the circuit is operated with digital and analog supply voltages of 1.1 v and 2.5 v, respectively. the locked range of the chip is from 500 mhz to 1.5 ghz. the overall power consumptions are 1.5 mw and 2.8 mw at 750 mhz and 1.5 ghz, respectively.
fig. 6 (a) layout of a system chip with the embedded pll designed in this work; (b) layout of the designed pll. the size of the chip is 250 μm × 165 μm or 0.041 mm².
it can be seen that although the low-pass filter is rather simple from the circuit configuration point of view, it occupies the largest portion of the chip area for realizing the few passive capacitive and resistive components. compared to other soc designs, the chip area of the lpf is larger because of the use of a third-order filter. the vco is comparatively compact because of the use of the ring oscillator structure and minimum-size inverters. the preliminary testing results of the fabricated pll chip are shown in fig. 7. figure 7(a) shows the measured output phase noise for the vco output at 1.5 ghz.
the low-frequency (1 khz) and high-frequency (1 mhz) noise levels are -62 dbc/hz and -81 dbc/hz, respectively. figure 7(b) depicts the output spectrum of the vco at 1.5 ghz with a 300 mhz frequency span. the peak value is 13.42 dbm, and a number of spurious peaks are found at various frequencies such as ±20, ±30, ±40, ±50 mhz and their multiples. the spurs are likely due to the clock frequency of the signal source and to the power line frequency as well. the characteristics are not very good as compared with other integrated plls [6]. however, when the operation frequency is lowered, better characteristics are found. figure 8 plots the phase noise levels as a function of vco frequency. phase noise is smaller for smaller values of fractional n. at 500 mhz, the 1 khz phase noise reduces to -71 dbc/hz. the phase noise increases as the frequency goes up. in general, the low-frequency noise levels, less than -62 dbc/hz, are acceptable for a ring-based pll in some general applications. the high-frequency phase noise is less than -81 dbc/hz for all investigated vco frequencies. in addition, the time-domain jitter noise was also measured. fig. 9 plots the root-mean-square value of the jitter noise as a function of oscillator frequency. the peak value was about 13 ps at 500 mhz. the jitter noise levels are smaller at other frequencies. figure 10 plots the output amplitude as a function of vco frequency. as shown in fig. 10, the output amplitude was -8.6 dbm when the output was set to 500 mhz, and it decreases as the frequency increases. figure 11 shows the change of the primary spurs (at 20 mhz) for various vco frequencies. the spur amplitude is less than -64 dbm for the 500 mhz center frequency, and the largest spur was found for the 1.2 ghz output spectrum. the levels of spurs in our pll are high and need to be suppressed.
although it does not directly show up in the present measured data, one can readily anticipate that the performance of the lpf and vco could be one of the major sources of the degradation of the system characteristics. these are the major performance trade-offs for the compact and simple design in the sense of soc applications. the loop dynamics of the present design cannot be adjusted because the lpf components are fixed, and the loop gain may be lowered because of the losses in the parasitic passive components. in addition, the integrated capacitors usually have a large leakage current because of the use of an ultrathin dielectric. this is even worse in the 40 nm technology. as a consequence, a constant dc level for the low-voltage vco is hard to maintain at the lpf output, and that causes some undesirable frequency drifts of the vco. hence further improvement should focus on the lpf design, such as the use of an active filter to reduce the chip size and yet to suppress the effect of gate leakage. the second issue that needs special attention is the stability of the vco against power supply and temperature fluctuations. the ring oscillator is known to have poor power supply and temperature stability. in the 40 nm technology, the digital power supply voltage (dvdd) has been scaled down to 1.1 v. the low supply voltage makes the ring-based vco more sensitive to supply voltage and temperature fluctuations. here we use the io devices for driving and level limiting with an operation voltage of 2.5 v (avdd), which should help in alleviating these effects. further experimental validation and detailed characterization are under investigation.
fig. 7 phase noise characteristics and output spectrum of the ring-based vco in the fabricated pll operated at 1.5 ghz: (a) phase noise; and (b) output amplitude response.
fig.
8 plot of measured phase noise of vco output at different frequencies.
fig. 9 measured root-mean-square value of jitter for the whole frequency range of the vco (axes: vco frequency (mhz), rms jitter (ps)).
fig. 10 peak amplitudes of the vco running at different frequencies.
fig. 11 levels of primary spur observed at the vco’s output.
4. conclusion
in this work, a fully-integrated compact cmos fractional-n pll was designed and realized. we adopted various strategies for most of the key functional blocks so as to improve the overall performance of the pll. in particular, a time delay circuit was introduced to the phase detector for overcoming the blind zone in control signal generation; the charge pump characteristics were improved by using an operational amplifier to mirror the up and dn currents so as to alleviate the effects of current mismatch and channel length modulation of short-channel devices. the major strategy in the performance, cost and technology trade-off is the use of a five-stage ring-based vco in the design. a current overdrive structure was introduced by using an io device and the analogue supply voltage, which allows a better frequency range and alleviates the possible degradations due to power supply and temperature fluctuations. the design is compact in size and has been realized with a 40 nm cmos process. measurement results indicated that the circuit functions well for the locked range from 500 mhz to 1.5 ghz.
references
[1] v. ravinuthula and s. finocchiaro, “a low power high performance pll with temperature compensated vco in 65nm cmos”, in proceedings of the ieee radio frequency integrated circuits symp., 2016, pp. 31-34.
[2] d. liao, h. wang, f. f. dai, y. xu, r.
berenguer, “an 802.11 a/b/g/n digital fractional-n pll with automatic tdc linearity calibration for spur cancellation”, in proceedings of the ieee radio frequency integrated circuits symp., 2016, pp. 134-137.
[3] s. ikeda, h. ito, a. kasamatsu, y. ishikawa, t. obara, n. noguchi, et al., “an 8.865-ghz -244db-fom high-frequency piezoelectric resonator-based cascaded fractional-n pll with sub-ppb-order channel adjusting technique”, in proceedings of the ieee symp. vlsi circuits, 2016, pp. 1-2.
[4] t. li, x. fan, and z. hua, “cmos phase frequency detector and charge pump for multi-standard frequency synthesizer”, in proceedings of the ieee int’l conf. microwaves, communications, antennas and electronic systems, 2015, pp. 1-4.
[5] m. ghasemzadeh, s. mahdavi, a. zokaei, and k. hadidi, “a new adaptive pll to reduce the lock time in 0.18 μm technology”, in proceedings of the 23rd international conference mixed design of integrated circuits and systems, 2016, pp. 140-142.
[6] a. elkholy, s. saxena, r. k. nandwana, a. elshazly, p. k. hanumolu, “a 2.0-5.5 ghz wide bandwidth ring-based digital fractional-n pll with extended range multi-modulus divider”, ieee j. solid-state circuits, vol. 51, pp. 1771-1784, 2016.
[7] m.-t. hsieh, j. welch, g. e. sobelman, “pll performance comparison with application to spread spectrum clock generator design”, analog integr. circ. sig. process, vol. 63, pp. 197-216, 2010.
[8] a. g. amer, s. a. ibrahim, and h. f. ragai, “a novel current steering charge pump with low current mismatch and variation”, in proceedings of the ieee international symp. circuits and systems, 2016, pp. 1666-1669.
[9] s. g. kim, j. rhim, d. h. kwon, m. h. kim, and w. y. choi, “a low-voltage pll with a current mismatch compensated charge pump”, in proceedings of the international soc design conference, 2015, pp. 15-16.
[10] m. k. hati and t. k.
bhattacharyya, “a pfd and charge pump switching circuit to optimize the output phase noise of the pll in 0.13μm cmos”, in proceedings of the international conference on vlsi systems, architecture, technology and applications, 2015, pp. 1-6.
[11] n. joram, r. wolf, and f. ellinger, “high swing pll charge pump with current mismatch reduction”, electron. lett., pp. 661-663, 2014.
[12] y. he, x. cui, c. l. lee, and d. xue, “an improved fast acquisition pfd with zero blind zone for the pll application”, in proceedings of the ieee international conference on electron devices and solid-state circuits, 2014, pp. 1-2.
[13] m.-s. shiau, c.-h. cheng, h.s. hsu, h.c. wu, h.-h. weng, j.j. hou, r. c. sun, “design for low current mismatch in the cmos charge pump”, in proceedings of the international soc design conference, 2013, pp. 310-31.
[14] analog devices tutorial mt-086, fundamentals of phase locked loops (plls), analog devices, 2009.
[15] w. wang, x. chen and h. wong, “1.5 ghz sigma-delta fractional-n ring-based pll realized using 40 nm cmos technology for soc applications”, in proceedings of the international conference on electronics, information, and communications, phuket, thailand, january 11-14, 2017.
appendix a
according to fig. 1, the lpf transfer function is defined as the change in voltage at the tuning port of the vco divided by the current from the charge pump. for the 3rd-order lpf given in fig. 4, the transresistance function is:
$$z(s) = \frac{1 + s t_2}{s\,(s^2 a_2 + s a_1 + a_0)} \qquad (a1)$$
where $a_0 = c_1 + c_2 + c_3$, $a_1 = c_2 r_2 (c_1 + c_3) + c_3 r_3 (c_1 + c_2)$, $a_2 = c_1 c_2 c_3 r_2 r_3$, and the time constant of the zero is $t_2 = r_2 c_2$. by expressing the transfer function in terms of poles and zeroes, we have:
$$z(s) = \frac{1 + s t_2}{s\, a_0\, (1 + s t_1)(1 + s t_3)} \qquad (a2)$$
where the poles $t_1$ and $t_3$ are given, respectively, by $t_1 = c_1 c_2 r_2/(c_1 + c_2)$ and $t_3 = r_3 c_3$.
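the coefficients of (a1) can be checked numerically against a direct nodal analysis of the filter. the sketch below assumes the usual third-order passive topology (c1 to ground at the charge-pump node, a series r2-c2 branch to ground at the same node, then r3 in series to the output node loaded by c3); this topology is an assumption made here because fig. 4 is not reproduced in the text, and the component values are hypothetical.

```python
import math

def lpf_coefficients(c1, c2, c3, r2, r3):
    """Coefficients a0, a1, a2 and zero time constant t2 as defined for (a1)."""
    a0 = c1 + c2 + c3
    a1 = c2 * r2 * (c1 + c3) + c3 * r3 * (c1 + c2)
    a2 = c1 * c2 * c3 * r2 * r3
    t2 = r2 * c2
    return a0, a1, a2, t2

def z_polynomial(s, c1, c2, c3, r2, r3):
    """Transimpedance evaluated from the closed form (a1)."""
    a0, a1, a2, t2 = lpf_coefficients(c1, c2, c3, r2, r3)
    return (1 + s * t2) / (s * (a2 * s**2 + a1 * s + a0))

def z_network(s, c1, c2, c3, r2, r3):
    """Same quantity from nodal analysis of the assumed ladder network."""
    y_shunt = s * c1 + 1 / (r2 + 1 / (s * c2))   # c1 in parallel with r2-c2
    y_tail = 1 / (r3 + 1 / (s * c3))             # r3 followed by c3
    v_in = 1 / (y_shunt + y_tail)                # input node voltage for 1 A
    return v_in * (1 / (s * c3)) / (r3 + 1 / (s * c3))  # divider to the output

# Hypothetical component values, for illustration only.
parts = dict(c1=10e-12, c2=100e-12, c3=2e-12, r2=20e3, r3=10e3)
s = 2j * math.pi * 1e6
zp = z_polynomial(s, **parts)
zn = z_network(s, **parts)
print(abs(zp), abs(zn))
```

agreement between the two evaluations at an arbitrary frequency is a quick sanity check on the coefficient expressions before they are used for component sizing.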
the phase margin can be readily determined from (a2) with:
$$\phi = 180^\circ + \tan^{-1}(\omega_c t_2) - \tan^{-1}(\omega_c t_1) - \tan^{-1}(\omega_c t_3) \qquad (a3)$$
by setting the derivative of the phase margin to zero, the following relationships can be obtained:
$$c_1 = \frac{a_2}{t_2^2}\left(1 + \sqrt{1 + \frac{t_2}{a_2}\,(t_2 a_0 - a_1)}\right) \qquad (a4)$$
$$c_3 = \frac{-t_2^2 c_1^2 + t_2 a_1 c_1 - a_2 a_0}{t_2^2 c_1 - a_2} \qquad (a5)$$
other components can be evaluated with the following equations:
$$c_2 = a_0 - c_1 - c_3 \qquad (a6)$$
$$r_2 = \frac{t_2}{c_2} \qquad (a7)$$
$$r_3 = \frac{a_2}{c_1 c_3 t_2} \qquad (a8)$$
facta universitatis series: electronics and energetics vol. 31, no 1, march 2018, pp. 155-155 https://doi.org/10.2298/fuee1801155e
retraction
mohammad maghsoudloo, hamid r. zarandi parallel execution tracing: an alternative solution to exploit under-utilized resources in multi-core architectures for control-flow checking. facta universitatis, series: electronics and energetics (fu elec energ), vol. 29, no 2, june 2016, pp. 243-260. doi: 10.2298/fuee1602243m
the article: parallel execution tracing: an alternative solution to exploit under-utilized resources in multi-core architectures for control-flow checking. mohammad maghsoudloo, hamid r. zarandi. facta universitatis, series: electronics and energetics, vol. 29, no 2, june 2016, pp. 243-260, doi: 10.2298/fuee1602243m, repeats 62% data already published in: an efficient adaptive software-implemented technique to detect control-flow errors in multi-core architectures. mohammad maghsoudloo, hamid r. zarandi, navid khoshavi. microelectronics reliability, vol. 52, issue 11, november 2012, pp. 2812-2828, doi: doi.org/10.1016/j.microrel.2012.03.033 without any referencing.
link to the retracted article doi: 10.2298/fuee1602243m
received november 9, 2017
http://www.sciencedirect.com/science/article/pii/s0026271412001175#!
https://doi.org/10.1016/j.microrel.2012.03.033 http://casopisi.junis.ni.ac.rs/index.php/fuelectenerg/article/view/981
facta universitatis series: electronics and energetics vol. 31, no 2, june 2018, pp. 287-301 https://doi.org/10.2298/fuee1802287b
vision inspection and monitoring of wind turbine farms in emerging smart grids
mahdi bahaghighat, seyed ahmad motamedi electrical engineering faculty, amirkabir university of technology, tehran, iran
abstract. today, smart grids, as the goal of the next generation power grid system, span wide and new aspects of power generation, from distributed and bulk power generators to the end-user utilities. there are many advantages to developing these complex and multilayer systems of systems, such as increased agility, reliability, efficiency, privacy and security for both the energy and ict sections of the smart grid architecture. in emerging smart grids, the communication infrastructures play a main role in grid development, and as a result multimedia applications are more practical for future power systems. in this work, we introduce our method for monitoring and inspection of wind turbine (wt) farms in smart grids. in our proposed system, a thermal vision camera is embedded on a wireless sensor node for each wt to capture appropriate images and send video streams to the coordinator. the coordinator receives the video frames to perform machine vision inspection (vi) and monitoring. in our constructed model, turbine blade velocity estimation is targeted by detecting two important landmarks in the image, named hub and blade. by tracking the blade in consecutive frames and based on the proposed scoring function, we can estimate the velocity of the turbine blade.
the obtained results clearly indicate that accurate extraction of the hub and blade positions leads to error-free estimation of the turbine blade velocity.
key words: vision inspection, thermal vision, gabor wavelets, template matching, wind turbine and smart grids.
1. introduction
smart grids (sgs), as the future network of legacy power grids, are sophisticated systems of systems that support bidirectional power and data flows. sgs benefit renewable energies such as wind energy and should be eco-friendly [1]. the self-healing, two-way communication, decentralization, and predictive reliability of sgs make electricity network operation and maintenance more manageable and easy [2]. in europe, a total of 211 sg projects in the r&d phase are worth approximately € 820 million, and 248 projects under d&d have a total budget of around € 2320 million [3]. these investments clearly indicate the importance of smart grids in the world’s future.
received june 10, 2017; received in revised form october 3, 2017 corresponding author: seyed ahmad motamedi electrical engineering faculty, amirkabir university of technology, tehran, iran (e-mail: motamedi@aut.ac.ir)
the european roadmap for smart grids is based on the sgam model, which includes five interoperability layers in five domains and six zones [4]. fig. 1 shows the sgam architecture for smart grids. in addition to european standards, there are some strong models such as nist framework1 and nist framework2 [5] that are being developed by related organizations in the usa. these comprehensive models cover all parts of sgs. in all mentioned sg models, telecommunication infrastructures play a major role in the development of next generation power grid systems.
fig.
1 sgam architecture for smart grids [4]
these infrastructures can be deployed for multimedia applications [2], but there are a number of requirements that must be addressed in order to have fully robust, reliable and secure multimedia streaming in smart grid networks [2]. the most important requirements that should be considered in the sg communication backbone are latency, frequency ranges, reliability, data rate, security, and throughput. for example, in [6] a total throughput in the range of 3–10 mbps is estimated for sg communication systems in many applications as well as multimedia communications. in addition, they propose a frequency range under 2 ghz to have a low-cost solution that can overcome line-of-sight issues, e.g., foliage, rain fade, and penetration through walls [6]. in the sg architecture, several major network types are defined in the literature [6-8]: home area network (han), building area network (ban), neighborhood area network (nan), field area network (fan) and wide area network (wan). this multi-tier network structure is illustrated in fig. 2. each network area has its own technical restrictions, and they have mutual impacts on the others. there are now increasing demands for energy monitoring through sg communication networks. our work aims to develop a special wt energy monitoring system based on multimedia applications. we assume that wt farms are located in a nan and/or fan. in nan/fan applications, data rates usually vary between 100 kbps and 10 mbps [6], which is well suited to monitoring and video inspection purposes.
fig. 2 different network types in sgs
in addition to energy monitoring and its essential priority in sgs, wind turbines (wts), as extremely high-cost devices, should have advanced maintenance services [9]. fig. 3 shows a comparison between traditional and modern maintenance approaches.
for a wt whose components are made in particular from carbon fiber reinforced plastic (cfrp), it is essential to have a monitoring infrastructure and intelligent vision inspection (vi) during in-service operation [10]. in order to increase the lifetime of the wt farm and reduce the maintenance cost, accurate predictions of system faults and failures are needed. in future grids, these predictions can be made available by advanced nondestructive testing (ndt) approaches such as intelligent vi. blade monitoring and diagnostics based on intelligent vision inspection are a complex challenge, and until now there is no work on them in the literature. therefore, we conduct our research on the subject and introduce our architecture to monitor wind turbine (wt) farms in smart grids. in our proposed system, a thermal vision camera [10] is embedded on a wireless sensor node for each wt to capture appropriate images and stream video to the coordinator. the coordinator receives the video streams to perform machine vision inspection and monitoring tasks.
fig. 3 comparison between traditional and intelligent maintenance methods [9].
thermal cameras are passive sensors that capture the infrared radiation emitted by all objects with a temperature above absolute zero. deploying this type of sensor in vision systems eliminates the illumination problems of normal greyscale and rgb cameras [11]. these types of cameras were originally developed as surveillance and night vision tools for military systems, but recently their prices have dropped significantly. this means that a broader field of applications can use these cameras. in this work, turbine blade velocity estimation is targeted, and it can also be used to estimate the power generation of wt farms. to tackle this challenging issue, we define two landmarks in a received image that are related to the hub and the blade of a wt.
accurate and robust estimation of these two objects in consecutive frames leads to the calculation of the turbine blade velocity, which is directly related to the power generated by a wt farm. furthermore, our proposed structure based on a thermal vision camera and sensor node is well suited to further vision inspections (vi) such as mechanical deformations, surface defects, and overheated components in rotor blades, nacelles, slip rings, yaw drives, bearings, gearboxes, generators, and transformers [9].
2. proposed model
in this section, we elaborate the details of our model for real-time video streaming from a sensor node to the coordinator node. our main motivation for this architecture is that sensor nodes are usually low-end devices with limited hardware resources [12-14], so high-complexity vi algorithms cannot be expected to run with acceptable performance on a light sensor node. as a result, for practical purposes, we propose to run our turbine blade velocity estimation at the coordinator node, which is a medium- or high-end receiver node. in the following, we describe our proposed blade velocity estimation algorithm.
2.1. turbine blade velocity estimation system model
the coordinator, as the receiver node, receives the video output from the thermal vision camera. in a predefined time window, it takes each frame and tries to find special objects in the current image as landmarks. as mentioned before, we define two landmarks in an image that are related to the hub and the blade of a wt (fig. 4).
fig. 4 condition monitoring and diagnosis for wind turbine
firstly, the fast and robust correlation coefficient template matching approach is used for hub detection and localization in the received image, and then the mass center of the hub is calculated to set the coordinate system (cx,cy).
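the mass center of a detected region is commonly computed from raw image moments. the following is a minimal sketch of that calculation in plain python; the function names, the convention that x is the column index and y the row index, and the small test image are assumptions made for illustration.

```python
def raw_moment(image, i, j):
    """Raw moment m_ij = sum over pixels of x**i * y**j * intensity.

    `image` is a list of rows; x is the column index and y the row index
    (an assumed convention for this sketch).
    """
    return sum((x ** i) * (y ** j) * value
               for y, row in enumerate(image)
               for x, value in enumerate(row))

def mass_center(image):
    """Intensity-weighted centroid (cx, cy) = (m10 / m00, m01 / m00)."""
    m00 = raw_moment(image, 0, 0)
    if m00 == 0:
        raise ValueError("image has no intensity; centroid is undefined")
    return raw_moment(image, 1, 0) / m00, raw_moment(image, 0, 1) / m00

# Hypothetical 4 x 5 hub sub-image, brightest at column 2, row 2.
hub = [
    [0, 0, 0, 0, 0],
    [0, 1, 2, 1, 0],
    [0, 2, 4, 2, 0],
    [0, 1, 2, 1, 0],
]
cx, cy = mass_center(hub)
print(cx, cy)
```

because the centroid is weighted by intensity, it stays near the warm core of the hub in a thermal image even when the cropped region is not perfectly centred.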
based on the estimated hub position, a bounding box with height bh and width bw is assigned to the point on the circumference of a circle with predefined radius and angle in the obtained coordinate system. this primary phase makes our algorithm adaptive to variations of the hub position in a sequence of images. next, our gabor wavelet filter banks, which are tuned according to the blade parameters, are applied to the masked sub image. consequently, the presence or absence of a blade in this image is recognized by analysing the gw coefficients. repeating this procedure in consecutive frames and using some smoothing approaches leads to an accurate estimation of the blade velocity. in fig. 5, the proposed method for blade detection based on gabor wavelets is described. in addition, fig. 6 gives an overall view of our model to clarify further details of our algorithm.
fig. 5 proposed method for blade detection based on gabor wavelets
2.2. hub detection based on template matching
in the first step of our algorithm, it is necessary to localize the hub position as the first landmark object in the whole input image. this is the fundamental step, and it has a direct impact on the performance of the next steps. in this work, we use template matching (tm). tm plays an important role in many image processing applications. a tm approach seeks the point at which a sub image, known as the template, best resembles its coincident region within a source image [15]. there are many methods for pattern and template matching [15-18], but for simplicity we use correlation coefficient [16] template matching to find a hub in an input image.
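correlation coefficient template matching can be sketched as an exhaustive search: the template is slid over the image and every candidate window is scored with pearson's correlation coefficient. the code below is a plain-python illustration with hypothetical data and function names; it is not the implementation used in this work, and a practical system would rely on an optimized image processing library.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    num = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    den = math.sqrt(sum((p - mean_a) ** 2 for p in a)
                    * sum((q - mean_b) ** 2 for q in b))
    return num / den if den else 0.0  # flat windows carry no pattern

def match_template(image, template):
    """Return ((left, top), score) of the best-matching window."""
    th, tw = len(template), len(template[0])
    flat_t = [v for row in template for v in row]
    best_pos, best_score = (0, 0), -2.0
    for top in range(len(image) - th + 1):
        for left in range(len(image[0]) - tw + 1):
            window = [image[top + dy][left + dx]
                      for dy in range(th) for dx in range(tw)]
            score = pearson(window, flat_t)
            if score > best_score:
                best_pos, best_score = (left, top), score
    return best_pos, best_score

# Hypothetical data: the image contains a brighter, higher-contrast copy
# (2 * template + 10) of the template at column 2, row 1.
template = [[1, 3],
            [2, 8]]
image = [
    [5, 5, 5, 5, 5],
    [5, 5, 12, 16, 5],
    [5, 5, 14, 26, 5],
    [5, 5, 5, 5, 5],
]
pos, score = match_template(image, template)
print(pos, score)
```

the example places a rescaled copy of the template in the image to show that the score is unaffected by a linear change of brightness and contrast.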
we therefore use pearson's correlation coefficient, defined as [16]:

$$ r_{x,y} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \, \sum (y - \bar{y})^2}} \tag{1} $$

the correlation coefficient can be interpreted as the correlation between a template image (with mean $\bar{x}$) and an input image (with mean $\bar{y}$) after both have been z-normalized, that is, rescaled to zero mean and unit standard deviation [12]. illumination and contrast differences are thus eliminated before match quality is evaluated, which makes the correlation coefficient a suitable measure when the match must be robust to variations in pattern brightness and contrast.

fig. 6 proposed algorithm for turbine blade velocity estimation

our study shows that template matching based on the correlation coefficient can successfully identify potential target regions in thermal camera video of a wt. after hub localization, the sub image ih(x,y) is extracted and its mass center (cx,cy) is calculated from the following equations [19, 20]:

$$ c_x = \frac{m_{10}}{m_{00}}, \qquad c_y = \frac{m_{01}}{m_{00}} \tag{2} $$

$$ m_{ij} = \sum_x \sum_y x^i y^j \, i_h(x,y) \tag{3} $$

then, based on the extracted reference point (cx,cy), the bounding box with height bh and width bw is assigned to the point on the circumference of a circle with the predefined radius and angle. as noted above, this primary phase makes the algorithm adaptive to variations of the hub position across a sequence of images.

2.3. proposed gabor wavelet feature extraction method for blade detection

from a morphological point of view, the turbine blade is a directional pattern with known inter-ridge spacing. several approaches for analysing directional patterns exist in the literature, for example the radon transform [21], the hough transform, and the gabor wavelet transform (gwt) [22]. among these methods, the gwt has special and unique properties.
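returning to the hub localization step of section 2.2, equations (1)-(3) can be illustrated with a small sketch. it is not the authors' matlab implementation: the exhaustive window search, the function names, and the synthetic 40 by 40 frame with a bright square "hub" are all assumptions made for illustration.

```python
import numpy as np

def pearson_r(template, window):
    """Pearson correlation coefficient of equation (1) for two equal-size patches."""
    x = template.astype(float).ravel()
    y = window.astype(float).ravel()
    xc = x - x.mean()
    yc = y - y.mean()
    denom = np.sqrt((xc ** 2).sum() * (yc ** 2).sum())
    # A flat patch has no defined correlation; treat it as "no match".
    return 0.0 if denom == 0 else float((xc * yc).sum() / denom)

def match_template(image, template):
    """Exhaustive search; returns (row, col, r) of the best-matching window."""
    th, tw = template.shape
    best = (0, 0, -1.0)
    for r0 in range(image.shape[0] - th + 1):
        for c0 in range(image.shape[1] - tw + 1):
            r = pearson_r(template, image[r0:r0 + th, c0:c0 + tw])
            if r > best[2]:
                best = (r0, c0, r)
    return best

def mass_center(sub):
    """Mass center (cx, cy) from the raw moments m00, m10, m01 of equations (2)-(3)."""
    ys, xs = np.mgrid[0:sub.shape[0], 0:sub.shape[1]]
    m00 = sub.sum()
    return float((xs * sub).sum() / m00), float((ys * sub).sum() / m00)

# Hypothetical test frame: weak noise plus a bright square standing in for the hub.
rng = np.random.default_rng(0)
frame = rng.random((40, 40)) * 0.1
hub = np.zeros((7, 7))
hub[1:6, 1:6] = 1.0
hub[3, 3] = 2.0
frame[12:19, 20:27] += hub

row, col, r = match_template(frame, hub)
cx, cy = mass_center(frame[row:row + 7, col:col + 7])
```

because both patches are mean-centred and scaled inside `pearson_r`, a template compared with a brighter, higher-contrast copy of itself still scores 1, which is the robustness property described above.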
the key property of the gwt is that it minimizes the product of its standard deviations in the time and frequency domains; put another way, the uncertainty in the information carried by this wavelet is minimized. the downside is that gabor wavelets are non-orthogonal, so efficient decomposition into the basis is difficult. since their inception, they have been used in various applications, from image processing to analysing neurons in the human visual system [23, 24].

at this stage, we use structured gabor wavelet filter banks adapted to the turbine blade parameters. these tuned filter banks are applied to the masked sub image ib(x,y), which contains the pixels neighbouring the central point of the bounding box located at the predefined radius and angle in our polar coordinate system. the gabor wavelet based feature extraction method is thus used to determine whether or not a blade is located in ib(x,y). before elaborating the details of the blade detection algorithm, we give a brief overview of the gwt.

in [22], bidimensional gabor wavelets, $g_{w,\theta}(x,y) = l(v)\,b(u)$, are used for directional pattern analysis, where b(u) is a band-pass filter centered on the frequency w and l(v) is a gaussian low-pass filter. a bidimensional gabor wavelet thus consists of a band-pass filter in the direction of the wave and a low-pass filter in the orthogonal direction:

$$ l(v) = \frac{1}{\sqrt{2\pi\sigma_v^2}}\, e^{-\frac{v^2}{2\sigma_v^2}}, \qquad b(u) = \frac{1}{\sqrt{2\pi\sigma_u^2}}\, e^{-\frac{u^2}{2\sigma_u^2}}\, e^{jwu} \tag{4} $$

with

$$ u = x\cos(\theta) + y\sin(\theta) \tag{5} $$

$$ v = -x\sin(\theta) + y\cos(\theta) \tag{6} $$

where $\sigma_u^2$ and $\sigma_v^2$ are the scale parameters in the direction of the wave and in its orthogonal direction, respectively [22, 25]. fig. 7 depicts the real and imaginary parts of a 2d gabor wavelet.

(a) real part (b) imaginary part fig.
7 real and imaginary parts of a 2d gabor wavelet

an example of gabor filter bank feature extraction with 12 coefficients, for 3 frequencies and 4 directions, is shown in fig. 8. according to this figure, if the frequency and orientation of the input image are $w^*$ and $\theta^*$ respectively, the gabor wavelet coefficient for $(w^*, \theta^*)$ will be the maximum, and vice versa. in our work, to adapt as closely as possible to the blade direction and frequency features, we define $(2k_\theta + 1)$ directions and $(2k_w + 1)$ frequencies around $(w_0, \theta_0)$, which gives a gabor bank of $(2k_w + 1)(2k_\theta + 1)$ filters. we then compute the local projections of the normalized masked image, $i'_b(x,y)$, on the filter bank. $i'_b(x,y)$ is the normalized version of $i_b(x,y)$, obtained from

$$ i'_b(x,y) = \frac{i_b(x,y) - m(x_0,y_0)}{v(x_0,y_0)} \tag{7} $$

where m(x0,y0) and v(x0,y0) are the mean and standard deviation of ib(x,y), respectively. the norm of $i'_b$ is given by equation (8):

2 2, , 2 2 2 2 1 || || ( , ) 1 n n b b n n i i x y dxdy v n (8)

the gabor wavelet coefficient bank is then calculated as

$$ a_{w,\theta} = \frac{\int_{-n/2}^{n/2}\int_{-n/2}^{n/2} i'_b(x,y)\, g_{w,\theta}(x,y)\, dx\, dy}{\left\| g_{w,\theta}(x,y) \right\|} \tag{9} $$

fig. 8 an example of gabor filters bank feature extraction in the case of 3 frequencies and 4 directions

the features can then be extracted by the proposed equation (10):

2 , , * * , 0 0 re ( ) ( , ) max( w w , ) w { 1, 2,... } { 1, 2,... } w w w s s s w w w s a n l a w arg a w k w k k k k k (10)

in the final step of the analysis, a decision must be made on the presence or absence of a blade in the frame.
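the filter bank projection of equations (4)-(7) and (9) can be sketched as follows. this is only an illustration under stated assumptions: integrals are replaced by sums over an odd-sized square window, the magnitude of the complex projection is used as the coefficient, the gaussian prefactors are dropped because each filter is divided by its own norm, and the scale values, bank frequencies, and synthetic grating standing in for a blade are all hypothetical.

```python
import numpy as np

def gabor_kernel(w, theta, sigma_u, sigma_v, n):
    """Complex 2D Gabor wavelet on an n x n grid (n odd): band-pass along u, low-pass along v."""
    half = n // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    u = x * np.cos(theta) + y * np.sin(theta)     # equation (5)
    v = -x * np.sin(theta) + y * np.cos(theta)    # equation (6)
    low_pass = np.exp(-v ** 2 / (2 * sigma_v ** 2))
    band_pass = np.exp(-u ** 2 / (2 * sigma_u ** 2)) * np.exp(1j * w * u)
    return low_pass * band_pass

def gabor_bank_coefficients(image, freqs, thetas, sigma_u=6.0, sigma_v=6.0):
    """Projection of the normalized image on every filter of the bank."""
    n = image.shape[0]
    normalized = (image - image.mean()) / image.std()   # equation (7)
    coeffs = np.zeros((len(freqs), len(thetas)))
    for i, w in enumerate(freqs):
        for j, theta in enumerate(thetas):
            g = gabor_kernel(w, theta, sigma_u, sigma_v, n)
            # Discrete form of equation (9); the magnitude is an assumption of this sketch.
            coeffs[i, j] = abs(np.sum(normalized * np.conj(g))) / np.linalg.norm(g)
    return coeffs

# Hypothetical bank: 3 frequencies and 4 directions, as in the fig. 8 example.
freqs = [0.4, 0.6, 0.8]
thetas = [0.0, np.pi / 4, np.pi / 2, 3 * np.pi / 4]

# Synthetic directional pattern with frequency 0.6 rad/pixel oriented at pi/4.
half = 15
yy, xx = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
pattern = np.cos(0.6 * (xx * np.cos(np.pi / 4) + yy * np.sin(np.pi / 4)))

a = gabor_bank_coefficients(pattern, freqs, thetas)
i_star, j_star = (int(k) for k in np.unravel_index(np.argmax(a), a.shape))
```

the largest coefficient falls on the filter whose frequency and direction match the pattern, which is the behaviour the text attributes to $(w^*, \theta^*)$.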
we define equation (11) to score the blade detection results:

$$ s_{bd} = w_1\, a_{w^*,\theta^*} + w_2 \sum_{\{w_s,\theta_s \,\mid\, w_s \neq w^*,\ \theta_s \neq \theta^*\}} a_{w_s,\theta_s} \tag{11} $$

where w1 and w2 are weighting coefficients and sbd is the blade detection score. if a blade is present in ib, most of the $a_{w,\theta}$ coefficients have a large value, near or greater than sthr, whereas in the null case, with no blade in ib, the coefficients have relatively low values. the weighted scoring function of equation (11) is intended to widen the score gap between the null and blade situations.

the final decision is made by comparing sbd to the predefined threshold sthr: if $s_{bd} \geq s_{thr}$, a blade has been detected in the masked image. we rewrite the blade detection score function for the current frame index k as

$$ s^{(k)}_{bd} = \begin{cases} s_{bd} & s_{bd} \geq s_{thr} \\ 0 & \text{otherwise} \end{cases} \tag{12} $$

so $s^{(k)}_{bd} \neq 0$ means that a blade is detected at frame index k. this algorithm is repeated for all image frames in the predefined time window, tw.

2.4. blade velocity estimation method

the previous section presented the blade detection method, which is called for every frame in the predefined time window tw. the information needed for velocity estimation is then gathered as

$$ s^{tw}_{bd} = \left\{ s^{(i)}_{bd} \,\middle|\, i \in \{1, 2, \ldots, n\} \right\}, \qquad n = t_w f_s \tag{13} $$

where fs is the video frame rate in frames per second (fps). to estimate the blade velocity, the gathered scores must be fitted by an appropriate function, so we use the "sum of sine" approximation function f(x), which has 3n parameters:

$$ f(x) = \sum_{i=1}^{n} a_i \sin(w_i x + \varphi_i) \tag{14} $$

then we select the harmonic $w_i^*$ with the highest power $a_i^*$ as the frequency of blade rotation, that is, $a_i^* = a_{i^*}$ and $w_i^* = w_{i^*}$, where

$$ i^* = \arg\max_i \{a_i\}, \qquad i \in \{1, 2, 3, \ldots, n\} \tag{15} $$

finally, the velocity of the blade is calculated from our proposed equation:

$$ v_b = \frac{1}{2\pi}\,(w_{i^*})\, f_s \times 60 \ \text{(rpm)} \tag{16} $$

3.
simulation and results

to investigate the performance of the proposed system architecture, we use "rotary blade.avi", which contains 440 image frames of size 512 by 512 at fs = 40 fps, in matlab. the video stream is received at the coordinator, and each buffered image is converted to an 8-bit rgb image for further inspection. fig. 9(a)-(d) show that the algorithm accurately detects both the hub and blade landmarks. fig. 10(a)-(d) present different blade orientations. the hub is detected correctly in all of them, but in fig. 10(a)-(c) the blade orientations do not match the gw filter bank, so for these frames sbd < sthr and no blade is detected. in fig. 10(d), the orientation of the blade matches the gw filter bank and $s_{bd} \geq s_{thr}$ is satisfied, so a blade is present in this sub image.

we ran the experiment on all 440 images. the results indicate that the hub is detected correctly in 94.97% of the frames and that a blade is recognized by the gwt feature extraction and scoring method in approximately 50 frames (11.36%).

(a) (b) (c) (d)
fig. 9 hub and blade detection: (a) original image (b) hub detection by proposed correlation coefficient template matching (c) adaptive bounding box at the predefined radius and angle (d) proposed blade detection by gwt

fig. 11 presents the scores gathered by the blade detector for several sample frames of the video stream. for example, fig. 11(a) shows that for frame indexes 264 to 268 a blade is detected in all 5 consecutive frames, with scores over 820. all five detections relate to a single blade and must be counted as one blade; they arise from the directional drifts defined in the proposed gw filter bank to increase the blade detection probability. in the same way, fig. 11(b) presents two blade detections for frame indexes 363 and 365, while fig.
12 shows the results for the whole video stream. as mentioned before, to estimate the blade velocity the gathered scores must be fitted by an appropriate function, so we use the "sum of sine" approximation function f(x) of equation (14). fig. 12(b) depicts the resulting soft approximation of the blade score function.

(a) (b) (c) (d)
fig. 10 hub and blade detection during blade rotation (a)(b)(c) non-matched blade orientation: no blade is detected (d) matched blade orientation: a blade is detected

(a) (b)
fig. 11 blade detection scores for sample frames

(a) (b)
fig. 12 blade detection scores vs frame index (a) without softening function (b) results of "sum of sine" estimation

next, we select the harmonic $w_{i^*}$ with the highest power, $i^* = \arg\max_i \{a_i\}$, $i \in \{1, 2, 3, \ldots, n\}$, as the frequency of blade rotation. finally, the velocity of the blade is obtained by substituting $w_{i^*} = 0.1789$ and fs = 40 into equation (16), which gives an estimate of vb = 52.68 rpm for the simulated scenario. this value means that our model successfully estimates the blade velocity without any error.

in the final step of the analysis, we evaluate the robustness of the algorithm by adding zero-mean gaussian noise to the video stream. fig. 13 shows three noisy samples, and table 1 shows the impact of varying the noise variance $\sigma_n^2$ on the velocity estimate (the noise variance can vary between 0 and 1). the obtained results indicate that the model is highly robust against noisy conditions.

(a) (b) (c) fig.
13 adding zero mean gaussian noise to the input image (a) $\sigma_n^2$ = 0.01, (b) $\sigma_n^2$ = 0.05, (c) $\sigma_n^2$ = 0.10

table 1 gaussian noise impact on velocity estimation accuracy

$\sigma_n^2$    0      0.01   0.02   0.05   0.10
vb (rpm)        52.68  52.62  52.44  52.53  53.68

4. conclusion

many smart grid applications face harsh environmental conditions but have high reliability and maintenance requirements. in this work, an intelligent nondestructive test (ndt) based on vision inspection is modelled. in the proposed model, the velocity of the turbine blade is the main target for energy monitoring, while the model is also suited to further vision inspection (vi) procedures such as detecting mechanical deformations, surface defects, and overheated components in rotor blades, nacelles, slip rings, yaw drives, bearings, gearboxes, generators, and transformers. although our structured model is the first work in the related literature to concentrate fully on practical vi-based ndt maintenance for wts in sgs, the obtained results show the high accuracy and robustness of the algorithm. it is worth noting that the blade velocity estimate can also be used to estimate wt power generation [26].

references
[1] m. e. el-hawary, "the smart grid - state-of-the-art and future trends," electric power components and systems, vol. 42, pp. 239-250, 2014.
[2] v. c. gungor, d. sahin, t. kocak, s. ergut, c. buccella, c. cecati, et al., "a survey on smart grid potential applications and communication requirements," ieee transactions on industrial informatics, vol. 9, pp. 28-42, 2013.
[3] n. iqtiyaniilham, m. hasanuzzaman, and m. hosenuzzaman, "european smart grid prospects, policies, and challenges," renewable and sustainable energy reviews, vol. 67, pp. 776-790, 2017.
[4] s. g. c. cen-cenelec-etsi, "group," in smart grid reference architecture, ed, 2012.
[5] n.
framework, "roadmap for smart grid interoperability standards," national institute of standards and technology, 2010. [6] m. kuzlu, m. pipattanasomporn, and s. rahman, "communication network requirements for major smart grid applications in han, nan and wan," computer networks, vol. 67, pp. 74-88, 2014. [7] a. usman and s. h. shami, "evolution of communication technologies for smart grid applications," renewable and sustainable energy reviews, vol. 19, pp. 191-199, 2013. [8] h. li, l. lai, and w. zhang, "communication requirement for reliable and secure state estimation and control in smart grid," ieee transactions on smart grid, vol. 2, pp. 476-486, 2011. [9] p. tchakoua, r. wamkeue, m. ouhrouche, f. slaoui-hasnaoui, t. a. tameghe, and g. ekemb, "wind turbine condition monitoring: state-of-the-art review, new trends, and future challenges," energies, vol. 7, pp. 2595-2630, 2014. [10] c.-s. tsai, c.-t. hsieh, and s.-j. huang, "enhancement of damage-detection of wind turbine blades via cwt-based approaches," ieee transactions on energy conversion, vol. 21, pp. 776-781, 2006. [11] r. gade and t. b. moeslund, "thermal cameras and applications: a survey," machine vision and applications, vol. 25, pp. 245-262, 2014. [12] a. b. nikolic, n. neskovic, r. antic, and a. anastasijevic, "industrial wireless sensor networks as a tool for remote on-line management of power transformers'heating and cooling process," facta universitatis, series: electronics and energetics, vol. 30, pp. 107-119, 2016. [13] m. milutinov, n. đurić, n. pekarić-nađ, d. mišković, and d. knežević, "multiband sensors for wireless electromagnetic field monitoring system-semont," facta universitatis, series: electronics and energetics, vol. 25, pp. 137-150, 2012. [14] m. r. kosanović and m. k. stojčev, "rpll-rendezvous protocol for long-living sensor node," facta universitatis, series: electronics and energetics, vol. 28, pp. 85-102, 2015. [15] e. cuevas, v. osuna, and d. 
oliva, "template matching," presented at the evolutionary computation techniques: a comparative perspective, 2017. [16] g. jasvilis, c. weise, and b. zenger-landolt, "finding complex patterns using template matching," 2016. [17] a. mahmood, a. mian, and r. owens, "on optimizing auto-correlation for fast template matching through transitive elimination," arxiv preprint arxiv:1407.3535, 2014. [18] f. zhong, s. he, and b. li, "blob analyzation-based template matching algorithm for led chip localization," the international journal of advanced manufacturing technology, pp. 1-9, 2015. [19] y. zhang, s. wang, p. sun, and p. phillips, "pathological brain detection based on wavelet entropy and hu moment invariants," bio-medical materials and engineering, vol. 26, pp. s1283-s1290, 2015. [20] j. flusser and t. suk, "rotation moment invariants for recognition of symmetric objects," ieee transactions on image processing, vol. 15, pp. 3784-3790, 2006. [21] m. k. bahaghighat and j. mohammadi, "novel approach for baseline detection and text line segmentation," international journal of computer applications, vol. 51, 2012. [22] m. k. bahaghighat and r. akbari, "fingerprint image enhancement using gwt and dmf," in proce. of the 2nd international conference on signal processing systems (icsps), 2010, pp. v1-253-v1-257. [23] s. tang, "face recognition method based on gabor wavelet and memetic ecological algorithm," biomedical research, pp. 1-1, 2017. [24] t. s. lee, "image representation using 2d gabor wavelets," ieee transactions on pattern analysis and machine intelligence, vol. 18, pp. 959-971, 1996. [25] n. karimimehr and a. a. b. shirazi, "fingerprint image enhancement using gabor wavelet transform," in proceedings of the 18th iranian conference on electrical engineering (icee), 2010, pp. 316-320. [26] s. li, d. c. wunsch, e. a. o'hair, and m. g. giesselmann, "using neural networks to estimate wind turbine power generation," ieee transactions on energy conversion, vol. 16, pp. 
276-282, 2001.

facta universitatis series: electronics and energetics vol. 31, no. 3, september 2018, pp. 425-445 https://doi.org/10.2298/fuee1803425d

electric field modeling and analysis of ehv power line using improved calculation method

rabah djekidel, sid ahmed bessedik, abdechafik hadjadj
laboratory for analysis and control of energy systems and electrical systems lacosere, laghouat university (03000), algeria

abstract. this paper is devoted to the modeling and simulation of the electric field created by a 275 kv ehv power transmission line using an efficient hybrid methodology: the charge simulation method (csm) combined with the simplex simulated annealing (simpsa) algorithm, which finds the optimal position and number of the fictitious charges used in the csm for an accurate calculation. various factors that affect the electric field intensity are analyzed. the influence of conductor sag is clearly visible: the maximum electric field strength at 1 m above ground level is 3.09 kv/m at the mid-span point of the power line, whereas in the proximity of the pylon the maximum value is significantly reduced, to 1.28 kv/m. the configuration of the transmission line (single or double circuit) and the arrangement of phase conductors on double circuit pylons have a significant effect on the electric field levels around the line. for a single circuit, the triangular configuration provides the lowest maximum electric field. for a double circuit, the inverse phase arrangement (abc-cba), or low-reactance phasing, produces the lowest maximum electric field. the resulting maximum electric field levels are below the exposure limits set by the icnirp and irpa standards for both occupational and general public exposure. the simulated electric field is compared with results obtained from the comsol 4.3b multiphysics software, and fairly good agreement is found.
key words: catenary geometry, charge simulation method (csm), electric field, ehv power line, simplex simulated annealing (simpsa)

received october 7, 2017; received in revised form march 19, 2018
corresponding author: djekidel rabah, laboratory for analysis and control of energy systems and electrical systems lacosere, laghouat university (03000), algeria (e-mail: rabah03dz@live.fr)

1. introduction

over the years, electricity has improved the conditions of human life and plays a key role in meeting basic human needs. despite all its advantages, however, a number of negative effects of electricity on human health have been identified. energy needs increase with the rapid growth of the human population, which has led to the adoption of electric transmission systems with very high voltage levels and has accelerated the construction of new single circuit or double circuit transmission lines near residential areas. the extremely-low-frequency electric and magnetic fields generated by the lines of the transmission network have assumed great importance in recent years because of growing concern about their potential effects on human health and the environment. exposure to these fields induces currents inside the human body that interfere with the body's own currents and can, if sufficiently intense, cause harmful biological effects with important implications for human health.

in recent years, several studies have been published on the calculation and measurement of the extremely-low-frequency (elf) electric and magnetic fields created by power transmission lines [1-5]. based on the results and recommendations reported by these studies, a number of national and international standards have been established to define the limits for occupational and public exposure to electric and magnetic fields at very low frequency [6-8].
in parallel, a wide variety of software packages based on different numerical techniques have been developed for modeling and simulating electric and magnetic fields in both 2d and 3d.

the international organizations responsible for providing guidance and advice on the health hazards of non-ionizing radiation exposure, and officially recognized by the world health organization (who), are the international commission on non-ionizing radiation protection (icnirp) and the international radiation protection association (irpa). at a frequency of 50 hz, these organizations recommend exposure limits (24 hours) for the general public of the order of 5 kv/m for the electric field and 100 µt for the magnetic field; for occupational exposure, the recommended limits are 10 kv/m and 500 µt, respectively [6-8]. it is therefore very important to assess the levels of the electric and magnetic fields generated by very-high-voltage transmission lines in order to protect public health, the environment, and electrical equipment [9,10].

in view of the above, the purpose of this paper is to analyze the electric field levels generated by high voltage transmission lines (hvtl) under steady-state conditions, using a novel modeling approach that combines the charge simulation method (csm) with simplex simulated annealing (simpsa). owing to its favorable characteristics, such as simplicity, ease of programming, and execution speed, the charge simulation method has been used successfully in many studies to solve a variety of electric field analysis problems in high voltage insulation systems [11-20]. to improve the performance of this technique and increase its calculation accuracy, it is advisable to combine it with an optimization technique, here the simplex simulated annealing (simpsa) algorithm, in order to determine the optimal number and position of the simulation charges.
this algorithm shows good robustness and accuracy in reaching the global optimum of difficult non-convex functions, both highly constrained and unconstrained; it combines the downhill simplex method (dhs) with the simulated annealing algorithm (sa) [21,22]. it should be noted that the calculation takes into account the catenary shape of the line, where the conductor sag depends on the individual characteristics of the electrical line and on environmental conditions. this effect is rarely considered in the literature because it is most often assumed negligible; the calculation is then usually based on the average height of the electrical line. the simulation results will be compared with those obtained using comsol multiphysics 4.3b, which is based on the finite element method.

2. model of overhead power lines

the conductors of an overhead power line are not at the same height at all points along the span (longitudinal axis). in fact, they describe a catenary whose sag depends on the individual characteristics of the line and on environmental conditions. fig. 1 depicts the basic catenary geometry for a single-conductor line [18,23,24].

fig. 1 the basic catenary geometry for a single-conductor line (figure labels: axes x, y, z; span length l from -l/2 to +l/2; sag s; heights hmax and hmin)

the catenary shape of a conductor placed in the (yz) plane is given by [25]

$$ y(z) = h_{min} + 2\alpha \sinh^2\!\left(\frac{z}{2\alpha}\right) \tag{1} $$

where z is the longitudinal position along the conductor (for a symmetrical line, z = 0 is normally chosen at mid-span) and α is the solution of the transcendental equation

$$ 2\alpha \sinh^2\!\left(\frac{l}{4\alpha}\right) = s \tag{2} $$

the height of the electrical line along the z axis within one span can also be calculated from the following equation [25]:

$$ y(z) = h_{min} + \frac{4s}{l^2}\, z^2 \tag{2} $$

some researchers, when calculating the electric field in the vicinity of power lines, assume that the conductors are horizontal, of infinite length, and parallel both to a flat ground and to each other; the sag due to the weight of the conductors is neglected and replaced by an average height between the maximum and minimum heights of the line [18]. the average height $h_{ave}$ is given by

$$ h_{ave} = h_{max} - \frac{2}{3}\, s \tag{3} $$

where hmin is the minimum height of the line, hmax is the maximum height, s is the sag of the conductor, and l is the length of the power line in one span.

3. charge simulation method (csm)

the basic principle of this method is simple. if several discrete charges of any type are present in a region, the electrostatic potential at any point can be found by superposing the potentials of the individual charges, as long as this point does not coincide with one of the charges. this potential is given by [11-13]

$$ v_i = \sum_{j=1}^{n} p_{ij}\, q_j \tag{4} $$

where n is the number of fictitious charges and pij, called the potential coefficient, is the potential at point i caused by a unit charge qj. once the types of simulation charges and their positions are defined, the real charges on the conductor surfaces are replaced by fictitious charges placed inside the conductors. applying this procedure to n contour points leads to a linear system of n equations for n unknown charges [11-13]:

$$ \left[ p_{ij} \right]_{n \times n} \left[ q_j \right]_{n \times 1} = \left[ v_{ci} \right]_{n \times 1} \tag{5} $$

where [pij] is the potential coefficient matrix, [qj] is the column vector of fictitious simulation charges, and [vci] is the column vector of known potentials at the contour points (boundary conditions).
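the linear system of equation (5) can be sketched for the simplest case of one cylindrical conductor above a grounded plane. this is a minimal illustration, not the authors' code: the geometry (10 m height, 10 mm radius), the 1 v potential, the eight charges, and the ring placement at half the conductor radius are assumed values, and the potential coefficient is the standard expression for an infinite line charge with its image below the ground plane at y = 0.

```python
import numpy as np

EPS0 = 8.854187817e-12  # vacuum permittivity in F/m

def ring(x0, y0, radius, n, offset=0.0):
    """n points equally spaced on a circle of the given radius centred at (x0, y0)."""
    ang = 2 * np.pi * np.arange(n) / n + offset
    return x0 + radius * np.cos(ang), y0 + radius * np.sin(ang)

def potential_coefficients(xp, yp, xq, yq):
    """p[i, j]: potential at point i per unit line charge j, with image charges for the ground."""
    dx = xp[:, None] - xq[None, :]
    d_direct = np.hypot(dx, yp[:, None] - yq[None, :])
    d_image = np.hypot(dx, yp[:, None] + yq[None, :])
    return np.log(d_image / d_direct) / (2 * np.pi * EPS0)

def solve_csm(x0, y0, r1, r2, volts, n):
    """Solve [p][q] = [v] and report the largest relative error (%) at the check points."""
    xq, yq = ring(x0, y0, r2, n)                     # fictitious charges inside the conductor
    xc, yc = ring(x0, y0, r1, n)                     # contour points on the conductor surface
    p = potential_coefficients(xc, yc, xq, yq)
    q = np.linalg.solve(p, np.full(n, volts))
    xv, yv = ring(x0, y0, r1, n, offset=np.pi / n)   # check points between the contour points
    v_check = potential_coefficients(xv, yv, xq, yq) @ q
    err_percent = np.max(np.abs(v_check - volts) / abs(volts)) * 100
    return q, err_percent

# Assumed example geometry and voltage.
q, err_percent = solve_csm(x0=0.0, y0=10.0, r1=0.01, r2=0.005, volts=1.0, n=8)

# Closed-form charge per unit length of a cylinder above a plane, for comparison.
q_exact = 2 * np.pi * EPS0 * 1.0 / np.arccosh(10.0 / 0.01)
```

comparing the sum of the fictitious charges with the closed-form value gives an independent check of the solution, in addition to the check-point error.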
after the values of the simulation charges have been determined by solving the matrix system of equation (5), the calculated charges must be verified at new points located on the contour (check points). the potential vvi is calculated at these check points and the error is compared with the tolerance. if the error is lower than the required simulation accuracy, the potential and electric field can be calculated at any point; if not, the whole calculation must be repeated with a different number and/or different locations of the simulation charges [16-19].

4. electric field calculation

the conductor of an electrical line is usually represented by infinite line charges, because its length is much greater than its other dimensions; these charges are placed inside the periphery of the conductor. in the charge simulation method (csm), the effect of the ground is simulated by an image charge for each conductor, which ensures that the potential at any point on the ground plane is zero. using the image technique, each conductor of the line is thus represented by a positively charged line and a negatively charged image conductor.

fig. 2 arrangement of the simulation charges and the contour points of conductor (figure legend: contour points, simulation charges, check points; r1: real radius of the conductor; r2: fictitious radius of the conductor; axes x, y; conductor centre x0, y0)

the arrangement of fictitious charges and contour points in the conductors of the power line is shown in fig. 2. the coordinates of these points are calculated using the following formulas [18,21]:

$$ x_k = x_0 + r \cos\!\left(\frac{2\pi}{n}(k-1)\right), \qquad y_k = y_0 + r \sin\!\left(\frac{2\pi}{n}(k-1)\right) \tag{6} $$

where r = {r1 if k = i, r2 if k = j}; x0 is the horizontal coordinate of the conductor and y0 is its height above ground. for charges of infinite line type, the potential coefficient is given by the equation below [18,21].
$$ p_{ij} = \frac{1}{2\pi\varepsilon_0} \ln \sqrt{\frac{(x_i - x_j)^2 + (y_i + y_j)^2}{(x_i - x_j)^2 + (y_i - y_j)^2}} \tag{7} $$

where (xi, yi) are the coordinates of the contour points and (xj, yj) are the coordinates of the simulation charges. in a cartesian coordinate system, the magnitude of the total electric field at the desired point is calculated by summing the components:

$$ e_t = \sqrt{\left( \sum_{j=1}^{n} f_{x_i}\, q_j \right)^2 + \left( \sum_{j=1}^{n} f_{y_i}\, q_j \right)^2} \tag{8} $$

where fxi and fyi are the field intensity coefficients between the contour points and the simulation charges qj, given by

$$ f_{x_i} = -\frac{\partial p_{ij}}{\partial x}, \qquad f_{y_i} = -\frac{\partial p_{ij}}{\partial y} \tag{9} $$

in this analysis of the electric field created by a power transmission line, the catenary form of the overhead conductors (conductor sag) is taken into account; this 3d quasi-static analysis can provide a good estimate of the electric field. it should be noted that the influence of the towers and of any metallic objects that act as screens is neglected.

5. simplex simulated annealing (simpsa)

the simplex simulated annealing (simpsa) algorithm was developed for the global solution of optimization problems. it is based on the original sa, which was proposed for discrete optimization problems. simpsa combines the original simulated annealing algorithm (metropolis algorithm) with the non-linear simplex algorithm (downhill simplex search). simulated annealing generates solution vectors stochastically and exploits the analogy between the physical process of annealing and a minimization problem. the algorithm shows good robustness and accuracy in reaching the global optimum of difficult non-convex, highly constrained functions [21,22]. because the downhill simplex search is applied, a simplex with d + 1 vertices is used for d decision variables.
the algorithm starts with an arbitrary solution in the search space. a new solution is created according to the metropolis algorithm, the fitness function is evaluated for both solutions, and the difference between the two values is calculated. if the new solution is better, it is accepted and becomes the starting point for the next iteration; otherwise it is accepted with the boltzmann probability $p = \exp\!\left(-\Delta e / (k_b\, t_{init})\right)$, where Δe is the difference of the fitness function values, kb is boltzmann's constant, and tinit is the annealing temperature. for an assumed acceptance ratio ar, the initial annealing temperature tinit is estimated from [26-28]

$$ a_r = \frac{m_1 + m_2\, e^{-\Delta e^{*} / t_{init}}}{m_1 + m_2} \tag{10} $$

where m1 and m2 are the numbers of successful and unsuccessful reflections, respectively, and Δe* is the average increase in the objective function values over the m2 unsuccessful reflections. during the first generations the temperature remains high; it is then decreased over the following generations in order to reduce the acceptance probability. starting from the tinit estimated by equation (10), the cooling schedule proceeds as [26-28]

$$ t(g+1) = \frac{t(g)}{1 + \dfrac{t(g)\, \ln(1 + r_{cool})}{3\,\sigma(g)}} \tag{11} $$

where rcool is the cooling rate and σ is the standard deviation of all solutions at the current temperature t(g). the above steps are repeated until a sufficient number of successful generations has been obtained at the current temperature. the temperature is then gradually reduced using equation (11), and the entire process is repeated until the stopping criterion is met [26-28].

the fitness function used for the optimization is based on the accuracy of the calculation method, obtained by evaluating the relative error between the potential calculated at the check points on the contour and the actual potential applied to the active conductors.
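the annealing ingredients of equations (10) and (11) and the metropolis acceptance rule can be sketched as below. this covers only the temperature handling, not the simplex moves or the csm evaluation, and it is not the authors' implementation; the function names, the default kb = 1 (a common convention in optimization, assumed here), and the example values of m1, m2, Δe*, σ and rcool are hypothetical.

```python
import math
import random

def initial_temperature(m1, m2, mean_increase, accept_ratio):
    """Solve equation (10) for t_init, given the target acceptance ratio a_r."""
    arg = (accept_ratio * (m1 + m2) - m1) / m2
    if not 0.0 < arg < 1.0:
        raise ValueError("acceptance ratio not reachable with these counts")
    return -mean_increase / math.log(arg)

def acceptance_ratio(m1, m2, mean_increase, temp):
    """Right-hand side of equation (10) at a given temperature."""
    return (m1 + m2 * math.exp(-mean_increase / temp)) / (m1 + m2)

def metropolis_accept(delta, temp, rng, kb=1.0):
    """Accept improvements always, and deteriorations with the Boltzmann probability."""
    if delta <= 0:
        return True
    return rng.random() < math.exp(-delta / (kb * temp))

def cool(temp, sigma, r_cool):
    """One step of the cooling schedule of equation (11)."""
    return temp / (1.0 + temp * math.log(1.0 + r_cool) / (3.0 * sigma))

# Assumed example values: 40 successful and 60 unsuccessful reflections.
t_init = initial_temperature(m1=40, m2=60, mean_increase=2.0, accept_ratio=0.95)
t_next = cool(t_init, sigma=1.5, r_cool=0.1)
accepted = metropolis_accept(delta=0.5, temp=t_init, rng=random.Random(0))
```

substituting `t_init` back into `acceptance_ratio` recovers the requested ratio, which is a quick way to check that equation (10) has been inverted correctly.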
the fitness function (ff) has the form

$$FF = \frac{1}{n_c}\sum_{i=1}^{n_c}\frac{\left|V_{ci} - V_{vi}\right|}{V_{ci}}\cdot 100 \qquad (12)$$

where $V_{ci}$ is the exact potential to which the conductors are subjected, $V_{vi}$ is the potential calculated at the check points, and $n_c$ is the total number of check points. the main steps of the proposed simpsa-csm algorithm are as follows [26-28]:
1. simpsa generates an initial solution at high temperature.
2. at each step, a new solution is created; the csm evaluates the objective function values for both points.
3. the two solutions are compared using the metropolis criterion.
4. steps 2 and 3 are repeated until the system reaches an equilibrium state.
5. the temperature is decreased and the above steps are repeated until the stopping criteria are met.

6. finite element method (fem)

the finite element method (fem) is a numerical technique used to find approximate solutions of partial differential equations by reducing them to a system of algebraic equations. the great advantage of this computational technique is that relatively simple iterative algorithms, implemented in code, give solutions of very complex problems with acceptable accuracy and considerably reduced calculation time. the finite element analysis of any problem involves basically four steps: (a) discretizing the solution region into a finite number of sub-regions or elements, (b) deriving governing equations for a typical element, (c) assembling all elements in the solution region, and (d) solving the system of equations obtained [29,30]. in two-dimensional (2d) problems, the energy in an electrostatic field in cartesian coordinates (x, y) has the functional expression [30,31]:

$$W_e = \frac{1}{2}\int_s \varepsilon\,E^2\,ds = \frac{1}{2}\int_s \varepsilon\left[\left(\frac{\partial V}{\partial x}\right)^2 + \left(\frac{\partial V}{\partial y}\right)^2\right] ds \qquad (13)$$

in fem, the proposed region is divided into "m" small triangular elements whose sides form a grid with "ne" nodes. the potential function is approximated by [31,32]:

$$V(r) = \sum_{i=1}^{n_e} V_i\,f_i(r) \qquad (14)$$

where $V_i$ is the electric potential of node i, r is any point of the proposed region, and $f_i(r)$ is the shape function, which is equal to unity at the location of node i and zero at the other nodes:

$$f_i(r_j) = \begin{cases} 1 & \text{for } i = j \\ 0 & \text{for } i \neq j \end{cases} \qquad (15)$$

substituting equation (14) into equation (13) gives the approximate energy $W$, which is minimized under the following conditions:

$$\frac{\partial W}{\partial V_i} = 0, \qquad i = 1, 2, \ldots, n_e \qquad (16)$$

this yields a system of equations whose unknowns are the electric potential values at the nodes of the mesh. the electric field intensity within each element is obtained using the gradient expression:

$$E_m = -\nabla V_m = -\sum_{i=1}^{n_e} V_i\,\nabla f_i \qquad (17)$$

fig. 3 275 kv single circuit three phase overhead transmission line (labels recoverable from the drawing: phases r, s, t; radius 10 mm; 0.3 m; 7 m; 10 m; 20 m; 26 m)

in the present work, a three-phase ehv overhead line of 275 kv with earth wires is considered, with the arrangement and the geometric coordinates referred to the suspension pylon (height at tower) as shown in fig. 3. each phase of the transmission line consists of a bundle of two conductors of radius 10 mm separated by 30 cm, the ground wire radius is 10 mm, the span length is 300 m, and the sag is s = 8 m for the phase conductors and s = 6 m for the ground wires. the system of phase voltages is considered symmetrical and of direct (positive) sequence with a nominal frequency of 50 hz; the earth is assumed to be homogeneous with a resistivity of 100 Ω·m.

fig. 4 different configurations of single circuit lines: (1) horizontal, (2) vertical, (3) triangular, (4) inverted triangular
fig.
5 different configurations of double circuit lines: (1) vertical, (2) triangular, (3) inverted triangular

6. results and discussions

after choosing infinite line charges as the type of fictitious charges, the simplex simulated annealing algorithm (simpsa) is used to find an appropriate arrangement (number and location) of both the fictitious charges and the contour points of the charge simulation method (csm) for accurate calculation of the electric field. the preferred parameter settings for the simpsa algorithm, taken from [33-35], and the search intervals of the variables for the charge simulation method (csm) are summarized in table 1.

table 1 charge simulation method and simplex simulated annealing parameters
algorithm+csm | maximum number of generations (iterations) = 80
simpsa | cool_rate = 10, min_cooling_factor = 0.9, initial_acceptance_ratio = 0.95
csm | range of number of fictitious charges = 4–30
csm | range of fictitious radius for phase conductor = 0.01–0.05
csm | range of fictitious radius for ground wire = 0.001–0.009

after multiple runs of the fitness function optimization, once the algorithm terminates, the best fitness function value and the optimal parameter values are obtained. the optimal values found by this algorithm, which are incorporated into the proposed method, are summarized in table 2:

table 2 optimum values of csm
conductor | number of fictitious charges | fictitious conductor radius [m] | ff value
phase conductor | 12 | 0.016 | 3.64e-12
ground wire | 30 | 0.0045 |

the convergence of the fitness function (ff) of equation (12) with the number of iterations is shown in fig. 6. the best value of the fitness function is 3.64e-12 and is reached after approximately 75 iterations.

fig. 6 convergence of fitness function used in simpsa algorithm (axes: iteration number, 0 to 80; fitness function, scale ×10^-10)

the search process of this algorithm over successive iterations, together with the optimal solutions, is represented in figs.
7 and 8 respectively, where it becomes clear that the algorithm converges quickly to these values.

fig. 7 convergence of the optimum values of fictitious charges number on the conductor (axes: iteration number; number of fictitious charges for line conductors; curves: phase conductor, ground wire)

fig. 8 convergence of the optimum value of fictitious conductor radius (axes: iteration number; fictitious radius for line conductors [m]; curves: phase conductor, ground wire)

fig. 9 shows the lateral profile of the electric field calculated at 1 m above ground level, at mid-span and at the pylon foot, for different points within the right of way. in general, the field intensity has a lower value at the centre of the power line, increases to a maximum almost directly under the lateral conductor, and from this point decreases gradually as the lateral distance from the power line centre increases. it is clear from this figure that the electric field profile is symmetric around the middle conductor. the most important result is that the electric field strength at mid-span is significantly higher than that at the pylon (points of suspension); this is due to the difference in conductor height above ground level. since the height of the conductors above the ground is minimal at mid-span, the electric field there is highest.

[plot: electric field [kv/m] versus lateral distance [m]; curves: at mid-span, at pylon foot, average height]
fig.
9 electric field profile at mid-span and pylon foot calculated at 1 m above the ground

for the average height, the maximum calculated electric field lies between the maximum value obtained at mid-span and that at the pylon foot. this assumption does not reflect the actual situation of the power line. the longitudinal profile of the electric field at 1 m above ground level, shown in fig. 10, illustrates this observation very well: the electric field is greatest at mid-span (3.09 kv/m) and, as the observation point approaches the pylon in either direction, it gradually decreases to a minimum value (1.28 kv/m), so the electric field near the pylon is much lower than at mid-span. for the average height, the electric field strength at 1 m above ground level is 2.2 kv/m; this value remains constant along the span of the power line. this shows that including the conductor sag in the electric field calculation is a very practical way of modelling the real shape of the power line; it plays a considerable role in determining the real values of the electric field. it should be noted that the maximum peak values of the electric field obtained are well below the limits prescribed by the icnirp and irpa international standards. fig. 11 shows the three-dimensional profile of the electric field over a right of way of 50 m on either side of the power line centre and a span length of 300 m between the suspension pylons. the highest electric field values exist only in a small area near mid-span; they decrease rapidly towards the pylons and even more rapidly away from the side conductors. fig. 12 maps the electric field intensity over an area defined by the height of the conductors and the lateral distance along the corridor.
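the sag effect discussed above can be made concrete by computing the conductor height along the span for a catenary. the sketch below uses the span length (300 m) and phase conductor sag (8 m) stated in the text; the attachment height of 20 m is only an assumed illustrative value, and the function names `catenary_parameter` and `conductor_height` are hypothetical.

```python
import math


def catenary_parameter(span, sag, iterations=200):
    """Solve sag = a*(cosh(span/(2a)) - 1) for the catenary parameter a
    by bisection (assumes the sag is small compared with the span)."""
    def sag_of(a):
        return a * (math.cosh(span / (2.0 * a)) - 1.0)

    lo = span**2 / (8.0 * sag)   # parabolic estimate, gives sag_of(lo) >= sag
    hi = 2.0 * lo                # flatter curve, gives sag_of(hi) < sag
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if sag_of(mid) > sag:
            lo = mid             # curve still too deep: increase a
        else:
            hi = mid
    return 0.5 * (lo + hi)


def conductor_height(z, span, sag, h_attach):
    """Height above ground at longitudinal position z (z = 0 at mid-span)."""
    a = catenary_parameter(span, sag)
    h_mid = h_attach - sag       # lowest point of the conductor
    return h_mid + a * (math.cosh(z / a) - 1.0)


# span and sag from the text; h_attach = 20 m is an assumption
heights = [conductor_height(z, 300.0, 8.0, 20.0) for z in (0.0, 75.0, 150.0)]
```

the conductor is lowest at mid-span and reaches the attachment height at the pylons, which is why a field calculation based on a single average height cannot reproduce the longitudinal variation of fig. 10.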
it is interesting to note that the electric field is concentrated around the surface of the phase conductors; it gradually decreases with the lateral distance from the power line centre in both directions of the corridor.

fig. 10 longitudinal electric field profile calculated at 1 m above the ground (axes: longitudinal span [m]; electric field [kv/m]; curves: sagging effect, average height)

fig. 11 three-dimensional (3d) electric field profile calculated at 1 m above the ground (axes: longitudinal span [m]; lateral distance [m]; electric field [kv/m])

some factors that may influence the value of the electric field are discussed in the following. the effect of varying the conductor height and the phase spacing is shown in fig. 13. an increase in the conductor height above the ground (clearance between conductor and ground) causes a significant reduction in the electric field value. an increase in the spacing between phases causes a slight increase in the strength of the electric field.

fig. 12 mapping of the electric field generated by the single-circuit 275 kv power line at mid-span length (axes: lateral distance [m]; conductors height [m])

[plot: electric field [kv/m] versus distance/height [m]; curves: spacing between conductors, height of conductors]
fig.
13 electric field profile calculated at 1 m above the ground as a function of the conductor height and the spacing between the conductors

fig. 14 illustrates the effect of changing the height of the observation point above ground. raising the calculation point initially produces only a small increase in the electric field, up to a height of about 8 m; above this height the rise becomes sudden. it can be concluded that the amplitude of the electric field is highest in the immediate vicinity of the surface of the phase conductors and gradually decreases towards the ground. fig. 15 shows the effect of bundled phase conductors on the value of the electric field; as seen in this figure, the electric field intensity increases slowly as the number of subconductors per phase increases.

fig. 14 electric field profile as a function of the observation point height above ground (axes: observation point height [m]; electric field [kv/m])

fig. 15 electric field profile as a function of the bundle conductors (axes: lateral distance [m]; electric field [kv/m]; curves: n=1, n=2, n=3, n=4, n=8)

the electric field profile for different single circuit phase configurations (see fig. 4) is shown in fig. 16. it can be seen that the horizontal configuration produces a higher maximum electric field than all other configurations, because all conductors are close to ground level, whereas the triangular configuration produces the lowest maximum electric field because of a better cancellation effect of the line voltages.

fig. 16 lateral electric field profile calculated at 1 m above the ground for different phase configurations of single-circuit 275 kv transmission line (axes: lateral distance [m]; electric field [kv/m]; curves: horizontal line, vertical line, triangular line, inverted triangular line)

for the various double circuit line configurations (see fig. 5) with the same phasing (abc-abc), the lateral distribution of the electric field is illustrated in fig. 17. the triangular configuration gives a lower maximum electric field than the other configurations in the immediate vicinity of the power line centre, in the interval [-7, +7] m. for lateral distances between 7 and 30 m on either side, the vertical configuration is the preferred one; within this range the values obtained indicate a significant reduction in the electric field strength.

fig. 17 lateral electric field profile calculated at 1 m above the ground for different phase configurations of double-circuit 275 kv transmission line (axes: lateral distance [m]; electric field [kv/m]; curves: vertical line, triangular line, inverted triangular line)

in a double circuit overhead power line, the phase sequence arrangement has a significant influence on the electric field intensity; the phase order can be adjusted to reduce the electric field under the power line to a lower level. as an example, the electric field for different phase arrangements in a vertical double circuit line with the same parameters is illustrated in fig. 18.
as can be seen in this figure, the inverse phase arrangement (abc-cba), or low-reactance phasing, gives the lowest value of the electric field at the different points along the power line corridor, because of the best electric field cancellation caused by the phase shift between phases, while the phase arrangement (abc-acb) produces a higher electric field than all other arrangements of phase conductors.

fig. 18 comparison of electric field in different phase arrangements for double circuit vertical line 275 kv (axes: lateral distance [m]; electric field [kv/m]; curves: abc-abc, abc-acb, abc-bac, abc-bca, abc-cab, abc-cba)

fig. 19 finite element discretization of the study domain given in fig. 3

in order to validate the method adopted in this study, the comsol multiphysics software (version 4.3b), which specializes in numerical simulation of electromagnetism based on the finite element method (fem), was used to simulate and evaluate the electric field around the overhead power lines.

fig. 20 electric field profile at mid-span length calculated at 1 m above the ground using comsol 4.3b software

fig. 21 electric field profile at pylon foot calculated at 1 m above the ground using comsol 4.3b software

the electrostatic module was chosen for this computational work to solve and analyze the model in 2d space. the mesh of linear triangular elements generated by this software in the study domain, with the defined settings of the system, is shown in fig. 19. figs. 20 and 21 show, respectively, the electric field distribution at mid-span and at the pylon foot under the hv power line at 1 m above ground level, obtained using the comsol 4.3b software.
it can be seen that the electric field under the middle conductor is less intense and that it increases to a maximum almost directly under the lateral conductors. as the distance from this point increases, the electric field strength decreases quite rapidly.

fig. 22 comparison of electric field levels between the proposed method and comsol 4.3b software (axes: lateral distance [m]; electric field; curves: csm+simpsa (mid span), csm+simpsa (pylon foot), comsol 4.3b (mid span), comsol 4.3b (pylon foot))

the comparison of the electric field results obtained by the proposed method with those simulated with the comsol 4.3b software is shown in fig. 22. these results are in good agreement: the curves are approximately superposed and the maximum error is not significant; it does not exceed 4.8%.

7. conclusion

in this study, a novel optimized approach that couples the simplex simulated annealing algorithm (simpsa) and the charge simulation method (csm) has been presented. the adopted algorithm offers efficiency and accuracy in determining the optimal position and number of fictitious charges. accurate results for the 3d quasi-static analysis of the electric field created by an ehv overhead power line have been obtained. the results show that the electric field is less intense under the middle phase conductor, increases to a peak almost directly under the side phase conductor, and then decreases with increasing lateral distance. it has also been found that the electric field depends on several factors, such as the spacing between two adjacent conductors, the conductor height above the ground, and the type of line, single or double circuit. for double circuit lines, it is possible to adjust the phases in an appropriate manner in order to achieve a significant reduction of the electric field.
the most important parameter is the conductor sag; the value of the electric field at mid-span is much higher than that at the pylon foot. the obtained results were compared with those obtained from the comsol 4.3b software. the simulation results are almost identical and are visually superimposed; the agreement is sufficient to validate the combined method.

references
[1] ch. j. portier, m.s. wolfe, "assessment of health effects from exposure to power-line frequency electric and magnetic fields," working group report, niehs and emfrapid, august 1998.
[2] the elf working group, health effects and exposure guidelines related to extremely low frequency electric and magnetic fields, the federal provincial territorial radiation protection committee, canada, january 2005.
[3] t. keikko, "technical management of the electric and magnetic fields in electric power system," technical report, finland, 2003.
[4] cigre, "electric and magnetic fields produced by transmission systems," description of phenomena practical guide for calculation, wg 36-01, paris 1980.
[5] report, "a review of the potential health risks of radiofrequency fields from wireless telecommunication devices," the royal society of canada, rsc.epr 99-1, march 1999.
[6] icnirp (international commission on non-ionizing radiation protection), "guidelines for limiting exposure to time-varying electric and magnetic fields (1 hz to 100 khz)," health physics, vol. 99, no. 6, pp. 818–836, 2010.
[7] iarc, "non-ionizing radiation, part 1: static and extremely low-frequency (elf) electric and magnetic fields," iarc monographs on the evaluation of carcinogenic risks to humans, vol. 80, pp. 1-395, 2002.
[8] who, "extremely low frequency fields, environmental health criteria monograph," no. 238, who press, geneva, switzerland, 2007.
[9] abstract book, "international conference on electromagnetic fields, bio-effects to legislation", ljubljana, slovenia, november 2004. [10] review , statement of the international evaluation committee to investigate the health risks of exposure to electric, magnetic and electromagnetic fields, the italian ministers of environment, health and telecommunication, italy,2002. [11] h. singer, h. steinhigler, p. weiss, "a charge simulation methods for the calculation of high voltage fields," ieee trans on power applicat, vol. pas-93, pp. 3660-3668, 1974. [12] t. takuma, "charge simulation method with complex fictitious charges for calculating capacitive resistive fields," ieee trans on power apparatus and systems, vol. pas-i00, no.11, pp. 4665-4672, november 1981. [13] n.h. malik, "a review of the charge simulation method and its applications," ieee trans on electrical insulation, vol. 24, no.1, pp.3-20, feb 1989. [14] s. chakravorti, p. k. mukherjee, "efficient field calculation in three-core belted cable by charge simulation using complex charges," ieee trans on electrical insulation, vol. 27, no. 6, pp. 1208-1212, 1992. [15] x. m. liu, y. d cao, e. z. wang, "numerical simulation of electric field with open boundary using intelligent optimum charge simulation method," ieee transactions on magnetics, vol. 42, no. 4, pp.1159-1162, april 2006. [16] d. himadri, "implementation of basic charge configurations to charge simulation method for electric field calculations," international journal of advanced research in electrical, electronics and instrumentation engineering, vol. 3, no.5, pp. 9607-9611, may 2014. [17] r. m. radwn, m. m. samy , "calculation of electric fields underneath six phase transmission lines," journal of electrical systems, vol.12, no. 4, pp.839-851, 2016. [18] r. djekidel, d. 
mahi, "effect of the shield lines on the electric field intensity around the high voltage overhead transmission lines," international journal of modeling, measurement and control a, amse journals, series, modelling a, vol. 87, no. 1, pp. 1-16, 2014.
[19] m. m. samy, a. m. emam, "computation of electric fields around parallel hv and ehv overhead transmission lines in egyptian power network," in proceedings of the ieee, international conference on environment and electrical engineering and ieee industrial and commercial power systems europe, italy, 2017, pp. 1–5.
[20] r. djekidel, a. choucha, a. hadjaj, "efficiency of some optimization approaches with the charge simulation method for calculating the electric field under extra high voltage power lines," iet generation, transmission and distribution, vol. 11, no. 17, pp. 4167–4174, 2017.
[21] m. f. cardoso, r. l. salcedo, s. f. de azevedo, "the simplex-simulated annealing approach to continuous non-linear optimization," computers & chemical engineering, vol. 20, no. 9, pp. 1065-1080, sep 1996.
[22] m. e. cardoso, r. l. salcedo, s. f. azevedo, d. barbosa, "a simulated annealing approach to the solution of minlp problems," computers & chemical engineering, vol. 21, no. 12, pp. 1349-1364, 1997.
[23] a. v. mamishev, r. d. nevels, and b. d. russell, "effects of conductor sag on spatial distribution of power line magnetic field," ieee trans on power delivery, vol. 11, no. 3, pp. 1571-1576, july 1996.
[24] m.p.arabani, b.porkar, s.porkar, "the influence of conductor sag on spatial distribution of transmission line magnetic field," cigre, paper b2–202, paris, 2004. [25] v. phan tu, j. tlusty," the induced magnetic field calculation of three phase overhead transmission lines above a lossy ground as a frequency-dependent complex function", in proceedings of the ieee conference on power engineering, canada, may 2003, pp. 154 – 158. [26] r. mukesh, k. lingadurai, "aerodynamic optimization using simulated annealing and its. variants", international journal of engineering trends and technology, vol.2, no. 3, pp.73–77, 2011. [27] h. ketabchi, b. ataie-ashtiani, "evolutionary algorithms for the optimal management of coastal groundwater: a comparative study toward future challenges," journal of hydrology, vol. 520, pp.193213, 2015. [28] b. behzadi, c. ghotbi, a. galindo, "application of the simplex simulated annealing technique to nonlinear parameter optimization for the saft-vr equation of state," chemical engineering science,science direct, vol.60, no. 3 , pp. 6607–6621, dec 2005. [29] m. k. haldar, "introducing the finite element method in electromagnetics to undergraduates using matlab," international journal of electrical engineering education, vol. 43, pp. 232-244, 2006. [30] e. o. virjoghe, d. e. nescu, m. f. stan, c. cobianu, "numerical determination of electric field around a high voltage electrical overhead line, " journal of science and arts, vol. 4, no. 21, pp. 487-496, 2012. [31] n.yadav, n. kumar, "2-dimensional and 3-dimesional electromagnetic fields using finite element method,” iosr, journal of electrical and electronics engineering, vol. 7, no. 2, pp. 53-60, 2013. [32] j. faiz, m. ojaghi, "instructive review of computation of electric fields using different numerical techniques", international journal of engineering, vol. 18, no. 3, pp. 344-356, 2002. [33] s. bera, i. 
mukherjee, "an ellipsoidal distance-based search strategy of ants for nonlinear single and multiple response optimization problems," european journal of operations research, vol. 223, no. 2, pp. 321-332, 2012.
[34] z.h. che, h.s. wang, "a hybrid approach for supplier cluster analysis," computers & mathematics with applications, vol. 59, no. 2, pp. 745–763, 2010.
[35] m. m. atiqullah, "an efficient simple cooling schedule for simulated annealing," in proceedings of the international conference on computational science and its applications, italy, pp. 396-404, may 2004.
instruction facta universitatis series: electronics and energetics vol. 27, no. 4, december 2014, pp. 561–588 doi: 10.2298/fuee1404561n

recent research in vlsi, mems and power devices with practical application to the iter and dream projects

andrzej napieralski, cezary maj, michal szermer, piotr zajac, wojciech zabierowski, małgorzata napieralska, łukasz starzak, mariusz zubert, rafał kiełbik, piotr amrozik, zygmunt ciota, robert ritter, marek kamiński, rafał kotas, paweł marciniak, bartosz sakowicz, kamil grabowski, wojciech sankowski, grzegorz jabłoński, dariusz makowski, aleksander mielczarek, mariusz orlikowski, mariusz jankowski, piotr perek

lodz university of technology, department of microelectronics and computer science, poland

abstract. several mems (micro electro-mechanical systems) devices have been analysed and simulated. the new proposed model of sic mps (merged pin-schottky) diodes is in full agreement with the real mps devices. the real size dll (dynamic lattice liquid) simulator as well as the research on modelling and simulation of modern vlsi devices with practical applications have been presented. based on experience in the field of atca (advanced telecommunications computing architecture) based systems, a proof-of-concept daq (data acquisition) system for iter (international thermonuclear experimental reactor) has been proposed.

key words: mems, power devices, sic, dll, vlsi, nuclear fusion.

1. edumems project

the edumems (developing multidomain mems models for educational purposes) project started in 2011 and lodz university of technology acts as project coordinator. the project consortium includes two polish universities (lodz and wroclaw university of technology), one french and one belgian partner (laas-cnrs laboratory and ghent university) and two ukrainian universities (lviv polytechnic national university and national technical university of ukraine in kiev). the main goal of the project is to
the main goal of the project is to received october 6, 2014 corresponding author: andrzej napieralski lodz university of technology, department of microelectronics and computer science, ul. wolczanska 221/223, 90924 lodz, poland lodz university of technology, department of microelectronics and computer science (e-mail: napier@dmcs.pl) 562 a. napieralski et al. bring together scientists from different areas of research (mechanics, electronics, optics, fluidics) to work together on interdisciplinary mems modelling and design. mems operate in the microscale and most mems devices involve phenomena from multiple domains. therefore, to provide in-depth quality research of mems, specialists from all these domains have to cooperate. thanks to the project, researchers can go to the partner universities and work there with specialists in a given field. such newly founded research groups guarantee that the performed design, analysis and simulations take into consideration and adequately model all phenomena which arise in mems devices. one have to underline that the common research in mems area have been conducted already by lut and laas in the frame of barmint project [1]. during the project, several mems devices have been analyzed and simulated, new modelling methodologies have been proposed and new device models have been invented. in particular, the following topics have been researched:  modelling of uncooled microbolometer  modelling of micromembrane  modelling of rf temperature sensor  modelling of microfluidic flow in this paper we describe in detail the work performed on first two devices, namely microbolometer and membrane. 1.1. modelling and simulation of uncooled microbolometer bolometers are used to measure radiation, in particular they are used in thermal cameras to measure infrared radiation [2]. their principle of operation is that the radiation heats up an active element which changes its resistance due to the temperature change. 
the resistance change can then be measured and, based on the result, the intensity of radiation is calculated. thermal cameras usually use arrays of microbolometers in which each one represents one pixel. after reading the radiation incident on all pixels, the camera is able to provide the entire thermal image of the observed scene. the detailed description of microbolometer operation is beyond the scope of this paper. here, we will concentrate on modelling electrical and thermal phenomena in these devices. let us first discuss the thermal domain. the role of the microbolometer is to provide the highest possible temperature change for a given radiation. thus, the surface of the device should be as large as possible and made of a material which has a high temperature coefficient of resistance (tcr). moreover, the heated surface of this material should be thermally separated from the chip surface so that the absorbed heat does not heat up the entire chip. as far as the electrical domain is concerned, the designer has to ensure that it is possible to measure the resistance of the heated element. this most often involves applying a given current through the element and measuring the voltage. thus, the material used should be a conductor. all the mentioned requirements are met in the structure of a mems-based microbolometer (see fig. 1). it can be seen that the structure is composed of a bridge suspended over the substrate and supported by thin legs. thanks to this structure, the bridge is thermally isolated from the substrate. the isolation is of course not perfect and depends on the size of the supporting legs. it can also be observed that there is a thin layer of active material which has a serpentine shape and goes from one leg to another. this layer is the active material: the one whose resistance will be measured. consequently, it is made of a thin conductor with high tcr.
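the requirements listed above (high tcr, thermal isolation through the legs, current-biased readout) can be illustrated with a lumped single-node sketch. this is not the model used in the project; the linear tcr law, the single thermal node and every parameter value below are assumptions chosen only for illustration.

```python
# lumped single-node electro-thermal sketch of a current-biased microbolometer.
# all parameter values are hypothetical and chosen only for illustration.

R0 = 1.0e3      # bridge resistance at ambient temperature [ohm]
TCR = 3.0e-3    # temperature coefficient of resistance [1/K]
G_TH = 1.0e-6   # thermal conductance of the supporting legs [W/K]
C_TH = 1.0e-9   # heat capacity of the suspended bridge [J/K]


def resistance(delta_t):
    """Linear TCR law: R = R0 * (1 + TCR * dT), dT measured from ambient."""
    return R0 * (1.0 + TCR * delta_t)


def simulate(i_bias, p_rad, t_end, dt=1.0e-6):
    """Forward-Euler integration of C*dT/dt = P_rad + I^2*R(dT) - G*dT.

    Returns the temperature rise above ambient [K] and the readout voltage [V].
    """
    delta_t = 0.0
    for _ in range(int(round(t_end / dt))):
        p_joule = i_bias ** 2 * resistance(delta_t)
        delta_t += dt * (p_rad + p_joule - G_TH * delta_t) / C_TH
    return delta_t, i_bias * resistance(delta_t)


def steady_state_rise(i_bias, p_rad):
    """Closed-form steady state of the same model, used as a cross-check."""
    return (p_rad + i_bias ** 2 * R0) / (G_TH - i_bias ** 2 * R0 * TCR)


if __name__ == "__main__":
    rise, v_out = simulate(i_bias=1.0e-4, p_rad=1.0e-7, t_end=10.0e-3)
    print(f"temperature rise: {rise:.3f} K, readout voltage: {v_out * 1e3:.3f} mV")
```

a smaller thermal conductance `G_TH` (thinner legs) increases the temperature rise per unit of absorbed power, while a larger bias current increases both the readout voltage and the self-heating term.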
several materials have been proposed for the role of active material, namely vanadium oxide, titanium, amorphous silicon etc. this thin layer is encapsulated in the membrane, whose role is to maximize the device surface and to absorb as much radiation as possible. naturally, the membrane should be made of an insulator (silicon nitride is most often used) so that it does not interfere with the current flowing through the active layer.
fig. 1 typical microbolometer structure
electro-thermal simulation of microbolometers has to include the coupling between the electrical and thermal domains. the reason is that the joule heat dissipated in the active layer is a function of resistance, while the resistance is itself a function of temperature (and hence of the dissipated heat). therefore, accurate and fast simulation of such structures is difficult. of course, one can use finite element tools like ansys [3] or comsol [4] to obtain detailed simulation results. during the project, we performed thorough simulations of various microbolometer shapes, tested various layer dimensions and used various materials. a sample simulation result for a titanium-based 50×50 µm microbolometer is presented in fig. 2. it presents the maximal transient temperature reached in the microbolometer due to the applied current pulse. note that, on the one hand, the pulse amplitude should be as small as possible to reduce joule heating; on the other hand, the bias current should be as high as possible to increase the sensitivity. consequently, our simulations give a concrete answer to the designer who has to make this trade-off. for example, they show that for this particular microbolometer a current pulse of 0.2 ma amplitude and 100 µs duration raises the temperature to 26 °c, whereas a pulse of 0.25 ma and 150 µs raises it to 30.2 °c [5].
fig. 2 maximal transient temperature due to the applied current pulse
fig. 3 response of a microbolometer for a constant radiation and current pulse
however, the simulation time may be quite long in this case, which is sometimes prohibitive. in some cases, designers need faster and simpler models. therefore, a significant amount of work has been devoted to an analytical electro-thermal model of a microbolometer, which yields results very close to those calculated by ansys in a much shorter time. fig. 3 shows the response of a microbolometer to constant radiation and a current pulse for both the ansys model and the simplified analytical model [6]. it shows that the designed model gives practically identical results to those obtained using the complex ansys model. the simulation time using our model is much shorter, especially in the case of transient simulation with a significant number of time points. we have performed similar comparisons for various microbolometers (various sizes, materials, shapes) and the maximal error of our model with respect to the ansys model was found to be 3%.
1.2. modelling and simulation of micromembrane
membranes are commonly used in many micromachined applications. they serve as the mechanical part of a device that converts an external force into an electrical signal (via membrane deflection) or generates a force when an electrical signal is applied. the wide spectrum of applications has led to many membrane constructions (number of layers, materials used in fabrication) and a wide range of membrane shapes and dimensions. therefore, the simulation of a membrane has become a crucial step in device production. very often fem analysis is used in mems simulation. although it can be very detailed, it is rather time-consuming. as the simulation has to be run repeatedly, numerical simulation can be very inconvenient.
then, the analytical modelling can be a very good alternative, especially when the model is very accurate and allows combining the mechanical domain with other ones [7].
fig. 4 membrane stiffness distribution for square membrane (membrane ratio = 1; histogram of number of samples versus spring constant (n/m) and normalized density versus standard deviation from mean, for si <100> and si <110>)
one of the greatest benefits of analytical modelling can be achieved in statistical simulation. even if we have found optimal parameters of the membrane that give the desired performance of the device, the fabrication process does not guarantee that the real device will meet these requirements. many fabrication steps have tolerances that affect the device performance. the investigation of this influence can be performed using monte carlo analysis. component tolerances are used to generate the parameter distributions and then the simulation is run many times, once for each case, to obtain the distribution of the device performance. a sample simulation was performed for a silicon membrane 200 µm wide and 4 µm thick, with various lengths (from a square to a rectangular membrane). it was assumed that the fabrication process influences the membrane dimensions and the residual stress within it. the distribution of input parameters was generated using the normal distribution and tolerances provided by typical equipment used in the fabrication process. the sample results for a square membrane are presented in fig. 4.
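the monte carlo procedure described above can be sketched in a few lines. the sketch below is not the authors' analytical model: it assumes an isotropic, clamped square plate in the small-deflection regime (centre deflection w = 0.00126·p·a⁴/d), ignores residual stress, and uses illustrative material data and tolerances that are not taken from the paper.

```python
import random
import statistics

# monte carlo sketch of membrane stiffness spread caused by fabrication tolerances.
# assumptions (not from the paper): isotropic clamped square plate, small deflection,
# no residual stress, illustrative material data and tolerances.

E_MODULUS = 169.0e9   # young's modulus [Pa], illustrative value for silicon
POISSON = 0.22        # poisson's ratio, illustrative
ALPHA = 0.00126       # centre-deflection coefficient of a clamped square plate


def stiffness(side, thickness):
    """Pressure per unit centre deflection [Pa/m]: k = D / (ALPHA * a^4)."""
    flexural_rigidity = E_MODULUS * thickness ** 3 / (12.0 * (1.0 - POISSON ** 2))
    return flexural_rigidity / (ALPHA * side ** 4)


def monte_carlo(n_samples, side=200e-6, thickness=4e-6,
                sigma_side=1e-6, sigma_thickness=0.1e-6, seed=1):
    """Draw normally distributed dimensions and return the stiffness samples."""
    rng = random.Random(seed)
    return [stiffness(rng.gauss(side, sigma_side),
                      rng.gauss(thickness, sigma_thickness))
            for _ in range(n_samples)]


def fraction_within_one_sigma(samples):
    """Share of samples closer to the mean than one standard deviation."""
    mean = statistics.fmean(samples)
    sigma = statistics.pstdev(samples)
    return sum(abs(s - mean) <= sigma for s in samples) / len(samples)


if __name__ == "__main__":
    samples = monte_carlo(20000)
    print(f"mean stiffness: {statistics.fmean(samples):.3e} Pa/m")
    print(f"relative spread: {statistics.pstdev(samples) / statistics.fmean(samples):.3f}")
    print(f"within one sigma: {fraction_within_one_sigma(samples):.3f}")
```

because the stiffness scales with the cube of the thickness and the inverse fourth power of the side length, a thickness tolerance of a few percent is amplified roughly threefold in the stiffness spread.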
fig. 4 presents the distribution of the membrane stiffness for two crystallographic orientations. it can be seen that seemingly negligible input tolerances affect the membrane properties significantly. the deviation from the mean value can reach 3 standard deviations and “only” 70% of membranes are located within one standard deviation. depending on the requirements, the tolerances of the fabrication process may lead to the fabrication of useless devices. therefore, statistical simulation can be very useful in estimating production yield. in many applications the membranes are fabricated using the wafer bonding technique [8]. the bonding process usually requires high-temperature annealing of the structure to strengthen the bond. if two different materials are bonded, residual stress appears in the structure, which is usually undesired because it changes the response of the membrane. the evolution of mems fabrication technology nowadays allows the bonding process to be performed at the stress-free temperature. however, this technique does not guarantee that residual stress will not appear; it can only reduce its value significantly. it has to be mentioned that the device operates at variable temperatures, which influence the residual stress value. therefore, it is desirable to investigate this influence in the typical range of operating temperatures of a device. the simulation was performed for a structure that consists of a silicon membrane fabricated on a pyrex surface by bonding performed at 270 °c (known as the stress-free temperature). it was found that the residual stress disappears when the structure is returned to about 13 °c, as near this temperature the deflection of the unloaded membrane changes direction. then, the influence of operating temperature was investigated for temperatures in the range of 0-50 °c. the figures below show the change of membrane stiffness and of normal stress in the centre of the membrane edge (see fig. 5).
fig. 5 membrane stiffness and normal stress within the membrane as a function of temperature (spring constant [kpa/um] and normal stress in the centre of membrane edge [mpa] versus temperature, with and without residual stress)
compared with the membrane with no residual stress, the membrane stiffness varies by up to 2.3%. although this change seems to be negligible, in the case of capacitive read-out the capacitance will vary by up to 10%. the stress within the membrane also varies by up to 10%; in the case of piezoresistive read-out, the response will differ by the same value.
2. behavioral electro-thermal models of merged sic diode
silicon carbide devices are the most promising semiconductor devices for power applications [9]. they offer excellent thermal properties and low on-state resistance together with high voltage capability. this has made possible the manufacturing of high-voltage unipolar devices reaching very high operating frequencies. as a consequence, the most frequently used sic devices are merged pin-schottky (mps) diodes [10]. the mps diode models provided by device manufacturers, the classical spice embedded diode model, as well as physical models, are all unable to accurately reproduce the temperature-dependent behaviour of these devices over a relatively wide range of operating temperatures [11]. however, accurate models are required in order to provide engineers with a reliable tool for the design of robust state-of-the-art power conversion appliances. consequently, a simple and accurate electro-thermal model of the mps diode is necessary. behavioural models have great potential in this regard, in particular when unipolar devices are concerned.
electrical phenomena in schottky diodes during conduction and switching are generally simpler than in pin diodes, which renders behavioural modelling feasible. however, sic mps diodes exhibit nonlinear behaviour when temperature influence is involved [12], which requires the development of dedicated models [10]. moreover, even though switching processes are very short compared to thermal time constants, they may soon become more important as switching frequencies are increased. until now, compact thermal models (ctms) were obtained either by detailed investigation of the device structure, whose physical and geometrical parameters are often unavailable, or by application of the network identification by deconvolution (nid) method, based on the analysis of a single, appropriately processed transient temperature measurement [13]. to develop the electro-thermal mps model, a novel model generation approach can be applied which is based on the nid method and time constant spectrum examination. it offers greater simplicity and better numerical properties while preserving the physical meaning of the obtained ctm [14]. the thermal model is coupled with the electrical part to form the complete electro-thermal model. the proposed approach is demonstrated for the csd20030 common-cathode dual mps rectifier manufactured by cree. it is demonstrated that the behavioural approach enables the derivation of an accurate unipolar power semiconductor device model without any knowledge of its technological parameters. this has a great practical impact, as such data are normally not revealed by device manufacturers. the first sic diodes made available by device manufacturers were subjected to tests which showed serious inconsistencies between simulated and measured behaviour (see [10,15] and also [11]). the research presented in [10, 15] shows that the classical shockley equation [17] cannot be used for the electric domain description.
in the authors’ opinion, these inconsistencies may be due to these devices being manufactured as merged pin-schottky diodes with additional p-islands [18]. in a regression-based analysis of the mps electro-thermal model [19], the temperature t is linearly or quadratically related to the internal series resistance r_s and the intrinsic voltage drop v_intrsc. analyses of the variability of the measured quantities lead to the following relationship describing the static behaviour of the diode under forward bias:
$$v_{fwd}(i_{fwd},t) = (p_1 + p_2 t)\,\ln i_{fwd} + (p_3 + p_4 t) + i_{fwd}\,p_5\left[1 + p_6\,(t - 27\,^{\circ}\mathrm{C}) + p_7\,(t - 27\,^{\circ}\mathrm{C})^2\right] \qquad (1)$$
where p_1,…, p_7 are empirical constant coefficients. equation (1) is equivalent to the form suitable for numerical simulations [16]:
$$i_{fwd}(v_{fwd},t) = \exp\!\left[\frac{v_{fwd} - v_{intrsc}(t) - r_s(t)\,i_{fwd}(v_{fwd},t)}{v_r(t)}\right] \qquad (2)$$
where v_r(t) = p_1 + p_2 ∙ t, v_intrsc(t) = p_3 + p_4 ∙ t, r_s(t) = p_5 ∙ [1 + p_6 ∙ (t − 27 °c) + p_7 ∙ (t − 27 °c)^2]. a similar analysis was performed for the reverse bias, leading to the formula (valid for almost all the diodes)
$$i_{rev}(v_{rev},t) = \beta(t)\,\exp\left[v_{rev}\,\alpha(t)\right] \qquad (3)$$
where α(t) and β(t) are linear functions of temperature. where the influence of temperature on the reverse current is antisymmetric below and above 75 °c (e.g. for the csd04060 diode), exponential and hyperbolic relationships had to be used to represent the way the current varies with temperature [16]:
$$i_{rev}(v_{rev},t) = \exp\!\left(\frac{a + b\,v_{75} + c\,v_{75}^2}{1 + d\,v_{75} + e\,v_{75}^2}\right) \qquad (4)$$
where v_75 is the reverse voltage drop v_rev at t = 75 °c, given by equation (5) as a hyperbolic function of t involving v_rev and the coefficients f, g and h, and a to h are empirical constant coefficients. the sic mps diode total capacitance is not significant for forward polarization except under very high switching frequencies.
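the equivalence of the explicit forward characteristic (1) and the implicit form (2) can be checked numerically. the sketch below evaluates (1) and solves (2) for the current by bisection; the coefficients p1…p7 are hypothetical placeholders (not the fitted values from [16] or any datasheet) and the temperature is assumed to be given in °c, as suggested by the 27 °c offset.

```python
import math

# hypothetical coefficients p1..p7, chosen only to give a plausible curve shape;
# they are NOT the fitted values from the paper or from any datasheet.
P1, P2, P3, P4, P5, P6, P7 = 0.020, 1.0e-4, 0.95, -1.5e-3, 0.05, 5.0e-3, 1.0e-5


def v_r(t):
    """Slope voltage v_r(t) = p1 + p2*t, t in degrees celsius."""
    return P1 + P2 * t


def v_intrsc(t):
    """Intrinsic voltage drop v_intrsc(t) = p3 + p4*t."""
    return P3 + P4 * t


def r_s(t):
    """Series resistance r_s(t) = p5*[1 + p6*(t-27) + p7*(t-27)^2]."""
    d = t - 27.0
    return P5 * (1.0 + P6 * d + P7 * d * d)


def v_fwd(i_fwd, t):
    """Explicit forward characteristic, equation (1)."""
    return v_r(t) * math.log(i_fwd) + v_intrsc(t) + i_fwd * r_s(t)


def i_fwd(v, t, lo=1.0e-12, hi=1.0e3):
    """Solve the implicit equation (2) for the current by bisection.

    Bisection is done on a logarithmic current scale; it relies on v_fwd
    being monotonically increasing in the current (v_r > 0 and r_s > 0).
    """
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if v_fwd(mid, t) < v:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


if __name__ == "__main__":
    for temp in (25.0, 75.0, 125.0):
        v = v_fwd(5.0, temp)
        print(f"t = {temp:5.1f} c: v_fwd(5 a) = {v:.4f} v, recovered i = {i_fwd(v, temp):.6f} a")
```

in a circuit simulator the implicit form (2) is usually the more convenient one, since the solver iterates on the current anyway; the bisection here only stands in for that iteration.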
however, the junction capacitance for low reverse voltages can be described using the space charge layer capacitance equation:
$$c_j(v_d) = c_{j0}\left|1 + \frac{v_d}{v_j}\right|^{-m_j} \qquad (6)$$
where c_j0 is the capacitance of an unbiased junction (v_d = 0), v_j is the junction potential, and m_j is the junction grading coefficient. the thermal behaviour of semiconductor devices and cooling assemblies can be predicted by compact thermal models (ctms), which can be derived using the structural or the behavioural approach [20], [14], [23]. the detailed spice behavioural models of infineon and cree mps devices are presented in [10], [21], [16], [15]. the proposed electro-thermal model of sic mps diodes is in full agreement with the measured forward and reverse characteristics of real mps devices (fig. 6, fig. 7) and produces much better results than the models provided by the device manufacturers.
fig. 6 comparison of the measured sic diode forward characteristics with the proposed model. temperatures: 25°c (green), 75°c (blue), 125°c for c3d04060 (black). y-axis: vfwd [v]; x-axis: ln(ifwd); the measurement deviation is presented using error bars. ([20]).
fig. 7 the example dynamic behaviour of c3d04060 diode (source [20]).
3. from dll simulator to dream supercomputer
3.1 introduction: historical outline
necessity is the mother of invention. this well-known saying perfectly renders the idea of the dynamically reconfigurable polymorphic supercomputer (dream). the idea was born in 2010, in the department of microelectronics and computer science at lodz university of technology in poland.
the conception of this idea took place at the same university about ten years earlier, when two scientists from the department of molecular physics, working on simulations of some phenomena in polymers, wished to accelerate their calculations. for their simulations they were using the dynamic lattice liquid (dll) algorithm, which, as was expected and finally proved, can be efficiently parallelised. in order to realise this wish, the physicist-inventors started to draw the schematics of an electronic device, composed of discrete elements, dedicated to executing the dll algorithm in a fully parallel manner. soon they realised that the task was very ambitious and that in the era of integrated circuits an approach based on discrete components is not efficient. however, a solution based on dedicated application specific integrated circuits (asics) would be relatively expensive and inflexible. the idea of building a parallel computing machine equipped with typical microprocessors was also not very promising. the simulations of polymers by means of the dll algorithm require many, though very simple, logic units called nodes. each node (representing e.g. one monomer) corresponds to a lattice point of the face-centred cubic (fcc) network. assigning one node to one microprocessor would result in a huge number of inefficiently used computing cores. the solution to this stalemate came from the friendly department of microelectronics and computer science and was, like most inventions, very simple: instead of discrete elements, dedicated asics or microprocessors, reconfigurable devices such as fpgas (field programmable gate arrays) should be used [22, 23]. in such devices it is possible to implement many nodes (lattice points) and to simulate their behaviour simultaneously. in this way the size of the simulator can be significantly reduced. furthermore, the internal architecture of the computing elements can be optimized to efficiently achieve the functionality of e.g. simulated monomers.
3.2 first prototype: dll simulator
in order to prove the feasibility and efficiency of the proposed solution, both departments, in close cooperation, developed a prototype of a simulator dedicated to the dll algorithm. the prototype was named dll simulator. its capacity, defined as the number of implemented nodes, is insufficient to perform any full-scale simulation. the purpose of building this prototype was just to prove the concept of parallel dll algorithm execution in an array of fpgas. the dll simulator (fig. 8) is composed of 7 pcbs (printed circuit boards). 6 of them contain 3 fpgas (xc3s4000) each and constitute the resources for the simulation nodes. the 7th board is equipped with 2 fpgas (xc2s150e) and is responsible for synchronizing the simulation and for the communication of the whole system with a pc (personal computer). this board initializes the simulation and collects the simulation results. the implementation of the dll algorithm in the dll simulator is described in detail in [22]. below, the most general aspects of this implementation are presented in order to simplify the understanding of the dll simulator construction.
fig. 8 dll simulator (simulation boards with fpgas and the control board)
in each fpga on the simulation board a 2×6 array of nodes is implemented, thus the board represents a two-dimensional array containing 36 (6×6) nodes (fig. 9). 6 boards correspond to a three-dimensional array containing 216 (6×6×6) nodes.
fig. 9 nodes and connections on simulation board (connections in 2d on the pcb, with cyclic boundary conditions)
the lattice of nodes in the dll simulator represents the face-centred cubic (fcc) network with the coordination number of 12.
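the coordination number of the fcc network and the node counts quoted above can be reproduced with a short sketch. it uses the standard integer representation of the fcc lattice (sites with an even coordinate sum); grouping the neighbours by close-packed (111) layers is an assumption about how lattice layers could map onto boards, and the function names are hypothetical, not taken from the simulator.

```python
from collections import Counter
from itertools import product

# fcc lattice sketch: sites are integer triples (x, y, z) with x + y + z even.
# the nearest neighbours of a site are reached by all permutations of (+-1, +-1, 0).
NEIGHBOUR_OFFSETS = [o for o in product((-1, 0, 1), repeat=3)
                     if sorted(map(abs, o)) == [0, 1, 1]]


def neighbours(site, period):
    """Nearest neighbours of `site` under cyclic boundary conditions.

    `period` must be even so that wrapping preserves the even coordinate sum.
    """
    return [tuple((c + d) % period for c, d in zip(site, offset))
            for offset in NEIGHBOUR_OFFSETS]


def layer_split():
    """Count neighbours per close-packed (111) layer relative to the node's layer.

    Key 0 is the node's own layer, +1 the next layer and -1 the previous one.
    """
    return Counter(sum(offset) // 2 for offset in NEIGHBOUR_OFFSETS)


# node counts quoted in the text for the first prototype
NODES_PER_FPGA = 2 * 6
NODES_PER_BOARD = NODES_PER_FPGA * 3
NODES_TOTAL = NODES_PER_BOARD * 6

if __name__ == "__main__":
    print("coordination number:", len(NEIGHBOUR_OFFSETS))
    print("neighbours per layer:", dict(layer_split()))
    print("nodes per board:", NODES_PER_BOARD, "total:", NODES_TOTAL)
```

under this layering, each node has six neighbours in its own layer and three in each adjacent layer, so only the inter-layer links would need to leave a board.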
in other words: during the simulation process each node exchanges data with its 12 closest neighbours: 6 on the same board, 3 on the next board and 3 on the previous board. dedicated wires are provided for this exchange in the dll simulator. although the dll simulator proved the idea of a parallel implementation of the dll algorithm in an array of fpgas, its construction was fixed and could not be extended to perform realistic physical or chemical simulations. in practical applications the dll model should have about 10^6 (100×100×100) nodes.
3.3 second prototype: mdll simulator
the very promising results of the implementation of the dll algorithm in the dll simulator encouraged the authors to design a new simulation board, which could constitute a basic building block of the full-scale dll simulator. therefore the second prototype of the dll simulator (fig. 10) was designed and developed.
fig. 10 mdll simulator
the second prototype, the mdll simulator finished in 2012, is composed of 27 simulation boards and one control board serving the same function as in the dll simulator. it is still too small for practical applications, but it allows testing the functionality of the simulation board, which can be duplicated many times and used in the construction of a powerful computing machine. each simulation board contains 5 fpgas: 4 of them, called simulation fpgas (xc6slx75), constitute the resources for the nodes of the dll algorithm and the 5th fpga (xc6slx45t), called the control fpga, manages the operation of the simulation fpgas (fig. 11).
fig. 11 simulation board in second prototype (four simulation fpgas and one control fpga)
the simulation boards are connected in panels.
to reduce the number of wires connecting the simulation boards in a panel, which is very important in any big computational machine for feasibility and reliability reasons, the nodes in the mdll simulator are not directly connected, as they were in the dll simulator. the outputs of nodes which must be transferred outside the fpga are grouped and transmitted serially. on the receiver side they are deserialised, ungrouped and delivered to the appropriate inputs of the destination nodes. sending the outputs and receiving the inputs of each node is performed simultaneously, using a dedicated transfer protocol to achieve high data throughput. the implementation of the dll algorithm on the mdll and dll simulators differs mainly in the organization of nodes in the simulation fpga and in the communication among nodes placed in different chips. in dll a dedicated one-bit routing path was devoted to each connection among such nodes. in mdll the connections are merged in groups and transferred using fast serial links. the number of nodes in one fpga is configurable. for simple simulations it is possible to reduce the architecture of a node and to include more nodes in one chip.
3.4 dream supercomputer
the work on the dll and mdll simulators demonstrated the feasibility and efficiency of the fpga-based dll simulator. it gave an impulse to build such a simulator at full scale within bionanopark+, which is currently coming into existence in lodz. simultaneously, the very positive conclusions which emerged during the dll and mdll development encouraged the constructors of both prototypes to find more applications for the powerful and flexible computing machine being constructed (the dll simulator in bionanopark+ will be composed of over 25 000 fpgas). in this way the idea of the dynamically reconfigurable polymorphic (dream) supercomputer appeared.
the fundamental aspect of this idea is to provide scientists with the possibility of using the dream fpgas without the need to define their functionality by means of hdl, as is the case for dll and mdll. therefore an automatic conversion (compilation) from a high-level programming language (e.g. c++) to an fpga configuration bitstream has been proposed. furthermore, in order to equip the dream supercomputer with some extraordinary features supporting unconventional simulations, bio-inspired mechanisms are planned to be implemented in it. these mechanisms (such as dynamic routing or self-replacement) will transform the set of fpgas and cables into fault-tolerant, evolvable hardware ready to adapt itself to the problem to be solved.
4. asic design for commercial applications
4.1. industry oriented asic design and research
dmcs undertakes numerous industry-oriented activities. cooperation with external enterprises oriented towards application specific integrated circuit (asic) design began with a series of contracts with the tritem microsystems gmbh company, conducted in cooperation with the institute of electron technology in warsaw. dmcs team participants had the valuable opportunity of working in typical commercially oriented asic design facilities abroad, owned by a major player on the ic-based solution market. such experience is a big asset, especially in a country like poland, which generally lacks its own industrial design centres and modern foundries. the result of the mentioned cooperation was a set of asics ready for preproduction phase tests. some of these designs were introduced to mass production and are now available on the market.
4.2. continuation of industry-oriented research
during the abovementioned projects, dmcs staff gained insight into work organisation, commercial design resource management and design methodology. our staff also had the opportunity of undertaking real-life design challenges. thus, new ideas emerged during the projects.
some of them were checked and introduced during commercial projects; some were evaluated much later, though they provided very interesting results. one specific solution introduced during commercially oriented projects is a high-voltage unity-gain voltage buffer. it was designed as a hybrid of two typical complementary voltage buffers, the source follower and gate follower structures (fig. 12). the very useful set of properties of this new structure enabled the application of several untypical signal processing solutions in hv integrated environments [24]. the buffer (fig. 13) was granted a patent by the polish patent office in 2012 [25]. this is a rare achievement, because what was patented was not a layout of the structure but the scheme of transistor-level electrical connections and their application in forming the functionality of the buffer.
fig. 12 operation rule of the patented unity-gain buffer; a) source follower, b) gate follower, c) combined follower
fig. 13 the patented unity-gain high-voltage buffer; a) simplified version, b) high-quality full version
when the industry oriented projects and asic designs were completed, several design ideas were further studied and advanced. several interesting circuits were elaborated and published in isi list journals. much stress was put on overcoming problems with precise signal processing in high-voltage asics. effective means of inserting current-mode circuitry into voltage-mode signal paths were studied and implemented. it must be stressed that the simplicity and precision of the devised solutions were possible due to the application of the previously patented voltage buffer.
fig. 14 voltage/current/voltage converter for current-mode circuitry integrated into high-voltage voltage-mode signal paths
fig. 15 current-mode trapezoidal waveform generator and edge-rounder for high-voltage systems [29], based on unity-gain buffer [24] and current-mirrors (shown as boxes)
a set of simple but efficient voltage/current and current/voltage integrated converters was introduced [26] (presented in fig. 14), along with several current-mode function blocks applicable to high-voltage cmos and soi integrated systems. these function blocks include current-mode versions of waveform inverters, amplifiers and dc-level shifters [26], [27]. they can all be placed inside high-voltage signal paths owing to the design of a voltage/current/voltage conversion system that uses no more than the devised buffer combined with a simple but efficient voltage/current/voltage converter tailored precisely for cmos/soi integrated systems [27].
fig. 16 current-mode high-voltage switches for voltage (a, b) and current (c) applications
a set of current-mode trapezoidal waveform generators with fully controlled waveform parameters was studied and published [28]. universal current-mode low-voltage and high-voltage trapezoidal waveform generators with edge-rounding functionality were invented and published [29]. this circuitry makes possible precise control over the frequency, voltage range and slew rate of the waveform in high-voltage integrated systems with the use of nothing more than unity-gain buffers and current mirrors (fig. 15). for high-voltage signal switching applications, a set of application-optimized high-voltage current-controlled switches for voltage and current signal paths has been devised and studied [30]. some of them are presented in fig. 16.
4.3. continuation of industry-oriented research
dmcs continues commercially-oriented asic design activities. proven and new industry entities enter various cooperation schemes with our department.
together with the institute of electron technology and tritem microsystems gmbh, our department has applied for funds to build and test a prototype system for remote and wireless identification, access control and supervision. several previously designed circuits are planned to be used in the design of asics related to this activity. dmcs has also cofounded a consortium with astri polska ltd and the center of space research of the polish academy of sciences, and has been applying for european space agency (esa) funds to build asics for use in the space industry. esa expressed its interest and the consortium has been granted initial funds for an in-depth feasibility study on a specialized asic for space applications. also, a three-year scientific project related to the study of modelling and simulation of electro-magnetic phenomena in modern 3d integrated systems has recently been granted. though this is a scientific project, it is expected to implement several functional blocks initially developed for commercial applications. dmcs takes part in a number of projects at the application stage in which it is expected to design, implement and test various complex systems in the form of asics.
5. development of diagnostic use cases for the iter organisation
5.1. the iter organisation diagnostic and control systems
as global energy consumption increases, the provision of efficient and clean energy sources becomes an urgent necessity. one of the most promising ways of energy production is the use of nuclear fusion in thermonuclear reactors, such as tokamaks, in which the plasma is confined in a toroidal shape using magnetic fields. both the substrates and products of the deuterium-tritium fusion are not radioactive and are environmentally friendly. the project of building the world’s largest thermonuclear reactor is called iter.
it is a result of international cooperation of the european union, india, japan, korea, russia, china and the united states. it builds on the experience gained from recent experiments, such as the joint european torus (jet) and tore supra, and it will be the most technologically advanced tokamak so far. it is expected to produce 500 mw of fusion power with a power gain factor of at least 10. the machine is now being constructed at the iter site in cadarache, france. as the site construction progresses, the need to develop a sophisticated control and monitoring system is emerging. the technology resulting from the iter project is expected to be ready for commercialization in 2050, but the tokamak assembly should start in 2015, and the concept for its instrumentation has to be finished by then. building and operating the machine requires a multidisciplinary effort not only by physicists but also by engineers. since the eu aims to reduce co2 emissions, this new technology of energy production is especially crucial. the control systems of modern tokamaks utilize a variety of telecommunication standards. jet, the largest fusion device in europe, is operated via an old atm network. tore supra relies on the scramnet real-time shared memory and the vme standard. in both machines the data acquisition is performed with pxi, eurocard, cpci, vme and advancedtca (atca) systems. among these architectures, pci extensions for instrumentation (pxi) has gained popularity and is also considered one of the fundamental standards of the iter project. the i&c systems of iter, however, require an architecture providing higher reliability, availability, maintainability and inspectability (rami) than can be achieved with pxi. the xtca (atca and mtca) standards [31] [32] [33] [34] [35] seem to fulfil this requirement and are being successively added to the iter pcdh catalogue.
the advantage of the pxi architecture (national instruments) is the wide availability of various i/o and processing modules, especially on markets not reached by competing technologies. also, the labview graphical programming environment is easy to use for people with limited programming experience. although the pxi architecture is widely adopted and well tested, it also has some drawbacks:
• weak support for gnu/linux and other unix-like operating systems,
• significant limitation on the type and number of backplane interfaces,
• no possibility of providing an effective redundancy scheme,
• lack of support for hardware interlock mechanisms.
the atca standard was developed for demanding telecommunication applications and later adapted for physics experiments. it concentrates on providing a powerful computing platform with high reliability. due to the high price of standard-compliant components, it gains popularity slowly, mainly in the usa and europe. the authors have gained experience in the field of atca-based systems by building a proof-of-concept daq system for iter composed of off-the-shelf commercial components. the last considered standard is mtca. this standard is similar to pxi, but it offers better performance and flexibility. the advanced mezzanine card (amc) modules plugged into an mtca shelf may contain a variety of backplane interfaces, support a hot-swap mechanism and offer advanced monitoring and module management capabilities. data transmission between the amc modules takes place over gigabit ethernet, pci express, sata, serial rapidio and similar high-speed serial interfaces. due to their high reliability and cost-effectiveness, mtca-based systems are gradually gaining popularity in industrial and scientific control systems [36] [37] [38] [39]. the potential of the mtca standard was already noticed by the world’s leading physics laboratories.
in response to this, the pci industrial computer manufacturers group (picmg) announced the establishment of the xtca for physics coordinating committee in 2009. the sub-committee provides extensions and modifications to the plain telecommunication standard in order to adapt it for experimental research machines and detectors in such diverse fields as astronomy, high energy, photon, fusion and medical physics. the resulting subsidiary specification, mtca.4, is now under active development and receives contributions from a number of well-known laboratories and institutions such as iter, desy, cern and many others. members of the dmcs team are also involved in the development of the mtca.4 standard and have been working actively in picmg xtca for physics since 2009. the mtca.4 specification introduces many improvements that are especially important for iter [52]. for example, it enables some global-scope signals (e.g. timing events, interlocks, additional clocks) to be transferred freely between cooperating modules. it also introduces the micro rear transition module (mrtm), which can effectively double the space available on the module and allows connecting signals from both sides of the chassis. the picmg expects that mtca.4 will be the common standard for physics experiments of the future. the authors believe that this standard can already offer a well-suited, complete solution for iter, although similar systems have not been built before using this architecture. the iter tokamak requires more than 150 different plant systems. most of them are part of the i&c subsystem, which can be logically divided into two layers: central coordination and local plant systems. the primary goal of the iter i&c system is to provide fully integrated and automated control for the thermonuclear reactor. the most important part of i&c is the data acquisition system, which should collect signals from a large number of digital/analogue channels (about 4000) and digital cameras (about 200).
since these signals come from different physical sources, they span a large range of different sampling frequencies (from khz to ghz), resolutions (from 8 to 24 bits) and signal conditioning techniques (table 1). therefore the task of developing such a data acquisition system is very demanding and requires pre-processing and processing of data on various hardware platforms (fpga, gpu, cpu, etc.).
table 1 summary of example signal sources and required data processing (ms/s and gs/s denote mega- and gigasamples per second)
measurement group | data io | signal processing
magnetics | 1400 adc (1 ms/s), 240 adc (10 ms/s) | fpga / gpu / cpu
dosimetry and fusion products | 50 adc (100 ms/s) | fpga / cpu
vis/ir cameras | 24 cameras (1 khz frame rate, 8 gb/s per camera) | fpga / cpu
optical (ex. lidar) | 150 adc (20 gs/s) | fpga / gpu
imaging spectroscopy | ~200 cameras and detector arrays (7 gb/s) | fpga / cpu
spectroscopy and neutral particle analyzer | ~50 adc (1 gs/s) | fpga / cpu
bolometers | ~500 adc (1 ms/s) | fpga / cpu
5.2. diagnostic use cases
most of the extremely complex iter diagnostics systems are provided by the domestic agencies (das) and their partners. at their request, the iter organisation (io) has created several diagnostics use case examples to enhance the understanding of diagnostics plant system i&c and the associated deliverables. the use cases come complete with documentation and implementation, further helping the das, their suppliers and diagnostic responsible officers to meet the iter diagnostics requirements [40]. the department of microelectronics and computer science has prepared two such use cases, one in the atca and one in the mtca.4 form factor.
5.2.1. data acquisition use case in the atca form factor
the data acquisition system designed and built by the authors is based on the atca and amc standards. all the elements that comply with those standards are off-the-shelf devices. no in-house atca or amc hardware has been made.
this is due to iter policy and is meant to ensure that such equipment is always available and that proper technical support is provided for it. the system block diagram is presented in figure 17 [41]. all the data communication in the system is based on 1 and 10 gigabit ethernet. the input of the system consists of a number of tews tamc900 modules. every card is divided into two logical submodules, each of them sampling up to four channels at a rate of up to 50 ms/s. two amc modules reside in one emerson atca-7301 carrier board that connects to an emerson atca-f120 10 gb ethernet switch over the backplane. the receiving side of the system consists of an emerson atca-7360 computation blade where the daq server runs. the received stream of data is forwarded to an external data archiver (backup system) via a 10 gbe connection. simultaneously, data is sent to a tesla s1070 computation blade via a pcie x8 connection. in case of failure of the 10 gbe uplink to the data archiver or of the pcie connection to the computation blade, data is stored in a dual hdd buffer. the hdd buffer is configured as a software raid 0 array. the data is sent from the hdd buffer as soon as the connectivity is recovered. a photograph of the system is presented in figure 18. the system is able to continuously process and forward to the archiving system the 800 mb/s data stream coming from 5 modules simultaneously without dropping any data.
fig. 17 block diagram of the daq system with estimated throughputs.
diagnostics systems based on direct imaging are now widely used in tokamaks for both real-time plasma control and off-line physics studies. the visible light emitted by the plasma can be used to monitor the plasma position during operation, as well as to detect some transient events, such as flying debris that could degrade or interrupt the plasma unexpectedly if not mitigated.
the infrared light is also very important for measuring the surface temperature of the plasma-facing components subject to high heat fluxes (several mw/m^2) and particle fluxes. the early detection of overheating areas, called hot spots, is of primary importance for the protection of the machine during plasma operation, in order to avoid component damage such as melting, or even leaks of the water-cooling systems installed behind the first wall components. to this end, images are analysed in real time (within several ms) to detect, identify and recognize abnormal events, which are then reported to the central control systems. image processing techniques are used to recognize spatiotemporal patterns of the expected thermal events (i.e. qualitative analysis). then, adequate actions can be taken to reduce the overheating of components or to change the plasma state to the safest conditions. considering the high complexity of the machine geometry and plasma equilibrium, all of the in-vessel surface must be monitored with a resolution high enough to detect every local hot spot, which may be no more than several centimetres in diameter. in the case of iter, this means that the cameras used for the machine protection function must cover up to 640 m^2 from a distance of up to 10 meters. cameras are also used for the understanding of plasma-wall interactions, e.g. to study the turbulence in the edge plasma, close to the vessel. in this case, the temporal resolution can be very high (200 kfps) and the system must support the streaming of, as well as the access to, large amounts of imaging data. recent applications of imaging networks in tokamaks show that the combination of data from several cameras is very promising for 3d volume reconstruction (e.g. tomography), provided that the data are well calibrated and synchronized.
fig. 18 hardware installed in the iter cubicle – front view
5.2.2.
image diagnostics use case in the mtca.4 form factor
an image acquisition system (ias) is composed of a digital camera connected to a frame grabber card, an image processing module and a data transmission system [42][43][44][45][49]. the acquired images are sent to the image processing system. the system distributes data for further processing and archiving. the processed data are sent using a low-latency connection to the machine control or protection system [53]. the buffered images with attached metadata are sent for archiving via the high-throughput connection. the metadata describes the collected data (image resolution, bit depth, frame rate, etc.) and precisely defines when the images were created. the global synchronization network delivers a reference clock and a trigger signal that define when the images are acquired and allow timestamps to be calculated. a block diagram of the ias is presented in fig. 19. the ias based on the mtca.4 specification consists of:
• digital cameras connected to frame grabber modules,
• camera link receiver mezzanine module [54],
• frame grabber modules with local processing power,
• synchronization and timing distribution module connected to the time communication network (tcn),
• image processing module based on external industrial computers with gpus,
• high-throughput network links to the data archiving network (dan) and synchronous data network (sdn).
fig. 19 a block diagram of ias implemented in mtca.4.
the flow of the digital data starts with the camera. data from the camera is transferred to a dedicated frame grabber module, realized as an advanced mezzanine card (amc) hosted in the mtca.4-compliant shelf. the shelf is connected to an external cpu module using the pcie cable link. video data from the frame grabber is transferred using dma directly to the host computer memory. from there, the data are made available through two 10 gb/s ethernet connections. the ias is installed in the iter codac technical room.
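as a rough illustration of the metadata and timestamping scheme described above, the following python sketch attaches metadata to a frame and derives its timestamp from the trigger time and the frame index. the names `FrameMetadata`, `frame_timestamp_ns` and `frame_size_bytes` are hypothetical, and the assumption that frames follow the trigger at a fixed rate is made for this sketch only; neither is taken from the ias implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrameMetadata:
    """Metadata attached to one buffered image (hypothetical layout)."""
    width: int          # image width in pixels
    height: int         # image height in pixels
    bit_depth: int      # bits per pixel
    frame_rate_hz: int  # nominal frame rate
    timestamp_ns: int   # acquisition time on the synchronized time base


def frame_timestamp_ns(trigger_ns: int, frame_index: int, frame_rate_hz: int) -> int:
    """Timestamp of frame `frame_index`, assuming frames follow the
    trigger at a fixed rate (an assumption of this sketch)."""
    return trigger_ns + (frame_index * 1_000_000_000) // frame_rate_hz


def frame_size_bytes(meta: FrameMetadata) -> int:
    """Payload size of one frame, rounding each pixel up to whole bytes."""
    bytes_per_pixel = (meta.bit_depth + 7) // 8
    return meta.width * meta.height * bytes_per_pixel


# Example: fourth frame (index 3) of a 1 kHz stream triggered at t = 1000 ns.
ts = frame_timestamp_ns(trigger_ns=1_000, frame_index=3, frame_rate_hz=1_000)
meta = FrameMetadata(width=1024, height=1024, bit_depth=10,
                     frame_rate_hz=1_000, timestamp_ns=ts)
```

computing the timestamp from the frame index, rather than accumulating a rounded frame period, avoids drift when the frame rate does not divide one second evenly.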
the nat-mch-phys mch fabricated by nat with the rtm pcie uplink (pcie x4, gen. 2) was used. the maximum theoretical bandwidth of the pcie x4, gen. 1 connection on the frame grabber card is 8 gbit/s. to evaluate the performance of the dma module on the frame grabber card, the switch hierarchy, the root complex and the software driver, a special performance testing module, which sends test data at the maximum possible speed, has been implemented in the firmware. using this module, the maximum achievable data rate from a single module has been measured as 800 mbyte/s. this result is fully satisfactory for a system with a single camera, as it is equal to the camera link theoretical maximum of 6.4 gbit/s. the link between the pcie switch on the mch and the external cpu is currently limited to pcie x4, gen. 2. this allows transferring payload data with a theoretical throughput of up to 1.6 gbyte/s (12.8 gbit/s). in this configuration image acquisition can be done from two frame grabber cards running almost at full link saturation. the system with two frame grabber cards has also been tested. in this case, the maximum throughput from a single amc module was limited to about 6.2 gbit/s. one of the key issues in the integration of all control and diagnostic subsystems is the distribution of the reference clock and precise synchronization. in the tokamak the reference time is distributed via the dedicated time communication network (tcn) using the ieee 1588-2008 protocol, called precision time protocol (ptp). the assumed synchronization accuracy ensured by this solution is 50 ns rms. because a ptp-based network is used, every subsystem needs to be equipped with a timing receiver capable of receiving the reference time from the tcn, generating synchronous clock and trigger signals, and timestamping external signals.
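the offset estimate at the heart of ptp can be sketched as follows. this is the standard delay request-response calculation of ieee 1588, which assumes that the path delay is the same in both directions; the function name and the example timestamps are illustrative only and do not describe the firmware of any particular timing receiver.

```python
from typing import Tuple


def ptp_offset_and_delay(t1: int, t2: int, t3: int, t4: int) -> Tuple[float, float]:
    """Return (slave clock offset, mean path delay) in the unit of the inputs.

    t1: master sends Sync          (read on the master clock)
    t2: slave receives Sync        (read on the slave clock)
    t3: slave sends Delay_Req      (read on the slave clock)
    t4: master receives Delay_Req  (read on the master clock)
    """
    master_to_slave = t2 - t1  # equals delay + offset
    slave_to_master = t4 - t3  # equals delay - offset
    offset = (master_to_slave - slave_to_master) / 2
    delay = (master_to_slave + slave_to_master) / 2
    return offset, delay


# Example in ns: the slave clock runs 150 ns ahead, one-way delay is 400 ns.
offset, delay = ptp_offset_and_delay(t1=1_000, t2=1_550, t3=2_000, t4=2_250)
```

a timing receiver would feed such an offset into a control loop that steers its local oscillator; any asymmetry between the two path directions appears directly as an offset error, which is why cable and driver delays have to be calibrated.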
as there are no commercially available solutions that could be used in mtca-based iter diagnostic systems, it was necessary to design a new mtca.4-compliant timing module providing support for the ptp and ensuring the required synchronization accuracy [46]. the module is based on a recent spartan-6 fpga circuit from xilinx. the programmable device hosts a complex microprocessor system built around the microblaze core. the firmware image is too large to fit in the fpga’s integrated memory and is hence stored and executed in the external ddr2 sdram. the fpga bitstream is loaded from the external spi flash memory. the ram is preloaded on system start-up using the contents of the spi flash memory. the hardware structure of the ptm module is presented in fig. 20. the ptm-1588 module accesses the timing network through a gigabit ethernet interface with the regular 8p8c modular connector. the module not only synchronizes its internal counters with the ptp master, but also provides synchronized clocks. the module reference frequency is generated by the oven controlled voltage controlled crystal oscillator (ocvcxo). the clock phase correction is achieved by manipulating the ocvcxo frequency. its frequency can be shifted within a 10 ppm range using an external tuning voltage provided by the fpga-controlled dac. the module produces a 100 mhz clock and a pulse per second (pps) signal on the front panel output, and both 10 mhz and 100 mhz clocks on the backplane. apart from that, there are 8 programmable lines on the backplane and 2 on the front panel that can be used for the generation of future time events and the timestamping of external signals. the module is configured and operated mainly by means of the pcie interface. the board is manageable through the intelligent platform management interface (ipmi) protocol thanks to a custom-developed module management controller (mmc), analogous to the one presented in [47], [48], [50], [51].
this subsystem is responsible for monitoring the module health and maintaining its state. the ptm-1588 board has been tested at iter, using the gps-synchronized grandmaster clock symmetricom xli and three cascaded hirschmann mar1040 switches. at the same time, the pps output of the grandmaster clock was connected to the front-panel input of the board using a 50 ohm coaxial cable. the delay of the cable and of the input and output drivers has been calibrated by connecting the pps output to the fte input via a cable and timestamping it locally. the pps error has been measured for 237230 samples (the test lasted 2.7 days). the rms value of the error was 11.7 ns.
fig. 20 precise timing module – hardware structure
6. conclusions
in this paper selected research topics pursued at dmcs tul have been presented. first, several mems (micro electro-mechanical systems) devices have been analysed and simulated. new modelling methodologies have been proposed and some new device models (microbolometer and micromembrane) have been developed. next, a new model of sic mps (merged pin-schottky) diodes has been proposed. to develop the electro-thermal mps model, a novel model generation approach has been applied, which is based on the nid method and time constant spectrum examination. it offers greater simplicity and better numerical properties while preserving the physical meaning of the obtained ctm. the thermal model is coupled with the electrical part to form the complete electro-thermal model. the proposed approach has been demonstrated for the csd20030 common-cathode dual mps rectifier manufactured by cree. the newly proposed model of sic mps (merged pin-schottky) diodes is in full agreement with the real mps devices. this work will be continued within the framework of the european project adept (advanced electric powertrain technology).
its aim is to produce a virtual development environment for electric propulsion systems for evs and hevs (electric and hybrid electric vehicles). it is expected that sic devices will be heavily used in these applications to reduce power loss and cooling needs, thus enabling an extension of the vehicle’s operating range. however, reliable device models are needed for efficient circuit design and optimization. the idea of the dynamically reconfigurable polymorphic supercomputer (dream) was born in dmcs in 2010. the simulations of polymers by means of the dll algorithm require many logic units called nodes. instead of discrete elements, dedicated asics or microprocessors, reconfigurable devices such as fpgas (field programmable gate arrays) have been used. the fundamental aspect of this idea is to provide scientists with the possibility to use the dream fpgas without the need to define their functionality by means of hdl, as is the case for dll and mdll. therefore an automatic conversion (compilation) from a high-level programming language (e.g. c++) to the fpga configuration bit-stream has been proposed. dmcs undertakes numerous industry-oriented activities. several interesting circuits were elaborated and published in isi list journals. dmcs continues commercially-oriented asic design activities. proven and new industry entities enter various cooperation schemes with our department. together with the institute of electron technology and tritem microsystems gmbh, our department has applied for funds to build and test a prototype system for remote and wireless identification, access control and supervision. several previously designed circuits are planned to be reused in the design of asics related to this activity. dmcs has also co-founded a consortium with astri polska ltd and the space research centre of the polish academy of sciences, and has been applying for european space agency (esa) funds to build asics for use in the space industry.
esa expressed its interest and the consortium has been granted initial funds for an in-depth feasibility study on a specialized asic for space applications. a three-year scientific project with potential commercial applications has recently been granted to dmcs. its research part focuses on modelling and simulation of electromagnetic phenomena in modern 3d integrated systems. practical applications will be driven by the implementation of several function blocks designed to fulfil commercial demands. finally, the project of building the world’s largest thermonuclear reactor, iter, has been presented. the control systems of modern tokamaks utilize a variety of telecommunication standards. dmcs proposed to apply the xtca (atca and mtca) standards in order to fulfil the requirements of the project. for real-time plasma control and off-line physics studies, a diagnostics system based on direct imaging has to be developed. the visible light emitted by the plasma can be used to monitor the plasma position during operation as well as to detect some transient events, such as flying debris that could degrade or interrupt the plasma unexpectedly if not mitigated. the infrared light is also very important for measuring the surface temperature of the plasma-facing components subject to high heat fluxes (several mw/m^2) and particle fluxes. images must be analysed in real time (within several ms) to detect, identify and recognize abnormal events, information about which will be provided to the central control systems. cameras are also used for the understanding of plasma-wall interactions, e.g. to study the turbulence in the edge plasma close to the vessel. in this case, the temporal resolution can be very high (200 kfps) and the system must support the streaming as well as the access to large amounts of imaging data.
recent applications of imaging networks in tokamaks show that a combination of data from several cameras is very promising for the 3d volume reconstruction (e.g. tomography), provided that data are well calibrated and synchronized. references [1] esteve d., alderman j., cane c., courtois b., glesner m., napieralski a., rencz m., samitier j., troccaz j., cinquin p., dillman j.: “basic research for microsystems integration”, cepadus-editions, toulouse, france, 1997, s. 266 isbn 2-85428-465-8. [2] bhan r. k., saxena r. s., jalwania c. r., and lomash s. k.: “uncooled infrared microbolometer arrays and their characterisation techniques”, def. sci. j.59, pp. 580–590 (2009) [3] ansys® workbench, http://www.ansys.com [4] comsol® multiphysics, http://www.comsol.com [5] zajac p., szermer m., maj c., zabierowski w., melnyk m., matviykiv o., napieralski a., lobur m.: “study of dynamic thermal phenomena during readout of uncooled titanium-based microbolometer”, memstech, 16-20 april 2013, polyana, ukraine, pp. 40-42 [6] janicki m., zajac p., szermer m., napieralski a.: “compact thermal modeling of microbolometers”, 15th international conference on thermal, mechanical and multi-physics simulation and experiments in microelectronics and microsystems (eurosime), 7-9 april 2014, ghent, belgium, pp. 1-4 [7] maj c., olszacki m., al bahri m., pons p.: “analytical model of electrostatic membrane-based actuators”, 10th international conference on thermal, mechanical and multi-physics simulation and experiments in microelectronics and microsystems (eurosime), 26-29 april 2009, delft, netherlands, pp. 1-6 [8] suni t.: “direct wafer bonding for mems and microelectronics”, vtt publications, espoo, finland, 2006 [9] m. bhatnagar and b. j. baliga, “comparison of 6h-sic, 3c-sic, and si for power devices,” ieee trans. electron devices, vol. 40, no. 3, pp. 645–655, 1993. 
[10] zubert m., napieralska m., jabłoński g., starzak ł., janicki m., napieralski a.: static electro-thermal model of sic merged pin schottky diodes. w: 10th international seminar on power semiconductors isps’10, prague, czech republic, 1-3 sep 2010, prague, ed. v.benda, 2010, s.282. pp.227-232. [11] j. zarebski and j. dabrowski, “spice modelling of power schottky diodes,” int. j. numer. modelling: electron. networks, devices and fields, vol. 21, no. 6, pp. 551–561, 2008. [12] w. janke and a.hapka, “nonlinear thermal characteristics of silicon carbide devices,” mater. sci. eng.: b, vol. 176, no. 4, pp. 289–292, 2011. [13] v. szekely, “on representation of infinite-length distributed rc one-ports,” ieee trans. circuits syst., vol. 38, no. 7, pp. 711–719, 1991. [14] m. janicki, j. banaszczyk, b. vermeersch, g. de mey, and a. napieralski, “generation of reduced dynamic thermal models of electronic systems from time constant spectra of transient temperature responses,” microelectron. rel., vol. 51, no. 8, pp. 1351–1355, 2011. [15] lukasz starzak, mariusz zubert, marcin janicki, tomasz torzewicz, malgorzata napieralska, grzegorz jablonski, andrzej napieralski. behavioral approach to sic merged diode electro-thermal models generation. ieee transaction on electron devices. february 2013, volume 60, no 2:, pp. 630-638 [16] zubert, m.; starzak, l.; jablonski, g.; et al. an accurate electro-thermal model for merged sic pin schottky diodes. microelectronics journal volume: 43 issue: 5, 312-320, may 2012. [17] v. zeng. (2012, jan. 6). high efficiency system design with infineon power discrete-infineon coolmostm, optimostm, igbt and sic diode. infineon. [online]. available: www.infineon.com/cms/cn/corporate/ promopages/csr/8.ppt [18] zubert m., janicki m., napieralska m., jabłoński g., starzak ł, napieralski a.: "behavioural electrothermal modelling of sic merged pin schottky diodes". scientific computing in electrical engineering scee 2010. 
series: mathematics in industry, vol. 16, part iii. subseries: the european consortium for mathematics in industry. eds.: michielsen, bastiaan; poirier, jean-rené. 1-st edition., 2011, springer-verlag 2012, pp. 223-231 [19] sabry m.-n., “compact thermal models for electronic systems,” ieee trans. compon. packag. technol., vol. 26, no. 1, pp. 179–185, 2003. [20] m. zubert, l. starzak, g. jabłoński, m. napieralska, m. janicki, a. napieralski. „novel spice dynamic model of sic merged pin schottky diodes”. 2011 proceedings of the 18th international conference mixed design of integrated circuits and systems (mixdes), 16-18 june 2011, pp. 541 – 544 [21] janicki, z. kulesza, t. torzewicz, and a. napieralski, “automated stand for thermal characterization of electronic packages,” in proc. 27th ieee semiconductor thermal measurement, modeling and management symp., san jose, ca, 2011, pp. 199–202. [22] jung j., polanowski p., pakuła t., kiełbik r., napieralski a., ulański j., „hardware implementation of dynamic lattice liquid model as a way of investigation of very complex molecular systems”, proceedings of the 6th hellenic conference on polymers, patras, greece, november 2006, pp. 285-286 [23] polanowski p., jung j., kiełbik r.: “special purpose parallel computer for modelling supramolecular systems based on the dynamic lattice liquid model”, computational methods in science and technology 16(2), 2010, issn 1505-0602, pp. 147-153. [24] jankowski m., napieralski a.: high-voltage high input impedance unity-gain voltage buffer, microelectronics journal, 2013, vol. 44, no. 7, p. 576-585 [25] patent issued by polish patent office “uklad bufora napieciowego,” (eng. “voltage buffer circuit”), inventor: jankowski mariusz, designee: automatix spółka z o.o., exclusive right kind and number wyn: (11) 212837, granted: 19 june 2012, published: 22 june 2012.
[26] jankowski m., napieralski a.: novel structure of cmos voltage-to-current converter for high voltage applications, nanotech conference & expo 2012, june 18-21 2012, santa clara, california, usa [27] jankowski m., napieralski a.: current-mode signal processing implementation in hv soi integrated systems, microelectronics journal, volume 45, issue 7, july 2014, pages 946–959 [28] jankowski m., jabłoński g.: adjustable generator of edge-rounded trapezoidal waveforms. international journal of electronics and telecommunications, 2012, vol. 58, no. , p. 213-218 [29] jankowski m., napieralski a.: high-voltage trapezoidal waveform generator with edge-rounding functionality implementations, proceedings of the 21st international conference mixed design of integrated circuits & systems (mixdes), 19-21 june 2014, pp. 224 – 229 [30] jankowski m., napieralski a.: current-controlled switches for hv soi processes, microelectronics journal, volume 45, issue 7, july 2014, pages 931–945 [31] s. simrock, l. bertalot, m. cheon, c. hansalia, d. joonekindt, g. jablonski, y. kawano, w.-d. klotz, t. kondoh, t. kozak, p. makijarvi, d. makowski, a. napieralski, m. orlikowski, m. park, s. petrov, a. piotrowski, p. predki, i. semenov, d. shelukhin, v. udintsev, g. vayakis, a. wallander, m. walsh, s. wu, s. yang, and i. yonekawa, “evaluation of the atca fast controller standard for iter diagnostics,” fusion engineering and design, vol. 87, no. 12, pp. 2100 – 2105, 2012. [32] d. makowski, w. koprek, t. jezynski, a. piotrowski, g. jablonski, w. jalmuzna, k. czuba, p. predki, s. simrock, and a. napieralski, “prototype real-time atca-based llrf control system,” nuclear science, ieee transactions on, vol. 58, no. 4, pp. 1553 –1561, aug. 2011. [33] a. piotrowski and d. makowski, “pciexpress hot-plug mechanism in linux-based atca control systems,” wroclaw, poland, june 2010. [34] d. makowski, w. koprek, t. jezynski, a. piotrowski, g. jablonski, w. jalmuzna, and s. 
simrock, “interfaces and communication protocols in atca-based llrf control systems,” in nuclear science symposium conference record, 2008. nss ’08. ieee, oct. 2008, pp. 32–37. [35] d. makowski, w. koprek, t. jezynski, a. piotrowski, g. jablonski, w. jalmuzna, k. czuba, p. predki, s. simrock, and a. napieralski, “prototype real-time atca-based llrf control system,” nuclear science, ieee transactions on, vol. 58, no. 4, pp. 1553 –1561, aug. 2011. [36] j. branlard, g. ayvazyan, v. ayvazyan, m. k. grecki, m. hoffmann, t. jezynski, i. m. kudla, t. lamb, f. ludwig, u. mavric, s. pfeiffer, h. schlarb, c. schmidt, h. c. weddig, b. yang, k. oliwa, w. wierba, w. cichalewski, k. gnidzinska, w. jalmuzna, d. r. makowski, a. mielczarek, a. napieralski, p. perek, a. piotrowski, t. pozniak, k. przygoda, s. korolczuk, j. szewinski, p. barmuta, s. b. habib, l. butkowski, k. czuba, m. grzegrzolka, e. janas, j. piekarski, i. rutkowski, d. sikora, l. zembala, and m. zukocinski, “the european xfel llrf system,” in international particle accelerator conference, new orleans, usa, may 2012. [37] a. mielczarek, d. makowski, g. jablonski, a. napieralski, p. perek, p. predki, t. jezynski, f. ludwig, and h. schlarb, “utca-based controller,” in mixed design of integrated circuits and systems (mixdes), 2011 proceedings of the 18th international conference, june 2011, pp. 165 –170. [38] i. rutkowski, k. czuba, d. makowski, a. mielczarek, h. schlarb, and f. ludwig, “vector modulator card for mtca-based llrf control system for linear accelerators,” nuclear science, ieee transactions on, vol. 60, no. 5, pp. 3609–3614, oct 2013. [39] d. makowski, g. jablonski, p. perek, a. mielczarek, p. predki, h. schlarb, and a. napieralski, “firmware upgrade in xtca systems”, nuclear science, ieee transactions on, vol. 60, no. 5, pp. 3639–3646, oct 2013. [40] s. simrock, l. abadie, r. barnsley, l. bertalot, p. makijarvi, j. y. journeaux, r. reichle, d. stepanov, g. vayakis, i.
yonekawa, a. wallander, m. walsh, p. patil, d. makowski, and v. martin, “diagnostics use case examples for iter plant instrumentation,” in international conference on accelerator and large experimental physics control systems, icalepcs, 2013, june 2013. [41] a. piotrowski, m. orlikowski, t. kozak, p. predki, g. jablonski, d. makowski, and a. napieralski, “performance optimisation in software for data acquisition systems,” mixed design of integrated circuits and systems (mixdes), 2011 proceedings of the 18th international conference, pp. 189 – 194, june 2011, isbn 978-1-4577-0304-1. [42] a. mielczarek, p. perek, d. makowski, m. orlikowski, g. jablonski, and a. napieralski, “amc frame grabber module with pcie interface,” mixed design of integrated circuits and systems (mixdes), 2013 proceedings of the 20th international conference, pp. 137 – 142, june 2013, isbn 978-83-63578-00-8. [43] p. perek, m. orlikowski, g. jablonski, a. mielczarek, d. makowski, k. zagar, and s. isaev, “software components of mtca-based image acquisition system,” mixed design of integrated circuits and systems (mixdes), 2013 proceedings of the 20th international conference, pp. 137 – 142, june 2013, isbn 978-83-63578-00-8. [44] perek p., wychowaniak j., makowski d., orlikowski m., napieralski a., “image acquisition and visualisation in doocs and epics environments”, int. j. microelectron. comput. sci., 2012, vol. 3 no. 2, pp. 60-66, issn 2080-8755 [45] mielczarek a., makowski d., jablonski g., perek p., napieralski a., “image acquisition module for utca systems”, mixed design of integrated circuits and systems (mixdes), 2012 proceedings of the 19th international conference, 24-26 may 2012, pp. 156-160, isbn 978-1-4577-2092-5 [46] g. jabłoński, d. makowski, a. mielczarek, m. orlikowski, p. perek, a. napieralski, p. makijarvi, and s. simrock, “ieee 1588 time synchronization board in mtca.4 form factor,” in real time conference (rt), 2014 19th ieee-npss, 2014. [47] p. perek, a. mielczarek, p. 
predki, d. makowski, and a. napieralski, “module management controller for microtca-based controller board,” international journal of microelectronics and computer science, vol. 3, no. 1, 2012, pp. 25–31, 2012, issn 2080-8755. [48] d. makowski, a. mielczarek, p. perek, m. fenner, f. ludwig, m. uros, j. szewinski, and a. schlarb, h. napieralski, “standardized solution for management controller for mtca.4,” in real time conference (rt), 2014 19th ieee-npss, 2014. [49] perek p., “high-performance image processing system for plasma diagnostics”, proceedings of the xv international phd workshop owd 2013, wisła, poland, pp. 328-331 [50] t. kozak, p. prędki, d. makowski, "real-time ipmi protocol analyzer," nuclear science, ieee transactions on , vol.58, no.4, pp.1857,1863, aug. 2011 [51] p. prędki, d. makowski, a. napieralski, "intelligent platform-management controller for low-level rf control system atca carrier board," nuclear science, ieee transactions on , vol.58, no.4, pp.1538,1543, aug. 2011 [52] d. makowski, a. mielczarek, p. perek, a. napieralski, l. butkowski, j. branlard, m. fenner, h. schlarb, b. yang, " high-speed data processing module for llrf" in real time conference (rt), 2014 19th ieeenpss, 2014. [53] d. makowski, a. mielczarek, p. perek, g. jabłonski, m. orlikowski, a. napieralski, p. makijarvi, s. simrock, v. martin, "high-performance image acquisition and processing system with mtca.4" in real time conference (rt), 2014 19th ieee-npss, 2014 [54] a. mielczarek, d. makowski, g. jabłoński, p. perek, m. orlikowski, "fmc video acquisition module with camera link interface", international journal of microelectronics and computer science 2012, volume 3, number 3, issn: 2080-8755 facta universitatis series: electronics and energetics vol. 31, n o 2, june 2018, pp. 303 311 https://doi.org/10.2298/fuee1802303m on a property of the reed-muller-fourier transform  claudio moraga faculty of computer science, technical university of dortmund, germany abstract. 
the reed-muller-fourier transform is reviewed and a new property is presented: the reed-muller-fourier transform of an n-place p-valued function preserves any permutation of the arguments. this leads to the additional result that the reed-muller-fourier spectrum of an n-place p-valued symmetric function is also symmetric. furthermore, the reed-muller and the vilenkin-chrestenson spectra of an n-place p-valued symmetric function are also symmetric.
key words: multiple-valued switching theory, symmetric functions, reed-muller-fourier transform.
dedicated to prof. radomir stanković on the occasion of his 65th birthday
received august 3, 2017; received in revised form september 8, 2017
corresponding author: claudio moraga, faculty of computer science, technical university of dortmund, germany (e-mail: claudio.moraga@tu-dortmund.de)
1. introduction
the fundamentals of the reed-muller transform may be found in the early work of i. zhegalkin [1], [2]. however, since his publications were in russian, they remained practically unknown to scientists not proficient in that language. the transform was rediscovered in the works of i.s. reed [3] and d.e. muller [4] and has carried their names since then. in the literature this transform is frequently called the rm transform. the transform was developed to be applied to boolean functions. the later extension of the reed-muller transform to multiple-valued domains is due to d.h. green and i.s. taylor [5]. the reed-muller-fourier transform (rmf) was introduced by radomir s. stanković [6], [7], aiming to combine relevant properties of the reed-muller transform and the discrete fourier transform. in a way, this transform is another extension of the reed-muller transform to the multiple-valued domain. in the binary case, the rmf transform reduces to the reed-muller transform.
an important common property of both the rm and rmf transforms is the fact that they represent bijections in the set of p-valued functions.
this means that the rm spectrum or the rmf spectrum of an n-place p-valued function is again an n-place p-valued function, not necessarily different from the original one (it has been shown that both transforms have fixed points [8], [9]). moreover, both the rm and the rmf transforms have a kronecker product structure (kronecker product: see e.g. [10], [11]). the rmf transform matrix is lower triangular [12] and exhibits special similarities with the pascal matrix on finite fields [13].
2. formalisms
notation: vectors and matrices will be written with upper case in bold. if $\mathbf{M}$ is a $p^m \times p^n$ matrix, it will be denoted simply as $\mathbf{M}_{m,n}$. square matrices will be assigned just one index. if not clear from the context, the length of vectors will be explicitly given. an exception to this notation is "xprmf", which, for historical reasons [7], will be used to denote the basis of the rmf transform.
spectral techniques in a nutshell: let $V = \{0, 1, \ldots, p-1\}$ be the domain of p-valued functions and let $f : V^n \to V$ be an n-place p-valued function. to every function $f$, a value column vector $\mathbf{F}$ of length $p^n$ is associated. the elements of $\mathbf{F}$ are the values of $f$ for all the different value assignments to the arguments. the elements of $\mathbf{F}$ follow the lexicographic order of the value assignments to the arguments of $f$. let $f \leftrightarrow \mathbf{F}$ denote the association. it is obvious that $f \leftrightarrow \mathbf{I}_n\mathbf{F}$, where $\mathbf{I}_n$ denotes the identity matrix, represents a valid association. if $\mathbf{M}_n$ is a non-singular matrix, its inverse is also non-singular and well defined. moreover, since $(\mathbf{M}_n)^{-1} \cdot \mathbf{M}_n = \mathbf{I}_n$, then $f \leftrightarrow (\mathbf{M}_n)^{-1}\mathbf{M}_n\mathbf{F}$ is also a valid association and represents the basic concept of spectral transformations. since $(\mathbf{M}_n)^{-1}$ is non-singular, its columns form a linearly independent set. if the columns of $(\mathbf{M}_n)^{-1}$ are considered to represent value vectors of auxiliary functions, then $(\mathbf{M}_n)^{-1}$ constitutes a basis. $\mathbf{M}_n$, the inverse of $(\mathbf{M}_n)^{-1}$, is called a transform matrix and the product $\mathbf{M}_n\mathbf{F}$ is normally called the spectrum of $f$.
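As a hedged illustration of the association between a function, its value vector and its spectrum described above, the following Python sketch builds a value vector in lexicographic order, computes a spectrum, and recovers the value vector from it. It uses the binary reed-muller matrix [[1, 0], [1, 1]] (p = 2), which is self-inverse mod 2, as the transform. The helper names (`kron`, `kron_power`, `apply_matrix`, `value_vector`) and the choice of XNOR as the test function are assumptions made for this example only.

```python
from itertools import product


def kron(a, b, p):
    """Kronecker product of two matrices given as lists of rows, entries mod p."""
    return [[(x * y) % p for x in row_a for y in row_b]
            for row_a in a for row_b in b]


def kron_power(m, n, p):
    """n-fold Kronecker product of m with itself, entries mod p."""
    result = m
    for _ in range(n - 1):
        result = kron(result, m, p)
    return result


def apply_matrix(m, v, p):
    """Matrix-vector product mod p."""
    return [sum(a * b for a, b in zip(row, v)) % p for row in m]


def value_vector(func, n, p):
    """Values of func over all argument assignments in lexicographic order."""
    return [func(*args) % p for args in product(range(p), repeat=n)]


p, n = 2, 2
rm1 = [[1, 0], [1, 1]]                     # binary reed-muller matrix
transform = kron_power(rm1, n, p)          # transform matrix for n = 2

f_vec = value_vector(lambda x1, x2: 1 - (x1 ^ x2), n, p)   # XNOR
spectrum = apply_matrix(transform, f_vec, p)
recovered = apply_matrix(transform, spectrum, p)  # the matrix is its own inverse

print(f_vec)      # [1, 0, 0, 1]
print(spectrum)   # [1, 1, 1, 0]
print(recovered)  # [1, 0, 0, 1]
```

The spectrum [1, 1, 1, 0] lists the coefficients of the polynomial 1 ⊕ x2 ⊕ x1, which is the reed-muller expression of XNOR. Applying the transform a second time returns the original value vector.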
the inner product of the basis and the spectrum leads to a polynomial expression of $f$. depending on the choice of $(\mathbf{M}_n)^{-1}$, different polynomial expressions on elements of the basis will be obtained.
definition 1: let $f, g : \mathbb{Z}_p \to \mathbb{Z}_p$. the gibbs convolution product ($\circledast$) of p-valued functions is calculated as follows [6]: if $x = 0$, then $(f \circledast g)(0) = 0$. if $x \neq 0$, then
$$(f \circledast g)(x) = \sum_{s=0}^{x-1} f(x-1-s)\, g(s) \bmod p$$
definition 2: the fundamental basis for the rmf transform, called xprmf, is the following [6], [7]: $\mathrm{xprmf} = [x^{*0}\ x^{*1}\ \ldots\ x^{*(p-1)}]$, where $x^{*0}$ is defined to be the constant $p-1$ for all $x$, and for $1 \le j \le p-1$, the powers $x^{*j}$ are calculated as the j-fold gibbs product of $x^{*0}$ with itself.
it is simple to show that xprmf is its own inverse. therefore the basic rmf transform matrix, called $\mathbf{R}_1$, equals xprmf, and for all $n > 1$ holds: $\mathbf{R}_n = (\mathrm{xprmf})^{\otimes n}$, where the exponent "$\otimes n$" denotes the n-fold kronecker product of xprmf with itself. since xprmf is its own inverse, it is easy to see that $\mathbf{R}_n$ will also be its own inverse.
example 1: let n = 2 and p = 3. calculating mod 3,
$$\mathbf{R}_1 = \begin{bmatrix} 2 & 0 & 0 \\ 2 & 1 & 0 \\ 2 & 2 & 2 \end{bmatrix}, \qquad \mathbf{R}_2 = \mathbf{R}_1 \otimes \mathbf{R}_1$$
notice that the borders of $\mathbf{R}_2$ look different than those of $\mathbf{R}_1$. this will happen whenever n is even, since for all p, $(p-1)^n \equiv 1 \bmod p$. if this is inconvenient for some application, then a normalized transform may be used.
definition 3: the normalized rmf transform is given by $\mathbf{R}_n = (-1)^{n+1}\,\mathrm{xprmf}(1)^{\otimes n} \bmod p$. the factor $(-1)^{n+1}$ is introduced to preserve the value $(p-1)$ in the leftmost column of the matrix when n is even, since $(-1)^{n+1}(p-1)^n \equiv (p-1)^{n+1}(p-1)^n \equiv (p-1)^{2n+1} \bmod p$. $2n+1$ is an odd number, and an odd power of $(p-1)$ equals $(p-1) \bmod p$. it is simple to see that in this case $\mathbf{R}_n$ is also self-inverse. if for particular applications a "homogeneous and dft-like look" is desirable, then a special rmf transform may be used.
definition 4: the special rmf transform equals $(p-1)(\mathrm{xprmf})^{\otimes n} \bmod p$. see figure 1.
fig. 1 special rmf transform matrices for p = 3, 4, and 5 when n = 1
if for any p $\mathbf{R}_1$ is expressed as $[r_{i,j}]$, $i, j \in \mathbb{Z}_p$, then $r_{i,j} = (-1)^{j+1}\binom{i}{j} \bmod p$ [12]. it may be observed that in the case when p is a prime, the matrices are skew-symmetric, i.e., symmetric with respect to the diagonal with positive slope. furthermore, besides being skew-symmetric and self inverse, starting at the lower left corner and moving along the diagonal with positive slope, a pascal triangle mod p is found.
an important property of the rmf transform is the following: the rmf transform of a non-zero constant vector is an "impulse" vector, i.e. a vector with only one non-zero entry, at the 0-th position [12]. this is a well known property of the dft, which is preserved by the rmf transform.
3. theorems
theorem 1.
preliminaries: let $V = \{0, 1, \ldots, p-1\}$ be the domain of p-valued functions and let $f : V^2 \to V$, with value vector $\mathbf{F}$ of length $p^2$. moreover let $g : V^2 \to V$, such that $g(x_1, x_2) = f(x_2, x_1)$. let the value vector of $g$ be $\mathbf{G}$. furthermore, let $\mathbf{P}_2$ be a permutation matrix such that when applied upon $\mathbf{F}$ induces a permutation of its components according to the reordering of the arguments of the function. hence $\mathbf{G} = \mathbf{P}_2\mathbf{F}$.
claim: the rmf transform of a p-valued function of two variables preserves the order of the arguments:
$$\mathbf{R}_2\mathbf{G} = \mathbf{R}_2\mathbf{P}_2\mathbf{F} = \mathbf{P}_2\mathbf{R}_2\mathbf{F} \bmod p$$
proof: let $i, j \in (\mathbb{Z}_p)^2$, with $i = i_1 i_0$ and $j = j_1 j_0$. since $\mathbf{R}$ has a kronecker product structure, $\mathbf{R}_2 = \mathbf{R}_1 \otimes \mathbf{R}_1 \bmod p$. if $\mathbf{R}_2$ is expressed as $[r_{i,j}]$ then
$$r_{i,j} = \left((-1)^{j_1+1}\binom{i_1}{j_1}\right)\left((-1)^{j_0+1}\binom{i_0}{j_0}\right) \bmod p$$
if $i_1$ and $i_0$ are exchanged, then the modified $r_{i,j} = \left((-1)^{j_1+1}\binom{i_0}{j_1}\right)\left((-1)^{j_0+1}\binom{i_1}{j_0}\right) \bmod p$, and if $j_1$ and $j_0$ are exchanged, then the modified $r_{i,j} = \left((-1)^{j_0+1}\binom{i_1}{j_0}\right)\left((-1)^{j_1+1}\binom{i_0}{j_1}\right) \bmod p$. it is simple to see that in both cases the modified $r_{i,j}$ takes the same value. moreover, exchanging $i_1$ and $i_0$ has the effect of exchanging (the corresponding) two rows of $\mathbf{R}_2$ and, similarly, exchanging $j_1$ and $j_0$ has the effect of exchanging (the corresponding) two columns of $\mathbf{R}_2$.
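The claim of theorem 1 can be checked numerically for small p. The sketch below is an illustration under a stated assumption: it takes the entries of the basic rmf matrix to be r[i][j] = (-1)^(j+1) * C(i, j) mod p, a closed form that is lower triangular, self-inverse and carries the constant p-1 in its leftmost column, in agreement with the properties stated in the text. All function names are hypothetical.

```python
from math import comb


def rmf_basic(p):
    """Basic RMF matrix R1, ASSUMING r[i][j] = (-1)^(j+1) * C(i, j) mod p."""
    return [[((-1) ** (j + 1) * comb(i, j)) % p for j in range(p)]
            for i in range(p)]


def kron(a, b, p):
    """Kronecker product of two matrices given as lists of rows, entries mod p."""
    return [[(x * y) % p for x in row_a for y in row_b]
            for row_a in a for row_b in b]


def matmul(a, b, p):
    """Matrix product mod p."""
    return [[sum(x * y for x, y in zip(row, col)) % p for col in zip(*b)]
            for row in a]


def swap_matrix(p):
    """P2: maps the value vector of f(x1, x2) to that of g(x1, x2) = f(x2, x1)."""
    size = p * p
    perm = [[0] * size for _ in range(size)]
    for x1 in range(p):
        for x2 in range(p):
            perm[x1 * p + x2][x2 * p + x1] = 1
    return perm


def check_theorem_1(p):
    """True if R2 is self-inverse and commutes with P2 (arithmetic mod p)."""
    r1 = rmf_basic(p)
    r2 = kron(r1, r1, p)
    p2 = swap_matrix(p)
    identity = [[int(i == j) for j in range(p * p)] for i in range(p * p)]
    return (matmul(r2, r2, p) == identity
            and matmul(r2, p2, p) == matmul(p2, r2, p))


print(rmf_basic(3))                                   # [[2, 0, 0], [2, 1, 0], [2, 2, 2]]
print(all(check_theorem_1(p) for p in (2, 3, 4, 5)))  # True
```

For p = 4 the same construction reproduces the spectrum listed in example 3, which lends some support to the assumed closed form. The commutation itself only relies on the kronecker structure R2 = R1 ⊗ R1, so this part of the check would pass for any square R1.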
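exchanging $i_1$ and $i_0$ corresponds to $\mathbf{P}_2\mathbf{R}_2$, while exchanging $j_1$ and $j_0$ corresponds to $\mathbf{R}_2\mathbf{P}_2$. the assertion follows.
although not explicitly needed for theorem 1, it is not difficult to construct the $\mathbf{P}_2$ matrices for different values of p, because of the strong regularity of their structure. they are symmetric, skew-symmetric and self inverse. see figure 2.
fig. 2 $\mathbf{P}_2$ matrices for p = 2, p = 3, and p = 4
corollary 1.1: from $\mathbf{P}_2\mathbf{R}_2 = \mathbf{R}_2\mathbf{P}_2$ and recalling that $\mathbf{R}_2$ is self inverse, it follows that $\mathbf{P}_2 = \mathbf{R}_2\mathbf{P}_2\mathbf{R}_2$. since $\mathbf{P}_2$ is also self inverse, then $\mathbf{P}_2\mathbf{P}_2 = \mathbf{R}_2\mathbf{P}_2\mathbf{R}_2\mathbf{P}_2 = \mathbf{I}_2$, meaning that $\mathbf{R}_2\mathbf{P}_2$ is also its own inverse.
theorem 2. let $n \ge 2$ and $k < n$. define $f$ and $g$ to be p-valued functions of n variables (i.e. n-place functions) with value vectors $\mathbf{F}$ and $\mathbf{G}$, respectively, such that for all value assignments to the arguments, $g$ equals $f$, but with transposed arguments $x_k$ and $x_{k+1}$. let $\mathbf{P}_n$ be a permutation which when applied to $\mathbf{F}$ has the effect of transposing only the two selected arguments, i.e., $\mathbf{P}_n = (\mathbf{I}_{k-1} \otimes \mathbf{P}_2 \otimes \mathbf{I}_{n-k-1})$. then $\mathbf{R}_n\mathbf{P}_n\mathbf{F} = \mathbf{P}_n\mathbf{R}_n\mathbf{F} \bmod p$.
proof: decompose $\mathbf{R}_n$ to match the structure of $\mathbf{P}_n$, i.e. $\mathbf{R}_n = \mathbf{R}_{k-1} \otimes \mathbf{R}_2 \otimes \mathbf{R}_{n-k-1}$, and apply it to both sides of the claim, taking advantage of the compatibility between kronecker and matrix products [11]:
$$\mathbf{R}_n\mathbf{P}_n\mathbf{F} = (\mathbf{R}_{k-1} \otimes \mathbf{R}_2 \otimes \mathbf{R}_{n-k-1})(\mathbf{I}_{k-1} \otimes \mathbf{P}_2 \otimes \mathbf{I}_{n-k-1})\mathbf{F} = (\mathbf{R}_{k-1} \otimes \mathbf{R}_2\mathbf{P}_2 \otimes \mathbf{R}_{n-k-1})\mathbf{F} \bmod p$$
$$\mathbf{P}_n\mathbf{R}_n\mathbf{F} = (\mathbf{I}_{k-1} \otimes \mathbf{P}_2 \otimes \mathbf{I}_{n-k-1})(\mathbf{R}_{k-1} \otimes \mathbf{R}_2 \otimes \mathbf{R}_{n-k-1})\mathbf{F} = (\mathbf{R}_{k-1} \otimes \mathbf{P}_2\mathbf{R}_2 \otimes \mathbf{R}_{n-k-1})\mathbf{F} \bmod p$$
it is easy to see that the claim will be satisfied if and only if $\mathbf{P}_2\mathbf{R}_2 = \mathbf{R}_2\mathbf{P}_2$. this was proven in theorem 1. the assertion follows.
example 2. let p = 4 and n = 2. calculate $\mathbf{P}_2\mathbf{R}_2$ operating mod 4. from corollary 1.1, $(\mathbf{P}_2\mathbf{R}_2)^{-1} = \mathbf{P}_2\mathbf{R}_2 = \mathbf{R}_2\mathbf{P}_2$; therefore commuting the factor matrices will give the same result.
theorem 3.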
let $f$ and $g$ be n-place p-valued functions with value vectors $\mathbf{F}$ and $\mathbf{G}$, respectively, such that for all value assignments to the arguments, $g$ equals $f$, but with transposed arguments $x_k$ and $x_{k+1}$ and transposed arguments $x_h$ and $x_{h+1}$ ($n > k > h > 0$). if applied independently, let the corresponding transposition matrices be $\mathbf{P}_n^{(k)}$ and $\mathbf{P}_n^{(h)}$, respectively, leading to $\mathbf{G} = \mathbf{P}_n^{(k)}\mathbf{P}_n^{(h)}\mathbf{F}$. the following holds: $\mathbf{R}_n\mathbf{G} = \mathbf{P}_n^{(k)}\mathbf{P}_n^{(h)}\mathbf{R}_n\mathbf{F} \bmod p$.
proof: consider first one of the transpositions. let $\mathbf{G}' = \mathbf{P}_n^{(h)}\mathbf{F} \bmod p$. then from theorem 1 follows that $\mathbf{R}_n\mathbf{G}' = \mathbf{R}_n\mathbf{P}_n^{(h)}\mathbf{F} = \mathbf{P}_n^{(h)}\mathbf{R}_n\mathbf{F} \bmod p$. now let the second transposition be executed: $\mathbf{G} = \mathbf{P}_n^{(k)}\mathbf{G}'$. then from theorem 1 follows that $\mathbf{R}_n\mathbf{G} = \mathbf{R}_n\mathbf{P}_n^{(k)}\mathbf{G}' = \mathbf{P}_n^{(k)}\mathbf{R}_n\mathbf{G}' = \mathbf{P}_n^{(k)}\mathbf{R}_n\mathbf{P}_n^{(h)}\mathbf{F} = \mathbf{P}_n^{(k)}\mathbf{P}_n^{(h)}\mathbf{R}_n\mathbf{F} \bmod p$.
theorem 4. let $f$ and $g$ be n-place p-valued functions with value vectors $\mathbf{F}$ and $\mathbf{G}$, respectively, such that for all value assignments to the arguments, $g$ equals $f$, but with permuted arguments. let $\mathbf{P}_n$ be a permutation matrix, which when applied to $\mathbf{F}$ has the same effect as permuting the corresponding arguments. then $\mathbf{R}_n\mathbf{G} = \mathbf{R}_n\mathbf{P}_n\mathbf{F} = \mathbf{P}_n\mathbf{R}_n\mathbf{F} \bmod p$.
proof: recall that any permutation of an ordered set of arguments may be obtained with an appropriate sequence of transpositions, and any transposition may be obtained with a cascade of transpositions of neighbor arguments. apply theorems 2 and 3 accordingly, as many times as needed.
theorem 5. the rmf spectrum of an n-place p-valued symmetric function is symmetric.
proof: recall that a p-valued function is symmetric iff it is invariant with respect to any permutation of its arguments (see e.g. [14], [15], [16], [17]). let $\mathbf{F}$ be the value vector of a symmetric function and let $\mathbf{P}_n$ be equivalent to an arbitrary permutation of its arguments. then $\mathbf{F} = \mathbf{P}_n\mathbf{F}$. from theorem 4, $\mathbf{R}_n\mathbf{F} = \mathbf{R}_n\mathbf{P}_n\mathbf{F} = \mathbf{P}_n\mathbf{R}_n\mathbf{F} \bmod p$. therefore $\mathbf{R}_n\mathbf{F} \bmod p$ is symmetric.
example 3: let p = 4 and $f : V^2 \to V$ be symmetric, such that $\mathbf{F} = [1\ 1\ 0\ 3\ 1\ 2\ 3\ 1\ 0\ 3\ 3\ 2\ 3\ 1\ 2\ 0]^T$. let $\mathbf{S} = \mathbf{R}_2\mathbf{F}$.
symmetry proof:
x2    0 1 2 3  0 1 2 3  0 1 2 3  0 1 2 3
x1    0 0 0 0  1 1 1 1  2 2 2 2  3 3 3 3
F^T   1 1 0 3  1 2 3 1  0 3 3 2  3 1 2 0
S^T   1 0 3 3  0 1 3 0  3 3 0 3  3 0 3 2
it is easy to see that $\mathbf{S}$, the spectrum of $f$, is also symmetric.
remark: it was shown in [18] that an analog to theorem 3 holds for spectra obtained with the reed-muller or the vilenkin-chrestenson transforms. this also includes the circular vilenkin-chrestenson spectrum.
corollary 5.1. the reed-muller and the vilenkin-chrestenson spectra of p-valued symmetric functions are symmetric.
corollary 5.2. if $f$ is a p-valued bent function [20], [19], then the function obtained after permuting the value assignment to the arguments is also bent, since the circular vilenkin-chrestenson spectrum will remain flat, i.e. all its components will have a constant absolute value equal to $p^{n/2}$.
4. conclusions
it has been shown that the rmf transform shares with the reed-muller and the vilenkin-chrestenson transforms the property of preserving any permutation of the arguments, in spite of their different structural attributes. recall that the vilenkin-chrestenson transform is complex-valued, symmetric, and unitary up to a normalizing coefficient; the reed-muller transform is integer-valued and neither symmetric nor orthogonal; and the reed-muller-fourier transform is integer-valued, lower triangular, and self inverse.
references
[1] i.i. zhegalkin, “o tekhnyke vychyslenyi predlozhenyi v symbolytscheskoi logykye,” math. sb., vol. 34, pp. 9-28, in russian, 1927.
[2] i.i. zhegalkin, “aritmetizatiya symbolytscheskoi logyky,” math. sb., vol. 35, pp. 311-377, in russian, 1928.
[3] i.s. reed, “a class of multiple-error-correcting codes and the decoding scheme.” ire trans. on information theory pgit-4, pp. 38-49, 1954.
[4] d.e. muller, “application of boolean algebra to switching circuit design and to error correction.” ire trans. on elec. computers ec-3, vol. 3, pp. 6-12, 1954.
[5] d.h. green and i.s.
taylor, “multiple-valued switching circuit design by means of generalized reedmuller expansions.” digital processes 2, pp. 63-81, 1976. [6] r.s. stanković, “some remarks on fourier transforms and differential operators for digital functions,” in proceedings of the 22 nd international symposium on multiple-valued logic, sendai, japan, ieee press n.y., 1992, pp. 365-370. on a property of the reed-muller-fourier transform 311 [7] r.s. stanković, “the reed-muller-fourier transform – computing methods and factorizations”, claudio moraga: a passion for multi-valued logic and soft computing. (r. seising, h. allende-cid, eds.), springer 2017, pp. 121-151. [8] c. moraga, s. stojković and r.s. stanković, “on fixed points and cycles in the reed muller domain.” in proceedings of the 38 th international symposium on multiple-valued logic, ieee press, 2008, pp. 82-88. [9] c. moraga, r.s. stanković, m. stanković and s. stojković, “on fixed points of the reed-muller-fourier transform.” in proceedings of the 47 th international symposium on multiple-valued logic, ieee press, 2017, pp. 55-60. [10] a. graham, kronecker products and matrix calculus with applications. ellis horwood ltd., chichester uk, 1981. [11] r.a. horn and ch.r. johnson, topics in matrix analysis. cambridge university press, new york, 1991. [12] c. moraga, r.s. stanković and m. stanković, “a comparative study of the reed-muller-fourier transform, the pascal matrix, and the discrete pascal transform.” research report fsc-2015-02, european centre for soft computing, mieres, asturias, spain, 2015. [13] r.s. stanković, j.t. astola and c. moraga, “pascal matrices, reed-muller expressions, and reed-muller error correcting codes.” in logic in computer science ii, (s. ghilezan, ed.), press mathematical institute of the serbian academy of science, belgrade, serbia, 2015., zbornik radova 18 (26), pp. 145-172. [14] e. pogossova and k. egiazarian, “reed-muller representation of symmetric functions.” j. 
multiplevalued logic and soft computing, vol. 10, no. 1, pp. 51-72, 2004. [15] r.s. stanković, j.t. astola and k. egiazarian, “remarks on symmetric binary and multiple-valued functions.” in proceedings of the 6th international workshop boolean problems, b. steinbach (ed.), 2004, pp. 83-87. [16] j.t. butler and k. a. schueller, “worst case number of terms in symmetric multiple-valued functions.” in proceedings of the 21 st international symposium on multiple-valued logic. ieee press, 1991. [17] j.c. muzio, “concerning the maximum size of the terms in the realization of symmetric functions.” in proceedings of the 20 th international symposium on multiple-valued logic, 1990, pp. 292-299. [18] c. moraga, “permutations under spectral transforms.” in proceedings of the 38 th international symposium on multiple-valued logic, ieee press, 2008, pp. 76-81. [19] p.v. kumar, r.a. scholz and l.r. welch, “generalized bent functions and their properties.” jr. combinatorial theory series a, vol. 40, no. 1, 90-107, 1985. [20] c. moraga, m. stanković, r.s. stanković and s. stojković, “contribution to the study of multiplevalued bent functions.” in proceedings of the 33 rd international symposium on multiple-valued logic, ieee press, 2013, pp. 340-345. facta universitatis series: electronics and energetics vol. 35, no 1, march 2022, pp. 71-92 https://doi.org/10.2298/fuee2201071p © 2022 by university of niš, serbia | creative commons license: cc by-nc-nd original scientific paper possibilistic uncertainty assessment in the presence of optimally integrated solar pv-dg and probabilistic load model in distribution network shradha singh parihar, nitin malik the northcap university, gurgaon, india abstract. to integrate network load and line uncertainties in the radial distribution network (rdn), the probabilistic and possibilistic method has been applied. the load uncertainty is considered to vary as gaussian distribution function whereas line uncertainty is varied at a fixed proportion. 
a voltage stability index is proposed to allocate solar pv-dg optimally, followed by application of the pso technique to determine the optimal power rating of the dg. the standard ieee 33- and 69-bus rdn are considered for the analysis. the impact of various uncertainties in the presence of optimally integrated solar pv-dg is investigated on the 69-bus network. the results obtained are superior to those of the fuzzy-arithmetic algorithm. faster convergence is obtained and analyzed at different degrees of belongingness and for realistic load models. the narrower interval width indicates that the observed results are numerically stable. to improve network performance, the technique takes into account long-term changes in the load profile during the planning stage. a significant drop in network power losses, an improved bus voltage profile and noteworthy energy loss savings are observed due to the introduction of renewable dg. the results are also statistically verified.
key words: distribution network, distributed generation, optimal integration, uncertainties, interval arithmetic, gaussian distribution function
received february 23, 2021; received in revised form april 12, 2021
corresponding author: nitin malik, the northcap university, gurgaon, india (e-mail: nitinmalik77@gmail.com)
1. introduction
1.1. motivation and literature review
the distribution network is ill-conditioned because of its low x/r ratio and radial structure. thus, conventional approaches such as newton-raphson and gauss-seidel, used to solve the power flow (pf) problem in the transmission network, fail to converge in many cases in the distribution network. to compute bus voltage and power flow values, a deterministic pf algorithm requires precise network (n/w) load and generation data. it does not contribute to optimal planning and operation of the n/w, as it finds the pf results for a specific n/w configuration and operating conditions only at a given instant.
the emphasis is on integrating distributed generation (dg) into the distribution n/w due to socioeconomic, environmental, and technical constraints. dg is a decentralized generation of electric power in distribution n/w using non-renewable (turbine, engine, etc.) or renewable (small/micro/mini hydropower, wind, solar, fuel cell, geothermal, etc.) resources. as the output of these technologies depicts stochastic behavior, hence, capable to introduce significant uncertainty in total power production. in real, the networks are complex due to the presence of non-linearity and incapability in expressing n/w variables in very precise terms which can be simplified by either allowing some degree of uncertainty or making assumptions about the distribution n/w. the input parameters (n/w load, line, and transformer data) are considered to be fixed while performing power grid operations, however they are not in practise. because of the erroneous calculation of reactance and resistance due to conductor ageing and temperature variation, there is ambiguity in n/w line data. the load uncertainties are caused due to incorrect estimation in load demand (load forecasting). changes in climate and water runoff induce uncertainty in hydropower plants. temperature sensitivity in a fuel cell creates uncertainty since temperature change has a stronger impact at higher current [1]. this varies the power flow in the radial distribution network (rdn). these uncertainties are modeled either using a possibilistic or probabilistic method. monte carlo simulation (mcs) and stochastic approach belongs to probabilistic domain whereas interval arithmetic (ia) and fuzzy sets comes under possibilistic methods. ia establishes a strict constraint on all feasible n/w circumstances that could have been achieved by hundreds of successive mcs; as a result, the mcs calculation time increases, making analysis more difficult. with little computational effort, ia can produce highquality results. 
probabilistic and possibilistic modelling are quantitative and qualitative in nature, respectively [2]. the probabilistic method is used where sufficient historical data of the uncertain parameters, or their probability density function (pdf), is readily available, as for load pattern, wind speed and solar irradiation [3], whereas possibilistic modelling is preferred when the available data is not sufficient for planners and operators to establish a pdf [3]. the two methods are complementary, and using them together provides a more realistic approximation in n/w modelling. the approaches for integrating dg are generally categorized as analytical and heuristic methods. the analytical method makes use of mathematical equations to determine the optimum solution. an analytical method for evaluating dg in the n/w is presented in [4] without considering cost benefits. many numerical approaches, such as the kalman filter algorithm [5] and mixed integer non-linear programming [6], have been applied to integrate dg optimally in the rdn. the authors in [7] demonstrated an analytical and a meta-heuristic approach to optimally allocate dg units in the rdn. numerous evolutionary algorithms, such as grey wolf optimizer [8], pso [9] and gravitational search algorithm [10], have been employed for solving issues related to dg allocation in the rdn. an ant lion optimization algorithm [11] is demonstrated to optimally allocate dg in the rdn. the authors in [12] utilized an augmented lagrangian genetic algorithm to integrate renewable dgs for minimizing n/w losses while satisfying operational constraints, without considering economic benefits. in [13], the author implements an ia technique to incorporate load uncertainty in the distribution n/w considering constant power load only. a correlated interval-based backward/forward (b/f) pf method is developed in [14] to consider uncertainties in renewable energy resources.
the interval-based pf models are transformed into the optimization problem in [15] which minimizes the conservatism of the obtained interval solutions. an affine arithmetic method is projected in [16] to introduce n/w uncertainties of different types. a probabilistic distribution-based ia approach is presented in [17] to introduce uncertainty in load demand in conventional pf. authors in [18], demonstrated the ia based pf analysis in the presence of load, line and dg uncertainties. the analysis has also been done with various types of dg units that are not optimally allocated. abdelkader et al. [19] proposed a fuzzy arithmetic algorithm (faa) for incorporating uncertainties in the rdn. triangular fuzzy number method is proposed in [20] to introduce uncertainties in the n/w but resulted in higher n/w loss. to deal with n/w uncertainties a new midpoint-radius interval-based approach has been demonstrated in [21] to eliminate the factorization of the interval jacobian matrix. 1.2. paper contributions from the previously published literature, it has been concluded that the combined use of the interval arithmetic and probabilistic load model with optimally integrated solar photovoltaic (pv) dg in distribution n/w has not been explored before. using the hybrid possibilistic-probabilistic strategy, this research article contributes to the published literature. the optimal penetration of solar pv-dg is carried out using a novel voltage stability index (vsi) and pso method and, thereafter, the n/w uncertainties (line and load) are introduced in rdn to demonstrate the possible states of the solution. the detailed investigation considering various realistic loads and load fluctuations with and without solar pv-dg have been presented. the presented approach is applied on well-established standard ieee 69-bus n/w. 
two case studies considering solar pv-dg without and with n/w uncertainties are analysed, and the results for the ieee 33-bus n/w are further compared with faa to establish the effectiveness of the proposed methodology. the contributions of the article are summarized below:
a) a new vsi has been developed for the optimal placement of solar pv-dg in a rdn, after which pso is implemented to find the optimal size of the dg. this independent method for integrating renewable dg gives openness and versatility to the problem.
b) a combination of ia and a probabilistic load model is applied to attain a more realistic representation of distribution n/w modelling, which paves the way for accurate results.
c) all input variables (loads and line) and generation in the distribution n/w are represented as random variables. the line uncertainty variables are considered to fluctuate at a constant proportion, whereas load uncertainty is expected to change according to a gaussian distribution function.
d) the effect of n/w uncertainties and different realistic loads (industrial, residential and commercial) and their combination with optimally integrated solar pv-dg is analysed by directly incorporating them into the interval-based b/f pf algorithm.
e) the analysis of the reduction in n/w losses and the cost of annual energy loss savings (aels) has been carried out at three load levels to aid the distribution n/w operators (dnos) in future n/w planning.
f) the results of the simulation study imply that the methodology proposed in this research is much more feasible and effective for designing the large-scale rdn at all load levels.
1.3. paper outline
the brief outline of the work is as follows: section 2 describes interval arithmetic. section 3 presents the modeling of n/w data and dg. in the next section, the working of the pso is explained.
section 5 explores the development of the novel vsi and the algorithm to integrate solar pv-dg optimally in the n/w. the interval-based b/f pf solution is derived in section 6. the simulated results for two standard rdn are presented in section 7, followed by the conclusion in the last section.
2. interval arithmetic
in contrast to the point-estimate technique, in the ia approach a number is expressed as a confidence interval that can be open, closed, or a combination of both. an interval number is thus a set of real numbers. let $K$ and $L$ be two interval numbers (of real numbers) with supporting intervals $[k_1, k_2]$ and $[l_1, l_2]$. here, $k_1$, $l_1$ and $k_2$, $l_2$ denote the lower and upper limits (endpoints), respectively, and $k_2 - k_1$ and $l_2 - l_1$ are the interval widths of $K$ and $L$, respectively. addition, subtraction, multiplication, division, minimization, and maximization are all mathematical operations that may be applied to interval numbers [22]:
$$K + L = [k_1 + l_1,\ k_2 + l_2] \tag{1}$$
$$K - L = [k_1 - l_2,\ k_2 - l_1] \tag{2}$$
$$K \times L = [\min(k_1 l_1,\ k_1 l_2,\ k_2 l_1,\ k_2 l_2),\ \max(k_1 l_1,\ k_1 l_2,\ k_2 l_1,\ k_2 l_2)] \tag{3}$$
$$\frac{K}{L} = K \times L^{-1} \tag{4}$$
where $L^{-1} = [1/l_2,\ 1/l_1]$ with $0 \notin [l_1, l_2]$. the distance between $K$ and $L$ is defined as
$$d(K, L) = \max[\,|k_1 - l_1|,\ |k_2 - l_2|\,] \tag{5}$$
complex uncertainty can be obtained by representing real numbers in the complex domain. the pf study utilizes the above-mentioned fundamental operations to calculate the link between uncertain variables in terms of complex interval numbers. in this research, ia is used to deal with the uncertainties in the n/w data. therefore, reactance, resistance, bus voltage and n/w power loss are taken as interval numbers instead of fixed values. rather than the fixed variation discussed in section 3.2, the n/w load at a bus is assumed to fluctuate over a specified range based on a gaussian distribution.
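The interval operations (1) to (5) can be sketched in a few lines of Python. This is a minimal illustration for real-valued closed intervals only; the class name `Interval` and its method names are assumptions made for this example, and the complex interval case used in the pf study is not covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed real interval [lo, hi] (hypothetical helper for illustration)."""
    lo: float
    hi: float

    def __add__(self, other):
        # Eq. (1): add matching end points.
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        # Eq. (2): lower limit minus upper limit, and vice versa.
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other):
        # Eq. (3): min and max over the four end-point products.
        products = (self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi)
        return Interval(min(products), max(products))

    def __truediv__(self, other):
        # Eq. (4): multiply by the reciprocal interval, defined only if 0 is excluded.
        if other.lo <= 0 <= other.hi:
            raise ZeroDivisionError("divisor interval contains zero")
        return self * Interval(1 / other.hi, 1 / other.lo)

    def width(self):
        return self.hi - self.lo

    def distance(self, other):
        # Eq. (5): largest end-point deviation.
        return max(abs(self.lo - other.lo), abs(self.hi - other.hi))


k = Interval(1, 2)
l = Interval(2, 4)
print(k + l)          # Interval(lo=3, hi=6)
print(k - l)          # Interval(lo=-3, hi=0)
print(k * l)          # Interval(lo=2, hi=8)
print(k / l)          # Interval(lo=0.25, hi=1.0)
print(k.distance(l))  # 2
```

Note that the result of k - l is wider than either operand: the widths add under both addition and subtraction, which is why interval results bound all feasible outcomes rather than a single estimate.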
when the load demand changes over the interval, the number of pf computations required is smaller than the total number of repeated pf solutions.

3. mathematical model

3.1. line variation model

fig. 1 is a one-line diagram of a branch connecting bus i-1 and bus i.

fig. 1 single branch equivalent

from fig. 1,

$$P_i + jQ_i = V_i\angle\delta_i \cdot I_i^{*} \tag{6}$$

where $V_i$ is the receiving-end rms bus voltage and $\delta_i$ is the voltage angle at bus i. the real and reactive power loads fed through the ith bus are $P_i$ and $Q_i$, respectively. every bus, including the source bus, has an initial voltage of [1.0,1.0] + j[0.0,0.0] p.u. since both power and voltage are complex interval variables, the resulting current at bus i ($I_i$), given in (7), is also a complex interval quantity that can be evaluated using the division operation (4).

$$I_i = \frac{[P_i^{lo},\, P_i^{up}] - j\,[Q_i^{lo},\, Q_i^{up}]}{V_i\angle{-\delta_i}} \tag{7}$$

where $P_i^{lo}$, $Q_i^{lo}$ and $P_i^{up}$, $Q_i^{up}$ are the lower and upper limits of the real and reactive power load at the ith bus, respectively. the n/w real and reactive losses in a branch are

$$P_{loss}(i-1, i) = \frac{P_i^2 + Q_i^2}{|V_i|^2}\, R_i \tag{8}$$

$$Q_{loss}(i-1, i) = \frac{P_i^2 + Q_i^2}{|V_i|^2}\, X_i \tag{9}$$

where $X_i$ and $R_i$ are the branch reactance and resistance, respectively. the uncertainty of the line parameters is introduced at a constant proportion as

$$X_i^{lo} = (1 - \%(X))\, X_i \tag{10}$$

$$X_i^{up} = (1 + \%(X))\, X_i \tag{11}$$

$$R_i^{lo} = (1 - \%(R))\, R_i \tag{12}$$

$$R_i^{up} = (1 + \%(R))\, R_i \tag{13}$$

where $R_i^{lo}$, $X_i^{lo}$ and $R_i^{up}$, $X_i^{up}$ are the lower and upper bounds on the n/w resistance and reactance, respectively.

3.2. variation in load model

in the rdn, the majority of the loads are frequency and voltage-dependent [23]. for analyzing static load, only the variation in voltage is considered, as the deviation in frequency is not significant [24].
the load model generally chosen is the complex power type, but in reality the load is a combination of numerous load models. therefore, this study evaluates the impact of realistic load models, viz. industrial, residential and commercial loads, in the distribution n/w, which is particularly important for dnos in various planning scenarios. the considered load models can be expressed mathematically as [25]

$$P_i = P_i^{no} \left( \frac{|V_i|}{|V_i^{no}|} \right)^{x_1} \tag{14}$$

$$Q_i = Q_i^{no} \left( \frac{|V_i|}{|V_i^{no}|} \right)^{x_2} \tag{15}$$

where $x_1$ and $x_2$ are the load exponents. $Q_i^{no}$ is the nominal reactive load, $P_i^{no}$ is the rated real load and $V_i^{no}$ is the rated bus voltage at the ith bus. in the present study, the real and reactive exponents taken for the constant power load (cpl), industrial load (il), residential load (rl) and commercial load (cml) models are 0 & 0, 0.18 & 6.00, 0.92 & 4.04 and 1.51 & 3.40, respectively [25]. since practically any type of load might be present in the n/w, a composite load (cl) model is considered for the analysis, with 40% of cpl, 30% of il, 20% of rl and 10% of cml [25].

the gaussian distribution function is used to predict the change in n/w power load demand. the gaussian distribution is a symmetric, bell-shaped distribution about its mean value and is given by (16)

$$f(y_i) = \frac{1}{\sqrt{2\pi\sigma^2}}\; e^{-\frac{1}{2}\frac{(y_i - \mu)^2}{\sigma^2}} \tag{16}$$

where $\sigma^2$ and $\mu$ are the distribution parameters representing the variance and the mean (expected) value of the base loads, respectively. the variance represents how much the random variable is expected to deviate from its mean value (in a certain percentage). the normalised value of the reactive or real power load at bus i of the considered network is denoted by $y_i$ and is given as in [17]

$$y_i = \frac{P_i}{P_i^{no}} \quad \text{and} \quad y_i = \frac{Q_i}{Q_i^{no}} \tag{17}$$

where $P_i$ and $Q_i$ are defined by (14) and (15), respectively. $\alpha_{ql}(k)$ and $\alpha_{pl}(k)$ are the degrees of belongingness for the reactive and real power load, where k indexes the degrees of belongingness. from the load curve illustrated in fig.
2, the mean value of the normalized real and reactive power load is unity for the degree of belongingness 1.0. the degree of belongingness can take any value between plmax / n and plmax, where n is the number of points of linearization of the gaussian curve and plmax is the maximum degree of belongingness. the gaussian distribution curve for a real power load is depicted in fig. 2 [17].

fig. 2 gaussian distribution of load

equation (16) can be written as

$$\alpha_{pl}(k_d) = f\!\left[\frac{P_i}{P_i^{no}}\right] = \frac{1}{\sqrt{2\pi\sigma^2}}\; e^{-\left[\frac{P_i}{P_i^{no}} - \mu\right]^2 / 2\sigma^2} \tag{18}$$

from equation (18), setting $\alpha_{pl}(k_d) = 1.0$ at $P_i/P_i^{no} = \mu = 1.0$ gives $\sigma = 1/\sqrt{2\pi} = 0.399$, so that $2\sigma^2 = 1/\pi$. for these values, equation (18) can be written as

$$\frac{P_i}{P_i^{no}} - 1 = \pm\sqrt{\frac{-\ln(\alpha_{pl}(k_d))}{\pi}} \quad \text{for } \frac{P_i}{P_i^{no}} \neq 1 \tag{19}$$

similarly, for the reactive load

$$\frac{Q_i}{Q_i^{no}} - 1 = \pm\sqrt{\frac{-\ln(\alpha_{ql}(k_d))}{\pi}} \quad \text{for } \frac{Q_i}{Q_i^{no}} \neq 1 \tag{20}$$

the right-hand sides of equations (19) and (20) can be specified as

$$\sqrt{\frac{-\ln(\alpha_{pl}(k_d))}{\pi}} = \alpha_{k_d} = \sqrt{\frac{-\ln(\alpha_{ql}(k_d))}{\pi}} \tag{21}$$

thus, (19) can be rewritten as

$$\frac{P_i}{P_i^{no}} = 1 \pm \alpha_{k_d} \tag{22}$$

$$P_i = P_i^{no}\,(1 \pm \alpha_{k_d}) \tag{23}$$

where the ± sign gives the lower and upper bounds of the n/w load at bus i.

$$P_i^{lo} = P_i^{no}\,(1 - \alpha_{k_d}) \tag{24}$$

$$P_i^{up} = P_i^{no}\,(1 + \alpha_{k_d}) \tag{25}$$

$$Q_i^{lo} = Q_i^{no}\,(1 - \alpha_{k_d}) \tag{26}$$

$$Q_i^{up} = Q_i^{no}\,(1 + \alpha_{k_d}) \quad \text{where } k_d = 1, 2, \ldots, n \tag{27}$$

linearization at different $k_d$ values in equations (24)-(27) results in $k_d$ discrete load intervals in closed form. for the purpose of the analysis, the linearization is carried out at three different points, which results in three distinct load intervals (d-regions), as shown in fig. 2 and given below

$$D_1 \rightarrow \{P_i^{no},\; P_i^{no}\} \quad \text{(point interval) for } k_d = 1 \tag{28}$$

$$D_2 \rightarrow \{P_i^{no}[1 - \alpha_2],\; P_i^{no}[1 + \alpha_2]\} \quad \text{for } k_d = 2 \tag{29}$$

$$D_3 \rightarrow \{P_i^{no}[1 - \alpha_3],\; P_i^{no}[1 + \alpha_3]\} \quad \text{for } k_d = 3 \tag{30}$$

equations (28)-(30) show that d1, d2 and d3 are in bounded form. therefore, ia operations are used to introduce these variations into the power flow.

3.3.
dg modelling

the generator bus has been characterised as a continuous negative pq load for the small size dg resources, implying that they run in constant power mode. according to the ieee 1547 standard [26], the dgs are not meant for regulating the voltage at the buses, as they may conflict with the utilities' existing distribution voltage regulating schemes [27]. the total n/w load is reduced by the power generated by the connected dg. the solar pv-dg injects real power at unity power factor. the resultant load at bus i at which the solar pv-dg has been placed is

$$P_{re,i} = P_i - P_{solar\,PV\text{-}DG,i} \tag{31}$$

where $P_{solar\,PV\text{-}DG,i}$ is the real power injected by the solar pv-dg at bus i.

4. pso algorithm

pso is a stochastic technique in which each particle in a search space alters its state. in a d-dimensional hyperspace, the updated particle velocity and position are expressed as

$$v_{pd}^{n+1} = w_p\, v_{pd}^{n} + c_1\, rand_1\,(pbest_{pd} - S_{pd}^{n}) + c_2\, rand_2\,(gbest_{pd} - S_{pd}^{n}) \tag{32}$$

$$S_{pd}^{n+1} = S_{pd}^{n} + v_{pd}^{n+1} \tag{33}$$

where $S_{pd}^{n}$ and $v_{pd}^{n}$ are the particle's current position and velocity at the nth iteration, respectively, and $p = 1, 2, \ldots, N_s$, where $N_s$ is the swarm size. $c_1$ and $c_2$ are the first and second acceleration coefficients. $rand_1(\cdot)$ and $rand_2(\cdot)$ are random numbers in the interval [0,1]. $pbest_{pd}$ and $gbest_{pd}$ are the particle's personal best and the global best position, respectively. the inertia weight of particle p ($w_p$) is given as

$$w_p = w_{pmax} - \frac{(w_{pmax} - w_{pmin})}{n_{max}}\; n \tag{34}$$

where $w_{pmin}$ and $w_{pmax}$ are the minimum and maximum inertia weight values, and $n_{max}$ and n are the maximum and current iteration numbers, respectively.

5. development of novel vsi and optimal allocation of solar pv-dg

a novel vsi is proposed to site the dg optimally. substituting the value of $I_i$ from (6) into the voltage equation of the branch in fig. 1, we get

$$V_i\angle\delta_i = V_{i-1}\angle 0 - \left[ (R_i + jX_i) \cdot \left( \frac{P_i - jQ_i}{V_i\angle{-\delta_i}} \right) \right] \tag{35}$$

multiplying both sides of (35) by $V_i\angle{-\delta_i}$, we obtain

$$V_i^2 = V_{i-1}V_i\angle{-\delta_i} - (R_i + jX_i)(P_i - jQ_i) \tag{36}$$

$$V_i^2 + \left[ P_iR_i + Q_iX_i + j(P_iX_i - Q_iR_i) \right] = V_{i-1}V_i\cos\delta_i - jV_{i-1}V_i\sin\delta_i \tag{37}$$

separating the real and imaginary parts of (37), we obtain

$$V_i^2 + P_iR_i + Q_iX_i = V_{i-1}V_i\cos\delta_i \tag{38}$$

$$P_iX_i - Q_iR_i = -V_{i-1}V_i\sin\delta_i \tag{39}$$

substituting $X_i$ from (39) in (38), we obtain

$$V_i^2 + P_iR_i + Q_i \left( \frac{Q_iR_i - V_{i-1}V_i\sin\delta_i}{P_i} \right) = V_{i-1}V_i\cos\delta_i \tag{40}$$

$$V_i^2 - \frac{Q_iV_{i-1}V_i\sin\delta_i}{P_i} - V_{i-1}V_i\cos\delta_i + P_iR_i + \frac{R_iQ_i^2}{P_i} = 0 \tag{41}$$

$$V_i^2 + \left( \frac{-Q_iV_{i-1}\sin\delta_i - P_iV_{i-1}\cos\delta_i}{P_i} \right) V_i + R_i\left( P_i + \frac{Q_i^2}{P_i} \right) = 0 \tag{42}$$

for the bus voltages to be stable, (42) must have real roots, i.e. its discriminant must be non-negative, which gives (43) and the proposed vsi for the given branch in (44)

$$\left( \frac{-Q_iV_{i-1}\sin\delta_i - P_iV_{i-1}\cos\delta_i}{P_i} \right)^2 - 4R_i\left( P_i + \frac{Q_i^2}{P_i} \right) \geq 0 \tag{43}$$

$$\frac{4R_iP_i^2}{(Q_iV_{i-1}\sin\delta_i + P_iV_{i-1}\cos\delta_i)^2} \left( P_i + \frac{Q_i^2}{P_i} \right) \leq 1 \tag{44}$$

the bus voltage determined from the pf solution is used to calculate the vsi for each branch, which lies in the range [0,1]. a vsi value near 0 indicates stable operation; in contrast, a vsi value approaching 1 indicates that the bus is gradually moving towards instability. the constraints taken for the analysis are:

a) power balance:

$$P_G = P_D + P_{loss} \tag{45}$$

$$Q_G = Q_D + Q_{loss} \tag{46}$$

where $Q_G$ and $P_G$ are the reactive and real power generated, and $Q_D$ and $P_D$ are the reactive and real load demand on the network.

b) voltage constraint:

$$0.95\ \text{p.u.} \leq V_i \leq 1.05\ \text{p.u.} \tag{47}$$

c) current constraint:

$$I_{branch} \leq I_{thermal} \tag{48}$$

where $I_{branch}$ and $I_{thermal}$ are the branch current and its thermal limit, respectively.

d) dg power generation constraint:

$$0 \leq P_{solar\,PV\text{-}DG,i} \leq \sum P_{Load} \tag{49}$$

where $\sum P_{Load}$ is the total real power load in the network.

e) substation capacity:

$$0 \leq P_g^{i} \leq P_{g(max)}, \quad i \in \text{slack} \tag{50}$$

$$0 \leq Q_g^{i} \leq Q_{g(max)} \tag{51}$$

where $Q_{g(max)}$ and $P_{g(max)}$ are the maximum values of reactive and real power generation, and $Q_g^{i}$ and $P_g^{i}$ are the reactive and real power generated at the slack bus, respectively.

the pseudo-code for the optimal integration of solar pv-dg for the deterministic case in the rdn is given below:

step i: run the pf program to calculate the bus voltage magnitude and phase angle, branch currents and n/w power losses of the rdn for the base case using the direct pf method [28]. the following iterative formula is used to determine the solution

$$[V_i^{n}] = [V_i^{0}] + [BCBV][BIBC][I_i^{n-1}] \tag{52}$$

where bcbv stands for the branch current-to-bus voltage matrix and bibc stands for the bus injection-to-branch current matrix. the initial voltage ($V_i^{0}$) is 1.0 + j0 p.u. $I_i^{n-1}$ is the current at iteration n-1 and $V_i^{n}$ is the bus voltage at the nth iteration at bus i.

step ii: determine the cost of annual energy loss [29] using

$$\text{annual cost of energy losses} = \left( \sum_{i=2}^{nb} P_{Loss}(i-1, i) \cdot T \cdot E \right) \$ \tag{53}$$

where nb is the number of buses, t is the annual time duration (8760 hrs) and e is the cost of energy (0.06 $/kwh).

step iii: evaluate the vsi using (44). select bus i as the most sensitive bus to place the dg if the branch between bus i-1 and i has the greatest vsi value.

step iv: set the pso parameters (swarm size, inertia weights, acceleration coefficients) for minimizing the real power loss (rpl).

step v: set the iteration counter (n) to 0.

step vi: create the particles (solar pv-dg sizes, continuous between 0 and the total system load) with random positions and velocities, and take the initial positions as pbest.

step vii: after installing the dg at the location obtained in step iii, repeat the pf algorithm for each particle. calculate the rpl for the randomly initialised particles if all constraints are within limits. otherwise, discard the infeasible solution.

step viii: the solar pv-dg size giving the minimum rpl value is taken as gbest and its corresponding position is considered as the particle best position.
step ix: the particles' velocity, position and weight are updated using (32), (33) and (34), respectively.

step x: if the maximum number of iterations (nmax) is reached, jump to step xi. otherwise, the counter is incremented and steps iv through x are repeated. if the newly obtained particle position is superior to the prior pbest and gbest, new pbest and gbest are generated.

step xi: the best position denotes the optimal solar pv-dg size, while the corresponding value denotes the lowest total rpl.

step xii: determine the annual energy loss savings after calculating the cost of energy losses in the presence of dg using (53).

6. interval-based b/f pf solution methodology

the following steps are used to determine the value of the bus voltage and n/w losses:

step i: read the n/w data.

step ii: determine the degree of belongingness $\alpha_{pl}(k_d)$ and $\alpha_{ql}(k_d)$ for n intervals.

step iii: the complex interval numbers $V_i^{n}$ and $V_i^{0}$ can be written as $V_i^{n} = A_1 + jA_2$ and $V_i^{0} = B_1 + jB_2$, where $A_1$, $A_2$, $B_1$ and $B_2$ are all interval numbers. the starting voltage for all buses is [1.0,1.0] + j[0.0,0.0] p.u. at first, the n/w losses are set to zero. the iteration count and slack bus angle are initialized to zero.

step iv: calculate the equivalent real power load at the bus using (31) after optimal siting and sizing of the solar pv-dg as explained in step iii and step xi of section 5, respectively.

step v: determine the closed bounded intervals of the line and load data from (10) through (13) and (24) through (27), respectively, for the various degrees of belongingness.

step vi: for the complex nature of the load, update the bounded intervals of the real and reactive power load using (14) and (15).

step vii: form bibc, bcbv and distribution pf matrices.
step viii: determine the currents and voltages at each bus using (7) and (52), applying the subtraction, addition, division and multiplication operations on complex interval numbers described in section 2.

step ix: the voltage difference between two successive iterations can be given as

$$V_i^{n} - V_i^{0} = \max[d(A_1, B_1),\; d(A_2, B_2)] \tag{54}$$

where $d(A_1, B_1)$ and $d(A_2, B_2)$ are calculated using (5). if $\max[d(A_1, B_1),\, d(A_2, B_2)] < 10^{-4}$ at all the buses, jump to step x; else jump to step v.

step x: use (8) and (9) to calculate the n/w power losses.

step xi: print the results for the specific value of $\alpha_{pl}(k_d)$ and $\alpha_{ql}(k_d)$.

step xii: if $k_d = n$, terminate the program; otherwise increment $k_d$ and go to step ii.

7. results and discussion

to demonstrate the performance and robustness of the ia-based pf technique in the presence of various realistic loads and solar pv-dg, ieee distribution test networks of varying complexity and size are simulated. the complete n/w data for the 33-bus and 69-bus n/w are taken from [30] and [31], respectively. for both networks under consideration, the base kv and mva are 12.66 and 100, respectively. the bus feeders of the ieee n/w were tested in matlab. a voltage error tolerance of 0.0001 p.u. is considered for all the test cases in this work. the piecewise segmentation of the annual load profile into light, nominal and heavy load levels is assumed to be 50%, 100% and 160% of the rated n/w load [32], with annual durations of 1000, 6760 and 1000 hours [11], respectively. to confirm the efficacy of the suggested methodology, the simulated network performance is compared to previously published findings for deterministic parameters, for the same base voltage and load model. in table 1, the power flow results at various realistic loads for the 33-bus rdn are compared to the published literature. the n/w real and reactive power losses are 5.5% and 5.9% of their respective load for the cpl model, whereas for the cl model the real and reactive losses reduce to 4.7% and 5.04%, respectively. the convergence is also faster when compared to that of [25].

table 1 power flow result at various load models for 33-bus rdn

| method | quantity | cpl | il | rl | cml | cl |
|---|---|---|---|---|---|---|
| proposed method | total rpl (kw) | 202.66 | 161.28 | 158.54 | 153.57 | 174.21 |
| proposed method | total reactive power loss (kvar) | 135.13 | 107.20 | 105.31 | 101.94 | 115.89 |
| proposed method | number of iterations | 4 | 2 | 2 | 2 | 2 |
| [25] | total rpl (kw) | 202.68 | 161.69 | 159.33 | 154.93 | 174.19 |
| [25] | total reactive power loss (kvar) | 135.23 | 107.56 | 105.92 | 102.94 | 115.97 |
| [25] | number of iterations | 3 | 4 | 4 | 4 | 3 |

assuming a constant annual load with only one type of load model is misleading, because the n/w load profile is strongly affected by the type of load model and by time variations; hence light, nominal and heavy load levels are considered. the vmin and network power losses obtained for the ieee 69-bus n/w at different load levels for the cpl and cl models are given in table 2. for the 69-bus n/w, the power losses and the convergence characteristics obtained for the deterministic case are compared at nominal load in table 3 to illustrate the capability of the proposed method. from table 2, it is inferred that a reduction in load has a positive effect on the n/w bus voltage profile, while an increase in load degrades it, in both cases. for the 69-bus rdn, the bus voltage profile obtained with the cl and cpl models is given in fig. 3. the results demonstrate that the effect of the cl model on n/w performance is over-represented as compared to the cpl model. since the n/w power loss does not vary in proportion to the network load, it is better to provide generalized equations for the n/w power losses using curve fitting. the generalized power loss equations for the cpl and cl models are as follows:

for cpl,

$$P_{loss}\,(kW) = 332.11\lambda^2 - 151.68\lambda + 44.37 \tag{55}$$

$$Q_{loss}\,(kVAr) = 147.98\lambda^2 - 64.88\lambda + 18.98 \tag{56}$$

for cl,

$$P_{loss}\,(kW) = 149.22\lambda^2 + 20.51\lambda - 3.71 \tag{57}$$

$$Q_{loss}\,(kVAr) = 69.28\lambda^2 + 9.24\lambda - 1.73 \tag{58}$$

where λ represents the load level. the general expressions (55) to (58) are useful for dnos in future power generation planning.
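as a quick check of the fitted expressions (55)-(58), the sketch below evaluates them at the three load levels used in the paper (λ = 0.5, 1.0 and 1.6, i.e. 50%, 100% and 160% of rated load). the coefficients are copied from the equations; the dictionary and function names are illustrative only.

```python
# Coefficients (a, b, c) of loss = a*lam**2 + b*lam + c, copied from (55)-(58).
LOSS_FITS_NO_DG = {
    ("cpl", "p_kw"): (332.11, -151.68, 44.37),    # eq. (55)
    ("cpl", "q_kvar"): (147.98, -64.88, 18.98),   # eq. (56)
    ("cl", "p_kw"): (149.22, 20.51, -3.71),       # eq. (57)
    ("cl", "q_kvar"): (69.28, 9.24, -1.73),       # eq. (58)
}

# Load levels stated in the text: 50 %, 100 % and 160 % of the rated load.
LOAD_LEVELS = {"light": 0.5, "nominal": 1.0, "heavy": 1.6}


def fitted_loss(model, quantity, load_level):
    """Evaluate the quadratic loss fit for a load model at load level lambda."""
    a, b, c = LOSS_FITS_NO_DG[(model, quantity)]
    return a * load_level ** 2 + b * load_level + c


for model in ("cpl", "cl"):
    for name, lam in LOAD_LEVELS.items():
        p = fitted_loss(model, "p_kw", lam)
        q = fitted_loss(model, "q_kvar", lam)
        print(f"{model:>3} {name:>7}: P = {p:7.2f} kW, Q = {q:7.2f} kVAr")
```

at nominal load the real-power fits return 224.80 kw for cpl and 166.02 kw for cl, the rpl values reported for the 69-bus network without dg.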
table 2 results of 69-bus n/w at various load levels with no solar pv-dg

| load model | load level | vmin (p.u.) | rpl (kw) | reactive power loss (kvar) |
|---|---|---|---|---|
| cpl | light | 0.9567 | 51.56 | 23.54 |
| cpl | nominal | 0.9092 | 224.80 | 102.09 |
| cpl | heavy | 0.8446 | 651.88 | 294.02 |
| cl | light | 0.9597 | 43.85 | 20.21 |
| cl | nominal | 0.9211 | 166.02 | 76.79 |
| cl | heavy | 0.8757 | 411.11 | 190.41 |

table 3 comparative analysis of cl model at nominal load and without dg for 69-bus rdn

| quantity | proposed method | [25] |
|---|---|---|
| rpl (kw) | 166.02 | 189.3761 |
| reactive power loss (kvar) | 76.79 | 86.8497 |
| number of iterations | 2 | 3 |

fig. 3 voltage profile of 69-bus rdn with cpl and cl model

validation of novel vsi

to authenticate the index, the n/w load (both p and q) is subjected to random variation between 0 and 160% of the total base load. for the 69-bus n/w, branch 60, with a value of 0.0286, is found to have the largest vsi value. as a result, bus 61 is regarded as the most vulnerable bus beyond critical loading and is investigated as the load increases. fig. 4 displays the vsi variation at the critical bus under various loading conditions. as the vsi varies linearly with the n/w load increment, it can be used for accurate prediction of the voltage stability in the rdn, as also concluded in [32].

fig. 4 variation in the value of vsi at various load levels in 69-bus rdn

7.1. analysis of n/w performance with optimal integration of solar pv-dg

the proposed methodology has been employed on the 69-bus rdn for the optimal integration of solar pv-dg with deterministic parameters. the nmax and swarm size are 130 and 20, respectively. to achieve fast convergence of the optimization technique, the control variables $c_1$ and $c_2$ in (32) and $w_{pmax}$ and $w_{pmin}$ in (34) are chosen as 2, 2, 0.9 and 0.4, respectively [33]. the value of vsi for each branch of 69-bus rdn is shown in fig. 5.
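the two-stage procedure (siting by the largest branch vsi from (44), then sizing by pso with the parameter values quoted above) can be sketched as follows. this is a minimal illustration under stated assumptions, not the authors' implementation: the branch data are hypothetical per-unit numbers, `toy_loss` is a hypothetical convex stand-in for the rpl returned by a power flow run, and clamping particles to the search bounds replaces the paper's "discard infeasible solution" rule.

```python
import math
import random


def vsi(r, p, q, v_send, delta):
    """Branch index of eq. (44); p, q are the receiving-end loads, v_send = |V_(i-1)|."""
    denom = (q * v_send * math.sin(delta) + p * v_send * math.cos(delta)) ** 2
    return 4.0 * r * p ** 2 * (p + q ** 2 / p) / denom


def pso_minimize(objective, lower, upper, swarm_size=20, n_max=130,
                 c1=2.0, c2=2.0, w_max=0.9, w_min=0.4, seed=0):
    """One-dimensional PSO following (32)-(34); bounds handled by clamping (assumption)."""
    rng = random.Random(seed)
    pos = [rng.uniform(lower, upper) for _ in range(swarm_size)]
    vel = [0.0] * swarm_size
    pbest = pos[:]
    pbest_val = [objective(x) for x in pos]
    g = min(range(swarm_size), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g], pbest_val[g]
    for n in range(1, n_max + 1):
        w = w_max - (w_max - w_min) / n_max * n              # eq. (34)
        for p in range(swarm_size):
            r1, r2 = rng.random(), rng.random()
            vel[p] = (w * vel[p] + c1 * r1 * (pbest[p] - pos[p])
                      + c2 * r2 * (gbest - pos[p]))          # eq. (32)
            pos[p] = min(max(pos[p] + vel[p], lower), upper)  # eq. (33) + clamp
            val = objective(pos[p])
            if val < pbest_val[p]:
                pbest[p], pbest_val[p] = pos[p], val
                if val < gbest_val:
                    gbest, gbest_val = pos[p], val
    return gbest, gbest_val


# Hypothetical branch data: receiving bus -> (r, p, q, v_send, delta) in per unit.
branches = {
    2: (0.01, 0.5, 0.2, 1.00, -0.01),
    3: (0.03, 0.8, 0.4, 0.98, -0.02),
    4: (0.02, 0.3, 0.1, 0.97, -0.02),
}
# Stage 1: the receiving bus of the branch with the largest VSI is the DG site.
weakest_bus = max(branches, key=lambda bus: vsi(*branches[bus]))


def toy_loss(size):
    """Hypothetical stand-in for the RPL from a power flow with a DG of this size."""
    return (size - 3.0) ** 2 + 1.0


# Stage 2: size the DG at that bus by minimising the (stand-in) loss.
best_size, best_loss = pso_minimize(toy_loss, 0.0, 10.0)
print(weakest_bus, round(best_size, 3), round(best_loss, 3))
```

in a real study the objective passed to `pso_minimize` would run the power flow with the candidate dg size installed at `weakest_bus` and return the total rpl, rejecting candidates that violate (45)-(51).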
bus 61 is found to have the maximum vsi value and is chosen as the best dg placement location at the nominal load level.

fig. 5 vsi at each branch in 69-bus rdn

the optimal size and site of the solar pv-dg are determined for various loading scenarios using the proposed method. the dg size follows a linear relationship with the n/w load increment for the 69-bus network, as shown in fig. 6. the n/w performance in terms of vmin, power losses, rpl reduction and cost of annual energy losses obtained at the nominal load level after integrating solar pv-dg for the cpl and cl models is given in table 4.

fig. 6 optimal solar pv-dg size at various load levels in 69-bus network

table 4 results for solar pv-dg at nominal load level

| quantity | base case | cpl | cl |
|---|---|---|---|
| optimal dg location @ solar pv-dg size in kw | | 61 @ 1888 | 61 @ 1888 |
| vmin in pu @ bus (% voltage improvement) | 0.9092 @ 65 | 0.9684 @ 27 (6.5 %) | 0.9700 @ 27 (6.7 %) |
| rpl (kw) | 224.80 | 83.17 | 70.40 |
| rpl reduction (kw) (in %) | | 141.63 (63.00%) | 154.40 (68.68%) |
| reactive power loss (kvar) (reduction in %) | 102.09 | 40.51 (60.31%) | 34.88 (65.83%) |
| annual cost of energy loss ($) | 91178.88 | 33733.75 | 28554.24 |
| aels ($) | | 57445.13 | 62624.64 |

7.1.1. impact of solar pv-dg on n/w power loss

the integration of solar pv-dg has a considerable effect on n/w losses. to validate this, general mathematical expressions of ploss and qloss for any load level in the 69-bus n/w are derived using a curve fitting approach, as given below

for cpl model,

$$P_{loss}\,(kW) = 92.40\lambda^2 - 12.87\lambda + 3.64 \tag{59}$$

$$Q_{loss}\,(kVAr) = 44.68\lambda^2 - 5.8\lambda + 1.64 \tag{60}$$

for cl model,

$$P_{loss}\,(kW) = 55.15\lambda^2 + 20.76\lambda - 5.51 \tag{61}$$

$$Q_{loss}\,(kVAr) = 28.21\lambda^2 + 9.09\lambda - 2.42 \tag{62}$$

after comparing the n/w power losses without dg [(55)-(58)] and with solar pv-dg [(59)-(62)], we can conclude that the integration of renewable dg reduces the n/w losses at all load levels for all types of load models.
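the nominal-load entries of table 4 can be reproduced approximately from the fits (59) and (61) and the cost rate of (53). one assumption is made loudly here: the tabulated annual costs match a duration of 6760 h (the nominal-load segment of the load profile) rather than the full 8760 h, which is an inference from the numbers and not a statement in the paper. all names in the sketch are illustrative.

```python
ENERGY_COST = 0.06     # $/kWh, cost of energy E from (53)
NOMINAL_HOURS = 6760   # h/year at nominal load (assumed from the tabulated costs)
BASE_RPL_KW = 224.80   # base case: CPL model without DG at nominal load

# Real-power loss fits with solar PV-DG: (a, b, c) of a*lam**2 + b*lam + c.
FIT_RPL_WITH_DG = {
    "cpl": (92.40, -12.87, 3.64),   # eq. (59)
    "cl": (55.15, 20.76, -5.51),    # eq. (61)
}


def rpl_with_dg(model, load_level=1.0):
    """RPL in kW with solar PV-DG, from the fitted quadratic."""
    a, b, c = FIT_RPL_WITH_DG[model]
    return a * load_level ** 2 + b * load_level + c


def reduction_percent(base_kw, new_kw):
    """Loss reduction relative to the base case, in percent."""
    return 100.0 * (base_kw - new_kw) / base_kw


def annual_cost(loss_kw, hours=NOMINAL_HOURS, price=ENERGY_COST):
    """Cost of energy loss in $ for a constant loss over the given hours."""
    return loss_kw * hours * price


base_cost = annual_cost(BASE_RPL_KW)
for model in ("cpl", "cl"):
    loss = rpl_with_dg(model)
    saving = base_cost - annual_cost(loss)  # AELS at nominal load
    print(f"{model}: RPL = {loss:.2f} kW, "
          f"reduction = {reduction_percent(BASE_RPL_KW, loss):.2f} %, "
          f"AELS = {saving:.2f} $")
```

both reductions are taken with respect to the cpl base case, which is how the 68.68% figure for the cl model arises.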
the relationship between the rpl and the solar pv-dg size for the 69-bus rdn with the cl model is illustrated in fig. 7 and appears to follow a parabolic curve. the left portion of the curve shows that the rpl decreases as the size of the solar pv-dg increases. the optimum dg size is reached at the lowest point of the curve, where the rpl attains its minimum value; beyond this point the rpl increases with dg size (right part of the curve) due to the extra current flowing from the dg to the adjacent bus. the rpl in the 69-bus rdn without solar pv-dg was found to be 224.80 kw and 166.02 kw for the cpl and cl models, respectively. after optimal integration of solar pv-dg, the rpl for the cpl and cl models in the ieee 69-bus n/w falls to 83.17 kw and 70.40 kw, a decrease of 63.00% and 68.68%, respectively, with respect to the base case (from table 2). the real power demand released after installing the solar pv-dg is 141.63 kw for the cpl model and 154.40 kw for the cl model. the rpl magnitude at each branch with and without solar pv-dg is shown in fig. 8 for the 69-bus n/w with the cpl model at nominal load. the results confirm that the n/w loss is reduced after integration of solar pv-dg.

fig. 7 relationship between rpl and solar pv-dg size at optimal dg bus in 69-bus n/w

fig. 8 network rpl with and without renewable dg at nominal load in 69-bus rdn

7.1.2. impact of solar pv-dg on bus voltage profile

the vmin for the 69-bus n/w improves from 0.9092 pu at bus 65 to 0.9684 pu and 0.9700 pu at bus 27 for the cpl and cl models, respectively, a 6.5 % and 6.7 % increase in bus voltage magnitude (from table 4). for both load models, the n/w voltage profile is enhanced after solar pv-dg installation while satisfying the constraints. fig.
9 impact of n/w load variation on voltage profile in 69-bus n/w for cl model

the effect of different load levels on the bus voltage profile considering realistic loads is analysed in fig. 9, which shows a marked enhancement in the voltage profile of the 69-bus n/w at all the considered load levels after installing solar pv-dg. fig. 9 also shows that all the bus voltages obtained from the proposed method are within the allowable voltage limits, which supports the consistency of the method. thus, the obtained integrated solution is very beneficial for dnos.

7.1.3. impact of solar pv-dg on aels

the cost of energy loss in the 69-bus rdn before integrating solar pv-dg was $91178.88, which is reduced to $33733.75 and $28554.24 with solar pv-dg, resulting in aels of $57445.13 and $62624.64 for the cpl and cl models, respectively, at the nominal load level with respect to the base case, as given in table 4. the aels for the cpl and cl models obtained considering all the load levels are $85257.73 and $93577.34, respectively, compared to the base case.

7.1.4. comparative analysis

to authenticate the efficacy of the method, the test results obtained after the penetration of solar pv-dg are compared to other available meta-heuristic methods, namely woa [34], gwo [35], pso [36], sga [36], csa [36] and bb-bc [37], in table 5 for the 69-bus n/w. because both the rpl reduction and the dg size vary between methods, they are compared on a common basis by calculating the ratio of rpl reduction to dg size. the penetration of solar pv-dg in the n/w yields a ratio of 0.075, which is superior or comparable to the already published literature. the higher value of the ratio compared to the already published results signifies the robustness of the proposed approach used for the optimal integration of dg.
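the comparison metric used here is simply the rpl reduction (kw) divided by the dg size (kw). a small sketch follows, using the base-case rpl of 224.80 kw and the size and rpl pairs reported for the 69-bus n/w; the function and variable names are illustrative, and the paper's tabulated ratios appear to be truncated to three decimals, so small differences in the last digit are to be expected.

```python
BASE_RPL_KW = 224.80  # 69-bus RDN, CPL model, nominal load, no DG

# (DG size in kW, RPL with DG in kW) as reported for each method.
METHODS = {
    "proposed": (1888.0, 83.17),
    "woa [34]": (1872.82, 83.23),
    "gwo [35]": (1928.67, 83.24),
    "pso [36]": (2000.0, 83.80),
    "sga [36]": (2300.0, 89.40),
}


def reduction_to_size_ratio(dg_size_kw, rpl_with_dg_kw, base_rpl_kw=BASE_RPL_KW):
    """kW of loss reduction obtained per kW of installed DG capacity."""
    if dg_size_kw <= 0:
        raise ValueError("DG size must be positive")
    return (base_rpl_kw - rpl_with_dg_kw) / dg_size_kw


ratios = {name: reduction_to_size_ratio(size, rpl)
          for name, (size, rpl) in METHODS.items()}
for name, ratio in ratios.items():
    print(f"{name:>9}: {ratio:.4f}")
```

a larger ratio means more loss reduction per installed kw, so an oversized dg is penalised even when its absolute loss reduction is similar.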
to demonstrate its rapid convergence, the computational time required for solar pv allocation at nominal load for the 69-bus n/w is calculated and compared to the existing literature [38] and [12] (table 6).

table 5 comparative analysis of solar pv-dg integration techniques for 69-bus n/w

| dg allocation method | size of dg (kw) / power factor | optimum location | rpl (kw) | % rpl reduction | ratio of rpl reduction to solar pv-dg size |
|---|---|---|---|---|---|
| proposed approach | 1888/1 | 61 | 83.17 | 63.00 | 0.075 |
| woa [34] | 1872.82/1 | 61 | 83.23 | 63.01 | 0.075 |
| gwo [35] | 1928.67/1 | 61 | 83.24 | 62.98 | 0.073 |
| pso [36] | 2000/1 | 61 | 83.80 | 62.75 | 0.070 |
| csa [36] | 2000/1 | 61 | 83.80 | 62.74 | 0.070 |
| sga [36] | 2300/1 | 61 | 89.40 | 60.30 | 0.058 |
| bb-bc [37] | 1872.5/1 | 61 | 83.22 | 63.00 | nr |

nr: not reported

table 6 execution time for solar allocation at nominal load in 69-bus n/w

| | proposed method | analytical method [38] | ga [12] |
|---|---|---|---|
| cpu time (sec) | 0.20 | 0.70 | 0.85 |

7.2. analysis of n/w uncertainties (line and load) with optimal integration of solar pv-dg

for the comparative analysis, the load and line uncertainties for the cpl model are set to 5% and 1%, respectively, as described in [19] for the ieee 33-bus rdn. the interval width for line and load uncertainty at vmin is tabulated and compared with faa [19] in table 7. the results clearly illustrate that the interval width determined from the proposed probabilistic-possibilistic approach is narrower. as a result, the solution is less conservative and superior to the probabilistic technique alone. it can be concluded that increasing the n/w load uncertainty creates a bigger voltage drop than increasing the n/w line uncertainty.

table 7 interval width of vmin for cpl model in 33-bus n/w

| output variable | type of uncertainty | interval width (p.u.), probabilistic-possibilistic approach | interval width (p.u.), faa [19] | interval width reduction in % |
|---|---|---|---|---|
| vmin (p.u.) | line | 0.0019 | 0.0021 | 9.5 |
| vmin (p.u.) | load | 0.0094 | 0.0142 | 33.8 |

in this case, the analysis of uncertainties in the input parameters is presented with solar pv-dg for the 69-bus n/w.
the fixed variation of ±3% in n/w line data has been considered. the solar pv-dg is positioned at bus 61 with dg size of 1888 kw as determined in case 1 from the proposed method. the simulated results for vmin, total real and reactive n/w losses in solar pv-dg integrated ieee 69-bus n/w for the deterministic case and when uncertainties occur in n/w line and load parameter at various degree of belongingness at different load models are tabulated in table 8. at α = 1, the interval widths for cpl, il, rl, cml, and cl are 0.002 pu, 0.0017 pu, 0.0017 pu, 0.0016 pu, and 0.0018 pu, respectively, based on the upper and lower bounds of the vmin. for all practical load models, the interval of voltage magnitude at bus 65 is narrower than that obtained from the cpl load model. as a result, the cl model, as opposed to the cpl model, produces more realistic results. fig. 10 shows the effect of adding uncertainties on the voltage profile of 69-bus n/w for the cl model and three degrees of belongingness (α = 0.2, 0.6, 1). as expected, with deterministic input values, the voltage magnitude at every bus fall within the range of potential n/w states obtained by varying input parameters. 
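the interval widths quoted above follow directly from the lower and upper vmin bounds at α = 1. the sketch below recomputes them and checks that each deterministic vmin lies inside its interval; the values are the vmin entries reported for the 69-bus n/w with solar pv-dg at nominal load, and the names are illustrative.

```python
# load model -> (deterministic vmin, lower bound, upper bound) in p.u. at alpha = 1
VMIN_ALPHA_1 = {
    "cpl": (0.9684, 0.9674, 0.9694),
    "il": (0.9709, 0.9700, 0.9717),
    "rl": (0.9710, 0.9702, 0.9719),
    "cml": (0.9714, 0.9706, 0.9722),
    "cl": (0.9700, 0.9691, 0.9709),
}


def interval_width(lower, upper):
    """Width of a closed interval [lower, upper]."""
    return upper - lower


def contains(lower, upper, value):
    """True when value lies inside the closed interval."""
    return lower <= value <= upper


widths = {}
for model, (deterministic, lower, upper) in VMIN_ALPHA_1.items():
    widths[model] = round(interval_width(lower, upper), 4)
    inside = contains(lower, upper, deterministic)
    print(f"{model:>3}: width = {widths[model]:.4f} p.u., deterministic inside: {inside}")
```

the cpl interval is the widest of the five, which is the basis for the statement that the practical load models give narrower voltage intervals than the cpl model.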
table 8 results for ieee 69-bus n/w with load and line uncertainty and dg penetration at nominal load (vmin in p.u., ploss in kw, qloss in kvar; lower and upper bounds are given for each degree of belongingness αpl, αql)

| load model | quantity | deterministic result | α = 1 lower | α = 1 upper | α = 0.6 lower | α = 0.6 upper | α = 0.2 lower | α = 0.2 upper |
|---|---|---|---|---|---|---|---|---|
| cpl | vmin | 0.9684 | 0.9674 | 0.9694 | 0.9535 | 0.9820 | 0.9425 | 0.9915 |
| cpl | ploss | 83.1722 | 80.5589 | 85.7935 | 28.1596 | 172.5547 | 6.3011 | 262.4480 |
| cpl | qloss | 40.5177 | 39.2487 | 41.7903 | 13.7375 | 83.9240 | 3.0769 | 127.4850 |
| il | vmin | 0.9709 | 0.9700 | 0.9717 | 0.9586 | 0.9828 | 0.9499 | 0.9916 |
| il | ploss | 60.4966 | 59.1220 | 61.8500 | 23.3002 | 110.3419 | 6.0157 | 153.6309 |
| il | qloss | 30.499 | 29.7787 | 31.2103 | 11.5943 | 56.3890 | 2.9513 | 79.266 |
| rl | vmin | 0.9710 | 0.9702 | 0.9719 | 0.9590 | 0.9829 | 0.9507 | 0.9916 |
| rl | ploss | 65.6427 | 64.0243 | 67.2424 | 24.5247 | 122.8959 | 6.0953 | 173.7163 |
| rl | qloss | 32.7774 | 31.9483 | 33.5982 | 12.1344 | 61.9703 | 2.9863 | 88.2260 |
| cml | vmin | 0.9714 | 0.9706 | 0.9722 | 0.9599 | 0.9830 | 0.9519 | 0.9916 |
| cml | ploss | 67.1456 | 65.4526 | 68.8209 | 24.8706 | 126.6893 | 6.1169 | 179.9339 |
| cml | qloss | 33.4364 | 32.5746 | 34.2904 | 12.2859 | 63.6350 | 2.9957 | 90.556 |
| cl | vmin | 0.9700 | 0.9691 | 0.9709 | 0.9569 | 0.9825 | 0.9476 | 0.9916 |
| cl | ploss | 70.40 | 68.5249 | 72.2729 | 25.5387 | 136.0021 | 6.0119 | 196.6515 |
| cl | qloss | 34.88 | 33.9405 | 35.8256 | 12.5824 | 67.7841 | 2.9496 | 98.4141 |

fig. 10 voltage profile with solar pv-dg with fixed and varying line and load parameter at various degree of belongingness considering cl model and nominal load

fig. 11 shows the variation of total reactive and real power losses at different degrees of belongingness without and with solar pv-dg for the cl model in the ieee 69-bus n/w. the power losses decrease markedly when solar pv-dg is integrated into the ieee 69-bus n/w. as can be seen in fig. 11, the interval between power losses reduces as the degree of belongingness increases. fig.
11 variation of total n/w losses at different degree of belongingness with cl model at nominal load level

the generalized equations for determining the lower and upper real and reactive losses in the 69-bus rdn considering the cl model with line and load uncertainty at α = 0.6, obtained by curve fitting, are given as

$$P_{loss}^{lo}\,(kW) = 22.02\lambda^2 + 5.15\lambda - 1.3 \tag{63}$$

$$Q_{loss}^{lo}\,(kVAr) = 11.05\lambda^2 + 2.12\lambda - 0.577 \tag{64}$$

$$P_{loss}^{up}\,(kW) = 97.50\lambda^2 + 52.05\lambda - 13.55 \tag{65}$$

$$Q_{loss}^{up}\,(kVAr) = 50.88\lambda^2 + 22.86\lambda - 5.96 \tag{66}$$

the coefficient of variation (cv) in rpl decreases with the penetration of solar pv-dg with the cl model for the 69-bus n/w, and is greatest for the base case, as tabulated in table 9. this implies that the integration of dg decreases the power loss variation in the feeders of the distribution n/w around its mean value and thereby provides better security against overheating of feeders and instability. it is found that the cl model consistently provides better results, as indicated by its better voltage profile, lowest power losses and minimum cv. the minimum, maximum, mean, standard deviation (std) and cv of the power loss lie within their lower and upper limits for all values of α, but are illustrated for α = 1 only in table 9. it has been observed that higher the dg penetration, higher will be the cv value due to its higher degree of uncertainty.
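the coefficient of variation used here is the ratio of the standard deviation of the branch losses to their mean. the sketch below recomputes it from the deterministic mean and std values reported for the 69-bus n/w and also shows how it would be obtained from raw branch losses. whether the paper uses the population or the sample standard deviation is not stated, so the use of `pstdev` is an assumption, and the function names are illustrative.

```python
from statistics import mean, pstdev


def coefficient_of_variation(std, mean_value):
    """CV = std / mean; dimensionless spread of branch losses around their mean."""
    if mean_value == 0:
        raise ValueError("mean must be non-zero")
    return std / mean_value


def cv_from_samples(branch_losses_kw):
    """CV from a list of per-branch losses (population std assumed)."""
    return coefficient_of_variation(pstdev(branch_losses_kw), mean(branch_losses_kw))


# (mean, std) of branch RPL in kW, deterministic case, as reported in the paper.
REPORTED = {
    "without dg, cpl": (3.3059, 8.3880),
    "with solar pv-dg, cpl": (1.2231, 2.7308),
    "with solar pv-dg, cl": (1.0354, 2.2805),
}

for case, (mean_kw, std_kw) in REPORTED.items():
    print(f"{case}: cv = {coefficient_of_variation(std_kw, mean_kw):.4f}")
```

a cv above 2 in every case reflects that a few heavily loaded branches dominate the total loss.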
table 9 statistical analysis for rpl without and with dg in ieee 69-bus n/w with cpl and cl model and uncertainty at nominal load level

| case | load model | $P_{loss}$ (kw) | deterministic | lower | upper |
|---|---|---|---|---|---|
| without dg | cpl | min | 1.2562e-05 | 1.2184e-05 | 1.2940e-05 |
| without dg | cpl | max | 49.6749 | 47.8832 | 51.4904 |
| without dg | cpl | mean | 3.3059 | 3.1890 | 3.4242 |
| without dg | cpl | std | 8.3880 | 8.0892 | 8.6906 |
| without dg | cpl | cv | 2.5373 | 2.5366 | 2.5380 |
| with solar pv-dg | cpl | min | 1.2561e-05 | 1.2183e-05 | 1.2939e-05 |
| with solar pv-dg | cpl | max | 15.0325 | 14.5640 | 15.5022 |
| with solar pv-dg | cpl | mean | 1.2231 | 1.1847 | 1.2617 |
| with solar pv-dg | cpl | std | 2.7308 | 2.6450 | 2.8169 |
| with solar pv-dg | cpl | cv | 2.2327 | 2.2326 | 2.2328 |
| with solar pv-dg | cl | min | 1.2468e-05 | 1.2096e-05 | 1.2841e-05 |
| with solar pv-dg | cl | max | 12.4737 | 12.1515 | 12.7931 |
| with solar pv-dg | cl | mean | 1.0354 | 1.0077 | 1.0628 |
| with solar pv-dg | cl | std | 2.2805 | 2.2205 | 2.3401 |
| with solar pv-dg | cl | cv | 2.2026 | 2.2018 | 2.2035 |

8. conclusions

this paper proposes a probabilistic and possibilistic strategy to solve the power flow problem with optimally integrated solar pv-dg in order to investigate the impact of line and load uncertainties in the rdn. the n/w line parameters vary by a fixed proportion, and the load varies as a function of the gaussian distribution. a new vsi is proposed to search strategically for the optimal site of the solar pv-dg so as to reduce power losses and enhance bus voltages. the pso method is then applied to determine the optimum solar pv-dg size. treating the optimal site and the optimal size of the renewable dg as independent steps gives openness and flexibility to the method. two test cases have been designed and solved for varying levels of complexity in the pf problem. the bus voltage characteristic for various degrees of belongingness is found to be affected by the various realistic loads. the solution obtained from the proposed approach comprises all possible states of the n/w and converges faster than the existing results. the robustness of the method has been demonstrated on the 33- and 69-bus n/w. the analysis confirms statistically that the voltage profile and the reduction in n/w power losses are under-represented for the cpl model when compared to the cl model for all n/w loading conditions.
the results imply that the proposed technique is more feasible and effective for the design of the large-scale n/w with a high degree of uncertainty. the narrower interval width signifies less conservative solution and numerical stability when compared to the faa method. the findings revealed that uncertainties have a major impact on the rdn and so cannot be overlooked. a generalized set of equations for calculating n/w power losses with and without solar pv-dg considering uncertainties has been developed under various loading conditions which will help the dnos in n/w planning and expansion of the rdn.
instruction facta universitatis series: electronics and energetics vol. 30, n o 3, september 2017, pp. 295 312 doi: 10.2298/fuee1703295p

control of functional electrical stimulation for restoration of motor function

dejan b. popović
institute of technical sciences of sasa, belgrade, serbia
emeritus professor, aalborg university, aalborg, denmark

abstract. an injury or disease of the central nervous system (cns) results in significant limitations in the communication with the environment (e.g., mobility, reaching and grasping). functional electrical stimulation (fes) externally activates the muscles; thus, it can restore several motor functions and reduce other health related problems. this review discusses the major bottleneck in current fes which prevents the wider use and better outcome of the treatment. we present a control method that we have continually enhanced during more than 30 years of research and development of assistive systems. the presented control has a multi-level structure where upper levels use finite state control and the lower level implements model based control. we also discuss possible communication channels between the user and the controller of the fes. the artificial controller can be seen as the replica of the biological control. the principle of replication is used to minimize the problems which come from the interplay of biological and artificial control in fes. the biological control relies on an extensive network of neurons sending the output signals to the muscles. the network is trained through many trial and error processes in early childhood, but stays open to changes throughout life to satisfy particular needs. the network considers the nonlinear and time variable properties of the motor system and provides adaptation in time and space.
the presented artificial control method implements the same strategy but relies on machine classification, heuristics, and simulation of model-based control. the motivation for writing this review comes from the fact that many control algorithms have been presented in the literature by authors who do not have much experience in rehabilitation engineering and have never tested the operations with patients. almost all of the fes devices available implement only open-loop, sensory triggered preprogrammed sequences of stimulation. the suggestion is that improvements in the fes devices need better controllers which consider the overall status of the potential user, the various effects that stimulation has on afferent and efferent systems, reflexive responses to the fes and direct responses to the fes by non-stimulated sensory-motor systems, and the greater integration of the biological control.

key words: functional electrical stimulation, neurorehabilitation, optimal control, finite state control

received january 9, 2017
corresponding author: dejan b. popović, institute of technical sciences of sanu, kneza mihaila 35, 11000 belgrade, serbia (e-mail: dbp@etf.rs ru)

1. introduction

an injury or disease of the central nervous system (cns) (fig. 1) leads to disability expressed as decreased sensory-motor performance (e.g., tetraplegia, paraplegia, hemiplegia, multiple sclerosis, cerebral palsy, etc.).

fig. 1 the sketch of the central nervous system with the annotations of particular subsystems and types of disability (left panel) and the indication of possible sites of electrical stimulation (right panel).

the disability changes the lifestyle and results in other medical problems (e.g., muscle atrophy, contractures, frequent bladder infections, reduced cardiovascular capacity). electrical stimulation (es) is used for external activation of sensory-motor systems after an upper motor neuron lesion to decrease the disability (fig. 1). fig.
1 shows the stimulation sites at the head (including the neck), spinal cord, and periphery. some of the sites are above the lesion and, in complete transections, the stimulation does not reach the parts that are paralyzed; conversely, when the fes is applied at the periphery and there is a complete spinal cord lesion, only parts of the body are directly activated. however, in both cases in persons with incomplete lesion (likely about 90% of the total population), the effects of the stimulation can be seen above and below the lesion and are multiple, as suggested in fig. 2. when the es generates a function, then it is termed functional electrical stimulation (fes). a motor neural prosthesis (mnp) is the system which by employing the fes restores a motor function [1–25].

to regulate externally the activation of muscles and consequently the movements of the body parts one must understand the richness of natural mechanisms being involved in the operation. the reasons are the following: 1) muscles, tendons, ligaments, and joints differ from the man-made active kinematical chains; 2) the musculoskeletal system is highly redundant; 3) the biological control system comprises a very large number of nerves that serve as sensors, transmission channels to the muscles, and the decision-making circuits operating as a neural network physically defined to an extent by the interconnections that are hard wired [26], and partly as networks operating based on unsupervised and supervised training. the external control must implement a natural-like model, since the artificial and biological controllers work jointly: only some parts of the body are paralyzed and require artificial external control, while the rest of the body is controlled by the central nervous system.

fig. 2 the schema of the mechanism of fes of peripheral electrical stimulation. the stimulation activates afferent and efferent nerves.
the afferent activity results in the reflex response from lower and upper motor neuron, while the efferent stimulation generates muscle contraction.

the hierarchical, self-organized biological control system relies on the extreme redundancy of the sensory-motor systems. motion performance in a healthy person appears to be flexible and uncomplicated, although the neuronal operation which controls the system is still vaguely understood, even for the movements which comprise a small number of body parts [27]. the question which comes to mind from the motor control studies is: are the activities of the system components chosen randomly from the variety of possibilities to accomplish a given task, or is there a consistently reproducible pattern of behavior that must be used? if the latter, would it be possible to understand the constraints imposed to reproduce the same motor behavior whenever the motor task needs to be executed?

a human can acquire a spectrum of functional movements during life. most of the movements are mastered in early childhood; however, the repertoire increases and changes throughout life, if so required. a functional movement relies upon perceptuomotor coordination that involves three main elements: the sensory information, the internal coding that is appropriate, and the generation of motion.

the author of this review assumes the model of control as a "black box" with an internal structure that is partly known [28–30]. the black-box approach follows many studies that the author proposed and carried out working with clinicians and motor control scientists. he started his research from the model-based control to design exoskeletons and hybrid assistive systems [31]. the clinical applications of these early developments made clear that the modeling is essential for studying the behavior and design components of the systems. fig. 3 summarizes the effects of the fes applied to a peripheral nervous system.
the modeling approach limits in most cases the control to the sketch of the stimulator and the efferent activation of nerves (fig. 3, bottom left panel), while all other effects are not considered. the modeling approach carries the following limitations: 1) oversimplification of the real system; 2) no consideration of the time variability of the system; 3) parameters of the model that are not observable and measurable; 4) the target trajectory that is not known; 5) feedback which is not applicable due to the delays of the nerves, synapses and muscles; and 6) the fact that only parts of the system are controlled externally, while the remainder of the body is controlled by biological structures.

the black-box model of the fes must consider that the input to the user comes through a variety of sensory modalities which continuously update information about the state of the system and the interactions with the environment. the black-box model needs to include that the sensory input triggers perceptual elements which assist in deciding what task to execute and how. the information is processed at several structures of the brain in serial and parallel operations. the intention command (function selection) signal travels to lower centers of the cortex, midbrain, brainstem, and to particular segments of the spinal cord. the signals from the spinal cord in synchrony activate muscles via peripheral nerves and adjust the commands based on the afferent inflow [32, 33]. muscles receive signals and produce forces which generate a relative movement of bodily parts and, due to the interaction with the environment, the movement of the body in the space.
the engineering description of this model is the following: 1) the control has a hierarchical structure with many parallel channels; 2) feedback tunes control at the lower and upper levels; 3) time delays characterize the operation and require a combination of the feedback and predictive control; and 4) the control needs to overcome the ambiguity of the redundant system.

in this review, we concentrate on the stimulation of the peripheral nervous system (pns). an fes-generated burst of short pulses of electrical charge triggers a series of action potentials in afferent and efferent neural pathways. the externally triggered efferent pathways directly activate muscles that are innervated by the said neurons; yet, not in the same manner as the volitional motor command coming from the upper motor neuron would (fig. 3). in parallel, the activity triggered in afferent pathways carries action potentials to the spinal cord where various reflexes are generated (e.g., cross-extension reflex, flexion reflex), and interneurons are activated transmitting signals which eventually reach the cortex [15].

the fes delivers bursts of the electrical charge that are converted into the ionic currents in the tissues via electrodes. a time varying magnetic field can also induce ionic currents; thereby the electrodes could be replaced with magnetic coils. the stimulus waveform selected for the excitation process must take into consideration the physiological effect (action potential generation), potential damage to the tissue, and possible degradation of the electrode [34, 35]. the amplitude modulation (am) or pulse width (duration) modulation (pwm) controls the level of recruitment [36–41]. pwm utilizes a slightly lower charge density compared with the am to evoke a response of equal magnitude.
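the relation between the two modulation schemes and the delivered charge can be sketched as follows, assuming an idealized rectangular current pulse, for which the charge per pulse is amplitude times pulse width. the function names and the default values (20 ma, 250 µs) are hypothetical and chosen only for illustration; the sketch computes delivered charge and does not model recruitment or the physiological response.

```python
def pulse_charge_uc(amplitude_ma, width_us):
    """Charge of a rectangular pulse in microcoulombs (mA * us = nC; /1000 -> uC)."""
    return amplitude_ma * width_us / 1000.0


def _clamp01(level):
    """Limit the normalized recruitment command to the range [0, 1]."""
    return min(max(level, 0.0), 1.0)


def am_pulse(level, width_us=250.0, max_amplitude_ma=20.0):
    """Amplitude modulation: fixed pulse width, amplitude scaled by the command."""
    return (_clamp01(level) * max_amplitude_ma, width_us)


def pwm_pulse(level, amplitude_ma=20.0, max_width_us=250.0):
    """Pulse width modulation: fixed amplitude, width scaled by the command."""
    return (amplitude_ma, _clamp01(level) * max_width_us)


if __name__ == "__main__":
    for level in (0.25, 0.5, 1.0):
        am = am_pulse(level)
        pwm = pwm_pulse(level)
        print(level, am, pulse_charge_uc(*am), pwm, pulse_charge_uc(*pwm))
```

in this idealized picture both schemes deliver the same charge for the same normalized command; the slightly lower charge needed with pwm for an equal response, as stated in the text, is a physiological effect that this sketch does not capture.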
since the timing circuits (i.e., regulating pulse width) can be easily constructed and controlled with a resolution of 1 µs or less, many stimulators implement the pwm. the typical duration of excitation pulses is in the range up to 250 µs with implanted electrodes, but longer with surface electrodes. the threshold for excitation of the fibers of a peripheral nerve is inversely proportional to the diameter of the fiber. since the nerve is composed of a mixture of afferent and efferent fibers with a spectrum of fiber diameters, short pulses of constant amplitude will excite large afferent and efferent fibers. longer pulses may also stimulate smaller fibers, including afferents typically carrying information of noxious stimuli, and therefore may be painful to the subject. for this reason and to minimize the electrical charge injection, short pulses are preferred. recent research by our group resulted in multipad electrodes and the distributed stimulation, which contribute to better selectivity, comfort for the users and substantially postponed muscle fatigue caused by the es [42–44].

the regulation of the strength of a motor response depends on the number of activated nerve fibers and the rate at which they are activated. these two mechanisms are called recruitment and temporal summation, respectively. the same terms are utilized to describe electrically elicited events. when the stimulus is sufficiently large, an action potential will be evoked in the nerve. in a healthy person the slow, fatigue-resistant motor units are activated at a lower effort compared with the larger, fast-fatigable units. in the fes the recruitment order depends upon the variables of position and geometry as well as fiber size [36–41].

fig. 3 block diagram showing components for an fes stimulator.
the acronyms used are: adc – analog to digital converter, dac – digital to analog converter, dc/dc – converter of the voltage from low to high level, i/p and o/p are digital input and output ports, emg – electromyography, eeg – electroencephalography.

the second mechanism affecting the force developed by the muscle is the temporal summation. at low frequencies the response is unfused, and the variations of the muscle force are pronounced. the frequency at which the mechanical responses produced are sufficiently smooth is known as the fusion frequency. in most human upper extremity muscles the fusion occurs at about 16 pulses per second. the unit hz is often used instead of the correct term pulses per second. a maximum force (tetanus) can be reached at higher frequencies.

the fast development of electronics, microcomputers, wireless communication, and sensors makes the design of stimulators relatively straightforward. the requirements are the following: the system needs to be safe, small (portable), efficient, battery operated, and adaptable for various applications. the stimulator needs to control the frequency of pulses, pulse duration, pulse amplitude, and the shape of the pulse, and to support several output channels. the stimulator needs to communicate with the user and use sensor signals, including electrophysiological signals, for the selection of the pattern of bursts delivered at particular channels. battery supply and the charging circuitry are a must for an fes system. the block diagram of a stimulator is shown in fig. 3. the elements that need attention in implantable systems are the biocompatibility, miniaturization and recharging circuitry.

2. control for fes-based motor neural prosthesis 1

the multi-level control appropriate for the fes system which operates as a motor neural prosthesis (mnp) is presented in fig. 4.

fig. 4 the model of the hierarchical hybrid controller for fes aiming to restore the movement.
1 the presentation uses the modified material from the publication by the author [30].

the intentions and decisions of the user are at the top of the hierarchical controller. the user communicates to the external controller the decisions about what he/she is planning and how [45]. the appealing method for the communication is a brain-computer interface (bci). the bci is these days one of the top research topics. many techniques to extract signals are reported in the literature [46–56]. the interfaces for the bci are not perfected yet (fig. 5). to support this statement, we cite the conclusion from one of the most respected research groups in the world [46]: "despite a growing animal literature demonstrating on-line control of functional hand movements from spike patterns recorded with microelectrodes in the motor cortex, bci applications in neurological patients are rare and hampered by methodological difficulties. bcis using eeg measures allow verbal communication in paralyzed patients with als; bci-communication in totally locked-in patients, however, awaits experimental confirmation. movement restoration in patients with little residual movement capacity using noninvasive bci is possible, but a generalization of improvement to real life needs further experimentation". event-related recordings from the skull, correlated with a particular cortical activity, can be used as the trigger to start or stop an operation of the fes system [57].

fig. 5 three interface types for the bci: noninvasive (eeg, left panel), and invasive (ecog and lfp, middle and right panels).

the implantable bci systems provide much more detailed information compared with the skull recordings. the electrodes for the reproducible cortical recordings need to be improved. the positional instability of the electrodes is a critical issue. the longevity and reaction of the cortical tissues to the biocompatible materials need consideration.
implanting electrodes and electronic circuitry are rather invasive techniques. the recordings from the cortical electrodes will eventually create an extensive communication channel between the user and the microcomputer driving the fes or some other computerized device (fig. 5, middle and right panels). the complexity of capturing motor commands from the brain can be easily seen from the reduced model of the flow of information within the brain, cerebellum and brainstem (fig. 6). namely, the immediate question is where and what to record to allow the direct bci?

the electrophysiological signals from the periphery (nerve or muscle) are an option to replace the higher centers cns interface. the myoelectric control is widely used in hand prosthetics with success. there are various sites in persons with the cns lesion that can be utilized. surface electrodes could be used for recordings of muscle activities. implantable electrodes are required for the interface with nerves. the plurality of electrophysiological signals can be used for multiple tasks mnp controllers. some of the limitations listed for the bci are valid for the implantable systems for the peripheral recordings and can be even more pronounced.

fig. 6 the sketch of the flow of signals and major brain parts participating in the control of movement.

switches and other computer inputs which utilize artificial sensors are the most efficient interface. the sensors measuring the acceleration, angular rate, joint angle, and interface force/pressures, when manipulated by the user, provide a reliable and reproducible set of control signals. if the computer inputs are "mounted" on the body in a manner that allows subconscious sending of the appropriate command, then the multichannel control is facilitated.

recently, artificial visual perception [58, 59] was introduced as an input for the fes systems. it has been tested with success in artificial hand prostheses [60].
the artificial visual perception allows the planning that is of ultimate interest for control (e.g., type of grasp, the position of the object with respect to the hand, size of the target object when grasping; curb, stairs, slope, obstacle, perturbation when walking/standing).

state control for coordination of movement. the two top levels of the control use the finite state control (fsc). the division into two levels is made to mimic the biological control where the temporal synchrony and spatial synergies are interplaying. the fsc implements nonanalytical tools and non-parametric models of movement [61]. the fsc inherently deals with the following problems of movement control: 1) redundancy, nonlinearity, and time variability of the plant; 2) redundancy of plausible trajectories; and 3) the significance of the preference criteria based on the task.

non-numerical tools are the identification techniques. in many cases, the identification is strictly based on heuristics. the heuristic procedures consist of choosing methods which seem promising, while allowing changes if the original choices do not lead quickly enough to a solution [61]. this procedure allows the fsc to learn from "mistakes" and improve the performance based on the acquired skill. the fsc uses set theory to define the behavior based on the states and rules to set transitions between these states. the states are movement representations in the multidimensional phase space (e.g., joint locked, joint free to move, flexion, extension) expressed in terms of muscle forces or joint kinematics. the transition from state to state is defined by rules. the rules are logical relations (e.g., if-then, and, or) that connect state variables. a production rule is a state action pair; i.e., whenever a particular state is encountered, given on the left side of the rule, then the action on the right side of the rule needs to be executed.
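the state-and-rule description above can be sketched as a small finite state controller. this is a minimal illustration under stated assumptions, not the controller used by the author: the two states ("stance", "swing"), the boolean sensor fact "heel_contact", and the action names are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Rule:
    """Production rule: IF in `state` AND `condition` holds THEN go to `next_state` and do `action`."""
    state: str
    condition: Callable[[Dict[str, bool]], bool]
    next_state: str
    action: str


# Hypothetical rule set for a two-state walking controller.
RULES: List[Rule] = [
    Rule("stance", lambda s: not s["heel_contact"], "swing", "stimulate_flexors"),
    Rule("swing", lambda s: s["heel_contact"], "stance", "stimulate_extensors"),
]


def step(state: str, sensors: Dict[str, bool],
         rules: List[Rule] = RULES) -> Tuple[str, Optional[str]]:
    """Fire the first rule whose state and condition match; otherwise stay in the same state."""
    for rule in rules:
        if rule.state == state and rule.condition(sensors):
            return rule.next_state, rule.action
    return state, None


def run(initial_state: str, samples: List[Dict[str, bool]]) -> List[Tuple[str, Optional[str]]]:
    """Apply the controller to a sequence of coded sensor samples."""
    trace = []
    state = initial_state
    for sensors in samples:
        state, action = step(state, sensors)
        trace.append((state, action))
    return trace


if __name__ == "__main__":
    samples = [{"heel_contact": True}, {"heel_contact": False}, {"heel_contact": True}]
    print(run("stance", samples))
```

because each rule is a separate item, a rule can be added, removed or changed without touching the rest of the controller.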
there are no a priori constraints on the forms of the states or the actions. a system based on production rules has three components: 1) the rule base, consisting of the set of production rules; 2) one or more data structures containing the known facts relevant to the domain of interest, called facts bases; and 3) the interpreter of the facts and rules, which is the mechanism that decides which rule to apply to initiate the target action. each rule is an independent item of knowledge, containing all the conditions required for the application. due to the modularity, rule-based control (rbc) systems can be changed by the addition, deletion or modification of a rule. an important feature of the rbc systems is the ability to look first at the established facts and use forward chaining, or to start from the task and implement backward chaining. the problem of knowledge representation (determination of rules) is fundamental to the operation of the rbc. the rules these days are automatically generated through a procedure known as machine learning and classification (e.g., inductive learning, artificial neural networks, adaptive logic networks, fuzzy-logic networks, wavelet networks, hebbian learning, stochastic classification techniques like principal component analysis, etc.). the data for the machine classification are sensory patterns acquired while observing the process and the plant. the sensory patterns are coded (e.g., single threshold, multi-threshold, timing, local vs. absolute minimum or maximum), and the rules define the relationship between sensory patterns and required motor activities. a set of sensors providing feedback signals has been so far arbitrarily determined (e.g., ground reaction force or pressure sensor, switch, goniometer, inclinometer, accelerometer, and proximity sensor); the choice is based on availability of sensors, reproducibility of the sensory recordings, and overall practicality of plausible day to day usage.
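the three components named above (rule base, facts base, interpreter) and the single-threshold coding of sensory patterns can be illustrated with a short forward-chaining sketch. the sensor names, thresholds and rules are hypothetical and serve only to show the mechanism; in practice the rules would come from machine classification of recorded data, as described in the text.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# A rule is (set of premises, conclusion); all premises must be known facts.
RuleType = Tuple[FrozenSet[str], str]


def code_single_threshold(readings: Dict[str, float],
                          thresholds: Dict[str, float]) -> Set[str]:
    """Single-threshold coding: emit the fact '<sensor>_high' when a reading exceeds its threshold."""
    return {f"{name}_high" for name, value in readings.items()
            if value > thresholds[name]}


# Hypothetical rule base relating coded sensory patterns to motor activities.
RULE_BASE: List[RuleType] = [
    (frozenset({"heel_force_high"}), "stance_phase"),
    (frozenset({"stance_phase", "knee_angle_high"}), "activate_knee_extensors"),
]


def forward_chain(facts: Set[str], rule_base: List[RuleType]) -> Set[str]:
    """Interpreter: repeatedly apply rules whose premises hold until no new fact is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rule_base:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


if __name__ == "__main__":
    thresholds = {"heel_force": 50.0, "knee_angle": 10.0}
    facts = code_single_threshold({"heel_force": 300.0, "knee_angle": 25.0}, thresholds)
    print(sorted(forward_chain(facts, RULE_BASE)))
```

backward chaining would instead start from a target action and search for rules whose conclusions match it; it is not shown here.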
sensors that are functionally equivalent to those used in biological control systems are preferred. increasing the number of sensors produces very fast growth of the number of control rules, making the definition process time consuming and difficult, yet very functional. our group demonstrated and tested several systems that use the rbc for the control of grasping, reaching and walking [62-66]. the fsc is not directly applicable to fes systems because on-off activation results in jerky movements. also, the time delays require a predictive rule base to allow adaptation of the stimulation to the tasks. the fsc is a result of machine mapping, and even a level of confidence close to 100% is no guarantee that a correct classification will follow, since the hypothesis that the events are stochastic and independent is only partly correct. model-based control for the fes. the fsc cannot be directly implemented for control because it ultimately creates jerky movements. to eliminate the jerks, the physical characteristics and properties of the system components must be incorporated. the most appropriate method is model-based control (mbc). the mbc considers the human body as a system of rigid bodies (skeleton) connected with rotational joints driven by joint actuators (muscles). the mbc considers the elastic properties of the tendons connecting the muscles and bones, and of the ligaments connecting the neighboring bones. the mbc for fes systems deals with two tasks: standing and walking, and goal-directed movement (manipulation and grasping). the differences are the consequences of the tasks: walking is a near-cyclic operation where the legs provide support for the trunk and propulsion, while manipulation requires complex temporal and spatial activation to allow goal-directed movement. the muscle forces for standing and walking are an order of magnitude larger than those required for arm movement and grasping.
walking introduces one tough task: balance. at present, there is no solution for ensuring balance while standing on a small surface (the sole of a single foot). the mbc is a highly complex task. the human body comprises more than 200 bones driven by about 600 skeletal muscles. the skeleton is represented by a spatial or a planar model comprising segments and joints with externally driven muscles. the fes drives only a subset of the whole system, the part that is cut off from biological control by a cns lesion. the only way to reach a solution is to use a reduced model. the reduction relates to the number of segments included in the analysis and the type of joints that connect the selected segments. fig. 7 shows the 12-segment model and the two-segment planar model of the leg. we present here the planar model shown in panel b (fig. 7) as an illustration of the complexity of even a significantly reduced mechanical model of the body for the analysis of walking [31]. fig. 7 the two-degree-of-freedom model of the leg. the middle panel shows the models of the thigh and the shank. the right panel shows the actuators (agonist and antagonist muscles) at the hip and knee joints. the following notations are used: k and h are the knee and hip joints; $\varphi_s$ and $\varphi_t$ are the angles of the shank and thigh with respect to the x-axis direction; $m_s$ and $m_t$ are the masses of the shank and thigh; $J_{cs}$ and $J_{ct}$ are the axial moments of inertia of the shank and thigh for the axes perpendicular to the sagittal plane through the corresponding centers of mass; $l_s$ and $l_t$ are the lengths of the segments; $d_s$ and $d_t$ are the distances from the knee and hip to the corresponding centers of mass. $M_s$ and $M_t$ are the torques acting at the shank and thigh, while $M_h$ and $M_k$ are the net joint torques at the hip and knee joints. the direction of gravity is opposite to the y-axis. the double pendulum representing the leg (fig.
7) allows the knee and hip to flex and extend within the typical physiological range of movement. two pairs of monoarticular muscles acting around the hip and knee joints (right panel in fig. 7) drive the leg. the second-order nonlinear differential equations describing the model are:
$$a_1\ddot{\varphi}_s + a_2\ddot{\varphi}_t\cos(\varphi_t-\varphi_s) + a_3\dot{\varphi}_t^2\sin(\varphi_t-\varphi_s) - a_4\ddot{x}_h\sin\varphi_s - a_5(\ddot{y}_h+g)\cos\varphi_s - x_g l_s\sin\varphi_s + y_g l_s\cos\varphi_s = M_s$$
$$b_1\ddot{\varphi}_t + b_2\ddot{\varphi}_s\cos(\varphi_t-\varphi_s) + b_3\dot{\varphi}_s^2\sin(\varphi_t-\varphi_s) - b_4\ddot{x}_h\sin\varphi_t - b_5(\ddot{y}_h+g)\cos\varphi_t - x_g l_t\sin\varphi_t + y_g l_t\cos\varphi_t = M_t \qquad (1)$$
where the coefficients are
$$a_1 = J_{cs} + m_s d_s^2,\quad b_1 = J_{ct} + m_s l_t^2 + m_t d_t^2,\quad a_2 = m_s d_s l_t,\quad b_2 = a_2,\quad a_3 = -a_2,\quad b_3 = b_2,$$
$$a_4 = m_s d_s,\quad b_4 = m_s l_t + m_t d_t,\quad a_5 = a_4,\quad b_5 = b_4.$$
here $\ddot{x}_h$ and $\ddot{y}_h$ are the components of the hip acceleration, $x_g$ and $y_g$ are the components of the ground reaction force, and $g$ is the gravitational acceleration. we used the notations for the angles to match the anatomical definitions of flexion and extension at the hip and knee. the net torques $M_s$ and $M_t$ generate the motion, and the torques $M_k$ and $M_h$ are linear combinations of $M_s$ and $M_t$. the simulation requires the following inputs: joint angles (trajectory), acceleration of the hip (body motion), and ground reaction force. the simulation also needs the geometrical and inertial parameters of the leg segments. the model presented is deterministic; hence there is a unique solution for the joint torques. the simulation result for one healthy person walking at 0.9 m/s is presented (fig. 8). the parameters for the model are from the literature [67, 68]. fig. 8 the inputs for simulation (target trajectory) are the joint angles, ground reaction forces, and hip motion. geometrical and inertial parameters are for a healthy young person. the speed of the walking was 0.9 m/s. right panels show the output from simulation: joint torques at the hip and the knee. however, in the fes system, the joint torque is generated by flexor and extensor muscles at the joints, as shown in fig. 7, right panel.
two complex elements introduced are the redundancy, and the nonlinearity and time variability due to the characteristics of the actuators and their attachments to the skeleton. eq. 1 now includes a more complex term replacing the pure torques acting at the thigh and shank (eq. 2), in which $M_s$ combines the active and passive torques at the knee, and $M_t$ combines those at the hip and knee:
$$M_s = M_s(t_{k,f},\, t_{k,e},\, t_{k,p}),\qquad M_t = M_t(t_{h,f},\, t_{h,e},\, t_{h,p},\, t_{k,f},\, t_{k,e},\, t_{k,p}) \qquad (2)$$
the active flexor and extensor torques $t_{h,f}$, $t_{h,e}$, $t_{k,f}$, and $t_{k,e}$ are given by eq. 3:
$$t_{m,n} = a(u)\, f(\varphi)\, g(\dot{\varphi}),\qquad m = h, k;\ n = f, e \qquad (3)$$
the muscle response to electrical stimulation can be approximated by a second-order, critically damped, low-pass filter with a delay, as shown in eq. 4:
$$\frac{a(j\omega)}{u(j\omega)} = \frac{\omega_p^2}{-\omega^2 + 2j\omega\omega_p + \omega_p^2}\; e^{-j\omega d} \qquad (4)$$
where $\omega_p$ is the pole frequency and $d$ is the delay. the nonlinear joint torque resulting from the contraction of muscles depends on the length of the muscle and the rate of change of the length [69, 70]. the simplified model of this behavior uses a parabolic dependence of the torque on the joint angle and a hyperbolic fall of the torque with increasing joint angular rate (eq. 5):
$$f(\varphi) = \begin{cases} a_0 + a_1\varphi + a_2\varphi^2, & f \ge 0 \\ 0, & f < 0 \end{cases} \qquad (5)$$
together with a piecewise function $g(\dot{\varphi})$ of the joint angular rate with the parameter $b_1$. the coefficients $a_0$, $a_1$, $a_2$ and $b_1$ are specific for joints and vary between users for the same joint. the passive torques at the hip and knee are (eq. 6):
$$t_{i,p} = t_{i,p}(\varphi_i;\ c_{i,1},\dots,c_{i,6},\ k_i),\qquad i = h, k \qquad (6)$$
where the $c_{i,j}$ and $k_i$ are constants that are user dependent. the redundancy can be resolved only by applying optimization. the cost function needs to be formulated based on the tasks of simulation. the cost function can include the time, energy, force, torque, jerk, fatigue, muscle activation, non-physiological loading, the number of muscles used for the task, tracking error, and any combination thereof.
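the activation dynamics described above (a second-order, critically damped low-pass filter followed by a pure delay) can be evaluated numerically with a short sketch. the function name and the numeric values of the pole frequency and the delay are illustrative assumptions, not parameters reported in the paper.

```python
import cmath
import math


def activation_response(omega: float, omega_p: float, delay: float) -> complex:
    """Frequency response a(jw)/u(jw) of a critically damped second-order
    low-pass filter with pole frequency omega_p (rad/s) and a delay (s)."""
    jw = 1j * omega
    lowpass = omega_p ** 2 / (jw ** 2 + 2.0 * jw * omega_p + omega_p ** 2)
    return lowpass * cmath.exp(-jw * delay)


if __name__ == "__main__":
    omega_p = 2.0 * math.pi * 2.0   # assumed 2 Hz pole frequency
    delay = 0.025                   # assumed 25 ms delay
    for f_hz in (0.5, 1.0, 2.0, 4.0):
        h = activation_response(2.0 * math.pi * f_hz, omega_p, delay)
        print(f"{f_hz:4.1f} Hz  gain={abs(h):.3f}  phase={math.degrees(cmath.phase(h)):7.1f} deg")
```

for a critically damped filter the gain is $\omega_p^2/(\omega^2+\omega_p^2)$, so it equals one half at $\omega = \omega_p$; the delay changes only the phase, which is why a predictive element is needed when this block sits inside a feedback loop.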
based on heuristics, we suggest minimizing the tracking error and the muscles' efforts, with a preset level of co-contraction of agonists and antagonists (minimum stiffness) for the stabilization of joints and the minimization of jerks. to demonstrate the value of the model-based analysis we show a result from one of our earlier studies [67]. the cost function is given by eq. 7, where the first bracket is the tracking error of the shank and thigh angles, the second bracket is the muscle effort, and $\lambda_1$ and $\lambda_2$ are weighting factors:
$$r(u) = \int_{t_0}^{t_0+T} \left\{ \lambda_1\left[(\varphi_s - \varphi_{sm})^2 + (\varphi_t - \varphi_{tm})^2\right] + \lambda_2\left[u_1^2(t) + u_2^2(t) + u_3^2(t) + u_4^2(t)\right] \right\} dt \qquad (7)$$
the right panel (fig. 9) shows substantial tracking errors and a recruitment level that is at the maximum. the direct conclusion is that the selected target trajectory is not achievable for the patient. fig. 9 the recruitment level for four consecutive steps. the recruitment reaches a maximum, and a substantial tracking error occurs for the paraplegic subject. the optimization does not consider the minimum stiffness. we show how the difference in the cost function (eq. 7 and eq. 8) affects the total cost. eq. 8 keeps the integral of eq. 7 and adds a condition on the activations $u_i(t)$:
$$r(u) = \int_{t_0}^{t_0+T} \left\{ \lambda_1\left[(\varphi_s - \varphi_{sm})^2 + (\varphi_t - \varphi_{tm})^2\right] + \lambda_2\left[u_1^2(t) + u_2^2(t) + u_3^2(t) + u_4^2(t)\right] \right\} dt \quad \text{and a condition on } u_i(t) \qquad (8)$$
the graphs suggest that the optimal conditions are when the effort is about 50% and the tracking error is small (only a few degrees), while lower values are obtained if the cost function does not consider the minimum co-contraction. the values on the axes of the graphs are normalized (fig. 10), and we intentionally do not discuss them further because this is beyond the scope of this presentation. after discussing how a model is defined for the design of a controller, we go back to the review of the literature. the simplest mbc operates without feedback (fig. 11, open-loop). the controller is the inverse model of the plant [e.g., 71-74].
there are two major problems with this type of control: 1) the model is reduced in comparison with the real plant, and the parameters of the model do not reflect the properties of the system adequately; and 2) the disturbances are not part of the model. the operation of this open-loop controller might be sufficient for simple systems; yet, in most cases, the system is not robust enough [75-77]. an open-loop controller uses the trajectory as the input. the term trajectory is used in a broad sense (position, angle, velocity, acceleration, etc.). fig. 10 the optimization surfaces for the two optimization functions (eqs. 7 and 8) for the two-segment model during the walking cycle at a speed of 0.9 m/s. a closed-loop mbc (fig. 11, closed-loop), often termed "error-driven control", uses feedback information from sensors measuring the achieved trajectory and corrects the command signal. the practical problem with the error-driven mbc is the biological delay; therefore, a predictive closed-loop control needs to be implemented. the term delay relates to the response of muscles (actuators) to the stimulation of the neural pathways. the response of the muscle to stimulation can be described by a low-pass filter with a delay. the closed-loop control also requires a model that reflects the complexity of the organism being controlled and the parameters that characterize the system. also, the control requires a precisely defined trajectory and permitted errors that would not compromise the use of the np. fig. 11 model-based control operating without feedback (open-loop) and with feedback (closed-loop). the important aspects that need to be considered when using model-based control are the time variability of the responses of muscles (e.g., muscle fatigue, habituation, etc.) and nonlinearities.
the model-based control is not applicable to multichannel electrical stimulation systems for several reasons: the model is oversimplified; there is no preferred trajectory; the interaction with biological control is not included; the parameters of the system change with time; the geometry of stimulation changes during the movement; reflexive behaviors (dystonia, hyperreflexia, spasticity) are not included; some muscles are atrophied or even denervated; overall changes occur in the central nervous system due to the lesion and the modified periphery; stimulation activates the afferent system, which generates reflex responses (e.g., contralateral extension or flexion); stimulation is not selective enough; and muscle fatigue can interfere due to the nonphysiological order of recruitment. 3. conclusion control of fes for restoring movements is a highly complex task since it requires full integration of the external components into the remaining and modified biological motor control. the plant is nonlinear and time variable, and the system is not observable; therefore, the stability of the system is questionable. fig. 12 the mimetic model of the hybrid hierarchical controller for control of movement using fes. the left panel shows the components that need to be integrated into the corkscrew to release the full potential of the fes for rehabilitation. we suggest that the concept presented and summarized in fig. 12 has the best chance to contribute to the translation of the research into clinical and home use by persons with disability, as an orthosis or just a therapeutic device. other aspects of the implementation of the fes are more a matter of technological changes and cosmetic features.
the interface to the user is also touched on in this review. the author is not convinced that invasive techniques, which output a small number of commands and greatly increase the costs of logistic support and original installation, are appreciated by the users. tests of the efficacy of the system with noninvasive interfaces should always be a step taken before considering an implant that improves the operation. acknowledgement: the work on this project was partly supported by the project no rs35003, ministry of education, sciences and technological development of serbia and the project f 137 from the serbian academy of sciences and arts, belgrade. references [1] a. l. benabid, s. chabardes, j. mitrofanis and p. pollak, “deep brain stimulation of the subthalamic nucleus for the treatment of parkinson's disease”, the lancet neurology, vol. 31, no. 8(1), pp. 67-81, 2009. [2] j. h. burridge, m. haugland, b. larsen, r. m. pickering, n. svaneborg, h. k. iversen, p. b. christensen, j. haase, j. brennum and t. sinkjær, "phase ii trial to evaluate the actigait implanted drop-foot stimulator in established hemiplegia", j. rehabil. med., vol. 39, pp. 212-218, 2007. [3] d. g. everaert, r. b. stein, g. m. abrams, a. w. dromerick, g. e. francisco, b. j. hafner, t. n. huskey, m. c. munin, k. j. nolan and c. v. kufta, "effect of a foot-drop stimulator and ankle-foot orthosis on walking performance after stroke: a multicenter randomized controlled trial", neurorehabil neural repair, vol. 27, no. 7, pp. 579-591, 2013. [4] l. e. fisher, m. e. miller, s. n. bailey, h. a. jr. davis, j. s. anderson, l. r. murray, d. t. tyler and r. j. triolo, "standing after spinal cord injury with four-contact nerve-cuff electrodes for quadriceps stimulation", ieee trans. neural. syst. rehabil. eng., vol. 16, pp. 473–478, 2008. [5] t. keller, m. r. popović, i. p. i. papas and p-y. muller, “transcutaneous functional electrical stimulator compex motion”, artif. organs, vol. 26, no.
3, pp. 219-223, 2002. [6] k. l. kilgore, h. e. hoyen, a. m. bryden, r. l. hart, m. w. keith and p. h. peckham, "an implanted upper-extremity neuroprosthesis using myoelectric control", j. hand surgery, vol. 33, pp. 539-550, 2008. [7] r. kobetič, c. s. to, j.r. schnellenberger, m. l. audu, t. c. bulea, r. gaudio, g. pinault, s. tashman and r. j. triolo, "development of hybrid orthosis for standing, walking, and stair climbing after spinal cord injury", j. rehab. res. develop., vol. 46, pp. 447–462, 2009. [8] a. kralj, t. bajd, “functional electrical stimulation: standing and walking after spinal cord injury”, boca raton, florida, crc press, 1989. [9] g. e. loeb, r. a. peck, w. h. moore and k. hood, “bion system for distributed neural prosthetic interfaces”, med eng phys, vol. 23, pp. 9-18, 2001. [10] n. m. malešević, l. popović-maneski, v. ilić, n. jorgovanović, g. bijelić, t. keller and d. b. popović, “a multi-pad electrode based functional electrical stimulation system for restoration of grasp”, j. neuro eng. rehabil., vol. 9, no. 66, 2012. [11] s. mangold, t. keller, a. curt and v. dietz, “transcutaneous functional electrical stimulation for grasping in subjects with cervical spinal cord injury”, spinal cord, vol. 43, no. 1, pp. 1-3, 2005. [12] p. h. peckham and j. s. knutson, "functional electrical stimulation for neuromuscular applications", annual review of biomed. eng., vol. 7, pp. 327-360, 2005. [13] d. b. popović, m. b. popović, t. sinkjær, a. stefanović and l. schwirtlich, "therapy of paretic arm in hemiplegic subjects augmented with a neural prosthesis: a cross-over study", can. j. physio. pharmacol., vol. 82, pp. 749-756, 2004. [14] d. b. popović, t. sinkjær and m. b. popović mb. “electrical stimulation as a means for achieving recovery of function in stroke patients”, j. neurorehab., vol. 25, pp. 45–58, 2009. [15] d. b. popović, “advances in functional electrical stimulation (fes)”. journal of electromyography and kinesiology. vol. 24, no. 6, pp. 
795-802, 2014. [16] a. prochazka, m. gauthier, m. wieler and z. kenwell, "the bionic glove: an electrical stimulator garment that provides controlled grasp and hand opening in quadriplegia", arch. phys. med. rehabil., vol. 78, no. 6, pp. 608–14, 1997. [17] r. van den brand, j. heutschi, q. barraud, j. digiovanna, k. bartholdi, m. huerlimann, l. friedli, i. vollenweider, e. m. moraud, s. duis, n. dominici, s. micera, p. musienko and g. courtine, "restoring voluntary control of locomotion after paralyzing spinal cord injury", science, vol. 336, no. 6085, pp. 1182-1185, 2012. [18] v. visser-vandewalle, z. temel, ch. van der linden, l. ackermans and e. beuls, "deep brain stimulation in movement disorders: the applications reconsidered", acta neurologica belgica, vol. 104, pp. 33-36, 2004. [19] http://www.bioness.com/ ness_l300_for_foot_drop.php [accessed, dec. 2016] [20] http://www.bioness.com/ ness_h200_for_hand_rehab.php [accessed, dec. 2016] [21] http://www.walkaide.com/en-us/ pages/default.aspx [accessed, dec. 2016] [22] http://www.odstockmedical.com/ products/microstim-2v2-kit [accessed, dec. 2016] [23] http://www.markfelling.com/id450.htm [accessed, dec. 2016] [24] http://www.ottobock.com/cps/rde/ xchg/ob_com_en/hs.xsl/4762.html [accessed, 2016] [25] http://musclepower.com/parastep.htm [accessed, dec. 2016] [26] g. rizzolatti and g. luppino, "the cortical motor system", neuron, vol. 31, no. 6, pp. 889-901, 2001. [27] m. jeannerod, the neural and behavioural organization of goal-directed movements, clarendon press/oxford university press, 1988. [28] n. bernstein, the co-ordination and regulation of movements, pergamon press, oxford, 1967. [29] m. l. latash, control of human movement, human kinetics, 1993. [30] d. b. popović and t. sinkjær, control of movement for the physically disabled, london: springer, 2000. [31] d. b. popović, “control of walking in disabled humans”, journal of automatic control, vol.
13(suppl.), pp. 1-38, 2003. [32] r. grasso, y. p. ivanenko m. zago, m. molinari, g. scivoletto v. castellano, v, macellari and f. lacquaniti,”distributed plasticity of locomotor pattern generators in spinal cord injured patients”, brain, vol. 127, no. 5, pp. 1019-10034, 2004. [33] y. p. ivanenko, r. p. poppele and f. lacquaniti f, “five basic muscle activation patterns account for muscle activity during human locomotion”, the journal of physiology, vol. 556, pp. 267-282, 2004. [34] a. scheiner, j. t. mortimer and u. roessmann, “imbalanced biphasic electrical stimulation: muscle tissue damage”, annals of biomed. eng., vol. 1, no. 18(4), pp. 407-25, 1990. [35] j. t. mortimer, “motor prostheses”, in handbook of physiology: the nervous system, motor control. published on line 2011. [36] t. j. bajzek and r. j. jaeger, “characterization and control of muscle response to electrical stimulation”, ann. biomed. eng., vol. 15, pp. 485–501, 1987. [37] r. baratta and m. solomonow, “the dynamic response model of nine different skeletal muscles”, ieee trans. biomed. eng., vol. 37, pp. 243–51, 1990. [38] p. e. crago, p. h. peckham and j. t. mortimer, ”the choice of pulse duration for chronic electrical stimulation via surface, nerve and intramuscular electrodes”, ann. biomed. eng., vol. 2, pp. 252–264, 1974. [39] p. e. crago, p. h. peckham and g. b. thrope, “modulation of muscle force by recruitment during intramuscular stimulation”, ieee trans. biomed. eng,, vol. 27, pp. 679–684, 1980. [40] j. a. gruner and c. p. mason, "nonlinear muscle recruitment during intramuscular and nerve stimulation." j rehabil. res. develop., vol. 26, no. 2, pp. 1-16, 1988. [41] d. b. popovic, t. gordon, v. f, rafuse and a. prochazka, “properties of implanted electrodes for functional electrical stimulation”, annals of biomed. eng., vol. 19, no. 3, pp. 303-316, 1991 [42] d. b. popović and m. b. popović, “automatic determination of the optimal shape of the surface electrode: selective stimulation”, j. 
neurosci. methods, vol. 178, pp. 174-181, 2009. [43] l. popović-maneski, m. kostić, t. keller, s. mitrović, lj. konstantinović and d. b. popović, “multipad electrode for effective grasping: design”, ieee trans. neur. sys. rehabil. eng., vol. 21, pp. 648654, 2013. [44] l. popović-maneski, n, malešević, a. savić and d. b. popović, “spatially distributed asynchronous stimulation delays muscle fatigue”, muscle & nerve, vol. 48, pp. 930-937, 2013. [45] t. sinkjær, m. haugland, a. inmann, m. hansen and d. k. nielsen, “biopotentials as command and feedback signals in functional electrical stimulation systems”, medical engineering & physics, vol. 25, no. 1, pp. 29-40, 2003. [46] n. birbaumer, a. r. murguialday and l. g. cohen, “brain-computer-interface (bci) in paralysis”, the european image of god and man, pp. 483-492, lippincott williams & wilkins, 2010. [47] j.j. daly, r. cheng, j. rogers, k. litinas, k. hrovat and m. dohring, “feasibility of a new application of noninvasive brain computer interface (bci): a case study of training for recovery of volitional motor control after stroke”, journal of neurologic physical therapy, vol. 33, no. 4, pp. 203-11, 2009. [48] a. h. do, p. t. wang, c. e. king, a. abiri, z. nenadic, “brain-computer interface controlled functional electrical stimulation system for ankle movement”, journal of neuroengineering and rehabilitation, vol. 8, no. 49. 2011. [49] j. d. millán, r. rupp, g. r. müller-putz, r. murray-smith, c. giugliemma, m. tangermann, c. vidaurre, f. cincotti, a. kübler, r. leeb and c. neuper, “combining brain–computer interfaces and assistive technologies: state-of-the-art and challenges”, frontiers in neuroscience, vol. 4, 161, 2010. [50] c. t. moritz, s. i. perlmutter, e. e. fetz, ”direct control of paralysed muscles by cortical neurons”, nature, vol. 456 (7222), pp. 639-642, 2008. [51] g. pfurtscheller, g. r. müller-putz, j. pfurtscheller and r. 
rupp, “eeg-based asynchronous bci controls functional electrical stimulation in a tetraplegic patient”, eurasip journal on applied signal processing, p. 3152-5, 2005. [52] g. pfurtscheller, g. r. müller, j. pfurtscheller, h. j. gerner and r. rupp, ”thought–control of functional electrical stimulation to restore hand grasp in a patient with tetraplegia”, neuroscience letters, vol. 351, no. 1, pp. 33-36, 2003. [53] c. ethier, e. r. oby, m. j. bauman and l. e. miller, "restoration of grasp following paralysis through brain-controlled stimulation of muscles", nature, vol. 485, pp. 368–371, 2012. [54] a. m. savić, n. m. malešević and m. b. popović, “feasibility of a hybrid brain-computer interface for advanced functional electrical therapy”, hindawi publ corp, scientific world journal, article id 797128, 2014. [55] a. b. schwartz, t. cui, d. j. weber and d. w. moran, “brain-controlled interfaces: movement restoration with neural prosthetics”, neuron, vol. 52, no. 1, pp. 205–220, 2006. [56] m. taylor, s. i. helms tillery and a. b. schwartz, "direct cortical control of 3d neuroprosthetic devices", science, vol. 296 (5574), pp. 1829-1832, 2002. [57] j. r. wolpaw, n. birbaumer, w. j. heetderks, d. j. mcfarland, p. h. peckham, g. schalk, e. donchin, l. a. quatrano, c. j. robinson and t. m. vaughan, “brain-computer interface technology: a review of the first international meeting”, ieee trans. rehabil. eng., vol. 8, no. 2, pp. 164-173, 2000. [58] s. došen and d. b. popović, “transradial prosthesis: artificial vision for control of prehension”, artificial organs, vol. 35, no. 1, pp. 37-48, 2011. [59] s. došen, c. cipriani, m. kostić, m. c. carrozza and d. b. popović, “cognitive vision system for the control of a dexterous prosthetic hand: an evaluation study”, journal of neuroengineering and rehabilitation, on line 7:42, 2010. [60] m. marković, s. došen, c. cipriani, d. b. popović and d.
farina, “stereovision and augmented reality for closed-loop control of grasping in hand prostheses”, j. neural engineering, vol. 11, no. 4, p. 046001, 2014. [61] r. tomović, d. b. popović and r. b. stein, nonanalytical methods for motor control, world sci publ, singapore, 1995. [62] j. kojović, m. djurić-jovičić, s. došen, m. b. popović and d. b. popović, “sensor-driven fourchannel stimulation of paretic leg: functional electrical walking therapy”, j. neurosci.. methods, vol. 181, pp. 101-105, 2009. [63] d. b. popović and m. b. popović, "tuning of a nonanalytic hierarchical control system for reaching with fes", ieee trans. on biomed. eng., vol. 45, pp. 203-212, 1998. [64] m. b. popović and d. b. popović, "cloning biological synergies improved control of elbow neuroprostheses", ieee emb magazine, vol. 20, no. 1, pp. 74-81, 2001. [65] m. b. popović, “control of neural prostheses for grasping and reaching”, med. eng. phys., vol. 25, no. 1, pp. 41-50, 2003. [66] d. b. popović, r. tomović, d. tepavac and l. schwirtlich, “control aspects of active above-knee prosthesis”, intern. j man-machine studies, vol. 35, no. 6, pp. 751-767, 2001. [67] d. b. popović, r. b. stein, m. n. oguztoreli, m. lebiedowska and s. jonić. “optimal control of walking with functional electrical stimulation: a computer simulation study”, ieee trans. rehabil. eng., vol. 7, no. 1, pp. 69-79, 1999. [68] s. jonić, t. janković, v. gajić and d. b. popović, “three machine learning techniques for automatic determination of rules to control locomotion”, ieee trans. biomed. eng., vol 46, no. 3, p. 10, 1999. [69] g. shue, p. e. crago and h. j chizeck, "muscle-joint models incorporating activation dynamics, moment-angle, and moment-velocity properties" ieee trans. biomed. eng., vol.42, pp. 212-223, 1992. [70] r. b. stein, e. p. zehr, m. k. lebiedowska, d. b. popović, a. scheiner and h. j. 
chizeck, "estimating mechanical parameters of leg segments in individuals with and without physical disabilities", ieee trans. rehabil. eng., vol. 4, pp. 201-211, 1996. [71] r. riener and t. fuhr, “patient-driven control of fes-supported standing up: a simulation study”, ieee trans. rehabil. eng., vol. 6, no. 2, pp. 113-124, 1998. [72] m. ferrarin, f. palazzo, r. riener and j. quintern, “model-based control of fes-induced single joint movements”, ieee trans. neural syst. rehabil. eng., vol. 9, no. 3, pp. 245-257, 2001. [73] z. matjacic, k. hunt, h. gollee and t. sinkjaer, “control of posture with fes systems”, med. eng. phys., vol. 25, no. 1, pp. 51-62, 2003. [74] d. b. popović, m. radulović, l. schwirtlich and n. jauković, "automatic vs. hand-controlled walking of paraplegics", med. eng. phys., vol. 25, pp. 63-74, 2003. [75] s. jezernik, r. g. wassink and t. keller, “sliding mode closed-loop control of fes controlling the shank movement”, ieee trans. biomed. eng., vol. 51, no. 2, pp. 163-172, 2004. [76] d. graupe, “emg pattern analysis for patient-responsive control of fes in paraplegics for walker-supported walking”, ieee trans. biomed. eng., vol. 36, no. 7, pp. 711-919, 1989. [77] c. t. freeman, a. m. hughes, j. h. burridge, p. h. chappell, p. l. lewin and e. rogers, “iterative learning control of fes applied to the upper extremity for rehabilitation”, control engineering practice, vol. 17, no. 3, pp. 368-381, 2009. instruction facta universitatis series: electronics and energetics vol. 33, n o 1, march 2020, pp. 1-14 https://doi.org/10.2298/fuee2001001j © 2020 by university of niš, serbia | creative commons license: cc by-nc-nd design of multiplierless comb compensators with magnitude response synthesized as sinewave functions gordana jovanovic dolecek department of electronics, institute inaoe, puebla, mexico abstract. this paper presents research on the design of multiplierless comb compensators with magnitude response synthesized as sinewave functions.
first, the importance of the comb decimation filter and the need for its compensator are elaborated. next, some favorable characteristics of a comb compensator are presented. compensators with magnitude characteristics synthesized as sinewave functions fulfill those favorable characteristics. then, some of the most important results on the design of compensators with sinewave-based magnitude responses are described, including single and cascaded sinewave-based functions. for all designs, the corresponding overall magnitude responses and the passband zooms are presented. the design parameters generally depend only on the number of cascaded combs and generally do not depend on the decimation factor. design parameters are presented in tables along with the corresponding required number of adders. key words: sigma delta ad converters, oversampling, decimation, decimation filter, comb filter, compensation filter. 1. introduction the oversampled sigma delta (σδ) analog-to-digital converter (adc) samples the analog signal with a frequency much larger than the nyquist frequency (which is the minimum required sampling frequency); the ratio is usually expressed as the oversampling ratio (osr) [1]. oversampled converters have gained a lot of popularity in the last two decades due to some very favorable characteristics such as low power consumption, small silicon area, and high resolution [2], [3]. the oversampled signal frequency must be decreased to the nyquist frequency at the ad modulator output. this process is performed digitally in the decimator. therefore, the principal parts of a σδ adc are the modulator and the decimator. our attention here is only on the decimator. decimation is the process of decreasing the sample rate by an integer called the decimation factor m. decimation introduces unwanted replicas of the main spectrum, called aliasing. received october 30, 2019 corresponding author: gordana jovanovic dolecek department of electronics, national institute inaoe, e.
erro1, 72740 puebla, puebla, mexico (e-mail: gordana@ieee.org) aliasing may deteriorate the decimated signal and must be eliminated prior to decimation by a lowpass digital filter, called the decimation or antialiasing filter [4]. the most popular decimation filter is the comb filter, which has all coefficients equal to unity and consequently does not require multipliers for its implementation. this filter naturally provides aliasing rejection in the bands around the comb zeros, called folding bands. however, this attenuation is not enough and must be improved. some recent methods for improving comb aliasing rejection are described in [5]. additionally, the comb passband characteristic is not flat and exhibits a passband droop, which increases with the number of cascaded combs k. the droop introduced in the signal band penalizes the resolution achieved by the σδ adc and should be compensated [1]. in this paper we consider only the design of comb compensators. the rest of the paper is organized as follows. the next section presents a basic description of comb filters in the z-domain and in the frequency domain. the wideband and narrowband compensators are defined, and some very desirable characteristics of a comb compensator are elaborated. section iii presents a compensator with magnitude characteristic synthesized as a single sinewave function. wideband and narrowband cases are elaborated. the magnitude responses based on the cascade of single sinewave functions are described in section iv. finally, section v introduces the most efficient compensator, with magnitude response synthesized using a fourth-order sinewave function. the design parameters are presented for all methods, and the methods are illustrated with examples. 2.
comb decimation filter

the transfer function of the comb filter is usually presented in recursive form as:

$$H(z) = \left[ \frac{1}{M} \, \frac{1 - z^{-M}}{1 - z^{-1}} \right]^{K}, \tag{1}$$

where $M$ is the decimation factor and $K$ is the number of cascaded combs. the popular cic (cascaded-integrator-comb) structure [6], shown in fig. 1, is derived from the recursive form (1). because of the popularity of the cic structure, some authors also use the term cic for the comb filter.

fig. 1 cic structure

the nonrecursive transfer function is given as [4]:

$$H(z) = \left[ \frac{1}{M} \sum_{k=0}^{M-1} z^{-k} \right]^{K}. \tag{2}$$

the recursive structure is area efficient, while the nonrecursive structure is power efficient [4]. using the polyphase decomposition, the nonrecursive structure can be made more power efficient because all filtering is moved to the lower rate. in a polyphase decomposition the polyphase components are introduced [4]:

$$H(z) = \frac{1}{M^{K}} \sum_{\lambda=0}^{M-1} z^{-\lambda} H_{\lambda}(z^{M}), \tag{3}$$

where $H_{\lambda}(z^{M})$, $\lambda = 0, \ldots, M-1$, are the polyphase components. the polyphase decomposition is shown in fig. 2a. using the multirate identities [4] we get a more efficient structure, shown in fig. 2b.

fig. 2 polyphase decomposition: a. comb polyphase decomposition; b. moving filtering to lower rate

the magnitude characteristic of the comb filter has the following form:

$$\left| H(e^{j\omega}) \right| = \left| \frac{1}{M} \, \frac{\sin(\omega M/2)}{\sin(\omega/2)} \right|^{K}. \tag{4}$$

the passband edge frequency $\omega_p$ is defined as:

$$\omega_p = \frac{\pi}{M R}, \tag{5}$$

where $R$ is an integer. for $R=2$ the passband is considered wide, while for $R>2$ it is considered narrow. the magnitude characteristic (4) should be flat in the passband in order to preserve the decimated signal. the stopband is defined as the bands around the comb zeros, called folding bands, where aliasing occurs:

$$\left[ \frac{2\pi k}{M} - \omega_p, \; \frac{2\pi k}{M} + \omega_p \right], \quad k = \begin{cases} 1, \ldots, M/2 & \text{for } M \text{ even} \\ 1, \ldots, (M-1)/2 & \text{for } M \text{ odd.} \end{cases} \tag{6}$$

in order to eliminate aliasing, the comb should have enough attenuation in the folding bands.
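the magnitude characteristic (4), the passband edge (5) and the folding bands (6) can be evaluated numerically. the following python sketch is illustrative only: the function names, the use of numpy and the grid size are assumptions made here and are not part of the cited designs. it reports the passband droop at $\omega_p$ and the worst-case attenuation inside the first folding band.

```python
import numpy as np


def comb_magnitude(omega, M, K):
    """Comb magnitude of eq. (4): |sin(omega*M/2) / (M*sin(omega/2))|**K."""
    omega = np.asarray(omega, dtype=float)
    num = np.sin(omega * M / 2.0)
    den = M * np.sin(omega / 2.0)
    # At omega = 0 the ratio is 0/0 with limit 1, so that case is set explicitly.
    at_zero = den == 0.0
    ratio = np.where(at_zero, 1.0, num / np.where(at_zero, 1.0, den))
    return np.abs(ratio) ** K


def passband_edge(M, R):
    """Passband edge of eq. (5): omega_p = pi / (M*R)."""
    return np.pi / (M * R)


def folding_bands(M, R):
    """Folding bands of eq. (6) as (lower, upper) pairs around 2*pi*k/M."""
    wp = passband_edge(M, R)
    # M // 2 equals M/2 for even M and (M-1)/2 for odd M.
    return [(2 * np.pi * k / M - wp, 2 * np.pi * k / M + wp)
            for k in range(1, M // 2 + 1)]


def passband_droop_db(M, K, R):
    """Droop at the passband edge in dB, returned as a positive number."""
    return float(-20.0 * np.log10(comb_magnitude(passband_edge(M, R), M, K)))


def first_band_rejection_db(M, K, R, n_points=1001):
    """Worst-case (smallest) attenuation in the first folding band, in dB."""
    lo, hi = folding_bands(M, R)[0]
    grid = np.linspace(lo, hi, n_points)
    return float(-20.0 * np.log10(np.max(comb_magnitude(grid, M, K))))


if __name__ == "__main__":
    # Droop and alias rejection both grow with the number of cascaded combs K.
    for K in range(1, 6):
        print(K, round(passband_droop_db(16, K, 2), 2),
              round(first_band_rejection_db(16, K, 2), 1))
```

running the loop for $M=16$, $R=2$ shows both quantities growing with $K$, which is the trade-off that motivates a compensator.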
figure 3 shows the overall comb magnitude characteristics for $M=15$ and $K=1,\ldots,5$. the wideband passband zoom is also shown. increasing the comb parameter $K$ clearly improves the aliasing rejection; however, the passband droop also increases.

fig. 3 overall comb magnitude responses and passband zoom, for different values of $K$

the passband droop of the comb $H(z)$ must be corrected by a filter called the comb compensator $C(z)$, which works at the low rate, i.e. after decimation, as shown in fig. 4.

fig. 4 compensated comb

we consider here only comb compensators. some desirable characteristics of the compensator are listed below:
• like the comb, the compensator structure does not require multipliers.
• the multiplierless compensator has a low number of adders.
• the compensator design is simple and defined only by the comb parameter $K$, and generally does not depend on the decimation factor $M$.
• the compensated comb has a low absolute value of the passband deviation.
• the compensator should not deteriorate alias rejection in the comb folding bands.

generally, there is a trade-off between the number of adders and the absolute value of the passband deviation of the compensated comb. the compensator magnitude characteristic should approximate the inverse comb characteristic:

$$\left| H_i(e^{j\omega}) \right| = \left| M \, \frac{\sin(\omega/2)}{\sin(\omega M/2)} \right|^{K}, \tag{7}$$

in the passband:

$$\left| C(e^{jM\omega}) \right| \approx \left| H_i(e^{j\omega}) \right| \quad \text{for } 0 \le \omega \le \omega_p, \tag{8}$$

where $\omega_p$ is the passband edge frequency and $C(e^{jM\omega})$ is the compensator frequency response at the high rate. different methods have been proposed for the design of narrowband and wideband multiplierless compensators.
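conditions (7) and (8) can be checked numerically for any candidate compensator. the sketch below is a hedged illustration, not a design method from the cited works: `max_passband_deviation_db` and its arguments are hypothetical names, and the compensator is passed as a callable that returns its magnitude at the low-rate frequency $M\omega$. with no compensation the deviation equals the comb droop; with the ideal inverse (7) it vanishes up to rounding.

```python
import numpy as np


def comb_magnitude(omega, M, K):
    """Comb magnitude, eq. (4); omega must be strictly positive here."""
    return np.abs(np.sin(omega * M / 2.0) / (M * np.sin(omega / 2.0))) ** K


def inverse_comb_magnitude(omega, M, K):
    """Ideal compensator target, eq. (7)."""
    return 1.0 / comb_magnitude(omega, M, K)


def max_passband_deviation_db(M, K, R, compensator_mag, n_points=2001):
    """Largest |gain in dB| of comb times compensator over 0 < omega <= omega_p.

    compensator_mag(w_low) must return the compensator magnitude at the
    low-rate frequency w_low = M * omega (hypothetical interface).
    """
    omega_p = np.pi / (M * R)
    # Start slightly above zero to avoid the 0/0 point of eq. (4).
    omega = np.linspace(omega_p / n_points, omega_p, n_points)
    total = comb_magnitude(omega, M, K) * compensator_mag(M * omega)
    return float(np.max(np.abs(20.0 * np.log10(total))))


M, K, R = 16, 5, 2
# No compensation: the compensator magnitude is 1 everywhere.
no_comp = max_passband_deviation_db(M, K, R, lambda w: np.ones_like(w))
# Ideal compensation: the compensator equals the inverse comb of eq. (7).
ideal = max_passband_deviation_db(
    M, K, R, lambda w: inverse_comb_magnitude(w / M, M, K))
print(f"uncompensated: {no_comp:.2f} dB, ideal inverse: {ideal:.2e} dB")
```

any practical multiplierless compensator falls between these two extremes, and the same function can be used to compare candidates by passing their magnitude functions.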
)(zh m )(zc design of multiplierless comb compensators with magnitude response synthesized as sinewave functions 5 in this paper we consider only compensators with magnitude characteristic synthesized using sinewave forms, since those types of compensators truly satisfy all compensator desirable characteristics, previously mentioned. in all examples we consider the same values of comb parameters, i.e., m=16 and 25, and k=5. 3. single sinewave function-based magnitude response 3.1 wide band compensator [7] the magnitude response of compensator presented in [7] has a sinewave-squared form: )2/(sin1)( 2 mbec mj    , (9) where b is amplitude of sine-squared function, m is a comb parameter. the magnitude characteristic of the compensated comb is: 21 sin( / 2) ( ) ( ) ( ) [1 sin ( / 2)] sin( / 2) k j j j m c m h e h e c e b m m          , (10) where k is the order of comb filter. the values of parameter b depend on the given value of comb parameter k and do not depend on the comb parameter m, (for m>10), and are given in the first column of table 1, [7]. the resulting maximum absolute value of passband deviation of the compensated comb, is equal to 0.4db. using a well known trigonometrical relation, 2/)]2cos(1[)(sin 2   , (11) the transfer function of compensator, at low rate, is given as: 12122122 ]21[2])22([2)(   zzzbbzzbbzc . (12) the compensator is a second order filter. the number of required adders s is shown in the second column of table 1. note that the compensator coefficients can be presented using shifts and adders, and consequently the compensator is a multiplierless filter. table 1 values of parameter b and adders s for k=1,…, 5 k b s 1 2 -2 3 2 2 -1 3 3 4 5 2 -1 +2 -2 1 1+2 -2 4 3 4 6 g. jovanovic dolecek the values in table 1, can be used for the same value of k and different values of m, as shown in the following example. example 1: we consider the value of k = 5 and two different values for m, 16 and 25. 
from table 1, the parameter $B = 5/4 = 1 + 2^{-2}$. from (10), the magnitude responses of the compensated combs are given as:

$$\left| H_{c1}(e^{j\omega}) \right| = \left| \frac{1}{16} \, \frac{\sin(8\omega)}{\sin(\omega/2)} \right|^{5} \left[ 1 + \tfrac{5}{4} \sin^{2}(8\omega) \right], \tag{13}$$

$$\left| H_{c2}(e^{j\omega}) \right| = \left| \frac{1}{25} \, \frac{\sin(25\omega/2)}{\sin(\omega/2)} \right|^{5} \left[ 1 + \tfrac{5}{4} \sin^{2}(25\omega/2) \right], \tag{14}$$

where the subindexes 1 and 2 stand for $M=16$ and $M=25$, respectively. the overall magnitude responses of the compensated comb and the comb are shown in fig. 5, along with the passband zoom.

fig. 5 magnitude responses of comb and compensated comb in [7]: a. $M=16$, $K=5$; b. $M=25$, $K=5$

3.2 narrowband compensator [8]

a narrowband compensator with the following magnitude characteristic was proposed in [8]:

$$\left| C(e^{j\omega M}) \right| = 1 + 2^{-b} \sin^{2}(\omega M/2), \tag{15}$$

where $b$ is an integer parameter and the passband edge is:

$$\omega_p = \frac{\pi}{8M}. \tag{16}$$

the transfer function is given as:

$$C(z) = -2^{-(b+2)} \left[ 1 - (2^{b+2} + 2) z^{-1} + z^{-2} \right]. \tag{17}$$

the parameter $b$ depends only on the comb parameter $K$. the values of $b$ for $K=2,\ldots,6$ are given in table 2. from the values of $b$ and (17), it follows that the compensator needs only $S=3$ adders for all values of $K$, as shown in the third column of table 2.

table 2 values of parameter $b$ and adders $S$, for $K = 2,\ldots,6$

K    b    S
2    2    3
3    2    3
4    1    3
5    0    3
6    0    3

example 2: we consider again $M = 16$ and $25$, and $K = 5$. from table 2, in both cases $b = 0$ and the magnitude characteristics of the compensated combs are:

$$\left| H_{c1}(e^{j\omega}) \right| = \left| \frac{1}{16} \, \frac{\sin(8\omega)}{\sin(\omega/2)} \right|^{5} \left[ 1 + \sin^{2}(8\omega) \right], \tag{18}$$

$$\left| H_{c2}(e^{j\omega}) \right| = \left| \frac{1}{25} \, \frac{\sin(25\omega/2)}{\sin(\omega/2)} \right|^{5} \left[ 1 + \sin^{2}(25\omega/2) \right], \tag{19}$$

where the subindexes 1 and 2 stand for $M=16$ and $M=25$, respectively. the corresponding magnitude responses are shown in fig. 6.

fig. 6 magnitude responses of comb and compensated comb in [8]: a. $M=16$, $K=5$; b. $M=25$, $K=5$

4. magnitude response synthesized as cascade of sinewave functions

4.1.
method in [9]

in order to decrease the maximum absolute passband deviation of the compensated comb, [9] proposed a compensator magnitude response formed as a cascade of two sinewave functions with different parameters $B$:

$$\left| C(e^{j\omega M}) \right| = \left[ 1 + B_1 \sin^{2}(\omega M/2) \right] \left[ 1 + B_2 \sin^{2}(\omega M/2) \right]. \tag{15}$$

the proposed compensator has an interesting feature: the parameter $B_1$ remains the same for all values of $K$, and $B_2$ is linearly related to the comb parameter $K$:

$$B_1 = 2^{-3}; \quad B_2 = \frac{1 + 4(K-1)}{2^{4}}, \quad \text{for } K=1,\ldots,5. \tag{16}$$

the parameters and the number of adders $S$ are summarized in table 3. the method is illustrated in the next example, taking the same comb parameters as in the previous examples.

table 3 values of parameters $B_1$ and $B_2$ and adders $S$, for $K=1,\ldots,5$

K    B1          B2                        S
1    $2^{-3}$    $2^{-4}$                  6
2    $2^{-3}$    $2^{-4}+2^{-2}$           7
3    $2^{-3}$    $2^{-4}+2^{-1}$           7
4    $2^{-3}$    $2^{-4}+2^{-2}+2^{-1}$    8
5    $2^{-3}$    $2^{-4}+2^{0}$            7

example 3: from (16) we get $B_1=2^{-3}$ and $B_2=17/16=1+2^{-4}$, for both values of $M$. the magnitude responses of the compensated combs for $M=16$ and $25$, respectively, are:

$$\left| H_{c1}(e^{j\omega}) \right| = \left| \frac{1}{16} \, \frac{\sin(8\omega)}{\sin(\omega/2)} \right|^{5} \left[ 1 + \tfrac{1}{8} \sin^{2}(8\omega) \right] \left[ 1 + \tfrac{17}{16} \sin^{2}(8\omega) \right], \tag{17}$$

$$\left| H_{c2}(e^{j\omega}) \right| = \left| \frac{1}{25} \, \frac{\sin(25\omega/2)}{\sin(\omega/2)} \right|^{5} \left[ 1 + \tfrac{1}{8} \sin^{2}(25\omega/2) \right] \left[ 1 + \tfrac{17}{16} \sin^{2}(25\omega/2) \right]. \tag{18}$$

the overall magnitude responses and the passband zooms for both cases are shown in fig. 7.

fig. 7 magnitude responses of comb and compensator in [9]: a. $M=16$, $K=5$; b. $M=25$, $K=5$

note that the maximum absolute passband deviation of the compensated combs is slightly decreased and is less than 0.3 db.

4.2. cascade narrowband and wideband compensators [10]

in [10] a cascade is proposed of the narrowband compensator (17) from [8],

$$C_1(z) = -2^{-(b+2)} \left[ 1 - (2^{b+2} + 2) z^{-1} + z^{-2} \right], \tag{19}$$

and the wideband compensator from [11]:

$$C_2(z) = \left[ -2^{-4} \left[ z^{-1} - (2^{4} + 2) z^{-2} + z^{-3} \right] \right]^{k_1}, \tag{20}$$

where $k_1$ is the parameter.
the transfer function of the compensator is:

$$C(z) = C_1(z) \, C_2(z) = -2^{-(b+2)} \left[ 1 - (2^{b+2} + 2) z^{-1} + z^{-2} \right] \left[ -2^{-4} \left[ z^{-1} - (2^{4} + 2) z^{-2} + z^{-3} \right] \right]^{k_1}. \tag{21}$$

the compensator parameters $b$ and $k_1$ depend only on the comb parameter $K$, and are shown in table 4 along with the corresponding number of adders $S$ for $K=2,\ldots,6$.

table 4 values of parameters $k_1$ and $b$ and adders $S$, for $K=2,\ldots,6$

K    k1    b          S
2    1     $1+2$      6
3    2     $1+2$      9
4    3     $1+2$      12
5    4     $2^{2}$    15
6    5     $1+2^{2}$  18

the maximum absolute passband deviation of the compensated comb is decreased, as shown in the following example. however, the number of required adders is increased, as given in the last column of table 4.

example 4: we again consider the decimation factors 16 and 25 and $K=5$. from table 4 we have the compensator parameters $k_1=4$ and $b=2^{2}$. the required number of adders is 15. the magnitude response of the compensated comb for $M=16$ is:

$$\left| H_{c1}(e^{j\omega}) \right| = \left| \frac{1}{16} \, \frac{\sin(8\omega)}{\sin(\omega/2)} \right|^{5} \left| C(e^{j16\omega}) \right|, \tag{22}$$

where $C(e^{j16\omega})$ is the magnitude response of compensator (21) at the high rate. similarly, the magnitude response of the compensated comb with $M=25$, at the high rate, is:

$$\left| H_{c2}(e^{j\omega}) \right| = \left| \frac{1}{25} \, \frac{\sin(25\omega/2)}{\sin(\omega/2)} \right|^{5} \left| C(e^{j25\omega}) \right|, \tag{23}$$

where $C(e^{j25\omega})$ is the magnitude response of compensator (21) at the high rate. the overall magnitude responses of the combs and the compensated combs are contrasted in fig. 8. the passband zooms are also shown.

fig. 8 magnitude responses of comb and compensated comb in [10]: a. $M=16$, $K=5$, $k_1=4$, $b=4$; b. $M=25$, $K=5$, $k_1=4$, $b=4$

5. fourth-order sine-based magnitude response [12]

in order to get a better approximation of the inverse comb characteristic, [12] proposed to cascade the filter (9) from [7] with a filter that has a fourth-order sine-based magnitude response:

$$\left| C_2(e^{j\omega M}) \right| = 1 + A \sin^{4}(\omega M/2). \tag{24}$$

the magnitude characteristic of the compensator is given as:

$$\left| C(e^{j\omega M}) \right| = \left| C_1(e^{j\omega M}) \right| \left| C_2(e^{j\omega M}) \right|. \tag{25}$$

using (9), (24) and (25), the magnitude characteristic of the compensator at the high rate becomes:

$$\left| C(e^{j\omega M}) \right| = \left[ 1 + B \sin^{2}(\omega M/2) \right] \left[ 1 + A \sin^{4}(\omega M/2) \right]. \tag{26}$$

the transfer function $C_2(z)$ is obtained by using the following trigonometric relation:

$$\sin^{4}(\alpha) = \frac{\cos(4\alpha) - 4\cos(2\alpha) + 3}{8}. \tag{27}$$

using (27) and euler's formula, from (24) we get:

$$C_2(z) = 2^{-4} A \left[ 1 + z^{-4} - 4 (z^{-1} + z^{-3}) + (2^{2} + 2) z^{-2} \right] + z^{-2}. \tag{28}$$

similarly, from (9), as in (12), we have:

$$C_1(z) = -2^{-2} B \left[ 1 - (2^{2} B^{-1} + 2) z^{-1} + z^{-2} \right] = z^{-1} - 2^{-2} B \left[ 1 - 2 z^{-1} + z^{-2} \right]. \tag{29}$$

finally, from (28) and (29) we get the compensator transfer function:

$$C(z) = C_1(z) \, C_2(z) = \left[ z^{-1} - 2^{-2} B \left[ 1 - 2 z^{-1} + z^{-2} \right] \right] \left[ 2^{-4} A \left[ 1 + z^{-4} - 4 (z^{-1} + z^{-3}) + (2^{2} + 2) z^{-2} \right] + z^{-2} \right]. \tag{30}$$

the values of $A$ and $B$ are the design parameters, which depend on the comb parameter $K$; they are given in table 5 along with the required number of adders for each value of $K$, $K=1,\ldots,6$.

table 5 values of parameters $A$ and $B$, and adders $S$, for $K=1,\ldots,6$

K    A           B                          S
1    $0$         $2^{-2}-2^{-5}$            4
2    $2^{-2}$    $2^{-2}+2^{-4}$            10
3    $2^{-1}$    $2^{-1}-2^{-4}$            10
4    $2^{-1}$    $2^{-1}+2^{-3}+2^{-4}$     11
5    $1$         $2^{0}-2^{-2}-2^{-5}$      11
6    $1$         $2^{0}-2^{-6}$             10

the method is illustrated in example 5.

example 5: using the same values $M=16$ and $25$ and $K=5$ as in the previous examples, from table 5 we get the design parameters $A=1$ and $B=2^{0}-2^{-2}-2^{-5}=23/32$. the magnitude responses of the compensated combs for $M=16$ and $25$, respectively, are given as:

$$\left| H_{c1}(e^{j\omega}) \right| = \left| \frac{1}{16} \, \frac{\sin(8\omega)}{\sin(\omega/2)} \right|^{5} \left[ 1 + \tfrac{23}{32} \sin^{2}(8\omega) \right] \left[ 1 + \sin^{4}(8\omega) \right], \tag{31}$$

$$\left| H_{c2}(e^{j\omega}) \right| = \left| \frac{1}{25} \, \frac{\sin(25\omega/2)}{\sin(\omega/2)} \right|^{5} \left[ 1 + \tfrac{23}{32} \sin^{2}(25\omega/2) \right] \left[ 1 + \sin^{4}(25\omega/2) \right]. \tag{32}$$

the magnitude responses (31) and (32) are contrasted with the comb magnitude responses in fig. 9a and 9b, respectively. the zooms in the passband are shown in both cases.

a. $M=16$, $K=5$, $A=1$, $B=23/32$; b. $M=25$, $K=5$, $A=1$, $B=23/32$
fig.
9 illustration of compensator in [12]

the absolute value of the passband deviation of the compensated combs is less than 0.1 db, at a cost of 11 adders.

6. conclusion

this paper addresses different methods for comb compensator design with magnitude responses synthesized as sinewave functions. the presented methods are the result of our research over the last ten years. all presented designs are multiplierless, since the compensator coefficients are realized using only adders and shifts. all design parameters depend only on the comb parameter $K$, and practically do not depend on the decimation factor. the design parameters are presented in tables. the presented methods are compared in table 6 in terms of compensation capability, expressed as the maximum absolute value of the passband deviation of the compensated comb in db, and complexity, expressed as the number of adders. in all presented methods, comb compensation does not deteriorate aliasing rejection in the folding bands.

table 6 comparisons

method   δ [db]   S
[7]      0.4      3 for $K=1,2,4$; 4 for $K=3,5$
[9]      0.3      6 for $K=1$; 7 for $K=2,3,5$; 8 for $K=4$
[10]     0.25     6 for $K=2$; 9 for $K=3$; 12 for $K=4$; 15 for $K=5$; 18 for $K=6$
[12]     0.1      4 for $K=1$; 10 for $K=2,3$; 11 for $K=4,5$; 10 for $K=6$

we can observe that the compensator in [12] exhibits the best trade-off between the quality of compensation, expressed as the maximum absolute passband deviation, and the complexity, expressed as the number of required adders.

references
[1] j. m. de la rosa, sigma-delta converters, practical design guide, 2nd edition, hoboken, new jersey: wiley-ieee press, 2018, pp. 20-27.
[2] s. pavan, r. schreier, and g. c. temes, understanding delta-sigma data converters, 2nd edition, hoboken, new jersey: wiley-ieee press, 2018, pp. 451-480.
[3] n. n. hurrah, z. jain, a. bhardwaj, a. a. parah and a. k.
pandit, “oversampled sigma delta adc decimation filter: design techniques, challenges, tradeoffs and optimization,” in proceedings of the 2015 2nd international conference on recent advances in engineering & computational sciences (raecs), 2015, pp. 1-6.
[4] g. jovanovic dolecek (ed.), multirate systems: design and application. hershey: igi global, 2001, chapter 1, pp. 1-26.
[5] g. jovanovic dolecek (ed.), advances in multirate systems, new york: springer, 2018, chapter 3, pp. 59-81.
[6] e. hogenauer, “an economical class of digital filters for decimation and interpolation,” ieee transactions on acoustic, speech, and signal processing, vol. assp-29, pp. 155-162, 1981.
[7] g. jovanovic dolecek and a. fernandez-vazquez, “trigonometrical approach to design a simple wideband comb compensator,” international journal on electronics and communication, aeu, vol. 68, pp. 437-441, 2014.
[8] g. jovanovic dolecek and s. k. mitra, “simple method for compensation of cic decimation filter,” electronics letters, vol. 44, pp. 1162-1163, 2008.
[9] g. jovanovic dolecek, “a novel comb compensator with a good passband deviation-complexity tradeoff,” in proceedings of the 2016 ieee 59th international midwest symposium on circuits and systems (mwscas), pp. 1-4, 2016.
[10] g. jovanovic dolecek, “on wideband comb compensator,” in proceedings of the 2013 ieee international conference in computing, communications and informatics (icacci), mysore, 2013, pp. 1987-1990.
[11] g. jovanovic dolecek, “simple wideband cic compensator,” electronics letters, vol. 45, pp. 1270-1272, 2009.
[12] g. jovanovic dolecek, r. g. baez, r. m. salgado, and j. de la rosa, “novel multiplierless wideband comb compensator with high compensation capability,” circuits, systems and signal processing, vol. 5, pp. 2031-2049, 2017.

instruction facta universitatis series: electronics and energetics vol. 30, n o 1, march 2017, pp.
1-25 doi: 10.2298/fuee1701001b

microelectronic reliability models for more than moore nanotechnology products

alain bensoussan institute of technology antoine de saint exupery, toulouse, france

abstract. disruptive technologies face a lack of reliability engineering standards and physics of failure (pof) heritage. devices based on gan, sic, optoelectronics, deep-submicron nanotechnologies or 3d packaging techniques, for example, suffer from a critical absence of screening methods and of qualification and reliability standards when they are intended for hi-rel applications. to prepare the hi-rel industry for just-in-time cots, reliability engineers must define proper and improved models to guarantee long-term robust equipment that is free of infant mortality and capable of surviving harsh environments without failure. furthermore, time-to-market constraints require the shortest possible time for qualification. breakthrough technologies are generally industrialized for short-life consumer applications (typically smartphones or new pcs with a lifecycle of less than 3 years). how shall we qualify these innovative technologies for long-term hi-rel equipment operation? the more than moore law is the paradigm of updating what are now obsolete, inadequate screening methods, reliability models and standards to meet these demands. a state-of-the-art overview of quality assurance, reliability standards and test methods is presented in order to question how they must be adapted, harmonized and rearranged. here, we quantify failure rate models formulated for multiple loads and incorporating multiple failure mechanisms, in order to disentangle existing reliability models and fit the needs of industry 4.0.

key words: reliability, gan, sic, dsm, nanotechnology, more than moore.

1. introduction

hi-rel embedded system applications in aeronautics, space, railways, nuclear and telecommunication rely on reliability engineering standards [1] [2] related to physics of failure (pof) [3].
when systems are built on innovative and disruptive technologies, such standards and methods are in general obsolete and inadequate to prepare their industrialization and qualification for just-in-time commercialization. the proposed probabilistic design for reliability (pdfr) [4] and prognostic health monitoring (phm) [5] concepts open the door to anticipating, assessing and quantifying their reliability. reliability predictions such as remaining useful life (rul), failure rate and acceleration factors are mathematical tools related to pof that describe macroscopic changes in materials and devices, each having its own microscopic behavior. indeed, statistics help to predict the behavior of a population but are unable to predict the performance of a single item within that population. this is exactly what ludwig boltzmann (1844-1906) [6] did when he gave a new perception of the universe on the microscopic scale in the kinetic theory: a macroscopic state corresponds to some probability distribution of possible microstates.

received may 18, 2016 corresponding author: alain bensoussan irt saint exupery, 118 route de narbonne cs 44248, 31432 toulouse cedex 4, france (e-mail: alain.bensoussan@irt-saintexupery.com)

section 1 of this paper will review existing standards and clarify some routes to implement and generalize existing jedec or mil reliability standards. these standard methods develop failure mechanism models and their associated activation energies or acceleration factors, which may be used in making system failure rate estimations. for large scale integration processes in the nanoscale range (now below 10 nm) used for microcontrollers or pc chips, the physics of interaction, the temperature distributions and the critical path for signal processing are extremely variable.
the average value of the apparent activation energies of the various failure mechanisms cannot be used because a) different failure mechanisms have different weighting factors and affect each portion of an ic differently, and b) the apparent activation energy values affect the acceleration factor exponentially rather than linearly. section 2 will detail accelerated stress models as presented in well-established jedec documents, before recalling the multiple-stress boltzmann-arrhenius-zhurkov (baz) reliability model [7], [8], which can also be considered a development of the cox proportional hazards model [9]. we will establish that multiple failure mechanisms [10] must be considered for dsm nanotechnology nodes and will show how the htol reliability model elaborated by j. bernstein [11] [12] can support a more robust, easy-to-use theory. section 3 will show how a multi-dimensional tool named m-storm (multi-physics multi-stressors predictive reliability model) [13] can be implemented in a concrete situation for deep-submicron process devices, highlighting the remaining steps to be carried out for a complete tool release.

2. quality standard overview

well-known quality standards in various industry domains rely on, or are close to, military standards (mil-std) and jedec methods. with the arrival of the industry 4.0 paradigm, the fourth industrial revolution (the age of cyber and robots), quality and reliability models and tools led by health monitoring (hm) raise more crucial and vital questions. this section is not intended to be an exhaustive cookbook; rather, it highlights how generic approaches and hypotheses are used to assure the quality of products and equipment and how to build reliability into products dynamically. “dynamically” means that hardware and software must be designed to pre-identify and characterize system degradation while still in operating condition.
diagnosing the health of a system in order to anticipate failure requires opening new roads: imagining and designing dedicated hardware and software installed within the system itself, and defining procedures and tests that will decide self-corrections at hardware and/or software level (artificial intelligence). this requires a high level of intelligence integration within a system or a product, and this is the challenge of the 4.0 era. jedec or mil standards are generally based on the principle of separating the variables and considering a single stress at a time and a single failure mode and mechanism at a time. a failure mechanism may be characterized by how a degradation process proceeds, including the driving force, e.g., oxidation, diffusion, electric field, current density. when the driving force is known, a mechanism may be described by an explicit failure rate model; identifying that model with its associated parameters is the main objective. existing technologies, and even more so highly critical innovative technologies, oblige design engineers to quantify those driving forces while considering multiple internal stress parameters that induce interfering stress settings (current, voltage, power and temperature) and loads (dc and ac, and environmental loads such as thermal cycling, radiation, electrostatic discharge (esd), electrical over-stress (eos), energetic electromagnetic pulse, etc.).

2.1. european standards

as an example, the european cooperation for space standardization (ecss) (www.ecss.nl) is an initiative established to develop a coherent system of european space standards. the ecss standardization policy develops a documentation architecture with three branches (project management, product assurance and engineering) to overcome issues due to the existing standards, which result in higher costs, lower effectiveness and a less competitive industry.
the framework and basic rules of the system were defined with the involvement of the european space industry. a short overview of the main system documentation is presented here to show how, when and where the quality assurance requirements affect the electronic parts supply chain for long-term space missions in harsh environments. most space product assurance documents are constructed to guarantee the satisfaction of final customers and operators for satellite mission durations greater than 18 years without repair. most of them rely on well-established technologies and products and avoid the use of innovative products. the ecss-q-st-60c [2] standard defines the requirements for selection, control, procurement and usage of electronic, electrical, and electromechanical (eee) components for space projects, considering the characteristics of the space environment. once selected, parts must be integrated in the system following best design practices. the “space product assurance derating eee components” standard ecss-q-st-30-11c [14] specifies electrical derating requirements applicable to eee components. derating is a long-standing practice applied to components used on spacecraft. cots microcontrollers and core ic chips produced in nanoscale technology now integrate 1 billion transistors (below the 10 nm node) on a single chip with cache memory, i/o accesses, cpu, flash and ddr memory, all biased at low voltage (below 1 v) and accessed at increasing clock frequencies (a few ghz). derating is under the control of the designers and manufacturers of nanoscale devices: given the tremendous increase in system capability, big data management, world-wide telecommunication and the internet of things, the space industry must collaborate on, or impose, new design rules if it wants to use such innovative technologies. at another scale, new packaging and connection techniques must also be considered.
the ecss-q-st-70-08c [15] “space product assurance manual soldering of high-reliability electrical connections” is a standard defining the technical requirements and quality assurance provisions for the manufacture and verification of manually-soldered, high-reliability electrical connections. for temperatures outside a normal range (−55°c to +85°c), special design, verification and qualification testing is performed to ensure the necessary environmental survival capability. packaging and assembly reliability models must be improved too when additive manufacturing techniques and new materials for high power dissipation are used. the “commercial electrical, electronic and electromechanical (eee) components” document, named ecss-q-st-60-13c [16], applies only to commercial components which meet technical parameters that are demonstrated, at the system application level, to be unachievable with existing space components or only achievable with qualitative and quantitative penalties. all of these normative documents, such as the ecss and escc standards, are generally based on mil-std and jedec test methods. the determination of component failures and system failures has been extensively described in handbooks and tools, but all of them are now mostly obsolete with respect to the emerging technologies offered on the cots market. they are unable to predict and quantify the reliability of new products that have a short product life cycle and are complex and technically highly sophisticated.

2.2. standards and handbooks

for eee parts, the at&t reliability manual [17] is more than just a prediction methodology. although it contains component failure data, it outlines prediction models based on a decreasing hazard rate model, which is modeled using weibull data. fides [18] is a new reliability data handbook (available since january 2004).
the fides guide is a global methodology for reliability engineering in electronics, developed by a consortium of french industry under the supervision of the french dod (dga). the important fact is that the fides evaluation model proposes a reliability prediction with constant failure rates. the infant mortality and wear-out periods are currently excluded from the prediction. the iec 62380 electronic reliability prediction supports methods based on the latest european reliability prediction standard. it was originally the rdf 2000 (ute c 80810, iec-62380-tr ed.1) [19] from the cnet handbook, previously published as rdf93, and covers most of the same components as mil-hdbk-217. mil-hdbk-217 [1], reliability prediction of electronic equipment, has been the mainstay of reliability predictions for about 40 years, but it has not been updated since 1995. the siemens sn29500 [20] failure rates of components and expected values method was developed by siemens ag for use by siemens associates as a uniform basis for reliability prediction. the reliability prediction procedure for electronic equipment, telcordia sr-332 [21], recommends methods for predicting device and unit hardware reliability. this procedure is applicable to commercial electronic products whose physical design, manufacture, installation, and reliability assurance practices meet the appropriate telcordia (or equivalent) generic and product-specific requirements. in july 2006, riac released 217plus™ [22] as the successor to the dod-funded, defense technical information center (dtic)-sponsored version 1.5 of the prism® software tool. the rac electronic parts reliability data (eprd) handbook database is the same as that previously used to support mil-hdbk-217, and is supported by prism®. the models provided differ from those within mil-hdbk-217. the prism software is available from the reliability analysis center [23].
the models contain failure rate factors that account for operating periods, non-operating periods and cycling. traditional methods of reliability prediction model development have relied on the statistical analysis of empirical field failure rate data. the new riac approach is based on component models that combine additive and multiplicative model forms and predict a separate failure rate for each class of failure mechanism. a typical example of a general failure rate model that takes this form is:

$$\lambda_P = \lambda_O \pi_O + \lambda_E \pi_E + \lambda_C \pi_C + \lambda_I + \lambda_{SJ} \pi_{SJ} \tag{1}$$

where
$\lambda_P$ = predicted failure rate
$\lambda_O$ = failure rate from operational stresses
$\pi_O$ = product of failure rate multipliers for operational stresses
$\lambda_E$ = failure rate from environmental stresses
$\pi_E$ = product of failure rate multipliers for environmental stresses
$\lambda_C$ = failure rate from power or temperature cycling stresses
$\pi_C$ = product of failure rate multipliers for cycling stresses
$\lambda_I$ = failure rate from induced stresses, including electrical overstress and esd
$\lambda_{SJ}$ = failure rate from solder joints
$\pi_{SJ}$ = product of failure rate multipliers for solder joint stresses

one can note that part-count prediction assumes a “constant failure rate per part” as a linear combination (sums and products) of $\lambda$ factors and specific $\pi$ factors. the failure rate is, for a stated period of the life of an item, the ratio of the total number of failures in a sample to the cumulative time of that sample. a consistent framework for reliability qualification using the physics-of-failure (pof) concept is provided by the jedec jep148 procedure [24]. the physics-of-failure (pof) concept [25] is an approach to the design and development of reliable products that prevents failure based on knowledge of root-cause failure processes.
it is based on understanding the relationships between requirements and the physical characteristics of the product (and their variation in the production process), and the interactions of product materials with loads (stresses at application conditions) and their influence on product reliability with respect to the use conditions. 2.3. discussion reliability engineering and its mathematics have been presented many times; see for example the detailed treatment by e. suhir in his book “reliability applied probability for engineers and scientists”, mcgraw-hill [26]. reliability engineering of objects studies the properties of complex elements that do not lend themselves to any restoration (repair) and have to be replaced after the first failure. the reliability is completely due to their dependability. this property is measured by the probability that a device or a system will perform a required function under stated conditions for a stated period of time. as suhir explains, this involves three major concepts: 1. probability: the performance of a group of devices in a system is described as a failure rate. such an overall statistic has no meaning for an individual device. 2. definition of a “reliability function”: for a device, a failure is relatively easy to establish, based on guaranteed performance which can be measured. for a system, this concept is rather elusive and harder to set since it is based on customer satisfaction. 3. time: what is “time” in defining reliability? there may be many critical time periods, at component, equipment or system level, but the reliability for each critical time period can be determined in appropriate terms. the standards listed in section 1.2 generally treat items such as parts and system hardware functions on the basis of a constant failure rate, considering that the elements of interest have been manufactured and screened efficiently, operate in a given environment, and assuming that wearout failures occur well beyond the operating end of life (eol). the next sections of this paper will show how these hypotheses must be reexamined for present and future applications based on new technologies, but also on existing ones such as the deep sub-micron nanotechnologies already used for asics, fpgas or memories. the book by p. a. tobias and d. c. trindade [27], “applied reliability” (3rd edition), is an extensive and powerful document presenting the mathematics, methods and statistical software that help reliability engineers address applied industrial reliability problems. even with statistical life distribution models in hand, reliability prediction and quantification for emerging technologies is somewhat like looking into a fuzzy crystal ball. we are unable to obtain a reasonable set of data from short endurance stress tests and to extrapolate or approximate what the effect on device behavior should be at normal use conditions. what is a product likely to experience at a much lower stress, knowing its failure rate at a higher stress? the models used to bridge the stress gap are known as acceleration models, but they are constructed and grounded on some hypotheses:
- lot homogeneity and reproducibility: it is assumed that the components under stress are manufactured from a homogeneous lot, with no major change in manufacturing technology,
- stress effects are representative, homogeneous and reproducible,
- failure mechanism duplication: independent of the level of stress, and reproducible,
- the failure rate of a device is independent of time. this is the usual, but often very inappropriate, assumption in conventional reliability-prediction methods.
- linear acceleration: when every time to failure and every distribution percentile is multiplied by the same acceleration factor to obtain the projected values at another operating stress, we say we have linear acceleration [27].
- temperature effect governed by the arrhenius law: “things happen faster at high temperature”. lower temperatures may not necessarily increase reliability [10] [5], since some failure mechanisms are accelerated at lower temperature, as seen for example for hot carrier degradation mechanisms. quality standards and prediction tools generally focus only on high temperature acceleration models.
- multiplicity: multiple stresses (loads) and multiple failure mechanisms at a time (cf. discussion in sections 3 and 4).
- pof signature: activation energy determined from experiments based on catastrophic degradation or related to electrical parameter drift (a predictor).
- temperature definition: an accurate and agreed definition of temperature must be at the core of any reliability prediction tool based on thermally accelerated testing.
electronic equipment is designed considering that its reliability is affected by temperature. influence of temperature on microelectronics and system reliability, published by p. lall, m. pecht and e. b. hakim in 1997 [28], discussed various modelling methodologies for temperature acceleration of microelectronic device failures. mil-hdbk-217, fides and jedec standards have the advantage of describing such models, but are mostly not adapted to breakthrough and new immature technologies. how to quantify reliability for disruptive technologies? knowing that a) multiple failure mechanisms are in competition, and that activation energies are b) parameters determined experimentally, c) based on accelerated tests carried out at extreme temperatures (both high and low) and d) supposed to be constant but modified by stress conditions, the physics of failure (pof) methodology is the alternative approach suggested in the mid-1990s by the u.s. cadmp alliance, now known as the electronic components alliance [5].
problems arise when the failure mechanisms precipitated at accelerated stress levels are not activated in the equipment operating range, as highlighted by lall, pecht and hakim [28]. since 2010, we have defined a generalized multiple stress reliability model, and e. suhir has published a comprehensive model called the boltzmann-arrhenius-zhurkov (baz) model [7], [8], [29], [30]. the premises of this model were addressed by d. cox [9] in the journal of the royal statistical society in 1972. over the last decade, two advanced probabilistic design-for-reliability (pdfr) concepts were addressed in application to the prediction of the reliability of aerospace electronics: 1) the boltzmann-arrhenius-zhurkov (baz) model, used in combination with the exponential law of reliability, and 2) the extreme value distribution (evd) technique, which can be used to predict the number of repetitive loadings that closes the gap between the capacity (stress-free activation energy) of a material (device) and the demand (loading), thereby leading to a failure. the second concern illustrated by the previous discussion is related to multiple failure mechanisms being in competition. the monograph and papers published since 2008 by prof. j. bernstein [11], [25], [31] define quite precisely the context and the modified m-htol [12] approach, the development of which is part of the following section 2 and 3. 3. reliability mathematics and tools many books and papers define basic concepts in reliability, and particularly reliability prediction analyses such as fmeca (failure modes, effects and criticality analysis), rbd (reliability block diagram) or fault tree analysis.
in reliability engineering and reliability studies, the general convention is to deal with unreliability and unavailability values rather than reliability and availability (see for example http://www.reliabilityeducation.com/):
- the reliability $R(t)$ of a part or system is defined as the probability that the part or system remains operating from time t0 to t1, given that it was operating at t0.
- the availability $A(t)$ of a part or system is defined as the probability that the component or system is operating at time t1, given that it was operating at time t0.
- the unavailability $Q(t)$ of a part or system is defined as the probability that the component or system is not operating at time t1, given that it was operating at t0.
hence, $R(t) + F(t) = 1$, i.e. the unreliability is $F(t) = 1 - R(t)$, and $A(t) + Q(t) = 1$ (2) figure 1 shows the schematic representation of failure distribution functions. the instantaneous failure rate (ifr), also named the hazard rate $\lambda(t)$, is the ratio of the number of failures during a time period t, among the devices that were healthy at the beginning of testing (operation), to that time period: $\lambda(t) = \frac{f(t)}{1 - F(t)}$ (3)
fig. 1 instantaneous failure rate, probability density function and reliability distribution functions
the cumulative probability distribution function $F(t)$ for the probability of failure is related to the probability density distribution function $f(t)$ as $F(t) = \int_0^t f(x)\,dx$ (4) and the reliability function $R(t)$, the probability of non-failure, is defined as $R(t) = 1 - F(t)$ (5) failure rates are often expressed in terms of failure units (fits): 1 fit = 1 failure in $10^9$ device-hours. probability data obtained when performing accelerated tests (halt or foat) can be modeled by various distribution models, such as the exponential law, the weibull law, normal or log-normal distributions, etc. in most practical applications, life is a function of more than one or two variables (stress types).
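the relations between the density, the cumulative probability of failure, the reliability and the hazard rate can be checked numerically. the following is a minimal sketch, assuming a two-parameter weibull life distribution chosen only for illustration; the function names and the numerical values are hypothetical and do not come from any of the standards cited.

```python
import math

def weibull_pdf(t, beta, eta):
    """Probability density f(t) of a two-parameter Weibull law (shape beta, scale eta)."""
    return (beta / eta) * (t / eta) ** (beta - 1) * math.exp(-((t / eta) ** beta))

def weibull_cdf(t, beta, eta):
    """Cumulative probability of failure F(t), i.e. the integral of f from 0 to t."""
    return 1.0 - math.exp(-((t / eta) ** beta))

def reliability(t, beta, eta):
    """Probability of non-failure R(t) = 1 - F(t)."""
    return 1.0 - weibull_cdf(t, beta, eta)

def hazard_rate(t, beta, eta):
    """Instantaneous failure rate f(t) / (1 - F(t))."""
    return weibull_pdf(t, beta, eta) / reliability(t, beta, eta)

def to_fit(rate_per_device_hour):
    """Convert a failure rate per device-hour into FIT (failures per 1e9 device-hours)."""
    return rate_per_device_hour * 1e9

if __name__ == "__main__":
    # Hypothetical values: beta = 1 is the exponential (constant failure rate) case.
    beta, eta = 1.0, 1.0e6
    for t in (1.0e3, 1.0e5):
        print(t, reliability(t, beta, eta), to_fit(hazard_rate(t, beta, eta)))
```

with a shape parameter of 1 the hazard rate does not depend on t, which is the constant failure rate assumption discussed earlier; a shape parameter above 1 gives a hazard rate that increases with time.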
the next important question is how to consider and relate the reliability figures when stresses other than temperature, such as thermal cycling or radiation, are applied. in jedec standard jep122g, reliability models such as electromigration [32], ohmic contact degradation [33] [34], coffin-manson [35], eyring [36], humidity [37], time dependent dielectric breakdown tddb [38], hot carrier injection [39] [40] [41], hydrogen poisoning [42] [43], thermo-mechanical stress [44] and nbti [45] are generally expressed as a function of a stress parameter, or as a function of an electrical predictor, multiplying the exponential activation energy factor. talking about stress parameters (stressors) and electrical predictors may sometimes be confusing, because the first concept (stressors) indicates how high or low the gibbs free energy barrier to cross is, while the second (predictors) gives information on how fast the device will cross that barrier. any generalization of the existing models must resolve this apparent antagonism by using precise definitions and effects. in general this has not been addressed in the major published papers. the reader will see in the next paragraph how such confusion is handled. all studies consider that the activation energy is deduced experimentally as a constant with respect to temperature (low vs high), stress conditions, and other predictors, as for example charge de-trapping for hot carrier degradation or nbti for pmos devices under negative gate voltages at elevated temperature. these models are generally applicable to a given technology. some end-users and customers even focus on qualifying lot production instead of a process. there is a need to simplify the forest of existing models. is it possible to harmonize the mathematics of the existing paradigm?
the first consideration is to define precisely the elements and the role of each parameter, separating the thermodynamics (activation energy, gibbs free energy), the stressor and predictor parameters, and their effects on failure mechanisms. 3.1. reliability standards and accelerated stress models on the one hand, activation energy has been related to a single pure temperature effect, disregarding other stress parameters. on the other hand, the activation energy is defined as an effective activation energy, mostly modified by the several types of other stresses applied and by the failure mechanisms considered. in steady state temperature stress tests, temperature is considered the only stress parameter affecting reliability, and the degradation is typically time dependent and temperature related. failure mechanisms may or may not be thermally activated, and can be either catastrophic or parametric (drift of characteristics). a sudden catastrophic failure can be observed due to electrical overstress, called burnout, or due to a high electrical field inducing catastrophic breakdown. breakdown and burnout limits are also temperature dependent. as a consequence, it is reasonable to consider that the same failure mechanism can be induced by anything from a pure thermal stress to a pure electrical stress; in this case any intermediate condition between these two extremes is modeled by a pure arrhenius activation energy modified by a factor depending on the stresses applied. this postulate justifies the boltzmann-arrhenius-zhurkov (baz) model presented in section 2.2. the idealized experimental bathtub curve of a material or a device shown in figure 2 exhibits the combined effect of the statistics-related and reliability-physics-related processes. in the analysis developed by suhir [46], a probabilistic predictive model (ppm) is developed for the evaluation of the failure rates and the probabilities of non-failure.
here we give a synthesized view of how some concepts can be clarified for a comprehensive harmonization of the existing reliability models of failure mechanisms:
- internal electrical stresses, labelled stressor parameters, are responsible for the wearout failure rate (weibull shape parameter β greater than 1). there are only four types of applied and imposed stress conditions: voltage, current, dissipated power, and input signal or esd/eos/emc energies; each can be static, dynamic, transient or surge. they are quantified by the level of stress applied relative to the level causing the instantaneous burnout failure mode. for the sake of standardization and normalization, however, they are limited by the maximum values allowed by the technology.
- when a device operates under external stress (thermal management constraints, packaging and assembly constraints, atmosphere contaminants, radiation environments), the levels of such stressor parameters are modified with respect to their maximum burnout and breakdown limits, thus accelerating wearout failures compared to temperature and biasing stress in the absence of the external environment.
- failure modes of interest are electrical or mechanical signatures related to the failure mechanisms observed, and are predictor parameters. such parameters can be measured as the absolute drift of an electrical parameter or as a relative percentage of drift.
fig. 2 bathtub curve. weibull distribution with two parameters (shape and time).
- constant (random) failure rates are caused by random defects and random events. the failure rate is modeled by a weibull shape parameter close to 1, which is equivalent to an exponential distribution law.
- lot-to-lot production variation (respectively device-to-device) and performance dispersion within a single manufacturing lot (respectively device) will affect the burnout limits, inducing in return a change in the percentage of stress applied to a given lot (device).
statistical dispersion will affect the time to failure in a similar way (producing the same statistical effect). such dispersion at lot and device level will impact the remaining useful life (rul) for some part of the population.
- infant mortality failures are caused by “defects” and correlate with defect-related yield loss. they are reduced by improved manufacturing quality and by screening.
3.2. baz model and transition state theory accelerated stresses design for reliability (dfr) is a set of approaches, methods and best practices that are supposed to be used at the design stage of the product to minimize the risk that it might not meet the reliability requirements, objectives and expectations. these considerations have been the basis of the generalized baz model mentioned in section 1, constructed from the 1965 zhurkov's [47] solid-state physics model, which is a generalization of the 1889 arrhenius' [48] chemical kinetics model, which is, in its turn, a generalization of the 1886 boltzmann's (“boltzmann statistics”) [49] model in the kinetic theory of gases. the paradigm of the transition state theory (tst), developed by e. wigner in 1934 [50] and by m. evans and m. polanyi in 1938 [51], is viewed as the equivalent approach we apply to the concept of a unified semiconductor reliability model. the arrhenius equation relates the rate r of transition from a reactant in state a to a product in state b to the temperature and the activation energy, as also modeled by transition state theory.
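the arrhenius dependence of a transition rate on temperature and activation energy can be illustrated numerically. the sketch below assumes the usual form rate = prefactor · exp(−ea / kT); the function names, the prefactor and the activation energies are hypothetical values chosen for illustration, including a negative effective activation energy to show how the trend reverses.

```python
import math

K_EV = 8.617e-5  # Boltzmann constant in eV/K

def arrhenius_rate(prefactor, ea_ev, temp_k):
    """Transition rate from state A to state B for activation energy ea_ev (eV)."""
    return prefactor * math.exp(-ea_ev / (K_EV * temp_k))

def acceleration_factor(ea_ev, temp_use_k, temp_stress_k):
    """Ratio of the rate at the stress temperature to the rate at the use temperature."""
    return arrhenius_rate(1.0, ea_ev, temp_stress_k) / arrhenius_rate(1.0, ea_ev, temp_use_k)

if __name__ == "__main__":
    t_use, t_stress = 328.15, 398.15  # hypothetical: 55 degC use, 125 degC stress
    for ea in (0.7, 0.3, -0.3):       # hypothetical activation energies in eV
        print(ea, acceleration_factor(ea, t_use, t_stress))
```

a positive activation energy gives a factor above 1 (heating accelerates the mechanism), while a negative effective activation energy gives a factor below 1, meaning the mechanism is accelerated by cooling instead.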
the probability that the particular energy level u is exceeded has been expressed in boltzmann's theory of gases: ( ) (6) and a total distribution is found to be: ∫ ( ) (7) this function defines the probability p that the energy of a defect exceeds the activation energy; it can be assessed as the ratio of the time constant τ0 to the lifetime τ: $P = \frac{\tau_0}{\tau} = \exp\left(-\frac{E_a}{kT}\right)$ (8) figure 3.a shows a schematic drawing of the principle of the transition state theory, which represents the amount of free energy ΔG‡ required to allow a chemical reaction to occur from an initial state to a final state. if the chemical reaction is accelerated by a catalyst effect, the energy height ΔG‡ is reduced, allowing the transition from the initial state to the final state to occur with a lower energy barrier to cross. in transition state theory with catalyst effect it is possible to get a negative effective activation energy (shown in figure 3.b), as observed for example for the hci failure mechanism. it is observed that hot carrier injection induced effects are exaggerated at lower temperatures, demonstrating clear negative effective activation energies.
fig. 3.a transition state theory principle diagram. fig. 3.b with catalyst effect with negative ea and for hci failure mechanism.
the boltzmann-arrhenius-zhurkov (baz) model [8] determines the lifetime τ for a material or a device experiencing the combined action of an elevated temperature and external stress: $\tau = \tau_0 \exp\left(\frac{E_a - \gamma S}{kT}\right)$ (9) where s is the applied stress (any stimulus or group of stimuli, such as voltage, current, signal input, etc.), t is the absolute temperature, γ is a loading factor characterizing the role of the level of stress (the product γ · s is the stress per unit volume and is measured in the same units as the activation energy ea), and k is boltzmann's constant ($1.3807 \times 10^{-23}$ j/k or $8.6174 \times 10^{-5}$ ev/k).
the generalized baz model proceeds from the rationale that the damage process is temperature dependent, but is due primarily to the accumulation of damage resulting from loading above the threshold stress level. each level of stress is characterized by the corresponding term γ · s normalized by the term k · t, thereby defining the relationship between the elevated temperature and the energy contained in an elementary volume of the material or the active zone of a device. in recent papers, e. suhir et al. [52] [53] presented the substance of the multi-parametric baz model, in which the lifetime τ of the baz model is viewed as the mttf. the failure rate of a system described by the baz equation is then: $\lambda = \frac{1}{\tau} = \frac{1}{\tau_0} \exp\left(-\frac{E_a - \gamma S}{kT}\right)$ (10) assuming that the probability of non-failure at time t is $P = \exp(-\lambda t)$ (11) this formula is known as the exponential formula of reliability. if the probability of non-failure p is established for the given time t in operation, then the exponential formula of reliability can be used to determine the acceptable failure rate. such an assumption suggests that the mttf corresponds to the moment of time when the entropy of this law reaches its maximum value. using the famous expression due to gibbs for the entropy, which was later used by shannon to define information [54]: $H(P) = -P \ln P$ (12) we obtain that the maximum value of the entropy h(p) is equal to $e^{-1} = 0.3679$, reached at $P = e^{-1}$. with this probability of non-failure, the formula (9) yields: $t = \tau = \tau_0 \exp\left(\frac{E_a - \gamma S}{kT}\right)$ (13) comparing this result with the arrhenius equation (1), suhir concludes that the t50% or mttf expressed by this equation corresponds to the moment of time when the entropy of the time-dependent process p = p(t) is the largest.
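the chain from the baz lifetime to the entropy maximum can be checked numerically. the sketch below assumes the commonly cited baz form τ = τ0 · exp((ea − γ·s) / kT), the exponential law p = exp(−t/τ) and the entropy h(p) = −p ln p; the function names and all numerical values (τ0, ea, γ, s, temperature) are hypothetical and serve only as an illustration.

```python
import math

K_EV = 8.6174e-5  # Boltzmann constant in eV/K, value quoted in the text

def baz_lifetime(tau0, ea_ev, gamma, stress, temp_k):
    """Lifetime tau = tau0 * exp((Ea - gamma * S) / (k * T)), assumed BAZ form."""
    return tau0 * math.exp((ea_ev - gamma * stress) / (K_EV * temp_k))

def failure_rate(tau):
    """Failure rate lambda = 1 / tau when tau is read as the MTTF."""
    return 1.0 / tau

def non_failure_probability(t, tau):
    """Exponential law of reliability, P = exp(-lambda * t) with lambda = 1 / tau."""
    return math.exp(-t / tau)

def entropy(p):
    """Entropy term H(P) = -P * ln(P)."""
    return -p * math.log(p)

if __name__ == "__main__":
    # Hypothetical parameters: tau0 in hours, Ea in eV, gamma * S in eV, T in kelvin.
    tau = baz_lifetime(tau0=1.0e-3, ea_ev=0.7, gamma=0.1, stress=2.0, temp_k=350.0)
    for ratio in (0.5, 1.0, 2.0):
        p = non_failure_probability(ratio * tau, tau)
        print(ratio, p, entropy(p))
```

among the three printed times, the entropy is largest for t = τ, where p = 1/e, which is the property used in the text to identify the baz lifetime with the mttf.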
let us elaborate on the substance of the multi-parametric baz model using the example of a product subjected to the combined action of multiple stressors si (electrical stresses such as dc biasing current, voltage, power dissipation or dynamic input signal). let us assume that the wearout failure rate wf(t) of an electronic product, which characterizes the propensity of a material or a device to fail, is determined during testing or operation by the relative drift of an electrical predictor parameter p taken as the electrical signature of the failure mode of concern [55]. considering equation (10), one could then seek the probability of non-failure of the material or the device in the form: [ ( ) ( ∑ )] (14) where p0 is the value of the predictor parameter at time = 0, and the predictor sensitivity factor and the γi values reflect respectively the sensitivities of the device to the corresponding predictor and stressors. the model can be easily made multi-parametric, i.e. generalized for as many stimuli as necessary [55]. the sensitivity factors must be determined experimentally. because of that, the structure of the multi-parametric baz model expressed by equation (14) should not be interpreted as a superposition of the effects of different stressors, but rather as a convenient and physically meaningful representation of the foat data. under such conditions, the suggested approach is to determine the γ factors reflecting the sensitivities of the device to the corresponding stimuli (stressors). this will be detailed when considering the baz model derived from the transition state theory in the following section on multi-dimensional reliability models. one notes that equation (14) can be viewed as a cox proportional hazards model [9].
survival models consist of two parts: the underlying hazard function, denoted λ0(t), describing how the risk of event per time unit changes over time at baseline levels of covariates; and the effect parameters, describing how the hazard varies in response to explanatory covariates. the hazard function for the cox proportional hazards model has the form: $\lambda(t \mid X_i) = \lambda_0(t) \exp(\beta \cdot X_i)$ (15) this expression gives the hazard rate at time t for subject i with covariate vector (explanatory variables) xi, β being the vector of effect parameters. that said, the cox model shows one limitation as a reliability analysis method: for a sound part at time t, the failure probability during the interval [t, t+dt] is related to the stress applied during this period dt, but does not take into account the history of stresses applied before t. this may be a limitation when modeling a stress that is not constant over time (e.g. a step stress test). the proportional hazards (ph cox) model can be generalized (gph) by assuming that at any moment the ratio of hazard rates depends not only on the values of the covariates but also on the resources used until this moment. the application of the pdfr concept, and particularly the multi-parametric baz model, enables one to improve dramatically the state of the art in the field of microelectronic product reliability prediction and assurance. 4. multi-dimensional reliability models as seen in section 1 and 2, existing quality standards consider stress tests and related pof mechanisms without entanglements. device failure rates are seen as a sum of each existing failure rate taken individually. the bathtub curve is an idealized view of the instantaneous failure rate scenario generally considered in the well-known mil, jedec or telcordia standards. the multidimensional variable addressed by the boltzmann-arrhenius-zhurkov (baz) reliability model and the multi mechanism model htol (high temperature operating lifetest) proposed by j.
bernstein are discussed now with the intent to generalize how their implementation can provide an easy-to-use way to quantify and predict the probability of failure of new products and technologies. 4.1. multiple stressors and predictors the baseline of the model relies on concepts from the transition state theory: the health of a population of devices must evolve with time and with the stresses applied. the first concept is that a device, or a homogeneous lot consisting of a population of “identical” devices, must fail after an observed time due to aging, either under operation or under storage conditions. the statistics of this behavior relate to the entropy evolution of such a population. the transformation from a sound item to a failed one is similar to what is described in the transition state theory, where a system of reactants combines into a new system of products when energy is provided to the system. stressor definition and normalization in a similar way, a population of devices submitted to heating alone will only degrade continuously up to malfunction and failure. but when high (or low) temperature and adequate stressors are superposed, the time-to-failure of such a population is reduced. the term “stressors” is defined here as the electrical factors applied to the device of concern. stressors are all limited by technology boundaries defined by the burnout values of each related electrical parameter (breakdown voltage, current overstress and burnout, power burnout, input signal overstress). these stressors can be normalized with respect to their burnout limits, and strains are weighted as a percentage of the breakdown limits. the main hypotheses, verified by experiments on electronic devices and populations of similar devices, are: i.
the physical instantaneous degradation phenomena due to electrical stress above the limits are observed at any temperature and depend on the active zone temperature of the device under test (sze, s. m. [56]). ii. the relative drift of a predictor parameter is a function of time (for example a square-root law for diffusion mechanisms) and relates to a failure mechanism activated by temperature and biasing. iii. for a biasing set high and close to the breakdown limit, the two failure mechanisms (the diffusion and the instantaneous catastrophic ones) are in competition and occur simultaneously; for the sake of simplicity it is assumed that they are progressively and linearly combined, from a pure diffusion mechanism at nominal biasing to a pure burnout at high bias (voltage, current or power dissipation). this last hypothesis is the foundation of the baz model: the stressor acts like a catalyst able to modify the height of the barrier of the pure temperature failure mechanism (arrhenius thermally activated), which quantifies the effect of biasing on the barrier properties. the predictor parameter is then the sensitive tool we can use to measure this barrier height under various temperature and bias conditions. for unit homogeneity, the stressor is multiplied by a constant factor γ to be determined by experiments, and the term γ · s is in ev. indeed, the γ coefficients can be easily determined because of hypothesis iii) above: as shown in figure 3, when the bias is high enough to reach the instantaneous catastrophic failure, the apparent height of the barrier is reduced to zero and we can verify: $E_a - \gamma \cdot S_{burnout} = 0$ (16) this major principle is called the failure equivalence (fe) principle. because ea (pure thermal effect) is assumed to be a constant, and considering that the burnout limit is temperature dependent and potentially distributed (gaussian distribution), the γ factor should also reflect temperature dependence and have the same gaussian-like distribution.
the present paper will not consider this extension, and the γ factor is supposed to be a constant as a first approach. predictor definition as mentioned previously, an electrical predictor parameter p is defined as the electrical signature (failure mode) of a failure mechanism of interest. such a parameter is normalized with respect to its initial value at time zero. similarly to the stressor context, we can define an equivalent energy using a prefactor, as outlined in equation (14). figure 4 is a schematic drawing showing how the fe principle applies and how predictors and stressors take place in the baz model highlighted by the transition state theory. all vertical axes are transformed into energy units.
fig. 4 predictor p and stressor s for baz model and transition state theory
the predictor relative drift shown is an example of actual measurements performed on microwave transistors submitted to steady state aging testing [57]. the predictor of each single device is normalized with respect to its initial measurement (mean value), and the failure criterion was a drift of 20%. the drawing is therefore set so as to consider as failed all devices with a drift greater than 20%. 4.2. baz model simplification and applicability as observed in section ii, the baz model is a generalization of the well-known arrhenius equation, modified by commonly accepted industrial models such as eyring. as presented in ref [29], all failure mechanism models detailed in jedec jep122 can be rearranged in the following form ( ) (17) where g(s) is a function of the stressor parameter, always expressed in one of two generalized forms: [ ] (18.a) or (18.b) where m and p = 1 or -1 is a power law factor. applying the normalization process for each stressor si with respect to its burnout limit or electrical parameter limits, we set: (19.a) ( ) ( ) (19.a) from these equations, it is assumed that xi and xj vary from 0, when no electrical stress is applied, to 1, when the maximum electrical stress induces an instantaneous failure at any given temperature. the value of the stressor burnout is considered, in a first approximation, not temperature dependent. this can be reformulated when the model is refined to take this dependence into account. merging equations (17 to 19), it is easy to express the general equation of the failure rate as: ( ) (20) with the effective activation energy in the form [13]: ( ∑ [ ( )] ) (21) expression 21 is based on the assumption that the stressors are temperature independent and applied simultaneously, and are therefore simply added under a linear approximation. the stressors are considered independent, and they aggregate up to a value which compensates exactly the “pure” arrhenius activation energy, leading to an instantaneous burnout (see figure 3.a for clarification): consequently the principle of superposition cannot be invoked in this case; rather, it is a principle of aggregation and compensation. the stressors defined above are considered through literature experiments and accumulated data. of course, any other type of stressor can be easily introduced in lieu of or together with the listed stressors, provided it is relevant in the considered model. the proposed reliability methodology is agile and consists of measuring the true burnout or breakdown limits (including the lot dispersion mean and standard deviation), or some physical limit as for hci, in order to normalize a new stress parameter with respect to its limit and to include it in the equation 13. 4.3. multiple failure mechanisms (m-tol) the key novelty of the multiple-temperature operational life (m-tol) testing method proposed by j.
bernstein [58], is its success in separating different failure mechanisms in devices in such a way that actual reliability predictions can be made for any user defined operating conditions. this is opposed to the common approach for assessing device reliability today, high temperature operating life (htol) testing [59], which is based on the assumption that just one dominant failure mechanism is acting on the device [31]. however, it is known that multiple failure mechanisms act on the device simultaneously [25]. the new m-tol method predicts the reliability of electronic components by combining the failure in time (fit) of multiple failure mechanisms [60]. degradation curves are generated for the components exposed to accelerated testing at several different temperatures and core stress voltages. the data clearly reveal that different failure mechanisms act on the components in different regimes of operation, causing different mechanisms to dominate depending on the stress and the particular technology. a linear matrix solution, as presented in [60], allows the failure rate of each separate mechanism to be combined linearly to calculate the actual reliability of the system, measured in fit, based on the physics of degradation at specific operating conditions. an experimental evaluation of the m-tol method on both 45 and 28 nm fpga devices from xilinx, processed at tsmc (according to the xilinx data sheets), is under way within a project supported by the research institute of technology irt saint exupery, toulouse (france). the fpgas are tested over a range of voltages, temperatures and frequencies, and the test program is conducted by j. bernstein, ariel university, ariel (israel). the frequencies of multiple asynchronous ring oscillators in a single fpga were read and recorded simultaneously during stress.
hundreds of oscillators and the corresponding frequency counters were burned into a single fpga to allow monitoring of statistical information in real time. since the frequency itself monitors the device degradation, there is no recovery effect whatsoever, giving a true measure for the effects of all the failure mechanisms measured in real time. the common intrinsic failure mechanisms affecting electronic devices are, hot carrier injection (hci), bias temperature instability (bti), electromigration (em) and time dependent dielectric breakdown (tddb). tddb will not be discussed in this paper since it was never observed in our test results. the standard models for failure mechanisms in semiconductor devices are classified by jedec solid state technology association and listed in publication jep-122g. the failure mechanisms can be separated due to the difference of physical nature of each individual mechanism. the theory of using fpgas as the evaluation vehicle for our m-tol verification utilizes the fact that this chip is built with the basic cmos standard cells that would be found in any digital process using the same technology. the system runs hundreds of internal oscillators at several different frequencies asynchronously, allowing independent measurements across the chip and the separation of current versus voltage induced degradation effects. when degradation occurred in the fpga, a decrease in performance and frequency of the ro could be observed and attributed to either increase in resistance or change in threshold voltage for the transistors. the test conditions were predefined for allowing separation and characterization of the relative contributions of the various failure mechanisms by controlling voltage, temperature and frequency. extreme core voltages and environmental temperatures, beyond the specifications, were imposed to cause failure acceleration of individual mechanisms to dominate others at each condition, e.g. 
sub-zero temperatures, at very high operating voltages, to exaggerate hci. the acceleration conditions for each failure mechanism allowed us to examine the specific effect of voltage and temperature versus frequency on that particular mechanism at the system level, and thus to define its unique physical characteristics even from a finished product. finally, after completing the tests, some of the experiments with different frequency, voltage and temperature conditions were chosen to construct the m-tol matrix. the results of our experiments give both ea and γ for the three mechanisms we studied at temperatures ranging from -50 to 150°c. the eyring model [36] is utilized here to describe the failure in time (fit) for all of the failure mechanisms. the specific ttf of each failure mechanism follows these formulae: (22) (23) (24) the correct activation energy and the corresponding voltage factor were determined simultaneously. the procedure was followed for all three mechanisms, for the 45 nm as well as the 28 nm devices. the ea and γ values found for the 45 nm devices are summarized in table 1.
table 1 summary of ea and γ for fpga 45 nm.
mechanism    ea (ev)    γ
hci          -0.37      22.7
bti          0.52       3.8 v^-1
em           1.24       3.8
as presented by regis, d. et al. [61], the impact of scaling on the reliability of integrated circuits is a current concern. it is particularly necessary to focus on three basics of safety analyses for aeronautical systems: failure rates, lifetimes and susceptibility to atmospheric radiation. the deep sub-micron technologies need to be modeled in terms of robustness and reliability, because the increase in failure rate, the reduction in useful life and the increased vulnerability to high energy particles are the most critical concerns in terms of safety.
when considering the well-documented failure mechanisms related to the die only, two families can be defined: those related to the so-called front end of line (feol), i.e. at transistor level, and those occurring in the back end of line (beol), mainly in the metallization. as illustrated in figure 5 (extracted from [61]), ics are affected by different degradation mechanisms during their useful life. these degradation mechanisms can shift the properties of electronic devices and thereby affect circuit performance. due to the exponential nature of the acceleration factor (see equations 22 to 24) as a function of voltage, frequency (equivalent to current) or temperature, it is mandatory to consider at least three mechanisms, all accelerated and in competition with each other.
fig. 5 wear-out phenomena localization (65 nm ic cross section) (from [61]).
the paper by j. bernstein [12] offers a new reliability point of view and is summarized below. the proposed m-tol approach considers multiple failure mechanisms in competition and assumes non-equal failure probabilities at use conditions in order to describe and determine the correct proportionality. the basic method for solving the system of equations is described in another paper by j. bernstein [62], and uses the sum-of-failure-rates method suggested in jedec standard jep122g. it is clear that the manufacturers of electronic components recognize the importance of combining failure mechanisms in a sum-of-failure-rates method. each mechanism "competes" with the others to cause an eventual failure. when more than one mechanism exists in a system, the relative acceleration of each one must be defined and averaged under the applied condition.
every potential failure mechanism should be identified, and its unique af should then be calculated at the given temperature and voltage, so that the fit rate can be approximated for each mechanism separately. then, the final fit is the sum of the failure rates per mechanism, as described by:
$\mathrm{fit} = \sum_i \mathrm{fit}_i$ (25)
where each mechanism leads to an expected failure unit per mechanism, fiti. thus, we describe here the prediction of a system reliability using a linear matrix solution. although, until today, we have only verified the methodology on verifiable microelectronic device failure mechanisms, the methodology will apply directly to additional mechanisms, including thermal and mechanical stresses due to wafer bonding and any failure mechanism that can be modelled by physics of failure, including wide bandgap semiconductors and even packaging failures. whereas each intrinsic mechanism is known to have a different statistical distribution, the combination of distributions becomes, at the ensemble level, an approximately constant rate, as demonstrated by r.f. drenick [63]. in his theorem, drenick suggests and justifies the summation-of-failure-rates approach, as also explained in the jedec handbook. the mechanism matrix is described in table 2. each row of the matrix describes the operating conditions under which the system is tested. each experiment, i, is operated with its unique voltage, frequency and temperature. the "results" column, fiti, is derived from the average time at which failure occurs under the experimental condition, where failure is associated with a pre-determined failure point. the example studied uses 10% performance degradation as the failure point; however, any reasonable value will work as long as it is consistent with the application. the result fiti is a failure rate (λ), measured as 10^9/mttf.
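as a numerical illustration of the sum-of-failure-rates rule and of the 10^9/mttf conversion, the following python sketch converts per-mechanism mean times to failure into fit and adds them. the mttf values and the function names are hypothetical, chosen only for illustration; they are not measured data from this study.

```python
def fit_from_mttf(mttf_hours):
    """Failure rate in FIT (failures per 1e9 device-hours) for an MTTF in hours."""
    if mttf_hours <= 0:
        raise ValueError("MTTF must be positive")
    return 1e9 / mttf_hours


def total_fit(fit_per_mechanism):
    """Sum-of-failure-rates: the system FIT is the sum of the per-mechanism FITs."""
    return sum(fit_per_mechanism.values())


# Hypothetical per-mechanism MTTF values in hours (illustrative assumptions only).
mttf_hours = {"hci": 2.0e7, "bti": 5.0e6, "em": 1.0e8}

fits = {name: fit_from_mttf(hours) for name, hours in mttf_hours.items()}
system_fit = total_fit(fits)

for name, value in fits.items():
    print(f"{name}: {value:.1f} FIT")
print(f"system: {system_fit:.1f} FIT")
```

with these assumed inputs the mechanism with the shortest mttf (bti) contributes most of the system fit, which is the sense in which the mechanisms compete.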
table 2 m-tol matrix used to solve models with measured times to fail [12]
condition     hci     bti     em      results
v1, f1, t1    x·a1    y·b1    z·c1    fit1
v2, f2, t2    x·a2    y·b2    z·c2    fit2
v3, f3, t3    x·a3    y·b3    z·c3    fit3
we assume that each mechanism (a–c) affects the system linearly with its own acceleration factor (af) for a given frequency. the acceleration factor formulas are in table 3. each equation is calculated with the experimental condition of each result on the right hand side.
table 3 the equations for the acceleration factors matrix [12]
hot carrier injection: ai ≡ afhci = ( ) ( )
negative bias temperature instability: bi ≡ afnbti = ( ) ( )
electromigration: ci ≡ afem = ( ) ( )
then the matrix is solved to find a set of constants, pi, shown here as x–z, across the whole matrix that matches the experimental results with the calculated acceleration factors. this linear system is solved by multiplying the inverse matrix, af^-1, with λ at each condition, as shown in table 4. the solution gives the coefficients (x–z), which make up the relative contribution of each failure mechanism to the system.
table 4 matrix solution [12].
$(af) \cdot (p_i) = (\lambda): \quad \begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} \lambda_1 \\ \lambda_2 \\ \lambda_3 \end{bmatrix} \;\Rightarrow\; (p_i) = (af)^{-1} \cdot (\lambda)$
knowledge of these coefficients allows prediction of the mttf or the fit for other operating conditions that were not tested, and gives an accurate prediction of the reliability of the device under different conditions. this matrix has then been used to construct the full reliability profile, whereby fit is calculated versus temperature for several conditions for the fpga 45 nm process, as shown in figure 6. the 45 nm technology shows frequency-related effects both at low temperatures (below 5°c), due to hci, and at high temperatures. it is observed that the high voltage bias (1.2 v) enhances the effect of frequency, which reduces the overall hci contribution at low frequency.
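the matrix solution of table 4 can be sketched in python as follows. this is a minimal illustration under stated assumptions: the acceleration factors, the measured failure rates and the helper name solve_linear_system are hypothetical and are not taken from [12] or from the fpga experiments. a plain gauss-jordan elimination is used here in place of an explicit matrix inverse, which gives the same coefficients.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * p = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    # Work on an augmented copy so the caller's data is left untouched.
    aug = [[float(v) for v in row] + [float(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("acceleration-factor matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


# Hypothetical acceleration factors: one row per experiment (V_i, f_i, T_i),
# columns ordered as (a_i, b_i, c_i) for HCI, BTI and EM.
af = [
    [10.0, 1.0, 1.0],
    [1.0, 10.0, 1.0],
    [1.0, 1.0, 10.0],
]
# Hypothetical measured failure rates (FIT) for the same three experiments.
lam = [28.0, 37.0, 55.0]

x, y, z = solve_linear_system(af, lam)

# Prediction at an untested condition, given its (hypothetical) acceleration factors.
af_new = [0.5, 2.0, 0.1]
fit_new = x * af_new[0] + y * af_new[1] + z * af_new[2]
print(x, y, z, fit_new)
```

in this sketch each experiment is dominated by a different mechanism, which keeps the matrix well conditioned; with real data, the conditions of table 2 would need to be chosen with the same aim.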
the dominant failure mechanism at medium ambient temperature (range from 10°c to 150°c) is related to nbti, while the em failure mechanism is rather observed at high temperature.
fig. 6 reliability curves for 45nm technology showing fit versus temperature for voltages above and below nominal (1.2v) and frequencies from 10 mhz (dashed line) to 2ghz (solid line).
[plot: fit (logarithmic axis, 0.01 to 10^7) versus temperature t (°c), -50 to 150; curves for v = 0.8 v, f = 1.0 ghz; v = 1.0 v, f = 1.5 ghz; v = 1.2 v, f = 2.0 ghz; v = 0.8 v, f = 0.01 ghz; v = 1.0 v, f = 0.01 ghz; v = 1.2 v, f = 0.01 ghz; regions labelled hci, bti, em]
how to disentangle reliability models for more than moore microelectronics based on nanotechnologies? an innovative and practical way is to use the various physics of failure equations together with accelerated testing for reliability prediction of devices exhibiting multiple failure mechanisms. we presented an integrated accelerating and measuring platform to be implemented inside fpga chips, making the m-tol testing methodology more accurate and allowing these tests at the chip and system level, rather than only at the transistor level. the calibration of physics models with highly accelerated testing of complete commercial devices allows physical reliability prediction. the m-tol matrix can provide information about the proportional effect of each failure mechanism in competition, and offers an easy and simple tool to extrapolate the expected reliability of the device under various conditions. this practical platform can be implemented on almost any fpga device and technology to enable fit calculations and reliability predictions. the results of this approach provide the basis for improvements in performance and reliability given any design or application.
this method can be extended to other processes and new technologies, and can include more failure mechanisms, thus producing a more complete view of the system's reliability. the baz model together with the m-tol methodology has been combined in a general multi-dimensional tool named m-storm (multi-physics multi-stressors predictive reliability model) [13], which can be applied to concrete situations in deep-submicron process devices but also to any other disruptive microelectronic technology.
5. conclusion
to this day, the users of our most sophisticated electronic systems, which include optoelectronic, photonic and mems devices, gan power devices, asics and deep-sub-micron technologies, are expected to rely on a simple reliability value (fit) published by the supplier. the fit is determined today in the product qualification process by use of htol or another standardized test, depending on the product. the manufacturer reports a zero-failure result from the given conditions of the single-point test and uses a single-mechanism model to fit an expected mttf at the operator's use conditions. the zero-failure qualification is well known as a very expensive exercise that provides nearly no useful information. as a result, designers often rely on halt testing and on handbooks such as fides, telcordia or mil-hdbk-217 to estimate the failure rate of their products, knowing full well that these approaches act as guidelines rather than as a reliable prediction tool. furthermore, with zero failure required for the “pass” criterion, as well as the poor correlation of expensive htol data to test and field failures, there is no channel for designers to utilize this knowledge in order to build in reliability or to trade it off with performance. prediction is not really the goal of these tests; however, current practice is to assign an expected failure rate, fit, based only on this test, even if the presumed acceleration factor is not correct.
we presented, in this paper, a simple approach to predictive reliability assessment using the common language of failure in time or failure unit (fit). we examined the goal of finding the mtbf and evaluated the merits of various approaches to reliability prediction. our goal is to predict reliability based on the system environment, including space, military and commercial. it is our intent to show that the era of confidence in reliability prediction has arrived and that we can make reasonable reliability predictions from qualification testing at the system level. our research will demonstrate the utilization of physics of failure models in conjunction with qualification testing, using our multiple-temperature operating life (m-tol) matrix solution, to make cost-effective reliability predictions that are meaningful and based on the system operating conditions. the baz model together with the m-tol methodology has been combined in a general multi-dimensional tool named m-storm (multi-physics multi-stressors predictive reliability model) applicable to disruptive microelectronic technologies.
acknowledgement: the paper is a part of the research done at irt saint exupery, toulouse, france. the study was conducted in the frame of electronic robustness contract project irt-008 sponsored by the following funding partners: agence nationale de la recherche, airbus operations sas, airbus group innovation, continental automotive france, thales alenia space france, thales avionics, laboratoire d'analyse et d'architecture des systèmes, centre national de la recherche scientifique (laas-cnrs), safran labinal power systems, bordeaux university, institut national polytechnique bordeaux (ims, umr 5218), and hirex engineering. i would like to particularly thank professor joseph b. bernstein, from ariel university, ariel (israel), for the major comments and deep discussions we had during the manuscript preparation.
references [1] dod, "mil-hdbk-217, military handbook for reliability prediction of electronic equipement," washington, dc, usa, 1991, december. [2] european space comp. information exchanges systems, "ecss-q-st-60c rev2 space product assurance, electrical, electronic and electromecanical (eee) components.," ecss secretariat-esa-estec-component, material and processes related ecss and esa pss standards; requirements & standards divisionnoordwijk, the netherlands, 21 october 2013. [online]. available: https://escies.org. [accessed april 2016]. [3] standard, "jedec jep-122g failure mechanisms and models for semiconductor devices," jedec solid state technology association, arlington, 2011. [4] m. pecht, "a prognostics and health management roadmap for information and electronics-rich systems," ieice fundamentals review, vol. 3, no. 4, p. 25, 2010. [5] p. lall, r. lowe and k. goebel, "prognostic health monitoring for a micro-coil spring interconnect subjected to drop impacts," in proc. of the 2013 ieee conference on prognostics and health management (phm), 2013. [6] l. boltzmann, "further investigations on the thermal equilibrium of gas molecules," in proceeding of the imperial academy of science, vienna, vol. ii, no. 76, p. 428, 1872. [7] e. suhir, "probabilistic design for reliability," chipscale rev., vol. 14, no. 6, 2010. [8] e. suhir, "predicted reliability of aerospace electronics: application of two advanced concepts," in proc. of the ieee aerospace conference, 2-9 march 2013. [9] d. cox, "regression models and life-tables," journal of the royal statistical society. series b (methodological), vol. 34, no. 2. (1972), pp., vol. 34, no. 2, pp. 187-220, 1972. [10] j. w. mcpherson, reliability physics and engineering time-to-failure modeling; 2nd edition, plano (tx) usa: springer , 2013. [11] m. white and j. b. 
bernstein, "microelectronics reliability: physics-of-failure based modeling and lifetime evaluation," jpl publication 08-5, jet propulsion laboratory/california institute of technology, february 2008. [12] j. b. bernstein, m. gabbay and o. delly, "reliability matrix solution to multiple mechanism prediction," microelectronics reliability journal, vol. 54, pp. 2951-2955, 2014. microelectronic reliability models for more than moore nanotechnology products 23 [13] a. bensoussan, "m-storm: multi-physics multi-stressors predictive reliability model," microelectronic reliability journal, 2016 (to be published). [14] n. t. n. esa-estec secretariat, "ecss-q-st-30-11c : space product assurance derating-eee components, rev 1, october 4th 2011," [online]. [15] esa-estec, ecss-q-st-70-08c, space product assurance manual soldering of high-reliability electrical connections, noordwijk, the netherlands. [16] esa-estec secretariat, noordwijk, the netherlands, "ecss-q-st-60-13c: “commercial electrical, electronic and electromechanical (eee) components”," 21 october 2013. [online]. available: www.ecss.nl. [accessed 2016]. [17] klinger, d.j., yoshinao nakada and m. a. mendez, at&t reliability manual, van nostrand reinhold, 1990. [18] fides_guide, "reliability methodology for electronic systems, dga," 2004. [19] "rdf 2000 (ute c 80-810, iec-62380-tr ed.1)," [online]. available: http://www.ute-fr.com/lanormalisation/ute-and-standardisation. [20] siemens ag, sn29500, reliability and quality specifications failure rates of components, siemens technical liaison and standardisation, 1986. [21] bellcore, sr-332, reliability prediction procedure for electronic equipment, bellcore telcordia, 2011. [22] d. nicholls, "what is 217plus (tm) and where did it come from?," in proc. of the annual reliability and maintainability symposium , orlando, fl, 2007. [23] "prism now 217plus (tm) riac the reliability analysis center," [online]. available: http://www.theriac.org . 
[24] jep148b, "reliability qualification of semiconductor devices based on physics of failure risk and opportunity assessment," jedec solid state technology association, arlington, va, december 2004. [25] j. b. bernstein, s. salemi, l. yang, j. dai and j. qin, physics-of-failure based handbook of microelectronic systems, utica, ny: reliability information analysis center, 2008. [26] e. suhir, applied probability for engineers and scientists, new york: mcgraw-hill, 1997. [27] p. a. tobias and d. c. trindade, applied reliability (3rd ed.), boca raton, fl: crc press, 2012. [28] p. lall, m. pecht and e. b. hakim, influence of temperature on microelectronics and system reliability, boca raton, ny: crc press, 1997. [29] a. bensoussan and e. suhir, "design-for-reliability (dfr) of aerospace electronics: attributes and challenges," ieee aerospace conf., 2-9 march 2013. [30] a. bensoussan, "how to quantify and predict long term multiple stress operation: application to normally-off power gan transistor technologies," microelectronics reliability journal, special issue on reliability in power electronics, vol. 58, pp. 103-112, march 2016. [31] j. b. bernstein, reliability prediction from burn-in data fit to reliability models, london: elsevier ap, 2014. [32] j. black, "electromigration a brief survey and some recent results," ieee, trans. electron devices, vols. ed-16, p. 388, 1969. [33] m. yoder, "ohmic contacts in gaas," solid state el., vol. 23, pp. 117-119, 1980. [34] c. lee, b. welch and w. fleming, "reliability of auge/pt and auge/ni ohmic contacts on gaas," electronics letters, vol. 17, pp. 407-408, 1981. [35] h. cui, "h. cui, “accelerated temperature cycle test and coffin-manson model for electric packaging," ieee trans. rams, pp. 556-560, 2005. [36] h. eyring, s. lin and s. lin, basic chemical kinetics, new york chichester brisbane toronto: john willey & sons, 1980. [37] d. peck, "comprehensive model for humidity testing correlation," in proc. 
of the ieee 23rd international reliability physics symp. (irps), anahiem, 1986. [38] i. chen, s. holland and c. hu, "a quantitative physical model for time-dependent breakdown in sio2," in proc. of the ieee 23rd international reliability physics symp. (irps), orlando, 1985. 24 a. bensoussan [39] m. dai, c. gao, k. yap, y. shan, z. cao, k. liao, l. wang, b. cheng and s. liu, "a model with temperature-dependent exponent for hot-carrier injection in high-voltage nmosfets involving hot-hole injection and dispersion," ieee trans. electron devices, vol. 55, pp. 1255-1258, 2008. [40] e. takeda, y. nakagome, h. kume and s. asai, "new hot-carrier injection and device degradation in submicron mosfet‟s," iee proc, vol. 130, no. 3, pp. 144-149, 1983. [41] e. takeda, h. kume, t. toyabe and s. asai, "submicron mosfet structure for minimizing channel hotelectron injection," in proc. of the symposium on vlsi tech., 1981. [42] k. decker, "gaas mmic hydrogen degradation study," in gaas reliability workshop, philadelphia, pa, usa, 1994. [43] m. delaney, t. wiltsey, m. chiang and k. yu, "reliability of 0.25μm gaas mesfet mmic process: results of accelerated lifetests and hydrogen exposure," in gaas reliability workshop, philadelphia, pa, usa, 1994. [44] m. ciappa, f. carbognani and w. fichtner, "lifetime modeling of thermomechanics-related failure mechanisms in high power igbt modules for traction applications," in proc. of the ieee 15th int. symp. power semicond. devices ics, pp. 295-298, 2003. [45] d. schroder and j. babcock, "negative bias temperature instability: road to cross in deep submicron silicon semiconductor manufacturing," journal of applied physics, vol. 94, no. 1, pp. 1-18, 2003. [46] e. suhir, "statistics-related and reliability-physics-related failure processes in electronics devices and products," modern physics letters b, vol. 28, no. 13, 2014. [47] s. zhurkov, "kinetic concept of the strength of solids," int. j. of fracture mechanics, vol. 1, no. 4, 1965. [48] s. 
arrhenius, "ueber den einfluss des atmosphärischen kohlensäurengehalts auf die temperatur der erdoberfläche," in proceedings of the royal swedish academy of science, vol. 22, no. 1, 1896. [49] l. boltzmann, "the second law of thermodynamics. populare schriften," in essay 3, address to a formal meeting of the imperial academy of science,, 1886. [50] e. wigner, "the transition state method," faraday society (london) trans, vol. 34, pp. 29-41, 1938. [51] m. evans and m. polanyi, "inertia and driving force of chemical reaction," faraday society (london) trans, vol. 34, pp. 11-29, 1938. [52] e. suhir, a. bensoussan, g. khatibi and j. nicolics, "probabilistic design for reliability in electronics and photonics: role, significance, attributes, challenges.," in proc. of the international reliability physics symposium, monterey, ca, 2015. [53] e. suhir, r. mahajan, a. lucero and l. bechou, "probabilistic design-for-reliability concept and novel approach to qualification testing of aerospace electronic products," in aerospace conf., 2012 ieee, big sky, mt, march 2012. [54] c. shannon, "a mathematical theory of communication," bell system technical journal 27 (3): 379– 423, vol. 27, no. 3, pp. 379-423, 1948. [55] e. suhir and a. bensoussan, "quantified reliability of aerospace optoelectronics," in sae international j. aerosp. 7(1), cincinnati, 2014. [56] s. sze, physics of semiconductor devices, new york: john wiley and sons, 1981. [57] a. bensoussan, p. coval, w. roesch and t. rubalcava, "reliability of a gaas mmic process based on 0.5µm au/pd/ti gate mesfets," in proc. of the 32nd annual proceeding reliability physics, irps, san jose (ca), april 1994. [58] j. bernstein, a. bensoussan and e. bender, "reliability prediction with mtol," to be published in microelectronics reliability journal, elsevier, 2016. [59] xilinx, "reliability report," xilinx ug116 (v10.4), 1st april 2016. [60] j. bernstein, "reliability prediction for aerospace electronics," in proc. 
of the ieee aerospace conference, big sky (mt), 2015. [61] d. regis, j. berthon and m. gatti, "dsm reliability concerns impact on safety assessment," in sae 2014 aerospace systems and technology conference, cincinnati, oh, september 2014. [62] j. bernstein, m. gurfinkel, x. li, j. walters and y. shapira, "electronic circuit reliability modeling," microelectronics reliability j., vol. 46, pp. 1957-1979, 2006. [63] r. drenick, "mathematical aspects of the reliability problem," journal of the society for industrial and applied mathematics, vol. 8, no. 1, pp. 125-149, 1960. [64] b. agarwala et al., "dependence of electromigration-induced failure time on length and width of aluminium thin-film conductors," j. appl. phys., vol. 41, p. 3954, 1970. [65] n. 1. r. hdbk, "nswc-10 reliability hdbk jan2010 45818/," 2010. [online]. available: http://everyspec.com/usn/nswc/nswc-10_reliability_hdbk_jan2010_45818/. [accessed 2016]. [66] j. evans, p. lall and r. bauernschub, "a framework for reliability modeling of electronics," in proc. of the reliability and maintainability symposium, 1995.

facta universitatis series: electronics and energetics vol. 31, no. 2, june 2018, pp. 279-285 https://doi.org/10.2298/fuee1802279m
design of novel efficient full adder circuit for quantum-dot cellular automata technology
dariush mokhtari 1, abdalhossein rezai 1, hamid rashidi 1, faranak rabiei 2, saeid emadi 3, asghar karimi 1
1 acecr institute of higher education, isfahan branch, isfahan, iran 2 institute for mathematical research, universiti putra malaysia, malaysia 3 ipe manager school, france
abstract. in this paper, novel coplanar circuits for full adder implementation in quantum-dot cellular automata (qca) technology are presented.
we propose a novel one-bit full adder circuit and then utilize this new circuit to implement a novel four-bit ripple carry adder (rca) circuit in the qca technology. the qcadesigner tool version 2.0.1 is utilized to implement the designed qca full adder circuits. the implementation results show that the designed qca full adder circuits offer improvements over other qca full adder circuits.
key words: full adder, quantum-dot cellular automata, coplanar circuit, ripple carry adder, high-performance design
received june 6, 2017; received in revised form november 11, 2017 corresponding author: abdalhossein rezai acecr institute of higher education, isfahan branch, isfahan, 84175-443, iran (e-mail: rezaie@acecr.ac.ir)
1. introduction
computer arithmetic plays an important role in information and communication applications such as cryptography and alus [1-3]. full adders have an important role in computer arithmetic, so the efficiency of many computer arithmetic applications is primarily determined by the efficiency of the full adder implementation [1-3]. on the other hand, quantum-dot cellular automata (qca) technology is a promising technology which can continue the development described by moore’s law. this technology uses charge configuration, instead of current, to transfer information. as a result, circuit design in the qca technology has advantages in comparison with conventional technologies such as cmos technology in terms of small dimension, fast operation and low power consumption [4, 5]. recently, several efforts have been made to improve the efficiency of the full adder implementation in the qca technology [6-15]. hänninen and takala [6] presented a qca full adder that requires 102 qca cells and 0.1 µm² area. ramesh and rani [7] designed a qca full adder that consists of 52 qca cells and 0.038 µm² area. abedi et al. [8] designed a qca full adder that requires 59 qca cells and 0.043 µm² area.
hashemi and navi [9] have offered a qca full adder that requires 71 cells and 0.06 µm² area. mohammadi et al. [10] presented a qca full adder that requires 38 qca cells and 0.02 µm² area. ahmad et al. [11] constructed a qca full adder that consists of 41 qca cells and 0.04 µm² area. labrado and thapliyal [12] have presented a qca full adder that requires 63 qca cells and 0.05 µm² area. balali et al. [13] designed a qca full adder that requires 29 qca cells and 0.02 µm² area. although these full adder circuits have advantages, the complexity and required area of the full adder circuit in the qca technology can be reduced with the new technique described in this paper. in this paper, an efficient circuit for a one-bit qca full adder is presented and evaluated. then, an efficient circuit is designed for a four-bit qca ripple carry adder (rca). the functionality of the designed circuits is verified using qcadesigner tool version 2.0.1. the implementation results show that the designed circuits have advantages compared to recent one-bit qca full adder circuits and four-bit qca rca circuits. the rest of this paper is organized as follows. background of the developed circuits is presented in section 2. the designed circuits are presented in section 3. section 4 evaluates the designed circuits. finally, section 5 concludes this paper.
2. background
2.1. qca technology
qca technology is an emerging technology that can be utilized for the development of digital circuits in line with moore’s law. this new technology utilizes charge configuration, instead of current, for information transfer. the basic element in this technology is a square cell with four dots, which holds two free electrons. fig. 1 shows a simplified qca cell [1, 5].
fig. 1 a simplified qca cell [1, 5]
the arrangement of these free electrons is utilized to denote the logic zero and logic one states in this technology. using this cell, logic elements such as the majority gate [4] can be developed.
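the behaviour of the majority gate can be illustrated at the logic level with a short python sketch. it models only the boolean function, not the qca cell layout or clocking, and the function names are chosen here for illustration. fixing one input of a 3-input majority gate to 0 yields an and gate, and fixing it to 1 yields an or gate.

```python
def maj3(a, b, c):
    """3-input majority function on bits (0 or 1)."""
    return (a & b) | (a & c) | (b & c)


def and_gate(a, b):
    """AND obtained from a majority gate with one input fixed to 0."""
    return maj3(a, b, 0)


def or_gate(a, b):
    """OR obtained from a majority gate with one input fixed to 1."""
    return maj3(a, b, 1)


# Truth table rows: (a, b, a AND b, a OR b)
truth_table = [(a, b, and_gate(a, b), or_gate(a, b)) for a in (0, 1) for b in (0, 1)]
for row in truth_table:
    print(row)
```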
it should be noted that other logic elements such as the or gate and the and gate can be developed using the majority gate [1, 4, 5]. moreover, complex digital circuits such as full adder circuits [6-15] and multiplexer circuits [1, 5] are developed using these logic elements.
2.2. qca full adder circuit
the full adder plays a vital role in complex digital circuits. as a result, high-performance implementation of this circuit is an attractive research area. the logical function of the one-bit full adder can be expressed by the following equations:
$carry = ab + ac_{in} + bc_{in} = maj3(a, b, c_{in})$ (1)
$sum = a \oplus b \oplus c_{in} = maj3(c_{in}, maj3(a, b, \overline{c_{in}}), \overline{carry})$ (2)
in (1) and (2), a and b denote the inputs of the one-bit full adder. cin and carry denote the carry input and carry output, respectively. sum denotes the sum output of the one-bit full adder. moreover, maj3 denotes the 3-input majority function, which can be implemented using a 3-input majority gate in the qca technology [1, 4, 9]. fig. 2 shows the circuit diagram for the one-bit qca full adder [6-9].
fig. 2 the circuit diagram for the one-bit qca full adder [6]
in addition, the four-bit qca rca circuit can be obtained by cascading four one-bit full adders [1, 6-9]. fig. 3 shows the four-bit qca rca circuit.
fig. 3 the four-bit qca rca circuit [6]
in this circuit, a = (a3, a2, a1, a0) and b = (b3, b2, b1, b0) are two four-bit inputs. cin and cout denote the one-bit carry input and carry output, and sum = (sum3, sum2, sum1, sum0) is the four-bit output.
3. the designed qca circuits
this section outlines a novel one-bit qca full adder circuit. then, the new four-bit qca rca circuit is designed based on the designed one-bit qca full adder circuit.
3.1. the designed qca full adder circuit
the designed circuit for the one-bit qca full adder circuit is shown in fig. 4.
fig. 4 the designed circuit for the one-bit qca full adder
in this circuit, a and b are two one-bit inputs and c is the carry input. carry and sum denote the carry and sum outputs, respectively. this circuit consists of 46 qca cells. note that four clocking zones are utilized in this circuit as follows: white indicates clock zone 3, light blue indicates clock zone 2, violet indicates clock zone 1, and green indicates clock zone 0.
3.2. the designed qca rca circuit
the designed circuit for the four-bit qca rca circuit is shown in fig. 5.
fig. 5 the designed circuit for the four-bit qca rca
in this circuit, a and b are two four-bit inputs and cin is the one-bit carry input. carry and sum denote the one-bit carry output and the four-bit sum output, respectively. this circuit consists of 187 qca cells.
4. implementation results and comparison
the designed circuits are implemented using qcadesigner tool version 2.0.1. this section presents these implementation results.
4.1. the designed qca full adder circuit
fig. 6 shows the implementation results of the designed circuit for the one-bit qca full adder.
fig. 6 the implementation results for the designed one-bit qca full adder circuit
the implementation results of the designed circuit for the one-bit qca full adder confirm the correctness of this circuit. table 1 summarizes the implementation results of the designed circuit for the one-bit qca full adder compared to other one-bit qca full adder circuits in [6-13].
table 1 the comparative table for one-bit qca full adder circuits
reference     complexity (#cell)    area (µm²)    delay (clock zone)
[6]           102                   0.1           8
[9]           71                    0.06          5
[7]           52                    0.038         4
[8]           59                    0.043         4
[10]          38                    0.02          3
[11]          41                    0.04          2
[12]          63                    0.05          3
[13]          29                    0.02          2
this paper    46                    0.04          4
based on our implementation results that are shown in fig.
6 and table 1, the designed circuit for the one-bit qca full adder has an improvement in terms of complexity compared to other one-bit qca full adder circuits in [6-9, 12]. although the cell count in one-bit qca 284 d. mokhtari, a. rezai, h. rashidi, f. rabiei, s. emadi, a. karimi full adder circuits in [10, 11, 13] is lower than our designed one-bit qca full adder circuit, our designed four-bit qca rca circuit, which is utilized this one-bit qca full adder circuit as its basic block, has advantages compared to four-bit qca rca circuits in [10, 13]. it is because the input/output ports in the developed one-bit qca full adder have suitable places. so, the place and route results in the developed four-bit qca rca presents a better results. moreover, the output cells of one-bit qca full adder circuit in [11] aren’t suitable placed. so, the implementation of four-bit qca rca circuit using the one-bit qca full adder circuit in [11] is hard. 4.2. the designed qca rca circuit fig. 7 shows the implementation results of the designed circuit for the four-bit qca rca. fig. 7 the implementation results for the designed the four-bit qca rca circuit the implementation results of the designed circuit for the four-bit qca rca confirm the correctness of this circuit. table 2 summarizes the implementation results of the designed circuit for the four-bit qca rca compared to other four-bit qca rca circuits in [6-10, 12-14]. table 2 the comparative table for four-bit qca rca circuit reference complexity (#cell) area (µm 2 ) delay (clock zone) [6] 558 0.85 20 [9] 442 1 8 [7] 260 0.28 10 [8] 262 0.208 28 [10] 237 0.24 6 [12] 295 0.3 6 [13] 269 0.37 14 [14] 339 0.2542 7 this paper 187 0.2 16 based on our implementation results that are shown in fig. 7 and table 2, the designed circuit for the four-bit qca rca has an improvement in terms of complexity, and area compared to other four-bit qca rca circuits in [6-10, 12-14]. 
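The majority-gate formulation of the full adder in Section 2.2 and the ripple-carry structure of the four-bit adder can be checked at the logic level with a short script. The following is a minimal sketch, not the authors' QCADesigner layout: it assumes the standard form of Eq. (2) with inverted inputs, models bits as Python integers, and uses function names (`maj3`, `full_adder`, `ripple_carry_add`, `to_bits`, `from_bits`) that are introduced here purely for illustration. It says nothing about cell count, area, or clocking.

```python
from itertools import product


def maj3(x, y, z):
    """3-input majority function on bits (0 or 1)."""
    return (x & y) | (x & z) | (y & z)


def full_adder(a, b, c_in):
    """One-bit full adder built only from majority functions and inversion."""
    carry = maj3(a, b, c_in)                             # Eq. (1)
    total = maj3(c_in, maj3(a, b, 1 - c_in), 1 - carry)  # Eq. (2), assumed form
    return total, carry


def ripple_carry_add(a_bits, b_bits, c_in=0):
    """Add two equally long bit lists, least significant bit first."""
    sum_bits, carry = [], c_in
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)  # carry ripples into the next stage
        sum_bits.append(s)
    return sum_bits, carry


def to_bits(value, width=4):
    """Integer to bit list, least significant bit first."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """Bit list (least significant bit first) back to an integer."""
    return sum(bit << i for i, bit in enumerate(bits))


if __name__ == "__main__":
    # Exhaustive check of the one-bit adder against ordinary arithmetic.
    for a, b, c in product((0, 1), repeat=3):
        assert full_adder(a, b, c) == ((a + b + c) % 2, (a + b + c) // 2)
    sum_bits, c_out = ripple_carry_add(to_bits(11), to_bits(6))
    print(from_bits(sum_bits), c_out)  # 11 + 6 = 17: four-bit sum 1, carry out 1
```

Because the one-bit adder has only eight input combinations, the exhaustive loop is a complete functional check of the two equations; the four-bit case has 512 combinations and can be checked the same way.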
5. Conclusion

Full adders play an important role in computer arithmetic, so an efficient implementation of the full adder can increase the efficiency of computer arithmetic circuits. This paper presented and evaluated an efficient full adder circuit in QCA technology. In addition, we implemented a four-bit QCA RCA circuit based on this new one-bit QCA full adder. The designed circuits have been implemented using QCADesigner tool version 2.0.1. The implementation results confirmed that the designed circuits outperform the recent one-bit QCA full adder circuits and four-bit QCA RCA circuits in [6-9, 12] in terms of complexity and required area.

References

[1] P. Balasubramanian, “A latency optimized biased implementation style weak-indication self-timed full adder,” Facta Universitatis, Series: Electronics and Energetics, vol. 28, pp. 657-671, 2015.
[2] A. Rezai and P. Keshavarzi, “High-performance scalable architecture for modular multiplication using a new digit-serial computation,” Micro. J., vol. 55, pp. 169–178, 2016.
[3] A. Rezai and P. Keshavarzi, “High-throughput modular multiplication and exponentiation algorithm using multibit-scan-multibit-shift technique,” IEEE Trans. VLSI Syst., vol. 23, pp. 1710-1719, 2015.
[4] M. Balali, A. Rezai, H. Balali, F. Rabiei, and S. Emadi, “A novel design of 5-input majority gate in quantum-dot cellular automata technology,” in Proceedings of the IEEE Symp. Comput. Appl. Indust. Electr. (ISCAIE 2017), 2017, pp. 13-16.
[5] H. Rashidi, A. Rezai, and S. Soltani, “High-performance multiplexer circuit for quantum-dot cellular automata,” J. Comput. Electr., vol. 15, pp. 968–98, 2016.
[6] I. Hänninen and J. Takala, “Binary adders on quantum-dot cellular automata,” J. Sign. Process. Syst., vol. 58, pp. 87–103, 2010.
[7] B. Ramesh and M. A. Rani, “Design of binary to BCD code converter using area optimized quantum-dot cellular automata full adder,” Int. J. Eng., vol. 9, pp. 49-64, 2015.
[8] D. Abedi, G. Jaberipur, and M. Sangsefidi, “Coplanar full adder in quantum-dot cellular automata via clock-zone-based crossover,” IEEE Trans. Nanotech., vol. 14, pp. 497–504, 2015.
[9] S. Hashemi and K. Navi, “A novel robust QCA full-adder,” Proc. Mater. Sci., vol. 11, pp. 376-380, 2015.
[10] M. Mohammadi, M. Mohammadi, and S. Gorgin, “An efficient design of full adder in quantum-dot cellular automata (QCA) technology,” Microelectr. J., vol. 50, pp. 35-43, 2016.
[11] F. Ahmad, G. M. Bhat, H. Khademolhosseini, S. Azimi, S. Angizi, and K. Navi, “Towards single layer quantum-dot cellular automata adders based on explicit interaction of cells,” J. Comput. Sci., vol. 16, pp. 8-15, 2016.
[12] C. Labrado and H. Thapliyal, “Design of adder and subtractor circuits in majority logic-based field-coupled QCA nano computing,” Electron. Lett., vol. 52, pp. 464-466, 2016.
[13] M. Balali, A. Rezai, H. Balali, F. Rabiei, and S. Emadi, “Towards coplanar quantum-dot cellular automata adders based on efficient three-input XOR gate,” Result. Phys., vol. 7, pp. 1389-1395, 2017.
[14] V. Pudi and K. Sridharan, “Low complexity design of ripple carry and Brent-Kung adders in QCA,” IEEE Trans. Nanotech., vol. 11, pp. 105-119, 2012.
[15] H. Cho and E. E. Swartzlander, “Adder and multiplier design in quantum-dot cellular automata,” IEEE Trans. Comput., vol. 58, pp. 721-727, 2009.

Facta Universitatis, Series: Electronics and Energetics, Vol.
xx, 2018, xx-xx

Compact XOR-Bi-Decomposition for Lattices of Boolean Functions*

Bernd Steinbach¹ and Christian Posthoff²
¹Freiberg University of Mining and Technology, Institute of Computer Science, Freiberg, Germany
²The University of the West Indies, Department of Computing and Information Technology, Saint Augustine, Trinidad & Tobago

Abstract: Bi-decomposition is a powerful approach for the synthesis of multi-level combinational circuits because it utilizes the properties of the given functions to find small circuits with low power consumption and low delay. Compact bi-decompositions restrict the variables in the support of the decomposition functions as much as possible. Methods to find compact AND-, OR-, or XOR-bi-decompositions for a given completely specified function are well known. A lattice of Boolean functions represents all possible functions which are defined by an incompletely specified function. Lattices of Boolean functions significantly increase the possibilities to synthesize a minimal circuit. However, so far only methods to find compact AND- or OR-bi-decompositions for lattices of Boolean functions are known. This gap, i.e., a method to find a compact XOR-bi-decomposition for a lattice of Boolean functions, has been closed by the approach suggested in this paper.

Keywords: synthesis, combinational circuit, lattice of Boolean functions, XOR-bi-decomposition, Boolean Differential Calculus, derivative operations.

Manuscript received September 18, 2017. Corresponding author: Bernd Steinbach, Institute of Computer Science, Freiberg University of Mining and Technology, Bernhard-von-Cotta-Str. 2, D-09596 Freiberg, Germany (e-mail: steinb@informatik.tu-freiberg.de).
*A preliminary version of this paper was presented at the Reed-Muller 2017 Workshop, Novi Sad, Serbia, May 24-25, 2017.

1 Introduction

The aim of all decomposition methods in circuit design is to find decomposition functions that are simpler than the given function. The bi-decomposition is an approach that decomposes a given Boolean function into two simpler decomposition functions which are combined by an AND-, an OR-, or an XOR-gate. For the aim of simplification, the bi-decomposition utilizes the properties of the given Boolean function to design circuit structures of a small area, low power consumption, and low delay [1].

There are several types of bi-decompositions. Both decomposition functions of each strong bi-decomposition are simpler than the given function because they depend on fewer variables. Unfortunately, there are functions for which no strong bi-decomposition exists. Le [2] bridged this gap by means of the weak bi-decomposition. He found that each function for which neither a weak OR-bi-decomposition nor a weak AND-bi-decomposition exists can be simplified by a strong XOR-bi-decomposition. A simplified proof of the completeness of the strong and weak bi-decomposition is given in [3, 4]. An implementation that reuses already decomposed blocks outperforms other synthesis approaches [5]. A drawback of the weak bi-decomposition is that the synthesized circuits can have a large difference between the shortest and the longest path (unbalanced circuits).

Recently, vectorial bi-decompositions were suggested as a supplement to strong and weak bi-decompositions [6, 7]. The decomposition functions of these bi-decompositions are simpler than the given function because they are independent of the simultaneous change of several variables. Vectorial bi-decompositions can exist for functions without any strong bi-decomposition. Benefits of the vectorial bi-decomposition are its contribution to balanced circuits and the increased number of decomposition possibilities in comparison to the strong bi-decomposition. All bi-decomposition approaches mentioned above utilize the Boolean Differential Calculus (BDC) [3, 4, 8–12] to find optimal bi-decompositions.
There are several other approaches to bi-decomposition which demonstrate the interest in this useful synthesis method; however, these other approaches are not directly helpful for solving the problem explored in this paper. In [13], a method for disjoint bi-decompositions with an extension to non-disjoint bi-decompositions for a single common variable is suggested. A graph-based approach for bi-decompositions was suggested in [14]. Unfortunately, the benchmarks used in [5] and [14] overlap only partially. One common benchmark is t481, where our approach from [5] outperforms the graph-based approach from [14] in the number of gates (17/25). Recently, semi-tensor products of matrices were suggested for bi-decompositions of Boolean and multi-valued functions [15]. However, that paper does not contain experimental results for benchmark circuits.

It is a property of the given function whether it can be decomposed into two simpler decomposition functions using a certain type of bi-decomposition. The possibility to find a bi-decomposition increases when the function to decompose can be chosen from a lattice of Boolean functions.

Facta Universitatis, Series: Electronics and Energetics, Vol. 31, No 2, June 2018, pp. 223-240, https://doi.org/10.2298/fuee1802223s. Bernd Steinbach¹, Christian Posthoff². Received October 21, 2017; received in revised form January 24, 2018. Corresponding author: Bernd Steinbach, Institute of Computer Science, Freiberg University of Mining and Technology, Bernhard-von-Cotta-Str. 2, D-09596 Freiberg, Germany (e-mail: steinb@informatik.tu-freiberg.de). *An earlier version of this paper was presented as an invited address at the Reed-Muller 2017 Workshop, Novi Sad, Serbia, May 24-25, 2017.

Facta Universitatis, Series: Electronics and Energetics, Vol. 28, No 4, December 2015, pp. 507-525, doi: 10.2298/fuee1504507s

Horizontal Current Bipolar Transistor (HCBT) – A Low-Cost, High-Performance Flexible BiCMOS Technology for RF Communication Applications

Tomislav Suligoj¹, Marko Koričić¹, Josip Žilak¹, Hidenori Mochizuki², So-ichi Morita², Katsumi Shinomura², Hisaya Imai²
¹University of Zagreb, Faculty of Electrical Engineering and Computing, Department of Electronics, Micro- and Nano-electronics Laboratory, Croatia
²Asahi Kasei Microdevices Co., 5-4960, Nobeoka, Miyazaki, 882-0031, Japan

Abstract. In an overview of Horizontal Current Bipolar Transistor (HCBT) technology, the state-of-the-art integrated silicon bipolar transistors are described, which exhibit f_T and f_max of 51 GHz and 61 GHz and an f_T·BV_CEO product of 173 GHz·V, placing them among the highest-performance implanted-base silicon bipolar transistors. HCBT is integrated with CMOS in a considerably lower-cost fabrication sequence than standard vertical-current bipolar transistors, with only 2 or 3 additional masks and fewer process steps. Due to its specific structure, the charge sharing effect can be employed to increase BV_CEO without sacrificing f_T and f_max. Moreover, the electric field can be engineered just by manipulating the lithography masks, achieving high-voltage HCBTs with breakdowns up to 36 V integrated in the same process flow with high-speed devices, i.e., at zero additional cost. A double-balanced active mixer circuit is designed and fabricated in HCBT technology. A maximum IIP3 of 17.7 dBm at a mixer current of 9.2 mA and a conversion gain of -5 dB are achieved.

Key words: BiCMOS technology, bipolar transistors, horizontal current bipolar transistor, radio frequency integrated circuits, mixer, high-voltage bipolar transistors.

1. Introduction

In the highly competitive wireless communication markets, RF circuits and systems are fabricated in technologies that are very cost-sensitive.
In order to minimize the fabrication costs, sub-10 GHz applications can be processed by using high-volume silicon technologies. It has been identified that the optimum solution might

Received March 9, 2015. Corresponding author: Tomislav Suligoj, University of Zagreb, Faculty of Electrical Engineering and Computing, Department of Electronics, Micro- and Nano-electronics Laboratory, Croatia (e-mail: tom@zemris.fer.hr)
Incompletely specified functions were traditionally used as a source of a lattice of Boolean functions. The ON-set function f_q(x) and the OFF-set function f_r(x) are the preferred mark functions of these lattices. The introduction of derivative operations for lattices of Boolean functions [4, 16, 17] facilitates the application of lattices in circuit design.

A strong bi-decomposition divides the variables of the function to decompose into three disjoint subsets. The variables x_a control only the decomposition function g, the variables x_b control only the decomposition function h, and the variables x_c are commonly used by both decomposition functions g and h. The more variables the dedicated sets x_a and x_b contain, the simpler the decomposition functions g and h are. A compact strong bi-decomposition uses the largest possible sets x_a and x_b.

There are formulas [3, 4, 10–12] containing operations of the Boolean Differential Calculus that can be used to decide whether a lattice of Boolean functions contains a compact strong AND- or a compact strong OR-bi-decomposition. Unfortunately, so far it is only possible to find a compact strong XOR-bi-decomposition for a single decomposition function, or to assign only a single variable to the set x_a when checking whether a lattice of Boolean functions contains a strong XOR-bi-decomposition [18]. We suggest in this paper an approach to find compact strong XOR-bi-decompositions for a lattice of Boolean functions as well. This new method combines the ideas of simplification used in both the strong and the vectorial bi-decomposition. Hence, we solve a problem that has been known as unsolved for more than 20 years.

The rest of this paper is organized as follows. Section 2 briefly describes lattices of Boolean functions and single derivatives of functions belonging to such a lattice. Section 3 summarizes the known approach to find and determine non-compact XOR-bi-decompositions. Section 4 introduces the theory of compact XOR-bi-decompositions, derives the main theorem and its consequence for compact XOR-bi-decompositions, and provides two consecutive algorithms that solve this task using XBOOLE [19, 20]. Section 5 demonstrates the benefits of the suggested new decomposition method by means of a simple example. Section 6 concludes the paper.

2 Lattices of Boolean functions

Lattices of Boolean functions occur, e.g., in circuit design, where each function of the lattice can be chosen as a function to realize the circuit structure. Hence, lattices of Boolean functions provide a possibility for optimization in circuit design. Widely used are lattices which can be modeled as an incompletely specified function (ISF).
this new method combines the ideas of simplifications used in both the strong and the vectorial bidecomposition. hence, we are going to solve a problem that is more than 20 years known as unsolved. the rest of this paper is organized as follows. section 2 briefly describes lattices of boolean functions and single derivatives of functions belonging to such a lattice. section 3 summarizes the known approach to find and determine noncompact xor-bi-decompositions. section 4 introduces into the theory of compact xor-bi-bi-decompositions, concludes the main theorem and the consequence for compact xor-bi-bi-decompositions, and provides two consecutive algorithms that solve this task using xboole [19, 20]. section 5 demonstrates the benefits of the suggested new decomposition method by means of a simple example. section 6 concludes the paper. 2 lattices of boolean functions lattices of boolean functions occur, e.g., in circuit design where each function of the lattice can be chosen as a function to realize the circuit structure. hence, lattices of boolean functions provide a possibility for optimization in circuit design. widely used are lattices which can be modeled as incompletely specified function (isf). such an incompletely specified boolean function divides the 2n patterns 224 b. steinbach, c. posthoff compact xor-bi-decomposition for lattices of boolean functions 225 4 b. steinbach and c. posthoff: x of the boolean space bn into three disjoint sets: x ∈ don’t-care-set ⇔ fϕ (x1,...,xn) = 1 ⇔ it is allowed to choose the function value of f (x) without any restrictions; x ∈ on-set ⇔ fq(x1,...,xn) = 1 ⇔ ( fϕ (x1,...,xn) = 0)∧( f (x1,...,xn) = 1) ; x ∈ off-set ⇔ fr(x1,...,xn) = 1 ⇔ ( fϕ (x1,...,xn) = 0)∧( f (x1,...,xn) = 0) . each pair of these mark functions can be used to specify all functions of the lattice. a function f (x) belongs to the lattice l 〈 fq(x), fr(x) 〉 if fq(x) ≤ f (x) ≤ fr(x) . 
the single derivatives with regard to $x_i$ of all functions of a lattice $L\langle f_q(\mathbf{x}), f_r(\mathbf{x})\rangle$ again form a lattice of boolean functions, which is specified by the mark functions:
\[ f_q^{\partial x_i}(\mathbf{x}_1) = \max_{x_i} f_q(x_i,\mathbf{x}_1) \wedge \max_{x_i} f_r(x_i,\mathbf{x}_1) , \tag{1} \]
\[ f_r^{\partial x_i}(\mathbf{x}_1) = \min_{x_i} f_q(x_i,\mathbf{x}_1) \vee \min_{x_i} f_r(x_i,\mathbf{x}_1) . \tag{2} \]

3 known non-compact xor-bi-decompositions for lattices of boolean functions

a lattice of boolean functions $L\langle f_q(x_a,\mathbf{x}_b,\mathbf{x}_c), f_r(x_a,\mathbf{x}_b,\mathbf{x}_c)\rangle$ contains at least one function $f(x_a,\mathbf{x}_b,\mathbf{x}_c)$ that is strongly xor-bi-decomposable with regard to the single variable $x_a$ and the set of variables $\mathbf{x}_b$ if and only if
\[ \max_{\mathbf{x}_b}{}^{m}\, f_q^{\partial x_a}(\mathbf{x}_b,\mathbf{x}_c) \wedge f_r^{\partial x_a}(\mathbf{x}_b,\mathbf{x}_c) = 0 . \tag{3} \]
the decomposition function $g(x_a,\mathbf{x}_c)$ of this xor-bi-decomposition is uniquely specified by
\[ g(x_a,\mathbf{x}_c) = x_a \wedge \max_{\mathbf{x}_b}{}^{m}\, f_q^{\partial x_a}(\mathbf{x}_b,\mathbf{x}_c) , \tag{4} \]
and the associated decomposition function $h(\mathbf{x}_b,\mathbf{x}_c)$ can be chosen from the lattice with the mark functions
\[ h_q(\mathbf{x}_b,\mathbf{x}_c) = \max_{x_a} \big( (\overline{g(x_a,\mathbf{x}_c)} \wedge f_q(x_a,\mathbf{x}_b,\mathbf{x}_c)) \vee (g(x_a,\mathbf{x}_c) \wedge f_r(x_a,\mathbf{x}_b,\mathbf{x}_c)) \big) , \tag{5} \]
\[ h_r(\mathbf{x}_b,\mathbf{x}_c) = \max_{x_a} \big( (\overline{g(x_a,\mathbf{x}_c)} \wedge f_r(x_a,\mathbf{x}_b,\mathbf{x}_c)) \vee (g(x_a,\mathbf{x}_c) \wedge f_q(x_a,\mathbf{x}_b,\mathbf{x}_c)) \big) . \tag{6} \]
more details about strong and weak bi-decompositions are given in [3, 4, 10, 11].
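the mark functions (1) and (2), the check (3), and the formulas (4) to (6) can be traced on a tiny example. the sketch below is an assumption-laden illustration, not the xboole implementation: functions are on-sets stored as frozensets, the helpers `flip`, `vmax`, and `vmin` are hypothetical names, and the completely specified function $f = (x_a \wedge x_c) \oplus (x_b \wedge x_c)$ is chosen only because its decomposition is easy to verify by hand.

```python
from itertools import product

N = 3                                   # variables (xa, xb, xc) -> indices 0, 1, 2
SPACE = frozenset(product((0, 1), repeat=N))

def flip(p, i):
    """pattern p with the value of variable i complemented."""
    return p[:i] + (1 - p[i],) + p[i + 1:]

def vmax(s, i):
    """maximum of the on-set s with regard to variable i."""
    return frozenset(s | {flip(p, i) for p in s})

def vmin(s, i):
    """minimum of the on-set s with regard to variable i."""
    return frozenset(p for p in s if flip(p, i) in s)

# lattice without don't-cares: f = (xa and xc) xor (xb and xc)
fq = frozenset(p for p in SPACE if (p[0] & p[2]) ^ (p[1] & p[2]))
fr = SPACE - fq

dq = vmax(fq, 0) & vmax(fr, 0)          # (1): derivative is surely 1
dr = vmin(fq, 0) | vmin(fr, 0)          # (2): derivative is surely 0
decomposable = not (vmax(dq, 1) & dr)   # (3) with xb = {xb}

xa = frozenset(p for p in SPACE if p[0])
g = xa & vmax(dq, 1)                                   # (4)
hq = vmax(((SPACE - g) & fq) | (g & fr), 0)            # (5)
hr = vmax(((SPACE - g) & fr) | (g & fq), 0)            # (6)
```

for this function the sketch yields $g = x_a \wedge x_c$ and the single function $h = x_b \wedge x_c$, so that $g \oplus h$ reproduces $f$.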
4 compact xor-bi-decomposition for lattices of boolean functions

a compact bi-decomposition is characterized by maximal numbers of variables in the dedicated sets $\mathbf{x}_a$ and $\mathbf{x}_b$. the number of commonly used variables $\mathbf{x}_c$ is then as small as possible, and consequently the decomposition functions $g(\mathbf{x}_a,\mathbf{x}_c)$ and $h(\mathbf{x}_b,\mathbf{x}_c)$ are the simplest ones in the desirable case of a compact bi-decomposition.

condition (3) allows us to find a maximal number of variables for the dedicated set $\mathbf{x}_b$ of an xor-bi-decomposition, but it unfortunately restricts the dedicated set $\mathbf{x}_a$ to a single variable $x_a$. as initial solution we can calculate the decomposition function $g(x_a,\mathbf{x}_c)$ using (4) and the lattice $L\langle h_q(\mathbf{x}_b,\mathbf{x}_c), h_r(\mathbf{x}_b,\mathbf{x}_c)\rangle$, using (5) and (6), from which the decomposition function $h(\mathbf{x}_b,\mathbf{x}_c)$ can be chosen.
the set of all variables $\mathbf{x}$ is distributed to the three disjoint sets $\mathbf{x}_a = x_a$, $\mathbf{x}_b$, and $\mathbf{x}_c$. we assume that $\mathbf{x}_b$ contains as many variables as possible, because it can be verified by (3) that no other variable can be added to the dedicated set $\mathbf{x}_b$ without losing the property of an xor-bi-decomposition of the given lattice. a given xor-bi-decomposition is not compact if at least one variable can be moved from $\mathbf{x}_c$ to $\mathbf{x}_a$. moving a variable $x_i$ from $\mathbf{x}_c$ to $\mathbf{x}_a$ does not change the set of variables the function $g(\mathbf{x}_a,\mathbf{x}_c)$ depends on; however, it reduces the support of the function $h(\mathbf{x}_b,\mathbf{x}_c)$.

the set of variables $\mathbf{x}_c$ is split into $x_i$ and $\mathbf{x}_{c0}$. due to the evaluation of condition (3) for all variables, we know that the function $h(\mathbf{x}_b,\mathbf{x}_c)$ depends on all variables $(\mathbf{x}_b,\mathbf{x}_c)$. hence, only another function $h'(\mathbf{x}_b,\mathbf{x}_{c0})$ is able to solve the problem. in the context of the xor-bi-decomposition, the following transformation steps show the key idea to solve the problem:
\begin{align}
& g(\mathbf{x}_{a0},\mathbf{x}_c) \oplus h(\mathbf{x}_b,\mathbf{x}_c) \tag{7} \\
&= g(\mathbf{x}_{a0},\mathbf{x}_{c0},x_i) \oplus h(\mathbf{x}_b,\mathbf{x}_{c0},x_i) \tag{8} \\
&= g(\mathbf{x}_{a0},\mathbf{x}_{c0},x_i) \oplus (x_i \oplus h'(\mathbf{x}_b,\mathbf{x}_{c0})) \tag{9} \\
&= (g(\mathbf{x}_{a0},\mathbf{x}_{c0},x_i) \oplus x_i) \oplus h'(\mathbf{x}_b,\mathbf{x}_{c0}) \tag{10} \\
&= g'(\mathbf{x}_{a0},x_i,\mathbf{x}_{c0}) \oplus h'(\mathbf{x}_b,\mathbf{x}_{c0}) \tag{11} \\
&= g'(\mathbf{x}_a,\mathbf{x}_{c0}) \oplus h'(\mathbf{x}_b,\mathbf{x}_{c0}) ; \tag{12}
\end{align}
• the step from (7) to (8) emphasizes the variable $x_i$ as element of the given set of variables $\mathbf{x}_c$;
• the step from (8) to (9) requires that the function $h(\mathbf{x}_b,\mathbf{x}_{c0},x_i)$ is linear in $x_i$. this property enables or prohibits the whole transformation;
• the step from (9) to (10) moves the variable $x_i$ to the other decomposition function of the xor-bi-decomposition;
• the step from (10) to (11) includes the variable $x_i$ into the new decomposition function $g'(\mathbf{x}_{a0},x_i,\mathbf{x}_{c0})$.
this transformation is possible without any restriction;
• the step from (11) to (12) emphasizes that $x_i$ does not belong to the commonly used variables $\mathbf{x}_{c0}$, because $h'(\mathbf{x}_b,\mathbf{x}_{c0})$ does not depend on $x_i$; hence, $x_i$ extends the dedicated set of variables $\mathbf{x}_{a0}$ to $\mathbf{x}_a = (\mathbf{x}_{a0},x_i)$.

the only condition for the transformation from (7) to (12) is that the lattice $L\langle h_q(\mathbf{x}_b,\mathbf{x}_c), h_r(\mathbf{x}_b,\mathbf{x}_c)\rangle$ contains at least one function that satisfies
\[ \frac{\partial h(\mathbf{x}_1,x_i)}{\partial x_i} = 1 . \]

theorem 1 (linear separation of a variable for a function of a lattice) a lattice $L\langle h_q(\mathbf{x}_b,\mathbf{x}_c), h_r(\mathbf{x}_b,\mathbf{x}_c)\rangle$ contains at least one function $h(\mathbf{x}_1,x_i)$ that can be represented by
\[ h(\mathbf{x}_1,x_i) = x_i \oplus h'(\mathbf{x}_1) \tag{13} \]
if and only if the condition
\[ h_r^{\partial x_i}(\mathbf{x}_1) = 0 \tag{14} \]
is satisfied.

proof 1 necessary: due to condition (14), the lattice $L\langle h_q(\mathbf{x}_b,\mathbf{x}_c), h_r(\mathbf{x}_b,\mathbf{x}_c)\rangle$ contains at least one function that satisfies
\[ \frac{\partial h(\mathbf{x}_1,x_i)}{\partial x_i} = 1 . \tag{15} \]
it is well known that (15) is satisfied if the function $h(\mathbf{x}_1,x_i)$ is linear with regard to the variable $x_i$ as shown in (13). hence, we have theorem 1 in the direction (14) $\Rightarrow$ (13).

sufficient: the function $h(\mathbf{x}_1,x_i)$ of (13) belongs to the lattice $L\langle h_q(\mathbf{x}_b,\mathbf{x}_c), h_r(\mathbf{x}_b,\mathbf{x}_c)\rangle$, so that it holds:
\[ h_q(\mathbf{x}_b,\mathbf{x}_c) \le h(\mathbf{x}_1,x_i) \le \overline{h_r(\mathbf{x}_b,\mathbf{x}_c)} . \tag{16} \]
using (13), the inequality (16) can be split into the two inequalities
\begin{align}
h_q(\mathbf{x}_b,\mathbf{x}_c) &\le h(\mathbf{x}_1,x_i) = x_i \oplus h'(\mathbf{x}_1) \notag \\
h_q(\mathbf{x}_b,\mathbf{x}_c) &\le (x_i \vee h'(\mathbf{x}_1))(\overline{x}_i \vee \overline{h'(\mathbf{x}_1)}) , \tag{17}
\end{align}
and
\begin{align}
\overline{h_r(\mathbf{x}_b,\mathbf{x}_c)} &\ge h(\mathbf{x}_1,x_i) = x_i \oplus h'(\mathbf{x}_1) \notag \\
h_r(\mathbf{x}_b,\mathbf{x}_c) &\le \overline{h(\mathbf{x}_1,x_i)} = \overline{x_i \oplus h'(\mathbf{x}_1)} \notag \\
h_r(\mathbf{x}_b,\mathbf{x}_c) &\le (\overline{x}_i \vee h'(\mathbf{x}_1))(x_i \vee \overline{h'(\mathbf{x}_1)}) . \tag{18}
\end{align}
the right-hand-side functions of (17) and (18) are equal to or larger than the mark functions $h_q(\mathbf{x}_b,\mathbf{x}_c)$ and $h_r(\mathbf{x}_b,\mathbf{x}_c)$. the mark function $h_r^{\partial x_i}(\mathbf{x}_1)$ of condition (14) is defined by:
\[ h_r^{\partial x_i}(\mathbf{x}_1) = \min_{x_i} h_q(x_i,\mathbf{x}_1) \vee \min_{x_i} h_r(x_i,\mathbf{x}_1) . \tag{19} \]
now, we substitute into (19) the right-hand side of (17) for $h_q(\mathbf{x}_b,\mathbf{x}_c)$ and the right-hand side of (18) for $h_r(\mathbf{x}_b,\mathbf{x}_c)$. these functions are equal to or larger than the replaced functions.
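so we get:
\begin{align}
h_r^{\partial x_i}(\mathbf{x}_1) &= \min_{x_i}\big((x_i \vee h'(\mathbf{x}_1)) \wedge (\overline{x}_i \vee \overline{h'(\mathbf{x}_1)})\big) \vee \min_{x_i}\big((\overline{x}_i \vee h'(\mathbf{x}_1)) \wedge (x_i \vee \overline{h'(\mathbf{x}_1)})\big) \notag \\
&= \min_{x_i}(x_i \vee h'(\mathbf{x}_1)) \wedge \min_{x_i}(\overline{x}_i \vee \overline{h'(\mathbf{x}_1)}) \vee \min_{x_i}(\overline{x}_i \vee h'(\mathbf{x}_1)) \wedge \min_{x_i}(x_i \vee \overline{h'(\mathbf{x}_1)}) \notag \\
&= \big(h'(\mathbf{x}_1) \vee \min_{x_i}(x_i)\big) \wedge \big(\overline{h'(\mathbf{x}_1)} \vee \min_{x_i}(\overline{x}_i)\big) \vee \big(h'(\mathbf{x}_1) \vee \min_{x_i}(\overline{x}_i)\big) \wedge \big(\overline{h'(\mathbf{x}_1)} \vee \min_{x_i}(x_i)\big) \notag \\
&= \big(h'(\mathbf{x}_1) \vee 0\big) \wedge \big(\overline{h'(\mathbf{x}_1)} \vee 0\big) \vee \big(h'(\mathbf{x}_1) \vee 0\big) \wedge \big(\overline{h'(\mathbf{x}_1)} \vee 0\big) \notag \\
&= h'(\mathbf{x}_1) \wedge \overline{h'(\mathbf{x}_1)} \vee h'(\mathbf{x}_1) \wedge \overline{h'(\mathbf{x}_1)} = 0 \vee 0 = 0 . \notag
\end{align}
hence, condition (14) is not only satisfied for the two mark functions $h_q(\mathbf{x}_b,\mathbf{x}_c)$ and $h_r(\mathbf{x}_b,\mathbf{x}_c)$, but also for the function (13), which is linear with regard to $x_i$ and belongs to the evaluated lattice. that shows the implication (13) $\Rightarrow$ (14) and completes the proof of theorem 1.

consequence 1 an xor-bi-decomposition is compact if the set of variables $\mathbf{x}_b$ is as large as possible (verified by condition (3)), and within an iterative procedure all variables $x_i$ of the initial set $\mathbf{x}_c$ that satisfy condition (14) are used to transform $g(\mathbf{x}_{a0},\mathbf{x}_{c0},x_i)$ to $g'(\mathbf{x}_a,\mathbf{x}_{c0})$ by
\[ g'(\mathbf{x}_a,\mathbf{x}_{c0}) = x_i \oplus g(\mathbf{x}_{a0},\mathbf{x}_{c0},x_i) \tag{20} \]
and the associated new lattice $L\langle h'_q(\mathbf{x}_b,\mathbf{x}_{c0}), h'_r(\mathbf{x}_b,\mathbf{x}_{c0})\rangle$ is adjusted by
\[ h'_q(\mathbf{x}_b,\mathbf{x}_{c0}) = \max_{x_i} \big(\overline{x}_i\, h_q(\mathbf{x}_b,\mathbf{x}_{c0},x_i) \vee x_i\, h_r(\mathbf{x}_b,\mathbf{x}_{c0},x_i)\big) , \tag{21} \]
\[ h'_r(\mathbf{x}_b,\mathbf{x}_{c0}) = \max_{x_i} \big(\overline{x}_i\, h_r(\mathbf{x}_b,\mathbf{x}_{c0},x_i) \vee x_i\, h_q(\mathbf{x}_b,\mathbf{x}_{c0},x_i)\big) . \tag{22} \]

a precondition for a compact xor-bi-decomposition is the existence of two variables $x_a$ and $x_b$ for which the given lattice $L\langle f_q(\mathbf{x}), f_r(\mathbf{x})\rangle$ contains at least one function that has a strong xor-bi-decomposition with regard to these variables. algorithm 1 analyzes whether there is an xor-bi-decomposition for the given lattice with regard to one pair of variables $x_a$ and $x_b$, and determines these variables if they exist.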
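condition (14) and the update rules (21) and (22) can be tried out on a small lattice. the following sketch is illustrative only and rests on assumptions: on-sets are stored as frozensets, the helper names `is_separable` and `separate` are hypothetical, and the lattice is derived from $h = x_1 \oplus (x_2 \wedge x_3)$ with one pattern turned into a don't-care.

```python
from itertools import product

N = 3                                    # variables x1, x2, x3 -> indices 0, 1, 2
SPACE = frozenset(product((0, 1), repeat=N))

def flip(p, i):
    return p[:i] + (1 - p[i],) + p[i + 1:]

def vmax(s, i):
    return frozenset(s | {flip(p, i) for p in s})

def vmin(s, i):
    return frozenset(p for p in s if flip(p, i) in s)

def is_separable(hq, hr, i):
    """condition (14): the off-set mark of the derivative is empty."""
    return not (vmin(hq, i) | vmin(hr, i))

def separate(hq, hr, i):
    """mark functions (21) and (22) of the lattice of h' = xi xor h."""
    xi = frozenset(p for p in SPACE if p[i])
    nxi = SPACE - xi
    return (vmax((nxi & hq) | (xi & hr), i),
            vmax((nxi & hr) | (xi & hq), i))

h = frozenset(p for p in SPACE if p[0] ^ (p[1] & p[2]))
hq = h - {(1, 0, 0)}                     # pattern (1, 0, 0) becomes a don't-care
hr = SPACE - h
new_hq, new_hr = separate(hq, hr, 0)     # separate x1
```

here $x_1$ passes the test and the adjusted lattice collapses to the single function $h' = x_2 \wedge x_3$, while $x_2$ fails the test because the off-set contains both $(0,0,0)$ and $(0,1,0)$.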
algorithm 1 initial xor-bi-decomposition of the lattice $L\langle f_q(\mathbf{x}), f_r(\mathbf{x})\rangle$ with regard to the variables $x_a$ and $x_b$
require: tvls of $f_q(\mathbf{x})$ and $f_r(\mathbf{x})$ in oda-form
ensure: boolean variable hasxorbd: it is true if the given lattice contains at least one xor-bi-decomposable function and false otherwise
ensure: set of variables (sv) of $x_a$ and $x_b$: variables for which the lattice contains at least one function with a strong xor-bi-decomposition
 1: all_var ← sv_uni($f_q$, $f_r$)
 2: hasxorbd ← false
 3: $x_a$ ← $\emptyset$
 4: while $\neg$hasxorbd $\wedge$ sv_next(all_var, $x_a$, $x_a$) do
 5:     $x_b$ ← $x_a$
 6:     $f_q^{\partial x_a}$ ← isc(maxk($f_q$, $x_a$), maxk($f_r$, $x_a$))
 7:     $f_r^{\partial x_a}$ ← uni(mink($f_q$, $x_a$), mink($f_r$, $x_a$))
 8:     while $\neg$hasxorbd $\wedge$ sv_next(all_var, $x_b$, $x_b$) do
 9:         if te_isc(maxk($f_q^{\partial x_a}$, $x_b$), $f_r^{\partial x_a}$) then
10:             hasxorbd ← true
11:         end if
12:     end while
13: end while
14: return (hasxorbd, $x_a$, $x_b$)
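the pair search of algorithm 1 can be mimicked by brute force on explicit on-sets. this is a minimal sketch under stated assumptions, not the xboole program: python sets of input patterns replace tvls, the function name `initial_xor_bd` is hypothetical, and variables are addressed by their index instead of by sv objects.

```python
from itertools import product

def initial_xor_bd(fq, fr, n):
    """search the first pair (a, b), a < b, that satisfies condition (3)."""
    def flip(p, i):
        return p[:i] + (1 - p[i],) + p[i + 1:]

    def vmax(s, i):
        return s | {flip(p, i) for p in s}

    def vmin(s, i):
        return {p for p in s if flip(p, i) in s}

    for a in range(n):                           # outer loop over xa
        dq = vmax(fq, a) & vmax(fr, a)           # line 6, formula (1)
        dr = vmin(fq, a) | vmin(fr, a)           # line 7, formula (2)
        for b in range(a + 1, n):                # inner loop over xb
            if not (vmax(dq, b) & dr):           # line 9, condition (3)
                return True, a, b
    return False, None, None

space = set(product((0, 1), repeat=3))
f1 = {p for p in space if (p[0] & p[2]) ^ (p[1] & p[2])}
f2 = {(1, 1, 1)}                                 # and of three variables
print(initial_xor_bd(f1, space - f1, 3))         # (True, 0, 1)
print(initial_xor_bd(f2, space - f2, 3))         # (False, None, None)
```

the early return plays the role of the flag hasxorbd, which terminates both loops as soon as one suitable pair is found.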
algorithm 1 uses two nested while-loops for the selection of the variables $x_a$ and $x_b$. the basic set of all variables is prepared in line 1 using the xboole function sv_uni (set of variables union). the sequential selection of the variables $x_a$ and $x_b$ is realized by two calls of the xboole function sv_next (set of variables next variable) that control these while-loops. the variable hasxorbd indicates the boolean result whether the lattice contains at least one function with a strong xor-bi-decomposition. this variable is also used to terminate both while-loops as soon as a strong xor-bi-decomposition is detected. the xboole functions isc (intersection), uni (union), maxk (k-fold maximum), and mink (k-fold minimum) calculate in lines 6 and 7 the mark functions of the derivative of the given lattice with regard to the single variable $x_a$. the xboole function te_isc (test empty intersection) checks in line 9 condition (3) for the strong xor-bi-decomposition with regard to the currently selected variables $x_a$ and $x_b$.
in the case that this condition is satisfied, the control variable hasxorbd is changed to the value true in line 10.

algorithm 2 extends a found initial xor-bi-decomposition to a compact one.

algorithm 2 compact strong xor-bi-decomposition of the lattice $L\langle f_q(\mathbf{x}), f_r(\mathbf{x})\rangle$
require: tvls of $f_q(\mathbf{x})$ and $f_r(\mathbf{x})$; initial svs of $x_a$ and $x_b$
ensure: tvl of $g(\mathbf{x}_a,\mathbf{x}_c)$: decomposition function
ensure: tvls of $h_q(\mathbf{x}_b,\mathbf{x}_c)$ and $h_r(\mathbf{x}_b,\mathbf{x}_c)$: decomposition lattice
ensure: svs of $\mathbf{x}_a$, $\mathbf{x}_b$, and $\mathbf{x}_c$: disjoint sets of variables
 1: all_var ← sv_uni($f_q$, $f_r$)
 2: $\mathbf{x}_c$ ← sv_dif(sv_dif(all_var, $x_a$), $x_b$)
 3: $\mathbf{x}_b$ ← $x_b$
 4: $x_b$ ← $\emptyset$
 5: $f_q^{\partial x_a}$ ← isc(maxk($f_q$, $x_a$), maxk($f_r$, $x_a$))
 6: $f_r^{\partial x_a}$ ← uni(mink($f_q$, $x_a$), mink($f_r$, $x_a$))
 7: $h_0$ ← maxk($f_q^{\partial x_a}$, $\mathbf{x}_b$)
 8: while sv_next($\mathbf{x}_c$, $x_b$, $x_b$) do
 9:     if te_isc(maxk($h_0$, $x_b$), $f_r^{\partial x_a}$) then
10:         $\mathbf{x}_b$ ← sv_uni($\mathbf{x}_b$, $x_b$)
11:         $h_0$ ← maxk($f_q^{\partial x_a}$, $\mathbf{x}_b$)
12:     end if
13: end while
14: $g$ ← isc(sv_get($x_a$), $h_0$)
15: $h_q$ ← maxk(uni(isc($\overline{g}$, $f_q$), isc($g$, $f_r$)), $x_a$)
16: $h_r$ ← maxk(uni(isc($\overline{g}$, $f_r$), isc($g$, $f_q$)), $x_a$)
17: $\mathbf{x}_c$ ← sv_dif(sv_dif(all_var, $x_a$), $\mathbf{x}_b$)
18: $\mathbf{x}_a$ ← $x_a$
19: $x_i$ ← $\emptyset$
20: while sv_next($\mathbf{x}_c$, $x_i$, $x_i$) do
21:     if te_uni(mink($h_q$, $x_i$), mink($h_r$, $x_i$)) then
22:         $\mathbf{x}_a$ ← sv_uni($\mathbf{x}_a$, $x_i$)
23:         $x_i$ ← sv_get($x_i$)
24:         $g$ ← syd($x_i$, $g$)
25:         $h_q$ ← maxk(uni(isc($\overline{x}_i$, $h_q$), isc($x_i$, $h_r$)), $x_i$)
26:         $h_r$ ← maxk(uni(isc($\overline{x}_i$, $h_r$), isc($x_i$, $h_q$)), $x_i$)
27:     end if
28: end while
29: $\mathbf{x}_c$ ← sv_dif($\mathbf{x}_c$, $\mathbf{x}_a$)
30: return ($g$, $h_q$, $h_r$, $\mathbf{x}_a$, $\mathbf{x}_b$, $\mathbf{x}_c$)

initial steps determine all variables (line 1), the basic sets of commonly used variables $\mathbf{x}_c$ (line 2) and dedicated variables $\mathbf{x}_b$ (line 3), and the precondition (line 4) for the selection of variables $x_b$ by means of the xboole function sv_next in line 8. the mark functions of the derivative of the given lattice with regard to the single variable $x_a$ are needed in condition (3) to decide whether the set $\mathbf{x}_b$ can be extended; they are calculated in lines 5 and 6 based on (1) and (2). the help function $h_0$ stores the intermediate result of the k-fold maximum with regard to the already known variables of the set $\mathbf{x}_b$ (line 7).

the while-loop in lines 8 to 13 extends the set $\mathbf{x}_b$ to the maximal possible number of variables of a strong xor-bi-decomposition for the given lattice. condition (3) is verified in line 9 for the temporarily extended set $\mathbf{x}_b$. if this condition is satisfied for the set of variables $\mathbf{x}_b \cup x_b$, the set of variables $\mathbf{x}_b$ is permanently extended in line 10 and the help function $h_0$ is adjusted in line 11.

knowing the maximal set of variables $\mathbf{x}_b$, basic versions of the wanted functions can be calculated:
• $g(x_a,\mathbf{x}_c)$ based on (4) in line 14;
• $h_q(\mathbf{x}_b,\mathbf{x}_c)$ based on (5) in line 15; and
• $h_r(\mathbf{x}_b,\mathbf{x}_c)$ based on (6) in line 16.

in a second while-loop (lines 20 to 28) the set of variables $\mathbf{x}_a$ is extended. initial steps determine the new set of commonly used variables $\mathbf{x}_c$ (line 17), the basic set of variables $\mathbf{x}_a$ (line 18), and the selection variable $x_i$ (line 19) needed to evaluate the possibility of the extension of $\mathbf{x}_a$. condition (14) of theorem 1 is verified in line 21 using formula (2) to determine the off-set of the derivative of a lattice with regard to the single variable $x_i$.
if this condition is satisfied:
• the set of variables $\mathbf{x}_a$ is extended by $x_i$ in line 22 using the xboole function sv_uni (set of variables union);
• $x_i$ is transformed in line 23 from a set of variables into the tvl representing $x_i = 1$ using the xboole function sv_get (set of variables get);
• the new function $g$ is calculated in line 24 based on (20) using the xboole function syd (symmetric difference);
• the new on-set function $h_q(\mathbf{x})$ is calculated in line 25 based on (21); and
• the new off-set function $h_r(\mathbf{x})$ is calculated in line 26 based on (22).

the restriction of the set of commonly used variables $\mathbf{x}_c$ is realized in line 29 outside of the loop, because the unchanged basic set $\mathbf{x}_c$ is needed in the xboole function sv_next in line 20 to select the next variable $x_i$. the complement operations in lines 15, 16, 25, and 26 are realized using the xboole function cpl.

5 example

5.1 the chosen lattice and conditions for the synthesis

[fig. 1. karnaugh-maps of (a) the given lattice $L\langle f_q(\mathbf{x}), f_r(\mathbf{x})\rangle$ with three don't-cares $\varphi$, and (b) the chosen function $y = f(\mathbf{x})$ of both bi-decompositions; columns indexed by $(x_3,x_2,x_1)$, rows by $(x_4,x_5)$.]

figure 1 (a) shows the karnaugh-map of a lattice of eight boolean functions. the simplest multi-level circuit structure for one of these functions must be found using and-, or-, and xor-gates with two inputs, where these inputs can arbitrarily be negated. the gates can be reused to simplify the circuit. a minimal disjunctive form, calculated by means of the well-known quine-mccluskey algorithm, serves as basis for comparison.
the synthesis of the given lattice of functions by bi-decompositions has been realized using both the known non-compact xor-bi-decomposition and the new compact xor-bi-decomposition. using the conditions given in [3, 4, 11] it can be verified that this lattice does not contain any function that has a strong bi-decomposition with regard to any dedicated sets of variables $\mathbf{x}_a$ and $\mathbf{x}_b$ for an and- or an or-gate.

figure 1 (b) shows the function chosen by both the known and the new bi-decomposition approach. two don't-cares are assigned to 0 and the other one to 1. the simplest minimal disjunctive form realizes the function $f_q(\mathbf{x})$ of the lattice, where all don't-cares are assigned to 0.

5.2 synthesis by covering using a minimal disjunctive form

the execution of the quine-mccluskey algorithm results in two minimal disjunctive forms of the same complexity. both of them realize the on-set function $f_q(\mathbf{x})$ and require the same number of gates and levels in a circuit. the chosen minimal disjunctive form is:
\begin{align}
f_q(\mathbf{x}) ={} & (\overline{x}_1\overline{x}_2)x_3 \vee (x_1x_2)(\overline{x}_4x_5) \vee (\overline{x}_1\overline{x}_2)(\overline{x}_4x_5) \vee (x_2\overline{x}_3)(x_4x_5) \vee {} \notag \\
& (x_1x_2)(x_3\overline{x}_4) \vee (x_1\overline{x}_3)(x_4x_5) \vee (\overline{x}_1(x_2\overline{x}_3))(\overline{x}_4\overline{x}_5) \vee (\overline{x}_2(x_1\overline{x}_3))(\overline{x}_4\overline{x}_5) . \tag{23}
\end{align}
the parentheses in the conjunctions in (23) emphasize the chosen two-input and-gates. figure 2 shows the associated circuit structure, in which as many and-gates as possible are reused. the disjunction of eight conjunctions is realized by a tree of seven or-gates. seven and-gates could be reused to build another conjunction. in total there are 18 and-gates. the complete circuit consists of 25 two-input gates on six levels.

5.3 synthesis using the known non-compact xor-bi-decomposition

using condition (3) it was found that the lattice of figure 1 (a) contains at least one function that is xor-bi-decomposable with regard to the single variable $x_a = x_1$ and the dedicated set $\mathbf{x}_b = (x_3,x_5)$. hence, the set of commonly used variables is $\mathbf{x}_c = (x_2,x_4)$.
the decomposition function $g(x_a,\mathbf{x}_c)$ of an xor-bi-decomposition is uniquely specified by (4), and we get
\[ g_1(x_1,x_2,x_4) = x_1 \wedge \overline{(x_2 \wedge x_4)} . \tag{24} \]
it can directly be seen that there is a strong and-bi-decomposition of $g_1(x_1,x_2,x_4)$ into $g_2 = x_1$ and $h_2 = \overline{(x_2 \wedge x_4)}$. no further decomposition is needed for these functions.
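as a side check of the covering baseline from section 5.2, the gate counts of the minimal disjunctive form (23) can be recomputed. the sketch below rests on two assumptions that should be read as such: the karnaugh-map of figure 1 (b) is taken with rows $(x_4,x_5)$ and columns $(x_3,x_2,x_1)$ as read from the figure labels, and the complement bars of (23), which are not legible in the extracted text, are placed so that the form covers the on-set with the stated number of gates. a positive integer k stands for $x_k$, a negative one for $\overline{x}_k$, and every nested pair is one two-input and-gate.

```python
# column labels and rows of the map of fig. 1 (b) (layout is an assumption)
X3 = (0, 1, 1, 0, 0, 1, 1, 0)
X2 = (0, 0, 1, 1, 1, 1, 0, 0)
X1 = (0, 0, 0, 0, 1, 1, 1, 1)
ROWS = {(0, 0): (0, 1, 0, 1, 0, 1, 0, 1),        # key: (x4, x5)
        (0, 1): (1, 1, 0, 0, 1, 1, 0, 0),
        (1, 1): (0, 1, 0, 1, 1, 0, 0, 1),
        (1, 0): (1, 1, 0, 0, 0, 0, 0, 0)}
f = {(X1[c], X2[c], X3[c], x4, x5): y
     for (x4, x5), row in ROWS.items() for c, y in enumerate(row)}

# the eight conjunctions of (23); complement placement is an assumption
TERMS = [((-1, -2), 3), ((1, 2), (-4, 5)), ((-1, -2), (-4, 5)),
         ((2, -3), (4, 5)), ((1, 2), (3, -4)), ((1, -3), (4, 5)),
         ((-1, (2, -3)), (-4, -5)), ((-2, (1, -3)), (-4, -5))]

def value(node, x):
    """evaluate a literal or a nested and-gate for the pattern x."""
    if isinstance(node, int):
        bit = x[abs(node) - 1]
        return bit if node > 0 else 1 - bit
    return value(node[0], x) & value(node[1], x)

def collect(node, gates):
    """add every and-gate of the term to the set (equal gates are shared)."""
    if isinstance(node, tuple):
        gates.add(node)
        collect(node[0], gates)
        collect(node[1], gates)
    return gates

def depth(node):
    return 0 if isinstance(node, int) else 1 + max(depth(node[0]), depth(node[1]))

and_gates = set()
for term in TERMS:
    collect(term, and_gates)
or_gates = len(TERMS) - 1                        # tree of two-input or-gates
levels = max(depth(t) for t in TERMS) + 3        # balanced or-tree of 8 inputs
differs = sorted(x for x, y in f.items()
                 if y != max(value(t, x) for t in TERMS))
print(len(and_gates), or_gates, levels, differs)
```

under these assumptions the count gives 18 and-gates, 7 or-gates, and six levels, and the form differs from the chosen function only in the pattern $(x_1,\ldots,x_5) = (0,0,0,1,0)$, which is consistent with a single don't-care that the bi-decomposition assigns to 1.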
[fig. 2. circuit structure synthesized by quine-mccluskey and reused two-input gates; inputs $x_1,\ldots,x_5$, output $y = f_q(\mathbf{x})$.]

[fig. 3. circuit structure synthesized using the old non-compact xor-bi-decomposition; inputs $x_1,\ldots,x_5$, output $y = f(\mathbf{x})$, internal functions $g_1$, $g_2$, $h_2$, $h_1$, $g_3$, $h_3$, $g_4$, $h_4$.]

the lattice of the decomposition function $h_1$ can be calculated by (5) and (6) and contains in this example the single function
\[ h_1(x_2,x_3,x_4,x_5) = (\overline{x}_3 \wedge (x_4 \oplus x_5)) \oplus (x_2 \oplus x_3) . \tag{25} \]
by means of condition (3) it can be verified that an xor-bi-decomposition of $L\langle h_{1q}, h_{1r}\rangle$ with regard to $x_a = x_2$ and $\mathbf{x}_b = (x_4,x_5)$ exists. hence, only the variable $x_3$ belongs to the set of commonly used variables $\mathbf{x}_c$. the decomposition function $g(x_a,\mathbf{x}_c)$ of this second xor-bi-decomposition was again calculated by (4): $g_3(x_2,x_3) = x_2 \oplus x_3$.
the lattice $L\langle h_{3q}, h_{3r}\rangle$ contains only the single function $h_3(x_3,x_4,x_5) = \overline{x}_3 \wedge (x_4 \oplus x_5)$, for which a strong and-bi-decomposition into $g_4 = \overline{x}_3$ and $h_4 = (x_4 \oplus x_5)$ exists. no further decomposition is needed for these functions. figure 3 shows the synthesized circuit consisting of seven gates on four levels.

5.4 optimized synthesis using the new compact xor-bi-decomposition

for direct comparison, we demonstrate the utilization of a linearly separable variable to get a compact xor-bi-decomposition using the same lattice (shown in figure 1 (a)) as before.

[fig. 4. circuit structure synthesized using the new compact xor-bi-decomposition; inputs $x_1,\ldots,x_5$, output $y = f(\mathbf{x})$, internal functions $g_1$, $g_2$, $h_2$, $h_1$, $g_3$, $h_3$.]

using condition (3) in algorithm 1, the initial xor-bi-decomposition with regard to the single variables $x_a = x_1$ and $x_b = x_3$ is found. in the first part of algorithm 2 (lines 1 to 13) the dedicated set $\mathbf{x}_b$ could be extended to $(x_3,x_5)$ due to the check of the variables $x_2$, $x_4$, and $x_5$ in line 9, embedded in the loop of lines 8 to 13. hence, the basic set of commonly used variables is $\mathbf{x}_c = (x_2,x_4)$, which is determined in line 17.

algorithm 2 finds by condition (14) in line 21, within the while-loop in lines 20 to 28, that the lattice of $h_1$ detected so far contains a function that is linear with regard to $x_2$. hence, $x_2$ is included into the set $\mathbf{x}_a$ in line 22, and the new function $g_1$ is calculated in lines 23 and 24 using the basic function $g'_1$ of (24):
\begin{align}
g_1(x_1,x_2,x_4) &= x_2 \oplus g'_1(x_1,x_2,x_4) \notag \\
&= x_2 \oplus \big( x_1 \wedge \overline{(x_2 \wedge x_4)} \big) \notag \\
&= (x_1 \oplus x_2) \vee (x_2 \wedge x_4) . \tag{26}
\end{align}
using (21) and (22), the new lattice $L\langle h_{1q}, h_{1r}\rangle$ is calculated in lines 25 and 26 of algorithm 2. due to the special case of the completely specified function $h'_1$ of (25), the result is also a completely specified function, which is calculated by (21):
\begin{align}
h_1(x_3,x_4,x_5) &= \max_{x_2} \big( x_2 \oplus h'_1(x_2,x_3,x_4,x_5) \big) \notag \\
&= \max_{x_2} \big( x_2 \oplus (\overline{x}_3 \wedge (x_4 \oplus x_5)) \oplus (x_2 \oplus x_3) \big) \notag \\
&= (\overline{x}_3 \wedge (x_4 \oplus x_5)) \oplus x_3 \notag \\
&= (x_4 \oplus x_5) \vee x_3 . \tag{27}
\end{align}
hence, $h_1(x_3,x_4,x_5)$ depends only on three variables, and the comparison with $g_1(x_1,x_2,x_4)$ confirms that only the variable $x_4$ is shared. in this way the single variable of the dedicated set $\mathbf{x}_a$ is implicitly extended to $\mathbf{x}_a = (x_1,x_2)$.
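the two phases of algorithm 2 can be replayed by brute force on the chosen function. the sketch below is a hedged illustration, not the xboole program: on-sets are stored as frozensets, the function name `compact_xor_bd` and its helpers are hypothetical, and, because the exact positions of the don't-cares are not available here, the completely specified function $f = g_1 \oplus h_1$ built from the final forms of (26) and (27) is used as an assumed stand-in for the lattice. unlike lines 25 and 26 of the listing, both new mark functions are computed from the old ones in a single step.

```python
from itertools import product

N = 5                                    # x1..x5 -> indices 0..4
SPACE = frozenset(product((0, 1), repeat=N))

def on_set(fn):
    return frozenset(p for p in SPACE if fn(*p))

def flip(p, i):
    return p[:i] + (1 - p[i],) + p[i + 1:]

def vmax(s, idxs):                       # k-fold maximum
    for i in idxs:
        s = s | {flip(p, i) for p in s}
    return frozenset(s)

def vmin(s, idxs):                       # k-fold minimum
    for i in idxs:
        s = frozenset(p for p in s if flip(p, i) in s)
    return s

def lit(i):                              # on-set of the single variable
    return frozenset(p for p in SPACE if p[i])

def compact_xor_bd(fq, fr, a, b):
    dq = vmax(fq, [a]) & vmax(fr, [a])               # (1)
    dr = vmin(fq, [a]) | vmin(fr, [a])               # (2)
    xb = [b]
    for j in range(N):                               # extend xb by (3)
        if j != a and j not in xb and not (vmax(dq, xb + [j]) & dr):
            xb.append(j)
    g = lit(a) & vmax(dq, xb)                        # (4)
    ng = SPACE - g
    hq = vmax((ng & fq) | (g & fr), [a])             # (5)
    hr = vmax((ng & fr) | (g & fq), [a])             # (6)
    xa = [a]
    for i in range(N):                               # extend xa by (14)
        if i in xa or i in xb:
            continue
        if not (vmin(hq, [i]) | vmin(hr, [i])):
            xi, nxi = lit(i), SPACE - lit(i)
            g = g ^ xi                               # (20)
            hq, hr = (vmax((nxi & hq) | (xi & hr), [i]),    # (21)
                      vmax((nxi & hr) | (xi & hq), [i]))    # (22)
            xa.append(i)
    xc = [i for i in range(N) if i not in xa and i not in xb]
    return g, hq, hr, xa, xb, xc

g1 = on_set(lambda x1, x2, x3, x4, x5: (x1 ^ x2) | (x2 & x4))   # (26)
h1 = on_set(lambda x1, x2, x3, x4, x5: (x4 ^ x5) | x3)          # (27)
f = g1 ^ h1                              # symmetric difference = xor of on-sets
g, hq, hr, xa, xb, xc = compact_xor_bd(f, SPACE - f, a=0, b=2)
print(xa, xb, xc)                        # [0, 1] [2, 4] [3]
```

started with $x_a = x_1$ and $x_b = x_3$, the sketch reproduces the sets $\mathbf{x}_a = (x_1,x_2)$, $\mathbf{x}_b = (x_3,x_5)$, $\mathbf{x}_c = x_4$ and returns exactly the functions of (26) and (27).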
algorithm 2 explicitly realizes this extension in line 22 using the xboole operation sv uni (set of variables union). the expressions (26) and (27) show that there are or-bi-decompositions for both decomposition functions g1 and h1. figure 4 shows that the circuit structure realized by means of the new method needs only six gates on three levels.

5.5 comparison of the synthesis results

table 1 summarizes the results of the synthesis of the given lattice of boolean functions realized by:
• the covering method using the quine-mccluskey approach to get a minimal disjunctive form, which has been split into two-input gates that are reused as much as possible;
• the bi-decomposition method where the known xor-bi-decomposition of a lattice is restricted to the assignment of a single variable to the dedicated set xa;
• the bi-decomposition method using the new xor-bi-decomposition for a lattice that is able to realize a compact xor-bi-decomposition.

both the needed area and the power consumption are estimated by the number of gates. the benefit of the bi-decomposition in comparison to the covering method is evident: despite the seven reused gates in the covering approach, the number of gates is reduced from 25 to seven in the case of the known bi-decomposition and even to six when the new compact xor-bi-decomposition is used. for the new compact xor-bi-decomposition this is a reduction of the needed area and the power consumption to 24% in comparison to the covering method, or to 85.7% in comparison to the non-compact xor-bi-decomposition used so far.

table 1. comparison of needed area, power consumption, and maximal delay

effect to | used count                          | covering method | known xor-bi-decomposition | new compact xor-bi-decomposition | ratio new compact / covering | ratio new compact / known
area      | number of gates                     | 25              | 7                          | 6                                | 24.0 %                       | 85.7 %
power     | number of gates                     | 25              | 7                          | 6                                | 24.0 %                       | 85.7 %
delay     | number of gates in the longest path | 6               | 4                          | 3                                | 50.0 %                       | 75.0 %

the maximal delay of the synthesized circuit can be estimated by the number of gates in the longest path, which is equal to the number of gate levels. the bi-decomposition also outperforms the covering method regarding the maximal delay. the new compact xor-bi-decomposition reduces the maximal delay to one half in comparison to the covering method, or to 75% in comparison to the known non-compact xor-bi-decomposition.

6 conclusions

lattices of boolean functions provide the possibility to choose the function for which the circuit needs a small area, has a low power consumption, and has a short delay time. the bi-decomposition is a very powerful method to synthesize circuits that improve these parameters in comparison to covering methods. the theory to find compact strong bi-decompositions was so far only known for and- and or-gates, whereas strong xor-bi-decompositions were restricted to a single variable in the dedicated set xa. the results of this paper close this gap of a missing compact xor-bi-decomposition for lattices of boolean functions. the paper provides both the needed new theory and its application in algorithms using xboole [19, 20] for the calculation of compact xor-bi-decompositions for lattices of boolean functions. in a very simple example the gate count (needed area, power consumption) could be reduced to 24 percent in comparison to an exact covering method and to 85 percent in comparison to the known bi-decomposition.
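the ratios quoted for table 1 follow directly from the gate counts and path lengths, and can be rechecked with a few lines. the sketch below uses only the numbers listed in table 1; the dictionary layout and the function name are illustrative choices, not part of the original work.

```python
# gate counts and longest-path lengths as listed in table 1
table_1 = {
    "area":  {"covering": 25, "known": 7, "new_compact": 6},
    "power": {"covering": 25, "known": 7, "new_compact": 6},
    "delay": {"covering": 6, "known": 4, "new_compact": 3},
}

def ratio_percent(value, reference):
    """Return value / reference as a percentage rounded to one decimal."""
    return round(100.0 * value / reference, 1)

ratios = {
    effect: {
        "new_compact/covering": ratio_percent(row["new_compact"], row["covering"]),
        "new_compact/known": ratio_percent(row["new_compact"], row["known"]),
    }
    for effect, row in table_1.items()
}

for effect, r in ratios.items():
    print(effect, r)
```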
for the same example the length of the longest path (maximal delay) could be reduced to one half in comparison to an exact covering method and to 75 percent in comparison to the known bi-decomposition.

references

[1] d. bochmann, f. dresig, and b. steinbach. “a new decomposition method for multilevel circuit design”. in: proceedings of the conference on european design automation. edac ’91. amsterdam, the netherlands: ieee computer society, 1991, pp. 374–377.
[2] t. le. “testbarkeit kombinatorischer schaltungen theorie und entwurf”. written in german, english title: testability of combinational circuits theory and design. phd thesis. tu karl-marx-stadt, germany, 1989.
[3] c. posthoff and b. steinbach. logic functions and equations – binary models for computer science. dordrecht, the netherlands: springer, 2004.
[4] b. steinbach and c. posthoff. boolean differential calculus. synthesis lectures on digital circuits and systems 52. san rafael, ca, usa: morgan & claypool, 2017.
[5] a. mishchenko, b. steinbach, and m. perkowski. “an algorithm for bi-decomposition of logic functions”. in: proceedings of the 38th annual design automation conference. dac ’01. las vegas, nevada, usa: acm, 2001, pp. 103–108.
[6] b. steinbach. “vectorial bi-decompositions of logic functions”. in: proceedings of the reed-muller workshop 2015. rm 4. waterloo, canada, 2015.
[7] b. steinbach and c. posthoff. “vectorial bi-decompositions for lattices of boolean functions”. in: proceedings of the 12th international workshops on boolean problems. iwsbp.
freiberg, germany: freiberg university of mining and technology, 2016, pp. 93–104.
[8] a. thayse. “boolean differential calculus”. in: philips research reports 26 (1971). r 764, pp. 229–246.
[9] m. davio and a. thayse. “boolean differential calculus and its application to switching theory”. in: ieee transactions on computers 22.4 (1973), pp. 409–420.
[10] b. steinbach and c. posthoff. logic functions and equations examples and exercises. springer science + business media b.v., 2009.
[11] b. steinbach and c. posthoff. “boolean differential calculus theory and applications”. in: journal of computational and theoretical nanoscience 7.6 (2010), pp. 933–981.
[12] b. steinbach and c. posthoff. “boolean differential calculus”. in: progress in applications of boolean functions. synthesis lectures on digital circuits and systems 26. san rafael, ca, usa: morgan & claypool, 2010, pp. 55–78.
[13] t. sasao and j. butler. “on bi-decompositions of logic functions”. in: 6th international workshop on logic & synthesis. iwls. granlibakken resort tahoe city, ca, usa, 1997, pp. 1–6.
[14] m. choudhury and k. mohanram. “bi-decomposition of large boolean functions using blocking edge graphs”. in: 2010 ieee/acm international conference on computer-aided design. iccad. 2010, pp. 586–591.
[15] d. cheng and x. xu. “bi-decomposition of logical mappings via semitensor product of matrices”. in: automatica 49.7 (2013), pp. 51–76.
[16] b. steinbach. “generalized lattices of boolean functions utilized for derivative operations”. in: materiały konferencyjne knws’13. knws ’13. łagów, poland, 2013, pp. 1–17.
[17] b. steinbach. “derivative operations for lattices of boolean functions”. in: proceedings of the reed-muller workshop 2013. rm ’13. toyama, japan, 2013, pp. 110–119.
[18] b. steinbach and a. wereszczynski. “synthesis of multi-level circuits using exor-gates”.
in: ifip wg 10.5 workshop on applications of the reed-muller expansion in circuit design. chiba makuhari, japan, 1995, pp. 161–168.
[19] b. steinbach. “xboole a toolbox for modelling, simulation, and analysis of large digital systems”. in: systems analysis and modelling simulation 9.4 (1992), pp. 297–312.
[20] b. steinbach and m. werner. “xboole-cuda fast calculations of large boolean problems on the gpu”. in: problems and new solutions in the boolean domain. ed. by b. steinbach. newcastle upon tyne, uk: cambridge scholars publishing, 2016, pp. 117–149.

facta universitatis
series: electronics and energetics vol. 31, no 1, march 2018, pp. 131-140
https://doi.org/10.2298/fuee1801131p

memory chips and units radiation tolerance dependence on supply voltage during irradiation and test

andrey g. petrov, alexander y. nikiforov, anna b. boruzdina, anastasia v. ulanova, andrey v. yanenko
national research nuclear university mephi (moscow engineering physics institute), moscow, russia

abstract. in this work we investigate how the supply voltage of various memory chips influences their sensitivity to the radiation environment. the main physical mechanisms responsible for radiation-induced degradation at nominal, increased, and decreased supply voltage values are discussed. it is demonstrated that, depending on the supply voltage value during irradiation and subsequent testing, a device's tolerance to data corruption effects in memory circuits, single event latch-up (sel) and hard errors induced by ionizing radiation can vary significantly. we also give some recommendations for performing radiation tests.

key words: space radiation, memory, digital integrated circuits, flash, sram, seu, total dose

1.
introduction

the typical variation of allowable supply voltage values for complex digital cmos integrated circuits (microprocessors, microcontrollers, memory chips, etc.) is within 5 to 10 percent of the nominal one. the device in application can work at any supply voltage within this range. according to data from previous works, the total dose hardness and single event sensitivity can vary significantly depending on the operation conditions [1]-[9]. this fact must be taken into account when assessing the radiation tolerance of microcircuits.

in this work we concentrated our investigations on the dependence of radiation tolerance on supply voltage for the memory segment of digital ics. memory cells or units are a part of the vast majority of digital ics. in some cases memory is the most critical unit of digital ics due to its sensitivity to radiation [1], [10]-[11]. a radiation environment (space, various ground sources, etc.) can have a negative impact on electrical parameters of memory chips and units, such as supply current, output voltage levels, timing parameters, etc. however, the most negative consequences are associated with radiation effects leading to functional failures, such as corruption of data stored in memory or inability to rewrite data. the radiation tolerance level to these functional and parametric failures can depend significantly on the supply voltage of ics, and of memory devices in particular. the radiation behavior of memory devices and units must be taken into account during radiation qualification of digital ics.

received may 4, 2017; received in revised form september 14, 2017
corresponding author: alexander y. nikiforov, national research nuclear university mephi (moscow engineering physics institute), kashirskoe shosse 31, moscow 115409, russia (e-mail: aynik@spels.ru)
the aim of this work is to demonstrate how the supply voltage applied during irradiation and testing can influence the radiation response of memory microcircuits, and to determine the worst-case supply voltages for various critical microcircuit parameters. we will describe the main mechanisms that determine the dependence of radiation response on supply voltage and work out some recommendations for proper selection of supply voltage during radiation tests of various memory devices and digital ics containing memory units.

2. the influence of supply voltage on total dose hardness of memory integrated circuits

previous works have shown that total dose hardness levels of complex multifunctional very large scale integration (vlsi) devices strongly depend on operating conditions during radiation tests [12]-[14]. in this work we consider in more detail the dependence of the total dose tolerance of various memory ics on their supply voltages, not only during irradiation but also during functional tests.

the amount of radiation-generated carriers escaping initial recombination increases with applied electric field, as shown in figure 1 [15]. applying a supply voltage to the ic during irradiation produces a higher electric field in the oxides and induces a higher density of charge trapped in the oxides. thus, applying the maximum allowed supply voltage during irradiation is the worst-case condition for estimating the total dose tolerance of digital ics (in particular memory devices).

fig. 1 experimentally measured fractional hole yield as a function of applied electric field, for a number of incident particles [15]

at the same time, as will be shown below, applying the maximum supply voltage during functional tests after irradiation is not always critical for memory total dose tolerance estimation.
we experimentally compared tid levels when applying the minimum, nominal and maximum allowed supply voltages during functional tests of sram microcircuits. these srams were manufactured on various cmos processes, with supply voltages from 2.7 v to 5.5 v. device irradiation was performed at maximum supply voltage. this experimental comparison (figure 2) shows that applying the minimum allowed supply voltage during the functional test (writing and reading test operations) is the most critical mode for estimating the total dose level of sram functional failure. such behavior is due to the fact that an ic in this mode exhibits the maximum sensitivity to threshold voltage shift and leakage caused by the trapping of the radiation-induced charge in the oxide.

fig. 2 total dose hardness dependence on supply voltage applied during functional tests for various srams

a different behavior was observed for flash and eeprom memories. we investigated the dependence of functional failure on supply voltage during test after irradiation for the flash memory s29gl064n manufactured on a 110 nm cmos process. before irradiation, the test pattern was stored into the memory array. during irradiation the device was kept in storage mode at nominal supply voltage. periodically, irradiation was paused and a reading operation was performed on the memory array at the minimum, nominal and maximum allowed supply voltages. the first differences between written (stored) and read data were observed for the maximum supply voltage (figure 3). all observed errors were bit upsets from the programmed state (charge stored into cell gate, “0” logical level) to the erased state (charge removed from cell gate, “1” logical level). thus, it can be argued that bit upsets were caused by the loss of charge stored into the cell during irradiation. when the radiation-induced charge loss is total, stored information is upset from the programmed state to the erased state.
when the loss of charge is only partial, it leads to a threshold voltage shift of the flash memory cell, as illustrated by the dashed curves in figure 4. during the reading operation of the memory, the voltage on the cell gate has the same value as the supply voltage. therefore, as illustrated in figure 4, applying the maximum supply voltage during irradiation and test will be the most critical for total dose tolerance estimation.

fig. 3 number of flash memory read error bits vs total dose level for various supply voltages applied during test

fig. 4 drain current vs gate voltage for programmed (“0”) and erased (“1”) cell states and cell with some charge (dotted curve)

3. the influence of supply voltage on single event sensitivity of digital integrated circuits

the two main single event effects in digital ics are single event latchup (sel) and single event upsets (seu) in memory units, control registers and data registers. single event upsets in registers can cause a single event functional interrupt (sefi). we estimate the dependence of seu sensitivity on supply voltage for different types of memory devices and units.

3.1. seu sensitivity dependence on supply voltage for sram memory.

in a memory cell, the area sensitive to single event upsets is the drain of off-state transistors [16]-[21]. according to [21], the critical charge for sram cell upset depends on the static noise margin of the device and can be estimated as:

$Q_c = C_{ox} \cdot V_{SNM}$ (1)

where $C_{ox}$ is the capacitance of the gate oxide and $V_{SNM}$ is the static noise margin of the memory cell. the static noise margin decreases as the supply voltage is reduced. therefore, the sensitivity of sram memory ics and units to single event upsets increases with the decrease in supply voltage. such behavior was observed for the xc7z020 configuration memory, as shown in the results presented below.
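a short numerical sketch may help to see how the critical charge estimate of (1), read here as q_c = c_ox · v_snm, translates a smaller static noise margin into a smaller critical charge. the capacitance and the noise margin values below are hypothetical placeholders chosen only for illustration; they are not measurements from this work or from [21].

```python
def critical_charge(c_ox_farad, v_snm_volt):
    """Critical charge in coulombs, estimated as C_ox * V_SNM (equation (1))."""
    return c_ox_farad * v_snm_volt

# hypothetical gate oxide capacitance of 1 fF
C_OX = 1.0e-15

# hypothetical static noise margins (V) at two supply voltages (V)
snm_by_supply = {1.0: 0.30, 0.9: 0.25}

q_crit = {vdd: critical_charge(C_OX, snm) for vdd, snm in snm_by_supply.items()}

for vdd in sorted(q_crit, reverse=True):
    # a lower noise margin gives a lower critical charge, i.e. an easier upset
    print(f"vdd = {vdd} V -> q_c = {q_crit[vdd]:.2e} C")
```

as the text following (1) notes, such a linear estimate cannot replace measurements at each supply voltage of interest.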
however, the critical charge for cell upset does not always show a linear dependence on the supply voltage level. for this reason, experimental results obtained for one supply voltage cannot be extrapolated to another level without experimental estimation [22]. investigation of single event upsets in sram due to neutrons [23] shows that a simple cross section estimation based on critical charge can in some cases give underestimated results.

the results of the investigation on the xc7z020 configuration memory are shown in figure 5, where the seu cross section increases with the decrease in supply voltage during test. for this type of memory, applying the minimum supply voltage during irradiation is the most critical mode for seu.

fig. 5 seu cross section (cm²/bit) vs supply voltage during test and temperature for 120 mev proton irradiation

at the same time, seu investigation results for a 0.25 µm cmos sram 512k×8 (figure 6) and for the xc5vlx50 block and configuration memory (figure 7) show no significant difference in seu cross section at different supply voltages during irradiation. however, it should be noted that a difference in the sensitivity of sram at different voltages can be observed in the let threshold region of the effect. experimental data for linear energy transfer (let) values near the threshold have not been obtained in this case.

fig. 6 cmos 0.25 µm sram 512k×8. seu cross section vs heavy ion let for various supply voltages applied during irradiation

fig. 7 amount of seus in block and configuration memory vs pulsed laser energy for various supply voltages

3.2. seu sensitivity dependence on supply voltage for charge storage memory.

a different dependence of the sensitivity to seus is observed for charge storage memories (flash, eeprom).
we provide data on the irradiation of the flash memory s29gl064n, manufactured on a 110 nm cmos process, by ne ions with an let near 7 mev·cm²/mg at various supply voltages. after irradiation, the stored information was read from the array at the minimum, nominal and maximum allowed supply voltages for this device (figure 8) and compared with the data written before irradiation. as shown by the experimental results in figure 8, the device cross section does not depend on supply voltage during irradiation. at the same time, an increase in seu cross section with supply voltage during test after irradiation has been observed. seus in this flash memory result from partial loss of the charge stored in the memory cell [24], and the increase in cross section with supply voltage can be explained in the same way as for total dose. in this case the maximum supply voltage during test will be the most critical mode for seu sensitivity estimation.

fig. 8 seu cross section vs supply voltage during test and irradiation. ne ions at normal incidence for s29gl064n

3.3. sel sensitivity dependence on supply voltage for digital ics.

memory ics and units do not differ from other digital ics in single event latchup mechanisms or in the dependence of sel sensitivity on supply voltage. our experimental results obtained for the cmos sram memories cy62256 (figure 9) and a 90 nm cmos sram 1m×8 (figure 10) show that the worst case for sel sensitivity is to apply the maximum supply voltage during irradiation.

fig. 9 cy62256. sel cross section (σsel) vs heavy ion let for different supply voltages

fig. 10 90 nm cmos sram 1m×8. sel cross section (σsel) vs heavy ion let for different supply voltages

ic scaling has no influence on the parameters of the parasitic thyristor structure at the origin of the sel mechanism. the switch-on current does not vary significantly for cmos processes with design rules from 180 nm to 65 nm.
operating temperature and supply voltage mainly affect the sel sensitivity of ics [25]. sensitivity to sel decreases as the supply voltage is reduced, which is explained by a decrease in the gain of the parasitic bipolar transistor and a decrease in the collected charge at a lower electric field strength [26]. the influence of the supply voltage is mainly manifested near the sel threshold let. this can be clearly seen from our experimental results presented above in figures 9 and 10. the sel saturation cross section is almost unchanged with supply voltage, while the sel threshold let varies significantly. in addition, a higher supply voltage is more likely to exceed the sel holding voltage, which increases the probability of maintaining the sel condition. results presented in [26] show a sharp increase in sensitivity to sel at supply voltages greater than 1.5 v (figure 11). therefore the maximum supply voltage is the most critical mode for sel sensitivity estimations.

fig. 11 dependence of the sel threshold let on the supply voltage for various design rules

4. recommendations on selection of supply voltage for different types of memory integrated circuits and units during radiation tests

when performing radiation qualification of ics, it is necessary to correctly select the worst-case supply voltage to give conservative estimations of the radiation hardness level. incorrect selection of supply voltage can lead to overestimation of the radiation hardness level. it is important to take into account that the worst-case supply voltages to use during test and during irradiation may be different. based on the results of the investigation and their analysis presented above, we can give recommendations for an appropriate selection of the worst-case supply voltage during certification of memory ics and units. these recommendations are presented in table 1.
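for test planning, the recommendations of table 1 can be kept as a small lookup structure. the sketch below is one possible encoding, assuming the column order of table 1 is total dose (during irradiation, during test), sel, and seu (during irradiation, during test); the key names and the function are hypothetical and serve only as an illustration.

```python
# worst-case supply voltage per memory type, effect and phase, as read from table 1
WORST_CASE_SUPPLY = {
    "sram": {
        ("total dose", "irradiation"): "maximum",
        ("total dose", "test"): "minimum",
        ("sel", "irradiation"): "maximum",
        ("seu", "irradiation"): "minimum",
        ("seu", "test"): "any",
    },
    "charge storage memory": {
        ("total dose", "irradiation"): "maximum",
        ("total dose", "test"): "minimum and maximum",
        ("sel", "irradiation"): "maximum",
        ("seu", "irradiation"): "any",
        ("seu", "test"): "minimum and maximum",
    },
}

def worst_case_supply(memory_type, effect, phase):
    """Return the recommended worst-case supply voltage setting."""
    return WORST_CASE_SUPPLY[memory_type][(effect, phase)]

print(worst_case_supply("sram", "total dose", "test"))
print(worst_case_supply("charge storage memory", "seu", "test"))
```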
table 1 worst-case supply voltages during ics radiation certification

type of memory        | total dose, during irradiation | total dose, during test | single events: sel | single events: seu, during irradiation | single events: seu, during test
sram                  | maximum                        | minimum                 | maximum            | minimum                                | any
charge storage memory | maximum                        | minimum and maximum     | maximum            | any                                    | minimum and maximum

5. conclusion

in this work we have shown a significant influence of the supply voltage of memory ics and units on their sensitivity to total dose and single event upsets. we identified the worst-case supply voltages for the total dose and single event upset sensitivity of memory ics during irradiation and test, and we have shown that they can be different in some cases. consequently, recommendations are also provided for properly selecting the supply voltage to use during radiation qualification of memory ics and units.

references

[1] p. nekrasov, a. demidov, o. kalashnikov, “functional checks of microprocessors during radiation tests”, instruments and experimental techniques, vol. 52, no. 2, pp. 196-199, 2009.
[2] o. kalashnikov, a. demidov, v. figurov, a. nikiforov, s. polevich, v. telets, s. maljudin, a. artamonov “integrating analog-to-digital converter radiation hardness test technique and results”, ieee transactions on nuclear science, vol. 45, no. 6 (1), pp. 2611-2615, 1998.
[3] a. boruzdina, a. ulanova, n. grigor'ev, a. nikiforov, “radiation-induced degradation in the dynamic parameters of memory chips”, russian microelectronics, vol. 41, no. 4, pp. 259-265, 2002.
[4] o. kalashnikov, “statistical variations of integrated circuits radiation hardness”, in proceedings of the radecs conference, 2011, pp. 661-665.
[5] o. kalashnikov, “cmos integrated circuits total dose functional upset sensitivity to operation mode”, in proceedings of the 4th workshop on electronics for lhc experiments, 1998, rome, italy, pp. 484-485.
[6] a. kirgizova, a. nikiforov, n. grigor'ev, i. poljakov, p.
skorobogatov, “dominant mechanisms of transient-radiation upset in cmos ram vlsi circuits realized in sos technology”, russian microelectronics, vol. 35, no. 3, pp. 162-176, 2006.
[7] a. karakozov, o. korneev, p. nekrasov, p. nekrasov, m. sokolov, d. zagryadsky, “bias conditions and functional test procedure influence on powerpc7448 microprocessor tid tolerance”, in proceedings of the radecs conference, 2013, pp. 1-2.
[8] d. bobrovsky, o. kalashnikov, p. nekrasov, “functional control technique for fpga total ionizing dose testing”, in proceedings of the radecs conference, 2012.
[9] o. kalashnikov, a. artamonov, a. demidov, “adc/dac radiation test technique”, workshop record 4th european conf. "radiations and their effects on devices and systems" in proceedings of the radecs conference, palm beach-cannes, france, 1997, pp. 56-60.
[10] v.a. marfin, p.v. nekrasov, and i.o. loskutov, “connection of the parametric and functional control for tid testing of complex vlsi circuit,” in proceedings of the 14th european conf. on radiation and its effects on components and systems, radecs-2015, moscow; russian federation; sept. 14-18, 2015, article number 7365664.
[11] i.o. loskutov, a.b. karakozov, p.v. nekrasov, and a.y. nikiforov, “automated radiation test setup for functional and parametrical control of 8-bit microcontrollers,” in proceedings of the 2015 international siberian conference on control and communications, sibcon 2015 omsk; russian federation; may 21-23, 2015, article number 7147128.
[12] o.a. kalashnikov, and a.y. nikiforov, “tid behavior of complex multifunctional vlsi devices,” in proceedings of the 29th international conference on microelectronics, miel 2014, belgrade, serbia, may 2014, pp. 455-458.
[13] d. boychenko, o. kalashnikov, a. nikiforov, a. ulanova, d. bobrovsky, p. nekrasov, “total ionizing dose effects and radiation testing of complex multifunctional vlsi devices”, facta universitatis, series: electronics and energetics, vol. 28, issue 1, pp.
153-164, 2015.
[14] a. sogoyan, a. artamonov, a. nikiforov, d. boychenko, “method for integrated circuits total ionizing dose hardness testing based on combined gamma- and x-ray irradiation facilities”, facta universitatis, series: electronics and energetics, vol. 27, issue 3, pp. 329-338, 2014.
[15] t.r. oldham and f.b. mclean, “total ionizing dose effects in mos oxides and devices”, ieee transactions on nuclear science, vol. 50, no. 3, pp. 483-499, june 2003.
[16] a.i. chumakov, a.l. vasil'ev, a.a. kozlov, d.o. kol'tov, a.v. krinitskii, a.a. pechenkin, a.s. tararaksin, and a.v. yanenko, “single-event-effect prediction for ics in a space environment,” russian microelectronics, vol. 39, no. 2, pp. 74-78, 2010.
[17] a. i. chumakov, a. a. pechenkin, d. v. savchenkov, a. s. tararaksin, a. l. vasil'ev, and a. v. yanenko, “local laser irradiation technique for see testing of ics”, in proceedings of the 12th european conf. on radiation and its effects on components and systems, radecs-2011, sevilla; spain; sept. 19-23, 2011, pp. 449-453.
[18] a.i. chumakov, “evaluation of multibit upsets in integrated circuits under heavy charged particles,” russian microelectronics, vol. 43, no. 2, 2014, pp. 91-95.
[19] a.b. boruzdina, a.v. ulanova, a.g. petrov, v.a. telets, p. reviriego and j.a. maestro, “verification of sram mcus calculation technique for experiment time optimization,” in proceedings of the 14th european conf. on radiation and its effects on components and systems, radecs-2013, oxford; united kingdom; sept. 23-27, article number 6937393.
[20] d.v. savchenkov, a.i. chumakov, a.g. petrov, a.a. pechenkin, a.n. egorov, o.b. mavritskii, and a.v. yanenko, “study of sel and seu in sram using different laser techniques” in proceedings of the 14th european conf. on radiation and its effects on components and systems, radecs-2013, oxford; united kingdom; sept. 23-27, article number 6937411.
[21] z.
zhang, j liu, y. sun, m. hou, t. tong, s. gu, t. liu, "supply voltage dependence of single event upset sensitivity in diverse sram devices," in proceedings of the 10th international conference on reliability, maintainability and safety (icrms), guangzhou, 2014, pp. 114-119.
[22] j. barak, j. levinson, a. akkerman, e. adler, a. zentner, d. david, y. lifshitz, m. hass, b.e. fischer, m. schlogl, m. victoria, w. hajdas, “scaling of seu mapping and cross section, and proton induced seu at reduced supply voltage,” ieee transactions on nuclear science, vol. 46, no. 6, pp. 1342-1353, dec. 1999.
[23] p. hazucha, k. johansson, c. svensson, “neutron induced soft errors in cmos memories under reduced bias,” ieee transactions on nuclear science, vol. 45, no. 6, pp. 2921-2928, dec 1998.
[24] a.g. petrov, a.l. vasil'ev, a.v. ulanova, a.i. chumakov, and a.y. nikiforov, “flash memory cells data loss caused by total ionizing dose and heavy ions,” central european journal of physics, vol. 12, no. 10, pp. 725-729, 2014.
[25] g. boselli, v. reddy and c. duvvury, "latch-up in 65nm cmos technology: a scaling perspective," in proceedings of the ieee international reliability physics symposium (irps2005), 2005, pp. 137-144.
[26] r. koga, s.j. hansel, w.r. crain, k.b. crawford, s.d. pinkerton, and j. quan, “single event upset and latchup considerations for cmos devices operated at 3.3 volts”, aerospace report no. tr-94(4940)-9, 1995.

facta universitatis
series: electronics and energetics vol. 31, no 4, december 2018, pp.
641-650
https://doi.org/10.2298/fuee1804641n

design of planar plate monopole antenna with vertical rectangular cross-sectional plates for ultra-wideband communications

seyed arash naghdehforushha 1, mahdi bahaghighat 2*, mohammad reza salehifar 2, hossein kazemi 3
1 electrical engineering department, amirkabir university of technology (aut), tehran, iran
2 engineering department, raja university of qazvin, qazvin, iran
3 school of engineering, the university of edinburgh, edinburgh, uk

abstract. in this paper, a novel design for planar plate monopole antennas is proposed with applications to ultra-wideband (uwb) communications. to verify the proposed antenna design, simulations are performed by means of the cst and hfss software tools, showing that the impedance bandwidth is significantly increased by vertical cross-sections. by adding a series of parameters to the vertical cross-sections, the antenna efficiency is effectively enhanced by achieving a return loss of 10 db over the bandwidth range between 3.1 ghz and 10.6 ghz. in addition, our experimental results demonstrate that the fabricated antenna has a return loss performance similar to that obtained by the simulation results.

key words: monopole antenna with vertical cross plates, planar monopole antenna, ultra-wideband (uwb).

1. introduction

in recent years, broadband antennas covering a wide range of the frequency spectrum have found increasing applications. these antennas are particularly applied to high data rate wireless communications [1-4] with high quality of service requirements, such as multimedia transmission [5-7], real time navigation and tracking systems, photography and radars. planar monopole antennas are well suited to broadband applications due to their wide impedance bandwidth, omnidirectional pattern with linear polarization, low cost and simple shape.
the rectangular shape is more appealing because of its simple structure and easier construction compared with circular or elliptical antenna structures. there are several approaches for increasing the impedance bandwidth of rectangular plate-shaped monopole antennas, including beveling and shorting techniques [8-14]. in this work, a novel structure is proposed based on adding vertical cross-linked plates to a simple rectangular monopole antenna, which leads to a remarkable improvement in the impedance bandwidth. for the cross-sections, new parameters such as the length, width and height of the cross-sectional plate are introduced to adjust the impedance bandwidth. all the parameters are optimized to maximize the attainable performance. compared with a simple monopole antenna, the proposed antenna has smaller rectangular plates. this type of antenna exhibits wideband characteristics with a stable pattern over the entire operating bandwidth. a planar disc monopole antenna was developed and studied by honda et al. in 1991 for the japanese television band (90-770 mhz) [6], where the antenna is mounted on a bounded circular ground plate. in this work, we use both the cst and hfss software tools to simulate the proposed antenna. the proposed antenna is suitable for indoor radar applications. compared with the rectangular monopole antenna, the proposed structure provides a higher impedance bandwidth, but the patterns of the rectangular plate monopole are more stable with frequency [8].

received march 25, 2018; received in revised form august 31, 2018
corresponding author: mahdi bahaghighat, engineering department, raja university of qazvin, qazvin, iran (e-mail: m.bahaghighat@aut.ac.ir)

fig. 1 a monopole antenna including vertical plates: (a) front view, (b) top view

2. antenna design

fig. 1 illustrates the schematic of the proposed monopole antenna with cross-sectional vertical plates from the top and front views. the rectangular antenna (l1 × w1) with cross plates (l2 × w2) is placed on top of the circular ground plane of radius r and is fed by an sma connector at a distance of g. in this figure, s is the gap between the cross-sections of the vertical plates. the central part of the bottom edge of the monopole antenna is connected to a pin, which comes out of the ground plane through a hole. this pin facilitates the connection to the feeding source. for the side length of the main monopole antenna, the following formula is adopted in our design based on [8]:

$$ f_l\,(\mathrm{GHz}) = \frac{61.9}{w_1} \qquad (1) $$

where w1 and fl are the side length in mm and the frequency corresponding to the lower edge of the bandwidth, respectively. in our work, the width of the monopole antenna is optimized in order to increase the impedance bandwidth. then, the monopole antenna is augmented by mounting the proposed vertical cross-sectional plates on the substrate. these plates provide new degrees of freedom for a more flexible adjustment of the impedance bandwidth so as to achieve the optimum performance without any reduction in the target 10 db return loss.

3. simulation and results

the return loss of the antenna is simulated with the cst software tool. to this end, the antenna is excited via a waveguide port. the monopole plate is 0.5 mm thick and made of bronze with a thin layer of tin. the main dimensions of the antenna are set to l1 = 15 mm and w1 = 18 mm, and four vertical cross-sections are considered to increase the impedance bandwidth of the main antenna. the antenna is fed through an sma connector mounted on the circular ground plate, which has a radius of 50 mm.
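as a quick numerical check of eq. (1), the sketch below evaluates the lower band-edge estimate for the plate widths used in the simulations. it assumes that eq. (1) reads fl (ghz) = 61.9 / w1 (mm), which is how the formula is interpreted here; the function name is hypothetical and the result is only a first estimate, not a replacement for the full-wave simulations.

```python
def lower_edge_frequency_ghz(side_length_mm: float) -> float:
    """Estimate the lower band-edge frequency in GHz.

    Assumption: eq. (1) is read as f_l (GHz) = 61.9 / w1 (mm).
    """
    if side_length_mm <= 0:
        raise ValueError("side length must be positive")
    return 61.9 / side_length_mm


if __name__ == "__main__":
    # Plate widths taken from the simulated cases (w1 = 18 mm and 20 mm).
    for w1 in (18.0, 20.0):
        f_l = lower_edge_frequency_ghz(w1)
        print(f"w1 = {w1:.0f} mm -> f_l is roughly {f_l:.2f} GHz")
```

for w1 = 18 mm the estimate is close to 3.4 ghz, slightly above the 3.1 ghz lower edge of the uwb band, which is consistent with the need for further tuning of the gap and the cross plates.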
the pin is a wire with a diameter of 1.3 mm and a length of 2 mm. the gap between the antenna and the ground plane is set to 1.5 mm. this value is obtained by analyzing the return loss, taking into account the effect on the impedance bandwidth of the dimensions of the rectangular plate monopole antenna and of its distance from the ground plane. fig. 2 shows the return loss of a simple rectangular monopole antenna for different values of l1 and w1 with a constant ground radius of 50 mm. it can be observed that the impedance bandwidth is bounded between 3.1 and 6.5 ghz. next, with the main dimensions of the antenna fixed at l1 = 15 mm and w1 = 18 mm, the dimensions of the vertical cross-sectional rectangular plates are chosen using fig. 3, which shows the return loss for different lengths and widths of the vertical cross plates; the values giving the best impedance bandwidth are selected. likewise, as shown in fig. 4, the antenna distance from the ground plane is optimized to improve the return loss. the effect of the ground radius on the return loss is investigated in fig. 5. the results suggest that the optimum value for achieving a wide bandwidth is around 50 mm. in this work, we choose d = 5 mm, so that the vertical cross-sections are centered on the antenna width (see fig. 1).

in fig. 6, the current distribution obtained with the cst software tool is shown for both a simple monopole antenna and the proposed antenna. it can be seen that the current of the simple monopole antenna is concentrated near the edges of the antenna. the vertical cross-sectional plates therefore offload, to some extent, the current near the edges toward the vertical plates, thus increasing the impedance bandwidth.
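to make the 10 db return loss target more concrete, the following sketch converts a return loss value into the magnitude of the reflection coefficient, the reflected power fraction and the vswr, using the standard transmission-line relations. the function names are hypothetical and the calculation is independent of the antenna geometry; it only illustrates what the threshold means.

```python
def reflection_magnitude(return_loss_db: float) -> float:
    """|Gamma| from return loss in dB (positive number): |Gamma| = 10^(-RL/20)."""
    return 10.0 ** (-return_loss_db / 20.0)


def reflected_power_fraction(return_loss_db: float) -> float:
    """Fraction of incident power reflected at the feed: |Gamma|^2 = 10^(-RL/10)."""
    return 10.0 ** (-return_loss_db / 10.0)


def vswr(return_loss_db: float) -> float:
    """Voltage standing wave ratio: (1 + |Gamma|) / (1 - |Gamma|), for RL > 0."""
    if return_loss_db <= 0:
        raise ValueError("return loss must be positive for a finite VSWR")
    gamma = reflection_magnitude(return_loss_db)
    return (1.0 + gamma) / (1.0 - gamma)


if __name__ == "__main__":
    rl = 10.0  # target return loss in dB used in the text
    print(f"|Gamma| = {reflection_magnitude(rl):.3f}")
    print(f"reflected power = {100 * reflected_power_fraction(rl):.0f} %")
    print(f"VSWR = {vswr(rl):.2f}")
```

a return loss of 10 db therefore corresponds to about 10 % of the incident power being reflected and a vswr just below 2, which is the usual reason this threshold is used to define the impedance bandwidth.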
the maximum achievable gain over the considered frequency range is computed with the hfss software tool and the results are shown in fig. 7. the simulated and measured radiation patterns for a monopole antenna with vertical rectangular plates are shown in figs. 8 and 9. the omnidirectional pattern is shown in fig. 8(c).

fig. 2 return loss (db) versus frequency (3.1-10.6 ghz) of a simple monopole antenna for (l1, w1) = (18, 18), (15, 20) and (15, 18) mm, with a ground radius of 50 mm

fig. 3 antenna return loss (db) versus frequency for (l2, w2) = (8, 10), (8, 8), (7, 10) and (7, 8) mm, with l1 = 15 mm, w1 = 18 mm, g = 1.5 mm, s = 5 mm, d = 5 mm and r = 50 mm

fig. 4 antenna return loss (db) versus frequency for different heights g = 2.5, 2 and 1.5 mm, with l1 = 15 mm, w1 = 18 mm, l2 = 7 mm, w2 = 8 mm, s = 5 mm, d = 5 mm and r = 50 mm

fig. 5 antenna return loss (db) versus frequency for different radii r = 50, 40 and 30 mm of the circular ground plane, with l1 = 15 mm, w1 = 18 mm, l2 = 7 mm, w2 = 8 mm, s = 5 mm, d = 5 mm and g = 1.5 mm

4. experimental results

our antenna sample is implemented based on the parameters given in fig. 6.

fig. 6 comparison of current distribution (a/m) for three frequencies: (a) 4 ghz, (b) 7 ghz, and (c) 9.5 ghz, shown for the simple monopole antenna (l1 = 15 mm, w1 = 18 mm, r = 50 mm) and for the monopole antenna with rectangular cross-section plates (l1 = 15 mm, w1 = 18 mm, l2 = 7 mm, w2 = 8 mm, s = 5 mm, d = 5 mm and r = 50 mm)

fig. 7 gain (db) versus frequency (ghz) of the monopole antenna with vertical cross-sections, with l1 = 15 mm, w1 = 18 mm, l2 = 7 mm, w2 = 8 mm, s = 5 mm, d = 5 mm, g = 1.5 mm and r = 50 mm

fig. 8 radiation pattern (directivity in db) of the monopole antenna with vertical crossed plates at 4 ghz, 7 ghz and 9.5 ghz for three sections, with l1 = 15 mm, w1 = 18 mm, l2 = 7 mm, w2 = 8 mm, s = 5 mm, d = 5 mm, g = 1.5 mm and r = 50 mm: (a) phi = 0°, versus theta, (b) phi = 90°, versus theta, and (c) theta = 90°, versus phi

fig. 9 measured co- and cross-polarization radiation patterns of the proposed structure at 4 ghz: (a) e-plane (xz-plane) and (b) h-plane (xy-plane), with l1 = 15 mm, w1 = 18 mm, l2 = 7 mm, w2 = 8 mm, s = 5 mm, d = 5 mm, g = 1.5 mm and r = 50 mm

the prototype of the antenna fabricated in this work is shown in fig. 10. it is fed through the sma connector mounted on the ground plane. this monopole antenna is made of tin-plated bronze with a thickness of 0.5 mm. the length of the sma connector pin is 2 mm and the ground plane thickness is 0.5 mm.

fig. 10 the proposed antenna prototype

for comparison, the return loss simulated with the cst software and that measured with the network analyzer are shown in fig. 11.
the discrepancy between the results arises primarily because no losses are included in the simulations, whereas in practice there are inevitable errors due to the non-ideal fabrication process and the laboratory environment.

fig. 11 comparison between the measured and simulated return loss (db) versus frequency (3.1-10.6 ghz) for the proposed monopole antenna with vertical rectangular plates

5. conclusion

in this paper, a novel method is presented based on designing vertical rectangular cross-sections for a monopole antenna. it is further shown in [2] that the new design structure reaches the same property with two fewer protruding plates. the return loss results show that, by tuning the additional parameters introduced by the rectangular vertical cross-sections, a return loss of 10 db can be achieved over an ultra-wide bandwidth. the main outcome of this research is a substantial enhancement of the impedance bandwidth of the monopole antenna.

references

[1] a. jalali-deel, v. nayyeri, m. soleimani, and s.-a. naghdehforushha, "modified current distribution for analysis of spiral antennas," iet microwaves, antennas & propagation, vol. 11, pp. 1583-1586, 2017.
[2] s. a. naghdehforushha, h. oraizi, f. hojjat-kashani, and a. j. deel, "design of a rectangular metallic monopole antenna with protruding normal plates for applications in uwb communication," progress in electromagnetics research, vol. 51, pp. 161-167, 2014.
[3] s. a. naghdehforushha and g. moradi, "plasmonic patch antenna based on graphene with tunable terahertz band communications," optik-international journal for light and electron optics, 2017.
[4] s. a. naghdehforushha and g.
moradi, "design of plasmonic rectangular ribbon antenna based on graphene for terahertz band communication," iet microwaves, antennas & propagation, 2017.
[5] m. bahaghighat and s. a. motamedi, "it-mac: enhanced mac layer for image transmission over cognitive radio sensor networks," international journal of computer science and information security, vol. 14, p. 234, 2016.
[6] honda, satoshi, michiaki ito, hajime seki, and yosbio jinbo. "a disk monopole antenna with 1: 8 impedance bandwidth and omnidirectional radiation pattern." in proceedings of the international symposium on antennas and propagation japan, vol. 4, pp. 1145-1145. institute of electronics, information & communication engineers, 1992.
[7] m. bahaghighat and s. a. motamedi, "vision inspection and monitoring of wind turbine farms in emerging smart grids," facta universitatis, series: electronics and energetics, vol. 31, pp. 287-301, 2018.
[8] m. ammann, "square planar monopole antenna," 1999.
[9] e. antonino-daviu, m. cabedo-fabres, m. ferrando-bataller, and a. valero-nogueira, "wideband double-fed planar monopole antennas," electronics letters, vol. 39, p. 1635, 2003.
[10] m. ammann, "control of the impedance bandwidth of wideband planar monopole antennas using a beveling technique," microwave and optical technology letters, vol. 30, pp. 229-232, 2001.
[11] m. ammann and z. n. chen, "a wide-band shorted planar monopole with bevel," ieee transactions on antennas and propagation, vol. 51, pp. 901-903, 2003.
[12] w.-s. lee, d.-z. kim, k.-j. kim, and j.-w. yu, "wideband planar monopole antennas with dual band-notched characteristics," ieee transactions on microwave theory and techniques, vol. 54, pp. 2800-2806, 2006.
[13] m. ammann and z. n. chen, "wideband monopole antennas for multi-band wireless systems," ieee antennas and propagation magazine, vol. 45, pp. 146-150, 2003.
[14] h. hassani and s.
mazinani, "wideband planar plate monopole antenna," in passive microwave components and antennas, ed: intech, 2010.

facta universitatis series: electronics and energetics vol. 32, no. 3, september 2019, pp. 387-402 https://doi.org/10.2298/fuee1903387p © 2019 by university of niš, serbia | creative commons license: cc by-nc-nd

transformative and disruptive role of local direct current power networks in power and transportation sectors *

prahaladh paniyil 1, rajendra singh 1,2, amir asif 1, vishwas powar 1, guneet bedi 1, john kimsey 1
1 holcombe department of electrical and computer engineering, clemson university, clemson, sc 29634 usa
2 department of automotive engineering, clemson university, clemson, sc 29607 usa

abstract. the power sector is about to undergo a major disruptive transformation. in this paper, we discuss the best possible energy solution for addressing the challenges of climate change and the eradication of energy poverty. this paper focuses on decentralized power generation, storage and distribution through photovoltaics and lithium batteries. it presents the need for local direct current (dc) power through the factors driving this change. the importance of local dc power in the transportation sector is also established. finally, we conclude with data bolstering our argument for the paradigm shift in the power network.

key words: photovoltaics, lithium-ion batteries, electric vehicles, nano-grid, internet-of-things, blockchain

1. introduction

renewable energy is the key to a sustainable power network for the future. the major sources of renewable energy today are solar energy and wind energy. there is no direct competition between solar and wind energy, since solar works during the daytime and wind energy works mostly during the night. offshore wind power generation and long-haul transmission of wind power are not cost effective. only local generation of wind power is cost effective.
thus, local generation and distribution of solar and wind energy-based power can provide an effective source of clean power for mankind. the focus of this paper, however, is on solar energy based photovoltaic (pv) systems and a power network consisting of pv and lithium-ion battery systems.

received february 26, 2019; received in revised form may 20, 2019
corresponding author: prahaladh paniyil, holcombe department of electrical and computer engineering, clemson university, clemson, sc 29634 usa (e-mail: ppaniyi@clemson.edu)
* an earlier version of this paper was presented at the 4th virtual international conference on science, technology and management in energy, energetics 2018, october 25-26, niš, serbia [1]

alternating current (ac) has dominated power networks around the world ever since it won the battle against direct current (dc) in the late 19th century. the invention of the transformer allowed ac power to be transmitted over long distances with minimal losses by stepping voltages up and down with ease. however, the scenario is very different today. most of the loads used today, except for a few inductive ones, are dc powered. owing to the progress in photovoltaic (pv) and lithium-ion battery technologies, we can generate and store dc power locally. minimal losses are encountered in local direct current power networks, and the use of dc loads is much more energy efficient than the current ac network. pvs coupled with batteries provide an ultra-low-cost, secure and self-sufficient power network. no other energy source can currently match the declining costs of pvs, as they work on the principle of free-fuel-based solar energy. batteries are experiencing a similar cost trend owing to the increasing demand for electric vehicles across the world. fig. 1 displays the cost reduction trends for pv and lithium-ion batteries.
we discuss the current status of pvs and batteries in section 2.

2. current status of photovoltaics and batteries

the exponential growth of the pv and battery industries, leading to a substantial reduction in their cost, can drive the paradigm shift from ac to dc power. in 2018, global pv installations reached 108 gw [3]. the university of new south wales in sydney has signed a 15-year agreement to have all its energy demands met by solar pv. under this agreement, the university will purchase 124,000 mwh of electricity from a future solar farm called sunraysia that is being constructed in the state. through this step, the university aims to achieve its goal of carbon neutrality of energy use by 2020. the facts about this agreement can be found in [4].

fig. 1 cost reduction trends of photovoltaic modules and lithium batteries [2]

masdar, abu dhabi's renewable energy company, and its french partner edf have submitted the lowest bid in the world for solar power generation, at 1.79 us cents per kilowatt-hour (kwh), for a solar project in saudi arabia [5]. fig. 2 shows data from last year in which there is a clear exponential decrease in the cost of pv systems and an exponential increase in the installation volume across the globe. at the end of 2018, the price of monocrystalline silicon panels had fallen to $0.25 per watt [6].

fig. 2 cost vs installation volume for pv systems [7]

due to the demand for evs, consumer products and storage for pv power, the battery industry is experiencing a similar trend in cost and manufacturing. lithium batteries are an attractive option for these markets as they have the highest energy density per weight. lithium batteries are already dominant in the consumer goods market, for example in mobile phones and laptops. they are also considered extremely viable for the ev industry owing to their weight benefits.
to meet this increasing demand for lithium batteries, gigawatt-scale factories, such as tesla's and panasonic's $5 billion gigafactory, are being built across the globe [8]. according to fig. 3, the average lithium battery cost is forecast to fall below $100 per kwh by 2022. in our 2014 paper [10], we concluded that if the trends of pv growth continued, the cost of pv electricity with storage would reach $0.02 per kwh within 8-10 years (2022-2024). the data available today indicate that the goal of $0.02 per kwh will be achieved well ahead of the predicted time.

fig. 3 lithium-ion battery cost vs energy density [9]

3. need for local dc power

the electricity industry is at the cusp of a dramatic transformation. the drivers for this paradigm shift are real-time grid monitoring, the emergence of micro-grids and nano-grids in place of the centralized integrated electric grid, improved energy efficiency, cyber security in the grid, weather tolerant electricity infrastructure and intelligent loads [11]. we discuss the importance of these factors and their fulfillment through a local dc power network.

3.1. real-time grid monitoring

the concept of localized pv and battery-based dc power is further reinforced by the connectivity aspect of the internet-of-things (iot). iot is a boon in the domain of predictive analysis. the data collected over time can be utilized for predictive analysis to generate more efficient future outcomes. a solar energy company can install various iot-based sensors on the solar panels to monitor their performance and provide real-time insight. since these sensors can collect massive amounts of data, companies can use them to gain more granular oversight of their systems [12]. various iot-based sensor techniques are available for monitoring solar energy grids for maximum power efficiency.
in [13], an iot-based network is created to monitor and control a smart farm that uses solar panels as its power source. like pv systems, batteries can also be monitored through iot-based devices for enhanced performance in the grid. in [14], a battery monitoring system for the grid is developed using iot. the combination of real-time monitoring of the energy grid and the use of intelligent dc loads can lead to immense power savings compared with current power networks, by optimizing the power-time function of a building's grid access.

3.2. emergence of micro-grid and nano-grid in place of centralized energy grids

in remote places, where the number of consumers is relatively small, it is quite challenging to draw transmission lines or to operate a generator that requires fuel delivery. in such instances, the pv and battery-based dc nano-grid system can provide the optimum solution by eliminating the transmission challenges and offering almost hassle-free operation. this holds the key to energy accessibility even for the people of underdeveloped economies, where power for everyone is still a genuine issue [15]. thus, such a decentralized power network empowers economic growth, creates a global middle class and establishes social justice. local dc nano-grids and micro-grids can operate at a lower voltage (below 1500 v), and thus eliminate the environmental, health and safety issues associated with high voltage ac transmission and distribution. local distribution will also remove the need for the expensive tree pruning and vegetation clearance associated with high voltage ac lines running through forests. there is no significant safety issue for 48 v dc applications.
for data centers and other general-purpose higher voltage applications, several companies supply dc power distribution hardware operating at 380 v, including circuit breakers with rated currents ranging from 15 a to 2,500 a [16]. local generation, storage and transmission of power do not require a massive transmission network infrastructure and its associated investment. the operational cost of a local network is greatly reduced because long-distance transmission networks are excluded.

3.3. improved energy efficiency

most of the loads that we use today (except for a few inductive ones) are dc powered. with the existing centralized ac power network, power must be converted from ac to dc at the device level. this conversion results in power losses at each stage. as reported by singh and shenai [17], over 30 % of electrical power can be saved by converting all appliances currently running on ac to high-efficiency, dc-internal technology. implementation of local dc power through pv and batteries can eliminate the majority of the conversion losses, leading to a much more efficient network. fig. 4 shows the extra internal circuitry required to convert power from ac to dc in a 35 w led bulb. the amount of power wasted in generating ac power from a dc source like pv is discussed in detail in section 3.5.

fig. 4 conversion circuitry for ac to dc conversion for a 35w led bulb [18]

a local dc network uses dc power at both generation and utilization. the elimination of conversion components leads to a lower capital cost for the network. also, with fewer components, the whole network has a considerably lower probability of component failure. ultimately, this provides enhanced system reliability. another important benefit of having fewer components in the system is a smaller attack surface for cyber-attacks.
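the effect of stacked conversion stages can be illustrated with a minimal sketch that multiplies per-stage efficiencies along a power path. the stage efficiencies below are illustrative assumptions, not values from the source or from [17]; the function name is hypothetical.

```python
import math


def chain_efficiency(stage_efficiencies):
    """Overall efficiency of conversion stages connected in series."""
    for eta in stage_efficiencies:
        if not 0.0 < eta <= 1.0:
            raise ValueError("each stage efficiency must be in (0, 1]")
    return math.prod(stage_efficiencies)


if __name__ == "__main__":
    # Assumed values for illustration only.
    ac_path = [0.95, 0.85]  # pv inverter (dc->ac), then device rectifier (ac->dc)
    dc_path = [0.97]        # single dc-dc stage between pv/battery and load

    eta_ac = chain_efficiency(ac_path)
    eta_dc = chain_efficiency(dc_path)
    print(f"ac path delivers {100 * eta_ac:.1f} % of generated energy")
    print(f"dc path delivers {100 * eta_dc:.1f} % of generated energy")
    print(f"extra loss on ac path: {100 * (eta_dc - eta_ac):.1f} percentage points")
```

because the stage efficiencies multiply, every additional conversion step lowers the delivered energy, which is the mechanism behind the savings argued for in this section; the actual figures depend on the hardware used.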
we discuss the overall aspect of cyber security in the following sub-section.

3.4. cyber security in the grid

pv based dc microgrids are less vulnerable to cyber-attacks [19]. it is difficult for a remote attacker to access an isolated and self-contained pv based dc microgrid/nanogrid. situational intelligence, real-time monitoring, physical security and user control can all be incorporated in a pv based dc microgrid/nanogrid. large grids and distribution systems require many nodes where monitoring data are generated and collected, and from which commands are sent to be executed. this transmission and reception of data creates openings for malicious intruders, who might gain access to systems or data through sophisticated attacks [19]. in the age of the internet of things (iot) and remotely connected servers, an attack may come from any point, even from the far ends of the internet. pv based dc microgrids/nanogrids will employ intelligent control systems that are local and can remain secure from cyber-attacks. a leading networking company, cisco, reported that about 73% of it professionals in (centralized) utility services experienced a security breach, whereas the average for other industries is 55% [20]. hence, by decentralizing the energy grid through local dc power, the risk of cyber-attacks on the grid can also be minimized.

3.5. energy wasted in current ac based pv systems

the current power network is based on ac power. the majority of the loads that we use today draw ac power from the grid. if pv systems are used as the source of power and batteries for storage, both of these sources operate on dc power. with the existing electrical system, this dc power must be converted into ac power and supplied to the loads. solar inverters are used for this conversion from dc to ac power. when considering inverters, their sizing factor must also be taken into consideration.
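as a rough illustration of why inverter sizing matters, the sketch below computes the ratio of the array rating to the inverter rating and the energy clipped when the array output exceeds the inverter rating. clipping is only one loss mechanism and is assumed here for illustration; the source does not state which mechanisms its simulations include. the hourly power profile and the function names are hypothetical, and inverter conversion efficiency is ignored.

```python
def sizing_factor(array_kw: float, inverter_kw: float) -> float:
    """Ratio of the array's rated power to the inverter's rated power."""
    if inverter_kw <= 0:
        raise ValueError("inverter rating must be positive")
    return array_kw / inverter_kw


def clipped_energy_kwh(dc_power_kw, inverter_kw: float, step_h: float = 1.0) -> float:
    """Energy lost when the dc power exceeds the inverter rating (clipping only)."""
    return sum(max(p - inverter_kw, 0.0) for p in dc_power_kw) * step_h


if __name__ == "__main__":
    array_kw, inverter_kw = 150.0, 100.0
    # Hypothetical hourly dc output of the array over one day, in kW.
    profile_kw = [0.0, 40.0, 90.0, 130.0, 145.0, 120.0, 70.0, 20.0]

    available = sum(profile_kw) * 1.0
    lost = clipped_energy_kwh(profile_kw, inverter_kw)
    print(f"sizing factor = {sizing_factor(array_kw, inverter_kw):.2f}")
    print(f"clipped energy = {lost:.0f} kWh of {available:.0f} kWh available")
```

a real estimate would replace the hypothetical profile with measured irradiance and temperature data and would include the inverter efficiency curve.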
the sizing factor is the ratio of the rated wattage of the solar panel array to the rated wattage of the inverter. for instance, a 150-kw array connected to a 100-kw inverter has a sizing factor of 1.5. simulations estimating the energy loss due to the oversizing of inverters yielded the results in fig. 5. the energy data are generated using clemson's irradiance data for worst-case and best-case operating temperatures. the inverter sizing factors are derived from inverter efficiencies obtained from [21].

fig. 5 energy loss and output vs sizing factor

3.6. weather tolerant electricity infrastructure

recently, the united states and other nations have been witnessing nature's wrath through frequent hurricanes and hail storms that cause power outages in many parts of the world. recently, nrel's main campus in golden, colorado was hit by a hail storm. only one panel was reported broken among the 3000 panels installed on the roof of a net-zero energy building [22]. to further support the claim of resiliency of pv based solar systems in times of natural disasters, hurricane irma's path can be taken into consideration. a 650-kw rooftop solar array on san juan's va hospital continued to operate at 100% post-storm even after being exposed to wind speeds of about 180 mph [23]. resiliency to harsh weather conditions is not the only advantage of pv and battery-based power networks. a power failure at a plant in a centralized ac power network impacts the very large area served by that network. with decentralized local dc power grids, the area impacted by such failures would be much smaller. pv and battery-based local dc power networks are also much more resilient to geomagnetic storms caused by coronal mass ejections (cmes) [10].

3.7. intelligent loads

the advent of iot is a major technological revolution. every sector is working on incorporating smart devices embedded with sensors that relay information over the internet. from home appliances to automated robotics in manufacturing plants, iot-based products are being designed and implemented everywhere. the energy sector is following the same path. as already discussed in this paper, iot-based sensors enable real-time monitoring of the grid for efficient energy output. in addition, iot-based intelligent appliances can further enhance energy efficiency through demand-based utilization of power. such intelligent loads/appliances are already commercially available. these loads can monitor the electricity generated by solar and set their on/off times based on peak energy generation periods. for example, an automated washing machine can be instructed by the intelligent control hub to turn on only during periods of minimum load utilization and maximum free-fuel-based solar energy generation. hence, a local dc power network has the incentive of coupling easily with the abundance of iot-based intelligent loads that are entering everyday life. to further bolster this idea of incentivization, we can refer to a recent article on the new range of evs from bmw. according to the article, bmw i3 evs can be turned into cash cows by delaying their charging time to offset peak demand and align with the maximization of renewable energy utilization [24].

4. importance of dc equipment and appliances in local dc power networks

the availability of and options for dc equipment are a major concern among consumers as well as policy makers. at present, appliances are mostly sold for the ac standard (110 v or 220 v). though most appliances use dc internally, the connection to the wall outlet is still ac.
with suitable policy, the market for dc appliances can grow and many manufacturing industries will find potential, even untapped, markets. as mentioned in a previous publication [25] inside modern electronic equipment and appliances, a portion of the printed circuit board (pcb) is dedicated for converting ac into dc, and dc power is used in most areas of the pcb. the use of local dc power in place of ac will eliminate rectifier, smoothing filters, etc. from pcb. in addition to the cost reduction of components, time will be saved when we do not need to solder the ac/dc conversion rectifier, filters, etc. on the pcb. manufacturing of all the loads that operate on dc power will provide significant cost reduction. for inductive loads connected to local dc power network, an internal inverter can be added as part of the inductive load. 5. local dc power in transportion sector the recent data released by environmental protection agency (epa) signifies the growth of transportation sector as one of the major contributor to us greenhouse gas (ghg) emissions [26][27]. transportation surpassed various economic sectors like electricity and industry to contribute for 28.5% of total us ghg emissions in 2016 [27]. this has raised several concerns about the petroleum-based transportation sector and has called for transformation. in case of surface transportation, there are four major drivers of change viz. – electrification, diverse mobility, autonomous driving and connectivity. transformative and disruptive role of local direct current power... 395 the need for a cleaner, cheaper and cost-efficient alternative technology for surface transportation has created the need for electrification. increased carbon emissions and rising global temperatures have led to stricter laws and preventive measures by many eu countries [28]. low cost and improved battery technology has also fueled electrification of surface transport sector which in-turn benefits the pv sector. 
this establishes a triangular relationship between pv, evs and batteries. furthermore, the maintenance and operating costs of electric dc vehicles are much lower than those of traditional ic engine-based automobiles, which improves the return on investment (roi) [29][30]. as the sharing economy expands and consumer preferences change, the standard one person-one car model will continue to evolve from outright purchase or lease to rentals and carpooling, thus creating diverse mobility. the new era of iot, wireless sensor networks, dc electric vehicles and improved battery technology has made it possible to envision autonomous driving and connected vehicle technology. the rise in ev sales has accentuated the concept of localized dc charging using pv and battery-based systems. localized charging for electric vehicles can be achieved in two ways: plug-in charging and onboard on-the-go charging.
5.1. plug-in charging
plug-in charging of electric vehicles is achieved by installing a charging station, also known as electric vehicle supply equipment (evse), at home or by accessing a public charging station in the neighborhood. the primary objective of local plug-in charging is to use the existing wall outlet or some additional circuitry to charge the battery of the electric car. three types of charging stations are available: level 1, level 2 and dc fast chargers. a key observation is that both level 1 and level 2 chargers operate on ac input power (120 v and 240 v, respectively). the plug-in ev's internal battery charger converts this into dc power to charge the car's traction battery [31]. this results in conversion losses and increases the charge time of the battery. dc fast charging usually implements a direct link between a dc power source (either solar or battery) and the battery charging circuitry, thereby eliminating the intermediate conversion and rectification stages [32].
this considerably reduces charge time and has given rise to work on smart control and pv-based techniques for dc fast charging stations. vehicle-to-grid (v2g) and grid-to-vehicle (g2v) technologies that ensure smart control and flow of energy to and from the grid are also being implemented. however, reducing conversion and step-up/step-down losses is the primary reason for utilizing direct localized dc systems instead of grid-dependent ac [32].
5.2. onboard pv charging
on-board dc charging of electric vehicles aims at continuously charging the batteries on the go. in this technique, energy is usually harvested from an on-board solar panel, and smart control techniques are implemented for efficient charging of the traction battery. this method is challenging, as it requires smart control algorithms to mitigate the variability of solar energy while the vehicle is moving from point a to point b. partial shading, maximum power point tracking, and the variability of irradiance and temperature all have to be handled by intelligent control techniques and algorithms [33][34]. another critical challenge lies in retrofitting the existing ev drivetrain design to integrate pv modules and the additional pv circuitry. an integrated solar panel can be bulky and may undermine speed and performance. nevertheless, many lightweight, short-distance and low-speed prototypes have successfully integrated pv in their design; the ford c-max solar, toyota prius, volkswagen id buzz and lightyear one are some examples. according to [35], a lightweight, aerodynamically efficient, short-distance car can travel up to 60 km using bulk si-pv modules. as battery technologies improve and design principles are refined, there will be many more locally dc generated and powered automobiles, relying primarily on pv modules for functional power.
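the maximum power point tracking mentioned above can be illustrated with a minimal sketch. the paper does not state which algorithm its controllers use; the code below shows perturb-and-observe, one widely used approach, applied to a toy single-diode panel curve. the panel parameters, step size and function names are illustrative assumptions, not values from the experiments.

```python
import math


def pv_current(voltage, irradiance=1.0):
    """Current (A) of a toy PV panel at a given terminal voltage (V).

    Ideal single-diode curve with made-up parameters, chosen only so the
    curve resembles a panel of roughly 320 W; not data from the paper.
    """
    i_sc = 9.0 * irradiance   # short-circuit current, A (assumed)
    i_0 = 1e-9                # diode saturation current, A (assumed)
    v_t = 1.9                 # lumped thermal voltage of the cell string, V (assumed)
    return max(0.0, i_sc - i_0 * (math.exp(voltage / v_t) - 1.0))


def perturb_and_observe(v_start=20.0, step=0.2, iterations=200, irradiance=1.0):
    """Return (voltage, power) reached by a basic perturb-and-observe loop."""
    voltage = v_start
    power_prev = voltage * pv_current(voltage, irradiance)
    direction = 1.0
    for _ in range(iterations):
        voltage += direction * step            # perturb the operating point
        power = voltage * pv_current(voltage, irradiance)
        if power < power_prev:                 # power dropped: reverse direction
            direction = -direction
        power_prev = power
    return voltage, power_prev


if __name__ == "__main__":
    v_mpp, p_mpp = perturb_and_observe()
    print(f"operating point: {v_mpp:.1f} V, {p_mpp:.1f} W")
```

with a fixed step the loop ends up oscillating around the maximum power point; under fast-changing irradiance or partial shading this simple scheme can settle on the wrong operating point, which is one reason the more elaborate control techniques cited above are studied.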
our work on the benefits of localized dc power in the field of transport is presented in section 6.
5.3. air transportation
apart from surface transportation, advancements in battery technology are also impacting air transportation. in 2018, boeing unveiled a new unmanned electric vertical-takeoff-and-landing (evtol) cargo air vehicle prototype that will be used to test and evolve boeing's autonomy technology for future aerospace vehicles [36]. it is designed to transport a payload of up to 500 pounds for possible future cargo and logistics applications. on january 23, 2019, boeing announced that it had recently conducted the first test flight of its all-electric autonomous passenger air vehicle [37]. the unpiloted vehicle took off vertically, hovered for a few seconds, and then landed at the designated site. according to boeing, the prototype is powered by an electric propulsion system and is designed for fully autonomous flight from takeoff to landing, with a range of up to 50 miles (80.47 kilometers). measuring 30 feet (9.14 meters) long and 28 feet (8.53 meters) wide, its airframe integrates the propulsion and wing systems to achieve efficient hover and forward flight. a number of startup companies are developing evtol-based urban air mobility vehicles. the startup eviation aircraft and siemens will jointly develop propulsion systems for the alice, a nine-passenger electric commuter plane [38].
5.4. water transportation
fossil fuel-based diesel is a problem not only on land, but also at sea. the environmental impact of ferries, cargo ships and cruise ships includes greenhouse gas emissions as well as acoustic and oil pollution, and the shipping industry is one of the most polluting. in collaboration with the norwegian shipyard fjellstrand, siemens developed in 2015 the technology for the first electric car and passenger ferry in the world [39]. the electric ship has been moving silently through the norwegian fjords, 34 times every day at 20-minute intervals between lavik and oppedal.
a conventional ferry on this route consumes roughly one million liters of diesel per year and emits 2,680 tons of carbon dioxide as well as 37 tons of nitrogen oxide into the air. in contrast, the new 80-meter-long electric ship is powered by two 450 kilowatt electric motors, which draw their energy from lithium-ion batteries [39]. the electric ferry reduced carbon emissions by 95% and operating costs by 80% [39].
6. proof of concept experimental results
we have established the energy efficiency of local dc power through our experiments.
6.1. li-ion battery charging
first, we demonstrate the power savings of direct dc charging of a lithium battery through solar panels as compared to ac charging networks. fig. 6 shows the experimental setup for direct dc charging.
fig. 6 dc charging experimental setup
in the dc charging circuit, we connected the solar panel (320 w), with an initial voltage of 41.7 v, to a tristar mppt charge controller at 1:19 pm when it was sunny. the charge controller regulates the current and voltage for a smooth charging process. the battery had previously been discharged by connecting it to a dc powered refrigerator. the initial battery voltage was recorded as 15.7 v. the charge controller equalizes the battery charging; hence, readings were taken every minute for 30 minutes, before and after the absorption state. before reaching the absorption state, the panel delivered approximately 40 w of power, with a voltage reading of 39.7 v and a current reading of 0.74 a. the charging voltage and current delivered by the controller to the battery were 13.89 v and 2.52 a, respectively. after reaching the absorption state, the power output from the solar panel stabilized at around 20 w, with voltage and current readings of 39.5 v and 0.5 a, respectively. the controller delivered a constant power of approximately 18 w, with voltage and current readings of 13.93 v and 1.3 a, respectively.
a charging network for the same battery was designed using an ac network, as seen in fig. 7. such ac networks are prevalent today, as the ac power network is dominant almost everywhere. we accounted for all the elements in the ac network and took readings every minute for 30 minutes. the battery charger used was the lithium-ion battery charger by hct electric co ltd [40]. the initial voltage recorded for the ac mains was 119.7 v with a current of 2.66 a. the initial battery voltage recorded was 12.7 v. while charging, the average ac input power was approximately 154.79 w and the average dc charging power was approximately 121.66 w. the charger output voltage was approximately 13.06 v and the output current approximately 9.55 a.
fig. 7 ac charging experimental setup
we recorded the average power readings for both the dc and ac mechanisms and calculated the average power losses. table 1 lists the charging efficiency parameters and power losses incurred in the dc and ac charging mechanisms.
table 1 dc vs ac charging efficiency parameters
power calculation | average input power (w) | average output power (w) | average power loss (w)
dc power (before absorption state) | 43.2 | 39.375 | 3.825
dc power (after absorption state) | 20.31 | 17.44 | 2.87
ac power | 154.79 | 121.66 | 33.13
dc charging incurs lower losses because fewer conversion components are involved in its network. since the power levels differ, the fairer comparison is the ratio of output to input power from table 1: about 91.1% and 85.9% for dc charging before and after the absorption state, against about 78.6% for ac charging. this experiment demonstrates the increased energy efficiency of dc charging compared to the current ac technique.
6.2. importance of new-age dc loads
along with the implementation of local dc power, it is important to recognize the emergence of dc loads. as an extension of this experiment, we took power readings using a refrigerator that has both dc and ac power inputs. the grape solar glacier 5-cu. ft. ac/dc fridge/freezer used for this purpose was purchased from home depot.
these refrigerators are utilized in recreational vehicles (rvs). it was found that, in ac mode, there is 'leakage' or wastage of electrical energy when the refrigerator's compressor is not running. this is due to idle losses in the rectifying transformer. in dc mode, no such losses occur. fig. 8 shows a comparison of the power savings.
fig. 8 comparison of idle energy usage of the same appliance in dc and ac mode
6.3. onboard pv for battery operated golf cart
in this experiment, we designed and analyzed a simple on-board pv powered dc golf cart. using localized dc charging, we increased the run time and distance traveled by the golf cart and obtained comparable results even on cloudy and rainy days. in order to determine how the on-board solar panel would improve the electric golf cart performance, four types of trials were conducted in different weather conditions. for trial 1, the golf cart was driven on a set route on battery alone (without any solar panel). for trials 2, 3 and 4, the golf cart was driven on the same route as in trial 1, but with the battery and the onboard pv solar panel. trials 2, 3 and 4 were conducted on a hot sunny day, a partly cloudy day and a rainy day, respectively. every 30 minutes, the golf cart was brought back to the initial start point to have the battery voltage tested. the distance traveled by the golf cart during each trial was measured using a gps phone-based application. the circuit was retrofitted to include an mppt charge controller, a circuit breaker switch and an on-board solar panel (rated 320 w), along with 6 batteries of 6.6 v each connected in series to form a golf cart battery of approximately 40 v. fig. 9 shows the increased on-road distance traveled by the golf cart with on-board pv charging. fig. 10 clearly shows that, with the help of the on-board solar panel, the on-road run time increased considerably. table 2 summarizes the results of this experiment.
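the improvement percentages reported in table 2 follow directly from the measured distances and run times. the short sketch below recomputes them as the relative gain over the battery-only trial; the numbers are the ones listed in table 2, while the function and variable names are only illustrative.

```python
def improvement_percent(baseline, value):
    """Relative gain of `value` over `baseline`, in percent."""
    return (value - baseline) / baseline * 100.0


# Measurements from table 2 (trial 1 is the battery-only baseline).
BASELINE = {"distance_miles": 27.05, "time_min": 241}
WITH_PV = {
    "sunny": {"distance_miles": 53.68, "time_min": 390},
    "cloudy": {"distance_miles": 49.81, "time_min": 362},
    "rainy": {"distance_miles": 49.59, "time_min": 360},
}


def summarize():
    """Return {weather: (distance gain %, run-time gain %)}, rounded to 2 decimals."""
    return {
        weather: (
            round(improvement_percent(BASELINE["distance_miles"], m["distance_miles"]), 2),
            round(improvement_percent(BASELINE["time_min"], m["time_min"]), 2),
        )
        for weather, m in WITH_PV.items()
    }


if __name__ == "__main__":
    for weather, (d_gain, t_gain) in summarize().items():
        print(f"{weather}: distance +{d_gain}%, time +{t_gain}%")
```

computed this way, even the rainy-day trial extends the distance by more than 80% over the battery-only run, which is the basis for the statement that results are comparable across weather conditions.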
fig. 9 enhanced distance traveled by pv golf cart
fig. 10 enhanced time traveled by pv golf cart
table 2 improvement in dc golf cart parameters using on-board pv
parameter | without solar (sunny) | with solar (sunny) | with solar (cloudy) | with solar (rainy)
distance in miles | 27.05 | 53.68 | 49.81 | 49.59
time in minutes | 241 | 390 | 362 | 360
improvement in distance over without solar | - | 98.45% | 84.14% | 83.33%
improvement in time over without solar | - | 61.83% | 50.21% | 49.38%
7. blockchain technology with local dc power
blockchain technology is gaining its niche in today's digital world. the blockchain provides a decentralized platform to record transactions and distribute information without it being copied. instead of one central agency controlling a transaction and its data, blockchain distributes the transaction across a network and protects it using encryption mechanisms. in a centralized network with many nodes, a hacker may find many points of centralized vulnerability to attack. this is eliminated in a decentralized network, as the data is distributed over the network. a similar concept can be applied to the power grid. the current ac system is equivalent to a centralized grid that controls power generation and distribution. the central grid is connected to millions of nodes that utilize power from the grid. with digitization venturing into every aspect of human life, the grid is also being digitized to enhance its efficiency. a centralized grid, if attacked, can affect all the nodes connected to it. moreover, the computational power required to operate millions of computational nodes is very large. with local dc networks, however, the grid is decentralized, like blockchain technology. with a decentralized dc network, the number of nodes connected to a grid will be greatly reduced, with the maximum number of nodes being in the order of thousands.
as with blockchain technology, the cyber security risks for this decentralized grid are greatly reduced. the computational power for thousands of nodes in a local nanogrid is also minuscule compared to that for the millions of nodes in a centralized grid. to give a clearer picture, a company called lo3 energy has initiated a project called the brooklyn microgrid [41]. under this initiative, participants can buy and sell locally generated solar power within their community, and blockchain technology is used to record these energy transactions [41]. just as blockchain technology is beneficial for local dc networks and grids, pv and batteries also complement the cryptocurrency infrastructure built on the blockchain architecture. a significant amount of energy is consumed in mining cryptocurrency. since the free-fuel-based pv and battery combination is the cheapest source of energy available today, cryptocurrencies can be mined at the lowest energy expense using pv and batteries. this bidirectional advantage will further reinforce the implementation of local dc power along with the booming blockchain architecture.
8. conclusion
based on the research conducted in this paper, we conclude that local dc power is shaping up to be the energy network of the future. our data indicate that local dc power has the potential to provide a disruptive transformation in the power and transportation sectors. with the introduction of proper policies, pv and battery-based local dc power grids will be the main mode of energy generation, storage and consumption.
acknowledgement: part of the work was supported by clemson university creative enquiry research program.
references
[1] p. paniyil, r. singh, v. powar, j. kimsey, “local dc power network”, in proceedings of the 4th virtual international conference on science, technology and management in energy, energetics 2018, pp. 147-154, october 25-26, 2018.
[2] m.
liebreich (2015). bnef summit – michael liebreich keynote. available at: https://data.bloomberglp.com/bnef/sites/4/2015/04/bnef_2014-04-08-ml-summit-keynote_final.pdf
[3] b. lillian (2019). yearly global solar pv additions top 100 gw for first time. available at: https://solarindustrymag.com/yearly-global-solar-pv-additions-top-100-gw-for-first-time/
[4] websolutions (2018). australian university to become 100% solar powered. available at: http://www.climateactionprogramme.org/news/australian-university-to-become-100-solar-powered
[5] l. graves (2017). world’s cheapest prices submitted for saudi arabia’s first solar project. available at: https://www.thenational.ae/business/energy/world-s-cheapest-prices-submitted-for-saudi-arabia-s-first-solar-project-1.663842
[6] s. enkhardt (2018). prices for monocrystalline solar modules are picking up. available at: https://www.pv-magazine.com/2018/10/18/prices-for-monocrystalline-solar-modules-are-picking-up/
[7] z. shahan (2018). solar panel prices continue falling quicker than expected. available at: https://cleantechnica.com/2018/02/11/solar-panel-prices-continue-falling-quicker-expected-cleantechnica-exclusive/
[8] a. moss (2017). 5 companies taking a share in the growing lithium-ion batteries market. available at: https://keepingstock.net/5-companies-taking-a-share-in-the-growing-lithium-ion-batteries-market-acc2828719f5
[9] san diego and sunderland (2017). after electric cars, what more will it take for batteries to change the face of energy. available at: https://www.economist.com/news/briefing/21726069-no-need-subsidies-higher-volumes-and-better-chemistry-are-causing-costs-plummet-after/
[10] r. singh, g. alapatt and g. bedi, “why and how photovoltaics will provide cheapest source of electricity in the 21st century,” facta universitatis, series: electronics and energetics, vol. 27, no. 2, pp. 275-298, 2014.
[11] a. a. asif, r. singh and g. k.
venayagamoorthy, “ultra-low cost and solar storm secured local dc electricity to address climate change challenges for all economies,” in proceedings of the clemson power systems conference, clemson, sc, 2016.
[12] m.d. phung, m. villefromoy, and q. ha, “management of solar energy in microgrids using iot-based dependable control,” in proceedings of the 20th ieee international conference electrical machines and systems (icems), august 2017, pp. 1-6.
[13] o. chieochan, a. saokaew, e. boonchieng, “internet of things (iot) for smart solar energy: a case study of the smart farm at maejo university,” in proceedings of the ieee international conference control, automation and information sciences (iccais), oct 2017, pp. 262-267.
[14] k. friansa, i. haq, b. santi, d. kurniadi, e. leksono, b. yuliarto, “development of battery monitoring system in smart microgrid based on internet of things (iot),” procedia engineering, vol. 170, pp. 482-487, jan 2017.
[15] world economic forum (2018). global risk report. available at: http://www3.weforum.org/docs/wef_grr18_report.pdf
[16] abb white paper. six stages of solar bankability. available at: https://library.e.abb.com/public/2fa374aa511be1558525792e005e47fb/abb_%20solar_whitepaper_final.pdf
[17] r. singh, and k. shenai, “wide bandgap (wbg) semiconductor power converters for dc microgrid applications,” in proceedings of the 2015 ieee first international conference on dc microgrids, 2015, pp. 263-268.
[18] lpr article. direct current (dc) supply grids for led lighting.
available at: https://www.ledprofessional.com/resources-1/articles/direct-current-dc-supply-grids-for-led-lighting
[19] x. zhong, l. yu, r. brooks and g. venayagamoorthy, “cyber security in smart dc microgrid operations,” in proceedings of the ieee 1st intl. conf. dc microgrids, atlanta, ga, 2015.
[20] greentech media white paper (2016). utility security: exceeding mandates to mitigate risk. available at: http://www.cisco.com/c/dam/en_us/solutions/industries/energy/docs/greentech-white-paper.pdf
[21] inverter datasheet (2016). available at: https://solectria.com//site/assets/files/1413/pvi_50-100kw_datasheet_december_2016_rev_l.pdf
[22] office of energy and renewable energy (2017). hail no! national lab's solar panels survive severe storm.
available at: https://www.energy.gov/eere/articles/hail-no-national-labs-solar-panels-survivesevere-storm
[23] j. marsh (2017). can solar panels withstand hail and survive hurricanes. available at: https://news.energysage.com/solar-panels-hail-hurricanes/
[24] f. lambert (2018). bmw says ev owners can turn i3 into ‘cash cow’, use more solar power with controllable load tech. available at: https://electrek.co/2018/09/12/bmw-electric-car-owners-i3-intocash-cow-controllable-load-technology/
[25] r. singh, a. a. asif, g. k. venayagamoorthy, “transformative role of photovoltaics in phasing out alternating current based grid by local dc power networks for sustainable global economic growth”, in proceedings of the ieee-pvs conference, portland, 2016.
[26] the united states environmental protection agency (epa) (2018). inventory of u.s. greenhouse gas emissions and sinks: 1990–2016. available at: https://www.epa.gov/sites/production/files/2018-01/documents/2018_complete_report.pdf
[27] office of transportation and air quality, epa-420-f-18-013 (2018). u.s. transportation sector ghg emissions fast facts: 1990-2016. available at: https://nepis.epa.gov/exe/zypdf.cgi?dockey=p100usi5.pdf
[28] a. gray (2017). the best countries in europe for using renewable energy. available at: https://www.weforum.org/agenda/2017/04/who-s-the-best-in-europe-when-it-comes-to-renewable-energy/
[29] fleetcarma. everything you need to know about electric cars. available at: https://www.fleetcarma.com/everything-need-know-electric-cars/
[30] m. abdelhamid, i. haque, s. pilla, z. filipi, r. singh, “impacts of adding photovoltaic solar system on-board to internal combustion engine vehicles towards meeting 2025 fuel economy cafe standards,” sae int. j. alt. power., vol. 5, no. 2, 2016.
[31] d.
kettles, “electric vehicle charging technology analysis and standards,” florida solar energy center, fsec report number: fsec-cr-1996-15, 2015. available at: http://www.fsec.ucf.edu/en/publications/pdf/fsec-cr-1996-15.pdf
[32] m. abdelhamid, r. singh, i. haque, “role of pv generated dc power in transport sector: case study of plug-in ev,” in proceedings of the first international conference on dc microgrids (icdcm), ieee, july 2015, pp. 299-304.
[33] m. abdelhamid, a. qattawi, r. singh, and i. haque, “comparison of an analytical hierarchy process and fuzzy axiomatic design for selecting appropriate photovoltaic modules for onboard vehicle design,” international journal of modern engineering, vol. 15, no. 1, pp. 23-35, 2014.
[34] a.r. bhatti, z. salam, m.j.b.a. aziz, and k. p. yee, “a critical review of electric vehicle charging using solar photovoltaic,” int. j. energy res., vol. 40, pp. 439-461, 2016.
[35] m. abdelhamid, r. singh, a. qattawi, m. omar and i. haque, “evaluation of on-board photovoltaic modules options for electric vehicles,” ieee journal of photovoltaics, vol. 4, pp. 1576-1584, 2014.
[36] boeing website (2018). boeing unveils new unmanned cargo air vehicle prototype. available at: https://www.prnewswire.com/news-releases/boeing-unveils-new-unmanned-cargo-air-vehicle-prototype-300580750.html
[37] cnn business website. see boeing’s autonomous flying car take flight. available at: https://edition.cnn.com/videos/business/2019/01/23/boeing-autonomous-air-vehicle-orig.cnn-business
[38] j. gerdes (2019). siemens to supply motors for eviation's all-electric airplanes. available at: https://www.greentechmedia.com/articles/read/siemens-to-supply-motors-for-eviations-all-electric-airplanes
[39] elect-expo website. exploring new horizons – with electric ships. available at: https://www.elect-expo.com/en/electrifying-stories/exploring-new-horizons-with-electric-ships/
[40] lithium-ion battery charger specifications.
available at: https://hct.manufacturer.globalsources.com/si/6008801796577/pdtl/lithium-ion-battery/1161365711/standard-battery-charger.htm
[41] e. woyke (2017). blockchain is helping to build a new energy grid. available at: https://www.technologyreview.com/s/604227/blockchain-is-helping-to-build-a-new-kind-of-energy-grid/
facta universitatis series: electronics and energetics vol.
33, no. 3, september 2020, pp. 413-427
https://doi.org/10.2298/fuee2003413m
© 2020 by university of niš, serbia | creative commons license: cc by-nc-nd
fault detection using fra in order to improve the aging model of power transformer
saša d. milić 1, denis ilić 1, milan ponjavić 2
1 university of belgrade, electrical engineering institute nikola tesla, belgrade, serbia
2 university of belgrade, school of electrical engineering, belgrade, serbia
abstract. power transformers are constantly exposed to mechanical, thermal and electrical stresses during operation. in this paper, the authors propose an improved aging model of power transformers by adding the impact of mechanical deterioration. in current practice, the mechanical deformation and dislocation of the windings and core are not sufficiently distinguished as components that influence the aging of the transformer. hence, the current aging model was expanded with a functional block that contains several typical failures in order to emphasize their impact on the lifetime and aging of transformers. the authors used the frequency response analysis (fra) method for the detection and location of mechanical deformations of the transformer's active parts. the correlation function is used to determine the level of the detected failure. all presented test results were obtained in real operating conditions.
key words: aging model, fault detection, frequency response analysis, maintenance, monitoring
1. introduction
reliable, safe and continuous production flow in today's industry is of undeniable importance. modern, automated production cycles are fully based on electricity as the cleanest and the most easily applicable energy source. any delay in production can cause significant financial losses and must be entirely prevented or reduced to a minimum duration.
hence, plant engineers must schedule inspections of the equipment “health” and plan interventions at the most appropriate point in time, since production must continue. this is the most convenient and the cheapest way to maintain a large, valuable and important facility. as mentioned above, there is no industry without a sustainable electrical power source. the foundations of the electricity supply are the power transformers and generators. power transformers are irreplaceable and crucial in all areas of industry, as they step up voltage and thus enable long-distance transmission of electricity with lower losses, and step down voltage to enable its use in medium- and low-voltage motors, converters, lighting, ac, etc. any major fault in a power transformer causes a power outage and a delay in production. preventing unplanned outages, which can jeopardize production and financial gain, has become the main task of field test engineers, who must provide information about equipment “health” and schedule the needed services and interventions. condition monitoring, fault detection and diagnosis, prognosis and several types of management are effective means to reduce downtime and maintenance cost and to improve the reliability and lifetime of power transformers and generators. during the last two decades, a large number of theories have been applied to improve the management, maintenance, monitoring and control of electrical power systems. entire scientific and technical areas have expanded around these issues.
received november 21, 2019; received in revised form february 28, 2020
corresponding author: saša d. milić, university of belgrade, electrical engineering institute nikola tesla, 8a koste glavinića, 11000 belgrade, serbia (e-mail: s_milic@yahoo.com)
in recent years, business trends in the power sector imply the need to reduce maintenance costs, operate transformers as much as possible and prevent forced outages. today's development of measurement techniques and methods, as well as of software tools and hardware, encourages the rapid development of complex monitoring systems and diagnostic methods for measuring, monitoring, predicting and analyzing potential failures of capital equipment (generators, transformers, boilers, hv motors...) in the power sector. a number of off-line and on-line monitoring/diagnostic techniques are currently in use and under further development [1], [2], the most relevant being:
- dissolved gas analysis
- partial discharges
- direct hot spot measurement
- degree of polymerization
- furan analysis
- on-load tap changers
- power factor of bushings
- recovery voltage measurement
- detection of winding displacement
- frequency response analysis (fra)
three main mechanisms contribute to insulation aging or deterioration in transformers: hydrolysis, oxidation and pyrolysis. insulation aging or deterioration is therefore a function of time, temperature, moisture content and oxygen content [3]. however, these mechanisms can be initiated and accelerated by the mechanical deterioration of almost all parts of the transformer (windings, core...). this paper emphasizes the need to monitor and analyze mechanical deterioration (deformations) and potential mechanical failures in power transformers. the authors also emphasize the fact that mechanical deformations of the power transformer further shorten its lifetime. based on the foregoing, the paper presents an advanced aging model of the power transformer that adds the influence of mechanical failures to the existing model. frequency response analysis (fra) of the windings is suggested as the tool for detecting mechanical deformations.
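fra compares a measured frequency response with a reference trace of the same winding, and the abstract notes that a correlation function is used to grade the detected failure. the sketch below illustrates that idea with the pearson correlation coefficient between two magnitude traces. the traces are synthetic: a toy series rlc divider stands in for a winding, and a changed capacitance stands in for a deformation. the component values, function names and the choice of pearson correlation are assumptions for illustration, not the authors' measurement data or their exact indicator.

```python
import math


def log_spaced(f_start, f_stop, points):
    """Logarithmically spaced frequencies from f_start to f_stop (inclusive)."""
    ratio = (f_stop / f_start) ** (1.0 / (points - 1))
    return [f_start * ratio ** k for k in range(points)]


def toy_fra_trace_db(freqs_hz, r=50.0, l=1e-3, c=1e-9):
    """Magnitude (dB) of a series RLC divider with the output taken across R.

    Synthetic stand-in for a measured winding response (hypothetical values).
    """
    trace = []
    for f in freqs_hz:
        w = 2.0 * math.pi * f
        reactance = w * l - 1.0 / (w * c)
        gain = r / math.sqrt(r * r + reactance * reactance)
        trace.append(20.0 * math.log10(gain))
    return trace


def correlation_coefficient(x, y):
    """Pearson correlation coefficient between two equally long traces."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


if __name__ == "__main__":
    freqs = log_spaced(1e3, 1e6, 400)
    reference = toy_fra_trace_db(freqs)              # earlier "fingerprint"
    small_change = toy_fra_trace_db(freqs, c=1.02e-9)
    large_change = toy_fra_trace_db(freqs, c=1.3e-9)
    print("small change:", correlation_coefficient(reference, small_change))
    print("large change:", correlation_coefficient(reference, large_change))
```

a coefficient close to 1 means the new trace follows the fingerprint, and the value drops as the resonance moves. in practice the comparison is often made per frequency band, and the thresholds used to grade a fault depend on the applicable guide or utility practice, so none are assumed here.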
fra is an offline test based on the measurement of the impedance, admittance, or transfer function of a particular phase over a wide frequency range. the result is used as a transformer fingerprint that can be compared with its previous signatures to detect any mechanical deterioration (such as winding displacements, etc.) [4–7].
fault detection using fra in order to improve the aging model of power transformer 415
there are many mathematical methods, algorithms and procedures for signal processing and analysis [8]. fra is one of the methods that were created as a result of the mentioned multi-year research effort to improve monitoring and fault detection in power systems. this method aims to detect and recognize mechanical deformations inside the transformer without the need for opening it and inspecting it visually. this is one of the main reasons why the authors decided to use the results obtained with this method to illustrate the importance of the proposed, improved and expanded aging model of power transformers.
2. power transformer aging and lifetime
power transformers are constantly exposed to mechanical, thermal and electrical stresses during operation. therefore, fault detection and monitoring are very important for condition and remaining lifetime assessment of power transformers [9], [10]. contrary to an ideal transformer, a practical transformer has winding resistance, flux leakage, finite permeability and core losses [11], [12]. a transformer is designed to have sufficient dielectric and mechanical strength to withstand the maximum predicted and expected operating stresses. various stresses during the service life of a transformer can exceed these expectations and degrade the transformer insulation, the magnetic core and the whole mechanical structure.
during an unpredicted short-circuit current, axial and radial electromagnetic forces in the windings can significantly exceed the projected values and cause permanent deformation such as tilting and buckling, or destroy the entire winding [13]. excessive radial forces can cause buckling of a winding, while tilting is usually caused by axial forces. these forces combined can cause axial bending, winding twisting, and other winding deformations. forces that exceed the elastic limit of the conductors can permanently deform the winding without interrupting or destroying it. if the deformation goes unnoticed, such a winding can remain in operation. however, its projected mechanical integrity is compromised, and at the time of the next stress the deformed winding can no longer withstand the projected electromagnetic force of short-circuits, i.e. its mechanical strength is significantly reduced. for example, when a transformer is taken off-line, a certain amount of residual flux remains in the core. the residual flux can be as much as 50% to 90% of the maximum operating flux, depending on the type of core steel. when voltage is reapplied to the transformer, the flux introduced by this source voltage builds upon the one already existing in the core. the magnitude of the resulting inrush current can be up to 10 times the rated full-load current [14]. these operating conditions happen often. there is a special danger to the transformer if it already has some mechanical deterioration. in this case, additional mechanical deterioration (dislocation and deformation of the windings and/or core) occurs due to additional stress forces. when the transformer returns to the nominal operating regime, this deterioration can be reflected in an increase in losses, which inevitably leads to an increase in temperature, increased partial discharges, increased vibrations, etc.
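as a rough illustration of why residual flux matters, the sketch below computes the worst-case peak core flux after re-energisation. it assumes the standard lossless textbook case, which is not stated in the text above: a sinusoidal source switched in at a voltage zero crossing, so that the flux swings by twice its rated peak on top of the residual value. the function name and the per-unit convention are illustrative.

```python
def worst_case_peak_flux(residual_pu: float) -> float:
    """Worst-case peak core flux in per unit of the rated peak flux.

    Assumption (textbook idealisation, not taken from the paper): the source
    v(t) = Vm*sin(wt) is switched in at t = 0 and losses are neglected, so
    flux(t) = residual + 1 - cos(wt), whose maximum is residual + 2.
    """
    if not 0.0 <= residual_pu <= 1.0:
        raise ValueError("residual flux must be within 0..1 per unit")
    return residual_pu + 2.0


if __name__ == "__main__":
    # Residual flux range quoted in the text: 50 % to 90 % of the rated peak.
    for residual in (0.5, 0.9):
        peak = worst_case_peak_flux(residual)
        print(f"residual {residual:.1f} pu -> peak flux {peak:.1f} pu")
```

under these assumptions the peak flux reaches 2.5 to 2.9 times the rated value, which drives the core deep into saturation and is consistent with the large inrush currents mentioned above. the current itself depends on the magnetisation curve of the core and is not computed here.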
planning the flow and schedule of activities and selecting measurement and test methods for testing power transformers are the first steps in planning maintenance. fig. 1 presents a simplified algorithm for maintenance and testing of power transformers in order to systematize the flow of all necessary activities.
fig. 1 maintenance and testing algorithm
since it is practically impossible to determine the exact extent of the damage within the windings and/or core without opening a transformer, the authors recommend that mechanical deterioration be detected by measurements using some of the previously mentioned non-invasive measurement methods. the results of these measurements should be used as a reliable indicator of the weakening of mechanical stiffness and the occurrence of deformations, and consequently of reduced operational readiness and lifetime of the transformer. since mechanical deteriorations are a consequence of stochastic undesirable effects, it is difficult to estimate their impact on the transformer's lifetime. transformer age is an important factor to consider when identifying candidates for replacement or rehabilitation. although actual service life varies widely depending on the manufacturer, design, quality of assembly, materials used, maintenance and operating conditions, the expected life of a transformer is usually about 40 years. some of the most important issues in the power system are closely related to the assessment of the remaining lifetime of capital equipment such as power transformers.
2.1. the temperature influence on the lifetime of power transformer
the “bathtub curve” (fig. 2) represents a traditional approach to the graphic representation of the lifetime of the power transformer and describes its hypothetical failure rate versus time.
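the bathtub shape can be sketched numerically as the sum of three hazard terms: a decreasing weibull hazard (early failures), a constant hazard (random failures) and an increasing weibull hazard (wear-out). this decomposition is a common textbook illustration rather than the model behind fig. 2, and all parameter values below are hypothetical.

```python
def weibull_hazard(t: float, shape: float, scale: float) -> float:
    """Weibull hazard rate h(t) = (shape/scale) * (t/scale)**(shape - 1)."""
    return (shape / scale) * (t / scale) ** (shape - 1.0)


def bathtub_hazard(t_years: float) -> float:
    """Illustrative failure rate (per year) at age t_years > 0.

    All parameters are hypothetical and chosen only to produce the shape.
    """
    early = weibull_hazard(t_years, shape=0.5, scale=200.0)    # decreasing
    random_failures = 0.005                                    # constant
    wear_out = weibull_hazard(t_years, shape=5.0, scale=50.0)  # increasing
    return early + random_failures + wear_out


if __name__ == "__main__":
    for age in (0.5, 20.0, 45.0):
        print(f"age {age:5.1f} y: failure rate {bathtub_hazard(age):.4f} per year")
```

with these hypothetical parameters the failure rate is high at the start of service, lowest in mid-life and rises again towards the end of the expected lifetime.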
however, the accelerated development and widespread implementation of on-line and real-time monitoring systems, as well as the implementation of modern maintenance strategies (condition-based maintenance, reliability centred maintenance, risk-based maintenance, etc.), have raised numerous criticisms of this approach. some papers [15], [16] also point to significant shortcomings in this approach.
fig. 2 bathtub curve [15]
the traditional approach to aging of the transformer is directly related to aging of transformer insulation, which is a time-dependent function of temperature, moisture content, and oxygen content. the authors of this paper emphasize that this statement should be extended with the impact of mechanical failures on the lifetime and aging of the power transformer.
2.2. mechanical failures and improved aging model
an overview of typical causes of transformer failures is given in table 1 [17].
tab. 1 typical causes of transformer failures
internal                                      external
insulation deterioration                      lightning strikes
loss of winding clamping                      system switching operations
overheating                                   system overload
oxygen                                        system faults (short circuit)
moisture                                      mechanical deformations & damages
solid contamination in the insulating oil
partial discharge
design & manufacture defects
winding resonance
mechanical deformations & damages
the authors of this paper have complemented the table with mechanical failures (mechanical deformations & damages), whose causes can be both inside and outside of the transformer. this was done for two reasons. the first reason is the need to extend the existing aging model with mechanical movements and deformations that can accelerate the rate of aging. the second reason is the need to recognize the type of failure in a timely manner, so that the most appropriate measuring methods can be selected to detect, locate, monitor, and/or analyze its impact on the condition of the power transformer.
the condition assessment of power transformers is always preceded by their lifetime assessment. the remaining lifetime assessment is closely linked to aging. these relationships indicate the need for a more accurate aging model of transformers. the existing aging model [10] has a significant drawback because it does not take into account the impact of mechanical failures on the lifetime and aging of the power transformer. fig. 3 shows the improved aging model, which takes into account six major causes of mechanical failures. according to the aging model shown in fig. 3, it is possible to determine the place and role of standard (traditional and new) measurement and diagnostic methods for monitoring and assessing the condition of the transformer [17–20]. to detect any mechanical change in a power transformer, the best available method is frequency response analysis (fra) measurement on the transformer winding [21]. for comparison, leakage reactance and excitation current and power measurements are usually applied. not all of the causes considered in the model can produce enough force to deform the winding or core, but they can cause winding and core insulation breakdown and turn-to-turn faults. recent studies tend to estimate the remaining life of the transformer by means of deep analysis of the results obtained by electrical and chemical tests. usually, the expected power transformer life can be presented as a curve (fig. 4).
fig. 3 improved power transformer aging model
fig. 4 transformer aging curve: 1 – normal aging curve, 2 – accelerated aging curve, 3 – electromechanical stresses, 4 – insulation strength reserve, 5 – insulation breakdown, 6 – shortened transformer life
each time a transformer experiences a short circuit fault or an overvoltage, it loses some of its factory designed mechanical and electrical strength.
after a long time in service, cellulose insulation becomes brittle and cannot maintain its designed mechanical specifications, although its dielectric strength may not be threatened. the most powerful units, 10 mva, 20 mva and more, are usually carefully monitored as they provide electricity for major production sectors. less powerful and usually cheaper units are subjected to scheduled visual inspections and perhaps a scheduled insulation resistance test before being put into operation. sometimes, even the most carefully monitored and maintained transformer insulation can break down suddenly, causing unwanted outages and rising economic pressure on facility management. transformer failure must be anticipated in a timely manner to enable top management to make the right decision (repair or replace). there are several non-invasive measurement methods (some of them are mentioned above) that may indicate some of the potential mechanical failures. practice has shown that the fra measurement method is the most applicable because it can detect and diagnose the greatest number of mechanical failures.
3. power transformer mechanical condition assessment using frequency response analysis
frequency response analysis (fra) is a measurement method that is used to detect and identify mechanical failures of a power transformer, such as movements and deformations of its windings and core [21–32]. fra is based on measuring the impedance of the transformer (r, l and c), which is related to the geometry of the core and windings. in effect, fra treats the transformer windings as a distributed network of rlc parameters, or a frequency-dependent impedance. those parameters are mainly geometry dependent, so any change in geometry alters the parameter values, for example the capacitance. that change is reflected as a deviation in the frequency response of the winding. in general, any system represented as a “black box”, with its input and output, will react to the input signal (fig. 5).
the output signal depends on the system parameters, the attenuation, and the input signal amplitude and frequency. the behavior of the system in response to the input signal is defined by the transfer function [23], [24]. sweep frequency response analysis (sfra) is a modern, simple and reliable method for recording winding frequency responses. responses are gathered for every winding, with regard to the transformer vector group, by injecting a sinusoidal voltage at one end of the winding. the frequency response is obtained by varying the frequency of the input signal from fstart to fstop while the amplitude is kept constant. the voltage amplitudes and phases at both ends are measured simultaneously. the frequency range is usually predefined, and it is common to start the recording at a few mhz and sweep the frequency down to 10 hz or 20 hz. for oil-filled power transformers, a range from 20 hz up to 2 mhz is sufficient with regard to test duration and the useful response spectrum. responses in the frequency domain are considered a fingerprint of the winding geometry. any change in response could indicate deformation of the winding geometry. recorded frequency responses are compared to a fingerprint response which was recorded when the transformer was considered “healthy”. it is desirable to have fingerprint responses recorded during the handover tests. that is common for new, powerful gsu and distribution transformers, but not for small units. if there are no recorded fingerprint responses, the comparison can be made with transformers of the same type (“sister units”) or between the phases of the same winding. comparison is made visually or by means of statistical and mathematical operations. in engineering practice, the responses are divided into a few frequency sub-bands, in which sfra traces have shown sensitivity to the failures specified in table 2.
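the sweep described above can be imitated on a toy model. the sketch below replaces the winding by a single series rlc section (a deliberate simplification of the distributed network), sweeps the frequency logarithmically from 20 hz to 2 mhz, and reports the response in db as the ratio of the voltage across the capacitor to the injected voltage. the component values, the function names and the db convention are assumptions made for illustration only.

```python
import numpy as np


def sweep_frequencies(f_start=20.0, f_stop=2.0e6, points=2000):
    """Logarithmically spaced sweep frequencies in Hz."""
    return np.logspace(np.log10(f_start), np.log10(f_stop), points)


def rlc_response_db(freq_hz, r_ohm, l_henry, c_farad):
    """Magnitude (dB) of Vout/Vin for a series R-L-C section, output across C."""
    s = 2j * np.pi * freq_hz
    h = 1.0 / (1.0 + s * r_ohm * c_farad + s**2 * l_henry * c_farad)
    return 20.0 * np.log10(np.abs(h))


def resonance_hz(freq_hz, magnitude_db):
    """Frequency of the largest peak in the swept response."""
    return float(freq_hz[np.argmax(magnitude_db)])


if __name__ == "__main__":
    f = sweep_frequencies()
    # Hypothetical values: 10 ohm, 1 mH, 10 nF ("healthy") vs. 12 nF ("deformed").
    healthy = rlc_response_db(f, 10.0, 1e-3, 10e-9)
    deformed = rlc_response_db(f, 10.0, 1e-3, 12e-9)
    print(f"healthy resonance : {resonance_hz(f, healthy) / 1e3:.1f} kHz")
    print(f"deformed resonance: {resonance_hz(f, deformed) / 1e3:.1f} kHz")
```

in this toy model a larger capacitance moves the resonance peak to a lower frequency. a shift of this kind between a recorded trace and its fingerprint is the type of deviation that the comparison looks for.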
tab. 2 frequency sub-bands sensitivity
frequency sub-band     failure sensitivity
< 2 khz                core deformations, open circuits, shorted turns, residual magnetism
2 khz – 20 khz         bulk deformation, winding deformation, clamping structure deformation
20 khz – 600 khz       windings, main or tap winding
600 khz – 2 mhz        windings, loose connection, bad grounding influence
comparison between two frequency responses is performed using a statistical indicator for investigating the transformer's mechanical condition. cross correlation (cc) coefficients are a widely used mathematical measure for response deviation assessment. simplified, the cross correlation of traces x and y is given by (1) [33–38]:
$$ \mathrm{cc} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}} \qquad (1) $$
where $x_i$ and $y_i$ are the samples of the two traces being compared, and $\bar{x}$ and $\bar{y}$ are their mean values. if the two sets of numbers are perfectly matched, the value of cc is 1, whereas if there is no resemblance between these two sets, the value is 0. a rough estimation of the meaning of the cc factor is provided in table 3.
tab. 3 correlation rating by cc factor
correlation        cc factor
good               0.95 – 1.00
fair               0.90 – 0.94
poor               0 – 0.89
no correlation     ≤ 0
there is a new fra interpretation [35] that proposes an improved cc which includes the phase response. this approach replaces the euclidean distance with a complex distance which includes phase information. the phase information increases the sensitivity of the cross-correlation coefficients. sweep fra (sfra) measurement is accurate, repeatable and non-destructive, since the injected voltage amplitude is only 10 vpp. since the instrument “knows” the precise signal frequency while sweeping, narrowband filters can be applied in order to eliminate noise.
3.1. example 1 – faulty transformer
the geometry of a three-phase, two-winding transformer can be inspected within an hour, and since instruments plot responses during the test, an experienced engineer can instantly suggest what might be the problem with the transformer. an example of such a situation is presented here. a three-phase, two-winding transformer which powers a few pumps in a power plant had been switched off by a buchholz relay trip. fast fault recognition was very important, and the sfra method seemed appropriate. immediately after disconnecting the hv and lv transformer cables, the sfra test was done. since it is a very small unit, there are no factory recorded fingerprints. high voltage winding open-circuit sfra traces are presented in fig. 5 and short-circuit sfra traces are presented in fig. 6.
fig. 5 open-circuit primary winding sfra traces
fig. 6 short-circuit primary winding sfra traces
in general, the traces do not match perfectly, but it was clear that the one from phase a deviates significantly from the other two responses. this was obvious in the middle and low-frequency range. the winding frequency responses gave an indication that the phase “a” winding had shorted turns and that the transformer would need a time-consuming repair (table 1). the calculated cc factors, cc = 0.9785 in the frequency sub-band 20 hz – 5 khz and cc = 0.4821 in the sub-band 5 khz – 15 khz, confirmed an already obvious problem. this was confirmed later with a no-load loss measurement (table 4). even during the no-load test, the buchholz relay filled with combustible gases within a minute. it was a sign of a stable, firm turn-to-turn connection, followed by a great local temperature rise. the transformer was sent to the factory for dismantling and visual inspection. after dismantling the transformer and taking the core and windings out of the tank, visual inspection gave no indication of a fault in the windings or core.
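the cc factors quoted in this example follow directly from eq. (1). the sketch below evaluates eq. (1) for two sampled traces restricted to one frequency sub-band and maps the result to the ratings of table 3. the function names, the synthetic trace data and the treatment of values that fall between the tabulated intervals (for instance 0.945, assigned here to the lower rating) are assumptions, not part of the paper.

```python
import numpy as np


def cc_factor(x, y):
    """Cross-correlation coefficient of two traces as in eq. (1)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx**2) * np.sum(dy**2)))


def cc_in_band(freq_hz, x, y, f_low, f_high):
    """cc factor using only the samples with f_low <= f <= f_high."""
    freq_hz = np.asarray(freq_hz, dtype=float)
    mask = (freq_hz >= f_low) & (freq_hz <= f_high)
    return cc_factor(np.asarray(x)[mask], np.asarray(y)[mask])


def rate_cc(cc):
    """Rating according to table 3; interval gaps go to the lower rating."""
    if cc >= 0.95:
        return "good"
    if cc >= 0.90:
        return "fair"
    if cc > 0.0:
        return "poor"
    return "no correlation"


if __name__ == "__main__":
    # Synthetic traces in dB (hypothetical data, not measurements).
    f = np.logspace(np.log10(20.0), np.log10(15e3), 400)
    reference = -20.0 * np.log10(1.0 + f / 1e3)
    suspect = reference.copy()
    suspect[f > 5e3] += 3.0 * np.sin(f[f > 5e3] / 500.0)  # local deviation
    for low, high in ((20.0, 5e3), (5e3, 15e3)):
        cc = cc_in_band(f, reference, suspect, low, high)
        print(f"{low:.0f} Hz to {high:.0f} Hz: cc = {cc:.4f} ({rate_cc(cc)})")
```

evaluating the factor per sub-band, as done in the example, keeps a local deviation from being averaged out by the well-matched parts of the trace. applied to the values reported above, `rate_cc(0.9785)` gives "good" and `rate_cc(0.4821)` gives "poor".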
the windings were therefore energised again, so that the turn-to-turn insulation problem could be checked and confirmed visually. after unwinding, it was obvious that the inter-turn insulation was degraded and that a few turns had actually welded themselves together (fig. 7).
fig. 7 phase a winding visual inspection
3.2. example 2 – probably deformed winding
in the other example, a generator step-up transformer has a frequency response deviation which indicates a possible problem with the transformer windings (fig. 8). the deviation is present in the short-circuit measuring scheme, where the effects of the core are practically eliminated and the inter-winding geometry is tested. usually, in the short-circuit scheme, because the core effect is eliminated, there are no differences in the shape of the winding response between the phases. the phase “a” and “c” responses were the same. calculation of the cc factor between phase “c” (or “a”) and “b” in the frequency sub-band 5 khz – 15 khz gave cc = 0.9352. referring to table 3, it is clear that this is not a perfect match, and that deeper investigation is needed. as a complementary method, the low voltage leakage reactance was measured, and the results showed a deviation between phases of up to 7%, which is enough for concern (table 4). again, the phase “b” leakage inductance is about 194.5 mh, while the other two phases had similar values, around 181 mh.
fig. 8 short circuit hv winding sfra traces: a) hv winding, b) hv winding, lv winding shorted, c) magnified deviation of phase “b”
tab. 4 results of leakage reactance measurements
connection   voltage (v)   current (a)   z (Ω)    l (mh)    Δl/l (%)
a-n          172.40        3             57.47    181.98    7.11
b-n          184.20        3             61.19    194.49
c-n          172.70        3             57.16    181.58
3.3. routine inspection and repair based on cc factor
a power transformer maintenance and testing guide with recommended inspection frequency is given in the literature [36]. almost all maintenance recommendations are written for the average conditions under which the transformer is required to perform and operate. most of these recommendations are applied during preventive maintenance, which basically involves scheduled maintenance and testing. predictive maintenance involves additional monitoring and testing. the main goals of this maintenance are to predict potential failures and to reduce the costs of unscheduled and ordinary maintenance. as already mentioned, the deviation of the transformer aging curve from the normal one (fig. 4) is practically impossible to quantify and calculate precisely. therefore, the authors recommend, as an alternative, the detection of mechanical deteriorations and the monitoring of their trends. the cc factor with its suggested values (tab. 3) can describe the mechanical condition of the transformer, taking into account, for example, its coupling. thus, from the perspective of the projected geometry of the windings and/or magnetic core, it can be concluded:
• if cc is in the range of 0.95-1, it is considered that no significant mechanical deterioration has occurred and that the aging curve is the one declared in the factory (fig. 4).
• if cc is in the range of 0.90-0.94, it is considered that there is some mechanical deterioration, so that the factory curve is shifted downward, resulting in a shortening of the lifetime. in this case, it is recommended that measurements and tests be performed more often with the aim of monitoring the failure tendency. the optimum moment for applying additional remote (online and real-time) monitoring systems is when the cc factor is in this interval.
• if cc is in the range of 0-0.89, it is considered that there is a risk of fatal failure that may result from the next mechanical stress.
4. concluding remarks
power transformers are irreplaceable in modern electricity production and distribution. the demand for reliable and safe power supply has led to the establishment of power transformer life management techniques. hence, the aging of the transformer, or its remaining service life, has become one of the main topics in the asset management sector. different aging factors are taken into account in order to define the expected remaining lifetime of the transformer. the problem of rising temperature and the appearance of hot-spots has been identified as the result of increased losses due to, inter alia, the occurrence of mechanical deteriorations. mechanical deformation and dislocation of the windings and core are not sufficiently recognized as components that influence the aging of the transformer, so the authors proposed upgrading the current power transformer aging model by introducing a new, separate set of parameters which brings the mechanical strength of the winding-core structure into the existing model. focusing on the problem of disturbed geometry of the windings and core required suitable measurement methods that would discover the mentioned defects and enable their analysis and monitoring. frequency response analysis of the transformer windings was selected as the most appropriate method since it is a simple, fast, easy and non-invasive way to detect geometry deformations of the power transformer. its possibilities were analyzed through two examples in real operating conditions. for numerical interpretation of the test results, the authors used cross correlation (cc) factors, which were calculated for the mentioned examples in the frequency bands that showed the major response deviation.
further research will concentrate on quantifying and recognising mechanical deformations in their infancy, in order to plan interventions and monitor “power transformer health”. approaching transformer condition monitoring in this manner can prevent major equipment failures and the excessive financial losses caused by long, expensive repairs and prolonged intervals without power delivery.
acknowledgement: this research was funded by grant (project no. tr 33024) from the ministry of education, science and technological development of serbia.
references
[1] s. v. kulkarni, s. a. khaparde, “transformer engineering: design and practice”, marcel dekker inc., new york, 2004.
[2] c. sun, p. r. ohodnicki, e. m. stewart, “chemical sensing strategies for real-time monitoring of transformer oil: a review”, ieee sensors journal, vol. 17, no. 18, pp. 5786–5806, august 2017.
[3] k. qian, c. zhou, y. yuan, “impacts of high penetration level of fully electric vehicles charging loads on the thermal ageing of power transformers”, international journal of electrical power and energy systems, elsevier, vol. 65, pp. 102–112, 201.
[4] t. d. rybel, a. singh, j. a. vandermaar, m. wang, j. r. marti, k. d. srivastava, “apparatus for online power transformer winding monitoring using bushing tap injection”, ieee transactions on power delivery, vol. 24, no. 3, pp. 996–1003, july 2009.
[5] m. florkowski, j. furgal, “modelling of winding failures identification using the frequency response analysis (fra) method”, electric power systems research, vol. 79, no. 7, pp. 1069–1075, july 2009.
[6] x. lei, j. li, y. wang, s. mi, c. xiang, “simulative and experimental investigation of transfer function of interturn faults in transformer windings”, electric power systems research, elsevier, vol. 107, pp. 1–8, 2014.
[7] e. g. luna, g. a. mayor, c. g. garcia, j. p.
guerra, “current status and future trends in frequency-response analysis with a transformer in service”, ieee transactions on power delivery, vol. 28, no. 2, pp. 1024–1031, april 2013.
[8] r. stanković, “fft and decision diagram methods for calculation of discrete spectral transforms”, facta universitatis, series: electronics and energetics, vol. 16, pp. 415–422, december 2003.
[9] cigré working group a2-18, publication 227: “life management techniques for power transformer”, 2003.
[10] cigré working group 12.18, “life management of transformers: guidelines for life management techniques for power transformers”, june 2002.
[11] s. puzović, b. koprivica, a. milovanović, m. đekić, “analysis of measurement error in direct and transformer-operated measurement systems for electric energy and maximum power measurement”, facta universitatis, series: electronics and energetics, vol. 27, no. 3, pp. 389–398, september 2014.
[12] b. stošić, “wave digital models of ideal and real transformers”, facta universitatis, series: electronics and energetics, vol. 29, no. 2, pp. 219–231, june 2016.
[13] m. bagheri, m. s. naderi, t. r. blackburn, t. phung, “a case study on fra capability in detection of mechanical defects within a 400mva transformer”, 21, rue d'artois, f-75008 paris, cigre, 2012, d1-313.
[14] r. gopika, s. deepa, “study on power transformer inrush current”, national conference on “emerging research trends in electrical, electronics & instrumentation” (erteei'17), iosr journal of electrical and electronics engineering (iosr-jeee), vol. 2, e-issn: 2278-1676, p-issn: 2320-3331, 2017, pp. 59–63.
[15] g. a. klutke, p. c. kiessler, m. a. wortman, “a critical look at the bathtub curve”, ieee transactions on reliability, vol. 52, no. 1, pp. 125–129, 2003.
[16] s.
chakravorti et al., “recent trends in the condition monitoring of transformers”, power systems, springer-verlag london, 2013.
[17] m. wang, a. j. vandermaar, k. d. srivastava, “review of condition assessment of power transformers in service”, ieee electrical insulation magazine, vol. 18, no. 6, pp. 12–25, november-december 2002.
[18] w. williams, m. jones, s. anderson, “managing critical power transformer assets”, electric energy, 1160 levis, suite 100, terrebonne, qc canada j6w 5s6, vol. 16, no. 6, november-december 2012.
[19] w. h. bartley, “failure analysis of transformer”, imia wgp 33, international association of engineering insurers, 36th annual conference – stockholm, 2003.
[20] t. k. saha, “review of modern diagnostic techniques for assessing insulation condition in aged transformers”, ieee transactions on dielectrics and electrical insulation, vol. 10, no. 5, pp. 903–917, 2003.
[21] m. f. m. yousof, c. ekanayake, t. k. saha, “study of transformer winding deformation by frequency response analysis”, in proceedings of the ieee power & energy society general meeting 2013, ieee, 21-25 july 2013, vancouver, pp. 1–5.
[22] cigre wg a2.26, “mechanical-condition assessment of transformer windings using frequency response analysis (fra)”, 2007.
[23] c. sweetser, t. mcgrail, “sweep frequency response analysis – transformer application”, a technical paper from doble engineering company, version 1.0, pp. 1–47, 2003.
[24] a. kraetge, m. kruger, j. l. velasquez, h. viljeon, a. dierks, “experience with the practical application of sweep frequency response analysis (sfra) on power transformers”, proceedings of the 16th international symposium on high voltage engineering, johannesburg, isbn 978-0-620445849, pp. 1–6, 2009.
[25] j. bak-jensen, b. bak-jensen, s. d. mikkelsen, “detection of faults and ageing phenomena in transformers by transfer function”, ieee transactions on power delivery, vol. 10, no. 1, pp. 318–314, january 1995.
[26] p.
werelius, m. öhlen, l. adeen, e. brynjebo, “measurement considerations using sfra for condition assessment of power transformers”, in proceedings of the international conference on condition monitoring and diagnosis, beijing, china, april 21-24, 2008.
[27] s. a. ghani, y. h. m. thayoob, y. z. y. ghazali, m. s. a. khiar, i. s. chairul, “evaluation of transformer core and winding conditions from sfra measurement results using statistical techniques for distribution transformers”, in proceedings of the ieee international power engineering and optimization conference (peoco2012), melaka, malaysia, pp. 6-7, june 2012.
[28] a. d. y. argole, “insight into sfra responses & interpretations with regard to insulation system of transformer”, in proceedings of the ieee 10th international conference on the properties and applications of dielectric materials, bangalore, india, pp. 24-28, july 2012.
[29] k. usha, j. joseph, s. usa, “location of faults in transformer winding using sfra”, in proceedings of the ieee 1st international conference on condition assessment techniques in electrical systems, 2013.
[30] m. a. sathya, a. j. thomas, s. usa, “prediction of transformer winding displacement from frequency response characteristics”, in proceedings of the ieee 1st international conference on condition assessment techniques in electrical systems, 2013.
[31] a. a. pandya, b. r. parekh, “interpretation of sweep frequency response analysis (sfra) traces for the open circuit and short circuit winding fault damages of the power transformer”, international journal of electrical power and energy systems, elsevier, vol. 62, pp. 890–896, 2014.
[32] v. behjat, a. vahedi, a. setayeshmehr, h. borsi, e. gockenbach, “sweep frequency response analysis for diagnosis of low level short circuit faults on the windings of power transformers: an experimental study”, international journal of electrical power and energy systems, elsevier, vol. 42, pp. 78–90, 2012.
[33] p. m. nirgude, d. ashokraju, a. d.
rajkumar, “application of numerical evaluation techniques for interpreting frequency response measurements in power transformers”, iet science, measurement & technology, vol. 2, no. 5, pp. 275–285, september 2008.
[34] g. m. kennedy, a. j. mcgrail, j. a. lapworth, “using cross-correlation coefficients to analyze transformer sweep frequency response analysis (sfra) traces”, ieee pes power africa 2007 conference and exposition, johannesburg, south africa, 16-20 july 2007.
[35] m. f. m. yousof, c. ekanayake, t. k. saha, “study of transformer winding deformation by frequency response analysis”, in proceedings of the 2013 ieee power and energy society general meeting (pes), vancouver, bc, canada, 2013.
[36] m. h. samimi, s. tenbohlen, a. a. s. akmal, h. mohseni, “effect of different connection schemes, terminating resistors and measurement impedances on the sensitivity of the fra method”, ieee transactions on power delivery, vol. 32, no. 4, pp. 1713–1720, august 2017.
[37] m. h. samimi, s. tenbohlen, a. a. s. akmal, h. mohseni, “improving the numerical indices proposed for the fra interpretation by including the phase response”, international journal of electrical power and energy systems, elsevier, vol. 83, pp. 585–593, 2016.
[38] h. l. willis, m. h.
rashid, “electrical power equipment maintenance and testing”, second edition, crc press, taylor & francis group, usa, 2009.

facta universitatis series: electronics and energetics vol. 30, no. 3, september 2017, pp. 327-350 doi: 10.2298/fuee1703327g

modelling solar cell s-shaped i-v characteristics with dc lumped-parameter equivalent circuits – a review

francisco j. garcía-sánchez 1,2, beatriz romero 1, denise c. lugo-muñoz 2,3, gonzalo del pozo 1, belén arredondo 1, juin j. liou 3, adelmo ortiz-conde 2
1 superior school of experimental science and technology (escet), rey juan carlos university (urjc), móstoles, madrid 28933, spain
2 solid state electronics laboratory (lees), simón bolívar university (usb), caracas 1080a, venezuela
3 emoat, llc, 1933 ayrshier place., oviedo, fl 32765, usa

abstract. this article reviews and appraises the dc lumped-parameter equivalent circuit models that have been proposed so far for representing some types of solar cells that can exhibit, under certain circumstances, a detrimental s-shaped concave deformation within the energy-producing fourth quadrant of their illuminated i–v characteristics. we first present a very succinct recollection of lumped-parameter equivalent circuits that are commonly used to model conventional solar cells in general. we then chronologically present and discuss lumped-parameter equivalent sub-circuits that, combined with conventional solar cell equivalent circuits, are used to specifically represent the undesired s-shaped behaviour. the mathematically descriptive equations of each complete equivalent circuit are also examined, and closed form solutions for the terminal current and voltage as explicit functions of each other are presented and discussed whenever available.
while comparing the most salient features and explaining the practical advantages and disadvantages of such equivalent circuit models, we offer some comments on possible directions for further improvement.
key words: solar cell lumped-parameter equivalent circuit modelling, solar cell concentrated-element equivalent circuit models, s-shaped current-voltage characteristics, s-shape kink, organic solar cells, lambert w function
received february 19, 2017
corresponding author: francisco j. garcía-sánchez, solid state electronics laboratory (lees), simón bolívar university (usb), caracas 1080a, venezuela (e-mail: fgarcia@ieee.org)

1. introduction

the process of designing practical photovoltaic applications calls for the availability of dc lumped-parameter equivalent circuit models as simple as possible to compactly describe the solar cells’ electric behaviour represented by their terminals’ current–voltage (i-v) characteristics, measured in the dark and under standard illumination conditions. solar cell lumped-parameter (or concentrated-element) equivalent circuit models ignore the spatial distribution of the electrical mechanisms present, and instead assume that they are concentrated and represented by certain idealized passive and active, linear and nonlinear electrical components, typically resistors, capacitors, inductors, diodes, and current and voltage sources, located at certain positions in an electrical network. under steady state (dc) conditions neither inductors nor capacitors are used. such simple equivalent circuits constitute essential tools for photovoltaic systems simulation, as well as for the important task of advancing basic and applied research and technological development of emerging solar cells’ materials, structures, and fabrication techniques.
most well-established conventional solar cells under illumination exhibit the type of generic i-v characteristics that can be satisfactorily represented under steady state by the equivalent electrical behaviour of some of the conventional dc lumped-parameter circuit models that are shown in fig. 1 [1]. however, there are innovative developmental or still experimental solar cells, such as some of those based on binary and ternary compound semiconductors, non-crystalline hetero-junctions [2], novel silicon quantum dot solar cells [3], others based on perovskite semiconductors [4-6], and most notably organic semiconductor-based solar cells [7-9], that might exhibit under certain circumstances undesirable deformations of their illuminated i-v characteristics that impair their energy conversion capacity. the main feature of this apparent anomaly becomes evident when the solar cell’s i-v characteristics under illumination present a peculiar concave shape within the fourth quadrant (the power generating quadrant), instead of exhibiting the normally expected, so-called “j” type conventional convex shape. this deformation of the illuminated i-v curve is commonly referred to as the s-shape “kink” of the i-v characteristics [10]. the presence of such a bend seriously reduces the solar cell’s fill factor by depressing the location of the maximum power point, and thus represents a serious impairment for the cell’s power conversion efficiency that must be avoided, minimised or suppressed [7, 11].
in the sections that follow we offer a chronological perspective of the most relevant dc lumped-parameter equivalent circuit models that have been proposed to date for specifically describing in a compact way this adverse s-shaped behaviour observed in the illuminated i-v characteristics of some otherwise promising solar cells.

2. solar cell equivalent circuit models

the simplest possible mathematical description of the i-v characteristics at the terminals of any conventional solar cell measured under illumination conditions consists of adding a photo-generated current to the well known shockley’s ideal diode current equation [12]. the equation resulting from adding these two terms is an explicit compact model of the terminal current expressed as an exponential function of the terminal voltage. this simplest mathematical description of a solar cell’s electrical behaviour under illumination corresponds to a dc lumped-parameter equivalent circuit model that consists of the parallel combination of a diode and an illumination-dependent current source, as portrayed in fig. 1(a). such a descriptive mathematical representation is practically appealing, not only because of its compactness, but also because its explicit nature allows it to be easily inverted and numerically calculated. unfortunately, this simplest model usually falls short of adequately describing all the relevant electrical phenomena that must be considered for solar cell development and photovoltaic system simulation and design. thus, the corresponding very basic dc lumped-parameter equivalent circuit model shown in fig. 1(a) is often deemed to be not accurate enough to be of practical use.

2.1. conventional dc lumped-parameter circuit models

the basic equivalent circuit model is modified to offer a more realistic representation by including other elements, especially parasitic resistors added as lumped elements connected in series and/or in parallel to account for the possible presence of significant ohmic losses, as indicated in figs. 1(b), (c) and (d).
similarly, and in order to be able to better account for the possible presence of more than one significant junction conduction mechanism, the equivalent circuit might also need to include more than one diode connected in parallel with the photo-current source, as presented in fig. 1(e). relevant lumped parameters potentially introduced in these more complex equivalent circuit models, in addition to the value(s) of the series $R_s$ and shunt $R_p = 1/G_p$ resistors, are the magnitudes $I_{01}, I_{02}, \ldots$ of the reverse saturation current(s) and the corresponding value(s) of the junction ideality factor(s) $n_1, n_2, \ldots$ of the possible multiple diodes needed.

fig. 1 typical generic solar cell dc lumped-parameter equivalent circuit models showing the photo-generated current source with: (a) a single ideal diode in parallel; (b) plus a series resistance; (c) plus a parallel conductance; (d) plus both series resistance and parallel conductance; and (e) several ideal diodes plus a series resistance.

the parameters that may be used are supposed to bear direct associations to relevant fundamental microscopic physical features and phenomena actually present in the real solar cell to be modelled. the various circuit elements added to the basic dc equivalent circuit model of fig. 1(a) undoubtedly improve the model’s descriptive fidelity. however, their presence, as shown in figs. 1(b), (c), (d) and (e), also fundamentally complicates the mathematical handling of the resulting descriptive i-v equations. because of them the basic equation ceases to be explicit and becomes an implicit transcendental equation. from the point of view of photovoltaic system simulation and solar cell model parameter extraction through curve fitting, being transcendental is an undesirable trait of the descriptive equations.
except in very few specific cases, they cannot be explicitly solved for the terminal current as a function of the terminal voltage, and vice versa, using only elementary functions.

2.2. conventional models’ mathematically descriptive equations and their solutions

luckily, there is the lambert w function, which we will refer to here as w for short. this function comes in very handy for explicitly solving equations which are made up of both linear and exponential terms, such as those equations that describe the circuits of figs. 1(b), (c), and (d). this special function w, whose utility was ignored to a large extent until not long ago, may be succinctly defined as the solution to the generic linear-exponential equation $z = W(z)\,e^{W(z)}$, where z is any complex number [13, 14]. around the turn of this century w started to become an accepted and increasingly ubiquitous tool for solving various important problems of physics [15, 16]. the problems newly solved by using w prominently include important areas related to semiconductor physics, such as electronic devices and circuits, where linear-exponential type equations abound since they play essential roles in describing the underlying phenomenology. numerical calculation of w is relatively transparent nowadays, since various methods exist to quickly compute the principal branch $W_0(z)$ and other branches of w. additionally, efficient algorithms are routinely implemented in most major mathematical software packages, physics and device modelling tools, and circuit simulation systems. two seminal works dealing with the use of w in the field of electronic circuit problems were published in the year 2000. one was an exact w-based analytical solution, proposed by banwell and jayakumar [17], of the terminal current i as an explicit function of the terminal voltage v for shockley’s modified equation [12].
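as a rough illustration of how the principal branch w0 can be computed numerically, the following python sketch applies halley's iteration to the defining relation z = w·exp(w). it is not taken from any of the cited works: the function name `lambert_w0`, the starting guess and the tolerance are assumptions made for this example, and the routine only handles real z ≥ 0, which is the range that arises in the forward-biased diode problems discussed here.

```python
import math

def lambert_w0(z, tol=1e-14, max_iter=100):
    """Principal branch W0(z) for real z >= 0, by Halley's iteration."""
    if z < 0:
        raise ValueError("this sketch only handles real z >= 0")
    if z == 0:
        return 0.0
    # log(1 + z) is never below W0(z) for z >= 0, so it is a safe starting guess.
    w = math.log1p(z)
    for _ in range(max_iter):
        ew = math.exp(w)
        f = w * ew - z                       # residual of w * e**w = z
        step = f / (ew * (w + 1.0) - (w + 2.0) * f / (2.0 * w + 2.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w

# Check the defining relation z = W0(z) * exp(W0(z)) on a few sample values.
for z in (1e-6, 0.5, math.e, 10.0, 1e6):
    w = lambert_w0(z)
    print(z, w, w * math.exp(w))
```

in practice a library routine would normally be preferred; the sketch is only meant to show that evaluating w is no harder than evaluating any other special function.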
the modified shockley equation solved by banwell and jayakumar describes a circuit consisting of the series combination of a single diode and a lone resistor $R_s$ (similar to the circuit of fig. 1(b) less the current source), and is expressed as:

$$I = I_0\left[\exp\left(\frac{V - I R_s}{n v_{th}}\right) - 1\right], \tag{1}$$

where $I_0$ is the reverse saturation current of the diode, $v_{th} = k_B T/q$ is the thermal voltage and $n$ is the so-called diode ideality factor, which describes how much the diode’s junction carrier transport mechanisms deviate from supposedly “ideal” behaviour ($n = 1$). the exact w-based analytical solution of the terminal current as an explicit function of the terminal voltage presented by banwell and jayakumar is [17]:

$$I = \frac{n v_{th}}{R_s}\, W_0\left[\frac{I_0 R_s}{n v_{th}} \exp\left(\frac{V + I_0 R_s}{n v_{th}}\right)\right] - I_0. \tag{2}$$

because only a series-connected resistor was assumed to be involved in this problem, just the terminal current i needs to be explicitly solved using w, whereas the terminal voltage v can be directly expressed as an explicit function of the terminal current using the natural logarithm elementary function:

$$V = n v_{th} \ln\left(\frac{I + I_0}{I_0}\right) + I R_s. \tag{3}$$

the other seminal work about the use of w in the field of electronic circuit problems was published by ortiz-conde et al, also in 2000 [18]. it was more comprehensive in the sense that it also contemplated the presence of a significant shunt conductance $G_p = 1/R_p$. this work presented the derivation of the two exact w-based analytical solutions, for both the terminal current and the terminal voltage as explicit functions of each other, of the transcendental equation corresponding to the circuit composed of a single diode and both series- and shunt-connected resistors, $R_s$ and $R_p$, respectively (similar to the circuit of fig. 1(d) less the current source).
shockley’s modified terminal current equation in this case has an extra implicit term that accounts for the additional shunt conductance:

$$I = I_0\left[\exp\left(\frac{V - I R_s}{n v_{th}}\right) - 1\right] + \frac{V - I R_s}{R_p}. \tag{4}$$

the explicit w-based closed form analytic solutions for both i and v as explicit functions of each other, as presented by ortiz-conde et al [18] are, for the current:

$$I = \frac{V}{R_s} - \frac{V + I_0 R_s}{R_s (R_s G_p + 1)} + \frac{n v_{th}}{R_s}\, W_0\left[\frac{I_0 R_s}{n v_{th} (R_s G_p + 1)} \exp\left(\frac{V + I_0 R_s}{n v_{th} (R_s G_p + 1)}\right)\right], \tag{5a}$$

or

$$I = \frac{V}{R_s} - \frac{n v_{th}}{R_s} \ln\left\{\frac{n v_{th} (R_s G_p + 1)}{I_0 R_s}\, W_0\left[\frac{I_0 R_s}{n v_{th} (R_s G_p + 1)} \exp\left(\frac{V + I_0 R_s}{n v_{th} (R_s G_p + 1)}\right)\right]\right\}; \tag{5b}$$

and for the voltage:

$$V = I R_s + \frac{I + I_0}{G_p} - n v_{th}\, W_0\left[\frac{I_0}{n v_{th} G_p} \exp\left(\frac{I + I_0}{n v_{th} G_p}\right)\right], \tag{6a}$$

or

$$V = I R_s + n v_{th} \ln\left\{\frac{n v_{th} G_p}{I_0}\, W_0\left[\frac{I_0}{n v_{th} G_p} \exp\left(\frac{I + I_0}{n v_{th} G_p}\right)\right]\right\}. \tag{6b}$$

it might be noticed that eliminating the shunt conductance loss (letting $G_p = 1/R_p \to 0$) does revert (5) back into (2), as it should, but does not allow (6) to be converted directly into (3).

four years after the publication of these two seminal works about how to use w to derive exact explicit solutions for both i and v of a circuit consisting of a (dark) diode with series and parallel resistors, jain and kapoor illuminated in 2004 the previously dark circuit model by adding in parallel with the diode a current source of photo-generated intensity $I_{ph}$ [19]. by so doing, the previously dark circuit became the illuminated solar cell equivalent circuit model shown in fig. 1(d).
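the dark-circuit solutions can be checked numerically. the python sketch below is only an illustration under stated assumptions: it takes eq. (4) as i = i0[exp((v − i·rs)/(n·vth)) − 1] + (v − i·rs)·gp, evaluates the explicit forms (5a) and (6a) through a small halley-type routine for w0, and verifies that they satisfy eq. (4) and invert each other. the function names, the thermal voltage and all parameter values are hypothetical, and both rs and gp must be non-zero.

```python
import math

def lambert_w0(z, tol=1e-14, max_iter=100):
    """Principal branch W0(z) for real z >= 0 (Halley's iteration)."""
    if z == 0:
        return 0.0
    w = math.log1p(z)
    for _ in range(max_iter):
        ew = math.exp(w)
        f = w * ew - z
        step = f / (ew * (w + 1.0) - (w + 2.0) * f / (2.0 * w + 2.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w

def diode_current(v, i0, n, rs, gp, vth=0.02585):
    """Eq. (5a): terminal current as an explicit function of voltage."""
    a = rs * gp + 1.0
    arg = i0 * rs / (n * vth * a) * math.exp((v + i0 * rs) / (n * vth * a))
    return v / rs - (v + i0 * rs) / (rs * a) + n * vth / rs * lambert_w0(arg)

def diode_voltage(i, i0, n, rs, gp, vth=0.02585):
    """Eq. (6a): terminal voltage as an explicit function of current.
    The exponential can overflow when (i + i0) / (n * vth * gp) is large."""
    arg = i0 / (n * vth * gp) * math.exp((i + i0) / (n * vth * gp))
    return i * rs + (i + i0) / gp - n * vth * lambert_w0(arg)

def implicit_residual(v, i, i0, n, rs, gp, vth=0.02585):
    """Left side minus right side of the implicit eq. (4)."""
    vd = v - i * rs
    return i - (i0 * math.expm1(vd / (n * vth)) + vd * gp)

# Hypothetical parameters: I0 = 1 nA, n = 1.5, Rs = 2 ohm, Gp = 1 mS.
params = dict(i0=1e-9, n=1.5, rs=2.0, gp=1e-3)
for v in (0.1, 0.4, 0.7):
    i = diode_current(v, **params)
    print(v, i, implicit_residual(v, i, **params), diode_voltage(i, **params))
```

no iteration on the circuit equation itself is needed: each point of the curve costs one evaluation of w0, which is what makes these explicit forms attractive for curve fitting.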
the new constant current concisely represents the photo-current $I_{ph}$ generated by the transport and collection of separated charge carriers that are photo-generated within the cell’s body by the absorption of sufficiently energetic incoming photons that penetrate through the diode’s illuminated surface (now a photovoltaic diode). this photo-current must be inserted as an additional constant current term into the descriptive equation (4), so that now under illumination (4) becomes:

$$I = I_0\left[\exp\left(\frac{V - I R_s}{n v_{th}}\right) - 1\right] + \frac{V - I R_s}{R_p} - I_{ph}. \tag{7}$$

the addition of the constant to transcendental eq. (4) does not alter the manner in which eq. (7) is solved, which remains the same as it was for eq. (4) [18]. the resulting exact w-based analytical solutions for both the terminal current and the terminal voltage, as explicit functions of each other, published in 2004 by jain and kapoor [19], are similar to eqs. (5) and (6), except for the presence of the added $I_{ph}$ term. for the current:

$$I = \frac{V}{R_s} - \frac{V + (I_0 + I_{ph}) R_s}{R_s (R_s G_p + 1)} + \frac{n v_{th}}{R_s}\, W_0\left[\frac{I_0 R_s}{n v_{th} (R_s G_p + 1)} \exp\left(\frac{V + (I_0 + I_{ph}) R_s}{n v_{th} (R_s G_p + 1)}\right)\right], \tag{8a}$$

or

$$I = \frac{V}{R_s} - \frac{n v_{th}}{R_s} \ln\left\{\frac{n v_{th} (R_s G_p + 1)}{I_0 R_s}\, W_0\left[\frac{I_0 R_s}{n v_{th} (R_s G_p + 1)} \exp\left(\frac{V + (I_0 + I_{ph}) R_s}{n v_{th} (R_s G_p + 1)}\right)\right]\right\}; \tag{8b}$$

and for the voltage,

$$V = I R_s + \frac{I + I_0 + I_{ph}}{G_p} - n v_{th}\, W_0\left[\frac{I_0}{n v_{th} G_p} \exp\left(\frac{I + I_0 + I_{ph}}{n v_{th} G_p}\right)\right], \tag{9a}$$

or

$$V = I R_s + n v_{th} \ln\left\{\frac{n v_{th} G_p}{I_0}\, W_0\left[\frac{I_0}{n v_{th} G_p} \exp\left(\frac{I + I_0 + I_{ph}}{n v_{th} G_p}\right)\right]\right\}. \tag{9b}$$

therefore, having inserted the additional photocurrent term $I_{ph}$ into eqs. (5) and (6), they have become the w-based solutions (8) and (9), which explicitly describe the electric behaviour of illuminated solar cells with significant series and shunt parasitic resistances. it is interesting to check that turning the light off (by letting $I_{ph} \to 0$) reverts eqs.
(8) and (9), as they should, back into the original eqs. (5) and (6), respectively. one year later, in 2005, jain and kapoor directly used these same w function-based solutions, corresponding to the conventional solar cell lumped-parameter equivalent circuit model of fig. 1(d), to study organic solar cells [20].
in addition to a significant presence of both series and shunt parasitic resistances, the measured i-v characteristics of a solar cell sometimes reveal the presence of more than one significant conduction mechanism. in such cases multiple-diode equivalent circuit models are called for. they contain more than just one diode in parallel with the photocurrent source, as shown in fig. 1(e). consequently, the corresponding equations turn out to be of a multi-exponential nature and, thus, are in general more difficult, or even impossible, to solve exactly in an explicit form. regardless of the difficulty, such multiple-diode equivalent circuits must be used whenever the presence of multiple junction conduction mechanisms must be adequately described because their relative significance so demands [21, 22]. for a complete review of the existing literature about generic solar cell dc lumped-parameter equivalent circuit models and their corresponding equations, solutions, and methods for numerically extracting their parameters, see refs. [1, 23-26] and the references cited therein.
although most solar cells can be adequately described by one of the just mentioned generic lumped-parameter equivalent circuit models, some researchers still prefer to use other models that are specifically intended for particular types of solar cells. for example, j. w.
jin, et al recently published a universal compact model for organic solar cells, which consists of individually describing three different regimes of operation and then combining their mathematical descriptions into a single equation [27]. unfortunately, even that universal model for organic solar cells is not capable of describing the concave s-shaped behaviour occasionally exhibited by the illuminated i-v characteristics of organic solar cells. in fact, none of the just described conventional dc lumped-parameter equivalent circuit models seems to be capable by itself of properly modelling the occurrence of the undesirable s-shaped behaviour observed in the illuminated i-v characteristics of several types of solar cells [28, 29]. consequently, more suitable specialized circuit models need to be introduced to specifically represent the s-shaped kink. in the following sections we will analyse the issue of how to best describe, through other dc lumped-parameter equivalent circuit models, the harmful s-shaped deformation of the illuminated i–v characteristics, whose presence might seriously spoil the energy conversion performance of solar cells, especially but not exclusively those of organic solar cells [7].

3. the s-shape kink

as already mentioned above, some promising important types of solar cells can, and do, under certain circumstances exhibit the undesirable s-shaped concave deformation of their illuminated i-v characteristics. the s-shape kink is most evident in the fourth quadrant, where it can seriously reduce the fill factor, and thus impair the solar energy conversion efficiency of the device. therefore, when this is the case, corrective or palliative measures must be adopted to avoid or suppress the emergence of this detrimental kink. many researchers have proposed several explanations of the probable causes of this s-shaped concavity, but its origins are still not totally clear.
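the fill factor referred to above is commonly defined as ff = pmax/(voc·isc). the python sketch below, given only as an illustration, estimates it from sampled (v, i) points. the ideal model of fig. 1(a) is used merely to generate test data, with the generated current taken as negative in the fourth quadrant; the function names, the thermal voltage and the parameter values are all hypothetical.

```python
import math

VTH = 0.02585  # assumed thermal voltage near 300 K, in volts

def ideal_cell_current(v, iph, i0, n):
    """Ideal model of fig. 1(a): diode current minus the photo-current."""
    return i0 * math.expm1(v / (n * VTH)) - iph

def fill_factor(points):
    """Estimate FF = Pmax / (Voc * Isc) from (V, I) samples that start
    at V = 0 and are ordered by increasing V."""
    isc = -points[0][1]                    # short-circuit current magnitude
    pmax = max(-v * i for v, i in points)  # largest delivered power among samples
    for (v_a, i_a), (v_b, i_b) in zip(points, points[1:]):
        if i_a < 0.0 <= i_b:               # the sign change of I brackets Voc
            voc = v_a + (v_b - v_a) * (-i_a) / (i_b - i_a)
            return pmax / (voc * isc)
    raise ValueError("the samples never cross I = 0")

# Hypothetical cell: Iph = 5 mA, I0 = 1 nA, n = 1.5, sampled every millivolt.
samples = [(k * 1e-3, ideal_cell_current(k * 1e-3, 5e-3, 1e-9, 1.5))
           for k in range(701)]
ff = fill_factor(samples)
print(ff)
```

applied to a measured curve with an s-shaped kink, the same estimate would be expected to come out lower, since the kink depresses the maximum power point while leaving voc and isc largely unchanged.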
materials-related charge transport restrictions and charge accumulation-related interface phenomena, which alter the distribution of the solar cell’s internal electric field, are generally regarded to be mainly responsible for the occurrence of the s-shaped kink [5, 7, 11, 30-33]. in organic solar cells, misaligned metal work functions and selective blocking contacts can produce injection barriers, and insulating interfacial layers between the metal and the active layers can produce extraction barriers, both of which might produce the harmful s-shape [34]. similar, more or less pronounced s-shapes can also be observed in the measured characteristics of many types of experimental and developmental photovoltaic devices. this is the case, for example, of transient forward, or even reverse, i−v sweeps of perovskite semiconductor-based solar cells, where both ion migration and mobile charge trapping seem to cause undesirable scan direction-dependent hysteresis in their illuminated i-v curves [4, 5]. the same kind of s-shaped kink has also been observed in a-si/c-si hetero-junction solar cells at certain temperature and illumination levels [2]. other types of emerging, more exotic photovoltaic devices display this type of detrimental behaviour. that is the case, for example, of the novel experimental ultra-thin photovoltaic cells made with van der waals force-bonded hetero-structures containing atomically thin layers of semiconducting transition metal dichalcogenides (such as mos2, ws2 and wse2) [35]. their illuminated i-v curves also exhibit detrimental s-shaped deformations that need to be suppressed for this attractive type of device to ever achieve usable energy conversion efficiency levels. it is, therefore, essential to identify the possible origins of the s-shape kink if we intend to avoid or diminish it.
thus, identifying and understanding the origin(s), as well as quantifying their influence on the s-shape kink’s emergence, growth or suppression, becomes a crucial task for optimising the design of such solar cells. this goal may be conveniently achieved through the introduction of additional lumped elements into an existing conventional solar cell equivalent circuit model, to modify it so that it may electrically account for the full range of illuminated i-v characteristics, especially including the s-shaped kink behaviour. this was the kind of analysis followed by l. zuo, et al, among others, for investigating the origin of the s-shaped kink in the i-v characteristics of organic solar cells [36], who used an equivalent circuit model approach [37] as a tool for analysis. the object of study then becomes the evolution of the solar cell’s i-v characteristics experimentally measured under different operating conditions (illumination intensity, temperature, etc), or in response to adjustments in material composition, morphology, structural design and fabrication specifications, etc. the analysis of how the equivalent circuit’s lumped-elements’ parameter values (as extracted by fitting the model’s equations to the measured data) change in response to modifications of the conditions can be used as a powerful tool to scrutinise and understand the causes of the s-shape kink and, thus, to learn how it may be best suppressed.

4. equivalent circuit modelling of the s-shape kink

many solar cells unfortunately exhibit this undesirable s-shaped “kink” visibly in the fourth quadrant of their illuminated i-v characteristics. since the kink cannot be described using only the conventional dc lumped-parameter equivalent circuit models shown in fig. 1 and discussed in the preceding sections, ancillary circuits have been proposed for over a decade [28, 29].
the additional lumped elements must be incorporated together with a conventional dc lumped-parameter equivalent circuit model to offer an overall description of the illuminated i-v characteristics.

4.1. model by b. mazhari (2006)

as early as 2006 mazhari already understood the incapacity of stand-alone existing conventional dc lumped-parameter equivalent circuit models for properly describing the measured i-v characteristics of some illuminated organic solar cells [38]. he suggested that the commonly held hypothesis that the photo-current of organic solar cells remains essentially constant throughout the whole fourth quadrant (0<v<voc; i<0) [...] (i>0; v>voc), where the current yielded by this model is needlessly forced to level off, following a general trend imposed to a great extent by the i-v locus of rp2 (see dashed blue lines in fig. 6). the reason is that in this model, shown in fig. 3, the first quadrant is primarily dominated, for large forward voltages v>>voc, by the linear parallel resistor rp2, and thus the i–v curve turns out to be quasi-linear. instead, what seems to actually happen in real cells that exhibit these s-shape kinks is that their i-v characteristic, when measured under illumination in the first quadrant beyond the open circuit voltage (i>0; v>voc), at some point starts to describe an upward turn and continues to grow in what appears to be an exponential-like fashion [7, 30, 31, 33]. this circuit has been successfully applied in different experiments that involve s-shape removal with annealing [43] and uv soaking [46], and it has been validated with impedance measurements and ac modeling [47].

fig. 4 dc lumped-parameter equivalent circuit model proposed by gaur and kumar [29] to describe the i-v characteristics of polymer solar cells under dark conditions.
it looks almost like a conventional double diode with series and shunt resistances model, except that the diodes have opposite polarities.

4.3. model by gaur and kumar (2013)

in 2013 kumar and gaur [28, 29] proposed an improved equivalent circuit model to represent the behaviour of polymer solar cells under different environmental conditions. the proposed model does not result in a single compact formulation. the actual equivalent circuit turns out to be very complex and contains many circuit elements. the main reason for this is that the model itself separately treats the dark and illuminated characteristics, and even the forward and the reverse characteristics are dealt with separately. the proposed dark equivalent circuit is a parallel combination of a shunt resistor and two diodes connected with opposite polarity, all connected in series with a series resistor, as presented in fig. 4. the dc equivalent circuit proposed to represent the i–v characteristics under illumination is based on the chief assumption that the photo-current is not constant but varies with applied voltage. it contains the dark circuit elements plus: a zener diode, up to four more diodes, two photo-generated current sources, and additional unconventional resistors. it is shown in fig. 1(b) of [28] and fig. 7 of [29], but it is too complex to be of any practical use to reproduce here. although this model allowed kumar and gaur to understand the phenomenology of degraded p3ht:pcbm polymer solar cells, its complexity is such that it is not suggested as a practical compact equivalent circuit model to efficiently represent in general the s-shaped concave deformations that are observed under illumination in the i-v characteristics of some types of solar cells.

4.4. model by f. j. garcía-sánchez et al (2013)

to deal with the inability of the model by araujo de castro et al [10] shown in fig.
3 to faithfully model real measured i-v data far beyond voc, a minor but crucial modification was introduced in 2013 by garcía-sánchez et al [48]. the improvement affects the sub-circuit #2i part of the proposed equivalent circuit model diagram presented in fig. 5.

fig. 5 the solar cell dc lumped-parameter equivalent circuit model proposed by garcía-sánchez et al [48] to allow describing the s-shaped kink. it includes the same conventional single ideal diode with series and shunt resistances as before (sub-circuit #1), but the series-connected sub-circuit #2 has been modified to replace the previous resistor rp2 by a third diode connected with polarity opposite to that of the second diode, that is, the same as that of the diode of sub-circuit #1.

as in the model by araujo de castro et al [10], to reproduce the s-shaped concave region up to about the open circuit voltage (i<0; 0<v<voc) [...] (i>0; v>voc). the substitution of the parallel resistor rp2 by the third diode modifies the current through sub-circuit #2i, which now is:

$$I = -I_{02}\left[\exp\left(-\frac{V_2}{n_2 v_{th}}\right) - 1\right] + I_{03}\left[\exp\left(\frac{V_2}{n_3 v_{th}}\right) - 1\right]. \tag{18}$$

as before, we can write the solution for this equivalent circuit’s terminal voltage v as a function of the terminal current i by adding the voltage drop $I R_s$ across the series resistor $R_s$ and the voltages $V_1$ and $V_2$ across the terminals of each of its two sub-circuits, as in (13), which is repeated here:

$$V = I R_s + V_1 + V_2. \tag{19}$$

however, now $V_2$ must be obtained by solving (18), and that solution is not explicit in general. it has closed-form explicit solutions in some particular cases: when both ideality factors $n_2$ and $n_3$ are equal, or when one is twice as large as the other [48]. otherwise, in general (18) would have to be solved numerically or approximately for the voltage $V_2$. this is, of course, the price that must be paid for having a model with two diodes connected in parallel.
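the trade-off just described can be sketched in python. the example assumes the sign convention in which the reverse-connected second diode carries the negative current and the third diode the positive one, i.e. i = −i02[exp(−v2/(n2·vth)) − 1] + i03[exp(v2/(n3·vth)) − 1]. for equal ideality factors this expression is quadratic in u = exp(v2/(n·vth)), which gives an explicit v2; otherwise a plain bisection is used here as one possible numerical method. function names, the voltage bracket and all parameter values are hypothetical.

```python
import math

VTH = 0.02585  # assumed thermal voltage near 300 K, in volts

def subcircuit_current(v2, i02, i03, n2, n3):
    """Eq. (18) under the assumed signs: two anti-parallel diodes."""
    return (-i02 * math.expm1(-v2 / (n2 * VTH))
            + i03 * math.expm1(v2 / (n3 * VTH)))

def v2_explicit_equal_n(i, i02, i03, n):
    """Closed form for n2 == n3 == n: solve i03*u**2 - b*u - i02 = 0 for u > 0."""
    b = i + i03 - i02
    s = math.sqrt(b * b + 4.0 * i02 * i03)
    # Pick the algebraically equivalent root expression that avoids cancellation.
    u = (b + s) / (2.0 * i03) if b >= 0 else 2.0 * i02 / (s - b)
    return n * VTH * math.log(u)

def v2_bisection(i, i02, i03, n2, n3, lo=-2.0, hi=2.0, iters=200):
    """General case: the current increases strictly with v2, so bisect on it."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if subcircuit_current(mid, i02, i03, n2, n3) < i:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical parameters: I02 = 1 uA, I03 = 10 nA.
for i in (-5e-3, 0.0, 5e-3):
    print(i,
          v2_explicit_equal_n(i, 1e-6, 1e-8, 1.5),   # n2 = n3 = 1.5
          v2_bisection(i, 1e-6, 1e-8, 1.5, 3.0))     # n3 = 2 * n2
```

the explicit branch costs one square root and one logarithm per point, whereas the general branch needs an iterative search for every current value, which is the practical cost alluded to above.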
to illustrate the difference that the addition of diode 3 makes regarding the description of the observed upturn of the illuminated i–v curve for i>0; v>voc, we present in fig. 6, in linear and semi-logarithmic scales, the synthetic i-v characteristics of a hypothetical solar cell under illumination, as generated by numerical calculation using the two dc lumped-parameter equivalent circuit models (depicted in figs. 3 and 5), with suitable parameter values indicated in the inset of fig. 6(b). notice that in this particular example the two ideality factors $n_2$ and $n_3$ were chosen to have equal values, so that the solution for $V_2$ turns out to be explicit [48]. therefore the terminal voltage v calculated from (19) is also an explicit function of the terminal current. as can be seen in fig. 6, the equivalent circuit models of figs. 3 and 5 adequately describe, as expected, the s-shaped kink in the fourth quadrant of the illuminated i–v characteristics (i<0; 0<v<voc) [...] (i>0; v>voc).

fig. 6 comparison of the s-shaped kinks in two synthetic illuminated i-v characteristics, presented in linear (a) and semi-logarithmic (b) scales, as generated by the dc lumped-parameter equivalent circuit models shown in figs. 3 and 5, using the parameter values indicated within the lower pane (dotted black lines = sub-circuit #1, dashed blue lines = first model by araujo de castro et al [10], continuous red lines = model by garcía-sánchez et al [48]).

the outstanding difference between both equivalent circuit models is easily visualised by comparing the dashed blue line and the continuous red line curves presented in fig. 6, which correspond to i-v characteristics calculated with the model by araujo de castro et al [10] of fig. 3, and with the model by garcía-sánchez et al [48] of fig. 5, respectively.
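a synthetic curve of this kind can be generated by evaluating v = i·rs + v1 + v2 over a sweep of the terminal current. the python sketch below is only an illustration and not the calculation behind fig. 6: it simplifies sub-circuit #1 by omitting its shunt resistor, so that v1 is a plain logarithm, assumes equal ideality factors for the two anti-parallel diodes so that v2 is explicit, and uses made-up parameter values and function names throughout.

```python
import math

VTH = 0.02585  # assumed thermal voltage near 300 K, in volts

def v1_subcircuit1(i, iph, i01, n1):
    """Voltage across sub-circuit #1, simplified here by omitting Rp1."""
    return n1 * VTH * math.log((i + iph + i01) / i01)

def v2_subcircuit2i(i, i02, i03, n):
    """Voltage across the anti-parallel diodes for equal ideality factors n."""
    b = i + i03 - i02
    s = math.sqrt(b * b + 4.0 * i02 * i03)
    u = (b + s) / (2.0 * i03) if b >= 0 else 2.0 * i02 / (s - b)
    return n * VTH * math.log(u)

def terminal_voltage(i, p):
    """Series addition of voltages: V = I*Rs + V1 + V2 at the same current I."""
    return (i * p["rs"]
            + v1_subcircuit1(i, p["iph"], p["i01"], p["n1"])
            + v2_subcircuit2i(i, p["i02"], p["i03"], p["n"]))

# Hypothetical parameter values, chosen only to exercise the functions.
params = {"rs": 5.0, "iph": 5e-3, "i01": 1e-9, "n1": 1.5,
          "i02": 1e-5, "i03": 1e-7, "n": 2.0}

# Sweep the current from just above -Iph into forward conduction.
currents = [-0.999 * params["iph"] + k * 1e-4 for k in range(101)]
curve = [(terminal_voltage(i, params), i) for i in currents]
for v, i in curve[::20]:
    print(v, i)
```

because v2 vanishes at i = 0, the open-circuit voltage of the complete model coincides with that of sub-circuit #1, while v2 is negative for i < 0 and therefore shifts the fourth-quadrant branch towards lower voltages.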
as an example of a real exponential-like upward bend in the first quadrant of the i-v curve beyond the open circuit voltage (i>0; v>voc), data points corresponding to an experimental organic solar cell with s-shaped kink, measured under arbitrary illumination and described in [48], are presented in fig. 7.

fig. 7 illuminated i-v characteristics of an experimental organic solar cell with s-shaped kink (dashed black straight line = series resistance i-v curve, dotted black line = sub-circuit #1 i-v curve, continuous dashed blue lines = sub-circuit #2 i-v curve and total model playback of previous model, red lines = sub-circuit #2i i-v curve and total model playback of newer model, green circles = measured i-v data).

also shown in fig. 7 is the playback calculated using the model by garcía-sánchez et al [48] with the parameter values indicated in the figure, which had been previously extracted by curve fitting of the model’s equation to the originally measured data. notice that in this case the relation $n_3 = 2 n_2$ was imposed as a fitting condition, so that the solution for $V_2$ also turns out to be explicit [48], and the terminal voltage v calculated from (19) is also an explicit function of the terminal current. whenever the model’s equations can be solved explicitly, we can avoid numerical iteration and thus ease the necessary curve fitting to experimental data when extracting the cell’s model parameters. such explicit equations are desirable also because they may be analytically operated on, which facilitates derivation of other analytic expressions, such as the temperature dependence of the open-circuit voltage. for assessment purposes, fig.
7 includes three separate i-v curves: two correspond to the conventional solar cell equivalent circuit model (sub-circuit #1): one is the rs curve (dashed black straight line), and the other (dotted black curve) corresponds to the parallel combination of the constant photo-current source, the first diode, and its companion shunt resistor rp1. the third curve (continuous red line) corresponds to the s-shape-generating sub-circuit #2i, which is made up of the parallel combination of the second and third diodes with opposite polarities. a quick look at the shapes of the curves in fig. 7, in light of the terminal voltage equation (19), visually indicates how the s-shape kink is formed in the total model’s i-v curve (shown in fig. 7 on top of the green circles that represent the data points). in fact, the simple graphical addition of the three curves along the voltage axis, equivalent to adding the voltages across the three series-connected parts of the model (rs, sub-circuit #1 and sub-circuit #2i), confirms that the result is indeed a sort of double s shape, by virtue of the upward turn at its highest voltage end. additionally, we may notice that the inflexion point of the total i-v curve shown is located at voc, as expected from the fact that the i-v characteristic of sub-circuit #2i, which is the anti-parallel combination of the two extra diodes, has its inflexion point at zero voltage. it is worth mentioning at this point that the proposal to connect a conventional solar cell equivalent circuit model in series with an additional circuit configured like sub-circuit #2i is not the result of an arbitrary attempt to empirically reproduce the observed upturn beyond voc.
rather, it is based on a certain understanding of how best to generalise the possible mechanisms that might be present in the different types of solar cells that exhibit this upturn beyond voc. therefore, a configuration such as that of sub-circuit #2i is most probably justifiable as a reasonable circuital representation of specific underlying physical phenomena taking place near the interfaces of solar cells that display the s-shaped kink.

4.5. model by l. zuo et al (2014)

l. zuo et al [36] proposed in 2014 an improved equivalent circuit model for organic solar cells. they explain their proposal saying: “in view of the previous studies, an improved equivalent circuit is proposed to interpret the origin of s-shaped i–v curve and its effect on device performance.” their equivalent circuit model contains, like those proposed before, a sub-circuit with a rectifying junction connected in series with the conventional single-diode photovoltaic equivalent circuit, which is considered the essential reason for the s-shape curve. however, no mention is made in [36] of the previous models already proposed by araujo de castro et al [10] and garcía-sánchez et al [48] in 2010 and 2013, respectively. this sub-circuit is shown within the complete equivalent circuit model diagram presented in fig. 8. notice that there are two remarkable differences with respect to the original model by araujo de castro et al [10] (see fig. 1(b) of [36]). the first and foremost is that the second diode in sub-circuit #2 is connected with the same forward polarity as the diode in the photovoltaic sub-circuit #1, whereas in the model proposed by araujo de castro et al [10] the second diode in sub-circuit #2 was connected with reverse polarity, opposite to that of the diode in the photovoltaic sub-circuit #1 (see fig. 3).
in this case the second diode is supposed to be a schottky barrier junction introduced to represent the anode interface current caused by the rectifying properties induced by interfacial dipoles, unbalanced charge transport, etc. nonetheless, it is noted that this model is not limited to the use of schottky barriers, and thus any rectifying junction or non-ohmic contact would do. we must draw attention here to the fact that if an ideal diode with the same forward polarity as the photovoltaic diode were connected by itself (without rp2) in series with sub-circuit #1 of fig. 8, it would certainly modify the shape of the illuminated i-v curve in the first quadrant, but at the same time it would suppress almost completely the current in the fourth quadrant. therefore, if a significant (reverse) current is to flow through sub-circuit #2 under illumination, there must be substantial shunt current going around that second diode.

fig. 8 organic solar cell dc lumped-parameter equivalent circuit model proposed by zuo et al [36] to describe the s-shaped kink. it includes the conventional single ideal diode with series and shunt resistances (sub-circuit #1), a series-connected sub-circuit #2 with a single schottky diode and a parallel resistor, and an additional series resistor.

that means that the value of the resistor rp2 that shunts the second forward diode in sub-circuit #2 (fig. 8) must be small. we would like to mention in passing that a series combination of ideal diodes with equal polarity was used in the past to model amorphous silicon junctions [49]. the second difference introduced in this model is a minor one. it refers to the fact that now there is a second series resistor rs2, in addition to rs1, the series resistor already present in the model proposed by araujo de castro et al [10] (compare figs. 3 and 8).
although this second series resistor rs2 seems redundant, because from a circuit point of view it can be absorbed into rs1, according to the explanation given in [36] this resistor “rs2 stands for the series resistance of each layer and interface resistance.”

4.6. second model by f. araujo de castro et al (2016)

seeking further generalisation, araujo de castro et al published in 2016 [41] a modification of the previous model by garcía-sánchez et al [48]. in fact, what they proposed is a generalised 3-diode model (shown in fig. 9) aimed to, in these authors’ own words, “gain insight into the modelling and parametrisation of organic solar cell current voltage curves” [41]. the modification introduced by araujo de castro et al consists of two specific changes that are made to the previous model.

fig. 9 second solar cell dc lumped-parameter equivalent circuit model, proposed as an improvement by araujo de castro et al [41]. it is said to improve on the previous equivalent circuit model by garcía-sánchez et al [48]. the previously present series resistor has been eliminated. the previously eliminated shunt resistor rp2 has been added again, but now in parallel with the two diodes in the series-connected sub-circuit #2.

the first change made is the restoration of the shunt resistor rp2, originally connected in parallel with the single diode of sub-circuit #2 in the first model by araujo de castro et al [10] shown in fig. 3, which was later eliminated by garcía-sánchez et al [48] and replaced by a third, forward-polarity diode in anti-parallel connection with the second diode. that shunt resistor rp2 was restored by araujo de castro et al [41], but it is now connected in parallel with the two diodes of sub-circuit #2i in the model by garcía-sánchez et al [48] shown in fig. 5.
the second change made was to eliminate the series resistor rs, which had been present in earlier equivalent circuit models. the first change in [41], that is, the restoration of the shunt resistor rp2, seems to be a perfectly reasonable and necessary decision from a phenomenological point of view. the presence of that resistor rp2, which was present in the first model by araujo de castro et al [10] and was then eliminated and replaced by a diode in the model by garcía-sánchez et al [48], seems to be crucial for properly modelling bulk transport within the body of the solar cell. from a graphical point of view, resistor rp2 controls the slope of the i-v curve of sub-circuit #2i around the origin. at the same time, the presence within sub-circuit #2i of the third diode in anti-parallel connexion with diode 2, introduced by garcía-sánchez et al [48], also seems to be necessary in order to produce the upward bend observed in the i-v curve beyond voc. therefore, the decision adopted in [41] of keeping both elements, the original shunt resistor rp2 and the third diode introduced in [48], seems to be the best way to represent in the circuit two physical phenomena that are not likely to be mutually exclusive, since one seems to come mainly from the bulk while the other is probably of interfacial origin. on the other hand, the second change made in [41], the elimination of the series resistor rs, which had been present in all earlier solar cell lumped-parameter equivalent circuit models, does not seem to be a convenient decision. in fact, from a methodological point of view, that series resistor rs should not even be considered as part of the s-shape-generating sub-circuit #2, but as part of the conventional solar cell circuit model. therefore, there does not seem to be a good reason why rs should be substantially altered when adding an s-shape-generating sub-circuit to the total model.
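the graphical role attributed above to rp2 can be made slightly more concrete by linearising sub-circuit #2i at zero bias. the expression below is only a sketch: it assumes ideal shockley-type diodes with saturation currents i02 and i03, ideality factors n2 and n3 and thermal voltage vth, all shunted by rp2. each diode then contributes its zero-bias conductance regardless of polarity, and the resistor adds its own:

\[
\left.\frac{di}{dv_2}\right|_{v_2=0} = \frac{i_{02}}{n_2 v_{th}} + \frac{i_{03}}{n_3 v_{th}} + \frac{1}{r_{p2}}
\]

for hypothetical values i02 = i03 = 1 na, n2 = n3 = 2 and vth = 25.85 mv, each diode term is roughly 2e-8 s, whereas rp2 = 1 kΩ contributes 1e-3 s. under these assumptions the slope around the origin is set almost entirely by rp2.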
to write the solution for this equivalent circuit’s terminal voltage v as a function of the terminal current i, only the voltages v1 and v2 across the terminals of each of its two sub-circuits need be added, since rs has been eliminated. therefore, (19) becomes simply:

\[ v = v_1 + v_2 . \tag{20} \]

however, now v2 must be obtained by solving:

\[ i = -i_{02}\left[\exp\left(-\frac{v_2}{n_2 v_{th}}\right) - 1\right] + i_{03}\left[\exp\left(\frac{v_2}{n_3 v_{th}}\right) - 1\right] + \frac{v_2}{r_{p2}} . \tag{21} \]

the solution of (21) is not explicit in general, so (21) would have to be solved numerically or approximately for the voltage v2.

4.7. model by p. j. roland et al (2016)

regardless of the model development methodology used, the fact is that rs is an indispensable lumped element of any solar cell equivalent circuit model, needed to describe the omnipresent parasitic resistance at the contacts and other collecting electrode resistance. therefore, it is important to keep this series resistor rs in place in any solar cell dc lumped-parameter equivalent circuit model. the improved solar cell dc lumped-parameter equivalent circuit model shown in fig. 10, as suggested by p. j. roland et al [50], is another step forward in the sequence of models previously proposed in [10, 41, 48]. the series resistor has been restored as an indispensable lumped element needed to describe the ubiquitous parasitic series resistances present in all solar cells. as before, we would write the solution for this improved equivalent circuit’s voltage v as a function of i by adding the voltage drop irs across the now restored series resistor rs and the voltages v1 and v2 across the terminals of each of its two sub-circuits. that means using (19) instead of (20), and obtaining v2 by solving (21) through numerical or approximate means. sub-circuit #2 contains the parallel combination of shunt resistor rp2 and the anti-parallel pair of diodes 2 and 3. p. j.
roland et al [50] used spice simulations of this equivalent circuit model to reproduce i-v plots with s-shaped deformation, in order to study how changing the values of the circuit elements’ parameters influences the shape of the resulting i-v curve.

fig. 10 an improved solar cell dc lumped-parameter equivalent circuit model, proposed by p. j. roland et al [50] as a further improvement to the models proposed by araujo de castro et al [41] and by garcía-sánchez et al [48]. the series resistor has been restored as a necessary element to describe the parasitic series resistance.

a comparison between experimentally measured data of cds/cdte/fes2 nc/au photovoltaic devices at 200k, which exhibit s-shape kinks in the measured i-v characteristics, and their corresponding simulated s-shaped curves was also carried out, trying to correlate the model’s parameters with the physical features that determine current flow through the device.

5. conclusion

we have presented a brief chronological review and appraisal of dc lumped-parameter equivalent circuits that have been proposed to date for modelling the effect of the s-shaped “kink” which shows up in the fourth quadrant, and eventually in the first, of the i-v characteristics measured under illumination of certain types of organic solar cells, as well as of some other types of photovoltaic devices. in doing so, we have analysed the defining mathematical equations of the available equivalent circuits, and we have provided and discussed their possible solutions. critical analyses have been included and some recommendations were offered when relevant.
we hope that the unifying approach and generic nature of this succinct review can provide extra insight and be of practical help to photovoltaic engineers and solar cell scientists who must deal with the important issue of the s-shape i-v curve deformation and its modelling through lumped-parameter equivalent circuits.

acknowledgement: parts of this work were financially sponsored by the madrid autonomous community and urjc, under projects nos. s2009/esp-1781 and urjc-cm-2010-cet-5173, respectively. partial financial assistance was also received through an institutional grant from usb’s decanato de investigación y desarrollo.

references
[1] a. ortiz-conde, f. j. garcía-sánchez, j. muci, a. sucre-gonzález, “a review of diode and solar cell equivalent circuit model lumped parameter extraction,” facta universitatis, series: electronics and energetics, vol. 27, no 1, pp. 57-102, march 2014. [2] r. v. k. chavali, j. v. li, c. battaglia, s. de wolf, j. l. gray, m. a. alam, “a generalized theory explains the anomalous suns–voc response of si heterojunction solar cells,” ieee journal of photovoltaics, vol. 7, no. 1, pp. 169-176, jan. 2017. [3] p. g. kale, c. s. solanki, “silicon quantum dot solar cell using top-down approach,” international nano letters, vol. 5, no. 2, pp. 61-65, jun 2015. [4] s. van reenen, m. kemerink, h. j. snaith, “modeling anomalous hysteresis in perovskite solar cells,” j. of phys. chem. letters, vol. 6, pp. 3808−3814, 2015. [5] f. xu, j. zhu, r. cao, s. ge, w. wang, h. xu, r. xu, y. wu, m. gao, z. ma, f. hong, z. jiang, “elucidating the evolution of the current-voltage characteristics of planar organometal halide perovskite solar cells to an s-shape at low temperature,” solar energy materials & solar cells, vol. 157, pp. 981–988, december 2016. [6] j. liu, g. wang, k. luo, x. he, q. ye, c. liao, j. mei.
“understanding the role of electron transport layer in highly efficient planar perovskite solar cells.” chemphyschem. in press, jan 2017. [7] a. wagenpfahl, d. rauh, m. binder, c. deibel, v. dyakonov, “s-shaped current-voltage characteristics of organic solar devices,” physical review b, vol. 82, no. 115306, september 2010. [8] a. opitz, r. banerjee, s. grob, m. gruber, a. hinderhofer, u. hörmann, j. kraus, t. linderl, c. lorch, a. steindamm, a. k topczak, “charge separation at nanostructured molecular donor–acceptor interfaces,” chapter of elementary processes in organic photovoltaics , volume 272 of the series advances in polymer science, pp. 77-108. springer international pub., 2017. [9] v. h. tran, r. b. ambade, s. b. ambade, s. h. lee, i. h. lee. “low-temperature solution-processed sno2 nanoparticles as cathode buffer layer for inverted organic solar cells.” acs applied materials & interfaces, accepted paper in press, jan 2017. [10] f. araujo de castro, j. heier, f. nüesch, r. hany, “origin of the kink in current-density versus voltage curves and efficiency enhancement of polymer-c60 heterojunction solar cells,” ieee journal of selected topics in quantum electronics, vol. 16, no. 6, pp. 1690 – 1699, nov/dec 2010. [11] b. qi, j. wang, “fill factor in organic solar cells,” phys.chem. chem. phys., vol.15, pp. 8972-8982, 2013. [12] w. shockley, the theory of p-n junctions in semiconductors and p-n junction transistors, bell system technical journal, vol. 28, no. 3, pp. 435−489, july 1949. [13] r. m. corless, g. h. gonnet , d.e.g. hare, d. j. jeffrey, d. e. knuth, “on the lambert w function.” adv. comput. math., vol. 5, no. 1, pp. 329–359, 1996. [14] lambert w-function (4.13), nist digital library of mathematical functions. http://dlmf.nist.gov/4.13 [15] s. r. valluri, r. m. corless and d. j. jeffrey. “some applications of the lambert w function to physics,” canadian j. of physics, vol. 78, no. 9, pp. 823-831, 2000. [16] d. 
veberič, “lambert w function for applications in physics,” computer physics communications, vol. 183, pp. 2622–2628, 2012. [17] t. banwell, a. jayakumar, “exact analytical solution for current flow through diode with series resistance,” electronics letters, vol. 36, no. 4, pp. 291–292, 17 feb. 2000. [18] a. ortiz-conde, f. j. garcía-sánchez, j. muci, “exact analytical solutions of the forward non-ideal diode equation with series and shunt parasitic resistances,” solid-state electronics, vol. 44, no. 10, pp. 1861–1864, october 2000. [19] a. jain, a. kapoor, “exact analytical solutions of the parameters of real solar cells using lambert w-function,” solar energy materials & solar cells, vol. 81, no. 2, pp. 269-277, february 2004. [20] a. jain, a. kapoor, “a new approach to study organic solar cell using lambert w-function,” solar energy materials & solar cells, vol. 86, pp. 197–205, 2005. [21] d. c. lugo-muñoz, m. de souza, m. a. pavanello, d. flandre, j. muci, a. ortiz-conde, f. j. garcía-sánchez, “parameter extraction in quadratic exponential junction model with series resistance using global lateral fitting,” ecs transactions, vol. 31, no. 1, pp. 369-376, 2010. [22] a. ortiz-conde, d. lugo-muñoz and f. j. garcía sánchez, “an explicit multi-exponential model as an alternative to traditional solar cell models with series and shunt resistances,” ieee journal of photovoltaics, vol. 2, no. 3, pp. 261-268, july 2012. [23] t. ma, h. yang, l. lu, “solar photovoltaic system modeling and performance prediction,” renewable and sustainable energy reviews, vol. 36, pp. 304-315, 2014. [24] v. j. chin, z. salam, k. ishaque, “cell modelling and model parameters estimation techniques for photovoltaic simulator application: a review,” applied energy, vol. 154, pp. 500–519, 2015. [25] a. m. humada, m. hojabri, s. mekhilef, h. m.
hamada, “solar cell parameters extraction based on single and double-diode models: a review,” renewable and sustainable energy reviews, vol. 56, pp. 494–509, 2016. [26] a. r. jordehi, parameter estimation of solar photovoltaic (pv) cells: a review,” renewable and sustainable energy reviews, vol. 61, pp. 354–371, 2016. [27] j. w. jin, s. jung, y. bonnassieux, g. horowitz, a. stamateri, c. kapnopoulos, a. laskarakis, s. logothetidis, “universal compact model for organic solar cell,” ieee transactions on electron devices, vol. 63, no. 10, pp. 4053-4059, october 2016. [28] p. kumar, a. gaur, “model for the j-v characteristics of degraded polymer solar cells,” journal of applied physics, vol. 113, no. 094505, 2013. [29] a. gaur, p. kumar, “an improved circuit model for polymer solar cells,” prog. photovolt: res. appl., 2013. [30] a. kumar, s. sista, y. yang, “dipole induced anomalous s-shape i-v curves in polymer solar cells,” journal of applied physics, vol. 105, no. 094512, 2009. [31] j. wagner, m. gruber, a. wilke, y. tanaka, k. topczak, et al, “identification of different origins for sshaped current voltage characteristics in planar heterojunction organic solar cells,” journal of applied physics, vol. 111, no. 054509, march 2012. [32] r. saive, c. mueller, j. schinke, r. lovrincic, w. kowalsky, “understanding s-shaped current-voltage characteristics of organic solar cells: direct measurement of potential distributions by scanning kelvin probe,” applied physics letters, vol. 103, no. 243303, 2013. [33] o. j. sandberg, m. nyman, r. österback, “effect of contacts in organic bulk heterojunction solar cells,” phys. rev. applied, vol. 1, 024003, 27 march 2014. [34] w. tress, o. inganäs, “simple experimental test to distinguish extraction and injection barriers at the electrodes of (organic) solar cells with s-shaped current–voltage characteristics,” solar energy materials & solar cells, vol. 117, pp. 599–603, 2013. [35] m. m. furchi, a. a. zechmeister, f. hoeller, s. 
wachter, a. pospischil, t. mueller, “photovoltaics in van der waals heterostructures,” ieee journal of selected topics in quantum electronics, vol. 23, no. 1, pp. 4100111, jan/feb 2017. [36] l. zuo, j. yao, h. li, h. chen, “assessing the origin of the s-shaped i–v curve in organic solar cells: an improved equivalent circuit model,” solar energy materials & solar cells, vol. 122, pp. 88–93, 2014. [37] a. cheknane, h. s. hilal, f. djeffal, b. benyoucef, j.-p. charles, “an equivalent circuit approach to organic solar cell modeling,” microelectronics journal, vol. 39, pp. 1173–1180, 2008. [38] b. mazhari, “an improved solar cell circuit model for organic solar cells,” solar energy materials & solar cells, vol. 90, no. 7, pp. 1021-1033, may 2006. [39] b. romero, g. del pozo, b. arredondo, “exact analytical solution of a two diode circuit model for organic solar cells showing s-shape using lambert w-functions,” solar energy, vol. 86, pp. 3026–3029, 2012. [40] k, roberts, s. r. valluri. "on calculating the current-voltage characteristic of multi-diode models for organic solar cells." arxiv preprint arxiv:1601.02679 , 2015. [41] f. a. de castro, a. laudani, f. riganti fulginei, a. salvini, “an in-depth analysis of the modelling of organic solar cells using multiple-diode circuits,” solar energy, vol. 135, pp. 590–597, 2016. [42] a. ortiz-conde, y. ma, j. thomson, e. santos, j. j. liou, f. j. garcía-sánchez, m. lei, j. finol, p. layman, “direct extraction of semiconductor diode parameters using lateral optimization method,” solid-state electronics, vol. 43, no. 4, pp. 845–848, 1999. [43] g. del pozo, b. romero, b. arredondo, “evolution with annealing of solar cell parameters modeling the s-shape of the current–voltage characteristic,” solar energy materials & solar cells, vol. 104, pp. 81– 86, 2012. [44] g. del pozo, b. romero, b. 
arredondo, “extraction of circuital parameters of organic solar cells using the exact solution based on lambert w-function,” proceedings of the int. society for optical engineering (spie), brussels, belgium, vol. 8435, organic photonics v, 84351z, june 2012. [45] k. tada, “validation of opposed two-diode equivalent-circuit model for s shaped characteristic in polymer photocell by low-light characterization,” organic electronics, vol. 40, pp. 8-12, 2017. [46] b. romero, g. del pozo, e. destouesse, s. chambon, b. arredondo, “circuital modelling of s-shape removal in the current–voltage characteristic of tiox inverted organic solar cells through white-light soaking,” organic electronics, vol. 15, pp. 3546–3551, 2014. [47] b. romero, g. del pozo, b. arredondo, j. p. reinhardt, m. sessler, and u. würfel, “circuital model validation for s-shaped organic solar cells by means of impedance spectroscopy,” ieee journal of photovoltaics, vol. 5, no. 1, pp. 234-237, january 2015. [48] f. j. garcía-sánchez, d. lugo-muñoz, j. muci, a. ortiz-conde, “lumped parameter modeling of organic solar cells’ s-shaped i-v characteristics,” ieee journal of photovoltaics, vol. 3, no. 1, pp. 330-335, january 2013. [49] a. ortiz-conde, m. estrada, a. cerdeira, f. j. garcía sánchez, g. de mercato, “modeling real junctions by a series combination of two ideal diodes with parallel resistance and its parameter extraction,” solid-state electronics, vol. 45, no. 2, pp. 223-228, 2001. [50] p. j. roland, k. p. bhandari, r. j. ellingson, “electronic circuit model for evaluating s-kink distorted current-voltage curves,” proc. ieee 43rd photovoltaic specialists conf. (pvsc), 2016.

facta universitatis series: electronics and energetics vol. 31, no 1, march 2018, pp.
115-130 https://doi.org/10.2298/fuee1801115b

a neural network approach for the analysis of limit bearing capacity of continuous beams depending on the character of the load

miloš bogdanović 1, žarko petrović 2, bojan milošević 3, marina mijalković 2, leonid stoimenov 1
1 faculty of electronic engineering, university of niš, niš, serbia
2 faculty of civil engineering and architecture, university of niš, niš, serbia
3 college of applied studies in civil engineering and geodesy, university of belgrade, belgrade, serbia

abstract. as a part of civil engineering, limit state analysis is a form of structural analysis whose goal is to develop efficient methods for directly estimating the collapse load of a particular structural model. as a theoretical foundation, limit state analysis uses a set of bound (limit) theorems. limit theorems are based on the law of conservation of energy and are used for a direct definition of the limit state function for failure by plastic collapse or by inadaptation. this study proposes an artificial neural network (ann) model in order to approximate the residual bending moment, limit and the incremental failure force of continuous beams. the neural network structure applied here is a radial-gaussian network architecture (rgin) with a complementary training procedure. this structure is intended to be used for civil engineering purposes. it is demonstrated, on the example of a two-span continuous beam loaded in the middle of the span, that the limit and the incremental failure force can be obtained with sufficient precision using the neural network approach, which is especially suitable for analyses in which some of the model parameters are variable.

key words: continuous beam; incremental force; limit failure force; neural network; radial-gaussian network architecture

1.
introduction

artificial intelligence can be considered as a field of computer science often defined as “science and engineering of making intelligent machines, especially intelligent computer programs” [1]. after 50 years of advancement, the technology of artificial intelligence is applied in numerous fields: expert systems, knowledge based systems, medical diagnosis, remote sensing, intelligent database systems, civil engineering and natural language processing. through years of extensive advancement, delimiting artificial intelligence to a narrower field of research has proven to be capable of offering many significant capabilities and applications [2]. as an example, expert systems are marked as “the technology of knowledge management and decision making for the 21st century” [2]. the broad applicability of artificial intelligence has influenced its usage in the field of civil engineering. it is not unusual for civil engineering researchers to encounter problems influenced by many uncertainties. to resolve some of these problems, civil engineering researchers rely not only on mathematics and mechanics calculations, but also on their experience and practice. however, knowledge and experience do not guarantee that the problem will be solved using traditional procedures. this is the situation where artificial intelligence shows its strength, solving complex problems at the level of experts by imitating experts. this is one of the main reasons why artificial intelligence has broad application prospects in the practice of civil engineering [3].

received april 11, 2017; received in revised form june 26, 2017
corresponding author: miloš bogdanović, faculty of electronic engineering, university of niš, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: milos.bogdanovic@elfak.ni.ac.rs)
in the practice of civil engineering, the broadest interest has been shown in artificial neural networks (ann) [4-8], mainly due to their ability to process external data and information based on past experiences. artificial neural networks represent models of real world problems. anns are capable of mapping a set of given patterns to an associated set of a priori known values. they can be regarded as non-linear operators that transform input patterns into another set of numerical data at the output. output values are usually gathered through repeated observations of a particular phenomenon. the ann is trained with the input data patterns to perform the transformation and to become the numerical model of the observed phenomenon. anns have the ability to learn from examples and to adapt to changing situations. also, they are capable of bidirectional mappings, e.g. mapping from cause to effect for estimation and prediction, and mapping from effect to possible cause [4, 9]. neural networks can be thought of as models which try to imitate some of the learning activities of the human brain, although they are much simpler. while doing so, the internal structure of the neural operator ann remains unaltered for a variety of problems. this facilitates the usage of ann within different applications, including civil engineering applications, because it is enough for an application to have, and suitably interpret, ann input-output data pairs in order to use the ann as the numerical model of the particular phenomenon. limit state analysis of structures is an analytic procedure which determines the maximum load parameter, or load increment parameter, which can be sustained by an elasto-plastic structure. if the structure is exposed to the action of gradually increasing load, at some point the load can surpass a certain critical value, which causes the plastic failure of the structure, after which the structure is not capable of receiving any further increase of load.
this critical state is called the limit state of the structure, and the load that causes it is called the limit load. determination of the bearing capacity of a structure is very valuable, not only as a simple check of beam bearing capacity, but also as a significant basis and factor in the design of structures. the beginning of limit state analysis is related to kazincy [10], who calculated the failure load of a beam fixed at both ends and confirmed the results through experiments. even though the static theorem was first proposed by kist [11] as an intuitive axiom, it is considered that the basic theorem of limit state analysis was first announced by gvozdev [12]. the limit state analysis theorems were independently developed by hill for the rigid perfectly plastic material [13], as well as by drucker, prager and greenberg [14] for the elastic perfectly plastic material. in the meantime, a formal proof of these theorems for beams and frames was derived by horne [15], as well as by greenberg and prager [16]. limit theorems were later applied in the design of civil engineering structures by many authors, among whom the following are prominent: symonds and neal [17]; hodge [18]; baker and heyman [19]; zyczkowski [20]; save [21]. the loading on a structure may vary considerably during its lifetime. for example, apart from dead-loading, a building frame will experience snow loads on the roof and wind loads on each face. the magnitudes of these loads at any particular instant cannot be foreseen, although their characteristic values will be known, so that the sequence of loading is unpredictable. this type of loading is termed variable repeated loading [22]. generally, the designer’s knowledge of the future loadings to which a particular structure will be exposed is usually as follows:
- types of loads, such as live load, wind load, water pressure, snow weight, dead weight, etc., are clearly determined;
- limits of variations of load intensities of particular load types are also known, as they are supplied by the design codes or follow from some technological or service conditions;
- the actual future history of the loads, however, is not given explicitly, as it is impossible to predict it.
if a structure is deformed elastically, then in the presence of variable loads its strength is determined by the fatigue properties of the material; fracture occurs after a large number of cycles. but if the body experiences elastic-plastic deformation, a load less than the limit load can cause the attainment of a critical state after a comparatively small number of cycles [23]. the fact that the collapse loads calculated according to limit state analysis may fail to provide a proper measure of structural safety in the case of variable repeated loads was pointed out for the first time by grüning and bleich in 1932. in 1936 melan presented a general static shakedown theorem and later extended it to the general case of a continuum [24]. it was koiter who formulated a general kinematical shakedown theorem [25]. in recent years, the shakedown analysis of elasto-plastic structures has been increasingly applied in the analysis of engineering problems due to the increased demands of modern technologies. it has thus been successfully applied to many engineering problems, such as the design of nuclear reactors and railways, and civil engineering design and safety assessment of some building structures. the aim of this paper is to propose an ann model, intended for civil engineering purposes, to approximate the residual bending moment, limit and the incremental failure force of continuous beams. the neural network structure used for these purposes is the radial-gaussian network architecture (rgin). the rest of the paper will be organized as follows.
in section 2 we will present an overview of ann implementations in civil engineering along with the basic postulates of limit and shakedown analysis. section 3 will present an analysis of the bearing capacity of continuous beams depending on the load character and the degree of static indeterminacy. section 4 will describe the neural network approach used in this research, along with a detailed description of the neural network structure used for the approximation of the residual bending moment, the limit failure force and the incremental failure force of continuous beams. in section 5, we will present an analysis of the ann generated results. the conclusion is presented in section 6, with an outlook on future work and improvements of the research presented in this paper.
2. related work
2.1. artificial neural networks in civil engineering
it seems almost impossible to review all applications of ann in civil engineering. a vast number of research papers presenting results in this field has been published since the 1980s. therefore, we will present a selected part of the research results, with due respect for all other results not mentioned in this section. one of the earliest applications of ann in mechanics was proposed by ghaboussi, garrett and wu [26]. they used ann to investigate the direct representation of the constitutive behavior of concrete. recently, arangio and beck have developed a strategy for the estimation of the integrity of a long-span suspension bridge under the influence of ambient vibrations [27]. cachim demonstrated the use of artificial neural networks for the calculation of temperatures in timber under fire loading [28]. in this research, a multilayer feed-forward network was used to determine the temperature in the timber as the only output parameter of the neural network. liu et al.
have shown the possibility of using back-propagation neural networks (bpnn) as models for predicting the compressive strength of concrete [29]. an evolutionary fuzzy hybrid neural network (efhnn) was used to enhance the effectiveness of assessing subcontractor performance in the construction industry. this use of efhnn was demonstrated by cheng et al., with the purpose of achieving an optimal mapping between input factors and subcontractor performance output [30]. wang et al. have shown a promising perspective for the use of back-propagation neural networks in the cost estimation of construction engineering [31]. the model they presented is based on a back-propagation ann trained to perform estimations on the basis of a large number of past estimation materials. their test results suggest that the developed ann model successfully extracts the relation between the project's features and the estimate of the fabrication cost. gui et al. [32] presented a survey of structural optimization applications in civil engineering. their aim was to combine different design and development techniques (ann, expert systems, genetic algorithms) for a bridge project so that the structural design of the system can be optimized. parhi and dash have presented an analysis of the dynamic behavior of a beam structure containing multiple transverse cracks [33]. for these purposes, a neural network controller was used. the authors calculated three natural frequencies and compared the results of experimental, theoretical and finite element analysis. the results of the analysis were used to train a feed-forward multilayered neural network controller. after training, this controller was capable of predicting crack locations and depths, and its predictions were validated against an experimental set-up.
alacali, akba, and doran have presented an investigation of the degree of confinement for confined concrete using neural network analysis [34]. they established a neural network algorithm to validate the empirical equations proposed for the confinement coefficient. rahman et al. presented the estimation of local scour depth around bridge piers [35]. the estimation was performed using the multi-layer perceptron ann, ordinary kriging (ok), and inverse distance weighting (idw) models. they evaluated results from the sixth test case. the results indicate that the ann model predicts local scour depth more accurately than the kriging and inverse distance weighting models. in [36], the authors present an artificial neural network based heat convection (ann-hc) algorithm. they used an earth-to-air heat exchanger (etahe) component with the aim of establishing a new thermal modeling method for cooling components. the case study they presented shows the working principles of the proposed algorithm, tested on an etahe and its environment. narasimhan presented a direct adaptive control scheme for the active control of the nonlinear highway bridge benchmark [37]. as a model, the author used a nonlinearly parameterized neural network. the described ann contains a single hidden layer and is coupled with a proportional-derivative type controller to approximate the control force. in this particular case, the ann approximates the nonlinear control law and is not used to model the nonlinearities of the overall system. lee, lin and lu performed an assessment of highway slope failure using neural networks [38].
for these purposes, they used back-propagation neural networks and demonstrated the effectiveness of ann in the evaluation of slope failure potential based on five major factors: the slope gradient angle, the slope height, the cumulative precipitation, the daily rainfall, and the strength of materials. laflamme and connor used self-tuning gaussian networks for the control of civil structures equipped with magnetorheological dampers [39]. the neural network used for these purposes is an adaptive neural network composed of gaussian radial functions. their evaluation results indicate that the neural network is effective for controlling a structure equipped with a magnetorheological damper. bilgil and altun emphasize the importance of predicting the friction coefficient in hydraulic engineering [40]. they therefore propose a method to estimate the friction coefficient by means of ann. the data used for ann training was obtained experimentally, and the network estimates the friction factor in a smooth open channel. the results demonstrate that the ann model is more efficient than the manning approach in the given environment. as stated in [41], the civil engineering research community is still in demand for the next generation of applied anns, which have to be based on sophisticated genetic coding mechanisms in order to develop the required higher-order network structures and utilize development mechanisms observed in nature.
2.2. basic postulates of limit and shakedown analysis
in the range of elastic behavior of beams, the stresses and strains are proportionally dependent. as the load increases, there is a gradual build-up of stress, until the stress in the most loaded fiber reaches the yield stress. a further increase of load causes plastification of the entire cross section, and thus the formation of a plastic hinge [22].
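as a hedged illustration of the two stages described above, the sketch below evaluates the first-yield moment and the full plastic moment of a solid rectangular cross-section. the rectangular shape, the steel-like yield stress and the dimensions are assumptions chosen for illustration only; they are not taken from the beams analysed in this paper.

```python
def yield_moment(sigma_y: float, b: float, h: float) -> float:
    """Moment at which the extreme fibre of a rectangular section first yields."""
    # elastic section modulus of a rectangle: b * h^2 / 6
    return sigma_y * b * h**2 / 6.0


def plastic_moment(sigma_y: float, b: float, h: float) -> float:
    """Moment at which the whole rectangular section has yielded (plastic hinge)."""
    # plastic section modulus of a rectangle: b * h^2 / 4
    return sigma_y * b * h**2 / 4.0


# illustrative values (assumptions, not data from the paper)
sigma_y = 235e6    # yield stress in Pa
b, h = 0.10, 0.20  # section width and depth in m

m_y = yield_moment(sigma_y, b, h)
m_p = plastic_moment(sigma_y, b, h)
shape_factor = m_p / m_y  # reserve between first yield and full plastification

print(f"m_y = {m_y:.0f} N*m, m_p = {m_p:.0f} N*m, m_p / m_y = {shape_factor:.2f}")
```

for a rectangle the ratio m_p / m_y is 1.5 regardless of the dimensions; other cross-section shapes give other ratios.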
it is known that in statically determinate structures the complete plastification of one cross-section of a beam, and the resulting transition of the structure into a failure mechanism, means loss of load bearing capacity. in statically indeterminate structures, the formation of one plastic hinge does not lead to the formation of a failure mechanism, and the bearing capacity of an n times statically indeterminate structure is fully exhausted when n+1 plastic hinges have formed in the structure. if the structure is unloaded prior to the formation of a failure mechanism, a certain residual strain remains, which gives rise to residual bending moments. limit state analysis cannot include the residual bending moments in the calculations; shakedown analysis can. in shakedown analysis all the assumptions of limit state analysis remain valid, and the method additionally makes it possible to analyze the behavior of a structure exposed to repeated load.
fig. 1 moment–curvature relation in shakedown theory
the shakedown theorem can be established for a material having the more general moment-curvature relationship of fig. 1. the basic curve (oab) is assumed to be the same whichever way the bending moment is applied, so that first yield occurs at a moment m_y in either sense, and the full plastic moment has the value m_p, again in either sense. the linear elastic range thus extends for a total of 2m_y. the assumption is made that this range of 2m_y is not affected by any partially plastic deformation that might occur. thus, if a moment corresponding to the value at point b in fig. 1 is applied to the cross-section, followed by unloading, then the behaviour will be linear for a total decrease of moment of 2m_y [19].
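the n+1 rule can be illustrated on a beam fixed at both ends and loaded by a point force at midspan. this is a minimal sketch under assumptions that are not taken from the text above: only bending redundants are counted (n = 2), the hinges form at the two supports and under the load, and the collapse load follows from equating the external work with the work absorbed in the hinges. the function names and numerical values are illustrative.

```python
def hinges_for_collapse(n_redundants: int) -> int:
    """Number of plastic hinges that exhausts the bearing capacity (n + 1 rule)."""
    return n_redundants + 1


def fixed_beam_collapse_load(m_p: float, span: float) -> float:
    """Collapse load of a fixed-ended beam with a central point load.

    Mechanism: hinges at both supports (rotation theta each) and at midspan
    (rotation 2 * theta). Midspan deflection is theta * span / 2.
      external work:  P * theta * span / 2
      hinge work:     m_p * (theta + 2 * theta + theta) = 4 * m_p * theta
    Equating both gives P = 8 * m_p / span.
    """
    return 8.0 * m_p / span


n = 2                                  # bending redundants of a fixed-ended beam
print(hinges_for_collapse(n))          # hinges needed to form the mechanism
print(fixed_beam_collapse_load(100.0, 4.0))  # m_p = 100 kN*m, span = 4 m -> load in kN
```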
if a structure made of an elasto-plastic material is exposed to variable loads, then the following situations are possible [42]:
• if the load intensities remain sufficiently low, the structural response is perfectly elastic;
• if the load intensities become sufficiently high, the instantaneous load-carrying capacity of the structure becomes exhausted; a plastic, unconstrained flow mechanism develops and the structure collapses;
• if the plastic strain increments in each load cycle are of the same sign then, after a sufficient number of cycles, the total strains (and therefore displacements) become so large that the structure departs from its original form and becomes unserviceable. for a sufficiently high load amplitude (although below the load-carrying capacity) the deflection grows in each cycle. this phenomenon is called incremental collapse;
• if the strain increments change sign in every cycle, they tend to cancel each other out and the total deformation remains small (alternating plasticity). in this case, however, after a sufficient number of cycles, the material at the most stressed points begins to break due to low-cycle fatigue;
• it may also occur that, after some plastic deformation in the initial load cycles, the structural behavior eventually becomes elastic, for lower load amplitudes. such stabilization of plastic deformations is called shakedown or adaptation.
2.3. theorems of limit state analysis
the basic theorems of limit state analysis consist of:
• the static theorem, or the theorem of the lower bound of the limit load, and
• the kinematic theorem, or the theorem of the upper bound of the limit load.
the static theorem is based on the static equilibrium of the observed system. for a statically indeterminate system, a large number of distributions of bending moments that satisfy the equilibrium conditions under the given external load can be assumed.
greenberg and prager [16] named such a distribution statically admissible. if the bending moment nowhere exceeds the appropriate limiting value, the distribution is also said to be safe. the static theorem can be expressed in the following way: if there exists any distribution of bending moment throughout the structure which is simultaneously safe and statically admissible under the load λp, then the value λ must be less than or equal to the factor of failure load λ_c (λ ≤ λ_c). the actual limit load (λ_c p = p_p) can be equal to or higher than the given one. the kinematic theorem relates to the possible failure mechanism. the failure mechanism is the kinematically unstable system which a beam becomes after plastic hinges have formed in the cross sections where the conditions for this exist [16]. the factor of failure load λ_c, i.e. the limit load (λ_c p), is determined by equating the work of the external forces with the work absorbed in the plastic hinges for each assumed failure mechanism. the kinematic theorem can be expressed in the following way: for the given static system, subjected to the set of loads λp, the value of λ which corresponds to any assumed failure mechanism must be higher than or equal to the factor of failure load λ_c, that is, λ ≥ λ_c.
2.4. theorems of shakedown analysis
as in limit state analysis, in shakedown analysis there are static and kinematic theorems, on whose basis it is possible to determine the safe limit load depending on the type of variable repeated load. the static shakedown or melan's theorem is as follows: shakedown occurs if it is possible to find a field of fictitious residual stress $\bar{\sigma}_{ij}$, independent of time, such that for any variations of loads within the prescribed limits the sum of this field with the stress field $\sigma^{*}_{ij}$ in a perfectly elastic body is safe (sufficient condition).
shakedown cannot occur if there does not exist any time-independent field of residual stresses $\bar{\sigma}_{ij}$ such that the sum $\bar{\sigma}_{ij} + \sigma^{*}_{ij}$ is admissible (necessary condition) [43]. the kinematic shakedown or koiter's theorem is as follows: shakedown does not occur if it is possible to find an admissible cycle of plastic strain rates and some programme of load variations between prescribed limits for which
$$\int_0^{\tau} dt \int X_{ni}\, v_{i0}\, dS_F > \int_0^{\tau} dt \int A(\xi^{p}_{ij0})\, dV \qquad (1)$$
where $A(\xi^{p}_{ij0}) = \sigma_{ij0}\, \xi^{p}_{ij0}$ is the rate of work of the plastic strain on the admissible rates [43].
3. analysis of the bearing capacity of continuous beams depending on the load character and degree of static indeterminacy
applying the adequate method based on upper and lower limit and shakedown analysis [44], and depending on the character of the load, an analysis of the limit load of the continuous beam displayed in fig. 2 was conducted. the span of the beams affects the distribution of internal forces, and therefore the relevant condition of failure, that is, the value of the failure force. for this continuous beam, the failure force calculation procedure has been previously conducted and presented in [44], depending on the change of beam span, which is defined by the coefficients α and β, as well as on the moment of plasticity m_p.
fig. 2 continuous two–span beam loaded by concentrated forces in the middle of the span
the limit force of failure in one-parameter form and the incremental failure force, depending on the change of span length, are shown in fig. 3, and the residual bending moment is shown in fig. 4.
fig. 3 a) change of the limit failure force depending on α and β b) change of incremental failure force depending on α and β
fig. 4 change of residual bending moment depending on α and β
4. the neural network approach
the most obvious reason why ann models are gaining popularity is their ability to adapt to changing situations and to learn from a predefined set of examples. these characteristics facilitate the development of a model of the observed phenomenon in situations where there is limited theory describing the cause-effect relationship between the beam configuration and its performance (such as the failure force, the incremental failure force and the residual bending moment). also, neural networks are capable of fast processing, which makes them particularly relevant to frame performance analysis, because this type of analysis tends to be highly computationally intensive. when a neural network is applied to the analysis of a particular phenomenon, an issue that needs special attention is the choice of the neural network architecture and training method. there are several alternative approaches to choose from, such as those described in [45-47]. for the analysis presented in this paper, we have adopted the rgin architecture and its complementary training procedure [47]. the rgin architecture was chosen mainly because of its performance characteristics. an rgin system is capable of producing very precise models of a function [45, 46]. it circumvents the issue of how many hidden nodes to incorporate in a network and provides the possibility of rapid training, including the use of large training sets (containing thousands of training data sets [48]). finally, if the training set is validated (i.e. there is a confirmation that the training set does not contain ambiguous data), then there is a guarantee that the rgin system will converge on a solution during the training process. radial-gaussian networks represent a specialized form of the radial basis function (rbf) networks [49]. as illustrated in figure 5, rgins comprise three layers of neurons connected in a feed-forward manner.
these networks perform a mapping from a vector of inputs to a vector of outputs. the vector of inputs represents the observed phenomenon, i.e. the problem that should be solved, while the vector of outputs represents the solution of the problem provided by the neural network. in this study, the input vector represents the frame configuration, while the output vector is an estimate of the performance of the frame, namely its limit failure force, incremental failure force and residual bending moment. the data is passed through the rgin network in the forward direction only. as the data flows forward through the network, simple processing is applied to it, both within the neurons and along the links connecting the neurons. equation 2 summarizes the functioning of this type of network:
$$O_k = \sum_{j=1}^{J} a_{k,j} \exp\left[ -s_j \sum_{i=1}^{I} \left( \alpha_i I_i - o_{i,j} \right)^2 \right] \qquad (2)$$
where the value of the ith element of the input vector is represented by $I_i$; $o_{i,j}$ represents an offset on the connection between the ith input neuron and the jth hidden neuron; $\alpha_i$ is a normalizing term at the ith input neuron; $a_{k,j}$ is an amplitude term connecting the jth hidden neuron and the kth output neuron; $s_j$ is a spread term coupled with the jth hidden neuron; and $O_k$ is the output value from the kth neuron in the output layer.
fig. 5 rgin neural network
the process of training an rgin network is used to configure a suitable set of values for the network parameters so that the ann may perform the required function. a more detailed description of rgin networks and their method of training is provided in [50]. the training scheme used for the rgin training process is supervised. the network is provided with a data set of training sample problems and corresponding answers. this data set represents a mapping between input and output vectors.
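the forward pass of such a network can be sketched in a few lines. this is a minimal sketch, assuming that the output is a sum of radial-gaussian terms of the form a_{k,j} * exp(-s_j * sum_i (α_i * I_i - o_{i,j})^2); the exact placement of the normalizing and spread terms is an assumption here, and the function name, argument layout and numerical values are hypothetical, not the authors' implementation.

```python
import math


def rgin_forward(inputs, alpha, offsets, spreads, amplitudes):
    """Forward pass of a radial-Gaussian network (assumed functional form).

    inputs[i]        value of the i-th input
    alpha[i]         normalizing term at the i-th input neuron
    offsets[i][j]    offset on the link from input i to hidden neuron j
    spreads[j]       spread term of hidden neuron j
    amplitudes[k][j] amplitude on the link from hidden neuron j to output k
    """
    hidden = []
    for j, s_j in enumerate(spreads):
        # squared distance of the normalized input from the centre of neuron j
        dist2 = sum(
            (alpha[i] * inputs[i] - offsets[i][j]) ** 2
            for i in range(len(inputs))
        )
        hidden.append(math.exp(-s_j * dist2))
    # each output is an amplitude-weighted sum of the hidden activations
    return [sum(a * h for a, h in zip(row, hidden)) for row in amplitudes]


# tiny illustrative network: 2 inputs, 2 hidden neurons, 1 output
out = rgin_forward(
    inputs=[1.0, 2.0],
    alpha=[1.0, 0.5],
    offsets=[[1.0, 0.0], [1.0, 0.0]],
    spreads=[1.0, 0.5],
    amplitudes=[[2.0, 1.0]],
)
print(out)
```

in this toy case the first hidden neuron is centred exactly on the normalized input, so it contributes its full amplitude, while the second contributes a decayed amount.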
the network is expected to learn from these correspondences to within a predefined error tolerance. once the training process is finished, the performance of the network should be tested against a data set comprising problem samples not used in the training process. the error is calculated as the sum of the absolute differences between the actual output given by the ann and the expected output, calculated over all training data examples. while training an rgin, hidden neurons are added (one at a time) to the middle layer of the network. the ann can be thought of as implementing an overall function to which every hidden neuron adds a radial-gaussian function. a correction term is generated by the radial-gaussian function of each hidden neuron in the middle layer. as the training process progresses, each of the three radial-gaussian function parameters, o_{i,j}, a_{k,j} and s_j, is adjusted.
5. discussion of the results
as previously stated, the ann used in this research is an rgin with its complementary training procedure [47]. the input layer consists of 4 neurons representing the structure of the two-span continuous beam loaded in the middle of each span, for which the residual bending moment, the limit failure force and the incremental failure force are obtained using the developed ann. in particular, these 4 neurons represent the following beam configuration parameters: l, α, β and m_p. the input layer transmits information from the outside into the hidden layer, and the process continues until the output layer is reached. the structure of the hidden layer is a result of the training process, as described in [47]. in our approach, it consists of 8 neurons. the output layer consists of 3 neurons representing the limit failure force, the incremental failure force and the residual bending moment.
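the error measure described above, the sum of absolute differences between the ann outputs and the expected outputs over all examples, can be sketched as follows. this is an illustrative sketch only: the function names, the tolerance value and the sample numbers are assumptions, and the three columns merely mirror the three output neurons.

```python
def total_absolute_error(predicted, expected):
    """Sum of |actual - expected| over all examples and all output neurons."""
    return sum(
        abs(p - e)
        for p_row, e_row in zip(predicted, expected)
        for p, e in zip(p_row, e_row)
    )


def training_converged(predicted, expected, tolerance):
    """True when the accumulated error is within the predefined tolerance."""
    return total_absolute_error(predicted, expected) <= tolerance


# two examples with three outputs each (made-up numbers)
predicted = [[1.0, 2.0, 4.0], [3.0, 5.0, 6.0]]
expected = [[1.5, 2.0, 4.0], [2.0, 5.5, 6.0]]

print(total_absolute_error(predicted, expected))
print(training_converged(predicted, expected, tolerance=1.0))
```

a training loop that adds hidden neurons one at a time would evaluate such a measure after each addition and stop once it falls within the tolerance.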
for training and validation purposes, two sets with the same number of samples were generated (a training set and a validation set), each containing 233 samples. both sets were populated with data calculated according to the postulates of limit and shakedown analysis. as input values, in both sets parameter l is an integer ranging from 1 to 10, parameter α is a floating-point number ranging from 1 to 10, parameter β is a floating-point number ranging from 1 to 10 and parameter m_p is a floating-point value ranging from 1.5 to 75. after the training process, the ann outputs were compared to the validation set (expected output calculated according to the postulates of limit and shakedown analysis), and the results of the validation process for the incremental failure force, the residual bending moment and the limit failure force are displayed in figs. 6, 7 and 8, respectively. in each figure, the curves that describe the change of the expected (theoretical) values are shown in blue, while the output generated by the ann is shown in red. fig. 6 compares the expected (theoretical) values with the values obtained using the ann in the case of the incremental failure force. as shown in fig. 6, the developed ann shows a high level of precision in the output results. for a majority of the samples that were subjected to validation (in particular, 68% of validation samples), the mean absolute percentage error was less than 8%. the largest discrepancy in the results was detected for a subset of samples in which the value of the input parameter α grows steeply. in these cases, the precision of the ann is expected to improve by increasing the number of training samples within this subset and using them for additional ann training.
fig. 6 comparison of theoretical and rgin generated results – incremental failure force
fig. 7 comparison of theoretical and rgin generated results – residual bending moment
the results obtained for the validation of the residual bending moment values are shown in fig. 7. as shown in fig. 7, in this case the ann shows the largest discrepancy with the theoretical results. to make the network fully able to determine the residual bending moment, the training set should be expanded along with an increase in the number of training epochs. despite the increased discrepancy, for 59% of the validation samples the mean absolute percentage error was less than 15%, which indicates a real opportunity to improve network quality through the proposed changes. fig. 8 compares the expected (theoretical) values with the values obtained using the ann in the case of the limit failure force. as shown in fig. 8, the developed ann shows a high level of precision in the output results. for a majority of the samples that were subjected to validation (in particular, 68% of validation samples), the mean absolute percentage error was less than 10%. the largest discrepancy in the results was again detected for a subset of samples in which the value of the input parameter α grows steeply. in these cases, the precision of the ann is expected to improve by adding training samples to the ann training set in order to better cover the critical parts of the input space.
fig. 8 comparison of theoretical and rgin generated results – limit failure force
6. conclusion
modern technologies applied in civil engineering are in constant demand for new solutions during the analysis of different problems.
one of the widely accepted approaches used in civil engineering is the ann approach, because of its ability to adapt to changing situations and to learn from examples. since the shakedown analysis of elasto-plastic structures has been increasingly applied in the analysis of engineering problems, the potential use of the ann approach in these analyses deserves significant attention. in this paper, we have proposed an ann model that approximates the residual bending moment, the limit failure force and the incremental failure force of continuous beams. the ann model we have developed, trained and evaluated was designed as a radial-gaussian network architecture. our analysis of the developed ann model, on the example of the two-span continuous beam loaded in the middle of the span, indicates that the residual bending moment, the limit failure force and the incremental failure force can be obtained using the proposed neural network approach with sufficient precision. the proposed ann model was trained with 233 training samples calculated on the basis of the theorems of limit and shakedown analysis. the testing process resulted in a mean absolute percentage error of less than 15%, which indicates that the ann used during the test is a feasible tool for the approximation of the residual bending moment, the limit failure force and the incremental failure force of continuous beams. the presented ann model still exhibits some weaknesses that could be eliminated with additional effort. in the case of the approximation of the residual bending moment, the accuracy of the ann model output can be improved by introducing additional training samples in the critical zones that exhibit the highest mean absolute percentage error. a training set modification performed in this manner would positively affect all aspects of the ann, so that the other ann outputs would also become more precise. also, as part of future research and development, the ann structure will be modified by increasing the number of neurons in the hidden layer.
we expect this change to decrease the mean absolute percentage errors across the whole validation set. another possibility which will be considered in the future is splitting the existing ann into three anns, one for each of the estimated quantities (the residual bending moment, the limit failure force and the incremental failure force). it is our aim to use different ann structures for the different quantities in order to achieve better results.
references
[1] j. mccarthy, "what is artificial intelligence?", computer science department of stanford university, california, united states of america, 2007, available from: http://www-formal.stanford.edu/jmc/whatisai/.
[2] c.t. leondes, expert systems, volume i. academic press, san diego, california 92101-4495, usa, 2002.
[3] p. lu, s. chen and y. zheng, "artificial intelligence in civil engineering", mathematical problems in engineering, vol. 2012, article id 145974, pp. 22, 2012.
[4] m.y. rafiq, g. bugmann and d.j. easterbrook, "neural network design for engineering applications", computers & structures, vol. 79, issue 17, pp. 1541-1552, 2001.
[5] z. waszczyszyn and l. ziemianski, "neural networks in mechanics of structures and materials-new results and prospects of applications", computers & structures, vol. 79, issue 22, pp. 2261-2276, 2001.
[6] n. ahmadi, r. kamyab moghadas and a. lavaei, "dynamic analysis of structures using neural networks", american journal of applied sciences, vol. 5.9, pp. 1251-1256, 2008.
[7] i. lou and y. zhao, "sludge bulking prediction using principle component regression and artificial neural network", mathematical problems in engineering, article id 237693, pp. 17, 2012.
[8] e. bojórquez, j. bojórquez, s.e. ruiz and a. reyes-salazar, "prediction of inelastic response spectra using artificial neural networks", mathematical problems in engineering, article id 937480, 2012.
[9] j.h.
garrett, "where and why artificial neural networks are applicable in civil engineering", journal of computing in civil engineering, pp. 129-130, 1994.
[10] g.v. kazinczy, kiserletek befalazott tartokkal. betonszemle, 2, 1914.
[11] n.c. kist, leidt een sterkteberekening, die uitgaat van de evenredigheid van kracht en vormverandering, tot een goede constructie van ijzeren bruggen en gebouwen. inaugural dissertation, polytechnic institute, delft, 1917.
[12] a.a. gvozdev, "the determination of the value of the collapse load for statically indeterminate systems undergoing plastic deformation", in proceedings of the conference on plastic deformations / akademiia nauk s.s.s.r., moscow-leningrad, 1938, 19, (tr., r.m. haythornthwaite, int. j. mech. sci., 1, (1960), 332).
[13] r. hill, "on the state of stress in a plastic rigid body at the yield point", philosophical magazine, vol. 42, pp. 868-875, 1951.
[14] d.c. drucker, w. prager and h.j. greenberg, "extended limit design theorems for continuous media", quarterly of applied mathematics, vol. 9, pp. 381-392, 1952.
[15] m.r. horne, "fundamental propositions in the plastic theory of structures", journal of the institution of civil engineers, vol. 34, pp. 174-177, 1950.
[16] h.j. greenberg and w. prager, "on limit design of beams and frames", transactions of the american society of civil engineers, (first published as tech. rep. a18-1, brown univ., 1949), pp. 117-447, 1952.
[17] b.g. neal and p.s. symonds, "the calculation of collapse loads for framed structures", journal of the institution of civil engineers, vol. 35, pp. 21-40, 1950-51.
[18] p.g. hodge, plastic analysis of structures, new york: mcgraw-hill, 1959.
[19] j. baker and j. heyman, plastic design of frames. vol 1. fundamentals, london: cambridge university press, 1969.
[20] m. zyczkowski, combined loadings in the theory of plasticity, springer netherlands, 1981.
[21] m. save, atlas of limit loads of metal plates shells and disks. elsevier science bv, 1995.
[22] b.g.
neal, the plastic methods of structural analysis. london: chapman and hall, 1977. [23] m. jirásek and z.p. baţant, inelastic analysis of structures. john wiley & sons, 2002. [24] e. melan, "zur plastizitat des raumlichen continuum", ing. arch. vol. 9, pp.116–126, 1938. [25] w.t. koiter, general theorems for elastic–plastic solids. amsterdam: north-holland, 1960. pp. 165– 221. [26] j. ghaboussi, j.h. garrett and x. wu, "knowledge-based modeling of material behavior with neural networks", journal of engineering mechanics, vol. 117, pp.132–151, 1991. [27] s. arangio and j. beck, "bayesian neural networks for bridge integrity assessment", structural control & health monitoring, vol. 19, no. 1, pp. 3–21, 2012. [28] p.b. cachim, "using artificial neural networks for calculation of temperatures in timber under fire loading", construction and building materials, vol. 25, no. 11, pp. 4175–4180, 2011. [29] j. liu, h. li and c. he, "concrete compressive strength prediction using rebound method with artificial neural network", advanced materials research, vols. 443-444, pp. 34-39, 2012. [30] m.y. cheng, h.c. tsai and e. sudjono, "evaluating subcontractor performance using evolutionary fuzzy hybrid neural network", international journal of project management, vol. 29, no. 3, pp. 349– 356, 2011. [31] x.z. wang, x.c. duan and j.y. liu, "application of neural network in the cost estimation of highway engineering", journal of computers, vol. 5, no. 11, pp. 1762–1766, 2010. [32] x. gui, x. zheng, j. song and x. peng, "automation bridge design and structural optimization", applied mechanics and materials, vol. 63-64, pp. 457–460, 2011. [33] d.r. parhi and a.k. dash, "application of neural networks and finite elements for condition monitoring of structures", journal of mechanical engineering science, vol. 225, no. 6, pp. 1329–1339, 2011. [34] s.n. alacali, b. akba and b. 
doran, "prediction of lateral confinement coefficient in reinforced concrete columns using neural network simulation", applied soft computing journal, vol. 11, no. 2, pp. 2645– 2655, 2011. [35] h. rahman, k. alireza and g. reza, "application of artificial neural network, kriging, and inverse distance weighting models for estimation of scour depth around bridge pier with bed sill", journal of software engineering and applications, vol. 3, no. 10, 2010. 130 m. bogdanović, ţ. petrović, b. milošević, m. mijalković, l. stoimenov [36] j. zhang and f. haghighat, "development of artificial neural network based heat convection algorithm for thermal simulation of large rectangular cross-sectional area earth-to-air heat exchangers", energy and buildings, vol. 42, no. 4, pp. 435–440, 2010. [37] s. narasimhan, "robust direct adaptive controller for the nonlinear highway bridge benchmark", structural control health monitoring, vol. 16, pp. 599–612, 2009. [38] t.l. lee, h.m. lin and y.p. lu, "assessment of highway slope failure using neural networks", journal of zhejiang university: science a, vol. 10, no. 1, pp. 101–108, 2009. [39] s. laflamme and j.j. connor, "application of self-tuning gaussian networks for control of civil structures equipped with magnetorheological dampers", in proceedings of the spie, the international society for optical engineering, march 2009. [40] a. bilgil and h. altun, "investigation of flow resistance in smooth open channels using artificial neural networks", flow measurement and instrumentation, vol. 19, no. 6, pp. 404–408, 2008. [41] i. flood, "towards the next generation of artificial neural networks for civil engineering", advanced engineering informatics, vol. 22, no. 1, pp. 4–14, 2008. [42] a. j. konig, shakedown of elastic-plastic structures, north holland, 1987. [43] l.m. kachanov, foundations of the theory of plasticity. amsterdam-london: north-holland publishing company, 1971. [44] b. milošević, m. mijalković, ţ. petrović and m. 
hadţimujović, "the application of the limit analysis theorem and the adaptation theorem for determining the failure load of continuous beams", scientific tehnical review, vol 60, no 3-4, pp. 82-92, 2010. [45] n. kartam, i. flood and j. h. garrett, artificial neural networks for civil engineering: fundamentals and applications. new york: american society of civil engineering, 1997, pp. 19-43. [46] r. hecht-nielsen, neurocomputing. new york : addison-wesley, 1990. [47] c.r. alavala, fuzzy logic and neural networks: basic concepts & applications, new age international pvt ltd publishers, december 2008. [48] n. gagarin, i. flood and p. albrech, "computing truck attributes with artificial neural network", journal of computing in civil engineering, vol. 8(2), pp.179–200, 1994 [49] s. chen, c.f.n. cowan and p.m. grant, "orthogonal least squares learning algorithm for radial basis function networks", ieee transactions on neural networks, vol. 2, no. 2, 1991. [50] i. flood, "a gaussian-based feedforward network architecture and complementary training algorithm", in proceedings of international joint conference on neural networks, ieee and inns / singapore, 1991, pp. 171-176. facta universitatis series: electronics and energetics vol. 32, n o 4, december 2019, pp. 581-600 https://doi.org/10.2298/fuee1904581g © 2019 by university of niš, serbia | creative commons license: cc by-nc-nd the usage of lambert w function for identification and speed control of a dc motor * radmila gerov, zoran jovanović university of niš, faculty of electronic engineering, department of control systems, niš, republic of serbia abstract. the paper proposes a new method of identifying the linear model of a dc motor. the parameter estimation is based on the closed-loop step response of the dc motor under a proportional controller. for the application of the method, a deliberate delay of the measured speed was introduced. 
the paper considers the speed regulation of the direct current motor with negligible inductance by applying 1-dof and 2-dof proportional integral retarded controllers. the proportional and integral gains of the pi retarded controllers were obtained by applying a pole placement method to the identified model. the lambert w function was applied in the identification and in the controller design in order to find the rightmost poles of the closed loop as well as the boundary conditions for selecting the gains of the pi controller. the robustness of the calculated controllers was considered under the effect of a disturbance, uncertainty in each of the dc motor parameters, as well as perturbations in the time delay.

key words: dc motor, identification, pi controller, lambert w function, time delay.

1. introduction

dc motors have a vast range of uses: from children's toys, through household appliances and computer equipment, to the automobile industry, for example [2]. it is well known that the dynamic behavior of the direct current (dc) motor can be approximated by a linear model and that there are multiple ways of controlling it [3], such as robust speed control [4], predictive control [5], optimal control [6], the application of the integral retarded algorithm [7], etc. a proportional integral (pi) system for controlling the speed of a dc motor over tcp/ip and an ethernet network is presented and analyzed in [8], from which it can be seen that delay affects the stability of the system.

received march 10, 2019; received in revised form june 18, 2019
corresponding author: radmila gerov, university of niš, faculty of electronic engineering, department of control systems, aleksandra medvedeva 14, 18000 niš, republic of serbia (e-mail: gerov@ptt.rs)
* an earlier version of this paper was presented at the 13th international conference on applied electromagnetics (пес 2017), august 31 - september 01, 2017, in niš, serbia [1].

for the purpose of designing a controller, it is necessary that the dc motor model represents the dynamics of a real motor as accurately as possible. for this reason, the motor parameters specified by the manufacturer are verified by means of various identification methods and, depending on the identified model, a controller of the desired type is then designed. there exist various procedures for the identification of dc motor parameters, among which are: parameter estimation of linear models using step signals [9], a method for closed-loop identification of position-controlled dc servomechanisms [10], parameter identification of a dc motor model using recursive least squares [11], parameter estimation using reduced-order recursive least squares identification and a discrete-time disturbance observer [12], and a linear elman neural network based genetic algorithm [13].

this paper suggests a new method of estimating the parameters of the linear dc motor model. the method is based on the closed-loop step response of a dc motor with a proportional controller, with a known delay in the velocity measurement. the mean absolute error (mae) and root mean squared error (rmse) indices are used to validate the obtained model. the paper explores methods of tuning the proportional-integral controller for speed regulation of the dc motor with negligible inductance. the synthesis of the pi controller was performed by using a pole placement method with the features of the lambert w function [14]. using the analytical solution form in terms of the matrix lambert w function (lwf) [15]-[19] and the limitations of the matrix lambert w function provided in [20], the characteristic equation of the system was solved for the desired rightmost poles of the closed-loop system and the pi controller parameters were set.
the proposed method of controlling the speed of the dc motor was developed for one degree of freedom (1-dof) and two degrees of freedom (2-dof) pi controllers. this parametric pi controller can be used as a proportional-derivative (pd) controller when controlling the position of the dc motor shaft. the paper considers two controller synthesis methods. with the first method, the possibility of a time delay in the feedback branch [1] is taken into account while designing the pi controller. because the proposed procedure takes the possible time delay into account, the resulting characteristic equation of the system is transcendental and has infinitely many solutions. the equation can also be solved by means of the lambert w function [1], [21]-[22]. apart from the above-mentioned methods, the literature also recognizes other methods of pi controller tuning as well as stability analyses of dc motor control systems. for example: a nonsmooth optimization based stabilization method has been used to design pv and pi controllers for a dc motor system with a pointwise time delay in the feedback loop in [23]; a theoretical method has been used in [24] to compute the delay margin for stability for various values of the pi controller gains for a dc motor speed control system that contains time delays in the feedback and feedforward parts; and the possibility of designing pi controllers to optimize given performance criteria is given in [25], together with a characterization of the complete set of stabilizing pi controllers for a foptd system. it is well known that a time delay can make a feedback system unstable, and also that introducing a delay for control purposes may stabilize an unstable system [26].
the delay added to the control law can be treated as an additional control parameter, as mentioned in [27], where the delay was added to stabilize an open-loop unstable system. the second method was designed in accordance with this idea and is based on the deliberate introduction of a time delay (h) into the system for control purposes (pi retarded controller). the processes of designing both controllers are identical. there exist several methods for determining the stabilizing set of (kp, ki) values, including the extension of the hermite-biehler theorem to quasipolynomials, which is based on the work of pontryagin, given in [25]. in this paper, the boundary conditions for selecting the pi controller parameters are determined using the lambert w function. the identification of a linear dc motor model by the suggested algorithm is shown on the example of the dc motor with the parameters given in [13]. this linear model was used to design the controller. the control laws are validated by simulation for the dc motor with the parameters given in [13]. the boundary conditions for selecting the desired poles of the closed-loop system, the boundary values of the pi controller gain, and the parameters necessary for determining them are confirmed as well. the robustness of the obtained controllers was examined with respect to a disturbance, uncertainty in each of the dc motor parameters, and perturbations in the time delay.

the paper is organized in the following way: chapter 2 presents the basic properties of the lambert w function. chapter 3 presents the linear model of a dc motor. the identification of a linear dc motor model is presented in chapter 4. the design of 1-dof and 2-dof pi retarded controllers for speed regulation of a dc motor is explained in chapter 5.
boundary conditions for selecting the desired poles of the closed-loop system, as well as boundary values of the pi controller gain, are presented in the same chapter. simulation examples of the identification and of the controller design, as well as the robustness analysis, are provided in chapter 6. chapter 7 provides the conclusion with suggestions for further analysis.

2. lambert w function

introduced in the eighteenth century by lambert and euler [14], the lambert w function w(z) is a solution of the following equation

$$w(z)\,e^{w(z)} = z, \quad z \in \mathbb{C}. \tag{1}$$

the lambert w function is complex-valued, with a complex argument. since z belongs to the set of complex numbers c, equation (1) has an infinite number of solutions, and the lambert w function has an infinite number of branches wk(z), where k ∈ (−∞, ∞). if z belongs to the set of real numbers r, only the principal branch w0(z), for k = 0, and the branch w−1(z), for k = −1, assume real values (fig. 1). the principal branch w0(z) is analytic at zero, which follows from lagrange's inversion theorem; the theorem provides the series expansion with the radius of convergence $e^{-1}$:

$$w_0(z) = \sum_{n=1}^{\infty} \frac{(-n)^{n-1}}{n!}\, z^n. \tag{2}$$

fig. 1 the graph of wk(z), for z ∈ r and k ∈ {−1, 0}.

lambert w functions have been implemented in various commercial software packages, such as maple and matlab. a more detailed explanation can be found in [14].

3. linear model of a dc motor

with respect to armature current and shaft speed, the dynamics of a dc motor can be described by the electrical and mechanical first-order differential equations

$$l\,\frac{di(t)}{dt} = \nu(t) - r\,i(t) - k_e\,\omega(t), \qquad j\,\frac{d\omega(t)}{dt} = k_m\,i(t) - \beta\,\omega(t), \tag{3}$$

where: l – the armature inductance, j – the moment of inertia of the moving parts, r – the resistance of the armature winding, β – the damping coefficient due to viscous friction, ke – the back emf constant, km – the motor torque constant, ν(t) – the input voltage, i(t) – the armature current and ω(t) – the angular velocity of the rotor. it is well known that the linear model of the direct current motor with negligible inductance [28] can be described by the first-order differential equation of an inertial system (4), where tm and ksm are the motor time constant and the motor gain, respectively, obtained from the manufacturer and checked by the process of identification.

$$t_m\,\frac{d\omega(t)}{dt} + \omega(t) = k_{sm}\,\nu(t) \tag{4}$$

the motor time constant tm and the motor gain ksm have the following values

$$t_m = \frac{r\,j}{r\beta + k_e k_m}, \qquad k_{sm} = \frac{k_m}{r\beta + k_e k_m}. \tag{5}$$

if the dimensionless control signal is the scaled input voltage u(t) = ν(t) / νmax, the admissible controls must satisfy the inequality |u(t)| < 1. with ks = ksm νmax and ts = tm, the velocity (speed) transfer function gmv(s) and the angular (position) transfer function gma(s) of the dc motor, obtained from (4), are respectively

$$g_{mv}(s) = \frac{\omega(s)}{u(s)} = \frac{k_s}{t_s s + 1}, \qquad g_{ma}(s) = \frac{\alpha(s)}{u(s)} = \frac{k_s}{s\,(t_s s + 1)}, \tag{6}$$

where the angle α represents the position of the motor shaft.

4. proposed method for parameter estimation of the linear dc motor model

the control system for the suggested parameter estimation approach of the linear dc motor model with negligible inductance is shown in fig. 2, where gmv(s) is the velocity transfer function of the dc motor (6) with the parameters that need to be estimated, gc(s) = kp is the transfer function of the p controller, ktg is the tachogenerator constant, ω(t) is the controlled speed, ωr(t) = uref(t) / ktg is the desired rotational speed, e(t) is the tracking error and h is the known time delay.

fig. 2 control system of a dc motor with a retarded controller.

the controller gain needs to be selected in such a way that the closed-loop system with the transfer function given in (7) is underdamped.

$$w(s) = \frac{\omega(s)}{\omega_r(s)} = \frac{k_p k_s k_{tg}}{t_s s + 1 + k_p k_s k_{tg}\,e^{-hs}}. \tag{7}$$

if the time delay in the denominator of (7) is approximated by a padé approximant, the output of the closed-loop system can be written in the following form

$$\omega(s) = \frac{k\,(\tau_0 s + 1)}{s^2 + 2\xi\omega_n s + \omega_n^2}\;\omega_r(s). \tag{8}$$

thus, the observed system can be considered as a second order plus time delay system with a dynamic numerator, where ωn is the natural frequency, ξ is the damping ratio, ωd is the damped frequency, τ0 is the time constant defining the zero of the closed-loop transfer function and k is the gain. a typical closed-loop step response of the system is shown in fig. 3, where ωss is the steady-state value, t1 is the time required for the output to reach its first maximum, ω1 is the first maximum value of the output, t2 is the time required for the output to reach its first minimum and ω2 is the first minimum value of the output.

fig. 3 closed-loop step response of a dc motor with a proportional retarded controller.

the gain of the velocity transfer function gmv(s) can be found from

$$k_s = \frac{\omega_{ss}}{k_p k_{tg}\,(\omega_r - \omega_{ss})}. \tag{9}$$

the overshoot (os) can be calculated approximately in the following way

$$os = \frac{\omega_{ss} - \omega_2}{\omega_1 - \omega_{ss}} \approx e^{-\frac{\xi\pi}{\sqrt{1-\xi^2}}}. \tag{10}$$

the damping ratio ξ obtained from (10) is

$$\xi = \frac{-\ln(os)}{\sqrt{\pi^2 + \ln^2(os)}}. \tag{11}$$

the damped frequency can be calculated from

$$\omega_d = \frac{\pi}{t_2 - t_1}, \tag{12}$$

therefore the natural frequency is

$$\omega_n = \frac{\omega_d}{\sqrt{1-\xi^2}}. \tag{13}$$

the characteristic equation of the system described by equation (8) has complex conjugate poles whose values are

$$s_{1/2} = -\xi\omega_n \pm j\omega_d. \tag{14}$$

the characteristic equation of the closed-loop transfer function of a dc motor stabilized by the retarded p controller (7),

$$t_s s + 1 + k_p k_s k_{tg}\,e^{-hs} = 0, \tag{15}$$

has an infinite number of solutions

$$s_k = \frac{1}{h}\,w_k\!\left(-\frac{k_p k_s k_{tg}}{t_s}\,h\,e^{\frac{h}{t_s}}\right) - \frac{1}{t_s}, \tag{16}$$

where k is the index of the lambert w function branch. considering that (8) is an approximation of the closed-loop transfer function (7), the poles from (14) are also the rightmost poles of (15). these poles can be calculated from (16) by using the principal branch w0(z) and the branch w−1(z). this means that the solutions (16) take the form

$$-\xi\omega_n + j\omega_d = \frac{1}{h}\,w_0\!\left(-\frac{k_p k_s k_{tg}}{t_s}\,h\,e^{\frac{h}{t_s}}\right) - \frac{1}{t_s}, \qquad -\xi\omega_n - j\omega_d = \frac{1}{h}\,w_{-1}\!\left(-\frac{k_p k_s k_{tg}}{t_s}\,h\,e^{\frac{h}{t_s}}\right) - \frac{1}{t_s}. \tag{17}$$

for the known poles (14), the unknown dc motor time constant ts and the time delay h are obtained by solving the system of two equations (17), whereby all the parameters of the linear model of the dc motor with the transfer function gmv(s) have been estimated. the mae and rmse indices are used to validate the obtained model, where ω is the real angular velocity of the rotor and ωm is the angular velocity produced by the identified model. mae and rmse values of 0 indicate a perfect model.

$$mae = \frac{1}{n}\sum_{i=0}^{n} \left|\omega - \omega_m\right|, \qquad rmse = \sqrt{\frac{1}{n}\sum_{i=0}^{n} (\omega - \omega_m)^2}. \tag{18}$$

5. pi retarded controller design for dc motor speed regulation

5.1. 1-dof pi retarded controller tuning

the transfer function of the pi controller, where kp and ki are the gains of the proportional and the integral part of the controller, is

$$g_{pi}(s) = k_p + \frac{k_i}{s}. \tag{19}$$

the control system for speed regulation of a dc motor with a pi retarded controller is shown in fig. 2, where gmv(s) is the identified velocity transfer function of the dc motor and gc(s) = gpi(s) is the transfer function of the pi controller. in the time domain, the control law can be represented by

$$u(t) = k_p\,e(t) + k_i \int e(t)\,dt. \tag{20}$$

the closed-loop transfer function of the resulting control system is

$$t_{1\text{-}dof}(s) = \frac{\omega(s)}{\omega_r(s)} = \frac{k_s k_{tg}\,(s k_p + k_i)}{t_s s^2 + s + k_s k_{tg}\,(k_p s + k_i)\,e^{-hs}}, \tag{21}$$

from which the characteristic equation of the system is

$$t_s s^2 + s + k_s k_{tg}\,(k_p s + k_i)\,e^{-hs} = 0. \tag{22}$$

the characteristic equation of the closed-loop system is transcendental and has infinitely many solutions. it can be written in matrix form in the following manner

$$s_k - a - a_d\,e^{-s_k h} = 0, \tag{23}$$

where a and ad are real 2x2 matrices whose values are

$$a = \begin{bmatrix} 0 & 1 \\ 0 & -\dfrac{1}{t_s} \end{bmatrix}, \qquad a_d = \begin{bmatrix} 0 & 0 \\ -\dfrac{k_s k_{tg} k_i}{t_s} & -\dfrac{k_s k_{tg} k_p}{t_s} \end{bmatrix}. \tag{24}$$

the solution of the characteristic equation (23) is a complex matrix $s_k \in \mathbb{C}^{2\times 2}$, where k denotes the branch of the lambert w function, which can be formulated in the following way

$$s_k = \begin{bmatrix} 0 & 1 \\ -\lambda_1\lambda_2 & \lambda_1 + \lambda_2 \end{bmatrix}, \tag{25}$$

where λ1 and λ2 represent the desired poles of the closed-loop system. in order to convert the characteristic equation (23) into the lambert w form, it is necessary [20] to introduce an unknown matrix $q_k \in \mathbb{C}^{2\times 2}$ which has to satisfy the equation

$$w_k(a_d h q_k)\,e^{\,w_k(a_d h q_k) + a h} = a_d h. \tag{26}$$

the solution of the characteristic equation (23) is obtained through the application of the following equation

$$s_k = \frac{1}{h}\,w_k(a_d h q_k) + a. \tag{27}$$

for the desired poles [1], by solving the system of two equations (26) and (27) using the lambert w_dde toolbox in matlab [18], the unknown matrix qk is obtained and the pi retarded controller parameters are set.
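as a quick cross-check of gains obtained in this way, a desired complex pole can also be substituted directly into the characteristic equation (22), which is linear in kp and ki. the python sketch below does that. it is not the matrix lambert w procedure of (26) and (27); the function names are hypothetical, and the numerical values (ks, ts, ktg, h and the pole pair −4 ± 2j) are assumed from the simulation example in chapter 6. the sketch only forces the chosen pair to be roots of (22) and does not by itself show that they are the rightmost poles.

```python
import cmath


def pi_gains_from_pole(pole, ks, ts, ktg, h):
    """Solve (22) for (kp, ki), given one desired complex pole.

    (22): ts*s^2 + s + ks*ktg*(kp*s + ki)*exp(-h*s) = 0
    =>    kp*s + ki = -(ts*s^2 + s)*exp(h*s) / (ks*ktg)
    """
    if pole.imag == 0:
        raise ValueError("a pole with a nonzero imaginary part is required")
    rhs = -(ts * pole**2 + pole) * cmath.exp(h * pole) / (ks * ktg)
    kp = rhs.imag / pole.imag        # imaginary part equals kp*omega
    ki = rhs.real - kp * pole.real   # real part equals kp*sigma + ki
    return kp, ki


def char_eq(s, kp, ki, ks, ts, ktg, h):
    """Left-hand side of the characteristic equation (22)."""
    return ts * s**2 + s + ks * ktg * (kp * s + ki) * cmath.exp(-h * s)


# assumed values: identified model and delay from the simulation example
ks, ts, ktg, h = 1.5457, 0.2715, 0.06685, 0.2
pole = complex(-4.0, 2.0)

kp, ki = pi_gains_from_pole(pole, ks, ts, ktg, h)
residual = abs(char_eq(pole, kp, ki, ks, ts, ktg, h))
print(f"kp = {kp:.4f}, ki = {ki:.4f}, |residual| = {residual:.2e}")
```

with these inputs the sketch gives gains close to kp = 5.3215 and ki = 20.2919 reported in chapter 6; the small differences come from the rounded model parameters.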
the proposed method provides the opportunity of choosing the desired real and complex conjugate poles. the selection of the desired poles, in the infinite spectrum of poles, is reduced to the selection of the rightmost poles (the closest to the imaginary axis of the complex plane), and in [16] it was shown that these could be obtained by solving (27) only for k = 0 or k = −1, which guarantees dominance.

5.2. 2-dof pi controller tuning

in the time domain, the control law of the 2-dof pi controller control system, where γ is a tuning parameter and γ ∈ (0,1), can be represented by

$$u(t) = k_p (1-\gamma)\,\omega_r(t) - k_p\,\omega(t-h) + k_i \int \omega_r(t)\,dt - k_i \int \omega(t-h)\,dt. \tag{28}$$

the closed-loop transfer function in the case of a 2-dof pi controller is

$$t_{2\text{-}dof}(s) = \frac{\omega(s)}{\omega_r(s)} = \frac{k_s k_{tg}\,\big(s k_p (1-\gamma) + k_i\big)}{t_s s^2 + s + k_s k_{tg}\,(k_p s + k_i)\,e^{-hs}}. \tag{29}$$

the closed-loop transfer functions obtained by using the 1-dof pi controller (21) and the 2-dof pi controller (29) differ only in the numerator, which means that the characteristic equations of both systems are identical and equal to (22). in order to design the 2-dof pi controller, the already described process of designing the 1-dof pi controller can be used, since equations (22)-(27) are the same for both systems. considering (29), we can conclude that for γ = 0 the 2-dof pi controller control system becomes the 1-dof pi retarded controller control system (fig. 2) with the control law (20), while for γ = 1 it becomes the so-called i-p retarded controller control system with the control law

$$u(t) = -k_p\,\omega(t-h) + k_i \int \omega_r(t)\,dt - k_i \int \omega(t-h)\,dt. \tag{30}$$

the tuning parameter γ regulates the height of the overshoot of the set-point response and is chosen from its range γ ∈ (0,1).

5.3. boundary range for selection of the desired poles

if the desired poles are λd = re{λd} ± j im{λd}, where d ∈ {1,2}, then for re{λ1} = re{λ2} and im{λ1} = −im{λ2} the poles are complex conjugate, and for im{λ1} = im{λ2} = 0 the poles are real. let the unknown matrix qk be

$$q_k = \begin{bmatrix} q_{11} & q_{12} \\ q_{21} & q_{22} \end{bmatrix}. \tag{31}$$

regardless of the value of the unknown matrix qk, taking into account the value of the matrix ad (24), the product ad h qk will always be of the form

$$m_k = a_d h q_k = \begin{bmatrix} 0 & 0 \\ m_{21} & m_{22} \end{bmatrix}. \tag{32}$$

from (27) it can be concluded that

$$h\,(s_k - a) = w_k(a_d h q_k) = w_k(m_k). \tag{33}$$

substituting the values of the matrices a and sk from (24) and (25), as well as the value of the matrix mk from (32), in (33), the following is obtained

$$\begin{bmatrix} 0 & 0 \\ -h\,\lambda_1\lambda_2 & h\left(\lambda_1 + \lambda_2 + \dfrac{1}{t_s}\right) \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ \dfrac{m_{21}}{m_{22}}\,w_k(m_{22}) & w_k(m_{22}) \end{bmatrix}. \tag{34}$$

taking into account the values of the poles λ1 and λ2 as well as the values of h and ts, all matrix coefficients on the left side of (34) are real numbers, from which we can conclude that wk(m22) is a real number. depending on the value of wk(m22), it follows from fig. 1 that it corresponds to the branch k = 0 for wk(m22) > −1, or to the branch k = −1 for wk(m22) < −1. for the closed-loop system to be stable, the conditions re{λ1} < 0 and re{λ2} < 0 have to be met. taking into account the principal branch of the lambert w function, i.e. k = 0, from (34) we get an additional boundary condition for the selection of the real parts of the desired poles.

$$-\left(\frac{1}{t_s} + \frac{1}{h}\right) < \operatorname{re}\{\lambda_1\} + \operatorname{re}\{\lambda_2\} < 0 \tag{35}$$

if the desired poles are complex conjugate, condition (35) becomes

$$-\frac{1}{2}\left(\frac{1}{t_s} + \frac{1}{h}\right) < \operatorname{re}\{\lambda_d\} < 0. \tag{36}$$

a solution of an equation, by definition, satisfies that equation.
substituting the complex conjugate pole λd = re{λd} + j im{λd} = σ + jω into the characteristic equation (22) and separating it into its real and imaginary parts, the following is obtained

$$\begin{aligned} e^{h\sigma}\Big(\big(t_s(\sigma^2-\omega^2)+\sigma\big)\cos(h\omega) - \omega\,(1+2t_s\sigma)\sin(h\omega)\Big) + \sigma\,k_p k_s k_{tg} + k_i k_s k_{tg} &= 0, \\ j\Big(e^{h\sigma}\Big(\big(t_s(\sigma^2-\omega^2)+\sigma\big)\sin(h\omega) + \omega\,(1+2t_s\sigma)\cos(h\omega)\Big) + \omega\,k_p k_s k_{tg}\Big) &= 0. \end{aligned} \tag{37}$$

from (37) we can conclude that for the chosen real part of the pole σ = re{λd}, which fulfills condition (36), there exists an imaginary part of the pole ωi = im{λd} for which the integral gain of the pi controller is ki ≥ 0. the integral gain is ki = 0 if the imaginary part of the pole fulfills the condition

$$\tan\big(h\,\operatorname{im}\{\lambda_d\}\big) = -\frac{\operatorname{im}\{\lambda_d\}}{\dfrac{1}{t_s} + \operatorname{re}\{\lambda_d\}}. \tag{38}$$

if a value of the imaginary part of the pole higher than ωi = im{λd} obtained from (38) is taken, the closed-loop system will be unstable for the real part of the pole σ = re{λd}. substituting the desired complex conjugate poles λd = σ ± jω, where ω < ωi, in (25) and solving the system of two equations (26) and (27), the pi controller gains are obtained. with the obtained controller parameters it is necessary to check whether all the other poles of the infinite pole spectrum are located to the left of the desired poles in the complex s-plane. from (37), it can also be seen that for the selected real part of the pole σ = re{λd}, which fulfills condition (36), there exists an imaginary part of the pole ω = im{λd} for which the proportional gain of the pi controller is kp ≥ 0. the proportional gain is kp = 0 only if the imaginary part of the desired pole satisfies the condition

$$\tan\big(h\,\operatorname{im}\{\lambda_d\}\big) = -\frac{\operatorname{im}\{\lambda_d\}\,\big(1 + 2t_s \operatorname{re}\{\lambda_d\}\big)}{t_s\big((\operatorname{re}\{\lambda_d\})^2 - (\operatorname{im}\{\lambda_d\})^2\big) + \operatorname{re}\{\lambda_d\}}. \tag{39}$$

the closed-loop system is on the boundary of stability only if there are poles whose real parts are equal to zero, i.e. they have only the imaginary part, and all the other poles from the infinite pole spectrum are located in the left half-plane of the complex s-plane. substituting σ = re{λd} = 0 in (38), the boundary value ωgr = im{λd} is obtained, for which ki = 0. substituting the obtained poles λ1 = jωgr and λ2 = −jωgr in (34), the following is obtained

$$w_0(m_{22}) = \frac{h}{t_s}, \qquad m_{21} = -\frac{h\,\omega_{gr}^2\,m_{22}}{w_0(m_{22})}. \tag{40}$$

from (40) it can be concluded that m22 does not depend on the boundary value of the imaginary part of the pole, which is not the case with m21. substituting λ1 = jωgr and λ2 = −jωgr in (25) and solving the system of two equations (26) and (27), the boundary value of the proportional gain kpgr is obtained. the proportional gain of the pi controller is located in the following range

$$-\frac{1}{k_s k_{tg}} < k_p < k_{pgr}. \tag{41}$$

6. simulation example

a dc motor with the parameters from [13] is considered: the moment of inertia j = 0.052 kg·m², the motor torque constant km = 0.66 N·m/A, the back emf constant ke = 0.64 V/(rad/s), the armature resistance r = 2.3 Ω, the armature inductance l = 0.0345 H, and the damping coefficient due to viscous friction β = 0.002 N·m/(rad/s). the tachogenerator constant is ktg = 0.06685 V/(rad/s).

6.1. parameter estimation of the linear dc motor model

to apply the parameter estimation method of the linear dc motor model described in chapter 4, the closed-loop system has to be underdamped. the method can be applied for several different values of the proportional gain kp and of the delay h in the feedback branch in order to identify the model which most accurately represents the dynamics of the dc motor in question. the paper shows the best result obtained. for the parameter estimation, the proportional gain kp = 5 and the time delay in the feedback branch h = 0.5 s are used, with the reference rotor speed ωr = 30π rad/s.
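the estimation steps (9)-(14) of chapter 4 can be sketched in python as follows. the step-response readings and the identified values ts and h are the ones reported in the text that follows; the overshoot is assumed to be the peak ratio (ωss − ω2)/(ω1 − ωss), which reproduces the reported os = 0.2695. the helper lambert_w0 is a hypothetical plain newton iteration, adequate for this argument but not a general lambert w implementation; it is used only to check that (16) on the principal branch returns the pole (14) for the reported ts and h.

```python
import cmath
import math

# test settings and step-response readings reported in section 6.1
kp, ktg = 5.0, 0.06685
w_ref = 30 * math.pi
w_ss, w_1, t_1, w_2, t_2 = 32.1053, 42.5769, 0.61, 29.2832, 1.4

ks = w_ss / (kp * ktg * (w_ref - w_ss))                          # (9)
os_ = (w_ss - w_2) / (w_1 - w_ss)                                # (10), assumed peak ratio
xi = -math.log(os_) / math.sqrt(math.pi**2 + math.log(os_)**2)   # (11)
w_d = math.pi / (t_2 - t_1)                                      # (12)
w_n = w_d / math.sqrt(1 - xi**2)                                 # (13)
pole = complex(-xi * w_n, w_d)                                   # (14)


def lambert_w0(z, iterations=60):
    """Newton iteration for w*exp(w) = z, started from the asymptotic guess
    log(z) - log(log(z)); a sketch for real z well below -1/e only."""
    lz = cmath.log(z)
    w = lz - cmath.log(lz)
    for _ in range(iterations):
        ew = cmath.exp(w)
        w -= (w * ew - z) / (ew * (w + 1))
    return w


# consistency check of (16) with k = 0, using the reported ts and h
ts, h = 0.2715, 0.5134
loop_gain = kp * ks * ktg
z = -(loop_gain * h / ts) * math.exp(h / ts)
pole_w0 = lambert_w0(z) / h - 1 / ts

print(f"ks = {ks:.4f}, os = {os_:.4f}, xi = {xi:.4f}")
print(f"w_d = {w_d:.4f}, w_n = {w_n:.4f}")
print(f"pole from (14): {pole:.4f}, pole from (16): {pole_w0:.4f}")
```

the two printed poles should agree to within the rounding of the reported parameters, which is a simple way to confirm a candidate pair (ts, h) before solving (17) numerically.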
for this simulation the program package matlab/simulink was used with the following parameters: the duration of the simulation t=10s, solver ode5, fixed step ts=0.01s. from the closed-loop step response the following values were measured: ωss = 32.1053 rad/s, ω1 = 42.5769 rad/s, t1 = 0.61 s, ω2 = 29.2832 rad/s and t2 = 1.4 s. with the usage of (9) the gain of the velocity transfer function, ks = 1.5457 was calculated. substituting ωss, ω1 and ω2 in (10) the overshoot os = 0.2695 was calculated, based on which and the application of (11) the damping ratio ξ = 0.3852 was calculated. the damped frequency was calculated by substituting t1 and t2 in (12), ωd = 3.9767 rad/s. by substituting the new values in (13) the natural frequency ωn = 4.3092 rad/s was 592 r. gerov, z. jovanović calculated. applying (14) closed-loop poles s1/2 = 1.6597 ± 3.9767j were calculated. substituting the newly calculated values in (17) resulted in the time constant of the dc motor ts = 0.2715 s and the time delay h = 0.5134 s. the linear model transfer function of the considered dc motor is ( ) 1.5457 ( ) ( ) 0.2715 1 mv s g s u s s     (45) time domain response of a dc motor and identified model is shown in fig. 4. the frequency response of a dc motor and identified model is given in fig. 5. fig. 4 response of the angular speed of a dc motor and identified model. fig. 5 frequency response of a dc motor and identified model. the result of the application of (18) revealed the following quality indicators of the suggested identification process, mae = 0.0494 and mrse = 0.2552. according to these indicators (the mae index and rmse index, the response of the dc motor and the model obtained by the usage of the suggested identification process in the usage of lambert w function for identification and speed control of a dc motor 593 fig. 4, the frequency response in fig. 
5, from which it is clear that the deviation of the frequency characteristics occurs only at irrelevant high frequencies), it can be concluded that the dc motor is adequately identified by the suggested method.
6.2. designing the controller
the design requirements are: a reference rotor speed of 200 rad/s, a rotor settling time shorter than 1.3 s and a maximum overshoot of the rotor speed below 10%. substituting the calculated values ks and ts from (45), as well as the tachogenerator constant ktg, in (24) gives the matrices a and ad as functions of the pi controller gains. it is well known that measuring devices, i.e. sensors, as well as communication lines, cause delays. let us suppose that measurement and communication delays exist between the regulated output and the controller [24]. for the purpose of designing the 1-dof pi retarded controller, a delay of h = 0.2 s in the measured rotor speed is introduced. the suggested controller design method was applied to desired real and complex-conjugate closed-loop poles λ1 and λ2. solving (38) for σ = re{λd} = 0 within the range π / (2h) < im{λd} < π / h gives ωgr = im{λd} = 9.6733. substituting λ1 = 9.6733j and λ2 = −9.6733j in (25) and solving the system of two equations (26) and (27) gives the boundary values of the gains, kpgr = 27.1936 and ki = 0. for these pi controller parameters the system is on the stability boundary. from (41) it follows that −9.6779 < kp < 27.1936. from (35) and (36), the conditions for selecting the real parts of the closed-loop poles are: re{λ1} + re{λ2} > −8.6838, re{λ1} < 0, re{λ2} < 0, i.e. re{λ1/2} > −4.3419 for the principal branch. for the principal branch of the lambert w function, wk(m22) > −1. therefore, for complex-conjugate poles with re{λ1/2} = −1 / (2ts) = −1.8419, wk(m22) = 0 and m22 = 0.
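the boundary values quoted above can be cross-checked numerically. the sketch below assumes that the closed loop (first-order motor model ks/(ts·s + 1), pi controller, tachogenerator ktg and measurement delay h) has the characteristic equation ts·s² + s + ks·ktg·(kp·s + ki)·e^(−sh) = 0. this form is not quoted from the paper; it is inferred from the model description, and the function name is illustrative.

```python
import cmath

# Identified model, tachogenerator constant and measurement delay from this section.
KS, TS, KTG, H = 1.5457, 0.2715, 0.06685, 0.2

def char_residual(s, kp, ki, ks=KS, ts=TS, ktg=KTG, h=H):
    """Residual of the assumed characteristic quasi-polynomial at the point s.

    A value close to zero means that s is (numerically) a closed-loop pole
    for the given gains.
    """
    return ts * s ** 2 + s + ks * ktg * (kp * s + ki) * cmath.exp(-s * h)

# Boundary case: purely imaginary pole, ki = 0, kp = kpgr.
boundary = char_residual(9.6733j, kp=27.1936, ki=0.0)
print("residual at j*9.6733:", abs(boundary))

# Lower limit of the proportional gain range in (41).
kp_min = -1.0 / (KS * KTG)
print("kp_min:", kp_min)
```

the residual is small compared with the individual terms (which are of the order of 25), and the lower limit comes out close to −9.68, so under the stated assumption the reported boundary values are mutually consistent.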
if 4.3419 < re{λ1/2} < 1.8419, wk(m22) < 0 and m22 < 0 while 1.8419 < re{λ1/2} < 0, wk(m22) > 0 and m22 > 0. the paper considers both cases. the method of designing a controller was applied to the complex conjugate poles with re{λ1/2} = 4, which suits the case of wk(m22) < 0. applying substitution and solving (38) im{λd}gr = 7.6474 is received. if the desired poles are λ1/2 = 4 ± 7.6474j, ki = 0, kp = 9.0351. the pi controller is functioning as a proportional controller, meaning that there will be an offset outgoing signal i.e. there will be a deviation of the rotor velocity compared to the previously entered reference velocity. for higher values of the imaginary part of the pole, the system will be unstable because ki < 0, i.e. from the infinite pole spectrum there exists a pole that is located in the right half-plane of the s-plain. for example, λ1/2 = 4 ± 8j, ki = 6.4883, kp = 9.1043, λ3 = 0.3194. for the values of the imaginary pole part that are lower than the boundary value, the closed-loop system is stable, while both gains are positive. for example, λ1/2 = 4 ± 2j, ki = 20.2919, kp = 5.3215. displacing the desired poles onto the right side re{λ1/2} = 1, the case of wk(m22) > 0 is taken into consideration. substituting re{λ1/2} = 1 in (38), im{λd}gr = 9.2639 is received, while substitution in (39) results in the boundary value of the imaginary pole part where kp = 0, im{λd}0 = 2.2692. if the desired value of the imaginary pole parts is lower than im{λd}0, then kp < 0 and ki > 0. for example, for λ1/2 = 1 ± 2j, kp = 0.5367, ki = 15.5265. 594 r. gerov, z. jovanović since the confirmation results of the boundary conditions are received for the identified dc motor model, the angular velocity of the identified model with the received pi controllers is shown in fig. 6. wherefrom the confirmation of the boundary conditions for the selection of the desired poles can be seen. fig. 
6 response of the angular speed of the identified model with different pi retarded controllers and time delay h = 0.2 s.
to evaluate the control performance, the integral absolute error (iae) and the integral square error (ise) of the control error ωr(t) − ω(t) are used:
$\mathrm{iae} = \int_0^{\infty} \left| \omega_r(t) - \omega(t) \right| dt$ (46)
$\mathrm{ise} = \int_0^{\infty} \left( \omega_r(t) - \omega(t) \right)^2 dt$ (47)
small values of iae and ise indicate good closed-loop performance. table 1 contains the values of the proportional and integral gains of the pi retarded controller obtained for different values of the desired closed-loop poles and time delay h = 0.2 s. to evaluate the closed-loop performance, we considered a step set-point change of amplitude 200 and a step input (load) disturbance of amplitude 10. in order to present the results quantitatively, the iae and ise for the set-point change and for the load disturbance, as well as the settling time (ts) and the percentage overshoot (os) of the closed-loop response, are also given in table 1.
table 1 parameters of the pi controller for different values of the desired rightmost poles of the closed-loop system, with the settling time, the percentage overshoot, and the iae and ise values for set-point and load disturbance
poles λ1/2     kp      ki       ts [s]  os [%]  set-point iae  set-point ise  load iae  load ise
−4±2j          5.3215  20.2919  0.71    1.66    58.75          7271           25.69     670
−4±3j          5.9237  22.6005  1.13    4.50    53.93          6530           23.17     605
−3±3j          5.7552  24.7598  1.28    9.94    58.84          6567           22.92     590
−3.5±4.3j      7.2219  27.4642  1.02    12.7    51.17          5567           19.89     507
−4 and −4.5    4.8600  17.9475  0.99    0.00    67.84          8151           29.05     745
since the values of iae and ise in table 1 are given for a step set-point change of amplitude 200 and a step load disturbance of amplitude 10, it can be said that the closed-loop system has good performance.
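for sampled simulation data, the indices (46) and (47) are in practice evaluated over a finite horizon. a minimal sketch, assuming trapezoidal integration of the sampled control error, is given below; the function name and the test signal are hypothetical and are not taken from the paper.

```python
import math

def iae_ise(t, reference, output):
    """Approximate (46) and (47) on sampled data with the trapezoidal rule.

    The infinite upper limit is replaced by the last available sample, so the
    horizon must be long enough for the error to have settled.
    """
    if not (len(t) == len(reference) == len(output)) or len(t) < 2:
        raise ValueError("need equal-length sequences with at least two samples")
    iae = 0.0
    ise = 0.0
    for k in range(1, len(t)):
        dt = t[k] - t[k - 1]
        e_prev = reference[k - 1] - output[k - 1]
        e_curr = reference[k] - output[k]
        iae += 0.5 * dt * (abs(e_prev) + abs(e_curr))
        ise += 0.5 * dt * (e_prev ** 2 + e_curr ** 2)
    return iae, ise

# Hypothetical signal: first-order approach to the set-point 200 with a 1 s time constant.
step = 0.001
t = [k * step for k in range(20001)]               # 0 .. 20 s
ref = [200.0] * len(t)
out = [200.0 * (1.0 - math.exp(-tk)) for tk in t]  # error is 200*exp(-t)
iae, ise = iae_ise(t, ref, out)
print(iae, ise)
```

for this test signal the exact integrals are 200 and 20000, which gives a quick way to check the step size and the horizon before applying the helper to logged motor responses.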
figure 7 depicts the response of the angular speed of a dc motor to a step torque disturbance of amplitude 10 applied at t = 5 s, when regulated by the 1-dof pi retarded controllers with the parameters given in table 1. the response with the 1-dof pi retarded controller with kp = 7.2219, ki = 27.4642 is not shown in fig. 7, because this controller does not meet one of the design requirements, namely that the maximum overshoot of the rotor speed be less than 10%. it is clear from fig. 7 that the 1-dof pi retarded controller designed for the desired complex-conjugate poles λ1/2 = −3 ± 3j gives a system with better disturbance compensation but also with a higher set-point overshoot.
fig. 7 the response of a dc motor with different 1-dof pi retarded controllers with the parameters given in table 1 and time delay h = 0.2 s.
the 2-dof pi retarded controller is used for overshoot elimination, with γ being the tuning parameter and the control law given in (28). for the 1-dof pi retarded controllers that already fulfil the design requirements, a controller with two degrees of freedom is not necessary, although it can still be used to lower the overshoot, for example with the tuning parameter γ = 0.35 for the controller whose parameters were obtained for λ1/2 = −4 ± 2j. the 1-dof pi retarded controller designed for real poles produces no overshoot of the rotor speed, so the tuning parameter in the control law of the 2-dof controller is γ = 0. the influence of the tuning parameter γ on the angular velocity of the dc motor with the 2-dof pi retarded controller is considered for the case in which the closed-loop system with the 1-dof pi retarded controller does not fulfil the design requirements.
table 1 shows that the requirements are not met by the controller with parameters kp = 7.2219 and ki = 27.4642 because the overshoot is higher than 10%. it also shows that this controller gives the lowest values of the iae and ise indices. the response of the angular velocity of a dc motor to a step torque disturbance of amplitude 10 applied at t = 5 s, with the 2-dof pi retarded controller with gains kp = 7.2219 and ki = 27.4642 and tuning parameters γ = 0, γ = 0.15 and γ = 0.3, is shown in fig. 8. the response for γ = 0 is equal to the response obtained with the 1-dof pi retarded controller. fig. 8 clearly shows that increasing γ lowers the overshoot: for γ = 0.15, os = 8.9% with set-point iae = 52.15 and ise = 6067, while for γ = 0.3, os = 5.9% with set-point iae = 54.84 and ise = 6742. in both cases iae = 19.89 and ise = 507 for the load disturbance.
fig. 8 response of the angular speed of a dc motor with 2-dof pi retarded controller with the parameters kp = 7.2219, ki = 27.4642, time delay h = 0.2 s, and different γ.
considering that the influence of parameter uncertainty was not analysed in paper [1], perturbations of ±20% in the dc motor parameters j, km, ke, r, l and β are considered. the response of the angular speed of a dc motor with the pi retarded controller with the parameters ki = 20.2919, kp = 5.3215, time delay h = 0.2 s, and perturbations of ±20% in the dc motor parameters j, km, ke, r, l and β is shown in fig. 9.
fig. 9 response of the angular speed of a dc motor with 1-dof pi retarded controller with the parameters ki = 20.2919, kp = 5.3215, time delay h = 0.2 s, and perturbations of ±20% in the dc motor parameters j, km, ke, r, l and β.
it can be observed that changing the parameters affects the response of the feedback system, but that the system remains stable. for example, for the closed-loop system with the pi controller ki = 17.9475, kp = 4.8600, a perturbation of +20% in all dc motor parameters increases iae by 32% and ise by 23%, with ts = 1.54 s and os = 0%, while a perturbation of −20% decreases iae by 22% and ise by 20%, with ts = 0.924 s and os = 3.66%. for the closed-loop system with the pi controller ki = 20.2919, kp = 5.3215 and a perturbation of +20% in all dc motor parameters, the settling time is ts = 1.12 s, os = 0.01%, iae = 74.50 and ise = 8877 for the set-point change, while for a perturbation of −20%, ts = 1.08 s, os = 8.93%, iae = 51.50 and ise = 5902. these results were compared with those obtained with a standard pi controller (pis) designed without assuming a delay in the feedback branch. for the desired poles λ1/2 = −4 ± 2j, the pis controller parameters are kis = 52.5429, kps = 11.3392. for the desired real poles λ1 = −4 and λ2 = −4.5, the pis controller parameters are kis = 47.2886, kps = 12.6528. for the same perturbation in all dc motor parameters, better results are obtained with the pis controller than with the suggested pi retarded controller. the influence of changing the time delay (td) in the feedback branch on the angular velocity of the dc motor is considered for both the suggested pi controller and the standard pis controller. the controller parameters are those obtained for real poles: for the suggested pi controller, ki = 17.9475, kp = 4.8600, with no intentionally added delay in the measured motor speed, i.e. h = 0 s, and for the pis controller, kis = 47.2886, kps = 12.6528. the response of the angular velocity of the dc motor for different values of the time delay td in the measured speed, for the control system with the proposed pi controller and with the standard pis controller, is shown in fig. 10.
fig.
10 response of angular velocity of a dc motor with the proposed pi controller and standard pis controller for different values of the time delay td in the measured speed.
based on the results shown in fig. 10, it can be concluded that for a feedback delay td > 0.3 s the system regulated with the pis controller becomes unstable, which is not the case with the suggested pi controller. the closed-loop system with the proposed pi controller becomes unstable for td > 0.8 s. the suggested pi controller therefore performs considerably better than the standard pis controller when a time delay exists in the feedback branch. the same result is observed for controllers designed for desired complex-conjugate poles [1].
6. conclusion
the linear dc motor model obtained by applying the proposed new method of parameter estimation using the lambert w function has characteristics that differ only slightly from those of the dc motor in the time and frequency domains. for this reason, the model can be used for designing various types of controllers, not only for the speed of the dc motor rotor but also for its position. controlling the rotor speed of the dc motor using the pi controller with the measured-speed delay adjusted in the suggested manner leads to a robust system with adequate disturbance compensation. in future work, this identification procedure can be used to obtain linear dc motor models for designing other types of speed and position controllers for dc motors.
references
[1] r. gerov, z. jovanović, “lambert w function application to a direct current motor speed regulation with a delay”, in proceedings of the 13th international conference on applied electromagnetics pes 2017, niš, serbia, 2017, pp. o6-3.
[2] a.
hughes, “electric motors and drives: fundamentals, types and applications”, 3rd ed., elsevier, 2006.
[3] h. a. toliyat (ed.), g. b. kliman (ed.), “handbook of electric motors”, 2nd ed., boca raton: crc press, 2004.
[4] t. umeno, y. hori, “robust speed control of dc servomotors using modern two degrees-of-freedom controller design”, ieee transactions on industrial electronics, vol. 38, no. 5, pp. 363–368, 1991.
[5] v. léchappé, o. salas, j. león, f. plestan, e. moulay, a. glumineau, “predictive control of disturbed systems with input delay: experimental validation on a dc motor”, in proceedings of the ifac, 2015, vol. 48, no. 12, pp. 292–297.
[6] m. ruderman, j. krettek, f. hoffmann, t. bertram, “optimal state space control of dc motor”, in proceedings of the ifac, 2008, vol. 41, no. 2, pp. 5796–5801.
[7] a. ramirez, r. garrido, s. mondie, “velocity control of servo system using an integral retarded algorithm”, elsevier, isa transactions, vol. 58, pp. 357–366, 2015.
[8] k. matsuo, t. miura, t. taniguchi, “speed control of a dc motor system through delay time variant network”, in proceedings of the sice-icase international joint conference, 2006, pp. 399–404.
[9] w. lord, j. h. hwang, “dc servomotors: modeling and parameter determination”, ieee transactions on industry applications, vol. ia-13, no. 3, pp. 234–243, 1977.
[10] r. garrido, r. miranda, “closed loop identification of a dc servomechanism”, in proceedings of the 2006 ieee international power electronics congress, 2006, pp. 1–5.
[11] k. radojka, a. sanja, s. danilo, “recursive least squares method in parameters identification of dc motors models”, facta universitatis, series: electronics and energetics, vol. 18, no. 3, pp. 467–478, 2005.
[12] c. s. chen, s. c. lin, s. m. wang, y. m.
hong, “adaptive control based on reduced-order parameter identification and disturbance observer for linear motor”, in proceedings of the asme 2007 conference on international manufacturing science and engineering, atlanta, georgia, usa, 2007, pp. 751–758.
[13] a. a. al-qassar, m. z. othman, “experimental determination of electrical and mechanical dc motor using genetic elman neural network”, journal of engineering science and technology, vol. 3, no. 2, pp. 190–169, 2008.
[14] r. m. corless, g. h. gonnet, d. e. g. hare, d. j. jeffrey, d. e. knuth, “on the lambert w function”, advances in computational mathematics, vol. 5, pp. 329–359, 1996.
[15] s. yi, p. w. nelson, a. g. ulsoy, “proportional-integral control of first-order time-delay systems via eigenvalue assignment”, ieee transactions on control system technology, vol. 21, no. 5, pp. 1586–1594, 2013.
[16] s. yi, p. w. nelson, a. g. ulsoy, “time delay systems: analysis and control using the lambert w function”, world scientific, 2010.
[17] s. yi, p. w. nelson, a. g. ulsoy, “dc motor control using the lambert w function approach”, in proceedings of the ifac, 2012, vol. 45, no. 14, pp. 49–54.
[18] s. yi, p. w. nelson, a. g. ulsoy, “the lambert w function approach to time delay systems and the lambertw_dde toolbox”, in proceedings of the ifac, 2012, vol. 45, no. 14, pp. 114–119.
[19] r. gerov, z. jovanović, “synthesis of pi controller with a simple set-point filter for unstable first-order time delay processes and integral plus time delay plant”, elektronika ir elektrotechnika, vol. 24, no. 2, pp. 3–11, 2018.
[20] r. m. corless, h. ding, n. j. higham, d. j. jeffrey, “the solution of s*exp(s)=a is not always the lambert w function of a”, in proceedings of the 2007 international symposium on symbolic and algebraic computation issac 2007, waterloo, ontario, canada, 2007, pp. 116–121.
[21] r. r. alla, j. s. lather, g. l.
pahuja, “comparison of pi controller performance for first order systems with time delay”, journal of engineering science and technology, vol. 12, no. 4, pp. 1081–1091, 2017.
[22] r. gerov, z. jovanović, “primena proporcionalno-integralnog kontrolera za sisteme prvog reda sa kašnjenjem korišćenjem lambert w funkcije za podešavanje polova”, zbornik 61. konferencije za elektroniku, telekomunikacije, računarstvo, automatiku i nuklearnu tehniku, etran 2017, kladovo, srbija, 2017, pp. au1.8.1-5.
[23] s. m. özer, s. yıldız, a. iftar, “optimization-based pv/pi design for a dc-motor system with delayed feedback”, in proceedings of the 26th mediterranean conference on control and automation, zadar, croatia, 2018, pp. 39–45.
[24] s. ayasun, “stability analysis of time-delayed dc motor speed control system”, turkish journal of electrical engineering & computer sciences, vol. 21, pp. 381–393, 2013.
[25] g. j. silva, a. datta, s. p. bhattacharyya, “pi stabilization of first-order systems with time delay”, automatica, vol. 37, pp. 2025–2031, 2001.
[26] r. sipahi, s. i. niculescu, c. t. abdallah, w. michiels, k. gu, “stability and stabilization of systems with time delay. limitations and opportunities”, 2010, https://digitalrepository.unm.edu/ece_rpts/35.
[27] w. michiels, s. i. niculescu, l. moreau, “using delays and time-varying gains to improve the static output feedback stabilizability of linear systems: a comparison”, ima journal of mathematical control and information, vol. 21, no. 4, pp. 393–418, 2004.
[28] inteco, modular servo system-user’s manual, [online]. available: www.inteco.com.pl, 2008.
instruction facta universitatis series: electronics and energetics vol. 28, no 1, march 2015, pp.
29–50, doi: 10.2298/fuee1501029n
examples of medical software and hardware expert systems for dysfunction analysis and treatment
andrzej napieralski, zygmunt ciota, marcin janicki, marek kamiński, rafał kotas, paweł marciniak, aleksander mielczarek, małgorzata napieralska, robert ritter, bartosz sakowicz, wojciech tylman, mariusz zubert
lodz university of technology, department of microelectronics and computer science, lodz, poland
abstract. this paper presents recent research carried out at dmcs. the medical and biometric research projects are presented. one of the key elements is image acquisition and processing. the paper presents research on the diagnostic application of voice analysis for stroke patients with speech dysfunction, as well as a method for diagnosing and monitoring the effectiveness of medical rehabilitation of patients with dysfunction of the cervical spine. then a method for sudden cardiac death risk stratification is elaborated.
key words: medical systems, speech dysfunction, sudden cardiac death risk
1. research of diagnostic application of voice analysis for stroke patients with speech dysfunction
human verbal communication is a very complex process, consisting of nervous impulse synchronization, stimulation of speech processes and extraction of phonetic information by the aural system. the tracking of speech context, occurring in the brain, is also a very important task of neural network activation. subjective diagnosis carried out by an experienced phoniatrist or speech therapist is the simplest method of voice quality evaluation. such classification of voice requires experience and intuition, so it cannot be applied commonly, especially in comparative studies presented by various medical centres. a better measurement of voice quality can be obtained by applying objective acoustic analysis, a relatively new method.
spectrographic, sonographic and temporal analyses of speech signals are useful as objective methods of voice quality measurement. computer modelling of such a complicated process demands a large number of mathematical calculations performed by high-speed processors. in the case of vascular lesion of nervous areas, additional pathological distortions should be taken into account as dominant parameters.
received october 30, 2014. corresponding author: andrzej napieralski, lodz university of technology, department of microelectronics and computer science, ul. wolczanska 221/223, 90-924 lodz, poland (e-mail: napier@dmcs.pl)
evaluating the importance of such parameters demands close collaboration between neurologists, computer scientists and electronic engineers. such collaboration gives us the opportunity to assess the patient's treatment progress using our new signal-statistical method. the correlation between pathological changes of neurological stability and voice pathology can be established. to obtain useful results, different analyses must be implemented: time domain, frequency domain and mixed time-frequency analysis, all of which should be assisted by statistical functions for the calculation of voice variability as a function of health state. one of the most important tasks is a proper definition of feature vectors. each vector can contain several specific features of voice signals, and finally, one has to calculate more than a hundred features for each utterance. the following four vectors can be favoured as very important and useful, covering the most important features of the speech signal: the long-term spectra (lts) vector, the speaking fundamental frequency vector, the time-energy distribution vector and the vowel formant-tracking vector [1].
to obtain such vectors, two kinds of analysis should be taken into account:
- long-term spectrum analysis, in which the voice samples are relatively long (a few seconds),
- short-term analysis, applying the fast fourier transform, which covers short intervals of speech (from 0.1 to 0.6 seconds).
we present a new method of registration and processing of voice parameters for patients with vascular lesion of the central nervous system. the most frequent cause is stroke, in which one can observe focal disorders of cerebral function caused by vascular injury. there are two kinds of stroke: hemorrhagic (20% of all cases) and ischaemic (80% of all cases in medical practice). speech quality depends on the programming action of the central nervous system and on the transmission of stimuli in the cortical-subcortical area. voice analysis has been performed using our own signal processing algorithms based on fourier transforms. additionally, statistical analysis was used to obtain a proper correlation between the improvement of voice parameters and the neurological condition of the examined patient. the investigations were performed for several patients with ischaemic and also hemorrhagic stroke, during the first three days of hospitalization. afterwards, examinations were repeated once a week. the voice analysis of the patients with ischaemic brain stroke, performed using our software, indicated specific deviations of frequency and amplitude in the formant parameters (the phoneme "a" seems to be a very important example), in comparison with healthy persons. the same abnormalities were significantly smaller for patients with aphasia in the case of hemorrhagic stroke. the presented method makes it possible to estimate the prospects of speech recovery at the very beginning of an ischaemic stroke. we record patients and next we analyse these samples by using various algorithms.
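the authors' own algorithms are not described here, so the following is only a generic sketch of what a long-term spectrum estimate can look like: the recording is cut into short frames, a magnitude spectrum is computed for each frame, and the spectra are averaged. the frame length, the test tone and all function names are assumptions made for illustration, and a direct dft is used instead of an fft to keep the example dependency-free.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Magnitude of the DFT of one frame (direct O(n^2) form, for clarity only)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]

def long_term_spectrum(signal, frame_len):
    """Average the magnitude spectra of consecutive non-overlapping frames."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, frame_len)]
    if not frames:
        raise ValueError("signal shorter than one frame")
    spectra = [magnitude_spectrum(frame) for frame in frames]
    return [sum(column) / len(spectra) for column in zip(*spectra)]

# Hypothetical input: a 200 Hz tone sampled at 1600 Hz, analysed in 64-sample frames.
fs, f0, frame_len = 1600, 200, 64
tone = [math.sin(2 * math.pi * f0 * i / fs) for i in range(4 * frame_len)]
lts = long_term_spectrum(tone, frame_len)
peak_bin = max(range(len(lts)), key=lts.__getitem__)
print("peak frequency [Hz]:", peak_bin * fs / frame_len)
```

in a real recording the frames would typically be windowed and overlapped, and the averaged spectrum of a patient would be compared with that of a healthy reference speaker, as the text describes.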
numerical data are assigned to the patient's state of health during the recording and are compared with data of a healthy person of the same age group and gender. we use time, frequency and time-frequency analysis. it is possible to observe characteristic changes as a function of the patient's health state. these dependences are proportional to the degree of cerebral damage. the characteristic of the spectral estimate of a healthy person has a smooth course. for ill persons the power of the signal is characterized by larger amplitudes. examples of the results are presented in figs. 1, 2 and 3 [2, 3].
fig. 1 voice signal processing for the patient with stroke hospitalization: mixed time-frequency (formant) analysis (panels: transient ischaemia, 2nd day in hospital; transient ischaemia, 7th day in hospital)
fig. 2 voice signal processing for patients with stroke hospitalization: frequency domain results [2]
fig. 3 vocal tract filter for the patient with traumatic haematoma in the left and right hemisphere in the first (left) and last (right) day of hospitalization [3]
2. an innovative method for diagnosing and monitoring the effectiveness of medical rehabilitation of patients with dysfunction of the cervical spine
2.1. the examination process
2.1.1 the former procedure
the traditional method of examination requires attaching a laser pointer to the patient's head, which is done using a fixture based on tightly fitting ear protectors. the examinee sits in front of a white screen with several colour shapes drawn on it; the setup is outlined in fig. 4. the medical doctor observes how fast and how accurately the patient follows the shapes with the spot of the laser.
this method requires the doctor to stay firmly focused for about ten minutes, during which he/she assesses the quality and time of completion of subsequent tasks. reducing human engagement in the process of data acquisition should make the assessment more precise and repeatable. estimating the accuracy with which the examinee is able to operate the spot, evaluating the area of the screen that the patient is able to reach, and measuring time are tasks that computers have been proved to perform successfully. moreover, digital storage of the results will in the future facilitate tracking progress in rehabilitation and will help design new exercises tailored to the individual patient's needs.
2.1.2 requirements for automated test
the measurement method has to fulfil several types of constraints: accuracy, bandwidth and reproducibility of results. all these fields were initially discussed with medical doctors. the measurement setup is expected to achieve a resolution of about 1 mm with a patient sitting or standing about three meters away from the screen. the desired accuracy is hence similar to the one achieved by a person observing the screen during the original examination procedure. a typical man is able to perform up to about 15 cycles of tensing and relaxing of skeletal muscles per second [4]. people training the same move on a daily basis may double or even triple this value, but such a situation may be neglected in the case of the head. however, since the movements are generally nonlinear, several higher harmonics shall also be taken into account. including the first three harmonics implies a sampling rate of about 90 measurements per second (to fulfil the nyquist criterion for the first three harmonics of a body part moving with a 15 hz fundamental frequency, i.e. twice the 45 hz third harmonic). the device shall not indicate a drift of the pointer larger than 2 mm during a 10-minute period of being at rest.
fig. 4 former examination procedure
2.1.3 the proposed measuring method
the adopted approach is founded on the assumption that it is possible to calculate the laser pointer location on the screen based on limited knowledge of the patient's head position. the task could be easily accomplished if the head position (three coordinates) and angles (at least two) in relation to the screen were known with appropriate accuracy. the problem could then be reduced to the projection of a point on a plane along a predefined line. in practice, the precise measurement of the actual head position is complicated and the applicable techniques are expensive. therefore, the proposed system shall be able to operate without obtaining those parameters directly. the measurement of angles is more straightforward, as there are numerous sensors available that can directly or indirectly provide the elevation and heading angle. a convenient way to measure the required angles is to fix a device with sensors to the patient's head, in the same manner as the laser pointer was originally mounted. measurement of the elevation angle (labelled α in fig. 5) is relatively simple, as it requires only determining the direction of the earth's gravitational force vector in the coordinate system of the device and thus of the head. commercially available inclinometers, sensors dedicated to this measurement, are relatively expensive. alternatively, the α angle can be determined using an accelerometer accompanied by angle computation procedures implemented in software. the measurement of rotation in the horizontal plane (β) cannot be done analogously, as the gravity vector is unchanged during such a movement. apart from gravity, the earth is also a source of a magnetic field, which might be used to determine the patient's heading using a magnetometer. unfortunately, this field is very weak and the local induction vector depends heavily on the nearest environment.
it can vary significantly even over a short observation period, for example due to nearby electrical appliances. the system could never fully rely on the magnetometer readouts. another method of determining both angles is the measurement of the angular velocities of the device using a gyroscope. integrating these leads directly to the angle values. this method is simple, but has several important drawbacks. any sensor error will be integrated over time, causing a constant drift of the angle value. moreover, the integration constant is not known in advance. nevertheless, the gyroscope is more reliable than the magnetometer and it is utilized in the proof-of-concept device. both problems indicated above had to be solved to take full advantage of this sensor's features.
fig. 5 spot projection on the screen
the drift of the calculated angles can be significantly reduced by calibrating the sensor before the examination. the system can determine the constant component continuously present on the gyroscope outputs and cancel it out during the actual measurement. the described projection implementation was intended to be temporary and to be corrected once the application became functional. unexpectedly, this solution seems to be accurate enough for system evaluation purposes and initial tests on patients. there is a visible deviation between the laser pointer and the simulated pointer after moves for which the small-angle assumption does not hold and after fast moves that exceed the sensor full-scale range. nevertheless, within the operational area, the simulated pointer behaves as the brain would expect it to, thus the application is functional enough to guide the patient through a set of predefined moves.
2.1.4 the examination setup
the complete setup is depicted in fig. 6.
the actual laser is still needed for calibration purposes, but is turned off during the examination. assuming the hospital has a presentation pc with a projector, it only needs to invest in a device worth about $50 in components.

2.1.5 system concepts summary

in the designed solution, a small device with a gyroscope, accelerometer, magnetometer and laser pointer is fixed to the head of the examined person. the sensors gather information on the elevation, the angular velocities of the device and the direction of the magnetic field lines, hence determining the head's position. the shapes and the calculated pointer are cast on the screen by means of a standard multimedia projector. before the examination, the system has to be calibrated to account for changes in the position of the examined person and the location of the screen. the computer, which coordinates the system, easily obtains the information on the actual spot position in relation to the displayed shapes. since no intensive computations are needed, the performance of a typical office pc is more than sufficient.

fig. 6 the proposed examination scheme

2.2. the measurement device

2.2.1 initial considerations

the sensor box has to be small enough to be mounted on the frame of ear protectors. it should contain the laser pointer on board, and the pointer should be enabled and disabled automatically. the device should be as light as possible and the cabling should not limit the patient's movements in any way. wireless operation was considered but finally discarded. it would require a costly radio module, impose additional requirements on the computer and increase the mass due to the need for a battery, which would be the heaviest component. due to its high popularity and more than sufficient throughput, the usb 2.0 standard was chosen for both the data link and the power supply. according to the earlier considerations, the device should have a resolution of 1 mm with the patient located 3 meters away from the screen.
calculation of the appropriate arctangent reveals that the angular resolution should be about 0.00033 rad (0.02°).

2.2.2 the proof-of-concept hardware

to test whether the proposed solution can be implemented successfully, a prototype sensor board, codenamed gyroaccel, was built. the device was fitted into the case of a small flashlight. it features an 8-bit accelerometer, a 16-bit gyroscope (250 °/s range [5]) and a laser pointer. the device has proven that the idea is useful for patient examination and encouraged further development. the main disadvantage of the first prototype is the too low resolution of the accelerometer. the schematic of the final revision, v4, is heavily based on the previous prototypes. minor improvements were made, and the board was redesigned to adapt to the new accelerometer and to fit in a much smaller case, along with the complete laser pointer assembly (see fig. 7).

fig. 7 the final device integrated with laser pointer

all the prototypes were developed around the atmega32u2 microcontroller from atmel. it is an affordable 8-bit processor with 32 kb of flash memory, 1 kb of ram and a rich set of peripheral circuits. it has an embedded usb 2.0 full-speed compatible controller. the microcontroller interfaces with the accelerometer and magnetometer through an i2c-compatible two-wire interface and with the gyroscope through a four-wire spi bus.

2.3. the data processing

2.3.1 system outline

the hardware part of the system captures the data from the physical sensors and performs the oversampling required for obtaining a usable accelerometer readout. packets containing the values read from the gyroscope, accelerometer and magnetometer are transferred through the usb pipe. the computer receives the data stream of the inertial sensors and passes it through the calibration module.
the magnetometer's data stream is also extracted, but it is not used in the current software revision. the accelerometer data are used for calculating the absolute elevation angle using the arcsine. the output from the gyroscope is integrated to obtain the values of rotation in the vertical and horizontal planes. if no calibration were done beforehand, the dc component of the integrated signal would lead to a drift of the calculated angles.

2.3.2 microcontroller software design

the economical single-sided pcb design has put several constraints on the microcontroller i/o usage. the spi lines had to be routed through some other pins to be able to reach the dedicated lines of the built-in spi block. furthermore, the microcontroller does not offer hardware support for the i2c protocol, hence a software solution was adopted. the application sets the accelerometer to 800 hz operation. it collects all the readouts and averages them appropriately to obtain a 14-bit result out of 12-bit samples. the gyroscope and magnetometer readouts are also acquired, and a complete matrix of nine 16-bit values is stored in the local fifo. when the usb controller receives a read request from the host, the data are passed to the usb endpoint buffer and marked for transmission. the communication is done using bulk transfers [6]. the device firmware may be upgraded easily, as the microcontroller is equipped with a usb boot loader. the upgrade feature is activated by pressing a dedicated pushbutton, which is hidden inside the plastic case to avoid accidental activation.

2.3.3 the computer application

the computer application is written in the c# language with bindings to the libusb library. the project was developed under windows with the .net framework, but should also compile under the mono framework for unix-like platforms. the data arriving from the usb interface are first checked for a valid packet structure. if the check fails, the algorithm skips the following bytes until a valid packet is found.
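the sensor arithmetic described in sections 2.3.1 and 2.3.2 can be sketched in a few lines of python. this is only an illustration under stated assumptions, not the authors' c# or firmware code: the block size of 16 samples is the textbook choice for gaining two bits by oversampling and is not given in the paper, the axis convention and units (g and degrees per second) are assumed, and all function names are hypothetical.

```python
import math


def oversample_12_to_14(samples):
    """combine sixteen 12-bit readings into one 14-bit value (assumed block size)."""
    if len(samples) != 16:
        raise ValueError("this sketch assumes blocks of exactly 16 samples")
    # summing 16 samples gains 4 bits; dropping 2 of them leaves a 14-bit result
    return sum(samples) >> 2


def elevation_deg(accel_along_axis, gravity=1.0):
    """elevation angle from the gravity component along the pointing axis (arcsine)."""
    # clamp the ratio so that noise cannot push asin outside its domain
    ratio = max(-1.0, min(1.0, accel_along_axis / gravity))
    return math.degrees(math.asin(ratio))


def estimate_bias(stationary_rates):
    """constant gyroscope offset, taken as the mean rate while the device is still."""
    return sum(stationary_rates) / len(stationary_rates)


def integrate_rate(rates, dt, bias=0.0, start_angle=0.0):
    """rectangular integration of the bias-corrected angular rate into an angle."""
    angle = start_angle
    for rate in rates:
        angle += (rate - bias) * dt
    return angle
```

with a bias of 0.5 °/s left uncorrected, one second of integration already adds 0.5° of drift, which is large compared with the 0.02° resolution target stated in section 2.2.1.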
the data are then extracted and sensor saturation detection is performed. next, the data are processed according to fig. 8. finally, the application presents the patient with a screen resembling the one used in the traditional form of examination, depicted in fig. 9. the trajectory of the move is recorded and every point outside the predefined shape is highlighted. the length of the path outside the shape should be measured together with the maximum deviation from the path. the time required to pass between several markers is also important and should be recorded. the doctor is presented with a simple window for controlling the calibration process and defining the visibility of shapes and points of error occurrences.

fig. 8 data processing scheme

fig. 9 the application screen of examined

2.4. the results

the most important objective has been achieved: the developed prototypes proved that the idea of using a gyroscope and an accelerometer for simulating the position of the laser pointer is applicable and can be exploited for building useful medical devices. the cost of the parts required for building one prototype is about $50. the worst-case price of a commercial device could be about $100, which would still make it an affordable solution for health care services. the prototyping process brought forward numerous issues that an engineer has to face when constructing similar equipment. the most important is that there are currently no affordable mems accelerometers with a resolution higher than 12 bits. secondly, the simple projection based on matrix multiplication may be used only if strict simulation of the laser pointer is not required, and only over a relatively small range of angle values. nevertheless, its performance may be satisfactory in some cases, mainly if the feedback is provided using the simulated pointer instead of the real one.

3. sudden cardiac death risk stratification

3.1 introduction

sudden cardiac death (scd) is a natural death from causes attributable to the heart, preceded by a sudden loss of consciousness within one hour of the onset of acute symptoms in patients with heart disease. scd is currently a considerable social issue. according to numerous sources, 30 out of 1 million people die from scd every week. in poland, 1200 scd-related deaths are recorded every week. this group accounts for 50% of all deaths caused by cardiovascular diseases (fig. 10). the literature shows that 10 to 32 percent of all natural deaths are sudden deaths and nearly 90 percent of all sudden deaths are classified as scds. medical conditions that increase the risk of sudden cardiac death occur in a broad population of patients. identification of the highest-risk cases requires performing a number of clinical tests, such as: coronary perfusion assessment (exercise test, st segment alteration, angiography), heart failure assessment (nyha class, ejection fraction, exercise time), autonomic nervous system assessment (heart rate variability, baroreceptor sensitivity) and cardiac dysrhythmia assessment (ambulatory ecg recording, mean ecg, qt segment analysis, exercise test, electrophysiological test). because the possibilities for identifying high-risk patients are limited, the high scd prevention efficiency of the implantable cardioverter defibrillator does not directly influence the overall number of scds.

fig. 10 the share of main causes of the total number of deaths in poland, 2006

recently, several markers based on the ecg signal have been developed that have a high predictive value for scd.
among these markers are t-wave alternans (twa) [7, 8], heart rate variability (hrv) [9], heart rate turbulence (hrt) and deceleration capacity (dc) [10, 11, 12]. analyses of a number of other parameters are implemented in the described project. every single analysis is designed as an independent plug-in. the use of an additional software package for ecg analysis of all patients imposes only a minimal cost in clinical practice, and it can be employed along with the currently used diagnostic software. additionally, the modular structure of the software and the availability of its source code simplify the development and introduction of new algorithms for ecg analysis.

3.2 the functional structure of the platform

the software platform is implemented using java technology, which allows it to be used on any popular operating system. it is designed in a modular manner (fig. 11). various functionalities can be added as plug-ins, without the need to rebuild or modify the main source code. as a result, the platform is a very convenient tool for performing complex research and testing different algorithms for ecg analysis.

fig. 11 a block diagram of kardio application.

platform functionality is divided into the following modules.

patient database
• registration of results from numerous medical examinations and tests (e.g. blood pressure, complete blood count, etc.);
• patient registration;
• visualization of patient condition history;
• patient and result search according to various criteria: age, sex, time of day, etc.

signal acquisition
• importing holter recordings from various file formats.

initial preprocessing
• ecg signal filtering using fir and iir filters (including the zero-phase approach);
• baseline wander extraction with both spectral and time-domain (curve subtraction) methods.
ecg signal segmentation and analysis
• heartbeat separation and segment identification;
• artefact and arrhythmia detection;
• heartbeat classification based on previously prepared templates;
• hrt, hrv, dc (ac) and twa (with mma and spectral methods) assessment.

statistical module
• report generation for a given group of patients;
• statistical analysis of marker dynamics according to various criteria.

model generation module
• model extraction based on optimization of: sensitivity, positive predictive accuracy, negative predictive accuracy and specificity [13, 14].

3.3 risk stratification models

an essential element of the application is the ability to generate risk stratification models. the models are implemented as plug-ins, which leaves many possibilities for further development of the application. the authors performed research based on various artificial intelligence models. finally, four classification support models were selected and implemented in the presented platform.

simple model – logical conjunction

the simplest concept implemented by the authors is based on using one or more parameters. conditions for the various parameters are combined using a logical conjunction [15]. geometrically, this corresponds to cutting out a single region of the search space that is characterized by an increased risk of disease.

decision tree

the second approach is based on decision trees. a decision tree is a graph in which the vertices correspond to tests (comparisons of attribute values), the arcs to the test outcomes, and the leaves to the classes. the top vertex of the tree is called the root. each internal vertex of a decision tree contains a split, which is responsible for dividing the data set into the appropriate partitions. a partition can be understood as a set of data belonging to one class, resulting from the division of the training set.
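the logical conjunction model, together with the four criteria listed for the model generation module, can be illustrated with a short python sketch. it is not the platform's java plug-in code. the parameter names, thresholds and patient values below are invented purely for illustration and have no clinical meaning.

```python
def conjunction_rule(conditions):
    """build a classifier that flags high risk only if every condition holds.

    conditions maps a parameter name to (operator, limit), operator in {"<", ">"}.
    """
    def classify(patient):
        for name, (operator, limit) in conditions.items():
            value = patient[name]
            if operator == "<" and not value < limit:
                return False
            if operator == ">" and not value > limit:
                return False
        return True
    return classify


def ratio(numerator, denominator):
    """return the quotient, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def confusion_metrics(predictions, outcomes):
    """sensitivity, specificity and positive/negative predictive accuracy."""
    pairs = list(zip(predictions, outcomes))
    tp = sum(1 for p, o in pairs if p and o)
    fp = sum(1 for p, o in pairs if p and not o)
    fn = sum(1 for p, o in pairs if not p and o)
    tn = sum(1 for p, o in pairs if not p and not o)
    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# hypothetical markers and cut-offs, for illustration only
rule = conjunction_rule({"dc": ("<", 2.5), "hrt_slope": ("<", 2.5)})
patients = [
    {"dc": 2.0, "hrt_slope": 1.5},
    {"dc": 6.0, "hrt_slope": 1.0},
    {"dc": 2.2, "hrt_slope": 5.0},
    {"dc": 7.0, "hrt_slope": 6.0},
]
outcomes = [True, True, False, False]
metrics = confusion_metrics([rule(p) for p in patients], outcomes)
```

a model generator in this spirit would vary the limits and keep the rule that scores best on whichever of the four criteria is being optimized.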
bayesian network

the third approach is based on a bayesian network, which is an advanced form of a probabilistic approach based on bayesian reasoning [14]. this approach is widely used in many medical applications. a bayesian network consists of a directed graph describing the qualitative relationships between events, together with their numerical specification.

artificial neural network

the artificial neural network is the best-known technique for making a diagnosis with an expert system (which is a branch of artificial intelligence). artificial neural networks were inspired by biological findings relating to the behaviour of the brain as a network of units called neurons. in this research the authors use artificial neural networks to classify heart diseases. for this purpose, a multilayer feed-forward neural network has been chosen.

3.4. summary

project results:
• sudden cardiac death risk stratification based on holter ecg monitoring;
• a software platform (open and generally available) for ecg signal analysis and patient management with statistical reporting facilities. the software has a plug-in architecture, which makes it easy to extend its features and make further improvements. the software (together with the source code) is royalty free. this can contribute to accelerating research on ecg analysis;
• ecg signal processing algorithms, especially segmentation and segment identification, adjusted for data from vital function monitors and holter recorders;
• an advanced platform based on a plug-in architecture. the developed application enables a user to generate simple and advanced risk stratification models including neural networks, bayesian networks and decision trees.

4. thermographic assessment of burn wound depth in children

4.1. introduction

one of the most important clinical problems in the treatment of burns in children is the early and accurate evaluation of the burn wound depth and the determination of the necrosis limits. this evaluation influences the surgeon's decision whether to operate on a patient. less serious wounds can heal naturally, but more severe ones have no chance of tissue regeneration because of the necrosis [16,17]. wounds of this kind require excision of necrotic tissues with concurrent split-thickness skin grafting. the ideal solution would be to provide surgeons with adequate tools allowing early classification of the burn wound depth and providing them with a kind of wound map clearly indicating the areas to be operated on. obviously, there exist diagnostic methods such as clinical assessment by a physician or evaluation of biopsy material, but the first one is subject to human error and is very difficult during the first days after a thermal injury, while the second one is invasive and painful for patients. the use of infrared thermography in medicine is not an entirely new idea [18,19], but recent advances in infrared technology and image processing techniques allow significant improvements in diagnostics.

4.2. physiological foundations

the current practice in the assessment of the burn wound depth is based on the evaluation of a wound by a surgeon and the results of a biopsy, which is an invasive examination. thus, the goal of the current research is to reduce the influence of the human factor on the results of wound classification and to limit the number of required biopsies. the use of infrared thermography is possible because the microcirculation of blood in the burn wound tissues is changed, which is reflected in the skin temperature.
consequently, the analysis of skin temperature differences within the registered infrared images should make it possible to classify the burn wound depth by segmenting each image into individual areas having the same range of temperatures. the segmentation process could be based on the analysis of temperature histograms of the registered infrared images, which is a well-known approach already discussed in the literature [20,21]. an important element is to establish a standard procedure for performing infrared measurements. in the proposed method the infrared images of a wound are taken 24 and 72 hours after the thermal injury. later evaluation of the burn depth would not be possible because of inflammatory processes, which change the original temperature distribution within the wound resulting from the blood vessel damage. before each measurement, the patient is anesthetized and undressed for several minutes to let the thermoregulation processes equalize the skin temperature (see fig. 12). the evaluation of a wound is validated each time with a number of traditional biopsies, which are taken in the locations indicated by a surgeon. once the thermographic method is proved correct, the number of biopsies may be reduced, or they may be dispensed with altogether.

4.3. image capture system

the image capture system, whose general view is shown in fig. 12, is based on the modern bolometric infrared camera flir p660. the camera features a built-in visible light channel, allowing an operator to automatically overlay the visible image on the infrared one. both images can be recorded in a single jpeg file on the built-in card or transmitted in real time to a pc through a firewire interface.
the storage and the transmission of data in the system comply with the regulations concerning medical images, in particular with the digital imaging and communications in medicine (dicom) standard, which regulates the handling, storing and transmission of information in medical imaging. the dicom standard, also published as iso 12052:2006, is commonly used in hospitals. it includes a file format definition and a tcp/ip-based network communications protocol. owing to the adoption of the dicom standard, files containing all the patient data and medical images can be exchanged between various health-care institutions. the dicom standard also enables integration of hardware provided by various manufacturers, such as scanners, servers, workstations and printers, into a single medical database, known as the picture archiving and communication system (pacs).

fig. 12 recording of the burn wound thermographic image using flir p660 camera

4.4. dedicated software

although the employed infrared camera is accompanied by pc software for image processing, the functionality of this software is not adequate for the task of wound classification. consequently, one of the main tasks in the presented project was to develop a custom pc software solution performing the comprehensive analyses needed to correctly classify the wound. the aim was to develop a solution that would be easy to use, yet provide the complete processing path, from the captured images serving as the input data to the detailed report containing indications for further treatment as the output. the developed software, named burndiag, is intended to work with two types of images: thermovision and visible light. the main input is the thermovision image; the visible light one is supplemental, providing a means to map burn wound areas to the body areas as seen by the human eye. both images are read from a disk (usually a memory card of the camera).
although it would be possible to use a direct connection with the camera, this would require additional cables, which are unwelcome during patient examination. the user interface is able to display both images side by side, making their comparison straightforward. apart from the images, the main program window contains controls (buttons, edit boxes, etc.) required for performing analyses and displaying their results (fig. 13). the interface can also be switched to a text view, displaying textual information about the patient and the wound. these data are stored in a database. the images may be stored in dicom format [22].

fig. 13 interface of the burndiag software (image analysis tab)

the software is able to read thermographic images in flir jpg format and visible light images in standard jpg format. the flir jpg image format is a proprietary format that can be read by general-purpose graphical imaging software, in which case only the false colour snapshot is available. however, it can also be processed by a dedicated library made available by flir; in that case the file exposes temperature data for the whole area of the image (as an array of c language float values) and also some useful information about the conditions present when the image was taken. the presented software uses the second approach. the software supports the following operations on the thermographic image:
• presenting the image in a user-selected false colour palette,
• adjusting the lower and upper temperature limits,
• freehand drawing of a closed path in order to delimit the wound area or the temperature reference area,
• setting the point of an (optional) biopsy and reading its temperature,
• displaying an isotherm and computing the burn area having a temperature below and above the isotherm,
• displaying temperature profiles along user-drawn lines.
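the isotherm operation from the list above amounts to counting wound pixels on either side of a temperature limit and scaling the counts to physical area. the python sketch below shows one way to do this under stated assumptions: the temperature array and wound mask are plain nested lists, the area covered by one pixel is already known, and pixels exactly at the isotherm are counted as "above". the function name and the data layout are hypothetical and do not describe the burndiag implementation.

```python
def isotherm_areas(temperatures, wound_mask, isotherm, cm2_per_pixel):
    """return (area_below, area_above) in cm^2 for pixels inside the wound mask.

    temperatures: rows of per-pixel temperatures
    wound_mask: rows of booleans, true where the pixel lies inside the drawn path
    """
    below = above = 0
    for temp_row, mask_row in zip(temperatures, wound_mask):
        for temperature, inside in zip(temp_row, mask_row):
            if not inside:
                continue  # pixel lies outside the delimited wound area
            if temperature < isotherm:
                below += 1
            else:
                above += 1
    return below * cm2_per_pixel, above * cm2_per_pixel


# toy 2 x 3 image with invented values
temperatures = [[30.0, 31.0, 33.0],
                [29.5, 32.5, 34.0]]
wound_mask = [[True, True, True],
              [True, True, False]]
areas = isotherm_areas(temperatures, wound_mask, isotherm=32.0, cm2_per_pixel=0.25)
```

in practice the isotherm would be set relative to a reference temperature read from a healthy skin area, and the pixel scale would come from a calibration object of known size.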
it also allows matching the visible light image to the thermographic one, generating final reports in adobe pdf or microsoft word doc format and exporting images in popular formats. matching the visible and thermographic images, mentioned above, is essential for obtaining meaningful results: the analyses are performed on the thermographic image, but the indications for surgery have to be presented in the visible light image. in order to transfer results from the thermographic image to the visible light one, both images should be taken from the same point in space, using cameras with exactly the same angle of view. in practice, these conditions are never met. even when a combined thermovision and visible light camera is used (such as the flir thermacam p660), the lenses of the two cameras are separate and offset by several centimetres, and their angles of view are also vastly different (the angle of view of the visible light camera usually being much wider). this problem is illustrated in fig. 14, which presents the thermographic and visible light images of the same object. when separate thermovision and visible light cameras are used, the problem becomes even more pronounced, due to possible rotation of the cameras.

fig. 14 thermographic and visible light image of the same object. note the different position of the uppermost reference markers (black squares)

in order to match (register) one image with the other, two distinct steps are required: a) determining corresponding points in the images, b) applying suitable transformations. if a number of point pairs can be identified in the images being matched that correspond to the same objects in the recorded scene, then it is possible to compute the image transformation coefficients. the more pairs are identified, the more parameters can be used in the transformation process, and therefore the more complicated deformations can be corrected.
burndiag employs three pairs of points, which allow an affine transform to be performed. the points in question could be selected in various ways, ranging from manual identification to a fully automatic approach that includes the discovery of image features that can be treated as reference points. for the purpose of the burndiag software, a solution has been proposed that performs automatic detection of reference points in the images, but the points themselves are marked by placing predefined shapes on the investigated object (the human body in this case). this offers more robust performance and less chance of errors, which are particularly unwanted in medical diagnosis. the predefined objects are black squares with an area of 1 cm² each. moreover, the reference objects are used in computations of the wound area: because their area is exactly 1 cm², they allow the software to calibrate the algorithm for area computation. the detection procedure consists of several steps and is similar for both the thermographic and the visible light image. however, the former requires additional considerations, which are covered towards the end of this subsection. the first step of the procedure is the conversion of the image to a grayscale equivalent. next, an edge detection procedure is applied, revealing objects which differ significantly in brightness from their surroundings, in this case the reference squares. this is followed by a sequence of dilation and erosion filters, whose purpose is to expose distinct objects in the image and increase the chance that they will be outlined by a closed path. the dilation filter grows the objects in the image, consequently filling the gaps that may appear in the path outlining the reference squares. the erosion filter removes small standalone objects, which reduces noise and smooths the outline of the squares.
then, the resulting pre-processed images are passed to the blob detector, which labels regions corresponding to individual objects larger than a predefined threshold area. this algorithm produces a list of large objects including the three black squares. the next step is to exclude from the list the objects which differ significantly from a square shape. for this purpose, every object in the list is tested for being a quadrilateral with sides and angles of similar values. the objects which pass the test are sorted by their area. the final step of the algorithm is a search through the list of remaining objects for three shapes having similar areas. when such a set is found, the entire detection procedure is finished. before the actual image transform can be performed, the reference points in the infrared image have to be related to their counterparts in the visible light image. for three reference points, there exist six possible ways of pairing the identified points. thus, in order to select the correct solution, the euclidean distances between the points of interest are calculated in both types of images. then, for each pairing, the sum of distances between the points is computed and the solution producing the lowest sum of distances is selected as the correct one. this approach is justified when the mutual rotation between the images is not significant, which is true in the considered case. finally, the three pairs of points matched in the above-described procedure allow the execution of an affine transform, i.e., a transform that can map a parallelogram onto a square [23]. numerous tests carried out on real images registered for burn wounds located in different parts of the human body showed that this kind of transform is sufficient for the considered application.
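the pairing and transform steps can be sketched in python as follows. this is one reading of the description, not the burndiag code: the sum of distances is taken between each infrared point and its candidate partner in the visible light image, both point sets are assumed to be expressed in comparable coordinates, and the affine coefficients are obtained by solving two 3×3 linear systems with cramer's rule. all function names are hypothetical.

```python
import math
from itertools import permutations


def pair_points(ir_points, vis_points):
    """try all six pairings and keep the one with the lowest sum of distances."""
    best = min(
        permutations(vis_points),
        key=lambda perm: sum(math.dist(a, b) for a, b in zip(ir_points, perm)),
    )
    return list(zip(ir_points, best))


def det3(m):
    """determinant of a 3 x 3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(matrix, rhs):
    """solve a 3 x 3 linear system with cramer's rule."""
    d = det3(matrix)
    if abs(d) < 1e-12:
        raise ValueError("reference points are collinear")
    solution = []
    for col in range(3):
        # replace one column with the right-hand side
        replaced = [row[:col] + [rhs[i]] + row[col + 1:] for i, row in enumerate(matrix)]
        solution.append(det3(replaced) / d)
    return solution


def affine_from_pairs(pairs):
    """coefficients (a, b, c, d, e, f) mapping visible light points onto infrared ones:
    x_ir = a*x + b*y + c and y_ir = d*x + e*y + f."""
    matrix = [[vx, vy, 1.0] for _, (vx, vy) in pairs]
    a, b, c = solve3(matrix, [ix for (ix, _), _ in pairs])
    d, e, f = solve3(matrix, [iy for (_, iy), _ in pairs])
    return a, b, c, d, e, f


def apply_affine(coeffs, point):
    a, b, c, d, e, f = coeffs
    x, y = point
    return a * x + b * y + c, d * x + e * y + f
```

the transform is solved in the direction visible light to infrared because the paper states that the visible light image is the one to be warped, leaving the thermographic data untouched.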
as far as the choice of the image to be transformed is concerned, it should be underlined that it is the visible light image which should undergo the transformation because, for reasons of medical diagnosis, the infrared one should not be subject to any unnecessary processing. the result of the fully automatic correction procedure, presenting the transformed image from fig. 14, is shown in fig. 15. it should be noted that processing thermographic images with the described algorithm may be troublesome due to their very low spatial resolution (compared to a typical visible light image). at 640×480 pixels the reference squares may be too small for reliable detection. it has also been observed that the depth of field of the thermographic camera was small, which led to the squares being out of focus (they are placed outside the wound and, as a result, often at a different distance from the camera than the fragment on which the camera focused). to handle such situations, the burndiag software also allows manual marking (by mouse click) of the reference squares in the images.

fig. 15 thermographic and visible light image of the same object. the visible light image is corrected to match the thermographic one

4.5. medical results

each time, the infrared assessment of a burn wound was validated with a number of standard biopsies, which were taken only in the locations indicated by a surgeon and marked in the infrared images by sterile pads of a diameter comparable with that of the needle used for the biopsy. the infrared assessment of the burn wound depth showed extremely high correlation with the histopathological evaluation of the biopsy material and the clinical assessment of the wound by a physician. the results demonstrate that the temperature of a superficially burned wound is only slightly changed with respect to the undamaged skin surrounding the wound. in contrast, deeply burned skin has a temperature lower by more than 3 k.
as far as the optimal temperature threshold for separating burns of degrees iia and iib is concerned, the experimental results show that this value should be located approximately 1.45 k below the healthy skin temperature. this important result has been confirmed on a statistically sound number of clinical cases. when setting the precise temperature threshold, one should keep in mind that if this value is set too low, some of the necrotic tissues will not be removed during surgery and the grafting will have to be repeated. hence, it is much safer to set this value slightly higher than required. then, patients will not have to undergo additional surgery.

4.6. conclusions

this chapter presented a new possible clinical method for the classification of the burn wound depth based on the analysis of infrared images. this method seems promising; compared to the traditional biopsy it is contactless and non-invasive, so it is much more tolerable for young patients and gives very similar results from the diagnostic point of view. additionally, the proposed method largely eliminates the human factor from the diagnostic process, and the results do not depend on the particular person performing the diagnostics or his/her current condition. the presented research combined the quantitative thermographic examination of burn wounds in children with the original quantitative histometrical assessment of biopsy punches taken at the same time from the examined wounds, hence proving, to the best of our knowledge for the first time, the high correlation between these two investigation methods: non-invasive thermography and invasive histopathology.
consequently, it has become possible to draw the exact isotherm dividing a burn wound area into two parts, superficially and deeply damaged, offering surgeons a new non-invasive instrument for creating maps of burn wounds which allows them to make a correct, early and exact decision: to operate or not to operate on the patient. moreover, such thermographic maps of burn wounds resulting from examinations are exact plans for surgical operations, indicating places within the wound where necrotic tissue must be excised and places where surgical intervention is not required. it is also worth mentioning that the developed software for the analysis of burn wounds in children allows complete and objective documentation of thermal injuries, including the storage of infrared and visible light images and the course of medical treatment, and, most importantly, it supports making the therapeutic decision which is best for the patient. concluding, the main benefits of the present clinical research are as follows:
• it allowed the determination of the correlation between the change of skin temperature and the degree of burn depth (especially iia and iib) by comparing the results of biopsy evaluation with the infrared images in a statistically sound number of cases, thereby allowing a significant reduction, if not complete elimination, of the number of biopsies required and consequently relieving the patient and speeding up the process of burn depth evaluation;
• it facilitated fully automatic correlation and fusion of infrared and visible light images taken by the same or different cameras with different fields of view and depths of focus. the developed software allows not only off-line analysis and reporting of recorded images, but is also a real-time tool for processing and analyzing the images, producing the wound maps used during grafting and calculating the surface of the wound.
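the isotherm-based separation summarized above can be illustrated with a minimal python sketch. it assumes that the thermogram has already been cropped to the wound region and is available as a 2-d grid of temperatures, and that a healthy-skin reference temperature has been measured next to the wound. only the 1.45 k offset comes from the reported results; the function names, the grid layout and the pixel area are hypothetical and do not reproduce the burndiag implementation.

```python
from typing import List

Grid = List[List[float]]


def deep_burn_mask(temps: Grid, healthy_temp: float,
                   offset_k: float = 1.45) -> List[List[bool]]:
    """Mark pixels colder than (healthy_temp - offset_k) as deeply burned."""
    threshold = healthy_temp - offset_k
    return [[t < threshold for t in row] for row in temps]


def masked_area(mask: List[List[bool]], pixel_area_mm2: float) -> float:
    """Area covered by the marked pixels, given the area of a single pixel."""
    marked = sum(cell for row in mask for cell in row)
    return marked * pixel_area_mm2


if __name__ == "__main__":
    # Hypothetical 2x3 temperature grid in degrees Celsius.
    wound = [[34.0, 33.9, 32.0],
             [33.0, 30.5, 30.8]]
    mask = deep_burn_mask(wound, healthy_temp=34.0)
    print(mask)
    print(masked_area(mask, pixel_area_mm2=0.25))
```

choosing a smaller `offset_k` moves the isotherm closer to the healthy-skin temperature, which corresponds to the safer, slightly higher threshold recommended in section 4.5.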
further research in this area will be focused on the precise determination of the temperature value which should be used for the thresholding of the infrared images. tests in hospital environment will determine the need for subsequent improvements of the developed software.
acknowledgements: the researches presented in the paper were supported by:
• funds from the national science centre granted on the basis of the decision number no. umo2011/03/b/st6/03454,
• funds from the polish ministry of scientific research and higher education granted on the basis of the decision number no. n5 15 2423 37
authors would like to thank the following researchers: professor małgorzata kurpesa md, phd, dsc, ewa trzos md, phd, dsc, urszula cieślik-guerra md, barbara uznańska-loch md, and wojciech kuzanski md, phd.
instruction facta universitatis series: electronics and energetics vol. 33, n o 2, june 2020, pp. 155-167 https://doi.org/10.2298/fuee2002155r © 2020 by university of niš, serbia | creative commons license: cc by-nc-nd flash memory devices with metal floating gate/metal nanocrystals as the charge storage layer: a status review renu rajput, rakesh vaid department of electronics, university of jammu, jammu, india abstract. traditional flash memory devices consist of polysilicon control gate (cg) – oxide-nitride-oxide (ono interpoly dielectric) – polysilicon floating gate (fg) – silicon oxide (tunnel dielectric) – substrate. the dielectrics have to be scaled down considerably in order to meet the escalating demand for lower write/erase voltages and higher density of cells. but as the floating gate dimensions are scaled down the charge stored in the floating gate leaks out more easily via the thin tunneling oxide below the floating gate, which causes serious reliability issues, and the whole amount of stored charge carrying information can be lost. the possible route to eliminate this problem is to use high-k based interpoly dielectric and to replace the polysilicon floating gate with a metal floating gate.
at a larger physical thickness, these materials provide a similar capacitance value, hence avoiding the tunneling effect. discrete nanocrystal memory has also been proposed to solve this problem. due to its high operation speed, excellent scalability and higher reliability it has been shown to be a promising candidate for future non-volatile memory applications. this review paper focuses on the recent efforts and research activities related to the fabrication and characterization of non-volatile memory devices with metal floating gate/metal nanocrystals as the charge storage layer.
key words: metal oxide semiconductor (mos), non-volatile memory (nvm), nanocrystal (nc), dielectrics, interpoly dielectric (ipd), floating gate (fg)
received december 26, 2019
corresponding author: rakesh k. vaid, department of electronics, university of jammu, jammu-180006, india (e-mail: rakeshvaid@ieee.org)
1. introduction
flash memory is the most extensively used non-volatile information-storage device today because of its multi-bit per cell storage capability and lower cost per bit. moreover, its fabrication process is consistent with the present cmos process, which makes it a relevant solution for embedded memory applications. the qualitative comparison of various nonvolatile mos memory devices in the flexibility-cost plane [1] is shown in fig. 1. nor flash memory and nand flash memory are the two main types of flash memory used today. the key difference between the two is that nor flash memory allows a single byte to be written to an erased location or read independently, whereas nand flash memory is written in blocks or pages. the other contrast is that the former uses channel hot electron (che) injection for programming and has slower program-erase speed, while the latter uses fowler-nordheim tunneling for programming and has fast program-erase speed.
fig. 1 non-volatile mos memory qualitative comparison in the flexibility-cost plane [1] (device classes shown: rom, eprom, flash and eeprom; axes: cost and flexibility)
nand flash memory finds use in data storage devices like mobile phones, pen drives, digital cameras etc., and the nand flash based solid state drive (ssd) is used as a replacement for the hard disk drive (hdd) in modern computers and laptops. the growing demand for this technology is the key driving force to increase the data storage capacity and decrease the cost per bit in flash memories by scaling down the flash cell to smaller and smaller dimensions. with continued scaling, both nor and nand flash memories face major roadblocks. the scaling projection for floating-gate nor and nand flash memory by the international technology roadmap for semiconductors (itrs) [2] is shown in tables 1 and 2. mos memory devices rely on the charge stored in the floating gate to cause a shift in the threshold/flatband voltage. the schematic cross section of a floating-gate mos memory cell is shown in fig. 2.
fig. 2 schematic cross section of mos memory cell
conventionally, the floating gate and control gate are made of polysilicon. the tunneling barrier and the blocking oxide are typically composed of sio2 and oxide-nitride-oxide (ono) respectively. the floating gate acts as a potential well, and once charges are injected into the potential well they cannot move without an external electric field. the programming and erasing of the memory cell are carried out by applying a positive and a negative electric field to the control gate respectively. si has been used in mos technology for many decades because of the better qualities of its native oxide [3].
in advanced semiconductor technology, the transistor gate length will continue to shrink in order to achieve faster switching speed and higher device density. this results in a reduction of gate length and of gate oxide thickness to 1.5 nm, and oxide scaling down to an eot of approx. 0.5 nm is required [4]. as the thickness of sio2 shrinks below 1.5 nm, it faces its physical limitations because of gate dielectric leakage current and reliability requirements [5]. thus there is a need for alternate channel materials that can enhance channel mobility beyond the physical limits of si based mos devices [6]. with the decrease in thickness of the tunnel dielectric, the interface between si and sio2 plays a very important role in retaining the charge for a long time. however, for a very thin sio2 it is impossible to have a defect-free oxide, and even a single defect can cause leakage of the charge stored in the floating gate through the tunnel oxide. thus, a thinner tunnel dielectric would degrade the charge retention. for instance, a device with a 2.3 nm sio2 tunneling oxide shows reasonable programming efficiency with a fairly low programming voltage, but loses 25% of its stored charge in several tens of seconds [7]. to overcome these issues, sion (silicon oxynitride) and other high dielectric constant materials with larger physical thickness are being used to limit the leakage current while maintaining the higher capacitance values of the scaled devices. sion has been used as a substitute for sio2 because of its high dielectric permittivity and low density of surface states. in addition, by varying the o/n ratio the band gap of sion can be tailored between 5 and 9 ev [8, 9].
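the trade-off between physical thickness and capacitance mentioned above is commonly expressed through the equivalent oxide thickness, eot = t_phys · (3.9 / k), where 3.9 is the relative permittivity of sio2 and k that of the dielectric in question. the short python sketch below shows the conversion in both directions; the function names and the example value k = 7.8 are assumptions chosen only for illustration.

```python
K_SIO2 = 3.9  # relative permittivity of SiO2


def eot_nm(t_phys_nm: float, k: float) -> float:
    """Equivalent oxide thickness (nm) of a film with relative permittivity k."""
    return t_phys_nm * K_SIO2 / k


def physical_thickness_nm(target_eot_nm: float, k: float) -> float:
    """Physical thickness (nm) a film of permittivity k may have for a given EOT."""
    return target_eot_nm * k / K_SIO2


if __name__ == "__main__":
    # Assumed example: k = 7.8 is twice the SiO2 value, so the film can be
    # twice as thick as SiO2 while providing the same capacitance per area.
    print(physical_thickness_nm(1.5, 7.8))
    print(eot_nm(3.0, 7.8))
```

the thicker film presents a wider tunneling barrier for the same gate capacitance, which is the reason such materials limit leakage in scaled devices.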
the tunnel oxide has the strictest requirements in a flash cell, with properties summarized as under:
a) it must provide a good interface to the silicon channel for reliable transistor operation;
b) it should provide efficient charge transport, either through tunneling or hot carrier injection, during the programming and erase operations which allow the data to be changed in the memory cell;
c) it should enable years of charge retention on the floating gate [10].
the conventional polysilicon floating gate structure suffers from various disadvantages such as cell-to-cell interference when optimized for high density, high leakage current through the tunneling dielectric, a programming current that becomes increasingly ballistic, and loss of gate coupling factor. the thickness of the ono has to be reduced to compensate for the loss of gate coupling factor, but when it is reduced below 10 nm it causes an increase in the unwanted gate-to-gate tunneling and degrades the retention characteristics of memory devices. the possible route to eliminate this problem is to use a high-k based interpoly dielectric and to replace the polysilicon floating gate with a metal floating gate. the added benefits of incorporating a metal floating gate include work function tunability, a higher density of states and, most importantly, lower ballistic transport. the lower ballistic transport in metals is a beneficial factor in scaling down the floating gate thickness to realize a lower cell-to-cell interference [11]. hf based high-k dielectrics in combination with a metal floating gate demonstrated excellent memory characteristics [12]. therefore, the ultra-thin metal floating gate is a promising candidate for future scaled nor and nand flash memories.
table 1 scaling projections for floating-gate nor flash by international technology roadmap for semiconductors (itrs) [2]
| year | nor flash technology node f (nm) | gate length, physical (nm) | tunnel oxide thickness (nm) | interpoly dielectric material | interpoly dielectric thickness eot (nm) | gate coupling ratio | retention (years) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 2011 | 40 | 100 | 8-9 | ono | 13-15 | 0.6-0.7 | 10-20 |
| 2012 | 35 | 100 | 8-9 | ono | 13-15 | 0.6-0.7 | 10-20 |
| 2013 | 32 | 90 | 8 | high-k | 8-10 | 0.6-0.7 | 20 |
| 2014 | 28 | 90 | 8 | high-k | 8-10 | 0.6-0.7 | 20 |
| 2015 | 25 | 90 | 8 | high-k | 8-10 | 0.6-0.7 | 20 |
| 2016 | 22 | (?) | (?) | high-k | 8-10 | 0.6-0.7 | 20 |
| 2017 | 20 | (?) | (?) | high-k | 8-10 | 0.6-0.7 | 20 |
| 2018 | 28 | (?) | (?) | high-k | 7-9 | 0.6-0.7 | 20 |
| 2019 | 16 | (?) | (?) | high-k | 6-8 | 0.6-0.7 | 20 |
| 2020 | 14 | (?) | (?) | high-k | 6-8 | 0.6-0.7 | 20 |
| 2021 | 12 | (?) | (?) | high-k | 6-8 | 0.6-0.7 | 20 |
| 2022 | 10 | (?) | (?) | high-k | 6-8 | 0.6-0.7 | 20 |
| 2023 | 10 | (?) | (?) | high-k | 6-8 | 0.6-0.7 | 20 |
| 2024 | 10 | (?) | (?) | high-k | 6-8 | 0.6-0.7 | 20 |
table 2 scaling projections for floating-gate nand flash by international technology roadmap for semiconductors (itrs) [2]
| year | nand flash technology node f (nm) | tunnel oxide thickness (nm) | interpoly dielectric material | interpoly dielectric thickness eot (nm) | control gate material | gate coupling ratio | retention (years) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 2011 | 28 | 6-7 | ono | 10-13 | n-poly | 0.6-0.7 | 10-20 |
| 2012 | 25 | 6-7 | high-k | 9-10 | poly/metal | 0.6-0.7 | 10-20 |
| 2013 | 22 | 6-7 | high-k | 9-10 | poly/metal | 0.6-0.7 | 10-20 |
| 2014 | 20 | 6-7 | high-k | 9-10 | poly/metal | 0.6-0.7 | 10 |
| 2015 | 19 | 6-7 | high-k | 9-10 | metal | 0.6-0.7 | 5-10 |
| 2016 | 18 | 4 | high-k | 9-10 | metal | 0.6-0.7 | 5-10 |
| 2017 | 16 | 4 | high-k | 9-10 | metal | 0.6-0.7 | 5-10 |
| 2018 | 14 | 4 | high-k | 9-10 | metal | 0.6-0.7 | 5-10 |
| 2019 | 13 | 4 | high-k | 8-10 | metal | 0.6-0.7 | 5-10 |
| 2020 | 12 | 4 | high-k | 8-10 | metal | 0.6-0.7 | 5-10 |
| 2021 | 11 | 4 | high-k | 8-10 | metal | 0.6-0.7 | 5-10 |
| 2022 | 9 | 4 | high-k | 8-10 | metal | 0.6-0.7 | 5-10 |
| 2023 | 8 | 4 | high-k | 8-10 | metal | 0.6-0.7 | 5-10 |
| 2024 | 8 | 4 | high-k | 8-10 | metal | 0.6-0.7 | 5-10 |
charge-trapping memory (ctm) was initially introduced in 1967 and showed some notable advantages over the conventional floating-gate counterpart [13]. recently, floating gate memory devices consisting of discrete metal nanocrystals have received huge attention due to their excellent memory performance and high scalability. memory devices with a nanocrystal floating gate offer faster write/erase speeds, lower power consumption, and better endurance characteristics compared to conventional continuous floating-gate nonvolatile memory devices. during the last few years, si and ge nanocrystals embedded in gate oxide layers have been studied as the charge storage nodes in silicon-based memory devices. but it has been found that their retention characteristics are very sensitive to the thermal processes used during device fabrication because of the existence of defects and traps inside or at the surface of the nanocrystals. the presence of these defects and traps limits the application of si and ge nanocrystals in nonvolatile memory. thus the use of metal nanoparticles has been proposed to prolong the retention characteristics and to overcome the limits of semiconductor nanocrystals [14].
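the gate coupling ratio listed in tables 1 and 2 can be related to the two dielectric thicknesses with a simple first-order model, gcr = c_ipd / (c_ipd + c_tunnel), in which each capacitance is proportional to area divided by eot. the python sketch below is only an illustration under these assumptions: it neglects source, drain and fringing capacitances, and the function name and the area ratio of 2 are hypothetical.

```python
def gate_coupling_ratio(tunnel_eot_nm: float, ipd_eot_nm: float,
                        area_ratio: float = 1.0) -> float:
    """First-order GCR = C_ipd / (C_ipd + C_tunnel) for parallel-plate capacitors.

    area_ratio is A_ipd / A_tunnel; values above 1 model an interpoly
    dielectric that wraps around the floating gate.
    """
    c_tunnel = 1.0 / tunnel_eot_nm        # capacitance per unit tunnel area
    c_ipd = area_ratio / ipd_eot_nm       # same units, scaled by the area ratio
    return c_ipd / (c_ipd + c_tunnel)


if __name__ == "__main__":
    # Assumed example: 8 nm tunnel oxide, interpoly EOT of 13 nm versus 9 nm.
    for ipd_eot in (13.0, 9.0):
        print(ipd_eot, gate_coupling_ratio(8.0, ipd_eot, area_ratio=2.0))
```

in this model a thinner interpoly eot raises the coupling ratio, which is why the ono thickness has to shrink when the coupling factor is lost, and why a high-k interpoly dielectric can provide the same eot at a larger physical thickness.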
the fabrication of a discrete non-volatile memory (nvm) cell requires precise control of four main parameters: (a) the tunnel oxide thickness, (b) the nanocrystal density, (c) the nanocrystal diameter and (d) the control oxide thickness. the most commonly used methods to fabricate nanocrystals are self-assembly, precipitation and chemical reaction. in the self-assembly procedure, a trapping layer of 1–5 nm is deposited and the film is then annealed at a temperature close to its eutectic temperature in an inert gas ambient to transform the trapping layer into a nanocrystal structure. the diameter of the nanocrystals depends on the thickness of the trapping layer as well as on the temperature and duration of the thermal treatment. however, this method cannot ensure that the trapping layer is completely discrete. in the precipitation method, an oversaturated or mixed trapping layer is prepared by ion implantation into a deposited insulator layer, or by a co-deposited system, and nanocrystals are formed by further thermal annealing. however, this method also has some limitations due to the use of high energy ion implantation for the formation of nanocrystals. the chemical reaction method is the most widely used to form a nanocrystal trapping layer. initially, a binary or ternary mixed layer is co-deposited from different material systems and then the layer is oxidized by rapid thermal annealing (rta) under an oxygen flow. but the control of the oxygen concentration in the mixed layer is an important issue. a low oxygen concentration causes insufficient oxidation of the mixed layer and higher leakage current, while a high oxygen concentration can result in the oxidation of the nanocrystals. the memory property can be lost if either of these two conditions occurs. a metallic nanocrystal storage layer has several advantages over a semiconductor nanocrystal storage layer due to its high density of states and work function engineering.
the large work function difference between the metal and the si substrate creates a deep potential well that ensures a lower charge loss, ultimately resulting in enhancement of the retention characteristics. blocking layers comprised of high-k dielectrics allow the use of relatively thick films which prevent leakage while maintaining a thin equivalent oxide thickness (eot) and prevent any charge leakage to the top gate. moreover, the increased capacitance density offered by high-k films enlarges the charge storage density. fig. 3 shows the schematic diagram highlighting the motivation and challenges in transitioning from the conventional memory structure to a novel high-k mos memory structure.
| layer | conventional structure | novel high-k structure |
| --- | --- | --- |
| control gate | poly-si: introduces poly gate depletion effect, increases boron penetration and threshold voltage fluctuation, and degrades drive current | metal: free from poly depletion effect; work function tunability; reduces unwanted control gate–floating gate leakage |
| blocking oxide (ipd) | oxide-nitride-oxide: reduced leakage at the cost of gcr and lower program/erase speed | high-k: improved gcr and enhanced program/erase speed without compromising leakage current |
| floating gate | poly-si: higher ballistic current on reducing floating gate thickness, degrading memory characteristics | metal/metal nanocrystal: lower ballistic current and improved memory characteristics |
| tunneling oxide | sio2: increased leakage current on reducing thickness, degrading charge retention characteristics | high-k: reduced leakage current while maintaining a high capacitance value, and improved charge retention characteristics |
| substrate | si | si |
fig. 3 schematic diagram showing the comparison between conventional and novel high-k mos memory structure
2. related work to non-volatile memory structures using different metal nanocrystals/metal floating gate
self-assembly of metal nanocrystals including au, ag, and pt on ultrathin oxide for nonvolatile memory applications was investigated [15], and it was observed that for non-volatile memory cells spherical nanocrystals are preferred because the three-dimensional symmetry results in the best charge confinement and in physical stability from surface energy minimization. the shallow potential well in the case of semiconductor nanocrystals (si nanocrystals paired with a si substrate) can only hold charges for a relatively short time because of direct tunneling back to the substrate [16]. however, metal nanocrystals, due to their large work functions, can make use of deep potential wells to hold electrons for a longer time, and thus the memory effects of au and pt are better than that of ag. chungho lee et al. [17] in 2005 fabricated and characterized heterogeneous-stack devices consisting of metal nanocrystals (au) and silicon nitride (si3n4). the metal nanocrystals in the lower stack allow the direct tunneling mechanism during program/erase to achieve low-voltage operation and good endurance, while the nitride layer in the upper stack acts as an additional charge trap layer to increase the memory window and significantly improve the retention time. semiconductor nanocrystal (si) and nitride heterogeneous stacks were first proposed in the metal–nitride–oxide–silicon (mnos) structure by yamazaki et al. [18, 19] and in the sonos structure by steimle et al. [20, 21]. the nanocrystal/nitride heterogeneous stack shows enlarged charge storage capacity, longer retention, low voltage and fast p/e owing to the additional nitride traps, as compared to au nanocrystals. ch. sargentis et
al. [22] in 2005 fabricated a novel mos memory device with platinum (pt) nanoparticles embedded at the hfo2/sio2 interface, which exhibits clear hysteresis behavior attributed to charge storage in the nanoparticles. jong jin lee and dim-lee kwong [23] in 2005 proposed a nonvolatile memory using nickel nanocrystals (nc) embedded in hfo2 tunneling/control dielectrics to overcome the fundamental tradeoff between programming and data retention characteristics. nickel not only has a suitable work function (4.5–4.6 ev) [24] for such applications but also exhibits a low anneal temperature for nc formation, which is advantageous to the quality of the underlying tunneling hfo2 dielectric (5.1 nm). hei wong and hiroshi iwai [25] in 2006 suggested that the conventional oxide can be scaled down to two atomic layers of about 7 å, but this is not viable in practice because of the non-scalability of the interface, trap capture cross-section, leakage current and the statistical parameters of fabrication processes. a physically thicker high-k material can help to solve most of these problems, but high-k gate dielectric films still give rise to several reliability problems and degrade the performance of mos devices to a great extent. a bulk oxynitride/high-k stack could be a secure solution to this problem. the major benefits of oxynitride are that most of its material characteristics are second only to sio2, but with a larger dielectric constant (3.9–7.8), a band gap within 5.9–8.9 ev, a conduction band offset greater than 2.1 ev, and a thermally stable structure with high crystallization temperature that is stable on silicon. it also has a comparatively better dielectric/si interface than other high-k competitors. these properties are sufficient for mos device applications. but a two-atomic-layer oxynitride cannot be synthesized, and the leakage current would be too large as well.
therefore, a better solution is to make use of the bulk oxynitride/high-k stack to make the dielectric physically thicker and thus reduce the gate tunneling current and the sensitivity to thickness fluctuation. due to the fabrication methods, most of the defects in oxynitride are tied to hydrogen and hydroxyl groups, which are the major sources of hot-carrier induced trap generation in nitrided oxide films [26-29]. hence it is very important to minimize hydrogen incorporation to obtain a highly reliable oxynitride film for advanced mos device applications. s. hwang et al. [30] in 2007 studied the effect of the presence of a high concentration of nitrogen in the silicon-oxide layer. the results reveal the suitability of oxynitride films for high quality ultra-thin-film transistors and non-volatile memory devices. p.h. yeh et al. [31] in 2007 reviewed the memory effects of ni, nisi2, cosi2 and nisi2 nanocrystals with sio2/hfo2 tunneling dielectrics. it was found that the memory effects of the metal nanocrystals have a strong relationship with the work function, and that the work function can be modulated by changing the metal species. the memory window of the samples with hfo2 as tunnel dielectric is larger than that of the samples with sio2 due to the smaller voltage drop at the same physical thickness (= 3 nm). the physical thickness of the hfo2 tunnel oxide is sufficient for a small operation voltage but not satisfactory for the retention requirement, because the equivalent oxide thickness (1.5 nm) is too thin, so that electrons can tunnel back easily. the tunneling probability of an electron depends on the thickness and the barrier height of the tunnel dielectric. weihua guan et al. [32] in 2007 proposed that metal nanocrystals (mnc) show better retention performance than semiconductor nanocrystals (snc) of the same size because of the higher electron barrier height from the mnc to the substrate and the weaker quantum confinement effect of the mnc.
a tunneling oxide of 3.6 nm is thick enough to guarantee 10 years of retention for au nanocrystals. it is also observed that a high-k tunneling barrier combined with a high work function mnc (e.g. au nc) can further enhance the retention performance. therefore, metal nanocrystal memory devices employing a high-k gate dielectric are a promising candidate for non-volatile memory applications. again, weihua guan et al. [33] in 2007 fabricated a mos capacitor structure with metal nanocrystals (including au, ni and co formed by a self-assembling process) embedded in the gate oxide for nonvolatile memory (nvm) application. due to their high work function (5.0 ev), au nanocrystals can provide enhanced retention performance, which confirms the high capacity of au nanocrystals for nvm applications with respect to their semiconductor counterparts. jaeyoung choi et al. [34] in 2007 fabricated nonvolatile memory (nvm) mos capacitors with a structure of si/sio2/ni ncs/sio2/al2o3. the mos memory structures with the ni-nc floating gates showed a relatively large memory window of 3–5 v for 10 ms – 1 s under ±19 v and excellent endurance characteristics. byoungjun park et al. [14] in 2008 studied the electrical characteristics of titanium (ti) nanoparticle embedded metal-oxide-semiconductor (mos) capacitors and metal-oxide-semiconductor field effect transistors (mosfets) with a blocking al2o3 layer (~20 nm) and a tunneling sio2 layer (~6.3 nm). with a work function of 4.3 ev, ti nanoparticles show potential as charge storage nodes in floating gate devices: during their formation their surfaces are natively covered with a titanium dioxide (tio2) layer, which is known as a high-k dielectric material (ε = 80–110) and can prevent both the lateral transport of charge carriers between the ti nanoparticles and the blocking oxide layer and the charge loss through the tunneling barrier.
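the link between work function, barrier height and charge loss discussed above can be made concrete with a rough estimate. the python sketch below takes the barrier seen by an electron stored in a metal nanocrystal as the metal work function minus the electron affinity of the tunnel oxide, and evaluates the zero-field wkb transmission through a rectangular barrier, t ≈ exp(−2·d·sqrt(2·m·φ)/ħ). the work functions are the values quoted in the text (ti 4.3 ev, ni 4.5 ev, au 5.0 ev); the sio2 electron affinity of 0.95 ev, the effective mass of 0.5 m0 and the function names are assumptions, so the numbers only indicate the trend and are not taken from the cited works.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J*s
M0 = 9.1093837015e-31    # electron rest mass, kg
EV = 1.602176634e-19     # joules per electron-volt


def barrier_height_ev(work_function_ev: float, oxide_affinity_ev: float) -> float:
    """Barrier from the metal Fermi level to the oxide conduction band (eV)."""
    return work_function_ev - oxide_affinity_ev


def wkb_transmission(barrier_ev: float, thickness_nm: float,
                     m_rel: float = 0.5) -> float:
    """Zero-field WKB transmission through a rectangular barrier."""
    kappa = math.sqrt(2.0 * m_rel * M0 * barrier_ev * EV) / HBAR  # decay constant, 1/m
    return math.exp(-2.0 * kappa * thickness_nm * 1e-9)


if __name__ == "__main__":
    SIO2_AFFINITY_EV = 0.95  # assumed electron affinity of SiO2
    for metal, phi_m in (("ti", 4.3), ("ni", 4.5), ("au", 5.0)):
        phi_b = barrier_height_ev(phi_m, SIO2_AFFINITY_EV)
        print(metal, phi_b, wkb_transmission(phi_b, thickness_nm=3.6))
```

within this simplified picture a larger work function or a thicker tunnel oxide lowers the transmission exponentially, which is consistent with the better retention reported for high work function nanocrystals.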
yingtao li and su liu [35] in 2009 used nc materials with different work functions (au ~5.0 ev, w ~4.6 ev and si ~4.05 ev) in nvm devices and observed that au ncs as floating gates exhibit better retention characteristics due to their larger work function, which induces a higher barrier height and a deeper well at the interface of the floating gate and the tunneling dielectric. the work function values for some important transition metals are listed in table 3. ni henan et al. [36] in 2009 fabricated and characterized a mos structure with double-layer heterogeneous nanocrystals (si nanocrystals and ni nanocrystals) embedded in a gate oxide (sio2) matrix for nonvolatile memory applications. the additional metal nanocrystal layer allows the direct tunneling mechanism to raise the flat-band voltage shift and prolong the retention time. coulomb blockade and energy level quantization limit the leakage mechanism, namely tunneling to the closest lower nanocrystals [37]. zs. j. horvath and p. basa [38] in 2009 created mnos structures with semiconductor nanocrystals at the si3n4/sio2 interface and concluded that the formation of nanocrystals in nitride based memory structures can enhance both the charging and the retention behaviour, due to direct tunneling to the nanocrystals and the creation of deep energy states respectively [39,40]. jeng hwa liao et al. [41] investigated the relationship between the physical and electrical characteristics of silicon oxynitride (sion) films and the refractive index. it was found that the charge-trap density of the sion film is inversely proportional to the oxygen concentration in the sion layers and that the dielectric constant is directly proportional to the refractive index. s. raghunathan et al. [11] in 2009 showed that during programming an ultra-thin metal fg exhibits a ballistic current three orders of magnitude lower than poly-si of the same thickness, because the metal has a high density of states, and the larger the density of states, the stronger the scattering.
it was also observed that a deeper work-function metal fg device erases more slowly but shows better retention than a shallower work-function fg device. srikant jayanti et al. [12] in 2010 investigated an ultra-thin tan metal floating gate with a hf-based high-k ipd for nand flash applications. the results indicate that a high-k interpoly dielectric in combination with an ultra-thin tan metal fg can enable further scaling of nand flash memory beyond conventional oxide-nitride-oxide (ono) based ipd technology. jang-sik lee [42] in 2010 reviewed gold nanoparticle-based memories: because gold nanoparticles are chemically stable and have a high work function, they have been shown to be a promising candidate for the charge trapping element in nanoparticle-based non-volatile memory devices. shiqian yang et al. [43] in 2010 fabricated a mos capacitor (p-si/sio2/tiw ncs/al2o3/al) and showed that the high-k blocking oxide plays a vital role in low-voltage operation and increases the coupling ratio to the nc layer, which results in good charge storage. v. mikhelashvili et al. [44] in 2010 proposed and demonstrated a double-layer memory capacitor comprising two au nanoparticle layers separated by a hfo2 film. hfo2 was also used for the tunneling layer, whereas the blocking insulator was based on a hfn/hftio multilayer stack with a dielectric constant larger than 35. the structure exhibits excellent performance in terms of the hysteresis window, the stored charge density, the leakage current, the eot and the retention properties. g. gay et al. [45] in 2010 fabricated devices with tin ncs embedded in sin and observed that they erased 10^2 times faster and showed a large memory window (10 v) compared with devices with a pure sin trapping layer. larysa khomenkova et al.
[46] in 2011 demonstrated the application of pure hfo2 and hfsio layers fabricated by rf magnetron sputtering as alternative tunnel layers for high-k/si-ncs-sio2/sio2 memory structures, and indicated the utility of these stack structures for low-operating-voltage nvm devices given specific deposition conditions and annealing treatment. g. x. li et al. [47] in 2011 demonstrated that hf-rich films tend to form a thicker interfacial layer due to the stability of hfo2, whereas ti-rich films usually show higher leakage current owing to the low band gap of tio2. overall, the 20-nm-thick films with a hf/ti ratio of 46/54 exhibit a high permittivity of up to 50 while maintaining a relatively low leakage current of 1.2 × 10^−8 a/cm^2 at 1 v bias, making them a promising candidate to replace hfo2 in next-generation technology. minseong yun et al. [48] in 2012 demonstrated a high-density multi-layer pt nanoparticle embedded memory device with an enhanced memory window and better retention. chih-ting lin et al. [49] in 2013 revealed that the memory window becomes larger at elevated annealing temperatures due to the high density of au-ncs. the optimized annealing condition was 700 °c because of the large space between ncs. jun-hyuk seo et al. [50] in 2013 fabricated a mos capacitor containing au nanocrystals in a stepped hfo2 and sio2 tunneling oxide matrix. memory window effects were observed due to successive charge trapping in the au nanocrystals, obtained by controlling the electric field distribution via the tunneling oxide. wenchieh shih et al. [51] in 2013 studied the mytos device, i.e. metal-yttrium oxide-tantalum oxide-silicon oxide-silicon (al/y2o3/ta2o5/sio2/si), and showed that the large conduction band offsets at the ta2o5/sio2 and y2o3/ta2o5 interfaces are expected to give better blocking efficiency, which improves the memory window and programming speed and can also relieve the over-erase problem. guoxing chen et al.
[52] in 2014 demonstrated that the metal floating gate capacitor si/sio2-2.7/hfo2-8/tan-8/al2o3-15/au-150 offers favorable performance, with lower operation voltage, enhanced program/erase speed and improved data retention compared to the si/sio2-4.5/tan-8/al2o3-15/au-150 floating gate cell. thus, the program/erase speed can be enhanced using a high-k blocking layer. metal oxides show potential as high-k gate materials because of their chemical stability at the si interface. table 4 compares a few main properties of high-k dielectric materials [53]. a high-k dielectric reduces the electric field across the blocking oxide in proportion to its dielectric constant; therefore, electron injection from the gate during erase can be effectively suppressed, which in turn generally enhances the erase speed [54]. from the above discussion it can be concluded that hfo2 and its aluminates and silicates are now the most accepted candidates. chengyuan yan et al. [55] in 2019 fabricated a nonvolatile memory with a floating gate structure using znse@zns core–shell quantum dots as discrete charge-trapping/tunneling centers. the fabricated device demonstrates a large memory window, stable retention, and good endurance. s. wang et al. [56] in 2019 fabricated a dual-gate floating gate non-volatile memory device based on a heterostructure of mos2 and hexagonal boron nitride. the device exhibits long retention time, ultra-low leakage current (~10^-13 a) and low operation voltage (~5 v).
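the effect of high-k layers on the gate stacks compared in [52] can be made concrete with the equivalent oxide thickness (eot), i.e. the sio2 thickness that would give the same capacitance per unit area. the python sketch below is a minimal illustration, not the authors' analysis: it assumes that the numbers in the stack notation are layer thicknesses in nm, uses the nominal dielectric constants of table 4, and assumes ideal layers in series without interfacial layers. the function names are hypothetical.

```python
# Equivalent oxide thickness (EOT) of dielectric stacks, referenced to SiO2.
# ASSUMPTION: the numbers in "sio2-2.7/hfo2-8/.../al2o3-15" are thicknesses in nm.
# ASSUMPTION: nominal dielectric constants from table 4; no interfacial layers.
K_SIO2 = 3.9
K = {"sio2": 3.9, "al2o3": 9.0, "hfo2": 25.0}


def eot_nm(stack):
    """EOT in nm for a list of (material, thickness_nm) layers in series."""
    return sum(t_nm * K_SIO2 / K[material] for material, t_nm in stack)


def field_ratio(material):
    """Field in the layer relative to the field in SiO2 at equal displacement
    (continuity of k * E across the interface)."""
    return K_SIO2 / K[material]


tunnel_dual = [("sio2", 2.7), ("hfo2", 8.0)]   # engineered tunnel barrier
tunnel_single = [("sio2", 4.5)]                # reference cell
blocking = [("al2o3", 15.0)]                   # blocking layer in both cells

print(f"dual-layer tunnel eot:   {eot_nm(tunnel_dual):.2f} nm")
print(f"single-layer tunnel eot: {eot_nm(tunnel_single):.2f} nm")
print(f"al2o3 blocking eot:      {eot_nm(blocking):.2f} nm")
print(f"field in al2o3 / field in sio2: {field_ratio('al2o3'):.2f}")
```

under these assumptions the dual-layer tunnel barrier is physically thicker (10.7 nm against 4.5 nm) yet has a smaller eot (about 3.95 nm against 4.5 nm), which is consistent with the combination of lower operation voltage and improved retention reported above. the field ratio illustrates why a blocking layer with a higher dielectric constant sees a lower field than sio2 would.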
table 3 work function values of various transition metals

element | work function (ev)
pt | 5.35
au | 5.1
al | 4.08
ag | 4.0
ti | 4.35
cr | 4.5
fe | 4.5
co | 5.0
ni | 5.01
cu | 4.65
mg | 3.68
nb | 4.3

table 4 comparison of some main properties of high-k dielectric materials [53]

material | energy gap (ev) | conduction band offset to si (ev) | dielectric constant | crystal structure
sio2 | 8.9 | 3.2 | 3.9 | amorphous
si3n4 | 5.1 | 2 | 7 | amorphous
sion | 5.1-8.9 | 2-3.2 | 3.9-7.1 | amorphous
al2o3 | 8.7 | 2.8 | 9 | amorphous
hfo2 | 5.7 | 1.5 | 25 | monoclinic, tetragonal, cubic
zro2 | 7.8 | 1.4 | 25 | monoclinic, tetragonal, cubic
tio2 | 3.5 | 1.2 | 80 | tetragonal
ta2o5 | 4.5 | 1-1.5 | 26 | orthorhombic
y2o3 | 5.6 | 2.3 | 15 | cubic
la2o3 | 4.3 | 2.3 | 30 | hexagonal, cubic

4. conclusion

to overcome the problem of stored charge leaking back to the channel through local defects, memory-cell structures employing discrete traps as charge storage nodes have been proposed in the past few years. extensive research has been conducted on non-volatile memories employing metal nanocrystals or a metal floating gate. a metal floating gate in combination with a high-k dielectric gives good electrical characteristics even at low thicknesses and, owing to its larger work function, also lowers the interpoly dielectric (ipd) leakage. metal nanocrystals with a high work function show enhanced charge retention time compared with semiconductor nanocrystals, which confirms the high potential of nanocrystals for nvm applications. combining a metal nc with a high-k tunneling barrier gives excellent data retention without compromising the programming efficiency, and thus resolves the tradeoff between programming and retention characteristics. a heterogeneous floating-gate stack of metal nanocrystals and nitride achieves a better tradeoff between programming and retention characteristics than a single layer of metal nanocrystals or nitride.
multilayer nanoparticle-embedded nonvolatile memory provides additional charge storage capability and, ultimately, an enhanced memory window. the high-k blocking oxide plays a vital role in low-voltage operation and increases the coupling ratio between the control gate and the metal floating gate/metal nc layer, which results in good charge storage. however, although the density of nanocrystals increases as the nanocrystal size decreases, this scaling also faces limitations such as high leakage density, device-to-device fluctuation and violation of the isolation concept. therefore, it is indispensable for the nonvolatile memory industry to solve the scaling problem under suitable requirements.

acknowledgement: the author renu organized the concept of this review paper and would like to thank prof. rakesh vaid for supervising the project. both the authors read and approved the final manuscript.

references [1] r. bez, e. camerlenghi, a. modelli and a. visconti, “introduction to flash memory”, in proceedings of the ieee, 2003, vol. 91, pp. 489–502. [2] http://www.itrs.net/links/2009itrs/home2009.htm [3] r. garg, d. misra and p. k. swain, “ge mos capacitors with thermally evaporated hfo2 as gate dielectric”, j. of the electrochem. soc., vol. 153, pp. f29–f34, 2006. [4] a. g. khairnar and a. m. mahajan, “effect of post-deposition annealing temperature on rf-sputtered hfo2 thin film for advanced cmos technology”, solid state sci., vol. 15, pp. 24–28, 2013. [5] h. s. kim, “a study of hfo2-based moscaps and mosfets on iii-v substrates with a thin germanium interfacial passivation layer”, ph.d. thesis, the university of texas, austin, 2008. [6] s. v. j. chandra, m. jeong, y. c. park, j. w. yoon and c. j. choi, “effect of annealing ambient on structural and electrical properties of ge metal-oxide-semiconductor capacitors with pt gate electrode and hfo2 gate dielectric”, mater. trans., vol. 52, pp. 118–123, 2011. [7] m. saitoh, e. nagata and t.
hiramoto, “effects of ultra-narrow channel on characteristics of mosfet memory with silicon nanocrystal floating gates”, in proceedings of international electron devices meeting technical digest, 2002, pp. 181–184. [8] n. konofaos, “electrical characterisation of sion/n-si structures for mos vlsi electronics”, microelectron. j., vol. 35, pp. 421–425, 2004. [9] i. fainer, m. l. kosinova, a. e. a. macimovsky, yu. m. rumyantsev, f. a. kuznetsov, v. g. kesler, v. v. kirienko, “study of the structure and phase composition of nanocrystalline silicon oxynitride films synthesized by icp-cvd”, nucl. instrum. and meth. in phys. res. sect. a, vol. 543, pp. 134–138, 2005. [10] s. n. keeney, “dielectric scaling challenges and approaches in floating gate non-volatile memories”, in proceedings of the electrochemical society, 2004, vol. 04, pp. 151–158. [11] s. raghunathan, t. krishnamohan, k. parat, and k. saraswat, “investigation of ballistic current in scaled floating-gate nand flash and a solution”, in proceedings of the ieee international electron devices meeting, baltimore, md, 7-9 december, 2009, pp. 34.1.1–3.4.1.4 [12] s. jayanti, x. yang, r. suri and v. misra, “ultimate scalability of tan metal floating gate with incorporation of high-k blocking dielectrics for flash memory applications”, in proceedings of the international electron devices meeting, 2010, pp. 5.3.1–5.3.4 [13] j. brewer and m. gill, nonvolatile memory technologies with emphasis on flash: a comprehensive guide to understanding and using nvsm devices, ieee, in press: piscataway, nj, usa, 2008. [14] b. park, k. cho, j. yun, y. s. koo, j. h. lee, and s. kim, “electrical characteristics of floating-gate memory devices with titanium nanoparticles embedded in gate oxides”, j. of nanosci. and nanotech., vol. 8, pp. 1904–1908, 2008. [15] c. lee, j. meteer, v. narayanan and e. c. kan, “self-assembly of metal nanocrystals on ultrathin oxide for nonvolatile memory applications”, j. of electron.
mater., vol. 34 pp. 1–11, 2005. [16] z. liu, c. lee, v. narayanan, g. pei and e.c. kan, “metal nanocrystal memories-part i: device design and fabrication”, ieee trans. on electron devices, vol. 49, pp. 1606–1613, 2002. [17] c. lee, t. h. hou, and e. c. c. kan, “nonvolatile memory with a metal nanocrystal/nitride heterogeneous floating-gate”, ieee trans. on electron devices, vol. 52, pp. 2697–2702, 2005. [18] s. yamazaki, k. hatakeyama, i. kagawa, and y. yamashita, “properties of nonvolatile semiconductor memories by using silicon clusters in the gate insulator”, in proceedings of the international electron devices meeting technical digest, 1973, pp. 355–358. [19] s. yamazaki, “metal-insulator-semiconductor structures,” ceramic bulletin, vol. 64, pp. 1585–1589, 1985. [20] r. f. steimle, m. sadd, r. muralidhar, r. rao, b. hradsky, s. straub and b. e. white jr., “hybrid silicon nanocrystal silicon nitride dynamic random access memory”, ieee trans. nanotechn., vol. 2, pp. 335–340, 2003. [21] r. f. steimle, r. a. rao, b. hradsky, r. muralidhar, m. sadd, m. ramon, s. straub, s. bagchi, x. d. wang, j. hooker and b. e. white jr., “hybrid silicon nanocrystal silicon nitride memory”, in proceedings of the icssdm, 2003, pp. 848–849. [22] c. sargentis, k. giannakopoulos, a. travlos and d. tsamakis, “fabrication and characterization of a metal nanocrystal memory using molecular beam epitaxy”, j. of phys., vol. 10, pp. 53–56, 2005. [23] j. j. lee and d. l. kwong, “metal nanocrystal memory with high-k tunneling barrier for improved data retention,” ieee trans. on electron devices, vol. 52, pp. 507–511, 2005. [24] s. m. sze, physics of semiconductor devices. new york: wiley, 1981, chapter 7. [25] h. wong and h. iwai, “on the scaling issues and high-k replacement of ultrathin gate dielectrics for nanoscale mos transistors”, microelectron. eng., vol. 83, pp. 1867–1904, 2006. [26] h. wong, b. l. yang and y.c. 
cheng, “chemistry of silicon oxide annealed in ammonia and its influence on interface traps”, appl. surf. sci., vol. 72 , pp. 49–54, 1993. [27] e. cartier and j. h. stathis, “insulating films on semiconductors”, microelectron. eng., vol. 28, pp. 3–10, 1995. [28] j. t. yount, p. m. lenahan and j. t. krik, “comparison of defect structure in n2oand nh3-nitrided oxide dielectrics”, j. of appl. phys., vol. 76, pp. 1754–1758, 1994. [29] i. j. r. baumvol, f. c. stedile, j. j. ganem, i. trimaille and s. rigo, “thermal nitridation of sio2 films in ammonia: the role of hydrogen”, j. of electrochem. soc., vol. 143, pp. 1426–1434, 1996. [30] s. hwang, s. jung, k. s. jang, j. i. lee, h. park, s. k. dhungel and j. yi, “properties of the ultra -thin silicon-oxynitride films deposited by using plasma-assisted n2o oxidation for semiconductor device applications”, j. of the korean phys. soc., vol. 51, pp. 1096–1099, 2007. [31] p.h. yeh, l.j. chen, p.t. liu, d.y. wang and t.c. chang, “ metal nanocrystals as charge storage nodes for nonvolatile memory devices”, electrochimica acta (elsevier), vol. 52, pp. 2920–2926, 2007. [32] w. guan, s. long, r. jia, q. liu, y. hu, q. wang and m. liu, “analysis of charge retention characteristics for metal and semiconductor nanocrystal non-volatile memories”, in proceedings of the ieee, 2007, pp. 141. [33] w. guan, s. long, m. liu, z. li, y. hu and qi liu, “fabrication and charging characteristics of mos capacitor structure with metal nanocrystals embedded in gate oxide”, j. of phys. d: appl. phys., vol. 40, pp. 2754–2758, 2007. [34] j. y. choi, e. k. lee, y. s. min and j. b. park, “nonvolatile memory devices fabricated by using colloidal ni nanocrystals”, j. of the korean phys. soc., vol. 50, pp. 49–52, 2007. [35] y. li and s.
liu, “using different work function nanocrystal materials to improve the retention characteristics of nonvolatile memory devices”, microelectronics j., vol. 40, pp. 92–94, 2009. [36] n. henan, w. liangcai, s. zhitang, and h. chun, “memory characteristics of an mos capacitor structure with double-layer semiconductor and metal heterogeneous nanocrystals”, j. of semiconductors, vol. 30, pp. 114003(1–5), 2009. [37] c. lee, a. g. seetharam and e. c. kan, “operational and reliability comparison of discrete-storage nonvolatile memories: advantages of singleand double-layer metal nanocrystals”, in proceedings of the iedm technical digest, 2003, vol. 557, pp. 22.6.1–22.6.4. [38] z. j. horvath and p. basa, “nanocrystal non-volatile memory devices”, mater. sci. forum, vol. 609, pp. 1–9, 2009. [39] z. j. horvath, p. basa, t. jaszi, a.e. pap, l. dobos, b. pecz, l. toth, p. szollosi and k. nagy, “electrical and memory properties of si3n4 mis structures with embedded si nanocrystals”, j. of nanosci. and nanotech., vol. 8, pp. 812–817, 2008. [40] p. basa, z. j. horvath, t. jaszi, a.e. pap, l. dobos, b. pecz, l. toth, and p. szollosi, “electrical and memory properties of silicon nitride structures with embedded si nanocrystals”, phys. e: lowdimensional systems and nanostructure., vol. 38, pp. 71–75, 2007. [41] j. h. liao, j. y. hsieh, h. j. lin, w. y. tang, c. l. chiang, y. s. lo, t. b. wu, l. w. yang, t. yang, k. c. chen and c. y. lu, “physical and electrical characteristics of silicon oxynitride films with various refractive indices”, j. of phys. d: appl. phys., vol. 42, pp. 175102(1-7), 2009. [42] j. s. lee, “recent progress in gold nanoparticle-based non-volatile memory devices”, gold bulletin, vol. 43, pp. 189–199, 2010. [43] s. yang, q. wang, m. zhang, s. long, j. liu and m. liu, “titanium–tungsten nanocrystals embedded in a sio2/al2o3 gate dielectric stack for low-voltage operation in non-volatile memory”, nanotechnology, vol. 21, pp. 245201–245205, 2010. [44] v. 
mikhelashvili, b. meyler, s. yofis, j. salzman, m. garbrecht, t. cohen-hyams, w. d. kaplan, and g. eisenstein, and eisenstein g., “a nonvolatile memory capacitor based on a double gold nanocrystal storing layer and high-k dielectric tunneling and control layers”, j. of the electrochem. soc., vol. 157, pp. h463–h469, 2010. [45] g. gay, d. belhachemi, j. p. colonna, s. minoret, p. brianceau, d. lafond, t. baron, g. molas, e. jalaguier, a. beaurain, b. pelissier, v. vidal and b. de salvo, “passivated tin nanocrystals/sin trapping layer for enhanced erasing in nonvolatile memory”, appl. phys. lett., vol. 97, pp. 152112(1–3), 2010. [46] l. khomenkova, b. s. sahu, a. slaoui and f. gourbilleau , “hf-based high-k materials for si nanocrystal floating gate memories”, nanoscale res. lett., vol. 172, pp. 1–8, 2011. [47] g. x. li, x. f. chen, w. ren, p. shi, x. q. wu, o. k. tan, w. g. zhu and x. yao, “hfo2-tio2 ultrathin gate dielectric by rf sputtering”, ferroelectrics, vol. 410, pp. 129–136, 2011. [48] m. yun, b. ramalingam, and s. gangopadhyay, “multi-layer pt nanoparticle embedded high density non-volatile memory devices”, j. of the electrochem. soc., vol. 159, pp. h393–399, 2012. [49] c. t. lin, j. c. wang, p. w. huang, y. y. chen and l. c. chang, “performance revelation and optimization of gold nanocrystal for future nonvolatile memory application”, japanese j. of appl. phys., vol. 52, pp. 04cj09(1–5), 2013. [50] j. seo, j. y. kim, y. b. kim, d. w. kim, h. kim, h. jo and d. k. choi, “multi-level storage in a nano-floating gate mos capacitor using a stepped control oxide”, microelectron. reliability, vol. 53, pp. 528–532, 2013. [51] w. c. shih, c. h. cheng, j. y. m. lee and f. c. chiu “charge-trapping devices using multilayered dielectrics for nonvolatile memory applications”, advances in mater. sci. and eng., pp.1–5, 2013. [52] g. chen, z. huo, l. jin, y. han, x. li, su liu and m. 
liu, “metal floating gate memory device with sio2/hfo2 dual-layer as engineered tunneling barrier”, ieee electron device lett., vol. 35, pp. 744–746, 2014. [53] g. d. wilk, r. m. wallace and j. m. anthony, “high-k gate dielectrics: current status and materials properties considerations”, j. of appl. phys., vol. 89, pp. 5243–5255, 2001. [54] c. zhao, c. z. zhao, s. taylor and p. r. chalker, “review on non-volatile memory with high-k dielectrics: flash for generation beyond 32 nm”, mater., vol. 7, pp. 5117–5145, 2014. [55] c. yan, j. wen, p. lin and z. sun, “a tunneling dielectric layer free floating gate nonvolatile memory employing type-i core–shell quantum dots as discrete charge-trapping/tunneling centers”, nano-micro small, vol.15, pp.1804156 (1–8), 2019. [56] s. wang, c. he, j. tang, x. lu, c. shen, h. yu, l. du, j. li, r. yang, d. shi and g. zhang, “new floating gate memory with excellent retention characteristics”, advanced electron. mater., vol. 5, pp. 1800726 (1–7), 2019.

facta universitatis series: electronics and energetics vol. xx, 2017, xx-xx

guest editorial

the reed-muller workshop has been held biennially since 1993, and since 2007 has been co-located with the ieee international symposium on multiple-valued logic and supported by the ieee computer society technical committee on multiple-valued logic. papers presented at the workshop are provided informally to attendees but workshop proceedings are not formally published. the goal of the reed-muller workshop is to provide a forum for researchers to exchange and discuss research ideas in a variety of areas including:
• graph-based representations of logic functions
• exor-based representations and spectral representation of logic functions
• graph functions, bent functions, cryptographically-significant functions and cryptographic applications
• implementations in silicon
• applications including circuit design, reversible logic, quantum logic, etc.
• representations for quantum computing, nano-technology, and molecular scale computing

the papers appearing in this issue are from the 2017 reed-muller workshop (rm2017) held may 24-25 in novi sad, republic of serbia. the first paper in this special issue is the rm2017 invited address energy-efficient cryptographic primitives presented by prof. elena dubrova, royal institute of technology (kth), stockholm, sweden. the paper considers how to design cryptographic primitives that address integrity and confidentiality of transmitted messages while satisfying resource constraints. secondly, this work describes countermeasures which can enhance the resistance of hardware implementing cryptographic algorithms to hardware trojans.

guest editorial, facta universitatis series: electronics and energetics vol. 31, no 2, june 2018, pp. i-iii https://doi.org/10.2298/fuee180200im

there are five refereed contributed papers in this special issue. preliminary versions were presented at rm2017. the papers included here are fully refereed, revised and in some cases extended versions.

genetic algorithm for binary and functional decision diagrams optimization, suzana stojković, darko veličković and claudio moraga, introduces a genetic algorithm to minimize the size, i.e. the number of nodes, of both binary decision diagrams and functional decision diagrams. experimental results show the effectiveness of the algorithm, particularly when mutation of polarity is introduced for the fdd case.

compact xor-bi-decomposition for lattices of boolean functions, bernd steinbach and christian posthoff, presents a method to find a compact xor-bi-decomposition for a lattice of boolean functions, thereby extending well known techniques for finding and-, or-, or xor-bi-decompositions for a given completely specified function. the approach emphasizes small circuits with low power consumption and delay.
an improved spectral classification of boolean functions based on an extended set of invariant operations, milena stanković, claudio moraga and radomir stanković, considers the extension of prior spectral methods for the classification of boolean functions by the introduction of a previously unconsidered invariant operation in the walsh spectral domain. this work strengthens the classification and resolves a long standing problem in spectral classification. the new invariant operation can also be used in constructing bent functions.

construction of subsets of bent functions satisfying restrictions in the reed-muller domain, miloš radmanović and radomir s. stanković, considers the important task of determining bent functions which have practical application in cryptography. three ways of imposing restrictions to construct subsets of boolean functions which are more readily searched for bent functions are considered. experimental estimates of the number of bent functions in the corresponding subsets of boolean functions are given.

enumeration and coding methods for a class of permutations and reversible logical gates, costas karanikas and nikolaos atreas, introduces a variety of coding methods for boolean sparse invertible matrices and uses these methods to create a variety of bijections on the permutation group p(m) of the set {1, 2, ..., m}. it is also shown how several well-known reversible logic gates can be coded by sparse matrices.

the above synopses demonstrate the breadth of research interests covered by the reed-muller workshop, ranging from theory to practice and including circuit design and cryptography applications.

we express our gratitude to all the authors for their contributions to this special issue.
we acknowledge the important contribution of the rm2017 program committee and referees, listed below, for their careful review and valuable comments on the contributed papers both for the workshop and this special issue. we also express our sincere gratitude to prof. ninoslav d. stojadinović, editor-in-chief, and dr. danijel m. danković, technical secretary, facta universitatis: electronics and energetics series, for their support of this special issue and for allowing us to serve as guest editors. this special issue is an excellent venue for dissemination of research results from rm2017. we sincerely hope that publication of these results will stimulate continued research in these important areas.

d. michael miller, university of victoria, canada
tsutomu sasao, meiji university, japan

rm2017 program committee and referees
jon t. butler, naval postgraduate school, monterey, usa
rolf drechsler, university of bremen, germany
gerhard w. dueck, university of new brunswick, fredericton, canada
oliver keszocze, university of bremen, germany
alireza mahzoon, university of bremen, germany
d. michael miller, university of victoria, canada
claudio moraga, technical university of dortmund, germany
philipp niemann, university of bremen, germany
marek perkowski, portland state university, usa
tsutomu sasao, meiji university, kawasaki, japan
anatoly shalyto, itmo university, st. petersburg, russia
saeideh shirinzadeh, university of bremen, germany
mathias soeken, école polytechnique fédérale de lausanne, switzerland
radomir s. stanković, university of niš, republic of serbia
bernd steinbach, tu universitat bergakademie freiberg, germany
mitchel a. thornton, southern methodist university, dallas, usa
robert wille, johannes kepler university, linz, austria

facta universitatis series: electronics and energetics vol. 31, no 1, march 2018, pp.
11 23 https://doi.org/10.2298/fuee1801011v cost-effective sensors and sensor nodes for monitoring environmental parameters dragana vasiljević 1 , čedo žlebič 1 , goran stojanović 1 , mitar simić 2 , libu manjakkal 3 , zoran stamenković 4 1 faculty of technical sciences, university of novi sad, serbia 2 faculty of electrical engineering, university of banja luka, bosnia & herzegovina 3 university of glasgow, g12 8qq glasgow, uk 4 ihp, frankfurt (oder), germany abstract. this paper reviews the design and characterization of humidity and ph sensors manufactured in the printed circuit board (pcb), ink-jet, and screen printing technologies. the first one (pcb technology) provides robust sensors with pet film which can be exposed to harsh environment. the second (ink-jet technology) can manufacture sensors on flexible substrates (foils and papers). the third (screen printing technology) has been used to implement a thick-film sensor. in addition to this, a multi-sensor cloudbased electronic system with autonomous power supply (solar panels) for air and water quality monitoring has been described. finally, a flexible and modular hardware platform for remote and reliable sensing of environmental parameters has been presented. key words: humidity sensor, ph sensor, sensitivity, stability, sensor node 1. introduction advanced applications require different types of sensors which can be manufactured in various technologies. the manufacturing method determines the performance and price of the sensors. this paper deals with the two types of sensors: humidity sensors and ph sensors. humidity sensors play an important role in many measurement and control applications in meteorology, agriculture, environmental protection, industry, and medicine. in the past years, a lot of effort has been made to develop high-performance humidity sensors exhibiting the large sensitivity, fast response and recovery, and small hysteresis. 
various transduction techniques, such as capacitive, resistive, acoustic, optical, and mechanical, have been adopted for the design of humidity sensors. their cost depends on the accuracy requirements, response time, hysteresis, sensitivity, mechanical and chemical characteristics, power consumption, etc.

received july 27, 2017. corresponding author: goran stojanović, faculty of technical sciences, university of novi sad, trg dositeja obradovića 6, 21101 novi sad, serbia (e-mail: sgoran@uns.ac.rs)

pet film, as one of the most common substrates in industry, is used as a sensitive layer for humidity measurements. pet film, compared to different thermoplastics, has equal or better water vapour transmission rate, dimensional stability, service temperature range, etc. novel sensitive materials, such as graphene-oxide (go), have recently been introduced in the manufacturing process of humidity sensors [1-8]. for instance, the sensitivity of interdigitated capacitive (idc) humidity sensors has been significantly improved by using go as a sensitive material [9].

the monitoring of water quality is an essential task having global impact. this requires determining parameters such as ph, dissolved oxygen, content of ammonia, conductivity, turbidity, temperature, and dissolved metal ions [10]. among these, the ph is one of the most important as it measures the acidity or basicity of water and directly affects the health of individuals [11]. the ph measurement has a wide range of applications including environmental monitoring, chemical processing [12], medical [13], food and beverage [14], biomedical applications such as blood analysis [15], and monitoring of ph fluctuations in the human brain [16]. these applications require highly reliable and accurate ph sensors with a reduced level of maintenance and a long lifetime.
a range of electrochemical and non-electrochemical methods have been explored for ph measurements [17-19]. among these, the glass electrode based ph sensor has been the most attractive and reliable [17-20]. however, the lack of applicability of the existing solutions in environments that are corrosion prone, or have high temperature and high pressure conditions, is a limitation, which provides a strong motivation to develop new ph sensors. in this regard, metal oxide based ph sensors are attractive as they offer a number of potential advantages over glass electrode ph sensors, including low cost, smaller dimensions, and ease of manufacturing. due to their high chemical stability, tio2 based films are considered good ph sensitive layers, and a few studies concerning tio2 as a ph sensitive layer have been reported as well [21, 22].

2. cost-effective sensors manufactured in different technologies

humidity sensor with pet lamination film

pet film has been chosen as a sensitive layer since, compared to thermoplastics, it has equal or better water vapour transmission rate, dimensional stability, and service temperature range. this film (with and without 400 µm pores) has been laminated on copper electrodes. three types of idc structures have been designed and manufactured on the standard fr4 dielectric substrate with a conductive copper layer. geometrical parameters of the idc structures have been optimized in order to obtain the targeted capacitance values (from 25 pf to 45 pf). the layout of the idc humidity sensor is shown in figure 1a, while the representative samples of the manufactured sensors (with and without macro-porous cover) are presented in figure 1b and figure 1c, respectively.

fig. 1 a) layout of idc sensor, b) porous pet sensor, c) standard pet sensor

the idc structures have been measured with lcz meter (hp 4277a).
they have been milled with an lpkf protomat s100 machine. a porous pet film with 400-µm pore diameter has been used (figure 2b).

fig. 2 schematic structures of proposed humidity sensors: a) pet film laminated on the copper electrodes, b) porous pet film laminated on the copper electrodes

the change in the capacitance of the presented humidity sensors is related to three different processes [23]. the first is adsorption on the polymer surface (giving rise to a new thin layer on top of the polymer), the second is absorption into the polymer phase (changing the dielectric constant of the polymer), and the third is swelling of the polymer layer. the sensor's sensitivity can be increased by adding pores to the laminated pet layer, which significantly increases the adsorption of water molecules inside this porous dielectric film.

in order to investigate the humidity response, the sensors have been installed in a chamber with humidity and temperature control (heraeus vötsch vlk 08/450). the humidity has been varied between 45% and 90%, while the temperature has been fixed at 30°c and 40°c. the measurements have been carried out with an lcz meter (hp 4277a), which was connected to a laptop via an agilent 82357a usb/gpib interface converter. an in-house developed program (created using labview) has been used for data acquisition. characteristics of the tested sensors have been determined by observing capacitance variations at 50 khz.

capacitance values of the two sensor types (with standard pet film and porous pet film) have been measured. capacitance responses of the sensors for a fixed environment temperature (30°c and 40°c) and relative humidity (45% rh to 90% rh) are presented in figure 3 and figure 4. the capacitance stability of the sensor has been observed over a period of 30 seconds. the results show a very high stability of the sensor capacitance in time.

fig.
3 standard pet sensor capacitance stability as a function of relative humidity (rh): a) for temperature of 30°c, b) for temperature of 40°c

fig. 4 porous pet sensor capacitance stability as a function of relative humidity (rh): a) for temperature of 30°c, b) for temperature of 40°c

a comparison of the humidity sensor sensitivities is presented in table 1. above 80% rh at 30°c, the porous sensor has a sensitivity of 48 ff/%rh, while the sensitivity of the standard sensor is 17 ff/%rh. likewise, above 80% rh at 40°c, the sensitivity of the porous sensor is 106 ff/%rh, while the sensitivity of the standard sensor is 30 ff/%rh. the measurement results show that the sensitivity of sensors laminated with pet film (standard and porous) increases with increasing temperature and rh. similar results have been reported in [24]. since an adhesive layer is placed between the pet film and the copper electrodes, the sensor sensitivity is reduced. this is because the adhesive layer partly prevents the change in the pet film dielectric constant (due to water adsorption) from being transferred to the idc structure and its capacitance.

table 1 sensor sensitivity (in ff/%rh) in a humidity range from 45% to 90%

temperature   sensor               45-60% rh   60-80% rh   80-90% rh
30 °c         pet sensor           8           11          17
30 °c         porous pet sensor    11          20          48
40 °c         pet sensor           8           12          30
40 °c         porous pet sensor    17          28          106

the response time of the sensors is measured to the 90% point of the final steady-state capacitance during the relative humidity change from 45% rh to 90% rh at 24°c. the recovery time is measured as the time the sensor capacitance takes to return from its maximum value (at 90% rh) to its initial value while the humidity is reduced from 90% rh to 45% rh. the response and recovery times of the standard pet sensor have been found to be 35 s and 57 s, respectively.
likewise, the response time of the porous pet sensor is 42 s, while the recovery time is 47 s.

humidity sensor based on graphene-oxide

the sensors have been manufactured by an ink-jet printing process using the dimatix deposition material printer (dmp-3000) and spin-coating. it has been widely acknowledged that the ink-jet manufacturing technology is cost-effective in the case of humidity sensors [25]. an interdigitated capacitor with 20 pairs of electrodes has been designed, as shown in figure 5. it consists of a polyimide substrate, interdigitated ag electrodes, and sensing go material.

fig. 5 capacitive humidity sensor based on go: a) schematic of the sensor, b) sensor's electrodes before deposition of go

the second sensor layer has been manufactured by spin-coating 3 layers of the graphenea go ink on top of the electrodes. measurements have been performed using an in-house measurement setup shown in figure 6. it consists of a chamber (plastic box) and a humidity source (aerosol). capacitances and resistances of the manufactured sensors have been measured using the agilent 4284a lcr meter. the lascar el-usb-2 humidity and temperature data logger has been used to measure the humidity level inside the chamber.

fig. 6 humidity sensing and measurement setup

fig. 7 capacitance hysteresis curves of the go-based sensor

the capacitance hysteresis characteristic of the go sensor has been observed by increasing the relative humidity from 45% to 85% for absorption of water molecules and then decreasing it back to 45% for desorption. the measurement results are shown in figure 7. the capacitance values range from 200 pf to 1100 pf for relative humidity in the range from 45% to 85%. this indicates that the proposed go sensor has much higher sensitivity compared to the others described in open literature.
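
as a rough check of scale, the average sensitivity implied by the capacitance range quoted above can be worked out directly. the python sketch below is only an illustration: the function name and the comparison with the largest pet value from table 1 are assumptions added here, and the two sensors were measured in different setups, so the ratio indicates an order of magnitude rather than a like-for-like result.

```python
def average_sensitivity(c_low_pf, c_high_pf, rh_low, rh_high):
    """Average capacitance change per %RH over [rh_low, rh_high], in pF/%RH."""
    if rh_high <= rh_low:
        raise ValueError("rh_high must exceed rh_low")
    return (c_high_pf - c_low_pf) / (rh_high - rh_low)


# GO sensor: 200 pF at 45% RH rising to 1100 pF at 85% RH (values from the text).
GO_PF_PER_RH = average_sensitivity(200.0, 1100.0, 45.0, 85.0)

# Largest PET-based value in Table 1: 106 fF/%RH (porous PET, 40 degC, 80-90% RH).
PET_POROUS_MAX_PF_PER_RH = 106e-3

# Order-of-magnitude ratio only; the measurement conditions differ.
RATIO = GO_PF_PER_RH / PET_POROUS_MAX_PF_PER_RH

if __name__ == "__main__":
    print(f"GO average sensitivity: {GO_PF_PER_RH:.1f} pF/%RH")
    print(f"ratio to best PET value: about {RATIO:.0f}x")
```

note that this is an average over the whole 45% to 85% range; the hysteresis in figure 7 means the local slope differs between the absorption and desorption branches.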
in order to compare the response speed of the analysed go sensors, the behaviour in the absorption and desorption phases has been observed. figure 7 shows a significant hysteresis (lagging) of the sensor capacitance behind rh variations. this could be explained by the different speeds at which the humidity within the chamber (plastic box in figure 6) has been changed. namely, the humidity has been raised by introducing an aerosol device and, after some time, reduced by self-drying in laboratory conditions.

tio2 thick-film ph sensor

an interdigitated electrode (ide) tio2 thick-film based ph sensor has been designed, manufactured, and characterized [26]. a ph measurement system based on the integrated circuit ad5933 [27-29] (which can be used for sensor impedance characterization, as well as for sensor readout electronics) has also been implemented. the manufacturing process of the conductimetric ph sensor is similar to that reported in [11, 30]. we have chosen alumina as a substrate to investigate the performance of pure metal oxide and to avoid any reaction at the metal/metal-oxide interface. initially, a planar ide has been deposited on an alumina substrate by screen printing of ag paste (ag/pd esl 9695). screen printing of metal paste is a fast way of manufacturing devices at low cost [31]. an illustration of the conductimetric ph sensor is shown in figure 8a. the major advantages of the ide ph sensor, compared to the other reported approaches, are: fast and low-cost manufacturing, no need for a reference electrode, large surface area, and low energy consumption during measurements. in addition, the screen printing technology could open avenues for integrating the ph sensors with electronics on flexible substrates [31].

fig.
8 a) illustration of tio2 ph sensor, b) impedance measurement device connected to ph sensor, c) experimental setup for ph sensor characterization

the ad5933-based impedance measurement system reported earlier [27-29] has been used for sensor characterization. figure 8c shows the experimental setup for spectroscopic analysis of the sensor impedance. the sample under test has been connected to the measurement device and placed into a beaker with a solution. the sensor can be employed in water pollution monitoring, with an expected operating ph range from 6 to 9; test solutions with ph ranging from 4 to 10 have therefore been prepared by adding 1 mol% of hcl or koh to distilled water. a standard glass-electrode ph and conductivity meter (elmeiron, cpc-411) with a temperature probe has been used to control the ph level of the test solutions and to measure the conductivity of each test solution. the sensor has been washed with deionized water and dried with a paper towel after each measurement to reduce contamination of the electrode surface by solutions with different ph. all measurements have been done at room temperature, with the liquid temperature close to 23°c.

to measure the electrical parameters of tio2 films at different ph values, the sensor has been dipped in the solution for 10 min prior to measurement to ensure steady state. the impedance measurement has been done by performing a frequency sweep in the range of 5-20 khz with an ac voltage of 200 mv. fig. 9 illustrates the variation of impedance magnitude and phase angle as a function of frequency in the range 5-20 khz for different ph values of the test solution. the magnitude and phase of the sensor impedance decrease as the ph value of the solution increases. for a constant ph, the magnitude decreases and the phase increases as the frequency increases.

fig.
9 impedance magnitude and phase angle plots for tio2 thick-film ph sensor for different ph values of solutions over a frequency range from 5 khz to 20 khz

from figure 9, it can be noted that the impedance is lower when the sensor is in a solution of a higher ph. variations of the solution resistance with different ph values contribute to changes of the sensor impedance. the observed dependence is mainly caused by the lower resistance of the applied alkaline solutions compared with the acidic solutions. the variation in impedance with frequency can be attributed to the effect of intercrystalline capacitance [32]. in the khz range, this capacitance is sufficient for short-circuiting the spaces between the grains, which reduces the resistance of the sensor [32].

the obtained impedance data has been used for a more detailed characterization of the sensor sensitivity. it is very important to determine how the sensor impedance changes with the ph value of the analysed solutions. the developed impedance measurement device (used for the sensor characterization) can be used as readout electronics as well, if the measurement error is lower than the sensor sensitivity. the sensor sensitivity in terms of the relative change of the impedance magnitude (z) with ph value can be defined as

$$ S_Z(\mathrm{pH}) = \frac{\left| Z_{\mathrm{pH}} - Z_{\mathrm{pH}-1} \right|}{Z_{\mathrm{pH}-1}} \cdot 100\% , $$

for ph values between 4 and 10. the sensor sensitivity in terms of the relative change of the impedance phase angle (ϕ) with ph value can be defined as

$$ S_\phi(\mathrm{pH}) = \frac{\left| \phi_{\mathrm{pH}} - \phi_{\mathrm{pH}-1} \right|}{\phi_{\mathrm{pH}-1}} \cdot 100\% . $$

in figure 10, relative changes of the impedance magnitude and phase angle of the manufactured ph sensor are shown. five frequencies (5 khz, 8.8 khz, 12.6 khz, 16.4 khz, and 20.2 khz) in the analysed frequency range are chosen to establish a linear frequency distribution.

fig.
10 sensitivity of sensor impedance magnitude and phase angle

as can be seen from figure 10, the relative change of the impedance magnitude is higher than 2% and increases with frequency. additionally, the relative change of the impedance phase angle decreases as the ph value increases. therefore, for ph values lower than 7, it is more convenient to measure the phase angle, while for ph values higher than 7, it is better to measure the impedance magnitude. moreover, it can be concluded that the reported measurement error of 2% of the developed ad5933-based device [27-29] is acceptable in typical applications.

3. wireless sensor nodes for environmental parameter measurements

a tio2-based sensor has been used together with commercial sensors to realize a wireless sensor node for monitoring environmental parameters (ph, temperature, relative humidity, volatile organic compounds, etc.) [33]. it is a low-cost, portable, and low-power system powered by a solar-panel charger unit, thus providing automated in-situ measurements and data storage operations. compared to the systems presented in literature, the design shown in figure 11 offers the advantage of remote multi-parameter measurements in real time [33].

fig. 11 wireless sensor node with a ph sensor and additional commercial sensors

moreover, the developed system for remote measurement and acquisition of environmental parameters has been integrated into a more complex cloud-based system which ensures remote access to the measurement data in real time. the ibm iot platform has been used for data presentation and internet access to the measurement results, as shown in figure 12.

fig. 12 browser view of the web ibm watson iot platform with sensor data

ihpnode has been developed as a flexible and modular hardware platform for remote sensing in environmental and agricultural applications [34].
the node (figure 13) is based on the texas instruments msp430x low-power microcontroller and three rf transceivers, one working in the 868 mhz band and two working in the 2.4 ghz frequency band. two of these (cc1101 and cc2500) support flexible proprietary network protocols, while the third (cc2520) provides a network coprocessor for zigbee protocol integration.

fig. 13 ihpnode based on ti msp430x microcontroller

4. conclusion

cost-effective humidity and ph sensors have been designed, manufactured, and characterised. two wireless sensor nodes for remote monitoring of environmental parameters have been developed using the aforementioned humidity and ph sensors. future work should complete and integrate these nodes into a smart multi-sensor cloud-based hardware/software platform for environmental and agricultural applications.

acknowledgement: the work described in this paper is partly supported by the ministry of education, science and technological development within the project no. tr32016 and the provincial secretariat for higher education and r&d activities within the project no. 114-451-2044/2016-01.

references

[1] h. bi, k. yin, x. xie, j. ji, s. wan, l. sun, m. terrones, and m. s. dresselhaus, "ultrahigh humidity sensitivity of graphene oxide", scientific reports, vol. 3-2714, pp. 1-7, 2013.
[2] h. chi, y. j. liu, f. wang, and c. he, "highly sensitive and fast response colorimetric humidity sensors based on graphene oxides film", acs appl. mater. interfaces, vol. 7, pp. 19882-19886, 2015.
[3] y. yao, x. chen, h. guo, z. wu, and x. li, "humidity sensing behaviour of graphene oxide-silicon bilayer flexible structure", sensors and actuators b: chemical, vol. 161, pp. 1053-1058, 2012.
[4] l. guo, h. b. jiang, r. q. shao, y. l. zhang, s. y. xie, j. n. wang, x. b. li, f. jiang, q. d. chen, t. zhang, and h. b.
sun, "two-beam-laser interference mediated reduction, patterning and nanostructuring of graphene oxide for the production of a flexible humidity sensing device", carbon, vol. 50, pp. 1667-1673, 2012.
[5] p. g. su and z. m. lu, "flexibility and electrical and humidity-sensing properties of diamine functionalized graphene oxide films", sensors and actuators b: chemical, vol. 211, pp. 157-163, 2015.
[6] d. zhang, j. tong, and b. xia, "humidity-sensing properties of chemically reduced graphene oxide/polymer nanocomposite film sensor based on layer-by-layer nano self-assembly", sensors and actuators b: chemical, vol. 197, pp. 66-72, 2014.
[7] r. gao, d.-f. lu, j. cheng, y. jiang, l. jiang, and z.-m. qi, "humidity sensor based on power leakage at resonance wavelengths of a hollow core fiber coated with reduced graphene oxide", sensors and actuators b: chemical, vol. 222, pp. 618-624, 2016.
[8] d. zhang, h. chang, p. li, r. liu, and q. xue, "fabrication and characterization of an ultrasensitive humidity sensor based on metal oxide/graphene hybrid nanocomposite", sensors and actuators b: chemical, vol. 225, pp. 233-240, 2016.
[9] c. l. zhao, m. qin, w. h. li, and q. a. huang, "enhanced performance of a cmos interdigital capacitive humidity sensor by graphene oxide", solid-state sensors, actuators, and microsystems conference, pp. 1954-1957, 2011.
[10] j. kang, m. wang, and z. xiao, "modeling and control of ph in pulp and paper wastewater treatment process", journal water resource and protection, vol. 2, pp. 122-127, 2009.
[11] l. manjakkal, k. cvejin, j. kulawik, k. zaraska, d. szwagierczak, and r. p. socha, "fabrication of thick film sensitive ruo2-tio2 and ag/agcl/kcl reference electrodes and their application for ph measurements", sensors and actuators b: chemical, vol. 204, pp. 57-67, 2014.
[12] h. a. clark, r. kopelman, r. tjalkens, and m. a.
philbert, "optical nanosensors for chemical analysis inside single living cells sensors for ph and calcium and the intracellular application of pebble sensors", anal. chem., vol. 71, pp. 4837-4843, 1999.
[13] b. d. malhotra and a. chaubey, "biosensors for clinical diagnostics industry", sensors and actuators b: chemical, vol. 91, pp. 117-127, 2003.
[14] c. bohnke, h. duroy, and j. l. fourquet, "ph sensors with lithium lanthanum titanate sensitive material: applications in food industry", sensors and actuators b: chemical, vol. 89, pp. 240-247, 2003.
[15] l. xie, y. qin, and h. y. chen, "polymeric optodes based on upconverting nanorods for fluorescent measurements of ph and metal ions in blood samples", anal. chem., vol. 84, pp. 1969-1974, 2012.
[16] v. a. magnotta, h. y. heo, b. j. dlouhy, n. s. dahdaleh, r. l. follmer, d. r. thedens, m. j. welsh, and j. a. wemmie, "detecting activity-evoked ph changes in human brain", proc. national academy of sciences of usa, vol. 109, pp. 8270-8273, 2012.
[17] u. guth, w. vonau, and j. zosel, "recent developments in electrochemical sensor application and technology - a review", meas. sci. technol., vol. 20, pp. 1-14, 2009.
[18] y. qin, h. j. kwon, m. m. howlader, and m. j. deen, "microfabricated electrochemical ph and free chlorine sensors for water quality monitoring: recent advances and research challenges", rsc advances, vol. 5, pp. 69086-69109, 2015.
[19] p. kurzweil, "metal oxides and ion-exchanging surfaces as ph sensors in liquids: state-of-the-art and outlook", sensors, vol. 9, pp. 4955-4985, 2009.
[20] g. eisenmann, glass electrodes for hydrogen and other cations, ed. marcel dekker, new york, usa, 1967.
[21] y. h. liao and j. c. chou, "preparation and characterization of the titanium dioxide thin films used for ph electrode and procaine drug sensor by sol-gel method", mat. chem. phys., vol. 114, pp. 542-548, 2009.
[22] y. chen, s. c. mun, and j.
kim, "a wide range conductometric ph sensor made with titanium-dioxide/multiwall-carbon nanotube/cellulose hybrid nanocomposite", ieee sensors j., vol. 13, pp. 4157-4162, 2013.
[23] r. igreja and c. j. dias, "dielectric response of interdigital chemocapacitors: the role of the sensitive layer thickness", sensors and actuators b: chemical, vol. 115, pp. 69-78, 2006.
[24] l. chia-yen and l. gwo-bin, "micro-machine based humidity sensors with integrated temperature sensors for signal drift compensation", journal of micromechanics and microengineering, vol. 13, pp. 620-627, 2003.
[25] f. molina lopez, d. briand, and n. f. de rooij, "all additive inkjet printed humidity sensors on plastic substrate", sensors and actuators b: chemical, vol. 166, pp. 212-222, 2012.
[26] m. simić, l. manjakkal, k. zaraska, g. m. stojanović, and r. dahiya, "tio2 based thick film ph sensor", ieee sensors j., vol. 17, pp. 248-255, 2017.
[27] m. simić, "complex impedance measurement system for the frequency range from 5 khz to 100 khz", key eng. mater., vol. 644, pp. 133-136, 2015.
[28] m. simić, "realization of digital lcr meter", in proceedings of the epe, 2014, pp. 769-773.
[29] m. simić, "complex impedance measurement system for environmental sensors characterization", in proceedings of the 22nd telecommunications forum telfor, belgrade (serbia) 2014, pp. 660-663.
[30] l. manjakkal, e. djurdjic, k. cvejin, j. kulawik, k. zaraska, and d. szwagierczak, "electrochemical impedance spectroscopic analysis of ruo2 based metal oxide thick film ph sensors", electrochim. acta, vol. 168, pp. 246-255, 2015.
[31] s. khan, l. lorenzelli, and r. dahiya, "technologies for printing sensors and electronics over large flexible substrates: a review", ieee sensors j., vol. 15, pp. 3164-3185, 2015.
[32] k. arshak, e. gill, a. arshak, and o.
korostynska, "investigation of tin oxides as sensing layers in conductimetric interdigitated ph sensors", sensors and actuators b: chemical, vol. 127, pp. 42-53, 2007.
[33] m. simić, g. stojanović, l. manjakkal, and k. zaraska, "multi-sensor system for remote environmental (air and water) quality monitoring", in proceedings of the 24th telecommunications forum telfor, belgrade (serbia) 2016, pp. 1-4.
[34] k. piotrowski, a. sojka-piotrowska, z. stamenkovic, and r. kraemer, "ihpnode platform as a base for precision farming and remote diagnosis in agriculture", in proceedings of the 24th telecommunications forum telfor, belgrade (serbia) 2016, pp. 1-5.

facta universitatis
series: electronics and energetics vol. 31, no 1, march 2018, pp. 141-153
https://doi.org/10.2298/fuee1801141j

trade-off between multiple criteria in smart home control system design

aleksandar janjić 1, lazar velimirović 2, miomir stanković 3, vladimir djordjević 4
1 faculty of electronic engineering niš, niš, serbia
2 mathematical institute of the serbian academy of sciences and arts, belgrade, serbia
3 faculty of occupational safety niš, niš, serbia
4 electric power industry of serbia, belgrade, serbia

abstract. the successful automation of a smart home relies on the ability of the smart home control system to organize, process, and analyze different sources of information, according to several criteria. because of the variety of key design criteria that every smart home of the future should meet, the main challenge is the trade-off between them in an uncertain environment. in this paper, a problem of smart home design has been solved using a methodology based on the multiplicative form of multi-attribute utility theory. aggregated functions describing different smart home alternatives are compared using the stochastic dominance principle.
the aggregation of different criteria has been performed through their numerical convolution, unlike the usual approach of pairwise comparison, which allows only the additive form of aggregation of individual criteria. the methodology is illustrated on the smart home controller parameter setting.

key words: maut, decision making, multi criteria analysis, smart home, stochastic dominance

received june 20, 2017; received in revised form september 20, 2017
corresponding author: lazar velimirović, mathematical institute of the serbian academy of sciences and arts, kneza mihaila 36, 11001 belgrade, serbia (e-mail: lazar.velimirovic@mi.sanu.ac.rs)

1. introduction

making a home smart means that residents move around safely and easily, economizing and using resources more efficiently. in order to accomplish these multiple tasks, a smart home must be equipped with technology that observes the residents and provides proactive services. with the spread of inexpensive sensors, communication equipment and embedded processors, smart homes are equipped with a large number of sensors that acquire data on the activities and behaviors of their residents and consequently perform appropriate control actions [1]. the successful automation of a smart home relies on the ability of the smart home control system to organize, process, and analyze different sources of information according to different criteria defined by the user. to this end, strong and formal support for multi-criteria decision making is central to the smart home controller design and setting.

as far as smart home functionality is concerned, there are at least four major key design requirements that every smart home of the future should meet [2]:
• user-friendliness: a functionality must be comfortable and helpful to (often nontechnical) home occupants.
• intelligence for the most basic and sensible functions (such as turning on lights when coming, and turning them off when leaving home), requiring complex information processing of diverse information sources.
• non-intrusiveness: the ability of the system to operate in the background, not bothering occupants with a proliferation of queries.
• security and its accompanying factor, privacy, are extremely important for the adoption of any smart home system.

the trade-off between these criteria is necessary on all hierarchical levels of smart home design, selection and operation. we do not know what mix of sensors is optimal for a particular group or individual, or how to appropriately control, summarize and present the collected information to different stakeholders. a series of technical and social challenges need to be addressed before sensor technologies can be successfully integrated according to the occupant's attitude to different criteria. besides the presence of multiple criteria, another challenge facing intelligent building and smart home automation is the great uncertainty due to the stochastic nature of renewable energy sources.

in this paper, a methodology for the discrete stochastic multiple criteria decision making problem in smart home system design, with different types of trade-offs among criteria, has been applied to the smart home design selection problem. the advantage of this approach is the use of compensatory aggregation, which is more suitable for conflicting criteria and for human aggregation behavior. the proposed methodology is based on numerical convolution of criteria probability distribution functions, according to different types of criteria aggregation. alternatives are ranked according to the stochastic dominance (sd) rules.
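
to make the ranking step concrete, the following python sketch checks first-order stochastic dominance between two discrete distributions defined on the same ordered score scale. it is a minimal illustration, not the procedure used in the paper: the function names and the three example distributions are hypothetical, higher scores are assumed to be better, and only the first-order rule is shown.

```python
from itertools import accumulate


def cdf(pmf):
    """Cumulative distribution of a pmf given over an ascending score scale."""
    return list(accumulate(pmf))


def first_order_dominates(pmf_a, pmf_b, tol=1e-12):
    """True if A dominates B in the first-order sense (higher scores better).

    A dominates B when F_A(x) <= F_B(x) at every score x and the inequality
    is strict for at least one x.
    """
    if len(pmf_a) != len(pmf_b):
        raise ValueError("both distributions must use the same score scale")
    fa, fb = cdf(pmf_a), cdf(pmf_b)
    never_worse = all(a <= b + tol for a, b in zip(fa, fb))
    somewhere_better = any(a < b - tol for a, b in zip(fa, fb))
    return never_worse and somewhere_better


# Hypothetical distributions over scores 1..5 (illustration only).
ALT_A = [0.0, 0.1, 0.2, 0.3, 0.4]   # mass concentrated on high scores
ALT_B = [0.2, 0.2, 0.2, 0.2, 0.2]   # uniform
ALT_C = [0.5, 0.0, 0.0, 0.0, 0.5]   # polarised: its cdf crosses that of ALT_B

if __name__ == "__main__":
    print(first_order_dominates(ALT_A, ALT_B))
    print(first_order_dominates(ALT_C, ALT_B), first_order_dominates(ALT_B, ALT_C))
```

when the two cumulative curves cross, as for the polarised and uniform examples, neither alternative dominates at first order, and a higher-order rule or additional preference information would be needed to rank them.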
the contribution of this paper is the introduction of a new decision support tool which is better adapted to smart home design faced with uncertainties and with the necessary trade-off between different criteria and different stakeholders. the methodology can be used for various problems in smart home design, including sensor disposition, parameter setting, functionality selection, etc. unlike previous multi-criteria approaches, compensatory aggregation adapted to human behavior has been applied.

the paper is organized in the following way. after the literature review of the current state of the problem, the methodology for stochastic multi criteria decision making (smcdm) is presented, describing each step of the methodology: definition of the type of criteria aggregation, numerical convolution of aggregated utility probability distributions, and the application of sd rules for the ranking of alternatives. the methodology is illustrated on the choice of the smart home control parameter settings and, finally, conclusions and further research directions are presented.

2. literature review

generally, a home that is designed according to smart and sustainable home principles has to meet occupants' needs through all stages of their life. previous work on smart home system design has generally focused on a specific problem area such as information correlation or hardware [3], [4]. in [5], the authors review sensor technology used in smart homes, focusing on environment and infrastructure mediated sensing. in [6]-[9], smart home technology is a support for people with reduced capabilities due to aging or disability. requirements generated from considerations of social, environmental, and economic issues for highly efficient energy-saving building systems in compliance with building codes and regulations were analyzed in [10], [11].
focusing on a specific design problem, the authors did not take a holistic system and multi-criteria engineering view. in [12], a general controller system design procedure based on evolutionary multiobjective optimisation (emo) is presented, with a comprehensive review of other multiobjective design procedures. an extensive list of requirements for the composition of smart home applications has been provided in [13] and [14], where the requirements are clustered in seven categories, each consisting of three to five requirements:
• simplicity: describing the complexity of application development, involving the interaction between the system and the application developer.
• modeling: requirements that affect the way the smart home applications can be modeled.
• time: the ability to impose timing constraints.
• mobility: including both mobile devices and changes in the system.
• technical requirements for a composition solution.
• security, safety and privacy.
• miscellaneous, containing all requirements that do not match the other categories.

with the diversification of criteria and the increased number of stakeholders engaged in smart home realization, the need for a multiobjective and multicriteria approach emerged. starting from the redesign of building automation systems [15], various applications of multiobjective optimization of control systems were introduced, such as controller adjustment and controller parameter selection [16]. in [17], key performance indicators related to smart grid efficiency, as the key factor of any energy management system implementation, have been analyzed using fuzzy ahp multicriteria analysis. however, in all of the mentioned approaches the multiobjective problem is normalized and converted to a single-objective optimization with a deterministic state of nature concerning the consequences of different alternatives.
although the authors present a multi-criteria decision-making model using the analytic network process to evaluate the lifespan energy efficiency of intelligent buildings, the trade-off between different criteria has not been taken into account in any of the mentioned approaches. as stated before, the stochastic nature of renewable sources integrated in intelligent buildings requires stochastic predictors [15], [18]. however, the authors conclude that the current technology is still not mature enough for cost-effective usage in most real-world scenarios.

smcdm, one of the prominent stochastic multicriteria methodologies, is used for selecting alternatives associated with multiple criteria, where the consequences of alternatives with respect to criteria are in the form of random variables. there are three general methods to solve the smcdm problem: 1) outranking methods using confidence indices on pairwise comparisons of alternatives with respect to each criterion [19], 2) data envelopment analysis [20], and 3) stochastic multicriteria acceptability analysis (smaa) [21]. methods using stochastic processes and sd rules generally include two processes [22], [23]: comparison and selection. the comparison identifies whether an sd relation exists for any pair of alternatives using sd rules, while the selection ranks the alternatives based on the determined sd relations using rough set theory or interactive procedures [24], [25]. in smaa or group decision-making analysis, both criterion values and criterion weights are uncertain, but the usage of more complex utility functions, together with the correlation between attributes, has remained neglected. so far, smcdm problems have been exclusively related to the additive form of utility functions, with evaluations $e_{ij}$ taken as utility values.
in [26], a range of simulated problem settings is used to show that using an additive aggregation when preferences actually follow a multiplicative model may often have only a minor impact on results. however, for many decision problems, including the various smart home design phases, the estimated parameters are inconsistent with the linear additive case and strongly favor the multiplicative functional form. furthermore, decision makers tend to partially compensate between criteria instead of trying to satisfy them simultaneously, which emphasizes the need for the multiplicative functional form. in [27], a new methodology for multidimensional risk assessment, based on stochastic multiattribute theory, is presented. this methodology simultaneously encompasses the multi-criteria decision problem, the stochastic nature of criteria outcomes and the trade-off between them depending on decision maker preferences, making it a candidate for smart home controller design problems.
3. methodology
the main challenge in smart home control system design is the presence of a great number of different stakeholders, with different and often opposing preferences. for the sake of illustration, suppose that seven persons evaluate different alternatives for an indoor temperature setting (e.g. 20 °c) over a set of three criteria: comfort (c1), ecology (c2) and energy costs (c3), on a scale of ten (1 the worst, 10 the best). the evaluations of the i-th alternative are expressed in the form of a discrete probability distribution, as shown in table 1.
table 1 evaluation distribution of three criteria for an indoor temperature setting value
scores   c1    c2    c3
1        0     2/7   0
2        0     0     1/7
3        3/7   0     0
4        0     1/7   1/7
5        2/7   2/7   0
6        0     1/7   3/7
7        0     0     0
8        1/7   1/7   1/7
9        0     0     0
10       1/7   0     1/7
the graphical representation of the corresponding cumulative distribution functions is given in figure 1.
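the cumulative distribution functions of the three criteria can be computed directly from table 1. the following is a minimal python sketch, not the authors' implementation; the names `counts`, `pmf` and `cdf` are illustrative, and the probabilities are stored as the number of evaluators (out of seven) who gave each score.

```python
from fractions import Fraction
from itertools import accumulate

# table 1, expressed as the number of evaluators (out of 7)
# who assigned each score 1..10 to the criterion
N_EVALUATORS = 7
counts = {
    "c1": [0, 0, 3, 0, 2, 0, 0, 1, 0, 1],
    "c2": [2, 0, 0, 1, 2, 1, 0, 1, 0, 0],
    "c3": [0, 1, 0, 1, 0, 3, 0, 1, 0, 1],
}

def pmf(score_counts):
    """Discrete probability distribution over the scores 1..10."""
    return [Fraction(n, N_EVALUATORS) for n in score_counts]

def cdf(probabilities):
    """Cumulative distribution: running sum of the probabilities."""
    return list(accumulate(probabilities))

cdfs = {name: cdf(pmf(c)) for name, c in counts.items()}

for name, values in cdfs.items():
    print(name, [str(v) for v in values])
```

exact fractions are used so that each distribution sums to exactly 1, which avoids rounding issues when the distributions are later compared.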
[figure: cumulative probability (vertical axis, 0 to 1) versus grades (horizontal axis, 0 to 10) for criteria c1, c2 and c3]
fig. 1 the cumulative distribution functions of three criteria evaluations
the problem is how to make a trade-off between these criteria and how to choose the required temperature to satisfy all occupants' preferences. furthermore, on other levels of smart home design or operation, the same problem of multi-criteria decision analysis in the presence of a group of decision makers or an uncertain environment still exists. the methodology proposed in this paper for solving this problem is based on multi-attribute utility theory (maut) and numerical convolution of probability distributions. the reader is referred to [27] for a detailed explanation of the methodology, but the key points are explained below. a decision problem consists of n alternatives denoted by $a_i$, $i \in \{1,\dots,n\}$, each evaluated on m criteria denoted by $c_j$, $j \in \{1,\dots,m\}$. let $e_{ij}$ be the evaluation of $a_i$ in terms of criterion $c_j$, according to some suitable performance measure. we focus on decision making situations in which the values of $e_{ij}$ for each i are not known with certainty for all j, but follow some distribution function $f(e_{ij})$. this formulation is known as the alternatives, attributes (criteria), evaluators (aae or ace) model. the process of selecting the optimal smart home design is performed in the following steps:
• identification of the different alternatives and criteria.
• formation of the individual criteria probability distribution functions.
• formation of the aggregated probability distribution by numerical convolution of the marginal probability distributions.
• sd evaluation on the aggregated probability functions.
3.1. criteria aggregation
the following three types of aggregation of criteria are most commonly used in decision making: conjunctive, disjunctive and compensatory.
conjunctive aggregation implies simultaneous satisfaction of all decision criteria, while disjunctive aggregation implies full compensation amongst them. compensatory aggregation is more suitable for modeling human aggregation behavior. among the great number of different compensatory aggregation operators, the multiplicative multi-attribute utility function has proved to be the most suitable for practical engineering applications. it is shown that if the additive independence condition is verified, a multi-attribute comparison of two actions can be decomposed into one-attribute comparisons. if mutual utility independence exists, the multi-attribute utility function has the following form [28]:
$$u(x_1, x_2, \dots, x_n) = \frac{1}{k}\left[\prod_{i=1}^{n}\bigl(1 + k\,k_i\,u_i(x_i)\bigr) - 1\right] \qquad (1)$$
here, $u_i(x_i)$ is the single-attribute utility value for attribute i with value $x_i$ (ranging from 0 to 1), $k_i$ is the trade-off parameter for component i, for all i, and $k$ is a normalization constant ensuring that the utility values are scaled between 0 and 1. one method to determine the multiplicative function (1) is to measure each $u_i(x_i)$, determine the $k_i$ values, and find the value of $k$ by iteratively solving (2).
$$1 + k = \prod_{i=1}^{n}\bigl(1 + k\,k_i\bigr) \qquad (2)$$
parameter $k$ is related to the parameters $k_i$ as follows:
if $\sum_{i=1}^{n} k_i > 1$, then $-1 < k < 0$, (3)
if $\sum_{i=1}^{n} k_i = 1$, then $k = 0$, and the additive model holds, (4)
if $\sum_{i=1}^{n} k_i < 1$, then $k > 0$. (5)
the overall utility function reflects three different types of interaction between individual criteria. in the compensatory case, the performance of one criterion makes up for the lack of performance of other criteria, while in the additive case a criterion does not interact with the values of the other criteria. in the complementary case, a good performance by one criterion is less important than balanced performance across the criteria.
3.2.
smcdm with compensatory aggregation
the main idea of the proposed methodology is to compare different alternatives using a pragmatic aggregation function that combines the single-attribute utility functions of the system components. this comparison is possible because of the equivalence of the rules for a multivariate utility function $u = u(x_1, x_2, \dots, x_n)$ and a univariate utility function defined on the multivariate outcome space, $u = u^s(p(x_1, x_2, \dots, x_n))$. in order to make the ranking of alternatives more practical, the probability distributions are convolved so that only one distribution function per alternative needs to be compared. after the new, aggregated probability distribution has been built for every alternative, the ranking of the alternatives is performed by the sd rules explained in the appendix. different uncertainty types, such as outcomes and weighting factors, can be handled simultaneously by the convolution principle. the four-step methodology of alternative ranking is based on the multiplicative utility function as a combination of the suggested criteria and the decision maker's attitude towards risk, the numerical convolution of the individual distribution functions, and the sd principle.
3.3. aggregation of utility distribution functions
let x and y be two independent integer-valued random variables, with distribution functions $f_X$ and $f_Y$ respectively. then the convolution of $f_X$ and $f_Y$ is the distribution function $f_Z$ given by:
$$f_Z(j) = \sum_{k} f_X(k)\, f_Y(j - k), \qquad (6)$$
for $j = -\infty, \dots, +\infty$. the function $f_Z(j)$ is the distribution function of the random variable $Z = X + Y$. in [29], an efficient algorithm for computing the distributions of sums of discrete random variables is presented. however, the multiplicative form of the utility function requires another type of convolution.
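the additive convolution in (6) can be sketched in a few lines of python. this is a minimal illustration under stated assumptions, not the algorithm of [29]: the function name `convolve` and the two-coin example are hypothetical, and the distributions are assumed to be given as lists indexed from 0.

```python
from fractions import Fraction

def convolve(fx, fy):
    """Distribution of Z = X + Y for independent X and Y.

    fx[k] = P(X = k) and fy[k] = P(Y = k), with indices starting at 0.
    Implements f_Z(j) = sum_k f_X(k) * f_Y(j - k).
    """
    fz = [Fraction(0)] * (len(fx) + len(fy) - 1)
    for k, px in enumerate(fx):
        for i, py in enumerate(fy):
            fz[k + i] += px * py
    return fz

# hypothetical example: number of heads in two fair coin tosses
coin = [Fraction(1, 2), Fraction(1, 2)]
two_coins = convolve(coin, coin)
print([str(p) for p in two_coins])
```

the double loop visits every pair of outcomes once, so the cost grows with the product of the two support sizes; this is the cost that the reduction of array dimensions described in the text aims to keep under control.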
in the proposed methodology, the computational procedure is extended to different forms of the aggregating function and sped up by reducing the dimensions of the arrays p and z to the number of evaluation grades, according to the following algorithm. for n criteria and m evaluation grades, the dimension of the output array is reduced to m instead of m x n. the discrete convolution algorithm is given below:
input: $f(x_1, \dots, x_n)$ – multi-attribute utility function; m – number of evaluation grades; $p(x_i = j)$ – probability that variable i takes the value j, $j = 1, \dots, m$.
for $i_1$ = 1 to m
  for $i_2$ = 1 to m
    …
      for $i_n$ = 1 to m
        calculate $f(x_1 = i_1, x_2 = i_2, \dots, x_n = i_n)$
        z = integer(f) [discretization of f]
        $p(z) = p(z) + p(x_1 = i_1)\, p(x_2 = i_2) \cdots p(x_n = i_n)$
output: z [dimension m]
the cumulative distribution function of the aggregated random variable u is given by (7).
$$F_U(x) = P(U \le x) = \sum_{u \le x} P(U = u) = \sum_{u \le x} f(u), \qquad (7)$$
the comparison of the different cdfs corresponding to the aggregated utility function is now possible with the sd principle. the first step is the formation of the aggregation function based on the suggested criteria and the decision maker's (dm) attitude towards risk. in the second step, an aggregated probability distribution is derived using the numerical convolution of the individual criterion probability distribution functions. in the third step, a dominance matrix is formed using sd rules and sd degree values. the final step in this methodology is the ranking of the alternatives based on the results of the dominance matrix. two types of dominance matrices will be used in this methodology: the first one is obtained by the three types of stochastic dominance. using the first, second or third degree stochastic dominance rule, the appropriate type of dominance matrix is obtained, where the elements of the dominance matrix are defined in the following way:
$$SD_{ij} = \begin{cases} 1, & \text{if } F_{a_i}\; SD_h\; F_{a_j} \\ 0, & \text{otherwise} \end{cases}, \qquad h = 1, 2, 3.$$
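the aggregation algorithm above can be sketched in python as follows. this is a hedged illustration, not the authors' code. several pieces are assumptions made only for the example: the single-attribute utilities are taken to be linear in the grade, the aggregated utility is discretized to grades 0..m by rounding, the constant k is found from (2) by bisection, and the weights `(0.5, 0.3, 0.4)` are hypothetical. the input distributions are those of table 1.

```python
from fractions import Fraction
from itertools import product
from math import prod

def solve_k(weights, iters=200):
    """Non-zero root of 1 + k = prod(1 + k*k_i), found by bisection."""
    s = sum(weights)
    if abs(s - 1.0) < 1e-4:
        return 0.0                      # additive case, relation (4)
    g = lambda k: prod(1.0 + k * w for w in weights) - (1.0 + k)
    if s > 1.0:
        lo, hi = -1.0, -1e-6            # relation (3): -1 < k < 0
    else:
        lo, hi = 1e-6, 1.0              # relation (5): k > 0
        while g(hi) < 0.0:              # widen until the sign changes
            hi *= 2.0
    lo_positive = g(lo) > 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (g(mid) > 0.0) == lo_positive:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def multiplicative_utility(us, weights, k):
    """Equation (1); falls back to the additive form when k = 0."""
    if k == 0.0:
        return sum(w * u for w, u in zip(weights, us))
    return (prod(1.0 + k * w * u for w, u in zip(weights, us)) - 1.0) / k

def aggregate(pmfs, weights, m=10):
    """pmfs[i][j-1] = P(x_i = j), j = 1..m. Returns P(Z = z) for z = 0..m."""
    k = solve_k(weights)
    out = [Fraction(0)] * (m + 1)
    for grades in product(range(1, m + 1), repeat=len(pmfs)):
        p = prod(pmf[g - 1] for pmf, g in zip(pmfs, grades))
        if p == 0:
            continue
        # assumption: linear single-attribute utility on the grade scale
        us = [(g - 1) / (m - 1) for g in grades]
        # assumption: discretize the utility in [0, 1] to a grade 0..m
        z = min(m, max(0, round(m * multiplicative_utility(us, weights, k))))
        out[z] += p
    return out

sevenths = lambda counts: [Fraction(n, 7) for n in counts]
pmfs = [sevenths([0, 0, 3, 0, 2, 0, 0, 1, 0, 1]),   # c1, table 1
        sevenths([2, 0, 0, 1, 2, 1, 0, 1, 0, 0]),   # c2, table 1
        sevenths([0, 1, 0, 1, 0, 3, 0, 1, 0, 1])]   # c3, table 1
weights = (0.5, 0.3, 0.4)                           # hypothetical k_i
k = solve_k(weights)
agg = aggregate(pmfs, weights)
print(round(k, 4), [str(p) for p in agg])
```

the output array has one entry per evaluation grade regardless of the number of criteria, which is the dimension reduction described in the text; the enumeration itself still visits every combination of grades.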
the methodology will be illustrated on the example of smart home controller parameter selection concerning four criteria explained in the introductory section.
4. case study
we consider one of many possible smart home functions: blackout prevention for the smart house, where the smart meter measures the real-time power levels of the appliances and sends this information to the smart home control system. the control system calculates the remaining available power and sends this information to the appliances, but with a time delay. suppose that we can build 10 alternatives with different combinations of appliances and times for their disconnection, directly affecting all four criteria concerning the smart home functionality requirements. in the problem, the set of ten alternatives is (a1, a2, ..., a10) and the criteria considered are: user friendliness c1, intelligence complexity c2, non-intrusiveness c3 and security c4. suppose that seven persons provide evaluations of the alternatives with respect to the criteria on a scale of ten (1 the worst, 10 the best). the complete table of probability distributions of the experts' evaluations is presented in table 2. a similar problem, which served as a basis for our analysis, is given in [23], [25], [31].
table 2. experts' evaluation of alternatives
criteria  scores  a1   a2   a3   a4   a5   a6   a7   a8   a9   a10
c1        1       0    0    0    1/7  0    1/7  1/7  1/7  0    0
c1        2       3/7  1/7  0    0    0    0    0    2/7  0    1/7
c1        3       1/7  0    0    0    1/7  0    0    2/7  0    2/7
c1        4       0    2/7  0    0    0    0    0    1/7  0    2/7
c1        5       2/7  1/7  3/7  1/7  0    0    3/7  1/7  2/7  1/7
c1        6       0    2/7  1/7  0    2/7  0    1/7  0    1/7  0
c1        7       1/7  0    1/7  0    2/7  1/7  0    0    3/7  1/7
c1        8       0    1/7  2/7  1/7  0    4/7  1/7  0    1/7  0
c1        9       0    0    0    4/7  2/7  0    0    0    0    0
c1        10      0    0    0    0    0    2/7  1/7  0    0    0
c2        1       0    1/7  1/7  0    0    0    1/7  3/7  0    0
c2        2       2/7  0    0    0    0    0    3/7  3/7  0    1/7
c2        3       1/7  0    0    1/7  0    4/7  1/7  0    1/7  0
c2        4       0    0    0    1/7  0    0    0    1/7  1/7  0
c2        5       2/7  0    0    0    1/7  0    1/7  0    0    0
c2        6       0    1/7  1/7  1/7  2/7  0    1/7  0    1/7  0
c2        7       0    1/7  0    0    1/7  1/7  0    0    4/7  2/7
c2        8       1/7  1/7  2/7  3/7  2/7  2/7  0    0    0    3/7
c2        9       1/7  3/7  1/7  1/7  1/7  0    0    0    0    0
c2        10      0    0    2/7  0    0    0    0    0    0    1/7
c3        1       0    0    1/7  0    1/7  0    0    2/7  0    1/7
c3        2       0    0    0    0    0    0    3/7  1/7  0    2/7
c3        3       1/7  0    0    1/7  0    0    1/7  4/7  1/7  0
c3        4       3/7  0    0    0    0    1/7  1/7  0    2/7  0
c3        5       0    1/7  0    0    0    1/7  2/7  0    2/7  0
c3        6       1/7  0    0    0    0    0    0    0    0    2/7
c3        7       0    1/7  0    1/7  0    0    0    0    2/7  2/7
c3        8       1/7  2/7  0    2/7  3/7  2/7  0    0    0    0
c3        9       1/7  3/7  2/7  1/7  1/7  1/7  0    0    0    0
c3        10      0    0    4/7  2/7  2/7  2/7  0    0    0    0
c4        1       0    1/7  0    1/7  0    0    0    2/7  0    0
c4        2       0    0    0    0    0    0    0    0    1/7  0
c4        3       3/7  0    0    0    0    0    1/7  0    0    0
c4        4       0    0    0    0    0    0    0    1/7  1/7  0
c4        5       2/7  0    0    0    0    1/7  1/7  2/7  0    0
c4        6       0    0    0    0    1/7  1/7  0    1/7  3/7  3/7
c4        7       0    0    1/7  0    1/7  1/7  0    0    0    1/7
c4        8       1/7  2/7  4/7  0    3/7  2/7  3/7  1/7  1/7  1/7
c4        9       0    2/7  0    1/7  1/7  1/7  1/7  0    0    1/7
c4        10      1/7  2/7  2/7  5/7  1/7  1/7  1/7  0    1/7  1/7
the proposed method is illustrated with the multiplicative utility function of the four existing criteria. using expression (1), the aggregated utility function is obtained with the assumed weighting factors: k1 = 0.5, k2 = 0.2, k3 = 0.57, k4 = 0.09, k = -0.686. applying the numerical convolution of the four criteria probability functions, ten aggregated probability distributions are obtained, represented in figure 2.
[figure: cumulative probability (vertical axis, 0 to 1) versus grades (horizontal axis, 0 to 10) for alternatives a1 to a10]
fig. 2. aggregated probability distributions for ten different alternatives
using the stochastic dominance degree, the dominance matrix (8) is obtained. as explained in the appendix, the premise for calculating the sdd on a pair of alternatives is that an sd relation must exist on that pair. the matrix element sdd(i,j) represents the degree of dominance of alternative i over alternative j.
$$SDD = \begin{pmatrix}
0 & 0 & 0 & 0 & 0 & 0 & 0.05 & 0.30 & 0 & 0 \\
0.33 & 0 & 0 & 0 & 0 & 0 & 0.37 & 0.53 & 0.24 & 0.32 \\
0.46 & 0.19 & 0 & 0.03 & 0.05 & 0.06 & 0.49 & 0.62 & 0.35 & 0.45 \\
0.44 & 0.16 & 0 & 0 & 0.02 & 0.03 & 0.47 & 0.61 & 0.34 & 0.43 \\
0.43 & 0.14 & 0 & 0 & 0 & 0.01 & 0.46 & 0.60 & 0.31 & 0.42 \\
0.42 & 0.13 & 0 & 0 & 0 & 0 & 0.45 & 0.60 & 0.24 & 0.41 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.26 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0.17 & 0 & 0 & 0 & 0 & 0 & 0.22 & 0.42 & 0 & 0.16 \\
0.01 & 0 & 0 & 0 & 0 & 0 & 0.07 & 0.31 & 0 & 0
\end{pmatrix}, \qquad (8)$$
as the final step, the ranking of the alternatives is performed based on the values from the dominance matrix.
$$a_3 \succ a_4 \succ a_5 \succ a_6 \succ a_2 \succ a_9 \succ a_{10} \succ a_1 \succ a_7 \succ a_8, \qquad (9)$$
the power and flexibility of the proposed method are illustrated on the same example, with the additive utility function of the four existing criteria and the criterion weight vector w = [0.09; 0.55; 0.27; 0.09], as proposed in the original example in [23]. the comparison of the alternative ranking obtained from the previous matrix with the three already mentioned methods is given in table 3.
table 3. different alternative ranking methods comparison
method               ranking
proposed method      a3 ≻ a5 ≻ a4 ≻ a2 ≻ a6 ≻ a10 ≻ a9 ≻ a1 ≻ a7 ≻ a8
zhang et al.         a3 ≻ a2 ≻ a5 ≻ a4 ≻ a6 ≻ a10 ≻ a9 ≻ a1 ≻ a7 ≻ a8
zaras and martel's   a3, a4 ≻ a2, a5 ≻ a6, a10, a9 ≻ a1, a7 ≻ a8
nowak                a3 ≻ a2 ≻ a4, a5 ≻ a6 ≻ a9, a10 ≻ a1 ≻ a7 ≻ a8
the proposed method gives the same results as the method of zhang et al. [31]. however, instead of a pairwise comparison of alternatives for each individual criterion, the result is obtained in only the three steps explained above. the simulation was performed on an intel(r) xeon(r) cpu e526670 @ 2.90 ghz processor with 32 gb ram. the total simulation time was 1.3 s, which indicates the suitability of the method for real-time smart home applications.
5. concluding remarks
proper smart home design depends to a great extent on human judgment. in many practical applications, criteria in different stages of smart home design can be represented as random variables with an appropriate discrete probability distribution.
these applications include, but are not limited to, the scheduling of appliances in the presence of stochastic renewable production, control parameter selection and the choice of control strategy in an uncertain environment. in this paper, the problem of optimal design alternative selection has been solved with an enhanced smcdm methodology, based on numerical convolution of the criteria probability distribution functions according to the multiplicative aggregation form. the methodology is based on the multiplicative form of multi-attribute utility theory, which has proved suitable for modeling human behavior when facing opposing criteria. the ranking of the alternatives is performed by the stochastic dominance degree. because of the variety of key design criteria that every smart home should meet, and the trade-off between them in an uncertain environment, this method proved to be efficient, unlike the usual approach of pairwise comparison, which allows only the additive form of aggregation of individual criteria. in previous methodologies, the decision maker's risk attitude is taken into account only at the individual level of criterion comparison, while here this attitude can be directly incorporated in the model through different compensatory aggregators. together with the multiple uncertainties of evaluations and weighting factors, the problem of group decision making in smart home applications will be the focus of further research on the possible applications of this methodology.
acknowledgement: this work was supported by the ministry of education, science and technological development of the republic of serbia through mathematical institute sasa under grant iii 44006 and grant iii 42006.
references
[1] i. cardei, b. furth, and l. bradely, "design and technologies for implementing a smart educational building: case study", facta universitatis series: electronics and energetics, vol. 29, no. 3, pp. 325–338, 2016.
[2] j.
xiao and r. boutaba, "the design and implementation of an energy-smart home in korea", journal of computing science and engineering, vol. 7, no. 3, pp. 204-210, 2013.
[3] j. y. son, j. h. park, k. d. moon, and y. h. lee, "resource aware smart home management system by constructing resource relation graph", ieee transactions on consumer electronics, vol. 57, no. 3, pp. 1112-1119, 2011.
[4] d. m. han and j. h. lim, "smart home energy management system using ieee 802.15.4 and zigbee", ieee transactions on consumer electronics, vol. 56, no. 3, pp. 1403-1410, 2010.
[5] d. ding, r. a. cooper, p. f. pasquina, and l. fici-pasquina, "sensor technology for smart homes", maturitas, vol. 69, no. 2, pp. 131-136, 2011.
[6] d. h. stefanov, z. bien, and w. c. bang, "the smart house for older persons and persons with physical disabilities: structure, technology arrangements, and perspectives", ieee transactions on neural systems and rehabilitation engineering, vol. 12, no. 2, pp. 228-250, 2004.
[7] m. chan, e. campo, d. esteve, and j. fourniols, "smart homes - current features and future perspectives", maturitas, vol. 64, no. 2, pp. 90–96, 2009.
[8] t. gentry, "smart homes for people with neurological disability: state of the art", neurorehabilitation, vol. 25, no. 3, pp. 209–225, 2009.
[9] g. demiris and b. k. hensel, "technologies for an aging society: a systematic review of smart home applications", imia yearbook of medical informatics, vol. 3, no. 1, pp. 33–40, 2008.
[10] h. alwaera and d. j. clements-croomeb, "key performance indicators (kpis) and priority setting in using the multi-attribute approach for assessing sustainable intelligent buildings", building and environment, vol. 45, no. 4, pp. 799–807, 2010.
[11] z. chen, d. clements-croome, j. hong, h. li, and q. xu, "a multicriteria lifespan energy efficiency approach to intelligent building assessment", energy and buildings, vol. 38, no. 5, pp. 393–409, 2010.
[12] g. reynoso-mesa, x. blasco, j. sanchis, and m.
martinez, "controller tuning using evolutionary multiobjective optimisation: current trends and applications", control engineering practice, vol. 28, pp. 58–73, 2014.
[13] b. davidovic and a. labus, "a smart home system based on sensor technology", facta universitatis series: electronics and energetics, vol. 29, no. 3, pp. 451–460, 2016.
[14] c. beckel, h. serfas, e. zeeb, g. moritz, f. golatowski, and d. timmermann, "requirements for smart home applications and realization with ws4d-pipesbox", in proceedings of the 16th conference on emerging technologies & factory automation (etfa), toulouse, france, ieee, 2011.
[15] m. levin, a. andrushevich, a. klapproth, "improvement of building automation system", in proceedings of the international conference on industrial, engineering and other applications of applied intelligent systems iea/aie 2011: modern approaches in applied intelligence, pp. 459-468.
[16] p. stewart, j. c. zavala, and p.
fleming, "automotive drive by wire controller design by multi-objective techniques", control engineering practice, vol. 13, no. 2, pp. 257–264, 2005.
[17] janjic, s. savic, g. janackovic, m. stankovic, and l. velimirovic, "multi-criteria assessment of the smart grid efficiency using the fuzzy analytic hierarchy process", facta universitatis series: electronics and energetics, vol. 29, no. 4, pp. 631–646, 2016.
[18] m. prýme, a. horák, l. prokop, s. misak, "smart home modeling with real appliances", in proceedings of the international joint conference soco'13-cisis'13-iceute'13, pp. 369-378.
[19] j. martel and g. d'avignon, "projects ordering with multicriteria analysis", european journal of operational research, vol. 10, no. 1, pp. 56–69, 1982.
[20] d. wu and d. l. olson, "a comparison of stochastic dominance and stochastic dea for vendor evaluation", international journal of production research, vol. 46, no. 8, pp. 2313-2327, 2008.
[21] r. lahdelma and p. salminen, "stochastic multicriteria acceptability analysis using the data envelopment model", european journal of operational research, vol. 170, no. 1, pp. 241–252, 2006.
[22] durbach, "the use of the smaa acceptability index in descriptive decision analysis", european journal of operational research, vol. 196, no. 3, pp. 1229–1237, 2009.
[23] zaras and j. martel, multiattribute analysis based on stochastic dominance, models and experiments in risk and rationality, kluwer academic publishers, dordrecht, 1994, pp. 225–248.
[24] zaras, "rough approximation of a preference relation by a multi-attribute dominance for deterministic, stochastic and fuzzy decision problems", european journal of operational research, vol. 159, no. 1, pp. 196–206, 2004.
[25] nowak, "aspiration level approach in stochastic mcdm problems", european journal of operational research, vol. 177, no. 3, pp. 1626–1640, 2007.
[26] t.
stewart, "simplified approaches for multicriteria decision making under uncertainty", journal of multi-criteria decision analysis, vol. 4, no. 4, pp. 246–258, 1995.
[27] janjic, a. andjelkovic, m. docic, "multi-attribute risk assessment using stochastic dominance", international journal of economics and statistics, vol. 1, no. 3, pp. 105-112, 2013.
[28] r. keeney and h. raiffa, decisions with multiple objectives: preferences and value tradeoffs, john wiley & sons, new york, 1976.
[29] r. williamson and t. downs, "probabilistic arithmetic: numerical methods for calculating convolutions and dependency bounds", international journal of approximate reasoning, vol. 4, no. 1, pp. 89-158, 1990.
[30] y. zhang, z. p. fan, and y. liu, "a method based on stochastic dominance degrees for stochastic multiple criteria decision making", computers and industrial engineering, vol. 58, no. 1, pp. 544–552, 2010.
[31] c. c. huang, d. kira, i. vertinsky, "stochastic dominance rules for multi-attribute utility functions", the review of economic studies, vol. 45, no. 3, pp. 611-615, 1978.
appendix
stochastic dominance
in order to determine whether a relation of stochastic dominance holds between two distributions, the distributions are characterized by their cumulative distribution functions, or cdfs. suppose that we consider two distributions a and b, characterized respectively by the cdfs $F_A$ and $F_B$. then distribution b dominates distribution a stochastically at first order if, for any argument y, $F_A(y) \ge F_B(y)$. the sd rules can be fundamentally classified into two groups for two classes of utility functions. the first group is for increasing concave utility functions and includes first degree stochastic dominance, second degree stochastic dominance and third degree stochastic dominance.
these rules can be applied for modeling risk averse preferences.
definition 1. let a and b (a < b) be two real numbers, x and y be two random variables, and $F(x)$ and $G(x)$ be the cumulative distribution functions of x and y, respectively. let $U_1$ include all the utility functions u for which $u' \ge 0$, $U_2$ include all the functions u for which $u' \ge 0$ and $u'' \le 0$, and $U_3$ include all the functions u for which $u' \ge 0$, $u'' \le 0$ and $u''' \ge 0$. let $E_F$ and $E_G$ be the two expectations, or means, respectively. let $SD_1$, $SD_2$ and $SD_3$ denote first, second and third degree stochastic dominance, respectively. the sd rules are:
$F(x)\; SD_1\; G(x)$ if and only if $E_F(u(X)) \ge E_G(u(Y))$ for all $u \in U_1$ with strict inequality for some u, or $F(x) \le G(x)$ for all $x \in [a, b]$ with strict inequality for some x;
$F(x)\; SD_2\; G(x)$ if and only if $E_F(u(X)) \ge E_G(u(Y))$ for all $u \in U_2$ with strict inequality for some u, or $\int_a^x F(t)\,dt \le \int_a^x G(t)\,dt$ for all $x \in [a, b]$ with strict inequality for some x;
$F(x)\; SD_3\; G(x)$ if and only if $E_F(u(X)) \ge E_G(u(Y))$ for all $u \in U_3$ with strict inequality for some u, or $E_F(X) \ge E_G(Y)$ and $\int_a^x\!\int_a^t F(z)\,dz\,dt \le \int_a^x\!\int_a^t G(z)\,dz\,dt$ for all $x \in [a, b]$ with strict inequality for some x.
the second group of sd rules is for increasing convex utility functions and includes first degree stochastic dominance, second inverse stochastic dominance, third inverse stochastic dominance of the first type and third inverse stochastic dominance of the second type. these rules are equivalent to the expected utility maximization rule for risk-seeking preferences.
definition 2. in [30], an sd degree is defined in the following way: if $F(x)\; SD_h\; G(x)$, $h \in \{1, 2, 3\}$, then the stochastic dominance degree (sdd) of $F(x)\; SD_h\; G(x)$ is given by:
$$SDD\bigl(F(x)\; SD_h\; G(x)\bigr) = \frac{\int_{\Omega} [G(x) - F(x)]\,dx}{\int_{\Omega} G(x)\,dx}, \qquad h \in \{1, 2, 3\}, \quad \Omega = \{x \mid x \in [a, b]\},$$
both sd rules and sd degrees are used in the proposed methodology.
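for discrete distributions on a common unit grid of grades, the first and second degree rules of definition 1 reduce to element-wise comparisons. the sketch below is an illustration under that assumption and is not taken from [30]; the function names `fsd` and `ssd` are hypothetical, and the input data are the three criteria of table 1.

```python
from fractions import Fraction
from itertools import accumulate

def cdf(counts, total=7):
    """CDF on grades 1..10 from evaluator counts per grade."""
    return list(accumulate(Fraction(n, total) for n in counts))

def fsd(F, G):
    """F SD1 G: F(x) <= G(x) for all x, with strict inequality for some x."""
    pairs = list(zip(F, G))
    return all(f <= g for f, g in pairs) and any(f < g for f, g in pairs)

def ssd(F, G):
    """F SD2 G on a unit grid.

    The integral of a step CDF is piecewise linear, so comparing the
    running sums of F and G at the grid points is sufficient.
    """
    pairs = list(zip(accumulate(F), accumulate(G)))
    return all(a <= b for a, b in pairs) and any(a < b for a, b in pairs)

# table 1 (counts of evaluators out of 7 for grades 1..10)
F1 = cdf([0, 0, 3, 0, 2, 0, 0, 1, 0, 1])   # c1
F2 = cdf([2, 0, 0, 1, 2, 1, 0, 1, 0, 0])   # c2
F3 = cdf([0, 1, 0, 1, 0, 3, 0, 1, 0, 1])   # c3

print("c3 SD1 c2:", fsd(F3, F2))
print("c1 SD1 c2:", fsd(F1, F2))
print("c1 SD2 c2:", ssd(F1, F2))
```

in this data, c1 does not dominate c2 at first degree because its cdf is higher at grade 3, but it does at second degree, which illustrates why the higher-degree rules are needed to rank distributions whose cdfs cross.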
according to [29], the classes $U_i$ (i = 1, 2, 3) are identical to the following classes:
$$U_i^{*} = \bigl\{\, u \mid u(x_1, x_2, \dots, x_n) = u^s(p(x_1, x_2, \dots, x_n)),\; u^s \in U_i \text{ and } p \in U_i \,\bigr\},$$
for each i = 1, 2, 3, where $u^s$ is a single attribute utility function and $p = p(x_1, x_2, \dots, x_n)$ is a multivariate function, and $U_i^{*} = U_i$ for i = 1, 2, 3.
facta universitatis series: electronics and energetics vol. x, no x, x 2018, pp. x
energy-efficient cryptographic primitives
elena dubrova∗
royal institute of technology (kth), stockholm, sweden
abstract: our society greatly depends on services and applications provided by mobile communication networks. as billions of people and devices become connected, it becomes increasingly important to guarantee the security of the interactions of all players. in this talk we address several aspects of this important, many-sided problem. first, we show how to design cryptographic primitives which can assure the integrity and confidentiality of transmitted messages while satisfying the resource constraints of low-end, low-cost wireless devices such as sensors or rfid tags. second, we describe countermeasures which can enhance the resistance of hardware implementing cryptographic algorithms to hardware trojans.
keywords: security, lightweight cryptography, cryptographic primitive, encryption, message authentication, hardware trojan.
1 introduction
today minimal or no security is typically provided to low-end, low-cost wireless devices such as sensors or rfid tags, in the conventional belief that the information they gather is of little concern to attackers. however, case studies have shown that a compromised sensor can be used as a stepping stone to mount an attack on a wireless network.
for example, in the attack
manuscript received x x, x
corresponding author: elena dubrova, royal institute of technology (kth), stockholm, sweden (e-mail: dubrova@kth.se)
∗an earlier version of this paper was presented as an invited address at the reed-muller 2017 workshop, novi sad, serbia, may 24-25, 2017.
2 s. stojković, d. veličković and c. moraga
1 introduction
decision diagrams (dds) are a compact data structure for representing discrete functions. bryant showed their canonicity in 1986 in [1], and since then they have been applied in many areas in which discrete functions are used: hardware design, hardware testing, signal processing, etc. the complexity of the designed hardware, or of the computations performed on decision diagrams, is directly proportional to the size of the diagram. a main disadvantage of decision diagrams is that their size depends on the order of the variables used in the diagram. optimization of the dd size is a frequently studied problem. algorithms for dd optimization can be classified into two categories: exact algorithms and heuristic algorithms. the basic exact algorithm is a brute-force algorithm that creates the diagrams for all possible orders of variables and chooses the best one. a slightly improved exact algorithm is presented in [2]. however, all exact algorithms are very slow and inapplicable to functions with a large number of variables. the most widely used heuristic algorithm for dd optimization is rudell's sifting algorithm, proposed in [3]. the main idea of that algorithm is to sift each variable through all levels of the diagram and choose its optimal position. a genetic algorithm is a heuristic algorithm that can be applied to different optimization problems. the use of a genetic algorithm in dd optimization was first discussed in [4]. after that, genetic algorithms for dd optimization were improved in many papers ([5–13]). some of them optimize the dd size ([4–11]).
in [12] the number of 1-paths is optimized, and in [13] a method for both of these optimizations is proposed. in this paper, we present a genetic algorithm for the optimization of bdds and fdds. our main goal was minimization of the fdd size, because we use fdds in reversible synthesis (see for example [14], [15]). for comparison, we also include results on the minimization of the size of bdds. in applications of fdd-based reversible synthesis, the complexity of the generated network depends directly on the fdd size. an additional problem in fdd usage is that the size depends on the decomposition rules used in the nodes. in an fdd, either the positive or the negative davio decomposition can be used in each node. usually, the same decomposition is used in all nodes of the same level. it follows that for one variable order, $2^n$ different fdds can be created. an exact algorithm in that case would have to check $2^n \cdot n!$ cases, which is impossible for a large number of variables. another group
facta universitatis series: electronics and energetics vol. 31, no 2, june 2018, pp. 169-187 https://doi.org/10.2298/fuee1802169s
suzana stojković1, darko veličković1, claudio moraga2
received october 21, 2017; received in revised form january 30, 2018
corresponding author: suzana stojkovic, faculty of electronic engineering, university of niš, medvedeva 14, 18000 niš, serbia (e-mail: suzana.stojkovic@elfak.ni.ac.rs)
*an earlier version of this paper was presented as an invited address at the reed-muller 2017 workshop, novi sad, serbia, may 24-25, 2017
facta universitatis series: electronics and energetics vol. 28, no 4, december 2015, pp.
507-525
DOI: 10.2298/FUEE1504507S
Horizontal Current Bipolar Transistor (HCBT) – a low-cost, high-performance flexible BiCMOS technology for RF communication applications
Tomislav Suligoj1, Marko Koričić1, Josip Žilak1, Hidenori Mochizuki2, So-ichi Morita2, Katsumi Shinomura2, Hisaya Imai2
1University of Zagreb, Faculty of Electrical Engineering and Computing, Department of Electronics, Micro and Nano-electronics Laboratory, Croatia
2Asahi Kasei Microdevices Co. 5-4960, Nobeoka, Miyazaki, 882-0031, Japan
Abstract. In an overview of the Horizontal Current Bipolar Transistor (HCBT) technology, state-of-the-art integrated silicon bipolar transistors are described. They exhibit fT and fmax of 51 GHz and 61 GHz and an fT·BVCEO product of 173 GHzV, which places them among the highest-performance implanted-base silicon bipolar transistors. HCBT is integrated with CMOS in a considerably lower-cost fabrication sequence than standard vertical-current bipolar transistors, with only 2 or 3 additional masks and fewer process steps. Due to its specific structure, the charge sharing effect can be employed to increase BVCEO without sacrificing fT and fmax. Moreover, the electric field can be engineered just by manipulating the lithography masks, yielding high-voltage HCBTs with breakdown voltages up to 36 V integrated in the same process flow with the high-speed devices, i.e. at zero additional cost. A double-balanced active mixer circuit is designed and fabricated in HCBT technology. A maximum IIP3 of 17.7 dBm at a mixer current of 9.2 mA and a conversion gain of -5 dB are achieved.
Key words: BiCMOS technology, bipolar transistors, horizontal current bipolar transistor, radio frequency integrated circuits, mixer, high-voltage bipolar transistors.
1. Introduction
In the highly competitive wireless communication markets, RF circuits and systems are fabricated in technologies that are very cost-sensitive.
In order to minimize the fabrication costs, sub-10 GHz applications can be processed by using high-volume silicon technologies. It has been identified that the optimum solution might
Received March 9, 2015
Corresponding author: Tomislav Suligoj, University of Zagreb, Faculty of Electrical Engineering and Computing, Department of Electronics, Micro and Nano-electronics Laboratory, Croatia (e-mail: tom@zemris.fer.hr)
Genetic algorithm for binary and functional decision diagrams optimization*
1Faculty of Electronic Engineering, University of Niš, Niš, Serbia
2Department of Computer Science, TU Dortmund University, Dortmund, Germany
Abstract. Decision diagrams (DDs) are a widely used data structure for the representation of discrete functions. The major problem in DD-based applications is DD size minimization (reduction of the number of nodes), because the size depends on the variable order. Genetic algorithms are often used in different optimization problems, including DD size optimization. In this paper, we apply a genetic algorithm to minimize the size of both binary decision diagrams (BDDs) and functional decision diagrams (FDDs). In both cases, the proposed algorithm uses a bottom-up partially matched crossover (BU-PMX) as the crossover operator. In the case of BDDs, mutation is done in the standard way, by exchanging variables. In the case of FDDs, mutation by changing the polarity of variables is additionally used. Experimental results of the optimization of the BDDs and FDDs of a set of benchmark functions are also presented.
Key words: binary decision diagrams, functional decision diagrams, decision diagrams optimization, genetic algorithm.
1 Introduction
Decision diagrams (DDs) are a compact data structure for the representation of discrete functions.
Bryant showed their canonicity in 1986 in [1], and since then they have been applied in many areas in which discrete functions are used: hardware design, hardware testing, signal processing, etc. The complexity of the designed hardware, or of the computations performed on decision diagrams, is directly proportional to the size of the diagram. A main disadvantage of decision diagrams is that their size depends on the order of the variables used in the diagram.
Optimization of the DD size is a frequently addressed problem. Algorithms for DD optimization can be classified into two categories: exact algorithms and heuristic algorithms. The basic exact algorithm is a brute-force algorithm that creates the diagrams for all possible variable orders and chooses the best one. A slightly improved exact algorithm is presented in [2]. However, all exact algorithms are very slow and inapplicable to functions with a large number of variables. The most widely used heuristic algorithm for DD optimization is Rudell's sifting algorithm, proposed in [3]. The main idea of that algorithm is to sift each variable through all levels of the diagram and to choose its optimal position.
A genetic algorithm is a heuristic algorithm that can be applied to different optimization problems. The use of genetic algorithms in DD optimization was first discussed in [4]. Since then, genetic algorithms for DD optimization have been improved in many papers [5–13]. Some of them optimize the DD size [4–11]. In [12] the number of 1-paths is optimized, and in [13] a method covering both of these optimizations is proposed.
In this paper, we present a genetic algorithm for the optimization of BDDs and FDDs. Our main goal was the minimization of the FDD size, because we use FDDs in reversible synthesis (see for example [14], [15]). For comparison, we also include results on the minimization of the size of BDDs.
In applications of FDD-based reversible synthesis, the complexity of the generated network depends directly on the FDD size. An additional problem in FDD usage is that the size depends on the decomposition rules used in the nodes. In each node of an FDD, either the positive or the negative Davio decomposition can be used. Usually, the same decomposition is used in all nodes of the same level. It follows that for one variable order, $2^n$ different FDDs can be created. An exact algorithm in that case should check $2^n \cdot n!$ cases, which is impossible for a large number of variables. Another group of minimization algorithms, the sifting algorithms, analyze only the variable order. For that reason, we chose to try a genetic algorithm.
In the presented algorithm, for both BDDs and FDDs, a modified PMX crossover operator (BU-PMX, bottom-up partially matched crossover) and mutation by variable exchange are used. In the case of FDD optimization, mutation by polarity change is additionally used.
The paper is organized as follows. Section 2 contains the most important definitions related to decision diagrams. Section 3 presents the general idea of genetic algorithms and their specifics when applied to DD optimization. Section 4 describes the algorithm for BDD and FDD optimization and the genetic operations used in it. Section 5 discusses experimental results, and Section 6 gives some concluding remarks.
2 Decision diagrams
Definition 1 (Binary decision tree) A binary decision tree (BDT) representing a Boolean function f is the binary tree created by the recursive application of the Shannon decomposition rule:
$f = \bar{x}_k \cdot f(x_k = 0) \oplus x_k \cdot f(x_k = 1)$ (1)
Definition 2 (Terminal and nonterminal nodes) A BDT contains two types of nodes: nonterminal and terminal. A nonterminal node represents one decomposition and has an associated decision variable.
A terminal node contains the value of the function.
Definition 3 (Level in BDT) A level in the BDT is a set of nonterminal nodes with the same decision variable, or the set of terminal nodes.
Definition 4 (Functional decision tree) A functional decision tree (FDT) representing a Boolean function f is the binary tree created by the recursive application of the positive (2) or negative (3) Davio decomposition rule:
$f = f(x_k = 0) \oplus x_k \cdot (f(x_k = 0) \oplus f(x_k = 1))$ (2)
$f = \bar{x}_k \cdot (f(x_k = 0) \oplus f(x_k = 1)) \oplus f(x_k = 1)$ (3)
Definition 5 (Fixed polarity functional decision tree) A functional decision tree in which the same decomposition is used in each node of the same level is called a fixed polarity functional decision tree (FPFDT).
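As a minimal illustration of decomposition rules (1) to (3), the following Python sketch checks on a small truth table that all three rules reproduce the original function. The helper names and the three-variable test function are assumptions introduced here for illustration only; they are not taken from the paper.

```python
from itertools import product

def cofactor(f, k, value):
    """Return f with variable k fixed to `value` (inputs are tuples of 0/1)."""
    return lambda x: f(x[:k] + (value,) + x[k + 1:])

def shannon(f, k):
    """Rule (1): f = not(x_k) * f(x_k=0) XOR x_k * f(x_k=1)."""
    f0, f1 = cofactor(f, k, 0), cofactor(f, k, 1)
    return lambda x: ((1 - x[k]) & f0(x)) ^ (x[k] & f1(x))

def positive_davio(f, k):
    """Rule (2): f = f(x_k=0) XOR x_k * (f(x_k=0) XOR f(x_k=1))."""
    f0, f1 = cofactor(f, k, 0), cofactor(f, k, 1)
    return lambda x: f0(x) ^ (x[k] & (f0(x) ^ f1(x)))

def negative_davio(f, k):
    """Rule (3): f = not(x_k) * (f(x_k=0) XOR f(x_k=1)) XOR f(x_k=1)."""
    f0, f1 = cofactor(f, k, 0), cofactor(f, k, 1)
    return lambda x: ((1 - x[k]) & (f0(x) ^ f1(x))) ^ f1(x)

def same_function(g, h, n):
    """Compare two n-variable functions on every input assignment."""
    return all(g(x) == h(x) for x in product((0, 1), repeat=n))

def example(x):
    """Hypothetical test function of three variables: x0*x1 + x2."""
    return (x[0] & x[1]) | x[2]

# Apply each rule to each variable and compare with the original function.
checks = [same_function(rule(example, k), example, 3)
          for rule in (shannon, positive_davio, negative_davio)
          for k in range(3)]
print(all(checks))
```

Applying one of these rules recursively, one variable per level, yields the BDT (rule (1)) or the FDT (rules (2) and (3)) of the definitions above.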
Definition 6 (Polarity vector) The polarity vector of the FPFDT is a bit vector which defines the types of the decompositions used in the levels: 0 denotes that the positive Davio decomposition is used, 1 denotes the negative one.
Definition 7 (Positive-polarity FDT) An FDT in which the positive Davio decomposition is used at all levels is a positive-polarity FDT.
Definition 8 (Binary decision diagram) A BDT is transformed into a binary decision diagram (BDD) by using the following reduction rules:
1. Share the isomorphic sub-trees: if there are two terminal nodes with the same value, or two nonterminal nodes with isomorphic sub-trees, one of them is deleted. Its incoming edges are redirected to the remaining node.
2. Eliminate the redundant nodes: if both outgoing edges of a nonterminal node point to the same sub-tree, this node is redundant and it is deleted. Its incoming edges are redirected to the common sub-tree.
Definition 9 (Functional decision diagram) An FDT is transformed into an FDD by using reduction rule 1 above and the following 0-suppress reduction rules:
2.1 If the right outgoing edge of a positive Davio node points to 0, the node is deleted. The edges pointing to the deleted node are redirected to its left sub-tree.
2.2 If the left outgoing edge of a negative Davio node points to 0, the node is deleted. The edges pointing to the deleted node are redirected to its right sub-tree.
Example 1 Figure 1 shows the BDD (a) and the positive-polarity FDD (b) of the function $f(x_1, x_2, x_3, x_4) = \bar{x}_1 \cdot \bar{x}_2 + x_1 \cdot x_2 + x_3 + x_4$.
Definition 10 (DD size) The DD size is equal to the number of nonterminal nodes.
Example 2 Figure 2 shows the BDDs of the function $f(x_1, \ldots, x_6) = x_1x_2 + x_3x_4 + x_5x_6$ for the variable orders (a) $(x_1, x_2, x_3, x_4, x_5, x_6)$ and (b) $(x_1, x_3, x_5, x_2, x_4, x_6)$. The size of the first BDD is 6, but the size of the second is 14.
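The dependence of the BDD size on the variable order can be checked with a short Python sketch. It does not build the diagram explicitly; instead it counts, level by level, the distinct subfunctions that really depend on the variable of that level, which under reduction rules 1 and 2 of Definition 8 equals the number of nonterminal nodes. The function name `bdd_size` and the truth-table representation are assumptions made for this illustration. The second order used below, (x1, x3, x5, x2, x4, x6), separates the two variables of every product term.

```python
from itertools import product

def bdd_size(f, order):
    """Number of nonterminal nodes of the reduced ordered BDD of f.

    `f` takes a dict {variable index: 0/1}; `order` lists the variable
    indices from the root level to the bottom level.
    """
    # Truth table with the first variable of `order` as the most significant bit.
    rows = [f(dict(zip(order, bits)))
            for bits in product((0, 1), repeat=len(order))]
    level = {tuple(rows)}            # distinct subfunctions at the current level
    size = 0
    for _ in order:
        next_level = set()
        for table in level:
            half = len(table) // 2
            low, high = table[:half], table[half:]   # cofactors for 0 and 1
            if low == high:
                next_level.add(low)  # redundant node: no node at this level
            else:
                size += 1            # one shared node per distinct subfunction
                next_level.update((low, high))
        level = next_level
    return size

def f(v):
    """f = x1*x2 + x3*x4 + x5*x6."""
    return (v[1] & v[2]) | (v[3] & v[4]) | (v[5] & v[6])

print(bdd_size(f, (1, 2, 3, 4, 5, 6)))  # 6
print(bdd_size(f, (1, 3, 5, 2, 4, 6)))  # 14
```

The enumeration of the full truth table limits this sketch to small functions; it is meant only to make the size difference reproducible, not to replace a BDD package.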
Fig. 1: BDD (a) and FDD (b) of the function from Example 1.
In the general case, for the function of 2n variables $f(x_1, x_2, \ldots, x_{2n-1}, x_{2n}) = x_1x_2 + \cdots + x_{2n-1}x_{2n}$, the size of the BDD with the variable order $(x_1, x_2, \ldots, x_{2n-1}, x_{2n})$ is 2n, and with the variable order $(x_1, x_3, \ldots, x_{2n-1}, x_2, x_4, \ldots, x_{2n})$ it is $O(2^{n-1})$.
Fig. 2: BDDs of the function from Example 2 for two different variable orders.
Besides the variable order, the size of fixed polarity FDDs also depends on the polarities of the variables.
Example 3 Figure 3 shows the FDDs of the function from Example 1 for the polarity vectors (a) $f = [1\ 1\ 1\ 1]^T$ and (b) $f = [0\ 1\ 0\ 1]^T$. The size of the diagram in the first case is 4, and in the second case it is 6.
3 Genetic algorithm
A genetic algorithm is a method for solving different optimization problems based on an analogy to the process of natural selection. In this algorithm, a solution of the problem is represented as an array called a chromosome. An element of the chromosome is a gene. In general, the initial set of chromosomes is generated randomly, and then each new generation is created by using two genetic operations: crossover and mutation. The crossover operator defines how child chromosomes are created by combining the genes of the parent chromosomes. In practice, one-point crossover (Fig. 4(a)) and two-point crossover (Fig. 4(b)) are usually chosen. Mutation is often realized by changing the value of the gene at a selected position. The measure of the quality of a solution (chromosome) is called the fitness score. Fitness scores are used to compute the probabilities of selecting chromosomes as parents for the next generation, and of selecting the chromosomes that will die after an iteration.
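The standard operators mentioned above can be sketched in a few lines of Python. This is a generic illustration on binary chromosomes, assuming list-based chromosomes and explicitly given crossover points; the function names are chosen here and the exact layout of Fig. 4 is not reproduced.

```python
def one_point_crossover(parent1, parent2, cut):
    """Swap the tails of two equally long chromosomes after position `cut`."""
    child1 = parent1[:cut] + parent2[cut:]
    child2 = parent2[:cut] + parent1[cut:]
    return child1, child2

def two_point_crossover(parent1, parent2, first, second):
    """Swap the middle segments [first, second) of two chromosomes."""
    child1 = parent1[:first] + parent2[first:second] + parent1[second:]
    child2 = parent2[:first] + parent1[first:second] + parent2[second:]
    return child1, child2

def flip_mutation(chromosome, position):
    """Change the value of the binary gene at `position`."""
    mutated = list(chromosome)
    mutated[position] = 1 - mutated[position]
    return mutated

a = [0, 0, 0, 0, 0, 0]
b = [1, 1, 1, 1, 1, 1]
print(one_point_crossover(a, b, 2))     # ([0, 0, 1, 1, 1, 1], [1, 1, 0, 0, 0, 0])
print(two_point_crossover(a, b, 2, 4))  # ([0, 0, 1, 1, 0, 0], [1, 1, 0, 0, 1, 1])
print(flip_mutation(a, 3))              # [0, 0, 0, 1, 0, 0]
```

In a complete genetic algorithm the crossover points and the mutation position would be drawn at random and the parents selected according to their fitness scores; those parts are omitted here.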
Fig. 3: FDDs of the function from Example 1 for two different polarity vectors.
Fig. 4: One-point (a) and two-point (b) crossover operators.
To define a genetic algorithm for a concrete optimization problem means to define the type of the genes, the fitness function and the genetic operations.
3.1 Genetic algorithm for BDD size optimization
One chromosome in a BDD optimization problem is one order of the input variables, i.e. one permutation of the integers from the interval [1, n]. It follows that the standard genetic operators cannot be used, since they may produce chromosomes in which some variables are repeated and others are missing. Because of that, special genetic operators are defined for BDD optimization. The crossover operators discussed in this section are: order crossover [10], [11], cyclic crossover [10], [11], partially matched crossover [4], [10], [11] and alternating crossover [7].
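A small Python sketch illustrates why the standard operators fail on permutation chromosomes. The two parent orders and the helper `is_permutation` are hypothetical and chosen only for this illustration; the exchange of two genes is shown for contrast as a mutation that keeps the chromosome valid.

```python
def is_permutation(chromosome, n):
    """True if the chromosome contains each integer 1..n exactly once."""
    return sorted(chromosome) == list(range(1, n + 1))

p1 = [1, 2, 3, 4, 5]
p2 = [3, 5, 1, 2, 4]

# Standard one-point crossover after position 2: variables 1 and 2 are
# repeated, variables 3 and 5 are lost.
child = p1[:2] + p2[2:]

# Standard mutation: overwriting one gene duplicates an existing variable.
overwritten = list(p1)
overwritten[0] = 4

# Exchanging two genes keeps the chromosome a valid variable order.
exchanged = list(p1)
exchanged[0], exchanged[3] = exchanged[3], exchanged[0]

print(child, is_permutation(child, 5))              # [1, 2, 1, 2, 4] False
print(overwritten, is_permutation(overwritten, 5))  # [4, 2, 3, 4, 5] False
print(exchanged, is_permutation(exchanged, 5))      # [4, 2, 3, 1, 5] True
```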
Algorithm 1 (Cyclic crossover operator CX):
Step 1. Create a cycle of genes defined by corresponding positions in the parent chromosomes, starting from the first unused gene of the first parent.
Step 2. Copy the genes of the cycle from one parent into the first child and from the other parent into the second child.
Step 3. Repeat steps 1 and 2, alternating the target child into which the genes of each parent are copied.
Example 4 Consider the following parents:
p1 = [1 2 3 4 5 6 7 8 9 10]
p2 = [5 4 6 9 2 8 3 7 1 10]
The first cycle of genes is created starting from gene 1 of the first parent. At the corresponding position in the second parent is gene 5. Then we find gene 5 in the first parent; at the corresponding position in the second parent is gene 2. The process continues until the cycle is closed. The created cycle is 1 → 5 → 2 → 4 → 9 → 1. The child chromosomes after placing the first cycle are:
c1 = [1 2 _ 4 5 _ _ _ 9 _]
c2 = [5 4 _ 9 2 _ _ _ 1 _]
The second cycle is created starting from gene 3: 3 → 6 → 8 → 7 → 3. The child chromosomes after placing the second cycle are:
c1 = [1 2 6 4 5 8 3 7 9 _]
c2 = [5 4 3 9 2 6 7 8 1 _]
The last cycle contains only gene 10, and the final child chromosomes are:
c1 = [1 2 6 4 5 8 3 7 9 10]
c2 = [5 4 3 9 2 6 7 8 1 10]
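The cyclic crossover of Algorithm 1 can be sketched in Python as follows. The function name and the list representation are assumptions of this sketch; the parents are those of Example 4, so the printed children can be compared with the ones derived by hand there.

```python
def cyclic_crossover(p1, p2):
    """Cyclic crossover (CX) of two permutations of the same genes."""
    n = len(p1)
    position_in_p1 = {gene: i for i, gene in enumerate(p1)}
    c1, c2 = [None] * n, [None] * n
    swap = False                      # alternates the target child per cycle
    for start in range(n):
        if c1[start] is not None:     # position already belongs to a cycle
            continue
        i = start
        while c1[i] is None:          # follow the cycle until it closes
            if swap:
                c1[i], c2[i] = p2[i], p1[i]
            else:
                c1[i], c2[i] = p1[i], p2[i]
            i = position_in_p1[p2[i]]
        swap = not swap
    return c1, c2

p1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
p2 = [5, 4, 6, 9, 2, 8, 3, 7, 1, 10]
c1, c2 = cyclic_crossover(p1, p2)
print(c1)  # [1, 2, 6, 4, 5, 8, 3, 7, 9, 10]
print(c2)  # [5, 4, 3, 9, 2, 6, 7, 8, 1, 10]
```

Because every position receives a gene from one of the parents at that same position, and whole cycles are copied together, both children are again valid permutations.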
example 5. let the parent variable orders be given by the arrays:
p1 = [1 2 3 | 4 5 6 | 7 8 9 10]
p2 = [6 7 4 2 3 10 9 5 1 8]
the crossover points are marked in the first parent. after step 2, the child chromosome is:
c = [_ _ _ | 4 5 6 | _ _ _ _]
after deleting the genes that are already in the child (4, 5 and 6), the second parent is:
p2 = [7 2 3 10 9 1 8]
finally, after placing the remaining genes of the second parent, the generated child is:
c = [7 2 3 | 4 5 6 | 10 9 1 8]

algorithm 3 (partially matched crossover operator pmx):
step 1. perform a two-point crossover.
step 2.
create the mapping table for the genes from the central part of one parent that do not appear in the central part of the second parent. the mapping pair of the gene at position i of the first parent (p1[i]) is the gene at the same position in the other parent (p2[i]) if p2[i] is not in the central part of the first parent; otherwise, if p2[i] = p1[j], the mapping pair of p1[i] is the mapping pair of the gene p1[j].
step 3. eliminate duplicated genes in the child chromosomes so that the central part of each chromosome remains unchanged: if a gene from the central part appears again in another part, replace it by its mapping pair.

example 6. let the parent variable orders be given by the arrays:
p1 = [9 8 4 | 5 2 7 | 1 3 6 10]
p2 = [8 7 1 | 2 3 10 | 9 5 4 6]
let the two-point crossover be performed with the crossover points 3 and 6. the resulting child chromosomes are:
c′1 = [9 8 4 | 2 3 10 | 1 3 6 10]
c′2 = [8 7 1 | 5 2 7 | 9 5 4 6]
the mapping table is created as follows:
p2[4] = 2 exists in the central part of p1, so it is not mapped.
p2[5] = 3 → 2 → 5. the pair (3, 5) is added to the mapping table.
p2[6] = 10 → 7. the pair (10, 7) is added to the mapping table.
the resulting child chromosomes after duplicate elimination are:
c1 = [9 8 4 | 2 3 10 | 1 5 6 7]
c2 = [8 10 1 | 5 2 7 | 9 3 4 6]

algorithm 4 (alternating crossover operator ax): create the child chromosome by taking genes alternately from the first and the second parent. before storing a gene in the child chromosome, check whether it already exists there; if it does, skip it.

example 7. let the alternating crossover be performed over the same parents as in the previous example. the resulting child chromosome is:
c = [9 8 7 4 1 5 2 3 10 6]
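the pmx and ax operators (algorithms 3 and 4) can be sketched in python as shown below. the sketch is an illustration under stated assumptions rather than the authors' code: the names pmx and alternating_crossover are hypothetical, positions are 0-based, and the central part is the half-open slice [lo, hi). the mapping table is stored as a dictionary and a duplicate is resolved by following the mapping chain until a gene outside the central part is reached.

```python
def pmx(p1, p2, lo, hi):
    """Partially matched crossover; the central parts [lo, hi) are exchanged."""
    def make_child(outer, inner):
        # inner supplies the central part, outer supplies all other positions
        segment = inner[lo:hi]
        mapping = dict(zip(segment, outer[lo:hi]))
        child = list(outer)
        child[lo:hi] = segment
        for i in list(range(lo)) + list(range(hi, len(outer))):
            gene = outer[i]
            while gene in mapping:  # duplicate of a central gene: follow the chain
                gene = mapping[gene]
            child[i] = gene
        return child

    return make_child(p1, p2), make_child(p2, p1)


def alternating_crossover(p1, p2):
    """Alternating crossover (AX): take genes in turn, skipping repeated ones."""
    child, seen = [], set()
    for a, b in zip(p1, p2):
        for gene in (a, b):
            if gene not in seen:
                seen.add(gene)
                child.append(gene)
    return child


# parents of examples 6 and 7
pa = [9, 8, 4, 5, 2, 7, 1, 3, 6, 10]
pb = [8, 7, 1, 2, 3, 10, 9, 5, 4, 6]
print(pmx(pa, pb, 3, 6))
print(alternating_crossover(pa, pb))
```

the chain in pmx always terminates, because the mapping is one-to-one and the gene it starts from lies outside the central part of the outer parent.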
since changing the value of a single gene, as in the previous section, would not leave a permutation, three mutation operators that only exchange variables are suggested in the literature:

algorithm 5 (mutation by one variable exchange): randomly select two positions in the chromosome and exchange the variables at the selected positions.
algorithm 6 (mutation by two variable exchanges): apply the mutation defined in algorithm 5 twice.
algorithm 7 (mutation by neighbor exchange): randomly select one position i and exchange the variables at positions i and i + 1.

the fitness function in a bdd optimization problem is the size of the bdd.
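a minimal python sketch of the three mutation operators (algorithms 5 to 7) is given below. the function names are hypothetical, and two details are assumptions that the text does not fix: the two positions in algorithm 5 are taken to be distinct, and position i in algorithm 7 is drawn so that i + 1 stays inside the chromosome. the rng argument accepts the random module or a random.Random instance, which makes runs reproducible.

```python
import random


def mutate_exchange(order, rng=random):
    """Algorithm 5: exchange the variables at two distinct random positions."""
    child = list(order)
    i, j = rng.sample(range(len(child)), 2)
    child[i], child[j] = child[j], child[i]
    return child


def mutate_double_exchange(order, rng=random):
    """Algorithm 6: apply the exchange mutation twice."""
    return mutate_exchange(mutate_exchange(order, rng), rng)


def mutate_neighbor(order, rng=random):
    """Algorithm 7: exchange the variables at positions i and i + 1."""
    child = list(order)
    i = rng.randrange(len(child) - 1)
    child[i], child[i + 1] = child[i + 1], child[i]
    return child


rng = random.Random(0)  # fixed seed, for reproducibility only
order = list(range(1, 11))
print(mutate_exchange(order, rng))
print(mutate_double_exchange(order, rng))
print(mutate_neighbor(order, rng))
```

each operator returns a new list and leaves its argument untouched, so a parent can be reused after a child is mutated.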
4 genetic algorithm for bdd and fdd size optimization

in the original pmx algorithm, the central part of the chromosome is transferred into the child chromosome unchanged. however, a dd node is more likely to be deleted in the reduction phase if it is at the bottom levels. it follows that the good properties of the parents are inherited if the order of the variables on the last levels is not changed. for this reason, we used a modified pmx algorithm in which the right parts of the parent chromosomes are transferred directly to the child chromosomes. this operator is named the bottom-up pmx, because the genes are written into the child chromosome from right to left, i.e. from the bottom levels up. the second reason for shifting the unchanged genes to the end of the chromosome is that the dd corresponding to the child chromosome then contains the same set of nodes in the last levels as the dd corresponding to the parent chromosome, which shortens the calculation of the fitness function.

algorithm 8 (bottom-up pmx operator bu-pmx):
step 1. perform a one-point crossover.
step 2. create the pmx mapping table for the right part of the chromosomes.
step 3. eliminate duplicate genes from the left part of the child chromosomes by using the pmx mapping table.

example 8. let the bottom-up pmx operator be performed over the parents:
p1 = [1 2 3 4 5 6 | 7 8 9 10]
p2 = [7 4 1 2 5 6 | 9 3 8 10]
after performing the one-point crossover, the generated children are:
c′1 = [7 4 1 2 5 6 | 7 8 9 10]
c′2 = [1 2 3 4 5 6 | 9 3 8 10]
the mapping table contains only the pair (7, 3). after duplicate elimination, the resulting child chromosomes are:
c1 = [3 4 1 2 5 6 | 7 8 9 10]
c2 = [1 2 7 4 5 6 | 9 3 8 10]
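algorithm 8 can be sketched in python as follows. this is an illustrative sketch under assumptions, not the authors' implementation: the name bu_pmx is hypothetical, the crossover point cut is a 0-based index passed in explicitly, and the mapping table is kept as a dictionary whose chains are followed until a gene outside the preserved right part is found.

```python
def bu_pmx(p1, p2, cut):
    """Bottom-up PMX: each child keeps the right part [cut:] of one parent."""
    def make_child(keep, other):
        # keep supplies the unchanged right part, other supplies the left part
        mapping = dict(zip(keep[cut:], other[cut:]))
        left = []
        for gene in other[:cut]:
            while gene in mapping:  # gene already present in the right part
                gene = mapping[gene]
            left.append(gene)
        return left + list(keep[cut:])

    return make_child(p1, p2), make_child(p2, p1)


# parents of example 8, crossover point after the sixth gene
c1, c2 = bu_pmx([1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                [7, 4, 1, 2, 5, 6, 9, 3, 8, 10], 6)
print(c1)
print(c2)
```

for the parents of example 8 the chain for gene 7 is 7 → 9 → 8 → 3, which yields the single effective pair (7, 3) mentioned in the example.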
fig. 5: positive (a) and negative (b) davio nodes.

as was shown in example 3, the fdd size depends on the polarity vector. for this reason, fdd optimization uses an additional mutation that changes the polarity. to specify the transformation performed on the fdd by this mutation, the positive davio and negative davio nodes are shown in figure 5 ((a) and (b), respectively). in this figure $f_0 = f(x_k = 0)$ and $f_1 = f(x_k = 1)$. let $f_l$ and $f_r$ be the left and right successors of the node. if the polarity is changed from positive to negative, the transformation is:

$$f_l^{new} = f_r^{old}, \qquad f_r^{new} = f_l^{old} \oplus f_r^{old} \qquad (4)$$

if the reverse polarity change is made, the transformation is:

$$f_r^{new} = f_l^{old}, \qquad f_l^{new} = f_l^{old} \oplus f_r^{old} \qquad (5)$$

algorithm 9 (mutation by polarity change): randomly select a variable and change the expansion rule in all nodes at the level corresponding to the selected variable.
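the two successor transformations in equations (4) and (5) can be checked with a few lines of python. the sketch assumes, purely for illustration, that each successor function is represented by its truth table packed into an integer bit mask, so that ⊕ becomes the bitwise xor operator; the function names are hypothetical and no dd package is involved.

```python
def positive_to_negative(fl, fr):
    """Equation (4): new (left, right) successors after a positive -> negative change."""
    return fr, fl ^ fr


def negative_to_positive(fl, fr):
    """Equation (5): new (left, right) successors after a negative -> positive change."""
    return fl ^ fr, fl


# two arbitrary truth tables used only as sample data
fl_old, fr_old = 0b0110, 0b0011
fl_new, fr_new = positive_to_negative(fl_old, fr_old)
print(bin(fl_new), bin(fr_new))
# applying the reverse change restores the original successors
print(negative_to_positive(fl_new, fr_new) == (fl_old, fr_old))
```

the round trip shows that (5) is the inverse of (4), so repeated polarity mutations of the same variable never lose information about the function.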
the complete genetic algorithm used for bdd and fdd optimization is given in algorithm 10.

algorithm 10 (genetic algorithm for dd optimization):
step 1. create the initial population of chromosomes and compute the fitness score of each of them.
step 2. select pairs of parents for reproduction.
step 3. create child chromosomes by bu-pmx.
step 4. mutate the child chromosomes (according to the mutation probability).
step 5. perform the darwinian selection: if the population is full, remove the worst chromosome (or several of the worst chromosomes) from the population.
step 6. repeat steps 2-5 until the goal is reached or the computing time is exhausted.

fig. 6: number of iterations needed to reach the minimum bdd size for the bw benchmark function as a function of the percentage of mutated child chromosomes.

5 experimental results

5.1 results of bdd size optimization

first, we tested how the mutation probability influences the convergence of the algorithm. figure 6 shows the number of iterations needed to reach the minimum bdd size for the function bw for different percentages of mutated chromosomes. each experiment was repeated 100 times and the figure shows the average values. the number of iterations needed decreases as the percentage of mutated chromosomes increases. for percentages greater than 15 the decrease is very slow, so 0.15 was chosen as the mutation probability. then, we tested the convergence of the proposed algorithm on a set of small benchmark functions for which we know the optimal size.
we tested the number of iterations needed to reach the minimum bdd size. we compared these results with the results obtained by using the order crossover (ox), cyclic crossover (cx), original pmx and alternating crossover (ax) operators. the experiments were repeated 10 times and table 1 shows the average values.

table 1: number of iterations needed to reach the minimum bdd size by using different crossover operators
function  in/out  ox     cx     pmx    ax     bu-pmx
bw        5/28    5.5    7.4    5.9    7.9    5.2
5xp1      7/10    56.1   46.6   47.5   72.8   32.1
con1      7/2     97.9   76.8   87.2   86.7   71
misex1    8/7     152.5  93     67.7   84.2   107.8
sqrt8     8/4     135.4  159.3  116.8  371.1  113.7
clip      9/5     72.9   75     59.4   194.4  49.7

table 1 shows that for 5 out of 6 functions the minimum bdd size was reached in the smallest number of iterations when the bu-pmx operator was used. in 4 of these 5 cases alternating crossover needed the most iterations (for con1 it was order crossover). only for the function misex1 was the minimal bdd size reached in fewer iterations with the original pmx operator.

finally, we optimized the bdd size for benchmark functions with a larger number of variables. table 2 compares the sizes of the bdds with the initial order of variables and with the order generated by the genetic algorithm. in each experiment, the initial population contains 2n chromosomes (permutations) and the maximum population size is 10n, where n is the number of input variables. the table shows that the proposed algorithm reduced the size of the bdd by 46.375% on average.

5.2 comparison of bdd optimization by the proposed genetic algorithm and by other heuristic algorithms

the paper [11] compares the sizes of bdds optimized by different heuristic algorithms and by a genetic algorithm with 3 types of crossover operators (ox, cx and pmx).
the paper shows that the results produced by genetic algorithms are better than the results of the other heuristic algorithms. table 3 compares the sizes of the dds generated by the genetic algorithm presented in the paper [11] and by the genetic algorithm proposed in this paper. table 3 shows that, for the functions with a small number of variables, all algorithms found the absolute minimum. for the functions with a large number of variables, the algorithms that used the pmx or bu-pmx operator produced better results. the algorithm proposed in this paper produced the smallest bdd for 13 out of 15 functions.
table 2: bdd size for the initial variable order and for the order generated by the proposed genetic algorithm
function  in/out  init  optimal  iterations  red. ratio [%]
alu4      14/8    1352  701      300         48
cu        14/11   65    37       300         43
misex3    14/14   1301  544      300         58
misex3c   14/14   810   443      300         45
table3    14/14   941   752      300         20
b12       15/9    91    60       300         34
table5    17/15   873   683      300         22
cc        21/20   105   49       400         53
duke2     22/29   976   373      400         62
i1        25/16   58    43       500         26
misex2    25/18   140   86       500         39
vg2       25/8    1059  84       500         92
frg1      28/3    203   89       600         56
c8        28/18   145   93       600         36
in4       32/20   1109  410      600         63
unreg     36/16   146   81       600         45
average                                      46.375

5.3 results of fdd size optimization

as was shown above, the fdd size depends on both the variable order and the polarity. to determine which mutation should be used in fdd optimization, the genetic algorithm was run with different mutation operators on a set of functions with a small number of variables (fewer than 10).
table 4 shows the sizes of the fdds when:
• the initial order of variables and positive polarity are used (init),
• the genetic algorithm with mutation by variable exchange is used (ga,ve),
• the genetic algorithm with mutation by polarity change is used (ga,pc), and
• the genetic algorithm with both mutation operators (each with probability 0.5) is used (ga,ve+pc).
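the last configuration in the list above (ga,ve+pc) can be pictured with the following python sketch. it is only a sketch under assumptions: an fdd chromosome is modelled as a variable order plus a polarity vector of 0/1 entries (which value stands for positive polarity is left open), the name mutate_fdd is hypothetical, and exactly one of the two mutations is applied per call, each with probability 0.5 as stated in the text.

```python
import random


def mutate_fdd(order, polarity, rng=random, p_exchange=0.5):
    """Apply either a variable exchange or a polarity change to one chromosome."""
    order, polarity = list(order), list(polarity)
    if rng.random() < p_exchange:
        # mutation by variable exchange: swap two distinct positions
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]
    else:
        # mutation by polarity change: flip the polarity of one variable
        k = rng.randrange(len(polarity))
        polarity[k] = 1 - polarity[k]
    return order, polarity


rng = random.Random(0)  # fixed seed, for reproducibility only
print(mutate_fdd([1, 2, 3, 4, 5], [1, 1, 1, 1, 1], rng))
```

keeping the order and the polarity vector in one chromosome lets a single population search both dimensions on which the fdd size depends.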
table 3: comparison of the sizes of bdds produced by the proposed algorithm and by the existing genetic algorithm with the ox, cx and pmx crossover operators
function  in/out  ox   cx   pmx  bu-pmx
squar5    5/8     37   37   37   37
bw        5/28    106  106  106  106
5xp1      7/10    68   69   68   68
con1      7/2     16   15   15   15
inc       7/9     72   72   72   72
misex1    8/7     36   36   36   36
sqrt8     8/4     33   33   33   33
clip      9/5     102  109  93   93
sao2      10/4    92   90   85   85
alu4      14/8    891  939  734  701
b12       15/9    70   68   50   60
t481      16/1    85   78   30   38
duke2     22/29   506  512  390  373
misex2    25/18   100  102  87   86
vg2       25/8    339  301  148  84

it can be seen from table 4 that the fdds with minimal sizes are generated when both mutations are used in the genetic algorithm. for this reason, the approach with both mutation operators is used in the experiments on fdds of functions with a larger number of variables (greater than 10). the results of these experiments are shown in table 5. as can be seen from this table, the proposed genetic algorithm reduces the fdds by 48.875% on average.

these experiments were done with functions of up to 25 variables. the algorithm is also applicable to functions with a large number of variables, because the number of cases checked by the algorithm is determined by three parameters only:
• the number of crossover operations done in one iteration (cx),
• the probability of applying the mutation operator (pm), and
• the maximal number of iterations (it).
the total number of created dds is:
$$N = cx \cdot (1 + pm) \cdot it$$

where N denotes the total number of created dds. if a smaller dd is needed, cx and it should be larger. if the execution time is critical, cx and it should be smaller. in our experiments cx = 2 · n, pm = 0.15 and it = [n/5] · 100, where n is the number of input variables, so that

$$N = 2 \cdot n \cdot 1.15 \cdot [n/5] \cdot 100 \approx 46 \cdot n^2.$$

this is much smaller than for the brute-force exact algorithm, for which $N = 2^n \cdot n!$.

table 4: initial fdd sizes and sizes of fdds generated by genetic algorithms with different mutation operators
function  in/out  (init)  (ga,ve)  (ga,pc)  (ga,ve+pc)
add2      4/3     8       7        7        7
squar5    5/8     32      30       29       29
bw        5/28    144     97       93       93
inc       7/9     121     79       78       73
f51m      8/8     40      34       27       20
sqrt8     8/4     48      25       26       24

table 5: fdd size for the initial variable order and positive polarity and for the order and polarity generated by the proposed genetic algorithm
function  in/out  init  optimal  iterations  red. ratio [%]
alu4      14/8    840   541      300         36
cu        14/11   74    37       300         50
misex3    14/14   1024  764      300         25
misex3c   14/14   759   635      300         16
b12       15/9    116   62       300         47
cc        21/20   78    40       400         49
misex2    25/18   149   37       500         75
vg2       25/8    942   68       500         93
average                                      48.875

6 conclusion

in this paper, a genetic algorithm for bdd and fdd optimization is presented.
in the algorithm, a modification of the pmx operator is proposed: in the initial phase, one-point crossover is used instead of two-point crossover. it follows that, in the dd generated from the child permutation, the part of the dd in the last levels is equal to the corresponding part of the dd generated from the parent chromosome. in this way, the child chromosome inherits the good properties of the parent chromosome. in the case of fdd optimization, the proposed algorithm introduces a mutation of polarity. experiments show that the genetic algorithm gives the best results when this mutation is used in combination with variable exchange. in the presented algorithm, sifting is not used as an additional method to improve the generated diagrams, because our goal was to show the performance of the genetic algorithm itself. in a real application of the algorithm, sifting can be included as well.

references
[1] r. e. bryant, “graph-based algorithms for boolean function manipulation,” ieee transactions on computers, vol. c-35, no. 8, pp. 677–691, 1986.
[2] s. j. friedman and k. j. supowit, “finding the optimal variable ordering methods for binary decision diagrams,” ieee transactions on computers, vol. 39, no. 5, pp. 710–713, 1990.
[3] r. rudell, “dynamic variable ordering for ordered binary decision diagrams,” in proceedings of international conference on cad, 1993, pp. 42–47.
[4] r. drechsler, b. becker, and n. göckel, “a genetic algorithm for variable ordering of obdds,” in iee proceedings computers and digital techniques, vol. 143, no. 6, 1996, pp. 363–368.
[5] r. drechsler and n. göckel, “minimization of bdds by evolutionary algorithms,” in international workshop on logic synthesis (iwls), 1997.
[6] r. drechsler, b. becker, and n. göckel, “learning heuristics for obdd minimization by evolutionary algorithms,” in proceedings parallel problem solving from nature (ppsn), lecture notes in computer science, vol. 1141, 1996, pp. 730–739.
[7] w. lenders and c.
baier, “genetic algorithms for the variable ordering problem of binary decision diagrams,” lecture notes in computer science, vol. 3469, pp. 1–20, 2005.
[8] i. furdu and b. patrut, “genetic algorithm for ordered decision diagrams optimization,” in proceedings of icmi 45, 2006, pp. 437–444.
[9] i. furdu and t. socaciu, “genetic algorithm for variable ordering of ordered binary decision diagrams,” in proceedings of cnmi, 2007, pp. 67–78.
[10] r. kaur and m. bansal, “bdd ordering and minimization using various crossover operators in genetic algorithm,” international journal of innovative research in electrical, electronics, instrumentation and control engineering, vol. 2, no. 3, pp. 1247–1250, 2014.
[11] s. jindal and m. bansal, “a novel and efficient variable ordering and minimization algorithm based on evolutionary computation,” indian journal of science and technology, vol. 8, no. 48, pp. 1–10, 2016.
[12] m. hilgemeier, n. drechsler, and r. drechsler, “minimizing the number of one-paths in bdds by an evolutionary algorithm,” 2003.
[13] s. shirinzadeh, m. soeken, and r. drechsler, “multi-objective bdd optimization with evolutionary algorithms,” 2015, pp. 751–758.
[14] s. stojković, m. stanković, and c. moraga, “complexity reduction of toffoli networks based on fdd,” facta universitatis, ser. electronics and energetics, vol. 28, no. 2, pp. 251–262, 2015.
[15] s.
stojković, m. stanković, c. moraga, and r. stanković, “procedure for fdd-based reversible synthesis by levels,” 2016, pp. 1–6.

instruction facta universitatis series: electronics and energetics vol. 30, no. 1, march 2017, pp. 137–144 doi: 10.2298/fuee1701137k

modified internal model control for a therapeutic robot

miloš d. kostić 1, miroslav r. mataušek 2, dejan b. popović 2,3
1 tecnalia, san sebastian, spain
2 university of belgrade, faculty of electrical engineering, belgrade, serbia
3 institute of technical sciences of the serbian academy of sciences and arts, belgrade, serbia

abstract. we present the use of the modified internal model controller (mimc) and the “probability tube” (pt) action representation for robot-assisted upper extremity training of hemiplegic patients. the robot-assisted training session has two phases. during the first, "demonstration" phase the robot learns the target path from the therapist through examples. in the second, "exercise" phase the robot assists a patient in following the target path. during this process, the control keeps the interface force between the robot and the hand below a preset threshold (f = 50 n). the system allows the assessment of the range of movement, the positional error between the target and the reached position, and the amount of added assistance (the interface force between the hand and the robot). we demonstrate the operation in two hemiplegic patients. after the tests, the patients and the therapist suggested that the new system is straightforward and intuitive for clinical applications.

key words: stroke, disability, assistant robot, modified internal model control, assessment

1. introduction

intensive repetition of functional movements has been proven to be an efficient method of motor control relearning during the neurorehabilitation process [1].
robotic devices are inherently well suited for repetitive tasks and for providing a quantitative assessment of the performed movements, which is why they are becoming the preferred tools to support this therapeutic modality [2]. two types of robot assistants are predominantly used for intensive exercise: 1) devices that assist the end-point movement of the arm and interface with the patient at the hand (e.g., mit-manus [3], braccio di ferro [4]) and 2) exoskeleton robots that assist individual arm joints and interface with the arm at multiple points (e.g., armin [5], cozens arm robot [6]).

received may 24, 2016; received in revised form june 15, 2016
corresponding author: dejan b. popović, institute of technical sciences of sanu, kneza mihaila 35, 11000 belgrade, serbia (e-mail: dbp@etf.rs)

depending on the chosen therapy modality, the robotic device can support, assist, resist or even perturb the movement of the arm/hand. to do so, robot assistants apply sophisticated methods for actuator control in position, velocity or impedance space. the rehabilitation gain is maximized when the device adapts to the patient’s performance in a manner that encourages the patient's effort, e.g., by providing "assistance-as-needed" or "faded guidance" [5-9]. to implement these complex assistance schemes, the “haptic” approach, in which the device acts on the patient with a force determined by a computer model, is frequently employed in the control of robot assistants [4, 5, 10, 11]. the essential elements of haptic control used in current robot assistants can be described with the following two equations:

$$T_{motor}(t) = T_{intrinsic}(q, \dot{q}, \ddot{q}, p) + T_{haptic}(t) \quad (1)$$

$$T_{haptic}(t) = J(q)^{T} F_{haptic}(t) \quad (2)$$

where $[q, \dot{q}, \ddot{q}]$ are kinematic variables and p is a set of unknown parameters in the nonlinear model of the intrinsic torque, $T_{intrinsic}$ [4]. this torque relates to inertia, dissipative friction, and external forces (i.e., gravity).
$J(q)$ is the jacobian of the device's geometry, and $F_{haptic}$ is the targeted interface force between the arm and the apparatus. the application of such a system requires an adequate nonlinear model and an experimental assessment of the unknown parameters p over an extensive range of operating conditions. a difficulty is that on-line compensation of the intrinsic dynamics is highly complex [4]. another major practical problem for the implementation is the selection of the target trajectory for the hand that the robot needs to assist. we show here one possible method for solving two problems: 1) how to select a target trajectory that is suited to the patient's current needs and dynamically changing abilities; and 2) how to translate this trajectory to the controller of a robot that is to be used in daily clinical work. we demonstrate a solution for both tasks in the case of point-to-point movements. the demonstration is presented with a new 3d robot prototype (r3-beg), shown in fig. 1. the assumed principle of operation of the r3-beg is the "teach-and-repeat" scenario [2, 7, 12], which is adopted in current clinical practice and present in some commercial devices [13]. "teach-and-repeat" consists of the "demonstration" phase, in which the therapist and patient hold the endpoint of the robot and the therapist selects a target trajectory based on heuristics, and the "exercise" phase, in which the robot assists the arm in moving along the preferred trajectory under a force constraint (threshold maximum force) [13].
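as a small illustration of equation (2), the sketch below maps a target interface force at the handle to joint torques through the transposed jacobian of a generic two-link planar arm. the link lengths, joint angles, force values and function names are hypothetical and chosen only for the example; they are not parameters of the r3-beg, and the intrinsic torque of equation (1) is not modelled.

```python
import math


def planar_jacobian(q1, q2, l1, l2):
    """Jacobian of the end-point (x, y) of a two-link planar arm
    with respect to the joint angles (q1, q2), angles in radians."""
    s1, c1 = math.sin(q1), math.cos(q1)
    s12, c12 = math.sin(q1 + q2), math.cos(q1 + q2)
    return [
        [-l1 * s1 - l2 * s12, -l2 * s12],
        [l1 * c1 + l2 * c12, l2 * c12],
    ]


def haptic_torque(q1, q2, force, l1=0.3, l2=0.3):
    """Joint torques T_haptic = J(q)^T F_haptic, as in equation (2).

    l1 and l2 are hypothetical link lengths in metres; force is the
    target interface force (fx, fy) in newtons."""
    jac = planar_jacobian(q1, q2, l1, l2)
    fx, fy = force
    # Multiply by the transpose: column i of J gives the torque at joint i.
    torque_1 = jac[0][0] * fx + jac[1][0] * fy
    torque_2 = jac[0][1] * fx + jac[1][1] * fy
    return torque_1, torque_2


# First joint at 0 rad, second joint bent by 90 degrees, 10 N along x.
print(haptic_torque(0.0, math.pi / 2, (10.0, 0.0)))
```

in a controller of this kind the computed torques would be added to a compensation term for the intrinsic dynamics, which is the part the text identifies as difficult to obtain on-line.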
the following elements of the system are new: 1) the interface between the therapist, the patient and the robot used during the demonstration phase; 2) the action representation that translates the captured kinematics to the controller; 3) the integration of the natural variability of the therapist’s movements into the target trajectory [14, 15]; 4) two-level control comprising, at the higher level, the selection of velocity set points in each movement phase based on the “probability tube” (pt) action representation and, at the lower level, an implementation of the modified internal model control (mimc) [16, 17] to ensure offset-free following of the set point; and 5) motivating feedback based on the online assessment of the patient’s performance in the “exercise phase” (fig. 1). the presentation starts with the description of the robot and controller, and continues with the presentation of tests in two post-stroke patients.

2. the r3-beg robot assistant
the r3-beg combines a two-segment planar manipulandum (arm) and a vertical slider (fig. 1). following the analogy with the patient’s arm, the joints were named shoulder (s) and elbow (e).

fig. 1 the r3-beg robot for the arm exercise (left panel). the sketch of the robot arm showing the task (top left panel) and feedback presented to the therapist (bottom right panels).

the handle (fig. 1) is instrumented with a set of force transducers that allow the estimation of the size and direction of the force acting at the handle in the plane orthogonal to the handle. this handle serves as the interface between the patient and the robot. the top part (extension) of the same handle is the interface between the therapist and the robot. this configuration allows the therapist to set the target trajectory by moving the end-point of the robot while the patient is holding the same handle.
the force sensor is used in the second phase as the source of feedback for controlling the maximum assistive force constraint and for assessing the amount of added assistance. high-level control is based on methods described in [18], which suggested high rehabilitation potential but required a sophisticated haptic platform. here the pt is used as a lookup table to determine the velocity set point, based on the current movement phase and performance. this can be presented as: 1k),i, k )i),t(v(pt1 )i),t(v(pt( 1 pt)t(refv      (3) where v(t) is the current acceleration and i the current phase. the factor k determines the level of allowed variability and is set by the therapist. the low-level control is based on two single-input single-output mimc linear digital controllers [17] that control the shoulder and the elbow of the system. the essential characteristics of the mimc design and tuning concept from [18] are: it is well suited to exploit the benefits of prior knowledge and experience gained from the open-loop dynamics of the plant; the control system structure is directly obtainable from the model used to approximate the process dynamics; and it has a small number of tuning parameters with clear meaning, accompanied by simple tuning rules that are easy to apply. this concept also allows scalability of the presented solution, as it is suitable for designing multiple-input multiple-output (mimo) neural network (nn) digital controllers [19].
the measured variables on the plant are the elbow and shoulder positions, $p_e(t)$ [rad] and $p_s(t)$ [rad]; however, the controlled variables are the velocity of the elbow $v_e(t)$ [rad/s] and the velocity of the shoulder $v_s(t)$ [rad/s], which are obtained from

$$v_e(kT_s) = \frac{p_e(kT_s) - p_e((k-1)T_s)}{T_s} \quad (4)$$

$$v_s(kT_s) = \frac{p_s(kT_s) - p_s((k-1)T_s)}{T_s} \quad (5)$$

their dynamic characteristics are defined by the elbow velocity model $G_{mve}(s)$ and the shoulder velocity model $G_{mvs}(s)$, which are obtained from open-loop step response tests. the models $G_{mve}(s)$ and $G_{mvs}(s)$ are defined by equations 6 and 7:

$$G_{mve}(s) = \frac{K_e e^{-L_e s}}{T_e^2 s^2 + 2\zeta_e T_e s + 1}, \quad (6)$$

$$G_{mvs}(s) = \frac{K_s e^{-L_s s}}{T_s^2 s^2 + 2\zeta_s T_s s + 1} \quad (7)$$

where $K_e$ = 0.00024, $K_s$ = 0.00023, $L_e$ = 0.07, $L_s$ = 0.1, $T_e$ = 0.04, $T_s$ = 0.08, $\zeta_e = \zeta_s$ = 0.7 (in equation 7, $T_s$ = 0.08 is the shoulder time constant, which shares its symbol with the sample time used in equations 4 and 5).

fig. 2 mimc controller block diagram, modified from fig. 2 in [17]

the velocity models $G_{mve}(s)$ and $G_{mvs}(s)$ were used to design and tune the mimc velocity controllers, defined by the structure presented in fig. 2, modified from [17]. the elbow mimc velocity controller is defined by:

$$F_{re}(z) \equiv 1, \quad F_e(z) = \left(\frac{0.4z}{z-0.6}\right)^{2}, \quad G_{le}(z) = z^{-4} \quad (8)$$

$$\frac{P_{m0e}^{-1}(z)}{z} = \frac{1}{0.00024}\,\frac{z^2 - 1.3205z + 0.4966}{0.1761z^2} \quad (9)$$

where $z^{-1}$ represents the unit delay operator, $z^{-1} = e^{-sT_s}$. the shoulder mimc velocity controller is defined by

$$F_{rs}(z) = \frac{0.2z}{z-0.8}, \quad F_s(z) = \left(\frac{0.2z}{z-0.8}\right)^{2}, \quad G_{ls}(z) = z^{-5} \quad (10)$$

$$\frac{P_{m0s}^{-1}(z)}{z} = \frac{1}{0.00023}\,\frac{z^2 - 1.6522z + 0.7047}{0.0525z^2} \quad (11)$$

both mimc controllers are implemented with the sample time $T_s$ = 0.02 s. we validated the linear models of the r3-beg joints. the parameters of the models were estimated from recordings of the open-loop step responses. the set-points for the elbow and shoulder controllers of the r3-beg are defined in the phase-plane by the procedure described in kostić et al. [14, 15].
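the numerical coefficients of the two controllers can be cross-checked against the continuous-time model parameters with a short calculation. the sketch below is only a plausibility check under stated assumptions, not the design procedure of [17]: it assumes that the second-order polynomials in equations 9 and 11 come from mapping the poles of the delay-free models through $z = e^{sT}$ with the 0.02 s sample time, and that the dead times are rounded up to a whole number of samples. all function names are illustrative.

```python
import math


def backward_difference(p_now, p_prev, ts):
    """Velocity estimate from two consecutive position samples
    (the form of equations 4 and 5)."""
    return (p_now - p_prev) / ts


def discrete_denominator(t_const, zeta, ts):
    """Coefficients (a1, a2) of z^2 - a1*z + a2 obtained by mapping the
    poles of 1 / (T^2 s^2 + 2*zeta*T*s + 1) through z = exp(s*ts).
    Valid for an underdamped model (zeta < 1)."""
    sigma = zeta / t_const                           # real part of the poles (negated)
    omega_d = math.sqrt(1.0 - zeta ** 2) / t_const   # damped natural frequency
    radius = math.exp(-sigma * ts)                   # pole radius in the z-plane
    return 2.0 * radius * math.cos(omega_d * ts), radius ** 2


def delay_in_samples(dead_time, ts):
    """Dead time rounded up to a whole number of samples
    (small offset guards against floating-point noise)."""
    return math.ceil(dead_time / ts - 1e-9)


SAMPLE_TIME = 0.02  # s, sample time stated in the text

for name, t_const, dead_time in (("elbow", 0.04, 0.07), ("shoulder", 0.08, 0.1)):
    a1, a2 = discrete_denominator(t_const, 0.7, SAMPLE_TIME)
    dc_term = 1.0 - a1 + a2  # value of the polynomial at z = 1
    print(name, round(a1, 4), round(a2, 4), round(dc_term, 4),
          delay_in_samples(dead_time, SAMPLE_TIME))
```

under these assumptions the computed values agree, to four decimal places, with the coefficients 1.3205, 0.4966 and 0.1761 (elbow) and 1.6522, 0.7047 and 0.0525 (shoulder) that appear in equations 9 and 11, and the rounded dead times give delays of 4 and 5 samples.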
however, to test the closed-loop tracking performance of the low-level control (the mimc controllers of equations 8-11) without the influence of the higher-level control algorithm, sinusoidal set-points defined in time were applied to the shoulder and the elbow control systems. the results presented in fig. 3 were obtained for the control system with the mimc elbow velocity controller defined by equations 8 and 9.

fig. 3 closed-loop responses for the elbow in the loop with the mimc elbow controller (8) and (9): model (red line), plant (black line) and set-point (blue line).

3 implementation of the r3-beg
two hemiplegic patients signed the informed consent approved by the local ethics committee of the clinic for rehabilitation "dr miroslav zotović", belgrade, serbia. patient p1 had a small range of movement and was highly spastic, while patient p2 had a larger range of motion and less pronounced spasticity. the level of disability was assessed by an experienced clinician before the beginning of the tests (the ashworth spasticity scale (as), the action research arm test (arat), and the fugl-meyer (fm) motor test for upper extremities). the session with the r3-beg followed the previously described two-phase procedure. in the "demonstration phase", the therapist "presented" the movement to the patient and the robot by manipulating the handle while the patient held the instrumented handle and was instructed not to resist the imposed movement between the starting and end points. the robot was passive (decoupled motors), and sensors captured the movement kinematics and the interface force. each movement was repeated several times to create the action representation using a procedure described in detail elsewhere (argall et al., 2009).
the obtained pt provided set-points to the elbow and shoulder mimc controllers of the r3-beg in the phase-plane, while the maximal force of assistance was defined as the maximal interface force exerted by the therapist. in the "exercise phase", the robot assisted a patient in performing the desired point-to-point movement. there were two different movements, one in the ipsilateral direction and one in the contralateral direction. the starting position and the target were marked with a green and a red circle (diameter d = 4 cm), respectively. the handle was instrumented with a laser pointer that projected the position of the handle so that the patient could see it. the data presented in figure 4 illustrate the performance of patients p1 and p2, respectively. the efficacy of the robotic intervention is documented by two objective measures: 1) the euclidean distance between the reached position and the target point, which relates to the range of movement, and 2) the interface force between the hand and the r3-beg, compared to the amount of provided assistance. these metrics were selected based on the recommendations of the european scientific community [19].

fig. 4 trajectories achieved by patient p1 (severe spasticity, left panels) and patient p2 (mild spasticity, right panels) for the two target points. f is the force. d is the distance between the end point of the trajectory and the target t.

as shown in fig. 4 (left panels), patient p1 was not able to perform the task completely and could not reach the target point when the handle needed to be moved to the contralateral side of his body (the distance between the endpoint of the movement and the target was 9.6 cm). however, he encountered fewer problems with the radial movement in the ipsilateral direction (d = 2.9 cm). the interface force indicates that the robot was assisting the movement all along the trajectory.
the robot assisted the movement with significant force during the last 25% of the movement (f ≈ 30 n). the force increased gradually to about 10 n during the first 75% of the movement. this result is explained by the patient’s impairment (constraints introduced by spasticity and the decreased range of movement). fig. 4 (right panels) illustrates the performance of patient p2, characterized by mild spasticity. in this case, the interface force was substantially smaller than the interface force estimated during the tests with patient p1. the distance between the endpoint and the target was only 2 cm, and the interface force never reached f = 15 n. this indicates that the patient used the robotic guidance to compensate for the lack of motor control rather than for a compromised range of motion, which is consistent with the reported patient impairment.

4 conclusions
we developed a control method for a rehabilitation robot. the new system proved to be simple to tune and implement in the clinical environment. the novel "teach-and-repeat" method for high-level control described in [14, 15], implemented in this scenario, was found to be useful for translating the therapist's skills and experience to robot-assisted therapy. the signals from the sensors used for control allow direct assessment of the differences between passive and active arm movements (range and smoothness of the movement and required force assistance). the force-controlled interface (haptics) also allows the setup of the tasks that need to be trained to improve the performance.

acknowledgement: the work on this project was partly supported by the project no rs35003, ministry of education, sciences and technological development of serbia.

references
[1] g. kwakkel, "intensity of practice after stroke: more is better", schweizer archiv für neurologie und psychiatrie, vol. 160.7, pp. 295-298, 2009.
[2] t. nef, m. mihelj, and r.
riener, "armin: a robot for patient-cooperative arm therapy", medical & biological engineering & computing, vol. 45(9), pp. 887-900, 2007. [3] n. hogan, h. i. krebs, j. charnnarong, p. srikrishna and a. sharon, "mit-manus: a workstation for manual therapy and training i", in proceedings of the ieee international workshop robot and human communication, 1992, pp. 161-165. [4] m. casadio, v. sanguineti, p. g. morasso, and v. arrichiello, "braccio di ferro: a new haptic workstation for neuromotor rehabilitation", technology and health care, vol. 14(3), pp. 123-142, 2006. [5] t. nef, and r. riener, "armin-design of a novel arm rehabilitation robot", in proc. of the 9 th ieee international conference on rehabilitation robotics, 2005, pp. 57-60. [6] j. a. cozens, "robotic assistance of an active upper limb exercise in neurologically impaired patients", rehabilitation engineering, ieee transactions on, vol. 7(2), pp. 254-256, 1999. [7] l. marchal-crespo, and d. j. reinkensmeyer, "review of control strategies for robotic movement training after neurologic injury", journal of neuroengineering and rehabilitation, vol. 6(1), pp. 20, 2009. [8] m. casadio, p. giannoni, l. masia, p. g. morasso, g. sandini, v. sanguineti, v. squeri, and e. vergaro, "robot therapy of the upper limb in stroke patients: rational guidelines for the principled use of this technology", functional neurology, vol. 24 (4), pp. 195-202, 2009. [9] h. i. krebs, j. j. palazzolo, l. dipietro, m. ferraro, j. krol, k. rannekleiv, b. t. volpe, and n. hogan, "rehabilitation robotics: performance-based progressive robot-assisted therapy", autonomous robots, vol. 15 (1), pp. 7-20, 2003. [10] r. q. van der linde, p. lammertse, e. frederiksen, and b. ruiter, "the hapticmaster, a new high-performance haptic interface", in proc. eurohaptics, 2002, pp. 1-5. [11] r. loureiro, f. amirabdollahian, m. topping, b. driessen, and w. harwin, "upper limb robot mediated stroke therapy: gentle/s approach", autonomous robots, vol. 
15 (1), pp. 35-51, 2003.
[12] j. l. emken, s. j. harkema, j. a. beres-jones, c. k. ferreira, and d. j. reinkensmeyer, "feasibility of manual teach-and-replay and continuous impedance shaping for robotic locomotor training following spinal cord injury", biomedical engineering, ieee transactions on, vol. 55 (1), pp. 322-334, 2008.
[13] b. d. argall, s. chernova, m. veloso, and b. browning, "a survey of robot learning from demonstration", robotics and autonomous systems, vol. 57 (5), pp. 469-483, 2009.
[14] m. d. kostić, m. b. popović, and d. b. popović, "a method for assessing the arm movement performance: probability tube", medical & biological engineering & computing, vol. 51 (12), pp. 1315-1323, 2013.
[15] m. d. kostić, m. d. popović, and d. b. popović, "the robot that learns from the therapist how to assist stroke patients", new trends in medical and service robots. springer international publishing, pp. 17-29, 2014.
[16] m. r. mataušek and d. m. stipanović, "modified nonlinear internal model control", control and intelligent systems, vol. 26 (2), pp. 57-63, 1998.
[17] m. r. mataušek, a. d. mićić, and d. b. dacić, "modified internal model control approach to the design and tuning of linear digital controllers", international journal of systems science, vol. 33 (1), pp. 67-79, 2002.
[18] m. r. mataušek, d. m. miljković, and b. i. jeftenić, "nonlinear multi-input-multi-output neural network control of dc motor drive with field weakening", ieee transactions on industrial electronics, vol. 45 (1), pp. 185-187, 1998.
[19] "cost action td1006", http://www.rehabilitationrobotics.eu/2013.

facta universitatis series: electronics and energetics vol. 31, no. 4, december 2018, pp.
627-639 https://doi.org/10.2298/fuee1804627b

a quad-band monopole antenna with defected ground plane for l-band/wimax/wlan applications

biplab bag, priyabrata biswas, partha pratim sarkar
department of engineering and technological studies, kalyani university, west bengal, india

abstract. in this paper, a planar quad-band monopole antenna excited by a microstrip line feed is proposed for l-band, wimax and wlan applications. the proposed antenna is composed of radiating elements in the form of l, u and inverted l-shaped strips on the top surface of the substrate and a defected ground plane on the bottom surface. by adjusting the lengths of the strips, the resonant frequencies can be tuned individually. the overall dimension of the prototype of the proposed quad-band antenna is 50x35x1.6mm³. the measured results show that the proposed antenna exhibits four distinct operating bands (s11 below -10db) of 170mhz (from 1.16 to 1.33ghz), 550mhz (from 1.53 to 2.08ghz), 470mhz (from 2.43 to 2.90ghz) and 3930mhz (from 3.77 to 7.70ghz). the first two bands lie in the l-band, the third band can be used for the wimax lower band (2.5ghz), and the bandwidth of the fourth band may be used for wlan (5.2/5.8 ghz) and wimax (5.5ghz) applications. it is also observed that the proposed antenna has good radiation patterns and acceptable gains over all the operating bands. the design process and parametric analyses are explained with the help of the simulation software hfss v.11.

key words: defected ground plane, l-band, l-u and inverted l-shaped strip, quad band, wimax, wlan

1. introduction
microstrip antennas play an important role in the growth of wireless technology. besides bandwidth and gain improvement, multiband functionality, i.e. integrating several frequency bands in a single antenna, is another challenging task in the domain of antenna design.
to address this challenge, researchers are trying to design antennas with different structural configurations within a limited antenna aperture. many approaches have been reported so far; some of the popular techniques are slot cutting [1]-[4], pifa [5]-[9] and fractal geometries [10]-[12]. the printed monopole antenna [13]-[19] is one of the most attractive structures for multiband applications because it is low-profile, lightweight, low-cost, has an omnidirectional radiation pattern, is easy to integrate into a microwave circuit board, and exhibits a wide impedance bandwidth.

received march 25, 2018; received in revised form august 31, 2018
corresponding author: biplab bag, department of engineering and technological studies, kalyani university, nadia-741235 west bengal, india (e-mail: bbagateie@gmail.com)

monopole antennas with different configurations, such as l- and u-shaped slots [13], [14], an inverted l-shaped strip [15], an arc shape [16], a complementary split ring [17], a sinc type [18] and a circular ring type [19], have been reported for multiband operation. however, the monopole antennas in [13]-[19] cover three bands, whereas the proposed antenna covers four distinct bands for l-band, wimax, and wlan. our intention is therefore to design a multiband antenna with wide bandwidth. in this paper, a microstrip-line-fed quad-band monopole antenna with a defected ground plane is proposed for l-band, wimax, and wlan applications. the top surface of the substrate carries three strips in the form of l, u and inverted l shapes. on the bottom surface, slots have been cut to adjust the resonant frequencies and to minimize the antenna size. the proposed antenna is made on a low-cost fr4 epoxy substrate (relative permittivity of 4.4) with a thickness of 1.6mm. the overall dimension of the proposed antenna is 50x35mm². the measured resonant frequencies are 1.27ghz, 1.72ghz, 2.59ghz, and 5.73ghz.
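as a quick cross-check of the band figures quoted in the abstract, the absolute and percentage bandwidth of each measured band can be computed from its band edges. the sketch below normalizes the bandwidth by the measured resonant frequency of each band; this choice of normalization is an assumption made for the example (the band centre frequency is another common choice), and the function name is illustrative.

```python
# (resonant frequency, lower band edge, upper band edge) in GHz,
# taken from the measured values quoted in the text
MEASURED_BANDS = [
    (1.27, 1.16, 1.33),
    (1.72, 1.53, 2.08),
    (2.59, 2.43, 2.90),
    (5.73, 3.77, 7.70),
]


def percent_bandwidth(f_res, f_low, f_high):
    """Bandwidth as a percentage of the resonant frequency
    (assumed normalization, see the note above)."""
    return 100.0 * (f_high - f_low) / f_res


for f_res, f_low, f_high in MEASURED_BANDS:
    width_mhz = (f_high - f_low) * 1000.0
    print(f"{f_res:.2f} GHz: {width_mhz:.0f} MHz, "
          f"{percent_bandwidth(f_res, f_low, f_high):.2f} %")
```

the absolute widths reproduce the 170mhz, 550mhz, 470mhz and 3930mhz figures of the abstract; the percentages depend on the normalization chosen above.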
the bandwidths for s11(db) ≤ -10db are 170mhz (1.16-1.33ghz), 550mhz (1.53-2.08ghz), 470mhz (2.43-2.90ghz) and 3930mhz (3.77-7.70ghz), which cover the l-band, wimax (2.5/5.5ghz) and wlan (5.2/5.8ghz) bands. the gain and radiation pattern are also measured. by properly adjusting the dimensions of the strips (l, u, inverted l) and the slots on the ground plane, the resonant frequencies can be tuned. the design and parametric analyses are carried out with the electromagnetic simulation software hfss v.11. the measured results are in good agreement with the simulations.

2. evolution process
the geometry of the proposed antenna is shown in fig. 1. the top layer of the substrate consists of radiating strips in the form of l, u, and inverted l shapes, and a microstrip line feed is used to excite the antenna. on the bottom layer, slots are cut and part of the copper plate remains as a ground plane. the proposed antenna is made on a low-cost fr4 epoxy substrate (relative permittivity of 4.4, loss tangent of 0.025) with a thickness of 1.6mm. the overall dimension of the antenna substrate is 50x35mm². a 50ω microstrip feed line is used to excite the antenna and to provide a good frequency response over the operating range. the proposed quad-band antenna is developed in four consecutive steps, which are shown in fig. 2. the frequency responses of the corresponding antennas are shown in fig. 3. the evolution starts with #ant.1, which consists of an l-shaped strip and a microstrip line feed (lf, wf) and produces a resonant frequency at about 2.65ghz (band from 1.43-3.63ghz) with an s11 of -43.55db. the length of the l-shaped strip is equal to a quarter of the guided wavelength (λg/4). the resonant frequency of #ant.1 is theoretically estimated by the equations [20]:

$$f_r = \frac{c}{\lambda_g \sqrt{(\varepsilon_r + 1)/2}} \quad (1)$$

$$L_{strip} = \frac{\lambda_g}{4} \quad (2)$$

)2( 121 wlll strip  (3)

fig.
1 the geometry of proposed antenna structure (a) top view (b) bottom view

where c is the speed of light, fr the desired resonant frequency, λg the guided wavelength and εr the relative permittivity of the substrate. after that, a u-shaped strip is added to #ant.1 to form #ant.2, which produces two resonant frequencies at 1.58ghz (from 1.04-2.59ghz) and 5.35ghz (from 4.74-6.43ghz). the simulated return loss of #ant.2 is shown in fig. 3. it is interesting to observe that after the u-shaped strip is added (#ant.2), the resonant frequency of #ant.1 (2.65ghz) shifts toward lower frequency. this happens due to direct coupling between the l- and u-shaped strips through c. the length of the u-shaped strip {(u1+u2+u3+u5+u4-c) ≈ (1.5+1.5+2.5+1.5+2.5-1)} is a quarter of the guided wavelength at the resonant frequency of 5.35ghz. in #ant.3, an inverted l-shaped strip is added, which produces another frequency band from 4.24-5.28ghz (centered at 4.74ghz); the corresponding return loss is shown in fig. 3. finally, slots are cut in the ground plane to readjust the resonant frequencies and to minimize the antenna dimensions. the final simulated return loss of #ant.4 is also shown in fig. 3; the resonant frequencies are 1.15ghz (from 0.83-1.39ghz), 1.57ghz (1.46-1.81ghz), 2.66ghz (2.47-2.80ghz), and 5.15ghz and 5.85ghz (4.27-6.10ghz). fig. 3 shows that the proposed antenna can simultaneously cover the frequency ranges of l-band, wimax, and wlan applications. the corresponding frequency responses of all the antennas (#ant.1, #ant.2, #ant.3, and #proposed antenna) are given in table 2.

fig. 2 step-by-step evolution process of the proposed antenna

fig.
3 simulated reflection coefficient of the various antenna structures

the lengths and widths of the antenna parameters were finalized after a large number of simulations carried out with the electromagnetic simulation software hfss version 11, which is based on the finite element method. the corresponding parameter values are given in table 1.

table 1 final dimensions of the proposed antenna (all dimensions in mm)
parameter | value | parameter | value | parameter | value
wf | 2.96 | u2 | 1.5 | s1 | 10
lf | 22.2 | u3 | 4 | s2 | 5
lg | 16 | u4 | 1.5 | s3 | 8
wg | 35 | u5 | 4 | s4 | 8
l1 | 8.04 | il1 | 1 | s5 | 5
l2 | 6.5 | il2 | 22 | s6 | 10
w1 | 1.5 | il3 | 7 | t1 | 1.5
c | 1 | iw1 | 1.5 | t2 | 9
u1 | 1.5 | | | |

table 2 simulated frequency response of all the antennas
antenna | resonant frequency (ghz) | s11 (db) | bandwidth (mhz)
#ant.1 | 2.65 | -43.55 | 2200 (1.43-3.63)
#ant.2 | 1.58 | -29.16 | 1550 (1.04-2.59)
#ant.2 | 5.35 | -23.53 | 1690 (4.74-6.43)
#ant.3 | 1.2 | -27.66 | 2030 (0.73-2.76)
#ant.3 | 4.74 | -28.54 | 1040 (4.24-5.28)
#ant.3 | 5.65 | -31.15 | 1400 (5.47-6.87)
#proposed antenna | 1.15 | -22.45 | 560 (0.83-1.39)
#proposed antenna | 1.57 | -17.7 | 350 (1.46-1.81)
#proposed antenna | 2.66 | -36.82 | 330 (2.47-2.80)
#proposed antenna | 5.15, 5.85 | -36.36, -32.15 | 1830 (4.27-6.1)

2.1. parametric analysis
in this section, the effects of the primary parameters of the radiating elements on the operating bands of the proposed antenna are studied. the main characterizing parameters are l2, u3, u5, il3, il2, s3 and s4. the investigation is carried out by varying one parameter at a time while the other parameters are kept fixed at their final dimensions, which are listed in the previous section. fig. 4 shows the simulated return loss (db) for different values of l2. as l2 is increased from 5.5mm to 7.5mm, the resonant frequency of the upper band shifts from 6.03ghz to 5.78ghz, passing through 5.85ghz, while the other frequencies remain almost the same as their original resonant frequencies.

fig. 4 simulated reflection coefficient of the proposed antenna for different values of l2

fig.
5 illustrates the return loss for various values of u3 and u5. as u3 and u5 are increased from 1.5mm to 3.5mm, the first two bands shift simultaneously from 1.27ghz and 1.67ghz to 0.99ghz and 1.43ghz. it can also be observed that the resonant frequency of the upper band (5.85ghz) shifts from 6.08ghz to 5.56ghz, and that the bandwidth of this band is reduced by 6.55%.

fig. 5 simulated reflection coefficient of the proposed antenna for different values of u3 and u5

fig. 6 shows the return loss versus frequency for different values of il3. as il3 is increased from 6mm to 8mm, two effects can be observed. first, the resonant frequency decreases from 5.31ghz to 4.75ghz, and second, the s11(db) of the second band (at 1.57ghz) deepens from -15.49db to -21.71db. the best performance of the proposed antenna is therefore obtained at il3=7mm.

fig. 6 simulated reflection coefficient of the proposed antenna for different values of il3

fig. 7 shows the simulated return loss of the proposed antenna for different values of il2. the other parameters are the same as above. the value of il2 affects the resonant frequency at 5.15ghz, whereas all the other resonant frequencies are almost unchanged. when il2 is increased from 21mm to 24mm, the resonant frequency moves from 5.35ghz to 4.51ghz.

fig. 7 simulated reflection coefficient of the proposed antenna for different values of il2

finally, the slot parameters (s3, s4) of the ground plane affect the return loss of the antenna; here s3 and s4 are changed simultaneously while the other parameters are fixed. the simulated return loss curves for different values of s3 and s4 are shown in fig. 8. from the figure it is clear that the resonant frequency of the third band (wimax 2.5ghz band) shifts from 2.91ghz to 2.52ghz as s3 and s4 are increased from 6mm to 9mm.

fig.
8 simulated reflection coefficient of the proposed antenna for different values of s3 and s4

3. experimental results
the prototype of the proposed quad-band antenna is depicted in fig. 9. the simulated and measured frequency responses of the proposed antenna are compared graphically in fig. 10 and in tabular form in table 3. the measurement was carried out with a rohde & schwarz (zva 20) vector network analyzer. the measured results show that the proposed antenna resonates at four distinct frequencies: 1.27ghz (from 1.16-1.33ghz, percentage bandwidth 13.38%), 1.72ghz (from 1.53-2.08ghz, percentage bandwidth 31.97%), 2.59ghz (from 2.43-2.90ghz, percentage bandwidth 18.14%) and 5.73ghz (from 3.77-7.70ghz, percentage bandwidth 68.58%). the impedance bandwidths based on -10db return loss are about 170mhz, 550mhz, 470mhz and 3930mhz. the obtained bandwidths satisfy the requirements of l-band, wimax, and wlan applications. the discrepancy between the measured and simulated results may be due to fabrication tolerances, dielectric losses, and a low-quality sma connector.

table 3 comparison between simulated and experimental results
simulated resonant frequency (ghz) | simulated bandwidth (mhz) | simulated s11 (db) | measured resonant frequency (ghz) | measured bandwidth (mhz) | measured s11 (db)
1.15 | 560 | -22.45 | 1.27 | 170 | -21
1.57 | 350 | -17.70 | 1.72 | 550 | -16.94
2.66 | 330 | -36.82 | 2.59 | 470 | -17.42
5.15, 5.85 | 1830 | -36.36, -32.15 | 5.73 | 3930 | -25.99

fig. 9 photograph of the proposed quad-band antenna (a) top view (b) bottom view

fig. 10 comparison of measured and simulated s11 (db) of the proposed quad-band antenna

once the resonant frequencies at 1.27ghz, 1.72ghz, 2.59ghz and 5.73ghz were obtained, the radiation patterns and gain were also measured at these frequencies. fig.
11 shows the measured far-field radiation pattern of e-plane and h-plane at 2.59ghz, and 5.73ghz. it is observed that the e-plane patterns are dipole (shape of 8) in nature whereas h-plane patterns are omni-directional. the simulated far-field normalized radiation patterns of proposed antenna is also illustrated in fig. 12, at 1.72ghz and 1.27ghz.

fig. 11 measured far field radiation patterns in e-plane and h-plane at (a) 2.59ghz (b) 5.73ghz

fig. 12 simulated far field radiation patterns in e-plane and h-plane at (a) 1.72ghz (b) 1.27ghz

fig. 13 shows the measured gains at the desired frequency bands. the gains at 1.27ghz, 1.72ghz, 2.59ghz and 5.73ghz are 2dbi, 1.25dbi, 2.7dbi and 2.95dbi, respectively.

fig. 13 measured gain (dbi) of proposed antenna in the operating region

the performance comparison of the proposed antenna with some other reference antenna is shown in table 4. it is clearly seen that the proposed antenna has very good impedance bandwidth compared to the other works.

table 4 a comparative study of proposed antenna with some reference antenna
ref. (no. of bands) | size (mm³) | bandwidth (mhz) | gain (dbi)
proposed antenna (quad-band) | 50x35x1.6 | 170 (1.16-1.33ghz); 550 (1.53-2.08ghz); 470 (2.43-2.90ghz); 3930 (3.77-7.70ghz) | 2; 1.25; 2.7; 2.95
[2] (quad-band) | 20x30x1.6 | 840 (1.79-2.63ghz); 480 (3.49-3.97ghz); 930 (4.92-5.85ghz); 530 (7.87-8.40ghz) | 2.5 to 6.9
[21] (triple-band) | 38x25x1.59 | 300 (2.4-2.7ghz); 1050 (3.1-4.15ghz); 960 (4.93-5.89ghz) | 1.85; 2.19; 2.57
[22] (quad-band) | 14x22x1.6 | 180 (1.73-1.91ghz); 280 (2.23-2.51ghz); 940 (2.89-3.83ghz); 1310 (4.88-6.19ghz) | not specified
[23] (quad-band) | 71x52x1 | 360 (1.1-1.46ghz); 680 (2.23-2.91ghz); 540 (3.41-3.95ghz); 720 (5.24-5.96ghz) | 9.48; 2.15; 3.5; 6.48
[24] (hexa-band) | 125x85x1.57 | 140 (0.87-1.01ghz); 240 (1.72-1.96ghz); 550 (2.28-2.83ghz); 670 (5.71-6.38ghz) | 1.83; 3.17; 3.23; 5.82

4.
conclusion

a planar quad-band monopole antenna has been proposed in this article. the proposed antenna has been designed with three strips in the form of l, u and inverted l, which act as the radiating element, and a defected ground plane with slots. the proposed antenna has a volume of 50x35x1.6mm³. the measured results show that the -10db impedance bandwidths of the proposed antenna are 170mhz, 550mhz, 470mhz and 3930mhz, which is sufficient for l-band, wimax and wlan applications. the measured results therefore imply that the proposed antenna is well suited for practical applications in the desired bands with very good bandwidth.

references

[1] t. h. chang and j. f. kiang, "compact multi-band h-shaped slot antenna", ieee transactions on antennas and propagation, vol. 61, pp. 4345-4349, may 2013. [2] j. dong, x. yu and g. hu, "design of a compact quad-band slot antenna for integrated mobile devices", international journal of antennas and propagation, vol. 2016, pp. 1-9, june 2016. [3] y. f. cao, s. w. cheung and t. i. yuk, "a multiband slot antenna for gps/wimax/wlan systems", ieee transactions on antennas and propagation, vol. 63, pp. 952-958, january 2015. [4] l. xiong, p. gao and p. tang, "quad-band rectangular wide slot antenna for gps/wimax/wlan applications", progress in electromagnetics research c, vol. 30, pp. 201–211, 2012. [5] a. soliman, d. elsheakh, e. abdallah and h. e. hennawy, "multiband printed metamaterial inverted f antenna (ifa) for usb applications", ieee antennas and wireless propagation letters, vol. 14, pp. 297-300, september 2014. [6] d. g. kang and y. sung, "compact hexa band pifa antenna for mobile handset applications", ieee antennas and wireless propagation letters, vol. 9, pp. 1127-1130, november 2010. [7] m. agarwal, r. singh and m. k. meshram, "linearly polarized planar inverted f-antenna for global positioning system and worldwide interoperability for microwave access applications", iet microwaves, antennas & propagation, vol.
7, pp. 991-998, september 2013. [8] d. m. elsheakh and e. a. abdallah, "compact multiband multifolded slot antenna loaded with printed ifa", ieee antennas and wireless propagation letters, vol. 11, pp. 1478-1481, december 2012. [9] c. k. wu, t. f. chien, c. l. yang and c. h. luo, "design of novel s-shaped quad-band antenna for medradio/wmts/ism implantable biotelemetry applications", international journal of antennas and propagation, vol. 2012, pp. 1-12, june 2012. [10] m. ram, s. das and l. r. yadava, "a quad band sierpinski trapezoidal fractal patch antenna for wireless applications", journal of microwaves, optoelectronics and electromagnetic applications, vol. 16, pp. 25-37, march 2017. [11] v. rajeshkumar and s. raghavan, "trapezoidal ring quad-band fractal antenna for wlan/wimax applications", microwave and optical technology letters, vol. 56, pp. 2545-2548, august 2014. [12] s. sivasundarapandian and c. d. suriyakala, "a planar multiband koch snowflake fractal antenna for cognitive radio", international journal of microwave and wireless technologies, vol. 9, pp. 335-339, march 2017. [13] m. moosazadeh and s. kharkovsky, "compact and small planar monopole antenna with symmetrical l and u-shaped slots for wlan/wimax applications", ieee antennas and wireless propagation letters, vol. 13, pp. 388-391, february 2014. [14] s. chen, m. fang, d. dong, m. han and g. liu, "compact multiband antenna for gps/wimax/wlan applications", microwave and optical technology letters, vol. 57, pp. 1769-1773, may 2015. [15] w. c. liu, c. m. wu and y. dai, "design of triple frequency microstrip fed monopole antenna using defected ground structure", ieee transactions on antennas and propagation, vol. 59, pp. 2457-2463, may 2011. [16] g. j. jo, s. m. mun, d. s. im, g. r. kim, y. g. choi and j. h. yoon, "novel design of a cpw-fed monopole antenna with three arc-shaped strips for wlan/wimax operations", microwave and optical technology letters, vol. 57, pp. 268-273, december 2015. 
[17] r. pandeeswari and s. raghavan, "a cpw-fed triple band ocsrr embedded monopole antenna with modified ground for wlan and wimax applications", microwave and optical technology letters, vol. 57, pp. 2413-2418, july 2015. [18] r. k. badhai and n. gupta, "compact asymmetric coplanar strip fed sinc shaped monopole antenna for multiband applications", international journal of microwave and wireless technologies, vol. 9, pp. 205-211, february 2017. [19] j. h. yoon and y. c. rhee, "modified three circular ring monopole antenna for wimax/wlan triple band operations", microwave and optical technology letters, vol. 56, pp. 625-631, january 2014. [20] r. karimian, h. oraizi, s. fakhte and m. farahani, "novel f-shaped quad-band printed slot antenna for wlan and wimax mimo systems", ieee antennas and wireless propagation letters, vol. 12, pp. 405-408, march 2013. [21] j. pei, a. g. wang, s. gao and w. leng, "miniaturized triple band antenna with a defected ground plane for wlan/wimax applications", ieee antennas and wireless propagation letters, vol. 10, pp. 298-301, april 2011. [22] n. ojaroudi, m. ojaroudi and n. ghadimi, "a new design of printed monopole antenna with multiresonance characteristics for dcs/wlan/wimax applications", applied computational electromagnetics society journal, vol. 28, pp. 731-736, august 2013. [23] chandan, t. srivastava and b. s. rai, "multiband monopole u-slot patch antenna with truncated ground plane", microwave and optical technology letters, vol. 58, pp. 1949-1952, may 2016. [24] w. t. sethi, h. vettikalladi, h. fathallah and m. himdi, "hexa-band printed monopole antenna for wireless applications", microwave and optical technology letters, vol. 59, pp. 2816-2822, august 2017.

facta universitatis series: electronics and energetics vol. 27, no 4, december 2014, pp.
639 648 doi: 10.2298/fuee1404639g the photovoltaic behavior of vacuum deposited diphenyl-diketo-pyrrolopyrrole polymer  yavor georgiev 1 , george angelov 1 , tihomir takov 1 , ivaylo zhivkov 2,3 , marin hristov 1 1 faculty of electronic engineering and technologies, department of microelectronics, technical university of sofia, sofia, bulgaria 2 faculty of chemistry, centre for materials research, brno university of technology, brno, czech republic 3 institute of optical materials and technologies ”acad. j. malinowski”, bulgarian academy of sciences, sofia, bulgaria abstract. diphenyl-diketo-pyrrolopyrroles (dpp) are low molecular weight materials with promising luminescence and photovoltaic properties which are suitable for thin layer fabrication by a variety of methods like solution processed spin coating and physical vapour deposition (pvd). in this paper we investigate two types of indium tin oxide (ito)|active material|al structures: one with dpp derivative as active polymer layer with thicknesses of about 150nm and a second type of heterojunction solar cell based on dpp-c60 composite films (ratio 60:40 mass %) with thicknesses of about 100nm. the samples in this study have been prepared by the means of pvd which has the capability to produce the whole multilayered structure in one vacuum cycle after which the samples were processed and encapsulated within inert atmosphere in a glove box. the surface morphology of the films was studied by scanning electron microscope (sem) imaging which revealed formations of grains with size of about 200-500nm in the dpp layers and spheres with size of about 100-200nm in the composite layer. the photovoltaic behavior was evaluated through the results of spectral sweep of the generated photocurrent and i-v characteristics in dark and under illumination with specific wavelength. the photovoltaic behavior was successfully demonstrated and directions toward performance improvements have been given. 
key words: photovoltaic behavior, diphenyl-diketo-pyrrolopyrrole, fullerene c60, organic solar cells, physical vapour deposition

received july 28, 2014; received in revised form october 13, 2014
corresponding author: yavor georgiev, faculty of electronic engineering and technologies, department of microelectronics, technical university of sofia, kl. ochridski blvd. 8, 1756 sofia, bulgaria (e-mail: angelov@ecad.tu-sofia.bg)

640 y. georgiev, g. angelov, t. takov, i. zhivkov, m. hristov

1. introduction

organic solar cells (oscs) are currently a subject of increased research and development effort due to their flexibility and inexpensive processing. their basic architecture is a sandwich-type structure of stacked organic and inorganic thin films with high demands on their properties, e.g. film homogeneity, thickness uniformity and roughness. diphenyl-diketo-pyrrolopyrroles (dpp) are low molecular weight materials with promising luminescence and photovoltaic properties [1, 2]. the derivative 1,4-diketo-3,6-diphenyl-pyrrolo-[3,4-c]-pyrrole (dpp) was first described in 1974. the reaction shown in fig. 1 represents its first synthesis method, with a synthesis yield from 5 to 20 % [3]. later, different approaches to diketopyrrolopyrrole synthesis were found, for example synthesis from succinic acid diester using a strong base [1, 2]; other methods are available in the literature as well [4].

fig. 1 diketopyrrolopyrrole first synthesis.

dpp is a very stable material, red in the solid phase, with very low solubility. once dissolved, the obtained solution is yellow with a slight green tint. the absorption maximum differs between solid state and solution (cca 500 nm → cca 540 nm) because, in solution, the hydrogen bonds are broken and the π-electron overlap and the crystal structure are lost.
in 1986, the first derivative was introduced to the market. since then, many other derivatives have been synthesized and are the subject of patents in many areas of application, such as organic field-effect transistors [5], oled devices [6], solid-state dye lasers [7] and luminescent media in polymer matrices [8]. presently these compounds are the object of intensive research because they exhibit a variety of shades in the solid state and, especially, chemical, light and thermal stability [9]. dpp itself has a high molar absorption coefficient as well as a high quantum yield of fluorescence. low molecular weight derivatives of dpp and dpp based polymers have therefore been extensively studied for their optical and photovoltaic properties [10–15] and have found their place in organic electronics as promising candidates for future organic solar cells. this paper deals with 3,6-bis(5-(benzofuran-2-yl)thiophen-2-yl)-2,5-bis(2-ethylhexyl) pyrrolo[3,4-c]pyrrole-1,4(2h,5h)-dione, denoted hereafter as dpp. using this material as donor in bulk heterojunction solar cells leads to photovoltaic conversion efficiencies of up to 4.4% [3].

to separate the strongly bound frenkel excitons in organic films, the donor-acceptor concept was proposed [16]. so far, fullerene c60 and c70, as well as their derivatives, are the most successful acceptors for oscs [17]. power conversion efficiencies in the range of 6 to 8% have been reported for solution-processed single junction bulk heterojunction solar cells combining fullerene derivatives as acceptor material and π-conjugated polymers as donors [18–21]. the conjugated system leads to high charge carrier mobility across the structure. the thiophene rings secure the planarity of the molecule, so even adding the ethylhexyl solubilising groups does not break the conjugation.
organic bulk-heterojunction solar cells comprising are reported to demonstrate lifetimes approaching seven years, which is the longest reported lifetime for polymer solar cells [22]. small molecules show many advantages as higher possibility of supramolecular organization, higher purity and easier production [23]. they also do not suffer from batch to batch variations, broad molecular-weight distributions and end-group contamination [3]. thin films of small molecular semiconductors are usually deposited by the means of complex fabrication methods such as chemical or physical vapour deposition, organic molecular beam epitaxy or solution-based deposition techniques (most notably spin coating). the performance of the resulting thin layers has been shown to be highly sensitive to film morphology and the processing conditions. often, the solution processed active layers of the devices (e.g. spin cast films) exhibit a high portion of microcrystallites and aggregates whereas the vapour deposition techniques provide high quality crystalline films which are characterized by improved charge transportation [14]. recently the relationship between the organic thin film morphology and the device performance is subject of intensive research. we have chosen the dpp and its derivatives as the core material of our research because of its wide range of applications, proven photovoltaic properties and versatile thin film fabrication methods – thin films from this material could be made with solution processed methods like spin or spray coating and through vapour deposition processes like the pvd. we concentrated our efforts on the physical vapour deposition of dpp in order to obtain quality thin films which could be used as a reference for its performance and for future studies on its possible application as integrated optical detector. current paper consists of two main sections. 
the experiment section presents the chemical structure of the materials used, fabrication process details (pvd parameters and process flow, encapsulation procedure, etc.), the specification of the equipment used for fabrication and characterization of our test samples, and their construction parameters (dimensions and arrangement). the results section presents and analyzes images of the test sample surfaces obtained with two complementary methods: visual inspection with an optical microscope under polarized light and sem analysis. it then presents the results of the photovoltaic characterization, most notably the spectral and i-v measurements of the samples.

2. experiment

in the course of the present study two types of solar cell devices were prepared and evaluated. their structure was [glass substrate|ito|active material|al|encapsulation epoxy|glass seal], where the “active material” was either dpp polymer or the composite thin film dpp-c60 in a mass % ratio of 60:40. the chemical structures of the dpp polymer used in this study and of the fullerene c60 are presented in fig. 2.

fig. 2 chemical structure of the dpp (a) and the fullerene c60 (b).

the active material (dpp) was provided by centre for organic chemistry ltd. (coc), who synthesized it according to the route published in 2009 by walker et al. [3]. fullerene c60 pcbm ([6, 6]-phenyl-c61-butyric acid methyl ester) was purchased commercially from ossila ltd. for the fabrication of our samples we used “the ossila oled/opv pixelated anode substrate system” and “the ossila encapsulation system”, consisting of: 20×15 mm pre-patterned ito-covered glass substrates, active area and cathode deposition masks, encapsulation glass slides, encapsulation epoxy and electrical connection legs with standard 0.1 inch (2.54 mm) pitch.
the glass substrates had a specific ito pattern which made it possible to obtain 6 solar cell samples on one substrate. the pattern was as follows: one cathode strip along the long side and three small strips (fingers) on both perpendicular sides, parallel to the cathode strip, which after deposition form six active zones with dimensions 4×1.5 mm and the six anodes connected to them. the active material thin film deposition was carried out in an mbraun vacuum system through the active area deposition mask, at an evaporation temperature of 180°c for the dpp and 420°c for the c60, with deposition rates of 4.2 å/s and 2.5 å/s, respectively. for the composite type of samples the thin film was deposited simultaneously from two sources. after deposition of the active layer, the samples were taken out into a glove box filled with nitrogen atmosphere, the mask was exchanged for the cathode deposition mask and the samples were returned to the evaporation chamber. once the vacuum in the evaporation chamber had reached a sufficient level, aluminum electrodes were deposited. as a final and optional step, the encapsulation procedure was carried out on some of the samples. the encapsulation procedure consists of covering the al electrode area with encapsulation epoxy, placing an encapsulation glass on top of it and curing the test sample under a uv lamp for 30 minutes. the encapsulation procedure is again performed in the glove box, without exposing the sample to oxygen. selected samples were not encapsulated so that they could be studied with a philips 515 scanning electron microscope. a cross-section of the resulting fully fabricated and characterization-ready solar cell sample is given in fig. 3.

fig. 3 cross-section of the organic solar cell sample.

for the photovoltaic characterization we needed monochromatic light, which was produced by lot-oriel halogen lamp lsh502 and lot-oriel monochromator msh101.
the applied voltage was provided, and the generated photocurrent measured, by a keithley 6517a electrometer and voltage source. the light power was controlled and measured by the s120vc standard si photodiode power sensor and a keithley 485 picoammeter. photovoltaic measurements consisted of the spectral dependence of the photocurrent at zero applied voltage and i-v characteristics, measured in both directions of the voltage scale in the dark and under monochromatic light. all photovoltaic measurements were carried out on a vibration proof optical table from standa.

3. results

3.1. surface characterization

an optical micrograph of the prepared sample is presented in fig. 4. as seen, uniform films entirely covering the ito electrode area were obtained.

fig. 4 the following device structure can be seen on the optical micrograph under polarized light: a – ito electrode, b – active material and c – al electrode.

the precise film surface characterization of the deposited pure polymer and composite active layers was performed by scanning electron microscope (sem). the sem images taken at two different magnifications from the same sample without applying the encapsulation procedure are given in fig. 5. the images taken at the lower magnification of 20000 (fig. 5a) for dpp films and of 10000 (fig. 5c) for the composite films of dpp-c60 confirm that smooth and uniform thin films without pinholes were obtained. the image of the dpp layer taken at magnification of 40000 (fig. 5b) reveals a formation of grains with size of about 200-500 nm. the observed grains follow a predominant orientation and could be related to a formation of crystallites, which is expected behavior for such low molecular weight compounds.
the sem image of the composite thin film taken at magnification of 40000 (fig. 5d) reveals a formation of spheres with a size of about 100-200 nm. the observed spheres could be related to a formation of a single phase from one of the components. to clarify these findings, samples with different dpp-c60 ratios should be investigated further and compared with dpp thin films deposited by other fabrication techniques. in general it could be concluded that the deposited dpp polymer and dpp-c60 composite films are suitable for electrical measurements in “sandwich” type samples.

fig. 5 sem images of the deposited active layer thin films: a) dpp, magnification of 20000; b) dpp, magnification of 40000; c) dpp-c60, magnification of 10000; d) dpp-c60, magnification of 40000.

3.2. photovoltaic characterization

spectral dependences of the photocurrent, measured at zero applied voltage between the electrodes of the ito|active material|al structures, are presented in fig. 6. the solar cell samples with dpp thin films as active material exhibit a maximum of the spectrum at an excitation light wavelength of 400 nm (fig. 6a). the stacked structures with composite thin films of dpp-c60 as active material show a maximum of the spectrum at an excitation light wavelength of 533 nm (fig. 6b). similar spectral dependences of the photocurrent, taken from samples with the same materials as active layers but prepared by the spin-coating technique, were published in the literature [3]. in the subsequent experiments each sample type was excited with monochromatic light at the wavelength of its spectral maximum. as expected, the addition of the fullerene c60 in the second type of solar cell samples under investigation has led to an increase of the generated photocurrent, to around 15 times the current generated by the pure dpp solar cell samples.
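the roughly 15-fold photocurrent gain can be cross-checked against the short circuit current densities listed in table 1, and the same table gives enough data to estimate a fill factor and a conversion efficiency. the sketch below is not part of the original analysis: the fill factor formula ff = pmax/(jsc·uoc) is the standard definition rather than one stated in the paper, pmax is recomputed here as ump·jmp, the irradiance is the 0.95 mw.cm⁻² quoted in the caption of fig. 7, and all variable names are illustrative.

```python
# values taken from table 1: current densities in nA/cm^2, voltages in V
cells = {
    "dpp":     {"jsc": 2.5,  "uoc": 0.71, "ump": 0.50, "jmp": 1.67},
    "dpp-c60": {"jsc": 36.0, "uoc": 0.71, "ump": 0.25, "jmp": 5.24},
}

IRRADIANCE_MW_CM2 = 0.95   # monochromatic irradiance from the fig. 7 caption
NW_TO_MW = 1e-6            # 1 nW = 1e-6 mW


def analyse(cell):
    """Estimate maximum power, fill factor and efficiency for one cell."""
    pmax_nw = cell["ump"] * cell["jmp"]                   # nW/cm^2
    fill_factor = pmax_nw / (cell["jsc"] * cell["uoc"])   # dimensionless
    pmax_mw = pmax_nw * NW_TO_MW                          # mW/cm^2
    efficiency = pmax_mw / IRRADIANCE_MW_CM2              # fraction, not percent
    return {
        "pmax_mw_cm2": pmax_mw,
        "fill_factor": fill_factor,
        "efficiency": efficiency,
    }


results = {name: analyse(cell) for name, cell in cells.items()}
jsc_ratio = cells["dpp-c60"]["jsc"] / cells["dpp"]["jsc"]

if __name__ == "__main__":
    print(f"jsc ratio (dpp-c60 / dpp): {jsc_ratio:.1f}")
    for name, r in results.items():
        print(
            f"{name}: pmax = {r['pmax_mw_cm2']:.2e} mW/cm^2, "
            f"ff = {r['fill_factor']:.3f}, "
            f"efficiency = {100 * r['efficiency']:.1e} %"
        )
```

with these inputs the jsc ratio comes out at 14.4, in line with the "around 15 times" estimate, and both estimated efficiencies lie far below 1%. the recomputed pmax for the dpp-c60 cell is slightly above the tabulated 12·10⁻⁷ mw.cm⁻², which may simply reflect rounding of the tabulated operating point.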
moreover, there is a significant widening of the useful wavelength bandwidth: the heterojunction solar cell could be successfully utilized in the entire examined spectrum (from 350 nm to 800 nm), as opposed to the single polymer solar cell, which is only useful in the range from 350 nm to 500 nm. under illumination with monochromatic light the non-optimized samples produced a photocurrent almost two orders of magnitude higher than that observed in the dark i-v measurements. processing the data for the structures, the short circuit current jsc and the open circuit voltage uoc were determined. from the area confined by jsc and uoc, the dependences of the electrical power on the voltage applied are plotted in fig. 8, right y axis.

fig. 6 spectral dependence of the photocurrent at zero applied voltage: a) single polymer, pure dpp solar cell (λmax = 400 nm) b) heterojunction solar cell, dpp-c60 composite with ratio 60:40 mass % (λmax = 533 nm).

the maximum electrical power pmax was found at the maximum power voltage ump and the maximum power current density jmp. the results are given in table 1.

table 1 photovoltaic parameters estimated from i-v curves.
parameter | dpp | dpp-c60 | dimension
jsc | 2.5 | 36 | na.cm⁻²
uoc | 0.71 | 0.71 | v
pmax | 8.4·10⁻⁷ | 12·10⁻⁷ | mw.cm⁻²
ump | 0.5 | 0.25 | v
jmp | 1.67 | 5.24 | na.cm⁻²

fig. 7 i-v characteristics measured in the dark and under monochromatic light with irradiance of 0.95 mw.cm⁻², a) dpp samples and b) dpp-c60 samples.

fig. 8 dependence of the power density (right y axis) and current density (left y axis) on the voltage applied, as calculated from and under the same conditions as fig. 7, light curves, a) dpp samples and b) dpp-c60 samples.

4. conclusion

two types of ito|active material|al structures were prepared by vacuum deposition and encapsulated in an inert atmosphere.
the evaluated active thin layers were dpp with 150 nm thickness and a composite active layer of dpp and fullerene c60, deposited simultaneously from two evaporation sources, with thickness of 100 nm and mass % ratio of about 60:40. the surface morphology of the fabricated thin films was studied by sem imaging, which revealed formations of grains with size of about 200-500 nm in the dpp layers and spheres with size of about 100-200 nm in the composite layer. photovoltaic behavior was successfully demonstrated and evaluated. although the basic construction of a single junction osc showed the expected low power efficiency, below 1%, it could be concluded that the vacuum deposited active layers have promising photovoltaic properties for incorporation in future oscs and other organic electronic devices. further optimization should be directed towards reducing the contact barrier and facilitating the excitation decay and charge carrier extraction through the addition of intermediate interface layers like poly(3,4-ethylenedioxythiophene) polystyrene sulfonate (pedot:pss) and ca. other possible routes for optimization include thermal annealing of the newly formed thin layers [18]; optimization of the whole structure of the oscs by introducing gold nanoparticles with different morphologies into buffer layers [24]; using more than one stacked photoactive layer with complementary absorption spectra to form multi-junction organic solar cells [25 – 27]; or formation of radial electron contacts [28].

acknowledgement: this research was funded with support from the contract no. 132pd0048-03, one of the projects for helping phd students at technical university of sofia, and facilitated by the erasmus programme for student mobility, bilateral agreement cz-08, individual agreement sm55.
the research was also co-funded by grant no lo1211 from the ministry of education youth and sports of the czech republic. references [1] wienk, m. m., turbiez, m., gilot, j. and janssen, r. a. j., “narrow-bandgap diketo-pyrrolo-pyrrole polymer solar cells: the effect of processing on the performance”, adv. mater., 20, pp. 2556–2560, 2008. doi: 10.1002/ adma.200800456 [2] guoqiang zhang, kuan liu, haijun fan, yang li, xiaowei zhan, yongfang li, mujie yang, “the photovoltaic behaviors of ppvand ppe-type conjugated polymers featured with diketopyrrolopyrrole (dpp) units”, synthetic metals, vol. 159, issues 19–20, october 2009, pp. 1991-1995. issn 0379-6779. [3] walker b, tamayo ab, dang xd, zalar p, seo jh, garcia a, et al., “nanoscale phase separation and high photovoltaic efficiency in solution-processed, small-molecule bulk heterojunction solar cells.” adv. funct. mater., 19, pp. 3063-3069, 2009. [4] faulkner, e. b., schwartz, r. j., hight performance pigments. weinheim willey-vch, p. 538, 2009. isbn 978-3-527-31405-8. [5] yanagisawa h., mizuguchi j., aramakil s., and sakai y., jpn. j. appl. phys., 47/6, pp. 4728, 2008. [6] patents wo 2004/090046, us 2005/0008892 [7] makoto fukuda, kunihiko kodama, hiroshi yamamoto, keiichi mito, “solid-state laser with newly synthesized pigment”, dyes and pigments, volume 53, issue 1, pp. 67-72, april 2002. issn 0143-7208. [8] mario smet, bert metten, wim dehaen, “construction of rod-like diketopyrrolopyrrole oligomers with well-defined length”, tetrahedron letters, volume 42, issue 37, pp. 6527-6530, 10 september 2001. issn 0040-4039. [9] wallquist, o. and lenz, r., “20 years of dpp pigments – future perspectives”, macromol. symp., 187: 617–630, 2002. doi: 10.1002/1521-3900(200209)187:1<617::aid-masy617>3.0.co;2-5. [10] rui zhou, qing-duan li, xin-chen li, shun-mian lu, li-ping wang, chun-hui zhang, ju huang, ping chen, feng li, xu-hui zhu, wallace c.h. 
choy, junbiao peng, yong cao, xiong gong, “a solutionprocessable diketopyrrolopyrrole dye molecule with (fluoronaphthyl)thienyl endgroups for organic solar cells”, dyes and pigments, volume 101, pp. 51-57, february 2014. issn 0143-7208. 648 y. georgiev, g. angelov, t. takov, i. zhivkov, m. hristov [11] yujeong kim, chang eun song, ara cho, jungwoon kim, yoonho eom, jongho ahn, sang-jin moon, eunhee lim, “synthesis of diketopyrrolopyrrole (dpp)-based small molecule donors containing thiophene or furan for photovoltaic applications”, materials chemistry and physics, volume 143, issue 2, pp. 825-829, 15 january 2014. issn 0254-0584. [12] guanjun zhang, longfeng song, shiming bi, yue wu, jianjun yu, limin wang, “mild synthesis and photophysical properties of symmetrically substituted diketopyrrolopyrrole derivatives”, dyes and pigments, volume 102, pp. 100-106, march 2014. issn 0143-7208. [13] yum, j.-h. et al., “blue-coloured highly efficient dye-sensitized solar cells by implementing the diketopyrrolopyrrole chromophore”. sci. rep. 3, pp. 2446, 2013. doi:10.1038/srep02446. [14] martin weiter, martin vala, imad ouzzane, miroslava šperova, patricie heinrichova, štěpán frebort, lubomír kubac, jan vynuchal, zdeněk elias, stanislav lunak – "soluble diketo-pyrrolo-pyrroles for organic electronics and photonics", 4th international conference nanocon 2012, brno, conference proceedings, october 23rd-25th, 2012. isbn: 978-80-87294-32-1 [15] štěpán frebort, martin vala, stanislav luňák jr., jana honová, tomáš mikysek, zdeněk eliáš, antonín lyčka, diphenylamine end-capped diketopyrrolopyrroles with phenylene–vinylene conjugation extension, tetrahedron letters, volume 55, issue 17, pages 2829-2834, 23 april 2014. issn 0040-4039 [16] tang, c. w. 2-layer organic photovoltaic cell. appl. phys. lett. 
48, 183–185 (1986), doi: 10.1063/ 1.96937 [17] yongsheng liu, chun-chao chen, ziruo hong, jing gao, yang (michael) yang, huanping zhou, letian dou, gang li and yang yang, “solution-processed small-molecule solar cells: breaking the 10% power conversion efficiency”, scientific reports 3, article number:3356, doi:10.1038/srep03356, 2013 [18] kim k, liu j, namboothiry mag, carroll dl, “the role of donor and acceptor nanodomains in 6% thermally annealed polymer photovoltaics” appl phys lett, 90, 2007. [19] park sh, roy a, beaupre s, cho s, coates n, moon js, et al., “bulk heterojunction solar cells with internal quantum efficiency approaching 100%”, nat photon., 3:297, 2009. [20] liang ll, xu z, xia jb, tsai st, wu y, li g, et al., “for the bright future-bulk heterojunction polymer solar cells with power conversion efficiency of 7.4%”, adv mater, 22:e135, 2010. [21] he z, zhong c, huang x, wong wy, wu h, chen l, et al., “simultaneous enhancement of open-circuit voltage, short-circuit current density, and fill factor in polymer solar cells”, adv mater, 23:4636, 2011. [22] peters, craig h., sachs-quintana, i. t., kastrop, john p., beaupré, serge, leclerc, mario, mcgehee, michael d., high efficiency polymer solar cells with long operating lifetimes, adv. energy mater., vol. 1, issue 4, pp. 491-494, wiley-vch verlag, issn: 1614-6840, 2011, doi:10.1002/aenm.201100138 [23] honova, j.; hrabal, m.; stritesky, s.; heinrichova, p.; vala, m.; weiter, m., “new diketopyrrolopyrrole derivatives for organic photovoltaics”, student conference chemistry for life, brno, p. 82, 2013. isbn: 978-80-214-48223. [24] duygu kozanoglu, dogukan hazar apaydin, ali cirpan, emren nalbant esenturk, power conversion efficiency enhancement of organic solar cells by addition of gold nanostars, nanorods, and nanospheres, organic electronics, vol. 14, issue 7, pp. 1720-1727, issn 1566-1199, 2013, http://dx.doi.org/10.1016/ j.orgel.2013.04.008. [25] kim, j. y. et al. 
efficient tandem polymer solar cells fabricated by all-solution processing. science vol. 317, pp. 222-225, 2007, doi: 10.1126/science.1141711 [26] li, w. w., furlan, a., hendriks, k. h., wienk, m. m. & janssen, r. a. j. efficient tandem and triplejunction polymer solar cells. j. am. chem. soc. 135, 5529–5532, 2013, doi: 10.1021/ja401434x [27] you, j. b. et al. a polymer tandem solar cell with 10.6% power conversion efficiency. nat. commun. 4, 1446, 2013, doi:10.1038/ncomms2411 [28] jonathan e. allen and charles t. black, improved power conversion efficiency in bulk heterojunction organic solar cells with radial electron contacts, acs nano, 5 (10), pp. 7986-7991, 2011, doi: 10.1021/nn2031963 facta universitatis series: electronics and energetics vol. x, no x, x 2018, pp. x energy-efficient cryptographic primitives elena dubrova∗ royal institute of technology (kth), stockholm, sweden abstract: our society greatly depends on services and applications provided by mobile communication networks. as billions of people and devices become connected, it becomes increasingly important to guarantee security of interactions of all players. in this talk we address several aspects of this important, many-folded problem. first, we show how to design cryptographic primitives which can assure integrity and confidentiality of transmitted messages while satisfying resource constrains of low-end low-cost wireless devices such as sensors or rfid tags. second, we describe countermeasures which can enhance the resistance of hardware implementing cryptographic algorithms to hardware trojans. keywords: security, lightweight cryptography, cryptographic primitive, encryption, message authentication, hardware trojan. 1 introduction today minimal or no security is typically provided to low-end low-cost wireless devices such as sensors or rfid tags in the conventional belief that the information they gather is of little concern to attackers. 
however, case studies have shown that a compromised sensor can be used as a stepping stone to mount an attack on a wireless network.
manuscript received x x, x corresponding author: elena dubrova royal institute of technology (kth), stockholm, sweden (e-mail: dubrova@kth.se) ∗an earlier version of this paper was presented as an invited address at the reed-muller 2017 workshop, novi sad, serbia, may 24-25, 2017.
for example, in the attack described in [1], wireless tire pressure sensors were hacked and used to access the automotive system. as the number and types of connected devices grow, so do the security risks. attacks are becoming more frequent and their effect is more global. the massive distributed denial of service attack of october 21st, 2016, which made millions of webpages inaccessible, has shown how vulnerable the internet is today. future wireless networks are expected to support security-critical services related to industrial automation, traffic safety, smart transport, smart grid, e-health, etc. the value of the information to which the low-end devices will have access via future wireless networks is expected to be much greater than it is today, hence the incentives for attackers will increase [2]. the damage caused by an individual actor may not be limited to a business or reputation, but could have a severe impact on public safety, national economy, and national security. many low-cost wireless devices, such as sensors or radio frequency identification (rfid) tags, work under severe resource constraints such as limited battery and computing power, little memory, and insufficient bandwidth. these devices must dedicate most of their available resources to executing core application functionality and have few resources left for implementing security [3]. to satisfy their constraints, it might be advantageous to combine techniques intended to assure high reliability of communication links (scrambling, checksums, forward error correction (fec)) with cryptographic techniques intended to assure security.
in section 2 we show how functional similarities between error detection and data integrity protection can be exploited to efficiently combine these two functions in one. in section 3, we address another important problem: assuring data confidentiality. for communications over insecure networks, such as the internet, data confidentiality is assured by encryption. in recent years, many cases of successful attacks on networks causing disclosure of private data have been reported. this has increased user privacy concerns. we describe an efficient encryption algorithm which satisfies the resource constraints of low-cost wireless devices and therefore enables encrypting all traffic and data transmitted using these devices. in addition to assuring data integrity and confidentiality, it is equally important to trust the hardware which implements cryptographic algorithms. if cryptographic hardware has a vulnerability, all the efforts in defending a system at network or software levels are wasted. for example, malicious alterations inserted into an integrated circuit at the design or manufacturing stage can open backdoors into a system in spite of cryptographic protection. in section 4, we describe two countermeasures against a type of hardware trojan which exploits the non-zero aliasing probability of built-in self-test (bist).
facta universitatis series: electronics and energetics vol. 31, no 2, june 2018, pp. 157-167 https://doi.org/10.2298/fuee1802157d elena dubrova received october 21, 2017; received in revised form january 29, 2018 corresponding author: elena dubrova royal institute of technology (kth), stockholm, sweden (e-mail: dubrova@kth.se) *an earlier version of this paper was presented as an invited address at the reed-muller 2017 workshop, novi sad, serbia, may 24-25, 2017
facta universitatis series: electronics and energetics vol. 28, no 4, december 2015, pp. 507-525 doi: 10.2298/fuee1504507s horizontal current bipolar transistor (hcbt) – a low-cost, high-performance flexible bicmos technology for rf communication applications tomislav suligoj1, marko koričić1, josip žilak1, hidenori mochizuki2, so-ichi morita2, katsumi shinomura2, hisaya imai2 1university of zagreb, faculty of electrical engineering and computing, department of electronics, micro- and nano-electronics laboratory, croatia 2asahi kasei microdevices co. 5-4960, nobeoka, miyazaki, 882-0031, japan abstract. in an overview of horizontal current bipolar transistor (hcbt) technology, the state-of-the-art integrated silicon bipolar transistors are described which exhibit ft and fmax of 51 ghz and 61 ghz and an ft·bvceo product of 173 ghz·v that are among the highest-performance implanted-base silicon bipolar transistors. hcbt is integrated with cmos in a considerably lower-cost fabrication sequence as compared to standard vertical-current bipolar transistors, with only 2 or 3 additional masks and fewer process steps. due to its specific structure, the charge sharing effect can be employed to increase bvceo without sacrificing ft and fmax. moreover, the electric field can be engineered just by manipulating the lithography masks, achieving high-voltage hcbts with breakdowns up to 36 v integrated in the same process flow with high-speed devices, i.e. at zero additional cost. a double-balanced active mixer circuit is designed and fabricated in hcbt technology. the maximum iip3 of 17.7 dbm at mixer current of 9.2 ma and conversion gain of -5 db are achieved. key words: bicmos technology, bipolar transistors, horizontal current bipolar transistor, radio frequency integrated circuits, mixer, high-voltage bipolar transistors. 1. introduction in the highly competitive wireless communication markets, the rf circuits and systems are fabricated in technologies that are very cost-sensitive. in order to minimize the fabrication costs, the sub-10 ghz applications can be processed by using the high-volume silicon technologies. it has been identified that the optimum solution might received march 9, 2015 corresponding author: tomislav suligoj university of zagreb, faculty of electrical engineering and computing, department of electronics, micro- and nano-electronics laboratory, croatia (e-mail: tom@zemris.fer.hr)
2 message authentication
to authenticate a message means to verify that the message
1. comes from the right sender (its authenticity), and
2. has not been modified (its integrity).
clearly, data integrity protection can be implemented by using some n-bit message authentication code, e.g. keyed hash message authentication code (hmac) [4] or cipher block chaining message authentication code (cbc-mac) [5], on top of an error-detecting code, e.g. n-bit cyclic redundancy check (crc). however, such an approach expands the message by n bits and requires a separate encoding/decoding engine which is more complex than the crc encoding/decoding engine. on the other hand, if we simply replace an n-bit crc with an n-bit hmac or cbc-mac, we cannot guarantee the detection of the same type of random errors as the crc. for example, the detection of n-bit burst errors cannot be guaranteed. this may have a negative impact on the reliability of communication links. only if we make the conventional crc cryptographically secure can we assure a certain level of security without sacrificing reliability. the latter motivated the development of cryptographically secure crcs. the core idea is to make the crc generator polynomial variable and secret. the crc presented by krawczyk [6] is based on irreducible generator polynomials. the approach described in [7] uses a product of irreducible polynomials.
the crc proposed in [8] uses generator polynomials of type $(1 + x)p(x)$, where $p(x)$ is a primitive polynomial. in all three cases, testing for irreducibility or primitivity is required, which is either time- or memory-consuming. selecting an irreducible degree-n polynomial at random requires either selecting a degree-n polynomial at random ($O(n)$ time) and running a test for irreducibility ($\Omega(n^3)$ time [9]), or selecting a polynomial at random from a database of irreducible degree-n polynomials (roughly $2^n/n$ space). note that the irreducibility test has to be done during key agreement, i.e. it incurs a delay before the communication can start. therefore, it is desirable to minimize the time spent on it as much as possible. in [10], we presented a cryptographically secure crc based on any randomly selected generator polynomial, with no requirements on irreducibility. this eliminates the need for irreducibility tests. it takes only $O(n)$ time to generate a random polynomial of degree n. the presented cryptographically secure crc retains most of the implementation simplicity of the traditional crc except that the lfsr implementing the encoding and decoding is required to have re-programmable connections. similarly to previously proposed cryptographically secure crcs, the new one enables combining the detection of random and malicious errors without increasing bandwidth. however, using random polynomials as generator polynomials for the crc gives an adversary a higher chance of breaking authentication. in [10], we provided a detailed quantitative analysis of the achieved security as a function of message and crc lengths and showed that the presented authentication scheme is particularly suitable for short messages.
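as a rough illustration of the idea outlined above, the python sketch below draws a degree-n generator polynomial at random, with no irreducibility test, and computes the crc as the remainder of $M(x) \cdot x^n$ modulo that polynomial over gf(2), mimicking a bit-serial lfsr with programmable feedback connections. polynomials are stored as integers (bit i is the coefficient of $x^i$). the function names, the use of the `secrets` module as the randomness source and the absence of any further protection of the crc value are assumptions of this example; the details of the scheme in [10] are not reproduced here.

```python
import secrets


def random_generator(n: int) -> int:
    """Draw a random degree-n polynomial over GF(2) without testing irreducibility.

    Bit i of the result is the coefficient of x^i; the x^n coefficient is
    forced to 1 so that the degree is exactly n. (Hypothetical helper.)
    """
    return (1 << n) | secrets.randbits(n)


def crc_remainder(message: bytes, generator: int, n: int) -> int:
    """Return (M(x) * x^n) mod g(x) over GF(2) as an n-bit integer.

    Bit-serial model of an n-stage LFSR whose feedback taps are given by
    the low n bits of `generator` (the "re-programmable connections").
    """
    mask = (1 << n) - 1
    taps = generator & mask
    reg = 0
    for byte in message:
        for shift in range(7, -1, -1):        # most significant bit first
            bit = (byte >> shift) & 1
            feedback = ((reg >> (n - 1)) & 1) ^ bit
            reg = (reg << 1) & mask
            if feedback:
                reg ^= taps
    return reg


if __name__ == "__main__":
    n = 16
    g = random_generator(n)                   # secret, agreed during key setup
    msg = b"short m2m message"
    tag = crc_remainder(msg, g, n)
    # A receiver that knows g recomputes the remainder over message + tag.
    print(crc_remainder(msg + tag.to_bytes(2, "big"), g, n) == 0)
```

with the fixed public polynomial $x^8 + x^2 + x + 1$ (`0x107`) the function reduces to a conventional crc-8, which gives a convenient sanity check: `crc_remainder(b"123456789", 0x107, 8)` returns `0xF4`.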
since short messages (a few bytes to a few tens of bytes) are expected to be dominant in machine-to-machine (m2m) communications [11], this message authentication technique might be quite useful for resource-constrained m2m devices.
3 encryption
encryption is the process of transforming a message in such a way that only authorized parties can understand its content. encryption assures message confidentiality. encryption can be performed using either a block or a stream cipher. block ciphers have been studied for over 50 years [12]. collected knowledge about their design and cryptanalysis made it possible to develop the advanced encryption standard (aes) algorithm, which is widely accepted and has strong resistance against various kinds of attacks [13]. on the other hand, an active public investigation of stream ciphers began only about 20 years ago [14]. a common type of stream cipher is the binary additive stream cipher, in which the keystream, the plaintext, and the ciphertext are binary sequences. the keystream is produced by a keystream generator which takes a secret key and an initial value (iv) as a seed and generates a pseudo-random sequence of 0s and 1s. the ciphertext is then obtained by the bit-wise addition of the keystream and the plaintext. to design a secure stream cipher which satisfies the technical requirements of resource-constrained devices, the best trade-off between area and performance for a given security level should be sought. previous stream cipher designs either have too high a propagation delay (e.g. grain family [15]) or use too many flip-flops (e.g. trivium [16]) for a given security level. thus, they optimize only one of the two important parameters, area or performance. in [17], we presented a methodology for designing a class of stream ciphers, called espresso, which takes into account both parameters simultaneously, thus minimizing the hardware footprint and maximizing the throughput of the design. first, by using non-linear feedback shift registers (nlfsrs) implemented in the galois configuration [18,19], feedback functions can be made smaller. this allows us to reduce the propagation delay compared to grain while at the same time decreasing the size compared to trivium. due to the large number of feedback functions in espresso, its maximum degree of parallelization cannot be made as high as in grain and trivium. still, by carefully choosing feedback functions, we are able to guarantee a maximum degree of parallelization of four and a maximum-length nlfsr. second, to enable security analysis of espresso, we transform the original galois nlfsr to an nlfsr whose configuration resembles the fibonacci configuration. the core idea of our method is to assure that all of the most biased linear approximations of the output boolean function take inputs only from those stages of the galois nlfsr which have a corresponding equivalent stage in the transformed nlfsr. as a result, traditional cryptanalysis techniques can be applied to our design as well.
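the binary additive stream cipher described above can be sketched in a few lines of python. the keystream generator below is a toy 32-bit nlfsr in the fibonacci configuration whose taps were picked arbitrarily for this example; it is not espresso, it has no initialization phase and it offers no security. the names `toy_keystream` and `xor_with_keystream` are hypothetical.

```python
def toy_keystream(key: int, iv: int, nbits: int):
    """Yield `nbits` keystream bits from a toy 32-bit NLFSR (NOT secure).

    The 16-bit key and 16-bit IV together seed the register. The feedback
    function has four linear taps and one AND term (the non-linear part);
    the tap positions are arbitrary choices for illustration only.
    """
    state = ((key & 0xFFFF) << 16) | (iv & 0xFFFF)
    if state == 0:
        state = 1                      # avoid the all-zero state
    for _ in range(nbits):
        out = state & 1
        fb = (state ^ (state >> 2) ^ (state >> 6) ^ (state >> 7)
              ^ ((state >> 3) & (state >> 9))) & 1
        state = (state >> 1) | (fb << 31)
        yield out


def xor_with_keystream(data: bytes, key: int, iv: int) -> bytes:
    """Bit-wise addition (XOR) of the keystream and the data.

    The same call encrypts a plaintext and decrypts a ciphertext.
    """
    ks = toy_keystream(key, iv, 8 * len(data))
    out = bytearray()
    for byte in data:
        k = 0
        for _ in range(8):             # pack eight keystream bits into a byte
            k = (k << 1) | next(ks)
        out.append(byte ^ k)
    return bytes(out)


if __name__ == "__main__":
    plaintext = b"attack at dawn"
    ciphertext = xor_with_keystream(plaintext, key=0x1234, iv=0x0001)
    print(xor_with_keystream(ciphertext, key=0x1234, iv=0x0001) == plaintext)
```

because encryption and decryption are the same xor operation, applying `xor_with_keystream` twice with the same key and iv returns the original plaintext; a real design would replace the toy generator with a vetted cipher.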
according to our evaluation, espresso is the fastest among the ciphers below 1500 ge, including grain-128 and trivium. its 1-bit per cycle version has 1497 ge area, 2.22 gbits/sec throughput and 232 ns latency, meeting the requirements of most applications envisioned today. it is resistant to known attacks, including linear approximations, algebraic attacks, time-memory-data trade-off attacks, chosen iv attacks, differential attacks and weak key attacks.
4 hardware trojans
a hardware trojan is a malicious modification of a design that makes it possible to bypass or disable the security of a system [20]. hardware trojans have been known for a while, but previously it was very difficult to inject a trojan into the supply chain. in today’s globalized world, in which multiple players are involved in the supply chain, this is no longer an obstacle. malicious changes can be introduced into a design, for example, by tampering with a cad tool which is used for circuit synthesis [21]. the code of a cad tool is usually huge and it undergoes continuous development. so, several extra lines which modify the original design to inject a trojan may easily go unnoticed in a multi-million-line code base. alternatively, a third-party-made ip block might contain a backdoor that can be used to steal secret keys or extract internal chip data. circuit modifications can also be made at the manufacturing stage, potentially affecting all chips or just some selected ones. today’s chips contain billions of transistors, so it is very difficult to identify which of them are not a part of the original design [22]. functional verification is further complicated by the fact that manufacturers are typically given the freedom to add redundant circuitry to a chip in order to increase manufacturing yield [23]. the presence of hardware trojans can be difficult to prove.
for example, some pcs are claimed to contain malicious circuit modifications that allow a person who knows the modifications to remotely access a pc without the user’s knowledge [24]. however, it is still not confirmed whether these claims are true. some processors are suspected to contain backdoors deliberately implanted in their hardware random number generators (rngs) that make it possible to predict the rng’s output [25]. again, this remains at the level of a conspiracy theory; we do not know whether these stories are true. however, we cannot discount the possibility that such attacks may take place if they are feasible to implement. for example, as demonstrated in [26], it is possible to reduce the security of a hardware rng compliant with the fips 140-2 [27] and nist sp800-90 [28] standards from 128 to 32 bits by injecting stuck-at faults at the outputs of selected transistors. this can be done without disabling the bist logic which checks the rng’s functionality at each power-up, without failing bist tests, and without failing any randomness tests. stuck-at faults can be injected in a very stealthy way by modifying dopant types in the active region of transistors. such dopant-level trojans do not require adding any extra logic to the original design and therefore do not change its layout. as a result, the trojan-injected circuit appears legitimate at all wiring layers. even with advanced imaging methods such as scanning electron microscopy (sem) or focused ion beam (fib), it is extremely difficult to detect changes made to the dopant in a large design implemented with nanoscale technologies. to detect changes in dopant types, in addition to all metal layers, the contact layer has to be examined. high-quality imaging of the contact layer is significantly more costly than imaging of a metal layer [29]. in addition, since only the dopants of a few transistors are modified, the change in the side-channel information is too small to be detected by side-channel analysis.
typically side-channel analysis can only detect sufficiently large trojans that are at most three to four orders of magnitude smaller than the original design [30]. trojans of a smaller size remain undetected.

the attack presented in [26] exploits the fact that the aliasing probability of bist is non-zero due to the compaction of the circuit’s output responses. aliasing probability is the probability that a fault-free circuit is not distinguished from a faulty one. if an n-bit compactor is used, the aliasing probability of bist is $1/2^n$ [31]. in traditional bist, the same set of test patterns is applied to a circuit under test at each test cycle, and therefore the same compacted output response, called a signature, is expected. therefore, an adversary who knows the set of bist test patterns can select suitable values for the trojan that result in the same signature as a fault-free circuit signature.
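the $1/2^n$ figure can be checked by brute force on a toy compactor. the following is a minimal sketch, not the compactor of [26] or [31]: it assumes a serial signature register that keeps the remainder of the response stream modulo a degree-n polynomial, and the width `N`, the polynomial `POLY` and the stream `REFERENCE` are hypothetical values chosen only to keep the enumeration small.

```python
from itertools import product


def signature(bits, poly, n):
    """Compact a bit stream into an n-bit signature: the remainder of the
    stream, read as a polynomial over GF(2), modulo the degree-n poly."""
    sig = 0
    for b in bits:
        sig = (sig << 1) | b
        if sig >> n:          # degree-n term appeared: reduce by xor with poly
            sig ^= poly
    return sig


def count_same_signature(reference, poly, n):
    """Count all streams of len(reference) bits whose signature equals that
    of the reference (fault-free) stream, the reference itself included."""
    target = signature(reference, poly, n)
    return sum(
        1
        for stream in product((0, 1), repeat=len(reference))
        if signature(stream, poly, n) == target
    )


N = 4                      # compactor width (hypothetical, tiny on purpose)
POLY = 0b10011             # x^4 + x + 1, an arbitrary illustrative choice
REFERENCE = (1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0)  # made-up fault-free response

same = count_same_signature(REFERENCE, POLY, N)
total = 2 ** len(REFERENCE)
# fraction of streams that differ from the reference yet pass the check
aliasing = (same - 1) / (total - 1)
print(same, total, aliasing)
```

because this compaction is linear, every signature value is shared by the same number of streams, so the fraction of faulty responses that pass is close to $1/2^n$ regardless of which reference stream is used. with a fixed pattern set an adversary can run this kind of search offline, which is what makes the attack practical for realistic compactor widths.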
since the aliasing probability is $1/2^n$, in order to inject a trojan which does not trigger bist, an adversary has to make $2^{n-1}$ simulation trials on average. the typical size of a bist output response compactor is 32 bits, so the attack is feasible in practice.

in [32,33], we presented two methods for modifying bist to prevent such an attack. in the first method, we make the bist test patterns dependent on a configurable key which is programmed into a chip after the manufacturing stage. in the second method, we use a remote test management system which has sufficient computational resources to execute bist using a different set of test patterns at each test cycle. in both cases, the manufacturer does not have the bist test patterns and thus does not know which circuit modifications produce the same signature as a fault-free circuit. depending on the application requirements, the former approach might be preferable to the latter, or vice versa. for example, the latter countermeasure might be preferable for internet of things applications with constrained resources, e.g. m2m communications, since it removes some of the bist functionality from a device under test rather than adding an extra key to it.

5 conclusion and future work

we described message authentication and encryption algorithms which can assure data integrity and confidentiality while satisfying the technical requirements of resource-constrained devices. we also showed how to design countermeasures which can enhance the resistance of hardware implementing cryptographic algorithms to a type of hardware trojan which exploits the non-zero aliasing probability of bist. the presented message authentication approach seems a good candidate for simpler 5g radio types, such as the ones used for direct communication in sensor networks, and for use cases with constrained resources such as m2m. future work includes evaluating the impact on bandwidth.
in the current wireless standard message formats, two separate fields are typically used for the protection against random and malicious errors. these fields may be located on different layers, e.g. in lte the crc is located at the physical (phy) layer while the message authentication code is located at the packet data convergence protocol (pdcp) layer. a good strategy might be to combine these two fields into one at the phy layer and use a cryptographic crc for the protection against both types of errors. however, the implications for security and coverage caused by such a merge need to be investigated.

the presented stream cipher espresso has 128-bit security. we are currently extending it to 256-bit security, in order to meet the requirements of post-quantum cryptography. we also consider integrating encryption and authentication within espresso.

regarding hardware trojans, there is no “silver bullet” method that can protect against all possible types of trojans or other attacks. in parallel with new countermeasures, more complex attacks are being developed. moreover, we are dealing with a two-ended stick: a method originally designed as a countermeasure can later be turned into an attack, and vice versa. for example, advanced visual inspection methods for trojan detection can be used by ip thieves to reverse-engineer chips. similarly, if side-channel analysis techniques for detecting trojans that affect only a tiny fraction of a design are invented, they are likely to give rise to more effective side-channel attacks. it remains for future work to investigate whether there are attacks which can circumvent the presented countermeasures and what can be done to avoid them.

acknowledgements

the author was supported by the research grant no sm14-0016 from the swedish foundation for strategic research.

references

[1] i. rouf, r. miller, h. mustafa, t. taylor, s. oh, w. xu, m. gruteser, w. trappe, and i.
seskar, “security and privacy vulnerabilities of in-car wireless networks: a tire pressure monitoring system case study,” in proceedings of the 19th usenix conference on security, berkeley, ca, usa, 2010, pp. 21–21. [online]. available: http://dl.acm.org/citation.cfm?id=1929820.1929848
[2] ericsson, “5g security,” 2015, www.ericsson.com/res/docs/whitepapers/ 5gsecurity.pdf.
[3] a. juels, “rfid security and privacy: a research survey,” selected areas in communications, ieee journal on, vol. 24, no. 2, pp. 381–394, feb. 2006.
[4] m. bellare, r. canetti, and h. krawczyk, “keying hash functions for message authentication,” in advances in cryptology crypto 96, ser. lecture notes in computer science, n. koblitz, ed. springer berlin heidelberg, 1996, vol. 1109, pp. 1–15. [online]. available: http://dx.doi.org/10.1007/3-540-68697-5_1
[5] m. bellare, j. kilian, and p. rogaway, “the security of cipher block chaining,” in advances in cryptology crypto 94, ser. lecture notes in computer science, y. desmedt, ed. springer berlin heidelberg, 1994, vol. 839, pp. 341–358. [online]. available: http://dx.doi.org/10.1007/3-540-48658-5_32
[6] h. krawczyk, “lfsr-based hashing and authentication,” in proceedings of the 14th annual international cryptology conference on advances in cryptology, ser. crypto ’94. london, uk, uk: springer-verlag, 1994, pp. 129–139.
[7] e. dubrova, m.
naslund, g. selander, and f. lindqvist, “cryptographically secure crc for lightweight message authentication,” tech. rep. 2015/035, january 2015, cryptology eprint archive. [online]. available: http://eprint.iacr.org/2015/035
[8] e. dubrova, m. naslund, and g. selander, “crc-based message authentication for 5g mobile technology,” in proceedings of 1st ieee international workshop on 5g security, august 2015.
[9] s. gao and d. panario, “tests and constructions of irreducible polynomials over finite fields,” in foundations of computational mathematics, f. cucker and m. shub, eds. springer berlin heidelberg, 1997, pp. 346–361. [online]. available: http://dx.doi.org/10.1007/978-3-642-60539-0_27
[10] e. dubrova, m. naslund, g. selander, and f. lindqvist, “message authentication based on cryptographically secure crc without polynomial irreducibility test,” cryptography and communications, 2017.
[11] d. boswarthick, o. elloumi, and o. hersent, m2m communications: a systems approach. john wiley & sons, 2012.
[12] b. schneier, applied cryptography (2nd ed.): protocols, algorithms, and source code in c. new york, ny, usa: john wiley & sons, inc., 1995.
[13] j. daemen and v. rijmen, “aes proposal: rijndael,” april 2003, national institute of standards and technology.
[14] m. robshaw, “stream ciphers,” tech. rep. tr 701, july 1994. [online]. available: citeseer.ist.psu.edu/robshaw95stream.html
[15] m. hell, t. johansson, a. maximov, and w. meier, “the grain family of stream ciphers,” new stream cipher designs: the estream finalists, lncs 4986, pp. 179–190, 2008.
[16] c. cannière and b. preneel, “trivium,” new stream cipher designs: the estream finalists, lncs 4986, pp. 244–266, 2008.
[17] e. dubrova and m. hell, “espresso: a stream cipher for 5g wireless communication systems,” cryptography and communications, pp. 1–17, 2015.
[18] e.
dubrova, “a transformation from the fibonacci to the galois nlfsrs,” ieee transactions on information theory, vol. 55, no. 11, pp. 5263–5271, nov. 2009.
[19] e. dubrova, “an equivalence-preserving transformation of shift register,” sequences and their applications seta’2014, lncs 8865, pp. 187–199, 2014.
[20] m. tehranipoor and f. koushanfar, “a survey of hardware trojan taxonomy and detection,” ieee design test of computers, vol. 27, no. 1, pp. 10–25, 2010.
[21] e. brunvand, digital vlsi chip design with cadence and synopsys cad tools. pearson, 2009.
[22] e. seligman, t. schubert, and a. k. kumar, formal verification: an essential toolkit for modern vlsi design. morgan kaufmann, 2015.
[23] p. gupta and e. papadopoulou, “yield analysis and optimization,” in the handbook of algorithms for vlsi physical design automation. crc press, 2011.
[24] s. shah, “nsa, gchq ban lenovo’s pss due to security concerns,” july 2013.
[25] d. goodin, “we cannot trust intel’s and via’s chip-based crypto freebsd developers say,” dec. 2013.
[26] g. becker, f. regazzoni, c. paar, and w. p. burleson, “stealthy dopant-level hardware trojans,” proceedings of cryptographic hardware and embedded systems (ches’2013), lncs 8086, pp. 197–214, 2013.
[27] federal information processing standards publication, “security requirements for cryptographic modules: fips pub 140-2,” 2001.
[28] e. barker and j. kelsey, “recommendation for random number generation using deterministic random bit generators: nist 800-90a,” 2012.
[29] t. sugawara, d. suzuki, r. fujii, s. tawa, r. hori, m. shiozaki, and t. fujimo, “reversing stealthy dopant-level circuits,” proceedings of cryptographic hardware and embedded systems (ches’2014), lncs 8731, pp. 112–126, 2014.
[30] d. agrawal, s. baktir, d. karakoyunlu, p. rohatgi, and b. sunar, “trojan detection using ic fingerprinting,” in ieee symposium on security and privacy (sp’07), may 2007, pp. 296–310.
[31] m. damiani, p. olivo, m. favalli, s. ercolani, and b.
ricco, “aliasing in signature analysis testing with multiple input shift registers,” ieee transactions on computer-aided design of integrated circuits and systems, vol. 9, no. 12, pp. 1344–1353, 1990.
[32] e. dubrova, m. näslund, g. carlsson, j. fornehed, and b. smeets, “two countermeasures against hardware trojans exploiting non-zero aliasing probability of bist,” journal of signal processing systems, pp. 1–11, 2016. [online]. available: http://dx.doi.org/10.1007/s11265-016-1127-4
[33] e. dubrova, m. m. näslund, g. carlsson, and b. smeets, “keyed logic bist for trojan detection in soc,” in proceedings of international conference of system-on-chip (soc’2014), 2014.

facta universitatis series: electronics and energetics vol. 33, no 2, june 2020, pp. 303-316 https://doi.org/10.2298/fuee2002303p © 2020 by university of niš, serbia | creative commons license: cc by-nc-nd

enhanced low dose rate sensitivity (eldrs) and reduced low dose rate sensitivity (rldrs) in bipolar devices *

vyacheslav s. pershenkov, alexander s. bakerenkov, alexander s. rodin, vladislav a. felitsyn, alexander i. zhukov, vitaly a. telets, vladimir v. belyakov
national research nuclear university mephi (moscow engineering physics institute), moscow, russian federation

abstract.
a possible physical mechanism of enhanced low dose rate sensitivity (eldrs) and reduced low dose rate sensitivity (rldrs) in bipolar devices is described. a modification of the low dose rate conversion model is presented. the enhanced or reduced sensitivity can be connected with a specific position of the effective fermi level relative to the acceptor and donor radiation-induced interface traps. qualitative and quantitative analyses of the low dose rate effects are presented. the effect of the oxide trapped charge on the value of the oxide electric field and the yield of the oxide charge were taken into account. this leads to a dependence of the accumulation of radiation-induced oxide charge and interface traps on the dose rate. in its enhanced version, the eldrs and rldrs conversion model describes the low dose rate effect as a “true” dose rate effect.

key words: total dose, low dose rate, bipolar devices, eldrs, conversion model.

1. introduction

the main ideas of this paper were presented in [1]. several types of bipolar devices demonstrate enhanced degradation during low dose rate (ldr) irradiation in comparison with irradiation at high dose rate (hdr) for the same total dose level. these devices are known as eldrs-susceptible (enhanced low dose rate sensitivity) [2]. since the physical mechanism of the eldrs effect [3] is connected with the suppression of the accumulation of radiation defects at high dose rate, we can consider the effect as reduced high dose rate sensitivity (rhdrs) [4]. nevertheless, the term eldrs is used for these devices in the literature.

received october 22, 2019; received in revised form december 26, 2019
corresponding author: vyacheslav pershenkov, national research nuclear university mephi (moscow engineering physics institute), moscow, russian federation, e-mail: vspersenkov@mephi.ru
* an earlier version of this paper was presented at the 31st international conference on microelectronics (miel 2019), september 16-18, 2019, in niš, serbia [1].

recent research on bipolar technology shows [5] that a reduction of degradation with decreasing dose rate is an inherent property of devices which are not susceptible to the eldrs effect. we can consider these devices as eldrs-free [6]. the decrease of the radiation degradation at low dose rate irradiation is, as a rule, not considered during hardness assurance tests of devices for space applications. this can lead to significant underestimation of the operational lifetime of these devices in a real space environment. since devices which demonstrate reduced degradation at low dose rate irradiation are usually considered eldrs-free, it is useful to single them out into a separate class using the term rldrs (reduced low dose rate sensitivity).

the purpose of this work is to describe a possible physical mechanism of the eldrs and rldrs effects using the low dose rate conversion model [7, 8, 9], which makes it possible to numerically estimate the degradation of the electrical parameters of bipolar devices during a specified space mission. in this work the low dose rate conversion model is briefly described, and the qualitative and quantitative models of the eldrs and rldrs effects are considered.

2. conversion model

the model is based on the assumption that the eldrs effect in bipolar devices is directly connected with an increase of the surface recombination current due to interface trap buildup at the sio2/si interface near the emitter junction. we suppose that the interface trap buildup can be described by the h-e model [10, 11]. according to this model, the interface trap buildup is connected with the conversion of positive oxide trapped charge due to its interaction with substrate electrons, and not with the action of hydrogen ions only. nevertheless, the h-e model is not in conflict with the most popular hydrogen model [12]. the h-e model takes into account the contribution of substrate electrons to the interface trap buildup process [13].
to check the hypothesis that interface-trap buildup is connected with the interaction between the positive charge in the oxide and electrons tunneling from the substrate, a special experiment was performed [10, 11]. n-channel mos transistors with a 30-nm gate oxide were used in the experiment. these devices were irradiated by a cu-target x-ray source at a dose rate of 1 krad (sio2)/s to a total dose of 3 Mrad (sio2). the gate voltage was +5 v during irradiation. for one hour after the end of irradiation the devices were annealed in a molecular hydrogen atmosphere for 24 hours. the interface-trap buildup was registered during this annealing time. four different tests were performed. the tests were chosen so that the field in the oxide might have a positive or negative value, to drive the positively charged hydrogenous species toward the si/sio2 interface or away from it. the electron concentration in the substrate near the surface was changed by varying the gate voltage, the substrate bias or the forward bias of the source-substrate and drain-substrate junctions. this allows us to provide an enhancement or depletion mode for electrons near the substrate surface. thus, different combinations of hydrogenous species and electrons present near the interface were realized in tests 1 through 4.

the experimental dependencies of the threshold voltage shift δvit (caused by the interface-trap buildup) versus the annealing time for all four tests are presented in fig. 1. a maximum change of δvit is observed in test 1, when both electrons and hydrogenous species are present near the surface. in the other cases, when there are no hydrogen species (test 3), no electrons (test 2) or neither of them (test 4) near the interface, the shift δvit is essentially reduced. the presented experimental data confirm the hypothesis that the presence of hydrogen has some effect on interface-trap buildup.
but the interaction of hydrogenous species and electrons from the substrate is the most important component of this process. in the hydrogen-electron concept, both components (hydrogenous species and electrons) are factors that can restrict the rate of interface-trap buildup.

fig. 1 the interface-trap component of the threshold voltage shift δvit versus the annealing time in the hydrogen atmosphere (after [11])

the post-irradiation relaxation of a positive oxide charge (“annealing”) is traditionally considered as the superposition of two independent processes: tunneling of electrons from the substrate and thermal excitation of electrons from the oxide valence band. the effect of the reversibility of annealing [14] cannot be explained by these models. it is supposed in [7, 8, 9] that interface trap buildup is connected with a conversion of the rechargeable part of the trapped positive charge located opposite the silicon forbidden gap [14]. direct tunneling of substrate electrons to positive centers located opposite the silicon forbidden gap is impossible because the energy of the tunneling electron must be conserved (according to basic principles of quantum mechanics). but tunneling to thermally activated positive centers is still possible: the energy level of a positive center can reach the silicon conduction band due to thermally excited vibration of the lattice. an interaction of the thermally excited rechargeable positive charge qot with tunneling substrate electrons leads to interface trap buildup (fig. 2a). the positive charge qot can also be neutralized by hole emission to the silicon valence band (fig. 2b). the interaction of the oxide defects and electrons from the substrate includes the thermal excitation of the defect and the tunneling of the electron. simplistically speaking, the thermal fluctuation of atoms leads to the oscillation of energy levels.
when the energy level of the defect in the oxide coincides with an allowed electron level in si, a tunneling transition of the electron from the substrate to the defect becomes possible. in that case, the probability $w$ of the tunneling transition of an electron with energy $E_e$ to the defect energy level $E_t$ is a function of temperature $T$:

$$w \sim \exp\left(-\frac{x}{\lambda}\right)\exp\left[-\frac{1}{kT}\,\frac{(E_t - E_e)^2}{E_0}\right], \qquad (1)$$

where $x$ is the distance from the defect to the si/sio2 interface; $k$ is the boltzmann constant; $\lambda$ and $E_0$ are the tunneling parameters. from relationship (1), the recharge rate depends on the spatial and energy location of the defect. a defect located close to the si/sio2 interface but with a deep energy level within the silicon forbidden gap (with a large activation energy) is recharged over a long time interval. however, after electron capture and defect reconstruction, its energy level shifts and the defect acts as an interface trap. for example, the typical transition time of the electron to a defect located at a depth of 1 nm from the silicon interface and with an activation energy of 0.65 ev is greater than $10^6$ s. however, a defect located at the same distance but with an activation energy of 0.1 ev has a recharge time of less than 1 ms and acts as an interface trap.

fig. 2 conversion of rechargeable oxide charge qot to interface traps nit: capture of an electron e (a); emission of a hole h (b). ec and ev are the energy levels of the si conduction and valence bands

the conversion mechanism of the trapped positive charge, or e′γ center, to an interface state can be the following. according to the model [15], the positively charged si atom in the e′γ center is sp2-hybridized (planar configuration), while the neutral si$^0$ atom is sp3-hybridized (tetrahedral configuration). the electron energy levels in this defect strongly depend on the distance between the si$^+$ and si$^0$ atoms [16, 17].
when this distance is large, these levels are located in the oxide close to the si midgap [16]. when these two atoms are bonded and the distance between them is small, the energy levels of the bonding electrons are close to the edges of the sio2 bandgap. the first electron capture by the e′γ center changes the defect configuration [15, 16], which results in a shift of the electron energy levels from the si midgap toward the si valence or conduction band, the distance between the si atoms being of intermediate value. the electrons in this transformed defect are expected to be in the intermediate sp2-sp3 configuration and could form a diffuse orbital (an orbital which extends toward the neighboring si further than an ordinary sp3 one) [17]. this defect configuration may be assumed stable, and the electron energy levels are proposed to remain unchanged when one of the electrons is removed.

the probability of thermal excitation of the oxide trap energy level up to the conduction band depends on the depth of its location opposite the silicon forbidden gap. the closer the trap energy level is to the middle of the forbidden gap, the lower the probability of the conversion process. on the time scale, the shallow traps (near the conduction band level) are annealed first; after that, the annealing front spreads to deeper energy levels. for simplicity it is supposed in [7, 8, 9] that there are two types of oxide traps: shallow traps with a short conversion time, responsible for the degradation at high dose rates, and deep traps that determine the excess base current at longer irradiation times or low dose rates. the duration of hdr irradiation is relatively short, not enough to convert all the radiation-induced positive charge to interface traps. therefore, during long-time ldr irradiation we can observe increased degradation.

3. physical model of eldrs and rldrs

3.1.
the qualitative physical model

the surface recombination current, which is responsible for the radiation degradation of the base current, depends on the concentration of interface traps and on the surface potential at the screen oxide/base region interface along the emitter junction perimeter. its value is an integral over the total surface of the passive base region under the oxide (fig. 3):

$$I_s = q \int_S U_s \, dS, \qquad (2)$$

where $I_s$ is the surface recombination current; $q$ is the electronic charge; $U_s$ is the surface recombination rate; $S$ is the surface area of the passive base region under the oxide. the surface recombination rate changes as the injected carriers spread over the base region surface. to estimate the maximum value of the excess base current (worst case), it is possible to use the value of the recombination rate at the edge of the emitter junction (y = 0), where the concentration of injected carriers is maximal. in that case relationship (2) can be rewritten as

$$I_s \approx q\,U_s(0)\,S, \qquad (3)$$

where $U_s(0)$ is the surface recombination rate at the emitter junction edge (y = 0) (fig. 3).

fig. 3 the schematic structure of the emitter-base junction of an n-p-n bipolar transistor: top view (a); cross section (b). the shaded region is the surface area s of the passive base region

using the well-known assumption, we consider that in the top half of the silicon forbidden gap the interface traps act as acceptors, while in the bottom half they act as donors (fig. 4). the empty acceptor-like traps are neutral, and the filled acceptor-like traps are negatively charged. the empty donor-like traps are positively charged, and the filled donor-like traps are neutral. the capture cross section of a neutral trap is approximately $10^{-15}$ cm$^2$, on the order of atomic dimensions. the capture cross section of a charged trap is one to two orders of magnitude greater ($10^{-14}$ to $10^{-13}$ cm$^2$) due to coulombic interaction with the minority carriers injected into the base. the charge state of a trap depends on its position relative to the fermi level at the surface.

fig.
4 acceptor-like (a) and donor-like (d) interface traps

in this work we suppose that the interface traps in the top and bottom halves of the silicon forbidden gap occupy effective mono-levels eta and etd, respectively. fig. 5 shows the energy location of these traps in the forbidden gap for the p-base region of an npn transistor (the real possible distribution of these traps in the forbidden gap is shown by dotted lines). the solid lines correspond to the effective energy levels of the acceptor-like (eta) and donor-like (etd) surface traps. the acceptor-like traps in the top half of the forbidden gap are always empty for any location of the fermi level in the p-base region. the capture cross section of the neutral acceptor-like traps, $\sigma^{0}_{ta}$, is approximately $10^{-15}$ cm$^2$.

fig. 5 the energy location of the acceptor-like and donor-like traps relative to the fermi level efp in the forbidden gap for the p-base region of an npn transistor: the fermi level efp is located above the effective mono-level of the donor-like traps etd (a); efp is located below the level etd (b). the dotted lines show the real possible distribution of interface traps in the forbidden gap

the capture cross section of the donor-like traps strongly depends on their position relative to the fermi level efp. in fig. 5a the fermi level is located above the effective mono-level of the donor-like traps etd, whereas in fig. 5b it lies below etd. the charge state of the donor-like traps depends on their position relative to the fermi level. the filled donor-like traps below the fermi level are neutral, and their capture cross section is that of neutral traps, $\sigma^{0}_{td}$ (approximately $10^{-15}$ cm$^2$) (fig. 5a). the empty donor-like traps above the fermi level are positively charged (fig. 5b), and their capture cross section increases substantially: for positively charged traps it may reach $10^{-14}$ to $10^{-13}$ cm$^2$.
we suppose that the main difference between eldrs and rldrs devices is connected with the position of the fermi level in the base region relative to the energy levels of the radiation-induced surface traps in the silicon forbidden gap.

the case of fig. 5a is characteristic of rldrs devices. in that case both the acceptor-like and the donor-like traps are neutral. the electrons injected from the emitter recombine by capture on traps with a relatively small capture cross section ($10^{-15}$ cm$^2$). for this reason, at low dose rate irradiation the excess base current is relatively small, even though all the trapped oxide charge is converted to interface traps during the long irradiation. increasing the dose rate (reducing the irradiation time) increases the non-converted trapped positive charge and hence the excess base current (fig. 6a). qualitatively, a greater positive charge attracts the injected electrons to the surface, which increases the recombination rate. the saturation of the excess base current degradation at high dose rate (fig. 6a) is connected with the contribution of the conversion of the shallow traps [7, 8, 9], when the conversion of deep traps becomes insignificant.

fig. 6 the excess base current versus dose rate for rldrs (a) and eldrs (b) devices

the case of fig. 5b is characteristic of eldrs devices. in that case the acceptor-like traps are neutral while the donor-like traps are positively charged. the electrons injected from the emitter recombine by capture on the positively charged traps with a relatively large capture cross section ($10^{-14}$ to $10^{-13}$ cm$^2$). therefore the excess base current is large. according to the conversion model [7, 8, 9], increasing the dose rate (reducing the irradiation time) decreases the interface trap concentration, and the excess base current is reduced (fig. 6b).
the increase of the trapped positive charge at hdr, which also occurs in rldrs devices, has no significant effect here, since the recombination is essentially due to the capture of electrons on positively charged interface traps. besides, increasing the dose rate reduces the trapped positive charge yield due to the ricn (radiation-induced charge neutralization) effect [18].

the position of the fermi level in the p-base depends on specific features of the manufacturing process: the doping level of the p-base region and the value of the initial positive trapped charge in the screen oxide above the base. the case of fig. 5a is characterized by a low p-base doping level or a large initial positive trapped charge in the screen oxide. this positive charge shifts the energy level of the donor-like traps etd below the fermi level efp (fig. 7a).

fig. 7 relation of the fermi level efp location to the position of the donor-like traps etd at the surface: (a) for the case of fig. 5a (characteristic of rldrs devices); (b) for the case of fig. 5b (characteristic of eldrs devices)

the case shown in fig. 5b is realized when the p-base region is heavily doped or the screen oxide has a small technological positive charge. it corresponds to the position of the donor-like traps etd above the fermi level efp (fig. 7b). the initial features corresponding to fig. 5a or fig. 5b can be used to classify eldrs and rldrs devices. a similar mechanism can be described for the n-base region of pnp bipolar structures using the acceptor-like traps in the top half of the silicon forbidden gap.

3.2. the quantitative model

we suppose that the accumulation and annealing of the positive oxide trapped charge qot are described by equation (4), where qot is the oxide trapped charge; kot is a coefficient characterizing the accumulation of trapped charge; p is the dose rate; τd is the conversion time of deep traps; kricn is a coefficient connected with charge neutralization by the ricn effect [18].
the first term on the right-hand side of (4) represents the trapped charge accumulation in the thick oxide by dispersive transport of radiation-induced holes. the second term is responsible for the neutralization of deep trap charge by electrons from the substrate. the third term characterizes the annealing of the positive charge by radiation-induced electrons (ricn effect).

the interface trap buildup nit is expressed by equation (5). the first and second terms on the right-hand side of (5) represent the interface trap buildup through the conversion of trapped charge by the substrate electrons and by the radiation-induced electrons, respectively.

the total concentration of interface traps equals the sum of the donor-like and acceptor-like traps [19]:

$$N_{it} = N_{ta} + N_{td}, \tag{6}$$

where $N_{ta}$ is the concentration of the acceptor-like traps in the top half of the forbidden gap; $N_{td}$ is the concentration of the donor-like traps in the bottom half of the forbidden gap.

the excess base current equals the increase of the surface recombination current, relationship (7).

the recombination through the acceptor-like and the donor-like traps is described by shockley-read-hall theory [20-22] and includes four processes: electron capture (rate ra); electron emission (rate rb); hole capture (rate rc); hole emission (rate rd). following [22], the rates are given by relationships (8)-(11), where $v_{th}$ is the thermal velocity; $\sigma^{0}_{ta}$ is the capture cross section of the empty (neutral) acceptor-like trap; nta is the effective concentration of the acceptor-like traps; $\sigma^{-}_{ta}$ is the capture cross section of the filled (negatively charged) acceptor-like trap; ns and ps are the electron and hole surface concentrations. the auxiliary quantities entering (8)-(11) are defined by (12), where ni is the intrinsic electron concentration; eta is the effective energy level of the acceptor-like traps; efeff is the effective fermi level. similar equations can be obtained for the donor-like traps.
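In steady state, the four capture and emission processes listed above combine into the textbook Shockley-Read-Hall net recombination rate for a single trap level. The sketch below illustrates that textbook expression only; it is not a reproduction of relationships (8)-(13). The function name `srh_surface_rate`, the constants and all numerical values are assumptions, chosen to show how a larger electron capture cross section (charged trap) raises the recombination rate in a p-type base.

```python
import math

K_B = 8.617e-5   # Boltzmann constant, eV/K
V_TH = 1.0e7     # thermal velocity, cm/s (assumed order of magnitude)
N_I = 1.0e10     # intrinsic carrier concentration, cm^-3 (rounded value for Si near 300 K)


def srh_surface_rate(n_s, p_s, n_t, sigma_n, sigma_p, e_t_minus_e_i, temp=300.0):
    """Net steady-state SRH recombination rate through one trap level, cm^-2 s^-1.

    n_s, p_s       surface electron and hole concentrations, cm^-3
    n_t            areal density of traps, cm^-2
    sigma_n/p      electron and hole capture cross sections, cm^2
    e_t_minus_e_i  trap level measured from the intrinsic level, eV
    """
    kt = K_B * temp
    n1 = N_I * math.exp(e_t_minus_e_i / kt)    # electron density if E_F sat at the trap level
    p1 = N_I * math.exp(-e_t_minus_e_i / kt)   # corresponding hole density
    numerator = sigma_n * sigma_p * V_TH * n_t * (n_s * p_s - N_I ** 2)
    denominator = sigma_n * (n_s + n1) + sigma_p * (p_s + p1)
    return numerator / denominator


# Hypothetical forward-biased p-base surface: holes are majority carriers.
N_S, P_S, N_T = 1.0e13, 1.0e17, 1.0e11
rate_neutral = srh_surface_rate(N_S, P_S, N_T, 1e-15, 1e-15, -0.2)
rate_charged = srh_surface_rate(N_S, P_S, N_T, 1e-13, 1e-15, -0.2)
print(f"charged / neutral recombination rate: {rate_charged / rate_neutral:.1f}")
```

With these assumed values the rate is limited by minority-carrier (electron) capture, so the ratio follows the ratio of electron capture cross sections closely, which is the qualitative point made for eldrs devices.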
the final relationship for the surface recombination rate us is given by (13), where $\sigma^{+}_{td}$ is the capture cross section of the empty (positively charged) donor-like trap and $\sigma^{0}_{td}$ is the capture cross section of the filled (neutral) donor-like trap. it can be shown that the effective fermi level is given by (14).

the relationships for us (13) and efeff (14) include the electron and hole concentrations ns and ps, which depend on the forward bias of the emitter-base junction ueb and on the value of the surface potential φs. for an n-p-n transistor one can write [23]:

$$n_{s} = \frac{n_{i}^{2}}{N_{a}} \exp\left(\frac{U_{eb} + \varphi_{s}}{\varphi_{t}}\right), \tag{15}$$

$$p_{s} = N_{a} \exp\left(-\frac{\varphi_{s}}{\varphi_{t}}\right), \tag{16}$$

where $N_{a}$ is the acceptor concentration at the base region surface; $U_{eb}$ is the emitter-base bias; $\varphi_{t} = kT/q$ is the thermal potential; $\varphi_{s}$ is the surface potential.

the surface potential φs is determined from the charge balance at the si/sio2 interface [23]. taking into account the charge of the interface traps qit [19], this balance is written as relationship (17), where qot is the oxide trapped charge; qit is the charge on the interface traps; εsi and ε0 are the permittivities of silicon and free space.

the charge on the interface traps depends on the position of the effective fermi level and can be zero, positive or negative. for example, for the p-base of an n-p-n transistor, qit = 0 if the effective level of the donor-like traps lies below the effective fermi level, since the donor traps are then filled by electrons. if the effective fermi level lies below the effective energy level of the donor-like traps, that is, the donor traps are empty, the charge qit > 0. (note that empty acceptor-like traps have zero charge and a relatively small capture cross section.) the values efeff and φs are interconnected: the surface potential φs enters equation (14) for the effective fermi level efeff, and the charge qit, which depends on efeff, enters equation (17).
therefore the values efeff and φs can be estimated by the combined solution of (14) and (17), which can only be done numerically. the relationships (13)-(16) allow the exact value of the excess base current to be calculated from (3) and (7). for simplicity, it is possible to use the approach described in [7, 8, 9]. in this case the degradation of the base current as a function of the dose rate (for irradiation times much longer than 1 s) is given by relationship (18), where ks is the excess base current per unit dose at a high dose rate; kd is the excess base current per unit dose at a low dose rate; p is the dose rate; τd is the conversion time of the deep traps; and d is the total dose.

3.3. true dose rate effect

the relationship (18) follows from the combined solution of (4) and (5). the parameters ks and kd in (18) depend on the coefficient in (4) characterizing the accumulation of the oxide trapped charge for shallow and deep traps. for eldrs and rldrs devices the main factor is the accumulation of charge in the deep traps, which determines the parameter kd. the accumulation of the oxide trapped charge is a strong function of the electric field in the oxide. the electric field separates the radiation-induced electron-hole pairs and hinders their initial recombination. at low electric field the electron-hole pairs do not separate, the recombination is strong, and the yield of the oxide trapped charge is small. the thick screen oxide above the passive transistor base region has no metallization, so the oxide electric field is small and depends on the oxide trapped charge. it is usually assumed that the initial built-in electric field in the thick bipolar oxide is positive and equals several times $10^{5}$ v/cm [24]. an accumulation of positive trapped charge in the oxide near the si/sio2 interface reduces the oxide electric field.
using gauss's theorem, the reduction of the oxide electric field can be estimated as qot/(εsio2 ε0) (εsio2 and ε0 are the dielectric permittivities of the oxide and of vacuum). because the initial built-in oxide electric field e0 is positive, an accumulation of positive trapped charge in the oxide near the si/sio2 interface reduces the oxide electric field:

$$E_{ox} = E_{0} - \alpha \frac{Q_{ot}}{\varepsilon_{SiO_2}\, \varepsilon_{0}}, \tag{19}$$

where $E_{ox}$ is the oxide electric field; $E_{0}$ is the initial built-in oxide electric field; $\alpha$ is a fitting parameter [24] characterizing the fraction of the electric field of the charge that reduces the initial built-in oxide electric field. at small electric field the effective trapped charge yield is a linear function of the electric field [25]. therefore it can be written:

$$K_{ot} = \beta E_{ox}, \tag{20}$$

where $\beta$ is a constant characterizing the accumulation of the trapped charge.

the charge yield for low and high dose rates is quite different. at ldr the positive trapped charge is relatively small because it is converted during the long irradiation; as a result, the reduction of the oxide electric field is small and the yield of the oxide trapped charge is relatively large. since the duration of an hdr irradiation is relatively short and little time is available for the conversion, the non-converted positive trapped charge increases. this decreases the oxide electric field and reduces the total oxide charge yield. the reduction of the oxide electric field at ldr is therefore smaller than at hdr. this increases the positive charge yield at ldr (a higher electric field means less initial recombination) and increases the number of interface traps, because a greater positive charge is converted. the “true” effect is connected with the dependence of the coefficient in (4) on the dose rate, caused by the different oxide charge yield at ldr and hdr irradiation.
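The argument above can be checked numerically with a deliberately simplified balance. The sketch below is an illustration under stated assumptions and is not the paper's relationships (4) and (5): the RICN term is neglected, the trapped charge is expressed as an areal density, the yield is taken as linear in the oxide field, and the field is reduced in proportion to the trapped charge as estimated from Gauss's theorem. The function name `irradiate` and every numerical value (`beta`, `e0`, `tau_d`, doses and dose rates) are hypothetical.

```python
import math

Q_E = 1.602e-19     # elementary charge, C
EPS0 = 8.854e-14    # vacuum permittivity, F/cm
EPS_SIO2 = 3.9      # relative permittivity of SiO2


def irradiate(dose, dose_rate, tau_d, e0=3.0e5, alpha=1.0, beta=33.0):
    """Return (n_ot, n_it), areal densities in cm^-2, at the end of irradiation.

    Assumed model (RICN neglected, illustrative units):
        dn_ot/dt = beta * P * (e0 - a * n_ot) - n_ot / tau_d
        dn_it/dt = n_ot / tau_d
    with a = alpha * q / (eps_sio2 * eps0), the field reduction per unit
    trapped-charge density. The system is linear, so it is solved in closed form.
    """
    a = alpha * Q_E / (EPS_SIO2 * EPS0)          # V*cm
    t_end = dose / dose_rate                     # irradiation time, s
    lam = beta * dose_rate * a + 1.0 / tau_d     # total loss rate of n_ot, 1/s
    n_inf = beta * dose_rate * e0 / lam          # saturation level of n_ot
    filled = 1.0 - math.exp(-lam * t_end)
    n_ot = n_inf * filled
    n_it = (n_inf / tau_d) * (t_end - filled / lam)   # time integral of n_ot / tau_d
    return n_ot, n_it


DOSE = 1.0e5    # rad, hypothetical total dose
TAU_D = 1.0e5   # s, hypothetical deep-trap conversion time

ot_ldr, it_ldr = irradiate(DOSE, 0.01, TAU_D)    # low dose rate, rad/s
ot_hdr, it_hdr = irradiate(DOSE, 100.0, TAU_D)   # high dose rate, rad/s

# A complete post-irradiation anneal converts the remaining oxide charge.
it_hdr_annealed = it_hdr + ot_hdr
print(f"LDR interface traps at end of irradiation: {it_ldr:.3e} cm^-2")
print(f"HDR interface traps after full anneal:     {it_hdr_annealed:.3e} cm^-2")
```

In this toy model the low dose rate run ends with more interface traps than the high dose rate run followed by a complete anneal, because the charge held in the oxide during the short irradiation lowers the field and hence the yield.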
consequently, the coefficient kd in (18) is a function of the dose rate. for this reason eldrs is a “true” dose rate effect: the accumulation of oxide charge and interface traps depends on the dose rate. note that the eldrs and rldrs conversion model in the form (18) has a time-dependent nature for a given dose rate. the oxide charge yield is smaller at the high dose rate. therefore hdr irradiation leads to the accumulation of a smaller oxide trapped charge than ldr irradiation. high temperature annealing after hdr irradiation produces fewer interface traps because a smaller positive charge is converted. this means that the degradation measured at the end of a low dose rate irradiation is greater than the degradation after irradiation at high dose rate followed by a high temperature anneal (“true” dose rate effect).

the constant kd is proportional to the coefficient kot characterizing the accumulation of trapped charge. according to the extraction technique, the value of kd is estimated from elevated temperature irradiation data. as the main goal of the conversion model is to predict the transistor parameter degradation at the extremely low dose rates of the space environment, the elevated temperature is set to 100 °c to 120 °c, at which all radiation-induced deep oxide charges are converted to interface traps. this means that the conversion model correctly predicts the degradation at low dose rates. the degradation at high dose rate is easily estimated in a laboratory test. therefore, the error in describing the radiation degradation over a wide range of dose rates using the low dose rate conversion model [7, 8, 9] occurs only in the middle interval of dose rates, which is of no practical interest. a correct description of the radiation degradation at any dose rate requires the extraction of the constant kd for each dose rate.
this can be done using elevated temperature irradiation in the range 40 °c to 90 °c. the constants kd estimated from these experimental data correspond to incomplete conversion of the oxide trapped charge and, to first order, describe the dependence of kd on the dose rate p. using the value of kd in (18) as a function of the dose rate p, the low dose rate conversion model describes eldrs and rldrs as a “true” dose rate effect.

3.4. fitting parameter extraction

the conversion time of the deep traps τd can be described by the arrhenius law:

$$\tau_{d} = \tau_{d0} \exp\left(\frac{E_{a}}{kT}\right), \tag{21}$$

where $T$ is the absolute temperature; $E_{a}$ is the activation energy of the thermal excitation of deep oxide traps; $k$ is the boltzmann constant; $\tau_{d0}$ is a pre-exponential coefficient. the model based on (18) and (21) has four effective fitting parameters: ks, kd, ea and τd0. the experimental extraction of the effective parameters can be performed by the following steps [26]:

1) the constant ks is estimated as the ratio of the base current degradation to the specified total dose at high dose rate irradiation.
2) the pre-exponential constant τd0 and the activation energy ea in (21) are derived from experimental data for two different temperatures of elevated temperature post-irradiation annealing.
3) the constant kd is estimated from elevated temperature irradiation data (100 °c to 120 °c).

it is important to note that the extraction technique is based on high dose rate irradiation and can be carried out in a relatively short test. the technique is the same for eldrs and rldrs devices and can be used as a universal approach. the extraction of the four effective fitting parameters makes it possible to describe the behavior of the radiation-induced excess base current for arbitrary dose rate, total dose and temperature.
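Step 2 of the procedure above amounts to fitting an Arrhenius law through two points. The following minimal sketch assumes the usual Arrhenius form, in which the conversion time equals the pre-exponential coefficient times exp(Ea / kT). The function names `tau_d` and `fit_arrhenius` and all numerical values are hypothetical and only demonstrate the arithmetic; they are not measured data.

```python
import math

K_B = 8.617e-5   # Boltzmann constant, eV/K


def tau_d(temp_k, tau_d0, e_a):
    """Deep-trap conversion time (s) at absolute temperature temp_k (K)."""
    return tau_d0 * math.exp(e_a / (K_B * temp_k))


def fit_arrhenius(temp1, tau1, temp2, tau2):
    """Recover (tau_d0, e_a) from conversion times at two anneal temperatures."""
    # ln(tau1 / tau2) = (e_a / k) * (1 / temp1 - 1 / temp2)
    e_a = K_B * math.log(tau1 / tau2) / (1.0 / temp1 - 1.0 / temp2)
    tau_d0 = tau1 * math.exp(-e_a / (K_B * temp1))
    return tau_d0, e_a


# Synthetic "measurements" at 100 C and 120 C from assumed true parameters.
TRUE_TAU_D0, TRUE_E_A = 1.0e-6, 0.8          # s, eV (hypothetical)
T1, T2 = 373.15, 393.15                      # K
fit_tau_d0, fit_e_a = fit_arrhenius(T1, tau_d(T1, TRUE_TAU_D0, TRUE_E_A),
                                    T2, tau_d(T2, TRUE_TAU_D0, TRUE_E_A))

# Extrapolate the conversion time to room temperature with the fitted values.
tau_room = tau_d(300.0, fit_tau_d0, fit_e_a)
print(f"Ea = {fit_e_a:.3f} eV, tau_d0 = {fit_tau_d0:.3e} s, tau_d(300 K) = {tau_room:.3e} s")
```

With real data the two conversion times would come from the post-irradiation anneal measurements, and more than two temperatures would allow a least-squares fit instead of an exact two-point solution.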
the successful use of this technique was demonstrated in [7, 8, 9] for several types of eldrs devices. additional experimental work is needed to validate the physical mechanism proposed here for rldrs devices; this is left for future work.

4. conclusion

the possible physical mechanism of the enhanced low dose rate sensitivity (eldrs) and reduced low dose rate sensitivity (rldrs) in bipolar devices can be connected with the specific position of the fermi level in the base region relative to the radiation-induced interface traps in the forbidden gap. acceptor- and donor-like interface traps may be neutral or charged according to their position relative to the fermi level. the capture cross section of a charged trap is one to two orders of magnitude greater than that of a neutral trap due to coulombic interaction with the minority carriers injected into the base. for rldrs devices the interface traps are neutral, while for eldrs devices they are charged. as a result, the effect of low dose rate irradiation is quite different for these devices. qualitative and quantitative models of eldrs and rldrs are presented. the eldrs and rldrs conversion model describes the low dose rate effect as a “true” dose rate effect. the quantitative model involves a fitting parameter extraction technique that allows numerical estimation of the radiation degradation for arbitrary dose, dose rate and temperature.

references

[1] v. s. pershenkov, a. s. bakerenkov, a. s. rodin, v. a. felitsyn, v. a. telets, v. v. belyakov, “reduced low dose rate sensitivity (rldrs) in bipolar devices”, in proceedings of the 2019 31st international conference on microelectronics (miel 2019), niš, serbia, september 16th-18th, 2019, pp. 185–188.
[2] r. l. pease, r. d. schrimpf, d. m. fleetwood, “eldrs in bipolar linear circuits: a review”, ieee transactions on nuclear science, vol. 56, no. 4, pp. 1894–1908, 2009.
[3] d. m. fleetwood, s. l. kosier, r. n. nowlin, r. d. schrimpf, r. a. reber, jr., m. delaus, p. s. winokur, a.
wei, w. e. combs and r. l. pease, “physical mechanisms contributing to enhanced bipolar gain degradation at low dose rates”, ieee trans. nucl. sci., vol. ns-41, no. 6, pp. 1871–1883, 1994.
[4] h. p. hjalmarson, r. l. pease, s. c. witczak, m. r. shaneyfelt, j. r. schwank, a. h. edwards, c. e. hembree, t. r. mattson, “mechanisms for radiation dose-rate sensitivity of bipolar transistors”, ieee trans. nucl. sci., vol. ns-50, no. 6, pp. 1901–1909, 2003.
[5] k. kruckmeyer, l. mcgee, b. brown, d. hughart, “low dose rate test results of national semiconductor's eldrs-free bipolar amplifier lm124 and comparators lm139 and lm193”, in proceedings of the ieee radiation effect data workshop record, pp. 110–117, 2008.
[6] j. boch, a. michez, m. rousselet, s. dhombres, a. d. touboul, j.-r. vaille, l. dusseau, e. lorfevre, n. charty, n. sukhaseum, f. saigne, “dose rate switching technique on eldrs-free bipolar devices”, ieee trans. nucl. sci., vol. 63, no. 4, pp. 2065–2071, august 2016.
[7] v. s. pershenkov, d. v. savchenkov, a. s. bakerenkov, v. n. ulimov, a. y. nikiforov, a. i. chumakov, a. a. romanenko, “the conversion model of low dose rate effect in bipolar transistors”, in proceedings of the radecs conference, 2009, pp. 286–393.
[8] v. s. pershenkov, d. v. savchenkov, a. s. bakerenkov, v. n. ulimov, “conversion model of enhanced low dose rate sensitivity in bipolar ics”, russian microelectronics, vol. 39, no. 2, pp. 91–99, 2010.
[9] v. s. pershenkov, “conversion model of the radiation-induced interface-trap buildup and its hardness assurance applications”, facta universitatis, series: electronics and energetics, vol. 28, no. 4, pp. 557–570, march 2015.
[10] a. v. sogoyan, s. v. cherepko, v. s. pershenkov, v. i. rogov, v. n. ulimov, v. v. emelianov, “thermal- and radiation-induced interface traps in mos devices”, in proceedings of the radecs conference, 1997, pp. 69–72.
[11] a. v. sogoyan, s. v. cherepko, v. s.
pershenkov, “the hydrogenic-electron model of accumulation of surface states on the oxide-semiconductor interface under the effects of ionizing radiation”, russian microelectronics, vol. 43, no. 2, pp. 162–164, 2014.
[12] f. b. mclean, “a framework for understanding radiation-induced interface states in mos sio2 structures”, ieee trans. nucl. sci., vol. ns-27, no. 6, pp. 1651–1657, dec. 1980.
[13] s. k. lai, “interface trap generation in silicon dioxide when electrons are captured by trapped holes”, journal of applied physics, vol. 54, pp. 2540–2546, 1983.
[14] v. v. emelianov, a. v. sogoyan, o. v. meshurov, v. n. ulimov, v. s. pershenkov, “modeling the field and thermal dependence of radiation-induced charge annealing in mos devices”, ieee trans. nucl. sci., vol. 43, no. 6, pp. 2572–2578, dec. 1996.
[15] k. l. yip and w. b. fowler, “electronic structure of e′ centers in sio2”, phys. rev. b, vol. 11, no. 6, pp. 2327–2338, 1975.
[16] e. p. o'reilly and j. robertson, “theory of defects in vitreous silicon dioxide”, phys. rev. b, vol. 27, no. 6, p. 3780, 1981.
[17] w. h. flygare, molecular structure and dynamics, prentice-hall inc., englewood cliffs, nj, 1978.
[18] d. m. fleetwood, “radiation-induced charge neutralization and interface-trap buildup in metal-oxide-semiconductor devices”, journal of applied physics, vol. 67, no. 1, pp. 580–583, 1990.
[19] x. j. chen, h. j. barnaby, “the effect of radiation-induced interface traps on base current in gated bipolar test structures”, solid-state electronics, vol. 52, pp. 683–687, 2008.
[20] w. shockley, w. t. read, “statistics of recombination of holes and electrons”, phys. rev., vol. 87, p. 835, 1952.
[21] r. n. hall, “electron-hole recombination in germanium”, phys. rev., vol. 87, p. 387, 1952.
[22] a. s. grove, physics and technology of semiconductor devices, john wiley & sons, inc., 1967.
[23] s. m. sze, physics of semiconductor devices. new york, wiley, 1981.
[24] g. i. zebrev, a. s. petrov, r. g. useinov, r. s. ikhsanov, v. n. ulimov, v. s.
anashin, i. v. elushov, m. g. drosdetsky, a. m. galimov, “simulation of bipolar transistor degradation at various dose rates and electrical modes for high dose conditions”, ieee transactions on nuclear science, vol. 61, no. 4, pp. 1785–1790, 2014.
[25] d. m. fleetwood, “total ionizing dose effects in mos and low-dose-rate-sensitive linear-bipolar devices”, ieee trans. nucl. sci., vol. ns-60, no. 3, pp. 1706–1730, june 2013.
[26] a. s. bakerenkov, v. v. belyakov, v. s. pershenkov, a. a. romanenko, d. v. savchenkov, v. v. shurenkov, “extracting the fitting parameters for the conversion model of enhanced low dose rate sensitivity in bipolar devices”, russian microelectronics, vol. 42, pp. 48–52, 2013.
facta universitatis series: electronics and energetics vol. 27, no 2, june 2014, pp. 183-203 doi: 10.2298/fuee1402183j

plasmonic enhancement of light trapping in photodetectors

zoran jakšić 1, marko obradov 1, slobodan vuković 1,2, milivoj belić 2
1 center of microelectronic technologies, institute of chemistry, technology and metallurgy, university of belgrade, serbia
2 science program, texas a&m university at qatar, p.o. box 23874 doha, qatar

abstract. we consider the possibility to use plasmonics to enhance light trapping in such semiconductor detectors as solar cells and infrared detectors for night vision. plasmonic structures can transform propagating electromagnetic waves into evanescent waves with the local density of states vastly increased within subwavelength volumes compared to the free space, thus surpassing the conventional methods for photon management. we show how one may utilize plasmonic nanoparticles both to squeeze the optical field into the active region and to increase the optical path by mie scattering, apply ordered plasmonic nanocomposites (subwavelength plasmonic crystals or plasmonic metamaterials), or design nanoantennas to maximize absorption within the detector. we show that many approaches used for solar cells can also be utilized in the infrared range if different redshifting strategies are applied.

key words: plasmonics, metamaterials, nanoantennas, solar cells, infrared detectors, light trapping

1. introduction

an important requirement posed in photodetector design is to maximize the useful photon flux for a given physical thickness of the active region of the device [1]. probably the most important type of such devices nowadays are solar cells [2-4]. they are basically photovoltaic detectors where an optical signal (radiation of the sun) is converted to voltage and thus to useful energy. since materials for solar cells are expensive, it is of interest to make their active region as thin as possible.
another important class of devices are infrared (ir) detectors [5], used e.g. in remote sensing, night vision, etc. since they are intended for longer wavelengths (typically they operate within the atmospheric windows at 3-5 μm or 8-12 μm), their thickness is usually relatively small compared to the operating wavelength.

received january 14, 2014
corresponding author: zoran jakšić, center of microelectronic technologies, institute of chemistry, technology and metallurgy, university of belgrade, njegoševa 12, 11000 belgrade, serbia (e-mail: jaksa@nanosys.ihtm.bg.ac.rs)

actually, the thickness of both solar cells and night vision devices may be in the subwavelength domain, i.e. smaller than the operating wavelength. the task posed to designers in both situations is to maximize optical trapping within such thin active regions. an important aspect of decreasing the thickness of semiconductor detectors in general is that it is accompanied by an increase of the response speed. thus the basic task in the design of such detectors is to maintain or even improve quantum efficiency in the operating wavelength range while decreasing the thickness as much as possible.

the engineering methods dedicated to maximizing the available optical flux in photodetectors are termed photon management or light management [6]. several general strategies are available for this purpose [7], as shown in fig. 1.

fig. 1 strategies for maximization of optical flux in photodetector

first, one may perform external light concentration and collect optical energy from an incident area larger than the physical dimensions of the detector active region itself (photon collector). a typical example of this approach would be the use of concentrating lenses or reflectors that gather irradiation from the so-called optical area and focus it onto the electric area of the detector.
non-imaging collectors can be used for that purpose [8-11].

after the signal has reached the active area of the detector, various antireflection coatings and structures can be used to decrease the reflected component of the incident radiation and to allow as large a part of it as possible to enter the active region itself [12]. all of these structures basically match the impedance of the free space/detector environment to that of the detector material.

once the radiation is inside the detector, one can increase the optical path through the active region. this can be done by backside reflectors redirecting radiation back to the active region, or by various scattering structures at the front and at the back side of the device, which change the path of the beam to make it longer and make use of total internal reflection to return the beam to the active region. it is also possible to utilize resonant structures (resonant cavity enhancement) [13], thus obtaining a narrow-bandwidth response, or to incorporate the photodetector in a photonic crystal cavity [14, 15].

another important approach to detector enhancement after the beam has entered the active region is to perform internal optical concentration (spatial localization), i.e. to fabricate structures that squeeze the optical space from a larger volume to a smaller one, thus increasing the local density of states of optical energy within the latter.

the last two approaches, i.e. optical path increase and spatial localization, belong to the light trapping schemes. the advent of nanostructuring technologies brought an impetus to this field. various building blocks with nanometer dimensions have been proposed for e.g. improving solar cell energy harvesting, including nanoparticles, nanowires, different core-shell geometries, colloidal quantum dots, etc. [16, 17].
recently the use of plasmonics has appeared as a novel approach to the nanotechnological improvement of photodetector light trapping [18-23]. basically, plasmonics represents the use of coupled electron oscillations and surface-bound electromagnetic waves called surface plasmon polaritons (spp). this is achieved through the utilization of metal-dielectric nanocomposites that can be designed to obtain almost any desired optical properties and thus almost complete control over electromagnetic propagation in and around such structures [24]. even values of optical parameters not ordinarily met in nature can be obtained, like near-zero or even negative values of the refractive index [25, 26]. such ability to engineer optical parameters at will has led to almost complete control over the propagation of electromagnetic waves and resulted in the appearance of transformation optics [27-29], where one optical space is transformed into another. one of the obvious applications of plasmonics has been to “squeeze” the optical space into a much smaller volume than that of the free space. in this way high localizations of the electromagnetic field became possible, i.e. local densities of electromagnetic states much larger than those in free space.

in this paper we consider the use of plasmonics for light trapping in (ultra)thin photodetectors, including solar cells and night vision detectors. after considering the fundamental limits to photon management in detectors from the point of view of subwavelength structures, we investigate the basic schemes for light trapping using plasmonics. we analyze the applicability of plasmonic nanoparticles both for field scattering and for localization within the detector, the use of subwavelength plasmonic crystals, and the possibility to redshift the device response utilizing designer plasmons. we consider the utilization of dedicated optical antennas (nanoantennas) for detector enhancement.
at the end we show how some of the schemes utilized for visible and near infrared radiation can be applied to night vision detectors through the application of different redshifting strategies.

2. fundamental limits to light management in detectors

we consider a general case of a photodetector as a device that converts optical energy into another form of energy. most often this is an electrical signal, although other forms may be used, like thermal [30], motion (e.g. cantilever-based detectors) [31], optical signal at another frequency (up- or down-converted) [32, 33], etc. basically, different light management approaches are intended to improve the absorption of light in the detector and ensure a higher degree of this conversion. obviously, the efficiency of any conversion is limited by basic physical laws. the question is what the fundamental limits of photodetector enhancement through light management are.

fig. 2 the structure of the active region of a detector with corrugated surface and ideal backside reflector

a detector system is presented in fig. 2. a background optical flux is incident on the active area of a photodetector with a thickness d. both in the case of solar cells and night vision photodetectors the optical flux is blackbody radiation, described by planck's law. in a general case the detector material may incorporate nanostructuring that could localize the optical field and create hotspots with a high density of states. a perfect mirror is placed at the rear side of the device, i.e. it is assumed that the incident light is unidirectional, while the internal radiation is bidirectional. the detector surface is corrugated in order to increase the optical path through the detector.
the corrugation may be random or ordered, but in both cases its basic purpose is to change the direction of light incident upon the active surface and to make use of total internal reflection to ensure repeated passing of the beams through the active region. light can escape if the direction of the internal beam falls within the escape cone, for which according to snell's law $\sin\theta_{cr} = 1/n$ ($\theta_{cr}$ is the critical angle of total reflection, $n$ is the refractive index of the active region). we first consider the case limited by geometrical optics, which has been established by yablonovitch [34-37]. in literature it is variably denoted as the conventional limit, the ergodic light trapping limit, the ray-optics limit and the lambertian limit. it is assumed that the detector active material can be described by an effective absorption coefficient $\alpha$ isotropic throughout the device and that the detector thickness is much larger than the operating wavelength in free space ($d \gg \lambda/2n$), so that one considers a bulk process. the absorbance within the photodetector for a single pass across the structure (absorption without enhancement) is

$$A(\omega) = 1 - \exp(-\alpha(\omega)\, d) \approx \alpha(\omega)\, d, \qquad (1)$$

i.e. for a weakly absorbing layer the absorbance is equal to the optical thickness of the photodetector, which is defined as the product $\alpha d$. since a bulk case is considered, it is further assumed that interference/diffraction effects can be neglected and that the intensity of light within the detector medium is in equilibrium with external blackbody radiation. the density of states within the medium is proportional to $n^2$. the next assumptions are that the equipartition theorem is valid (the internal occupation of states is equal to the external one, the internal states are ergodic) and that the surface corrugation performs a full randomization of the incident signal over space. this is not always satisfied, but the assumption holds in a vast majority of cases.
a sufficient condition for randomization of light by multiply scattering corrugated surfaces is that these surfaces upon averaging behave as lambertian. the internal distribution of the light within the medium is then isotropic. according to the statistical ray optics approach [34] the relation between the internal and the external intensity of light is

$$I_{int}(\omega, x) = 2 n^2(\omega, x)\, I_{ext}(\omega). \qquad (2)$$

the same result is also obtained according to the principle of detailed balancing of the light [38] applied between the light incident on a small surface element of the detector active area and escaping from that same element through the loss cone, and by applying the brightness or radiance theorem (e.g. [39]) stating that the spectral radiance of light cannot be increased by passive optical devices (based on the principle of reversibility). to determine the enhancement of absorption, one has to consider the loss of light due to various mechanisms. according to yablonovitch [34, 35] there are three such mechanisms: the escape of light through the escape cone, the losses due to imperfect reflection at the surfaces and the absorption in bulk. the absorbance of a photon is the ratio of the rate at which absorption occurs and the sum of the absorption and the photon loss through the escape cone. for the volume absorption in the limiting case when $\alpha d \ll 1$ and taking into account the angle of the loss cone $\theta$, this expression is

$$A(\omega) = \frac{\alpha(\omega)}{\alpha(\omega) + \dfrac{\sin^2\theta}{4 n^2 d}}, \qquad (3)$$

so that the absorption enhancement limit in the bulk case with internal randomization becomes $4n^2/\sin^2\theta$, i.e. $4n^2$ for $\theta = \pi/2$. in that case eq. (3) assumes the more often used simple form

$$A(\omega) = \frac{\alpha(\omega)}{\alpha(\omega) + \dfrac{1}{4 n^2 d}}. \qquad (4)$$

the next case we consider are the devices with plasmonic localization for the enhancement of absorption. in this case many of the above assumptions introduced for the ergodic limit are not valid.
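before moving on to the plasmonic case, the ray-optics relations above can be checked numerically. the following is a minimal sketch, not part of the original treatment: it evaluates the escape-cone angle, the single-pass absorbance of eq. (1) and the ergodic absorbance of eqs. (3) and (4). the function names and the material values (n = 3.5, α = 100 m⁻¹, d = 10 μm) are illustrative assumptions.

```python
import math


def critical_angle(n):
    """Critical angle of total internal reflection, sin(theta_cr) = 1/n (radians)."""
    return math.asin(1.0 / n)


def single_pass_absorbance(alpha, d):
    """Eq. (1): absorbance for a single pass through a layer of thickness d."""
    return 1.0 - math.exp(-alpha * d)


def ergodic_absorbance(alpha, d, n, theta=math.pi / 2):
    """Eq. (3); the default theta = pi/2 reduces it to eq. (4)."""
    return alpha / (alpha + math.sin(theta) ** 2 / (4.0 * n ** 2 * d))


def enhancement(alpha, d, n, theta=math.pi / 2):
    """Ratio of the ergodic absorbance to the single-pass absorbance."""
    return ergodic_absorbance(alpha, d, n, theta) / single_pass_absorbance(alpha, d)


if __name__ == "__main__":
    # Assumed, illustrative values: refractive index, absorption
    # coefficient (1/m) and thickness (m) of a weakly absorbing layer.
    n, alpha, d = 3.5, 1.0e2, 1.0e-5
    print("critical angle (deg):", math.degrees(critical_angle(n)))
    print("single-pass absorbance:", single_pass_absorbance(alpha, d))
    print("ergodic absorbance:", ergodic_absorbance(alpha, d, n))
    print("enhancement:", enhancement(alpha, d, n), "limit 4n^2:", 4 * n ** 2)
```

for a very small optical thickness the computed enhancement approaches 4n², while for thicker or more strongly absorbing layers it stays below that value because the absorbance saturates.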
the crucial points are that the light distribution is now not isotropic (and actually the volumes with a strongly enhanced density of electromagnetic states may be deeply subwavelength) and the thickness of the detector is usually subwavelength. a number of studies are dedicated to the situations in which the ray optics limit is exceeded and optical modes are confined at subwavelength scale [40-42]. however, until now no generally valid solution has been given for the extension of the ray optics limit [43].

3. plasmonics for light trapping

surface plasmon polaritons (spp) are oscillations of free electrons in a conductive material near an interface with a dielectric, coherently coupled with electromagnetic radiation at the interface. the conductive material can be characterized by a negative value of dielectric permittivity, while that of the dielectric is positive. typically the conductive material is a metal (most often gold and silver, although other metals are used, like chromium, copper, various alloys, alkali metals, etc.); however, other materials are used too, for instance transparent conductive oxides like indium tin oxide, zinc oxide, tin oxide, etc. (in near infrared), different semiconductors like silicon carbide and gallium arsenide (mid infrared), intermetallics, graphene and some other materials, all being denoted as plasmonic materials [44-46]. the spp is related to electromagnetic waves that are confined to the interface between positive and negative permittivity materials and are evanescent in the perpendicular direction, i.e. they exponentially decay away from the interface. spps can be propagating along the interface, or they can be nonpropagating, i.e. spatially confined to e.g. a metal nanoparticle (localized surface plasmon polaritons). generally, the rapidly expanding field of research and application of spp-based phenomena is denoted as plasmonics [24, 47-49].
the field of plasmonics is dedicated to the use of spps in a way similar to that in which electrons are used in electronics. this is achieved via engineering of nanocomposites that combine materials with positive and with negative values of dielectric permittivity in a certain frequency range. plasmonic nanocomposites can be one-dimensional (1d) like planar metal-dielectric superlattices, two-dimensional (2d) like cylindrical metallic nanowires, or three-dimensional (3d) like spherical metallic nanoparticles embedded in dielectrics. these structures can be periodic, quasiperiodic [50], aperiodic [51] or fully random [52]. the building blocks of these structures themselves may have different shapes, from simple to complex and from regular to irregular [53]. even in their simplest version, spps at the plane boundary between two semi-infinite media with opposite signs of dielectric permittivity are inhomogeneous electromagnetic waves (i.e. not plane waves) that propagate along the interface, and whose energy is concentrated in the narrow region near the boundary plane. this is possible only in a frequency range where the absolute value of the negative dielectric permittivity on one side is greater than the positive value on the other side of the interface. spps are strongly tm (transverse-magnetic) polarized, and because of that they are called polaritons. in other words, the magnetic field and the wavevector of the spp lie in the plane of the interface, while the electric field of the wave has components both perpendicular and parallel to the wavevector. therefore, spps are neither longitudinal nor transverse waves. it should be noted that the te polarized component of the electromagnetic field cannot satisfy the maxwell equations with standard boundary conditions in the form of a surface wave.
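the existence condition quoted above can be illustrated with the textbook dispersion relation of an spp at a single planar interface, $k_{spp} = k_0\sqrt{\varepsilon_m\varepsilon_d/(\varepsilon_m+\varepsilon_d)}$. this relation is not written out in the text and is used here as an assumption; the function names and the permittivity values are hypothetical as well. a minimal sketch:

```python
import cmath
import math


def supports_spp(eps_m, eps_d):
    """Bound-mode condition: Re(eps_m) < -eps_d, i.e. |eps_m| exceeds eps_d."""
    return complex(eps_m).real < -eps_d


def spp_wavevector(eps_m, eps_d, wavelength):
    """Complex in-plane SPP wavevector (1/m) at a planar interface.

    Textbook single-interface result, assumed here for illustration.
    """
    k0 = 2.0 * math.pi / wavelength
    return k0 * cmath.sqrt(eps_m * eps_d / (eps_m + eps_d))


def propagation_length(eps_m, eps_d, wavelength):
    """1/e intensity decay length along the interface (m)."""
    return 1.0 / (2.0 * spp_wavevector(eps_m, eps_d, wavelength).imag)


if __name__ == "__main__":
    # Assumed values: a lossy noble-metal-like permittivity against air.
    eps_m, eps_d, lam = -20.0 + 1.0j, 1.0, 800e-9
    k0 = 2.0 * math.pi / lam
    k = spp_wavevector(eps_m, eps_d, lam)
    print("supports spp:", supports_spp(eps_m, eps_d))
    print("Re(k_spp)/k0:", k.real / k0)
    print("propagation length (um):", propagation_length(eps_m, eps_d, lam) * 1e6)
```

the real part of the wavevector exceeds the free-space value, which is why the wave is bound to the interface and cannot be excited directly by a propagating plane wave without a coupler.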
plasmonic nanocomposites with two or more metal-dielectric interfaces within distances less than or comparable to the plasmonic material skin depth (~25 nm for au or ag) produce strong coupling of neighbouring spps and highly pronounced nonlocal effects. a plethora of new modes and possible novel effects may appear in such structures [54, 55]. sophisticated theoretical and numerical methods are necessary in order to achieve the desired nanocomposite design levels.

an important disadvantage of spps is their resonant nature, which causes a narrow bandwidth of operation. another one is their large wave damping due to collisions of free carriers in the epsilon-negative material, which leads to shorter spp lifetimes and/or propagation lengths and high absorption of incident radiation. the relative dielectric permittivity of plasmonic materials is negative below the plasma frequency, and its dispersion is well described by the electron resonance model of drude [56], also denoted as the drude-sommerfeld model

$$\varepsilon(\omega) = \varepsilon_\infty - \frac{\omega_p^2}{\omega(\omega + i\gamma)}, \qquad (5)$$

where $\omega_p$ is the plasma frequency, $\gamma$ denotes the damping factor describing losses (i.e. it defines the imaginary part of the complex dielectric permittivity), while $\varepsilon_\infty$ is the asymptotic relative dielectric permittivity. the plasma frequency is determined by the properties of free carriers as

$$\omega_p^2 = \frac{n_e e^2}{\varepsilon_0 m^*}, \qquad (6)$$

where $n_e$ is the electron concentration, $e$ is the free electron charge ($1.6\cdot10^{-19}$ C), $\varepsilon_0$ is the free space (vacuum) permittivity ($8.854\cdot10^{-12}$ F/m), and $m^*$ is the electron effective mass. the damping factor can be calculated from the material scattering data as

$$\gamma = \frac{e}{\mu m^*}, \qquad (7)$$

where $\mu$ is the mobility of free carriers.
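as a numerical illustration of eqs. (5)-(7), the sketch below computes the plasma frequency, the damping factor and the drude permittivity. the carrier concentration and effective mass are the values quoted for indium tin oxide in the redshifting section of this paper; the mobility and the value of $\varepsilon_\infty$ are assumptions chosen only for illustration, and the function names are hypothetical.

```python
import math

E_CHARGE = 1.602176634e-19    # elementary charge, C
EPS0 = 8.8541878128e-12       # vacuum permittivity, F/m
M0 = 9.1093837015e-31         # free electron mass, kg


def plasma_frequency(n_e, m_eff):
    """Eq. (6): angular plasma frequency (rad/s) for n_e in m^-3, m_eff in kg."""
    return math.sqrt(n_e * E_CHARGE ** 2 / (EPS0 * m_eff))


def damping_factor(mobility, m_eff):
    """Eq. (7): damping factor (1/s) for mobility in m^2/(V s)."""
    return E_CHARGE / (mobility * m_eff)


def drude_permittivity(omega, omega_p, gamma, eps_inf=1.0):
    """Eq. (5): complex relative permittivity at angular frequency omega."""
    return eps_inf - omega_p ** 2 / (omega * (omega + 1j * gamma))


if __name__ == "__main__":
    n_e = 1.2e27            # 1.2e21 cm^-3 expressed in m^-3
    m_eff = 0.4 * M0
    mobility = 30e-4        # assumed: 30 cm^2/(V s) expressed in m^2/(V s)
    w_p = plasma_frequency(n_e, m_eff)
    gamma = damping_factor(mobility, m_eff)
    print("plasma frequency (Hz):", w_p / (2.0 * math.pi))
    print("damping factor (1/s):", gamma)
    # Permittivity at a wavelength of 3 um, with eps_inf = 1 assumed.
    omega = 2.0 * math.pi * 299792458.0 / 3e-6
    print("eps at 3 um:", drude_permittivity(omega, w_p, gamma))
```

with these inputs the plasma frequency comes out close to 5·10¹⁴ hz, in line with the order of magnitude quoted for the indium tin oxide particle later in the paper.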
if interband transitions from the valence to the conduction bands exist, the dielectric permittivity is described by the lorentz model [57]

$$\varepsilon(\omega) = \varepsilon_\infty + \frac{\omega_p'^2}{\omega_0^2 - \omega^2 - i\gamma'\omega}, \qquad (8)$$

where $\omega_0$ is the resonant frequency of the electron oscillator, while the prime in the plasma frequency $\omega_p'$ and the damping factor $\gamma'$ denotes that these values are related to the concentration of bound electrons taking part in the interband transitions. since one is able to tailor a plasmonic nanostructure, dispersion relations can be designed within it, even enabling optical behavior that surpasses that of natural materials. the structures thus obtained are known as plasmonic metamaterials [25]. in that case one can obtain modes with superluminal group velocities (“fast light”), near-zero group velocities (“slow light”) or even negative ones (“left-handed light,” propagating in the direction opposite to that of the phase velocity) [58]. the possibility to obtain an arbitrary frequency dispersion makes it possible to convert propagating far-field modes into spatially localized near-field modes, thus obtaining a strongly increased density of states. the same energy is compacted into a much smaller space, thus ensuring much higher energy densities. this ensures a highly enhanced interaction of optical radiation with the photodetector material. this kind of engineering of optical absorption ensures its maximization in the active area, leading to a vastly increased photodetector response and sensitivity compared to other light trapping schemes. a drawback of the use of plasmonics in photodetection is the large absorption losses in metal, which result in a large part of the energy being converted to heat instead of the useful signal. this topic is a field of active investigation, and various schemes are used to avoid it [59].
one of the approaches is the use of alternative plasmonic materials, for instance transparent conductive oxides like tin oxide, indium tin oxide or zinc oxide [60], which are routinely used in solar cells because of their transparency at visible wavelengths. another such material for solar cell enhancement is graphene [45]. the applicability of plasmonics for photodetector enhancement was recognized very early, in the 1970s and 1980s, and actually some of the first proposed applications of surface plasmon polaritons were in photodetection [61, 62]. a large body of papers has been published on various methods of plasmonic enhancement in solar cells [19, 59].

surface plasmon polariton-mediated light trapping schemes may be roughly divided into the following groups according to the particular mechanism used (bearing in mind that a single trapping scheme may include more than one of these):
• enhanced mie scattering on plasmonic nanoparticles or nanovoids through plasmonic enlargement of the effective cross-section [63]
• coupling into guided modes (which may be propagating or spp modes) [19]
• field localization and generation of hotspots near the surface of the plasmonic material (using embedded nanoparticles, nanoantennas, metamaterials) [20]
• use of plasmon-based singular optics (optical vortices, i.e. circular flow of field in a corkscrew fashion around phase singularities in the optical near field around plasmonic nanostructures) [59]
• use of metamaterial-based transformation optics to map the optical space into a desired shape and with an increased density of states (optical superconcentrators and superabsorbers, optical black holes) [64]
• plasmon-enhanced up-conversion media (the reverse of luminescent materials used for down-conversion) [65]

the plasmonic structures to be used for one or more of the above purposes include the following:
• nanoparticles and nanovoids: used as scatterers and as nanoantennas for field coupling and localization; may be arranged in an ordered fashion (pattern) or disordered.
• diffractive structures (gratings, lattices): used for field coupling into guided modes; may be ordered or disordered.
• subwavelength plasmonic crystals (spc): used for field coupling and localization; may be periodic [66] or quasiperiodic [67] in 1d, 2d or 3d.

plasmonic structures may be used as resonant enhancers, in which case they offer a narrow-bandwidth operation, or may be nonresonant, with a wide-bandwidth operation [68].

4. plasmonic nanoparticles as mie scatterers

the scattering cross-section of a plasmonic nanoparticle is greatly enhanced due to plasma resonance compared to non-plasmonic ones. the effective cross-section may be an order of magnitude larger than the geometrical cross-section. thus a 10% surface coverage would suffice for practically 100% efficiency of conversion from incident propagating modes into surface plasmon polaritons.

fig. 3 light trapping utilizing plasmonic nanoparticles stochastically placed on the detector surface (labels: plasmonic nanoparticles (field concentrators), active region, substrate)

fig. 4 geometries for plasmonic scatterers for light trapping within a photodetector. a) nanoparticles embedded in top dielectric; b) nanoparticles on top of the active region; c) nanoparticles embedded within the active region; d) nanoparticles on the back side (labels: plasmonic nanoparticles, ar/dielectric, active layer, buffer, substrate)

usually the conventional mie theory is utilized for the calculation of effective cross-sections for absorption and scattering on nanoparticles [69]. mie theory is valid for noninteracting nanoparticles (i.e. those where the interparticle distance is large enough to prevent their electromagnetic coupling). in nanoparticles interacting through near-field coupling or far-field dipole interactions various additional phenomena appear, like splitting of plasmon resonances and their shifting.
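for a sphere much smaller than the wavelength, the mie result reduces to the dipole (quasi-static) limit, in which the cross-sections have a closed form. the sketch below evaluates them and normalizes them to the geometrical cross-section, as a rough check of the statement that the effective cross-section can greatly exceed the geometrical one. it assumes the standard dipole expressions with the wavevector taken in the surrounding medium; the function names and the permittivity, radius and wavelength values are illustrative assumptions, not data from this paper.

```python
import math


def polarizability(eps_np, eps_d, radius):
    """Dipole polarizability of a small sphere, 3V(eps_np/eps_d - 1)/(eps_np/eps_d + 2)."""
    volume = 4.0 / 3.0 * math.pi * radius ** 3
    ratio = eps_np / eps_d
    return 3.0 * volume * (ratio - 1.0) / (ratio + 2.0)


def cross_sections(eps_np, eps_d, radius, wavelength):
    """Return (C_scat, C_abs) in m^2 in the dipole limit.

    wavelength is the vacuum wavelength; using the wavevector inside the
    surrounding medium is an assumption of this sketch.
    """
    k = 2.0 * math.pi * math.sqrt(eps_d) / wavelength
    alpha = polarizability(eps_np, eps_d, radius)
    c_scat = k ** 4 / (6.0 * math.pi) * abs(alpha) ** 2
    c_abs = k * alpha.imag
    return c_scat, c_abs


def efficiencies(eps_np, eps_d, radius, wavelength):
    """Cross-sections divided by the geometrical cross-section pi*r^2."""
    geometric = math.pi * radius ** 2
    c_scat, c_abs = cross_sections(eps_np, eps_d, radius, wavelength)
    return c_scat / geometric, c_abs / geometric


if __name__ == "__main__":
    # Assumed values: a lossy particle exactly at resonance,
    # Re(eps_np) = -2 * eps_d, in vacuum.
    q_scat, q_abs = efficiencies(-2.0 + 0.3j, 1.0, 20e-9, 400e-9)
    print("Q_scat:", q_scat, "Q_abs:", q_abs)
```

for the assumed values both efficiencies exceed unity, and their sum is above ten, i.e. the particle interacts with light over an area much larger than its geometrical shadow. for particles that are not small compared to the wavelength the full mie series is needed instead.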
the simplest case is scattering on a spherical plasmonic nanoparticle that can be considered as an electric dipole. its scattering cross-section at a wavelength $\lambda$ can be calculated as [70, 71]

$$C_{scat} = \frac{1}{6\pi}\left(\frac{2\pi}{\lambda}\right)^4 |\alpha|^2 = \frac{8\pi^3}{3\lambda^4}\,|\alpha|^2, \qquad (9)$$

where

$$\alpha = 3V\,\frac{\varepsilon_{np}/\varepsilon_d - 1}{\varepsilon_{np}/\varepsilon_d + 2}. \qquad (10)$$

here $\varepsilon_{np}$ is the complex and wavelength-dispersive relative dielectric permittivity of the plasmonic nanoparticles, $\varepsilon_d$ is the permittivity of the surrounding dielectric medium and $V$ is the geometrical volume of the nanoparticle. the plasmon resonance and the maximum scattering cross-section are achieved at $\varepsilon_{np} = -2\varepsilon_d$. the absorption cross-section is determined as

$$C_{abs} = \frac{2\pi}{\lambda}\,\mathrm{Im}(\alpha). \qquad (11)$$

an elongated ellipsoid may be taken as a generalization of the case of a sphere and corresponds to a single wire nanorod antenna. this structure is actually the basic building block out of which more complex forms are built. again the mie theory is applicable to this case, in a somewhat modified form. the dipole moment induced by an external field in an elongated ellipsoid is

[equation (12): expression not recoverable from the source text]

and its resonant frequency

[equation (13): expression not recoverable from the source text]

where r is the short radius of the ellipsoid, and r/2 its longer radius. in the most general case, the shapes of the nanoparticles widely vary and may assume different complex forms (e.g. various convex and concave polyhedra, including stellated and other forms [72]). this reflects strongly in their plasmonic response [19], since in principle sharper forms will cause larger field localizations. mie theory has been generalized to some of the more complex forms, but in the most general case the response is calculated numerically.

5. diffractive plasmonic couplers

an obvious approach to light trapping using plasmonics is to integrate the detector structure with a diffractive plasmonic structure (diffractive optical element, doe) [73], and generally with a corrugated metal layer, to act as a coupler with propagating modes. the simplest doe is the conventional diffractive grating. a parameter of a general doe that determines the degree of coupling with propagating modes is its diffraction efficiency. the diffraction efficiency depends on the geometrical and material parameters of the plasmonic doe, i.e. the complex refractive index of the plasmonic material (for instance, transparent conductive oxides will generally have lower losses and longer resonant wavelengths than metals) and the dimensions of the doe features (in the case of plasmonic diffractive gratings the parameters of influence will be the lattice constant, i.e. the grating element spacing, and the shape and height of the grating ridges). thus its value can be tailored and optimized by a proper choice of the quoted parameters. the guided modes into which propagating modes are coupled by a doe can be propagating optical modes (the conventional waveguide modes) and surface plasmon polariton modes. in an ideal case for a photodetector, all propagating modes will be converted to plasmonic ones. figure 5 shows two different geometries for the incorporation of a doe into thin photodetectors: a) back-side doe, b) top-side doe. the configuration shown in fig. 5a is the more common of the two [74]. however, the second one (fig. 5b) can perform an additional function as a light collector.

fig. 5 two geometries for incorporation of plasmonic doe into a photodetector. a) bottom doe, b) top doe (labels: diffractive pattern, ar/dielectric, active layer, buffer, substrate)

depending on the structure of the doe coupler, the propagation lengths of the spp modes may be shorter or longer [75].
besides its function as a light trapping structure, a doe can also serve as a light collector by virtue of functioning as a non-imaging light concentrator [76]. in addition to that, a doe may perform impedance matching between free space and the photodetector material, thus behaving basically as a diffractive antireflection structure. for instance, 1d metallic gratings (i.e. a metal surface with an array of parallel slits) have been proved to act as such impedance-matching structures [77]. this means that such gratings exhibit wideband extraordinary transmission. since this is a non-resonant phenomenon, it ensures a wide bandwidth and a broad range of incident angles. the diffractive structure may have the form of a conventional diffractive grating with parallel ridges of metal, or may be more complex (e.g. a lattice/fishnet, etc.). in the most general case it will have the form of a holographic optical element with fully tailorable properties that can be computer generated [78]. plasmonic does may function in a narrow-bandwidth mode near resonance, but also as non-resonant elements with a wide bandwidth. a plasmonic doe built into a photodetector may simultaneously perform its function as a coupler and an electromagnetic field concentrator, but it may also be built to perform as a plasmonic waveguide [79, 80].

6. subwavelength plasmonic crystals and designer plasmons

a further generalization of diffractive plasmonic structures is that to subwavelength plasmonic crystals (spc) [81]. an spc may be defined as a 1d, 2d or 3d plasmonic structure with its period much smaller than the operating wavelength (a rule of thumb is that the periodicity is at least ten times smaller than the operating wavelength). thus the details of the structure are not “seen” by the incident light and it behaves as an effective medium with its optical parameters dependent on its design, thus ensuring engineering of the frequency dispersion of such materials.
the number of different possible kinds of spc is virtually limitless. plasmonic metamaterials may be regarded as a special class of the spc and are defined as structures possessing electromagnetic properties that are not readily found in nature [25], the most often researched among such properties being the possibility to reach negative values of the effective refractive index [82]. the spc structures ensure light localization and can therefore be straightforwardly utilized to enhance optical absorption in photodetectors. in addition to that, owing to a large number of possible modes in such structures [boba], it is possible to utilize them at the same time to match the impedance between the free space and the photodetector, effectively behaving as an antireflective diffraction structure. as an example of spc for the enhancement of solar cells, fan et al. [83] fabricated an ordered 2d array of metal cubes (or rather cuboids) on a semiconductor surface to improve light trapping. among spc structures within the context of photodetection, one of the more frequently encountered ones are 2d arrays of nanoapertures in opaque metal films. such structures first drew attention for their ability to transmit light in spite of the dimensions of the nanoapertures being much smaller than the operating wavelength and were denoted as extraordinary optical transmission (eot) arrays [84]. this behavior is a consequence of resonant excitation of spps at their surface, which forces the passage of electromagnetic waves incident on the whole surface through the apertures. since such behavior effectively corresponds to impedance matching between propagating waves and the perforated metal film, the eot arrays act as efficient antireflective structures. however, there is another useful application of the eot arrays (and generally of structured metal-dielectric surfaces) in photodetection, and it is based on the properties of the surface waves that propagate along them.
fig. 6 metallodielectric eot structure introducing “designer” plasmons with structurally tunable plasma frequency (labels: detector active region, plasmon enhancement)

pendry et al. [85] have shown that for a surface wave that propagates along a perforated metal film one is able to introduce an effective permittivity of the form

$$\varepsilon_{in\text{-}plane} = \frac{\pi^2 d^2 \varepsilon_{hole}}{8 a^2}\left(1 - \frac{\pi^2 c^2}{a^2 \omega^2 \varepsilon_{hole}\mu_{hole}}\right), \qquad (14)$$

where $\varepsilon_{hole}$ and $\mu_{hole}$ are the permittivity and permeability of the material within the holes, $a$ is the hole side length (in the case of square holes, as shown in fig. 6), $d$ is the period of the hole array, $c$ is the speed of light in vacuum, and $\omega = c k_0$, with $k_0$ the wavevector in vacuum. the effective plasma frequency of such a material is

$$\omega_p = \frac{\pi c}{a\sqrt{\varepsilon_{hole}\mu_{hole}}}. \qquad (15)$$

in other words, the effective dielectric permittivity of an eot array has a form identical to that of plasmonic materials. such surface waves that mimic spps were denoted by pendry as designer plasmons, and are also known as “spoof” plasmons. their main advantage is that one is able to tune the effective plasma frequency by a proper choice of geometry and material parameters and thus to shift it at will. an obvious application of this approach is in infrared detectors, and structures tuned to the range of 8-10 μm have been reported [86].

a paradigm that appeared in the wake of metamaterials is transformation optics [27-29, 87], the use of conformal mapping to transform one optical space into another, thus ensuring bending of light at will and tailoring of the density of states within a given volume. in the general case this is ensured through the use of gradient index metamaterials [88, 89]. probably the best known example of transformation optics are the so-called cloaking devices [29, 90], but from the point of view of photodetection much more interesting concepts are met in superfocusing and superconcentrators [91], superabsorbers [92, 93] including optical black holes [94], superscatterers [95], etc.
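returning to the designer plasmons of eqs. (14) and (15), the geometric tunability can be made concrete with a short calculation. the sketch below, whose function names and parameter values are illustrative assumptions, inverts eq. (15) to obtain the hole side length for a desired effective plasma wavelength $\lambda_p = 2\pi c/\omega_p = 2a\sqrt{\varepsilon_{hole}\mu_{hole}}$ and evaluates eq. (14).

```python
import math

C0 = 299792458.0  # speed of light in vacuum, m/s


def spoof_plasma_frequency(a, eps_hole=1.0, mu_hole=1.0):
    """Eq. (15): effective angular plasma frequency (rad/s) for hole side a (m)."""
    return math.pi * C0 / (a * math.sqrt(eps_hole * mu_hole))


def hole_side_for_wavelength(lambda_p, eps_hole=1.0, mu_hole=1.0):
    """Hole side (m) that places the effective plasma wavelength at lambda_p (m)."""
    return lambda_p / (2.0 * math.sqrt(eps_hole * mu_hole))


def inplane_permittivity(omega, a, d, eps_hole=1.0, mu_hole=1.0):
    """Eq. (14): effective in-plane permittivity of the perforated film."""
    prefactor = math.pi ** 2 * d ** 2 * eps_hole / (8.0 * a ** 2)
    return prefactor * (1.0 - (math.pi * C0) ** 2
                        / (a ** 2 * omega ** 2 * eps_hole * mu_hole))


if __name__ == "__main__":
    # Assumed target: effective plasma wavelength of 8 um.
    for eps_hole in (1.0, 12.0):   # empty holes vs. an assumed high-index filling
        a = hole_side_for_wavelength(8e-6, eps_hole)
        print("eps_hole =", eps_hole, "-> hole side (um):", a * 1e6)
```

filling the holes with a higher-permittivity dielectric reduces the required hole size for the same effective plasma wavelength, which helps keep the structure subwavelength.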
in their 2011 paper aubry et al. [64] proposed the use of transformation optics to ensure broadband light harvesting.

7. nanoantennas for photodetection enhancement

a nanoantenna or optical antenna [68, 96-98] is a plasmonic structure redirecting propagating waves into an evanescent field (and vice versa), where propagating and spatially localized modes are linked in a highly efficient manner. the amount of localization itself can be tailored by the proper design of the nanoantenna and can be deeply subwavelength. thus the interaction with the photodetector active region can be vastly enhanced. nanoantennas are isolated structures, i.e. they are not connected to a feeding circuitry like the conventional antennas. with this in mind, a simple spherical nanoparticle may be regarded as the most basic nanoantenna. its scattering properties are briefly presented in section 4 of this paper. various types of nanoantennas have been experimentally produced and presented in literature. fig. 7 shows some of the basic geometries, including the most basic type, the nanosphere. if two such spheres are brought together, they form a nanodimer with a coupling gap of subwavelength width between them (denoted as the feed gap). a field hotspot appears in the feed gap, where localization is deeply subwavelength and field enhancement is very strong. in this manner larger field localizations are obtained than those using single structures.

another generalization is the introduction of the elongated ellipsoid (also described in section 4), which can be described within this context as a dipole nanorod antenna, one of the most basic nanoantenna geometries. if two nanorods acting as linear dipoles are aligned and brought together to a subwavelength distance, ensuring an end-to-end coupling, they form a two-wire nanoantenna [68]. this is another basic type of optical antenna.
it can be further generalized by introducing two additional dipoles perpendicular to the first ones, all four having a joint feed gap (the cross-antenna). nanoparticles can be ordered in an array (nanoparticle chain) to form an optical antenna [99] effectively behaving as a linear nanorod antenna. another prototypical structure is the bowtie nanoantenna [100], consisting of two triangular shapes aligned along their axes and forming the feed gap with their tips. such geometry ensures a broader bandwidth together with large field localizations in the feed gap. a diabolo-type nanoantenna has been proposed in [101]. an optical yagi-uda nanoantenna can be fabricated by placing a resonant nanorod antenna between a reflector nanorod and a group of director nanorods [102]. similarly to such antennas used in the radio-frequency domain, a good directivity is obtained. more exotic shapes include spiral nanoantennas [103] and those with fractal geometries [104]. a plethora of other shapes can be used. different geometries include e.g. the use of split rings and various crescent shapes. an important group are nanoantennas making use of the babinet principle (a metal shape surrounded by dielectric and a dielectric-filled hole in metal with identical shape and size have identical diffraction patterns). thus bow-tie holes in metal substrates are used, as well as two holes as a babinet equivalent of a nanodimer, arrays of nanoholes, crossed arrays of nanoholes, etc. [105].

fig. 7 some different types of experimental plasmonic nanoantennas

the obvious way to use nanoantennas in photodetection is for coupling between propagating and localized modes and for field localization, especially through the use of hotspots within the feed gaps.
a large number of works have been dedicated to the use of optical nanoantennas for photodetector enhancement [59, 68, 97, 106]. the applicability of optical antennas for photodetection was recognized very early [61]. today it is still one of the foci of interest in the application of optical antennas [59, 106]. one of the alternative approaches is to use a schottky metal-semiconductor junction where the optical antenna forms the metal part of the metal-dielectric contact at the semiconductor detector surface [107]. photoexcitation generates hot electron-hole pairs by plasmon decay and the electrons are injected over the schottky barrier, thus directly generating photocurrent. a problem with this approach is the low efficiency when using hot electrons.

9. redshifting methods for nanoparticle-based plasmon-assisted infrared detection

most of the approaches described in this paper are applicable in different parts of the spectrum (subwavelength plasmon crystals/designer plasmon structures and optical antennas). however, the use of metal nanoparticles as mie scatterers is limited to frequencies near the surface plasmon resonance, which for the usual plasmonic materials (good metals) is in the ultraviolet or visible part of the spectrum. this makes them unsuitable for night vision devices and infrared detection. in this section we consider possible strategies to ensure the usability of plasmonic particles in the ir range [108]. the main point is that one needs to shift their resonance frequency toward longer wavelengths, i.e. to perform a redshift of the characteristics. one obvious approach is to use materials with a lower plasma frequency. it is known that the plasma frequency of transparent conductive oxides is redshifted compared to metals and can be further shifted through proper doping and fabrication techniques [109-111].
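the size of such a redshift can be estimated from the dipole resonance condition of section 4, $\varepsilon_{np} = -2\varepsilon_d$. assuming a lossless drude permittivity, $\varepsilon_{np} = \varepsilon_\infty - \omega_p^2/\omega^2$, this gives $\lambda_{res} = \lambda_p\sqrt{\varepsilon_\infty + 2\varepsilon_d}$. the sketch below evaluates this quasi-static estimate; neglecting losses and particle size effects and taking $\varepsilon_\infty = 1$ are assumptions of the sketch, so it is a rough guide and not a substitute for a full-wave simulation.

```python
import math


def dipole_resonance_wavelength(lambda_p, eps_d, eps_inf=1.0):
    """Quasi-static dipole resonance wavelength of a small Drude sphere.

    Follows from eps_np = -2*eps_d with eps_np = eps_inf - (omega_p/omega)^2;
    losses and retardation are neglected (assumption of this sketch).
    """
    return lambda_p * math.sqrt(eps_inf + 2.0 * eps_d)


if __name__ == "__main__":
    lambda_p = 625e-9   # plasma wavelength, m
    for eps_d in (1.0, 8.0, 10.0, 12.0):
        lam = dipole_resonance_wavelength(lambda_p, eps_d)
        print("eps_d =", eps_d, "-> resonance wavelength (um):", round(lam * 1e6, 3))
```

the estimate shows the two levers: the resonance wavelength scales linearly with the plasma wavelength of the material and grows with the permittivity of the medium surrounding the particle.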
another pathway toward redshifting is the immersion of plasmonic nanoparticles into a high refractive index material [70], either by incorporating them into a dielectric film at the detector surface or by utilizing core-shell particles with an external dielectric layer. finally, one of the possible methods is the adjustment of interparticle spacing. fig. 8 shows the calculated scattering cross section for a spherical dipole indium tin oxide nanoparticle with a radius of 60 nm. the assumed doping concentration was 1.2·10^21 cm^–3, which together with an effective mass of m* = 0.4 m0 furnishes a plasma frequency of 4.8·10^14 hz. the nanoparticle is placed at the top of the active surface of the detector and is embedded in dielectric, a layout similar to that shown in fig. 4b. the finite element method was utilized for simulation; no approximations were used. the plasmon resonance redshift, seen as the shift of the maxima of the scattering cross-section curves in fig. 8, is caused by the increase in the embedding dielectric permittivity. fig. 9 shows the radial distribution of the scattered electric field, presenting forward and back scattering. spreading of the forward scattering region is readily seen both with the increase of the permittivity of the dielectric layer (fig. 9a) and for larger operating wavelengths (fig. 9b). finally, fig. 10 shows the electric field x-axis component (parallel to the incident light polarization) around the spherical nanoparticle at the surface of the detector for a permittivity of the embedding layer of 8.

fig. 8 spectral dependence of scattering cross-section σscat for an embedded ito particle, r=60 nm, λp=625 nm (plot: scattering cross-section, ×10^–14 m², versus wavelength, μm, for εdiel = 8, 10 and 12; spherical nanoparticle, r = 60 nm, εsubstr = 10)

fig. 9 radial distribution of electric field around an ito nanoparticle obtained by finite element modeling; r=60 nm, λp=625 nm. light is incident from top. a) scattering curves obtained for dielectric permittivity values 8, 10 and 12 for an operating wavelength of 3 μm. b) scattering curves for operating wavelengths of 2, 3 and 4 μm. permittivity of the dielectric layer is 12

fig. 10 field enhancement around ito nanoparticle for infrared detector enhancement, calculated by fem simulation. light is incident from right. r=60 nm, λp=625 nm, λ=2.6 μm, permittivity of the dielectric layer is 8

10. conclusion

a broad overview is given of the currently available possibilities to use plasmonics for the enhancement of different classes of photodetectors, stressing solar cells and night vision devices. the consideration is based on the point of view of non-imaging photodetection devices (single detector elements) intended for detection of a broadband spectrum that can be represented as blackbody radiation. a classification of the approaches proposed until now is given, including some original results by the authors. the list of the available methods and approaches is certainly far from complete, since both the plasmonics and the solar cell fields of research are rapidly expanding, and new ideas and approaches appear almost every day.

acknowledgement: the paper is a part of the research funded by the serbian ministry of education and science within the projects tr32008 and iii45016 and by the qatar national research fund within the project nprp 09-462-1-074.

references

[1] a. shah, p. torres, r. tscharner, n. wyrsch, and h. keppner, “photovoltaic technology: the case for thin-film solar cells,” science, vol. 285, no. 5428, pp. 692-698, 1999. [2] a. mcevoy, t. markvart, and l. castañer, solar cells, elsevier, amsterdam, 2013. [3] g. li, r. zhu, and y. yang, “polymer solar cells,” nat. photonics, vol. 6, no. 3, pp. 153-161, 2012.
[4] s. j. fonash, solar cell device physics, elsevier, amsterdam, 2010. [5] a. rogalski, infrared detectors, crc press, boca raton, 2010. [6] r. b. wehrspohn, and j. üpping, “3d photonic crystals for photon management in solar cells,” journal of optics, vol. 14, no. 2, 2012. [7] z. jakšić, and z. djurić, “cavity enhancement of auger-suppressed detectors: a way to background-limited room-temperature operation in 3-14 μm range,” ieee j. sel. top. quant. electr., vol. 10, no. 4, pp. 771-776, 2004. [8] j. h. atwater, p. spinelli, e. kosten, j. parsons, c. van lare, j. van de groep, j. garcia de abajo, a. polman, and h. a. atwater, “microphotonic parabolic light directors fabricated by two-photon lithography,” appl. phys. lett., vol. 99, no. 15, 2011. [9] i. m. bassett, w. t. welford, and r. winston, "nonimaging optics for flux concentration," progress in optics 27, e. wolf, ed., pp. 161-226: elsevier, 1989. [10] w. t. welford, and r. winston, high collection nonimaging optics, academic press, 1989. [11] r. winston, j. c. minano, and p. g. benitez, nonimaging optics, academic press, 2005. [12] d. h. raguin, and g. m. morris, “antireflection structured surfaces for the infrared spectral region,” appl. opt., vol. 32, no. 7, pp. 1154-1167, 1993. [13] m. s. ünlü, and s. strite, “resonant cavity enhanced photonic devices,” j. appl. phys., vol. 78, no. 2, pp. 607-639, 1995. [14] b. temelkuran, e. ozbay, j. p. kavanaugh, g. tuttle, and k. m. ho, “resonant cavity enhanced detectors embedded in photonic crystals,” appl. phys. lett., vol. 72, no. 19, pp. 2376-2378, 1998. [15] z. djurić, z. jakšić, d. randjelović, t. danković, w. ehrfeld, and a. schmidt, “enhancement of radiative lifetime in semiconductors using photonic crystals,” infrared phys. technol., vol. 40, no. 1, pp. 25-32, 1999. [16] l. cao, p. fan, a. p. vasudev, j. s. white, z. yu, w. cai, j. a. schuller, s. fan, and m. l.
brongersma, “semiconductor nanowire optical antenna solar absorbers,” nano lett., vol. 10, no. 2, pp. 439-445, 2010. [17] m. m. adachi, a. j. labelle, s. m. thon, x. lan, s. hoogland, and e. h. sargent, “broadband solar absorption enhancement via periodic nanostructuring of electrodes,” scientific reports, vol. 3, 2013. [18] w. l. barnes, a. dereux, and t. w. ebbesen, “surface plasmon subwavelength optics,” nature, vol. 424, no. 6950, pp. 824-830, 2003. [19] h. a. atwater, and a. polman, “plasmonics for improved photovoltaic devices,” nat. mater., vol. 9, no. 3, pp. 205-213, 2010. [20] j. a. schuller, e. s. barnard, w. cai, y. c. jun, j. s. white, and m. l. brongersma, “plasmonics for extreme light concentration and manipulation,” nat. mater., vol. 9, no. 3, pp. 193-204, 2010. [21] s. pillai, k. r. catchpole, t. trupke, and m. a. green, “surface plasmon enhanced silicon solar cells,” j. appl. phys., vol. 101, no. 9, 2007. [22] v. e. ferry, m. a. verschuuren, h. b. t. li, e. verhagen, r. j. walters, r. e. i. schropp, h. a. atwater, and a. polman, “light trapping in ultrathin plasmonic solar cells,” opt. express, vol. 18, no. 13, pp. a237-a245, 2010. [23] k. r. catchpole, and a. polman, “plasmonic solar cells,” opt. express, vol. 16, no. 26, pp. 21793-21800, 2008. [24] s. a. maier, plasmonics: fundamentals and applications, springer science+business media, new york, ny, 2007. [25] w. cai, and v. shalaev, optical metamaterials: fundamentals and applications, springer, dordrecht, the netherlands, 2009. [26] s. a. ramakrishna, and t. m. grzegorczyk, physics and applications of negative refractive index materials, spie press bellingham, wa & crc press, taylor & francis group, boca raton fl, 2009. [27] u. leonhardt, “optical conformal mapping,” science, vol. 312, no. 5781, pp. 1777-1780, 2006. [28] u. leonhardt, and t. g. philbin, "transformation optics and the geometry of light," progress in optics, e. wolf, ed., pp.
69-152, amsterdam, the netherlands: elsevier science & technology 2009. [29] j. b. pendry, d. schurig, and d. r. smith, “controlling electromagnetic fields,” science, vol. 312, no. 5781, pp. 1780-1782, 2006. [30] e. h. putley, "thermal detectors," optical and infrared detectors, r. j. keyes, ed., berlin: springer-verlag, 1983. [31] p. g. datskos, n. v. lavrik, and s. rajic, “performance of uncooled microcantilever thermal detectors,” review of scientific instruments, vol. 75, no. 4, pp. 1134-1148, 2004. [32] t. trupke, m. a. green, and p. würfel, “improving solar cell efficiencies by up-conversion of sub-bandgap light,” j. appl. phys., vol. 92, no. 7, pp. 4117-4122, 2002. [33] t. trupke, m. a. green, and p. würfel, “improving solar cell efficiencies by down-conversion of high-energy photons,” j. appl. phys., vol. 92, no. 3, pp. 1668-1674, 2002. [34] e. yablonovitch, “statistical ray optics,” j. opt. soc. am., vol. 72, pp. 899-907, 1982. [35] e. yablonovitch, and g. d. cody, “intensity enhancement in textured optical sheets for solar cells,” ieee transactions on electron devices, vol. ed-29, no. 2, pp. 300-305, 1982. [36] t. tiedje, e. yablonovitch, g. d. cody, and b. g. brooks, “limiting efficiency of silicon solar cells,” ieee transactions on electron devices, vol. ed-31, no. 5, pp. 711-716, 1984. [37] p. campbell, and m. a. green, “limiting efficiency of silicon solar cells under concentrated sunlight,” ieee transactions on electron devices, vol. ed-33, no. 2, pp. 234-239, 1986. [38] w. shockley, and h. j. queisser, “detailed balance limit of efficiency of p-n junction solar cells,” j. appl. phys., vol. 32, no. 3, pp. 510-519, 1961. [39] m. born, and e. wolf, principles of optics, 7th ed., cambridge university press, cambridge 1999. [40] z. yu, a. raman, and s. fan, “fundamental limit of nanophotonic light trapping in solar cells,” proc. nat. acad. sci. u.s.a., vol. 107, no. 41, pp. 17491-17496, 2010.
[41] z. yu, a. raman, and s. fan, “thermodynamic upper bound on broadband light coupling with photonic structures,” phys. rev. lett., vol. 109, no. 17, 2012. [42] d. m. callahan, j. n. munday, and h. a. atwater, “solar cell light trapping beyond the ray optic limit,” nano lett., vol. 12, no. 1, pp. 214-218, 2012. [43] v. ganapati, o. d. miller, and e. yablonovitch, “light trapping textures designed by electromagnetic optimization for subwavelength thick solar cells,” ieee journal of photovoltaics, 2013. [44] a. boltasseva, and h. a. atwater, “low-loss plasmonic metamaterials,” science, vol. 331, no. 6015, pp. 290-291, 2011. [45] p. avouris, and m. freitag, “graphene photonics, plasmonics, and optoelectronics,” ieee j. sel. top. quant. electr., vol. 20, no. 1, 2014. [46] z. jakšić, s. m. vuković, j. buha, and j. matovic, “nanomembrane-based plasmonics,” j. nanophotonics, vol. 5, pp. 051818.1-20, 2011. [47] s. a. maier, and h. a. atwater, “plasmonics: localization and guiding of electromagnetic energy in metal/dielectric structures,” j. appl. phys., vol. 98, no. 1, pp. 1-10, 2005. [48] e. ozbay, “plasmonics: merging photonics and electronics at nanoscale dimensions,” science, vol. 311, no. 5758, pp. 189-193, 2006. [49] r. b. m. schasfoort, and a. j. tudos, eds., “handbook of surface plasmon resonance,” cambridge, uk: royal society of chemistry 2008. [50] c. bauer, g. kobiela, and h. giessen, “2d quasiperiodic plasmonic crystals,” scientific reports, vol. 2, pp. 0681.1-6, 2012. [51] m. maksimović, and z. jakšić, “emittance and absorptance tailoring by negative refractive index metamaterial-based cantor multilayers,” j. opt. a-pure appl. opt., vol. 8, no. 3, pp. 355-362, 2006. [52] k. vynck, m. burresi, f. riboli, and d. s. wiersma, “photon management in two-dimensional disordered media,” nat. mater., vol. 11, no. 12, pp. 1017-1022, 2012. [53] z. 
jakšić, "optical metamaterials as the platform for a novel generation of ultrasensitive chemical or biological sensors," metamaterials: classes, properties and applications, e. j. tremblay, ed., pp. 1-42, hauppauge, new york: nova science publishers, 2010. [54] s. m. vuković, z. jakšić, and j. matovic, “plasmon modes on laminated nanomembrane-based waveguides,” j. nanophotonics, vol. 4, pp. 041770, 2010. [55] s. m. vuković, z. jakšić, i. v. shadrivov, and y. s. kivshar, “plasmonic crystal waveguides,” appl. phys. a, vol. 103, no. 3, pp. 615-617, 2011. [56] p. drude, the theory of optics, dover publications, mineola, new york, 2005. [57] h. a. lorentz, the theory of electrons, dover publications, mineola, new york, 1952. [58] p. w. milonni, fast light, slow light and left-handed light, taylor & francis, abingdon, oxford, 2004. [59] s. v. boriskina, h. ghasemi, and g. chen, “plasmonic materials for energy: from physics to applications,” materials today, vol. 16, no. 10, pp. 375-386, 2013. [60] s. franzen, “surface plasmon polaritons and screened plasma absorption in indium tin oxide compared to silver and gold,” j. phys. chem. c, vol. 112, no. 15, pp. 6027-6032, 2008. [61] b. l. twu, and s. e. schwarz, “properties of infrared cat-whisker antennas near 10.6 μ,” appl. phys. lett., vol. 26, no. 12, pp. 672-675, 1975. [62] s. r. j. brueck, v. diadiuk, t. jones, and w. lenth, “enhanced quantum efficiency internal photoemission detectors by grating coupling to surface plasma waves,” appl. phys. lett., vol. 46, no. 10, pp. 915-917, 1985. [63] d. derkacs, s. lim, p. matheu, w. mar, and e. yu, “improved performance of amorphous silicon solar cells via scattering from surface plasmon polaritons in nearby metallic nanoparticles,” appl. phys. lett., vol. 89, no. 9, pp. 093103, 2006. [64] a. aubry, d. y. lei, a. i. fernández-domínguez, y. sonnefraud, s. a. maier, and j. b.
pendry, “plasmonic light-harvesting devices over the whole visible spectrum,” nano lett., vol. 10, no. 7, pp. 2574-2579, 2010. [65] t. trupke, m. green, and p. würfel, “improving solar cell efficiencies by up-conversion of sub-bandgap light,” j. appl. phys., vol. 92, no. 7, pp. 4117-4122, 2002. [66] g. shvets, and y. a. urzhumov, “electric and magnetic properties of sub-wavelength plasmonic crystals,” j. opt. a-pure appl. opt., vol. 7, no. 2, pp. s23-s31, 2005. [67] c. bauer, and h. giessen, “light harvesting enhancement in solar cells with quasicrystalline plasmonic structures,” opt. express, vol. 21, no. 103, pp. a363-a371, 2013. [68] p. biagioni, j.-s. huang, and b. hecht, “nanoantennas for visible and infrared radiation,” reports on progress in physics, vol. 75, no. 2, pp. 024402, 2012. [69] m. quinten, optical properties of nanoparticle systems: mie and beyond, wiley-vch, weinheim, germany, 2011. [70] m. schmid, r. klenk, m. c. lux-steiner, m. topič, and j. krč, “modeling plasmonic scattering combined with thin-film optics,” nanotechnology, vol. 22, no. 2, 2010. [71] v. e. ferry, j. n. munday, and h. a. atwater, “design considerations for plasmonic photovoltaics,” adv. mat., vol. 22, no. 43, pp. 4794-4808, 2010. [72] t. k. sau, and a. l. rogach, eds., “complex-shaped metal nanoparticles: bottom-up syntheses and applications,” weinheim, germany: wiley-vch, 2012. [73] d. c. o'shea, t. j. suleski, a. d. kathman, and d. w. prather, diffractive optics: design, fabrication, and test, spie publications, bellingham, washington, 2003. [74] p. spinelli, e. ferry, j. van de groep, m. van lare, a. verschuuren, i. schropp, a. atwater, a. polman, v. e. ferry, m. a. verschuuren, r. e. i. schropp, and h. a. atwater, “plasmonic light trapping in thin-film si solar cells,” journal of optics, vol. 14, no. 2, 2012. [75] p. berini, “long-range surface plasmon polaritons,” adv. opt. photon., vol. 1, no. 3, pp. 484-588, 2009. [76] r. d. r. bhat, n. c. panoiu, s. r. j. brueck, and r.
m. osgood jr, “enhancing the signal-to-noise ratio of an infrared photodetector with a circular metal grating,” opt. express, vol. 16, no. 7, pp. 4588-4596, 2008. [77] a. alù, g. d'aguanno, n. mattiucci, and m. j. bloemer, “plasmonic brewster angle: broadband extraordinary transmission through optical gratings,” phys. rev. lett., vol. 106, no. 12, 2011. [78] p. genevet, j. lin, m. a. kats, and f. capasso, “holographic detection of the orbital angular momentum of light with plasmonic photodiodes,” nature communications, vol. 3, 2012. [79] p. berini, “plasmon-polariton waves guided by thin lossy metal films of finite width: bound modes of symmetric structures,” phys. rev. b, vol. 61, no. 15, pp. 10484-10503, 2000. [80] p. berini, “plasmon-polariton waves guided by thin lossy metal films of finite width: bound modes of asymmetric structures,” phys. rev. b, vol. 63, no. 12, pp. 1254171-12541715, 2001. [81] i. i. smolyaninov, w. atia, and c. c. davis, “near-field optical microscopy of two-dimensional photonic and plasmonic crystals,” phys. rev. b, vol. 59, no. 3, pp. 2454-2460, 1999. [82] j. b. pendry, a. j. holden, d. j. robbins, and w. j. stewart, “magnetism from conductors and enhanced nonlinear phenomena,” ieee t. microw. theory, vol. 47, no. 11, pp. 2075-2084, 1999. [83] r. h. fan, l. h. zhu, r. w. peng, x. r. huang, d. x. qi, x. p. ren, q. hu, and m. wang, “broadband antireflection and light-trapping enhancement of plasmonic solar cells,” phys. rev. b, vol. 87, no. 19, 2013. [84] t. w. ebbesen, h. j. lezec, h. f. ghaemi, t. thio, and p. a. wolff, “extraordinary optical transmission through sub-wavelength hole arrays,” nature, vol. 391, no. 6668, pp. 667-669, 1998. [85] j. b. pendry, l. martín-moreno, and f. j. garcia-vidal, “mimicking surface plasmons with structured surfaces,” science, vol. 305, no. 5685, pp. 847-848, 2004. [86] j. rosenberg, r. v. shenoi, t. e. vandervelde, s. krishna, and o. 
painter, “a multispectral and polarization-selective surface-plasmon resonant midinfrared detector,” appl. phys. lett., vol. 95, no. 16, 2009. [87] h. chen, c. t. chan, and p. sheng, “transformation optics and metamaterials,” nat. mater., vol. 9, no. 5, pp. 387-396, 2010. [88] d. r. smith, j. j. mock, a. f. starr, and d. schurig, “gradient index metamaterials,” phys. rev. e, vol. 71, no. 3, pp. 036609, 2005. [89] m. dalarsson, m. norgren, n. dončov, and z. jakšić, “lossy gradient index transmission optics with arbitrary periodic permittivity and permeability and constant impedance throughout the structure,” journal of optics (united kingdom), vol. 14, no. 6, pp. 065102, 2012. [90] a. alù, and n. engheta, “achieving transparency with plasmonic and metamaterial coatings,” phys. rev. e, vol. 72, no. 1, pp. 016623, 2005. [91] a. i. fernández-domínguez, s. a. maier, and j. b. pendry, “collection and concentration of light by touching spheres: a transformation optics approach,” phys. rev. lett., vol. 105, no. 26, 2010. [92] j. ng, h. chen, and c. t. chan, “metamaterial frequency-selective superabsorber,” opt. lett., vol. 34, no. 5, pp. 644-646, 2009. [93] n. i. landy, s. sajuyigbe, j. j. mock, d. r. smith, and w. j. padilla, “perfect metamaterial absorber,” phys. rev. lett., vol. 100, no. 20, 2008. [94] e. e. narimanov, and a. v. kildishev, “optical black hole: broadband omnidirectional light absorber,” appl. phys. lett., vol. 95, no. 4, 2009. [95] t. yang, h. chen, x. luo, and h. ma, “superscatterer: enhancement of scattering with complementary media,” opt. express, vol. 16, no. 22, pp. 18545-18550, 2008. [96] l. novotny, and n. van hulst, “antennas for light,” nat. photonics, vol. 5, no. 2, pp. 83-90, 2011. [97] a. alu, and n. engheta, “theory, modeling and features of optical nanoantennas,” ieee t. antenn. propag., vol. 61, no. 4, pp. 1508-1517, 2013. [98] p. bharadwaj, b. deutsch, and l.
novotny, “optical antennas,” adv. opt. phot., vol. 1, no. 3, pp. 438-483, 2009. [99] a. f. koenderink, “plasmon nanoparticle array waveguides for single photon and single plasmon sources,” nano lett., vol. 9, no. 12, pp. 4228-4233, 2009. [100] p. j. schuck, d. p. fromm, a. sundaramurthy, g. s. kino, and w. e. moerner, “improving the mismatch between light and nanoscale objects with gold bowtie nanoantennas,” phys. rev. lett., vol. 94, no. 1, 2005. [101] t. grosjean, m. mivelle, f. i. baida, g. w. burr, and u. c. fischer, “diabolo nanoantenna for enhancing and confining the magnetic optical field,” nano lett., vol. 11, no. 3, pp. 1009-1013, 2011. [102] j. li, a. salandrino, and n. engheta, “shaping light beams in the nanometer scale: a yagi-uda nanoantenna in the optical domain,” phys. rev. b, vol. 76, no. 24, 2007. [103] e. n. grossman, j. e. sauvageau, and d. g. mcdonald, “lithographic spiral antennas at short wavelengths,” appl. phys. lett., vol. 59, no. 25, pp. 3225-3227, 1991. [104] g. volpe, g. volpe, and r. quidant, “fractal plasmonics: subdiffraction focusing and broadband spectral response by a sierpinski nanocarpet,” opt. express, vol. 19, no. 4, pp. 3612-3618, 2011. [105] y. alaverdyan, b. sepúlveda, l. eurenius, e. olsson, and m. käll, “optical antennas based on coupled nanoholes in thin metal films,” nature physics, vol. 3, no. 12, pp. 884-889, 2007. [106] c. simovski, d. morits, p. voroshilov, m. guzhva, p. belov, and y. kivshar, “enhanced efficiency of light-trapping nanoantenna arrays for thin-film solar cells,” opt. express, vol. 21, no. 13, pp. a714-a725, 2013. [107] m. w. knight, h. sobhani, p. nordlander, and n. j. halas, “photodetection with active optical antennas,” science, vol. 332, no. 6030, pp. 702-704, 2011. [108] z. jakšić, m. milinović, and d. randjelović, “nanotechnological enhancement of infrared detectors by plasmon resonance in transparent conductive oxide nanoparticles,” strojniski vestnik/journal of mechanical engineering, vol.
58, no. 6, pp. 367-375, 2012. [109] l. dominici, f. michelotti, t. m. brown, a. reale, and a. di carlo, “plasmon polaritons in the near infrared on fluorine doped tin oxide films,” opt. express, vol. 17, no. 12, pp. 10155-10167, 2009. [110] s. franzen, c. rhodes, m. cerruti, r. w. gerber, m. losego, j. p. maria, and d. e. aspnes, “plasmonic phenomena in indium tin oxide and ito-au hybrid films,” opt. lett., vol. 34, no. 18, pp. 2867-2869, 2009. [111] c. rhodes, m. cerruti, a. efremenko, m. losego, d. e. aspnes, j. p. maria, and s. franzen, “dependence of plasmon polaritons on the thickness of indium tin oxide thin films,” j. appl. phys., vol. 103, no. 9, 2008.

facta universitatis series: electronics and energetics vol. 33, n° 4, december 2020, pp. 499-529 https://doi.org/10.2298/fuee2004499d © 2020 by university of niš, serbia | creative commons license: cc by-nc-nd

is this artificial intelligence?

vladan devedžić university of belgrade, faculty of organizational sciences, belgrade, serbia

abstract. artificial intelligence (ai) has become one of the most frequently used terms in the technical jargon (and often in not-so-technical jargon). recent advancements in the field of ai have certainly contributed to the ai hype, and so have numerous applications and results of using ai technology in practice. still, just like with any other hype, the ai hype has its controversies. this paper critically examines developments in the field of ai from multiple perspectives – research, technological, social and pragmatic. part of the controversies of the ai hype stem from the fact that people use the term ai differently, often without a deep understanding of the wider context in which ai as a field has been developing since its inception in the mid-1950s.

key words: intelligence, artificial intelligence (ai), technology, applications, reality check.

1. introduction

artificial intelligence (ai) has been seeing an unprecedented rise in popularity for more than a decade.
several traditional subfields of ai have developed almost to the level of disciplines per se, and there are more and more practical applications of different technologies that have been developing for years under the ai umbrella. this has affected many sectors, and has attracted the attention of not only technology developers, but also of educators, social scientists, artists, governments, media and the wider public. on the other hand, there are many apparently simple questions that are still waiting for appropriate answers. what exactly is ai, in the first place? how intelligent is an intelligent system? what are the criteria to call a system an ai system, or an intelligent system? in order to set the stage for discussing these questions further, a brief review of some real-world examples of systems and applications called ai is a good starting point.

spam filtering is one of the commonly known examples of applying ai in email services, but it's less commonly known that smart email categorization and labelling is also ai-powered [1]. even fewer email users are aware of the ai behind smart replies and behind nudges about emails they haven't answered or have ignored.

received august 20, 2020 corresponding author: vladan devedžić university of belgrade, faculty of organizational sciences, jove ilića 154, 11000 belgrade, serbia e-mail: devedzic@gmail.com

ai voice-to-text apps for smartphones, like speechnotes¹ and voice notebook², can convert speech to text and can also convert an audio file to text. the same technology powers smart personal assistants, like google assistant³, alexa⁴ and cortana⁵, that can perform internet searches, set reminders, integrate with your calendar, create to-do lists, order items online and answer questions (via internet searches).
when google maps recommends the fastest route through a city on someone's smartphone, it intelligently takes into account not only the traffic speed, but also the road construction, accidents and different user-reported conditions [2]. likewise, ride-hailing-and-sharing apps like uber can accurately calculate the price of a ride, predict the passenger's demands, determine optimal pick-up locations and even compute the estimated time for food delivery [3]. some will be surprised to learn that ai autopilots on commercial flights are in charge of flying the aircraft for most of the flight time – human-steered times are typically just during takeoff and landing [4]. and that easily shifts attention to self-driving cars, buses and trucks, a largely debated ai topic that until very recently referred only to experimentation that used to spark our imagination, but nowadays is slowly becoming a reality [5]. these vehicles are smart enough to drive at an optimal speed, to follow the signs, to pay attention to the stop lights, pedestrians and other cars, and to safely bring the passengers and loads to their destinations.

using ai-enabled technology in military applications has always been one of the driving forces in developing ai further. typical current applications include unmanned (self-driving) vehicles, combat robots, drone swarms and autonomous action [6], [7]. they allow for running dangerous, suicidal missions, and have opened a whole new line of military strategy and tactics development. a good recent example that uses an ai technique called adversarial machine learning [8] is the model turtle created at mit – a robot that looks like a turtle to humans, but can easily fool other ai-powered robots and surveillance drones, to which it looks like a rifle [9]. this leads to a series of adversarial algorithmic camouflage tactics, like hiding military planes, tanks, and other objects, “blinding” missiles, and so on.
image recognition and face recognition systems have become quite popular. facebook⁶ highlights faces on an uploaded image and suggests friends for the user to tag, using ai to recognize faces. snapchat⁷ goes in a slightly different direction – it can also track facial movements. similarly, instagram⁸ uses ai to identify the contextual meaning of emoji. amazon rekognition⁹ can recognize faces of celebrities, and so can the microsoft azure custom vision¹⁰ image recognition cognitive service. google cloud vision¹¹ and amazon rekognition are currently among the leaders of general object recognition and content detection on images. google lens¹² brings up relevant information related to objects it identifies using visual analysis, fig. 1.

1 https://speechnotes.co/
2 https://voicenotebook.com/
3 https://assistant.google.com/
4 http://alexa.amazon.com/spa/index.html
5 https://support.microsoft.com/en-us/help/17214/windows-10-what-is
6 https://www.facebook.com/
7 https://www.snapchat.com/
8 https://www.instagram.com/
9 https://aws.amazon.com/rekognition/
10 https://azure.microsoft.com/en-us/services/cognitive-services/computer-vision/
11 https://cloud.google.com/vision/

fig. 1 the photo of the author's desk taken by the google lens app run by his smartphone (left) and part of the information shown by the app (correctly except for the color) as a result of the ai-based image analysis (right)

in the banking sector, fraud detection platforms based on machine learning (ml), such as the one created by the teradata¹³ firm, are in high demand [10]. they are capable of recognizing potential fraud transactions by differentiating between acceptable deviations from the norm and critical ones. acceptable deviations are treated as false positives, so the system can “learn” from its mistakes. the data used to train the ml model include recent frequency of transactions, transaction size, geolocational data, the kind of retailer involved, etc.

so, what is it in these (and many, many more) systems and applications that is most often called ai?

2. defining ai?

the question mark in the subheading is intentional. ai is notoriously hard to define – in fact, there are many definitions and none of them is dominant in the ai community; p. marsden has compiled a list of a few dozen popular definitions [11]. extracting and mixing bits and pieces from several of them, in this article ai is understood primarily as technology capable of exhibiting skills typically associated with human intelligence, such as the ability to perceive, learn, reason, abstract (classify, conceptualize and generate rules) and act autonomously.
it is also the science and engineering of creating such technology, where intelligence is the computational part of it that enables machines to exhibit behaviors and actions that would be called intelligent if a human were so behaving, i.e. that would require intelligence if they were done by humans.

an important characteristic of an ai system is that it can figure out things for itself, and then act based on that information. the most popular textbook on ai [12] stresses a variation of that characteristic: “ai is the study of agents that receive percepts from the environment and perform actions… a rational agent is one that acts so as to achieve the best outcome or, when there is uncertainty, the best-expected outcome,” i.e. has the ability to achieve goals in the world in an optimal way.

there are at least two distinct points in this understanding/description: (a) ai is technology, more precisely computational technology; and (b) it behaves and acts in a way that is typically associated with human intelligence. what makes things slip away in all attempts to define ai is not part (a); it is part (b).

12 https://lens.google.com/
13 https://www.teradata.com/

2.1. what is intelligence?

the much-quoted line of r.j. sternberg that “viewed narrowly, there seem to be almost as many definitions of intelligence as there were experts asked to define it” [13] reveals in a concise way that all attempts to define intelligence are inherently controversial. and, just like in the case of defining ai, there are collections of definitions (e.g., [14]) and broad statements and commentaries that outline only vague conclusions about the nature of intelligence, its origins and current scientific evidence.
this article adopts two broad statements of this kind, which describe intelligence as:

"a very general mental capability that, among other things, involves the ability to reason, plan, solve problems, think abstractly, comprehend complex ideas, learn quickly and learn from experience. it is not merely book learning, a narrow academic skill, or test-taking smarts. rather, it reflects a broader and deeper capability for comprehending our surroundings – 'catching on,' 'making sense' of things, or 'figuring out' what to do." [15]

"ability to understand complex ideas, to adapt effectively to the environment, to learn from experience, to engage in various forms of reasoning, to overcome obstacles by taking thought... concepts of 'intelligence' are attempts to clarify and organize this complex set of phenomena." [16]

note, however, that all such statements and attempts to define (or, at least, characterize) intelligence can lead to a vicious circle. one now needs to define each of these abilities, like understanding, thinking, reasoning, learning, adapting, etc. this is as difficult as defining intelligence, since "although considerable clarity has been achieved in some areas, no such conceptualization has yet answered all the important questions, and none commands universal assent" [16]. moreover, there can be substantial individual differences in performance related to these complex abilities, and they can vary even for the same person in different domains, under different circumstances, and so on. mechanisms to measure this performance do exist (e.g., iq), but judgement can be based on different criteria.

2.2. how intelligent is an ai system?

given the extremely high complexity of intelligence itself and of the abilities associated with it, developers of ai systems typically focus only on some narrow aspects of intelligence or on a specific dimension of intelligence, such as knowledge representation, reasoning, learning, and image analysis and interpretation.
unfortunately, this can lead to big differences in judging how intelligent an ai system is.

2.2.1. the turing test

as early as 1950, alan turing suggested that a program/machine should pass a behavioral intelligence test if it was to be called intelligent [17]: it should have a 5-minute typed-messages conversation with a human interrogator, and the interrogator then has to guess if the conversation was with a program or with a person; the program/machine passes the test if at least 30% of the time the interrogator believes she/he is having this conversation with a person [12]. the modern-time interpretation of the turing test [12] is that such a program/machine should be able to communicate successfully with the interrogator using a natural language, should be capable of representing and storing information and knowledge about what it hears and using that knowledge for reasoning when answering questions and drawing conclusions. in addition, it should be able to learn new knowledge and patterns and to adapt to new situations, as well as to perceive objects using its sensory input and manipulate the objects accordingly (robotics).

ever since the turing test was proposed, it has sparked intense debates. philosophers have argued that there are things that machines cannot do, others have cited mathematical proofs that some questions are in principle unanswerable by formal systems, and some strongly support the stance that human intelligence is much too complex to be captured by machines. however, in recent years there have been several announcements about ai systems passing the turing test [18], [19], [20]. these typically initiate counter-arguments and stay confined to academic circles; so far, there has not been much reaction from technologists.

2.2.2. weak ai vs. strong ai

weak ai systems are those that can act as if they were intelligent, i.e. they can simulate human cognitive function.
they can only appear to think, but definitely lack consciousness. they can follow certain rules and pre-programmed behaviors – even complex ones – but cannot do anything beyond these rules and behaviors. for example, a chess-playing program cannot be used as a personal assistant and vice versa. as j. searle puts it [21]: "according to weak ai, the principal value of the computer in the study of the mind is that it gives us a very powerful tool. for example, it enables us to formulate and test hypotheses in a more rigorous and precise fashion. but according to strong ai, the computer is not merely a tool in the study of the mind."

in contrast to weak ai, the hypothesis of strong ai is that an ai system should actually have human cognitive abilities and states, not just simulate them. strong ai is not about building tools that help test psychological explanations; it is about building systems that "are themselves the explanations" [21]. in other words, according to strong ai, intelligent programs should have their own autonomous perception, beliefs, emotions, and intentions; they should be minds.

current systems called "ai systems" are typically developed with the weak ai hypothesis in mind [12]. developers are happy if their programs work, and do not care much if people call them real intelligence or just a simulated one. a related problem is the level of sophistication of an ai system. current ai systems can easily beat even the best human players in computer games or in the games of chess and go, but can neither understand nor feel the meaning of fairy tales and stories for young children [22], let alone capture their bottom-lines and morals.

2.2.3. ai effect

critics of weak ai often discount a successful ai technology by not viewing it as real intelligence, regardless of the fact that it was once considered ai [23]. this is called ai effect: before the technology becomes part of everyday life, i.e.
before it comes out from the confines of ai research labs, it has a special aura; it looks magical and truly intelligent. once it is better understood by the majority and gets built into products and tools used by many, the thrill is gone – it loses the 'ai' label and becomes just technology. as a side effect, advancements in technologies that have once lived under the ai umbrella sometimes make these technologies break away from the 'ai' label and get rebranded: expert systems have come out of the ai auspices and become a technology per se, artificial neural networks are often called just neural networks, and everybody says just chatbots, not ai chatbots.

some see the cause of ai effect in the difference between the strong ai and weak ai concepts [24]. those who are ready to remove the 'ai' label from technology originating from ai research typically align themselves with the strong ai approach: if an ai problem has been solved, it's no longer ai; true ai is a problem that has not been solved yet. a possible way out is to take a different perspective: since ai today is typically weak ai, perhaps a down-to-earth question to ask is "can a specific problem be solved with weak ai or not?" it is also a good idea to occasionally "see the world differently" – what do researchers in other, more-or-less related disciplines, have to say about intelligence?

2.3. intelligence seen from different research perspectives

there is a dichotomy in explaining ai from technological and other perspectives. while technology-centered ai development focuses on systems that work accurately and fast, have exciting functionality and demonstrate certain aspects of intelligent behavior, experts in other disciplines are more interested in advancing the understanding of the phenomenon of intelligence.

2.3.1. neuroscience

neuroscientists have made some progress identifying various neurological factors relevant for intelligence [25].
it is now known that intelligence and functioning of the brain are related to the overall brain volume, cortical thickness, white matter volume, grey matter volume, white matter integrity, neural efficiency, etc. but it is also known that such factors are only partly responsible for differences in intelligence among different humans (as well as among different members of other species).

popular techniques/technologies used in non-invasive scanning of the human brain include electroencephalography (eeg), magnetic resonance imaging (mri), functional mri (fmri), etc. for example, recent uses of powerful mri scanners have enabled analysis of functional units inside the layers of the human cortex (responsible for high level of cognition) and seeing for the first time how information flows between collections of neurons in a live human brain [26]. note that this is extremely important for neural network research in ai – neural networks as we know them today are models based on never-proved assumptions of how neurons exchange information. moreover, such scanners have brought neuroscientists one step closer to understanding human memory. likewise, an analysis of over 18,000 mri scans of people over 44, paired with four cognitive tests from the uk biobank study has revealed that the brain size has only a minor correlation with intelligence, biological sex has no impact on intelligence, and intelligence is largely influenced by different brain regions [27].

note, however, that neuroscientists admit that although now we have considerably more evidence about how the human brain functions and what regions of the brain are responsible for intelligence, we still don't know what intelligence really is; a lot of further research is needed to fully understand it. that's why some neuroscientists take a different approach. because the human brain is extremely complex, they make attempts to understand how the brain of simpler species works.
for example, a notable success has been achieved with studying the brain of fruit flies (drosophila melanogaster) using electron microscopy – the entire brain of an adult female fly has been imaged at synaptic resolution [28]. however, a fact very relevant for ai research is that in spite of now having an unbiased mapping of synaptic connectivity of the fruit fly, synthesizing its brain – the size of a poppy seed – is not even in sight.

2.3.2. psychology

research and experiments in cognitive psychology have led to theories about how humans represent knowledge and how they process it in order to make inferences and decisions, create explanations, analyze situations at hand, reach conclusions and so on. the knowledge represented pertains both to the external world and to internal mental states, like beliefs, emotions, attitudes and desires [29]. information perceived from the world (both external and internal) gets encoded into mental representations and is either processed immediately, or is stored in memory for later retrieval and processing.

there are several basic forms of mental representations: spatial (e.g., the placement of objects in a room), feature (such as dogs bark, can run, have four legs, are faithful,…), network (like irish setter is a setter, irish setter is red, irish setter has bird sense, irish setter is a dog, dog is an animal), and structured (like a plate is on the table, a drawer is under the table, the drawer is closed,…). these forms themselves have their structures. there are also specific processes associated with each form, capable of accessing and using information and knowledge represented within a specific form. for example, in the network representation example shown above, the is-a relation between irish setter and dog enables accessing dog features indirectly and inferring that an irish setter can bark. a powerful tool of human thought processes is abstraction.
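the inference over the network representation described above (reaching dog features from irish setter through the is-a relation) can be sketched in a few lines of python. this is only an illustrative toy, not a model taken from the cognitive psychology literature: the dictionaries `IS_A` and `FEATURES` and the helper names `inherited_features` and `can` are hypothetical, and the entries are simply the examples used in the text.

```python
from collections import deque

# Hypothetical toy network built from the examples in the text.
# IS_A maps a concept to the concepts it "is a" kind of.
IS_A = {
    "irish setter": ["setter", "dog"],
    "dog": ["animal"],
}

# Features attached directly to each concept.
FEATURES = {
    "irish setter": {"is red", "has bird sense"},
    "dog": {"barks", "can run", "has four legs", "is faithful"},
}


def inherited_features(concept):
    """Collect the concept's own features plus those reachable via is-a links."""
    seen = set()
    found = set()
    queue = deque([concept])
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # guard against cycles and repeated parents
        seen.add(node)
        found |= FEATURES.get(node, set())
        queue.extend(IS_A.get(node, []))
    return found


def can(concept, feature):
    """True if the feature is stored on the concept or inherited from an ancestor."""
    return feature in inherited_features(concept)


if __name__ == "__main__":
    print(can("irish setter", "barks"))  # barking is stored only on "dog"
    print(sorted(inherited_features("irish setter")))
```

the feature "barks" is stored once, on dog, and is found for irish setter only by following the is-a link, which is the indirect access the paragraph describes.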
abstraction enables ignoring some information (i.e., not representing it, abstracting it away). this is very important in terms of the efficiency of processing the information that did get stored within the representation – it can be found and accessed more quickly, since the search space is more compact without the information that got abstracted away.

cognitive science lays the bridge between cognitive psychology and ai. it develops computational models of different forms of mental representations and their related processes. note, however, that these models only theoretically mimic human thought. in reality, we know very little about how knowledge is represented and processed in the human brain [30], in spite of valuable recent discoveries like the one that has revealed the brain's code for facial identity [31]. researchers are only beginning to tackle important problems like the relation between consciousness and intelligence [32] and the one between intentionality and intelligence [33].

2.3.3. philosophy

ever since the inception of ai, philosophers have been intrigued with it. the already mentioned work related to strong ai ([17], [21], [22]) is but a tiny bit of discussions on the topic. chapter 26 of [12] surveys philosophical pros and cons related to ai in much more detail.

some of the more recent considerations and debates in this area include v. vinge's notion of (technological) singularity [34], built upon the earlier i.j. good's concept of intelligence explosion [35]. essentially, singularity means that if humans can create intelligence smarter than their own, then it could do the same, only faster. the concept has been further explored by r. kurzweil [36], who projected that, given the pace of technological development, by the mid-2040s global computing capacity will exceed the capacity of all human brains, which will be a precondition for singularity.
numerous philosophical speculations and debates have followed, on the grounds that human brains cannot even comprehend such a superior intelligence. some have expressed fear that singularity can ultimately lead to the extinction of humans. others strongly oppose this view, arguing that humanity has already entered "a major evolutionary transition that merges technology, biology, and society, where digital technology has pervaded the fabric of human society to life-sustaining dependence", a transition that will ultimately lead to real ai (rai), as "a globally distributed hybrid cyber-physical human intelligence converging all the emerging technologies: rai = world big data + ai + ml (dnns) + cloud ai + edge ai + iot + 5g + blockchain + autonomous things + self-driving cars + virtual reality and augmented reality + 3d printer + quantum computing + smart spaces + …" [37]. notably, natural intelligence is included in the concept of rai.

yet other opinions exist, expressing the view that intelligence might be simpler than we think [38], since the way that humans perceive the world is hierarchical in nature, relying on simple patterns at the lower levels and increasing in complexity at the higher ones [39]. this is to say that the essence of perception, thinking, reasoning and other intelligent processing is actually pattern recognition – a long studied area in ai. all rai is viewed as a combination of a) relations/patterns/causality between entities in the environment, b) representation of a), and c) perception, cognition and reasoning in order to establish understanding of the environment and provide rational interaction with it. to this end, p.
domingos has introduced the concept of master algorithm [40], as a blend of different approaches to strong ai and to ml in particular – symbolic, connectionist, evolutionary, bayesian and analogy-based – where different ml algorithms synergistically contribute to an asymptotically perfect understanding of the world, the brain and intelligence.

philosophers also study higher-level concepts and their relations to intelligence, starting from the much quoted and thought-inspiring book gödel, escher, bach: an eternal golden braid by d. hofstadter [41]. these include deep links between art, music, creativity, algorithms, imagination and abstract math, subtly reflected in and subsumed by intelligence. for example, s. mahadevan has proposed the new concept of imagination machines as "a powerful launching pad for transforming ai" beyond the "current realm of learning probability distributions from samples" [42]. using numerous examples from arts, literature, poetry, and science, he envisions a new field of study in ai, imagination science, where researchers would explore various ways of automating tasks like "generating samples from a novel probability distribution different from the one given during training; causal reasoning to uncover interpretable explanations; or analogical reasoning to generalize to novel situations".

3. current focus in ai

given the difficulties in setting the scope and the boundaries of ai, reconciling somewhat different approaches to it when it's seen from the perspective of scholars of different backgrounds, as well as in resolving controversies that surround it, a pragmatic approach is to focus on its most popular subareas (at any given point in time). at the time of writing this article (july-august 2020), the "popularity bar graph" of these subareas, published at the ai topics 14 website (curated by the highly authoritative association for the advancement of artificial intelligence, aaai 15), looks as in fig.
2. the popularity is measured by the number of entries in the ai topics repository, related to specific topics.

it is obvious that ml is currently the most popular subarea of ai – out of the total of 336,000+ entries, about 160,000 are tagged ml. there are two major reasons for that. one of them is the flood of data that applications, businesses, different institutions, social networks, etc. generate. people want to make sense out of this extremely vast amount of data in order to improve their businesses and other activities, and ml comes to the rescue – it enables building a mathematical model based on sample data, known as "training data", in order to make predictions or decisions with previously unseen data, but without being explicitly programmed to do so [43]. to build models and make predictions, ml closely relies on computational statistics, mathematical optimization and exploratory data analysis; thus, it is also referred to as predictive analytics. the models themselves come in various forms, such as neural networks, regression analysis, decision trees, support vector machines, etc.

drilling down the graph shown in fig. 2 reveals that out of the nearly 160,000 ml entries about 54,000 are related to neural networks (nns), and about 32,000 are related to statistical learning. among the different types of neural networks, the currently most popular ones are deep neural networks (dnns) that enable so-called deep learning (dl) [44], [45], [46]. important types of dnns include: convolutional neural networks (cnns, typically used for image analysis, facial recognition, visual search, etc.)
[44], [45]; recurrent neural networks (rnns, useful in natural language processing, speech analysis, text analysis and so on) [44], [45]; and generative adversarial networks (gans, often used to generate examples for image datasets, photographs of human faces, realistic photographs, cartoon characters and face frontal views, as well as to perform image-to-image translation, text-to-image translation, semantic-image-to-photo translation, and more) [47].

14 https://aitopics.org/
15 https://aaai.org/

fig. 2 the bar graph of popular ai topics at the time of writing the article (source: ai topics website, https://shorturl.at/sau28) and the parts/chapters of the most popular ai textbook [12] (right)

the other reason for ml being so popular nowadays is the computational power of current ml technologies. the idea of learning new knowledge from data has been attractive in ai for decades, but only recently has computing technology advanced to the level that has made it at least partially possible. where it is not easily possible – e.g., when it requires too much processing time to build models that make predictions with a satisfactory level of accuracy – special-purpose computer hardware is usually the best solution. it can be a costly one, but it's a situation that further accelerates hardware development.

it should be also noted that ml and especially dnns have become pervasive in other popular subareas of ai indicated in fig. 2, notably in natural language processing (nlp) and in robotics. in nlp, application of dnns has led to many advancements in language modeling, capturing semantic properties of words, natural language generation, machine translation, word- and sentence-level classification, sentiment analysis, and more [48].
in robotics, detection and perception of objects, robotic grippers, fine grasping and object manipulation, scene understanding and sensor fusion, as well as collision avoidance, are all greatly improved with careful use of dnns [49].

the bar graph shown in fig. 2 is actually much more accurate than the current, informally established public view of ai. this public view can be often seen in media and in popular press, blog posts and forums all over the web: ai ≡ ml! a very frequent modality is ai/ml, and so is a less inaccurate "ai and ml". there are also variations in a bit narrower scope, like ml/nn, ml/dl and the like. this has prompted more knowledgeable people to spawn all over the web a series of images like the one on the left in fig. 3, depicting the subsumption relationship between ai, ml and dl. however, the diagram on the right in fig. 3 captures more details from the above discussion.

fig. 3 relationship between ai, ml and dl (left; after [50]) and a more detailed view based on the bar graph from fig. 2 (right)

the right-hand side of fig. 2 shows the table of contents of the most popular ai textbook today, artificial intelligence – a modern approach [12]. note that there is only a minor overlap with the bar graph on the left side. it further explains the diagram on the right side of fig. 3 – many of the remaining topics still are part of ai (the outer circle in fig. 3), but they are not in focus (which usually means lack of funding as well).

a notable exception to this end is the broad subarea of ai – representation and reasoning (the second highest bin of the bar graph in fig. 2). it has always been, and still is, in the core of ai. ai textbooks typically discuss only classical topics from this subarea (propositional logic, predicate logic, production rules, reasoning with uncertainty, fuzzy logic and systems, probabilistic reasoning and the like). however, there is thriving research there as well (although it still does not manage to catch much of the public attention) – new representation techniques and new efficient reasoning mechanisms have been devised recently [51], [52]. these largely pertain to topic modeling, knowledge graphs, conceptual modeling, representation of different types of thinking, knowledge interwoven with imperfect data, semantic summarization, the tradeoff between expressiveness and tractability, and constructing explanations.

the ai topics website largely reflects the views and interests of the ai community. however, views from other communities also matter. for example, fig. 4 shows an economic perspective on strategic development of ai. ml is still there, but obviously this community puts more emphasis on industrial and social aspects of ai, as well as on emerging topics such as ai ethics and ai education and awareness. notably, this perspective considers ai to be at the same level with robotics.

fig. 4 current focus in ai as seen by the world economic forum (source: https://intelligence.weforum.org/topics/a1gb0000000ptdrea2?tab=publications)

4. ai hype

the current wave of interest in ai is certainly unsurpassed in the entire history of the field. there have been periods in the past when breakthroughs in ai have received a lot of interest, attention and investments, but then they have been typically followed by periods of disillusionment, ai effect and lack of funding (usually referred to as "ai winters"). this current wave is not only the strongest, but also the longest one. popular media cover it on a regular basis. industry, businesses and services invest in ai more than ever before. year after year universities announce and start new courses and even entire study programs related to ai. governments open new funding programs and institutions to support further development of ai. well-known businessmen, investors, entrepreneurs and even some of the leading ai experts make statements that contribute to the hype (mark cuban: "invest in ai technology or risk becoming 'a dinosaur' very soon." 16; sundar pichai: "ai is probably the most important thing humanity has ever worked on"; koray kavukcuoglu: "we believe ai will be one of the most powerful enabling technologies ever created – a single invention that could unlock solutions to thousands of problems." 17; azamat abdoullaev: "whoever creates real artificial intelligence will rule the world." 18; andrew ng: "ai is the new electricity." 19). claims like "ai will completely revolutionize our society" are all over the media, and everyone wants to be involved in the technology race [53].

there are several reasons for all the buzz and excitement. the already mentioned technological advancements and largely increased computational power are an important enabler of ai developments, and the available enormous amounts of data come hand in hand with it. likewise, there really have been impressive recent developments that in part justify the hype. for example, some machines can outperform humans in extracting information from images and identifying objects on images [7], [53]. similarly, in nlp, the latest generative model from openai 20, called gpt-3, can generate amazing human-like text on demand [54]. also, the strategic outlook called industry 5.0 [55] puts the interaction and collaboration between man and machine right up front and sees ai as one of the major pillars of future industry developments. promoters envision this important ai trend to make highly automated manufacturing and self-managed supply chains a reality very soon. today's technology development leaders like facebook, google, tencent, amazon, alibaba etc. all have a great business interest in developing ai-powered systems and applications, and they advertise their efforts.
again, their own success with their ai products is undeniable, and there is no compelling reason why one should believe that they will not manage to make next major shifts in that direction.

however, all this interest and attention also raises an important question: can ai really live up to the hype? there are opposing opinions, stating that ai has been overhyped and that current ai systems are not very intelligent and thus are very limited. some already see a decline in the hype, starting from the gartner hype cycle for ai 2019 that indicates that ml, nlp, dnn and other ai technologies are already on the downward slope of the curve, in the section called the trough of disillusionment [56]. they remind the ai community and the wider public of earlier ai hypes that were crushed by failures (e.g., "the 5th generation of ai") 21. they also argue that significant ai results achieved in the past have become part of other disciplines and are no longer considered ai.

16 https://yourstory.com/2020/01/ces-2020-mark-cuban-ai-artificial-intelligence-investments-startups
17 https://www.bbc.com/news/technology-51064369
18 https://www.linkedin.com/pulse/global-artificial-intelligence-gai-narrow-ai-applied-mldl-abdoullaev/?published=t
19 https://www.wipo.int/wipo_magazine/en/2019/03/article_0001.html
20 https://openai.com/
21 https://shorturl.at/qhks3

some of the more extreme views in the stream opposing the ai hype even insist that consulting firms deliberately create the fear of missing the ai wave and scare companies into paying for ai projects 22.
they warn that typical ai applications rarely bring high payoff to companies. ai can be very hard to afford, given the cost of ai specialists and specialized hardware.

mocking the ai hype comes along the same lines. a famous meme 23 from 2018 makes a parallel between concepts in computing – "then" there were application, program, operating system, script, shell, batch file, service, etc.; in 2010, they had all been replaced by app, app, app,…; in 2018, their names became ai, ai, ai,…

note, however, that it is not as clear cut (i.e. just promoters vs. opponents) as it might look. the general attitude to ai has changed notably. once it was not so popular and profitable to start a business with ai. nowadays, companies proudly wave their ai flags. it has become almost a matter of self-esteem for a company to say that it is not making just ordinary applications, but ones that can learn, talk, perceive objects and so on – much like people – using ai.

when someone makes a pilot study and comes up with results like "in the future, ai will shorten your commute even further via self-driving cars that result in up to 90% fewer accidents, more efficient ride sharing to reduce the number of cars on the road by up to 75%, and smart traffic lights that reduce wait times by 40% and overall travel time by 26%" [1], opponents call it guessing, incomplete, wishful thinking and the like. however, people ask: "how safe are self-driving vehicles? i've heard of an accident caused by malfunction of such a vehicle." promoters of self-driving vehicles often answer with a counter-question: "how many accidents like that have you heard of?" true, self-driving cars are not that many yet, so the chance of accidents caused by them is still low. if one thinks in terms of percentages/proportions – what are the proportions of the rides that ended up as accidents when a driver was behind the steering wheel, and those that had no driver?
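the point about proportions can be made concrete with a small calculation: raw accident counts are misleading when the numbers of rides differ by orders of magnitude, so the comparison has to be made per ride. the python sketch below uses purely hypothetical placeholder counts (they are not real statistics and do not come from any of the cited sources); the function name `accidents_per_million_rides` is likewise introduced only for this illustration.

```python
def accidents_per_million_rides(accidents, rides):
    """Normalize an accident count by the number of rides (rate per 1,000,000 rides)."""
    if rides <= 0:
        raise ValueError("rides must be positive")
    return accidents * 1_000_000 / rides


# Purely hypothetical placeholder numbers, chosen only to show the arithmetic.
human_rate = accidents_per_million_rides(accidents=500, rides=10_000_000)
driverless_rate = accidents_per_million_rides(accidents=2, rides=20_000)

if __name__ == "__main__":
    print(f"human-driven: {human_rate:.1f} accidents per million rides")
    print(f"self-driving: {driverless_rate:.1f} accidents per million rides")
```

with these made-up inputs the fleet with far fewer accidents in absolute terms (2 versus 500) has the higher rate per ride, which is why "how many accidents have you heard of?" does not settle the question either way.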
an alternative way of thinking about the same problem is: there are no drunk or mad drivers in self-driving cars. again, the debate is huge, but laymen are very surprised here: some people believe not only that the safety of self-driving cars is not lagging behind that of human-driven cars, but that self-driving cars are safer 24. they ground this opinion in the fact that such vehicles can use much more information than human drivers – information from vehicle-to-vehicle messaging, from ultrasonic and infrared imaging, from automated external traffic-control systems, and so on. of course, critics will reply that level-5 (fully automated) self-driving will never be possible because the ai built into self-driving vehicles belongs to a very narrow domain and lacks a wider, human comprehension of the world; thus, the critics say, using a non-humanlike way of achieving intelligence, fully automated and truly intelligent self-driving cars will always "be right around the corner." 25

all in all, controversy is already there, but perhaps paradoxically – it only contributes to the hype.

22 https://www.forbes.com/sites/petercohan/2019/02/15/3-reasons-ai-is-way-overhyped/#31fd61a15a6a
23 https://shorturl.at/mlruy
24 https://qr.ae/txsdyi
25 https://qr.ae/tlygdy

5. limitations of what is called ai today

a good question to ask about the systems that are called ai today is: what exactly can these systems do? a short answer might be: typically, one thing. for instance, a self-driving car can maybe outperform human drivers in terms of safe driving, communicating with other cars and relevant services to exchange information about road conditions, and even inform the passengers about the route, the driving time, and the like.
but it cannot infer how to answer questions like: who wrote the famous lyrics words are flowing out like endless rain into a paper cup?; or, what does the term lonely planet stand for? likewise, after seeing many thousands of images of leopards, a dnn can learn to recognize them with very high accuracy. but it typically breaks when shown an image of a similar animal, like a cheetah, or a lynx. it needs to undergo a time-consuming training process again, to see many thousands of images of cheetahs in order to learn how to recognize them. and the same goes for lynxes. paradoxically, the process is the same even if it has to learn to recognize something completely different, say a tree. the idea of training another dnn on multiple datasets (e.g., leopards, cheetahs, lynxes and trees) would not work because of feature interference. even if it worked for a specific multiset, it would face the same problem when possibly adding yet another dataset to the multiset. efforts to solve this problem do exist (e.g., the proposed multi-modal dl architecture [57] with separate models tuned for each specific dataset in a multiset), but the need for training the resulting dnn again for each new dataset remains. actually, the problem is that dnns are not capable of learning the underlying principles of recognizing similar objects and differentiating them from the starting category of objects.
just as ai and ml are not the same thing, and ml is not simply "ai that improves itself" (an idea often found in the popular press), dl is not ml. dl can be superior in learning how to recognize images or natural language, but it is not a magic wand. when it comes to mundane tasks like regression and classification from structured data, like data sourced from a relational database, dl is of little use. in such cases, statistical techniques like gradient boosting [58], e.g. xgboost [59], are a better choice. similarly, as scott e.
fahlman puts it, 26 concept detection in nlp using dl works well if a dictionary of words or word patterns representing the concepts of interest is available. otherwise, traditional symbolic reasoning might be more suitable. on the other hand, symbolic knowledge representation and reasoning techniques are also far from being good at achieving human-level performance in any non-narrow domain, let alone in commonsense reasoning.
ml technology of today is also very limited in terms of generalizing from examples, as well as in terms of learning concepts efficiently and quickly based on a small set of the concept features and on just a few examples. a general problem of most currently popular ml approaches is that they need a lot of data to make statistical inference about possibly existing patterns in the data with acceptable accuracy. the data is typically noisy, and given enough data and enough computing power ml can be successful. however, humans are capable of learning from just a handful of examples and clear data. 27 moreover, a few examples and clear data make it possible for humans to clearly formulate the knowledge the examples convey, to use this knowledge in further reasoning and to explain their reasoning. contrary to that, much of ml today works like a black box (with a notable exception of decision trees, which are easily interpretable and explainable). it is especially true for nns, most notably dnns. for example, dnns for image classification can include millions of parameters in their convolutions, relu and max pooling layers, which is inherently incomprehensible for humans; explaining how everything works inside such networks is currently an illusion.
26 https://qr.ae/pnvvsi
27 https://qr.ae/tw4h6w
another serious limitation of today's systems called ai is that they are pretty straightforward, which is not typical for intelligent behavior.
for example, humans typically drift away in conversations, they change topics, insert jokes and colloquial phrases here and there, and make conversation spontaneous. ai systems don't. true, they can answer questions like "when do i have my next meeting?" and "how long does it take to get from a to b by car?" quite accurately, but they cannot answer any more imaginative questions, like "if bach was still alive, would he play blues?". in the words of s. mahadevan, today's ai is designed to answer "what is" questions, but not "what if" questions; the latter "would simply befuddle any ai system". 28
many systems called ai today are also easy to fool. studies have shown that dnns are actually very brittle and vulnerable to attacks – making some tiny changes in input images through deliberate adversarial perturbations (like adding some fuzz, noise) [60], even changing only one pixel [61], can lead to a completely wrong classification of the image in a lot of cases. now, if one thinks of some real-world applications of dnns, such as self-driving vehicles, such a one-pixel change can be fatal – what if a raindrop "changes this one pixel" in such a way that the car "believes" that a pedestrian is another car? or, what can that one pixel do if a medical decision is to be made based on a number of images of a tissue? similarly, an image of a bicycle or a guitar pasted for adversarial purposes over (a part of) an image of a monkey can fool the dnn into classifying the animal as a human [62]. the problem here, again, is the black-box nature of dnns – it is simply difficult to figure out what exactly dnns are doing inside their hidden layers when they are predicting the class of an input data item, let alone resemblance to how human brains work. yes, they are always repeating the same algorithmic steps and are making classifications based on some statistics, but humans often have trouble understanding why such statistics are dominant.
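the brittleness described above can be sketched on a deliberately tiny model. the example below is not a dnn and does not reproduce the attacks in [60]-[62]; it is a toy linear classifier over four "pixels", with weights, bias, labels and the perturbation budget epsilon all chosen as assumptions for illustration. it only shows the mechanism: a small, bounded change to every input value, pushed in the direction the model is most sensitive to, flips the predicted class.

```python
def score(weights, bias, pixels):
    """Linear score; a positive value is read as 'car', otherwise 'pedestrian'."""
    return sum(w * p for w, p in zip(weights, pixels)) + bias


def classify(weights, bias, pixels):
    return "car" if score(weights, bias, pixels) > 0 else "pedestrian"


def sign(value):
    return (value > 0) - (value < 0)


def perturb(weights, pixels, epsilon):
    """Move each pixel by at most epsilon in the direction that raises the score,
    keeping values inside the valid [0, 1] range."""
    return [min(1.0, max(0.0, p + epsilon * sign(w)))
            for w, p in zip(weights, pixels)]


# Toy model and input (hypothetical values).
weights = [0.5, -0.25, 0.75, -0.5]
bias = -0.1
image = [0.25, 0.5, 0.25, 0.5]

adversarial = perturb(weights, image, epsilon=0.125)

print(classify(weights, bias, image))        # original input
print(classify(weights, bias, adversarial))  # perturbed input
```

for a linear model the score shifts by at most epsilon times the sum of the absolute weights, so with many inputs even a very small epsilon can be enough to cross the decision boundary.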
dnns do not model human brains, simply because it is not known how the human brain works. more data fed into a dnn can make it more accurate, but not intrinsically human-smart. also, feeding more data into a dnn cannot account for all possible situations, not even for all possible typical data items; the datasets used contain data from different sources, hence a great deal of repetitive data. given all this discussion, one can ask the question: where is the intelligence there?
28 https://qr.ae/pn2psz
6. reality check and practical challenges
applying ai to solve practical problems in the real world usually brings up conditions different from those that govern academic research in the field. the understanding of ai (or the lack of such understanding?) in companies and institutions comes from business objectives, which typically command development of technology with more "intelligence", i.e. of practical ai, which roughly corresponds to weak ai and is intentionally limited 29. few companies are interested in developing general ai (strong ai), i.e. sentient behavior. both practical and general ai development require expertise from multiple fields, since "ai is not a single thing".
6.1. human-driven ai vs. autonomous ai
much of practical ai is human-driven. for example, one can see ml as predictive analytics – it creates predictions that inform human decision makers. but all steps in the process – from collecting data into dataset(s) and wrangling with the data to make it suitable for feature engineering, building the model(s), testing them, fitting them and creating predictions – are essentially driven by data engineers / ml engineers. the tools they use do not learn by themselves, i.e. they do not have a built-in self-improvement logic. even if such logic were built into the ml tools, it would still be pre-programmed by human ai specialists.
jeff bezos calls this human-powered pseudo-ai "aai" – artificial artificial intelligence. 30 in contrast, autonomous ai (general, strong ai) reflects "the very nature of intelligence … [i.e.] it is self-guided, self-expanding and self-inspired." 31 for instance, an ml tool capable of improving its own code, deciding by itself which ml model to use to make predictions, and making different inferences about datasets by itself, would be an autonomous ml tool. to the best of the author's knowledge, such tools do not exist in practical ai today.
6.2. ai as a marketing term
sadly, due to the ai hype the label "ai" has largely become a marketing term, and the press and online posts support that situation. it has become "a matter of honor" for companies and institutions to put the label "ai" in their products and profile descriptions, whereas in reality many of the products and activities labeled "ai" are at best applied statistics, business analytics and informed human decision-making.
in marketing, rebranding is a powerful tool. if one looks carefully at the history of terms used to describe parts of research and development often attributed to ai, then they will see that once upon a time there were "pattern matching" and "pattern discovery". later on, there came "data mining" and "knowledge discovery" – slightly different, but cultivated on the same soil as their predecessors. nowadays, all of them are simply rebranded as "ai" (or "ml", or "dl"). from the marketing perspective, it was actually a clever decision: "ai" is catchier, cooler, more appealing and more promising. still, just like in any marketing campaign, the reality is different. today's dominating weak ai does the job in specific narrow application areas, but when compared to general human intelligence – it lives in a galaxy far, far away.
as a famous tweet says: when you're fundraising, it's ai. when you're hiring, it's ml. when you're implementing, it's linear regression. 32
29 https://www.quora.com/?activity_story=88335643
30 https://www.wcspeakers.com/speaker/jeff-bezos/
31 https://medium.com/@ruchika.nanayakkara/ai-is-the-next-virus-42f887a6bec4
there are also warnings that the hype and hysteria around ai can possibly do harm to further ai development [63]. some of them are based on the fact that the labels "ai" and "ml" are (over)used only to boost sales. 33 as in the tweet mentioned above, "ml" advertises and masks much less popular terms like "regression" and "classification" that would actually describe the essence of ml (and the absence of human-like learning in it) in a more realistic way. however, this "sales pitch" bubble can burst soon, because of the dangers associated with raising expectations too high, without thinking about the real chance of delivering on the vision. both heavy promoters of ai (often being ceos in big-name companies, where weak ai is an essential part of their business model) and doom forecasters (predicting massive unemployment due to ai development, existential threat, singularity and even destruction of our civilization – like stephen hawking, elon musk and bill gates, to name but a few) have further advertised ai with their statements [63]. however, there is little evidence in support of either big promises or big doomsaying. as market research shows, productivity in many countries is slowing down (rather than rising due to automation supported by practical ai), and unemployment has recently been at a historical low [63].
moreover, a 2019 survey conducted by a uk-based investment firm has shown that about 40% of europe's "ai companies" don't use ai in any way essential to their business [64]. unfortunately, such facts possibly indicate that the warnings expressed in [63] might be right: once again, as the 2019 gartner curve shows [56], the disillusionment caused by over-advertised but unfulfilled ai promises has started.
6.3. ai seen from different practical perspectives
different disciplines intersect in what the label "ai" means in the ai community; in a way, as discussed in section 2, it's a catch-all term encompassing subsets of computer science, engineering, statistics, computational linguistics, mathematics, cognitive psychology, neuroscience, philosophy, etc. even subareas of ai represent intersections of different other disciplines. for example, ml is considered by some as "a rebranding of tools from linear algebra, approximation theory, numerical optimization and statistics." 34
interesting questions here are: what does current ai look like from the perspective of other relevant disciplines? what are the roles of these disciplines in ai? what about industry, employers' expectations and job market? what is the role of ai in a context wider than that of technology development?
6.3.1. the role of statistics
most ml today heavily depends on statistics; so much so that one can often hear that ai is just statistical fitting (or curve fitting). 35 such statements draw from the fact that, in most ml, conclusions and predictions are made from a large set of training data.
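what "curve fitting" means in its simplest form can be sketched with ordinary least squares for a straight line: the "model" is just the slope and intercept that best match the training data, and a "prediction" is a point on the fitted line. the data points and the function names fit_line and predict below are hypothetical, introduced only for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Evaluate the fitted line at a new input."""
    return slope * x + intercept


# Hypothetical training data: five (x, y) observations.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.0]

slope, intercept = fit_line(xs, ys)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(f"prediction for x=6: {predict(6, slope, intercept):.2f}")
```

everything the fitted line "knows" comes from the five training points; nothing in the procedure resembles understanding of what x and y stand for.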
in spite of the fact that humans learn differently, from very few examples and by making interconnections between different subject areas, experiences and new facts, statistical approaches and nns in ml are dominant in today's ai. s. mahadevan has put it nicely: "trying to do ml without knowing statistics is like trying to build engineering structures without physics." 36
32 https://twitter.com/ossia/status/1097804721295773696?lang=en
33 https://qr.ae/pn2r8b
34 https://qr.ae/pnsnq3
35 https://www.quora.com/when-will-ai-go-beyond-curve-fitting
in contrast, symbolic ai – by far less popular today than in the past – is often called gofai: good old-fashioned ai. it is important to understand that gofai, in particular its knowledge representation and reasoning approaches, is not dismissed. not at all. these approaches bring a declarative way of specifying how things should be conducted, strong formalisms of logical reasoning, and also the power of generating explanations. these features can be nicely combined with statistical approaches; for instance, symbolic approaches can bring rigor to defining ml pipelines and what exactly they should learn using statistics. in other words, while statistical approaches can process very large, complex data sets, cognitive approaches coming from symbolic ai, like reasoning and problem-solving, can bring more human-like flair to ai in order to use ai to its currently possible full potential. ml/statistical algorithms alone cannot do it; ironically, even some statisticians call ml algorithms "very, very stupid". 37 on the other hand, statistical approaches in areas like image recognition and nlp are essential today. it is important to always remember that both statistical and symbolic approaches have their pros and cons.
note, however, that although much of ml is built on statistics, there is an important difference in approaches between the two: classical statistics always starts from a hypothesis to test, even before the data is collected; ml first collects huge datasets and then applies exploratory statistical analysis in the hope of discovering some patterns in data and then using them as the model for making predictions. 38
it is up to ai course designers at universities to make the role of statistics in ai clear. unfortunately, it is not always so. in an eden webinar from november 2019 on ai in higher education [65], complaints were raised about courses that have the label "ai" in the title, but are essentially just statistics.
6.3.2. industry perspective
a google search for "best careers for 2020 and beyond", "best it career paths for the next decade", "most in-demand it jobs" and the like shows controversial results 39. a number of websites ranking such careers do not mention ai and its subareas at all. the "closest" jobs they mention are those of mathematicians, statisticians, operations research analysts, business analysts, market research analysts, marketing specialists (if one assumes that these skills are applied in developing ml models to make analyses). some websites rank data analysts, data scientists and data engineers high. only two such websites explicitly rank ai architect and robotics engineer high.
a similar search on indeed.com 40, driven by queries like "ai", "ml", "ai engineer", "ml engineer", "robotics engineer" and the like, has vaguely reflected the bar graph shown in fig. 2.
36 https://qr.ae/pnnjuc
37 https://qr.ae/tqunti
38 https://qr.ae/pnkfxd
39 as of aug. 2020. only the first few dozen hits have been surveyed.
40 https://www.indeed.com/, a popular job announcement and search website.
however, the "software engineer" query had the number of hits higher by an order
of magnitude than the one for "ai engineer" 41. indeed's list of 25 best jobs for 2020 42 includes neither ai nor ml explicitly ("data scientist" is at no. 8, "data engineer" at no. 12). related job descriptions reveal the usual ai ≡ ml misconception mentioned in section 3, as well as a frequent vagueness in postings ("using various techniques, models and algorithms to solve ai problems", "applying multiple skills, functional and technical, on ai problems", "building prototypes of ai applications", …). however, "strong statistical and math background", "programming experience (java, c/c++, python, ruby...)", "mathematical and statistical programming experience (r, sas, spss, python...)" and the like are very frequent accompanying elements in these job announcements as well. in other words, there is much greater demand for job applicants with programming skills and knowledge of statistics than for "pure" ai specialists.
a forum discussion about which undergraduate computer science courses an aspiring ml engineer should take 43 lists in the answers ai, ml, probability, statistics, linear algebra, data science, algorithms, and theory of computation, augmented with an introductory course in psychology. although psychology might look to some as an "outlier" in this list, it actually helps aspiring ml engineers develop a set of skills different from the "core" ones – ai, ml, math, statistics – but also very important in practical work. when ml engineers do not have a good knowledge of the data they have to work with, they have to familiarize themselves with it. in practice, it means attending meetings with the clients and putting a lot of effort into clarifying every single attribute in a dataset.
all these observations should be put in the perspective of expectations from both the industry and the job applicants.
actually, many companies expect job applicants to do a lot of data analytics and statistics, rather than dl modeling that is used more frequently in academia 44. likewise, most ml modeling in industry will be traditional modeling, starting from relational databases, not dnns and the like. in addition, due to companies' expectations, many positions that include ml tasks also comprise programming and software engineering. this often contradicts expectations of job applicants – although all ml includes some programming, it is very different from the programming associated with application development.
also, most companies use cheap and abundant hardware, which means that the "more data" approach also incurs longer times to train models. not understanding this important fact and expecting any ml model training to run fast without investing in expensive equipment is a serious misconception. an increasingly applied strategy to alleviate this problem is to subscribe to cloud-based tools such as automl 45, where training ml models relies on powerful external hardware and software. with tools like that, ml engineers can automate much of the model training, experimentation, fitting and evaluation, getting high-accuracy predictions, but cannot eliminate programming associated with the demanding tasks that precede model building in the ml pipeline – data collection, cleaning and wrangling.
41 this is probably no wonder at all; in the words of m. taylor, "machine learning is a small part of most projects, and a lot of companies are not going to want to employ a specialist, they are going to expect their software developers to do the job." (https://qr.ae/pnkm7b)
42 https://www.indeed.com/lead/best-jobs-2020; as of feb. 2020.
43 https://qr.ae/pn2tmb
44 note that there are also different opinions, e.g.
https://qr.ae/pnkkru
45 https://cloud.google.com/automl
from the perspective of an individual company, the workplace roles, the jobs assigned to them and the entire set of business processes and culture should be all tuned well, in order to create new values and make profit. this leaves some room for structured planning and decision-making. a simple tool to use in this process can be a 2×2 matrix with 4 quadrants, defined along the horizontal time-to-learn and vertical utility axes [66]. the quadrants defined this way include learn (high utility, low time-to-learn – the skills and roles that add value for the company quickly), plan (high utility, high time-to-learn – the skills to be acquired only if they are really worth the investment), browse (low utility, low time-to-learn – easy to acquire skills, so stay aware in case their utility increases) and ignore (low utility, high time-to-learn – the company does not have the time for these skills). with this tool, an ai company can simply list the skills it needs (e.g., ml modeling, statistics, data engineering, data collection and wrangling, etc.) and map them onto the four quadrants. the company then typically focuses on the learn quadrant and defines the job roles and positions in a rather straightforward way.
6.3.3. ml engineering and data engineering perspectives
there is some difference between ml engineers and data engineers [67]. ml engineers use programming languages to collect data, clean it, wrangle with it, build and tune ml models and consider alternatives. the languages they typically use include sql, python and r.
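as an aside to the 2×2 utility / time-to-learn matrix from section 6.3.2, the mapping of skills onto its four quadrants can be sketched in a few lines. this is only an illustration under stated assumptions: scores are taken to lie in [0, 1], the 0.5 threshold separating "low" from "high" is arbitrary, and all scores and the two placeholder skills are hypothetical; [66] does not prescribe any of these.

```python
from collections import defaultdict


def quadrant(utility: float, time_to_learn: float, threshold: float = 0.5) -> str:
    """Map a skill to one of the four quadrants of the 2x2 matrix.

    Scores are assumed to lie in [0, 1]; the threshold is an assumption.
    """
    high_utility = utility >= threshold
    slow_to_learn = time_to_learn >= threshold
    if high_utility:
        return "plan" if slow_to_learn else "learn"
    return "ignore" if slow_to_learn else "browse"


# (utility, time_to_learn) pairs; all values are hypothetical placeholders.
skills = {
    "statistics": (0.9, 0.3),
    "ml modeling": (0.8, 0.7),
    "hypothetical skill c": (0.2, 0.1),
    "hypothetical skill d": (0.3, 0.9),
}

by_quadrant = defaultdict(list)
for name, (utility, time_to_learn) in skills.items():
    by_quadrant[quadrant(utility, time_to_learn)].append(name)

for label in ("learn", "plan", "browse", "ignore"):
    print(label, by_quadrant[label])
```

in practice the scores would come from the company's own assessment, and the learn quadrant is the one that drives the definition of job roles.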
one of the most important and creative activities of ml engineers is feature engineering – what often differentiates successful ml projects from those that fail is deriving new, useful input features from existing ones. data engineers take care of various data sources, formats, storage 46, infrastructure, scaling and security, and, very importantly, integrating them in applications to make predictions – for example, deploying them in the cloud as microservices [68]. experience and skills in data etl (extract, transform, load) 47 are essential for data engineers, and so is sql.
these two (often intertwined) job roles make much of "what it really looks like" to work in the area of ml in a company 48, and are largely different from ml research [69]. note also that many use the term "data scientist" to encompass ml engineer, data engineer and business analyst roles. this often obscures the real nature of the work done by ml engineers, and some even call this term mislabeling. 49, 50 as already mentioned, most of the real work of ml engineers is related to programming. ml model building and tuning takes up to 10-15% of their time (whereas data cleansing and wrangling are about 80% of the job). they work mostly on regression and classification problems, much less on dl problems, and their good command of descriptive statistics is understood.
to some, it comes as a surprise that there are usually no entry-level positions for ml engineers and data engineers. 51 but it stops being a surprise when one remembers that, for instance, the ml engineer role assumes knowledge of ai and statistics and a long list of programming and other technical skills. it's a similar case with the data engineer role.
46 https://qr.ae/pnrddd
47 https://qr.ae/pnkf7p
48 https://qr.ae/pn2ydq
49 https://qr.ae/pnypal
50 https://qr.ae/pnkkru
51 https://qr.ae/pn2nuv
6.3.4. strategic perspective
no understanding of the current state of affairs in ai can be complete without at least briefly taking into account a more global, strategic perspective. to this end, the current view is that the strategic leaders in ai are just 9 big companies from china and the us [70]: alibaba (china), amazon (us), apple (us), baidu (china), facebook (us), google (us), ibm (us), microsoft (us) and tencent (china). amy webb, the author of the book [70], specifies: "these companies that are building the frameworks, the custom silicon, it's their algorithms, it's their patents. they have the lion's share of patents in this space. they're able to attract the top talent. they have the best partnerships with the best universities. it's these nine companies who are building the rules, systems and business models for the future of artificial intelligence. as a result of that, they have a pretty significant influence on the future of work in everyday life." 52 however, there is a big difference in how these companies work: those from the usa are private companies, commercially oriented and with responsibility primarily to their shareholders; those from china, on the other hand, are independent but have to follow the leadership of the government. but in both cases, it is a relatively small group of people that make decisions, and the process is not very transparent.
application-wise, in the usa it is microsoft that is the leader in defense ai, and amazon also has a number of contracts with the government related to ai development. google has pulled out of the defense applications and has focused more on transportation, healthcare and consumer services.
when it comes to dl applications, it is nvidia corporation that manufactures the gpus that power self-driving vehicles, cloud computing and so on, deep instinct is the leader in dl-based cybersecurity, and microsoft's cloud computing service, azure, can run complex dl-driven tools for medical imaging, robotics, nlp etc. in china, ai in transportation has reached an extremely impressive level, and intelligent service robots and drones, neural network chips, and intelligent manufacturing are also among the ai development priorities identified by the chinese ministry of industry and information technology.
6.4. fear of ai vs. benefits of ai
the rapid development of ai and the ai hype have created fear in many people, who seem to believe in the dark predictions mentioned in section 6.2. in a nutshell, the fear is that once intentions, thoughts, human-like behavior and other features of intelligence are coded into programs, machines will become very hard to control and will become inherently dangerous. on the way to this singularity, massive unemployment is almost in sight, in spite of the lack of evidence ([63], [64]) that this is the case.
another concern is that the massive data being collected about everything, everywhere, every minute can become a downright threat to privacy and can endanger society by putting control over too many things into the hands of governments or other small groups of people. for instance, it has been reported that in china the government has installed over 200 million surveillance cameras connected with a powerful face-recognition dl system [71]. as a result, each person captured on any of these cameras can be identified and an activity profile is then created for that person.
given the population of china, the technology behind this system is certainly mind-blowing, but the concern is that such an activity profile is then fed into an ai-powered social credit system, meaning that for each person the government calculates a credit score/rating. those with high scores enjoy benefits in e.g., online purchases, restaurants, hotels and while traveling; those with low scores don't. sure, companies like facebook and google are collecting data about their users and are creating their profiles as well, and it is not clear how they are using these profiles.
52 https://www.forbes.com/sites/joemckendrick/2019/04/10/nine-companies-are-shaping-the-future-of-artificial-intelligence/#336612632cf1
a lot of discomfort has also been created by recent research at mit, where a dl system called norman 53 has been trained using highly negatively biased data [72]. as a result, images classified in a neutral way by a standard dl image recognition system have been classified by norman in a scary way. this has raised many concerns, like: "imagine ai that denies someone a loan because of their gender. imagine ai that classifies someone as a criminal because of racial prejudice. what's the scariest part of artificial intelligence? how similar it is to us." 54 others have rushed to respond quickly, e.g. "there is no reason to give ai control over goals.
there is only gain to be had in giving it control over means… no tool is designed to take over the goals of what it should be used for. tools don't have their own motives." 55 they all pull up many examples of "good ai", such as those surveyed in section 1, and their major counter-argument is summarized as "sometimes those goals, as decided by humans, are dangerous to other humans. but that's not out of control. that's just in the control of a dangerous human." 53
the largely debated issue that many people will be left jobless and without purpose due to ai-powered automation of many jobs has its reasons. truck drivers, factory workers, retail and food service assistants are not the only ones to be scared in this respect, although their jobs are usually the first ones mentioned in the debates. stock trading, legal analysis, as well as robotic surgery and medical diagnosis, treatment and care, are often quoted as highly skilled professions where ai will replace humans.
more optimistic views see ai and data revolution as incentives to transform business processes and job roles. the ai assistant metaphor is their stronghold – they see ai-driven machines not as competitors for human jobs, but as companions that will do work that they can do better, and will simultaneously let humans focus on things unique to them, such as building relationships, making decisions in complex situations, showing empathy and the like. as g. warner has nicely put it: "which would you rather have: 1) a human doctor; 2) an ai doctor; or 3) a human doctor using ai?" 56 some jobs will certainly cease to exist due to further development of ai – as has been the case due to different kinds of automation throughout the history of mankind – but some new ones will be created.
in general, many jobs that entail creativity, social interactions, general knowledge, emotional and social intelligence, as well as manual dexterity will thrive; for example, change management specialists, human-computer interaction developers, ml infrastructure maintainers, data curation workers, mental health professionals, etc. an almost "classical" related question is "will ai replace programmers?" m. fouts' answer, not without irony, is: "every 10 years from 1960 to 1990 at least one major prediction by a prominent ai researcher was 'ai will make programmers obsolete in (8-)10 years'. 1960 was 60 years ago and no programmer has ever been replaced by the use of ai software. nobody has made that prediction since 2000, as far as i know. if ai is ever able to replace programmers, it won't be this century." 57
53 http://norman-ai.mit.edu/
54 https://qr.ae/pn2kgb
55 https://qr.ae/txsb4x
56 https://qr.ae/pn2knb
in debates on ai pro et contra, there is also a group of people who tend to be neither pessimists nor optimists, but cautious and more realistic, i.e. to see things from multiple perspectives. here's a comment coming from that party, in this case with regard to the recently developed gpt-3 natural language generator: "a tool like this has many new uses, both good (from powering better chatbots to helping people code) and bad (from powering better misinformation bots to helping kids cheat on their homework)." [54].
developing ai that brings benefits to the society is also a concern of governments and political institutions. for instance, the european commission has published a strategic document on development of ai for the benefit of the citizens of the eu [73].
the document addresses many opportunities and challenges of ai, but also ―a number of potential risks, such as opaque decision-making, gender-based or other kinds of discrimination, intrusion in our private lives or being used for criminal purposes.‖ the guidelines on development of ethical and trustworthy ai [74] have been a precursor to [73]; these guidelines have established a framework for achieving trustworthy ai. the framework has set ethical principles and values for developing ai in europe, with the idea to foster development of ethical and robust ai. here ―robust‖ refers to the fact that ai systems can cause unintentional harm, so both technical and social robustness should be addressed when developing an ai system. 6.5. artificial general intelligence artificial general intelligence (agi), also sometimes called general artificial intelligence (gai), has recently proliferated as more-or-less a synonym for strong ai and is used interchangeably with it, as well as with true ai, general ai and real ai (rai). conceptually, it is a close approximation of the concept of ai as it was originally envisioned in mid 1950s – the technology that would be able to do anything that human intelligence can, without human intervention. 58 intensive recent discussions about agi and if it is achievable are largely a side effect of the ai hype. critics of current ai notice that it is designed only to perform specific tasks, like image recognition and chess playing, tasks that are essentially based on mathematical logic. fed by huge amounts of data and by pre-programed algorithms, and in some cases equipped by powerful sensory systems (e.g., modern robots and self-driving vehicles), in most mundane applications they do perform well. but if agi tasks are set as objectives, current approaches simply hit the wall. an agi system should also be free of any bias in its behavior, reasoning and actions. 
this is inherently impossible, if only because human designers are biased in many ways (attitudes, objectives, culture and the like) 59. for instance, chinese and us ai developers would typically have different views of the ai objectives and purpose. likewise, agi is envisioned as observer-independent – also impossible with current technology – whereas current ai is observer-dependent. 60 for example, since human intelligent behavior is typically inseparable from emotions, it is highly unlikely that supporters of animal shelters will react to stray dogs the same way as people who have been bitten by such dogs. last but not least, an essential feature of agi would be the ability to generalize and then make small variations of the generalized concept or behavior; current ai cannot do it, in spite of some attempts to provide formalisms to do it (e.g., based on description logics [75]). “throwing larger data sets at faster computers only works for a handful of problems and doesn’t work very well at that… but none of these performances have resulted in a general method that works. instead, so called data scientists carefully tune data sets used for training, ai companies are caught having humans do what they claim their ai software is doing, and progress has ground nearly to a halt.” 61

57 https://qr.ae/tl03vw
58 from the mid-1950s, ai originally developed that way for approximately 2 decades, before the statistical approach was initiated in the field.
59 https://qr.ae/pnsn6g

naturally, speculations on the feasibility of agi have also revived the likewise speculative idea of rai [37] and have even led to its elaboration into concepts such as super intelligence, artificial super intelligence (asi), universal data intelligence framework and the like. 62 but perhaps more importantly, they have also raised speculations about another ai winter.
there have been two major ai winters in the past (in early 1970s and late 1980s / early 1990s). they have resulted from ai hypes that have preceded them, over-inflated buzz created by popular media and unrealistic promises made by companies and developers. these, in turn, have created extremely high expectations from industry and potential endusers, which have eventually failed to become a reality and have led to the bubble burst effect. some base their speculations about another ai winter at sight on making analogies with the previous two. others 63 also look at the gartner hype cycle for ai 2019 [56], as mentioned in sections 4 and 6.2. both of these parties express disappointment in current ai not producing commercial results. the hangover is even more obvious from the sheer reality that impressive results in dl and nlp typically come from costly hardware required to train the models with massive data 61 [63]. this especially hits startups, which are beginning to realize that the magic label ―ai‖ alone is not enough to create a roi. even big players like google, microsoft and openai are beginning to show signs of slowing down the innovation, 64 since most of their huge ml models still keep mapping input to output, without any reasoning or building world models that agi supporters demand. in summary, agi still remains a myth. 6.6. challenges still, although the hype seems to be declining, there are other opportunities and reasonable funding, and there are also intriguing challenges. some of them are indicated in the innovation trigger / on the rise section at the same gartner hype cycle for ai 2019 that shows the slight decline of interest in nlp, dl and computer vision [56]. interestingly, agi is there, but it is predicted to take more than 10 years before it becomes a reality. 
other notable ai technologies on the rise include, e.g.:
• decision intelligence. it is about how to apply ml in organizational decision-making in order to initiate actions with beneficial outcomes. it also applies visualization to help decision-makers quickly grasp cause and effect chains [76].
• neuromorphic hardware. in this special-purpose hardware, the behavior of neurons in the human brain is emulated directly in hardware, enabling exceptional and energy-efficient performance during the training of dnns. 65
• ai developer kits. this term denotes a set of technologies for straightforward building of ai applications for mobile devices, as well as in the form of web services. 66
• ai paas (ai platform as a service). platforms accessible as services for ml developers through a web-based interface enable developers to build models, use models developed by others, and enjoy model up- and down-scaling as needed. 67
• edge ai. much of data preprocessing and initial ml can be done by devices used to collect data (e.g., smart speakers), prior to sending data to more powerful computers and servers for further analysis. 68
• explainable ai (xai). in contrast to today’s black-box nature of ml, where often even the system designers cannot explain why the model has predicted a specific output, xai develops with the idea to make the output of an ai system understood by humans [77].
• …

60 https://qr.ae/pn2kxe
61 https://qr.ae/tcvcp4
62 https://shorturl.at/pefl9
63 https://qr.ae/tstw09
64 https://qr.ae/tswzt4

in addition to these practical development challenges, there is also a number of theoretical challenges that ai still has to take on along its path of further expansion. for example, classical questions still without a good theoretical answer are: what exactly is happening inside a nn that makes it possible to train it to recognize images, voices, and so on?
why do dl algorithms work? similarly, how can one infer a suitable number of layers and nodes in a nn? it is still largely a matter of trial and error; there is no theory about it. likewise, what is the real nature of human vision and can one build a computer vision system based on it, unlike building dl-based image recognition systems where a change of only one pixel can lead to misclassification of the entire image? along the same lines, can ml work correctly without cleaning noisy data first? the human brain can. in nlp, how to enable semantic understanding of text? further on, 69 instead of just more-or-less accurately mapping a dnn input to output using some (often complicated) transfer function, is it possible to make the network infer some causal knowledge that connects the two? can a dnn be trained to learn multiple tasks simultaneously? can it be trained to self-improve over time, possibly in multiple phases, like in the developmental psychology of humans? ultimately, can it be trained to become self-aware?

65 https://www.iis.fraunhofer.de/en/ff/kom/ai/neuromorphic.html
66 https://www.colocationamerica.com/blog/ai-development-tools
67 https://geekflare.com/machine-learning-paas/
68 https://www.digikey.com/en/maker/projects/what-is-edge-ai-machine-learning-iot/4f655838138941138aaad62c170827af
69 https://qr.ae/pnkdrs

these last questions can be tackled in multiple ways. at google, researchers have tried to make an ai system evolve on its own, in terms of automatically discovering complete ml algorithms just using basic mathematical operations as building blocks [78]. although preliminary results look modest – their evolutionary approach has enabled the system to discover two-layer neural networks trained by backpropagation – it is still extremely promising for at least two reasons. the first one is the vastness of the search space. while their work has just scratched the surface, it is quite possible that the approach can help discover yet unknown nn algorithms and topologies. the second reason is of at least equal importance: this approach significantly reduces human bias due to a generic search space. another group of researchers has made initial progress in developing nns good for modeling and learning continuous processes (unlike all other nns, including dnns, that can model only discrete things, i.e. nothing that transforms continuously over time) [79]. these new nns are called ode networks, for ordinary differential equations that parameterize the continuous dynamics of hidden units specified by a neural network. with other nns, the way training is typically conducted is specifying the number of layers in advance, running the training and then finding how accurate the network is. in contrast, with an ode network one specifies the target accuracy first, based on which the network configures and trains itself in the most efficient way until it achieves the pre-specified accuracy. the ode approach also features high memory efficiency. the drawback is that, unlike with other nns, one cannot tell in the beginning of training how long it will take for an ode network.

7. conclusions?

this is another intentional question mark in a subheading. it is difficult to derive any definite conclusions about ai as a field today, since the only common denominator of so many different views and phenomena is – controversy. there is still no single, widely adopted and solid definition of what ai is.
this is not a surprise, given the fact that there are still a lot of disagreements on what human intelligence is. in spite of that, there seems to be a good deal of agreement about the differences between weak ai and strong ai (agi), fig. 5. still, due to the ai effect, many research results that initially take on the lure of ai lose that lure over time and become “just technology”. part of the explanation for that is the fact that virtually all ai today is essentially weak ai, without generalized human cognitive abilities, hence incapable of solving intelligent tasks without human intervention. it is quite possible that the ai effect will not stop until agi is achieved (if it ever happens). it might also happen that when agi is achieved the term “ai” will gradually become obsolete and just part of the history of computing. but until that happens, the reality looks very different. ai cannot do so many things that in the world of humans are taken for granted – e.g., there is still no robot that can implement the moves of an old lady drinking her coffee without spilling the coffee 70, and no dnn that can recognize the reasons behind a sudden change in a person’s mood. true, advances in technology have accelerated the capture of data and information, and the technology we call ml can usually efficiently analyze this data, build models, and make predictions. but it cannot explain the models and predictions it has made, not at all.

70 a brilliant example by a. kostic, given during an ai-related class at the u. of belgrade in 2017.

the volume and intensity of the ai hype have created a situation of overselling ai both in industry and in academia. many businesses declare that they are deploying and/or developing some ai; however, a recent survey has not confirmed it for about 40% of the sample.
the offer of ai, ml, dl and similar courses is abundant at universities and at boot camps, and is largely profitable because of people’s fear of missing out (despite the employers’ reserved opinion about the certificates from such courses). the prophecies of agi-coming-soon, which the general press is frequently throwing, only contribute to that fear. but few, very few realize some crucial misconceptions about ai, like the one that current ai systems still remain useful in narrow domains. the extreme view is that ai actually doesn’t exist. 71

fig. 5 a vision of ai

ai has largely become a metaphor for data-intensive technology. is it maybe a sign of a paradigm shift in the field? long ago, achieving human-level intelligence, or agi, was the objective of ai research; supporters of the agi idea believe that it should remain so. however, ai today seems to be obsessed with data, despite the fact that much of it achieves success only with static data or snapshots of data; but the problem is that data changes over time. time-series analysis is an approach to tackle this problem, but it is also a data-intensive approach. things like temporal reasoning, which were once among the hottest ai topics, seem to be forgotten. fortunately, in spite of so many controversies, research in the broad field of ai is not dead. researchers (and companies, like amazon, baidu, facebook, alibaba, openai and google) always detect and pursue interesting problems at different scales. they often fail to deliver results, but are not afraid to fail – curiosity always prevails over fear (although neither can be represented with current ai technology!). failures indicate the paths not to follow, thus they can still be of some value in the next step.

71 https://qr.ae/tiy96a

although nobody knows if and when agi will be achieved, brilliant entrepreneurs and researchers alike keep suggesting how to pursue it.
alan kay’s affirmative attitude about true ai is: “the history of learning how life works is ‘very suggestive’ that intelligence [can be based on] special organizations of parts that do not at all have to be intelligent into systems that manifest intelligence… from the practical standpoint, it is hard to imagine that solutions will not be more intelligent and reflective than human beings right from the get-go (we are actually terrible thinkers, given what thinking is all about).” 72 sridhar mahadevan seems to share that opinion: “intelligence emerges from the synergistic interaction of simple entities embedded in complex environments… in this view, we think of intelligence not as an ability innate to a creature, but as a composite of the interactions of the creature with its environment.” 73

72 https://qr.ae/pn27dk
73 https://qr.ae/pn2psz

references

[1] d. faggella, “everyday examples of artificial intelligence and machine learning – comprehensive overview,” woburn, ma, emerj artificial intelligence research, white paper, 2020. [2] t. stenovec, “google has gotten incredibly good at predicting traffic – here's how,” new york, ny, business insider, white paper, 2015. [3] d. richman, “uber’s machine learning chief says pattern-finding computing fuels ride-hailing giant,” seattle, wa, geekwire llc, 2016. [4] j. markoff, “planes without pilots,” new york, ny, new york times, 2015. [5] bi intelligence, “10 million self-driving cars will be on the road by 2020,” new york, ny, business insider, white paper, 2015. [6] a. prakash, “swarm robotics: new horizons in military research,” robotics business review, may 2018. [7] f. grimal and j. jae sundaram, “combat drones: hives, swarms, and autonomous action?,” j. of conflict & security law, vol. 23, no. 1, pp. 105–135, spring 2018. [8] l. huang et al. (oct. 2011). adversarial machine learning. presented at aisec'11: 4th acm workshop security and artificial intelligence, chicago, il. [online]. [9] w. knight, “military artificial intelligence can be easily and dangerously fooled,” mit technology review, oct. 2019. [10] n. mejia, “ai-based fraud detection in banking – current applications and trends,” woburn, ma, emerj artificial intelligence research, white paper, 2020. [11] p. marsden, “artificial intelligence defined: useful list of popular definitions from business and science,” white paper, 2017. [12] s.j. russell and p. norvig, artificial intelligence: a modern approach, third edition. boston, ma: pearson, 2016, chapter 1, pp. 1–5. [13] r.j. sternberg, “intelligence (entry),” in the oxford companion to the mind, 1st ed., r.l. gregory and o.l. zangwill, eds., new york, ny, usa: oxford univ. press, 1987, pp. 375–379. [14] s. legg and m. hutter, “a collection of definitions of intelligence,” in proceedings of the 2007 conference on advances in agi: concepts, architectures and algorithms: proc. of the agi workshop 2006, jun. 2007, pp. 17–24. [15] l.s. gottfredson, “mainstream science on intelligence: an editorial with 52 signatories, history, and bibliography,” intelligence, vol. 24, pp. 13–23, dec. 1997. [16] u. neisser et al., “intelligence: knowns and unknowns,” amer. psychologist, vol. 51, no. 2, 1996, pp. 77–101. [17] a. turing, “computing machinery and intelligence,” mind, vol. 59, no. 236, pp. 433–460, oct. 1950. [18] “turing test success marks milestone in computing history,” u. of reading press release, jun. 08, 2014. [19] w. knightley, “google duplex: does it pass the turing test?,” digital initiative, harvard business school, boston, ma, nov. 2018. [20] “robots or people: who’s gonna rule tomorrow?,” evergreen, kyiv, ukraine. [21] j.r. searle, “minds, brains, and programs,” behavioral and brain sci., vol. 3, no. 3, pp. 417–457, 1980. [22] s.e. fahlman, “how advanced is the most sophisticated example of ai?,”. [23] m. haenlein and a.
kaplan, ―a brief history of artificial intelligence: on the past, present, and future of artificial intelligence,‖ california management review, vol. 61, no. 4, pp. 5–14, aug. 2019. [24] k. bailey, ―reframing the ‗ai effect‘,‖ san francisco, ca, medium corp., 2016. [25] e. luders et al., ―neuroanatomical correlates of intelligence,‖ intelligence, vol. 37, no. 2, 2009, pp. 156–163. [26] a. nowogrodzki, ―the world‘s strongest mri machines are pushing human imaging to new limits,‖ nature, vol. 563, no. 7729, pp. 24–26, nov. 2018. [27] s.r. cox et al., ―structural brain imaging correlates of general intelligence in uk biobank,‖ intelligence, vol. 76, pp. sep-oct. 2019. [28] z. zheng et al., ―a complete electron microscopy volume of the brain of adult drosophila melanogaster,‖ cell, vol. 174, no. 3, pp. 730-743, jul 19, 2018. [29] l.r. grimm, ―psychology of knowledge representation,‖ wires cogn. sci., vol. 5, no. 3, pp. 261–270, may-jun. 2014. [30] s. mahadevan, ―how is knowledge representation carried out in the brain?,‖ [31] l. chang and d.y. tsao, ―the code for facial identity in the primate brain,‖ cell, vol. 169, no. 6, pp. 10131028, jun 2017. [32] leverhulme centre for the future of intelligence, ―the consciousness and intelligence project‖. [33] m. aydede and g. guzeldere, ―consciousness, intentionality and intelligence: some foundational issues for artificial intelligence,‖ j. of experim. & theor. ai, vol. 12, no. 3, pp. 263–277, nov. 2010. [34] v. vinge, ―the coming technological singularity: how to survive in the post-human era,‖ in vision21: interdisciplinary science and engineering in the era of cyberspace, g.a. landis, ed., nasa publication cp-10129, pp. 11–22, 1993. [35] i.j. good, ―speculations concerning the first ultraintelligent machine,‖ adv. in computers, vol. 6, pp. 31– 88, 1965. [36] r. kurzweil, the singularity is near. new york, ny: viking books, 2005. [37] k. 
persianov, ―which company do you think will be the first to create the singularity for artificial intelligence?‖ . [38] m. brenner, ―why intelligence might be simpler than we think – lessons from the neocortex,‖ san francisco, ca, medium corp., 2019. [39] r. kurzweil, how to create mind? new york, ny: viking books, 2005. [40] p. domingos, the master algorithm: how the quest for the ultimate learning machine will remake our world. new york, ny: basic books, 2015. [41] d. hofstadter, gödel, escher, bach: an eternal golden braid. new york, ny: basic books, 1979. [42] s. mahadevan, ―imagination machines: a new challenge for artificial intelligence,‖ palo alto, ca, aaai, 2018. [43] t. mitchell, machine learning. new york, ny: mcgraw hill, 1997. [44] i. goodfellow, y. bengio and a. courville, deep learning. cambridge, ma: mit press, 2016. [45] h. wang and b. raj, ―on the origin of deep learning,‖ arxiv:1702.07800, 2017. [46] a. géron, hands-on machine learning with scikit-learn, keras, and tensorflow: concepts, tools, and techniques to build intelligent systems, 2nd ed. boston, ma: o'reilly media, 2019. [47] i. goodfellow et al., ―generative adversarial networks,‖ in proceedings of the int. conf. neural inf. proc. sys. (nips 2014) 2014, pp. 2672–2680. [48] t. young et al., ―recent trends in deep learning based natural language processing,‖ ieee comp. intelligence mag., vol. 13, no. 3, pp. 55-75, aug. 2018. [49] h.a. pierson and m.s. gashler, ―deep learning in robotics: a review of recent research‖. [50] mc.ai, ―fundamentals of machine learning (ml), deep learning (dl) and artificial neural networks (ann),‖ mc.ai, dec. 11, 2019. [51] c. ramirez, ed., advances in knowledge representation. london, uk: intechopen limited, 2012. [52] m.k. bergman, a knowledge representation practionary: guidelines based on charles sanders peirce. new york, ny: springer, 2018. [53] v. 
flovik, “machine learning: from hype to real-world applications – how to utilize emerging technologies to drive business value,” san francisco, ca, medium corp., towardsdatascience, sep 16, 2019. [54] w.d. heaven, “openai’s new language generator gpt-3 is shockingly good – and completely mindless,” mit technology review, jul. 2020. [55] m. vollmer, “what is industry 5.0?,” sunnyvale, ca, linkedin, august 23, 2018. [56] l. columbus, “what's new in gartner's hype cycle for ai,” new york, ny, forbes newsletter group, sep 25, 2019. [57] l. kaiser et al., “one model to learn them all,” arxiv:1706.05137. [58] j.h. friedman, “greedy function approximation: a gradient boosting machine,” ann. statist. vol. 29, no. 5, pp. 1189–1232, 2001. [59] t. chen and c. guestrin, “xgboost: a scalable tree boosting system,” arxiv:1603.02754. [60] i.j. goodfellow, j. shlens and c. szegedy, “explaining and harnessing adversarial examples,” arxiv:1412.6572. [61] j. su, d.v. vargas and s. kouichi, “one-pixel attack for fooling deep neural networks,” arxiv:1710.08864. [62] a.l. yuille and c. liu, “limitations of deep learning for vision, and how we might fix them,” the gradient, 2019. [63] w. naudé, “ai’s current hype and hysteria could set the technology back by decades,” the conversation, jul. 24, 2019. [64] w. knight, “about 40% of europe’s ‘ai companies’ don’t use any ai at all,” mit technology review, mar. 2019. [65] eden network. artificial intelligence (ai) in higher education. (nov. 14, 2019). [66] c. littlewood, “prioritize which data skills your company needs with this 2×2 matrix,” harvard business rev., oct. 23, 2018. [67] m. west, acing the machine learning interview, in press. [68] c. kaiser, “stop making data scientists manage kubernetes clusters,” san francisco, ca, medium corp., 2019. [69] d. sculley et al., “hidden technical debt in machine learning systems,” corpus id: 17699480. accessed aug. 15, 2020. [70] a.
webb, the big nine: how the tech titans and their thinking machines could warp humanity. new york city, ny: publicaffairs, 2019. [71] p. mozur, ―inside china‘s dystopian dreams: a.i., shame and lots of cameras,‖ new york times, jul. 8, 2018. [72] g. kumar et al., ―scary dark side of artificial intelligence: a perilous contrivance to mankind,‖ humanities & soc. sci. rev., vol. 7, no. 5, pp. 1097-1103, 2019. [73] european commission, ―on artificial intelligence a european approach to excellence and trust,‖ brussels, com (2020) 65 final, feb. 19, 2020. white paper. [74] high-level expert group on artificial intelligence, ―ethics guidelines for trustworthy ai,‖ european commission, brussels, belgium. apr. 8, 2019. [75] a.r. divroodi et al., ―on the possibility of correct concept learning in description logics‖. vietnam j. comp. sci. vol. 5, no. 1, pp. 3–14, 2018. [76] c. byrne, ―why google defined a new discipline to help humans make decisions,‖ fastcompany, jul. 18, 2018. [77] e. tjoa and c. guan, ―a survey on explainable artificial intelligence (xai): towards medical xai,‖ arxiv:1907.07374, 2019. [78] e. real et al., ―automl-zero: evolving machine learning algorithms from scratch,‖ arxiv:2003.03384, 2020. [79] r.t.q. chen et al., ―neural ordinary differential equations,‖ arxiv:1806.07366, 2018. accessed: aug. 18, 2020. facta universitatis series: electronics and energetics vol. 30, no 1, march 2017, pp. 49 66 doi: 10.2298/fuee1701049a anas n. al-rabadi1,2 received november 28, 2015; received in revised form april 13, 2016 corresponding author: anas n. al-rabadi electrical engineering department, philadelphia university, jordan & computer engineering department, the university of jordan, amman-jordan (email: alrabadi@yahoo.com) facta universitatis series: electronics and energetics vol. 28, no 4, december 2015, pp. 
507 525 doi: 10.2298/fuee1504507s horizontal current bipolar transistor (hcbt) – a low-cost, high-performance flexible bicmos technology for rf communication applications tomislav suligoj1, marko koričić1, josip žilak1, hidenori mochizuki2, so-ichi morita2, katsumi shinomura2, hisaya imai2 1university of zagreb, faculty of electrical engineering and computing, department of electronics, microand nano-electronics laboratory, croatia 2asahi kasei microdevices co. 5-4960, nobeoka, miyazaki, 882-0031, japan abstract. in an overview of horizontal current bipolar transistor (hcbt) technology, the state-of-the-art integrated silicon bipolar transistors are described which exhibit ft and fmax of 51 ghz and 61 ghz and ftbvceo product of 173 ghzv that are among the highest-performance implanted-base, silicon bipolar transistors. hbct is integrated with cmos in a considerably lower-cost fabrication sequence as compared to standard vertical-current bipolar transistors with only 2 or 3 additional masks and fewer process steps. due to its specific structure, the charge sharing effect can be employed to increase bvceo without sacrificing ft and fmax. moreover, the electric field can be engineered just by manipulating the lithography masks achieving the high-voltage hcbts with breakdowns up to 36 v integrated in the same process flow with high-speed devices, i.e. at zero additional costs. double-balanced active mixer circuit is designed and fabricated in hcbt technology. the maximum iip3 of 17.7 dbm at mixer current of 9.2 ma and conversion gain of -5 db are achieved. key words: bicmos technology, bipolar transistors, horizontal current bipolar transistor, radio frequency integrated circuits, mixer, high-voltage bipolar transistors. 1. introduction in the highly competitive wireless communication markets, the rf circuits and systems are fabricated in the technologies that are very cost-sensitive. 
in order to minimize the fabrication costs, the sub-10 ghz applications can be processed by using the high-volume silicon technologies. it has been identified that the optimum solution might received march 9, 2015 corresponding author: tomislav suligoj university of zagreb, faculty of electrical engineering and computing, department of electronics, microand nano-electronics laboratory, croatia (e-mail: tom@zemris.fer.hr) an extended green sasao hierarchy of canonical ternary galois forms and universal logic modules 1electrical engineering department, philadelphia university, 2jordan & computer engineering department, the university of jordan, amman-jordan abstract. a new extended green-sasao hierarchy of families and forms with a new subfamily for multiple-valued reed-muller logic is introduced. recently, two families of binary canonical reed-muller forms, called inclusive forms (ifs) and generalized inclusive forms (gifs) have been proposed, where the second family was the first to include all minimum exclusive sum-of-products (esops). in this paper, we propose, analogously to the binary case, two general families of canonical ternary reed-muller forms, called ternary inclusive forms (tifs) and their generalization of ternary generalized inclusive forms (tgifs), where the second family includes minimum galois field sum-of-products (gfsops) over ternary galois field gf(3). one of the basic motivations in this work is the application of these tifs and tgifs to find the minimum gfsop for multiple-valued input-output functions within logic synthesis, where a gfsop minimizer based on if polarity can be used to minimize the multiple-valued gfsop expression for any given function. 
the realization of the presented shannon-davio (s/d) trees using universal logic modules (ulms) is also introduced, where ulms are complete systems that can implement all possible logic functions utilizing the corresponding s/d expansions of multiple-valued shannon and davio spectral transforms. key words: canonical forms, galois field forms, green-sasao hierarchy, inclusive forms, multiple-valued logic, shannon-davio trees, ternary logic, universal logic modules. an extended green sasao hierarchy of canonical ternary galois forms and universal logic modules anas n. al-rabadi ∗ electrical engineering department, philadelphia university, jordan & computer engineering department, the university of jordan, amman-jordan e-mail: alrabadi@yahoo.com abstract a new extended green-sasao hierarchy of families and forms with a new subfamily for multiple-valued reed-muller logic is introduced. recently, two families of binary canonical reed-muller forms, called inclusive forms (ifs) and generalized inclusive forms (gifs) have been proposed, where the second family was the first to include all minimum exclusive sum-of-products (esops). in this paper, we propose, analogously to the binary case, two general families of canonical ternary reed-muller forms, called ternary inclusive forms (tifs) and their generalization of ternary generalized inclusive forms (tgifs), where the second family includes minimum galois field sum-of-products (gfsops) over ternary galois field gf(3). one of the basic motivations in this work is the application of these tifs and tgifs to find the minimum gfsop for multiple-valued inputoutput functions within logic synthesis, where a gfsop minimizer based on if polarity can be used to minimize the multiple-valued gfsop expression for any given function. 
∗received: july 15, 2015

1 normal galois forms

reed-muller-like spectral transforms [1-18] have found a variety of useful applications in minimizing exclusive sum-of-products (esop) and galois field sop (gfsop) expressions, in the creation of new forms, binary and spectral decision diagrams, and regular structures, besides the well-known uses in digital communications, digital signal and image processing, and fault detection and testing [1-9, 12-14, 16, 17, 19, 20, 21]. the method of generating the new families of multiple-valued shannon and davio spectral transforms is based on the fundamental multiple-valued shannon and davio expansions. dyadic families of discrete transforms (reed-muller and green-sasao hierarchy, walsh, arithmetic, adding and haar transforms) and their generalizations to multiple-valued transforms have also found important applications in digital system design and optimization [1, 6, 19, 7-18, 20-31]. normal canonical forms play an important role in the synthesis of logic circuits, which includes synthesis, testing and optimization [1, 9, 12-15, 17, 21, 22, 26, 27, 31, 32].

table 1: number of product terms needed to realize some arithmetic functions using various expressions.

function  pprm  fprm  grm  esop  sop
adr4        34    34   34    31   75
log8       253   193  105    96  123
nrm4       216   185   96    69  120
rdm8        56    56   31    31   76
rot8       225   118   51    35   57
sym9       210   173  126    51   84
wgt8       107   107  107    58  255
one can observe that by going, for example, from the positive polarity reed-muller (pprm) form to the generalized reed-muller (grm) form, fewer constraints are imposed on the canonical forms, due to the enlarged set of polarities that one can choose from. this greater freedom in the polarity of the canonical expansions makes it possible to obtain exclusive-sum-of-product (esop) expressions with fewer terms and literals; consequently, expressing boolean functions in esop form produces, on average, smaller expressions than, for example, sum-of-product (sop) expressions. table 1 illustrates these observations [1].

the main algebraic structure used in this work for developing the canonical normal forms is the galois field (gf), which is a fundamental structure in the theory of algebra [1, 12, 13, 17, 22, 29, 33, 34]. the importance of gf for logic synthesis results from the fact that every finite field is isomorphic to a galois field. in general, the attractive properties of gf-based circuits, such as their high testability, are mainly due to the fact that the gf operators exhibit the cyclic group property, also known as the latin square property. in binary, for example, the gf(2) addition gate, the exor, has the cyclic group property. the cyclic group property can be explained, for example, using the three-valued (ternary) gf addition and multiplication operators shown in figures 1(a) and 1(b), respectively. note that in every row and every column of the addition table in figure 1(a) the elements are all different, which is the cyclic property, and that the elements appear in a different order in each row and column. another cyclic group can be observed in the multiplication table: if the zero elements are removed from the multiplication table in figure 1(b), then the remaining elements form a cyclic group.
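the latin square property described above can be checked mechanically. the following is a minimal python sketch, not taken from the paper: the helper names `gf_add`, `gf_mul` and `is_latin_square` are hypothetical, and gf(3) arithmetic is assumed to be ordinary integer arithmetic modulo 3, which holds for a prime radix.

```python
P = 3  # radix of the ternary galois field gf(3)


def gf_add(a, b):
    """addition over gf(3), i.e. integer addition modulo 3."""
    return (a + b) % P


def gf_mul(a, b):
    """multiplication over gf(3), i.e. integer multiplication modulo 3."""
    return (a * b) % P


def is_latin_square(table):
    """true if every row and every column contains each symbol exactly once."""
    n = len(table)
    symbols = set(table[0])
    rows_ok = all(len(row) == n and set(row) == symbols for row in table)
    cols_ok = all({table[r][c] for r in range(n)} == symbols for c in range(n))
    return len(symbols) == n and rows_ok and cols_ok


# full operator tables, indexed as table[a][b]
add_table = [[gf_add(a, b) for b in range(P)] for a in range(P)]
mul_table = [[gf_mul(a, b) for b in range(P)] for a in range(P)]

# multiplication table with the zero row and zero column removed
mul_nonzero = [[gf_mul(a, b) for b in range(1, P)] for a in range(1, P)]

print(add_table)
print(is_latin_square(add_table), is_latin_square(mul_table), is_latin_square(mul_nonzero))
```

in this sketch the addition table and the zero-free multiplication table pass the check, while the full multiplication table does not, because its zero row repeats the same element.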
  + | 0 1 2        ∗ | 0 1 2
  0 | 0 1 2        0 | 0 0 0
  1 | 1 2 0        1 | 0 1 2
  2 | 2 0 1        2 | 0 2 1
      (a)              (b)
figure 1: third radix galois field addition and multiplication tables: (a) addition and (b) multiplication.

reed-muller normal forms have been classified using the green-sasao hierarchy [1, 10, 12, 13, 17], where the green-sasao hierarchy of families of canonical forms and corresponding decision diagrams is based on three generic expansions: shannon, positive davio and negative davio expansions.
the two-valued shannon, positive davio and negative davio expansions are given as follows, respectively:

$$f(x_1,x_2,\ldots,x_n) = \bar{x}_1 \cdot f_0(x_1,x_2,\ldots,x_n) \oplus x_1 \cdot f_1(x_1,x_2,\ldots,x_n) = \begin{bmatrix} \bar{x}_1 & x_1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} f_0 \\ f_1 \end{bmatrix}, \tag{1}$$

$$f(x_1,x_2,\ldots,x_n) = 1 \cdot f_0(x_1,x_2,\ldots,x_n) \oplus x_1 \cdot f_2(x_1,x_2,\ldots,x_n) = \begin{bmatrix} 1 & x_1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} f_0 \\ f_1 \end{bmatrix}, \tag{2}$$

$$f(x_1,x_2,\ldots,x_n) = 1 \cdot f_1(x_1,x_2,\ldots,x_n) \oplus \bar{x}_1 \cdot f_2(x_1,x_2,\ldots,x_n) = \begin{bmatrix} 1 & \bar{x}_1 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} f_0 \\ f_1 \end{bmatrix}, \tag{3}$$

where $f_0(x_1,x_2,\ldots,x_n) = f(0,x_2,\ldots,x_n) = f_0$ is the negative cofactor of variable $x_1$, $f_1(x_1,x_2,\ldots,x_n) = f(1,x_2,\ldots,x_n) = f_1$ is the positive cofactor of variable $x_1$, and $f_2(x_1,x_2,\ldots,x_n) = f(0,x_2,\ldots,x_n) \oplus f(1,x_2,\ldots,x_n) = f_0 \oplus f_1$. in addition, an arbitrary n-variable function $f(x_1,\ldots,x_n)$ can be represented using the pprm expansion as [2, 31]:

$$f(x_1,x_2,\ldots,x_n) = a_0 \oplus a_1 x_1 \oplus \cdots \oplus a_n x_n \oplus a_{12} x_1 x_2 \oplus a_{13} x_1 x_3 \oplus \cdots \oplus a_{n-1,n} x_{n-1} x_n \oplus \cdots \oplus a_{12 \ldots n} x_1 x_2 \cdots x_n. \tag{4}$$

for each function f, the coefficients $a_i$ in equation (4) are determined uniquely, so pprm is a canonical form. if, for each variable in equation (4), we use either only the positive literal or only the negative literal, we obtain the fixed polarity reed-muller (fprm) form. a good selection of different permutations of shannon and davio expansions (like other expansions, such as walsh and arithmetic expansions) as internal nodes in decision trees (dts) and decision diagrams (dds) will result in dts and dds that represent the corresponding logic functions with smaller sizes, in terms of the total number of hierarchical levels used and the total number of internal nodes needed.
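equations (1) to (4) can be exercised on a small function. the following python sketch is illustrative only and is not the paper's procedure: the function names, the three-variable majority function used as a test case, and the truth-table indexing (bit 0 of the index is x1, bit 1 is x2, bit 2 is x3) are all assumptions made here. the pprm coefficients are obtained by applying the positive davio matrix of equation (2) to every variable in turn.

```python
from itertools import product


def cofactors(f):
    """return (f0, f1, f2) with respect to the first variable, where f2 = f0 xor f1."""
    def f0(*rest):
        return f(0, *rest)

    def f1(*rest):
        return f(1, *rest)

    def f2(*rest):
        return f0(*rest) ^ f1(*rest)

    return f0, f1, f2


def shannon(f, x1, *rest):
    """equation (1): not(x1) * f0 xor x1 * f1."""
    f0, f1, _ = cofactors(f)
    return ((1 ^ x1) & f0(*rest)) ^ (x1 & f1(*rest))


def positive_davio(f, x1, *rest):
    """equation (2): 1 * f0 xor x1 * f2."""
    f0, _, f2 = cofactors(f)
    return f0(*rest) ^ (x1 & f2(*rest))


def negative_davio(f, x1, *rest):
    """equation (3): 1 * f1 xor not(x1) * f2."""
    _, f1, f2 = cofactors(f)
    return f1(*rest) ^ ((1 ^ x1) & f2(*rest))


def pprm_coefficients(truth_table):
    """coefficients of equation (4); entry m belongs to the product of the
    variables whose bits are set in m (assumed indexing, see lead-in)."""
    coeffs = list(truth_table)
    n = len(coeffs).bit_length() - 1
    for i in range(n):
        step = 1 << i
        for idx in range(len(coeffs)):
            if idx & step:
                # positive davio step: replace f1 by f0 xor f1
                coeffs[idx] ^= coeffs[idx ^ step]
    return coeffs


def majority(x1, x2, x3):
    """hypothetical test function: 1 when at least two inputs are 1."""
    return (x1 & x2) | (x1 & x3) | (x2 & x3)


for bits in product((0, 1), repeat=3):
    expected = majority(*bits)
    assert shannon(majority, *bits) == expected
    assert positive_davio(majority, *bits) == expected
    assert negative_davio(majority, *bits) == expected

truth_table = [majority(m & 1, (m >> 1) & 1, (m >> 2) & 1) for m in range(8)]
coefficients = pprm_coefficients(truth_table)
print(coefficients)
```

under the assumed indexing, the nonzero coefficients for the majority function sit at indices 3, 5 and 6, which corresponds to the pprm form x1x2 ⊕ x1x3 ⊕ x2x3.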
in general, a literal can be defined as any function of a single variable. basis functions in the general case of multiple-valued expansions are constructed using these literals. galois field sum-of-products expansions can be performed using a variety of literal types. for example, one can use the k-reduced post literal (k-rpl) to produce the k-rpl gfsop, the generalized (post) literal (gl) to produce the gl gfsop, and the universal literal (ul) to produce the ul gfsop.

example 1. figure 2 demonstrates several literal types, where one proceeds from the simplest literal in figure 2(a) (i.e., rpl) to the most complex universal literal in figure 2(c). for the rpl in figure 2(a), a value k is produced by the literal when the value of the variable is equal to a specific state; in this particular example, a value of k = 1 is generated by the 1-rpl when the value of variable x is equal to a certain state, which here equals one. the gl in figure 2(b) produces a value of radix for a set of distinct states. one notes that, in contrast to the other literals, the ul in figure 2(c) can have any value of the logic system at distinct states, and thus the ul has the highest complexity among the different types of literals.
[figure 2: three panels with both axes marked 0 to 4 and horizontal axis x; panel labels: (a) 1-rpl of x, (b) literal of x with state set {1,3}, (c) literal of x with value vector <2,0,4,3,1>]
figure 2: an illustrative example of the different types of literals over an arbitrary five-radix logic: (a) 1-reduced post literal (1-rpl), (b) generalized (post) literal (gl), and (c) universal literal (ul).

since the k-rpl gfsop is simpler from an implementation point of view than gl or ul, we will perform all the gfsop expansions utilizing the 1-rpl gfsop. let us define the 1-rpl [1, 17] as:

$${}^{i}x = 1 \ \text{iff} \ x = i, \ \text{else} \ {}^{i}x = 0. \tag{5}$$

for example, $\{{}^{0}x, {}^{1}x, {}^{2}x\}$ are the zero, first, and second polarities of the 1-rpl, respectively. also, let us define the ternary shifts over variable x as {x, x′, x′′}, the zero, first and second shifts of the variable x respectively, i.e., x = x + 0, x′ = x + 1 and x′′ = x + 2, where x can take any value in the set {0,1,2}. we chose to represent the 1-reduced post literals in terms of shifts and powers, among others, because of the ease of implementing powers of shifted variables in hardware, as will be seen in section 3 within universal logic modules (ulms) for the production of rpl. the fundamental shannon expansion over gf(3) for a ternary function with a single variable is shown in theorem 1.

theorem 1. shannon expansion over gf(3) for a function with a single variable is:

$$f = {}^{0}x \, f_0 + {}^{1}x \, f_1 + {}^{2}x \, f_2, \tag{6}$$

where $f_0$ is the cofactor of f with respect to variable x of value 0, $f_1$ is the cofactor of f with respect to variable x of value 1, and $f_2$ is the cofactor of f with respect to variable x of value 2.
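equations (5) and (6) can be checked exhaustively, since there are only 27 ternary functions of one variable. the python sketch below is illustrative and not the paper's implementation: the names `rpl`, `shift`, `rpl_from_shift` and `shannon_gf3` are hypothetical, and gf(3) arithmetic is assumed to be integer arithmetic modulo 3. the shift-and-power identity used in `rpl_from_shift` is one possible way to write a 1-rpl (it follows from y² = 1 for every nonzero y in gf(3)) and is not necessarily the form used in section 3.

```python
from itertools import product

P = 3  # radix of gf(3)


def rpl(i, x):
    """1-reduced post literal of equation (5): 1 iff x == i, else 0."""
    return 1 if x == i else 0


def shift(x, k):
    """k-th ternary shift of x: x + k over gf(3)."""
    return (x + k) % P


def rpl_from_shift(i, x):
    """assumed identity: 1 + 2 * y**2 over gf(3), where y is x shifted by -i.
    the result is 1 only when y == 0, i.e. when x == i."""
    y = shift(x, (P - i) % P)
    return (1 + 2 * y * y) % P


def shannon_gf3(values, x):
    """equation (6) for a single-variable function given by its value vector
    (values[i] is the cofactor f_i, a constant in the single-variable case)."""
    total = 0
    for i in range(P):
        total = (total + rpl(i, x) * values[i]) % P
    return total


# the shift-and-power form agrees with the definition for every polarity and state
for i, x in product(range(P), repeat=2):
    assert rpl_from_shift(i, x) == rpl(i, x)

# the shannon expansion reproduces every ternary function of one variable
all_functions = list(product(range(P), repeat=P))
for values in all_functions:
    for x in range(P):
        assert shannon_gf3(values, x) == values[x]

print(len(all_functions))
```

the design choice here is to represent a single-variable function by its value vector, so that each cofactor in equation (6) is simply one entry of that vector.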
figure 2 demonstrates several literal types, where one proceeds from the simplest literal in figure 2(a) (i.e., rpl) to the most complex universal literal in figure 2(c). for rpl in figure 2(a), a value k is produced by the literal when the value of the variable is equal to a specific state, and in this particular example a value of k = 1 is generated by the 1-rpl when the value of variable x is equal to certain state where this state here equals to one. the gl in figure 2(b) produces a value of radix for a set of distinct states. one notes that, in contrast to the other literals, ul in figure 2(c) can have any value of the logic system at distinct states, and thus ul has the highest complexity among the different types of literals. 0 0 01 1 1 1 1 12 2 2 2 2 23 3 3 3 3 34 4 4 4 4 4x x x 1 x l ( ) {1,3} x l ( ) <2,0,4,3,1> x ( )a ( )b ( )c figure 2: an illustrating example of the different types of literals over an arbitrary five-radix logic: (a) 1-reduced post literal (1-rpl), (b) generalized (post) literal (gl), and (c) universal literal (ul). since k-rpl gfsop is simpler from implementation point of view than gl or ul, we will perform all the gfsop expansions utilizing the 1-rpl gfsop. let us define 1-rpl [1, 17] as: i x = 1 iff x = i else ix = 0. (5) for example {0x, 1x, 2x} are the zero, first, and second polarities of the 1-rpl, respectively. also, let us define the ternary shifts over variable x as {x,x′,x′′} as the zero, first and second shifts of the variable x respectively, i.e., x = x + 0, x′ = x + 1 and x′′ = x + 2, and x can take any value in the set {0,1,2}. we chose to represent the 1-reduced post literals in terms of shifts and powers, among others, because of the ease of the implementation of powers of shifted variables in hardware as will be seen in section 3 within universal logic modules (ulms) for the production of rpl. the fundamental shannon expansion over gf(3) for a ternary function with a single variable is shown in theorem 1. theorem 1. 
shannon expansion over gf(3) for a function with a single variable is: f = 0x f0 + 1 x f1 + 2 x f2, (6) where f0 is cofactor of f with respect to variable x of value 0, f1 is cofactor of f with respect to variable x of value 1, and f2 is cofactor of f with respect to variable x of value 2. 104 an extended green sasao hierarchy of canonical ternary galois forms ... literals. galois field sum-of-products expansions can be performed utilizing variety uses of literals. for example, one can use k-reduced post literal (k-rpl) to produce k-rpl gfsop, generalized (post) literal (gl) to produce gl gfsop, and universal literal (ul) to produce ul gfsop. example 1. figure 2 demonstrates several literal types, where one proceeds from the simplest literal in figure 2(a) (i.e., rpl) to the most complex universal literal in figure 2(c). for rpl in figure 2(a), a value k is produced by the literal when the value of the variable is equal to a specific state, and in this particular example a value of k = 1 is generated by the 1-rpl when the value of variable x is equal to certain state where this state here equals to one. the gl in figure 2(b) produces a value of radix for a set of distinct states. one notes that, in contrast to the other literals, ul in figure 2(c) can have any value of the logic system at distinct states, and thus ul has the highest complexity among the different types of literals. 0 0 01 1 1 1 1 12 2 2 2 2 23 3 3 3 3 34 4 4 4 4 4x x x 1 x l ( ) {1,3} x l ( ) <2,0,4,3,1> x ( )a ( )b ( )c figure 2: an illustrating example of the different types of literals over an arbitrary five-radix logic: (a) 1-reduced post literal (1-rpl), (b) generalized (post) literal (gl), and (c) universal literal (ul). since k-rpl gfsop is simpler from implementation point of view than gl or ul, we will perform all the gfsop expansions utilizing the 1-rpl gfsop. let us define 1-rpl [1, 17] as: i x = 1 iff x = i else ix = 0. 
(5) for example {0x, 1x, 2x} are the zero, first, and second polarities of the 1-rpl, respectively. also, let us define the ternary shifts over variable x as {x,x′,x′′} as the zero, first and second shifts of the variable x respectively, i.e., x = x + 0, x′ = x + 1 and x′′ = x + 2, and x can take any value in the set {0,1,2}. we chose to represent the 1-reduced post literals in terms of shifts and powers, among others, because of the ease of the implementation of powers of shifted variables in hardware as will be seen in section 3 within universal logic modules (ulms) for the production of rpl. the fundamental shannon expansion over gf(3) for a ternary function with a single variable is shown in theorem 1. theorem 1. shannon expansion over gf(3) for a function with a single variable is: f = 0x f0 + 1 x f1 + 2 x f2, (6) where f0 is cofactor of f with respect to variable x of value 0, f1 is cofactor of f with respect to variable x of value 1, and f2 is cofactor of f with respect to variable x of value 2. 104 an extended green sasao hierarchy of canonical ternary galois forms ... literals. galois field sum-of-products expansions can be performed utilizing variety uses of literals. for example, one can use k-reduced post literal (k-rpl) to produce k-rpl gfsop, generalized (post) literal (gl) to produce gl gfsop, and universal literal (ul) to produce ul gfsop. example 1. figure 2 demonstrates several literal types, where one proceeds from the simplest literal in figure 2(a) (i.e., rpl) to the most complex universal literal in figure 2(c). for rpl in figure 2(a), a value k is produced by the literal when the value of the variable is equal to a specific state, and in this particular example a value of k = 1 is generated by the 1-rpl when the value of variable x is equal to certain state where this state here equals to one. the gl in figure 2(b) produces a value of radix for a set of distinct states. 
Proof. From equation (5), if we substitute the values of the 1-RPL in equation (6), we obtain

\[ \{\, x = 0 \Rightarrow f_{x=0} = f_0,\quad x = 1 \Rightarrow f_{x=1} = f_1,\quad x = 2 \Rightarrow f_{x=2} = f_2 \,\}, \]

which are the cofactors of variable $x$ of values $\{0, 1, 2\}$, respectively.

Example 2. Let $f(x_1, x_2) = x_1' x_2 + x_2'' x_1$. Then, using Figure 1, the ternary truth vector with the variable order $\{x_1, x_2\}$ is $\vec{F} = [0, 2, 1, 1, 2, 0, 2, 2, 2]^T$. Using equation (6), we obtain the following GF(3) Shannon expansion:

\[ f(x_1, x_2) = {}^{0}x_1\,{}^{1}x_2 + 2\cdot{}^{0}x_1\,{}^{2}x_2 + 2\cdot{}^{1}x_1\,{}^{0}x_2 + 2\cdot{}^{1}x_1\,{}^{1}x_2 + 2\cdot{}^{1}x_1\,{}^{2}x_2 + {}^{2}x_1\,{}^{0}x_2 + 2\cdot{}^{2}x_1\,{}^{2}x_2. \]

Using the addition and multiplication over GF(3) from Figure 1, the 1-RPL defined in equation (5) is related to the shifts of variables over GF(3) in terms of powers as follows:

\[ {}^{0}x = 2(x)^2 + 1, \tag{7} \]
\[ {}^{0}x = 2(x')^2 + 2(x'), \tag{8} \]
\[ {}^{0}x = 2(x'')^2 + x'', \tag{9} \]
\[ {}^{1}x = 2(x)^2 + 2(x), \tag{10} \]
\[ {}^{1}x = 2(x')^2 + x', \tag{11} \]
\[ {}^{1}x = 2(x'')^2 + 1, \tag{12} \]
\[ {}^{2}x = 2(x)^2 + x, \tag{13} \]
\[ {}^{2}x = 2(x')^2 + 1, \tag{14} \]
\[ {}^{2}x = 2(x'')^2 + 2(x''). \tag{15} \]

After substituting equations (7)-(15) in equation (6), and after minimizing the terms according to the GF operations in Figure 1, one obtains the following equations:

\[ f = 1\cdot f_0 + x\cdot(2f_1 + f_2) + 2(x)^2 (f_0 + f_1 + f_2), \tag{16} \]
\[ f = 1\cdot f_2 + x'\cdot(2f_0 + f_1) + 2(x')^2 (f_0 + f_1 + f_2), \tag{17} \]
\[ f = 1\cdot f_1 + x''\cdot(2f_2 + f_0) + 2(x'')^2 (f_0 + f_1 + f_2). \tag{18} \]

Equations (6) and (16)-(18) are the ternary fundamental Shannon and Davio expansions for a single variable, respectively. These equations can be rewritten in matrix-based forms as shown in equations (19)-(22). We observe that equations (19)-(22) are expansions for a single variable, but these expansions can be generated recursively for an arbitrary number of variables n using the Kronecker product (also called the tensor product), analogously to the binary case [1, 17].
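The literal identities (7)-(15), the Davio expansions (16)-(18) and the truth vector of Example 2 can be checked by exhaustive evaluation. The sketch below assumes arithmetic modulo 3 for GF(3); all names (`LITERAL_AS_POWERS`, `davio`, `example2`, and so on) are hypothetical. The truth vector quoted in Example 2 is reproduced when $x_1$ is taken as the fastest-varying index, which is an inference from the numbers rather than something the text states.

```python
import itertools

P = 3  # GF(3) modelled as integers modulo 3


def rpl(i, x):
    """1-reduced Post literal of equation (5)."""
    return 1 if x == i else 0


def shift(x, s):
    """s-th ternary shift of x: x, x' = x + 1, x'' = x + 2 over GF(3)."""
    return (x + s) % P


# (i, s) -> (a, b, c) such that  ^i x = a*y^2 + b*y + c  with y = x + s.
# Rows transcribe equations (7)-(15).
LITERAL_AS_POWERS = {
    (0, 0): (2, 0, 1), (0, 1): (2, 2, 0), (0, 2): (2, 1, 0),
    (1, 0): (2, 2, 0), (1, 1): (2, 1, 0), (1, 2): (2, 0, 1),
    (2, 0): (2, 1, 0), (2, 1): (2, 0, 1), (2, 2): (2, 2, 0),
}


def identities_hold():
    """True if every identity (7)-(15) agrees with the literal definition."""
    for (i, s), (a, b, c) in LITERAL_AS_POWERS.items():
        for x in range(P):
            y = shift(x, s)
            if (a * y * y + b * y + c) % P != rpl(i, x):
                return False
    return True


def davio(s, f0, f1, f2, x):
    """Davio expansion over shift s: equations (16), (17), (18) for s = 0, 1, 2."""
    y = shift(x, s)
    constant, linear = {
        0: (f0, 2 * f1 + f2),
        1: (f2, 2 * f0 + f1),
        2: (f1, 2 * f2 + f0),
    }[s]
    return (constant + y * linear + 2 * y * y * (f0 + f1 + f2)) % P


def davio_matches_shannon():
    """True if each Davio expansion returns cofactor f_x for every input."""
    for cof in itertools.product(range(P), repeat=3):
        for s in range(P):
            for x in range(P):
                if davio(s, *cof, x) != cof[x]:
                    return False
    return True


def example2(x1, x2):
    """Function of Example 2: f = x1' * x2 + x2'' * x1 over GF(3)."""
    return (shift(x1, 1) * x2 + shift(x2, 2) * x1) % P


# Truth vector with x1 as the fastest-varying index (assumed ordering).
truth_vector = [example2(x1, x2) for x2 in range(P) for x1 in range(P)]
```

Checking all 27 cofactor triples is cheap for a single ternary variable, so the sketch verifies the expansions by enumeration rather than by symbolic manipulation.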
In matrix form, the Shannon expansion (6) and the Davio expansions (16)-(18) read:

\[ f = \begin{bmatrix} {}^{0}x & {}^{1}x & {}^{2}x \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} f_0 \\ f_1 \\ f_2 \end{bmatrix}, \tag{19} \]

\[ f = \begin{bmatrix} 1 & x & x^2 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 1 \\ 2 & 2 & 2 \end{bmatrix} \begin{bmatrix} f_0 \\ f_1 \\ f_2 \end{bmatrix}, \tag{20} \]

\[ f = \begin{bmatrix} 1 & x' & (x')^2 \end{bmatrix} \begin{bmatrix} 0 & 0 & 1 \\ 2 & 1 & 0 \\ 2 & 2 & 2 \end{bmatrix} \begin{bmatrix} f_0 \\ f_1 \\ f_2 \end{bmatrix}, \tag{21} \]

\[ f = \begin{bmatrix} 1 & x'' & (x'')^2 \end{bmatrix} \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 2 \\ 2 & 2 & 2 \end{bmatrix} \begin{bmatrix} f_0 \\ f_1 \\ f_2 \end{bmatrix}. \tag{22} \]

The recursive generation using the Kronecker product for an arbitrary number of variables can be expressed formally in the following forms for the ternary Shannon (S) and Davio ($D_0$, $D_1$, $D_2$) expansions, respectively:

\[ f = \bigotimes_{i=1}^{n} \begin{bmatrix} {}^{0}x_i & {}^{1}x_i & {}^{2}x_i \end{bmatrix} \; \bigotimes_{i=1}^{n} [S] \; [\vec{F}], \tag{23} \]

\[ f = \bigotimes_{i=1}^{n} \begin{bmatrix} 1 & x_i & x_i^2 \end{bmatrix} \; \bigotimes_{i=1}^{n} [D_0] \; [\vec{F}], \tag{24} \]

\[ f = \bigotimes_{i=1}^{n} \begin{bmatrix} 1 & x_i' & (x_i')^2 \end{bmatrix} \; \bigotimes_{i=1}^{n} [D_1] \; [\vec{F}], \tag{25} \]

\[ f = \bigotimes_{i=1}^{n} \begin{bmatrix} 1 & x_i'' & (x_i'')^2 \end{bmatrix} \; \bigotimes_{i=1}^{n} [D_2] \; [\vec{F}]. \tag{26} \]

Analogously to the binary case, we can have mixed expansions that use Shannon (S) for certain variables and Davio ($D_0$, $D_1$, $D_2$) for the other variables. This leads, analogously to the binary case, to the Kronecker ternary decision trees (TDTs). Moreover, mixed expansions can be extended to include the case of pseudo-Kronecker TDTs [17].
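The Kronecker-product construction of equations (23)-(26), including mixed Shannon/Davio choices per variable, can be sketched as follows. This is an illustration under stated assumptions rather than the author's implementation: the matrices are transcribed from equations (19)-(22), the first variable is taken as the most significant index of the truth vector (a convention chosen here for the code only), and every name (`kron`, `kron_all`, `basis`, `evaluate`, the sample vector `F`) is hypothetical.

```python
import itertools

P = 3  # GF(3) modelled as integers modulo 3

# Transform matrices transcribed from equations (19)-(22).
MATRICES = {
    "S":  [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
    "D0": [[1, 0, 0], [0, 2, 1], [2, 2, 2]],
    "D1": [[0, 0, 1], [2, 1, 0], [2, 2, 2]],
    "D2": [[0, 1, 0], [1, 0, 2], [2, 2, 2]],
}
S, D0 = MATRICES["S"], MATRICES["D0"]


def kron(a, b):
    """Kronecker product of two matrices given as lists of rows, modulo 3."""
    return [[(x * y) % P for x in row_a for y in row_b]
            for row_a in a for row_b in b]


def kron_all(mats):
    """Kronecker product of a sequence of matrices, left to right."""
    result = [[1]]
    for m in mats:
        result = kron(result, m)
    return result


def basis(kind, x):
    """1x3 row vector for one variable: literals for S, powers of a shift for D."""
    if kind == "S":
        return [[1 if x == i else 0 for i in range(P)]]
    y = (x + {"D0": 0, "D1": 1, "D2": 2}[kind]) % P
    return [[1, y, (y * y) % P]]


def evaluate(kinds, truth_vector, point):
    """Evaluate f at `point` using one expansion kind per variable."""
    row = kron_all([basis(k, x) for k, x in zip(kinds, point)])[0]
    transform = kron_all([MATRICES[k] for k in kinds])
    coeffs = [sum(t * f for t, f in zip(t_row, truth_vector)) % P
              for t_row in transform]
    return sum(r * c for r, c in zip(row, coeffs)) % P


def all_expansions_reproduce(truth_vector, n=2):
    """True if every S/D0/D1/D2 combination returns the truth-vector entries."""
    for kinds in itertools.product(MATRICES, repeat=n):
        for point in itertools.product(range(P), repeat=n):
            index = sum(x * P ** (n - 1 - k) for k, x in enumerate(point))
            if evaluate(kinds, truth_vector, point) != truth_vector[index]:
                return False
    return True


# Arbitrary two-variable truth vector used only to exercise the code.
F = [0, 1, 2, 2, 2, 2, 1, 0, 2]
```

Passing the same kind for every variable corresponds to equations (23)-(26), while a tuple such as `("S", "D1")` corresponds to a mixed expansion of the kind that leads to Kronecker TDTs.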
several large families of canonical forms: fixed polarity reed-muller (fprm) forms, generalized reed-muller (grm) forms, kronecker (kro) forms and pseudo-kronecker (psdkro) forms, referred to 106 an extended green sasao hierarchy of canonical ternary galois forms ... f = � 0x 1x 2x �   1 0 0 0 1 0 0 0 1     f0 f1 f2  , (19) f = � 1 x x2 �   1 0 0 0 2 1 2 2 2     f0 f1 f2  , (20) f = � 1 x′ (x′)2 �   0 0 1 2 1 0 2 2 2     f0 f1 f2  , (21) f = � 1 x′′ (x′′)2 �   0 1 0 1 0 2 2 2 2     f0 f1 f2  . (22) the recursive generation using the kronecker product for arbitrary number of variables can be expressed formally as in the following forms for ternary shannon (s) and davio (d0,d1,d2) expansions, respectively: f = n � i=1 � 0xi 1xi 2xi � n � i=1 [s][�f], (23) f = n � i=1 � 1 xi x 2 i � n � i=1 [d0][�f], (24) f = n � i=1 � 1 x′i (x ′ i) 2 � n � i=1 [d1][�f], (25) f = n � i=1 � 1 x′′i (x ′′ i ) 2 � n � i=1 [d2][�f]. (26) analogously to the binary case, we can have expansions that are mixed of shannon (s) for certain variables and davio (d0,d1,d2) for the other variables. this will lead, analogously to the binary case, to the kronecker ternary decision trees (tdts). moreover, mixed expansions can be extended to include the case of pseudo kronecker tdts [17]. 2 new multiple-valued s/d trees and their canonical galois sop forms economical and highly testable implementations of boolean functions, based on reedmuller (and-exor) logic, play an important role in logic synthesis and circuit design. the and-exor circuits include canonical forms which are expansions that are unique representations of a boolean function. several large families of canonical forms: fixed polarity reed-muller (fprm) forms, generalized reed-muller (grm) forms, kronecker (kro) forms and pseudo-kronecker (psdkro) forms, referred to 106 an extended green sasao hierarchy of canonical ternary galois forms ... 
f = � 0x 1x 2x �   1 0 0 0 1 0 0 0 1     f0 f1 f2  , (19) f = � 1 x x2 �   1 0 0 0 2 1 2 2 2     f0 f1 f2  , (20) f = � 1 x′ (x′)2 �   0 0 1 2 1 0 2 2 2     f0 f1 f2  , (21) f = � 1 x′′ (x′′)2 �   0 1 0 1 0 2 2 2 2     f0 f1 f2  . (22) the recursive generation using the kronecker product for arbitrary number of variables can be expressed formally as in the following forms for ternary shannon (s) and davio (d0,d1,d2) expansions, respectively: f = n � i=1 � 0xi 1xi 2xi � n � i=1 [s][�f], (23) f = n � i=1 � 1 xi x 2 i � n � i=1 [d0][�f], (24) f = n � i=1 � 1 x′i (x ′ i) 2 � n � i=1 [d1][�f], (25) f = n � i=1 � 1 x′′i (x ′′ i ) 2 � n � i=1 [d2][�f]. (26) analogously to the binary case, we can have expansions that are mixed of shannon (s) for certain variables and davio (d0,d1,d2) for the other variables. this will lead, analogously to the binary case, to the kronecker ternary decision trees (tdts). moreover, mixed expansions can be extended to include the case of pseudo kronecker tdts [17]. 2 new multiple-valued s/d trees and their canonical galois sop forms economical and highly testable implementations of boolean functions, based on reedmuller (and-exor) logic, play an important role in logic synthesis and circuit design. the and-exor circuits include canonical forms which are expansions that are unique representations of a boolean function. several large families of canonical forms: fixed polarity reed-muller (fprm) forms, generalized reed-muller (grm) forms, kronecker (kro) forms and pseudo-kronecker (psdkro) forms, referred to 106 54 a. n. al-rabadi an extended green sasao hierarchy of canonical ternary galois forms and universal logic modules 55 an extended green sasao hierarchy of canonical ternary galois forms ... as the green-sasao hierarchy, have been described [1, 10, 17]. 
because canonical families have higher testability and some other properties desirable for efficient synthesis, especially of some classes of functions, they are widely investigated. a similar ternary version of the binary green-sasao hierarchy can be developed, where this hierarchy can find applications in minimizing galois field sum-of-product (gfsop) expressions which are expressions in the sum-of-product form that uses the additions and multiplications of arbitrary radix galois field, and can be used for the creation of new forms, decision diagrams and regular structures. the current state-of-the-art minimizers of exclusive sum-of-product (esop) expressions are based on heuristics and give the exact solution only for functions with a small number of variables. for example, a formulation for finding exact esop was given [11], and an algorithm to derive minimum esop for 6-variable function was provided [25]. because gfsop minimization is even more difficult, it is important to investigate the structural properties and the counts of their canonical sub-families. two families of binary canonical reed-muller forms, called inclusive forms (ifs) and generalized inclusive forms (gifs) were presented [1], where the second family was the first to include all minimum esops (binary gfsops). in these forms, if is the form generated by the corresponding s/d tree and gif is the form which is the union of the various variable-based ordering ifs (cf. definitions and theorems of these forms in the next subsection). in this paper, we propose, as analogous to the binary case, two general families of canonical ternary reed-muller forms, called ternary inclusive forms (tifs) and their generalization of ternary generalized inclusive forms (tgifs), where gfsop minimizer based on these new forms can be used to minimize functional gfsop expressions and the second family of tgifs includes minimum gfsops over ternary galois field. 
one of the motivations for this work is the application of these tifs and tgifs to find minimum gfsop for multiple-valued inputs multiple-valued outputs within logic synthesis, where the corresponding s/d trees provide more general polarity that contains grm forms as a special case. 2.1 s/d trees and their inclusive forms and generalized inclusive forms two general families based on the shannon expansion and the generalized davio expansion which are produced using the corresponding s/d trees are presented in this subsection. these families are called the inclusive forms (ifs) and the generalized inclusive forms (gifs). the corresponding expansions over gf(2) are shown in figure 3, where figure 3(d) shows the new node which is based on binary davio expansions called the generalized davio (d) expansion (cf. equation (28) for the more general ternary case) that generates the negative and positive davio expansions as special cases. the s/d trees for ifs of two variables can be generated for variable order {a,b} and for variable order {b,a} as well. the set of gifs for two variables is the union of these two order-based ifs, where the total number of the resulting gifs is equal to: #gif = 2 ·(#ifa,b)− #(ifa,b ⋂ ifb,a). (27) the galois-based shannon and davio ternary expansions (i.e., flattened forms) can be represented in ternary dts (tdts) and the corresponding varieties of ternary dds (tdds) according to the corresponding reduction rules that are used [1, 17]. for one 107 an extended green sasao hierarchy of canonical ternary galois forms ... as the green-sasao hierarchy, have been described [1, 10, 17]. because canonical families have higher testability and some other properties desirable for efficient synthesis, especially of some classes of functions, they are widely investigated. 
a similar ternary version of the binary green-sasao hierarchy can be developed, where this hierarchy can find applications in minimizing galois field sum-of-product (gfsop) expressions which are expressions in the sum-of-product form that uses the additions and multiplications of arbitrary radix galois field, and can be used for the creation of new forms, decision diagrams and regular structures. the current state-of-the-art minimizers of exclusive sum-of-product (esop) expressions are based on heuristics and give the exact solution only for functions with a small number of variables. for example, a formulation for finding exact esop was given [11], and an algorithm to derive minimum esop for 6-variable function was provided [25]. because gfsop minimization is even more difficult, it is important to investigate the structural properties and the counts of their canonical sub-families. two families of binary canonical reed-muller forms, called inclusive forms (ifs) and generalized inclusive forms (gifs) were presented [1], where the second family was the first to include all minimum esops (binary gfsops). in these forms, if is the form generated by the corresponding s/d tree and gif is the form which is the union of the various variable-based ordering ifs (cf. definitions and theorems of these forms in the next subsection). in this paper, we propose, as analogous to the binary case, two general families of canonical ternary reed-muller forms, called ternary inclusive forms (tifs) and their generalization of ternary generalized inclusive forms (tgifs), where gfsop minimizer based on these new forms can be used to minimize functional gfsop expressions and the second family of tgifs includes minimum gfsops over ternary galois field. 
one of the motivations for this work is the application of these tifs and tgifs to find minimum gfsop for multiple-valued inputs multiple-valued outputs within logic synthesis, where the corresponding s/d trees provide more general polarity that contains grm forms as a special case. 2.1 s/d trees and their inclusive forms and generalized inclusive forms two general families based on the shannon expansion and the generalized davio expansion which are produced using the corresponding s/d trees are presented in this subsection. these families are called the inclusive forms (ifs) and the generalized inclusive forms (gifs). the corresponding expansions over gf(2) are shown in figure 3, where figure 3(d) shows the new node which is based on binary davio expansions called the generalized davio (d) expansion (cf. equation (28) for the more general ternary case) that generates the negative and positive davio expansions as special cases. the s/d trees for ifs of two variables can be generated for variable order {a,b} and for variable order {b,a} as well. the set of gifs for two variables is the union of these two order-based ifs, where the total number of the resulting gifs is equal to: #gif = 2 ·(#ifa,b)− #(ifa,b ⋂ ifb,a). (27) the galois-based shannon and davio ternary expansions (i.e., flattened forms) can be represented in ternary dts (tdts) and the corresponding varieties of ternary dds (tdds) according to the corresponding reduction rules that are used [1, 17]. for one 107 an extended green sasao hierarchy of canonical ternary galois forms ... as the green-sasao hierarchy, have been described [1, 10, 17]. because canonical families have higher testability and some other properties desirable for efficient synthesis, especially of some classes of functions, they are widely investigated. 
a similar ternary version of the binary green-sasao hierarchy can be developed, where this hierarchy can find applications in minimizing galois field sum-of-product (gfsop) expressions which are expressions in the sum-of-product form that uses the additions and multiplications of arbitrary radix galois field, and can be used for the creation of new forms, decision diagrams and regular structures. the current state-of-the-art minimizers of exclusive sum-of-product (esop) expressions are based on heuristics and give the exact solution only for functions with a small number of variables. for example, a formulation for finding exact esop was given [11], and an algorithm to derive minimum esop for 6-variable function was provided [25]. because gfsop minimization is even more difficult, it is important to investigate the structural properties and the counts of their canonical sub-families. two families of binary canonical reed-muller forms, called inclusive forms (ifs) and generalized inclusive forms (gifs) were presented [1], where the second family was the first to include all minimum esops (binary gfsops). in these forms, if is the form generated by the corresponding s/d tree and gif is the form which is the union of the various variable-based ordering ifs (cf. definitions and theorems of these forms in the next subsection). in this paper, we propose, as analogous to the binary case, two general families of canonical ternary reed-muller forms, called ternary inclusive forms (tifs) and their generalization of ternary generalized inclusive forms (tgifs), where gfsop minimizer based on these new forms can be used to minimize functional gfsop expressions and the second family of tgifs includes minimum gfsops over ternary galois field. 
\[ f = \begin{bmatrix} {}^{0}x & {}^{1}x & {}^{2}x \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} f_0 \\ f_1 \\ f_2 \end{bmatrix}, \tag{19} \]

\[ f = \begin{bmatrix} 1 & x & x^2 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 1 \\ 2 & 2 & 2 \end{bmatrix} \begin{bmatrix} f_0 \\ f_1 \\ f_2 \end{bmatrix}, \tag{20} \]

\[ f = \begin{bmatrix} 1 & x' & (x')^2 \end{bmatrix} \begin{bmatrix} 0 & 0 & 1 \\ 2 & 1 & 0 \\ 2 & 2 & 2 \end{bmatrix} \begin{bmatrix} f_0 \\ f_1 \\ f_2 \end{bmatrix}, \tag{21} \]

\[ f = \begin{bmatrix} 1 & x'' & (x'')^2 \end{bmatrix} \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 2 \\ 2 & 2 & 2 \end{bmatrix} \begin{bmatrix} f_0 \\ f_1 \\ f_2 \end{bmatrix}. \tag{22} \]

The recursive generation using the Kronecker product for an arbitrary number of variables can be expressed formally as follows for the ternary Shannon (S) and Davio (D0, D1, D2) expansions, respectively:

\[ f = \bigotimes_{i=1}^{n} \begin{bmatrix} {}^{0}x_i & {}^{1}x_i & {}^{2}x_i \end{bmatrix} \; \bigotimes_{i=1}^{n} [S] \, [\vec{f}], \tag{23} \]

\[ f = \bigotimes_{i=1}^{n} \begin{bmatrix} 1 & x_i & x_i^2 \end{bmatrix} \; \bigotimes_{i=1}^{n} [D_0] \, [\vec{f}], \tag{24} \]

\[ f = \bigotimes_{i=1}^{n} \begin{bmatrix} 1 & x_i' & (x_i')^2 \end{bmatrix} \; \bigotimes_{i=1}^{n} [D_1] \, [\vec{f}], \tag{25} \]

\[ f = \bigotimes_{i=1}^{n} \begin{bmatrix} 1 & x_i'' & (x_i'')^2 \end{bmatrix} \; \bigotimes_{i=1}^{n} [D_2] \, [\vec{f}]. \tag{26} \]

Analogously to the binary case, expansions can mix Shannon (S) for certain variables with Davio (D0, D1, D2) for the other variables. This leads, as in the binary case, to the Kronecker ternary decision trees (TDTs). Moreover, mixed expansions can be extended to include the case of pseudo-Kronecker TDTs [17].

2 New multiple-valued S/D trees and their canonical Galois SOP forms

Economical and highly testable implementations of Boolean functions, based on Reed-Muller (AND-EXOR) logic, play an important role in logic synthesis and circuit design. AND-EXOR circuits include canonical forms, which are expansions that are unique representations of a Boolean function. Several large families of canonical forms, namely fixed polarity Reed-Muller (FPRM) forms, generalized Reed-Muller (GRM) forms, Kronecker (KRO) forms and pseudo-Kronecker (PSDKRO) forms, referred to as the Green-Sasao hierarchy, have been described [1, 10, 17]. Because canonical families have higher testability and other properties desirable for efficient synthesis, especially for some classes of functions, they are widely investigated.

A similar ternary version of the binary Green-Sasao hierarchy can be developed. Such a hierarchy can find applications in minimizing Galois field sum-of-product (GFSOP) expressions, which are expressions in sum-of-product form that use the additions and multiplications of an arbitrary-radix Galois field, and it can be used for the creation of new forms, decision diagrams and regular structures. The current state-of-the-art minimizers of exclusive sum-of-product (ESOP) expressions are based on heuristics and give the exact solution only for functions with a small number of variables. For example, a formulation for finding an exact ESOP was given in [11], and an algorithm to derive a minimum ESOP for 6-variable functions was provided in [25]. Because GFSOP minimization is even more difficult, it is important to investigate the structural properties and the counts of the canonical sub-families.

Two families of binary canonical Reed-Muller forms, called inclusive forms (IFs) and generalized inclusive forms (GIFs), were presented in [1], where the second family was the first to include all minimum ESOPs (binary GFSOPs). In these forms, an IF is the form generated by the corresponding S/D tree, and a GIF is the form that is the union of the IFs over the various variable orderings (cf. the definitions and theorems of these forms in the next subsection). In this paper we propose, in analogy to the binary case, two general families of canonical ternary Reed-Muller forms, called ternary inclusive forms (TIFs) and their generalization, ternary generalized inclusive forms (TGIFs). A GFSOP minimizer based on these new forms can be used to minimize functional GFSOP expressions, and the second family (TGIFs) includes the minimum GFSOPs over the ternary Galois field.

One of the motivations for this work is the application of these TIFs and TGIFs to finding a minimum GFSOP for multiple-valued-input, multiple-valued-output functions within logic synthesis, where the corresponding S/D trees provide a more general polarity that contains GRM forms as a special case.

2.1 S/D trees and their inclusive forms and generalized inclusive forms

Two general families based on the Shannon expansion and the generalized Davio expansion, which are produced using the corresponding S/D trees, are presented in this subsection. These families are called the inclusive forms (IFs) and the generalized inclusive forms (GIFs). The corresponding expansions over GF(2) are shown in Figure 3, where Figure 3(d) shows the new node, based on binary Davio expansions and called the generalized Davio (D) expansion (cf. equation (28) for the more general ternary case), which generates the negative and positive Davio expansions as special cases. The S/D trees for IFs of two variables can be generated for variable order {a,b} and for variable order {b,a} as well. The set of GIFs for two variables is the union of these two order-based IFs, where the total number of the resulting GIFs is equal to:

\[ \#\mathrm{GIF} = 2 \cdot (\#\mathrm{IF}_{a,b}) - \#(\mathrm{IF}_{a,b} \cap \mathrm{IF}_{b,a}). \tag{27} \]

The Galois-based Shannon and Davio ternary expansions (i.e., flattened forms) can be represented in ternary DTs (TDTs) and the corresponding varieties of ternary DDs (TDDs) according to the reduction rules that are used [1, 17].

[Figure 3: Two-valued nodes: (a) Shannon, (b) positive Davio, (c) negative Davio, and (d) generalized Davio, where a ∈ {a, a′}.]

For one variable, that is one tree level, Figure 4 represents the expansion nodes for ternary Shannon, Davio and generalized Davio (D), respectively.
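The single-variable expansions of equations (19)-(22) that underlie these nodes can be checked numerically. The following is a minimal sketch, not the author's implementation: it assumes the shifts are x′ = x + 1 and x″ = x + 2 over GF(3) (an assumption that is consistent with the matrices above), and the helper names `shift`, `literal`, `NODES`, `basis` and `expand` are hypothetical.

```python
# Single-variable ternary expansions over GF(3), following equations (19)-(22).
# Helper names are illustrative and do not come from the paper.

def shift(x, k):
    """Cyclic shift over GF(3): k=1 gives x', k=2 gives x'' (assumed convention)."""
    return (x + k) % 3

def literal(i, x):
    """Shannon literal ^i x: equals 1 when x == i, else 0."""
    return 1 if x == i else 0

# Transform matrices of equations (19)-(22), one row per basis function.
NODES = {
    "S":  [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
    "D0": [[1, 0, 0], [0, 2, 1], [2, 2, 2]],
    "D1": [[0, 0, 1], [2, 1, 0], [2, 2, 2]],
    "D2": [[0, 1, 0], [1, 0, 2], [2, 2, 2]],
}

def basis(node, x):
    """Row vector of basis functions for the given node, evaluated at x."""
    if node == "S":
        return [literal(0, x), literal(1, x), literal(2, x)]
    k = {"D0": 0, "D1": 1, "D2": 2}[node]
    y = shift(x, k)
    return [1, y, (y * y) % 3]

def expand(node, cofactors, x):
    """Evaluate f(x) = basis(x) * M * [f0, f1, f2]^T over GF(3)."""
    m = NODES[node]
    coeffs = [sum(m[r][c] * cofactors[c] for c in range(3)) % 3 for r in range(3)]
    return sum(b * c for b, c in zip(basis(node, x), coeffs)) % 3

def all_nodes_reproduce_cofactors():
    """Check that f(i) == f_i for every node and every cofactor triple."""
    for node in NODES:
        for f0 in range(3):
            for f1 in range(3):
                for f2 in range(3):
                    cof = [f0, f1, f2]
                    if any(expand(node, cof, x) != cof[x] for x in range(3)):
                        return False
    return True

print(all_nodes_reproduce_cofactors())
```

The check is exhaustive over all 27 cofactor triples, so it confirms that each matrix, paired with its basis row, reproduces the cofactors f0, f1 and f2 at x = 0, 1 and 2.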
[Figure 4: Ternary nodes for ternary decision trees: (a) Shannon, (b) Davio0, (c) Davio1, (d) Davio2, and (e) generalized Davio (D), where x ∈ {x, x′, x″}.]

In correspondence to the binary S/D trees, one can produce ternary S/D trees. To define the ternary S/D trees, we define the generalized Davio node over the ternary Galois radix as shown in Figure 4(e). In our notation, (x) corresponds to the three possible shifts of the variable x as follows:

\[ (x) \in \{x, x', x''\} \quad \text{over GF(3)}. \tag{28} \]

Definition 1. The ternary tree with ternary Shannon and ternary generalized Davio nodes, which generates other ternary trees, is called a ternary Shannon-Davio (S/D) tree.

Using the definition of the ternary Shannon node in Figure 4(a) and the ternary generalized Davio node in Figure 4(e), we obtain the ternary Shannon-Davio trees (ternary S/D trees) for two variables as shown in Figure 5. If we take any S/D tree in Figure 5, multiply each second-level cofactor (located in the TDT leaves) by the corresponding path in that TDT, and sum all the resulting cubes (terms) over GF(3), we obtain the flattened form of the function f as a certain GFSOP expression. For each TDT in Figure 5, there are as many forms of the function f as there are possible permutations of the polarities of the variables in the second-level branches of that TDT.

Definition 2. The family of all possible forms obtained per ternary S/D tree is called ternary inclusive forms (TIFs).

The numbers of these TIFs per TDT for two variables are shown on top of each S/D TDT in Figure 5. By observing Figure 5, we can generate the corresponding flattened forms by two methods. A classical method, by analogy with the well-known binary forms, would be to create every transform matrix for every TIF S/D tree and then expand using that transform matrix.
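The classical, matrix-based method can be sketched for two variables as follows. This is an illustration under stated assumptions rather than the paper's algorithm: it takes the Davio matrices of equations (20) and (21), builds the 9×9 transform as a Kronecker product in plain Python, and applies it to an example truth vector. The function names, the choice of D0 for x1 and D1 for x2, the shift convention x′ = x + 1, and the truth vector itself are all assumptions made for the example.

```python
# "Classical" flattening for two variables: build the 9x9 transform matrix as a
# Kronecker product of per-variable node matrices and apply it to a truth vector.
# All names are illustrative; the truth vector is an example chosen for the sketch.

D0 = [[1, 0, 0], [0, 2, 1], [2, 2, 2]]   # matrix of equation (20)
D1 = [[0, 0, 1], [2, 1, 0], [2, 2, 2]]   # matrix of equation (21)

def kron(a, b):
    """Kronecker product of two square matrices, entries reduced mod 3."""
    n, m = len(a), len(b)
    return [[(a[i // m][j // m] * b[i % m][j % m]) % 3
             for j in range(n * m)] for i in range(n * m)]

def mat_vec(m, v):
    """Matrix-vector product over GF(3)."""
    return [sum(r * x for r, x in zip(row, v)) % 3 for row in m]

def powers(x, k):
    """Basis [1, y, y^2] with y = x + k over GF(3) (assumed shift convention)."""
    y = (x + k) % 3
    return [1, y, (y * y) % 3]

def evaluate(coeffs, k1, k2, x1, x2):
    """Evaluate the flattened form with polarity shifts k1, k2 at (x1, x2)."""
    b1, b2 = powers(x1, k1), powers(x2, k2)
    terms = [b1[i] * b2[j] for i in range(3) for j in range(3)]
    return sum(c * t for c, t in zip(coeffs, terms)) % 3

# Truth vector ordered as f(0,0), f(0,1), f(0,2), f(1,0), ..., f(2,2).
truth = [0, 1, 2, 2, 2, 2, 1, 0, 2]

coeffs = mat_vec(kron(D0, D1), truth)        # x1 expanded by D0, x2 by D1
n_cubes = sum(1 for c in coeffs if c != 0)   # nonzero coefficients, i.e. cubes
ok = all(evaluate(coeffs, 0, 1, x1, x2) == truth[3 * x1 + x2]
         for x1 in range(3) for x2 in range(3))
print(ok, n_cubes)
```

Each choice of node per variable requires its own Kronecker product, which is why this approach becomes expensive when many polarities have to be explored.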
A better method is to create one flattened form, which is an expansion over a certain transform matrix (i.e., a certain TIF), and then to transform systematically from one form to another without the need to create all transform matrices from the corresponding S/D trees. This general approach can lead to several algorithms of various complexities that generalize the existing binary algorithms for obtaining forms such as FPRM, GRM and IF.

Example 3. Using the result of Example 2 for the expansion of f(x1, x2) in terms of the ternary Shannon expansion (which corresponds to the S/D tree with Shannon nodes in both levels):

\[ f = {}^{0}x_1\,{}^{1}x_2 + 2\cdot{}^{0}x_1\,{}^{2}x_2 + 2\cdot{}^{1}x_1\,{}^{0}x_2 + 2\cdot{}^{1}x_1\,{}^{1}x_2 + 2\cdot{}^{1}x_1\,{}^{2}x_2 + {}^{2}x_1\,{}^{0}x_2 + 2\cdot{}^{2}x_1\,{}^{2}x_2, \tag{29} \]

we can substitute any of equations (7)-(15), or a mix of these equations, to transform one flattened form to another.
For example, if we substitute equations (7) and (11), we obtain:

\[ \begin{aligned} f ={}& (2(x_1)^2 + 1)(2(x_2')^2 + x_2') + 2(2(x_1)^2 + 1)\,{}^{2}x_2 + 2(2(x_1')^2 + x_1')(2(x_2)^2 + 1) \\ &+ 2(2(x_1')^2 + x_1')(2(x_2')^2 + x_2') + 2(2(x_1')^2 + x_1')\cdot{}^{2}x_2 + {}^{2}x_1(2(x_2)^2 + 1) + 2\cdot{}^{2}x_1\,{}^{2}x_2. \end{aligned} \tag{30} \]

By utilizing the GF addition and multiplication operators from Figure 1, equation (30) can be transformed to:

\[ \begin{aligned} f ={}& (x_1)^2(x_2')^2 + 2(x_1)^2(x_2') + 2(x_2')^2 + x_2' + (x_1)^2({}^{2}x_2) + 2({}^{2}x_2) \\ &+ 2(x_1')^2(x_2)^2 + (x_1')^2 + (x_1')(x_2)^2 + 2x_1' \\ &+ 2(x_1')^2(x_2')^2 + (x_1')^2(x_2') + (x_1')(x_2')^2 + 2x_1'x_2' \\ &+ (x_1')^2\cdot{}^{2}x_2 + 2(x_1')\,{}^{2}x_2 + 2({}^{2}x_1)(x_2)^2 + {}^{2}x_1 + 2({}^{2}x_1)({}^{2}x_2). \end{aligned} \tag{31} \]

Let us define, as one possible definition, the cost of the flattened form (expression) to be:

\[ \text{Cost} = \#\,\text{cubes}. \tag{32} \]

Then, we observe that equation (29) has a cost of seven, while equation (31) has a cost of 19. Thus, the inverse transformations applied to equation (31) would lead to equation (29) and a reduction of cost from 19 to seven. Using the same approach, we can generate a subset of the possible GFSOP expressions (flattened forms). Note that all of these GFSOP expressions are equivalent, since they produce the same function in different forms. Yet, as equation (31) shows, when equation (29) is transformed from one form to another, some transformations produce flattened forms with a smaller number of cubes than others. This observation suggests a possible application of evolutionary computing, using the S/D trees and the related transformations, to produce the corresponding minimum GFSOPs.

Analogous to the binary case in equation (27), the ternary GIFs can be defined as the union of ternary IFs.

Definition 3. The family of forms that is created as the union of the sets of TIFs for all variable orders is called ternary generalized inclusive forms (TGIFs).
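The equivalence of equations (29) and (31), and their costs under equation (32), can be checked by brute force over the nine input combinations. The sketch below is only an illustration: the cube encoding (a coefficient plus one optional factor per variable), the helper names `lit`, `pw` and `evaluate`, and the shift convention x′ = x + 1 over GF(3) are assumptions introduced for the example.

```python
# Check that equations (29) and (31) describe the same function over GF(3)
# and compare their costs (number of cubes). Names are illustrative.

def lit(i):
    """Shannon literal ^i x: 1 when x == i, else 0."""
    return lambda x: 1 if x == i else 0

def pw(k, e):
    """Power literal ((x + k) mod 3) ** e, e.g. pw(1, 2) is (x')^2 (assumed shift)."""
    return lambda x: pow((x + k) % 3, e, 3)

# Each cube is (coefficient, factor in x1 or None, factor in x2 or None).
EQ29 = [
    (1, lit(0), lit(1)), (2, lit(0), lit(2)), (2, lit(1), lit(0)),
    (2, lit(1), lit(1)), (2, lit(1), lit(2)), (1, lit(2), lit(0)),
    (2, lit(2), lit(2)),
]

EQ31 = [
    (1, pw(0, 2), pw(1, 2)),   # (x1)^2 (x2')^2
    (2, pw(0, 2), pw(1, 1)),   # 2 (x1)^2 x2'
    (2, None, pw(1, 2)),       # 2 (x2')^2
    (1, None, pw(1, 1)),       # x2'
    (1, pw(0, 2), lit(2)),     # (x1)^2 ^2x2
    (2, None, lit(2)),         # 2 ^2x2
    (2, pw(1, 2), pw(0, 2)),   # 2 (x1')^2 (x2)^2
    (1, pw(1, 2), None),       # (x1')^2
    (1, pw(1, 1), pw(0, 2)),   # x1' (x2)^2
    (2, pw(1, 1), None),       # 2 x1'
    (2, pw(1, 2), pw(1, 2)),   # 2 (x1')^2 (x2')^2
    (1, pw(1, 2), pw(1, 1)),   # (x1')^2 x2'
    (1, pw(1, 1), pw(1, 2)),   # x1' (x2')^2
    (2, pw(1, 1), pw(1, 1)),   # 2 x1' x2'
    (1, pw(1, 2), lit(2)),     # (x1')^2 ^2x2
    (2, pw(1, 1), lit(2)),     # 2 x1' ^2x2
    (2, lit(2), pw(0, 2)),     # 2 ^2x1 (x2)^2
    (1, lit(2), None),         # ^2x1
    (2, lit(2), lit(2)),       # 2 ^2x1 ^2x2
]

def evaluate(cubes, x1, x2):
    """Sum all cubes at (x1, x2) over GF(3)."""
    total = 0
    for coef, g1, g2 in cubes:
        v1 = 1 if g1 is None else g1(x1)
        v2 = 1 if g2 is None else g2(x2)
        total += coef * v1 * v2
    return total % 3

same = all(evaluate(EQ29, a, b) == evaluate(EQ31, a, b)
           for a in range(3) for b in range(3))
print(same, len(EQ29), len(EQ31))
```

A search procedure of the kind suggested above would use the cube count as its fitness and an equivalence check like this one as its correctness guard.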
form which is an expansion over certain transform matrix (i.e., certain tif) and then transform systematically from one form to another form without the need to create all transform matrices from the corresponding s/d trees. this general approach can lead to several algorithms of various complexities that generalize the existing binary algorithms to obtain the corresponding forms such as fprm, grm and if forms. example 3. using the result of example 2 for the expansion of f (x1,x2) in terms of the ternary shannon expansion (that resembles the s/d tree for shannon nodes in both levels): f = 0x1 1 x2 + 2 · 0 x1 2 x2 + 2 · 1 x1 0 x2 + 2 · 1 x1 1 x2 + 2 · 1 x1 2 x2 + 2 x1 0 x2 + 2 · 2 x1 2 x2, (29) we can substitute any of equations (7) (15), or a mix of these equations, to transform one flattened form to another. for example, if we substitute equations (7) and (11), we obtain: f =(2(x1) 2 + 1)(2(x′2) 2 + x′2)+ 2(2(x1) 2 + 1)2x2 + 2(2(x ′ 1) 2 + x′1)(2(x2) 2 + 1) + 2(2(x′1) 2 + x′1)(2(x ′ 2) 2 + x′2)+ 2(2(x ′ 1) 2 + x′1) · 2 x2 + 2 x1(2(x2) 2 + 1) + 2 · 2x1 2 x2, (30) by utilizing gf addition and multiplication operators from figure 1, equation (30) can be transformed to: f =(x1) 2(x′2) 2 + 2(x1) 2(x′2)+ 2(x ′ 2) 2 + x′2 + (x1) 2( 2x2)+ 2( 2 x2)+ 2(x ′ 1) 2(x2) 2 +(x′1) 2 +(x′1)(x2) 2 + 2x′1 + 2(x ′ 1) 2(x′2) 2 +(x′1) 2(x′2)+ (x ′ 1)(x ′ 2) 2 + 2x′1x ′ 2 + (x′1) 2 · 2x2 + 2(x ′ 1) 2 x2 + 2( 2 x1)(x2) 2 + 2x1 + 2( 2 x1)( 2 x2). (31) let us define, as one of possible definitions, the cost of the flattened form (expression) to be: cost = # cubes (32) then, we observe that equation (29) has the cost of seven, while equation (31) has the cost of 19. thus, the inverse transformations applied to equation (31) would lead to equation (29) and a reduction of cost from 19 to seven. using the same approach, we can generate a subset of possible gfsop expressions (flattened forms). 
note that all of these gfsop expressions are equivalent since they produce the same function in different forms. yet, as can be observed from equation (31), by further transformations of equation (29) from one form to another, some transformations produce flattened forms with a smaller number of cubes than the others. from this observation rises the idea of a possible application of evolutionary computing using the s/d trees and related transformations to produce the corresponding minimum gfsops. analogous to the binary case in equation (27), the ternary gifs can be defined as the union of ternary ifs. definition 3. the family of forms, which is created as a union of sets of tifs for all variable orders, is called ternary generalized inclusive forms (tgifs). 109 an extended green sasao hierarchy of canonical ternary galois forms ... form which is an expansion over certain transform matrix (i.e., certain tif) and then transform systematically from one form to another form without the need to create all transform matrices from the corresponding s/d trees. this general approach can lead to several algorithms of various complexities that generalize the existing binary algorithms to obtain the corresponding forms such as fprm, grm and if forms. example 3. using the result of example 2 for the expansion of f (x1,x2) in terms of the ternary shannon expansion (that resembles the s/d tree for shannon nodes in both levels): f = 0x1 1 x2 + 2 · 0 x1 2 x2 + 2 · 1 x1 0 x2 + 2 · 1 x1 1 x2 + 2 · 1 x1 2 x2 + 2 x1 0 x2 + 2 · 2 x1 2 x2, (29) we can substitute any of equations (7) (15), or a mix of these equations, to transform one flattened form to another. 
for example, if we substitute equations (7) and (11), we obtain: f =(2(x1) 2 + 1)(2(x′2) 2 + x′2)+ 2(2(x1) 2 + 1)2x2 + 2(2(x ′ 1) 2 + x′1)(2(x2) 2 + 1) + 2(2(x′1) 2 + x′1)(2(x ′ 2) 2 + x′2)+ 2(2(x ′ 1) 2 + x′1) · 2 x2 + 2 x1(2(x2) 2 + 1) + 2 · 2x1 2 x2, (30) by utilizing gf addition and multiplication operators from figure 1, equation (30) can be transformed to: f =(x1) 2(x′2) 2 + 2(x1) 2(x′2)+ 2(x ′ 2) 2 + x′2 + (x1) 2( 2x2)+ 2( 2 x2)+ 2(x ′ 1) 2(x2) 2 +(x′1) 2 +(x′1)(x2) 2 + 2x′1 + 2(x ′ 1) 2(x′2) 2 +(x′1) 2(x′2)+ (x ′ 1)(x ′ 2) 2 + 2x′1x ′ 2 + (x′1) 2 · 2x2 + 2(x ′ 1) 2 x2 + 2( 2 x1)(x2) 2 + 2x1 + 2( 2 x1)( 2 x2). (31) let us define, as one of possible definitions, the cost of the flattened form (expression) to be: cost = # cubes (32) then, we observe that equation (29) has the cost of seven, while equation (31) has the cost of 19. thus, the inverse transformations applied to equation (31) would lead to equation (29) and a reduction of cost from 19 to seven. using the same approach, we can generate a subset of possible gfsop expressions (flattened forms). note that all of these gfsop expressions are equivalent since they produce the same function in different forms. yet, as can be observed from equation (31), by further transformations of equation (29) from one form to another, some transformations produce flattened forms with a smaller number of cubes than the others. from this observation rises the idea of a possible application of evolutionary computing using the s/d trees and related transformations to produce the corresponding minimum gfsops. analogous to the binary case in equation (27), the ternary gifs can be defined as the union of ternary ifs. definition 3. the family of forms, which is created as a union of sets of tifs for all variable orders, is called ternary generalized inclusive forms (tgifs). 109 an extended green sasao hierarchy of canonical ternary galois forms ... 
form which is an expansion over certain transform matrix (i.e., certain tif) and then transform systematically from one form to another form without the need to create all transform matrices from the corresponding s/d trees. this general approach can lead to several algorithms of various complexities that generalize the existing binary algorithms to obtain the corresponding forms such as fprm, grm and if forms. example 3. using the result of example 2 for the expansion of f (x1,x2) in terms of the ternary shannon expansion (that resembles the s/d tree for shannon nodes in both levels): f = 0x1 1 x2 + 2 · 0 x1 2 x2 + 2 · 1 x1 0 x2 + 2 · 1 x1 1 x2 + 2 · 1 x1 2 x2 + 2 x1 0 x2 + 2 · 2 x1 2 x2, (29) we can substitute any of equations (7) (15), or a mix of these equations, to transform one flattened form to another. for example, if we substitute equations (7) and (11), we obtain: f =(2(x1) 2 + 1)(2(x′2) 2 + x′2)+ 2(2(x1) 2 + 1)2x2 + 2(2(x ′ 1) 2 + x′1)(2(x2) 2 + 1) + 2(2(x′1) 2 + x′1)(2(x ′ 2) 2 + x′2)+ 2(2(x ′ 1) 2 + x′1) · 2 x2 + 2 x1(2(x2) 2 + 1) + 2 · 2x1 2 x2, (30) by utilizing gf addition and multiplication operators from figure 1, equation (30) can be transformed to: f =(x1) 2(x′2) 2 + 2(x1) 2(x′2)+ 2(x ′ 2) 2 + x′2 + (x1) 2( 2x2)+ 2( 2 x2)+ 2(x ′ 1) 2(x2) 2 +(x′1) 2 +(x′1)(x2) 2 + 2x′1 + 2(x ′ 1) 2(x′2) 2 +(x′1) 2(x′2)+ (x ′ 1)(x ′ 2) 2 + 2x′1x ′ 2 + (x′1) 2 · 2x2 + 2(x ′ 1) 2 x2 + 2( 2 x1)(x2) 2 + 2x1 + 2( 2 x1)( 2 x2). (31) let us define, as one of possible definitions, the cost of the flattened form (expression) to be: cost = # cubes (32) then, we observe that equation (29) has the cost of seven, while equation (31) has the cost of 19. thus, the inverse transformations applied to equation (31) would lead to equation (29) and a reduction of cost from 19 to seven. using the same approach, we can generate a subset of possible gfsop expressions (flattened forms). 
note that all of these gfsop expressions are equivalent since they produce the same function in different forms. yet, as can be observed from equation (31), by further transformations of equation (29) from one form to another, some transformations produce flattened forms with a smaller number of cubes than the others. from this observation rises the idea of a possible application of evolutionary computing using the s/d trees and related transformations to produce the corresponding minimum gfsops. analogous to the binary case in equation (27), the ternary gifs can be defined as the union of ternary ifs. definition 3. the family of forms, which is created as a union of sets of tifs for all variable orders, is called ternary generalized inclusive forms (tgifs). 109 an extended green sasao hierarchy of canonical ternary galois forms ... s a’ a ( )a pd 1 a ( )b nd 1 a’ d 1 a ( )d( )c figure 3: two-valued nodes: (a) shannon, (b) positive davio, (c) negative davio, and (d) generalized davio where a ∈ {a,a′}. variable, that is one tree level, figure 4 represents the expansion nodes for ternary shannon, davio and generalized davio (d), respectively. s 0 x 1 x 2 x (a) d0 1 x 2 (b) d2 1 (d) d 1 (e) d1 1 ( ’)x 2 ( ’’)x 2 ( )x 2 (c) x xx’ x’’ figure 4: ternary nodes for ternary decision trees: (a) shannon, (b) davio0, (c) davio1, (d) davio2, and (e) generalized davio (d) where x ∈ {x,x′,x′′}. in correspondence to the binary s/d trees, one can produce ternary s/d trees. to define the ternary s/d trees, we define the generalized davio node over ternary galois radix as shown in figure 4(e). our notation here is that (x) corresponds to the three possible shifts of the variable x as follows: x ∈ {x,x′,x′′}, over gf(3). (28) definition 1. the ternary tree with ternary shannon and ternary generalized davio nodes, that generates other ternary trees, is called ternary shannon-davio (s/d) tree. 
utilizing the definition of ternary shannon in figure 4(a) and ternary generalized davio in figure 4(e), we obtain the ternary shannon-davio trees (ternary s/d trees) for two variables as shown in figure 5. from the ternary s/d dts shown in figure 5, if we take any s/d tree and multiply the second-level cofactors (which are in the tdt leaves) each by the corresponding path in that tdt, and sum all the resulting cubes (terms) over gf(3), we obtain the flattened form of the function f , as a certain gfsop expression. for each tdt in figure 5, there are as many forms obtained for the function f as the number of possible permutations of the polarities of the variables in the second-level branches of each tdt. definition 2. the family of all possible forms obtained per ternary s/d tree is called ternary inclusive forms (tifs). the numbers of these tifs per tdt for two variables are shown on top of each s/d tdt in figure 5. by observing figure 5, we can generate the corresponding flattened forms by two methods. a classical method, per analogy with well-known binary forms, would be to create every transform matrix for every tif s/d tree and then expand using that transform matrix. a better method is to create one flattened 108 56 a. n. al-rabadi an extended green sasao hierarchy of canonical ternary galois forms and universal logic modules 57 an extended green sasao hierarchy of canonical ternary galois forms ... form which is an expansion over certain transform matrix (i.e., certain tif) and then transform systematically from one form to another form without the need to create all transform matrices from the corresponding s/d trees. this general approach can lead to several algorithms of various complexities that generalize the existing binary algorithms to obtain the corresponding forms such as fprm, grm and if forms. example 3. 
Using the result of Example 2 for the expansion of f(x1, x2) in terms of the ternary Shannon expansion (which corresponds to the S/D tree with Shannon nodes in both levels):

$$f = {}^{0}x_1\,{}^{1}x_2 + 2\cdot{}^{0}x_1\,{}^{2}x_2 + 2\cdot{}^{1}x_1\,{}^{0}x_2 + 2\cdot{}^{1}x_1\,{}^{1}x_2 + 2\cdot{}^{1}x_1\,{}^{2}x_2 + {}^{2}x_1\,{}^{0}x_2 + 2\cdot{}^{2}x_1\,{}^{2}x_2, \tag{29}$$

we can substitute any of Equations (7)-(15), or a mix of these equations, to transform one flattened form into another. For example, if we substitute Equations (7) and (11), we obtain:

$$\begin{aligned}
f ={}& \bigl(2(x_1)^2 + 1\bigr)\bigl(2(x_2')^2 + x_2'\bigr) + 2\bigl(2(x_1)^2 + 1\bigr)\,{}^{2}x_2 + 2\bigl(2(x_1')^2 + x_1'\bigr)\bigl(2(x_2)^2 + 1\bigr) \\
&+ 2\bigl(2(x_1')^2 + x_1'\bigr)\bigl(2(x_2')^2 + x_2'\bigr) + 2\bigl(2(x_1')^2 + x_1'\bigr)\cdot{}^{2}x_2 + {}^{2}x_1\bigl(2(x_2)^2 + 1\bigr) + 2\cdot{}^{2}x_1\,{}^{2}x_2.
\end{aligned} \tag{30}$$

By utilizing the GF addition and multiplication operators from Figure 1, Equation (30) can be transformed to:

$$\begin{aligned}
f ={}& (x_1)^2(x_2')^2 + 2(x_1)^2(x_2') + 2(x_2')^2 + x_2' + (x_1)^2({}^{2}x_2) + 2({}^{2}x_2) \\
&+ 2(x_1')^2(x_2)^2 + (x_1')^2 + (x_1')(x_2)^2 + 2x_1' \\
&+ 2(x_1')^2(x_2')^2 + (x_1')^2(x_2') + (x_1')(x_2')^2 + 2x_1'x_2' \\
&+ (x_1')^2\cdot{}^{2}x_2 + 2(x_1')\,{}^{2}x_2 + 2({}^{2}x_1)(x_2)^2 + {}^{2}x_1 + 2({}^{2}x_1)({}^{2}x_2).
\end{aligned} \tag{31}$$

Let us define, as one possible definition, the cost of the flattened form (expression) to be:

$$\text{Cost} = \#\,\text{cubes}. \tag{32}$$

Then we observe that Equation (29) has a cost of seven, while Equation (31) has a cost of 19. Thus, the inverse transformations applied to Equation (31) would lead to Equation (29) and a reduction of cost from 19 to seven. Using the same approach, we can generate a subset of the possible GFSOP expressions (flattened forms). Note that all of these GFSOP expressions are equivalent, since they represent the same function in different forms. Yet, as can be observed from Equation (31), when Equation (29) is transformed from one form to another, some transformations produce flattened forms with a smaller number of cubes than others. From this observation arises the idea of a possible application of evolutionary computing, using the S/D trees and the related transformations, to produce the corresponding minimum GFSOPs.
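The equivalence and the cube counts claimed in Example 3 can be checked by brute force over all nine input pairs. The sketch below is only an illustration, not the paper's procedure: it assumes arithmetic modulo 3, shifts x′ = x + 1, and literals that equal 1 when the variable matches the superscript value; the names `lit`, `f29`, `f30`, `terms31` and `f31` are hypothetical.

```python
from itertools import product

P = 3  # arithmetic is modulo 3, i.e. over GF(3)


def lit(i, x):
    """Literal ^i x: 1 when x == i, otherwise 0."""
    return 1 if x == i else 0


# Equation (29): each cube is (coefficient, i, j), meaning c * ^i x1 * ^j x2.
CUBES_29 = [(1, 0, 1), (2, 0, 2), (2, 1, 0), (2, 1, 1),
            (2, 1, 2), (1, 2, 0), (2, 2, 2)]


def f29(x1, x2):
    return sum(c * lit(i, x1) * lit(j, x2) for c, i, j in CUBES_29) % P


def f30(x1, x2):
    """Equation (30): literals ^0 x and ^1 x replaced by polynomials."""
    a, b = (x1 + 1) % P, (x2 + 1) % P          # assumed shifts x1', x2'
    l0x1, l1x1 = 2 * x1 * x1 + 1, 2 * a * a + a
    l0x2, l1x2 = 2 * x2 * x2 + 1, 2 * b * b + b
    l2x1, l2x2 = lit(2, x1), lit(2, x2)        # literals ^2 x stay as they are
    return (l0x1 * l1x2 + 2 * l0x1 * l2x2 + 2 * l1x1 * l0x2
            + 2 * l1x1 * l1x2 + 2 * l1x1 * l2x2
            + l2x1 * l0x2 + 2 * l2x1 * l2x2) % P


def terms31(x1, x2):
    """The cubes of Equation (31), evaluated at (x1, x2), in printed order."""
    a, b = (x1 + 1) % P, (x2 + 1) % P
    t1, t2 = lit(2, x1), lit(2, x2)
    return [
        x1 * x1 * b * b, 2 * x1 * x1 * b, 2 * b * b, b,
        x1 * x1 * t2, 2 * t2,
        2 * a * a * x2 * x2, a * a, a * x2 * x2, 2 * a,
        2 * a * a * b * b, a * a * b, a * b * b, 2 * a * b,
        a * a * t2, 2 * a * t2,
        2 * t1 * x2 * x2, t1,
        2 * t1 * t2,
    ]


def f31(x1, x2):
    return sum(terms31(x1, x2)) % P


COST_29 = len(CUBES_29)            # cost as defined in Equation (32)
COST_31 = len(terms31(0, 0))
EQUIVALENT = all(f29(x1, x2) == f30(x1, x2) == f31(x1, x2)
                 for x1, x2 in product(range(P), repeat=2))
print(COST_29, COST_31, EQUIVALENT)
```

A search procedure that looks for minimum GFSOPs could use the same kind of pointwise check to confirm that a candidate transformation preserves the function before comparing costs.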
Analogous to the binary case in Equation (27), the ternary GIFs can be defined as the union of ternary IFs.

Definition 3. The family of forms that is created as a union of sets of TIFs for all variable orders is called ternary generalized inclusive forms (TGIFs).
[Figure 5: Ternary IF (TIF) S/D trees and their numbers: (a) two-variable order {a, b} and (b) two-variable order {b, a}. Variables {a, b} are defined as in Eq. (28) for generalized Davio (D), where in this figure the labels "a," and "b," stand for the generalized literals (a) and (b). Panel (a): the numbers of TIFs shown on top of the 16 S/D trees are N = 1, 81, 729, 59049, 9, 81, 6561, 59049, 9, 81, 6561, 59049, 9, 729, 6561, 531441.]

Theorem 2. The total number of ternary IFs, for two variables and for orders {a, b} and {b, a}, and the total number of ternary generalized IFs, for two variables, are respectively:

$$\#\mathrm{TIF}_{a,b} = 1\cdot 3^{0} + 3\cdot 3^{2} + 3\cdot 3^{4} + 2\cdot 3^{6} + 3\cdot 3^{8} + 3\cdot 3^{10} + 1\cdot 3^{12} = 730{,}000, \tag{33}$$

$$\#\mathrm{TIF}_{b,a} = 1\cdot 3^{0} + 3\cdot 3^{2} + 3\cdot 3^{4} + 2\cdot 3^{6} + 3\cdot 3^{8} + 3\cdot 3^{10} + 1\cdot 3^{12} = 730{,}000, \tag{34}$$

$$\begin{aligned}
\#\mathrm{TGIF} &= \#\mathrm{TIF}_{a,b} + \#\mathrm{TIF}_{b,a} - \#(\mathrm{TIF}_{a,b} \cap \mathrm{TIF}_{b,a}) = 2\cdot\#\mathrm{TIF} - \#(\mathrm{TIF}_{a,b} \cap \mathrm{TIF}_{b,a}) \\
&= 2\cdot(730{,}000) - \bigl(1\cdot 3^{0} + 2\cdot 3^{6} + 1\cdot 3^{12}\bigr) = 927{,}100.
\end{aligned} \tag{35}$$

Proof. By observing Figure 5, we note that the total number of TIFs for each of the orders {a, b} and {b, a} is the sum of the numbers on top of the S/D trees, which leads to Equations (33) and (34). Subtracting the forms that are common to both orders from the sum of the two totals leads to Equation (35).
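The arithmetic in Theorem 2 can be reproduced directly from the numbers printed above the trees in Figure 5. The following sketch simply sums those values; the list `N_PER_TREE` is transcribed from the figure, the overlap term is copied from Equation (35), and the variable names are illustrative.

```python
from collections import Counter

# N values printed on top of the 16 S/D trees of Figure 5 (one variable order).
N_PER_TREE = [1, 81, 729, 59049, 9, 81, 6561, 59049,
              9, 81, 6561, 59049, 9, 729, 6561, 531441]


def count_tifs(n_per_tree):
    """Total number of TIFs for one variable order: the sum over all trees."""
    return sum(n_per_tree)


tif_ab = count_tifs(N_PER_TREE)   # order {a, b}, Equation (33)
tif_ba = count_tifs(N_PER_TREE)   # order {b, a} lists the same numbers, Equation (34)

# Forms shared by both orders, as given inside Equation (35).
overlap = 1 * 3**0 + 2 * 3**6 + 1 * 3**12

tgif = tif_ab + tif_ba - overlap  # inclusion-exclusion, Equation (35)

# How often each power of 3 occurs, matching the coefficients in Equation (33).
multiplicity = Counter(N_PER_TREE)

print(tif_ab, tif_ba, overlap, tgif)
```

Grouping the sixteen values by magnitude gives the coefficients 1, 3, 3, 2, 3, 3, 1 that multiply the powers of 3 in Equations (33) and (34).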
2.2 Properties of TIFs and TGIFs

The following are basic properties of the presented TIFs and TGIFs.

Theorem 3. Each ternary inclusive form (TIF) is canonical.
[Figure 5 (continued): panel (b), TIF S/D trees for the two-variable order {b, a}. The numbers of TIFs shown on top of the 16 S/D trees are N = 1, 81, 729, 59049, 9, 81, 6561, 59049, 9, 81, 6561, 59049, 9, 729, 6561, 531441.]

Proof (of Theorem 3). An expansion is canonical iff its terms are linearly independent, that is, none of the terms is equal to a linear combination of the other terms. Using this fact, it was proven that all GF(2) IFs are canonical. Analogously over GF(3), for any function f of the same number of variables, there exists one and only one set of coefficients a_i such that f is uniquely TIF-expandable using a_i, and thus the resulting expression is canonical. By induction on the number of variables, the terms in TIFs will be linearly independent and thus canonical.

Theorem 4. Ternary generalized inclusive forms (TGIFs) are canonical with respect to a given variable order.

Proof. Since a TGIF is the union of the corresponding TIFs, and since each TIF is canonical, the resulting union of the corresponding canonical TIFs will also be canonical. For different variable orderings, some forms are not repeated while other forms are.
Therefore the union of sets of TIFs for all variable orders contains more forms than any of the TIF sets taken separately and fewer forms than the total sum of all TIFs. Generalized inclusive forms include other forms such as GRMs over GF(3), as can be shown by considering all possible combinations of literals for all possible orders of variables. If we relax the requirement of fixed variable ordering, and allow any ordering of variables in the branches of the tree but do not allow repetitions of variables in the branches, we generate a more general GF(3) family of forms.
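The criterion used in the proof of Theorem 3, that an expansion is canonical iff its terms are linearly independent, can be illustrated numerically for two variables. The sketch below is not the paper's proof: it evaluates a candidate set of nine terms at all nine points of GF(3) × GF(3) and computes the rank of the resulting matrix modulo 3. The three term sets and all names are hypothetical examples chosen for illustration.

```python
from itertools import product

P = 3  # GF(3)
POINTS = list(product(range(P), repeat=2))  # all (x1, x2) pairs


def rank_mod_p(rows, p=P):
    """Rank of an integer matrix over GF(p) by Gauss-Jordan elimination."""
    m = [[v % p for v in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], p - 2, p)  # inverse via Fermat's little theorem
        m[rank] = [(v * inv) % p for v in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][col]:
                factor = m[r][col]
                m[r] = [(a - factor * b) % p for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def term_matrix(terms):
    """One row per term, one column per input point."""
    return [[t(x1, x2) % P for x1, x2 in POINTS] for t in terms]


# Shannon-type terms: products of literals ^i x1 * ^j x2.
shannon_terms = [lambda x1, x2, i=i, j=j: int(x1 == i) * int(x2 == j)
                 for i in range(P) for j in range(P)]

# Davio-type terms: products of powers x1^i * x2^j.
davio_terms = [lambda x1, x2, i=i, j=j: (x1 ** i) * (x2 ** j)
               for i in range(P) for j in range(P)]

# A deliberately bad set: {1, x1, x1 + 1} in the first variable is dependent.
first = [lambda x: 1, lambda x: x, lambda x: x + 1]
dependent_terms = [lambda x1, x2, g=g, j=j: g(x1) * (x2 ** j)
                   for g in first for j in range(P)]

RANKS = {name: rank_mod_p(term_matrix(ts))
         for name, ts in [("shannon", shannon_terms),
                          ("davio", davio_terms),
                          ("dependent", dependent_terms)]}
print(RANKS)
```

A rank of nine means every two-variable ternary function has exactly one coefficient vector in that term set, which is the canonicity property; a lower rank means some functions cannot be expressed or are expressed non-uniquely.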
Definition 4. The family of forms generated by an S/D tree with no fixed ordering of variables, and with variables not repeated along the same branches, is called ternary free generalized inclusive forms (TFGIFs).

2.3 An extended Green-Sasao hierarchy with a new sub-family for ternary Reed-Muller logic

The Green-Sasao hierarchy of families of canonical forms [1, 10-13, 17, 24, 26] and the corresponding decision trees and diagrams is based on three generic expansions: the Shannon, positive Davio and negative Davio expansions.
For example, this includes Shannon decision trees and diagrams, positive and negative Davio decision trees and diagrams, fixed polarity Reed-Muller decision trees and diagrams, Kronecker decision trees and diagrams, pseudo Reed-Muller decision trees and diagrams, pseudo Kronecker decision trees and diagrams, and linearly-independent decision trees and diagrams [1, 17, 24]. A set-theoretic relationship between families of canonical forms over GF(2) was proposed and extended by introducing binary IF, GIF and free GIF (FGIF) forms; Figure 6(a) illustrates the set-theoretic relationship between the various families of canonical forms over GF(2). Analogously to the Green-Sasao hierarchy of binary Reed-Muller families of spectral transforms over GF(2) shown in Figure 6(a), we will introduce the extended Green-Sasao hierarchy of spectral transforms, with a new sub-family for ternary Reed-Muller logic over GF(3). Definitions 2-4 defined the ternary inclusive forms, ternary generalized inclusive forms and ternary free generalized inclusive forms, respectively. Analogously to the binary Reed-Muller case, the following definitions are introduced for the corresponding canonical expressions over GF(3).

Definition 5. The decision tree that results from applying the ternary Shannon expansion in Equation (23) recursively to a ternary-input ternary-output logic function (i.e., at all levels of a DT) is called a ternary Shannon decision tree (TSDT). The resulting expression (flattened form) from the TSDT is called a ternary Shannon expression.

Definition 6. The decision trees that result from applying the ternary Davio expansions in Equations (24)-(26) recursively to a ternary-input ternary-output logic function (i.e., at all levels of a DT) are called the ternary zero-polarity Davio decision tree (TD0DT), the ternary first-polarity Davio decision tree (TD1DT) and the ternary second-polarity Davio decision tree (TD2DT), respectively.
the resulting expressions (flattened forms) from td0dt, td1dt and td2dt are called td0, td1 and td2 expressions, respectively. definition 7. the decision tree that results from applying any of the ternary davio expansions (nodes) for all nodes in each level (variable) in the dt is called ternary reed-muller decision tree (trmdt). the corresponding expression is called ternary fixed polarity reed-muller (tfprm) expression. definition 8. the decision tree that results from using any of the ternary shannon (s) or davio (d0, d1 or d2) expansions (nodes) for all nodes in each level (variable) in the dt (that has fixed order of variables) is called ternary kronecker decision tree (tkrodt). the resulting expression is called ternary kronecker expression. definition 9. the decision tree that results from using any of the ternary davio expansions (nodes) for each node (per level) of the dt is called ternary pseudo-reed112 an extended green sasao hierarchy of canonical ternary galois forms ... definition 4. the family of forms, generated by s/d tree with no fixed ordering of variables, with variables not repeated along same branches, is called ternary free generalized inclusive forms (tfgifs). 2.3 an extended green-sasao hierarchy with a new sub-family for ternary reed-muller logic the green-sasao hierarchy of families of canonical forms [1, 10-13, 17, 24, 26] and the corresponding decision trees and diagrams is based on three generic expansions, shannon, positive and negative davio expansions. for example, this includes shannon decision trees and diagrams, positive and negative davio decision trees and diagrams, fixed polarity reed-muller decision trees and diagrams, kronecker decision trees and diagrams, pseudo reed-muller decision trees and diagrams, pseudo kronecker decision trees and diagrams, and linearly-independent decision trees and diagrams [1, 17, 24]. 
a set-theoretic relationship between families of canonical forms over gf(2) was proposed and extended by introducing binary if, gif and free gif (fgif) forms where figure 6(a) illustrates set-theoretic relationship between the various families of canonical forms over gf(2). analogously to the green-sasao hierarchy of binary reed-muller families of spectral transforms over gf(2) that is shown in figure 6(a), we will introduce the extended green-sasao hierarchy of spectral transforms, with a new sub-family for ternary reedmuller logic over gf(3). while definitions 2 4 defined the ternary inclusive forms, ternary generalized inclusive forms and ternary free generalized inclusive forms, respectively, and analogously to the binary reed-muller case, the following definitions are introduced for the corresponding canonical expressions over gf(3). definition 5. the decision tree that results from applying the ternary shannon expansion in equation (23) recursively to a ternary input-ternary output logic function (i.e., all levels in a dt) is called ternary shannon decision tree (tsdt). the result expression (flattened form) from the tsdt is called ternary shannon expression. definition 6. the decision trees that result from applying the ternary davio expansions in equations (24) (26) recursively to a ternary-input ternary-output logic function (i.e., all levels in a dt) are called: ternary zero-polarity davio decision tree (td0dt), ternary first-polarity davio decision tree (td1dt) and ternary secondpolarity davio decision tree (td2dt), respectively. the resulting expressions (flattened forms) from td0dt, td1dt and td2dt are called td0, td1 and td2 expressions, respectively. definition 7. the decision tree that results from applying any of the ternary davio expansions (nodes) for all nodes in each level (variable) in the dt is called ternary reed-muller decision tree (trmdt). the corresponding expression is called ternary fixed polarity reed-muller (tfprm) expression. 
definition 8. the decision tree that results from using any of the ternary shannon (s) or davio (d0, d1 or d2) expansions (nodes) for all nodes in each level (variable) in the dt (that has fixed order of variables) is called ternary kronecker decision tree (tkrodt). the resulting expression is called ternary kronecker expression. definition 9. the decision tree that results from using any of the ternary davio expansions (nodes) for each node (per level) of the dt is called ternary pseudo-reed112 an extended green sasao hierarchy of canonical ternary galois forms ... definition 4. the family of forms, generated by s/d tree with no fixed ordering of variables, with variables not repeated along same branches, is called ternary free generalized inclusive forms (tfgifs). 2.3 an extended green-sasao hierarchy with a new sub-family for ternary reed-muller logic the green-sasao hierarchy of families of canonical forms [1, 10-13, 17, 24, 26] and the corresponding decision trees and diagrams is based on three generic expansions, shannon, positive and negative davio expansions. for example, this includes shannon decision trees and diagrams, positive and negative davio decision trees and diagrams, fixed polarity reed-muller decision trees and diagrams, kronecker decision trees and diagrams, pseudo reed-muller decision trees and diagrams, pseudo kronecker decision trees and diagrams, and linearly-independent decision trees and diagrams [1, 17, 24]. a set-theoretic relationship between families of canonical forms over gf(2) was proposed and extended by introducing binary if, gif and free gif (fgif) forms where figure 6(a) illustrates set-theoretic relationship between the various families of canonical forms over gf(2). 
analogously to the green-sasao hierarchy of binary reed-muller families of spectral transforms over gf(2) that is shown in figure 6(a), we will introduce the extended green-sasao hierarchy of spectral transforms, with a new sub-family for ternary reedmuller logic over gf(3). while definitions 2 4 defined the ternary inclusive forms, ternary generalized inclusive forms and ternary free generalized inclusive forms, respectively, and analogously to the binary reed-muller case, the following definitions are introduced for the corresponding canonical expressions over gf(3). definition 5. the decision tree that results from applying the ternary shannon expansion in equation (23) recursively to a ternary input-ternary output logic function (i.e., all levels in a dt) is called ternary shannon decision tree (tsdt). the result expression (flattened form) from the tsdt is called ternary shannon expression. definition 6. the decision trees that result from applying the ternary davio expansions in equations (24) (26) recursively to a ternary-input ternary-output logic function (i.e., all levels in a dt) are called: ternary zero-polarity davio decision tree (td0dt), ternary first-polarity davio decision tree (td1dt) and ternary secondpolarity davio decision tree (td2dt), respectively. the resulting expressions (flattened forms) from td0dt, td1dt and td2dt are called td0, td1 and td2 expressions, respectively. definition 7. the decision tree that results from applying any of the ternary davio expansions (nodes) for all nodes in each level (variable) in the dt is called ternary reed-muller decision tree (trmdt). the corresponding expression is called ternary fixed polarity reed-muller (tfprm) expression. definition 8. the decision tree that results from using any of the ternary shannon (s) or davio (d0, d1 or d2) expansions (nodes) for all nodes in each level (variable) in the dt (that has fixed order of variables) is called ternary kronecker decision tree (tkrodt). 
the resulting expression is called ternary kronecker expression. definition 9. the decision tree that results from using any of the ternary davio expansions (nodes) for each node (per level) of the dt is called ternary pseudo-reed112 an extended green sasao hierarchy of canonical ternary galois forms ... definition 4. the family of forms, generated by s/d tree with no fixed ordering of variables, with variables not repeated along same branches, is called ternary free generalized inclusive forms (tfgifs). 2.3 an extended green-sasao hierarchy with a new sub-family for ternary reed-muller logic the green-sasao hierarchy of families of canonical forms [1, 10-13, 17, 24, 26] and the corresponding decision trees and diagrams is based on three generic expansions, shannon, positive and negative davio expansions. for example, this includes shannon decision trees and diagrams, positive and negative davio decision trees and diagrams, fixed polarity reed-muller decision trees and diagrams, kronecker decision trees and diagrams, pseudo reed-muller decision trees and diagrams, pseudo kronecker decision trees and diagrams, and linearly-independent decision trees and diagrams [1, 17, 24]. a set-theoretic relationship between families of canonical forms over gf(2) was proposed and extended by introducing binary if, gif and free gif (fgif) forms where figure 6(a) illustrates set-theoretic relationship between the various families of canonical forms over gf(2). analogously to the green-sasao hierarchy of binary reed-muller families of spectral transforms over gf(2) that is shown in figure 6(a), we will introduce the extended green-sasao hierarchy of spectral transforms, with a new sub-family for ternary reedmuller logic over gf(3). 
while definitions 2 4 defined the ternary inclusive forms, ternary generalized inclusive forms and ternary free generalized inclusive forms, respectively, and analogously to the binary reed-muller case, the following definitions are introduced for the corresponding canonical expressions over gf(3). definition 5. the decision tree that results from applying the ternary shannon expansion in equation (23) recursively to a ternary input-ternary output logic function (i.e., all levels in a dt) is called ternary shannon decision tree (tsdt). the result expression (flattened form) from the tsdt is called ternary shannon expression. definition 6. the decision trees that result from applying the ternary davio expansions in equations (24) (26) recursively to a ternary-input ternary-output logic function (i.e., all levels in a dt) are called: ternary zero-polarity davio decision tree (td0dt), ternary first-polarity davio decision tree (td1dt) and ternary secondpolarity davio decision tree (td2dt), respectively. the resulting expressions (flattened forms) from td0dt, td1dt and td2dt are called td0, td1 and td2 expressions, respectively. definition 7. the decision tree that results from applying any of the ternary davio expansions (nodes) for all nodes in each level (variable) in the dt is called ternary reed-muller decision tree (trmdt). the corresponding expression is called ternary fixed polarity reed-muller (tfprm) expression. definition 8. the decision tree that results from using any of the ternary shannon (s) or davio (d0, d1 or d2) expansions (nodes) for all nodes in each level (variable) in the dt (that has fixed order of variables) is called ternary kronecker decision tree (tkrodt). the resulting expression is called ternary kronecker expression. definition 9. the decision tree that results from using any of the ternary davio expansions (nodes) for each node (per level) of the dt is called ternary pseudo-reed112 60 a. n. 
al-rabadi an extended green sasao hierarchy of canonical ternary galois forms and universal logic modules 61 an extended green sasao hierarchy of canonical ternary galois forms ... muller decision tree (tprmdt). the resulting expression is called ternary pseudoreed-muller expression. definition 10. the decision tree that results from using any of the ternary shannon expansion or ternary davio expansions (nodes) for each node (per level) of the dt is called ternary pseudo-kronecker decision tree (tpkrodt). the resulting expression is called ternary pseudo-kronecker expression. definition 11. the decision tree that results from using any of the ternary shannon expansion or ternary davio expansions (nodes) for each node (per level) of the dt, disregarding order of variables, provided that variables are not repeated along the same branches, is called ternary free kronecker decision tree (tfkrodt). the result is called ternary free-kronecker expression. definition 12. the ternary kronecker dt that has at least one ternary generalized reed-muller expansion node is called ternary generalized kronecker decision tree (tgkdt). the result is called ternary generalized kronecker expression. definition 13. the ternary kronecker dt that has at least one tgif node is called ternary generalized inclusive forms kronecker (tgifk) decision tree. the result is called ternary generalized inclusive form kronecker expression. figure 6(b) illustrates the extended gf(3) green-sasao hierarchy with the new subfamily (tgifk). the presented tgif nodes can be realized with universal logic modules (ulms) for pairs of variables, as will be shown in section 3, which is analogous to what was done for the binary case. although the s/d trees that have been developed so far are for the ternary radix, similar s/d trees can be developed as well for higher galois radices of gf( pk) where p is a prime number and k is a natural number ≥ 1. 
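definitions 5 and 6 rest on the ternary shannon and davio expansions of equations (23)-(26), which are not reproduced in this part of the text. the following python sketch therefore assumes their usual gf(3) form: the shannon expansion selects a cofactor with a literal that is 1 when x = i and 0 otherwise, and the zero-polarity davio expansion is the interpolation polynomial in 1, x and x^2. the coefficient formulas are derived here by interpolation modulo 3 and are only assumed to agree with equation (24); all function names are illustrative and not taken from the paper.

```python
from itertools import product

P = 3  # field size: all arithmetic is modulo 3


def literal(i, x):
    """literal that equals 1 when x == i and 0 otherwise."""
    return 1 if x == i else 0


def shannon(cof, x):
    """assumed ternary shannon form: f = lit0(x)*f0 + lit1(x)*f1 + lit2(x)*f2."""
    return sum(literal(i, x) * cof[i] for i in range(P)) % P


def davio0_coeffs(cof):
    """coefficients of 1, x, x^2 from interpolating f(0), f(1), f(2) over gf(3)."""
    f0, f1, f2 = cof
    return (f0 % P, (2 * f1 + f2) % P, (2 * (f0 + f1 + f2)) % P)


def davio0(cof, x):
    """assumed zero-polarity davio form: f = a0 + a1*x + a2*x^2."""
    a0, a1, a2 = davio0_coeffs(cof)
    return (a0 + a1 * x + a2 * x * x) % P


def check_all():
    """check both expansions against all 27 one-variable ternary functions."""
    for cof in product(range(P), repeat=P):
        for x in range(P):
            if shannon(cof, x) != cof[x] or davio0(cof, x) != cof[x]:
                return False
    return True


if __name__ == "__main__":
    print(check_all())
```

applying either function recursively, one variable per level, corresponds to building the tsdt or td0dt of definitions 5 and 6; the sketch covers a single node only.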
[figure 6 panel labels: (a) esop, canonical forms, fgif, gif, if, pgk, gk, grm, pkro, kro, fprm; (b) ternary gfsop, ternary canonical forms, tfgif, tgif, tif, tpgk, tgk, tgrm, tpkro, tkro, tfprm, tgifk]
figure 6: green-sasao hierarchy: (a) set-theoretic relationship between gf(2) families of canonical forms, and (b) the extended green-sasao hierarchy with a new sub-family (tgifk) for gf(3) reed-muller logic.
3 universal logic modules for the circuit realization of s/d trees
the nonsingular expansions of ternary shannon (s) and ternary davio (d0, d1 and d2) can be realized using a universal logic module (ulm) whose control variables correspond to the variables of the basis functions, i.e., the variables we are expanding upon. we call it a universal logic module because, similarly to a multiplexer, all functions of two variables can be realized with two-level trees of such modules using constants on the second-level data inputs. the presented ulms are complete systems because they can implement all possible functions of a given number of variables.
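the statement that a two-level tree of modules with constants on the second-level data inputs realizes every two-variable function can be checked exhaustively for the shannon case. the sketch below models the shannon ulm as a three-input selector, which is an assumption about its behavior and not a description of the circuit in figure 7; ulm_shannon, two_level_tree and verify_all are hypothetical names.

```python
from itertools import product


def ulm_shannon(x, d0, d1, d2):
    """multiplexer-like module: control value x in {0, 1, 2} selects one data input."""
    return (d0, d1, d2)[x]


def two_level_tree(table, x, y):
    """table[x][y] holds f(x, y); its nine entries are the constant data inputs."""
    # three second-level modules, each controlled by y and fed with constants
    second = [ulm_shannon(y, *table[i]) for i in range(3)]
    # one root module controlled by x, fed by the second-level outputs
    return ulm_shannon(x, *second)


def verify_all():
    """check the four-module tree against all 3**9 two-variable ternary functions."""
    for flat in product(range(3), repeat=9):
        table = [flat[0:3], flat[3:6], flat[6:9]]
        for x, y in product(range(3), repeat=2):
            if two_level_tree(table, x, y) != table[x][y]:
                return False
    return True


if __name__ == "__main__":
    print(verify_all())
```

the tree uses one root module and three second-level modules, four in total; a second-level module whose three constants are equal could be replaced by a constant wire.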
the concept of the universal logic module was used for binary rm logic over gf(2), as well as for the general case of linearly independent (li) logic, which includes rm logic as a special case. binary li logic extended the universal logic module from being just a multiplexer (shannon expansion), an and/exor gate (positive davio expansion) or an and/exor/not gate with inverted control variable (negative davio expansion) to universal logic modules for any expansion over any linearly independent basis functions. analogously to the binary case, figure 7 presents universal logic modules for ternary shannon and ternary davio, respectively. note that any function f can be produced by applying the independent variable x and the cofactors {f_i, f_j, f_k} as inputs to a ulm. the form of the resulting function depends on the shift and power operations chosen inside the ulm for the input independent variable, and on the chosen weighted combinations of the input cofactors. using this observation, we can combine all davio ulms into a single all-davio ulm, which is illustrated in figure 7(c). the more general ulm shown in figure 7(d) can likewise be generated to implement all ternary gf(3) shannon and davio expansions. in general, the gates in the ulms can be implemented, among other circuit technologies, using binary logic over gf(2) or using multiple-valued circuit gates. each ternary ulm corresponds to a single node of the ternary dts illustrated previously. the main advantage of such powerful ulms is their high layout regularity, which is required by future nanotechnologies: the trees can be realized in an efficient layout because they do not grow exponentially for practical functions. for instance, assuming the ulm from figure 7, although every two-variable function can be realized with four such modules, it is highly probable that most two-variable functions will require fewer than four modules.
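the dependence of the ulm output on a shift of the input variable and on weighted combinations of the cofactors can be sketched as follows. this is a behavioral model only: it assumes that polarity s means expanding in powers of t = x + s (mod 3), and that the module labelled d1 uses s = 1 and d2 uses s = 2; the internal structure of figures 7(c) and 7(d) is not available here, so davio_ulm and general_ulm are hypothetical stand-ins.

```python
from itertools import product

P = 3  # all arithmetic is modulo 3


def davio_ulm(x, shift, cof):
    """all-davio module: cof = (f(0), f(1), f(2)), basis 1, t, t^2 with t = x + shift."""
    # cofactors as seen from the shifted variable t: g[t] = f(t - shift)
    g = [cof[(t - shift) % P] for t in range(P)]
    # weighted combinations of cofactors (interpolation over gf(3))
    a0 = g[0]
    a1 = (2 * g[1] + g[2]) % P
    a2 = (2 * (g[0] + g[1] + g[2])) % P
    t = (x + shift) % P  # shift operation on the control variable
    return (a0 + a1 * t + a2 * t * t) % P  # power operations and weighted sum


def general_ulm(x, kind, cof):
    """general module: kind is 's' for shannon or 'd0', 'd1', 'd2' for davio."""
    if kind == "s":
        return cof[x]
    return davio_ulm(x, {"d0": 0, "d1": 1, "d2": 2}[kind], cof)


def check_all():
    """every expansion type must reproduce f(x) for all 27 one-variable functions."""
    return all(
        general_ulm(x, kind, cof) == cof[x]
        for cof in product(range(P), repeat=P)
        for kind in ("s", "d0", "d1", "d2")
        for x in range(P)
    )


if __name__ == "__main__":
    print(check_all())
```

all four node types compute the same function value; they differ only in the coefficients (a0, a1, a2) that would appear in the flattened expression, which is what distinguishes the corresponding decision trees.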
because of these properties, this approach is further expected to give good results when applied to the corresponding incompletely specified functions and multiple-valued relations.
4 conclusions and future work
in this paper, an extended green-sasao hierarchy of families and forms is introduced. analogously to the binary case, two general families of canonical ternary reed-muller forms are presented: ternary inclusive forms (tifs) and their generalization, ternary generalized inclusive forms (tgifs). the second family includes minimum gf(3) galois field sum-of-products (gfsops). the multiple-valued shannon-davio (s/d) trees developed in this paper provide more general polarity with regard to the
3 universal logic modules for the circuit realization of s/d trees the nonsingular expansions of ternary shannon (s) and ternary davio (d0, d1 and d2), can be realized using a universal logic module (ulm) with control variables corresponding to the variables of the basis functions which are the variables we are expanding upon. we call it a universal logic module, because similarly to a multiplexer, all functions of two variables can be realized with two-level trees of such modules using constants on the second-level data inputs. the presented ulms are complete systems because they can implement all possible functions with certain number of variables. the concept of the universal logic module was used for binary rm logic over gf(2), as well as the general case of linearly independent (li) logic that includes rm logic as a special case. binary li logic extended the universal logic module from just being a multiplexer (shannon expansion), and/exor gate (positive davio expansion) and and/exor/not gate with inverted control variable (negative davio expansion), to the universal logic modules for any expansion over any linearly independent basis functions. analogously to the binary case, figure 7 presents universal logic modules for ternary shannon and ternary davio, respectively. one can note, that any function f can be produced by the application of the independent variable x and the cofactors { fi, f j and fk} as inputs to a ulm. the form of the resulting function depends on our choice of the shift and power operations that we choose inside the ulm for the input independent variable, and on our choice of the weighted combinations of the input cofactors. utilizing this note, we can combine all davio ulms to create the single all-davio ulm, where figure 7(c) illustrates this ulm. also, the more general ulm as shown in figure 7(d), can be generated to implement all ternary gf(3) shannon and davio expansions. 
in general, the gates in the ulms can be implemented, among other circuit technologies, by using binary logic over gf(2) or using multiple-valued circuit gates. each ternary ulm corresponds to a single node in the nodes of ternary dts that were illustrated previously. the main advantage of such powerful ulms is in high layout regularity that is required by future nanotechnologies, where the trees can be realized in efficient layout because they do not grow exponentially for practical functions. for instance, assuming ulm from figure 7, although every two-variable function can be realized with four such modules, it is highly probable that most of two-variable functions will require less than four modules. because of these properties, this approach is further expected to give good results when applied to the corresponding incompletely specified functions and multiple-valued relations. 4 conclusions and future work in this paper, an extended green-sasao hierarchy of families and forms is introduced. analogously to the binary case, two general families of canonical ternary reed-muller forms, called ternary inclusive forms (tifs) and their generalization of ternary generalized inclusive forms (tgifs), are presented. the second family includes minimum gf(3) galois field sum-of-products (gfsops). the multiple-valued shannon-davio (s/d) trees developed in this paper provide more general polarity with regards to the 114 an extended green sasao hierarchy of canonical ternary galois forms ... 
• • x f0 f2 f1 0 x 2 x 1 x x f1 f0 +f 1 +f 2 2f2 +f 0 1 2( ”)x 2 x” x f0 f0 +f 1 +f 2 2f1 +f 2 1 2( )x 2 x 2( ’)x 2 x f2 f0 +f 1 +f 2 2f0 +f 1 1 x’ + + + + + 0 x 1 1 x 2 x 2 x x’ x” fi fk fj f x 1 2 � � � � � � � + +1 fi fk fj f +1 (a) (b) (c) (d) figure 7: various ternary ulms: (a) ulm of gf(3) shannon using three gf(3) multiplication gates and one gf(3) addition gate, (b) ulms of gf(3) davio {d0,d1,d2} using three gf(3) multiplication gates and one gf(3) addition gate, (c) ulm for all gf(3) davio expansions using four gf(3) multiplication gates, one gf(3) addition gate, two shift operators, and one multiplexer, and (d) general ulm of gf(3) s/d node using four gf(3) multiplication gates, one gf(3) addition gate and four multiplexers. corresponding if polarity, which contains the grm as a special case. since universal logic modules (ulms) are complete systems that can implement all possible logic functions utilizing s/d expansions of multiple-valued shannon and davio decompositions, the realization of the presented s/d trees utilizing the corresponding ulms is also introduced. the application of the presented tifs and tgifs can be used to find minimum gfsop for multiple-valued inputs-outputs within logic synthesis, where a gfsop minimizer based on if polarity can be used to minimize the corresponding multiple-valued gfsop expressions. 115 an extended green sasao hierarchy of canonical ternary galois forms ... 
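the gf(3) shannon and davio ulms named in the caption of figure 7 can be sketched in software as follows. this is a minimal illustration rather than the circuit realization itself: the function names, the `shift` parameter used to select the davio variant, and the helper `all_ulms_agree` are assumptions introduced here, and the davio coefficients are the standard gf(3) polynomial interpolation of the three cofactors, which the sketch checks exhaustively.

```python
from itertools import product

P = 3  # field size; all arithmetic below is taken modulo 3


def literal(x, i):
    """Selector literal: 1 when x == i, else 0."""
    return 1 if x == i else 0


def shannon_ulm(x, f0, f1, f2):
    """gf(3) shannon node: three products combined by one gf(3) sum."""
    return (literal(x, 0) * f0 + literal(x, 1) * f1 + literal(x, 2) * f2) % P


def davio_ulm(x, f0, f1, f2, shift=0):
    """gf(3) davio node expanded about the shifted variable y = x + shift.

    shift = 0, 1, 2 selects one of the three davio variants (hypothetical
    parameterization; the paper's d0/d1/d2 naming is not assumed here).
    """
    cof = (f0, f1, f2)
    y = (x + shift) % P
    # Re-index the cofactors by y: g[j] is the value of f where y == j.
    g = [cof[(j - shift) % P] for j in range(P)]
    c0 = g[0]                            # coefficient of 1
    c1 = (2 * g[1] + g[2]) % P           # coefficient of y
    c2 = (2 * (g[0] + g[1] + g[2])) % P  # coefficient of y^2
    return (c0 + c1 * y + c2 * y * y) % P


def all_ulms_agree():
    """Check every node against the cofactor it should select."""
    for f0, f1, f2 in product(range(P), repeat=3):
        for x in range(P):
            expected = (f0, f1, f2)[x]
            if shannon_ulm(x, f0, f1, f2) != expected:
                return False
            for s in range(P):
                if davio_ulm(x, f0, f1, f2, shift=s) != expected:
                    return False
    return True
```

calling `all_ulms_agree()` runs through all 27 cofactor triples and all three values of x, so it confirms that each expansion reproduces the function it was built from.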
future work will include the following items: 1.
the investigation of the various multiple-valued and two-valued techniques for the ulm implementation for the important case of quaternary logic. since the ulm realization over gf(3) extends and generalizes the binary gf(2) ulm realization, in which most digital system designs are performed, the extension to gf(4) becomes important because the gf(4) ulm can be realized with the existing and widely used implementations over gf(2). this two-valued realization of the quaternary ulm can be done for gf(4) addition utilizing gf(2) addition using a vector of exors and (4/2) and (2/4) decoders, and for gf(4) multiplication utilizing gf(2) operations using vectors of exors, ands, and (4/2) and (2/4) decoders. therefore, the gf(4) ulm producing the quaternary shannon and all davio expansions can be achieved using the corresponding multiplexers and gf(4) additions and multiplications, which can in turn be realized using gf(2) methods.
2. the implementation of the introduced ulms using nanotechnology methods such as quantum computing, quantum dots and carbon nanotubes.
3. the utilization of evolutionary algorithms for function minimization using the introduced s/d trees.
4. the investigation of other, more complex types of literals, such as the presented generalized literal (gl) and universal literal (ul), to expand upon and consequently construct the corresponding new ulms.
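the two-valued realization of gf(4) arithmetic mentioned in the first item can be illustrated with a short sketch. it assumes one particular encoding that the text does not specify: each quaternary digit is a 2-bit vector in the polynomial basis with the irreducible polynomial x^2 + x + 1. the functions `to_bits` and `from_bits` stand in for the (4/2) and (2/4) decoders, and all function names are hypothetical.

```python
def to_bits(q):
    """Stand-in for a (4/2) decoder: quaternary digit -> (hi, lo) bits."""
    return (q >> 1) & 1, q & 1


def from_bits(hi, lo):
    """Stand-in for a (2/4) decoder: (hi, lo) bits -> quaternary digit."""
    return (hi << 1) | lo


def gf4_add(a, b):
    """gf(4) addition as a vector of two exors."""
    a1, a0 = to_bits(a)
    b1, b0 = to_bits(b)
    return from_bits(a1 ^ b1, a0 ^ b0)


def gf4_mul(a, b):
    """gf(4) multiplication from ands and exors, reduced by x^2 = x + 1."""
    a1, a0 = to_bits(a)
    b1, b0 = to_bits(b)
    hi = (a1 & b1) ^ (a1 & b0) ^ (a0 & b1)
    lo = (a1 & b1) ^ (a0 & b0)
    return from_bits(hi, lo)


def gf4_shannon_ulm(x, cofactors):
    """Quaternary shannon node built only from the gf(2)-based operations."""
    acc = 0
    for i, fi in enumerate(cofactors):
        sel = 1 if x == i else 0  # selector literal for value i
        acc = gf4_add(acc, gf4_mul(sel, fi))
    return acc
```

under this encoding, addition needs no carries and multiplication needs four distinct and terms plus a few exors, which is the sense in which a gf(4) ulm can reuse gf(2) hardware; a different basis would change the multiplication wiring but not the idea.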
acknowledgment

this research was performed during sabbatical leave in 2015-2016 granted from the university of jordan and spent at philadelphia university.

references

[1] a. n. al-rabadi, reversible logic synthesis: from fundamentals to quantum computing, springer-verlag, 2004.
[2] a. n. al-rabadi, "three-dimensional lattice logic circuits, part iii: solving 3d volume congestion problem," facta universitatis electronics and energetics, vol. 18, no. 1, pp. 29-43, 2005.
[3] a. n. al-rabadi, "three-dimensional lattice logic circuits, part ii: formal methods," facta universitatis electronics and energetics, vol. 18, no. 1, pp. 15-28, 2005.
[4] a. n. al-rabadi, "three-dimensional lattice logic circuits, part i: fundamentals," facta universitatis electronics and energetics, vol. 18, no. 1, pp. 1-13, 2005.
[5] a. n.
al-rabadi, "quantum logic circuit design of many-valued galois reversible expansions and fast transforms," journal of circuits, systems, and computers, world scientific, vol. 16, no. 5, pp. 641-671, 2007.
[6] a. n. al-rabadi, "representations, operations, and applications of switching circuits in the reversible and quantum spaces," facta universitatis electronics and energetics, vol. 20, no. 3, pp. 507-539, 2007.
[7] a. n. al-rabadi, "multi-valued galois shannon-davio trees and their complexity," facta universitatis electronics and energetics, vol. 29, no. 4, pp. 701-720, 2016.
[8] b. falkowski and l.-s. lim, "gray scale image compression based on multiple-valued input binary functions, walsh and reed-muller spectra," proc. ismvl, 2000, pp. 279-284.
[9] h. fujiwara, logic testing and design for testability, mit press, 1985.
[10] d. h. green, "families of reed-muller canonical forms," int. j. electronics, no. 2, pp. 259-280, 1991.
[11] m. helliwell and m. a. perkowski, "a fast algorithm to minimize multi-output mixed-polarity generalized reed-muller forms," proc. design automation conference, 1988, pp. 427-432.
[12] s. l. hurst, d. m. miller, and j. c. muzio, spectral techniques in digital logic, academic press inc., 1985.
[13] m. g. karpovsky, finite orthogonal series in the design of digital devices, wiley, 1976.
[14] s. m. reddy, "easily testable realizations of logic functions," ieee trans. comp., vol. c-21, pp. 1183-1188, 1972.
[15] t. sasao and m. fujita (editors), representations of discrete functions, kluwer academic publishers, 1996.
[16] t. sasao, "easily testable realizations for generalized reed-muller expressions," ieee trans. comp., vol. 46, pp. 709-716, 1997.
[17] r. s. stanković, spectral transform decision diagrams in simple questions and simple answers, nauka, 1998.
[18] r. s. stanković, c. moraga, and j. t. astola, "reed-muller expressions in the previous decade," proc. reed-muller, starkville, 2001, pp. 7-26.
[19] s. b.
akers, "binary decision diagrams," ieee trans. comp., vol. c-27, no. 6, pp. 509-516, june 1978.
[20] r. e. bryant, "graph-based algorithms for boolean functions manipulation," ieee trans. on comp., vol. c-35, no. 8, pp. 667-691, 1986.
[21] d. k. pradhan, "universal test sets for multiple fault detection in and-exor arrays," ieee trans. comp., vol. 27, pp. 181-187, 1978.
[22] m. cohn, switching function canonical form over integer fields, ph.d. dissertation, harvard university, 1960.
[23] r. drechsler, a. sarabi, m. theobald, b. becker, and m. a.
perkowski, "efficient representation and manipulation of switching functions based on ordered kronecker functional decision diagrams," proc. dac, 1994, pp. 415-419. [24] b. j. falkowski and s. rahardja, "classification and properties of fast linearly independent logic transformations," ieee trans. on circuits and systems-ii, vol. 44, no. 8, pp. 646-655, august 1997. [25] a. gaidukov, "algorithm to derive minimum esop for 6-variable function," proc. int. workshop on boolean problems, freiberg, 2002, pp. 141-148. [26] s. hassoun, t. sasao, and r. brayton (editors), logic synthesis and verification, kluwer acad. publishers, 2001. [27] c. y. lee, "representation of switching circuits by binary decision diagrams," bell syst. tech. j., vol. 38, pp. 985-999, 1959. [28] c. moraga, "ternary spectral logic," proc. ismvl, pp. 7-12, 1977. [29] j. c. muzio and t. wesselkamper, multiple-valued switching theory, adam hilger, 1985. [30] r. s. stanković, "functional decision diagrams for multiple-valued functions," proc. ismvl, 1995, pp. 284-289. [31] b. steinbach and a. mishchenko, "a new approach to exact esop minimization," proc. reed-muller, starkville, 2001, pp. 66-81. [32] i. zhegalkin, "arithmetic representations for symbolic logic," math. sb., vol. 35, pp. 311-377, 1928. [33] a. n. al-rabadi, "carbon nano tube (cnt) multiplexers for multiple-valued computing," facta universitatis electronics and energetics, vol. 20, no. 2, pp. 175-186, 2007. [34] a. n. al-rabadi, "closed-system quantum logic network implementation of the viterbi algorithm," facta universitatis electronics and energetics, vol. 22, no. 1, pp. 1-33, 2009.

instruction facta universitatis series: electronics and energetics vol. 33, no 1, march 2020, pp. 83-104 https://doi.org/10.2298/fuee2001083v © 2020 by university of niš, serbia | creative commons license: cc by-nc-nd

universal microprocessor controlled power regulator with and without additional power supply

vladan vučković 1, simon le blond 2
1 faculty of electronic engineering, university of niš, serbia
2 university of bath, department of electronic & electrical engineering, united kingdom

abstract. inexpensive microcontrollers allow complex control methodologies for improving well-established technologies such as resistive lighting. in this paper, we present two constructions of a microprocessor controlled power regulator for resistive loads of up to 2.5 kw and exemplify its use for the lamps in the tesla’s fountain reconstruction project. these are universal power controllers and could be applied to a wide variety of non-inductive loads, but our primary intention was to construct a miniature light regulator with a touch sensor for tesla’s fountain. the devices operate using phase control of the power grid’s alternating current and a controlled fade-in to increase lamp longevity. extensive testing shows the device to operate successfully for 2400 hours of continuous error-free operation, to robustly handle high cycling stresses and to increase bulb lifetimes by approximately a factor of 7-8. the microcontroller software can easily be adapted for controlling many non-inductive devices, such as light bulbs or halogen lamps, as well as resistive heating. we also used advanced technologies from other multi-disciplinary areas to complete the project.

key words: ac voltage controllers, microcontrollers, power regulation, phase control, signal processing.

1. introduction

in this paper we present a working prototype of a novel universal power regulator with touch control as part of a complete realization of tesla’s fountain patent.
a major problem with tesla’s basic construction was the mechanical control mechanism for the lights, which is too complex and expensive and carries a potentially high risk of mechanical failure. because of these problems, we decided to develop a solution with no mechanical parts, without changing any other details of the original patent. the presented power regulator operates using phase control of the power grid alternating current, and could potentially be used for control of a wide range of resistive loads.

received april 19, 2019; received in revised form november 10, 2019
corresponding author: vladan vučković, faculty of electronic engineering, university of niš, aleksandra medvedeva 14, 18000 niš, serbia (e-mail: vladanvuckovic24@gmail.com)

the prototype device was tested up to powers of 2.5 kw and shown to operate safely. the construction does not require electromagnetic relays or other mechanical moving parts, so it is very reliable even after intensive use. the embedded microcontroller enables the construction of very complex control programs that could not be accomplished by other digital circuits such as counters and timers. the power regulator is universal, so the same hardware could be used in different applications, with changes required only in software. in addition we present a more advanced second iteration of the regulator. here we removed the power supply transformer as well as the opto-triac component, replacing them with a power ignition system based on a dual capacitor electronic circuit and a smaller and cheaper microcontroller. the light control is solved in the same construction: this device can be coupled with a simple metal touch plate instead of a classical switch or push button. the result is a miniature device, without a single moving mechanical part, that could be embedded directly in the core construction of tesla’s fountain.
in general, the field of power electronics concerns the processing of electrical power using a variety of electronic devices. the basic circuit is the switching converter, which processes power from the input port and delivers it to an output port, achieving regulation via a control port. usually the raw input power passes through several phases until it reaches the final desired stage, regulated sequentially by the control lines. several basic functions in power processing can be achieved in this way. this paper is concerned with ac–ac cycloconversion, specifically converting an ac input to a required ac output with the same output frequency, supplying the desired power by controlling the switch-on time in every ac cycle [1]. the present state of the art is mostly focused on pulse-width modulated (pwm) rectifiers, both single-phase and three-phase, as the preferred way of controlling modern semiconductor power devices. constant-frequency pwm is achieved by comparing a saw-tooth or triangle carrier against a sinusoid to generate pulses of varying width [2]. one of the standard, well-known pwm controlled constructions is the ccm boost converter, which can produce any conversion ratio between one and infinity. another is the pwm switched inverter, which combines voltage control and frequency control. this inverter uses a fixed-voltage dc source, where each of the phase legs of the inverter is switched at high frequency and basically operates as a chopper. this circuit is widely used in motor controls. here the switching patterns are complex and use sophisticated control electronics, often involving low-cost microprocessors, which is a central idea adopted in this paper. another fundamental concept we exploit is modulation of output waveforms by controlling the switched pulse width for the phase leg. with the pwm technique the primary objective is to determine when to switch on the converter in each cycle to supply the desired output power.
for example, an earlier solution uses a spatial pwm controller or an eprom chip to store values that are later accessed in real time to determine the optimized switching angles [3]. many multilevel pwm techniques have been developed in past decades, especially for three-phase three-level (3l) inverters [4]. in our approach we used a simplified single-phase input, single-phase output (siso) ac/ac type cycloconverter for converting one ac power source into another ac power application. ac/dc/ac and ac/ac direct converters are newly developed power switching circuits that, in comparison with other power switching circuits, are widely applied in industrial applications. although choppers were dominant in power supplies in the more distant past, power ac/dc/ac converters have been common in industrial applications since the late 1980s [5]. as a consequence of developments in semiconductors, new devices such as the thyristor (scr), gate turn-off thyristor (gto), triac, bipolar transistor (bt), insulated gate bipolar transistor (igbt) and power metal-oxide-semiconductor field-effect transistor (mosfet) have become available. since the 1980s, control circuits have gradually migrated from analog control to digital control [5]. digital power controllers have become an interesting choice in switching voltage regulators for high-performance low-cost applications, like microcontroller voltage regulation modules (vrms) and other types of portable electronics. for example, digital dither has been introduced as a means of increasing the effective resolution of digital pwm (dpwm) modules. by using microprocessors, a universal dither generation scheme may be realized, producing patterns of any shape [6].
mathematical modeling of this type of converter has also been given much attention; for example, the first-order hold (foh) is a widely accepted method to simulate ac/ac and ac/dc/ac converters [5]. additionally, some practical matrix-based iterative techniques that are guaranteed to converge have emerged for generating the pwm waveform. a wide range of calculations can be performed with a low-cost microcontroller [7]. in some applications with asymmetric double-edge pwm, a complete experimental system has been developed, comprising a front-end board, an fpga board functioning as an adpwm modulator and an eighth-order analog active low-pass filter board [8]. another interesting application of pwm is an image processing circuit that performs a directional pixel-state propagation algorithm based on a pixel-parallel architecture, using low-cost devices [9]. in our presented method we chose the ubiquitous microchip pic microcontrollers as the basic processing device [10]. precise power control of loads without the need for modifications opens up many possibilities in home and industrial applications. the basic principle of the electronic power regulator is well known and is applied in different devices: light control systems, halogen lamp power control, electric heater control, and many others. in the industrial literature this class of power regulators is also referred to as ac voltage controllers or dimmers. the functionality of these regulators is based on the characteristics of the output thyristor and triac components. specifically, these semiconductor components are able to conduct currents up to 40 a with low losses, controlled by a comparatively very low gate current. once the triac is in the conductive state, it is deactivated only when the voltage between the anode and cathode is zero. therefore, if the triac is operated in an ac 220 v circuit, each sinusoid zero crossing turns off the triac.
in this context its activation can be controlled over a wide range by turning it on at any point in either the positive or negative half-period. the described functionality is the operating principle behind the device presented in this paper. by using microcontrollers to achieve this, compared to other solutions based on digital and analogue circuits, dramatic hardware simplifications, miniaturization, and cost savings can be achieved, because the control calculations are performed in software, i.e. in the microcontroller machine program. this philosophy has driven the design of the novel power regulator presented here. however, it is important to note that the complexity of the design is largely dependent on the practical application. our basic intention was to construct an exact replica of the tesla fountain, based on his original patent drawings provided by the museum of nikola tesla. instead of the original mechanical control device for lights, we developed two much simpler versions of the electronic control unit that can be programmed to execute exactly the same control functions as the original mechanical device described in tesla’s patent.

the paper is organized as follows. firstly, in section 1 we describe the theoretical principles behind the basic power regulator, which is then described in the following sections 2-4. in section 5 we describe the detailed construction of the innovative touch sensor power regulator device that works without an external power supply. in section 6 we exemplify the use of the power regulator in tesla’s 3d fountain project. this device is in the patent pending procedure (serbia’s patent office). finally the overall system and contributions are summarised in the conclusion.

2. theory of operation

2.1. ac voltage phase control

we begin by considering the power grid voltage (u) waveform presented in fig. 1. for simplicity, let us suppose that the load is neither inductive nor capacitive, so there is no phase shift between the current and voltage phasors.
it is a well-known fact that capacitive and inductive impedances in electric circuits cause the current to lead or lag the voltage, respectively. however, this does not apply to purely resistive loads such as the type of lighting powered by this device.

fig. 1 (a) power grid sinusoid, with power transfer in the shaded region; (b) half-period of the sinusoid

the basic assumption is thus that the current and voltage sinusoids are in phase. in addition, the activation of the device is synchronized by a constant delay after the zero crossings of the current sinusoid. the dark areas in fig. 1 (a) show the active intervals in which the device receives energy from the power grid, in this case by virtue of switching the triac to a conducting state in these periods. it is obvious that the total power consumption in that situation is less than the full (nominal) power consumption. also, it is well known that the energy conveyed to the load is proportional to the active dark area on the current and voltage time series. thus, the principle of power control is formulated in the following way: the power regulation is based on control of the turn-on delays after the zero crossing in every period of the sinusoidal power grid alternating current. the problem of power control then becomes a time delay control problem, which is much easier to solve. to determine quantitative parameters that represent power control we will use a simple model presented in fig. 1 (b). the half-period in fig. 1 (b) is analogous to the half-period of the power grid sinusoidal voltage. the phase is marked with the variable k. the portion of energy conveyed to the device can be determined by calculating the fp factor, which represents the ratio between the dark area and the total area under the sinusoid.
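as a rough numerical illustration of this area ratio, the following python sketch integrates the sine half-period with a midpoint rule and reports the fraction that lies after the phase k. the function name conducted_fraction and the step count are assumptions made for illustration only; they are not part of the device firmware.

```python
import math


def conducted_fraction(k, steps=10_000):
    """Area under sin(w) on [k, pi] divided by the area on [0, pi].

    Both areas are approximated with the midpoint rule; k is in radians.
    """
    def area(a, b):
        h = (b - a) / steps
        return sum(math.sin(a + (i + 0.5) * h) for i in range(steps)) * h

    return area(k, math.pi) / area(0.0, math.pi)


# Fraction of the half-period area that remains after turning on at phase k.
for k_deg in (0, 45, 90, 135, 180):
    print(k_deg, round(conducted_fraction(math.radians(k_deg)), 4))
```

turning on at k = 90°, the peak of the half-period, leaves half of the area, while k = 0 leaves all of it and k = 180° none.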
the fp factor can be calculated as the ratio of the integral of the sine function over the interval [k, π] (dark area) to that over [0, π] (full half-period):

$$f_p = \frac{\int_k^{\pi} \sin(w)\,dw}{\int_0^{\pi} \sin(w)\,dw} = \frac{\cos(k) - \cos(\pi)}{\cos(0) - \cos(\pi)} = \frac{\cos(k) + 1}{2} \qquad (1)$$

after the integration and rearranging for k we obtain equation (1b):

$$k = \arccos(2 f_p - 1) \qquad (1b)$$

this formula enables determination of the time delay if the desired fp factor is known. using the nominal power grid frequency of 50 hz, and thus a half-period of 10 ms, the required time delay is:

$$t = 10 \cdot \frac{k}{\pi} \qquad (1c)$$

where t is the delay in milliseconds. combining (1b) and (1c) gives a practical formula for the time delay in milliseconds for a targeted fp factor:

$$t = \frac{10 \cdot \arccos(2 f_p - 1)}{\pi} \ \text{(ms)} \qquad (2)$$

by changing the input values in formula (2) we can precisely generate the time delay in one half-period to achieve the targeted fp factor. if we change fp from 0 to 1 with a constant step (for instance 0.1 or 10%) and successively apply formula (2), the values in table 1 are generated.

table 1 time delays according to target fp factor.

fp (%)    t (ms)
0         10.000
10        7.952
20        7.048
30        6.310
40        5.641
50        5.000
60        4.359
70        3.690
80        2.952
90        2.048
100       0.000

it is clear that, because the time delay is a non-linear function, formula (2) must be applied for each value to determine the output value precisely. but in real-time applications it is not suitable to use formula (2) in its original form, because the inverse cosine function is computationally expensive. thus, for speed, every equidistant value in table 1 is pre-stored in the microcontroller memory as a look-up table. this approach offers an excellent trade-off between access time and functionality and thus frees up the microcontroller processor for sophisticated control tasks.
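the table values can be regenerated off-line and then stored as constants. the python sketch below is one way to do this under the stated 50 hz assumption; it evaluates formula (2) in 10 % steps and looks up the nearest entry using integer arithmetic only, as a microcontroller routine might. the names delay_ms, DELAY_TABLE_MS and lookup_delay_ms are hypothetical and do not come from the original firmware.

```python
import math

HALF_PERIOD_MS = 10.0  # half-period of a 50 Hz grid


def delay_ms(fp):
    """Formula (2): turn-on delay in ms for a target fp factor in [0, 1]."""
    return HALF_PERIOD_MS * math.acos(2.0 * fp - 1.0) / math.pi


# Precomputed once (off-line); on the device this would be a constant table.
DELAY_TABLE_MS = [round(delay_ms(p / 100.0), 3) for p in range(0, 101, 10)]


def lookup_delay_ms(fp_percent):
    """Nearest-entry lookup for an integer fp in percent (0..100)."""
    index = (int(fp_percent) + 5) // 10   # round to the nearest 10 % step
    index = max(0, min(10, index))        # clamp to the table bounds
    return DELAY_TABLE_MS[index]


for percent, delay in zip(range(0, 101, 10), DELAY_TABLE_MS):
    print(f"{percent:3d} %  {delay:6.3f} ms")
```

the lookup costs one addition, one division by a constant and one table read, which is the trade-off the text describes: no inverse cosine at run time, at the price of 10 % granularity.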
it is possible to increase the precision by decreasing the step size between values, but this would increase the number of elements required in the look-up table. in addition, linear interpolation between values could be used to achieve finer granularity of the target fp (%) factor. however, the granularity in table 1 produces a sufficiently smooth evolution of the light output. from table 1 it is obvious that the difference between neighboring values is most pronounced near the beginning and the end, where the sinusoid has the steepest gradient (fig. 1).

2.2. controlled fade-in for bulb longevity

immediately after turn-on, the temperature of the metal filament in light devices is low, so its resistance is low, causing a high transient current impulse. the impulse current is up to ten times greater than the nominal steady-state current. for instance, one 100 w incandescent lamp has an initial electrical resistance of 39 Ω. if it is connected to the 220 v power grid, the ignition current will be over 5.6 a. this high current impulse places great stress on the light bulb filament, and as such, devices usually fail during this initial turn-on period. to overcome this, the microcontroller can be used to power up the lamp gradually, starting not with 220 v but with a voltage lower by a factor of ten, approximately 20-25 v. by calling a special function on turn-on, the microprocessor generates an initial output wave shape that first gently heats the metal filament in the bulb to red incandescence. this is achieved by gradually decreasing the time delay of the triac gate excitation in every period, until it reaches the delay corresponding to the desired fp factor. when red incandescence occurs, the resistance is 70-80% of the maximal resistance, so after the turn-on the current jump is only 20-30%. so rather than an initial starting current of 5.6 a, the starting current is reduced to around 0.45 a.
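the figures in this paragraph and the shape of a possible fade-in ramp can be sketched in python. the sketch assumes that the quoted cold resistance is 39 Ω, that the ramp lasts 100 half-periods (about one second at 50 hz), that fp rises linearly from an arbitrary 5 % starting value, and that delays between table entries are obtained by linear interpolation. the actual firmware ramp is not specified in the text, and all names are hypothetical.

```python
# Table 1: delay in ms for fp = 0, 10, ..., 100 %
DELAY_TABLE_MS = [10.000, 7.952, 7.048, 6.310, 5.641, 5.000,
                  4.359, 3.690, 2.952, 2.048, 0.000]

# Currents quoted in the text for a 100 W lamp on a 220 V grid.
cold_inrush_a = 220.0 / 39.0   # cold filament, assumed 39 ohm: about 5.6 A
nominal_a = 100.0 / 220.0      # steady state: about 0.45 A


def interpolated_delay_ms(fp_percent):
    """Linear interpolation between neighbouring table entries."""
    fp_percent = min(100.0, max(0.0, fp_percent))
    index = min(int(fp_percent // 10), 9)
    frac = (fp_percent - 10 * index) / 10.0
    lo, hi = DELAY_TABLE_MS[index], DELAY_TABLE_MS[index + 1]
    return lo + frac * (hi - lo)


def fade_in_delays(target_percent, start_percent=5.0, half_periods=100):
    """One turn-on delay per half-period, ramping fp linearly to the target."""
    step = (target_percent - start_percent) / (half_periods - 1)
    return [interpolated_delay_ms(start_percent + i * step)
            for i in range(half_periods)]


schedule = fade_in_delays(target_percent=100.0)
print(round(cold_inrush_a, 2), round(nominal_a, 2))
print(round(schedule[0], 3), round(schedule[-1], 3))
```

each list element is the delay to apply after one zero crossing, so the delay shrinks step by step from a late turn-on towards the delay of the target fp factor.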
the whole process takes around 1 second, so it does not inconvenience the user and merely produces a pleasant but rapid fade-in effect. following this, the microprocessor turns on to full power. this gradual fade-in process prolongs the expected lifetime of the light bulbs by a factor of 5-10.

3. microprocessor controlled continual power regulator

based on the theoretical discussion in the previous section, we now describe the technical solution that enables the construction of the power regulator.

3.1. subsystems

the functionality of the subsystems, together with the component characteristics, is described below (fig. 2).

fig. 2 the block scheme of the power regulator
fig. 2b the graetz bridge rectifier, capacitor and stabilizer (standard circuit)

• output subsystem – the power component triac bt 139-600 is used. this triac can carry alternating currents up to 16 a rms maximum steady-state current. thus the maximal power that can be controlled by the device is about 3.5 kw (for the nominal voltage of 220 v), but it is limited to 2.5 kw to give a wide safety margin. when high current values are present, the triac must be equipped with a metal heat sink to dissipate the ohmic heat.
• microcontroller – the whole device is organized around the pic16f84 microcontroller, which is the central functional part executing the control algorithm. the algorithm can be described in the following steps:
1) using the zero crossing detector, determine the start point of the half-period of the alternating power grid current and reset the timer. the triac is inactive at this moment because the input voltage is zero.
2) determine the delay constant in milliseconds using the value of the target fp factor. use the look-up table (table 1) for fast approximation of function (2).
3) after the selected delay, turn on the output switch unit (triac).
4) wait for the triac activation time: a few hundred microseconds, depending on the component type.
5) turn off the triac gate signal. at that point the input voltage is still non-zero, so the triac will continue to conduct.
6) go to 1).
the microchip pic16f84 was chosen as the central component of the device [11],[12]. this microcontroller is well suited due to its functionality, adaptability and low cost. the processor is clocked by a 4 mhz crystal oscillator that enables stable and accurate time cycles.
• electronic coupling unit – accomplished by using the opto-triac component moc 3020. its breakdown voltage is about 7.5 kv. the optoelectronic component is controlled by the processor on the input side, and it generates the gate excitation current for the triac at the output. this unit enables galvanic decoupling from the 220 v power grid, giving a high level of protection for the sensitive low-voltage electronic components.
• electronic circuit for zero-crossing detection – uses the alternating signal from the secondary of the 220 v/6 v transformer that energizes the microcontroller (via the 7805l stabilizer, which limits the input voltage to 5 v). this was chosen as a simple, safe and sustainable solution. when the signal rises above 2.0 v the microcontroller reads it as a logical 1, and when it falls below 0.8 v as a logical 0. compared with the power grid current, the signal has opposite phase but the same frequency, so it can be used to realize step 1) of the main algorithm.
• rectifier – a standard graetz bridge rectifier, which also uses a 470 µf electrolytic capacitor and a 7805 stabilizer providing +5 v dc for the microcontroller, shown in fig. 2b.
to ensure protection of the sensitive electronic components, the device is galvanically isolated from the mains at lines a and b (fig. 2). line a decouples the microcontroller from the power grid by the power transformer, and line b decouples it from the output power circuit by the optoelectronic component.

3.2. device operation

the block scheme of the device is presented in fig. 2. the construction of the device is compact, so it can be embedded on a 3x5 cm board. it possesses many useful technical characteristics and has proven to be very reliable in challenging use scenarios. to begin with, the microcontroller part of the device is galvanically decoupled on one side by the transformer (a) and on the other side by the opto-triac (b). it is also grounded separately, giving a very high level of resistance to power grid overvoltages and overcurrents. by virtue of the online zero crossing detection, the regulator can accommodate small power grid frequency changes, ensuring that the delay in triac activation is always measured from the actual zero crossing rather than from an ideal 10 ms half-period. it is well known that in practice power grid frequencies vary somewhat around the nominal 50 hz (or 60 hz in other regions). for example, at times of high load and excessive consumption, the power grid frequency has a tendency to decrease as there is less kinetic energy stored in the prime movers of generators. the embedded microcontroller solves this problem because it constantly, period by period, measures successive zero-crossing points, so that all corrections for changes of frequency are made in real time and the device remains synchronized to the power grid. the microcontroller was programmed using an assembly language development environment. the light controller has an original construction that surpasses similar devices in its class. it possesses all the characteristics that are well known for this type of device and also has a set of new advanced functions. some of the key characteristics of the device are:
• microcontrollers and integrated technology are used for the construction of the device.
this implementation attains high functionality and reliability with low power consumption and low cost. the basic circuit is a pic16f84 microcontroller, which is widely produced at low cost. the construction does not employ components with mechanical moving parts such as relays, so the probability of malfunction decreases.
• integrated technology enables the device to be built on a compact 3x5 cm single-sided board, which can be energized with normal line transformers.
• the light power regulator generates an alternating signal that could power a small guide light source, making it much easier to locate the switch fitting in darkness.
• maximal output power is over 2.5 kw (16 a). with a heat sink the device was tested successfully up to powers of 1800 w (equal to 30 parallel connected 60 w light bulbs). the main circuit must contain only resistive lighting, i.e. light bulbs or halogen lamps.
• if the push button is pressed in the active phase, the power turn-off is delayed until the inactive phase.

3.3. device testing

after constructing an industrial prototype that possesses all the characteristics and components of the final version of the device, we performed comprehensive testing of the device to check all defined functions in normal and extreme working conditions. the test sets are equivalent for both design versions of our device.
• continual testing – the device worked continually over long periods of time. after a few months of permanent testing the device retained all nominal and projected parameters, so this test was successfully passed.
• test of output power – a large thermal load was connected to the output: heaters up to 2 kw, emulating a large number of light bulbs connected in parallel. the device managed this load without problems, although these tests indicated that the metal heat sink is required at very high loads.
as the maximal nominal output current for the embedded type of triac is 16 a, the output fuse must be rated accordingly, so that it ruptures before the triac is damaged by short-circuit current but does not rupture under normal high-load conditions.
• shock test – using another microcontroller connected to a relay, a series of voltage shocks was emulated across a broad range of possible instantaneous voltage values (from the 220 v waveform) and switching frequencies. the tests proved that the prototype device is resistant to high instantaneous peak power grid voltage and high cycling stresses, preserving stable performance in the presence of both.

4. microprocessor controlled continual regulator without additional power supply

in this section we present the advanced version of the power regulator, developed after extensive testing of the initial prototype. the power supply transformer as well as the opto-triac component are removed from the construction, reducing the cost and size. here we use an 8-pin pic12f675 microcontroller instead of the standard 18-pin pic16f84 used in the previous iteration. the touch control sensor circuit is implemented within the same device. we discuss the principle of operation of the microprocessor controlled switch without an additional power supply, shown in fig. 3.

fig. 3 operational principle in electrical circuit and functions

the microprocessor controlled continual regulator and switch without additional power supply contains a microcontroller (mc1), a switch component (tr1), a current-limiting capacitor (c1), a resistor limiting the supply current impulse (r1), diodes for current routing (d1) and (d2), components for stabilization and filtering of the supply voltage (zd1) and (c2), a resistor for synchronisation with the power grid voltage (r3), a capacitor for suppression of synchronisation interference (c4), a capacitor (c3) and resistor (r2) limiting the excitation voltage of the switch component (tr1), a capacitor (c7) and resistor (r6) for power grid interference, and the input circuit components (cp), (c6), (r5), (r4) and (c5). its distinguishing feature is that the device uses the supply current running through the load (s), so that the whole device obtains its voltage and is synchronized through being wired in parallel with its switch component. the contact plate (cp), through which the user activates the device, can be of any design and made of any conducting material, though it is desirable that the conductor is as warm as possible. this contact plate should have sufficient mechanical strength to ensure the switch can be fastened on the wall. it is desirable to fasten the metal heat sink of the triac (tr1) on the back side of the contact plate, so that the device can dissipate the heat accumulated when handling heavy loads. the contact plate must be electrically insulated from the wall by plastic or some other strong insulating material, so that the wall does not conduct signals to or away from it, causing interference in the device’s operation.

4.1. powering the microprocessor

the functional heart of the device is the microcontroller (mc1). in this case microchip’s pic12f675 microcontroller was used because of its size and overall performance.
from a hardware perspective, the main problem is securing enough electrical power for the microcontroller so that its program runs without interruption. in the wall boxes that are usually used for installing switches there are two different potentials: the live wire and the potential of the bulb (point a potential). it is therefore necessary to supply the device with a small amount of the current that runs through the load s (the bulb). when the switch (triac tr1) is turned off, the potential at point a is 0 v. in this case it is not difficult to provide the power supply for the microcontroller, first by limiting the current with the resistor r1 or capacitor c1, and then by stabilizing the voltage. problems can occur when the triac is in the on state, because then both potentials in the switch box are at the live wire potential, so that all the voltage is transferred from the switch to the bulb. the problem of the microcontroller (mc1) losing its supply voltage is solved by use of the triac (tr1) and by having mc1’s program choose the correct moment to turn it on. so as not to leave the microcontroller without a power supply under certain conditions, it is necessary to ensure that the triac (tr1) never turns on at the beginning of the semi-period, but only at moments when the average supply current of the device over that period exceeds the average consumption required by the microcontroller. this problem is addressed by the microcontroller (mc1) itself. stabilization and filtering of the supply voltage is performed by the parallel connection of the zener diode (zd1) and the capacitor (c2). capacitor (c2) is charged through the diode (d2) in the time intervals when the rate of change of voltage is $dU_{FA}/dt > 0$.
diode (d1) passes the current through the capacitor (c1) in the intervals when the rate of change of the voltage is $du_{fa}/dt < 0$, so that capacitor (c2) can be recharged when $du_{fa}/dt$ again becomes greater than 0. in the circuit of fig. 3, the current is limited by a combination of capacitors and resistors, because this allows incomparably smaller dimensions and losses than limiting through resistors alone. the capacitor c2 must be sized so that the average current of 1.5 mA required by the microcontroller is supplied continually even though the triac is turned on only after the first sixth of every semi-period. for a power grid effective voltage of 230 V (amplitude 310 V), the voltage reached by the end of the first sixth of the semi-period gives a sufficiently sized capacitor as:

$$C = \frac{I\,t}{U} \qquad (3)$$

where $I$ is the average supply current, $U = U_m \sin\frac{\pi}{6} = 310 \cdot 0.5\ \mathrm{V} = 155\ \mathrm{V}$, and the semi-period is $t = \frac{T}{2} = 10\ \mathrm{ms}$.

$$C = \frac{1.5\ \mathrm{mA} \cdot 10\ \mathrm{ms}}{155\ \mathrm{V}} \approx 100\ \mathrm{nF} \qquad (4)$$

a 100 nF/400 V capacitor is relatively small, and since the active power dissipated in it tends towards zero, no warming occurs. the difference in light level is only slight (negligible) when the triac is turned on at the end of the first sixth of the semi-period, compared to turning it on at the beginning of the semi-period.

4.2. touch detection switch

one of the main advantages of this switch is that it detects touch instead of pressure, so that it can be made very firm and very sensitive at the same time. to excite the device, the user must make physical contact with the contact plate. touching of the contact plate is detected by the microcontroller on pin gp1, producing a sequence of logic zeros and ones with the frequency of the input power grid.
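returning to the supply capacitor of section 4.1, equations (3) and (4) can be checked numerically. the following python sketch recomputes the capacitance from the values quoted in the text (1.5 mA average current, 10 ms semi-period, 310 V amplitude, triac fired after the first sixth of the semi-period). the function name and its parameters are illustrative assumptions and are not taken from the authors' design files.

```python
import math


def supply_capacitor(i_avg, semi_period, u_amplitude, fraction=1.0 / 6.0):
    """Capacitance from equation (3): C = I * t / U.

    U is the mains voltage reached after `fraction` of the semi-period,
    U = Um * sin(pi * fraction). All names here are hypothetical.
    """
    u = u_amplitude * math.sin(math.pi * fraction)
    return i_avg * semi_period / u


# Values quoted in the text: 1.5 mA, 10 ms, 310 V amplitude.
c_min = supply_capacitor(1.5e-3, 10e-3, 310.0)
print(f"minimum capacitance: {c_min * 1e9:.1f} nF")
```

the computed value is slightly below 100 nF, so rounding up to the standard 100 nF part, as in equation (4), is consistent with the formula.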
based on how long the user touches the contact plate, the moment at which the triac is turned on is updated in the microcontroller's program. the maximal current that can be conducted through the contact plate is determined by the values of the capacitor (c6) and the resistor (r5), and can be calculated by the following formula:

$$I_{ef} = \frac{U_{ef}}{R + \dfrac{1}{2\pi f C}} \qquad (5)$$

the value of capacitor c6 is not bigger than 220 pF, and the value of resistor r5 is not less than 1 MΩ. the breakdown voltage of the capacitor c6 must be tested and guaranteed by the manufacturer and must not be less than 630 V, although for the user's safety it is desirable that it be several times higher. with these limit values and a voltage of $U_{ef}$ = 220 V (at 50 Hz the reactance of c6 is about 14.5 MΩ), the maximal effective current will be $I_{ef\,max}$ = 14 µA, which cannot be felt and has no negative consequences for the user. even in the event of a capacitor short circuit failure the maximum current would be limited by the resistor alone to 0.63 mA. the prototype regulator was tested and certified for safety and found to be completely safe.

4.3. triac excitation

a current of over 20 mA is necessary for the safe excitation of the triac (dependent on the triac type), which is achieved through r2 and c3. for a smaller current-limiting resistor, a smaller capacitor value must be adopted. a reasonable capacitor size could be 100 nF/400 V (as in the previous example). however, the maximal available current with such a capacitor is 1.5 mA. that is why the excitation of the triac is achieved through capacitor c3 as follows:

a) in the semi-period when the live wire potential is negative, the microcontroller output (g4), at the moment of triac activation, transits from the logic zero state to logic one, thus sending a short-term positive current toward the triac gate. the value of the current is determined by resistor r2, and must be large enough to allow successful triac excitation.
b) in the semi-period when the live wire potential is positive, the microcontroller output (g4), at the moment of triac activation, transits from the logic one state to logic zero, thus sending a short-term negative current toward the triac gate. the value of the current is determined by resistor r2, and must be large enough to allow successful triac excitation.

with this way of triac excitation, the average gate current is greatly reduced, to around 100 µA. one of the conditions for precisely determining the moment of turning on the triac is the use of one microcontroller input (gp2) for continual zero crossing detection and thus synchronisation with the power grid voltage. the signal is delivered through resistor r3, while capacitor c4 eliminates interference which could appear on that input. the triac must have a galvanically isolated heatsink, comprising pins connected to the contact plate. the triac used in this device is a bta16/600, and the insulation of its internal heatsink is guaranteed by the manufacturer. it must have a breakdown voltage of over 2.5 kV.

5. final device prototype

this section presents further details of the practical realization of the new microprocessor controlled continual regulator (fig. 4). this device is the result of several years of development and testing. as presented in the previous sections, there are many original and new aspects to its hardware, making it a unique device in many respects. nevertheless, the construction remains simple and very easy to manufacture.

fig. 4 device prototype

the purpose of this section is to briefly highlight some important features of the device:

• the device has the potential to completely replace today's classic wall switches and dimmer switches for incandescent and halogen light bulbs.

• the device is built around an extra efficient and small eight-pin microcontroller.
the microchip pic12f629/675 is used. the embedded control program is relatively complex (see appendix b for assembly code), but only about 40% of the total program memory is used, leaving a lot of space for future software expansion. the device therefore has a high level of software flexibility.

• the embedded program uses advanced control algorithms. among other functions, the device has program control of the sinusoidal fade-in (fig. 5) and fade-out of the output voltage, a power saving control mechanism and automatic shutdown after a predefined time period. the list of functions could easily be expanded if the manufacturer has additional requirements.

fig. 5 fade-in output effective power diagram (output power in p.u. versus time in units of 1x10-2 s)

• the device does not use transformation units for power supply. it has an original power ignition system based on a dual capacitor electronic circuit. the device is connected directly to the 220 V electric power grid via only two pins. the connection scheme is presented in the following picture (fig. 6):

fig. 6 simple connection scheme

it should be noted that the connection scheme is extremely simple; in fact it is the same as the classic installation. therefore, the existing installation for switching the lights can be used; changing the wall switch is the only requirement.

• it has no mechanical moving parts like relays, levers, or rotating pots that could easily break. it is made entirely of electronic components on a single-sided printed circuit board. instead of the classic switch it uses only a thin metal layer (iron, copper, aluminum or another conductor), connected to special touch detecting hardware. the microprocessor has a special subroutine for controlling this touch sensor. all commands of the user are detected only by a light touch of this layer.

• the device has two basic versions of input. it is possible to use a classic push button.
nevertheless, the main version of the prototype uses the touch sensor. the cpu program handles both variants. after one touch, the device goes into the fade-in program. the output voltage slowly rises from 20-30 V to approximately 95% of the nominal voltage (around 210 V). at any point of this process, the device can be paused, maintaining the current power level, again by only one touch. this level is stored in the internal non-volatile eeprom memory, so if the power grid supply is lost, the level will be recalled on the next energization. also, when the user wants to turn off the lights, one touch is enough: the device goes into the slow fade-out. these functions are in complete harmony with modern power-saving regulations.

• in contrast to the classic switch, which has only two power levels (on and off), the new device has many possible levels between 0% and 95%. for instance, with a 100 W incandescent bulb and this switch, the user could select all the standard powers of 25 W, 40 W, 60 W and 75 W, but also nonstandard ones such as 10 W, 33 W, 67 W and 80 W. also, the precise fade-in and fade-out procedures look visually attractive. the maximal power is about 95%, so the device constantly saves power even at maximal output.

• the gradual fade-in control program significantly extends the lifetime of standard incandescent lamps, by a factor of at least 7-8 according to our observations.

• the output power component used has a maximal current of 16 A. in our system, when the sensor metal layer is used at the same time as a heat sink (the standard configuration), an output power above 1 kW is easily reached. we have tested the device under 700 W output power for a few months and its performance was excellent.
if 1 kW is loaded (for instance two 500 W halogen lamps in a parallel circuit), after around 20 minutes the temperature of the touch sensor is slightly increased, but all functions still remain the same.

• the device has the very useful characteristic of universality. by changing only the microcontroller's program it is able to serve as:
  • a standard wall switch for room lamps,
  • alternating switches,
  • a cross switch,
  • a time relay,
  • a classic push button.

• if it is necessary to turn on one or more parallel connected incandescent light bulbs from different switch locations (for instance n places), the current system needs 2 alternating and (n − 2) cross switches. with this device, only one switch unit and n − 1 metal sensor layers (each coupled with one capacitor and resistor), connected in parallel, are needed. this solution is therefore much simpler and cheaper.

• the device signals the restoration of power on 220 V power grids by briefly turning the output lamps on and off using minimal power. if the device is at a certain power level and the power grid supply accidentally goes off and on, the device resets into the turn-off mode.

• the latest versions of the industrial prototypes of the device have been intensively tested in a real working environment for a few months continually, maintaining perfect operational functionality. the master device prototype has had around 2400 hours of continuous error-free operation.

• device dimensions are 33x24 mm, the height is 24 mm.
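the power levels and the fade-in behaviour described in the list above follow from phase-angle control: the later in each semi-period the triac is fired, the smaller the delivered power. a minimal python sketch, assuming an ideal triac and a purely resistive load (a textbook idealisation, not the authors' measured characteristic), relates the firing angle to the fraction of full power. the linear sweep in `fade_in` and its default angles are hypothetical choices for illustration; the firmware profile itself is described in the text only as sinusoidal.

```python
import math


def relative_power(alpha):
    """Fraction of full power for firing angle alpha (radians, 0..pi).

    Assumes an ideal triac and a resistive load, fired at the same
    angle in every semi-period: P/Pmax = 1 - a/pi + sin(2a)/(2*pi).
    """
    return 1.0 - alpha / math.pi + math.sin(2.0 * alpha) / (2.0 * math.pi)


def fade_in(steps, alpha_max=0.9 * math.pi, alpha_min=math.pi / 6.0):
    """Power levels while the firing angle sweeps from alpha_max to alpha_min.

    The linear sweep and both default angles are illustrative assumptions.
    """
    levels = []
    for k in range(steps):
        alpha = alpha_max + (alpha_min - alpha_max) * k / (steps - 1)
        levels.append(relative_power(alpha))
    return levels


# Firing at the end of the first sixth of the semi-period (alpha = pi/6).
print(f"{relative_power(math.pi / 6.0):.3f}")  # about 0.971
print([round(p, 3) for p in fade_in(5)])
```

under these assumptions, firing at the end of the first sixth of the semi-period still delivers about 97% of full power, which agrees with the statement in section 4.1 that the loss in light level is slight, and is of the same order as the roughly 95% maximal power quoted above.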
the other important device characteristics are systematized in the following table 2:

table 2 key device characteristics

temperature range: -20 to +70 °C
maximal output power: > 1000 W
power loss (output power is off): < 100 mW
power factor (output power is off): 0.028
supply current (output power is off): 14 mA
maximal enabled power grid voltage: > 440 V
permeable voltage (toward user): 2500 V
maximal sensor current (toward user): 10∙10^-6 A
device temperature (p = 700 W, tamb = 20 °C): 60 °C
type of pertinex plate: one-sided
dimension of the pertinex plate: 33 mm x 24 mm

6. tesla's fountain

the project "computer simulation and 3-d modelling of the original patents of nikola tesla", in cooperation with the "nikola tesla museum" of belgrade, an institution of national importance, was started in 2009. the main purpose of the project is to digitalize, visualize and reconstruct (with real models) the original legacy of the museum [13], [14], [15]. this research was conducted in the project entitled "computer simulation and modelling of the original patents of nikola tesla" and approved by the ministry of education, science and technological development of the republic of serbia [15]. the first of tesla's patents to be studied was tesla's fountain, no. 1,113,716, us patent office, granted oct. 13, 1914 (fig. 7). the first achievement of this project was a 300,000-particle 3d model of tesla's fountain (fig. 9). the applications used are 3ds max, adobe photoshop, realflow, v-ray and microsoft visual studio c++. we then used different tools to generate digital models of tesla's fountain in various configurations and environments. this work was conducted by a team of professors and students from the electronic faculty, assembled together with professional experts from the museum. we used the original construction schema from the museum (fig. 8) to generate the first 3d model of tesla's fountain.

fig.
8 tesla-tiffani original fountain construction (museum of nikola tesla heritage)

fig. 7 original tesla's fountain patent

the original construction indicates that the basic structure is made of solid metal. the main idea in the patent is to use just one induction motor to drive the water flow against gravity and to run the complex light control mechanism through mechanical gearing (fig. 8). our first innovation was to change the basic material to glass [16] (fig. 9).

fig. 9 tesla's 3d fountain model in glass

the tesla fountain in glass has advantages for research and development of the inner structure, as well as aesthetic benefits. after the initial structure modelling, we concentrated on the problem of light control. in his original patent (fig. 7) tesla used multi-colour filters to generate interesting colour effects in the fountain's water flow. we implemented this original idea in our model (fig. 10):

fig. 10 tesla fountain in glass with colour filters (detail)

to run this colour filter system on the top of the fountain, tesla proposed a complex mechanical system with differentials, similar to a clock mechanism. this unit reduces the high rotational speed of the main induction motor, which also runs the water pump on the same axis (fig. 7). our model follows this part of the original construction as well (fig. 11).

fig. 11 tesla fountain with mechanical reductor system

this mechanism is embedded in the central part of the fountain, out of the water flow and above the main water level in the reservoir. it is integrated with a colour filter (fig. 12).

fig.
12 tesla's fountain with mechanical system (detail)

after the animation of the model, the main invention is presented visually: one induction motor pumps the water and, at the same time, is connected via mechanical differentials to run the colour filter, whose speed is proportional to the water flow. after the modelling and simulation, the next step is to build the real construction of the model [16], [17]. using current technology, it is easy to manufacture the basic structure using 3d printers. however, the main difficulty is producing the mechanical parts, whose cost far outstrips that of producing the rest of the basic construction. for this reason it was decided to substitute the whole mechanical part with a static colour lamp controlled by a microprocessor device, which is the main innovation presented in this paper.

7. conclusion

this paper presents the theoretical and practical realization of two versions of a universal power regulator for tesla's fountain. the power regulation device comprises hardware and software parts designed using the microchip pic 16f84/12f675 development system and other applications, going through the different design phases discussed. the main intention of this paper is to develop a complete substitute for the mechanical light control part of the tesla fountain. further research showed that general use of our solution is possible for a variety of loads. the processor does not require supporting memory or peripheral integrated circuits, which enables realization of the device with a very low component count and a small physical footprint. using the power regulator with non-inductive light sources like light bulbs and halogen lamps, it is possible to regulate the light intensity.
however, the power regulator may have other uses; in the case of electric heaters, for example, it is possible to regulate the temperature by controlling the heater current. the authors intend to develop further applications and modifications starting from the basic concept described in this paper.

acknowledgement: this paper is supported by project iii44006-10 of the ministry of education and science of the republic of serbia and by the museum of nikola tesla in belgrade, serbia. the authors also want to thank eng. dragan tošić, who is co-owner of the pending patent for the construction of the advanced microchip 12f675 microprocessor device.

appendix a

a detailed computer model and simulation of the device was also developed in the autocad application. that model was used to generate the following pictures:

fig. 13 blueprint of the device (dimensions are in millimeters)

fig. 14 3d models of the device

fig. 15 3d models of the device (different views)

fig. 16 method of sensor layer, device and plastic case assembly into the wall switch box (front view)

fig. 17 method of sensor layer, device and plastic case assembly into the wall switch box (side view)

fig. 18 first version of the device prototype. tested for over 2400 hours

fig. 19 last version of the device (industrial prototype)

fig. 20 illustration of the device's miniature size

fig. 21 side view of the device

fig.
22 compact solution for wall switch

appendix b

assembly code for microcontroller (listing columns: address, machine code, source):

    ;--------------------------------
    0018          procedure(periode)
    0018  01 96   clearq(kx)            ;key buffer
    0019  20 15   call wait_low         ;synchro
    001a  08 22   load(wx)              ;wait constant in millisecond
    001b  20 0d   call wait
    001c  16 06   set(portb,triac)      ;turn on triac gate
    001d  30 0a   load.(5*2)            ;delay
    001e  20 0d   call wait
    001f  12 06   reset(portb,triac)    ;turn off triac gate
    0020  30 be   load.(95*2)           ;delay and wait for second power half-pe
    0021  20 0d   call wait
    0022  16 06   set(portb,triac)      ;turn on triac gate
    0023  30 0a   load.(5*2)            ;delay
    0024  20 0d   call wait
    0025  12 06   reset(portb,triac)    ;turn off triac gate
    0026  00 08   return
    0027          ;--------------------------------

to control the "fade-in" procedure, the periode function is called repeatedly while the delay time (in memory register wx) is changed from its maximal value (minimal output power) to its minimal value (maximal/full output power). this is done through a loop on memory register ex:

    0048          light:
    0048  30 a0   mem(wx,maximum)       ;warming, 1sec
    0049  00 a2
    004a  30 32   for(ex,warm,m1)
    004b  00 90
    004c  20 18   call periode
    004d  0b 90   loop(ex,m1)
    004e  28 4c

references

[1] r. w. erickson and d. maksimovic, fundamentals of power electronics, second edition, kluwer academic publishers, dordrecht, 2001.
[2] j. sun, "pulse-width modulation", in dynamics and control of switched electronic systems: advanced perspective for modeling, simulation and control of power converters, chapter 2, springer, 2012.
[3] d. g. holmes, t. a. lipo, pulse width modulation for power converters: principles and practice, 1st edn., wiley-ieee press, piscataway, 2003.
[4] g. grandi and j.
loncarski, "simplified implementation of optimized carrier-based pwm in three-level inverters", electronics letters, vol. 50, no. 8, pp. 631-633, 2014.
[5] f. l. luo, h. ye, m. rashid, digital power electronics and applications, elsevier academic press, san diego, california, u.s.a., 2005.
[6] a. v. peterchev, digital control of pwm converters: analysis and application to voltage regulation modules, m.s. thesis, university of california, berkeley, u.s.a., 2002.
[7] j. huang, k. padmanabhan, and o. m. collins, "the sampling theorem with constant amplitude variable width pulses", ieee transactions on circuits and systems, vol. 58, no. 6, pp. 1178-1190, 2011.
[8] z. yu, y. fan, l. shi, g. lv, "a pseudo-natural sampling algorithm for low-cost low-distortion asymmetric double-edge pwm modulators", springer circuits, systems, and signal processing, 2014.
[9] y. kim, t. morie, "a pwm-mode pixel-parallel image-processing circuit performing directional state-propagation and its application to subjective contour generation", springer circuits, systems, and signal processing, vol. 34, pp. 605-623, 2014.
[10] j. b. peatman, design with pic microcontrollers, prentice-hall, 1998.
[11] pic16c84: 8-bit cmos eeprom microcontroller, microchip technology inc., u.s.a., 1997.
[12] pic16c84: eeprom memory programming specification, microchip technology inc., u.s.a., 1997.
[13] v. vuckovic, a. stanisic, n. simic, "computer simulation and vr model of the tesla's wardenclyffe laboratory", digital applications in archaeology and cultural heritage, vol. 7, pp. 42-50, 2017.
[14] v. vuckovic, s. spasi, "3-d stereoscopic modeling of the tesla's long island", facta universitatis, series: electronics and energetics, vol. 29, pp. 113-126, 2016.
[15] v. vuckovic, a. stanisic, s. le blond, "virtual reality modelling and simulation of the tesla's radio controlled boat", pp. 61-64, 2017.
[16] v. vuckovic, v. mitic, lj. kocic, b. arizanovic, v. paunovic, r.
nikolic, "tesla's fountain, modeling and simulation in ceramics technology", journal of the european ceramic society, vol. 38, no. 8, pp. 3049-3056, 2018.
[17] v. vuckovic, v. mitic, lj. kocic, v. nikolic, "the fractal nature approach in ceramics materials and discrete field simulation", science of sintering, vol. 50, no. 3, pp. 371-385, 2018.

facta universitatis series: electronics and energetics vol. 34, no 2, june 2021, pp. 259-280
https://doi.org/10.2298/fuee2102259k
© 2021 by university of niš, serbia | creative commons license: cc by-nc-nd
original scientific paper

forced stack sleep transistor (fortran): a new leakage current reduction approach in cmos based circuit designing

sankit r kassa¹, neeraj kumar misra², rajendra nagaria³
¹department of electronics and communication engineering, usha mittal institute of technology, mumbai, india
²department of electronics and communication engineering, bharat institute of engineering and technology, hyderabad, india
³department of electronics and communication engineering, motilal nehru national institute of technology, allahabad, india

abstract. reduction in leakage current has become a significant concern in nanotechnology-based low-power, low-voltage, and high-performance vlsi applications. this research article discusses a new low-power circuit design approach, fortran (forced stack sleep transistor), which reduces leakage power in cmos-based circuit design in the vlsi domain. the fortran approach reduces leakage current in both active and standby modes of operation. furthermore, the transition of the circuit from active mode to standby mode and vice versa is not time intensive. to validate the proposed design approach, experiments are conducted in the tanner eda tool of the mentor graphics bundle on circuit designs for a full adder, a chain of 4 inverters, and a 4-bit multiplier, using the 180 nm, 130 nm, and 90 nm tsmc technology nodes.
the outcomes obtained show a 95-98% reduction in leakage power as well as a 15-20% reduction in dynamic power, with a minor increase in delay. the results are compared with the notable design approaches that are available for both active and standby modes of operation.

key words: fortran approach, submicron region, standby and active modes of operation, power optimization, sub-threshold leakage current

received september 23, 2020; received in revised form march 20, 2021
corresponding author: neeraj kumar misra, bharat institute of engineering and technology, hyderabad-501510, india (e-mail: neeraj.mishra3@gmail.com)

1. introduction

power consumption is one of the main issues to be taken care of in vlsi circuit design, for which cmos is the most important technology. today's emphasis on low power is not only due to portable devices such as smartphones, but also due to other issues like increased leakage current, high power dissipation and fabrication cost; even before the mobile era, power consumption was a significant problem. to minimize power dissipation, many researchers have proposed different ideas from the device level to the architectural level [1]. numerous methods have been discussed to reduce leakage power in cmos based circuit design [2–10]. each approach delivers a novel way to reduce leakage power, but each also has shortcomings that limit its claim to be the best. as device feature sizes keep decreasing, the threshold voltage also declines, which increases the static power dissipation [11]. as a result, the transistor cannot be turned off completely in standby mode. high power consumption reduces battery life if the device is battery powered. it also affects the reliability, packaging, cooling costs and performance of the device.
the primary sources of power dissipation in vlsi circuits are: (1) dynamic power consumption, which occurs due to charging and discharging of the load capacitance and accounts for about 90-92% of the total in technology processes with feature size larger than 1 µm; (2) short circuit power consumption, due to the direct path established between the power supply and ground during transitions in the logic gates; (3) leakage current, which arises mainly from reverse bias diode currents or subthreshold leakage current. the short circuit power dissipation can be reduced by 10% by designing the circuit to have equal rise and fall times. to minimize power consumption in the deep-submicron region, the supply voltage is scaled down, due to which the dimensions of the device change and the delay is increased. to maintain the performance of the cmos circuit, the threshold voltage ($V_{th}$) of the transistor should be reduced [12]. scaling down $V_{th}$ results in an exponential increase in the subthreshold leakage current. the total average power dissipation in cmos circuits can be expressed by equation 1:

$$P_{avg} = P_{dynamic} + P_{short\ circuit} + P_{leakage} \qquad (1)$$

$P_{dynamic}$ is the switching component of the power dissipation, specified by equation 2:

$$P_{dynamic} = \alpha \cdot C_L \cdot V_{dd}^{2} \cdot f_{clk} \qquad (2)$$

where $\alpha$ is the node transition activity factor (the average number of times the node makes a power-consuming transition in one clock period), $C_L$ is the load capacitance, $V_{dd}$ is the supply voltage, and $f_{clk}$ is the clock frequency.

in this manuscript, a new low-power approach is proposed, which gives low-power vlsi designers and engineers an innovative option for reducing leakage current much more effectively, at the cost of a small increase in area and delay. an unbiased comparison of the proposed approach with previously available low-power approaches is made to estimate the accuracy of the proposed approach.
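equations (1) and (2) can be turned into a short numerical sketch, which makes the quadratic dependence of the dynamic power on the supply voltage explicit. the python code below is only an illustration: the function names and all numeric values (activity factor, load capacitance, supply voltages, clock frequency) are assumptions chosen for the example and are not results from this paper.

```python
def dynamic_power(alpha, c_load, v_dd, f_clk):
    """Switching power from equation (2): alpha * C_L * Vdd^2 * f_clk."""
    return alpha * c_load * v_dd ** 2 * f_clk


def average_power(p_dynamic, p_short_circuit, p_leakage):
    """Total average power from equation (1)."""
    return p_dynamic + p_short_circuit + p_leakage


# Hypothetical operating point: alpha = 0.1, C_L = 10 fF, f_clk = 100 MHz.
p_high = dynamic_power(0.1, 10e-15, 1.8, 100e6)
p_low = dynamic_power(0.1, 10e-15, 1.2, 100e6)
print(f"dynamic power at 1.8 V: {p_high * 1e6:.3f} uW")
print(f"ratio after scaling Vdd to 1.2 V: {p_low / p_high:.3f}")
```

scaling the supply from 1.8 V to 1.2 V in this example cuts the dynamic power to (1.2/1.8)^2, or about 44%, of its original value, which is why supply scaling is attractive even though it forces the threshold voltage down and so raises leakage.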
the tanner design tools s-edit, t-spice, and w-wave are used in this work for schematic capture, spice simulation, and waveform viewing, respectively. the main contributions of this work on fortran-based digital circuits are as follows:

▪ we demonstrate the novel fortran (forced stack sleep transistor) approach, which reduces leakage power in cmos-based circuit design in the vlsi domain.
▪ we present fortran as a novel and efficient circuit-level leakage power reduction approach, which can reduce leakage power as well as dynamic power by a considerable amount.
▪ we implement the fortran cmos based nand-2 and nor-2 layouts in l-edit version 16.0.
▪ we demonstrate the nand-2 and nor-2 layout and area results.
▪ the pre-layout (synthesized or gate-level) and post-layout results of cmos nand-2 and cmos nor-2 have passed all verifications of the asic design flow.
▪ we estimate the area of cmos nand-2 and nor-2 after the drc check passes.
▪ we compare the performance of cmos nand-2 and nor-2 with state-of-the-art work.

this paper is organized as follows: section i introduces different approaches to minimizing power loss. section ii discusses the previously reported approaches to leakage current reduction. section iii explains the proposed fortran approach. section iv gives a logic gate implementation using the fortran approach, with a table used to select the control input signals g1 and g2 in the active mode of operation. in section v, comparative delay analysis is discussed. simulation results and discussion are given in section vi. finally, the conclusion is given in section vii.

2. previous approaches

numerous methods for minimizing leakage power are reported in the literature [13-20], which are mostly based on modes of operation.
they are classified into two categories:

(1) standby/idle mode: when the circuit is in the idle state, the circuit is cut off from the power rails and leakage power occurs.
(2) active mode: leakage is minimized during run time by stacking transistors, thereby reducing the overall leakage power.

this section discusses previous leakage reduction approaches that primarily focus on reducing the leakage power consumption of cmos circuits. mtcmos, lector, sleepy stack and indep are some of the approaches whose primary goal is to reduce leakage power at the gate level. a well-known transistor stacking model, in which transistors are connected in series with each other, is suggested in [20]. by placing some off-state transistors in series, the leakage power of a logic circuit can be minimized. attaching more transistors in a stack saves more leakage current. the control of the threshold voltage in the stacking approach depends on the gate to source voltage, drain to source voltage and substrate to source voltage of the stacked transistors. the use of multiple threshold voltage cmos (mtcmos) technology for leakage control is shown in fig. 1 (a) [21]. high-threshold sleep transistors are attached between the logic block and the vdd and gnd power rails. in standby mode, the high-threshold sleep transistors are turned off, which reduces the leakage current dramatically by disconnecting the actual power supply vdd and gnd from the logic block. a controller is required to switch the sleep transistors between sleep mode and active mode. extra processing steps are required to manage the high-threshold sleep transistor controller. moreover, there is also a performance penalty because the high-threshold voltage transistor appears in series with all the switching current paths.
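the benefit of a high-threshold sleep transistor can be illustrated with the textbook subthreshold current model, in which the off-state current depends exponentially on the threshold voltage. the python sketch below is only an illustration under stated assumptions: the model form is the standard first-order expression, and the parameter values (`i0`, slope factor `n`, the two threshold voltages) are hypothetical and not taken from this paper or from any tsmc process data.

```python
import math

BOLTZMANN = 1.380649e-23        # J/K
ELECTRON_CHARGE = 1.602176634e-19  # C


def subthreshold_current(v_gs, v_th, v_ds, i0=1e-7, n=1.5, temp_k=300.0):
    """First-order subthreshold current of an nmos transistor.

    I = i0 * exp((Vgs - Vth) / (n * vT)) * (1 - exp(-Vds / vT)),
    with thermal voltage vT = k*T/q. i0 and n are assumed values.
    """
    v_t = BOLTZMANN * temp_k / ELECTRON_CHARGE
    return i0 * math.exp((v_gs - v_th) / (n * v_t)) * (1.0 - math.exp(-v_ds / v_t))


# Off-state (Vgs = 0) leakage for a low-Vth and a high-Vth device.
i_low_vth = subthreshold_current(0.0, 0.3, 1.8)
i_high_vth = subthreshold_current(0.0, 0.5, 1.8)
print(f"leakage ratio low-Vth / high-Vth: {i_low_vth / i_high_vth:.0f}")
```

with these assumed parameters, raising the threshold voltage by 0.2 V lowers the off-state current by more than two orders of magnitude, which is the effect the mtcmos sleep transistors exploit in standby mode.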
the dual-vth approach, which uses transistors with two different threshold voltages, is a variation of the mtcmos approach. in both the mtcmos and dual-vth approaches, additional mask layers are required for each threshold voltage value, so that the transistors can be fabricated selectively according to their assigned threshold values.

fig. 1 leakage power approaches: (a) mtcmos [21], (b) lector [22], (c) sleepy stack [23], (d) indep [24]

the leakage control transistor (lector) approach, shown in fig. 1 (b), is a single-threshold, input dependent method to reduce leakage current [22]. it requires only two additional transistors: two leakage control transistors (lcts) of the same threshold voltage type are used, one pmos transistor between the pun and the output and one nmos transistor between the output and the pdn. the gate terminal of each leakage control transistor (lct) is controlled by the source of the other. in this approach, one of the lcts is always near its cutoff for any combination of input signals. the drawbacks of the lector approach are that the output does not achieve a full voltage swing and that the propagation delay of the design is higher. the sleepy stack approach, shown in fig. 1 (c), combines the sleep and stack approaches [23]. it replaces each transistor of the cmos circuit with three transistors: the existing transistor is divided into two half-size transistors, as in the stack approach, and one high-vth sleep transistor is placed in parallel with one of them. during sleep mode, the sleep transistors are turned off, and the stacked transistors suppress leakage current while also saving state.
here each sleep transistor placed in parallel also reduces the propagation delay by reducing the resistance of the path during active mode. however, the area penalty is the most significant issue of this approach, because three transistors replace every single transistor of the cmos circuit, and additional wiring is required for each sleep signal. it therefore requires about three times the area of the cmos approach if the same transistors are used in the design. another recent approach to decrease leakage power dissipation is the indep (input dependent) approach, illustrated in fig. 1 (d), which is based on the boolean logic circuit [24]. the gate terminals of the additional built-in transistors depend on the input combinations of the main logic circuit. therefore, the selection of the input gate voltages is crucial for reducing the leakage current efficiently. the slight increase in circuit area is the drawback of this design approach. in the literature, several cmos circuit designs based on mtcmos, lector, sleepy stack and indep have been proposed, but none of them contains a quantitative architectural comparison with the fortran approach. this work proposes the reduction of leakage current, dynamic power and delay for the logical inputs using the fortran approach. building on the existing works [21-24], we believe that the improvement over the conventional designs validates the proposed fortran-based digital cmos circuit design strategy. for a fair comparison, the proposed fortran designs and the conventional mtcmos, lector, sleepy stack and indep designs are evaluated together. the proposed designs, namely a full adder, a chain of 4 inverters and a 4-bit multiplier, have been analyzed at the 180nm, 130nm and 90nm technology nodes for static power, dynamic power and delay.
in this manuscript, the cmos, sleepy stack and indep approaches are compared with the proposed fortran circuit design approach. the sleepy stack approach saves a reasonable amount of leakage power while preserving the state of the logic gate, using both low-threshold and high-threshold voltage transistors. indep is a more recent approach that saves leakage power effectively by using only low-threshold voltage transistors in its structure. therefore, the fortran approach is compared with both of these advanced approaches (sleepy stack and indep) as well as with the most successful approach available to date (cmos) for minimizing leakage current.
3. fortran approach
the fortran (forced stack sleep transistor) approach combines the forced stack and sleep transistor approaches, as presented in fig. 1. the forced stack structure is made by splitting each single transistor of the existing cmos structure into two series transistors, which exploits the stacking effect: the resistance of the path from vdd to gnd increases sharply, which decreases the leakage power. the two extra transistors (p3 and n3) are connected to the forced stack structure in such a way that their drain is attached to the substrate of both forced stack transistors (p1-p2 for the pun, n1-n2 for the pdn), as shown in the drawing. in standby mode, when these extra transistors (p3 and n3) are on, they automatically increase the threshold voltage of the stack transistors, which then behave like high-threshold voltage transistors and do not allow much leakage current to pass through them. therefore, the leakage current decreases by a reasonable amount in standby mode. the gate inputs of these two extra sleep transistors are connected opposite to those of the forced stack transistors in the pun and pdn.
as a result, switching a transistor from active mode to standby mode does not take much time. the beauty of the proposed approach is that only low-threshold voltage transistors are used in the design. therefore, the practical problem of fabricating high-threshold voltage transistors, which is challenging and complicated, is avoided. it is difficult and time consuming to fabricate both types of transistors (low and high threshold voltage) on the same ic. in the fortran approach, the width of every pmos transistor is double the width of the nmos transistors. therefore, at the 180nm technology node, if the w/l ratio of an nmos transistor is taken as 250nm/180nm, the w/l ratio of a pmos transistor should be taken as 500nm/180nm. this compensates for the mobility ratio µn/µp of about 2, since the electron mobility (µn) is in general about twice the hole mobility (µp). sizing the widths according to the mobility ratio keeps the flow of charge carriers in the nmos (electrons) and pmos (holes) devices balanced.
3.1 functioning of fortran structure
fig. 2 shows the structure of the general fortran approach. two modes of operation are explained here: active mode and standby mode. in active mode, when input signal pulses are applied at the input terminal of the circuit, g1 and g2 should be turned on. the forced stack transistor structure (p1-p2 and n1-n2) is then on, and the parallel transistors (p3, n3) are off, because logic 0 is applied at g1 and logic 1 at g2. due to this arrangement, the resistance from vdd to node x1 decreases, keeping the forced stack structure (p1-p2 and n1-n2) on in active mode. therefore, the full supply voltage is achieved at node x1. similarly, the full virtual gnd is achieved at node x2 when g2 is on.
the virtual power supply lines vddv and gndv are established at nodes x1 and x2. in active mode, the nmos transistors n1 and n2 are active while n3 is in cutoff, and the pmos transistors p1 and p2 are active while p3 is in cutoff. in sleep/standby mode, the reverse happens. in standby mode, the stacking effect is exploited: the sub-threshold leakage current flowing through a stack of series-connected transistors is reduced when more than one transistor in the stack is turned off. this is known as the 'stacking effect.' it can be explained by connecting two transistors in series with each other, as shown in fig. 3. when both transistors p1 and p2 are turned off, the voltage at the intermediate node x is positive due to the small drain current present there.
fig. 2 fortran general approach
fig. 3 working of fortran approach
in standby mode, the extra pmos and nmos transistors are connected in parallel with the forced stack structure. their gate terminals are connected opposite to those of the forced stack structure of the pun and pdn, which plays a crucial role in standby mode: it converts the low-threshold voltage transistors into high-threshold voltage transistors. table 1 lists the values of the parameters used in the simulation tool. all the values listed in table 1 are important for analyzing the performance of the fortran-based digital circuit designs.
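the sizing rule of section 3 (pmos width equal to the mobility ratio times the nmos width) can be checked against the widths used at each node. the following is only a minimal sketch: the nmos widths are the ones listed in table 1, the ratio of 2 is the value stated in the text, and the names MOBILITY_RATIO, NMOS_WIDTH_UM and pmos_width_um are hypothetical and introduced here for illustration.

```python
# illustrative check of the sizing rule: pmos width = (mu_n / mu_p) * nmos width
# assumption: mobility ratio of 2, as stated in the text
MOBILITY_RATIO = 2.0

# nmos widths in micrometres for each technology node (values from table 1)
NMOS_WIDTH_UM = {"180nm": 0.25, "130nm": 0.2, "90nm": 0.15}


def pmos_width_um(nmos_width_um, ratio=MOBILITY_RATIO):
    """return the pmos width that compensates for the lower hole mobility"""
    return ratio * nmos_width_um


pmos_widths = {node: pmos_width_um(w) for node, w in NMOS_WIDTH_UM.items()}

for node, width in pmos_widths.items():
    print(f"{node}: nmos w = {NMOS_WIDTH_UM[node]} um, pmos w = {width} um")
```

the computed pmos widths can be compared directly with the pmos (width) row of table 1.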
table 1 set of simulation parameters for different technology nodes
parameter (unit) | 180nm technology node | 130nm technology node | 90nm technology node
supply voltage (v) | 1.8 | 1.3 | 1.1
frequency (mhz) | 25 | 25 | 25
nmos width (µm) | 0.25 | 0.2 | 0.15
nmos length (µm) | 0.18 | 0.13 | 0.1
pmos width (µm) | 0.5 | 0.4 | 0.3
pmos length (µm) | 0.18 | 0.13 | 0.1
nmos vth (v) | 0.399 | 0.332 | 0.26
pmos vth (v) | -0.42 | -0.349 | -0.303
temperature (°c) | 25 | 25 | 25

table 2 comparative analysis of inverter for dynamic power, delay and static power
approach | dynamic power (µw) | delay (ns) | static power, input 0 (pw) | static power, input 1 (pw) | average static power (pw)
180nm technology node
cmos | 1.85 | 10.5 | 382 | 676 | 529
sleepy stack | 1.42 | 72.6 | 374 | 384 | 379
indep | 1.56 | 79.8 | 11.9 | 115 | 63.45
fortran | 1.14 | 97.6 | 11.36 | 18.38 | 14.87
130nm technology node
cmos | 0.382 | 6.11 | 208 | 399 | 303.5
sleepy stack | 0.293 | 10.6 | 398 | 395 | 396.5
indep | 0.296 | 13.1 | 44.4 | 41.4 | 42.9
fortran | 0.234 | 47.67 | 13.59 | 3.04 | 8.315
90nm technology node
cmos | 0.22 | 5.77 | 220 | 318 | 269
sleepy stack | 0.146 | 8.82 | 317 | 315 | 316
indep | 0.138 | 5.76 | 41.5 | 19 | 30.25
fortran | 0.104 | 7.8 | 2.77 | 6.13 | 4.45

table 2 presents the simulation results for an inverter, comparing the fortran inverter structure with previously available approaches. the sleepy stack approach is considered because it is one of the successful state-saving approaches that effectively decreases leakage current using forced stack transistors in its design, and the indep (input dependent) approach is considered because, to the authors' knowledge, it is the most advanced approach to reduce leakage current. the analysis is carried out at the 180nm, 130nm and 90nm technology nodes for a fair comparison between the different approaches. as shown in table 2, the dynamic power saving of the fortran approach is almost 32% compared to the cmos approach, and it reaches 45% at the 90nm technology node. in addition, leakage power dissipation is reduced by a larger amount in the proposed approach than in the other approaches.
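the leakage reduction relative to cmos can be computed directly from the average static power column of table 2. the sketch below is illustrative only: the numbers are copied from table 2, while the names AVG_STATIC_PW, saving_percent and leakage_savings are hypothetical helpers, not part of the original simulation flow.

```python
# average static power in pW for the inverter, copied from table 2
AVG_STATIC_PW = {
    "180nm": {"cmos": 529.0, "sleepy stack": 379.0, "indep": 63.45, "fortran": 14.87},
    "130nm": {"cmos": 303.5, "sleepy stack": 396.5, "indep": 42.9, "fortran": 8.315},
    "90nm": {"cmos": 269.0, "sleepy stack": 316.0, "indep": 30.25, "fortran": 4.45},
}


def saving_percent(reference, value):
    """percentage reduction of value with respect to reference"""
    return 100.0 * (reference - value) / reference


def leakage_savings(approach="fortran"):
    """static power saving of an approach versus cmos at every node"""
    return {
        node: saving_percent(row["cmos"], row[approach])
        for node, row in AVG_STATIC_PW.items()
    }


savings = leakage_savings()
for node, pct in savings.items():
    print(f"{node}: {pct:.2f}% lower average static power than cmos")
```

calling leakage_savings("indep") or leakage_savings("sleepy stack") gives the corresponding figures for the other approaches; a negative value means that the approach dissipates more static power than cmos at that node.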
leakage power is decreased by almost 95-98% compared to the cmos approach, which is significant considering that only low-threshold voltage transistors are used in the proposed design. this approach gives the vlsi circuit designer a new direction of research for minimizing leakage power by effectively using low-threshold voltage transistors in the structure. high-threshold transistors are best for minimizing leakage current, but their fabrication process has not yet reached the mainstream with the same proficiency as that of low-threshold voltage transistors. one concern with the proposed approach is a small increase in the total circuit area. however, this is no longer a primary problem, because high transistor integration density is now possible: millions of transistors can be placed on a single chip using various advanced fabrication approaches. as the technology node is scaled down, an improvement in dynamic power is observed in the t-spice simulation study.
4. fortran logic gate implementation
the nand-2 and nor-2 logic gates based on the fortran approach, with two inputs and a single output, are shown in fig. 4. as shown in fig. 4, nodes a and b are the two inputs and node y is the output of the fortran nand-2 gate, whereas g1 and g2 are the two control inputs that connect vdd and gnd to the circuit. in active mode, when a and b are the same, g1 and g2 take the same value as a and b, so that transistors p4, p5, n4 and n5 produce the complete logic level at the output terminal by connecting either vdd or gnd to the output.
here the forced stack transistor effect makes the resistance of the path very low in active mode, and thus less active power is dissipated compared to the cmos approach. in standby mode, as soon as the sleep transistors n3 and p3 turn on, they automatically increase the threshold voltage of the forced stack transistor structure, which then behaves like high-threshold voltage transistors and decreases the leakage current by increasing the resistance of the path from vdd to the output in the pun and from gnd to the output in the pdn.
fig. 4 fortran (a) nand-2 and (b) nor-2 gate
fig. 5a presents the physical layout of the fortran-based nand-2 gate using the generic250nm process libraries. the physical verification was carried out in l-edit v16.0 of the mentor graphics tool. the nand-2 gate has 10 transistors, of which 5 are pmos and 5 are nmos, and occupies a small area. this is clear from the physical layout of the nand-2 gate, which has a highly symmetric configuration with an equal number of nmos and pmos transistors. the measured area of the nand-2 gate is 67.37 μm². the drc results of the physical layout of the fortran-based nand-2 gate are shown in fig. 5b. the physical layout was carried out in l-edit, and the net-list obtained after the physical verification flow is shown in fig. 6.
fig. 5 design of fortran based nand-2 (a) layout (b) drc results
fig. 6 netlist extract on the physical layout of nand-2 cell.
the fortran-based nor-2 logic unit has two inputs, a and b, and two control inputs, g1 and g2, as shown in fig. 7a. the nor-2 design uses 5 nmos and 5 pmos transistors. the physical layout of the nor-2 gate has an area of 68.25 μm².
the generic250nm process libraries were used for the physical layout of the nor-2 gate. because the nor-2 circuit is connected to the power supply and ground through these transistors, the design has a larger output voltage swing and better driving capability. design rule check (drc) verifies the layout against the layout rules of the technology and is used for physical (post-layout) verification in the asic design flow. the physical layout of the nor-2 gate passed all verifications of the asic design flow, as shown in fig. 7b. the net-list extracted from the physical layout of the nor-2 gate is presented in fig. 8. table 3 gives the complete details of the fortran approach for the different input levels of the nand-2 gate. all the input values are chosen so that the circuit remains in active mode. the behavior of the different transistors is shown at each input level for a fair comparison. from table 3, it can be seen that in the nand gate the control signals g1 and g2 take the same value as the inputs when the inputs are identical, but if the two inputs differ, both g1 and g2 should be set to logic 0. table 4 gives the analysis of the nor-2 gate for the different inputs. from the table, it can be seen that when the input values are the same, both control signals g1 and g2 should take the same value as the inputs, but if the inputs differ, g1 and g2 should be at logic 1 to keep the circuit in active mode. the transistor states listed for the different input values show which transistors should be on and which should be off for a given input. for standby mode, the control signals g1 and g2 should take values different from those used in active mode.
fig. 7 design of nor-2 (a) layout (b) drc results
fig. 8 netlist extract on the physical layout of nor-2 cell.
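the active-mode rule for the control signals described above can be written compactly as a small function. this is a sketch of the stated rule only, not of the transistor-level circuit; the function names active_mode_controls and gate_output are hypothetical and introduced here for illustration.

```python
def active_mode_controls(gate, a, b):
    """return (g1, g2) that keep a fortran gate in active mode.

    rule taken from the text: identical inputs -> g1 = g2 = input value;
    different inputs -> logic 0 for nand-2, logic 1 for nor-2.
    """
    if gate not in ("nand", "nor"):
        raise ValueError("gate must be 'nand' or 'nor'")
    if a == b:
        return (a, a)
    level = 0 if gate == "nand" else 1
    return (level, level)


def gate_output(gate, a, b):
    """ideal logic output y of the two-input gate"""
    if gate == "nand":
        return int(not (a and b))
    return int(not (a or b))


# enumerate every input combination for both gates
rows = [
    (gate, a, b) + active_mode_controls(gate, a, b) + (gate_output(gate, a, b),)
    for gate in ("nand", "nor")
    for a in (0, 1)
    for b in (0, 1)
]

for gate, a, b, g1, g2, y in rows:
    print(f"{gate}-2: a={a} b={b} -> g1={g1} g2={g2} y={y}")
```

the printed g1, g2 and y values can be compared row by row with the control signal and output columns of tables 3 and 4.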
table 3 input signals in fortran nand-2 gate
(columns: inputs a, b; control signals g1, g2; working condition of transistors p1-p5 and n1-n5; output y)
a | b | g1 | g2 | p1 | p2 | p3 | p4 | p5 | n1 | n2 | n3 | n4 | n5 | y
0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 1
0 | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1
1 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 1
1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 0

table 4 input signals in fortran nor-2 gate
(columns: inputs a, b; control signals g1, g2; working condition of transistors p1-p5 and n1-n5; output y)
a | b | g1 | g2 | p1 | p2 | p3 | p4 | p5 | n1 | n2 | n3 | n4 | n5 | y
0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 1
0 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 0 | 1 | 0
1 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 1 | 0
1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 0

5. comparative analysis of delay
in this section, an analytical delay model for an inverter based on the fortran approach is derived and compared with those of the basic cmos, indep and sleepy stack approaches. all the approaches are considered in active mode. typically, the transistor delay (td0) of a plain cmos inverter driving a load cl, as shown in fig. 9, can be expressed using equation (3):
$t_{d0} = c_l\, r_t$ (3)
where rt is the transistor resistance and cl is the load capacitance. cin denotes the input capacitance. although the non-saturation mode equation is complicated, an adequate first-order gate delay can be predicted from equation (3). the delays of the other approaches are now derived one by one and compared with that of the basic cmos approach.
fig. 9 (a) inverter circuit schematic (b) rc equivalent circuit
5.1. indep approach
from fig. 10, the delay of the indep approach can be expressed as
$t_{d1} = (r_{t1} + r_{t2})\, c_l + r_{t2}\, c_{x1}$
here rt1 and rt2 are the resistances of mn2 and mn1, respectively, which are connected in forced stack manner, so that rt1 and rt2 are each lower than the resistance rt of the cmos transistor.
the capacitance cx1 is smaller than cl, as cx1 is the internal node capacitance between the two pull-down resistances. therefore, assuming cx1 = 0.5cl, rt1 = 0.5rt and rt2 = 0.5rt:
$t_{d1} = (0.5 r_t + 0.5 r_t)\, c_l + (0.5 r_t)(0.5 c_l) = r_t c_l + 0.25\, r_t c_l = 1.25\, r_t c_l = 1.25\, t_{d0}$ (4)
from this, it can be observed that the delay of the indep approach is almost the same as that of the cmos approach.
fig. 10 (a) indep inverter circuit schematic (b) rc equivalent circuit
5.2. sleepy stack approach
from fig. 11, the delay of the sleepy stack inverter can be expressed as
$t_{d2} = (r_{t1} + 0.5\, r_{t2})\, c_l + 0.5\, r_{t2}\, c_{x1}$
in the sleepy stack approach, two extra sleep transistors, mn1 and mn3, are added on the pdn side. both are high-vth transistors connected in parallel with each other, as shown, so their combined resistance is half of rt2. compared to cmos, rt1 is the resistance of a simple transistor and rt2 is the resistance of a high-vth transistor. the capacitance cx1 is 50% larger than cl, as it is the capacitance connected to three transistors. therefore, in the sleepy stack approach, rt1 = rt, rt2 = 1.25rt and cx1 = 1.5cl:
$t_{d2} = (r_t + 0.5 \times 1.25\, r_t)\, c_l + (0.5 \times 1.25\, r_t \times 1.5\, c_l) = 1.625\, r_t c_l + 0.9375\, r_t c_l = 2.5625\, r_t c_l \approx 2.56\, t_{d0}$ (5)
from this, it can be observed that the delay of the sleepy stack approach is about 2.56 times that of the cmos approach.
fig. 11 (a) sleepy stack inverter circuit schematic (b) rc equivalent circuit
5.3. fortran approach
in the fortran approach, any circuit is designed with the same low-threshold voltage transistors that are used in the cmos approach.
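as a quick numerical check of equation (4), the sketch below evaluates the indep delay expression of section 5.1 with rt and cl normalised to 1. the function name stacked_delay and the normalisation are assumptions made for illustration; only the expression td1 = (rt1 + rt2) cl + rt2 cx1 and the values rt1 = rt2 = 0.5rt, cx1 = 0.5cl come from the text.

```python
def stacked_delay(rt1, rt2, cx1, cl):
    """first-order delay of two stacked pull-down resistances.

    the load cl is driven through rt1 + rt2, while the internal node
    capacitance cx1 is driven through rt2 only.
    """
    return (rt1 + rt2) * cl + rt2 * cx1


# normalised reference values (assumption: rt = cl = 1)
RT = 1.0
CL = 1.0

td0 = RT * CL  # plain cmos inverter, equation (3)
td1 = stacked_delay(rt1=0.5 * RT, rt2=0.5 * RT, cx1=0.5 * CL, cl=CL)

print(f"indep delay relative to cmos: {td1 / td0:.2f}")
```

the same function can be reused for other resistance and capacitance assumptions by changing the arguments.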
therefore, the resistances of transistors n1, n2 and n4 in the pull-down network, as well as the resistances of transistors p1, p2 and p4 in the pull-up network, are similar to those of the transistors of the cmos network, as shown in fig. 12. furthermore, transistors n1 and n2 are connected in series with each other, and together they are connected in parallel with p4.
fig. 12 (a) fortran inverter circuit schematic (b) rc equivalent circuit
transistors n1 and n2 each have a resistance double that of the cmos transistors, because these two transistors are connected as forced stack transistors. the capacitance cx1 is assumed to be one-fourth of cl, as cx1 is the capacitance at the node where the two series-connected forced stack transistors are combined in parallel with a single transistor, so three transistors are connected at cx1. the resistance seen at cx1 is therefore the parallel combination of the series pair (4rt) and the single transistor (rt), that is, (4×1)/(4+1) rt = (4/5) rt = 0.8rt, with cx1 = 0.25cl:
$t_{d3} = 0.8\, r_t c_{x1} + 1.8\, r_t c_l = 0.8 \times 0.25\, r_t c_l + 1.8\, r_t c_l = (10/5)\, r_t c_l = 2\, r_t c_l = 2\, t_{d0}$ (6)
from equation (6), it can be observed that the delay of the fortran approach is almost two times higher than that of the cmos approach.
6. simulation results
in this section, the main simulation results for the fortran approach are presented. all the data were obtained at the 180nm, 130nm and 90nm technology nodes using tsmc® technology in tanner eda, bundled with mentor graphics. table 1 shows the values of the parameters used in the simulation tool. all the simulation results are taken from the tanner eda design tool.
all the exhaustive operations are performed on the fortran topology, and the results are verified. tables 5 and 6 present the comparative analysis of the nand-2 and nor-2 gates. the fortran-based designs have a lower average static power than the cmos, sleepy stack and indep designs. table 7 presents the area of the nand-2 and nor-2 gates. the leakage power dissipation of the two-input nand and nor gates has been observed. the leakage power is measured in the static phase of the circuit by applying all possible combinations of inputs. the total leakage power dissipation is the sum of the leakage power dissipation components over all input combinations. the leakage power dissipation of different benchmark circuits, such as a full adder and a chain of 4 inverters, has been measured for a proper investigation of the proposed circuit design approach. the fortran approach is also compared with the conventional cmos approach at different frequencies. from the results in table 8 and table 9, it can be observed that up to 95-98% of the leakage power is saved compared to the standard cmos gate.
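the summation and averaging over input combinations described above can be sketched as follows, using the 180nm nand-2 static power values reported in table 5. the dictionary and function names (STATIC_PW_NAND2_180NM, total_and_average) are hypothetical and only serve this illustration.

```python
# static power in pW per input combination (a, b), nand-2 at 180nm (table 5)
STATIC_PW_NAND2_180NM = {
    "cmos": {(0, 0): 669.0, (0, 1): 665.0, (1, 0): 684.0, (1, 1): 693.0},
    "sleepy stack": {(0, 0): 14.4, (0, 1): 384.0, (1, 0): 304.0, (1, 1): 135.0},
    "indep": {(0, 0): 2.77, (0, 1): 14.1, (1, 0): 9.98, (1, 1): 154.0},
    "fortran": {(0, 0): 8.57, (0, 1): 5.12, (1, 0): 3.47, (1, 1): 9.46},
}


def total_and_average(per_input):
    """sum the leakage over all input combinations and return (total, average)"""
    total = sum(per_input.values())
    return total, total / len(per_input)


results = {
    name: total_and_average(per_input)
    for name, per_input in STATIC_PW_NAND2_180NM.items()
}

for name, (total, average) in results.items():
    print(f"{name}: total = {total:.2f} pw, average = {average:.2f} pw")
```

the averages obtained this way correspond to the average static power column of table 5.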
table 5 comparative analysis of nand-2 gate
approach | dynamic power (µw) | delay (ns) | static power (pw) at (0,0) | (0,1) | (1,0) | (1,1) | average static power (pw)
180nm technology node
cmos | 1.35 | 138 | 669 | 665 | 684 | 693 | 677.75
sleepy stack | 1.59 | 170 | 14.4 | 384 | 304 | 135 | 209.35
indep | 1.28 | 124 | 2.77 | 14.1 | 9.98 | 154 | 45.21
fortran | 1.16 | 185 | 8.57 | 5.12 | 3.47 | 9.46 | 6.65
130nm technology node
cmos | - | - | 636 | 208 | 198 | 798 | 460
sleepy stack | 0.26 | 100 | 396 | 404 | 417 | 391 | 402
indep | 0.23 | 71.1 | 24.81 | 39.9 | 41.55 | 34.75 | 35.25
fortran | 0.19 | 77.08 | 18.57 | 19.19 | 17.71 | 10.1 | 16.39
90nm technology node
cmos | - | - | 483.7 | 435.8 | 377.17 | 637.87 | 483.63
sleepy stack | 0.12 | 62.68 | 316.5 | 311.7 | 325.95 | 311.27 | 316.35
indep | 0.11 | 45.15 | 28.71 | 48.09 | 43.81 | 68.39 | 47.25
fortran | 0.09 | 59.68 | 18.55 | 25.28 | 25.31 | 21.3 | 22.61

table 6 comparative analysis of nor-2 gate
approach | dynamic power (µw) | delay (ns) | static power (pw) at (0,0) | (0,1) | (1,0) | (1,1) | average static power (pw)
180nm technology node
cmos | 1.86 | 5.05 | 672 | 676 | 684 | 674 | 676.5
sleepy stack | 2.3 | 5.07 | 764.43 | 676.9 | 654.89 | 141.95 | 536.43
indep | 1.73 | 5.09 | 18.5 | 119 | 115 | 77.4 | 64.39
fortran | 1.43 | 5.08 | 8.82 | 15.46 | 13.67 | 13.92 | 12.96
130nm technology node
cmos | - | - | 427.24 | 386.23 | 395.58 | 391.75 | 400.19
sleepy stack | 0.37 | 4.97 | 417.7 | 400.15 | 317.21 | 19.07 | 288.53
indep | 0.33 | 4.96 | 57.48 | 18.21 | 19.01 | 9.94 | 26.16
fortran | 0.25 | 5.02 | 21.31 | 8.26 | 10.63 | 8.17 | 12.09
90nm technology node
cmos | - | - | 868 | 316.02 | 281.82 | 41.96 | 376.95
sleepy stack | 0.17 | 4.89 | 316.71 | 306.64 | 315.05 | 311.5 | 312.47
indep | 0.13 | 4.89 | 74.1 | 44.32 | 41.56 | 23.28 | 45.815
fortran | 0.11 | 4.9 | 25.82 | 22.88 | 21.56 | 19.08 | 22.33

table 7 area comparison of the nand-2 and nor-2 design
design | cmos | sleepy stack | indep | fortran
nand-2 area (µm²) | 26.92 | 53.84 | 65.23 | 67.37
nor-2 area (µm²) | 28.57 | 56.488 | 68.59 | 68.25

table 8 comparative analysis of dynamic power (µw) for an inverter at different frequencies
sr. no. | frequency (mhz) | 180nm at 25 °c: cmos | 180nm at 25 °c: fortran | 130nm at 25 °c: cmos | 130nm at 25 °c: fortran | 90nm at 25 °c: cmos | 90nm at 25 °c: fortran | 180nm at 110 °c: cmos | 180nm at 110 °c: fortran | 130nm at 110 °c: cmos | 130nm at 110 °c: fortran | 90nm at 110 °c: cmos | 90nm at 110 °c: fortran
1 | 10 | 0.22 | 0.2 | 0.04 | 0.03 | 0.02 | 0.02 | 0.25 | 0.23 | 0.05 | 0.04 | 0.03 | 0.03
2 | 20 | 0.678 | 0.54 | 0.12 | 0.09 | 0.06 | 0.05 | 0.72 | 0.56 | 0.14 | 0.1 | 0.08 | 0.06
3 | 50 | 1.12 | 0.86 | 0.2 | 0.14 | 0.1 | 0.09 | 1.2 | 0.9 | 0.235 | 0.16 | 0.12 | 0.1
4 | 100 | 2.25 | 1.72 | 0.4 | 0.29 | 0.19 | 0.18 | 2.39 | 1.75 | 0.46 | 0.31 | 0.24 | 0.19
5 | 200 | 4.51 | 3.47 | 0.79 | 0.68 | 0.38 | 0.37 | 4.78 | 3.5 | 0.9 | 0.66 | 0.47 | 0.39
6 | 500 | 11.29 | 8.38 | 1.98 | 1.44 | 0.96 | 0.87 | 11.92 | 8.7 | 2.25 | 1.51 | 1.15 | 0.91

table 9 comparative analysis of full adder at 180nm technology node
approach | dynamic power (µw) | % saving of power | delay (ns) | % saving of delay | leakage power (nw) | % saving of leakage power
cmos | 8.53 | - | 0.16 | - | 2.88 | -
sleepy stack | 8.32 | +2.46 | 0.19 | -14.72 | 2.5 | 13.19
indep | 8.21 | +3.75 | 0.18 | -08 | 1.95 | 32
fortran | 7.25 | +15.01 | 0.18 | -12.27 | 1.84 | 36

table 8 shows the comparative analysis of dynamic power for an inverter circuit at different frequencies. from the table, it can be observed that as the temperature and frequency increase, the dynamic power of the fortran approach does not increase as steeply as that of the cmos approach. table 9 shows a comparative analysis of the full adder design at the 180nm technology node.
from the table, it can be seen that the fortran approach saves considerably more dynamic power and leakage power than the other approaches, with a small increase in delay. in the full adder design, a better saving of leakage power (36%) is achieved, but the delay is not optimal. the simulation results of a chain of 4 inverters are shown in table 10. from the results, one can observe that the static power is effectively reduced in the fortran approach compared to all other low-power approaches. static power is reduced by approximately 98% in the proposed fortran approach compared to the cmos approach, while dynamic power is reduced by almost 19%, at the cost of a 67% increase in delay. the general approach for the chain of 4 inverters is given in fig. 13. a simple inverter is cascaded four times, and vdd and ground are kept common in the design. for the chain of 4 inverters, a delay of 0.47 ns and a 97.99% saving of static power are achieved. this shows that there is a trade-off between static power and delay: optimal values of both cannot be obtained together. the results of the 4-bit multiplier circuit are compared in table 11, which shows the superiority of the fortran approach over the cmos approach.
fig. 13 general approach for a chain of 4 inverters

table 10 comparative analysis of chain of 4 inverters (180nm technology node)
approach | dynamic power (µw) | % saving of dynamic power | delay (ns) | % saving of delay | average static power (pw) | % saving of static power
cmos | 5.22 | - | 0.28 | - | 2721.50 | -
sleepy stack | 4.60 | 11.88 | 0.40 | -40.57 | 2115.50 | 22.27
indep | 4.97 | 4.79 | 0.39 | -37.01 | 644.98 | 76.30
fortran | 4.23 | 18.97 | 0.47 | -66.90 | 54.66 | 97.99

table 11 comparative analysis of 4-bit multiplier
approach | dynamic power (10^-4 µw) | % saving of dynamic power | delay (ns) | % saving of delay | static power (nw) at a=(0000) b=(0000) | a=(0011) b=(1100) | a=(1100) b=(0011) | a=(1111) b=(1111) | average static power (nw) | % saving of static power
analysis at 25 °c
cmos | 1.18 | - | 4.11 | - | 59.75 | 71.91 | 72.24 | 88.75 | 73.16 | -
fortran | 1.11 | 5.93 | 3.81 | 7.3 | 36.21 | 36.12 | 36.13 | 36 | 36.11 | 50.64
analysis at 50 °c
cmos | 1.23 | - | 4.05 | - | 197.43 | 236.22 | 237.67 | 286.2 | 239.38 | -
fortran | 1.21 | 1.62 | 3.73 | 7.9 | 115.39 | 115.03 | 133.14 | 114.02 | 119.395 | 50.12
analysis at 110 °c
cmos | 1.29 | - | 3.9 | - | 1871.08 | 2201.11 | 2221.82 | 2563.06 | 2214.26 | -
fortran | 1.26 | 2.32 | 3.55 | 8.97 | 1302 | 1127.36 | 1295.94 | 1272.98 | 1249.57 | 43.56

fig. 14 comparison of proposed fortran approach with reported approaches, (a) & (b) comparison with different approaches and (c) comparison at different temperature ranges
in figure 14, three temperatures are considered: 25 °c, 50 °c and 110 °c. two technologies, cmos and fortran, are considered to obtain the dynamic power, static power and delay. figure 14 shows the simulation graphs of the proposed fortran approach and various reported approaches for the full adder, the chain of 4 inverters and the 4-bit multiplier designs. from fig. 14 (a) and 14 (b), it can be seen that the dynamic power, delay and static power consumed are much lower in the fortran approach than in the other approaches.
similarly, the functionality of the proposed fortran approach is checked against the cmos approach at different temperatures, as depicted in fig. 14 (c), at the 180nm technology node for the 4-bit multiplier. it can be observed that as the temperature rises, the dynamic power and static power increase in the cmos approach, whereas in the fortran approach the dynamic and static power do not increase in such a high proportion. therefore, the fortran approach gives decent results compared to other approaches in the nanoscale regime.
7. conclusion
in the deep sub-micron (sub-0.1 µm) regime, sub-threshold leakage power dissipation is almost equal to dynamic power dissipation and must be minimized with great care. furthermore, in battery-operated devices, for which battery life is of prime concern, leakage power has become an even more critical issue. in this paper, a novel and efficient circuit-level leakage power reduction approach, fortran, is presented, which can reduce leakage power as well as dynamic power by a decent amount. the main advantage of the fortran approach is that only low-threshold voltage transistors are used in its circuit structure, which is also the mainstream requirement of the current vlsi industry. various circuits, such as a full adder, a chain of 4 inverters and a 4-bit multiplier, have been analyzed to compare the fortran approach with the other well-known leakage power reduction approaches. the fortran approach decreases leakage power by 95 to 98% and dynamic power by 15 to 20%, with a slight increase in delay. the increase in delay can be tolerated when the proposed circuit design approach is implemented predominantly for battery-operated devices.
acknowledgements: the authors would like to thank the anonymous reviewers for their constructive criticism and effective advice that improved a preliminary version of this paper.

facta universitatis series: electronics and energetics vol. 27, no 3, september 2014, pp. 375-387 doi: 10.2298/fuee1403375d

user-awareness and adaptation in conversational agents

vlado delić1, milan gnjatović1,2, nikša jakovljević1, branislav popović1, ivan jokić1, milana bojanić1
1 faculty of technical sciences, university of novi sad, serbia
2 graduate school of computer sciences, megatrend university, belgrade, serbia

abstract: this paper considers the research question of developing user-aware and adaptive conversational agents.
the conversational agent is a system which is user-aware to the extent that it recognizes the user identity and his/her emotional states that are relevant in a given interaction domain. the conversational agent is user-adaptive to the extent that it dynamically adapts its dialogue behavior according to the user and his/her emotional state. the paper summarizes some aspects of our previous work and presents work-in-progress in the field of speech-based human-machine interaction. it focuses particularly on the development of speech recognition modules in cooperation with both modules for emotion recognition and speaker recognition, as well as the dialogue management module. finally, it proposes an architecture of a conversational agent that integrates those modules and improves each of them by exploiting synergies among them.

key words: conversational agent, user-awareness, adaptation, speech recognition, emotion recognition, speaker recognition, dialogue management

1. introduction

context-awareness is certainly one of the most fundamental requirements for advanced conversational agents. recognition and interpretation of the user’s dialogue acts and dialogue management are always situated in a particular context. this is primarily due to the fact that many inherently present dialogue phenomena are context-dependent. thus, nonlinguistic contexts shared between the user and the system (e.g., graphical displays) may influence the language of the user to a large extent with respect to the frequency of “irregular” utterances (elliptical and minor utterances, utterances containing anaphora and exophora, etc.) [1]. in addition, the user’s dialogue acts may fall outside the system’s domain, scope and semantic grammar, or contradict his/her earlier dialogue acts. this is even more the case when we consider users in non-neutral emotional states.
forcing users to follow a preset grammar or interaction scenario is too restrictive, if possible at all, and would not be well accepted [2]. in such cases, the system needs a considerable amount of stored contextual knowledge to enable it to advance the conversation in spite of miscommunication and to maintain the dialogue’s consistency.

received april 30, 2014. corresponding author: vlado delić, faculty of technical sciences, university of novi sad, trg dositeja obradovića 6, 21000 novi sad, serbia (e-mail: vlado.delic@uns.ac.rs)

however, the requirement for habitable natural language interfaces goes beyond pragmatics. another reason relates to the technology. speech recognition technology is still not accurate enough to deal with flexible, unrestricted language. in realistic settings, average word recognition error rates are 20-30%, and they go up to 50% for non-native speakers [3]. in certain conditions, speech recognition accuracies may degrade dramatically to an extent that systems become unusable even for cooperative users [4].

researchers generally agree that conversational agents need to incorporate dialogue context models in order to maintain a consistent dialogue and overcome technical deficiencies. yet, context is a complex construct and can be considered from different aspects. in this paper, we consider a restricted research question of how user-awareness may help in improving dialogue management. this paper summarizes some aspects of our previous work and presents work-in-progress. in the reported approach, we differentiate between two research lines:
• user-awareness. the system is user-aware to the extent that it recognizes the user and his/her emotional states that are relevant in a given interaction domain.
• user-adaptation. the system is user-adaptive to the extent that it dynamically adapts its dialogue behavior according to the user and his/her emotional state.
at the methodological level, these two lines of research are fundamentally different. the first line relates to a statistical approach to the research problems of automatic speech recognition (asr), emotional speech recognition (esr), and speaker recognition. the speech signal encodes not only information about the lexical content of the speaker’s dialogue act, but also information about the speaker’s voice characteristics that may be used for recognition of the speaker and his/her emotional state [5], [6]. the basic idea is to use data derived from both speech and language corpora, and apply automated analysis methods. although speech/speaker/emotion recognition technologies have a common foundation, they are usually developed and applied separately. we build upon our previous work [7]-[13], and investigate the possibilities to combine these technologies rather than to apply them separately. sections 2 and 3 discuss this in more detail.

the second research line relates to a representational approach to natural language processing and dialogue management. in previous work, we introduced a representational model of attentional information in human-machine interaction that provides a framework for more robust natural language understanding and designing adaptive dialogue strategies [2], [14]-[17]. section 4 discusses the application of this model to designing user-adaptive conversational agents.

2. acoustic information-based approach to user-awareness

2.1. speech recognition

the task of automatic speech recognition is to translate spoken words into text. in order to accomplish this task, the reported speech recognizer exploits information about acoustic representations of phonemes, encapsulated in an acoustic model, and information about syntactic rules, encapsulated in a language model.
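how these two knowledge sources combine can be summarized by the standard bayes decision rule of statistical speech recognition. the formulation below is the textbook one, not a detail taken from the reported recognizer; the symbols $O$ (the sequence of acoustic feature vectors) and $W$ (a candidate word sequence) are introduced here only for illustration.

```latex
\hat{W} \;=\; \arg\max_{W} P(W \mid O) \;=\; \arg\max_{W} \; p(O \mid W)\, P(W)
```

in this reading, $p(O \mid W)$ is scored by the acoustic model and $P(W)$ by the language model; the denominator $p(O)$ is dropped because it does not depend on $W$.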
the relation between words and phonemes is captured in a pronunciation dictionary where each word is segmented into at least one sequence of phonemes. since each phoneme has several acoustic representations, as a basic modeling unit we use a context dependent phone referred to as triphone. the acoustic model is based on hidden markov models and gaussian mixture models. in order to reduce the model computational complexity and to achieve robust parameter estimation, similar states of triphones share parameters. the tree based clustering procedure presented in [18] is performed to find those similar states. the gaussians are modeled using the full covariance matrix, since this yields a more accurate acoustic representation than models with a diagonal covariance matrix [19]. however, in this variant the computational complexity of the log likelihood is significantly increased. to overcome this problem, several approaches have been developed and applied [20], [21], [22]. the system uses feature vectors consisting of 15 mel-frequency cepstral coefficients (mfcc), normalized energy and their first derivatives. the feature vectors are extracted from 30 ms speech segments, every 10 ms. the training set for the acoustic model contains recordings of both scripted and spontaneous utterances produced by several dozen speakers, with a total duration of about 200 hours [23].

language modeling is a special issue for highly inflected languages, since language models have to cover a range of grammatical categories (including tense, aspect, mood, case, etc.) and morphological derivations that involve the addition of prefixes and suffixes. in the currently predominant statistically-based approach to asr, language models are trained on large text corpora. however, simple n-gram based language models do not suffice for morphologically more complex languages without significant modifications [24].
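the front-end parameters given earlier in this subsection (30 ms segments extracted every 10 ms; 15 mfccs plus normalized energy, with first derivatives) fix both the frame rate and the feature dimensionality. the sketch below only works out that arithmetic; the 16 khz sampling rate and the function names are assumptions made for illustration and are not stated in the paper.

```python
def frame_starts(num_samples, sample_rate_hz, window_ms=30, shift_ms=10):
    """Return the start sample of every full analysis window."""
    window = sample_rate_hz * window_ms // 1000  # samples per 30 ms window
    shift = sample_rate_hz * shift_ms // 1000    # samples per 10 ms shift
    if num_samples < window:
        return []
    return list(range(0, num_samples - window + 1, shift))


def feature_dimension(num_mfcc=15, with_energy=True, with_deltas=True):
    """Static coefficients (MFCC + energy), doubled when first derivatives are appended."""
    static = num_mfcc + (1 if with_energy else 0)
    return static * 2 if with_deltas else static


# One second of audio at an ASSUMED 16 kHz sampling rate.
starts = frame_starts(num_samples=16000, sample_rate_hz=16000)
print(len(starts), feature_dimension())
```

under these assumptions, one second of speech yields 98 full windows, each described by a 32-dimensional vector (16 static coefficients and their first derivatives).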
our language model is a combination of 3 n-gram models. the first model is based on tokens (surface forms), the second on lemmata, and the third on classes [23]. the size of vocabulary causes data sparsity problems, resulting in the need for significantly greater language corpora, sufficient for obtaining a robust language model. the training set for the language model consists of text content from various newspapers, scientific articles and books, with a total volume of about 16 million words (178865 lemmata).

splitting words into phoneme sequences is relatively simple for the serbian language, due to the fact that it has phonemic orthography. however, there are some exceptions in word pronunciation (e.g. dvanaest is usually pronounced as dvanajst) and our phonetic inventory distinguishes stressed and unstressed variants of vowels, thus for mapping words into phones the system uses the pronunciation dictionary developed for speech synthesis [23].

the size of the search space is determined by the following factors: the number of words which are expected to be recognized, the number of their pronunciation variants, and the number of hidden markov model states in the acoustic model. for real-time recognition, it is important to reduce the search space, which can be a significant problem for highly inflected languages, where many derived forms may exist for a single lemma. the standard way to cope with this problem is pruning, i.e., discarding the less probable hypotheses. for this purpose, a system should rely not only on an acoustic model, but also on a language model and information about word pronunciation. our system uses a decoder based on the token-passing algorithm (a variant of the viterbi algorithm in which the information about the path and score is stored at the word level instead of the trellis state level). a detailed description of the decoder can be found in [25].

2.2. emotion recognition

emotional speech recognition is concerned with the task of identifying emotional states of the speaker automatically, based upon the analysis of his/her speech. prosodic and spectral features are the features most frequently used for this task, while the less frequently used features include voice quality features (e.g., harmonic-to-noise ratio, jitter, shimmer). prosodic features, also referred to as paralinguistic features, include specific changes in pitch patterning, the energy of the voice signal, and changes in speech rate. the positions and bandwidth of formants, and a cepstral representation of the spectrum are usually selected as spectral features for emotional speech recognition. this is in line with the findings that the distribution of the spectral energy across the speech range of frequency is a possible measure of the emotional content of speech. in [11], we show that a feature set containing both the prosodic and the spectral features achieves high recognition accuracy (i.e., 91.5 %) of the basic emotional states (i.e., anger, joy, fear, sadness, and neutral). the feature vector was obtained by applying statistical functionals to the spectral/prosodic feature contours, where the most relevant functionals, ranked in descending order, are: moments, extrema, and regression coefficients [12].

in many speech-based applications, it is beneficial to conceptualize the user’s emotional states in a given interaction domain as positive or negative (e.g., for the purpose of detecting a frustrated or satisfied call-centre customer). therefore, in our previous work, we also investigated the perspective of dimensional emotion models that describe emotional content in terms of valence (positive/negative emotion) and arousal (active/passive emotion). we conducted a comparative study of two acted emotion corpora to investigate possibilities for classification of discrete basic emotions in the valence-arousal space [26].
the first conclusion of this study was that the prosodic-spectral feature set proposed in [11] is almost equally effective in modeling emotions in the valence-arousal space as compared to modeling discrete emotional states. the second conclusion was that the discrimination of emotional states according to the arousal level is more successful than their discrimination according to the valence level [26].

our research on acoustic information-based emotion recognition was primarily supported by the gees corpus of emotional and attitude-expressive speech in serbian [27]. it contains recordings of acted speech-based emotional expressions. six drama students (3 female, 3 male) were engaged to produce emotionally colored utterances. they were given a set of textual entries (32 isolated words, 30 short sentences, 30 long sentences, and one passage of 79 words) and asked to express each entry in five emotional states (anger, joy, fear, sadness and neutral). the perception test demonstrated that the corpus contains acoustic variations that are indicative of emotional expression of the five target emotional states.

2.3. speaker recognition

our research on speaker recognition centers on text-independent speaker recognition based on a feature set that contains mel-frequency cepstral coefficients (mfcc) and their first and second derivatives. the research was primarily supported by a corpus containing recordings of 121 native serbian speakers (61 female, 60 male). each speaker produced 14 audio recordings: one recording of the speaker uttering his/her first name and family name, two recordings of the speaker uttering a sequence of digits, and eleven recordings of the speaker uttering a sequence of syntactically unrelated words. to reduce the dimensionality of the standard mfcc, we applied the technique of principal component analysis (pca).
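as a reminder of what such a reduction involves, the sketch below projects centered feature vectors onto the leading eigenvectors of their covariance matrix. it is a dependency-free illustration on toy 2-dimensional data, not the implementation used in the reported experiments; power iteration with deflation and all function names are choices made here for brevity.

```python
def covariance(rows):
    """Return the mean vector and sample covariance matrix of the rows."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    return mean, cov


def top_components(cov, k, iterations=200):
    """Leading k eigenvectors of a symmetric matrix (power iteration + deflation)."""
    d = len(cov)
    cov = [row[:] for row in cov]  # work on a copy
    components = []
    for _ in range(k):
        v = [1.0 / (i + 1) for i in range(d)]  # fixed, non-degenerate start vector
        norm = sum(x * x for x in v) ** 0.5
        v = [x / norm for x in v]
        for _ in range(iterations):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = sum(x * x for x in w) ** 0.5
            if norm == 0.0:  # no variance left in the deflated matrix
                break
            v = [x / norm for x in w]
        eigval = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
        components.append(v)
        for i in range(d):  # deflation: remove the component just found
            for j in range(d):
                cov[i][j] -= eigval * v[i] * v[j]
    return components


def project(rows, mean, components):
    """Coordinates of each centered row in the reduced space."""
    d = len(mean)
    return [[sum((r[j] - mean[j]) * c[j] for j in range(d)) for c in components]
            for r in rows]


# Toy data lying on the line y = 2x: one component captures all the variance.
data = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
mean, cov = covariance(data)
components = top_components(cov, k=1)
reduced = project(data, mean, components)
```

in the setting described above, the rows would be 39-dimensional mfcc vectors and `k` would be 14; the toy data are used only so that the result can be checked by hand (the single component is proportional to (1, 2)).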
the reported experimental results [9] suggest that this technique is appropriate to reduce the dimensionality without reducing the recognition accuracy. the applied automatic speaker recognizer shows that already for a 14-dimensional pca feature space, the recognition accuracy reaches the same value as in the 39-dimensional mfcc feature space.

mfccs depend on the energy in an observed speech frame. therefore the distribution of a speaker’s feature vectors depends on the lexical content and expressed emotions. to decrease the text dependency of the covariance matrices used for speaker modeling, we apply an algorithm of model elements weighting introduced in [10]. the basic idea may be formulated as follows: the importance of an element of the speaker model in the decision making processes decreases as its time variability increases. in accordance with this, an element of the speaker model that has the highest time variability will be assigned the smallest value. in real applications, it can be the case that, for some speakers, the automatic speaker recognizer has only one model determined during the training phase. thus, the recognizer cannot observe the time variability of model elements. the time variability of speaker models depends primarily on the largest model elements. by applying a nonlinear function, such as the sigmoid function, to the largest model elements, the time variability of the speaker models is decreased, and consequently, the recognition accuracy is increased.

also, mfccs depend on the assumed shape of auditory critical bands. when the mfccs are determined under the assumption that the auditory critical bands have an exponential shape based on the lower part of the exponential function, the automatic speaker recognizer shows more accurate performance than in the case when rectangular or triangular auditory critical bands are applied [10].

it should be noted that emotional speech may significantly affect the accuracy of speaker recognition.
however, not all emotions are equally critical for speaker recognition. preliminary experiments conducted on the gees database confirmed that, e.g., the emotion of anger changes the speaker’s voice (i.e., timbre) to a greater extent than the emotion of sadness. in the next section, we discuss how a combination of different knowledge sources may improve the recognition accuracy.

2.4. interplay between speech, emotion and speaker recognition

acoustic features and language information contained within the acoustic, pronunciation and language models may be efficiently combined and used for speech, emotion and speaker recognition [5]. high-level features, e.g., phones, idiolect, semantic, accent and pronunciation, reveal speaker characteristics, such as socio-economic status, language background, personality type, and environmental influence [6].

for speech recognition systems based on hidden markov models in combination with gaussian mixture models, numerous techniques have been developed for model adaptation to a specific speaker and acoustic condition [28]-[31]. they can be grouped into two classes based on maximum a posteriori (map) and maximum likelihood (ml) estimation, respectively. a map-based adaptation interpolates the original prior parameter values with parameters obtained from the adaptation data, and thus the estimated parameters converge asymptotically to the adaptation domain as the amount of adaptation data increases [28]. however, in the case of sparse adaptation data, many model parameters remain unchanged [32]. ml-based methods assume that there is a set of linear transformations which can map the existing model parameters into new adapted model parameters. since they use linear transformations to map parameters, these methods are referred to as ml linear regression (mllr). mllr can be applied only to the gaussian mean vectors or to both mean vectors and covariance matrices.
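the contrast between the two families can be made concrete for a single gaussian mean. the sketch below uses the common textbook forms of both updates; the relevance factor `tau`, the assumption that every adaptation frame is fully assigned to the gaussian, and the function names are illustrative choices, not details of the reported system.

```python
def map_adapt_mean(prior_mean, frames, tau=10.0):
    """MAP update: mu = (tau * mu_prior + sum(frames)) / (tau + n).

    With no frames the prior mean is returned unchanged; with many frames
    the result approaches the sample mean of the adaptation data.
    """
    n = len(frames)
    d = len(prior_mean)
    sums = [sum(f[j] for f in frames) for j in range(d)]
    return [(tau * prior_mean[j] + sums[j]) / (tau + n) for j in range(d)]


def mllr_adapt_mean(mean, A, b):
    """MLLR mean update: mu' = A @ mu + b, with (A, b) shared by a class of Gaussians."""
    d = len(mean)
    return [sum(A[i][j] * mean[j] for j in range(d)) + b[i] for i in range(d)]


prior = [0.0]
few = map_adapt_mean(prior, [[2.0]] * 10)    # sparse data: halfway between prior and data
many = map_adapt_mean(prior, [[2.0]] * 990)  # ample data: close to the data mean

# The same (hypothetical) transform moves every Gaussian in the class,
# including those that received no adaptation frames at all.
shifted = mllr_adapt_mean([1.0, 2.0], A=[[1.0, 0.0], [0.0, 1.0]], b=[0.5, -0.5])
```

a gaussian that receives no adaptation frames is left untouched by the map update, whereas a shared transform still moves it; this is the behavior that matters when adaptation data are sparse.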
a special case of mllr where the mean vector and covariance matrix of a gaussian have the same transformation matrix is called constrained mllr. while the use of mean mllr adaptation has the greatest positive impact, the use of variance mllr adaptation may also bring a slight improvement in recognition accuracy [29]. the major advantage of mllr over map adaptation is evident in the case of sparse adaptation data, where the same transformation can be applied to all gaussians in the same acoustic class [32].

alternatively, speaker adaptation can be achieved by transformation of features instead of model parameters. the common procedure is vocal tract length normalization [33], [34]. the basic idea is to find warp scales of the frequency axis for each speaker such that the spectrum fits the spectrum of the universal speaker with a standard vocal tract length, and to apply that transformation to the features used. in this way, within-class scattering and the overlapping between classes are reduced. it is interesting to note that the constrained mllr can be treated as a feature transformation, and that it is commonly used for speaker adaptive training. models trained in this way may achieve higher recognition accuracy [35]. additionally, the accuracy of an asr system can be improved by the adaptation of the language model in terms of reducing the search space and confusion between words [36], [37].

it is widely acknowledged that the speaker’s emotional states affect the speech production system at several levels, from the higher levels of linguistic coding (word selection and sentence structure) to the lower levels of articulator movements (phoneme/word production). this, in turn, may significantly degrade the performance of asr systems. in general, asr performance and prosodic properties of an utterance are related. variations in speaking style and speaking rate, relative to asr training conditions, may have a negative impact on the performance of an asr system [38].
prosodic features reflect those variations, and some studies show that prosody itself is capable of re-ranking asr hypotheses so as to separate the correctly recognized utterances from incorrectly recognized ones [39], [40]. it can be expected that an asr system using acoustic models trained on neutral speech will have reduced performance when it operates under the conditions of emotional speech. reference [41] shows that training asr models on neutral speech, and their subsequent adaptation on emotional speech samples, does have a positive impact on the recognition performance within such conditions.

in [11] and [42], we discuss how the same prosodic and spectral features can be employed for the purpose of speech recognition, emotion recognition and speaker recognition. fig. 1 illustrates how knowledge from different sources is intended to be used in the reported speech processing module. the relationship between these technologies goes beyond prosodic and spectral features. in the next section, we discuss how emotion recognition can employ lexical and discourse information provided by an asr system.

fig. 1 combining knowledge from different sources in the speech processing module

3. emotion recognition based on linguistic information

emotion recognition can also be based on lexical and discourse information [43], e.g., a semantic analysis of an output hypothesis of an asr engine [44]. in line with this, one line of our research focuses on recognition and tracking of emotional states of the user from lexical information and other linguistic features. as part of previous work [1], a substantial refinement of the wizard-of-oz technique was proposed in order that a scenario designed to elicit affected speech in human-machine interaction could result in realistic and useful data.
the nimitek corpus of affected speech in human-machine interaction was produced during a refined wizard-of-oz simulation. ten healthy native german speakers participated in the study (7 female, 3 male, ages 18 to 27, mean 21.7). the corpus contains 15 hours of audio and video recordings. the number of the subjects’ dialogue turns is 1,847, the average number of words per turn is 17.19 (with standard deviation 24.37), and the subjects’ lexicon contains about 900 lemmata. the evaluation of the corpus with respect to the perception of its emotional content demonstrated that it contains recordings of emotions that were overtly signaled, and that the subjects’ utterances are indicative of the way in which untrained, nontechnical users probably like to converse with conversational agents [1].

the transcribed version of the nimitek corpus was used to conduct a corpus-based examination of various linguistic features that may carry affect information [13]. for the purpose of this contribution, we illustrate the following linguistic features: key words and phrases, lexical cohesive agencies, dialogue act sequences, and negations.

the most obvious way of recognizing an emotional state is to detect key words and phrases in users’ utterances. examples from the nimitek corpus are given in table 1. however, expressions of emotions are not necessarily limited to a single dialogue act, but can also map over a range of mutually related dialogue acts. for example, the choice of lexical items made to create cohesion in the dialogue can signal an emotion-related state, both at the lexical level (e.g., repetitions), as well as at the semantic level (e.g., reformulations). table 2 contains examples of repetitions and reformulations that signal negative emotional states. in contrast to this, another form of anaphoric cohesion in a dialogue is achieved by ellipsis-substitutions.
the typical meaning of ellipsis-substitutions is not one of co-reference: there is always some significant difference between the second instance and the first [45]. to illustrate this, let us observe a typical example from the nimitek corpus: please do it! (bitte tu das!). this utterance does not explicitly provide information about what the system is expected to do, but contains an elliptical substitution (verb do) which is used for signaling that the action the system performed is not the same as the action instructed by the user (indicated by the anaphoric reference it). in general, ellipsis-substitutions may signal a potential problem in communication.

table 1 examples of key words and phrases that relate to emotional states (adapted from [13])
emotional state: examples of the subjects’ key words and phrases
annoyed: sh*t (sche*ße), stupid (blöd), do what i say (tu was ich sage), oh … something like this i hate just like the plague. (oh... so was hasse ich doch wie die pest.)
retiring: i don't understand it (ich versteh’ das nicht), it's not working at all (das geht doch gar nicht).
indisposed: i am going now (ich geh’ gleich), oh man (oh man), god (gott), i don't feel like doing any more. (ich hab’ kein’ bock mehr.)
offending: you think, doll. (denkst du, puppe)
satisfied: super (super), awesome (geil), i am good, am i not? (bin gut, was?)

table 2 examples of lexical cohesive agencies that relate to negative emotional states (adapted from [13])
lexical cohesive agency: examples of the subjects’ dialogue acts
repetition: it just cannot be. it just … it just cannot be. (das kann doch nicht sein. das ist doch … das kann doch nicht sein.)
reformulation: not true at all. that’s definitely wrong. (gar nicht wahr. das stimmt gar nicht.)
ellipsis-substitution: please do it.

based on this study, a prototypical automatic annotator for recognition and tracking of the user’s emotional states from linguistic information was implemented [13].
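the simplest of the cues above, key-phrase spotting, can be sketched in a few lines. the code below is an illustration only and makes no claim about how the annotator in [13] works; the phrase lists are a small subset of the english renderings in table 1, and the names `KEY_PHRASES` and `annotate` are hypothetical.

```python
import re

# Illustrative subset of the English key phrases from table 1.
KEY_PHRASES = {
    "annoyed": ["stupid", "do what i say"],
    "retiring": ["i don't understand it", "it's not working at all"],
    "indisposed": ["i am going now", "oh man"],
    "satisfied": ["super", "awesome"],
}


def annotate(utterance):
    """Return the set of emotional-state labels whose key phrases occur in the utterance.

    The result may be empty or contain several labels, so mixed emotions
    can be represented.
    """
    text = utterance.lower().replace("\u2019", "'")  # normalize typographic apostrophes
    labels = set()
    for state, phrases in KEY_PHRASES.items():
        for phrase in phrases:
            # Word boundaries avoid matching "super" inside "superficial".
            if re.search(r"\b" + re.escape(phrase) + r"\b", text):
                labels.add(state)
                break
    return labels


mixed = annotate("Oh man, this is stupid!")
none = annotate("please show the next page")
```

returning a set rather than a single label leaves room for utterances that carry no emotional cue or more than one; cues that span several dialogue acts, such as repetitions and reformulations, would additionally require the dialogue history.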
it should be noted that the emotional states in the nimitek corpus [1] were conceptualized within the data-driven 6-class emotion model arisen (annoyed, retiring, indisposed, satisfied, engaged, neutral). in addition, the subjects’ expressions in the nimitek corpus often contain mixed emotions, and the human evaluators were allowed to assign more than one emotion label to each subject’s utterance. thus, the automatic annotator was implemented to annotate mixed emotions, i.e., to attribute zero, one or more labels from the arisen model to each subject’s utterance. the results of the automatic annotation were compared with the results of the human evaluators. for the given 6-class emotional model arisen, the annotator showed the following performance: 31.70% of subject emotional states were correctly recognized, 34.35% of subject emotional states were not recognized, and 33.92% of subject emotional states were incorrectly recognized. furthermore, the arisen model was down-sampled to a model that differentiates between 3 emotional states, i.e., negative (including annoyed, retiring and indisposed emotional states), neutral, and positive (including satisfied and engaged emotional states). for this 3-class problem, the annotator showed the following performance: 51.20% of subject emotional states were correctly recognized, 33.67% of subject emotional states were not recognized, and 17.26% of subject emotional states were incorrectly recognized. when interpreting these results, it should be kept in mind that the automatic annotation was based only on lexical information, while the human evaluators were influenced by prosody as well.

4. user-adaptive dialogue management

the main idea underlying the conversational agent’s adaptation is that its dialogue behavior is dynamically adapted according to the user and his/her emotional state. in this respect, the dialogue management module is the central component of the conversational agent.
it consists of two components: dialogue context model and adaptive dialogue control [46].

4.1. dialogue context model

the dialogue context model keeps track of information relevant to the dialogue. for the purpose of this contribution, it includes the following knowledge sources:
• lexical and propositional content of the user’s dialogue act,
• attentional state,
• emotional state of the user,
• information about the user.

among these sources, attentional state deserves further discussion. at the conceptual level, attentional state contains information about the dialogue entities that are most salient at any given point. its purpose is twofold [2], [47]. first, it summarizes information from previous dialogue acts that is necessary for processing subsequent ones, and allows for processing spontaneously produced users' dialogue acts. this is an important characteristic of the system, not just because it enables a more natural dialogue, but also because forcing users to follow a preset grammar or interaction scenario is hardly acceptable for users in negative emotional states. the second purpose of attentional state is that it allows for predicting the dialogue behavior of the user, i.e., it forms the basis for expectations about the succeeding dialogue acts. this information is important both for automatic speech recognition, as a means of reducing a set of asr hypotheses, and adaptive dialogue control, for taking initiative in a dialogue.

in [2], we introduced a representational model of attentional information in human-machine interaction that provides a framework for more robust natural language processing and dialogue management. this model integrates neurocognitive understanding of the focus of attention in working memory, the notion of attention related to the theory of discourse structure in the field of computational linguistics, and investigation of the nimitek corpus.
to the extent that it is computationally appropriate, it was successfully adapted and applied in several prototypical conversational agents with diverse domains of interaction [14], including the dialogue management module reported in this paper.

fig. 2 the intended architecture of a conversational agent

4.2. adaptive dialogue control

the dialogue control component implements the dialogue strategies of the conversational agent. in general, a dialogue strategy involves deciding what to do next once the user’s input has been received and interpreted, e.g., prompting the user for more input, clarifying the user’s previous input, outputting information to the user, etc. [46]. we recall that the reported conversational agent is adaptive to the extent that it dynamically adapts its dialogue strategies according to the current user and the user’s emotional state. therefore, an adaptive dialogue strategy is specified by means of a set of rules that take information about the current dialogue context into account. we build upon previous work on emotion-adaptive dialogue strategies and on end-user design of adaptive dialogue strategies. it is important to note that the reported dialogue management module allows the end-user to design dialogue strategies. this makes two levels of adaptation possible: not only is the dialogue behavior dynamically adapted according to the current dialogue strategy, but the dialogue strategy itself can also be redefined by the user. for a detailed discussion, the reader may consult [16], [17].

5. concluding remarks

this paper summarizes some aspects of our previous work and presents work in progress on developing user-aware and adaptive conversational agents. the intended architecture of a conversational agent is given in fig. 2.
the speech recognition module and the dialogue management module (integrated with the natural language processing modules) are fully implemented, while the emotion recognition and speaker recognition modules are implemented at a prototype level.

current and future prospects of our research in this field include (but are not limited to): further investigation of the interplay between speech recognition, emotion recognition and speaker recognition; investigation of linguistic cues for early recognition of negative dialogue developments; further development of dialogue strategies for preventing and handling negative dialogue development; and investigation of more complex user models and alternative models of emotions.

acknowledgement: the presented study was performed as part of the project “development of dialogue systems for serbian and other south slavic languages” (tr32035), funded by the ministry of education, science and technological development of the republic of serbia.

references
[1] m. gnjatović and d. rösner, “inducing genuine emotions in simulated speech-based human-machine interaction: the nimitek corpus”. ieee transactions on affective computing, vol. 1, no. 2, pp. 132-144, july-dec. 2010, doi: 10.1109/t-affc.2010.14
[2] m. gnjatović, m. janev, v. delić, “focus tree: modeling attentional information in task-oriented human-machine interaction”. applied intelligence, vol. 37, no. 3, pp. 305-320, 2012, doi: 10.1007/s10489-011-0329-5
[3] d. bohus and a. rudnicky, “sorry, i didn’t catch that! an investigation of non-understanding errors and recovery strategies”. in recent trends in discourse and dialogue, vol. 39 of text, speech and language technology, pp. 123-154, springer, 2008.
[4] c.h. lee, “fundamentals and technical challenges in automatic speech recognition”. in proc. of the 12th international conference speech and computer, specom 2007, pp. 25-44, moscow, russia, 2007.
[5] b. schuller, g.
rigoll, m. lang, “speech emotion recognition combining acoustic features and linguistic information in a hybrid support vector machine-belief network architecture”. in proc. of icassp 2004, vol. 1, pp. i-577-580, 2004, doi: 10.1109/icassp.2004.1326051
[6] t. kinnunen and l. haizhou, “an overview of text-independent speaker recognition: from features to supervectors”. speech communication, vol. 52, pp. 12-40, 2010, doi: 10.1016/j.specom.2009.08.009
[7] v. delić, m. sečujski, n. jakovljević, m. gnjatović, i. stanković, “challenges of natural language communication with machines”. chap. 19 in daaam international scientific book 2013, pp. 371-388, 2013, doi: 10.2507/daaam.scibook.2013.19
[8] n. jakovljević, d. mišković, m. janev, m. sečujski, v. delić, “comparison of linear discriminant analysis approaches in automatic speech recognition”. electronics and electrical engineering, vol. 19, no. 7, pp. 76-79, 2013, doi: 10.5755/j01.eee.19.7.5167
[9] i. jokić, s. jokić, z. perić, m. gnjatović, v. delić, “influence of the number of principal components used to the automatic speaker recognition accuracy”. electronics and electrical engineering, vol. 18, no. 7, pp. 83-86, 2012, doi: 10.5755/j01.eee.123.7.2379
[10] i. jokić, s. jokić, v. delić, z. perić, “towards a small intra-speaker variability models”. electronics and electrical engineering, vol. 20, 2014 (in press).
[11] v. delić, m. bojanić, m. gnjatović, m. sečujski, s.t. jovičić, “discrimination capability of prosodic and spectral features for emotional speech recognition”. electronics and electrical engineering, vol. 18, no. 9, pp. 51-54, 2012, doi: 10.5755/j01.eee.18.9.2806
[12] m. bojanić, v. delić, m. sečujski, “relevance of the types and the statistical properties of features in the recognition of basic emotions in speech”. facta universitatis, series: electronics and energetics, vol. 27, 2014 (in press).
[13] m. gnjatović, m. kunze, x. zhang, j. frommer, d.
rösner, “linguistic expression of emotion in human-machine interaction: the nimitek corpus as a research tool”. in proceedings of the 4th int. workshop on human-computer conversation, bellagio, italy, no pagination, 2008.
[14] m. gnjatović and v. delić, “a cognitively-inspired method for meaning representation in dialogue systems”. in proc. of the 3rd ieee int. conf. coginfocom-2012, košice, slovakia, pp. 383-388, 2012.
[15] m. gnjatović and v. delić, “electrophysiologically-inspired evaluation of dialogue act complexity”. in proc. of the 4th ieee int. conf. coginfocom 2013, budapest, hungary, pp. 167-172, 2013.
[16] m. gnjatović and v. delić, “cognitively-inspired representational approach to meaning in machine dialogue”. knowledge-based systems, doi: 10.1016/j.knosys.2014.05.001, 2014.
[17] m. gnjatović, “therapist-centered design of a robot's dialogue behavior”. cognitive computation, special issue: the quest for modeling emotion, behavior and context in socially believable robots and ict interfaces, springer, doi: 10.1007/s12559-014-9272-1 (in press).
[18] s. j. young, j. odell, p. c. woodland, “tree-based state tying for high accuracy acoustic modelling”. in proceedings of the workshop on human language technology, pp. 307-312, 1994, doi: 10.3115/1075812.1075885
[19] n. jakovljević, d. mišković, e. pakoci, t. grbić and v. delić, “poređenje performansi nekoliko varijanata gmm u sistemima za prepoznavanje govora”. in proc. of 21st telecommunications forum, telfor 2013, belgrade, serbia, pp. 466-469, 2013.
[20] m. janev, d. pekar, n. jakovljević, v. delić, “eigenvalues driven gaussian selection in continuous speech recognition using hmms with full covariance matrices”. applied intelligence, vol. 33, no. 2, pp. 107-116, 2010, doi: 10.1007/s10489-008-0152-9
[21] b. popović, m. janev, d. pekar, n. jakovljević, m. gnjatović, m. sečujski, v.
delić, “a novel split-and-merge algorithm for hierarchical clustering of gaussian mixture models”. applied intelligence, vol. 37, no. 3, pp. 377-389, 2012, doi: 10.1007/s10489-011-0333-9
[22] n. jakovljević, primena retke reprezentacije na modelima gausovih mešavina koji se koriste za automatsko prepoznavanje govora, phd thesis, university of novi sad, march 2014.
[23] v. delić, m. sečujski, n. jakovljević, d. pekar, d. mišković, b. popović, s. ostrogonac, m. bojanić, d. knežević, “speech and language resources within speech recognition and synthesis systems for serbian and kindred south slavic languages”. in proc. of the specom 2013, pilsen, czech republic, lncs, vol. 8113, springer, pp. 319-326, 2013, doi: 10.1007/978-3-319-01931-4_42
[24] s. ostrogonac, m. sečujski, v. delić, d. mišković, n. jakovljević, n. vujnović sedlar, a mixed-structure n-gram language model, axon inteligentni sistemi, novi sad, serbia. international patent pending: pct/rs2013/000009
[25] n. jakovljević, d. mišković, m. janev, d. pekar, “a decoder for large vocabulary speech recognition”. in proc. of 18th international conference on systems, signals and image processing, iwssip 2011, sarajevo, bosnia and herzegovina, pp. 287-290, 2011.
[26] m. bojanić, m. gnjatović, m. sečujski, v. delić, “application of dimensional emotion model in automatic emotional speech recognition”. in proc. of the 11th ieee int. symp. on intelligent systems and informatics, sisy 2013, subotica, serbia, pp. 353-356, 2013, doi: 10.1109/sisy.2013.6662601
[27] s.t. jovičić, z. kašić, m. djordjević, m. rajković, “serbian emotional speech database: design, processing and evaluation”. in proc. of specom 2004, st. petersburg, pp. 77-81, 2004.
[28] j. gauvain and c. h. lee, “maximum a posteriori estimation for multivariate gaussian mixture observations of markov chains”. ieee trans. on speech and audio process., vol. 2, no. 2, pp. 291-298, apr. 1994, doi: 10.1109/89.279278
[29] m.j.f.
gales, “maximum likelihood linear transformations for hmm-based speech recognition”. computer speech & language, vol. 12, no. 2, pp. 75-98, 1998, doi: 10.1006/csla.1998.0043
[30] m.j.f. gales and p.c. woodland, “mean and variance adaptation within the mllr framework”. computer speech & language, vol. 10, no. 4, pp. 249-264, 1996, doi: 10.1006/csla.1996.0013
[31] d. povey and g. saon, “feature and model space speaker adaptation with full covariance gaussians”. in proc. interspeech 2006, paper 2050-tue2bup.14, 2006.
[32] m.j.f. gales and s. young, “the application of hidden markov models in speech recognition”. foundations and trends in signal processing, vol. 1, no. 3, pp. 195-304, 2008, doi: 10.1561/2000000004
[33] n. jakovljević, d. mišković, m. sečujski, d. pekar, “vocal tract normalization based on formant positions”. in proc. inter. language technologies conference is-ltc 2006, ljubljana, pp. 40-43, 2006.
[34] n. jakovljević, m. sečujski, v. delić, “vocal tract length normalization strategy based on maximum likelihood criterion”. in proc. eurocon 2009, st. petersburg, pp. 417-420, 2009, doi: 10.1109/eurcon.2009.5167662
[35] g. saon and j.t. chien, “large-vocabulary continuous speech recognition systems”. ieee signal processing magazine, vol. 29, no. 6, pp. 12-33, nov. 2012, doi: 10.1109/msp.2012.2197156
[36] j.m. lucas-cuesta, j. ferreiros, f. fernandez-martinez, j.d. echeverry, s. lutfi, “on the dynamic adaptation of language models based on dialogue information”. expert systems with applications, vol. 40, no. 4, pp. 1069-1085, 2013, doi: 10.1016/j.eswa.2012.08.029
[37] w. kim, language model adaptation for automatic speech recognition and statistical machine translation, phd thesis, johns hopkins university, 2005.
[38] l. ten bosch, “emotions: what is possible in the asr framework”. itrw on speech and emotion, northern ireland, uk, pp. 189-194, 2000.
[39] j. hirschberg, d. litman, m. swerts, “prosodic and other cues to speech recognition failures”.
speech communication, vol. 43, pp. 155-175, 2004.
[40] d. litman, j. hirschberg, m. swerts, “predicting automatic speech recognition performance using prosodic cues”. in proc. of the 1st north american chapter of the association for computational linguistics, naacl, seattle, pp. 218-225, 2000.
[41] b. vlasenko, d. prylipko, a. wendemuth, “towards robust spontaneous speech recognition with emotional speech adapted acoustic models”. s. wölfl (ed.), poster and demo track of the 35th german conference on artificial intelligence, ki-2012, saarbrucken, germany, pp. 103-107, 2012.
[42] b. popović, i. stanković, s. ostrogonac, “temporal discrete cosine transform for speech emotion recognition”. in proc. of the 4th ieee int. conf. coginfocom 2013, budapest, hungary, pp. 87-90, 2013.
[43] c.m. lee and s.s. narayanan, “toward detecting emotions in spoken dialogs”. ieee transactions on speech and audio processing, vol. 13, no. 2, pp. 293-303, 2005, doi: 10.1109/tsa.2004.838534
[44] r. müller, b. schuller, g. rigoll, “enhanced robustness in speech emotion recognition combining acoustic and semantic analyses”. in proc. of the workshop from signals to signs of emotion and vice versa, santorini, greece, 2004.
[45] m. halliday, an introduction to functional grammar, edward arnold, london new york, second edition, 1994.
[46] k. jokinen and m. mctear, spoken dialogue systems. synthesis lectures on human language technologies, morgan and claypool, 2009.
[47] b. grosz and c. sidner, “attention, intentions, and the structure of discourse”. comput linguist, vol. 12, no 3, pp. 175-204, 1986.

facta universitatis series: electronics and energetics vol. 33, no 3, september 2020, pp.
429-444 https://doi.org/10.2298/fuee2003429t
© 2020 by university of niš, serbia | creative commons license: cc by-nc-nd

technique of control pmsm powered by pv panel using predictive controller of dtc-svm

fadila tahiri, abdelkader harrouz, djamel belatrache, fatiha bekraoui, ouledali omar, ibrahim boussaid
department of hydrocarbon and renewable energy, laboratory lddi, university of ahmed draya, adrar, algeria

abstract. this paper is part of a study of direct torque control (dtc) based on space vector modulation with a predictive controller (predictive svm) for a permanent magnet synchronous motor (pmsm) powered by a photovoltaic (pv) source. in conventional dtc of a pmsm, hysteresis controllers are used to choose the proper voltage vector, which results in large torque ripples. direct torque control accelerates the torque response but increases the torque ripple at the same time. alternative approaches based on the predictive svm technique now exist to reduce the torque ripples. this method replaces the hysteresis comparators used in conventional dtc with proportional integral (pi) regulators, and the selection table with space vector modulation (svm). the simulation results confirm that the proposed method, in which the switching frequency is well controlled, reduces the oscillations of the electromagnetic torque and flux by 20% and 30%, respectively, with a good dynamic response compared with conventional dtc.

key words: photovoltaic, pmsm, dtc, dtc-svm, predictive controller.
nomenclature
i0                  reverse saturation current of the diode (a)
i0r                 reverse saturation current
k                   boltzmann constant (1.38·10^-23 j/k)
q                   charge of the electron (1.6·10^-19 c)
a                   p-n junction ideality factor
eg                  band gap
g, gr               real and reference solar radiation
icc                 short-circuit current
isα, isβ            stator current along the (α, β) axes
vsα, vsβ            stator voltage along the (α, β) axes
φsα, φsβ            stator magnetic flux along the (α, β) axes
ω                   speed of rotation of the machine (rotor) (rad/s)
φf                  permanent magnet flux linkage (wb)
rs                  stator resistance (ohm)
ls                  stator inductance (h)
j                   moment of inertia
f                   coefficient of friction
p                   number of pole pairs
va,b,c, ia,b,c      three-phase voltage and current

received december 31, 2019; received in revised form may 17, 2020
corresponding author: abdelkader harrouz, department of renewable energy and hydrocarbon, faculty of technology and sciences, ahmed draïa university adrar, algeria
e-mail: harrouz.onml@gmail.com

430 f. tahiri, a. harrouz, d. belatrache, f. bekraoui, o. omar, i. boussaid

1. introduction

the demand for electrical energy increases daily to cover human needs; the use of renewable energy is becoming the key solution to this serious energy crisis and to environmental pollution [1]. algeria has great solar energy potential, because it has a vast desert area and very high solar radiation [2-3]. for this reason, the energy source retained in our study is solar energy [4]. in the field of variable-speed drives, permanent magnet synchronous machines are widely accepted owing to their high efficiency, high power density, high precision, low maintenance costs, simple structure, and high torque density [5-6]. among the most common applications of the permanent magnet synchronous motor (pmsm) are drones, portable robots, and vacuum pumps, where it has been valued for decades for its versatility and performance, and especially electric and hybrid cars [7].
a vector-controlled pmsm drive provides a better dynamic response and lower torque ripple, and requires only a constant switching frequency [8]. pmsm modeling has been tackled in the literature in various ways, commonly by transforming the electrical quantities from the physical 3-phase frame to an equivalent 2-phase orthogonal frame by means of the clarke transformation [9]. because of external disturbances and parameter variation in the pmsm, various powerful control techniques have been developed over the past decades to improve performance [10]. however, the most widely used approach consists in using linear control theory with a disturbance estimate [11]. it is therefore of interest to find a way to control the torque and the flux independently in order to improve performance. the most suitable solution at present is direct torque control (dtc). this method was first proposed for induction machines [7-12]. it is used in variable-frequency drives, where the stator flux and the electromagnetic torque of the machine are used directly to generate the control pulses for the voltage-source inverter through a predefined switching table [13]. dtc has the advantages of simplicity, quick dynamic response and robustness, which make it a powerful motor control method in various applications [14]. however, dtc is known to suffer from large torque/flux ripples and an unstable switching frequency [15], which hinders its practical application. in order to tackle the problems associated with conventional dtc, many modified dtc methods have been proposed to improve the control performance [16]. ammar et al. [17] present a space vector modulation (svm) based direct torque control (dtc) strategy for the induction motor (im) in order to overcome the drawbacks of the classical dtc.
moreover, they proposed a model-based loss minimization strategy for efficiency optimization; the proposed svm-dtc algorithm was investigated using matlab/simulink with a real-time interface based on the dspace 1104 signal card. the simulation and experimental validation gave similar results and showed that lmc reduces losses and improves efficiency at zero-load and low-load operation. therefore, dtc-svm is in general a good solution for overcoming the drawbacks of classical dtc. a comprehensive review of the advancements in dtc of the induction motor (im) over the past decade has been provided by r. h. kumar et al. [18]. strategies adopted to improve the performance of dtc, based on the switching table, constant switching frequency operation, intelligent control, sensorless control and predictive control, are extensively discussed together with their key results and algorithms. the simulation results of predictive dtc show a reduction in torque and stator flux ripples of 14.94% and 23%, respectively, compared with a conventional dtc drive.

in this paper, we apply predictive dtc-svm control to a permanent magnet synchronous motor in order to obtain a constant switching frequency, which is variable in conventional dtc because of the hysteresis comparators, and to minimize the torque and flux ripple. this control method (predictive dtc-svm) replaces the hysteresis comparators used in conventional dtc [17] with proportional integral (pi) regulators, and the selection table with space vector modulation (svm) [21].

direct torque control (dtc) svm predictive of a pmsm powered by photovoltaic source 431

2. modeling system

2.1. photovoltaic system

the solar cells are generally connected in series and in parallel: in parallel, with nph cells, to increase the current, and in series, with nsh cells, to increase the voltage and hence the pv power.
a pv generator is made up of interconnected modules that form a unit producing high continuous (dc) power compatible with conventional electrical equipment [19]. the model used is shown in figure 1; it consists of four components: a current generator iph, a diode, a parallel resistance rsh and a series resistance rse [20].

fig. 1 equivalent diagram of a photovoltaic cell.

the output current is given by the following equation [22]:

$$ i_{pv} = n_{sh}\, i_{ph} - n_{sh}\, i_0 \left[ \exp\!\left( \frac{q \left( v_{pv} + \frac{n_s}{n_{sh}}\, r_{se}\, i_{pv} \right)}{a\, k\, t\, n_s} \right) - 1 \right] - \frac{v_{pv} + \frac{n_s}{n_{sh}}\, r_{se}\, i_{pv}}{\frac{n_s}{n_{sh}}\, r_{sh}} \tag{1} $$

where the cell reverse saturation current is related to the temperature (t) as follows:

$$ i_0 = i_{0r} \left( \frac{t}{t_r} \right)^{3} \exp\!\left[ \frac{q\, e_g}{k\, a} \left( \frac{1}{298} - \frac{1}{t} \right) \right] \tag{2} $$

similarly, the photocurrent iph depends on the solar radiation (g) and the cell temperature (t) [21]:

$$ i_{ph} = i_{cc}\, \frac{g}{g_r} \tag{3} $$

2.2. model of pmsm

under the hypotheses commonly adopted in ac machine modelling, the electrical and mechanical equations of the pmsm in the (α, β) reference frame are:

$$ \begin{cases} v_{s\alpha} = r_s\, i_{s\alpha} + l_s \dfrac{d i_{s\alpha}}{dt} - \omega\, \phi_f \sin\theta \\[4pt] v_{s\beta} = r_s\, i_{s\beta} + l_s \dfrac{d i_{s\beta}}{dt} + \omega\, \phi_f \cos\theta \\[4pt] j \dfrac{d\omega}{dt} = p\, \phi_f\, i_{s\beta} \cos\theta - p\, \phi_f\, i_{s\alpha} \sin\theta - f\, \omega - t_r \\[4pt] \dfrac{d\theta}{dt} = \omega \end{cases} \tag{4} $$

where t_r denotes the load torque. the electromagnetic torque te is given by [23, 38]:

$$ t_e = \frac{3}{2}\, p \left( \phi_{s\alpha}\, i_{s\beta} - \phi_{s\beta}\, i_{s\alpha} \right) \tag{5} $$

2.2.1. simulation and interpretation

in a first step, we simulate the operation of the pmsm in the (α, β) reference frame, powered directly by a 50 v, 50 hz network. the software used in this simulation is matlab/simulink. figure 2.a shows that the speed reaches the steady state very quickly, with an acceptable response time. after the load is applied (in the interval 0.3 s to 0.4 s), the speed decreases and then returns to its reference value.
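as a numerical companion to the photovoltaic relations (2) and (3) of section 2.1, the following minimal sketch evaluates the reverse saturation current and the photocurrent. it is an illustration under stated assumptions rather than the authors' simulink model: the function names are hypothetical, the band gap eg is taken in volts so that q·eg is an energy, the reference temperature is 298 k as in (2), and the numerical values in the usage note are placeholders, not parameters from the paper.

```python
import math

Q = 1.602e-19  # charge of the electron (C)
K = 1.381e-23  # Boltzmann constant (J/K)


def saturation_current(t, i0r, a, eg, tr=298.0):
    """Equation (2): reverse saturation current at cell temperature t (kelvin).

    i0r is the saturation current at the reference temperature tr,
    a the ideality factor and eg the band gap expressed in volts (assumption).
    """
    return i0r * (t / tr) ** 3 * math.exp(Q * eg / (K * a) * (1.0 / tr - 1.0 / t))


def photocurrent(g, gr, icc):
    """Equation (3): photocurrent scaled linearly with the irradiance ratio g/gr."""
    return icc * g / gr
```

for instance, `saturation_current(298.0, 1e-9, 1.3, 1.12)` returns the reference value `1e-9` unchanged, and raising the temperature increases the result, which is the expected qualitative behaviour of (2).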
the torque peaks at start-up, then settles at its load value when the speed decreases (under load), and is proportional to the current. figure 2.d shows the temporal evolution of the stator flux, which has a disturbed sinusoidal shape.

fig. 2 the variation with time: (a) speed variation, (b) torque variation, (c) current variation, (d) flux variation.

3. direct torque control of pmsm

the block scheme of the investigated direct torque control (dtc) for a pmsm fed by a voltage source inverter is presented in figure 3.

fig. 3 general structure of the dtc.

the direct torque control of a permanent magnet synchronous machine is based on the direct determination of the control sequence to be applied to a voltage inverter [24]. the switching state is selected from the outputs of the two comparators (torque error and flux error): the estimated values of torque and flux are compared with their reference values and, depending on the hysteresis comparator outputs, the torque and flux are commanded to increase or decrease [25]. the estimation of the torque, the flux and the position of the flux vector is done using the machine input voltages and currents, as shown in figure 3 [26]. the stator currents of the pmsm in the (α, β) reference frame are given by [27]:

$$ \bar{i}_s = i_{s\alpha} + j\, i_{s\beta}, \qquad i_{s\alpha} = \sqrt{\tfrac{3}{2}}\; i_{sa}, \qquad i_{s\beta} = \frac{1}{\sqrt{2}} \left( i_{sb} - i_{sc} \right) \tag{6} $$

the stator voltage is:

$$ \bar{v}_s = v_{s\alpha} + j\, v_{s\beta}, \qquad \begin{aligned} v_{s\alpha} &= \sqrt{\tfrac{2}{3}} \left( v_a - \tfrac{1}{2}(v_b + v_c) \right) = \sqrt{\tfrac{2}{3}}\; u_{dc} \left( s_a - \tfrac{1}{2}(s_b + s_c) \right) \\ v_{s\beta} &= \frac{1}{\sqrt{2}} \left( v_b - v_c \right) = \frac{1}{\sqrt{2}}\; u_{dc} \left( s_b - s_c \right) \end{aligned} \tag{7} $$

for a balanced system (v_a + v_b + v_c = 0), the α component of (7) reduces to the same form as in (6). a two-level inverter is capable of producing six non-zero voltage vectors and two zero vectors. figure 4 shows the complex plane of the eight voltage vectors [28].

fig. 4 different vectors of stator voltages provided by a two-level inverter.
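the eight voltage vectors of figure 4 can be enumerated directly from the switching states (sa, sb, sc). the sketch below assumes the power-invariant (concordia) scaling, i.e. a factor √(2/3) on the α axis and 1/√2 on the β axis, and the usual numbering v0 = [000], v1 = [100], ..., v7 = [111]; both the function names and that numbering are assumptions made for illustration.

```python
import math

# assumed numbering of the switching states (sa, sb, sc) of a two-level inverter
SWITCH_STATES = {
    0: (0, 0, 0), 1: (1, 0, 0), 2: (1, 1, 0), 3: (0, 1, 0),
    4: (0, 1, 1), 5: (0, 0, 1), 6: (1, 0, 1), 7: (1, 1, 1),
}


def inverter_voltage(sa, sb, sc, udc):
    """Stator voltage components (v_alpha, v_beta) for one switching state."""
    v_alpha = math.sqrt(2.0 / 3.0) * udc * (sa - 0.5 * (sb + sc))
    v_beta = (1.0 / math.sqrt(2.0)) * udc * (sb - sc)
    return v_alpha, v_beta


def voltage_vectors(udc):
    """Map each vector index 0..7 to its (v_alpha, v_beta) pair."""
    return {k: inverter_voltage(*state, udc) for k, state in SWITCH_STATES.items()}
```

with this scaling the six active vectors all have magnitude √(2/3)·udc and are spaced 60° apart, starting with v1 on the α axis, while v0 and v7 are zero.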
table 1 shows the switching states used to select a suitable voltage vector v for driving the inverter switches. this voltage vector is selected from the outputs of the two comparators (torque error and flux error) [25]. from the comparator outputs, the switching table selects one of the 8 possible switching vectors of the inverter (6 active and 2 zero vectors) [27, 29]. dtc selects the active voltage switching vector states for doubling the sampling period and selects an appropriate v [30].

table 1 switching table for 6-sector dtc [29].

conventional dtc is more robust than conventional methods (field-oriented control, for example). it does not require a mechanical measurement such as the speed or position of the machine; moreover, the sensitivity to the machine parameters is clearly attenuated in the case of dtc, since the flux is estimated from a single parameter, namely the stator resistance. in addition, space vector modulation (svm) is replaced in this control scheme by a simple switching table, which makes it easier to implement [31].

4. predictive dtc-svm

the dtc-svm control strategy with a predictive controller uses svm with a fixed, constant switching frequency [28]. this control strategy ensures the decoupling between the amplitude of the stator flux vector and its argument. indeed, the stator flux amplitude is imposed, whereas the argument is calculated so as to obtain high performance, such as reduced stator flux and electromagnetic torque ripples. the difference between conventional dtc and this control strategy is that the latter is based on pi controllers and svm in order to fix the switching frequency, which consequently reduces the stator flux and torque ripples as well as the harmonics of the stator current. the switching table and the hysteresis regulators used in conventional dtc are eliminated.
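for reference, the hysteresis-based selection of conventional dtc, which the predictive strategy replaces, can be sketched as a table lookup. the entries below are those of table 1; the row keys (flux command 1/0, torque command 1/0/-1), the sector convention (sector 1 centred on the α axis) and all function names are assumptions that follow the classical six-sector scheme, since the extracted table does not label its rows.

```python
import math

# (flux_cmd, torque_cmd) -> vector index for sectors 1..6
# flux_cmd: 1 = increase flux, 0 = decrease flux
# torque_cmd: 1 = increase torque, 0 = hold, -1 = decrease torque
SWITCHING_TABLE = {
    (1, 1): [2, 3, 4, 5, 6, 1],
    (1, 0): [7, 0, 7, 0, 7, 0],
    (1, -1): [6, 1, 2, 3, 4, 5],
    (0, 1): [3, 4, 5, 6, 1, 2],
    (0, 0): [0, 7, 0, 7, 0, 7],
    (0, -1): [5, 6, 1, 2, 3, 4],
}


def sector_of(flux_angle):
    """Sector 1..6 of the stator flux angle (radians); sector 1 spans -30 deg to 30 deg."""
    deg = math.degrees(flux_angle) % 360.0
    return int(((deg + 30.0) % 360.0) // 60.0) + 1


def two_level(error, band, previous):
    """Two-level hysteresis comparator for the flux error."""
    if error > band:
        return 1
    if error < -band:
        return 0
    return previous


def three_level(error, band, previous):
    """Three-level hysteresis comparator for the torque error."""
    if error > band:
        return 1
    if error < -band:
        return -1
    if (previous == 1 and error > 0) or (previous == -1 and error < 0):
        return previous
    return 0


def select_vector(flux_cmd, torque_cmd, sector):
    """Look up the inverter voltage vector index for the given commands and sector."""
    return SWITCHING_TABLE[(flux_cmd, torque_cmd)][sector - 1]
```

because the vector only changes when a comparator output or the sector changes, the switching instants depend on the operating point, which is the origin of the variable switching frequency mentioned above.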
the voltage vector is calculated using a predictive controller [29]. the block diagram of the predictive dtc-svm control of a pmsm powered by a voltage inverter from a pv source is shown in figure 5; the pi predictive controller is shown in figure 6 [32].

table 1 entries (rows: flux and torque comparator outputs; columns: sector)
flux       torque     s1  s2  s3  s4  s5  s6
increase   increase   v2  v3  v4  v5  v6  v1
increase   hold       v7  v0  v7  v0  v7  v0
increase   decrease   v6  v1  v2  v3  v4  v5
decrease   increase   v3  v4  v5  v6  v1  v2
decrease   hold       v0  v7  v0  v7  v0  v7
decrease   decrease   v5  v6  v1  v2  v3  v4

fig. 5 pmsm system control based on predictive dtc-svm.

fig. 6 pi predictive controller.

4.1. predictive controller

the relationship between the torque pulsation and the flux and angle deviations is [33]:

$$ \frac{\Delta t_e}{t_{eref}} = k_s\, \frac{\Delta \phi_s}{\phi_{sref}} + k_{\varphi}\, \Delta\varphi \tag{8} $$

where teref is the reference torque, and ∆∅s and ∆φ are the deviations of ∅s and φ, respectively, defined by:

$$ \Delta\phi_s = \phi_{sref} - \phi_s, \qquad \Delta\varphi = \varphi_{sref} - \varphi_s \tag{9} $$

and ks and kφ are constants derived from the pmsm specifications. the torque ripple is caused by ∆∅s and ∆φ, and the influence of ∆∅s is considerably lower than that of ∆φ. as a result, the torque ripple can be attenuated if ∆φ is kept close to zero. for dtc-svm control, the generation of the control pulses (sa, sb, sc) applied to the inverter switches is generally based on a predictive controller, which receives the torque error ∆te = (teref − te), the reference stator flux amplitude ∅sref, the amplitude and position of the estimated stator flux vector, and the measured current [34]. the predictive controller determines the reference stator voltage vector in polar coordinates, vs = [vsref ∆φ]. equation (8) shows that the relationship between the torque error and the increment of the angle ∆φ is linear. therefore, a proportional integral (pi) predictive controller, which generates the load angle change that minimizes the instantaneous error between the reference torque and the actual torque, is applied.
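the pi stage that turns the torque error into the angle increment ∆φ can be sketched as a discrete controller. this is a generic illustration only: the class name, the backward-euler integration, the clamping of the integral term and of the output, and the gains in the usage note are assumptions, not the tuning used in the paper.

```python
class PIAngleController:
    """Discrete PI controller mapping the torque error to the angle increment."""

    def __init__(self, kp, ki, ts, limit):
        self.kp = kp          # proportional gain (rad per N.m)
        self.ki = ki          # integral gain (rad per N.m per second)
        self.ts = ts          # sampling period (s)
        self.limit = limit    # saturation of the angle increment (rad)
        self.integral = 0.0

    def step(self, torque_error):
        """Return the angle increment for one sampling period."""
        self.integral += self.ki * torque_error * self.ts
        # clamp the integral term so that it cannot wind up during saturation
        self.integral = max(-self.limit, min(self.limit, self.integral))
        output = self.kp * torque_error + self.integral
        return max(-self.limit, min(self.limit, output))
```

for example, with `PIAngleController(kp=0.01, ki=10.0, ts=1e-4, limit=0.2)` a torque error of 5 n.m yields an increment of about 0.055 rad, while a very large error is limited to 0.2 rad.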
from the structure of the predictive torque and stator flux controller shown in figure 6, we note that the torque error ∆te and the stator reference flux are delivered to the predictive controller, which gives the deviation of the stator flux angle ∆φ [32]. from figure 6, the α and β axis components of the stator reference voltage vsref are calculated as:

$$ \begin{aligned} v_{s\alpha ref} &= \frac{\phi_{sref} \cos(\varphi_s + \Delta\varphi) - \phi_s \cos\varphi_s}{t_e} + r_s\, i_{s\alpha} \\ v_{s\beta ref} &= \frac{\phi_{sref} \sin(\varphi_s + \Delta\varphi) - \phi_s \sin\varphi_s}{t_e} + r_s\, i_{s\beta} \end{aligned} \tag{10} $$

where t_e in (10) denotes the sampling period, not the torque.

4.2. space vector modulation (svm)

the svm method generates the switching signals based on the instantaneous position of the rotating reference vector in the voltage vector space of the converter [14, 37], as shown in figure 3. in the space vector diagram (svd) of a two-level inverter [35], every sector (denoted si, i = 1 to 6) is an equilateral triangle of height h (= √3/2). the edge vectors (v1 to v6) are called active vectors and (v0, v7) zero vectors. the three closest switching vectors (one zero vector and two active vectors) allow the svm switching times to be calculated in any sector. the position of the reference vector v* inside the sector determines the switching times. figure 7 illustrates two-level switching in the svd. the volt-second balance of v* is given in equation (11) [36].

fig. 7 sector-1 for two-level svd

the volt-seconds of the reference voltage v* are calculated by the following equation:

$$ v^{*}\, t_s = v_1\, t_1 + v_2\, t_2 + v_0\, t_0 \tag{11} $$

where t0, t1 and t2 are the dwell times of the basic space voltage vectors v0, v1 and v2, respectively. the v0 state can be either the [000] or the [111] switching state, or both.
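the chain from the controller output to the modulator can be illustrated numerically: the reference voltage is first built from the flux reference and the angle increment, and the volt-second balance (11) is then solved for the dwell times. the sketch below is hedged in several ways: it uses the commonly cited form of the reference-voltage relation (flux increment over one sampling period plus the resistive drop), it takes svm sector 1 as the range 0° to 60°, and the function names and the parameter `v_active` (magnitude of the active vectors) are hypothetical.

```python
import math


def reference_voltage(flux_ref, flux_est, angle_est, d_angle,
                      i_alpha, i_beta, rs, t_sample):
    """Reference stator voltage (alpha, beta) over one sampling period t_sample."""
    v_alpha = (flux_ref * math.cos(angle_est + d_angle)
               - flux_est * math.cos(angle_est)) / t_sample + rs * i_alpha
    v_beta = (flux_ref * math.sin(angle_est + d_angle)
              - flux_est * math.sin(angle_est)) / t_sample + rs * i_beta
    return v_alpha, v_beta


def dwell_times(v_alpha, v_beta, v_active, ts):
    """Solve v* ts = vk t1 + vk+1 t2 + v0 t0 for the sector containing v*.

    Returns (sector, t1, t2, t0); t0 becomes negative in overmodulation.
    """
    theta = math.atan2(v_beta, v_alpha) % (2.0 * math.pi)
    sector = min(int(theta // (math.pi / 3.0)), 5) + 1   # sectors 1..6
    gamma = theta - (sector - 1) * math.pi / 3.0         # angle inside the sector
    m = math.hypot(v_alpha, v_beta) / v_active           # normalized magnitude
    t1 = ts * m * math.sin(math.pi / 3.0 - gamma) / math.sin(math.pi / 3.0)
    t2 = ts * m * math.sin(gamma) / math.sin(math.pi / 3.0)
    t0 = ts - t1 - t2
    return sector, t1, t2, t0
```

as a check on the design, a reference of half the active-vector magnitude lying on the α axis gives t1 = t0 = ts/2 and t2 = 0, and a reference at 115° falls in sector 2.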
equation (12) can be used to determine the angular position θ of v* in the sector:

$$ \theta = \operatorname{arctg}\!\left( \frac{v_{s\beta}}{v_{s\alpha}} \right) \tag{12} $$

the value of θ locates v* in a sector (for example, for θ = 115°, v* lies in sector 2, since sector 2 spans the angles between 61° and 120°). depending on whether v* lies inside or outside the hexagon of the svd (figure 7), operation is divided into linear modulation (ma ≤ 0.907) and overmodulation (ma > 0.907). the sampling period is ts = t1 + t2 + t0. t1 and t2 are calculated by projecting v* onto the α-axis and the β-axis with respect to the svd origin (zero point). the volt-second equations along the α-axis and the β-axis are vsα0 ts = t1 + 0.5 t2 and vsβ0 ts = h t2, respectively. thus, t2 = ts vsβ0/h and t1 = ts (vsα0 − vsβ0/(2h)). the active vector times t1 and t2 then give the zero-vector time t0 = ts − t1 − t2, where ts is fixed by the given switching frequency [37].

5. simulation and interpretation

both simulations (conventional dtc and predictive dtc-svm) were carried out under the same conditions regarding motor parameters, switching frequency of the inverter transistors and nominal conditions (irradiation, temperature) of the pv source. for constant-flux operation, the flux amplitude produced by the permanent magnet is taken as the reference amplitude of the stator flux. the system consists of photovoltaic sources, an inverter, and a permanent magnet synchronous motor.

table 2 machine and control parameters
rated motor power                    pn      1.1 kw
nominal motor voltage                vn      220 v
power factor                         cos φ   0.38
nominal frequency                    f       50 hz
stator resistance                    rs      0.6 ohm
direct-axis stator inductance        ld      2.8 mh
quadrature-axis stator inductance    lq      1.4 mh
flux of magnets                              0.12 wb
number of pole pairs                 p       4
moment of inertia                    j       1.1·10^-3 n.m.s^2
coefficient of friction              f       1.4·10^-3
nominal torque                       te      10 n.m

5.1.
simulation results of conventional dtc

fig. 8.b shows the electromagnetic torque: it starts at approximately 5 n.m, then follows the reference torque to bring the machine back to the speed defined by the set point during the reversal of the direction of rotation (t = 0.2 s to 0.25 s), and finally returns to zero until a load of 5 n.m is applied at t = 0.3 s. fig. 9.a shows the temporal evolution of the stator current, which has an almost sinusoidal shape when the load is applied, with an oscillation of 4 a. the stator flux modulus follows its reference without overshoot and shows no sensitivity to the load application. the electromagnetic torque is full of ripples caused by the use of hysteresis controllers for the stator flux and the electromagnetic torque, which introduce limitations such as a variable switching frequency, high flux ripples and current distortion.

fig. 8 (a) variation of speed response, (b) variation of stator torque response
fig. 9 (a) variation of current response, (b) variation of flux response
fig. 10 (a) variation of stator flux (α, β) axes response, (b) variation of stator flux (alpha) as a function of stator flux (beta).

5.2. simulation results of predictive dtc-svm

fig. 11 block diagram of the simulation of pmsm system control based on predictive dtc-svm

the same operating conditions are used (a load of 5 n.m is applied at t = 0.3 s and the direction of rotation is reversed at t = 0.2 s to 0.25 s). in fig. 12.a, the rotation speed shows a small overshoot at start-up and then stabilizes at the rated speed of 140 rad/s with a response time of 0.02 s. this overshoot is explained by regulator settings that are not yet satisfactory and require adjustment.
the electromagnetic torque, illustrated in fig. 12.b, follows its reference closely with an oscillation of 0.5 n.m. fig. 13.a shows the stator current, which has a nearly sinusoidal shape with fewer fluctuations than the current under conventional dtc. in fig. 13.b, the modulus of the estimated stator flux stays around its reference within a band much narrower than that of the dtc. the representation of the flux in the complex plane (fig. 14.b) shows that the stator flux starts from the point (0, 0) and then rotates in the trigonometric direction, following a circle whose radius is fixed by the reference. overall, torque and flux oscillations decrease with predictive dtc-svm control. the switching frequency in predictive dtc-svm is constant because the switching table and hysteresis regulators of conventional dtc are replaced by pi controllers and svm, which reduces the torque and flux oscillations as well as the harmonics of the stator current.

fig. 12 (a) variation of the speed response, (b) variation of the torque response
fig. 13 (a) variation of the current response, (b) variation of the flux response
fig. 14 (a) variation of the stator flux (α, β) axes responses, (b) variation of the stator flux (alpha) as a function of the stator flux (beta)

the following table shows the difference between conventional dtc and predictive dtc-svm:

table 3 oscillations with conventional dtc and predictive dtc-svm

control               torque (n.m)   flux (wb)   current (a)
conventional dtc      2.5            0.01        4
predictive dtc-svm    0.5            0.003       1.75

6.
conclusion

the work presented in this paper focuses on direct torque control (dtc) based on space vector modulation with a predictive controller (predictive svm) for a permanent magnet synchronous motor (pmsm); this strategy is based on the direct determination of the control sequence applied to the inverter. this control is less sensitive to variations of the machine parameters and does not require fragile mechanical sensors. it has been concluded that predictive svm-dtc control minimizes the torque and flux ripple, with a high switching frequency. the simulation results can be summarized as follows:

• the torque and flux oscillations are large in conventional dtc because of the hysteresis regulators, which introduce limitations such as a high and uncontrollable switching frequency.
• in dtc-svm, the hysteresis regulators and the switching table are replaced with pi controllers and svm; the oscillations in the torque and the flux are reduced because:
  - the inverter switching frequency is constant, which reduces the stator flux and torque ripple as well as the harmonics of the stator current;
  - distortion caused by sector changes is eliminated;
  - a low sampling frequency is required;
  - the dynamic performance of dtc-svm is comparable with that of selection-table-based dtc.

in the end, the torque, flux and current oscillations were 0.5 n.m, 0.003 wb and 1.75 a, respectively, with predictive dtc-svm, and 2.5 n.m, 0.01 wb and 4 a, respectively, with conventional dtc. the obtained results confirm that the proposed control is more suitable for our pmsm than conventional dtc.

acknowledgement: this paper and the research behind it would not have been possible without the exceptional support of the general directorate for scientific research and technological development, algeria.

references

[1] o. ellabban, h.
abu-rub, f. blaabjerg, “renewable energy resources: current status, future prospects and their enabling technology”, renewable and sustainable energy reviews, vol. 39, pp. 748–764, 2014.
[2] a. harrouz, a. temmam, m. abbes, “renewable energy in algeria and energy management systems”, international journal of smart grid, vol. 2, no. 1, 2018.
[3] a. harrouz, h. omar, “application of solar energies to reinforce the flow water of foggara in the adrar region”, international journal of smart grids, ijsmartgrid, vol. 2, no. 4, pp. 203–208, 2018.
[4] a. harrouz, m. abbes, i. colak, k. kayisli, “smart grid and renewable energy in algeria”, in proceedings of the ieee xplore of conference (icrera), 2017, san diego, ca, usa.
[5] y. kim, h. t. seo, s. k. kim, k. s. kim, “a robust current controller for uncertain permanent magnet synchronous motors with a performance recovery property for electric power steering applications”, energies, vol. 11, no. 5, p. 1224, 2018.
[6] abd essalam badoud, “mppt controller for pv array under partially shaded condition”, algerian journal of renewable energy and sustainable development, vol. 01, no. 01, pp. 99–111, june 2019.
[7] y. h. kim, k. choi, s. k. kim, k. s. kim, “a disturbance observer based approach to current control of pmsm drives for torque ripple reduction”, ifac-papers online, vol. 52, no. 4, pp. 206–209, 2009.
[8] a. v. sant, k. r. rajagopal, “pm synchronous motor speed control using hybrid fuzzy-pi with novel switching functions”, ieee transactions on magnetics, vol. 45, no. 10, october 2009.
[9] a. ul haq, d. đurđanović, “precedent-free fault localization and diagnosis for high speed train drive systems”, facta universitatis, series mechanical engineering, vol. 13, no. 2, pp. 67–79, 2015.
[10] m. manohar, s. das, “current sensor fault-tolerant control for direct torque control of induction motor drive using flux-linkage observer”, ieee transactions on industrial informatics, vol. 13, no. 6, pp. 2824–2833, 2017.
[11] h. mesloub, r. boumaaraf, m. t. benchouia, a. golea, n. goléa, k. srairi, “comparative study of conventional dtc and dtc svm based control of pmsm motor simulation and experimental results”, in proceedings of the international association for mathematics and computers in simulation (imacs), 2018, vol. 18, pp. 30148-4.
[12] m. lukac, m. kameyama, m. perkowski, p. kerntopf, “using homing, synchronizing and distinguishing input sequences for the analysis of reversible finite state machines”, facta universitatis, series electronics and energetics, vol. 32, no. 3, pp. 417–438, 2019.
[13] m. amiri, j. milimonfared, d. a. khaburi, “predictive torque control implementation for induction motors based on discrete space vector modulation”, ieee transactions on industrial electronics, vol. 65, no. 9, pp. 6881–6889, 2018.
[14] f. niu, x. huang, l. ge, j. zhang, l. wu, y. wang, k. li, y. fang, “a simple and practical duty cycle modulated direct torque control for permanent magnet synchronous motors”, ieee transactions on power electronics, vol. 34, no. 2, pp. 0885-8993, 2019.
[15] dj. rabah, b. sid ahmed, a. samar, “accurate computation of magnetic induction generated by hv overhead power lines”, facta universitatis, series electronics and energetics, vol. 32, no. 2, pp. 267–285, 2019.
[16] a. ammar, a. benakcha, a. bourek, “closed loop torque svm-dtc based on robust super twisting speed controller for induction motor drive with efficiency optimization”, international journal of hydrogen energy, vol. 42, no. 28, pp. 17940–17952, 2017.
[17] m. petronijevic, n. mitrovic, v. kostic, b. jovanovic, “assessment of unsymmetrical voltage sag effects on ac adjustable speed drives”, facta universitatis, series: electronics and energetics, vol. 22, no. 1, pp. 341–360, december 2009.
[18] a o. conde, j. francisco. g. sánchez, j. muci, a. s. gonzález, “a review of diode and solar cell equivalent circuit model lumped parameter extraction procedures”, facta universitatis series: electronics and energetics, vol. 27, no. 1, pp. 57–102, march 2014.
[19] h. bouzeriaa, c. fethah, t. bahib, i. abadliab, z. layateb, s. lekhchinec, “fuzzy logic space vector direct torque control of pmsm for photovoltaic water pumping system”, energy procedia, vol. 74, pp. 760–771, 2015.
[20] s. laribi, k. mammar, f. zohra arama, t. ghaitaoui, “analyze of impedance for water management in proton exchange membrane fuel cells using neural networks methodology”, algerian journal of renewable energy and sustainable development, vol. 01, no. 01, pp. 96–105, june 2019.
[21] j. francisco, s. garcía, r. beatriz, “modelling solar cell s-shaped i-v characteristics with dc lumped-parameter equivalent circuits a review”, facta universitatis series: electronics and energetics, vol. 30, no. 3, pp. 327–350, september 2017.
[22] f. zohra arama, s. laribi, t. ghaitaoui, “a control method using artificial intelligence in wind energy conversion system”, algerian journal of renewable energy and sustainable development, vol. 01, no. 01, pp. 86–95, june 2019.
[23] a. janjic, s. savic, g. janackovic, m. stankovic, l. zoran velimirovic, “multi-criteria assessment of the smart grid efficiency using the fuzzy analytical hierarchy process”, facta universitatis, series electronics and energetics, vol. 29, no. 4, pp. 631–646, 2016.
[24] v. kostić, m. petronijević, n. mitrović, b. banković, “experimental verification of direct torque control methods for electric drive applica”, facta universitatis, series: automatic control and robotics, vol. 8, no. 1, pp. 111–126, 2009.
[25] a. sood, n. gupta, “direct torque control scheme of induction motor drive using space vector modulation”, international journal of recent advances in science and technology, vol. 6, no. 1, pp. 1–7, 2019.
[26] s. j. kim, j. park, d. h. lee, “a predictive dtc-pwm using 12 vectors for permanent magnet synchronous motor”, in 2019 10th international conference on power electronics and ecce asia (icpe 2019-ecce asia), 2019, pp. 2498–25.
[27] s. krim, s. gdaim, a. mtibaa, m. f. mimouni, “fpga contribution in photovoltaic pumping systems: models of mppt and dtc-svm algorithms”, international journal of renewable energy research, vol. 6, no. 3, 2016.
[28] h. abdelkader, f. tahiri, b. fatiha, b. ibrahim, “modelling and simulation of synchronous inductor machines”, algerian journal of renewable energy and sustainable development, vol. 01, no. 01, pp. 8–23, june 2019.
[29] s. krim, s. gdaim, a. mtibaa, m. f. mimouni, “hardware implementation of a predictive dtc-svm with a sliding mode observer of an induction motor on the fpga”, vol. 10, pp. 2224–2856, 2015.
[30] o. ouledali, a. meroufel, p. wira, s. bentouba, “genetic algorithm tuned pi controller on pmsm direct torque control”, algerian journal of renewable energy and sustainable development, vol. 1, no. 2, pp. 204–211, 2019.
[31] s. belkacem, f. naceri, r. abdessemed, “a novel robust adaptive control algorithm and application to dtc-svm of ac drives”, serbian journal of electrical engineering, vol. 7, no. 1, pp. 21–40, 2010.
[32] m. aleenejad, h. mahmoudi, s. jafarishiadeh, r. ahmadi, “fault-tolerant space vector modulation for modular multilevel converters with bypassed faulty submodules”, ieee transactions on industrial electronics, vol. 66, no. 3, pp. 2463–2473, 2018.
[33] r. k. pongiannan, s. paramasivam, n.
yadaiah, “dynamically reconfigurable pwm controller for three-phase voltage-source inverters”, ieee transactions on power electronics, vol. 26, no. 6, 2011.
[34] f. tahiri, b. fatiha, b. ibrahim, o. omar, h. abdelkader, “direct torque control (dtc) svm predictive of a pmsm powered by a photovoltaic source”, algerian journal of renewable energy and sustainable development, vol. 01, no. 01, pp. 1–7, june 2019.

facta universitatis
series: electronics and energetics vol. 33, no 3, september 2020, pp. 459-476
https://doi.org/10.2298/fuee2003459m
© 2020 by university of niš, serbia | creative commons license: cc by-nc-nd

reliability analysis of different rcied activation signal responsive jamming techniques and their comparison to active jamming *

mladen mileusnić, predrag petrović, vladimir kosjer, aleksandar lebl, branislav pavić
iritel a.d., belgrade, serbia

abstract. in this paper we compare the time required for successful jamming of the activation of remote controlled improvised explosive devices using active and responsive jamming methods. as a representative of active jamming methods we analyze jamming signal generation using frequency sweep. for the detection of a possible activating signal in responsive jamming we first assume a fast fourier transform (fft) implementation and compare its analysis rate to the rate of sweep jamming. taking into account the current state of technology, it is shown that the time required to achieve successful jamming based on fft analysis may be shorter than in the case of active sweep jamming. after that we consider the pros and cons of energy detector and matched filter detector implementation in responsive jamming.
for these two detector types it is shown how to determine the number of analysis blocks needed to collect approximately the same number of samples as in the fft implementation, starting from the probabilities of false detection and missed detection.

key words: active and responsive jamming, rcied (remote controlled improvised explosive devices), frequency sweep, fast fourier transform, energy detector, matched filter, jamming reliability

1. introduction

the common characteristic of all remote controlled improvised explosive devices (rcied) is that they are activated by wirelessly transmitted messages. the consequences of an rcied activation message could be disastrous in terms of human lives (vip persons) and equipment damage. all elements related to the activation signal characteristics (signal power, frequency, implemented modulation method, message duration) are completely unknown. this fact creates great problems in the realization of rcied activation jammers.

received february 4, 2020; received in revised form march 3, 2020
corresponding author: aleksandar lebl
iritel a.d., 11080 belgrade, batajnički put 23, serbia
e-mail: lebl@iritel.com
* the earlier version of this paper was awarded as the best one in the section telecommunications at the 6th icetran conference, silver lake, 3-6 june 2019, [1].

contributions [2] and [3] provide a general overview of jammer types, communications jamming requirements and the analysis of their efficiency. modern communications jamming principles and techniques may be found in [4]. there are two basic approaches to jammer implementation. the first one is active jamming, which consists of continuously sending predefined jamming signals independently of the rcied activating message characteristics.
in this concept there are no "look through" phases to detect the existence of the activation message, and the jamming signal characteristics are generally selected using previous experience and expectations. the most important freely selected jamming signal parameter is the rf signal level. this level has to be as high as possible to successfully prevent activating message reception. two key features are not optimally chosen: jamming is continuous regardless of whether an rcied activation message exists, and the rf jamming signal level necessary for successful jamming cannot be matched to the activation signal level, because that level is unknown.

the alternative approach to jammer implementation is the responsive jamming concept. in this case the jamming signal characteristics can be optimized using look-through intervals to detect the existence of the activation message and its level. it is therefore possible to send the jamming signal only while the activation message is present, and the jamming signal level can be adjusted to the activation message level in order to successfully deny the threat. a wide range of active and responsive jammers may be found in [5]-[14].

it may be concluded from this short presentation of active and responsive jamming characteristics that active jamming is always successful, while responsive jamming efficiency depends on the reliability of activation message detection. the question is whether responsive jamming reliability may be higher than that of active jamming. in this paper we compare the reliability of the most commonly implemented active jamming method, frequency sweep [15]-[19], with the reliability of a representative responsive method, which detects the possible presence of an activating signal by fast fourier transform (fft) analysis [14] in order to generate a jamming signal matched to the activation signal characteristics. a brief principle schematic of rcied activation signal detection is explained in section 2.
after that, the method for rcied activation signal frequency spectrum estimation based on fft analysis is presented in section 3, with the emphasis on the required calculation time. sections 4 and 5 deal with the specifics of energy detector and matched filter detector implementation for rcied activation message detection. the emphasis is on determining the number of collected samples. section 6 is devoted to frequency sweep jamming and to determining the time required for one complete jamming cycle. in section 7 jamming reliability based on fft analysis is compared to frequency sweep jamming reliability, whereby two special purpose processors are considered for fft calculation. reliability estimation is based on the time required to allow successful jamming. section 8 shows how to determine the number of analysis blocks in energy detection or matched filter detection needed to make the sample collection rate of these two detectors comparable to that of the fft based detector. finally, the conclusion is given in section 9.

2. principles of detection process

the main principles of rcied activation signal detection may be explained using the simplified block schema presented in fig. 1. the first phase of detector operation is the collection of signal samples (block scol). it is followed by the processing of these samples (block proc). the final step is the decision about the (possible) presence of an rcied activation signal on the basis of a set of comparison rules (block decision). these comparison rules are adjusted to the applied method of signal sample processing. this paper is mainly devoted to the block proc. the analyzed methods are fft, energy detector and matched filter. when the second or the third of these three methods is implemented, a digital filter precedes the processing phase.
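the three-block structure described above can be sketched in a few lines. this is only an illustrative sketch, not the authors' implementation: the function names `detect` and `energy`, the optional `prefilter` argument and the numerical threshold are all hypothetical, and the block scol is represented simply by an already collected list of samples.

```python
from typing import Callable, Optional, Sequence

Block = Sequence[float]

def energy(block: Block) -> float:
    """Example 'proc' stage (hypothetical choice): sum of squared samples."""
    return sum(x * x for x in block)

def detect(block: Block,
           proc: Callable[[Block], float],
           threshold: float,
           prefilter: Optional[Callable[[Block], Block]] = None) -> bool:
    """Run one detection cycle on a block already collected by 'scol'."""
    if prefilter is not None:        # optional digital filter before processing
        block = prefilter(block)
    metric = proc(block)             # block 'proc'
    return metric > threshold        # block 'decision': one comparison rule

quiet = [0.1, -0.1, 0.1, -0.1]       # made-up samples, noise only
strong = [1.0, -1.0, 1.0, -1.0]      # made-up samples, strong signal
print(detect(quiet, energy, threshold=1.0))   # False
print(detect(strong, energy, threshold=1.0))  # True
```

passing the processing stage as a function keeps the decision rule separate from the processing method, which mirrors the remark that the comparison rules are adjusted to the method used in the block proc.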
according to the available literature, there are also other methods which are less often applied for spectrum sensing, but which are possible candidates for rcied activation signal detection. some of them are waveform based detection, eigenvalue based detection, wavelet based edge detection, cyclostationary feature detection [20] and so on. these methods, being less often applied in general, are beyond the scope of this paper.

fig. 1 block schema of rcied activation signal detector (blocks: scol, proc, decision)

3. signal spectrum estimation on the fft base

fft is a calculation procedure which allows relatively fast estimation of the frequency spectrum of a discretized signal. starting from $N$ time samples of the analyzed signal, this procedure gives a snapshot of the signal frequency spectrum also in $N$ points, i.e. $N$ spectrum lines are obtained. fft is the optimum method with respect to the number of mathematical operations required for signal spectrum determination. there are $(N/2)\log_2 N$ complex multiplications and $N\log_2 N$ complex additions [21]. the limitation for $N$ is that the condition $N = 2^{a}$ must be satisfied, where $a$ is a positive integer. this is a significant saving in the number of mathematical operations and in the required calculation time compared to the classical method of frequency spectrum estimation by discrete fourier transform (dft). namely, dft requires $N^{2}$ complex multiplications and $N^{2} - N$ complex additions to obtain $N$ frequency spectrum components from $N$ time samples.

let us suppose that $f_s$ is the sampling frequency of the analyzed signal. the sample acquisition time is then:

$$T = \frac{N}{f_s} \qquad (1)$$

the frequency resolution may be determined from the sample acquisition time as:

$$df = \frac{1}{T} = \frac{f_s}{N} \qquad (2)$$

therefore, the frequency resolution is improved when the acquisition time is increased, i.e. the spacing between the spectral components of the analyzed signal becomes smaller.

constant advancements in processor technology and improvements of mathematical algorithms are visible in two aspects of fft calculation progress. on the one hand, the number of points in which the frequency spectrum is determined is constantly increasing; on the other hand, the fft calculation time for a given number of frequency spectrum components is constantly decreasing, as can be followed chronologically in [22], [23], [21], [24]. we selected the two approaches referenced in [21] and [24] due to their very fast processing algorithms. the data presented in [24] give the fft calculation time as a function of the number of signal time samples used for fft calculation, i.e. as a function of the number of frequency components obtained in the analyzed spectrum. the presented data are for a processor clock of 1 ghz. it is further emphasized in [24] that an improvement may be achieved by increasing the processor clock speed to 1.25 ghz. besides, it is stated in [25] that the maximum processor clock frequency may be as high as 1.4 ghz. on the basis of these data, the fft calculation time (tcal in ms) is presented in table 1 as a function of the number of points used in the calculation, for a processor clock of 1.25 ghz and for 8 processor cores. the value of the constant k in the first column of table 1 is 1024.

the fft calculation time (tcal in μs) according to the data in [21] is presented in table 2. the processor clock in this case may be in the range between 60 mhz and 150 mhz [26]. that is why the data are presented for a mid-range processor frequency of 100 mhz. the fft hardware accelerator (hwafft) is one of the parts of the processor described in [21]. hwafft is intended for faster fft calculation. the data in table 2 relate to the case when hwafft is used.
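as a rough illustration of eqs. (1) and (2) and of the operation counts quoted earlier in this section, the following sketch evaluates them for one block size. the function name `fft_budget` and the sampling frequency of 100 mhz used in the call are assumptions made only for this example; they are not values taken from the paper.

```python
def fft_budget(n_points: int, fs_hz: float) -> dict:
    """Acquisition time, resolution and operation counts for an N-point block."""
    # n_points must satisfy N = 2^a with a positive integer a
    if n_points < 2 or n_points & (n_points - 1):
        raise ValueError("n_points must be a power of two")
    log2n = n_points.bit_length() - 1
    return {
        "acquisition_time_s": n_points / fs_hz,   # eq. (1): T = N / fs
        "resolution_hz": fs_hz / n_points,        # eq. (2): df = fs / N
        "fft_mults": (n_points // 2) * log2n,     # (N/2) log2 N
        "fft_adds": n_points * log2n,             # N log2 N
        "dft_mults": n_points ** 2,               # N^2
        "dft_adds": n_points ** 2 - n_points,     # N^2 - N
    }

budget = fft_budget(1024, 100e6)   # fs = 100 MHz is an assumed value
for name, value in budget.items():
    print(name, value)
```

for 1024 points the fft needs 5120 complex multiplications against 1048576 for the dft, which is the saving referred to in the text.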
the number of points for which fft is calculated is relatively small (up to 1024) compared to the number of points for which fft results are presented in table 1.

in accordance with fig. 1, the total time needed for signal analysis in a jammer ($t_{an}$) before (possibly) starting the emission of the rcied activation jamming signal consists of three components: the sample acquisition time ($T$), the fft calculation time ($t_{cal}$) and the time necessary to compare the signal frequency components obtained by fft ($t_{comp}$) in order to determine whether it is necessary to start jamming. for the last component ($t_{comp}$) there are no data in the literature, because the calculation is very specific. for our analysis, we supposed that taking equal values of $t_{cal}$ and $t_{comp}$ is a reasonably good approximation, i.e.

$$t_{an} = T + t_{cal} + t_{comp} = T + 2\,t_{cal} = \frac{N}{f_s} + 2\,t_{cal} \qquad (3)$$

table 1 the time of fft calculation as a function of the number of calculation points for the processor presented in [24]

number of points for fft calculation   calculation time tcal [ms] (8 cores, 1.25 ghz)
16k                                    0.1051
32k                                    0.1584
64k                                    0.2517
128k                                   0.5128
256k                                   0.9488
512k                                   2.4824
1024k                                  5.1226

table 2 the time of fft calculation as a function of the number of calculation points for the processor presented in [21]

number of points for fft calculation   calculation time tcal [μs] (with hwafft, 100 mhz)
8                                      1.3
16                                     1.7
32                                     3.21
64                                     4.36
128                                    9.12
256                                    16.68
512                                    37.4
1024                                   73.15

4. rcied activation signal detection by energy detector

the energy detector is one of the simplest techniques for signal detection [27]. at the same time it is a very often applied technique. it is first necessary to measure the signal energy in the pre-defined frequency band.
the measured signal energy

$$E(x) = \sum_{n=1}^{N} \left( x(n) \right)^{2} \qquad (4)$$

is then compared to the energy threshold γ, where $N$ is the number of samples used for signal energy estimation, $x(n)$ is the amplitude of the $n$-th sample and γ is the threshold power. although simple to implement, the energy detector suffers performance degradation due to noise uncertainty (the noise level varies in time) and background interference [28]. noise uncertainty may be bounded or unbounded [29]. as a consequence of noise uncertainty, detection by energy detector may even become impossible at relatively low values of signal-to-noise ratio (snr) [30]. in other words, there exists an snr wall: detection is possible only when the signal power is higher than the noise power uncertainty.

for the analysis in this paper, and for the comparison of the energy detector with other methods for reactive jamming, the most important parameter is the number of samples ($N$) needed to achieve the necessary detection reliability. our analysis is based on the formula for $N$ from [27], [31]:

$$N = \frac{2\left( Q^{-1}(P_f) - \sqrt{1 + 2\,\mathrm{SNR}}\; Q^{-1}(P_d) \right)^{2}}{\mathrm{SNR}^{2}} \qquad (5)$$

where $P_f$ is the probability of false detection (the detector announces signal presence although there is no signal), $P_d$ is the probability of successful detection and $Q^{-1}$ is the inverse gaussian q-function. in other words, $Q^{-1}$ is the inverse of

$$Q(x) = \frac{1}{\sqrt{2\pi}} \int_{x}^{\infty} e^{-u^{2}/2}\, du \qquad (6)$$

fig. 2 the necessary number of samples (n) when energy detector is applied, as a function of signal-to-noise ratio (snr), without noise uncertainty, for different values pmd = pf (curves for pf = pmd = 0.1, 0.01, 0.001 and 0.000001; snr from -20 db to 0 db; n on a logarithmic scale from 1e+01 to 1e+07).
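the dependence shown in fig. 2 can be reproduced numerically with a short sketch. it assumes that eq. (5) has the form N = 2·(Q⁻¹(pf) − √(1 + 2·snr)·Q⁻¹(pd))² / snr², with snr as a linear power ratio and pd = 1 − pmd; this reading of the formula and the function names `q_inv` and `energy_detector_samples` are assumptions of the example, not material quoted from [27] or [31].

```python
import math
from statistics import NormalDist

def q_inv(p: float) -> float:
    """Inverse Gaussian Q-function, using Q(x) = 1 - Phi(x)."""
    return NormalDist().inv_cdf(1.0 - p)

def energy_detector_samples(pf: float, pmd: float, snr_db: float) -> int:
    """Number of samples from the assumed form of eq. (5), rounded up."""
    pd = 1.0 - pmd                         # successful detection probability
    snr = 10.0 ** (snr_db / 10.0)          # dB -> linear power ratio
    term = q_inv(pf) - math.sqrt(1.0 + 2.0 * snr) * q_inv(pd)
    return math.ceil(2.0 * term * term / (snr * snr))

for snr_db in (-20, -10, 0):
    print(snr_db, energy_detector_samples(0.1, 0.1, snr_db))
```

the required number of samples grows roughly by a factor of one hundred for every 10 db decrease of snr at low snr, in line with the steep curves of fig. 2.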
one additional important parameter in the analysis of energy detection systems is the probability of missed detection $P_{md}$ (the detector does not detect a signal although it exists). the probabilities $P_d$ and $P_{md}$ are connected by the equation

$$P_{md} = 1 - P_d \qquad (7)$$

fig. 2 presents the necessary number of samples (n) as a function of signal-to-noise ratio (snr). the results are presented for equal values of pf and pmd. there is no noise uncertainty, which means that an optimum detector threshold value exists independently of snr. for small snr, signal detection is always possible, but the value of n increases significantly. in our concrete implementation it is more important to achieve a low value of pmd than a low value of pf. in other words, the consequences of a missed detection are more severe (the rcied is activated because there is no jamming) than those of a false detection (the jamming signal is only generated needlessly). that is why the results for probability values satisfying the condition pmd < pf [...] ρ > 1, is subtracted from the value snr in the denominator of equation (5). this means that the noise power, instead of being equal to σ² as when the noise is completely defined, now has a value between (1/ρ)·σ² and ρ·σ². the snr wall is presented by the fourth graph in fig. 3. even for a very small noise uncertainty value, ρ = 0.25 db or ρ = 1.059, with pf = 0.1 and pmd = 0.01, the value of n tends to infinity for snr ≈ -9.3 db, and below this value it is not possible to detect a signal. in conclusion, it is very important to monitor the noise level constantly and to adjust the threshold value according to the instantaneous noise level, thereby avoiding the appearance of the snr wall.

5. rcied activation signal detection by matched filter

the second often implemented technique of spectrum analysis is based on the method of matched filters.
the main property of such filters is that they are optimum linear filters for signal detection in white gaussian noise, meaning that the maximum snr is achieved by their implementation [32]. although this property contributes to easier and faster signal detection, the drawback of matched filter implementation is that the time characteristics of the signal to be detected must be known precisely. such knowledge is available in some application areas, for example in cognitive radio [32], [33]. for rcied activation signal jamming, however, there is a great variety of possible and, at the same time, unpredictable activation techniques. they usually depend on the devices which may be easily purchased in some country (region) and easily adapted for a malicious function. the number of applied solution types is not great in the analysis presented in [34], with one type being dominant, which simplifies and limits the necessary number of different matched filters. nevertheless, the application of matched filters is not quite suitable for rcied activation signal detection, and the analysis of this method has more theoretical than practical significance.

similarly to the analysis of the energy detector, the number of samples necessary to achieve the desired probability of false alarm and probability of successful signal detection may be determined on the basis of the equation from [27]:

$$N = \frac{\left( Q^{-1}(P_f) - Q^{-1}(P_d) \right)^{2}}{\mathrm{SNR}} \qquad (8)$$

fig. 4 presents the necessary number of samples (n) as a function of signal-to-noise ratio (snr) when a matched filter is implemented. the results are obtained by equation (8) and are presented for equal values of pf and pmd. after that, fig. 5 presents the corresponding results if pmd < [...]

[...] kr > 0 is concave and kr < 0 is convex.

fig. 4 effect of pitch curvature on the helix winding: kp < 0 increases pitch and kp > 0 decreases pitch.

2. parabolic form

2.1.
parabolic constraints

the second-order polynomial relations for the helix radius $r$ and pitch $p$ may be written in non-dimensional form:

$$r = k_r z^{2} + (a_r - k_r)\,z + 1 \qquad (1)$$

$$p = k_p z^{2} + (a_p - k_p)\,z + 1 = \frac{dz}{dn} \qquad (2)$$

where the coefficients $k_r$ and $k_p$ control the magnitude of each $z^{2}$ term (curvature term) and the following non-dimensional groups are defined: $r = R/R_1$ is the radius of the helix winding, $p = P/P_1$ is the pitch of the winding, $z = Z/L$ is the axial distance, $a_r = (R_2 - R_1)/R_1$ is the radius taper, $a_p = (P_2 - P_1)/P_1$ is the pitch taper and $n = N(P_1/L)$ is the reduced turn number. the pitch angle [10] is sometimes used as an alternative to the helix pitch, with the definition $\alpha = \tan^{-1}(P/2\pi R) \approx P/2\pi R$ radians, or $\alpha \approx 90P/\pi^{2}R$ degrees. this approximation is accurate to within 1 percent for α < 9.86 degrees and to within 2 percent for α < 13.87 degrees. substitution into eq. (2) allows convenient comparison of the two systems to the same order of accuracy, which may be acceptable for many practical cases.

eqs (1) and (2) reduce to the linear cases for $k_r = k_p = 0$. the radius curvature coefficient is $k_r$ > 0 for concave helix profiles and $k_r$ < 0 for convex profiles, fig. 3, and the pitch curvature coefficient $k_p$ imposes a comparable pattern onto the helix winding, fig. 4. the limits for values of the coefficients are explored in the next subsection.

eq. (2) may be rearranged, integrated and the boundary condition $n = 0$ at $z = 0$ applied to yield:

$$n = \frac{1}{k_p (q_2 - q_1)} \ln\left( \frac{1 - z/q_2}{1 - z/q_1} \right) \qquad (3)$$

where, defining $b = (a_p - k_p)/2k_p$, the two quadratic root quantities are:

$$q_1 = -b - \sqrt{b^{2} - \frac{1}{k_p}} \qquad (4)$$

$$q_2 = -b + \sqrt{b^{2} - \frac{1}{k_p}} \qquad (5)$$

taking $z = 1$ allows calculation of the total number of turns $N_{tot} = n_{tot}(L/P_1)$ of the helix winding. rearranging for $z$, a double-exponential relation in terms of $n$ results:

$$z = \frac{\exp[k_p (q_2 - q_1)\,n] - 1}{(1/q_1)\exp[k_p (q_2 - q_1)\,n] - (1/q_2)} \qquad (6)$$

which allows the axial position of each turn to be determined.
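the pair of relations (3) and (6) can be checked numerically with a short sketch. it assumes the reading n = ln[(1 − z/q2)/(1 − z/q1)] / [kp(q2 − q1)] for eq. (3) and its inverse for eq. (6), handles only the case of real and distinct roots q1 and q2, and uses illustrative values ap = 0.5 and kp = −0.3 that are not taken from the paper. the function names are hypothetical.

```python
import math

def pitch_roots(a_p: float, k_p: float):
    """Roots q1, q2 of k_p z^2 + (a_p - k_p) z + 1 = 0, as in eqs (4) and (5)."""
    if k_p == 0:
        raise ValueError("k_p = 0 is the linear-pitch case; eq. (3) does not apply")
    b = (a_p - k_p) / (2.0 * k_p)
    disc = b * b - 1.0 / k_p
    if disc <= 0:
        raise ValueError("this sketch handles only real, distinct roots")
    root = math.sqrt(disc)
    return -b - root, -b + root

def turns_at(z: float, a_p: float, k_p: float) -> float:
    """Reduced turn number n reached at axial position z, eq. (3)."""
    q1, q2 = pitch_roots(a_p, k_p)
    return math.log((1.0 - z / q2) / (1.0 - z / q1)) / (k_p * (q2 - q1))

def position_of_turn(n: float, a_p: float, k_p: float) -> float:
    """Axial position z of reduced turn number n, eq. (6)."""
    q1, q2 = pitch_roots(a_p, k_p)
    e = math.exp(k_p * (q2 - q1) * n)
    return (e - 1.0) / (e / q1 - 1.0 / q2)

a_p, k_p = 0.5, -0.3                      # illustrative values only
n_tot = turns_at(1.0, a_p, k_p)           # reduced total turn number at z = 1
print(n_tot, position_of_turn(n_tot, a_p, k_p))
```

applying `position_of_turn` to equally spaced values of n between 0 and the value returned by `turns_at(1.0, ...)` gives the axial position of each turn along the winding.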
eqs (3) to (6) break down for kp = 0, in which case they become:

$$n = \frac{1}{a_p} \ln(a_p z + 1) \qquad (7)$$

$$z = \frac{1}{a_p} \left[\exp(a_p n) - 1\right] \qquad (8)$$

for r1 = r2, the helix profile reduces to a constant radius cylinder (ar = kr = 0). for p1 = p2, the pitch taper ap = 0 and the helix winding has a constant pitch p = l/ntot = z/n. in that case, eqs (7) and (8) reduce to n = z and z = n.

higher-order structural constraints for improved optimization of nonuniform helical antennas 535

2.2. limits for values of the coefficients k

eqs (1) and (2) are formulated to produce curves that pass through specified end points at z = 0 and z = 1. however, for certain values of the coefficient k, they are capable of generating negative values of the helix radius and pitch in the axial range of interest 0 ≤ z ≤ 1. these correspond to physically impossible situations, so it is necessary to limit the range of values that k may assume. the effect of the radial curvature coefficient is explored first and the results are then extended to the pitch curvature.

fig. 5 shows several possible helix profiles that may be produced by eq. (1). the non-dimensional helix radius r exhibits a minimum for kr > 0 (concave profile) and a maximum for kr < 0 (convex profile). these occur where the slope dr/dz = 2krz + (ar –kr) = 0 and thus a formula for kr may be written:

$$k_r = \frac{a_r}{1 - 2z} \qquad (9)$$

the curves for kr = +ar and kr = –ar have zero slope at one of the helix endpoints. these curves bracket the already optimized linear helix profile [9,10] and the optimized parabolic shape will likely lie nearby, in the range –ar ≤ kr ≤ ar. it is noted that eq. (9) has a discontinuity for z = 0.5 and can only have physical meaning there if ar = 0 also, i.e., the helix radius has no linear taper component in this case.

however, if kr > 0 is too large, eq. (1) will generate negative values for the helix radius. the constraint condition may be written as r ≥ 0 in the range 0 ≤ z ≤ 1. by rearranging eq. (9) we obtain the axial position of the parabola minimum, namely z = –(ar –kr)/2kr, and substituting this into eq. (1) gives the condition:

$$-(k_r - q_1)(k_r - q_2) \geq 0 \qquad (10)$$

where the quadratic roots are:

$$q_1 = (a_r + 2) - 2\sqrt{a_r + 1} \qquad (11)$$

$$q_2 = (a_r + 2) + 2\sqrt{a_r + 1} \qquad (12)$$

fig. 5 helix profiles for ar = 1.2 over a range of radius curvature coefficients kr.

from fig. 5, it may be seen that, of the two roots, only kr = q2 places the zero-valued minimum inside the range 0 ≤ z ≤ 1; thus eq. (1) generates positive values when:

$$k_r < q_2 \qquad (13)$$

it is noted that, by definition, ar = (r2–r1)/r1 and, because r2 must not be negative, the radius taper is confined to the range –1 ≤ ar ≤ ∞. this avoids the generation of complex values by eqs (11) and (12).

regarding the pitch curvature, by substituting ap for ar in eqs (11) and (12), the comparable condition for positive pitch values for eq. (2) is:

$$k_p < q_2 \qquad (14)$$

furthermore, the pitch curvature coefficient kp < 0 cannot be too negative or it will reduce the number of turns in eq. (3) to less than a desired value, typically n = 4. an iterative method with z = 1, n = 4 and fixed values of ap was used to determine the kp values for this constraint. for the linearly optimized case [9], p1/l = 17mm/750mm, the resulting curve for kp versus ap is plotted in fig. 6. this reveals that the lower limit for kp is orders of magnitude outside of the range of interest here.

the nonlinear equations presented elsewhere in this paper also require an iterative solution to determine limits for values of the curvature coefficients. a practical step in an optimization algorithm might be simply to check for negative values of r and p every time the structural dimensions are recalculated.

2.3. optimized yagi-uda test case

the yagi-uda antenna has a nominally comparable structure to the helical antenna; hence by studying optimized versions of the former, it may be possible to improve the design of the latter.
the data for a 15-element yagi-uda antenna optimized for gain were produced with a method-of-moments solution [11] embodied in a java applet [6,7] and the results posted on the internet [12]. by curve-fitting these data with the forms of the equations for the helical antenna, an analogy may be developed.

fig. 6 variation of kp with ap, where p1/l = 0.0227, to produce a helix with n = 4 turns.

table 1 lists the equivalent parameters used when comparing the two antenna designs. the two structures are also depicted in fig. 7. the yagi-uda design features a series of discrete linear elements positioned perpendicular to the z axis, while the helical design features a conductive ground plane at z = 0 plus a single wire coiled along the z axis. for consistency, the yagi-uda elements are numbered here as 0-14 and not 1-15 as conventionally done, so that the reflectors count as 0 in both antenna designs. thus ntot = 14 and the normalized element number n = n/ntot corresponds to the helical normalized turn number n, where 0 ≤ n ≤ 1. the optimized yagi-uda antenna here consists of 14 parasitic or passive elements and one driven element (#1). data points for this element and the reflector (#0) were included in the calculations for the best-fit coefficients although they do not always fit well with the overall pattern of the helical equations.

table 1 equivalent non-dimensional parameters for helical and yagi-uda antennas.

| helix parameter | yagi parameter |
| axial distance from reflector, z | axial distance from reflector, z |
| helix radius for 1 turn, r | element length, r |
| helix pitch for 1 turn, p | element spacing, p |
| normalized turn number, n | normalized element number, n |

fig. 7 comparison of antenna structures: yagi-uda (upper half) and nonuniform helix profile (lower half).

fig. 8 shows element length r versus axial distance, while fig. 9 shows the element spacing p, derived from linear interpolation of the data, versus axial distance. a least-squares method [13] was adapted to minimize the error of the second-order curve-fits. the parabolic curves (solid lines) show a superior fit to these data compared with the linear counterparts (dashed lines). this indicates that, had the yagi optimization proceeded with constraints of the parabolic form, a better approximation to the true optimized data would have emerged compared with the linear constraints. by analogy, using the parabolic constraints for optimizing the helical antenna should result in a better approximation to the true optimized data than the linear constraints.

2.4. comparison with bi-linear form

the linear constraints could be taken one step further and a bi-linear form used instead of the continuously variable parabolic eqs (1) and (2). figs 10 and 11 show that such an arrangement obtains an agreement with the yagi-uda data that rivals that of the parabolic curves. however, the use of piecewise linear segments has the cost of optimizing the range of each segment. thus in general, 4 new variables would be added: a middle radius rm and middle pitch pm and their respective axial locations lrm and lpm. even after simplifying with lrm = lpm, 3 new variables would be added, compared with just the 2 curvature coefficients of the parabolic forms.

fig. 8 optimized yagi element length r versus axial distance z.

fig. 9 optimized yagi element spacing p versus axial distance z.

fig. 10 optimized yagi element length r versus axial distance z with a bi-linear curve-fit.

fig. 11 optimized yagi element spacing p versus axial distance z with a bi-linear curve-fit.

3. power-law forms

3.1. direct power-law curve-fits

further improvement in accuracy is obtained by applying a power-law formula directly to the optimized yagi-uda data, figs 12 and 13, particularly for the sharply flared profile at one end of the element array.
the element length taper and element spacing taper are indicated by the dashed lines in each figure, which correspond to the helical tapers ar and ap. the helical power-law form, including linear tapers, is presented in the following subsection.

3.2. power-law-plus-linear equations

the close approximation of the parabolic curve-fit in fig. 9, along with the improved power-law approximation of fig. 12, suggests that the power-law form may be used in combination with the parabolic form. this combination of equations is desirable since the power-law form adds one more variable to be optimized: the power-law exponent c. however, the “raw” power-law curve-fit in fig. 12 has a negative exponent, which means z^c would tend to infinity, not zero, as z → 0. this difficulty is overcome with the formulations presented below. thus, three new scenarios result:

(a) the pitch is kept as parabolic in terms of z, eq. (2), while the radius r is modified to allow an arbitrary exponent c > 0, producing a power-law-plus-linear form:

$$r = k_r z^{c} + (a_r - k_r) z + 1 \qquad (15)$$

eq. (15) reduces to eq. (1) when c = 2.

fig. 12 optimized yagi element length r versus axial distance z with a power-law curve-fit.

fig. 13 optimized yagi element spacing p versus axial distance z with a power-law curve-fit.

(b) the radius is kept as parabolic in terms of z, eq. (1), while the pitch p is modified to allow an arbitrary exponent d > 0, producing a power-law-plus-linear form:

$$p = k_p z^{d} + (a_p - k_p) z + 1 = \frac{dz}{dn} \qquad (16)$$

eq. (16) reduces to eq. (2) when d = 2. however, when rearranging and integrating to obtain n(z) the analysis is cumbersome, involving an infinite series expansion. alternatively, n(z) and z(n) may be determined by numerical methods.

(c) both the pitch and radius may have power-law forms in terms of z, but the same cumbersome integration problem of (b) remains.

3.3. n-form of power-law-plus-linear equations

the analytical difficulty may be avoided by re-stating the non-dimensional radius and pitch directly in terms of the normalized turn number $\mathbf{n}$, i.e. the turn number divided by the total number of turns (equivalently, the reduced turn number n divided by its total ntot):

$$r = k_r \mathbf{n}^{c} + (a_r - k_r)\mathbf{n} + 1 \qquad (17)$$

$$p = k_p \mathbf{n}^{d} + (a_p - k_p)\mathbf{n} + 1 = \frac{1}{n_{tot}} \frac{dz}{d\mathbf{n}} \qquad (18)$$

where 0 ≤ $\mathbf{n}$ ≤ 1 and ntot is the reduced turn number at z = 1. eq. (18) may be rearranged and integrated, and the boundary condition $\mathbf{n}$ = 0 at z = 0 applied, to give:

$$z = n_{tot}\left[\frac{k_p}{d+1}\mathbf{n}^{d+1} + \frac{a_p - k_p}{2}\mathbf{n}^{2} + \mathbf{n}\right] \qquad (19)$$

and ntot is easily found by setting z = 1 at $\mathbf{n}$ = 1:

$$n_{tot} = \left[\frac{k_p}{d+1} + \frac{a_p - k_p}{2} + 1\right]^{-1} \qquad (20)$$

eq. (19) may be used iteratively to solve for the normalized turn number as a function of z.

fig. 14 shows a curve-fit using the form of eq. (17) with an r versus n plot of the optimized yagi-uda data. the problem of the negative exponent of fig. 12 is overcome. it also shows that plotting against the normalized element number n produces a gentler flare than plotting against z. the rms error in element length is 0.728 percent with the power-law-plus-linear form.

fig. 15 shows a curve-fit using the form of eq. (19) with a z versus n plot of the optimized yagi-uda data. the linearly interpolated pitch values shown in figs 9, 11 and 13 are replaced by a smoothed z(n) curve, from which a smoothed pitch p = dz/dn curve may be derived.

fig. 14 optimized yagi element length r versus normalized element number n with power-law based curve-fits.

fig. 15 optimized yagi axial distance z versus normalized element number n with power-law based curve-fits.

3.4. simplified power-law forms

the power-law-plus-linear curve in fig. 15 is virtually identical to that generated by the power-law-plus-parabola form eq. (19). this suggests the pitch curvature coefficient kp = ap for this data set, which reduces the number of variables by one. furthermore, the nature of the z(n) curve in fig. 15 allows the simplification kp = 0 with ap ≠ 0.
the resulting simple parabolic form of eq. (19) is scarcely visible as a dashed curve, revealing that the accuracy is also quite good for this data set. this would eliminate yet again one more variable, since eq. (18) would be reduced to the linear form.

the equivalent simplification for the radius curvature coefficient kr = ar was applied in fig. 14 to produce a power-law-plus-constant form (dashed curve) with a somewhat poorer fit. again, this reduces the number of variables by one while delivering better capability to model profiles with a sharp flare at one end compared with the parabolic form (gray curve). the best-fit parabola gives a much closer approximation to the data when plotted against n, compared with plotting against z (fig. 8).

4. exponential forms

4.1. exponential-plus-linear equations

an exponential term may be substituted for the power-law term in eqs (15) and (16). for this purpose, it is convenient to define a normalized exponential function for each equation:

$$e_r(z) = \frac{e^{cz} - 1}{e^{c} - 1} \quad \text{(radius)} \qquad (21)$$

$$e_p(z) = \frac{e^{dz} - 1}{e^{d} - 1} \quad \text{(pitch)} \qquad (22)$$

where 0 ≤ e ≤ 1 in the range 0 ≤ z ≤ 1 and c and d are the exponential constants to be optimized. these constants may be positive or negative, unlike for the power-law form where they are constrained to be positive only. the normalized form allows generation of curves that pass through specified points at z = 0 and z = 1. thus, the radius and pitch constraints become:

$$r = k_r e_r(z) + (a_r - k_r) z + 1 \qquad (23)$$

$$p = k_p e_p(z) + (a_p - k_p) z + 1 = \frac{dz}{dn} \qquad (24)$$

4.2. n-form of exponential-plus-linear equations

as with the power-law-plus-linear form, the analytical difficulty in integrating the rearranged eq. (24) to obtain z(n) and n(z) may be avoided by re-stating the non-dimensional radius and pitch directly in terms of the normalized turn number $\mathbf{n}$:

$$r = k_r e_r(\mathbf{n}) + (a_r - k_r)\mathbf{n} + 1 \qquad (25)$$

$$p = k_p e_p(\mathbf{n}) + (a_p - k_p)\mathbf{n} + 1 = \frac{1}{n_{tot}} \frac{dz}{d\mathbf{n}} \qquad (26)$$

this may be rearranged and integrated, and the boundary condition $\mathbf{n}$ = 0 at z = 0 applied, to give:

$$z = n_{tot}\left\{k_p\left[\frac{e_p(\mathbf{n})}{d} - \frac{\mathbf{n}}{e^{d} - 1}\right] + \frac{a_p - k_p}{2}\mathbf{n}^{2} + \mathbf{n}\right\} \qquad (27)$$

and ntot is readily found:

$$n_{tot} = \left\{k_p\left[\frac{1}{d} - \frac{1}{e^{d} - 1}\right] + \frac{a_p - k_p}{2} + 1\right\}^{-1} \qquad (28)$$

figs 16 and 17 reveal that this form obtains a comparably close fit to the optimized yagi-uda data as the power-law-plus-linear form. the rms error in element length is 0.588 percent with the exponential-plus-linear form, fig. 16. it is noted that eq. (27) does not reduce to eq. (8), as these expressions were formulated differently.

4.3. simplified exponential forms

as in subsection 3.4, applying the simplification kr = ar to the radius constraint allows the number of variables to be reduced by one. fig. 16 reveals that the exponential-plus-constant form (dashed curve) is noticeably better than the power-law-plus-constant form of fig. 14. it would also be preferred over the less accurate parabolic form (gray curve). all three of these forms involve 3 variables to be optimized.

the exponential-plus-linear curve in fig. 17 for the element spacing data is virtually identical to that generated by the exponential-plus-parabola form of eq. (27). again, this suggests the pitch curvature coefficient kp = ap for this data set, which reduces the number of variables by one.

fig. 16 optimized yagi element length r versus normalized element number n with exponential based curve-fits.

fig. 17 optimized yagi axial distance z versus normalized element number n with exponential based curve-fits.

furthermore, the simplification kp = 0 with ap ≠ 0 may be used, yielding the same parabola (dashed curve) as in fig. 15. this choice reduces eq. (26) to the linear form, which has only 2 variables to be optimized.
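the n-form relations can be sketched in python as follows. the sketch assumes the integrated forms z = ntot[kp n^(d+1)/(d+1) + (ap − kp)n^2/2 + n] for the power-law pitch and z = ntot{kp[e_p(n)/d − n/(e^d − 1)] + (ap − kp)n^2/2 + n} for the exponential pitch, with ntot fixed by z = 1 at n = 1. the text only states that the inversion is done iteratively; bisection is the method chosen here and is an assumption, valid while the pitch stays positive so that z(n) increases monotonically. function names and sample coefficients are hypothetical.

```python
import math


def z_power_law(n, a_p, k_p, d):
    """Axial position z for normalized turn number n, power-law pitch (d > 0)."""
    bracket = k_p * n ** (d + 1) / (d + 1) + 0.5 * (a_p - k_p) * n * n + n
    n_tot = 1.0 / (k_p / (d + 1) + 0.5 * (a_p - k_p) + 1.0)
    return n_tot * bracket


def z_exponential(n, a_p, k_p, d):
    """Axial position z for normalized turn number n, exponential pitch (d != 0)."""
    denom = math.exp(d) - 1.0
    e_p = (math.exp(d * n) - 1.0) / denom  # normalized exponential, 0..1
    bracket = k_p * (e_p / d - n / denom) + 0.5 * (a_p - k_p) * n * n + n
    n_tot = 1.0 / (k_p * (1.0 / d - 1.0 / denom) + 0.5 * (a_p - k_p) + 1.0)
    return n_tot * bracket


def normalized_turn_at(z_target, z_of_n, tol=1e-12):
    """Invert a monotonically increasing z(n) on 0 <= n <= 1 by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if z_of_n(mid) < z_target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    a_p, k_p = 0.5, 0.3  # hypothetical pitch taper and curvature
    n_mid = normalized_turn_at(0.5, lambda n: z_exponential(n, a_p, k_p, -2.0))
    print("normalized turn number at z = 0.5:", n_mid)
```

setting d = 2 in `z_power_law` gives the parabolic n-form, so one routine covers both the 3-variable and 4-variable pitch constraints.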
further subtle improvements to the fit may be obtained with yet higher forms, but at the cost of adding more variables to be optimized. thus it appears a satisfactory level of efficiency may be achieved, whereby a near-perfect approximation to the true optimized data set results from optimizing as few as 5 variables: r1, r2, p1, p2 and c, with kr = ar = (r2/r1–1) and kp = 0 (ap ≠ 0).

5. comparison with circularly polarized designs

5.1. types of design

the combined parabolic, power-law-plus-linear and exponential-plus-linear forms offer the capability of closely approximating the structures of a variety of antenna designs that feature circularly polarized operation. these include, with the aid of external circuitry, a circularly polarized yagi hybrid [14] and a cubical quad [15,16] (square-loop-element yagi). designs with inherent circular polarization include helices with tapered ends [17], a bifilar archimedean-spiral-over-conical-helix design [18] and concave and convex helix profiles [19]. the latter published two normalized trigonometric forms, which may be closely approximated by the forms presented above. table 2 lists sample curve-fit equivalents for the trigonometric forms.

figs 18-26 illustrate curve-fits of the data sets. the following patterns occur that demonstrate the capabilities of the higher-order forms:
a) negative slope transitioning to level curve
b) level curve transitioning to negative slope
c) positive slope transitioning to level curve
d) level curve transitioning to positive slope
e) nearly linear curves.
the accuracy of each curve-fit is stated for each subsection.

fig. 18 element length r versus normalized element number n for “quagi” antenna [14].

fig. 19 axial distance z versus normalized element number n for “quagi” antenna [14].

5.2. yagi hybrid

the 435 mhz circularly polarized “quagi” or yagi hybrid [14] featured here has two identical orthogonal arrays of yagi-uda elements.
however, elements #0 and #1 are square loops and are included in fig. 18 using wire lengths for half of a loop. this hybrid antenna is thus part yagi-uda and part cubical quad (presented below). figs 18 and 19 show r(n) and z(n) curves similar to the yagi-uda test case presented in the previous sections. the rms errors for fig. 18 are 1.651% for the power-law form and 1.313% for the exponential form. the rms errors for fig. 19 are 0.944% for the power-law form and 0.913% for the exponential form.

5.3. cubical quad

figs 20 and 21 show the structural data for the cubical quad [16] design, which features square-loop elements arranged in a yagi-uda type array. only parabolic curve-fits were suitable, as the best-fit exponential and power-law forms exhibited highly linear behaviours for this data set. for the z(n) data of fig. 21, the best-fit exponential-plus-linear and power-law-plus-linear forms each exhibited a singularity for n → 0 (exp(–∞n) and n^0, respectively). this simply means a mathematically exact curve-fit was possible, but with a discontinuous curve. the rms error for fig. 20 is 0.111% and for fig. 21 is 0.556%.

5.4. tapered helix

fig. 22 shows the r profile data for an 18-turn tapered helix [17] and fig. 23 shows the r profile data for a 9-turn tapered helix. the helix pitch was constant for these two antennas. these figures highlight the capability of the higher-order equations to handle a sharp change in the helix profile. the rms errors for fig. 22 are 1.470% for the power-law form and 1.534% for the exponential form. the rms errors for fig. 23 are 1.149% for the power-law form and 1.234% for the exponential form.

fig. 20 element length r versus normalized element number n for the cubical quad antenna [16].

fig. 21 axial distance z versus normalized element number n for the cubical quad antenna [16].

fig. 22 helix radius r versus normalized turn number n for an 18-turn tapered helix [17].

fig. 23 helix radius r versus normalized turn number n for a 9-turn tapered helix [17].

fig. 24 helix radius r versus normalized turn number n for the archimedean-spiral-over-conical-helix design [18].

fig. 25 axial distance z versus normalized turn number n for the archimedean-spiral-over-conical-helix design [18].

5.5. archimedean-spiral-over-conical-helix

the spiral-over-helix [18] design featured here has two identical interlaced windings, rotated 180 degrees apart about the vertical axis z. only one winding is depicted in figs 24 and 25. the winding starts at z = 0 as a conical helix and joins the flat archimedean spiral at z = 1. the conical helix has 5.5 turns and the spiral an undeclared number of turns; at least 3 turns were assumed here. the capability of describing both sections of the winding with a single higher-order equation is illustrated by this example. it is also noted that, since z = 1 for all the spiral turns, the pitch p of the flat spiral conductors is given by dr/dn instead of the helical dz/dn. the rms errors for fig. 24 are 3.509% for the power-law form and 3.801% for the exponential form. the rms errors for fig. 25 are 2.770% for the power-law form and 3.001% for the exponential form.

5.6. two trigonometric forms

two higher-order equations presented for concave and convex helix profiles [19] are in normalized trigonometric form, eqs (29) and (30):

$$t_1 = 1 - \alpha \tan^{-1}\left[(1 - z)\tan\left(\frac{1}{\alpha}\right)\right] \quad \text{(concave)} \qquad (29)$$

$$t_2 = 1 - \frac{\tan\left[(1 - z)/\alpha\right]}{\tan(1/\alpha)} \quad \text{(convex)} \qquad (30)$$

where t = r/r2 (with r1 = 0) and 0 ≤ t ≤ 1 over the range 0 ≤ z ≤ 1. the variable to be optimized is α, with 2/π < α < ∞. these are complementary equations with the property of being symmetrical about the line t = z, fig. 26. this may be shown by rearranging the form of t(z) to obtain z(t) in the complementary form, and vice versa.
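the symmetry property can be checked numerically. the sketch below assumes that the two trigonometric profiles take the forms t1 = 1 − α tan^-1[(1 − z) tan(1/α)] (concave) and t2 = 1 − tan[(1 − z)/α] / tan(1/α) (convex); these forms are a reading of the damaged equations that satisfies the stated end points and symmetry, not a confirmed transcription of [19]. the function names are hypothetical.

```python
import math


def t_concave(z, alpha):
    """Assumed concave profile: t = 0 at z = 0 and t = 1 at z = 1."""
    return 1.0 - alpha * math.atan((1.0 - z) * math.tan(1.0 / alpha))


def t_convex(z, alpha):
    """Assumed convex profile, the mirror image of t_concave about t = z."""
    return 1.0 - math.tan((1.0 - z) / alpha) / math.tan(1.0 / alpha)


def max_symmetry_error(alpha, points=101):
    """Largest |t_convex(t_concave(z)) - z| over a uniform grid in z."""
    worst = 0.0
    for i in range(points):
        z = i / (points - 1)
        worst = max(worst, abs(t_convex(t_concave(z, alpha), alpha) - z))
    return worst


if __name__ == "__main__":
    for alpha in (0.65, 0.78, 1.4):  # the values of alpha used in table 2
        print(alpha, max_symmetry_error(alpha))
```

if the two curves are mirror images about the line t = z, feeding the concave value into the convex function returns the original z, so the reported error should be at rounding level.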
the normalized trigonometric forms are slightly more cumbersome than the forms presented above, since separate equations must be used for the concave and convex cases. when approximating eqs (29) and (30) with the power-law and exponential-based forms, sets of data points (z, t) were first generated and then the curve-fitting process applied. the data points for eq. (29) were chosen as the intersections between the curve and a family of lines radiating from the point (z, t) = (0, 1), generated by:

$$t_1 = 1 - z \cot\theta \qquad (31)$$

where the angle θ was varied in 5-degree increments over the range 0 ≤ θ ≤ 90 degrees. this arrangement produced a relatively uniform distribution of points over the length of each curve. the complementary sets of data for eq. (30) were generated using z values equal to the t1 values from the data sets for eq. (29).

table 2 gives examples of the best-fit coefficients for comparable exponential-plus-linear and power-law-plus-linear forms. fig. 26 illustrates the difference between these curve-fits and the data points. in fig. 26, the exponential-plus-linear form (solid curves) was typically better than the power-law-plus-linear form (dashed curves) in approximating the trigonometric form. the overall fit as measured by the difference ∆trms was better for the concave curves (lower right area) than for the convex curves (upper left). from a practical perspective, where an agreement to within 1 or 2 percent may be acceptable, the choice of normalized form would be a matter of user preference.

fig. 26 trigonometric helix profile t versus axial distance z for several concave and convex helices [19].

table 2 curve-fits of trigonometric data in fig. 26 using exponential and power-law forms.

| α | best-fit exponential-plus-linear and power-law-plus-linear forms, plotted in fig. 26 | ∆trms (%) |
| eq. (29) | | |
| 0.65 | f(x) = (3.053e-11)exp(24.12x) – 0.1007x – 0.0145 | 1.080 |
| 0.65 | f(x) = 0.9050x^23.79 + 0.1071x – 0.0160 | 1.212 |
| 0.78 | f(x) = 0.05719exp(2.924x) – 0.0018x – 0.0530 | 0.428 |
| 0.78 | f(x) = 0.7419x^3.651 + 0.2666x – 0.0016 | 0.266 |
| 1.4 | f(x) = 1.1551exp(-0.8804x) + 1.6767x – 1.1539 | 0.083 |
| 1.4 | f(x) = 0.3819x^1.701 + 0.6193x – 0.0014 | 0.115 |
| eq. (30) | | |
| 0.65 | f(x) = –0.8330exp(-36.805x) + 0.1525x + 0.8699 | 2.250 |
| 0.65 | f(x) = 1.9899x^0.3352 – 0.9871x – 0.0625 | 5.457 |
| 0.78 | f(x) = –0.5892exp(-6.383x) + 0.4096x + 0.5961 | 0.356 |
| 0.78 | f(x) = 3.5057x^0.7709 – 2.5082x – 0.0156 | 1.122 |
| 1.4 | f(x) = –0.2506exp(-2.746x) + 0.7652x + 0.2502 | 0.028 |
| 1.4 | f(x) = –1.0416x^1.213 + 2.0405x – 0.0019 | 0.154 |

6. simple test of improved optimization

an individual optimized helical antenna [9,10] required substantial computational effort to arrive at the structure giving the best signal gain. without performing any further computation, it may be demonstrated using a calculus-of-variations method that adding curvature terms to the published linear constraints can give improved optimizations.

fig. 27b shows an end-on view and a profile view of the first 3 turns of a linearly constrained square-helix based on the design of [9,10], with comparison views of a concave version (fig. 27a) and a convex version (fig. 27c) generated by the higher-order constraints. a monotonic increase in radius per turn is shown. the helix end points are the same for all three examples. these views emphasize the helix radius, since the conclusion of subsection 4.3 suggested that adding curvature to the radius constraint would give the most efficient improvement. the precise form of the higher-order constraint is immaterial, as indicated at the conclusion of subsection 5.6, for a structural accuracy on the order of 1 or 2 percent.
assuming the same number of turns n, the same overall helix length l and identical radius and pitch values at z = 0 and z = l, the concave version has a shorter wire length than the linearly varying helix, while the convex version has a longer wire length. helix profiles with either a “waist” or “bulge”, fig. 5, could also be considered as they may also have longer or shorter wire lengths.

thus, when optimizing for signal gain, one of two possible scenarios will generally occur: (a) the gain increases with an infinitesimal decrease in wire length, or (b) the gain increases with an infinitesimal increase in wire length. for scenario (a), a concave helix would have shorter wire length and would thus exhibit superior gain over the linearly varying helix. for scenario (b), the longer wire length of a convex helix would give superior performance compared with the linearly varying helix. for either scenario, there would exist a better-performing helix generated by the higher-order constraints. the actual effect of the radius curvature on performance will be complicated, but it is demonstrated clearly that an improved optimization is generally possible.

fig. 27 overhead views and profile views of square helix windings: a) concave, b) linear and c) convex. the dashed gray lines indicate the physical support structure. the solid gray lines are hidden portions of the winding.

in a simulation with curved helix profiles [19], a concave profile strongly flared at the top (z = l) proved best when optimizing for bandwidth with a relatively short helix (n ≈ 6). similarly, the analogy with the yagi-uda antenna optimized for gain would also predict a profile with kr ≈ ar for the optimized helical antenna. the generally positive linear tapers for the optimizations of [9,10] would suggest any concave flare should typically occur at the top of the helix for a longer antenna.
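the wire-length ordering used in this argument can be illustrated numerically. the sketch below is an assumption-laden illustration, not the method of the paper: it treats a circular (not square) helix, uses the parabolic radius and pitch relations r = kr z^2 + (ar − kr)z + 1 and p = kp z^2 + (ap − kp)z + 1, and integrates the wire length element sqrt(1 + (2πr/p)^2) dz with the midpoint rule. the values p1 = 17 mm and l = 750 mm are quoted in the text and ar = 1.2 is the value used in fig. 5; r1 = 20 mm and kr = ±0.6 are hypothetical.

```python
import math


def wire_length(k_r, a_r, r1, p1, length, a_p=0.0, k_p=0.0, steps=20000):
    """Wire length of a circular helix with parabolic radius and pitch.

    All lengths share one unit (mm here). Because the pitch relation does
    not depend on k_r, every profile compared below has the same turn count.
    """
    total = 0.0
    dz = 1.0 / steps
    for i in range(steps):
        z = (i + 0.5) * dz  # midpoint of the i-th slice, non-dimensional
        radius = r1 * (k_r * z * z + (a_r - k_r) * z + 1.0)
        pitch = p1 * (k_p * z * z + (a_p - k_p) * z + 1.0)
        # One axial step dZ carries dZ/pitch turns of circumference 2*pi*radius.
        total += math.sqrt(1.0 + (2.0 * math.pi * radius / pitch) ** 2) * dz * length
    return total


if __name__ == "__main__":
    common = dict(a_r=1.2, r1=20.0, p1=17.0, length=750.0)  # r1 is hypothetical
    for label, k_r in (("concave", 0.6), ("linear", 0.0), ("convex", -0.6)):
        print(label, round(wire_length(k_r, **common), 1), "mm")
```

with identical end radii, a positive kr keeps the radius below the linear taper everywhere between the ends, so the concave winding comes out shortest and the convex winding longest.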
a preliminary survey [10] indicated that nonlinear helix profiles provide little to no improvement in gain for antenna lengths in the range 2 ≤ l ≤ 10. this may be explained using the geometries of the antennas in [9,10]. for a given design frequency, the helix radius is confined to a relatively narrow range of values regardless of antenna length. thus, the longer the antenna, the more slender it is and the more it resembles a uniform helix. conversely, the shorter the antenna, the more conical or curved the profile will appear. such a pronounced nonlinear shape would therefore be expected to have a greater influence on the electromagnetic properties of the helix, where l ≤ 2.

7. conclusions

7.1. improved structural constraints

two parabolic relations are proposed to improve the optimizations for helical antennas, compared with the previously published optimizations using linear constraints. by analogy using optimized yagi-uda antenna data, a better approximation to the true optimized data may be seen by the use of parabolas. these generally also have an advantage in efficiency over bi-linear forms by adding fewer variables to be optimized.

also with reference to the optimized yagi-uda data, power-law-plus-linear and exponential-plus-linear curve-fits were found to be even more accurate. the number of variables to be optimized is the same as the parabolic form if the radius or pitch curvature coefficient is taken as equal to the linear taper: k = a. a further reduction of one variable may be possible with the pitch equations by taking kp = 0 with ap ≠ 0, while still retaining good accuracy.

comparison with data for a variety of antenna designs and with an alternative pair of trigonometric forms confirms the general validity of these mathematical relations in specifying structural dimensions and positioning of array elements and helical windings.
current helical antenna designs are often limited to constant or linear structural constraints for convenience of manufacture, whereas higher-order constraints enable optimized designs closer to what the laws of physics will allow. this last point was demonstrated in section 6, indicating improvement is generally possible by adding curvature terms to the linear structural constraints, especially for helices where l ≤ 2. all the foregoing considerations suggest these higher-order equations are immediately capable of delivering improved results when used as constraints in optimization algorithms.

the various parabolic, power-law-plus-linear and exponential-plus-linear equations presented above are summarized in table 3. there are 36 possible pairings of the 6 pitch equations with the 6 radius equations, not counting simplified versions. the accuracy generally increases as the number of variables to be optimized increases. the linear forms (k = 0) have 2 variables each.

table 3 summary of higher-order structural constraints for the helix radius and pitch.

| radius equations | pitch equations |
| forms with 3 variables | forms with 3 variables |
| r=r(z) parabolic, eq. (1) | p=p(z) parabolic, eq. (2); n=n(z) eq. (3), z=z(n) eq. (6) |
| r=r(n) parabolic, eq. (17), c=2 | p=p(n) parabolic, eq. (18), d=2; n=n(z) and z=z(n) via eq. (19), d=2 |
| forms with 4 variables | forms with 4 variables |
| r=r(z) power-law, eq. (15) | p=p(z) power-law, eq. (16); n=n(z) and z=z(n) numerically |
| r=r(z) exponential, eq. (23) | p=p(z) exponential, eq. (24); n=n(z) and z=z(n) numerically |
| r=r(n) power-law, eq. (17) | p=p(n) power-law, eq. (18); n=n(z) and z=z(n) via eq. (19) |
| r=r(n) exponential, eq. (25) | p=p(n) exponential, eq. (26); n=n(z) and z=z(n) via eq. (27) |

7.2. further refinements

the application of the higher-order equations to optimize the conductor dimensions is left as a future exercise.
published studies commonly use uniform-thickness wires with circular cross-section, although conductive strips oriented either axially (tapes [20]) or radially (“slinky” coils [17]) are reported. the variation of wire cross-sectional radius or strip width would then be included in the overall optimization process. optimum selection of a uniform wire thickness has already been considered elsewhere [8,10]. optimization of the archimedean spiral antenna [18] included a linear variation in the width of the flat spiral conductors.

adding extra dimensions, a finely coiled form has been superimposed onto the main helix winding [19] using the same two trigonometric forms as for the main helix. the higher-order constraints presented here could be used as alternative forms to optimize both the fine structure and the main structure.

an important point may be made regarding the “out-of-place” yagi-uda elements #0 and #1. since these exceptions occur for a fully optimized antenna, the analogy with helical antenna optimization suggests the first turn n = 1 should have its own variables r and p to be optimized. this might also be the case if an impedance-matching transformer [21] is included in the optimization.

the radius equations presented here assume a relatively simple variation of the helix profile, although a more complicated tapered-helix-on-tapered-helix design with constant pitch has been published [17]. each tapered helix profile would normally be approximated by a single radius equation from table 3; hence the double tapered-helix profile must be handled by an equation of yet higher order.

as more powerful computing resources become available, the structural constraints may be relaxed further to allow forms such as exponential-plus-parabola, plus-cubic-polynomial, plus-quartic, etc., or a two-exponential-plus-linear form, etc.
with the increased number of variables to be optimized, more exotic forms of structural constraint may also be found to be efficient, such as chebyshev polynomials [22] or catmull-rom nonlinear splines [23]. since refinement of the helical winding of the antenna is now approaching a limit, optimizations must be extended to include other structures simultaneously, namely the reflector [24], multi-filar windings with dielectric cores [25,26], antenna arrays and feed impedance-matching components [9]. references [1] j.d. kraus, “helical beam antennas,” electronics, vol. 20, pp. 109–111, april 1947. [2] j.l. wong and h.e. king, “empirical helix antenna design,” antenna and propagation society international symposium, vol. 20, pp. 366–369, 1982. [3] g. burke and a. poggio, “nec part i: program description – theory”, technical report, lawrence livermore laboratory, january 1981. [4] wipl-d historical benchmarks from 1980 to the present. [5] mathworks factsheet citing the initial release of matlab® in 1984 [6] k. schmidt, java applet source, 1998. [7] k. schmidt, yagi modeller notes. [8] a.r. djordjević, a.g. zajić, m.m. ilić and g.l. stüber, “optimization of helical antennas”, ieee antennas and propagation magazine, vol. 48, no. 6, pp. 107–115, dec. 2006. [9] j.l. dinkić, d.i. olćan, a.r. djordjević and a.g. zajić, “high-gain quad array of nonuniform helical antennas”, int'l j. antennas and propagation, article id 8421809, 12 pp., march 2019. [10] j. dinkić, d. olćan, a. djordjević and a. zajić, “design and optimization of nonuniform helical antennas with linearly varying geometrical parameters”, ieee access, 12 pp., in press. [11] k. schmidt, “simplified mutual impedance of nonplanar skew dipoles”, ieee trans. on antennas and propagation, vol. 44, no. 9, pp. 1298–99, 1996. [12] p.g. collier, optimized 15-element yagi-uda antenna data. [13] g.e. innes, “an experimental and theoretical study of viscous lifting in tribology”, ph.d. 
thesis, appendix c, department of mechanical engineering, university of leeds, leeds, u.k., 1993. [14] r. kalmeijer, “circularly polarized quagi antennas for space communications”. [15] w.i. orr and s.d. cowan, all about cubical quad antennas, 2nd edn. radio publications, inc., box 149, wilton, conn. 06897, 1970. [16] cubical quad antenna data, www.basictables.com/amateur-radio/antenna/cubical-quad-antenna [17] j.l. wong and h.e. king, “broadband quasi-taper helical antennas”, n.t.i.s. report a046067, 35 pp., 30 sept. 1977. [18] s. talari, r. naraiah, and dr. ramakrishna, “tapered spiral helix antenna”, int'l j. emerging tech. adv. eng., vol. 3, no. 3, pp. 318–25, march 2013. [19] t. peng, s. koulouridis and j.l. volakis, “miniaturization of conical helical antenna via optimized coiling”, a.c.e.s. j., vol. 26, no. 6, pp. 452–58, june 2011. http://www.basictables.com/amatuer-radio/antenna/cubical-quad-antenna [20] a.f. peterson, b.s. greene and r. mittra, “propagation and radiation characteristics of the tape helix with a conducting core and dielectric substrate”, ieee trans. on antennas and propagation, vol. 38, no. 4, pp. 578–84, april 1990. [21] s.v. savić, m.m. ilić and a.r. djordjević, “design of internal wire-based impedance matching of helical antennas using an equivalent thin-wire model”, int'l j. antennas and propagation, article id 7365793, 5 pp., dec. 2017. [22] s. sharma, r. rishi and c.c. tripathi, “impedance matching techniques for microstrip patch antenna”, indian journal of science and technology, vol. 10, no. 28, 16 pp., july 2017. [23] k. jimisha and s. kumar, “optimum design of exponentially varying helical antenna with non uniform pitch profile”, international conference on communication computing and security, procedia technology, pp. 792–98, june 2012. [24] a. đorđević, d. olćan, m. ilić and a.
zajić, “design of optimal ground conductor for the helical antenna”, in proceedings of the 50th etran conference, 3 pp., june 2006. [25] b. slade, “the basics of quadrifilar helix antennas”, technical publication, orban microwave inc., 1834 n. alafaya trl., suite b, orlando, fl 32826. [26] m. škiljo and z. blažević, “helical antennas in satellite radio channel”. in: dr. masoumeh karimi, editor. advances in satellite communications.

facta universitatis series: electronics and energetics vol. 27, no. 3, september 2014, pp. 359–373 doi: 10.2298/fuee1403359s

noises in randomly sampled sparse signals

ljubiša stanković
university of montenegro, podgorica, montenegro

abstract. sparse signals can be recovered from a reduced set of randomly positioned samples by using compressive sensing algorithms. the two main reconstruction approaches are the analysis of signals in the sparse transformation domain and gradient-based algorithms. in the transformation domain analysis, which is considered here, the estimation of nonzero signal coefficients is based on the signal transform calculated using the available samples only. the missing samples manifest themselves as noise. this kind of noise is analyzed in the case of random sampling, when the sampling instants do not coincide with the sampling theorem instants. the influence of external noise on the results, with randomly sampled sparse signals, is analyzed as well. the theory is illustrated and checked on statistical examples.

key words: sparse signals, compressive sensing, noise, fourier transform

1. introduction

a signal can be transformed from one domain into another in various ways. some signals that cover the whole considered interval in one domain (dense in that domain) could be located within much smaller regions in another domain. we say that signals are sparse in a transformation domain if the number of nonzero coefficients is much smaller than the total number of signal samples.
for example, a sum of discrete-time complex sinusoidal signals, with a number of components much lower than the number of signal samples in the time domain, is a sparse signal in the discrete fourier transform (dft) domain. sparse signals can be reconstructed from far fewer samples than the sampling theorem requires. compressive sensing is a field dealing with the problem of signal recovery from a reduced set of samples [1]-[21]. this research area has developed intensively over the last decade. it provides solutions that differ from the classical signal theory approach. two main directions in signal recovery are present. one is based on the signal transform analysis (orthogonal matching pursuit methods) and the other is based on gradient methods. samples could be missing due to a desire to represent a signal with the lowest possible number of samples, or due to their physical or measurement unavailability. in applications it could happen that some arbitrarily positioned samples of the signal are so heavily corrupted by disturbances that it is better to omit them and consider them unavailable in the analysis. this is especially true for impulsive noise [4], [23].

received march 23, 2014
corresponding author: ljubiša stanković, university of montenegro, podgorica, montenegro (e-mail: ljubisa@ac.me)

as a study case, in this paper we will consider signals that are sparse in the fourier transform domain. signal sparsity in the discrete fourier domain imposes some restrictions on the signal. one of them is that the frequencies of the signal components are on the frequency grid. otherwise even a one-component complex sinusoidal signal will not be sparse in the dft domain. reducing the number of samples in the analysis manifests itself as noise, whose properties are studied in [13] and used in [24] to define a reconstruction algorithm.
the input noise influence is also an important topic in this analysis, since the reduced number of available samples could increase the sensitivity of the recovery results to this noise [8], [24]. in this paper, sparse signals with samples available at random positions, which do not correspond to the positions defined by the sampling theorem, will be analyzed. it will be shown that the noise due to random sampling exists even in the case of a large number of available samples. in the case of nonuniformly sampled signals, the possibility of recalculating the signal sample values at the sampling theorem positions is exploited [25]. the efficiency of this recalculation in the signal recovery is studied for various numbers of available signal values. an analysis of the additive input noise is done as well. the theoretical results are statistically checked.

2. reconstruction algorithm

2.1. definitions

consider a discrete-time signal $x(n)$ obtained by sampling a continuous-time signal $x(t)$. since the dft will be used in the analysis, we can assume that the continuous-time signal is periodically extended with a period $T$. the period $T$ is related to the number of samples $N$, the sampling interval $\Delta t$, and the maximal frequency $\Omega_m$ as $\Omega_m = \pi/\Delta t = \pi N/T$. the continuous-time signal can be written as an inverse fourier series

$$x(t) = \sum_{k=-(N-1)/2}^{(N-1)/2} X_k e^{j2\pi kt/T}, \tag{1}$$

with the fourier series coefficients being related to the dft as $X_k N = X(k) = \mathrm{DFT}\{x(n)\}$ and $x(n) = x(n\Delta t)$. the frequency indices $k \in \{-(N-1)/2, \ldots, -1, 0, 1, \ldots, (N-1)/2\}$ correspond to the frequencies $2\pi k/T$ in the analog domain. this signal can be reconstructed from its samples taken according to the sampling theorem as

$$x(t) = \sum_{n=0}^{N-1} x(n\Delta t)\,\frac{\sin[(t-n\Delta t)\pi/\Delta t]}{N\sin[(t-n\Delta t)\pi/(N\Delta t)]}. \tag{2}$$

this relation holds for an odd $N$. a slightly corrected relation holds for an even $N$ [5], [25].
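relation (2) can be checked numerically. the following is a minimal sketch, assuming numpy and using the three-component test signal of the later example (12) with $N = 257$ and $\Delta t = 1$; the function name `dirichlet_interpolate` and the helper `signal` are illustrative choices, not part of the original paper.

```python
import numpy as np

def dirichlet_interpolate(x_uniform, t, dt=1.0):
    """Evaluate relation (2) at arbitrary instants t (odd N only)."""
    x_uniform = np.asarray(x_uniform)
    t = np.atleast_1d(np.asarray(t, dtype=float))
    N = x_uniform.size
    if N % 2 == 0:
        raise ValueError("relation (2) is stated for odd N only")
    # u[i, n] = (t_i - n*dt) / dt
    u = t[:, None] / dt - np.arange(N)[None, :]
    num = np.sin(np.pi * u)
    den = N * np.sin(np.pi * u / N)
    # where the denominator vanishes (u a multiple of N) the kernel
    # tends to 1 for odd N, so those entries keep the initial value 1
    kernel = np.ones_like(u)
    regular = np.abs(den) > 1e-12
    kernel[regular] = num[regular] / den[regular]
    return kernel @ x_uniform

# assumed test setup: signal (12) with N = 257, dt = 1
N, dt = 257, 1.0
T = N * dt
amps = np.array([1.0, 0.75, 0.25])
freqs = np.array([58, 117, 21])

def signal(t):
    t = np.atleast_1d(t)
    return (amps * np.exp(2j * np.pi * np.outer(t, freqs) / T)).sum(axis=1)

x_uniform = signal(np.arange(N) * dt)
t_test = np.random.default_rng(0).uniform(0.0, T, size=50)
max_err = np.max(np.abs(dirichlet_interpolate(x_uniform, t_test, dt)
                        - signal(t_test)))
```

for frequency indices within $|k| \le (N-1)/2$ the interpolated values should agree with the analytic signal up to rounding errors.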
the signal $x(t)$ is sparse in the fourier transform domain if the number of nonzero transform coefficients $K$ is much lower than the number of the original signal samples $N$ within $T$, $K \ll N$, i.e., $X_k = 0$ for $k \notin \{k_1, k_2, \ldots, k_K\}$. a signal

$$x(t) = \sum_{k\in\{k_1,k_2,\ldots,k_K\}} X_k e^{j2\pi kt/T} \tag{3}$$

of sparsity $K$ can be reconstructed from $M$ samples, where $M \le N$.

2.2. known frequency positions

in the case of a signal that is sparse in the fourier domain there are $K$ unknown values $X_{k_1}, X_{k_2}, \ldots, X_{k_K}$. if the frequency positions $\{k_1, k_2, \ldots, k_K\}$ are known, then the minimal number of equations needed to find the unknown coefficients (and to calculate (3) for any $t$) is $K$. the equations are written for at least $K$ time instants $t_i$, $i = 1, 2, \ldots, M \ge K$, where the signal should be available,

$$\sum_{m=1}^{K} X_{k_m} e^{j2\pi k_m t_i/T} = x(t_i), \quad \text{for } i = 1, 2, \ldots, M \ge K. \tag{4}$$

in matrix form this system is

$$\mathbf{A}\mathbf{X}_K = \mathbf{y}, \tag{5}$$

where $\mathbf{X}_K$ is the vector of unknown nonzero coefficients and $\mathbf{y}$ is the vector of the available signal samples, defined as

$$\mathbf{X}_K = [X_{k_1}\ X_{k_2}\ \ldots\ X_{k_K}]^T, \tag{6}$$

$$\mathbf{y} = [x(t_1)\ x(t_2)\ \ldots\ x(t_M)]^T,$$

with

$$\mathbf{A} = \begin{bmatrix} e^{j2\pi k_1 t_1/T} & e^{j2\pi k_2 t_1/T} & \cdots & e^{j2\pi k_K t_1/T} \\ e^{j2\pi k_1 t_2/T} & e^{j2\pi k_2 t_2/T} & \cdots & e^{j2\pi k_K t_2/T} \\ \vdots & \vdots & \ddots & \vdots \\ e^{j2\pi k_1 t_M/T} & e^{j2\pi k_2 t_M/T} & \cdots & e^{j2\pi k_K t_M/T} \end{bmatrix}. \tag{7}$$

the coefficient reconstruction condition can be easily formulated as the condition that the system (5) has a unique solution, i.e., that $\det(\mathbf{A}) \ne 0$ for the time instants $t_i$ where the signal is available and for the known frequency indices $k_i$, $i = 1, 2, \ldots, K$, for $M = K$. in general, for $M > K$, the condition is that there are $K$ independent equations, $\mathrm{rank}(\mathbf{A}) = K$.

special case 1: consider a sparse signal with frequencies $k_1, k_2, \ldots, k_K$. assume that the available signal samples are a random subset of the full set of signal samples taken according to the sampling theorem, $t_i = n_i\Delta t$.
the set $\{k_1, k_2, \ldots, k_K\}$ in the dft analysis can be considered as a subset of all frequency coefficients $\{0, 1, 2, \ldots, N-1\}$, having in mind that frequency indices in the second half of the dft correspond to the negative frequencies in the fourier series. then

$$\mathbf{A} = \begin{bmatrix} e^{j2\pi k_1 n_1/N} & e^{j2\pi k_2 n_1/N} & \cdots & e^{j2\pi k_K n_1/N} \\ e^{j2\pi k_1 n_2/N} & e^{j2\pi k_2 n_2/N} & \cdots & e^{j2\pi k_K n_2/N} \\ \vdots & \vdots & \ddots & \vdots \\ e^{j2\pi k_1 n_M/N} & e^{j2\pi k_2 n_M/N} & \cdots & e^{j2\pi k_K n_M/N} \end{bmatrix}. \tag{8}$$

this matrix is related to the idft matrix $\mathbf{W}$ with $N$ samples in such a way that the rows corresponding to the time instants where the signal is not available are removed. the columns corresponding to the signal frequencies $\{k_1, k_2, \ldots, k_K\}$ are kept, while the other idft columns are removed. thus, the matrix $\mathbf{A}$ of order $M \times K$ is obtained from the idft of order $N \times N$ by removing the rows for the time instants of unavailable signal samples and the columns where the signal transform is zero-valued. if the signal samples are chosen randomly, then the reconstruction condition $\mathrm{rank}(\mathbf{A}) = K$ can be satisfied only if at least $M = K$ samples are available. however, it can happen that some instants, for given frequencies, produce dependent observations, so that the full recovery is not possible. the probability that we have a sufficient number of independent equations increases as the number of instants increases. system (4) is used with $K \ll M \le N$. therefore, by assuming that the positions of the nonzero coefficients in the transformation domain are known, a system of $M$ linear equations $\mathbf{A}\mathbf{X}_K = \mathbf{y}$, (5), for the available signal samples $x(t_i)$, $i = 1, 2, \ldots, M$, is solved for the $K$ unknowns $X_k$, $k \in \{k_1, k_2, \ldots, k_K\}$. its solution, in the mean squared sense, follows from $\mathbf{A}^H\mathbf{A}\mathbf{X}_K = \mathbf{A}^H\mathbf{y}$ as

$$\mathbf{X}_K = (\mathbf{A}^H\mathbf{A})^{-1}\mathbf{A}^H\mathbf{y}. \tag{9}$$

if the dft values $X(k)$ are used in the vector $\mathbf{X}_K$ instead of the fourier series coefficients $X_k$ then, using the relation $X_k N = X(k)$, the system of equations (5) reads

$$\frac{1}{N}\mathbf{A}\mathbf{X}_K = \mathbf{y}.$$

special case 2 (oversampled signal): consider now a signal sampled with $\Omega_m = \pi/\Delta t$ but whose maximal frequency is $\pi K/T$. then, according to the sampling theorem, this is an oversampled signal. it is a special case of a sparse signal with ordered nonzero coefficients as

$$\{k_1, k_2, \ldots, k_K\} = \left\{0, 1, \ldots, \frac{K-1}{2},\ N-\frac{K-1}{2}, \ldots, N-2, N-1\right\}.$$

in order to recover this signal it is sufficient to have a reduced set of samples. if the samples are not taken randomly (as it is done in compressive sensing) but at the instants $t_i = i\Delta t N/K$, where $N/K$ is an integer, then the sampling step is $\Delta t N/K$. this corresponds to the signal downsampling with factor $N/K \gg 1$, since $K \ll N$. then, with $M = K$ and $n_i = iN/K$, we get a special form of (8), relating the dft values and the discrete-time signal samples, as

$$\frac{1}{N}\begin{bmatrix} 1 & 1 & \cdots & 1 \\ 1 & e^{j2\pi/K} & \cdots & e^{j2\pi(K-1)/K} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & e^{j2\pi(K-1)/K} & \cdots & e^{j2\pi(K-1)(K-1)/K} \end{bmatrix}\mathbf{X}_K = \mathbf{y}.$$

the matrix here is an idft matrix with $K$ samples. since this is an idft matrix of order $K$, it satisfies the condition $\det(\mathbf{A}) \ne 0$.

in practice, excluding the sampling theorem cases with oversampled signals, the positions of the frequencies in a sparse signal are rarely known. two groups of methods have been developed to solve the problem. one is based on concentration measures [15] that are used to measure the signal sparsity. the problem is solved by minimizing the concentration measures subject to the condition that the signal values are known at some time instants. in this group the gradient-based algorithms are commonly used [7, 21]. the other group, which will be used here, is directly related to the presented theory of the system solution with known frequencies in the sparse signal.
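the solution (9) for known frequency positions can be sketched as follows. this is a minimal illustration, assuming numpy, noise-free samples on the sampling theorem grid with $\Delta t = 1$, and the signal parameters of the later example (12); the choice $M = 16$ and all variable names are assumptions made for the illustration. `numpy.linalg.lstsq` is used here as a numerically safer equivalent of forming $(\mathbf{A}^H\mathbf{A})^{-1}\mathbf{A}^H\mathbf{y}$ explicitly.

```python
import numpy as np

rng = np.random.default_rng(1)

N, T = 257, 257.0                     # grid size and period (dt = T/N = 1)
freqs = np.array([58, 117, 21])       # known frequency indices k_1..k_K
amps = np.array([1.0, 0.75, 0.25])    # true coefficients X_k to be recovered
M = 16                                # assumed number of available samples

# random subset of the sampling theorem instants, t_i = n_i * dt
n_avail = np.sort(rng.choice(N, size=M, replace=False))
t_avail = n_avail * (T / N)

# matrix A of eq. (7): rows are available instants, columns are frequencies
A = np.exp(2j * np.pi * np.outer(t_avail, freqs) / T)

# available samples y, eq. (4), generated without noise
y = A @ amps

# least-squares solution of A X_K = y, equivalent to eq. (9)
X_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
rank_A = np.linalg.matrix_rank(A)
```

when $\mathrm{rank}(\mathbf{A}) = K$, the recovered coefficients in `X_hat` coincide with the true ones up to rounding errors.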
the first step in this class of methods is to estimate the positions of the signal frequencies, and then in the next step to apply the presented simple approach to find the fourier transform coefficients at $\{k_1, k_2, \ldots, k_K\}$. by finding the nonzero fourier transform coefficient values, the signal recovery is achieved.

3. frequency positions estimation

3.1. random subset of uniformly sampled signal

consider a sparse signal whose values are known at some of the possible sampling theorem defined positions $t_{n_i} = n_i\Delta t = n_i T/N$ with $n_i \in \{n_1, n_2, \ldots, n_M\}$. the first step is to estimate the positions of the dft coefficients, using the available samples. it is done as

$$X(k) = \sum_{n\in\{n_1,n_2,\ldots,n_M\}} x(n) e^{-j2\pi kn/N}. \tag{10}$$

with $K \ll M \le N$ the dft, calculated with $M$ samples, is a random variable. for a sparse signal of the form $x(n) = \sum_{p=1}^{K} A_p \exp(j2\pi nk_p/N)$, the mean value of (10) is $E\{X(k)\} = M\sum_{p=1}^{K} A_p\,\delta(k-k_p)$, while its variance is [13]

$$\sigma_N^2(k) = M\,\frac{N-M}{N-1}\sum_{p=1}^{K} A_p^2\,[1-\delta(k-k_p)]. \tag{11}$$

the variance is derived in [13] by using the condition that the sum (10) for $M = N$ is $X(k) = M\sum_{p=1}^{K} A_p\,\delta(k-k_p)$, with $\sigma_N^2(k) = 0$.

example: consider a three-component signal

$$x(t) = A_1\exp(j2\pi k_1 t/N) + A_2\exp(j2\pi k_2 t/N) + A_3\exp(j2\pi k_3 t/N) \tag{12}$$

with $A_1 = 1$, $A_2 = 0.75$, $A_3 = 0.25$, $\{k_1, k_2, k_3\} = \{58, 117, 21\}$, within $0 \le t \le 256$. with $\Delta t = 1$ and $N = 257$ the signal is sparse in the fourier domain. random realizations of the initial dft (10) are given in fig. 1, for several values of the number of available samples $M$. we can see that a low value of $M$ does not make it possible to estimate the signal component positions. all three components are visible for larger values of $M$. when the signal frequencies are detected, the signal is recovered using (9) with known time instants $t_i \in \{t_1, t_2, \ldots, t_M\}$ (or, in the discrete-time domain, $n_i \in \{n_1, n_2, \ldots, n_M\}$) and detected frequencies $\{k_1, k_2, \ldots, k_K\}$.
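the mean value and the variance (11) of the initial estimate (10) can be checked by a small monte carlo experiment. the sketch below assumes numpy and the signal (12); the number of available samples $M = 64$, the number of realizations and the random seed are arbitrary choices made only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

N, M = 257, 64                        # grid size and assumed number of samples
amps = np.array([1.0, 0.75, 0.25])
freqs = np.array([58, 117, 21])
n_all = np.arange(N)

# signal (12) on the full sampling theorem grid (dt = 1)
x = (amps * np.exp(2j * np.pi * np.outer(n_all, freqs) / N)).sum(axis=1)

# dft kernel W[k, n] = exp(-j 2 pi k n / N), computed once
W = np.exp(-2j * np.pi * np.outer(n_all, n_all) / N)

trials = 1000
X = np.empty((trials, N), dtype=complex)
for r in range(trials):
    idx = rng.choice(N, size=M, replace=False)   # random subset of the grid
    X[r] = W[:, idx] @ x[idx]                    # initial estimate, eq. (10)

mean_est = X.mean(axis=0)
var_est = X.var(axis=0)   # for complex data: mean of |X - mean|^2

# eq. (11) at frequencies where no signal component is present
var_theory = M * (N - M) / (N - 1) * np.sum(amps**2)
noise_bins = np.setdiff1d(n_all, freqs)
var_stat = var_est[noise_bins].mean()
```

the statistical mean at the signal frequencies should be close to $M A_p$, and `var_stat` should be close to `var_theory`, up to the fluctuations expected for a finite number of realizations.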
obviously, from a noisy observation of the dft we can distinguish two cases:
1) when the number of available samples is large and all components are above a threshold that can be calculated based on (11), all signal frequencies will be distinguishable as peaks in the dft.
2) if the number of available samples is low, or there are components with much lower amplitudes, then an iterative procedure should be used. the largest component is detected and estimated first. it is subtracted from the signal. the next one is detected and the signal is estimated using the frequency from this and the previous step(s). the estimated two components are subtracted from the original signal. the frequency of the next component is detected, and the process of estimation and subtraction is continued until the energy of the remaining signal is negligible.
both of these reconstruction cases are studied and described in [24].

3.2. random subset of randomly sampled signal

now consider the case when randomly positioned samples of a continuous-time signal within $0 \le t \le T$ are available. the positions of the observations $t_i$ are not related to the sampling theorem positions in any way. an estimate of the initial dft can be calculated using the available signal values, as

$$X(k) = \sum_{t_i\in\{t_1,t_2,\ldots,t_M\}} x(t_i) e^{-j2\pi kt_i/T}. \tag{13}$$

fig. 1 dft of a signal with various numbers of available samples $M$. the available $M$ samples are a random subset of $N$ samples taken according to the sampling theorem interval. red dots represent the original signal dft values, scaled with $M/N$ to match the mean value of the dft calculated using a reduced set of signal samples. the dft values are presented as a function of the frequency index.

we will again assume that the signal is sparse with unknown number and positions of the frequencies $\{k_1, k_2, \ldots, k_K\}$, $K \ll M \le N$.
for a frequency $k = k_p$ and the signal component $A_p\exp(j2\pi k_p t/T)$, all values in (13) will be

$$A_p e^{j2\pi k_p t_i/T} e^{-j2\pi kt_i/T}\Big|_{k=k_p} = A_p.$$

therefore, the mean value of estimator (13) is

$$E\{X(k)\} = M\sum_{p=1}^{K} A_p\,\delta(k-k_p).$$

the variance of this estimator is different from the case when the available signal samples are at the sampling interval positions [13]. the condition that the value of the dft coefficient is zero (with zero variance) if all $N$ samples are used does not hold any more. the total variance is

$$\sigma_N^2(k) = M\sum_{p=1}^{K} A_p^2\,[1-\delta(k-k_p)]. \tag{14}$$

for small $M$ we have $(N-M)/(N-1) \approx 1$. then expressions (11) and (14) give similar results. some of the random realizations of the initial dft (13) are given in fig. 2. in contrast to the previous case, the variance of the estimator (13) does not tend to zero as $M$ approaches $N$. however, we can see that the signal frequencies can be detected and used to recover the signal using (5) and (7) with known time instants $t_i \in \{t_1, t_2, \ldots, t_M\}$ and detected frequencies $\{k_1, k_2, \ldots, k_K\}$.

fig. 2 dft of a signal with various numbers of available samples $M$. the available $M$ samples are taken at random positions within $0 \le t_i \le T$. red dots represent the original signal dft values, scaled with $M/N$ to match the mean value of the dft calculated using a reduced set of signal samples.

3.3. random subset of nonuniformly sampled signals

consider now a random set of possible sampling instants $\{t_1, t_2, \ldots, t_N\}$, $t_i = i\Delta t + \alpha\nu_i$, where $\nu_i$ is a uniform random variable, $-\Delta t/2 \le \nu_i \le \Delta t/2$. with $\alpha = 1$ any position of the signal sample within $i\Delta t - \Delta t/2 \le t_i < i\Delta t + \Delta t/2$ is equally probable. with $\alpha \le 1$ the sampling positions are random, but within one sampling interval only one signal sample can occur. this case will be referred to as nonuniform sampling. assume that only $M < N$ signal samples are available at $t_i \in \{t_1, t_2, \ldots, t_M\}$.
in the nonuniform sampling case the initial dft estimate can be calculated using (13). this transform may be used to estimate the frequency positions. note that, as in the random sampling case, even if we use $M = N$ the resulting signal will not be sparse. this fact will degrade the recovery performance. the problem with nonuniform sampling can be reformulated to produce a uniformly sampled signal. if the signal values at $t_i$ are known, then (2) can be used to recover the signal samples at the sampling theorem adjusted instants. this relation reads

$$x(t_i) = \sum_{n=0}^{N-1} x(n\Delta t)\,\frac{\sin[(t_i-n\Delta t)\pi/\Delta t]}{N\sin[(t_i-n\Delta t)\pi/(N\Delta t)]}.$$

the transformation matrix relating the samples taken at $t_i$ with the signal values at the sampling theorem positions is

$$\begin{bmatrix} x(t_1) \\ x(t_2) \\ \vdots \\ x(t_N) \end{bmatrix} = \begin{bmatrix} b_{11} & b_{12} & \cdots & b_{1N} \\ b_{21} & b_{22} & \cdots & b_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ b_{N1} & b_{N2} & \cdots & b_{NN} \end{bmatrix}\begin{bmatrix} x(0) \\ x(\Delta t) \\ \vdots \\ x((N-1)\Delta t) \end{bmatrix}, \qquad \hat{\mathbf{x}} = \mathbf{B}\mathbf{x},$$

with

$$b_{ij} = \frac{\sin[(t_i-(j-1)\Delta t)\pi/\Delta t]}{N\sin[(t_i-(j-1)\Delta t)\pi/(N\Delta t)]}.$$

an additional problem here is that we know just $M \le N$ of the signal samples. the values at the unavailable positions $t_i \notin \{t_1, t_2, \ldots, t_M\}$ are assumed to be zero. their positions are assumed to be at the sampling theorem instants, $t_i = i\Delta t$ for $t_i \notin \{t_1, t_2, \ldots, t_M\}$, since they are not known anyway. the uniform (sampling interval) signal values are then

$$\begin{bmatrix} x(0) \\ x(\Delta t) \\ \vdots \\ x((N-1)\Delta t) \end{bmatrix} = \begin{bmatrix} b_{11} & b_{12} & \cdots & b_{1N} \\ b_{21} & b_{22} & \cdots & b_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ b_{N1} & b_{N2} & \cdots & b_{NN} \end{bmatrix}^{-1}\begin{bmatrix} x(t_1) \\ x(t_2) \\ \vdots \\ x(t_N) \end{bmatrix}. \tag{15}$$

the matrix $\mathbf{B}$ is inverted only once for given signal sample positions. note that there is a direct relation to calculate the values $x(n\Delta t)$ based on the randomly sampled values $x(t_i)$ as [25]

$$x(n\Delta t) = \sum_{p=0}^{N-1} x(t_p)\prod_{\substack{q=0 \\ q\ne p}}^{N-1}\frac{\sin[(n\Delta t-t_q)\pi/T]}{\sin[(t_p-t_q)\pi/T]}.$$

here the inversion is not needed.
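the recalculation to the sampling theorem grid through the matrix $\mathbf{B}$ can be sketched as follows. this is a minimal illustration under stated assumptions: numpy is used, all $N$ nonuniform samples are available (no zero-valued unavailable samples), the indices run from 0 to $N-1$ instead of 1 to $N$, and the values $N = 33$, a jitter scale of 0.4 and the three frequency indices are arbitrary choices made so that $\mathbf{B}$ stays well conditioned.

```python
import numpy as np

def interpolation_matrix(t, N, dt=1.0):
    """Matrix B with entries given by the kernel of relation (2), odd N."""
    u = t[:, None] / dt - np.arange(N)[None, :]
    num = np.sin(np.pi * u)
    den = N * np.sin(np.pi * u / N)
    B = np.ones_like(u)               # limit value where den vanishes
    regular = np.abs(den) > 1e-12
    B[regular] = num[regular] / den[regular]
    return B

rng = np.random.default_rng(2)
N, dt = 33, 1.0                       # assumed small odd N for the sketch
T = N * dt
jitter_scale = 0.4                    # assumed scale of the random shift

# nonuniform instants: grid position plus a scaled uniform random shift
nu = rng.uniform(-dt / 2, dt / 2, size=N)
t_nonuniform = np.arange(N) * dt + jitter_scale * nu

amps = np.array([1.0, 0.75, 0.25])
freqs = np.array([5, 12, 2])          # assumed indices with |k| <= (N-1)/2

def signal(t):
    return (amps * np.exp(2j * np.pi * np.outer(t, freqs) / T)).sum(axis=1)

x_hat = signal(t_nonuniform)          # samples at the nonuniform instants
x_true = signal(np.arange(N) * dt)    # samples on the sampling theorem grid

B = interpolation_matrix(t_nonuniform, N, dt)
forward_err = np.max(np.abs(B @ x_true - x_hat))   # checks x_hat = B x
x_rec = np.linalg.solve(B, x_hat)                   # recalculation as in (15)
inverse_err = np.max(np.abs(x_rec - x_true))
```

with all $N$ samples available both errors should be at the level of rounding errors; with $M < N$ samples and zero values at the unavailable positions, the recalculated signal is only an approximation.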
however, in our calculations this direct relation was not a computationally more efficient approach. the results for several random realizations of nonuniform signal sampling, with the signal values recalculated at the sampling theorem positions, are shown in fig. 3. as the number of available samples approaches the total number of samples $N$, the reconstructed dft becomes noise-free, fig. 3.

fig. 3 dft of a signal with various numbers of available samples $M$. the available $M$ samples are a random subset of $N$ nonuniform samples taken at random positions within the sampling theorem interval. red dots represent the original signal dft values, scaled with $M/N$ to match the mean value of the dft calculated using a reduced set of signal samples.

for all previous reconstruction cases and the signal defined by (12), the variance is calculated in 100 random realizations of the sets of available samples. the results for the variance are presented in fig. 4. the ratio of the signal and noise energies is calculated as well and presented in fig. 5. the agreement between the theory and the statistical results is high. from fig. 4 we can conclude that the recalculation is not efficient for a small number of available samples, when $M \ll N$. in that case even worse results are obtained than without recalculation, as could be expected. for a large number of available samples (in fig. 4 for $M > 5N/8$) the recalculation produces better results, approaching the sparse signal without any deviation for $N = M$.

fig. 4 variance of the dft for all previous methods of sampling and various numbers of available samples $M$. (1)-line with marks "x": available samples are a subset of all samples taken at the sampling theorem grid (solid line-theory, marks "x"-statistics). (2)-line with marks "o": randomly positioned $M$ samples taken within $0 \le t_i \le T$ (solid line-theory, marks "o"-statistics). (3)-marks "+": nonuniform randomly shifted samples from the sampling theorem grid.
(4)-marks "*": nonuniform randomly shifted available samples being recalculated on the sampling theorem grid.

fig. 5 ratio of the signal and dft noise energies for all previous methods of sampling and various numbers of available samples $M$. (1)-line with marks "x": available samples are a subset of all samples taken at the sampling theorem grid (solid line-theory, marks "x"-statistics). (2)-line with marks "o": randomly positioned $M$ samples taken within $0 \le t_i \le T$ (solid line-theory, marks "o"-statistics). (3)-marks "+": nonuniform randomly shifted samples from the sampling theorem grid. (4)-marks "*": nonuniform randomly shifted available samples being recalculated on the sampling theorem grid.

4. additive noise influence

next we will analyze the case when input noise exists in the sparse signal to be reconstructed. it has been shown that this kind of noise increases the variance caused by missing samples. a formula for how much the number of available samples should be increased in order to compensate for the influence of the input noise is derived as well [24]. it is important to note that once the reconstruction conditions are met and the reconstruction is achieved, the noise due to missing samples does not influence the results in a direct way. it influences the possibility of recovering the signal at all. the accuracy of the recovery results is related to the input noise. the input noise is transformed by the recovery algorithm into a new noise, depending on the signal sparsity and the number of available samples. a simple analysis of this form of noise will be presented next. assume an additive noise $\varepsilon(t)$ in the input signal. the reconstruction equations (4) are

$$x(t_i) + \varepsilon(t_i) = \sum_{k\in\{k_1,k_2,\ldots,k_K\}} X_k e^{j2\pi kt_i/T}, \quad \text{for } i = 1, 2, \ldots, M,$$

at the available instants $t_i$, $i = 1, 2, \ldots, M$, for detected frequencies $k \in \{k_1, k_2, \ldots, k_K\}$.
in matrix form, this system of $M$ linear equations with $K$ unknowns reads

$$\mathbf{y} + \boldsymbol{\varepsilon} = \mathbf{A}\mathbf{X}_K.$$

the solution follows from

$$\mathbf{A}^H(\mathbf{y} + \boldsymbol{\varepsilon}) = \mathbf{A}^H\mathbf{A}\mathbf{X}_K,$$

$$\mathbf{X}_K = (\mathbf{A}^H\mathbf{A})^{-1}\mathbf{A}^H(\mathbf{y} + \boldsymbol{\varepsilon}),$$

$$\mathbf{X}_K = \mathbf{X}_{KS} + \mathbf{X}_{KN}, \tag{16}$$

where $\mathbf{X}_{KS} = (\mathbf{A}^H\mathbf{A})^{-1}\mathbf{A}^H\mathbf{y}$ are the true signal coefficient values and $\mathbf{X}_{KN} = (\mathbf{A}^H\mathbf{A})^{-1}\mathbf{A}^H\boldsymbol{\varepsilon}$ is the noise influence on the reconstructed signal coefficients. the input signal-to-noise ratio (snr), if all signal samples were available, is

$$SNR_i = 10\log\frac{\sum_{n=0}^{N-1}|x(t_n)|^2}{\sum_{n=0}^{N-1}|\varepsilon(t_n)|^2} = 10\log\frac{E_x}{E_\varepsilon}.$$

assume that the noise energy in the $M$ samples used in the reconstruction is

$$E_{\varepsilon A} = \sum_{t_i\in\{t_1,t_2,\ldots,t_M\}}|\varepsilon(t_i)|^2. \tag{17}$$

using (10) in the calculation is the same as assuming that the values of the unavailable samples are zero. this kind of calculation corresponds to the result that would be achieved for the signal transform if the norm-two, i.e., $\min\sum_{k=0}^{N-1}|X(k)|^2$, is used in minimization [21, 3, 6, 23]. the correct amplitude in the signal transform at the frequency $k_p$, in the case when all signal samples are used, would be $NA_p$. to compensate the resulting transform for the known bias in amplitude when only $M$ available samples are used, we should multiply the coefficient by $N/M$. this means that, in a full recovery, a signal transform coefficient should correspond to the coefficient of the original signal with all signal samples being used. the noise in the transform coefficients will also be multiplied by the same factor. therefore, its energy would be increased to $E_{\varepsilon A}N^2/M^2$. the signal-to-noise ratio in the recovered signal would be

$$SNR = 10\log\frac{\sum_{n=0}^{N-1}|x(t_n)|^2}{\frac{N^2}{M^2}\sum_{t_i\in\{t_1,t_2,\ldots,t_M\}}|\varepsilon(t_i)|^2}. \tag{18}$$

if the distribution of noise in the samples used for reconstruction is the same as in the other signal samples, then $\sum_{t_i\in\{t_1,t_2,\ldots,t_M\}}|\varepsilon(t_i)|^2 = M\sigma_\varepsilon^2$ and

$$SNR = 10\log\frac{\sum_{n=0}^{N-1}|x(t_n)|^2}{\frac{N^2}{M^2}M\sigma_\varepsilon^2} \tag{19}$$

$$= SNR_i - 10\log\left(\frac{N}{M}\right). \tag{20}$$

therefore, a signal reconstruction that would be based on the initial estimate (10) would worsen the snr, since $N > M$. an improvement can be expected only if we were able to remove the noisy samples in a selective manner, so that the samples used in the reconstruction are less noisy than the other samples [23]. if such a criterion is used to selectively remove the noisy samples, then the reconstruction is improved if

$$\frac{N^2}{M^2}\sum_{t_i\in\{t_1,t_2,\ldots,t_M\}}|\varepsilon(t_i)|^2 < \sum_{n=0}^{N-1}|\varepsilon(t_n)|^2.$$

since only $K$ out of $N$ dft coefficients are used in the reconstruction, the energy of the reconstruction error is reduced by the factor of $K/N$ as well. therefore, the energy of noise in the reconstructed signal is

$$E_{\varepsilon R} = \frac{K}{N}\frac{N^2}{M^2}\sum_{t_i\in\{t_1,t_2,\ldots,t_M\}}|\varepsilon(t_i)|^2.$$

the final signal-to-noise ratio in the reconstructed signal is

$$SNR = 10\log\frac{\sum_{n=0}^{N-1}|x(t_n)|^2}{\frac{KN}{M^2}\sum_{t_i\in\{t_1,t_2,\ldots,t_M\}}|\varepsilon(t_i)|^2}. \tag{21}$$

if a criterion for selecting the noisy samples is used, then the variance in the remaining samples is lower than the average variance of all samples, i.e.,

$$\frac{\frac{1}{M}\sum_{t_i\in\{t_1,t_2,\ldots,t_M\}}|\varepsilon(t_i)|^2}{\frac{1}{N}\sum_{n=0}^{N-1}|\varepsilon(t_n)|^2} = C_q \le 1, \tag{22}$$

where $0 \le C_q \le 1$ is the criterion selection efficiency:
- if there is no criterion, $C_q = 1$.
- in an ideal case, when the noise does not exist in the remaining samples (removed by a criterion or l-statistics as in [3]), $C_q = 0$.
with this factor the snr is

$$SNR = SNR_i - 10\log\left(C_q\frac{K}{M}\right). \tag{23}$$

this simple theoretical result is tested on signal (12) with additive noise of variance $\sigma_\varepsilon^2 = 1$. for a random set of $M$ available samples the initial dft is calculated using (10). since a large number of available samples $M$ is used in these simulations, the signal components $\{k_1, k_2, k_3\}$ are easily detected in one step. the signal is reconstructed by (9) for the set of available signal samples y = [x(t1) x(t2) ...
x(tM)]^T and the detected frequencies {k1, k2, k3}. for a statistical check of the results, 100 random realizations of the available sample positions are used. the results are summarized in table 1 for different numbers of available samples $M$, with $C_q = 1$. the agreement of the theory with the statistics is very high. for smaller values of $M$ the iterative procedure (described in the last paragraph of subsection 3.1) should be used, since not all components can be detected in a single realization of (10). similar results would be obtained as long as the number of available samples is sufficient for signal recovery.

table 1 signal-to-noise ratio: in the input signal (snri), obtained by theory (snrt) and by statistics (snrs), for various numbers of available samples $M$ with $N = 257$.

snr in [db] | M = 128 | M = 160 | M = 192 | M = 224
snri | 2.6383 | 2.6215 | 2.5663 | 2.5811
snrt | 18.8837 | 19.8519 | 20.6446 | 21.3140
snrs | 18.8709 | 19.8528 | 20.6415 | 21.3887

in order to test the change of $K$, the theory is illustrated on a four-component signal

$$x(t) = A_1\exp(j2\pi k_1 t/N) + A_2\exp(j2\pi k_2 t/N) + A_3\exp(j2\pi k_3 t/N) + A_4\exp(j2\pi k_4 t/N) + \varepsilon(t)$$

as well. the amplitudes in this case were $A_1 = 1$, $A_2 = 0.75$, $A_3 = 0.5$, $A_4 = 0.67$, and the frequency indices $\{k_1, k_2, k_3, k_4\} = \{58, 117, 21, 45\}$. the results are presented in table 2.

table 2 signal-to-noise ratio: in the input signal (snri), obtained by theory (snrt) and by statistics (snrs), for various numbers of available samples $M$ with $N = 257$.

snr in [db] | M = 128 | M = 160 | M = 192 | M = 224
snri | 3.5360 | 3.5326 | 3.5788 | 3.5385
snrt | 18.5953 | 19.5644 | 20.3562 | 21.0257
snrs | 18.7203 | 19.5139 | 20.2869 | 21.7302

the agreement of the numerical statistical results with this simple theory in the analysis of the noise influence on the reconstruction of sparse signals is high.

5. conclusion

an analysis of randomly sampled sparse signals is performed. it has been shown that random sampling increases the noise in the reconstruction caused by unavailable samples.
for a relatively large number of available samples, the recalculation of nonuniformly sampled signals to the sampling theorem grid can improve the results. the input noise can degrade the reconstruction limit. however, as long as the reconstruction is possible, the noise caused by missing samples influences the accuracy of the results in a simple and direct way, through the number of missing samples and the signal sparsity. the accuracy of the final result is related to the input noise intensity, the number of available samples and the signal sparsity. the theory is checked and illustrated on numerical examples.

references
[1] d. l. donoho, “compressed sensing,” ieee transactions on information theory, vol. 52, no. 4, pp. 1289–1306, 2006.
[2] e. j. candès, j. romberg, and t. tao, “robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information,” ieee transactions on information theory, vol. 52, no. 2, pp. 489–509, 2006.
[3] l. stanković, i. orović, s. stanković, and m. g. amin, “robust time-frequency analysis based on the l-estimation and compressive sensing,” ieee signal processing letters, may 2013, pp. 499–502.
[4] r. e. carrillo, k. e. barner, and t. c. aysal, “robust sampling and reconstruction methods for sparse signals in the presence of impulsive noise,” ieee journal of selected topics in signal processing, 2010, 4(2), pp. 392–408.
[5] l. stanković, m. daković, and t. thayaparan, time–frequency signal analysis with application, artech house, 2013.
[6] l. stanković, s. stanković, i. orović, and m. g. amin, “compressive sensing based separation of nonstationary and stationary signals overlapping in time-frequency,” ieee transactions on signal processing, vol. 61, no. 18, pp. 4562–4572, sept. 2013.
[7] m. a. figueiredo, r. d. nowak, and s. j. wright, “gradient projection for sparse reconstruction: application to compressed sensing and other inverse problems,” ieee journal of selected topics in signal processing, vol. 1, no. 4, pp. 586–597, 2007.
[8] d. donoho, m. elad, and v. temlyakov, “stable recovery of sparse overcomplete representations in the presence of noise,” ieee transactions on information theory, vol. 52, pp. 6–18, 2006.
[9] b. turlach, “on algorithms for solving least squares problems under an l1 penalty or an l1 constraint,” proc. of the american statistical association; statistical computing section, pp. 2572–2577, alexandria, va, 2005.
[10] r. baraniuk, “compressive sensing,” ieee signal processing magazine, vol. 24, no. 4, 2007, pp. 118–121.
[11] p. flandrin and p. borgnat, “time-frequency energy distributions meet compressed sensing,” ieee transactions on signal processing, vol. 58, no. 6, 2010, pp. 2974–2982.
[12] y. d. zhang and m. g. amin, “compressive sensing in nonstationary array processing using bilinear transforms,” in proc. ieee sensor array and multichannel signal processing workshop, hoboken, nj, june 2012.
[13] l. stanković, s. stanković, and m. g. amin, “missing samples analysis in signals for applications to l-estimation and compressive sensing,” signal processing, elsevier, volume 94, jan. 2014, pages 401–408.
[14] s. aviyente, “compressed sensing framework for eeg compression,” in proc. stat. sig. processing, 2007, aug. 2007.
[15] l. stanković, “a measure of some time–frequency distributions concentration,” signal processing, vol. 81, pp. 621–631, 2001.
[16] n.b. karahanoglu and h. erdogan, “compressed sensing signal recovery via forward–backward pursuit,” digital signal processing, vol. 23, issue 5, sept. 2013, pages 1539–1548.
[17] s. stanković, i. orović, and e. sejdić, multimedia signals and systems, springer, 2012.
[18] e. sejdić, a. cam, l. f. chaparro, c. m. steele, and t. chau, “compressive sampling of swallowing accelerometry signals using tf dictionaries based on modulated discrete prolate spheroidal sequences,” eurasip journal on advances in signal processing, 2012:101 doi:10.1186/1687–6180–2012–101.
[19] s. g. mallat and z. zhang, “matching pursuits with time-frequency dictionaries,” ieee transactions on signal processing, vol. 41, no. 12, pp. 3397–3415, 1993.
[20] i. daubechies, m. defrise, and c. de mol, “an iterative thresholding algorithm for linear inverse problems with a sparsity constraint,” communications on pure and applied mathematics, vol. 57, no. 11, pp. 1413–1457, 2004.
[21] l. stanković, m. daković, and s. vujović, “adaptive variable step algorithm for missing samples recovery in sparse signals,” iet signal processing, in print, 2014 (first version available on arxiv.org/abs/1309.5749).
[22] l. stanković, m. daković, and s. vujović, “concentration measures with an adaptive algorithm for processing sparse signals,” in proceedings of ispa 2013, sept. 4–6, 2013, trieste, italy, pp. 418–423.
[23] l. stanković, m. daković, and s. vujović, “reconstruction of sparse signals in impulsive noise,” ieee transactions on signal processing, submitted (first version available on arxiv.org).
[24] s. stanković, i. orović, and l. stanković, “an automated signal reconstruction method based on analysis of compressive sensed signals in noisy environment,” signal processing, elsevier, volume 94, in print.
[25] e. margolis and y.c. eldar, “nonuniform sampling of periodic bandlimited signals,” ieee transactions on signal processing, vol. 56, no. 7, pp. 2728–2745, july 2008.

facta universitatis series: electronics and energetics vol. 31, no. 1, march 2018, pp.
1-9 https://doi.org/10.2298/fuee1801001e

effect of the distribution of states in amorphous in-ga-zn-o layers on the conduction mechanism of thin film transistors on its base

magali estrada 1, yoanlys hernandez-barrios 1, oana moldovan 2, antonio cerdeira 1, francois lime 2, marcelo pavanello 3, benjamin iñiguez 2
1 sees, depto. de ingeniería eléctrica, cinvestav-ipn, méxico city, méxico
2 departament d'enginyeria electrònica, elèctrica i automàtica (deeea), universitat rovira i virgili, tarragona, spain
3 department of electrical engineering, centro universitário da fei, são paolo, brasil

abstract. amorphous in-ga-zn-o thin film transistors (a-igzo tfts) have proven to be an excellent approach for flat panel display drivers using organic light emitting diodes, due to their high mobility and stability compared to other types of tfts. these characteristics are related to the specifics of the metal-oxygen-metal bonds, which give rise to spatially distributed s orbitals that can overlap with each other. the magnitude of the overlap between s orbitals seems to be largely insensitive to the presence of distorted bonds, allowing high values of mobility, even in devices fabricated at room temperature. in this paper, we show the effect of the distribution of states in the a-igzo layer on the main conduction mechanism of the a-igzo tfts, analyzing the behavior of the drain current with temperature.

key words: amorphous oxide semiconductor, thin-film transistor, behavior with temperature, distribution of states

1. introduction

in 2004, nomura et al. [1] presented the first amorphous oxide semiconductor (aos) tfts using a semiconductor material that was novel at that time, amorphous in-ga-zn-o (a-igzo), which was deposited by pulsed laser deposition, using a krf excimer laser and a polycrystalline in-ga-zno target. the chemical composition of the target was in:ga:zn 1.1:1.1:0.9 in atomic ratio.
the authors explained that conduction in amorphous oxide semiconductors (aoss) containing post-transition-metal cations is completely different from that of covalent semiconductors such as a-si:h. in a-si:h tfts, the presence of randomly distributed sp3 bonds gives rise to a high density of both deep and tail localized states, and the carrier transport is governed by hopping between localized tail states. in aoss, the conduction band has high ionicity due to spatially distributed s orbitals, which can overlap with each other. although distorted metal-oxygen-metal bonds exist in the amorphous material, the magnitude of the overlap between s orbitals seems to be insensitive to the presence of the distorted bonds, allowing high values of mobility, even in devices fabricated at room temperature. the transistor presented in [1] used a 140-nm-thick y2o3 layer as gate dielectric and indium tin oxide (ito) as source (s), drain (d) and gate (g) contacts. all materials used in the tft were transparent. to explain the conduction mechanism, the authors had previously analyzed the behavior of single-crystalline ingao3(zno)5 [2]. for these devices, the carrier transport was associated with percolation conduction over potential barriers around the conduction band edge. these potential barriers are supposed to be due to randomly distributed ga^3+ and zn^2+ ions in the crystal structure. since the amorphous igzo (a-igzo) tfts also showed high mobility values, with a behavior similar to the crystalline ones, the authors considered that the percolation conduction mechanism also takes place in a-igzo tfts [3].

received july 7, 2017
corresponding author: yoanlys hernandez-barrios, sees, depto. de ingeniería eléctrica, cinvestav-ipn, av. ipn 2208, cp 07360, méxico city, méxico (e-mail: yhb961210@gmail.com)
due to the relatively high electron mobility, high optical transparency, low temperature and relatively low cost processing techniques, these devices have found an important application in active matrix organic light emitting diodes, (amoleds) displays [4-7]. from this moment on, a-igzo tfts have been object of continuous and intensive research from all points of view, including technological and physical aspects, looking to improve stability, increase mobility and reduce operating voltage range, among others. the characteristics of the a-igzo band structure can be found in [6,7], with a distribution of bulk localized states dos in the gap [7,8]. it is generally accepted that dos observed in a-igzo layers are characterized by a relatively low density of localized states, less than 1x10 20 cm -3 ev -1 , and their characteristics strongly depend on the process used for the igzo layer deposition and in general on the device fabrication [9-13]. regarding the conduction mechanism, in [14], authors proposed a conduction mechanism that contains both possibilities, band percolation [15] and the mobility edge or multiple trapping and release mechanism [16,17], which they call extended mobility edge model. depending on the specific characteristics of the device associated to the fabrication process, or depending on the operation regime for the same device, the predominant mechanism can be either percolation in conduction band or multiple trapping. the characterization with temperature of the electrical characteristics of tfts allows to analyze, not only the behavior of the devices in the temperature operation range required for the specific application, but also the conduction mechanisms. in most amorphous tfts, the drain current has been reported to increase with temperature, which is characteristic of the hopping conduction mechanism. 
in this paper, we show that, under certain operation conditions, a reduction of the drain current with temperature is observed, which is related to the characteristics of the dos present in the a-igzo layer of the device.

2. experimental part

for the analysis of the electrical characteristics, we will use two bottom gate top contact igzo tfts shown in figures 1a and b. device 1 consists of 90 nm of hfo2 as gate dielectric, deposited at 100 °C by atomic layer deposition, on top of which a 70 nm thick igzo layer was deposited by pulsed laser deposition (pld) at 20 mtorr oxygen pressure. a 500 nm thick layer of poly-p-xylylene-c (parylene-c), deposited by chemical vapor deposition (cvd) at room temperature at 1 mtorr, was used as etch stopper layer (esl). au/cr deposited by electron beam was used for the gate contact and al was used for the drain and source contacts. annealing was done after the deposition of the a-igzo layer and at the end of the fabrication process. device 2 consisted of 200 nm of si3o4 as gate insulator, deposited by plasma enhanced chemical vapor deposition (pecvd) at 250 °C. the a-igzo layer was 12 nm thick and was deposited by rf sputtering at room temperature. as etch stopper layer, 100 nm of sio2, deposited by pecvd, was used. mo/cr was used for the g, d and s contacts. a final annealing was done at the end of the fabrication process. photolithography was used to pattern each layer. figures 1a and b show the cross section of device 1 and 2, respectively. the analyzed tfts corresponding to device 1 had channel width (w) and length (l) of w=80 µm and l=40 µm, and those of device 2 had w=900 µm and l=30 µm.

fig. 1 cross section of the igzo tfts referred to as: a) device 1; b) device 2.
electrical measurements were done at different temperatures and in vacuum conditions, using a k20 programmable temperature controller and measurement chamber from mmr technologies inc. and a keithley 4200 semiconductor characterization system. the output characteristics were measured every ten degrees, in the temperature range between 300 k and 350 k. measurements were done after waiting 5 minutes at each fixed temperature, with no applied bias. the time with applied voltage, during measurements at each temperature, was less than 5 min. before the i-v-t measurements, the devices were tested for hysteresis and bias stress instability at room temperature, to guarantee that the variation of the drain current was due to the temperature variation and not to instability effects.

3. analysis and discussion

figure 2a shows the measured output characteristics at 300, 320 and 330 k for device 1. fig. 2b shows the output characteristics at 300, 330 and 350 k for device 2. as can be seen in fig. 2a, the drain current (ids) increases significantly with temperature. this is the typical behavior with temperature of the output current in a-igzo shown in [12,14,17,18,19]. on the contrary, in the output characteristics shown in fig. 2b, ids reduces with the increase of t, as vds increases. as already mentioned, due to the specific characteristics of the chemical bonds in metal oxide materials, the conduction mechanism in a-igzo tfts can be due not only to hopping, typical of amorphous tfts, but also to percolation in the conduction band [8,9,14]. the temperature dependence of the drain current and mobility will be determined by the predominant conduction mechanism, which can depend not only on the fabrication process, but also on the operation conditions for a given fabrication process.
it is expected that the contribution of variable range hopping (vrh) becomes greater than that of the band-like mechanism when the fermi level ef lies within the exponential tail states. according to [14], this can occur for a characteristic energy of around kta=0.069 ev and a density of acceptor tail states at ec (nta) in the order of 3.4x10^19 cm^-3 ev^-1. to estimate the dos in the amorphous semiconductor material of our devices, we used the same procedure as in [16,17], obtaining a characteristic energy of kta=0.072 ev and nta=8.5x10^19 cm^-3 ev^-1 for device 1, and kta=40 mev and nta<6x10^18 cm^-3 ev^-1 for device 2. in order to study in more detail the effect of the dos characteristics on the conduction mechanism, we used simulations in atlas. for this purpose, we simulated the output characteristics of a-igzo tfts with a bottom gate structure. different dos characteristics with an acceptor-type dos, having nta values in the range from 1.5x10^20 cm^-3 ev^-1 to 3.5x10^17 cm^-3 ev^-1, were simulated. the characteristic energy was varied from 0.03 ev to 0.18 ev. the default low field mobility model was used, taking the default value of the temperature dependent parameter for this mobility in atlas. the reduction of mobility with temperature is the typical behavior expected for a crystalline device, due to phonon scattering. in this case, it can be associated with a crystalline-like behavior of the amorphous oxide semiconductor material [3]. the dependence of mobility on temperature is included in order to distinguish between the effect of the dos characteristics and the effect of a crystalline-like mobility behavior on the temperature dependence of ids. if mobility is considered constant in the simulator, the variation of the drain current with temperature will be determined only by the dos characteristics, which was confirmed by simulations with a mobility that does not depend on temperature.
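to make the role of the two dos parameters more tangible, the following sketch evaluates an exponential acceptor-like tail, g(E) = nta·exp((E − ec)/kta) for E ≤ ec, which is the usual parametrization of tail states with a density nta at ec and a characteristic energy kta. the function names are illustrative, and using the integrated tail density nta·kta as a rough measure of the amount of trapping is an assumption made here, not a criterion stated by the authors. the device values are the ones extracted above (for device 2, nta is an upper bound).

```python
import math

def acceptor_tail_dos(e_minus_ec_ev, n_ta, kt_a_ev):
    """Exponential acceptor-like tail DOS in cm^-3 eV^-1.

    e_minus_ec_ev : energy measured from the conduction band edge (<= 0), in eV
    n_ta          : density of tail states at Ec, in cm^-3 eV^-1
    kt_a_ev       : characteristic energy of the tail, in eV
    """
    if e_minus_ec_ev > 0:
        raise ValueError("tail states are defined only for E <= Ec")
    return n_ta * math.exp(e_minus_ec_ev / kt_a_ev)

def integrated_tail_density(n_ta, kt_a_ev):
    """Integral of the tail DOS from deep in the gap up to Ec, in cm^-3.

    For an exponential tail extended to -infinity this is n_ta * kt_a.
    """
    return n_ta * kt_a_ev

# (nta [cm^-3 eV^-1], kta [eV]) as extracted in the text.
devices = {
    "device 1": (8.5e19, 0.072),
    "device 2 (upper bound)": (6e18, 0.040),
}

for name, (n_ta, kt_a) in devices.items():
    total = integrated_tail_density(n_ta, kt_a)
    at_100mev = acceptor_tail_dos(-0.1, n_ta, kt_a)
    print(f"{name}: integrated tail density {total:.2e} cm^-3, "
          f"g(Ec - 0.1 eV) = {at_100mev:.2e} cm^-3 eV^-1")
```

with these numbers the integrated tail density of device 2 is more than an order of magnitude below that of device 1, which is in line with the weaker trapping attributed to device 2 in the text.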
the model used for the blaze simulation included fermi statistics, as well as band parameters and bandgap narrowing specified for the igzo material.

the main results of the simulations are summarized in table 1. for simulated output characteristics of devices with nta=1.5x10^20 cm^-3 ev^-1 and kta=0.034 ev, the typical increase of ids with t was observed for all curves with vgs equal to or above 4 v. it is evident that in this case, defects determine the temperature dependence of the drain current. for values of nta equal to or below 1.5x10^19 cm^-3 ev^-1 and kta=0.034 ev, the drain current decreases with temperature. this result confirms that, when the density of localized states is sufficiently small and trapping is less important, the temperature dependence of the drain current is determined by the temperature dependence of mobility, see table 1.

fig. 2 output characteristics at different temperatures of: a) device 1; b) device 2.

the effect of reducing the dos characteristic energy kta, and thus the effect of trapping, was also analyzed. for this purpose nta was kept constant and equal to 3.5x10^19 cm^-3 ev^-1, while the characteristic energy was varied. table 1 and fig. 3 show that the reduction of kta allows the change of mechanism to occur for smaller values of vgs. for kta=100 mev, the change in mechanism is not observed and ids at 350 k is higher than at 300 k over the whole operation voltage range. for kta=70 mev, the change in mechanism is observed only for vgs=10 v, when ids at 350 k becomes smaller than at 300 k. for kta=34 mev, the conduction mechanism is also the same over the whole operation voltage range, but it corresponds to percolation in the conduction band, producing the reduction of ids with t. fig. 4 shows the change of mechanism in a simulated transfer curve in saturation, when nta is changed while maintaining kta=0.034 ev. for nta=1.5x10^20 cm^-3 ev^-1, ids is higher at 350 k than at 300 k for all values of vgs. for nta=1.5x10^18 cm^-3 ev^-1, ids is practically the same at 300 and 350 k for vds<0.5 v and, as vgs increases, the drain current at 350 k becomes smaller than at 300 k.

table 1 values of the drain current ids [a] for t=300 k and 350 k, at vds=10 v and different values of vgs, corresponding to different combinations of values of nta and kta.

| nta (cm^-3) | kta (mev) | vgs=4 v, t=350 k | vgs=4 v, t=300 k | vgs=6 v, t=350 k | vgs=6 v, t=300 k | vgs=8 v, t=350 k | vgs=8 v, t=300 k | vgs=10 v, t=350 k | vgs=10 v, t=300 k |
|---|---|---|---|---|---|---|---|---|---|
| 1.5x10^18 | 34 | 2.05x10^-5 | 2.54x10^-5 | 4.00x10^-5 | 4.98x10^-5 | 6.58x10^-5 | 8.23x10^-5 | 9.65x10^-5 | 1.21x10^-4 |
| 1.5x10^19 | 34 | 1.64x10^-5 | 1.89x10^-5 | 3.30x10^-5 | 3.89x10^-5 | 5.57x10^-5 | 6.66x10^-5 | 8.34x10^-5 | 1.01x10^-4 |
| 1.5x10^20 | 34 | 4.06x10^-6 | 3.20x10^-6 | 9.13x10^-6 | 7.83x10^-6 | 1.71x10^-5 | 1.50x10^-5 | 2.85x10^-5 | 2.80x10^-4 |
| 1.5x10^19 | 100 | 2.09x10^-6 | 1.50x10^-6 | 7.36x10^-6 | 6.68x10^-6 | 1.74x10^-5 | 1.78x10^-5 | 3.29x10^-5 | 3.59x10^-5 |
| 3.5x10^19 | 34 | 1.22x10^-5 | 1.83x10^-5 | 2.56x10^-5 | 3.69x10^-5 | 4.44x10^-5 | 6.19x10^-5 | 6.83x10^-5 | 9.21x10^-5 |
| 3.5x10^19 | 70 | 2.13x10^-6 | 1.44x10^-6 | 6.51x10^-6 | 5.41x10^-6 | 1.48x10^-5 | 1.40x10^-5 | 2.77x10^-5 | 2.85x10^-5 |
| 3.5x10^19 | 100 | 2.10x10^-7 | 8.80x10^-7 | 1.08x10^-6 | 6.36x10^-7 | 3.69x10^-6 | 2.83x10^-6 | 9.38x10^-6 | 8.55x10^-6 |
| 6x10^19 | 30 | 1.07x10^-5 | 1.08x10^-5 | 2.24x10^-5 | 2.37x10^-5 | 3.92x10^-5 | 4.31x10^-5 | 6.08x10^-5 | 6.87x10^-5 |
| 8.5x10^19 | 72 | 2.84x10^-7 | 1.25x10^-7 | 1.03x10^-6 | 5.78x10^-7 | 2.85x10^-6 | 1.95x10^-6 | 6.53x10^-6 | 5.22x10^-6 |
| 3.5x10^17 | 180 | 1.83x10^-5 | 2.25x10^-5 | 3.69x10^-5 | 4.58x10^-5 | 6.19x10^-5 | 7.72x10^-5 | 9.21x10^-5 | 1.15x10^-4 |

fig. 3 simulated output characteristics at t=300 k and t=350 k for nta=3.5x10^19 cm^-3 ev^-1 and kta=100 mev and 70 mev, showing the different behavior of ids with t corresponding to the change in conduction mechanism.

fig. 4 simulated transfer curves in saturation at t=300 k and t=350 k for nta=1.5x10^20 cm^-3 ev^-1 and 1.5x10^18 cm^-3 ev^-1, for kta=34 mev.

simulations confirm that device 1, with the dos characteristics indicated above, is expected to have hopping as the predominant conduction mechanism over the whole operation region and temperature range analyzed, which is what was observed. on the other hand, simulations also show that for values of nta below 3.5x10^19 cm^-3 and kta below 0.1 ev, the predominant conduction mechanism can change to percolation for vds and vgs above a given value. this value can be estimated by analyzing the arrhenius dependence of the drain current in the devices. this was the case observed for device 2. as already mentioned, the presence of band-like carrier transport is well accepted for igzo tfts, although the presence of vrh cannot be excluded [14]. because of this, the device can show an electrical crystalline-like behavior, in which mobility reduces with temperature due to the interaction of carriers with the atoms in the material. the predominant carrier transport mechanism will depend on the dos characteristics of the device being analyzed. if the effect of the dos is sufficiently small, the current due to percolation conduction is expected to become predominant and the device can show a crystalline-like behavior.

4. conclusions

due to the chemical bonds in a-igzo layers, carrier conduction in the conduction band is possible in aos tfts based on this material.
however, the presence of vrh cannot be excluded, and the predominant conduction mechanism is determined by the characteristics of the dos in the amorphous igzo layer and by the operating voltages, which define the position of the fermi level. simulations confirm that, when the effect of the dos is sufficiently small, that is, when the combination of nta and kta is sufficiently small, the current due to percolation conduction becomes predominant and the device can show a crystalline-like behavior in the operation range. for example, for a density of tail acceptor states nta=3.5x10^19 cm^-3 ev^-1 and a characteristic energy of 34 mev, ids reduces with t. for kta=100 mev the current increases with t. for kta=70 mev, the drain current decreases only when vgs=10 v. for nta=1x10^20 cm^-3 ev^-1 the current increases with t even for kta=34 mev, indicating that vrh conduction is predominant. for nta<1.5x10^19 cm^-3 ev^-1, the current reduces with t for kta=34 mev, indicating that the percolation conduction mechanism is predominant over the whole operating voltage range of the tft.

acknowledgement: this work was supported by conacyt projects 237213 and 236887 in mexico, the h2020 programme of the european union under contract 645760 (domino), by contract “thin oxide tft spice model” (t12129s) with silvaco inc., by icrea academia 2013 from icrea institute and the spanish ministry of economy and competitiveness through project tec2015-67883-r (greensense). the authors acknowledge holst centre/tno and dr. i. mejia, from the university of texas at dallas, for providing the tft devices.

references [1] k. nomura, h. ohta, a. takagi, t. kamiya, m. hirano, h. hosono, “room-temperature fabrication of transparent flexible thin-film transistors using amorphous oxide semiconductors”, nature, vol. 432, pp. 488-492, 2004. [2] k. nomura, t. kamiya, h. ohta, k. ueda, m. hirano, h. hosono.
“carrier transport in transparent oxide semiconductor with intrinsic structural randomness probed using single-crystalline ingao3(zno)5 films”, appl. phys. lett., vol. 85, pp. 1993-1995, 2004. [3] t. kamiya, k. nomura, h. hosono, “electronic structures above mobility edges in crystalline and amorphous in-ga-zn-o: percolation conduction examined by analytical model”, j. display technol., vol. 5, pp. 462-467, 2009. [4] e. fortunato, p. barquinha, r. martins, “oxide semiconductor thin-film transistors: a review of recent advances”, adv. mater., vol. 24, pp. 2945-2986, 2012. [5] h. kumomi, t. kamiya, h. hosono, “advances in oxide thin-film transistors in recent decade and their future”, ecs transactions, vol. 67, pp. 3-8, 2015. [6] t. kamiya, h. hosono, “material characteristics and applications of transparent amorphous oxide semiconductors”, npg asia mater., vol. 2, pp. 15-22, 2010. [7] t. kamiya, k. nomura, h. hosono, “present status of amorphous in-ga-zn-o thin-film transistors”, sci. technol. adv. mater., vol. 11, pp. 044-305. [8] t. kamiya, k. nomura, h. hosono, “electronic structure of the amorphous oxide semiconductor aingazno4–x: tauc–lorentz optical model and origins of subgap states”, phys. status solidi a, vol. 206, pp. 860–867, 2009. [9] s. sallis, k.t. butler, n.f. quackenbush, d.s. williams, m. junda, d.a. fischer, j.c. woicik, n.j. podraza, b.e. white, a. walsh, l.f. piper, “origin of deep subgap states in amorphous indium gallium zinc oxide: chemically disordered coordination of oxygen”, applied physics letters, vol. 104, pp. 232108, 2014. [10] s. c. kim, y.s. kim, j. kanicki, “density of states of short channel amorphous in–ga–zn–o thin-film transistor arrays fabricated using manufacturable processes”, jpn. j. of appl. phys., vol. 54, pp. 51-101, 2015. [11] x. ding, j. zhang, w. shi, h. zhang, c. huang, j. li, x. jiang, z. 
zhang, “extraction of density-ofstates in amorphous ingazno thin-film transistors from temperature stress studies”, current applied physics, vol. 14, pp. 1713-1717, 2014. [12] c. chen, k. abe, h. kumomi, j. kanicki, “density of states in a-ingazno from temperature dependent field studies”, ieee tran. electron devices, vol. 56, pp. 1177-1183, 2009. [13] j. jeong, j.k. jeong, j.s. park, y.g. mo, y. hong, “meyer–neldel rule and extraction of density of states in amorphous indium–gallium–zinc-oxide thin-film transistor by considering surface band bending”, japanese journal of applied physics, vol. 49, pp. 03cb02, 2010. [14] w. chr. germs, w.h. adriaans, a.k. tripathi, w.s.c. roelofs, b. cobb, r.a. j. janssen, g.h. gelinck, m. kemerink, “charge transport in amorphous ingazno thin-film transistors”, phys. rev., vol. b86, pp. 155-319, 2012. [15] t. kamiya, k. nomura, h. hosono, “origin of definite hall voltage and positive slope in mobility-donor density relation in disordered oxide semiconductors”, appl. phys. lett., vol. 96, pp. 122103, 2010. [16] s. lee, s. park, s. kim, y. jeon, k. jeon, j.-h. park, j. park, i. song c., j. kim, y. park, d.m. kim, d.h. kim, “extraction of subgap density of states in amorphous ingazno thin-film transistors by using multifrequency capacitance–voltage characteristics”, ieee electron device lett., vol. 31, pp. 231-233, 2010. [17] j.-h. park, k. jeon, s. lee, s. kim, s. kim, i. song, j. park, y. park, c. j. kim, d. m. kim, d.h. kim, “self-consistent technique for extracting density of states in amorphous ingazno thin film transistors”, j. electrochem. soc., vol. 157, pp. h272, 2010. [18] p.y. liao, t.c. chang, t.y. hsieh, m.y. tsai, b.w. chen, y. h. tu, a. k. chu, c.h. chou, j.f. chang, j f “investigation of carrier transport behavior in amorphous indium–gallium–zinc oxide thin film transistors”, jpn. j. of appl. phys., vol. 54, pp. 094101, 2015. [19] m. estrada, m. rivas, i. garduño, f. avila-herrera, a. cerdeira, m. pavanello, i. 
mejia, m.a. quevedo-lopez, “temperature dependence of the electrical characteristics up to 370 k of amorphous in-ga-zno thin film transistors”, microelectronics reliability, vol. 56, pp. 29–33, 2016.

facta universitatis series: electronics and energetics vol. 27, no. 2, june 2014, pp. 153-182 doi: 10.2298/fuee1402153s

fiber optics engineering: physical design for reliability

ephraim suhir
bell laboratories, murray hill, nj, portland state university, portland, or, usa, technical university, vienna, austria, bordeaux university, bordeaux, france, ariel university, ariel, israel, ers co., los altos, ca, usa

abstract. the review part of the paper addresses analytical modeling in fiber optics engineering. attributes and significance of predictive modeling are indicated and discussed. the review is based mostly on the author’s research conducted at bell laboratories, physical sciences and engineering research division, murray hill, nj, usa, during his tenure with bell labs for about twenty years, and, to a lesser extent, on his recent work in the field. the addressed topics include, but are not limited to, the following major fields: bare fibers; jacketed and dual-coated fibers; coated fibers experiencing thermal and/or mechanical loading; fibers soldered into ferrules or adhesively bonded into capillaries; roles of geometric and material non-linearity; dynamic response to shocks and vibrations; as well as possible applications of nanomaterials in new generations of coating and cladding systems. the extension part is concerned with a new, fruitful and challenging direction in optical engineering: probabilistic design for reliability (pdfr) of opto-electronic and photonic systems, including fiber optics engineering. the rationale behind the pdfr concept is that the difference between a highly reliable optical fiber system and an insufficiently reliable one is “merely” in the level of the never-zero probability of failure.
it is the author’s belief that when the operational reliability of an optical fiber system and product is imperative, the ability to predict, quantify, assure and, if possible and appropriate, even specify this reliability is highly desirable.

key words: fiber optics engineering, optical fibers, design-for-reliability, predictive modeling, probabilistic assessments

received january 6, 2014
corresponding author: ephraim suhir, ers co., los altos, ca, usa, 727 alvina ct., los altos, ca 94024, 650-969-1530, cell. 408-410-0886 (e-mail: suhire@aol.com)

1. physical design-for-reliability in fiber optics engineering

1.1. fiber optics engineering (foe)

three major objectives are pursued in fiber optics engineering (foe), as far as its short- and long-term reliability is concerned: 1) failure-free functional (optical) performance; 2) high physical (structural, mechanical) reliability; and 3) satisfactory environmental durability. the physical design for reliability (dfr) effort deals primarily with the second objective, but, to an extent, also with the other two. the dfr effort employs methods and approaches of reliability physics and structural analysis and is aimed at evaluating stresses, strains and displacements in fiber optics structures, carrying out the physical design of these structures, and assessing and assuring their short- and long-term reliability. the physical dfr effort treats fiber optics products as structures: the materials interaction, the size and configuration of the structural elements in the product, the physical nature and magnitude of the applied loads, and the ability to quantify reliability are as important in this effort as the optical properties and characteristics of the employed materials. the application of methods and approaches of dfr in foe systems enables one to design, fabricate and operate a viable and reliable product [1]-[11].
like traditional and much better developed branches of dfr, such as civil, aircraft, space, maritime, automotive, etc., dfr in foe considers the specifics associated with the properties of the materials used, typical structures employed, and the nature, magnitude and variability of the applied loads. typical foe structures are bare or composite (coated) rods and beams of various lengths and flexural rigidities. these structural elements could be soldered into ferrules, adhesively bonded into capillaries, or embedded into various materials and media. typical materials are silica glasses; polymers (coatings, adhesives, and even polymer light-guides); semiconductors, including compound semiconductors; metals, and, first of all, solders, both "hard" (e.g., gold-tin) and "soft" (e.g., silver-tin) ones. typical loads include internal (thermal) loads caused by dissimilar materials and/or by temperature gradients, and/or high- or low-temperature environments (temperature extremes); external (mechanical) loads due to the inevitable or imposed, but always critical, deformations; or possible dynamic loads caused by shocks, vibrations, acoustic noise or impact, etc. high voltage, electric current, ionizing radiation and/or extensive light output from a powerful laser source are also considered as loads (stressors, stimuli).
dfr in foe pursues, but might not be limited to, the following major objectives: 1) determine and idealize, for the sake of predictive modeling, the most likely loading conditions; 2) evaluate the stresses, strains, displacements, and, when methods and approaches of fracture mechanics are applicable, also fracture characteristics of the fiber optics materials and structures; 3) assure, typically on the probabilistic basis, that the acceptable strength and reliability criteria will remain, during the lifetime of the product, within the limits allowable from the standpoint of the product's structural integrity, elastic stability, dependability, availability and normal operation. while an optical engineer is and should be concerned, first of all, with the functional (optical) performance of the foe product, an adequate performance of this product cannot be assured if its ability to withstand elevated stresses (physical reliability) and exhibit adequate environmental durability (ability to withstand degradation and aging at high temperature and/or humidity environments) is not taken care of.
accordingly, we consider the following stress-strain analysis problems encountered in foe: 1) role and attributes of, and challenges in, predictive modeling in foe problems: the emphasis is on the analytical (mathematical) modeling; 2) thermal stress in fiber optics structures: it is this stress and strains (displacements) that are the most typical and most detrimental in these structures; 3) bending of bare fibers caused by the ends off-set; 4) bare fibers under the combined action of bending and tension; 5) role of the structural and materials nonlinearity; 6) coated fibers and stresses that occur in the glass material during the design, fabrication, operation and proof-testing of such fibers; 7) micro-bending of dual-coated fibers intended for long haul communication; 8) solder materials and joints, and fibers soldered into ferrules; 9) dynamic response of electronic and photonic systems, including optical fibers, to shocks and vibrations; 10) new nano-material and its applications in fiber optics, photonics and beyond; 11) some special foe problems: strain-free planar optical waveguides; apparatus and method for thermostatic compensation of temperature sensitive optical devices; stresses and strains in fused bi-conical taper couplers; "curling" phenomenon during drawing of optical fibers; effect of voids. the extension part deals with a novel direction in "high-tech" engineering: probabilistic design for reliability (pdfr) of electronic and photonic systems, including optical fibers and interconnects. the objective of this direction is to provide quantitative probabilistic assessments of the likelihood of operational failures of oe materials, devices and systems. the pdfr direction is based on the rationale that when reliability is imperative, the ability to predict, quantify, assure, and, if possible and appropriate, even specify it, is highly desirable, or even a must. 1.2.
predictive modeling (pm) in fiber optics engineering (foe): role, attributes, challenges modeling is the major approach of any science, whether pure or applied. research and engineering models can be experimental or theoretical. experimental models are typically of the same physical nature as the actual phenomenon or the object. they reproduce a notion or an object of interest in a simplified way and often on a different scale. theoretical models represent real phenomena using abstract notions. the goal of a theoretical model is to reveal non-obvious, often even paradoxical, relationships hidden in the available intuitively obvious and/or experimentally proven input information [12]-[17]. a theoretical model can be either analytical or numerical (computational). analytical models often employ more or less sophisticated mathematical methods of analysis. today's numerical models are computer-aided. the most widespread model in the stress-strain evaluations and physical design for reliability in foe is finite-element analysis (fea). experimental and theoretical models have their merits and drawbacks, their areas of application, and should be viewed as equally important and equally indispensable for the design of a viable, reliable, and cost-effective foe product. one should always try to avoid the situation in which, because one's only tool is a hammer, all problems look like nails. although the role of theoretical modeling, mostly computer-simulation based, has dramatically increased in foe during the last two decades, the situation is still essentially different from the traditional areas of applied science. the majority of studies dealing with the physical design and performance of foe materials and products are experimental, and there are several reasons for that. first, experiments could be carried out with "full autonomy", i.e. without necessarily requiring theoretical support.
unlike theory, testing can be, and is, in effect, used for final proof of the viability and reliability of an foe product. that is why testing procedures are essential requirements of military and commercial specifications for such products. second, experiments in the foe field, expensive as they are, are considerably less costly than those involving, e.g., hulls in naval architecture, or fuselages in the aerospace field, or objects in civil engineering, where "specimens" might cost millions of dollars. third, foe experiments are much easier to design, organize, and conduct than those in the macro-engineering world. fourth, materials whose properties are unknown are often and successfully employed in various foe applications. lack of information about the properties of such materials is often viewed as an obstacle for implementing theoretical modeling. finally, many of the leading specialists in foe (experimental physicists, materials scientists, chemists, chemical engineers) traditionally use experimental methods as their major research tool. some of them simply do not feel that adding theoretical modeling will make an appreciable difference in the state-of-the-art of what they do. it is not surprising that eleven out of twelve bell labs nobel laureates were experimentalists. on the other hand, the application of experimental modeling, unlike theoretical modeling, requires, as a rule, considerable time and is often associated with significant expense. what is even more important though is that experimental data inevitably reflect the effect of the combined action of a variety of factors affecting the phenomenon or the product of interest. this often makes experimentation insufficient to understand the behavior and the performance of an foe material or a device. such a lack of insight inevitably leads to tedious, time-consuming and costly experimental procedures.
as a rule, the experimental data cannot be simply extended to new situations or new designs that are appreciably different from those tested. it is always easy to recognize purely empirical relationships obtained by formal processing of experimental data and not based on rational theoretical considerations reflecting the physical nature of the phenomenon of interest. purely experimental relationships contain, as a rule, fractional exponents and coefficients, odd units, etc. although such relationships may have a certain practical value, the very fact of their existence should be attributed to the lack of knowledge in the given area of applied science. typical examples are a power law (e.g., the one used in proof-testing of optical fibers, when their delayed fracture, also known as "static fatigue", is evaluated) or an inverse power law (e.g., numerous relationships of coffin-manson type used to evaluate the lifetime of solder joint interconnections). in view of the above, here is what could be gained by using theoretical modeling: 1) unlike experimentation, predictive modeling is able to shed light on the role of each particular parameter that affects the behavior and performance of the material, structure or a system of interest; 2) although testing can reveal insufficiently robust elements, it is incapable of detecting superfluously reliable ones; "over-engineered" (superfluously robust) objects may have excessive weight and be more costly than necessary; in mass production of expensive products, superfluous reliability may entail substantial and unnecessary additional costs; predictive modeling might be able to reveal the "over-engineered" and, hence, cost-ineffective elements of an foe design; 3) theoretical modeling can often predict the result of an experiment in less time and at a lower expense than it would take to perform the actual experiment; 4) in many cases, theory serves to discourage wasting time on
useless experiments; numerous attempts to build impossible heat engines have been prevented by a study of the theoretical laws of thermodynamics; while this is, of course, a classical and an outstanding example of the triumph of a theory, there are also numerous, though less famous, examples, when plenty of time and expense were saved because of prior theoretical modeling of a problem of interest; 5) in the majority of research and engineering projects, a preliminary theoretical analysis enables one to obtain valuable information about a phenomenon or an object to be investigated, and gives an experimentalist an opportunity to decide what should be tested or measured, how, and in what direction success might be expected; 6) by shedding light on "what affects what", theoretical modeling often serves to suggest new experiments: theoretical analyses of thermal stresses in bi-material assemblies (e. suhir, asme j. appl. mech., vol. 53, no. 3, sept. 1986) and in semiconductor thin films (s. luryi and e. suhir, applied physics letters, vol. 49, no. 3, july 1986) triggered numerous experimental investigations aimed at the rational physical design of semiconductor crystal grown assemblies; 7) theory can be used to interpret empirical results, to bridge the gap between different experiments, and to extend the existing experience to new materials and products; 8) one cannot do without a good theory when developing rational (optimal) designs; the idea of optimization of structures, materials, functions and costs, although new in foe, has penetrated many areas of modern engineering; no progress in this direction could be achieved, of course, without application of theoretical methods of optimization. 1.3. analytical vs.
numerical modeling analytical modeling [18]-[23] occupies a special place in the predictive modeling effort: it is able not only to come up with relationships that clearly indicate "what affects what", but, more importantly, can often explain the physics of phenomena and especially paradoxical situations better than the fea modeling, or even experiments, can. although the basics of fea modeling have been known since the mid-thirties or so, it is since the mid-1950s, when high-speed and powerful computers became available, that fea modeling has become the major research tool for theoretical evaluations in many areas of engineering. since the mid-1970s, fea has become the major modeling tool in electronics and photonics as well. this can be attributed, first of all, to the developments of computer science and engineering and the availability of numerous powerful and flexible computer programs. these programs enable one to obtain, within a reasonable time, a solution to almost any stress-strain related problem. broad application of computers, however, has by no means made analytical solutions, whether exact, approximate, or asymptotic, unnecessary or even less important. simple and easy-to-use analytical relationships have invaluable advantages, because of the clarity and compactness of the obtained information and explicit indication of the role of various factors affecting the given phenomenon or the behavior of the given material or the device. these advantages are especially significant when the parameter under investigation depends on more than one variable. as to the asymptotic techniques, they can be successful in many cases, when there are difficulties in the application of computational methods, e.g., in various problems containing singularities. such problems are often encountered in foe, because of wide employment of assemblies comprised of dissimilar materials.
but, even when application of fea encounters no difficulties, it is always advisable to investigate the problem analytically before carrying out fea analyses. such a preliminary investigation helps to reduce computer time and expense, develop the most feasible and effective preprocessing model and, in many cases, avoid fundamental errors. let us indicate several attributes of the analytical modeling effort in comparison with the fea: 1) fea was originally developed for structures with complicated geometry and/or with complicated boundary conditions (such as, e.g., avionics structures), when it might be difficult to apply analytical approaches. as a consequence, fea has been especially widely used in those areas of engineering, in which structures of complex configuration are typical (aerospace, maritime and offshore structures, some civil engineering structures, etc.). in contrast, foe structures are usually characterized by relatively simple geometries and can be easily idealized as cylindrical beams, flexible rods, rectangular or circular plates, various composite structures of relatively simple geometry, etc. there is an obvious incentive therefore for a broad application of analytical modeling in foe. 2) the adjacent structural elements in foe often have dimensions (thicknesses) that differ by orders of magnitude. typical examples are dual-coated fibers, thin-film systems fabricated on thick substrates, and adhesively bonded assemblies, in which the bonding layer (or the primary coating) is, as a rule, significantly thinner than the bonded components (secondary coating). since the mesh elements in a fea model must be compatible, fea of such structures often becomes a problem in itself, especially in regions of high stress concentration. such a situation does not occur, however, when an analytical approach is used. 3) there is often an illusion of simplicity in applying fea procedures.
some users of fea programs believe that they are not even supposed to have any prior knowledge of structural analysis and materials physics, and that the "black box" they deal with will automatically provide the right answer, as long as they push the right keys on the computer. at times, a hasty, thoughtless, and incompetent application of computers can result in more harm than good by creating an impression that a solution has been obtained when, actually, this "solution" is simply wrong. it is well known to those with hands-on experience with fea that although it might be easy to obtain a fea solution, it might be quite difficult to obtain the right solution. and how would one know that he/she obtained the right solution, if there is nothing to compare it with? in effect, one has to have good background in reliability and materials physics to develop an adequate, feasible, and economic preprocessing model and to correctly interpret the obtained information, and preliminary analytical modeling can be of significant help in that. clearly, if the fea data are in good agreement with the results of an analytical modeling (which is usually based on quite different assumptions), then there is a reason to believe that the obtained solution is accurate enough. a crucial requirement for an effective analytical model is its simplicity and clear physical meaning. a good analytical model, which can be of real help in "high-tech" engineering, should produce simple, easy-to-use and physically meaningful relationships that clearly indicate the role of the major factors affecting a phenomenon or an object of interest. one authority in applied physics remarked, perhaps only partly in jest, that the degree of understanding of a phenomenon is inversely proportional to the number of variables used for its description.
although an experimental approach, unsupported by theory, is "blind," theory, not validated by an experiment, is "dead." it is the experiment that forms a basis for a theoretical model, provides the input data for theoretical modeling, and determines the viability, accuracy, and limits of application of a theoretical model. limitations of a theoretical model are different in different problems and, in the majority of cases, are not known beforehand. it is the experimental modeling therefore, which is the "supreme and ultimate judge" of a theoretical model. a physical experiment can often be rationally included into a theoretical solution to an applied problem. even when some relationships and structural characteristics lend themselves, in principle, to theoretical evaluation, it is sometimes simpler and more accurate to determine these relationships empirically. a good example is the spring constant of an elastic foundation provided by the primary coating in dual coated optical fibers. 1.4. bending of bare fibers bending of bare fibers, idealized as single-span beams clamped at the ends and subjected to lateral and/or angular misalignment(s), was examined, based on the engineering beam theory, in application to the stress-strain evaluations in optical fiber interconnects [24]-[34]. angular misalignments and lateral ends-offsets might be due to the inability of the given technology to ensure good alignment of the interconnect ends and/or end cross-sections, but might also be essential, and quite often even desirable, features of a particular design. elevated optical fiber curvatures, caused by misalignments, affect both functional (optical) performance and mechanical (structural) reliability of the fiber interconnects. these curvatures and the resulting bending stresses can be predicted and, if necessary, minimized, thereby also minimizing the added transmission losses in, and improving the structural reliability of, an interconnect.
sometimes it might be particularly easy and effective to minimize the maximum curvatures by simply rotating the end cross-section of the interconnect. the directions and the angles of rotation could be predicted depending on the measured lateral misalignments [24], [25], [31]-[34]. an important factor in the assessment of the level of the reactive axial tensile forces in an interconnect experiencing ends off-set is the magnitude of the off-set for its given interconnect length (span). reactive forces arise because the supports of the actual interconnect cannot move closer when the interconnect is subjected to the ends off-set. in such a situation the interconnect experiences, in addition to bending, also reactive tension. this tension might be neglected nevertheless, if the misalignment is small compared to the interconnect length [26]. how small is "small" could be determined based on a more general, but still simple, predictive model that takes into consideration the possible occurrence of the appreciable tensile reactive forces [27]. if the reactive stresses are not negligible and have to be accounted for, this still could be done on the basis of the linear theory of bending of beams [27], although the level of the tensile forces is not proportional anymore to the level of the ends-offset. the situation is different if the interconnect experiences significant ends offset [28]. if this is the case, euler's nonlinear "elastica" theory can be employed to accurately predict the configuration of the misaligned fiber and the level of the tensile forces for the given (measured) ends off-set. for very large end off-sets the nonlinear stress-strain behavior of the silica material has to be accounted for (see section 1.6 below). the effect of the ends-offset, as far as the bending and the reactive tensile stresses are concerned, depends on the flexural rigidity of the interconnect. it is different therefore for bare and coated fibers.
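the small-offset case described above can be illustrated with elementary beam theory. the sketch below is not taken from refs. [24]-[34]; it assumes a clamped-clamped beam of span `span` whose ends are laterally offset by `offset` without rotation, so that the deflection follows the cubic w(s) = offset·(3s² − 2s³) with s = x/span. the maximum curvature of this cubic is 6·offset/span², and a first-order estimate of the axial strain imposed by immovable supports is 0.6·(offset/span)². the tension estimate ignores the change of shape caused by the tension itself, so it is only a rough indicator of when the reactive tension stops being negligible. all names and the numerical inputs (a 125 micrometer silica fiber with a modulus of about 72 gpa) are illustrative assumptions.

```python
import math


def offset_beam_estimates(offset, span, diameter, youngs_modulus):
    """Small-deflection estimates for a clamped-clamped fiber with a lateral ends offset.

    Assumes the cubic deflection w(s) = offset * (3 s^2 - 2 s^3), s = x / span,
    i.e. no end rotation and no influence of the axial force on the shape.
    All quantities are in SI units.
    """
    # Curvature of the cubic is largest at the clamped ends: |w''| = 6 * offset / span^2.
    max_curvature = 6.0 * offset / span**2
    # Bending stress at the outer surface: sigma = E * (d / 2) * curvature.
    bending_stress = youngs_modulus * (diameter / 2.0) * max_curvature
    # First-order elongation of the cubic, 0.5 * integral of (w')^2, divided by the span.
    axial_strain = 0.6 * (offset / span) ** 2
    # Reactive tensile stress if the supports cannot move closer (rough estimate).
    tensile_stress = youngs_modulus * axial_strain
    return {
        "max_curvature": max_curvature,
        "bending_stress": bending_stress,
        "axial_strain": axial_strain,
        "tensile_stress": tensile_stress,
    }


if __name__ == "__main__":
    # Illustrative inputs only: 1 mm offset over a 20 mm span.
    result = offset_beam_estimates(offset=1e-3, span=20e-3,
                                   diameter=125e-6, youngs_modulus=72e9)
    for name, value in result.items():
        print(f"{name}: {value:.4g}")
```

within this simplified picture the ratio of the tensile to the bending stress is 0.2·offset/diameter, which suggests why the reactive tension can be ignored only for offsets that are small on the scale of the fiber itself; the models cited in the text should be used for anything beyond an order-of-magnitude check.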
the models suggested in refs. [24]-[34] can be employed also for coated interconnects, by just evaluating and using their increased flexural rigidity and then considering the distribution of the induced tensile force between the silica fiber and its coating. partially coated interconnects provide a particular challenge [26], as far as the ability to determine the induced stresses is concerned. if the highest stresses are expected to occur at the clamped ends of the interconnect, it might make a difference whether the interconnect is soldered or adhesively bonded into the support structures, and whether its coating becomes part of these structures: the lateral and/or the axial compliance of the clamped fiber at its support cross-sections might provide appreciable stress relief for the misaligned fiber and should be considered. in a conservative analysis, one could get away, however, with assuming ideally rigid supports. such an assumption will result in an overestimation of the actual stresses in bending and/or in (reactive) tension. it is noteworthy that the occurrence of the tensile stress in an optical fiber of finite length subjected to a deliberately applied lateral off-set of its ends can be used in a unique and effective test vehicle for the evaluation of the tensile strength of the fiber, including its "static fatigue" (delayed fracture) [29]. indeed, the developed models for the prediction of the tensile force in a fiber subjected to the given (imposed) ends offset enable one to develop a simple and an effective experimental setup. in fibers with significant ends off-set the bending stress is significantly lower than the tensile one, and could be neglected, especially if the ends of the fiber are allowed to rotate. such a setup mimics well therefore the pull-test conditions.
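the two steps mentioned at the start of this paragraph for coated interconnects, evaluating the increased flexural rigidity and distributing the tensile force between glass and coating, can be sketched as follows. this is a minimal illustration under stated assumptions, not the procedure of the cited references: the coating is a single concentric, perfectly bonded, linearly elastic layer, and both materials share the same axial strain. the function names and the coating modulus used in the example are hypothetical.

```python
import math


def coated_fiber_flexural_rigidity(d_fiber, d_coating, e_fiber, e_coating):
    """Flexural rigidity EI of a glass fiber with one concentric coating layer (SI units)."""
    i_fiber = math.pi * d_fiber**4 / 64.0
    # Second moment of area of the annular coating cross-section.
    i_coating = math.pi * (d_coating**4 - d_fiber**4) / 64.0
    return e_fiber * i_fiber + e_coating * i_coating


def split_tensile_force(force, d_fiber, d_coating, e_fiber, e_coating):
    """Share an axial force between fiber and coating in proportion to axial stiffness EA."""
    a_fiber = math.pi * d_fiber**2 / 4.0
    a_coating = math.pi * (d_coating**2 - d_fiber**2) / 4.0
    k_fiber = e_fiber * a_fiber
    k_coating = e_coating * a_coating
    force_fiber = force * k_fiber / (k_fiber + k_coating)
    return force_fiber, force - force_fiber


if __name__ == "__main__":
    # Illustrative inputs: 125 um glass fiber, 250 um outer diameter, 1 GPa coating.
    ei = coated_fiber_flexural_rigidity(125e-6, 250e-6, 72e9, 1e9)
    f_glass, f_coat = split_tensile_force(1.0, 125e-6, 250e-6, 72e9, 1e9)
    print(f"EI = {ei:.4g} N*m^2, glass carries {f_glass:.3f} N of 1 N")
```

with these assumed numbers the coating raises the flexural rigidity by roughly a fifth while the glass still carries most of the axial force; a dual-coated fiber would need one more annular term in each sum.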
while tensile loading on an optical fiber interconnect always has a negative effect on the state of stress in it, i.e., always leads to elevated stresses, especially in the presence of the ends off-set, moderate compression can have, strange as it may sound, a positive effect on the induced stresses [30], [31]. even if the compressive force exceeds the critical (buckling) force, it can still be tolerated, as long as the distance between the interconnect ends is controlled and cannot be smaller than the distance determined by the thermal contraction mismatch between the optical fiber and its enclosure. the desired compression can be evaluated beforehand and then implemented into the actual design by choosing the most suitable material of the enclosure: when the structure is fabricated at an elevated temperature and is subsequently cooled down to a low (room) temperature, the thermal contraction mismatch between the materials of the fiber and the enclosure will lead to the desired (required) level of the compressive stress and the displacement in the fiber. 1.5. pigtail configuration pigtails in laser package designs provide a particular challenge, as far as their bending and optimized configuration is concerned. various situations encountered when a pigtail is employed to connect a laser package to the "outside world" were addressed and analyzed [35]-[38]. in particular, it has been shown [35] that by rotating the package inside the enclosure, one could reduce dramatically the induced curvatures. this should be done, however, with caution, since ideally straight pigtails cannot be recommended. this is because while the initial bending stress in them is indeed zero, the situation could be worsened dramatically if the structure with a high expansion enclosure is heated up, thereby leading to undesirable and significant tensile stresses in the pigtail.
it is usually preferred that the pigtail is kept "loose" and, owing to that, is able to accommodate appreciable axial deformations in tension or compression without being stressed. if a pigtail experiences two-dimensional bending on a plane [38], appreciable bending stress relief can be obtained by simply forcing the pigtail to be configured as a quarter of a circumference, so that all its points have the same curvature and, hence, experience the same bending stress. this stress can be appreciably lower than the maximum bending stress at the clamped end of a clamped-free pigtail. more practical and more complicated situations take place when a pigtail is bent on a cylindrical surface [36], [37]. such a design was considered for lasers intended for at&t undersea long haul communication technologies. achieving an optimized geometry of such a pigtail was certainly a challenge. 1.6. consideration of structural and material nonlinearity consideration of the structural (geometric) and materials (physical) nonlinearity might be necessary, if the fiber experiences significant bending and/or axial deformations [39]-[48]. the effect of the structural nonlinearity, which is due to the significant bending deformations of optical fibers, takes place when the induced displacements are not proportional anymore to the applied forces. the stress-strain relationship might be still linear, however, i.e., hooke's law is still fulfilled. this is the case, e.g., of fiber interconnects with moderate end-offsets. it has been established, however [39]-[43], that silica glasses exhibit highly non-linear, although still elastic, stress-strain relationships when the applied strains are not low enough. young's modulus in these materials becomes strain dependent.
experiments have indicated that it increases with an increase in the tensile stress and decreases with an increase in the compressive stress, even well below the stress that leads to buckling. when a silica fiber specimen is subjected to significant bending deformations, its neutral axis shifts at the given cross-section of the specimen, because of the non-linear stress-strain behavior of the material, in the direction of the layer subjected to tension. this phenomenon takes place, particularly, when the specimen is subjected to two-point bending tests [39]-[47]. both the geometric nonlinearity caused by large bending deformations (the shape of the bent fiber) and the material's nonlinearity have to be considered, so that the maximum bending stress in the bent fiber is predicted in the most accurate fashion. if there is a need to establish the actual shape of the bent fiber, the euler "elastica" approach might be necessary, and this shape is expressed in elliptic functions. attributes associated with the role of fiber coating, if any, can be easily incorporated, if necessary, into the analytical stress model [48]. 1.7. thermal stress in coated fibers coated fibers, whether polymer coated or metalized or otherwise protected, are widely employed for better short- and long-term reliability of the silica material, which is both brittle and moisture-sensitive. the addressed problems encountered during design, manufacturing, testing, and reliability assessments for coated fibers include: evaluation of the effect of coating on the bending stresses; understanding the possible delamination modes and mechanisms; improving strippability of coated fibers; prediction of the magnitude and distribution of stresses occurring during proof (pull-out) testing, and others. thermal loading is responsible for many failures in photonics engineering, including optical fiber systems [49]-[61].
such loading could be caused by the thermal expansion (contraction) mismatch of the dissimilar materials in the structure (and particularly of the fiber and its coating) and/or by the non-uniform distribution of temperature (temperature gradients) in the system. steady-state or variable thermal loading takes place during the normal operation of optical assemblies and systems, as well as during their fabrication, testing, transportation and storage. thermal stresses, strains and displacements are the major contributors to the functional, structural and environmental failures of the optical equipment. this is true even for optical fiber systems, although optical fibers, unlike copper wires, do not dissipate heat. creep and stress relaxation phenomena might lead to excessive and undesirable displacements in foe systems. complete loss in optical coupling efficiency can occur, because of the excessive displacements due to the lateral (often less than 0.2 micrometers) or angular (often less than a fraction of one percent of a degree) misalignment in the gap between two light-guides or between a light source and a light-guide. this could be caused particularly by the thermal stress related deformations and/or, e.g., by stress relaxation in the laser weld. as is known, tiny temperature-induced changes in the distances between bragg gratings written on an optical fiber can be detrimental to its functional performance. for this reason control of the ambient temperature is sometimes needed to ensure sufficient protection of an optical device sensitive to the change in temperature. the requirements for the structural (physical) behavior of the materials and structures in optoelectronics and photonics are often based therefore on the functional (optical) requirements and specifications, while the requirements for the structural reliability or for the environmental durability might be significantly less stringent.
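the thermal expansion (contraction) mismatch between a fiber and its coating can be illustrated with the simplest possible axial model. the sketch below is an assumption-laden illustration, not one of the models of refs. [49]-[61]: it considers the mid-portion of a long, perfectly bonded fiber with one concentric coating, requires equal axial strain in both materials and zero net axial force, and ignores radial stresses, poisson effects, end effects and the viscoelastic behavior of polymers. the function name and all material values in the example are hypothetical placeholders.

```python
import math


def thermal_mismatch_axial_stresses(alpha_fiber, alpha_coating,
                                    e_fiber, e_coating,
                                    d_fiber, d_coating, delta_t):
    """Axial thermal stresses in the mid-portion of a coated fiber (SI units).

    Compatibility: alpha_f*dT + F/(E_f*A_f) = alpha_c*dT - F/(E_c*A_c),
    where F is the axial force in the fiber (tension positive) and the
    coating carries the opposite force.
    """
    a_fiber = math.pi * d_fiber**2 / 4.0
    a_coating = math.pi * (d_coating**2 - d_fiber**2) / 4.0
    # Axial compliances of the two components per unit length.
    compliance_fiber = 1.0 / (e_fiber * a_fiber)
    compliance_coating = 1.0 / (e_coating * a_coating)
    force = (alpha_coating - alpha_fiber) * delta_t / (compliance_fiber + compliance_coating)
    stress_fiber = force / a_fiber
    stress_coating = -force / a_coating
    return stress_fiber, stress_coating


if __name__ == "__main__":
    # Illustrative inputs: cooling by 100 K, high-expansion coating on low-expansion glass.
    s_glass, s_coat = thermal_mismatch_axial_stresses(
        alpha_fiber=0.5e-6, alpha_coating=80e-6,
        e_fiber=72e9, e_coating=1e9,
        d_fiber=125e-6, d_coating=250e-6, delta_t=-100.0)
    print(f"glass: {s_glass / 1e6:.2f} MPa, coating: {s_coat / 1e6:.2f} MPa")
```

for cooling with a coating that contracts more than the glass, this model puts the glass in compression and the coating in tension, which is the loading that drives the low-temperature behavior discussed later in the paper.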
the importance of thermal stresses in, and particularly of modeling of the physical behavior and performance of, coated optical fibers was discussed in refs. [49]-[61], where a number of practically important fiber optics structures were considered and analyzed. a simple analytical stress model has been recently developed for the prediction of thermal stresses in a cylindrical tri-material body [60], with application to silicon photonics technologies, when a metalized optical fiber is soldered into a silicon chip. the developed model is applicable also to situations when a fiber is soldered into a ferrule, or is adhesively bonded into a capillary. it is concluded particularly that the adequate bonding material (e.g., a "soft" tin-lead or a "hard" gold-tin solder) should be selected and its thickness should be established, for low enough thermally induced stresses in it, based on the developed model, so that the short- and long-term reliability of the materials, and, first of all, the solder material, is not compromised. being analytical, rather than fea based, this model is quite general and can be used in various other technologies and structures, even well beyond the field of photonics, when cylindrical tri-material bodies comprised of dissimilar materials and experiencing temperature excursions are employed. in bi-material soldered or adhesively bonded assemblies, the bonding layer is much thinner than the bonded components and/or its young's modulus is considerably lower than young's moduli of the materials of the bonded components. owing to that, the coefficient of thermal expansion (cte) of the bonding material does not have to be accounted for, and the engineering predictive model can be developed for a bi-material assembly and made therefore relatively simple. however, when the intermediate (bonding) material is not thin and/or its young's modulus is not small, the material becomes "an equal partner" with the materials of the bonded components.
then a more complicated model has to be developed to account for the roles of all the materials in such a tri-material assembly. the development of such a model is particularly challenging for a cylindrical body, such as a silicon photonics assembly [60].
1.8. coated fibers with low modulus coating at the ends
interfacial thermally induced shearing and peeling stresses that are due to the interaction of the dissimilar materials in coated fibers are often the major cause of an insufficient short- and long-term reliability of the fibers. since both categories of the interfacial stresses concentrate at the fiber ends and decrease with an increase in the compliance of the coating system [62]-[64], there is an obvious incentive for employing low modulus coating materials at the ends of optical fiber interconnects. particularly, the maximum thermally induced interfacial shearing stresses at the ends of a jacketed fiber can be minimized, if the lengths of the end portions of the coating are established, for the given young's modulus of the coating material and the given thickness of the coating layer, in such a way that the shearing stress at the fiber ends becomes equal to the shearing stress at the boundary between the mid-portion and the peripheral portions of the bonding layer. the maximum shearing stress in such an inhomogeneously coated fiber takes place at two locations: at the fiber ends and at the boundaries between the mid-portion and the peripheral portions of the coated fiber. this stress could be significantly lower than in a fiber with a homogeneous coating. moreover, the maximum stresses in an inhomogeneously coated fiber will be even lower than in a fiber coated by a homogeneous layer whose young's modulus is the same as the young's modulus of the low modulus material at the peripheral portions of a fiber with an inhomogeneous coating.
such a paradoxical situation [65] is due to the fact that stiff mid-portions of bonded joints bring down the relative longitudinal interfacial displacements of the bonded materials not only in the fiber mid-portion, where the interfacial thermal stresses are low anyway, but also at the fiber ends, where the maximum interfacial stresses occur. these stresses decrease with a decrease in the peripheral displacements.
1.9. micro-bending phenomenon in dual-coated fibers
dual-coated optical fibers are fabricated at elevated temperatures and operated at low temperature conditions. it is imperative that coated fibers intended for long-haul communications remain stable at low temperatures, i.e., do not buckle (do not "microbend") within the primary coating as a result of the thermal contraction mismatch of the high expansion (contraction) secondary coating and the low expansion (contraction) fiber. low-temperature micro-bending, while most likely harmless from the standpoint of the level of bending thermal stresses, can result in substantial added transmission losses [66]-[77]. the low-temperature micro-bending phenomenon is a good illustration of a situation in which it is the need for a failure-free functional (optical) performance, rather than the physical reliability, that determines the requirements for the adequate structural (physical) design of an optical fiber system. the simplest analytical models [67]-[74] treat the fiber prone to low temperature micro-bending as an infinitely long beam lying on a continuous elastic foundation. this foundation is provided by the coating system, and, first of all, by the low-modulus primary coating. whenever such a beam-on-elastic-foundation predictive model is used, particular attention should be paid to how the spring constant of the elastic foundation is determined.
in the early publications preceding vangheluwe's pioneering work [67], it was simply assumed that this constant was equal to the young's modulus of the primary coating material. vangheluwe, using the plane strain theory-of-elasticity approximation and assuming an ideally rigid secondary coating, obtained a simple and physically meaningful formula for the spring constant. vangheluwe's formula indicates that the spring constant of interest depends on both the elastic constants of the primary coating material (young's modulus and poisson's ratio) and its thickness. vangheluwe's formula could result, however, in a considerable overestimation of the spring constant and, hence, in an overestimation of the critical (buckling) force, for some actual, not very stiff, secondary coating materials [69]. the more general formula [69] accounts for the finite rigidity of both the primary and the secondary coating. in the case of thick and not very high-modulus secondary coatings, the compliance of both coating layers should be considered. another significant finding, as far as the low-temperature micro-bending phenomenon is concerned, has to do with the role of the initial local curvatures [68]. while the initial curvatures do not change the magnitude of the critical force, they affect the pre-buckling behavior of the compressed fiber. when the compressive force increases, an initially straight fiber remains straight up to the very moment of buckling, while the localized curvatures in a fiber with initial curvatures gradually increase with an increase in the compressive force. this could cause appreciable additional deflections of the glass fiber and, as a consequence, considerable added transmission losses even at moderately low temperatures, at which the thermally induced compressive force is still well below its critical (buckling) value.
it has been shown, in particular, that, from the standpoint of the pre-buckling behavior of a fiber, certain curvature lengths are less favorable than others: a dual-coated fiber supported by an elastic foundation provided by the low-modulus coating behaves, with respect to the distributed localized initial curvatures, like a narrowband filter that enhances the curvatures that are close to the post-buckling configuration of the fiber (regardless of whether buckling occurs or not), and suppresses all the other, "non-resonant", curvatures. the developed analytical models are simple, easy-to-use, and clearly indicate the role of various factors affecting the pre-buckling behavior of the fiber. the obtained solutions indicate what could possibly be done to bring down, if necessary, the induced curvatures and the resulting added transmission losses in the fiber. the numerical examples are carried out for silicone/nylon coated systems extensively studied experimentally by japanese engineers [66]. the theoretical predictions agree well with the experimental observations. it is noteworthy in this connection that it has been observed [76] that external (mechanical) periodic loading with a period of about 100nm can also cause appreciable micro-bending losses in dual-coated fibers, and therefore should be avoided in actual designs. this period is rather close to the predicted critical "periods" of initial curvatures in the low-temperature micro-bending situation.
1.10. proof-testing of coated fibers
the stress-strain related problems that arise during proof-testing of coated optical fibers were addressed, based on analytical predictive modeling, in refs. [78]-[83].
the considered problems include: the role of the lengths of test specimens in pull-testing [82]; the buffering effect of the coating on the acceptable length of the test specimens in pull (proof) [83] and in bending [80] tests; the magnitude and the distribution of the interfacial stresses during pull-out testing [81], as well as stresses in coated fibers stretched on a capstan during the manufacturing process [79]. in the brief discussion that follows we elaborate on some of the more important aspects of the physical phenomena associated with proof-testing of coated optical fibers. it is well-known in materials science that if one intends to experimentally determine the young's modulus of a material and/or its flexural strength through three- or four-point bending, the specimen should be long enough (say, its length should be at least 12-15 times larger than its height), so that lateral shearing deformations do not occur in the specimen and do not affect the test data [78]. the problem encountered during pull testing of a glass fiber, one end of which is soldered or adhesively bonded (and is therefore rigidly or elastically clamped) while the other end is subjected to a pulling (tensile) force [82], is somewhat different, of course, but also has to do with the intent to obtain unambiguous test data. this could be done if the specimen is long enough, so that the tensile stresses prevail considerably over the bending stresses. considering that the pulling force will always form a certain angle with the fiber axis, the question is what could be done to minimize the effect of the associated bending stresses? to answer this question, a simple analytical model has been developed for the evaluation of the bending stress caused by the misalignment of the ends of a glass fiber specimen soldered into a ferrule and subjected to tension during pull testing.
it is shown that the bending stress can be reduced considerably by using sufficiently long specimens, and it is shown how long such specimens should be so that only the tensile stress needs to be accounted for. it is also shown how the uncertainty in the prediction of the inevitable misalignment of the fiber ends can be considered when establishing the appropriate specimen length. the tensile force experienced by a dual-coated optical fiber specimen during its reliability (proof) testing is applied to the fiber's secondary coating and is transmitted to the glass fiber at a certain distance from the specimen's ends. although it is true that, in accordance with saint-venant's principle, the glass fiber will be subjected, at a certain distance from the specimen ends, to the same stress that it would experience if the external force were applied to both the fiber and the coating, it is also true that, because of the buffering effect of the coating, the effective length of the fiber under testing, when the testing force is applied to the coating only, might be reduced appreciably in comparison with the fiber's actual length. a simple analytical stress model for the evaluation of this effect was developed [83] and was used to establish the appropriate minimum length of a dual-coated test specimen, so that the experimental data would be consistent and physically meaningful. it has been found that it is the axial compliance of the secondary coating, which experiences the direct action of the external loading, and the interfacial compliance of the coating system that determine the buffering effect of this system.
it was concluded that for any finite compliance of the coating, even a very low one, one could always employ a long enough specimen, in which the major mid-portion of the glass fiber would be loaded to practically the same level as in an infinitely long specimen, when the external force is distributed between the glass fiber and its coating proportionally to the axial rigidities of these structural elements. the developed model can be used for selecting the appropriate length of coated optical fiber specimens in reliability (proof) testing. it can be used also beyond the fiber optics technologies area, when composite structures of the type in question are employed and tested.
1.11. elastic stability of optical fiber interconnects
analytical models for the evaluation of the elastic stability of optical fiber interconnects have been developed 1) to understand the role of the nonlinear stress-strain relationship [84], 2) to assess the role of the hydrostatic pressure, if any [85], 3) to evaluate the role of the ends off-set [86], 4) to find out if there is sufficient incentive for using thicker coatings for higher elastic stability [87], 5) to investigate the role of the finite length of the interconnect [88], [91] on the critical stress, including the situation when the interconnect is partially stripped of its coating [89], and 6) to analyze the effect of the lateral compliance of the interconnect on the level of the buckling forces [90]. in the brief discussion that follows we indicate some important physical aspects of some of the phenomena associated with the above efforts.
the analysis of the effect of the nonlinear stress-strain relationship on elastic stability of optical glass fibers [84] has been carried out under the assumption that this relationship, obtained for the case of uniaxial tension, is also valid in the case of compression: just the sign in front of the nonlinear term in the formula for the strain-dependent young's modulus should be changed. it is clear that since the critical force is proportional, in accordance with the well-known euler formula, to the young's modulus of the material, and this modulus reduces with an increase in the compressive force, an approach that ignores such a reduction will overestimate the magnitude of the critical force, and, hence, will not be conservative. in the studies addressing low-temperature micro-bending of infinitely long dual-coated fibers and elastic stability of short bare fibers, the role of the nonlinear stress-strain relationship has been evaluated for strains not exceeding 5%, and therefore it has been indicated that future experimental research should include evaluation of the nonlinear stress-strain relationship, both in tension and compression, for higher strains and for high-strength fibers, such as, e.g., fibers protected by metallic coatings. the author of this review is not aware of whether such research has been conducted. the analysis of the effect of the hydrostatic pressure in dual-coated optical fibers on the induced stresses in the fiber [85] has indicated that all the normal stresses in the fiber (radial, tangential, axial) are proportional to this pressure. it has been found also that hydrostatic pressure results in lower micro-bending losses. calculations of the elastic stability of coated fiber specimens subjected to compression were carried out using analytical modeling [86], [87] for 2mm and 5mm long interconnects for the cases of bare (uncoated) fibers, as well as for coated fibers with 62.5 μm and 187.5 μm thick coatings.
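the sign-change assumption described above can be written out explicitly. the block below is only a sketch: it assumes a linear dependence of the modulus on strain and a pinned-pinned (euler) fiber, and the symbols E_0 (zero-strain modulus), \alpha (nonlinearity parameter), \varepsilon (strain magnitude), I (moment of inertia) and l (fiber length) are notation introduced here for illustration, not taken from [84].

```latex
\begin{align}
  % tension: the modulus grows with the tensile strain \varepsilon > 0
  E_{t}(\varepsilon) &= E_0 \left( 1 + \alpha \varepsilon \right), \\
  % compression: same relation, sign of the nonlinear term changed
  E_{c}(\varepsilon) &= E_0 \left( 1 - \alpha \varepsilon \right), \\
  % euler critical force evaluated with the reduced modulus
  T_{cr} &= \frac{\pi^2 E_{c}(\varepsilon) I}{l^2}
          \le \frac{\pi^2 E_0 I}{l^2}.
\end{align}
```

since E_c depends on the compressive strain at buckling, the last relation is implicit in T_cr; the inequality simply restates that a calculation based on the constant modulus E_0 overestimates the critical force.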
the compressive, bending and the total stresses in the glass fiber at the pre-buckling, buckling and post-buckling conditions were computed with consideration of the non-linear stress-strain relationship in the silica material. it has been found that the stresses in the fiber are strongly dependent on its length and the coating thickness. the nonlinear stress-strain relationship plays, however, a minor role, unless the specimen is shorter than 2mm. the incentive for the evaluation of the effect of the length of a coated fiber, idealized as a beam lying on a continuous elastic foundation (provided by the coating system), on the critical stress in it [88], [90] is due to the fact that the critical (buckling) force for a beam, in the absence of an elastic foundation, is highly dependent on its length: in accordance with the euler formulas, this force is inversely proportional to the beam's length squared and is proportional to the beam's flexural rigidity. on the other hand, the critical force for a long enough beam lying on a continuous elastic foundation is independent of the beam's length and, as is known from the theory of such beams, is equal to the doubled square root of the product of the spring constant of the foundation and the beam's flexural rigidity. the following natural questions arise in this connection:
1) for what lengths do both the beam's length and the spring constant of the foundation play a role and have to be accounted for? in other words, if the beam on an elastic foundation is not long enough, how does its finite length affect, if at all, the critical force?
2) what role, if any, do the arrangements of the beam's supports at its ends play, as far as the critical force is concerned, and is this role dependent on the beam's length?
in other words, is the above mentioned well known formula for the critical force for a long enough beam, calculated as the doubled square root of the product of the spring constant of the elastic foundation and the beam's flexural rigidity, valid for any long enough beam lying on an elastic foundation, regardless of the arrangements of its end supports, or is this not always the case? the developed analytical model made it possible to answer these questions and, as a by-product, to provide practical guidance for designers of coated fiber interconnects. an easy to use and physically meaningful diagram [90] based on the developed analytical models has been suggested to determine stability/instability zones for the given compressive force, the spring constant of the foundation, the length of the beam (fiber) and its flexural rigidity. both the mechanical and thermally induced compressive forces were considered. it has been shown also that the critical force for a long enough beam with a free (unsupported) end is half of the magnitude of the force in a beam with both ends supported. the obtained solution has been extended to a fiber with a stripped-off coating at its end portion, when the stripped-off end of the fiber interconnect (connector) is subjected to compression [89]. a situation when the critical force for the coated portion of the fiber is equal to the critical force for its stripped-off portion was particularly addressed, and recommendations for the corresponding length of the elastically stable stripped-off portion have been suggested. the model developed for a cantilever beam lying on a continuous elastic foundation and subjected to the combined action of the concentrated compressive and lateral forces at the free end of the beam (coated fiber) [90] was used to explain the effect of the lateral compliance of such a beam (i.e., its propensity to deflect under the action of the given lateral force) on its elastic stability.
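the critical-force relations quoted above (euler force inversely proportional to the length squared; long beam on an elastic foundation equal to the doubled square root of the spring constant times the flexural rigidity; half of that for a free end) can be compared numerically. the following is a minimal sketch, assuming a pinned-pinned euler beam and a solid circular glass cross-section; the function names and all input values (modulus, diameter, spring constant) are illustrative assumptions and are not taken from refs. [86]-[90].

```python
import math


def flexural_rigidity(youngs_modulus_pa, diameter_m):
    """EI of a solid circular cross-section, with I = pi * d^4 / 64."""
    return youngs_modulus_pa * math.pi * diameter_m ** 4 / 64.0


def euler_critical_force(ei, length_m):
    """Pinned-pinned beam without a foundation: T = pi^2 * EI / L^2."""
    return math.pi ** 2 * ei / length_m ** 2


def foundation_critical_force(ei, spring_constant, free_end=False):
    """Long beam on an elastic foundation: T = 2 * sqrt(K * EI).

    With a free (unsupported) end the force is half of that value,
    as stated in the text.
    """
    force = 2.0 * math.sqrt(spring_constant * ei)
    return 0.5 * force if free_end else force


# illustrative inputs (assumptions, not values from the paper)
E_SILICA_PA = 72e9        # young's modulus of silica glass
D_FIBER_M = 125e-6        # glass fiber diameter
K_FOUNDATION_PA = 1e6     # hypothetical spring constant, N/m per unit length

ei = flexural_rigidity(E_SILICA_PA, D_FIBER_M)
for length in (2e-3, 5e-3):
    print(f"euler, L = {length * 1e3:.0f} mm: {euler_critical_force(ei, length):.3f} N")
print(f"foundation, supported ends: {foundation_critical_force(ei, K_FOUNDATION_PA):.3f} N")
print(f"foundation, free end: {foundation_critical_force(ei, K_FOUNDATION_PA, free_end=True):.3f} N")
```

the sketch only covers the two limiting cases; the intermediate range, where both the length and the spring constant matter, is exactly what the stability diagram of [90] addresses and is not reproduced here.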
it is clear that the flexural rigidity of the beam and the presence of a compressive or a tensile force are equally important when assessing the role of the lateral compliance. indeed, while the tensile axial force results in an increased effective flexural rigidity of the beam, the compressive force results in its lower flexural rigidity. in an extreme situation, when the compressive force is significant and becomes equal to its critical value, the beam buckles, i.e., its effective flexural rigidity becomes zero. in another extreme case, when the tensile force is large, the beam's effective flexural rigidity increases, and a significant lateral force is needed to bend the beam. these phenomena can be used in fiber optics to increase, if necessary, the elastic stability (the critical force) by applying a tensile force to the fiber. this could be done, e.g., by placing the fiber into an enclosure whose cte is even lower than that of the fiber, say, in an enclosure built of carbon nano-tubes (cnts). as is known, at low and room temperatures, the cte of single wall cnts in the axial direction could be even negative. on the other hand, if one intends to increase the lateral compliance of the fiber, a high expansion enclosure could be used. such an enclosure will apply compression to the silica fiber. the modeling technique could be similar to the one used in [30] where a fiber with an initial ends off-set was considered.
1.12. solder materials and joints, and fibers soldered into ferrules
solder materials and joints are as important in photonics and, particularly, in foe, as they are in microelectronics [92], [93]. there are, however, specific requirements for the solder materials and joints used in photonics. these requirements are associated with the ability to achieve high alignment, high yield stress, low propensity to creep, etc.
it has been shown [92] that low expansion enclosures with good thermal expansion (contraction) match with silica are not always the right choice (solution) from the standpoint of the thermally induced stresses in metalized fibers soldered into ferrules, and in the solder material itself. indeed, the low expansion enclosures result in tensile radial stresses in the solder ring, and could lead to the delamination of the metallization from the fiber and/or to excessive tensile radial deformations in the solder. on the other hand, high expansion (contraction) enclosures might result in high compressive stresses in the solder material, and in unfavorable low cycle fatigue conditions during temperature cycling of the joint. the most feasible material of the enclosure and/or the thickness of the solder ring, and/or the physical properties of the solder material could and should be found based on the developed model.
1.13. dynamic response of optoelectronic structures to shocks and vibrations
numerous problems associated with the dynamic response of electronic and photonic structures to shocks and vibrations were addressed in refs. [94]-[106]. the major findings, conclusions and recommendations could be summarized as follows:
1) the maximum acceleration is typically used in electronics and photonics engineering as the major reliability criterion. it is suggested that this criterion can indeed be used in this capacity when the functional (electrical, optical, thermal) performance of the product is evaluated. it could be misleading, however, when structural (physical) reliability is critical [95]. it is the dynamic stress, and not the maximum acceleration (deceleration), that should be used as a suitable and adequate criterion of the dynamic strength of the material or a device. this stress may or may not be proportional to the maximum acceleration.
2) drop tests are often replaced in electronics and photonics engineering by shock tests, which are simpler to design and conduct, and whose results are easier to interpret. it has been found that such a replacement can be justified, if the dynamic response of the device under test is as close to an instantaneous impact as possible [97], [98], [103].
3) electronic and opto-electronic systems are often tested "on the board level". the model [99] contains an exact solution to a highly nonlinear equation for the principal coordinate for the dynamic response of a board to an impact (shock) loading. the model can be used to evaluate the dynamic response characteristics of the board (with surface mounted devices on it) that experiences highly nonlinear vibrations as a result of the shock impact applied to the board's support contour in drop or shock tests. the model has been developed under the assumption that the size of the surface-mounted devices in the xy plane is small, so that the surface mounted devices do not change the flexural rigidity of the board, but contribute significantly to its mass and, hence, to the inertial forces.
4) electronic and photonic systems often experience periodic impacts that could be idealized and modeled as a train of instantaneous impulses [101]. the developed model enables one to evaluate the dynamic response of such systems to a train of periodic impacts, including the situation when such shocks generate quasi-chaotic vibrations in the system. smoluchowski's (fokker-planck) equation is used to describe and to characterize the quasi-random vibrations caused in such a nonlinear system by periodic impulses.
1.14. new nano-particle material (npm) and its applications in fiber-optics
an advanced technology for making nano-particle material (npm) based optical silica fiber coatings has been developed under grants from darpa/navy [107]-[116].
the developed technology enables one to create ultra-thin, highly cost-effective, highly mechanically reliable, and highly environmentally durable coatings for silica light-guides. the obtained results have demonstrated the performance superiority of the developed technology over polymer-coated and metallized fibers, as well as the potential that the npm has for various commercial and military applications in micro- and opto-electronics and related areas. it can have many attractive applications also well beyond the "high-tech" field. this npm-based coating has all the merits of polymer and metal coatings, but is free of the majority of their shortcomings. the developed material is an unconventional inhomogeneous "smart" composite material, which is equivalent to a homogeneous material with the following major properties:
1) low young's modulus;
2) immunity to corrosion;
3) good-to-excellent adhesion to adjacent material(s);
4) non-volatility;
5) stable properties at temperature extremes (from -220 °c to +350 °c);
6) very long (practically infinite) lifetime;
7) "active" hydrophobicity: the material provides a moisture barrier (to both water and water vapor), and, if necessary, can even "wick" moisture away from the contact surface;
8) ability for "self-healing" and "healing": the npm is able to restore its own dimensions, when damaged, and is able to fill existing or developed defects (cracks and other "imperfections") in contacted surfaces;
9) very low (near unity) effective refractive index (if needed).
npm can be designed, depending on the application, to enhance those properties that are most important for the pursued application. the npm properties have been confirmed through testing. the tests have demonstrated the outstanding mechanical reliability, extraordinary environmental durability and, in particular applications, improved optical performance of the light-guide.
it is always desirable to provide application-specific modifications of the npm to master/optimize its properties and performance. because it is a nano-material, its surface chemistry and its performance depend a lot upon the contact materials and surfaces. the following npm applications are viewed as the most attractive ones.
1) npm is able to hermetically seal packages, components and devices, such as laser packages, mems, displays and plastic leds;
2) npm can be used as an effective protective coating for various metal and non-metal surfaces, well beyond the area of micro- and opto-electronics: in cars, aerospace structures, offshore and ocean structures, marine vehicles, civil engineering structures (bridges, towers, etc.), tubes, pipes and pipe-lines, etc. these applications benefit because the material is actively hydrophobic, does not induce additional stresses (owing to its low modulus), is inexpensive, is easy-to-apply, has practically infinite lifetime, and is self-healing. application of this material can result in a significant resistance of a metal surface to corrosion, and, in addition, in a substantial increase in the fracture toughness of the material, both initially and during the system's operation (use);
3) the npm can be added in the formulation of various coatings such as paints, thereby providing protective benefits without changing the application techniques;
4) because of a low refractive index, the npm can be used, if necessary, as an effective cladding of optical silica fibers. the use of the npm cladding eliminates the need to dope silica for obtaining light-guide cores. the new preform will consist of a single (undoped and, hence, less expensive) silica material;
5) a derivative application is flexible light-guides. multicore flexible fiber cables employing npm are able to provide high spatial image resolution.
as such, they might find important applications, when there is a need to provide direct high-resolution image transmission from secluded areas. possible applications can be found in bio-medicine, nondestructive evaluations, oil and other geological explorations, in ocean engineering, or in other situations, when an image needs to be obtained and transmitted from relatively inaccessible locations. in such applications, the plane ("butt") end of the fiber bundle (cable) will play the role of a small size pixel array. the transmitted image can be concurrently or subsequently enlarged to a desirable size, as needed;
6) another derivative application is a multicore fiber cable. ultra-small diameter glass fibers with an npm-based cladding/coating can be placed in large quantities within an npm medium ("multiple cores in a single cladding"). in addition, owing to a much better inner-outer refractive index ratio in the npm-based fibers, such cables will be characterized by very low signal attenuation;
7) yet another derivative application is sensor systems. the npm-based fibers can be used in optical sensor systems that employ optical fibers embedded in a laminar or a cast material. such systems are used, e.g., in composite airframes. with the npm used as a cladding or, at least, as a coating of the silica optical fiber, the optical performance and the structural reliability of the light-guide will be improved dramatically compared with the conventional systems;
8) ultra-thin planar light-guides are yet another derivative application of the npm. in the new generation of the planar light-guides, npm can be used as the top cladding material. it will replace silicon or polymer claddings, which are considered in today's planar light-guides. all the advantages of the npm cladding material discussed above for optical fibers are equally applicable to planar light-guides. these are thought to have a "bright" future in the next generation of computers and other photonic devices.
a modification of the npm has been developed and tested as an attractive substitute for the existing hermetic and non-hermetic optical fiber coatings. the following major activities were undertaken and the following results were obtained:
1) the drawing (manufacturing) process and the drawing tower were adequately retrofitted to adjust them to the characteristics of the developed npm and to the npm layer application procedure;
2) the conducted mechanical tests have demonstrated remarkable strength (up to 7.5 gpa = 765 kgf/mm² = 1088 kpsi) and attractive quality (low strength variability) of the manufactured npm-based fibers. such high strength characteristics have never been achieved before, even in lab conditions;
3) the environmental tests have shown that even at the humidity level of 100% (samples were immersed into water for 24 hours) the mechanical strength of these fibers is on the order of the strength of the best quality fibers at the "dry" conditions in the previous tests;
4) there is reason to believe that the achieved performance is still not a limit of the npm-based technology and that higher fiber strengths and better environmental stability are feasible by further "fine tuning" and further optimization of the npm and the drawing procedure;
5) the optical performance of the npm-based fibers (in terms of the attenuation level) is almost two-fold better than the optical performance of the reference (existing) samples. the estimated lower limit of the attenuation of npm-based optical fibers with a silica glass core and a stepwise refractive index change can potentially reach record values for the tested type of multi-mode fibers (getting even below 1 db/km in a specific spectral "window").
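the three strength figures quoted in item 2) above are the same quantity in different units, which is easy to check. a minimal sketch follows; the function names are introduced here for illustration, and the conversion factors are the standard definitions (1 kgf/mm² = 9.80665 mpa, 1 psi = 6894.757 pa).

```python
# standard conversion factors to pascals
PA_PER_KGF_MM2 = 9.80665e6     # 1 kgf/mm^2, from standard gravity 9.80665 m/s^2
PA_PER_PSI = 6894.757293168    # 1 pound-force per square inch


def gpa_to_kgf_per_mm2(strength_gpa):
    """Convert a stress from GPa to kgf/mm^2."""
    return strength_gpa * 1e9 / PA_PER_KGF_MM2


def gpa_to_kpsi(strength_gpa):
    """Convert a stress from GPa to kpsi (thousands of psi)."""
    return strength_gpa * 1e9 / PA_PER_PSI / 1e3


strength_gpa = 7.5  # value quoted in the text
print(f"{strength_gpa} gpa = {gpa_to_kgf_per_mm2(strength_gpa):.1f} kgf/mm^2")
print(f"{strength_gpa} gpa = {gpa_to_kpsi(strength_gpa):.1f} kpsi")
```

rounded to the nearest integer, both conversions reproduce the quoted 765 kgf/mm² and 1088 kpsi.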
the obtained results clearly demonstrated the performance superiority of the developed technology and the great potential (scientific, technological and commercial) of the future products, which makes the project attractive for commercialization.
1.15. some special foe problems
1. application of the mechanical approach to the evaluation of low-temperature added transmission losses in single-coated (jacketed) optical fibers [117] enables one, based on the developed analytical stress model, to evaluate the threshold of such losses from purely structural (mechanical) calculations, without resorting to much more complicated optical calculations or measurements. the model is based on the experimentally obtained evidence that the temperature threshold of the elevated added transmission losses coincides with the threshold of the elevated thermally induced ("hoop") stresses applied by the polymer jacket to the silica fiber, and its predictions have been confirmed by optical measurements. the model sheds light on the physics of the losses in question. the model can be used also to assess the incentive for employing a dual coated system, in which the thermally induced pressure on the glass fiber will be reduced.
2. analytical models [118]-[120] were used to predict the thermal stresses in fused biconical taper (fbt) light-wave couplers. the stresses are caused by the thermal contraction mismatch of the high-expansion coupler and its low-expansion substrate. the challenge in the modeling is due to the non-prismaticity of the fbt structure and the non-linear stress-strain relationship of the fbt material.
3. elevated lateral gradients of the ctes and young's moduli (in the direction of the fiber diameter) can possibly be responsible for the fiber "curling" during drawing of optical silica fibers [121].
the analysis was carried out on the basis of both analytical and fea modeling, and excellent agreement between the analytical and fea data was observed.
4. the apparatus and method for thermostatic compensation of temperature change sensitive opto-electronic devices [122] was also based on analytical modeling. in accordance with the invention, temperature-sensitive devices are mounted within a thermostatic structure that provides temperature compensation by applying compressive or tensile forces to stabilize the performance of the device across a significant operating temperature range. in a preferred embodiment, an optical fiber refractive index grating is thermostatically compensated to minimize changes in the reflection wavelength of the grating. various methods and devices are known in the art to compensate for temperature-induced thermal expansion. the patent [122] provides the simplest and most effective solution to the thermal compensation problem when regular and readily available materials can be used.

2. probabilistic design for reliability in fiber optics engineering

2.1. qualification testing (qt)

the short-term goal of a particular opto-electronic device manufacturer is to conduct and pass the established qt, without questioning whether it is adequate. the ultimate long-term goal of opto-electronic industries, whether aerospace, military, or commercial, regardless of a particular manufacturer or product, is to make their deliverables reliable in actual operation. it is well known, however, that today's electronic devices that passed the existing qt often fail in the field (in operation conditions). are the existing opto-electronic qt specifications adequate? do opto-electronic industries need new approaches to qualify their devices into products?
could the existing qt specifications and practices be improved to the extent that, if the device passed the qt, there is a quantifiable way to assure that its performance will be satisfactory? at the same time, there is a perception, perhaps a substantiated one, that some electronic products "never fail". it is likely that such a perception exists because these products are superfluously durable: they are more robust than is needed for a particular application and, as a consequence, more costly than necessary. to prove that this is indeed the case, one has to find a consistent way to quantify the level of the opto-electronic product's robustness in the field. then one could establish whether a possible and controlled reduction in the reliability level could be translated into a significant cost reduction.

2.2. probabilistic design for reliability (pdfr)

the probabilistic design for reliability (pdfr) concept enables one to provide affirmative answers to the above questions. the concept suggests that one
1) conducts highly focused and highly cost-effective failure-oriented accelerated testing (foat);
2) carries out simple and physically meaningful predictive modeling (pm) to understand the physics of failure;
3) predicts, using the results of the foat and pm, the probability of failure (pof) in the field;
4) carries out sensitivity analyses (sa) to establish the acceptable pof;
5) revisits, reviews and revises the existing qt practices, procedures, and specifications; and
6) develops and widely implements the pdfr concept, methodologies and algorithms, considering that "nobody and nothing is perfect", that the probability of failure is never zero, but can be predicted and, if necessary, minimized, controlled, specified and even maintained (assured) at an acceptable level.
in effect, the only difference between a highly reliable and an insufficiently reliable product is "merely" in the level of the operational pof.
prognostication and health monitoring (phm) approaches and techniques, which are very popular today, could be very helpful at all stages of the design, manufacturing and operation of the product. the reliability evaluations and assurances cannot be delayed, however, until the device is made (although this is often the case in current practice). reliability should be "conceived" at the early stages of the device design; implemented during manufacturing; qualified and evaluated by (electrical, optical, environmental and mechanical) testing at the design, product development and manufacturing stages; checked (screened) during production (by implementing an adequate burn-in process); and, if necessary and appropriate, monitored and maintained in the field during the product's operation, especially at the early stages of the product's use, by employing, e.g., technical diagnostics, prognostication and health monitoring (phm) methods and instrumentation.

three classes of engineering products, including opto-electronic and particularly fiber optics products, should be distinguished from the reliability point of view:
1) class i includes some military or aerospace objects, such as warfare, military aircraft, battleships, spacecraft. cost is important, but is not a dominating factor;
2) class ii includes objects like long-haul communication systems, civil engineering structures (bridges, tunnels, towers), passenger elevators, ocean-going vessels, offshore structures, commercial aircraft, railroad carriages, cars, some medical equipment. the product has to be made as reliable as possible, but only for a certain specified level of demand (stress, loading);
3) class iii includes consumer products, commercial electronics, agricultural equipment. the typical market is the consumer market.

2.3. reliability, cost effectiveness and time to market

reliability, cost effectiveness and time-to-market considerations play an important role in the design, materials selection and manufacturing decisions in commercial electronics, and are the key issues in competing in the global marketplace, at least for class iii products. a company cannot be successful if its products are not cost effective, or do not have a worthwhile lifetime and service reliability to match the expectations of the customer. too low a reliability can lead to a total loss of business. product failures have an immediate, and often dramatic, effect on the profitability and even the very existence of a company. profits decrease as the failure rate increases. this is due not only to the increase in the cost of replacing or repairing parts but, more importantly, to the losses due to the interruption in service, not to mention the losses due to reduced customer confidence and acceptance. these make obvious dents in the company's reputation and, as a consequence, affect its sales. each business, whether small or large, should try to optimize its overall approach to reliability. "reliability costs money", and therefore a business must understand the cost of reliability, both the "direct" cost (the cost of its own operations) and the "indirect" cost (the cost to its customers and their willingness to make future purchases and to pay more for more reliable products).

2.4. failure oriented accelerated testing (foat)

it is impractical and uneconomical to wait for failures when the mean-time-to-failure of a typical electronic device (equipment) today is on the order of hundreds of thousands of hours. accelerated testing (at) enables one to gain greater control over the reliability of a product. at has become a powerful means of improving reliability [3], [4].
this is true regardless of whether (irreversible or reversible) failures will or will not actually occur during the foat ("testing to fail") or the qt ("testing to pass"). in order to accelerate the material's (device's) degradation and/or failure, one has to deliberately "distort" ("skew") one or more parameters (temperature, humidity, load, current, voltage, etc.) affecting the device's functional or mechanical performance and/or its environmental durability. at uses elevated stress levels and/or higher stress-cycle frequencies as effective stimuli to precipitate failures over a short time frame. the "stress" in re does not necessarily have to be mechanical or thermo-mechanical: it could be electrical current or voltage, high (or low) temperature, high humidity, high frequency, high pressure or vacuum, cycling rate, or any other factor (stimulus) responsible for the reliability of the device or the equipment. at must be specifically designed for the product under test. the experimental design of at should consider the anticipated failure modes and mechanisms, typical use conditions, and the required or available test resources, approaches and techniques. some of the most common at conditions (stimuli) are: high temperature (steady-state) soaking/storage/baking/aging/dwell; low temperature storage; temperature (thermal) cycling; power cycling; power input and output; thermal shock; thermal gradients; fatigue (crack propagation) tests; mechanical shock; drop shock (tests); random vibration tests; sinusoidal vibration tests (with the given or variable frequency); creep/stress-relaxation tests; electrical current extremes; voltage extremes; high humidity; radiation (uv, cosmic, x-rays, alpha particles); space vacuum.

2.5. qualification testing (qt) and failure oriented accelerated testing (foat)

qt is a must. industry cannot do without qt. its objective is to prove that the reliability of the product-under-test is above a specified level.
qt enables one to "reduce to a common denominator" different products, as well as similar products produced by different manufacturers. qt reflects the state of the art in a particular field of engineering and the typical requirements for product performance. however, if a product passes today's qt for opto-electronic products, it is not always clear why it was good, and if it fails the tests, it is usually equally unclear what could be done to improve its reliability. since qt is not failure oriented, it is unable to provide the most important information about the reliability of the product: the reliability physics behind the failure and the pof after the given time in service under the given operation conditions. foat, on the other hand, is aimed, first of all, at revealing and understanding the physics of the expected or occurred failures. that is why it could be referred to as knowledge-oriented testing. unlike qt, foat is able to detect the possible failure modes and mechanisms. foat end points are cycles or durations that are scaled to the use environments. another possible objective of the foat is, time permitting, to accumulate failure statistics. thus, foat deals with the two major aspects of re: the physics and the statistics of failure. adequately planned, carefully conducted, and properly interpreted foat provides a consistent basis for the prediction of the pof after the given time in service. well-designed and thoroughly implemented foat can dramatically facilitate the solutions to many engineering and business-related problems associated with cost effectiveness and time-to-market. this information can be helpful in understanding what should be changed to design a viable and reliable product. this is because any structural, materials and/or technological improvement can be "translated", using the foat data, into the pof for the given duration of operation under the given service (environmental) conditions.
foat should be conducted in addition to the qt. there might also be situations when foat can be used as an effective substitute for the qt, especially for new products, when acceptable qualification standards do not yet exist. while it is the qt that makes a device into a product, it is the foat that enables one to understand the reliability physics behind the product and, based on the appropriate pm, to create a reliable product with a predicted or even specified pof.

2.6. burn-in testing (bit) as a special type of failure oriented accelerated testing (foat)

burn-in ("screening") testing (bit) is widely implemented to detect and eliminate infant mortality failures. bit could be viewed as a special type of manufacturing foat. bit is needed to stabilize the performance of the device in use. bit is supposed to stimulate failures in defective devices by accelerating the stresses that will cause these devices to fail, without damaging good items. the bathtub curve of a device that has undergone bit is supposed to consist of the steady-state and wear-out portions only. the rationale behind bit is the concept that mass production of electronic devices generates two categories of products that passed qt: 1) robust ("strong") components that are not expected to fail in the field and 2) relatively unreliable ("weak") components ("freaks") that, if shipped to the customer, will most likely fail in the field.

2.7. failure oriented accelerated testing (foat): predictive modeling (pm)

foat cannot do without simple and meaningful predictive models. it is on the basis of such models that one decides which parameter should be accelerated, how to process the experimental data and, most importantly, how to bridge the gap between what one "sees" as a result of the accelerated testing and what one will possibly "get" in actual operation conditions.
by considering the fundamental physics that might constrain the final design, pm can result in significant savings of time and expense and shed additional light on the physics of failure. pm can be very helpful in predicting reliability at conditions other than those of the foat and can provide important information about the device performance. modeling can be helpful in optimizing the performance and lifetime of the device, as well as in arriving at the best compromise between reliability, cost effectiveness and time-to-market. a good foat pm does not need to reflect all the possible situations, but should be simple, should clearly indicate what affects what in the given phenomenon or structure, and should be suitable and flexible for new applications, new environmental conditions and technology developments, as well as for the accumulation, on its basis, of reliability statistics. the scope of the model depends on the type and the amount of information available. a foat pm does not have to be comprehensive, but has to be sufficiently generic, and should include all the major variables affecting the phenomenon (failure mode) of interest. it should contain all the most important parameters that are needed to describe and characterize the phenomenon of interest, while parameters of the second order of importance should not be included in the model.
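as an illustration of how a simple predictive model can bridge accelerated-test results and field conditions, the sketch below uses the classical boltzmann-arrhenius relation in its acceleration-factor form, af = exp[(ea/k)(1/t_use - 1/t_test)], with temperatures in kelvin. it is a minimal sketch under stated assumptions, not the author's procedure: temperature is assumed to be the only acceleration stimulus, and the activation energy, the temperatures, the test lifetime and the function names are all hypothetical.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_acceleration_factor(ea_ev: float, t_use_c: float, t_test_c: float) -> float:
    """Acceleration factor of a test at t_test_c relative to use at t_use_c.

    Assumes a single thermally activated failure mechanism with
    activation energy ea_ev (eV); temperatures are given in deg C.
    """
    t_use_k = t_use_c + 273.15
    t_test_k = t_test_c + 273.15
    exponent = (ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_use_k - 1.0 / t_test_k)
    return math.exp(exponent)


def projected_field_mttf(test_mttf_hours: float, ea_ev: float,
                         t_use_c: float, t_test_c: float) -> float:
    """Scale a mean time to failure observed in the test to use conditions."""
    return test_mttf_hours * arrhenius_acceleration_factor(ea_ev, t_use_c, t_test_c)


if __name__ == "__main__":
    # Hypothetical inputs: 0.7 eV activation energy, 40 C use, 125 C test,
    # 1000 h mean time to failure observed in the accelerated test.
    af = arrhenius_acceleration_factor(0.7, 40.0, 125.0)
    print(f"acceleration factor: {af:.0f}")
    print(f"projected field mttf: {projected_field_mttf(1000.0, 0.7, 40.0, 125.0):.0f} h")
```

the projection is only as good as the assumed activation energy, which is why the text stresses that the model must capture the physics of the failure mode of interest.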
the most widespread foat pm are: the power law (used when the physics of failure is unclear); the boltzmann-arrhenius equation (used when there is a belief that elevated temperature is the major cause of failure) and its numerous extensions; the coffin-manson and related equations; crack growth equations (used to assess the fracture toughness of brittle materials); the miner-palmgren rule (used to consider the role of fatigue when the yield stress is not exceeded); creep rate equations; the weakest link model (used to evaluate the mttf in extremely brittle materials with defects); and the stress-strength interference model, which is, perhaps, the most flexible and well substantiated model.

2.8. safety factor (sf)

direct use of the probability of non-failure is often inconvenient, since, for highly reliable items, this probability is expressed by a number which is very close to one, and, for this reason, even significant changes in the item's (system's) design, which have an appreciable impact on the item's reliability, may have a minor effect on the probability of non-failure. in those cases when both the mean value, $\langle\psi\rangle$, and the standard deviation, $s_\psi$, of the margin of safety (or any other suitable characteristic of the item's reliability, such as stress, time-to-failure, temperature, displacement, affected area, etc.) are available, the safety factor (safety index, reliability index) sf can be used as a suitable reliability criterion. if the probability distribution density $f(\psi)$ of the random safety margin $\psi$ for the ttf is anticipated or established, then the mean value $\langle\psi\rangle$ and the standard deviation $s_\psi$ of this margin can be determined as

$$\langle\psi\rangle = \int_0^\infty f(\psi)\,\psi\,d\psi, \qquad s_\psi = \sqrt{\int_0^\infty f(\psi)\,\bigl(\psi - \langle\psi\rangle\bigr)^2\,d\psi},$$

and the corresponding sf can be evaluated as $\mathrm{sf} = \langle\psi\rangle / s_\psi$.
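the definition of the sf as the ratio of the mean safety margin to its standard deviation can be sketched numerically. the example below is an illustration under stated assumptions, not part of the original analysis: the sample of safety margins is hypothetical, the function names are invented for this sketch, and the conversion of the sf into a probability of non-failure assumes a normally distributed margin.

```python
import math
import statistics


def safety_factor(margins):
    """Return (mean, standard deviation, sf) for a sample of safety margins.

    The population standard deviation is used, mirroring the definition
    through the distribution density of the margin.
    """
    mean = statistics.fmean(margins)
    std = statistics.pstdev(margins)
    if std == 0.0:
        raise ValueError("zero standard deviation: sf is undefined")
    return mean, std, mean / std


def prob_non_failure_normal(sf):
    """Probability that the margin is positive, ASSUMING a normal distribution."""
    return 0.5 * (1.0 + math.erf(sf / math.sqrt(2.0)))


if __name__ == "__main__":
    # Hypothetical safety margins (capacity minus demand), arbitrary units.
    sample = [4.0, 5.0, 6.0, 5.0, 4.0, 6.0]
    mean, std, sf = safety_factor(sample)
    print(f"mean = {mean:.3f}, std = {std:.3f}, sf = {sf:.3f}")
    # Under the normality assumption, even moderate sf values give
    # probabilities of non-failure very close to one.
    for value in (3.0, 4.0, 5.0):
        print(f"sf = {value}: P(non-failure) = {prob_non_failure_normal(value):.9f}")
```

the loop shows why the sf is the more convenient criterion: the probabilities of non-failure for different sf values differ only in the digits far after the decimal point.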
the sf establishes both the upper limit of the reliability characteristic of interest (through the mean value of the corresponding margin of safety) and the accuracy with which this characteristic is defined (through the corresponding standard deviation). the structure of the sf indicates that it is acceptable that a system characterized by a high mean value of the safety margin (i.e., a system whose bearing capacity with respect to a certain stress/reliability characteristic is significantly higher than the level of loading) has a less accurately defined deviation from this mean value than a system characterized by a low mean value of the safety margin (i.e., a system whose bearing capacity is much closer to the possible level of loading). in other words, the uncertainty in the evaluation of the safety margin should be smaller for a more vulnerable design.

2.9. do opto-electronic (oe) industries need new approaches to qualify their devices into products?

it should be widely recognized that the probability of a failure is never zero, but can be predicted and, if necessary, controlled and maintained at an acceptably low level. one effective way to achieve this is to implement the existing methods and approaches of prm techniques and to develop adequate pdfr methodologies. these methodologies should be based mostly on foat and on a widely employed predictive modeling effort. foat should be carried out in a relatively narrow but highly focused and time-effective fashion for the most vulnerable elements of the design of interest. if the qt has a solid basis in foat, pm and pdfr, then there is reason to believe that the product of interest will be sufficiently robust in the field. the qt could be viewed as "quasi-foat", a sort of "initial stage of foat" that more or less adequately replicates the initial non-destructive, yet full-scale, stage of foat.
we expect that the suggested approach to the dfr and qt will be accepted by the engineering and manufacturing communities, implemented in engineering practice and adequately reflected in future editions of the qt specifications and methodologies. the pdfr-based qt will still be non-destructive. such qts could be designed, therefore, as a sort of mini-foat that, unlike the actual, "full-scale" foat, is non-destructive and conducted on a limited scale. the duration and conditions of such a "mini-foat" qt should be established based on the observed and recorded results of the actual foat, and should be limited to the stage when no failures in the actual full-scale foat were observed. prognostics and health management (phm) technologies (such as "canaries") should be concurrently tested to make sure that the safe limit is not exceeded. it is important to understand the reliability physics that underlies the mechanisms and modes of failure in electronics and photonics components and devices. no statistics can replace an understanding of the reliability physics underlying a particular design and its modes of failure. statistical assessments could and should be conducted when there is good reason to believe that an adequately reliable product is on the way. as to the foat, it should be thoroughly implemented, so that the qt is based on the foat information and data. the pdfr concept should be widely employed. since foat cannot do without predictive modeling, such modeling, both computer-aided and analytical, plays an important role in making the suggested new approach to product qualification practical and successful.

3. conclusion

the application of the methods and approaches of materials physics and structural analysis can be very helpful in creating viable and reliable fiber optics products and networks.
the probabilistic design for reliability (pdfr) concept enables one to design and fabricate a viable and reliable optoelectronic product.

references

[1] e. suhir, "structural analysis in microelectronics and fiber optics", van-nostrand, new york, 1991.
[2] e. suhir, r.c. cammarata, d.d.l. chung, m. jono, "mechanical behavior of materials and structures in microelectronics", mrs symposia proceedings, vol. 226, 1991.
[3] e. suhir, "structural analysis in fiber optics", in j. menon, ed., "trends in lightwave technology", council of scientific information, india, 1995.
[4] e. suhir, m. fukuda, and c.r. kurkjian, eds., "reliability of photonic materials and structures", mrs symposia proceedings, vol. 531, 1998.
[5] e. suhir, "the future of microelectronics and photonics and the role of mechanics and materials", asme j. electr. pack., march 1998.
[6] e. suhir, "fiber optics structural mechanics-brief review", editor's note, asme j. electr. pack., sept. 1998.
[7] e. suhir, "microelectronics and photonics – the future", microelectronics journal, vol. 31, no. 11-12, 2000.
[8] driessen, r. g. baets, j. g. mcinerney, and e. suhir, "laser diodes, optoelectronic devices, and heterogeneous integration", spie press, 2003.
[9] e. suhir, "microelectronic and photonic systems: role of structural analysis", interpack'2005, san francisco, july 2005.
[10] e. suhir, c.p. wong, y.c. lee, eds., "micro- and opto-electronic materials and structures: physics, mechanics, design, packaging, reliability", 2 volumes, springer, 2008.
[11] e. suhir, "optical fiber interconnects: design for reliability", society of optical engineers (spie), proc. of spie, vol. 7607 760717-8, 2010.
[12] b. welker, m. uschitsky, e. suhir, s. kher, g. bubel, "finite element analysis of the optical fiber structures", in e. suhir, ed., "structural analysis in microelectronics and fiber optics", symp. proc., asme press, 1996.
[13] e.
suhir, "modeling of the mechanical behavior of microelectronic and photonic systems: attributes, merits, shortcomings, and interaction with experiment", proc. 9-th int. congr. on experim. mech., orlando, fl., june 5-8, 2000.
[14] e. suhir, "thermo-mechanical stress modeling in microelectronics and photonics", electronic cooling, vol. 7, no. 4, 2001.
[15] e. suhir, "modeling of thermal stress in microelectronic and photonic structures: role, attributes, challenges and brief review", special issue, asme journal of electronic packaging, vol. 125, no. 2, june 2003.
[16] e. suhir, "predictive modeling is a powerful means to prevent thermal stress failures in electronics and photonics", chipscale reviews, vol. 15, no. 4, july-august 2011.
[17] e. suhir, "stress modeling in polymer coated optical glass fibers", session honoring prof. a. chudnovsky, 2014 antec, las vegas, nv, april 28-may 3, 2014.
[18] e. suhir, "mechanical behavior of materials in microelectronic and fiber optic systems: application of analytical modeling-review", mrs symp. proc., vol. 226, 1991.
[19] e. suhir, "analytical stress-strain modeling in photonics engineering: its role, attributes and interaction with the finite-element method", laser focus world, may 2002.
[20] e. suhir, "modeling of thermal stress in microelectronic and photonic structures: role, attributes, challenges and brief review", special issue, asme j. electr. packaging (jep), vol. 125, no. 2, june 2003.
[21] e. suhir, "analytical thermal stress modeling in physical design for reliability of micro- and opto-electronic systems: role, attributes, challenges, results", in e. suhir, c.p. wong, y.c. lee, eds., "micro- and optoelectronic materials and structures: physics, mechanics, design, packaging, reliability", springer, 2007.
[22] e. suhir, "analytical thermal stress modeling in electronic and photonic systems", asme app. mech. reviews, invited paper, vol. 62, no. 4, 2009.
[23] e.
suhir, "thermal stress failures: predictive modeling explains the reliability physics behind them", imaps advanced microelectronics, vol. 38, no. 4, july/august 2011.
[24] e. suhir, "bending performance of clamped optical fibers: stresses due to the end off-set", applied optics, vol. 28, no. 3, february 1989.
[25] e. suhir, "predicted curvature and stresses in an optical fiber interconnect subjected to bending", ieee/osa journal of lightwave technology, vol. 14, no. 2, 1996.
[26] e. suhir, "bending of a partially coated optical fiber subjected to the ends off-set", ieee/osa journal of lightwave technology, vol. 12, no. 2, 1997.
[27] e. suhir, "optical fiber interconnect subjected to a not-very-small ends off-set: effect of the reactive tension", mrs symposia proceedings, vol. 531, 1998.
[28] e. suhir, "bending stress in an optical fiber interconnect experiencing significant ends off-set", mrs symposia proceedings, vol. 531, 1998.
[29] e. suhir, "method and apparatus for proof-testing optical fibers", us patent #6,119,527, 1998.
[30] e. suhir, "optical fiber interconnect with the ends offset and axial loading: what could be done to reduce the tensile stress in the fiber?", j. appl. phys., vol. 88, no. 7, 2000.
[31] e. suhir, "method for determining and optimizing the curvature of a glass fiber for reducing fiber stress", us patent #6,016,377, 2000.
[32] e. suhir, "method of improving the performance of optical fiber, which is interconnected between two misaligned supports", u.s. patent #6,314,218, 2001.
[33] e. suhir, "interconnected optical devices having enhanced reliability", u.s. patent #6,327,411, 2001.
[34] e. suhir, "optical fiber interconnects having offset ends with reduced tensile strength and fabrication method", us patent #6,606,434, 2003.
[35] e. suhir, "analysis and optimization of the input/output fiber configuration in a laser package design", asme journal of electronic packaging, vol. 117, no. 4, 1995.
[36] e.
suhir, "optical glass fiber bent on a cylindrical surface", mrs symposia proceedings, vol. 531, 1998.
[37] e. suhir, "optimized configuration of an optical fiber "pigtail" bent on a cylindrical surface", in t. winkler and a. schubert, eds., "materials mechanics, fracture mechanics, micromechanics", anniversary volume in honor of b. michel's 50th birthday, fraunhofer izm, berlin, 1999.
[38] e. suhir, "method for determining and optimizing the curvature of a glass fiber for reduced fiber stress", us patent #6,016,377, 2000.
[39] j.b. murgatroyd, "the strength of glass fibres. part ii. the effect of heat treatment on strength", j. soc. glass tech., 28, 1944.
[40] d. sinclair, "a bending method for measurement of the tensile strength and young's modulus of glass fiber", journal of applied physics, vol. 21, 1950.
[41] j.t. krause, l.r. testardi, and r.n. thurston, "deviations from linearity in the dependence of elongation upon force for fibers of simple glass formers and of glass optical light-guides", physics and chemistry of glasses, vol. 20, 1979.
[42] p.w. france, m.j. paradine, m.h. reeve, and g.r. newns, "liquid nitrogen strength of coated optical glass fibers", journal of materials science, vol. 15, 1980.
[43] s.f. cowap, and s.d. brown, "static fatigue testing of a hermetically sealed optical fiber", american ceramic society bulletin, vol. 63, no. 3, 1984.
[44] m.j. matthewson, c. r. kurkjian and s. t. gulati, "strength measurement of optical fibers in bending", j. am. ceram. soc., vol. 69, no. 1, 1986.
[45] j.n. mcmullin, and j.e. freeman, "on the shape of a bent fiber", ieee/osa j. light-wave techn., vol. 8, no. 7, 1990.
[46] e. suhir, "effect of the nonlinear stress-strain relationship on the maximum stress in silica fibers subjected to two-point bending", applied optics, vol. 32, no. 9, 1993.
[47] m. muraoka, "the maximum stress in optical glass fibers under two-point bending", asme j. electr.
pack., vol. 123, march 2000.
[48] e. suhir, v. ogenko, d. ingman, "two-point bending of coated optical fibers", proceedings of the phomat'2003 conference, san-francisco, ca, august 2003.
[49] e. suhir, "stresses in dual-coated optical fibers", asme journal of applied mechanics, vol. 55, no. 10, 1988.
[50] o.s. gebizioglu, i.m. plitz, "self-stripping of optical fiber coatings in hydrocarbon liquids and cable filling compounds", optical engineering, vol. 30, no. 6, 1991.
[51] e. devadoss, "polymers for optical fiber communication systems", journal of scientific and industrial research, vol. 51, no. 4, 1992.
[52] s.t. shiue, "thermal stresses in tightly jacketed double-coated optical fibers at low temperature", journal of applied physics, vol. 76, no. 12, 1994.
[53] e. suhir, "approximate evaluation of the interfacial shearing stress in circular double lap shear joints, with application to dual-coated optical fibers", int. j. solids and structures, vol. 31, no. 23, 1994.
[54] p. ostojic, "stress enhanced environmental corrosion and lifetime prediction modeling in silica optical fibers", journal of materials science, vol. 30, no. 12, 1995.
[55] w.w. king, and c.j. aloisio, "thermomechanical mechanism for delamination of polymer coatings from optical fibers", asme journal of electronic packaging, vol. 119, no. 2, 1997.
[56] e. suhir, "thermal stress failures in microelectronics and photonics: prediction and prevention", future circuits international, issue #5, 1999.
[57] e. suhir, "thermomechanical stress modeling in microelectronics and photonics", electronic cooling, vol. 7, no. 4, 2001.
[58] e. suhir, "polymer coated optical glass fibers: review and extension", proceedings of the polytronik'2003, montreux, october 21-24, 2003.
[59] e. suhir, "mechanics of coated optical fibers: review and extension", ectc'2005, orlando, florida, 2005.
[60] e. suhir, j. nicolics, c. gu, a. bensoussan, l.
bechou, "analytical stress model for the evaluation of thermal stresses in a cylindrical tri-material body with application to optical fibers", j. electrical and control engineering, vol. 3, no. 5, december 2013.
[61] e. suhir, "thermal stress failures in electronics and photonics: physics, modeling, prevention", j. thermal stresses, june 3, 2013.
[62] e. suhir, "predicted thermal mismatch stresses in a cylindrical bi-material assembly adhesively bonded at the ends", asme j. appl. mech., vol. 64, no. 1, 1997.
[63] e. suhir, "thermal stress in a polymer coated optical glass fiber with a low modulus coating at the ends", j. mat. res., vol. 16, no. 10, 2001.
[64] e. suhir, "coated optical glass fiber", us patent #6,647,195, 2003.
[65] e. suhir, "on a paradoxical situation related to bonded joints: could stiffer mid-portions of a compliant attachment result in lower thermal stress?", jsme j. solid mech. and materials engineering (jsmme), vol. 3, no. 7, 2009.
[66] katsuyama, y. mitsunaga, y. isida, and k. ishihara, "transmission loss of coated optical fiber at low temperature", appl. opt., no. 22, 1983.
[67] d.c.l. vangheluwe, "exact calculation of the spring constant in the buckling of optical fibers", appl. opt., 23, 1984.
[68] e. suhir, "effect of the initial curvature on the low temperature microbending in optical fibers", ieee/osa journal of lightwave technology, vol. 6, no. 8, 1988.
[69] e. suhir, "spring constant in the buckling of dual-coated optical fibers", ieee/osa journal of lightwave technology, vol. 6, no. 7, 1988.
[70] s.t. shiue, "design of double-coated optical fibers to minimize hydrostatic-pressure-induced microbending losses", ieee photonics technology letters, vol. 4, no. 7, 1992.
[71] s.t. shiue, and s.b. lee, "thermal stresses in double-coated optical fibers at low temperature", journal of applied physics, vol. 72, no. 1, 1992.
[72] s.t.
shiue, "axial strain-induced microbending losses in double-coated optical fibers", journal of applied physics, vol. 73, no. 2, 1993.
[73] f. cocchini, "double-coated optical fibers undergoing temperature variations-the influence of the mechanical behavior on the added transmission losses", polymer engineering and science, vol. 34, no. 5, 1994.
[74] s.t. shiue, "the axial strain-induced stresses in double-coated optical fibers", journal of the chinese institute of engineers, vol. 17, no. 1, 1994.
[75] s.t. shiue, "thermally induced microbending losses in double-coated optical fibers at low temperature", materials chemistry and physics, vol. 38, no. 2, 1994.
[76] e. suhir, v. mishkevich, j. anderson, "how large should a periodic external load be to cause appreciable microbending losses in a dual-coated optical fiber?", in e. suhir, ed., "structural analysis in microelectronics and fiber optics", asme press, 1995.
[77] s.t. shiue, "the spring constant in the buckling of tightly jacketed double-coated optical fibers", j. appl. phys., vol. 81, no. 8, 1997.
[78] e. suhir, "how long should a beam specimen be in bending tests?", asme journal of electronic packaging, vol. 112, no. 1, 1990.
[79] e. suhir, "stresses in a coated glass fiber stretched on a capstan", applied optics, vol. 29, no. 18, 1990.
[80] e. suhir, "can the curvature of an optical glass fiber be different from the curvature of its coating?", international journal of solids and structures, vol. 30, no. 17, 1993.
[81] e. suhir, "analytical modeling of the interfacial shearing stress during pull-out testing of dual-coated lightguide specimens", applied optics, vol. 32, no. 7, 1993.
[82] e. suhir, "pull testing of a glass fiber soldered into a ferrule: how long should the test specimen be?", applied optics, vol. 33, no. 19, 1994.
[83] e. suhir, l. bechou, "saint-venant's principle and the minimum length of a dual-coated optical fiber specimen in reliability (proof) testing", esref, arcachon, france, 2013.
[84] e.
suhir, ―elastic stability, free vibrations, and bending of optical glass fibers: the effect of the nonlinear stress-strain relationship‖, applied optics, vol.31, vol.24, 1992. [85] s.t. shiue, ―the hydrostatic pressure induced stresses in double-coated optical fibers‖, journal of the chinese institute of engineers‖, vol.17, no.4, 1994. [86] e. suhir, ―coated optical fiber interconnect subjected to the ends offset and axial loading‖, int. workshop on reliability of polymeric materials and plastic packages of ic devices, paris, nov. 29dec.2, 1998, asme press, 1998. [87] e. suhir, ―critical strain and postbuckling stress in polymer coated optical fiber interconnect: what could be gained by using thicker coating?‖, int. workshop on reliability of polymeric materials and plastic packages of ic devices, paris, nov. 29-dec.2, 1998, asme press, 1998. [88] e. suhir, ―elastic stability of a dual-coated optical fiber of finite length‖, j. appl. physics, vol.102, no.5, 2007. [89] e. suhir, ―elastic stability of a dual-coated optical fiber with a stripped off coating at its end‖, j. appl. physics, vol. 102, no.4, 2007. [90] e. suhir, ―lateral compliance of a compressed cantilever beam, with application to micro-electronic and fiber-optic structures‖, j. appl. physics d, vol.41,no.1, 2008. [91] e. suhir, ―elastic stability of a dual-coated fiber‖, spie paper #8621-37, photonics west, february 2011. [92] e. suhir, ―thermally induced stresses in an optical glass fiber soldered into a ferrule‖, ieee/osa journal of lightwave technology, vol.12, no.10, 1994. [93] e. suhir, ―solder materials and joints in fiber-optics: reliability requirements and predicted stresses‖, proceedings of the international symposium ―design and reliability of solder joints and solder interconnections‖, orlando, fl., 1997. [94] e. suhir, ―elastic stability, free vibrations, and bending of optical glass fibers: the effect of the nonlinear stress-strain relationship‖, applied optics, vol.31, no.24, 1992. 
[95] e. suhir, ―is the maximum acceleration an adequate criterion of the dynamic strength of a structural element in an electronic product?‖, ieee transactions on components, packaging and manufacturing technology, vol.20, no.4, 1997. [96] e. suhir, ―dynamic response of microelectronics and photonics systems to shocks and vibrations‖, interpack’1997 proc., hawaii, june 15-19, 1997. fiber optics engineering: physical design for reliability 181 [97] e. suhir, ―could shock tests adequately mimic drop test conditions?‖, ieee ectc conference proceedings, san-diego, ca, may 28-31, 2002. [98] c.y. zhou, t.x. yu, e. suhir, ―design of shock table tests to mimic real-life drop conditions‖, ieee cpmt transactions, vol.32, no.4, 2009. [99] e. suhir, m. vujosevic, and t. reinikainen, ―nonlinear dynamic response of a ―flexible-and-heavy‖ printed circuit board (pcb) to an impact load applied to its support contour‖, j. appl. physics, d, 42, no.4, 2009. [100] e. suhir,―linear response to shocks and vibrations‖, in e. suhir, d.steinberg and t.yu, ―structural dynamics of electronic and photonic systems‖, john wiley, hoboken, nj., 2011. [101] e. suhir, ―linear and nonlinear vibrations caused by periodic impulses‖. in e.suhir, d.steinberg and t.yu, ―structural dynamics of electronic and photonic systems‖, john wiley, hoboken, nj., 2011. [102] e. suhir, ―random vibrations of structural elements in electronic and photonic systems‖, in e.suhir, d.steinberg and t.yu, ―structural dynamics of electronic and photonic systems‖, john wiley, hoboken, nj., 2011. [103] c.y. zhou, t.x. yu, s.w. ricky lee and e. suhir, ―shock test methods and test standards for portable electronic devices‖, in e. suhir, d. steinberg and t. yu, ―structural dynamics of electronic and photonic systems‖, john wiley, hoboken, nj., 2011. [104] e. suhir, ―linear response of a single-degree-of-freedom system to an impact load: could shock tests adequately mimic drop test conditions?‖, in e. suhir, d. steinberg and t. 
yu, ―structural dynamics of electronic and photonic systems‖, john wiley, hoboken, nj., 2011. [105] e. suhir, ―predictive modeling of the dynamic response of electronic systems to shocks and vibrations‖, asme appl. mech. reviews, vol. 63, no.5, march, 2011. [106] e. suhir, ―structural dynamics of electronics systems‖, modern physics letters b (mplb), vol. 27, no. 7, march 2013. [107] e. suhir, and d. ingman, ―new hermetic coating for optical fiber dramatically improves strength: new nano-particle material (npm) and npm-based new generation of optical fiber claddings and coating‖, us navy workshop, st. louis, mo, 2003: could nano-technology make a difference?‖, polytronic‘04, portland, or, september 13-15, 2001. [108] e. suhir, ―polymer coated optical glass fiber reliability: could nano-technology make a difference?‖, polytronic‘04, portland, or, september 13-15, 2004. [109] e. suhir, ―new nano-particle material (npm) for microand opto-electronic packaging applications‖, ieee workshop on advanced packaging materials, irvine, march 2005. [110] d. ingman and e. suhir, ―optical fiber with nano-particle overclad‖, us patent, #7,162,138 b2, 2007. [111] d. ingman and e. suhir, ―optical fiber with nano-particle cladding‖, us patent, #7,162,137 b2, 2007. [112] e. suhir, ―fiber-optics structural mechanics and nano-technology based new generation of fiber coatings: review and extension‖, in e. suhir, cp wong, yc lee, eds. ―microand opto-electronic materials and structures: physics, mechanics, design, packaging, reliability‖, springer, 2007. [113] e. suhir, d. ingman, ―highly compliant bonding material and structure for microand optoelectronic applications‖, in e. suhir, cp wong, yc lee, eds. ―microand opto-electronic materials and structures: physics, mechanics, design, packaging, reliability‖, springer, 2007. [114] t. mirer, ,d. ingman, e. suhir, ―reliability improvement through nano-particle-material-based fiber structures‖, optical fiber technology, v. 13, 2007. [115] e. 
suhir, ―polymer coating of optical silica fibers, and a nanomaterial-based coating system‖, keynote presentation, polytronic‘2007, proceedings of the international conference on polymeric materials for microand opto-electronics applications, tokyo, japan, january 14-16, 2007. [116] d. ingman, v. ogenko, e. suhir, a. glista, ―moisture resistant nano-particle material and its applications‖, us patent #7,321,714b2, 2008. [117] e. suhir, ―mechanical approach to the evaluation of the low temperature threshold of added transmission losses in single-coated optical fibers‖, ieee/osa journal of light-wave technology, vol.8, no.6, 1990. [118] e. suhir, ―free vibrations of a fused bi-conical taper lightwave coupler‖, int. j. solids and structures, vol. 29, no. 24, 1992. [119] e. suhir, ―vibration frequency of a fused bi-conical taper (fbt) lightwave coupler‖, ieee/osa journal of lightwave technology, vol. 10, no. 7, 1992. [120] e. suhir, ―predicted stresses and strains in fused bi-conical taper couplers subjected to tension‖, applied optics, vol. 32, no. 18, 1993. 182 e. suhir [121] e. suhir, and j.j. vuillamin, jr., "effects of the cte and young's modulus lateral gradients on the bowing of an optical fiber: analytical and finite element modeling", optical engineering, vol. 39, no. 12, 2000. [122] e. suhir, ―apparatus and method for thermostatic compensation of temperature sensitive devices‖, us patent #6,337,932, 2002. [123] e. suhir, r. mahajan, ―are current qualification practices adequate?―, circuit assembly, april 2011 [124] e. suhir,‖accelerated life testing (alt) in microelectronics and photonics: its role, attributes, challenges, pitfalls, and interaction with qualification tests‖, asme j. electr. packaging (jep), vol. 124, no. 3, 2002. [125] e. suhir, ―failure-oriented-accelerated-testing (foat) and its role in making a viable ic package into a reliable product‖, circuits assembly, july 2013. [126] e. suhir, a. bensoussan, j. nicolics, l. 
bechou, ―highly accelerated life testing (halt), failure oriented accelerated testing (foat), and their role in making a viable device into a reliable product‖, 2014 ieee aerospace conference, big sky, montana, march 2014. [127] e. suhir, ―failure-oriented-accelerated-testing (foat) and its role in making a viable package into a reliable product‖, semi-term 2014, san jose, ca, march 9-13, 2014. [128] e. suhir, ―how to make a photonic device into a product: role of accelerated life testing‖, keynote address at the international conference of business aspects of microelectronic industry, hong-kong, january 2003. [129] e. suhir,―reliability and accelerated life testing‖, semiconductor international, february 1, 2005. [130] e. suhir,―when reliability is imperative, ability to quantify it is a must‖, imaps advanced microelectronics, august 2012. [131] e. suhir, "applied probability for engineering and scientists", mcgraw hill, new york, 1997. [132] e. suhir, ―thermal stress modeling in microelectronics and photonics packaging, and the application of the probabilistic approach: review and extension‖, imaps int. j. of microcircuits and electronic packaging, vol.23, no.2, 2000 (invited). [133] e. suhir, ―probabilistic design for reliability‖, chipscale reviews, vol.14, no.6, 2010. [134] e. suhir, ―remaining useful lifetime (rul): probabilistic predictive model‖, int. j. of phm, vol 2(2), 2011. [135] e. suhir, r. mahajan, a. lucero, l. bechou, ―probabilistic design for reliability (pdfr) and a novel approach to qualification testing (qt)‖, 2012 ieee/aiaa aerospace conf., big sky, montana, 2012 [136] e. suhir, ―how long could/should be the repair time for high availability?‖, modern physics letters b (mplb), vol.27, aug.30, 2013. [137] e. suhir, ―could electronics reliability be predicted, quantified and assured?‖ microelectronics reliability, no. 53, april 15, 2013. [138] e. 
facta universitatis series: electronics and energetics vol. 27, no 2, june 2014, pp. 205-219 doi: 10.2298/fuee1402205h
demonstration of protein hydrogen bonding network application to microelectronics
marin h. hristov 1, rostislav p. rusev 2, george v. angelov 1, elitsa e. gieva 1
1 technical university of sofia/department of microelectronics, sofia, bulgaria
2 technical university of sofia/department of technology and management of communication systems, sofia, bulgaria
abstract. a model of the hydrogen bonding networks in the active site of β-lactamase during the last intermediate (ey) of the acylenzyme reaction semicycle is presented. the i-v characteristics of each hydrogen bond are calculated following marcus theory and the theory of protein electrostatics.
simulations showed that the hbn characteristics are similar to those of microelectronic devices such as an amplifier, a signal modulator, and a triangular pulse source. the results demonstrate the analogy between the hbns in the active site of the β-lactamase protein and a microelectronic integrated circuit with multiple outputs, each with different characteristics.
key words: bioelectronics, microelectronics, proteins, hydrogen bonding networks, β-lactamase, acylenzyme reaction, proton transfer.
received january 20, 2014. corresponding author: george v. angelov, technical university of sofia/department of microelectronics, sofia, bulgaria (e-mail: angelov@ecad.tu-sofia.bg)
1. introduction
bioelectronics is a relatively new field associated with the integration of biomolecules with electronic elements to yield functional devices. the first molecular materials used in electronics originate from material science and are related to the development of electronic and optoelectronic devices that utilize the macroscopic features of organic compounds. the remarkable biochemical and biotechnological progress in tailoring new biomaterials by genetic engineering or bioengineering provides unique and novel means to synthesize new enzymes and protein receptors, and to engineer monoclonal antibodies for nonbiological substrates (such as explosives or pesticides) and dna-based enzymes. all these materials provide a broad platform of functional units for integration with electronic elements.
after many years of research, organic light-emitting devices, synthetic electronic circuits, and chemical and biochemical sensors have drawn attention. molecular scale devices attract substantial research efforts because of the fundamental scientific questions they raise and the potential practical applications of the systems. in particular, efforts are focused on the behavior of single molecules or groups of molecules and on precise 3d position control over single atoms and molecules.
the major activities in the field of bioelectronics relate to the development of biosensors that transduce biorecognition or biocatalytic processes into electronic signals [1], [2], [3]. there are certain applications of bioelectronic molecular structures in switches, dna and other molecular devices that could be implemented in standard solid-state silicon electronics [4]; such molecular structures could become real competitors to state-of-the-art silicon microelectronic devices. other research efforts are directed at utilizing the biocatalytic electron transfer functions of enzymes to assemble biofuel cells that convert organic fuel substrates into electrical energy [5], [6]. exciting opportunities exist in the electrical interfacing of neuronal networks with semiconductor microstructures. the excitation of ion conductance in neurons may be followed by electron conductance of semiconductor devices, thus opening the way to future neuron-semiconductor hybrid systems for dynamic memory and active learning [7].
one goal of molecular electronics is to imitate the complex behavior of solid-state circuits in molecular structures, which would allow the creation of bioelectronic devices. biological molecules – namely enzymes, proteins, and dna – are unique in that they have benefited from natural selection and evolution, which has resulted in highly optimized properties custom-tailored for specific biological functions. these molecules have evolved to function in a wide range of environmental conditions, often with efficiencies unmatched by nonbiological or synthetic methods. rational selection by the researcher can bring their novel functionalities to device applications.
from the standpoint of applying bio-objects in devices similar to conventional microelectronic devices, proteins are sensitive to the presence of many types of molecules, through both specific and nonspecific binding. this is especially true for enzymes, which participate in a large variety of specific interactions with small molecules. the primary result is a conformational modification of the protein, resulting in a change of activity. molecules that interact with proteins through nonspecific or indirect interactions typically disrupt noncovalent bonds, which in turn alters the structure. the challenge is to detect these changes and transform them into signals that can be processed as in solid-state microelectronic devices. for certain classes of proteins, e.g. photoactive yellow protein (pyp) [8], green fluorescent protein [9], and bacteriorhodopsin [10], signal formation and processing is simpler because these proteins produce (or at least can be made to produce) a measurable response that can be sensed in the form of a signal.
understanding charge transport through biological structures is essential for the problem of signal formation. this understanding is constantly evolving thanks to intensive theoretical and experimental work. the contributions of the marcus theory [11], the superexchange charge transfer theory [12], and the definition of superior tunneling paths in proteins [13] have had an enormous impact on the understanding of biological processes in numerous electrochemical and photoelectrochemical biosensing systems.
bacteriorhodopsin (br) has drawn a large amount of attention from the perspective of both the basic and applied sciences. it is a unique protein because it acts as a light-driven engine that converts light into chemical energy in an efficient manner. bacteriorhodopsin possesses hydrogen bonding networks that execute proton transport.
this implies that proteins with hydrogen bonding networks (hbns) can process information. one such protein with hydrogen bonding networks is β-lactamase [14]. in this paper, we demonstrate the application of the hbns in the active site of the β-lactamase protein as an analog of a microelectronic integrated circuit with multiple outputs, each with different characteristics.
2. hbn modeling approach
signal transfer in hydrogen bonding networks (hbns) is carried out by protons. the model of proton transfer in hydrogen bonds is based on marcus theory and the protein electrostatic theory [15]. the proton current in the hydrogen bonds depends on the value of ph. changing the ph causes polarization and ionization of the protein groups. as a result, the charges in the protein-water system are redistributed and the donor/acceptor electrostatic potentials change. the proton transfer parameter (and hence the proton current) between donor and acceptor changes as well.
in traditional microelectronic four-terminal elements, the input circuit correlates to the output circuit and the electrical current is formed by electrons. by analogy, protein hydrogen bonds that are connected in an hb network can be modeled as four-terminal block-elements; the current in each block-element is formed by a transfer of protons between the donor and acceptor parts of the heavy atoms in the network [16]. this analogy allows us to model hbns with four-terminal circuit block-elements. the i-v characteristics of each block-element are proportional to the k-v characteristics of the respective hydrogen bonds. the current (i) of each block-element represents the proton transfer parameter (k) of each hydrogen bond, and the voltage (v) of each block-element represents the electrostatic potential (el. pot.) [16].
2.1. types of hbns studied
in our earlier research in the field of bioelectronics [17]-[21], we investigated the β-lactamase protein and, in particular, its hydrogen bonding networks (hbns). we examined different types of hbns, including branching hbns, linear hbns with protein residues and water molecules, hbns from the protein main chain, and hbns in the active site of the protein. we compared the characteristics of the hydrogen bonds to those of various well-known electronic elements such as transistors, amplifiers, filters, current sources, decoders, etc.
a branching network is depicted in fig. 1. linear networks with and without a water molecule are given in fig. 2. these hydrogen bonding networks, extracted from β-lactamase, consist of residues in the periphery of the protein and water molecules. b. atanasov et al. [22] assume that proton transfer in the active site hbns of β-lactamase is performed during the interaction of the protein with the ligand (acylenzyme reaction).
fig. 1 hydrogen bonding network. nh1, nh2, and ne are nitrogen atoms of arginine residue r164; oe1 and oe2 are carboxyl oxygen atoms of glutamic acid residue e171; od1 and od2 are carboxyl oxygen atoms of aspartic acid residues; oh are oxygen atoms of water molecules (w295, w753 and 859w)
fig. 2 hydrogen bonding network, virtually separated into two parts (with dashed line). (m182) is a methionine residue, og1 is the hydroxyl oxygen of threonine residues (t160, 181, 189), od2 is the carboxyl oxygen of aspartic acid residue (d157), nz is the nitrogen atom of lysine residue (k192), oh is the oxygen atom of water molecules (w356, 440)
2.2. acylenzyme reaction
the hbns in the active site of the β-lactamase protein during the acylenzyme reaction are given in fig. 3. there are two hbns participating in the acylenzyme reaction. the first hbn, referred to as nucleophilic, consists of residues s70, w297, n170, e166, k173, n132.
the second hbn, referred to as electrophilic, consists of s130, k234, w309, d214, s235. it should be noted that proton transfer proceeds in parallel in both hbns during the different intermediates of the reaction. the acylenzyme reaction cycle has the following intermediates: (e) the networks in the active site of the free enzyme; (es) the formation of the michaelis complex; (t1) the transient state of the reaction, where the networks change due to the nucleophilic attack by s70; (ey) the end of the reaction, when the acylenzyme is formed and the networks of hydrogen bonds have changed due to the opening of the ligand ring and the binding of the ligand to s70.
the catalyzed β-lactam nitrogen protonation is supposed to be energetically favored as the initiating event, followed by nucleophilic attack on the carbonyl carbon of the lactam group. nitrogen protonation is catalyzed through a hydrogen bonding network involving the 2-carboxylate group of the substrate, the s130 and k234 residues, and a water molecule. the nucleophilic attack on the carbonyl carbon is carried out by s70, with deprotonation catalyzed by a water molecule hydrogen-bonded to the side chain of e166.
fig. 3 acylenzyme reaction intermediates – hbns in the active site of: (e) free enzyme, (es) michaelis complex, (t1) transient state, (ey) acyl enzyme. the first hbn, referred to as "nucleophilic", consists of residue s70, water molecule w297, residues n170, e166, k173, n132 and the ligand. the second, referred to as "electrophilic", consists of residues s130, k234, water molecule w309, and residues d214, s235
proton transfer between donor and acceptor of each hydrogen bond in the protein is studied following marcus theory.
the proton transfer parameter $K$ is calculated by
$$K = \frac{k_B T}{2\pi} \exp\left(-\frac{E_b - h\omega/2}{k_B T}\right) \quad (1)$$
where $k_B$ is the boltzmann constant, $E_b$ is the barrier energy, $h$ is the planck constant, $\omega$ is the frequency, and $T$ is the temperature in kelvins. the energy barrier is calculated by:
$$E_b = \left(s_A (R(DA) - t_A)^2 + v_A\right) + s_B E_{12} + \left(s_C \exp(-t_C (R(DA) - 2)) + v_C\right) E_{12}^2 \quad (2)$$
where $R(DA)$ is the distance between donor and acceptor and $E_{12}$ is the difference between the energies of donor and acceptor (cf. two-well potential).
$K$ has the dimension of free energy and, from the calculations, it can be interpreted as follows: the greater the parameter $K$, the more readily proton transfer is accomplished between donor and acceptor in the hbns, i.e. the greater the proton current. on the other hand, the proton transfer parameter depends on the donor/acceptor potentials in the same way that microelectronic components depend on their supply potentials. therefore we can construct three- and four-terminal electronic block-elements analogous to the hydrogen bonds in the following way.
the electrostatic potential of each protein atom is calculated by protein electrostatic theory. both the $K$ parameter and the electrostatic potential depend on the ph of the environment. the electrostatic potential of the donor/acceptor of each hydrogen bond can be compared to the voltage of a conventional microelectronic circuit device, and the proton transfer parameter $K$ can be compared to the current of a circuit device. this lets us introduce the analogy between hydrogen bonds and standard microelectronic devices, and correspondingly between hbns and microelectronic circuits. in particular, we will study the behavior of circuits of block-elements, modeled with polynomials, in matlab [23]. afterwards, the matlab simulations with polynomials are compared to the results of simulations obtained using marcus theory in [24].
3. circuit model
3.1. circuit formulation
we create circuit block-elements corresponding to each hbn in order to emulate the operation of the hbns. each hbn is divided into the heavy atoms that form the hydrogen bonds (fig. 4).
fig. 4 sample hbn with its heavy atoms x, d, and a
fig. 5 block-element that is analogous to a heavy atom from the hydrogen bonding network
in the analogous circuit each heavy atom (which is both donor and acceptor, designated with 'x') is represented as a separate block-element (fig. 5). in fig. 5 the acceptor part of the heavy atom 'x' is assigned as the input of the respective block-element, where the input current iin flows in; the donor part of the heavy atom 'x' is assigned as the output of the respective block-element, where the output current iout flows out. the potentials at the input and the output of the block-element are equal to the potential u of the heavy atom. the magnitudes of the input and output currents are proportional to the proton transfer parameter of the hydrogen bonding network where the heavy atom is present.
in each protein hbn we can find a strong proton donor, a strong proton acceptor, and atoms exhibiting both donor and acceptor properties. the strong proton donor of each hbn always behaves as a circuit input and the strong acceptor of each hbn always behaves as a circuit output.
we consider the application of the hbn properties on the example of the hydrogen bonding network in the active site during the last intermediate of the acylenzyme reaction (case ey of fig. 3). both hbns in the active site of the β-lactamase protein are represented with the analogous circuit depicted in fig. 6. this circuit could be considered as an "integrated" circuit built of molecules.
fig. 6 correspondence between protein residues and water molecules in the ey case and the respective block-elements: k73 and k234 correspond to the t1 and t10 block-elements (i.e. they are circuit inputs), e166 → t4 (uout1), n170 and w297 → t2, n132 → t5, n170 → t6, d214 → t12, s130og → t13, s235og → t14
the aim of the modeling effort is to describe the hbn in the active site of the β-lactamase protein; the equivalent circuit of the hydrogen bonding network is given in fig. 6, which is analogous to the network in fig. 3 (ey). the equivalent circuit consists of two subcircuits corresponding to the nucleophilic and electrophilic hbns of the acyl enzyme (ey), respectively. proton transfer depends on the ph of the environment and the interaction of the enzyme with the ligand. therefore, the two circuits are bound together in a common circuit (which can be considered as an "integrated" circuit) and cannot be separated. the ligand is represented in the equivalent circuit by the switch s: the switching of elements and the change of i/o currents depend on the position of s, so the position of the switch in effect selects the intermediate of the acylenzyme reaction. it should be noted that the ligand formation and its charge in the active site strongly affect proton transfer through the hbns.
the input of the first electric circuit is denoted by uin1; this circuit is analogous to the nucleophilic hbn. the t1 block-element corresponds to the k73nz lysine, which here behaves as a proton donor and is therefore interpreted as a current source in the circuit; t1 has equal input and output voltages but different input and output currents. t2 substitutes the water molecule w297. there is no block-element designated "t3" here because we examine the last intermediate of the acylenzyme reaction, in which the s70og residue has already completed the reaction. in the other intermediates of the reaction the water molecule is represented by the t3 block-element; since s70 no longer takes part, the water molecule block-element is renumbered as t2.
t4 is juxtaposed to e166, which is a proton acceptor and can form different hydrogen bonds; t4 sums three input currents. the t5 and t6 block-elements are analogous to the n132 and n170 asparagines, respectively. asparagines can be both donors and acceptors and hence they can alter the current direction; in the circuit, this is modeled by the s switch (s corresponds to the ligand).
the input of the second electric circuit is denoted by uin2; this circuit is analogous to the electrophilic hbn. the t10 block-element corresponds to the k234nz proton donor. it is a current source (similarly to the current source t1 in the first electric circuit) and again it has equal input and output voltages but different input and output currents. t11 is analogous to the water molecule w309. t12 is juxtaposed to the d214 residue, which is in fact an output of the circuit. t13 represents the s130og residue; it has equal input and output voltages but different input and output currents. t14 is the other output of the circuit; it is compared to the s235og residue and has the same properties as s130og.
3.2. equation formulation
the i-v characteristics of the block-elements, which correspond to the proton transfer parameter and electrostatic potential of the hydrogen bonds, are coded in matlab. because the acyl reaction goes together with proton transfer, the proton transfer through each hydrogen bond in each reaction intermediate is simulated. this proton transfer is also compared to the current flow through known electronic elements. the relations between currents and voltages of each block-element in the equivalent circuit are given by polynomials. first, we list the equations for the nucleophilic subcircuit.
Equations (3) and (4) describe the voltage and current of the first output of T1 (the input voltage $U_{in}$ is between -2.3 V and +2.2 V):

$U_{in} = U_1$  (3)
$I_1 = 3 \cdot 10^{-5} U_1^2 + 0.0004 U_1 + 0.0045$  (4)

The I-V equations for T5 are:

$U_{51} = 0.0516 U_1 - 0.2473$  (5)
$U_5 = 1.0595 U_{51} - 0.242$  (6)
$I_5 = -3 \cdot 10^{-6} U_5^4 + 5.5 \cdot 10^{-6} U_5^3 + 2 \cdot 10^{-5} U_5^2 - 5 \cdot 10^{-5} U_5 + 1.2 \cdot 10^{-4}$  (7)

The T2 block-element is modeled by:

$U_2 = 1.0211 U_1 - 0.1005$  (8)
$I_2 = -1.0658 U_2^3 - 0.1179 U_2^2 + 10.912 U_2 + 151.84$  (9)

The voltage and current of T6, which is output no. 3 of the circuit, are:

$U_6 = 1.1204 U_1 - 0.3978$  (10)
$I_6 = 0.0045 U_6^2 + 0.0139 U_6 + 0.0251$  (11)

T4, which is output no. 1 of the circuit, is described by

$U_4 = 1.0732 U_2 - 0.1933$  (12)

The current of T4 is the sum of all input currents:

$I_4 = I_3 + I_5 + I_6 = I_{out1}$  (13)

The equations for the electrophilic subcircuit of the HBNs are listed below. The T10 block-element is modeled by:

$U_{10} = 0.9658 U_1 + 0.2266$  (14)
$I_{10} = -0.0062 U_{10}^3 - 0.002 U_{10}^2 + 0.0751 U_{10} + 0.5283$  (15)

The I-V characteristics of T11 are:

$U_{11} = 1.002 U_{10} + 0.1153$  (16)
$I_{11} = -3 \cdot 10^{-6} U_{11}^5 + 9 \cdot 10^{-6} U_{11}^4 + 1.7 \cdot 10^{-5} U_{11}^3 - 4 \cdot 10^{-5} U_{11}^2 - 9.2 \cdot 10^{-5} U_{11}^2 + 0.00025$  (17)

For T12, which is circuit output no. 12, we have:

$U_{12} = 0.9904 U_{11} + 0.4309$  (17)
$I_{12} = I_{11}$  (18)

Circuit output no. 13 is described by:

$U_{13} = 1.0318 U_{10} - 0.1714$  (19)
$I_{13} = 0.0008 U_{13}^3 - 0.0018 U_{13}^2 - 0.0001 U_{13} + 0.0042$  (20)

The last output, no. 14, is modeled by:

$U_{14} = 1.01 U_{11} - 0.1509$  (21)
$I_{14} = -0.0431 U_{14}^4 + 0.1242 U_{14}^3 + 0.1987 U_{14}^2 - 0.1987 U_{14} + 12.893$  (22)

3.3. MATLAB code

Below we list an excerpt of the MATLAB code used to model the equivalent circuit behavior:

    % Block T14 (S235), Out14 -> 1 input, 1 output arguments
    % function -> 1inp-1out
    % Equation for U14 = f(U11), Eq. (21)
    u14 = 1.01*u11 - 0.1509;
    plot(u11, u14, 'linewidth', 2);
    set(gca, 'fontweight', 'b', 'fontsize', 14)
    grid on
    title('T14 Out14');
    xlabel('U11 [V]');
    ylabel('Uout14 [V]');
    legend('simulation', 'data');
    set(legend('simulation', 'data', 1), 'fontsize', 12);
    pause;

    % Equation for I14 = f(U14), Eq. (22), inp1 = out1
    i14 = -0.0431*u14.^4 + 0.1242*u14.^3 + 0.1987*u14.^2 - 0.1987*u14 + 12.893;
    plot(u14, i14, 'linewidth', 2);
    set(gca, 'fontweight', 'b', 'fontsize', 14)
    grid on
    title('T14 Out14');
    xlabel('Uout14 [V]');
    ylabel('Iout14 [pA]');
    legend('simulation', 'data');
    set(legend('simulation', 'data', 1), 'fontsize', 12);
    pause;

    % ++++++++++++++++++++++++++
    plot(u7, i51, '-r', u7, i7, '--b', 'linewidth', 2);
    set(gca, 'fontweight', 'b', 'fontsize', 14)
    grid on
    title('I51, I7 vs U7');
    xlabel('U7 [V]');
    ylabel('I51, I7 [pA]');
    set(legend('I51', 'I7', 12), 'fontsize', 12);
    h = legend('cos', 'sin', 2);
    pause;

4. Simulation results and discussion

We perform DC and transient analyses in MATLAB to study the circuit behavior in different modes of operation.

4.1. DC analysis

The DC analysis is carried out by sweeping the input voltage between -2.3 V and +2.2 V. The results simulated with the above polynomial equations are compared to the results of [24] (cf. Fig. 7). The polynomials describe the behavior of the modeled block-elements well. We observed similar results for the rest of the block-elements (not shown here). The maximal error is 5.66 %.

Fig. 7 I-V characteristics of block-elements T1 and T11 (representing K73 and W309) for the (EY) intermediate of the acylenzyme reaction

Next, we perform a static analysis of the equivalent circuit (Fig. 6). The simulated I-V characteristics are illustrated in Figures 8 to 13.
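The DC sweep described above can be sketched in a few lines. The following Python fragment is only an illustration, not the authors' MATLAB code: it evaluates Eqs. (3), (4) and (8) to (11) over the stated input range. The function names and the 0.1 V step are assumptions made here, and no units are attached to the currents because the equations do not state them.

```python
# Minimal sketch of a DC sweep over the nucleophilic subcircuit polynomials.
# Function names and the sweep step are illustrative assumptions.

def t1(u_in):
    """Eqs. (3)-(4): voltage and current of the first output of T1."""
    u1 = u_in
    i1 = 3e-5 * u1**2 + 0.0004 * u1 + 0.0045
    return u1, i1

def t2(u1):
    """Eqs. (8)-(9): voltage and current of T2."""
    u2 = 1.0211 * u1 - 0.1005
    i2 = -1.0658 * u2**3 - 0.1179 * u2**2 + 10.912 * u2 + 151.84
    return u2, i2

def t6(u1):
    """Eqs. (10)-(11): voltage and current of T6 (output no. 3)."""
    u6 = 1.1204 * u1 - 0.3978
    i6 = 0.0045 * u6**2 + 0.0139 * u6 + 0.0251
    return u6, i6

def dc_sweep(u_min=-2.3, u_max=2.2, step=0.1):
    """Return one row (u_in, i1, u2, i2, u6, i6) per sweep point."""
    n_steps = int(round((u_max - u_min) / step))
    rows = []
    for k in range(n_steps + 1):
        u_in = u_min + k * step
        u1, i1 = t1(u_in)
        u2, i2 = t2(u1)
        u6, i6 = t6(u1)
        rows.append((u_in, i1, u2, i2, u6, i6))
    return rows

if __name__ == "__main__":
    for row in dc_sweep()[:3]:
        print(row)
```

Plotting the current columns against the corresponding voltage columns would give curves comparable in kind to the output characteristics discussed in this section.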
Fig. 8 Iout1 vs. Uout1
Fig. 9 Iout2 vs. Uout2
Fig. 10 Iout3 vs. Uout3

The simulation results show an S-shaped form of the output characteristics (in Fig. 10 the form is similar to an exponential), which implies that the circuit can operate as an amplifier.

Fig. 11 Iout12 vs. Uout12
Fig. 12 Iout13 vs. Uout13
Fig. 13 Iout14 vs. Uout14

The I-V characteristic in Fig. 11 cannot be directly compared to a common microelectronic device. In contrast, in Fig. 12 the current exhibits two regions: 1) Iout13 increases almost linearly and then 2) Iout13 saturates; hence the characteristic is analogous to the I-V characteristic of a transistor. In Fig. 13 we observe a characteristic that is typical for a tunnel diode. We also simulated the dependence of the output voltages on the input voltage. The results showed that all output voltages increase linearly with the input voltage.

4.2. Dynamic analysis

Taking into account the specifics of the hydrogen bonding networks and the proton transfer, which takes place over a period of approximately $10^{-11}$ s, the analogous electronic circuit should transfer signals in the GHz range. For this reason, we begin with an input voltage with amplitude between -2.2 V and +2.2 V at a frequency of 10 GHz and a time sweep between 0 and 0.1 ns. Afterwards, we feed input voltages with positive amplitude only and then with negative amplitude only (Fig. 14).

Fig. 14 Uin vs. time: a) $U_{in} = 2.2 \sin(5 \times 10^{11} t)$, b) $U_{in} = 1 + \sin(5 \times 10^{11} t)$, c) $U_{in} = -1 + \sin(5 \times 10^{11} t)$

In Fig. 15 we show the characteristics of the T4 block-element, which is output no. 1 of the nucleophilic circuit.

Fig. 15 Iout1 vs. time at different input voltages: a) $U_{in} = 2.2 \sin(5 \times 10^{11} t)$, b) $U_{in} = 1 + \sin(5 \times 10^{11} t)$, c) $U_{in} = -1 + \sin(5 \times 10^{11} t)$

The results show that Iout1 is always positive regardless of whether the input voltage is both positive and negative, positive only, or negative only. In the case of Fig. 15a) we observe that the signal Iout1 is cut from the bottom. Therefore, we can compare the results to the characteristics of a signal limiter.

In Fig. 16 we show the characteristics of the T2 block-element, which is output no. 2 of the nucleophilic circuit.

Fig. 16 Iout2 vs. time at different input voltages: a) $U_{in} = 2.2 \sin(5 \times 10^{11} t)$, b) $U_{in} = 1 + \sin(5 \times 10^{11} t)$, c) $U_{in} = -1 + \sin(5 \times 10^{11} t)$

We observe that the sine output characteristic in Fig. 16a) is cut from the top and bottom, in Fig. 16b) from the top, and in Fig. 16c) from the bottom.

In Fig. 17 we show the characteristics of the T6 block-element, which is output no. 3 of the nucleophilic circuit.

Fig. 17 Iout3 vs. time at different input voltages: a) $U_{in} = 2.2 \sin(5 \times 10^{11} t)$, b) $U_{in} = 1 + \sin(5 \times 10^{11} t)$

Fig. 18 gives the characteristics of the T12 block-element, which is output no. 12 of the electrophilic circuit.

Fig. 18 Iout12 vs. time at different input voltages: a) $U_{in} = 2.2 \sin(5 \times 10^{11} t)$, b) $U_{in} = 1 + \sin(5 \times 10^{11} t)$

In Fig. 19 we present the characteristics of the T13 block-element, which is output no. 13 of the electrophilic circuit.

Fig. 19 Iout13 vs. time at different input voltages: a) $U_{in} = 2.2 \sin(5 \times 10^{11} t)$, c) $U_{in} = -1 + \sin(5 \times 10^{11} t)$

Fig. 20 shows the characteristics of the T14 block-element, which is output no. 14 of the electrophilic circuit.

Fig. 20 Iout14 vs. time at different input voltages: a) $U_{in} = 2.2 \sin(5 \times 10^{11} t)$, b) $U_{in} = 1 + \sin(5 \times 10^{11} t)$, c) $U_{in} = -1 + \sin(5 \times 10^{11} t)$

The signal in Fig. 21a) indicates the nature of a modulator: the output signal is modulated.
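As a rough illustration of the transient runs, the three input waveforms listed in the figure captions can be passed through one of the static polynomials. The Python sketch below does this for output no. 3 only, using Eqs. (3), (10) and (11), because those equations are fully specified in the text. Treating the block as a memoryless polynomial, the 201-point time grid, and all function names are assumptions made here; the authors' simulation may differ.

```python
import math

# The three input waveforms listed in the captions (t in seconds).
def u_in_a(t):
    return 2.2 * math.sin(5e11 * t)

def u_in_b(t):
    return 1.0 + math.sin(5e11 * t)

def u_in_c(t):
    return -1.0 + math.sin(5e11 * t)

def i_out3(u_in):
    """Static response of output no. 3 via Eqs. (3), (10) and (11)."""
    u1 = u_in                      # Eq. (3)
    u6 = 1.1204 * u1 - 0.3978      # Eq. (10)
    return 0.0045 * u6**2 + 0.0139 * u6 + 0.0251  # Eq. (11)

def transient(u_in, t_end=0.1e-9, n=200):
    """Sample the input on [0, t_end] and evaluate the output point by point."""
    ts = [t_end * k / n for k in range(n + 1)]
    return ts, [i_out3(u_in(t)) for t in ts]

if __name__ == "__main__":
    for name, wave in (("a", u_in_a), ("b", u_in_b), ("c", u_in_c)):
        ts, cur = transient(wave)
        print(name, min(cur), max(cur))
```

Comparing the minimum and maximum of each output trace with the corresponding input gives a quick view of how strongly the polynomial distorts each waveform.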
From the transient analyses, we obtain characteristics that can be compared to those of an amplitude limiter, a modulator, and a triangular pulse source.

5. Conclusion

The presented circuit model of the hydrogen bonding networks in the active site of the β-lactamase protein shows that such a biofunctional system exhibits properties that are akin to those of common microelectronic devices. The analogous microelectronic circuit may operate in static mode as a signal amplifier, transistor, or tunnel diode. In dynamic mode, the microelectronic circuit may operate as a signal limiter or signal modulator in the GHz range. Furthermore, signals with different frequency, amplitude, and width might be generated at each circuit output. Thus, it can be concluded that the electrophilic and nucleophilic networks of hydrogen bonds in the active site can operate like an integrated circuit consisting of individual devices in the form of hydrogen bonds. The biocircuit is extremely flexible and is applicable to multiple circuit purposes. The results are expected to have important applications for finding novel solutions in bioelectronics research.

Acknowledgement: The present paper is a part of the research carried out in the framework of project ДУНК 01/03 of 29.12.2009.

References

[1] K. Habermüller, M. Mosbach, and W. Schuhmann, "Electron-transfer mechanisms in amperometric biosensors", Fresenius' Journal of Analytical Chemistry, vol. 366, no. 6-7, pp. 560-568, 2000.
[2] A. Heller, "Electrical connection of enzyme redox centers to electrodes", Acc. Chem. Res., 23, pp. 128-134, 1990.
[3] F.A. Armstrong and G.S. Wilson, "Recent developments in faradaic bioelectrochemistry", Electrochim. Acta, 45 (15-16), pp. 2623-2645, 2000.
[4] C. Dekker and M.A. Ratner, "Electronic properties of DNA", Physics World, 14 (8), pp. 29-33, August 2001.
[5] A. Heller, "Miniature biofuel cells", Phys. Chem. Chem. Phys., 6, pp. 209-216, 2004.
[6] E. Katz, A.N. Shipway, I. Willner, in Handbook of Fuel Cells - Fundamentals, Technology, Applications (Eds.: W. Vielstich, H. Gasteiger, A. Lamm), vol. 1, part 4, Wiley, Chichester, chapter 21, pp. 355-381, 2003.
[7] P. Fromherz, "Electrical interfacing of nerve cells and semiconductor chips", Chem. Phys. Chem., 3, pp. 276-284, 2002.
[8] M. Baca, G. Borgstahl, M. Boissinot, P. Burke, D. Williams, K. Slater, E. Getzoff, "Complete chemical structure of photoactive yellow protein: novel thioester-linked 4-hydroxycinnamyl chromophore and photocycle chemistry", Biochemistry, 33, pp. 14369-14377, 1994.
[9] H. Zhang, Q. Sun, Z. Li, S. Nanbu, S.S. Smith, "First principle study of proton transfer in the green fluorescent protein (GFP): ab initio PES in a cluster model", Computational and Theoretical Chemistry, 990, pp. 185-193, 2012.
[10] K. J. Wise, N. B. Gillespie, J. Stuart, M. P. Krebs, and R. R. Birge, "Optimization of bacteriorhodopsin for bioelectronic devices", Trends in Biotechnology, vol. 20, no. 9, pp. 387-394, September 2002.
[11] R.A. Marcus, N. Sutin, "Electron transfers in chemistry and biology", Biochim. Biophys. Acta, 811, pp. 265-322, 1985.
[12] M. Bixon, J. Jortner, "Electron transfer. From isolated molecules to biomolecules", Adv. Chem. Phys., 106, pp. 35-202, 1999.
[13] H.B. Gray, J.R. Winkler, "Electron tunneling through proteins", Q. Rev. Biophys., 36, pp. 341-372, 2003.
[14] F.K. Majiduddin, I.C. Materon, T.G. Palzkill, "Molecular analysis of beta-lactamase structure and function", International Journal of Medical Microbiology, vol. 292, iss. 2, pp. 127-137, 2002.
[15] M.A. Lill and V. Helms, "Compact parameter set for fast estimation of proton transfer rates", J. Chem. Phys., vol. 114 (3), pp. 1125-1132, 2001.
[16] R. Rusev, G. Angelov, T. Takov, M. Hristov, "Biocircuit for signal modulation based on hydrogen bonding network", Annual J. of Electronics, vol. 3, no. 2, pp. 155-158, 2009.
[17] R. Rusev, G. Angelov, B. Atanassov, T. Takov, M. Hristov, "Development and analysis of a signal transfer circuit with hydrogen bonding", in Proc. of the 17th Intl. Scientific and Appl. Science Conf. (Electronics ET'2008), Sozopol, Bulgaria, Book 4, September 2008, pp. 37-42.
[18] R. Rusev, G. Angelov, T. Takov, B. Atanasov, M. Hristov, "Comparison of branching hydrogen bonding networks with microelectronic devices", Annual J. of Electronics, vol. 3, no. 2, pp. 152-154, 2009.
[19] R. Rusev, G. Angelov, E. Gieva, T. Takov, M. Hristov, "Hydrogen bonding network as a DC level shifter and a power amplifier", in Proc. of the 17th Intl. Conf. Mixed Design of Integrated Circuits and Systems (MIXDES 2010), Wroclaw, Poland, June 24-26, 2010, pp. 408-411.
[20] R. Rusev, G. Angelov, E. Gieva, M. Hristov, T. Takov, "Hydrogen bonding network emulating frequency driven source of triangular pulses", International Journal of Microelectronics and Computer Science, vol. 1, no. 3, pp. 293-298, 2010.
[21] E. Gieva, L. Penov, R. Rusev, G. Angelov, M. Hristov, "Protein hydrogen bonding network electrical model and simulation in Verilog-A", Annual J. of Electronics, vol. 5, no. 2, pp. 132-134, 2011.
[22] B. Atanasov, D. Mustafi, M. Makinen, "Protonation of the β-lactam nitrogen is the trigger event in the catalytic action of class A β-lactamases", Proc. Natl. Acad. Sci. USA, 97 (7), pp. 3160-3165, 2000.
[23] MATLAB website, http://www.mathworks.com/
[24] R. Rusev, G. Angelov, E. Gieva, B. Atanasov, M. Hristov, "Microelectronic aspects of hydrogen bond characteristics in active site of β-lactamase during the acylenzyme reaction", Annual J. of Electronics, vol. 6, no. 2, pp. 35-38, 2012.

Facta Universitatis, Series: Electronics and Energetics, Vol. x, No x, x 2018, pp. x

Energy-Efficient Cryptographic Primitives

Elena Dubrova∗
Royal Institute of Technology (KTH), Stockholm, Sweden

Abstract: Our society greatly depends on services and applications provided by mobile communication networks.
As billions of people and devices become connected, it becomes increasingly important to guarantee the security of the interactions of all players. In this talk we address several aspects of this important, multifaceted problem. First, we show how to design cryptographic primitives which can assure the integrity and confidentiality of transmitted messages while satisfying the resource constraints of low-end, low-cost wireless devices such as sensors or RFID tags. Second, we describe countermeasures which can enhance the resistance of hardware implementing cryptographic algorithms to hardware Trojans.

Keywords: security, lightweight cryptography, cryptographic primitive, encryption, message authentication, hardware Trojan.

1 Introduction

Today minimal or no security is typically provided to low-end, low-cost wireless devices such as sensors or RFID tags, in the conventional belief that the information they gather is of little concern to attackers. However, case studies have shown that a compromised sensor can be used as a stepping stone to mount an attack on a wireless network. For example, in the attack

Manuscript received x x, x
Corresponding author: Elena Dubrova, Royal Institute of Technology (KTH), Stockholm, Sweden (e-mail: dubrova@kth.se)
∗An earlier version of this paper was presented as an invited address at the Reed-Muller 2017 Workshop, Novi Sad, Serbia, May 24-25, 2017.

Facta Universitatis, Series: Electronics and Energetics, Vol. xx, No x, xxxx, pp. xxx-xxx

Enumeration and Coding Methods for a Class of Permutations and Reversible Logical Gates

Costas Karanikas1 and Nikolaos Atreas2
1 School of Informatics, Aristotle University of Thessaloniki, Greece
2 School of Electrical and Computer Engineering, Faculty of Engineering, Aristotle University of Thessaloniki, Greece

Abstract: We introduce a great variety of coding methods for Boolean sparse invertible matrices and we use these methods to create a variety of bijections on the permutation group $P(m)$ of the set $\{1, 2, ..., m\}$.
Also, we propose methods for coding, enumerating and shuffling the set $\{0, ..., 2^m - 1\}$, i.e. the set of all m-bit binary arrays. Moreover, we show that several well-known reversible logic gates/circuits (on m-bit binary arrays) can be coded by sparse matrices.

Keywords: permutations, reversible logical gates.

1 Introduction

Let $m \ge 2$ be a natural number and $P(m)$ be the group of permutations of the set $\{1, ..., m\}$. In this work we introduce a variety of shuffling methods. More precisely, each shuffling method is a bijective map of a set onto itself, i.e. different inputs yield different outputs and the numbers of inputs and outputs are equal.

Manuscript received xx, xxxx
Corresponding author: Nikolaos Atreas, School of Electrical and Computer Engineering, Faculty of Engineering, Aristotle University of Thessaloniki, Greece (e-mail: natreas@ece.auth.gr)
An earlier version of this paper was presented as an invited address at the Reed-Muller 2017 Workshop, Novi Sad, Serbia, May 24-25, 2017.

Our main Theorem 2 in Section 3, or its "binary" version (see Theorem 3 in Section 4), states that any pair $(\rho, s)$ of permutations in $P(m)$ determines a bijective map

$t_{\rho,s} : \{0, 1, ..., 2^m - 1\} \to \{0, 1, ..., 2^m - 1\}$.

Since every non-negative integer $n \in \{0, 1, ..., 2^m - 1\}$ can be expressed either as an m-bit binary array

$e_n = (\varepsilon_0(n), \varepsilon_1(n), ..., \varepsilon_{m-1}(n))$, $\varepsilon_j \in \{0, 1\}$,

or by its dyadic expansion

$n = \sum_{j=1}^{m} \varepsilon_j(n) 2^{j-1}$,

the above map $t_{\rho,s}$ can be considered as a reversible map on the set of all m-bit binary arrays. In a different terminology, we can say that in Theorem 3 we introduce reversible logic gates, i.e. bijective maps on the set of m-bit binary arrays (see [1]). An example of a reversible gate is the NOT gate, whereas the AND, OR and XOR gates are irreversible (not reversible), because they map $4 = 2^2$ input states into $2 = 2^1$ output states, so information is lost in the merging of paths.
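The distinction between reversible and irreversible gates, and the dyadic expansion of an integer, can be checked directly. The short Python sketch below is an illustration only; the helper names `to_bits`, `from_bits` and `is_reversible` are hypothetical and do not come from the paper.

```python
from itertools import product

def to_bits(n, m):
    """m-bit array (eps_1, ..., eps_m) with n = sum of eps_j * 2**(j-1)."""
    return tuple((n >> j) & 1 for j in range(m))

def from_bits(bits):
    """Dyadic expansion: rebuild n from its bit array."""
    return sum(b << j for j, b in enumerate(bits))

def is_reversible(gate, n_inputs):
    """A gate is reversible iff distinct input states give distinct outputs."""
    inputs = list(product((0, 1), repeat=n_inputs))
    outputs = {gate(*x) for x in inputs}
    return len(outputs) == len(inputs)

def gate_not(a):
    return (1 - a,)

def gate_and(a, b):
    return (a & b,)

def gate_or(a, b):
    return (a | b,)

def gate_xor(a, b):
    return (a ^ b,)

if __name__ == "__main__":
    print("NOT reversible:", is_reversible(gate_not, 1))
    for name, g in (("AND", gate_and), ("OR", gate_or), ("XOR", gate_xor)):
        print(name, "reversible:", is_reversible(g, 2))
```

The two-input gates fail the check because four input states are mapped into only two output states, which is the loss of information mentioned in the text.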
A second target of this work is to enumerate and code permutations in $P(m)$ of large length (note that the cardinality of the set $P(m)$ is $m!$). Therefore, a reversible map $t_{\rho,s}$ associated with the pair $(\rho, s)$ can be coded either by the pair $(\rho, s)$ or by an enumeration of $P(m) \times P(m)$ as in Section 2. This coding method is associated with a particular class of sparse Boolean invertible matrices introduced in [2] (see also [3-6]). Notice that sparse matrices are very useful for fast processing/transmission of data, and they have been used effectively in [6] for detecting specific characteristics in finite data.

The paper is organized in the following sections. In Section 2 we introduce our main tool, the invertible map $P(m) \to S(m)$ (see (2) and (3)), and in Proposition 1 we see that this map induces the lexicographic order of the enumeration of $P(m)$. Moreover, we consider the Cartesian product $R(m) = P(1) \times P(2) \times ... \times P(m)$ of permutations to show in Theorem 1 that each fixed element of $R(m)$ provides an enumeration of $P(m)$.

Facta Universitatis, Series: Electronics and Energetics, Vol. 31, No 2, June 2018, pp. 241-255
https://doi.org/10.2298/FUEE1802241K
Costas Karanikas1, Nikolaos Atreas2
Received October 22, 2017; received in revised form January 24, 2018
Corresponding author: Nikolaos Atreas, School of Electrical and Computer Engineering, Dpt of Telecommunications, Faculty of Engineering, Aristotle University of Thessaloniki, 54124, Thessaloniki, Greece (e-mail: natreas@auth.gr)
*An earlier version of this paper was presented as an invited address at the Reed-Muller 2017 Workshop, Novi Sad, Serbia, May 24-25, 2017.

Facta Universitatis, Series: Electronics and Energetics, Vol. 28, No 4, December 2015, pp. 507-525
DOI: 10.2298/FUEE1504507S

Horizontal Current Bipolar Transistor (HCBT) – A Low-Cost, High-Performance Flexible BiCMOS Technology for RF Communication Applications

Tomislav Suligoj1, Marko Koričić1, Josip Žilak1, Hidenori Mochizuki2, So-ichi Morita2, Katsumi Shinomura2, Hisaya Imai2
1 University of Zagreb, Faculty of Electrical Engineering and Computing, Department of Electronics, Micro- and Nano-electronics Laboratory, Croatia
2 Asahi Kasei Microdevices Co., 5-4960, Nobeoka, Miyazaki, 882-0031, Japan

Abstract. In an overview of Horizontal Current Bipolar Transistor (HCBT) technology, the state-of-the-art integrated silicon bipolar transistors are described. They exhibit $f_T$ and $f_{max}$ of 51 GHz and 61 GHz and an $f_T \cdot BV_{CEO}$ product of 173 GHzV, which places them among the highest-performance implanted-base silicon bipolar transistors. HCBT is integrated with CMOS in a considerably lower-cost fabrication sequence as compared to standard vertical-current bipolar transistors, with only 2 or 3 additional masks and fewer process steps. Due to its specific structure, the charge sharing effect can be employed to increase $BV_{CEO}$ without sacrificing $f_T$ and $f_{max}$. Moreover, the electric field can be engineered just by manipulating the lithography masks, achieving high-voltage HCBTs with breakdowns up to 36 V integrated in the same process flow with high-speed devices, i.e. at zero additional cost. A double-balanced active mixer circuit is designed and fabricated in HCBT technology. A maximum IIP3 of 17.7 dBm at a mixer current of 9.2 mA and a conversion gain of -5 dB are achieved.

Key words: BiCMOS technology, bipolar transistors, horizontal current bipolar transistor, radio frequency integrated circuits, mixer, high-voltage bipolar transistors.

1. Introduction

In the highly competitive wireless communication markets, RF circuits and systems are fabricated in technologies that are very cost-sensitive.
In order to minimize the fabrication costs, the sub-10 GHz applications can be processed by using high-volume silicon technologies. It has been identified that the optimum solution might

Received March 9, 2015
Corresponding author: Tomislav Suligoj, University of Zagreb, Faculty of Electrical Engineering and Computing, Department of Electronics, Micro- and Nano-electronics Laboratory, Croatia (e-mail: tom@zemris.fer.hr)
In Section 3 we define a class of sparse $m \times m$ Boolean invertible matrices $Z_m$ identified by a pair $(\rho, s) \in P(m) \times S(m)$, and we use this class of matrices to produce a class of non-linear bijective maps $t_{q,\rho,s} : \{0, ..., q^m - 1\} \to \{0, ..., q^m - 1\}$; see our main Theorem 2. In Section 4 we show that any triple $(\rho, s, \tau)$ of permutations in $P(m)$ provides a variety of maps from $\{0, ..., 2^m - 1\}$ onto $\{0, ..., 2^m - 1\}$, and we see that several reversible logic gates can be determined by this triple.
Finally, in Section 5 we apply Theorems 1 and 2 to show with an example that, for any pair $(\rho, s) \in P(m) \times S(m)$ and any fixed $r \in R(2^m)$, we can shuffle the elements of the set $\{0, ..., 2^m - 1\}$, and we discuss the random permutation generation problem.

2 Enumeration methods for P(m)

Let $m \ge 2$ be a natural number. First we review the lexicographical order of the set

$S(m) = \{ s = (s_1, ..., s_m) : s_i \in \{1, 2, ..., i\} \}$.  (1)

Obviously, the map

$u : S(m) \to \{0, ..., m! - 1\} : u(s) = m! \sum_{i=1}^{m} \frac{s_i - 1}{i!}$  (2)

is a bijection, and the elements $s_i \in \{1, ..., i\}$ can be thought of as the digits of the number $u(s)$ with respect to the factorial number system. Conversely, for any $n \in \{0, ..., m! - 1\}$, its digits $s_i(n)$, $i = 1, ..., m$, are computed by the formula

$s_i(n) = \mathrm{mod}\left( \left[ \frac{n \, i!}{m!} \right], i \right) + 1$,

which describes the inverse map $u^{-1}$. Here, $[x]$ is the floor of $x$. From now on we say that $u$ provides the lexicographical order of $S(m)$.

Using the lexicographical order of $S(m)$ we may obtain an enumeration of the group of permutations $P(m)$ of the set $\{1, ..., m\}$ as well. In fact, let us define the map

$q : P(m) \to S(m) : q(\rho) = s = (s_1, ..., s_m)$,  (3)

where each component $s_i$ of $s$ is defined by using the following iteration scheme:
For the above selection of $m$ and the initial permutation $\rho$ in (3), we store the position of the biggest element in $\rho$, i.e. we define $s_m = \rho^{-1}(m)$, and at the same time we delete this element $\rho(s_m) = m$ from $\rho$, and so we form a new permutation $\rho_{(m-1)} \in P(m-1)$ by

$\rho_{(m-1)}(j) = \rho(j)$ if $j < s_m$, and $\rho_{(m-1)}(j) = \rho(j+1)$ if $j \ge s_m$, for $j = 1, ..., m-1$.

Then we repeat the previous step for the permutation $\rho_{(m-1)}$, i.e. we store the position of its biggest element by defining $s_{m-1} = \rho_{(m-1)}^{-1}(m-1)$, and at the same time we delete the element $m-1$ from $\rho_{(m-1)}$ and form a new permutation $\rho_{(m-2)} \in P(m-2)$ by

$\rho_{(m-2)}(j) = \rho_{(m-1)}(j)$ if $j < s_{m-1}$, and $\rho_{(m-2)}(j) = \rho_{(m-1)}(j+1)$ if $j \ge s_{m-1}$, for $j = 1, ..., m-2$.

We continue in the same spirit until $s$ is completely determined.

Example 1 Let $\rho = (2, 3, 4, 1)$. In order to determine the tuple $s = (s_1, s_2, s_3, s_4)$ in (3), we follow the above iteration scheme and proceed in the following way:
(i) Define $s_4 = \rho^{-1}(4) = 3$ and $\rho_{(3)} = (2, 3, 1)$.
(ii) Define $s_3 = \rho_{(3)}^{-1}(3) = 2$ and $\rho_{(2)} = (2, 1)$.
(iii) Define $s_2 = \rho_{(2)}^{-1}(2) = 1$ and $\rho_{(1)} = (1)$.
(iv) Define $s_1 = \rho_{(1)}^{-1}(1) = 1$ and $\rho_{(0)} = \emptyset$.

Now we have the following:

Proposition 1 [2] Let $u$ and $q$ be the two maps in (2) and (3), respectively. Then $q$ is a bijection, and so the composition map $uq : P(m) \to \{0, ..., m! - 1\}$ provides an enumeration of $P(m)$.
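The maps in (2) and (3) translate directly into code. The following Python sketch is an illustration under the assumption that a permutation is stored as a tuple of the values 1, ..., m; the function names mirror the symbols $q$, $u$ and $u^{-1}$ but are otherwise a choice made here. It reproduces Example 1.

```python
from math import factorial

def q(rho):
    """Map (3): repeatedly record the 1-based position of the largest
    remaining element and delete it, giving s = (s_1, ..., s_m)."""
    work = list(rho)
    m = len(work)
    s = [0] * m
    for i in range(m, 0, -1):
        pos = work.index(i)   # 0-based position of the current largest element
        s[i - 1] = pos + 1    # s_i is its 1-based position
        del work[pos]
    return tuple(s)

def u(s):
    """Map (2): u(s) = m! * sum((s_i - 1) / i!), in exact integer arithmetic."""
    m = len(s)
    return sum((s_i - 1) * (factorial(m) // factorial(i))
               for i, s_i in enumerate(s, start=1))

def u_inverse(n, m):
    """Digits s_i(n) = mod(floor(n * i! / m!), i) + 1."""
    return tuple((n * factorial(i) // factorial(m)) % i + 1
                 for i in range(1, m + 1))

if __name__ == "__main__":
    rho = (2, 3, 4, 1)
    s = q(rho)
    print(s, u(s), u_inverse(u(s), len(rho)))
```

For $\rho = (2, 3, 4, 1)$ the steps (i) to (iv) give $s = (1, 1, 2, 3)$, and formula (2) then yields $u(s) = 24 \cdot (1/6 + 2/24) = 6$.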
we define sm = ρ −1(m) and at the same time we delete this element ρ(sm) = m from ρ and so we form a new permutation ρ(m−1) ∈ p(m − 1) by ρ(m−1)(j) = { ρ(j) if j < sm ρ(j + 1) if j ≥ sm , j = 1, ..., m − 1. then we follow the previous step for the permutation ρ(m−1), i.e. we store the position of its biggest element by defining sm−1 = ρ −1 (m−1)(m − 1) and at the same time we delete the element m − 1 from ρ(m−1) and we form a new permutation ρ(m−2) ∈ p(m − 2) by ρ(m−2)(j) = { ρ(m−1)(j) if j < sm−1 ρ(m−1)(j + 1) if j ≥ sm−1 , j = 1, ..., m − 2. we continue in the same spirit until s is completely determined. example 1 let ρ = (2, 3, 4, 1). in order to determine the set s = {s1, s2, s3, s4} in (3) we are based on the above iteration scheme and so we proceed in the following way: (i) define s4 = ρ−1(4) = 3 and ρ(3) = (2, 3, 1). (ii) define s3 = ρ−1(3)(3) = 2 and ρ(2) = (2, 1). (iii) define s2 = ρ−1(2)(2) = 1 and ρ(1) = (1). (iv) define s1 = ρ−1(1)(1) = 1 and ρ(4) = ∅. now we have the following: proposition 1 [2] let u and q be two maps as in (2) and (3) respectively. then q is a bijection and so the composition map uq : p(m) → {0, ..., m! − 1} provides an enumeration of p(m). enumeration and coding permutations and reversible logical circuits 5 example 2 for m = 4, we demonstrate the enumeration of the elements of p(4) derived from proposition (1) and the lexicographical order of the elements of s(4) derived from (2). p(4) = {(4, 3, 2, 1), (3, 4, 2, 1), (3, 2, 4, 1), (3, 2, 1, 4), (4, 2, 3, 1), (2, 4, 3, 1), (2, 3, 4, 1), (2, 3, 1, 4), (4, 2, 1, 3), (2, 4, 1, 3), (2, 1, 4, 3), (2, 1, 3, 4), (4, 3, 1, 2), (3, 4, 1, 2), (3, 1, 4, 2), (3, 1, 2, 4), (4, 1, 3, 2), (1, 4, 3, 2), (1, 3, 4, 2), (1, 3, 2, 4), (4, 1, 2, 3), (1, 4, 2, 3), (1, 2, 4, 3), (1, 2, 3, 4)}. 
s(4) = {(1, 1, 1, 1), (1, 1, 1, 2), (1, 1, 1, 3), (1, 1, 1, 4), (1, 1, 2, 1), (1, 1, 2, 2), (1, 1, 2, 3), (1, 1, 2, 4), (1, 1, 3, 1), (1, 1, 3, 2), (1, 1, 3, 3), (1, 1, 3, 4), (1, 2, 1, 1), (1, 2, 1, 2), (1, 2, 1, 3), (1, 2, 1, 4), (1, 2, 2, 1), (1, 2, 2, 2), (1, 2, 2, 3), (1, 2, 2, 4), (1, 2, 3, 1), (1, 2, 3, 2), (1, 2, 3, 3), (1, 2, 3, 4)}. for instance, the permutation ρ = (4, 3, 2, 1) is uniquely associated with the sequence q(ρ) = (1, 1, 1, 1) (apply example 1) and then uq(ρ) = 0 by (2). in the same spirit, the permutation ρ = (3, 4, 2, 1) is uniquely associated with the sequence q(ρ) = (1, 1, 1, 2) (apply example 1) and then uq(ρ) = 1 by (2).

remark 1 the set s(m) in (1) seems to be similar to a lehmer code [7], but our approach seems to be more efficient for the purpose of obtaining a great variety of enumerating methods for p(m), see theorem 1 below. we notice that the lehmer code of a permutation $\rho = (\rho_1, \dots, \rho_m)$ is a sequence of natural numbers $(l_1, \dots, l_m)$ such that $l_i$ is the number of all elements $\rho_1, \dots, \rho_{i-1}$ which are less than $\rho_i$, i = 1, ..., m.

we may obtain various enumerations of the elements of s(m) (and hence p(m) as well). indeed, let us fix any element

$$r = (r_1, r_2, \dots, r_m) \in r(m) = p(1) \times p(2) \times \dots \times p(m), \quad (4)$$

where $r_i = (r_{i,1}, \dots, r_{i,i}) \in p(i)$, i = 1, ..., m. then we have:

theorem 1 let s(m) be defined in (1) and r be a fixed element of r(m) as in (4). for any s ∈ s(m) we define

$$w_{r,m}(s) = (r_{1,s_1}, r_{2,s_2}, \dots, r_{m,s_m}).$$

then the map $w_{r,m}$ is onto s(m).

proof: let us fix an element r ∈ r(m). since $r_{i,s_i} \le i$ (due to the fact that $r_i \in p(i)$), we deduce that $w_{r,m}(s) \in s(m)$. also, the fact that $r_{i,j} \le i$ for any j = 1, ..., i implies that $w_{r,m}$ is onto s(m), because any element $s_i$ of $s = (s_1, \dots, s_m)$ can be written as $s_i = r_{i,a(i)}$ for some index a(i) ≤ i, and so by defining a = (a(i) : i = 1, ..., m) we have $w_{r,m}(a) = s$.
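the maps u of (2), q of (3) and w of theorem 1 can be sketched in a few lines of python. this is only an illustrative sketch: the function names `u`, `u_inv`, `q` and `w`, the 1-based tuples used for permutations, and the particular choice of r are assumptions made here, not notation or code from the paper.

```python
from itertools import permutations
from math import factorial


def u(s):
    """Map (2): rank of s = (s_1, ..., s_m) in the factorial number system."""
    m = len(s)
    return sum((s_i - 1) * (factorial(m) // factorial(i))
               for i, s_i in enumerate(s, start=1))


def u_inv(n, m):
    """Inverse of u: s_i(n) = mod(floor(n * i! / m!), i) + 1."""
    return tuple((n * factorial(i) // factorial(m)) % i + 1
                 for i in range(1, m + 1))


def q(rho):
    """Map (3): repeatedly record and delete the position of the largest entry."""
    work = list(rho)
    s = []
    for k in range(len(rho), 0, -1):
        s.append(work.index(k) + 1)  # 1-based position of the current maximum k
        work.remove(k)               # delete k to obtain the shorter permutation
    return tuple(reversed(s))        # collected as s_m, ..., s_1


def w(r, s):
    """Map of theorem 1: w_{r,m}(s) = (r_{1,s_1}, ..., r_{m,s_m})."""
    return tuple(r_i[s_i - 1] for r_i, s_i in zip(r, s))


m = 4
# enumeration of p(4) induced by the composition u(q(rho))
p4 = sorted(permutations(range(1, m + 1)), key=lambda rho: u(q(rho)))
# all of s(4), listed in lexicographical order
s4 = [u_inv(n, m) for n in range(factorial(m))]
# one illustrative (assumed) choice of r in p(1) x p(2) x p(3) x p(4)
r = ((1,), (2, 1), (2, 1, 3), (4, 2, 1, 3))
shuffled = [w(r, s) for s in s4]

print(q((2, 3, 4, 1)), u(q((2, 3, 4, 1))))  # (1, 1, 2, 3) 6
print(p4[:4])
print(shuffled[:4])
```

sorting p(4) by u(q(ρ)) is a brute-force way to list the enumeration; for large m one would instead invert q directly, which is not shown here.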
let u be as in (2) and $w_{r,m}$ be as in theorem 1. it is easy to see that the map $u \, w_{r,m} \, u^{-1}$ : {0, ..., m! − 1} → {0, ..., m! − 1} provides a method for shuffling the set {0, ..., m! − 1}. by altering the selection of r ∈ r(m) in (4) we obtain a different shuffling. finally, it is clear that the class of mappings $\{ q^{-1} w_{r,m} u^{-1} : r \in r(m) \}$ provides a great variety of enumeration/shuffling methods for the set of permutations p(m).

example 3 for m = 4 and r = {(1), (2, 1), (2, 1, 3), (4, 2, 1, 3)}, then by using theorem 1, the lexicographical order of s(4) (see example 2) is shuffled to: {(1, 2, 2, 4), (1, 2, 2, 2), (1, 2, 2, 1), (1, 2, 2, 3), (1, 2, 1, 4), (1, 2, 1, 2), (1, 2, 1, 1), (1, 2, 1, 3), (1, 2, 3, 4), (1, 2, 3, 2), (1, 2, 3, 1), (1, 2, 3, 3), (1, 1, 2, 4), (1, 1, 2, 2), (1, 1, 2, 1), (1, 1, 2, 3), (1, 1, 1, 4), (1, 1, 1, 2), (1, 1, 1, 1), (1, 1, 1, 3), (1, 1, 3, 4), (1, 1, 3, 2), (1, 1, 3, 1), (1, 1, 3, 3)}. if q is defined in (3), then by using the composition map $q^{-1} w_{r,4} u^{-1}$ we obtain the following enumeration of the set p(4): {(1, 2, 3, 4), (1, 4, 2, 3), (4, 1, 2, 3), (1, 2, 4, 3), (2, 3, 1, 4), (2, 4, 3, 1), (4, 2, 3, 1), (2, 3, 4, 1), (3, 2, 1, 4), (3, 4, 2, 1), (4, 3, 2, 1), (3, 2, 4, 1), (2, 1, 3, 4), (2, 4, 1, 3), (4, 2, 1, 3), (2, 1, 4, 3), (1, 3, 2, 4), (1, 4, 3, 2), (4, 1, 3, 2), (1, 3, 4, 2), (3, 1, 2, 4), (3, 4, 1, 2), (4, 3, 1, 2), (3, 1, 4, 2)}.

3 a class of boolean matrices coded by permutations and a class of bijection maps

before we introduce a class of bijection maps on $\{0, 1, \dots, q^m - 1\}$ for any pair of natural numbers m, q ≥ 2, we present as in [2] a class of sparse boolean matrices and their properties.

definition 1 for any natural number m ≥ 2 we denote by $z_m$ the class of all m × m boolean matrices whose row vectors $z_i$ satisfy

$$z_i \odot z_j = c_{ij} \, z_{\max\{i,j\}}, \qquad c_{ij} \in \{0, 1\}, \quad i, j = 1, \dots, m,$$

where ⊙ is the usual hadamard product operation. then the following result is straightforward:

lemma 1 [2] let a be an m × m boolean matrix and let 1 ≤ i < j ≤ m.
then $a \in z_m$ if and only if $\operatorname{supp}\{a_j\} \subset \operatorname{supp}\{a_i\}$ or $\operatorname{supp}\{a_i\} \cap \operatorname{supp}\{a_j\} = \emptyset$. here, $\operatorname{supp}\{a_j\}$ denotes the set of all nonzero entries of the row $a_j$.

in [2] we proved the following:

proposition 2 let p(m) and s(m) be defined in section 2. then every matrix in the class $z_m$ is uniquely identified by a pair (ρ, s) ∈ p(m) × s(m).

using the above observations we may easily construct elements of the class $z_m$. indeed, let us fix a pair (ρ, s) ∈ p(m) × s(m) which determines a matrix $z \in z_m$ in a unique way. from the pair (ρ, s) we may construct z in the following manner:
(i) first, we use ρ to permute the rows of the identity matrix $i_m$ and so we construct an m × m permutation matrix, say $z_1$.
(ii) starting with the above matrix $z_1$, we construct a sequence $\{z_i\}_{i=2}^{m}$ of m × m matrices iteratively, by using s ∈ s(m). in the ith step of this iteration, a matrix $z_i$ is constructed from the matrix $z_{i-1}$ based on the following rule:
(a) if $s_i = i$, define $z_i = z_{i-1}$.
(b) if $s_i < i$, define $z_i$ by replacing only the $s_i$-row of $z_{i-1}$ with the sum of the i-row and $s_i$-row of $z_{i-1}$.
(iii) execute step (ii) for any i = 2, ..., m. then $z = z_m$ is a matrix in the class $z_m$.

example 4 let m = 5, ρ = (4, 1, 2, 5, 3) and s = (1, 1, 3, 1, 3). then the element $z \in z_5$ associated with the above pair (ρ, s) is the following:

$$z = \begin{pmatrix} 1 & 0 & 0 & 1 & 1 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 0 \end{pmatrix}.$$

it is remarkable that any matrix z in the class $z_m$ (which depends only on a pair (ρ, s)) is invertible and the entries of the inverse matrix $z^{-1}$ are immediately computed from the above pair (ρ, s):

$$z^{-1}_{i,j} = \begin{cases} 1, & i = \rho(j) \\ -1, & i = \rho(s(j)) \text{ and } s(j) < j \\ 0, & \text{otherwise} \end{cases}, \qquad i, j = 1, \dots, m. \quad (5)$$

example 5 if $z \in z_5$ is as in example 4, then the inverse matrix of z is calculated directly from (5):

$$z^{-1} = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & -1 \\ 0 & 0 & 0 & 0 & 1 \\ 1 & -1 & 0 & -1 & 0 \\ 0 & 0 & 0 & 1 & 0 \end{pmatrix}.$$

we consider now a matrix $z^{-1}$ as above corresponding to a pair $\rho = (\rho_1, \dots, \rho_m) \in p(m)$ and $s = (s_1, \dots, s_m) \in s(m)$. we shall use $z^{-1}$ to define a new shuffling method. by elementary calculations, for any real row vector $e = (e_1, \dots, e_m)$ we obtain

$$\left( e z^{-1} \right)_i = e_{\rho_i} - \left( 1 - \delta_{i,s_i} \right) e_{\rho_{s_i}}, \qquad i = 1, \dots, m. \quad (6)$$

here, $\delta_{i,j}$ denotes the usual kronecker delta symbol. inspired by (6) we have:

theorem 2 let m, q ≥ 2 be natural numbers, $\rho = (\rho_1, \dots, \rho_m) \in p(m)$ and $s = (s_1, \dots, s_m) \in s(m)$.
we define the set

$$e^{(q)}_m = \{ e_n = (e_{n,1}, \dots, e_{n,m}) : n = 0, \dots, q^m - 1 \},$$

where $e_n$ is the sequence of digits of $n \in \{0, \dots, q^m - 1\}$ with respect to its q-adic expansion $n = \sum_{i=1}^{m} e_{n,i} \, q^{i-1}$. then the map $t_{q,\rho,s} : e^{(q)}_m \to e^{(q)}_m$ such that for any i = 1, ..., m

$$t_{q,\rho,s}(e_n)_i = \operatorname{mod}\left( e_{n,\rho_i} - \left( 1 - \delta_{i,s_i} \right) e_{n,\rho_{s_i}}, \; q \right)$$

is a bijection.

proof: for any natural numbers m, q ≥ 2 we fix a pair (ρ, s) ∈ p(m) × s(m) and we consider the above operator $t_{q,\rho,s}$. from now on we write $t = t_{q,\rho,s}$ for simplicity. let $t(e_k)$ and $t(e_n)$ be two sequences for some pair $(k, n) \in \{0, \dots, q^m - 1\}^2$. notice that the elements of $e_k$ and $e_n$ belong to {0, ..., q − 1} by definition. assume that

$$t(e_k) = t(e_n) \;\Rightarrow\; t(e_k)_i = t(e_n)_i, \quad \forall i = 1, \dots, m. \quad (7)$$

if i = 1 in (7), then by recalling the definition of s(m) in (1) we have $s_1 = 1$, so

$$t(e_k)_1 = t(e_n)_1 \;\Rightarrow\; \operatorname{mod}\left( e_{k,\rho_1}, q \right) = \operatorname{mod}\left( e_{n,\rho_1}, q \right).$$

hence $e_{k,\rho_1} = e_{n,\rho_1}$. if i = 2, then $s_2 \in \{1, 2\}$. for $s_2 = 2$ we immediately obtain $e_{k,\rho_2} = e_{n,\rho_2}$. for $s_2 = 1$ we have

$$t(e_k)_2 = t(e_n)_2 \;\Rightarrow\; \operatorname{mod}\left( e_{k,\rho_2} - e_{k,\rho_{s_2}}, q \right) = \operatorname{mod}\left( e_{n,\rho_2} - e_{n,\rho_{s_2}}, q \right) \;\Rightarrow\; \operatorname{mod}\left( e_{k,\rho_2} - e_{n,\rho_1}, q \right) = \operatorname{mod}\left( e_{n,\rho_2} - e_{n,\rho_1}, q \right),$$

where the last equality was derived from the fact that $e_{k,\rho_1} = e_{n,\rho_1}$ as we showed above. hence, either $e_{k,\rho_2} - e_{n,\rho_1} = e_{n,\rho_2} - e_{n,\rho_1} \Rightarrow e_{k,\rho_2} = e_{n,\rho_2}$ or $q - (e_{k,\rho_2} - e_{n,\rho_1}) = q - (e_{n,\rho_2} - e_{n,\rho_1}) \Rightarrow e_{k,\rho_2} = e_{n,\rho_2}$. therefore, in any case we obtain $e_{k,\rho_2} = e_{n,\rho_2}$. we proceed in the same manner for the remaining values i = 3, ..., m, obtaining $e_{k,\rho_i} = e_{n,\rho_i}$, ∀i = 1, ..., m. since ρ is a permutation, necessarily $e_{k,i} = e_{n,i}$, ∀i = 1, ..., m, and the proof is complete.

it is clear that the above operator $t_{q,\rho,s}$ provides a code for shuffling the elements of the set $\{0, \dots, q^m - 1\}$.

example 6 let q = 3, ρ = (2, 1), s = (1, 2) and $e^{(3)}_2$ = {(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2)}.
then by the above definition of $t_{q,\rho,s}$ we obtain (0, 0) → (0, 0), (0, 1) → (1, 0), (0, 2) → (2, 0), (1, 0) → (0, 1), (1, 1) → (1, 1), (1, 2) → (2, 1), (2, 0) → (0, 2), (2, 1) → (1, 2) and (2, 2) → (2, 2), or $t_{q,\rho,s}$ : {0, 1, 2, 3, 4, 5, 6, 7, 8} → {0, 3, 6, 1, 4, 7, 2, 5, 8}.

4 on reversible gates

in this section we see that several of the well known reversible gates can be obtained by the bijection maps of theorem 2.
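as a minimal sketch of the map $t_{q,\rho,s}$ of theorem 2, the following python code applies it to the q-adic digits of n and checks bijectivity by brute force. the helper names (`digits`, `from_digits`, `t_map`, `is_bijection`) are hypothetical and chosen only for this illustration; digits are stored least significant first, following $n = \sum_i e_{n,i} q^{i-1}$.

```python
from itertools import permutations, product


def digits(n, q, m):
    """q-adic digits (e_1, ..., e_m) of n, with n = sum of e_i * q**(i-1)."""
    return [(n // q**i) % q for i in range(m)]


def from_digits(e, q):
    """Rebuild the integer from its q-adic digits (least significant first)."""
    return sum(d * q**i for i, d in enumerate(e))


def t_map(n, q, rho, s):
    """Apply t_{q,rho,s} to n; rho and s are 1-based tuples as in the text."""
    m = len(rho)
    e = digits(n, q, m)
    out = []
    for i in range(1, m + 1):
        value = e[rho[i - 1] - 1]              # e_{n, rho_i}
        if s[i - 1] != i:                      # factor (1 - delta_{i, s_i})
            value -= e[rho[s[i - 1] - 1] - 1]  # e_{n, rho_{s_i}}
        out.append(value % q)
    return from_digits(out, q)


def is_bijection(q, rho, s):
    """Brute-force check that t_{q,rho,s} permutes {0, ..., q**m - 1}."""
    size = q ** len(rho)
    return len({t_map(n, q, rho, s) for n in range(size)}) == size


# the data of example 6: q = 3, rho = (2, 1), s = (1, 2)
images = [t_map(n, 3, (2, 1), (1, 2)) for n in range(9)]
print(images)  # [0, 3, 6, 1, 4, 7, 2, 5, 8]

# exhaustive check over all pairs (rho, s) in p(3) x s(3) for q = 3
all_ok = all(
    is_bijection(3, rho, s)
    for rho in permutations((1, 2, 3))
    for s in product((1,), (1, 2), (1, 2, 3))
)
print(all_ok)
```

the exhaustive check is only a sanity test for small m and q; it does not replace the proof of theorem 2.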
first, we modify theorem 2 as follows:

theorem 3 for any natural number m, let (ρ, s) ∈ p(m) × s(m) be as in theorem 2 and $e_m = \{ e_n := (e_{n,1}, \dots, e_{n,m}) : n = 0, \dots, 2^m - 1 \}$ be the set of all m-bit arrays. then:
(i) the map $t_{\rho,s} : e_m \to e_m$ such that for any j = 1, ..., m we have

$$t_{\rho,s}(e_n)_j = \left| e_{n,\rho_j} - \left( 1 - \delta_{j,s(j)} \right) e_{n,\rho_{s(j)}} \right|$$

is a bijection.
(ii) for any permutation τ ∈ p(m) we denote by $l_\tau(e_n) = (e_{n,\tau(1)}, \dots, e_{n,\tau(m)})$ the element of $e_m$ obtained from shuffling $e_n$ by the permutation τ. then $l_\tau t_{\rho,s} : e_m \to e_m$ is a bijection too.

proof: (i) it is a direct consequence of theorem 2 for q = 2. (ii) it is immediate.

example 7 the feynman gate. it is a 2-bit reversible map such that (0, 0) → (0, 0), (0, 1) → (0, 1), (1, 0) → (1, 1) and (1, 1) → (1, 0). according to theorem 3, this gate corresponds to the map $t_{\rho,s}$, where ρ = (1, 2) and s = (1, 1). in a different notation this gate can be uniquely described by a matrix in the class $z_2$ associated with the above pair (ρ, s) ∈ p(2) × s(2) (see definition 1 or example 4):

$$z_{\rho,s} = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}.$$

also, in a different notation this gate can be described by the following 4 × 4 matrix (by concatenating the corresponding inputs and outputs):

$$\begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 1 \\ 1 & 1 & 1 & 0 \end{pmatrix}.$$

example 8 the double feynman gate. it is a reversible map on the 3-bit binary arrays so that (0, 0, 0) → (0, 0, 0), (1, 0, 0) → (1, 1, 1), (0, 1, 0) → (0, 1, 0), (1, 1, 0) → (1, 0, 1), (0, 0, 1) → (0, 0, 1), (1, 0, 1) → (1, 1, 0), (0, 1, 1) → (0, 1, 1) and (1, 1, 1) → (1, 0, 0). according to theorem 3, this gate corresponds to the map $t_{\rho,s}$, where ρ = (1, 2, 3) and s = (1, 1, 1). in a different notation, this gate can be uniquely described by a matrix in the class $z_3$ associated with the above pair (ρ, s) ∈ p(3) × s(3) (see the above definition 1 or example 4):

$$z_{\rho,s} = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
also, in a different notation this gate can be described by the following 8 × 6 matrix (by concatenating the corresponding inputs and outputs):

$$\begin{pmatrix} 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 1 & 0 \\ 0 & 1 & 1 & 0 & 1 & 1 \\ 1 & 0 & 0 & 1 & 1 & 1 \\ 1 & 0 & 1 & 1 & 1 & 0 \\ 1 & 1 & 0 & 1 & 0 & 1 \\ 1 & 1 & 1 & 1 & 0 & 0 \end{pmatrix}.$$

fig. 1: the set of points $\{ (n, t_{2,\rho,s}(n)) : n \in i_8 \}$ for the selection of the pair (ρ, s) as in example 9. recall that the map $t_{2,\rho,s}$ is a bijection on the set $i_8$ providing a shuffling method for $i_8$.
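the feynman and double feynman gates of examples 7 and 8 can be reproduced with a short python sketch of the map of theorem 3. the names `t_gate` and `truth_table` are assumptions introduced for this illustration, and bit arrays are written as tuples $(e_1, \dots, e_m)$ in the same order as in the examples.

```python
from itertools import product


def t_gate(bits, rho, s):
    """Apply the map of theorem 3 to a tuple of bits (rho, s are 1-based)."""
    out = []
    for j in range(1, len(bits) + 1):
        value = bits[rho[j - 1] - 1]  # e_{rho_j}
        if s[j - 1] != j:             # factor (1 - delta_{j, s(j)})
            value = abs(value - bits[rho[s[j - 1] - 1] - 1])
        out.append(value)
    return tuple(out)


def truth_table(rho, s):
    """Full input -> output table of the gate coded by the pair (rho, s)."""
    m = len(rho)
    return {bits: t_gate(bits, rho, s) for bits in product((0, 1), repeat=m)}


feynman = truth_table((1, 2), (1, 1))
double_feynman = truth_table((1, 2, 3), (1, 1, 1))

for inputs, outputs in feynman.items():
    print(inputs, "->", outputs)
```

for bits, the absolute difference |a − b| coincides with the exclusive or of a and b, which is why the second output of the feynman gate is the xor of the two inputs.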
we mention here that the 2-bit swap gate can also be implemented by the map $t_{\rho,s}$ by selecting ρ = (2, 1) and s = (1, 2). however, the 3-bit toffoli and fredkin gates cannot be implemented via $t_{\rho,s}$.

5 coding pseudorandom permutations

we apply theorem 2 to give, by an example, a method to code a pseudorandom permutation in $p(2^m)$. for any (ρ, s) ∈ p(m) × s(m) and a fixed random permutation $r \in r(2^m)$ we shuffle the image of $t_{2,\rho,s}$ by the composition map $w_{r,2} t_{2,\rho,s}$ for some particular selection of $r \in r(2^8)$ (see theorem 1) and we obtain a pseudo-random permutation coded by a triple (ρ, s, r).

example 9 let ρ = (5, 7, 6, 3, 4, 8, 1, 2) and s = (1, 1, 1, 4, 5, 2, 7, 3). figure 1 shows how the bijective map $t_{2,\rho,s}$ of theorem 2 shuffles the elements of the set $i_8 = \{0, \dots, 2^8 - 1\}$. in figure 2 we use a fixed element $r \in r(2^8)$ (see theorem 1) and we shuffle the set $i_8$ by means of the composition operator $w_{r,2} t_{2,\rho,s}$. in this case, the graph appears to be more "randomly" distributed than the graph of figure 1.

fig. 2: the set of points $\{ (n, w_{r,2} t_{2,\rho,s}(n)) : n \in i_8 \}$ for some $r \in r(2^8)$ and (ρ, s) as in example 9.

in conclusion, we demonstrated a variety of new enumeration/shuffling methods for the group of permutations. we also proposed a class of bijections for sets of natural numbers based on efficient coding methods for sparse boolean matrices. we also discussed possible connections of the shuffling problem with the random permutation generation problem. according to [8, 9], any permutation in p(m) can be almost uniformly randomly distributed using m log(m)/2 random transpositions. this observation may be important for establishing a connection between our shuffling method and the random permutation generation problem in the future. we believe that this direction is very promising.

references
[1] k. n. patel, j. p. hayes, and i. l.
markov, “fault testing for reversible circuits,” in ieee vlsi test symposium, napa valley, california, 2003, pp. 410–417.
[2] n. atreas and c. karanikas, “boolean invertible matrices identified from two permutations and their corresponding haar-type matrices,” linear algebra appl., vol. 435, no. 1, pp. 95–105, 2011.
[3] n. atreas and c. karanikas, “multiscale haar unitary matrices with the corresponding riesz products and a characterization of cantor-type languages,” j. fourier anal. appl., vol. 13, no. 2, pp. 197–210, 2007.
[4] n. atreas and c. karanikas, “haar-type orthonormal systems, data presentation as riesz products and a recognition on symbolic sequences,” contemporary math., vol. 451, pp. 1–9, 2008.
[5] n. atreas and c. karanikas, “discrete type riesz products,” in walsh and dyadic analysis, 2008, pp. 137–143.
[6] n. atreas, c. karanikas, and p. polychronidou, “a class of sparse unimodular matrices generating multiresolution and sampling analysis for data of any length,” siam j. matrix anal. appl., vol. 30, no. 1, pp. 312–323, 2008.
[7] d. h. lehmer, “teaching combinatorial tricks to a computer,” in proc. sympos. appl. math. combinatorial analysis, vol. 10, 1960, pp. 179–193.
[8] p. diaconis, g. graham, and s. p. holmes, “statistical problems involving permutations with restricted positions,” in lecture notes. monograph series, vol. 36, 2001, pp. 195–202.
[9] p. diaconis and m. shahshahani, “generating a random permutation with random transposition,” z. wahrsch. verw. gebiete, vol. 57, no. 2, pp. 159–179, 1981.

facta universitatis series: electronics and energetics vol. 33, no 3, september 2020, pp. 477-487
https://doi.org/10.2298/fuee2003477a
© 2020 by university of niš, serbia | creative commons license: cc by-nc-nd

study of hole-blocking and electron-blocking layers in a inas/gaas multiple quantum-well solar cell

sobhan abbasian 1,2, reza sabbaghi-nadooshan 1
1 electrical engineering department, islamic azad university, central tehran branch, tehran, iran
2 alborz electricity distribution company, karaj, iran

abstract. in this work, a gaas-based quantum well solar cell with a 25-layer inas/gaas intermediate layer is simulated in silvaco atlas tcad software.
in order to reduce the recombination caused by the presence of the quantum layers and increase the absorption of photons, electron blocking layers (ebls) and hole blocking layers (hbls) made of the in0.5(al0.7ga0.3)0.5p semiconductor have been added to the solar cell. the results show that the efficiency of the proposed solar cell increases by 17.38% when the thickness and doping of the ebl and hbl layers are optimized. it can be concluded that the use of the in0.5(al0.7ga0.3)0.5p semiconductor as ebl and hbl layers reduces the loss of open circuit voltage (voc) caused by the quantum wells. the efficiency of the proposed solar cell with ebl and hbl layers was found to be 44.65%.

key words: electron-blocking layers, hole-blocking layers, inas/gaas, quantum well solar cell

received february 7, 2020; received in revised form june 14, 2020
corresponding author: reza sabbaghi-nadooshan, niayesh building, emam hasan blvd., pounak, tehran, iran (e-mail: r_sabbaghi@iauctb.ac.ir)

1. introduction

the increasing human need for energy has drawn the attention of many researchers to renewable energy. solar cells are a clean energy source that absorb and convert solar energy into electricity. much research has been done on solar cells with semiconductors from the iii-v group because of their high efficiency. barnham et al. improved quantum solar cell function by inserting a quantum well as the middle layer in a p-i-n cell [1]. paxman et al. found that the density of the optical current and the conversion efficiency increased in p-i-n structures with gaas/algaas quantum wells [2]. in fact, a quantum well in the intermediate layer produces additional electron-hole pairs by absorbing photons with less energy and improves the spectral response by absorbing photons of different energies. the increased absorption raises the short-circuit current (jsc), but the increased recombination of carriers in the quantum well causes voc to decrease [2-3].
an ebl with a band gap larger than that of the p-n junction in the solar cell creates an electric field and prevents carrier recombination [4-5]. therefore, the ebl and hbl layers increase jsc without reducing voc, which increases the gain of p-i-n solar cells. in 2015, feroz ali et al. achieved an efficiency of 26.285% in a si solar cell by designing an ebl with a 2.1 ev band gap [6]. denis et al. showed in 2010 that inas/gaas quantum dot cells increase photon absorption [7]. peter james et al. then designed a gasb/gaas quantum dot cell in 2011 [8]. wei-sheng designed an inas/gaassb quantum dot in 2012 to increase the short-circuit current by 8.8% [9]. in 2013, xiaoguang et al. achieved 17% efficiency by investigating the effects of si-doping on inas/gaas quantum dots [10]. the effect of the electric field on different layers of inas/gaas quantum dots was investigated by yushuai et al. in 2014 [11]. in 2015, inigo et al. designed the inas/ingap quantum-dot cell [12]. utrilla et al. increased the range of inas/gaas quantum dots in 2016 using semiconductors in inas/gaas cells [13]. an analytical study on inas/gaas quantum dot solar cells was conducted in 2017 by sayantan et al. [14]. in 2018, they conducted studies on the optical properties of inas/gaas quantum dot cells [15]. the effects of temperature on inas/gaas quantum dots were investigated by abdelkader et al. in 2019 [16]. the present study was undertaken to simulate a gaas/inas multi-quantum well solar cell using silvaco atlas tcad software [17]. as can be seen in [18], a bsf layer of the semiconductor in0.5(al0.7ga0.3)0.5p enhances the adsorption of charge carriers and promotes the solar cell yield. the simulation of the proposed solar cell confirms the increase in efficiency, and its results are compared with experimental results in this paper.
in the present article, we used ebl and hbl layers, which are similar in function to a bsf layer, and implemented them in the in0.5(al0.7ga0.3)0.5p semiconductor. the effect on the solar cell efficiency of adding the in0.5(al0.7ga0.3)0.5p semiconductor as the ebl and hbl layers is examined. the ebl layer (0.05 μm in thickness; doping of 2×10^15 cm^-3) and the hbl layer (2.0 μm in thickness; doping of 2×10^19 cm^-3) increased the solar cell efficiency to 44.65%.

2. quantum-well solar cell structure

in a p-i-n solar cell, the presence of an intermediate band in the band gap (eg) allows more incident photons to be absorbed. as shown in fig. 1, the semiconductor band gap (eg) is divided into two smaller band gaps, el and eh. absorbing photons with less energy than the band gap (eg) leads to an increase in electron and hole production.

fig. 1 the photon absorption processes in the structure of an intermediate band

to generate an electron-hole pair by absorbing photons with less energy than the band gap (eg), an electron moves from the valence band to the middle band, leaving a hole in its place. in fig. 1 this transition is labeled 1, and the transition of an electron from the middle band to the conduction band is labeled 2. when a photon with energy higher than the band gap (eg) is absorbed, an electron is transferred directly from the valence band to the conduction band; this normal production of an electron-hole pair is labeled 3. to improve both transitions 1 and 2, the middle band is first filled with electrons. the middle band then both receives electrons from the valence band and transfers electrons to the conduction band [19].
usually, photons of longer wavelengths cannot excite electrons from the valence band to the conduction band in solar cells because of their low energy. in the intrinsic (i) region of a p–i–n solar cell, the quantum wells produce electron-hole pairs. as seen in fig. 2, the electrons and holes produced in the quantum wells are released by heat and tunneling, and through the electric field in the p-i-n solar cell, they collide and move toward the contacts [20].

fig. 2 quantum wells solar cell

3. materials selection for different layers

table 1 shows the properties of the materials for the various layers of the proposed cell. as shown in table 1, the in0.49ga0.51p, inas and in0.5(al0.7ga0.3)0.5p semiconductors prevent the recombination of electrons and holes by being lattice matched to gaas.

table 1 major parameters for the ternary in0.49ga0.51p, inas and quaternary in0.5(al0.7ga0.3)0.5p lattice matched to gaas materials used in this design [21-25].

material                              gaas      ingap     inalgap   inas
band gap eg (ev) @300 k               1.42      1.9       2.3       0.36
permittivity (es/eo)                  13.1      11.6      11.7      15
affinity (ev)                         4.07      4.16      4.2       4.03
heavy e- effective mass (me*/m0)      0.063     3         2.85      0.024
heavy h+ effective mass (mh*/m0)      0.5       0.64      0.64      0.471
e- mobility mun (cm^2/v·s)            8800      1945      2150      30000
h+ mobility mup (cm^2/v·s)            400       141       141       240
e- density of states nc (cm^-3)       4.7e+17   1.30e+20  1.20e+20  8.7e+16
h+ density of states nv (cm^-3)       7.0e+18   1.28e+19  1.28e+19  6.6e+18

4. modeling procedures

4.1. cell structure

figure 3 shows the structure of the proposed model. the top of the cell is a window layer with a band gap of 1.9 ev and a thickness of 0.05 μm. the quantum well contains 25 layers of inas and gaas semiconductors, each 0.005 μm in thickness. the lattice constant of inas is 6.0584 å and that of gaas is 5.6533 å, which results in a mismatch of the lattice constant between the two semiconductors in the middle quantum layers.
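the size of this mismatch follows directly from the two lattice constants quoted above. as a short worked calculation (taking gaas as the reference lattice, which is an assumption of this sketch rather than a statement from the paper):

$$f = \frac{a_{\mathrm{inas}} - a_{\mathrm{gaas}}}{a_{\mathrm{gaas}}} = \frac{6.0584 - 5.6533}{5.6533} = \frac{0.4051}{5.6533} \approx 0.0717,$$

i.e. a mismatch of about 7.2%.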
This problem is solved by growing thin InAs layers in the [001] direction without exceeding the critical thickness of 7 Å. Greater thicknesses create traps in the structure [17]. Aluminum was added to In0.49Ga0.51P (Al fraction of 0.7 on the Al/Ga sites) to increase the band gap, producing the In0.5(Al0.7Ga0.3)0.5P semiconductor with a 2.3 eV band gap. Research shows that this semiconductor has a lattice constant matching that of GaAs [21-22]. In addition, In0.5(Al0.7Ga0.3)0.5P creates an electric field in the EBL and HBL layers, which reduces the recombination of electrons and holes.

Fig. 3 Schematic of the proposed cell

4.2. Analysis of proposed model

The current density of the p-i-n solar cell is obtained from Eq. (1):

[Eq. (1): expression missing in the source text]

where k is Boltzmann's constant, T is the absolute temperature, q is the electron charge, J0 is the reverse saturation current density, and ΦB is the net flux of incident photons with energies greater than the band gap energy. In

[Eq. (2): expression missing in the source text]

β is the ratio of the current required in the intrinsic region at equilibrium to the usual reverse drift current, W is the intrinsic region width, AB is the nonradiative coefficient, niB is the equilibrium intrinsic carrier concentration and BB is the barrier recombination coefficient. The current-voltage relation of the multiple quantum well (MQW) solar cell is given by Eq. (3) [26]:

[Eq. (3): expression missing in the source text]

where

[Eq. (4): expression missing in the source text]

Here fW is the fraction of the intrinsic region volume substituted by quantum-well material; the remaining factors in Eq. (4) are the recombination coefficient enhancement factor and the effective volume density-of-states enhancement factor. The short-circuit current density of the p-i-n (MQW) solar cell is obtained from Eq. (5):

[Eq. (5): expression missing in the source text]

where the two Nph terms are the net photon flux densities in the corresponding energy ranges for the bulk solar cell. The open-circuit voltage of the p-i-n (MQW) solar cell is obtained from Eq. (6) [27]:

[Eq. (6): expression missing in the source text]

4.3. Model simulation

In this study, the performance of the proposed cell was simulated under the standard AM1.5 spectrum using Atlas TCAD, and the cell was discretized with a mesh of varying density. The CONMOB model was used for the electron and hole mobilities, in combination with the OPTR and SRH recombination models. Illumination of the solar cell was simulated using the Luminous module [17].

5. Results and Discussion

5.1. Optimization of thickness and doping in the EBL and HBL layer

Investigating the doping and thickness of In0.5(Al0.7Ga0.3)0.5P in the EBL and HBL layers of the proposed solar cell is essential for optimizing its efficiency. Fig. 4a shows that the highest efficiency for the EBL layer was at a thickness of 0.05 μm. Fig. 4b shows that the highest efficiency obtained by adding an EBL layer to the base cell occurred at a doping of 2×10^15 cm^-3. Figs. 5a and 5b show the effect of the HBL thickness and doping on efficiency. Maximum efficiency was achieved at a thickness of 2.0 μm and a doping of 2×10^19 cm^-3. The addition of In0.5(Al0.7Ga0.3)0.5P as intermediate EBL and HBL layers in the GaAs-based multi-quantum-well solar cell increased the efficiency by 17.38 percentage points (from 27.27% to 44.65%, Table 2).

Fig. 4 a) Different values of the EBL doping (thickness = 0.005 μm), b) different values of the EBL thickness (doping = 5×10^15 cm^-3)

Fig. 5 a) Different values of the HBL doping (thickness = 0.005 μm), b) different values of the HBL thickness (doping = 5×10^19 cm^-3)

5.2. Electric field

The maximum field at the p-n junction can be calculated using Eq. (7) as:

$$E_{max} = \frac{q N_A x_p}{\varepsilon_r \varepsilon_0} = \frac{q N_D x_n}{\varepsilon_r \varepsilon_0} \qquad (7)$$

where q is the charge of the electron, N_A is the acceptor impurity density, N_D is the donor impurity density, ε_r is the relative dielectric permittivity of the semiconductor, ε_0 is the vacuum permittivity, x_p is the width of the depletion region on the p-side, and x_n is the width of the depletion region on the n-side.
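As a rough numerical illustration of the peak-field relation E_max = q·N_A·x_p/ε = q·N_D·x_n/ε, the sketch below evaluates an abrupt junction under the standard depletion approximation. The depletion widths come from the textbook total-width expression together with charge neutrality (N_A·x_p = N_D·x_n). The doping levels, the built-in potential and the function names are illustrative assumptions, not values from the simulated cell; only the relative permittivity 13.1 is the GaAs value of Table 1.

```python
import math

Q = 1.602176634e-19      # elementary charge (C)
EPS0 = 8.8541878128e-12  # vacuum permittivity (F/m)


def depletion_widths(n_a, n_d, eps_r, v_bi):
    """Return (x_p, x_n) in metres for an abrupt p-n junction at zero bias.

    n_a and n_d are doping densities in m^-3, v_bi is the built-in potential in V.
    """
    eps = eps_r * EPS0
    w = math.sqrt(2.0 * eps * v_bi / Q * (n_a + n_d) / (n_a * n_d))
    x_p = w * n_d / (n_a + n_d)   # the heavily doped side depletes less
    x_n = w * n_a / (n_a + n_d)
    return x_p, x_n


def max_field(n, x, eps_r):
    """Peak field magnitude (V/m) seen from one side of the junction: q*N*x/eps."""
    return Q * n * x / (eps_r * EPS0)


# Hypothetical example values (not taken from the paper's simulation).
N_A = 1e24   # 1e18 cm^-3 expressed in m^-3
N_D = 1e22   # 1e16 cm^-3 expressed in m^-3
x_p, x_n = depletion_widths(N_A, N_D, eps_r=13.1, v_bi=1.2)
e_from_p = max_field(N_A, x_p, 13.1)
e_from_n = max_field(N_D, x_n, 13.1)
print(f"x_p = {x_p * 1e9:.2f} nm, x_n = {x_n * 1e9:.2f} nm")
print(f"E_max = {e_from_p / 100:.3e} V/cm")
```

Both sides of the junction give the same peak field, which is a quick consistency check on the widths.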
The electric field in the junction region is obtained through Eq. (8), where N_B is the impurity density in the semiconductor that has the widest depletion region.

[Eq. (8): expression missing in the source text]

The width of the depletion region in a p-n junction can be calculated using Eq. (9) as:

$$W = \sqrt{\frac{2 \varepsilon_r \varepsilon_0}{q} \left( \frac{N_A + N_D}{N_A N_D} \right) V_{bi}} \qquad (9)$$

where V_bi is the internal potential in both sides of the junction [28]. The experimental findings of references [29] and [30] indicate that In0.5(Al0.7Ga0.3)0.5P semiconductors increase the open-circuit voltage and V_bi. According to Eqs. (8) and (9), this increases the electric field in the region. As seen in Fig. 6, an additional field was created at the interface between the p and EBL layers, and the maximum field strength increased from 2.5×10^4 to 2.2×10^5. At the interface between the n and HBL layers, the maximum field strength increased from 3.8×10^4 to 6.9×10^5, which reduced recombination in the proposed cell.

Fig. 6 Maximum electric field at the junction region for base and proposed cell.

5.3. Spectral response

The spectral response demonstrates the absorption of photons in a solar cell. Fig. 7 compares the produced and absorbed photons for the base and proposed cells. It can be observed that the photon absorption rate in the optical spectrum (0.75-2.5 μm) of the proposed cell is higher than in the base cell, which increases the gain in the proposed cell.

Fig. 7 Generation of photocurrent of the base and proposed cell

5.4. Photogeneration rate

The photogeneration rate in a solar cell is defined in Eq. (10):

$$G = \eta_0 \frac{P \lambda}{h c} \alpha e^{-\alpha y} \qquad (10)$$

In this formula, G is the photogeneration rate, P is the total cumulative effect of reflections, transmissions and losses due to absorption over the ray path, y is the relative distance for the given ray, h is Planck's constant, λ is the wavelength, c is the speed of light, α is the absorption coefficient, and η_0 is the internal quantum efficiency [31]. The absorption coefficient is obtained from Eq. (11); the k coefficient has a positive relationship with the absorption coefficient of a material [28]:

$$\alpha = \frac{4 \pi k}{\lambda} \qquad (11)$$

Considering Eq. (11) and the investigation of the n and k coefficients of the AlGaInP semiconductor in reference [28], it can be concluded that inserting an AlGaInP semiconductor as EBL and HBL increases the photogeneration rate. Fig. 8 shows the concordance of the theoretical and simulation results for the proposed cell and the base cell.

Fig. 8 Photogeneration rate of the base and proposed model

5.5. I-V characteristics

Fig. 9 compares the J-V curve of the proposed solar cell with EBL and HBL layers with that of the base cell without EBL and HBL layers. As seen, the In0.5(Al0.7Ga0.3)0.5P semiconductor in the EBL and HBL layers decreased recombination, which increased Jsc and raised Voc by 0.107 V (from 0.907 V to 1.014 V, Table 2), compensating for the voltage drop caused by the quantum-well layers.

Fig. 9 Current density-voltage (J-V) characteristics of GaAs-based multiple quantum well solar cells without EBL and HBL, with one EBL and HBL

5.6. Important parameters in solar cells

Solar cell efficiency can be expressed as shown in Eq. (12) [32]:

$$\eta = \frac{P_{max}}{P_{in}} = \frac{FF \cdot V_{oc} \cdot I_{sc}}{P_{in}} \qquad (12)$$

where P_max is the maximum output power, P_in is the input power and FF is the fill factor. The total current of a solar cell can be obtained from Eq. (13):

$$I = I_0 \left[ \exp\left( \frac{qV}{kT} \right) - 1 \right] - I_L \qquad (13)$$

where k is the Boltzmann constant, T is the temperature in kelvin, I_L is the current produced by the photons and I_0 is the current in the dark state.
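The efficiency and diode relations above can be checked with a few lines of arithmetic. The sketch below is a minimal illustration: it assumes the ideal-diode sign convention I = I0·(exp(qV/kT) − 1) − IL, an input power of 100 mW/cm^2 for AM1.5G at 1 sun, and hypothetical values for I0 and IL; the function names are likewise illustrative. The Voc, Jsc and FF inputs are the values the paper reports in Table 2.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant (J/K)
Q = 1.602176634e-19  # elementary charge (C)


def diode_current(v, i_l, i_0, t=300.0):
    """Ideal-diode current with the convention I = I0*(exp(qV/kT) - 1) - IL."""
    return i_0 * (math.exp(Q * v / (K_B * t)) - 1.0) - i_l


def open_circuit_voltage(i_l, i_0, t=300.0):
    """Voltage at which the ideal-diode current vanishes."""
    return (K_B * t / Q) * math.log(i_l / i_0 + 1.0)


def efficiency_percent(v_oc, j_sc_ma_cm2, ff_percent, p_in_mw_cm2=100.0):
    """Efficiency (%) = Voc*Jsc*FF/Pin, with Jsc in mA/cm^2 and Pin in mW/cm^2."""
    return 100.0 * v_oc * j_sc_ma_cm2 * (ff_percent / 100.0) / p_in_mw_cm2


# Voc (V), Jsc (mA/cm^2) and FF (%) as reported in Table 2 of the paper.
eta_base = efficiency_percent(0.907, 34.71, 86.63)
eta_proposed = efficiency_percent(1.014, 51.10, 86.20)
print(f"base: {eta_base:.2f} %, proposed: {eta_proposed:.2f} %")

# Hypothetical photo and dark currents, used only to exercise the diode equation.
I_L, I_0 = 0.0511, 1e-19
v_oc = open_circuit_voltage(I_L, I_0)
print(f"Voc for the assumed I0: {v_oc:.3f} V")
```

Under the 100 mW/cm^2 assumption, the computed efficiencies agree with the tabulated 27.27% and 44.65% to within the rounding of the inputs.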
The short-circuit current occurs when V = 0; substituting V = 0 in Eq. (13) gives Eq. (14) [33]:

$$I_{sc} = -I_L \qquad (14)$$

Setting I = 0 in Eq. (13) gives the open-circuit voltage. The open-circuit voltage and the fill factor are calculated in Eqs. (15) and (16):

$$V_{oc} = \frac{kT}{q} \ln\left( \frac{I_L}{I_0} + 1 \right) \qquad (15)$$

$$FF = \frac{P_{max}}{V_{oc} I_{sc}} \qquad (16)$$

Table 2 shows the optimized proposed cell model with EBL and HBL layers and the base cell without EBL and HBL layers. From the table, it is possible to compare important solar cell parameters such as Jsc, Voc, FF and η.

Table 2 Jsc, Voc, FF, and conversion efficiency of the multiple quantum well solar cell without EBL and HBL layers and with EBL and HBL layers.

| Solar cell | Spectrum | Sun | Voc (V) | Jsc (mA/cm^2) | FF (%) | η (%) |
|---|---|---|---|---|---|---|
| Without EBL & HBL (25 layers Q-well) [11] | AM1.5G | 1.0 | 0.907 | 34.71 | 86.63 | 27.27 |
| With EBL & HBL (25 layers Q-well) | AM1.5G | 1.0 | 1.014 | 51.10 | 86.20 | 44.65 |

6. Conclusion

In the present study, an In0.5(Al0.7Ga0.3)0.5P semiconductor was added as EBL and HBL intermediate layers to a GaAs-based 25-layer InAs/GaAs quantum-well solar cell. With optimized impurity density and thickness, the new layers reduced the drop in Voc caused by the presence of the quantum layers. The optimized cell provides a Voc of 1.014 V, Jsc of 51.1 mA/cm^2, FF of 86.2% and a conversion efficiency of 44.65% under 1 sun.

References

[1] K. W. J. Barnham and G. Duggan, “A new approach to high-efficiency multi-band-gap solar cells”, J. Appl. Phys., vol. 67, pp. 3490, 1990.
[2] K. Barnham, B. Braun, J. Nelson, and M. Paxman, “Short-circuit current and energy efficiency enhancement in a low-dimensional structure photovoltaic device”, Appl. Phys. Lett., vol. 59, pp. 135–137, 1991.
[3] M. Paxman, J. Nelson, B. Braun, J. Connolly, and K. W. J. Barnham, “Modeling the spectral response of the quantum well solar cell”, J. Appl. Phys., vol. 74, pp. 614, 1993.
[4] B. P. Rand, J. Li, J. Xue, R. J. Holmes, M. E. Thompson, S. R. Forrest, “Organic double heterostructure photovoltaic cells employing thick tris(acetylacetonato)ruthenium(III) exciton-blocking layers”, Adv. Mater., vol. 17, pp. 2714–2718, 2005.
[5] Y. K. Kuo, T. H. Wang, J. Y. Chang, J. D. Chen, “Slightly-doped step-like electron-blocking layer in InGaN light-emitting diodes”, IEEE Photonics Technol. Lett., vol. 24, no. 17, pp. 1506, 2012.
[6] M. F. Ali, F. Hossain, “Effect of bandgap of EBL on efficiency of the p-n homojunction Si solar cell from numerical analysis”, in Proceedings of the International Conference on Electrical & Electronic Engineering (ICEEE), 2015, pp. 245–248.
[7] D. Guimard, R. Morihara, D. Bordel, K. Tanabe, Y. Wakayama, “Fabrication of InAs/GaAs quantum dot solar cells with enhanced photocurrent and without degradation of open circuit voltage”, Applied Physics Letters, vol. 96, no. 20, pp. 203507, 2010.
[8] P. J. Carrington, A. S. Mahajumi, M. C. Wagener, J. R. Botha, Q. Zhuang, A. Krier, “Type II GaSb/GaAs quantum dot/ring stacks with extended photoresponse for efficient solar cells”, Physica B: Condensed Matter, vol. 407, no. 10, pp. 1493–1496, 2012.
[9] W. S. Liu, H. M. Wu, F. H. Tsao, T. L. Hsu, J. I. Chyi, “Improving the characteristics of intermediate-band solar cell devices using a vertically aligned InAs/GaAsSb quantum dot structure”, Solar Energy Materials & Solar Cells, vol. 105, pp. 237–241, 2012.
[10] X. Yang, K. Wang, Y. Gu, H. Ni, X. Wang, T. Yang, Z. Wang, “Improved efficiency of InAs/GaAs quantum dots solar cells by Si-doping”, Solar Energy Materials & Solar Cells, vol. 113, pp. 144–147, 2013.
[11] Y. Dai, S. Polly, S. Hellstroem, D. V. Forbes and S. M. Hubbard, “Electric field effect on carrier escape from InAs/GaAs quantum dots solar cells”, in Proceedings of the IEEE 40th Photovoltaic Specialist Conference (PVSC), 2014, pp. 3492–3497.
[12] I. Ramiro, J. Villa, P. Lam, S. Hatch, J. Wu, E. Lopez, E. Antolín, H. Liu, A. Martí, “Wide-bandgap InAs/InGaP quantum-dot intermediate band solar cells”, IEEE Journal of Photovoltaics, vol. 5, no. 3, pp. 840–845, 2015.
[13] A. D. Utrilla, D. F. Reyes, J. M. Llorens, I. Artacho, T. Ben, D. González, Ž. Gačević, A. Kurtz, A. Guzman, A. Hierro, J. M. Ulloa, “Thin GaAsSb capping layers for improved performance of InAs/GaAs quantum dot solar cells”, Solar Energy Materials and Solar Cells, vol. 159, pp. 282–289, 2017.
[14] S. Biswas and A. Sinha, “An analytical study of the minority carrier distribution and photocurrent of a p–i–n quantum dot solar cell based on the InAs/GaAs system”, Indian Journal of Physics, vol. 91, pp. 1197–1203, 2017.
[15] A. Imran, J. Jiang, D. Eric, M. N. Zahid, M. Yousaf, Z. H. Shah, “Optical properties of InAs/GaAs quantum dot superlattice structures”, Results in Physics, vol. 9, pp. 297–302, 2018.
[16] A. Aissat, N. Harchouch, and J. P. Vilcot, “Optimization of the temperature effects on structure InAs/GaAs QDSC”, in: Hajji B., Tina G., Ghoumid K., Rabhi A., Mellit A. (eds), Proceedings of the 1st International Conference on Electronic Engineering and Renewable Energy. ICEERE 2018. Lecture Notes in Electrical Engineering, vol. 519. Springer, Singapore, 2019.
[17] E. Koletsios, GaAs/InAs multi quantum well solar cell, Master of Science in Applied Physics thesis, Naval Postgraduate School, 2012.
[18] K. J. Singh, S. K. Sarkar, “Highly efficient ARC less InGaP/GaAs DJ solar cell numerical modeling using optimized InAlGaP BSF layers”, Optical Quantum Electronics, vol. 43, pp. 1–21, 2012.
[19] A. Martí, C. R. Stanley, and A. Luque, “Intermediate band solar cells (IBSC) using nanotechnology”, Chapter 17 in Nanostructured Materials for Solar Energy Conversion. (Elsevier B. V., 2006).
[20] F. K. Rault, “Mathematical modelling of the refractive index and reflectivity of the quantum well solar cell”, Chapter 4 in Nanostructured Materials for Solar Energy Conversion. (Elsevier B. V., 2006).
[21] Ching-Hwa Ho, Ji-Han Li, and Yu-Shyan Lin, “Optical characterization of a GaAs/In0.5(AlxGa1-x)0.5P/GaAs heterostructure cavity by piezoreflectance spectroscopy”, Optics Express, vol. 15, no. 21, pp. 13886–13893, 2007.
[22] I. Vurgaftman, J. R. Meyer, L. R. Ram-Mohan, “Band parameters for III–V compound semiconductors and their alloys”, J. Appl. Phys., vol. 89, no. 11, pp. 5815, 2001.
[23] Silvaco Data Systems Inc, Silvaco ATLAS User’s Manual, 2010.
[24] H. Y. Lee, C. T. Lee, “The investigation for various treatments of InAlGaP Schottky diodes”, in Proceedings of the 8th International Conference on Electronic Materials, IUMRS-ICEM 23, 2002, pp. 99–102.
[25] A. Badea, F. Dragan, L. Fara, and P. Sterian, “Quantum mechanical effects analysis of nanostructured solar cell models”, Renew. Energy Environ. Sustain., vol. 1, no. 3, pp. 1–5, 2016.
[26] J. C. Rimada and L. Hernández, “Modelling of ideal AlGaAs quantum well solar cells”, Microelectronics Journal, vol. 32, no. 9, pp. 719–723, 2001.
[27] G. Siddharth, V. Garg, B. S. Sengar, R. Bhardwaj, P. Kumar, S. Mukherjee, “Analytical study of performance parameters of InGaN/GaN multiple quantum well solar cell”, IEEE Transactions on Electron Devices, vol. 66, no. 8, pp. 3399–3404, 2019.
[28] S. Abbasian, R. Sabbaghi-Nadooshan, “Design and evaluation of ARC less InGaP/AlGaInP DJ solar cell”, Optik, vol. 136, pp. 487–496, 2017.
[29] E. E. Perl, J. Simon, J. F. Geisz, W. Olavarria, M. Young, A. D. Daniel, J. Friedman, M. A. Steiner, “Development of high-bandgap AlGaInP solar cells grown by organometallic vapor-phase epitaxy”, IEEE Journal of Photovoltaics, vol. 6, no. 3, pp. 770–776, 2016.
[30] X. Li, W. Zhang, J. Zhang, H. Lu, D. Zhou, L. Sun, K. Chen, “Study on 2.05 eV Al0.13GaInP sub-cell and its hetero-structure cells”, in Proceedings of the 40th Photovoltaic Specialist Conference (PVSC), 2014, pp. 479–481.
[31] S. Abbasian, R. Sabbaghi-Nadooshan, “Introducing a novel high-efficiency ARC less heterojunction DJ solar cell”, Facta Universitatis, Series: Electronics and Energetics, vol. 31, no. 1, pp. 89–100, 2018.
[32] S. M. Sze, M. K. Lee, Semiconductor Devices Physics and Technology.
[33] J. P. Dutta, P. P. Nayak, G. P. Mishra, “Design and evaluation of ARC less InGaP/GaAs DJ solar cell with InGaP tunnel junction and optimized double top BSF layer”, Optik, vol. 127, pp. 4156–4161, 2016.
FACTA UNIVERSITATIS Series: Electronics and Energetics Vol. 28, No 4, December 2015, pp. 507-525 DOI: 10.2298/FUEE1504507S

Horizontal Current Bipolar Transistor (HCBT) – A Low-Cost, High-Performance Flexible BiCMOS Technology for RF Communication Applications

Tomislav Suligoj1, Marko Koričić1, Josip Žilak1, Hidenori Mochizuki2, So-ichi Morita2, Katsumi Shinomura2, Hisaya Imai2
1University of Zagreb, Faculty of Electrical Engineering and Computing, Department of Electronics, Micro and Nano-electronics Laboratory, Croatia
2Asahi Kasei Microdevices Co. 5-4960, Nobeoka, Miyazaki, 882-0031, Japan

Received March 9, 2015
Corresponding author: Tomislav Suligoj, University of Zagreb, Faculty of Electrical Engineering and Computing, Department of Electronics, Micro and Nano-electronics Laboratory, Croatia (e-mail: tom@zemris.fer.hr)

Abstract. In an overview of Horizontal Current Bipolar Transistor (HCBT) technology, the state-of-the-art integrated silicon bipolar transistors are described which exhibit fT and fmax of 51 GHz and 61 GHz and an fT·BVCEO product of 173 GHzV, and which are among the highest-performance implanted-base silicon bipolar transistors. HCBT is integrated with CMOS in a considerably lower-cost fabrication sequence as compared to standard vertical-current bipolar transistors, with only 2 or 3 additional masks and fewer process steps. Due to its specific structure, the charge sharing effect can be employed to increase BVCEO without sacrificing fT and fmax. Moreover, the electric field can be engineered just by manipulating the lithography masks, achieving high-voltage HCBTs with breakdowns up to 36 V integrated in the same process flow with high-speed devices, i.e. at zero additional cost. A double-balanced active mixer circuit is designed and fabricated in HCBT technology. A maximum IIP3 of 17.7 dBm at a mixer current of 9.2 mA and a conversion gain of -5 dB are achieved.

Key words: BiCMOS technology, bipolar transistors, Horizontal Current Bipolar Transistor, radio frequency integrated circuits, mixer, high-voltage bipolar transistors.

1. Introduction

In the highly competitive wireless communication markets, the RF circuits and systems are fabricated in technologies that are very cost-sensitive. In order to minimize the fabrication costs, the sub-10 GHz applications can be processed by using the high-volume silicon technologies. It has been identified that the optimum solution might

FACTA UNIVERSITATIS Series: Electronics and Energetics Vol. 31, No 2, June 2018, pp. 189-205 https://doi.org/10.2298/FUEE1802189S

An Improved Spectral Classification of Boolean Functions Based on Extended Set of Invariant Operations*

Milena Stanković1, Claudio Moraga2, Radomir S. Stanković1
1Faculty of Electronic Engineering, University of Niš, Niš, Serbia
2TU Dortmund University, Dortmund, Germany

Received October 21, 2017; received in revised form February 6, 2018
Corresponding author: Milena Stanković, Faculty of Electronic Engineering, University of Niš, Medevedeva 14, 18000 Niš, Serbia (e-mail: milena.stankovic@elfak.ni.ac.rs)
*An earlier version of this paper was presented as an invited address at the Reed-Muller 2017 Workshop, Novi Sad, Serbia, May 24-25, 2017

Abstract. Boolean functions expressing some particular properties often appear in engineering practice. Therefore, a lot of research effort is put into exploring different approaches towards classification of Boolean functions with respect to various criteria that are typically selected to serve some specific needs of the intended applications. A classification is considered to be strong if there is a reasonably small number of different classes for a given number of variables n, and it is desirable that the classification rules are simple. A classification with respect to Walsh spectral coefficients, introduced formerly for digital system design purposes, appears to be useful in the context of Boolean functions used in cryptography, since it is in a way compatible with the characterization of cryptographically interesting functions through Walsh spectral coefficients. This classification is performed in terms of certain spectral invariant operations. We show by introducing a new spectral invariant operation in the Walsh domain, that by starting from n≤5, some classes of Boolean functions can be merged, which makes the classification stronger and, from the theoretical point of view, resolves a problem raised already in the seventies of the last century. Further, this new spectral invariant operation can be used in constructing bent functions from bent functions represented by quadratic forms.

Key words: Boolean functions, classification, Walsh spectrum, invariant operations.

1 Introduction

Classification of Boolean functions is a classical and well explored problem in switching theory [1]. Boolean functions can be classified for different purposes. The two most commonly reported classifications, NPN and LP, are related to technology mapping in logic synthesis with standard gate libraries or in FPGA synthesis, and to the unification and simplification of testing procedures [2], [3]. Classification can be performed with respect to various criteria, usually defined to serve certain application purposes. The present interest of some researchers in classification with respect to Walsh spectral coefficients, which we will briefly call the Walsh classification, is due to the spectral characterization of some classes of Boolean functions used in cryptography [4], [5], [6], [7], [8], [9]. In particular, bent functions [10], which are defined as Boolean functions that have maximal minimum distance to the set of affine functions, are alternatively defined as Boolean functions with flat Walsh spectra, meaning that all the Walsh coefficients have the same absolute value equal to $2^{n/2}$, where n is the number of variables.
This requirement implies that all bent functions for a given n belong to the same class in the Walsh classification of Boolean functions. This classification has been discussed in a series of papers by different authors, with the earliest of them published in the 1970s [11], [12], [13]. In this classification, Boolean functions are split into classes satisfying certain conditions imposed on elements of appropriately defined canonic vectors specified in terms of their Walsh spectra, as will be discussed below. Functions belonging to the same class are converted to each other by certain operations that we will call affine spectral invariant operations. In the Boolean domain, these are affine operations over Boolean variables and function values, which in the spectral domain result in permutations and sign changes of Walsh coefficients while preserving their absolute values. It was already observed in the early publications, when the analysis was restricted to functions of up to 5 variables, that there is some inconsistency when the considerations are restricted to the realm of affine spectral invariant operations. Starting from functions of five variables, two different classes of functions with flat spectra have to be considered [12], since some functions, although satisfying the required conditions over canonic vectors for being members of the class with flat spectra, cannot be converted to each other with affine spectral invariant operations. Also, the functions having solely quadratic product terms in their Reed-Muller expressions are bent functions, meaning that they have flat Walsh spectra. Bent functions may have product terms with up to n/2 variables, which means that they cannot all be derived from functions with quadratic terms by the application of affine spectral invariant operations, since these operations cannot be used to increase the number of variables in product terms.
It is clear that there are some other operations beyond the affine spectral invariant operations preserving the absolute values of Walsh coefficients, and in this paper we present a possible answer to this problem by introducing a new spectral invariant operation, the effect of which becomes apparent for functions of n ≥ 5 variables.

2 Walsh Classification of Boolean Functions

Classification of Boolean functions in terms of Walsh spectral coefficients has been discussed in a series of papers, with the first of them published in the sixties and seventies of the last century [11], [12], [13], [14], [15]. The Walsh transform in the Hadamard ordering is used, since in this ordering the structure of the transform matrix corresponds to the recursive structure of the domain for Boolean functions viewed as a decomposable Abelian group [3]. Links to the Hadamard matrices are used [2], [3], [12], [16].

2.1 Walsh spectrum

Definition 1 (Walsh spectrum). In the so-called Hadamard ordering, the Walsh spectrum $S_f(n)$ of a Boolean function $f(x_1, \ldots, x_n)$ is defined as

$$S_f(n) = W(n) F(n) = \left( \bigotimes_{i=1}^{n} W(1) \right) F(n), \qquad (1)$$

where $\otimes$ denotes the Kronecker product,

$$W(1) = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix},$$

and $F(n)$ is the value vector of the function $f$ in the $(0, 1) \to (1, -1)$ encoding, i.e., $F(n) = [(-1)^{f(0)}, (-1)^{f(1)}, \ldots, (-1)^{f(2^n - 1)}]$.

The Walsh spectrum $S_f(n)$ of a function of n binary variables is a vector of $2^n$ coefficients. Here we will use a notation for spectral coefficients with binary subscripts $S_{b_1, b_2, \ldots, b_n}$, where $b_i \in \{0, 1\}$, $i = 1, 2, \ldots, n$:

$$S_f(n) = [S_{00 \ldots 0}, S_{00 \ldots 1}, \ldots, S_{11 \ldots 1}].$$
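Definition 1 can be checked numerically for small n. The sketch below builds W(n) as the n-fold Kronecker power of W(1) and applies it to the (1, −1)-encoded truth vector. It is a direct, unoptimized illustration (a fast Walsh-Hadamard transform would normally be preferred), the function names are assumptions made here, and the truth vector is assumed to be indexed with x1 as the most significant bit.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


W1 = [[1, 1], [1, -1]]


def walsh_matrix(n):
    """W(n) as the n-fold Kronecker power of W(1) (Hadamard ordering)."""
    w = [[1]]
    for _ in range(n):
        w = kron(w, W1)
    return w


def walsh_spectrum(truth_vector):
    """Walsh spectrum of a Boolean function given by its 0/1 truth vector."""
    size = len(truth_vector)
    n = size.bit_length() - 1
    if size != 1 << n:
        raise ValueError("truth vector length must be a power of two")
    f_enc = [(-1) ** bit for bit in truth_vector]   # (0, 1) -> (1, -1) encoding
    return [sum(w * v for w, v in zip(row, f_enc)) for row in walsh_matrix(n)]


# f(x1, x2) = x1 AND x2
and2 = [0, 0, 0, 1]
print(walsh_spectrum(and2))   # [2, 2, 2, -2]

# f = x1 x2 XOR x3 x4, a quadratic function of four variables
bent4 = [((i >> 3) & (i >> 2) & 1) ^ ((i >> 1) & i & 1) for i in range(16)]
print(walsh_spectrum(bent4))
```

For the four-variable example every coefficient has absolute value 4, i.e. 2^(n/2) for n = 4, which is the flat spectrum that characterizes a bent function.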
two of them that are most usually reported, npn and lp classifications, are related to technology mapping in logic synthesis with standard gate libraries or in fpga synthesis and unification and simplification of testing procedures [2], [3]. classification can be performed with respect to various criteria usually appropriately defined to serve certain application purposes. the present interest of some researchers in classification with respect to walsh spectral coefficients, which we will shortly denote as the walsh classification, is due to the spectral characterization of some classes of boolean functions used in cryptography [4], [5], [6], [7], [8], [9]. in particular, bent functions [10] which are defined as boolean functions that have maximal minimum distance to the set of affine functions, are alternatively defined as boolean functions with flat walsh spectra, meaning that all the walsh coefficients have the same absolute value equal to 2n/2, where n is the number of variables. this requirement implies that all bent functions for a given n belong to the same class in the walsh classification of boolean functions. this classification has been discussed in a series of papers by different authors with the earliest of them published in the 1970’s, [11], [12], [13]. in this classification, boolean functions are split into classes satisfying certain conditions imposed on elements of appropriately defined canonic vectors specified in terms of their walsh spectra as it will be discussed bellow. functions belonging to the same class are converted to each other by certain operations that we will call affine spectral invariant operations. in the boolean domain, these are affine operations over boolean variables and function values, which in the spectral domain result in permutation and sign changes of walsh coefficients but preserving their absolute values. 
it was observed already in the early publications, when the analysis was restricted to functions up to 5 variables, that there is some inconsistency when the considerations are restricted to the realms of affine spectral invariant operations. starting from functions of five variables, two different classes of functions with flat spectra have to be considered [12], since functions although satisfying the required conditions over canonic vectors for being members of the class with flat spectra, cannot be converted to each an improved spectral classification of boolean functions... 3 other with affine spectral invariant operations. also, the functions having solely quadratic product terms in their reed-muller expressions are bent functions meaning that they have the flat walsh spectra. bent functions may have product terms with up to n/2 variables, and it means they cannot be derived from functions with quadratic terms by the application of affine spectral invariant operations since these operations cannot be used to increase the number of variables in product terms. it is clear that there are some other operations beyond the affine spectral invariant operations preserving the absolute values of walsh coefficients and in this paper we present a possible answer to this problem by introducing a new spectral invariant operation the effect of which becomes apparent for functions of n ≥ 5 variables. 2 walsh classification of boolean functions classification of boolean functions in terms of walsh spectral coefficients has been discussed in a series of papers with the first of them published in the sixties and seventies of the last century [11], [12], [13], [14], [15]. the walsh transform in the hadamard ordering is used, since in this ordering the structure of the transform matrix corresponds to the recursive structure of the domain for boolean functions viewed as a decomposable abelian group [3]. links to the hadamard matrices are used [2], [3], [12], [16]. 
2.1 walsh spectrum

definition 1 (walsh spectrum). in the so-called hadamard ordering, the walsh spectrum $S_f(n)$ of a boolean function $f(x_1, \ldots, x_n)$ is defined as

$$S_f(n) = W(n) F(n) = \left( \bigotimes_{i=1}^{n} W(1) \right) F(n), \qquad (1)$$

where $\otimes$ denotes the kronecker product, $W(1) = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$, and $F(n)$ is the value vector of the function $f$ in the $(0, 1) \to (1, -1)$ encoding, i.e., $F(n) = [(-1)^{f(0)}, (-1)^{f(1)}, \ldots, (-1)^{f(2^n - 1)}]$.

the walsh spectrum $S_f(n)$ of a function of $n$ binary variables is a vector of $2^n$ coefficients. here we will use a notation for spectral coefficients with binary subscripts $S_{b_1, b_2, \ldots, b_n}$, where $b_i \in \{0, 1\}$, $i = 1, 2, \ldots, n$:

$$S_f(n) = [S_{00 \ldots 0}, S_{00 \ldots 1}, \ldots, S_{11 \ldots 1}].$$

the first coefficient, with the index $i = (00 \ldots 0)$, is called the zero order coefficient, and the coefficients with a single 1 in the binary representation of their indices are the coefficients of the first order. walsh spectral coefficients can be seen as a form of correlation between the function inputs and the output.

2.2 spectral invariant operations

in the walsh classification, classification rules are derived from the so-called spectral invariant operations, defined as operations that preserve the absolute values of walsh spectral coefficients of boolean functions. in other words, spectral invariant operations permute walsh coefficients and change their signs. in order to ensure that walsh spectra with permuted coefficients remain the spectra of boolean functions and not of some other integer-valued functions, these permutations are not arbitrary.
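the walsh spectrum of definition 1 can be computed without forming $W(n)$ explicitly. the following is a minimal sketch that uses the standard fast walsh-hadamard butterfly instead of the matrix product; the names `truth_vector` and `walsh_spectrum` are hypothetical, and the convention that $x_1$ is the most significant bit of the index is an assumption chosen to match the binary subscripts $b_1, \ldots, b_n$.

```python
def truth_vector(f, n):
    """values f(x1, ..., xn), listed with x1 as the most significant index bit."""
    values = []
    for idx in range(2 ** n):
        bits = [(idx >> (n - 1 - k)) & 1 for k in range(n)]
        values.append(f(*bits))
    return values


def walsh_spectrum(values):
    """walsh spectrum in hadamard ordering of a 0/1 truth vector."""
    s = [(-1) ** v for v in values]  # (0, 1) -> (1, -1) encoding
    h = 1
    while h < len(s):  # one butterfly stage per variable
        for start in range(0, len(s), 2 * h):
            for k in range(start, start + h):
                s[k], s[k + h] = s[k] + s[k + h], s[k] - s[k + h]
        h *= 2
    return s


if __name__ == "__main__":
    # two-variable product x1 x2
    print(walsh_spectrum(truth_vector(lambda x1, x2: x1 & x2, 2)))
```

the zero order coefficient equals $2^n$ minus twice the number of points where $f = 1$, which gives a quick sanity check on any computed spectrum.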
a permutation of some walsh coefficients requires a simultaneous permutation of certain precisely specified subsets of walsh coefficients (definition 2).

definition 2. spectral invariant operations are defined as follows.

1. negation (complement) of the function $f(x_1, \ldots, x_n)$. this changes the sign of all spectral coefficients without changing their absolute values.

2. negation (complement) of any input variable $x_i$, $i \in \{1, \ldots, n\}$. this changes the sign of all coefficients with $b_i = 1$ in the subscripts:

$$S_{b_1, \ldots, b_i = 1, \ldots, b_n} \to -S_{b_1, \ldots, b_i = 1, \ldots, b_n}.$$

3. interchange of any input variable $x_i$ with $x_j$, $i \neq j$. in the spectral domain this corresponds to the permutation of the binary values at the positions $i$ and $j$ in the subscripts:

$$S_{b_1, \ldots, b_i, \ldots, b_j, \ldots, b_n} \leftrightarrow S_{b_1, \ldots, b_j, \ldots, b_i, \ldots, b_n}.$$

4. replacement of any input variable $x_i$ by the exclusive-or sum $x_i \oplus x_j$, $i \neq j$. this operation interchanges all coefficients with $b_i = 1$, $b_j$ in the subscripts with the coefficients having $b_i = 1$, $\bar{b}_j$, while all the coefficients with $b_i = 0$ in the subscripts remain unchanged:

$$S_{b_1, \ldots, b_i = 1, \ldots, b_j, \ldots, b_n} \leftrightarrow S_{b_1, \ldots, b_i = 1, \ldots, \bar{b}_j, \ldots, b_n}.$$

5. modification of the function $f(x_1, \ldots, x_n)$ into the function $f^*(x_1, \ldots, x_n) = f(x_1, \ldots, x_n) \oplus x_i$, where $x_i \in \{x_1, \ldots, x_n\}$. this interchanges all coefficients having $b_i = 1$ in the subscripts with the coefficients having $b_i = 0$ [12]:

$$S_{b_1, \ldots, b_i = 1, \ldots, b_n} \leftrightarrow S_{b_1, \ldots, b_i = 0, \ldots, b_n}.$$
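the five operations of definition 2 can be checked numerically. the sketch below is only an illustration under stated assumptions: the number of variables `N = 4`, the zero-based positions `i = 0` and `j = 2` (that is, $x_1$ and $x_3$), and all helper names are hypothetical. each entry of `cases` pairs a transformed function with the spectrum that definition 2 predicts for it.

```python
import random
from itertools import product

N = 4  # illustrative number of variables
POINTS = list(product((0, 1), repeat=N))  # all inputs, and also all subscripts


def spectrum(f):
    """walsh coefficients of f as a dict keyed by the subscript (b1, ..., bn)."""
    return {
        w: sum((-1) ** (f(x) ^ (sum(a & b for a, b in zip(w, x)) & 1)) for x in POINTS)
        for w in POINTS
    }


def flip(t, k):
    """complement position k of the tuple t."""
    return t[:k] + (1 - t[k],) + t[k + 1:]


def swap(t, i, j):
    """exchange positions i and j of the tuple t."""
    lst = list(t)
    lst[i], lst[j] = lst[j], lst[i]
    return tuple(lst)


def xor_into(t, i, j):
    """replace position i of the tuple t by t[i] xor t[j]."""
    return t[:i] + (t[i] ^ t[j],) + t[i + 1:]


def check_operations(f, i=0, j=2):
    """compare each transformed spectrum with the prediction of definition 2."""
    s = spectrum(f)
    cases = {
        "1: negate f": (lambda x: 1 ^ f(x), lambda w: -s[w]),
        "2: negate x_i": (lambda x: f(flip(x, i)), lambda w: (-1) ** w[i] * s[w]),
        "3: swap x_i, x_j": (lambda x: f(swap(x, i, j)), lambda w: s[swap(w, i, j)]),
        # x_i -> x_i xor x_j complements b_j exactly when b_i = 1
        "4: x_i xor x_j": (lambda x: f(xor_into(x, i, j)), lambda w: s[xor_into(w, j, i)]),
        "5: f xor x_i": (lambda x: f(x) ^ x[i], lambda w: s[flip(w, i)]),
    }
    result = {}
    for name, (g, predicted) in cases.items():
        sg = spectrum(g)
        result[name] = all(sg[w] == predicted(w) for w in POINTS)
    return result


if __name__ == "__main__":
    random.seed(0)
    table = {x: random.randint(0, 1) for x in POINTS}  # a random boolean function
    print(check_operations(table.__getitem__))
```

the spectrum is computed directly from the defining sum rather than with a fast transform, which keeps the subscript bookkeeping explicit at the cost of speed.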
it is important to note that, referring to the reed-muller polynomial expressions for boolean functions, the spectral invariant operations defined above do not increase the number of variables in the product terms.

2.3 walsh classification of boolean functions

by using spectral invariant operations, hurst in [12] and lechner in [13] proposed a procedure for reordering spectral coefficients into the positive canonic order, which represents the classification entry of the given boolean function $f(x_1, \ldots, x_n)$. positive canonic order is used for $S_{00 \cdots 0}$ and the first order coefficients, while other coefficients may take negative values.

in [12], a classification of all boolean functions with $n \leq 5$ variables is given as a table with 48 canonic vectors. in this table, not all of the 32 coefficient values for each classified entry are shown. instead, the values of the zero and first order coefficients are shown together with the number of coefficients having the same value in each entry. for example, $(6 \times 10; 10 \times 6; 16 \times 2)$ indicates that 6, 10, and 16 coefficients have the absolute values 10, 6, and 2, respectively, in the entire spectrum of 32 coefficients. in table 1, we show a part of this table from [12], starting from class no. 30. we consider this part of the classification table since there are some inconsistencies in this classification, which can be summarized as follows:

• the $n + 1$ zero and first order coefficients are sufficient to identify unambiguously each classification entry for $n \leq 5$, except for functions with canonic vectors 35 and 36, as well as 45a and 45b.

• there are 6 groups of classes with identical spectral summary and different canonical form of first order coefficients: functions 31 and 32; 33 and 35; 34 and 37; 38 and 39; 41, 42, and 43; as well as 45a, 45b and 47.

table 1: a part of the classification table in the walsh classification of boolean functions for $n \leq 5$.
| class | primary coefficients | complete spectral summary |
|---|---|---|
| 30 | 14 10 10 10 10 6 | 1 × 14, 5 × 10, 7 × 6, 19 × 2 |
| 31 | 14 10 10 10 6 6 | 1 × 14, 3 × 10, 13 × 6, 15 × 2 |
| 32 | 14 10 10 6 6 6 | 1 × 14, 3 × 10, 13 × 6, 15 × 2 |
| 33 | 12 12 12 12 8 4 | 4 × 12, 4 × 8, 12 × 4, 12 × 0 |
| 34 | 12 12 12 12 4 4 | 4 × 12, 28 × 4 |
| 35 | 12 12 12 8 8 8 | 4 × 12, 4 × 8, 12 × 4, 12 × 0 |
| 36 | 12 12 12 8 8 8 | 3 × 12, 6 × 8, 13 × 4, 10 × 0 |
| 37 | 12 12 12 4 4 4 | 4 × 12, 28 × 4 |
| 38 | 12 12 8 8 8 8 | 2 × 12, 8 × 8, 14 × 4, 8 × 0 |
| 39 | 12 12 8 8 8 4 | 2 × 12, 8 × 8, 14 × 4, 8 × 0 |
| 40 | 12 8 8 8 8 8 | 1 × 12, 10 × 8, 15 × 4, 6 × 0 |
| 41 | 10 10 10 10 10 10 | 6 × 10, 10 × 6, 16 × 2 |
| 42 | 10 10 10 10 10 6 | 6 × 10, 10 × 6, 16 × 2 |
| 43 | 10 10 10 10 10 2 | 6 × 10, 10 × 6, 16 × 2 |
| 44 | 10 10 10 10 6 6 | 4 × 10, 16 × 6, 12 × 2 |
| 45a | 8 8 8 8 8 8 | 16 × 8, 16 × 0 |
| 45b | 8 8 8 8 8 8 | 16 × 8, 16 × 0 |
| 46 | 8 8 8 8 8 4 | 12 × 8, 16 × 4, 4 × 0 |
| 47 | 8 8 8 8 8 0 | 16 × 8, 16 × 0 |

• the two entries 45a and 45b are the only entries for $n \leq 5$ with equal first order coefficients up to the ordering under spectral invariant operations. further, these functions have the same spectral summaries. however, it has been proved impossible to map a function in the class 45a into a function in the class 45b, and vice versa, by using the affine spectral invariant operations used in this classification [12]. these two classes are illustrated by the functions shown in the karnaugh maps in table 2 and table 3 [12].

similar problems in the walsh classification using affine invariant operations are mentioned by lechner in [13], using a different notation, as shown in table 4.

in what follows, we show that certain appropriately defined spectral invariant operations, defined for functions with disjoint products of pairs of variables [17], permit a refinement of this classification in the sense that these classes, which are considered as related but different, can be unified. we introduce a new spectral invariant operation that permits converting functions from these two classes to each other, meaning that they belong to the same class when this new operation is added to the classification rules.

table 2: a function from the class 45a. the left four value columns correspond to $x_1 = 0$ and the right four to $x_1 = 1$.

| x2x3 / x4x5 | 00 | 01 | 11 | 10 | 00 | 01 | 11 | 10 |
|---|---|---|---|---|---|---|---|---|
| 00 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| 01 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 0 |
| 11 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 0 |
| 10 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 |

table 3: a function from the class 45b. the left four value columns correspond to $x_1 = 0$ and the right four to $x_1 = 1$.

| x2x3 / x4x5 | 00 | 01 | 11 | 10 | 00 | 01 | 11 | 10 |
|---|---|---|---|---|---|---|---|---|
| 00 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| 01 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |
| 11 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 1 |
| 10 | 0 | 1 | 0 | 1 | 0 | 1 | 1 | 0 |

table 4: correspondence between notations of walsh classes used by hurst and lechner.

| hurst's classes | lechner's classes |
|---|---|
| functions 31, 32 | classes 30a, 30b |
| functions 33, 35 | classes 32a, 32b |
| functions 34, 37 | classes 33a, 33b |
| functions 38, 39 | classes 35a, 35b |
| functions 41, 43 | classes 37a, 37b |
| functions 45a, 45b, 47 | classes 39a, 39b |
| functions 42 | not illustrated by lechner |

3 new spectral invariant operations

the fifth spectral invariant operation in definition 2, which is called disjoint spectral translation [11], [12], concerns adding a linear term $x_i$ to a function $f$ without changing the absolute values of the walsh coefficients of $f$. the spectral invariant operation introduced in the present section concerns the conditions under which adding a non-linear term to a function $f$ satisfies the requirements of walsh spectrum invariance. we show that when the reed-muller polynomial of a boolean function contains two disjoint products of pairs of variables, it is possible to add certain products of three variables such that some spectral coefficients are permuted or change sign while their absolute values remain unchanged. therefore, this operation can be viewed as a spectral invariant operation in the sense discussed above.
theorem 1 (adding to f a product of three variables). let $f$ be a boolean function of $n$ variables which has two disjoint products of two variables in its polynomial form,

$$f(x_1, \ldots, x_n) = x_{i_1} x_{i_2} \oplus x_{i_3} x_{i_4} \oplus h(x_1, \ldots, x_k),$$

where $\{i_1, i_2\} \cap \{i_3, i_4\} = \emptyset$, and $h(x_1, \ldots, x_k)$ is an arbitrary function of the variables $\{x_1, \ldots, x_k\} \subset \{x_1, \ldots, x_n\}$. furthermore, let

$$g(x_1, \ldots, x_n) = f(x_1, \ldots, x_n) \oplus x_{j_1} x_{j_2} x_{j_3},$$

where $j_1 \in \{i_1, i_2\}$, $j_2 \in \{i_3, i_4\}$ and $j_3 \notin \{i_1, i_2, i_3, i_4\}$. the walsh spectral coefficients of $f$ and $g$ are related as

$$S^{g}_{b_1, \ldots, b_{j_4}=1, b_{j_5}=1, b_{j_3}=0, \ldots, b_n} \leftrightarrow S^{f}_{b_1, \ldots, b_{j_4}=1, b_{j_5}=1, b_{j_3}=1, \ldots, b_n},$$

$$S^{g}_{b_1, \ldots, b_{j_4}=1, b_{j_5}=1, b_{j_3}=1, \ldots, b_n} \leftrightarrow S^{f}_{b_1, \ldots, b_{j_4}=1, b_{j_5}=1, b_{j_3}=0, \ldots, b_n},$$

where $j_4 = \{i_1, i_2\} \setminus \{j_1\}$ and $j_5 = \{i_3, i_4\} \setminus \{j_2\}$. in other words, the coefficients with $b_{j_4} = b_{j_5} = 1$ and $b_{j_3} = 0$ in the subscripts are permuted with the coefficients with $b_{j_4} = b_{j_5} = 1$ and $b_{j_3} = 1$ in the subscripts.

proof: for simplicity, assume that $h$ does not appear in $f$, i.e., $h(x_1, \ldots, x_k) = 0$. therefore, consider a function $f(x_1, \ldots, x_n)$ in $n$ variables defined by the polynomial form consisting of two disjoint product terms of pairs of variables,

$$f(x_1, \ldots, x_n) = x_{i_1} x_{i_2} \oplus x_{i_3} x_{i_4}.$$

table 5: variables, functions, and spectral coefficients in the proof of theorem 1.
| $x_{i_1}$ | $x_{i_2}$ | $x_{i_3}$ | $x_{i_4}$ | $x_{i_5}$ | $f$ | $g$ | $F$ | $G$ | $W_{2,4}$ | $W_{2,4,5}$ | $S^f_{1,1,0}$ | $S^g_{1,1,1}$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 1 | 0 | 1 | 0 | 1 | 1 | -1 | 1 | -1 | 1 | 1 |
| 1 | 0 | 1 | 1 | 1 | 1 | 0 | -1 | 1 | -1 | 1 | 1 | 1 |
| 1 | 1 | 1 | 0 | 1 | 1 | 0 | -1 | 1 | -1 | 1 | 1 | 1 |
| 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | -1 | 1 | -1 | 1 | 1 |

by adding the product of variables $x_{j_1} x_{j_2} x_{j_3}$, where $j_1 \in \{i_1, i_2\}$, $j_2 \in \{i_3, i_4\}$ and $j_3 \notin \{i_1, i_2, i_3, i_4\}$, we get the function

$$g(x_1, \ldots, x_n) = x_{i_1} x_{i_2} \oplus x_{i_3} x_{i_4} \oplus x_{j_1} x_{j_2} x_{j_3}.$$

to simplify the notation in the proof, consider the case when $j_1 = i_1$, $j_2 = i_3$, and $j_3 = i_5$. by adding the product $x_{i_1} x_{i_3} x_{i_5}$, only the values of the function $f$ for $x_{i_1} = 1$, $x_{i_3} = 1$, and $x_{i_5} = 1$ are complemented. since $f(x_1, \ldots, x_n) = x_{i_1} x_{i_2} \oplus x_{i_3} x_{i_4}$, the values of the function at these points are

$$f(x_1, \ldots, x_n) = 1 \cdot x_{i_2} \oplus 1 \cdot x_{i_4} = x_{i_2} \oplus x_{i_4},$$

and the values of the function $g$ are

$$g(x_1, \ldots, x_n) = x_{i_1} x_{i_2} \oplus x_{i_3} x_{i_4} \oplus x_{i_1} x_{i_3} x_{i_5} = 1 \cdot x_{i_2} \oplus 1 \cdot x_{i_4} \oplus 1 = x_{i_2} \oplus x_{i_4} \oplus 1,$$

as shown in table 5. in this table, the first five columns show the values of the input variables $x_{i_1}$, $x_{i_2}$, $x_{i_3}$, $x_{i_4}$, and $x_{i_5}$. the columns $f$, $g$, $F$, and $G$ show the values of the functions $f$ and $g$ in the $(0, 1)$ encoding and of $F$ and $G$ in the $(1, -1)$ encoding. the next two columns show the values of the rows of the walsh matrix, $W_{2,4}$ and $W_{2,4,5}$, through which the coefficients $S_{b_{i_2}=1, b_{i_4}=1}$ and $S_{b_{i_2}=1, b_{i_4}=1, b_{i_5}=1}$ are calculated. the last two columns, denoted as $S^f_{1,1,0}$ and $S^g_{1,1,1}$, show the values of the coefficients $S_{b_{i_2}=1, b_{i_4}=1, b_{i_5}=0}$ for the function $f$ and the values of the coefficients $S_{b_{i_2}=1, b_{i_4}=1, b_{i_5}=1}$ for the function $g$. the values in the column $F$ are in correlation with the values of the walsh function $W_{2,4}$, which is used for the calculation of the coefficients with $b_{i_2} = 1$, $b_{i_4} = 1$, and $b_{i_5} = 0$ in the subscript, while the values in the column $G$ are in correlation with the values of the walsh function $W_{2,4,5}$, used for the calculation of the coefficients with $b_{i_2} = 1$, $b_{i_4} = 1$, and $b_{i_5} = 1$ in the subscript.
from this it follows that the changes in the function $f$ influence only the coefficients having $b_{i_2} = 1$, $b_{i_4} = 1$, and $b_{i_5} = 0$ in the subscripts, and for the function $g$ they have the same values as the coefficients of the function $f$ having $b_{i_2} = 1$, $b_{i_4} = 1$, and $b_{i_5} = 1$. also, the coefficients of the function $g$ having $b_{i_2} = 1$, $b_{i_4} = 1$, and $b_{i_5} = 1$ in the subscripts have the same values as the coefficients of the function $f$ having $b_{i_2} = 1$, $b_{i_4} = 1$, and $b_{i_5} = 0$ in the subscripts.

theorem 2 (adding two products of three variables). let $f$ be a boolean function of $n$ variables which has two disjoint products of two variables in its polynomial form,

$$f(x_1, \ldots, x_n) = x_{i_1} x_{i_2} \oplus x_{i_3} x_{i_4} \oplus h(x_1, \ldots, x_k),$$

where $\{i_1, i_2\} \cap \{i_3, i_4\} = \emptyset$, and $\{x_1, \ldots, x_k\} \subset \{x_1, \ldots, x_n\}$. define

$$g(x_1, \ldots, x_n) = f(x_1, \ldots, x_n) \oplus x_{j_1} x_{j_2} x_{j_3} \oplus x_{j_4} x_{j_5} \bar{x}_{j_3},$$

where $j_1 \in \{i_1, i_2\}$, $j_2 \in \{i_3, i_4\}$, $j_3 \notin \{i_1, i_2, i_3, i_4\}$, $j_4 = \{i_1, i_2\} \setminus \{j_1\}$, and $j_5 = \{i_3, i_4\} \setminus \{j_2\}$. the following relations between pairs of spectral coefficients of $f$ and $g$ exist:

$$S^{g}_{b_1, b_2=1, b_3, b_4=1, b_5=0} \leftrightarrow S^{f}_{b_1, b_2=1, b_3, b_4=1, b_5=1}, \qquad S^{g}_{b_1, b_2=1, b_3, b_4=1, b_5=1} \leftrightarrow S^{f}_{b_1, b_2=1, b_3, b_4=1, b_5=0},$$

for $b_1, b_3 \in \{0, 1\}$, and

$$S^{g}_{b_1=1, b_2, b_3=1, b_4, b_5=0} \leftrightarrow -S^{f}_{b_1=1, b_2, b_3=1, b_4, b_5=1}, \qquad S^{g}_{b_1=1, b_2, b_3=1, b_4, b_5=1} \leftrightarrow -S^{f}_{b_1=1, b_2, b_3=1, b_4, b_5=0},$$

for $b_2, b_4 \in \{0, 1\}$.

theorem 2 can be proved by using the same approach as in theorem 1.

example 1. consider the function $f(x_1, \ldots, x_5) = x_1 x_2 \oplus x_3 x_4 \oplus x_5$ with the spectrum

$$S_f = [0, 8, 0, 8, 0, 8, 0, -8, 0, 8, 0, 8, 0, 8, 0, -8, 0, 8, 0, 8, 0, 8, 0, -8, 0, -8, 0, -8, 0, -8, 0, 8]^T.$$
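the spectrum quoted for $f$, and the coefficient exchanges stated in theorem 1 and theorem 2, can be reproduced with a short script. this is a minimal sketch under stated assumptions: it fixes $j_1 = 1$, $j_2 = 3$, $j_3 = 5$ (hence $j_4 = 2$, $j_5 = 4$), and the names `g_one`, `g_two`, `pos` and `summary` are hypothetical. for subscripts with $b_1 = b_2 = b_3 = b_4 = 1$ both exchanges of theorem 2 apply at once; the script composes them there, which leaves the position unchanged and changes only the sign.

```python
from collections import Counter
from itertools import product


def spectrum(f, n=5):
    """walsh spectrum in hadamard ordering; b1 is the most significant index bit."""
    s = []
    for idx in range(2 ** n):
        x = [(idx >> (n - 1 - k)) & 1 for k in range(n)]
        s.append((-1) ** f(*x))  # (0, 1) -> (1, -1) encoding
    h = 1
    while h < len(s):  # fast walsh-hadamard butterflies
        for start in range(0, len(s), 2 * h):
            for k in range(start, start + h):
                s[k], s[k + h] = s[k] + s[k + h], s[k] - s[k + h]
        h *= 2
    return s


def pos(*bits):
    """position of the coefficient with subscript b1...b5 in the spectrum vector."""
    return int("".join(str(b) for b in bits), 2)


def f(x1, x2, x3, x4, x5):
    return (x1 & x2) ^ (x3 & x4) ^ x5


def g_one(x1, x2, x3, x4, x5):
    # theorem 1: add the product x1 x3 x5
    return f(x1, x2, x3, x4, x5) ^ (x1 & x3 & x5)


def g_two(x1, x2, x3, x4, x5):
    # theorem 2: additionally add x2 x4 (not x5)
    return g_one(x1, x2, x3, x4, x5) ^ (x2 & x4 & (1 ^ x5))


SF, SG1, SG2 = spectrum(f), spectrum(g_one), spectrum(g_two)


def theorem1_holds():
    for b1, b2, b3, b4, b5 in product((0, 1), repeat=5):
        exchanged = b2 == b4 == 1
        src = pos(b1, b2, b3, b4, 1 - b5 if exchanged else b5)
        if SG1[pos(b1, b2, b3, b4, b5)] != SF[src]:
            return False
    return True


def theorem2_holds():
    for b1, b2, b3, b4, b5 in product((0, 1), repeat=5):
        here, partner = pos(b1, b2, b3, b4, b5), pos(b1, b2, b3, b4, 1 - b5)
        first, second = b2 == b4 == 1, b1 == b3 == 1
        if first and second:
            expected = -SF[here]  # both exchanges composed: sign change only
        elif first:
            expected = SF[partner]
        elif second:
            expected = -SF[partner]
        else:
            expected = SF[here]
        if SG2[here] != expected:
            return False
    return True


def summary(s):
    """spectral summary: how many coefficients share each absolute value."""
    return dict(Counter(abs(v) for v in s))


if __name__ == "__main__":
    print(SF)
    print(theorem1_holds(), theorem2_holds())
    print(summary(SF), summary(SG1), summary(SG2))
```

the three functions share one spectral summary, which is the sense in which adding the products acts as a spectral invariant operation here.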
by adding to f a product of three variables (one variable from the first product, one from the second product, and, as the third, the variable x5, which is not included in the products), a new function
g1(x1, · · · , x5) = f(x1, · · · , x5) ⊕ x1x3x5 = x1x2 ⊕ x3x4 ⊕ x5 ⊕ x1x3x5
is generated. the spectrum of g1 is
sg1 = [0, 8, 0, 8, 0, 8, 0, −8, 0, 8, 8, 0, 0, 8, −8, 0, 0, 8, 0, 8, 0, 8, 0, −8, 0, −8, −8, 0, 0, −8, 8, 0]^T.
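The spectra quoted in example 1 can be checked numerically. The sketch below assumes the convention that reproduces the listed values: the coefficient for index b1b2b3b4b5 (b1 most significant) is the sum over all inputs x of (−1)^(f(x) ⊕ b·x). The function and constant names are illustrative and not taken from the paper.

```python
from itertools import product

def walsh_spectrum(f, n):
    """Spectrum S(w) = sum_x (-1)^(f(x) XOR w.x); index w is read as b1 b2 ... bn."""
    points = list(product((0, 1), repeat=n))  # lexicographic order, b1 first
    spectrum = []
    for w in points:
        total = 0
        for x in points:
            dot = sum(wi & xi for wi, xi in zip(w, x)) & 1
            total += -1 if (f(*x) ^ dot) else 1
        spectrum.append(total)
    return spectrum

def f(x1, x2, x3, x4, x5):
    return (x1 & x2) ^ (x3 & x4) ^ x5

def g1(x1, x2, x3, x4, x5):
    return f(x1, x2, x3, x4, x5) ^ (x1 & x3 & x5)

# Spectra as printed in example 1.
PAPER_SF = [0, 8, 0, 8, 0, 8, 0, -8, 0, 8, 0, 8, 0, 8, 0, -8,
            0, 8, 0, 8, 0, 8, 0, -8, 0, -8, 0, -8, 0, -8, 0, 8]
PAPER_SG1 = [0, 8, 0, 8, 0, 8, 0, -8, 0, 8, 8, 0, 0, 8, -8, 0,
             0, 8, 0, 8, 0, 8, 0, -8, 0, -8, -8, 0, 0, -8, 8, 0]

sf = walsh_spectrum(f, 5)
sg1 = walsh_spectrum(g1, 5)

# Coefficients with b2 = b4 = 1 exchange their b5 = 0 and b5 = 1 values;
# all other coefficients are unchanged.
relation_holds = all(
    sg1[w] == (sf[w ^ 1] if (w >> 3) & 1 and (w >> 1) & 1 else sf[w])
    for w in range(32)
)

print(sf == PAPER_SF, sg1 == PAPER_SG1, relation_holds)
```

Here `w ^ 1` flips the last index bit b5, so the final check is the pairwise exchange described in the proof above, specialised to i2 = 2, i4 = 4, and i5 = 5.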
the spectrum of g1 is related to the spectrum of f as follows:
s^{g1}_{01010} = s^f_{01011}, s^{g1}_{01011} = s^f_{01010},
s^{g1}_{01110} = s^f_{01111}, s^{g1}_{01111} = s^f_{01110},
s^{g1}_{11010} = s^f_{11011}, s^{g1}_{11011} = s^f_{11010},
s^{g1}_{11110} = s^f_{11111}, s^{g1}_{11111} = s^f_{11110}.

also, it is possible to generate from f a new function by adding two products of three variables, one product with x5 and the second with x̄5, as in the following example.

example 2 consider again the function f in example 1 and a new function
g2(x1, · · · , x5) = f(x1, · · · , x5) ⊕ x1x3x̄5 ⊕ x2x4x5 = x1x2 ⊕ x3x4 ⊕ x5 ⊕ x1x3x5 ⊕ x1x3 ⊕ x2x4x5.
the function g2 has the spectrum
sg2 = [0, 8, 0, 8, 0, 8, 0, −8, 0, 8, −8, 0, 0, 8, 8, 0, 0, 8, 0, 8, 8, 0, −8, 0, 0, −8, 8, 0, −8, 0, 0, −8]^T,
related to the spectrum of f by the permutation of the following coefficients:
s^{g2}_{01010} = −s^f_{01011}, s^{g2}_{01011} = −s^f_{01010},
s^{g2}_{01110} = −s^f_{01111}, s^{g2}_{01111} = −s^f_{01110},
s^{g2}_{11010} = −s^f_{11011}, s^{g2}_{11011} = −s^f_{11010},
s^{g2}_{11110} = −s^f_{11111}, s^{g2}_{11111} = −s^f_{11110},
s^{g2}_{10100} = s^f_{10101}, s^{g2}_{10101} = s^f_{10100},
s^{g2}_{10110} = s^f_{10111}, s^{g2}_{10111} = s^f_{10110},
s^{g2}_{11100} = s^f_{11101}, s^{g2}_{11101} = s^f_{11100},
s^{g2}_{11110} = s^f_{11111}, s^{g2}_{11111} = s^f_{11110}.

note that it is possible to generalize the operations defined in theorem 1 and theorem 2. when the function f has more than two disjoint products in the polynomial form, it is possible to add products with more than three variables by selecting the variables according to the described rules.

4 classes 45a, 45b and 47

in this section, we consider relationships between functions in the classes 45a, 45b, and 47.

table 6: functions in 5 variables with solely products of two and three variables in the polynomial form.
pairs \ number of cubes      0      1      2      3      4      5      6      7      8      9     10
 0                           0      0      0      0      0      0      0      0      5      0      0
 1                           0      0      0      0      0     60    120    180    150      0      0
 2                         115     60    180    360    615   1320   1170    900    300    120      0
 3                          90    360    990   1860   3990   4980   4470   2520   1110    120     60
 4                         190    840   2250   4740   8490  11700   9630   4800   2340    480      0
 5                         222   1020   3270   7980  11310  15012  10290   6900   2730    480     72
 6                         200    720   3390   7080  11130  10320   9270   5760   2460    360      0
 7                         110    300   2010   3600   5610   6120   5250   2580   1410    360     60
 8                          30     60    630   1440   1470   3000   1290    780    285      0      0
 9                          10      0    210    240    570    180    270     60      0      0      0
10                           1      0     30     60     15     12      0      0      0      0      0

from a direct evaluation, the function f(x1, · · · , x5) has the canonic representation (8 8 8 8 8 0 16 × 8; 16 × 0), which means that this function is from the class 47. the function g1 has the canonic representation (8 8 8 8 8 8 16 × 8; 16 × 0) and, therefore, it is from the class 45a, while the function g2 with the identical canonic representation is from the class 45b. it is impossible to transform the function f into either of the functions g1 or g2 by using the affine spectral invariant operations. the function f has only quadratic product terms in the polynomial form, and with these operations it is impossible to generate a product with a higher number of variables, while the polynomial forms of the functions g1 and g2 contain products with three variables. all these functions f, g1, and g2 are connected by the new invariant operation, from which it follows that classes 45a, 45b, and 47 are subclasses of a single class.

table 6 shows the number of functions from this class having exclusively products of two variables or three variables in their polynomial forms and no other terms. the total number of such functions is 219604, and the total number of all functions in that class, including linear terms in the polynomial forms, is 219604 × 32 = 7027328. the column denoted by 0 shows the number of functions with quadratic polynomial forms.
for example, there are 115 functions in this class with two products in the polynomial forms, while the number of functions with five products of two variables in the polynomial form is 222. all other functions from this class have some products with three variables in the polynomial form. for example, there are 15012 functions with five products of two variables and five products of three variables. it is possible to generate all these functions by the spectral invariant operations including the new operations.

table 7: parallels in notation of the classes by hurst and lechner.
hurst classes                lechner classes      new class
functions 31, 32             classes 30a, 30b     30
functions 33, 35             classes 32a, 32b     32
functions 34, 37             classes 33a, 33b     33
functions 38, 39             classes 35a, 35b     35
functions 41, 42, 43         classes 37a, 37b     37
functions 45a, 45b, 47       classes 39a, 39b     39

table 8: basic functions for the new classes.
new class    basic function
30           x1x2 ⊕ x3x4 ⊕ x1x2x5 ⊕ x1x2x3x4x5
32           x1x2 ⊕ x3x4 ⊕ x1x2x5 ⊕ x1x2x3x4
33           x1x2 ⊕ x3x4 ⊕ x1x2x5
35           x1x2 ⊕ x3x4 ⊕ x1x2x3x5
37           x1x2 ⊕ x3x4 ⊕ x1x2x3x4x5
39           x1x2 ⊕ x3x4

here we will show how the functions φ1 and φ2, defined by the karnaugh maps in table 2 and table 3, are mutually related, and how it is possible to generate these two functions starting from the function x1x2 ⊕ x3x4 by increasing the number of variables in the product terms with the spectral invariant operation introduced above.
the polynomial forms of these functions are
φ1 = x1x4 ⊕ x2x5 ⊕ x1x2x3 ⊕ x1x2x4 ⊕ x1x2x5,
φ2 = x1x3 ⊕ x2x4 ⊕ x3x4 ⊕ x1x2x5 ⊕ x1x2x4 ⊕ x1x2x3 ⊕ x3x4x5.

constructing φ1
• start from ζ1(x1, · · · , x5) = x1x2 ⊕ x3x4.
• permutation of x1 and x5 in ζ1 produces the function ζ2(x1, · · · , x5) = x2x5 ⊕ x3x4.
• permutation of x1 and x3 in ζ2 produces the function ζ3(x1, · · · , x5) = x2x5 ⊕ x1x4.
• adding x1x2x3 according to the new invariant operation produces the function ζ4(x1, · · · , x5) = x2x5 ⊕ x1x4 ⊕ x1x2x3.
• replacing x3 with x3 ⊕ x4 in ζ4 produces the function ζ5(x1, · · · , x5) = x2x5 ⊕ x1x4 ⊕ x1x2x3 ⊕ x1x2x4.
• replacing x3 with x3 ⊕ x5 in ζ5 produces the function φ1 = x2x5 ⊕ x1x4 ⊕ x1x2x3 ⊕ x1x2x4 ⊕ x1x2x5.

constructing φ2
• start from ζ1(x1, · · · , x5) = x1x2 ⊕ x3x4.
• permutation of x2 and x3 in ζ1 produces the function ζ2(x1, · · · , x5) = x1x3 ⊕ x2x4.
• adding x1x2x5 and x3x4x̄5 according to the new invariant operation produces the function ζ3(x1, · · · , x5) = x1x3 ⊕ x2x4 ⊕ x1x2x5 ⊕ x3x4x5 ⊕ x3x4.
• replacing x5 with x5 ⊕ x4 in ζ3 produces the function ζ4(x1, · · · , x5) = x1x3 ⊕ x2x4 ⊕ x1x2x5 ⊕ x1x2x4 ⊕ x3x4x5.
• replacing x5 with x5 ⊕ x3 in ζ4 produces the function φ2 = x1x3 ⊕ x2x4 ⊕ x1x2x5 ⊕ x1x2x3 ⊕ x1x2x4 ⊕ x3x4x5 ⊕ x3x4.
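The two constructions of φ1 and φ2 can be replayed mechanically. The sketch below is only an illustration: it stores a polynomial over GF(2) as a set of monomials (each a frozenset of variable indices), so that XOR of polynomials is a symmetric set difference. The helper names `anf`, `swap`, and `substitute` are hypothetical and do not come from the paper.

```python
def anf(*monomials):
    """Polynomial over GF(2) as a set of monomials; equal monomials cancel."""
    poly = set()
    for m in monomials:
        poly ^= {frozenset(m)}
    return poly

def swap(poly, a, b):
    """Permute the variables x_a and x_b."""
    table = {a: b, b: a}
    result = set()
    for m in poly:
        result ^= {frozenset(table.get(v, v) for v in m)}
    return result

def substitute(poly, a, b):
    """Replace x_a with x_a XOR x_b (using x_b * x_b = x_b)."""
    result = set()
    for m in poly:
        result ^= {m}
        if a in m:
            result ^= {(m - {a}) | {b}}
    return result

zeta1 = anf((1, 2), (3, 4))

# Constructing phi1.
z = swap(zeta1, 1, 5)            # x2x5 + x3x4
z = swap(z, 1, 3)                # x2x5 + x1x4
z = z ^ anf((1, 2, 3))           # new operation: add x1x2x3
z = substitute(z, 3, 4)          # x3 -> x3 + x4
phi1 = substitute(z, 3, 5)       # x3 -> x3 + x5

# Constructing phi2; x3x4(not x5) expands to x3x4x5 + x3x4.
y = swap(zeta1, 2, 3)            # x1x3 + x2x4
y = y ^ anf((1, 2, 5), (3, 4, 5), (3, 4))
y = substitute(y, 5, 4)          # x5 -> x5 + x4
phi2 = substitute(y, 5, 3)       # x5 -> x5 + x3

print(sorted(sorted(m) for m in phi1))
print(sorted(sorted(m) for m in phi2))
```

Both results coincide with the polynomial forms of φ1 and φ2 given above; the cancellation of x3x4 in the step from ζ3 to ζ4 falls out of the set representation automatically.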
5 consideration of other classes

the considerations for the three classes 45a, 45b, and 47 can be repeated for all other classes in table 4. in all these cases it is possible to find a basic function from which all other functions from the class considered are generated by using the five affine invariant operations together with the new invariant operations, as shown in table 7. note that all these basic functions have disjoint products of pairs of variables in the polynomial forms. the following example shows how classes 34 and 37 from table 1 are mutually related.

example 3 let us start from the function η1 = x1x2 ⊕ x3x4 ⊕ x1x2x5.
the spectrum of this function is
sη1 = [12, −4, 12, −4, 12, −4, −12, 4, 4, 4, 4, 4, 4, 4, −4, −4, 4, 4, 4, 4, 4, 4, −4, −4, −4, −4, −4, −4, −4, −4, 4, 4]^T.
the characteristic vector of the function η1 is (12, 12, 12, 4, 4, 4), which means that it belongs to the class 37.

by adding x1x3x̄5 and x2x4x5 according to the new invariant operation, we have the function
η2 = x1x2 ⊕ x3x4 ⊕ x1x2x5 ⊕ x1x3x5 ⊕ x1x3 ⊕ x2x4x5,
with the spectrum
sη2 = [12, −4, 12, −4, 4, 4, −4, −4, 4, 4, −4, −4, 12, −4, −4, 12, 4, 4, 4, 4, 4, 4, −4, −4, −4, −4, 4, 4, −4, −4, −4, −4]^T.

by replacing x2 with x2 ⊕ x3, the function η2 is transformed into the function
η3 = x1x2 ⊕ x3x4 ⊕ x1x2x5 ⊕ x2x4x5 ⊕ x3x4x5
with the spectrum
sη3 = [12, −4, 12, −4, 4, 4, −4, −4, 12, −4, −4, 12, 4, 4, −4, −4, 4, 4, 4, 4, 4, 4, −4, −4, −4, −4, −4, −4, −4, −4, 4, 4]^T.

after the replacement of x5 with x5 ⊕ x2 ⊕ x4, the function η3 is transformed into the function
η4 = x1x2x5 ⊕ x1x2x4 ⊕ x2x4x5 ⊕ x3x4x5 ⊕ x2x3x4
with the spectrum
sη4 = [12, 12, 12, −4, 4, −4, −4, 4, 12, −4, −4, −4, 4, −4, −4, 4, 4, −4, 4, −4, 4, 4, −4, −4, −4, 4, −4, 4, −4, −4, 4, 4]^T.

finally, after permutation of the variables x2 and x3, the function η4 is transformed into the function
η5 = x1x3x5 ⊕ x1x3x4 ⊕ x3x4x5 ⊕ x2x4x5 ⊕ x2x3x4
with the spectrum
sη5 = [12, 12, 12, −4, 12, −4, −4, −4, 4, −4, −4, 4, 4, −4, −4, 4, 4, −4, 4, −4, −4, 4, −4, 4, 4, 4, −4, −4, −4, −4, 4, 4]^T.

the characteristic vector of the function η5 is (12, 12, 12, 12, 4, 4), which means that this function belongs to the class 34, from which it is possible to conclude that classes 34 and 37 are subclasses of a single class. it is possible to generate all functions from these two subclasses from the function η1 by using the spectral invariant operations including the new operation.
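As a sanity check on example 3, the sketch below recomputes the spectrum of η1 and compares the multisets of absolute coefficient values of η1 through η5. It assumes the spectrum convention S(w) = sum over x of (−1)^(η(x) ⊕ w·x) with index b1b2b3b4b5, which reproduces the printed sη1. It does not derive the characteristic vectors, whose definition is not restated in this part of the text. The dictionary `ETA` and the helper names are illustrative only.

```python
from itertools import product

POINTS = list(product((0, 1), repeat=5))  # index order b1 b2 b3 b4 b5

def evaluate(monomials, x):
    """Value of a GF(2) polynomial given as a list of index tuples (1-based)."""
    return sum(all(x[i - 1] for i in m) for m in monomials) & 1

def walsh_spectrum(monomials):
    spectrum = []
    for w in POINTS:
        total = 0
        for x in POINTS:
            dot = sum(a & b for a, b in zip(w, x)) & 1
            total += -1 if evaluate(monomials, x) ^ dot else 1
        spectrum.append(total)
    return spectrum

# Polynomial forms copied from example 3.
ETA = {
    "eta1": [(1, 2), (3, 4), (1, 2, 5)],
    "eta2": [(1, 2), (3, 4), (1, 2, 5), (1, 3, 5), (1, 3), (2, 4, 5)],
    "eta3": [(1, 2), (3, 4), (1, 2, 5), (2, 4, 5), (3, 4, 5)],
    "eta4": [(1, 2, 5), (1, 2, 4), (2, 4, 5), (3, 4, 5), (2, 3, 4)],
    "eta5": [(1, 3, 5), (1, 3, 4), (3, 4, 5), (2, 4, 5), (2, 3, 4)],
}

# Spectrum of eta1 as printed in example 3.
PAPER_S_ETA1 = [12, -4, 12, -4, 12, -4, -12, 4, 4, 4, 4, 4, 4, 4, -4, -4,
                4, 4, 4, 4, 4, 4, -4, -4, -4, -4, -4, -4, -4, -4, 4, 4]

spectra = {name: walsh_spectrum(ms) for name, ms in ETA.items()}
profiles = {name: sorted(abs(c) for c in s) for name, s in spectra.items()}

print(spectra["eta1"] == PAPER_S_ETA1)
print(all(p == profiles["eta1"] for p in profiles.values()))
```

Every step of the chain only permutes or negates coefficients, so all five functions share the same profile of four coefficients of magnitude 12 and twenty-eight of magnitude 4.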
6 conclusion

the classification of boolean functions with respect to walsh spectral coefficients is reconsidered. for this classification, besides the five known invariant operations, a new invariant operation is proposed. this operation is defined for functions with disjoint products of pairs of variables in their reed-muller polynomial form. by using this extended set of invariant operations, it is possible to merge some classes in the walsh classification and to reduce the total number of classes from 48 to 40 for functions with n ≤ 5 input variables.

since the goal of the paper was the reduction of the number of classes in the walsh classification for functions with n ≤ 5, examples of functions with n = 5 variables are considered. however, the introduced spectral invariant operations are valid for functions with more than 5 variables. when the reed-muller polynomial form of a given function has more than two disjoint products, it is possible to add products with more than three variables. due to that, the introduced operation can be used to construct bent functions starting from the bent functions represented by quadratic forms.

acknowledgments

the work leading to this paper was partially supported by the ministry of education, science and technological development, republic of serbia, project no oi 174026. the authors are grateful to the reviewers and the editors of this special issue for their constructive comments, which were very useful in improving the presentation in the paper.

references
[1] t. sasao, switching theory for logic synthesis, kluwer academic publishers, 1999.
[2] s. l. hurst, d. m. miller, j. c. muzio, spectral techniques for digital logic, academic press, 1985.
[3] m. karpovsky, r. s. stanković, j. astola, spectral logic and its applications for the design of digital devices, wiley, 2008.
[4] a. braeken, y. borisov, s. nikova, b. preneel, "classification of boolean functions of 6 variables or less with respect to cryptographic properties", int. colloquium on automata, languages and programming icalp 2005, m. yung, g.f. italiano, c. palamidessi (eds.), lecture notes in computer science, vol. 3580, springer-verlag, 2005, 324-334.
[5] c. carlet, p. sarkar, "spectral domain analysis of correlation immune and resilient boolean functions", finite fields appl., vol. 8, 2002, 120-130.
[6] k. miranovich, "spectral analysis of boolean functions under nonuniformity of arguments", http://eprint.iacr.org/2002/021
[7] p. sarkar, "a note on the spectral characterization of boolean functions", inform. process. lett., vol. 74, 2000, 195.
[8] y. tarannikov, "spectral analysis of high order correlation immune functions", proc. ieee int. symp. on information theory, june 29, 2001, doi 10.1109/isit.2001.935932.
[9] g. z. xiao, j. l. massey, "a spectral characterization of correlation-immune combining functions", ieee trans. on inform. theory, vol. 34, no. 3, 1988, 569-571.
[10] o. s. rothaus, "on bent functions", journal of combinatorial theory, ser. a, vol. 20, 1976, 300-305.
[11] c. r. edwards, "the application of the rademacher-walsh transform to boolean function classification and threshold logic synthesis", trans. ieee, vol. c-24, 1975, 48-62.
[12] s. l. hurst, the logical processing of digital signals, crane, russak & company, inc., new york, edward arnold, london, 1978.
[13] r. j. lechner, "harmonic analysis of switching functions", in a. mukhopadhyay (ed.), recent developments in switching theory, new york, academic, 1971.
[14] s. w. golomb, "on the classification of boolean functions", ire trans. circuit theory, vol. ct-6, no. 5, may 1959, 176-186.
[15] s. w. golomb, g. gong, signal design for good correlation for wireless communications, cryptography and radar, cambridge university press, 2005.
[16] s. w. golomb, l. d. baumert, "the search for hadamard matrices", amer. math. monthly, vol. 70, 1963, 12-17.
[17] m. stanković, c. moraga, r. s. stanković, "new spectral invariant operations for functions with disjoint products in the polynomial form", in proc. eurocast 2017, 19-23 february 2017, las palmas, spain.

a quantum algorithm for automata encoding

edison tsai and marek perkowski
department of electrical and computer engineering, portland state university

abstract. encoding of finite automata or state machines is critical to modern digital logic design methods for sequential circuits. encoding is the process of assigning to every state, input value, and output value of a state machine a binary string, which is used to represent that state, input value, or output value in digital logic. usually, one wishes to choose an encoding that, when the state machine is implemented as a digital logic circuit, will optimize some aspect of that circuit. for instance, one might wish to encode in such a way as to minimize power dissipation or silicon area. for most such optimization objectives, no method to find the exact solution, other than a straightforward exhaustive search, is known. recent progress towards producing a quantum computer of large enough scale to surpass modern supercomputers has made it increasingly relevant to consider how quantum computers may be used to solve problems of practical interest. a quantum computer using grover’s well-known search algorithm can perform exhaustive searches that would be impractical on a classical computer, due to the speedup provided by grover’s algorithm. therefore, we propose to use grover’s algorithm to find optimal encodings for finite state machines via exhaustive search.
we demonstrate the design of quantum circuits that allow grover’s algorithm to be used for this purpose. the quantum circuit design methods that we introduce are potentially applicable to other problems as well.

key words: quantum algorithm, automata encoding, finite state machines.

facta universitatis, series: electronics and energetics, vol. 33, no 2, june 2020, pp. 169-215, https://doi.org/10.2298/fuee2002169t
received september 7, 2017; received in revised form january 28, 2020
corresponding author: marek perkowski, department of electrical and computer engineering, portland state university, usa (e-mail: mperkows@ee.pdx.edu)
© 2020 by university of niš, serbia | creative commons license: cc by-nc-nd

1 introduction

although the concept of quantum computing has existed for decades, only recently has it appeared that quantum computers of sufficient scale to solve realistic problems may become available in the near future.¹ the development of such quantum computers is of great interest because they are capable of solving certain classes of problems with a time complexity better than the best achievable with a classical computer. grover’s well-known search algorithm [2, 3, 4] provides a commonly cited example of such a class of problems. however, despite the theoretical capabilities of quantum computing, virtually no work has been published that demonstrates a specific, concrete example of grover’s algorithm (or another quantum algorithm) applied to solve a problem of practical interest. this gap in the literature has become increasingly relevant as large-scale quantum computers move ever closer to reality.² to fill this gap, we demonstrate in this work how grover’s algorithm can be applied to the problem of state encoding for finite state machines (fsms).

¹ see [1], introduction, paragraph 6: “a physical quantum computer. . . is still an outstanding challenge. however, in recent work, physical qubits. . . have reached the point where errors are at or below the threshold, and networks of 4–9 superconducting qubits with individual control and readout have been used to show concepts of error correction. over the next few years, the field will be in a stage of building interesting quantum devices with a complexity that could never be emulated in full generality on a classical computer (∼50 or more qubits).”

state encoding (a.k.a. state assignment) is the assignment of binary states in a digital logic circuit to represent the internal states of an fsm. it is a well-known problem in digital logic design, which has been recognized for over 50 years [6, 7] and today remains important and relevant to the design of virtually all very-large-scale integrated (vlsi) circuits and systems, including microprocessors. distinct encodings for the same fsm produce distinct implementations in digital logic, which may differ considerably in complexity [6].
usually, one wishes to find a state encoding which is minimal with respect to some metric, e.g., power [8, 9] or silicon area [10, 11] of the resulting digital circuit. our goal is to find the exact minimum (with respect to one particular metric) solution for state and input encoding of finite state machines. so far, only one previous work [12] has attempted to find exact minimum solutions to this problem. furthermore, the methods in [12] find solutions for only state and not input encoding. there is currently no published result that finds the exact minimum solution for concurrent state and input assignment.

the methods in [12] rely on finding prime implicants and solving a covering problem on the set of all prime implicants. this would be difficult to implement using grover’s algorithm on a quantum computer because at least one qubit would be required for each prime implicant and the total number of prime implicants can be extremely large. therefore, we do not attempt to directly adapt the approach presented in [12] for a quantum computer. instead, we use a simplified cost metric from [6, 7], in which the cost of an encoding is defined as the total number of dependencies of the next-state functions on current state and input variables. the use of this metric makes it easier to construct the quantum circuits necessary to use grover’s algorithm to search for encodings. since grover’s algorithm effectively performs an exhaustive search, our method is always able to find an encoding with exact minimum cost. the techniques that we use to adapt grover’s algorithm for the purpose of encoding finite state machines may prove useful for other purposes as well.

footnote 2: see above; see also [5], abstract: “advanced quantum simulation experiments have been shown with up to nine qubits, while a demonstration of quantum supremacy with fifty qubits is anticipated in just a few years.
quantum supremacy means that the quantum system can no longer be simulated by the most powerful classical supercomputers.”

2 finite state machines and state encodings

2.1 review of finite state machines

we assume that the reader is familiar with the concept of finite state machines (fsms) and how they are realized using digital logic, as well as with the state encoding (a.k.a. state assignment or secondary state assignment) problem. for the sake of self-containment we briefly review these subjects here.

an fsm consists of a set of internal states (call it $S$) together with sets of inputs and outputs (call them $I$ and $O$, respectively); at all times, it maintains a single internal state which is an element of $S$, is presented with an input value which is an element of $I$, and produces an output value which is an element of $O$. the machine’s operation is idealized as a discrete-time process; with each successive unit of time, it updates its internal state and output in a deterministic fashion based on its current internal state and the input value that is being presented. thus, the internal state of the machine at time $t + 1$ is a function of the internal state at time $t$ and the input at time $t$:

$$s_{t+1} = \delta(s_t, i_t),$$

where $s_t \in S$ is the internal state at time $t$, $i_t \in I$ is the input value at time $t$, and $s_{t+1} \in S$ is the internal state at time $t + 1$. we refer to the function $\delta$ as the transition function for the fsm. this function is also commonly called the excitation function or next-state function.

the output of an fsm can either directly depend on only its internal state, or it can directly depend on both the internal state and the input. the former scenario corresponds to a so-called moore machine [13, 14] whereas the latter corresponds to a so-called mealy machine [15, 14]. here, as will be discussed in more detail below, we only consider the problem of encoding internal states; thus, the type of the machine is irrelevant and our work is equally applicable to both.
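as a small illustration of the transition function, the sketch below stores $\delta$ as a lookup table and applies it once per time step. the table values are those of the four-state, four-input fsm of figure 2(a); the names `NEXT`, `delta`, and `run` are hypothetical and chosen only for this example.

```python
# Transition table of the FSM from figure 2(a): one row per current state,
# one entry per input value i1..i4. All identifiers here are illustrative.
INPUTS = ["i1", "i2", "i3", "i4"]
NEXT = {
    "s1": ["s2", "s2", "s2", "s3"],
    "s2": ["s4", "s1", "s3", "s3"],
    "s3": ["s2", "s3", "s3", "s3"],
    "s4": ["s4", "s4", "s2", "s3"],
}


def delta(state, inp):
    """Transition function: next state from the current state and input value."""
    return NEXT[state][INPUTS.index(inp)]


def run(start, input_sequence):
    """Apply delta once per time step and return the list of visited states."""
    trace = [start]
    for inp in input_sequence:
        trace.append(delta(trace[-1], inp))
    return trace


if __name__ == "__main__":
    print(run("s1", ["i1", "i1", "i3", "i4"]))
```

the sketch ignores outputs, in line with the restriction to state and input encoding made in the text.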
from now on, for the sake of brevity, we will simply use “states” to refer to the internal states of an fsm. the phrase “internal states” avoids confusion with other objects also referred to as “states”, in particular the input and output values which are sometimes called input and output states, respectively. we will always use the phrases “input (output)” or “input (output) values” here, so that there is no risk of confusion in using simply “states” to refer to internal states.

fsms are commonly implemented using digital logic circuits consisting of an array of flip-flops, which stores the current state, together with combinational logic (often referred to as the next-state logic), which computes the next state from the current state and current input. to design this combinational logic, one must select a state encoding, a mapping which associates each state of the fsm with a state of the flip-flop array. more precisely, since the state of a flip-flop array is simply a binary string, a state encoding is a function $c_s : S \to \{0, 1\}^n$ that maps each state of the fsm to a vector of flip-flop states that represents the fsm state in a digital logic circuit. here, $n$ denotes the number of flip-flops in the circuit.

similarly, in order to implement an fsm with a digital logic circuit, one must also select an input encoding, which maps each input value of the fsm to an array of boolean values, where each boolean value represents the value supplied on an input wire to the digital logic circuit. in other words, an input encoding is a function $c_i : I \to \{0, 1\}^m$ where $m$ is the number of input wires. finally, in addition to the state and input encoding, one must select an output encoding to fully implement an fsm with digital logic. however, we do not consider the problem of encoding outputs in this paper and consequently, we will ignore the outputs of an fsm from now on.

we will use “encoding” without further qualification to mean the combination of a state and input encoding. we may also use the phrase “encoding of [a state or input value]” to mean the combination of state or input variable values corresponding to that state or input value under a given encoding. in other words, given a state encoding as a function $c_s$ as described above, the encoding of a state $s$ is simply $c_s(s)$; the situation is analogous for inputs. finally, we also use “encoding” as a verb to mean the process of selecting or generating an encoding. thus, we describe the main objective of this paper as solving an fsm encoding problem.

from now on, when considering a digital logic circuit implementation of an fsm, we will follow the common convention of using $q_i$ to denote the state of the $i$th flip-flop at a given point in time. we will also use $x_j$ to represent the value on the $j$th input wire at a given point in time. we refer to the variables $q_1$ through $q_n$ as state variables and the variables $x_1$ through $x_m$ as input variables. figure 1 illustrates this notation in the context of a digital logic circuit that implements an fsm.
we distinguish carefully between input values, which are the symbolic inputs of an fsm and elements of the set $I$, and input variables, which are the boolean variables $x_1$ through $x_m$ used in a digital logic circuit to represent input values. similarly, we also distinguish between states and state variables: states are the symbolic internal states of an fsm and elements of the set $S$, while state variables are the variables $q_1$ through $q_n$ that correspond to the states of individual flip-flops and together represent the symbolic state. to help avoid confusion, we will always follow a consistent notational convention where, in addition to using $q_i$ and $x_j$ for state variables and input variables respectively, we also use $s$ or $s_1$, $s_2$, etc. to represent states and $i$, $i_1$, $i_2$, etc. to represent input values. in the context of an fsm, the word “input” without further qualification will always refer to an input value and not an input variable.

a state encoding may be bijective, that is, each flip-flop array state represents exactly one fsm state, or it may not. non-bijective encodings might arise (for instance) if the number of possible flip-flop array states is greater than the number of fsm states, in which case some of the possible flip-flop states will not represent any fsm state at all. similarly, input encodings may also be bijective, or not. for a bijective input encoding, each possible combination of assignments to input variables corresponds to exactly one input value. we say that an encoding (the combination of both state and input encodings) is bijective if both the state and input encodings are bijective.

[figure 1: general structure of an fsm implemented as a digital logic circuit; output logic not shown. flip-flops with inputs d1, d2, ..., dn and outputs q1, q2, ..., qn are driven by the next-state logic δ1, δ2, ..., δn, each a function of q1, ..., qn and x1, ..., xm.]

if an encoding is bijective, then any combination of assignments to the variables $q_1$ through $q_n$ and $x_1$ through $x_m$ corresponds to a unique state and input value of the fsm. therefore, the next state is also uniquely determined by the transition function $\delta$. in this case, we define a collection of encoded transition functions $\delta_1$ through $\delta_n$, where $\delta_i$ represents the next state of the $i$th flip-flop in terms of the variables $q_1$ through $q_n$ and $x_1$ through $x_m$. in other words, the values of $q_1$ through $q_n$ and $x_1$ through $x_m$ uniquely determine a state $s$ and input value $i$. if these are the current state of and input to the fsm, then the next state of the fsm will be $\delta(s, i)$ and the encoding of this next state is $c_s(\delta(s, i))$, where $c_s$ is the functional representation of a state encoding as previously described. we then define $\delta_i(q_1, \ldots, q_n, x_1, \ldots, x_m)$ to be the $i$th component of $c_s(\delta(s, i))$.

encoded transition functions represent the computations to be performed by the next-state logic in a digital logic circuit implementation of an fsm. in other words, given a particular state encoding for an fsm, each encoded transition function gives the next state of a single flip-flop in terms of the current states of all flip-flops and the current input. thus, digital logic design for an fsm involves realizing the encoded transition functions as digital logic circuits. figure 1 graphically demonstrates this relationship between encoded transition functions and the digital logic implementation of an fsm. in the remainder of this paper, we will concentrate on evaluating the cost of realizing encoded transition functions and how this cost can be minimized.

if a state encoding is not bijective, one can still define a set of encoded transition functions using the same concept: each encoded transition function represents the next state of a single flip-flop in terms of the current states of all flip-flops and the current values of all input variables.
however, the encoded transition “functions” defined in this way are no longer functions in the mathematical sense; they are relations instead. if the flip-flops’ current state does not correspond to any fsm state or the input variables’ values do not correspond to any fsm input value, then the next state is indeterminate (a.k.a. a “don’t-care”). in practice, digital logic design for fsms commonly involves non-bijective encodings. nevertheless, for reasons to be discussed later, we will only consider bijective encodings, for which all encoded transition functions are actual functions in the mathematical sense. observe that this restriction implies that we are only considering fsms where the number of states is a power of two, since no bijective encodings exist otherwise. it also implies the assumption that the machine is state-minimized, meaning that there are no equivalent states in the machine, so that no two distinct states may have the same encoding.

it is common to use the notation $q_i^+$ to represent the next state of the $i$th flip-flop, where $q_1$ through $q_n$ represent the current states of all flip-flops. in other words, $q_i^+ = \delta_i(q_1, \ldots, q_n, x_1, \ldots, x_m)$ for all $i$. this means that $q_i^+$ is simply a more compact notation for the function $\delta_i$ that can be used when it is not necessary to explicitly show that $\delta_i$ is a function of $q_1$ through $q_n$ and $x_1$ through $x_m$. from now on, we will use both the $q_i^+$ and $\delta_i$ notations interchangeably, with the choice of notation being simply a matter of convenience.

2.2 metric for evaluating cost of state encodings

the reader may observe that, by considering the implementation of fsms using digital logic circuits, we have created a distinction between a symbolic fsm itself and its implementation using flip-flops and combinational logic. the former is simply an abstract mathematical concept, while the latter is a physical realization of that abstract concept.
from now on, when the meaning is clear from the context, we will use “finite state machine” to refer to both an fsm in the abstract conceptual sense and physical implementations of that fsm. when the need to avoid confusion arises, we will use “fsm specification” to refer specifically to the abstract mathematical concept of an fsm.

there always exist many possible implementations of any single fsm specification, since an implementation of an fsm is always associated with some encoding and there are many possible encodings for any given set of states or inputs. the differences between implementations obtained with different encodings are of great interest to the digital logic designer. in particular, the combinational circuit complexity (as measured, for instance, by the number of logic gates or the maximum delay) of implementations may greatly vary for different encodings.

consider the fsm whose transition function is as given in figure 2(a). figure 2(b) depicts a possible encoding for the fsm where the four possible two-bit strings are simply allocated in the usual base-two counting order (00 first, then 01, 10, and 11). figure 2(c) then shows the resulting encoded transition functions. these functions may be represented by the following logical expressions:

$$q_1^+ = (q_1 \land x_2) \lor (q_1 \land \lnot q_2 \land x_1) \lor (\lnot q_1 \land q_2 \land x_1) \lor (q_2 \land \lnot x_1 \land \lnot x_2) \lor (x_1 \land x_2),$$

$$q_2^+ = (q_1 \land q_2 \land \lnot x_1) \lor (q_1 \land q_2 \land \lnot x_2) \lor (\lnot q_1 \land \lnot q_2 \land \lnot x_1) \lor (\lnot q_1 \land \lnot q_2 \land \lnot x_2) \lor (\lnot x_1 \land \lnot x_2),$$

which show that both $q_1^+$ and $q_2^+$ depend on all four state/input variables ($q_1$, $q_2$, $x_1$, and $x_2$). in comparison, if the encoding shown in figure 2(d) is used instead, then the encoded transition functions are as shown in figure 2(e) and may be represented by the expressions

$$q_1^+ = q_2 \lor \lnot x_1,$$

$$q_2^+ = (q_1 \land \lnot x_1) \lor (q_1 \land x_2) \lor (\lnot x_1 \land \lnot x_2),$$

where we observe that $q_1^+$ depends on only two variables ($q_2$ and $x_1$) and $q_2^+$ depends on three ($q_1$, $x_1$, and $x_2$).
we therefore see that between these two encodings, the encoding from figure 2(d) results in encoded transition functions that depend on fewer variables. if we make the reasonable assumption that the complexity of a digital logic circuit is correlated with its number of inputs, then we would expect the encoding from figure 2(d) to ultimately result in a less complex digital logic circuit, since the next-state logic for both flip-flops would involve fewer variables. of course, this correlation between complexity and number of inputs depends on the precise definition of complexity used, and is not perfect in any case. there is no guarantee that the encoding from figure 2(d) would actually produce a digital logic circuit that better suits the design goals (whatever they may be) of a digital logic designer.

figure 2: (a) transition table for an fsm; (b) an encoding for that fsm; (c) resulting encoded transition functions from that encoding; (d) another encoding for the same fsm; (e) resulting encoded transition functions.

(a) transition table (rows: current state; columns: input value)

      i1  i2  i3  i4
s1    s2  s2  s2  s3
s2    s4  s1  s3  s3
s3    s2  s3  s3  s3
s4    s4  s4  s2  s3

(b) encoding

state  q1 q2      input  x1 x2
s1     0  0       i1     0  0
s2     0  1       i2     0  1
s3     1  0       i3     1  0
s4     1  1       i4     1  1

(c) encoded transition functions for encoding (b) (rows: q1q2; columns: x1x2)

q1+     00 01 11 10       q2+     00 01 11 10
00      0  0  1  0        00      1  1  0  1
01      1  0  1  1        01      1  0  0  0
11      1  1  1  0        11      1  1  0  1
10      0  1  1  1        10      1  0  0  0

(d) encoding

state  q1 q2      input  x1 x2
s1     0  1       i1     1  0
s2     1  0       i2     1  1
s3     1  1       i3     0  1
s4     0  0       i4     0  0

(e) encoded transition functions for encoding (d) (rows: q1q2; columns: x1x2)

q1+     00 01 11 10       q2+     00 01 11 10
00      1  1  0  0        00      1  0  0  0
01      1  1  1  1        01      1  0  0  0
11      1  1  1  1        11      1  1  1  0
10      1  1  0  0        10      1  1  1  0
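the comparison between the two encodings of figure 2 can be checked mechanically. the sketch below is only an illustration under stated assumptions: it builds the truth table of each encoded transition function from the transition table of figure 2(a), then counts the variables on which each function depends by flipping one variable at a time. the helper names (`encoded_functions`, `dependencies`, `cost`) are hypothetical and not taken from the paper, and a bijective encoding is assumed so that every truth table is complete.

```python
# Transition table from figure 2(a): rows are current states, columns are i1..i4.
INPUTS = ["i1", "i2", "i3", "i4"]
NEXT = {
    "s1": ["s2", "s2", "s2", "s3"],
    "s2": ["s4", "s1", "s3", "s3"],
    "s3": ["s2", "s3", "s3", "s3"],
    "s4": ["s4", "s4", "s2", "s3"],
}

# Encodings from figure 2(b) and figure 2(d): (state encoding, input encoding).
ENC_B = ({"s1": (0, 0), "s2": (0, 1), "s3": (1, 0), "s4": (1, 1)},
         {"i1": (0, 0), "i2": (0, 1), "i3": (1, 0), "i4": (1, 1)})
ENC_D = ({"s1": (0, 1), "s2": (1, 0), "s3": (1, 1), "s4": (0, 0)},
         {"i1": (1, 0), "i2": (1, 1), "i3": (0, 1), "i4": (0, 0)})


def encoded_functions(state_enc, input_enc):
    """Return one truth table per flip-flop, keyed by the tuple (q1..qn, x1..xm)."""
    n = len(next(iter(state_enc.values())))
    tables = [dict() for _ in range(n)]
    for s, q in state_enc.items():
        for i, x in input_enc.items():
            next_code = state_enc[NEXT[s][INPUTS.index(i)]]
            for k in range(n):
                tables[k][q + x] = next_code[k]
    return tables


def dependencies(table):
    """Count variables whose flip changes the function value for some assignment."""
    num_vars = len(next(iter(table)))
    count = 0
    for v in range(num_vars):
        for point in table:
            flipped = point[:v] + (1 - point[v],) + point[v + 1:]
            if table[point] != table[flipped]:
                count += 1
                break
    return count


def cost(state_enc, input_enc):
    """Cost of an encoding: total number of dependencies over all flip-flops."""
    return sum(dependencies(t) for t in encoded_functions(state_enc, input_enc))


if __name__ == "__main__":
    print(cost(*ENC_B), cost(*ENC_D))  # 8 5
```

the totals agree with the text: encoding (b) gives 4 + 4 dependencies, while encoding (d) gives 2 + 3.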
nevertheless, in the remainder of this paper we will use the simple metric of defining cost as the total number of variables on which a boolean function depends, for two reasons. first, this cost metric simplifies the problem of finding the optimal encoding for an fsm enough that we can always find the exact minimum solution. the only published work so far that achieves exact minimum results for fsm encoding is [12]. however, the authors in [12] use a different cost metric, which is based on the number of product terms when digital logic is implemented using a programmable logic array (pla). this brings us to our second reason for using a cost metric based on the number of dependencies: minimizing the number of product terms in a pla has little to no relevance if an fsm is not implemented using plas, as is the case (for instance) in most modern vlsi chips. minimizing the number of variable dependencies of a boolean function, on the other hand, is a reasonable goal for virtually any digital logic technology, and will likely remain reasonable even for future technologies.

based on the preceding discussion, we formulate the problem to be solved in the remainder of this paper as follows. given an fsm that satisfies the following conditions:

• the number of states in the machine is a power of 2;
• the number of possible input values to the machine is a power of 2;
• the machine is state-minimal, meaning that no two distinct states are equivalent and therefore no two distinct states may be assigned the same encoding;
• no two input values are equivalent and therefore no two distinct input values may be assigned the same encoding;

find an encoding for the fsm with the lowest possible cost, where the cost of an encoding is defined as the sum of the costs of the encoded transition functions resulting from that encoding, and the cost of a single function is the number of variables on which the function depends.
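the cost definition above can be checked mechanically on the example of figure 2. the python sketch below is only an illustration and not part of the original method: it assumes that the transition table of figure 2(a) is read with rows as present states and columns as inputs, it treats a function as depending on a variable when flipping that variable alone changes the output for at least one assignment, and all names in it (`TRANSITIONS`, `encoded_functions`, `support`, `encoding_cost`) are hypothetical.

```python
from itertools import product

# Transition table of figure 2(a): TRANSITIONS[state][input] = next state.
# Assumption: rows are present states s1..s4, columns are inputs i1..i4.
TRANSITIONS = {
    "s1": {"i1": "s2", "i2": "s2", "i3": "s2", "i4": "s3"},
    "s2": {"i1": "s4", "i2": "s1", "i3": "s3", "i4": "s3"},
    "s3": {"i1": "s2", "i2": "s3", "i3": "s3", "i4": "s3"},
    "s4": {"i1": "s4", "i2": "s4", "i3": "s2", "i4": "s3"},
}

# Encodings of figure 2(b) and 2(d): states map to (q1, q2), inputs to (x1, x2).
STATES_B = {"s1": (0, 0), "s2": (0, 1), "s3": (1, 0), "s4": (1, 1)}
INPUTS_B = {"i1": (0, 0), "i2": (0, 1), "i3": (1, 0), "i4": (1, 1)}
STATES_D = {"s1": (0, 1), "s2": (1, 0), "s3": (1, 1), "s4": (0, 0)}
INPUTS_D = {"i1": (1, 0), "i2": (1, 1), "i3": (0, 1), "i4": (0, 0)}


def encoded_functions(state_code, input_code):
    """Return one truth table per next-state bit, keyed by (q1, q2, x1, x2)."""
    n_bits = len(next(iter(state_code.values())))
    tables = [dict() for _ in range(n_bits)]
    for s, q in state_code.items():
        for i, x in input_code.items():
            q_next = state_code[TRANSITIONS[s][i]]
            for bit in range(n_bits):
                tables[bit][q + x] = q_next[bit]
    return tables


def support(table):
    """Indices of the variables on which a completely specified function depends."""
    n_vars = len(next(iter(table)))
    deps = set()
    for point in product((0, 1), repeat=n_vars):
        for v in range(n_vars):
            flipped = point[:v] + (1 - point[v],) + point[v + 1:]
            if table[point] != table[flipped]:
                deps.add(v)
    return deps


def encoding_cost(state_code, input_code):
    """Sum, over all next-state bits, of the number of variables each depends on."""
    return sum(len(support(t)) for t in encoded_functions(state_code, input_code))


cost_b = encoding_cost(STATES_B, INPUTS_B)
cost_d = encoding_cost(STATES_D, INPUTS_D)
print(cost_b, cost_d)  # 8 5
```

under these assumptions the sketch reports a cost of 8 for the encoding of figure 2(b) and 5 for that of figure 2(d), in agreement with the variable counts stated in the text.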
a function is considered to depend on a variable if and only if the function cannot be computed without knowledge of the value of that variable. our cost model thus defined is the same as that used in [6] and [7].

3 grover’s search algorithm

3.1 operation of grover’s algorithm

in this section we briefly review grover’s algorithm and its relevance to the state encoding problem. we assume the reader’s familiarity with quantum computing in general, and in particular with qubits, quantum states, circuits, gates, and the matrix-vector representations thereof. for a detailed treatment of the preceding concepts, we refer the reader to [16]. we concentrate on reviewing the conceptual aspects of grover’s algorithm rather than proving that the algorithm functions as claimed. full proofs of the validity of grover’s algorithm are well known and may be found, for instance, in [3, 16].

[figure 3: a high-level schematic for grover’s algorithm. the input qubits start in $|0\rangle$ and the output qubit in $|1\rangle$; a hadamard gate $H$ is applied to every qubit, followed by repeated applications of the grover iterate $G$.]

grover’s algorithm [2, 3, 4] is a well-known quantum algorithm and one of the first to demonstrate how quantum computers are in a sense more powerful than classical computers. specifically, grover’s algorithm solves the problem of satisfying a boolean function; that is, given a function $f : \{0, 1\}^n \to \{0, 1\}$, find $x_1$ through $x_n$ such that$^{3}$ $f(x_1, \ldots, x_n) = 1$. this type of problem is commonly known as a satisfaction or decision problem. classically, assuming no additional information is known regarding the function $f$, one can on average solve this problem no faster than by exhaustive search. such a search requires time $O(N)$ where $N = 2^n$, the total number of possible assignments of values $\{0, 1\}$ to the variables $x_1$ through $x_n$. if a quantum computer is available, however, grover’s algorithm provides a method to solve the same problem in $O(\sqrt{N})$ time, a quadratic speedup.
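to make the classical baseline concrete, the following python sketch performs the exhaustive search described above and counts evaluations of $f$. it is an illustration only: the names `exhaustive_search` and `example_f` are hypothetical, and the $\sqrt{N}$ value printed alongside is just the order of magnitude quoted in the text, not an exact iteration count.

```python
import math
from itertools import product


def exhaustive_search(f, n):
    """Try all 2**n assignments in turn.

    Returns (satisfying assignment or None, number of evaluations of f used).
    """
    evaluations = 0
    for assignment in product((0, 1), repeat=n):
        evaluations += 1
        if f(*assignment) == 1:
            return assignment, evaluations
    return None, evaluations


def example_f(x1, x2, x3, x4):
    """Hypothetical black-box function whose only satisfying assignment is (1, 1, 1, 1)."""
    return int(x1 and x2 and x3 and x4)


n = 4
N = 2 ** n
solution, used = exhaustive_search(example_f, n)
print(solution, used, N, math.isqrt(N))  # (1, 1, 1, 1) 16 16 4
```

in this worst case the classical search evaluates $f$ on all $N = 16$ assignments, whereas $\sqrt{N} = 4$.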
we first describe grover’s algorithm with the assumption that exactly one solution exists; that is, there is exactly one assignment of the values $\{0, 1\}$ to $x_1$ through $x_n$ such that $f(x_1, \ldots, x_n) = 1$. this is the original scenario considered by grover. later, we will relax this assumption and describe how the algorithm can be extended to cases where there is more than one solution, or no solution at all.

figure 3 shows a high-level schematic of a quantum circuit implementing grover’s algorithm. we see that the algorithm consists of an initialization step involving hadamard gates, followed by repeated applications of an operation $G$ (the grover iterate), and then a measurement at the end.

first, we examine the grover iterate, which is the most important step. the grover iterate accomplishes the task of amplitude amplification, the central concept underlying grover’s algorithm. amplitude amplification is the process of selectively increasing the amplitude of a certain component state of a superposition, while decreasing the amplitudes of all other component states. in grover’s algorithm, the component state whose amplitude is increased corresponds to the variable assignment satisfying the function $f$. by applying the grover iterate enough times to a superposition, one obtains a quantum state where the component corresponding to the satisfying variable assignment has an amplitude near 1; all other components have zero or negligibly small amplitudes. a measurement then gives (with probability near 1) a variable assignment satisfying $f$.

$^{3}$ the variables $x_1$ through $x_n$ in this case are simply inputs to the function $f$ and are unrelated to the input variables of an fsm.

[figure 4: structure of the grover iterate $G$ from figure 3 with quantum oracle $O$ and zero-state phase shift $Z$.]
figure 4 depicts a schematic of the grover iterate, showing that it consists of a quantum oracle $O$, followed by the zero-state phase shift $Z$ surrounded by arrays of hadamard gates. the quantum oracle is a quantum circuit implementation of the function $f$ to be satisfied. more specifically, the oracle acts as a “controlled inverter” controlled by $f$. that is, the quantum oracle acts on qubits $x_1$ through $x_n$ together with an additional output qubit, and, for each possible basis state of $x_1$ through $x_n$, the oracle inverts the output qubit if and only if $f(x_1, \ldots, x_n) = 1$. figure 5 depicts an alternative schematic symbol for the oracle that graphically suggests this “controlled” nature.

[figure 5: controlled inverter representation of a quantum oracle, with input qubits $x_1$ through $x_n$ and output qubit $y$.]

the quantum oracle essentially defines the search criteria for grover’s algorithm and is the only part of grover’s algorithm whose precise details depend on the nature of the function $f$. therefore, a quantum computer cannot execute grover’s algorithm for a given function $f$ unless a quantum oracle for $f$ is available. for some (if not the majority) of the satisfaction problems arising from practical scenarios, the function $f$ is not given explicitly as a formula, but is instead only defined indirectly through a set of conditions or constraints. in such a case, designing a quantum oracle is far from trivial. in the subsequent sections of this paper, we will concentrate on how to design a quantum oracle that can, when used in grover’s algorithm, help find the optimal state encoding for an fsm. for now, we examine the quantum oracle’s role in grover’s algorithm without concern for the details of its implementation.

[figure 6: (a) quantum oracle set up to perform a phase flip; (b) amplitude-graph visualization of the phase flip.]

in grover’s algorithm, the quantum oracle performs a phase flip, essentially “marking” the component of a superposition that corresponds to the variable assignment satisfying the function $f$. from now on, we will refer to this component as the “solution” component since it represents the solution to the satisfaction problem. specifically, one can prove that if one supplies the state $\frac{1}{\sqrt{2}}(|0\rangle - |1\rangle)$ on the oracle’s output qubit ($y$ in figure 5) and supplies any state $|\psi\rangle$ on the input qubits, then the oracle inverts the phase of the solution component of $|\psi\rangle$. figure 6(a) illustrates the quantum circuit diagram for an oracle set up to perform the phase inversion, showing the output qubit initialized to $\frac{1}{\sqrt{2}}(|0\rangle - |1\rangle)$. figure 6(b) then visualizes the initial and final states, $|\psi\rangle$ and $|\psi'\rangle$, of the oracle’s input qubits by plotting their component amplitudes.
in other words, given the state $|\psi\rangle$ (or $|\psi'\rangle$) expressed as a sum of components,

$$|\psi\rangle = \sum_{i=0}^{N-1} a_i |i\rangle,$$

the graphs plot $a_i$ against $i$. comparison between $|\psi\rangle$ and $|\psi'\rangle$ shows that the phase of a single component (the solution component) has been inverted.

we observe that the phase inversion can be interpreted to mean that the oracle in a sense evaluates the function $f$ simultaneously for all possible inputs and finds the solution immediately. unfortunately, the solution is encoded as a phase variation which is not directly observable. this problem is solved by the second part of grover’s algorithm, the amplitude amplification procedure, which converts this phase difference into a measurable amplitude difference. this procedure is seen on the right-hand side of figure 4 and involves the hadamard gates together with the zero-state phase shift $Z$. the zero-state phase shift is defined to invert the phase of the $|0\rangle$ component of any quantum state on which it acts. one can prove that when hadamard gates are applied before and after a zero-state phase shift as shown in figure 4, the result is an inversion about the mean, defined as follows: if $\mu$ denotes the mean amplitude of the components of a quantum state, then under an inversion about the mean, each component’s amplitude becomes $2\mu - a$ where $a$ is the component’s prior amplitude. this transformation is mathematically represented by the following equation:

$$HZH \left( \sum_{i=0}^{N-1} a_i |i\rangle \right) = \sum_{i=0}^{N-1} (2\mu - a_i) |i\rangle,$$

where $HZH$ denotes the aforementioned combination of hadamard gates and the zero-state phase shift (with $Z$ as defined above, the equality holds up to an overall factor of $-1$, a global phase that has no observable effect).

[figure 7: (a) circuit to perform inversion about the mean; (b) amplitude-graph visualization.]

figure 7 depicts the effect of inversion about the mean when applied immediately following the quantum oracle. figure 7(a) shows the portion of the grover iterate $G$ (from figures 3 and 4) that performs inversion about the mean.
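the phase flip and the inversion about the mean can both be checked numerically on a small example. the pure-python sketch below is illustrative only: the three-qubit size, the marked index `SOLUTION = 5`, and all function names are assumptions made for this example, and the oracle is modelled directly as a sign flip on the solution component rather than as a circuit. with $Z$ taken to flip the sign of the $|0\rangle$ component, the computed $HZH$ output equals $a_i - 2\mu$, which differs from $2\mu - a_i$ only by a global factor of $-1$.

```python
import math


def hadamard_all(amps):
    """Apply a Hadamard gate to every qubit of a state given as 2**n amplitudes."""
    v = list(amps)
    h = 1
    while h < len(v):
        for start in range(0, len(v), 2 * h):
            for j in range(start, start + h):
                a, b = v[j], v[j + h]
                v[j], v[j + h] = a + b, a - b
        h *= 2
    norm = math.sqrt(len(v))
    return [x / norm for x in v]


def zero_state_phase_shift(amps):
    """Z: invert the phase of the |0...0> component only."""
    return [-amps[0]] + list(amps[1:])


def phase_flip(amps, solution):
    """Oracle effect on the input qubits when the output qubit is (|0> - |1>)/sqrt(2)."""
    return [-a if i == solution else a for i, a in enumerate(amps)]


def hzh(amps):
    """Hadamards, zero-state phase shift, Hadamards."""
    return hadamard_all(zero_state_phase_shift(hadamard_all(amps)))


N_QUBITS = 3   # assumption: small example size
SOLUTION = 5   # assumption: hypothetical index of the solution component
N = 2 ** N_QUBITS

psi = [1 / math.sqrt(N)] * N              # uniform superposition
marked = phase_flip(psi, SOLUTION)        # the oracle marks the solution by its sign
mu = sum(marked) / N                      # mean amplitude after the phase flip
inverted = [2 * mu - a for a in marked]   # inversion about the mean, as in the equation
circuit = hzh(marked)                     # what the H-Z-H sequence produces

# The circuit output matches the inversion about the mean up to a global sign.
assert all(abs(c + e) < 1e-12 for c, e in zip(circuit, inverted))
print(round(inverted[SOLUTION], 4), round(inverted[0], 4))  # 0.8839 0.1768
```

after one phase flip and one inversion about the mean, the solution amplitude has grown from about 0.354 to about 0.884 while every other amplitude has dropped to about 0.177.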
figure 7(b) then visualizes the initial and final quantum states in the same manner as figure 6. we see that following inversion about the mean, the amplitude of the solution component has increased while those of all other components have decreased. successive application of the grover iterate similarly increases the amplitude of the solution component further. after enough iterations, the superposition consists entirely or nearly entirely of only the solution component. specifically, one can prove that the number of iterations required is on the order of $\sqrt{N}$, where $N$ is the total number of components (i.e., the number of possible assignments of values to variables). then, a measurement gives (with near certainty) a variable assignment satisfying the function $f$.

since we assume that nothing is known in advance about the function $f$, the just-described amplitude amplification procedure should begin with a superposition of all possible variable assignments, as any of them could potentially satisfy $f$. therefore, before performing the amplitude amplification procedure, grover’s algorithm first initializes the qubits that will serve as inputs to the quantum oracle, $x_1$ through $x_n$, to a superposition of all possible states. conventionally, this is accomplished by initializing each input qubit ($x_1$ through $x_n$) to $|0\rangle$ and then applying a hadamard gate to each input qubit, as shown in figure 3. similarly, the quantum oracle’s output qubit is initialized to $\frac{1}{\sqrt{2}}(|0\rangle - |1\rangle)$ by applying a hadamard gate to an initial state of $|1\rangle$, also shown in figure 3. following completion of the amplitude amplification procedure, the qubits $x_1$ through $x_n$ are measured to obtain the final output of grover’s algorithm; figure 3 indicates this measurement on the right-hand side using the standard left-pointing triangular symbol. we therefore see that the complete grover’s algorithm consists of the following steps:

1. begin with $n$ input qubits and one output qubit, initialized as described in the preceding paragraph.
2. perform the amplitude amplification procedure by applying the grover iterate on the order of $\sqrt{N}$ times as previously described.
3. measure the input qubits; the result of measurement forms the output of grover’s algorithm.

since the initialization and measurement steps in grover’s algorithm take time proportional to the number of qubits used, which is on the order of $n = \log_2 N$, the number of variables of $f$, the $O(\sqrt{N})$ time required for the amplitude amplification procedure forms the dominant contribution to the overall runtime of grover’s algorithm. therefore, grover’s algorithm has a time complexity of $O(\sqrt{N})$.

equipped with an understanding of grover’s original algorithm, which assumes that exactly one solution exists, we now consider how the algorithm can be modified if more than one solution exists, or if there are no solutions at all. there are several possible variations of this scenario, depending on one’s assumptions and objectives:

• one may or may not know the number of solutions beforehand.
• if at least one solution exists, one may wish to find just a single solution, or all possible solutions.
• one may or may not know beforehand whether at least one solution exists.
• if the existence of at least one solution is not guaranteed, one may be interested only in determining the existence or non-existence of a solution, and not in actually finding that solution.

for our purposes, we will need to assume that we do not have prior knowledge of the existence and number of solutions. we will be interested in first determining whether any solutions exist, and then finding just a single solution if at least one exists. in section 3.2, we will demonstrate that satisfying this requirement suffices to solve the state encoding problem using a quantum computer.
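the three steps of grover’s algorithm listed above, for the single-solution case, can be simulated classically on a small instance. the sketch below is a minimal illustration under stated assumptions: it tracks only the amplitudes of the input qubits, models the oracle as a sign flip on the solution component, applies the inversion about the mean through the formula $2\mu - a_i$, and uses $\lfloor (\pi/4)\sqrt{N} \rfloor$ iterations, a commonly used constant that the text above does not specify (it only states the order $\sqrt{N}$). the function name `grover_simulation` and the chosen instance are hypothetical.

```python
import math


def grover_simulation(n, solution):
    """Classically simulate Grover's algorithm for one marked item among N = 2**n.

    Returns (most probable measurement outcome, its probability, iterations used).
    """
    N = 2 ** n
    # Step 1: Hadamards on |0...0> give the uniform superposition of all assignments.
    amps = [1 / math.sqrt(N)] * N
    # Assumption: floor(pi/4 * sqrt(N)) iterations; the text only says "order sqrt(N)".
    iterations = math.floor(math.pi / 4 * math.sqrt(N))
    # Step 2: amplitude amplification.
    for _ in range(iterations):
        amps[solution] = -amps[solution]     # oracle: phase flip of the solution
        mu = sum(amps) / N                   # mean amplitude
        amps = [2 * mu - a for a in amps]    # inversion about the mean
    # Step 3: measurement; report the most likely outcome instead of sampling.
    probs = [a * a for a in amps]
    best = max(range(N), key=probs.__getitem__)
    return best, probs[best], iterations


outcome, probability, used = grover_simulation(n=4, solution=11)
print(outcome, used, round(probability, 3))  # 11 3 0.961
```

for $N = 16$ the sketch needs three iterations to concentrate about 96% of the measurement probability on the solution, compared with up to 16 evaluations for a classical exhaustive search.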
fortunately, for the assumptions stated above, one can extend grover’s algorithm to the case of an unknown number of solutions while maintaining an $O(\sqrt{N})$ run time complexity. in [17], the authors show that if one has foreknowledge of the exact number of solutions, grover’s original algorithm as previously described (which assumes the existence of exactly one solution) can still be used with one modification: instead of $O(\sqrt{N})$ iterations, one now requires $O(\sqrt{N/k})$ iterations, where $k$ is the number of solutions. furthermore, the authors in [17] also demonstrate a method by which one can find a solution in $O(\sqrt{N/k})$ expected time even if the number of solutions is unknown. this same method is additionally capable of detecting the non-existence of a solution in $O(\sqrt{N})$ time with an arbitrarily high level of certainty. alternatively, brassard et al. [18] have described a quantum counting algorithm, with which one may obtain an estimate of the number of solutions. one could use this quantum counting algorithm together with the previously mentioned extended grover’s algorithm for a known number of solutions (in [17]) to find a single solution, if it exists.$^{4}$

using the above approaches, one can satisfy the previously stated requirement (determine whether any solutions exist, and find one if so) with the same $O(\sqrt{N})$ time complexity as grover’s original algorithm. for the sake of brevity, we will from now on use “grover’s algorithm” to refer to all of these approaches collectively. in other words, when we say we are “using grover’s algorithm” to solve a satisfaction problem, we mean that we are using any of the above approaches to determine if a solution to the problem exists, and then find one if it does.

$^{4}$ strictly speaking, the viability of this second method is actually not absolutely clear, although it should likely work. the difficulty is that the quantum counting algorithm in [18] only provides an approximate estimate of the number of solutions, while [17] assumes knowledge of the exact number of solutions. a brief consideration of the mathematical analysis used in [17] suggests that the method will still succeed with high probability with only an estimate of the number of solutions, as long as this estimate is reasonably accurate. however, the authors in [17] do not rigorously prove this assertion. in any case, the validity of our present work is unaffected since [17] also gives a method for finding one of an unknown number of solutions, as mentioned in the main text.

there is one more aspect of grover’s algorithm that we have not yet mentioned: the use of ancillary (“ancilla”) qubits in the quantum oracle. strictly speaking, ancilla qubits are an aspect of quantum oracle design and not of grover’s algorithm itself. we discuss them here nevertheless, since we will concentrate on designing a quantum oracle to solve the fsm state encoding problem in the remainder of this paper. ancilla qubits are extra qubits that are used by a quantum oracle to store intermediate results while it is performing its computations. they are initialized to a value of the quantum circuit designer’s choice. a quantum oracle only uses ancilla qubits internally; externally, it should appear as if ancilla qubits simply “pass through” the oracle with no change. in other words, ancilla qubits do not externally appear to participate in the oracle’s operation at all; they participate internally and only for the purpose of acting as temporary storage.

ancilla qubits are usually employed to reduce the quantum circuit cost of a quantum oracle. a quantum circuit using ancilla qubits often has lower quantum cost than an equivalent circuit without ancilla qubits. for instance, figure 8 shows a quantum oracle for the function $f = (x_1 + x_2)(x_3 + x_4)$ using two ancilla qubits. it is possible to design a quantum oracle for this function without ancilla qubits, but the
perkowski a quantum algorithm for automata encoding 183 potentially satisfy f. therefore, before performing the amplitude amplification procedure, grover’s algorithm first initializes the qubits that will serve as inputs to the quantum oracle, x1 through xn, to a superposition of all possible states. conventionally, this is accomplished by initializing each input qubit (x1 through xn) to |0〉 and then applying a hadamard gate to each input qubit, as shown in figure 3. similarly, the quantum oracle’s output qubit is initialized to 1√ 2 (|0〉 − |1〉) by applying a hadamard gate to an initial state of |1〉, also shown in figure 3. following completion of the amplitude amplification procedure, the qubits x1 through xn are measured to obtain the final output of grover’s algorithm; figure 3 indicates this measurement on the right-hand side using the standard leftpointing triangular symbol. we therefore see that the complete grover’s algorithm consists of the following steps: 1. begin with n input qubits and one output qubit, initialized as described in the preceding paragraph. 2. perform the amplitude amplification procedure by applying the grover iterate √ n times as previously described. 3. measure the input qubits; the result of measurement forms the output of grover’s algorithm. since the initialization and measurement steps in grover’s algorithm are proportional to the number of qubits used, which is on the order of n = log2 n, the number of variables of f, the o( √ n) time required for the amplitude amplification procedure forms the dominant contribution to the overall runtime of grover’s algorithm. therefore, grover’s algorithm requires a time complexity of o( √ n). equipped with an understanding of grover’s original algorithm, which assumes that exactly one solution exists, we now consider how the algorithm can be modified if more than one solution exists, or if there are no solutions at all. 
there are several possible variations of this scenario, depending on one’s assumptions and objectives:
• one may or may not know the number of solutions beforehand.
• if at least one solution exists, one may wish to find just a single solution, or all possible solutions.
• one may or may not know beforehand whether at least one solution exists.
• if the existence of at least one solution is not guaranteed, one may be interested only in determining the existence or non-existence of a solution, and not in actually finding that solution.

for our purposes, we will need to assume that we do not have prior knowledge of the existence and number of solutions. we will be interested in first determining whether any solutions exist, and then finding just a single solution if at least one exists. in section 3.2, we will demonstrate that satisfying this requirement suffices to solve the state encoding problem using a quantum computer.

fortunately, for the assumptions stated above, one can extend grover’s algorithm to the case of an unknown number of solutions while maintaining an O(√N) run time complexity. in [17], the authors show that if one has foreknowledge of the exact number of solutions, grover’s original algorithm as previously described (which assumes the existence of exactly one solution) can still be used with one modification: instead of O(√N) iterations, one now requires O(√(N/k)) iterations, where k is the number of solutions. furthermore, the authors in [17] also demonstrate a method by which one can find a solution in O(√(N/k)) expected time even if the number of solutions is unknown. this same method is additionally capable of detecting the non-existence of a solution in O(√N) time with an arbitrarily high level of certainty. alternatively, brassard et al. [18] have described a quantum counting algorithm, with which one may obtain an estimate of the number of solutions.
one could use this quantum counting algorithm together with the previously-mentioned extended grover’s algorithm for a known number of solutions (in [17]) to find a single solution, if it exists (see footnote 4). using the above approaches, one can satisfy the previously stated requirement (determine whether any solutions exist, and find one if so) with the same O(√N) time complexity as grover’s original algorithm. for the sake of brevity, we will from now on use “grover’s algorithm” to refer to all of these approaches collectively. in other words, when we say we are “using grover’s algorithm” to solve a satisfaction problem, we mean that we are using any of the above approaches to determine if a solution to the problem exists, and then find one if it does.

there is one more aspect of grover’s algorithm that we have not yet mentioned: the use of ancillary (“ancilla”) qubits in the quantum oracle. strictly speaking, ancilla qubits are an aspect of quantum oracle design and not of grover’s algorithm itself. we discuss them here nevertheless, since we will concentrate on designing a quantum oracle to solve the fsm state encoding problem in the remainder of this paper. ancilla qubits are extra qubits that are used by a quantum oracle to store intermediate results while it is performing its computations. they are initialized to a value of the quantum circuit designer’s choice. a quantum oracle only uses ancilla qubits internally; externally, it should appear as if ancilla qubits simply “pass through” the oracle with no change. in other words, ancilla qubits do not externally appear to participate in the oracle’s operation at all; they participate internally and only for the purpose of acting as temporary storage. ancilla qubits are usually employed to reduce the quantum circuit cost of a quantum oracle. a quantum circuit using ancilla qubits often has lower quantum cost than an equivalent circuit without ancilla qubits.
for instance, figure 8 shows a quantum oracle for the function f = (x1 + x2)(x3 + x4) using two ancilla qubits. it is possible to design a quantum oracle for this function without ancilla qubits, but the quantum circuit cost of such an oracle would be higher than the cost of the oracle in figure 8. in many cases, creating a quantum oracle without using ancilla qubits is in fact completely impractical.

although the judicious employment of ancilla qubits can greatly reduce quantum circuit costs, one must also keep in mind that ancilla qubits themselves are a limited resource: in any real quantum computer, only a finite number of qubits will be available. consequently, quantum oracles using excessive quantities of ancilla qubits are unrealistic in the sense that no quantum computer will be able to execute them. the process of designing a quantum oracle (indeed, designing any quantum circuit) nearly always involves a trade-off between its quantum cost and the number of ancilla qubits it uses.

[figure 8: a quantum oracle using ancilla qubits. the inputs x1 through x4 and two ancilla qubits initialized to |1〉 leave the oracle unchanged; the output qubit y becomes y ⊕ (x1 + x2)(x3 + x4).]

footnote 4: strictly speaking, the viability of this second method is actually not absolutely clear, although it should likely work. the difficulty is that the quantum counting algorithm in [18] only provides an approximate estimate of the number of solutions, while [17] assumes knowledge of the exact number of solutions. a brief consideration of the mathematical analysis used in [17] suggests that the method will still succeed with high probability with only an estimate of the number of solutions, as long as this estimate is reasonably accurate. however, the authors in [17] do not rigorously prove this assertion. in any case, the validity of our present work is unaffected since [17] also give a method for finding one of an unknown number of solutions, as mentioned in the main text.
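the compute/uncompute pattern behind an oracle such as the one in figure 8 can be sketched on classical bits. the gate-level structure below is an assumption, since the circuit drawing itself is not recoverable here: each ancilla starts at 1 and is cleared by a toffoli gate with negated controls exactly when both of its inputs are 0, so that it holds the or of the two inputs; a third toffoli gate writes the and of the two ancillas into y; the first two gates are then repeated to restore the ancillas. the names `toffoli`, `oracle` and `check_oracle` are hypothetical.

```python
from itertools import product

def toffoli(bits, c1, c2, target, negated=False):
    """Flip bits[target] when both controls equal 1 (or both equal 0 if negated)."""
    want = 0 if negated else 1
    if bits[c1] == want and bits[c2] == want:
        bits[target] ^= 1

def oracle(x1, x2, x3, x4, y):
    """Return all wire values after applying y -> y XOR (x1 OR x2)(x3 OR x4)."""
    bits = {"x1": x1, "x2": x2, "x3": x3, "x4": x4, "a1": 1, "a2": 1, "y": y}
    # Compute: afterwards a1 = x1 OR x2 and a2 = x3 OR x4.
    toffoli(bits, "x1", "x2", "a1", negated=True)
    toffoli(bits, "x3", "x4", "a2", negated=True)
    # Write the result into the output wire.
    toffoli(bits, "a1", "a2", "y")
    # Uncompute: repeat the first two gates so both ancillas return to 1.
    toffoli(bits, "x3", "x4", "a2", negated=True)
    toffoli(bits, "x1", "x2", "a1", negated=True)
    return bits

def check_oracle():
    """Verify output, ancilla restoration and unchanged inputs on all basis states."""
    for x1, x2, x3, x4, y in product((0, 1), repeat=5):
        out = oracle(x1, x2, x3, x4, y)
        f = (x1 | x2) & (x3 | x4)
        if out["y"] != y ^ f or out["a1"] != 1 or out["a2"] != 1:
            return False
        if (out["x1"], out["x2"], out["x3"], out["x4"]) != (x1, x2, x3, x4):
            return False
    return True

if __name__ == "__main__":
    print(check_oracle())
```

because every gate used is a permutation of basis states, checking all classical inputs is enough to confirm that the ancillas "pass through" unchanged, which is the external behaviour required of ancilla qubits.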
3.2 use of grover’s algorithm to solve optimization problems

at this point, before turning to the design of a quantum oracle to solve the state encoding problem, we first consider an interesting issue regarding the usage of grover’s algorithm for this problem, as it will affect the design of the quantum oracle. specifically, some difficulty arises from the fact that, while state encoding cost minimization is an optimization problem, grover’s algorithm directly solves only satisfaction problems (a.k.a. decision problems). in other words, we wish to find the minimum value of a certain function (the cost function) while grover’s algorithm can only find a point at which a function evaluates to 1.

in order to use grover’s algorithm for optimization, we reformulate optimization problems in terms of a sequence of satisfaction problems of the form: “find a point at which the value of the function f is less than r”, where r is an arbitrary threshold. by executing grover’s algorithm for different values of the threshold r, one can conduct a search to find the minimum value of f. for example, such a search might proceed according to the following procedure:
1. choose an initial value for the threshold a (which plays the role of r above). ideally, this initial value should be close to the minimum value of f, if an estimate of the minimum is available; if not, the search procedure will still function correctly with any initial value.
2. execute grover’s algorithm for the chosen threshold as previously described.
3. if grover’s algorithm finds a solution, proceed to step 4. otherwise, double the threshold a and repeat from step 2.
4. the preceding steps give a lower and upper bound for the minimum of f. in other words, one obtains threshold values amin and amax such that the function f never takes on any value less than amin (as determined using grover’s algorithm) but takes on a value less than amax at one or more points.
5. execute grover’s algorithm for a threshold value of (amin + amax)/2.
6. if grover’s algorithm finds a solution, then (amin + amax)/2 gives a new upper bound for the minimum of f; otherwise, it gives a new lower bound. repeat from step 4 until the minimum value of f is determined. the final output of grover’s algorithm gives the input to f at which the minimum occurs.

the above sequence of steps essentially describes the well-known binary search strategy. we give this strategy merely as an example showing that such a search is indeed possible. in particular, we do not claim that this strategy is optimal with respect to expected runtime or any other measure. the question of evaluating different search strategies, as well as what standard should be used to evaluate them in the first place, falls outside the scope of this paper. we leave this avenue of exploration open for future work.

we make the crucial observation that the above procedure involves executing grover’s algorithm using not a single, static quantum oracle, but a sequence of quantum oracles that are dynamically created on-the-fly for different thresholds. in particular, the threshold value r is not itself an input to the oracle, meaning that grover’s algorithm does not directly search for the threshold. grover’s algorithm is in fact incapable of directly searching for the threshold since, as previously discussed, that constitutes an optimization, not satisfaction, problem. instead, the threshold value is built into the oracle, meaning that a different oracle is created for each threshold value. consequently, to successfully use the above procedure, one requires not just a single quantum oracle but a method for generating quantum oracles for arbitrary threshold values.

we additionally observe that repeated execution of grover’s algorithm using a sequence of dynamically generated quantum oracles makes use of the freely reconfigurable nature of quantum circuits.
more specifically, a quantum circuit is not a hardware circuit in the same sense as a classical digital logic circuit: in a quantum circuit, information is not transmitted through physical wires from gate to gate. instead, the “wires” in a quantum circuit represent individual qubits that are stored on some physical medium, and the gates are not physical components but rather are manipulations performed on the physical medium using implementation-dependent hardware (e.g., lasers, electromagnets, superconducting circuits). this means that a quantum “circuit” is in fact a sequence of software operations stored on a classical computer that controls the quantum hardware, and this sequence of operations can easily be modified at will.
thus, executing grover’s algorithm using a newly-generated quantum oracle simply involves having the classical computer perform the appropriate, newly-generated sequence of operations on the quantum hardware.

based on the preceding observations, we introduce a distinction between compile time and run time for quantum circuits. in classical computing, compile time is the time at which instructions for the computer are generated, while run time is the time at which those instructions are actually executed. by analogy, compile time of a quantum circuit will refer to the generation process of the quantum circuit (which, as previously noted, takes place on a classical computer). run time of a quantum circuit, in contrast, will refer to the process of actually executing the quantum circuit on quantum hardware (which, as also noted, involves a classical computer controlling the quantum hardware with an appropriate sequence of commands).
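the threshold search of section 3.2, with a new oracle generated at "compile time" for every threshold, can be sketched as follows. this is an illustrative sketch only: `grover_stand_in` is a classical exhaustive search that stands in for a run of grover’s algorithm, `make_oracle` and `minimize` are hypothetical names, and the code assumes that f takes non-negative integer values, so that 0 is a valid initial lower bound and the search can stop once the two bounds differ by 1.

```python
from itertools import product

def make_oracle(f, r):
    """'Compile time': build a fresh predicate with the threshold r built in."""
    return lambda x: f(x) < r

def grover_stand_in(oracle, n_bits):
    """Classical placeholder for one run of Grover's algorithm.

    Returns some input satisfying the oracle, or None if there is none.
    """
    for x in product((0, 1), repeat=n_bits):
        if oracle(x):
            return x
    return None

def minimize(f, n_bits, a=1):
    """Find (minimum value of f, an input attaining it) by threshold search."""
    lo = 0  # assumption: f >= 0, so no input has f(x) < 0
    # Steps 1-3: double the threshold until some input lies below it.
    best = grover_stand_in(make_oracle(f, a), n_bits)
    while best is None:
        lo = a
        a *= 2
        best = grover_stand_in(make_oracle(f, a), n_bits)
    hi = a
    # Steps 4-6: bisect. Invariant: no f(x) < lo, and f(best) < hi.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        x = grover_stand_in(make_oracle(f, mid), n_bits)
        if x is not None:
            hi, best = mid, x   # new upper bound
        else:
            lo = mid            # new lower bound
    return lo, best

if __name__ == "__main__":
    cost = lambda x: 5 + (x[0] ^ 1) + 2 * x[1] + x[2]
    print(minimize(cost, 3))
```

note that the threshold never appears as an argument of the predicate that is searched; each call to `make_oracle` produces a different predicate, mirroring the observation that a different oracle is created for each threshold value.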
therefore, our objective in the subsequent sections of this paper becomes: given an fsm specification, demonstrate a procedure that can generate a quantum oracle for an arbitrary threshold value r, where the quantum oracle accepts as input an encoding for the fsm and answers the question “is the cost of the encoding (as defined in section 2.2) less than r?” the process described in this section then constitutes a complete algorithm for finding an encoding for the given fsm with exact minimum cost.

4 procedure to calculate the cost of a given encoding

4.1 computing cost by considering pairs of states

we now consider a systematic procedure for computing the cost, as defined in section 2.2, of a given encoding for a given fsm. this procedure will form the basis for the design of a quantum oracle that determines whether the cost of a given encoding is less than a predetermined threshold, therefore allowing the use of grover’s algorithm to find the exact minimum-cost encoding for an fsm.

to calculate the cost of a given encoding, we require a method to determine whether the value of a given state variable, say qi, at time t + 1 depends on the value of any (possibly the same or different) state variable, say qj, at time t. by definition, the value of qi at time t + 1 is given by the function δi; thus,

    q+i = δi(q1, ..., qn, x1, ..., xm)

where n and m are the number of state and input variables, respectively. now, q+i depends on qj if and only if, in at least one case, a change in qj and in no other variables causes a change in q+i. in other words, if there exist distinct states s, s′ and an input value i such that qj is the only state variable assigned different values for s and s′, and qi is assigned different values for δ(s, i) and δ(s′, i), then q+i depends on qj.

as an example, consider the state machine from figure 2 and the encoding in figure 2(b).
the encoded transition function q+1 = δ1(q1, q2, x1, x2) resulting from this encoding, shown in figure 2(c), depends on all four variables. the dependency on q1 can be seen from the fact that δ1(0, 0, 0, 1) = 0 but δ1(1, 0, 0, 1) = 1; i.e., a change in only q1 causes a change in δ1. in terms of states and input values, q1q2 = 00 corresponds to a current state of s1 and q1q2 = 10 corresponds to s3. we then see that given the pair of states (s1, s3), whose encodings differ only in q1, for at least one input value (in this case, i2, which corresponds to x1x2 = 01) the corresponding pair of next states, (s2, s3), is such that the value of q1 differs between the two states in the pair.

the result of the preceding discussion may be more formally expressed as follows. for any two distinct states s and s′, let dj(s, s′) mean “the encodings of s and s′ differ only in the value of qj” and let ai(s, s′) mean “the encodings of s and s′ agree in the value of qi”. then, check whether

    dj(s, s′) ⇒ ∀i ∈ I: ai(δ(s, i), δ(s′, i))    (1)

for all pairs of distinct states (s, s′), where I is the set of input values. if so, then q+i does not depend on qj; otherwise, it does.

we determine the dependency of a given q+i on an input variable xj in a similar manner. specifically, q+i depends on xj if and only if, in at least one case, a change in xj and in no other state or input variables causes a change in q+i. therefore, we consider all pairs of distinct input values and determine whether, for those pairs whose encodings differ only in xj, the encodings of the corresponding pair of next states can ever differ in qi. in other words, for any two distinct input values i and i′, let dj(i, i′) mean “the encodings of i and i′ differ only in the value of xj”, and for any two states s and s′, let ai(s, s′) mean (as before) “the encodings of s and s′ agree in the value of qi”. then, we wish to check whether the condition

    dj(i, i′) ⇒ ∀s ∈ S: ai(δ(s, i), δ(s, i′))    (2)

holds for every pair of distinct input values (i, i′), where S is the set of states.
if so, q+i does not depend on xj; otherwise, it does.

equipped with a procedure to determine the dependency of a given q+i on any single state or input variable, it is now straightforward to compute the total cost of an encoding. we calculate the cost of each q+i in accordance with our cost model: the total number of state and input variables on which q+i depends gives the cost of q+i. then, the sum of costs of q+i for 1 ≤ i ≤ n gives the total cost of a given encoding.

4.2 necessity for transition functions to be completely specified

the just-described procedure only gives the correct cost if the encoding is bijective. if this condition is not satisfied, problems can arise from the fact that the “functions” δi are no longer functions in the mathematical sense, but rather relations or so-called incompletely specified functions. in general, when δi is incompletely specified, the question of whether or not δi depends on a given variable cannot be answered by the simple procedure described above.

(a) transition table (rows: current state; columns: input value; entries: next state)

         i1   i2   i3
    s1   s2   s4   s5
    s2   s1   s1   s1
    s3   s3   s3   s1
    s4   s3   s3   s3
    s5   s1   s3   s1

(b) state encoding (q1q2q3): s1 = 000, s2 = 011, s3 = 101, s4 = 110, s5 = 111; input encoding (x1x2): i1 = 00, i2 = 01, i3 = 10

(c) karnaugh map of q+2 over q1q2q3 and x1x2, with entries 0, 1 and “−” (map not reproduced here)

figure 9: (a) fsm specification with numbers of states and inputs not powers of 2; (b) a possible encoding; (c) resulting form of q+2.

figure 9 illustrates an instance where the procedure from the previous section fails to correctly determine the dependencies of a function δi on q1 through qn and x1 through xm. in this figure we have a state machine where the number of states (5) is not a power of two. consequently, we require at least three flip-flops to implement this state machine. however, since eight distinct states exist for an array of three flip-flops, three flip-flop array states remain unused.
in other words, three of the eight possible flip-flop array states do not represent any fsm state at all. similarly, the number of input values (3) is also not a power of two, and hence the input is encoded by two bits with one of the four possible combinations remaining unused. the effects of these unused states and input combinations can be seen in the encoded transition function δ2, where the value of δ2 for the unused states is recorded as a “−”, indicating a “don’t-care”. a “don’t-care” output indicates that the digital logic implementing the fsm may output a value of either 0 or 1 for that combination of flip-flop states and inputs, since the combination should never occur during normal operation.

we now observe that at least three possible implementations exist for δ2:

    q+2 = ¬q1 ∧ ¬q2,    (3)
    q+2 = ¬q1 ∧ ¬q3,    (4)
    q+2 = ¬q2 ∧ ¬q3.    (5)

thus, it is clearly possible to implement δ2 with a dependency on only two variables. on the other hand, one can see from inspection that any one of q1, q2, q3, x1, or x2 alone is not enough to determine the value of δ2. hence, our cost model assigns δ2 a cost of 2.

however, the procedure from the previous section finds the cost of δ2 to be 0. for instance, when considering whether δ2 depends on q1, the procedure considers all pairs of states (s, s′) whose encodings differ only in q1, and then examines whether the encodings of δ(s, i) and δ(s′, i) differ in q2 for any input value i. in this case, the only such pair of states is (s2, s5), and under δ with i = i1, i = i2, and i = i3, the images of this pair are (s1, s1), (s1, s3), and (s1, s1), respectively. for all three image pairs, q2 does not differ between the encoded values of the two states in the pair. thus, the procedure determines that δ2 does not depend on q1. in a similar fashion, the procedure also determines that δ2 does not depend on q2, q3, x1, or x2, and consequently calculates the cost of δ2 as 0.
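the two cost values discussed above (0 from the pairwise procedure, 2 from the cost model) can be reproduced with a short script. this is a sketch under stated assumptions: the transition table and encodings are read off the flattened residue of figure 9, with rows taken as current states and columns as i1, i2, i3 (a reading that agrees with the images of the pair (s2, s5) quoted in the text); the function names are hypothetical; and `true_min_cost` is a brute-force check over subsets of variables added for illustration, not a procedure from the paper. indices are 0-based, so q+2 corresponds to index 1.

```python
from itertools import combinations

# Encodings (q1, q2, q3) and (x1, x2), as read from figure 9(b).
ENC = {"s1": (0, 0, 0), "s2": (0, 1, 1), "s3": (1, 0, 1),
       "s4": (1, 1, 0), "s5": (1, 1, 1)}
INP = {"i1": (0, 0), "i2": (0, 1), "i3": (1, 0)}
# Next states under i1, i2, i3, as read from figure 9(a).
DELTA = {"s1": ("s2", "s4", "s5"), "s2": ("s1", "s1", "s1"),
         "s3": ("s3", "s3", "s1"), "s4": ("s3", "s3", "s3"),
         "s5": ("s1", "s3", "s1")}
INPUTS = list(INP)

def nxt(s, u):
    """Next state for current state s and input value u."""
    return DELTA[s][INPUTS.index(u)]

def differ_only_in(a, b, j):
    """True if bit tuples a and b differ in position j and nowhere else."""
    return all((a[k] != b[k]) == (k == j) for k in range(len(a)))

def pairwise_cost(i):
    """Cost of the i-th next-state bit by the single-variable procedure of section 4.1."""
    cost = 0
    for j in range(3):  # dependency on state variable j, condition (1)
        if any(differ_only_in(ENC[s], ENC[t], j)
               and ENC[nxt(s, u)][i] != ENC[nxt(t, u)][i]
               for s, t in combinations(ENC, 2) for u in INP):
            cost += 1
    for j in range(2):  # dependency on input variable j, condition (2)
        if any(differ_only_in(INP[u], INP[v], j)
               and ENC[nxt(s, u)][i] != ENC[nxt(s, v)][i]
               for u, v in combinations(INP, 2) for s in ENC):
            cost += 1
    return cost

def true_min_cost(i):
    """Smallest number of variables that determines the i-th bit on all specified rows."""
    rows = [(ENC[s] + INP[u], ENC[nxt(s, u)][i]) for s in ENC for u in INP]
    for size in range(6):
        for keep in combinations(range(5), size):
            seen = {}
            # Consistent if equal projections never map to different outputs.
            if all(seen.setdefault(tuple(v[k] for k in keep), out) == out
                   for v, out in rows):
                return size

if __name__ == "__main__":
    print(pairwise_cost(1), true_min_cost(1))
```

the gap between the two results comes entirely from the unused code words: the pairwise procedure only ever compares two specified rows that differ in a single bit, and with this encoding very few such pairs exist.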
this conclusion is clearly incorrect since δ2 is not constant and must therefore depend on at least one variable. indeed, we previously determined the minimum number of dependencies to be 2.

our procedure’s failure to correctly determine cost in this case ultimately arises from the fact that, considered individually, δ2 need not depend on any particular one of q1, q2, and q3. for instance, eq. 5 implements δ2 without a dependency on q1. similarly, eq. 4 and eq. 3 implement δ2 without dependencies on q2 and q3, respectively. it is however not possible to simultaneously avoid two of these dependencies. in other words, any implementation of δ2 that lacks a dependency on q1 must then depend on both q2 and q3. so, although the implementation given by eq. 5 does not depend on q1, it depends on both q2 and q3. the procedure from the previous section assumes (in this case incorrectly) that any two dependencies can always be simultaneously avoided if they can be individually avoided, which results in an incorrect computed cost.

the scenario described here can never occur if all encoded transition functions are actual functions (not relations) in the mathematical sense. to see this, observe that any mathematical function lacking a dependency on each of two variables individually must also lack a dependency on those variables simultaneously. in other words, suppose that a function f of variables x1 through xn (see footnote 5) lacks a dependency on x1, meaning that a change in only x1 cannot affect the value of f and f can be computed without knowledge of the value of x1. furthermore suppose that f similarly lacks a dependency on x2. then if the value of x1 changes to a different value x1′ and the value of x2 simultaneously changes to x2′, we have (see footnote 6)

    f(x1, x2, ..., xn) = f(x1, x2′, ..., xn) = f(x1′, x2′, ..., xn),

showing that simultaneous changes in x1 and x2 cannot affect the value of f. hence, f can be computed without knowledge of the values of either x1 or x2.
This argument breaks down if $f$ is not actually a function but a relation, because then the value of $f$ may not be uniquely defined at a given point. We therefore see that the procedure from the previous section, which only evaluates dependency on a single variable at a time, correctly computes cost when all encoded transition functions are actual functions (i.e., there are no “don’t-cares”) but may fail if those “functions” are actually relations. No “don’t-cares” will exist if the state and input encodings are bijective. Bijectivity of an encoding is equivalent to the condition that the numbers of states and inputs are both powers of two, the minimum possible number of state and input variables is used, and no two distinct states or inputs are assigned the same encoding.

⁵ As in Section 3.1, the variables $x_1$ through $x_n$ here are unrelated to the input variables of an FSM; here, they are simply variables in an arbitrary function $f$.

⁶ Here we use the notation $x_1'$ simply to indicate that the value of $x_1$ has changed. In particular, we do not use the prime (′) symbol to indicate logical negation, as is sometimes done.

5 Design of a quantum oracle to find optimal encodings

5.1 Binary representation of encodings in the quantum oracle

We now demonstrate how to, given an FSM and a threshold value $r$, construct a quantum oracle that, when given an encoding for that FSM as input, determines whether the cost of the encoding is less than $r$. Using this quantum oracle, the procedure described in Section 3.2 then constitutes a complete algorithm for finding the exact minimum solution to the FSM encoding problem under the assumptions and conditions described before. As the first design step, we must agree on the manner in which a candidate encoding is to be supplied as input to the oracle.
We will use the following scheme to represent an encoding as binary data: letting $n$ be the number of bits used by the encoding, we allocate an array of $n$ qubits for each element of the set being encoded (either the state or input set of an FSM), and assign to each such array the encoded value of the corresponding set element. For instance, suppose that $s_1$ through $s_4$ are the internal states of an FSM. Since we require all encodings to be bijective, only 2-bit state encodings will be considered for this FSM. We therefore create four arrays of two qubits each, where the input supplied to each array is the encoded value of the corresponding state. Thus, if $s_1$, $s_2$, $s_3$, and $s_4$ are encoded by 01, 10, 00, and 11, respectively, then we represent this encoding by supplying 01 on the first array of two qubits, 10 on the second array, 00 on the third array, and 11 on the fourth array. Input encodings are represented in a similar manner. Figure 10 provides an example of an encoding represented as binary data that can be input to a quantum oracle. In this figure as well as others in this section, the notation $q_i(s_j)$ denotes the value assigned to $q_i$ in the encoding of $s_j$. Similarly, $x_i(i_j)$ denotes the value assigned to $x_i$ in the encoding of $i_j$.

The number of input qubits to a quantum oracle is of great importance as it determines the run time of Grover’s algorithm. If an FSM has $2^n$ internal states, each state will be encoded using $n$ bits, and therefore, our scheme for representing encodings requires $2^n$ arrays of $n$ qubits each, for a total of $n \cdot 2^n$ qubits. Similarly, for an FSM with $2^m$ input values, our scheme requires $m \cdot 2^m$ qubits. Thus, an FSM with $2^n$ internal states and $2^m$ input values requires a grand total of $n \cdot 2^n + m \cdot 2^m$ qubits.
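As a small illustration of this representation, the sketch below flattens the example state encoding from the text into the bit sequence that would be supplied on the oracle's input qubits and evaluates the qubit count formula. The function names `flatten_encoding` and `oracle_input_qubits` are hypothetical helpers introduced only for this example.

```python
def oracle_input_qubits(n, m):
    """Input qubits for 2**n states and 2**m input values: n*2^n + m*2^m."""
    return n * 2**n + m * 2**m


def flatten_encoding(encoding, order):
    """Concatenate the code words of the listed elements into one bit list."""
    return [int(bit) for name in order for bit in encoding[name]]


# State encoding used as the running example in the text.
state_encoding = {"s1": "01", "s2": "10", "s3": "00", "s4": "11"}

print(flatten_encoding(state_encoding, ["s1", "s2", "s3", "s4"]))
# [0, 1, 1, 0, 0, 0, 1, 1]
print(oracle_input_qubits(2, 2))  # 16
```

The count grows quickly: under the same formula, eight states and eight input values already need 3·8 + 3·8 = 48 input qubits.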
Figure 10: The quantum oracle’s representation of an encoding as binary data: (a) state and input encodings for an FSM; (b) their representations as supplied to the oracle. [Panel (a) gives the state encoding ($q_1 q_2$) $s_1 = 01$, $s_2 = 10$, $s_3 = 00$, $s_4 = 11$ and the input encoding ($x_1 x_2$) $i_1 = 10$, $i_2 = 11$, $i_3 = 01$, $i_4 = 00$. Panel (b) shows the oracle input lines $q_1(s_1), q_2(s_1), \ldots, q_2(s_4), x_1(i_1), \ldots, x_2(i_4)$ carrying these bits.]

5.2 Quantum circuit to detect dependencies

We next demonstrate a quantum circuit design that uses the procedure from Section 4.1 to determine whether the next state of a single flip-flop (i.e., one of $q_1^+$ through $q_n^+$) depends on the current state of another flip-flop (i.e., one of $q_1$ through $q_n$) or the current value of a single input bit (i.e., one of $x_1$ through $x_m$). Recall that this procedure involves checking the conditions given by Eq. 1 and Eq. 2 for each pair of states or input values, respectively. In turn, Eq. 1 and Eq. 2 involve checking the conditions $d_j(s, s')$, $a_i(s, s')$, and $d_j(i, i')$ for pairs of states or input values, where $d_j$ and $a_i$ are as defined in Section 4.1.

Using the binary representation described in Section 5.1, one can easily construct quantum circuits to check $d_j(s, s')$, $a_i(s, s')$, and $d_j(i, i')$ for any pair of states or inputs. For example, Figure 11(a) shows a quantum circuit that evaluates $d_1(s, s')$ for any two distinct states $s$ and $s'$.⁷ This circuit operates on just a subset of the complete binary representation of an encoding; specifically, it uses the qubits carrying information about the encoded values of $s$ and $s'$. It uses Feynman (a.k.a. controlled-NOT) gates to perform comparisons and a Toffoli gate to evaluate the logical AND of two comparison results. The final output of the circuit is given by the logical expression

$$\neg(q_2(s) \oplus q_2(s')) \wedge \neg(q_3(s) \oplus q_3(s')),$$

which evaluates to 1 if and only if the encodings of $s$ and $s'$ agree in all state variables except $q_1$, exactly the definition of $d_1(s, s')$. Although we assumed for the purpose of this particular illustration that each state is encoded by three bits, the general circuit structure applies to encodings of any size. Likewise, although this particular circuit evaluates $d_1(s, s')$, similar circuits suffice to evaluate $d_j(s, s')$ for any $j$, as well as $d_j(i, i')$ for two input values $i$ and $i'$ instead of two states.
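The behaviour of the $d_1(s, s')$ circuit on computational basis states can be simulated classically. The sketch below assumes one plausible gate arrangement: two Feynman gates that write the XOR of corresponding bits onto the wires of $s'$, followed by a Toffoli gate with negative controls. The exact layout of Figure 11(a) is not recoverable here, so the wire order and the use of negative controls are assumptions.

```python
from itertools import product


def cnot(bits, control, target):
    """Feynman gate on classical bits: target ^= control."""
    bits[target] ^= bits[control]


def toffoli_neg(bits, c1, c2, target):
    """Toffoli gate that flips the target when both controls are 0."""
    if bits[c1] == 0 and bits[c2] == 0:
        bits[target] ^= 1


def d1(enc_s, enc_s_prime):
    """Evaluate d_1(s, s') for 3-bit encodings given as tuples (q1, q2, q3)."""
    # Wires 0-2: q1..q3 of s'; wires 3-5: q1..q3 of s; wire 6: ancilla |0>.
    bits = list(enc_s_prime) + list(enc_s) + [0]
    cnot(bits, 4, 1)            # wire 1 now holds q2(s) xor q2(s')
    cnot(bits, 5, 2)            # wire 2 now holds q3(s) xor q3(s')
    toffoli_neg(bits, 1, 2, 6)  # ancilla = AND of the two negated XORs
    return bits[6]


print(d1((0, 1, 1), (1, 1, 1)))  # 1: the encodings differ only in q1
print(d1((0, 1, 1), (1, 0, 1)))  # 0: the encodings also differ in q2
print(d1((0, 1, 0), (0, 1, 0)))  # 1: identical encodings, the case footnote 7 warns about
```

The last call shows the erroneous output for identical encodings that footnote 7 describes.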
⁷ Technically, this circuit only works correctly if the states $s$ and $s'$ have distinct encodings, as required by the conditions stated at the end of Section 2.2. If $s$ and $s'$ happen to have the same encoding, the circuit will erroneously output 1. However, later on, in Section 5.5, we describe a circuit that enforces the condition that distinct states have distinct encodings. This circuit is incorporated into the final oracle, as described in Section 5.6, so that it forces the oracle’s output to 0 whenever this condition is violated. The operation of the circuit described in the present section therefore becomes irrelevant in such a case, since Grover’s algorithm will never find such a distinctness-violating encoding as a solution. From now on, we therefore proceed with the implicit assumption that distinct states have distinct encodings. The same applies to input values as well: the operation of the circuit described here is similarly irrelevant if distinct input values do not have distinct encodings.

In a similar vein, Figure 11(b) shows a quantum circuit that evaluates $a_1(s, s')$. This circuit is extremely simple, as it evaluates only $\neg(q_1(s) \oplus q_1(s'))$, which amounts to a one-bit comparison. As before, the same circuit structure is usable for encodings of any size, not just three bits.

Next, to check the conditions specified by Eq. 1 and Eq. 2, our quantum circuit must be able to evaluate a logical implication. Recalling that the expression $a \Rightarrow b$ is defined as $\neg a \vee b$, we observe that a single Toffoli gate suffices to evaluate a logical implication, as shown in Figure 12.

Figure 11: (a) Quantum circuit to check $d_1(s, s')$; (b) circuit to check $a_1(s, s')$. [Circuit diagrams on the wires $q_1(s'), q_2(s'), q_3(s'), q_1(s), q_2(s), q_3(s)$; panel (a) uses an additional ancilla initialized to $|0\rangle$.]

Figure 12: Logical implication evaluated using a Toffoli gate. [Circuit diagram with inputs $a$, $b$ and an ancilla initialized to $|1\rangle$, producing $a \Rightarrow b$.]

Finally, we are ready to turn to the design of the full quantum circuit for checking Eq. 1 or Eq. 2. This task is complicated by the fact that checking Eq. 1 or Eq. 2 involves evaluating $a_i(s, s')$ simultaneously for many pairs of states. For instance, consider the state machine from Figure 2, and suppose that we wish to design a quantum circuit to evaluate Eq. 2 for $i = 1$, $j = 2$, and the pair of input values $(i, i') = (i_1, i_2)$. The circuit must then check whether $a_1(\delta(s, i), \delta(s, i'))$ holds for all states $s$. In this case, given $i = i_1$ and $i' = i_2$, the corresponding pairs $(\delta(s, i), \delta(s, i'))$ of next states are $(s_2, s_2)$, $(s_4, s_1)$, $(s_2, s_3)$, and $(s_4, s_4)$ for $s = s_1$, $s_2$, $s_3$, and $s_4$, respectively. Since $a_1(s_2, s_2)$ and $a_1(s_4, s_4)$ are trivially true, we can ignore them. Thus, the quantum circuit must check whether $a_1(s_4, s_1)$ and $a_1(s_2, s_3)$ simultaneously hold. The upper portion of Figure 13 demonstrates a subcircuit that accomplishes this task. The lower portion of Figure 13 then combines this subcircuit with another subcircuit (from Figure 11(b)) to evaluate the full Eq. 2. In other words, the final output of this circuit, on the bottommost qubit, will be 1 if Eq. 2 is satisfied, and 0 if it is not. We reiterate that for an FSM with more than four input values (so that each input value is encoded by at least three bits), one would replace the simplified circuit for evaluating $d_2(i_1, i_2)$ with the full one from Figure 11(a).

In some cases, checking Eq.
1 or 2 may involve evaluating $a_i(i, i')$ simultaneously for two or more overlapping pairs of inputs (or states). In these scenarios, one must construct the quantum circuit for checking Eq. 1 or 2 using a slightly different design. For example, suppose that we now wish to design a quantum circuit to evaluate Eq. 2 for the same state machine and $i = 1$, $j = 2$ as before, but a different pair of input values $(i_1, i_3)$. The pairs of next states corresponding to current states of $s_1$, $s_2$, $s_3$, and $s_4$ are $(s_2, s_2)$, $(s_4, s_3)$, $(s_2, s_3)$, and $(s_4, s_2)$, respectively. Therefore, the quantum circuit must evaluate

$$d_2(i_1, i_3) \Rightarrow a_1(s_4, s_3) \wedge a_1(s_2, s_3) \wedge a_1(s_4, s_2). \tag{6}$$

We then observe that, since the pairs of states on the right-hand side overlap, the entire right-hand side is equivalent to the condition that the encodings of $s_2$, $s_3$, $s_4$ must all agree in the value of $q_1$.⁸ We denote this condition by $a_1(s_2, s_3, s_4)$, a natural generalization of our earlier $a_i(s, s')$ notation for pairs of states. Figure 14 illustrates the quantum circuit for evaluating Eq. 2 in this case. The critical difference between Figures 13 and 14 is that the Feynman gates in the upper portion of Figure 14 operate on the encodings of overlapping pairs of states, and must therefore be applied in the correct order. Additionally, although the right-hand side of Eq. 6 contains the term $a_1(s_4, s_2)$, the circuit in Figure 14 does not contain a corresponding Feynman gate to evaluate this term. Such a gate is unnecessary because of transitivity: it is enough to check both $a_1(s_2, s_3)$ and $a_1(s_3, s_4)$ since they are together equivalent to $a_1(s_2, s_3, s_4)$, which implies $a_1(s_4, s_2)$.

More generally, one can construct similar circuits to evaluate $a_i(s, s')$ for any number of overlapping pairs of states. One simply takes the union of all such pairs to obtain an arbitrarily-sized set of states and then constructs a quantum circuit according to the pattern shown in Figure 15(a).
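The role of gate ordering and of transitivity can be illustrated with a classical simulation of a chain of Feynman gates. The arrangement below is one ordering consistent with the description and is not necessarily the layout of Figure 15(a): each gate overwrites a wire with the XOR of two neighbouring values, always using a control wire that has not yet been modified. The final multi-controlled gate with negative controls and the `order` parameter are assumptions made for this sketch.

```python
from itertools import product


def cnot(bits, control, target):
    """Feynman gate on classical bits: target ^= control."""
    bits[target] ^= bits[control]


def all_agree(values, order=None):
    """Return 1 if all values (one bit q_i per state in the set) are equal."""
    n = len(values)
    bits = list(values) + [0]          # last wire is the ancilla |0>
    steps = range(n - 1) if order is None else order
    for k in steps:
        cnot(bits, k + 1, k)           # wire k becomes v[k] xor v[k+1]
    if not any(bits[: n - 1]):         # multi-controlled NOT, negative controls
        bits[n] ^= 1
    return bits[n]


# Correct order: n - 1 comparisons suffice, thanks to transitivity.
print([all_agree(v) for v in product((0, 1), repeat=3)])
# [1, 0, 0, 0, 0, 0, 0, 1]

# Wrong order: the second gate reads a wire that was already overwritten.
print(all_agree((0, 1, 1), order=[1, 0]))  # 1, although the values differ
```

Only $n - 1$ comparisons are made for a set of $n$ states, which mirrors the omission of the gate for $a_1(s_4, s_2)$ in Figure 14.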
⁸ For this particular example, this condition is impossible because, since we require encodings to be bijective, the encodings of at most two states can agree in the value of $q_1$. However, this condition could be satisfied in an FSM with more states.

Figure 13: Quantum circuit to evaluate Eq. 2 for input pair $(i_1, i_2)$. [Circuit diagram; intermediate outputs $a_1(s_1, s_4) \wedge a_1(s_2, s_3)$ and $d_2(i_1, i_2)$, final output $d_2(i_1, i_2) \Rightarrow a_1(s_1, s_4) \wedge a_1(s_2, s_3)$.]

Figure 14: Quantum circuit to evaluate Eq. 2 for input pair $(i_1, i_3)$. [Circuit diagram; intermediate outputs $a_1(s_2, s_3, s_4)$ and $d_2(i_1, i_3)$, final output $d_2(i_1, i_3) \Rightarrow a_1(s_2, s_3, s_4)$.]

Figure 15(b) then shows a circuit that checks whether $a_i$ is simultaneously satisfied for multiple sets of states. These circuit structures are sufficient to algorithmically construct a quantum circuit for evaluating Eq. 1 or 2 in any case. Specifically, given any FSM with any number of states and input values, the following procedure constructs a quantum circuit to evaluate Eq. 1 for any $i$, $j$, and pair of states⁹ $(s, s')$:

1. List all of the pairs of states appearing on the right-hand side of Eq. 1 when expanded. In other words, create a list containing the pair of states $(\delta(s, i), \delta(s', i))$ for every possible input value $i$.
2. In this list, merge any overlapping pairs of states together to obtain a collection of mutually disjoint sets of states.
3. Using the circuit structures from Figure 15, construct a circuit that checks whether $a_i$ is simultaneously satisfied for every set of states produced by the previous step.
4. Using the circuit created in step 3 as a subcircuit, construct a circuit that checks the full Eq. 1, following the general pattern illustrated in Figures 13 and 14.
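Steps 1 and 2 of this procedure, together with a classical evaluation of the resulting condition, can be sketched as follows. The sketch uses the Eq. 2 variant (a pair of input values) because only the $i_1$, $i_2$, and $i_3$ columns of the Figure 2 transition function can be reconstructed from the next-state pairs quoted in the text; the $i_4$ column is omitted. The encoding of Figure 10 is used purely as test data, and all function names are hypothetical.

```python
# Columns i1..i3 of the transition function, inferred from the quoted pairs.
DELTA = {
    "s1": {"i1": "s2", "i2": "s2", "i3": "s2"},
    "s2": {"i1": "s4", "i2": "s1", "i3": "s3"},
    "s3": {"i1": "s2", "i2": "s3", "i3": "s3"},
    "s4": {"i1": "s4", "i2": "s4", "i3": "s2"},
}


def next_state_pairs(i, i_prime):
    """Step 1: list (delta(s, i), delta(s, i')) for every state s."""
    return [(DELTA[s][i], DELTA[s][i_prime]) for s in DELTA]


def merge_overlapping(pairs):
    """Step 2: merge overlapping pairs into mutually disjoint sets."""
    groups = []
    for a, b in pairs:
        if a == b:
            continue  # a_i(s, s) is trivially true
        touching = [g for g in groups if a in g or b in g]
        merged = {a, b}.union(*touching)
        groups = [g for g in groups if g not in touching] + [merged]
    return sorted(tuple(sorted(g)) for g in groups)


def partial_check(state_enc, input_enc, i, i_prime, bit, var):
    """Evaluate d_var(i, i') => a_bit over every merged set (0-based indices)."""
    d = all(x == y for k, (x, y) in
            enumerate(zip(input_enc[i], input_enc[i_prime])) if k != var)
    groups = merge_overlapping(next_state_pairs(i, i_prime))
    a = all(len({state_enc[s][bit] for s in g}) == 1 for g in groups)
    return int((not d) or a)


STATE_ENC = {"s1": (0, 1), "s2": (1, 0), "s3": (0, 0), "s4": (1, 1)}
INPUT_ENC = {"i1": (1, 0), "i2": (1, 1), "i3": (0, 1), "i4": (0, 0)}

print(merge_overlapping(next_state_pairs("i1", "i2")))  # [('s1', 's4'), ('s2', 's3')]
print(merge_overlapping(next_state_pairs("i1", "i3")))  # [('s2', 's3', 's4')]
print(partial_check(STATE_ENC, INPUT_ENC, "i1", "i2", bit=0, var=1))  # 0
print(partial_check(STATE_ENC, INPUT_ENC, "i1", "i3", bit=0, var=1))  # 1
```

The merged sets match the conditions $a_1(s_1, s_4) \wedge a_1(s_2, s_3)$ and $a_1(s_2, s_3, s_4)$ that appear in Figures 13 and 14. The set construction depends only on the transition function, while the encoding enters only in `partial_check`.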
from now on, we will refer to any circuit generated by this procedure as a partial dependency checker for the given pair of states, because it checks eq. 1 for a single pair of states while the existence of a dependency is determined by checking eq. 1 for all possible pairs of states. we observe that crucially, the quantum circuit design procedure described in this section requires at compile time only knowledge of the transition function of the fsm being encoded. in particular, the quantum circuit cannot be modified depending on the encoding whose cost is being evaluated, because individual encodings are not considered at compile time. individual encodings are only considered at run time, when grover’s algorithm is used to simultaneously evaluate the cost of every possible encoding in the search space. thus, in a single run of grover’s algorithm, the same quantum circuit must be able to evaluate any encoding in the search space without any modifications. 5.3 quantum circuit to calculate total cost of an encoding with the ability to generate a quantum circuit to evaluate eq. 1 or 2 for any i, j, and pair of states or inputs, we now consider the task of designing a quantum circuit to calculate the full cost of a given encoding. following the procedure from section 4.1, the quantum circuit determines whether a given q+i depends on qj by checking if eq. 1 is satisfied for all possible pairs of states.10 figure 16 shows a circuit structure that accomplishes this task. from now on, we will refer to this circuit as a dependency checker. 9we describe the procedure here using eq. 1 with a pair of states, but it functions equally well for eq. 2 with a pair of input values. 10once again, although we use dependencies on state variables for the sake of explanation, the whole discussion applies to dependencies on input variables as well. 29 196 e. tsai, m. 
Figure 15: (a) Circuit to evaluate $a_1(s_1, \ldots, s_n)$ for an arbitrary number of states; (b) circuit for multiple sets of states. [Circuit diagrams not reproduced; panel (b) evaluates $a_i(s_1, s_2, s_3) \wedge a_i(s_4, s_5, s_6) \wedge a_i(s_7, s_8)$.]

Figure 16: Quantum circuit to check dependency of $q_i^+$ on a single state or input variable. [Circuit diagram not reproduced; it shows partial dependency checkers for the state pairs $(s_1, s_2)$, $(s_1, s_3)$, and $(s_3, s_4)$ acting on the input and ancilla qubits.]
In Figure 16, the label “input qubits” denotes the entire collection of qubits storing the binary representation of an encoding, as described in Section 5.1. Each block represents a partial dependency checker that checks Eq. 1 for the given $i$, $j$, and the pair of states with which it is labeled. The label “ancilla qubits” denotes the ancilla qubits that are used by these partial dependency checkers, as shown in Figures 13 and 14. For illustrative purposes, we have assumed an FSM with four states, resulting in six possible pairs of states; partial dependency checkers for three of them are shown. The same circuit structure, scaled up to accommodate the correspondingly larger number of subcircuits, would be used for an FSM with more states. The output from each partial dependency checker is stored on an ancilla qubit so that the final result can be obtained by taking their logical AND. Each partial dependency checker must be applied again after the result is computed to restore the ancilla qubits to their original states, which, as previously discussed in Section 3.1, is required for Grover's algorithm to work correctly. The final output from the dependency checker is 1 if and only if $q_i^+$ depends on $q_j$.

Next, we require another quantum circuit to calculate the total cost of the encoding by counting the total number of dependencies over all $q_i^+$. We introduce a quantum counter for this purpose. The quantum counter consists of a register of qubits, which stores an integer in base-two representation, together with incrementer circuits that add one to this stored integer when a control qubit is in the state |1⟩. Figure 17(a) shows a three-qubit incrementer. If the input qubits represent an integer $a_2a_1a_0$, with $a_2$ being the most significant and $a_0$ being the least significant bit, then the output of the incrementer is (the base-two representation of) $(a_2a_1a_0)_2 + 1 \bmod 2^3$. The result is taken modulo $2^3$ because, due to reversibility, the maximum value of $2^3 - 1$ must wrap around to 0 upon increment.

Figure 17: (a) A three-qubit incrementer; (b) general n-qubit incrementer; (c) controlled three-qubit incrementer; (d) schematic symbol for the controlled incrementer. [Circuit diagrams not reproduced.]

Figure 17(b) illustrates how the incrementer is extended to any number of qubits. Figure 17(c) then demonstrates how, by adding an additional control qubit to each individual gate making up the incrementer, a controlled incrementer is produced. As the name suggests, the controlled incrementer only increments the register $a_n \ldots a_2a_1a_0$ if the control qubit $c$ is in the state |1⟩. Figure 17(d) depicts the schematic symbol that we will use from now on to compactly represent a controlled incrementer. Observe that the lower line in this schematic represents not a single qubit but an entire register or array of $n$ qubits.

Using controlled incrementers, Figure 18 illustrates how a quantum counter is formed by a sequence of controlled incrementers all acting on the same target register. This register stores a running count, which is incremented once for each control qubit in the state |1⟩.
Since the register is initialized to |000⟩, the final value on the register is simply the total number of control qubits in the state |1⟩, represented as a base-two integer as before. The quantum counter depicted here is limited to a maximum of seven control qubits because the maximum value of the three-qubit target register is |111⟩. Attempting to count further than this would simply result in the counter wrapping around back to |000⟩, as previously mentioned. This limit can be raised by increasing the size of the target register; a target register consisting of $n$ qubits allows the quantum counter to count up to a maximum value of $2^n - 1$.

Figure 18: An example of a quantum counter. [Circuit diagram not reproduced; control qubits $c_1, c_2, \ldots, c_7$ drive controlled incrementers on a three-qubit register initialized to |000⟩, which ends up holding $\sum c_i$.]

With quantum counters, it is easy to construct a quantum circuit that calculates the total cost of an encoding. We recall that the cost is equal to the total number of dependencies of each $q_i^+$ on state and input variables $q_j$ and $x_k$, summed over all $i$. Therefore, we create a quantum circuit consisting of many dependency checkers, one to evaluate each possible dependency, where the outputs of the dependency checkers are fed to a quantum counter that counts the total number of dependencies. The final value of the quantum counter then gives the cost of the encoding represented by the input qubits. Figure 19 depicts the structure of the resulting circuit.

Figure 19: Quantum circuit to calculate total cost using a quantum counter. [Circuit diagram not reproduced; dependency checkers “D.C. $q_1^+, q_1$”, “D.C. $q_1^+, q_2$”, ..., “D.C. $q_n^+, x_m$” each drive a +1 incrementer on a counter register initialized to |00...0⟩.]

In Figure 19, each block labeled “D.C. $q_i^+, q_j$” or “D.C. $q_i^+, x_j$” represents a dependency checker that checks whether $q_i^+$ depends on $q_j$ or $x_j$, respectively. Due to space constraints, only a few checkers are shown in Figure 19, but the full circuit requires dependency checkers for every possible combination of a $q_i^+$ and a $q_j$ or $x_j$.
In other words, the circuit contains dependency checkers for $q_1^+$ depending on each of $q_1$ through $q_n$ and $x_1$ through $x_m$, $q_2^+$ depending on each of $q_1$ through $q_n$ and $x_1$ through $x_m$, and so on up to $q_n^+$ depending on each of $q_1$ through $q_n$ and $x_1$ through $x_m$, where $n$ is the number of state variables and $m$ is the number of input variables. Thus, the total number of dependency checkers is $n(n + m)$. This number is actually quite small relative to the size of the state machine being encoded, since the numbers of state and input variables grow only logarithmically with the numbers of states and input values, respectively, in the FSM.

We additionally observe that the quantum counter used in Figure 19 is constructed slightly differently from the example shown in Figure 18.
Specifically, the quantum counter in Figure 19 is able to count using only one control qubit, while the one in Figure 18 counts the number of ones present in a whole collection of control qubits. The reason for this difference is that the circuit in Figure 19 counts the number of ones not in a collection of qubits, but appearing on a single qubit at different times. In other words, the circuit places the result of a given dependency checker onto the single counter control qubit and uses it to update the counter, but crucially, the circuit then applies the dependency checker again to restore that control qubit to its original state so that the next dependency checker can also use it to update the counter. In this way, the circuit is able to count dependencies without using a separate ancilla qubit to store the result of each dependency checker.
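To make the counting behaviour concrete, here is a minimal classical sketch that follows the counter on computational basis states only (no superposition). It assumes the incrementer of Figure 17 is the usual cascade of multiple-control NOT gates, in which each register bit flips when the control and all less significant bits are 1; the function names are hypothetical.

```python
def controlled_increment(bits, control):
    """Apply a controlled incrementer to a basis state.

    bits[0] is the least significant bit. Gates are applied from the most
    significant target down to the least significant one, so each gate
    still sees the original values of the lower bits."""
    out = list(bits)
    for target in range(len(out) - 1, -1, -1):
        if control and all(out[k] for k in range(target)):
            out[target] ^= 1
    return out


def count_ones(controls, n):
    """Feed a sequence of control values to an n-bit counter starting at 0.

    The sequence can model several control qubits (as in Figure 18) or one
    control qubit that is reused at different times (as in Figure 19)."""
    register = [0] * n
    for c in controls:
        register = controlled_increment(register, c)
    return sum(bit << k for k, bit in enumerate(register))


def min_register_bits(max_count):
    """Smallest n with 2**n - 1 >= max_count, so no wrap-around occurs."""
    return max_count.bit_length()
```

For instance, `count_ones([1, 0, 1, 1, 0, 1, 1], 3)` gives 5, while eight ones on a three-bit register wrap around to 0, which is why the register size has to be chosen from the largest count that can occur.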
As previously mentioned, the maximum value that a quantum counter can reach before wrapping around to zero is determined by the number of qubits in the counter's target register. In Figure 19, the size of the quantum counter's target register is not explicitly indicated because it depends on the maximum possible cost. The maximum possible cost is equal to the total number of possible dependencies, which as we previously saw is $n(n + m)$, where $n$ and $m$ are the numbers of state variables and input variables, respectively. Therefore, the quantum counter's target register must contain at least $\lceil \log_2(n(n + m) + 1) \rceil$ qubits to guarantee that no wrap-around can occur, which would cause the circuit to produce an incorrect result.

5.4 Quantum threshold circuit

With a quantum circuit for calculating the cost of an encoding, we now add a threshold circuit to create a quantum oracle that determines whether the cost of the encoding is less than $r$, where $r$ is the threshold for which the oracle is being generated. The threshold circuit accepts a set of qubits representing a base-two integer as input and produces an output that depends on whether the input is less than or equal to $r$. Designing such threshold circuits is a well-known and solved problem in classical digital logic. For instance, the following recursive procedure allows one to generate a logical expression that determines whether a given base-two integer $(a_n a_{n-1} \ldots a_1 a_0)_2$ is less than or equal to a threshold $(r_n r_{n-1} \ldots r_1 r_0)_2$:

1. If the threshold and the value to be compared against it consist of only a single bit, the expression is $\neg a_0$ if $r_0 = 0$ and a constant 1 if $r_0 = 1$. In this case, stop immediately as we are finished.
2. Otherwise, recursively use this procedure to generate a logical expression that determines whether $(a_{n-1} \ldots a_0)_2 \le (r_{n-1} \ldots r_0)_2$.
3. If $r_n = 1$, take the logical OR of $\neg a_n$ with the expression generated in step 2. Otherwise, if $r_n = 0$, take the logical AND of $\neg a_n$ with that expression. Return the resulting expression as the output of this procedure.
4. At the very end, when all recursive steps are complete, it may be possible to simplify the expression using the identities $x \vee 1 = 1$ and $x \wedge 1 = x$.

For example, suppose we wish to generate a logical expression that determines whether $a_2a_1a_0$ is less than or equal to 5, whose base-two representation is 101. Then, since $r_2 = 1$, we take the logical OR of $\neg a_2$ with an expression recursively generated to determine whether $a_1a_0$ is less than or equal to $(01)_2 = 1$. Since $r_1 = 0$, we take the logical AND of $\neg a_1$ with the expression that determines whether $a_0$ is less than or equal to 1. This is the base case, for which the procedure above returns a constant 1. Therefore, the complete generated expression is $\neg a_2 \vee (\neg a_1 \wedge 1)$, which simplifies to $\neg a_2 \vee \neg a_1$.

Figure 20: Quantum circuit implementing the expression $y = \neg a_5 \vee (\neg a_4 \wedge \neg a_3 \wedge \neg a_2 \wedge (\neg a_1 \vee \neg a_0))$. [Circuit diagram not reproduced.]

Once a logical expression is obtained, constructing a quantum threshold circuit is simply a matter of implementing that logical expression with quantum gates. This can be achieved using a cascade of Toffoli gates as shown in Figure 20, where consecutive logical operations of the same type (either AND or OR) can be combined into a single Toffoli gate to reduce the number of ancilla qubits required. We observe that the threshold is “hard-coded” into the circuit (i.e., it is built into the circuit structure itself and can only be changed by changing the circuit) and therefore, generating a quantum oracle for a different threshold involves generating a new threshold circuit, as discussed in Section 3.2.

5.5 Quantum circuit to enforce bijectivity of encodings

In Section 4.2, we saw that it is necessary for all encodings to be bijective, which implies that no two states or inputs of an FSM may be encoded by the same value.
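The two classical conditions met so far, the threshold comparison of Section 5.4 and the bijectivity requirement just stated, can be sketched in a few lines. This is only an illustrative classical analogue under stated assumptions: the function names are hypothetical, bit lists are given most significant bit first, the expression is printed with ASCII operators (`~`, `|`, `&`), and the simplifications of step 4 are applied as the expression is built rather than at the very end.

```python
from itertools import combinations


def _wrap(expr):
    """Parenthesize compound subexpressions only."""
    return f"({expr})" if " " in expr else expr


def threshold_expr(r_bits):
    """Expression for (a_n ... a_0) <= (r_n ... r_0); r_bits is MSB first."""
    n = len(r_bits) - 1  # index of the most significant bit
    if n == 0:
        return "1" if r_bits[0] else "~a0"
    sub = threshold_expr(r_bits[1:])
    if r_bits[0]:  # r_n = 1: OR, using x | 1 = 1
        return "1" if sub == "1" else f"~a{n} | {_wrap(sub)}"
    # r_n = 0: AND, using x & 1 = x
    return f"~a{n}" if sub == "1" else f"~a{n} & {_wrap(sub)}"


def leq(a_bits, r_bits):
    """Evaluate the same recursion directly on bit lists (MSB first)."""
    if len(r_bits) == 1:
        return bool(r_bits[0]) or not a_bits[0]
    sub = leq(a_bits[1:], r_bits[1:])
    if r_bits[0]:
        return (not a_bits[0]) or sub
    return (not a_bits[0]) and sub


def is_bijective(codes):
    """True if no two states (or inputs) share the same code word."""
    return all(x != y for x, y in combinations(codes, 2))
```

For the threshold 5 = (101)₂ used in the example of Section 5.4, `threshold_expr([1, 0, 1])` yields `~a2 | ~a1`, matching the expression derived there. The pairwise comparison in `is_bijective` mirrors the classical meaning of the bijectivity condition: every pair of code words must differ.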
Therefore, the quantum oracle must include a circuit to check for and rule out encodings where the same encoded value is used more than once. This can be achieved by comparing the encoded values for every possible pair of states/inputs and verifying that the encoded values are different for every such pair. We therefore construct the circuit shown in Figure 21. Since there are four states in this figure, six pairs of states must be checked, of which three are shown. Similarly, checks for three out of the six pairs of inputs are shown, with the understanding that the full circuit requires checks on all six. Observe that the exact same circuit applies for checking both state and input encodings, and that although Figure 21 assumes four states, the same structure may of course be scaled up for FSMs with any number of states or inputs.

5.6 The complete quantum oracle

Finally, we consider how the quantum circuits we have demonstrated thus far are assembled to form a complete quantum oracle. Recall that the cost calculation circuit (Section 5.3), which is itself assembled using a quantum counter and dependency checkers, outputs the cost of an encoding expressed as a base-two integer, which is then passed to the threshold circuit (Section 5.4). The threshold circuit produces a single-qubit answer indicating whether the calculated cost is below the threshold or not. At the same time, an encoding must be bijective in order to be considered at all. This condition is enforced by a circuit that checks uniqueness of each state or input's encoding (Section 5.5). We therefore see that our quantum oracle should only output 1 (true) if both the threshold and uniqueness checking circuits output 1. The complete quantum oracle therefore requires one additional Toffoli gate to produce its final output, as shown in Figure 22. In Figure 22, the block labeled “Cost” denotes the cost calculation circuit, “Th(r)” denotes a threshold circuit with threshold $r$, and “U.C.” (“uniqueness checker”) denotes the uniqueness checking circuit. The oracle also requires additional mirror circuits to restore the ancilla qubits to their original values.
these are denoted by overbars; e.g., “u.c” denotes the mirror circuit for the uniqueness checker. in particular, the mirror of the cost calculation circuit contains decrementer instead of incrementer circuits; these decrementer circuits naturally arise from reversing the order of the gates in the incrementer circuits from figure 17. observe that the design of the oracle allows us to omit additional mirror circuits that would be necessary if these subcircuits were being used as stand-alone circuits. for instance, as a stand-alone circuit, the threshold circuit shown in figure 20 would require additional mirror gates (as shown in figure 23) if one wished to preserve the states of the two ancilla qubits for later reuse. however, when used in the quantum oracle, these additional mirror gates become unnecessary because the th(r) and th(r) subcircuits already act as mirrors to each other. hence, the circuit structure shown in figure 20, without additional mirror gates, can be used for both of these subcircuits (except, of course, that the th(r) subcircuit is reversed compared to its counterpart). this simplification effectively halves the size of both of these subcircuits, as compared with their standalone form as seen in figure 23. the exact same observation also applies to the uniqueness checking subcircuits u.c. and u.c.—their stand-alone forms would require additional mirror gates on top of the circuit structure shown in figure 21, which become unnecessary when the subcircuits are used as components of the oracle in figure 22. we observe that the threshold circuit is the only part of the oracle that must be regenerated for each run of grover’s algorithm as described in section 3.2. the other parts of the oracle are generated depending on the 36 q1(s1) q2(s1) q1(s2) q2(s2) q1(s3) q2(s3) q1(s4) q2(s4) x1(i1) x2(i1) x1(i2) x2(i2) x1(i3) x2(i3) x1(i4) x2(i4) |1〉 |1〉 |1〉 |1〉 |1〉 |1〉 |0〉 figure 21: quantum circuit to verify uniqueness of encoding for each state/input. 
FSM being encoded, but are independent of the chosen threshold; thus, they remain unchanged throughout the entire search procedure described in Section 3.2. With this oracle (see footnote 11), we have met the objective stated at the end of Section 3.2 and therefore, the procedure described in Section 3.2 gives a complete algorithm for finding an exact optimal encoding of an FSM with the help of Grover's algorithm.

Figure 22: High-level view of the quantum oracle.

Figure 23: Threshold circuit from Figure 20 with mirror gates. The mirror gates turn out to be unnecessary as explained in the main text.

6 Run time complexity analysis

6.1 Run time complexity of the proposed quantum algorithm

We now wish to determine the run time complexity of our proposed quantum algorithm, in order to compare its performance against that of an analogous exhaustive search-based classical algorithm for the same problem. In this determination, we make certain simplifying assumptions, as described below. These simplifications are justified because our goal is not to perform the most detailed and nuanced analysis possible, but rather to show that our proposed algorithm can reasonably be expected to outperform the classical algorithm.

When computing the run time complexity of a quantum circuit, we must take into account the differing quantum cost [19] of the various quantum gates used in the circuit; in our case, these are Feynman, Toffoli, and multiple-control Toffoli gates. We recall that quantum gates are not physical hardware components like classical digital logic gates; instead, they are manipulations of the physical qubits, consisting of sequences of fundamental physical operations on those qubits.
The quantum cost of a quantum gate is then the number of physical operations required to implement that gate, which roughly corresponds to the run time complexity of the gate if we assume that each fundamental physical operation requires approximately the same amount of time.

Footnote 11: To be more precise, we have described not a single oracle but a design according to which a whole sequence of oracles (for different thresholds) can be generated for any given FSM. This ability to generate a sequence of oracles is exactly what we need, as discussed in Section 3.2.

Figure 24: Implementation of a multiple-control Toffoli gate with quantum cost scaling linearly with gate size. (Recoverable labels: inputs $a$, $b$, $c$, $d$; three qubits initialized to $|0\rangle$; output $a \wedge b \wedge c \wedge d$.)

Authors in the quantum computing literature have proposed a variety of quantum cost models, based on differing assumptions about the physical implementation of a quantum computing system and the fundamental operations available therein. For instance, a well-known result due to Barenco et al. [20] suggests (see footnote 12) that the quantum cost of a multiple-control Toffoli may be up to $O(2^n)$, where $n$ is the size of the gate. However, this result assumes that no ancilla qubits are used. With the use of ancilla qubits, it is easy to see that multiple-control Toffoli gates of arbitrary size may be implemented with a quantum cost that is linear with respect to the size of the gate. A cascade of Toffoli gates, as shown in Figure 24, accomplishes this task. Throughout the following analysis, we will assume that an arbitrary number of ancilla qubits are available, and therefore the quantum cost of a multiple-control Toffoli gate is linear with respect to the gate's size. We make this assumption as it simplifies our calculations and is most conducive to our goal of obtaining a rough estimate of the run time complexity of our proposed quantum algorithm.
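The linear-cost cascade described above can be illustrated with a small classical simulation. This is only a sketch on classical bits, not the authors' circuit: the function names `toffoli` and `multi_control_and` are hypothetical, and the layout (controls followed by work qubits initialized to 0) is an assumption chosen to resemble the labels recoverable from Figure 24.

```python
def toffoli(bits, c1, c2, target):
    """Apply a Toffoli (CCNOT) gate in place: flip target iff both controls are 1."""
    bits[target] ^= bits[c1] & bits[c2]


def multi_control_and(controls):
    """Compute the AND of all control bits with a cascade of Toffoli gates.

    Returns (result_bit, number_of_toffoli_gates_used).
    Uses len(controls) - 1 work qubits, all initialized to 0 (an assumed layout).
    """
    n = len(controls)
    if n < 2:
        raise ValueError("need at least two control bits")
    bits = list(controls) + [0] * (n - 1)  # controls followed by work qubits
    gates = 0
    # First gate: work[0] = c[0] AND c[1]
    toffoli(bits, 0, 1, n)
    gates += 1
    # Each further gate folds in one more control: work[k] = work[k-1] AND c[k+1]
    for k in range(1, n - 1):
        toffoli(bits, n + k - 1, k + 1, n + k)
        gates += 1
    return bits[-1], gates


if __name__ == "__main__":
    result, gates = multi_control_and([1, 1, 1, 1])
    print(result, gates)
```

With four controls the sketch uses three work qubits and three Toffoli gates, so the gate count grows as $n - 1$. Restoring the intermediate work qubits by replaying the earlier gates in reverse order would add a further linear number of gates, so the cost stays linear in the number of controls.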
The question of how the run time complexity varies in other quantum cost models is an interesting one, but outside the scope of the present paper. We leave further investigation of this question open for future work.

Footnote 12: We say “suggests” because, although the results of Barenco et al. are widely used throughout the literature as a standard quantum cost function for multiple-control Toffoli gates, we do not know of a conclusive proof that it is impossible to implement a multiple-control Toffoli gate with lower cost complexity (assuming that no ancilla qubits are used). In any case, our results are unaffected since we assume instead that ancilla qubits are used to implement multiple-control Toffoli gates with $O(n)$ quantum cost, as illustrated in the main text.

Our algorithm operates under the condition that the numbers of states and input values of the FSM must both be powers of two. Let us denote the number of states by $N_s$ and the number of input values by $N_i$. Then, $N_s = 2^{n_s}$ and $N_i = 2^{n_i}$ for some nonnegative integers $n_s$ and $n_i$, where $n_s$ and $n_i$ are respectively the number of state and input variables. Conversely, $n_s = \log_2 N_s = O(\log N_s)$ and $n_i = \log_2 N_i = O(\log N_i)$. We first consider the run time complexity for the dependency checking quantum circuits described in Sections 5.2 and 5.3. Calculation of this complexity is complicated by the fact that the partial dependency checkers described in Section 5.2 do not have a fixed complexity; rather, as discussed in that section, their internal structure depends on the particular details of the state machine under consideration. However, we may still calculate an upper bound for the complexity of each partial dependency checker. In the following paragraph, we use the example circuits shown in Figures 11, 13, and 14 as a reference.

Evaluating $d_j(s, s')$ for a pair of states requires $O(n_s) = O(\log N_s)$ time, since $n_s - 1$ Feynman gates are required together with a multiple-control Toffoli gate of size $n_s$ ($n_s - 1$ control qubits and one target qubit). By analogous reasoning, evaluating $d_j(i, i')$ requires $O(n_i) = O(\log N_i)$ time. Meanwhile, evaluating $a_i(s, s')$ is an $O(1)$ operation, since only a single Feynman gate is needed. Observe that the upper portion of every partial dependency checker (as in Figures 13 and 14) must evaluate $a_i(s, s')$ for up to $N_s - 1$ state pairs, with this maximum being obtained when all state pairs are overlapping, as in $(s_1, s_2)$, $(s_2, s_3)$, $(s_3, s_4)$, etc. The lower portion of each partial dependency checker always evaluates $d_j(s, s')$ or $d_j(i, i')$ for only one pair of inputs/states.
The total complexity of a partial dependency checker is therefore

$$(N_s - 1) \cdot O(1) + 1 \cdot O(\log N_s) = O(N_s + \log N_s) = O(N_s),$$

for a partial dependency checker that evaluates $d_j(s, s')$ for a pair of states and checks Eq. 1, or

$$(N_s - 1) \cdot O(1) + 1 \cdot O(\log N_i) = O(N_s + \log N_i),$$

for a partial dependency checker that evaluates $d_j(i, i')$ for a pair of inputs and checks Eq. 2.

From Figure 16 we can see that checking the dependencies of the next state of a single flip-flop on a single state or input variable requires either $N_s(N_s - 1)/2$ (for a state variable) or $N_i(N_i - 1)/2$ (for an input variable) partial dependency checkers. Combining with the previously derived complexity for a single partial dependency checker gives $O(N_s^3)$ (for a state variable) or $O(N_i^2(N_s + \log N_i))$ (for an input variable). The central Toffoli gate in Figure 16 has either $N_s(N_s - 1)/2$ or $N_i(N_i - 1)/2$ control qubits, so its time complexity is dominated by that of the partial dependency checkers.

In Figure 19, we can see that there are $n_s^2 = (\log N_s)^2$ checks for dependency of $q_i^+$ (for any $i$) on a state variable, and $n_s n_i = \log N_s \log N_i$ checks for dependency on an input variable. Therefore, the total complexity of all the dependency checkers in the circuit from Figure 19 is

$$(\log N_s)^2 \cdot O(N_s^3) + \log N_s \log N_i \cdot O(N_i^2(N_s + \log N_i)) = O\bigl(N_s^3(\log N_s)^2 + N_i^2(N_s + \log N_i)(\log N_s \log N_i)\bigr). \tag{7}$$

The complexity of the incrementer circuits in Figure 19 is clearly dominated by that of the dependency checkers, so the preceding expression gives the time complexity for the entire cost-calculating portion of the quantum oracle.

Finally, we must consider the complexity of the two other components of the oracle, as shown in Figure 22: the threshold circuit and the uniqueness-checking circuit. The complexity of the cost-calculating circuit clearly dominates that of the threshold circuit, but the situation with respect to the uniqueness-checking circuit is less clear.
As seen in Figure 21, the uniqueness-checking circuit performs a comparison for every possible pair of states, of which there are $N_s(N_s - 1)/2$, and every possible pair of input values, of which there are $N_i(N_i - 1)/2$. Furthermore, a comparison of a pair of states has complexity $O(n_s) = O(\log N_s)$, since each individual state variable assignment must be compared between the two states, and a comparison of a pair of inputs similarly has complexity $O(n_i) = O(\log N_i)$. The final Toffoli gate in Figure 21 has $N_s(N_s - 1)/2 + N_i(N_i - 1)/2$ control qubits, giving the entire uniqueness-checking circuit a complexity of

$$\frac{N_s(N_s - 1)}{2} \cdot O(\log N_s) + \frac{N_i(N_i - 1)}{2} \cdot O(\log N_i) + O\!\left(\frac{N_s(N_s - 1)}{2} + \frac{N_i(N_i - 1)}{2}\right) = O(N_s^2 \log N_s + N_i^2 \log N_i).$$

Comparison with Eq. 7 shows that the cost-calculating circuit dominates the uniqueness-checking circuit in terms of complexity. Therefore, the run time complexity of the entire quantum oracle is still given by Eq. 7.

As we recall from Section 3.2, our proposed quantum algorithm involves executing Grover's algorithm multiple times to find an optimal encoding for an FSM. It is uncertain exactly how many executions of Grover's algorithm are required, as we do not consider the details of the search procedure used to find the minimum cost. However, we may consider the time complexity of just a single execution of Grover's algorithm. This does not affect our ability to compare the performance of quantum and classical algorithms because they both must perform a search procedure as described in Section 3.2 in order to find the minimum possible cost.

We discussed in Section 3.1 how Grover's algorithm (together with its extensions for the case of multiple solutions) is able to find a solution to a satisfaction problem, or detect the non-existence of a solution, with $O(\sqrt{M})$ executions of the quantum oracle, where $M$ is the size of the search space. (We have used $M$ to avoid confusion with the numbers of states and input values.)
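To give a feel for the $O(\sqrt{M})$ bound, the following sketch computes the search-space size and a rough oracle-call count for a given FSM size. It assumes the input-qubit count $N_s n_s + N_i n_i$ that the paper attributes to the oracle of Section 5.1, and it uses the common textbook estimate $\lfloor (\pi/4)\sqrt{M/k} \rfloor$ for $k$ solutions among $M$ candidates, which is not a figure taken from this paper. All function names are hypothetical.

```python
import math


def num_variables(count):
    """Return log2(count), requiring count to be a power of two."""
    if count < 1 or count & (count - 1):
        raise ValueError("count must be a power of two")
    return count.bit_length() - 1


def oracle_input_qubits(num_states, num_inputs):
    """Input qubits N_s * n_s + N_i * n_i (count assumed from Section 5.1)."""
    return (num_states * num_variables(num_states)
            + num_inputs * num_variables(num_inputs))


def search_space_size(num_states, num_inputs):
    """M = 2 ** (number of input qubits), which equals N_s**N_s * N_i**N_i."""
    return 2 ** oracle_input_qubits(num_states, num_inputs)


def grover_iterations(space_size, num_solutions=1):
    """Textbook estimate floor((pi / 4) * sqrt(M / k)) of Grover iterations."""
    return math.floor(math.pi / 4 * math.sqrt(space_size / num_solutions))


if __name__ == "__main__":
    m = search_space_size(4, 4)
    print(oracle_input_qubits(4, 4), m, grover_iterations(m))
```

For $N_s = N_i = 4$ the oracle has 16 input qubits, so $M = 2^{16} = 4^4 \cdot 4^4$ and $\sqrt{M} = 256 = N_s^{N_s}$.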
As discussed in Section 5.1 and shown in Figure 10(b), our proposed oracle requires $N_s n_s + N_i n_i = O(N_s \log N_s + N_i \log N_i)$ input qubits, corresponding to a search space of size $O(2^{N_s \log N_s + N_i \log N_i}) = O(N_s^{N_s} N_i^{N_i})$. Hence, a single execution of Grover's algorithm requires $O\bigl(\sqrt{N_s^{N_s} N_i^{N_i}}\bigr)$ executions of the quantum oracle. The total complexity of one execution of Grover's algorithm, taking into account the previously determined complexity of the oracle, is then

$$O\Bigl(\sqrt{N_s^{N_s} N_i^{N_i}}\,\bigl(N_s^3(\log N_s)^2 + N_i^2(N_s + \log N_i)(\log N_s \log N_i)\bigr)\Bigr). \tag{8}$$

Eq. 8 is rather unwieldy for a simple, rough comparison between the performance of quantum and classical algorithms. If we assume that $N_s = N_i$, that is, the FSM has the same number of states as input values, then Eq. 8 simplifies to the much more manageable

$$O\Bigl(\sqrt{N_s^{N_s} N_i^{N_i}}\,\bigl(N_s^3(\log N_s)^2 + N_i^2(N_s + \log N_i)(\log N_s \log N_i)\bigr)\Bigr) = O\Bigl(N_s^{N_s}\bigl(N_s^3(\log N_s)^2 + N_s^2(N_s + \log N_s)(\log N_s \log N_s)\bigr)\Bigr) = O\bigl(N_s^{N_s}\,N_s^3(\log N_s)^2\bigr) = O\bigl(N_s^{N_s + 3}(\log N_s)^2\bigr). \tag{9}$$

6.2 Comparison with a classical algorithm

A comparable classical algorithm to solve the FSM encoding problem would operate in much the same way as our proposed quantum algorithm, with the main difference being that a classical computer of course cannot use Grover's algorithm and must instead use a straightforward exhaustive search. In particular, a classical computer must also use Eqs. 1 and 2 to calculate the cost of a given encoding.
Most, if not all, modern digital computers are capable of performing so-called bitwise logical operations on words of significant length (at least 32 or 64 bits), which allows them to evaluate $d_j(s, s')$ and $a_i(s, s')$ for any pair of states or $d_j(i, i')$ for any pair of inputs in $O(1)$ time. (We need not consider state machines with more than $2^{32}$ states or input values, as such a machine would be far too large to be of practical interest.) It follows that a classical computer can evaluate Eq. 1 in $O(N_i)$ time, since it requires iterating over all input values. Then, determining whether $q_i^+$ depends on $q_j$, for any values of $i$ and $j$, requires evaluating Eq. 1 for all $N_s(N_s - 1)/2$ pairs of states and therefore takes $O(N_s^2 N_i)$ time. Similarly, a single evaluation of Eq. 2 requires $O(N_s)$ time and determining whether $q_i^+$ depends on $x_k$, for any values of $i$ and $k$, requires $O(N_s N_i^2)$ time.

Calculating the total cost of an encoding involves checking the dependency of $q_i^+$ on $q_j$ for all combinations of $i$ and $j$, of which there are $(\log_2 N_s)^2$, and the dependency of $q_i^+$ on $x_k$ for all combinations of $i$ and $k$, of which there are $(\log_2 N_s)(\log_2 N_i)$. Therefore, the complete calculation of the cost of an encoding on a classical computer requires $O(N_s^2 N_i(\log N_s)^2 + N_s N_i^2(\log N_s)(\log N_i))$ time. The number of possible encodings is $N_s!\,N_i!$, so a full exhaustive search on a classical computer has a total time complexity of

$$O\bigl(N_s!\,N_i!\,(N_s^2 N_i(\log N_s)^2 + N_s N_i^2(\log N_s)(\log N_i))\bigr). \tag{10}$$

Just as for the quantum algorithm, we can simplify this expression if we assume that $N_s = N_i$; in this case, Eq. 10 reduces to

$$O\bigl(N_s!\,N_s!\,(N_s^3(\log N_s)^2 + N_s^3(\log N_s)(\log N_s))\bigr) = O\bigl((N_s!)^2 N_s^3(\log N_s)^2\bigr). \tag{11}$$

Comparing Eqs. 9 and 11, we see that the factor of $N_s^3(\log N_s)^2$ is common to both. We may therefore compare the relative complexities of the quantum and classical algorithms, if $N_s = N_i$, by looking only at the terms $N_s^{N_s}$ (for the quantum algorithm) and $(N_s!)^2$ (for the classical algorithm).
Table 1 shows the results for a few values of $N_s$. We immediately see that the quantum algorithm appears to be significantly faster than the classical algorithm. Care must be taken in this comparison because we are only comparing the relative complexities of the two algorithms and not their actual run times. In particular, the actual ratio of run times between the two algorithms is better approximated as $c\,(N_s!)^2 / N_s^{N_s}$, where $c$ is an unknown constant.

Table 1: Comparison of relative complexities of quantum and classical algorithms, assuming $N_s = N_i$.

$N_s$    $N_s^{N_s}$             $(N_s!)^2$              $(N_s!)^2 / N_s^{N_s}$
4        256                     576                     2.25
8        $1.68 \cdot 10^{7}$     $1.63 \cdot 10^{9}$     96.9
16       $1.84 \cdot 10^{19}$    $4.38 \cdot 10^{26}$    $2.37 \cdot 10^{7}$
32       $1.46 \cdot 10^{48}$    $6.92 \cdot 10^{70}$    $4.74 \cdot 10^{22}$
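The entries of Table 1 follow directly from the two dominant terms $N_s^{N_s}$ and $(N_s!)^2$, and can be recomputed with a few lines of Python. This is a convenience sketch with hypothetical function names; the constant `c` in `quantum_is_faster` stands for the unknown constant $c$ discussed in the text and is an input, not a measured value.

```python
import math


def relative_complexities(num_states):
    """Return (N^N, (N!)^2, (N!)^2 / N^N) for N = num_states, assuming N_s = N_i."""
    quantum = num_states ** num_states
    classical = math.factorial(num_states) ** 2
    return quantum, classical, classical / quantum


def quantum_is_faster(num_states, c):
    """True if the approximate run-time ratio c * (N!)^2 / N^N exceeds 1."""
    return c * relative_complexities(num_states)[2] > 1


if __name__ == "__main__":
    for n in (4, 8, 16, 32):
        quantum, classical, ratio = relative_complexities(n)
        print(f"{n:>3}  {quantum:.3g}  {classical:.3g}  {ratio:.3g}")
```

Python integers have arbitrary precision, so the two terms are computed exactly and only the final ratio is rounded to a float.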
for example, c = 1/1000 indicates, very roughly speaking, that the classical computer’s “clock speed”—by which we mean not necessarily the hardware’s physical clock speed, but rather the number of low-level instructions executed per second—is on the order of 1000 times faster than the quantum computer’s. from table 1, we can see that for ns = 16, our proposed quantum algorithm is expected to be faster even if the quantum computer on which it is running has a clock speed a million times slower than the competing classical computer. for ns = 32, our proposed quantum algorithm will, nearly unquestionably, run many orders of magnitude faster than the comparable classical algorithm using an exhaustive search. this gives us a high degree of confidence in our expectation that our proposed quantum algorithm will outperform the classical algorithm for state machines of reasonable size. it is interesting to observe that the quantum algorithm actually contains a significant inefficiency in comparison to the classical algorithm. the inefficiency arises from the fact that grover’s algorithm can only search through a space consisting of all possible binary strings of a given length. therefore, the quantum algorithm searches through all possible combinations of inputs to the oracle as illustrated in figure 10(b), even those that do not represent a valid encoding. this means that a large portion of the search space is effectively extraneous, and is excluded by the uniqueness checking circuit from figure 21. the classical algorithm has no such difficulty as it can simply search through only the set of valid encodings, and is not forced to search through a space of all possible binary strings of a given length, as grover’s algorithm is. the fact that the quantum algorithm still outperforms the classical algorithm, with a lower run time complexity, shows that this disadvantage is outweighed by the quadratic speedup in searching obtained from using grover’s algorithm. 43 210 e. 
7 conclusion

we presented a quantum algorithm for finding an exact solution to the problem of encoding a finite state machine with the lowest cost possible. specifically, our algorithm finds an optimal encoding for any fsm with numbers of inputs and states that are powers of two, under the assumptions that the number of state and input variables must be the smallest possible and that the cost of an encoding is given by the total number of variables on which the encoded transition functions resulting from that encoding depend. our algorithm contains the following notable features:

1. it uses a quantum computer with grover’s algorithm as a subroutine to perform exhaustive searches with lower time complexity than that which is achievable using a classical computer alone, thus making those exhaustive searches more practical. little to no published work exists on the subject of applying grover’s algorithm to directly solve a practically useful problem, and the present work is the first to apply grover’s algorithm to the problem of finding optimal encodings, based on the simple metric of dependencies for completely specified finite state machines where both the number of states and the number of input values are a power of 2.
2. it simultaneously optimizes both state and input encodings for an fsm. currently, [12] is the only other published method that finds exact minimum solutions; however, [12] only solves the problem of state, and not input, encoding. additionally, the method presented in [12] is specialized for fsms implemented using plas because it minimizes the total number of pla product terms. in comparison, we use the cruder but more generally applicable cost metric of the total number of variables on which a function depends.
3. it uses grover’s algorithm to solve an optimization, rather than satisfaction, problem.
it achieves this by solving a sequence of satisfaction problems using grover’s algorithm; each such satisfaction problem is of the form “find an encoding with cost at most r” where r is a threshold that is varied. by repeatedly executing grover’s algorithm for different thresholds, where the threshold is varied according to an appropriate strategy (e.g., a binary search strategy), the algorithm eventually finds the exact minimum possible cost and an encoding with that cost.
4. we introduced the concept of using a quantum counter in tandem with a threshold circuit as part of a quantum oracle. such oracles check whether the value of a certain function (in this case, our cost function) lies below a given threshold and are exactly what is needed to solve an optimization problem using the procedure described in section 3.2. the use of quantum counters and threshold circuits in quantum oracles is not limited to solving the fsm encoding problem. it can be applied to many other optimization problems such as maxsat, in which the objective is to satisfy as many terms of a boolean expression as possible.

we compared the run time complexity of the proposed quantum algorithm against that of the analogous exhaustive search-based classical algorithm. this analysis does not tell us the absolute run times of either algorithm, as calculating such would require much more detailed information regarding the precise specifications of the quantum and classical computers being used. nevertheless, the comparison of run time complexities provides strong evidence that the quantum algorithm can significantly outperform the classical algorithm for fsms of reasonable size that might be encountered in practice.

in addition, our work may serve as the basis for further investigation in a number of different directions. we leave these possibilities open to future exploration.
among them, the most promising include:

incompletely specified transition functions: a significant limitation of our method is that it requires all encoded transition functions to be completely specified, which means that it is only applicable to fsms with a number of states that is a power of two. extending our method for encoded transition functions that are incompletely specified would allow it to apply to all fsms.

output encodings: in a realistic digital logic design scenario, the outputs of a state machine of course cannot be ignored. we believe that the methods presented here can, without too much difficulty, be extended to the problem of encoding outputs of fsms as well. such an extension would represent the first quantum algorithm to solve the fsm encoding problem simultaneously for states, inputs, and outputs.

a more detailed cost model: we used a simple cost model which only takes into account the number of dependencies of each encoded transition function on state and input variables. while this simple model possesses the advantage of not being closely tied to a single digital logic implementation technology, it is still clearly desirable to extend our method to more realistic cost models that take into account additional factors.

comparison of threshold search strategies: one of the key elements of our method is the execution of grover’s algorithm multiple times with a sequence of quantum oracles generated for different thresholds as described in section 3.2. we did not attempt to compare different strategies for varying the threshold. such a comparison could significantly improve our algorithm, because the selected strategy affects the expected number of grover runs needed to find the minimum possible threshold, which in turn affects the run time of our entire algorithm. an analysis of threshold search strategies would also be applicable to problems other than fsm encoding (see also the following paragraph).
application to other problems: as mentioned before, the principle of solving an optimization problem by running grover’s algorithm multiple times using a sequence of quantum oracles can be applied to other problems. one such problem, for example, is maxsat, where the objective is to maximize the number of simultaneously satisfied clauses in a boolean formula expressed in conjunctive normal form. further investigation into this and other such problems would greatly increase the generality of the method presented here.

references

[1] j. m. gambetta, j. m. chow, and m. steffen, “building logical qubits in a superconducting quantum computing system,” npj quantum information, vol. 3, p. 2, jan. 2017.
[2] l. k. grover, “a fast quantum mechanical algorithm for database search,” in proceedings of the twenty-eighth annual acm symposium on theory of computing, pp. 212–219, may 1996.
[3] l. k. grover, “quantum mechanics helps in searching for a needle in a haystack,” phys. rev. lett., vol. 79, pp. 325–328, jul 1997.
[4] l. k. grover, “a framework for fast quantum mechanical algorithms,” in proceedings of the thirtieth annual acm symposium on theory of computing, stoc ’98, (new york, ny, usa), pp. 53–62, acm, 1998.
[5] g.
wendin, “quantum information processing with superconducting circuits: a review,” reports on progress in physics, vol. 80, no. 10, p. 106001, 2017.
[6] j. hartmanis, “on the state assignment problem for sequential machines. i,” ire transactions on electronic computers, vol. ec-10, pp. 157–165, june 1961.
[7] r. e. stearns and j. hartmanis, “on the state assignment problem for sequential machines. ii,” ire transactions on electronic computers, vol. ec-10, pp. 593–603, dec. 1961.
[8] l. benini and g. d. micheli, “state assignment for low power dissipation,” ieee journal of solid-state circuits, vol. 30, pp. 258–268, mar. 1995.
[9] g. d. hachtel, m. hermida, a. pardo, m. poncino, and f. somenzi, “re-encoding sequential circuits to reduce power dissipation,” in proceedings of the 1994 ieee/acm international conference on computer-aided design, iccad ’94, (los alamitos, ca, usa), pp. 70–73, ieee computer society press, 1994.
[10] g. d. micheli, r. k. brayton, and a. sangiovanni-vincentelli, “optimal state assignment for finite state machines,” ieee transactions on computer-aided design of integrated circuits and systems, vol. 4, pp. 269–285, july 1985.
[11] t. villa and a. sangiovanni-vincentelli, “nova: state assignment of finite state machines for optimal two-level logic implementation,” ieee transactions on computer-aided design of integrated circuits and systems, vol. 9, pp. 905–924, sept. 1990.
[12] s. devadas and a. r. newton, “exact algorithms for output encoding, state assignment and four-level boolean minimization,” in system sciences, 1990., proceedings of the twenty-third annual hawaii international conference on, vol. 1, pp. 387–396, ieee, 1990.
[13] e. f. moore, “gedanken experiments on sequential machines,” in automata studies, pp. 129–153, princeton u., 1956.
[14] j. hartmanis, “loop-free structure of sequential machines,” information and control, vol. 5, no. 1, pp. 25–43, 1962.
[15] g. h.
mealy, “a method for synthesizing sequential circuits,” bell system technical journal, vol. 34, no. 5, pp. 1045–1079, 1955.
[16] m. a. nielsen and i. l. chuang, quantum computation and quantum information. cambridge university press, 10th anniversary ed., 2010.
[17] m. boyer, g. brassard, p. høyer, and a. tapp, “tight bounds on quantum searching,” fortschritte der physik, vol. 46, no. 4-5, pp. 493–505, 1998.
[18] g. brassard, p. høyer, and a. tapp, “quantum counting,” in automata, languages and programming, pp. 820–831, springer berlin heidelberg, 1998.
[19] d. maslov and g. dueck, “improved quantum cost for n-bit toffoli gates,” electronics letters, vol. 39, pp. 1790–1791, december 2003.
[20] a. barenco, c. h. bennett, r. cleve, d. p. divincenzo, n. margolus, p. shor, t. sleator, j. a. smolin, and h. weinfurter, “elementary gates for quantum computation,” phys. rev. a, vol. 52, pp. 3457–3467, nov 1995.

facta universitatis series: electronics and energetics vol. 30, no. 4, december 2017, pp. 477–510 doi: 10.2298/fuee1704477d

on some common compressive sensing recovery algorithms and applications

anđela draganić, irena orović, srđan stanković
university of montenegro, faculty of electrical engineering, podgorica, montenegro

abstract. compressive sensing, as an emerging technique in signal processing, is reviewed in this paper together with its common applications. as an alternative to the traditional signal sampling, compressive sensing allows a new acquisition strategy with significantly reduced number of samples needed for accurate signal reconstruction. the basic ideas and motivation behind this approach are provided in the theoretical part of the paper. the commonly used algorithms for missing data reconstruction are presented. the compressive sensing applications have gained significant attention leading to an intensive growth of signal processing possibilities. hence, some of the existing practical applications assuming different types of signals in real-world scenarios are described and analyzed as well.

key words: compressive sensing, optimization algorithms, sampling theorem, undersampled data

1.
the basic compressive sensing concepts

continuous time, bandlimited signals, sampled according to the shannon-nyquist sampling theorem, may produce a large number of samples to be further processed. having in mind that the signal samples are acquired at a rate at least twice the maximal signal frequency, the conventional sampling may be inefficient, especially in applications dealing with high-frequency signals. also, a large number of sensors required for acquisition may lead to large power consumption. hence, compression arises as a necessary step in conventional signal processing. most of the signals we are dealing with contain redundant information, and this fact is exploited during the compression step. the compression discards a certain percentage of the samples in the sparse transformation domain, assuming that the majority of samples are insignificant for signal analysis.

the fact that most signals exhibit sparsity in a certain transformation domain is used in the compressive sensing (cs) theory [1]-[12]. namely, one of the ideas behind the cs was to avoid compression after acquisition and to directly acquire data in the compressed form. in other words, the cs offers the possibility to acquire less data than is commonly done, but still to be able to reconstruct the entire information afterwards. the missing signal information can also appear as a consequence of omitting samples that are exposed to different kinds of noise or losing some parts of the signal during transmission. these missing samples can be recovered using the cs reconstruction algorithms [7]-[27].

received april 19, 2017. corresponding author: anđela draganić, university of montenegro, faculty of electrical engineering, džordža vašingtona bb, 81000 podgorica, montenegro (e-mail: andjelad@ac.me)

some of the concepts that are nowadays used within the cs approaches date from the early seventies.
the least square solutions, based on the norm minimization, were used by claerbout and muir in 1973 [11]. in 1986, santosa and symes proposed an application of the ℓ1-norm in recovering sparse spike trains. the ℓ1-minimization of the image gradient, i.e. total variation minimization, was proposed in the 1990s by rudin, osher and fatemi [17] for removing noise from images. in the early 2000s blu, marziliano and vetterli showed that k-sparse signals can be sampled and recovered by using only 2k parameters. the idea of the cs started to grow from the moment when it was shown that a small set of nonadaptive measurements can provide exact signal reconstruction, which proved the basic idea behind the data acquisition in the compressed form [3],[4]. later, in [19], the cs was analyzed in terms of signal recovery when the missing samples are a result of signal degradation due to the presence of noise. the influence of the number of missing samples on the spectral signal representation was examined, and a reconstruction procedure was proposed.

our focus in this paper is on the practical applications of the cs approaches. however, there are some specific requirements that are imposed on the measurements in order to be able to apply cs signal reconstruction. signal sparsity is one of the conditions required in the cs approach, and can be satisfied in different domains: time, frequency or time-frequency domains [28]-[48]. this condition is valid for a variety of real-world signals. the other condition is incoherence, which will be explained later in the text.

there is a large number of cs applications, from those assuming one-dimensional signals to various image processing and video applications. some of the cs applications are adapted to work in real-time. the constant growth and development in the field of the cs applications aims to reduce the complexity of devices, to speed up the acquisition and transmission procedure and to decrease power consumption.
since cs is used to extract as much information as possible from minimal available data, it is important to highlight its use in biomedicine, especially in magnetic resonance imaging (mri). by lowering the number of coefficients required for mr image reconstruction, the time of patient exposure to the mr device is reduced, and consequently, the negative influence of the mr device is lower. another useful application is in radar imaging, where the cs exploits the sparsity in the frequency domain. furthermore, it is used in communication and network systems, sparse channel estimation, wireless sensor networks (wsns), in cognitive radios for spectrum sensing, etc. some of the applications will be described later in the text.

2. mathematical background of a compressive sensing concept

assume that we are dealing with the signal $\mathbf{x}$ of length $N$ that is sparse in the transform domain (defined by the direct transformation $\mathbf{\Psi}$). then, the vector of acquired samples $\mathbf{y}$ can be defined as [1]-[12]:

$$\mathbf{y} = \mathbf{\Omega}\mathbf{\Psi}^{-1}\mathbf{X}, \tag{1}$$

where matrix $\mathbf{\Omega}$ is used to randomly under-sample the observed signal and $\mathbf{\Psi}^{-1}$ is the inverse transform matrix. the signal sparsity is $K$, meaning that only $K$ out of $N$ coefficients from the transform domain are non-zero, and we assume that only $M$ out of $N$ samples are acquired in the vector $\mathbf{y}$ ($M<2K$). the vector $\mathbf{X}$ is the vector of the transform domain coefficients, i.e.:

$$\mathbf{X} = \mathbf{\Psi}\mathbf{x}. \tag{2}$$

different transform domains can be used: discrete fourier transform domain (dft) [3],[6],[12], discrete cosine transform domain (dct) [3],[6],[12],[46], wavelet domain, hermite transform domain [48]-[55], time-frequency domain [56], etc. apart from the sparsity, another important property is incoherence, which enables successful signal reconstruction from a small set of acquired samples. namely, the measurement matrix $\mathbf{\Omega}$ should be incoherent with the transform domain matrix $\mathbf{\Psi}$.
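as a rough numerical illustration of (1) and (2), the following python sketch builds a small unitary dft matrix as the transform, draws a sparse coefficient vector and keeps a random subset of the time-domain samples. the sizes, the choice of the dft as the sparsity domain and all function names are assumptions made for this example only.

```python
import cmath
import random

N, K, M = 8, 2, 5  # hypothetical sizes: length, sparsity, number of kept samples


def dft_matrix(n):
    """Unitary DFT matrix, used here as the direct transform Psi."""
    scale = 1 / n ** 0.5
    return [[scale * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n)]
            for k in range(n)]


def conj_transpose(a):
    """Conjugate transpose; for a unitary matrix this is its inverse."""
    return [[a[r][c].conjugate() for r in range(len(a))]
            for c in range(len(a[0]))]


def matvec(a, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]


random.seed(0)
psi = dft_matrix(N)
psi_inv = conj_transpose(psi)

# K-sparse vector of transform-domain coefficients X
X = [0j] * N
for idx in random.sample(range(N), K):
    X[idx] = complex(random.uniform(1.0, 2.0), 0.0)

x = matvec(psi_inv, X)                     # time-domain signal, x = Psi^{-1} X
kept = sorted(random.sample(range(N), M))  # positions selected by Omega
y = [x[i] for i in kept]                   # acquired samples, y = Omega x

A = [psi_inv[i] for i in kept]             # CS matrix A = Omega Psi^{-1}
y_check = matvec(A, X)                     # eq. (1): y = A X
X_back = matvec(psi, x)                    # eq. (2): X = Psi x
```

here the under-sampling matrix is never formed explicitly: selecting the rows listed in `kept` has the same effect as multiplying by a row-selection matrix.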
the coherence between the two matrices represents the highest correlation between any two column/row vectors of the matrices. a measure of correlation between the two matrices is defined as follows [11],[12]:

$$\mu(\mathbf{\Omega},\mathbf{\Psi}) = \sqrt{N}\max_{1\le k,j\le N}\left|\langle \omega_k, \psi_j\rangle\right|, \tag{3}$$

where $N$ is a signal length, and $\omega_k$ and $\psi_j$ are row and column vectors of the matrices $\mathbf{\Omega}$ and $\mathbf{\Psi}$, respectively. the coherence takes values from the interval:

$$1 \le \mu(\mathbf{\Omega},\mathbf{\Psi}) \le \sqrt{N}. \tag{4}$$

the value of the coherence is greater if the two matrices are more correlated. in the cs scenario the value of the coherence should be as low as possible. the system of equations (1) can be written as follows:

$$\mathbf{y}_{M\times 1} = \mathbf{A}_{M\times N}\,\mathbf{X}_{N\times 1}, \tag{5}$$

where $\mathbf{A}$ denotes the cs matrix: $\mathbf{A} = \mathbf{\Omega}\mathbf{\Psi}^{-1}$. the system is under-determined since it has $M$ equations and $N$ unknowns. therefore, the optimization techniques are used in order to find an optimal solution for this system. the optimal solution is related to the signal sparsity: the sparsest solution is the optimal one.

there are a number of algorithms used to obtain a sparse solution of the system. some of them are based on the convex optimization [1]-[7],[11],[12]: basis pursuit, dantzig selector, and gradient-based algorithms. they provide high reconstruction accuracy, but they are computationally demanding. the commonly used and less computationally demanding are greedy algorithms: matching pursuit and orthogonal matching pursuit [1]-[6],[8],[12]. also, recently proposed threshold based algorithms provide high reconstruction accuracy with low computational complexity, e.g. iterative hard thresholding (iht) and iterative soft thresholding (ist) [1],[6],[13], automated threshold based iterative solution [12],[57], adaptive gradient-based algorithm [3],[12],[26],[27], etc.

3. compressive sensing algorithms

the sparsity of the signal can be defined as a number of nonzero elements within a vector.
it can be described by using the ℓ0-norm [13]:

$$\|\mathbf{X}\|_0 = \lim_{p\to 0}\sum_{i=1}^{N}|X_i|^p = \sum_{i=1,\,X_i\neq 0}^{N} 1 = K, \tag{6}$$

and represents the cardinality of the support of $\mathbf{X}$:

$$\|\mathbf{X}\|_0 = \mathrm{card}\{\mathrm{supp}(\mathbf{X})\} = K. \tag{7}$$

therefore, the solution of the underdetermined system of equations (5), in the cases when the signal $\mathbf{x}$ is sparse in the transform domain, can be reduced to the minimization of the ℓ0-norm, i.e.:

$$\min \|\mathbf{X}\|_0 \quad \text{subject to} \quad \mathbf{y} = \mathbf{A}\mathbf{X}. \tag{8}$$

the ℓ0-norm is not feasible in practice, since small noise in the signal will be assumed as a non-zero sample. therefore, the ℓ1-norm is commonly used. the optimization problem based on the ℓ1-norm is recast as follows [12],[13]:

$$\min \|\mathbf{X}\|_1 \quad \text{subject to} \quad \mathbf{y} = \mathbf{A}\mathbf{X}. \tag{9}$$

in the sequel, some of the commonly used algorithms for sparse reconstruction are described.

3.1. convex optimizations

basis pursuit and basis pursuit denoising

the equation (8) represents a non-convex combinatorial optimization problem. the solution of this problem requires exhaustive searches over subsets of columns of the matrix $\mathbf{A}$. for a $K$-sparse signal of length $N$, the total number of $K$-position subsets is $\binom{N}{K}$, which is not computationally feasible. another approach solves a convex optimization problem through linear programming, which is computationally more efficient. commonly used convex optimization algorithms are: basis pursuit, basis pursuit de-noising (bpdn), least absolute shrinkage and selection operator (lasso), least angle regression (lars), etc. the approach based on the convex ℓ1-minimization that provides a near optimal solution can be defined as:

$$\min \|\mathbf{X}\|_1 \quad \text{subject to} \quad \mathbf{y} = \mathbf{A}\mathbf{X}. \tag{10}$$

this approach is known as basis pursuit (bp) [6],[12]. it aims at decomposing a signal into a superposition of dictionary elements that have the smallest ℓ1-norm of the coefficients. bp can be solved by using a primal-dual interior point method.
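to see why the ℓ1-norm in (10) favours sparse solutions, consider a toy system with one measurement and two unknowns. the python sketch below scans the line of all solutions by brute force and compares the minimum ℓ1-norm point with the least-squares one. it is not the primal-dual interior point method, and the numbers and names are chosen only for illustration.

```python
def l1_norm(v):
    """Sum of absolute values of the components."""
    return sum(abs(c) for c in v)


# one measurement, two unknowns: y = a[0] * x0 + a[1] * x1
a = (1.0, 2.0)
y = 2.0

# least-squares (minimum l2-norm) solution, which is generally not sparse
scale = y / (a[0] ** 2 + a[1] ** 2)
x_l2 = (a[0] * scale, a[1] * scale)


def solution(s):
    """Every solution is x_l2 plus a multiple of the null-space vector (a[1], -a[0])."""
    return (x_l2[0] + s * a[1], x_l2[1] - s * a[0])


# brute-force scan of the solution line for the smallest l1-norm
candidates = [solution(i / 1000) for i in range(-2000, 2001)]
x_l1 = min(candidates, key=l1_norm)

print("least squares:", x_l2, "l1-norm:", l1_norm(x_l2))
print("minimum l1   :", x_l1, "l1-norm:", l1_norm(x_l1))
```

in this example the minimum ℓ1-norm solution places all the weight on a single coefficient, whereas the least-squares solution spreads it over both.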
the problem (10) can be recast as follows, in the case of real $\mathbf{y}$, $\mathbf{A}$ and $\mathbf{X}$ [12]:

$$\min_{\mathbf{t}} \sum_{i} t_i \quad \text{subject to} \quad -\mathbf{t} \le \mathbf{X} \le \mathbf{t}, \quad \mathbf{y} = \mathbf{A}\mathbf{X}, \tag{11}$$

where variable $\mathbf{t}$ is introduced to avoid the absolute value in $\|\mathbf{X}\|_1 = \sum_{i=1}^{N}|X_i|$. the steps of the primal-dual interior point method are described within the algorithm 1. in the cases of noisy measurements, $\mathbf{y} = \mathbf{A}\mathbf{X} + \mathbf{n}$, where $\mathbf{n}$ denotes noise and $\|\mathbf{n}\|_2 \le \varepsilon$, the optimization problem is known as basis pursuit denoising (bpdn) and is defined as [6]:

$$\min \|\mathbf{X}\|_1 \quad \text{subject to} \quad \|\mathbf{y} - \mathbf{A}\mathbf{X}\|_2 \le \varepsilon. \tag{12}$$

algorithm 1: primal-dual interior point method
• set $\mathbf{X} = \mathbf{X}_0 = \mathbf{A}^T\mathbf{y}$, for the known measurement vector $\mathbf{y}$.
• set $\mathbf{t} = \mathbf{t}_0 = \gamma|\mathbf{X}_0| + \lambda\max|\mathbf{X}_0|$. parameters $\gamma$ and $\lambda$ are user-defined.
• the next step is forming a lagrangian function $\Lambda$ of the problem (11), which combines the objective with the equality constraint $\mathbf{y} = \mathbf{A}\mathbf{X}$ and the inequality constraints $-\mathbf{t} \le \mathbf{X} \le \mathbf{t}$.
• update each argument of the lagrangian function by step direction ($\Delta$) and step length ($u$). step directions for the algorithm 1 are obtained by finding the first derivatives of $\Lambda$ in terms of its arguments. step lengths are calculated using the backtracking line search [12]. for example, a new value for $\mathbf{X}$ is obtained as $\mathbf{X} = \mathbf{X} + u\,\Delta\mathbf{X}$.

adaptive gradient based algorithm

the adaptive gradient based algorithm proposed in [26] belongs to the group of convex optimization approaches. it starts from the chosen initial values of the available signal samples. the initial value is iteratively changed for $+\Delta$ and $-\Delta$, and the improvement in concentration is measured in the sparsity domain. the gradient vector, used to update the signal values, is obtained as a difference between the ℓ1-norms of the vectors changed for $+\Delta$ and changed for $-\Delta$. this gradient value is used to update the values of the missing samples. the performance of this algorithm can be efficient even for the signals that are not strictly sparse.
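one pass of the gradient estimate described above can be sketched in python as follows. this is a minimal illustration under stated assumptions, not the implementation from [26]: the dft is scaled by 1/N, a single value of Δ is used, and the stopping rules and the reduction of Δ are left out. the function names are hypothetical.

```python
import cmath
import math


def dft(signal):
    """DFT scaled by 1/N (the scaling is an assumption of this sketch)."""
    n = len(signal)
    return [sum(signal[m] * cmath.exp(-2j * cmath.pi * k * m / n)
                for m in range(n)) / n
            for k in range(n)]


def l1_norm(coeffs):
    """Concentration measure: sum of magnitudes in the sparsity domain."""
    return sum(abs(c) for c in coeffs)


def gradient_pass(y, missing, delta):
    """Estimate the gradient at every missing position, then update those samples."""
    grad = [0.0] * len(y)
    for t in missing:
        plus = list(y)
        plus[t] += delta        # sample changed for +delta
        minus = list(y)
        minus[t] -= delta       # sample changed for -delta
        grad[t] = l1_norm(dft(plus)) - l1_norm(dft(minus))
    # available samples have zero gradient and stay unchanged
    return [value - g for value, g in zip(y, grad)]


N = 8
x = [math.cos(2 * math.pi * n / N) for n in range(N)]    # sparse in the DFT domain
missing = [0]                                            # position of the missing sample
y0 = [0.0 if n in missing else x[n] for n in range(N)]   # missing sample set to zero
delta = max(abs(value) for value in y0)
y1 = gradient_pass(y0, missing, delta)
```

in this small example the single missing sample is restored after one pass. in general several passes with a decreasing Δ are needed, as summarized in the algorithm 2.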
The algorithm for both 1D and 2D cases is summarized in Algorithm 2.

3.2. Greedy algorithms

The greedy algorithms represent the second group of algorithms used to obtain the sparsest solution of the system (5). These algorithms are less computationally complex, and therefore much faster, than the ℓ1-norm based optimization techniques, but they are also less precise. The greedy algorithms iteratively search for the elements of the transform matrix, called the dictionary, that best match the signal. Commonly used greedy algorithms are matching pursuit (MP), orthogonal matching pursuit (OMP), compressive sampling matching pursuit (CoSaMP), etc. The procedure for the OMP algorithm is described within Algorithm 3.

Algorithm 2: Adaptive gradient based algorithm
Input: set of the positions of the available samples $\Omega_a$ and set of the positions of the missing samples $\Omega_m = \mathbf{N} \setminus \Omega_a$; measurement vector y; in the 2D signal case $\mathbf{N} = (N_x, N_y)$
• Set $y^{(0)}(n) = y(n)$ for $n \in \Omega_a$, $y^{(0)}(n) = 0$ for $n \in \Omega_m$, and $k = 0$
• Set $\Delta = \max |y^{(0)}(n)|$
• repeat
  • Set $y_p(n) = y^{(k)}(n)$
  • repeat
    • $k \leftarrow k + 1$
    • for $t \in \mathbf{N}$ do
      • if $t \in \Omega_m$ then
          $X_1(f) = \mathcal{F}\{y^{(k)}(n) + \Delta\,\delta(n - t)\}$, $X_2(f) = \mathcal{F}\{y^{(k)}(n) - \Delta\,\delta(n - t)\}$, where $\mathcal{F}$ denotes the 1D DFT or the 2D DFT and, in the 2D case, $f = (f_x, f_y)$;
          $g^{(k)}(t) = \frac{1}{N}\left(\|X_1\|_1 - \|X_2\|_1\right)$
        else
          $g^{(k)}(t) = 0$
        end if
      • $y^{(k+1)}(t) = y^{(k)}(t) - g^{(k)}(t)$
    • end for
    • $\beta_k = \arccos \dfrac{\langle \mathbf{g}^{(k-1)}, \mathbf{g}^{(k)} \rangle}{\|\mathbf{g}^{(k-1)}\|_2 \, \|\mathbf{g}^{(k)}\|_2}$
  • until $\beta_k > 170^\circ$
  • $\Delta \leftarrow \Delta / \sqrt{10}$
  • $R = 10 \log_{10} \dfrac{\sum_{n \in \Omega_m} |y_p(n) - y^{(k)}(n)|^2}{\sum_{n \in \Omega_m} |y^{(k)}(n)|^2}$
• until $R < R_{\max}$ (required precision)
• return $y^{(k)}(n)$
Output: reconstructed signal $y^{(k)}(n)$

Algorithm 3: Orthogonal matching pursuit
• Input: compressive sensing matrix A = Ωℑ, measurement vector y
• Initialization of the variables: initial residual $\mathbf{r}_0 = \mathbf{y}$; initial solution $\mathbf{x}_0 = \mathbf{0}$; matrix of chosen atoms $\Upsilon_0 = [\,]$.
• Repeat the following steps until the stopping criterion is met:
  • $\omega_n = \arg\max_{i = 1, \ldots, M} |\langle \mathbf{r}_{n-1}, \mathbf{A}_i \rangle|$ (finding the maximum correlation column, where $\mathbf{A}_i$ is the i-th column of A)
  • $\Upsilon_n = [\Upsilon_{n-1} \;\; \mathbf{A}_{\omega_n}]$ (update of the matrix of chosen atoms)
  • $\mathbf{x}_n = \arg\min_{\mathbf{x}} \|\mathbf{r}_{n-1} - \Upsilon_n \mathbf{x}\|_2^2$ (solving the least squares problem)
  • $\mathbf{r}_n = \mathbf{r}_{n-1} - \Upsilon_n \mathbf{x}_n$ (residual update)
  • n = n + 1
• Output: $\mathbf{x}_p$ and $\mathbf{r}_p$, where p denotes the number of iterations.

3.3. Threshold based algorithms

Iterative hard and soft thresholding

Thresholding algorithms are based on an adaptive threshold applied within several iterations. They are much faster than algorithms based on convex relaxation. An iteration can be described in terms of a threshold function as [1],[6],[11],[13]:

$$\mathbf{x}_i = T_\varepsilon\big(f(\mathbf{x}_{i-1})\big). \tag{13}$$

The thresholding function is denoted as $T_\varepsilon$, while f is the function that modifies the output of the previous iterate and x is a sparse vector. The signal can be recovered from its measurements by using hard or soft thresholding. Therefore, there are two types of iterative thresholding algorithms: iterative hard thresholding (IHT) and iterative soft thresholding (IST). The IHT algorithm sets all but the K largest-magnitude components of x to zero. The hard thresholding function $H_K$ is defined as [6],[11],[13]:

$$H_K(\mathbf{x}) = \begin{cases} x_i, & |x_i| \ge \varepsilon \\ 0, & \text{otherwise,} \end{cases} \tag{14}$$

where ε is the magnitude of the K-th largest component of x [6]. The algorithm is summarized within Algorithm 4 [11]. The soft thresholding function is applied to each element of the vector x and is defined as [6]:

$$S_\varepsilon(x_i) = \begin{cases} x_i - \varepsilon, & x_i > \varepsilon \\ 0, & |x_i| \le \varepsilon \\ x_i + \varepsilon, & x_i < -\varepsilon. \end{cases} \tag{15}$$

Algorithm 4: Iterative hard thresholding
Input: signal sparsity K, transform matrix ℑ, measurement matrix Ω, CS matrix $\mathbf{A} = \Omega ℑ^{-1}$, measurement vector y
Output: an approximation of the signal x
$\mathbf{x}_0 \leftarrow \mathbf{0}$
for i = 1, … until the stopping criterion is met do
  $\mathbf{x}_i \leftarrow H_K\big(\mathbf{x}_{i-1} + \mathbf{A}^T(\mathbf{y} - \mathbf{A}\mathbf{x}_{i-1})\big)$
end for
return $\mathbf{x} \leftarrow \mathbf{x}_i$

Automated threshold based solution

Non-iterative and iterative threshold based solutions for sparse signal reconstruction are proposed in [98]. The proposed solutions are based on a model of the noise that appears as a consequence of missing samples. By using a predefined probability P, a general threshold T that separates signal components from spectral noise in the transform domain is defined.

Algorithm 5: Automated threshold based iterative solution
Input: M, N, $\mathbf{y} = \mathbf{x}(\Omega_a)$, $\Omega_a = \{n_1, \ldots, n_M\}$, Ψ, Φ, A, $\sigma_n^2$.
o Set $\mathbf{K} = \emptyset$;
for i = 1; i = i + 1; until all components are detected
o Calculate the variance: $\sigma^2 = M \dfrac{N - M}{N - 1} \sum_{i=1}^{M} \dfrac{|y(i)|^2}{M}$;
o For a given P(T) calculate: $T = \sqrt{-\sigma^2 \log\left(1 - P(T)^{1/N}\right)}$;
o Calculate the initial DFT vector $\mathbf{X}_i$: $\mathbf{X}_i = \Psi^{-1}(\mathbf{y})$;
o Update the set K: $\mathbf{K} = \mathbf{K} \cup \arg\{|\mathbf{X}_i| > T/N\}$;
o Calculate $\mathbf{X}_{\mathbf{K}} = (\mathbf{A}^H \mathbf{A})^{-1} \mathbf{A}^H \mathbf{y}$ (the CS matrix A contains the rows defined by the set K and M columns of the DFT matrix);
o Update y by removing the detected components: $y(n_a) \leftarrow x(n_a) - \sum_{k \in \mathbf{K}} X_{\mathbf{K}}(k)\, e^{j 2\pi k n_a / N}$, $n_a \in \Omega_a$;
o Update the initial DFT vector X according to the new vector y;
o Update $\sigma_a^2 = \|\mathbf{y}\|_2^2 / M$ and $\sigma^2 = \sigma_a^2\, M \dfrac{N - M}{N - 1}$;
o If $\sigma_a^2 < \sigma_n^2$, break;
end for

The algorithm uses the DFT as the domain of sparsity, but the same concept can be applied to other transform domains. This approach can provide successful reconstruction of the signal x within a single iteration of the reconstruction algorithm. However, for the case when the number of available samples M is very low, an iterative version of the algorithm, which updates the threshold value, is derived as well.
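The variance and threshold steps of the automated threshold based solution can be sketched numerically as follows. This is a hedged illustration: it assumes the variance model σ² = M(N−M)/(N−1) · mean(|y|²) and the threshold T = sqrt(−σ² log(1 − P^(1/N))), with P read as the probability that all noise-only coefficients stay below T. The function names and the test samples are hypothetical.

```python
import math

def missing_sample_noise_variance(y, n_total):
    """Variance of the DFT noise caused by missing samples.

    y holds the M available samples of a length-N signal:
    sigma^2 = M * (N - M) / (N - 1) * mean(|y|^2).
    """
    m = len(y)
    mean_energy = sum(abs(v) ** 2 for v in y) / m
    return m * (n_total - m) / (n_total - 1) * mean_energy

def automated_threshold(sigma2, n_total, p):
    """T = sqrt(-sigma^2 * log(1 - P^(1/N))) for a chosen probability 0 < P < 1."""
    return math.sqrt(-sigma2 * math.log(1.0 - p ** (1.0 / n_total)))

# Illustrative data: M = 32 unit-magnitude samples out of N = 64.
y = [complex(1, 0), complex(0, 1), complex(-1, 0), complex(0, -1)] * 8
N = 64
sigma2 = missing_sample_noise_variance(y, N)
T = automated_threshold(sigma2, N, 0.99)
print(sigma2, T)
```

A larger P gives a more conservative (higher) threshold, and T scales with the square root of the available-sample energy, so removing detected components from y lowers the threshold in the next iteration.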
If the inputs of the algorithm are the vector y of the M available samples, the signal length N, the set of the available sample positions $\Omega_a = \{n_1, \ldots, n_M\}$, the transform and measurement matrices Ψ and Φ, the CS matrix A and the Gaussian noise variance $\sigma_n^2$, then the iterative version can be described using Algorithm 5.

4. CS applications

The CS theory, which states that compressible signals can be efficiently reconstructed from a small set of incoherent measurements, has motivated researchers to explore possible fields of application. Having in mind that many real-world signals satisfy the sparsity property, the applications range from speech and audio signals [58]-[61], radar and communications [62]-[83], underwater, acoustic and linear frequency modulated signals [84]-[86], image reconstruction [87]-[98], biomedical applications [98]-[101], etc. The review of CS applications for different 1D signals, images and video data is given in the sequel.

4.1. CS devices (analog to information, single pixel camera, random lens imager)

Let us first consider some of the hardware devices that are based on the CS principles.

(a) Duarte et al. in [102] proposed the single pixel camera concept. This CS camera architecture is an optical computer, composed of a digital micromirror device (DMD), two lenses, a single photon detector and an analog-to-digital (A/D) converter, that computes random linear measurements of the scene under view. The image is recovered from the acquired measurements by a digital computer. Compared to conventional silicon-based cameras, the single pixel camera is simpler, smaller and cheaper, and can operate efficiently across a much broader spectral range.

(b) Fergus et al. in [103] developed a random lens imaging technique. The technique uses a normal digital single-lens reflex (DSLR) camera, whose lens is replaced with a transparent material.
The mirrors in this material are randomly distributed. The authors modified a Pentax stereo adapter in order to make one of the mirrors have a random reflective surface. The CS measurements are in the form of images obtained using this system. The new camera set-up has to be calibrated in order to reconstruct the original image.

(c) Trakimas et al. in [104] proposed the design and implementation of an analog-to-information converter (AIC). The presented AIC is designed so that it can sample at the Nyquist rate, but it also has a CS operation mode. This design shows minimal complexity compared to conventional Nyquist rate sampling architectures. When dealing with signals with a sparse frequency representation, this design increases the power efficiency of the sampling operation. To generate the pseudorandom sequence, a PN clock generator is used, and it can be configured to provide a synchronous clock signal when Nyquist sampling is required.

4.2. CS in biomedical applications

CS finds usage in numerous biomedical applications, such as magnetic resonance imaging (MRI) [105]-[108], as well as electroencephalography (EEG), electrocardiography (ECG), electrooculography (EOG) and electromyography (EMG) signals [109]-[120], etc. Some of the specific biomedical applications are given in the sequel.

(a) Lowering the time of patient exposure to the harmful MR waves was the primary motivation for CS usage in MRI. The MR acquisition time is proportional to the dimensionality of the MR dataset, i.e., the number of spatial frequencies acquired. Scan time can be reduced by lowering the amount of data acquired, but it still has to be possible to recover the whole information.

(b) Lustig et al. in [98] implemented a CS approach for rapid MR imaging. Sparsity of the MR image in the transform domain is exploited for achieving two goals: reducing the scan time and improving the resolution of the observed fast spin-echo brain images and 3D contrast enhanced angiographs.
The non-linear conjugate gradient solution is used for solving the optimization problem. The problem is in the form:

$$\arg\min_{\mathbf{x}} \|U\mathbf{x} - \mathbf{y}\|_2^2 + \lambda \|\Psi \mathbf{x}\|_1, \tag{16}$$

where x is the image of interest, Ψ is an operator that transforms the signal from the pixel representation into a sparse representation, U is an undersampled Fourier transform, y denotes the measured k-space (i.e., frequency space) data from the scanner and λ is a regularization parameter. The conjugate gradient procedure is described in detail in [98].

(c) Bioucas-Dias et al. introduced TwIST: two-step iterative shrinkage/thresholding algorithms for image restoration [101]. This algorithm is introduced as an improved version of the iterative shrinkage thresholding algorithm (IST), in order to overcome its slow convergence in the cases when the problem defined by the measurement matrix A is ill-posed or ill-conditioned. The optimization problem is well-posed if it has a solution, if the solution is unique and if the solution depends continuously on the data [121]. Otherwise, the problem is ill-posed. If a small perturbation of y in the problem y = Ax leads to a large perturbation of the solution, the problem is ill-conditioned [121]. The algorithm is successfully applied to image deconvolution problems, as well as to the reconstruction of images with missing samples. Considering the system of equations y = Ax, the iterations of the TwIST algorithm can be defined as follows:

$$\mathbf{x}_1 = G_\eta(\mathbf{x}_0), \qquad \mathbf{x}_{t+1} = (1 - \mu)\,\mathbf{x}_{t-1} + (\mu - \delta)\,\mathbf{x}_t + \delta\, G_\eta(\mathbf{x}_t), \tag{17}$$

where μ and δ are nonzero parameters. The starting value for the vector x, $\mathbf{x}_0$, can be user-defined or $\mathbf{x}_0 = \mathbf{A}^{-1}\mathbf{y}$. The function $G_\eta$ is defined by using the denoising operator $\Psi_\eta$ as:

$$G_\eta(\mathbf{x}) = \Psi_\eta\big(\mathbf{x} + \mathbf{A}^T(\mathbf{y} - \mathbf{A}\mathbf{x})\big), \tag{18}$$

where $\Psi_\eta(\mathbf{v}) = \arg\min_{\mathbf{x}} \left\{ \eta\, \Phi_{reg}(\mathbf{x}) + \|\mathbf{x} - \mathbf{v}\|_2^2 / 2 \right\}$ and $\Phi_{reg}(\mathbf{x})$ is a regularization function. The application of the TwIST algorithm in MRI reconstruction is shown. Fig.
1 shows an example of MRI reconstruction when only 2% of the image samples are available. The samples are acquired from the 2D DFT domain using a mask. The mask is formed of radial lines and placed around the origin. The TV regularization is done according to [101]. The original image, the mask and the reconstructed image are shown in Fig. 1.

Fig. 1 a) Original image; b) mask in the 2D DFT domain; c) image reconstructed from available samples (2% of the total number of samples)

(d) Trzasko and Manduca in [122] proposed a method for recovering under-sampled MR images by using a homotopic approximation of the ℓ0-norm. It is shown that the computed local minima of the homotopic ℓ0-minimization problem allow image reconstruction from very highly undersampled k-space data. The optimization problem can be defined starting from the relation:

$$\arg\min_{u} \|\Psi u\|_0 \quad \text{subject to} \quad \Phi u = \Phi f, \tag{19}$$

where Ψ is a wavelet, curvelet, etc. operator, Φ is the Fourier sampling operator and f is the continuous signal. If the ℓ0 semi-norm is replaced with the ℓ1-norm, as proposed by Candes and Donoho [122], the optimization problem can be recast as:

$$\arg\min_{u} \|\Psi u\|_1 \quad \text{subject to} \quad \|\Phi u - \Phi f_n\|_2^2 \le \varepsilon, \tag{20}$$

where the measured data $f_n$ is noisy and ε denotes the statistic of the noise process. Chartrand [122] proposed an alternative to the ℓ0 semi-norm that provides better sampling bounds compared to the ℓ1-norm and that is computationally feasible. He proposed the usage of the ℓp semi-norms (0 < p < 1) [...] > 0 is a regularization parameter. If we are dealing with an invertible transform Ψ, then the problem (57) can be defined as follows:

$$\arg\min_{\mathbf{X}} f(\Psi^{-1}\mathbf{X}, \Phi, \mathbf{y}) + \lambda\, g(\mathbf{X}), \tag{59}$$

where X denotes the video or a single video frame in the transform domain. Greedy algorithms are used for the unconstrained problems.
They are based on iteratively constructing a sparse set of non-zero transform coefficients and finding the solution of the minimization problem $\min \|\Phi \Psi^{-1} \mathbf{X} - \mathbf{y}\|_2^2$. The solution of the minimization problem can be found using the following greedy algorithms: OMP, regularized OMP (ROMP), stagewise OMP (StOMP) and CoSaMP [133].

(b) A new approach for the estimation of motion parameters in compressive sensed video sequences, using a reduced number of randomly chosen video frames, is proposed in [134]. The method focuses on velocity estimation and combines sparse reconstruction algorithms with time-frequency analysis, applied to the μ-propagation signal. The μ-propagation maps the sequence of video frames into a frequency modulated signal, or into a signal with a highly nonlinear phase. A video frame at the instant t is defined as:

$$f(x, y, t) = p(x, y) + o(\Delta x, \Delta y), \tag{60}$$

where $\Delta x = x - x_0 - b_x t$, $\Delta y = y - y_0 - b_y t$, o(x,y) denotes the moving object, p is the background, $(x_0, y_0)$ denotes the initial object position and $(b_x, b_y)$ is the velocity. The projection of the frame onto the x-axis is defined as:

$$R(x, t) = \sum_{y} f(x, y, t) = \sum_{y} p(x, y) + \sum_{y} o(\Delta x, \Delta y) = P(x) + O(\Delta x). \tag{61}$$

Finding the derivative of R(x,t) with respect to t and assuming a constant background, the following signal is obtained:

$$r(x) = \frac{\partial R(x, t)}{\partial t} = -b_x \frac{\partial O(\Delta x)}{\partial x} \approx R(x, t+1) - R(x, t). \tag{62}$$

The velocity estimation is done by applying the TF analysis to the signal in the form:

$$M(t) = \sum_{x} r(x)\, e^{j \mu x}, \tag{63}$$

having in mind that the instantaneous frequency corresponds to the velocity of the moving object. As the TF representation, the S-method can be used, since it provides a cross-terms free representation and is more suitable in the case of noisy signals.
It is defined based on the STFT as:

$$SM(t, f) = \sum_{i=-L}^{L} STFT(t, f + i)\, STFT^{*}(t, f - i), \tag{64}$$

where L is the S-method window width, while STFT(t,f) is defined as the FT of the windowed signal M(t), with the window function w(η): $STFT(t, f) = \sum_{\eta} w(\eta)\, M(t + \eta)\, e^{-j 2\pi f \eta}$. The CS is employed to reduce the number of frames required for the IF estimation. In other words, the CS is used to assure the estimation of motion parameters from an incomplete set of frames. If the subset of frames is denoted as s, $s(x, y, t_s) \subset f(x, y, t)$, where only M frames are acquired, $t_s = \{t_1, \ldots, t_M\}$, then the μ-propagation vector will contain a small number of samples, i.e., we will have the signal $M(t_s)$. For each windowed signal part used for the STFT calculation, we have the measurement vector $\mathbf{y}(t_{si})$:

$$\mathbf{y}(t_{si}) = w(\eta)\, M(t_{si} + \eta), \quad t_{si} + \eta \in t_s, \tag{65}$$

instead of the desired vector $\mathbf{x}(t) = w(\eta) M(t + \eta)$. The FT of the vector $\mathbf{y}(t_{si})$ will produce low resolution in the STFT, and therefore the CS is used in this step to recover the missing samples in the vector $\mathbf{y}(t_{si})$ and improve the resolution. If we denote the desired signal as x, the measurement vector as y, and the measurement and transform matrices as Φ and Ψ respectively, then the following relation holds:

$$\mathbf{y} = \Phi \mathbf{x} = \Phi \Psi \mathbf{X} = \mathbf{A}\mathbf{X}, \tag{66}$$

where X corresponds to the STFT coefficients at a certain available time instant $t_{si}$. To find x, or its spectral representation X, from an incomplete measurement vector y, the following optimization problem can be used:

$$\min \|\mathbf{X}\|_1 \quad \text{subject to} \quad \mathbf{y} = \mathbf{A}\mathbf{X}, \tag{67}$$

performed for each available time instant.

Fig.
10 a) Several frames from the observed video sequence; b) initial S-method of the variable µ-propagation vector; c) CS-based S-method of the variable µ-propagation vector; velocity estimation using: d) initial S-method and e) CS-based S-method

The results obtained by using a real video sequence are shown in Fig. 10. The percentage of the available frames is 40%, due to the compressive acquisition. The movement of a metronome's pendulum is observed, and some of the frames from the video sequence are shown in Fig. 10a. The S-method of the μ-propagation vector, calculated using the available samples, is shown in Fig. 10b, while the CS-based S-method is shown in Fig. 10c. The corresponding velocity estimation graphs are shown in Fig. 10d and e. It is shown that the initial S-method produces an error in the velocity estimation, while precise results are obtained by using the CS-based method.

4.6. CS in watermarking

(a) Data protection in terms of CS has been discussed in [136]-[141]. Fakhr in [137] proposed a watermark embedding and recovery technique based on the CS framework, tested under MP3 compression. The sparsity of both the host and the watermark signal is assumed. The watermark is embedded into the measurement vector y. If we denote the signal as x, the transform domain matrix as Ψ, and a sparse signal b of length L as the watermark, then the random watermark creation is described as:

$$\mathbf{w} = \Omega \mathbf{b}, \tag{68}$$

where $\Omega_{M \times L}$ is a random Gaussian matrix and M is the measurement vector length. The matrix Ω is used for random expansion of the sparse vector b. The embedding is done as follows:

$$\mathbf{y} = \Psi \mathbf{x} + a\, \Omega \mathbf{b}, \tag{69}$$

resulting in the watermarked measurement vector. The embedding strength a is adapted for each frame of the audio signal as: $a = 0.04\, \|\mathbf{x}\|$, $\|\mathbf{x}\| = \sqrt{\sum_{i=1}^{M} x_i^2}$. The advantage of the proposed method is that, in order to recover the clean signal, the optimization problem has to be solved, and thus the matrix Ω has to be known.
In that paper, three methods are used for solving the optimization problem: direct justice pursuit, multiplying by the inverse of Ω and multiplying by the annihilator of Ω.

(b) An image watermarking procedure in the CS scenario is proposed in [139]. The randomly chosen pixels that serve as CS measurements are used to carry the watermark. The image is first divided into blocks, and measurements are selected from each block. Samples are taken from the space domain, while the image sparsity is assumed in the DFT domain. If we denote the n×n image block as $\mathbf{I}_j$, the vector of measurements for the j-th block as $\mathbf{y}_j$, the vector of transform domain (DFT) coefficients of the block $\mathbf{I}_j$ as $\mathbf{T}_j$, the Fourier transform matrix as Ψ and the measurement matrix for the block j as $\Omega_j$, then the measurement vector is defined as:

$$\mathbf{y}_j = \Omega_j \mathbf{I}_j = \Omega_j \Psi \mathbf{T}_j. \tag{70}$$

The watermarked measurement vector $\mathbf{y}_j^{w}$ is obtained as follows:

$$\mathbf{y}_j^{w} = \mathbf{y}_j + \mu\, \boldsymbol{\omega}, \tag{71}$$

where μ denotes the watermark strength and ω is the M×1 watermark vector (M denotes the number of measurements). The vector of watermarked coefficients is used to recover the image according to the total variation optimization:

$$\min TV(\mathbf{T}_j) \quad \text{subject to} \quad \mathbf{y}_j^{w} = \Omega_j \Psi \mathbf{T}_j. \tag{72}$$

The reconstructed image block $\mathbf{I}_{rj}$ is obtained as $\mathbf{I}_{rj} = \Psi \mathbf{T}_j$. Watermark detection is based on the standard correlator, which requires the measurement matrix $\Omega_j$ to be known:

$$D = \sum_{i} y^{w}(i)\, \omega(i). \tag{73}$$

The procedure is tested on a 256×256 image, divided into 16×16 blocks. From each block, 50% of the pixels are randomly chosen; they serve as measurements in the reconstruction process and carry the watermark as well. The results are shown in Fig. 11. The PSNR between the original and the watermarked/reconstructed image is 31.79 dB.

Fig.
11 a) Original image; b) watermarked and reconstructed image; c) detector responses for 25 right keys and 2500 wrong trials (100 wrong trials for each right key)

5. Conclusion

The paper focuses on compressive sensing, an approach that has undergone intensive development in signal processing in recent years. An overview of compressive sensing applications and of commonly used algorithms for the reconstruction of signals with missing data is given. Algorithms for the reconstruction of both 1D and 2D signals are described. The paper covers applications ranging from radar signal processing, communications, biomedical signals and image reconstruction, through natural image reconstruction and velocity estimation in video signal processing, to CS-based protection of digital data and hardware devices designed on the CS principles. Experimental results are provided in order to show the performance of the presented algorithms and approaches.

Acknowledgement: The paper is a part of the research supported by the Montenegrin Ministry of Science, project grant: “New ICT Compressive Sensing Based Trends Applied to: Multimedia, Biomedicine and Communications (CS-ICT)” (Montenegro Ministry of Science, Grant No. 011002). The authors are thankful to Dr. Josip Musić for testing the performance of the object detection in search&rescue images.

References

[1] g. pope, “compressive sensing: a summary of reconstruction algorithms”, eidgenossische technische hochschule, zurich, switzerland, 2008. [2] e. candes, j. romberg, “l1-magic: recovery of sparse signals via convex programming”, october 2005. [3] d. donoho, “compressed sensing,” ieee transactions on it, vol. 52, no.4, 2006, pp. 1289-1306. [4] e. j. candes, j. romberg, t. tao, "robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information," ieee transactions on information theory, vol. 52, no. 2, pp. 489-509, feb. 2006.
[5] lj. stankovic, m. dakovic, s. stankovic, i. orovic, "sparse signal processing," in the book: digital signal processing, l. stankovic, createspace, amazon, 2015. [6] i. orovic, v. papic, c. ioana, x. li, s. stankovic, "compressive sensing in signal processing: algorithms and transform domain formulations," mathematical problems in engineering, review paper, 2016. [7] e. j. candes and t. tao, “decoding by linear programming,” information theory, ieee transactions on, vol. 51, no. 12, pp. 4203–4215, 2005. [8] g. davis, s. mallat, and m. avellaneda, “adaptive greedy approximations,” constructive approximation, vol. 13, no. 1, pp. 57–98, 1997. [9] y. arjoune, n. kaabouch, h. el ghazi, a. tamtaoui, "compressive sensing: performance comparison of sparse recovery algorithms," in 2017 ieee 7th annual computing and communication workshop and conference (ccwc), las vegas, nv, 2017, pp. 1-7. [10] s. stankovic, i. orovic, m. amin, "l-statistics based modification of reconstruction algorithms for compressive sensing in the presence of impulse noise," signal processing, vol.93, no.11, november 2013, pp. 2927-2931, 2013. [11] y. c. eldar and g. kutyniok, "compressed sensing: theory and applications", cambridge university press, may 2012. [12] s. stankovic, i. orovic, e. sejdic, "multimedia signals and systems: basic and advance algorithms for signal processing," springer-verlag, new york, 2015. [13] v.m. patel and r. chellappa, “sparse representations and compressive sensing for imaging and vision,” springerbriefs in electrical and computer engineering, 2013. [14] t. blumensath, m. e. davies, “iterative thresholding for sparse approximations”, journal of fourier analysis and applications, vol. 14, no. 5-6, pp 629-654, december 2008. [15] t. blumensath, m. e. davies, "gradient pursuits," ieee transactions on signal processing, vol.56, no.6, pp.2370-2382, june 2008. [16] r. mihajlovic, m. scekic, a. draganic, s. 
stankovic, "an analysis of cs algorithms efficiency for sparse communication signals reconstruction," in proceedings of the 3rd mediterranean conference on embedded computing, meco, 2014. [17] l. i. rudin, s. osher, e. fatemi, “nonlinear total variation based noise removal algorithms”, physica d: nonlinear phenomena, vol. 60, issues 1–4, 1 november 1992, pp. 259-268 [18] s. stankovic, i. orovic, "robust complex-time distributions based on reconstruction algorithms," in proceedings of the 2nd mediterranean conference on embedded computing meco 2013, budva, montenegro, 2013, pp. 105-108. [19] lj. stankovic, s. stankovic, m. amin, "missing samples analysis in signals for applications to l-estimation and compressive sensing," signal processing, vol. 94, jan 2014, pp. 401-408, 2014. [20] s. stankovic, lj. stankovic, i. orovic, "a relationship between the robust statistics theory and sparse compressive sensed signals reconstruction," iet signal processing, special issue on compressive sensing and robust transforms, vol. 8, issue 3, pp. 223-229, may, 2014 [21] s. bahmani, “algorithms for sparsity-constrained optimization”, springer theses, series volume 261, isbn 978-3-319-01880-5, 2014. [22] t. zhang, “sparse recovery with orthogonal matching pursuit under rip,” ieee trans. on information theory, vol. 57, no. 9, pp. 6215-6221, 2011. [23] s. stankovic, i. orovic, lj. stankovic, a. draganic, "single-iteration algorithm for compressive sensing reconstruction," telfor journal, vol. 6, no. 1, pp. 36-41, 2014. [24] a. draganic, i. orovic, n. lekic, m. dakovic, s. stankovic, "architecture for single iteration reconstruction algorithm," in proceedings of the 4th mediterranean conference on embedded computing. [25] j. a. tropp, a. c. gilbert, “signal recovery from random measurements via orthogonal matching pursuit,” ieee transaction on information theory, vol. 53, no.12, 2007. [26] lj. stanković, m.
daković, and s. vujović, “adaptive variable step algorithm for missing samples recovery in sparse signals,” iet signal processing, vol. 8, no. 3, pp. 246 -256, 2014. [27] s. vujovic, m. dakovic, i. orovic, s. stankovic, "an architecture for hardware realization of compressive sensing gradient algorithm," in proceedings of the 4th mediterranean conference on embedded computing meco 2015, budva, montenegro. [28] y. wang, j. xiang, q. mo, s. he, “compressed sparse time–frequency feature representation via compressive sensing and its applications in fault diagnosis”, measurement, vol. 68, pp. 70–81, may 2015. [29] s. stankovic, i. orovic, "an ideal omp based complex-time distribution," 2nd mediterranean conference on embedded computing meco 2013, pp. 109-112, june 2013, budva, montenegro. [30] y. c. eldar "sampling theory: beyond bandlimited systems", cambridge university press, april 2015. [31] p. flandrin, p. borgnat, "time-frequency energy distributions meet compressed sensing," ieee transactions on signal processing, vol.58, no.6, pp.2974, 2982, june 2010. [32] i. orovic, s. stankovic, t. thayaparan, "time-frequency based instantaneous frequency estimation of sparse signals from an incomplete set of samples," iet signal processing, special issue on compressive sensing and robust transforms, vol. 8, issue 3, pp. 239 245, may, 2014. [33] i. orovic, s. stankovic, m. amin, "compressive sensing for sparse time-frequency representation of nonstationary signals in the presence of impulsive noise," spie defense, security and sensing, baltimore, maryland, united states, 2013. [34] p. borgnat, and p. flandrin, "time-frequency localization from sparsity constraints," in proceedings of the ieee international conference on acoustics, speech and signal proceesing icassp-08, las vegas (nv), 2008, pp. 3785–3788. [35] m. brajović, b. lutovac, i. orović, m. daković, s. 
stanković, “sparse signal recovery based on concentration measures and genetic algorithm,” in proceedings of the 13th symposium on neural networks and applications neurel 2016, belgrade, serbia, november 2016. [36] x. li, g. bi, “time-frequency representation reconstruction based on the compressive sensing”, in proceedings of the 9th ieee conference on industrial electronics and applications, hangzhou, 2014, pp. 1158-1162. [37] s. stankovic, i. orovic, m. amin, "compressed sensing based robust time-frequency representation for signals in heavy-tailed noise," in proceedings of the information sciences, signal processing and their applications, isspa 2012, canada, 2012. [38] p. k. mishra, r. bharath, p. rajalakshmi, u. b. desai, "compressive sensing ultrasound beamformed imaging in time and frequency domain," in proceedings of the 17th international conference on e-health networking, application & services (healthcom), boston, ma, 2015, pp. 523-527. [39] i. orovic, s. stankovic, t. thayaparan, lj. stankovic, "multiwindow s-method for instantaneous frequency estimation and its application in radar signal analysis," iet signal processing, vol. 4, no. 4, pp. 363-370, 2010 [40] i. orovic, s. stankovic, "a class of highly concentrated time-frequency distributions based on the ambiguity domain representation and complex-lag moment," eurasip journal on advances in signal processing, vol. 2009, article id 935314, 9 pages, 2009. [41] s. stankovic, i. orovic, lj. stankovic, “polynomial fourier domain as a domain of signal sparsity”, signal processing, vol. 130, issue c, pp. 243-253, january 2017. [42] s. stankovic, i. orovic, t. pejakovic, m. orovic, "compressive sensing reconstruction of signals with sinusoidal phase modulation: application to radar micro-doppler," in proceedings of the 22nd telecommunications forum , telfor, 2014. [43] h. su, y. 
zhang, "time-frequency analysis based on compressive sensing," in proceedings of the 2nd international conference on cloud computing and internet of things (cciot), dalian, 2016, pp. 138-142. [44] i. volaric, v. sucic, z. car, "a compressive sensing based method for cross-terms suppression in the time-frequency plane," in proceedings of the ieee 15th international conference on bioinformatics and bioengineering (bibe), belgrade, 2015, pp. 1-4. [45] lj. stankovic, s. stankovic, i. orovic, m. amin, "robust time-frequency analysis based on the l-estimation and compressive sensing," ieee signal processing letters, vol. 20, no. 5, pp. 499-502, 2013. [46] g. hua, y. hiang, g. bi, “when compressive sensing meets data hiding”, ieee signal processing letters, vol. 23, no. 4, april 2016. [47] a. draganic, m. brajovic, i. orovic, s. stankovic, "a software tool for compressive sensing based time-frequency analysis," in proceedings of the 57th international symposium, elmar-2015, zadar, croatia, 2015. [48] i. orovic, s. stankovic, t. chau, c. m. steele, e. sejdic, "time-frequency analysis and hermite projection method applied to swallowing accelerometry signals," eurasip journal on advances in signal processing, vol. 2010, article id 323125, 7 pages, 2010. [49] a. krylov, d. korchagin, “fast hermite projection method,” in proceedings of the 3rd international conference on image analysis and recognition (iciar ’06), vol. 1, pp. 329–338, povoa de varzim, portugal, september 2006. [50] s. stankovic, i. orovic, a. krylov, "the two-dimensional hermite s-method for high resolution inverse synthetic aperture radar imaging applications," iet signal processing, vol. 4, no. 4, pp. 352-362, 2010. [51] m. brajović, i. orović, m. daković, s. stanković, “compressive sensing of signals sparse in 2d hermite transform domain,” 58th international symposium elmar-2016, zadar, croatia, september 2016. [52] a. sandryhaila, s. saba, m. püschel, j.
kovačević, “efficient compression of qrs complexes using hermite expansion,” ieee transactions on signal processing, vol. 60, no. 2, pp. 947-955, february 2012. [53] a. draganić, i. orović, s. stanković, “robust hermite transform based on the l-estimate principle,” in proceedings of the 23rd telecommunications forum, telfor 2015. [54] s. stankovic, lj. stankovic, i. orovic, "compressive sensing approach in the hermite transform domain," mathematical problems in engineering, vol. 2015 (2015), article id 286590, 9 pages. [55] m. brajovic, i. orovic, m. dakovic, s. stankovic, "the analysis of missing samples in signals sparse in the hermite transform domain," in proceedings of the 23rd telecommunications forum telfor, 2015, belgrade, serbia, 2015. [56] i. orovic, s. stankovic, "improved higher order robust distributions based on compressive sensing reconstruction," iet signal processing, vol. 8, issue: 7, pp. 738 748, may 2014. [57] s. stankovic, i. orovic, lj. stankovic, "an automated signal reconstruction method based on analysis of compressive sensed signals in noisy environment," signal processing, vol. 104, nov 2014, pp. 43 50, 2014. [58] m. g. christensen, j. østergaard, s. h. jensen, "on compressed sensing and its application to speech and audio signals," conference record of the forty-third asilomar conference on signals, systems and computers, pacific grove, ca, 2009, pp. 356-360. [59] m. scekic, r. mihajlovic, i. orovic, s. stankovic, "cs performance analysis for the musical signals reconstruction," in proceedings of the 3rd mediterranean conference on embedded computing, meco, 2014. [60] d. wu, w. p. zhu, m. n. s. swamy, "a compressive sensing method for noise reduction of speech and audio signals," in proceedings of the ieee 54th international midwest symposium on circuits and systems (mwscas), seoul, 2011, pp. 1-4. [61] l. sun, x. shao, z. 
yang, "an adaptive multiscale framework for compressed sensing of speech signal," in proceedings of the 6th international conference on wireless communications networking and mobile computing (wicom), chengdu, 2010, pp. 1-4. [62] m. dakovic, lj. stankovic, s. stankovic, "a procedure for optimal pulse selection strategy in radar imaging systems," in proceedings of the international workshop on compressed sensing theory and its applications to radar, sonar and remote sensing (cosera), 19-22 september, aachen, germany, 2016. [63] a. bacci, e. giusti, d. cataldo, s. tomei, m. martorella, "isar resolution enhancement via compressive sensing: a comparison with state of the art sr techniques," in proceedings of the 4th international workshop on compressed sensing theory and its applications to radar, sonar and remote sensing (cosera), aachen, 2016, pp. 227-231. [64] s. costanzo, a. rocha, m. d. migliore, “compressed sensing: applications in radar and communications”, the scientific world journal, vol. 2016 (2016), article id 5407415, 2 pages, editorial. [65] lj. stankovic, s. stankovic, t. thayaparan, m. dakovic, i. orovic, "separation and reconstruction of the rigid body and micro-doppler signal in isar part i-theory ," iet radar, sonar & navigation, vol. 9, no. 9, pp. 1147-1154, 2015. on some common compressive sensing recovery algorithms and applications 507 [66] lj. stankovic, s. stankovic, t. thayaparan, m. dakovic, i. orovic, "separation and reconstruction of the rigid body and micro-doppler signal in isar part ii-statistical analysis," iet radar, sonar & navigation, vol. 9, no. 9, pp. 1155-1161, 2015. [67] a. draganic, i. orovic, s. stankovic, x. li, "isar reconstruction from incomplete data using total variation optimization," in proceedings of the 5th mediterranean conference on embedded computing, (meco 2016). [68] l. c. potter, e. ertin, j. t. parker, m. cetin, "sparsity and compressed sensing in radar imaging," in proceedings of the ieee, vol. 
98, no.6, pp.1006-1020, june 2010. [69] m. dakovic, lj. stankovic, s. stankovic, "gradient algorithm based isar image reconstruction from the incomplete dataset," in proceedings of the 3rd international workshop on compressed sensing theory and its applications to radar, sonar and remote sensing, cosera, 2015. [70] j. ender, “on compressive sensing applied to radar”, signal processing, vol. 90, issue 5, may 2010, pp. 1402–1414. [71] lj. stankovic, s. stankovic, i. orovic, y. zhang, "time-frequency analysis of micro-doppler signals based on compressive sensing," compressive sensing for urban radar, ed. m. amin, crc-press, 2014. [72] lj. stankovic, i. orovic, s. stankovic, m. amin, "compressive sensing based separation of nonstationary and stationary signals overlapping in time-frequency," ieee transactions on signal processing, vol. 61, no. 18, pp. 4562-4572, sept. 2013. [73] s. stankovic, lj. stankovic, i. orovic, "l-statistics combined with compressive sensing," spie defense, security and sensing, baltimore, maryland, united states, 2013. [74] m. a. hadi, s. alshebeili, k. jamil, f. e. abd el-samie, “compressive sensing applied to radar systems: an overview”, signal, image and video processing, december 2015, volume 9, supplement 1, pp 25– 39. [75] a. draganic, i. orovic, s. stankovic, "blind signals separation in wireless communications based on compressive sensing," in proceedings of the 22nd telecommunications forum, telfor, 2014. [76] l. zhang, m. xing, c. w. qiu j. li, z. bao, “achieving higher resolution isar imaging with limited pulses via compressed sampling,” ieee geoscience and remote sensing letters, vol.6, no.3, pp.567– 571, 2009. [77] i. orovic, a. draganic, s. stankovic, "sparse time-frequency representation for signals with fast varying instantaneous frequency," iet radar, sonar & navigation, vol. 9, issue 9, pp. 1260 – 1267. [78] s. li, g. zhao, w. zhang, q. qiu, h. 
sun, "isar imaging by two-dimensional convex optimizationbased compressive sensing," ieee sensors journal, vol. 16, no. 19, pp. 7088-7093, oct.1, 2016. [79] p. zhang, z. hu, r. c. qiu, b. m. sadler, “a compressed sensing based ultrawideband communication system,” in proceedings of the ieee international conference on communications, 14-18 june 2009. [80] a. draganic, i. orovic, s. stankovic, m. amin, "rekonstrukcija fhss signala zasnovana na principu kompresivnog odabiranja," in proceedings of the telfor 2012, belgrade, 2012 [81] j. meng, j. ahmadi-shokouh, h. li, e. j. charlson, z. han, s. noghanian, e. hossain, “sampling rate reduction for 60 ghz uwb communication using compressive sensing, ” in proceedings of the asilomar conf. on signals, systems, and computers, monterey, california, november 2009. [82] a. draganic, i. orovic, s. stankovic, x. li, z. wang, "reconstruction and classification of wireless signals based on compressive sensing approach," in proceedings of the 5th mediterranean conference on embedded computing, (meco 2016). [83] b. jokanovic, m. amin, s. stankovic, "instantaneous frequency and time-frequency signature estimation using compressive sensing," spie defense, security and sensing, baltimore, maryland, united states, 2013, http://dx.doi.org/10.1117/12.2016636 [84] c. bernard, c. ioana, i. orovic, s. stankovic, "analysis of underwater signals with nonlinear time-frequency structures using warping based compressive sensing algorithm," in proceedings of the mts/ieee north american oceans conference, october 2015, washington, dc, united states, 2015. [85] i. murgan, a. digulescu, i. candel, c. ioana, “compensation of position offset of acoustic transducers using compressive sensing concept”, in proceedings of the oceans 2016 mts/ieee monterey, sep 2016, monterey, united states. pp. 1-4. [86] i. orovic, s. stankovic, lj. 
stankovic, "compressive sensing based separation of lfm signals," in proceedings of the 56th international symposium elmar 2014, zadar, croatia, 2014. [87] j. musić, t. marasović, v. papić, i. orović, s. stanković, "performance of compressive sensing image reconstruction for search and rescue," ieee geoscience and remote sensing letters, vol. 13, no. 11, pp. 1739-1743, nov. 2016. http://dx.doi.org/10.1117/12.2016636 508 a. draganić, i. orović, s. stanković [88] j. music, i. orovic, t. marasovic, v. papic, s. stankovic, "gradient compressive sensing for image data reduction in uav based search and rescue in the wild," mathematical problems in engineering, november, 2016 [89] a. akbari, d. mandache, m. trocan and b. granado, "adaptive saliency-based compressive sensing image reconstruction," in proceedings of the ieee international conference on multimedia & expo workshops (icmew), seattle, wa, 2016, pp. 1-6. [90] n. eslahi, a. aghagolzadeh, "compressive sensing image restoration using adaptive curvelet thresholding and nonlocal sparse regularization," ieee transactions on image processing, vol. 25, no. 7, pp. 3126-3140, july 2016. [91] j. wen, z. chen, y. han, j. d. villasenor, s. yang, "a compressive sensing image compression algorithm using quantized dct and noiselet information," in proceedings of the ieee international conference on acoustics, speech and signal processing, dallas, tx, 2010, pp. 1294-1297. [92] i. stankovic, i. orovic, s. stankovic, m. dakovic, "iterative denoising of sparse images," in proceedings of the 39th international convention on information and communication technology, electronics and microelectronics, (mipro 2016), 2016. [93] m. medenica, s. zukovic, a. draganic, i. orovic, s. stankovic, "comparison of the algorithms for cs image reconstruction," etf journal of electrical engineering 2014, 09/2014; vol. 20, no. 1, pp. 29-39. [94] c.-s. lu, h.-w. 
chen, “compressive image sensing for fast recovery from limited samples: a variation on compressive sensing”, information sciences, vol. 325, 20 december 2015, pages 33–47. [95] m. maric, i. orovic, s. stankovic, "compressive sensing based image processing in trapview pest monitoring system," in proceedings of the 39th international convention on information and communication technology, electronics and microelectronics, (mipro 2016). [96] s. stankovic, i. orovic, "an approach to 2d signals recovering in compressive sensing context," circuits systems and signal processing, 2016. [97] z. zhu, k. wahid, p. babyn, d. cooper, i. pratt, y. carter, “improved compressed sensing-based algorithm for sparse-view ct image reconstruction”, computational and mathematical methods in medicine, vol. 2013 (2013), article id 185750, 15 pages. [98] i. stankovic, i. orovic, s. stankovic, "image reconstruction from a reduced set of pixels using a simplified gradient algorithm," in proceedings of the 22nd telecommunications forum telfor 2014, belgrade, serbia, 2014. [99] m. lustig, d. donoho, j. pauly, “sparse mri: the application of compressed sensing for rapid mr imaging,” magn. reson. med., vol. 58, no. 6, pp. 1182–1195, 2007 [100] c. g. graff, e. y. sidky, “compressive sensing in medical imaging”, applied optics, 2015 mar 10; vol. 54, no. 8, c23–c44. [101] j. m. bioucas-dias, m. a. t. figueiredo, "a new twist: two-step iterative shrinkage/thresholding algorithms for image restoration," ieee transactions on image processing, vol. 16, no. 12, pp. 29923004, dec. 2007. [102] m. f. duarte et al., "single-pixel imaging via compressive sampling," ieee signal processing magazine, vol. 25, no. 2, pp. 83-91, march 2008. [103] r. fergus, a. torralba, w. t. freeman, “random lens imaging”, mit csail technical report, september 2006. [104] m. trakimas, r. d'angelo, s. aeron, t. hancock, s. 
sonkusale, “a compressed sensing analog-toinformation converter with edge-triggered sar adc core”, ieee transactions on circuits and systems i: regular papers, pp. 11351148, vol. 60, issue: 5, 2013 [105] m. lustig, d.l donoho, j.m santos, j.m pauly “compressed sensing mri”, ieee signal processing magazine, 2008, vol. 25, no. 2, pp. 72-82. [106] m. lustig, j.m. santos, d.l. donoho, and j.m. pauly, “k-t sparse: high frame rate dynamic mri exploiting spatio-temporal sparsity,” in proceedings of the 13th annual meeting ismrm, seattle, wa, 2006, p. 2420. [107] s. zukovic, m. medenica, a. draganic, i. orovic, s. stankovic, "a virtual instrument for compressive sensing of multimedia signals," in proceedings of the 56th international symposium elmar 2014, zadar, croatia, 2014. [108] m. hong, y. yu, h. wang, f. liu, s. crozier “compressed sensing mri with singular value decomposition-based sparsity basis”, physics in medicine and biology, vol. 56 (2011), pp. 6311–6325. [109] d. craven, b. mcginley, l. kilmartin, m. glavin, e. jones, "compressed sensing for bioelectric signals: a review," ieee journal of biomedical and health informatics, vol. 19, no. 2, pp. 529-540, march 2015. on some common compressive sensing recovery algorithms and applications 509 [110] y. liu, m. de vos, s. van huffel, "compressed sensing of multichannel eeg signals: the simultaneous cosparsity and low-rank optimization," ieee transactions on biomedical engineering, vol. 62, no. 8, pp. 2055-2061, aug. 2015. [111] a. m. abdulghani, a. j. casson, e. rodriguez-villegas, "quantifying the feasibility of compressive sensing in portable electroencephalography systems," in proceedings of the 5th international conference on foundations of augmented cognition. neuroergonomics and operational neuroscience: held as part of hci international 2009, san diego, ca, 2009, pp. 319-328. [112] s. senay, l. f. chaparro, m. sun, r. j. 
sclabassi, "compressive sensing and random filtering of eeg signals using slepian basis," in proceedings of the 16th european signal processing conference (eusipco 2008), lausanne, switzerland, 2008. [113] z. zhang, t. p. jung, s. makeig, b. d. rao, "compressed sensing of eeg for wireless telemonitoring with low energy consumption and inexpensive hardware," ieee transactions on biomedical engineering, vol. 60, no. 1, pp. 221-224, jan. 2013. [114] j. k. pant, s. krishnan, "reconstruction of ecg signals for compressive sensing by promoting sparsity on the gradient," in proceedings of the ieee international conference on acoustics, speech and signal processing, vancouver, bc, 2013, pp. 993-997. [115] l. f. polanía, r. e. carrillo, m. blanco-velasco, k. e. barner, "exploiting prior knowledge in compressed sensing wireless ecg systems," ieee journal of biomedical and health informatics, vol. 19, no. 2, pp. 508519, march 2015. [116] o. kerdjidj, k. ghanem, a. amira, f. harizi, f. chouireb, "real ecg signal acquisition with shimmer platform and using of compressed sensing techniques in the offline signal reconstruction," in proceedings of the ieee international symposium on antennas and propagation (apsursi), fajardo, 2016, pp. 1179-1180. [117] k. wilhelm, y. massoud, "compressive sensing based classification of intramuscular electromyographic signals," in proceedings of the ieee international symposium on circuits and systems, seoul, 2012, pp. 273276. [118] m. brajović, i. orović, m. daković, s. stanković, “gradient-based signal reconstruction algorithm in the hermite transform domain,” electronics letters, vol. 52, issue 1, pp. 41-43, 2016. [119] m. brajovic, i. orovic, m. dakovic, s. stankovic, "on the parameterization of hermite transform with application to the compression of qrs complexes," signal processing, vol. 131, february 2017, pages 113– 119. [120] m. brajovic, i. orovic, s. 
stankovic, "the optimization of the hermite transform: application perspectives and 2d generalization," in proceedings of the 24th telecommunications forum telfor 2016, november 2016, belgrade, serbia, 2016. [121] g. teschke “sparse recovery and compressive sampling in inverse and ill-posed problems”, lecture notes. [122] j. trzasko, a. manduca, "highly undersampled magnetic resonance image reconstruction via homotopic \ell _{0} -minimization," ieee transactions on medical imaging, vol. 28, no. 1, pp. 106121, jan. 2009. [123] p. zhang, z. hu, r. c. qiu, b. m. sadler, "a compressed sensing based ultra-wideband communication system," in proceedings of the ieee international conference on communications, dresden, 2009, pp. 1-5. [124] b. zhang, x. cheng, n. zhang, y. cui, y. li, q. liang, “sparse target counting and localization in sensor networks based on compressive sensing,” in proceedings of the ieee infocom, 2011, pp. 2255–2263. [125] m. weiss, “passive wlan radar network using compressed sensing,” in proceedings of the iet international conference on radar systems (radar 2012), glasgow, uk, 2012, pp. 1-6. [126] j. bazerque, g. giannakis, “distributed spectrum sensing for cognitive radio networks by exploiting sparsity,” ieee trans. signal process., vol. 58, no. 3, pp. 1847–1862, mar. 2010. [127] m. brajovic, a. draganic, i. orovic, s. stankovic, "fhss signal sparsification in the hermite transform domain," in proceedings of the 24th telecommunications forum telfor 2016, november 2016, belgrade, serbia, 2016. [128] y. lu, w. guo, x. wang, w. wang, "distributed streaming compressive spectrum sensing for wideband cognitive radio networks," in proceedings of the ieee 73rd vehicular technology conference (vtc spring), yokohama, 2011, pp. 1-5. [129] i. stanković, i. orović, m. daković, s. stanković, “denoising of sparse images in impulsive disturbance environment,” multimedia tools and applications, in print, 2017. [130] j. wu, f. liu, l. c. jiao, x. wang and b. 
hou, "multivariate compressive sensing for image reconstruction in the wavelet domain: using scale mixture models," ieee transactions on image processing, vol. 20, no. 12, pp. 3483-3494, dec. 2011. 510 a. draganić, i. orović, s. stanković [131] j. bobin, j. l. starck, r. ottensamer, "compressed sensing in astronomy," ieee journal of selected topics in signal processing, vol. 2, no. 5, pp. 718-726, oct. 2008. [132] j. bobin, j.-l. starck, “compressed sensing in astronomy and remote sensing: a data fusion perspective”, in proc. spie 7446, wavelets xiii, 74460i (september 04, 2009). [133] r. g. baraniuk, t. goldstein, a. c. sankaranarayanan, c. studer, a. veeraraghavan, m. b. wakin, "compressive video sensing: algorithms, architectures, and applications," ieee signal processing magazine, vol. 34, no. 1, pp. 52-66, jan. 2017. [134] i. orovic, s. park, s. stankovic, "compressive sensing in video applications," in proceedings of the 21st telecommunications forum telfor, novembar, 2013. [135] l.-w. kang, c.-s. lu, “distributed compressive video sensing,” in proceedings of the ieee international conference on acoustics, speech and signal processing (icassp '09), 2009, pp. 1169– 1172. [136] x. liao, k. li, j. yin, “separable data hiding in encrypted image based on compressive sensing and discrete fourier transform”, multimedia tools and applications, pp. 1-15, 2016. [137] m. w. fakhr, “robust watermarking using compressed sensing framework with application to mp3 audio”, international journal of multimedia & its applications 2013. [138] x. tang, z. ma, x. niu, y. yang, "compressive sensing-based audio semi-fragile zero-watermarking algorithm," chinese journal of electronics, vol. 24, no. 3, pp. 492-497, 07 2015. [139] i. orovic, s. stankovic, "compressive sampling and image watermarking," in proceedings of the 55th international symposium elmar 2013, zadar, croatia, sept. 2013. [140] m. orovic, t. pejakovic, a. draganic, s. 
facta universitatis series: electronics and energetics vol. 30, no. 4, december 2017, pp. 511-548 doi: 10.2298/fuee1704511p consideration of conduction mechanisms in high-k dielectric stacks as a tool to study electrically active defects albena paskaleva 1 , dencho spassov 1 , danijel danković 2 1 institute of solid state physics, bulgarian academy of sciences, sofia, bulgaria 2 university of niš, faculty of electronic engineering, niš, serbia abstract. in this paper, the conduction mechanisms which could govern electron transport through high-k dielectrics are summarized. the influence of various factors (the type of high-k dielectric and its thickness, doping with a certain element, the type of metal electrode, and the measurement conditions: bias, polarity and temperature) on the leakage currents and dominant conduction mechanisms is considered. practical hints on how to consider different conduction mechanisms and to differentiate between them are given. the paper presents an approach to assess important trap parameters from the investigation of dominant conduction mechanisms. key words: high-k dielectrics; conduction mechanisms; electrically active defects 1. introduction since the invention of the transistor in 1947 at bell laboratories and the first integrated circuit (ic), built independently at texas instruments 11 years later, astonishing progress has been made in si technology, achieved through continual scaling of semiconductor devices.
the phenomenal scaling trends are popularly known as moore's law, which predicts that the number of components per chip increases exponentially, doubling every 2-3 years. the key factor enabling these unprecedented scaling trends was the material properties (and the resultant electrical properties) of sio2 and its interface with si. for quite a long time the main electronic device, the mos transistor, consisted of a si substrate, sio2 as the gate dielectric and a poly-si gate electrode. sio2 formed the perfect gate dielectric material, successfully scaling from a thickness of about 100 nm 40 years ago to a mere 1.2 nm at the 90 nm node. this represents a layer only four atoms thick. the ultimate scaling of device dimensions has therefore pushed the thickness of sio2 to its physical limits, where unacceptably high direct tunneling currents flow through devices. received may 18, 2017 corresponding author: albena paskaleva, institute of solid state physics, bulgarian academy of sciences, tzarigradsko chaussee 72, sofia 1784, bulgaria (e-mail: paskaleva@issp.bas.bg) this resulted in increased power dissipation, accelerated oxide degradation and inferior reliability. indeed, the leakage current flowing through the transistors, arising from the direct tunneling of charge carriers, could exceed 100 A/cm$^2$, which lies well above the specifications given by the international technology roadmap for semiconductors (itrs), especially for low operating power and low standby power technologies (fig. 1). in fact, in this thickness range sio2 is no longer an insulator. the only feasible solution to this problem was the replacement of sio2 with alternative dielectrics that have a higher permittivity (high-k dielectrics), so that the required capacitance can be obtained with physically thicker layers.
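the trade-off between permittivity and physical thickness can be made explicit with the parallel-plate capacitance per unit area. the short worked example below is only an illustration: the symbols $C_{ox}$, $t_{hk}$, $k_{hk}$ and the notion of an equivalent oxide thickness (eot) are notation introduced here, and $k_{hk} = 20$ is an assumed round number, not a value taken from the text.

$$C_{ox} = \frac{\varepsilon_0 k}{t}, \qquad \mathrm{EOT} = t_{hk}\,\frac{k_{\mathrm{SiO_2}}}{k_{hk}}, \qquad k_{\mathrm{SiO_2}} \approx 3.9$$

$$\mathrm{EOT} = 1.2~\mathrm{nm},\; k_{hk} = 20 \;\Rightarrow\; t_{hk} = 1.2~\mathrm{nm}\times\frac{20}{3.9} \approx 6.2~\mathrm{nm}$$

under these assumptions, a film roughly five times thicker than the 1.2 nm sio2 layer would give the same capacitance per unit area, which is why direct tunneling can be suppressed without giving up gate control.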
the integration of high-k dielectrics into si nano-technology posed a number of serious problems, such as dielectric and interface charges, reduced channel mobility, charge trapping and degradation of parameters over the operating time of the device. all these issues are defined by the electronic structure and bonding in high-k dielectrics, which are distinctly different from those of sio2. sio2 has polar covalent bonds with a low coordination. unlike sio2, high-k oxides have higher atomic coordination numbers and a more ionic bonding character, due to the large difference in electronegativity between the metal and o atoms. as a result, high-k dielectrics have a much larger density of electrically active defects than silicon dioxide. electrically active defects are defined as atomic configurations which give rise to electronic states in the oxide band gap that can trap carriers. these defects very strongly influence the trapping behavior as well as the transport mechanisms, and hence the leakage currents flowing through the dielectric. however, one of the key requirements for mos devices, especially dram capacitors (and any kind of memory device which relies on charge storage), is to maintain a low leakage current density through the dielectric. high leakage will cause the capacitor to lose its charge, which represents the stored binary information, before the refreshing pulse. fig. 1 leakage current vs. voltage for various sio2 thicknesses [1]. it is also well known that the performance of metal oxide semiconductor field effect transistors (mosfets) strongly depends on the breakdown properties and the current transport behavior of the gate dielectric film. low leakage current is a stringent requirement to provide low standby-power consumption for several types of devices, the so-called low power applications (e.g. mobile phones, cameras, etc.) (fig. 1). for high performance applications (e.g. cpus) a high current density can be acceptable (see gate limit in fig.
1) but current flow through the dielectric causes increased power dissipation and heating, which in turn limits the reliability of devices. therefore, the investigation of leakage currents and conduction mechanisms of high-k dielectric films has attracted a lot of attention, and a huge research effort has been dedicated to addressing this issue for all kinds of micro- and nano-electronic devices. next, various conduction mechanisms in dielectric layers are briefly presented, and after that their operation in different dielectric stacks is demonstrated. some practical hints on how to consider different conduction mechanisms and to differentiate between them are also given. useful information on the trap parameters, as well as on structural alterations in the layers, obtained from these considerations is also presented. 2. conduction mechanisms in dielectrics the perfect insulator is free of traps, with a negligible free carrier concentration in thermal equilibrium. therefore, the ideal mos structure is an insulating device in which no dc current flows. in practice this is not true, especially for thin dielectric layers and high electric fields, which is the case in up-to-date mos devices. the macroscopic leakage current behaviour in a mos (mim) capacitor structure is governed most strongly by the properties of the metal/dielectric contact and the defect status of the dielectric film; hence two kinds of conduction mechanisms are considered: electrode-limited and bulk-limited. the electrode-limited mechanisms depend strongly on the electrode material and the metal/dielectric barrier height. they include injection of carriers over the schottky barrier at the metal/dielectric interface by thermionic emission (schottky emission) and tunneling through the thin barrier (direct and fowler-nordheim (fn) tunneling).
the bulk-limited mechanisms are governed by the material properties of the dielectric, especially the existence of traps and ionized centers in its bandgap. generally speaking, the bulk-limited conduction mechanisms are trap-assisted mechanisms, among which the most commonly considered are the poole-frenkel (pf) mechanism, trap-assisted tunneling and space-charge limited current. 2.1. electrode-limited conduction mechanisms schottky emission the schottky effect is the thermionic emission of electrons over the potential barrier $\phi_b$ at the metal-insulator interface, which is enhanced in the presence of an electric field. the current density governed by schottky emission is given by the richardson-dushman equation: $$J = C_{RD}\,T^2 \exp\left(-\frac{\phi_b - \sqrt{q^3 E/(4\pi\varepsilon_0 k_r)}}{kT}\right) \qquad (1)$$ here, $J$ is the current density, $T$ is the temperature, $E$ is the electric field, $\phi_b$ is the schottky barrier height, $k_r$ is the dynamic dielectric constant, and $\varepsilon_0$, $q$, $k$, $h$ and $C_{RD}$ are the permittivity of free space, the electron charge, the boltzmann constant, the planck constant and the richardson-dushman constant ($C_{RD} = 120$ A cm$^{-2}$ K$^{-2}$ for the free electron approximation), respectively. the factor $\sqrt{q^3 E/(4\pi\varepsilon_0 k_r)}$ represents the reduction of the surface barrier $\phi_b$ by the electric field. if schottky emission is the dominating mechanism, the plot of $\ln(J/T^2)$ vs. $E^{1/2}$, called the schottky plot, should give a straight line. from the y-intercept the barrier height $\phi_b$ can be extracted. the value of $k_r$ extracted from the slope of the schottky plot is required to be self-consistent, i.e. it must lie between the optical (high frequency) and the static dielectric constant (measured by capacitance methods) of the examined dielectric. usually $k_r$ is very close to the optical dielectric constant, which is equal to the square of the refractive index $n$ ($k_r = n^2$). schottky emission is a strongly temperature dependent process.
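the schottky-plot procedure described above can be sketched in a few lines of python. this is a minimal illustration, not the authors' analysis code: the function names (`schottky_current`, `fit_line`, `analyse_schottky_plot`) are hypothetical, the data are synthetic values generated from eq. (1), and the barrier height of 1 ev and $k_r = 4$ are assumed numbers chosen only to exercise the fit. si units are used, so $C_{RD}$ = 120 A cm$^{-2}$ K$^{-2}$ becomes 1.2e6 A m$^{-2}$ K$^{-2}$.

```python
import math

Q = 1.602176634e-19      # electron charge, C
K_B = 1.380649e-23       # Boltzmann constant, J/K
EPS0 = 8.8541878128e-12  # permittivity of free space, F/m
C_RD = 1.2e6             # Richardson-Dushman constant, A m^-2 K^-2


def schottky_current(field, temp, phi_b_ev, k_r):
    """Eq. (1): J in A/m^2 for a field in V/m and a barrier height in eV."""
    lowering = math.sqrt(Q**3 * field / (4.0 * math.pi * EPS0 * k_r))  # J
    return C_RD * temp**2 * math.exp(-(phi_b_ev * Q - lowering) / (K_B * temp))


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def analyse_schottky_plot(fields, currents, temp):
    """Fit ln(J/T^2) vs sqrt(E); return (barrier height in eV, k_r)."""
    xs = [math.sqrt(e) for e in fields]
    ys = [math.log(j / temp**2) for j in currents]
    slope, intercept = fit_line(xs, ys)
    # slope = sqrt(q^3 / (4 pi eps0 k_r)) / (k T)
    k_r = Q**3 / (4.0 * math.pi * EPS0 * (slope * K_B * temp) ** 2)
    # intercept = ln(C_RD) - phi_b / (k T)
    phi_b_ev = K_B * temp * (math.log(C_RD) - intercept) / Q
    return phi_b_ev, k_r


# Synthetic example (assumed values): 1-3 MV/cm at 300 K.
TEMP = 300.0
fields = [1.0e8 + i * 2.0e7 for i in range(11)]          # V/m
currents = [schottky_current(e, TEMP, 1.0, 4.0) for e in fields]
phi_b_fit, k_r_fit = analyse_schottky_plot(fields, currents, TEMP)
print(f"phi_b = {phi_b_fit:.3f} eV, k_r = {k_r_fit:.3f}")
```

with measured data, the fitted $k_r$ would then be compared with $n^2$ and with the static dielectric constant to check the self-consistency requirement before the extracted barrier height is trusted.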
for films where the electron mean free path is smaller than the film thickness, the standard equation is replaced by a modified one [2]: $$J = \alpha\,T^{3/2} E\,\mu \left(\frac{m^*}{m_0}\right)^{3/2} \exp\left(-\frac{\phi_b - \sqrt{q^3 E/(4\pi\varepsilon_0 k_r)}}{kT}\right) \qquad (2)$$ where $\alpha = 3\times10^{-4}$ A s/(cm$^3$ K$^{3/2}$), $\mu$ is the electron mobility in the insulator, $m^*$ is the effective electron mass in the dielectric and $m_0$ is the free electron mass. tunneling mechanisms tunneling can be divided into two cases: direct tunneling (i.e. tunneling through a trapezoidal barrier) and fowler-nordheim tunneling (i.e. tunneling through a triangular barrier). in the case of thicker films, tunneling becomes dominant at voltages where the barrier is approximately triangular and the electrons tunnel from the electrode into the conduction band of the dielectric. this process is described by the fowler-nordheim equation: $$J = A E^2 \exp\left(-\frac{B}{E}\right) \qquad (3)$$ $$B = \frac{8\pi}{3qh}\,(2m^*)^{1/2}\,\phi_b^{3/2} = 6.83\times10^{7}\left(\frac{m^*}{m_0}\right)^{1/2}\phi_b^{3/2}\ [\mathrm{V/cm}] \qquad (3a)$$ $$A = \frac{q^3 m_0}{16\pi^2\hbar\,m^*\phi_b} = 1.54\times10^{-6}\,\frac{m_0}{m^*}\,\frac{1}{\phi_b}\ [\mathrm{A/V^2}] \qquad (3b)$$ fowler-nordheim tunneling is generally considered to be a temperature independent, or only weakly temperature dependent, mechanism which takes place at higher electric fields. if the dominant current transport mechanism is fn tunneling, the graph of $\ln(J/E^2)$ versus $1/E$ is a straight line, and from its slope the barrier height can be extracted. the direct tunneling (dt) current is expressed by [3]: $$J = \frac{A E^2}{\left(\phi_b^{1/2} - (\phi_b - qV)^{1/2}\right)^2} \exp\left(-\frac{B}{E}\left(\phi_b^{3/2} - (\phi_b - qV)^{3/2}\right)\right) \qquad (4)$$ direct tunneling essentially operates in the low field (applied voltage) region $V < \phi_b/q$. for higher $V$, conduction proceeds via the fn mechanism, as the band bending under these applied voltages transforms the trapezoidal barrier into a triangular one. dt is observed only for layers with $d$ < 4-5 nm. fig.
2 electrode-limited conduction mechanisms: thermionic schottky emission, fowler-nordheim tunneling and direct tunneling (at low applied voltage, dashed line). the dotted line shows the schottky barrier lowering. 2.2. bulk-limited conduction mechanisms poole-frenkel (pf) emission, trap-assisted tunneling mechanisms and hopping conduction high-k dielectrics are trap-rich materials, and the conduction mechanisms which govern the leakage current through them are usually trap related. the trap states provide an alternative path for the charge carrier to pass from one electrode to the other. depending on how this process proceeds, different kinds of trap-related mechanisms can be distinguished. the first step is tunneling of the charge carrier to an empty trap state, from which it can tunnel to the next trap of the same energy (hopping or multi-step tunneling). if the carrier energy changes, the process is inelastic tunneling: the carrier loses energy and occupies a trap with a different energy. hopping and inelastic tunneling take place when the traps are distributed over a wide band of energies. the current in hopping conduction is: $$J = q\,a\,n_c\,\nu \exp\left(\frac{q a E - \phi_t}{kT}\right) \qquad (5)$$ where $n_c$ is the density of electrons in the conduction band, $\nu$ is the frequency of thermal vibration of electrons at trap sites, $a$ is the mean hopping distance (i.e., the mean spacing between trap sites) and $\phi_t$ is the trap energy level measured from the bottom of the conduction band. eq. 5 describes the tunneling of an electron from a trap to an adjacent one. however, if the trap density is not high enough, the mean hopping distance is large and the tunneling probability between two neighboring sites is quite low. the hopping current then results from the hopping of thermally excited electrons from one isolated site to another [4]: $$J \propto E \exp\left(-\frac{\phi_t}{kT}\right) \qquad (6)$$
dankovic so, in this case j obeys the ohmic law with a temperature dependence defined by t. fig. 3 schematic representation of hopping conduction another possibility is to tunnel from a trap to conduction band of the other electrode, i.e. similar to direct tunneling the electron does not enter the conduction band of dielectric (fig. 4). this process is usually referred to as a trap-assisted tunneling (tat) and obeys the equation [5]: dx xx xn qj d outin t    0 )()( )(  (7) where nt(x) is the trap density distribution, τin(x) and τout(x) tunneling time constants from the electrode to the trap and from the trap to the second electrode (these processes must happen sequentially) and d is the thickness of the layer, respectively. depending of the chosen model for τin and τout calculation different expressions for j can be obtained, and some models involve numerical calculation. (τin in and τout are calculated directly from the wentzel–kramers–brillouin approximation). the above equation can be written as [6]: dx pp ppnc qj tt   21 21 (8) where p1 and p2 are the tunneling probabilities for electron tunneling into the traps and subsequently to the second electrode and ct is a function of t. the integration limits depend on the b and t, and it is from 0 to d if t >b and xt-d if b > t (xt = (b-t-ee)/e, ee is the total energy of the electron in the gate [7]. using wkb approximation, assuming that t +ex >>ee, the following expression for j can be obtained [7]: 3 / 2 1 1 32 1 1 1 2 exp( 3 / 2 ) tan tan 3 t t t t qc n c er j r r ra                       (9) with: consideration of conduction mechanisms in high-k dielectric stacks... 517 r1=exp(-c3/2); r2=exp(c3); tadc  2 3 3  ; 3 24 ox qm a  (9a) or in more simplified equation [8]:  3 / 2 3 / 2 4 exp ( ( )) 3 oxt t t t mqn j qe d x q e                 (10) where xt is the most favorable position of trap in the dielectric, τ is the relaxation time of the trap. 
M. Houssa et al. [9] have shown that in the case of Si substrate injection through a SiO2/high-k stack, the TAT current at low voltages can be approximated by:

$$J \propto N_t \exp\left(\frac{qV_{SiO_2} - \phi_1 + \phi_2 + \phi_t}{kT}\right) \qquad (11)$$

where $V_{SiO_2}$ is the voltage drop across the SiO2 layer, $\phi_1$ (usually taken as 3.2 eV) is the barrier height at the Si/SiO2 interface and $\phi_2$ is the barrier height at the SiO2/high-k interface. Since in most cases MIS structures with high-k layers are actually double layered (the high-k film and an interfacial SiO2 layer, generally 1-2 nm thick), the TAT current in the low-voltage region will be proportional to $\exp(qV/kT)$.

Fig. 4 Trap-assisted tunneling mechanism

When the applied field is high enough, the electron enters the conduction band of the dielectric through a triangular barrier (field-assisted tunneling, FAT) (Fig. 5). In this case, using (8) for the current in the two-step tunneling process, one obtains:

$$J = \frac{2 q C_t N_t \phi_t}{3E} \exp\left(-\frac{4\sqrt{2 q m_{ox}}}{3\hbar E}\,\phi_t^{3/2}\right) \qquad (12)$$

The last equation suggests that the process is quite similar to Fowler-Nordheim tunneling, but in this case the barrier height $\phi_b$ at the electrode/dielectric interface is replaced by the energy depth of the trap $\phi_t$.

Apart from the ground state (FAT), the electrons can also tunnel from a thermally excited state, and the conduction mechanism is then referred to as thermionic FAT (T-FAT). T-FAT has been studied in detail by Sathaiya and Karmalkar [10,11]. In this two-step process, thermally excited electrons tunnel from the metal into the trap level and then to the oxide conduction band through the triangular barrier.

Fig. 5 Schematic representation and comparison of Poole-Frenkel, field-assisted tunneling and thermally stimulated field-assisted tunneling mechanisms, along with the modification of the potential barriers of the trap as a result of Coulombic interaction between the electron and the charged trap core in the presence of an electric field.
The depth of the trap is 1 eV and the applied electric field is 1.5 MV/cm.

The most widely reported conduction mechanism in high-k dielectrics is the Poole-Frenkel (PF) emission effect. In the PF mechanism the electron tunnels from the electrode to a trap in the forbidden gap of the dielectric and is then thermally emitted to the conduction band, owing to the lowering of the Coulombic potential barrier when a trap interacts with an electric field. The difference from FAT is represented in Fig. 5: the electron enters the conduction band of the dielectric through different processes, thermal emission in one case and tunneling in the other. The PF effect dominates when the tunneling probability is low, e.g. when tunneling distances are long (TAT) or the electric field is not high enough (FAT). Once emitted into the conduction band, the carrier will very likely be captured again by another trap, so the conduction process resembles hopping. The participating traps have to be Coulombic ones, i.e. neutral when occupied and charged when empty. The effect is similar to Schottky emission: the electric field lowers the potential barrier for the trapped charge carrier, thereby increasing the thermal ionization rate. The equation describing the current associated with field-assisted thermal ionization of traps, derived by Frenkel [12], treats the presence of only one type of trap in the band gap, with the Fermi level lying midway between the trap level and the conduction band edge. The original equation, however, often fails to describe the experimental J-V curves, which show a different slope than expected when plotted in PF scale ($\ln(J/E)$ vs. $E^{1/2}$). Simmons, Yeargan and Taylor, and Mark and Hartman [13-15] have addressed the issue, showing that if the dielectric, besides the PF (donor-like) traps, also contains another type of trap (e.g.
acceptor-like or neutral) that reduces the density of the available electrons from the donor centers, $J_{PF}$ is defined as:

$$J_{PF} = C_{PF}\, E \exp\left[-\frac{\phi_t - \sqrt{q^3 E/(\pi \varepsilon_0 k_r)}}{r k T}\right] \qquad (13)$$

Here, $C_{PF}$ is a constant proportional to the donor trap density ($N_d$) and the mobility, and $r$ is a coefficient reflecting the degree of compensation of the PF emitting traps by different trapping centers in the dielectric. There are two limiting cases for $r$, $r=1$ and $r=2$, i.e. $r$ has to lie in the interval $1 \le r \le 2$. Eq. 13 predicts that the plot of $\ln(J/E)$ vs. $E^{1/2}$ is a straight line, from the slope of which the high-frequency dielectric constant $k_r$ [13] can be found; this is used to verify the operation of the PF mechanism.

Here we should mention that there is some discrepancy in the literature about the value of $r$ in the original Frenkel formula (describing the case when only donor traps are present). Some authors assume $r=1$ for the original Frenkel expression, e.g. [13,16-21], while others, for example [22-24], show that in fact $r=2$ should be used. In the former case, $r=1$ is commonly denoted as the pure (or normal) PF effect, and $r=2$ is referred to as PF with compensation or modified PF [18-21]. In Frenkel's derivation [12] the density of electrons in the conduction band (CB) due to the emission is proportional to $\exp(-\phi_t/2kT)$, which actually leads to $r=2$ in (13). As discussed, however, by J. G. Simmons in [20] and the references cited there, the abundance of shallow traps in the insulator is expected to increase the emission rate of electrons at high electric fields, and hence in the case of thin films PF with $r=1$ is suggested. In this case, when shallow neutral traps lying above the Fermi level and donor sites are introduced in the layer [13], $r=2$ is obtained provided the density of compensating traps is approximately equal to that of the donor ones. Note that the existence of deep acceptor-like traps (below the donor sites) will not change $r$ when the $r=1$ approach is used for a film with only donor traps present [13].
and vice versa if the donor only case is treated with r=2 then the deep acceptor-like traps will result in r=1 [14,15], but in the presence of shallow neutral traps it will remain 2. so, the outlined inconsistency in the ―pure‖ pf model significantly embarrasses the interpretation of the experimental data. additionally, as pointed out in [16] besides the presence of compensating traps r in (13) depends also on the electron density in cb. and the two limiting cases are r=1 if nc 1, and if it is positioned below ef it is a deep trap. when one type of single shallow trap exists in the material, j is given by: 3 2 8 9 d v j  with           kt ee gn n ft t c exp (17) where g is the degeneracy factor of the traps, nc – effective density of states in the conduction band; nt trap density. the deviation from the ohm‘s law occurs at:  2 0 dqn v sclc  (18) (i.e. vsclc depends on nt). as efq rises with the increase of the injected carriers, gradually the trap level will come below efq and eventually all traps will be filled. near the trap filled condition j begins to rise sharply for several orders of magnitude (represented on the logj-logv plot by almost vertical, current jump). this increase continues till all the traps are filled at vtfl after which there is not any trapping, all injected electrons are in conduction band and j is again proportional to v 2 .  2 dqn v t tfl  (19) for the insulator with deep traps (efq – et)/kt > 1 the domination of sclc begins when the number of total injected carriers (free and trapped) is equal to the empty traps in equilibrium and occurs at:           kt ee g qntd v ft sclc exp 2  (20) unlike the case of shallow traps or trap-free dielectric, the current after vsclc does not switch to square law. instead a rapid increase of j is observed for v > vsclc. at sufficiently high v > vtfl when all the traps are filled j is again  v 2 (fig. 7). 
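As a rough numerical companion to the SCLC discussion, the sketch below evaluates the trap-free square law in its textbook Mott-Gurney form (assumed here to correspond to the trap-free expression referenced in the text) and the trap-filled-limit voltage of Eq. (19), $V_{TFL} = qN_t d^2/(2\varepsilon)$, together with its inversion to estimate a trap density from a measured $V_{TFL}$. The function names and the sample values (10 nm film, $k_r = 25$, $N_t = 10^{18}$ cm$^{-3}$, mobility 1 cm$^2$/Vs) are illustrative assumptions, not data from this work.

```python
Q = 1.602176634e-19        # elementary charge, C
EPS0 = 8.8541878128e-14    # vacuum permittivity, F/cm


def mott_gurney_j(v, d_cm, k_r, mu):
    """Trap-free SCLC current density (A/cm^2): J = 9/8 * eps * mu * V^2 / d^3.

    v in V, d_cm in cm, mu in cm^2/(V s); k_r is the relative permittivity.
    """
    return 9.0 / 8.0 * k_r * EPS0 * mu * v ** 2 / d_cm ** 3


def v_tfl(n_t, d_cm, k_r):
    """Trap-filled-limit voltage (V) for trap density n_t (cm^-3), Eq. (19)."""
    return Q * n_t * d_cm ** 2 / (2.0 * k_r * EPS0)


def trap_density_from_vtfl(v, d_cm, k_r):
    """Invert Eq. (19): trap density (cm^-3) from a measured V_TFL."""
    return 2.0 * k_r * EPS0 * v / (Q * d_cm ** 2)


# Hypothetical example values (assumptions, not measured data)
d_cm = 10e-7   # 10 nm expressed in cm
k_r = 25.0
n_t = 1.0e18   # cm^-3

vtfl_example = v_tfl(n_t, d_cm, k_r)
n_t_back = trap_density_from_vtfl(vtfl_example, d_cm, k_r)
ratio = mott_gurney_j(2.0, d_cm, k_r, 1.0) / mott_gurney_j(1.0, d_cm, k_r, 1.0)
```

Doubling the voltage quadruples the trap-free current, which is the $J \propto V^2$ signature used in the text to identify SCLC.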
thus, in the simple cases of trap-free dielectric or dielectric with traps represented by a single energy level in the bandgap, the operation of sclc can be reliably verified by the presence of j  v 2 in the characteristics and additionally confirmed by the thickness dependence of j according to (15) and (17). the traps in real dielectrics could be characterized with rather more complex energy distribution than confinement in a discrete energy level. several more complex distributions have been considered such as exponential, gaussian and uniform distributions. it turns out that for these more complicated cases, j in sclc regime is no longer  v 2 . for traps with exponential distribution as well as deep traps distributed in gaussian manner the current obeys [22,31,32]: j v l+1 /d 2l+1 (21) where l is higher than 1 and depends on the specific parameters of the particular distribution. (however, j for shallow traps with gaussian distribution follows (17) but with different θ.). consideration of conduction mechanisms in high-k dielectric stacks... 523 fig. 7 schematic j-v dependence of an insulator in the characteristic for sclc log-log plot in case of trap-free material and dielectric with shallow and deep traps present. ( st sclc v and dt sclc v denotes the values of vsclc for shallow and deep traps, respectively) if the traps are uniformly distributed within a narrow band in the forbidden gap (e.g. high impurity concentration) then j is represented by [33]: 1 2 2 2 exp( / ) exp c b v v j q n g e kt d qn ktd            (22) where e is the trap band width and nb is the density of the traps per unit energy interval. the switch from ohmic current to sclc (vsclc) can be obtained by solving j=qnμv/d where j is the current density in sclc regime described by the above relations. the same approach is used to determine vtfl, however, instead of ohmic current eq. (15) have to be used. 3. 
Experimental procedure

Investigation of current conduction mechanisms was performed on two kinds of test structures: metal-insulator-Si (MIS) and metal-insulator-metal (MIM) capacitors. Various dielectrics (ZrO2, HfO2 or Ta2O5 and their doped or mixed modifications) were used as the insulator in these structures. Several techniques (RF magnetron sputtering, metal organic chemical vapor deposition (MOCVD), atomic layer deposition (ALD)) were used to deposit these dielectrics. In addition, capacitor structures with various metal electrodes were prepared in order to investigate their influence on the conduction mechanisms. All details concerning the technology of the experimental structures can be found in the related publications cited in the text.

The current flowing in the structures is measured over wide voltage and temperature ranges and in both field polarities (i.e. in the case of MOS structures, under both accumulation and inversion conditions) in order to perform a complete analysis of the current conduction mechanisms. It should be taken into account that when the MOS structure is in inversion, the injection current saturates already at low voltages because of the insufficient amount of minority carriers, which hinders investigation of conduction mechanisms at higher fields. To solve this problem, I-V characteristics in inversion are usually measured under illumination to ensure sufficient generation of minority carriers. Another issue to consider when analyzing I-V curves is the thermodynamic instability of high-k dielectrics in contact with Si. As a result, a thin SiO2 interfacial layer is usually formed and the applied voltage $V_a$ is distributed between the two layers, the high-k dielectric and the interfacial SiO2. Therefore, when considering the possible conduction mechanisms, the stacked structure of the dielectric layer should be taken into account.
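The division of the applied voltage between the high-k film and the interfacial layer can be sketched numerically. The snippet below assumes an ideal two-layer series capacitor with no charge stored at the internal interface, so that the displacement field is continuous across it; the function name and the layer values (5 nm high-k film with permittivity 20 on 1 nm SiO2 with permittivity 3.9) are illustrative assumptions, not parameters of the structures studied here.

```python
def stack_voltages(v_a, eps_hk, d_hk, eps_if, d_if):
    """Split the applied voltage v_a between the high-k and interfacial layers.

    Thicknesses may be in any common unit (only their ratio matters).
    Assumes continuity of the displacement field: eps_hk*E_hk = eps_if*E_if.
    """
    v_hk = v_a / (1.0 + (eps_hk * d_if) / (eps_if * d_hk))
    v_if = v_a / (1.0 + (eps_if * d_hk) / (eps_hk * d_if))
    return v_hk, v_if


# Hypothetical stack: 5 nm high-k (eps = 20) on 1 nm SiO2 (eps = 3.9), 1 V applied
v_hk, v_if = stack_voltages(1.0, 20.0, 5.0, 3.9, 1.0)

# Field in the high-k layer in V/cm (5 nm = 5e-7 cm)
e_hk = v_hk / 5.0e-7
```

With these assumed values roughly half of the applied voltage drops across the 1 nm interfacial layer, which illustrates why the field in the high-k film cannot be taken simply as the applied voltage divided by the total thickness.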
The voltage drops across the high-k dielectric, $V_{hk}$, and across the interfacial layer, $V_{if}$, can be obtained from the well-known equations [34]:

$$V_{hk} = \frac{V_a}{1 + \dfrac{\varepsilon_{hk}\, d_{if}}{\varepsilon_{if}\, d_{hk}}} \quad \text{and} \quad V_{if} = \frac{V_a}{1 + \dfrac{\varepsilon_{if}\, d_{hk}}{\varepsilon_{hk}\, d_{if}}} \qquad (23)$$

where $\varepsilon_{hk}$ and $\varepsilon_{if}$ are the dielectric constants of the high-k dielectric and the interfacial layer, and $d_{hk}$ and $d_{if}$ are the thicknesses of the two layers.

4. Results and discussion

4.1. Conduction mechanisms in SiO2

Owing to the nearly perfect, defect-free structure of SiO2, the main conduction mechanisms that govern the current are the electrode-limited Fowler-Nordheim (FN) tunneling in thicker films and direct tunneling in layers thinner than about 3 nm. Lenzlinger and Snow [35] have shown that the conduction current in conventional SiO2 films is described very well by the classical FN formula. Since thermally grown SiO2 has an extremely wide bandgap and consequently a high energy barrier at its contact with an electrode, it is more likely to show electrode-limited conduction than other insulators. In addition, bulk-limited mechanisms are less likely to play a role because of the low trap density in the forbidden band of SiO2. Direct tunneling is observed in very thin dielectric films (< 4 nm). In this case electrons tunnel from one electrode to the other through the dielectric film, i.e. this is tunneling through a trapezoidal barrier and the electron does not enter the conduction band of the dielectric. Figure 8 depicts typical J-V curves (gate injection mode) of MOS capacitors with thermal SiO2 with thicknesses corresponding to the DT and FN tunneling regimes, respectively. As seen, the experimental data are fitted very well by Eqs. 3 and 4. Assuming $m^*/m_0 = 0.5$, the thickness of the SiO2 layer and the barrier height at the Al/SiO2 interface were found. The obtained results agree very well with the ellipsometrically measured thicknesses and with the widely accepted values of $\phi_b$ for the Al/SiO2 interface.
Fig. 8 Leakage currents of Al/SiO2/p-Si capacitors with SiO2 thickness of 2.3 nm (a) and 6.6 nm (b), plotted as J (A/cm$^2$) versus voltage (V) with experimental data and fit. The data are fitted to the DT (a) and FN (b) mechanisms using $m^*/m_0 = 0.5$; $\phi_b$ = 3.05 eV (a) and 3.1 eV (b).

4.2. Conduction mechanisms in high-k dielectrics

In high-k materials the conduction mechanisms are usually trap related, and the domination of a certain mechanism depends on many parameters, which can be summarized in several groups: 1) parameters of intrinsic and process-induced traps (density, spatial and energy location, and charge state), which are directly related to the inherent electronic structure of the high-k material as well as to the technological processes used for its fabrication; 2) stack parameters, including the thickness of the dielectric and the existence of an interfacial layer, the structural status of the films (amorphous or crystalline), the type of metal electrode, etc.; and 3) measurement conditions: applied voltage, its polarity, and temperature. Such a large set of parameters influencing the conduction process results in a wide diversity of mechanisms observed in high-k dielectric stacks.

Dependence on measurement conditions (bias, polarity, temperature)

Conduction mechanisms clearly depend strongly on, and are defined by, the measurement conditions (bias, polarity, temperature). Accordingly, a change of measurement conditions can result in a change of the dominating conduction process in the dielectric, and this change can be used to better understand the participation of traps in conduction mechanisms as well as to assess some structure parameters. The first example demonstrates the possibility of obtaining the barrier height at the high-k dielectric/Si interface by measurement at low temperatures.
Ti/Zr-silicate/Si MOS structures have been investigated [36] and we tried to obtain Fowler-Nordheim conduction in this structure. I-V measurements were performed under substrate injection at low temperatures (from room temperature down to -185 °C). The measurements indeed showed that below about -120 °C the I-V curves are temperature independent and can be fitted well by the Fowler-Nordheim equation (3). From the fitting, we obtained a value of 1.4 eV for the barrier height at the Zr-silicate/Si interface. This value is in agreement with the value of 1.4 eV given for zirconium dioxide [37].

Measurements at very low temperatures can also give valuable information about the traps, because all temperature-activated processes are suppressed and hence only temperature-independent processes (e.g. Fowler-Nordheim tunneling, field-assisted tunneling, etc.) may operate. In addition, at a certain temperature the trapping and detrapping are in thermal equilibrium. As trapping is only slightly temperature dependent [38,39], while detrapping is a strongly temperature-stimulated process, it is expected that at very low temperatures detrapping does not occur, hence the influence of trapping relative to detrapping on the leakage current will be maximized. Following this idea, the change of conduction mechanisms in TiN/Zr1-xAlxO2/TiN MIM structures (Al content less than 10%) measured in a wide bias and temperature range [40] has been investigated. The J-V curves of the structures are measured down to very low temperatures (25 K) (Fig. 9). Despite the symmetrical MIM structure, there is a polarity asymmetry at 25 K, while at 300 K symmetrical curves have been obtained (Fig. 9a). At T < 175 K the conduction process at both polarities is symmetric: it is FAT through one and the same bulk traps, located at ~1.3-1.4 eV below the CB of the dielectric.
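The barrier-height extraction from a Fowler-Nordheim fit mentioned above can be illustrated with a short sketch. It fits $\ln(J/E^2)$ against $1/E$ with an ordinary least-squares line and converts the slope, $-B$, into $\phi_b$ through the numerical prefactor of Eq. (3a). The helper names, the synthetic data and the effective-mass ratio $m^*/m_0 = 0.3$ are assumptions made only for the demonstration; they are not the values or the fitting procedure used in [36].

```python
import math

FN_B_PREFACTOR = 6.83e7  # V/cm per eV^(3/2), numerical factor of Eq. (3a)


def linear_fit(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def fn_barrier_height(fields, currents, m_ratio):
    """Barrier height (eV) from the slope of ln(J/E^2) vs 1/E.

    fields in V/cm, currents in A/cm^2, m_ratio = m*/m0 (assumed known).
    """
    xs = [1.0 / e for e in fields]
    ys = [math.log(j / e ** 2) for e, j in zip(fields, currents)]
    slope, _ = linear_fit(xs, ys)
    b = -slope  # FN plot slope equals -B
    return (b / (FN_B_PREFACTOR * math.sqrt(m_ratio))) ** (2.0 / 3.0)


# Synthetic FN data generated with an assumed barrier of 1.4 eV
phi_true, m_ratio = 1.4, 0.3
b_true = FN_B_PREFACTOR * math.sqrt(m_ratio) * phi_true ** 1.5
a_true = 1.54e-6 / (m_ratio * phi_true)
fields = [4.0e6 + 0.5e6 * i for i in range(9)]
currents = [a_true * e ** 2 * math.exp(-b_true / e) for e in fields]

phi_fit = fn_barrier_height(fields, currents, m_ratio)
```

Because the extracted $\phi_b$ scales as $(m^*/m_0)^{-1/3}$, the result depends on the effective mass assumed, which is why a fitted barrier is normally quoted together with the mass used.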
the asymmetry of j-v curves and the stronger temperature dependence at positive bias resulted from the trapping/detrapping at the bottom tin/high-k interface (where thin tiox–like layer is formed [41,42]) and the shift v ~ 0.4-0.5 v between the two j-v curves at 25 k corresponds to a trapped charge of ~8x10 12 cm -2 . consideration of possible conduction mechanisms at higher temperatures (t≥25 c) (fig. 9b) in negative polarity revealed that at low fields (v<│-2│ v) the current is due to trap assisted tunneling (tat) through a trapezoidal barrier and the activation energy of the process is ~0.25 ev. the increase of voltage above │-2│ v changes the barrier for the tunneling electrons from trapezoidal to triangular, i.e. a change from tat to fat occurs, which results in a change of the field dependence of the current. the weak temperature dependence in the range 30-100 c and the obtained activation energy of ~0.25 ev (same as that for v<│-2│v) (fig. 9b) is related to the thermal excitation of electrons at the first stage of the tunneling process (i.e. injection from electrode to the trap). with the increase of temperature above 100 c the probability for thermal excitation from trap to the cb of dielectric increases and the pf process through traps located at ~1.3 ev (fig. 10a) becomes the dominant mechanism. therefore, the results reveal existence of a trap level at about 1.3 ev below the cb edge of dielectric, which fully controls the transport of electrons through the dielectric at negative polarity and gives rise to several conduction mechanisms that dominate at different conditions. this trap level is assigned to the first ionization level v + of o vacancy in zro2 whose energy position has been calculated at 1.2-1.4 ev [43]. the conduction at positive polarity is substantially different and only one temperature activated process operates (fig.10b), unlike the case at negative polarity. 
PF conduction is the dominating mechanism under these conditions, and the energy location of the traps is found to be ~0.85 eV below the CB of the dielectric (i.e., substantially different from the values obtained at negative polarity). Therefore, the current transport at very low temperatures is realized through the 1.3 eV traps, whereas at higher temperatures different traps govern the current at the two polarities. A plausible explanation is that the 0.85 eV trap level is related to defects resulting from the reaction at the high-k/bottom TiN electrode. At very low temperatures these defects act as traps, while at high fields and temperatures they serve as transport sites, thereby controlling the current at positive polarity.

Fig. 9 J-V curves ($J_g$ in A/cm$^2$ versus $V_g$ in V) of TiN/Zr1-xAlxO2/TiN capacitors measured (a) from 25 to 300 K, and (b) from 300 to 430 K. The insets compare the J-V curves at both top electrode polarities and different temperatures [40].

Fig. 10 Arrhenius plot of current density at (a) negative, and (b) positive top electrode polarity [40].

Dependence on thickness

Results obtained for Ru/Ta2O5/SiON/Si MOS structures will be used to demonstrate the dependence of conduction mechanisms on dielectric thickness [44]. Temperature-dependent I-V (I-V-T) characteristics of structures with different Ta2O5 thickness in the range 1.85-13.5 nm, measured at 25 and 100 °C, are shown in Fig. 11.
It is seen that the I-V-T characteristics measured at negative voltages show a stronger temperature dependence for the thicker dielectrics; this dependence weakens as the oxide thickness decreases and almost disappears for the 1.85-nm-thick oxide film. For positive gate bias, i.e. substrate injection, the I-V-T characteristics show a weak dependence on temperature for all thicknesses. At low electric fields (Fig. 11), the logarithm of the current density ($J_g$) changes linearly with gate voltage ($V_g$) for Ta2O5 with $t_{ox}$ = 13.5-7.0 nm. This is consistent with Poole hopping, Eq. (14), and can be explained by the higher defect density close to the Ru electrode originating from the reaction at its interface with Ta2O5. For higher electric fields under gate injection ($V_g$ > -3, -1.5, and -1 V for MOS capacitors with $t_{ox}$ = 13.5, 7.0, and 3.5 nm, respectively), the observed behaviour indicates a gradual transition from a temperature-dependent conduction mechanism governing the leakage current in the thicker layers to a mechanism with a weak temperature dependence in the thinner oxide layers. Among the conduction mechanisms examined, only a transition from Poole-Frenkel (PF) emission (Eq. 13) through thermionic field-assisted tunneling (T-FAT) to field-assisted tunneling (FAT) (Eqs. 3, 12) can qualitatively reproduce the I-V-T characteristics. For the IF layer, direct tunneling through a trapezoidal barrier is assumed. For the MOS structure with $t_{ox}$ = 13.5 nm, PF emission (Eq. 13) fits the I-V-T curves well for $V_g$ > -3 V, and $\phi_t$ = 0.7 eV is extracted from Eq. (13) (Fig. 12). Since the electric field in the high-k oxide increases for thinner oxides, electron tunneling from the trap level to the Ta2O5 conduction band through the triangular barrier (i.e. field-assisted emission) becomes dominant (Fig. 13). In thin layers tunneling is the more probable process.
With increasing thickness, the tunneling probability decreases and PF emission from the traps becomes dominant.

Fig. 11 Leakage current-voltage characteristics (current density in A/cm$^2$ versus gate voltage in V) measured at 25 °C (solid symbols) and 100 °C (open symbols) on MOS structures with different oxide thickness ($t_{ox}$ = 13.5, 7.0, 3.5 and 1.85 nm). For clarity, the I-V-T characteristics of the 10 nm thick oxide are not displayed [44].

The trap barrier heights for MOS structures with oxide thicknesses of 10.1, 7.0, and 3.5 nm have been calculated from I-V data measured at room temperature (RT) and 100 °C ($E_{high-k}$ is defined from Eq. 23; $m_{ox} = 0.3 m_e$). The trap barrier heights measured at room temperature are in the range 0.29-0.31 eV for Ta2O5 of different thickness, while those extracted from the I-V curves measured at 100 °C are systematically lower by about 10 meV. Although the difference between $\phi_t$ extracted at RT and at 100 °C is within the range of experimental error, the observed behaviour suggests that conduction mechanism in the MOS capacitors with 3.5 0.2 MV/cm compared to the pure Ta2O5, the effect of the doping consists in a modification of the compensation factor of the PF mechanism. The doping of thin Ta2O5 increases the value of $r$ to 2, but for the thicker films $r$ is reduced compared to pure Ta2O5 (Fig. 14). These results show that the type and the density of traps are changed as a result of the doping, but the details of this change depend on the film thickness rather than on the Ti incorporation method.

The analysis of the temperature dependence of the current in Ti:Ta2O5 layers [48] reveals that the energy levels of the defects participating in the conduction are in the range of 0.6-0.7 eV and do not depend on the method of Ti incorporation.
These values are close to the values typical for pure Ta2O5, suggesting that Ti doping does not alter the dominant defect structure but only the trap density and the degree of compensation in the PF mechanism.

Fig. 14 Poole-Frenkel plot ($\ln(J/E)$ versus $E^{1/2}$ in (V/cm)$^{1/2}$) of J-V curves of the capacitors with pure and doped Ta2O5 with two levels of Ti incorporation and two thicknesses d of the stack: a) d ~ 10 nm, b) d ~ 30 nm. The values of $k_r$ and $r$ are given [47].

Hf-doped Ta2O5: The doping of Ta2O5 with Hf results in a significant alteration of the shape and temperature dependence of the leakage current (Fig. 15) [49]. Generally, the current at room temperature in Hf-doped samples is higher than that in pure Ta2O5 and it is very weakly temperature dependent. The dominating conduction mechanisms and the energy location of the traps involved in them for samples with various thicknesses are summarized in Table 1. The results imply that a change of the conduction mechanism from field-assisted tunneling (FAT) to PF emission occurs with increasing thickness of Hf-doped Ta2O5, which is in agreement with the results obtained above for pure Ta2O5 with a Ru electrode. However, the traps responsible for both processes are the same and they are located at ~0.35-0.45 eV below the CB of the dielectric. Therefore, the conduction in Hf-doped Ta2O5, whether field-assisted tunneling, PF emission or even hopping, is governed by shallower traps ($\phi_t$ ~ 0.35-0.45 eV). The deeper levels (~0.75-0.9 eV) typical of pure Ta2O5 (Table 1) are not observed in any of the doped samples.
These results imply that Hf passivates the deeper traps (O vacancies) and the electron transport proceeds through energetically shallower traps. The obvious consequences of transport through shallower traps in the doped samples are the higher level of leakage current at room temperature and a considerably weaker temperature dependence. The latter result can have serious implications, as the increase of temperature during device operation will not change the leakage current in Hf-doped samples, hence more stable and predictable behavior and enhanced reliability can be anticipated. In fact, the leakage current at 100 °C is already lower in the Hf-doped Ta2O5 than in the undoped films.

Fig. 15 Temperature-dependent J-V curves ($J_g$ in A/cm$^2$ versus $V_g$ in V, at room temperature and 100 °C, d = 7 nm) of a) Al/pure Ta2O5/p-Si structures; the inset shows the Arrhenius plot of the current for three different film thicknesses, and b) Al/Hf-doped Ta2O5/p-Si structures; the inset represents the results in FN coordinates [49].

Al-doped Ta2O5: Another doping agent studied to assess the possibility of improving the electrical properties of Ta2O5 is aluminum. Since Al atoms have a lower valence than Ta atoms, it was expected that they could act as acceptors and compensate the oxygen vacancies in Ta2O5. Two Al-doping techniques were investigated. In the first approach, Al was introduced in Ta2O5 using the surface doping method (similarly to the case of Ti and Hf doping) [50].
In the second approach, lightly Al-doped Ta2O5 layers were obtained by reactive sputtering, in an oxygen-containing environment, of a TaAl target containing 5 at% aluminum [51]. For the first type of Al-doped stacks, the high-field conduction was through the PF effect, and similarly to the case of Ti doping the effect of the dopant consists in a modification of the compensation factor $r$. An increase of the Al content results in higher $r$.

The behavior of lightly Al-doped films fabricated by TaAl target sputtering is more complicated. Compared to pure Ta2O5, the current in lightly Al-doped samples is higher. Furthermore, the temperature dependence of J seems to depend on the layer thickness, as evidenced in Fig. 16, which is symptomatic of a significant difference in the dominant conduction mechanisms. Schottky emission and SCLC were ruled out as a possible origin of the J-V characteristics in 10 nm films. The PF mechanism can explain the characteristics at high fields (above 1.4 MV/cm). The values of $r$ obtained from the fits exhibit a clear temperature dependence: $r$ decreases with temperature. The J-V curves can also be described well by $J \propto \exp(E)$ (Fig. 17), which is consistent with Poole hopping (Eq. 14). The two segments could be interpreted as two sets of traps defining the current at low and high fields, respectively. The operation of Poole hopping (at gate injection) means the presence of a high trap concentration close to the gate electrode. The Arrhenius plots of the current density at several applied voltages are depicted in Fig. 18 together with the trap energies calculated according to the considered conduction mechanisms.
table 1 summary of the dominant conduction mechanisms and the respective traps participating in them, depending on the film composition and thickness

| samples | d, nm | temperature of measurement | dominant conduction mechanism | trap energy position, ev |
| pure ta2o5 | 7 | 20-50 °c | field assisted tunneling | 0.4 |
| pure ta2o5 | 7 | 60-100 °c | poole-frenkel emission | 0.85 |
| hf-doped ta2o5 | 7 | 20-100 °c | field assisted tunneling | 0.43 |
| pure ta2o5 | 15 | 20-100 °c | poole-frenkel emission | 0.76 |
| hf-doped ta2o5 | 15 | 20-100 °c | [...] | [...] |

[...] 1.2-1.5 mv/cm) is φb ~ 0.7±0.1 ev (m* = 0.1m0) [53]. as the barrier height at tin/hfo2 is expected to be much higher (2.6 ev) [54], we conclude that the dominant mechanism is not fowler-nordheim tunnelling but field-assisted tunnelling (fat).

fig. 20 temperature-dependent i-v measurements (current density j, a/cm², versus voltage, v) for: (a) hfo2 (75 cy); (b) hah (36/5/36); and (c) hahah (24/2/24/2/24) samples [52].

therefore, the obtained value of 0.7 ev is the position of the traps φt with respect to the cb of hfo2, and it corresponds to the energy position of oxygen vacancies in hfo2 [55,56]. the current is temperature dependent for t > 60 °c, hence the poole-frenkel mechanism (eq. 13) has been considered. the obtained values of r²kr (from the slope of the lines) are consistent with the refractive index of the dielectric and with the limitation 1 ≤ r ≤ 2. it is therefore concluded that pf is the dominant conduction mechanism, and a trap energy position φt ≈ 0.72 ev was extracted, i.e. these are the same traps which govern the fat process at lower temperatures.
therefore, the results imply that the conduction proceeds through the traps situated at 0.7 ev below the cb of hfo2. fat dominates at low t, while pf dominates at elevated temperatures. it is worth mentioning that fat and pf conduction should in principle co-exist, as they both represent the escape of an electron from a trap, and the current is actually the sum of the currents from both mechanisms. similarly, the current in the hah (36/5/36) sample is nearly t-independent up to 60 °c (fig. 20b), and the same mechanisms, fat at t < 60 °c and pf at t > 60 °c (ehk > 2 mv/cm), have been considered. the fitting gives a trap energy φt ≈ 1.2-1.3 ev for the fat process and φt ≈ 0.4-0.5 ev for the pf process. in other words, unlike in pure hfo2, in the hah (36/5/36) sample the fat and pf processes are mediated by two different trap levels. these results give evidence that al-doping introduces deeper trap levels than those in pure hfo2. further consideration of the conduction mechanisms in the hahah(24/2/24/2/24) sample confirms this conclusion and reveals that the way al is incorporated into hfo2 is also of substantial importance. there is still a weak t-dependence up to 60 °c, and above 60 °c the current increases more strongly than in the hah(36/5/36) sample (fig. 20c). considering fat as the dominant conduction mechanism at t < 60 °c and ehk > 1.4 mv/cm gives a trap level φt ≈ 1.4 ev, i.e. slightly deeper than that obtained for the hah(36/5/36) sample. poole-frenkel emission does not fit the i-v curves well. having in mind the peculiarity of the doping process of the hahah(24/2/24/2/24) sample, it is suggested that the conduction could be governed by poole conduction (eq. 14). indeed, the poole mechanism fits the current well at t ≥ 100 °c (fig. 21a). the activation energy of the process controlling the leakage current (fig. 21b) is found to be ea ≈ 0.2-0.3 ev for t < 100 °c, while at t ≥ 100 °c a larger activation energy of ea ≈ 0.6-0.7 ev is observed.
for poole conduction ea = φt − sp·ehk, i.e. the activation energy depends linearly on ehk, and the intercept of the line with the y-axis gives φt = 1.2 ev (fig. 21a, inset), i.e. very similar to the trap level which controls the fat process at lower temperatures in both the hah(36/5/36) and the hahah(24/2/24/2/24) samples. this result supports the conclusion that the 1.2 ev level is related to al-doping. therefore, the results obtained for this sample reveal that fat is dominant at t < 60 °c, while at 60 °c < t < 100 °c both fat and poole conduction are present. with increasing temperature the contribution of poole conduction increases, and at t > 100 °c it is the main mechanism which governs the current in the hahah(24/2/24/2/24) sample. the specific way of al incorporation in the hahah(24/2/24/2/24) sample results in a more homogeneous distribution of traps and favours poole conduction between them.

the conduction mechanisms considered up to now dominate at relatively high fields, ehk > 1.5 mv/cm. as is seen in fig. 20c, there is a wide electric field region (|v| < 3 v, ehk < 1.3 mv/cm) where the current increases only slightly with applied voltage and temperature. it is suggested that in this region the conduction is governed by a space charge limited current (sclc) mechanism [57]. the logj-loge representation of the i-v curves gives a line with a slope of 1, corresponding to ohmic conduction, which increases to 2 (fig. 22). this behavior is in accordance with the sclc theory in the presence of discrete shallow traps (i.e. trap level above the fermi level) (eq. 17). the voltage vtfl in fig. 22 is the trap-filled limit at which all the traps are filled. therefore, in this field range sclc is the dominant mechanism in the hahah(24/2/24/2/24) sample.
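the slope test used to recognise sclc (a log-log slope of about 1 in the ohmic region rising to about 2) can be sketched as below. this is a hedged illustration on synthetic data: the crossover field `E_CROSS`, the prefactor and the helper `loglog_slopes` are assumptions made for the example and are not taken from the measurements.

```python
import math


def loglog_slopes(fields, currents):
    """Local slope d(log J)/d(log E) between consecutive data points."""
    slopes = []
    for i in range(len(fields) - 1):
        dlog_e = math.log10(fields[i + 1]) - math.log10(fields[i])
        dlog_j = math.log10(currents[i + 1]) - math.log10(currents[i])
        slopes.append(dlog_j / dlog_e)
    return slopes


E_CROSS = 5.0e5  # V/cm, hypothetical ohmic-to-square-law crossover field


def synthetic_j(e_field):
    """Toy current: J ~ E below the crossover, J ~ E^2 above it (continuous)."""
    if e_field < E_CROSS:
        return 1e-12 * e_field
    return 1e-12 * e_field ** 2 / E_CROSS


fields = [1.0e5, 2.0e5, 4.0e5, 8.0e5, 1.6e6]  # V/cm
currents = [synthetic_j(e) for e in fields]
slopes = loglog_slopes(fields, currents)

for e_lo, e_hi, s in zip(fields, fields[1:], slopes):
    print(f"{e_lo:.1e} - {e_hi:.1e} V/cm: slope = {s:.2f}")
```

on real data the slopes are noisy, so fitting a straight line over each field segment is usually preferable to point-to-point differences.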
fig. 21 (a) representation of i-v curves in poole coordinates (lnj vs. ehk, mv/cm) of hahah sample for t = 120-160 °c. the inset shows the dependence of the activation energy ea = φt − sp·ehk of poole conduction on the electric field, with φt = 1.2 ev. (b) arrhenius plot (lnj vs. 1/t, k⁻¹) of the current in hahah sample in the whole temperature range (30-160 °c) and for different applied voltages (v = -3.5 to -4.9 v); the fitted ea values lie between 0.15 and 0.30 ev in the lower temperature range and between 0.55 and 0.72 ev in the higher one [52].

therefore, the detailed characterisation of conduction mechanisms in different stacks revealed the existence of traps with an energy level of 0.7 ev below the conduction band in pure hfo2, which is consistent with the energy of an oxygen vacancy in hfo2. the conduction in hfo2 proceeds via these traps, and the dominant conduction mechanism (fat or pf emission) is defined by the measurement conditions (applied voltage and temperature). in all al-doped samples the existence of a deeper level (1.2-1.4 ev below the conduction band) is undoubtedly revealed. the 0.7 ev level is not observed at all. the disappearance of the 0.7 ev trap level could be explained by a reduction of the concentration of oxygen vacancies in hfo2 by al-doping, which was also observed by other authors [58]. the increase of al-doping obviously results in the formation of deeper traps, which have also been observed by molas et al. [59], who found a trap level at about 1.35 ev below the conduction band for hfalo layers with hf:al(9:1), which is even deeper (1.55 ev) for samples with higher al content (hf:al(1:9)).
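the extraction of the trap depth from the field dependence of the poole activation energy amounts to extrapolating a straight line to zero field. a minimal sketch is given below, assuming the linear relation between ea and ehk quoted in the text; the data points are illustrative values chosen to lie on such a line (they are not the measured ones), and `linear_fit` is a hypothetical helper.

```python
def linear_fit(x, y):
    """Ordinary least-squares straight line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Illustrative (assumed) data: activation energy in eV versus field in MV/cm,
# generated from Ea = 1.2 eV - 0.25 eV/(MV/cm) * E.
fields_mv_cm = [2.0, 2.2, 2.4, 2.6, 2.8]
ea_ev = [1.2 - 0.25 * e for e in fields_mv_cm]

field_coeff, phi_t = linear_fit(fields_mv_cm, ea_ev)
print(f"zero-field intercept (trap depth): {phi_t:.2f} eV")
print(f"field coefficient: {-field_coeff:.2f} eV per MV/cm")
```

the intercept plays the role of the trap depth, while the magnitude of the slope corresponds to the field coefficient in the poole expression.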
fig. 22 logj vs. logehk curves representing the space-charge-limited conduction at low fields in hahah sample; the marked voltages are vsclc = 0.9 v and vtfl = 3.1 v [52].

dependence on composition

in the examples considered up to now, the influence of a small amount of a given element on the conduction mechanisms of high-k dielectrics has been considered. the next example will demonstrate how the gradual change of the hf:ti ratio (both hf and ti in significant amounts) in the composition of high-k hfti-silicate layers results in quite different field, temperature and polarity dependences of the leakage currents, and hence in a significant alteration of the dominating conduction mechanism (fig. 23a-d) [60,61]. five different compositions with hf:ti ratios in the films of (100:0), (42:58), (27:73), (12:88) and (0:100) have been investigated. the pure hf-silicate (hf0.5si0.5o2) (fig. 23a) exhibits symmetrical i-v curves and conduction mechanisms, suggesting deposition of single layers without formation of a significant interfacial layer. this also shows that the asymmetry of the band diagram does not influence the conduction mechanism, hence it is mostly bulk-limited. it is found that the conduction is governed by two different processes: pf emission dominates at lower fields, whereas fat is the more probable process at higher fields. the trap level φt evaluated in both processes and for both polarities is at 0.65-0.7 ev below the cb of the dielectric (inset fig. 23a). at t > 100 °c for positive polarity another trap level, φt ~ 1.1 ev, is also observed. this trap has been found to dominate the pf conduction also in hf-silicate layers with larger hf content (hf0.86si0.14o2) [62]. the layers containing both hf and ti in significant amounts (hf:ti (42:58) and hf:ti (27:73)) reveal a strong gate polarity asymmetry of the conduction process (fig. 23b,c).
at negative polarity pf emission is again the dominating mechanism. however, two trap levels, located at 0.7 ev and 0.9 ev below the cb of the dielectric (insets fig. 23b,c), participate in the conduction process. since conduction through the 0.9 ev level prevails as the ti content increases, this trap has been assigned to ti bonds. later theoretical calculations by muñoz-ramo et al. [63] have indeed confirmed the existence of a 0.9 ev state caused by ti incorporation into hfo layers. therefore, these states are intrinsic traps related to the inherent electronic structure of the hftio dielectric. the analysis of the curves has also revealed that two phenomena give rise to the asymmetry. the first one is the enhanced trapping of negative charge near the si/dielectric interface, which modifies the field and hence the current. significant trapping of negative charge occurs in the two samples at low positive voltages and lower temperatures. the trapped charge is distributed along the whole dielectric film, and with increasing ti content its centroid moves farther away from the dielectric/si interface. here, the distinct difference between the traps participating in the conduction process and those causing the asymmetry of the i-v curves should be mentioned. the former serve only as stepping sites for the carriers (i.e., the electrons are trapped there and released immediately thereafter). the latter are states in which the carrier is trapped longer and can be detrapped at increased voltage and/or temperature. the asymmetry of electron conduction in the films cannot be explained by charge trapping alone, especially for the hf:ti (27:73) sample. the formation of an interfacial layer with a composition and/or structure different from the bulk film must also be invoked to interpret the obtained results.
it is suggested that the two phenomena (charge trapping and formation of a double layer structure) are manifestations of one and the same structural process, namely separation of phases and formation of tio2, hfo2 and sio2 islands in the film. depending on the degree of phase separation, these islands can act as trapping centers that only modify the poole-frenkel emission in the layer (hf:ti (42:58)), or can form a separate layer, in this way causing a real asymmetry of the conduction process (hf:ti (27:73)). the conduction in the sample with the lowest hf content (hf:ti(12:88)) (fig. 23d) and in the pure ti-silicate (not shown) could not be fitted by any of the commonly considered conduction mechanisms. low temperature (down to -193 °c) j-v measurements have revealed the participation of soft optical phonons with an energy of ~20 mev in the conduction process, and it was concluded that the conduction is governed by phonon-assisted tunneling between localized states (i.e. the current obeys the relation j ~ exp(a/e^(1/4)) [65]) (insets fig. 23d). the obvious consequence of this kind of process is that the electrons do not enter the conduction band of the dielectric and the conduction takes place within the band gap. another implication is that the conduction is governed by the intrinsic properties of the layers. the finding that soft optical phonons substantially impact the conduction gives some hints about the structural status of the films, as the phonon modes depend on the crystalline (amorphous) structure and local bonding [60,61]. the substantial contribution of the soft optical phonons to the intrinsic properties of hftio layers has also been confirmed by theoretical calculations [63], which reveal that the increase in permittivity mainly originates from the changes in the phonon spectrum induced by the presence of ti.

fig.
23 temperature dependent j-v curves of al/ti/hftisio/p-si structures with different dielectric layer compositions: a) hf:ti (100:0), b) hf:ti (42:58), and c) hf:ti (27:73); the insets show the arrhenius plot of the current and the respective trap energy levels. d) hf:ti (12:88); the insets represent the arrhenius plot of the current and j-v curves in phonon-assisted tunneling coordinates. all j-v characteristics in inversion are measured under illumination to ensure a sufficient generation of minority carriers [64].

effect of the top (gate) metal electrode

the influence of the gate electrode material on the leakage currents and conduction mechanisms in high-k dielectric stacks is of significant interest. since the implementation of high-k dielectrics in microelectronic devices entails abandoning poly-si gates, in order to avoid the creation of siox interfacial layers that compromise the overall capacitance, the gate material has to be carefully selected. generally, the choice of gate metal is determined by its work function, as a higher work function results in a higher barrier between the gate electrode and the dielectric, and hence lower leakage currents are expected. it turned out, however, that metal electrodes can also react with the underlying high-k film, generating additional traps, e.g. oxygen vacancies, in the high-k layer [66]. in addition, the process of gate deposition has to be considered carefully, as it can also introduce defects in the dielectric and severely modify the leakage and conduction mechanisms.

the effect of the metal gate on the conduction mechanisms is illustrated by the results obtained for ta2o5 mis capacitors with al, w and tin gate electrodes. two types of ta2o5 layers were studied: reactively sputtered and thermally oxidized ta2o5 [67,68]. in the case of sputtered ta2o5, al-gated structures exhibit the highest leakage currents (fig. 24).
fig. 24 j-e dependence (j, a/cm², versus e, mv/cm) at negative gate voltage for mos capacitors with two thicknesses of ta2o5 and different gate electrodes (w, tin and al) [67].

despite the higher work function (4.95 ev), capacitors with tin electrodes were leakier than w-gated ones (4.55 ev was assumed as the w work function). this behavior was attributed to a lowering of the barrier height at the tin/ta2o5 interface as a result of the accumulation of radiation defects during electrode deposition. the conduction analysis showed that the current in the investigated capacitors is generally dominated by bulk-limited conduction. the pf effect governs the current in al- and w-gated capacitors. fig. 25 represents the leakage characteristics in pf (a) and schottky (b) plots. the dominant mechanism was identified from the level of agreement between the fitted kr and the refractive index n of the films. the values of n determined by ellipsometry were ∼2.2, without a clear dependence on ta2o5 thickness, hence the corresponding value of kr = n² is ~4.8. j of capacitors with w electrode is almost independent of e in the low field region (< 0.6 mv/cm), suggesting the presence of transient currents. at higher fields (0.8-1.6 mv/cm) the j-e curves are well fitted by a modified pf mechanism (with r = 1.4) in the case of 17 nm films and by normal pf for the thicker (31 nm) ones, indicating that the latter contain fewer traps. structures with al electrodes are characterized by two regions. in the low field region (∼0.1-0.8 mv/cm) the conduction can be attributed to schottky emission (fig. 25b), although the kr values obtained from the fit only roughly agree with the ellipsometrically measured refractive index. the conductivity in the second region can be described by the modified pf effect (fig. 25a) with r ≈ 2, with a weak trend of r decreasing as d increases.
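the consistency check between the fitted slopes and the refractive index can be illustrated numerically. the sketch below assumes the commonly used textbook forms, j ~ e·exp[−(qφt − β_pf·e^1/2)/(r·kB·t)] with β_pf = (q³/(π·ε0·kr))^1/2 for modified pf emission and ln j ~ β_s·e^1/2/(kB·t) with β_s = β_pf/2 for schottky emission; the notation may differ from the equations used earlier in the paper. the temperature of 300 k and all function names are assumptions introduced for this example.

```python
import math

Q = 1.602176634e-19      # elementary charge, C
EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m
KB = 1.380649e-23        # Boltzmann constant, J/K


def pf_slope(k_r, r, temperature_k):
    """Slope of ln(J/E) vs sqrt(E), E in V/m, for the assumed modified PF law."""
    beta_pf = math.sqrt(Q ** 3 / (math.pi * EPS0 * k_r))
    return beta_pf / (r * KB * temperature_k)


def compensation_factor(slope, k_r, temperature_k):
    """Compensation factor r implied by a measured PF-plot slope and a known k_r."""
    return pf_slope(k_r, 1.0, temperature_k) / slope


def schottky_slope(k_r, temperature_k):
    """Slope of ln(J) vs sqrt(E), E in V/m, for Schottky emission."""
    beta_s = math.sqrt(Q ** 3 / (4.0 * math.pi * EPS0 * k_r))
    return beta_s / (KB * temperature_k)


def schottky_k_r(slope, temperature_k):
    """Dynamic dielectric constant implied by a measured Schottky-plot slope."""
    beta_s = slope * KB * temperature_k
    return Q ** 3 / (4.0 * math.pi * EPS0 * beta_s ** 2)


T_ROOM = 300.0            # K, assumed measurement temperature
n_refr = 2.2              # refractive index quoted in the text
k_r_optical = n_refr ** 2  # about 4.8, as quoted in the text

# Round trip: build a slope for r = 1.4 and recover r and k_r from it.
slope_pf = pf_slope(k_r_optical, 1.4, T_ROOM)
r_back = compensation_factor(slope_pf, k_r_optical, T_ROOM)
k_r_back = schottky_k_r(schottky_slope(k_r_optical, T_ROOM), T_ROOM)
print(f"k_r from n: {k_r_optical:.2f}, recovered r: {r_back:.2f}")
```

a mechanism is accepted only if the k_r (or r) recovered from the measured slope is physically reasonable, i.e. k_r close to n² and r between 1 and 2.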
neither the pf effect nor schottky emission can be invoked to explain the conduction in tin capacitors (the values of kr and r extracted from the schottky as well as the pf plots do not agree with n). the j(v) dependence (fig. 26) at low voltages is linear, indicating ohmic behavior and suggesting hopping conduction [69]. at applied |v| > 0.5 v the current becomes proportional to v², characteristic of space charge limited current. this behavior is typical of dielectrics with a substantial amount of shallow defects, which affect the electric field in the film by trapping charge carriers. based on the results, it can be concluded that the type of the gate electrode affects the dominant conduction mechanism in reactively sputtered ta2o5 layers on si. the observed high leakage current for al/ta2o5 results from both the lower barrier height and the larger defect density in ta2o5. the former could be additionally reduced by defect generation at the al/ta2o5 interface due to the reaction between al and ta2o5, causing the appearance of thermionic emission in these capacitors [66,67,70]. the j-v behavior of tin capacitors and the corresponding conduction mechanisms are a manifestation of radiation defects induced by the electrode deposition, which affect the electrical properties to some extent.

fig. 25 j-v (forward bias) curves of capacitors with sputtered ta2o5 and different gate electrodes (w and al gates, d = 17, 27 and 31 nm) in a poole-frenkel plot, ln(j/e) versus e^1/2 in (v/cm)^1/2 (a), and schottky plot (b) [67].

the results obtained for mis capacitors with thermal ta2o5 and the same metal electrodes (al, w and tin) are in general agreement with the data for sputtered ta2o5 [67]. the lowest leakage corresponds to w-gated capacitors, although in this case the conduction mechanism was not identified.
contrary to the structures with sputtered ta2o5, the highest j in thermal ta2o5 was observed for tin electrodes, and the current of the capacitors with al and tin gates was substantially higher (up to 8 orders of magnitude) than j of the w ones. this behavior was attributed to a possible reaction between ta2o5 and the al or tin electrodes, producing alox or tiox near the gate interface, accompanied by unwanted traps, probably in the form of oxygen vacancies. the leakage characteristics of structures with al and tin electrodes in pf and schottky plots are presented in fig. 26. a nearly normal pf effect (r = 1.1 at kr = 4.8) dominates the current at applied fields of 0.3 to 1.7 mv/cm for the samples with al electrode; schottky conduction can be ruled out, as evidenced in fig. 26b. the pf effect can also be invoked for tin gates at low electric fields (0.06 to 0.5 mv/cm). the obtained compensation factor is 1.2, close to the value proposed for normal pf. hence, thermal ta2o5 exhibits a much lower (or negligible) density of compensating centers. the electrode material and deposition conditions only slightly alter the density of compensation traps, causing small variations of the degree of compensation. at higher fields the slope of the pf plot corresponds to r of about 2.5, which is not fully consistent with the pf theory. in this field range j is described reasonably well by schottky emission, as the value of kr obtained from the plot is in accordance with the ellipsometrically determined refractive index.

fig. 26 logj vs. logv for tin/sputtered ta2o5/si capacitors (d = 17 and 27 nm); the fitted slopes are 1, 2 and 2.4 [67].

this change of the mechanism from bulk-limited (low and middle fields) to electrode-limited (high fields) is strange, and at first glance it is not consistent with the indication of a noticeable generation of defects during tin deposition.
this behavior can be attributed to the quality of the top electrode-interface region itself (in one case electrons tunnel from the gate into the traps located close to or in this region, and in the other case they overcome the gate barrier by schottky emission), which is defined by the density and energy distribution of the traps created during gate deposition. if the trap concentration close to the electrode is small, the electrons cannot tunnel through the traps into the ta2o5 conduction band. in this case schottky emission is more relevant at high fields; for tin in particular this means that the rf sputter-induced traps are mainly not traps at the electrode but rather high-density interface states at si, as indicated by the c-v data. it should be mentioned, however, that although r is slightly higher than the theoretical limit of 2 in the high-field region for tin capacitors, the overall form of the pf plot is consistent with the one predicted in [16], suggesting a transition from r = 1 to r = 2 with the increase of e, as a result of the growth of the electron density in the conduction band during the voltage ramp and the change of the inequality between the free electrons and the donor and compensating traps. the observed slightly higher r may result from some uncertainty in the electric field determination, which is typical for structures with high-k dielectrics due to their double-layered nature and charge trapping.

fig. 26 (a) poole-frenkel plot, ln(j/e) versus e^1/2 in (mv/cm)^1/2, and (b) schottky plot, ln(j) versus e^1/2 in (mv/cm)^1/2, of i-v characteristics for tin and al-electroded capacitors with thermal ta2o5, d = 15 nm [68].
moreover, the suggestion that conduction is governed by a bulk-limited mechanism is further supported by the absence of a well pronounced effect of the gate material in accordance with its work function, although a special study of the work function, especially of tin deposited under the conditions used here, was not performed, and the possibility that it differs from the literature values cannot be excluded.

4. conclusion

in conclusion, a wide variety of conduction mechanisms (poole-frenkel emission, field-assisted tunneling, trap-assisted tunneling, phonon-assisted tunneling, etc.) is demonstrated to operate in high-k dielectric materials. it is shown that a detailed study and analysis of the conduction mechanisms through metal gate/high-k dielectric stacks makes it possible to obtain valuable information about the trap parameters, as well as about some stack parameters and structural alterations in these structures. it is also demonstrated that one and the same trap can mediate different conduction mechanisms in a given high-k dielectric, depending on the specific stack parameters and measurement conditions. the doping/mixing of high-k dielectrics is revealed as an effective approach to change the energy location of traps in the bandgap of the dielectric, as well as their density, in this way also changing the dominant conduction mechanism and hence the electrical behavior of the structures. the results also imply that in most of the cases investigated the oxygen vacancy is the main transport site.

acknowledgement: the paper is a part of the research done within the bilateral cooperation between bulgarian academy of sciences and serbian academy of sciences and arts.

references

[1] s.h. lo, d. a. buchanan, y. taur and w. wang, "quantum-mechanical modeling of electron tunneling current from the inversion layer of ultra-thin-oxide nmosfet's", ieee electron device lett., vol. 18, pp. 209-211, 1997.
[2] j. g.
simmons, "richardson-schottky effect in solids", phys. rev. lett., vol. 15, pp. 967-968, 1965.
[3] k. f. schuegraf and c. hu, "hole injection sio2 breakdown model for very low voltage lifetime extrapolation", ieee trans. on electron devices, vol. 41, no. 5, pp. 761-767, 1994.
[4] n.f. mott and w.d. twose, "the theory of impurity conduction", adv. phys., vol. 10, pp. 107-163, 1961.
[5] a.i. chou, k. lai, k. kumar, p. chowdhury and j.c. lee, "modeling of stress-induced leakage current in ultrathin oxides with the trap-assisted tunneling mechanism", appl. phys. lett., vol. 70, pp. 3407-3409, 1997.
[6] s. fleisher, p.t. lai and y.c. cheng, "simplified closed-form trap-assisted tunneling model applied to nitrided oxide dielectric capacitors", j. appl. phys., vol. 75, pp. 5711-5715, 1992.
[7] m.p. houng and y.h. wang, "current transport mechanism in trapped oxides: a generalized trap-assisted tunneling model", j. appl. phys., vol. 86, pp. 1488-1452, 1999.
[8] a. cuadras, b. garido, j.r. morante and l. fonseca, "leakage currents and dielectric breakdown of si1−x−ygexcy thermal oxides", microelectron. reliab., vol. 48, pp. 1635-1640, 2008.
[9] m. houssa, m. tuominen, m. naili, v. afanas'ev, a. stesmans, s. haukka and m.m. heyns, "trap-assisted tunneling in high permittivity gate dielectric stacks", j. appl. phys., vol. 87, pp. 8615-8620, 2000.
[10] d.m. sathaiya and s. karmalkar, "thermionic trap-assisted tunneling model and its application to leakage current in nitrided oxides and algan/gan high electron mobility transistors", j. appl. phys., vol. 99, 093701, 2006.
[11] d.m. sathaiya and s. karmalkar, "a closed-form model for thermionic trap-assisted tunneling", ieee trans. on electron devices, vol. 55, pp. 557-564, 2008.
[12] j. frenkel, "on pre-breakdown phenomena in insulators and electronic semi-conductors", phys. rev., vol. 54, pp. 647-648, 1938.
[13] j. g. simmons, "poole-frenkel effect and schottky effect in metal-insulator-metal systems", phys. rev., vol. 155, pp.
657-660, 1967.
[14] r. yeargan and h. l. taylor, "the poole-frenkel effect with compensation present", j. appl. phys., vol. 39, pp. 5600-5604, 1968.
[15] d. mark and t. e. hartman, "on distinguishing between the schottky and poole-frenkel effects in insulators", j. appl. phys., vol. 39, pp. 2163-2164, 1968.
[16] w. k. choi and c. h. ling, "analysis of the variation in the field-dependent behavior of thermally oxidized tantalum oxide films", j. appl. phys., vol. 75, pp. 3987-3990, 1994.
[17] s.m. sze, physics of semiconductor devices, wiley, new york, 1969. p. 812.
[18] r.l. angle and h.e. talley, "electrical and charge storage characteristics of the tantalum oxide-silicon dioxide device", ieee trans. electron devices, vol. 25, pp. 1277-1283, 1978.
[19] c. chaneliere, s. four, j.l. autran and r.a.b. devine, "comparison between the properties of amorphous and crystalline ta2o5 thin films deposited on si", microelectron. reliab., vol. 39, pp. 261-268, 1999.
[20] j.g. simmons, "conduction in thin dielectric films", j. phys. d: appl. phys., vol. 4, pp. 613-657, 1971.
[21] f.c. chiu, "a review on conduction mechanisms in dielectric films", advances in materials science and engineering, vol. 2014, art. no. 578168 (18p.), 2014.
[22] k.c. kao, dielectric phenomena in solids. san diego. elsevier academic press, 2004, p. 447.
[23] w.r. harrell and c. gopalakrishnan, "implications of advanced modeling on the observation of poole-frenkel effect saturation", thin solid films, vol. 405, pp. 205-217, 2002.
[24] r.g. southwick, j. reed, c. buu, r. butlera and g. bersuker, "limitations of poole-frenkel conduction in bilayer hfo2/sio2 mos devices", ieee trans. device and mater. reliab., vol. 10, pp. 201-207, 2010.
[25] r. ongaro and a. pillonnet, "poole-frenkel (pf) effect high field saturation", revue phys. appl., vol. 24, pp. 1085-1095, 1989.
[26] m. ieda, g. sawa and s. kato, "a consideration of poole frenkel effect on electric conduction in insulators", j. appl. phys., vol. 42, pp.
3737-3740, 1971.
[27] j.l. hartke, "the three-dimensional poole-frenkel effect", j. appl. phys., vol. 39, pp. 4871-4873, 1968.
[28] r. ongaro, and a. pillonnet, "generalized poole frenkel (pf) effect with donors distributed in energy", revue phys. appl., vol. 24, pp. 1097-1110, 1989.
[29] b. de salvo, g. ghibaudo, g. pananakakis, b. guillaumot and g. reimbold, "a general bulk-limited transport analysis of a 10 nm thick oxide stress-induced leakage current", solid-state electron., vol. 44, pp. 895-903, 2000.
[30] a. pillonnet and r. orlando, revue phys. appl., vol. 25, pp. 229-242, 1990.
[31] p. mark and w. helfrich, "space charge limited currents in organic crystals", j. appl. phys., vol. 33, pp. 205-215, 1962.
[32] j. s. bonham, "sclc theory for a gaussian trap distribution", aust. j. chem., vol. 26, pp. 927-939, 1978.
[33] a. rose, "recombination processes in insulators and semiconductors", phys. rev., vol. 97, pp. 322-333, 1955.
[34] c. chaneliere, j.l. autran, r.a.b. devine, and b. balland, "tantalum pentoxide (ta2o5) thin films for advanced dielectric applications", mater. sci. eng., vol. 22, pp. 269-322, 1998.
[35] m. lenzlinger and e. snow, "fowler-nordheim tunneling into thermally grown sio2", j. appl. phys., vol. 40, pp. 278-283, 1969.
[36] m lemberger, a paskaleva, s zürcher, a.j bauer, l frey and h ryssel, "electrical characterization and reliability aspects of zirconium silicate films obtained from novel mocvd precursors", microelectron. eng., vol. 72, pp. 315-320, 2004.
[37] j. robertson, "electronic structure and band offsets of high-dielectric-constant gate oxides", mrs bull., vol. 27, pp. 217-221, 2002.
[38] s. kalpat, h.h. tseng, m. ramon, m. moosa, d. tekleab, ph.j. tobin, d.c. gilmer, r.i. hegde, c. capasso, c. tracy and b.e. white jr, "bti characteristics and mechanisms of metal gated hfo2 films with enhanced interface/bulk process treatments", ieee trans. device mater.
reliab., vol. 5, pp. 26-35, 2005.
[39] m. aoulaiche, m. houssa, r. degraeve, g. groeseneken, s. de gendt, and m. m. heyns, "polarity dependence of bias temperature instabilities in hfxsi1-xon/tan gate stacks", in proceedings of the 35th european solid-state device research conference essderc, grenoble, france, 2005, ieee, pp. 197-200.
[40] a. paskaleva, m. lemberger, a.j. bauer and l. frey, "implication of oxygen vacancies on current conduction mechanisms in tin/zr1-x alxo2/tin mim structures", j. appl. phys., vol. 109, art. no. 076101 (3p), 2011.
[41] w. weinreich, r. reiche, m. lemberger, g. jegert, j. mueller, l. wilde, s. teichert, j. heitmann, e. erben, l. oberbeck, u. schroeder, a. j. bauer and h. ryssel, "impact of interface variations on j-v and c-v polarity asymmetry of mim capacitors with amorphous and crystalline zr(1-x)alxo2 films", microelectron. eng., vol. 86, pp. 1826-1829, 2009.
[42] a. paskaleva, m. lemberger, a. j. bauer, w. weinreich, j. heitmann, e. erben, u. schröder, and l. oberbeck, "influence of the amorphous/crystalline phase of zr1−x alxo2 high-k layers on the capacitance performance of metal insulator metal stacks", j. appl. phys., vol. 106, art. no. 054107, 2009.
[43] j. robertson, k. xiong, and b. falabretti, "point defects in zro2 high-κ gate oxide", ieee trans. device mater. reliab., vol. 5, pp. 84-89, 2005.
[44] m. tapajna, a. paskaleva, e. atanassova, e. dobrocka, k. husekova and k. frohlich, "gate oxide thickness dependence of the leakage current mechanism in ru/ta2o5/sion/si structures", semicond. sci. technol., vol. 25, art. no. 075007, 2010.
[45] h. sawada and k. kawakami, "electronic structure of oxygen vacancy in ta2o5", j. appl. phys., vol. 86, pp. 956-959, 1999.
[46] w. s. lau, l.l. leong, t. han and n.p. sandler, "detection of oxygen vacancy defect states in capacitors with ultrathin ta2o5 films by zero-bias thermally stimulated current spectroscopy", appl. phys. lett., vol. 83, pp. 2835-2837, 2003.
[47] e. atanassova, d.
spassov, a. paskaleva, m. georgieva and j. koprinarova, ―electrical characteristics of tidoped ta2o5 stacked capacitors‖, thin solid films, vol. 516, pp. 8684-8692, 2008. [48] d spassov, e atanassova, a paskaleva, n novkovski and a skeparovski, ―electrical behaviour of ti-doped ta2o5 on n2oand nh3-nitrided si‖, semicond. sci. technol., vol. 24, art.no 075024 (10pp.), 2009. [49] a. paskaleva and e. atanassova, ―evidence for a conduction through shallow traps in hf-doped ta2o5‖, mater. sci. semicond. process., vol. 13, pp. 349-55, 2010. [50] a skeparovski1, n novkovski1, e atanassova, a paskaleva and v k lazarov ―effect of al gate on the electrical behaviour of al-doped ta2o5 stacks‖, j. phys. d: appl. phys., vol. 44 art. no. 235103 (10pp), 2011. [51] d. spassov, e. atanassova and a. paskaleva, ―lightly al-doped ta2o5: electrical properties and mechanisms of conductivity‖, microelectron. reliab., vol. 51, pp. 2102-2109, 2011. [52] a. paskaleva, m. rommel, a. hutzler, d. spassov, and a. j. bauer, ―tailoring the electrical properties of hfo2 mos-devices by aluminum doping‖, acs appl. mater. interfaces, vol. 7, pp. 17032-17043, 2015. [53] w.j. zhu, t.p. ma, t. tamagawa, j. kim and y. di, ―current transport in metal/hafnium oxide/silicon structure‖, ieee electron device lett., vol. 23, pp. 97-99, 2002. 548 a. paskaleva, d. spassov, d. dankovic [54] v.v afanas‘ev, a. stesmans, l. pantisano, s. cimino, c. adelmann, l. goux, y.y chen, j.a. kittl, d. wouters and m. jurczak, ―tinx/hfo2 interface dipole induced by oxygen scavenging‖, appl. phys. lett., vol. 98, art. no. 132901 (3pp.), 2011. [55] j.l. gavartin, d. mu oz ramo, a.l. shluger, g. bersuker and b.h. lee, ―negative oxygen vacancies in hfo2 as charge traps in high-k stacks‖, appl. phys. lett., vol. 89, art. no. 082908 (3pp.), 2006. [56] c. mannequin, p. gonon, c. vall e, l. latu-romain, a. bsiesy, h. grampeix, a. sala n and v. jousseaume, ―stress-induced leakage current and trap generation in hfo2 thin films‖, j. appl. 
phys. vol. 112, art. no. 074103 (9pp.), 2012. [57] m.a. lampert and p. mark, current injection in solids. new york and london, academic press, 1970. [58] t.j. park, j.h. kim, j.h. jang, c.k. lee, k.d. na, s.y. lee, h.s. jung, m. kim, s. han and c.s. hwang, ―reduction of electrical defects in atomic layer deposited hfo2 films by al doping‖, chem. mater., vol. 22, pp. 4175-4184, 2010. [59] g. molas, m. bocquet, j.buckley, h. grampeix, m. g ly, j.p. colonna, c. licitra, n. rochat, t. veyront, x. garros, f. martin, p. brianceau, v. vidal, c. bongiorno, s. lombardo, b. de salvo and s. deleonibus, ―investigation of hafnium-aluminate alloys in view of integration as interpoly dielectrics of future flash memories‖, solid state electron., vol. 51, pp. 1540-1546, 2007. [60] a. paskaleva, a. j. bauer, m. lemberger, and s. z rcher, ―different current conduction mechanisms through thin high-k hfxtiysizo films due to the varying hf to ti ratio‖, j. appl. phys., vol. 95, pp. 55835590, 2004. [61] a. paskaleva, a. j. bauer, and m. lemberger, ―an asymmetry of conduction mechanisms and charge trapping in thin high-k hfxtiysizo films‖, j. appl. phys., vol. 98, art. no 053707, 2005. [62] m. lemberger, a. paskaleva, s. z rcher, a. j. bauer, l. frey, and h. ryssel, ―electrical properties of hafnium silicate films obtained from a single-source mocvd precursor‖, microelectron. reliab., vol. 45, pp. 819-822, 2005. [63] d. muñoz ramo, a. l. shluger, and g. bersuker, ―ab initio study of charge trapping and dielectric properties of ti-doped hfo2‖, phys. rev. b, vol. 79, art. no 035306, 2009. [64] a. paskaleva, m. lemberger, e. atanassova, and a. j. bauer, "traps and trapping phenomena and their implementations on electrical behavior of high-k capacitor stack", j. vac. sci. technol., vol. 29, art. no. 01aa03 (10pp), 2011. [65] g. a. niklasson and k. brantervik, ―analysis of current‐voltage characteristics of metal‐insulator composite films‖, j. appl. phys., vol. 59, pp. 980-982, 1986. [66] r.m. 
fleming, d.v. lang, c.d.w. jones, m.l. steigerwald, d.w. murphy, g.b. alers, y.-h. wong, r.b. van dover, j.r. kwo, and a.m. sergent, ―defect dominated charge transport in amorphous ta2o5 thin films‖, j. appl. phys., vol. 88, pp. 850-862 2000. [67] d. spassov, e. atanassova and d. virovska, ―electrical characteristics of ta2o5 based capacitors with different gate electrodes‖, appl. phys. a, vol. 82, pp. 55-62, 2006. [68] e. atanassova, d. spassov and a. paskaleva, ―metal gates and gate-deposition-induced defects in ta2o5 stack capacitors‖, microelectron. reliab., vol. 47, pp. 2088–2093, 2007. [69] l. michalas, m. koutsoureli, e. papandreou, a. gantis, g. papaioannou, ―a mim capacitor study of dielectric charging for rf mems capacitive switches‖, facta universitatis, series: electronics and energetics, vol. 28, pp. 113-122, 2015. [70] n. novkovski, ―physical modeling of electrical and dielectric properties of high-k ta2o5 based mos capacitors on silicon‖, facta universitatis, series: electronics and energetics, vol. 27, pp. 259-73, 2014. facta universitatis series: electronics and energetics vol. 27, n o 3, september 2014, pp. 339 357 doi: 10.2298/fuee1403339r harnessing cloud computing infrastructure for e-learning services  božidar radenković, marijana despotović-zrakić, zorica bogdanović, vladimir vujin, dušan barać faculty of organizational sciences, university of belgrade, serbia abstract. this paper introduces an innovative model for harnessing cloud computing infrastructure within an e-learning ecosystem. the main goal was to design a scalable, reliable and secure it environment that provides a plethora of e-learning services and seamless integration of the heterogeneous e-learning components through iaas, paas and saas cloud service models. the e-learning services are tailored to foster courses for it engineers in the areas of mobile technologies, social computing, internet of things and big data. 
the model was implemented and evaluated in the e-learning ecosystem of the e-business lab, university of belgrade.

key words: cloud computing, e-learning services

received march 12, 2014
corresponding author: božidar radenković, faculty of organizational sciences, university of belgrade, serbia (e-mail: boza@elab.rs)

1. introduction

nowadays, building an e-learning ecosystem is quite a cumbersome endeavour, as it implies the integration of various technologies, platforms, and services. the pervasiveness of internet technologies transforms e-learning ecosystems into dynamic environments that enhance learning processes, leverage collaboration, enable kpi management and allow the use of a variety of resources, software services and applications [1]. the usage of e-learning technologies in university and lifelong education has increased significantly in the past years; however, various problems have arisen. one of the most critical issues is the underlying infrastructure. accordingly, innovative models and approaches are required in designing the infrastructure for e-learning systems.

in this paper, a model for harnessing cloud computing infrastructure within an e-learning ecosystem is introduced. the main goal was to design a scalable, reliable and secure it environment that enables the provision of a variety of e-learning services and seamless integration of the heterogeneous e-learning components. the model should support e-learning services delivered as infrastructure, network, platform, and software as a service. further, the results of the paper should raise awareness of the possibilities that cloud computing gives to e-learning, make an impact on the development of advanced services for e-learning, and encourage the harnessing of cloud computing in implementing the cloud infrastructure within educational institutions.
the research context is focused on it infrastructure for educational institutions that educate it engineers in the fields of emerging technologies, such as cloud computing, mobile technologies, internet of things, context-aware computing, ubiquitous computing, social computing, big data, and virtual reality. the implementation was conducted in the e-learning processes of the e-business lab, faculty of organizational sciences, university of belgrade.

2. cloud computing infrastructure for e-learning

research and case studies have pointed out that the most common approaches to harnessing cloud computing within universities are private and public clouds [2][3][4]. a private cloud model enables educational institutions to have complete control of identity management, services, data security, applications, and resources [5]. specifically, when courses are related to computer science, students can be provided with an appropriate environment for application development and with sophisticated software tools. recently, the number of private cloud based solutions within e-learning systems has significantly increased, and many examples can be found in the literature [6][7][8].

one of the main challenges in a cloud computing environment is designing and managing a network infrastructure that can effectively support all cloud computing models [9] and provide adequate resources for teaching and research groups at universities. two disruptive technologies that can bring a new paradigm to university clouds are multi-tenancy virtualized clusters and software defined networking. the concept of multi-tenancy implies that several tenants can use the same cloud infrastructure and share computing, storage, and network resources. in this scenario, each tenant's data and resources are isolated and remain invisible to other tenants. tenants in an e-learning system can be teachers, students, and administrators, but at the same time tenants could be e-learning services and applications.
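the isolation property described above can be illustrated with a minimal sketch in python. it is not part of the implemented system: the class and method names (`Tenant`, `SharedCloud`, `list_resources`) and the sample tenants are hypothetical, chosen only to show that tenants share one infrastructure while each tenant can see only its own resources.

```python
from dataclasses import dataclass, field


@dataclass
class Tenant:
    """A tenant: a teacher, student, administrator, or an e-learning service."""
    name: str
    kind: str
    resources: dict = field(default_factory=dict)


class SharedCloud:
    """Hypothetical registry: one shared infrastructure, per-tenant visibility."""

    def __init__(self):
        self._tenants = {}

    def add_tenant(self, name, kind):
        if name in self._tenants:
            raise ValueError(f"tenant {name!r} already exists")
        self._tenants[name] = Tenant(name, kind)

    def put(self, tenant, key, value):
        # Resources are always stored under the owning tenant.
        self._tenants[tenant].resources[key] = value

    def list_resources(self, tenant):
        # A tenant can only enumerate its own resources; no call in this
        # class returns another tenant's data.
        return dict(self._tenants[tenant].resources)


cloud = SharedCloud()
cloud.add_tenant("teacher-iot-course", "teacher")
cloud.add_tenant("moodle", "service")  # a service can be a tenant too
cloud.put("teacher-iot-course", "vm-1", {"vcpus": 2, "ram_gb": 4})
cloud.put("moodle", "db-volume", {"size_gb": 100})

print(cloud.list_resources("moodle"))
```

a production cloud platform enforces the same rule with access control rather than a dictionary; the sketch only makes the visibility rule concrete.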
network-as-a-service (naas) is a new cloud computing model in which tenants have access to additional computing resources collocated with switches and routers [10]. tenants can use naas to implement custom forwarding decisions based on application needs. this enables the design of efficient in-network services, such as data aggregation, stream processing, caching and redundancy elimination protocols, that are application-specific, as opposed to traditional application-agnostic network services. naas is realized using the concepts of software defined networks. software defined networks (sdn) allow network management through abstractions of the lower network levels [11]. they are based on the principle of a clear distinction among the management, service, control and forwarding network layers [12].

the aspects of designing a private cloud within an e-learning system differ significantly from those in the business sector. while in the business sector the focus is on costs and security, e-learning systems emphasize other aspects. flexibility in service design means that the it infrastructure should be expandable and flexible, so as to support teaching courses in the fields of emerging trends in it. e-learning ecosystems require efficiency in resource usage and elastic provisioning and release of e-learning services, according to demand. further, teachers should be enabled to define the cloud computing services that will be offered to students according to the specifics of each course. in addition, the provision of self-service should be facilitated in order to allow access to information and applications at any time from any location.
the design and implementation of cloud infrastructure and services for e-learning can be realized through the following steps:
1) conceptual design with respect to educational needs: this includes activities such as defining a list of e-learning services and the required service models, and defining pedagogical aspects and expected workloads;
2) designing network and cloud infrastructure: several components should be designed, namely network infrastructure, cloud infrastructure, and management and monitoring services;
3) designing and deploying cloud services: deployment of e-learning services on the implemented infrastructure and integration with the learning management system;
4) testing, tuning and evaluation: testing the infrastructure, tuning the performance, and evaluating students' and teachers' results and satisfaction.

3. design and implementation

3.1. research context

the e-business lab, university of belgrade, organizes e-learning courses using a ubiquitous learning concept. more than 1000 students are engaged in around 30 undergraduate and postgraduate courses in the fields of information technologies. the e-learning system is based on the moodle learning management system (lms). some of the courses include internet technologies, mobile business, internet of things, cloud infrastructures and services, computer simulation and virtual reality, and others. the teaching process in each semester has specific software requirements for laboratory exercises and practical projects. most e-learning resources for each course are deployed for a specific course assignment. educational content is diverse and grows rapidly in volume, requiring scalable storage capacity. specifically, requests for educational content follow a highly dynamic pattern. these issues affect resource utilization to a great extent. during the learning process a large amount of teaching material is generated, which puts further strain on the available resources.
one of the biggest problems in the implementation of an it infrastructure is competitive access to the shared resources in the higher education institution [13]. in addition to scalability, the efficient use of the existing resources represents another problem.

3.2. conceptual design

the conceptual model of the e-learning services within a private cloud of an educational institution is shown in fig. 1. infrastructural services can be grouped into:
1) core infrastructure services: critical services that need to be functional in order to enable efficient functioning of services for teaching and learning;
2) monitoring services: services for continuous monitoring of network, hardware and services. they are not necessary for the functioning of the whole system; however, they enable fast detection and correction of problems in the functioning of the infrastructure;
3) backup services: services for creating backup copies of the system. their continuous functioning enables fast recovery in case of disasters;
4) physically dependent services: a group of services related to specific physical or virtual hardware. these services enable functioning, access administration and replication of specific services related to that hardware.

fig. 1 types of services deployed on the implemented infrastructure

the e-learning service layer is designed to provide an environment for gaining practical knowledge in the fields of advanced information technologies. all of the cloud service models should be supported, i.e. infrastructure, platform, and software as a service.

infrastructure as a service (iaas) provides teachers with an opportunity to design specific infrastructure for each course.
this model has been widely used in many educational institutions [14][15], mainly for providing students with virtual machines, where each machine contains all the necessary software and teaching materials for a course. further, this model can effectively be used to provide students with an it environment tailored for their own projects in the fields of enterprise networking, big data, distributed simulation, etc.

platform as a service (paas) gives students an environment for developing software as a service solutions in different areas and in different programming languages. also, the paas model is suitable for courses where the emphasis is on software development while the hardware layer is highly abstracted [16]. for example, software for mobile business or internet of things projects can effectively be taught using this model.

the software as a service (saas) model is highly suitable for courses where students are required to learn to use specific software. this model can also be used to provide teachers and students with services that support educational processes, such as learning management systems, student relationship management, project management, digital libraries, and many others.

table 1 gives examples of the possible uses of each cloud service model within courses related to specific areas of modern it. the areas of it have been selected according to the trends given in [17][18][19][20].
table 1 it course areas with iaas, paas and saas examples
each row gives the course area, followed by its examples in the column order iaas, paas, saas:
cloud computing: – virtualization – software defined networks – software defined data centres – cloud storage – cloud based simulation – development of cloud software (google app engine, windows azure, amazon) – communication apps – cloud storage apps – social computing apps – apps management – web hosting service
mobile technologies: – software defined radio networks – sms api – mobile application development – mobile agents – mobile cloud applications – mobile commerce apps – m-payment apps – mobile learning apps – mobile social apps
internet of things: – software defined wireless sensor networks – smart environments – api for accessing the sensor data – api for context-aware applications – api for wearable computing applications – software for smart device management
big data: – environment for hadoop projects – environment for mongodb projects – map-reduce api – data analysis – visualization
it management: – cloud management – salesforce paas – heroku paas – project management software – crm software
computer simulation and virtual reality: – resources for simulation execution – environment for rendering – api for developing 3d models – web simulation software – 3d modelling software tools

access to all services is managed using a single sign-on concept. the single sign-on (sso) concept can be used to solve many problems related to multiple credentials for different applications. sso is a mechanism that uses a single action of authentication to permit an authorized user to access all related but independent software systems or applications without being prompted to log in again at each of them during a particular session [21]. for instance, when a student accesses an e-learning course web page, they are redirected to the home page of the e-learning portal, which requires authentication (fig. 2).
the e-learning portal component for identity management checks the credentials and sends a message to the learning management system. the student gains access to the e-learning resources that their permissions within the course allow.

fig. 2 architecture of sso

3.3. designing network and cloud infrastructure

the hierarchical network model has proven to be a good model for designing the network infrastructure for educational clouds [22]. this model divides the network into core, distribution, and access layers, where each layer performs a specific function. the advantages of using the hierarchical model are numerous. the principle of modularity simplifies the process of designing the network. testing and troubleshooting are easier, as is network maintenance. more details on the design of the hierarchical network model can be found in [22].

fig. 3 conceptual model of the cloud infrastructure

the hierarchical network model allows a part of the network infrastructure to be realized within a private cloud. the cloud network infrastructure includes parts of the distribution and access layers. the concept of virtual local area networks (vlan) is used, since within the cloud infrastructure it is important to isolate network traffic for each tenant. all servers are connected to access ports within the same vlan, so the communication between servers and virtual instances flows within the same network [23][24]. the usage of vlans provides a simple, flexible, and inexpensive way to administer the network; it provides segmentation of virtual servers and keeps network traffic between instances isolated [25]. in addition, multiple vlans can be created for each instance [26].
this approach allows flexibility in the development of cloud e-learning services, because students can be provided with network, infrastructure, development platforms, and software as a service, according to the specifics of each course. the organization of the network segment of the cloud infrastructure is shown in fig. 3.

after setting up the network infrastructure, the main issue is to enable efficient and comprehensive management of the network resources. considering the heterogeneity of the components of e-learning systems and the need for delivering resources and services on demand, the idea was to find and customize a tool that fulfils all the mentioned requirements. openstack is an open source cloud computing platform for public and private clouds, primarily focused on infrastructure as a service [27]. it provides the software, control panels, and apis required to orchestrate a cloud, including running instances, managing networks, and controlling access. the communication between openstack services is realized through public apis. the main features of openstack include: multi-tenancy, massive scalability, multiple network models, pluggable authentication, block storage support, control panel, and hypervisor support [28][29].

openstack networking adds a layer of virtualized network services. this gives tenants the capability to design their own virtual networks. neutron, the openstack project that provides networking as a service, allows users to create multiple tenant networks [30]. a single open vswitch bridge can be utilized by multiple tenant networks using different vlan ids, allowing instances to communicate with other instances across the environment. neutron has an api extension that allows administrators and tenants to create routers that connect to l2 networks. neutron uses the linux ip stack and iptables to perform l3 forwarding and nat.
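the sharing of one bridge by several tenant networks, each tagged with its own vlan id, can be sketched as follows. this is an illustrative python model, not neutron code and not the neutron api: the `VlanAllocator` class, the id range 100-102 and the tenant names are assumptions made for the example.

```python
class VlanAllocator:
    """Hands out one unused VLAN ID per tenant network on a shared bridge."""

    def __init__(self, first=100, last=199):
        # IEEE 802.1Q VLAN IDs usable for traffic lie in 1..4094.
        if not 1 <= first <= last <= 4094:
            raise ValueError("VLAN range must lie within 1..4094")
        self._free = list(range(first, last + 1))
        self._by_network = {}

    def allocate(self, tenant, network):
        key = (tenant, network)
        if key in self._by_network:  # idempotent for an existing network
            return self._by_network[key]
        if not self._free:
            raise RuntimeError("no free VLAN IDs left in the configured range")
        vlan_id = self._free.pop(0)
        self._by_network[key] = vlan_id
        return vlan_id

    def release(self, tenant, network):
        # Return the ID to the pool when a tenant network is deleted.
        vlan_id = self._by_network.pop((tenant, network))
        self._free.append(vlan_id)

    def same_segment(self, a, b):
        """Two tenant networks share L2 traffic only if their VLAN IDs match."""
        return self._by_network[a] == self._by_network[b]


alloc = VlanAllocator(first=100, last=102)
iot = alloc.allocate("course-iot", "lab-net")
bigdata = alloc.allocate("course-bigdata", "lab-net")
print(iot, bigdata)
```

both tenants may name their network `lab-net`; because the tags differ, frames of one network are not delivered to ports of the other, which is the isolation the text relies on.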
in order to support multiple routers with potentially overlapping ip addresses, linux network namespaces are used to provide isolated forwarding contexts. like the dhcp namespaces that exist for every network defined in neutron, each router has its own namespace with a name based on its uuid.

giving each application its own virtual infrastructure, including its own network, leads to simplified security configurations, and gives developers additional flexibility. application owners in this case do not use the shared infrastructure, but instead each has their own end-to-end infrastructure, isolated from the others, and with few or no interaction points. the flexibility and automation that come with virtualization of the network allow both network administrators and application owners to respond more quickly to educational needs. for instance, in periods of high usage of the e-learning system (during labs, tests, upload of assignments, etc.), an it administrator using openstack can easily reserve additional computing resources on the infrastructure and add new instances of the needed server. at the same time, when a new course opens, services and applications tailored to the course's requirements can be delivered in real time. in periods of reduced activity, some server instances can easily be removed. in that way, computing resources are set free and can be used for other purposes. using openstack reporting services, teachers and administrators monitor the performance of the systems, track the ways that students use the system, and notice the most important constraints and factors that influence the outcome of the teaching and learning processes. the integrated dedicated and virtual servers in the cloud infrastructure are shown in fig. 4.

fig. 4 integrated dedicated and virtual servers in cloud infrastructure

3.4. designing big data infrastructure

the concept of big data refers to large, diverse and distributed datasets that cannot be handled using conventional hardware and software infrastructures [31]. the term big data also refers to reliable, distributed and scalable infrastructure for managing large datasets [32]. the main problems in big data are related to gathering, storing, searching, analyzing, and visualizing large datasets. the cloud computing infrastructure is considered an adequate solution for these issues. the big data platform in the e-business lab is based on the hadoop framework, which provides storage for large datasets and large scale processing using the map-reduce approach. the big data platform is realized using the savanna plugin for openstack. savanna simplifies the installation and management of the hadoop stack and of its services. the implemented hadoop cluster is currently based on three hosts. the master host includes services for managing hadoop virtual functions. host nodes store data and contain client services, such as pig, hive, and hbase, which are used for data analysis. the big data infrastructure is shown in fig. 5.

fig. 5 big data infrastructure

3.5. deploying cloud services

services implemented on the cloud computing infrastructure are classified into two groups: services for students and services for teachers. some of the services are used by both students and teachers, such as the moodle learning management system, which provides a variety of services that facilitate the learning process: course management, learning resources and activities management, collaboration tools, etc. the following text provides a brief description of the most frequently used services for teachers and students that are implemented on the cloud infrastructure.

services for teachers include services that support teaching and administrative processes.
the intranet portal is a single access point to all the services and applications for teachers. the portal was designed using the wordpress content management system. authentication is based on ldap. based on users' requirements, the intranet portal communicates with the ldap server and provides users with appropriate services and information.

the service for project management enables planning, tracking and reporting about projects within the laboratory. the service provides features such as role management, reporting services, gantt chart and calendar, time tracking, wikis and forums per project, etc. the laboratory's staff takes part in several projects at the same time, and there is a strong need for a comprehensive project management solution [33][34]. the service is implemented using the redmine software solution.

teachers are also provided with an adapted pydio tool that enables complete file management via a web browser from any place. pydio (formerly ajaxplorer) is open source software that enables any server (on premise, nas, cloud iaas or paas) to be turned into a file sharing platform. it is an alternative to saas boxes and drives, with more control, safety and privacy. the advantages of this approach to file management are explained in [35][36].

a visual tool for easy management of events in the teaching, business, and collaboration processes within the laboratory is implemented using a wordpress plugin. the event management service is fully customizable and personalized according to the teachers' preferences. further, the calendar content is also provided on mobile devices. notifications from the calendar are sent via email and sms as reminders.

nurturing a good relationship with students before, during and after their studies has become a challenging issue [37].
in order to attract, engage, and retain students and to promote our activities, we have developed comprehensive student relationship management services, implemented using the sugarcrm software solution. sugarcrm was chosen because it is open source, customizable, and easy to use. the module for managing activities related to students contains information about each student, contact details, and additional comments (remarks, suggestions, topics of final thesis, notes, etc.). the goal is to raise relationship management between students and educational institutions to a higher level in a simple and quick way, through high availability and prompt exchange of information. an example of using the srm concept in e-learning management can be seen in [38].

the cloud infrastructure enables seamless design and development of e-learning services. numerous services for students have been implemented and evaluated within the private cloud of the e-business lab: services for adaptive e-learning [39], a platform for discrete event simulation [40][41], services for learning continuous simulation [42], services for mobile learning [43], services for mobile language learning [44], services for social network learning [45], and services for visualization [46]. hereinafter, we focus on services related to the usage of cloud resources, and on platforms for learning the concepts of mobile technologies, the internet of things and big data.

cloud resource reservation service

although in theory a cloud infrastructure makes elastic and seamless provisioning and release of e-learning cloud resources possible, there may be situations where students are required to book the needed cloud resources. fig. 6 shows the activity diagram for booking cloud resources.
a teacher defines virtual machines (vm) according to the needs of each course; students can view the available virtual machines for each course they attend and reserve it resources, which will be allocated to them. teachers' and students' actions are handled through the moodle interface, while the scheduling and allocation of cloud resources is handled by openstack. this approach enables educational institutions to use their cloud resources efficiently, even in cases when these resources are not as elastic as with commercial cloud providers.

fig. 6 cloud resource reservation service

internet of things, sms, and big data platforms

in order to conduct courses in the field of the internet of things, several platforms have been developed: iot, big data and sms platforms. each platform provides rest apis, which expose a set of functionalities that can be integrated in students' web and mobile applications. students' applications are hosted on web hosting services that are also provided to students as cloud services. fig. 7 illustrates the architecture of the designed system.

fig. 7 architecture of the sms, iot and big data platforms

the iot platform was developed in order to support education in the field of designing internet of things applications [47][48]. during the lectures students gain an insight into the hardware aspects of the internet of things; however, the focus of the course is on the software aspects. therefore, the platform abstracts the hardware layer, and consists of:
– iot hardware, which includes raspberry pi and arduino devices [49] connected with a number of different sensors and actuators;
– hardware controllers, which include pre-programmed features necessary for gathering sensor data and manipulating the actuators;
– web services available to students to integrate into their own applications.
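the three layers listed above (hardware, controllers, web services) can be illustrated with a small python sketch. it does not reproduce the platform's actual api, which is not given in the text: the resource paths `/sensors/temperature` and `/actuators/fan`, the class names and the simulated sensor are all hypothetical. the point is only that student code talks to the web-service layer and never to the device.

```python
import json
import random


class SimulatedTemperatureSensor:
    """Stand-in for a sensor attached to a Raspberry Pi or Arduino board."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)

    def read_celsius(self):
        # Simulated reading in the range 18.0..22.0 degrees Celsius.
        return round(20.0 + self._rng.uniform(-2.0, 2.0), 2)


class HardwareController:
    """Pre-programmed layer: gathers sensor data and drives actuators."""

    def __init__(self, sensor):
        self._sensor = sensor
        self.fan_on = False

    def sample(self):
        return {"sensor": "temperature", "unit": "C",
                "value": self._sensor.read_celsius()}

    def set_fan(self, on):
        self.fan_on = bool(on)


def handle_request(controller, method, path, body=None):
    """Tiny dispatcher playing the role of the web-service layer."""
    if method == "GET" and path == "/sensors/temperature":
        return 200, json.dumps(controller.sample())
    if method == "PUT" and path == "/actuators/fan":
        controller.set_fan(json.loads(body)["on"])
        return 200, json.dumps({"fan_on": controller.fan_on})
    return 404, json.dumps({"error": "unknown resource"})


controller = HardwareController(SimulatedTemperatureSensor(seed=42))
status, payload = handle_request(controller, "GET", "/sensors/temperature")
reading = json.loads(payload)
print(status, reading)
```

swapping the simulated sensor for a real device driver would leave `handle_request` and any client code unchanged, which is the benefit of abstracting the hardware layer.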
internet of things projects can, over time, generate large quantities of data. therefore, big data infrastructures are often used to store and search sensor data. students can use the big data platform to store and analyze their data. the architecture enables real time analytics that students can implement within their projects.

in order to support the development of sms based applications for courses in the fields of mobile technologies and the internet of things, an sms platform has been developed. the platform provides a rest api for the integration of sms services into students' web or mobile applications. an sms gateway device connected to the gsm network is used to send and receive sms messages. gnokii is used as an sms server. it runs on a virtual machine connected to the sms gateway device. the sms application has been developed to sort the received sms messages into separate users' inboxes and to generate api keys. these features are provided to students through a web portal. students can log in using their moodle credentials. after login, students can view their personal api key, which is used for authentication to the web service. using the rest api, students can integrate sms services into their applications.

4. evaluation and discussion

the evaluation of the performance and benefits that cloud computing brings to educational institutions may be as complex as designing and developing the cloud infrastructure itself. most of the models for evaluation of the cloud are created for business environments, where the main kpis are related to costs, roi, availability, and scalability. although these factors are important in educational contexts, factors related to pedagogical aspects and flexibility in the design of e-learning courses are more important.
students’ and teachers’ attitudes towards cloud services have been studied by many researchers [14][50], as have metrics for high performance systems within educational and research institutions [51]. however, a good method for an overall evaluation of the usefulness of educational clouds and of flexibility in designing e-learning courses cannot yet be found in the literature. therefore, in this paper we focus on three aspects:
1) analysis of data related to internal metrics for cloud provisioning, such as networking, storage and processors. this information should provide a base for better utilization of the available resources.
2) analysis of results that students achieve when using the developed sms and iot paas. these results should encourage teachers to use cloud services more frequently.
3) discussion of benefits that cloud services bring to e-learning environments. this analysis should contribute to a better incorporation of cloud services as integral elements of e-learning courses.

4.1. analysis of cloud performance

the performance of a cloud computing system is determined by the analysis of the characteristics involved in performing an efficient and reliable service that meets requirements under stated conditions and within the maximum limits of the system parameters [52]. in order to measure the performance of the developed cloud computing infrastructure, a performance measurement framework has been applied. the framework defines qualitative and quantitative concepts as well as basic measuring functions (fig. 8). to measure resource utilization, four functions are used: failure function, task function, time function and transmission function. the time function is based on different methods for gathering data related to cpu user utilization, job duration and response time. the measurement procedure is similar for all the functions. the measures are then grouped together, depending on the desired perspective: user, developer, or maintainer.
finally, the performance analysis is supported by an analysis model that helps interpret the results by relating them to the initial performance requirements. cloud performance has been measured through load, cpu, and memory usage. networking metrics include parameters related to type of traffic, latency, throughput, and reliability. cloud storage is usually evaluated through availability and reliability, which can be measured unambiguously, but also through security, simplicity and usability for users, which are harder to quantify. for processor usage in private educational clouds, we can choose processor utilization or idle time as metrics, since these might give us an insight into how to improve the usage of available resources. parameters should be measured at the levels of both physical and virtual resources, in order to enable better utilization of resources at all levels.

fig. 8 performance measurement model for cloud computing

table 2 shows parameters related to resource usage within the developed private cloud of the e-business lab. parameters are shown for three representative services: moodle, the hosting service (a virtual machine which hosts most of the mentioned teaching and learning services), and the students’ hosting service, where all students’ projects are hosted. maximum and average parameters are shown for the semester period and for the exam period.
table 2 performance metrics of the private educational cloud

service                     metric            during semester       during exam period
                                              max       average     max       average
students’ hosting service   cpu utilization   55%       7%          65%       32%
                            load              17.3m     145k        21.4m     612k
                            memory usage      7.9g      3.72g       8.1g      5.87g
moodle                      cpu utilization   20%       2%          55%       13%
                            load              8.61m     182k        36.69m    2.61m
                            memory usage      14g       6g          23.7g     11g
hosting service             cpu utilization   48%       5%          86%       28%
                            load              4.8m      124k        7.4m      419k
                            memory usage      5.5g      3g          8.8g      4g

the results in table 2 show that all the parameters are in the acceptable range. a detailed analysis showed that maximum values were reached in expected cases, e.g. during classes where around 60 students use the services concurrently, or during the exam when around 500 students access the moodle service simultaneously. high values of average memory usage are due to the reasonably high level of cached data.

4.2. analysis of students’ results

in order to evaluate the students’ satisfaction and results when using the developed iot and sms paas, research was conducted on a test sample of 8 students who attended a mobile business course at the master studies in e-business. the students were asked to participate in a workshop, where they were required to develop mobile applications using the iot and sms paas. the task was to develop a mobile application for managing the air conditioning within a smart house. students' applications were expected to read the sensor data and to turn the air conditioning system on and off, both through an android application and through sms. students were divided into four groups of two members each. students were provided with preconfigured iot hardware based on an arduino microcontroller connected to a temperature sensor. the arduino microcontroller was connected to a raspberry pi microcomputer, which was configured as an arduino controller and as a web server.
raspberry pi was also connected to an led used to simulate an air conditioning system. in addition, students were provided with the functionalities of the iot platform that enable students’ applications to read the sensor data and to turn the air conditioning on or off. functionalities of the sms platform included features to send and receive sms messages. the goal of the study was to examine the students’ impressions of the iot and sms platforms, as well as their attitudes towards this type of learning. two instruments were used in the research: a questionnaire for gathering data on students’ attitudes about the iot platform and learning of the iot, and a test for assessing the students’ knowledge. the questionnaire required students to grade the ease of use, quality of documentation, and impacts on knowledge and motivation. the average grade that students achieved on the test was 8.31, with a standard deviation of 0.66. the analysis of questions showed that students had the most problems with questions related to hardware issues, while the questions related to software aspects were answered correctly by most of the students. the scores that students assigned to each considered aspect of the iot and sms platforms are shown in table 3.

table 3 students’ impressions about iot and sms platforms

aspect                 high score=5   neutral score=3   low score=1   average
ease of use            6              2                 0             4.5
documentation          8              0                 0             5
impact on knowledge    8              0                 0             5
impact on motivation   6              1                 1             4.25

the data in table 3 show that students are generally satisfied with the designed platforms. all of the students thought that the documentation and the impact on knowledge were adequate. the average score for the impact on motivation was the lowest, although any result above 4 can be considered satisfactory. some of the students’ impressions were:
• "it is intriguing, interactive, and fun. it was very motivating."
• "i could gain useful and applicable knowledge."
• "i liked that i could see the results of the programming at once."
• "documentation should contain more code examples and video tutorials."

the main deficiency of the pilot class was the fact that only eight students participated. still, there are currently not many research papers concerning the aspects of teaching and learning the iot using paas, so some general remarks useful for future research and teaching can be given. the students indicated that help from the teachers was important for the completion of their tasks; this especially needs to be considered if future classes are held with a larger number of students. iot exercises that attempt to encompass both the hardware and software aspects of iot should probably be longer, as most students preferred 3-hour classes to classes half that length. the sms service that was provided for sending and receiving sms messages was very well liked, although it was not a core part of the iot. this is probably due to the fact that sms messages are a very familiar concept, which allowed the students to envision iot integration in a more realistic manner. the students should therefore be provided with as many other services as possible in order to widen the possibilities and increase motivation. students also felt that seeing the results of programming immediately was very motivating, so iot exercises should be designed to provide directly verifiable results, and preferably a direct impact on the physical world in some way.

4.3. discussion

the cloud infrastructure brings educational institutions commonly known benefits such as using services on demand, resource elasticity, and mechanisms for measuring the usage of resources by numerous parameters [15][46]. specific benefits are reflected in the fact that the cloud is suitable for cases when processes, applications, and data are largely independent, for example in different e-learning courses [6].
this can be effective in cases where cloud resources are needed for a course during the semester, but less needed when the semester is over. in this way, the same resources can be used for teaching different courses, without major infrastructure changes [15]. also, points of integration are well defined, and a lower level of security is acceptable. one of the main challenges in designing a private cloud for e-learning is that it is rather complex and requires specific expert knowledge [16]. highly skilled staff are needed to design, implement, and maintain the network, cloud, and e-learning services. in the described approach, it is assumed that these experts can be found within educational institutions that run study programs in the fields of information technologies and computer sciences. in addition, the initial costs required for the realization of a private cloud can be significant, and in some cases it may be less expensive to use services provided by a commercial cloud provider. however, private clouds provide high flexibility in designing e-learning courses [6][8]. the concept of network as a service makes it possible to organize and use the physical infrastructure through heterogeneous logical infrastructures, where each user, i.e. student, within each course can be provided with the required infrastructure [10]. finally, introducing new services within a private cloud is expected to have lower costs in comparison with public clouds.

5. conclusion

the main contributions of the paper are reflected in an innovative model of the cloud-based infrastructure that enables hosting of a corpus of e-learning services. using a hierarchical model in network design enables a high level of scalability and reliability. further, the model of infrastructure based on a private cloud is suitable for educational institutions. learning services are provided as infrastructure, platform or software on demand.
in addition, software defined networks provide multi-tenancy and network as a service. the model is especially suitable as a support to study programs in the fields of emerging information technologies, where applications and usage scenarios are new and often experimental. the model has been implemented and used within the e-learning system in the e-business lab, the faculty of organizational sciences. future research is directed toward mainstreaming the infrastructure for big data into other areas of application, and improving the platform as a service for projects in the area of the internet of things. also, modern pedagogical approaches, such as game based learning, learning in context, etc., that include advanced it services are going to be applied and evaluated. acknowledgement: the authors are thankful to the ministry of education, science and technological development, the republic of serbia, for financial support grant number 174031. references [1] j. c. marcos recio and j. alcolado santos, “a new educational paradigm: from e-learning to cloud learning (c-learning). knowledge in the cloud.,” in edulearn11: 3rd international conference on education and new learning technologies, 2011, pp. 4932–4941. [2] n. sultan, “cloud computing for education: a new dawn?,” int. j. inf. manage., vol. 30, no. 2, pp. 109– 116, 2010. [3] n. radhakrishnan, n. p. chelvan, and d. ramkumar, “utilization of cloud computing in e-learning systems,” in 2012 international conference on cloud computing technologies, applications and management (iccctam), 2012, pp. 208–213. [4] m. h. sqalli, m. al-saeedi, f. binbeshr, and m. siddiqui, “ucloud: a simulated hybrid cloud for a university environment,” in 2012 ieee 1st international conference on cloud networking (cloudnet), 2012. [5] v. vujin, “cloud computing in science and higher education,” management, vol. 16, no. 59, pp. 65–70, 2011. [6] m. despotović-zrakić, v. milutinović, and a. 
belić, eds., handbook of research on high performance and cloud computing in scientific research and education. igi global, 2014. [7] l. xu, l. liu, and c. wu, “services enhancing usage of large equipments in a private cloud,” in 2013 international conference on service sciences (icss 2013), 2013, pp. 118–122. [8] f. doelitzscher, a. sulistio, c. reich, h. kuijs, and d. wolf, “private cloud for collaboration and e-learning services: from iaas to saas,” computing, vol. 91, no. 1, pp. 23–42, jan. 2011. [9] b. radenković, m. despotović-zrakić, z. bogdanović, v. vujin, and d. barać, “designing network infrastructure for an e-learning cloud,” in the fourth international conference on e-learning (elearning2013), 2013. [10] p. wolf, c. matteo, m. peter, and a. l. pietzuch, “naas: network-as-a-service in the cloud.” [online]. available: http://research.microsoft.com/en-us/um/people/pcosta/papers/costa12naas.pdf. [11] s. paul, r. jain, m. samaka, and j. pan, “application delivery in multi-cloud environments using software defined networking,” comput. networks, 10.1016/j.comnet.2013.12.005, feb. 2014. [12] c. j. s. decusatis, a. carranza, and c. m. decusatis, “communication within clouds: open standards and proprietary protocols for data center networking,” ieee commun. mag., vol. 50, no. 9, pp. 26–33, sep. 2012. [13] v. vujin, a. milić, m. despotović-zrakić, b. jovanić, and b. radenković, “development and implementation of e-education model in a higher education institution,” sci. res. essays, vol. 7, no. 13, apr. 2012. [14] m. despotović-zrakić, k. simić, a. labus, a. milić, and b. jovanić, “scaffolding environment for e-learning through cloud computing,” educ. technol. soc., vol. 16, no. 3, pp. 301–314, 2013. [15] l. m. vaquero, “educloud: paas versus iaas cloud usage for an advanced computer science course,” ieee trans. educ., vol. 54, no. 4, pp. 590–598, nov. 2011. [16] m. n. ameen, h. a.
sanjay, and y. patel, “a service provisioning and managing framework for platform as a service in educational cloud,” in 2012 2nd ieee international conference on parallel, distributed and grid computing (pdgc), 2012, pp. 262–267. [17] gartner, “gartner identifies the top 10 strategic technology trends for 2014,” gartner, orlando, florida, 2013. . [18] ieee spectrum, “2014 top tech to watch,” 2014. [online]. available: http://spectrum.ieee.org/static/ 2014-top-tech-to-watch. [19] d. milojičić, “ieee computer society 2022 report,” in joint seminar on computer science and applied mathematics, 30.12.2013., 2013. [20] european commission, “horizon 2020 work programme 2014-2015,” 2013. [21] s. suriadi, e. foo, and a. jøsang, “a user-centric federated single sign-on system,” j. netw. comput. appl., vol. 32, no. 2, pp. 388–401, mar. 2009. [22] b. radenković, m. despotović-zrakić, z. bogdanović, v. vujin, and d. barać, “designing network infrastructure for an e-learning cloud,” in the fourth international conference on e-learning (elearning2013), 2013. [23] y.-w. e. sung, x. sun, s. g. rao, g. g. xie, and d. a. maltz, “towards systematic design of enterprise networks,” ieee/acm trans. netw., vol. 19, no. 3, pp. 695–708, jun. 2011. [24] z. xiyang and c. chuanqing, “research on vlan technology in l3 switch,” in 2009 third international symposium on intelligent information technology application, 2009, pp. 722–725. [25] x. wang, h. zhao, m. guan, c. guo, and w. jiyong, “research and implementation of vlan based on service,” in globecom ’03. ieee global telecommunications conference (ieee cat. no.03ch37489), vol. 5, pp. 2932–2936. [26] m. yu, j. rexford, x. sun, s. rao, and n. feamster, “a survey of virtual lan usage in campus networks,” ieee commun. mag., vol. 49, no. 7, pp. 98–103, jul. 2011. [27] m. bist, m. wariya, and a. 
agarwal, “comparing delta, open stack and xen cloud platforms: a survey on open source iaas,” in 2013 3rd ieee international advance computing conference (iacc), 2013, pp. 96–100. [28] a. corradi, m. fanelli, and l. foschini, “vm consolidation: a real case based on openstack cloud,” futur. gener. comput. syst., vol. 32, pp. 118–127, mar. 2014. [29] j. bernal bernabe, j. m. marin perez, j. m. alcaraz calero, f. j. garcia clemente, g. martinez perez, and a. f. gomez skarmeta, “semantic-aware multi-tenancy authorization system for cloud architectures,” futur. gener. comput. syst., vol. 32, pp. 154–167, mar. 2014. [30] “neutron’s developer documentation.” [online]. available: http://docs.openstack.org/developer/neutron/. [31] l. liu, “computing infrastructure for big data processing,” front. comput. sci., vol. 7, no. 2, pp. 165– 170, apr. 2013. [32] h. chen, r. h. l. chiang, and v. c. storey, “business intelligence and analytics: from big data to big impact,” mis q., vol. 36, no. 4, pp. 1165–1188, dec. 2012. [33] n. g. hall, “project management: recent developments and research opportunities,” j. syst. sci. syst. eng., vol. 21, no. 2, pp. 129–143, jun. 2012. [34] m. braglia and m. frosolini, “an integrated approach to implement project management information systems within the extended enterprise,” int. j. proj. manag., vol. 32, no. 1, pp. 18–29, jan. 2014. [35] v. stantchev, r. colomo-palacios, p. soto-acosta, and s. misra, “learning management systems and cloud file hosting services: a study on students’ acceptance,” comput. human behav., vol. 31, pp. 612– 619, feb. 2014. [36] d. dunford, “managed file transfer: the next stage for data in motion?,” netw. secur., vol. 2013, no. 9, pp. 12–15, sep. 2013. [37] m. b. piedade and m. y. santos, “student relationship management: concept, practice and technological support,” in 2008 ieee international engineering management conference, 2008, pp. 1–5. 
[38] b. radenkovic, m. despotovic-zrakic, z. bogdanovic, a. labus, and m. milutinovic, “providing services for student relationship management on cloud computing infrastructure,” in 2013 11th international conference on telecommunications in modern satellite, cable and broadcasting services (telsiks), 2013, pp. 385–388. [39] m. despotovic-zrakic, a. markovic, z. bogdanovic, d. barac, and s. krco, “providing adaptivity in moodle lms courses,” educ. technol. soc., vol. 15, no. 1, pp. 326–338, 2012. [40] m. despotović-zrakić, d. barać, z. bogdanović, b. jovanić, and b. radenković, “integration of web based environment for learning discrete simulation in e-learning system,” simul. model. pract. theory, vol. 27, pp. 17–30, sep. 2012. [41] m. despotović-zrakić, d. barać, z. bogdanović, b. jovanić, and b. radenković, “web-based environment for learning discrete event simulation,” j. univers. comput. sci., vol. 18, no. 10, pp. 1259– 1278, may 2012. [42] m. despotović-zrakić, d. barać, z. bogdanović, b. jovanić, and b. radenković, “software environment for learning continuous system simulation,” acta polytech. hungarica, vol. 11, no. 02, feb. 2014. [43] z. bogdanović, d. barać, b. jovanić, s. popović, and b. radenković, “evaluation of mobile assessment in a learning management system,” br. j. educ. technol., p. n/a–n/a, feb. 2013. [44] m. milutinović, a. labus, v. stojiljković, z. bogdanović, and m. despotović-zrakić, “designing a mobile language learning system based on lightweight learning objects,” multimed. tools appl., sep. 2013. [45] a. labus, k. simić, m. vulić, m. despotović-zrakić, and z. bogdanović, “an application of social media in elearning 2.0,” in proceedings of the 25th bled econference edependability: reliable and trustworthy estructures, eprocesses, eoperations and eservices for the future, 2012, pp. 557–572. [46] z. bogdanovic, m. despotovic-zrakic, m. milutinovic, m. andjelic, and s.
milinovic, “model for enhanced data management, visualization, and adaptation in e-learning,” manag. j. theory pract. manag., vol. 18, no. 69, pp. 5–14, dec. 2013. [47] j. carretero and j. d. garcía, “the internet of things: connecting the world,” pers. ubiquitous comput., vol. 18, no. 2, pp. 445–447, may 2013. [48] j. gubbi, r. buyya, s. marusic, and m. palaniswami, “internet of things (iot): a vision, architectural elements, and future directions,” futur. gener. comput. syst., vol. 29, no. 7, pp. 1645–1660, 2013. [49] a. d’ausilio, “arduino: a low-cost multipurpose lab equipment.,” behav. res. methods, vol. 44, no. 2, pp. 305–13, jun. 2012. [50] w.-w. wu, l. w. lan, and y.-t. lee, “factors hindering acceptance of using cloud services in university: a case study,” electron. libr., vol. 31, no. 1, pp. 84–98, 2013. [51] t. r. furlani, m. d. jones, s. m. gallo, a. e. bruno, c.-d. lu, a. ghadersohi, r. j. gentner, a. patra, r. l. deleon, g. von laszewski, f. wang, and a. zimmerman, “performance metrics and auditing framework using application kernels for high-performance computer systems,” concurr. comput. pract. exp., vol. 25, no. 7, pp. 918–931, may 2013. [52] l. bautista, a. abran, and a. april, “design of a performance measurement framework for cloud computing,” a j. softw. eng. appl., 2011. instruction facta universitatis series: electronics and energetics vol. 27, n o 3, september 2014, pp. i i editorial since my appointment as a new editor-in-chief of facta universitatis: series electronics and energetics, in october 2013, we have published the series of three special anniversary issues dedicated to the journal’s majestic age of a quarter of century. the published papers in special anniversary issues not only met the goals consistent with our focused aims, but have surpassed our expectation in quality and practical value. 
over the past year, we have received submissions and published papers from a very broad geographical area, making facta universitatis: series electronics and energetics a truly international journal. however, our job is not finished yet and the journal will be improved further. this is the fun part of this job; often it is the journey that is more enjoyable than the destination itself. this issue, the first in the series of forthcoming regular issues, is a collection of 5 invited papers by well-known experts in their specific areas, most of them members of the advisory board and editorial board, and 7 research papers by authors from the serbian academic community, who present and discuss state-of-the-art issues of practical interest in the field. on behalf of our editorial team, i promise to continue to develop and improve facta universitatis: series electronics and energetics in order to keep it at the forefront of science and technology.

ninoslav stojadinović
editor-in-chief

plane thermoelastic waves in infinite half-space caused

facta universitatis series: electronics and energetics vol. 31, no 3, september 2018, pp. 447-460 https://doi.org/10.2298/fuee1803447s

channel capacity of the macrodiversity sc system in the presence of kappa-mu fading and correlated slow gamma fading

marko m. smilić 1, branimir s. jakšić 2, dejan n. milić 3, stefan r. panić 1, petar ć. spalević 2
1 university of priština, faculty of natural sciences and mathematics, kosovska mitrovica, serbia
2 university of priština, faculty of technical sciences, kosovska mitrovica, serbia
3 university of niš, faculty of electronic engineering, niš, serbia

abstract. in this paper, a macrodiversity system consisting of two microdiversity sc (selection combiner) receivers and one macrodiversity sc receiver is analyzed. independent κ-μ fading and correlated slow gamma fading are present at the inputs to the microdiversity sc receivers.
for this system model, analytical expressions for the probability density of the signal at the output of the macrodiversity sc receiver and for the channel capacity at the output of the macrodiversity sc receiver are derived. the obtained results are presented graphically to show the impact of the rician κ factor, the shadowing severity of the channel c, the number of clusters μ and the correlation coefficient ρ on the probability density of the signal and on the channel capacity at the output of the macrodiversity system. based on the obtained results it is possible to analyze the real behavior of the macrodiversity system in the presence of κ-μ fading.

key words: joint probability density, channel capacity, macrodiversity sc receiver, correlation coefficient, rician κ factor.

1. introduction

radio signals generally propagate according to three mechanisms: reflection, diffraction, and scattering. as a result of these three mechanisms, radio propagation can be roughly characterized by three nearly independent phenomena: path loss variation with distance, slow log-normal shadowing, and fast multipath fading. each of these phenomena is caused by a different underlying physical principle and each must be accounted for when designing and evaluating the performance of a cellular system [1].

received october 23, 2017; received in revised form february 16, 2018
corresponding author: marko m. smilić, faculty of natural sciences and mathematics, university of pristina, lole ribara br. 29, 38220 kosovska mitrovica, serbia (e-mail: marko.smilic@pr.ac.rs)

fast fading is caused by the spreading of the signal in multiple directions. the interaction of the waves with objects located between the transmitter and the receiver (reflection, diffraction and dispersion) causes a large number of copies of the transmitted signal to arrive at the input of the receiver.
the environment through which the wave propagates can be linear or nonlinear. when the reflected waves are correlated with factor ρ, the environment is nonlinear [1], [2]. slow fading occurs due to the shadow effect, which can be produced by various objects between the transmitter and the receiver [3]. in most cases, the slow fading is correlated. the signal envelope varies due to the fast fading, and the power of the signal envelope varies due to the slow fading [4], [5]. the statistical behavior of signals in such systems can be described by different distributions: rayleigh, rician, nakagami-m, weibull or κ-μ [2], [6], [7]. the κ-μ distribution can be used to describe the variation of the signal envelope in linear environments where there is a dominant component. there are several clusters in the propagation environment, and the powers of the in-phase and quadrature components are equal. the κ-μ distribution has two parameters. the parameter κ is the rician factor, equal to the ratio of the power of the dominant component to the power of the scattered components [8]. the parameter μ is related to the number of clusters in the propagation environment. the κ-μ distribution is a general distribution from which other distributions can be obtained as special cases [9], [10]. a variety of diversity techniques are used to reduce the impact of fast fading and slow fading on system performance. in diversity techniques, multiple replicas of the same information signal are combined. the most commonly used diversity techniques are mrc (maximum ratio combining), egc (equal gain combining) and sc (selection combining) [1], [9]. the sc diversity receiver is easy to implement in practice because processing is done on only one diversity branch. the sc receiver uses the branch with the highest signal-to-noise ratio for further signal processing [11]. if the noise power is the same in all branches of the sc receiver, then the sc receiver selects the branch with the strongest signal [6].
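The selection rule described above can be illustrated with a minimal Python sketch. It is not the system analyzed in this paper: for simplicity it assumes independent Rayleigh-faded branches (so each branch SNR is exponentially distributed) rather than κ-μ fading, and the function names `selection_combine` and `simulate_sc` are hypothetical.

```python
import random


def selection_combine(branch_snrs):
    """Return (index, snr) of the branch with the highest instantaneous SNR."""
    best = max(range(len(branch_snrs)), key=lambda i: branch_snrs[i])
    return best, branch_snrs[best]


def simulate_sc(num_branches, mean_snr, trials, seed=0):
    """Estimate the average output SNR of an SC receiver by Monte Carlo.

    Assumption (not from the paper): branches are independent and Rayleigh
    faded, so each branch SNR is exponential with mean `mean_snr`.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        snrs = [rng.expovariate(1.0 / mean_snr) for _ in range(num_branches)]
        _, selected = selection_combine(snrs)
        total += selected
    return total / trials


if __name__ == "__main__":
    print(selection_combine([0.5, 2.0, 1.2]))
    print(simulate_sc(num_branches=2, mean_snr=1.0, trials=20000))
```

Under these assumptions the average output SNR for two branches should approach 1.5 times the mean branch SNR, the known result for selection combining over independent Rayleigh branches, which gives a quick sanity check of the simulation.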
the performance of the sc receiver is worse than that of mrc and egc receivers. for the sc receiver, however, it is relatively easy to determine the probability density and the cumulative probability of the signal at the receiver output. the most commonly used techniques are spatial diversity techniques, which are realized with multiple antennas placed at the receiver. spatial diversity increases the reliability of the system and the channel capacity without increasing the transmitter power or expanding the frequency range. several spatial diversity combining techniques can be used to reduce the influence of fading and co-channel interference on system performance. with regard to analytical methods, the channel capacity is expressed using the well-known meijer g-function [12]. it is also shown how changes in certain parameters affect the channel capacity. there are two types of channel capacity: the shannon capacity and the capacity with outage. the shannon capacity is the maximum data rate that can be sent over the radio channel with asymptotically small error probability; it is also called the ergodic capacity. the capacity with outage is the maximum data rate that can be transmitted over a channel with some outage probability, i.e. the percentage of data that cannot be received correctly due to deep fading [13], [14], [15], [16], [17]. in many papers [18], [19], [20], [21], statistical characteristics of the signal for macrodiversity systems are presented. for the system considered here, results have not been presented yet. based on the results obtained in this paper, it is possible to optimize the parameters of the wireless system and the emission power of the signal.
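As a rough numerical illustration of the Shannon (ergodic) capacity mentioned above, the sketch below estimates C = B·E[log2(1 + γ)] by Monte Carlo averaging over the instantaneous SNR γ. It does not reproduce the Meijer G-function expressions of this paper: the fading is assumed to be Rayleigh (exponentially distributed SNR) purely for simplicity, and the function names are hypothetical.

```python
import math
import random


def ergodic_capacity(mean_snr, bandwidth_hz=1.0, samples=50000, seed=0):
    """Monte Carlo estimate of B * E[log2(1 + snr)].

    Assumption (not from the paper): Rayleigh fading, i.e. the instantaneous
    SNR is exponentially distributed with mean `mean_snr` (linear scale).
    """
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(samples):
        snr = rng.expovariate(1.0 / mean_snr)
        acc += math.log2(1.0 + snr)
    return bandwidth_hz * acc / samples


def awgn_capacity(snr, bandwidth_hz=1.0):
    """Shannon capacity of a non-fading channel with the same average SNR."""
    return bandwidth_hz * math.log2(1.0 + snr)


if __name__ == "__main__":
    mean_snr = 10.0  # 10 dB expressed on a linear scale
    print("ergodic (fading) :", ergodic_capacity(mean_snr))
    print("awgn upper bound :", awgn_capacity(mean_snr))
```

By Jensen's inequality the fading estimate stays below the non-fading value at the same average SNR, which is why diversity techniques that reduce the fading variability can raise the capacity without additional transmit power.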
using the results obtained, it is possible to predict the behavior of various system implementations for various mobile transmission scenarios and in various propagation environments, which enables mobile system designers to make rational system solutions for the desired system performance. 2. system model in this paper we discuss the macrodiversity system with macrodiversity sc (selection combining) receiver and two microdiversity sc receivers. independent κ-μ fading and slow gamma fading are at the inputs of the microdiversity sc receivers. the slow fading is correlated. the correlation coefficient decreases with increase of the distance between the antennas. microdiversity sc receiver reduces the impact of fast fading on system performance, while macrodiversity sc receiver reduces the impact of slow fading on system performance. macro system that are discussed here can be used in a single cell of a cellular mobile radio system. microdiversity receivers are installed on the base stations serving to mobile users in a single cell. macrodiversity system uses signals from multiple base stations positioned in a single cell or two or more cells. the system which is discussed is shown in figure 1. fig. 1 macrodiversity system with one macrodiversity sc receiver and two microdiversity sc receivers. the signals at the input to the first sc microdiversity receiver are marked with x11 and x12, and with x1 at the output. the signals at the input to the other sc microdiversity receiver are marked with x21 and x22, and with x2 at the output. signal at the output of the macrodiversity system is marked with x. power of signals at the input to the microdiversity receivers are marked with ω1 and ω2. the signal at the output of the macrodiversity sc receiver x is equal to the signal at the output of that microdiversity sc receiver whose power is greater than the power of signal at the input of other microdiversity sc receiver [2]. 450 m. m. smilić, b. s. jakšić, d. n. milić, s. r. 
panić, p. ć. spalević

3. the probability density of the signal

the probability density of the κ-μ signal envelopes x1i at the inputs of the first microdiversity sc receiver is given by [22]:

$$ p_{x_{1i}}(x_{1i}) = \frac{2\mu(\kappa+1)^{\frac{\mu+1}{2}}}{\kappa^{\frac{\mu-1}{2}} e^{\kappa\mu}\,\Omega_1^{\frac{\mu+1}{2}}}\; x_{1i}^{\mu}\; e^{-\frac{\mu(\kappa+1)}{\Omega_1}x_{1i}^{2}}\; I_{\mu-1}\!\left(2\mu\sqrt{\frac{\kappa(\kappa+1)}{\Omega_1}}\,x_{1i}\right), \quad i=1,2 \tag{1} $$

the probability density of the κ-μ signal envelopes x2i at the inputs of the second microdiversity sc receiver is given by [22]:

$$ p_{x_{2i}}(x_{2i}) = \frac{2\mu(\kappa+1)^{\frac{\mu+1}{2}}}{\kappa^{\frac{\mu-1}{2}} e^{\kappa\mu}\,\Omega_2^{\frac{\mu+1}{2}}}\; x_{2i}^{\mu}\; e^{-\frac{\mu(\kappa+1)}{\Omega_2}x_{2i}^{2}}\; I_{\mu-1}\!\left(2\mu\sqrt{\frac{\kappa(\kappa+1)}{\Omega_2}}\,x_{2i}\right), \quad i=1,2 \tag{2} $$

the parameter μ represents the number of clusters through which the signal propagates, κ is the rician factor, Ω1 and Ω2 are the average signal powers at the first and second microdiversity sc receiver, respectively, and I_n(·) is the modified bessel function of the first kind and order n [23]. after expanding the bessel function into a series, the probability density of x1i becomes

$$ p_{x_{1i}}(x_{1i}) = \frac{2\mu(\kappa+1)^{\frac{\mu+1}{2}}}{\kappa^{\frac{\mu-1}{2}} e^{\kappa\mu}\,\Omega_1^{\frac{\mu+1}{2}}} \sum_{i_1=0}^{\infty} \left(\mu\sqrt{\frac{\kappa(\kappa+1)}{\Omega_1}}\right)^{2i_1+\mu-1} \frac{1}{i_1!\,\Gamma(i_1+\mu)}\; x_{1i}^{2i_1+2\mu-1}\; e^{-\frac{\mu(\kappa+1)}{\Omega_1}x_{1i}^{2}} \tag{3} $$

where Γ(·) denotes the gamma function, x1i are the signal envelopes at the inputs of the first microdiversity sc receiver and x1 is the signal envelope at its output [23]. in a similar way we get the probability density of the signals at the inputs of the second microdiversity sc receiver:

$$ p_{x_{2i}}(x_{2i}) = \frac{2\mu(\kappa+1)^{\frac{\mu+1}{2}}}{\kappa^{\frac{\mu-1}{2}} e^{\kappa\mu}\,\Omega_2^{\frac{\mu+1}{2}}} \sum_{i_1=0}^{\infty} \left(\mu\sqrt{\frac{\kappa(\kappa+1)}{\Omega_2}}\right)^{2i_1+\mu-1} \frac{1}{i_1!\,\Gamma(i_1+\mu)}\; x_{2i}^{2i_1+2\mu-1}\; e^{-\frac{\mu(\kappa+1)}{\Omega_2}x_{2i}^{2}} \tag{4} $$

the cumulative probability of x1i, i = 1, 2, is

$$ F_{x_{1i}}(x_{1i}) = \int_0^{x_{1i}} p_{x_{1i}}(t)\,dt = \frac{2\mu(\kappa+1)^{\frac{\mu+1}{2}}}{\kappa^{\frac{\mu-1}{2}} e^{\kappa\mu}\,\Omega_1^{\frac{\mu+1}{2}}} \sum_{i_2=0}^{\infty} \left(\mu\sqrt{\frac{\kappa(\kappa+1)}{\Omega_1}}\right)^{2i_2+\mu-1} \frac{1}{i_2!\,\Gamma(i_2+\mu)} \int_0^{x_{1i}} t^{2i_2+2\mu-1}\, e^{-\frac{\mu(\kappa+1)}{\Omega_1}t^{2}}\,dt \tag{5} $$

after solving the integral by the use of [23], we obtain the cumulative probability of the signal at the input of the first microdiversity sc receiver:

$$ F_{x_{1i}}(x_{1i}) = \frac{\mu(\kappa+1)^{\frac{\mu+1}{2}}}{\kappa^{\frac{\mu-1}{2}} e^{\kappa\mu}\,\Omega_1^{\frac{\mu+1}{2}}} \sum_{i_2=0}^{\infty} \left(\mu\sqrt{\frac{\kappa(\kappa+1)}{\Omega_1}}\right)^{2i_2+\mu-1} \frac{1}{i_2!\,\Gamma(i_2+\mu)} \left(\frac{\Omega_1}{\mu(\kappa+1)}\right)^{i_2+\mu} \gamma\!\left(i_2+\mu,\; \frac{\mu(\kappa+1)}{\Omega_1}x_{1i}^{2}\right) \tag{6} $$

where γ(·,·) represents the lower incomplete gamma function [23]. the cumulative probability of the signal x2i at the input of the second microdiversity sc receiver is obtained by the same procedure, with Ω2 in place of Ω1. the probability density of the signal at the output of the first microdiversity sc receiver is

$$ p_{x_1}(x_1) = p_{x_{11}}(x_1)F_{x_{12}}(x_1) + p_{x_{12}}(x_1)F_{x_{11}}(x_1) = 2\,p_{x_{11}}(x_1)F_{x_{12}}(x_1) \tag{7} $$

where p_{x1i} is given by (3) and F_{x1i} is given by (6). after substituting (3) and (6) into (7) we have

$$ p_{x_1}(x_1) = \frac{4\mu^{2}(\kappa+1)^{\mu+1}}{\kappa^{\mu-1} e^{2\kappa\mu}\,\Omega_1^{\mu+1}} \sum_{i_1=0}^{\infty} \left(\mu\sqrt{\frac{\kappa(\kappa+1)}{\Omega_1}}\right)^{2i_1+\mu-1} \frac{x_1^{2i_1+2\mu-1}}{i_1!\,\Gamma(i_1+\mu)}\, e^{-\frac{\mu(\kappa+1)}{\Omega_1}x_1^{2}} \sum_{i_2=0}^{\infty} \left(\mu\sqrt{\frac{\kappa(\kappa+1)}{\Omega_1}}\right)^{2i_2+\mu-1} \frac{1}{i_2!\,\Gamma(i_2+\mu)} \left(\frac{\Omega_1}{\mu(\kappa+1)}\right)^{i_2+\mu} \gamma\!\left(i_2+\mu,\; \frac{\mu(\kappa+1)}{\Omega_1}x_1^{2}\right) \tag{8} $$

the probability density of the signal at the output of the second microdiversity sc receiver is

$$ p_{x_2}(x_2) = \frac{4\mu^{2}(\kappa+1)^{\mu+1}}{\kappa^{\mu-1} e^{2\kappa\mu}\,\Omega_2^{\mu+1}} \sum_{i_1=0}^{\infty} \left(\mu\sqrt{\frac{\kappa(\kappa+1)}{\Omega_2}}\right)^{2i_1+\mu-1} \frac{x_2^{2i_1+2\mu-1}}{i_1!\,\Gamma(i_1+\mu)}\, e^{-\frac{\mu(\kappa+1)}{\Omega_2}x_2^{2}} \sum_{i_2=0}^{\infty} \left(\mu\sqrt{\frac{\kappa(\kappa+1)}{\Omega_2}}\right)^{2i_2+\mu-1} \frac{1}{i_2!\,\Gamma(i_2+\mu)} \left(\frac{\Omega_2}{\mu(\kappa+1)}\right)^{i_2+\mu} \gamma\!\left(i_2+\mu,\; \frac{\mu(\kappa+1)}{\Omega_2}x_2^{2}\right) \tag{9} $$

the probability density of the signal at the output of the macrodiversity sc receiver is equal to the probability density of the signal at the output of the microdiversity sc receiver whose input signal power is greater than the input signal power of the other microdiversity sc receiver [23]. based on this, the probability density of the signal at the output of the macrodiversity sc receiver is equal to

$$ p_x(x) = \int_0^{\infty} d\Omega_1 \int_0^{\Omega_1} d\Omega_2\; p_{x_1}(x/\Omega_1)\, p_{\Omega_1\Omega_2}(\Omega_1,\Omega_2) + \int_0^{\infty} d\Omega_2 \int_0^{\Omega_2} d\Omega_1\; p_{x_2}(x/\Omega_2)\, p_{\Omega_1\Omega_2}(\Omega_1,\Omega_2) = I_1 + I_2 \tag{10} $$

where p_{x1}(x/Ω1) and p_{x2}(x/Ω2) are given by (8) and (9), respectively. the joint probability density of the powers Ω1 and Ω2 is given by [6]:

$$ p_{\Omega_1\Omega_2}(\Omega_1,\Omega_2) = \frac{1}{\Gamma(c)(1-\rho^{2})\rho^{c-1}\Omega_0^{c+1}} \sum_{i_3=0}^{\infty} \left(\frac{\rho}{\Omega_0(1-\rho^{2})}\right)^{2i_3+c-1} \frac{1}{i_3!\,\Gamma(i_3+c)}\; \Omega_1^{i_3+c-1}\,\Omega_2^{i_3+c-1}\; e^{-\frac{\Omega_1+\Omega_2}{\Omega_0(1-\rho^{2})}} \tag{11} $$

where c is the shadowing severity, ρ is the correlation coefficient and Ω0 is the average power. the integral I1 is equal to

$$ I_1 = \int_0^{\infty} d\Omega_1 \int_0^{\Omega_1} d\Omega_2\; p_{x_1}(x/\Omega_1)\, p_{\Omega_1\Omega_2}(\Omega_1,\Omega_2) = \frac{4\mu^{2}(\kappa+1)^{\mu+1}}{\kappa^{\mu-1} e^{2\kappa\mu}} \sum_{i_1=0}^{\infty} \frac{\left(\mu\sqrt{\kappa(\kappa+1)}\right)^{2i_1+\mu-1}}{i_1!\,\Gamma(i_1+\mu)}\, x^{2i_1+2\mu-1} \sum_{i_2=0}^{\infty} \frac{\left(\mu\sqrt{\kappa(\kappa+1)}\right)^{2i_2+\mu-1}}{i_2!\,\Gamma(i_2+\mu)}\, \frac{1}{(\mu(\kappa+1))^{i_2+\mu}} \cdot \frac{1}{\Gamma(c)(1-\rho^{2})\rho^{c-1}\Omega_0^{c+1}} \sum_{i_3=0}^{\infty} \left(\frac{\rho}{\Omega_0(1-\rho^{2})}\right)^{2i_3+c-1} \frac{1}{i_3!\,\Gamma(i_3+c)} \int_0^{\infty} d\Omega_1\; \Omega_1^{\,i_3+c-1-i_1-\mu}\; \gamma\!\left(i_2+\mu,\; \frac{\mu(\kappa+1)x^{2}}{\Omega_1}\right) e^{-\frac{\mu(\kappa+1)x^{2}}{\Omega_1}-\frac{\Omega_1}{\Omega_0(1-\rho^{2})}} \int_0^{\Omega_1} d\Omega_2\; \Omega_2^{\,i_3+c-1}\; e^{-\frac{\Omega_2}{\Omega_0(1-\rho^{2})}} \tag{12} $$

using the method for calculating the integral I1 (appendix a, expressions (a1), (a2) and (a3)), the integral I2 is solved in the same way:

$$ I_2 = \int_0^{\infty} d\Omega_2 \int_0^{\Omega_2} d\Omega_1\; p_{x_2}(x/\Omega_2)\, p_{\Omega_1\Omega_2}(\Omega_1,\Omega_2) \tag{13} $$

4. channel capacity of macrodiversity

having obtained the probability density of the output signal, we can calculate the channel capacity at the output of the macrodiversity system shown in figure 1. the maximum data rate is achieved after the channel has experienced all possible fading states during a sufficiently long transmission time. it is expressed in bits per second as follows, where B denotes the channel bandwidth in hz [24], [25]:

$$ C = B \int_0^{\infty} \log_2(1+x)\, p_x(x)\, dx \tag{14} $$

substituting the probability density of the output signal into the expression for the channel capacity, with I1 and I2 evaluated as in appendix a, we get:

$$ \frac{C}{B} = \frac{16\mu^{2}(\kappa+1)^{\mu+1}}{\ln 2\;\kappa^{\mu-1} e^{2\kappa\mu}\,\Gamma(c)(1-\rho^{2})\rho^{c-1}\Omega_0^{c+1}} \sum_{i_1=0}^{\infty} \frac{\left(\mu\sqrt{\kappa(\kappa+1)}\right)^{2i_1+\mu-1}}{i_1!\,\Gamma(i_1+\mu)} \sum_{i_2=0}^{\infty} \frac{\left(\mu\sqrt{\kappa(\kappa+1)}\right)^{2i_2+\mu-1}}{i_2!\,\Gamma(i_2+\mu)\,(\mu(\kappa+1))^{i_2+\mu}}\, \frac{(i_2+\mu)!}{i_2+\mu} \sum_{i_3=0}^{\infty} \left(\frac{\rho}{\Omega_0(1-\rho^{2})}\right)^{2i_3+c-1} \frac{\left(\Omega_0(1-\rho^{2})\right)^{i_3+c}}{i_3!\,\Gamma(i_3+c)}\, \frac{(i_3+c)!}{i_3+c} \sum_{j_1=0}^{\infty} \frac{(\mu(\kappa+1))^{i_2+\mu+j_1}}{(i_2+\mu+j_1)!} \sum_{j_2=0}^{\infty} \frac{\left(\mu(\kappa+1)\,\Omega_0(1-\rho^{2})\right)^{\nu/2}}{\left(\Omega_0(1-\rho^{2})\right)^{i_3+c+j_2}(i_3+c+j_2)!} \int_0^{\infty} dx\; x^{r}\, \ln(1+x)\; K_{\nu}\!\left(4x\sqrt{\frac{\mu(\kappa+1)}{\Omega_0(1-\rho^{2})}}\right) \tag{15} $$

where K_ν(·) is the modified bessel function of the second kind and, for brevity, ν = 2i3 − i1 − i2 − j1 + j2 + 2c − 2μ and r = i1 + i2 + 2i3 + j1 + j2 + 2c + 2μ − 1. the leading constant follows from I2 = I1, which holds because (8) and (9) have the same form and (11) is symmetric in Ω1 and Ω2.

to simplify the solution of the integral in (15), first, all constants, i.e. all factors that do not depend on the integration variable, are placed in front of the integral.
secondly, we can show the logarithmic function through meijer g-function (appendix b, expression b1) as well as bessel function (appendix b, expressions b2 and b3), with the aim to get more convenient and simpler expression for resolution. the general form of our expression for the channel capacity would be:   0 ln(1 ) r v c r dxx x k x b    (16) where r represents a constant, or an expression in front of the integral, r represents the argument of degree variable by which the expression is solved and v represents the argument of bessel function. replacing (b5) into (b4), we get the expression for the solution of channel capacity at the output of the macrodiversity system:     1 2 1 2 3 2 3 1 1 1 2 2 1 2 1 2 1 0 01 1 2 22 2 1 2 1 2 0 0 3 3 30 3 0 3 1 0 2 ( 1) 1 1 2 ( ) ( 1) ( 1) 2 ! ( ) ! ( ) 1 1 1 1 (1 ) ! ( )( )(1 )( ( 1)) ( )! 1 ( )! ( z i i i ik i c i c c i j c k k k k k b i i i i k e i i c i cck i c i c j                                                        2 2 1 2 1 2 3 1 2 22 02 2 2 2 3 2 1 622 0 46 1 10 ( )!1 ( ( 1)) ( ( 1) ) ( )!(1 )) ( ,1 ), ( ,1 )4 ( 1) (2 ( 1) (1 )) | , , ( ,1 ), ( ,1 )(1 ) j i j j i i i j j c s v m t i k k x i i j l d l dk k g b b l c l c                                            (17) in table 1, the number of terms to be summed in order to achieve accuracy at the desired significant digit is depicted. as we can see from the table, how increases the correlation coefficient increases the number of terms to be summed in order to achieve accuracy at the 4th significant digit. for higher values of parameter c, smaller number of terms to achieve accuracy at the 4th significant digit is required [2], [26]. 454 m. m. smilić, b. s. jakšić, d. n. milić, s. r. panić, p. ć. 
spalević

table 1 number of terms that need to be summed in the expression for the cumulative distribution function to achieve accuracy at the significant digit given in brackets.

x=1, κ=1, Ω0=1    c=1 (4th)    c=1.5 (4th)    c=2 (4th)
ρ=0.2             148          132            118
ρ=0.4             152          138            121
ρ=0.6             154          140            125
ρ=0.8             155          141            126

5. numerical results

using (12) and (13), figure 2 shows the probability density of the signal x at the output of the macrodiversity system for different numbers of clusters μ through which the signal propagates. the average signal power is Ω0 = 1, the correlation coefficient is ρ = 0.5, the shadowing severity is c = 1.5 and the rician factor is κ = 1. the figure shows that the maximum values of the probability density are obtained for higher values of the parameter μ. as the number of clusters μ decreases, the probability density of the signal decreases more slowly.

also using (12) and (13), figure 3 shows the probability density of the signal x at the output of the macrodiversity system for different values of the rician factor κ and the number of clusters μ. the average signal power is Ω0 = 1, the correlation coefficient is ρ = 0.5 and the shadowing severity is c = 1.5. for higher values of the parameters κ and μ, the probability density has a more pronounced maximum and decreases faster.

fig. 2 the probability density of the signal at the output of macrodiversity sc receiver for different values of μ number of clusters.

using (17), figure 4 shows the channel capacity at the output of the macrodiversity system as a function of the correlation coefficient, for various numbers of clusters μ and values of the shadowing severity c. the average signal power is Ω0 = 1 and the rician factor is κ = 1.
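the term counts in table 1 and the curves in the figures rest on truncating infinite series. as a rough illustration only, the sketch below evaluates the single-branch series form of the κ-μ density, cf. (3), checks numerically that it integrates to one and that its mean power equals Ω, and counts how many terms are needed for about four significant digits. the function names, the parameter values and the stopping rule are assumptions made for this example. it does not reproduce the nested series behind table 1, so its term counts are not comparable with the table.

```python
import math


def kappa_mu_pdf(x, kappa, mu, omega, terms=80):
    """Truncated series form of the kappa-mu envelope density, cf. (3)."""
    if x <= 0.0:
        return 0.0
    rate = mu * (kappa + 1.0) / omega  # exponent rate mu(kappa+1)/Omega
    log_pref = (math.log(2.0 * mu)
                + 0.5 * (mu + 1.0) * math.log(kappa + 1.0)
                - 0.5 * (mu - 1.0) * math.log(kappa)
                - kappa * mu
                - 0.5 * (mu + 1.0) * math.log(omega))
    log_b = math.log(mu) + 0.5 * math.log(kappa * (kappa + 1.0) / omega)
    log_x = math.log(x)
    total = 0.0
    for i in range(terms):
        # Work in the log domain so large factorials do not overflow.
        total += math.exp((2 * i + mu - 1.0) * log_b
                          - math.lgamma(i + 1.0) - math.lgamma(i + mu)
                          + (2 * i + 2.0 * mu - 1.0) * log_x)
    return math.exp(log_pref - rate * x * x) * total


def integrate(f, lo, hi, steps=4000):
    """Composite trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / steps
    s = 0.5 * (f(lo) + f(hi))
    for n in range(1, steps):
        s += f(lo + n * h)
    return s * h


def terms_needed(x, kappa, mu, omega, digits=4, max_terms=400):
    """Smallest term count that matches a long reference sum to `digits` digits."""
    reference = kappa_mu_pdf(x, kappa, mu, omega, terms=max_terms)
    tolerance = 0.5 * 10.0 ** (-digits) * abs(reference)
    for n in range(1, max_terms):
        if abs(kappa_mu_pdf(x, kappa, mu, omega, terms=n) - reference) <= tolerance:
            return n
    return max_terms


if __name__ == "__main__":
    kappa, mu, omega = 1.0, 2.0, 1.0  # illustrative values only
    print("integral of pdf:", integrate(lambda x: kappa_mu_pdf(x, kappa, mu, omega), 0.0, 6.0))
    print("terms needed at x = 1:", terms_needed(1.0, kappa, mu, omega))
```

in this single-branch case the number of terms grows with the argument x, because each term carries the factor x^(2i).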
the figure shows that the channel capacity at the output of the macrodiversity system decreases as the correlation coefficient increases. for lower values of the correlation coefficient, the highest channel capacity is obtained for higher values of the number of clusters μ and the shadowing severity c, but for these values the channel capacity also decreases faster than for lower values of μ and c.

figure 5 shows the channel capacity at the output of the macrodiversity system as a function of the rician factor κ for different values of the number of clusters μ. the average signal power is Ω0 = 1 and the shadowing severity is c = 1. the channel capacity at the output of the macrodiversity system decreases as the rician factor κ increases, especially at lower values of κ, while at higher values of κ it is constant and approximately equal regardless of the number of clusters. the channel capacity decreases faster for lower values of the number of clusters.

fig. 3 the probability density of the signal at the output of macrodiversity sc receiver for different values of rician κ factor and μ number of clusters.

fig. 4 channel capacity per unit bandwidth at the output of the macrodiversity sc receiver for different values of the channel shadowing severity c and μ number of clusters.

fig. 5 channel capacity per unit bandwidth at the output of macrodiversity system depending on the rician κ factor.

6.
conclusion

in this paper we discussed a diversity system with two microdiversity sc receivers and one macrodiversity sc receiver. independent κ-μ fading and correlated slow gamma fading are present at the inputs of the microdiversity sc receivers. the microdiversity sc receivers reduce the impact of fast fading on system performance, while the macrodiversity sc receiver reduces the impact of slow fading. for this system, the probability density function and the channel capacity at the output of the macrodiversity system are calculated. the probability density of the signal is an important statistical characteristic from which other first-order and second-order statistical characteristics can be calculated. when the parameter μ decreases, the fading severity increases, and when the parameter μ increases, the fading severity decreases. when the fading severity increases, system performance deteriorates. the fading is more severe when the rician factor κ is smaller. for lower values of the correlation coefficient, the highest channel capacity is obtained for higher values of the number of clusters μ and the shadowing severity c. the channel capacity at the output of the macrodiversity system decreases as the rician factor κ increases, especially at lower values of κ, while at higher values of κ it is constant and approximately equal regardless of the number of clusters. the analysis presented in this paper has a high level of generality and applicability, because the propagation scenarios are modeled with the κ-μ model, which includes a large number of known signal propagation models as special cases (nakagami-m, rayleigh, rician, etc.).

appendix a

after using [21] for solving the second integral in (12), i1 becomes:
    1 2 1 1 2 3 2 3 3 1 2 2 1 2 1 2 2 12 1 1 0 01 12 2 1 2 1 2 02 2 00 0 2 3 3 1 ( 1) 1 4( ) ( 1) ( 1) ! ( ) 1 1 1 !
( ) (1 )( )(1 )( ( 1)) 1 ( 1) ( (1 )) , ! ( ) i i i i ik i c i c c i i c k i k k x k k i i k e i i ck k i x i i c                                                         2 1 3 1 2 2 1 0 2 1 3 0 ( 1) 1 1 1 2 2 (1 ) 2 2 2 2 1 1 0 , (1 ) k x i c i i i i c d e                                    (a1) after the development of gamma function 0 1 ! ( , ) ( )! n x i i n n x x e x n n i        (a2) and by the use of [21] we have 458 m. m. smilić, b. s. jakšić, d. n. milić, s. r. panić, p. ć. spalević     1 2 1 1 2 3 2 3 1 1 2 2 1 2 1 2 2 12 1 1 0 01 12 2 1 2 1 2 02 2 00 3 03 3 3 3 1 ( 1) 1 4( ) ( 1) ( 1) ! ( ) 1 1 1 ! ( ) (1 )( )(1 )( ( 1)) ( )!1 1 1 ! ( ) ( )! i i i i ik i c i c c i j k i k k x k k i i k e i i ck i c i i c i c i c j                                                       2 1 1 2 3 1 2 2 2 1 2 3 1 2 2 20 2 3 2 1 2 22 2 0 0 2 2 2 2 3 2 1 0 1 ( ( 1) ) ( (1 )) ( )! ( ( 1) ) (2 ( 1) (1 )) ( )! 4 ( 1) 2 (1 ) i j i i i j j c j j i i i j j c k x i i k x k x i j k x k                                            (a3) where kn(x) is modified bessel's function of the second kind, order n and argument x [21]. infinite-series from above rapidly converge with only few terms needed to achieve accuracy at 5th significant digit. appendix b by applying from [27] we have: 12 22 1,1 ln(1 ) | 1, 0 x g x         (b1) and by applying the formula (8.4.23/1) from [24] we have: 2 20 02 _____ 1 ( ) | 2 4 , 2 2 v x k x g v v         (b2) specifically for our case the application of (20) would be: 1 2 3 1 2 2 2 20 2 3 2 1 02 10 0 1 2 3 1 2 1 1 2 3 1 2 ,4 ( 1) 1 4 ( 1) 2 | ,(1 ) 2 (1 ) 2 3 2 1 ; 2 2 3 2 1 . 
2 i i i j j c m m k x k x k g b b i i i j j c b i i i j j c b                                               (b3) after replacing (b1) and (b3) into (15) and after arrangement, we get the expression: channel capacity of the macrodiversity sc system in the presence of kappa-mu fading... 459     1 2 1 2 3 2 3 1 1 2 2 1 2 1 2 1 0 01 12 2 1 2 1 2 02 2 00 3 03 3 3 3 1 0 ( 1) 1 4( ) ( 1) ( 1) ! ( ) 1 1 1 ! ( ) (1 )( )(1 )( ( 1)) ( )!1 1 1 ! ( ) ( )! ( (1 )) i i i ik i c i c c i j c k k k k k b i i k e i i ck i c i i c i c i c j                                                       2 1 1 2 3 1 2 2 1 2 3 1 2 2 2 3 2 1 2 22 2 0 0 2 2 2 2 2 2 12 20 22 02 100 1 ( ( 1)) ( )! ( ( 1) ) (2 ( 1) (1 )) ( )! ,1,1 1 4 ( 1) | | ,1, 0 2 (1 ) i j i i i j j c j j i i i j j c m k i i k x k i j k x dxx g x g b b                                                (b4) integral of (b4) is solved by applying the formula (2.24.1/1) from [24]. we get that the integral is equal to: 1 2 3 1 2 2 2 2 2 12 20 22 02 100 1 62 46 1 10 1 2 3 1 ,1,1 1 4 ( 1) | | ,1, 0 2 (1 ) ( ,1 ), ( ,1 )1 2 4 ( 1) | , , ( ,1 ), ( ,1 )2 2 (1 ) 1 2 ( ,1 ) i i i j j c m z s v m t s k x i dxx g x g b b l d l dk g b b l c l c i i i j l d                                                     2 1 2 3 1 2 1 2 3 1 2 1 2 3 1 2 1 2 3 1 2 1 2 3 1 2 1 1 2 3 1 2 2 2 2 ; ( ,1 ) ; 2 2 2 2 2 3 2 2 ( ,1 ) ; ( ,1 ) ; 2 2 2 3 2 1 2 3 2 1 ; ; 2 2 1 2 ( ,1 ) s v v m j c i i i j j c l d i i i j j c i i i j j c l d l d i i i j j c i i i j j c b b i i i j l c                                                                1 2 1 2 3 1 2 1 1 2 3 1 2 1 2 3 1 2 2 2 2 2 ; ( ,1 ) ; 2 2 1 2 2 2 2 2 ( ,1 ) ; ( ,1 ) . 
2 2 t t j c i i i j j c l c i i i j j c i i i j j c l c l c                                     (b5) references [1] g. l. stüber, principles of mobile communication, 2nd ed. new york: kluwer academic publishers, 2002. [2] s. panic, m. stefanovic, j. anastasov, and p. spalevic, fading and interference mitigation in wireless communications, 1st ed. boca raton, fl, usa: crc press, inc., 2013. [3] m. k. simon and m.-s. alouini, digital communication over fading channels, 2nd ed. new york: john wiley & sons, inc., 2005. [4] s. r. panić, d. m. stefanović, i. m. petrović, m. ĉ. stefanović, j. a. anastassov, and d. s. krstić, “second order statistics of selection macro-diversity system operating over gamma shadowed κ-μ fading channels,” eurasip j. wirel. commun. netw., vol. 2011, no. 151, pp. 1–7, 2011. [5] m. d. yacoub, “the κ-μ distribution and the η-μ distribution,” ieee antennas propag. mag., vol. 49, no. 1, pp. 68–81, 2007. 460 m. m. smilić, b. s. jakšić, d. n. milić, s. r. panić, p. ć. spalević [6] n. djordjević, b. s. jakšić, a. matović, m. matović, and m. smilić, “moments of microdiversity egc receivers and macrodiversity sc receiver output signal over gamma shadowed nakagami-mmultipath fading channel,” j. electr. eng., vol. 66, no. 6, pp. 348–351, 2015. [7] a. v. marković, z. h. perić, d. b. đošić, m. m. smilić, and b. s. jakšić, “level crossing rate of macrodiversity system over composite gamma shadowed alpha-kappa-mu multipath fading channel,” facta universitatis, ser. autom. control robot., vol. 14, no. 2, pp. 99–109, 2015. [8] d. krstic, v. doljak, m. stefanovic, and b. jaksic, “second order statistics of macrodiversity sc receiver output signal over gamma shadowed k-μ multipath fading channel,” in proceedings of the 2016 international conference on broadband communications for next generation networks and multimedia applications (cobcom), 2016, pp. 1–6. [9] j. proakis, digital communications, 4th ed. 
new york: mcgraw-hill, 2001.
[10] p. m. shankar, “analysis of microdiversity and dual channel macrodiversity in shadowed fading channels using a compound fading model,” aeu int. j. electron. commun., vol. 62, no. 6, pp. 445–449, jun. 2008.
[11] p. c. spalevic, b. s. jaksic, b. p. prlincevic, i. dinic, and m. m. smilic, “signal moments at the output from the macrodiversity system with three mrc micro diversity receivers in the presence of κ-μ fading,” in proceedings of ieee conference telsiks 2015, 2015, pp. 271–274.
[12] “wolfram functions site.” [online]. available: http://functions.wolfram.com. [accessed: 10-jun-2016].
[13] j. li, a. bose, and y. q. zhao, “rayleigh flat fading channels’ capacity,” in proceedings of the 3rd annual communication networks and services research conference, 2005, vol. 2005, pp. 214–217.
[14] p. varzakas, “average channel capacity for rayleigh fading spread spectrum mimo systems,” int. j. commun. syst., vol. 19, no. 10, pp. 1081–1087, 2006.
[15] w. hu, l. wang, g. cai, and g. chen, “non-coherent capacity of m-ary dcsk modulation system over multipath rayleigh fading channels,” ieee access, vol. 5, no. 1, pp. 956–966, 2017.
[16] p. yang, y. wu, and h. yang, “capacity of nakagami-m fading channel with bpsk/qpsk modulations,” ieee commun. lett., vol. 21, no. 3, pp. 564–567, 2017.
[17] j. m. romero-jerez and f. j. lopez-martinez, “fundamental capacity limits of spectrum-sharing in hoyt (nakagami-q) fading channels,” in ieee vehicular technology conference, 2017.
[18] d. b. djosic, d. m. stefanovic, and c. m. stefanovic, “level crossing rate of macro-diversity system with two micro-diversity sc receivers over correlated gamma shadowed α–µ multipath fading channels,” iete j. res., vol. 62, no. 2, pp. 140–145, 2016.
[19] s. r. panić, d. m. stefanović, i. m. petrović, m. č. stefanović, j. a. anastasov, and d. s.
krstić, “second-order statistics of selection macro-diversity system operating over gamma shadowed κ-μ fading channels,” eurasip j. wirel. commun. netw., vol. 2011, no. 1, p. 151, oct. 2011.
[20] p. s. bithas and a. a. rontogiannis, “mobile communication systems in the presence of fading/shadowing, noise and interference,” pp. 1–14, 2014.
[21] m. stefanović, s. r. panić, n. simić, p. spalević, and č. stefanović, “on the macrodiversity reception in the correlated gamma shadowed nakagami-m fading,” teh. vjesn., vol. 21, no. 3, pp. 511–515, 2014.
[22] b. jaksic, m. stefanovic, d. aleksic, d. radenkovic, and s. minic, “first-order statistical characteristics of macrodiversity system with three microdiversity mrc receivers in the presence of κ-μ short-term fading and gamma lon,” j. electr. comput. eng., vol. 2016, pp. 1–9, 2016.
[23] i. s. gradshteyn and i. m. ryzhik, table of integrals, series, and products, 5th ed. san diego: academic press.
[24] m. s. alouini and a. j. goldsmith, “capacity of rayleigh fading channels under different adaptive transmission and diversity-combining techniques,” ieee trans. veh. technol., vol. 48, no. 4, pp. 1165–1181, 1999.
[25] n. y. ermolova, “capacity analysis of two-wave with diffuse power fading channels using a mixture of gamma distributions,” ieee commun. lett., vol. 20, no. 11, pp. 2245–2248, 2016.
[26] b. s. jakšić, “level crossing rate of macrodiversity sc receiver with two microdiversity sc receivers over gamma shadowed multipath fading channel,” facta universitatis, ser. autom. control robot., vol. 14, no. 2, pp. 87–98, mar. 2015.
[27] a. p. prudnikov and j. a. brychkov, integrals and series, 2nd ed. moscow: fizmatlit, 2003.

facta universitatis series: electronics and energetics vol. 31, no 3, september 2018, pp.
389-400 https://doi.org/10.2298/fuee1803389k

the concept for the “smart home” controlled by a smartwatch

miloš kosanović, slavimir stošović
college of applied technical sciences in niš, serbia

abstract. in this paper a “smart home” solution is proposed in which power plugs in a remote room can be controlled by a smartwatch, an android mobile device or a php web app. communication between these devices takes place in real time via a server based on node.js technology. an electrical circuit measures the current and voltage on the plugs, and an arduino board with a wi-fi module sends the measured values to the server, from which the electrical energy consumption in each time interval can be determined. all measured values are stored in a mysql database and used to create the appropriate reports. the smartwatch app enables remote plugging and unplugging. in addition, limits can be set for the electrical energy consumption on each plug, as well as for the power of the appliance that can be plugged in. exceeding the allowed values leads to automatic unplugging.

key words: iot, smart home, smart watch, power consumption

1. introduction

in recent times, the world has seen an exponential rise in the number of devices connected to the internet. in order to automate business processes, and also to improve the comfort of everyday life, computers connected to the internet have become a part of our daily routine. the concept of connecting embedded computing devices within the existing internet infrastructure is popularly called the internet of things (iot) [1]. this concept should enable the connection of different devices, systems and services in a way that goes beyond the present communication between two machines. engineers are faced with the task of implementing different protocols and applications for a wide spectrum of devices, such as air conditioners, washing machines, biochips or wireless sensors.
this leads to the development of different systems with a wide range of possible applications, such as context aware systems, ambient assisted living systems, smart homes, smart cities, etc.

received september 7, 2017; received in revised form february 12, 2018
corresponding author: miloš kosanović, college of applied technical sciences in niš (e-mail: milos.kosanovic@vtsnis.edu.rs)

considering the diversity of the devices, there are many challenges that an engineer needs to overcome when designing and implementing such a system. the main research directions and challenges are thoroughly described in [2]. developing the architecture, designing or choosing the hardware and sensors, deciding on the operating system or systems, choosing the communication protocol, and integrating “things” into the web by using web services are only some of the problems that we had in mind when we implemented the iot solution described in this paper.

one of the challenges is certainly the development of an architectural design that will enable scalability, interoperability and security. as trillions of things (objects) are connected to the internet, it is necessary to have an adequate architecture that permits easy connectivity, control, communications, and useful applications. several solutions have been proposed, such as diat, marm, mosden, cloudthings and others [3]. an in-depth analysis of different context aware systems and smart home architectures and technologies can be found in [4] and [5].

another challenge is the design of universal operating systems that would work equally well on different hardware. one of the most popular and most common operating systems is android, which is open source. in recent years, tizen os has also gained popularity, due to its applicability to various devices and its support by samsung.

several communication protocols are used for communication between iot devices.
the most relevant is probably the snmp protocol, which forms part of the ip stack and is universally supported. on the application level, coap (constrained application protocol) and mqtt (message queuing telemetry transport) are often used [6]. in [7], jabeur goes further by explaining that the integration of real world things (rwts) into the web leads to a more advanced perspective, in which these things are abstracted into reusable web services and are not viewed only as simple web pages. this leads to the subset of iot called the web of things (wot). restful web services are based on representational state transfer (rest) [8], which is lightweight, simple, loosely coupled, flexible and easy to integrate into the web using the http application protocol. from a design perspective, and compared to the traditional client-server architecture, the wot has a flat architecture that should integrate the rwts into the web and make them interoperate with each other and fuse into complex web services. jabeur further introduces artificial intelligence techniques as well as ideas from social networking into the iot.

a smart home represents one context aware iot system. it consists of home appliances, sensors, actuators, and data processors and analyzers [9]. home automation of appliances can be either wired or wireless. the idea of a smart home integrates many different aspects of iot, energy efficiency and software engineering. for example, [10] describes connecting wireless sensors to the internet, [11] describes a platform for a smart learning environment that enables acquiring data from sensors distributed within a university building, and [12] describes a smart home system based on sensor technology. home devices are usually controlled from other smart devices, such as tablets, mobile phones, smartwatches, smart wristbands, etc.
several smart home systems have been proposed, based on different architectures and different communication technologies such as zigbee [13], bluetooth [14], gsm [15] and wi-fi [16]. as far as we know, the first paper that mentions the possibility of using a smartwatch as a remote control device in a smart home is [17]. however, that paper was written at a time when smartwatches were not yet widely available and were quite limited in comparison to today's smart devices. the paper [15] describes a system in which an android smartphone is used for monitoring a home security system.

power monitoring systems in home environments are described in [18] and [19]. the first paper uses neither a smartwatch nor a smartphone for monitoring, and does not implement real time communication, since the gui needs to be refreshed before the data can be shown. the second paper does not use the web, so it can be considered neither scalable nor easy to integrate.

in this paper, a smart home solution is described in which an android mobile device, a php server and a tizen smartwatch app are used to monitor and control household power consumption. all three apps communicate over wi-fi, have identical functionalities regardless of the development technology, and communicate among themselves through a common node.js server, using the websocket and http protocols. an arduino open source computer platform, based on a simple board with i/o pins and an atmel microcontroller, is used as the connection between the hardware on one side and the server app on the other. this solution continues the development of the remote control in the samsung apps lab and the vtš apps team at the college of applied technical sciences in niš, and it expands the functionality of the apps for the lab access control system [20]. it also extends the work already described in [12] by introducing smartwatch control and a system for energy consumption monitoring and control.
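the relay role of the common server can be pictured with a small in-memory sketch. this is an illustration only and not the node.js server of the described system: the class name, the method names and the message format are hypothetical, and real websocket or http transport is replaced by plain callbacks.

```python
class RelayHub:
    """Hypothetical in-memory stand-in for the relay role of the common server."""

    def __init__(self):
        self._clients = {}  # client name -> callback that receives messages

    def connect(self, name, on_message):
        """Register a client (for example the watch, phone or web app)."""
        self._clients[name] = on_message

    def disconnect(self, name):
        """Remove a client; unknown names are ignored."""
        self._clients.pop(name, None)

    def publish(self, sender, message):
        """Forward a message to every connected client except the sender.

        Returns the number of clients the message was delivered to.
        """
        delivered = 0
        for name, handler in list(self._clients.items()):
            if name != sender:
                handler(message)
                delivered += 1
        return delivered


if __name__ == "__main__":
    hub = RelayHub()
    phone_inbox, web_inbox = [], []
    hub.connect("watch", lambda message: None)
    hub.connect("phone", phone_inbox.append)
    hub.connect("web", web_inbox.append)
    # The watch switches plug 1 on; the other two clients are notified.
    hub.publish("watch", {"plug": 1, "on": True})
    print(phone_inbox, web_inbox)
```

the point of the sketch is only that a state change made in one app reaches the other apps through a single shared hub, which is what keeps their views consistent.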
considering the growing popularity of smartwatches and the specifics of developing apps for them, special focus is put on smartwatch control of the proposed smart home solution. section 2 presents the system architecture, and the subsequent sections describe every part of the system. the third and fourth sections describe the hardware for measuring electrical energy and voltage on the plugs and the functioning of the arduino module, and the fifth section gives a detailed description of the server part of the system, the mobile app and the web app.
2. system architecture design
for setting up a system for energy consumption control in the home environment, the following hardware components are needed:
- microcomputer: an arduino board with the wi-fi module, the sparkfun ftdi basic breakout board and the lcd display
- main server for sharing the data between clients; it can be a desktop computer or a service provided by a hosting provider
- smartphone, smartwatch or pc device
- wi-fi network and wireless router
- customized power plug
the working principle of the whole system is shown graphically in figure 1. all users of the smart home module, regardless of whether they want to control it via a smartwatch or an android device, must be registered in the database of lab members. the application on the smartwatch makes it possible to remotely turn the power plug on and off, to set a timer, and to set limits for energy consumption, as well as the maximum power of a consumption device that can be connected to any power plug. in this way, the watch on the wrist becomes a sort of personal home remote control device. besides the smartwatch app, an android mobile application and a php web application with similar functionalities were also developed. communication between all these applications happens in real time over the node.js server.
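the relay role that the text assigns to the node.js server can be illustrated with a small in-memory sketch. it is written in python only for illustration and is not the authors' server code: the class name `PlugHub`, the method names and the message fields are all hypothetical, and the real system uses websockets rather than direct function calls.

```python
from typing import Callable, Dict

Update = Dict[str, object]


class PlugHub:
    """In-memory stand-in for the server that relays plug state to all clients.

    Hypothetical sketch: names and message fields are assumptions, and the
    callbacks take the place of the websocket connections used in the paper.
    """

    def __init__(self) -> None:
        self._clients: Dict[str, Callable[[Update], None]] = {}
        self._state: Dict[int, Update] = {}

    def connect(self, client_id: str, on_update: Callable[[Update], None]) -> None:
        # Register a client (watch, phone, web page) and its update handler.
        self._clients[client_id] = on_update

    def publish(self, plug_id: int, change: Update) -> Update:
        # Merge the change into the stored plug state ...
        plug_state = self._state.setdefault(plug_id, {})
        plug_state.update(change)
        snapshot = {"plug": plug_id, **plug_state}
        # ... and push the same snapshot to every connected client,
        # so that all user interfaces stay synchronized.
        for notify in self._clients.values():
            notify(snapshot)
        return snapshot
```

in this sketch every client, including the one that caused the change, receives the same snapshot, which mirrors the requirement that all three apps show identical state.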
the connection between the power plug and the electrical circuits for measuring current and voltage is implemented using an arduino microcontroller board. for testing the proposed solution, two power plugs on one power strip were used. all measured values are saved in a mysql database and are used for energy consumption and cost calculations. all previously developed modules (the laboratory access control system, the php library of the vtš apps team and the vtš explorer app [2]) use the same database, which is stored on the server. so, the first step for the administrator is to create a user account by filling out the appropriate web form in the php app. in that way, each member of the lab is given a single username and password for accessing all the above-mentioned apps (modules).
fig. 1 the block diagram of the system software and hardware architecture
when creating an account, the php app checks whether the username entered for the new user is available. at the same time, form validation is performed: mandatory fields are checked, and the data must be entered in a valid format. if the created account meets all the requirements and security checks, the app stores it in the database. two plugs are placed in the laboratory and are connected to the arduino microcontroller via the electrical circuit described in the following section. the arduino is used for measuring voltage and electrical energy, and it sends the measured values to the server app via the wi-fi module. systems based on wi-fi have the advantage of using a technology that is present, or will be present, in almost any modern electronic device. its main features are wide existing support and the fact that it carries ip traffic directly, which allows communication over the internet without needing a protocol translator [5]. furthermore, a wi-fi network today exists in almost every home with internet access.
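the account-creation checks described earlier in this section (mandatory fields, valid format, username availability) can be sketched as follows. this is an illustrative python sketch, not the php code of the app: the field names, the username pattern and the error messages are assumptions made for the example.

```python
import re
from typing import Dict, List, Set

# Hypothetical format rule: 3 to 32 lowercase letters, digits or underscores.
USERNAME_RE = re.compile(r"^[a-z0-9_]{3,32}$")
# Hypothetical list of mandatory form fields.
REQUIRED_FIELDS = ("username", "password")


def validate_new_account(form: Dict[str, str], existing: Set[str]) -> List[str]:
    """Return a list of validation errors; an empty list means the account may be stored."""
    errors: List[str] = []
    # Mandatory field check.
    for field in REQUIRED_FIELDS:
        if not form.get(field):
            errors.append(f"{field} is required")
    username = form.get("username", "")
    # Format check, only meaningful when a username was supplied.
    if username and not USERNAME_RE.fullmatch(username):
        errors.append("username has an invalid format")
    # Availability check against the already registered usernames.
    if username in existing:
        errors.append("username is already taken")
    return errors
```

in the real system the availability check would be a database query; the set passed in here only stands in for it.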
the server app accepts the measured values and stores them in the database together with the date and time of reception. based on the measured and stored values, a chart of the current power on the plugs is drawn in real time. the server app sends these data to the smartwatch or android device apps via the appropriate service. communication between the server and the above-mentioned devices is two-way, because the devices can send instructions for immediately switching a plug off, as well as the time at which the plug should be switched on or off.
3. plug design
there are many wireless power plugs available on the market, with prices ranging from 15 to 50 eur per plug. to use such a plug, the customer needs to install a proprietary mobile app. integration with third-party apps is usually not possible, nor is access to real-time power consumption data. the price, the inaccessible api and the unavailability of real-time power consumption readings are the reasons why we decided to design and construct custom power plugs. the interior of an existing power strip was modified so that the plugs could be physically separated. figure 2 shows the circuit for measuring electrical energy. the calculated value represents the active power of the consumption device and is obtained as the product of the effective (rms) values of the voltage and the current of the load. the active power is also provided in the technical specifications of home appliances, together with their other characteristics. the information on electrical energy consumption is acquired using a dl-ct1005a current sensor: the conductor carrying the alternating current of the consumption device is passed through the sensor, and a proportional alternating current is induced in the sensor's coil. a 39 ω resistor connected in parallel with the current sensor carries this induced current, which causes an alternating voltage drop across it (around 1 v).
fig.
2 a circuit implementation scheme for measuring electrical energy and circuit implementation scheme for relay control
the obtained voltage is amplified by an lm358n operational amplifier to a level at which it can be rectified into a direct voltage (figure 2). schottky diodes are used for rectification because they have a lower forward voltage drop than silicon diodes. as the reference for the a/d converter, the arduino uses its internal reference voltage of 1.1 v, so the maximum voltage delivered by the amplifying and rectifying circuit (for a consumption device drawing 10 a) is 1.1 v. the 5 v supply is not used as the reference because the display and the wi-fi module are connected to it, so it varies by several millivolts and is not nearly as precise as the reference generated internally by the arduino microcontroller. since the system monitors the power consumption of two independent devices, two current sensors are required (sk1 and sk2 in figure 2), as well as an amplifier and a rectifier for each of them. at the output of the rectifier stage, an rc network (a 10 kω resistor and a 47 µf capacitor) serves as a low-pass filter, so a more stable voltage is obtained at the input of the arduino. the arduino filters this signal further by repeating the measurement over a certain time interval and taking the mean. both consumption devices can be switched on and off remotely: the instruction is sent from the mobile app, via the server, to the arduino, which switches the appropriate relay on or off. the consumption device and the relay contacts are connected in series to the mains voltage, and the relay coil is driven by a transistor switch built with a pn2222a npn transistor (figure 2). an arduino digital pin drives the base of the transistor through a 510 ω resistor; when the pin is set high, the transistor saturates and the consumption device is switched on.
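the measurement chain described above (a rectified signal filtered by an rc network, read against the 1.1 v reference, then averaged) can be turned into numbers with a short sketch. it is written in python for illustration and is not the arduino firmware. it assumes a 10-bit converter with 1024 steps, as on the atmega328, and a linear mapping in which the 1.1 v full scale corresponds to 10 a; the linearity of the amplifier and rectifier is an assumption, and all function names are hypothetical.

```python
import math
from typing import Sequence

ADC_STEPS = 1024        # 10-bit converter (assumed, as on the ATmega328)
V_REF = 1.1             # internal reference voltage in volts (from the text)
I_FULL_SCALE = 10.0     # load current in amperes that yields V_REF (from the text)


def adc_to_volts(raw: int) -> float:
    """Voltage at the analog input for a raw ADC reading."""
    return raw / ADC_STEPS * V_REF


def adc_to_amps(raw: int) -> float:
    """Load current for a raw reading, assuming a linear measurement chain."""
    return raw / ADC_STEPS * I_FULL_SCALE


def mean_current(samples: Sequence[int]) -> float:
    """Software filtering: average of repeated measurements."""
    return sum(adc_to_amps(s) for s in samples) / len(samples)


def active_power(v_rms: float, i_rms: float) -> float:
    """Product of rms voltage and rms current, as used in the text."""
    return v_rms * i_rms


def rc_cutoff_hz(r_ohm: float, c_farad: float) -> float:
    """Cutoff frequency of a first-order RC low-pass filter, 1 / (2*pi*R*C)."""
    return 1.0 / (2.0 * math.pi * r_ohm * c_farad)
```

with the component values given in the text, `rc_cutoff_hz(10e3, 47e-6)` is roughly 0.34 hz, far below the mains frequency, which is consistent with the filter's role of smoothing the rectified signal.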
diodes suppress the back electromotive force that occurs when the relay is switched off. the circuit for measuring voltage is shown in figure 3. the mains voltage is the same on both consumption devices, so only one readout circuit is necessary. the logic is galvanically isolated from the mains voltage by a transformer that provides 12 v on its secondary. a graetz bridge, made with schottky diodes because of their lower voltage drop, generates the dc voltage. this voltage is too high to be fed directly to an analog input of the arduino, so a voltage divider is built using 100 kω and 1 kω resistors. a 2.2 kω potentiometer, connected in series with the 1 kω resistor, is used to fine-tune the output voltage (to around 800 mv for a mains voltage of 225 v). at the output, an rc network (a 10 kω resistor and a 10 µf capacitor) serves as a low-pass filter.
fig. 3 a circuit implementation scheme for measuring voltage
4. arduino and server communication module
the arduino microcontroller board is an open-source development platform based on an 8-bit atmel avr or a 32-bit atmel arm microcontroller. there are several models of arduino boards with different features, but all boards contain standard microcontroller components such as an oscillator (the crystal that generates the periodic pulses of the processor clock) and 5 v and 3.3 v voltage regulators. for this project, an arduino pro mini board with an atmega328 microcontroller and an esp8266 esp-01 wi-fi module were used; they are shown in figure 4. the final look of the arduino module and the power plug is shown in figure 5.
fig. 4 a) arduino pro mini, b) wi-fi module esp8266 esp-01, c) ftdi cable, d) sparkfun ftdi basic breakout board
the development environment needed for programming the board is completely free and open source, and can be downloaded for the mac, windows, or linux operating systems.
the processing power of the arduino is modest, as it is built to be cheap and accessible, so the code must be optimized for the system to work without noticeable latency. this is extremely important in a real-time system. in this project, the arduino has several different responsibilities:
1. it measures the current and voltage for each power plug, as described in section 3;
2. it writes the measured values on the lcd display;
3. it communicates with the server by sending the measured values via wi-fi;
4. it receives from the server the power consumption limits for each power plug.
communication between the arduino and the php server is implemented using the esp8266 esp-01 wireless module (figure 4b). the module uses the 802.11 b/g/n standard and the tcp/ip protocol. the possible operating modes are station, access point, and both. it is also possible to set a static ip address or port number and to specify an explicit web address, as the module has the dns capability to translate it to an ip address. the arduino pro mini contains 14 digital input/output pins, six of which can be used as pwm (pulse width modulation) channels. it also has six analog inputs and a reset button. this platform is categorized as an entry-level model and is designed for use where small board dimensions are required. because of this, the board does not contain a usb connector. therefore, it cannot be connected to a personal computer directly via a usb cable, which leaves two other possibilities:
1. direct connection via an ftdi (future technology devices international) cable (figure 4c), which is used with the arduino pro mini 16 mhz module with a 5 v power source;
2. connection via an additional sparkfun ftdi basic breakout board (figure 4d), which is used with the arduino pro mini 8 mhz module with a 3.3 v power source. we decided to use this solution.
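the four responsibilities listed above for the arduino (measure, display, send, receive limits) can be combined into one control step, sketched below. this is a python illustration and not the authors' firmware. the source does not state what happens when a limit is exceeded, so switching the relay off is an assumption of this sketch, and the names `PlugReading`, `control_step`, `show`, `send` and `set_relay` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class PlugReading:
    """One measurement for one power plug."""
    plug_id: int
    volts: float
    amps: float

    @property
    def watts(self) -> float:
        # Power as the product of the measured voltage and current.
        return self.volts * self.amps


def control_step(
    reading: PlugReading,
    limit_watts: Optional[float],
    show: Callable[[str], None],
    send: Callable[[Dict[str, float]], None],
    set_relay: Callable[[int, bool], None],
) -> bool:
    """Display and report one reading; return False if the limit was exceeded."""
    # Responsibility 2: write the measured value on the display.
    show(f"P{reading.plug_id}: {reading.watts:.0f} W")
    # Responsibility 3: send the measured values to the server.
    send({"plug": reading.plug_id, "v": reading.volts, "i": reading.amps})
    # Responsibility 4: apply the limit received from the server.
    # ASSUMPTION: exceeding the limit switches the plug off.
    over_limit = limit_watts is not None and reading.watts > limit_watts
    if over_limit:
        set_relay(reading.plug_id, False)
    return not over_limit
```

passing the display, network and relay functions in as parameters keeps the step testable without hardware; on the board these would be the lcd, esp8266 and digital-pin routines.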
each of the 14 digital pins on the arduino board can be used as an input or an output; the maximum current per pin is limited to 40 ma. additionally, specialized pins are available for connecting a bluetooth or wireless module, namely the rx (receive) and tx (transmit) pins, which can also be used for uart ttl serial communication.
5. implementation of the server, web and mobile application
the server application is implemented in node.js and php. node.js is a server-side javascript environment based on the google chrome v8 engine and is mainly used for implementing fast, simple, scalable network applications. it is very efficient for developing real-time applications, distributed applications, and applications that need a large number of data transactions or http requests, or full-duplex communication via the websocket protocol. communication between the arduino module in the power plug and the smartwatch or the android device is done in real time using web sockets, which are implemented in node.js with the socket.io library. the arduino monitors the current and voltage on the consumption device connected to the power plug and forwards the measured values to the node.js server in real time. node.js saves the measured values in the mysql database and forwards them to the client applications: the smartwatch, android and web applications. any change in the state of the plug, or in the current or voltage values, is immediately synchronized on all the client applications. all client apps communicate with the server via wi-fi. when any change occurs, on the server or the client side, the data are sent to the server, which forwards this information to all connected applications, and these immediately update the ui.
fig. 5 the image of arduino module and the custom power plug
fig.
6 design and functionalities of the web application
the web application was developed in the php programming language with a mysql database. it has the same functionalities as the android and smartwatch apps and is shown in figure 6. the android application was developed in android studio.
6. implementation of the smartwatch application
today's mobile device market has many different smartwatch manufacturers and many different operating systems that power their devices. the application for smart home control in this paper was developed for the samsung gear s smartwatch, which runs the tizen operating system. tizen is an open and flexible operating system developed to run on different devices such as mobile phones, cars, bracelets, watches, television sets, and other devices. it was made by the open-source software development community and is still open to anyone who wants to contribute to the project. there are different versions of the system, such as tizen mobile, tizen tv and tizen wearable, and they are all compatible with the html5 standard. due to the small screen size and the low resolution of the watch (360x480), designing the user interface (ui) is challenging. the ui of this type of app consists of several screens. since the watch has only one physical button, different gestures must be used for navigation between screens. swiping from the top of the application to the bottom closes the application. swiping in the other direction, from the bottom to the top, returns to the previous screen. when the physical button is pressed, the app continues to run in the background, so when it is activated again, the user can continue where they left off.
fig. 7 the smartwatch application screenshots annotated as 7a, 7b, 7c and 7d
the smartwatch can turn on and off the power plug autonomously by using timers.
at the bottom of the screen, the next scheduled task and its scheduled time are shown. to set a timer, the user can click on this task, and the screen in figure 7b is shown. on this screen, the user can see all tasks scheduled for the future, sort them, delete them, and add new ones. clicking the green button at the bottom opens a new screen for adding a task (figure 7c). to schedule a new task, the user needs to choose the power plug, the type of action, and the time and date. clicking in the middle of the application, on the part that shows the current power of the consumption device, opens the screen with the consumption data for that plug (figure 7d). the data are sorted according to the current tariff system in serbia (blue, green, and red tariff). the user can also see the price in serbian dinars. figure 8 shows the measured power consumption on a monthly scale for the two power plugs used in the laboratory.
fig. 8 power consumption report
7. conclusion
in this paper, we proposed and described an architecture model based on wi-fi that controls and monitors power consumption using different devices. compared to a standard home, such a system offers multiple advantages, such as better control of home appliances, convenience, better awareness of power consumption, and energy efficiency. our solution differs from other solutions proposed in the scientific literature by implementing a smartwatch application that can monitor and control power plugs in real time. all the controlling devices exchange data and synchronize in real time. the custom power plug was designed with an arduino board and wi-fi connectivity. the plugs can be controlled through the server as web services that can be accessed from any web-capable device and are integrated into the “web of things”.
finally, rich and interactive tizen, android and php applications were developed, which access these web services and are used to control devices and show the monitored data in a user-friendly way. the system was constructed from widely available hardware components (arduino, wi-fi module, smartwatch) and from some that were custom-made (power plug). the total price for all the components is not more than 40 euros. real-world application in the home environment is possible for selected devices, as it does not require changes in the electric installations. however, it is not recommended for all wall power plugs, as better solutions exist: the electric installations can be changed so that all plugs connect to one microcomputer, or the power consumption can be measured at the main circuit breaker. when building the system, attention was paid to accessibility, usability, openness, scalability, easy integration, and real-time data exchange; other aspects such as security, robustness and system optimization were also considered but were not the prime focus of the research. verbose data caused by protocol overheads increase the power consumed by the device and may make it less efficient. emerging protocols such as coap can be considered to provide better performance. the practical test of the smart house system showed easy control, good stability and scalability, and easy integration into larger systems, so it can serve as a reference for the future design of smart home hardware and architecture. special attention in this paper has been given to software development, especially to the smartwatch application, due to the popularity and innovative concept that this technology represents. other controlling devices and other sensors can be easily integrated into the proposed system, and this is supported by the software architecture.
additional work can be done on integrating a smart tv as a sort of smart house hub that can monitor and control other house devices. furthermore, the development of intelligent systems that detect and learn about our behavior and help enrich and simplify our daily routines is becoming more common. artificial intelligence and machine learning techniques can be used to optimize and personalize energy consumption and personal preferences for optimal device usage. the proposed system can simplify the collection of the data needed for training such systems.
acknowledgement: the authors would like to mention that the software and hardware were implemented by the members of the vtš apps team in the college of applied sciences in niš. we are grateful to miloš milić, miloš segić, dimitrije dimitrijević and all the members of the vtš team who were involved and have significantly contributed to this paper.
references
[1] o. vermesan, p. friess, internet of things: converging technologies for smart environments and integrated ecosystems, river publishers, 2013.
[2] john a. stankovic, "research directions for the internet of things," ieee internet of things journal, vol. 1, pp. 3-9, 2014.
[3] abhirup khanna, "an architectural design for cloud of things," facta universitatis, series: electronics and energetics, vol. 29, no. 3, pp. 357-365, september 2016.
[4] charith perera, arkady zaslavsky, peter christen, dimitrios georgakopoulos, "context aware computing for the internet of things: a survey," ieee communications surveys & tutorials, vol. 16, no. 1, pp. 414-454, 2014.
[5] gabriele lobaccaro, salvatore carlucci, erica löfström, "a review of systems and technologies for smart homes and smart grids," energies, 2016.
[6] m. savic, "bridging the snmp gap: simple network monitoring the internet of things," facta universitatis, series: electronics and energetics, vol. 29, no. 3, pp. 475-487, september 2016.
[7] nafaâ jabeur, hedi haddad, "from intelligent web of things to social web of thing," facta universitatis, series: electronics and energetics, vol. 29, no. 3, pp. 367-381, september 2016.
[8] r. t. fielding, architectural styles and the design of network-based software architectures, ph.d. dissertation, university of california, irvine, 2000.
[9] h. ghayvat, s. mukhopadhyay, x. gui, n. suryadevara, "wsn- and iot-based smart homes and their extension to smart buildings," sensors, vol. 15, pp. 10350-10379, 2015.
[10] mirko r. kosanović, mile k. stojčev, "connecting wireless sensor networks to internet," facta universitatis, series: mechanical engineering, vol. 9, no. 2, pp. 169-182, 2011.
[11] konstantin simić, marijana despotović-zrakić, živko bojović, branislav jovanić, đorđe knežević, "a platform for a smart learning environment," facta universitatis, series: electronics and energetics, vol. 29, no. 3, pp. 407-417, september 2016.
[12] boban davidović, aleksandra labus, "a smart home system based on sensor technology," facta universitatis, series: electronics and energetics, vol. 29, no. 3, pp. 451-460, september 2016.
[13] xian-jun yi, min zhou, jian liu, "design of smart home control system by internet of things based on zigbee," in proceedings of the 2016 ieee 11th conference on industrial electronics and applications (iciea), hefei, 2016, pp. 128-133.
[14] z. yufeng and j. ruqiao, "design and realization of the smart home control system based on the bluetooth," in proceedings of the 2015 international conference on intelligent transportation, big data and smart city, halong bay, 2015, pp. 286-289.
[15] s. morsalin, a. m. j. islam, g. r. rahat, s. r. h. pidim, a. rahman and m. a. b.
siddiqe, "machine-to-machine communication based smart home security system by nfc, fingerprint, and pir sensor with mobile android application," in proceedings of the 3rd international conference on electrical engineering and information communication technology (iceeict), dhaka, 2016.
[16] r. k. kodali, s. soratkal and l. boppana, "iot based control of appliances," in proceedings of the 2016 international conference on computing, communication and automation (iccca), noida, 2016, pp. 1293-1297.
[17] l. de russis, d. bonino, f. corno, "the smart home controller on your wrist," homesys workshop, 2013.
[18] edwin chobot, daniel newby, renee chandler, nusaybah abu-mulaweh, chao chen, carlos pomalaza-ráez, "design and implementation of a wireless sensor and actuator network for energy measurement and control at home," international journal of embedded systems and applications (ijesa), vol. 3, no. 1, 2013.
[19] chia-hung lien, hsien-chung chen, ying-wen bai, and ming-bo lin, "power monitoring and control for electric home appliances based on power line communication," in proceedings of the i²mtc 2008, ieee international instrumentation and measurement technology conference.
[20] n. živković, m. milojević, n. nikolić, b. majkić, s. stošović, "system for access and working time control implemented with raspberry pi platform," in proceedings of the ieeestec conference, niš, serbia, 2014, pp. 201-206.
instruction facta universitatis series: electronics and energetics vol. 31, no. 2, june 2018, pp. 257-265 https://doi.org/10.2298/fuee1802257v
impact of channel engineering (si1-0.25ge0.25) technique on gm (transconductance) and its higher order derivatives of 3d conventional and wavy junctionless finfets (jlt)
b. vandana 1 , jitendra kumar das 1 , sushanta k.
mohapatra 1 , suman lata tripathi 2
1 school of electronics engineering, kiit university, bhubaneswar, odisha, india
2 electronics and communication engineering, lovely professional university, jalandhar, punjab, india
abstract. the paper explores the analog performance and the higher order derivatives of the drain current (id) with respect to the gate-source voltage (vgs) of 3d conventional and wavy junctionless finfets (jlt), introducing a channel engineering technique in which silicon germanium (si1-0.25ge0.25) is used as the device layer. the performance is evaluated for different gate lengths (lg = 15-30 nm), and the current characteristics are determined while maintaining a constant on current (ion of 10^-5 a/μm) for both devices. a comparison is made between these mos structures at molefraction x = 0.25, and it is found that the electric field is perpendicular to the current flow, which induces volume inversion. the simulation study shows better gate control over the channel for the wavy structure and a higher id as lg is scaled down. at this constant ion, id, the transconductance (gm), the transconductance generation factor (tgf) and the higher order terms (gm' and gm'') of the devices are studied under the relaxed sige approximation. an extensive simulation study of the short channel (sc) parameters is also performed, and it is observed that the wavy jl finfet is less sensitive to short channel effects (sces) than the conventional one; the n-type doping concentration (nd = 1.7x10^19 cm^-3) and the metal workfunction (ϕm = 4.6 ev) are responsible for the reduced sces.
key words: sige jl finfet, channel engineering, molefraction, analog parameters, higher order derivatives, short channel parameters (sc).
received may 25, 2017; received in revised form october 23, 2017
corresponding author: b.
vandana, school of electronics engineering, kiit university, bhubaneswar, odisha, india (e-mail: vandana.rao20@gmail.com)
1. introduction
due to the tremendous growth in technology, the exploration of novel architectures has become mandatory for ultra large scale integration (ulsi) applications. among various architectures, the finfet has become an attractive device solution for suppressing the sces as device dimensions have moved to the nanometer range [1]–[4], primarily owing to its superior gate control over the channel. multi-gate structures such as silicon on insulator (soi) mosfets [5], [6] have been scaled down to the decananometer range; however, realizing these mosfets in the decananometer range [7] requires extremely sharp source/drain p-n junctions, which can be achieved only through high-end annealing techniques and thereby increase the fabrication cost. to overcome these difficulties, a new mosfet without source/drain p-n junctions was proposed [8], [9] and named the junctionless nanowire transistor. a comparative study of a fabricated junctionless finfet (jlt) and a conventional bulk finfet with respect to sces is discussed in [10]. a heavily doped jlt has a fully depleted channel in the subthreshold region, with a high vertical electric field (e-field). the e-field is neutral in the inversion mode of operation, and the shift in vth occurs when the bands (ϕm - ϕs) are flat at the flat band voltage (vfb) [9]. the absence of doping concentration gradients eliminates impurity diffusion and the sharp doping profile problem. the paper explores the multi-gate jl finfet topology and extends the work of [11], [12], concentrating mainly on the analog performance and the higher order gm parameters obtained from the id characteristics.
the analysis of the higher order derivatives at scaled lg is also important, because the analog performance and the higher order derivatives matter for advanced communication systems. device non-linearity introduces unwanted components at frequencies different from the input ones, which generates intermodulation distortion (imd) at the output stage [13]–[15]. the higher order analysis and the intermodulation harmonics are important for keeping non-linearity minimal at the rf stage [16]. accordingly, the analog performance parameters of nanoscale devices need to be evaluated before fabrication. the paper discusses the higher order derivative parameters of 3d conventional and wavy jl finfets using a channel engineering scheme. following this introduction, section 2 discusses the device architecture specifications and the simulation procedure, and section 3 presents the comparative analysis of the analog performance of these devices using si1-0.25ge0.25 as the device layer material. finally, the conclusion is drawn.
2. device description and simulation framework
multi-gate transistors are the basic step towards suppressing the sces; the challenges and issues are discussed in [17], and their performance metrics are given in [18]. a thin dual-gate approach on soi with volume inversion is reported in [5], [19]. another representation, using a 2d planar utb and a 3d non-planar approach, was first given in [20], [21]; a detailed analysis with several performance metrics was later reported in [22]–[24]. the finfet provides better layout area efficiency in digital circuits [25]. in general, a finfet provides a single fin per pitch, so most of the pitch area is unused, which limits the current per pitch.
therefore, the unused pitch area is occupied by a fully depleted soi (fd-soi) layer, which is grown epitaxially and merged with the 2d finfet to form a single device with a common gate [21]. utilizing these two approaches, a comparative analog analysis has been performed using the channel engineering technique (sige material) with the junctionless finfet topology. the architectures of the conventional jl device and the wavy-jlt are shown in fig. 1(a), (b), and the parameters required to construct the devices are listed in table 1. the structures are designed for lg values of 15-30 nm with a uniform doping concentration nd = 1.7x10^19 cm^-3 and high-k (hfo2) gate sidewall spacers. the simulations are carried out using the sentaurus tcad simulator [26]. the philips unified mobility model is used together with the lombardi model to account for high-κ induced carrier mobility degradation [27]. to capture quantum confinement effects, which are governed by the thickness of the fin and the utb, density-gradient-based quantization models are used. an inversion and accumulation layer mobility model, which includes doping and transverse-field dependence and accounts for coulomb impurity scattering, is also used.
fig. 1 a 3d representation of (a) conventional jl finfet, (b) wavy-jl finfet at lg = 15-30 nm, (c) id-vgs characteristics of conventional and wavy jl finfet at lg = 20 nm and x = 0.25.
to account for the longitudinal and vertical electric fields, an effective intrinsic density with the old slotboom band gap narrowing model [28], the shockley-read-hall mechanism for generation and recombination [29], and quantum mechanical effects are included. the device physical properties are discretized onto a non-uniform mesh of nodes and simulated with appropriately parameterized models [30].
the same models are considered throughout the simulation study to observe the performance of the devices. with respect to this, the id-vgs characteristics are plotted in fig. 1(c); ion remains about the same for both devices, but a small improvement in id is observed for the 3d wavy-jlt.

table 1 parameters required for simulation.
parameters                         3d jl finfet          3d wavy-jl finfet
sige device layer (wfin)           7 nm                  7 nm
sige device layer (hfin)           30 nm                 30 nm
silicon thickness (tsi)            --------              10 nm
donor doping (nd)                  1.7 × 10^19 cm^-3     1.7 × 10^19 cm^-3
eot of gate dielectric (tox)       1 nm                  1 nm
gate work function (ϕm)            4.6 ev                4.6 ev
drain supply voltage (vdd)         0.05 v, 0.7 v         0.05 v, 0.7 v
channel length (lg)                15-30 nm              15-30 nm
underlap s/d (lus, lud)            5 nm                  5 nm
mole fraction (x)                  0.25                  0.25
total device length (lt)           110 nm                110 nm
total device width (lw)            32 nm                 32 nm

fig. 2 transfer characteristics of (a) jl finfet and (b) wavy-jl finfet for varying lg = 15-30 nm at nd = 1.7 × 10^19 cm^-3, ϕm = 4.6 ev.

as shown in fig. 2a and 2b, id-vgs is plotted in logarithmic and linear scales, and an improvement in ion and ioff is observed for the wavy-jl finfet. the device layer (s/d and channel) is si1-xgex with mole fraction x = 0.25. substituting x = 0.25 results in a high si content in the sige material. therefore, the device acquires the properties of si, and the simulation data are extracted accordingly. the conduction mechanism of the jlt seems to be similar to that of im devices. a jl device has no concentration gradients across the s/d and channel regions, and its high n-type doping profile induces a volume inversion mechanism. from the fig.
2, it can be seen that, as lg is scaled down, ion increases and ioff reduces; from this point of view, the performance of the device is identified using the sige channel. in fig. 2c and 2d, id is plotted against vgs for different values of x at lg = 20 nm; as x increases, vth shifts, which in turn reduces ioff.

3. results and discussions

this section presents the results of the simulation study. the higher order derivatives of the id characteristics beyond gm give the second and third order parameters (g'm and g''m). these parameters in turn determine the second and third order intermodulation and linearity performance. in mos circuits, harmonic distortion occurs due to the non-linearity exhibited by the higher order derivatives of the id-vgs characteristics. circuits therefore commonly use balanced topologies, in which the even-order harmonics cancel out. the third order harmonic, which is represented by g''m, determines a lower limit of distortion, and hence its amplitude should be minimized. thus, reducing g''m and increasing gm is a sustainable solution for improving device linearity [31].

fig. 3 tgf and gm as a function of vgs for (a) jl finfet and (b) wavy-jl finfet for different lg = 15-30 nm at nd = 1.7 × 10^19 cm^-3, ϕm = 4.6 ev and x = 0.25.

fig. 3 shows tgf (gm/id) and gm; the values are extracted from the id data illustrated in fig. 2(a, b) and plotted as a function of vgs. the graphs cover different lg and show that jl transistors exhibit a lower gm at room temperature than im devices because of their reduced carrier mobility. the mobility is an important parameter for evaluating gm, but other factors may also affect it. according to the drift equation, the current that flows through the device layer depends strongly on the mobility, the electric field, and nd.
this can be identified without including the mobility degradation models in the simulator and evaluated at different dimensions. the parameter tgf is regarded as the available gain per unit of power dissipation. from fig. 3, gm increases as id increases for scaled lg, but the tgf decreases as lg scales down. however, the tgf values are near the ideal values, and the gm values are very high for the jl finfet.

fig. 4 g'm as a function of vgs for (a) jl finfet and (b) wavy-jl finfet for different lg = 15-30 nm at nd = 1.7 × 10^19 cm^-3, ϕm = 4.6 ev and x = 0.25.

fig. 5 g''m as a function of vgs for (a) jl finfet and (b) wavy-jl finfet for different lg = 15-30 nm at nd = 1.7 × 10^19 cm^-3, ϕm = 4.6 ev and x = 0.25.

table 2 sc parameters for 3d jl finfet at vds = 0.7 v.
lg (nm)    s-ssub (mv/decade)    ion ×10^-5 (a/μm)    ioff ×10^-10 (a/μm)
15         70.446                2.70                 0.190
20         80.12                 2.30                 2.57
25         81.419                2.28                 2.01
30         70.905                2.37                 0.20

table 3 sc parameters for wavy-jl finfet at vds = 0.7 v.
lg (nm)    s-ssub (mv/decade)    ion ×10^-5 (a/μm)    ioff ×10^-11 (a/μm)
15         66.702                2.99                 2.24
20         66.545                2.83                 2.12
25         66.427                2.69                 2.02
30         66.357                2.55                 1.94

table 4 sc parameters at different values of x for jl finfet at vds = 0.7 v, lg = 20 nm.
x          s-ssub (mv/decade)    ion ×10^-5 (a/μm)    ioff (a/μm)
0.25       80.12                 2.30                 2.57 × 10^-10
0.5        78.662                2.05                 7.48 × 10^-11
0.75       77.449                1.42                 1.79 × 10^-12

table 5 sc parameters at different values of x for 3d wavy-jl finfet at vds = 0.7 v, lg = 20 nm.
x          s-ssub (mv/decade)    ion ×10^-5 (a/μm)    ioff (a/μm)
0.25       66.545                2.83                 2.12 × 10^-11
0.5        66.89                 2.48                 7.44 × 10^-12
0.75       67.319                1.47                 9.88 × 10^-14

the higher order derivatives of id (g'm and g''m) as a function of vgs for different lg at vds = 0.7 v are plotted in fig. 4 and 5 respectively.
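as a rough illustration of how the quantities discussed in this section can be obtained from id-vgs data, the sketch below differentiates a sampled transfer curve numerically. it is not the authors' extraction procedure: the function name `transconductance_metrics`, the use of `numpy.gradient`, and the synthetic square-law curve are assumptions made here purely for illustration.

```python
import numpy as np


def transconductance_metrics(vgs, ids):
    """Return gm, g'm, g''m and tgf = gm/id from sampled id-vgs data."""
    vgs = np.asarray(vgs, dtype=float)
    ids = np.asarray(ids, dtype=float)
    gm = np.gradient(ids, vgs)     # gm   = d(id)/d(vgs)
    gm1 = np.gradient(gm, vgs)     # g'm  = second derivative of id
    gm2 = np.gradient(gm1, vgs)    # g''m = third derivative of id
    tgf = gm / ids                 # transconductance generation factor
    return gm, gm1, gm2, tgf


# synthetic square-law curve (assumed for illustration, not simulation data)
vgs = np.linspace(0.1, 0.7, 61)    # gate voltage sweep in volts
ids = 2e-5 * vgs**2                # drain current in amperes
gm, gm1, gm2, tgf = transconductance_metrics(vgs, ids)
```

repeated finite differencing amplifies numerical noise, so with real simulation output some smoothing of id before differentiation is usually advisable; the end points of each derivative are also less accurate than the interior points.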
usually, better linearity requires a smaller distortion amplitude of g'm and g''m. the value of vgs at which the higher order transconductance parameters (g'm and g''m) become zero is known as the zero crossover point (zcp), which decides the optimum bias point for device operation [15], [32]. from fig. 4 and fig. 5, the wavy-jl finfet shows the smaller higher order derivatives, i.e. the better linearity. the comparison of sc parameters for the jlt devices with varying lg is tabulated in tables 2 and 3, and that at fixed lg with different values of x is given in tables 4 and 5. from the overall simulation study, the wavy-jlt shows a good improvement in ion and is less sensitive to sces than the conventional one.

4. conclusion

the paper investigates the analog performance and the higher order parameters of both conventional and wavy jl finfets for different lg. due to the uniform doping profile along the device layer, ion is improved and ioff is decreased. the conduction mechanism of the jl finfet with a sige device layer is explained for different values of x. the simulation results are extracted at vdsat with x = 0.25 and ϕm = 4.6 ev to estimate the id characteristics, and the higher order parameters are evaluated accordingly. the results show that the higher order parameters exhibit lower non-linearity distortion for wavy-jl finfets than for the conventional jlt. therefore, the 3d wavy-jl finfet shows better channel controllability through the gate and thereby enhances id. on the other hand, the high nd together with the effective channel length and the width of the depletion layer is also responsible for the reduced sces.

references
[1] c. hu, “finfet and other new transistor technologies,” univ. of california, 2011.
[2] x. huang, w.-c.
lee, c. kuo, d. hisamoto, l. chang, j. kedzierski, e. anderson, h. takeuchi, y.-k. choi, k. asano, and others, “sub 50-nm finfet: pmos,” in technical digest. international of the electron devices meeting, iedm’99., 1999, pp. 67–70. [3] d. hisamoto, w.-c. lee, j. kedzierski, h. takeuchi, k. asano, c. kuo, e. anderson, t.-j. king, j. bokor, and c. hu, “finfet-a self-aligned double-gate mosfet scalable to 20 nm,” ieee trans. electron devices, vol. 47, no. 12, pp. 2320–2325, 2000. [4] s.-y. kim and j. h. lee, “hot carrier-induced degradation in bulk finfets,” ieee electron device lett., vol. 26, no. 8, pp. 566–568, 2005. [5] t. ernst, s. cristoloveanu, g. ghibaudo, t. ouisse, s. horiguchi, y. ono, y. takahashi, and k. murase, “ultimately thin double-gate soi mosfets,” ieee trans. electron devices, vol. 50, no. 3, pp. 830– 838, 2003. [6] j. p. colinge, “the new generation of soi mosfets,” rom. j. inf. sci. technol, vol. 11, no. 1, pp. 3– 15, 2008. [7] t. rudenko, s. barraud, y. m. georgiev, v. lysenko, and a. nazarov, “electrical characterization and parameter extraction of junctionless nanowire transistors.,” j. nano res., vol. 39, 2016. [8] j.-p. colinge, c. w. lee, a. afzalian, n. dehdashti, r. yan, i. ferain, p. razavi, b. o’neill, a. blake, m. white, and others, “soi gated resistor: cmos without junctions,” in proceedings of the ieee international soi conference, 2009, pp. 1–2. [9] a. kranti, r. yan, c. w. lee, i. ferain, r. yu, n. d. akhavan, p. razavi, and j. p. colinge, “junctionless nanowire transistor (jnt): properties and design guidelines,” in proceedings of the essderc conference, 2010, pp. 357–360. [10] r. rios, a. cappellani, m. armstrong, a. budrevich, h. gomez, r. pai, n. rahhal-orabi, and k. kuhn, “comparison of junctionless and conventional trigate transistors with lg down to 26 nm,” ieee electron device lett., vol. 32, no. 9, pp. 1170–1172, 2011. [11] b. vandana, b. s. patro, s. k. mohapatra, and j. k. 
das, “exploration towards electrostatic integrity for sige on insulator (sg-oi) on junctionless channel transistor (jlct),” facta universitatis, series: electronics and energetics, vol. 30, no. 3, pp. 383-390, 2017. [12] b. vandana, b. s. patro, j. k. das, and s. k. mohapatra, “physical insight of junctionless transistor with simulation study of strained channel,” ecti trans. electr. eng. electron. commun., vol. 15, no. 1, pp. 1–7, 2017. [13] p. ghosh, s. haldar, r. s. gupta, and m. gupta, “an investigation of linearity performance and intermodulation distortion of gme cgt mosfet for rfic design,” ieee trans. electron devices, vol. 59, no. 12, pp. 3263–3268, 2012. [14] y. pratap, s. haldar, r. s. gupta, and m. gupta, “performance evaluation and reliability issues of junctionless csg mosfet for rfic design,” ieee trans. device mater. reliab., vol. 14, no. 1, pp. 418–425, 2014. [15] s. k. mohapatra, k. p. pradhan, and p. k. sahu, “linearity and analog performance analysis in gsdgmosfet with gate and channel engineering,” in proceedings of the annual ieee india conference (indicon), 2014, pp. 1–5. [16] b. razavi and r. behzad, rf microelectronics, vol. 2. prentice hall new jersey, 1998. [17] j.-t. park and j.-p. colinge, “multiple-gate soi mosfets: device design guidelines,” ieee trans. electron devices, vol. 49, no. 12, pp. 2222–2229, 2002. [18] s. k. mohapatra, “investigation on performance metrics of nanoscale multigate mosfets towards rf and ic applications,” 2015. [19] f. balestra, s. cristoloveanu, m. benachir, j. brini, and t. elewa, “double-gate silicon-on-insulator transistor with volume inversion: a new device with greatly enhanced performance,” ieee electron device lett., vol. 8, no. 9, pp. 410–412, 1987. [20] l. mathew, m. sadd, s. kalpat, m. zavala, t. stephens, r. mora, s. bagchi, c. parker, j. vasek, and d. 
sing, “inverted t channel fet (itfet)-fabrication and characteristics of vertical-horizontal, thin body, multi-gate, multi-orientation devices, itfet sram bit-cell operation. a novel technology for 45nm and beyond cmos.,” in technical digest ieee international electron devices meeting, iedm ., 2005, pp. 713–716. [21] w. zhang, j. g. fossum, and l. mathew, “the itfet: a novel finfet-based hybrid device,” ieee trans. electron devices, vol. 53, no. 9, pp. 2335–2343, 2006. [22] a. n. hanna, m. t. ghoneim, r. r. bahabry, a. m. hussain, and m. m. hussain, “zinc oxide integrated area efficient high output low power wavy channel thin film transistor,” appl. phys. lett., vol. 103, no. 22, p. 224101, 2013. [23] a. n. hanna, a. m. hussain, and m. m. hussain, “wavy channel architecture thin film transistor (tft) using amorphous zinc oxide for high-performance and low-power semiconductor circuits,” in proceedings of the 73rd annual device research conference (drc), 2015, pp. 201–202. [24] k. p. pradhan, p. k. sahu, and r. ranjan, “investigation on asymmetric dual-k spacer (ads) trigate wavy finfet: a novel device,” in proceedings of the 3rd international conference on devices, circuits and systems (icdcs), 2016, pp. 137–140. [25] j.-w. yang and j. g. fossum, “on the feasibility of nanoscale triple-gate cmos transistors,” ieee trans. electron devices, vol. 52, no. 6, pp. 1159–1164, 2005. [26] http://www.synopsys.com/, “sentaurus tcad user’s manual,” in synopsys sentaurus device, synopsys, 2012. [27] d. b. m. klaassen, “a unified mobility model for device simulation-i. model equations and concentration dependence,” solid. state. electron., vol. 35, no. 7, pp. 953–959, 1992. [28] j. del alamo, s. swirhun, and r. m.
swanson, “simultaneous measurement of hole lifetime, hole mobility and bandgap narrowing in heavily doped n-type silicon,” in proceedings of the 1985 international electron devices meeting, 1985, vol. 31, pp. 290–293. [29] w. shockley and w. t. read jr, “statistics of the recombinations of holes and electrons,” phys. rev., vol. 87, no. 5, p. 835, 1952. [30] s. saha, “mosfet test structures for two-dimensional device simulation,” solid. state. electron., vol. 38, no. 1, pp. 69–73, 1995. [31] n. aggarwal, i. gupta, k. sikka, and r. chaujar, “tcad linearity performance evaluation of gate workfunction engineering in surrounding gate silicon nanowire mosfet,” nanoscale, vol. 9, no. b, p. 10, 2012. [32] s. kang, b. choi, and b. kim, “linearity analysis of cmos for rf application,” ieee trans. microw. theory tech., vol. 51, no. 3, pp. 972–977, 2003.

instruction facta universitatis series: electronics and energetics vol. 31, no. 1, march 2018, pp. 25-39 https://doi.org/10.2298/fuee1801025b

comparative evaluation of quasi-delay-insensitive asynchronous adders corresponding to return-to-zero and return-to-one handshaking

padmanabhan balasubramanian, school of electrical and electronic engineering, nanyang technological university, singapore

abstract. this article makes a comparative evaluation of quasi-delay-insensitive (qdi) asynchronous adders, realized using the delay-insensitive dual-rail code, which adhere to 4-phase return-to-zero (rtz) and 4-phase return-to-one (rto) handshake protocols. the qdi adders realized correspond to the following adder architectures: i) ripple carry adder, ii) carry lookahead adder, and iii) carry select adder. the qdi adders correspond to three different timing regimes viz. strong-indication, weak-indication, and early output. they are physically implemented using a 32/28nm cmos process.
the comparative evaluation shows that, overall, qdi adders which correspond to the 4-phase rto handshake protocol are better than the qdi adder counterparts which correspond to the 4-phase rtz handshake protocol in terms of latency, area, and average power dissipation.

key words: asynchronous circuits, qdi, adders, indication, standard cells, cmos

1. introduction

the international technology roadmap for semiconductors (itrs 2.0) [1] has identified design for variability as one of the key challenges for nanoelectronics. process variability and device variability have assumed more significance in the nanoelectronics era compared to the microelectronics era. this is because random dopant and atomistic fluctuations, high heat flux, negative bias temperature instability, electro-migration, hot carrier effects, stress-induced variation, process-induced defects, electrostatic discharge, and metrology and other manufacturing issues have become more prominent in the nanoelectronics era compared to the microelectronics era. to overcome these issues, solutions are being developed at various levels such as at material-level, process-level, device-level, circuit-level, and the system-level [2].

received september 18, 2017. corresponding author: padmanabhan balasubramanian. the author is now with the school of computer science and engineering, nanyang technological university, 50 nanyang avenue, singapore 639798 (e-mail: balasubramanian@ntu.edu.sg)

at the circuit-level, the qdi¹ asynchronous design method [3] employing delay-insensitive code(s) for data representation and processing and a 4-phase handshake protocol for data communication is considered to be robust and is construed to be a viable alternative to the synchronous design method [4].
this is because qdi circuits encompass several advantages [5] such as low power [6 – 9], tolerance to noise and electromagnetic interference [10 – 12], ability to withstand process, voltage and temperature variations [13] [14], self-checking [15], resistant to side channel attacks in the case of secure applications [16 – 19] etc. in general, qdi circuits widely employ the delay-insensitive dual-rail data encoding and the 4-phase rtz handshaking [20]. however, a new 4-phase rto handshake protocol was proposed [21] for qdi circuits. based on a few case studies [22] [23], it was reported that qdi circuits which correspond to the rto protocol report better design metrics than their qdi circuit counterparts adhering to the rtz protocol. qdi circuits performing data transactions based on either the rtz or the rto protocol are robust. qdi circuits and systems are guaranteed to be correct by construction since they adopt unbounded delay models for gates and wires, with the only exception of isochronic forks 2 [24] which represent the weakest compromise to delay-insensitivity. in this work, the adder which forms an important datapath of any processing unit is considered for the analysis to compare the efficiency of the rto protocol versus the rtz protocol. various adder architectures such as the ripple carry adder (rca), the carry lookahead adder (cla), and the carry select adder (csla) are considered for qdi implementation based on the rtz and rto protocols to perform a comprehensive comparative evaluation. this work builds upon [25], wherein only the rca architecture was considered to comparatively evaluate the rtz and rto protocols. the rest of this article is organized as follows. section 2 provides an overview of: i) qdi circuit operation encompassing delay-insensitive data encoding and data transaction using the rtz and rto handshake protocols, and ii) the types of qdi circuits, their timing characteristics, and their general properties. 
section 3 presents the logic rules for transforming qdi circuits corresponding to the rtz protocol into qdi circuits adhering to the rto protocol and vice-versa. also, some circuit illustrations are provided in this section. section 4 presents the simulation results corresponding to several 32-bit qdi rcas, clas, and cslas, implemented using delay-insensitive dual-rail data encoding and adhering to rtz and rto handshaking. the qdi adders realized correspond to strong-indication, weak-indication, and early output. section 5 provides the conclusions.

2. qdi circuit operation, types and properties

2.1. operation of qdi circuit

the architecture of a qdi circuit is correlated with the sender (sx) and receiver (rx) analogy in figure 1a. the qdi circuit is sandwiched between the current stage and the next stage register banks. a register in a qdi design is a 2-input c-element that is represented by the circle with the marking c in the figures. the c-element outputs binary 1 or 0 only if all its inputs are binary 1 or 0 respectively and would maintain its existing steady-state even if any of its inputs is different.

1 qdi design represents a robust flavor of asynchronous circuit design methods. qdi circuits are the practically realizable delay-insensitive asynchronous circuits.
2 an isochronic fork implies that the up-going or down-going signal transitions on all the ends of the fork are assumed to be concurrent.

fig. 1 (a) an asynchronous circuit, correlated with the sender-receiver analogy for illustration. completion detectors corresponding to (b) the rtz handshake protocol, and (c) the rto handshake protocol.
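the hold behaviour of the c-element described above can be sketched as a small behavioural model. the class name `CElement` and its `update` method are hypothetical names chosen for this illustration, and the model ignores gate delays and transistor-level detail.

```python
class CElement:
    """Behavioural model of a C-element: a state-holding gate."""

    def __init__(self, initial=0):
        self.out = initial

    def update(self, *inputs):
        """Apply new input values and return the (possibly held) output."""
        if all(v == 1 for v in inputs):
            self.out = 1          # all inputs high: output goes high
        elif all(v == 0 for v in inputs):
            self.out = 0          # all inputs low: output goes low
        # inputs disagree: keep the previous output
        return self.out
```

for a 2-input register stage, calling `update(1, 0)` after the output has been set to 1 leaves the output at 1, which is the state-holding property the text relies on.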
a single-rail data wire x is encoded using the dual-rail code [26] into two data wires as x1 and x0. based on the rtz protocol [20], the data x = 1 is represented by x1 = 1 and x0 = 0, and the data x = 0 is represented by x0 = 1 and x1 = 0. x1 = x0 = 0 represents the spacer. x1 = x0 = 1 is invalid since the coding scheme is unordered [27], i.e. no code word is allowed to be a subset of another code word. according to the rtz protocol, the application of primary inputs to a qdi circuit should follow the sequence: data-spacer-data-spacer, and so forth, with each input data followed by the rtz of the encoded data wires. note that binary 1 is used to represent data with respect to the rtz protocol. on the other hand, according to the rto protocol [21], binary 0 is used to represent data. as per the rto protocol, the valid data y = 1 is represented by y1 = 0 and y0 = 1, and y = 0 is represented by y0 = 0 and y1 = 1. the spacer is represented by y0 = y1 = 1. y1 = y0 = 0 is deemed invalid since the coding scheme is unordered. as per the rto protocol, the application of primary inputs to a qdi circuit follows the sequence: spacer-data-spacer-data, and so forth, with each input data followed by the rto of the encoded data wires.

the 4-phase handshake protocol, whether it is rtz or rto, consists of four phases which will be explained with reference to figure 1a by considering dual-rail encoded data. however, the explanation would be applicable for data represented using any delay-insensitive 1-of-n code [26]. as per the rtz protocol, in the first phase, the dual-rail data bus shown in figure 1a which is specified by (x1, x0), (y1, y0), and (z1, z0) is in the spacer state, and ackin is high. sx transmits data and this results in rising signal transitions on any one of the corresponding dual rails of the entire dual-rail data bus. in the second phase, rx receives the data sent, and it drives ackout high.
in the third phase, sx waits for ackin to go low and then resets the entire dual-rail data bus to the spacer state i.e. all 0s. in the fourth phase, after an unbounded but a finite and positive time, rx would drive ackout low i.e. ackin becomes high. with this one data transaction is said to be complete, and the asynchronous circuit is ready to proceed with the next data transaction. an example completion detector, which comprises the dual-rail encoded primary inputs (x1, x0), (y1, y0), and (z1, z0), that indicates or acknowledges the receipt of data and the all zeroes spacer on the primary inputs through its output cdrtz is illustrated in figure 1b. the completion detector shown in figure 1b corresponds to the rtz protocol.

with respect to the rto handshake protocol, in the first phase, ackin is 1. sx would transmit the spacer i.e. all 1s, and this causes rising signal transitions on all the rails of the dual-rail data bus. in the second phase, rx receives the spacer sent, and it drives ackout high. in the third phase, sx waits for ackin to assume 0 and then sends the input data by resetting any one of the corresponding dual-rails of the entire dual-rail data bus. then in the fourth phase, after an unbounded but a finite and positive time, rx would drive ackout low i.e. ackin becomes high. with this one data transaction is said to be complete, and the qdi circuit is ready to commence the next data transaction. an example completion detector that comprises the dual-rail encoded primary inputs (x1, x0), (y1, y0), and (z1, z0), which indicates the receipt of data and the all ones spacer on the primary inputs through its output cdrto is depicted by figure 1c. this completion detector corresponds to the rto protocol.

2.2. types of qdi circuits

qdi circuits are classified as strongly indicating, weakly indicating, and early output types [28].
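the dual-rail code words of section 2.1 and the conditions that a completion detector has to recognise can be summarised in a short sketch. the helper names `encode` and `bus_state` are hypothetical, and `bus_state` only models the logical conditions "all data" and "all spacer"; it does not reproduce the gate-level detectors of figures 1b and 1c.

```python
SPACER = {"rtz": (0, 0), "rto": (1, 1)}


def encode(bit, protocol):
    """Return the dual-rail pair (x1, x0) for a single-rail bit."""
    rtz_pair = (1, 0) if bit else (0, 1)
    if protocol == "rtz":
        return rtz_pair
    if protocol == "rto":
        # rto code words are the bitwise complements of the rtz ones
        return tuple(1 - rail for rail in rtz_pair)
    raise ValueError("protocol must be 'rtz' or 'rto'")


def bus_state(pairs, protocol):
    """Classify a dual-rail bus as 'data', 'spacer' or 'incomplete'."""
    spacer = SPACER[protocol]
    invalid = tuple(1 - rail for rail in spacer)
    pairs = [tuple(p) for p in pairs]
    if any(p == invalid for p in pairs):
        raise ValueError("invalid dual-rail code word on the bus")
    n_spacer = sum(p == spacer for p in pairs)
    if n_spacer == len(pairs):
        return "spacer"
    if n_spacer == 0:
        return "data"
    return "incomplete"   # some pairs still in transit
```

a receiver in this simplified picture would acknowledge only when `bus_state` reports "data" or "spacer", and would keep its previous acknowledgement while the result is "incomplete".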
a strong-indication qdi circuit [29] [30] waits to receive all the primary inputs, whether they are data or spacer, and then starts data processing to produce the required primary outputs. a weak-indication qdi circuit [29] [31] would produce some of the primary outputs after receiving a subset of the primary inputs. however, the production of at least one primary output is delayed till the last primary input is received. an early output qdi circuit [32] [33] is the most relaxed of the three in that it is able to produce all the primary outputs after receiving a subset of the primary inputs. if an early output qdi circuit produces data early, it is said to be of early set type, and if an early output qdi circuit assumes the spacer state early, it is said to be of early reset type. the input-output timing behaviour of strong-indication, weak-indication, and early output qdi circuits is captured by figure 2. the early set and reset behaviours are shown in figure 2.

fig. 2 input-output timing characteristic of strong-indication, weak-indication, and early output qdi circuits (panels: inputs arrival; outputs production for strong-indication, weak-indication, and early output, with the early set and early reset behaviours marked)

2.3. general properties of qdi circuits

qdi circuits, regardless of whether they are strongly indicating or weakly indicating or early output type, have some properties in common. firstly, qdi circuits should be free of wire and gate orphans [34] [35]. a wire orphan refers to an unacknowledged signal transition on a wire. the wire orphan problem, if any, can be resolved through the isochronic fork assumption. a gate orphan is an unacknowledged signal transition on an intermediate gate output.
the gate orphan problem is difficult to resolve and to overcome it, sophisticated timing assumption(s) might be required. secondly, qdi circuits tend to satisfy the monotonic cover constraint [16], which implies the activation of a unique signal path from a primary input to a primary output for each input data applied. the monotonic cover constraint is implicit in a disjoint sum-of-products expression [36], which is used to synthesize a qdi circuit. in a disjoint sum-of-products expression, the product terms are mutually orthogonal, i.e. the logical conjunction of any two product terms in a disjoint sum-of-products expression yields null [37 – 39]. thirdly, the signal transitions ripple monotonically [40] from the first logic level up to the last logic level in a qdi circuit [41]. the transitions either increase or decrease monotonically. for a qdi circuit that adheres to the rtz protocol, for the application of data, the transitions would increase monotonically and for the application of spacer, the transitions would decrease monotonically. on the contrary, for a qdi circuit adhering to the rto protocol, for the application of spacer, the transitions would increase monotonically, and for the application of data, the transitions would decrease monotonically throughout the circuit. it is important to ascertain the type of a qdi circuit when it is composed using many qdi sub-circuits, as is common in the design of qdi arithmetic circuits. in general, a cascade of strong-indication or weak-indication or early output qdi sub-circuits yields a strong-indication or a weak-indication or an early output qdi circuit respectively. sometimes there might be an exception when composing early output qdi sub-circuits. for example, it was noted in [42] [43] that a cascade of early output qdi full adders led to a relative-timed rca, whereas in [33] [44] a cascade of early output qdi full adders led to an early output rca.
this might be because in terms of robustness, the strong-indication timing model tops the hierarchy followed by the weak-indication timing model, which is succeeded by the early output timing model. the relative-timing model is not qdi and is the least robust of the asynchronous timing models described. relative-timed asynchronous circuits [45] require explicit and perhaps complicated timing assumptions to ensure their safe operation but could exhibit more optimized design metrics compared to the qdi circuits. hence, in the case of relative-timing, the robustness is traded off for greater design optimization [46]. further, a cascade of qdi sub-circuits with more robust and less robust timing models generally causes the least robust timing model to be ascribed to the resultant qdi circuit. for example, a cascade of strong-indication and weak-indication qdi sub-circuits leads to a weak-indication qdi circuit. a cascade of strong-indication and/or weak-indication qdi sub-circuits and early output qdi sub-circuit(s) leads to an early output qdi circuit.

3. logic rules for rtz to rto and vice-versa protocol conversion

qdi circuits, regardless of whether they correspond to the rtz or the rto protocol, when physically realized, generally consist of c-elements³ and simple and complex logic gates. any c-elements used in a qdi circuit, whether they correspond to the rtz or the rto protocol, would remain unchanged and their inputs would also be unchanged when transforming a qdi circuit which adheres to the rtz protocol into a qdi circuit which corresponds to the rto protocol and vice-versa. the logic transformation rules to be discussed below, which could facilitate the rtz to rto and vice-versa protocol conversion, are applicable only to the discrete and complex logic gates comprising the respective circuits and exclude any c-elements. the logic transformation rules for the handshake protocols conversion tend to obey the well-established duality principle of boolean algebra.
the duality principle [47] states that every algebraic expression that is deduced using the postulates of boolean algebra remains valid if the logical operators and identity elements are interchanged. herein, it implies that for the rtz to rto protocol conversion the and operator should be replaced by the or operator and the or operator should be replaced by the and operator; the reverse is applicable for the rto to rtz protocol conversion. an example set of logic transformation rules for the handshake protocols conversion and their proofs by induction are provided below. these rules may be extended without any loss of generality depending upon a qdi circuit composition.

3 the c-element outputs binary 1 or 0 only when all its inputs are binary 1 or 0. if any of its inputs is different, the c-element would maintain its existing steady-state. the c-element is portrayed by an and gate with the marking 'c' on its periphery.

rtz: p + q ↔ rto: pq (1)
rtz: p + qr ↔ rto: p (q + r) (2)
rtz: pq + rs ↔ rto: (p + q) (r + s) (3)

the function (p + q) corresponding to the rtz protocol, given in (1), is implemented using a 2-input or gate, and the rto equivalent viz. pq is implemented using a 2-input and gate. table 1 shows the proof by induction for (1). the 2-input or and and gates are simple logic gates present in a standard digital cell library [48]. the function (p + qr) corresponding to the rtz protocol, given in (2), can be implemented using the ao21 gate and its rto equivalent viz. p (q + r) can be implemented using the oa21 gate. table 2 shows the proof by induction for (2). the function (pq + rs) corresponding to the rtz protocol, given in (3), can be implemented using the ao22 gate and the rto equivalent i.e. (p + q) (r + s) can be implemented using the oa22 gate. table 3 shows the proof by induction for (3).
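one way to check rules (1) to (3) mechanically is to use the fact that rto signals are the bitwise complements of the corresponding rtz signals: an rto gate fed with complemented inputs should produce the complement of the rtz gate output. the sketch below tests this exhaustively; the names `RULES` and `is_dual` are hypothetical, and this check is an illustration added here rather than the tabular proof used in the paper.

```python
from itertools import product

# each entry: (rtz function, rto function, number of inputs)
RULES = {
    "(1)": (lambda p, q: p | q, lambda p, q: p & q, 2),
    "(2)": (lambda p, q, r: p | (q & r), lambda p, q, r: p & (q | r), 3),
    "(3)": (lambda p, q, r, s: (p & q) | (r & s),
            lambda p, q, r, s: (p | q) & (r | s), 4),
}


def is_dual(rtz, rto, n):
    """True if rto(complemented inputs) == complement of rtz(inputs)."""
    for bits in product((0, 1), repeat=n):
        complemented = tuple(1 - b for b in bits)
        if rto(*complemented) != 1 - rtz(*bits):
            return False
    return True


# report which of the listed rules satisfy the duality condition
results = {name: is_dual(*rule) for name, rule in RULES.items()}
```

the same `is_dual` helper could be applied to any further gate pair before adding it to an rtz to rto conversion flow, since the loop simply enumerates every input combination.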
the ao21, oa21, ao22 and oa22 gates are complex logic gates present in a standard digital cell library [48].

table 1 proof by induction for (1)

inputs     rtz      rto
p   q      p + q    pq
0   0      0        0
0   1      1        0
1   0      1        0
1   1      1        1

recall that binary 1 is used to represent the data with respect to the rtz protocol and binary 0 is used to represent the data with respect to the rto protocol after data encoding. this is in conformance with the duality property of boolean algebra, which states that identity elements can be interchanged [47]. as mentioned in section 2.1, the zeroes spacer is used in the case of the rtz protocol and the ones spacer is used in the case of the rto protocol. given these, it can be seen from table 1 that if the input p or q is 1, which indicates the data with respect to the rtz protocol, (p + q) would yield 1, and when p and q are 0 then (p + q) would yield 0 indicating the rtz state. on the other hand, if either p or q is 0 in table 1, which indicates the data based on the rto protocol, pq would evaluate to 0, and when p and q are 1, pq would evaluate to 1, which indicates the rto state. in tables 2 and 3, sub-functions are additionally introduced for the sake of clarity and illustration. in the case of table 2, if p or qr is 1, then (p + qr) evaluates to 1 signifying the data according to the rtz protocol, and if p and qr are 0, then (p + qr) evaluates to 0 signifying the rtz state. if p or (q + r) is 0, then p (q + r) evaluates to 0 signifying the data according to the rto protocol, and if p and (q + r) are 1, then p (q + r) evaluates to 1 signifying the rto state. with respect to table 3, if pq or rs is 1, then (pq + rs) evaluates to 1 signifying the data as per the rtz protocol, and if pq and rs are 0, then (pq + rs) evaluates to 0 signifying the rtz state. however, if (p + q) or (r + s) is 0, then (p + q) (r + s) evaluates to 0 signifying the data as per the rto protocol.
supposing (p + q) and (r + s) are 1, then (p + q) (r + s) would evaluate to 1 signifying the rto state.

table 2 proof by induction for (2)

inputs       rtz sub-function   rtz      rto sub-function   rto
p   q   r    qr                 p + qr   q + r              p (q + r)
0   0   0    0                  0        0                  0
0   0   1    0                  0        1                  0
0   1   0    0                  0        1                  0
0   1   1    1                  1        1                  0
1   0   0    0                  1        0                  0
1   0   1    0                  1        1                  1
1   1   0    0                  1        1                  1
1   1   1    1                  1        1                  1

table 3 proof by induction for (3)

inputs           rtz sub-functions   rtz       rto sub-functions   rto
p   q   r   s    pq    rs            pq + rs   p + q    r + s      (p + q) (r + s)
0   0   0   0    0     0             0         0        0          0
0   0   0   1    0     0             0         0        1          0
0   0   1   0    0     0             0         0        1          0
0   0   1   1    0     1             1         0        1          0
0   1   0   0    0     0             0         1        0          0
0   1   0   1    0     0             0         1        1          1
0   1   1   0    0     0             0         1        1          1
0   1   1   1    0     1             1         1        1          1
1   0   0   0    0     0             0         1        0          0
1   0   0   1    0     0             0         1        1          1
1   0   1   0    0     0             0         1        1          1
1   0   1   1    0     1             1         1        1          1
1   1   0   0    1     0             1         1        0          0
1   1   0   1    1     0             1         1        1          1
1   1   1   0    1     0             1         1        1          1
1   1   1   1    1     1             1         1        1          1

example circuits to illustrate the conversion from rtz to rto protocol and vice-versa are shown in figure 3.

[figure 3: schematics of six dual-rail full adders, panels (a) to (f)]
fig 3 strongly indicating full adder [50] corresponding to (a) rtz handshaking and (b) rto handshaking; weakly indicating full adder [51] corresponding to (c) rtz handshaking and (d) rto handshaking; early output full adder [33], corresponding to (e) rtz handshaking (early reset type) and (f) rto handshaking (early set type)

figure 3 portrays strong-indication, weak-indication and early output implementations of the full adder.
the full adder adds an augend and an addend along with a carry input and produces the sum output and any carry overflow. in figure 3, (a1, a0), (b1, b0) and (cin1, cin0) represent the dual-rail augend, addend and carry inputs of the full adders, and (sum1, sum0) and (cout1, cout0) represent the dual-rail sum and carry outputs. figures 3a, 3c and 3e depict full adder implementations which correspond to the rtz protocol, and figures 3b, 3d and 3f show the respective full adder realizations which correspond to the rto protocol. the 2-input or gates, ao21 gates and ao22 gates of figures 3a, 3c and 3e are replaced by 2-input and gates, oa21 gates and oa22 gates respectively in figures 3b, 3d and 3f, in accordance with (1), (2) and (3), given earlier. note that there is no change whatsoever in the inputs or outputs of the corresponding circuits belonging to the rtz and the rto protocols in figure 3. moreover, the 2-input c-elements and their corresponding inputs remain unchanged.

4. simulation results

several 32-bit qdi rcas [49–55], clas [56], [57] and cslas [58] were semi-custom realized using the standard digital library cells of a 32/28nm cmos process [48]. only the 2-input c-element was custom realized, by modifying the ao222 complex gate to introduce feedback. the 2-input c-element was realized using 12 transistors and was made available to implement the various qdi adders, which correspond to rtz and rto protocols. any high-input c-element functionality, wherever likely in an adder design, was safely decomposed in qdi style [59] to avoid the problem of gate orphans. about 1000 random input vectors were identically supplied to all the qdi adders through a test bench at time intervals of 20ns to perform the functional simulations and to capture their respective switching activities. the value change dump (.vcd) files generated through the functional simulations were used to estimate the average power dissipation.
the worst-case (forward) latency, i.e. the critical path delay, and the area of the qdi adders were also estimated. a default wire load model was considered while estimating the design metrics to include the effect of parasitics in the simulations. the design metrics viz. latency, area, and average power dissipation, estimated for the various qdi adders, which correspond to the rtz and rto protocols, are given in table 4. the input registers and the completion detectors of the various qdi adders corresponding to the rtz and rto protocols are respectively identical. so the differences between their design metrics can be attributed to the respective differences between their function blocks. before discussing the results, it should be noted that the focus of this article is not to comment on the efficiency of various adder architectures or about the type of the adders with relation to latency or area or power optimization, and these have already been discussed in the published literature. rather, the intent of this article is to provide a comparison between the design metrics of different qdi adders based on their realization using rtz and rto protocols, and eventually to arrive at a general conclusion regarding which of these handshake protocols is preferable to potentially achieve enhanced optimizations in the design metrics regardless of the extent of optimization achievable. the improvements in the design metrics which may be achieved by one protocol over the other could in part be explained as due to the differences in the implementation. however, the extent of optimizations in the design metrics achievable would also depend on the digital cell library targeted, the technology node and the pvt corner chosen to perform the simulations.
thus the results given in table 4 are to be used as a reference to guide the choice of a 4-phase handshake protocol for the effective design of qdi circuits.

table 4 design metrics of 32-bit qdi adders corresponding to rtz and rto protocols, estimated using synopsys tools based on implementation using a 32/28nm cmos process

qdi adder reference;                          4-phase rtz handshake protocol          4-phase rto handshake protocol
and adder type                                latency (ns)  area (µm²)  power (µW)    latency (ns)  area (µm²)  power (µW)
rcas
[49]; si                                      14.61         2529        2190          14.15         2529        2185
[52]; si                                      9.26          2504.60     2181          8.74          2374.48     2167
[50]; si                                      9.04          2293.14     2172          8.88          2293.15     2168
[52]; wi                                      8.24          2423.27     2177          8.03          2358.21     2167
[53]; wi                                      7             2016.63     2171          6.95          2016.63     2167
[54]; wi                                      9.66          2642.85     2192          9.66          2642.85     2191
[55]; wi                                      4.43          2097.96     2174          3.79          2097.96     2170
[51]; wi                                      3.32          2049.16     2171          3.31          2049.16     2167
[33]; eo                                      3.10          1658.80     2161          2.93          1658.80     2157
[44]; eo                                      2.14          2436.48     2173          2.13          2649.96     2176
clas
[56], [55]; wi: regular                       3.31          2951.88     2191          3.19          2984.41     2184
[56], [55]; wi: hybrid                        3.08          2845.14     2189          2.97          2873.61     2182
[56], [55]; wi: regular with alias logic      2.46          2992.55     2192          2.36          3025.08     2185
[56], [55]; wi: hybrid with alias logic       2.38          2880.72     2190          2.29          2909.19     2183
[56], [51]; wi: regular                       3.14          2915.29     2188          3.10          2947.82     2182
[56], [51]; wi: hybrid                        2.93          2807.02     2186          2.89          2835.49     2180
[56], [51]; wi: regular with alias logic      2.32          2955.95     2190          2.30          2988.48     2183
[56], [51]; wi: hybrid with alias logic       2.25          2842.60     2187          2.22          2871.07     2181
[57]; eo: regular                             2.75          2569.65     2177          2.73          2553.39     2169
[57]; eo: hybrid                              2.53          2455.80     2175          2.51          2441.56     2167
cslas
[58] – [33], [60]; eo: non-uniform            3.23          3384.44     2312          3.15          3384.44     2303
[58] – [33], [60]; eo: uniform                2.46          3000.17     2293          2.38          3000.17     2285

legends used: si – strong-indication; wi – weak-indication; eo – early output
hybrid clas incorporate a 4-bit least significant rca, which improves the design metrics of regular clas

overall, it can be observed from table 4 that the qdi adders based on
the rto protocol feature less latency (and hence less cycle time) and power dissipation and occupy almost the same area as their qdi adder counterparts based on the rtz protocol. the completion detector of a qdi circuit corresponding to the rtz protocol consists of a series of 2-input or gates whose outputs are synchronized by a c-element tree. on the other hand, the completion detector of a qdi circuit adhering to the rto protocol comprises a series of 2-input and gates whose outputs are synchronized by a tree of c-elements. further, any 2-input or gates present in the functional block(s) of a qdi adder corresponding to the rtz protocol would be replaced by 2-input and gates in the functional block(s) of a qdi adder counterpart adhering to the rto protocol. in static cmos implementations, it is well known that the or gate is more expensive than the and gate in terms of delay, area, and power dissipation [61] due to the series stacking of pmos transistors in the pull-up network of the former, in contrast to the parallel connection of pmos transistors in the pull-up network of the latter. hence the use of 2-input and gates instead of 2-input or gates in the qdi adders and their respective completion detectors implies better optimized design metrics can be expected for the rto protocol compared to the rtz protocol. in table 4, it can be noticed that in some scenarios the areas of the qdi adders corresponding to the rtz and rto protocols are the same. for example, the non-uniform 32-bit csla with the input partition of 8-7-6-4-3-2-2 and the uniform 32-bit csla with the input partition of 8-8-8-8 occupy similar areas with respect to both the handshake protocols. the non-uniform and uniform qdi cslas, highlighted in table 4, are constructed using the early output full adder of [33], and the strongly indicating 2:1 multiplexer (mux) of [60].
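to put the reported differences in perspective, the relative savings can be computed directly from the table 4 entries. the python sketch below is only an illustration: the three rows were copied from table 4, while the function and variable names (`percent_reduction`, `ROWS`, `summarize`) are hypothetical.

```python
def percent_reduction(rtz_value, rto_value):
    """Reduction of the RTO figure relative to the RTZ figure, in percent."""
    return (rtz_value - rto_value) / rtz_value * 100.0


# (latency in ns, power in µW) pairs copied from three rows of Table 4.
ROWS = {
    "rca [55]; wi": {"rtz": (4.43, 2174), "rto": (3.79, 2170)},
    "rca [33]; eo": {"rtz": (3.10, 2161), "rto": (2.93, 2157)},
    "csla uniform; eo": {"rtz": (2.46, 2293), "rto": (2.38, 2285)},
}


def summarize(rows):
    """Return {adder: (latency reduction %, power reduction %)}, rounded to 2 places."""
    summary = {}
    for name, metrics in rows.items():
        (lat_z, pow_z), (lat_o, pow_o) = metrics["rtz"], metrics["rto"]
        summary[name] = (
            round(percent_reduction(lat_z, lat_o), 2),
            round(percent_reduction(pow_z, pow_o), 2),
        )
    return summary


if __name__ == "__main__":
    for name, (latency_pct, power_pct) in summarize(ROWS).items():
        print(f"{name}: latency reduced by {latency_pct}%, power by {power_pct}%")
```

for these three rows the latency reduction is a few percent or more, whereas the power reduction stays below one percent.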
with respect to the rtz protocol, the early output full adder of [33] consists of four ao22 gates, four 2-input c-elements and two 2-input or gates, as shown in figure 3e. based on the rto protocol, the early output full adder of [33] would comprise four oa22 gates, four 2-input c-elements and two 2-input and gates, as shown in figure 3f. the strongly indicating 2:1 mux design of [60], which is called sidco, requires seven 2-input c-elements and four 2-input or gates for realization based on the rtz protocol. on the other hand, for implementation based on the rto protocol, the strongly indicating 2:1 mux design would require seven 2-input c-elements and four 2-input and gates. the ao22 and oa22 gates of the digital cell library [48] have the same area of 2.54 µm², and the 2-input or gate and the 2-input and gate occupy the same area of 2.03 µm². as a result, the areas of the full adder and the 2:1 mux of a qdi csla would be the same regardless of the handshake protocol adopted. this explains why the non-uniform and uniform qdi cslas in table 4 feature the same area with respect to both rtz and rto protocols. although the areas of ao22 and oa22 gates, and the areas of the 2-input or gate and the 2-input and gate are the same in [48], their corresponding delay and power dissipation values are different. this is the reason why the qdi cslas based on the rto protocol have less latency and power dissipation than the qdi cslas based on the rtz protocol, as seen in table 4. having similar cell areas for the dual logic gates viz. or and and, ao21 and oa21, ao22 and oa22 etc. in [48] is rather uncommon in the case of commercial standard cell libraries. the standard digital cell library [48] does not have foundry support and is meant for academic teaching and research.
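returning to the area argument above, the equality can be written out as a short calculation. in this sketch, $A_C$ denotes the area of the custom 2-input c-element, which is not stated in this section and is therefore left symbolic (an assumption of the illustration); the gate counts and cell areas are the ones quoted above.

```latex
\begin{align*}
A_{\mathrm{FA}}^{\mathrm{RTZ}} &= 4A_{\mathrm{AO22}} + 2A_{\mathrm{OR2}} + 4A_C
  = 4(2.54) + 2(2.03) + 4A_C = 14.22 + 4A_C \ \mu\mathrm{m}^2,\\
A_{\mathrm{FA}}^{\mathrm{RTO}} &= 4A_{\mathrm{OA22}} + 2A_{\mathrm{AND2}} + 4A_C
  = 4(2.54) + 2(2.03) + 4A_C = 14.22 + 4A_C \ \mu\mathrm{m}^2,\\
A_{\mathrm{MUX}}^{\mathrm{RTZ}} &= 4A_{\mathrm{OR2}} + 7A_C = 8.12 + 7A_C \ \mu\mathrm{m}^2,\\
A_{\mathrm{MUX}}^{\mathrm{RTO}} &= 4A_{\mathrm{AND2}} + 7A_C = 8.12 + 7A_C \ \mu\mathrm{m}^2.
\end{align*}
```

the two full adder expressions and the two mux expressions coincide term by term, so the csla areas match for any value of $A_C$.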
hence, it may be safely hypothesized that if a commercial digital cell library is used for the physical implementation of the qdi adders given in table 4, then the rto protocol would facilitate higher percentage optimizations in the design metrics than the rtz protocol and therefore the improvements in the design metrics reported in table 4 would tend to serve as a baseline.

5. conclusions

this article discussed the implementation of various qdi adders, which correspond to diverse architectures and timing regimes, by utilizing the delay-insensitive dual-rail code, based on the 4-phase rtz and rto handshake protocols. the logic transformation rules governing the circuit conversions between rtz and rto protocols were presented, and their proofs by induction were also provided. the simulations were performed by using a 32/28nm cmos process. the simulation results show that qdi adders corresponding to the rto protocol generally feature better design parameters than the qdi adder counterparts which adhere to the rtz protocol. hence it is concluded that the 4-phase rto protocol is potentially more efficient than the 4-phase rtz protocol to implement handshaking in qdi asynchronous (arithmetic) circuits.

references

[1] itrs design report. available: http://www.itrs2.net [2] s. kundu and a. sreedhar, nanoscale cmos vlsi circuits: design for manufacturability, mcgraw-hill, new york, usa, 2010. [3] a.j. martin, s.m. burns, t.k. lee, d. borkovic and p.j. hazewindus, “the first asynchronous microprocessor: the test results,” acm sigarch computer architecture news, vol. 17, pp. 95-98, 1989. [4] a.j. martin and m. nystrom, “asynchronous techniques for system-on-chip design,” proceedings of the ieee, vol. 94, pp. 1089-1120, 2006. [5] c.h. van kees berkel, m.b. josephs and s.m. nowick, “scanning the technology applications of asynchronous circuits,” proceedings of the ieee, vol. 87, pp. 223-233, 1999.
[6] s.b. furber, d.a. edwards and j.d. garside, “amulet3: a 100 mips asynchronous embedded processor,” in proceedings of the international conference on computer design, pp. 329-334, 2000. [7] l. necchi, l. lavagno, d. pandini and l. vanzago, “an ultra-low energy asynchronous processor for wireless sensor networks,” in proceedings of the 12 th ieee international symposium on asynchronous circuits and systems, 2006, pp. 1-8. [8] b.z. tang and f. lane, “low power qdi asynchronous fft,” in proceedings of the 22 nd ieee international symposium on asynchronous circuits and systems, 2016, pp. 87-88. [9] w. jiang, d. bertozzi, g. miorandi, s.m. nowick, w. burleson and g. sadowski, “an asynchronous noc router in a 14nm finfet library: comparison to an industrial synchronous counterpart,” in proceedings of the design, automation and test in europe conference and exhibition, 2017, pp. 732-733. [10] n.c. paver, p. day, c. farnsworth, d.l. jackson, w.a. lien and j. liu, “a low-power, low noise, configurable self-timed dsp”, in proceedings of the 4 th international symposium on advanced research in asynchronous circuits and systems, pp. 32-42, 1998. [11] a.j. martin and m. nystrom, “asynchronous techniques for noise tolerant nanoelectronics,” technical report situs-tr-04-01, situs logic, pasadena, ca, usa, 2004. [12] g.f. bouesse, g. sicard, a. baixas and m. renaudin, “quasi delay insensitive asynchronous circuits for low emi”, in proceedings of the 4 th international workshop on electromagnetic compatibility of integrated circuits, 2004, pp. 27-31. [13] k.j. kulikowski, v. venkataraman, z. wang, a. taubin and m. karpovsky, “asynchronous balanced gates tolerant to interconnect variability”, in proceedings of the ieee international symposium on circuits and systems, 2008, pp. 3190-3193. [14] i.j. chang, s.p. park and k. roy, “exploring asynchronous design techniques for process-tolerant and energy-efficient subthreshold operation”, ieee journal of solid-state circuits, vol. 
45, pp. 401-410, 2010. [15] i. david, r. ginosar and m. yoeli, “self-timed is self-checking,” journal of electronic testing: theory and applications, vol. 6, pp. 219-228, 1995. [16] l.a. plana, p.a. riocreux, w.j. bainbridge, a. bardsley, s. temple, j.d. garside, z.c. yu, “spa – a secure amulet core for smartcard applications,” microprocessors and microsystems, vol. 27, pp. 431-446, 2003. [17] d. sokolov, j. murphy, a. bystrov and a. yakovlev, “design and analysis of dual-rail circuits for security applications,” ieee transactions on computers, vol. 54, pp. 449-460, 2005. [18] f. burns, a. bystrov, a. koelmans and a. yakovlev, “design and security evaluation of balanced 1-of-n circuits,” iet computers and digital techniques, vol. 6, pp. 125-135, 2012. [19] w. cilio, m. linder, c. porter, j. di, d.r. thompson and s.c. smith, “mitigating power- and timing-based side-channel attacks using dual-spacer dual-rail delay-insensitive asynchronous logic,” microelectronics journal, vol. 44, pp. 258-269, 2013. [20] j. sparsø and s. furber (eds.), principles of asynchronous circuit design: a systems perspective, kluwer academic publishers, 2001. [21] m.t. moreira and n.l.v. calazans, “quasi-delay-insensitive return-to-one design,” in proceedings of the design, automation and test in europe conference and exhibition phd forum, 2014, pp. 1-2. [22] m.t. moreira, j.j.h. pontes and n.l.v. calazans, “tradeoffs between rto and rtz in wchb qdi asynchronous design,” in proceedings of the 15th international symposium on quality electronic design, 2014, pp. 692-699. [23] r.a. guazzelli, m.t. moreira and n.l.v. calazans, “a comparison of asynchronous qdi templates using static logic,” in proceedings of the 8th ieee latin american symposium on circuits and systems, 2017, pp. 1-4. [24] a.j. martin, “the limitation to delay-insensitivity in asynchronous circuits,” in proceedings of the 6th mit conference on advanced research in vlsi, 1990, pp.
263-278. [25] p. balasubramanian, c. dang, “a comparison of quasi-delay-insensitive asynchronous adder designs corresponding to return-to-zero and return-to-one handshaking,” in proceedings of the 60 th ieee international midwest symposium on circuits and systems, 2017, pp. 1192-1195. [26] t. verhoeff, “delay-insensitive codes – an overview”, distributed computing, vol. 3, pp. 1-8, 1988. [27] b. bose, “on unordered codes”, ieee transactions on computers, vol. 40, pp. 1-8, 1988. [28] p. balasubramanian, “comments on “dual-rail asynchronous logic multi-level implementation”,” integration, the vlsi journal, vol. 52, pp. 34-40, 2016. [29] c.l. seitz, “system timing”, in introduction to vlsi systems, c. mead and l. conway (editors), pp. 218-262, addison-wesley, reading, massachusetts, usa, 1980. [30] p. balasubramanian and d.a. edwards, “efficient realization of strongly indicating function blocks”, in proceedings of the ieee computer society annual symposium on vlsi, 2008, pp. 429-432. [31] p. balasubramanian and d.a. edwards, “a new design technique for weakly indicating function blocks”, in proceedings of the 11 th ieee workshop on design and diagnostics of electronic circuits and systems, 2008, pp. 116-121. [32] c. brej, “early output logic and anti-tokens,” phd thesis, school of computer science, the university of manchester, 2006. [33] p. balasubramanian, “a robust asynchronous early output full adder,” wseas transactions on circuits and systems, vol. 10, pp. 221-230, 2011. [34] c. jeong and s.m. nowick, “block-level relaxation for timing-robust asynchronous circuits based on eager evaluation”, in proceedings of the 14 th ieee international symposium on asynchronous circuits and systems, 2008, pp. 95-104. [35] p. balasubramanian, k. prasad and n.e. mastorakis, “robust asynchronous implementation of boolean functions on the basis of duality,” in proceedings of the 14 th wseas international conference on circuits, 2010, pp. 37-43. [36] p. balasubramanian, r. 
arisaka and h.r. arabnia, “rb_dsop: a rule based disjoint sum of products synthesis method,” in proceedings of the 12th international conference on computer design, 2012, pp. 39-43. [37] p. balasubramanian and d.a. edwards, “self-timed realization of combinational logic,” in proceedings of the 19th international workshop on logic and synthesis, 2010, pp. 55-62. [38] p. balasubramanian, “self-timed logic and the design of self-timed adders,” phd thesis, school of computer science, the university of manchester, 2010. [39] p. balasubramanian and n.e. mastorakis, “a set theory based method to derive network reliability expressions of complex system topologies,” in proceedings of the applied computing conference, 2010, pp. 108-114. [40] j. cortadella, a. kondratyev, l. lavagno and c. sotiriou, “coping with the variability of combinational logic delays,” in proceedings of the ieee international conference on computer design: vlsi in computers and processors, 2004, pp. 505-508. [41] v.i. varshavsky (ed.), self-timed control of concurrent processes: the design of aperiodic logical circuits in computers and discrete systems, chapter 4: aperiodic circuits, pp. 77-85, (translated from the russian by a.v. yakovlev), kluwer academic publishers, 1990. [42] p. balasubramanian and k. prasad, “early output hybrid input encoded asynchronous full adder and relative-timed ripple carry adder,” in proceedings of the 14th international conference on embedded systems, cyber-physical systems, and applications, 2016, pp. 62-65. [43] p. balasubramanian and s. yamashita, “area/latency optimized early output asynchronous full adders and relative-timed ripple carry adders,” springerplus, vol. 5, pages 26, 2016. [44] p. balasubramanian and k.
prasad, “latency optimized asynchronous early output ripple carry adder based on delay-insensitive dual-rail data encoding,” international journal of circuits, systems and signal processing, vol. 11, pp. 65-74, 2017. [45] k.s. stevens, r. ginosar and s. rotem, “relative timing,” ieee transactions on vlsi systems, vol. 11, pp. 129-140, 2003. [46] d. bhadra and k.s. stevens, “design of a low power, relative timing based asynchronous msp430 processor,” in proceedings of the design, automation and test in europe conference and exhibition, pp. 794-799, 2017. [47] m.m. mano and m.d. ciletti, digital design, 4 th edition, prentice-hall, new jersey, usa, 2007. [48] synopsys digital standard cell library saed_edk32/28_core databook, revision 1.0.0, 2012. [49] n.p. singh, “a design methodology for self-timed systems,” msc dissertation, massachusetts institute of technology, usa, 1981. [50] w.b. toms, “synthesis of quasi-delay-insensitive datapath circuits”, phd thesis, school of computer science, the university of manchester, uk, 2006. [51] p. balasubramanian, “a latency optimized biased implementation style weak-indication self-timed full adder,” facta universitatis, series: electronics and energetics, vol. 28, pp. 657-671, 2015. [52] j. sparsø and j. staunstrup, “delay-insensitive multi-ring structures”, integration, the vlsi journal, vol. 15, pp. 313-340, 1993. [53] b. folco, v. bregier, l. fesquet and m. renaudin, “technology mapping for area optimized quasi delay insensitive circuits”, in proceedings of the ifip 13 th international conference on very large scale integration of system-on-chip, 2005, pp. 146-151. [54] w.b. toms and d.a. edwards, “a complete synthesis method for block-level relaxation in self-timed datapaths,” in proceedings of the 10 th international conference on application of concurrency to system design, 2010, pp. 24-34. [55] p. balasubramanian and d.a. 
edwards, “a delay efficient robust self-timed full adder”, in proceedings of the ieee 3 rd international design and test workshop, 2008, pp. 129-134. [56] p. balasubramanian, d.a. edwards and w.b. toms, “self-timed section-carry based carry lookahead adders and the concept of alias logic,” journal of circuits, systems, and computers, vol. 22, pp. 1350028-1–1350028-24, 2013. [57] p. balasubramanian, d. dhivyaa, j.p. jayakirthika, p. kaviyarasi and k. prasad, “low power self-timed carry lookahead adders,” in proceedings of the 56 th ieee international midwest symposium on circuits and systems, 2013, pp. 457-460. [58] p. balasubramanian, “asynchronous carry select adders,” engineering science and technology, an international journal, vol. 20, pp. 1066-1074, 2017. [59] p. balasubramanian and n.e. mastorakis, “qdi decomposed dims method featuring homogeneous/ heterogeneous data encoding”, in proceedings of the international conference on computers, digital communications and computing, 2011, pp. 93-101. [60] p. balasubramanian and d.a. edwards, “power, delay and area efficient self-timed multiplexer and demultiplexer designs,” in proceedings of the 4 th ieee international conference on design and technology of integrated systems in nanoscale era, 2009, pp. 173-178, 2009. [61] n.h.e. weste and k. eshraghian, principles of cmos vlsi design: a systems perspective, 2 nd edition, addison-wesley publishing company, massachusetts, usa, 1993. facta universitatis series: electronics and energetics vol. x, no x, x 2018, pp. x energy-efficient cryptographic primitives elena dubrova∗ royal institute of technology (kth), stockholm, sweden abstract: our society greatly depends on services and applications provided by mobile communication networks. as billions of people and devices become connected, it becomes increasingly important to guarantee security of interactions of all players. in this talk we address several aspects of this important, many-folded problem. 
first, we show how to design cryptographic primitives which can assure integrity and confidentiality of transmitted messages while satisfying resource constrains of low-end low-cost wireless devices such as sensors or rfid tags. second, we describe countermeasures which can enhance the resistance of hardware implementing cryptographic algorithms to hardware trojans. keywords: security, lightweight cryptography, cryptographic primitive, encryption, message authentication, hardware trojan. 1 introduction today minimal or no security is typically provided to low-end low-cost wireless devices such as sensors or rfid tags in the conventional belief that the information they gather is of little concern to attackers. however, case studies have shown that a compromised sensor can be used as a stepping stone to mount an attack on a wireless network. for example, in the attack manuscript received x x, x corresponding author: elena dubrova royal institute of technology (kth), stockholm, sweden (e-mail: dubrova@kth.se) ∗an earlier version of this paper was presented as an invited address at the reed-muller 2017 workshop, novi sad, serbia, may 24-25, 2017. 1 facta universitatis series: electronics and energetics vol. 28, no 1, march 2018, pp. 101 125 construction of subsets of bent functions satisfying restrictions in the reed-muller domain miloš radmanović1 and radomir s. stanković1 1faculty of electronic engineering, university of nǐs, nǐs, serbia abstract: bent functions are boolean functions with highest nonlinearity which makes them interesting for cryptography. determination of bent functions is an important but hard problem, since the general structure of bent functions is still unknown. various constructions methods for bent functions are based on certain deterministic procedures, which might result in some regularity that is a feature undesired for applications in cryptography. 
random generation of bent functions is an alternative; however, the search space is very large and the related procedures are time consuming. a solution is to restrict the search space by imposing some conditions that should be satisfied by the produced bent functions. in this paper, we propose three ways of imposing such restrictions to construct subsets of boolean functions within which the bent functions are searched. we estimate experimentally the number of bent functions in the corresponding subsets of boolean functions. keywords: cryptography, boolean functions, bent, reed-muller domain, subsets.

manuscript received january 31, 2018 corresponding author: miloš radmanović, faculty of electronic engineering, university of niš, niš, serbia (e-mail: milos.radmanovic@elfak.ni.ac.rs) *an earlier version of this paper was presented as an invited address at the reed-muller 2017 workshop, novi sad, serbia, may 24-25, 2017

1 introduction

bent functions are by definition the most nonlinear boolean functions, i.e., at the maximum distance of $2^{n-1} - 2^{n/2-1}$ from affine functions, where n is the number of variables. due to that property, bent functions are useful for cryptographic purposes, such as block ciphers, stream ciphers, and hash functions in many areas [1], [2]. they have attracted a lot of attention in cryptography, but have also been studied in many other areas such as combinatorics, coding theory, logic synthesis, and signal processing [3], [4]. the problem is that they are not balanced. therefore, constructing bent functions followed by a procedure to make them balanced is at the foundation of many cryptographic procedures. there are many procedures to construct bent functions, but most of them are exhaustive in the sense that they can produce all bent functions for a given n.
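the distance-based definition above can be tested numerically through the walsh spectrum: a boolean function of n variables (n even) is bent exactly when every walsh coefficient has absolute value $2^{n/2}$, and its nonlinearity equals $2^{n-1}$ minus half the largest absolute coefficient. the python sketch below is a minimal illustration of this standard characterization and not the authors' procedure; the function names and the example function are assumptions made for the example.

```python
def walsh_spectrum(truth_table, n):
    """Walsh spectrum of an n-variable Boolean function given as a 0/1 truth table.

    Entry x of truth_table is f(x), with x read as an n-bit integer, and
    W(a) is the sum over x of (-1)^(f(x) XOR a.x).
    """
    size = 1 << n
    spectrum = []
    for a in range(size):
        total = 0
        for x in range(size):
            dot = bin(a & x).count("1") & 1  # inner product a.x over GF(2)
            total += -1 if (truth_table[x] ^ dot) else 1
        spectrum.append(total)
    return spectrum


def nonlinearity(truth_table, n):
    """Minimum Hamming distance from the function to the set of affine functions."""
    largest = max(abs(w) for w in walsh_spectrum(truth_table, n))
    return (1 << (n - 1)) - largest // 2


def is_bent(truth_table, n):
    """A function is bent iff n is even and every |W(a)| equals 2^(n/2)."""
    if n % 2:
        return False
    target = 1 << (n // 2)
    return all(abs(w) == target for w in walsh_spectrum(truth_table, n))


# Example (assumed for illustration): f(x1, x2, x3, x4) = x1*x2 XOR x3*x4.
N = 4
F = [((x >> 3) & (x >> 2) & 1) ^ ((x >> 1) & x & 1) for x in range(1 << N)]

if __name__ == "__main__":
    print(is_bent(F, N), nonlinearity(F, N))
```

the double loop costs $4^n$ operations, which is adequate for small n; a fast walsh-hadamard transform would be the usual choice for larger n.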
another problem is that, being based on some deterministic procedures, there is a non-negligible possibility that the produced bent function might express some degree of regularity. random generation of bent functions is often used as an alternative. the problem, however, is that the number of bent functions is very small compared to the total number of $2^{2^n}$ boolean functions. expressed in percentages, it is $1.36\%$, $2.94 \times 10^{-8}\%$, and $8.57 \times 10^{-44}\%$ for n = 4, 6, 8, out of 65,536, 18,446,744,073,709,551,616, and $1.1579208923731619542357098500869 \times 10^{77}$ functions, respectively. therefore, random finding of bent functions necessarily requires reduction of the search space by using properties of bent functions. in this paper, we explore how large the number of bent functions is in certain restricted subsets of boolean functions with restrictions derived from properties of bent functions. the aim is to provide specifications of how to restrict the search space for bent functions according to the probability of finding them in predefined subsets of boolean functions. the restrictions are formulated in the reed-muller (rm) domain [5]. this paper is organized as follows: section 2 briefly introduces the theoretical background and necessary concepts to be discussed. in section 3, three approaches to the restriction of the search space in finding bent functions in rm domain are presented. the experimental results are shown and discussed in section 4. the closing section 5 summarizes the results of the research reported in this paper.

2 background theory

2.1 reed-muller transform

the algebraic normal form (anf) is one of the most used representations in cryptography. every boolean function has a unique representation in

facta universitatis series: electronics and energetics vol. 31, no 2, june 2018, pp. 207 222 https://doi.org/10.2298/fuee1802207r miloš radmanović, radomir s.
stanković received october 21, 2017; received in revised form january 31, 2018 corresponding author: miloš radmanović faculty of electronic engineering, university of niš, medevedeva 14, 18000 niš, serbia (e-mail: milos.radmanovic@elfak.ni.ac.rs) *an earlier version of this paper was presented as an invited address at the reed-muller 2017 workshop, novi sad, serbia, may 24-25, 2017 facta universitatis series: electronics and energetics vol. 28, no 4, december 2015, pp. 507 525 doi: 10.2298/fuee1504507s horizontal current bipolar transistor (hcbt) – a low-cost, high-performance flexible bicmos technology for rf communication applications tomislav suligoj1, marko koričić1, josip žilak1, hidenori mochizuki2, so-ichi morita2, katsumi shinomura2, hisaya imai2 1university of zagreb, faculty of electrical engineering and computing, department of electronics, microand nano-electronics laboratory, croatia 2asahi kasei microdevices co. 5-4960, nobeoka, miyazaki, 882-0031, japan abstract. in an overview of horizontal current bipolar transistor (hcbt) technology, the state-of-the-art integrated silicon bipolar transistors are described which exhibit ft and fmax of 51 ghz and 61 ghz and ftbvceo product of 173 ghzv that are among the highest-performance implanted-base, silicon bipolar transistors. hbct is integrated with cmos in a considerably lower-cost fabrication sequence as compared to standard vertical-current bipolar transistors with only 2 or 3 additional masks and fewer process steps. due to its specific structure, the charge sharing effect can be employed to increase bvceo without sacrificing ft and fmax. moreover, the electric field can be engineered just by manipulating the lithography masks achieving the high-voltage hcbts with breakdowns up to 36 v integrated in the same process flow with high-speed devices, i.e. at zero additional costs. double-balanced active mixer circuit is designed and fabricated in hcbt technology. 
the maximum iip3 of 17.7 dbm at mixer current of 9.2 ma and conversion gain of -5 db are achieved. key words: bicmos technology, bipolar transistors, horizontal current bipolar transistor, radio frequency integrated circuits, mixer, high-voltage bipolar transistors. 1. introduction in the highly competitive wireless communication markets, the rf circuits and systems are fabricated in the technologies that are very cost-sensitive. in order to minimize the fabrication costs, the sub-10 ghz applications can be processed by using the high-volume silicon technologies. it has been identified that the optimum solution might received march 9, 2015 corresponding author: tomislav suligoj university of zagreb, faculty of electrical engineering and computing, department of electronics, microand nano-electronics laboratory, croatia (e-mail: tom@zemris.fer.hr) construction of subsets of bent functions satisfying restrictions in the reed-muller domain* faculty of electronic engineering, university of niš, niš, serbia abstract. bent functions are boolean functions with highest nonlinearity which makes them interesting for cryptography. determination of bent functions is an importantbut hard problem, since the general structure of bent functions is still unknown. various constructions methods for bent functions are based on certain deterministic procedures, which might result in some regularitythat is a feature undesired for applications in cryptography. random generation of bent functions is an alternative, however, the search space is very large and the related procedures are time consuming. a solution is to restrict the search space by imposing some conditions that should be satisfied by the produced bent functions. in this paper, we propose three ways of imposing such restrictions to construct subsets of boolean functions within which the bent functions are searched. we estimate experimentally the number of bent functions in the corresponding subsets of boolean functions. 
2 Background theory

2.1 Reed-Muller transform

The algebraic normal form (ANF) is one of the most used representations in cryptography. Every Boolean function has a unique representation in the ANF as the binary coefficient vector, known as the positive-polarity Reed-Muller spectrum. Therefore, in switching theory, logic design, signal processing and related areas, the ANF is also called the positive-polarity Reed-Muller normal form. Any Boolean function of $n$ variables can be expressed in the positive-polarity Reed-Muller form, or algebraic normal form, as [6]:

$$f(x_1, x_2, \ldots, x_n) = \bigoplus_{i=0}^{2^n-1} s_i \varphi_i, \tag{1}$$

where

$$\varphi_i = \prod_{k=1}^{n} (x_k)^{i_k}, \tag{2}$$

and $x_k$ is a variable that may assume either 0 or 1. The values $s_i \in \{0, 1\}$, known as the RM spectral coefficients (RM-coefficients for short), determine whether a product term is present in (1). The coordinates $i_k \in \{0, 1\}$ in the binary representation $(i_1 i_2 \ldots i_n)$ of $i$ indicate the presence or absence of the variable $x_k$ in the product term $\varphi_i$ ($x_k^1 = x_k$, $x_k^0 = 1$). The symbol $\oplus$ denotes the EXOR operation, and multiplication is the AND operation. The algebraic degree, or order of nonlinearity, of $f$ is the maximum number of variables in a product term with a non-zero coefficient $s_i$.

The positive-polarity RM coefficients are divided into groups according to the number of ones in the binary representation of their indices. For example, the coefficients of the first order are $s_1, s_2, s_4, s_8, \ldots$. Coefficients of the second order are assigned to product terms consisting of two variables. The third-order coefficients correspond to products of three variables, and so forth, up to the coefficient of order $n$ related to the last product term $x_1 x_2 \cdots x_n$.

Definition 1. In matrix notation, if a function $f$ and its RM spectrum $S_f$ are represented by the vectors $F = [f_0, f_1, \ldots, f_{2^n-1}]^T$ and $S_f = [s_0, s_1, \ldots, s_{2^n-1}]^T$, respectively, the positive-polarity RM transform is defined by the positive-polarity RM matrix $R(n)$ [6]:

$$S_f = R(n) F, \tag{3}$$

where

$$R(n) = \bigotimes_{i=1}^{n} R(1). \tag{4}$$
The basic positive-polarity Reed-Muller transform matrix used in (4) is

$$R(1) = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}. \tag{5}$$

Note that $R(1)$ and, therefore, also $R(n)$ are self-inverse matrices over GF(2). The Kronecker product used in (4) determines the so-called Hadamard order of the RM-coefficients, which is used in the following considerations. When the order of coefficients is specified, the position of each coefficient in the RM spectrum is uniquely determined. This property is used to define restrictions that reduce the search space for bent functions.

Definition 2. The function $f$ is reconstructed from its RM spectrum as

$$f(x_1, x_2, \ldots, x_n) = X(n) S_f, \tag{6}$$

where

$$X(n) = \bigotimes_{i=1}^{n} \begin{bmatrix} 1 & x_i \end{bmatrix}. \tag{7}$$

2.2 Walsh transform

Definition 3. For a Boolean function $f$ in $(1, -1)$ encoding defined by the truth vector $F = [f_0, f_1, \ldots, f_{2^n-1}]^T$, the Walsh spectrum $S_{f,W} = [r_0, r_1, \ldots, r_{2^n-1}]^T$ is defined as [?]:

$$S_{f,W} = W(n) F, \tag{8}$$

where

$$W(n) = \bigotimes_{i=1}^{n} W(1), \tag{9}$$

and

$$W(1) = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \tag{10}$$

is the basic Walsh transform matrix.

2.3 Bent functions

For an even number of variables, there is a class of Boolean functions, introduced by O. Rothaus, that achieve maximum nonlinearity. These are the so-called bent functions; the name is meant figuratively as the opposite of linear.
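Because $R(n)$ in (4) and $W(n)$ in (9) are Kronecker powers of 2×2 matrices, both spectra can be computed with in-place butterfly passes instead of building the full $2^n \times 2^n$ matrices. The following Python sketch illustrates this under stated assumptions: the function names `rm_transform` and `walsh_transform` are illustrative and not taken from the paper, truth vectors are plain lists of 0/1 values of length $2^n$, and the butterfly formulation is a standard fast algorithm, not necessarily the procedure the authors used.

```python
def rm_transform(truth):
    """Positive-polarity Reed-Muller spectrum S_f = R(n) F over GF(2).

    Hypothetical helper: `truth` is a list of 0/1 values of length 2**n.
    """
    s = list(truth)
    size = len(s)
    if size == 0 or size & (size - 1):
        raise ValueError("truth vector length must be a power of two")
    step = 1
    while step < size:
        for start in range(0, size, 2 * step):
            for j in range(start, start + step):
                # One R(1) = [[1, 0], [1, 1]] butterfly: the first element is
                # kept, the second becomes (first XOR second).
                s[j + step] ^= s[j]
        step *= 2
    return s


def walsh_transform(truth):
    """Walsh spectrum S_{f,W} = W(n) F, with F re-encoded as 0 -> 1, 1 -> -1."""
    w = [1 - 2 * bit for bit in truth]
    size = len(w)
    if size == 0 or size & (size - 1):
        raise ValueError("truth vector length must be a power of two")
    step = 1
    while step < size:
        for start in range(0, size, 2 * step):
            for j in range(start, start + step):
                # One W(1) = [[1, 1], [1, -1]] butterfly: sum and difference.
                a, b = w[j], w[j + step]
                w[j], w[j + step] = a + b, a - b
        step *= 2
    return w


f = [0, 0, 0, 1]                            # f(x1, x2) = x1 AND x2
print(rm_transform(f))                      # [0, 0, 0, 1]
print(walsh_transform(f))                   # [2, 2, 2, -2]
print(rm_transform(rm_transform(f)) == f)   # True: R(n) is self-inverse
```

Each pass costs $2^{n-1}$ butterflies, so a full transform needs $n \cdot 2^{n-1}$ operations, which matters when many candidate functions have to be tested.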
The exact number of $n$-variable bent functions is known only for a small number of variables, $n < 10$. The number of bent functions for general $n$ is an open problem, and only upper bounds with respect to $n$ are known. The following theorems and definition state some basic properties of bent functions that are used in the present considerations.

Theorem 1 [3]. There are "naive" lower and upper bounds on the number of bent functions. The lower bound is $(2^{n/2})! \, 2^{2^{n/2}}$, and the upper bound is $2^{2^{n-1} + \frac{1}{2}\binom{n}{n/2}}$.

Note that the number of bent functions increases rapidly with $n$, because the lower bound grows exponentially.

Theorem 2 [2]. If $f$ is a bent function, then its complement $\bar{f}$ is also bent.

Theorem 3 [1]. Every bent function has a Hamming weight (number of ones in the truth vector) of $2^{n-1} \pm 2^{n/2-1}$.

The following definition of a bent function is formulated in terms of the Walsh transform coefficients.

Definition 4 [4]. A Boolean function $f(x_1, x_2, \ldots, x_n)$ in $(1, -1)$ encoding is called bent if all Walsh coefficients in the vector $S_{f,W}$ have the same absolute value $2^{n/2}$.

Theorem 4 [4]. The algebraic degree of a bent function $f(x_1, x_2, \ldots, x_n)$ is at most $n/2$ for $n > 2$.

The positions of the non-zero RM coefficients in the spectrum $S_f$ of a bent function are related to the order of the coefficients. For example, bent functions of four variables can have non-zero RM coefficients of order 0, 1, and 2 only, so the number of ones in the binary representation of their vector index is at most 2. Accordingly, the maximal number of non-zero coefficients in the RM spectrum of a bent function is

$$\sum_{i=0}^{n/2} \binom{n}{i} = 2^{n-1} + \frac{1}{2}\binom{n}{n/2}. \tag{11}$$

Both of these features are used in the present paper.
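Definition 4 and Theorem 4 translate directly into checks on the two spectra, and (11) is a finite sum of binomial coefficients. A minimal Python sketch follows. The helper names (`is_bent`, `algebraic_degree`, `max_nonzero_rm_coefficients`) are illustrative assumptions, the spectra are computed with standard butterfly passes rather than by any procedure taken from the paper, and the quadratic function $x_1 x_2 \oplus x_3 x_4$ is used only as a familiar bent test input.

```python
from math import comb


def _butterfly(values, combine):
    """Apply a 2x2 Kronecker-structured transform in place (hypothetical helper)."""
    out = list(values)
    step = 1
    while step < len(out):
        for start in range(0, len(out), 2 * step):
            for j in range(start, start + step):
                out[j], out[j + step] = combine(out[j], out[j + step])
        step *= 2
    return out


def rm_spectrum(truth):
    # R(1) = [[1, 0], [1, 1]] over GF(2)
    return _butterfly(truth, lambda a, b: (a, a ^ b))


def walsh_spectrum(truth):
    # W(1) = [[1, 1], [1, -1]] applied to the (1, -1) encoding of the truth vector
    return _butterfly([1 - 2 * bit for bit in truth], lambda a, b: (a + b, a - b))


def is_bent(truth):
    """Definition 4: every Walsh coefficient has absolute value 2**(n/2)."""
    n = len(truth).bit_length() - 1
    if n % 2:
        return False  # bent functions exist only for an even number of variables
    target = 1 << (n // 2)
    return all(abs(r) == target for r in walsh_spectrum(truth))


def algebraic_degree(truth):
    """Largest number of ones in the index of a non-zero RM coefficient."""
    spectrum = rm_spectrum(truth)
    return max((bin(i).count("1") for i, s in enumerate(spectrum) if s), default=0)


def max_nonzero_rm_coefficients(n):
    """Left-hand side of (11): number of coefficients of order at most n/2."""
    return sum(comb(n, i) for i in range(n // 2 + 1))


f = [0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0]   # x1x2 XOR x3x4
print(is_bent(f), algebraic_degree(f))                  # True 2
print(max_nonzero_rm_coefficients(4))                   # 11
```

For $n = 4$ the sum in (11) is $1 + 4 + 6 = 11$, which agrees with the closed form $2^{3} + \frac{1}{2}\binom{4}{2} = 8 + 3$.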
Example 1. Consider the bent function of four variables $f(x_1, x_2, x_3, x_4) = x_1 x_2 \oplus x_3 x_4$, given by $F = [0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0]^T$. The RM spectrum is $S_f = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0]^T$, from which the number of non-zero RM coefficients is 2 and the algebraic degree is $2 \le 4/2$. The Walsh spectrum in $(1, -1)$ encoding is $S_{f,W} = [4, 4, 4, -4, 4, 4, 4, -4, 4, 4, 4, -4, -4, -4, -4, 4]^T$, where all coefficients have the same absolute value $2^2 = 4$.

2.4 Construction of bent functions

Finding the complete set of bent functions for a given number of inputs is an open problem. There are no formal methods for the generalization, construction, or classification of all bent functions for a given number of inputs. Thus, a variety of approaches have been developed for the construction and classification of bent functions with particular properties [2]. However, bent functions that have some predefined specific properties are very rare, and they constitute a rather small subset of all bent functions. Therefore, it is important to define different approaches for the determination of bent functions.
Different classes of bent functions consist of bent functions satisfying certain additional conditions. For example, quadratic bent functions have an important place in bent function construction. All bent functions from this class are known, and they can be obtained by applying affine transformations to the variables of the function [7]. The two best-known approaches to the construction and classification of bent functions are based on combinatorial constructions and algebraic constructions of functions [2]. The well-known combinatorial constructions are Maiorana-McFarland [8], partial spreads [9], Dobbertin [10], iterative constructions [11], minterm constructions [12], the Maiorana-McFarland super class [13], etc. The most widely known algebraic constructions are monomial bent functions in the Kasami, Gold, Dillon and Canteaut-Leander cases [14], hyper-bent functions [15], Niho bent functions [16], etc.
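As an illustration of a deterministic combinatorial construction, the Maiorana-McFarland class can be sketched in a few lines. The sketch assumes the textbook form $f(x, y) = \langle x, \pi(y) \rangle \oplus g(y)$, where $x$ and $y$ are vectors of $n/2$ bits, $\pi$ is a permutation of the $2^{n/2}$ values of $y$, and $g$ is an arbitrary Boolean function of $y$. The names `maiorana_mcfarland` and `is_bent`, and the choice of placing $x$ in the upper bits of the truth-vector index, are assumptions made for this example and are not taken from the paper or from reference [8].

```python
import random


def is_bent(truth):
    """Check Definition 4 with a butterfly Walsh transform (hypothetical helper)."""
    w = [1 - 2 * bit for bit in truth]
    step = 1
    while step < len(w):
        for start in range(0, len(w), 2 * step):
            for j in range(start, start + step):
                a, b = w[j], w[j + step]
                w[j], w[j + step] = a + b, a - b
        step *= 2
    n = len(truth).bit_length() - 1
    return n % 2 == 0 and all(abs(r) == 1 << (n // 2) for r in w)


def maiorana_mcfarland(pi, g):
    """Truth vector of f(x, y) = <x, pi(y)> XOR g(y) on n = 2m variables.

    pi: list holding a permutation of range(2**m)
    g:  list of 2**m values in {0, 1}
    The truth-vector index is x * 2**m + y (x in the upper m bits).
    """
    size = len(pi)
    truth = []
    for x in range(size):
        for y in range(size):
            inner = bin(x & pi[y]).count("1") & 1   # inner product over GF(2)
            truth.append(inner ^ g[y])
    return truth


rng = random.Random(0)           # fixed seed so the example is repeatable
m = 3                            # n = 6 variables
pi = list(range(2 ** m))
rng.shuffle(pi)
g = [rng.randint(0, 1) for _ in range(2 ** m)]
print(is_bent(maiorana_mcfarland(pi, g)))   # True
```

There are $(2^{n/2})!$ choices for $\pi$ and $2^{2^{n/2}}$ choices for $g$, which is exactly the lower bound quoted in Theorem 1. Functions produced this way share a fixed structure, which is the kind of regularity that motivates searching for bent functions by other means.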
there are no formal methods for generalization, construction, or classification of all bent functions for a given number of inputs. thus, it has been developed a variety of approaches for construction and classification of bent functions with particular properties [2]. however, bent functions that have some predefined specific properties are very rare and they constitute a rather small subset of the total of all bent functions. therefore, it is important to define different approaches for determination of bent functions. different classes of bent functions are focused on bent functions satisfying certain additional conditions. for example, quadratic bent functions have an important place in bent function construction. all bent functions from this class are known and they can be obtained by applying the affine transformations to the variables of the function [7]. two most known approaches for construction classification of bent functions are based on a combinatorial construction and an algebraic construction of functions [2]. the well-known combinatorial constructions are maioranamcfarland [8], partial spreads [9], dobbertin [10], iterative constructions [11], minterm constructions [12], maiorana-mcfarland super class [13], etc. the most widely known algebraic constructions are monomial bent functions in the kasami, gold, dillon and canteaut-leander case [14], hyper bent functions [15], niho bent functions [16], etc. 3 subsets of bent functions in the rm domain in this paper, we propose to split the set of all boolean functions with respect to three different criteria related to the properties of rm-spectra of bent functions: 1. the number of non-zero rm-coefficients. we call it the vertical subset. 2. the orders of nonlinearity defined as the maximum number of variables 6 m. radmanović and r. stanković example 1 consider for a bent function of four variables f(x1, x2, x3, x4) = x1x2 ⊕ x3x4, given by f = [0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0] t . 
the rm spectrum is sf = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0] t , from where the number of non-zero rm coefficients is 2 and algebraic degree is 2 ≤ 4/2. the walsh spectrum with (1, −1) encoding is sf,w = [4, 4, 4, −4, 4, 4, 4, −4, 4, 4, 4,−4, −4, −4, −4, 4] t , where all coefficients have the same absolute value 22 = 4. 2.4 construction of bent functions finding the complete set of bent functions for a given number of inputs is an open problem. there are no formal methods for generalization, construction, or classification of all bent functions for a given number of inputs. thus, it has been developed a variety of approaches for construction and classification of bent functions with particular properties [2]. however, bent functions that have some predefined specific properties are very rare and they constitute a rather small subset of the total of all bent functions. therefore, it is important to define different approaches for determination of bent functions. different classes of bent functions are focused on bent functions satisfying certain additional conditions. for example, quadratic bent functions have an important place in bent function construction. all bent functions from this class are known and they can be obtained by applying the affine transformations to the variables of the function [7]. two most known approaches for construction classification of bent functions are based on a combinatorial construction and an algebraic construction of functions [2]. the well-known combinatorial constructions are maioranamcfarland [8], partial spreads [9], dobbertin [10], iterative constructions [11], minterm constructions [12], maiorana-mcfarland super class [13], etc. the most widely known algebraic constructions are monomial bent functions in the kasami, gold, dillon and canteaut-leander case [14], hyper bent functions [15], niho bent functions [16], etc. 
3 subsets of bent functions in the rm domain

in this paper, we propose to split the set of all boolean functions with respect to three different criteria related to the properties of the rm-spectra of bent functions:

1. the number of non-zero rm-coefficients. we call it the vertical subset.
2. the order of nonlinearity, defined as the maximum number of variables in a product term corresponding to a non-zero rm-coefficient. we call it the horizontal subset.
3. both the number of non-zero rm-coefficients and the order of nonlinearity. we call it the grid subset.

the main objective is to investigate the number of bent functions in each of these subsets.
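the three criteria above reduce to three numbers that can be read off the rm spectrum of a function: the count of non-zero coefficients and the smallest and largest order among them. the python sketch below computes them for the function of example 1. it is only an illustration under stated assumptions, not the c++ code of this paper: the names rm_spectrum and rm_profile are hypothetical, and the coefficient at index i is assumed to belong to the product of the variables selected by the binary digits of i (x1 as the most significant bit).

```python
def rm_spectrum(truth):
    """positive-polarity reed-muller coefficients of a 0/1 truth vector."""
    s = list(truth)
    h = 1
    while h < len(s):                         # gf(2) butterfly, hadamard order
        for start in range(0, len(s), 2 * h):
            for j in range(start, start + h):
                s[j + h] ^= s[j]
        h *= 2
    return s


def rm_profile(truth):
    """return (number of non-zero coefficients, minimum order, maximum order)."""
    spectrum = rm_spectrum(truth)
    # the order of a coefficient is the number of variables in its product term
    orders = [bin(i).count("1") for i, c in enumerate(spectrum) if c]
    if not orders:                            # constant-zero function
        return 0, None, None
    return len(orders), min(orders), max(orders)


# function of example 1, f = x1x2 xor x3x4
example_1 = [0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0]
print(rm_spectrum(example_1))
print(rm_profile(example_1))

# complementing the function adds the constant term 1
complement = [1 - bit for bit in example_1]
print(rm_profile(complement))
```

the first number corresponds to the vertical criterion, the pair of orders to the horizontal criterion, and the full triple to the grid criterion.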
in order to study the proposed subsets of functions in the spectral rm domain, we developed in c++ three independent implementations of the algorithms for vertical, horizontal, and grid rm subset classifications. these implementations are used to investigate the considered subsets of bent functions with 4, 6, and 8 variables.

3.1 vertical subset

since the order of nonlinearity of n-variable bent functions is less than or equal to n/2, their rm-spectra cannot contain non-zero rm-coefficients assigned to product terms with more than n/2 variables. the values of coefficients of larger order are 0. therefore, the number of possible non-zero coefficients is smaller than 2^n. functions that do not satisfy this requirement can be eliminated from checking for bentness. in this way, the search space for bent functions is considerably reduced, which permits finding bent functions in a reasonably short time.

since the rm-expression for a bent function cannot contain product terms with more than n/2 variables, the maximal number of non-zero coefficients of k-th order is $\binom{n}{k}$, k = 0, 1, ..., n/2. therefore, the maximal total number of non-zero coefficients in the rm spectrum of a bent function is:

$$\binom{n}{0} + \binom{n}{1} + \dots + \binom{n}{n/2} = 2^{n-1} + \frac{1}{2}\binom{n}{n/2}.$$

the vertical (v) subset consists of n-variable boolean functions with no more than $2^{n-1} + \frac{1}{2}\binom{n}{n/2}$ non-zero rm-coefficients.

definition 5 a bent function belongs to the vertical k-subset v(k) iff it has k non-zero rm-coefficients.

clearly, the number of possible v subsets depends on the number of variables.

example 2 the maximal number of non-zero spectral rm coefficients of bent functions with 4 variables is 11 out of 16 coefficients. hence, there are 11 possible v subsets for bent functions with 4 variables. the v(1) subset is the subset of bent functions having an rm spectrum with 1 non-zero rm coefficient, the v(2) subset is the subset having 2 non-zero rm coefficients, etc. there are 42 possible v subsets for bent functions with 6 variables, and 163 possible v subsets for bent functions with 8 variables.

example 3 two bent functions of 4 variables, f1(x1, x2, x3, x4) = 1 ⊕ x1x2 ⊕ x3x4 and f2(x1, x2, x3, x4) = x4 ⊕ x1x2 ⊕ x3x4, have 3 non-zero rm-coefficients and belong to the same subset v(3).

3.2 horizontal subset

in the definition of this subset, we take into account the order of the rm coefficients. if we fix the order of rm coefficients, then we know the positions of coefficients that have to be 0 in the rm spectrum of a bent function, due to the restriction on the order of nonlinearity. all other coefficients can be either 0 or 1.
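the counts 11, 42 and 163 quoted in example 2, and the positions that must be 0 because of the restriction on the order, can be checked with a few lines of python. this is a hedged sketch: the helper names are hypothetical, and it assumes that the coefficient at index i has order equal to the number of ones in the binary representation of i.

```python
from math import comb


def max_nonzero_coefficients(n):
    """sum of binomial(n, k) for k = 0 .. n/2 (n even)."""
    return sum(comb(n, k) for k in range(n // 2 + 1))


def allowed_positions(n):
    """indices whose order (number of ones in the index) is at most n/2."""
    return [i for i in range(2 ** n) if bin(i).count("1") <= n // 2]


def restricted_positions(n):
    """indices that must hold a zero coefficient in a bent function."""
    return [i for i in range(2 ** n) if bin(i).count("1") > n // 2]


for n in (4, 6, 8):
    closed_form = 2 ** (n - 1) + comb(n, n // 2) // 2
    print(n, max_nonzero_coefficients(n), closed_form, len(allowed_positions(n)))

print(restricted_positions(4))
```

for n = 4 the restricted indices are those of the four third-order terms and the single fourth-order term, which leaves 11 of the 16 positions free.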
as noted above, in this paper we use the so-called hadamard ordering of rm coefficients, which originates in the kronecker product structure of the reed-muller matrix in (4). in this ordering, for a given n, we can determine the positions of coefficients which cannot be 1 by referring to (7).

example 4 for a bent function of four variables, the rm spectrum can be represented by s_f = [−, −, −, −, −, −, −, ×, −, −, −, ×, −, ×, ×, ×]^T, where the possible positions of the non-zero coefficients are denoted by dashes and the restricted positions are denoted by ×.

definition 6 a bent function belongs to the horizontal (kmin, kmax)-subset h(kmin, kmax) iff the minimum order of its non-zero rm coefficients is kmin and the maximum order is kmax.

the h subsets are defined by ranges of possible orders of the non-zero rm spectrum coefficients. the number of h subsets also depends on the number of variables of the bent functions.

example 5 the maximal order of non-zero spectral coefficients of bent functions with 4 variables is 2. hence, there are 3 possible h subsets: h(0, 2), h(1, 2), and h(2, 2). the h subsets h(0, 0), h(0, 1), and h(1, 1) do not contain any bent functions.

example 6 for bent functions with 6 variables, there are 10 possible h subsets: h(0, 0), h(0, 1), h(0, 2), h(0, 3), h(1, 1), h(1, 2), h(1, 3), h(2, 2), h(2, 3), and h(3, 3). some of these subsets also do not contain any bent functions.

example 7 two bent functions of 4 variables, f1(x1, x2, x3, x4) = x1x2 ⊕ x2x4 ⊕ x3x4 and f2(x1, x2, x3, x4) = x1x3 ⊕ x2x4, have only second-order non-zero rm-coefficients and belong to the same subset h(2, 2).
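for 4 variables the h subsets of examples 5 and 7 can be explored by brute force over the reduced search space: every candidate rm spectrum uses only positions of order at most 2, is converted back to a truth vector, tested for a flat walsh spectrum, and grouped by (kmin, kmax). the python sketch below is an assumption-laden illustration and not the enumeration algorithm of this paper; all function names are hypothetical, and the index convention (order = number of ones in the index) is assumed.

```python
from collections import Counter


def mobius(vec):
    """gf(2) reed-muller transform; it is its own inverse."""
    s = list(vec)
    h = 1
    while h < len(s):
        for start in range(0, len(s), 2 * h):
            for j in range(start, start + h):
                s[j + h] ^= s[j]
        h *= 2
    return s


def walsh_spectrum(truth):
    """walsh spectrum of a 0/1 truth vector under (1, -1) encoding."""
    s = [1 - 2 * bit for bit in truth]
    h = 1
    while h < len(s):
        for start in range(0, len(s), 2 * h):
            for j in range(start, start + h):
                s[j], s[j + h] = s[j] + s[j + h], s[j] - s[j + h]
        h *= 2
    return s


def order(index):
    """number of variables in the product term at this spectrum index."""
    return bin(index).count("1")


def bent_counts_by_order_range(n=4):
    """count bent functions per (kmin, kmax), scanning only allowed positions."""
    allowed = [i for i in range(2 ** n) if order(i) <= n // 2]
    counts = Counter()
    for mask in range(1, 2 ** len(allowed)):  # skip the all-zero spectrum
        chosen = [pos for b, pos in enumerate(allowed) if (mask >> b) & 1]
        spectrum = [0] * 2 ** n
        for pos in chosen:
            spectrum[pos] = 1
        truth = mobius(spectrum)              # spectrum -> truth vector
        if all(abs(c) == 2 ** (n // 2) for c in walsh_spectrum(truth)):
            orders = [order(pos) for pos in chosen]
            counts[(min(orders), max(orders))] += 1
    return counts


counts = bent_counts_by_order_range(4)
for key in sorted(counts):
    print("h%s: %d" % (key, counts[key]))
print("total:", sum(counts.values()))
```

for n = 4 the loop visits 2^11 − 1 candidate spectra instead of 2^16 truth vectors and recovers the well-known total of 896 bent functions, all of them in h(0, 2), h(1, 2) or h(2, 2). the same exhaustive loop is not feasible for 6 or 8 variables, where 42 and 163 free positions would have to be scanned.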
3.3 grid subset

in the definition of this subset, we take into account both the orders and the number of the coefficients.

definition 7 a bent function belongs to the grid (k, kmin, kmax)-subset g(k, kmin, kmax) iff it has k non-zero rm-coefficients with minimum order kmin and maximum order kmax.

the g subsets of bent functions are the intersections of v and h subsets. the number of g subsets also depends on the number of variables of the bent functions. for example, there are many g subsets for bent functions with 4 variables: g(1, 0, 0), g(1, 0, 1), ..., g(6, 2, 2). there are also many subsets that do not contain any bent functions.

example 8 two bent functions of 4 variables, f1(x1, x2, x3, x4) = x1x2 ⊕ x3x4 and f2(x1, x2, x3, x4) = x1x3 ⊕ x2x4, have exactly 2 non-zero rm-coefficients, both of second order, and belong to the same subset g(2, 2, 2).

it should be noted that there are bent functions that do not belong to any of the g subsets.

4 experimental results

in this section, we enumerate the bent functions in some vertical, horizontal and grid subsets for functions with 4, 6, and 8 variables. there are two types of methods for the enumeration of bent functions, primary and secondary. the primary methods are based on direct enumeration in the boolean domain. the secondary methods are based on
example 7 two bent functions of 4 variables f1(x1, x2, x3, x4) = x1x2 ⊕ x2x4 ⊕ x3x4, and f2(x1, x2, x3, x4) = x1x3 ⊕ x2x4 have only second order non-zero rm-coefficients and belong to the same subset h(2, 2). 3.3 grid subset in definition of this subset, we take into account both, the orders and the number of the coefficients. definition 7 a bent function belongs to the grid (k, kmin, kmax)-subset g(k, kmin, kmax) iff it has k non-zero rm-coefficients and minimum order of rm coefficients kmin and maximum order kmax. the g subsets of bent functions are the intersection of v and h subsets. number of g subsets also depends of the number of variables of the bent functions. for example, there are many g subsets for bent functions with 4 variables: v (1, 0, 0), h(1, 0, 1), ..., h(6, 2, 2). there are also many subsets that do not contain any of the bent functions. example 8 two bent functions of 4 variables f1(x1, x2, x3, x4) = x1x2 ⊕ x3x4, and f2(x1, x2, x3, x4) = x1x3 ⊕ x2x4 have only 2 and second order non-zero rm-coefficients and belong to the same subset g(2, 2, 2). it should be noticed that there are bent functions that do not belongs to any of g subsets. 4 experimental results in this section, we made enumeration of bent functions in some vertical, horizontal and grid subsets for functions with 4, 6, and 8 variables. there are two types of methods for enumerations of bent functions, primary and secondary. the primary methods are based on the direct enumeration in the boolean domain. the secondary methods are based on 8 m. radmanović and r. stanković 11 possible v subsets for bent function with 4 variables. the v (1) subset is the subset of bent functions having the rm spectrum with 1 non-zero rm coefficient, the v (2) subset is the subset having 2 non-zero rm coefficients, and etc. there are 42 possible v subsets for bent functions with 6 variables, and 163 possible v subsets for bent functions with 8 variables. 
example 3 two bent functions of 4 variables f1(x1, x2, x3, x4) = 1⊕x1x2⊕ x3x4, and f2(x1, x2, x3, x4) = x4⊕x1x2⊕x3x4 have 3 non-zero rm-coefficients and belong to the same subset v (3). 3.2 horizontal subset in definition of this subset, we take into account the order of the rm coefficients. if we fix the order of rm coefficients, then we know the positions of coefficients that have to be 0 in the rm spectrum of a bent function, due to this restriction related to the order of nonlinearity. all other coefficients can be either 0 or 1. as noticed above, in this paper, we use the so-called hadamard ordering of rm coefficients originating in the kronecker product structure of the reed-muller matrix in (4). in this ordering, for a given n, we can determine positions of coefficients which cannot be 1 by referring to (7). example 4 for a bent function of four variables, the rm spectrum can be represented by sf = [−, −, −, −, −, −, −, ×, −, −, −, ×, −, ×, ×, ×] t , where the possible positions of the non-zero coefficients are denoted by dash lines and the restricted positions are denoted with ×. definition 6 a bent function belongs to the horizontal (kmin, kmax)-subset h(kmin, kmax) iff it has the minimum kmin and the maximum kmax order of rm coefficients. h subsets involve the application of ranges of possible non-zero rm spectrum coefficients orders. the number of h subsets also depends of the number of variables of the bent functions. example 5 the maximal order of non-zero spectral coefficients of bent functions with 4 variables is 2. hence, there are 3 possible h subsets h(0, 2), construction of subsets of bent functions satisfying restrictions... 9 h(1, 2), and h(2, 2). there are h subsets h(0, 0), h(0, 1), and h(1, 1), that do not contain any of the bent functions. example 6 for bent functions with 6 variables, there are 10 possible h subsets: h(0, 0), h(0, 1), h(0, 2), h(0, 3), h(1, 1), h(1, 2), h(1, 3), h(2, 2), h(2, 3), and h(3, 3). 
some of subsets also do not contain any of the bent functions. example 7 two bent functions of 4 variables f1(x1, x2, x3, x4) = x1x2 ⊕ x2x4 ⊕ x3x4, and f2(x1, x2, x3, x4) = x1x3 ⊕ x2x4 have only second order non-zero rm-coefficients and belong to the same subset h(2, 2). 3.3 grid subset in definition of this subset, we take into account both, the orders and the number of the coefficients. definition 7 a bent function belongs to the grid (k, kmin, kmax)-subset g(k, kmin, kmax) iff it has k non-zero rm-coefficients and minimum order of rm coefficients kmin and maximum order kmax. the g subsets of bent functions are the intersection of v and h subsets. number of g subsets also depends of the number of variables of the bent functions. for example, there are many g subsets for bent functions with 4 variables: v (1, 0, 0), h(1, 0, 1), ..., h(6, 2, 2). there are also many subsets that do not contain any of the bent functions. example 8 two bent functions of 4 variables f1(x1, x2, x3, x4) = x1x2 ⊕ x3x4, and f2(x1, x2, x3, x4) = x1x3 ⊕ x2x4 have only 2 and second order non-zero rm-coefficients and belong to the same subset g(2, 2, 2). it should be noticed that there are bent functions that do not belongs to any of g subsets. 4 experimental results in this section, we made enumeration of bent functions in some vertical, horizontal and grid subsets for functions with 4, 6, and 8 variables. there are two types of methods for enumerations of bent functions, primary and secondary. the primary methods are based on the direct enumeration in the boolean domain. the secondary methods are based on construction of subsets of bent functions satisfying restrictions... 9 h(1, 2), and h(2, 2). there are h subsets h(0, 0), h(0, 1), and h(1, 1), that do not contain any of the bent functions. example 6 for bent functions with 6 variables, there are 10 possible h subsets: h(0, 0), h(0, 1), h(0, 2), h(0, 3), h(1, 1), h(1, 2), h(1, 3), h(2, 2), h(2, 3), and h(3, 3). 
some of subsets also do not contain any of the bent functions. example 7 two bent functions of 4 variables f1(x1, x2, x3, x4) = x1x2 ⊕ x2x4 ⊕ x3x4, and f2(x1, x2, x3, x4) = x1x3 ⊕ x2x4 have only second order non-zero rm-coefficients and belong to the same subset h(2, 2). 3.3 grid subset in definition of this subset, we take into account both, the orders and the number of the coefficients. definition 7 a bent function belongs to the grid (k, kmin, kmax)-subset g(k, kmin, kmax) iff it has k non-zero rm-coefficients and minimum order of rm coefficients kmin and maximum order kmax. the g subsets of bent functions are the intersection of v and h subsets. number of g subsets also depends of the number of variables of the bent functions. for example, there are many g subsets for bent functions with 4 variables: v (1, 0, 0), h(1, 0, 1), ..., h(6, 2, 2). there are also many subsets that do not contain any of the bent functions. example 8 two bent functions of 4 variables f1(x1, x2, x3, x4) = x1x2 ⊕ x3x4, and f2(x1, x2, x3, x4) = x1x3 ⊕ x2x4 have only 2 and second order non-zero rm-coefficients and belong to the same subset g(2, 2, 2). it should be noticed that there are bent functions that do not belongs to any of g subsets. 4 experimental results in this section, we made enumeration of bent functions in some vertical, horizontal and grid subsets for functions with 4, 6, and 8 variables. there are two types of methods for enumerations of bent functions, primary and secondary. the primary methods are based on the direct enumeration in the boolean domain. the secondary methods are based on construction of subsets of bent functions satisfying restrictions... 9 h(1, 2), and h(2, 2). there are h subsets h(0, 0), h(0, 1), and h(1, 1), that do not contain any of the bent functions. example 6 for bent functions with 6 variables, there are 10 possible h subsets: h(0, 0), h(0, 1), h(0, 2), h(0, 3), h(1, 1), h(1, 2), h(1, 3), h(2, 2), h(2, 3), and h(3, 3). 
some of subsets also do not contain any of the bent functions. example 7 two bent functions of 4 variables f1(x1, x2, x3, x4) = x1x2 ⊕ x2x4 ⊕ x3x4, and f2(x1, x2, x3, x4) = x1x3 ⊕ x2x4 have only second order non-zero rm-coefficients and belong to the same subset h(2, 2). 3.3 grid subset in definition of this subset, we take into account both, the orders and the number of the coefficients. definition 7 a bent function belongs to the grid (k, kmin, kmax)-subset g(k, kmin, kmax) iff it has k non-zero rm-coefficients and minimum order of rm coefficients kmin and maximum order kmax. the g subsets of bent functions are the intersection of v and h subsets. number of g subsets also depends of the number of variables of the bent functions. for example, there are many g subsets for bent functions with 4 variables: v (1, 0, 0), h(1, 0, 1), ..., h(6, 2, 2). there are also many subsets that do not contain any of the bent functions. example 8 two bent functions of 4 variables f1(x1, x2, x3, x4) = x1x2 ⊕ x3x4, and f2(x1, x2, x3, x4) = x1x3 ⊕ x2x4 have only 2 and second order non-zero rm-coefficients and belong to the same subset g(2, 2, 2). it should be noticed that there are bent functions that do not belongs to any of g subsets. 4 experimental results in this section, we made enumeration of bent functions in some vertical, horizontal and grid subsets for functions with 4, 6, and 8 variables. there are two types of methods for enumerations of bent functions, primary and secondary. the primary methods are based on the direct enumeration in the boolean domain. the secondary methods are based on construction of subsets of bent functions satisfying restrictions... 9 h(1, 2), and h(2, 2). there are h subsets h(0, 0), h(0, 1), and h(1, 1), that do not contain any of the bent functions. example 6 for bent functions with 6 variables, there are 10 possible h subsets: h(0, 0), h(0, 1), h(0, 2), h(0, 3), h(1, 1), h(1, 2), h(1, 3), h(2, 2), h(2, 3), and h(3, 3). 
some of subsets also do not contain any of the bent functions. example 7 two bent functions of 4 variables f1(x1, x2, x3, x4) = x1x2 ⊕ x2x4 ⊕ x3x4, and f2(x1, x2, x3, x4) = x1x3 ⊕ x2x4 have only second order non-zero rm-coefficients and belong to the same subset h(2, 2). 3.3 grid subset in definition of this subset, we take into account both, the orders and the number of the coefficients. definition 7 a bent function belongs to the grid (k, kmin, kmax)-subset g(k, kmin, kmax) iff it has k non-zero rm-coefficients and minimum order of rm coefficients kmin and maximum order kmax. the g subsets of bent functions are the intersection of v and h subsets. number of g subsets also depends of the number of variables of the bent functions. for example, there are many g subsets for bent functions with 4 variables: v (1, 0, 0), h(1, 0, 1), ..., h(6, 2, 2). there are also many subsets that do not contain any of the bent functions. example 8 two bent functions of 4 variables f1(x1, x2, x3, x4) = x1x2 ⊕ x3x4, and f2(x1, x2, x3, x4) = x1x3 ⊕ x2x4 have only 2 and second order non-zero rm-coefficients and belong to the same subset g(2, 2, 2). it should be noticed that there are bent functions that do not belongs to any of g subsets. 4 experimental results in this section, we made enumeration of bent functions in some vertical, horizontal and grid subsets for functions with 4, 6, and 8 variables. there are two types of methods for enumerations of bent functions, primary and secondary. the primary methods are based on the direct enumeration in the boolean domain. the secondary methods are based on construction of subsets of bent functions satisfying restrictions... 9 h(1, 2), and h(2, 2). there are h subsets h(0, 0), h(0, 1), and h(1, 1), that do not contain any of the bent functions. example 6 for bent functions with 6 variables, there are 10 possible h subsets: h(0, 0), h(0, 1), h(0, 2), h(0, 3), h(1, 1), h(1, 2), h(1, 3), h(2, 2), h(2, 3), and h(3, 3). 
4 Experimental results

In this section, we enumerate the bent functions in some vertical, horizontal, and grid subsets for functions with 4, 6, and 8 variables. There are two types of methods for enumeration of bent functions, primary and secondary. The primary methods are based on direct enumeration in the Boolean domain. The secondary methods are based on enumeration in the Reed-Muller domain. The best-known secondary methods use the property that all bent functions of n variables (n > 2) have algebraic degree at most n/2 [17].
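The degree bound can be illustrated by a small enumeration in the RM domain. The Python sketch below is an assumption-laden illustration, not the method of [17] or [18]: it sets only coefficients of order at most n/2, converts each coefficient vector back to a truth table, and tests bentness through the Walsh spectrum. The names `moebius`, `is_bent`, `count_bent_rm`, and `search_space_bits` are hypothetical. It is practical only for n = 4; for larger n the loop is far too long.

```python
from itertools import product
from math import comb


def moebius(v):
    """Binary Moebius transform; it maps RM-coefficients to a truth table and back."""
    c = list(v)
    step = 1
    while step < len(c):
        for i in range(len(c)):
            if i & step:
                c[i] ^= c[i ^ step]
        step <<= 1
    return c


def is_bent(tt):
    """True iff every Walsh coefficient has absolute value 2^(n/2)."""
    n = len(tt).bit_length() - 1
    if n % 2:
        return False
    w = [1 - 2 * v for v in tt]
    step = 1
    while step < len(w):
        for i in range(0, len(w), 2 * step):
            for j in range(i, i + step):
                w[j], w[j + step] = w[j] + w[j + step], w[j] - w[j + step]
        step <<= 1
    return all(abs(v) == 1 << (n // 2) for v in w)


def search_space_bits(n):
    """Number of RM-coefficients of order at most n/2 (free bits in the search)."""
    return sum(comb(n, i) for i in range(n // 2 + 1))


def count_bent_rm(n):
    """Count bent functions among all functions of algebraic degree at most n/2."""
    size = 1 << n
    low = [m for m in range(size) if bin(m).count("1") <= n // 2]
    total = 0
    for bits in product((0, 1), repeat=len(low)):
        coeffs = [0] * size
        for m, b in zip(low, bits):
            coeffs[m] = b
        if is_bent(moebius(coeffs)):
            total += 1
    return total


print(search_space_bits(4), search_space_bits(6), search_space_bits(8))
print(count_bent_rm(4))
```

For n = 4 the search covers 2^11 coefficient vectors instead of 2^16 truth tables, and the count agrees with the 896 bent functions in four variables reported in [2]. For n = 8 the same bound leaves 163 free coefficients.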
The secondary method for complete enumeration of bent functions of 8 variables used approximately 50 PCs running for 3 months [18]. It is known that there are 8 bent functions in two variables, 896 bent functions in four variables, 5,425,430,528 bent functions in 6 variables, and 99,270,589,265,934,370,305,785,861,242,880 bent functions in 8 variables [2]. The efficiency of a parallel multi-core CPU technique for random generation of bent functions in the RM domain is analyzed in [19]. To improve execution performance, an algorithm for efficient parallel generation of bent functions in the RM domain on a GPU platform has been defined in [20].

For experimental purposes, we developed a C++ implementation of the algorithm for creating subsets of bent functions in the RM domain. The algorithm for creation of subsets in the RM domain is similar to the techniques for hardware enumeration of bent functions [21] and generation of bent functions in the RM domain [19], except that the search space is further reduced. Note that, for some functions of 6 and 8 variables, results are not shown, since the computation time was limited to 30 minutes.

Table 1 shows the number of bent functions in the subsets v(k) for n = 4 variables. There exists no bent function that can be represented by a single product term. Note that the majority of the bent functions require 4 to 7 coefficients. Precisely, 756 of the total of 896 bent functions for n = 4 require 4 to 7 non-zero coefficients, which is 84.375% of all bent functions. From the data in Table 1, it can be seen that the vertical subsets v(5) and v(6) contain more than half of the total number of bent functions. Note that the vertical subsets v(2), v(10), and v(11) contain a small number of bent functions. Table 2 shows the number of bent functions in most of the RM vertical subsets for functions with 6 variables.
From the data in Table 2, it can be seen again that the central vertical subsets contain a large number of bent functions. For example, the RM vertical subsets v(18) and v(19) together contain about 18% of the bent functions. Again, note that the vertical subsets v(3), v(4), v(36), and v(37) contain a small number of bent functions. The numbers of bent functions for the subsets from v(20) to v(27) are not included due to very long computation time. Table 3 shows the number of bent functions with 4 and 5 non-zero RM-coefficients for functions with 8 variables. It is again confirmed that vertical subsets with a small or a large number of coefficients do not contain many bent functions. For example, experiments show that there are no bent functions with 1, 2, or 3 non-zero RM-coefficients, nor with 157, 158, 159, 160, 161, 162, or 163 non-zero RM-coefficients. Note that the number of "empty" vertical subsets with a small number of coefficients increases linearly with the number of variables of the bent functions. Moreover, the number of "empty" vertical subsets with a large number of coefficients increases exponentially with the number of variables of the bent functions.

Table 4 shows the number of bent functions in some RM horizontal subsets for functions with 4, 6, and 8 variables. It is evident that subsets with only third-order non-zero RM-coefficients contain a very small number of bent functions. Tables 5, 6, and 7 show the number of bent functions in some RM grid subsets for functions with 4, 6, and 8 variables, respectively. It is evident that subsets with only second-order non-zero RM-coefficients contain a large number of functions, especially the central grid subsets with respect to the total number of second-order coefficients. It is interesting that there are only two non-"empty" grid subsets for functions with 6 variables with only third-order non-zero RM-coefficients. It is evident that most bent functions have a mixture of first-, second-, and third-order non-zero RM-coefficients. Experiments with mixtures of different orders of coefficients are not included in this paper.

Table 1: The number of bent functions in the subsets v(k) for n = 4 variables

  k    #f of v(k)
  2      3
  3     27
  4    102
  5    210
  6    256
  7    188
  8     82
  9     22
 10      5
 11      1
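The counts in Table 1 can be cross-checked with a restricted search over the vertical subsets. The sketch below is a simplified illustration and not the C++ implementation used in the experiments: for each k it selects k product terms of order at most 2 (an assumption based on the degree bound n/2 for n = 4), sets the corresponding RM-coefficients to 1, and tests the resulting function for bentness. The names `MONOMIALS`, `moebius`, `is_bent`, and `vertical_subset_size` are hypothetical.

```python
from itertools import combinations

N = 4
SIZE = 1 << N
# Indices of product terms of order at most N/2 (11 terms for N = 4).
MONOMIALS = [m for m in range(SIZE) if bin(m).count("1") <= N // 2]


def moebius(v):
    """Binary Moebius transform; it maps RM-coefficients to a truth table."""
    c = list(v)
    step = 1
    while step < len(c):
        for i in range(len(c)):
            if i & step:
                c[i] ^= c[i ^ step]
        step <<= 1
    return c


def is_bent(tt):
    """True iff every Walsh coefficient has absolute value 2^(N/2)."""
    w = [1 - 2 * v for v in tt]
    step = 1
    while step < len(w):
        for i in range(0, len(w), 2 * step):
            for j in range(i, i + step):
                w[j], w[j + step] = w[j] + w[j + step], w[j] - w[j + step]
        step <<= 1
    return all(abs(v) == 1 << (N // 2) for v in w)


def vertical_subset_size(k):
    """Number of bent functions with exactly k non-zero RM-coefficients."""
    count = 0
    for chosen in combinations(MONOMIALS, k):
        coeffs = [0] * SIZE
        for m in chosen:
            coeffs[m] = 1
        if is_bent(moebius(coeffs)):
            count += 1
    return count


sizes = {k: vertical_subset_size(k) for k in range(1, len(MONOMIALS) + 1)}
total = sum(sizes.values())
central = sum(sizes[k] for k in range(4, 8))
print(sizes)
print(total, central, f"{100 * central / total:.3f}%")
```

Under these assumptions the sketch gives no bent function for k = 1, the values of Table 1 for k = 2 to 11, a total of 896, and the share 756/896 = 84.375% for the subsets v(4) to v(7).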
Table 2: The number of bent functions in the subsets v(k) for n = 6 variables

  k    #f of v(k)
  3    15
  4    405
  5    4575
  6    30885
  7    147630
  8    548190
  9    1657950
 10    4239474
 11    9512343
 12    19341969
 13    36536505
 14    65365185
 15    112296016
 16    185615422
 17    290719416
 18    420742250
 19    551175695
 20 to 27    not included
 28    57338355
 29    25754775
 30    9869427
 31    3098124
 32    770562
 33    153060
 34    26070
 35    3882
 36    504
 37    72

Table 3: The number of bent functions in the subsets v(k) for n = 8 variables

  k    #f of v(k)
  4    105
  5    8505

Table 4: The number of bent functions in some subsets h(kmin, kmax) for n = 4, 6, and 8 variables

  n    kmin, kmax    #f of h(kmin, kmax)
  4    2,2           24
  6    2,2           13440
  6    3,3           12
  8    2,2           111992832

Table 5: The number of bent functions in some subsets g(k, kmin, kmax) for n = 4 variables

  k, kmin, kmax    #f of g(k, kmin, kmax)
  2,2,2            2
  3,2,2            8
  4,2,2            10
  5,2,2            4

Table 6: The number of bent functions in some subsets g(k, kmin, kmax) for n = 6 variables

  k, kmin, kmax    #f of g(k, kmin, kmax)
  3,2,2            12
  4,2,2            144
  5,2,2            732
  6,2,2            1968
  7,2,2            3008
  8,2,2            3040
  9,2,2            2360
  10,2,2           1384
  11,2,2           564
  12,2,2           176
  13,2,2           44
  14,2,2           8
  16,3,3           6
  17,3,3           6

Table 7: The number of bent functions in some subsets g(k, kmin, kmax) for n = 8 variables

  k, kmin, kmax    #f of g(k, kmin, kmax)
  2,2,2            2
  3,2,2            8
  4,2,2            90
  5,2,2            2160
  6,2,2            23850
  7,2,2            157860
  8,2,2            687030
  9,2,2            2081400
  10,2,2           4753710
  11,2,2           8640684
  12,2,2           12897908
  13,2,2           16181264
  14,2,2           17405460
  15,2,2           16291480
  16,2,2           13230940
  17,2,2           9215136
  18,2,2           5554956
  19,2,2           2907720
  20,2,2           1277010
  21,2,2           476288
  22,2,2           157730
  23,2,2           41412
  24,2,2           7630
  25,2,2           1000
  26,2,2           102
  27,2,2           12

5 Conclusion

For practical cryptographic applications, it is often necessary to generate random bent functions. The runtime of an exhaustive search method for generation of bent functions is exponential in the number of variables. In this paper, we propose an approach to restricting the search space based on restricting the number and the order of the RM-coefficients of bent functions. Using properties of bent functions in the RM domain, we define RM vertical, horizontal, and grid subsets of Boolean functions that might contain bent functions. The proposed approach is experimentally verified through enumeration of bent functions in some RM vertical, horizontal, and grid subsets for functions with 4, 6, and 8 variables. Experimental results showed some interesting properties of different subsets in the spectral Reed-Muller domain. It is shown that some vertical, horizontal, and grid subsets contain a large number of bent functions, some contain a small number, and some are "empty". This information can be helpful in designing search methods for generation of bent functions.
The proposed approach is experimentally verified through enumeration of bent functions in some RM vertical, horizontal, and grid subsets for functions with 4, 6, and 8 variables. The experimental results show some interesting properties of different subsets in the spectral Reed-Muller domain. Some vertical, horizontal, and grid subsets contain a large number of bent functions, others contain a small number, and some are empty. This information can be helpful in designing search methods for the generation of bent functions.

It can be concluded from these experimental results that, besides the Walsh and Reed-Muller domains, exploring bent functions in some other representation domains can be interesting.
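The counts reported in Tables 4 to 7 can be cross-checked with a few lines of Python. The sketch below assumes that the grid subsets G(k, kmin, kmax) split the corresponding horizontal subset H(kmin, kmax) according to the number k of nonzero RM-coefficients, so that their counts should add up to the Table 4 values. This relation is an assumption inferred from the table captions, and the names TABLE_4, GRID and horizontal_totals are hypothetical. For n = 8, the rows from k = 4 to k = 27 are used.

```python
# Table 4: (n, kmin, kmax) -> number of bent functions in H(kmin, kmax).
TABLE_4 = {(4, 2, 2): 24, (6, 2, 2): 13440, (6, 3, 3): 12, (8, 2, 2): 111992832}

# Tables 5 to 7: n -> {(k, kmin, kmax): number of bent functions in G(k, kmin, kmax)}.
GRID = {
    4: {(2, 2, 2): 2, (3, 2, 2): 8, (4, 2, 2): 10, (5, 2, 2): 4},
    6: {(3, 2, 2): 12, (4, 2, 2): 144, (5, 2, 2): 732, (6, 2, 2): 1968,
        (7, 2, 2): 3008, (8, 2, 2): 3040, (9, 2, 2): 2360, (10, 2, 2): 1384,
        (11, 2, 2): 564, (12, 2, 2): 176, (13, 2, 2): 44, (14, 2, 2): 8,
        (16, 3, 3): 6, (17, 3, 3): 6},
    8: {(4, 2, 2): 90, (5, 2, 2): 2160, (6, 2, 2): 23850, (7, 2, 2): 157860,
        (8, 2, 2): 687030, (9, 2, 2): 2081400, (10, 2, 2): 4753710,
        (11, 2, 2): 8640684, (12, 2, 2): 12897908, (13, 2, 2): 16181264,
        (14, 2, 2): 17405460, (15, 2, 2): 16291480, (16, 2, 2): 13230940,
        (17, 2, 2): 9215136, (18, 2, 2): 5554956, (19, 2, 2): 2907720,
        (20, 2, 2): 1277010, (21, 2, 2): 476288, (22, 2, 2): 157730,
        (23, 2, 2): 41412, (24, 2, 2): 7630, (25, 2, 2): 1000,
        (26, 2, 2): 102, (27, 2, 2): 12},
}


def horizontal_totals(grid):
    """Sum the grid-subset counts over k for each (n, kmin, kmax)."""
    totals = {}
    for n, rows in grid.items():
        for (_k, kmin, kmax), count in rows.items():
            key = (n, kmin, kmax)
            totals[key] = totals.get(key, 0) + count
    return totals


totals = horizontal_totals(GRID)
for key in sorted(TABLE_4):
    print(key, totals[key], TABLE_4[key], totals[key] == TABLE_4[key])
```

Under this assumption, the summed grid counts agree with all four entries of Table 4.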